Emojis for scripts
Added by Paul Janson over 5 years ago
Emojis are Unicode codepoints above 65535, so you cannot simply use $chr() to show them; you must use a pair of UTF-16 "surrogate" characters instead. The alias $chr2 works just like the $chr identifier, except that it also supports the whole range of codepoints above 65535, up to 1114111.

The limitation is that these emojis are visible only if you're using a font which maps those codepoints. For me, these include "Segoe UI Symbol" and "DejaVu Sans". For example, if you're using Consolas and /say them to a channel, you can't see these emojis, but someone using "Segoe UI Symbol" can.

Note that even though you see only 1 character, your script sees a length of 2: the 2 surrogates. This can break scripts that need to handle string "characters" individually, because surrogates are invalid unless used in valid pairs. A lone surrogate's codepoint would encode as 3 bytes, but /bset sees a surrogate pair not as 2x3=6 bytes but as the 4-byte UTF-8 pattern for codepoints above 65535.

Assuming you're using a font which displays these symbols correctly:
The cat face emoji (codepoint 128049, U+1F431) is created by the surrogates 55357 and 56369: //var %a $chr2(128049) | echo -a $asc($mid(%a,1,1)) $asc($mid(%a,2,1))
Combining these two surrogates shows the cat face emoji: //echo -a $chr(55357) $+ $chr(56369)
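The pair 55357 and 56369 follows from the usual UTF-16 arithmetic: 128049 - 65536 = 62513, the upper part is 62513 // 1024 = 61 and the lower part is 62513 % 1024 = 49, giving 55296 + 61 = 55357 and 56320 + 49 = 56369. The reverse direction can be sketched as below. This is only a minimal sketch: the alias name $asc2 is hypothetical (it is not a built-in identifier), and it assumes $mid() returns the surrogates one at a time, as the example above relies on.

```mirc
; $asc2(text) is a hypothetical helper, not a built-in identifier.
; It returns the codepoint of the first visible character in text,
; recombining a surrogate pair when one is present.
alias asc2 {
  var %hi $asc($mid($1,1,1)), %lo $asc($mid($1,2,1))
  ; a high surrogate (55296-56319) followed by a low surrogate (56320-57343)
  if ((%hi isnum 55296-56319) && (%lo isnum 56320-57343)) {
    return $calc(65536 + (%hi - 55296) * 1024 + (%lo - 56320))
  }
  ; anything else is already a single code unit
  return %hi
}
```

With this loaded, //echo -a $asc2($chr(55357) $+ $chr(56369)) should echo 128049, since 65536 + 61 * 1024 + 49 = 128049.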
Surrogate codepoints 55296-57343 that are not part of a pair are invalid, so /bset -t stores a lone surrogate as the same bytes as $chr(65533), the Unicode replacement character. As part of a valid pair, /bset holds the 4-byte UTF-8 encoding: //bset -t &v 1 $chr(55357) | bset -t &v2 1 $chr(55357) $+ $chr(56369) | echo -a $bvar(&v,1-) vs $bvar(&v2,1-)
Codepoint 8419 (U+20E3, the combining enclosing keycap) is sometimes seen as part of an emoji. It modifies the character preceding it by drawing a keycap square around it. Most fonts only put the square around # and 0-9: //echo -a $regsubex(1234567890#,/(.)/gu,\t $+ $chr(8419))
Not all codepoints in the emoji range are mapped, so you can't show an emoji by selecting a random number within the range. You need a string of flags indicating which codepoints are mapped and which aren't: //var %a , %i 1 , %emojis 1111111111111111011101010101110111111001111010011110111111111111000011111111111 , %reps 50 | while (%i <= %reps) { var %j $rand(1,$Len(%emojis)) | if ($mid(%emojis,%j,1)) { var %a %a $+ $chr2($calc(128512 + %j)) | inc %i } } | echo -a %a
; supports codepoints above 65535, by using a pair of UTF-16 surrogates, $len == 2
; $chr2(N) N=integer, $chr2hex(N) N=hex
; $chr635(N) is $chr2(N) for v6.35 which doesn't support $chr(256+)
; //echo -a $chr2(128049) cat face length $len($chr2(128049))
; //echo -a $chr2hex(1F431) cat face length $len($chr2hex(1F431))
; to see the emojis, you must be using a font which maps those
; ie Segoe UI Symbol, DejaVu Sans, etc.
alias chr2 {
  if ($1 isnum 0-65535) returnex $chr($1)
  if ($1 isnum 65536-1114111) { returnex $chr($calc(55232 + $1 // 1024)) $+ $chr($calc(56320 + $1 % 1024)) }
}
alias chr2hex {
  var %hex $base($1,16,10)
  if (%hex isnum 0-65535) returnex $chr(%hex)
  if (%hex isnum 65536-1114111) { returnex $chr($calc(55232 + %hex // 1024)) $+ $chr($calc(56320 + %hex % 1024)) }
}
; this is like $chr2 except supports mIRC v6.35 by not using $chr() > 255 nor floor divide
; output is anywhere from 1-4 UTF-8 bytes.
alias chr635 {
  ;if ((($version isnum 7-) || ($~adiircexe)) && ($1 isnum 0-65535)) returnex $chr($1)
  if ($1 isnum 0-127) var %a $chr($1)
  elseif ($1 isnum 128-2047) var %a $chr( $int($calc(192 + $1 / 64)) ) $+ $chr( $calc(128 + $1 % 64) )
  elseif ($1 isnum 2048-65535) var %a $chr( $int($calc(224 + $1 / 4096)) ) $+ $chr( $int($calc(128 + ($1 / 64) % 64)) ) $+ $chr( $calc(128 + $1 % 64) )
  elseif ($1 isnum 65536-1114111) var %a $chr( $int($calc(240 + $1 / 262144)) ) $+ $chr( $int($calc(128 + ($1 / 4096) % 64)) ) $+ $chr( $int($calc(128 + ($1 / 64) % 64)) ) $+ $chr( $calc(128 + $1 % 64) )
  elseif ($1 isnum 65536-4123456789) var %a $chr( $int($calc(240 + $1 / 262144)) ) $+ $chr( $int($calc(128 + ($1 / 4096) % 64)) ) $+ $chr( $int($calc(128 + ($1 / 64) % 64)) ) $+ $chr( $calc(128 + $1 % 64) )
  if (($version isnum 7-) || ($~adiircexe)) returnex $utfdecode(%a) | returnex %a
}
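
Because $len counts each surrogate separately, a script that needs the number of visible characters has to step over pairs itself. The following is a rough sketch of one way to do that, not a tested drop-in: the alias name $len2 is hypothetical, it reads only $1 (so text containing commas would need to be passed in a variable), and it does not account for combining characters such as 8419.

```mirc
; $len2(text) is a hypothetical helper, not a built-in identifier.
; It counts characters in text, treating each valid surrogate pair as one.
alias len2 {
  var %i 1, %n 0, %len $len($1)
  while (%i <= %len) {
    var %c $asc($mid($1,%i,1))
    ; skip the low surrogate when a high surrogate is followed by one
    if ((%c isnum 55296-56319) && ($asc($mid($1,$calc(%i + 1),1)) isnum 56320-57343)) inc %i
    inc %i
    inc %n
  }
  return %n
}
```

With both aliases loaded, //echo -a $len($chr2(128049)) vs $len2($chr2(128049)) should echo 2 vs 1.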