sample characters from CJK Extension G
-rw-r--r-- 33428 unicode.txt