Shiar's Git - unicode-sampler.git/log

Mischa POSLAWSKY [Wed, 19 Aug 2015 05:29:36 +0000 (07:29 +0200)]

braille contraction missed by converter

Groupsign -en- (lower e) may be used at the end of "queen" (explicitly
mentioned in <http://www.brailleauthority.org/literary/ebae2002.pdf>).

Mischa POSLAWSKY [Wed, 19 Aug 2015 03:36:55 +0000 (05:36 +0200)]

braille of english pangram instead

Replace the long story in "scientific" (and very artificial) GS8 notation
with a transcription of the English panphone in common British Grade-2
braille, courtesy of <https://www.branah.com/braille-translator>.

Loses coverage of some 8-dot cells, but includes common abbreviations and
practical orthography. Use braille blanks U+2800 instead of spaces.
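
The blank substitution can be sketched in a few lines of Python. This is
only an illustration, not the tool used for this commit; the helper name
use_braille_blanks is hypothetical.

```python
# U+2800 BRAILLE PATTERN BLANK: an empty cell that belongs to the
# Braille Patterns block, unlike an ASCII space.
BRAILLE_BLANK = "\u2800"


def use_braille_blanks(line: str) -> str:
    """Replace ASCII spaces with braille blank cells (hypothetical helper)."""
    return line.replace(" ", BRAILLE_BLANK)
```

Only U+0020 is replaced here; tabs and newlines are left alone so that the
line structure of the sample stays intact.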

Mischa POSLAWSKY [Wed, 19 Aug 2015 00:24:38 +0000 (02:24 +0200)]

adjust english phonetic sample to cover more sounds

* Reorder words for a less contrived meaning.
* Replace "wanted before" by "looked for it", to add /ʊ/ (the only absent
monophthong) without losing any sounds.
* Introduce /ʉ/ by using a slight Scottish accent for "hue".
* Include commonly found glottal stop before "all".

Mischa POSLAWSKY [Tue, 18 Aug 2015 23:49:25 +0000 (01:49 +0200)]

english phonetic pangram/panphone for ipa showcase

Replace the poor "linguistics" section with the last example from
<http://www.quora.com/Is-there-a-text-that-covers-the-entire-English-phonetic-range>
which covers most English phonemes, including rarer distinctions
(m/ɱ, x, l/ɫ, w/ʍ).

Manually transcribed in an attempt to cover most sounds naturally, using a
mostly Irish/generic pronunciation (I'm not native though). Compare
<https://en.wikipedia.org?oldid=673810019> for an overview of regional
differences. Alternate IPA transcriptions in native dialects were found at
<https://www.reddit.com/r/conlangs/comments/2quvnf/make_a_dialect_of_english/>
but are not used because of their more limited inventories.

Mischa POSLAWSKY [Fri, 14 Aug 2015 21:27:27 +0000 (23:27 +0200)]

japanese iroha in all scripts

Kanji, hiragana, and original version downloaded from
<https://en.wikipedia.org?oldid=670286422>.

Katakana transliteration from <http://www.columbia.edu/~fdc/utf8/>.

Halfwidth variant derived using perl -Mcharnames=:full -CS -pe'
package charnames; s/\S/chr vianame("HALFWIDTH ".viacode(ord $&))/ge'
with incompatible characters replaced by small forms (prefer coverage over
natural conversion) and a voiced mark appended for even more coverage.
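
For comparison, the same name-based lookup could look like this in Python.
It is a rough equivalent of the one-liner above, not what was actually run;
to_halfwidth is a hypothetical name, and characters without a halfwidth
counterpart are kept unchanged (an assumption of this sketch) so they can
be replaced by hand afterwards.

```python
import unicodedata


def to_halfwidth(text: str) -> str:
    """Map each character to its "HALFWIDTH ..." namesake where one exists."""
    out = []
    for ch in text:
        # Unnamed characters (e.g. newlines) yield "" and fall through.
        name = unicodedata.name(ch, "")
        try:
            out.append(unicodedata.lookup("HALFWIDTH " + name))
        except KeyError:
            # No halfwidth form under that name: keep the original.
            out.append(ch)
    return "".join(out)
```

Letters such as KATAKANA LETTER WI have no halfwidth namesake and pass
through untouched, which marks the spots that need a manual substitute.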

Mischa POSLAWSKY [Fri, 14 Aug 2015 19:40:34 +0000 (21:40 +0200)]

replace chinese extension B character from extension A

U+4D85 is obviously incorrect; assume U+24D85 was intended.
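
A quick range check shows why the original code point stood out. The block
boundaries below are the published Unicode block ranges; the function name
cjk_extension is hypothetical and only these two blocks are covered.

```python
# Block ranges (inclusive) for the two extensions involved here.
CJK_BLOCKS = [
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),
]


def cjk_extension(codepoint: int):
    """Return the extension block name for a code point, or None."""
    for first, last, name in CJK_BLOCKS:
        if first <= codepoint <= last:
            return name
    return None


# The bad value sits in Extension A, the corrected one in Extension B.
print(cjk_extension(0x4D85))
print(cjk_extension(0x24D85))
```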

Mischa POSLAWSKY [Fri, 14 Aug 2015 19:36:51 +0000 (21:36 +0200)]

chinese samples of extended unicode blocks

Random characters from each block from <http://ctext.org/font-test-page>.

Mischa POSLAWSKY [Fri, 14 Aug 2015 19:14:55 +0000 (21:14 +0200)]

chinese sample text: 1st chapter of qian zi wen

Classic coverage poem in traditional orthography downloaded from
<http://www.gutenberg.org/ebooks/24184>.

Mischa POSLAWSKY [Fri, 14 Aug 2015 19:02:39 +0000 (21:02 +0200)]

chinese transliteration of 3 choice characters -ü

Selected most frequently used characters ending in ü with all their tones.
Covers the most difficult pinyin (multiple accents), some limited bopomofo,
IPA tone bars (combinable into contours), and traditional/simplified glyph
comparison.
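
To illustrate the "multiple accents" case: each toned ü is a base letter
with a diaeresis plus a tone mark, which NFC normalization composes into a
single code point. A minimal sketch, assuming the usual four tone
diacritics; the names TONE_MARKS and u_umlaut_with_tone are hypothetical.

```python
import unicodedata

# Combining marks conventionally used for the four Mandarin tones.
TONE_MARKS = {
    1: "\u0304",  # macron
    2: "\u0301",  # acute
    3: "\u030C",  # caron
    4: "\u0300",  # grave
}


def u_umlaut_with_tone(tone: int) -> str:
    """Build u + diaeresis + tone mark and compose it with NFC."""
    decomposed = "u\u0308" + TONE_MARKS.get(tone, "")
    return unicodedata.normalize("NFC", decomposed)
```

Any tone number outside 1 to 4 yields plain ü, which matches the unmarked
neutral tone.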

Mischa POSLAWSKY [Fri, 14 Aug 2015 18:17:28 +0000 (20:17 +0200)]

chinese selection of 50 most common mandarin characters

Extracted from Modern Chinese Character Frequency List (updated 2005-12-21)
published by 笪骏 [DA Jun] <http://lingua.mtsu.edu/chinese-computing>.
These characters should cover 30% of modern Chinese texts.
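
The coverage figure comes from the frequency list itself. To check such a
claim against an arbitrary text, something like the following sketch could
be used; the function name coverage and the choice to ignore whitespace are
assumptions made for illustration.

```python
def coverage(text: str, charset: set) -> float:
    """Fraction of non-whitespace characters in text that are in charset."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(1 for c in chars if c in charset)
    return hits / len(chars)
```

Called with the 50 selected characters as charset and a large corpus as
text, the result would be comparable to the 30% mentioned above.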

Mischa POSLAWSKY [Fri, 14 Aug 2015 17:25:09 +0000 (19:25 +0200)]

tibetan declaration of human rights

Good sample copied from ཝེ་ཁེ་རིག་མཛོད <https://bo.wikipedia.org?oldid=123541>.
Prefix the title to include a yig-mgo (head mark), and the adoption date
[1948-12-10] as found on
<http://blog.amdotibet.cn/aaa999/archives/82869.aspx> to include numbers.
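
For reference, Tibetan digits occupy U+0F20 to U+0F29, so a date can be
converted digit by digit. This sketch only shows the digit mapping, not how
the date was formatted in the sample; to_tibetan_digits is a hypothetical
name.

```python
TIBETAN_DIGIT_ZERO = 0x0F20  # TIBETAN DIGIT ZERO; digits 1-9 follow it


def to_tibetan_digits(text: str) -> str:
    """Replace ASCII digits with Tibetan digits, leaving the rest as is."""
    return "".join(
        chr(TIBETAN_DIGIT_ZERO + int(c)) if c in "0123456789" else c
        for c in text
    )


print(to_tibetan_digits("1948-12-10"))
```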

Mischa POSLAWSKY [Fri, 14 Aug 2015 17:17:51 +0000 (19:17 +0200)]

tamil and kannada poems from Kermit UTF-8 Sampler

Extracted from 2012-05-07 version of <http://www.columbia.edu/~fdc/utf8/>
by Frank da Cruz.

Mischa POSLAWSKY [Fri, 14 Aug 2015 17:15:00 +0000 (19:15 +0200)]

apl function for game of life

Mischa POSLAWSKY [Fri, 31 Jul 2015 01:23:13 +0000 (03:23 +0200)]

drop headers and abbreviate descriptions

Get rid of some English clutter; original sources should be easy to find
by searching online.

Mischa POSLAWSKY [Fri, 31 Jul 2015 00:30:32 +0000 (02:30 +0200)]

hebrew sample

Ideally test RTL, but good for modern script coverage in any case.
Best use of the common Unicode invitation so far, as it mixes direction
and includes niqqud.
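
Both properties can be checked from the Unicode character database. A
minimal sketch, assuming that "mixes direction" means the text has strong
left-to-right and strong right-to-left characters, and that niqqud show up
as combining marks; both function names are hypothetical.

```python
import unicodedata


def mixes_direction(text: str) -> bool:
    """True if text has both strong LTR and strong RTL characters."""
    classes = {unicodedata.bidirectional(c) for c in text}
    return "L" in classes and bool(classes & {"R", "AL"})


def has_combining_marks(text: str) -> bool:
    """True if any character has a non-zero canonical combining class."""
    return any(unicodedata.combining(c) for c in text)
```

This only inspects character properties; whether a renderer actually lays
out the line right-to-left still has to be judged by eye.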

Mischa POSLAWSKY [Fri, 31 Jul 2015 00:19:09 +0000 (02:19 +0200)]

devanagari sample

Copied from <http://r12a.github.io/scripts/summaries/devanagari>.

Context-based positioning at start of last 2 lines; digits at end of line 3;
multiple combining characters at line 2 start; contextual shaping in line 1
and start of line 4.

Mischa POSLAWSKY [Fri, 31 Jul 2015 00:16:12 +0000 (02:16 +0200)]

spell old english in old english

Includes capital AE.

Mischa POSLAWSKY [Fri, 31 Jul 2015 00:13:23 +0000 (02:13 +0200)]

replace s in latin old english by long variants

Also include precomposed st-ligature for good measure (matching runic).

Mischa POSLAWSKY [Thu, 30 Jul 2015 23:49:10 +0000 (01:49 +0200)]

transliterate runes with traditional orthography

Prefer original thorn and wynn letters. Then "modernize" eth and long
vowels for additional coverage of Old English transcription.

Markus Kuhn [Mon, 6 Apr 2009 18:13:43 +0000 (20:13 +0200)]

update to current upstream version 2002/2009

Latest <http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt>
removes trailing whitespace.

Markus Kuhn [Thu, 25 Jul 2002 12:00:00 +0000 (12:00 +0000)]

update to 2002 version

Retrieved from <http://www.cl.cam.ac.uk/~mgk25>.

Markus Kuhn [Fri, 20 Aug 1999 12:00:00 +0000 (12:00 +0000)]

UTF-8 encoded sample plain-text file

Extracted from <http://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html>,
the earliest version I could find.

Unicode sampler - various texts to test Unicode support