Mischa POSLAWSKY [Sun, 5 Nov 2023 20:25:16 +0000 (21:25 +0100)]
domino tile celebrating number 21
Technically 15 in base7, but assume the pips are read as base10 digits.
Alternatively U+1F0F5 🃵 PLAYING CARD TRUMP-21 might be represented by
roman ⅩⅡ, but not semantically.
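The two readings of the tile can be checked with a short Python sketch. It is
not part of the commit; the variable names are illustrative and the tile is
assumed to show 2 and 1 pips.

```python
# Pips on the two halves of the tile (assumed here to be 2 and 1).
left, right = 2, 1

# Each half shows 0-6 pips, so the halves behave like base-7 digits.
as_base7 = left * 7 + right

# Reading the same pips as decimal digits gives the celebrated number.
as_base10 = left * 10 + right

print(as_base7, as_base10)
```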
Mischa POSLAWSKY [Sun, 5 Nov 2023 18:59:09 +0000 (19:59 +0100)]
extended repertoire of latest muchunicode conversions
More relevant matches from later or obscure unicode extensions:
- Filler lookalikes and/or digraphs for:
- "c" (with hook to refer to "h" pronunciation),
- "un" (armenian t),
- "i" (information symbol),
- "co" (digraph),
- "d" (roman 500),
- "e" (estimated symbol).
- Acute variants for "h" and "d" (fuzzy semantics but similar).
- Letter variants with dot below,
- only missing "C"/"c", good for combining dot and diaeresis.
- Stroked variants for "m" (mill) and "n" (oblique), improved for:
- "d" (overlay),
- "u" (stroke without smallcaps standin),
- "U" (bar),
- complementary digits.
Mischa POSLAWSKY [Thu, 26 Oct 2023 17:23:21 +0000 (19:23 +0200)]
missing muchunicode conversions
Initial copy of Fullwidth (plus Tags) and [A-cute, Rock Dots, Stroked]
pseudoalphabets from <https://qaz.wtf/u/convert.cgi?text=MuchUniCode>.
Mischa POSLAWSKY [Mon, 17 Oct 2022 20:24:59 +0000 (22:24 +0200)]
game emojis (cards, hands, races, environments, foods)
Either direct representations, or picked visually most recognisable:
- playing card suits
- French: diamonds, hearts, spades, clubs
- Swiss: bells, shields, roses, acorns (chestnut)
- Latin: clubs (dagger), cups (trophy), swords, coins
- rock paper scissors hands (fist palm V-sign)
- excluding spock 🖖 and lizard 🤏
- Starcraft races (terran zerg protoss, or man bug alien)
- Magic the Gathering mana colours
- sun for white plains
- tree for green forests
- fire(ball) for red mountains
- skull for black swamps
- water (drop) for blue islands
- Wingspan foods (cherries, worm, wheat (rice), fish, rat)
Mischa POSLAWSKY [Sun, 21 Aug 2022 17:50:04 +0000 (19:50 +0200)]
encode actual data in base2048, also ecoji
Replace random payload by the popular 0x09F9 cryptographic key (belatedly)
following the 2007 protest/meme.
Alternate encoding in emoji glyphs <https://github.com/keith-turner/ecoji>
for some random coverage.
Mischa POSLAWSKY [Sun, 21 Aug 2022 14:15:22 +0000 (16:15 +0200)]
base2048 encoded line of random glyphs
Binary encoding <https://github.com/qntm/base2048> using Twitter "light"
characters, encountered for Hatetris <https://qntm.org/hatetris> replays.
Mischa POSLAWSKY [Mon, 26 Sep 2022 15:49:10 +0000 (17:49 +0200)]
replace fake armenian full stop
Proper U+0589 (։) instead of lookalike colon U+3A (:).
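A quick way to tell the two lookalikes apart is to print their Unicode names.
This is only a sketch using Python's standard unicodedata module, not
something taken from the commit.

```python
import unicodedata

# The Armenian full stop and the ASCII colon look alike but are distinct
# code points with distinct names.
for ch in ("\u0589", "\u003a"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```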
Mischa POSLAWSKY [Wed, 14 Sep 2022 00:22:02 +0000 (02:22 +0200)]
animal band learns to play unicode 15 instruments
Bring back turtle and snail to show off some flute and maracas.
Mischa POSLAWSKY [Sun, 11 Sep 2022 22:57:09 +0000 (00:57 +0200)]
alfauxbet old-lisu letters D/G/L
Prefer Fraser script ꓓꓖꓡ over (roman/cyrillic/armenian) ⅮԌԼ,
as the latter are more likely to appear visually distinct (serifed).
Mischa POSLAWSKY [Sun, 18 Sep 2022 16:03:31 +0000 (18:03 +0200)]
one eighth fill blocks in block box
Mischa POSLAWSKY [Sun, 18 Sep 2022 16:16:05 +0000 (18:16 +0200)]
fill overview table gap by fill blocks
Mischa POSLAWSKY [Wed, 14 Sep 2022 00:09:08 +0000 (02:09 +0200)]
proper goose instead of swan imposter in duckling story
Finally a literal encoding of "and five white geese" in Unicode 15.0.
Mischa POSLAWSKY [Tue, 13 Sep 2022 23:45:30 +0000 (01:45 +0200)]
block support table updated to unicode 15.0
Copy Devanagari-A and Kawi (and bonus CJK-H) representations from just
updated Last resort <https://github.com/unicode-org/last-resort-font>.
Mischa POSLAWSKY [Sun, 11 Sep 2022 22:58:33 +0000 (00:58 +0200)]
replace subscript c replacement x by greek chi
Transliteration of chi matches the c in much.
Mischa POSLAWSKY [Fri, 2 Sep 2022 23:44:40 +0000 (01:44 +0200)]
sample characters from cjk extension G
Include named references given on Wikipedia about this block:
> The infamous exotic characters Biáng and Taito are present in this block,
> along with the character for Ky Fan's given name.
Mischa POSLAWSKY [Fri, 19 Aug 2022 00:41:15 +0000 (02:41 +0200)]
javascript emoji (food transforms and combined people)
Copied from: https://twitter.com/steveluscher/status/741089564329054208
Mischa POSLAWSKY [Thu, 18 Aug 2022 14:17:51 +0000 (16:17 +0200)]
another stable release v2.1
Mischa POSLAWSKY [Fri, 30 Jul 2021 15:25:03 +0000 (17:25 +0200)]
fuck it animal band
Compilation of animals and musical instruments as found on Twitter
https://twitter.com/beIicoso/status/1218722519638794240 and
https://twitter.com/jenniferdaniel/status/1397294999289536515
with more varied animal talent inspired by replies.
Mischa POSLAWSKY [Thu, 29 Jul 2021 19:17:07 +0000 (21:17 +0200)]
the fuzzy duckling in emoji
Original transcription from the 1949 book by Jane Werner Watson.
Missing geese and cattails replaced by approximating swans and rice ears
respectively.
Mischa POSLAWSKY [Thu, 29 Jul 2021 19:05:25 +0000 (21:05 +0200)]
caterpillar uneatable illustrations
Addition of everything emojiable from the book.
Mischa POSLAWSKY [Thu, 29 Jul 2021 19:02:28 +0000 (21:02 +0200)]
the very hungry caterpillar in emoji
Transcription by Jennifer Daniel of the 1969 book by Eric Carle:
https://twitter.com/jenniferdaniel/status/1397793178351149061
Mischa POSLAWSKY [Mon, 9 Mar 2020 22:28:39 +0000 (23:28 +0100)]
remove superfluous key symbols from overview
Lesser used symbols for alt ⎇ and newline  are specific to shell prompts;
remaining option ⌥ and return ↵ are enough to get an impression.
Mischa POSLAWSKY [Wed, 17 Aug 2022 00:44:13 +0000 (02:44 +0200)]
reorder currency symbols in font overview
More problematic CJK yuan to secondary line, common pound and won on first.
Mischa POSLAWSKY [Mon, 15 Aug 2022 21:51:09 +0000 (23:51 +0200)]
reduced font overview width with 5th line
Remove less related columns for Georgian, Armenian, and Hebrew/Arabic,
as these are more rarely (expected to be) supported, provide a very
limited summary, and can be seen in the following table.
Move all symbols of ambiguous width to a new row, filled out with wide
emoji samples (common categories in GBoard &al), personal preferences:
- Smileys & Emotion: smiley
- People: raised hand
- Animals & Nature: turtle
- Food & Drink: coffee (only BMP glyph not in U+1F*)
- Travel & Places: ideal transport
- Activities & Events: trophy
- Objects: bulb
- Symbols: keyboard input
- Flags: checkered flag
Mischa POSLAWSKY [Sun, 14 Aug 2022 21:40:17 +0000 (23:40 +0200)]
armenian
Pangram (if "clunky") suggested by @Armenotype on Twitter.
Paragraph from Wikipedia including a number and Old Armenian quote.
Mischa POSLAWSKY [Mon, 6 Jul 2020 00:58:33 +0000 (02:58 +0200)]
fücking rock dots
Combine common umlaut and uncommon triple dot (logo from Die Ärzte)
with "metal" in each increasingly ill-suited CJK script.
Also attempt a diaeresis below exclamation mark to finish ströng
with overlapping dots.
Inspired by Chinese Häagen Dazs with a fucking umlaut in 哈̈根達斯 as seen on
<https://twitter.com/GretchenAMcC/status/1279629956406968321>.
Mischa POSLAWSKY [Sat, 4 Apr 2020 04:01:49 +0000 (06:01 +0200)]
xml header in html example
Mischa POSLAWSKY [Tue, 20 Aug 2019 13:19:20 +0000 (15:19 +0200)]
latin alfauxbet (or fauxlphabet) A-Z in mostly cyrillic
Found 18 [extended] Cyrillic capitals that usually appear (near) identical
to Latin counterparts; so with remaining kin from Greek, Lisu, and Armenian,
and one mathematical symbol, can provide a complete IDN homograph attack
example (similar to AAA lookalikes earlier).
Mischa POSLAWSKY [Thu, 29 Jul 2021 20:24:12 +0000 (22:24 +0200)]
slopes and diagonals from legacy drawing block U+1FBxx
Copied from test samples in vte <https://gitlab.gnome.org/GNOME/vte>
doc/boxes.txt added in commit 0.59.91~41 (2019-11-21).
Mischa POSLAWSKY [Thu, 12 Aug 2021 03:20:40 +0000 (05:20 +0200)]
compass comparison in 11 columns of arrows and triangles
Besides existing single/double arrows and black triangles, add samples for
paired, white, black, triangle-headed (also w/bar), sans-serif, light barb
arrows, and white and medium triangles.
Double triangles last and separate because they are commonly interpreted as
double-width and emoji.
Mischa POSLAWSKY [Thu, 29 Jul 2021 19:26:32 +0000 (21:26 +0200)]
legacy computing shade blocks block
Unicode 13.0 added vertical halves of medium shade and checkered fills,
combined with other gradients into a coherent testing box inspired by the
Unscii 2.0 test picture <http://viznut.fi/unscii/>.
Mischa POSLAWSKY [Sat, 14 Nov 2020 20:12:57 +0000 (21:12 +0100)]
reorganise arrows like triangles
Mischa POSLAWSKY [Fri, 13 Nov 2020 21:36:46 +0000 (22:36 +0100)]
symmetric kaomoji by reversing a tilde
Mischa POSLAWSKY [Fri, 13 Nov 2020 13:49:35 +0000 (14:49 +0100)]
chinese periodic table of elements
Some chronology requiring up to Unicode 11.0 for the latest additions.
Copied from Wikipedia.
Mischa POSLAWSKY [Tue, 10 Nov 2020 20:36:25 +0000 (21:36 +0100)]
aztec code in 2x3 block segments
Redraw the image from commit v1.0-30-g378bf4526a (2015-09-05)
using 27 distinct 2x3 block segments from Unicode v13.0,
replacing 14 2x2 drawing characters still covered in an earlier diagram.
Results in a square aspect ratio assuming tall fonts.
Larger micro QR code does not offer more distinct characters.
Mischa POSLAWSKY [Mon, 9 Nov 2020 20:23:47 +0000 (21:23 +0100)]
redraw heavy grid transitions in 7x7 box
Visually similar presentation of all significant light/heavy line characters.
No longer includes ╇╈╁╀ in addition to other more complex combinations:
┞┲┭┮┱┧
┟╀┾╈╉┦
┡┽╁╊╇┪
┢┵┺┹┶┩
Mischa POSLAWSKY [Sun, 8 Nov 2020 07:22:41 +0000 (08:22 +0100)]
version update to unicode 13.0
Mischa POSLAWSKY [Sun, 8 Nov 2020 08:37:18 +0000 (09:37 +0100)]
block allocation glyphs for brahmic zone U+11xxx
Up to unicode 13.0, mostly following Last Resort icons in
<https://github.com/unicode-org/last-resort-font/releases/tag/13.001>.
Mischa POSLAWSKY [Sat, 7 Nov 2020 17:40:39 +0000 (18:40 +0100)]
update sanskrit transcriptions
Replace flawed contents with recent improvements from:
https://commons.wikimedia.org/wiki/File:संस्कृतम्.png?oldid=495574592
Seems to address concerns raised on
https://mendenlama.tumblr.com/post/120050473698/camfoc-issues
Mischa POSLAWSKY [Sat, 7 Nov 2020 16:13:02 +0000 (17:13 +0100)]
inverse bullet in empty center of block drawing
Mischa POSLAWSKY [Sat, 7 Nov 2020 15:56:32 +0000 (16:56 +0100)]
drop icelandic pangram
Unique thorn and eth both represented in old english just above.
Mischa POSLAWSKY [Sat, 7 Nov 2020 16:37:07 +0000 (17:37 +0100)]
yezidi letter at empty U+10E8x block
Mischa POSLAWSKY [Sat, 7 Nov 2020 14:47:28 +0000 (15:47 +0100)]
change letter symbols to MuchUniCode
Replace symbols by spelling out a more descriptive phrase without as much
encouragement.
Slight increase in glyph coverage, with different advantages, notably:
- Double-struck letters include C from legacy letterlike block.
- Fraktur letters with legacy black-letter C.
- Missing subscript letters for c and d replaced by soundalikes.
- Squared O might be an exceptional emoji :o2: (blood type).
- No good representation for rotated U, substitute big ∩.
- Epigraphic inverted M from latin extended D.
- Include negative enclosed variants instead of missing lowercase.
- Circled japanese and korean to spell out mu-ch.
Count each item with a number in similar style if possible.
Reorder considering those with only single digits available.
Most characters generated by: echo MuchUniCode |
perl -mcharnames -CO -E'
my $line = <STDIN>;
print $line =~ s{\S}{
$name = uc "$_ $&";
$name =~ s/CAPITAL/SMALL/ if $& =~ /\p{Ll}/;
chr(charnames::vianame($name) || 0xFFFD)
}egr for @ARGV;
' \
'MATHEMATICAL MONOSPACE CAPITAL' \
'MATHEMATICAL BOLD CAPITAL' \
'MATHEMATICAL ITALIC CAPITAL' \
'MATHEMATICAL BOLD ITALIC CAPITAL' \
'MATHEMATICAL DOUBLE-STRUCK CAPITAL' \
'MATHEMATICAL SANS-SERIF CAPITAL' \
'MATHEMATICAL SANS-SERIF BOLD CAPITAL' \
'MATHEMATICAL SANS-SERIF ITALIC CAPITAL' \
'MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL' \
'LATIN CAPITAL LETTER TURNED' \
'MATHEMATICAL FRAKTUR CAPITAL' \
'MATHEMATICAL BOLD FRAKTUR CAPITAL' \
'MATHEMATICAL SCRIPT CAPITAL' \
'MATHEMATICAL BOLD SCRIPT CAPITAL' \
'MODIFIER LETTER CAPITAL' \
'LATIN SUBSCRIPT SMALL LETTER' \
'PARENTHESIZED LATIN CAPITAL LETTER' \
'CIRCLED LATIN CAPITAL LETTER' \
'SQUARED LATIN CAPITAL LETTER'
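For comparison, a rough Python counterpart of the Perl one-liner above. This
is only a sketch: `stylise` is a hypothetical helper name, and it assumes the
same approach of building a Unicode character name from a prefix plus the
letter, falling back to U+FFFD where no such name exists.

```python
import unicodedata

def stylise(text, prefix):
    """Map each non-space character to the character named '<prefix> <char>'."""
    out = []
    for ch in text:
        if ch.isspace():
            out.append(ch)
            continue
        name = f"{prefix} {ch}".upper()
        # Lowercase input needs the SMALL variant of the same name.
        if ch.islower():
            name = name.replace("CAPITAL", "SMALL", 1)
        try:
            out.append(unicodedata.lookup(name))
        except KeyError:
            # No character with that name: use the replacement character.
            out.append("\ufffd")
    return "".join(out)

print(stylise("MuchUniCode", "MATHEMATICAL BOLD CAPITAL"))
```

Gaps such as the double-struck C, which lives in the older letterlike block
under a different name, come out as U+FFFD and have to be filled in by hand.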
Mischa POSLAWSKY [Sat, 31 Oct 2020 08:08:33 +0000 (09:08 +0100)]
deobfuscate haskell golf a bit
Keep only a single <$> to test its ligature.
Language comment in front and in full.
Mischa POSLAWSKY [Wed, 11 Mar 2020 19:48:04 +0000 (20:48 +0100)]
cjk extension G character biáng
Famously complex with 58 strokes, and very recently added so minimally
supported (in some cases not even recognised as a wide character).
More details on <https://en.wikipedia.org/wiki/Biangbiang_noodles
#Chinese_character_for_bi.C3.A1ng> of course.
Mischa POSLAWSKY [Sat, 31 Oct 2020 03:38:44 +0000 (04:38 +0100)]
compare variant forms in chinese transliteration
Choose characters to showcase differences in simplified (chinese and
japanese) and traditional cjk, sometimes even matching a separately encoded
root symbol. Also adds some cyrillic of dungan mandarin, and (traditional)
character composition in IDS.
Words mostly chosen at random, pronunciation from wiktionaries.
Mischa POSLAWSKY [Tue, 21 Apr 2020 04:06:45 +0000 (06:06 +0200)]
additional potential A-lookalikes
Related letters found in Pau Cin Hau, Carian, Old Italic and Lydian,
which could look identical depending on the font.
Same for mathematical monospace symbol assuming monospace display.
Mischa POSLAWSKY [Fri, 30 Oct 2020 23:07:15 +0000 (00:07 +0100)]
stroke diacritics can emulate underline, strikethrough, overscore
Support is remarkably bad in some fonts, with marks not connecting,
being mispositioned, or not even combining.
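The emulation can be sketched in Python by appending one combining mark to
every character. The choice of marks below (low line, long stroke overlay,
overline) is an assumption, as is the helper name `decorate`; the commit does
not state which code points were used.

```python
import unicodedata

# Assumed combining marks for each emulated text decoration.
MARKS = {
    "underline": unicodedata.lookup("COMBINING LOW LINE"),
    "strikethrough": unicodedata.lookup("COMBINING LONG STROKE OVERLAY"),
    "overscore": unicodedata.lookup("COMBINING OVERLINE"),
}

def decorate(text, style):
    """Follow every character of text with the combining mark for style."""
    mark = MARKS[style]
    return "".join(ch + mark for ch in text)

for style in MARKS:
    print(style, decorate("sample", style))
```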
Mischa POSLAWSKY [Tue, 21 Apr 2020 03:11:55 +0000 (05:11 +0200)]
reorder overview to align math and currency symbols
Mischa POSLAWSKY [Sat, 4 Apr 2020 03:49:58 +0000 (05:49 +0200)]
sample glyphs from kStrange categories A-U
From Unicode proposal L2/20-059 by Ken Lunde containing Han ideographs
considered "strange" in 12 categories:
Asymmetric, Bopomofo, Cursive, Fully-reflective, Hangul Component,
Incomplete, Katakana Component, Mirrored, Odd Component, Rotated,
Stroke-heavy, Unusual Arrangement/Structure.
Pick 4 mostly random characters, preferably from different CJK blocks of
increasing version up to extension F.
Mischa POSLAWSKY [Sat, 31 Oct 2020 07:15:43 +0000 (08:15 +0100)]
vowelless arabic transliteration
Omit spoken sounds not in the written script.
Mischa POSLAWSKY [Sat, 4 Apr 2020 00:58:53 +0000 (02:58 +0200)]
arabic test sentences
From "Automating the generation and typesetting of Arabic script" by Mansour,
test in §5.4.3:
> The eight sentences were chosen to test these main features:
> - Each letter is correctly generated without compatibility problems with MetaPost
> - Choosing the correct forms of letters (context analysis engine test)
> - Connections between letters (joining problems that require modification in Meta-font files)
> - Kerning
Mischa POSLAWSKY [Sat, 4 Apr 2020 00:57:10 +0000 (02:57 +0200)]
arabic presentational forms
Copy contextual forms from vim output +set arabic to compare with rendered
ligatures. Besides testing compatibility encoding, they mostly serve to
highlight missing support in some terminals (or even rtl if they do).
Mischa POSLAWSKY [Sat, 4 Apr 2020 00:45:14 +0000 (02:45 +0200)]
arabic vowel diacritics
Fully vocalised version with roman transliteration copied from:
http://clagnut.com/blog/2380/#Perfect_pangrams_in_English_.2826_letters.29
Mischa POSLAWSKY [Sat, 4 Apr 2020 00:15:02 +0000 (02:15 +0200)]
common arabic pangram
Mentioned in "Computational Linguistics, Speech and Image Processing for
Arabic Language" by El Gayar and Suen at Test image 3.1.4.b as "the most
common Arabic pangram ... containing all of the basic letters".
Copied version from <http://clagnut.com/blog/2380/
#Perfect_pangrams_in_English_.2826_letters.29> with additional diacritics.
Mischa POSLAWSKY [Sat, 4 Apr 2020 00:03:25 +0000 (02:03 +0200)]
larger african sample in taa
Part of a story on <http://archive.phonetics.ucla.edu/Language/NMN
/nmn_story_1972_01.html> transcript using special glyphs ǀǁǂǃʘʔɟʼʰʲa̰ɜʉʉ̰m̩ŋn̩ɲə
(the last two missing in this snippet).
Contains all distinctive characters of the Khoekhoe pangram, so replace that
by a phrase from <https://en.wikipedia.org/wiki/Taa_language?oldid=943791933>
with slightly different orthography including ɢ.
Mischa POSLAWSKY [Sat, 4 Apr 2020 01:04:58 +0000 (03:04 +0200)]
ForTheWin in smallcaps, superscript, subscript, turned letters
Mischa POSLAWSKY [Thu, 19 Mar 2020 02:08:15 +0000 (03:08 +0100)]
compare mathematical letter symbols for ForTheWin
Reintroduce an even more elaborate overview of letterlike scripts, dismissed
in commit 26e8aa257e (2015-09-12) [drop mathematical ABC symbols line].
While it remains as ill-advised for spelling out words, unfortunately it is
widely used that way nowadays (at least in Twitter user names).
Regardless, style should be consistent (especially considering characters
have been introduced in different versions) and distinct (as it's intended
for unique variables), so a comparison makes sense in any case.
Just put it after proper language scripts.
Mischa POSLAWSKY [Sat, 4 Apr 2020 03:42:48 +0000 (05:42 +0200)]
cjk comparison of traditional, simplified, shinjitai
Sample characters copied from <https://en.wikipedia.org/wiki
/Differences_between_Shinjitai_and_Simplified_characters?oldid=932950382>
"Different simplifications in both languages".
Mischa POSLAWSKY [Sat, 14 Mar 2020 22:19:54 +0000 (23:19 +0100)]
random characters from cjk extension F
Mischa POSLAWSKY [Sat, 14 Mar 2020 22:18:11 +0000 (23:18 +0100)]
common line break after ethiopic header
Match /^\N+:\n\n/ syntax like other titles.
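The title convention can be checked mechanically. A Python sketch of an
equivalent pattern follows; the sample text is made up for illustration, and
`[^\n]` stands in for Perl's `\N`.

```python
import re

# A title is a non-empty line ending in a colon, followed by a blank line.
TITLE = re.compile(r"^[^\n]+:\n\n", re.MULTILINE)

# Hypothetical sample: only the first header is followed by a blank line.
sample = "Ethiopic:\n\nfirst line\nArabic:\nno blank line here\n"

print(TITLE.findall(sample))
```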
Mischa POSLAWSKY [Sat, 14 Mar 2020 22:06:06 +0000 (23:06 +0100)]
single word of zalgo should suffice
Generated by <https://www.zalgotextgenerator.com/>.
Mischa POSLAWSKY [Sat, 14 Mar 2020 21:50:29 +0000 (22:50 +0100)]
append cypriot glyph to linear A space
Previously omitted as it shares the same 0x80 space as Aramaic,
but can be safely moved a bit since the preceding code point is empty.
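The collision follows from integer arithmetic on the block starts. A small
Python sketch; `slot` is a hypothetical helper, and the code points are the
first characters of the Cypriot Syllabary and Imperial Aramaic blocks.

```python
def slot(code_point, size=0x80):
    """Return the start of the size-aligned slot containing code_point."""
    return code_point // size * size

cypriot = 0x10800   # first code point of the Cypriot Syllabary block
aramaic = 0x10840   # first code point of the Imperial Aramaic block

# Both blocks land in the same 0x80 slot, so only one fits there.
print(hex(slot(cypriot)), hex(slot(aramaic)))

# The slot just before it starts at U+10780.
print(hex(slot(cypriot - 1)))
```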
Mischa POSLAWSKY [Sat, 14 Mar 2020 21:44:33 +0000 (22:44 +0100)]
match circled ideograph to cjk representation
Reuse a common glyph as outlined in the Last Resort guidelines, improving
over their own choice.
Mischa POSLAWSKY [Sat, 14 Mar 2020 21:32:07 +0000 (22:32 +0100)]
move A homographs to typography section
Good place to test for expected similarities in derived writing systems,
next to unwanted same-script lookalikes.
The large amount of O-ish glyphs do not really add much except for testing
script coverage.
Mischa POSLAWSKY [Sat, 14 Mar 2020 21:09:49 +0000 (22:09 +0100)]
include letter Ø as common 0 lookalike
Programming fonts regularly use a slashed zero to distinguish the number,
but in some cases make it look very similar to the scandinavian vowel.
In low quality could even be mistaken for 8 shown subsequently.
Confusion with the symbols ∅ and ⌀ is less likely, usually having a
different shape, as do glyphs representing a dotted digit 0 (ʘ, ☉, ⊙, ⨀).
Mischa POSLAWSKY [Sat, 14 Mar 2020 20:22:40 +0000 (21:22 +0100)]
lower pronunciation of Arthur in panphone
Replace duplicate mid-central ɚ by open-mid ɝ for a near-identical result
with a different glyph, already distinguished in all other orthographies.
Mischa POSLAWSKY [Sat, 14 Mar 2020 20:12:12 +0000 (21:12 +0100)]
reorder panphone to drop "on" in second line
Equivalent sentence containing the same sounds, but shorter so it fits
within 76 characters.
Mischa POSLAWSKY [Sat, 14 Mar 2020 20:36:09 +0000 (21:36 +0100)]
panphone in deseret
Manual composition, disregarding the automated translation by <2deseret.com>
as it does not conform to standard spelling outlined in
<http://www.chem.ucla.edu/~jericks/Historical%20or%20Technical/Linguistics/Deseret_Guide.pdf>
Consider it a mostly dead script, so not positioned next to other English.
Mischa POSLAWSKY [Sat, 14 Mar 2020 00:21:44 +0000 (01:21 +0100)]
align U+03x blocks for single width hexagram
Assume U+4DC3 symbol should be one column wide, ignoring double width
displayed by Unifont.
Mischa POSLAWSKY [Thu, 12 Mar 2020 05:31:09 +0000 (06:31 +0100)]
smp block allocation glyphs
Extend representation characters for U+10000-10FFF usually copying
the Unicode font <https://github.com/unicode-org/last-resort-font>.
Mischa POSLAWSKY [Thu, 12 Mar 2020 02:45:16 +0000 (03:45 +0100)]
prefer apple last resort glyphs
Described at <http://developer.apple.com/fonts/LastResortFont/>
for a more uniform style:
> Unicode blocks are illustrated by a representative glyph from the block,
> chosen to be as distinct as possible from glyphs of other blocks.
>
> Examplar glyphs were chosen in a number of ways. Almost all of the
> Brahmic scripts show the initial consonant ka. Latin uses the letter A
> because it's the first letter, and because in each Latin block there is
> a letter A so they can be easily differentiated. Greek and Cyrillic use
> their last letters, omega and ya, because they are so distinctive. Most
> other alphabets and syllabaries use their initial letter where
> distinctive.
Try to avoid unnecessary exceptions, though in some cases I can't help but
know better (usually improving distinctiveness, especially considering
unknown output variants).
Restrict to a single entry per 0x80, mostly keeping the latest unicode
version for maximum effort.
Mischa POSLAWSKY [Wed, 11 Mar 2020 21:51:08 +0000 (22:51 +0100)]
bmp block allocation glyphs
Overview of BMP blocks similar to <http://sheet.shiar.nl/charset/unicode>
each represented by an identifying glyph copied from Unidings v9.19
<http://users.teilar.gr/~g1951d/Unidings.pdf> by George Douros.
Characters align neatly to 0x40 code points, preferring every other column
if feasible, but keeping various smaller (rtl, brahmic) scripts for now.
Silently break positions between U+3400 and U+A400 because these only
contain cjk and yi with usually uniform font coverage.
Mischa POSLAWSKY [Tue, 10 Mar 2020 01:54:56 +0000 (02:54 +0100)]
update version to unicode 10.0
Required for recently added hentaigana.
Mischa POSLAWSKY [Mon, 9 Mar 2020 23:02:33 +0000 (00:02 +0100)]
hentaigana variant of iroha
Version from 現今児童重宝記 <https://www.sljfaq.org/afaq/iroha.html>
(No. 6) painstakingly matched to unicode glyphs described on
<https://en.wikipedia.org/wiki/Hentaigana?oldid=935497055>
with uncertainties resolved by comparing <https://www.semanticscholar.org
/paper/Distinction-and-Difference:-From-Kana-to-Hiragana-Marks
/f334ea3cb70e933ed4f52e174a41c0242d5204a2/figure/5>.
Mischa POSLAWSKY [Mon, 9 Mar 2020 22:41:25 +0000 (23:41 +0100)]
haskell oneliner with programming ligatures
Some obfuscated code (not particularly typical) as found and explained on
<https://stackoverflow.com/questions/12659951/-obfuscated-haskell-code-work>
featuring multi-character combinations <$>, <*>, =<<, >>= substituted by
"modern" coding fonts such as <https://github.com/tonsky/FiraCode>.
This whole practice seems like an awful idea to me, but regardless needs to
be represented for font comparison.
Mischa POSLAWSKY [Mon, 9 Mar 2020 22:28:48 +0000 (23:28 +0100)]
powerline lookalikes for branch and linenr
Indicators for vc branch and line number are common requirements of modern
status bars. While console fonts still prefer private use area U+E0Ax,
similar symbols for "alternate key" and "newline" as mentioned in
<https://vi.stackexchange.com/a/3363> can be advertised instead.
Mischa POSLAWSKY [Mon, 9 Mar 2020 22:22:57 +0000 (23:22 +0100)]
reorder minority alphabets in overview
Cyrillic before Greek as it's stylistically closer to Latin.
Georgian before Armenian as it aligns better with following lines.
Mischa POSLAWSKY [Mon, 9 Mar 2020 22:16:57 +0000 (23:16 +0100)]
reorganise overview table
Mischa POSLAWSKY [Mon, 9 Mar 2020 20:43:45 +0000 (21:43 +0100)]
restrict currencies to most traded
A "compact" overview does not need 17 different currency symbols, mostly
inherited from the <http://kermitproject.org/utf8.html> sampler line.
Test only internationally significant currencies, guided by the top 28 listed
at <https://en.wikipedia.org/wiki/Currency?oldid=942055810#cite_ref-10>,
keeping recent additions (₹, ₽) and adding a full-width character (元).
Mischa POSLAWSKY [Mon, 25 Nov 2019 13:13:37 +0000 (14:13 +0100)]
homoglyphs of A and O
Collect visually similar characters from different scripts.
Unlike ASCII lookalikes presented earlier, these are not expected to be
distinguishable if mixed, and are a worst-case scenario for homograph attacks.
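Since such characters cannot be told apart by eye, a check has to look at the
code points. The Python sketch below uses the first word of the Unicode name
as a crude stand-in for the script, which is a heuristic assumption; the
helper names and the sample word are illustrative only.

```python
import unicodedata

def script_of(ch):
    """Approximate the script by the first word of the character name."""
    return unicodedata.name(ch).split()[0]

def is_mixed(word):
    """True if the letters of word appear to come from several scripts."""
    return len({script_of(ch) for ch in word if ch.isalpha()}) > 1

# Latin A, Greek Alpha and Cyrillic A usually render identically.
lookalikes = ["A", "\u0391", "\u0410"]
print([script_of(ch) for ch in lookalikes])

# A Latin word with one Cyrillic letter smuggled in.
print(is_mixed("p\u0430ypal"))
```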
Mischa POSLAWSKY [Mon, 22 Oct 2018 22:04:28 +0000 (00:04 +0200)]
excessive, scary usage of diacritics; ZALGO!
Copied from <https://knowyourmeme.com/memes/zalgo#scrambled-text>
to stress test combining marks:
- Rendering limitations may exclude glyphs after a certain number.
- Accumulated marks should extend vertically to avoid overlapping.
- Monospace rendering with increased height may cause lines to overlap
or be cropped.
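To judge how hard a sample stresses a renderer, the number of marks stacked
on each base character can be counted. A Python sketch with a hypothetical
helper `mark_counts`; it relies on the canonical combining class, so marks
with class 0 are treated as base characters.

```python
import unicodedata

def mark_counts(text):
    """Return (base character, number of following combining marks) pairs."""
    counts = []
    for ch in text:
        if unicodedata.combining(ch) and counts:
            # Non-zero combining class: attach to the preceding base.
            base, n = counts[-1]
            counts[-1] = (base, n + 1)
        else:
            counts.append((ch, 0))
    return counts

# Z with acute, diaeresis and dot below, followed by a plain a.
print(mark_counts("Z\u0301\u0308\u0323a"))
```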
Mischa POSLAWSKY [Fri, 29 Jun 2018 20:06:21 +0000 (22:06 +0200)]
coptic sample text in old nubian
Equivalents in coptic and greek characters, copied from:
https://en.wikipedia.org/wiki/Old_Nubian?oldid=847789397#Sample_text
Mischa POSLAWSKY [Mon, 9 Mar 2020 22:33:33 +0000 (23:33 +0100)]
different vietnamese dong
Fill available space by a "different" expression with the đồng currency sign
used in its homonym (compensating for its removal from the font overview),
followed by IPA pronunciation averaging several Wiktionary entries including
<https://zh.wiktionary.org/wiki/bất_đồng?oldid=4888545> to most
significantly provide missing ɓ, ɗ, ɜ.
Mischa POSLAWSKY [Fri, 29 Jun 2018 20:04:47 +0000 (22:04 +0200)]
pointer compass, triangles in all directions
Black triangle characters and related, similar (and next) to arrows.
Mischa POSLAWSKY [Fri, 29 Jun 2018 20:03:09 +0000 (22:03 +0200)]
random kaomoji faces
Test appearance of some common Japanese face characters, picked from
https://en.wikipedia.org/wiki/Emoticon#Japanese_style and (mixed)
https://en.wikipedia.org/wiki/List_of_emoticons#Eastern containing
some complex Unicode glyphs.
Excellent test of mixed scripts and common visual expectations.
Mischa POSLAWSKY [Fri, 29 Jun 2018 20:01:39 +0000 (22:01 +0200)]
sanskrit transcriptions from wikipedia
Compare some brahmic scripts with a common sentence copied from:
https://commons.wikimedia.org/?title=File:Phrase_sanskrit.png&oldid=308591152
Meaning seems nice and related:
> May Śiva bless those who take delight in the language of the gods
Even though the variants have various issues, and actual source is unknown
according to http://mendenlama.tumblr.com/post/120050473698/camfoc-issues:
> the ascription of this blessing phrase to Kālidāsa is spurious
Mischa POSLAWSKY [Fri, 29 Jun 2018 14:09:31 +0000 (16:09 +0200)]
shavian corrections for "waters", "heard"
Fix waters being incorrectly transcribed as woiters, losing oil.
Replace expected h-err-d by morphophonemic h-ear-d to include another glyph.
Remaining letters unrepresented: 𐑬𐑭𐑲𐑴 𐑹𐑺𐑾
Mischa POSLAWSKY [Sat, 31 Mar 2018 18:23:14 +0000 (20:23 +0200)]
align hangeul decomposition
Start sentences at same column assuming expected character widths.
Mischa POSLAWSKY [Wed, 13 Jul 2016 17:08:18 +0000 (19:08 +0200)]
chinese transliteration below samples
Mixed scripts after more typical CJK.
Mischa POSLAWSKY [Sun, 5 Jun 2016 12:44:23 +0000 (14:44 +0200)]
cantonese transliteration (jyutping, ipa)
One character per line for better overview and space for additional details,
introducing common non-pinyin tone digits and ɵ pronunciation.
Mischa POSLAWSKY [Sun, 13 Sep 2015 18:17:12 +0000 (20:17 +0200)]
update dated update date to uptodate date
Mischa POSLAWSKY [Sun, 13 Sep 2015 18:07:29 +0000 (20:07 +0200)]
symmetric ascii art bunny
Keep to ASCII characters as commonly used (curved quotation marks were
likely substituted due to an erroneous copypaste).
Mischa POSLAWSKY [Sat, 12 Sep 2015 13:29:06 +0000 (15:29 +0200)]
drop mathematical ABC symbols line
Places too much emphasis on a relatively insignificant plane 1 block.
One such character also introduced in commit 30491ef4cf (2015-09-09)
[complex conjugate formula to cover blackletter and italic letters]
remains elsewhere.
Mischa POSLAWSKY [Sat, 12 Sep 2015 13:25:54 +0000 (15:25 +0200)]
insert non-joiner between non-ligature fl in german pangram
Lost during copypaste from original.
Mischa POSLAWSKY [Fri, 11 Sep 2015 18:11:40 +0000 (20:11 +0200)]
fix mistyped letter in greek iliad
Obvious mistake caught while rereading.
Mischa POSLAWSKY [Fri, 11 Sep 2015 17:22:47 +0000 (19:22 +0200)]
glagolitic tower of babel transcription
Another line to properly finish the story. Preferred succession from
Slavonic would be old Croatian in Glagolitic script. However, unable to
find any such version online, settle for an original composition.
Based on a different source of Church Slavonic without abbreviations
from <http://www.vechnoe.info/bible/translit/gen/11>:
Прїидѣте и изшедше смѣсимъ имъ ту язы́ки ихъ,
да не услы́шатъ ко́ждо дру́га своего.
Converted to Glagolitic using some naive conversion rules:
tr{абвгдежѕзиїклмнопрстуфхѡщцчшъыьѣёюяѩѫ,.}
{ⰰⰱⰲⰳⰴⰵⰶⰷⰸⰺⰻⰽⰾⰿⱀⱁⱂⱃⱄⱅⱆⱇⱈⱉⱋⱌⱍⱎⱏⰹⱐⱑⱖⱓⱔⱗⱘ·:};
s/ⰹ/ⱏⰹ/g;
s/\Bⰺ/ⰻ/g;
Arbitrarily appended three dot+paragraphos punctuation to end text.
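The same naive rules can be sketched in Python. The two alphabets are copied
from the tr table above; the function name is hypothetical. As with the
original rules, anything outside the table (capitals, accents) passes through
unchanged.

```python
import re

CYRILLIC = "абвгдежѕзиїклмнопрстуфхѡщцчшъыьѣёюяѩѫ,."
GLAGOLITIC = "ⰰⰱⰲⰳⰴⰵⰶⰷⰸⰺⰻⰽⰾⰿⱀⱁⱂⱃⱄⱅⱆⱇⱈⱉⱋⱌⱍⱎⱏⰹⱐⱑⱖⱓⱔⱗⱘ·:"

# Pair the characters positionally, like tr{...}{...} does.
TABLE = str.maketrans(dict(zip(CYRILLIC, GLAGOLITIC)))

def to_glagolitic(text):
    """Apply the character table, then the two follow-up substitutions."""
    text = text.translate(TABLE)
    # s/ⰹ/ⱏⰹ/g: the letter mapped from ы becomes a digraph.
    text = text.replace("ⰹ", "ⱏⰹ")
    # s/\Bⰺ/ⰻ/g: the initial form is kept only at the start of a word.
    text = re.sub(r"\Bⰺ", "ⰻ", text)
    return text

print(to_glagolitic("и ми ты"))
```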
Mischa POSLAWSKY [Fri, 11 Sep 2015 14:32:41 +0000 (16:32 +0200)]
cyrillic tower of babel in multiple slavic languages
Replace Russian sample by Genesis 11:1-6 with each line in another
translation from <http://www.omniglot.com/babel/langfam.htm#ie>:
Russian, Serbian, Belarusian, Ukrainian, Macedonian, Church Slavonic.
Adds 16 distinct letters in 29 forms, only loses ф.
Manually transcribed the image of Slavonic (hopefully correctly),
featuring obsolete letters (yat, yus, ou both monographic and digraphic)
and diacritics including U+0483 titlo and U+2DED es with pokrytie.
Mischa POSLAWSKY [Wed, 9 Sep 2015 21:07:56 +0000 (23:07 +0200)]
complex conjugate formula to cover blackletter and italic letters
Elaborate on complex numbers ℂ, covering some more symbols including an
italic i from plane 1. Should provide a new challenge to render correctly
(notably aligning bracket lines after combining mark and 4-byte UTF-8).
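The encoding difference is easy to confirm. The Python sketch below assumes
the plane 1 letter in question is MATHEMATICAL ITALIC SMALL I and compares it
with the BMP symbol for the complex numbers.

```python
import unicodedata

# One BMP symbol and one plane 1 symbol (the latter is an assumption).
for name in ("DOUBLE-STRUCK CAPITAL C", "MATHEMATICAL ITALIC SMALL I"):
    ch = unicodedata.lookup(name)
    size = len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X} {name}: {size} bytes in UTF-8")
```

Code points above U+FFFF always take four bytes, which is what trips up
renderers that count bytes or UTF-16 units instead of characters.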
Mischa POSLAWSKY [Wed, 9 Sep 2015 20:44:50 +0000 (22:44 +0200)]
reorder languages to transition from semitic to indic
Ethiopic after Hebrew (similar languages, simple rendering);
Thai before Hindi to group the more complex scripts together.