Monday 12 May 2003

Unicode rocks

The feedback on my previous post, This is a test of Japanese, indicates that our East Asian languages experiment is proving to be surprisingly successful. The number of visitors—mainly on Mac OS X and Linux/Unix systems—who could see the Japanese characters without making any modifications to their system surpassed my expectations.

A couple of visitors reported success with Windows 98 and Windows XP, though Phil Ringnalda suffered a typically Microsoftian experience:

You two are killing me. I know how to enable CJK (I assume someone will be along with Chinese, anyway) characters, just fine. Then Windows says “show me your Windows CD”, and I say “how about this piece of crap ‘recovery disk’ that’s all I got instead?” Time to toss this laptop for something better. How’s OS X’s support for Japanese?

From the feedback so far, Phil, OS X’s support for Japanese is excellent. Looks like you might be another step closer to buying a PowerBook. If you decide on the 12-inch model though, make sure you buy a pair of asbestos gloves—it seems that cute little sucker runs hot.

Kurt Easterwood came up with a great piece of advice:

I wonder if it might not be a good idea to point users to how to install the Japanese IME from Microsoft. This page has good instructions and download links. (It’s from the same site Stavros linked to for the Korean IME.)

The page in question, Declan’s Guide to Installing and Using Microsoft’s Japanese IME, is a “comprehensive guide to installing and using the Microsoft Japanese IME for Window95/98/ME, Windows 2000 and Windows XP. The IME allows users of non-Japanese versions of Windows to read and enter Japanese hiragana, katakana and kanji scripts in IME enabled applications.” I’ll write a post with links to these and other East Asian language resources and link to it from my sidebar.

There were two reports that the Japanese characters didn’t appear in my RSS feed. That’s the fault of Burningbird’s Evil Twin, who cast a spell that made me forget to change the character encoding from iso-8859-1 to UTF-8 in my RSS 1.0 and 2.0 templates. My apologies; it’s fixed. I’ve just downloaded and installed FeedReader and both RSS feeds look fine. (I thought about using the highly regarded SharpReader but I’m trying to avoid the .NET framework for a little while longer.) Hopefully the Japanese characters are also displaying properly in other newsreaders.
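For anyone curious why the encoding mattered, here is a minimal sketch in Python. It is not the actual template code: the sample string and the two helper functions, can_encode and feed_declaration, are hypothetical names made up for illustration. It shows that ISO-8859-1 has no way to represent Japanese characters, while UTF-8 does, and that UTF-8 bytes read back as ISO-8859-1 come out as gibberish.

```python
# Hypothetical sample text; any string containing Japanese characters behaves the same way.
sample = "日本語のテスト"


def can_encode(text: str, charset: str) -> bool:
    """Return True if every character in text can be represented in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def feed_declaration(charset: str) -> str:
    """Build the XML declaration a feed might start with (illustrative only)."""
    return f'<?xml version="1.0" encoding="{charset}"?>'


for charset in ("iso-8859-1", "utf-8"):
    verdict = "ok" if can_encode(sample, charset) else "cannot represent the text"
    print(feed_declaration(charset), "->", verdict)

# Reading UTF-8 bytes as if they were ISO-8859-1 never raises an error,
# but the result is no longer the original text.
misread = sample.encode("utf-8").decode("iso-8859-1")
print(misread == sample)
```

A newsreader that trusts an iso-8859-1 declaration while the feed actually carries UTF-8 bytes would be in the position of the last two lines, which is one plausible explanation for the missing characters.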

I noticed one vaguely interesting glitch when I looked at the post in Mozilla 1.3.1 on my RedHat 7.2 installation: the period at the end of the sentence is near the top of the block of characters rather than near the baseline.

Mozilla 1.3.1/RedHat Linux 7.2: [image: Japanese characters as seen in Mozilla on Linux]
Mozilla 1.3.1/Windows 2000: [image: Japanese characters as seen in Mozilla on Windows]
IE 6/Windows 2000: [image: Japanese characters as seen in IE on Windows]

The IE Windows text also looks smoother than its Mozilla equivalent. Now I’m wondering whether it might be useful to include a font-family declaration in my stylesheet, though to do that I’ll have to find out the names of the Japanese fonts on Mac OS X, Linux, and BSD Unix. And I’d really rather write Japanese-related entries than continue fussing with the technicalities of East Asian typography, particularly now that Stavros has made such an impressive start to his long-promised review of Hangul, the Korean writing system.

Korean is a subject-object-verb language, for example, and has a rich system of postpositional case markers. Chinese, a subject-verb-object language, does not. Korean has a complicated system of honorifics, part of which is expressed as verb endings. Chinese does not, and doesn’t have any characters to represent these verb-ending morphemes.

I hadn’t realized that Korean and Japanese are so similar structurally: like Korean, Japanese is a subject-object-verb language with postpositional case markers and a system of honorifics. One of my Japanese grammar books, Senko K. Maynard’s An Introduction to Japanese Grammar and Communication Strategies, says that “Japanese is suggested to be distantly related to Korean, and therefore to the Altaic languages (among them, Mongolian and Turkish).” I’m looking forward to seeing how Stavros’ series unfolds and am hoping he’ll cover how to use the Korean IME to write Korean sentences (there’s one I’m dying to include in a post).


For what it's worth, I'm seeing the period on the baseline, in Moz 1.2.1 on Red Hat 7.2. I can't imagine why the Mozilla version would make a difference, though.

Posted by James on 12 May 2003

You know this already, I know you do -- but it's font-family: .

Posted by Dorothea Salo on 12 May 2003

I realized my error, Dorothea, as soon as you pointed it out. Now it's fixed. Perhaps I can only deal with one class of technical complexity at a time -- at the moment it's East Asian languages and so CSS is left to languish.

Posted by Jonathon on 12 May 2003

Japanese displays fine for me on OS X 10.2.6 and Mozilla 1.4b. The period, however, displays halfway between the baseline and the topline.

韓國말을 웹로그에 表示해 보시려면, 이 걸로 해 보시지...

Let's see if this sentence in Korean above displays correctly for you.

Posted by dda on 14 May 2003

The relationships between languages such as Japanese, Korean, and Turkish are amazing to me.

Knowing Japanese since childhood, I later took on Korean a bit during college, and then later learned some Turkish before our honeymoon trip to Turkey and Egypt.

I honestly couldn't believe how similar Korean and Japanese are. I also was amazed at how similar Turkish is to these as well, although the vowel changes in Turkish are a pain in the butt... ;) But structurally it's all the same, so the important thing is that if you can _think_ in Japanese, you can _think_ in Korean or Turkish, and that makes it that much easier to learn... :)

When I get a chance, I would love to explore these relationships more on my blog...

Posted by Trevor Hill on 14 May 2003

This discussion is now closed. My thanks to everyone who contributed.

© Copyright 2007 Jonathon Delacour