
[Last Modified: 12 Feb 2002]

Back To GameBoy Project Main Page


Game Boy Book Reader (Korean Version)


[Dec '01 - Since the completion of this project back in July 2001, the entire Gameboy Book Reader project has been re-written to use Unicode and a font size that is independent of the Gameboy tile size.

The new project includes Dong-an's original 8 pixel high Hangul font, but it also makes larger Hangul fonts possible, as well as the inclusion of the Hanja and Johab characters, and faster support of other languages.

Some of the text below has been updated to cover new information - but much is unaltered from the time it was written during the 2 weeks of the project]


The Project

I have often received requests for a Chinese, Japanese or Korean version of GameBoy Book Reader, but without some help from someone in one of these parts of the world it would not have been possible for me to start.


Kwak Dong-an's original design concept, drawn by him.


However, Kwak Dong-an from Korea has now offered to help me by telling me how his PC works in Korea, and by designing a font. The picture above shows his original impression of a page, before we started. The idea was to decide whether the pre-combined Hangul characters are readable enough when drawn with a maximum height of 7 pixels!

 

 

Project Status

The project is now released - it took 16 days!

Using it, Kwak Dong-an added all of the roughly 2500 characters in only about a week. He doesn't seem to need much sleep!

If you are Korean and have questions or suggestions you can send Kwak Dong-an an email here.


An actual screenshot of the working Korean GameBoy Book Reader


Acknowledgments

Many thanks to Kwak Dong-an without whom the Korean version would definitely not have been possible.


Example Text (see Korean PC's below for explanation)

The following image shows part of a text file as it would be displayed in Korean Wordpad. Don't ask me what it says!

Here are the bytes in the text file used above. Notice the English characters B&O near the start, and that their byte values are all below 0x80. You can also see space characters and CR, LF (0x0d and 0x0a). The Korean-looking characters are pairs of bytes, like BF C0.

(By the way, the display above comes from the PROMDRIVER software of my sponsor MQP Electronics. Their device programmer software has a built-in file editor which lends itself nicely to examining and modifying various files.)


Everything I know about Korean

Introduction

First of all let me say that a week ago I had never heard of any of the stuff I'm describing below. A lot of it is probably wrong - so I hope that if you notice an error, you'll e-mail me!

Writing method

Well, I couldn't assume it, but Kwak Dong-an tells me that Koreans write from left to right starting at the top of the page. Just like I do. That makes things easier!

Character set

This is defined in KS C 5601-1992. It includes 2350 pre-combined Hangul, 4888 Hanja, and a smaller number of other characters including Western ASCII.

Pre-combined Hangul are best described to a Westerner as symbols that each represent a three-letter word (or presumably a three-letter syllable). It seems that there are 2350 meaningful combinations, or a total of 11172 if you include all the ones that never get used (called Johab encoding). We are obviously aiming for the minimum character set here so that it fits into the GameBoy.
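
The 11172 figure can be checked with a little arithmetic. The sketch below is not part of the project. It assumes the Unicode arrangement of pre-combined Hangul (19 initial consonants, 21 vowels, and 27 final consonants plus "no final", starting at U+AC00), and the names in it are made up for illustration.

```python
# Assumption: Unicode's layout of pre-combined Hangul syllables.
INITIALS = 19            # leading consonants
VOWELS = 21              # medial vowels
FINALS = 28              # 27 trailing consonants, plus "no trailing consonant"
FIRST_SYLLABLE = 0xAC00  # code point of the first pre-combined syllable

TOTAL = INITIALS * VOWELS * FINALS


def split_syllable(ch):
    """Return (initial, vowel, final) indices for one pre-combined syllable."""
    offset = ord(ch) - FIRST_SYLLABLE
    if not 0 <= offset < TOTAL:
        raise ValueError("not a pre-combined Hangul syllable")
    initial, rest = divmod(offset, VOWELS * FINALS)
    vowel, final = divmod(rest, FINALS)
    return initial, vowel, final


print(TOTAL)                      # 11172
print(split_syllable("\ud55c"))   # (18, 0, 4)
print(split_syllable("\uac00"))   # (0, 0, 0)
```

A final index of 0 means the syllable has no trailing consonant, so some of the combinations are really two letters rather than three.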

Hanja are Chinese characters. It seems that we may be able to leave these out and still have a useful book reader.

 

 

Korean PC's

Text Files

The important thing to understand, if we are going to display Korean characters, is how they are represented in a text file. I am used to every character in a text file being 1 byte. Since this is a Windows project I consider a text file to be one which looks like text in Wordpad (or Notepad).

Looking at a Korean text file I found that there are 1-byte characters and 2-byte characters. One-byte characters are (almost) the same as ASCII characters. They have values from 0x00 to 0x7f. Any byte which is 0x81-0xfe is the first byte of a pair of bytes representing a character. These 2-byte characters have enough different combinations for all the rest of the character set.
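
The rule above can be sketched in a few lines of Python. This is only an illustration of the byte rule as I have described it, not the Book Reader's own code. The function name and the sample bytes are made up, and stray bytes (0x80, 0xff, or a first byte with nothing after it) are simply passed through as single bytes.

```python
def split_characters(data):
    """Split raw bytes into 1-byte and 2-byte characters."""
    chars = []
    i = 0
    while i < len(data):
        byte = data[i]
        # 0x81-0xfe starts a 2-byte character, provided a second byte exists.
        if 0x81 <= byte <= 0xFE and i + 1 < len(data):
            chars.append(data[i:i + 2])
            i += 2
        else:
            # ASCII, or a stray byte we cannot pair up.
            chars.append(data[i:i + 1])
            i += 1
    return chars


# Made-up sample: "B&O", a space, one 2-byte character, then CR LF.
sample = b"B&O \xbf\xc0\r\n"
print(split_characters(sample))
# [b'B', b'&', b'O', b' ', b'\xbf\xc0', b'\r', b'\n']
```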

There seems to be a standard called Wansung encoding where both bytes of a 2-byte character range from 0xa1 to 0xfe (so the codes run from 0xa1a1 to 0xfefe). This allows room for all the characters in KS C 5601-1992.
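
One way to picture the Wansung range is as a grid of 94 rows by 94 columns. The sketch below turns a byte pair into a position in that grid, which is the kind of number a font table could be indexed by. That use is my assumption for illustration, not a description of how the Book Reader stores its font, and the names are made up.

```python
FIRST = 0xA1                      # lowest value of either byte
LAST = 0xFE                       # highest value of either byte
PER_ROW = LAST - FIRST + 1        # 94 values per byte
CAPACITY = PER_ROW * PER_ROW      # number of possible 2-byte codes


def wansung_index(first, second):
    """Map a Wansung byte pair to a position from 0 to CAPACITY - 1."""
    if not (FIRST <= first <= LAST and FIRST <= second <= LAST):
        raise ValueError("not a Wansung byte pair")
    return (first - FIRST) * PER_ROW + (second - FIRST)


print(CAPACITY)                    # 8836
print(2350 + 4888 <= CAPACITY)     # True: the Hangul and Hanja fit
print(wansung_index(0xBF, 0xC0))   # 2851
```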

Microsoft seem to use a superset of this called Extended Wansung or Codepage 949. This has a first byte from 0x81-0xfe and a second byte from 0x41-0x5a, 0x61-0x7a or 0x81-0xfe. Because of this, two bytes can represent many more characters, such as the rest of the Johab characters. Every valid Wansung file is therefore also valid Extended Wansung.
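
As a rough check of those ranges, the sketch below classifies a byte pair and counts how many extra codes the wider ranges allow. It uses only the ranges quoted above, so the count is an upper bound rather than the number of characters Codepage 949 actually defines. The names are made up for illustration.

```python
WANSUNG = [(0xA1, 0xFE)]                               # both bytes
EXT_FIRST = [(0x81, 0xFE)]                             # first byte
EXT_SECOND = [(0x41, 0x5A), (0x61, 0x7A), (0x81, 0xFE)]  # second byte


def in_ranges(value, ranges):
    """True if value falls inside any (low, high) range, inclusive."""
    return any(low <= value <= high for low, high in ranges)


def size(ranges):
    """Number of byte values covered by a list of inclusive ranges."""
    return sum(high - low + 1 for low, high in ranges)


def classify_pair(first, second):
    """Return 'wansung', 'extended' or 'invalid' for a byte pair."""
    if in_ranges(first, WANSUNG) and in_ranges(second, WANSUNG):
        return "wansung"
    if in_ranges(first, EXT_FIRST) and in_ranges(second, EXT_SECOND):
        return "extended"
    return "invalid"


extra = size(EXT_FIRST) * size(EXT_SECOND) - size(WANSUNG) ** 2
print(classify_pair(0xBF, 0xC0))   # wansung
print(classify_pair(0x81, 0x41))   # extended
print(extra)                       # 13592
print(extra >= 11172 - 2350)       # True: room for the remaining Johab set
```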

Keyboards

I expect Western readers are wondering how all these characters can be entered on a PC keyboard. The answer apparently is called Korean IME (Input Method Editor). As you type the individual Hangul (see keyboard picture below) the IME combines them into the correct combinations.
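
To give a feel for the combining step, here is a minimal sketch that builds one pre-combined character from its individual letters. It assumes the Unicode arrangement of Hangul syllables (starting at U+AC00), and the function name is made up. A real IME does far more, such as tracking what has been typed so far and handling corrections, so this shows only the final arithmetic.

```python
# Individual letters in the order Unicode uses for each position.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
          "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
          "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]
FIRST_SYLLABLE = 0xAC00


def combine(initial, vowel, final=""):
    """Combine individual Hangul letters into one pre-combined character."""
    i = INITIALS.index(initial)   # raises ValueError for an unknown letter
    v = VOWELS.index(vowel)
    f = FINALS.index(final)       # "" means no final consonant
    offset = (i * len(VOWELS) + v) * len(FINALS) + f
    return chr(FIRST_SYLLABLE + offset)


# Six keystrokes become two characters.
print(combine("ㅎ", "ㅏ", "ㄴ") + combine("ㄱ", "ㅡ", "ㄹ"))   # 한글
```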

If you want to type Hanja, you enable a particular function and then just type the Hangul that make the same sound as the Hanja (you can usually write a word in both). The IME then gives you a list of Hanja to choose from, as many make the same sound. The list is organised in order of frequency of use, and this order can apparently adapt to the user's own frequency of use.

Possible IMEs are MS KOIME (from Microsoft's web site) or NJIME (from NJStar).

[Thanks to Mark Williamson for the above IME information - 10 December 2001]

These characters are individual Hangul letters, not pre-combined characters.


The font editor document is being re-designed to deal with Korean. Here it is, nearly finished. What I thought was cool is that the Hangul character in the character description actually comes out looking like it should when you run the program on a Korean PC!


Useful Links

Book Recommendations and Links

An Introduction to Korean

Ken Lunde links page (see the CJK.INF document)
