Adding Japanese Support
This page reports on the status of adding
Japanese support to GameBoy Book Reader.
I've made a start on this. The
tasks I have to perform are:
As you can see from the picture above I have
made some progress. But there is still some
way to go.
Bear in mind the following. I am constrained
by the terms of the project to find the text
file format in use with Windows in Japan and
provide support for this. This approach has
worked well for most other languages, but for
Japanese there seem to be quite a number of
different text file formats. There seems no
doubt that text formatted according to Codepage
932 should work with Windows, and I am
supporting this. But if other text file formats
are more common I will need to provide import
functions for them. I have been told, for
example, that Unicode is almost universally
used for text files. Please help if you are
Japanese and can tell me more about this.
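
An import function of the kind mentioned above
might be sketched as follows. This is only an
illustration in Python, not code from the
reader itself: the function name
decode_japanese_text is hypothetical, and the
order of the guesses (byte order mark first,
then UTF-8, then Codepage 932) is an
assumption. Guessing by trial decoding is not
reliable in general, because some byte
sequences are valid in more than one encoding.

```python
import codecs

def decode_japanese_text(raw: bytes):
    """Return (text, encoding_name) for raw file bytes (hypothetical helper)."""
    # A byte order mark, when present, identifies a Unicode file.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The utf-16 codec reads the BOM to pick the byte order and drops it.
        return raw.decode("utf-16"), "utf-16"
    # Otherwise try strict decoding; the order here is an assumption.
    for name in ("utf-8", "cp932"):
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("unrecognised text encoding")
```

UTF-8 is tried before Codepage 932 because
Japanese text in Codepage 932 is very unlikely
to be valid UTF-8 by accident, while the
reverse mistake is much easier to make.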
Character Sets and Encoding Methods
The elements of the Japanese language which
need representing in a character set are:
- Hiragana (83 characters used to write
the grammatical parts of words and sentences,
and to write Japanese words which don't
have a Kanji)
- Katakana (86 characters used for writing
non-Japanese words, and for advertising)
- Kanji (Thousands of Chinese ideographs;
each character conveys a meaning, rather
than a sound)
In addition CJK character sets tend to include
other potentially useful things like:
- Latin alphabet
- Greek alphabet
- Cyrillic alphabet
- Special symbols
- Line drawing elements
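
Once text has been decoded to Unicode, the
three Japanese scripts listed above can be told
apart by code point. The following Python
sketch is an illustration only: the function
name script_of is hypothetical, and the ranges
are the Unicode blocks for Hiragana, Katakana,
half-width Katakana and the main CJK ideograph
block, which leaves out the rarer extension
blocks.

```python
def script_of(ch: str) -> str:
    """Classify one character by Unicode block (hypothetical helper)."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    # Full-width Katakana block, plus the half-width Katakana letters.
    if 0x30A0 <= cp <= 0x30FF or 0xFF65 <= cp <= 0xFF9F:
        return "katakana"
    # Main CJK Unified Ideographs block only; extensions are not covered.
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    return "other"
```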
You will have to pardon me if I appear to
mix up character sets and encoding here. It
is hard to get clear information on this subject
without spending money.
In the following, JIS means Japanese Industrial
Standard; these are standards published by
the Japanese Standards Association (JSA).
I have found references to the following
character set standards:
JIS X 201 - 1976
This is more or less ASCII (128 chars) plus
63 half-width katakana characters (a minimum
set of characters for expressing Japanese).
Clearly intended as a single 8-bit byte encoded
set.
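
The figure of 63 can be checked with a short
Python sketch. It assumes Python's built-in
shift_jis codec, which keeps these single-byte
katakana at byte values A1 to DF and maps them
to the Unicode half-width forms; it does not
use a JIS X 201 codec as such.

```python
# Bytes A1..DF inclusive hold the single-byte (half-width) katakana.
halfwidth = bytes(range(0xA1, 0xE0)).decode("shift_jis")

print(len(halfwidth))  # 63
print(halfwidth)       # the half-width punctuation and katakana themselves
```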
JIS X 208 - 1990
Basic Japanese set of 6,879 characters including
6,355 kanji, plus various other useful elements.
JIS X 212 - 1990
Supplemental Japanese set of 6,067
characters including 5,801 kanji, plus various
other useful elements. Most or all are additional
to JIS X 208 - 1990. May not be in very common
use?
JIS X 213 - ????
Extension to JIS X 208 - 1990.
Don't know if it ever emerged. If anyone can
tell me, I'd be grateful.
JIS X 221 - 1995
Japanese version of Unicode
EUC Encoding
This is used on Unix systems. Briefly...
Byte range | Character set
21 - 7E | ASCII
A1A1 - FEFE | JIS X 208
8EA1 - 8EDF | Half-width Katakana
8FA1A1 - 8FFEFE | JIS X 212-1990
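
The table can be turned into a rule for finding
character boundaries: the first byte of each
character says how many bytes it occupies. A
minimal Python sketch follows; the function
name euc_jp_char_lengths is hypothetical. As I
understand EUC, the 8F prefix for JIS X 212 is
followed by two further bytes, so those
characters take three bytes in all, and the
sketch assumes this.

```python
def euc_jp_char_lengths(data: bytes) -> list:
    """Return the byte length of each character in EUC-encoded data."""
    lengths = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            n = 1                      # ASCII
        elif lead == 0x8E:
            n = 2                      # 8E + one half-width katakana byte
        elif lead == 0x8F:
            n = 3                      # 8F + two bytes of JIS X 212
        elif 0xA1 <= lead <= 0xFE:
            n = 2                      # two bytes of JIS X 208
        else:
            raise ValueError(f"unexpected lead byte {lead:#04x} at offset {i}")
        if i + n > len(data):
            raise ValueError("truncated character at end of data")
        lengths.append(n)
        i += n
    return lengths
```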
Codepage 932
This is how I believe Windows expects
to see a text file encoded. This is also called
Shift-JIS. A brief summary follows:
Type | Byte range | Contents
Single byte characters | 00 - 7F | As ASCII but with Yen symbol at 5C
Single byte characters | A1 - DF | Half-width Katakana
Double byte characters | 8140 - 9FFC | Loads of other stuff
Double byte characters | E040 - FBFC, FC40 - FC4B | Loads of other stuff
Notice how single byte and double byte characters
are still distinguishable by the first byte.
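
To illustrate, here is a minimal Python sketch
that splits Codepage 932 data into characters
using only the first byte of each one. The
function names are hypothetical, and the lead
byte ranges (81 to 9F and E0 to FC) are taken
from the summary above.

```python
def is_lead_byte(b: int) -> bool:
    """True if b starts a double-byte character in Codepage 932."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def split_cp932(data: bytes) -> list:
    """Split encoded data into one bytes object per character."""
    chars = []
    i = 0
    while i < len(data):
        n = 2 if is_lead_byte(data[i]) else 1
        if i + n > len(data):
            raise ValueError("truncated double-byte character")
        chars.append(data[i:i + n])
        i += n
    return chars
```

Note that the second byte of a double-byte
character can fall in the ASCII range (40 to
7E), so the data has to be scanned from the
start; a byte picked from the middle of a file
cannot be classified on its own.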
Adding Characters To Font
I have found something, which appears to
be Freeware, containing a set of suitable
Japanese glyphs 12 pixels high. I have added
these to the Gameboy Reader font file font_12.gbf
and you can see the result in the screenshot
above. Before I finalise this set of glyphs
I would like some feedback from a Japanese
reader as to whether it looks OK or whether I
have made a gross error.
Scroll Bar Text
Luc has provided a scroll bar text translation
- thanks Luc!
Classic Text
I have found a suitable text
which is shown above displayed on the Gameboy.
It is called something like 'I am a Cat'.
Other Topics
What's all this 'Half-width / Full-width' stuff?
As I started with a variable width font, this
wasn't quite as obvious to me. But I suppose
the point is that in the days when computer
screens were 80 characters across, Chinese
ideographs had to take up two normal character
positions to get the detail needed. Such a
character must be a full-width character, and
there must have been 40 positions for them
across the screen. So if some of the characters
used are an eightieth of a screen wide, because
you can fit more Katakana in that way, you
probably would call them half-width characters.
Looking at Japanese texts, it would seem that
Japanese readers still expect all characters to
be the same width, so that the text lines up in
columns as well as rows.
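
The idea of counting screen positions can be
sketched in Python using the East Asian Width
property from the Unicode character database.
The function name display_columns is
hypothetical, and the sketch assumes plain text
without combining marks or control characters.

```python
import unicodedata

def display_columns(text: str) -> int:
    """Count screen positions: 2 per full-width character, 1 otherwise."""
    total = 0
    for ch in text:
        # 'W' (wide) and 'F' (full-width) characters occupy two positions.
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total
```

With this rule, 40 kanji fill an 80 position
line, while a half-width Katakana character
takes a single position like a Latin letter.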