The Project
I have often received requests for a Chinese
version of GameBoy Book Reader, but without
some help from someone in this part of the
world it would not have been possible for
me to start. Several people have now offered
help, and sent me information.
This page is an attempt to keep them, and
anyone else who is interested, informed about
progress on this project.
Thanks to JACK for the Game Boy Book Reader
button at the top of the page.
Please send any comments to this e-mail
address.
I will also be looking for some short stories
in Chinese in text file form, suitable for
viewing in Notepad...
Chinese Characters and Scrolling
Chinese text display presents new problems
for the Gameboy Book Reader which are going
to involve a complete change of approach.
The first problem encountered was that the
complexity of Chinese characters makes it
virtually impossible to represent them in
tiles 8 pixels high, especially as we expect
to leave the bottom row relatively clear to
allow a gap between rows.
The current approach to text display in the
reader engine uses the tile mapping as a means
of fast scrolling of a line. Instead of moving
the character data we just re-number the tile
map so that the displayed position changes
(this involves writing sixteen times fewer
bytes).
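As a rough check of that ratio, the two approaches can be compared for one line of text. This is only a sketch: it assumes the standard Game Boy tile format (8 x 8 pixels at 2 bits per pixel, so 16 bytes per tile), one byte per tile map entry, and a full-width line of 20 tiles. The function names are made up for illustration.

```python
# Assumptions: 8x8 tiles at 2 bits per pixel, one byte per tile map entry.
TILE_BYTES = 8 * 8 * 2 // 8   # 16 bytes of pixel data per tile
MAP_ENTRY_BYTES = 1           # one tile number per map position
TILES_PER_LINE = 20           # 160 pixel wide screen / 8 pixel wide tiles


def bytes_to_renumber_line(tiles=TILES_PER_LINE):
    """Bytes written when a line is moved by rewriting the tile map."""
    return tiles * MAP_ENTRY_BYTES


def bytes_to_copy_line(tiles=TILES_PER_LINE):
    """Bytes written when the character data itself is moved."""
    return tiles * TILE_BYTES


print(bytes_to_renumber_line(), bytes_to_copy_line())
```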
However this more or less forces the use
of characters which are 8 or 16 pixels high
(including any space between rows). And as
soon as we go to 16 pixel high characters
we halve the number of rows displayable (to
6), and probably halve the number of characters
on a line. So the number of characters on
a page would be reduced quite severely.
A good compromise might be characters 11
or 12 pixels high, as it looks possible to
make convincing Chinese characters this high.
The other factor is that a couple of files
have been made available to me, containing
Chinese fonts with 12 x 12 and 16 x 16 pixel
characters respectively. This promises to save a lot
of work for somebody. The point I am currently
uncertain about is whether I am going to need
to allow a further row of pixels to separate
these characters vertically.
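The effect of character height on the number of rows can be worked out quickly. The sketch below assumes a text area 96 pixels high, which is only inferred from the figure of 6 rows at 16 pixels given above; the real text area may differ. A pitch of 13 stands for a 12 pixel font with one extra separating row.

```python
TEXT_AREA_HEIGHT = 96  # assumption: 6 rows x 16 pixels, as implied above


def rows_for(pitch):
    """Whole text rows that fit when each row occupies `pitch` pixel rows."""
    return TEXT_AREA_HEIGHT // pitch


for pitch in (8, 11, 12, 13, 16):
    print(pitch, "pixel pitch:", rows_for(pitch), "rows")
```

Under this assumption an 11 or 12 pixel pitch gives 8 rows, and adding a separating row to a 12 pixel font costs one of them.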
I have also seen a bitmap of the Chinese
version of Notepad. This seems to use a system
font with 11 x 11 pixel Chinese characters.
If I could get hold of a file with this font,
I would be really pleased!
All this points to the need for the reader
engine to be flexible in the character height,
and not require an integral number of tiles
vertically per character row. This then means
that vertical scrolling will need to be performed
by a memory copy process of the actual character
data.
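The idea of scrolling by copying character data can be modelled as follows. This is a simplified sketch, not the reader engine's code: it treats the text area as one linear buffer with a fixed number of bytes per pixel row, whereas on the Gameboy the data is organised in tiles, so the real copy would be split into several smaller moves. The function and parameter names are hypothetical.

```python
def scroll_up(buf, bytes_per_row, pixel_rows):
    """Scroll a linear bitmap buffer up by `pixel_rows`, clearing the bottom.

    buf           -- bytearray holding the text area, top pixel row first
    bytes_per_row -- bytes in one pixel row (assumed constant)
    pixel_rows    -- how far to scroll; need not be a multiple of 8
    """
    shift = pixel_rows * bytes_per_row
    if shift >= len(buf):
        buf[:] = bytes(len(buf))          # everything scrolls off screen
        return
    keep = len(buf) - shift
    buf[:keep] = buf[shift:]              # move the remaining rows up
    buf[keep:] = bytes(shift)             # blank the rows exposed at the bottom


area = bytearray([1, 1, 2, 2, 3, 3])      # three pixel rows of two bytes each
scroll_up(area, bytes_per_row=2, pixel_rows=1)
print(list(area))
```

Because the scroll distance is given in pixel rows, the same routine serves an 11, 12 or 16 pixel row height.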
Initial mental calculations suggest that
this can be fast enough not to be too obtrusive,
so I have made the commitment to re-write
the engine along these lines.
Chinese Characters and Storage
The next problem with Chinese characters
is that as well as being larger (and so requiring
up to four times as much storage per character),
there are many more possible characters. Whereas
it is very feasible to store a font for all
the West European languages in a few thousand
bytes, Chinese characters number in the tens
of thousands. If we always store all of these
in the book cartridge, we require the use
of a large cartridge even for a very short
book. We also have to allow for English books,
which simply do not need this overhead.
As we now have the potential requirement
for a range of different languages, it seems
logical to try to put in place support for
any language which may be required. We do
this by making a superset of character fonts
available to the Makebook utility, which selects
only the character glyphs required in the
book being processed, and transfers these
to the book cartridge.
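The selection step can be sketched like this. It is an illustration only: the superset font is modelled as a dictionary from character to glyph bytes, and `select_glyphs` is a hypothetical name, not a Makebook function.

```python
def select_glyphs(text, superset):
    """Return (used, missing) for the characters appearing in `text`.

    used    -- dict of character -> glyph bytes, in order of first appearance
    missing -- characters in the text that the superset has no glyph for
    """
    used = {}
    missing = []
    for ch in text:
        if ch in used or ch in missing:
            continue                      # already dealt with this character
        if ch in superset:
            used[ch] = superset[ch]
        else:
            missing.append(ch)
    return used, missing


font = {"a": b"\x01", "b": b"\x02", "c": b"\x03"}   # toy superset font
used, missing = select_glyphs("abba!", font)
print(list(used), missing)
```

Only the glyphs in `used` would need to be transferred to the book cartridge, so a short book carries a small font regardless of how large the superset is.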
Storage of Text and Font
in Book Cartridge
When the text is stored in the book cartridge
we have two main requirements:
- the text should not take up more room
than necessary, but
- about 30,000 different characters may
need to be represented within the same book
The chosen solution here is to represent
the ANSI codes from 20h to 7Fh as single byte
characters, and other characters as 2 byte
characters.
Furthermore, characters other than the basic
set 20h-7Fh will be allocated codes dynamically
as they are found in the book text by Makebook.
The first character found will be allocated
the code 80h, 00h, the next character 80h,
01h and so on, all the way up to FFh, FFh
if necessary: over 30,000 codes.
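A minimal sketch of this allocation scheme follows. It assumes the input contains no control characters (the codes below 20h have special meanings in the book format), and the function names are hypothetical rather than taken from Makebook.

```python
MAX_CODES = 0x80 * 0x100   # lead bytes 80h-FFh x trail bytes 00h-FFh = 32768


def encode_text(text):
    """Encode text; return (encoded bytes, table of character -> 2-byte code)."""
    table = {}
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if 0x20 <= cp <= 0x7F:
            out.append(cp)                       # basic set: one byte
            continue
        if cp < 0x20:
            raise ValueError("control characters are not handled in this sketch")
        if ch not in table:
            n = len(table)                       # next free sequence number
            if n >= MAX_CODES:
                raise ValueError("more than 32768 distinct characters")
            table[ch] = (0x80 + (n >> 8), n & 0xFF)
        out.extend(table[ch])                    # other characters: two bytes
    return bytes(out), table


def decode_text(data, table):
    """Inverse of encode_text, given the same table."""
    reverse = {code: ch for ch, code in table.items()}
    out = []
    i = 0
    while i < len(data):
        if data[i] >= 0x80:
            out.append(reverse[(data[i], data[i + 1])])
            i += 2
        else:
            out.append(chr(data[i]))
            i += 1
    return "".join(out)
```

A character that appears many times is allocated a code only once, so the sequence number also serves as its index into the font table.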
As the characters are allocated codes they
are also assigned sequential space in the
font table. Each character glyph will comprise
up to 33 bytes. The first byte is the character
width (including the blank space following
the character on the right). The next two
bytes are the (up to) 16 pixels of the top
row of the character. The other rows follow
in the next bytes. A glyph for an 8 pixel
high font will thus have 17 bytes. A 12 pixel
high font character will have 25 bytes, and
so on.
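The glyph layout described above can be sketched as a packing routine. Two details here are assumptions, since the text does not state them: each 16 pixel row is stored high byte first, and the leftmost pixel is the most significant bit. `pack_glyph` is a hypothetical name.

```python
def pack_glyph(width, rows):
    """Pack one glyph: a width byte followed by two bytes per pixel row.

    width -- character width in pixels, including the blank space on the right
    rows  -- one integer per pixel row, 16 bits each (bit 15 assumed leftmost)
    """
    if not 1 <= len(rows) <= 16:
        raise ValueError("glyph height must be 1 to 16 pixel rows")
    out = bytearray([width])
    for row in rows:
        out += row.to_bytes(2, "big")     # assumed byte order: high byte first
    return bytes(out)


for height in (8, 12, 16):
    print(height, "pixel high glyph:", len(pack_glyph(height, [0] * height)), "bytes")
```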
The lead characters from 00h to 1Fh are reserved
for special meanings.
- 01h introduces a column specifier
- 02h introduces a custom character
A column specifier acts as a kind of tabulation
command. A custom character allows, amongst
other things the insertion of monochrome bitmaps
into the page.
Project Status
FLASH: OK, it's here now! Just go to the Download
Page to get it, and let me know what you
think. This page will be updated soon.
Please note that on
some versions of Windows in China the text
file will look like rubbish. But don't worry.
If it looks ok in Notepad it will still make
a good book ROM. I would be grateful if any
internationalisation expert can help me explain
this!
A partially working Gameboy display (using 16
pixel high characters).
The scroll bar now works (at least in Chinese,
thanks to JACK). This display uses 12 pixel
high characters. There must be something wrong
with the row filling logic, as there is always
room for 1 more character.
Reader Engine
Done so far...
- This has been partly revised so that
the tile mapping for the text area now
has a fixed numbering, and a software
scroll technique is used. This has not
yet been optimised to allow the scrolling
to be fast enough not to be noticed.
- The basic (built-in for testing) English
font (8 pixels high) has been redone to
have the potential for 16 pixel wide glyphs.
- The character generating routine has
been adjusted to use this new font.
- Internal font has
been ripped out completely. Character
display routines have been re-done to
display any height of character from 8
to 16 pixels high. Tested at 8 and 16
high so far.
- Started to put scroll
bar info back in. Works in English and Chinese
so far.
Still to do...
Makebook Utility
Done so far...
- Started to modify font editing dialog
to allow characters up to 16 x 16.
- Modified font description files to add
Unicode code point information.
- Discovered to my
surprise that (at least in Visual Studio)
a dialog cannot have more than about 256
controls. Probably a good thing, as I
am now using a more GDI-based method
of displaying the glyph being edited.
It should look nicer and be more maintainable.
- Managed to add
an import function for Chinese font files
(see picture above).
- Import function
now imports previous version font files
plus various other formats.
- Imported font files
do not overwrite existing characters,
which allows merging of font files from different
sources.
- Arranged that characters
are stored against correct codes (including
transformations from local code pages
to Unicode).
- Added a function
to display a complete page of 256 codes
at once.
- Changed font edit
function so that left mouse held down
draws all pixels passed through, and right
button erases pixels in the same way.
- Modified Export
to interpret characters in text file according
to specified code page.
- Modified Export
to generate table of glyphs actually used
in text file.
Still to do...
- Export title bar
correctly
- etc...
Unicode
A Unicode text file is assumed here to be
a file containing characters, each represented
by a pair of bytes. There may additionally
be a marker pair of bytes at the start of
the file: either FF, FE (little endian) or
FE, FF (big endian), which indicates how
the other byte pairs are arranged.
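Reading such a file can be sketched as below. A leading FF, FE marks little endian data and FE, FF marks big endian data. What to do when no marker is present is not stated above, so the little endian default here is an assumption, as is the function name.

```python
def read_code_points(data, default="little"):
    """Return the 16-bit code points in `data`, honouring a leading marker."""
    if data[:2] == b"\xff\xfe":
        order, body = "little", data[2:]
    elif data[:2] == b"\xfe\xff":
        order, body = "big", data[2:]
    else:
        order, body = default, data       # assumption: no marker means default
    # Step through complete byte pairs; a trailing odd byte is ignored.
    return [int.from_bytes(body[i:i + 2], order)
            for i in range(0, len(body) - 1, 2)]


print([hex(cp) for cp in read_code_points(b"\xff\xfe\x2d\x4e")])
print([hex(cp) for cp in read_code_points(b"\xfe\xff\x4e\x2d")])
```

Both calls describe the same character, stored in the two possible byte orders.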
The Basic Multilingual Plane (BMP) of Unicode
characters, with its 16 bit codes, allows enough
code points for up to 64K characters, enough
for all the world symbols in common use. Nearly
50 thousand such characters have been assigned
positions in this code table.
It should be noted that local representations
of a given one of these characters may differ
from country to country, so a single world
font file is still not really practical, but
may be close enough for the Gameboy Book reader.
Multibyte Encoding
The programs which we are concerned with
here (Notepad, Wordpad, and Makebook) all appear
to work with multibyte encoding, at least
on Windows 9x systems.
In a multibyte encoding system a character
may either be one byte or two bytes long.
They can be distinguished because one byte
characters are confined to a range of values
(such as 00-7F), and 2 byte characters must
start with a value in a different range. Then
the second byte in a 2 byte character can
in theory have any value from 00-FF.
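Splitting a multibyte string into characters can be sketched as follows. The sketch assumes the simple rule given above, with 00-7F as single byte characters and every other value starting a two byte character. Real code pages define their own lead byte ranges, so the boundary is left as a parameter; the function name is hypothetical.

```python
def split_multibyte(data, single_max=0x7F):
    """Split bytes into one and two byte characters.

    Bytes up to `single_max` stand alone; any other byte is taken as a lead
    byte and consumes the following byte, whatever its value.
    """
    chars = []
    i = 0
    while i < len(data):
        if data[i] <= single_max or i + 1 >= len(data):
            chars.append(data[i:i + 1])   # single byte (or truncated lead byte)
            i += 1
        else:
            chars.append(data[i:i + 2])   # lead byte plus trail byte
            i += 2
    return chars


print(split_multibyte(b"A\xd6\xd0B"))
```

Note that the trail byte is never inspected, which is why it may fall in the 00-7F range without being mistaken for a single byte character.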
The Makebook Font Editor
Microsoft's web site provides text files
listing the character values used in different
parts of the world. These tables include corresponding
Unicode values, and descriptions of the characters
represented.
So using these tables it is possible to create
a font editor which presents locally used
characters (in the country running the editor),
and then to store the edited character glyph
against the Unicode code point in the font
file. Users in different countries would then
be able to edit a common file, but accessing
only the code points which
- they were interested in, and
- their computers were set up to display
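The idea of storing glyphs against a common code point can be illustrated as below. This is a sketch under assumptions: Python's built-in "gbk" and "big5" codecs stand in for the published mapping tables (which may differ in detail), and `unicode_key` is a hypothetical name.

```python
def unicode_key(local_bytes, code_page):
    """Map one character in a local code page to its Unicode code point."""
    ch = local_bytes.decode(code_page)
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return ord(ch)


# The same character as typed on two differently configured machines.
simplified_machine = "\u4e2d".encode("gbk")
traditional_machine = "\u4e2d".encode("big5")

font_file = {}   # code point -> glyph bytes (placeholder glyph data)
font_file[unicode_key(simplified_machine, "gbk")] = b"glyph"

# A user on the other machine reaches the same entry through their code page.
print(unicode_key(traditional_machine, "big5") in font_file)
```

The local byte values differ between the two machines, but both resolve to the same key in the shared font file.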