Adding Hebrew Support
For the purposes of the Book Reader, Hebrew text
is represented using Codepage 1255. This is
a single-byte codepage with most of the possible
256 codes being utilised. As is normal for such
pages, the bottom half of the codepage is
identical to the ASCII set, and the special
Hebrew characters occupy code points in the
range 80h-FFh.
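
As a rough illustration of the layout just described, the sketch below decodes a few Codepage 1255 bytes using Python's built-in "cp1255" codec. It is not part of the Book Reader; the function names and the sample bytes are chosen here purely for illustration.

```python
def decode_cp1255(raw):
    """Decode Codepage 1255 bytes into a Python (Unicode) string."""
    return raw.decode("cp1255")


def in_upper_half(byte):
    """True if the byte lies in the 80h-FFh half of the codepage."""
    return 0x80 <= byte <= 0xFF


# Four bytes from the upper half: shin, lamed, vav, final mem.
sample = b"\xf9\xec\xe5\xed"

print(decode_cp1255(sample))      # the four Hebrew letters, in file order
print(decode_cp1255(b"Book"))     # bytes below 80h decode exactly as ASCII
print([in_upper_half(b) for b in sample])
```

Note that the decoded string is still in file (logical) order; arranging it from right to left is a separate step.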

The Hebrew consonants behave as normal characters:
one character takes up one position on the
page. Consecutive characters from the text
file are 'printed' on the page from right
to left.

For the most part Hebrew is written without
vowels. Readers are expected to add the vowel
sounds from memory. A limited number of vowel
sounds are represented by consonant characters
in in-line text.

However, at a time when the number of native
Hebrew speakers was limited, a system of marking
vowel sounds was developed to allow the correct
pronunciation of religious texts without
disturbing the original character sequence.
These marks are small signs placed under, over,
or inside a consonant glyph.

I was unable to discover any formal description
of the rules concerning the sequencing of these
characters in a text file, but I have worked
out the rules based on a very limited supply
of text files and corresponding printouts.

The vowels are represented by certain code points,
which follow, in the text file, the consonants
with which they are to be combined. There can be
more than one combining character per consonant;
I have seen at least two used.

So with these two rules (right to left, and
combining characters follow the character they
combine with) it was not too difficult to
add Hebrew capability to the Book Reader.
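
The two rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Book Reader's actual code: it works on text already decoded to Unicode, it uses the standard unicodedata.combining() function to recognise the vowel marks (rather than a table of Codepage 1255 code points), and the names group_cells and place_right_to_left are invented here. Line wrapping is ignored, so the text is assumed to fit within the given width.

```python
import unicodedata


def group_cells(text):
    """Attach each combining mark to the character that precedes it."""
    cells = []
    for ch in text:
        if unicodedata.combining(ch) and cells:
            cells[-1] += ch          # rule 2: the mark follows its consonant
        else:
            cells.append(ch)         # an ordinary character starts a new cell
    return cells


def place_right_to_left(text, width):
    """Return (column, cell) pairs, filling the line from the right edge."""
    cells = group_cells(text)
    # rule 1: the first character in the file goes in the rightmost column
    return [(width - 1 - i, cell) for i, cell in enumerate(cells)]


# bet + dagesh + sheva (two marks on one consonant), then resh
sample = "\u05d1\u05bc\u05b0\u05e8"
for column, cell in place_right_to_left(sample, 10):
    print(column, [hex(ord(c)) for c in cell])
```

Here the bet and its two marks share column 9, and the resh lands in column 8, so each consonant still takes up exactly one position on the page.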
Punctuation
There is one interesting problem which has arisen
from examination of my sample texts. Left
and right brackets seem to give a lot of problems.
According to Microsoft (and so presumably
the specs these are based on) the code positions
of the glyphs representing left and right brackets
are the same as for left-to-right languages.
If you visualise this you will see that this
could look strange }if you see what I mean{
unless the writer deliberately swaps the use
of the two characters. Some texts seem to do
this, some don't, and some even use the same
bracket for both ends. I would be grateful
for a definitive answer to this.
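
The effect is easy to reproduce. The sketch below is an illustration only: it uses Latin letters as stand-ins for Hebrew so the result is readable on any PC, it reverses the text naively (so it does not handle combining marks), and the names MIRROR and visual_order are invented here. Swapping each bracket for its partner at display time is one possible approach; it is what Unicode's bidirectional rules prescribe for mirrored characters, but I am not claiming it is what the sample texts or the Book Reader assume.

```python
# Each bracket is mapped to its opposite-facing partner.
MIRROR = str.maketrans("()[]{}<>", ")(][}{><")


def visual_order(logical, mirror=True):
    """Lay out file-order text from right to left, optionally mirroring brackets."""
    reversed_text = logical[::-1]
    return reversed_text.translate(MIRROR) if mirror else reversed_text


print(visual_order("a (b) c", mirror=False))  # brackets face outwards
print(visual_order("a (b) c"))                # brackets enclose the text again
```

With mirroring at display time, a writer can keep using the opening bracket first in the file, just as in a left-to-right language. A text whose author has already swapped the brackets by hand would then come out wrong, which may be why the samples disagree.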
Encoding
When obtaining or preparing a text for use with
Book Reader, it must be remembered that the
program currently works with text files encoded
in Codepage 1255. I have no idea how common
such files are, but I have certainly encountered
a number of unsuitable files. Typically they
are encoded either in an encoding of their own
or in Unicode.

The Tanach I found which had vowel pointing was
in fact a web (HTML) page. I converted it
to text by using the browser's Save (as text)
menu function. The result in this case
appeared to me to be a valid text file
encoded in Codepage 1255.
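
Where a source file turns out to be Unicode instead, it can be converted before use. The sketch below is a minimal example, assuming the file is UTF-8 (with or without a byte order mark); the function names are invented here and are not part of Book Reader or Makebook. It relies only on Python's built-in "utf-8-sig" and "cp1255" codecs.

```python
def unencodable(text):
    """List the characters in text that have no Codepage 1255 code."""
    bad = []
    for ch in text:
        try:
            ch.encode("cp1255")
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


def convert_to_cp1255(raw, source_encoding="utf-8-sig"):
    """Re-encode raw bytes as Codepage 1255.

    Raises UnicodeEncodeError if a character cannot be represented.
    """
    text = raw.decode(source_encoding)
    return text.encode("cp1255")


utf8_bytes = "\u05e9\u05dc\u05d5\u05dd".encode("utf-8")
print(convert_to_cp1255(utf8_bytes))       # four single-byte codes
print(unencodable("\u05d0\u0591"))         # a cantillation mark is not in the codepage
```

Checking with unencodable() first is useful because some marks found in Unicode biblical texts, such as cantillation marks, have no place in Codepage 1255 and would need to be removed before conversion.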
Viewing Hebrew Text in Makebook
An obvious current limitation of Makebook is
that text in languages other than the native
language for the PC can be unreadable. This
is certainly the case with Hebrew on my computer.
The resulting Gameboy book is, as far as I
can tell, valid Hebrew. However, editing a
file when you cannot see the correct characters
is not easy!

V4.5 allows the selection of a display font for
Makebook, with the 'script' as one of the
parameters. This allows you to choose a font for
a language different from the local language. You
need to have installed the relevant language
options using your control panel. If you have,
the starting font will attempt to match the
language selected. This should work well for
8-bit character sets, and variably well for
other situations.

In the meantime the screen shot below demonstrates
the difficulties.
Markup

Book Reader markup such as 'chapter marks' (~c)
works as normal. However, markup involving a
start and end mark (e.g. text which links to
another location) is confusing. Assuming that
you are displaying the text file on a western
PC, you will see a lot of unrecognisable symbols
displayed from left to right. If you can locate
some text to convert to a link (difficult
in itself when the characters are displaying
so strangely) then you need to insert the
two elements before and after the text, but
the opposite way round to the usual way. In
other words, the end-of-link marker comes before
the text (to the left) and the start-of-link
marker after the text. Note that the content
of each marker is still written as normal, left
to right.

This reveals a possible flaw in the way the right
to left conversion is done, so it is quite
possible that the markup of a right to left
text will change in the future.

The markers are automatically inserted correctly
by the toolbar icon functions.
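
For anyone inserting the markers by hand, the reversed order described in this section amounts to the following. The marker strings "<S>" and "<E>" are placeholders made up for this example, not real Book Reader markup, and wrap_link_rtl is a hypothetical helper, not a Makebook function.

```python
def wrap_link_rtl(text, start_marker, end_marker):
    """Wrap right-to-left text in link markers, in reversed order.

    The end marker goes before the text and the start marker after it.
    The markers themselves are left exactly as written (left to right).
    """
    return end_marker + text + start_marker


# Placeholder markers; substitute the real start and end link marks.
print(wrap_link_rtl("\u05e9\u05dc\u05d5\u05dd", "<S>", "<E>"))
```

Only the position of the two markers changes; neither the marker content nor the text between them is reversed in the file.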