
GameBoy Book Reader -
Hebrew Support

[Last Modified: 16 March 2003]

   

These screens show part of the Tanach: on the left on a Gameboy, and on the right on a Gameboy Advance. Both are showing 12-pixel-high characters.

Version V4.5 now incorporates line-filling logic which reverses the characters from the text file for display, as required by the right-to-left text direction. It also introduces 'combining' characters to the fonts, to allow Hebrew vowel pointing.


 

Adding Hebrew Support

For the purposes of the Book Reader, Hebrew text is represented using Codepage 1255. This is a single-byte codepage with most of the possible 256 codes being utilised. As is normal for such pages, the bottom half of the codepage is identical to the 'ASCII' set, and the special Hebrew characters occupy code points in the range 80h-FFh.
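As a small illustration of this layout, the sketch below uses Python's built-in cp1255 codec to decode a few bytes. It is not taken from the Book Reader or Makebook; the helper name describe_byte is invented for this example.

```python
import unicodedata


def describe_byte(value: int) -> str:
    """Decode one Codepage 1255 byte and return the Unicode name of the result."""
    char = bytes([value]).decode("cp1255")
    return unicodedata.name(char)


# Bytes below 80h decode exactly as they would in ASCII.
ascii_part = bytes(range(0x20, 0x7F))
assert ascii_part.decode("cp1255") == ascii_part.decode("ascii")

# Hebrew letters and vowel points sit in the upper half (80h-FFh).
for value in (0xE0, 0xF9, 0xC7):
    print(f"{value:02X}h -> {describe_byte(value)}")
```

The three sample bytes are a consonant from the start of the alphabet, a later consonant, and one of the vowel points.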

The Hebrew consonants behave as normal characters; one character takes up one position on the page. Consecutive characters from the text file are 'printed' on the page from right to left.

For the most part Hebrew does not use vowels. Readers are expected to add the vowel sounds from memory. A limited number of vowel sounds are represented by consonant characters in in-line text.

However, at a time when the number of native Hebrew speakers was limited, a system of marking vowel sounds was developed to allow the correct pronunciation of religious texts, without disturbing the original character sequence. The system comprises small marks made under, over, or inside a consonant glyph.

I was unable to discover any formal description of the rules concerning sequencing of these characters in a text file, but I have worked out the rules based on a very limited supply of text files and corresponding printouts.

The vowels are represented by certain codepoints which, in the text file, follow the consonant they are to be combined with. A consonant is not limited to just one combining character; I have seen at least two used.

So with these two rules (right to left, combining characters follow the character they are to combine with) it was not too difficult to add Hebrew capability to the Book Reader.
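The two rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Book Reader's actual code: it works on decoded Unicode text rather than on font glyphs, it relies on Python's unicodedata.combining to recognise the vowel points, and the function names are invented for the example.

```python
import unicodedata
from typing import List


def split_clusters(text: str) -> List[str]:
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for char in text:
        if clusters and unicodedata.combining(char):
            clusters[-1] += char  # a vowel point joins the preceding consonant
        else:
            clusters.append(char)  # a consonant (or other character) starts a new cluster
    return clusters


def display_order(line: str) -> str:
    """Reverse a line cluster by cluster, so marks stay with their consonant."""
    return "".join(reversed(split_clusters(line)))


# Four consonants in file order, two of them followed by vowel points.
raw = b"\xf9\xd1\xc8\xec\xe5\xc9\xed"
logical = raw.decode("cp1255")
print(len(split_clusters(logical)), "clusters")
print([hex(ord(c)) for c in display_order(logical)])
```

Reversing whole clusters rather than single characters is what keeps each mark attached to the right consonant once the line is flipped.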

Punctuation

There is one interesting problem which has arisen from examination of my sample texts. Left and right brackets seem to give a lot of problems. According to Microsoft (and so presumably the specs these are based on) the code positions of the glyphs representing left and right brackets are the same as for left to right languages. If you visualise this you will see that this could look strange }if you see what I mean{ unless the writer deliberately swaps the use of the two characters. Some texts seem to do this, some don't, and some even use the same bracket for both ends. I would be grateful for a definitive answer to this.
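The effect, and one possible remedy, can be sketched as follows. This is not what the Book Reader does, and it does not settle the question above: it assumes the text file uses the opening bracket first in reading order, which not all files do. The PAIRS table and the function name are invented for this example.

```python
# Each bracket mapped to its partner.
PAIRS = {"(": ")", ")": "(", "[": "]", "]": "[", "{": "}", "}": "{"}


def reverse_with_mirroring(line: str) -> str:
    """Reverse a line and swap each bracket for its partner."""
    return "".join(PAIRS.get(char, char) for char in reversed(line))


sample = "(abc)"
print(sample[::-1])                    # plain reversal: brackets face outwards
print(reverse_with_mirroring(sample))  # reversal with the brackets swapped
```

A file whose author has already swapped the brackets by hand would come out wrong under this scheme, which is exactly the inconsistency described above.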

Encoding

When obtaining or preparing a text for use with Book Reader, it must be remembered that the program currently works with text files encoded in Codepage 1255. I have no idea how common such files are, but I have certainly encountered a number of unsuitable files. Typically they are encoded either in a unique encoding of their own or in Unicode.

The Tanach I found which had vowel pointing was in fact an HTML web page. I converted it to text by using the browser's Save (as text) menu function. The result in this case was what appeared to me to be a valid text file encoded as Codepage 1255.
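For a file that turns out to be in Unicode, a conversion along the following lines may help. It is a minimal sketch, not part of Makebook: it assumes the Unicode file is UTF-8 (other Unicode encodings are not handled), and the function name to_cp1255 is invented for this example.

```python
def to_cp1255(raw: bytes) -> bytes:
    """Return raw re-encoded as Codepage 1255 if it is valid UTF-8, else unchanged."""
    try:
        text = raw.decode("utf-8-sig")  # also drops a leading byte order mark
    except UnicodeDecodeError:
        return raw  # not UTF-8; assume it is already a single-byte file
    # Raises UnicodeEncodeError if the text uses a character Codepage 1255 lacks.
    return text.encode("cp1255")


utf8_bytes = "\u05e9\u05dc\u05d5\u05dd".encode("utf-8")
print(to_cp1255(utf8_bytes).hex())
```

The check works because a run of Hebrew letters in Codepage 1255 is not a valid UTF-8 byte sequence, so such a file is passed through untouched.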

Viewing Hebrew Text in Makebook

An obvious current limitation of Makebook is that text in languages other than the native language for the PC can be unreadable. This is certainly the case with Hebrew on my computer. The resulting Gameboy book is, as far as I can tell, valid Hebrew. However, editing a file when you cannot see the correct characters is not easy!

V4.5 allows the selection of a display font for Makebook. One of the font parameters is the 'script', which allows you to choose a font for a language other than the local language. You need to have installed the relevant language options using your control panel. If you have, then the starting font will attempt to match the language selected. This should work well for 8-bit character sets, and variably well for other situations.

In the meantime the screen shot below demonstrates the difficulties.

Markup

Book Reader markup such as 'chapter marks' (~c) works as normal. However, markup involving a start and an end mark (e.g. text which links to another location) is confusing. Assuming that you are displaying the text file on a western PC, you will see a lot of unrecognisable symbols displayed from left to right. If you can locate some text to convert to a link (difficult in itself when the characters are displaying so strangely) then you need to insert the two elements before and after the text, but the opposite way round to the usual way. In other words, the end of link marker comes before the text (to the left) and the start of link marker after the text. Note that the content of each marker is still written left to right as normal.

This reveals a possible flaw in the way the right to left conversion is done, so it is quite possible that the markup of a right to left text will change in the future.

The markers are automatically inserted correctly by the toolbar icon functions.

 


Sources of Hebrew text files

-to be added-

Address for comments, etc:

pc@bookreader.co.uk