
Re:Loading Arabic Fonts

Posted: Thu Jun 30, 2005 12:36 am
by Brendan
Hi,

Probably one of the most effective techniques for displaying Unicode code points in text modes is to dynamically re-define the font data used.

The basic idea would be to count how many times each code point occurs and then create font data that contains the 256 (or 512) most frequently used code points. With this method, if there are 600 unique code points needed you'd still be able to display most of them - the remaining code points would need to be replaced by an "undisplayable" character (typically a question mark or a square).
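A minimal sketch of that counting step, in Python for brevity (a kernel would do the same thing in C or assembly). The names `build_font_map`, `encode_for_text_mode` and `REPLACEMENT_SLOT` are made up for illustration, and reserving slot 0 for the "undisplayable" glyph is an assumption of this sketch, so only 255 (or 511) code points get their own slot:

```python
from collections import Counter

REPLACEMENT_SLOT = 0  # assumption: slot 0 holds the "undisplayable" glyph


def build_font_map(text, slots=256):
    """Give the most frequently used code points their own font slot."""
    counts = Counter(ord(ch) for ch in text)
    # Slot 0 is reserved, so only slots - 1 code points can be displayed.
    most_common = counts.most_common(slots - 1)
    return {cp: slot for slot, (cp, _count) in enumerate(most_common, start=1)}


def encode_for_text_mode(text, font_map):
    """Turn text into font slot numbers; unmapped code points fall back to slot 0."""
    return [font_map.get(ord(ch), REPLACEMENT_SLOT) for ch in text]


if __name__ == "__main__":
    screen_text = "aaabbc"
    font_map = build_font_map(screen_text, slots=3)  # tiny font to force a fallback
    print(encode_for_text_mode(screen_text, font_map))
```

The glyph bitmaps for the chosen code points would then be copied into the text-mode font memory in slot order, which is the part this sketch leaves out.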

Despite this, using a graphical video mode would be far easier and lead to much better results, as graphics modes allow for anti-aliased fonts (better curves), proportional fonts (where some characters are wider than others - e.g. 'W' and 'i'), and allow any number of code points to be displayed (plus windows, menus, icons, etc).

For Arabic in particular, Unicode has 227 code points of which 45 are "combining". AFAIK this means that to display Arabic correctly you'd need to be able to display 182 unique characters. This rough calculation is most likely completely wrong (I know nothing of Arabic). Some notes:

a) There are different dialects of Arabic, and I'm not too sure how many of the Unicode code points are actually needed for a specific dialect. It might be possible to reduce the number of code points needed by only supporting one main dialect.

b) There are "subtending marks", which (I guess) are meant to underline a group of code points. For example, a number may consist of a group of numerical digits that are collectively underlined via the Arabic number sign. With Latin characters this might look like "1234," where the underline and comma are meant to represent the Arabic number sign. This would be incredibly difficult to do in text mode. There are actually 4 of these subtending marks: the number sign, one for footnotes, and the remaining 2 called "sanah" and "safha" (not sure where they'd be used). These subtending marks are not straight lines but are curved - you can't just use normal underlining.
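Going back to the rough count of combining code points above: one way to re-check it is to ask a Unicode database directly. This is only a sketch. It assumes the main Arabic block (U+0600 to U+06FF) is the range of interest and treats every "mark" category as combining, and the totals depend on the Unicode version shipped with Python, so they won't necessarily match the numbers quoted earlier. The function name `count_arabic_block` is made up for illustration.

```python
import unicodedata


def count_arabic_block(first=0x0600, last=0x06FF):
    """Count assigned and combining code points in a range (default: Arabic block)."""
    assigned = 0
    combining = 0
    for cp in range(first, last + 1):
        category = unicodedata.category(chr(cp))
        if category == "Cn":  # unassigned in this Unicode version
            continue
        assigned += 1
        if category.startswith("M"):  # Mn, Mc, Me: combining marks
            combining += 1
    return assigned, combining


if __name__ == "__main__":
    assigned, combining = count_arabic_block()
    print("assigned:", assigned)
    print("combining:", combining)
    print("need their own glyph:", assigned - combining)
```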


Cheers,

Brendan

Re:Loading Arabic Fonts

Posted: Tue Jan 03, 2006 1:19 pm
by samir
Hi,

I think there is another problem: the linking between characters (Rabt). This was resolved in Arabic MS-DOS (the glyphs can be used in any place within the word because of the low resolution: the glyphs are not smooth enough for the various positional forms to be distinguishable).

For the Unicode stuff, all you need is a translation table. Most of the work, then, will be in the keyboard handler.

To keep English there too, I recommend using the OEM char table (code page 720); it won't be hard to find on the internet.
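As a rough illustration of such a translation table: Python ships a `cp720` codec, so the Unicode to code page 720 mapping can be inspected without hunting for the table. The helper name `to_cp720`, the sample string and the `?` fallback for unmappable characters are choices made for this sketch only.

```python
def to_cp720(text):
    """Translate Unicode text to code page 720 bytes; unmappable characters become '?'."""
    return text.encode("cp720", errors="replace")


if __name__ == "__main__":
    sample = "OS سلام"
    encoded = to_cp720(sample)
    # Code page 720 is single-byte, so characters and bytes pair up one to one.
    for ch, byte in zip(sample, encoded):
        print(f"U+{ord(ch):04X} -> 0x{byte:02X}")
```

The ASCII part passes through unchanged, which is what keeps English usable alongside the Arabic glyphs.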

If you are very lazy :), you may get the glyphs by taking a snapshot of an Arabic DOS window (Win98 Arabic Enabled) or directly from the char table (VGA); that would suffice!


For information, there is a program (service) named "arcon" that "enables" Arabic in the Linux console.

;)

Re:Loading Arabic Fonts

Posted: Tue Jan 03, 2006 3:10 pm
by Pype.Clicker
You may want to check out mar-rih's first release of his OS before you resurrect a 6-month-old thread ...