Hello,
Does anyone here know anything about Unicode?
Like the Linux X Window System published in 1994, with
multilingual support including an alien language from
a movie. It would be fun to just take a look at
how it would be defined, if the system can be written
with multilingual support.
Thanks.
Multilingual Support
RE:Multilingual Support
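1. Unicode code points run from U+0000 to U+10FFFF,
which fits in 21 bits. (ISO 10646 originally allowed
31 bits.)
2. The first 65536 code points are Plane 0, or the
Basic Multilingual Plane (BMP). Most characters in
everyday use are defined there; the rest are in the
supplementary planes.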
3. BMP characters can be stored as 2-byte values
(UTF-16). Windows and Java do this.
4. BMP characters can also be stored as multi-byte
sequences; 1-3 bytes each. This is UTF-8, used by
Linux.
5. Characters are named like U+xxxx, where xxxx
is the 16-bit BMP hex value.
6. U+0000 to U+007F = ASCII
   U+0080 to U+00FF = Latin-1 Supplement
   U+0370 to U+03FF = Greek
   U+0400 to U+04FF = Cyrillic (Russian)
   U+0600 to U+06FF = Arabic
   U+20A0 to U+20CF = Currency symbols
   U+2500 to U+257F = Box-drawing characters
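A small Python sketch of items 3 and 4, for illustration only. The function name utf8_encode_bmp and the sample table are made up for this example; the sketch only handles BMP code points and checks its result against Python's built-in encoder.

```python
def utf8_encode_bmp(code_point: int) -> bytes:
    """Encode one BMP code point (U+0000..U+FFFF) as UTF-8 by hand."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("this sketch only handles the BMP")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not characters")
    if code_point <= 0x7F:
        # 1 byte: 0xxxxxxx (plain ASCII)
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([0xE0 | (code_point >> 12),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# One sample code point from each range listed in item 6 (hypothetical picks).
SAMPLES = {
    0x0041: "ASCII",
    0x00E9: "Latin-1",
    0x03A9: "Greek",
    0x0416: "Cyrillic",
    0x0634: "Arabic",
    0x20AC: "Currency",
    0x2554: "Box drawing",
}

for cp, block in SAMPLES.items():
    utf8 = utf8_encode_bmp(cp)
    # Cross-check the hand-rolled encoder against the built-in one.
    assert utf8 == chr(cp).encode("utf-8")
    # Big-endian UTF-16 without a byte-order mark: always 2 bytes in the BMP.
    utf16 = chr(cp).encode("utf-16-be")
    print(f"U+{cp:04X} {block:12} UTF-8: {utf8.hex()}  UTF-16: {utf16.hex()}")
```

ASCII stays one byte in UTF-8, the Latin-1, Greek, Cyrillic and Arabic samples take two, and the currency and box-drawing samples take three, while the 2-byte form is the same size for all of them.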
Some font notes:
1. Text-mode VGA can display no more than 512
unique characters on the screen at one time.
If you want Unicode, you probably want a graphics
display.
2. Some alphabets are written right-to-left
(Arabic, Hebrew); mixed with numbers or Latin text
they become bi-directional.
Some are written vertically (Mongolian Uighur).
3. Some alphabets are cursive (Arabic; Mongolian).
The appearance of each character depends on the
previous and next characters.
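Font notes 2 and 3 can be poked at with Python's standard unicodedata module. This is only a sketch: the helper name direction_class and the choice of sample characters are assumptions for the example, and it reads properties from the character database rather than doing any real text layout.

```python
import unicodedata


def direction_class(ch: str) -> str:
    """Return the bidirectional class of one character (e.g. 'L', 'R', 'AL')."""
    return unicodedata.bidirectional(ch)


# Latin A, Hebrew alef, Arabic sheen, and a digit.
for ch in ["A", "\u05D0", "\u0634", "1"]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {direction_class(ch)}")

# The four contextual shapes of Arabic BEH (U+0628), as listed in the
# Arabic Presentation Forms-B block. A renderer picks one of these shapes
# depending on the neighbouring letters.
BEH_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]
for form in BEH_FORMS:
    print(f"U+{ord(form):04X} {unicodedata.decomposition(form)}")
```

The first loop shows left-to-right, right-to-left and number classes side by side; the second shows that one stored letter maps to isolated, final, initial and medial shapes.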
http://www.cl.cam.ac.uk:80/~mgk25/unicode.html
http://www.unicode.org
>On 2001-04-22 20:45:56, Ben Hsu wrote:
>Hello,
> Does anyone here know anything about Unicode?