
Posted: Sat Mar 01, 2008 10:41 pm
by Brendan
Hi,
SpooK wrote: PS: As for UTF-8 being internally "appropriate" in an OS, at least you will have a strlen function that will be worth the function calling overhead ;)
For a modern OS, the strlen function correctly returns the length of a zero terminated string in bytes (for both ASCII and UTF8).

For a modern OS, the strlen function never correctly returns the width that the string will be on the screen because modern OSs use proportional fonts - for example, the letter 'W' is a few pixels wider than the number '1'. You need to use a font engine to determine the width of the string on the screen for ASCII and/or UTF8.

Therefore, the strlen function can be a simple "find the first zero" function and it won't matter if you're using ASCII or UTF8.
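To make the "find the first zero" idea concrete, here is a minimal sketch in C. The name my_strlen is made up for illustration (so it doesn't clash with the real strlen), and a real libc version would normally be much more heavily optimised. The point is only that the loop never needs to know whether the bytes are ASCII or UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for strlen: counts the bytes before the first
 * zero byte. It does not care what encoding the bytes are in. */
size_t my_strlen(const char *s)
{
    assert(s != NULL);          /* precondition: caller passes a valid string */

    const char *p = s;
    while (*p != '\0') {        /* walk forward until the terminator */
        p++;
    }
    return (size_t)(p - s);     /* distance in bytes, terminator excluded */
}
```

For a UTF-8 string such as "\xC3\xA9" (a single "é") this returns 2, which is the size in bytes and not the number of characters.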

Note: I am *not* accusing anyone of writing a modern OS... ;)


Cheers,

Brendan

Posted: Sun Mar 02, 2008 5:17 am
by Solar
mbrlen(), wcslen(), both in <wchar.h>, which has been part of the standard C library since 1995 (Amendment 1)...
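As a rough illustration of how mbrlen() could be used to count characters instead of bytes, here is a sketch in C. The helper name mb_count_chars is an assumption of mine, not part of the standard library. The result depends on the current locale's multibyte encoding, so a program would normally call setlocale() first, and the error handling here is deliberately simple.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts multibyte characters in s according to the
 * current locale. Returns (size_t)-1 if the string contains an invalid
 * or truncated multibyte sequence. */
size_t mb_count_chars(const char *s)
{
    assert(s != NULL);

    mbstate_t state;
    memset(&state, 0, sizeof state);    /* start in the initial shift state */

    size_t remaining = strlen(s);       /* bytes left, terminator excluded */
    size_t count = 0;

    while (remaining > 0) {
        size_t n = mbrlen(s, remaining, &state);
        if (n == (size_t)-1 || n == (size_t)-2) {
            return (size_t)-1;          /* invalid or incomplete sequence */
        }
        if (n == 0) {
            break;                      /* defensive: hit a null character */
        }
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}
```

In the default "C" locale this simply counts the bytes of a plain ASCII string. For UTF-8 input it only gives a character count once a UTF-8 locale has been selected with setlocale().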

As for ASCII "not working", the problem with ASCII always was that you had to "guess" which ASCII variant a text was written in. When your guess was bad, your text was mangled.

Posted: Tue Mar 04, 2008 6:53 am
by jal
Brendan wrote: For a modern OS, the strlen function correctly returns the length of a zero terminated string in bytes (for both ASCII and UTF8).
True, as that's defined by the C standard.
Brendan wrote: For a modern OS, the strlen function never correctly returns the width that the string will be on the screen
True as well, but I don't think anyone would expect that, as it depends on font face, point size and, as you mention, the actual characters. However, sometimes you'd like to know how many characters there are in a string. Whether or not to call that function strlen is not that important (unless you want to stick strictly to the C standard), but if the string is UTF-8, you'll have to traverse it.
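A sketch of what that traversal might look like in C follows. The function name utf8_count_code_points is hypothetical, and the code assumes the input is already valid UTF-8 (it does no validation). It counts code points, which is not always the same as what a user would call a "character" (combining marks, for instance, are counted separately).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts code points in a zero-terminated string
 * that is assumed to be valid UTF-8. Continuation bytes have the bit
 * pattern 10xxxxxx, so every byte that does NOT match that pattern
 * starts a new code point. */
size_t utf8_count_code_points(const char *s)
{
    assert(s != NULL);

    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        if ((*p & 0xC0) != 0x80) {  /* not a continuation byte */
            count++;
        }
    }
    return count;
}
```

For "na\xC3\xAFve" ("naïve") this gives 5, where a byte-counting strlen gives 6. Either way the whole string has to be walked, so the cost is still linear in the number of bytes.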


JAL