Unicode

Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada

Post by Colonel Kernel »

lollynoob wrote:I'll personally be fine with supporting only ASCII in my (hobby) kernel, since I'm neither a large corporation nor a non-English-speaker.
Sounds good to me. I don't think anyone was trying to tell you what to do with your own kernel. It sounded more like you were trying to claim that Unicode (or the more general concept of a universal character set) is unnecessary.

Or, maybe we just fell into this pattern:

[Image]
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Post by jal »

lollynoob wrote:Sorry guys, I never realized it was my obligation to keep on the nice side of every foreigner I'll never meet.
You, sir, are the archetypical example of an ignorant American. Please, keep dreaming in your safe little ideal world of monolingualism and xenophobia. But don't bother us with it, please.


JAL
lollynoob
Member
Posts: 150
Joined: Sun Oct 14, 2007 11:49 am

Post by lollynoob »

@Colonel Kernel: This was pretty much what I was getting at; I wasn't saying Unicode is never a good choice, nor was I saying that ASCII and per-region encodings were the best solution in the world. I was just pointing out my preference for non-unified encodings and my reasons for it.

@Jal:
http://en.wikipedia.org/wiki/Xenophobia ... xplanation

It's not just an American thing to prefer one's own region over others, and to be more comfortable with the people of that region. Xenophobia is part of human nature, to an extent, and I see no reason to go against what's comfortable for me to please the entire world population. Sorry if I'm not worldly (trendy) enough to make you happy.
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Post by jal »

lollynoob wrote:It's not just an American thing to prefer one's own region over others, and to be more comfortable with the people of that region.
That is true. However, it is very difficult in Europe to pretend that you're alone. In the US, that's a lot easier, and the general linguistic xenophobia found there has no match on the old continent (except perhaps a general dislike for the influence of the English language by stupid pundits).
Xenophobia is part of human nature
So are murder and rape. Neither of which I like to take part in.
I see no reason to go against what's comfortable for me to please the entire world population. Sorry if I'm not worldly (trendy) enough to make you happy.
Your OS will not be known by the entire world population. In fact, it will probably only be known by a handful, so if you for pragmatic reasons do not want to support Unicode, fine. However, you should realize that many on this list are from other places than the US, and displaying US-centricity is not favoured amongst those not living there (or even amongst those that do).


JAL
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Post by Solar »

lollynoob wrote:Sorry guys, I never realized it was my obligation to keep on the nice side of every foreigner I'll never meet.
As a civilized and socialized being, it should be your obligation to keep on the nice side of any person as long as that person does not give you reason to do otherwise. Whether or not that person is a "foreigner" doesn't come into it. (I'd like to hear your definition of "foreigner", if only because I'm curious how many rules of basic political correctness you would violate in a single statement.)
I'll personally be fine with supporting only ASCII in my (hobby) kernel, since I'm neither a large corporation nor a non-English-speaker.
Personally, I feel that aiming for a functionality level about on par with DOS 3.3 is a pretty low standard even - no, especially - for a hobby project.
ASCII works, has worked, and with UTF-8 being backwards-compatible with it, will continue to work for a long while.
No, sorry - ASCII has never "worked", has always been the reason for needless headaches, which is why something like Unicode popped up in the first place.
Sorry if some Chinese guy doesn't like my choices, but I wouldn't be able to read his angry e-mails anyways.
Arrogant... you wouldn't be able to properly read the names of people or cities, for example. And people won't be much impressed if you couldn't even spell their names properly in a letter. You don't have to go as far as China, you don't even have to leave your own country. You know, that Hispanic next door isn't named Jose, his name is José...
I was just pointing out my preference for non-unified encodings and my reasons for it.
Unfortunately in a tone that incites criticism, because you try to hide your laziness, and perhaps even your inability to comprehend, behind a claim of "not necessary".
Xenophobia is part of human nature, to an extent...
...and trying to get it over with is a sign of civilized behaviour.

You know, you Americans made a great effort driving home the message to us Germans that xenophobia and chauvinism are bad things. It's been a major subject at school for the last half-century. Perhaps it's time to return the favor...
...and I see no reason to go against what's comfortable for me to please the entire world population.
See, we see no reason to leave you unflamed just because it would be comfortable to you.
Every good solution is obvious once you've found it.
lollynoob
Member
Posts: 150
Joined: Sun Oct 14, 2007 11:49 am

Post by lollynoob »

@Solar:

Well, breaking down posts definitely seems like the cool thing to do here.
As a civilized and socialized being, it should be your obligation to keep on the nice side of any person as long as that person does not give you reason to do otherwise. Whether or not that person is a "foreigner" doesn't come into it. (I'd like to hear your definition of "foreigner", if only because I'm curious how many rules of basic political correctness you would violate in a single statement.)
As far as I know, I've got no obligations to any person or civilization, much less one I doubt I'll ever meet or be a part of. Secondly, to define foreigner, I mean anyone not in the United States, which I thought you would have known unless you were curious as to my nationality. In case you haven't realized, "political correctness" is an overrated fad that simply gives people an excuse to act like little white (no, there aren't any implications behind this) knights on the internet and on television.
Personally, I feel that aiming for a functionality level about on par with DOS 3.3 is a pretty low standard even - no, especially - for a hobby project.
Where did I mention DOS 3.3? Judging by your (massive, seriously) post count, I would think you'd know by now that the type of character encoding has nothing to do with the capabilities of the operating system. Perhaps by not bothering about which is the most politically correct or all-encompassing standard to align myself with, I can, you know, work on an operating system.
No, sorry - ASCII has never "worked", has always been the reason for needless headaches, which is why something like Unicode popped up in the first place.
I fail to see how ASCII (a fixed-width, single-language encoding, with 256 (128, excluding the higher half) characters) is more of a headache than Unicode (a variable-width (sometimes), multi-language encoding, with (as of yet) over 100,000 characters) to implement. Personally, I like to be able to count on 'a' <= x <= 'z' to check for lower case, and I don't feel a bit of shame, as you put it, for thinking so.
Arrogant... you wouldn't be able to properly read the names of people or cities, for example. And people won't be much impressed if you couldn't even spell their names properly in a letter. You don't have to go as far as China, you don't even have to leave your own country. You know, that Hispanic next door isn't named Jose, his name is José...
My computer is not a map or a pen, and I don't have a neighbor named Jose. I don't have an accented 'e' key, either; these things don't apply to me, so why should I support them in something I create in my spare time?
Unfortunately in a tone that incites criticism, because you try to hide your laziness, and perhaps even your inability to comprehend, behind a claim of "not necessary".
I'm by no means lazy, and I can comprehend Unicode just fine; I'd just rather not bother with something I won't use. It's the same reason I'm not developing my operating system for the PowerPC.
...and trying to get it over with is a sign of civilized behaviour.

You know, you Americans made a great effort driving home the message to us Germans that xenophobia and chauvinism are bad things. It's been a major subject at school for the last half-century. Perhaps it's time to return the favor...
Who are you to define civilized behavior? If anything, it would be uncivilized to favor people in another country over the ones who live around me. Also, it would be civilized for you to not make broad generalizations about my country, and the people in it.
See, we see no reason to leave you unflamed just because it would be comfortable to you.
Classy. Hint hint, you're the only one flaming me; why do you refer to yourself as we?
Shadyjames
Posts: 11
Joined: Wed Nov 14, 2007 4:48 am

Post by Shadyjames »

lollynoob wrote:@Solar:
Classy. Hint hint, you're the only one flaming me; why do you refer to yourself as we?
Actually, I would very much like to do some flaming but I don't have the time right now, and since Solar is doing such a smashing job I'm just watching the show.

I think all of this could have been avoided if you'd been less arrogant about how everything would be so much easier if the planet conformed to the ways of your bubble world. I mean seriously, how do you expect a conversation like that to END?

Xenophobia is human nature, yes, but the fact that you aren't trying to be better than that screams either immaturity or exposure to an environment in which your xenophobia is acceptable. I'll let people draw their own conclusions about the latter.
Shadyjames wrote:If my calculations are accurate, then 76% of the awesomeness in this room has emanated directly from us
pcmattman wrote:Tomorrow let's try for 85
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Post by jal »

lollynoob wrote:As far as I know, I've got no obligations to any person or civilization, much less one I doubt I'll ever meet or be a part of.
You have no such obligation, but don't expect people to help you when you are being rude to them.


JAL
Last edited by jal on Fri Feb 29, 2008 3:03 am, edited 1 time in total.
Wave
Member
Posts: 50
Joined: Sun Jan 20, 2008 5:51 am

Post by Wave »

lollynoob wrote:
No, sorry - ASCII has never "worked", has always been the reason for needless headaches, which is why something like Unicode popped up in the first place.
I fail to see how ASCII (a fixed-width, single-language encoding, with 256 (128, excluding the higher half) characters) is more of a headache than Unicode (a variable-width (sometimes), multi-language encoding, with (as of yet) over 100,000 characters) to implement. Personally, I like to be able to count on 'a' <= x <= 'z' to check for lower case, and I don't feel a bit of shame, as you put it, for thinking so.
Checking if x is >= 'a' and <= 'z' doesn't correctly check for lower case letters. æøöåß are lower case ASCII letters that won't pass your test.
Conway's Law: If you have four groups working on a compiler, you'll get a 4-pass compiler.
Melvin Conway
bluecode
Member
Posts: 202
Joined: Wed Nov 17, 2004 12:00 am
Location: Germany

Post by bluecode »

Wave wrote:æøöåß are lower case ASCII letters that won't pass your test.
Actually, that is wrong. They are not part of ASCII, just part of some ASCII extensions AFAIK (reference: Wikipedia).
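
To make the point above concrete: ASCII only defines the values 0 to 127, so the `'a' <= x <= 'z'` range test does cover every lowercase letter ASCII actually has. Letters such as æ or ß live in 8-bit extensions, where the same test misses them. The following is a minimal sketch, assuming ISO 8859-1 (Latin-1) as the extension and an ASCII-compatible compiler character set; the function names are made up for illustration.

```c
#include <stdbool.h>

/* True if c is one of the 26 lowercase letters that ASCII defines. */
bool is_ascii_lower(unsigned char c)
{
    return c >= 'a' && c <= 'z';
}

/* True if c is a 7-bit ASCII value at all (0x00..0x7F). */
bool is_ascii(unsigned char c)
{
    return c < 0x80;
}

/* Assumption: c is an ISO 8859-1 (Latin-1) byte, not plain ASCII.
 * Lowercase letters there are a-z, 0xDF..0xF6 and 0xF8..0xFF;
 * 0xF7 is the division sign. The micro sign (0xB5) is left out. */
bool is_latin1_lower(unsigned char c)
{
    return is_ascii_lower(c)
        || (c >= 0xDF && c <= 0xF6)
        || c >= 0xF8;
}
```

With these definitions, the Latin-1 byte for æ (0xE6) fails both `is_ascii` and `is_ascii_lower` but passes `is_latin1_lower`, which is exactly the gap the two posts are arguing about.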
lollynoob
Member
Posts: 150
Joined: Sun Oct 14, 2007 11:49 am

Post by lollynoob »

I've just got a few more things to say, then I'll stop, really.
Xenophobia is human nature yes, but the fact that you aren't trying to be better than that...
Why in the world would I want to be better than what I am? It's a part of humanity, and I have no problems with it. People in Europe can hate me if they want (and judging by these forum posts, they do already); it won't bother me a bit, I promise.
...or exposure to an environment in which your xenophobia is acceptable.
I think it's this one. I like the right to choose who I'm friendly with, or more relatedly, who my operating system is directed towards. Being that my country isn't situated near any other that is vastly different from itself, it makes perfect sense that this sort of xenophobia* isn't frowned angrily down upon here, and I'm glad. Having to worry about everyone when I say anything seems pretty ridiculous.

*: I like how not supporting Unicode is equated with some sort of hate-crime here; I never said other countries couldn't have their own encodings for their own languages (in fact, I supported this), I just said I'd be using ASCII since it made the most sense for me (I do live where ASCII originated from, you know). Sorry if I've sounded brash, xenophobic, or whatever, but you guys really know how to piss someone off.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

Some people should know when to quit a pointless discussion. Someone decides not to support Unicode and gives a perfectly valid reason for it. That the argument does not hold for other circumstances is a different matter, but it is not a reason to keep feeding a holy war that is by definition endless in nature.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Post by Solar »

As I said before, the problem is not lollynoob's decision not to support Unicode - that's a perfectly valid step to take.

The problem is not the reasoning behind this decision - that his hobby OS is his hobby OS and that he doesn't expect anyone with any needs for internationalization to ever use it. Perfectly valid, too.

The problem is not his ignorance or arrogance when it comes to matters of internationalization in computer software. Everyone is entitled to being ignorant on something, and arrogance is a character trait that can only be helped by much discipline and effort.

The problem is how he's reveling in his ignorance and arrogance, expecting everyone else to just shut up.

My last message to lollynoob: If, in a couple of years, you have matured a bit (not that I hold my breath for it because in your cozy bubble there's really no need for people to grow up), and wonder why people around the world aren't really chums with the USA anymore, look back at this thread...
Every good solution is obvious once you've found it.
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Post by jal »

Solar wrote:look back at this thread...
Ah, yes, the shame he must feel then (speaking from personal experience) :).


JAL
SpooK
Member
Posts: 260
Joined: Sun Jun 18, 2006 7:21 pm

Post by SpooK »

I'm not quite sure what angle Solar was working with "ASCII has never worked"; it has "worked" just fine for plenty of years and for plenty of people... and it will still work for plenty more.

As for whether ASCII is adequate anymore, the answer would have to be no... it never really was. The entire concept was focused on the English language (typewriters even, IIRC) and as a result is severely limited. Moreover, some people (e.g. IBM, Windows, etc...) can't even agree on what exactly the 8-bit ASCII extension (upper 128) characters should be.

Unicode was developed to address the shortcomings of ASCII as applied to a more global market. Multiple encoding methods of Unicode have been developed that can suit just about any application. More importantly, Unicode was developed so we wouldn't need to be in a situation where the binary encodings of different languages overlap one another.

What amazes me about this near flame-war is that you have a simple solution to the ASCII vs. Unicode struggle, one that has been mentioned more than once in this thread... that being UTF-8.

lollynoob, if you want to stick to ASCII for now, then do so. I don't see how this is much of a "crime" for a hobby OS :|

If you ever feel the need to move to Unicode, "simply" implement UTF-8 and it will work like an extension to ASCII... problem solved.
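
The reason UTF-8 behaves like an extension to ASCII is that every code point below 0x80 is encoded as a single byte with the same value, while everything else uses only bytes of 0x80 and above. Here is a minimal encoder sketch to illustrate that; `utf8_encode` is a hypothetical name, not something from an existing library.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp is not encodable
 * (a UTF-16 surrogate or a value above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                      /* ASCII: byte is unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)     /* surrogates are not characters */
        return 0;
    if (cp < 0x10000) {                   /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Encoding 'A' (U+0041) yields the single byte 0x41, so existing ASCII text is already valid UTF-8, whereas é (U+00E9) becomes the two bytes 0xC3 0xA9.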

PS: As for UTF-8 being internally "appropriate" in an OS, at least you will have a strlen function that will be worth the function calling overhead ;)
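
The strlen remark refers to the fact that in UTF-8 the number of bytes and the number of characters can differ, so counting characters takes real work. A minimal sketch, assuming the input is well-formed, NUL-terminated UTF-8 (no validation is done) and using the hypothetical name `utf8_codepoints`:

```c
#include <stddef.h>

/* Count code points in a NUL-terminated UTF-8 string by skipping
 * continuation bytes, which always have the bit pattern 10xxxxxx.
 * Assumption: s is well-formed UTF-8; malformed input is not detected. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;    /* lead byte or ASCII byte starts a new code point */
    }
    return n;
}
```

For "José" encoded as UTF-8, the standard `strlen` reports 5 bytes while this function reports 4 code points; for pure ASCII text the two agree.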