Own charset in C or assembler and keyboard input

lukassevc
Posts: 20
Joined: Wed May 29, 2019 2:56 pm
Libera.chat IRC: lukassevc

Own charset in C or assembler and keyboard input

Post by lukassevc »

Hi, I am Lucas and I have a question. I am not from an English-speaking country (I am from Slovakia, not Slovenia). I am writing my own OS in C and assembler, and I got stuck printing text on the screen: I want to use the characters of my language, but the C compiler did not compile them correctly for my OS, so they did not display properly. After I changed the text encoding to ISO 8859-2 it worked better, but still not well enough: some characters came out correctly and some did not, and I don't know how to fix it. I decided the best way would be to make my own charset, but I don't know how to do that.

I also need help with the keyboard in C. I know I can read input from it using ports 0x60 and 0x64, but I also need to recognize characters, and here is my problem: I don't know whether every key on the keyboard has its own scan code, or whether every letter has its own unique scan code. In the first case it will be easy to write a function that reads, for example, a Slovak, English, American or any other keyboard, because every key has its own scan code regardless of the letter printed on it. In the second case it will be harder, because I can't find the Slovak keyboard scan codes. Please help me, this is very confusing. Sorry for any mistakes; as I mentioned, I am not from an English-speaking country.

Thanks for your help and understanding.
deleted8917
Member
Posts: 119
Joined: Wed Dec 12, 2018 12:16 pm

Re: Own charset in C or assembler and keyboard input

Post by deleted8917 »

If you're saving your source files as ISO-8859 and some characters still don't print correctly, check your putchar function (or whatever you named the function that prints characters) and change char to unsigned char in the parameter that receives the character to print.
Seeing your source code would be useful.
If you want to use a custom charset, you'll need to set a video mode first and then write a putpixel function.
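
To illustrate the char versus unsigned char point, here is a minimal sketch. The function names are made up for this example and are not taken from the original poster's code. Whether plain char is signed depends on the compiler and target, so the sketch uses signed char explicitly to show the failure case.

```c
/* Hypothetical helper that receives the character as a signed type.
 * Bytes >= 0x80 (every accented ISO-8859-2 letter) turn negative on
 * common compilers, so using the result as a font table index would
 * read in front of the table. */
int glyph_index_signed(signed char c)
{
    return c;   /* byte 0xF4 typically arrives here as -12 */
}

/* The same helper with an unsigned parameter: the byte keeps its value. */
int glyph_index_unsigned(unsigned char c)
{
    return c;   /* byte 0xF4 arrives here as 244 */
}
```

If the printing routine indexes a glyph table or compares against values above 0x7F, the signed variant is a likely source of "some characters work, some don't".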
Octocontrabass
Member
Posts: 5584
Joined: Mon Mar 25, 2013 7:01 pm

Re: Own charset in C or assembler and keyboard input

Post by Octocontrabass »

I recommend using UTF-8 (Unicode) encoding in your OS. It's a little bit more work, but you can use it for all text.
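
The "little bit more work" is mostly decoding multi-byte sequences into code points before choosing a glyph. A minimal sketch of such a decoder follows; the name utf8_decode and its signature are assumptions made for this example, not an existing API.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence starting at s, with n bytes available.
 * Stores the code point in *cp and returns the number of bytes used,
 * or 0 if the input is empty, truncated, or malformed. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len;
    uint32_t v, min;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {                    /* plain ASCII */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* 2-byte sequence */
        len = 2; v = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 3-byte sequence */
        len = 3; v = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 4-byte sequence */
        len = 4; v = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                         /* stray continuation or invalid lead byte */
    }
    if (n < len)
        return 0;
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)        /* continuation bytes look like 10xxxxxx */
            return 0;
        v = (v << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values past U+10FFFF. */
    if (v < min || v > 0x10FFFF || (v >= 0xD800 && v <= 0xDFFF))
        return 0;
    *cp = v;
    return len;
}
```

For example, the bytes C3 B4 decode to U+00F4 (ô) and C4 8D decode to U+010D (č); the printing code can then look up a glyph by code point.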

If you're using VGA text mode, the default character set mapping is code page 437. You can change the mapping by uploading different glyphs to the graphics card.

You can display any text you want if you use graphics mode instead of text mode.
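
In graphics mode, "displaying text" comes down to copying glyph bitmaps into the framebuffer. A minimal sketch, assuming a linear framebuffer with one byte per pixel and an 8x8 bitmap font that you supply yourself; draw_glyph8x8 and its parameters are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Draw one 8x8 monochrome glyph into a linear 8-bit-per-pixel framebuffer.
 * glyph: 8 rows, one byte per row, most significant bit = leftmost pixel.
 * pitch: number of bytes per framebuffer row.
 * The caller must make sure the 8x8 cell at (x, y) lies inside the buffer. */
void draw_glyph8x8(uint8_t *fb, size_t pitch, size_t x, size_t y,
                   const uint8_t glyph[8], uint8_t fg, uint8_t bg)
{
    for (size_t row = 0; row < 8; row++) {
        uint8_t bits = glyph[row];
        uint8_t *dst = fb + (y + row) * pitch + x;

        for (size_t col = 0; col < 8; col++)
            dst[col] = (bits & (0x80u >> col)) ? fg : bg;
    }
}
```

Because the font is just data, it can contain any letters you draw for it, including č, š and ž.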

Scan codes are for keys, not letters. For example, the scan code for ô on a Slovak keyboard is the same as the scan code for ; on an American keyboard.
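
In other words, the driver reads a key number and a separate per-layout table turns it into a character. A minimal sketch, assuming scan code set 1 "make" codes, unshifted keys only, and the common QWERTZ Slovak layout; the enum and function names are made up for this example and the table covers only a handful of keys.

```c
#include <stdint.h>

enum layout { LAYOUT_US, LAYOUT_SK };

/* Translate a scan code (set 1, key press) to a Unicode code point.
 * The scan code identifies the physical key; the layout decides which
 * character that key produces. Returns 0 for keys not in this tiny table. */
uint32_t key_to_char(uint8_t scancode, enum layout l)
{
    switch (scancode) {
    case 0x10: return 'q';
    case 0x1E: return 'a';
    case 0x15: return l == LAYOUT_SK ? 'z' : 'y';     /* QWERTZ vs QWERTY */
    case 0x2C: return l == LAYOUT_SK ? 'y' : 'z';
    case 0x27: return l == LAYOUT_SK ? 0x00F4 : ';';  /* U+00F4 is ô */
    default:   return 0;
    }
}
```

A real driver would use full lookup tables per layout and also track Shift, AltGr and dead keys, but the keyboard-reading code itself stays the same for every country.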
lukassevc
Posts: 20
Joined: Wed May 29, 2019 2:56 pm
Libera.chat IRC: lukassevc

Re: Own charset in C or assembler and keyboard input

Post by lukassevc »

Octocontrabass wrote:I recommend using UTF-8 (Unicode) encoding in your OS. It's a little bit more work, but you can use it for all text.

If you're using VGA text mode, the default character set mapping is code page 437. You can change the mapping by uploading different glyphs to the graphics card.

You can display any text you want if you use graphics mode instead of text mode.

Scan codes are for keys, not letters. For example, the scan code for ô on a Slovak keyboard is the same as the scan code for ; on an American keyboard.
OK, I understand. My next problem is that I couldn't find scan codes for the keys on a keyboard. I only found scan codes for American letters, not for the keyboard in general. Can you please help me?

Thanks!
lukassevc
Posts: 20
Joined: Wed May 29, 2019 2:56 pm
Libera.chat IRC: lukassevc

Re: Own charset in C or assembler and keyboard input

Post by lukassevc »

hextakatt wrote:If you're saving your source files as ISO-8859 and some characters still don't print correctly, check your putchar function (or whatever you named the function that prints characters) and change char to unsigned char in the parameter that receives the character to print.
Seeing your source code would be useful.
If you want to use a custom charset, you'll need to set a video mode first and then write a putpixel function.
You did not understand it correctly; I am only compiling it as ISO-8859-2.
kzinti
Member
Posts: 898
Joined: Mon Feb 02, 2015 7:11 pm

Re: Own charset in C or assembler and keyboard input

Post by kzinti »

lukassevc wrote:OK, I understand. My next problem is that I couldn't find scan codes for the keys on a keyboard. I only found scan codes for American letters, not for the keyboard in general. Can you please help me?
After 10 seconds of googling "slovak keyboard scan codes" I found a few resources... Here is a very nice intuitive one:

http://kbdlayout.info/KBDSL/
http://kbdlayout.info/KBDSL/scancodes
http://kbdlayout.info/KBDSL/virtualkeys
Last edited by kzinti on Thu May 30, 2019 12:21 pm, edited 1 time in total.
lukassevc
Posts: 20
Joined: Wed May 29, 2019 2:56 pm
Libera.chat IRC: lukassevc

Re: Own charset in C or assembler and keyboard input

Post by lukassevc »

kzinti wrote:
lukassevc wrote:OK, I understand. My next problem is that I couldn't find scan codes for the keys on a keyboard. I only found scan codes for American letters, not for the keyboard in general. Can you please help me?
After 10 seconds of googling "slovak keyboard scan codes" I found a few resources... Here is a very nice intuitive one:

http://kbdlayout.info/KBDSL/
http://kbdlayout.info/KBDSL/scancodes
http://kbdlayout.info/KBDSL/virtualkeys
Thanks for your help. I tried searching too, but I didn't find a good one. I phrased my search in a very complicated way (I think that was the reason!).
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re: Own charset in C or assembler and keyboard input

Post by Solar »

OK, slow down a bit.

There are several encodings at work, here.
  • The encoding your source is written in (i.e. what your text editor saves).
  • The encoding you're telling your compiler your source is written in.
Basically, the language standard doesn't guarantee anything beyond the "Basic Character Set" to actually work. That would be the printable characters present in ASCII-7, minus $, @ and ` (backtick). So, yes, you could write your source in ISO-8859-2 or UTF-8, and many compilers might actually support that, but purist that I am, personally I wouldn't use any non-ASCII-7 characters in source, and instead use character escapes. But that's just me. The important thing is that those two encodings match, or your executable won't hold the characters you think it holds. You can check this easily by inspecting the generated object code, with a hex editor if nothing more sophisticated is at hand.
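
As a sketch of the character-escape approach: the string below spells "ôčšž" as ISO-8859-2 bytes using only ASCII in the source, and a small helper prints bytes as hex so they can be compared with what a hex editor shows in the object file. The names slovak_sample and hex_dump are made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* "ôčšž" as ISO-8859-2 bytes, written with hex escapes so the source file
 * stays pure ASCII. Note that a hex escape swallows every following hex
 * digit, so "\xF4" "a" needs the split if a letter a-f comes next. */
static const unsigned char slovak_sample[] = "\xF4\xE8\xB9\xBE";

/* Write n bytes of s as space-separated hex into out.
 * out must have room for at least 3 * n bytes (and 1 byte when n is 0). */
void hex_dump(const unsigned char *s, size_t n, char *out)
{
    *out = '\0';
    for (size_t i = 0; i < n; i++)
        out += sprintf(out, i ? " %02X" : "%02X", s[i]);
}
```

Dumping the first four bytes of slovak_sample gives F4 E8 B9 BE regardless of how the editor or compiler is configured, which takes the first two encodings out of the equation.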

Then we go on:
  • The encoding that is written to stream (by printf() or whatever).
  • The encoding that the output device is set to accept.
Again, it's very important that the two match.

And then the output device must actually support that encoding properly....
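
To make the mismatch concrete, here is a minimal sketch that assumes the program emits ISO-8859-2 bytes while the output device is VGA text mode with its default code page 437 font. The function name latin2_to_cp437 is hypothetical and the table lists only a few Slovak letters.

```c
/* Translate one ISO-8859-2 byte to the code page 437 byte that shows the
 * same letter, or '?' when code page 437 has no such glyph.
 * ASCII is identical in both encodings and passes through unchanged. */
unsigned char latin2_to_cp437(unsigned char b)
{
    if (b < 0x80)
        return b;
    switch (b) {
    case 0xE1: return 0xA0;  /* á */
    case 0xE4: return 0x84;  /* ä */
    case 0xE9: return 0x82;  /* é */
    case 0xED: return 0xA1;  /* í */
    case 0xF3: return 0xA2;  /* ó */
    case 0xF4: return 0x93;  /* ô */
    case 0xFA: return 0xA3;  /* ú */
    default:   return '?';   /* e.g. č (0xE8), š (0xB9), ž (0xBE) */
    }
}
```

Without such a translation, the byte 0xF4 sent straight to the screen is drawn with whatever glyph code page 437 has at 0xF4, which is not ô. Letters that map to '?' here cannot be shown at all until the device is given glyphs for them.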

(There are actually even more points of encoding in play here, depending on how your development environment is set up. Your terminal in which you do the development, for example. But I hope I gave an idea how important it is that the parties involved actually agree on the encoding used...)
Every good solution is obvious once you've found it.
lukassevc
Posts: 20
Joined: Wed May 29, 2019 2:56 pm
Libera.chat IRC: lukassevc

Re: Own charset in C or assembler and keyboard input

Post by lukassevc »

Solar wrote:OK, slow down a bit.

There are several encodings at work, here.
  • The encoding your source is written in (i.e. what your text editor saves).
  • The encoding you're telling your compiler your source is written in.
Basically, the language standard doesn't guarantee anything beyond the "Basic Character Set" to actually work. That would be the printable characters present in ASCII-7, minus $, @ and ` (backtick). So, yes, you could write your source in ISO-8859-2 or UTF-8, and many compilers might actually support that, but purist that I am, personally I wouldn't use any non-ASCII-7 characters in source, and instead use character escapes. But that's just me. The important thing is that those two encodings match, or your executable won't hold the characters you think it holds. You can check this easily by inspecting the generated object code, with a hex editor if nothing more sophisticated is at hand.

Then we go on:
  • The encoding that is written to stream (by printf() or whatever).
  • The encoding that the output device is set to accept.
Again, it's very important that the two match.

And then the output device must actually support that encoding properly....

(There are actually even more points of encoding in play here, depending on how your development environment is set up. Your terminal in which you do the development, for example. But I hope I gave an idea how important it is that the parties involved actually agree on the encoding used...)
Yes, I understood that before too, but I tried saving it as ISO 8859 and compiling it with the same encoding, and the output was very bad.
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re: Own charset in C or assembler and keyboard input

Post by Solar »

Yes but which ISO-8859? Did you tell your compiler that input would be 8859-2?

If you want to compare "what I got, versus what I expected", may I shamelessly advertise http://encoding.rootdirectory.de? That's a side-by-side table of the ISO-8859 encodings, plus the more common Windows codepages. (As I work with text encodings 9-to-5, I created that page a couple of years ago for just that kind of troubleshooting.)
Every good solution is obvious once you've found it.
lukassevc
Posts: 20
Joined: Wed May 29, 2019 2:56 pm
Libera.chat IRC: lukassevc

Re: Own charset in C or assembler and keyboard input

Post by lukassevc »

Solar wrote:Yes but which ISO-8859? Did you tell your compiler that input would be 8859-2?

If you want to compare "what I got, versus what I expected", may I shamelessly advertise http://encoding.rootdirectory.de? That's a side-by-side table of the ISO-8859 encodings, plus the more common Windows codepages. (As I work with text encodings 9-to-5, I created that page a couple of years ago for just that kind of troubleshooting.)
I saved my source file as ISO-8859-2 and I told the compiler that the encoding would be ISO-8859-2.
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re: Own charset in C or assembler and keyboard input

Post by Solar »

That leaves the second part of my original post. Is what your compiled-as-8859-2 source writes to stream also 8859-2? Does your output device "know" that, and does it support that encoding?
Every good solution is obvious once you've found it.