Own charset in C or assembler and keyboard input
Posted: Wed May 29, 2019 3:03 pm
by lukassevc
Hi, I am Lucas and I have one question. I am not from an English-speaking country ( I am from Slovakia, not Slovenia ), and I am making my own OS in C and assembler. When I got to printing text on the screen, I wanted to use the characters of my own language, but I don't know how: the C compiler didn't compile them correctly for my OS, so it didn't work. Later I changed the text encoding to ISO 8859-2, which worked better but not well enough. Some characters compiled correctly and some did not, and I don't know how to fix it. I decided the best way would be to make my own charset, but I don't know how to do that.
I also need help with the keyboard in C. I know that I can read input from it using ports 0x60 and 0x64, but I also need to recognize characters, and here is my problem: I don't know whether every key on the keyboard has its own scan code, or whether every letter has its own unique scan code. If the first, it will be easy to write a function that reads, for example, a Slovak, English, American or any other keyboard, because every key has its own scan code that doesn't depend on the letter printed on it. If the second, it will be harder, because I can't find the Slovak keyboard scan codes. Please help me, this is very confusing. Sorry for the mistakes; as I mentioned, I am not from an English-speaking country.
Thanks for your help and understanding.
Re: Own charset in C or assembler and keyboard input
Posted: Wed May 29, 2019 4:57 pm
by deleted8917
If you're saving your source files as ISO-8859 and some characters still don't print correctly, maybe you should check your putchar function (or whatever you named the function that prints characters) and change char to unsigned char in the parameter that receives the character to print.
Seeing your source code would be useful.
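To illustrate why the parameter type matters, here is a minimal sketch. It assumes VGA text mode, where each screen cell is a 16-bit value (character code in the low byte, colour attribute in the high byte). The names vga_cell and vga_cell_signed are hypothetical, not taken from the original poster's code.

```c
#include <stdint.h>

/* Hypothetical helper: builds one 16-bit VGA text-mode cell
   (low byte = character code, high byte = colour attribute). */
uint16_t vga_cell(unsigned char ch, uint8_t attr)
{
    return (uint16_t)(ch | ((uint16_t)attr << 8));
}

/* Same helper, but taking the character as signed char, to mimic
   a compiler where plain char is signed (common on x86). */
uint16_t vga_cell_signed(signed char ch, uint8_t attr)
{
    /* ch is promoted to int before the OR. A byte such as 0xF4
       becomes a negative int whose upper bits are all ones, so the
       attribute byte ends up as 0xFF instead of attr. */
    return (uint16_t)(ch | ((uint16_t)attr << 8));
}
```

With the unsigned version, character 0xF4 and attribute 0x07 give the cell 0x07F4. With the signed version the same inputs give 0xFFF4, so every character above 0x7F shows up with a corrupted colour.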
If you want to use a custom charset, you'll need to set a video mode first and then write a putpixel function.
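A rough sketch of what drawing one custom glyph through putpixel could look like. Everything here is an assumption for illustration: the framebuffer is a small in-memory array with one byte per pixel (in a real kernel it would be the address reported by the video mode you set), the glyph is 8x8, and the bitmap for ô is hand-drawn rather than copied from any real font.

```c
#include <stdint.h>

#define FB_WIDTH  64   /* assumed tiny framebuffer, one byte per pixel */
#define FB_HEIGHT 16

uint8_t framebuffer[FB_WIDTH * FB_HEIGHT];

/* Writes one pixel, silently ignoring out-of-range coordinates. */
void putpixel(int x, int y, uint8_t colour)
{
    if (x >= 0 && x < FB_WIDTH && y >= 0 && y < FB_HEIGHT)
        framebuffer[y * FB_WIDTH + x] = colour;
}

/* One 8x8 glyph: one byte per row, most significant bit is the
   leftmost pixel. Placeholder shape for an "o with circumflex". */
const uint8_t glyph_o_circumflex[8] = {
    0x18, /* ...##... */
    0x24, /* ..#..#.. */
    0x00, /* ........ */
    0x3C, /* ..####.. */
    0x66, /* .##..##. */
    0x66, /* .##..##. */
    0x66, /* .##..##. */
    0x3C, /* ..####.. */
};

/* Draws the glyph with its top-left corner at (x0, y0). */
void draw_glyph(const uint8_t glyph[8], int x0, int y0, uint8_t colour)
{
    for (int row = 0; row < 8; row++)
        for (int col = 0; col < 8; col++)
            if (glyph[row] & (0x80 >> col))
                putpixel(x0 + col, y0 + row, colour);
}
```

A full charset would then just be an array of such glyphs indexed by character code, e.g. calling draw_glyph(font[code], x, y, colour) for each character of a string.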
Re: Own charset in C or assembler and keyboard input
Posted: Wed May 29, 2019 5:30 pm
by Octocontrabass
I recommend using UTF-8 (Unicode) encoding in your OS. It's a little bit more work, but you can use it for all text.
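For a sense of what the "little bit more work" involves, here is a minimal sketch of a UTF-8 decoder. The function name utf8_decode and its interface are assumptions for illustration. It rejects truncated, malformed and overlong sequences, but it is not a complete validator (for example, it does not reject surrogate code points).

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes one code point from the n bytes at s. Stores it in *out
   and returns the number of bytes consumed, or 0 on bad input. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *out)
{
    size_t len;
    uint32_t cp, min;

    if (n == 0)
        return 0;

    if (s[0] < 0x80) {                      /* plain ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 2-byte sequence */
        len = 2; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 3-byte sequence */
        len = 3; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 4-byte sequence */
        len = 4; cp = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                           /* stray continuation byte */
    }

    if (n < len)
        return 0;                           /* truncated input */

    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                       /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    if (cp < min || cp > 0x10FFFF)
        return 0;                           /* overlong or out of range */

    *out = cp;
    return len;
}
```

The decoded code point can then be used as an index into whatever font or glyph table the OS provides.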
If you're using VGA text mode, the default character set mapping is code page 437. You can change the mapping by uploading different glyphs to the graphics card.
You can display any text you want if you use graphics mode instead of text mode.
Scan codes are for keys, not letters. For example, the scan code for ô on a Slovak keyboard is the same as the scan code for ; on an American keyboard.
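In code, this means the keyboard driver only needs one lookup table per layout, all indexed by the same scan codes. The sketch below assumes scan code set 1 (where bit 7 marks a key release) and fills in only a handful of keys. The table names and the Slovak QWERTZ entries are illustrative assumptions, and extended (0xE0-prefixed) codes and modifier keys are ignored.

```c
#include <stdint.h>

#define KEY_RELEASE 0x80  /* set 1: bit 7 set means "key released" */

/* Partial US layout: scan code -> Unicode code point, 0 = unmapped. */
const uint16_t layout_us[128] = {
    [0x10] = 'q', [0x15] = 'y', [0x1E] = 'a',
    [0x27] = ';', [0x2C] = 'z',
};

/* Partial Slovak (QWERTZ, assumed) layout: same keys, other letters. */
const uint16_t layout_sk[128] = {
    [0x10] = 'q', [0x15] = 'z', [0x1E] = 'a',
    [0x27] = 0x00F4 /* o with circumflex */, [0x2C] = 'y',
};

/* Returns the character for a key press, or 0 for a key release
   or a key the layout table does not cover. */
uint16_t scancode_to_char(uint8_t scancode, const uint16_t layout[128])
{
    if (scancode & KEY_RELEASE)
        return 0;
    return layout[scancode];
}
```

Switching keyboard layouts is then just a matter of passing a different table; the code that reads port 0x60 stays the same.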
Re: Own charset in C or assembler and keyboard input
Posted: Thu May 30, 2019 11:41 am
by lukassevc
Octocontrabass wrote:I recommend using UTF-8 (Unicode) encoding in your OS. It's a little bit more work, but you can use it for all text.
If you're using VGA text mode, the default character set mapping is code page 437. You can change the mapping by uploading different glyphs to the graphics card.
You can display any text you want if you use graphics mode instead of text mode.
Scan codes are for keys, not letters. For example, the scan code for ô on a Slovak keyboard is the same as the scan code for ; on an American keyboard.
OK, I understand. My next problem is that I couldn't find scan codes for the keys on a keyboard. I only found scan codes for American letters, not for the keyboard in general. Can you please help me?
Thanks!
Re: Own charset in C or assembler and keyboard input
Posted: Thu May 30, 2019 11:50 am
by lukassevc
hextakatt wrote:If you're saving your source files as ISO-8859 and some characters still don't print correctly, maybe you should check your putchar function (or whatever you named the function that prints characters) and change char to unsigned char in the parameter that receives the character to print.
Seeing your source code would be useful.
If you want to use a custom charset, you'll need to set a video mode first and then write a putpixel function.
You did not understand me correctly; I am only compiling it as ISO-8859-2.
Re: Own charset in C or assembler and keyboard input
Posted: Thu May 30, 2019 12:17 pm
by kzinti
lukassevc wrote:OK, I understand. My next problem is that I couldn't find scan codes for the keys on a keyboard. I only found scan codes for American letters, not for the keyboard in general. Can you please help me?
After 10 seconds of googling "slovak keyboard scan codes" I found a few resources... Here is a very nice intuitive one:
http://kbdlayout.info/KBDSL/
http://kbdlayout.info/KBDSL/scancodes
http://kbdlayout.info/KBDSL/virtualkeys
Re: Own charset in C or assembler and keyboard input
Posted: Thu May 30, 2019 2:17 pm
by lukassevc
Thanks for your help. I had tried searching before, but I didn't find a good one. I worded my search in a very complicated way ( I think that was the reason! ).
Re: Own charset in C or assembler and keyboard input
Posted: Thu May 30, 2019 3:53 pm
by Solar
OK, slow down a bit.
There are several encodings at work, here.
- The encoding your source is written in (i.e. what your text editor saves).
- The encoding you're telling your compiler your source is written in.
Basically, the language standard doesn't guarantee anything beyond the "Basic Character Set" to actually work. That would be the printable characters present in ASCII-7, minus $, @ and ` (backtick). So, yes, you could write your source in ISO-8859-2 or UTF-8, and many compilers might actually support that, but purist that I am, personally I wouldn't use any non-ASCII-7 characters in source, and instead use character escapes. But that's just me. The important thing is that those two encodings match, or your executable won't hold the characters you think it holds. You can check this easily by inspecting the generated object code, with a hex editor if nothing more sophisticated is at hand.
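As a small sketch of the character-escape approach: the Slovak word "kôň" written with hex escapes, once as ISO-8859-2 bytes and once as UTF-8 bytes. The source file itself stays pure ASCII, so it no longer matters what encoding the editor or compiler assumes. The identifiers are made up for this example.

```c
#include <stddef.h>

/* "kon" with circumflex o and caron n, as ISO-8859-2 bytes:
   o-circumflex = 0xF4, n-caron = 0xF2. */
const char kon_latin2[] = "k\xF4\xF2";

/* The same word as UTF-8 bytes:
   U+00F4 = C3 B4, U+0148 = C5 88. */
const char kon_utf8[] = "k\xC3\xB4\xC5\x88";

/* Counts the bytes >= 0x80 in a string. The cast to unsigned char
   matters: with a signed plain char, such bytes compare as negative. */
size_t count_high_bytes(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++)
        if ((unsigned char)*s >= 0x80)
            count++;
    return count;
}
```

These exact byte values are what should show up in the object file when you look at it with a hex editor.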
Then we go on:
- The encoding that is written to stream (by printf() or whatever).
- The encoding that the output device is set to accept.
Again, it's very important that the two match.
And then the output device must actually support that encoding properly....
(There are actually even more points of encoding in play here, depending on how your development environment is set up. Your terminal in which you do the development, for example. But I hope I gave an idea how important it is that the parties involved actually agree on the encoding used...)
Re: Own charset in C or assembler and keyboard input
Posted: Fri May 31, 2019 8:36 am
by lukassevc
Solar wrote:OK, slow down a bit.
There are several encodings at work, here.
- The encoding your source is written in (i.e. what your text editor saves).
- The encoding you're telling your compiler your source is written in.
Basically, the language standard doesn't guarantee anything beyond the "Basic Character Set" to actually work. That would be the printable characters present in ASCII-7, minus $, @ and ` (backtick). So, yes, you could write your source in ISO-8859-2 or UTF-8, and many compilers might actually support that, but purist that I am, personally I wouldn't use any non-ASCII-7 characters in source, and instead use character escapes. But that's just me. The important thing is that those two encodings match, or your executable won't hold the characters you think it holds. You can check this easily by inspecting the generated object code, with a hex editor if nothing more sophisticated is at hand.
Then we go on:
- The encoding that is written to stream (by printf() or whatever).
- The encoding that the output device is set to accept.
Again, it's very important that the two match.
And then the output device must actually support that encoding properly....
(There are actually even more points of encoding in play here, depending on how your development environment is set up. Your terminal in which you do the development, for example. But I hope I gave an idea how important it is that the parties involved actually agree on the encoding used...)
Yes, I understood that before too, but I tried saving it as ISO 8859 and compiling it with the same encoding, and the output was still very bad.
Re: Own charset in C or assembler and keyboard input
Posted: Fri May 31, 2019 8:44 am
by Solar
Yes, but which ISO-8859? Did you tell your compiler that input would be 8859-2?
If you want to compare "what I got, versus what I expected", may I shamelessly advertise http://encoding.rootdirectory.de? That's a side-by-side table of the ISO-8859 encodings, plus the more common Windows codepages. (As I work with text encodings 9-to-5, I created that page a couple of years ago for just that kind of troubleshooting.)
Re: Own charset in C or assembler and keyboard input
Posted: Fri May 31, 2019 1:15 pm
by lukassevc
Solar wrote:Yes, but which ISO-8859? Did you tell your compiler that input would be 8859-2?
If you want to compare "what I got, versus what I expected", may I shamelessly advertise http://encoding.rootdirectory.de? That's a side-by-side table of the ISO-8859 encodings, plus the more common Windows codepages. (As I work with text encodings 9-to-5, I created that page a couple of years ago for just that kind of troubleshooting.)
I saved my source file as ISO-8859-2 and I told the compiler that the encoding would be ISO-8859-2.
Re: Own charset in C or assembler and keyboard input
Posted: Fri May 31, 2019 2:25 pm
by Solar
That leaves the second part of my original post. Is what your compiled-as-8859-2 source writes to stream also 8859-2? Does your output device "know" that, and does it support that encoding?
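To make the question concrete: if the output device is VGA text mode with its default code page 437 glyphs (as mentioned earlier in the thread), then ISO-8859-2 bytes written to it unchanged will show the wrong glyphs. Below is a hedged sketch of a translation step; the function name is made up, and only the Slovak letters that I am confident exist in code page 437 are mapped. Letters with a caron (č, š, ž, ľ, ť, ň and so on) have no glyph in that code page, so they fall back to '?'.

```c
/* Translates one ISO-8859-2 byte to the code page 437 byte that
   shows the same letter, or '?' where code page 437 has no such
   glyph. ASCII passes through unchanged. */
unsigned char latin2_to_cp437(unsigned char c)
{
    if (c < 0x80)
        return c;

    switch (c) {
    case 0xE1: return 0xA0; /* a with acute      */
    case 0xE4: return 0x84; /* a with diaeresis  */
    case 0xE9: return 0x82; /* e with acute      */
    case 0xED: return 0xA1; /* i with acute      */
    case 0xF3: return 0xA2; /* o with acute      */
    case 0xF4: return 0x93; /* o with circumflex */
    case 0xFA: return 0xA3; /* u with acute      */
    default:   return '?';  /* no matching glyph in code page 437 */
    }
}
```

The '?' fallbacks are exactly the cases where the device does not support the encoding, and where different glyphs or a graphics mode would be needed.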