Scan Codes, Key Codes, ASCII, ANSI and Terminals

SpyderTL
Member
Member
Posts: 1074
Joined: Sun Sep 19, 2010 10:05 pm

Scan Codes, Key Codes, ASCII, ANSI and Terminals

Post by SpyderTL »

There has been a lot of discussion about I/O layers lately, which has convinced me to, among other things, start redesigning my current, simplified keyboard handler into something that can stream data, and share compatibility with other streamed data -- specifically, RS-232 data.

So, I started splitting up my current keyboard code into several layers -- Scan Codes -> Key Codes -> ASCII, etc. I'm now at the point where I want to output ASCII characters, which is going to require another component that is responsible for keeping track of key down/key up state. At the same time, I'm starting to think about plugging in an RS-232 layer and building an interface that will work equally well for both "input sources". This has introduced the concept of terminal codes into the mix, and I'm trying to figure out how to make it all work together.

So, if I wanted to write an application that works equally well with a PS/2 keyboard, a USB keyboard, a telnet connection and an RS-232 terminal, what format should the application expect the incoming data to be in?

Should it be more of a simple stream of characters, encoded in a way that includes additional information, like VT100? Or should it be more of an event driven system, where everything is converted to key press events, which are queued somewhere, and read one at a time by the application?

I'm leaning toward the latter approach, but the stream of characters approach would be a little more flexible given the way the rest of the system works, currently.

Maybe the character stream goes on the end of the chain, after the event queue, so that keyboard data goes from scan code -> key code -> key event -> terminal event. And RS-232 data goes from byte -> ASCII character -> terminal event.

I'm curious what approaches you guys have taken, what worked, and what didn't, and whether there is an altogether different approach to this problem that I should consider...

Thanks, guys.
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
onlyonemac
Member
Member
Posts: 1146
Joined: Sat Mar 01, 2014 2:59 pm

Re: Scan Codes, Key Codes, ASCII, ANSI and Terminals

Post by onlyonemac »

SpyderTL wrote:Maybe the character stream goes on the end of the chain, after the event queue, so that keyboard data goes from scan code -> key code -> key event -> terminal event. And RS-232 data goes from byte -> ASCII character -> terminal event.
Make it "scan code -> key code -> key event -> ASCII character -> terminal event" and you shouldn't go far wrong. Personally I would combine the first three steps into one driver layer though, or possibly two driver layers if you want to separate the translation of a scan code to a key code from the translation of a key code to a key event (which you'll probably need to do if/when you add support for different keyboard layouts).
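For reference, here is a rough C sketch of the first two stages of that chain. It assumes PS/2 scan code set 1, where a break code is the make code with bit 7 set, and it ignores the 0xE0 extended prefix entirely. The key code names and the struct layout are invented for illustration and are not taken from anyone's OS in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout-independent key codes (names invented for this sketch). */
enum key_code { KEY_NONE = 0, KEY_A, KEY_ENTER, KEY_LSHIFT };

/* Hypothetical key event: which key, and whether it went down or up. */
struct key_event {
    enum key_code key;
    bool pressed;           /* true = make code, false = break code */
};

/* Stage 1: scan code set 1 make code -> key code (tiny subset of the table). */
enum key_code scan_to_key(uint8_t make_code)
{
    switch (make_code) {
    case 0x1E: return KEY_A;
    case 0x1C: return KEY_ENTER;
    case 0x2A: return KEY_LSHIFT;
    default:   return KEY_NONE;
    }
}

/* Stage 2: raw byte -> key event. In set 1 the break code is make | 0x80. */
struct key_event decode_set1(uint8_t byte)
{
    struct key_event ev;
    ev.key = scan_to_key(byte & 0x7F);
    ev.pressed = (byte & 0x80) == 0;
    return ev;
}
```

Keeping `scan_to_key` separate from `decode_set1` is what would let a different scan code set, or a USB HID usage table, feed the same key codes into the later stages.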
When you start writing an OS you do the minimum possible to get the x86 processor in a usable state, then you try to get as far away from it as possible.

Syntax checkup:
Wrong: OS's, IRQ's, zero'ing
Right: OSes, IRQs, zeroing
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Scan Codes, Key Codes, ASCII, ANSI and Terminals

Post by Combuster »

My design for keyboards is really that they are glorified joysticks - from a computer's perspective the keys on the keyboard are blank anyway. On top of that I'd have an IME driver that converts keystrokes into interaction events.

The rest is userspace work that does little other than routing bytestreams of data between peripherals and software, with appropriate conversions on each end. The bytestreams themselves can very well be simple VT100, so you have an identity mapping from network/serial and the only real logic comes from converting the keyboard and graphics peripherals to these bytestreams.
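As a rough illustration of the keyboard-to-bytestream conversion, the C sketch below turns a key into the bytes a VT100-style terminal would send. The arrow key sequences (ESC [ A through ESC [ D) are the standard ANSI cursor key codes in normal cursor mode; the enum, the function name and the buffer convention are assumptions made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical identifiers for keys that have no ASCII representation. */
enum special_key { SK_NONE = 0, SK_UP, SK_DOWN, SK_RIGHT, SK_LEFT };

/*
 * Write the terminal byte sequence for one key press into out.
 * For SK_NONE the plain character `ascii` is emitted unchanged.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
size_t key_to_vt100(enum special_key sk, char ascii, char *out, size_t cap)
{
    const char *seq = NULL;

    switch (sk) {
    case SK_UP:    seq = "\x1b[A"; break;   /* cursor up */
    case SK_DOWN:  seq = "\x1b[B"; break;   /* cursor down */
    case SK_RIGHT: seq = "\x1b[C"; break;   /* cursor forward */
    case SK_LEFT:  seq = "\x1b[D"; break;   /* cursor back */
    case SK_NONE:  break;
    }

    if (seq != NULL) {
        size_t len = strlen(seq);
        if (len > cap)
            return 0;
        memcpy(out, seq, len);
        return len;
    }

    if (cap < 1)
        return 0;
    out[0] = ascii;
    return 1;
}
```

Bytes arriving from a serial line or telnet session would skip this step and go straight into the same stream, which is the identity mapping described above.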
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Scan Codes, Key Codes, ASCII, ANSI and Terminals

Post by Brendan »

Hi,
SpyderTL wrote:I'm leaning toward the latter approach, but the stream of characters approach would be a little more flexible given the way the rest of the system works, currently.
Actually, no, stream of (ASCII) characters is hideously limited. It can't handle internationalisation (without going to UTF-8), it can't handle input method editors (needed for languages like Japanese, Chinese, etc), it can't handle "key released" (needed for various games - e.g. "ship's primary laser is on while control key pressed"), and it can't easily handle keys that don't have ASCII representations (function keys, etc) and needs ugly escape sequences to work around that.

The approach I'd recommend is for keyboard driver to send "events" consisting of a set of flags (e.g. control, alt, shift and LED states, plus a pressed/repeated/released 2-bit field), a Unicode codepoint (that may be null where it doesn't make sense), a key code (an integer that identifies the raw key number itself), and possibly a time-stamp indicating when the key was pressed (which is only useful to resolve race conditions in some kinds of fast paced games - e.g. to determine if key was pressed before something happened, even if the key wasn't actually received until after something happened due to latency in both hardware and software).
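A minimal C sketch of such an event might look like the following. The field widths, flag bit positions and all names are assumptions chosen for illustration; only the list of fields (flags with a 2-bit pressed/repeated/released state, Unicode codepoint, key code, time-stamp) comes from the description above.

```c
#include <stdint.h>

/* 2-bit state field stored in the low bits of flags. */
enum key_state { KEY_PRESSED = 0, KEY_REPEATED = 1, KEY_RELEASED = 2 };

/* Hypothetical bit assignments for the flags word. */
#define KEV_STATE_MASK  0x0003u
#define KEV_SHIFT       (1u << 2)
#define KEV_CTRL        (1u << 3)
#define KEV_ALT         (1u << 4)
#define KEV_CAPS_LED    (1u << 5)
#define KEV_NUM_LED     (1u << 6)
#define KEV_SCROLL_LED  (1u << 7)

struct key_event {
    uint16_t flags;        /* modifiers, LED states and the 2-bit key state */
    uint16_t key_code;     /* raw key number, independent of layout */
    uint32_t codepoint;    /* Unicode codepoint, 0 where it makes no sense */
    uint64_t timestamp;    /* when the key was pressed; unit left to the OS */
};

/* Build an event; any state bits passed in `modifiers` are ignored. */
struct key_event make_key_event(enum key_state state, uint16_t modifiers,
                                uint16_t key_code, uint32_t codepoint,
                                uint64_t timestamp)
{
    struct key_event ev;
    ev.flags = (uint16_t)((modifiers & ~KEV_STATE_MASK) | (unsigned)state);
    ev.key_code = key_code;
    ev.codepoint = codepoint;
    ev.timestamp = timestamp;
    return ev;
}

/* Extract the pressed/repeated/released state from an event. */
enum key_state key_event_state(const struct key_event *ev)
{
    return (enum key_state)(ev->flags & KEV_STATE_MASK);
}
```

A game would look only at `key_code` and the state bits, while a text editor would mostly care about `codepoint`.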

Normally (for some OSs) there's a terminal emulator layer that converts "events" into "character stream" (and the reverse for output), which is mostly just for compatibility with ancient technology.

Note that ASCII itself became obsolete in the 1990s (and should have been banned/prohibited 15 years ago after a 10 year grace period), serial ports became obsolete near the end of last century (replaced by USB), VT100 (1978) was replaced by VT125 in 1981, real terminals have supported (limited) graphics for almost 40 years, and teletype dates back to teleprinters (about 100 years before computers existed). None of these things belong in an OS that was designed in this millennium. :roll:


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
embryo2
Member
Member
Posts: 397
Joined: Wed Jun 03, 2015 5:03 am

Re: Scan Codes, Key Codes, ASCII, ANSI and Terminals

Post by embryo2 »

SpyderTL wrote:if I wanted to write an application that works equally well with a PS/2 keyboard, a USB keyboard, a telnet connection and an RS-232 terminal, what format should the application expect the incoming data to be in?
If there's a set of code sets (scan or terminal codes) on one side and there's one code set (UTF-8/16) on the other side, then that's the obvious place for converters. But is there just one code set on the side of internal text representation in your OS?

The next part is the transport for the codes. Because transport support shouldn't multiply our problems, it's a good idea to have just one component responsible for moving codes across an application/OS. And if there's just one code set internally supported by an OS, then it's obvious that the transport should also use it.

In the end we have many code sources accompanied by code set converters. After converters we have one transport. And after the transport we have all the other OS internals. So the transport connects many inputs to many outputs without any extra effort related to the input representation before it reaches the transport.
My previous account (embryo) was accidentally deleted, so I have no chance but to use something new. But may be it was a good lesson about software reliability :)
SpyderTL
Member
Member
Posts: 1074
Joined: Sun Sep 19, 2010 10:05 pm

Re: Scan Codes, Key Codes, ASCII, ANSI and Terminals

Post by SpyderTL »

onlyonemac wrote:Make it "scan code -> key code -> key event -> ASCII character -> terminal event" and you shouldn't go far wrong.
I actually meant to say scan code -> key code -> key event -> terminal event -> ASCII character, but I messed up, and I was too lazy to go back and fix it...

And I do think that I'm going to combine the first three together, or more likely, have three different readers -- one that returns scan codes, one that returns key codes, and one that returns events, and I'll just pick the one I need at that moment.
Combuster wrote:On top of that I'd have an IME driver that converts keystrokes into interaction events.
IME? As in Intel Management Engine?
Brendan wrote:The approach I'd recommend is for keyboard driver to send "events" consisting of a set of flags (e.g. control, alt, shift and LED states, plus a pressed/repeated/released 2-bit field), a Unicode codepoint (that may be null where it doesn't make sense), a key code (an integer that identifies the raw key number itself), and possibly a time-stamp indicating when the key was pressed
This is pretty much what I had in mind, but I wanted to be able to convert that back to ASCII in case I just wanted to print characters to the screen, or send them to a text file, or print them to the LPT port, or send them to a serial port... Hence the addition of the final link in the chain, above.
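That final link could be sketched roughly as below in C. It emits UTF-8 rather than strict ASCII, since ASCII is simply the one-byte subset of UTF-8, and it only produces output for events that are not releases and that carry a codepoint. The function names and the `released` parameter are assumptions made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode codepoint as UTF-8 into out (room for 4 bytes).
 * Returns the number of bytes written, or 0 for surrogates and
 * values above U+10FFFF.
 */
size_t utf8_encode(uint32_t cp, uint8_t out[4])
{
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;                  /* plain ASCII */
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                              /* surrogates are not characters */
    if (cp < 0x10000) {
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical glue: key releases and keys without a codepoint yield no text. */
size_t event_to_text(bool released, uint32_t codepoint, uint8_t out[4])
{
    if (released || codepoint == 0)
        return 0;
    return utf8_encode(codepoint, out);
}
```

The resulting bytes can go to the screen, a text file or a serial port without the consumer knowing anything about key events.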
embryo2 wrote:But is there just one code set on the side of internal text representation in your OS?
Well, right now, it's mainly ASCII... or byte streams that just happen to show up as ASCII characters on the screen. But I'm currently working on moving to key events / terminal events, and getting away from ASCII characters.
embryo2 wrote:In the end we have many code sources accompanied by code set converters. After converters we have one transport.
I guess this was my original question. What format should the internal transport use? Bytes, characters, escaped characters, or structs? And I think the consensus here is "structs", and convert everything from/to that format on both ends.
Last edited by SpyderTL on Thu Jun 16, 2016 8:24 am, edited 2 times in total.
onlyonemac
Member
Member
Posts: 1146
Joined: Sat Mar 01, 2014 2:59 pm

Re: Scan Codes, Key Codes, ASCII, ANSI and Terminals

Post by onlyonemac »

SpyderTL wrote:
Combuster wrote:On top of that I'd have an IME driver that converts keystrokes into interaction events.
IME? As in Intel Management Engine?
No, what Combuster is referring to is Input Method Editor. Basically, an IME layer is a way of abstracting input devices. So for example a physical keyboard is one IME (that produces "keypress" interaction events), a mouse is another IME (that produces "pointing" or "clicking" interaction events), a touchscreen is another IME (that also produces "pointing" and "clicking" interaction events), and an on-screen keyboard is another IME that takes "clicking" interaction events and produces "keypress" interaction events.
Last edited by onlyonemac on Thu Jun 16, 2016 3:42 pm, edited 1 time in total.
embryo2
Member
Member
Posts: 397
Joined: Wed Jun 03, 2015 5:03 am

Re: Scan Codes, Key Codes, ASCII, ANSI and Terminals

Post by embryo2 »

SpyderTL wrote:
embryo2 wrote:In the end we have many code sources accompanied by code set converters. After converters we have one transport.
I guess this was my original question. What format should the internal transport use? Bytes, characters, escaped characters, or structs? And I think the consensus here is "structs", and convert everything from/to that format on both ends.
It depends on the transport and its purpose. It can be messaging, it can be events, it can be a buffer, it can be a method/function call or something else. The purpose can be "just get some ASCII characters" or "deliver everything to every component". For the latter case it's highly advisable not to use converters everywhere (on both sides, for example), because there could be 1000 sources and 1000 destinations, and every destination would then have to know about 1000 converters.
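One way to picture the "convert once at the source" arrangement is the C sketch below: every source has its own converter into a single event type, and a single queue acts as the transport, so destinations only ever see that one type. The ring buffer, the event struct and all names here are assumptions for illustration; a real transport could just as well be messages or function calls.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The single internal format that every destination understands. */
struct input_event {
    uint32_t codepoint;
    bool pressed;
};

/* One transport shared by all sources: a small fixed-size ring buffer. */
#define QUEUE_CAP 16
struct event_queue {
    struct input_event slots[QUEUE_CAP];
    size_t head;    /* index of the oldest event */
    size_t count;   /* number of queued events */
};

bool queue_push(struct event_queue *q, struct input_event ev)
{
    if (q->count == QUEUE_CAP)
        return false;                       /* full: caller decides what to drop */
    q->slots[(q->head + q->count) % QUEUE_CAP] = ev;
    q->count++;
    return true;
}

bool queue_pop(struct event_queue *q, struct input_event *out)
{
    if (q->count == 0)
        return false;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return true;
}

/* Source-side converter for a serial line; this sketch accepts 7-bit ASCII only. */
bool serial_byte_to_event(uint8_t byte, struct input_event *out)
{
    if (byte > 0x7F)
        return false;
    out->codepoint = byte;
    out->pressed = true;
    return true;
}
```

Adding a new source then means writing one converter next to that source, and nothing on the destination side has to change.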
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Scan Codes, Key Codes, ASCII, ANSI and Terminals

Post by Combuster »

onlyonemac wrote:
SpyderTL wrote:
Combuster wrote:On top of that I'd have an IME driver that converts keystrokes into interaction events.
IME? As in Intel Management Engine?
No, what Combuster is referring to is Input Method Editor. Basically, an IME layer is a way of abstracting input devices. So for example a physical keyboard is one IME (that produces "keypress" interaction events), a mouse is another IME (that produces "pointing" or "clicking" interaction events), a touchscreen is another IME (that also produces "pointing" and "clicking" interaction events), and an on-screen keyboard is another IME that takes "clicking" interaction events and produces "keypress" interaction events.
I've noticed that a keyboard only ever has two purposes: button mashing and typing text. So things like games that don't need to know more than the state of a key (and possibly its name so it shows up properly in the ) can get the raw input, and everything else should support proper deadkeying or I can't even write my native language properly. Therefore, nothing is going to even get a raw ASCII input because of free bugs.
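To make the dead key part concrete, here is a small C sketch of a composer that sits between key events and text. It handles a single dead key (acute accent) and three vowels; the state struct, the function name and the choice of U+00B4 as the dead key marker are assumptions for this example, and a real layout would drive this from a table.

```c
#include <stdint.h>

/* Spacing ACUTE ACCENT, used here as the marker for the dead key. */
#define DEAD_ACUTE 0x00B4u

/* Hypothetical composer state: the dead key waiting for a base letter, or 0. */
struct compose_state {
    uint32_t pending_dead;
};

/*
 * Feed one codepoint from the keyboard layer.
 * Returns the codepoint to deliver as text, or 0 if nothing should be
 * delivered yet because a dead key is waiting for its base letter.
 */
uint32_t compose_feed(struct compose_state *st, uint32_t cp)
{
    if (st->pending_dead == 0) {
        if (cp == DEAD_ACUTE) {
            st->pending_dead = cp;      /* remember it, emit nothing yet */
            return 0;
        }
        return cp;                      /* ordinary key, pass through */
    }

    st->pending_dead = 0;
    switch (cp) {
    case 'a': return 0x00E1;            /* a with acute */
    case 'e': return 0x00E9;            /* e with acute */
    case 'o': return 0x00F3;            /* o with acute */
    case ' ': return DEAD_ACUTE;        /* dead key + space gives the accent itself */
    default:  return cp;                /* simplification: the accent is dropped */
    }
}
```

An application that reads raw ASCII straight from the key map never gets a chance to run this step, which is the kind of bug described above.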

Of course, mine is going to be a tad more complicated since I also want to support Japanese out of the box, which I expect requires enough features that all the other languages need no real code changes afterwards.