
Translating keyboard input into the best form to send on.

Posted: Fri Jan 07, 2011 7:28 pm
by DavidCooper
There are a lot of things in my OS that are a bit of a mess (even though they work reliably) - in particular, the way I deal with keyboard input could be better designed. But it's possible that I'm just about to create a new mess which I'll only end up having to tidy up again later, so I thought I should describe what I'm thinking of doing and see if anyone has any suggestions to make.

The first stage is fine - the interrupt routine simply reads the 8042's data port and sticks whatever it finds into a buffer.

The next stage is done whenever the OS is asked to read the keyboard by whichever user interface is active at the time: it transfers what it finds in the buffer into a 12-byte array which can store information on up to six keys being held down at once, three of those being shifts (shift/ctrl/alt) and the other three being any other keys. I designed this part of the system to make it easy to treat any key as a shift key, so that whole words can potentially be entered by holding down any char key and then pressing another (which must be released first if it's to count, a reasonable delay between the presses establishing that the second key wasn't hit by accident). If the second char key is held down after the first is released, it counts as a separate char being input. This part of the system is efficient and works fine, so it will probably stay as it is.
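
To make the layout concrete, here's a hypothetical C view of that 12-byte array - the real thing isn't laid out in C, so the field arrangement here is only a guess at one way the 12 bytes could be divided up:

Code:
/* Hypothetical sketch of the 12-byte held-key array described above:
   three slots for shift-type keys and three for any other keys.
   The field layout is an assumption, not the actual arrangement. */
#include <stdint.h>

struct held_keys {
    uint8_t shifts[3];   /* shift/ctrl/alt currently down, 0 = empty slot */
    uint8_t chars[3];    /* up to three other held keys, 0 = empty slot */
    uint8_t order[3];    /* order the char keys went down (key-as-shift) */
    uint8_t flags[3];    /* e.g. bit 0 set once a key has produced output */
};                       /* 12 bytes in total */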

The next part is the bit that needs to be tidied up. At the moment it translates everything into a form to send to apps (or for the OS itself to use), putting a code in al: either ASCII for a char key or a non-ASCII value for other kinds of key. Some values indicate that ah should be looked at as well, because there aren't enough values to go round. The reason I'm not happy with this system is that every app has to filter out all the ASCII values every time by comparing against various values, and it's tiresome having to do that filtering every time. (I do have alternative routines for collecting input where the OS does the filtering in advance and only passes values of a certain type, but normally an app wants the whole lot.) I now plan to divide the codes being passed to the app into different groups, with the code value held in ah rather than al, so that al is kept free to indicate which group the code belongs to. The result is that an ASCII char can be filtered out with a single compare instead of multiple comparisons against different values, and a single compare can likewise isolate the numbers, the letters or the other symbols. The values in al might be:

0 - the digits (code in ah)
1 - the letters (code in ah)
2 - other kinds of symbol (code in ah)
3 - Esc (no need for anything in ah)
4 - the various cursor keys and other positioning keys (code in ah)
5 - F1 to F12 (ah holding the same number as on the key, plus 12 if shifted, plus 24 if ctrl, etc.)
6 - a char shifted by a char, for whole-word entry (the chars involved being in ah and the next free byte of eax)
7 - information about shift keys being pressed, just in case an app cares
13 - Return (no need for anything in ah)

Unmakes can also be sent by adding 128 to the byte in al. Even if no app bothers to do anything with them, it makes sense to supply all the information any app might want: apps can respond to the codes they're interested in while ignoring the rest, without the process ever being messy.
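
Expressed in C for clarity, the grouping might look like this - the enum names are invented, and the packing just mirrors what I'd put in eax:

Code:
/* Sketch of the proposed grouping, assuming the al/ah packing carries
   over into a 32-bit event value (group in the low byte, code next). */
#include <stdint.h>

enum key_group {
    GRP_DIGIT  = 0,    /* ah = '0'..'9' */
    GRP_LETTER = 1,    /* ah = 'A'..'Z' or 'a'..'z' */
    GRP_SYMBOL = 2,    /* ah = any other printable symbol */
    GRP_ESC    = 3,    /* no ah payload needed */
    GRP_CURSOR = 4,    /* ah = cursor/positioning key id */
    GRP_FKEY   = 5,    /* ah = 1..12, +12 if shifted, +24 if ctrl, ... */
    GRP_WORD   = 6,    /* ah and the next byte hold the two chars involved */
    GRP_SHIFT  = 7,    /* a shift key went down, in case the app cares */
    GRP_RETURN = 13    /* no ah payload needed */
};
#define KEY_UNMAKE 128 /* added to the group byte on key release */

/* Pack an event the way eax would carry it: al = group, ah = code,
   third byte = the extra char used by group 6. */
static inline uint32_t key_event(uint8_t group, uint8_t code, uint8_t extra)
{
    return (uint32_t)group | ((uint32_t)code << 8) | ((uint32_t)extra << 16);
}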

So, what I'd like to know is: does this sound like a good way to do things, or am I missing a trick or two? I'd like to know how other people have gone about this if they've done something different, particularly if some aspect of my design might make some kind of functionality I haven't thought of harder to support. Ideally I'd like to create, or tie in with, something that might become a common standard which many OSes could support (perhaps in addition to their existing way of doing things - it should be possible to switch the translation code to suit individual apps when their user interface is active, so that they get input in the form they expect after the OS has filtered out anything intended for it directly; a common standard therefore needn't limit any OS that wants to use the keyboard in some exotic way which can't be made compatible with that standard and which would itself limit the keyboard's potential functionality in some other way).

Re: Translating keyboard input into the best form to send on

Posted: Fri Jan 07, 2011 9:23 pm
by ~
Maybe the following will help with some of your points. Also look at how Windows does things: it translates every make/break scan code into a virtual key identifier. That means the keyboard driver is the one responsible for validating the scan codes.

Done this way, key presses become much more portable: the application still has to compare the virtual identifiers, but in a hardware-independent fashion, which means it can either write text or assign actions to any key. See how there are different categories of keys:

About keyboard input
Virtual key codes

Here are some other references:
http://wiki.osdev.org/Keyboard_Input
Keyboard programming from Bran's kernel development tutorial

It seems that if a keyboard driver is to be stable, it MUST read exactly one byte from the KBC per triggered IRQ 1, as soon as the byte appears, via its ISR.

Also, as far as I know, the keyboard driver itself should determine which scan code (key) was pressed or released each time a byte is read into its buffer, for as long as the buffer isn't empty (i.e. still has bytes to check).
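
In C terms the ISR boils down to something like this - inb() and the buffer helper are assumed names, and the EOI to the PIC is omitted:

Code:
/* Minimal IRQ 1 handler sketch: read exactly one byte from the KBC data
   port per interrupt and stash it, deferring all decoding to later. */
#include <stdint.h>

extern uint8_t inb(uint16_t port);   /* assumed port-input helper */
extern void kbuf_put(uint8_t b);     /* assumed: append to driver buffer */

#define KBC_DATA 0x60

void keyboard_isr(void)
{
    kbuf_put(inb(KBC_DATA));         /* one byte per IRQ, nothing more */
    /* ...send EOI to the PIC and return from the interrupt... */
}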

To reduce the comparison load a little, you could do this: suppose you have only 1 byte in your buffer (you should be able to tell how many bytes your driver buffer holds). Then you can limit the comparisons to scan code definitions that are 1 byte wide, skipping definitions that are 2, 6 or more bytes wide.

Then you can carry some actions and/or send the virtual key code and set/clear flags for things like Shift, Num Lock, Ctrl status, etc., etc.

In the same way, if there are, say, 7 bytes in your buffer, you can tell that this is nearer to the width of the widest scan code (the Pause key in scan code set 1, for instance). Then you could check the scan codes from a table, starting with the wider ones and skipping those that are too small.

You could either search from smaller to bigger, or vice versa.

Or you could start the search, dynamically or statically, from a specially tuned table ordered by the most common or most recently used scan codes, to improve the chances of finding common scan codes faster; with some optimizations of your own you can cut the search time further.

But as far as I know you will always have to make many comparisons - at most as many as there are make/break codes in the standard scan code set you are using (set 1, 2 or 3).
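
Putting those ideas together, a length-filtered table search might look roughly like this in C - the table layout and all the names are only hypothetical:

Code:
/* Sketch of the length-filtered search: skip any definition wider than
   the bytes currently available, then compare the rest in table order. */
#include <stdint.h>
#include <string.h>

struct sc_def {
    uint8_t  len;          /* width of this make/break sequence in bytes */
    uint8_t  bytes[8];     /* the sequence itself, e.g. E0 48 for Up */
    uint16_t virtual_key;  /* code handed on once matched */
};

extern const struct sc_def sc_table[];  /* assumed driver table */
extern const int sc_table_count;

/* Return the virtual key matching the front of 'buf', or 0 if none. */
uint16_t match_scancode(const uint8_t *buf, int avail, int *consumed)
{
    for (int i = 0; i < sc_table_count; i++) {
        if (sc_table[i].len > avail)
            continue;      /* skip definitions wider than what we have */
        if (memcmp(buf, sc_table[i].bytes, sc_table[i].len) == 0) {
            *consumed = sc_table[i].len;
            return sc_table[i].virtual_key;
        }
    }
    return 0;              /* incomplete or unknown sequence so far */
}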

As you can see from Bran's tutorial, you could also use the scan code value as an index straight into a table, though I suspect that only works straightforwardly for 1-byte scan codes. You could easily merge the two methods: check for special first-byte values (0xE0) to tell whether you are dealing with a single-byte code (indexing method) or a multi-byte code (table-comparison method), check bit 7 to detect the make or break of a key, and then carry out whatever actions you consider appropriate for the presses/releases.
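
A rough C sketch of that merged method for scan code set 1 - the table names are hypothetical, and the longer 0xE1 Pause sequence is left out for simplicity:

Code:
/* Plain one-byte codes index straight into a table, an 0xE0 prefix
   switches to the extended table, and bit 7 separates break from make. */
#include <stdint.h>
#include <stdbool.h>

extern const uint16_t keymap_base[128]; /* assumed: index = code & 0x7F */
extern const uint16_t keymap_e0[128];   /* assumed: 0xE0-prefixed codes */

static bool saw_e0;

void handle_byte(uint8_t sc)
{
    if (sc == 0xE0) { saw_e0 = true; return; } /* prefix: wait for more */

    bool release = sc & 0x80;                  /* bit 7 set on break */
    uint16_t vk = (saw_e0 ? keymap_e0 : keymap_base)[sc & 0x7F];
    saw_e0 = false;

    /* ...act on make/break of virtual key 'vk' here... */
    (void)release; (void)vk;
}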

I also have a sort of draft on my website for a keyboard/mouse tutorial - unfinished, very low level and maybe not fully debugged, but with some corrections. Maybe the references to other sites contained there could help you, as could whatever knowledge I've managed to gather about the subject so far.

Also, remember that it's best to use a circular buffer of raw scan code bytes, big enough that it never fills up under normal conditions (assuming the CPU isn't overloaded), no matter how fast the user presses keys. The buffer works dramatically better if its size is a power of 2: you get rid of the IFs that check the limits of the read and write pointers, and after incrementing or decrementing a pointer you just AND it with a mask of consecutive 1 bits to crop it into a repetitive, "self-resetting" miniature read/write address space. (Physical RAM addressing resembles a huge circular buffer in the same way when pointers increment or decrement sequentially.)
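
In C, the power-of-two trick looks like this - with free-running unsigned counters, an AND with (size - 1) wraps both pointers with no IFs at all (the names here are just illustrative):

Code:
#include <stdint.h>

#define KBUF_SIZE 256                /* must be a power of two */
#define KBUF_MASK (KBUF_SIZE - 1)

static volatile uint8_t kbuf[KBUF_SIZE];
static volatile unsigned kbuf_wr, kbuf_rd;   /* free-running counters */

void kbuf_put(uint8_t b)             /* called from the ISR */
{
    kbuf[kbuf_wr & KBUF_MASK] = b;   /* the mask replaces any limit check */
    kbuf_wr++;
}

int kbuf_get(void)                   /* returns -1 when empty */
{
    if (kbuf_rd == kbuf_wr)
        return -1;
    return kbuf[kbuf_rd++ & KBUF_MASK];
}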

http://126.sytes.net/tutorials/KBC_SourceDoc/

Re: Translating keyboard input into the best form to send on

Posted: Fri Jan 07, 2011 10:44 pm
by Brendan
Hi,
DavidCooper wrote:The first stage is fine - the interrupt routine simply reads the 8042's data port and sticks whatever it finds into a buffer.
No, that's not fine.. :)

If you've ever got a choice between polling and not needing to poll, then polling is always the wrong choice. If your interrupt service routine just puts whatever it finds into a buffer, then how does the next stage avoid polling that buffer?

You need to use some sort of IPC so that polling can be avoided. The basic characteristic I'd look for is some sort of "while anything I'm waiting for hasn't happened yet, don't give me any CPU time at all" facility. It doesn't matter if it's messaging and a "get_message()" function, or if it's pipes or sockets with a "select()" function, or if it's something else.
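
Roughly this shape, in C, with hypothetical primitives and the locking between the ISR and the app glossed over:

Code:
/* Sketch only: block_current_thread() and wake_thread() are assumed
   kernel calls, and a real version needs ISR-safe synchronisation. */
#include <stdint.h>

struct message { uint32_t type; uint32_t payload; };

extern void block_current_thread(void);   /* assumed: yield until woken */
extern void wake_thread(int tid);         /* assumed: make runnable */

#define MSGQ_SIZE 64                      /* power of two */
static struct message msgq[MSGQ_SIZE];
static volatile unsigned msgq_wr, msgq_rd;
static volatile int waiter = -1;          /* tid blocked in get_message() */

struct message get_message(int my_tid)    /* the app's only entry point */
{
    while (msgq_rd == msgq_wr) {          /* nothing yet: give up the CPU */
        waiter = my_tid;
        block_current_thread();           /* no CPU time until a post */
    }
    return msgq[msgq_rd++ & (MSGQ_SIZE - 1)];
}

void post_message(struct message m)       /* called by drivers/ISRs */
{
    msgq[msgq_wr++ & (MSGQ_SIZE - 1)] = m;
    if (waiter >= 0) { int t = waiter; waiter = -1; wake_thread(t); }
}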

It looks like you've made this same mistake everywhere else - lots of applications constantly polling the GUI, which in turn constantly polls that keyboard buffer. It adds up to a lot of wasted time for nothing.

Also, ASCII should've been banned a decade ago. Use Unicode or I will hunt you down and steal all your cookies.. :)


Cheers,

Brendan

Re: Translating keyboard input into the best form to send on

Posted: Sat Jan 08, 2011 10:59 pm
by DavidCooper
~ wrote:Maybe the following will help with some of your points. Also look at how Windows does things: it translates every make/break scan code into a virtual key identifier. That means the keyboard driver is the one responsible for validating the scan codes.

Done this way, key presses become much more portable: the application still has to compare the virtual identifiers, but in a hardware-independent fashion, which means it can either write text or assign actions to any key. See how there are different categories of keys:

About keyboard input
Virtual key codes
That first link was particularly interesting - it seems that the app gets sent all manner of stuff that it probably doesn't need and then tends to call for translation.

The other links you've given cover the basics of getting input, and after that you discuss processing scan codes, but I was actually asking about the step after that: the best way for the OS to present key input data to the app, and in particular how to avoid the app having to filter character codes out from control codes (such as the cursor keys). I want apps to be able to compare the key input with a single value and isolate all the char codes in one go, rather than having to compare two values for each end of the range involved. It isn't clear from the Windows links above whether Windows does that.
Or you could start the search, dynamically or statically, from a specially tuned table ordered by the most common or most recently used scan codes, to improve the chances of finding common scan codes faster; with some optimizations of your own you can cut the search time further.

But as far as I know you will always have to make many comparisons - at most as many as there are make/break codes in the standard scan code set you are using (set 1, 2 or 3).
I just use a table and XLAT - it's not that part of the process I'm asking about, but how the app deals with the translated values the OS sends it. I've been looking through my code, and I keep finding places where key input is sent to some program which has to compare numerous values just to determine which type of action is required in response. It seems to me that giving the app a value that isolates the type of action likely to be required should be the highest priority, rather than sending a code which might be in any group - so the value 0 would be sent for the digits 0 to 9, or the value 1 for any letter from A to Z or a to z. I notice from one of those Windows links that I forgot about diacritics - they should have their own group value too, as they need to be held back to be combined with the next value (and they can't be held back by the OS, because you want the diacritic displayed straight away).
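
To show what this buys the app, here's a hypothetical C version of the receiving end: one switch (or a single compare) on the group byte classifies each event, with insert_char() and move_caret() standing in for the app's own routines:

Code:
#include <stdint.h>

extern void insert_char(uint8_t c);    /* hypothetical app routine */
extern void move_caret(uint8_t key);   /* hypothetical app routine */

void on_key_event(uint32_t event)      /* al = group, ah = key code */
{
    uint8_t group = event & 0xFF;
    uint8_t code  = (event >> 8) & 0xFF;

    switch (group) {
    case 0: case 1: case 2:            /* digit, letter, symbol: all chars */
        insert_char(code);
        break;
    case 4:                            /* cursor/positioning keys */
        move_caret(code);
        break;
    default:                           /* groups this app doesn't care about */
        break;
    }
}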
I also have a sort of draft on my website for a keyboard/mouse tutorial - unfinished, very low level and maybe not fully debugged, but with some corrections. Maybe the references to other sites contained there could help you, as could whatever knowledge I've managed to gather about the subject so far.
I assume that's the link at the bottom - I've just read through most of it and it's a good resource which I've bookmarked so that I can refer to it when I write my mouse driver.
Also, remember that it's best to use a circular buffer of raw scan code bytes, big enough that it never fills up under normal conditions (assuming the CPU isn't overloaded), no matter how fast the user presses keys. The buffer works dramatically better if its size is a power of 2: you get rid of the IFs that check the limits of the read and write pointers, and after incrementing or decrementing a pointer you just AND it with a mask of consecutive 1 bits to crop it into a repetitive, "self-resetting" miniature read/write address space. (Physical RAM addressing resembles a huge circular buffer in the same way when pointers increment or decrement sequentially.)
I am using a circular buffer containing raw scan codes, though I also copy the value of the system timer variable into the buffer each time so that I can work out when each key was pressed, how long it was held down, etc. The idea is to provide possibilities for identifying the user by their typing style, and also to give keys different functions according to how long they're held down. I hadn't thought of using a bit mask to avoid checking for limits - that's an excellent idea which I will definitely employ. Thanks!
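
For what it's worth, a C sketch of the kind of buffer entry I mean - the 32-bit tick counter and the names are just illustrative:

Code:
#include <stdint.h>

struct kbd_event {
    uint8_t  scancode;                 /* raw byte from the 8042 */
    uint32_t ticks;                    /* system timer when it arrived */
};

#define KEVT_SIZE 256                  /* power of two, so a mask wraps it */
static volatile struct kbd_event kevt[KEVT_SIZE];
static volatile unsigned kevt_wr;

extern volatile uint32_t timer_ticks;  /* assumed system timer variable */

void kevt_put(uint8_t sc)              /* called from the keyboard ISR */
{
    unsigned i = kevt_wr & (KEVT_SIZE - 1);
    kevt[i].scancode = sc;
    kevt[i].ticks    = timer_ticks;
    kevt_wr++;
}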

Re: Translating keyboard input into the best form to send on

Posted: Sun Jan 09, 2011 1:09 am
by DavidCooper
Brendan wrote:Hi,
DavidCooper wrote:The first stage is fine - the interrupt routine simply reads the 8042's data port and sticks whatever it finds into a buffer.
No, that's not fine.. :)

If you've ever got a choice between polling and not needing to poll, then polling is always the wrong choice. If your interrupt service routine just puts whatever it finds into a buffer, then how does the next stage avoid polling that buffer?
I thought about this a long time ago and decided that you'd always have to have something polling at one level or another, but you're right: the polling done by the app can be eliminated, and the user-interface thread of the app can be deactivated until new input appears, instead of being run periodically to poll for it. At the moment the OS calls that thread from time to time (though technically the OS rets to it) so that it can call a routine which looks for new input, and whenever the keyboard routine rets with a zero in al the app simply rets to the OS (by calling it). That uses an infinitesimal percentage of processor time, but it's still more than zero, so clearly it isn't the best way of doing things. Instead I should get the keyboard interrupt routine to activate whichever thread is currently meant to act on key input, and that would be as easy as setting a variable for the carousel code (which controls the multitasking) to read in order to decide whether to wake up the app's user-interface thread. There's actually still some polling going on - the carousel code polls that variable - and even that could be eliminated, at the cost of having the interrupt routine call code to wake the thread up by moving it back into the carousel array of active threads. Maybe you think it should do that too, but with the polling down to one tiny check of a variable per lap of the carousel array, that may be a price worth paying to hasten the iret.
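
In sketch form (C, with made-up names), the single-variable handoff amounts to this:

Code:
/* The ISR sets a flag and the carousel checks it once per lap, so no
   thread ever polls for input itself. All names are hypothetical. */
#include <stdbool.h>

static volatile bool ui_wake_pending;  /* set by the keyboard ISR */

void keyboard_isr_tail(void)           /* the ISR's last act before iret */
{
    ui_wake_pending = true;
}

void carousel_lap(void)                /* one pass over the thread array */
{
    if (ui_wake_pending) {             /* the one remaining tiny "poll" */
        ui_wake_pending = false;
        /* ...move the UI thread back into the active part of the array... */
    }
    /* ...run the active threads... */
}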
You need to use some sort of IPC so that polling can be avoided. The basic characteristic I'd look for is some sort of "while anything I'm waiting for hasn't happened yet, don't give me any CPU time at all" facility. It doesn't matter if it's messaging and a "get_message()" function, or if it's pipes or sockets with a "select()" function, or if it's something else.

It looks like you've made this same mistake everywhere else - lots of applications constantly polling the GUI, which in turn constantly polls that keyboard buffer. It adds up to a lot of wasted time for nothing.
No: only one thread polls for key input, as all other user-interface threads are inactive, so the amount of wasted time is minuscule and would be hard to measure - but even that can be reduced to a tiny fraction again if the interrupt code sets that variable I mentioned. Maybe I should just go the whole hog and get the interrupt routine to call a routine to enter the relevant thread's details in the carousel array, but I don't want to do that at the cost of making the interrupts less responsive. Then again, maybe processors are so fast now that I shouldn't be worried about that.
Also, ASCII should've been banned a decade ago. Use Unicode or I will hunt you down and steal all your cookies.. :)
The simple answer is to have different routines for the app to call, depending on the form it wants the keyboard input sent in. Windows appears to send the app all the keyboard scan codes, which practically every app will ignore, asking for translation instead, so it would be best just to have a routine an app can call to get the raw scan codes if that's what it wants, while other apps could ask for ASCII, Unicode or whatever else they fancy. The interrupt routine triggers the app's user-interface thread to wake up, either by setting a variable or by calling a routine to put the thread's details into the carousel array, and then when that thread runs it calls a specific routine to get the kind of input it wants. It doesn't need to look for messages...

Except that it might have been woken up for some other reason, such as to handle data coming in from the mouse, and that would need to be accessed through the same mechanism... which turns it into a messaging system rather than just a keyboard reader. So I should really be designing a messaging system rather than just a system for getting hold of key input. The keyboard interrupt routine needs to indicate that it woke the app up to deal with key data, while another interrupt routine would tell it that mouse data is waiting for it, and the routine the app then calls to find out what's going on is really a message-fetching routine which checks the variable set by the interrupt (which could be a bitmap of different message senders) to find out which buffer or buffers need to be read...

And then the bitmap becomes a limitation, because a more advanced app may need to wake up to respond to information from other threads. So: chuck the bitmap, and instead have a proper messaging system which enables the carousel code to wake threads up and post messages to them telling them why they're being activated. Ah yes! Thank you for triggering this cascade of realisations. I hope I'm on the right lines now.

The app is woken up and calls the messaging system, and if there's key or mouse input to deal with it could be sent back as a direct reply, probably in a form such as the one I described before: a number in al to indicate what needs dealing with, so values from 0 to 15 might relate to key input (in which case the input itself is in ah), while other values in al would indicate mouse data, or some other kind of non-standard message which the OS doesn't know how to package and which the app therefore needs to collect for itself, the address and length of which could be passed in other registers. Or the message could simply be stored in the address space of the app itself, so that the OS only needs to store one address per thread rather than warehousing messages of its own. So, would it be a good plan for every thread to have its messages delivered by the OS straight into its own address space, where it can read them directly? It could still call the OS to pre-read the messages and translate them into the right form to act on if they're about key or mouse input, but if they're something the OS doesn't know how to convert, it would simply tell the app to handle the message itself.

After dealing with a message, the app would call the OS to get itself deactivated; the OS would first check whether any more messages are waiting, and after deactivating the app it would have to check whether the app needs reactivating straight away because an interrupt routine has sent another message. So I guess there has to be a message queue in the OS, which is where the interrupt routine should put its messages; the OS then works out who each message is for and posts it on to the thread it's aimed at. All messages go into that simple queue first, and an OS postman thread is woken up to deliver them - that's when the app (or whichever thread the message is for) gets woken up. The interrupt routine doesn't need to wake the app directly at all: it simply wakes the postman, and he does the job for it. The interrupt routine could wake the postman thread just by checking and changing a couple of variables to extend the carousel array so that the postman's entry ends up inside the array rather than outside it, and thus absolutely no polling is involved at any stage. There could be some problems in waking the postman and sending him back to sleep if an interrupt routine tries to adjust the same variables at the same time, but I think disabling the interrupts for a moment would cure that.
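
As a sketch (C, with all the helper routines hypothetical), the postman arrangement might look like this:

Code:
/* The ISR only enqueues and wakes the postman; the postman delivers each
   message into the target thread's own address space and wakes it. The
   cli/sti window is the interrupt-disabling mentioned above. */
#include <stdint.h>

struct message { int target_tid; uint32_t type; uint32_t payload; };

extern void cli(void), sti(void);        /* assumed interrupts off/on */
extern void wake_thread(int tid);        /* assumed carousel call */
extern void block_current_thread(void);  /* assumed: re-enables interrupts
                                            atomically as it sleeps, and
                                            returns with them off again */
extern void deliver(struct message *m);  /* assumed: copy into the target
                                            thread's address space */

#define POSTMAN_TID 1                    /* hypothetical */
#define PQ_SIZE 64                       /* power of two */
static struct message pq[PQ_SIZE];
static volatile unsigned pq_wr, pq_rd;

void post_from_isr(struct message m)     /* interrupts already off here */
{
    pq[pq_wr++ & (PQ_SIZE - 1)] = m;
    wake_thread(POSTMAN_TID);            /* the postman does the rest */
}

void postman_thread(void)
{
    for (;;) {
        cli();                           /* keep ISRs out while we look */
        while (pq_rd == pq_wr)
            block_current_thread();      /* sleep until the next post */
        struct message m = pq[pq_rd++ & (PQ_SIZE - 1)];
        sti();
        deliver(&m);                     /* straight into its own space */
        wake_thread(m.target_tid);       /* now the app itself wakes up */
    }
}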

I'd better go and read up on messaging before I start work on all that, but thanks for your input. I'm very glad now that I asked the question here because I was indeed missing a trick or two.