Key Updates, I/O APIC and sleep

DevNoteHQ · Post by **DevNoteHQ** » Fri Oct 13, 2017 4:45 pm

Hi!

I'm currently working on my keyboard driver and the infrastructure for the driver. Since configuring the I/O APIC is kind of a pain in the @$$, i've currently only activated activated the PIC and the APIC.
Consider the following code:

keys.hpp:

Code: Select all

#ifndef KEYS_KEYS_HPP
#define KEYS_KEYS_HPP

namespace Keys
{
	void Init();
	class Key
	{
	public:
		bool IsPressed;
		bool IsHeld;
		bool IsReleased;
	};

	extern void(*UpdateFocusedElement)();

	extern Key A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z;
}

#endif

keyboard.cpp:

Code: Select all

char ScanCode(uint8_t iCode, uint8_t iTable)
{
	if (iTable = 1)
	{
		switch (iCode)
		{
		case 0x10: Keys::Q.IsPressed = true; Keys::UpdateFocusedElement(); Keys::Q.IsPressed = false; Keys::Q.IsHeld = true; Keys::UpdateFocusedElement(); break;
		}
	}
}

Now in the far future i'll have a GUI, which means Keys::UpdateFocusedElement(); will call a function of a process that is currently focused. But in order to do so, i'll have to make two task switches per Update, which means hitting Q would cause 4 task switches and releasing it would cause 2 task switches. Isn't that a little bit performance hungry? I mean i could maybe eliminate the second call by deleting .IsHeld and making it an process internal value...
By the way, the Key-Objects will get copied into public read-only memory for all processes.

APIC and I/O APIC:
If i've understood it correctly, the I/O APIC is only used for IRQs and IPIs, right? Is there a reason why i shouldn't send all IRQs to the BSP? Because if there is no real reason, i could just leave the PIC activated. And is there a good programmers documentation on ACPI? It seems to be a pain in the @$$ to detect and configure

sleep:
How does sleep work if i have multiple requests? I'll need the APIC-Timer for multitasking, so i'll have to use the PIT. But what should i do if i get multiple sleep requests at once? Set a 1ms timer, subtract 1ms from all requests at each IRQ00 and switch task if necessary? I mean I want to complete a sheduling round every 1ms, so i would need a 1ms PIT anyways...

Brendan · Post by **Brendan** » Fri Oct 13, 2017 9:48 pm

Hi,

DevNoteHQ wrote:But in order to do so, i'll have to make two task switches per Update, which means hitting Q would cause 4 task switches and releasing it would cause 2 task switches. Isn't that a little bit performance hungry?

That depends on how fast you type.

If a task switch costs 1000 cycles, a CPU runs at 4 GHz and you type 1 million characters per second; all of a CPU's time would be spent doing task switches; but if you type 500 characters per second 4 task switches per character adds up to 0.005% of one CPU's time.

Of course the PS/2 hardware is limited to a maximum rate of about 10000 characters per second (I forgot the exact baud rate), which means worst case (someone typing as fast as PS/2 hardware can handle) the task switches add up to about 0.01% of a CPU's time.

Note that it'd probably be more like:

from some other process (e.g. idle task) to the GUI when a key is pressed
from GUI to whatever has focus
from whatever has focus back to GUI (to update screen)
from GUI back to some other process (e.g. idle task) after screen is updated

Assuming you don't send "key release" to normal applications (it's typically only needed for games); if the "key release" arrives while GUI is updating the screen it costs nothing (no task switch because that task is already running). For other cases there'd be 2 more task switches for key release (from whatever to GUI, then from GUI back to whatever).

Also, if you type fast enough (or your GUI and application is slow enough) you'd avoid 2 task switches (switching to some other process and back - e.g. idle task).

This means that (depending on conditions, exact timing, etc) you'd be looking at anything from 6 task switches per key press/release to 2 task switches per key press.

DevNoteHQ wrote:I mean i could maybe eliminate the second call by deleting .IsHeld and making it an process internal value...

Usually the keyboard driver sends some kind of "key packet"; which includes:

A key code (or the raw scancode maybe)
A unicode sequence if any (e.g. if the key is 'A' then keyboard driver might send the character 'A')
A set of flags indicating:
- If the key was pressed, repeated or released
- If any of the special keys where also held down (left shift, right shift, left control, right control, left alt, right alt)
- If "caps lock" was on
- If "scroll lock" was on
- If "number lock" was on
Whatever else you might feel like (e.g. maybe "time when keyboard IRQ occurred")

The information in the "key packet" makes it possible for an application to handle the key (including special key combinations used for keyboard shortcuts, etc) without key releases and without needing ".isHeld()".

DevNoteHQ wrote:By the way, the Key-Objects will get copied into public read-only memory for all processes.

How do you defend against malware (e.g. keyboard loggers that record every key pressed in the hope of capturing the user's log-in details/passwords for everything)? It should be impossible for a normal process to know what keys were pressed until/unless it has focus.

DevNoteHQ wrote:APIC and I/O APIC:
If i've understood it correctly, the I/O APIC is only used for IRQs and IPIs, right?

IO APIC is only used for IRQs. The local APIC is used for IPIs.

DevNoteHQ wrote:Is there a reason why i shouldn't send all IRQs to the BSP?

Assume that the BSP got too hot and had to be throttled down to 12.5% of its normal speed; and now several million IRQs per second (from network cards, USB controllers, etc) are all pounding the living daylights out of that poor little CPU while 7 other CPUs are all still running at full speed and sitting there doing absolutely nothing.

DevNoteHQ wrote:sleep:
How does sleep work if i have multiple requests? I'll need the APIC-Timer for multitasking, so i'll have to use the PIT. But what should i do if i get multiple sleep requests at once? Set a 1ms timer, subtract 1ms from all requests at each IRQ00 and switch task if necessary? I mean I want to complete a sheduling round every 1ms, so i would need a 1ms PIT anyways...

For sleep; usually you set the task's state to "sleeping" so that the shceduler doesn't give it CPU time; then put the task on a "list of stuff to wake up" with the time it's supposed to wake up. For the timer; you determine which task has to be woken up next; then configure the timer so it generates an IRQ at that time. When the IRQ occurs you remove the task and from the "list of stuff to wake up" and wake the task up (change the task's state so that the scheduler will give it CPU time again), then determine which task has to be woken up next (and configure the timer so it generates an IRQ at that time, and ....).

Cheers,

Brendan

Korona · Post by **Korona** » Sat Oct 14, 2017 5:29 am

Brendan's post gives a great overview of how the keyboard handler can work. I only want to comment on the following:

Brendan wrote:Usually the keyboard driver sends some kind of "key packet"; which includes:
A key code (or the raw scancode maybe)

A unicode sequence if any (e.g. if the key is 'A' then keyboard driver might send the character 'A')
[...]

I really think that translating key codes to unicode sequences should not be the job of the keyboard driver. This translation should be done in a library that is linked to the application or maybe inside the GUI compositor that distributes input events to individual windows. The reason for that is that the keyboard driver simply has no sane way to perform the translation. It does not know the language that is associated with the text field you're typing into. It should not need to bother with any compositing (e.g. for CJK input) and dead key handling. It has no knowledge of application specific state (e.g. the caps lock state in a virtual machine). The only way to handle those things correctly, is to handle them at the application level.

DevNoteHQ · Post by **DevNoteHQ** » Sat Oct 14, 2017 7:37 am

Thanks a lot for the help!

For some reason i thought all processes should get access to the currently pressed keys

Are Idle task and the GUI individual tasks? Because then kernel -> Idle task -> GUI -> "process that has focus" are 6 task switches in total...
Couldn't I include the GUI in the Idle task? I mean i am planing on calling the Idle task at least every 1ms (at full load i'll call it every 1ms), so there should be sufficient time to update everything... Or is the Idle task usually included in the kernel?

So you are basically saying the I/O APIC is needed if the BSP is under full load? Because i think i'll prefer the other processors in the... I guess the sheduler manages the load distribution? So i'll have to deal with I/O APIC sometime in the future, but not now since the PIC works fine?

By the way i got the idea with the keys from C#. And C# has 3 or even 4 key-states so i thought i'll need to send them to the processes.

Edit:
Do you have any recommendations which algorithm to use for:

Code: Select all

void *kmalloc(uint64_t size)

I'll also have to do:

Code: Select all

void *kmalloc(uint64_t size, uint16_t align)

And therefore also:

Code: Select all

void* operator new(size_t len, uint16_t align)

Any recommendations on that?

Brendan · Post by **Brendan** » Sat Oct 14, 2017 10:10 pm

Hi,

DevNoteHQ wrote:Are Idle task and the GUI individual tasks? Because then kernel -> Idle task -> GUI -> "process that has focus" are 6 task switches in total...
Couldn't I include the GUI in the Idle task? I mean i am planing on calling the Idle task at least every 1ms (at full load i'll call it every 1ms), so there should be sufficient time to update everything... Or is the Idle task usually included in the kernel?

There's 2 ways for a scheduler to handle "there's no work for a CPU to do at all". The first way is to have a special idle state built into the scheduler. The second way is to have an "idle task" that is only ever given CPU time when nothing else can use the CPU (where the existence of an idle task means that the scheduler knows there's always a task to run and avoids the need for the scheduler to deal with a special state).

Note that this is typically tied up with power management (e.g. reducing power consumption when CPU has nothing to do), which gets complex and (if done well) depends on a whole bunch of information that is harder to obtain from an idle task. In general, "idle task" is easier if you're not doing any power management; and "special idle state" is easier if you're doing advanced/complex power management.

Also note that almost all real schedulers have multiple scheduling policies, where each scheduling policy might use a completely different algorithm. For example; "policy 0" might be intended for real-time tasks and might use an [url=https://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling]"earliest deadline first" algorithm; "policy 1" might be intended for tasks that need very low latency but don't use much CPU time, might have 256 task priorities, and might use a "highest priority task gets the CPU" scheduling algorithm; "policy 2" might be intended for normal tasks and might use a "variable frequency" scheduling algorithm (tasks with higher priority get more time slices); and "policy 3" might be intended for background tasks, might have 256 task priorities, and might use a "variable length time slice round robin" scheduling algorithm (where higher priority tasks are given larger time slices). In this case, if the OS uses an idle task, it can just be a task that uses the lowest priority within a scheduling policy intended for unimportant/background work.

This is because different tasks have different requirements - some tasks might need low latency (e.g. when something happens they need the CPU as soon as possible), some tasks don't care about latency at all but need lots of CPU time (plodding away doing calculations in the background), some tasks need a specific amount of CPU time (e.g. enough to handle 60 frames per second, no more and no less), etc. Different scheduling algorithms are good for some requirements and bad for others; and scheduling policies allows an OS to use different scheduling algorithms for different task requirements.

DevNoteHQ wrote:So you are basically saying the I/O APIC is needed if the BSP is under full load? Because i think i'll prefer the other processors in the... I guess the sheduler manages the load distribution? So i'll have to deal with I/O APIC sometime in the future, but not now since the PIC works fine?

IO APIC (and/or Message Signalled Interrupts/MSI) is needed to balance the overhead of IRQs across available CPUs (possibly in a "dynamic, changing in response to power management and other factors" way); which helps to reduce IRQ latency (IRQ balancing reduces the chance of "CPU can't start IRQ handler immediately because CPU is already executing an IRQ handler" and therefore reduces the time between an IRQ occuring and its IRQ handler being started) and can help to improve cache locality (IRQ handler's code and data still in that CPU's caches).

For now; you're learning and should probably do whatever helps you learn more faster; but this depends on what you're learning (e.g. if you're trying to learn about memory management, then the IO APIC would be a useless distraction).

DevNoteHQ wrote:Do you have any recommendations which algorithm to use for:
Code: Select all
void *kmalloc(uint64_t size)
I'll also have to do:
Code: Select all
void *kmalloc(uint64_t size, uint16_t align)
And therefore also:
Code: Select all
void* operator new(size_t len, uint16_t align)
Any recommendations on that?

I think "one algorithm" is a bad idea, regardless of which algorithm it is.

Quickly: For something where performance matters (e.g. kernel) you want to be able to control cache locality (e.g. pieces of memory that are used as a set placed close to each other), and the easiest way to do that is to have "memory pools" where each pool has separate allocator (e.g. one pool for "scheduler data" and one pool for "keyboard driver data", so that all of the scheduler's data is in the same area, same set of cache lines, same set of pages, etc, and you don't get more cache misses and more TLB misses because scheduler's data is interspersed with keyboard driver's data). You also want quotas/limits (e.g. so that a memory leak in a keyboard driver can't/won't gobble all the memory and prevent the scheduler from functioning properly), and memory pools works for that too. You also want some hints - one for "lifetime" (for short-lived things allocation time is more important than avoiding fragmentation and wastage, and for long-lived things avoiding fragmentation and wastage is more important than allocation time); one for "performance vs. security" (for the encrypted RAM features modern CPUs have); one for "bandwidth vs. latency" (for various optimisations for things like NUMA, "high bandwidth RAM built into CPU", etc), and maybe a "which NUMA domain" hint (more optimisation). These hints could be associated with a memory pool and (in some cases) could determine which algorithm is used to manage a pool's memory. Finally you want some bug detection and "debug-ability" - a way to detect bugs (e.g. when you free something that is already free), a way to associate a "pointer to name string" (and maybe a "which line of source code in which file") to each pool and each allocated piece of memory within each pool, a generic way to obtain useful statistics (e.g. "Scheduler's pool is 128 MiB, and 12% of RAM within scheduler's pool is allocated for process data objects"), etc.

Cheers,

Brendan

DevNoteHQ · Post by **DevNoteHQ** » Mon Oct 16, 2017 12:23 pm

Thanks a lot for the help!

To kmalloc():
I can't determine the source calling the kmalloc()-function, right? Which means i would have to pass a variable that specifies which pool to use.

Brendan · Post by **Brendan** » Mon Oct 16, 2017 7:46 pm

Hi,

DevNoteHQ wrote:To kmalloc():
I can't determine the source calling the kmalloc()-function, right? Which means i would have to pass a variable that specifies which pool to use.

If you want to pass "pool number" and "source line number" and "source file name" to kmalloc then you'd need parameters for them. For the callers it's possible to have a macro/wrapper that determines "source line number" and "source file name" using pre-processor variables (e.g. "__LINE__" and "__FILE__" in GCC); and it'd also be possible to use the preprocessor (with something like "#ifdef DEBUGGING") to enable/disable the parameters that are only used for debugging (in "kmalloc()" itself, and in the macro callers use).

For the pool number it's impossible to auto-guess; and in some cases you might need the same function to work with any pool. For a very simple example:

Code: Select all

char *clone_string(POOL pool, char *src, int length) {
    char *dest;

    dest = kmalloc(pool, length);
    if(dest == NULL) return NULL;
    memcpy(dest, src, length);
    return dest;
}

Cheers,

Brendan

DevNoteHQ · Post by **DevNoteHQ** » Tue Oct 17, 2017 4:43 am

Okay, thanks a lot!

OSDev.org

Key Updates, I/O APIC and sleep

Key Updates, I/O APIC and sleep

Re: Key Updates, I/O APIC and sleep

Re: Key Updates, I/O APIC and sleep

Re: Key Updates, I/O APIC and sleep

Re: Key Updates, I/O APIC and sleep

Re: Key Updates, I/O APIC and sleep

Re: Key Updates, I/O APIC and sleep

Re: Key Updates, I/O APIC and sleep