TTY/PTY details

proxy · Post by **proxy** » Tue Dec 16, 2014 3:11 pm

I think I have a pretty rock-solid core for my OS at this point, but I never got around to implementing proper "terminals". Right now there is just the console, with a little bit of abstraction via the VFS. Processes can read from "device:///stdin" and write to "device:///stdout" to interact with the console and keyboard, and this works great. At least for now, stdin eventually boils down to a pop from a blocking queue that is filled by the keyboard interrupt. There is no explicit concept of an application being in the foreground, so keystrokes go to processes on a first-come, first-served basis. I want to take things to the next level.

So I want to implement an equivalent to /dev/ptmx and /dev/pts/... But to be honest, I'm simply not familiar enough with how this typically works to know where to start. I've looked in the wiki, and it's very abstract and a bit light on details. Is there any relatively easy-to-grok spec on the APIs that terminals usually offer? Here are my current thoughts.

Assume we are implementing a traditional 80x25 console. I could start with a 2000-character buffer that represents the current screen. Writing to the TTY would, in effect, simply update the contents of this buffer, and only if that TTY is active would the change be reflected on the screen. (I can imagine some optimizations here, but let's keep it simple.) This implies a concept of which TTY is "active". For a console this seems trivial: I can map the Fn keys to switch the active one. If I ever get to windowed processes, maybe it's more complex?
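A minimal sketch of that idea in C, purely for illustration: each virtual terminal owns its own 80x25 buffer, and a write only reaches the screen when the terminal is the active one. All names here (`vt_write`, `vt_switch`, `VT_COUNT`) are hypothetical, the `screen` array is a stand-in for real video memory, and attributes/colours are ignored.

```c
#include <stddef.h>
#include <string.h>

#define VT_COLS  80
#define VT_ROWS  25
#define VT_COUNT 4   /* hypothetical number of virtual terminals */

/* One virtual terminal: a private copy of the 80x25 screen plus a cursor. */
struct vt {
    char     cells[VT_ROWS * VT_COLS];   /* 2000 characters */
    unsigned row, col;
};

static struct vt vts[VT_COUNT];
static unsigned  active_vt;                   /* index of the visible terminal */
static char      screen[VT_ROWS * VT_COLS];   /* stand-in for video memory */

/* Scroll the buffer up by one line and blank the last line. */
static void vt_scroll(struct vt *t)
{
    memmove(t->cells, t->cells + VT_COLS, (VT_ROWS - 1) * VT_COLS);
    memset(t->cells + (VT_ROWS - 1) * VT_COLS, ' ', VT_COLS);
    t->row = VT_ROWS - 1;
}

/* Write bytes into terminal n; touch the screen only if n is active. */
void vt_write(unsigned n, const char *data, size_t len)
{
    if (n >= VT_COUNT)
        return;
    struct vt *t = &vts[n];
    for (size_t i = 0; i < len; i++) {
        if (data[i] == '\n') {
            t->col = 0;
            t->row++;
        } else {
            t->cells[t->row * VT_COLS + t->col] = data[i];
            if (++t->col == VT_COLS) {
                t->col = 0;
                t->row++;
            }
        }
        if (t->row == VT_ROWS)
            vt_scroll(t);
    }
    if (n == active_vt)
        memcpy(screen, t->cells, sizeof screen);
}

/* Make terminal n visible (e.g. from an Fn-key handler) and repaint. */
void vt_switch(unsigned n)
{
    if (n >= VT_COUNT)
        return;
    active_vt = n;
    memcpy(screen, vts[n].cells, sizeof screen);
}
```

Repainting the whole screen on every write is the "keep it simple" version; tracking dirty rows would be one of the obvious optimizations.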

What are everyone's thoughts on this? Am I way off track? Or do I have the general concept reasonably on target?

Thanks

proxy · Post by **proxy** » Tue Dec 16, 2014 3:14 pm

I forgot to mention: another concept I think is "needed" is that stdin doesn't have to be the keyboard. Perhaps there should be a generic queue structure for incoming data for each TTY? Or is that something that would generally be dealt with at the VFS layer (as it is now with just the raw console stuff)?
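One possible shape for such a per-TTY queue, sketched as a fixed-size ring buffer. The names (`tty_input`, `tty_input_push`, `tty_input_read`) and the capacity are assumptions for illustration; a real kernel would add locking and would block the reader when the queue is empty instead of returning 0.

```c
#include <stdbool.h>
#include <stddef.h>

#define TTY_INPUT_SIZE 256   /* hypothetical capacity */

/* Input queue owned by one TTY; any producer (keyboard, pipe, ...) may fill it. */
struct tty_input {
    unsigned char data[TTY_INPUT_SIZE];
    size_t head;    /* next slot to write */
    size_t tail;    /* next slot to read */
    size_t count;   /* bytes currently queued */
};

/* Producer side: returns false if the queue is full and the byte was dropped. */
bool tty_input_push(struct tty_input *q, unsigned char c)
{
    if (q->count == TTY_INPUT_SIZE)
        return false;
    q->data[q->head] = c;
    q->head = (q->head + 1) % TTY_INPUT_SIZE;
    q->count++;
    return true;
}

/* Consumer side: copy out up to len queued bytes; returns how many were copied. */
size_t tty_input_read(struct tty_input *q, unsigned char *out, size_t len)
{
    size_t n = 0;
    while (n < len && q->count > 0) {
        out[n++] = q->data[q->tail];
        q->tail = (q->tail + 1) % TTY_INPUT_SIZE;
        q->count--;
    }
    return n;
}
```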

gerryg400 · Post by **gerryg400** » Tue Dec 16, 2014 3:52 pm

http://www.linusakesson.net/programming/tty/index.php has some good information to get you started

Brendan · Post by **Brendan** » Tue Dec 16, 2014 9:08 pm

Hi,

proxy wrote:What are everyone's thoughts on this?

My thoughts are that you're mixing many different/separate things:

The interface provided by a standard library to the application (e.g. streams)
The interface provided by the kernel to the standard library
A form of inter-process communication (e.g. pipes, messages)
The code (process?) that allows multiple things to share the same keyboard and screen
The video buffering and rendering
The interface provided by the keyboard driver

Let's start from the lowest level. The keyboard (and all user input devices - mouse, touchscreen, joystick, whatever) really are things that send "events". For the keyboard an event might consist of a key code, a pressed/released flag, the state of various other keys at the time (control, alt, shift), the state of various toggles at the time (caps lock, num lock, scroll lock), plus an optional unicode codepoint.
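As a rough illustration, such an event could be laid out like this in C. The field names, widths and flag values are assumptions made for the sketch, not something the thread specifies.

```c
#include <stdint.h>

/* Hypothetical modifier bits: keys held down when the event was generated. */
enum {
    KEY_MOD_SHIFT = 1 << 0,
    KEY_MOD_CTRL  = 1 << 1,
    KEY_MOD_ALT   = 1 << 2
};

/* Hypothetical toggle bits: lock states when the event was generated. */
enum {
    KEY_LOCK_CAPS   = 1 << 0,
    KEY_LOCK_NUM    = 1 << 1,
    KEY_LOCK_SCROLL = 1 << 2
};

/* One keyboard event, sent and received as a single fixed-size unit. */
struct key_event {
    uint16_t key_code;     /* layout-independent identifier of the key */
    uint8_t  pressed;      /* 1 = pressed, 0 = released */
    uint8_t  modifiers;    /* KEY_MOD_* bits */
    uint8_t  locks;        /* KEY_LOCK_* bits */
    uint8_t  reserved[3];  /* explicit padding */
    uint32_t codepoint;    /* Unicode codepoint, or 0 if the key has none */
};
```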

For "events", it doesn't make much sense to send a stream of bytes - this just makes it hard to detect where one event ends and the next starts, and increases overhead. You want the event (which is essentially a variable length group of bytes) to be treated as a whole - e.g. atomically sending the whole event, and atomically receiving the whole event, so that it's impossible to receive part of an event. Mostly, you want to use messaging (where an event is a message).

These events/messages go from the keyboard driver to "the code (process?) that allows multiple things to share the same keyboard and screen". They don't go anywhere else. This might be a virtual terminal layer, or it might be a GUI. Whatever it is, it's the root of a hierarchical tree, where user input goes from the root to its children (to their children), and where user output goes from the children's children back to the children back to the root.

Part of the work the "root of the hierarchical tree" does is deciding if the keypress is something it needs to handle (e.g. "alt+F3" in a virtual terminal layer to switch to a different child, "alt+tab" in a GUI to switch to a different child, etc). If it's not something like this, then it just gets forwarded to the current child.
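A small sketch of that decision in C, assuming a virtual terminal layer with four children selected by alt+F1..alt+F4. The key codes, the `MOD_ALT` bit, and the `forwarded` counters (a stand-in for actually sending the event to the child as a message) are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define KEY_F1       0x3B   /* hypothetical key codes; F1..F4 assumed contiguous */
#define KEY_F4       0x3E
#define MOD_ALT      0x04   /* hypothetical modifier bit */
#define NUM_CHILDREN 4

struct key_event {
    uint16_t key_code;
    uint8_t  pressed;
    uint8_t  modifiers;
};

static unsigned current_child;             /* child that currently receives input */
static unsigned forwarded[NUM_CHILDREN];   /* stand-in for each child's input path */

/* Returns true if the root handled the event itself, false if it was forwarded. */
bool root_handle_key(const struct key_event *ev)
{
    if (ev->pressed && (ev->modifiers & MOD_ALT) &&
        ev->key_code >= KEY_F1 && ev->key_code <= KEY_F4) {
        current_child = (unsigned)(ev->key_code - KEY_F1);   /* switch child */
        return true;
    }
    forwarded[current_child]++;   /* a real system would send ev as a message */
    return false;
}
```

Note that in this sketch the matching "key released" event for the function key is still forwarded to the newly selected child; a fuller version would probably swallow it as well.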

Because the virtual terminal layer is historical memorabilia, it also needs to emulate ancient technology. This means converting those nice/useful events into a stream of characters (and discarding a lot of useful information, like all of the "key released" events). It also means converting a stream of characters (stdout) into nice/useful graphics; and buffering the output of "not currently selected" children.
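To make the "events to stream of characters" step concrete, here is a hedged sketch that turns one event into the bytes a child would read from stdin. It assumes the stream is UTF-8 (the thread does not say which encoding is used) and uses a cut-down, hypothetical `key_event`. Releases and keys without a codepoint are simply dropped, which is exactly the information loss described above; a real terminal would emit escape sequences for keys such as the arrows.

```c
#include <stddef.h>
#include <stdint.h>

struct key_event {
    uint16_t key_code;
    uint8_t  pressed;     /* 1 = pressed, 0 = released */
    uint32_t codepoint;   /* 0 if the key produces no character */
};

/* Encode the event's codepoint as UTF-8 into out; returns the byte count (0..4). */
size_t key_event_to_utf8(const struct key_event *ev, unsigned char out[4])
{
    uint32_t cp = ev->codepoint;

    /* Drop releases, keys with no character, and invalid codepoints. */
    if (!ev->pressed || cp == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```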

For buffering the output of "not currently selected" children, you may buffer the original stream or buffer the resulting graphics. If the video mode or font size changes and you've buffered the original stream, then you can regenerate the graphics to suit the new video mode or font size. If you buffer the resulting graphics then you can't. Also note that you do want to provide scroll-back - e.g. if you buffer the resulting graphics, then you might want an 80*1024 character buffer (where there are 25 lines on the screen and 999 additional lines you can scroll through). If you buffer the original stream (e.g. keep the last 64 KiB of data received) you can do the same, and it will cost less memory for the same amount of scroll-back (but it will also be a bit more complex).
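The memory comparison can be checked with a little arithmetic. The helpers below are hypothetical and only illustrate the numbers: an 80*1024 cell buffer costs 80 KiB at one byte per cell (160 KiB if each cell also carries an attribute byte, as in VGA text mode), while a 64 KiB stream buffer holds at least 1024 lines as long as lines average no more than 64 bytes each, newline included.

```c
#include <stddef.h>

/* Bytes needed to keep `lines` rendered rows of `cols` cells each. */
size_t cell_scrollback_bytes(size_t cols, size_t lines, size_t bytes_per_cell)
{
    return cols * lines * bytes_per_cell;
}

/* Largest average line length (in bytes, newline included) at which a raw
 * stream buffer of `stream_bytes` still holds at least `lines` lines. */
size_t stream_break_even_line_len(size_t stream_bytes, size_t lines)
{
    return stream_bytes / lines;
}
```

Whether the stream buffer actually wins therefore depends on typical output: short lines favour it, while output that fills every column (or is heavy on escape sequences) does not.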

The virtual terminal layer needs to communicate with its children (and its children need to communicate with their children, etc). The standard library does whatever it needs to do to make this communication look like pipes; so the communication you actually use can be anything you like. If you've already got messaging anyway, then I'd just use messaging for communication (and emulate pipes in the standard library).

Cheers,

Brendan

Antti · Post by **Antti** » Wed Dec 17, 2014 12:18 am

Brendan wrote:For the keyboard an event might consist of a key code, a pressed/released flag, the state of various other keys at the time (control, alt, shift), the state of various toggles at the time (caps lock, num lock, scroll lock), plus an optional unicode codepoint.

An optional unicode codepoint is something that bothers me. What would be a good way to define the keyboard layout? Is it the keyboard driver that handles it? If so, then it sounds like there is too much policy involved instead of just having a mechanism for sending the keyboard state, e.g. buttons that are pressed/released.

I am not sure about this, hence the questions.

Brendan · Post by **Brendan** » Wed Dec 17, 2014 1:46 am

Hi,

Antti wrote:
Brendan wrote:For the keyboard an event might consist of a key code, a pressed/released flag, the state of various other keys at the time (control, alt, shift), the state of various toggles at the time (caps lock, num lock, scroll lock), plus an optional unicode codepoint.
An optional unicode codepoint is something that bothers me. What would be a good way to define the keyboard layout?

Mostly, by having a file for each keyboard layout, that contains some configuration data (e.g. to indicate which meta-keys do what) and a set of lookup tables (one table for each shift/capslock/numlock combination, that does the key code to unicode conversion).
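A sketch of what the in-memory form of such a layout file might look like, with one table per shift/capslock/numlock combination. The names, the 128-key table size and the state bits are assumptions for illustration; the per-layout configuration data (which meta-keys do what) is left out.

```c
#include <stdint.h>

#define LAYOUT_KEYS   128   /* hypothetical number of key codes per table */
#define LAYOUT_STATES 8     /* 2^3 combinations of the three state bits below */

/* State bits; OR them together to select a table. */
enum {
    LAYOUT_SHIFT = 1 << 0,
    LAYOUT_CAPS  = 1 << 1,
    LAYOUT_NUM   = 1 << 2
};

/* Lookup tables as they might be loaded from a keyboard layout file. */
struct keyboard_layout {
    uint32_t map[LAYOUT_STATES][LAYOUT_KEYS];   /* codepoint, or 0 for none */
};

/* Convert a key code to a Unicode codepoint for the given state (0 if none). */
uint32_t layout_lookup(const struct keyboard_layout *layout,
                       unsigned state, unsigned key_code)
{
    if (state >= LAYOUT_STATES || key_code >= LAYOUT_KEYS)
        return 0;
    return layout->map[state][key_code];
}
```

Because the function only reads a table, it can live in the driver, in the virtual terminal layer, or in a library linked into each application.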

Antti wrote:Is it the keyboard driver that handles it? If so, then it sounds like there is too much policy involved instead of just having a mechanism for sending the keyboard state, e.g. buttons that are pressed/released.

That's up to you - it can be done almost anywhere (maybe even in a library that each application uses).

However, it's possibly worth pointing out that for some languages (Japanese, Chinese, etc) you'll also need an Input Method Editor; which is mostly a graphical tool that people use because the keyboard doesn't have enough keys to support "one key per character" for the language. This might mean that the IME sits between the keyboard driver and the GUI and (while the IME is active/being used - not when playing games, etc) rewrites the key presses before they're sent to the GUI.

Finally, don't forget security. If you're going to have something between the keyboard driver and the GUI (whether it's something that handles keyboard layouts, or an IME, or perhaps something to handle global keyboard macros) then you'd want to worry about malicious keyboard loggers pretending to be one of those things.

Cheers,

Brendan

proxy · Post by **proxy** » Wed Dec 17, 2014 8:32 am

Thanks for the info guys, I'll have to read the links and digest some of these posts, but I think they are adding some clarity.

@Brendan, just as a point of clarity: my keyboard/mouse drivers are exposed to processes via the VFS layer through things like "device:///mouse". While these devices have a byte-oriented interface, they round the number of bytes to read down to a multiple of the event struct size. So for example, I have a struct that looks like this:

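	#include <stdint.h>

	/* One mouse event as returned by reads from "device:///mouse" (4 bytes). */
	struct MouseEvent {
		uint8_t buttons;   /* button state */
		int8_t  changeX;   /* signed movement along X */
		int8_t  changeY;   /* signed movement along Y */
		uint8_t reserved;
	};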

So if you ask to read 9 bytes from the mouse device, you'll get a reply with 2 complete mouse events and a return value of 8. This ensures that consumer processes will always get whole event objects at a time.
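The rounding rule might look roughly like this. It is only a sketch of the behaviour described, not the actual driver code: `mouse_read` and its parameters are hypothetical, and the queue of pending events is passed in as a plain array to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct MouseEvent {
    uint8_t buttons;
    int8_t  changeX;
    int8_t  changeY;
    uint8_t reserved;
};

/* Copy only whole events into buf; returns the number of bytes written,
 * which is always a multiple of sizeof(struct MouseEvent). */
size_t mouse_read(const struct MouseEvent *queued, size_t queued_count,
                  void *buf, size_t len)
{
    size_t n = len / sizeof(struct MouseEvent);   /* round the request down */
    if (n > queued_count)
        n = queued_count;                         /* no more than is pending */
    memcpy(buf, queued, n * sizeof(struct MouseEvent));
    return n * sizeof(struct MouseEvent);
}
```

With three events pending, a 9-byte request returns 8 bytes (two events) and a 3-byte request returns 0.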
