How does vim distinguish the ESC key from an escape sequence?

Schol-R-LEA · Post by **Schol-R-LEA** » Fri Jul 13, 2018 7:06 am

Solar wrote:
Schol-R-LEA wrote:...the base of that list probably ought to be TECO, even though it was not in the direct lineage of ED (especially in terms of the command language). Both Lampson and Deutsch would have been familiar with TECO, both using it and modifying the code for it...
I know the discussion has moved past this point now, but I explicitly limited my history lesson to direct lineage, not influences -- that would quickly have gotten out of hand.

OK, fair enough.

lovelyshell · Post by **lovelyshell** » Fri Jul 13, 2018 7:45 am

iansjack wrote:But couldn't read() return 3 bytes for the <Home> key and only one for the <Esc> key?

If the characters in the stdin buffer come from a person typing on a keyboard, it is safe, because read() can finish before the next keystroke. But sometimes people use a Ctrl-Shift-V paste operation to feed stdin.

EDIT:
I changed one sentence above to red since it does not seem right.
Also, I have an impression: when I typed input into a program that was temporarily unresponsive, the input was echoed as soon as it came back to life. This should be a counterexample.

simeonz · Post by **simeonz** » Fri Jul 13, 2018 7:51 am

iansjack wrote:But couldn't read() return 3 bytes for the <Home> key and only one for the <Esc> key?

Technically, that is probably exactly what it does, and most likely some programs have become dependent on that. In principle, however, terminal emulators (xterm, sshd, ...) open device file pairs (ptys) to communicate with their child processes. They feed the master end and the children consume from the slave end. Between the master and the slave there is a buffer (64K, or 4K, or something like that). This means that if the child process reads slowly, it could get multiple keys at once, and they could be reconstituted erroneously as an escape sequence. Additionally, there is anti-DoS protection in the kernel specifically for ptys, which splits long writes on some smaller boundary. So a long buffer of keys could be split midway, breaking an escape sequence, which means the application cannot rely on a sequence arriving in one block either.

Furthermore, in the case of ssh, an escape key followed by something else may get transmitted as two encrypted TCP segments, which may get reordered on their way to the destination and be delivered at the same time. If sshd takes no special measures to separate the escape code from the codes that follow, nothing prevents them from being interpreted as one escape sequence rather than multiple keys. And the worst part is that, after looking around a bit, I haven't found any indication of special treatment of the escape key in the SSH protocol. Some users do report having fixed related problems by changing the terminal emulation type in PuTTY, so there might be something in the client code that is not immediately obvious.

The situation is different for the so-called virtual terminals, which source the input hardware directly. They are also buffered, but it is unlikely that the client code reads at such a sluggish pace that it cannot keep up with the user's typing speed. If the application does not use a separate input thread, however, I would assume that if it became busy, the data could accumulate and the ambiguity might manifest.

Edit: Right - lovelyshell made a good point with a much simpler example.
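
The buffering effect described in this post can be reproduced without a terminal. The following is a minimal sketch, assuming an ordinary pipe as a stand-in for the pty buffer (a pipe is not a pty, but it shows the same loss of boundaries). The function name `coalesced_read_demo` is made up for illustration. It writes a bare <Esc> and then "[H" as two separate writes, and only afterwards reads once.

```c
#include <unistd.h>

/* Writes ESC and then "[H" as two separate writes into a pipe, lets them
   sit unread, then drains the pipe with a single read().
   Returns the byte count of that read, or -1 on error. */
ssize_t coalesced_read_demo(unsigned char *out, size_t cap)
{
  int fds[2];
  if (pipe(fds) != 0) return -1;

  /* Two independent "keystrokes": a bare <Esc>, then the characters [ and H. */
  if (write(fds[1], "\x1b", 1) != 1 || write(fds[1], "[H", 2) != 2) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  /* The reader was "busy" in between; now it reads once. */
  ssize_t n = read(fds[0], out, cap);
  close(fds[0]);
  close(fds[1]);
  return n;
}
```

With a large enough buffer, the single read() hands back all three bytes together, so the reader sees nothing that marks where one write ended and the next began.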

iansjack · Post by **iansjack** » Fri Jul 13, 2018 8:29 am

I think it unlikely that a user would be pasting an <Esc> keypress.

simeonz · Post by **simeonz** » Fri Jul 13, 2018 10:15 am

iansjack wrote:I think it unlikely that a user would be pasting an <Esc> keypress.

Well. The question still stands then. Is the transmission of <Esc> followed by other keys unambiguous from terminal control and escape sequences when used over ssh (or elsewhere for that matter)?

iansjack · Post by **iansjack** » Fri Jul 13, 2018 10:29 am

I'd say that a user can't type fast enough to confuse individual key strokes with a combination of characters from a single key. (The transmission medium is irrelevant IMO. TCP is stream oriented, so talk of individual packets doesn't apply.)

simeonz · Post by **simeonz** » Fri Jul 13, 2018 10:51 am

iansjack wrote:I'd say that a user can't type fast enough to confuse individual key strokes with a combination of characters from a single key.

If the application consumes its input in the same thread in which it performs other activities, the input can pile up in the kernel buffer while processing takes place. Several keys that were typed independently will be read as one payload. Won't that be a problem?

iansjack wrote:(The transmission medium is irrelevant IMO. TCP is stream oriented, so talk of individual packets doesn't apply.)

Why? TCP is stream oriented, but it allows multiple segments to be in transmission simultaneously. Let's say the first key is sent in one segment and the second key in another. If the IP packet of the first segment is lost in transmission, the segment will be retransmitted later, but will ultimately arrive second. The stream order will be recovered by the TCP layer at the destination, but the time separation won't be. The two segments will now be read as a single payload. (With ssh there is encryption and an application protocol to be considered, but it won't be sufficient to rely on TCP and the user's typing speed.)

iansjack · Post by **iansjack** » Fri Jul 13, 2018 11:14 am

But we're only talking about a single key - Esc or Home. In one case a packet containing a single code will be sent; in the other a packet containing three codes. However poor the connection, there is plenty of time for the packet to be retransmitted and processed long before the user presses another key.

I can't imagine a situation where the connection is so poor that a user could outtype it; in that case I guess you'd have all sorts of problems.

simeonz · Post by **simeonz** » Fri Jul 13, 2018 11:32 am

iansjack wrote:But we're only talking about a single key - Esc or Home.

We are talking about <Esc> [ H typed sequentially (at human typing speed), or the escape sequence the given terminal type uses for <Home>, vs a single <Home> key being typed.

iansjack wrote:However poor the connection there is plenty of time for the packet to be retransmitted and processed long before the user presses another key.

My example emphasizes the speed of retransmission, not successful transmission. This, I think, can be quite slow because, IIRC, TCP does not have a negative acknowledgement mechanism, not to mention that the intermediate nodes operate at the internet layer.

iansjack wrote:I can't imagine a situation where the connection is so poor that a user could outtype it; in that case I guess you'd have all sorts of problems.

True, but it doesn't help to introduce terminal misbehavior. (P.S. You can shut the terminal up and deny the terminal service, but why misbehave.)

Also, in the first example I gave above, with a single-threaded, batch-oriented application, there is no need for connection instability. It will all happen locally unless the program is redesigned to have a separate UI thread.

iansjack · Post by **iansjack** » Fri Jul 13, 2018 11:54 am

Rather than arguing the theory I'll do a few experiments (when I have the time) and see what results I get for the number of characters returned by read() for various keypresses (local and via SSH).

simeonz · Post by **simeonz** » Fri Jul 13, 2018 1:44 pm

iansjack wrote:Rather than arguing the theory I'll do a few experiments (when I have the time) and see what results I get for the number of characters returned by read() for various keypresses (local and via SSH).

I have tried the following experiment:

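#include <stdio.h>
#include <termios.h>
#include <unistd.h>

int main(void)
{
  struct termios term, term_old;

  /* Save the current terminal settings so they can be restored later. */
  if (tcgetattr(STDIN_FILENO, &term_old)) {
    puts("tcgetattr failed");
    return -1;
  }

  term = term_old;

  /* Manual alternative to cfmakeraw(), left disabled: */
  //term.c_lflag &= ~ICANON;
  //term.c_lflag &= ~ECHO;
  //term.c_cc[VMIN] = 0;
  //term.c_cc[VTIME] = 0;

  /* Raw mode: no line buffering, no echo, no input/output translation. */
  cfmakeraw(&term);

  if (tcsetattr(STDIN_FILENO, TCSANOW, &term)) {
    puts("tcsetattr failed");
    return -2;
  }

  /* Simulate a busy application; keys typed now pile up in the tty buffer. */
  sleep(5);

  /* unsigned, so that bytes >= 0x80 still print as two hex digits */
  unsigned char buf[256];
  ssize_t c = read(STDIN_FILENO, buf, sizeof buf);

  /* Restore the terminal before printing: raw mode turns off the
     "\n" to "\r\n" output translation. */
  if (tcsetattr(STDIN_FILENO, TCSANOW, &term_old)) {
    puts("tcsetattr failed");
    return -3;
  }

  for (ssize_t i = 0; i < c; ++i) { printf("%02x\n", buf[i]); }

  /* The exit status reports how many bytes the single read() returned. */
  return (int)c;
}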

This is on VirtualBox, using PuTTY and the VM's terminal (identical output). When I type "<Esc>[1~" and then <Home> in quick succession, I get this:

# ./some; echo $?
1b
5b
31
7e
1b
5b
31
7e
8

Therefore, if the application is busy for 5s and then reads the input, it will not be able to distinguish manually typed characters from the escape sequence. The 8 at the end is the number of characters. I am reading the terminal in raw mode, because that is what a TUI application would do.
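
To make the ambiguity concrete, here is a small sketch of a parser that has only the bytes to go on and no timing information. The helper name `count_home_sequences` is hypothetical, and the sequence ESC [ 1 ~ is the one from the output above (other terminal types may send something else for <Home>).

```c
#include <stddef.h>
#include <string.h>

/* Counts non-overlapping occurrences of the byte sequence ESC [ 1 ~ in buf.
   The parser sees bytes only; it has no idea how they were produced. */
size_t count_home_sequences(const unsigned char *buf, size_t n)
{
  static const unsigned char home[] = { 0x1b, '[', '1', '~' };
  size_t count = 0;
  size_t i = 0;

  while (i + sizeof home <= n) {
    if (memcmp(buf + i, home, sizeof home) == 0) {
      ++count;
      i += sizeof home;   /* skip past the matched sequence */
    } else {
      ++i;
    }
  }
  return count;
}
```

Fed the eight bytes from the experiment, such a parser reports two <Home> keys, although only one was pressed; the first four bytes were typed by hand.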

lovelyshell · Post by **lovelyshell** » Fri Jul 13, 2018 10:13 pm

iansjack wrote:I think it unlikely that a user would be pasting an <Esc> keypress.

Yes. But network congestion (and so on) or a busy program can also cause keystrokes to be combined. In the former case, they are combined by the server before entering the stdin buffer. In the latter, each key is sent by the keyboard driver to the stdin buffer as soon as it is pressed, but the program is too busy to call read(0, x, x), so the keys accumulate in the stdin buffer. From the program's point of view, this is no different from receiving a Ctrl-Shift-V paste.

I think the program shown by @simeonz was designed to test a situation similar to the latter case above.

Korona · Post by **Korona** » Sun Jul 15, 2018 1:41 am

The question was about vim: As there is no way to distinguish a fast enough ESC+x from an escape sequence vim indeed uses timing information to figure this out. That is documented and there are options to control it: esckeys, timeout, timeoutlen.
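
A timing heuristic of this kind could look roughly like the sketch below. This is not vim's actual code; it is a minimal illustration, assuming poll() is used for the wait. The names `read_key`, `key_kind` and the `timeout_ms` parameter are made up for the example; `timeout_ms` plays a role loosely comparable to the timeout options mentioned above.

```c
#include <poll.h>
#include <unistd.h>

enum key_kind { KEY_ERROR = -1, KEY_CHAR, KEY_ESC_ALONE, KEY_ESC_SEQUENCE };

/* Reads one key from fd. After an ESC byte, waits up to timeout_ms for
   more input: if nothing arrives, the ESC is treated as a lone keypress;
   otherwise the bytes that follow are treated as the rest of an escape
   sequence. seq receives the raw bytes and *len their count. */
enum key_kind read_key(int fd, int timeout_ms,
                       unsigned char *seq, size_t cap, size_t *len)
{
  *len = 0;
  if (cap < 2 || read(fd, seq, 1) != 1) return KEY_ERROR;
  *len = 1;
  if (seq[0] != 0x1b) return KEY_CHAR;

  struct pollfd pfd = { .fd = fd, .events = POLLIN };
  int ready = poll(&pfd, 1, timeout_ms);
  if (ready < 0) return KEY_ERROR;
  if (ready == 0 || !(pfd.revents & POLLIN)) return KEY_ESC_ALONE; /* timed out */

  /* Something followed quickly: take what is available as the sequence
     body. A real parser would match it against the terminal's known
     sequences instead of accepting anything. */
  ssize_t n = read(fd, seq + 1, cap - 1);
  if (n <= 0) return KEY_ERROR;
  *len += (size_t)n;
  return KEY_ESC_SEQUENCE;
}
```

The weakness discussed earlier in the thread still applies: if the bytes of separately typed keys are already sitting in the buffer, poll() returns immediately and the sketch classifies them as one sequence.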

simeonz · Post by **simeonz** » Sun Jul 15, 2018 3:03 am

Korona wrote:The question was about vim: As there is no way to distinguish a fast enough ESC+x from an escape sequence vim indeed uses timing information to figure this out. That is documented and there are options to control it: esckeys, timeout, timeoutlen.

This is a concise answer to most of the original question. There was also a sub-question, however, which I am interested in as well:

lovelyshell wrote:I don't know how vim avoids such problem, and do you have a better approach?

And then the discussion carried on to network congestion effects interacting with ssh, etc.

From Korona's clarification and what we know, I think it is safe to say that console applications accessing terminal character devices are not supposed to think of the keyboard as their input. They can apply heuristics like vim, but IIUC, the reliable behavior is to think of the pseudo-terminal device as a classic terminal that somehow gets emulated using the modern hardware. (P.S. In that sense, the <Esc> key can be considered a key specifically intended for escape sequences.)

Apparently there are two ways to deal with the keyboard in a precise way. One is to read the input device in devfs (as demonstrated by the code linked here), and the other is to create an X Window client (which uses the same approach, but does not require root privileges).

Edit: Obviously, you wouldn't want to deploy X apps everywhere, but the alternatives are to demand root privileges or to use a timing heuristic on the terminal device. (The last may be the conventional technique for simple TUI-style apps, but I think it is pretty much a hack.)
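
The first option, reading the input device directly, can be sketched as follows on Linux. This is only an illustration under assumptions: it uses the evdev interface from `<linux/input.h>`, the helper names `is_esc_press` and `wait_for_esc` are made up, and the device path is whatever event node belongs to the keyboard on a given machine (opening it typically needs elevated privileges, as noted above).

```c
#include <fcntl.h>
#include <linux/input.h>
#include <unistd.h>

/* Returns 1 if ev is a press (not a release or autorepeat) of the
   physical Esc key, 0 otherwise. */
int is_esc_press(const struct input_event *ev)
{
  return ev->type == EV_KEY && ev->code == KEY_ESC && ev->value == 1;
}

/* Blocks until Esc is pressed on the given event device.
   Returns 0 once Esc is seen, -1 on error or end of input. */
int wait_for_esc(const char *device_path)
{
  int fd = open(device_path, O_RDONLY);
  if (fd < 0) return -1;

  struct input_event ev;
  int result = -1;
  while (read(fd, &ev, sizeof ev) == (ssize_t)sizeof ev) {
    if (is_esc_press(&ev)) {
      result = 0;
      break;
    }
  }
  close(fd);
  return result;
}
```

Because each event carries a key code rather than a byte stream, the Esc key and the <Home> key arrive as different codes and no timing guess is needed.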

Sik · Post by **Sik** » Thu Aug 30, 2018 4:01 am

I honestly hate how terminals handle keystrokes for anything that isn't a normal character. I get it about old terminals, but how come none of the extensions over time ever fixed this mess? >:|

Anyway, the reason why I'm bumping this thread: for those who want a way to provide the Esc key in their own OSes without running into that clash, I noticed there's the CAN control character (ASCII 0x18) which has pretty much the modern meaning of the Esc key (to cancel the operation that was being currently entered), so honestly that'd be a pretty good substitute for your own designs. Shame that it won't work with existing stuff out there :/ (aside maybe those that can take remapping?)

And there's my minirant for today.
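
For a custom design along these lines, the input decoder needs no timeout at all. A minimal sketch, assuming a hypothetical scheme in which the Esc key sends CAN (0x18) and ESC (0x1b) is only ever sent as the start of a sequence; the names `input_kind` and `classify_byte` are invented for the example.

```c
enum input_kind { IN_CHAR, IN_CANCEL, IN_SEQUENCE_START };

/* Classifies a single input byte under the hypothetical scheme:
   CAN is the user's Esc key, ESC always introduces a sequence. */
enum input_kind classify_byte(unsigned char b)
{
  if (b == 0x18) return IN_CANCEL;          /* CAN: the Esc key */
  if (b == 0x1b) return IN_SEQUENCE_START;  /* ESC: never a key by itself */
  return IN_CHAR;
}
```

The decision depends on the byte value alone, so it stays correct however the input was buffered or delayed.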

OSDev.org
