ttgetc() used some homebrew utf8 to unicode translation, limited to just
the normal latin1 characters. Use the utf8 helper functions to get it
right for the more complex cases.
NOTE! We don't actually handle characters > 0xff right anyway. And we
still end up doing Latin1 in the buffers on input. One small step at a
time.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
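
To illustrate the kind of utf8-to-unicode translation this commit refers to, here is a minimal sketch of a decoder. The name decode_utf8() and its signature are hypothetical, not the helper actually used by ttgetc(). It assumes malformed or truncated input should fall back to a single raw byte (that is, Latin1), and it does not reject overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the actual uemacs function: decode one UTF-8
 * sequence from buf[0..len) into *res and return the number of bytes used.
 * Malformed or truncated input is passed through as a single Latin1 byte.
 */
size_t decode_utf8(const unsigned char *buf, size_t len, uint32_t *res)
{
	unsigned char c;
	size_t need, i;
	uint32_t value;

	if (len == 0)
		return 0;

	c = buf[0];
	*res = c;

	/* Plain ASCII, or a stray continuation byte: take the byte as-is. */
	if (c < 0xc0)
		return 1;

	/* The lead byte's high bits give the sequence length. */
	if (c < 0xe0) {
		need = 2;
		value = c & 0x1f;
	} else if (c < 0xf0) {
		need = 3;
		value = c & 0x0f;
	} else if (c < 0xf8) {
		need = 4;
		value = c & 0x07;
	} else {
		return 1;	/* 0xf8..0xff never start a valid sequence */
	}

	if (need > len)
		return 1;	/* truncated: fall back to the raw byte */

	for (i = 1; i < need; i++) {
		if ((buf[i] & 0xc0) != 0x80)
			return 1;	/* not a continuation byte */
		value = (value << 6) | (buf[i] & 0x3f);
	}

	*res = value;
	return need;
}
```

A caller would advance its input pointer by the returned byte count. As the NOTE in the commit says, anything decoded above 0xff would still not be handled correctly further down the line.
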
Let's just plan on being fully utf8 some day. We're not there yet, and
maybe we'll never be, but having the halfway mode is not useful either.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is disgusting. And quite frankly, it's debatable whether this will
ever work. The "line" structure is still just an array of characters,
so that has to work with utf-8.
But the 'struct video' thing is what represents the actual screen
rectangle, and is fixed-size by the size of the screen. So making it
contain actual 32-bit unicode characters *may* make sense.
Right now we translate things the same way we always used to, though, so
utf-8 in 'struct line' will not be translated to the proper unicode
array, but to the bytes of the utf-8 representation. So this really
doesn't improve anything per se yet, just expands the memory use of the
video array.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
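
The layout described in this commit can be sketched roughly as follows. This is a simplified, hypothetical stand-in for 'struct video': the names unicode_t, video_row, video_row_alloc() and video_row_fill() are assumptions made for illustration, and the real structure carries more fields. The fill function deliberately copies byte by byte, to show why widening the cells alone does not yet improve anything.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t unicode_t;	/* assumed name for a 32-bit screen cell */

/* Hypothetical, simplified stand-in for the editor's 'struct video'. */
struct video_row {
	int ncol;		/* number of cells, fixed by screen width */
	unicode_t text[];	/* one 32-bit cell per screen column */
};

struct video_row *video_row_alloc(int ncol)
{
	struct video_row *row;
	int i;

	row = malloc(sizeof(*row) + (size_t)ncol * sizeof(unicode_t));
	if (!row)
		return NULL;
	row->ncol = ncol;
	for (i = 0; i < ncol; i++)
		row->text[i] = ' ';
	return row;
}

/*
 * Copy line bytes into the row the way the commit describes the current
 * state: each byte lands in its own cell, so a multi-byte UTF-8 sequence
 * occupies several cells instead of one.
 */
int video_row_fill(struct video_row *row, const char *line, int len)
{
	int i;

	for (i = 0; i < len && i < row->ncol; i++)
		row->text[i] = (unsigned char)line[i];
	return i;
}
```

With this fill routine, the two UTF-8 bytes of a single accented character end up in two separate cells, which is exactly the limitation the commit message points out.
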
These two constants are only needed/used by the posix.c file,
so just define them there.
Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A return statement is not a function, so remove the superfluous parentheses.
Cc: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This avoids the annoying behavior where we're on the command line,
waiting for an ESC, and any control character sequence ends up finishing
the command line and eating the first ESC.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NOTE! MicroEmacs is very much a byte-based editor, and the new utf-8
support is purely an issue of terminal input and output. The file
contents themselves are in the 8-bit space. In that space, Unicode is
the same as Latin1.
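
Because the buffer bytes coincide with Unicode code points 0x00..0xff, writing them to a UTF-8 terminal needs at most two output bytes per character. A minimal sketch of that output step, assuming a hypothetical helper named latin1_to_utf8() (not a function from the editor):

```c
#include <stddef.h>

/*
 * Hypothetical output helper: encode one buffer byte (Latin1, which matches
 * Unicode code points 0x00..0xff) as UTF-8 into out[], returning the number
 * of bytes written (1 or 2). out must have room for 2 bytes.
 */
size_t latin1_to_utf8(unsigned char c, unsigned char *out)
{
	if (c < 0x80) {
		out[0] = c;		/* ASCII is unchanged */
		return 1;
	}
	out[0] = 0xc0 | (c >> 6);	/* lead byte: 110000xx */
	out[1] = 0x80 | (c & 0x3f);	/* continuation: 10xxxxxx */
	return 2;
}
```
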
The new mode is called "utf-8", and is enabled automatically by the
new emacs.rc when $LANG contains the substring "UTF-8".
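
The check itself lives in emacs.rc, in the editor's macro language. As a rough C analogue of the same substring test, assuming hypothetical helper names lang_is_utf8() and env_wants_utf8():

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: does a locale string name a UTF-8 locale? */
int lang_is_utf8(const char *lang)
{
	return lang != NULL && strstr(lang, "UTF-8") != NULL;
}

/* Check the process environment, as a C analogue of the emacs.rc test. */
int env_wants_utf8(void)
{
	return lang_is_utf8(getenv("LANG"));
}
```

Note that a plain substring match is case-sensitive, so a value such as "en_US.utf8" would not enable the mode under this sketch.
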
I'm sure people would like to some day also edit real UTF-8 contents,
rather than just edit old 8-bit Latin1 contents in a UTF-8 terminal.
However, that's an independent (and much bigger and thornier) issue.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a slightly updated version of uemacs-PK (PK is Pekka
Kutvonen) which was used at Helsinki University a long time
ago. My fingers cannot be retrained.