mirror of https://github.com/rfivet/uemacs.git synced 2024-11-02 19:37:19 -04:00
Commit Graph

301 Commits

Author SHA1 Message Date
73c372fc7f add file header. 2014-12-22 14:46:05 +08:00
93f2a6d691 clean up line dependencies. 2014-12-22 14:45:55 +08:00
4bba6e7417 refactor main and basic out of efunc. 2014-12-22 14:45:46 +08:00
0e9fc236f9 display depends on window instead of efunc. 2014-12-22 14:45:37 +08:00
08a3aa81e1 crypt depends on display and input instead of efunc. 2014-12-22 14:45:26 +08:00
86d5b10fa9 fileio depends on display instead of efunc. 2014-12-22 14:45:16 +08:00
f6780cb71b remove crypt from efunc, update dependencies. 2014-12-22 14:45:06 +08:00
fa56e5dfff remove fileio from efunc, update dependencies. 2014-12-22 14:44:58 +08:00
a65f7ca38c read files in text mode.
review fileio prototypes.
2014-12-22 14:44:49 +08:00
d9bb0ea262 refactor epath into eval. 2014-12-22 14:44:35 +08:00
e86bdad4fc refactor epath into bind and util into eval. 2014-12-22 14:44:26 +08:00
886402ccad update file dependencies towards util. 2014-12-22 14:44:16 +08:00
4958b7d2af use constant strings for pathnames. 2014-12-22 14:44:06 +08:00
13f4a7cefd usage has been removed. should have been part of commit #776bd25 2014-12-22 14:43:57 +08:00
c9a59faf42 usage obsolete as refactored into wrapper. 2014-12-22 14:43:36 +08:00
86afdef45e refactor handling of version and program name strings. 2014-12-22 14:43:23 +08:00
9f909644e9 rename program from 'em' to 'ue'. 2014-12-22 14:14:10 +08:00
34615aae05 first step for constant version strings. 2014-12-22 14:13:54 +08:00
8ef70b86fb revert CYGWIN to termio for compatibility with console window 2014-12-22 14:13:29 +08:00
2e2d684697 enable ^S in posix 2014-12-22 14:13:11 +08:00
646fbbc4f6 remove need for usage 2014-12-22 14:12:55 +08:00
45a6523572 use posix (termios) with Cygwin 2014-12-22 14:12:41 +08:00
052f7ff956 don't compile ansi, ibmpc, vmsvt, vt52 2014-12-22 14:12:27 +08:00
4b53c4887b add ncurses directory for Cygwin 2014-12-22 14:12:15 +08:00
a1124441f7 fix compilation warning 2014-12-22 14:06:30 +08:00
02823cb59d remove unnecessary include 2014-12-22 14:06:06 +08:00
c961759288 rework version and help printing
em --help now returns EXIT_SUCCESS
2014-12-22 14:05:53 +08:00
f3ce8236af update file dependencies: usage, wrapper, version 2014-12-22 14:05:40 +08:00
68a79430e6 cleanup usage and wrapper 2014-12-22 14:04:17 +08:00
U-Renaud-PC\Renaud
128354e657 Adaptation to Cygwin32 2014-12-22 14:03:11 +08:00
Linus Torvalds
fa00fe882f Stop using 'short' for line and allocation sizes
Yes, yes, it probably made sense 30 years ago as a way to save a tiny
amount of memory, but especially when interspersed in structures that
have pointers (aligned to 64 bits these days), it's not even saving
memory today.  And it makes us fail in nasty ways when looking at files
with long lines.

So just make them 'int'.  And if you have a line that is longer than
2GB, you only have yourself to blame.  I no longer care.

In case anybody cares, the "test-case" for this was a lovely UDDF file
with a binary divecomputer dump encoded as an XML element.  Resulting in
a lovely 41kB single line.  Not what poor micro-emacs was designed for,
I'm afraid.

I really should just learn another editor, rather than continue to
polish this turd.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-22 14:32:16 -08:00
Linus Torvalds
25f0141df1 Avoid memory access errors if llength() overflows
llength() is currently a 'short' which can overflow and result in signed
numbers if line lengths are larger than 32k.  We'll fix the overflow
separately, but before we do that, just use a signed int to hold the
value so that we don't overrun memory allocations when we converted that
negative number to a large positive unsigned integer.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-22 14:29:43 -08:00
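The overflow described in this commit can be reproduced in isolation. The following is only a sketch, not the editor's code: `length_via_short` and `safe_alloc_size` are hypothetical helpers, and the 16-bit wraparound relies on the implementation-defined narrowing that common compilers perform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a line length kept in a 16-bit field. */
static int length_via_short(int real_length)
{
	/* Out-of-range narrowing is implementation-defined; common
	 * compilers wrap, so lengths above 32767 come back negative. */
	return (int16_t)real_length;
}

/* Hypothetical guard: holding the length in a signed int lets us
 * notice the negative value before it is turned into a size. */
static size_t safe_alloc_size(int length)
{
	if (length < 0)
		return 0;	/* refuse instead of requesting a huge block */
	return (size_t)length;
}
```

Without the guard, casting the negative length straight to `size_t` yields a very large positive number, which is the failure mode the message describes.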
Linus Torvalds
8899ed4e1f Fix the unicode character limit (0 .. 0x10ffff)
For some reason I had limited things to 0xffff, it really should be 0x10ffff.

We don't actually support a full 32-bit unicode model anyway, since we
use the high bits for the control/meta/^X/special bits, but there was no
reason to limit things to 16 bits when we had 28 bits available.  And
the real limit for real Unicode characters is 0x10ffff.

Add a silly example character past the 16-bit range to the UTF8 demo
file:
  'SMILING FACE WITH HALO' (U+1F607)
from the 'emoticons' block.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-24 19:44:21 -07:00
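To illustrate the 0x10ffff limit, here is a minimal UTF-8 encoder sketch. The function name `utf8_encode_sketch` is an assumption made for this example and is not taken from the repository; surrogate code points are not filtered, to keep it short.

```c
/* Hypothetical encoder: writes the UTF-8 form of code point c into out
 * and returns the number of bytes, or 0 if c is above 0x10ffff. */
static int utf8_encode_sketch(unsigned int c, unsigned char out[4])
{
	if (c > 0x10FFFF)
		return 0;		/* beyond the Unicode range */
	if (c < 0x80) {
		out[0] = (unsigned char)c;
		return 1;
	}
	if (c < 0x800) {
		out[0] = (unsigned char)(0xC0 | (c >> 6));
		out[1] = (unsigned char)(0x80 | (c & 0x3F));
		return 2;
	}
	if (c < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (c >> 12));
		out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (c & 0x3F));
		return 3;
	}
	out[0] = (unsigned char)(0xF0 | (c >> 18));
	out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
	out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
	out[3] = (unsigned char)(0x80 | (c & 0x3F));
	return 4;
}
```

With this sketch, U+1F607 needs four bytes, which is why a 0xffff limit could not represent it.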
Pekka Enberg
dbf1a014a7 uemacs: Remove unused 'lflag' variables from file.c
GCC spotted the following unused variable:

    CC       file.o
  file.c: In function ‘readin’:
  file.c:225:6: warning: variable ‘lflag’ set but not used [-Wunused-but-set-variable]
  file.c: In function ‘ifile’:
  file.c:553:6: warning: variable ‘lflag’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-16 11:30:06 -07:00
Linus Torvalds
ddd45dbff1 Fix 'getccol()' and 'getgoal()' functions for multibyte UTF-8 characters
These functions convert the byte offset into the column number
(getccol()) and vice versa (getgoal()).

Getting this right means that moving up and down the text gets us the
right columns, rather than moving randomly left and right when you move
up and down.  We also won't end up in the middle of a utf-8 character,
because we're not just moving into some random byte offset, we're moving
into a proper column.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-15 14:36:38 -07:00
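The byte-offset/column mapping can be sketched as below. This is a simplified illustration, not the actual getccol()/getgoal() code: the helper names are hypothetical, every character is assumed to occupy one column, and tabs and control characters are ignored.

```c
/* UTF-8 continuation bytes have the bit pattern 10xxxxxx. */
static int is_cont_byte(unsigned char b)
{
	return (b & 0xC0) == 0x80;
}

/* Hypothetical analogue of getccol(): column of a byte offset,
 * counting one column per character start before that offset. */
static int offset_to_column(const char *line, int len, int offset)
{
	int col = 0;
	for (int i = 0; i < offset && i < len; i++)
		if (!is_cont_byte((unsigned char)line[i]))
			col++;
	return col;
}

/* Hypothetical analogue of getgoal(): byte offset of a goal column,
 * always landing on the first byte of a character (or at line end). */
static int column_to_offset(const char *line, int len, int goal)
{
	int col = 0, i = 0;
	while (i < len && col < goal) {
		i++;
		while (i < len && is_cont_byte((unsigned char)line[i]))
			i++;	/* skip the rest of a multibyte character */
		col++;
	}
	return i;
}
```

Because `column_to_offset` only stops at character starts, moving to a column never leaves the cursor in the middle of a multibyte sequence.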
Linus Torvalds
1edeced67c Fix vtputc() and simplify show_line by using it again
This re-introduces vtputc() as the way to show characters, which
reinstates the control character handling, and simplifies show_line() in
the process.

vtputc now takes an "int" that is either a unicode character or a signed
char (so negative values in the range [-1, -128] are considered to be
the same as [128, 255]).  This allows us to use it regardless of what
the source of data is.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-11 11:23:32 -07:00
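The argument convention described for vtputc() could look roughly like this. The helper name `normalize_display_char` is made up for the example; it only shows how negative signed-char values map onto 128..255 while other values pass through.

```c
/* Hypothetical normalisation step: values -128..-1 (a signed char
 * holding a high byte) are treated the same as 128..255; anything
 * non-negative is taken to be a Unicode code point already. */
static unsigned int normalize_display_char(int c)
{
	if (c < 0)
		c += 256;
	return (unsigned int)c;
}
```

A caller reading from a plain `char` buffer and a caller holding a decoded code point can then share the same display routine.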
Linus Torvalds
0a8b429059 Start doing character removal properly
This makes actual basic editing work.  Including things like
justify-paragraph etc, so lines get justified by number of UTF8
characters rather than bytes.

There are probably tons of broken stuff left, but this actually seems to
get the basics working right.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-11 10:43:16 -07:00
Linus Torvalds
0e9fc2be15 Start actually inserting full utf8 sequences
This makes it possible to cut-and-paste the UTF8 testfile into a new
buffer, and the end result looks correct.

NOTE! We still do various things wrong while editing.  For example,
while the cursor movements were fixed, simple things like deleting a
character still work on single bytes, rather than utf8 characters.

So while this is getting much closer to actually editing UTF-8 data,
it's not there yet.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-11 02:21:36 -07:00
Linus Torvalds
4bccfab632 Make 'show_line()' do proper TAB handling
The TAB handling got broken by commit cee00b0efb ("Show UTF-8 input as
UTF-8 output") when it stopped doing things one byte at a time.

I'm sure the other special character cases are broken too.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-11 01:22:32 -07:00
Linus Torvalds
42d9cb33a5 Expand keycode to 'int' from 'short'
This uses the four high bits for the meta and control key sequences.
This means that we will be limiting our Unicode space to 28 bits, but
that's more than we really need.

It *would* be nicer if we just used the sign bit to mark "we have meta
character information" but that would require bigger changes.  And we
really don't need to worry about 30-bit unicode.  Small steps, remember.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 18:14:02 -07:00
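A sketch of a key code that keeps modifier flags in the four high bits of a 32-bit value, leaving 28 bits for the character. The macro names and exact bit assignments below are assumptions for illustration and may not match the editor's headers.

```c
/* Hypothetical flag layout: four high bits for modifiers. */
#define KEY_CONTROL   0x10000000u
#define KEY_META      0x20000000u
#define KEY_CTLX      0x40000000u
#define KEY_SPEC      0x80000000u
#define KEY_CHAR_MASK 0x0FFFFFFFu	/* 28 bits left for the character */

/* Combine modifier flags with a character value. */
static unsigned int make_key(unsigned int flags, unsigned int ch)
{
	return flags | (ch & KEY_CHAR_MASK);
}

/* Extract the character part of a key code. */
static unsigned int key_char(unsigned int key)
{
	return key & KEY_CHAR_MASK;
}

/* Extract the modifier part of a key code. */
static unsigned int key_flags(unsigned int key)
{
	return key & ~KEY_CHAR_MASK;
}
```

Since the largest real code point is 0x10ffff, the 28-bit character field has ample room.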
Linus Torvalds
6e4a45c005 character input: make sure we have enough bytes for a full utf8 character
.. but we do have that 0.1s delay, so if somebody feeds us non-utf8
sequences, we won't delay forever.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 18:11:30 -07:00
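One way to decide whether enough bytes have arrived is to derive the expected sequence length from the lead byte. The functions below are a hypothetical sketch; the 0.1s timeout mentioned in the message is not modelled here.

```c
/* Expected length of a UTF-8 sequence, judged from its first byte.
 * Invalid lead bytes are treated as single bytes so that input
 * never stalls waiting for more data. */
static int utf8_expected_length(unsigned char lead)
{
	if (lead < 0x80)
		return 1;
	if ((lead & 0xE0) == 0xC0)
		return 2;
	if ((lead & 0xF0) == 0xE0)
		return 3;
	if ((lead & 0xF8) == 0xF0)
		return 4;
	return 1;
}

/* Hypothetical check: do the pending bytes hold one full character? */
static int have_full_char(const unsigned char *buf, int pending)
{
	return pending > 0 && pending >= utf8_expected_length(buf[0]);
}
```

A reader loop would keep collecting bytes while `have_full_char` is false and give up once the timeout expires.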
Linus Torvalds
3abd3dba42 utf8: make sure to honor the array length properly
Right now the input side can give partial utf8 input, and that showed
that we didn't properly handle that case.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 17:56:53 -07:00
Linus Torvalds
3c7bd9a7d2 Make kbd macro save area use 'int' instead of short
I'm starting to expand the input value from 'short' (with flags in the
upper eight bits) to 'int' (with negative values having flags).

Small baby steps.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 17:42:19 -07:00
Linus Torvalds
ec6f4f36ec Use utf8 helper functions for keyboard input
ttgetc() used some homebrew utf8 to unicode translation, limited to just
the normal latin1 characters.  Use the utf8 helper functions to get it
right for the more complex cases.

NOTE! We don't actually handle characters > 0xff right anyway.  And we
still end up doing Latin1 in the buffers on input.  One small step at a
time.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 17:36:30 -07:00
Linus Torvalds
6b793211c2 Make cursor movement (largely) understand UTF-8 character boundaries
Ok, so it may do odd things if it's not truly utf-8, and when moving up
and down lines that have utf-8 the cursor moves oddly (because the byte
offset within the line stays constant, rather than the character
offset), but with this you can actually open the UTF8 example file and
move around it, and at least some of the movement makes sense.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 16:40:36 -07:00
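Moving the cursor by whole characters rather than bytes can be sketched as follows. The helper names are hypothetical and the input is assumed to be well-formed UTF-8; as the message notes, odd things may happen otherwise.

```c
/* UTF-8 continuation bytes have the bit pattern 10xxxxxx. */
static int is_utf8_continuation(unsigned char b)
{
	return (b & 0xC0) == 0x80;
}

/* Byte offset of the character after the one starting at off. */
static int next_char_offset(const char *line, int len, int off)
{
	if (off >= len)
		return len;
	off++;
	while (off < len && is_utf8_continuation((unsigned char)line[off]))
		off++;
	return off;
}

/* Byte offset of the character before the one starting at off. */
static int prev_char_offset(const char *line, int off)
{
	if (off <= 0)
		return 0;
	off--;
	while (off > 0 && is_utf8_continuation((unsigned char)line[off]))
		off--;
	return off;
}
```

These helpers keep the cursor on character boundaries within a line; they do nothing about the vertical-movement problem, which needs a column-based approach instead of a constant byte offset.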
Linus Torvalds
e62cdf04cf Split up the utf8 helper functions into a file of their own
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 16:21:35 -07:00
Linus Torvalds
12e4647deb Remove the old utf8_mode thing.
Let's just plan on being fully utf8 some day.  We're not there yet, and
maybe we'll never be, but having the halfway mode is not useful either.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 15:09:31 -07:00
Linus Torvalds
cee00b0efb Show UTF-8 input as UTF-8 output
.. by doing the stupid "convert to unicode value and back" model.

This actually populates the 'struct video' array with the unicode
values, so UTF8 input actually shows correctly.  In particular, the nice
test-file (UTF-8-demo.txt) shows up not as garbage, but as the UTF-8 it
is.

HOWEVER!

Since the *editing* doesn't know about UTF-8, and considers it just a
stream of bytes, the end result is not actually a usable utf-8 editor.
So don't get too excited yet: this is just a partial step to "actually
edit utf8 data"

NOTE NOTE NOTE! If the character buffer contains Latin1, we will
transform that Latin1 to unicode, and then output it as UTF8.  And we
will edit it correctly as the character-by-character data.  Also, we
still do the "UTF8 to Latin1" translation on *input*, so with this
commit we can actually continue to *edit* Latin1 text.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 15:08:17 -07:00
Linus Torvalds
e8f984a1b0 Make the 'struct video' contain an array of unicode characters rather than bytes
This is disgusting.  And quite frankly, it's debatable whether this will
ever work.  The "line" structure is still just an array of characters,
so that has to work with utf-8.

But the 'struct video' thing is what represents the actual screen
rectangle, and is fixed-size by the size of the screen.  So making it
contain actual 32-bit unicode characters *may* make sense.

Right now we translate things the same way we always used to, though, so
utf-8 in 'struct line' will not be translated to the proper unicode
array, but to the bytes of the utf-8 representation.  So this really
doesn't improve anything per se yet, just expands the memory use of the
video array.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 14:24:23 -07:00
Linus Torvalds
2dddd4f970 Show lines with a single helper function, not one byte at a time
Let's see how hard it is to show UTF-8 characters properly.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-10 13:38:41 -07:00