Add a '$EXTERN NAME(ARGTYPES) AS RET pragma so compiled BASIC can call C
functions directly, the natural follow-up to Level 1 (--emit-obj /
--main-name). The pragma is an apostrophe comment, so the interpreter
ignores it while the compiler registers it.
Map INTEGER/SINGLE/DOUBLE/STRING to int16_t/float/double/const char* at the
boundary: a string argument crosses as a temporary C copy that is freed
after the call, and a string return is copied into the pool. The call name
is matched case-insensitively but emitted as the C symbol with the case
written in the pragma. Names are recognized before parse_var() truncates
identifiers to two significant characters, so multi-character C function
names work.
A string return that aliases a char* argument is copied before the argument
temporaries are freed, which avoids a use-after-free. Over-supplied
arguments are consumed without desyncing the token stream, and an arity
mismatch produces a warning.
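The string-return ordering described above can be sketched in C roughly as
follows. This is a minimal illustration, not the project's code:
call_extern_str, pool_copy and skip_spaces are hypothetical names, and malloc
stands in for the string pool. The point is only the order of operations:
copy the result first, free the argument temporary second.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for copying a C string into the BASIC string pool. */
static char *pool_copy(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Example callee: returns a pointer INTO its argument (an aliasing return). */
const char *skip_spaces(const char *s)
{
    while (*s == ' ')
        s++;
    return s;
}

/* Call a one-argument string function on a BASIC string of `len` bytes.
 * Returns a heap copy of the result (caller frees), or NULL on allocation
 * failure. */
char *call_extern_str(const char *(*fn)(const char *),
                      const char *basic_str, size_t len)
{
    assert(fn != NULL && basic_str != NULL);

    /* The argument crosses the boundary as a temporary NUL-terminated copy. */
    char *tmp = malloc(len + 1);
    if (tmp == NULL)
        return NULL;
    memcpy(tmp, basic_str, len);
    tmp[len] = '\0';

    const char *ret = fn(tmp);

    /* Copy the return value BEFORE freeing tmp: ret may point into tmp. */
    char *result = pool_copy(ret != NULL ? ret : "");

    free(tmp);
    return result;
}
```

Swapping the last two steps (free, then copy) would read freed memory whenever
the callee returns a pointer derived from its argument, as skip_spaces does.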
Docs: getting-started.md "Foreign Functions from BASIC". Test:
tests/run_ffi_test.sh, wired into CI. 63/63 compiler, 72/72 interpreter,
68/68 compat still pass.
Also refile the roadmap "Next Up" backlog as git-bug issues and prune
docs/roadmap.md to point at git-bug as the source of truth for planned work.
Co-authored-by: Eremey Valetov <evvaletov@users.noreply.github.com>
Level 1 of the cross-language linking roadmap entry: produce an
object file with a renamed entry point so a BASIC program can be
linked into a larger C or Fortran build.
- src/compiler_main.c: --emit-obj runs gcc -c (compile-only,
produces prog.o) and skips the runtime link. --main-name NAME
(or --main-name=NAME) is plumbed through codegen_opts_t.
- src/codegen.c: emit `int <name>(int argc, char **argv)` instead
of always emitting `main`. Default unchanged when --main-name
isn't specified.
- include/codegen.h: add main_name to codegen_opts_t.
- docs/getting-started.md: new "Cross-Language Linking" section
with C and Fortran (iso_c_binding) driver examples.
- docs/roadmap.md: three levels of cross-language linking, with
Level 1 marked done, Level 2 (BASIC-side EXTERN declarations)
as the next concrete step, Level 3 (BASIC SUBs as C functions)
deferred. Also added: FORTRAN-style WRITE / C-style PRINTF
formatted I/O extensions, and a NumPy / DataFrame / Matplotlib-
style standard library section as a separate sub-project track.
Verified end-to-end: a BASIC program compiled with --emit-obj
--main-name=run_basic_greet links cleanly with both a C driver
(gcc) and a Fortran driver (gfortran with iso_c_binding), and
prints the BASIC output before returning to the host. All
72 interpreter / 68 compat / 63 compiler tests still pass.
Four roadmap items:
- codegen: fix parenthesized string comparison. emit_atom didn't
consume the body of a string-literal token (`"`), so for
PRINT (A$+B$ < "ZZZ") it emitted a 0 placeholder, advanced one byte,
and left "ZZZ" to be reparsed as a variable + extra trailing tokens
-- the binary then failed to build with `var_ZZ_sng` undeclared.
emit_atom now skips to the closing quote. Separately, the
left_type tracking in emit_num_prec dropped VT_STR after a string +
string concat (becoming VT_SNG), so the string-comparison codepath
was skipped when the relational operator arrived. Preserve VT_STR
through TOK_PLUS when both operands are strings. Verified: paren
string-cmp now compiles and produces the same -1 / 0 result as the
interpreter.
- compiler: --no-gc-check and --fast-math optimization flags.
--no-gc-check skips the per-line gwrt_check_line() (no string-pool
GC, no Ctrl+Break trap). --fast-math drops the divide-by-zero
guard on `/`; the divisor still goes through (double) so 10/0
produces inf rather than SIGFPE. Both threaded through
codegen_opts_t and exposed in --help. --inline-arrays from the
roadmap deferred -- larger refactor.
- interp: raise static caps on 32-bit / Linux builds. vars 256
-> 1024, arrays 64 -> 256, MAX_FOR_DEPTH 16 -> 64, MAX_GOSUB_DEPTH
24 -> 128, MAX_WHILE_DEPTH 16 -> 64. Codegen FOR_STACK_MAX 16
-> 64. Analysis-pass caps: MAX_LINES 4096 -> 8192, MAX_VARS 256
-> 1024, MAX_GOTOS 256 -> 1024, MAX_DATA 1024 -> 4096,
MAX_GOSUB_RET 256 -> 1024. 16-bit DOS keeps the original modest
caps via #ifdef _M_I86 -- the MEDIUM model has a single 64KB
DGROUP for all static data and the bumped sizes broke runtime
startup under DOSBox-X. The 16-bit binary grew from 128KB to 132KB
because of the time_offset_secs field plus DATE$/TIME$ shift code, well
within the FreeDOS budget.
- interp + codegen: DATE$ / TIME$ assignment via process-local
clock offset. Previously the assignment was accepted and ignored; it now sets
gw.time_offset_secs (long), and DATE$ / TIME$ / TIMER readers
apply it to time(NULL) before formatting. The OS clock is
unaffected (would need root). Compiled-binary readers also
reference gw.time_offset_secs since libgwrt shares the gw
struct. Verified: PRINT DATE$; DATE$="12-31-1999"; PRINT DATE$
shows the expected before/after in both interpreter and AOT
paths.
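The process-local clock offset from the last item can be illustrated with a
small C sketch. It is an assumption-laden outline rather than the actual
implementation: offset_for_date and format_date are hypothetical helpers, the
offset is passed explicitly instead of living in the gw struct, and time_t is
assumed to count seconds (as on POSIX).

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Offset (in seconds) that makes `now` appear to fall on month/day/year
 * while keeping the current local time of day. */
long offset_for_date(time_t now, int month, int day, int year)
{
    struct tm *lt = localtime(&now);
    assert(lt != NULL);

    struct tm t = *lt;
    t.tm_mon = month - 1;
    t.tm_mday = day;
    t.tm_year = year - 1900;
    t.tm_isdst = -1;              /* let mktime decide about DST */

    time_t target = mktime(&t);
    return (long)difftime(target, now);
}

/* Format the shifted clock as MM-DD-YYYY. Returns the number of characters
 * written, or 0 on failure. The OS clock is never modified. */
size_t format_date(time_t now, long offset_secs, char *out, size_t cap)
{
    assert(out != NULL);
    time_t shifted = now + (time_t)offset_secs;
    struct tm *t = localtime(&shifted);
    if (t == NULL)
        return 0;
    return strftime(out, cap, "%m-%d-%Y", t);
}
```

A reader such as DATE$ would then call format_date(time(NULL), offset, ...)
with the stored offset, so every reader sees the same shifted clock.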
After these changes: 72/72 interpreter tests, 68/68 compat, 63/63
compiler tests, DOS smoke under DOSBox-X all pass. Build clean on
both Linux (cmake) and 16-bit DOS (build_dos.sh 16).
Three fixes that lift seven test programs from skipped to passing,
bringing the AOT compiler harness from 56/56 to 63/63.
- Unnumbered programs (compiler_main.c): src/compiler_main.c skipped
any line that didn't start with a digit, so direct-mode .bas files
like hello.bas, math_ops.bas, string_ops.bas (no line numbers)
failed with "No program lines found". load_file now auto-assigns
line numbers (last_num + 10) to unnumbered lines, with overflow
protection at line 65520.
- String concatenation in PRINT (codegen.c): emit_str_atom had a
broken concat loop that emitted "; _cat = gw_str_concat(&...
_cat.sval ...)" -- _cat was never declared, so any program with a
string-literal concat in PRINT (like PRINT "ABC" + "DEF") failed
to build. Concat is already handled correctly by emit_str_expr's outer
loop; remove the dead/broken code in the atom. Fixes
string_ops.bas.
- Transcendental result type (codegen.c): peek_expr_type returned
VT_DBL for ATN/LOG/EXP/VAL, so PRINT formatted them with 16-digit
double precision (e.g. 3.141592653589793) while real GW-BASIC and
the interpreter format the single-precision result as 3.141593.
Real GW-BASIC's transcendentals are single-precision; only CDBL
forces double. Demote ATN/LOG/EXP/VAL to VT_SNG; CDBL stays
VT_DBL. Fixes math_ops.bas.
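For the unnumbered-program fix in the first item, the numbering rule might
look like the sketch below. The function names are hypothetical, and the exact
overflow behaviour is an assumption: here a line that would land past 65520 is
rejected with -1, which is one plausible reading of "overflow protection".

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define AUTO_LINE_STEP 10L
#define AUTO_LINE_MAX  65520L   /* limit quoted in the text above */

/* Number for the next unnumbered line, or -1 if it would pass the limit. */
long next_auto_line(long last_num)
{
    assert(last_num >= 0);
    if (last_num > AUTO_LINE_MAX - AUTO_LINE_STEP)
        return -1;
    return last_num + AUTO_LINE_STEP;
}

/* Line number for one source line: use the explicit number if the line
 * starts with a digit, otherwise auto-assign from the previous number. */
long line_number_for(const char *text, long last_num)
{
    assert(text != NULL);
    if (isdigit((unsigned char)text[0]))
        return strtol(text, NULL, 10);
    return next_auto_line(last_num);
}
```

With last_num starting at 0, a file of three unnumbered lines would be
assigned 10, 20 and 30.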
Also: tests/run_compiler_tests.sh now runs the compiled binary from
the project root rather than the tempdir where it was built, so
test programs that reference tests/programs/ via relative paths
(chain_test, common_test, run_file, misc_stmts) resolve their
targets. Earlier I'd misdiagnosed those failures as ON ERROR
divergence -- they were just CWD-dependent path lookups.
Doc/test counts: 56 → 63 in README, docs/index.md, docs/development.md,
docs/roadmap.md. Roadmap updated to note the compiler now accepts
unnumbered programs.
QA findings from a multi-round review of the FreeDOS submission prep work:
- TUI rendering refactor: src/tui.c emitted ANSI escape sequences via
printf, which displays as raw text on bare FreeDOS (no ANSI.SYS).
Add four HAL ops (tui_enter, tui_leave, render_run, set_cursor_shape)
and route per-cell rendering through them. POSIX backend keeps the
ANSI path; DOS backend drives BIOS INT 10h via the existing
bios_set_cursor / bios_write_char helpers. The TUI's logical cursor
goes through the saved orig_locate to avoid recursing through the
swapped-in gw_hal->locate.
- DOS extended-key mapping: dos_getch returns 0x100 | scancode for
arrows / F-keys; tui_read_key wasn't translating those to its TK_*
constants, so the editor never saw arrow keys or F1-F10 on DOS.
Add a __MSDOS__-conditional translation table in tui_read_key.
- Version banner: GW_VERSION was still 0.16.0 even though the v0.17.0
release prep was already in CHANGES.TXT. Bump.
- Compiler PulseAudio link: gwbasic-compile -c hardcoded
'-lgwrt -lm -lpthread' on the gcc command line. When libgwrt was
built against libpulse-simple (the default on any host with the
PulseAudio dev headers installed), the compile workflow failed with
'undefined reference to pa_simple_drain'. CMake now passes
GWRT_HAS_PULSEAUDIO to gwbasic-compile when libpulse is present, and
the compiler appends -lpulse-simple to the link line.
- FRE("") garbage collection: the interpreter skipped strpool_gc with a
comment 'unsafe during expression eval', but collecting is exactly what
real GW-BASIC's FRE("") does (and what the AOT compiler path already did). Add
the GC call; strpool_pin/unpin is the existing escape hatch if a
caller has live pool pointers on the C stack. Fixes the string_gc
compat test.
- Test harness normalization: run_tests.sh stripped trailing whitespace
on the actual output but not the expected file, causing spurious
mismatches against golden files captured from real GWBASIC.EXE.
Normalize both sides identically. Fixes the peek_gfx mismatch.
- Print_using: snprintf into mantissa[32] with %.*f and an unbounded
dec triggered a -Wformat-truncation warning. Clamp dec to 20 (IEEE
double has at most ~17 significant decimal digits).
- Doc/version consistency: 16-bit binary size reported as 127KB in one
place and 128KB in three; standardize on 128KB. HAL backend count
said '1 file' but is now 2. CI test count said 'all 66 test
programs' but is 72. Add a v0.17.0 row to the development.md table.
Update getting-started.md DOS section to match the BIOS-rendering
reality and add a manual TUI verification checklist.
- dos_init now writes the BIOS-reported cols/rows back to the dos_hal struct
fields (dos_hal is forward-declared so dos_init can reference it).
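As an illustration of the DOS extended-key item above, a translation table
could be shaped like this. The scancodes are the standard IBM PC ones (0x48
up, 0x50 down, 0x4B left, 0x4D right, 0x3B-0x44 for F1-F10); the TK_* values
and the function name translate_dos_key are placeholders, since the real
constants live in the TUI source and are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder key codes; the real TK_* values are assumptions here. */
enum { TK_UP = 0x200, TK_DOWN, TK_LEFT, TK_RIGHT, TK_F1 = 0x210 };

#define DOS_EXT_FLAG 0x100   /* dos_getch returns 0x100 | scancode */

static const struct { int scan; int key; } ext_map[] = {
    { 0x48, TK_UP },
    { 0x50, TK_DOWN },
    { 0x4B, TK_LEFT },
    { 0x4D, TK_RIGHT },
};

/* Map a raw dos_getch-style value to a TK_* code; other values pass through. */
int translate_dos_key(int raw)
{
    assert(raw >= 0);
    if ((raw & ~0xFF) != DOS_EXT_FLAG)
        return raw;                       /* ordinary character */

    int scan = raw & 0xFF;
    if (scan >= 0x3B && scan <= 0x44)     /* F1..F10 are contiguous */
        return TK_F1 + (scan - 0x3B);

    for (size_t i = 0; i < sizeof ext_map / sizeof ext_map[0]; i++)
        if (ext_map[i].scan == scan)
            return ext_map[i].key;

    return raw;                           /* unmapped extended key */
}
```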
After these changes: 72/72 interpreter tests pass, compat 68/68
matched, no warnings on the Linux build.
- Add CHANGES.TXT with full version history (DOS-friendly format)
- Add DOS/FreeDOS section to README.md
- Move compiler memory safety and DOS target to Completed in roadmap
- Add hal_dos.c to architecture.md module map
- Add .tab-color to .gitignore
Remove Unicode em dashes (U+2014) and en dashes (U+2013) from all
Markdown files. Use ASCII -- for parenthetical breaks and - for
hyphenation, matching standard plain-text conventions.
README.md: rewrite with v0.16.0 version, compiler section, Jupyter
kernel section, hardware I/O in statement table, accurate test counts
(72 interpreter + 14 kernel + 69 compiler), build instructions for
all three targets.
docs/roadmap.md: clean up Phase 2 accumulation into single coherent
compiler description. List all language coverage, operators, functions,
and optimizations. Remove stale intermediate progress markers.
docs/development.md: update test counts (72 programs, 68 golden files),
add kernel and compiler test commands.
Rewrites gfx_draw() as a recursive draw_engine() to support all DRAW
mini-language features:
Bug fixes:
- M command parsing: skip generic arg parser so M100,50 correctly
parses both coordinates instead of consuming x as a generic arg
- S (scale) semantics: distance is now (arg ?: 1) * scale / 4, matching
original GW-BASIC where S4 means 1 pixel per unit, not 4
- A (rotation): implements 90-degree rotation state with direction
vector transform for all 8 direction commands
New features:
- TA n: arbitrary rotation angle (-360 to 360 degrees) via cos/sin
- =variable;: numeric variable substitution in DRAW strings
- X stringvar;: execute substring from string variable (recursive)
- Scale factor applied to relative M coordinates
Binary tokenized SAVE/LOAD now stores float constants in Microsoft Binary
Format (MBF) on disk, matching original GWBASIC.EXE. A token-walking function
(convert_floats) converts IEEE↔MBF at the save_binary()/load_binary() boundary.
Also fixes a latent bug where load_binary() scanned for 0x00 to find the end
of each token line -- this fails when float bytes contain a null byte (e.g. MBF for
100.5 is 00 00 49 87). The loader now uses the next-line pointer to compute
token data length, matching the original's approach.
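For reference, a single-precision IEEE/MBF conversion can be written as in the
sketch below. It is not the project's convert_floats (which walks the token
stream); the function names are made up, only 4-byte singles are handled, and
denormals are flushed to zero as a simplifying assumption. MBF stores three
little-endian mantissa bytes (sign in the top bit of the third) followed by an
exponent byte biased by 128 with the binary point before the implied 1, so the
exponent byte is the IEEE exponent plus 2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert an IEEE single to 4 MBF bytes. Returns 0 on success, -1 if the
 * value (inf, NaN, or too large) has no MBF representation. */
int ieee_to_mbf32(float f, uint8_t out[4])
{
    assert(out != NULL);
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    uint32_t sign = bits >> 31;
    uint32_t exp  = (bits >> 23) & 0xFF;
    uint32_t man  = bits & 0x7FFFFF;

    if (exp == 0) {                 /* zero; denormals flushed to zero */
        memset(out, 0, 4);
        return 0;
    }
    if (exp > 253)                  /* exp + 2 would not fit in one byte */
        return -1;

    out[0] = (uint8_t)(man & 0xFF);
    out[1] = (uint8_t)((man >> 8) & 0xFF);
    out[2] = (uint8_t)((sign << 7) | (man >> 16));
    out[3] = (uint8_t)(exp + 2);
    return 0;
}

/* Convert 4 MBF bytes back to an IEEE single. */
float mbf32_to_ieee(const uint8_t in[4])
{
    assert(in != NULL);
    if (in[3] <= 2)                 /* MBF zero, or too small for a normal */
        return 0.0f;

    uint32_t sign = (uint32_t)in[2] >> 7;
    uint32_t man  = ((uint32_t)(in[2] & 0x7F) << 16)
                  | ((uint32_t)in[1] << 8)
                  | (uint32_t)in[0];
    uint32_t bits = (sign << 31) | ((uint32_t)(in[3] - 2) << 23) | man;

    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Converting 100.5f this way yields the bytes 00 00 49 87 mentioned above, which
is why a scan for 0x00 cannot be used to find the end of a token line.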
Expand roadmap with detailed implementation plans for the next three
features: MBF binary file compatibility (token-stream conversion at
load/save boundary), hardware I/O simulator (portio.c with PIT/speaker/
CGA/joystick port emulation), and DRAW command fixes (M parsing bug,
scale semantics, A rotation, TA/variable substitution). Remove hardware
I/O from known limitations (moving to planned). Fix stale test counts
(64 -> 66) and version string (0.11.0 -> 0.13.0) across docs.
BSAVE/BLOAD: save and load virtual memory blocks with 0xFD-header
binary format, operating on the current DEF SEG segment.
TUI color: tui_refresh emits ANSI SGR codes from cell attributes;
COLOR statement sets tui.current_attr when TUI is active.
Extended PEEK/POKE: CGA graphics framebuffer (interlaced layout) via
gfx_cga_peek/poke routed through virmem when gfx_active(); BIOS
keyboard shift flags (offset 0x17 bit 7 = insert mode).
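The interlaced CGA layout mentioned above maps a pixel to a framebuffer byte
as in this sketch. It assumes the 320x200 four-colour mode (2 bits per pixel,
80 bytes per scanline, odd scanlines in a second bank at offset 0x2000);
cga_offset and cga_shift are illustrative names, not the gfx_cga_peek/poke
internals.

```c
#include <assert.h>

/* Byte offset of pixel (x, y) within the CGA framebuffer segment. */
unsigned cga_offset(int x, int y)
{
    assert(x >= 0 && x < 320 && y >= 0 && y < 200);
    return (unsigned)((y & 1) * 0x2000   /* odd rows live in the second bank */
                    + (y >> 1) * 80      /* 80 bytes per row within a bank */
                    + (x >> 2));         /* 4 pixels per byte */
}

/* Bit shift of pixel x inside its byte (leftmost pixel in bits 7-6). */
int cga_shift(int x)
{
    assert(x >= 0);
    return 6 - 2 * (x & 3);
}
```

A PEEK at offset n therefore touches rows 0, 2, 4, ... for n below 0x2000 and
rows 1, 3, 5, ... above it, which is why the mapping cannot be a flat
y * 80 + x / 4.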
Add bibliography to language reference. 64 tests, all passing.
Binary SAVE/LOAD: SAVE now writes tokenized binary by default (0xFF header
format), matching original GW-BASIC behavior. SAVE "file",A for ASCII.
LOAD auto-detects binary vs ASCII from the first byte. Command-line file
loading also auto-detects, so binary .BAS files just work.
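The first-byte check described above amounts to something like the following
sketch. The enum and function name are hypothetical; only the 0xFF marker
comes from the text, and any other first byte is treated as ASCII.

```c
#include <assert.h>
#include <stddef.h>

enum bas_format { BAS_EMPTY, BAS_ASCII, BAS_TOKENIZED };

/* Classify a program file from its first byte. */
enum bas_format detect_bas_format(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len == 0)
        return BAS_EMPTY;
    return buf[0] == 0xFF ? BAS_TOKENIZED : BAS_ASCII;
}
```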
INKEY$ extended keys: arrow keys, Home/End/PgUp/PgDn, Insert/Delete, and
F1-F10 now return the correct CHR$(0) + scan_code two-byte sequences per
the IBM PC convention. Refactored event trap key parsing to use tui_read_key()
instead of duplicating escape sequence parsing.
Golden-file regression tests: generated .expected output files for 55 of 58
test programs (3 timing-dependent tests excluded). The test runner now
reports compat match status alongside pass/fail.
Classic programs: added Hamurabi, Lunar Lander, Gunner, and Diamond from
David Ahl's BASIC Computer Games (1978) in tests/classic/ for manual
compatibility testing.
Docs updated with compiler roadmap item and hardware I/O simulator plan.
Add event-driven programming: ON TIMER(n) GOSUB with TIMER ON/OFF/STOP,
ON KEY(n) GOSUB with KEY(n) ON/OFF/STOP for F1-F10. Fix F-key escape
sequence parser (F9/F10 detection, push back consumed bytes on unmatched
sequences). Add EDIT statement for TUI line editing. Guard key trap
polling so keystrokes aren't consumed when no traps are configured.
RND can now be called without parentheses (equivalent to RND(1)),
matching real GW-BASIC behavior for legacy code compatibility.
Update all Sphinx documentation pages to reflect features added in
v0.6.0-0.8.0: DATE$/TIME$/TIMER, FILES/SHELL/CHDIR/MKDIR/RMDIR,
AUTO/RENUM/DELETE, COMMON, LPRINT/LLIST with --lpt, --full TUI flag,
dynamic screen buffer, 54 tests.
Authentic GW-BASIC screen editor with 25x80 buffer, free cursor movement,
enter-on-any-line, F1-F10 function keys, Insert/Overwrite toggle, KEY
ON/OFF/LIST statement, and Ctrl+Break handling. HAL pointer swap routes
all PRINT/LIST/error output through the TUI automatically. Piped mode
unchanged (50/50 tests pass).
Adds automated compatibility testing infrastructure: DOSBox-X headless
config, PRINT-to-file transform script, and run_compat.sh with --generate
and --compare modes for verifying output against real GWBASIC.EXE.
Project renamed from gwbasic-c to GW-BASIC 2026.