* tests: automated headless FreeDOS QEMU smoke
Fully-automated counterpart to the manual run_freedos_qemu.sh: overlays the
FreeDOS image (no mutation), stages the interpreter and a SYSTEM-terminated
smoke on C:, injects the run plus poweroff into the image's startup batch,
boots headless, and diffs OUT.TXT against the golden file. Local-dev only
(needs qemu, a FreeDOS qcow2, mtools, nbd, and passwordless sudo); CI keeps
using the DOSBox-X path. Exercises the binary on a real FreeDOS install
rather than DOSBox-X emulation.
* Release 0.18.0
Cross-language linking (link BASIC into C/Fortran, call C from BASIC via
'$EXTERN), the v0.18 codegen/perf batch (paren string-comparison fix,
--no-gc-check/--fast-math, larger 32-bit caps, process-local DATE$/TIME$),
and the automated FreeDOS QEMU smoke. Bumps GW_VERSION and updates the
banners, CHANGES.TXT, and the development history table.
---------
Co-authored-by: Eremey Valetov <evvaletov@users.noreply.github.com>
Add a '$EXTERN NAME(ARGTYPES) AS RET pragma so compiled BASIC can call C
functions directly, the natural follow-up to Level 1 (--emit-obj /
--main-name). The pragma is an apostrophe comment, so the interpreter
ignores it while the compiler registers it.
Map INTEGER/SINGLE/DOUBLE/STRING to int16_t/float/double/const char* at the
boundary: a string argument crosses as a temporary C copy that is freed
after the call, and a string return is copied into the pool. The call name
is matched case-insensitively but emitted as the C symbol with the case
written in the pragma. Names are recognized before parse_var() truncates
identifiers to two significant characters, so multi-character C function
names work.
A string return that aliases a char* argument is copied before the argument
temporaries are freed, which avoids a use-after-free. Over-supplied
arguments are consumed without desyncing the token stream, and an arity
mismatch produces a warning.
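The copy-before-free ordering for aliased string returns can be sketched in C as follows. This is a minimal illustration, not the generated code: `skipws`, `dup_string`, and `call_skipws` are hypothetical names, and a plain heap copy stands in for the string pool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C function, as it might be declared to BASIC with
   '$EXTERN skipws(STRING) AS STRING. It returns a pointer INTO its
   argument, so the result aliases the argument temporary. */
const char *skipws(const char *s) {
    while (*s == ' ') s++;
    return s;
}

/* Stand-in for "copy into the string pool"; here just a heap copy. */
static char *dup_string(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    assert(p != NULL);
    memcpy(p, s, n);
    return p;
}

/* Call-sequence sketch: copy the return value BEFORE freeing the
   argument temporary. The caller owns (and frees) the result. */
char *call_skipws(const char *basic_str) {
    char *tmp = dup_string(basic_str);  /* temporary C copy of the argument */
    const char *ret = skipws(tmp);      /* may point into tmp */
    char *result = dup_string(ret);     /* copy first ... */
    free(tmp);                          /* ... then release the temporary */
    return result;
}
```

Swapping the last two steps would read freed memory whenever the C function returns a pointer into its own argument.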
Docs: getting-started.md "Foreign Functions from BASIC". Test:
tests/run_ffi_test.sh, wired into CI. 63/63 compiler, 72/72 interpreter,
68/68 compat still pass.
Also refile the roadmap "Next Up" backlog as git-bug issues and prune
docs/roadmap.md to point at git-bug as the source of truth for planned work.
Co-authored-by: Eremey Valetov <evvaletov@users.noreply.github.com>
Level 1 of the cross-language linking roadmap entry: produce an
object file with a renamed entry point so a BASIC program can be
linked into a larger C or Fortran build.
- src/compiler_main.c: --emit-obj runs gcc -c (compile-only,
produces prog.o) and skips the runtime link. --main-name NAME
(or --main-name=NAME) is plumbed through codegen_opts_t.
- src/codegen.c: emit `int <name>(int argc, char **argv)` instead
of always emitting `main`. Default unchanged when --main-name
isn't specified.
- include/codegen.h: add main_name to codegen_opts_t.
- docs/getting-started.md: new "Cross-Language Linking" section
with C and Fortran (iso_c_binding) driver examples.
- docs/roadmap.md: three levels of cross-language linking, with
Level 1 marked done, Level 2 (BASIC-side EXTERN declarations)
as the next concrete step, Level 3 (BASIC SUBs as C functions)
deferred. Also added: FORTRAN-style WRITE / C-style PRINTF
formatted I/O extensions, and a NumPy / DataFrame / Matplotlib-
style standard library section as a separate sub-project track.
Verified end-to-end: a BASIC program compiled with --emit-obj
--main-name=run_basic_greet links cleanly with both a C driver
(gcc) and a Fortran driver (gfortran with iso_c_binding), and
prints the BASIC output before returning to the host. All
72 interpreter / 68 compat / 63 compiler tests still pass.
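A host-side view of the renamed entry point, as a hedged sketch. The signature `int run_basic_greet(int argc, char **argv)` follows the codegen change above; the stub body, the `host_call_basic` wrapper, and the assumption that the entry point returns 0 on success are illustrative only.

```c
#include <assert.h>
#include <stdio.h>

/* Entry point emitted when the BASIC program is compiled with
   --emit-obj --main-name=run_basic_greet. */
int run_basic_greet(int argc, char **argv);

/* STAND-IN so this sketch links by itself. In a real build, delete this
   definition and link the object file produced by --emit-obj instead. */
int run_basic_greet(int argc, char **argv) {
    (void)argc;
    (void)argv;
    puts("HELLO FROM BASIC");
    return 0; /* assumed success code */
}

/* Hypothetical host wrapper: a C driver's main() would call this. */
int host_call_basic(void) {
    char name[] = "host";
    char *argv[] = { name, NULL };
    puts("host: before BASIC");
    int rc = run_basic_greet(1, argv);
    puts("host: after BASIC");
    return rc;
}
```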
Four roadmap items:
- codegen: fix parenthesized string comparison. emit_atom didn't
consume the body of a string-literal token (`"`), so for
PRINT (A$+B$ < "ZZZ") it emitted a 0 placeholder, advanced one byte,
and left "ZZZ" to be reparsed as a variable + extra trailing tokens
-- the binary then failed to link with `var_ZZ_sng` undeclared.
emit_atom now skips to the closing quote. Separately, the
left_type tracking in emit_num_prec dropped VT_STR after a string +
string concat (becoming VT_SNG), so the string-comparison codepath
was skipped when the relational operator arrived. Preserve VT_STR
through TOK_PLUS when both operands are strings. Verified: paren
string-cmp now compiles and produces the same -1 / 0 result as the
interpreter.
- compiler: --no-gc-check and --fast-math optimization flags.
--no-gc-check skips the per-line gwrt_check_line() (no string-pool
GC, no Ctrl+Break trap). --fast-math drops the divide-by-zero
guard on `/`; the divisor still goes through (double) so 10/0
produces inf rather than SIGFPE. Both threaded through
codegen_opts_t and exposed in --help. --inline-arrays from the
roadmap deferred -- larger refactor.
- interp: raise static caps on 32-bit / Linux builds. vars 256
-> 1024, arrays 64 -> 256, MAX_FOR_DEPTH 16 -> 64, MAX_GOSUB_DEPTH
24 -> 128, MAX_WHILE_DEPTH 16 -> 64. Codegen FOR_STACK_MAX 16
-> 64. Analysis-pass caps: MAX_LINES 4096 -> 8192, MAX_VARS 256
-> 1024, MAX_GOTOS 256 -> 1024, MAX_DATA 1024 -> 4096,
MAX_GOSUB_RET 256 -> 1024. 16-bit DOS keeps the original modest
caps via #ifdef _M_I86 -- the MEDIUM model has a single 64KB
DGROUP for all static data and the bumped sizes broke runtime
startup under DOSBox-X. The 16-bit binary grew from 128KB to 132KB
because of the time_offset_secs field plus the DATE$/TIME$ shift code,
well within the FreeDOS budget.
- interp + codegen: DATE$ / TIME$ assignment via process-local
clock offset. Assignment was previously accepted and ignored (a no-op).
It now sets gw.time_offset_secs (long), and DATE$ / TIME$ / TIMER readers
apply it to time(NULL) before formatting. The OS clock is
unaffected (would need root). Compiled-binary readers also
reference gw.time_offset_secs since libgwrt shares the gw
struct. Verified: PRINT DATE$ : DATE$="12-31-1999" : PRINT DATE$
shows the expected before/after values in both the interpreter and AOT
paths.
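The process-local clock offset from the DATE$ / TIME$ item can be sketched like this. The field name `time_offset_secs` comes from the commit; the local `gw` struct, `gw_now`, and `gw_set_clock` are hypothetical stand-ins, and the sketch assumes `time_t` counts seconds (as on POSIX).

```c
#include <assert.h>
#include <time.h>

/* Stand-in for the shared gw struct; only the offset field is modelled. */
static struct { long time_offset_secs; } gw;

/* Reader side: every DATE$/TIME$/TIMER read shifts the OS clock. */
time_t gw_now(void) {
    return time(NULL) + (time_t)gw.time_offset_secs;
}

/* Writer side: an assignment stores the distance between the requested
   time and the real clock. The OS clock itself is never touched. */
void gw_set_clock(time_t target) {
    gw.time_offset_secs = (long)difftime(target, time(NULL));
}
```

Because only the offset is stored, the shifted clock keeps ticking after the assignment, and the change disappears when the process exits.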
After these changes: 72/72 interpreter tests, 68/68 compat, 63/63
compiler tests, DOS smoke under DOSBox-X all pass. Build clean on
both Linux (cmake) and 16-bit DOS (build_dos.sh 16).
- .github/workflows/ci.yml: the dos-cross-compile job failed on the
first push because build_dos.sh sources $HOME/openwatcom-v2/setvars.sh
if WATCOM is unset, but that file isn't part of the OpenWatcom V2
snapshot -- I'd been creating it locally by hand. Add a "Configure
OpenWatcom env" step that generates setvars.sh after extraction (so
build_dos.sh works) AND exports WATCOM/PATH/INCLUDE via $GITHUB_ENV
(so subsequent steps work even if setvars.sh sourcing changes).
Also stash both DOS binaries before the next-mode clean wipes them,
so the artifact upload actually has both .exe files.
- src/codegen.c: switch the four remaining emit_str_atom callers
(CVI/CVS/CVD function args + string-comparison left/right) to
emit_str_expr. Now `CVS(A$+B$)` and `A$+B$ < C$` accept
concatenation in their string operands; previously the atom-level
caller stopped at the first identifier and the trailing `+` confused
downstream parsing. Verified: CVS(MKS$(3.14)+MKS$(0)) round-trips
to 3.14 in both interpreter and compiled binary. All 72 interpreter
+ 63 compiler tests still pass.
- docs/getting-started.md: document that gwbasic-compile auto-numbers
unnumbered direct-mode lines (last_num + 10) so scratchpad-style
programs compile without manual renumbering.
- tests/run_freedos_qemu.sh: helper for going through the manual TUI
checklist on bare FreeDOS. Modern qemu-kvm doesn't expose -fda on
the default machine type and fat:rw: protocol is gone, so a fully
automated FreeDOS smoke isn't tractable from userspace; this script
builds a FAT data image (mtools), attaches it as -hdb to the FreeDOS
qcow2, and points the user at the manual sequence in the script
header. The DOSBox-X harness (run_dos_smoke.sh) remains the
automated DOS smoke.
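The auto-numbering rule documented for gwbasic-compile (last_num + 10) can be sketched as below. `assign_line_number` is a hypothetical helper, and starting the counter at 0 (so the first unnumbered line becomes 10) is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Returns the line number to use for `line`: its own leading number if
   it has one, otherwise the previous number plus 10. Updates *last_num. */
long assign_line_number(const char *line, long *last_num) {
    assert(line != NULL && last_num != NULL);
    while (*line == ' ') line++;              /* skip leading blanks */
    if (isdigit((unsigned char)*line))
        *last_num = strtol(line, NULL, 10);   /* explicit line number */
    else
        *last_num += 10;                      /* unnumbered: last_num + 10 */
    return *last_num;
}
```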
QA findings from a multi-round review of the FreeDOS submission prep work:
- TUI rendering refactor: src/tui.c emitted ANSI escape sequences via
printf, which displays as raw text on bare FreeDOS (no ANSI.SYS).
Add four HAL ops (tui_enter, tui_leave, render_run, set_cursor_shape)
and route per-cell rendering through them. POSIX backend keeps the
ANSI path; DOS backend drives BIOS INT 10h via the existing
bios_set_cursor / bios_write_char helpers. The TUI's logical cursor
goes through the saved orig_locate to avoid recursing through the
swapped-in gw_hal->locate.
- DOS extended-key mapping: dos_getch returns 0x100 | scancode for
arrows / F-keys; tui_read_key wasn't translating those to its TK_*
constants, so the editor never saw arrow keys or F1-F10 on DOS.
Add a __MSDOS__-conditional translation table in tui_read_key.
- Version banner: GW_VERSION was still 0.16.0 even though the v0.17.0
release prep was already in CHANGES.TXT. Bump.
- Compiler PulseAudio link: gwbasic-compile -c hardcoded
'-lgwrt -lm -lpthread' on the gcc command line. When libgwrt was
built against libpulse-simple (the default on any host with the
PulseAudio dev headers installed), the compile workflow failed with
'undefined reference to pa_simple_drain'. CMake now passes
GWRT_HAS_PULSEAUDIO to gwbasic-compile when libpulse is present, and
the compiler appends -lpulse-simple to the link line.
- FRE("") garbage collection: the interpreter skipped strpool_gc with a
comment 'unsafe during expression eval', but that's exactly what real
GW-BASIC's FRE("") does (and the AOT compiler path already did). Add
the GC call; strpool_pin/unpin is the existing escape hatch if a
caller has live pool pointers on the C stack. Fixes the string_gc
compat test.
- Test harness normalization: run_tests.sh stripped trailing whitespace
on the actual output but not the expected file, causing spurious
mismatches against golden files captured from real GWBASIC.EXE.
Normalize both sides identically. Fixes the peek_gfx mismatch.
- Print_using: snprintf into mantissa[32] with %.*f and an unbounded
dec triggered a -Wformat-truncation warning. Clamp dec to 20 (IEEE
double has at most ~17 significant decimal digits).
- Doc/version consistency: 16-bit binary size reported as 127KB in one
place and 128KB in three; standardize on 128KB. HAL backend count
said '1 file' but is now 2. CI test count said 'all 66 test
programs' but is 72. Add a v0.17.0 row to the development.md table.
Update getting-started.md DOS section to match the BIOS-rendering
reality and add a manual TUI verification checklist.
- dos_init now writes back BIOS-reported cols/rows to dos_hal struct
fields (forward-declared so dos_init can reference it).
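The DOS extended-key translation from the list above might look roughly like this. The `0x100 | scancode` encoding is from the commit and the scan codes are the standard PC ones; the `TK_*` values and the function name `translate_dos_key` are hypothetical.

```c
#include <assert.h>

/* Hypothetical TK_* values; F1..F10 are assumed to be consecutive. */
enum { TK_NONE = 0, TK_UP = 1000, TK_DOWN, TK_LEFT, TK_RIGHT, TK_F1 };

/* Map a dos_getch-style code to a TK_* constant. */
int translate_dos_key(int k) {
    if (!(k & 0x100))
        return k;                     /* plain ASCII passes through */
    int sc = k & 0xFF;                /* PC scan code */
    switch (sc) {
    case 0x48: return TK_UP;
    case 0x50: return TK_DOWN;
    case 0x4B: return TK_LEFT;
    case 0x4D: return TK_RIGHT;
    default:   break;
    }
    if (sc >= 0x3B && sc <= 0x44)     /* F1..F10 */
        return TK_F1 + (sc - 0x3B);
    return TK_NONE;                   /* unmapped extended key */
}
```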
After these changes: 72/72 interpreter tests pass, compat 68/68
matched, no warnings on the Linux build.
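The precision clamp from the Print_using item can be sketched as follows. The limit of 20 and the 32-byte buffer come from the commit; `format_mantissa` and `MAX_DEC` are hypothetical names. Clamping bounds only the fractional digits, so snprintf's own truncation is still what protects the buffer for very large values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEC 20  /* more digits than a double can meaningfully carry */

/* Format v with at most MAX_DEC decimals into buf. Returns snprintf's
   result (the length that would have been written). */
int format_mantissa(char *buf, size_t size, double v, int dec) {
    assert(buf != NULL && size > 0);
    if (dec < 0) dec = 0;
    if (dec > MAX_DEC) dec = MAX_DEC;
    return snprintf(buf, size, "%.*f", dec, v);
}
```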
Fix 6 crash paths when TUI screen buffer allocation fails (common on
16-bit DOS due to near-heap exhaustion):
- main.c: REPL and AUTO mode fall back to fgets-based line reader
- tui.c: tui_key_on/off/list return early if tui.screen is NULL
Add DOS build documentation to getting-started.md (16-bit and 32-bit
targets, running on FreeDOS). Fix stale version string (0.14.0 -> 0.16.0).
Remove Unicode em dashes (U+2014) and en dashes (U+2013) from all
Markdown files. Use ASCII -- for parenthetical breaks and - for
hyphenation, matching standard plain-text conventions.
Binary tokenized SAVE/LOAD now stores float constants in Microsoft Binary
Format (MBF) on disk, matching original GWBASIC.EXE. A token-walking function
(convert_floats) converts IEEE↔MBF at the save_binary()/load_binary() boundary.
Also fixes a latent bug where load_binary() scanned for 0x00 to find the end
of each token line -- this fails when float bytes contain a zero byte (e.g.
MBF for 100.5 is 00 00 49 87). The loader now uses the next-line pointer to
compute token data length, matching the original's approach.
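The per-value conversion behind the save/load boundary can be sketched for single precision. This is not convert_floats itself (which walks the token stream); `ieee_to_mbf` and `mbf_to_ieee` are hypothetical helpers, and the sketch assumes `float` is IEEE 754 binary32. MBF stores the mantissa low byte first, the sign in bit 7 of the third byte, and the exponent byte (bias 128, mantissa read as 0.1m) last, so the MBF exponent is the IEEE exponent plus 2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* IEEE single -> 4-byte MBF. Returns 0 on success, -1 if the value
   cannot be represented (overflow, Inf, NaN). */
int ieee_to_mbf(float f, uint8_t out[4]) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    uint32_t sign = bits >> 31;
    uint32_t exp  = (bits >> 23) & 0xFF;
    uint32_t mant = bits & 0x7FFFFF;
    if (exp == 0) {                  /* zero or denormal -> MBF zero */
        memset(out, 0, 4);
        return 0;
    }
    if (exp > 0xFD)                  /* exp + 2 would not fit in a byte */
        return -1;
    out[0] = (uint8_t)(mant & 0xFF);
    out[1] = (uint8_t)((mant >> 8) & 0xFF);
    out[2] = (uint8_t)((sign << 7) | (mant >> 16));
    out[3] = (uint8_t)(exp + 2);
    return 0;
}

/* 4-byte MBF -> IEEE single. Returns -1 for values that would need an
   IEEE denormal, which this sketch does not handle. */
int mbf_to_ieee(const uint8_t in[4], float *f) {
    assert(f != NULL);
    if (in[3] == 0) { *f = 0.0f; return 0; }   /* exponent 0 means zero */
    if (in[3] <= 2) return -1;
    uint32_t sign = (uint32_t)in[2] >> 7;
    uint32_t mant = ((uint32_t)(in[2] & 0x7F) << 16)
                  | ((uint32_t)in[1] << 8) | in[0];
    uint32_t bits = (sign << 31) | ((uint32_t)(in[3] - 2) << 23) | mant;
    memcpy(f, &bits, sizeof bits);
    return 0;
}
```

For 100.5 this yields the bytes 00 00 49 87 quoted above, which is why scanning for 0x00 as a line terminator is unsafe.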
Expand roadmap with detailed implementation plans for the next three
features: MBF binary file compatibility (token-stream conversion at
load/save boundary), hardware I/O simulator (portio.c with PIT/speaker/
CGA/joystick port emulation), and DRAW command fixes (M parsing bug,
scale semantics, A rotation, TA/variable substitution). Remove hardware
I/O from known limitations (moving to planned). Fix stale test counts
(64 -> 66) and version string (0.11.0 -> 0.13.0) across docs.
Binary SAVE/LOAD: SAVE now writes tokenized binary by default (0xFF header
format), matching original GW-BASIC behavior. SAVE "file",A for ASCII.
LOAD auto-detects binary vs ASCII from the first byte. Command-line file
loading also auto-detects, so binary .BAS files just work.
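First-byte detection for LOAD can be sketched as below. The 0xFF header byte is from the commit; the enum, the function name `detect_bas_format`, and the buffer-based interface are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { FMT_EMPTY, FMT_ASCII, FMT_TOKENIZED } bas_format_t;

/* Classify a program file from its first byte: 0xFF marks the tokenized
   binary format, anything else is treated as ASCII text. */
bas_format_t detect_bas_format(const unsigned char *data, size_t len) {
    assert(data != NULL || len == 0);
    if (len == 0)
        return FMT_EMPTY;
    return (data[0] == 0xFF) ? FMT_TOKENIZED : FMT_ASCII;
}
```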
INKEY$ extended keys: arrow keys, Home/End/PgUp/PgDn, Insert/Delete, and
F1-F10 now return the correct CHR$(0) + scan_code two-byte sequences per
the IBM PC convention. Refactored event trap key parsing to use tui_read_key()
instead of duplicating escape sequence parsing.
Golden-file regression tests: generated .expected output files for 55 of 58
test programs (3 timing-dependent tests excluded). The test runner now
reports compat match status alongside pass/fail.
Classic programs: added Hamurabi, Lunar Lander, Gunner, and Diamond from
David Ahl's BASIC Computer Games (1978) in tests/classic/ for manual
compatibility testing.
Docs updated with compiler roadmap item and hardware I/O simulator plan.
Add event-driven programming: ON TIMER(n) GOSUB with TIMER ON/OFF/STOP,
ON KEY(n) GOSUB with KEY(n) ON/OFF/STOP for F1-F10. Fix F-key escape
sequence parser (F9/F10 detection, push back consumed bytes on unmatched
sequences). Add EDIT statement for TUI line editing. Guard key trap
polling so keystrokes aren't consumed when no traps are configured.
RND can now be called without parentheses (equivalent to RND(1)),
matching real GW-BASIC behavior for legacy code compatibility.
Update all Sphinx documentation pages to reflect features added in
v0.6.0-0.8.0: DATE$/TIME$/TIMER, FILES/SHELL/CHDIR/MKDIR/RMDIR,
AUTO/RENUM/DELETE, COMMON, LPRINT/LLIST with --lpt, --full TUI flag,
dynamic screen buffer, 54 tests.
Authentic GW-BASIC screen editor with 25x80 buffer, free cursor movement,
enter-on-any-line, F1-F10 function keys, Insert/Overwrite toggle, KEY
ON/OFF/LIST statement, and Ctrl+Break handling. HAL pointer swap routes
all PRINT/LIST/error output through the TUI automatically. Piped mode
unchanged (50/50 tests pass).
Adds automated compatibility testing infrastructure: DOSBox-X headless
config, PRINT-to-file transform script, and run_compat.sh with --generate
and --compare modes for verifying output against real GWBASIC.EXE.
Project renamed from gwbasic-c to GW-BASIC 2026.