17 Commits

Author SHA1 Message Date
Eremey Valetov
cabd93caea Release 0.18.0 + automated FreeDOS QEMU smoke (#2)
* tests: automated headless FreeDOS QEMU smoke

Fully-automated counterpart to the manual run_freedos_qemu.sh: overlays the
FreeDOS image (no mutation), stages the interpreter and a SYSTEM-terminated
smoke on C:, injects the run plus poweroff into the image's startup batch,
boots headless, and diffs OUT.TXT against the golden file. Local-dev only
(needs qemu, a FreeDOS qcow2, mtools, nbd, and passwordless sudo); CI keeps
using the DOSBox-X path. Exercises the binary on a real FreeDOS install
rather than DOSBox-X emulation.

* Release 0.18.0

Cross-language linking (link BASIC into C/Fortran, call C from BASIC via
'$EXTERN), the v0.18 codegen/perf batch (paren string-comparison fix,
--no-gc-check/--fast-math, larger 32-bit caps, process-local DATE$/TIME$),
and the automated FreeDOS QEMU smoke. Bumps GW_VERSION and updates the
banners, CHANGES.TXT, and the development history table.

---------

Co-authored-by: Eremey Valetov <evvaletov@users.noreply.github.com>
2026-06-13 15:37:45 +03:00
Eremey Valetov
89fe0fb0b3 Compiler: $EXTERN pragma for calling C functions from BASIC (Level 2 FFI) (#1)
Add a '$EXTERN NAME(ARGTYPES) AS RET pragma so compiled BASIC can call C
functions directly, the natural follow-up to Level 1 (--emit-obj /
--main-name). The pragma is an apostrophe comment, so the interpreter
ignores it while the compiler registers it.

Map INTEGER/SINGLE/DOUBLE/STRING to int16_t/float/double/const char* at the
boundary: a string argument crosses as a temporary C copy that is freed
after the call, and a string return is copied into the pool. The call name
is matched case-insensitively but emitted as the C symbol with the case
written in the pragma. Names are recognized before parse_var() truncates
identifiers to two significant characters, so multi-character C function
names work.

A string return that aliases a char* argument is copied before the argument
temporaries are freed, which avoids a use-after-free. Over-supplied
arguments are consumed without desyncing the token stream, and an arity
mismatch produces a warning.
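
The copy-before-free ordering described above can be sketched in C. This is a minimal illustration, not the compiler's emitted code: `call_str_fn`, `dup_cstr`, and `skip_spaces` are hypothetical names, and a plain heap copy stands in for the string pool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "copy into the pool": a heap duplicate. */
static char *dup_cstr(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, s, n);
    return copy;
}

/* Example foreign function: returns a pointer INTO its own argument. */
const char *skip_spaces(const char *s)
{
    while (*s == ' ')
        s++;
    return s;
}

/* Hypothetical call wrapper: the BASIC string crosses as a temporary
   C copy, and the result is copied before that temporary is freed. */
char *call_str_fn(const char *(*fn)(const char *), const char *basic_str)
{
    char *tmp = dup_cstr(basic_str);          /* temporary argument copy */
    const char *ret = fn(tmp);                /* may alias tmp */
    char *result = dup_cstr(ret ? ret : "");  /* copy BEFORE freeing tmp */
    free(tmp);
    return result;                            /* caller owns the copy */
}
```

Freeing `tmp` first and copying `ret` afterwards would read freed memory whenever the function returns a pointer into its argument, as `skip_spaces` does.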

Docs: getting-started.md "Foreign Functions from BASIC". Test:
tests/run_ffi_test.sh, wired into CI. 63/63 compiler, 72/72 interpreter,
68/68 compat still pass.

Also refile the roadmap "Next Up" backlog as git-bug issues and prune
docs/roadmap.md to point at git-bug as the source of truth for planned work.

Co-authored-by: Eremey Valetov <evvaletov@users.noreply.github.com>
2026-06-13 15:06:23 +03:00
Eremey Valetov
791b5a4710 compiler: --emit-obj and --main-name for cross-language linking
Level 1 of the cross-language linking roadmap entry: produce an
object file with a renamed entry point so a BASIC program can be
linked into a larger C or Fortran build.

- src/compiler_main.c: --emit-obj runs gcc -c (compile-only,
  produces prog.o) and skips the runtime link.  --main-name NAME
  (or --main-name=NAME) is plumbed through codegen_opts_t.

- src/codegen.c: emit `int <name>(int argc, char **argv)` instead
  of always emitting `main`.  Default unchanged when --main-name
  isn't specified.

- include/codegen.h: add main_name to codegen_opts_t.

- docs/getting-started.md: new "Cross-Language Linking" section
  with C and Fortran (iso_c_binding) driver examples.

- docs/roadmap.md: three levels of cross-language linking, with
  Level 1 marked done, Level 2 (BASIC-side EXTERN declarations)
  as the next concrete step, Level 3 (BASIC SUBs as C functions)
  deferred.  Also added: FORTRAN-style WRITE / C-style PRINTF
  formatted I/O extensions, and a NumPy / DataFrame / Matplotlib-
  style standard library section as a separate sub-project track.

Verified end-to-end: a BASIC program compiled with --emit-obj
--main-name=run_basic_greet links cleanly with both a C driver
(gcc) and a Fortran driver (gfortran with iso_c_binding), and
prints the BASIC output before returning to the host.  All
72 interpreter / 68 compat / 63 compiler tests still pass.
2026-05-05 06:50:50 -04:00
Eremey Valetov
f207d74aec codegen fixes, --no-gc-check / --fast-math, raise caps, DATE$/TIME$ shift
Four roadmap items:

- codegen: fix parenthesized string comparison.  emit_atom didn't
  consume the body of a string-literal token (`"`), so for
  PRINT (A$+B$ < "ZZZ") it emitted a 0 placeholder, advanced one byte,
  and left "ZZZ" to be reparsed as a variable + extra trailing tokens
  -- the binary then failed to link with `var_ZZ_sng` undeclared.
  emit_atom now skips to the closing quote.  Separately, the
  left_type tracking in emit_num_prec dropped VT_STR after a string +
  string concat (becoming VT_SNG), so the string-comparison codepath
  was skipped when the relational operator arrived.  Preserve VT_STR
  through TOK_PLUS when both operands are strings.  Verified: paren
  string-cmp now compiles and produces the same -1 / 0 result as the
  interpreter.

- compiler: --no-gc-check and --fast-math optimization flags.
  --no-gc-check skips the per-line gwrt_check_line() (no string-pool
  GC, no Ctrl+Break trap).  --fast-math drops the divide-by-zero
  guard on `/`; the divisor still goes through (double) so 10/0
  produces inf rather than SIGFPE.  Both threaded through
  codegen_opts_t and exposed in --help.  --inline-arrays from the
  roadmap deferred -- larger refactor.

- interp: raise static caps on 32-bit / Linux builds.  vars 256
  -> 1024, arrays 64 -> 256, MAX_FOR_DEPTH 16 -> 64, MAX_GOSUB_DEPTH
  24 -> 128, MAX_WHILE_DEPTH 16 -> 64.  Codegen FOR_STACK_MAX 16
  -> 64.  Analysis-pass caps: MAX_LINES 4096 -> 8192, MAX_VARS 256
  -> 1024, MAX_GOTOS 256 -> 1024, MAX_DATA 1024 -> 4096,
  MAX_GOSUB_RET 256 -> 1024.  16-bit DOS keeps the original modest
  caps via #ifdef _M_I86 -- the MEDIUM model has a single 64KB
  DGROUP for all static data and the bumped sizes broke runtime
  startup under DOSBox-X.  16-bit binary grew from 128KB to 132KB
  from the time_offset_secs field plus DATE$/TIME$ shift code, well
  within the FreeDOS budget.

- interp + codegen: DATE$ / TIME$ assignment via process-local
  clock offset.  Was a no-op accept-and-ignore.  Now sets
  gw.time_offset_secs (long), and DATE$ / TIME$ / TIMER readers
  apply it to time(NULL) before formatting.  The OS clock is
  unaffected (would need root).  Compiled-binary readers also
  reference gw.time_offset_secs since libgwrt shares the gw
  struct.  Verified: PRINT DATE$; DATE$="12-31-1999"; PRINT DATE$
  shows the expected before/after in both interpreter and AOT
  paths.
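
The process-local clock offset from the DATE$ / TIME$ item above can be sketched as follows. This is a minimal illustration under stated assumptions: only the field name `time_offset_secs` comes from the commit message, while `struct clock_state`, `clock_set`, and `clock_read` are hypothetical, and the real time is passed in explicitly so the behaviour is deterministic.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical miniature of the shared state; only the field name
   time_offset_secs is taken from the commit message. */
struct clock_state {
    long time_offset_secs;
};

/* Record the shift needed so that "now" reads as `target`.
   The OS clock itself is never touched. */
void clock_set(struct clock_state *st, time_t real_now, time_t target)
{
    assert(st != NULL);
    st->time_offset_secs = (long)difftime(target, real_now);
}

/* What a DATE$ / TIME$ / TIMER reader would format:
   the real time plus the stored offset. */
time_t clock_read(const struct clock_state *st, time_t real_now)
{
    assert(st != NULL);
    return real_now + (time_t)st->time_offset_secs;
}
```

Because only an offset is stored, the shifted clock keeps ticking with the real one: a read taken 60 seconds later returns a value 60 seconds later.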

After these changes: 72/72 interpreter tests, 68/68 compat, 63/63
compiler tests, DOS smoke under DOSBox-X all pass.  Build clean on
both Linux (cmake) and 16-bit DOS (build_dos.sh 16).
2026-05-04 18:56:58 -04:00
Eremey Valetov
da1e6cebf1 CI: fix dos-cross-compile env; codegen: accept concat in CVI/CVS/CVD/cmp
- .github/workflows/ci.yml: the dos-cross-compile job failed on the
  first push because build_dos.sh sources $HOME/openwatcom-v2/setvars.sh
  if WATCOM is unset, but that file isn't part of the OpenWatcom V2
  snapshot -- I'd been creating it locally by hand.  Add a "Configure
  OpenWatcom env" step that generates setvars.sh after extraction (so
  build_dos.sh works) AND exports WATCOM/PATH/INCLUDE via $GITHUB_ENV
  (so subsequent steps work even if setvars.sh sourcing changes).
  Also stash both DOS binaries before the next-mode clean wipes them,
  so the artifact upload actually has both .exe files.

- src/codegen.c: switch the four remaining emit_str_atom callers
  (CVI/CVS/CVD function args + string-comparison left/right) to
  emit_str_expr.  Now `CVS(A$+B$)` and `A$+B$ < C$` accept
  concatenation in their string operands; previously the atom-level
  caller stopped at the first identifier and the trailing `+` confused
  downstream parsing.  Verified: CVS(MKS$(3.14)+MKS$(0)) round-trips
  to 3.14 in both interpreter and compiled binary.  All 72 interpreter
  + 63 compiler tests still pass.

- docs/getting-started.md: document that gwbasic-compile auto-numbers
  unnumbered direct-mode lines (last_num + 10) so scratchpad-style
  programs compile without manual renumbering.

- tests/run_freedos_qemu.sh: helper for going through the manual TUI
  checklist on bare FreeDOS.  Modern qemu-kvm doesn't expose -fda on
  the default machine type and the fat:rw: protocol is gone, so a fully
  automated FreeDOS smoke isn't tractable from userspace; this script
  builds a FAT data image (mtools), attaches it as -hdb to the FreeDOS
  qcow2, and points the user at the manual sequence in the script
  header.  The DOSBox-X harness (run_dos_smoke.sh) remains the
  automated DOS smoke.
2026-05-04 18:16:21 -04:00
Eremey Valetov
70ffd39562 v0.17.0: BIOS-routed TUI on DOS, version banner, compiler PulseAudio link
QA findings from a multi-round review of the FreeDOS submission prep work:

- TUI rendering refactor: src/tui.c emitted ANSI escape sequences via
  printf, which displays as raw text on bare FreeDOS (no ANSI.SYS).
  Add four HAL ops (tui_enter, tui_leave, render_run, set_cursor_shape)
  and route per-cell rendering through them.  POSIX backend keeps the
  ANSI path; DOS backend drives BIOS INT 10h via the existing
  bios_set_cursor / bios_write_char helpers.  The TUI's logical cursor
  goes through the saved orig_locate to avoid recursing through the
  swapped-in gw_hal->locate.

- DOS extended-key mapping: dos_getch returns 0x100 | scancode for
  arrows / F-keys; tui_read_key wasn't translating those to its TK_*
  constants, so the editor never saw arrow keys or F1-F10 on DOS.
  Add a __MSDOS__-conditional translation table in tui_read_key.

- Version banner: GW_VERSION was still 0.16.0 even though the v0.17.0
  release prep was already in CHANGES.TXT.  Bump.

- Compiler PulseAudio link: gwbasic-compile -c hardcoded
  '-lgwrt -lm -lpthread' on the gcc command line.  When libgwrt was
  built against libpulse-simple (the default on any host with the
  PulseAudio dev headers installed), the compile workflow failed with
  'undefined reference to pa_simple_drain'.  CMake now passes
  GWRT_HAS_PULSEAUDIO to gwbasic-compile when libpulse is present, and
  the compiler appends -lpulse-simple to the link line.

- FRE("") garbage collection: the interpreter skipped strpool_gc with a
  comment 'unsafe during expression eval', but that's exactly what real
  GW-BASIC's FRE("") does (and the AOT compiler path already did).  Add
  the GC call; strpool_pin/unpin is the existing escape hatch if a
  caller has live pool pointers on the C stack.  Fixes the string_gc
  compat test.

- Test harness normalization: run_tests.sh stripped trailing whitespace
  on the actual output but not the expected file, causing spurious
  mismatches against golden files captured from real GWBASIC.EXE.
  Normalize both sides identically.  Fixes the peek_gfx mismatch.

- Print_using: snprintf into mantissa[32] with %.*f and an unbounded
  dec triggered a -Wformat-truncation warning.  Clamp dec to 20 (IEEE
  double has at most ~17 significant decimal digits).

- Doc/version consistency: 16-bit binary size reported as 127KB in one
  place and 128KB in three; standardize on 128KB.  HAL backend count
  said '1 file' but is now 2.  CI test count said 'all 66 test
  programs' but is 72.  Add a v0.17.0 row to the development.md table.
  Update getting-started.md DOS section to match the BIOS-rendering
  reality and add a manual TUI verification checklist.

- dos_init now writes back BIOS-reported cols/rows to dos_hal struct
  fields (forward-declared so dos_init can reference it).
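
The extended-key translation from the DOS mapping item above can be sketched as a small lookup. The scancodes are the standard IBM PC ones, but the `TK_*` values, `ext_keys`, and `translate_dos_key` are hypothetical; the log does not show the real table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key constants; the real TK_* values are not in the log. */
enum {
    TK_NONE = 0,
    TK_UP = 0x200, TK_DOWN, TK_LEFT, TK_RIGHT,
    TK_F1            /* F2..F10 are TK_F1 + 1 .. TK_F1 + 9 */
};

#define EXT_KEY_FLAG 0x100   /* marks "0x100 | scancode" values */

struct key_map { unsigned char scancode; int tk; };

/* Standard IBM PC scancodes for the arrow keys. */
static const struct key_map ext_keys[] = {
    { 0x48, TK_UP }, { 0x50, TK_DOWN }, { 0x4B, TK_LEFT }, { 0x4D, TK_RIGHT },
};

int translate_dos_key(int raw)
{
    assert(raw >= 0);
    if (!(raw & EXT_KEY_FLAG))
        return raw;                      /* ordinary character */
    int sc = raw & 0xFF;
    if (sc >= 0x3B && sc <= 0x44)        /* F1..F10 are contiguous */
        return TK_F1 + (sc - 0x3B);
    for (size_t i = 0; i < sizeof ext_keys / sizeof ext_keys[0]; i++)
        if (ext_keys[i].scancode == sc)
            return ext_keys[i].tk;
    return TK_NONE;                      /* unmapped extended key */
}
```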

After these changes: 72/72 interpreter tests pass, compat 68/68
matched, no warnings on the Linux build.
2026-05-03 12:25:41 -04:00
Eremey Valetov
817c26f55f Fix TUI null-pointer crashes and add DOS build docs
Fix 6 crash paths when TUI screen buffer allocation fails (common on
16-bit DOS due to near-heap exhaustion):
- main.c: REPL and AUTO mode fall back to fgets-based line reader
- tui.c: tui_key_on/off/list return early if tui.screen is NULL

Add DOS build documentation to getting-started.md (16-bit and 32-bit
targets, running on FreeDOS). Fix stale version string (0.14.0 -> 0.16.0).
2026-04-10 10:09:22 -04:00
Eremey Valetov
20ecdae938 Add --warn and --safe memory safety flags to the compiler
Three progressive levels for gwbasic-compile:

--warn: static analysis warnings (uninitialized variables, GOTO to
nonexistent line, unreachable code detection). Zero runtime cost.

--safe (implies --warn): runtime checked integer arithmetic via
gw_int_add/sub/mul/neg matching real GW-BASIC overflow semantics,
enhanced array bounds diagnostics with variable names and line numbers,
GOSUB stack overflow diagnostics with source line reporting.

--safe=sanitize (implies --safe): passes -fsanitize=address,undefined
to gcc for full memory error detection.

Also: fix pre-existing missing closing paren in array LET-to-integer
codegen, add strpool_pin/unpin infrastructure, add compiler optimization
flags and memory safety sections to roadmap.

72/72 interpreter tests pass. 64/64 eligible compiler tests pass in
--safe mode.
2026-04-09 13:14:26 -04:00
Eremey Valetov
15af9ad12c Replace em dashes with ASCII -- across all docs
Remove Unicode em dashes (U+2014) and en dashes (U+2013) from all
Markdown files. Use ASCII -- for parenthetical breaks and - for
hyphenation, matching standard plain-text conventions.
2026-03-30 16:50:43 -04:00
Eremey Valetov
3f3104c385 Implement MBF binary file compatibility, fix binary loader null-byte truncation, update to v0.14.0
Binary tokenized SAVE/LOAD now stores float constants in Microsoft Binary
Format (MBF) on disk, matching original GWBASIC.EXE.  A token-walking function
(convert_floats) converts IEEE↔MBF at the save_binary()/load_binary() boundary.

Also fixes a latent bug where load_binary() scanned for 0x00 to find the end
of each token line -- this fails when float bytes contain 0x00 (e.g. MBF for
100.5 is 00 00 49 87).  The loader now uses the next-line pointer to compute
token data length, matching the original's approach.
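
The single-precision IEEE/MBF conversion can be sketched in C. This is an illustration of the format relationship, not the project's convert_floats: the function names are hypothetical, and only 4-byte singles are handled. An MBF exponent byte is the IEEE biased exponent plus 2, and the sign bit takes the place of the mantissa's implied leading 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: IEEE 754 single -> 4 MBF bytes (mantissa low
   byte first, exponent byte last). Returns 0 on success, -1 if the
   value has no MBF representation (Inf, NaN, or exponent overflow). */
int ieee_to_mbf_single(float value, uint8_t out[4])
{
    uint32_t bits;
    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);

    uint32_t sign = bits >> 31;
    uint32_t exp  = (bits >> 23) & 0xFFu;
    uint32_t mant = bits & 0x7FFFFFu;

    if (exp == 0) {                 /* zero or denormal -> MBF zero */
        memset(out, 0, 4);
        return 0;
    }
    if (exp + 2 > 0xFFu)            /* Inf/NaN or too large for MBF */
        return -1;

    out[0] = (uint8_t)(mant & 0xFFu);
    out[1] = (uint8_t)((mant >> 8) & 0xFFu);
    out[2] = (uint8_t)(((mant >> 16) & 0x7Fu) | (sign << 7));
    out[3] = (uint8_t)(exp + 2);
    return 0;
}

/* Hypothetical inverse. An MBF exponent byte of 0 means zero; values
   too small for a normal IEEE single are flushed to zero here. */
float mbf_single_to_ieee(const uint8_t in[4])
{
    if (in[3] <= 2)
        return 0.0f;
    uint32_t bits = ((uint32_t)(in[2] & 0x80u) << 24)
                  | ((uint32_t)(in[3] - 2u) << 23)
                  | ((uint32_t)(in[2] & 0x7Fu) << 16)
                  | ((uint32_t)in[1] << 8)
                  | (uint32_t)in[0];
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For 100.5 this yields the bytes 00 00 49 87 quoted in the commit message. Two of the four are 0x00, which is why scanning for a terminator byte truncated the line.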
2026-03-21 01:38:53 -04:00
Eremey Valetov
c488cb526d Update Sphinx docs with v0.14.0 roadmap and fix stale references
Expand roadmap with detailed implementation plans for the next three
features: MBF binary file compatibility (token-stream conversion at
load/save boundary), hardware I/O simulator (portio.c with PIT/speaker/
CGA/joystick port emulation), and DRAW command fixes (M parsing bug,
scale semantics, A rotation, TA/variable substitution). Remove hardware
I/O from known limitations (moving to planned). Fix stale test counts
(64 -> 66) and version string (0.11.0 -> 0.13.0) across docs.
2026-03-11 08:11:13 -04:00
Eremey Valetov
0743757029 Implement DEF SEG/PEEK/POKE, GET/PUT sprites, fix PRINT USING, update to v0.11.0
2026-03-01 13:07:28 -05:00
Eremey Valetov
e7f35c21ff Implement binary SAVE/LOAD, INKEY$ extended keys, golden tests, update to v0.10.0
Binary SAVE/LOAD: SAVE now writes tokenized binary by default (0xFF header
format), matching original GW-BASIC behavior. SAVE "file",A for ASCII.
LOAD auto-detects binary vs ASCII from the first byte. Command-line file
loading also auto-detects, so binary .BAS files just work.
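
The first-byte auto-detection can be sketched in C. `looks_tokenized` is a hypothetical name, and the sketch checks only the 0xFF header byte mentioned above; any further validation the real LOAD performs is not shown in the log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a file whose first byte is 0xFF is treated as
   tokenized binary; anything else is read as ASCII program text. */
int looks_tokenized(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    return len > 0 && buf[0] == 0xFF;
}
```

An ASCII listing begins with a line-number digit or whitespace, so it cannot be mistaken for the binary form.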

INKEY$ extended keys: arrow keys, Home/End/PgUp/PgDn, Insert/Delete, and
F1-F10 now return the correct CHR$(0) + scan_code two-byte sequences per
the IBM PC convention. Refactored event trap key parsing to use tui_read_key()
instead of duplicating escape sequence parsing.

Golden-file regression tests: generated .expected output files for 55 of 58
test programs (3 timing-dependent tests excluded). The test runner now
reports compat match status alongside pass/fail.

Classic programs: added Hamurabi, Lunar Lander, Gunner, and Diamond from
David Ahl's BASIC Computer Games (1978) in tests/classic/ for manual
compatibility testing.

Docs updated with compiler roadmap item and hardware I/O simulator plan.
2026-03-01 12:25:47 -05:00
Eremey Valetov
3fa8c6f034 Implement EDIT statement and ON TIMER/ON KEY event trapping, update to v0.9.0
Add event-driven programming: ON TIMER(n) GOSUB with TIMER ON/OFF/STOP,
ON KEY(n) GOSUB with KEY(n) ON/OFF/STOP for F1-F10. Fix F-key escape
sequence parser (F9/F10 detection, push back consumed bytes on unmatched
sequences). Add EDIT statement for TUI line editing. Guard key trap
polling so keystrokes aren't consumed when no traps are configured.
2026-02-27 17:29:09 -05:00
Eremey Valetov
5105ecafd6 Allow RND without parentheses, update Sphinx docs to v0.8.0
RND can now be called without parentheses (equivalent to RND(1)),
matching real GW-BASIC behavior for legacy code compatibility.

Update all Sphinx documentation pages to reflect features added in
v0.6.0-0.8.0: DATE$/TIME$/TIMER, FILES/SHELL/CHDIR/MKDIR/RMDIR,
AUTO/RENUM/DELETE, COMMON, LPRINT/LLIST with --lpt, --full TUI flag,
dynamic screen buffer, 54 tests.
2026-02-22 14:09:18 -05:00
Eremey Valetov
ad21350003 Add full-screen TUI editor, DOSBox-X compat testing, rename to GW-BASIC 2026
Authentic GW-BASIC screen editor with 25x80 buffer, free cursor movement,
enter-on-any-line, F1-F10 function keys, Insert/Overwrite toggle, KEY
ON/OFF/LIST statement, and Ctrl+Break handling. HAL pointer swap routes
all PRINT/LIST/error output through the TUI automatically. Piped mode
unchanged (50/50 tests pass).

Adds automated compatibility testing infrastructure: DOSBox-X headless
config, PRINT-to-file transform script, and run_compat.sh with --generate
and --compare modes for verifying output against real GWBASIC.EXE.

Project renamed from gwbasic-c to GW-BASIC 2026.
2026-02-22 12:18:17 -05:00
Eremey Valetov
c2684712af Add Sphinx documentation site with GitHub Pages deployment
Furo theme, 6 pages split from existing docs, auto-deploys on push.
2026-02-15 16:56:02 -05:00