gw-basic-2026/docs/roadmap.md
Eremey Valetov f207d74aec codegen fixes, --no-gc-check / --fast-math, raise caps, DATE$/TIME$ shift
Four roadmap items:

- codegen: fix parenthesized string comparison.  emit_atom didn't
  consume the body of a string-literal token (`"`), so for
  PRINT (A$+B$ < "ZZZ") it emitted a 0 placeholder, advanced one byte,
  and left "ZZZ" to be reparsed as a variable + extra trailing tokens
  -- the binary then failed to link with `var_ZZ_sng` undeclared.
  emit_atom now skips to the closing quote.  Separately, the
  left_type tracking in emit_num_prec dropped VT_STR after a string +
  string concat (becoming VT_SNG), so the string-comparison codepath
  was skipped when the relational operator arrived.  Preserve VT_STR
  through TOK_PLUS when both operands are strings.  Verified: paren
  string-cmp now compiles and produces the same -1 / 0 result as the
  interpreter.

- compiler: --no-gc-check and --fast-math optimization flags.
  --no-gc-check skips the per-line gwrt_check_line() (no string-pool
  GC, no Ctrl+Break trap).  --fast-math drops the divide-by-zero
  guard on `/`; the divisor still goes through (double) so 10/0
  produces inf rather than SIGFPE.  Both threaded through
  codegen_opts_t and exposed in --help.  --inline-arrays from the
  roadmap deferred -- larger refactor.

- interp: raise static caps on 32-bit / Linux builds.  vars 256
  -> 1024, arrays 64 -> 256, MAX_FOR_DEPTH 16 -> 64, MAX_GOSUB_DEPTH
  24 -> 128, MAX_WHILE_DEPTH 16 -> 64.  Codegen FOR_STACK_MAX 16
  -> 64.  Analysis-pass caps: MAX_LINES 4096 -> 8192, MAX_VARS 256
  -> 1024, MAX_GOTOS 256 -> 1024, MAX_DATA 1024 -> 4096,
  MAX_GOSUB_RET 256 -> 1024.  16-bit DOS keeps the original modest
  caps via #ifdef _M_I86 -- the MEDIUM model has a single 64KB
  DGROUP for all static data and the bumped sizes broke runtime
  startup under DOSBox-X.  The 16-bit binary grew from 128KB to 132KB
  (the offset_secs field plus the DATE$/TIME$ shift code), still well
  within the FreeDOS budget.

- interp + codegen: DATE$ / TIME$ assignment via process-local
  clock offset.  Was a no-op accept-and-ignore.  Now sets
  gw.time_offset_secs (long), and DATE$ / TIME$ / TIMER readers
  apply it to time(NULL) before formatting.  The OS clock is
  unaffected (would need root).  Compiled-binary readers also
  reference gw.time_offset_secs since libgwrt shares the gw
  struct.  Verified: PRINT DATE$; DATE$="12-31-1999"; PRINT DATE$
  shows the expected before/after in both interpreter and AOT
  paths.

After these changes: 72/72 interpreter tests, 68/68 compat, 63/63
compiler tests, DOS smoke under DOSBox-X all pass.  Build clean on
both Linux (cmake) and 16-bit DOS (build_dos.sh 16).
2026-05-04 18:56:58 -04:00

Roadmap

Completed

Ahead-of-Time Compiler (v0.16.0)

gwbasic-compile translates tokenized .bas programs to C source, then invokes GCC to produce native executables linked against libgwrt.a.

Pipeline: .bas → gw_crunch() → analysis pass → C codegen → gcc → native binary.

63 of 63 eligible tests pass (100%) via tests/run_compiler_tests.sh. The harness only skips hardware-dependent tests (graphics/sound/timer) and CHAIN/RUN target files that aren't standalone. The compiler now accepts unnumbered direct-mode programs by auto-numbering them.

Language coverage:

  • All statements: PRINT, LET, IF/THEN/ELSE, GOTO, GOSUB/RETURN, FOR/NEXT, WHILE/WEND, ON GOTO/GOSUB, ON ERROR GOTO, RESUME/RESUME NEXT, DIM, DEF FN, SWAP, READ/DATA/RESTORE, INPUT/LINE INPUT, OPEN/CLOSE/PRINT#/INPUT#/WRITE#, FIELD/LSET/RSET/GET/PUT, BSAVE/BLOAD, SAVE/LOAD, CHAIN/COMMON, SCREEN, PSET/PRESET, COLOR/LOCATE/CLS, CIRCLE/DRAW/PAINT/PLAY, VIEW/WINDOW/PALETTE, POKE/OUT/WAIT, DEF SEG, RANDOMIZE, CLEAR, MID$ assignment, ERROR, KILL/NAME/FILES/SHELL/MKDIR/CHDIR/RMDIR, ENVIRON, LPRINT/LLIST, WIDTH, KEY
  • All operators: + - * / \ MOD ^ AND OR XOR NOT EQV IMP > < = <= >= <> (including string comparison via strcmp)
  • All functions: math, string, file, conversion (CVI/CVS/CVD/MKI$/MKS$/MKD$), graphics (POINT/PMAP), system (FRE/ERR/ERL/TIMER/DATE$/TIME$/ENVIRON$/INKEY$)
  • Token embedding for complex statements (PRINT USING, DEF FN, graphics, file I/O, MID$ assignment) with selective variable sync
  • Division-by-zero detection, RNG matching (gw_rnd), ON ERROR GOTO via setjmp/longjmp
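
As an illustration of the "string comparison via strcmp" item, here is a minimal sketch of how a relational operator on two strings could be reduced to the BASIC truth values -1 and 0. The names `rel_op_t` and `str_rel` are hypothetical and are not taken from the project's codegen or runtime. The sketch also assumes NUL-terminated strings; a real runtime may need a length-aware comparison, since BASIC strings can contain CHR$(0).

```c
#include <string.h>

/* Hypothetical operator tags for this sketch; not the project's token values. */
typedef enum { REL_LT, REL_LE, REL_EQ, REL_NE, REL_GE, REL_GT } rel_op_t;

/* Returns the BASIC truth value for `a op b`: -1 if true, 0 if false. */
int str_rel(const char *a, const char *b, rel_op_t op)
{
    int c = strcmp(a, b);   /* negative, zero, or positive */
    int truth = 0;

    switch (op) {
    case REL_LT: truth = (c <  0); break;
    case REL_LE: truth = (c <= 0); break;
    case REL_EQ: truth = (c == 0); break;
    case REL_NE: truth = (c != 0); break;
    case REL_GE: truth = (c >= 0); break;
    case REL_GT: truth = (c >  0); break;
    }
    return truth ? -1 : 0;
}
```

With this helper, an expression such as `(A$+B$ < "ZZZ")` would first concatenate the operands and then call `str_rel(concat, "ZZZ", REL_LT)`.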

Optimizations:

  • Constant folding (compile-time arithmetic on literals)
  • Dead code elimination (skip statements after GOTO/END/STOP)
  • FOR step=1 elision (var++ instead of step variable, simple comparison)
  • Fast-path expression emitter (skip buffering for common case)
  • Selective variable sync in delegated statements
  • REM-line skip (no runtime check for comment-only lines)
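
To make the FOR step=1 elision concrete, the sketch below contrasts a general loop shape (step held in a variable, comparison direction chosen by the sign of the step) with the simplified shape that is possible when the step is known to be 1 at compile time. Both functions are hand-written illustrations with hypothetical names; they are not the code that gwbasic-compile emits.

```c
/* General shape: the step is a run-time value, so the loop test
 * has to pick its direction from the sign of the step. */
double sum_general(double start, double limit, double step)
{
    double total = 0.0;
    for (double i = start; step >= 0.0 ? i <= limit : i >= limit; i += step)
        total += i;
    return total;
}

/* Elided shape: step is the constant 1, so there is no step variable
 * and the test is a single comparison. */
double sum_step1(double start, double limit)
{
    double total = 0.0;
    for (double i = start; i <= limit; i++)
        total += i;
    return total;
}
```

For `FOR I = 1 TO 10`, both functions visit the same values; the second simply avoids the extra variable and the sign check on every iteration.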

Hardware I/O Simulator (v0.15.0)

Implemented in portio.c / portio.h following the virmem.c dispatch pattern. Emulates 8253 PIT channel 2 (speaker frequency), PPI port B (speaker on/off with continuous tone via PulseAudio), CGA mode/color registers, game port (joystick stub), and COM1 serial (transmitter-ready stub). Default: reads return 0xFF (floating bus), writes discarded.
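
The default-plus-dispatch behaviour described above can be sketched roughly as follows. This is an assumption-laden illustration, not the contents of portio.c: the function names (`sim_out`, `sim_in`, `sim_speaker_on`, `sim_speaker_hz`) are hypothetical, and only two ports are modelled. The port numbers (0x42 for PIT channel 2 data, 0x61 for PPI port B) and the 1,193,182 Hz PIT input clock are the standard IBM PC values.

```c
#include <stdint.h>

static uint8_t  ppi_port_b;     /* last value written to port 0x61 */
static uint16_t pit2_divisor;   /* PIT channel 2 reload value */
static int      pit2_hi_next;   /* 0: next write is low byte, 1: high byte */

/* OUT port, value: dispatch on the port number; unknown ports are discarded. */
void sim_out(uint16_t port, uint8_t value)
{
    switch (port) {
    case 0x42:  /* PIT channel 2 data, written low byte then high byte */
        if (!pit2_hi_next)
            pit2_divisor = (uint16_t)((pit2_divisor & 0xFF00u) | value);
        else
            pit2_divisor = (uint16_t)((pit2_divisor & 0x00FFu) |
                                      ((uint16_t)value << 8));
        pit2_hi_next = !pit2_hi_next;
        break;
    case 0x61:  /* PPI port B: bits 0 and 1 gate the speaker */
        ppi_port_b = value;
        break;
    default:    /* unemulated port: write is discarded */
        break;
    }
}

/* INP(port): unknown ports read as 0xFF (floating bus). */
uint8_t sim_in(uint16_t port)
{
    switch (port) {
    case 0x61: return ppi_port_b;
    default:   return 0xFF;
    }
}

/* The speaker sounds only when both the gate and data bits are set. */
int sim_speaker_on(void)
{
    return (ppi_port_b & 0x03) == 0x03;
}

/* Tone frequency implied by the current divisor (0 if none is loaded). */
double sim_speaker_hz(void)
{
    return pit2_divisor ? 1193182.0 / pit2_divisor : 0.0;
}
```

A BASIC program that runs `OUT &H42, &HA9 : OUT &H42, &H04 : OUT &H61, 3` would, under this sketch, load a divisor of 1193 and enable a tone of roughly 1000 Hz.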

Also in v0.15.0: 100% token coverage (all 144 GW-BASIC tokens handled), string space pool with compacting garbage collector, RESET, ENVIRON/ENVIRON$, ERDEV/ERDEV$, IOCTL/IOCTL$, LCOPY, DATE$/TIME$ assignment, CALL, COM.

Jupyter Kernel (v0.15.0)

gwbasickernel/ -- Jupyter notebook kernel using the persistent subprocess model with sentinel protocol.

  • Inline Sixel graphics -- pure-Python Sixel decoder renders SCREEN commands as inline PNG images in the notebook
  • INPUT statement support via Jupyter stdin protocol
  • Pygments syntax highlighting for code cells
  • Tab completion for all GW-BASIC keywords
  • Magic commands: %reset, %timeout, %new

Install: pip install -e . && gwbasickernel-install --user

Compiler Memory Safety (v0.17.0)

--warn, --safe, and --safe=sanitize flags for the ahead-of-time compiler.

  • --warn -- static analysis: uninitialized variables, GOTO to nonexistent line, unreachable code detection. Zero runtime cost.
  • --safe (implies --warn) -- checked integer arithmetic via gw_int_add/sub/mul/neg (raises Overflow instead of wrapping), enhanced array bounds diagnostics with variable names and line numbers, GOSUB stack overflow diagnostics, ABS/SGN type-preserving codegen, string pool GC pinning infrastructure
  • --safe=sanitize -- above plus -fsanitize=address,undefined passed to gcc
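
The checked-arithmetic idea behind --safe can be sketched as below. The roadmap names the helpers gw_int_add/sub/mul/neg but does not give their signatures, so the `sketch_` functions here are hypothetical stand-ins, and recording an error code in a global is an assumption about how Overflow might be reported. The 16-bit integer range and error code 6 (Overflow) are standard GW-BASIC.

```c
#include <stdint.h>

#define SKETCH_ERR_OVERFLOW 6   /* GW-BASIC error code for "Overflow" */

/* Hypothetical error slot; a real runtime would raise the error instead. */
int sketch_last_error;

/* Narrow a 32-bit intermediate to a 16-bit BASIC integer, flagging
 * Overflow instead of silently wrapping. */
static int16_t sketch_narrow(int32_t wide)
{
    if (wide < INT16_MIN || wide > INT16_MAX) {
        sketch_last_error = SKETCH_ERR_OVERFLOW;
        return 0;
    }
    return (int16_t)wide;
}

int16_t sketch_int_add(int16_t a, int16_t b)
{
    return sketch_narrow((int32_t)a + (int32_t)b);
}

int16_t sketch_int_mul(int16_t a, int16_t b)
{
    return sketch_narrow((int32_t)a * (int32_t)b);
}

int16_t sketch_int_neg(int16_t a)
{
    return sketch_narrow(-(int32_t)a);   /* -(-32768) does not fit in 16 bits */
}
```

Doing the arithmetic in 32 bits keeps the check simple and portable; compiler builtins such as GCC's `__builtin_add_overflow` are another option.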

DOS / FreeDOS Target (v0.17.0)

Cross-compiles to DOS using OpenWatcom V2. Two targets:

  • 16-bit real-mode (Makefile.dos16): 128KB standalone MZ executable, MEDIUM memory model, far-heap TUI screen buffer, no DOS extender required
  • 32-bit DOS/4GW (Makefile.dos): 175KB LE executable, flat memory model, requires DOS4GW.EXE extender

Tested on FreeDOS 1.4 via QEMU.

Next Up

Compiler Optimization Flags

  • --no-gc-check -- skip gwrt_check_line() per-line calls (no string pool GC, no Ctrl+Break check) for maximum throughput
  • --inline-arrays -- emit direct array indexing for statically-DIMmed arrays instead of runtime gwrt_array_elem() lookup
  • --fast-math -- skip division-by-zero checks, allow unsafe float ops
  • -O0 through -O3 -- compiler-level optimization tiers mapping to different sets of codegen optimizations (constant folding, dead code elimination, FOR step=1 elision, fast-path expressions)
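
As a rough picture of what --fast-math trades away, the sketch below shows a guarded division next to an unguarded one. The function names and the error-reporting convention are hypothetical. Error code 11 is GW-BASIC's "Division by zero". The unguarded path assumes IEEE 754 doubles with the default floating-point environment, where dividing a nonzero value by zero yields infinity rather than a trap.

```c
#include <math.h>

#define SKETCH_ERR_DIV_ZERO 11   /* GW-BASIC error code for "Division by zero" */

/* Guarded form: test the divisor first and report the BASIC error. */
double div_guarded(double a, double b, int *err)
{
    if (b == 0.0) {
        *err = SKETCH_ERR_DIV_ZERO;
        return 0.0;
    }
    return a / b;
}

/* Unguarded form: no test at all. Because both operands are doubles,
 * 10/0 evaluates to +inf instead of raising SIGFPE as an integer
 * division by zero typically would. */
double div_fast(double a, double b)
{
    return a / b;
}
```

The guard costs one comparison and branch per division, which is the overhead such a flag would remove from tight numeric loops.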

IDE Integration

  • VS Code extension -- syntax highlighting (TextMate grammar), snippets, run/debug tasks, integrated terminal runner
  • JetBrains plugin (IntelliJ/CLion) -- syntax highlighting, code completion, run configurations, debugger integration (breakpoints via STOP, variable inspection), structure view (line number outline)

Known Limitations

  • Static caps -- 32-bit / Linux builds: 1024 variables, 256 arrays, 64 FOR nesting, 128 GOSUB nesting, 64 WHILE nesting. 16-bit real-mode DOS keeps the original modest caps (256 / 64 / 16 / 24 / 16) because the MEDIUM model has a single 64KB DGROUP for all static data.
  • CALL/CALLS (machine code execution) raises Illegal function call
  • DATE$/TIME$ assignment shifts the program's view of the clock via a process-local offset; the OS time is unaffected (setting the OS clock would require root)
  • Device stubs (ERDEV, IOCTL, COM, LCOPY) return defaults
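
The process-local clock offset mentioned in the DATE$/TIME$ item can be sketched as follows. This is an illustration only: `sketch_offset_secs` stands in for the offset field, the function names are hypothetical, and the real-time value is passed in as a parameter purely to keep the sketch deterministic. It assumes a POSIX-style `time_t` that counts seconds.

```c
#include <time.h>

/* Stand-in for the process-local offset; the OS clock is never touched. */
static long sketch_offset_secs;

/* The program's view of "now": real time plus the stored offset. */
time_t sketch_shifted(time_t real_now)
{
    return real_now + (time_t)sketch_offset_secs;
}

/* DATE$ = "mm-dd-yyyy": keep the current time of day, move the date,
 * and remember the difference as the offset. Returns 0 on success. */
int sketch_set_date(time_t real_now, int month, int day, int year)
{
    struct tm *cur = localtime(&real_now);
    if (!cur)
        return -1;

    struct tm want = *cur;
    want.tm_mon   = month - 1;
    want.tm_mday  = day;
    want.tm_year  = year - 1900;
    want.tm_isdst = -1;             /* let mktime work out DST */

    time_t target = mktime(&want);
    if (target == (time_t)-1)
        return -1;

    sketch_offset_secs = (long)difftime(target, real_now);
    return 0;
}

/* PRINT DATE$: format the shifted time as mm-dd-yyyy. */
int sketch_format_date(time_t real_now, char out[11])
{
    time_t t = sketch_shifted(real_now);
    struct tm *tm = localtime(&t);
    if (!tm)
        return -1;
    return strftime(out, 11, "%m-%d-%Y", tm) == 10 ? 0 : -1;
}
```

After `sketch_set_date(now, 12, 31, 1999)`, formatting with the same `now` gives `12-31-1999`, while `time(NULL)` itself is unchanged.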