Version 0.16.0 consolidates the major features added since v0.14.0:
Ahead-of-time compiler (gwbasic-compile):
- Translates .bas programs to C source → GCC → native executables
- 64 of 72 test programs produce correct output (89%)
- Zero compile errors — all 72 programs compile successfully
- Token embedding for complex statements (PRINT USING, DEF FN,
graphics, file I/O, MID$ assignment)
- String comparison, division-by-zero detection, ON ERROR GOTO/RESUME
- libgwrt.a runtime library from existing interpreter modules
Jupyter kernel (gwbasickernel):
- Persistent subprocess with sentinel protocol
- Inline Sixel graphics rendering (pure-Python decoder → PNG)
- INPUT statement support via Jupyter stdin protocol
- Pygments GW-BASIC syntax highlighter
Hardware I/O simulator (portio.c):
- 8253 PIT, PPI speaker, CGA mode/color, COM1, game port
- Continuous tone via PulseAudio pthread worker
Interpreter improvements:
- 100% token coverage (all 144 GW-BASIC tokens handled)
- String space pool with compacting garbage collector
- RESET, ENVIRON/ENVIRON$, ERDEV/ERDEV$, IOCTL/IOCTL$,
LCOPY, DATE$/TIME$ assignment, CALL, COM
Architecture
Pipeline
Source text → Tokenizer (CRUNCH) → Token stream
↓
Expression evaluator (FRMEVL)
↓
Statement dispatcher (NEWSTT)
↓
TUI screen buffer (interactive)
↓
HAL (platform I/O)
The interpreter mirrors the original GW-BASIC's internal pipeline, which — like
most Microsoft interpreters of the era — is a tight loop around three core
routines. CRUNCH tokenizes source lines into a compact byte stream. NEWSTT
dispatches each statement. FRMEVL evaluates expressions. All platform I/O
goes through a HAL vtable (hal_ops_t), so the core interpreter has no idea
whether it's talking to an ANSI terminal or a teletype from 1975.
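As a rough illustration of the CRUNCH step, the sketch below replaces keywords with single-byte tokens and copies everything else through, leaving string literals untouched. The keyword table, the token values, and the names `crunch_line` and `match_keyword` are assumptions made for this example and are not taken from tokenizer.c. A real tokenizer handles cases this sketch ignores, such as REM and DATA bodies.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative keyword table. The token values are placeholders chosen
 * for this sketch, not the project's real token assignments. */
typedef struct {
    const char *word;
    uint8_t token;
} keyword_t;

static const keyword_t KEYWORDS[] = {
    { "PRINT", 0x80 }, { "GOTO", 0x81 }, { "IF", 0x82 }, { "THEN", 0x83 },
};
#define KEYWORD_COUNT (sizeof KEYWORDS / sizeof KEYWORDS[0])

/* Case-insensitive prefix match. Returns the keyword length, or 0 if the
 * text at src does not start with the keyword. */
static size_t match_keyword(const char *src, const keyword_t *kw)
{
    size_t n = strlen(kw->word);
    for (size_t i = 0; i < n; i++) {
        if (toupper((unsigned char)src[i]) != kw->word[i])
            return 0; /* also stops at the terminating NUL of src */
    }
    return n;
}

/* Tokenize one source line into out (at most cap bytes).
 * Returns the number of bytes written. */
size_t crunch_line(const char *src, uint8_t *out, size_t cap)
{
    size_t len = 0;
    int in_string = 0;

    assert(src != NULL && out != NULL);
    while (*src != '\0' && len < cap) {
        if (*src == '"')
            in_string = !in_string; /* keywords inside quotes stay as text */
        if (!in_string) {
            size_t matched = 0;
            for (size_t k = 0; k < KEYWORD_COUNT && matched == 0; k++) {
                matched = match_keyword(src, &KEYWORDS[k]);
                if (matched)
                    out[len++] = KEYWORDS[k].token;
            }
            if (matched) {
                src += matched;
                continue;
            }
        }
        out[len++] = (uint8_t)*src++;
    }
    return len;
}
```

With this table, `if x then goto 10` shrinks from 17 characters to 10 bytes, while the `IF` inside `PRINT "IF"` is copied through unchanged.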
In interactive mode, the TUI layer swaps in its own HAL function pointers and redirects all output through a dynamically allocated screen buffer, displayed via ANSI escape sequences. In piped mode the TUI stays out of the way and the HAL writes straight to stdout.
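The pointer swap described above can be sketched roughly as follows. Only the type name `hal_ops_t` comes from this project; the single `put_char` entry, the capture buffer, and the function names are assumptions for illustration, and the real vtable presumably has many more entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, heavily reduced HAL vtable with a single output hook. */
typedef struct hal_ops {
    void (*put_char)(int ch);
} hal_ops_t;

/* Piped-mode backend: write straight to stdout. */
static void stdout_put_char(int ch)
{
    putchar(ch);
}

/* Stand-in for the TUI backend: collect output in a buffer instead. */
static char capture_buf[256];
static size_t capture_len;

static void capture_put_char(int ch)
{
    if (capture_len + 1 < sizeof capture_buf) {
        capture_buf[capture_len++] = (char)ch;
        capture_buf[capture_len] = '\0';
    }
}

/* The active backend. Core code only ever calls through this table. */
static hal_ops_t hal = { stdout_put_char };

/* Core-side output routine: unaware of which backend is installed. */
void core_print(const char *s)
{
    assert(s != NULL);
    while (*s != '\0')
        hal.put_char(*s++);
}

/* Roughly what a TUI init routine might do: swap the function pointer. */
void install_capture_backend(void)
{
    capture_len = 0;
    capture_buf[0] = '\0';
    hal.put_char = capture_put_char;
}

const char *captured_text(void)
{
    return capture_buf;
}
```

After `install_capture_backend()`, a call such as `core_print("OK")` lands in the buffer rather than on stdout, and `core_print` itself needed no changes.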
Module Map
| Module | Source | Original Assembly |
|---|---|---|
| Tokenizer (CRUNCH/LIST) | tokenizer.c | GWMAIN.ASM |
| Expression evaluator | eval.c | GWEVAL.ASM |
| Execution loop + control flow | interp.c | BINTRP.ASM |
| TUI screen editor | tui.c | — |
| Graphics engine | graphics.c | — |
| Token/keyword tables | tokens.c, tokens.h | IBMRES.ASM |
| Error handling | error.c | GWDATA.ASM |
| Integer arithmetic | math_int.c | MATH1.ASM |
| Float ops + MBF conversion | math_float.c | MATH2.ASM |
| Transcendentals | math_transcend.c | MATH1.ASM |
| String functions | strings.c | BISTRS.ASM |
| PRINT / LPRINT | print.c | BINTRP.ASM |
| PRINT USING | print_using.c | BIPRTU.ASM |
| Variables + arrays | vars.c, arrays.c | GWMAIN.ASM |
| File I/O + random access | fileio.c | BIPTRG.ASM |
| Program I/O (SAVE/LOAD) | program_io.c | BIMISC.ASM |
| INPUT/LINE INPUT | input.c | BINTRP.ASM |
| Sound engine | sound.c | — |
| Virtual memory (PEEK/POKE) | virmem.c | — |
| Hardware I/O ports | portio.c | — |
| String space pool + GC | strpool.c | GWEVAL.ASM (GETSPA/GARBAG) |
| AOT compiler analysis | analysis.c | — |
| AOT compiler codegen | codegen.c | — |
| Compiled program runtime | gwrt.c | — |
| Platform abstraction | hal_posix.c | OEM*.ASM |
Source Layout
src/ — core interpreter + compiler (27 files)
include/ — headers (18 files)
platform/ — HAL backends (1 file)
gwbasickernel/ — Jupyter notebook kernel (Python, 6 files)
tests/ — 72 automated test programs, 4 classic interactive programs, compat harness
docs/ — Sphinx documentation
TUI Architecture
The TUI (tui.c) implements the classic GW-BASIC full-screen editor:
- Screen buffer — tui_cell_t *screen is dynamically allocated at rows × cols (default 25×80, or full terminal size with --full). Each cell stores a character and color attribute, accessed via TUI_CELL(r, c).
- HAL interception — tui_init() swaps HAL function pointers so all existing PRINT/LIST/error output automatically goes through the screen buffer. No changes needed to print.c, error.c, or most of interp.c.
- Line editor — tui_read_line() implements the defining GW-BASIC UX: free cursor movement with arrow keys, and pressing Enter on any screen line re-enters that line's content as BASIC input.
- Function keys — F1-F10 with default GW-BASIC bindings, configurable via the KEY n, "string" statement. KEY ON shows the bar on the bottom row.
- Break handling — SIGINT sets a flag checked each statement in the run loop.
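A minimal sketch of the screen buffer idea, for readers who want to see the indexing. The names `tui_cell_t`, `screen`, and `TUI_CELL` come from the description above; the field names, the helper functions, and the default attribute 0x07 (light grey on black in CGA terms) are assumptions for this example, not the layout used in tui.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One screen cell: a character plus a color attribute.
 * Field names are assumed for this sketch. */
typedef struct {
    uint8_t ch;
    uint8_t attr;
} tui_cell_t;

static tui_cell_t *screen;
static int screen_rows, screen_cols;

/* Row-major access into the flat, dynamically sized buffer. */
#define TUI_CELL(r, c) \
    (screen[(size_t)(r) * (size_t)screen_cols + (size_t)(c)])

/* Hypothetical helper: blank one row. */
static void clear_row(int r)
{
    for (int c = 0; c < screen_cols; c++) {
        TUI_CELL(r, c).ch = ' ';
        TUI_CELL(r, c).attr = 0x07;
    }
}

/* Allocate rows x cols cells and blank them. Returns 0 on success. */
int screen_alloc(int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    screen = malloc((size_t)rows * (size_t)cols * sizeof *screen);
    if (screen == NULL)
        return -1;
    screen_rows = rows;
    screen_cols = cols;
    for (int r = 0; r < rows; r++)
        clear_row(r);
    return 0;
}

/* Scroll the contents up by one line and blank the bottom row. */
void screen_scroll_up(void)
{
    size_t bytes = (size_t)(screen_rows - 1) * (size_t)screen_cols
                   * sizeof *screen;
    memmove(&TUI_CELL(0, 0), &TUI_CELL(1, 0), bytes);
    clear_row(screen_rows - 1);
}

void screen_free(void)
{
    free(screen);
    screen = NULL;
}
```

Because the dimensions live in variables rather than constants, the same code serves the default 25×80 layout and a full-terminal layout.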
Design Decisions
Relation to Original Assembly
Microsoft released the original GW-BASIC source
in 2020 — 43,771 lines of 8088 assembly spread across 43 .ASM files, complete
with Greg Whitten's comments and Neil Konzen's transcendental math routines
(which are, frankly, impressive for software floating point on a 16-bit CPU). This reimplementation
uses that assembly as a reference, not as input to a transliterator — the
algorithms are reimplemented in idiomatic C with modern data structures.
Key Differences from the Original
- IEEE 754 floating point — MBF (Microsoft Binary Format) conversion is used at the binary save/load boundary and for file I/O (CVI/CVS/CVD, MKI$/MKS$/MKD$), matching the original's on-disk format
- Dynamic memory allocation — malloc/free instead of a 64KB segment layout
- String space pool — 32KB contiguous pool with compacting GC at statement boundaries, matching the original's GETSPA/GARBAG approach
- setjmp/longjmp — for error recovery, matching the original's stack reset behavior
- ANSI terminal — TUI uses ANSI escape sequences and alternate screen buffer instead of direct CGA memory access
- Dynamic screen buffer — allocated at runtime based on terminal size, rather than hardcoded to 25×80
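To make the MBF boundary from the first bullet concrete, here is a sketch of decoding a 4-byte MBF single into an IEEE 754 float. The function name `mbf32_to_ieee` is hypothetical and this is not the code in math_float.c. It relies on the MBF single layout (three little-endian mantissa bytes with the sign in the top bit of the third, followed by an exponent byte biased by 128 over a 0.1m mantissa) and assumes the host `float` is IEEE 754 binary32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a 4-byte MBF single, in on-disk byte order, to an IEEE float.
 *   mbf[0..1] : low and middle mantissa bytes
 *   mbf[2]    : bit 7 = sign, bits 6..0 = high mantissa bits
 *   mbf[3]    : exponent, 0 means the value is zero
 * MBF stores 0.1m x 2^(e-128); IEEE stores 1.m x 2^(E-127), so E = e - 2. */
float mbf32_to_ieee(const uint8_t mbf[4])
{
    uint32_t sign, mantissa, bits;
    uint8_t exponent;
    float out;

    assert(mbf != NULL);
    exponent = mbf[3];
    sign = (uint32_t)(mbf[2] & 0x80) << 24;

    if (exponent == 0) {
        return 0.0f; /* MBF zero, whatever the mantissa bytes hold */
    }
    if (exponent <= 2) {
        /* Would be an IEEE subnormal; this sketch flushes to signed zero. */
        bits = sign;
    } else {
        mantissa = ((uint32_t)(mbf[2] & 0x7F) << 16)
                 | ((uint32_t)mbf[1] << 8)
                 | (uint32_t)mbf[0];
        bits = sign | ((uint32_t)(exponent - 2) << 23) | mantissa;
    }
    memcpy(&out, &bits, sizeof out); /* reinterpret the bit pattern */
    return out;
}
```

For instance, the MBF bytes `00 00 00 81` decode to 1.0 and `00 00 A0 84` decode to -10.0. The encoding direction (used by MKS$-style functions) would reverse the same steps and also has to handle IEEE values that fall outside the MBF range.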