Commit Graph

10 Commits

Author SHA1 Message Date
Eremey Valetov
efd41dceb1 add uc2.1 man page and install rules
mdoc man page covering all modes and the OTS/ingest long options,
verified with groff and NetBSD mandoc. CMake installs the binary and
the man page (guarded against add_subdirectory embedding). Also
corrects the stale direction-1 comment in the DOSBox round-trip
script: multi-file archives created by v3 have extracted fine in the
original since the custom-Huffman-tree fix.
2026-06-11 15:17:50 -04:00
Eremey Valetov
bd0d1911b1 djgpp: DOSBox-X smoke test for the cross-compiled uc2.exe
tests/scripts/dos_smoke.sh runs the DJGPP-built uc2 inside DOSBox-X
via the flatpak and asserts:
- uc2 -h loads under a real DPMI host and prints the banner
- uc2 -l <archive> opens an existing UC2 archive and produces output

Skips cleanly when any of uc2.exe, CWSDPMI.EXE, or DOSBox-X is
missing.  CWSDPMI.EXE is the standard DJGPP DPMI host from
csdpmi7b.zip; fetch recipe added to cmake/README-djgpp.md.

Verified locally against build-djgpp/cli/uc2.exe +
tests/archives/basic.uc2.

Closes 20019aa.  CI matrix entry (9379647) remains a separate
follow-up.
2026-05-05 03:00:23 -04:00
Eremey Valetov
4a51918b83 ci: lint gate + test_ots fixes against assert(side-effect) NDEBUG bug
Same bug class as dae8a50 and 6d8087f: under -DNDEBUG (CMake's default
for Release, which CI uses) the assert macro expands to ((void)0) and
the wrapped expression is not evaluated.  Calls inside assert() are
silently dropped.

Found 6 occurrences in test_ots.c (uc2_ots_varint_decode, parse_file)
where the call writes through output pointers.  Under Release builds
these tests silently no-op rather than testing anything.  Converted to
capture-then-check.
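
The difference between the two patterns can be sketched as follows.
This is a minimal illustration, not the code from test_ots.c:
demo_decode is a hypothetical stand-in for any function that writes
through an output pointer, and both wrapper names are made up here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder: writes its result through an output pointer. */
static int demo_decode(const uint8_t *buf, size_t len, uint32_t *out)
{
    if (len < 1)
        return -1;
    *out = buf[0];
    return 0;
}

/* Broken pattern: with -DNDEBUG the whole assert(...) expands to
 * ((void)0), so demo_decode is never called and v keeps its
 * initial value. */
static uint32_t decode_inside_assert(const uint8_t *buf, size_t len)
{
    uint32_t v = 0xFFFFFFFFu;
    assert(demo_decode(buf, len, &v) == 0);
    return v;
}

/* Capture-then-check: the call always runs; only the check on the
 * captured return code is compiled out under -DNDEBUG. */
static uint32_t decode_capture_then_check(const uint8_t *buf, size_t len)
{
    uint32_t v = 0xFFFFFFFFu;
    int rc = demo_decode(buf, len, &v);
    assert(rc == 0);
    (void)rc; /* avoids an unused-variable warning under NDEBUG */
    return v;
}
```

In a Debug build both wrappers return the decoded byte.  In a Release
build only the second one does, which is why a test written in the
first style stops testing anything.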

Audit otherwise clean: production code (lib/, cli/) has only one
assert-on-call, and it wraps a pure arithmetic helper.

Adds tests/scripts/check_assert_side_effects.py as a CI gate to keep
this class of bug out: matches assert(IDENT(...)) where IDENT contains
a side-effect verb (encode/decode/parse/...).  Pure queries (_equal,
_match, _verify, _has_, _is_, _id, _root, _attest_name, memcmp, ...)
are not flagged.  Wired into build.yml on the Linux runner.

Also gitignore Testing/ (CTest run outputs) and __pycache__/.
2026-05-04 18:23:55 -04:00
Eremey Valetov
5c01fec996 Add Phase 7 OpenTimestamps integration
uc2_sha256: pure-C FIPS 180-4 implementation, one-shot and incremental
API, validated against published vectors (empty, abc, 56-byte,
1M 'a', byte-by-byte, every-split-point boundary).

uc2_ots: parser, serializer, and walker for the standard .ots binary
format.  Strict canonical varint with 64-bit overflow check, depth-
bounded recursion, varbytes cap, max-digest cap.  Walker supports
the calendar-path subset (APPEND, PREPEND, SHA256); proofs that
include other crypto ops (SHA1, RIPEMD160, KECCAK256) are accepted
as structurally valid but flagged for follow-up via the standard
'ots verify'.
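
A strict varint decoder with these properties might look like the
sketch below.  It assumes the usual OpenTimestamps encoding (7 payload
bits per byte, least-significant group first, high bit set on every
byte except the last).  The name sketch_varint_decode and its return
convention are hypothetical; the real uc2_ots_varint_decode signature
is not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the number of bytes consumed (> 0) on success, or -1 if the
 * input is truncated, non-canonical, or does not fit in 64 bits. */
static int sketch_varint_decode(const uint8_t *buf, size_t len, uint64_t *out)
{
    uint64_t value = 0;
    unsigned shift = 0;

    for (size_t i = 0; i < len; i++) {
        uint64_t chunk = buf[i] & 0x7Fu;

        /* 64-bit overflow check: the 10th byte (shift 63) may only
         * contribute a single bit, and an 11th byte is never valid. */
        if (shift > 63 || (shift == 63 && chunk > 1))
            return -1;
        value |= chunk << shift;

        if ((buf[i] & 0x80u) == 0) {
            /* Canonical form: a multi-byte encoding must not end in a
             * zero byte, since a shorter encoding would exist. */
            if (i > 0 && buf[i] == 0)
                return -1;
            *out = value;
            return (int)(i + 1);
        }
        shift += 7;
    }
    return -1; /* ran out of input before the terminating byte */
}
```

Under these assumptions the two-byte input 0x80 0x01 decodes to 128,
while 0x80 0x00 is rejected because the single byte 0x00 already
encodes the same value.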

UC2-OTS trailer: magic-bracketed sidecar appended after the recorded
archive bytes.  Reverse-scan-safe; original UC2 Pro reader ignores
trailing bytes past its recorded length so backward compatibility is
preserved.  Layout (all integers little-endian uint32):
  front-magic + version + archive_len + proof_len + proof
  + proof_len + back-magic.
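
The layout above can be sketched as a small serializer.  Everything
not named in the layout is an assumption: the magic values and their
4-byte length are placeholders (the real magics are not given in this
log), and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder magics; the real values and lengths are assumptions. */
#define SKETCH_MAGIC_LEN 4
static const uint8_t SKETCH_FRONT_MAGIC[SKETCH_MAGIC_LEN] = { 'F', 'R', 'N', 'T' };
static const uint8_t SKETCH_BACK_MAGIC[SKETCH_MAGIC_LEN]  = { 'B', 'A', 'C', 'K' };

/* Store a uint32 in little-endian byte order; returns the next position. */
static uint8_t *put_u32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)((v >> 24) & 0xFFu);
    return p + 4;
}

/* front-magic + version + archive_len + proof_len + proof
 * + proof_len + back-magic */
static size_t sketch_trailer_size(uint32_t proof_len)
{
    return SKETCH_MAGIC_LEN + 4 + 4 + 4 + (size_t)proof_len + 4 + SKETCH_MAGIC_LEN;
}

/* Writes the trailer into out; returns bytes written, or 0 if cap is
 * too small. */
static size_t sketch_trailer_write(uint8_t *out, size_t cap, uint32_t version,
                                   uint32_t archive_len,
                                   const uint8_t *proof, uint32_t proof_len)
{
    uint8_t *p = out;

    if (cap < sketch_trailer_size(proof_len))
        return 0;
    memcpy(p, SKETCH_FRONT_MAGIC, SKETCH_MAGIC_LEN);
    p += SKETCH_MAGIC_LEN;
    p = put_u32le(p, version);
    p = put_u32le(p, archive_len);
    p = put_u32le(p, proof_len);
    if (proof_len > 0)
        memcpy(p, proof, proof_len);
    p += proof_len;
    p = put_u32le(p, proof_len); /* repeated so a reader can scan from EOF */
    memcpy(p, SKETCH_BACK_MAGIC, SKETCH_MAGIC_LEN);
    p += SKETCH_MAGIC_LEN;
    return (size_t)(p - out);
}
```

With this shape, a reader starting at end-of-file can check the
back-magic, read the second proof_len, and step back to the front-magic
without parsing the archive itself, which is presumably what makes the
trailer reverse-scan-safe.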

CLI: --ots-attach validates that the proof's leaf digest equals
SHA-256(archive[0..archive_len)) before appending and refuses to
overwrite an existing trailer unless -f is given.  --ots-extract
writes the proof verbatim, byte-compatible with the standard
'ots verify'.  --ots-info parses and prints the leaf, archive-match
status, and attestation list.  uc2 -t recomputes the archive
SHA-256 and walks the proof.

Tests: 17 OTS unit tests (varint round-trip, canonical/overflow
rejection, file-envelope round-trip, walker on append/sha256/
sibling/unsupported-op/truncated/trailing-garbage, attest_name,
trailer round-trip + corruption rejection in 4 scenarios).
Plus an optional ctest target ots_cross_check that round-trips
the .ots through python-opentimestamps when the package is
installed; skipped (return code 77) otherwise.
2026-05-03 12:15:30 -04:00
Eremey Valetov
6e62a7aa28 Fix multi-file backward compatibility with original UC2 Pro
Always assign custom master indices (>= FIRSTMASTER=2) to all files,
never SuperMaster (index 0).  The original's ExtractFiles() routes
SuperMaster files through a code path that hangs.  The original itself
never uses SuperMaster in file COMPRESS records — it always creates
at least one custom master, even for archives without dedup groups.

For ungrouped files, a default custom master is built from the largest
file's first 64KB.  All files reference this master, matching the
original's archive structure.

The automated DOSBox-X test now validates multi-file round-trip in
both directions: 4 files UC2 v3 -> original, 5 files original -> UC2 v3.
All content verified byte-for-byte.
2026-03-29 15:21:30 -04:00
Eremey Valetov
eddecfcfc2 Add bidirectional cross-tool round-trip test (both directions pass)
Single-file UC2 v3 archives are now fully backward compatible with the
original UC2 Pro — listing and extraction verified in automated DOSBox-X
test.  SFX extraction timeout increased to 600s with 22-file completeness
check (incomplete extraction caused false test results throughout the
earlier investigation).  Direction 1 (UC2 v3 -> original) test added.
2026-03-29 14:29:33 -04:00
Eremey Valetov
c736b19bae Fix single-file backward compatibility with original UC2 Pro
Root cause: the original UC2 Pro expects csize=0 in the cdir COMPRESS
record (it ignores the field entirely).  UC2 v3 was writing the actual
compressed size, which confused the original's archive reader.

Additional changes:
- Use default Huffman tree for all blocks (ensures tree encoding compat)
- Write method=compression_level in cdir COMPRESS (was hardcoded to 1)
- Add tests/scripts/bitdump.py for bit-level bitstream analysis

Single-file UC2 v3 archives are now fully readable by the original UC2
Pro (listing and extraction verified in DOSBox-X).  Multi-file archives
still hang — the cdir bitstream decodes correctly in our Python analyzer
but fails in the original's ASM decompressor kernel.  Investigation
continues; the bitdump.py tool enables targeted comparison.
2026-03-29 09:58:36 -04:00
Eremey Valetov
be7085c4d3 Rewrite Huffman tree generation to match original UC2 Pro
Port the original TreeGen/RepairLengths/CodeGen algorithms faithfully
from TREEGEN.CPP for bitstream compatibility with the 1992 UC2 Pro:

- treegen() now accepts max_code_bits parameter (13 for main trees,
  7 for tree-encoding meta-tree)
- Heap uses >= for child comparison (prefer right child on ties),
  matching original Reheap()
- BuildCodeTree uses extract-one-then-combine pattern
- RepairLengths uses sorted linked lists with cascading space-fill
- Single/zero symbol cases assign length 1 to two symbols
- tree_enc RLE: trigger at run > 6 (not >= 6), max 20 per chunk,
  single RepeatCode per run
- First block uses default tree (tree-changed=0) matching original
  behavior for small blocks

Full backward compatibility with original UC2 Pro archives (Direction 2)
is maintained.  Forward compatibility (UC2 v3 -> original, Direction 1)
remains in progress — the original still hangs, likely due to residual
bitstream-level differences in the ASM decompressor kernel.
2026-03-29 06:25:21 -04:00
Eremey Valetov
ab2d37286c Add cross-tool round-trip test vs original UC2 Pro in DOSBox-X
Automated test that runs the original 1992 UC2 Pro (UC.EXE) in DOSBox-X
headlessly to create archives from the test corpus, then extracts with
UC2 v3 and verifies byte-for-byte file identity.

Key findings during development:
- uc2pro.exe is a UCEXE self-extracting archive, not the tool itself;
  the actual archiver is UC.EXE inside the distribution
- UC.EXE must be run from its own directory (needs DOS.SEA overlay)
- DOSBox-X flatpak requires work dirs under $HOME (filesystem=home)
- The reverse direction (UC2 v3 → original) does not work: the original
  UC2 Pro hangs reading UC2 v3 archives due to compression bitstream
  differences (added as a roadmap item)

Also fixes create_archives.sh to use the same two-session DOSBox pattern
(extract SFX first, then use UC.EXE).
2026-03-28 18:55:03 -04:00
Eremey Valetov
ff06506bbc Add testing infrastructure with reference UC2 archives
Test corpus (empty, text, binary, compressible, incompressible) with
reference archives created by original UC2 v2.3 in DOSBox. Two CTest
tests: test_identify (magic detection) and test_extract (full
extraction pipeline verified byte-for-byte against corpus).
2026-03-11 08:29:04 -04:00