uc2/tests/test_cli_bigfile.cmake
Eremey Valetov 84672c00b6 fix rANS extraction crash and >64KB window corruption
Extraction of level 6-9 archives crashed (first seen on NetBSD/sdf.org,
reproducible everywhere), and files larger than the 64KB sliding window
silently corrupted at every level. Four causes:

- cli: master COMPRESS records hardcoded method 1 while master data was
  compressed at opt.level, so rANS masters were fed to the Huffman
  decoder. Records now carry method 10 at levels 6-9; levels 2-5 keep
  method 1 for original UC2 Pro compatibility.

- decompress: decompressor_rans stopped at remaining == 0 without
  consuming the end-of-block pair and its 12 extra bits, leaving the
  bit cursor desynchronized; the next block-present read landed inside
  the EOB extras and parsed a phantom block. The loop now decodes all
  nsyms symbols and guards output writes instead.

- decompress: a refill read returning a single byte into an empty
  buffer let head overtake tail in bits_feed; the unsigned difference
  wrapped and head walked off the 4KB buffer (the actual segfault).
  The refill now loops until a full byte pair is available, and a
  sticky error flag stops the decoder from treating negative bits_get
  returns as data.

- compress/decompress: chunk loads wrote linearly past the circular
  window edge, and the rANS decoder flushed output in one linear write
  that cannot express ring wrap. Loads are now capped at the edge and
  the decoder flushes incrementally in ring order.

Also: BCJ E8/E9 byte assembly no longer shifts promoted ints into the
sign bit, and the libarchive plugin uses timegm on NetBSD/OpenBSD/
DragonFly so DOS timestamps are not offset by the local timezone.

New cli_bigfile regression test (>128KB round-trip at L5 and L6); it
fails against the previous binary and passes now. Verified: 22/22
ctest including the DOSBox-X round-trip against original uc2pro.exe,
ASan/UBSan clean, and the full matrix on NetBSD 10 (sdf.org).
2026-06-11 13:14:01 -04:00


# CLI round-trip test for files larger than the 64KB sliding window.
# Regression test for the window-edge bugs fixed 2026-06-11 (git-bug
# d747658): linear chunk loads crossing the circular-buffer edge
# corrupted archives at every level, and the rANS decoder could not
# flush more than one window of output. Content must be varied, not
# uniform: all-zeros hides reordering corruption.
file(REMOVE_RECURSE "${TEST_DIR}")
file(MAKE_DIRECTORY "${TEST_DIR}/input")
# ~200KB of deterministic varied text (seeded string(RANDOM)), with a
# compressible refrain so both literal and match paths are exercised.
set(BIG "")
foreach(i RANGE 1 180)
  string(RANDOM LENGTH 1024 RANDOM_SEED ${i} CHUNK)
  string(APPEND BIG "${CHUNK}\nThe quick brown fox jumps over the lazy dog ${i}\n")
endforeach()
file(WRITE "${TEST_DIR}/input/bigfile.txt" "${BIG}")
foreach(LEVEL 5 6)
  set(WORK "${TEST_DIR}/L${LEVEL}")
  file(MAKE_DIRECTORY "${WORK}/output")
  execute_process(
    COMMAND "${UC2_CLI}" -q -w -L ${LEVEL} "${WORK}/big.uc2"
            "${TEST_DIR}/input/bigfile.txt"
    RESULT_VARIABLE RC
  )
  if(NOT RC EQUAL 0)
    message(FATAL_ERROR "uc2 -w -L ${LEVEL} failed: ${RC}")
  endif()
  execute_process(
    COMMAND "${UC2_CLI}" -q -d "${WORK}/output" "${WORK}/big.uc2"
    RESULT_VARIABLE RC
  )
  if(NOT RC EQUAL 0)
    message(FATAL_ERROR "uc2 -d failed at -L ${LEVEL}: ${RC}")
  endif()
  execute_process(
    COMMAND "${CMAKE_COMMAND}" -E compare_files
            "${TEST_DIR}/input/bigfile.txt" "${WORK}/output/bigfile.txt"
    RESULT_VARIABLE RC
  )
  if(NOT RC EQUAL 0)
    message(FATAL_ERROR "bigfile round-trip mismatch at -L ${LEVEL}")
  endif()
  message(STATUS "bigfile round-trip OK at -L ${LEVEL}")
endforeach()
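
The script reads two variables it never sets, UC2_CLI and TEST_DIR, so whatever runs it has to pass them in with -D before -P. The fragment below is a minimal sketch of how a script like this might be registered with ctest. The actual CMakeLists.txt is not shown here, so the target name `uc2`, the scratch directory, and the assumption that the registering CMakeLists.txt sits next to the script are all hypothetical; only the test name cli_bigfile comes from the commit message.

```cmake
# Hypothetical registration; `uc2` is an assumed executable target name.
add_test(
  NAME cli_bigfile
  COMMAND "${CMAKE_COMMAND}"
          # -D definitions must precede -P so the script sees them.
          "-DUC2_CLI=$<TARGET_FILE:uc2>"
          # Assumed scratch location; the script wipes and recreates it.
          "-DTEST_DIR=${CMAKE_CURRENT_BINARY_DIR}/cli_bigfile"
          -P "${CMAKE_CURRENT_SOURCE_DIR}/test_cli_bigfile.cmake"
)
```

Because the script calls file(REMOVE_RECURSE) on TEST_DIR, the directory passed in should be a dedicated scratch path under the build tree, never a source directory. Any message(FATAL_ERROR) in the script makes cmake -P exit nonzero, which is what ctest reports as a failure.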