Every milestone earned through real debugging, real regressions caught,
and real fixes documented. No skipping steps.
April 27, 2026 · Week 1
Boot
FAT12
The Very First Boot
The journey started with a 512-byte boot sector and a kernel that printed dots
to prove it was loading. Those dots were everything — each one meant a sector had
crossed the BIOS disk I/O layer correctly. Getting even that working required
BIOS geometry queries (int 13h ah=08h), proper LBA-to-CHS conversion,
and a retry loop because early drafts corrupted their own loop counter mid-read.
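The LBA-to-CHS step mentioned above can be sketched in C. This is the textbook conversion rather than the project's actual boot-sector code; the `chs_t` type and `lba_to_chs` name are illustrative, and the geometry arguments stand in for whatever int 13h ah=08h reports.

```c
#include <stdint.h>

/* Cylinder/head/sector triple as INT 13h expects it (sector is 1-based). */
typedef struct {
    uint16_t cylinder;
    uint8_t  head;
    uint8_t  sector;
} chs_t;

/* Convert a 0-based LBA using the drive geometry (heads, sectors per track). */
static chs_t lba_to_chs(uint32_t lba, uint8_t heads, uint8_t sectors_per_track)
{
    chs_t out;
    uint32_t track = lba / sectors_per_track;   /* absolute track number */

    out.sector   = (uint8_t)((lba % sectors_per_track) + 1);
    out.head     = (uint8_t)(track % heads);
    out.cylinder = (uint16_t)(track / heads);
    return out;
}
```

With 1.44 MB floppy geometry (2 heads, 18 sectors per track), LBA 19 maps to cylinder 0, head 1, sector 2.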
The biggest early surprise: FAT12 stores bytes-per-sector as the bytes 00 02
(little-endian for 0x0200, i.e. 512), but the initial validation code checked
only the low byte (00) and decided every valid disk was invalid. Classic.
Every shell command reported "filesystem unavailable" for days until that
single-byte comparison bug was spotted.
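A C sketch of the difference between the two checks. The field offset (11) is the standard BPB position for bytes-per-sector; the "buggy" function is a hypothetical reconstruction of the shape of the mistake, not the original assembly.

```c
#include <stdint.h>

/* Offset of the bytes-per-sector field in a FAT boot sector (BPB). */
#define BPB_BYTES_PER_SECTOR 11

/* Read a 16-bit little-endian value: low byte first, high byte second. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Correct check: compare the full 16-bit field against 512. */
static int bpb_sector_size_ok(const uint8_t *boot_sector)
{
    return read_le16(boot_sector + BPB_BYTES_PER_SECTOR) == 512;
}

/* Hypothetical buggy shape: only the low byte is inspected, and a zero
 * low byte is treated as "no sector size", so 00 02 is rejected. */
static int bpb_sector_size_ok_buggy(const uint8_t *boot_sector)
{
    return boot_sector[BPB_BYTES_PER_SECTOR] != 0;
}
```

For a boot sector carrying `00 02` at offset 11, the first function accepts the disk and the second rejects it.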
Kernel size: 7 sectors
Selftest: pass=8 fail=0
Apps: HELLO.COM, INFO.COM
April 27–28, 2026 · Week 1
Shell
API
CP/M Shell & the int 60h API
With a kernel that could boot and list files, the next step was making it
actually useful. A CP/M-inspired shell was born: table-driven
command dispatch, case-insensitive matching, external app fallback, and
VGA text scrolling so long output didn't wrap to the top and eat itself.
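Table-driven dispatch with case-insensitive matching and an external-app fallback might look roughly like this in C. The real shell is assembly; the command names, handlers, and return codes below are placeholders chosen so the routing is observable.

```c
#include <ctype.h>
#include <stddef.h>

typedef int (*cmd_fn)(const char *args);

typedef struct {
    const char *name;     /* stored upper-case */
    cmd_fn      handler;
} cmd_entry;

/* Placeholder built-ins; each returns a distinct code. */
static int cmd_dir(const char *args)      { (void)args; return 1; }
static int cmd_type(const char *args)     { (void)args; return 2; }
/* Placeholder for "try to load NAME.COM from disk". */
static int run_external(const char *name) { (void)name; return 99; }

static const cmd_entry cmd_table[] = {
    { "DIR",  cmd_dir  },
    { "TYPE", cmd_type },
};

/* Case-insensitive equality for NUL-terminated ASCII strings. */
static int name_eq(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both ended together */
}

/* Walk the table; fall back to an external app if nothing matches. */
static int dispatch(const char *name, const char *args)
{
    for (size_t i = 0; i < sizeof cmd_table / sizeof cmd_table[0]; i++) {
        if (name_eq(name, cmd_table[i].name)) return cmd_table[i].handler(args);
    }
    return run_external(name);
}
```

Adding a command is then a one-line table edit; the matching loop never changes.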
The int 60h kernel API was designed around this era: print string,
read line, DMA buffer selection, file open/read/close. Apps don't touch kernel
internals — they call a software interrupt and get a clean interface back.
That design decision has paid dividends ever since. Every app written since then
works the same way.
; Every app is a flat .COM binary loaded at 2000:0100
org APP_LOAD_OFFSET ; from memory_map.inc
mov si, message
mov ah, 0x00 ; int 60h: print string
int 0x60
ret ; return to shell
message: db 'Hello from Deca-Tiny-OS!', 13, 10, 0
API services: 12 defined
VGA: Text scrolling
April 29, 2026 · Week 1
FAT12 Writes
Write Safety
Writing to Disk: The Hard Part
Reading from a FAT12 disk is satisfying. Writing to it is terrifying. One
wrong byte in the FAT and you've corrupted the entire filesystem. The approach
taken here: build comprehensive rollback from day one, and stress-test every
failure path before building on top of it.
The diskfull diagnostic fills every data cluster until the disk
overflows, verifying that allocation rollback correctly releases partial chains.
The rootfull diagnostic saturates all 224 root directory entries.
The wrterr diagnostic injects simulated disk-write failures
via a kernel countdown mechanism and verifies that no bytes leak.
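The rollback idea that diskfull exercises can be modelled in a few lines of C over an in-memory FAT. This is a simplified sketch: one `uint16_t` per entry instead of packed 12-bit entries, a single end-of-chain value, and function names that are illustrative rather than the kernel's.

```c
#include <stdint.h>

#define FAT_FREE 0x000u
#define FAT_EOC  0xFFFu   /* single end-of-chain marker in this model */

/* Free every cluster in the chain starting at `start`. */
static void fat_release_chain(uint16_t *fat, uint16_t start)
{
    uint16_t c = start;
    while (c >= 2 && c != FAT_EOC) {
        uint16_t next = fat[c];
        fat[c] = FAT_FREE;
        c = next;
    }
}

/* Allocate `count` clusters as one chain. Returns the head cluster, or 0
 * if the disk fills up, in which case the partial chain is released. */
static uint16_t fat_alloc_chain(uint16_t *fat, uint16_t entries, uint16_t count)
{
    uint16_t head = 0, tail = 0;
    for (uint16_t n = 0; n < count; n++) {
        uint16_t c = 2;                          /* clusters 0 and 1 are reserved */
        while (c < entries && fat[c] != FAT_FREE) c++;
        if (c == entries) {                      /* disk full mid-allocation */
            if (head) fat_release_chain(fat, head);
            return 0;
        }
        fat[c] = FAT_EOC;                        /* new tail of the chain */
        if (tail) fat[tail] = c; else head = c;
        tail = c;
    }
    return head;
}

/* Count free data clusters, used to verify nothing leaked. */
static uint16_t fat_count_free(const uint16_t *fat, uint16_t entries)
{
    uint16_t n = 0;
    for (uint16_t c = 2; c < entries; c++) if (fat[c] == FAT_FREE) n++;
    return n;
}
```

The leak check is the free count: after a failed allocation it must equal the count from before the attempt.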
The most insidious bug from this era: the FAT release function received
the start cluster in AX, then immediately called
fs_load_metadata, which clobbered AX with 19.
Every delete silently freed cluster 19 instead of the requested chain,
and cluster 19 also happened to be where GFXDEMO.COM lived. That
one took a while.
Write ops: create, delete, append, rename
Rollback paths: data, FAT, root
Diagnostics: diskfull, rootfull, wrterr, wrtfail
April 29–30, 2026 · Week 1–2
BASIC
Language
Writing a BASIC Interpreter in Assembly
Here's a question that sounds absurd: can you write a BASIC interpreter
in 16-bit x86 assembly, fit it in an 8KB .COM file, and have
it actually be useful? The answer turned out to be yes — if you're disciplined
about scope.
The interpreter was built in twelve slices: tokenizer first, then a REPL
skeleton, then expression parsing, variables, input, program storage,
execution, conditionals, loops, subroutines, and finally file I/O.
Each slice had its own smoke test. Each slice produced a working artifact.
Nothing was left in a half-finished state.
The end result: basic DEMO.BAS RUN loads a BASIC program from
disk, runs it, handles loops and subroutines and user input, and returns
cleanly to the shell. FOR/NEXT, GOSUB/
RETURN, IF...THEN, SAVE/LOAD
— all working. All within 8KB.
10 PRINT "SUBS"
20 FOR I=1 TO 3
30 GOSUB 100
40 NEXT I
50 PRINT "DONE"
60 END
100 PRINT I
110 RETURN
→ Output: SUBS 1 2 3 DONE
BASIC.COM size: ~8KB
Variables: A–Z (16-bit signed)
Selftest: pass=21 fail=0
May 1, 2026 · Week 2
File Tools
Text Tools
Building a Toolkit: Text Tools & File Management
An OS isn't useful if you can only run one thing at a time. The toolkit phase
built out a full suite of Unix-inspired text tools — all as external
.COM apps using the kernel's handle-based streaming API.
more: a paged text viewer with keyboard controls.
head: shows the first N lines of a file.
wc: counts bytes, lines, and words.
find: case-sensitive literal search through a file.
All of these stream through the file handle API — no whole-file loading,
no 8KB buffer limits. They work on any size file.
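Streaming works because a tool like wc only needs a few counters and one bit of state between reads. A hedged C sketch (the `wc_state` struct and `wc_feed` name are invented here; the real tools call the kernel's handle API to obtain each chunk):

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t bytes, lines, words;
    int in_word;          /* carries across chunk boundaries */
} wc_state;

static int is_space(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Feed one chunk; call once per read until the file is exhausted. */
static void wc_feed(wc_state *st, const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        st->bytes++;
        if (buf[i] == '\n') st->lines++;
        if (is_space(buf[i])) {
            st->in_word = 0;
        } else if (!st->in_word) {
            st->in_word = 1;      /* first character of a new word */
            st->words++;
        }
    }
}
```

Because `in_word` persists, a word split across two chunks is still counted once, so the chunk size can be whatever fits the buffer.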
The shell also grew: touch for zero-byte file creation,
append file text for LF-terminated text appends from the
command line, copy /overwrite with explicit overwrite semantics
(default copy is always safe), stat for file metadata,
df for disk usage. Every command reports clear, specific error messages.
Shell commands: df, stat, touch, append, copy/overwrite
Text tools: more, head, wc, find
Selftest: pass=24 fail=0
May 2, 2026 · Week 2
640K Memory
Segments
640K: Moving Everything Out of Segment Zero
The first 64KB of memory is precious real estate in a real-mode system.
Bootloader at 0000:7C00. Interrupt vectors at 0000:0000.
BIOS data. Kernel code. FAT buffers. File buffers. App load window. Stack.
All competing for the same cramped space. Something had to give.
The 640K preparation arc was a methodical ten-slice migration. First,
detect conventional memory via BIOS int 12h. Then centralize
memory constants in kernel/memory_map.inc. Then add helper
routines for far-pointer buffer operations. Then — one by one — move the
disk buffer, FAT buffer, root directory buffer, and file staging buffer
into high conventional memory at segment 7000h.
The payoff: apps now load at 2000:0100 with a 36KB app
window. The kernel has 35KB of headroom before the app region. Runtime
buffers live at 7000:9000+. Everything is segment-aware.
The system is ready for the full 640KB address space.
App window: 2000:0100 → 36KB
Buffer segment: 7000:9000+
Kernel region limit: 0000:E000
Selftest: pass=30 fail=0
May 2–3, 2026 · Week 2
Pmode Extender
DOS/4GW-style
Protected Mode, Without Leaving Real Mode
The kernel still lives in 16-bit real mode. But individual apps now have
the option to run as 32-bit protected-mode programs, with access to the
full 4 GiB linear address space and a 16 MiB extended-memory bump pool.
The classic DOS/4GW playbook — and a 4-epic, 13-slice arc to land it.
Prerequisites first: CPU level detection (NT-bit flip for ≥386, CPUID for
≥586), an INT 15h E820 → E801 → AH=88h fallback chain for extended-memory
discovery, and a four-step A20 enable (BIOS INT 15h AX=2401h → fast A20
via port 0x92 → KBC via 0x60/0x64), each verified with a wraparound check
at 0xFFFF:0x0510.
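Why 0xFFFF:0x0510 makes a useful probe can be shown with plain address arithmetic. The sketch below models the address bus in C rather than touching hardware; the function names are illustrative.

```c
#include <stdint.h>

/* Real-mode segment:offset to linear address (may exceed 1 MiB). */
static uint32_t linear_addr(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Address that reaches memory: with A20 disabled, bit 20 is forced to 0. */
static uint32_t bus_addr(uint16_t seg, uint16_t off, int a20_enabled)
{
    uint32_t lin = linear_addr(seg, off);
    return a20_enabled ? lin : (lin & ~(UINT32_C(1) << 20));
}

/* A20 is on if the probe address and its low-memory alias are distinct. */
static int a20_probe_distinct(int a20_enabled)
{
    return bus_addr(0xFFFF, 0x0510, a20_enabled) !=
           bus_addr(0x0000, 0x0500, a20_enabled);
}
```

0xFFFF:0x0510 is linear 0x100500; with A20 off it aliases 0000:0500, so writing through one address and reading through the other reveals whether the line is really enabled.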
Then the launcher: a 5-entry GDT (null, code32/data32, code16/data16),
a CR0.PE flip behind a far-jmp, a pmode32 round-trip selftest that writes
0xDECA32 at linear 0x100000 and reads it back,
and finally program_launch_pmode_app — recognises any
.com file whose first 8 bytes are 'DECA32',0,0,
copies it to 0x100000, mode-switches in, and returns through
a pre-pushed exit thunk. Pmode apps reach kernel services via a syscall
thunk that bounce-buffers strings through 7000:C000, dispatches
int 60h in real mode, and preserves all eight 32-bit GP regs
across the call with pushad/popad.
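For readers who have not packed a GDT by hand, here is a C sketch of the five-entry layout named above. The access bytes (0x9A code, 0x92 data) and flag nibbles are the conventional values for flat ring-0 segments and are assumptions here, as are the zero bases on the 16-bit entries; the kernel's actual descriptors may differ.

```c
#include <stdint.h>

/* Pack an 8-byte segment descriptor.
 * access: present/DPL/type byte; flags: G, D/B, L, AVL nibble. */
static uint64_t gdt_entry(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);               /* limit 15:0   */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;        /* base 23:0    */
    d |= (uint64_t)access << 40;                   /* access byte  */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;    /* limit 19:16  */
    d |= (uint64_t)(flags & 0xF) << 52;            /* flags nibble */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;    /* base 31:24   */
    return d;
}

/* null, code32, data32, code16, data16 (assumed ordering and values). */
static void build_gdt(uint64_t gdt[5])
{
    gdt[0] = 0;                                    /* mandatory null entry */
    gdt[1] = gdt_entry(0, 0xFFFFF, 0x9A, 0xC);     /* 4 GiB, 32-bit code   */
    gdt[2] = gdt_entry(0, 0xFFFFF, 0x92, 0xC);     /* 4 GiB, 32-bit data   */
    gdt[3] = gdt_entry(0, 0x0FFFF, 0x9A, 0x0);     /* 64 KiB, 16-bit code  */
    gdt[4] = gdt_entry(0, 0x0FFFF, 0x92, 0x0);     /* 64 KiB, 16-bit data  */
}
```

The 16-bit pair exists so the return path can reload segment registers with 64 KiB limits before clearing CR0.PE.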
Pmode apps: pmhello, pmcat, pmsieve
Extended-mem pool: 16 MiB @ 0x200000
Selftest: pass=40 fail=0
May 3, 2026 · Week 2
Game
Mode 13h
pmpong: A Playable Pong on the Pmode Extender
With the pmode extender in place, the next obvious question: can a 32-bit
app actually drive VGA mode 13h with playable input and frame pacing,
end-to-end, with zero kernel changes? Answer: yes — first as
pmbounce (a 591-byte bouncing-ball proof), then as a full
8-slice / 3-epic Pong arc.
pmpong is player vs AI: UP/DOWN moves the left paddle, the
right paddle tracks the ball with a 4-pixel deadzone at half the player's
speed (so the AI is deliberately beatable), AABB collision keeps the ball
out of paddles, top/bottom walls bounce, balls past either side score and
re-serve toward whoever scored. A 5×7 bitmap font scaled 4× renders the
score; first to 7 freezes the playfield with the final scoreboard until
ESC quits.
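The two gameplay rules described above, AABB overlap and a half-speed AI with a deadzone, fit in a few lines. This is a C illustration with invented names (`rect_t`, `aabb_overlap`, `ai_step`); the shipped pmpong may structure it differently.

```c
/* Axis-aligned rectangle: top-left corner plus size, in pixels. */
typedef struct { int x, y, w, h; } rect_t;

/* Two boxes overlap only if they overlap on both axes. */
static int aabb_overlap(rect_t a, rect_t b)
{
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

/* Move the AI paddle centre toward the ball at half the player's speed,
 * ignoring offsets that fall inside a 4-pixel deadzone. */
static int ai_step(int paddle_center_y, int ball_center_y, int player_speed)
{
    const int deadzone = 4;
    const int speed = player_speed / 2;
    int delta = ball_center_y - paddle_center_y;

    if (delta > deadzone)  return paddle_center_y + speed;
    if (delta < -deadzone) return paddle_center_y - speed;
    return paddle_center_y;
}
```

The deadzone stops the paddle from jittering when it is already roughly aligned, and the halved speed is what lets a fast diagonal shot get past it.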
Direct VGA writes to 0xA0000, port-0x60 keyboard
polling (the masked-IRQ BIOS keyboard buffer is unreliable in pmode), and
PIT-based ~30 fps frame pacing — the same primitives proven
by pmbounce. Notable: across the entire 8-slice arc, the
kernel binary stayed at exactly 21767 B / 43 sectors —
the pmode-extender ABI absorbed every requirement, no new int 60h
services and no new thunk dispatch cases. Pre-merge bug count for the arc:
zero.
pmpong.com: 1569 bytes
Win condition: first to 7
Frame pacing: ~30 fps via PIT
Selftest: pass=40 fail=0
May 4–5, 2026 · Week 3
C Toolchain
libc
A Freestanding libc for Pmode Apps
The pmode-extender ABI was already C-runtime-shaped — flat 32-bit segments,
a function-pointer-style syscall thunk in a kinfo struct, register
preservation across the call. The C toolchain arc made that potential
concrete: a .c file in apps/c/<name>/ now
becomes a .COM binary identifiable by the same
'DECA32',0,0 magic as hand-written NASM pmode apps, compiled
by i686-elf-gcc 13.2.0, linked at flat 0x100000
via a small pmapp.ld, and dead-stripped to per-function
granularity through -ffunction-sections +
--gc-sections.
~50 libc functions across six headers, all hand-rolled.
<string.h> mem*/str* family.
<ctype.h> table-driven predicates.
<stdlib.h> with a real freelist malloc
over a 1 MiB arena drawn from the pmem bump pool, plus qsort
and exit/abort. <setjmp.h>
in ~50 lines of NASM (System V i386 jmp_buf layout — six dwords).
<stdio.h> with a 270-line vsnprintf core
feeding printf / fopen / fread /
fwrite / fseek / remove.
<time.h> via direct PIT polling and CMOS RTC reads.
The libc never reaches around apps/c/lib/pm_syscall.h, so the
dependency-inversion principle (DIP) held; porting to a different host platform is a one-file swap.
Five C apps shipped: cstub (empty smoke), chello
(494 B), clibtest (11.2 KB — exercises the entire libc
surface), cprintf (3.6 KB — formats SAMPLES.TXT
with line numbers), and cwrite (4.5 KB — fwrite/append/remove
round-trip). One new int 60h opcode (AH=0x19
lseek) and six new pmode thunk dispatch cases. Real-mode
apps and the existing pmode NASM apps were untouched — the arc was
purely additive.
#include "pm_syscall.h"
#include <stdio.h>
int main(int argc, char **argv) {
(void)argc; (void)argv;
printf("CHELLO PASS\n");
return 0;
}
// $ chello → CHELLO PASS · returns to shell
C apps: chello, cstub, clibtest, cprintf, cwrite
libc: ~50 fns / 6 headers
Toolchain: i686-elf-gcc 13.2.0
Selftest: pass=41 fail=0
May 4–5, 2026 · Week 3
FAT16
HDD Boot
32 MB
HDD + FAT16: Same Kernel, Two Image Flavours
The 1.44 MB floppy ceiling lifts. A new os-hdd.img (32 MB,
FAT16) now boots the same kernel binary as the floppy, with the
format detected at runtime from the on-disk BPB. No recompilation, no
per-flavour kernels — every layout constant the FS layer used to hardcode
(FAT LBA, sectors-per-FAT, root LBA, root entry count, cluster→LBA base,
cluster cap, end-of-chain marker) is now read from the BPB at boot.
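The text says only that the format is detected from the BPB. One common way to do it, sketched here as an assumption rather than as the kernel's method, is the cluster-count rule from Microsoft's FAT specification. The `bpb_t` struct is a simplified in-memory view, not the on-disk layout.

```c
#include <stdint.h>

typedef enum { FS_FAT12, FS_FAT16, FS_UNSUPPORTED } fs_format_t;

/* Simplified, already-decoded BPB fields. */
typedef struct {
    uint16_t bytes_per_sector;
    uint8_t  sectors_per_cluster;
    uint16_t reserved_sectors;
    uint8_t  num_fats;
    uint16_t root_entries;
    uint32_t total_sectors;
    uint16_t sectors_per_fat;
} bpb_t;

/* Root directory starts right after the reserved area and all FAT copies. */
static uint32_t bpb_root_lba(const bpb_t *b)
{
    return b->reserved_sectors + (uint32_t)b->num_fats * b->sectors_per_fat;
}

/* Number of data clusters on the volume. */
static uint32_t bpb_cluster_count(const bpb_t *b)
{
    uint32_t root_sectors =
        ((uint32_t)b->root_entries * 32 + b->bytes_per_sector - 1) / b->bytes_per_sector;
    uint32_t first_data = bpb_root_lba(b) + root_sectors;
    return (b->total_sectors - first_data) / b->sectors_per_cluster;
}

/* FAT type is defined by cluster count alone. */
static fs_format_t classify(uint32_t clusters)
{
    if (clusters < 4085)  return FS_FAT12;
    if (clusters < 65525) return FS_FAT16;
    return FS_UNSUPPORTED;
}
```

A standard 1.44 MB floppy works out to 2847 clusters (FAT12), and the 64991-cluster HDD image lands in the FAT16 range.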
Structurally the biggest single arc in the project.
kernel/fs.asm split into a format-agnostic dispatcher plus a
new kernel/fs_fat12.asm (8 routines, ~225 lines, lifted
verbatim from fs.asm and renamed) and a fresh
kernel/fs_fat16.asm (6 routines, ~250 lines). Six dispatcher
entry points route by [fs_format]; kernel/api.asm
callers don't know which format is mounted. FAT16's 130 KB FAT can't fit
in the conventional-memory FAT buffer — so a one-sector-at-a-time FAT
cache pages on demand, and a parallel root-directory sector cache
handles the FAT16 32-sector root.
The single most fun quirk: the kernel sits at LBA 128, which lands
inside FAT1 on the 32 MB image. Writing the kernel to disk
overwrites FAT1 sectors 127..173. The build script caps allocation at
MAX_SAFE_CLUSTER=32000 so no app's cluster lookup ever
reads a corrupted FAT1 entry; FAT2 stays clean and serves as the source
of truth for the leak check. Both PowerShell and Python builders agree
byte-identically on both image kinds; both regression and smoke gate
both flavours end-to-end.
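The safety margin behind MAX_SAFE_CLUSTER can be checked with a little arithmetic. The sketch assumes 512-byte sectors and that FAT1 starts at LBA 1, which is what "LBA 128 corresponds to FAT1 sector 127" implies; the helper names are illustrative.

```c
#include <stdint.h>

#define BYTES_PER_SECTOR  512u
#define FAT16_ENTRY_BYTES 2u

/* First cluster number whose FAT entry lives in the given FAT sector. */
static uint32_t first_cluster_in_fat_sector(uint32_t fat_sector)
{
    return fat_sector * (BYTES_PER_SECTOR / FAT16_ENTRY_BYTES);
}

/* Lowest cluster whose FAT1 entry is overwritten by data placed at kernel_lba. */
static uint32_t first_clobbered_cluster(uint32_t kernel_lba, uint32_t fat1_lba)
{
    return first_cluster_in_fat_sector(kernel_lba - fat1_lba);
}
```

With the kernel at LBA 128 the first damaged entry belongs to cluster 32512, so a cap of 32000 keeps every allocation below the overwritten region.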
Kernel: 23697 B / 47 sectors
FAT16 image: 32 MB / 64991 clusters
Dispatchers: 6 · routes by [fs_format]
Selftest: pass=41 fail=0 (both kinds)
May 5, 2026 · Mini-slice
Shell polish
Format-aware
where Tells the Truth About the Filesystem
Caught while exploring the freshly-merged FAT16 HDD image: the
where shell command was happily printing
Apps: FAT12 root, ... even when the kernel had auto-detected
and mounted FAT16 from the BPB. The string was a single hardcoded literal
back when FAT12 was the only option. Twelve lines of asm in
kernel/memory.asm later, where branches on
[fs_format] and prints the right label — FAT12 floppy still
says FAT12 root, FAT16 HDD now says FAT16 root,
and the smoke harness's existing FAT12 markers continue to match without
modification.
Kernel: 23754 B / 47 sectors
Selftest: pass=41 fail=0
May 6, 2026 · Mini-arc (3 slices)
FS
Kernel ABI
32-bit File API: Lifting the 64 KiB Position Cap
Slice D5.1 wired the WAD-bundling pipeline, then immediately surfaced a
hard blocker for the Doom port: the kernel's per-handle file size and
position were both 16-bit. Any fseek past 65535 truncated;
a multi-megabyte DOOM1.WAD was unreachable through libc
stdio. The 32-bit File API mini-arc opened, lifted, and closed in a single
day — three slices (F0 doc kickoff, F1 atomic implementation, F2
arc-close docs).
F1 widened api_handle_size / api_handle_pos from
word arrays to dword arrays, lifted api_return_cx to a dword,
rewrote api_open_handle_slot to read all 4 bytes of the FAT
directory size field, rewrote api_lseek_handle to take a
signed 32-bit offset packed in DX:CX, dropped the legacy
cmp word [es:si+30], 0 append-rejection that capped append
targets at 64 KiB, and lifted fseek's 32767 cap in libc.
Six call sites updated for the new fs_create_root_entry
signature; one new selftest opens DOOM.COM and seeks past
64 KiB to prove the path. Behaviour-preserving for the 13 existing apps —
they all live below 64 KiB and the 32-bit math reduces to the old 16-bit
behaviour for files that small by construction.
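Packing a signed 32-bit offset into a register pair, and what the old 16-bit field did to large positions, can be illustrated in C. The sketch assumes the usual convention that DX holds the high word and CX the low word; `regpair_t` and the function names are made up for the example.

```c
#include <stdint.h>

/* Stand-in for the two 16-bit registers. */
typedef struct { uint16_t dx, cx; } regpair_t;

/* Split a signed 32-bit offset into high (DX) and low (CX) words. */
static regpair_t pack_offset(int32_t offset)
{
    uint32_t u = (uint32_t)offset;               /* two's-complement bits */
    regpair_t r = { (uint16_t)(u >> 16), (uint16_t)(u & 0xFFFF) };
    return r;
}

/* Reassemble the offset on the kernel side. */
static int32_t unpack_offset(regpair_t r)
{
    return (int32_t)(((uint32_t)r.dx << 16) | r.cx);
}

/* What a 16-bit position field keeps of a 32-bit seek target. */
static uint16_t truncate16(uint32_t pos)
{
    return (uint16_t)pos;
}
```

A seek to 70000 is DX=0x0001, CX=0x1170; a 16-bit field would have stored 4464, which is the truncation the arc removed.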
Kernel: 24496 B / 48 sectors
Selftest: pass=42 fail=0
Cap: 64 KiB → 4 GiB
May 6, 2026 · Doom Arc Closes
Pmode
C App
FAT16 HDD
DOOM. All Nine E1 Maps. From the Shell.
doomgeneric's six platform callbacks
(DG_Init, DG_DrawFrame, DG_GetKey,
DG_GetTicksMs, DG_SleepMs,
DG_SetWindowTitle) are wired to the OS as a verbatim drop
of upstream under apps/c/doom/engine/ plus a tiny
apps/c/doom/shim/ layer. Mode 13h via
pm_set_video_mode; framebuffer at 0xA0000
via direct pmode write; PLAYPAL uploaded to the VGA DAC via raw
out 0x3C8 / 0x3C9; keyboard via raw port-0x60 polling;
tic timing via 8254 PIT counter 0 polling. Doom's Z_Malloc
zone is backed by a single pm_alloc(8 * 1024 * 1024) at
startup, separate from libc's malloc arena.
PMAPP_MEM_SIZE(2 * 1024 * 1024) covers the binary's working
memory; the 8 MiB Z_Malloc pool sits separately in the 16 MiB pmem pool.
The launchable artefact is a 290 KB DOOM.COM identifiable
by the same 'DECA32',0,0 magic as every other pmode app.
The OS ships with a stripped 3.6 MB DOOM1.WAD (1151 lumps,
via tools/strip-wad.py --keep-music --keep-all-e1) — no
SFX, no music, no demo lumps, but every Episode 1 map intact. Drop
any unmodified Doom 1.9 IWAD (shareware DOOM1.WAD or
retail Ultimate Doom) into apps/c/doom/data/ instead and
the build picks it up via the existing samples-glob path; the engine
identifies the version from its own hard-coded filename list.
FAT16-HDD-only — the 1.44 MB FAT12 floppy can't fit a
multi-MB WAD.
From a fresh boot of os-hdd.img: type doom,
wait ~25 seconds for W_Init to cache lumps, the title screen
fades in with the correct PLAYPAL palette, the menu navigates with
arrows + ENTER, all nine maps (E1M1 Hangar,
E1M2 Nuclear Plant, E1M3 Toxin Refinery,
E1M4 Command Control, E1M5 Phobos Lab,
E1M6 Central Processing,
E1M7 Computer Station,
E1M8 Phobos Anomaly, plus secret level
E1M9 Military Base) load and play smoothly with no
lag on QEMU TCG. ESC quits cleanly back to the shell. Eight prior
arcs (FAT16, big-app loader, per-app mem-size, -flto,
vendored doomgeneric source, Z_Malloc bridge, mode-13h
DG_DrawFrame, PLAYPAL DAC upload, raw-port input, PIT
timing, WAD bundling, 32-bit file API) all fed this one.
DOOM.COM: 290528 B
DOOM1.WAD: 3.6 MB / 1151 lumps
Z_Malloc zone: 8 MiB @ pmem
Maps: E1M1–E1M9 ✓
Selftest: pass=42 fail=0
May 7, 2026 · Lua Arc · 8 slices in a day
Pmode
C App
Scripting
Lua 5.4 Runs on Deca-Tiny-OS.
lua is now a runnable shell command on both image kinds.
lua -e "print(1+2)" evaluates inline,
lua HELLO.LUA runs a script from disk, bad syntax prints
a clean diagnostic and returns rc=1 to the shell. Lua 5.4.7 vendored
verbatim under apps/c/lua/engine/ at SHA-256
9fbf5e28…; 28 .c files, 27 headers, with five sources
deliberately excluded (lua.c/luac.c we replace,
liolib.c/loslib.c/loadlib.c we drop
per locked decisions). Five edits to upstream all documented in
UPSTREAM.md: four luaconf.h overrides
(LUA_USE_C89, LUA_32BITS=1,
l_signalT=int, lua_getlocaledecpoint='.') plus
one linit.c table trim dropping io/os/loadlib.
Same "verbatim engine + thin shim" pattern that worked for Doom — second
instance in the repo, now reusable.
lua_Number is LUA_FLOAT_FLOAT (32-bit float),
lua_Integer is long (32-bit), and
LUAI_NUMFMT is overridden from "%.14g" to
"%f" to stay inside the printf-float subset shipped in
slice L1.2 of this same arc. Heap is libc malloc behind a
~20-line deca_alloc shim that honours Lua's
lua_Alloc contract. PMAPP_MEM_SIZE(2 * 1024 * 1024)
covers stack + .bss + a 1 MiB libc malloc arena with slack. Stdlibs
included: base, coroutine, table,
string, math (stubbed at this point — real
libm follows in the next arc), utf8, debug.
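The lua_Alloc contract is small enough that a conforming allocator fits in about ten lines. The sketch below follows the pattern shown in the Lua reference manual; it is not the project's deca_alloc, and the `_sketch` suffix marks it as illustrative.

```c
#include <stdlib.h>

/* Same signature as lua_Alloc: (userdata, block, old size, new size). */
static void *deca_alloc_sketch(void *ud, void *ptr, size_t osize, size_t nsize)
{
    (void)ud;       /* no per-state allocator context in this sketch */
    (void)osize;    /* the libc heap tracks block sizes itself */

    if (nsize == 0) {
        free(ptr);  /* free(NULL) is a no-op, which the contract relies on */
        return NULL;
    }
    return realloc(ptr, nsize);   /* realloc(NULL, n) behaves like malloc(n) */
}
```

A function of this shape is what gets passed to `lua_newstate`; the three cases (allocate, resize, free) are distinguished purely by `ptr` and `nsize`.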
Five sample fixtures in samples/ exercise the language
end-to-end: HELLO.LUA (print + the pm_print_char
path), FIB.LUA (recursive Fibonacci 1..10),
TABLE.LUA (table.insert + ipairs + table.sort),
COROUTI.LUA (coroutine.create + resume + yield
producer-consumer), STRING.LUA (string.upper +
string.sub + string.format). Each emits a unique
LUA <CASE> PASS smoke marker. Eight prior arcs (C
toolchain, HDD/FAT16, big-app loader, per-app mem-size,
-flto, 32-bit file API, plus L1+L2 epics) all required for
this single line of Lua to evaluate. Arc opened, vendored, debugged,
and closed in a single day.
LUA.COM: 123,922 B
Engine: Lua 5.4.7 verbatim
Stdlibs: 7 included · 3 dropped
Fixtures: 5 (HELLO/FIB/TABLE/COROUTI/STRING)
Slices: 8 in 1 day
Selftest: pass=42 fail=0
May 7, 2026 · libm Arc · same day
Pmode
libc
Numerics
Real Math: FreeBSD msun Replaces the Stubs.
The Lua port shipped earlier the same day with deliberately-wrong
constant-return libm stubs (sin→0, cos→1,
sqrt→x identity, pow→1, all logs→0) so the
arc could close on schedule. The follow-on libm arc, opened and
closed the same day, replaces every stub with a real, permissively licensed
implementation vendored verbatim from FreeBSD msun
(lib/msun/src/) at pinned commit
640af0d9067bee6e8f300c158f0cf928e666977c. BSD-2-Clause
throughout. 38 .c files + 3 helper headers, one upstream
file per function — perfect match for our --gc-sections
discipline.
All 26 stub functions from L2.1 are now numerically correct:
sqrt, floor/ceil,
fmod, modf, frexp/ldexp,
sin/cos/tan,
asin/acos/atan/atan2,
exp/log/log2/log10,
pow — plus the matching *f single-precision
twins. math.sqrt(2) ≈ 1.4142136;
math.sin(math.pi/2) ≈ 1; 2^10 =
1024; math.exp(1) ≈ 2.7182818.
Two boundary shims (sys/cdefs.h with no-op stand-ins
for __always_inline /
__strong_reference; machine/endian.h) plus
one in-tree edit (rnintl long-double helper removed
because 0x1.8p doesn't lex on gcc 13.2.0) absorbed all
msun source assumptions without touching the engine bodies.
Verified end-to-end via a new MATHFIX.LUA sample
fixture: 25 Lua-side math.* assertions across
math.pi, sqrt, floor/ceil,
fmod, sin/cos/tan
(including at math.pi/2, math.pi, math.pi/4),
atan/asin/acos,
exp/log (one-arg + two-arg base form), and the
^ operator. Each call validated through a tolerance helper
(1e-10 double, 1e-5 float for irrational args). On success prints
LUA MATH PASS as a smoke marker. C-side: 96 libm
fixtures across sqrt, decomp, trig, inverse trig, exp/log, and pow
families in clibtest — all rolled into the existing
LIBTEST C PASS marker.
Vendor: FreeBSD msun · BSD-2-Clause
Files: 38 .c + 3 .h
Functions: 26 stubs → real
C fixtures: 96
Lua assertions: 25 (LUA MATH PASS)
LUA.COM: 140,800 B
May 7–8, 2026 · SMS Arc · 11 slices, 2 days
Pmode
C App
Emulator
Sega Master System on Deca-Tiny-OS.
sms ROMNAME.SMS from the shell loads a real SMS / Game
Gear / SG-1000 ROM, enters mode 13h, runs a Z80 @ 3.58 MHz against a
TI VDP at frame-paced 60 fps, responds to PS/2 input, and exits
cleanly to the shell on Esc. Single binary handles all
three consoles via ROM-extension dispatch
(.SMS/.GG/.SG).
SMS Plus GX (libretro fork) vendored verbatim under
apps/c/sms/engine/ at pinned commit
6dc7119f…. 35 source files (14 .c + 21
.h, ~12 K LOC) flat under engine/ with
subdirectory structure flattened at vendor time so the build
pipeline's default -Iapps/c/lib resolves the upstream's
flat-path includes without per-app -I noise. License
stack: GPLv2+ for the engine, sound glue, FM, and PSG, plus BSD-3-Clause
for the Z80 CPU core; the combined sms.com is GPL-2, the same
shape as Doom.
Per-concern shim layout under apps/c/sms/shim/: 7 files,
one concern each. init.c enters mode 13h via
pm_set_video_mode(1). draw.c palette-uploads
and per-row letterbox-memcpys the VDP screen buffer into
0xA0000 with a 32-px left + 4-px top centre offset for
SMS (and 80 + 28 for GG). input.c polls PS/2 port 0x60
and maps cursor keys → d-pad, Z → btn1, X →
btn2, Enter → pause/start. timing.c handles 8254
PIT counter-0 polling + 60 fps frame pacing. palette.c
translates SMS 6-bit-RGB → VGA DAC via raw out 0x3C8 / 0x3C9.
audio.c is a no-op stub — same precedent as Doom; SN76489
PSG callbacks no-op, v1 ships silent. allocator.c routes
engine malloc through libc + pm_alloc for
the cart ROM buffer.
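The palette translation is a per-channel widening from 2 bits to the DAC's 6 bits. A C sketch, assuming the standard SMS colour byte layout (--BBGGRR) and a simple multiply-by-21 expansion; the actual palette.c may scale differently, and the port writes to 0x3C8 / 0x3C9 are left out so the conversion stays testable on its own.

```c
#include <stdint.h>

/* One VGA DAC entry; each channel is 0..63. */
typedef struct { uint8_t r, g, b; } dac_rgb;

/* Expand a 2-bit channel (0..3) to 6 bits: 0, 21, 42, 63. */
static uint8_t expand2to6(uint8_t v)
{
    return (uint8_t)(v * 21);
}

/* Decode an SMS colour byte: bits 1:0 red, 3:2 green, 5:4 blue. */
static dac_rgb sms_to_dac(uint8_t cram)
{
    dac_rgb c;
    c.r = expand2to6(cram & 3);
    c.g = expand2to6((cram >> 2) & 3);
    c.b = expand2to6((cram >> 4) & 3);
    return c;
}
```

Multiplying by 21 maps the maximum input (3) exactly onto the maximum DAC level (63), so full white stays full white.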
Three-iteration real-ROM bring-up at slice S5.2 mirrors Doom's WAD
bring-up cycle. (iter 1) Cart loaded but the screen
stayed black — Z80 PAIR-union register access reads the wrong byte
halves on i386 without LSB_FIRST set. Fix: in-tree edit
to shared.h, #define LSB_FIRST 1 (the literal
1 matters because render.c:306 uses
#if LSB_FIRST not #ifdef). Cart now ran but
(iter 2) sprites had a pink/magenta overlay against
correctly coloured backgrounds: engine/render.c:258
stamps bit 0x40 on sprite pixels as an internal
"this-is-a-sprite" marker, and that marker bit was leaking through to
mode 13h DAC indices 0x50–0x5F (the default VGA
pink/magenta strip). Fix: the shim/draw.c per-row loop now
masks each byte with PIXEL_MASK (0x1F).
(iter 3) FAT12 floppy diskfull-smoke timed out after
a 128 KiB user-supplied homebrew ROM bloated the floppy past its
known starting state. Fix: build pipeline now always skips
.sms/.gg/.sg on FAT12 (category
gate, not size gate); FAT16 HDD bundles them all.
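The iter 2 fix and the centre offsets quoted earlier for draw.c can be combined into one small blit routine. This is a C sketch with invented names that writes into an ordinary buffer standing in for 0xA0000; only PIXEL_MASK and its value come from the text.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_W      320
#define VGA_H      200
#define PIXEL_MASK 0x1F   /* keep the palette index, drop marker bits */

/* Offset that centres a src-sized span inside a dst-sized span. */
static int centre_offset(int dst, int src)
{
    return (dst - src) / 2;
}

/* Copy a src_w x src_h indexed image into a 320x200 buffer, centred,
 * masking every pixel so internal marker bits never reach the DAC. */
static void blit_centred(uint8_t *fb, const uint8_t *src,
                         int src_w, int src_h, int src_pitch)
{
    int ox = centre_offset(VGA_W, src_w);
    int oy = centre_offset(VGA_H, src_h);

    for (int y = 0; y < src_h; y++) {
        uint8_t *dst = fb + (size_t)(oy + y) * VGA_W + ox;
        const uint8_t *row = src + (size_t)y * src_pitch;
        for (int x = 0; x < src_w; x++) dst[x] = row[x] & PIXEL_MASK;
    }
}
```

For a 256x192 SMS frame the offsets come out as 32 and 4, and for a 160x144 Game Gear frame as 80 and 28, matching the figures in the shim description.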
Multi-ROM Epic S5 checkpoint verified 8 of 9 ROMs
across a 128 KiB–800 KiB library. Working:
bb.sms (Black Belt, headline demo),
ab.sms, dd.sms (Double Dragon),
ds.sms, gauntlet.sms, sb2.sms,
sd.sms, sf2.sms (Street Fighter
2, attract demo). Known limitation: zool.sms
loads + plays from title screen but hangs partway into level 1 —
likely a v1-out-of-scope item (CodeMasters mapper / FM unit /
timing-sensitive code), not a regression in the engine bind. Eight
prior arcs (FAT16, big-app loader, per-app mem-size,
-flto, 32-bit file API, doomgeneric port, Lua 5.4 port,
libm port) plus all 9 of this arc's implementation slices fed the
final demo.
SMS.COM: 91,908 B
Engine: SMS Plus GX verbatim
Files: 35 vendored (~12 K LOC)
Consoles: SMS + GG + SG-1000
ROMs working: 8 / 9
In-tree edits: 6 documented
Smoke marker: SMS CORE PASS (#14)
Selftest: pass=42 fail=0
May 8–11, 2026 · Sound Arc · 11 slices, 4 days
Pmode
Audio Drivers
SB16 · OPL2 · PC Spkr
Audio: Doom Has SFX, SMS Has Music.
A complete DOS-era audio stack lands as a pure libc extension — no
kernel changes, no new int 60h services, selftest stays
at pass=42 fail=0. Apps drive hardware directly at
CPL=0, the same pattern Doom uses for the VGA DAC at 0x3C8
and the PS/2 keyboard at 0x60. Three subsystems land:
PC speaker via PIT counter 2 + port 0x61
(audio_pcspk.c, 95 LOC); Sound Blaster 16
via DSP at 0x220 + 8237 DMA channel 1 + 8-bit unsigned mono PCM at
5–44 kHz with single-shot, polled auto-init streaming, and a
4-channel software mixer (audio_sb16.c, 519 LOC);
Yamaha OPL2 at 0x388/0x389 with detection, register
write, and 9-voice abstraction including frequency-to-fnum/block
conversion (audio_opl.c, 265 LOC). Total ~1156 LOC of
new libc plus a single audio.h public surface header.
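Frequency-to-fnum/block conversion follows from the OPL2 tone formula, freq = fnum × 49716 / 2^(20 − block). The C sketch below picks the lowest block that fits a 10-bit F-number, which is one reasonable policy; audio_opl.c may choose differently, and the names here are illustrative.

```c
#include <stdint.h>

#define OPL_SAMPLE_RATE 49716u   /* OPL2 internal sample rate in Hz (approx.) */

typedef struct { uint16_t fnum; uint8_t block; } opl_pitch;

/* Pick the lowest block (octave) whose 10-bit F-number can hold freq_hz. */
static opl_pitch opl_freq_to_pitch(uint32_t freq_hz)
{
    opl_pitch p = { 1023, 7 };                   /* clamp if out of range */

    for (uint8_t block = 0; block < 8; block++) {
        uint64_t num  = (uint64_t)freq_hz << (20 - block);
        uint64_t fnum = (num + OPL_SAMPLE_RATE / 2) / OPL_SAMPLE_RATE;  /* rounded */
        if (fnum < 1024) {
            p.fnum  = (uint16_t)fnum;
            p.block = block;
            break;
        }
    }
    return p;
}
```

A4 at 440 Hz resolves to block 4 with F-number 580. Using the lowest workable block keeps the F-number large, which gives the finest pitch resolution.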
Seven sample apps demonstrate each primitive end-to-end, each with
its own smoke marker: bell (PC-speaker chime,
BELL PASS), sbinfo (DSP version query,
SB16 INIT PASS — QEMU reports DSP 4.05),
pcmplay (1-second 220 Hz single-shot sine,
PCM PASS), opltest (1-second A4 FM tone,
OPL TONE PASS), fmtest (C-major arpeggio
across 4 OPL2 voices, FM PASS), pcmstrm
(5-second frequency-sweep via auto-init DMA, STREAM PASS),
mixtest (4 overlapping waveforms via software mixer,
MIX PASS). Smoke-marker count goes 14 → 21.
Then the two headline re-emergences. Epic A5
rewrites apps/c/sms/shim/audio.c (was an 18-line empty
stub from SMS arc S2.2): per-frame engine snd.output[]
samples route into a double-buffered async chunked-DMA
pattern at 44100 Hz mono. (The streaming pattern from A4.1 broke VGA
mode 13h rendering on QEMU; switched to chunked async to keep DMA
bursty with idle windows for VGA updates — new
audio_sb16_play_8m_async + audio_sb16_wait_8m
driver helpers landed for that.) Black Belt's PSG
tone-channel music is now audible. Noise-channel SFX
(snare/punch/kick) shelved as a known limitation — diagnostic chain
ruled out inter-chunk gaps, int16→uint8 quantisation, and QEMU DAC
filtering; remaining theory is engine-side tone-vs-noise balance in
the per-scanline mixer.
Epic A6 adds a new apps/c/doom/shim/sound.c
(~300 LOC) implementing Doom's I_StartSound /
I_StopSound / I_UpdateSound by routing
DMX-format WAD-lump SFX (the DS*-prefix lumps) through
the 4-channel mixer + double-buffered async chunked DMA at 11025 Hz
mono. Gun fire, shotgun, chainsaw, doors, item pickups,
monster grunts, explosions — all audible during gameplay,
up to 4 simultaneous SFX. Two minor edits: i_sound.c
SDL_mixer.h include removed, plus -DFEATURE_SOUND added
to the toolchain flags so Doom's sound_modules[] links
our module. Doom MUS music stays silent — MUS/MIDI synth is a
separate future arc. The Doom-SFX-clean result narrowed the SMS-PSG
noise diagnosis from "generic int16→uint8 issue" to "SMS engine
mixer tone-vs-noise balance."
Audio libc: 1156 LOC across 4 files
Hardware: PC Spkr + SB16 + OPL2
Sample apps: 7 new
Smoke markers: 14 → 21
SMS.COM: 91,908 → 92,580 B
DOOM.COM: 293,088 → 293,152 B
Kernel: unchanged · pass=42 fail=0
May 12, 2026 · IRQ Arc · 7 slices, 1 day
Pmode
8259 + IDT
IRQ-driven SB16
Pmode IRQs: the kernel finally takes interrupts.
Every audio path through Sound A6 was polled — the pmode
extender masked interrupts on CR0.PE flip and never re-enabled them.
The IRQ arc lands the missing piece in 7 slices in one day:
(I1.1) pushf/popf around all four CR0-toggle
sites so a pmode app's sti survives its first syscall;
(I1.2) per-pmode-hop 8259 PIC remap from BIOS default
to standard 0x20–0x2F, undone on every pmode exit so BIOS
shell ISRs that out 0x20, 0x20 as a literal EOI keep
working; (I1.3) a 256-entry pmode IDT in kernel BSS
loaded via lidt on entry, with spurious-IRQ-aware default
stubs (poll master ISR via out 0x0B; in 0x20 before EOI)
and two trampoline templates — master-only EOI for IRQ 0–7,
slave + master cascade for IRQ 8–15; (I1.4) three new
int 60h services (AH=0x22 set_irq_handler,
0x23 irq_unmask, 0x24 irq_mask) and a libc
surface in apps/c/lib/pm_irq.{h,c} giving apps
pm_irq_register(int irq, void (*)(void)),
_unmask, _mask, _sti,
_cli. Then (I2) the consumer: a new
audio_sb16_stream_start_irq(buf, size, rate, refill_cb)
alongside the polled streaming API. CPU exceptions land in halt stubs
with a visible serial marker instead of triple-faulting the box.
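An IDT entry is the same kind of bit-packing exercise as a GDT descriptor. A C sketch of a 32-bit interrupt gate, plus the IRQ-to-vector mapping implied by the 0x20–0x2F remap; the selector value and the 0x8E attribute byte (present, ring 0, 32-bit interrupt gate) are conventional choices assumed for illustration.

```c
#include <stdint.h>

/* Pack a gate: the handler offset is split around selector and attributes. */
static uint64_t idt_gate(uint32_t handler, uint16_t selector, uint8_t type_attr)
{
    uint64_t g = 0;
    g |= (uint64_t)(handler & 0xFFFF);        /* offset 15:0        */
    g |= (uint64_t)selector << 16;            /* code selector      */
    g |= (uint64_t)type_attr << 40;           /* P, DPL, gate type  */
    g |= (uint64_t)(handler >> 16) << 48;     /* offset 31:16       */
    return g;
}

/* With the PIC remapped to 0x20..0x2F, IRQ n arrives at vector 0x20 + n. */
static uint8_t irq_to_vector(uint8_t irq)
{
    return (uint8_t)(0x20 + irq);
}
```

256 such 8-byte gates are exactly the 2 KiB table size mentioned in the pre-merge bug, and IRQ 1 and IRQ 12 map to vectors 0x21 and 0x2C.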
Two sample apps prove the chain. irqping registers an
IRQ-0 (PIT) handler that increments a counter, sleeps 1 second, and
asserts counter > 10 (BIOS PIT default ~18.2 Hz) — emits
IRQPING PASS. pcmstrm2 kicks off the SB16
5-second sweep via the new IRQ-driven refill, sleeps 5 seconds with a
literally idle main loop, stops, emits
STREAM IRQ PASS. First non-polled hardware
driver in the system. Real-mode shell still works after pmode
return because the PIC remap is undone on exit. Pre-merge bug caught
in I1.4: a 2 KiB initialised idt_table declaration pushed
kernel.bin past the boot-loader's 53-sector safe limit;
fix was to declare the table as a bare label past all initialised
data so NASM truncates the flat-binary output. An I1.4 follow-up also
added an irq_keyboard_drain_stub at IDT[0x21]
to drain port 0x60 during pmode hops — keyboard activity during a
pmode app no longer wedges the 8042. Same arc retired the FAT12
floppy: every smoke gate is now FAT16-HDD-only (regression PS↔Python
parity still runs both kinds to catch builder drift).
Kernel: 24,496 → 25,960 B (48 → 51 sects)
Selftest: pass=42 → 46 fail=0
Smoke markers: 21 → 23
Sample apps: irqping + pcmstrm2
New libc: pm_irq.{h,c} ~140 LOC
FAT12: retired
May 12, 2026 · PS/2 Mouse Arc · 5 slices, 1 day
Input
8042 aux
IRQ 12
Pointer input: PS/2 mouse via IRQ 12.
The first consumer of the IRQ infrastructure outside the SB16 streaming
demo. PS/2 mouse fires IRQ 12 on the 8042 keyboard controller's
auxiliary port, so the arc is mostly an app-side affair —
apps/c/lib/pm_mouse.{h,c} (~280 LOC combined) owns the
full 8042-aux init sequence (enable aux via 0xA8 → 0x64,
set the aux-IRQ-enable config bit, mouse reset 0xD4 0xFF,
enable streaming 0xD4 0xF4), the 3-byte standard PS/2
packet state machine (byte 0 sync-bit check, signed dx/dy in bytes 1
and 2, restart on sync failure), and a static IRQ thunk that calls
back into the app's
void(uint8_t buttons, int8_t dx, int8_t dy) handler.
Public surface: pm_mouse_init,
pm_mouse_register(cb), pm_mouse_stop.
Kernel side grows by exactly one drain stub
(irq_mouse_drain_stub at IDT[0x2C], slave +
master cascade EOI) plus one selftest case
(selftest_mouse_drain_stub_installed) — closes the latent
IRQ-arc-era gap where the default irq_slave_eoi_stub did
not drain port 0x60 and a stray mouse byte during a pmode hop would
wedge the 8042 controller.
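The 3-byte packet state machine can be sketched in C as a byte-at-a-time parser. Type and function names are invented for the example; the sync rule used is the standard one (bit 3 of the first byte is always set), and dx/dy are taken directly from bytes 1 and 2 as the text describes.

```c
#include <stdint.h>

typedef struct {
    uint8_t bytes[3];
    uint8_t index;        /* next slot to fill, 0..2 */
} mouse_parser;

typedef struct {
    uint8_t buttons;      /* bit 0 left, bit 1 right, bit 2 middle */
    int8_t  dx, dy;
} mouse_packet;

/* Feed one byte from the aux port.
 * Returns 1 and fills *out when a full packet has been assembled. */
static int mouse_feed(mouse_parser *p, uint8_t byte, mouse_packet *out)
{
    if (p->index == 0 && !(byte & 0x08)) {
        return 0;         /* not a valid first byte: stay at slot 0 and resync */
    }
    p->bytes[p->index++] = byte;
    if (p->index < 3) return 0;

    p->index = 0;         /* packet complete, restart for the next one */
    out->buttons = p->bytes[0] & 0x07;
    out->dx = (int8_t)p->bytes[1];
    out->dy = (int8_t)p->bytes[2];
    return 1;
}
```

An IRQ handler would call this once per byte read from port 0x60 and invoke the app callback whenever it returns 1.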
mousetst initialises the mouse, registers a
packet-counting handler, calls pm_irq_sti, waits in
audio_delay_ms(2000) (wiggle the mouse during that window),
then stops, reports packet count + cumulative dx/dy + last button state,
emits MOUSE PASS. Interactive QEMU produces 20–100+
packets for moderate wiggle, with the correct button bit when a button
is held. Shell remains responsive afterwards (verifies the mouse drain
stub and the I1.4 keyboard drain coexist cleanly). Mouse-driven
paint upgrade and mouse-look for Doom are queued as
follow-ons.
Kernel: 25,960 → 26,064 B (+104 B, 51 sects)
Selftest: pass=46 → 47 fail=0
Smoke marker: MOUSE PASS (#24)
libc: pm_mouse.{h,c} ~280 LOC
Sample: mousetst (4,164 B)
Pre-merge bugs: 0
May 13, 2026 · MIDI / MUS Arc · 7 slices, 1 day
Pmode
OPL2 GM-subset
MUS + SMF
Doom music. "At Doom's Gate" plays through OPL2.
For six days DG_music_module in
apps/c/doom/shim/sound.c was a no-op stub — Doom played
SFX from the Sound A6 arc but the soundtrack was silent. The MUS/MIDI
arc closes that. Three new libc layers ship in one day with zero
kernel changes: (M1) apps/c/lib/audio_opl_gm.{h,c} — a
16-patch hand-crafted GM-subset bank baked as static const
data, a 9-voice allocator with note-stealing on the lowest-priority
channel, MIDI-note-to-freq conversion, velocity-to-carrier-level
mapping, plus a thin audio_midi_note_on /
_note_off / _program_change /
_controller surface on top of the existing
audio_opl primitives. (M2) pm_music.{h,c} —
a format-agnostic playback engine driven by
pm_music_poll(uint32_t ms_now) with a shared 24,576-entry
event pool. (M4) smf_parser.c — adds Standard MIDI File
support behind pm_music_load_smf: header + track chunks,
variable-length quantities, running status, tempo meta events, and an
SMF Format 1 multi-track merge by absolute ms time at load.
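Two of the SMF building blocks named above are small enough to sketch: the
variable-length quantity reader and the tick-to-millisecond conversion that
a merge by absolute ms time depends on. This is an illustration under stated
assumptions, not the contents of smf_parser.c: the function names are
hypothetical, and the conversion assumes a ticks-per-quarter-note division
(not SMPTE) and a tempo in microseconds per quarter note, as carried by the
tempo meta event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one SMF variable-length quantity (at most 4 bytes, 7 bits each).
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or the quantity runs past 4 bytes. Hypothetical helper name. */
static size_t smf_read_vlq(const uint8_t *buf, size_t len, uint32_t *out)
{
    uint32_t value = 0;
    size_t i;
    assert(buf != 0 && out != 0);
    for (i = 0; i < len && i < 4; i++) {
        value = (value << 7) | (uint32_t)(buf[i] & 0x7Fu);
        if ((buf[i] & 0x80u) == 0) {  /* high bit clear marks the last byte */
            *out = value;
            return i + 1;
        }
    }
    return 0;
}

/* Convert a tick count to milliseconds at a fixed tempo.
 * tempo_us is microseconds per quarter note; division is ticks per
 * quarter note from the header chunk. Hypothetical helper name. */
static uint32_t smf_ticks_to_ms(uint32_t ticks, uint32_t tempo_us,
                                uint16_t division)
{
    uint64_t us;
    assert(division != 0);
    us = (uint64_t)ticks * tempo_us;
    return (uint32_t)(us / ((uint64_t)division * 1000u));
}
```

A loader that handles tempo changes would more likely accumulate
microseconds across tempo segments and convert once per event, so rounding
does not build up over a long track.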
Two visible payoffs. Standalone: musictst
embeds a ~200-byte MUS payload as static const uint8_t,
plays a ~5-second pattern through OPL2, emits the full
MUSIC INIT PASS → LOAD PASS → PLAY PASS → MUSIC PASS
marker chain. musictst hello.mid exercises the SMF path
through samples/HELLO.MID. Headline:
M5 wires DG_music_module to the new libc —
RegisterSong → pm_music_load_mus,
PlaySong → pm_music_play, etc. — and Doom's
I_UpdateSound hook calls
pm_music_poll(DG_GetTicksMs()) once per gametick.
With a DOOM1.WAD stripped using --keep-music,
E1M1 plays "At Doom's Gate" through OPL2 in-game.
E1M2 plays "The Imp's Song". Pause/resume via Doom's menu cuts
sustained voices and rebases the poll clock. One pre-merge bug was
caught at the M5 interactive checkpoint: with
PM_MUSIC_MAX_EVENTS = 8192 the event pool overflowed on
D_E1M2 (8,372 events) and D_E1M8 (16,742). The ceiling was bumped to
24,576 and tools/mus-event-count.py was added for future audits.
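The poll-driven design, the pause/resume clock rebase, and the event ceiling
fit together roughly as follows. This is a hedged sketch rather than the
pm_music implementation: every name here (music_player_t, music_poll, and so
on) is hypothetical, events are assumed to be pre-sorted by absolute
millisecond time, and only the 24,576 ceiling and the "poll with the current
ms" shape come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MUSIC_MAX_EVENTS 24576u  /* ceiling quoted in the text */

typedef struct {
    uint32_t at_ms;                /* absolute song time of the event */
    uint8_t  status, data1, data2; /* MIDI-style payload */
} music_event_t;

typedef void (*music_sink_t)(const music_event_t *ev);

typedef struct {
    const music_event_t *events;  /* sorted by at_ms */
    size_t       count, next;
    uint32_t     start_ms;        /* clock value that maps to song time 0 */
    uint32_t     paused_at;       /* song time captured at pause */
    int          playing;
    music_sink_t sink;
} music_player_t;

/* Returns 0 on success, -1 if the song would overflow the event pool. */
static int music_init(music_player_t *m, const music_event_t *events,
                      size_t count, music_sink_t sink)
{
    assert(m != 0);
    if (count > MUSIC_MAX_EVENTS) return -1;
    m->events = events; m->count = count; m->next = 0;
    m->start_ms = 0; m->paused_at = 0; m->playing = 0; m->sink = sink;
    return 0;
}

static void music_play(music_player_t *m, uint32_t ms_now)
{
    m->next = 0; m->start_ms = ms_now; m->playing = 1;
}

static void music_pause(music_player_t *m, uint32_t ms_now)
{
    if (m->playing) { m->paused_at = ms_now - m->start_ms; m->playing = 0; }
}

/* Rebase the clock so song time continues from where it was paused. */
static void music_resume(music_player_t *m, uint32_t ms_now)
{
    if (!m->playing) { m->start_ms = ms_now - m->paused_at; m->playing = 1; }
}

/* Dispatch every event that is due. Returns how many were dispatched. */
static size_t music_poll(music_player_t *m, uint32_t ms_now)
{
    size_t fired = 0;
    uint32_t song_ms;
    if (!m->playing) return 0;
    song_ms = ms_now - m->start_ms;
    while (m->next < m->count && m->events[m->next].at_ms <= song_ms) {
        if (m->sink) m->sink(&m->events[m->next]);
        m->next++;
        fired++;
    }
    return fired;
}

static unsigned g_dispatched;  /* counting sink for experiments */
static void count_sink(const music_event_t *ev) { (void)ev; g_dispatched++; }
```

Rejecting an oversized song at load time, as music_init does here, is one
way to turn a silent pool overflow into an explicit failure.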
Kernel: 26,064 → 26,072 B (+8 B for help_msg)
Selftest: pass=47 fail=0 (unchanged)
Smoke marker: MUSIC PASS (#25)
DOOM.COM: 293,152 → 296,736 B (+3,584 B)
MUSICTST.COM: 6,825 B
FAT16 SHA: f796471d…
May 13, 2026 · Graphics SDK Arc · 7 slices, 1 day
Pmode
VGA mode 13h
PCX + Bresenham + Font
A real graphics library. Lines, circles, text, PCX, and a mouse cursor.
For weeks every C app that wanted graphics rewrote framebuffer access
from scratch: Doom, SMS, the old paint.asm /
snake.asm / gfxdemo.asm. Even
mousetst stayed in text mode. No shared C primitive
existed for Bresenham lines, midpoint circles, polygon fill, text
overlay in graphics mode, mouse cursor rendering, or image-file I/O.
The Graphics SDK arc closes that gap in 7 slices in one day —
structurally a pure libc extension, same shape as the libm and music
arcs. Zero kernel changes; selftest stays at
pass=47.
Three new files in apps/c/lib/: gfx.{h,c}
(~400 LOC) owns drawing primitives — direct framebuffer write at
linear 0xA0000, rect-fill via memset, Bresenham line
with octant dispatch, midpoint circle, scanline convex-polygon fill,
VGA DAC palette write via ports 0x3C8/0x3C9 with
vertical-retrace sync, and a cursor save/draw/restore trio
(gfx_cursor_save_bg / _draw_sprite /
_restore_bg). gfx_font.c (~120 LOC) embeds
a public-domain 8×8 BIOS-style font as
static const uint8_t font_8x8[128][8], ASCII 0x20-0x7F
(96 glyphs ≈ 1 KiB strippable), with gfx_text(x, y, fg, str).
gfx_pcx.c (~220 LOC) is a full PCX 256-color v5
encoder/decoder: a 128-byte header, RLE scanlines, and a 769-byte
palette appendix. Its files open unmodified in GIMP, ImageMagick, and
IrfanView.
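As an illustration of the line primitive, here is a compact integer
Bresenham in C. It is a sketch, not gfx.c: it uses the single-loop
error-accumulation form that covers all octants instead of the octant
dispatch the text mentions, and it draws into a plain array standing in for
the framebuffer at linear 0xA0000. The names framebuffer, put_pixel,
get_pixel, and draw_line are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define GFX_W 320
#define GFX_H 200

/* Host-side stand-in for the mode 13h framebuffer at linear 0xA0000. */
static uint8_t framebuffer[GFX_W * GFX_H];

/* Clipped single-pixel write: out-of-range coordinates are ignored. */
static void put_pixel(int x, int y, uint8_t color)
{
    if (x >= 0 && x < GFX_W && y >= 0 && y < GFX_H) {
        framebuffer[y * GFX_W + x] = color;
    }
}

/* Clipped read: out-of-range coordinates read as 0. */
static uint8_t get_pixel(int x, int y)
{
    if (x < 0 || x >= GFX_W || y < 0 || y >= GFX_H) return 0;
    return framebuffer[y * GFX_W + x];
}

/* Integer Bresenham line, endpoints inclusive, any direction. */
static void draw_line(int x0, int y0, int x1, int y1, uint8_t color)
{
    int dx = abs(x1 - x0);
    int dy = -abs(y1 - y0);
    int sx = (x0 < x1) ? 1 : -1;
    int sy = (y0 < y1) ? 1 : -1;
    int err = dx + dy;  /* combined error term for both axes */

    assert(dx >= 0 && dy <= 0);
    for (;;) {
        int e2;
        put_pixel(x0, y0, color);
        if (x0 == x1 && y0 == y1) break;
        e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step along x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step along y */
    }
}
```

Clipping per pixel keeps the sketch short; a library routine would more
likely clip the segment once up front and then write without bounds checks.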
gfxtest proves every primitive end-to-end. Plain
gfxtest runs a deterministic visual demo (palette ramp,
lines, circles, polygons, text banner, cursor blit round-trip, PCX
encode/decode round-trip with pixel-compare) and emits the
GFX MODE → PRIMS → FONT → CURSOR → PCX → PASS lifecycle
marker chain. gfxtest mouse is the headline checkpoint:
a cursor sprite follows the PS/2 mouse pointer in real time via the
IRQ-12 packet callback from the mouse arc; left-button-drag draws a
Bresenham line directly into the framebuffer; ESC saves the canvas
as gfxdemo.pcx. gfxtest gfxdemo.pcx reloads
and displays the saved image. The bundled
samples/GFXDEMO.PCX (17 KiB hand-painted) ships so the
app does something interesting from a fresh boot. Three pre-merge
fixes at the mouse-mode checkpoint: replaced pm_poll_key()
(which round-tripped through real-mode and dropped mouse IRQs) with
a direct-port kbd_poll_esc(); masked IRQ 1 during
mouse mode so the kernel keyboard-drain stub wouldn't eat the ESC
scancode; gave the PCX viewer the same treatment.
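The PCX encode/decode round-trip that gfxtest checks rests on the format's
run-length scheme, which can be sketched for a single scanline. This is an
illustrative host-side version, not gfx_pcx.c, and the function names are
hypothetical. It follows the standard PCX rule: a byte with its top two bits
set is a run count (low 6 bits, so at most 63) followed by the value, which
means a literal byte of 0xC0 or above has to be written as a run of one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode n pixels of one scanline. Returns bytes written, or 0 if dst
 * (capacity cap) is too small. */
static size_t pcx_rle_encode(const uint8_t *src, size_t n,
                             uint8_t *dst, size_t cap)
{
    size_t i = 0, o = 0;
    assert(src != 0 && dst != 0);
    while (i < n) {
        uint8_t v = src[i];
        size_t run = 1;
        while (i + run < n && run < 63 && src[i + run] == v) run++;
        if (run > 1 || v >= 0xC0) {
            if (o + 2 > cap) return 0;
            dst[o++] = (uint8_t)(0xC0u | run);  /* count byte */
            dst[o++] = v;                       /* repeated value */
        } else {
            if (o + 1 > cap) return 0;
            dst[o++] = v;                       /* plain literal */
        }
        i += run;
    }
    return o;
}

/* Decode until n pixels are produced. Returns bytes consumed from src,
 * or 0 if the input ends early. */
static size_t pcx_rle_decode(const uint8_t *src, size_t len,
                             uint8_t *dst, size_t n)
{
    size_t i = 0, o = 0;
    assert(src != 0 && dst != 0);
    while (o < n) {
        uint8_t b;
        size_t run = 1;
        if (i >= len) return 0;
        b = src[i++];
        if ((b & 0xC0u) == 0xC0u) {
            run = b & 0x3Fu;
            if (i >= len) return 0;
            b = src[i++];
        }
        while (run > 0 && o < n) { dst[o++] = b; run--; }
    }
    return i;
}
```

A pixel-compare round-trip then amounts to encoding a scanline, decoding it
into a second buffer, and checking the two buffers byte for byte.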
Kernel: 26,072 → 26,080 B (+8 B, 51 sects)
Selftest: pass=47 fail=0 (unchanged)
Smoke marker: GFX PASS (#26)
New libc: gfx + gfx_font + gfx_pcx ~740 LOC
GFXTEST.COM: 9,272 B
GFXDEMO.PCX: 17,113 B bundled
May 14, 2026 · Deca Paint v1 Arc · 7 slices, 1 day
Productivity
Mouse UI
PCX save/load
Deca Paint v1. The first genuinely productive end-user app.
Every app shipped so far is a demo, a developer tool, a port
(Doom / Lua / SMS), or infrastructure. None of them is something a
user opens to actually make something. Deca Paint v1
changes that. It's a Deluxe-Paint-style drawing app that turns the
just-shipped Graphics SDK from "primitives library + tech demo" into
a real productivity tool you can pick up, work in, save your output
from, and reopen later with everything preserved. Single-binary C
app, ~700 LOC of pure UI + tool logic. Zero kernel
changes, zero libc additions — selftest stays at
pass=47; the entire arc rides on the SDK + the mouse
driver + the PCX codec from the prior two arcs.
The UI is fixed-layout 320×200: 8-pixel status bar at the top, 16-pixel
tool palette down the left edge (8 tool icons with single-char glyphs
P/L/R/r/C/c/B/E), 304×184 canvas in the middle, 16-swatch color
palette across the bottom. 8 drawing tools: pencil
(Bresenham-interpolated drag), line / rect / filled-rect / circle /
filled-circle (press → release), flood-fill bucket (explicit-stack
scanline fill with a 1024-span cap), eraser. 16-named-color
palette shared byte-for-byte with gfxtest so
PCX files interop cross-app. Mouse-driven UI — clicking a tool icon
updates the active tool (yellow highlight moves), clicking a swatch
updates the active color (outline moves). Full keyboard shortcut
coverage: 1-8 colors, P/L/R/Shift+R/C/Shift+C/B/E tools, Z undo,
S save, ESC quit. Single-level undo via a static
64 KiB BSS snapshot captured before each tool application. Save/load
via command-tail filename — dpaint MYPAINT.PCX both
loads (if the file exists) and saves to that name.
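The bucket tool's approach, a scanline flood fill driven by a fixed-size
explicit stack, might look something like the sketch below. It is not the
dpaint source: the names are hypothetical, the canvas is a plain
304×184 array, and the choice to drop seeds (and report it) once the
1024-entry stack is full is an assumption about what the cap means.

```c
#include <assert.h>
#include <stdint.h>

#define CANVAS_W 304
#define CANVAS_H 184
#define FILL_STACK_CAP 1024  /* cap quoted in the text */

static uint8_t canvas[CANVAS_W * CANVAS_H];

typedef struct { int16_t x, y; } fill_seed_t;

/* Scanline flood fill with an explicit stack.
 * Returns 1 if the fill completed, 0 if there was nothing to do,
 * -1 if seeds were dropped because the stack hit its cap. */
static int flood_fill(int x, int y, uint8_t new_color)
{
    static fill_seed_t stack[FILL_STACK_CAP];
    int top = 0, truncated = 0;
    uint8_t target;

    if (x < 0 || x >= CANVAS_W || y < 0 || y >= CANVAS_H) return 0;
    target = canvas[y * CANVAS_W + x];
    if (target == new_color) return 0;

    stack[top++] = (fill_seed_t){ (int16_t)x, (int16_t)y };
    while (top > 0) {
        fill_seed_t s = stack[--top];
        uint8_t *row = &canvas[s.y * CANVAS_W];
        int left = s.x, right = s.x, i, dy;

        if (row[s.x] != target) continue;  /* stale seed, already filled */
        while (left > 0 && row[left - 1] == target) left--;
        while (right < CANVAS_W - 1 && row[right + 1] == target) right++;
        for (i = left; i <= right; i++) row[i] = new_color;

        /* Push one seed per run of target pixels in the rows above/below. */
        for (dy = -1; dy <= 1; dy += 2) {
            int ny = s.y + dy, in_run = 0;
            const uint8_t *nrow;
            if (ny < 0 || ny >= CANVAS_H) continue;
            nrow = &canvas[ny * CANVAS_W];
            for (i = left; i <= right; i++) {
                if (nrow[i] != target) { in_run = 0; continue; }
                if (in_run) continue;
                in_run = 1;
                if (top < FILL_STACK_CAP) {
                    stack[top++] = (fill_seed_t){ (int16_t)i, (int16_t)ny };
                } else {
                    truncated = 1;
                }
            }
        }
        assert(top <= FILL_STACK_CAP);
    }
    return truncated ? -1 : 1;
}
```

Pushing one seed per run, not one per pixel, is what keeps a small fixed
stack workable on a canvas of this size.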
Four pre-merge bugs caught at the headline P2 checkpoint; the
last two are 8042/IRQ interactions worth knowing about.
(1) redraw_chrome internally called
clear_canvas, wiping the loaded PCX right after
try_load_pcx blitted it. (2) Lowercase filenames
silently failed to load with no flash — fix: uppercase
argv[1] in main, explicit "No file: NAME" on miss.
(3) Slow PCX load + still-held Enter key meant a keyboard scancode
landed in port 0x60 just as pm_mouse_init was reading
mouse responses; the kernel's irq_keyboard_drain_stub
consumed the ACK byte; init failed silently. Fix:
pm_irq_cli() + drain 8042 manually before mouse init.
(4) The ~400-syscall save sequence involves pmode↔real PIC remaps. A
mouse IRQ 12 firing during a BIOS-mode window landed at vector 0x74 with
no installed handler. The ICW1 re-init on pmode return, combined with the
edge-triggered PIC, then dropped the still-asserted line, and the 8042's
single output buffer blocked all subsequent keyboard + mouse IRQs (total
freeze with the "Saved" flash stuck on screen). Fix: a new
drain_8042_post_syscall() helper at the end of
try_save_pcx + try_load_pcx. Bundled
samples/PAINTTUT.PCX (6 KiB) ships as the first-launch
experience. Cross-app compatibility verified —
gfxtest PAINTTUT.PCX opens the same file the paint app
saves.
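The core of an 8042 drain, as used in fixes (3) and (4), is a short bounded
loop: while the status port reports a byte waiting, read and discard it. The
sketch below is a guess at that shape, not the body of
drain_8042_post_syscall(). Port access goes through a function pointer so
the loop can run on a host against a fake controller; drain_8042,
port_in_fn, fake_inb, and the iteration limit of 16 are all hypothetical.
The port numbers and status bit are the standard ones (status at 0x64, data
at 0x60, bit 0 = output buffer full).

```c
#include <assert.h>
#include <stdint.h>

#define KBC_DATA_PORT   0x60
#define KBC_STATUS_PORT 0x64
#define KBC_STATUS_OBF  0x01  /* output buffer full: a byte waits at 0x60 */
#define KBC_DRAIN_LIMIT 16    /* hypothetical bound so the loop cannot hang */

/* Port-read hook. On the target this would wrap the real `in` instruction. */
typedef uint8_t (*port_in_fn)(uint16_t port);

/* Discard pending 8042 output bytes. Returns how many were dropped.
 * Assumes interrupts are already disabled by the caller. */
static int drain_8042(port_in_fn inb)
{
    int drained = 0;
    assert(inb != 0);
    while (drained < KBC_DRAIN_LIMIT &&
           (inb(KBC_STATUS_PORT) & KBC_STATUS_OBF)) {
        (void)inb(KBC_DATA_PORT);  /* reading the data port clears OBF */
        drained++;
    }
    return drained;
}

/* Fake controller for host-side experiments: fake_pending bytes queued. */
static int fake_pending;

static uint8_t fake_inb(uint16_t port)
{
    if (port == KBC_STATUS_PORT) {
        return (uint8_t)(fake_pending > 0 ? KBC_STATUS_OBF : 0);
    }
    if (port == KBC_DATA_PORT && fake_pending > 0) {
        fake_pending--;
    }
    return 0;
}
```

The bound matters on real hardware: without it, a device that keeps
producing bytes would trap the drain loop with interrupts off.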
Kernel: 26,080 → 26,088 B (+8 B, 51 sects)
Selftest: pass=47 fail=0 (unchanged)
Smoke marker: DPAINT PASS (#27)
DPAINT.COM: 14,592 B
PAINTTUT.PCX: 6,010 B bundled
Tools: 8 · Colors: 16 · Undo: 1-level