DF-0074 / panic.txt

================================================================
 DF-0074 - DIOCGSLICEINFO heap overflow -> kernel PANIC capture
 Guest: DragonFly 6.5-DEVELOPMENT master DEV
        DragonFly v6.5.0.1712.g89e6a-DEVELOPMENT #1: Mon Jun 29 14:18:01 UTC 2026
        x86_64 / X86_64_GENERIC
 Source: dfbsd-qemu/boot.log (serial console, comconsole baked into baseline)
================================================================

The panics below are direct consequences of the DIOCGSLICEINFO heap overflow
at sys/kern/subr_diskslice.c:557, which bcopy()s 130 slots of struct
diskslice (256 B each; 130*256+32 = 33312 B in total) into a 16-slot
destination buffer (4128 B), writing 29184 bytes (~28 KB) of
attacker-influenced data past the end of an M_IOCTLOPS slab object and
corrupting adjacent slab zones.
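
The size arithmetic above can be checked with a short sketch. The Python
below is illustrative only: the constant and function names are hypothetical,
and the 32-byte term is taken from the 130*256+32 expression in the
INTERPRETATION section rather than read from the kernel headers.

```python
# Sizes as quoted in this capture. HEADER_BYTES is an assumption inferred
# from the report's own arithmetic (4128 - 16*256 = 32), not from kernel source.
SLICE_BYTES = 256    # one struct diskslice slot
HEADER_BYTES = 32    # fixed part preceding the slot array (assumed)
DEST_SLOTS = 16      # slots the destination buffer has room for
LIVE_SLOTS = 130     # dss_nslices reported back to userspace


def diskslices_bytes(nslices: int) -> int:
    """Bytes covered by the fixed part plus nslices slice slots."""
    return HEADER_BYTES + nslices * SLICE_BYTES


dest_bytes = diskslices_bytes(DEST_SLOTS)    # size of the destination buffer
copy_bytes = diskslices_bytes(LIVE_SLOTS)    # size the bcopy() actually moves
overrun_bytes = copy_bytes - dest_bytes      # bytes written past the buffer

print(dest_bytes, copy_bytes, overrun_bytes)  # 4128 33312 29184
```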

----------------------------------------------------------------
[ Run 1 ]  fresh boot (vm.sh reset -> clean-install)
           attach overflow.img + single trigger + 5-iter stress + 16-proc flood
----------------------------------------------------------------
The trigger returned nslices=130 on every iteration (the overflow happened
each time); the guest then faulted ASYNCHRONOUSLY in the idle thread, when
slab_cleanup() read through a NULL pointer while walking zone metadata
smashed by the overrun:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; lapic id = 1
fault virtual address	= 0x0
fault code		= supervisor read data, page not present
instruction pointer	= 0x8:0xffffffff80655e49
stack pointer	        = 0x10:0xfffff800649fc970
frame pointer	        = 0x10:0xfffff800649fc9b0
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 0, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= Idle
current thread          = pri 12 (CRIT)
kernel: type 12 trap, code=0

CPU1 stopping CPUs: 0x00000001
 stopped
Stopped at      slab_cleanup+0x1c9:     cmpq    %rbx,(%rcx)
db>

----------------------------------------------------------------
[ Run 2 ]  fresh boot (vm.sh reset -> clean-install)
           attach overflow.img + single trigger + 5-iter stress + 16-proc flood
----------------------------------------------------------------
The trigger returned nslices=130; the panic then fired during the parallel
flood when fork1()->_kmalloc() walked into the overflow-corrupted zone and
the slab allocator's own integrity check (KKASSERT) caught it:

login: panic: slaballoc: corrupted zone
cpuid = 1
Trace beginning at frame 0xfffff800abacb798
_kmalloc() at _kmalloc+0x95a 0xffffffff80656e2a
_kmalloc() at _kmalloc+0x95a 0xffffffff80656e2a
fork1() at fork1+0x9c9 0xffffffff806429c9
sys_fork() at sys_fork+0x33 0xffffffff80642c63
sys_xsyscall() at sys_xsyscall+0x89 0xffffffff80bd6749
syscall2() at syscall2+0x11e 0xffffffff80bd611e
Debugger("panic")

CPU1 stopping CPUs: 0x00000001
 stopped
Stopped at      Debugger+0x7c:  movb    $0,0xbd77f9(%rip)
db>

================================================================
 INTERPRETATION
================================================================
- "nslices=130" returned to userspace proves that the live in-kernel struct
  diskslices has dss_nslices=130, so the bcopy at subr_diskslice.c:557 copies
  130*256+32 = 33312 B into the 4128-B data buffer -> 29184 B overrun, every
  single run.
- "slaballoc: corrupted zone" is the slab allocator detecting the overrun
  (sys/kern/kern_slaballoc.c zone-consistency assertion).
- The two different crash sites across the otherwise identical Runs 1 and 2
  (slab_cleanup NULL deref vs. _kmalloc<-fork1 assertion) are both hallmarks
  of slab-metadata corruption and are consistent with the DIOCGSLICEINFO
  overflow being the corruption source.
- The panic is asynchronous and probabilistic in exact site/timing (the
  overrun corrupts whatever slab object is adjacent at runtime), but the
  underlying heap overflow is 100% deterministic on every invocation.

----------------------------------------------------------------
[ Run 3 ]  fresh boot (vm.sh reset -> clean-install)
           attach overflow.img + single trigger + 5-iter stress + 16-proc flood
----------------------------------------------------------------
The trigger returned nslices=130; this time the 28 KB overrun corrupted the
ROOT disk's (vbd0s1d) slice metadata in kernel heap, producing dscheck
"slice too large" errors and a downstream hammer2 filesystem assertion:

dscheck(vbd0s1d): slice too large 2/0
hammer2: chain error during flush
hammer2: WRITE PATH: dbp bread error
xop_strategy_write: error 22 loff=0000000000000000
panic: assertion "parent->error == 0" failed in hammer2_chain_create at /usr/src/sys/vfs/hammer2/hammer2_chain.c:3112
cpuid = 1
Trace beginning at frame 0xfffff800682935f0
hammer2_chain_create() at hammer2_chain_create+0x1209 0xffffffff80973509
hammer2_chain_create() at hammer2_chain_create+0x1209 0xffffffff80973509
hammer2_assign_physical() at hammer2_assign_physical+0x209 0xffffffff80983069
hammer2_xop_strategy_write() at hammer2_xop_strategy_write+0x681 0xffffffff80984a41
hammer2_primary_xops_thread() at hammer2_primary_xops_thread+0x280 0xffffffff8095d140
Debugger("panic")

CPU1 stopping CPUs: 0x00000001
 stopped
Stopped at      Debugger+0x7c:  movb    $0,0xbd77f9(%rip)
db>

(Note: this panic is in the ROOT filesystem's driver, NOT in the crafted vn0
 disk -- proof that the overrun corrupts arbitrary adjacent kernel heap,
 here the live diskslices/slab objects describing vbd0. Three fresh-boot
 runs, three different crash sites, one root cause: the DIOCGSLICEINFO
 28 KB heap overflow at subr_diskslice.c:557.)
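
When comparing runs, the crash reason and crash site can be pulled out of the
serial log mechanically. The Python below is a minimal sketch, not part of
the original tooling: crash_signature() and its regular expressions are
hypothetical and are based only on the line shapes visible in this capture.
It prefers the first backtrace frame over the "Stopped at" line, because the
latter is just Debugger+0x7c for panic()-initiated stops.

```python
import re

# Patterns derived from lines in this capture only (assumption: other kernels
# or console settings may format these lines differently).
PANIC_RE = re.compile(r"panic: (.+)")
TRAP_RE = re.compile(r"Fatal trap (\d+): (.+)")
FRAME_RE = re.compile(r"^\w+\(\) at (\S+)")
STOP_RE = re.compile(r"Stopped at\s+(\S+?):")


def crash_signature(log_text):
    """Return (reason, site) for the first crash found in log_text."""
    reason = frame = stopped = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if reason is None:
            m = PANIC_RE.search(line)
            if m:
                reason = m.group(1)
            else:
                m = TRAP_RE.search(line)
                if m:
                    reason = "trap %s: %s" % (m.group(1), m.group(2))
        if frame is None:
            m = FRAME_RE.match(line)
            if m:
                frame = m.group(1)
        if stopped is None:
            m = STOP_RE.search(line)
            if m:
                stopped = m.group(1)
    # Fall back to the "Stopped at" site when no backtrace was printed.
    return reason, frame or stopped


run1 = """Fatal trap 12: page fault while in kernel mode
kernel: type 12 trap, code=0
Stopped at      slab_cleanup+0x1c9:     cmpq    %rbx,(%rcx)
"""
run2 = """login: panic: slaballoc: corrupted zone
_kmalloc() at _kmalloc+0x95a 0xffffffff80656e2a
fork1() at fork1+0x9c9 0xffffffff806429c9
Debugger("panic")
Stopped at      Debugger+0x7c:  movb    $0,0xbd77f9(%rip)
"""

print(crash_signature(run1))
print(crash_signature(run2))
```

Feeding each run's excerpt through the helper gives one (reason, site) pair
per run, which makes it easy to see at a glance that the sites differ.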