DragonFlyBSD Kernel Audit
DF-0003

Negative unit number in devclass_alloc_unit causes heap OOB write via dc->devices[]

| Field | Value |
|---|---|
| ID | DF-0003 |
| Status | new |
| Severity | Medium |
| CVSS 3.1 | CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H |
| CWE | CWE-787 Out-of-bounds Write |
| File | sys/kern/subr_bus.c |
| Lines | 1064-1125, 1144, 2166-2184 |
| Area | kern |
| Confidence | likely |
| Discovered | 2026-06-29 |
| Reported | pending |

Summary

devclass_alloc_unit() treats only -1 as a wildcard unit. Any other negative unit (e.g. -2) bypasses both the existing-device check (guarded by unit >= 0) and the table-extension check (guarded by unit >= dc->maxunit, which is false for negatives), so the function returns success with the negative value. devclass_add_device() then executes dc->devices[dev->unit] = dev, a heap OOB write at a negative index into the dc->devices array that corrupts heap memory immediately before the allocation. device_set_unit() has a matching OOB read in its bounds check.

Root cause

In devclass_alloc_unit (sys/kern/subr_bus.c:1064-1125):

  • int unit = *unitp; at line 1067.
  • if (unit != -1) at line 1072: only -1 is the wildcard. A value like -2 enters the "wired unit" branch.
  • if (unit >= 0 && unit < dc->maxunit && dc->devices[unit] != NULL) at line 1073 โ€” unit >= 0 is FALSE for negatives, so the existing-device check is skipped entirely.
  • if (unit >= dc->maxunit) at line 1094 โ€” maxunit is non-negative, so a negative unit makes this FALSE; the table is not extended and the negative unit is not caught.
  • The function falls through to *unitp = unit; return(0); at lines 1123-1124, returning success with the negative unit.
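
The guard sequence listed above can be modelled in user space. The following is a minimal sketch, not the kernel function: `struct model_devclass` and `model_alloc_unit` are hypothetical names, the table is reduced to an occupancy array, table growth is reduced to a flag, and the `EEXIST` return for an occupied slot is an assumption about the error value. It reproduces only the three comparisons cited in the bullets.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the devclass fields the guards consult. */
struct model_devclass {
    int maxunit;          /* number of slots in the unit table */
    const int *occupied;  /* occupied[i] != 0 means slot i is taken */
};

/*
 * Models only the comparisons described above. Returns 0 on success
 * or EEXIST (assumed error value) when a wired unit is already taken.
 * *grew is set to 1 when the real code would extend the table.
 */
int
model_alloc_unit(const struct model_devclass *dc, int *unitp, int *grew)
{
    int unit;

    assert(dc != NULL && unitp != NULL && grew != NULL);
    unit = *unitp;
    *grew = 0;

    if (unit != -1) {
        /* "wired unit" branch: every negative except -1 lands here */
        if (unit >= 0 && unit < dc->maxunit && dc->occupied[unit])
            return (EEXIST);    /* skipped entirely when unit < 0 */
    } else {
        /* wildcard: pick the first free slot */
        unit = 0;
        while (unit < dc->maxunit && dc->occupied[unit])
            unit++;
    }

    if (unit >= dc->maxunit)
        *grew = 1;              /* false for every negative unit */

    *unitp = unit;              /* a negative unit is handed back as-is */
    return (0);
}
```

Calling the model with `*unitp == -2` against an empty table (`maxunit == 0`) returns 0, leaves the unit at -2, and never sets the growth flag, which matches the fall-through described in the last bullet.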

Back in devclass_add_device (sys/kern/subr_bus.c:1144):

dc->devices[dev->unit] = dev;   /* writes 8 bytes at a negative array index */

dev->unit is the negative value returned above, so this writes a device_t pointer before the start of the kmalloc'd dc->devices array, corrupting adjacent heap metadata or objects.
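
Where a negative-index store lands can be shown in user space without undefined behaviour by placing the modelled table in the middle of a larger backing array, so the slots in front of it still belong to the same object. This is a sketch under that assumption; `struct model_heap`, `model_devices`, and `model_store` are hypothetical names, and the guard slots merely stand in for whatever the allocator placed before the real table.

```c
#include <assert.h>
#include <stddef.h>

#define GUARD_SLOTS 4   /* stand-in for memory preceding the table */
#define TABLE_SLOTS 4   /* stand-in for dc->maxunit */

/* One backing object: guard slots followed by the modelled unit table. */
struct model_heap {
    void *slots[GUARD_SLOTS + TABLE_SLOTS];
};

/* Pointer that plays the role of dc->devices. */
void **
model_devices(struct model_heap *h)
{
    return (&h->slots[GUARD_SLOTS]);
}

/*
 * Mirrors the shape of `dc->devices[dev->unit] = dev`. The assert keeps
 * the model inside its backing object; the kernel statement has no
 * equivalent lower bound, which is the defect.
 */
void
model_store(struct model_heap *h, int unit, void *dev)
{
    assert(h != NULL);
    assert(unit >= -GUARD_SLOTS && unit < TABLE_SLOTS);
    model_devices(h)[unit] = dev;
}
```

With `unit == -2` the pointer ends up in `slots[GUARD_SLOTS - 2]`, two pointer-sized slots ahead of the table, and no table entry changes.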

device_set_unit (sys/kern/subr_bus.c:2166-2184) has a related OOB read at its bounds check (sys/kern/subr_bus.c:2172):

if (unit < dc->maxunit && dc->devices[unit])   /* dc->devices[negative] read */
    return(EBUSY);

A negative unit makes unit < dc->maxunit TRUE, so dc->devices[unit] is an OOB read before the array; if it reads NULL, execution proceeds to dev->unit = unit (:2177) and devclass_add_device, hitting the OOB write. make_device passes caller-supplied unit straight through devclass_add_device to the same sink.
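
The gap in that bounds check is that it has an upper bound but no lower one. The two predicates below are a hedged illustration (the function names are hypothetical and nothing here touches a real table): the first is the condition under which the check goes on to read dc->devices[unit], the second is the condition under which that read would actually be inside the table.

```c
#include <assert.h>
#include <stdbool.h>

/* True when `unit < dc->maxunit` holds, i.e. dc->devices[unit] gets read. */
bool
model_reads_table(int unit, int maxunit)
{
    assert(maxunit >= 0);
    return (unit < maxunit);
}

/* True when that read would fall inside the table. */
bool
model_read_in_bounds(int unit, int maxunit)
{
    assert(maxunit >= 0);
    return (unit >= 0 && unit < maxunit);
}
```

For `unit == -2` the first predicate is true for any non-negative `maxunit` while the second is false, which is the out-of-bounds read described above.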

Threat model & preconditions

  • Attacker position: No demonstrated unprivileged-userspace trigger. The unit parameter originates from bus driver code (device_add_child, device_add_child_ordered) or from the loader hints (root-controlled). The finding is a latent memory-corruption defect: any driver that computes a unit which underflows below zero (e.g. unit = a - b with b > a, a signed parse of a device-reported/HW field, or an arithmetic slip in an attacker-influenced path such as USB or Thunderbolt) reaches this sink and corrupts the kernel heap.
  • Privileges gained or impact: if reached, kernel heap corruption โ€” an attacker-influenced 8-byte pointer write at a selectable negative offset from a kmalloc array. Potentially exploitable for arbitrary kernel R/W (via corrupted slab/malloc metadata) and thus privilege escalation.
  • Required config or capabilities: default kernel. Reachability depends on a calling driver passing a negative unit.
  • Reachability: device_add_child(bus, drv, <negative>) -> make_device -> devclass_add_device -> devclass_alloc_unit; or device_set_unit(dev, <negative>). The large driver tree under sys/dev/ and sys/bus/ is the realistic source of a miscomputed unit.

Proof of concept

PoC source: findings/poc/DF-0003/poc_negunit.c

A small kernel module that calls device_add_child(root_bus, "fakehack", -2) to drive the OOB write directly. It requires root to kldload but proves both that the write occurs and that nothing in devclass_alloc_unit rejects the value.

Build & run

cc -I/sys -DKERNEL -c findings/poc/DF-0003/poc_negunit.c
ld -r poc_negunit.o -o poc_negunit.ko
kldload ./poc_negunit.ko      # as root; INVARIANTS kernel recommended

Expected output

poc: OOB write occurred (child=0x...)

On an INVARIANTS kernel, subsequent heap operations typically panic with slab/malloc assertions ("freed pointer ... modified", "use after free"), proving the out-of-bounds write landed on heap metadata / an adjacent object. The negative index is selectable (-2 .. -N) so a specific pre-array offset can be targeted with heap grooming.

Impact

Latent kernel heap corruption reachable by any driver that computes a negative unit number. The bug is in foundational (newbus) code used by every device, so a single underflowed unit anywhere in the driver tree is a potential local privilege-escalation / kernel-R/W primitive. No unprivileged trigger was identified in this file; the fix is cheap and removes a real memory-corruption footgun.

Suggested fix

Validate the unit on entry to devclass_alloc_unit (reject < -1) and mirror the guard in device_set_unit (reject < 0).

--- a/sys/kern/subr_bus.c
+++ b/sys/kern/subr_bus.c
@@ -1064,6 +1064,9 @@ static int
 devclass_alloc_unit(devclass_t dc, int *unitp)
 {
    int unit = *unitp;
+
+   if (unit < -1)
+       return (EINVAL);

    PDEBUG(("unit %d in devclass %s", unit, DEVCLANAME(dc)));
@@ -2165,6 +2168,9 @@ int
 device_set_unit(device_t dev, int unit)
 {
    devclass_t dc;
    int err;
+
+   if (unit < 0)
+       return (EINVAL);

    dc = device_get_devclass(dev);
    if (unit < dc->maxunit && dc->devices[unit])
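
The two entry checks in the diff differ only in their floor, because only devclass_alloc_unit accepts the -1 wildcard. A minimal user-space sketch of the two predicates follows; `check_alloc_unit` and `check_set_unit` are hypothetical helper names used for illustration and are not part of the patch.

```c
#include <assert.h>
#include <errno.h>

/* Entry check proposed for devclass_alloc_unit: -1 remains the wildcard. */
int
check_alloc_unit(int unit)
{
    return (unit < -1 ? EINVAL : 0);
}

/* Entry check proposed for device_set_unit: no wildcard, so 0 is the floor. */
int
check_set_unit(int unit)
{
    return (unit < 0 ? EINVAL : 0);
}
```

Under these predicates -2 is rejected on both paths, -1 is accepted only on the allocation path, and every non-negative unit passes through to the existing logic unchanged.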

References

Timeline

  • 2026-06-29 Discovered during automated file-by-file audit of sys/kern/subr_bus.c.
  • pending Reported to DragonFlyBSD security contact.

PoC verification

Evidence pack

findings/poc/DF-0003 · 14 files
| File | Type | Description | Size |
|---|---|---|---|
| poc_negunit.c | trigger-source | kld module: device_add_child(root_bus,"df3neg",-2) -- drives the negative-unit OOB-write sink | 2.8 KB |
| poc_ctrl.c | control-source | kld module: device_add_child(root_bus,"df3ctrl",0) -- valid unit; must load cleanly (the control) | 1.1 KB |
| Makefile | build-makefile | bsd.kmod.mk build for the trigger (correct kernel CFLAGS) | 302 B |
| Makefile.ctrl | build-makefile | bsd.kmod.mk build for the control | 180 B |
| setup_env.sh | build-setup | install kernel headers + machine forwarders on a guest that lacks /usr/src | 976 B |
| build.sh | build-script | build both .ko modules via make | 480 B |
| run.sh | run-script | load control (clean) then trigger (panic) as root | 922 B |
| build.log | build-log | full successful bsd.kmod.mk build output | 6.9 KB |
| run.log | run-log | decisive run: control marker + trigger panic (from serial boot.log) | 1.2 KB |
| panic.txt | panic-signature | crash signature + addr2line proof IP 0xffffffff8068a946 = subr_bus.c:1144 | 2.0 KB |
| env.txt | environment | uname, cc version, kern.ident, kldstat | 543 B |
| VERDICT.md | verdict | full narrative verdict: mechanism, reachability, exploit-chain analysis | 7.1 KB |
| fix.diff | suggested-fix | git-apply-able: reject unit<-1 in devclass_alloc_unit, unit<0 in device_set_unit | 924 B |
| README.md | readme | human-facing build/run/reachability summary | 4.5 KB |

README.md readme human-facing build/run/reachability summary

DF-0003 - devclass_alloc_unit() negative-unit heap out-of-bounds write

poc_negunit.c / poc_ctrl.c -- kld modules that drive the negative-unit heap OOB-write sink in devclass_alloc_unit() / devclass_add_device() (sys/kern/subr_bus.c).

The bug (memory-safety, CERTAIN -- reproduced on the audited kernel)

devclass_alloc_unit() only treats unit == -1 as a wildcard. Any other negative unit (e.g. -2) enters the "wired unit" branch but skips the existing-device check (unit >= 0, subr_bus.c:1073) and the table-extension check (unit >= dc->maxunit, subr_bus.c:1094), so it returns success with the negative unit. devclass_add_device() then executes

dc->devices[dev->unit] = dev;     // sys/kern/subr_bus.c:1144

an 8-byte pointer write at a NEGATIVE index into the kmalloc'd dc->devices array. device_set_unit() has a matching OOB read at subr_bus.c:2172.

Reproduction (VERIFIED)

A kld module that calls device_add_child(root_bus, "df3neg", -2):

  • Control (poc_ctrl.ko, unit=0) loads cleanly and prints DF0003-CTRL: unit=0 -> OK (child=0xfffff800...). Guest stays up.
  • Trigger (poc_negunit.ko, unit=-2) panics the kernel immediately:

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xfffffffffffffff0
fault code = supervisor write data, page not present
Stopped at devclass_add_device+0xf6: movq %r14,(%rdx,%rax,1)

addr2line -e /boot/kernel/kernel 0xffffffff8068a946 -> sys/kern/subr_bus.c:1144 (the exact sink line).

The fault address 0xfffffffffffffff0 = (device_t*)NULL + (-2) = 0 + (-2)*8, i.e. the negative-index write target. For the freshly-created devclass dc->devices == NULL, so the store hits an unmapped address and the kernel page-faults on the WRITE -- at the sink line.
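
The fault-address arithmetic can be checked mechanically. The helper below is a small sketch (`element_address` is a hypothetical name) that computes the address touched by base[index] with wrapping 64-bit arithmetic, assuming 8-byte pointers as on the audited amd64 guest.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address touched by base[index] when each element is elem_size bytes.
 * Unsigned arithmetic wraps modulo 2^64, as the address computation in
 * the faulting store does.
 */
uint64_t
element_address(uint64_t base, int64_t index, uint64_t elem_size)
{
    assert(elem_size != 0);
    return (base + (uint64_t)index * elem_size);
}
```

`element_address(0, -2, 8)` evaluates to 0xfffffffffffffff0, the reported fault virtual address, which is consistent with a store through a NULL dc->devices at index -2.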

The ONLY difference between the control and the trigger is the literal unit (0 vs -2), so the panic is caused specifically by the negative unit.

Reachability (the crux)

There is no unprivileged-userspace path to this sink in the default kernel. device_add_child / devclass_add_device / devclass_alloc_unit are internal newbus APIs; no syscall/ioctl invokes them. Auditing all 114 in-tree device_add_child* callers and the single device_set_unit caller (sio.c):

  • 84 pass the -1 wildcard (the legitimate, handled case);
  • the rest pass provably non-negative units -- PCI bus numbers (uint8_t secbus 0-255 in pci_pci.c:344, busno, bus), for(unit=0;;unit++) loop counters (ata-all.c, ata-pci.c), a monotonically-increasing freeunit/puc_find_free_unit() (starts >= 0 and only grows), and sio_pci_kludge_unit()'s unit that starts at 0 and only ++s.

So no in-tree driver computes a negative unit. The bug is therefore a real-but-latent memory-corruption defect: reachable today only by root (kldload, demonstrated here) or by any future/buggy driver that underflows a unit (signed subtraction, signed parse of a device-reported field, etc.). The fix is a one-line guard that converts the latent footgun into a hard EINVAL.
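
The difference between the audited callers and the hypothetical future bug can be illustrated with three small functions. This is a sketch only: `unit_from_bus_number` mirrors the uint8_t-to-int promotion mentioned for the PCI bus numbers, while `unit_from_relative_port` and `unit_or_wildcard` are invented examples of an unchecked and a checked signed subtraction and do not correspond to any in-tree driver.

```c
#include <assert.h>
#include <stdint.h>

/* A bus number held in a uint8_t promotes to an int in the range 0..255. */
int
unit_from_bus_number(uint8_t secbus)
{
    return (secbus);
}

/* Hypothetical buggy driver: port index relative to a base, unchecked. */
int
unit_from_relative_port(int port, int first_port)
{
    return (port - first_port);     /* negative when port < first_port */
}

/* Hypothetical careful driver: fall back to the -1 wildcard instead. */
int
unit_or_wildcard(int port, int first_port)
{
    assert(first_port >= 0);
    return (port >= first_port ? port - first_port : -1);
}
```

With `port == 1` and `first_port == 3`, the unchecked variant yields -2 (the value that reaches the sink), whereas the checked variant yields -1 and lets newbus pick a free unit.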

Build & run (on the DragonFly guest)

Building a kld requires the kernel source tree's headers; the guest ships without /usr/src, so setup_env.sh installs a headers-only subset first.

# one-time (root): install kernel headers + machine forwarders
sh setup_env.sh

# build both modules (as any user)
make SYSDIR=/usr/src/sys                      # -> poc_negunit.ko (trigger, -2)
make -f Makefile.ctrl SYSDIR=/usr/src/sys     # -> poc_ctrl.ko    (control, 0)

# run (root): control loads clean, trigger panics
sh run.sh

Files

| File | Purpose |
|---|---|
| poc_negunit.c | trigger source -- device_add_child(root_bus,"df3neg",-2) |
| poc_ctrl.c | control source -- device_add_child(root_bus,"df3ctrl", 0) |
| Makefile / Makefile.ctrl | build via bsd.kmod.mk (correct kernel CFLAGS) |
| setup_env.sh | install kernel headers + machine forwarders on the guest |
| build.sh | (legacy) hand-build path; superseded by the Makefiles |
| run.sh | load control then trigger |
| build.log | full successful build output |
| run.log | decisive run: control marker + trigger panic (from boot.log) |
| panic.txt | crash signature with addr2line proof |
| env.txt | guest uname/cc/config/kldstat |
| VERDICT.md | full narrative verdict |
| fix.diff | git-apply-able fix (reject < -1 in devclass_alloc_unit, < 0 in device_set_unit) |
| manifest.json | artifact catalog |

VERDICT.md verdict full narrative verdict: mechanism, reachability, exploit-chain analysis

DF-0003 -- VERDICT

Verdict: REPRODUCED (memory-corruption sink confirmed at runtime on the audited master-DEV kernel; impact = kernel panic / DoS, root-gated; latent for any future buggy driver that underflows a unit).

Summary

devclass_alloc_unit() (sys/kern/subr_bus.c:1064-1125) only treats unit == -1 as a wildcard. Any other negative unit (e.g. -2) slips past every guard and is returned unchanged; devclass_add_device() then performs dc->devices[dev->unit] = dev (subr_bus.c:1144) -- an 8-byte pointer write at a negative array index. A tiny kld module calling device_add_child(root_bus, "df3neg", -2) drives this sink directly and panics the kernel at exactly subr_bus.c:1144 (proven by addr2line). The control (unit=0) loads cleanly.

Mechanism (trigger -> primitive -> effect), cited path:line

  1. Trigger: device_add_child(root_bus, "df3neg", -2) -> device_add_child_ordered (subr_bus.c:1247) -> make_device (subr_bus.c:1174). With name="df3neg", make_device calls devclass_find_internal(name, NULL, TRUE) (subr_bus.c:1183) which creates a fresh devclass with dc->devices = NULL; dc->maxunit = 0 (subr_bus.c:759-760), then devclass_add_device(dc, dev) (subr_bus.c:1211).

  2. Sink reach: devclass_add_device calls devclass_alloc_unit(dc, &dev->unit) (subr_bus.c:1139).

  3. The missing guard (subr_bus.c:1064-1125):
       • int unit = *unitp; (:1067) -> unit = -2.
       • if (unit != -1) (:1072) -> TRUE (only -1 is the wildcard), enter the "wired unit" branch.
       • if (unit >= 0 && unit < dc->maxunit && dc->devices[unit] != NULL) (:1073) -> unit >= 0 is FALSE, the existing-device check is skipped.
       • if (unit >= dc->maxunit) (:1094) -> -2 >= 0 is FALSE, the table-extension block is skipped -- dc->devices is not grown and the negative unit is not caught.
       • *unitp = unit; return(0); (:1123-1124) -> returns success with dev->unit = -2.

  4. OOB write (subr_bus.c:1144): back in devclass_add_device, dc->devices[dev->unit] = dev -> dc->devices[-2] = dev. With dc->devices == NULL this stores the 8-byte dev pointer at address 0 + (-2)*sizeof(device_t) = 0xfffffffffffffff0 (unmapped) -> supervisor WRITE page fault.

  5. Effect: fatal trap 12, kernel panic, guest wedged in DDB. The faulting instruction is devclass_add_device+0xf6: movq %r14,(%rdx,%rax,1) (the indexed 8-byte store), at IP 0xffffffff8068a946, which addr2line resolves to sys/kern/subr_bus.c:1144 -- the exact sink line.

device_set_unit() (subr_bus.c:2165-2184) has a matching OOB read at its bounds check if (unit < dc->maxunit && dc->devices[unit]) (:2172) -- a negative unit makes unit < dc->maxunit TRUE, so dc->devices[unit] reads out of bounds; if it reads NULL, execution proceeds to dev->unit = unit; devclass_add_device(dc, dev) (:2177-2178), hitting the same write sink. No in-tree caller exercises this.

Evidence

  • run.log -- decisive run: control prints DF0003-CTRL: unit=0 -> OK (child=0xfffff80065c20ea0), guest stays up; trigger panics. The boot.log (serial console) delta shows:
      fault virtual address = 0xfffffffffffffff0
      fault code = supervisor write data, page not present
      instruction pointer = 0x8:0xffffffff8068a946
      Stopped at devclass_add_device+0xf6: movq %r14,(%rdx,%rax,1)
  • panic.txt -- crash signature + addr2line proof that IP 0xffffffff8068a946 = subr_bus.c:1144.
  • build.log -- full successful bsd.kmod.mk build.

Reachability -- why this is "latent" but real

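device_add_child / devclass_add_device / devclass_alloc_unit are internal newbus kernel APIs; no syscall or ioctl invokes them. The unit originates from bus-driver code (device_add_child*) or loader hints (root-controlled). Auditing the sys/ tree (all 114 in-tree device_add_child* callers plus the single device_set_unit caller in sio.c) shows that 84 callers pass the -1 wildcard and the rest pass provably non-negative units: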

=> No in-tree path produces a unit < -1. The bug is therefore a real-but-latent memory-corruption defect reachable today only by root (kldload, demonstrated) or by any future/buggy driver that underflows a unit (signed subtraction underflow, signed parse of a device-reported field, etc.). The fix is a one-line guard.

Exploit chain (memory-corruption class -- analysis)

  • Primitive: dc->devices[N] = dev with attacker-selectable negative index N (the unit), writing the 8-byte kernel-heap pointer dev at offset N*8 before dc->devices.
  • Triggering requirement: root (kldload) or a buggy driver. Not reachable from unprivileged userspace.
  • This PoC's effect: deterministic panic (the easy fresh-devclass case has dc->devices == NULL, so the store faults immediately -- a clean DoS, not controllable corruption).
  • Controllable-corruption variant (theoretical): target an existing devclass whose dc->devices is a real heap pointer (e.g. reuse a known driver's devclass), choose unit so dc->devices[N] lands on an adjacent slab object (function pointer / ucred * / refcount), and groom the heap. This would need root (kldload) and a kernel-ROP/ucred-forgery conversion, which is beyond what an unprivileged attacker can reach. Given the root-only trigger, the realistic impact ceiling is local DoS by root + latent corruption for a future driver bug. No uid0 chain was pursued -- there is no unprivileged trigger to escalate from, and the root case is already game-over for the attacker.

PoC changes (vs. the filed PoC)

  • The original poc_negunit.c included <sys/bus.h> (which drags in platform/APIC headers) and was built with cc -DKERNEL (wrong macro -- the guard is _KERNEL). The rewrite forward-declares the newbus symbols and builds via the standard bsd.kmod.mk (correct kernel CFLAGS, machine forwarders). The hand-build (build.sh) is kept as a legacy path, but the Makefile build is authoritative.
  • Added poc_ctrl.c + Makefile.ctrl -- a unit=0 control that loads cleanly, so the trigger panic is provably caused by the negative unit and not by module plumbing.
  • Added setup_env.sh (install kernel headers + forwarders on a guest that lacks /usr/src), run.sh, env.txt, panic.txt, build.log, run.log, manifest.json, and fix.diff.

Recommended fix

Reject negative units at the entry of devclass_alloc_unit (< -1) and device_set_unit (< 0). See fix.diff (git-apply-able, supersedes the finding markdown's draft -- adds an explicit EINVAL return and comments).

Confirmed kernel references

Detail

Exploit chain

Primitive: dc->devices[N]=dev with attacker-selectable negative index N (=unit), writing the 8-byte kernel-heap dev pointer at offset N*8 before dc->devices. Trigger requires root (kldload) or a buggy in-tree driver -- NOT reachable unprivileged. This PoC's easy variant (fresh devclass, dc->devices==NULL) deterministically panics (DoS) rather than corrupting controllably. A controllable-corruption variant would target an existing devclass whose dc->devices is a real heap pointer, choose unit to land on an adjacent slab object (function pointer/ucred/refcount), and groom the heap -- but this is root-gated and needs kernel-ROP/ucred-forgery conversion, so no uid0 chain was pursued (no unprivileged trigger to escalate from; root case is already game-over). Realistic ceiling: local DoS by root + latent corruption footgun. Chain analysis documented in VERDICT.md.

Evidence (decisive lines)

Decisive run.log + panic.txt: control prints 'DF0003-CTRL: unit=0 -> OK (child=0xfffff80065c20ea0)' (guest stays up); trigger panics with 'fault virtual address = 0xfffffffffffffff0', 'fault code = supervisor write data, page not present', 'Stopped at devclass_add_device+0xf6: movq %r14,(%rdx,%rax,1)'. addr2line 0xffffffff8068a946 -> /usr/src/sys/kern/subr_bus.c:1144. nm confirms the devclass_add_device base is 0xffffffff8068a850, and 0xffffffff8068a850 + 0xf6 = 0xffffffff8068a946, the faulting IP. build.log shows the bsd.kmod.mk build. env.txt has the guest uname/config.
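
The cross-check between the nm base address, the DDB "symbol+offset" location, and the instruction pointer is a single addition. A tiny sketch, with `symbol_plus_offset` as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Instruction address for a "symbol+offset" location, given the base from nm. */
uint64_t
symbol_plus_offset(uint64_t symbol_base, uint64_t offset)
{
    assert(symbol_base <= UINT64_MAX - offset);   /* refuse wrap-around */
    return (symbol_base + offset);
}
```

Feeding it the base 0xffffffff8068a850 and the offset 0xf6 gives 0xffffffff8068a946, the address that addr2line resolves to subr_bus.c:1144.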

PoC changes

Rewrote poc_negunit.c to forward-declare the newbus symbols (sys/bus.h drags in unrelated platform/APIC headers) and parameterize the unit via UNITVAL/UNITNAME macros (default -2). Added poc_ctrl.c + Makefile.ctrl as a unit=0 control. Added setup_env.sh (install kernel headers + machine forwarders on the guest, which ships without /usr/src), Makefile (bsd.kmod.mk build -- the original hand-build produced .ko files with broken relocations because .eh_frame/SHT_X86_64_UNWIND and missing -mcmodel=kernel caused 'lost base for relatab' / strcmp panics unrelated to the bug), build.sh, run.sh, env.txt, build.log, run.log, panic.txt, VERDICT.md, fix.diff, manifest.json. README.md updated with the reproduction + reachability analysis.

Verified recommended fix

Reject negative units early: add if (unit < -1) return (EINVAL); at the top of devclass_alloc_unit (sys/kern/subr_bus.c:~1068) and if (unit < 0) return (EINVAL); at the top of device_set_unit (sys/kern/subr_bus.c:~2173). Full git-apply-able diff (validated with git apply --check) in findings/poc/DF-0003/fix.diff; supersedes the finding markdown's draft by adding explicit EINVAL returns and comments at both sinks.

Verdict

REPRODUCED. The negative-unit OOB-write sink in devclass_add_device() fires on the audited master-DEV kernel: a kld module calling device_add_child(root_bus,"df3neg",-2) panics the kernel at devclass_add_device+0xf6, which addr2line resolves to sys/kern/subr_bus.c:1144 (dc->devices[dev->unit]=dev) -- the exact sink line. The fault address 0xfffffffffffffff0 = (device_t *)NULL + (-2) = 0 + (-2)*8, proving devclass_alloc_unit returned unit=-2 unchanged (it skips the unit>=0 check at :1073 and the unit>=dc->maxunit extension at :1094), then devclass_add_device wrote dc->devices[-2]=dev against a NULL dc->devices. The control (unit=0) loads cleanly and prints its marker, so the panic is caused specifically by the negative unit. Reachability crux: device_add_child/devclass_add_device/devclass_alloc_unit are internal newbus APIs with NO syscall/ioctl path; all 114 in-tree callers and the single device_set_unit caller (sio.c) pass -1 (wildcard) or provably non-negative units (uint8_t secbus, for(unit=0;;unit++) counters, monotonic freeunit, PCI bus numbers). So no unprivileged trigger exists -- the bug is a real-but-latent memory-corruption defect, reachable today only by root (kldload, demonstrated) or a future buggy driver.