Heap OOB read in elf_getnote: untrusted n_namesz advances offset past note buffer with no bounds check
| Field | Value |
|---|---|
| ID | DF-0070 |
| Status | new |
| Severity | Medium |
| CVSS 3.1 | CVSS:3.1/AV:L/AC:L/PR:H/UI:N/S:U/C:L/I:N/A:H |
| CWE | CWE-125 Out-of-bounds Read |
| File | sys/kern/kern_checkpoint.c |
| Lines | 313-352 |
| Area | kern (checkpoint/restore) |
| Confidence | likely |
| Discovered | 2026-06-30 |
| Reported | pending |
Summary
elf_getnote (sys/kern/kern_checkpoint.c:313-352) parses ELF note headers out
of a kernel heap buffer note[notesz] whose size notesz is fully
attacker-controlled (phdr[0].p_filesz, passed at :240). The note header's
n_namesz and n_descsz fields (uint32_t) are read straight out of that
untrusted buffer (:325) and used to advance *off via
roundup2(note.n_namesz, sizeof(Elf_Size)) (:339) and
roundup2(note.n_descsz, sizeof(Elf_Size)) (:347), and to size a descriptor
bcopy (:346), WITHOUT ANY CHECK that *off stays within [0, notesz).
There is also no check that notesz is large enough to hold the first
Elf_Note header before the bcopy at :325.
The attacker sets n_namesz to a huge value (e.g. 0x10000000) while keeping
the literal name bytes CORE\0 at the right place so strncmp at :335 still
returns 0 (it stops at the NUL in "CORE" within 5 bytes, regardless of a giant
n_namesz). *off then jumps far past the buffer end; n_descsz must equal the
descsz the caller expects (:340), so the attacker sets it correctly, and the
bcopy at :346 reads that many bytes (sizeof(prpsinfo_t) for the first note)
from heap memory well beyond the allocation.
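The strncmp behaviour that lets an oversized n_namesz through can be checked in user space. The sketch below is illustrative only: name_matches is a hypothetical helper, not kernel code, and the stored name stands in for the bytes that follow the note header.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the name check at :335. strncmp stops at the
 * first NUL in either string, so the length limit never comes into play
 * when the stored name is the 5-byte "CORE\0".
 */
static int
name_matches(const char *expected, const char *stored, uint32_t n_namesz)
{
	return strncmp(expected, stored, n_namesz) == 0;
}
```

Calling name_matches("CORE", stored, 0x10000000) on a buffer that begins with "CORE\0" reports a match, exactly as it does with n_namesz = 5. The name check therefore places no limit on n_namesz.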
Separately, the nthreads formula at :185
(notesz - sizeof(prpsinfo_t)) / (sizeof(prstatus_t) + sizeof(prfpregset_t))
ignores per-note Elf_Note header + name/desc padding overhead, so
elf_demarshalnotes over-counts threads and walks past the real note data into
OOB territory.
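The over-count can be made concrete with a small calculation. This is a sketch under stated assumptions: the struct sizes are the amd64 values reported by probe.c in the evidence pack, every note is assumed to carry the 5-byte name "CORE\0" padded to 8 bytes, and each thread is assumed to contribute one prstatus note and one prfpregset note. The helper names are hypothetical.

```c
#include <stddef.h>

/* Sizes as reported by probe.c (amd64); treated as assumptions here. */
#define PRPSINFO_SZ   120u
#define PRSTATUS_SZ   248u
#define PRFPREGSET_SZ 512u
#define NOTE_HDR_SZ    12u	/* sizeof(Elf_Note) */
#define NAME_PAD_SZ     8u	/* "CORE\0" (5 bytes) rounded up to 8 */

/* Thread count as derived by the formula at :185. */
static size_t
nthreads_formula(size_t notesz)
{
	return (notesz - PRPSINFO_SZ) / (PRSTATUS_SZ + PRFPREGSET_SZ);
}

/* Bytes a well-formed note segment needs for n threads, overhead included. */
static size_t
notesz_for_threads(size_t n)
{
	size_t per_note = NOTE_HDR_SZ + NAME_PAD_SZ;

	return (per_note + PRPSINFO_SZ) +
	    n * (2 * per_note + PRSTATUS_SZ + PRFPREGSET_SZ);
}
```

Under these assumptions the formula is exact for small images but drifts upward: a well-formed 19-thread segment is 15340 bytes, for which the formula reports 20 threads, because the 40 bytes of per-thread overhead eventually add up to a whole 760-byte stride.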
Impact: kernel heap OOB read of up to a few hundred bytes; if the read crosses
an unmapped page boundary the kernel panics (reliable local DoS); if the
OOB-loaded data satisfies the size/version checks in elf_loadnotes, the leaked
heap bytes flow into p->p_comm via strlcpy at :306 (limited kernel-memory
info leak observable via ps/sysctl).
Reachability: sys_checkpoint(CKPT_THAW, fd, -1, 0) on a crafted checkpoint
image. Root/wheel-only under default ckptgroup=0 (:728-729), but the data is
untrusted in all cases and the path is open to arbitrary users if an admin sets
kern.ckptgroup=-1.
Recommended fix
Pass notesz (or an end pointer) into elf_getnote and validate every access
before touching memory:
--- a/sys/kern/kern_checkpoint.c
+++ b/sys/kern/kern_checkpoint.c
@@ static int
elf_getnote(void *src, size_t *off, const char *name, unsigned int type,
- void **desc, size_t descsz)
+ void **desc, size_t descsz, size_t srcsz)
{
Elf_Note note;
int error;
@@
- bcopy((char *)src + *off, &note, sizeof note);
+ if (*off + sizeof(note) > srcsz) { error = EINVAL; goto done; }
+ bcopy((char *)src + *off, &note, sizeof note);
*off += sizeof note;
@@
- if (strncmp(name, (char *) src + *off, note.n_namesz) != 0) {
+ if (*off + roundup2(note.n_namesz, sizeof(Elf_Size)) > srcsz ||
+ note.n_namesz > 32) {
+ error = EINVAL; goto done;
+ }
+ if (strncmp(name, (char *) src + *off, note.n_namesz) != 0) {
@@
*off += roundup2(note.n_namesz, sizeof(Elf_Size));
- if (note.n_descsz != descsz) {
+ if (*off + roundup2(note.n_descsz, sizeof(Elf_Size)) > srcsz ||
+ note.n_descsz != descsz) {
@@
if (desc)
bcopy((char *)src + *off, *desc, note.n_descsz);
Also pre-check if (notesz < sizeof(Elf_Note)) return EINVAL; in
elf_demarshalnotes and fix the nthreads derivation to account for per-note
overhead, or replace the count-based loop with offset-driven parsing that stops
once *off reaches notesz.
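The validate-before-access pattern can be sketched as a user-space parser. This is not the kernel patch: struct note_hdr is a local stand-in for Elf_Note, getnote_checked, range_ok and pad are hypothetical names, the 8-byte alignment assumes amd64, and returning EINVAL on a type mismatch is an assumption of this sketch. The range check is written so that the addition cannot wrap.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the kernel's Elf_Note: three 32-bit fields. */
struct note_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

#define NOTE_ALIGN   8u		/* sizeof(Elf_Size) on amd64 (assumption) */
#define NOTE_NAMEMAX 32u	/* cap taken from the proposed patch */

/* True when [off, off + len) lies inside a buffer of srcsz bytes. */
static int
range_ok(size_t off, size_t len, size_t srcsz)
{
	return off <= srcsz && len <= srcsz - off;
}

/* Round n up to the note alignment. */
static size_t
pad(size_t n)
{
	return (n + NOTE_ALIGN - 1) & ~(size_t)(NOTE_ALIGN - 1);
}

/*
 * Hypothetical bounded variant of elf_getnote: copies one descriptor out of
 * src and advances *off, or returns EINVAL without reading past srcsz.
 * *off is only updated on success.
 */
static int
getnote_checked(const void *src, size_t srcsz, size_t *off, const char *name,
    uint32_t type, void *desc, size_t descsz)
{
	const char *p = src;
	struct note_hdr h;
	size_t cur = *off;

	if (!range_ok(cur, sizeof(h), srcsz))
		return EINVAL;
	memcpy(&h, p + cur, sizeof(h));
	cur += sizeof(h);

	if (h.n_type != type || h.n_namesz > NOTE_NAMEMAX ||
	    !range_ok(cur, pad(h.n_namesz), srcsz) ||
	    strncmp(name, p + cur, h.n_namesz) != 0)
		return EINVAL;
	cur += pad(h.n_namesz);

	if (h.n_descsz != descsz || !range_ok(cur, pad(h.n_descsz), srcsz))
		return EINVAL;
	if (desc != NULL)
		memcpy(desc, p + cur, h.n_descsz);
	*off = cur + pad(h.n_descsz);
	return 0;
}
```

A caller can loop on getnote_checked until *off reaches srcsz, which gives the offset-driven parsing mentioned above without depending on a precomputed thread count.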
Proof of concept
See findings/poc/DF-0070/. A small C program builds a minimal valid ELF header
+ a single PT_NOTE program header with p_filesz chosen so the nthreads
formula lands in [1, CKPT_MAXTHREADS], and a note payload whose n_namesz is
0x10000000. Calling sys_checkpoint(CKPT_THAW, fd, -1, 0) drives elf_getnote
off the end of the note buffer -> heap OOB read / panic.
Timeline
- 2026-06-30 Discovered during automated file-by-file audit of
sys/kern/kern_checkpoint.c.
- pending Reported to DragonFlyBSD security contact.
PoC verification
Evidence pack
findings/poc/DF-0070 · 13 files
| File | Type | Description | Size |
|---|---|---|---|
| df0070.c | trigger-source | malicious-checkpoint generator + inline sys_checkpoint(CKPT_THAW) trigger; default = panic mode (n_namesz=0x10000000), optional 'leak' mode for slab-adjacent OOB | 10.2 KB |
| probe.c | probe-source | prints sizeof(prpsinfo_t)/prstatus_t/prfpregset_t/Elf_Note etc. on the running kernel; verifies the nthreads formula constants | 1.2 KB |
| build.sh | build-script | cc -o df0070 df0070.c | 249 B |
| run.sh | run-script | ./df0070 evil.ckpt [panic\|leak] | 633 B |
| build.log | build-log | probe.c build + struct-size probe output | 1.1 KB |
| run.log | run-log | decisive panic-mode run (RUN 2, post-reset) + panic signature | 1.9 KB |
| run.2.log | run-log | first panic-mode run (RUN 1, pre-reset) + panic signature; byte-identical code offsets to run.log | 1.5 KB |
| run.leak.log | run-log | slab-adjacent OOB leak-mode run: silent OOB, returns EINVAL, no panic | 1.5 KB |
| panic.txt | panic-signature | Fatal trap 0xc (page fault) in memmove+0x28 from elf_getnote bcopy; vm_object_hold_shared 'obj != NULL' assertion panic | 768 B |
| env.txt | environment | uname -a, cc --version, kern.ckptgroup/kern.osreldate sysctls, id | 372 B |
| fix.diff | suggested-fix | thread srcsz through elf_demarshalnotes -> elf_getnote; bounds-check header bcopy, n_namesz (cap 32), n_namesz_pad, n_descsz_pad against srcsz before each access | 4.0 KB |
| VERDICT.md | verdict | full narrative: reproduced, mechanism, leak-variant analysis, fix rationale | 6.2 KB |
| README.md | readme | human-facing build/run/expected summary | 3.9 KB |
DF-0070 PoC: elf_getnote heap OOB read via crafted checkpoint image
Status: REPRODUCED (kernel panic / local DoS). Verified on DragonFlyBSD
master DEV v6.5.0.1712.g89e6a-DEVELOPMENT (build 2026-06-29, X86_64_GENERIC).
See VERDICT.md for the full analysis.
What this proves
That elf_getnote (sys/kern/kern_checkpoint.c:313-352) advances a parse
offset using an attacker-supplied n_namesz (read from the untrusted note
buffer at :325, applied at :339) with no bounds check against the
allocated buffer size notesz. A crafted checkpoint image with
n_namesz = 0x10000000 causes the subsequent bcopy at :346 to read
sizeof(prpsinfo_t)=120 bytes from kernel virtual memory (KVM) 256 MB past the
kmalloc(880) note buffer allocated at :194. The bcopy (backed by memmove) hits
unmapped memory and the kernel page-faults in kernel mode (Fatal trap 12) ->
panic (vm_object_hold_shared assertion obj != NULL).
Build
./build.sh # cc -o df0070 df0070.c (on a DragonFlyBSD guest)
Run
Default kern.ckptgroup=0 (wheel-only), so run as root.
./run.sh # ./df0070 evil.ckpt panic -- kernel PANICS
./run.sh leak # ./df0070 evil.ckpt leak -- silent slab-adjacent
# 116-byte OOB; returns
# EINVAL, no panic
Expected result (panic mode)
[*] DF-0070 PoC: building evil.ckpt (notesz=880, n_namesz=0x10000000, n_descsz=120, mode=panic)
[*] calling sys_checkpoint(CKPT_THAW, fd=3, pid=-1, retval=0) [syscall #467]...
<ssh dies -- guest in DDB>
# in dfbsd-qemu/boot.log:
panic: assertion "obj != NULL" failed in vm_object_hold_shared at /usr/src/sys/vm/vm_object.c:330
cpuid = 0
...
--- trap 000000000000000c, rip = ffffffff80bca038, ... ---
memmove() at memmove+0x28 0xffffffff80bca038
Debugger("panic")
db>
The page fault is in memmove+0x28, which is the inner bcopy backing the
descriptor copy at kern_checkpoint.c:346. Reproduced twice with
byte-identical code offsets.
Expected result (leak mode)
The kernel silently performs a 116-byte slab-adjacent OOB read (no page
fault, because the read stays inside the 1024-byte slab chunk), then
elf_loadnotes rejects the leaked garbage at the pr_version/pr_psinfosz
validation (:292-301) and returns EINVAL. The OOB read is real but
silent; control does not reach the strlcpy(p->p_comm, ...) at :306, so
the leak is not observable in ps/sysctl.
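The offsets behind both modes follow from the same arithmetic, sketched below. The constants are the amd64 values from probe.c and are assumptions of this sketch; desc_offset and oob_bytes are hypothetical helpers that only compute where the descriptor copy would start and how many of its bytes fall outside the note buffer. Nothing here performs an out-of-bounds access.

```c
#include <stdint.h>

#define NOTE_HDR_SZ 12u	/* sizeof(Elf_Note), per probe.c */
#define NOTE_ALIGN   8u	/* sizeof(Elf_Size) on amd64 (assumption) */

/* Offset at which the descriptor copy starts for a first note. */
static uint64_t
desc_offset(uint32_t n_namesz)
{
	uint64_t padded = ((uint64_t)n_namesz + NOTE_ALIGN - 1) &
	    ~(uint64_t)(NOTE_ALIGN - 1);

	return NOTE_HDR_SZ + padded;
}

/* Number of copied bytes that lie beyond a notesz-byte buffer. */
static uint64_t
oob_bytes(uint64_t notesz, uint32_t n_namesz, uint32_t descsz)
{
	uint64_t end = desc_offset(n_namesz) + descsz;
	uint64_t oob = end > notesz ? end - notesz : 0;

	return oob > descsz ? descsz : oob;
}
```

With notesz = 880 and descsz = 120, n_namesz = 860 gives a start offset of 876 and 116 out-of-bounds bytes ending at 996, inside a 1024-byte chunk. n_namesz = 0x10000000 gives a start offset of 0x1000000c, so all 120 bytes are out of bounds.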
How the sizes were derived
probe.c prints the actual kernel-side struct sizes:
sizeof(prpsinfo_t) = 120
sizeof(prstatus_t) = 248
sizeof(prfpregset_t) = 512
sizeof(Elf_Note) = 12
So the nthreads formula at :185, (notesz - 120) / 760, yields 1 for
notesz = 880, which is inside the [1, CKPT_MAXTHREADS=256] gate at :188.
Notes
- n_descsz must equal the kernel struct size at the :340 check, so for the
first (NT_PRPSINFO) call it is 120 (sizeof(prpsinfo_t)).
- strncmp("CORE", src+*off, n_namesz) at :335 stops at the '\0' in "CORE\0"
within 5 bytes regardless of n_namesz, so a giant n_namesz does not stop the
match; it only inflates the subsequent *off advance.
- The original (pre-verification) PoC assumed wrong struct sizes
(PRPSINFO_SZ=128, PRSTATUS_SZ=504); those were corrected.
- CKPT_THAW requires membership in kern.ckptgroup (default 0 = wheel). With
kern.ckptgroup=-1 any local user can trigger the panic.
Files
- df0070.c: the generator + inline trigger.
- probe.c: struct-size probe (used to derive notesz=880).
- build.sh, run.sh: exact reproduce commands.
- build.log, run.log, run.2.log, run.leak.log: full untrimmed logs.
- panic.txt: the panic signature excerpted from boot.log.
- env.txt: guest environment.
- fix.diff: git apply-able fix (thread srcsz/notesz into elf_getnote, bounds-check every access).
- VERDICT.md, manifest.json.
DF-0070: VERDICT
Verdict: REPRODUCED (panic / kernel-mode page fault).
The heap-OOB read described in DF-0070 is real on the audited DragonFlyBSD
master DEV kernel (6.5-DEVELOPMENT, build v6.5.0.1712.g89e6a-DEVELOPMENT
of 2026-06-29, X86_64_GENERIC). A crafted ELF checkpoint image, restored
with sys_checkpoint(CKPT_THAW, fd, -1, 0) (syscall 467), drives
elf_getnote's descriptor bcopy 256 MB past the kmalloc(880) note buffer
and panics the kernel with Fatal trap 12: page fault while in kernel mode
inside memmove. Reproduced twice (once before, once after vm.sh reset)
with byte-identical code offsets in the panic stack; only the KASLR
frame addresses differ between boots.
Mechanism (every hop cited path:line)
1. sys_checkpoint(CKPT_THAW) (sys/kern/kern_checkpoint.c:751) -> ckpt_thaw_proc (:218).
2. ckpt_thaw_proc reads the ELF header (elf_gethdr, :230), the program headers (elf_getphdrs, :236), then calls elf_getnotes(lp, fp, phdr->p_filesz) at :240; notesz flows straight from attacker-controlled phdr[0].p_filesz with no validation.
3. elf_getnotes (:176) derives nthreads purely from notesz at :185: (notesz - sizeof(prpsinfo_t)) / (sizeof(prstatus_t) + sizeof(prfpregset_t)). With the verified amd64 sizes (prpsinfo_t=120, prstatus_t=248, prfpregset_t=512) and notesz=880, nthreads = 1, passing the [1, CKPT_MAXTHREADS=256] gate at :188.
4. note = kmalloc(notesz=880, M_TEMP, M_WAITOK) at :194 allocates an 880-byte heap buffer; read_check(fp, note, 880) (:198) fills it from the file.
5. elf_demarshalnotes(note, psinfo, status, fpregset, 1) is called at :200.
6. elf_demarshalnotes (:354) calls elf_getnote(src, &off, "CORE", NT_PRPSINFO, &psinfo, sizeof(prpsinfo_t)=120) (:363) with off=0.
7. elf_getnote (:313):
   - bcopy(src+0, &note, 12) (:325) reads our crafted header: n_namesz=0x10000000, n_descsz=120, n_type=NT_PRPSINFO (3).
   - *off = 12 (:329). Type matches (:330).
   - strncmp("CORE", src+12, 0x10000000) (:335): strncmp stops at the embedded '\0' in "CORE\0" within 5 bytes and returns 0. No OOB here, because the literal "CORE\0" lives inside the buffer.
   - *off += roundup2(0x10000000, 8) = 0x10000000 (:339) -> *off = 0x1000000c. There is no check that *off <= notesz. This is the bug.
   - n_descsz == descsz (120 == 120) at :340 passes.
   - desc=&psinfo is non-NULL, so bcopy(src + 0x1000000c, psinfo, 120) at :346 reads 120 bytes from KVM 256 MB past the 880-byte note buffer.
8. The bcopy (backed by memmove on amd64) page-faults in kernel mode on the unmapped access; the fault handler reaches vm_object_hold_shared on an address with no backing vm_object and panics with the assertion obj != NULL (vm/vm_object.c:330).
Privilege / reachability
- sys_checkpoint is gated by kern.ckptgroup at :728. Default 0 = wheel-only. The PoC runs as root.
- The parsed data is untrusted in all configurations; an admin who sets
kern.ckptgroup=-1 exposes the panic to any local user.
- Realistic impact: local DoS (kernel panic) from any principal in the
configured ckptgroup (default: root/wheel). With the optional
kern.ckptgroup=-1 setting it is an unprivileged local DoS.
Why the leak variant does not escalate to info-leak
We also exercised the slab-adjacent OOB (n_namesz = 880-12-8 = 860,
*off lands at 876, bcopy reads 120 bytes ending at 996, i.e. 116 bytes
past the 880-byte buffer but inside the 1024-byte slab chunk, hence no
page fault). The 120-byte read, 116 bytes of it out of bounds, happens
silently, but elf_loadnotes validates the loaded structures at :292-301:
if (status->pr_version != PRSTATUS_VERSION || // 1
status->pr_statussz != sizeof(prstatus_t) || // 248
...
psinfo->pr_version != PRPSINFO_VERSION || // 1
psinfo->pr_psinfosz != sizeof(prpsinfo_t)) // 120
error = EINVAL;
Random slab content almost never satisfies these version and size checks, so
in practice control does not reach the
strlcpy(p->p_comm, psinfo->pr_fname, ...) at :306. The leak is real but
silent; the dominant observable impact is the panic.
What changed in the PoC
The supplied PoC assumed the wrong struct sizes (PRPSINFO_SZ=128,
PRSTATUS_SZ=504). On the running kernel the real sizes are
prpsinfo_t=120, prstatus_t=248, prfpregset_t=512, so the nthreads
formula yields (notesz-120)/760. The PoC was rewritten to:
- compute notesz = 120 + 248 + 512 = 880 so nthreads == 1,
- build the malicious ELF in-process (no host-side struct dependency),
- invoke sys_checkpoint(CKPT_THAW, fd, -1, 0) directly via
syscall(SYS_checkpoint=467, ...),
- default to the panic variant (n_namesz=0x10000000) and accept an
optional leak argument for the slab-adjacent variant.
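The note payload described in the list above can be laid out as follows. This is a minimal sketch of the buffer layout only, not the contents of df0070.c: build_note and put_le32 are hypothetical helpers, the field order follows the standard Elf_Note layout (n_namesz, n_descsz, n_type), and little-endian storage is assumed because the target is amd64. The ELF header, the PT_NOTE program header and the syscall are deliberately left out.

```c
#include <stdint.h>
#include <string.h>

#define NOTESZ      880u	/* 120 + 248 + 512, so the :185 formula yields 1 */
#define NOTE_HDR_SZ  12u	/* sizeof(Elf_Note), per probe.c */

/* Store a 32-bit value little-endian, as amd64 lays out Elf_Note fields. */
static void
put_le32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v & 0xff);
	p[1] = (unsigned char)((v >> 8) & 0xff);
	p[2] = (unsigned char)((v >> 16) & 0xff);
	p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Fill buf (NOTESZ bytes) with one note header followed by "CORE\0". */
static void
build_note(unsigned char *buf, uint32_t n_namesz, uint32_t n_descsz,
    uint32_t n_type)
{
	memset(buf, 0, NOTESZ);
	put_le32(buf + 0, n_namesz);
	put_le32(buf + 4, n_descsz);
	put_le32(buf + 8, n_type);
	memcpy(buf + NOTE_HDR_SZ, "CORE", 5);	/* includes the NUL */
}
```

Writing the fields byte by byte keeps the layout independent of the host's struct definitions, in line with the goal of having no host-side struct dependency.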
Reproduce
./build.sh # cc -o df0070 df0070.c
./run.sh # ./df0070 evil.ckpt panic -- kernel panics
# optional:
./run.sh leak # ./df0070 evil.ckpt leak -- silent OOB, returns EINVAL
Run as root (default kern.ckptgroup=0).
Fix
See fix.diff. The fix threads the source-buffer size srcsz (the notesz
that elf_getnotes already holds) through elf_demarshalnotes into
elf_getnote, and validates every access before touching memory:
- *off + sizeof(note) <= srcsz before the header bcopy (:325),
- note.n_namesz is sane (<= 32, the longest legitimate ELF note name) and
*off + roundup2(n_namesz, 8) <= srcsz before the strncmp and the
advance at :339,
- *off + roundup2(n_descsz, 8) <= srcsz before the descriptor bcopy
at :346.
This matches the spirit of the finding markdown's proposal but adds an
explicit n_namesz > 32 cap as defence-in-depth: strncmp is bounded by
the null byte so an attacker cannot read past the buffer that way, but a
multi-megabyte n_namesz is never legitimate in an ELF core note and is
rejected outright. The nthreads over-count at :185 is left unchanged:
once elf_getnote rejects any access that would run past notesz, the surplus
iterations in the elf_demarshalnotes loop bail cleanly with EINVAL and the
over-sized status[]/fpregset[] arrays are simply freed unused, with no
security consequence.
Confirmed kernel references
- sys/kern/kern_checkpoint.c:185
- sys/kern/kern_checkpoint.c:194
- sys/kern/kern_checkpoint.c:240
- sys/kern/kern_checkpoint.c:306
- sys/kern/kern_checkpoint.c:325
- sys/kern/kern_checkpoint.c:335
- sys/kern/kern_checkpoint.c:339
- sys/kern/kern_checkpoint.c:340
- sys/kern/kern_checkpoint.c:346
- sys/kern/kern_checkpoint.c:728
- sys/vm/vm_object.c:330
Detail
Exploit chain
Pure out-of-bounds read primitive (no memory corruption); not weaponizable beyond local DoS in this configuration. The bcopy at :346 reads sizeof(prpsinfo_t)=120 bytes from a KVM address at a fully attacker-controlled offset from the note buffer (offset = sizeof(Elf_Note)+roundup2(n_namesz,8)). For panic/DoS the attacker picks a huge n_namesz so the read lands on an unmapped page. For an info leak the attacker picks a slab-adjacent offset (n_namesz ~ notesz-20), but elf_loadnotes' strict pr_version/pr_statussz/pr_psinfosz validation at :292-301 almost always rejects the leaked bytes before they reach p_comm at :306, so the leak is real-but-silent. No write primitive, no function-pointer/ucred corruption surface -- the read destination is a freshly kmalloc'd prpsinfo_t that is unconditionally kfree'd on the EINVAL path, so the bucket cannot be re-used to convert into a write. Realistic ceiling: local DoS (panic) from any principal in kern.ckptgroup; with ckptgroup=-1, unprivileged local DoS.
Evidence (decisive lines)
findings/poc/DF-0070/panic.txt holds the boot.log excerpt: 'panic: assertion "obj != NULL" failed in vm_object_hold_shared ... --- trap 000000000000000c, rip = ffffffff80bca038 --- memmove() at memmove+0x28 0xffffffff80bca038'. run.log and run.2.log are the two decisive runs (post- and pre-reset) with identical code offsets (+0x3f, +0x408, +0x9a, +0x17c, +0x9, +0x28) and only KASLR frame addresses differing. run.leak.log shows the leak variant returning errno=22 (EINVAL) without panic. build.log/probe.c verify the struct sizes (prpsinfo_t=120, prstatus_t=248, prfpregset_t=512) that yield notesz=880 and nthreads=1. fix.diff is git-apply-able (git apply --check passes).
PoC changes
Rewrote findings/poc/DF-0070/df0070.c: original assumed wrong struct sizes (PRPSINFO_SZ=128, PRSTATUS_SZ=504); replaced with verified amd64 sizes (120/248/512) so notesz=880 yields nthreads=1. Built the malicious ELF in-process (no host-side struct dependency), invoked sys_checkpoint(CKPT_THAW,fd,-1,0) directly via syscall(467,...), and added an optional 'leak' mode for the slab-adjacent OOB variant. Added probe.c to print the kernel-side struct sizes, VERDICT.md, build.sh/run.sh, full untrimmed build/run/panic/leak/env logs, fix.diff (thread srcsz through elf_demarshalnotes -> elf_getnote and bounds-check every access), and manifest.json. README.md updated to reflect the verified panic + leak behavior.
Verified recommended fix
In sys/kern/kern_checkpoint.c, thread notesz as srcsz from elf_getnotes(:200) through elf_demarshalnotes into elf_getnote, and add the four bounds checks, grouped here by the access they guard (already in findings/poc/DF-0070/fix.diff, git-apply-able): (1) *off + sizeof(note) <= srcsz before the header bcopy at :325, (2) note.n_namesz <= 32 AND *off + roundup2(n_namesz,8) <= srcsz before strncmp/:339, (3) *off + roundup2(n_descsz,8) <= srcsz before the descriptor bcopy at :346. Supersedes the finding proposal -- same spirit (validate before each access) but adds the explicit n_namesz<=32 cap as defence-in-depth and avoids the over-broad ABI change of an end-pointer; the nthreads over-count at :185 is left as-is since the bounds checks make the surplus loop iterations bail cleanly with EINVAL.
Verdict
REPRODUCED. elf_getnote (sys/kern/kern_checkpoint.c:313-352) advances off via roundup2(n_namesz, sizeof(Elf_Size)) at :339 with no check that off stays within notesz, so a crafted PT_NOTE with n_namesz=0x10000000 (and n_descsz=120 matching sizeof(prpsinfo_t) to pass :340) drives the bcopy at :346 to read 120 bytes from src+0x1000000c -- 256 MB past the kmalloc(880) note buffer (notesz flows unvalidated from phdr[0].p_filesz at :240, and nthreads=(880-120)/760=1 passes the [1,256] gate at :188). Reproduced twice (pre- and post-vm.sh reset) with byte-identical code offsets in the panic stack: Fatal trap 0xc page fault in memmove+0x28 (the bcopy backing) -> vm_object_hold_shared assertion 'obj != NULL' panic at vm/vm_object.c:330. The leak variant (slab-adjacent 116-byte OOB) is silent: elf_loadnotes rejects the leaked garbage at :292-301 with EINVAL, so control never reaches strlcpy(p->p_comm,...) at :306. Default kern.ckptgroup=0 makes it wheel-only; setting -1 exposes the panic to any local user.