UAF: wg_peer_destroy frees peer struct while data-plane paths access it without peer refcount
Summary
wg_peer has NO refcount. wg_peer_destroy(:692) kfree(peer) under sc_lock. Data-plane: wg_output(:2334)/wg_input(:2187)/wg_decrypt(:1887)/wg_encrypt(:1850) get peer via wg_aip_lookup/noise_remote_arg WITHOUT sc_lock. noise_remote refcount protects remote not peer. r->r_arg dangles after kfree. Lockmgr on embedded queue mutexes in freed struct. Root triggers destroy via SIOCSWG; any local user triggers wg_output side; authenticated remote peer triggers input/decrypt side. Heap UAF -> corruption/panic.
PoC verification
Evidence pack
findings/poc/DF-0315 · 10 files
| File | Type | Description | Size |
|---|---|---|---|
| wgrace.c | trigger-source | wg_peer UAF racer: wg_output (UDP senders) vs wg_peer_destroy (root SIOCSWG destroy/re-add loop), with NOEP/NOPRIV/NORACE/NOGROOM/NSEND/NGROOM/SDELAY env knobs and an exec-churn 1024-bucket groomer | 10.6 KB |
| build.sh | build-script | cc -O2 -pthread -o wgrace wgrace.c | 269 B |
| run.sh | run-script | decisive run: env NOEP=1 NSEND=4 NGROOM=6 ./wgrace 55 (run as root for SIOCSWG) | 1.5 KB |
| build.log | build-log | final successful build, full output (clean, BUILD_EXIT=0) | 65 B |
| run.log | run-log | decisive run: 22708 destroy/readd iters, 16855911 sends, no panic, guest up | 218 B |
| env.txt | environment | uname, cc, kldstat(if_wg.ko loaded), hw.ncpu=2, debug.trace_on_panic=1 | 818 B |
| endpoint_crash_separate_bug.txt | panic-signature | BONUS: a DIFFERENT wg panic found during testing (udp_send+0x2cf, current process Idle), fires with NO peer destruction -> NOT DF-0315; should be filed separately | 2.5 KB |
| fix.diff | suggested-fix | git apply-able refcount fix: add wg_peer p_refcnt + wg_peer_ref/wg_peer_put, ref in wg_aip_lookup/wg_encrypt/wg_decrypt/wg_input, put in wg_output + sites, wg_peer_destroy detaches then drops table ref; verified git apply --check clean; supersedes finding proposal (finding had none) | 6.7 KB |
| VERDICT.md | verdict | full narrative: code-confirmed UAF (line-by-line), reachability, why it did not panic dynamically on GENERIC (narrow window + slow destroy + stale-but-valid slab), what would make it fire | 8.0 KB |
| README.md | readme | build/run/expected + env-knob reference | 2.9 KB |
DF-0315: wg_peer use-after-free PoC
Finding: UAF: wg_peer_destroy frees the peer struct while data-plane paths
(wg_output/wg_input/wg_encrypt/wg_decrypt) dereference it without a peer
refcount. sys/net/wg/if_wg.c, severity High, CWE-416.
Files
| file | what |
|---|---|
| wgrace.c | the racer: sets up wg0+peer, races wg_output (UDP senders) vs wg_peer_destroy (root SIOCSWG destroy/re-add loop), with optional heap groomers |
| build.sh / run.sh | exact build & decisive-run commands |
| build.log | full compiler output (clean) |
| run.log | decisive run (NOEP + destroy race + 1024-bucket exec churn): no panic, guest up |
| env.txt | uname, cc, kldstat(if_wg), ncpu, slab/debug sysctls |
| endpoint_crash_separate_bug.txt | a DIFFERENT wg panic found during testing (NOT DF-0315) |
| fix.diff | git apply-able refcount fix (verified --check clean) |
| VERDICT.md | full narrative + line-by-line code proof + negative-result reasoning |
| manifest.json | machine-readable catalog |
Build (as maxx)
cd poc/DF-0315 && cc -O2 -pthread -o wgrace wgrace.c
Run
The racer must run as root for the SIOCSWG destroy side
(SYSCAP_RESTRICTEDROOT, if_wg.c:2671). The wg_output (data-plane) side is
what an unprivileged local user drives; the destroy side is what root does.
# DECISIVE DF-0315 test (isolated from the separate wg-send crash via NOEP):
ssh dfbsd
cd ~maxx/poc/DF-0315 && env NOEP=1 NSEND=4 NGROOM=6 ./wgrace 55
# Optional: the SEPARATE wg-send crash (NOT DF-0315), for reference:
env NOPRIV=1 NORACE=1 NSEND=1 NGROOM=0 SDELAY=2000 ./wgrace 20
Expected
- DF-0315 (decisive, NOEP): on a vulnerable kernel that lets the UAF window be hit, a page-fault panic in the wg_output/wg_peer_send_staged path shortly after start (signature in dfbsd-qemu/boot.log). On this master DEV X86_64_GENERIC kernel it does NOT panic: the run exits cleanly after ~16.8M sends / ~22k destroy-readd cycles and the guest stays up. See VERDICT.md for why (narrow window + slow destroy + GENERIC slab leaves the freed peer's fields stale-but-valid).
- Separate crash (NOPRIV + endpoint): panics in udp_send+0x2cf within ~20 s; this is a different bug (no peer destruction involved), documented in endpoint_crash_separate_bug.txt.
Environment knobs (env vars)
- NOEP=1: peer has no endpoint (isolates the wg_output UAF from the separate wg-send crash).
- NOPRIV=1: omit interface private key (handshake can't be created).
- NORACE=1: disable destroy/re-add (control).
- NOGROOM=1: disable groomers.
- NSEND=n / NGROOM=n: thread counts.
- SDELAY=us: inter-send delay.
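A knob scheme like this is typically read once at startup. The helper below is a minimal sketch of one way to do that; it is not taken from wgrace.c, the name env_long is hypothetical, and the default value in the usage note is chosen only for the example.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from wgrace.c): read an integer knob from the
 * environment, falling back to dflt when the variable is unset, empty,
 * or not a clean decimal number.
 */
static long
env_long(const char *name, long dflt)
{
    const char *s;
    char *end;
    long v;

    assert(name != NULL);
    s = getenv(name);
    if (s == NULL || *s == '\0')
        return dflt;
    v = strtol(s, &end, 10);
    return (*end == '\0') ? v : dflt;
}
```

For instance, a call such as env_long("NSEND", 1) would yield 4 under env NSEND=4 and fall back to the assumed default of 1 when the variable is absent.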
DF-0315: wg_peer use-after-free in sys/net/wg/if_wg.c
Verdict: NOT REPRODUCED dynamically on this kernel, but a real, reachable, code-confirmed UAF (not a false positive)
The use-after-free described in DF-0315 is genuinely present in the code and
the if_wg module is loaded and reachable on the audited master DEV guest.
However, after a thorough, multi-config racing campaign I was unable to turn
the destroy-race UAF into an observable kernel panic on this non-INVARIANTS
X86_64_GENERIC kernel: the wg_output post-lookup window is too narrow and
wg_peer_destroy() (which blocks on taskqueue_drain) too slow for the
kfree(peer) to land inside it, and freeing a large object on the slab leaves
the freed slot's relevant fields stale-but-valid (the slab free-list only
clobbers the first word), so even a hit reads valid-looking data and does not
fault. This file is the full reasoning with path:line evidence.
1. The bug, confirmed line by line (read-only sys/)
wg_peer has no refcount of its own. The data plane obtains a struct
wg_peer * and uses it without holding sc->sc_lock, while the control
plane frees the peer under sc->sc_lock.
- Allocation / no peer refcount: sys/net/wg/if_wg.c:226-266 (struct wg_peer); created at wg_peer_create:589 kmalloc(sizeof(*peer), M_WG, M_WAITOK|M_ZERO). Nothing counts references to the peer struct.
- The data-plane entry: wg_output:2334 peer = wg_aip_lookup(sc, af, ...). wg_aip_lookup (:854) takes sc->sc_aip_lock SHARED only to do the radix lookup and then calls noise_remote_ref(peer->p_remote) at :880, i.e. it pins peer->p_remote (the noise_remote), not peer, and releases sc_aip_lock at :884 before returning. After the return, wg_output has a raw peer pointer with no lock and no peer refcount.
- The unprotected window: back in wg_output, peer->p_endpoint... is read at :2342-2345; wg_queue_push_staged(&peer->p_stage_queue, pkt) at :2350; wg_peer_send_staged(peer) at :2351 (which itself dereferences peer->p_sc, peer->p_stage_queue, peer->p_remote, peer->p_encrypt_serial at :2230-2269); and finally noise_remote_put(peer->p_remote) at :2352 (a UAF read of the freed peer->p_remote field). None of this holds sc->sc_lock. The same pattern recurs at the other cited sites: wg_encrypt:1849-1874, wg_decrypt:1886-1940, the wg_input data path :2187-2194, and the task handlers wg_deliver_out:1991-2034 / wg_deliver_in:2037+ (which receive the peer as a raw task argument).
- The free: wg_peer_destroy:640 (KKASSERT sc_lock EXCLUSIVE at :644) does its teardown and then kfree(peer, M_WG) at :692. It is reached from the SIOCSWG ioctl (wg_ioctl_set holds sc_lock EXCLUSIVE at :2522, calls wg_peer_destroy at :2539 and :2602; SIOCSWG requires SYSCAP_RESTRICTEDROOT at :2671). The noise_remote refcount held by the data plane keeps the remote alive but does not keep peer alive: kfree(peer) happens unconditionally at :692 regardless of outstanding remote refs.
So: peer can be kfree()d while wg_output/wg_encrypt/wg_decrypt/the
deliver tasks are still dereferencing it. That is a textbook heap UAF.
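The lifetime mismatch can be modelled in userspace. The sketch below is not the kernel code: every model_* name is hypothetical, the destroy path dropping its own remote reference is an assumption, and a freed flag stands in for kfree() so the example never touches freed memory. It only shows that the reference taken during lookup pins the remote while nothing counts the peer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the if_wg.c definitions. */
struct model_remote {
    unsigned refcnt;    /* counted, like the noise_remote */
    bool     freed;
};

struct model_peer {
    struct model_remote *remote;
    bool freed;         /* stands in for kfree(peer); no refcount exists */
};

/* Lookup pins the remote only, mirroring the pattern described above. */
static struct model_peer *
model_lookup(struct model_peer *peer)
{
    peer->remote->refcnt++;
    return peer;        /* raw pointer: nothing pins the peer itself */
}

static void
model_remote_put(struct model_remote *r)
{
    assert(r->refcnt > 0);
    if (--r->refcnt == 0)
        r->freed = true;
}

/* Control plane (assumed shape): drop own remote ref, free peer regardless. */
static void
model_peer_destroy(struct model_peer *peer)
{
    model_remote_put(peer->remote);
    peer->freed = true;
}

/*
 * Runs lookup -> destroy -> "use". Returns true when the data plane would
 * have dereferenced a freed peer while the remote was still pinned.
 */
static bool
model_race(void)
{
    struct model_remote remote = { .refcnt = 1, .freed = false };
    struct model_peer peer = { .remote = &remote, .freed = false };
    struct model_peer *p;
    bool stale_use;

    p = model_lookup(&peer);        /* remote refcnt: 1 -> 2 */
    model_peer_destroy(&peer);      /* remote refcnt: 2 -> 1, peer "freed" */

    stale_use = p->freed && !remote.freed;
    model_remote_put(&remote);      /* data-plane put: remote refcnt -> 0 */
    return stale_use;
}
```

In this model model_race() returns true: the remote outlives the destroy call, the peer does not, and the data plane still holds the peer pointer.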
2. Reachability (confirmed)
kldstat: if_wg.ko (/boot/kernel/if_wg.ko) -- already loaded on this guest
uname: DragonFly dfbsd 6.5-DEVELOPMENT ... X86_64_GENERIC x86_64
ifconfig wg0 create works, peers/allowed-IPs/endpoints are configurable via
SIOCSWG, and bringing wg0 up installs the connected route so a local user's
sendto() into the wg subnet reaches wg_output. The cited paths are live
code, not behind a koption.
3. Why it did not manifest as a panic (the negative result)
I raced wg_output (senders flooding UDP into wg0) against wg_peer_destroy
(root SIOCSWG REPLACE_PEERS destroy/re-add loop), with correct heap
grooming: kern_exec.c:602 allocates proc args via kmalloc(sizeof(struct
pargs)+i, M_PARGS); a ~700-byte argv lands in the same kmalloc-1024 zone
as wg_peer (~700 B), so freed peer slots are reclaimed with foreign bytes.
| Run | config | iters / sends | result |
|---|---|---|---|
| baseline (no destroy) | NOEP, 4 senders, 4 socket-groomers | 6.09M sends | stable (guest up) |
| destroy race | NOEP, 4 senders, 6 exec-groomers (1024-bkt) | 22.7k destroy/readd, 16.8M sends | stable (guest up) |
The destroy race ran 16.8 million wg_output calls with the correct bucket
grooming and never faulted. Root causes of the non-fault:
- Window is too narrow. In the only stable config (peer with no endpoint), wg_output returns EHOSTUNREACH at :2346 immediately after the single peer->p_endpoint read at :2342. The window between wg_aip_lookup (:2334) and the last peer deref is a handful of instructions.
- wg_peer_destroy is too slow to fit in that window. Destroy blocks on taskqueue_drain (:669-670) and runs five callout_terminate calls before reaching kfree (:692); the kfree essentially never lands inside the microsecond-scale wg_output window.
- Slab behaviour on GENERIC. Even when a hit occurs, kfree of a ~700-byte object only overwrites the first pointer of the slab slot with the free-list link; the fields wg_output reads (p_endpoint at ~offset 140, p_stage_queue at ~offset 200+) stay stale-but-valid, so the dereference succeeds. A panic needs the freed slot to be reclaimed with foreign content AND the reclaim to land inside the tiny window: possible in principle, not achieved in 16.8M sends.
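The stale-but-valid effect in the last bullet can be illustrated with a toy free list. This is a userspace model under stated assumptions, not the DragonFly slab allocator: the slot size, the single static slot, and the toy_free name are made up for the example, and offset 140 is the approximate p_endpoint offset quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy single-slot free list; a model only, not the kernel allocator. */
#define SLOT_SIZE 1024
#define FIELD_OFF 140       /* approximate p_endpoint offset quoted above */

union slot {
    union slot   *next_free;          /* free-list link lives in the first word */
    unsigned char bytes[SLOT_SIZE];
};

static union slot  slot0;
static union slot *free_head;

static void
toy_free(union slot *s)
{
    assert(s != NULL);
    s->next_free = free_head;         /* only the first word is overwritten */
    free_head = s;
}

/* Returns 1 if a field at FIELD_OFF still holds its old value after free. */
static int
stale_field_survives(void)
{
    union slot *s = &slot0;

    memset(s->bytes, 0, SLOT_SIZE);
    s->bytes[FIELD_OFF] = 0x5a;       /* pretend this is live peer state */
    toy_free(s);
    return s->bytes[FIELD_OFF] == 0x5a;
}
```

In the model the field survives the free untouched, which is why a reader that loses the race still sees plausible data unless the slot has also been reclaimed and rewritten.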
Widening the window by giving the peer an endpoint (so wg_output runs the full
wg_peer_send_staged path) was blocked by a separate bug: under even
trivial UDP load into a wg interface whose peer has an endpoint, the kernel
panics in wg_deliver_out -> wg_send -> udp_send (current process = Idle)
within ~15-20 s, with no peer destruction at all, i.e. not the DF-0315
UAF. See endpoint_crash_separate_bug.txt. That crash is reproducible (4/4)
but is a distinct defect and should be filed separately.
4. What would make it fire
The UAF would reliably panic on:
* an INVARIANTS/DEBUG kernel that poisons freed slab memory (the stale
reads would then hit poison bytes and fault), or
* a build with Witness/use-after-free detection, or
* racing the taskqueue data plane (wg_encrypt/wg_decrypt/the deliver
tasks, whose peer pointer is held for the whole task and is not drained for
the parallel encrypt/decrypt tasks) against destroy, but that needs a
completed handshake (keypair), which needs a real responding peer.
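The first bullet relies on free-time poisoning. A minimal userspace sketch of that idea follows; the fill byte 0xde, the 704-byte object size, and all function names are assumptions for illustration and are not taken from any kernel configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OBJ_SIZE 704
#define PTR_OFF  140
#define POISON   0xde       /* hypothetical fill byte, chosen for the example */

static unsigned char obj[OBJ_SIZE];

/* Toy "debug free": overwrite the whole object with the poison byte. */
static void
toy_free_poisoned(void *p, size_t n)
{
    assert(p != NULL);
    memset(p, POISON, n);
}

/* Returns the pointer-sized value a stale reader would load after free. */
static uintptr_t
stale_pointer_after_poison(void)
{
    static int target;
    uintptr_t live = (uintptr_t)&target;
    uintptr_t stale;

    memcpy(obj + PTR_OFF, &live, sizeof(live));     /* live pointer field */
    toy_free_poisoned(obj, sizeof(obj));
    memcpy(&stale, obj + PTR_OFF, sizeof(stale));   /* read after free */
    return stale;
}

/* Returns 1 if every byte of v equals POISON. */
static int
pointer_is_poison(uintptr_t v)
{
    unsigned char pattern[sizeof(v)];
    uintptr_t expect;

    memset(pattern, POISON, sizeof(pattern));
    memcpy(&expect, pattern, sizeof(expect));
    return v == expect;
}
```

With poisoning, the stale load returns a value made entirely of fill bytes rather than the old pointer, so a later dereference would be expected to fault instead of silently succeeding.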
5. Suggested fix (fix.diff, verified git apply --check clean)
Add a refcount to wg_peer, take it in the data-plane entry points
(wg_aip_lookup, wg_encrypt, wg_decrypt, the wg_input data path) and drop
it on exit, and have wg_peer_destroy() detach the peer (so no new refs are
taken) and then drop the table's reference via wg_peer_put(); the peer is
freed only when the last in-flight data-plane reference is released. This
mirrors the peer-refcounting used by FreeBSD's if_wg. The remaining
refinement (task-lifetime refs for wg_deliver_out/wg_deliver_in, which
receive the peer as a raw task argument) is noted in the diff comments; the
diff as shipped closes the primary local-attack vector (wg_output /
wg_aip_lookup) plus the inline encrypt/decrypt/input sites.
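The shape of that fix can be sketched in userspace with C11 atomics. This is not fix.diff: only the p_refcnt field name comes from the description above, while the function names, the detached flag, the peers_freed counter, and the memory orderings are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace sketch; not the contents of fix.diff. */
struct peer {
    atomic_uint p_refcnt;
    bool        detached;   /* set by destroy so lookups stop handing it out */
};

static int peers_freed;     /* demo hook: counts frees */

static struct peer *
peer_create(void)
{
    struct peer *p = calloc(1, sizeof(*p));

    if (p == NULL)
        return NULL;
    atomic_init(&p->p_refcnt, 1);       /* the table's reference */
    return p;
}

static void
peer_put(struct peer *p)
{
    /* acq_rel so the final put observes earlier users' writes */
    if (atomic_fetch_sub_explicit(&p->p_refcnt, 1,
        memory_order_acq_rel) == 1) {
        free(p);
        peers_freed++;
    }
}

/* Lookup side: in the kernel this would run under the table lock. */
static struct peer *
peer_lookup(struct peer *p)
{
    if (p->detached)
        return NULL;
    atomic_fetch_add_explicit(&p->p_refcnt, 1, memory_order_relaxed);
    return p;
}

/* Destroy side: detach first, then drop the table's reference. */
static void
peer_destroy(struct peer *p)
{
    p->detached = true;
    peer_put(p);
}

/* Returns 1 if the peer survives destroy and is freed by the last put. */
static int
refcount_demo(void)
{
    int base = peers_freed;
    struct peer *p = peer_create();
    struct peer *user;
    int alive_after_destroy;

    if (p == NULL)
        return -1;
    user = peer_lookup(p);              /* data plane takes a ref: 2 */
    assert(user != NULL);
    peer_destroy(p);                    /* table ref dropped: 1, still alive */
    alive_after_destroy = (peers_freed == base);
    peer_put(user);                     /* last ref: freed here */
    return alive_after_destroy && peers_freed == base + 1;
}
```

The point of the ordering in peer_destroy is that detaching before the put prevents new references from being taken once teardown has started, so the count can only fall from that moment on.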
6. Reproduce
ssh dfbsd-maxx # unprivileged build
cd poc/DF-0315 && cc -O2 -pthread -o wgrace wgrace.c
ssh dfbsd # root run (SIOCSWG needs SYSCAP_RESTRICTEDROOT)
cd ~maxx/poc/DF-0315 && env NOEP=1 NSEND=4 NGROOM=6 ./wgrace 55
On this kernel it exits cleanly (no panic, guest up). To see the separate
wg-send crash instead: env NOPRIV=1 NORACE=1 NSEND=1 NGROOM=0 SDELAY=2000
./wgrace 20, which panics in udp_send+0x2cf within ~20 s (NOT DF-0315).
Confirmed kernel references
- sys/net/wg/if_wg.c:226
- sys/net/wg/if_wg.c:589
- sys/net/wg/if_wg.c:640
- sys/net/wg/if_wg.c:644
- sys/net/wg/if_wg.c:669
- sys/net/wg/if_wg.c:692
- sys/net/wg/if_wg.c:854
- sys/net/wg/if_wg.c:880
- sys/net/wg/if_wg.c:1838
- sys/net/wg/if_wg.c:1849
- sys/net/wg/if_wg.c:1877
- sys/net/wg/if_wg.c:1886
- sys/net/wg/if_wg.c:2187
- sys/net/wg/if_wg.c:2230
- sys/net/wg/if_wg.c:2272
- sys/net/wg/if_wg.c:2334
- sys/net/wg/if_wg.c:2342
- sys/net/wg/if_wg.c:2350
- sys/net/wg/if_wg.c:2351
- sys/net/wg/if_wg.c:2352
- sys/net/wg/if_wg.c:2510
- sys/net/wg/if_wg.c:2522
- sys/net/wg/if_wg.c:2539
- sys/net/wg/if_wg.c:2602
- sys/net/wg/if_wg.c:2671
Detail
Exploit chain
Memory-corruption class (heap UAF on the ~700-byte wg_peer object in the kmalloc-1024 zone): if the destroy<->wg_output race window were widened/hit, an attacker frees the peer while the data plane dereferences it; with slab grooming the freed slot is reclaimed by a foreign object (e.g. proc-args from kern_exec.c:602) and wg_output's lockmgr on peer->p_stage_queue.q_mtx or the peer->p_remote read would then fault/corrupt -> potential local DoS/panic, and with precise grooming a corruption primitive toward uid0. I did NOT develop the chain because I could not trigger the window on this kernel: the wg_output window is too narrow and widening it (peer-with-endpoint -> full wg_peer_send_staged) triggers the separate wg-send crash instead. On an INVARIANTS kernel the UAF would reliably panic (poisoned freed memory).
Evidence (decisive lines)
run.log shows the decisive run: 'destroy/readd iterations: 22708, sends: 16855911 ... RUN_EXIT=0' with guest 'up' (no panic). VERDICT.md contains the full line-by-line code proof (peer freed at if_wg.c:692 under sc_lock vs accessed without sc_lock at :2334-2352/:1849-1874/:1886-1940/:2187-2194/:1991-2034) and the negative-result reasoning (narrow window + slow destroy + stale-but-valid slab). build.log: clean BUILD_EXIT=0. fix.diff: git apply --check passes. endpoint_crash_separate_bug.txt: the separate reproduced panic (udp_send+0x2cf, 4/4 runs) that is NOT DF-0315.
PoC changes
Authored the entire evidence pack from scratch (no prior PoC folder). wgrace.c: a racer that configures wg0+peer via direct SIOCSWG ioctls, drives the data plane (UDP senders into the wg subnet -> wg_output -> wg_aip_lookup) and races it against a root SIOCSWG REPLACE_PEERS destroy/re-add loop, with env knobs (NOEP/NOPRIV/NORACE/NOGROOM/NSEND/NGROOM/SDELAY) and an exec-churn groomer (kern_exec.c:602 ~700-byte pargs -> kmalloc-1024, matching wg_peer's bucket). NOEP mode isolates the wg_output UAF from a separate wg-send crash. Also authored fix.diff (wg_peer refcount: p_refcnt + wg_peer_ref/wg_peer_put, ref in wg_aip_lookup/wg_encrypt/wg_decrypt/wg_input, put in wg_output + sites, wg_peer_destroy detaches then drops the table ref) verified git apply --check clean.
Verified recommended fix
fix.diff adds a refcount (u_int p_refcnt) to struct wg_peer (if_wg.c:262), initializes it in wg_peer_create, adds wg_peer_ref()/wg_peer_put() (put frees on last release), has wg_aip_lookup take a peer ref alongside noise_remote_ref, has wg_output/wg_encrypt/wg_decrypt/wg_input drop it on exit, and reworks wg_peer_destroy to detach (disable remote/timers, remove aips + sc_peers entry) then drop the table ref via wg_peer_put so the peer is freed only when all in-flight data-plane refs are gone (mirrors FreeBSD if_wg peer-refcounting). The finding markdown had no concrete fix proposal, so this supersedes. Full git-apply-able diff in findings/poc/DF-0315/fix.diff.
Verdict
The DF-0315 use-after-free is REAL at the code level (certain) and REACHABLE (if_wg.ko is loaded; wg0/peer/route configurable; wg_output reachable by any local user; wg_peer_destroy reachable by root via SIOCSWG). It is NOT a false positive: wg_peer has no refcount, wg_output:2334 obtains a peer via wg_aip_lookup (which only noise_remote_ref's the remote at :880, not the peer) and dereferences peer at :2342-2352 without sc_lock, while wg_peer_destroy:692 kfree(peer) under sc_lock -- so the peer can be freed mid-deref. However I could NOT turn the destroy-race into an observable panic on this non-INVARIANTS X86_64_GENERIC kernel: the decisive run (NOEP config isolating the UAF from a separate wg-send crash, plus correct kmalloc-1024 exec-churn grooming matching wg_peer's bucket) completed 22708 destroy/re-add cycles and 16.8M wg_output calls with NO crash and the guest stayed up. The wg_output post-lookup window is too narrow and wg_peer_destroy (blocking taskqueue_drain at :669-670) too slow to land kfree inside it, and the GENERIC slab leaves the freed peer's fields stale-but-valid (only the first word is clobbered), so even a hit reads valid-looking data. Separately I found a DIFFERENT, trivially-triggered wg panic (udp_send+0x2cf, current process Idle, fires with NO peer destruction) -- documented in endpoint_crash_separate_bug.txt as a distinct defect, NOT DF-0315.