DragonFlyBSD Kernel Audit
โ† dashboard
DF-0590

No serialization of bridge state in legacy netgraph/ng_bridge -- UAF in rehash, OOB heap write in GET_TABLE, deterministic KASSERT panics under concurrent traffic

Field Value
ID DF-0590
Status new
Severity High
CVSS 3.1 CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H
CWE CWE-362 Concurrent Execution using Shared Resource with Improper Synchronization; CWE-416 Use After Free; CWE-787 Out-of-bounds Write
File sys/netgraph/bridge/ng_bridge.c
Lines 297-336 (constructor), 480-491 (GET_TABLE), 787 (get), 825-826 (put), 841-890 (rehash), 940-1002 (timeout), 663-708 (fan-out)
Area netgraph (legacy Ethernet bridge node)
Confidence certain
Discovered 2026-07-02
Reported pending

Summary

The legacy sys/netgraph/bridge/ng_bridge.c node never installs any synchronization primitive, and the legacy netgraph framework has no NG_NODE_FORCE_WRITER facility (the macro does not exist in sys/netgraph/netgraph.h; verified by grep). NG_SEND_DATA (netgraph.h:224) calls ng_send_data (ng_base.c:1677-1698), which invokes (*rcvdata)(hook->peer, m, meta) directly on the sender's CPU with no queue, no MP lock, and no serialization. Two hooks fed by two CPUs execute ng_bridge_rcvdata concurrently, with the 1 Hz callout (ng_bridge_timeout) as a third racing context. All shared mutable state (priv->tab, priv->links[], numHosts, numBuckets, numLinks, the 64-bit stat counters) is mutated without protection, yielding concrete, exploitable consequences: use-after-free in hash-table rehash, an out-of-bounds heap write in the NGM_BRIDGE_GET_TABLE response path, deterministic KASSERT panics on INVARIANTS kernels, and OOB/UAF in the broadcast fan-out. The netgraph7 version (sys/netgraph7/bridge/ng_bridge.c:329) avoids this entire class by calling NG_NODE_FORCE_WRITER(node) in its constructor; the legacy bridge was never updated.

Root cause

ng_bridge_constructor (sys/netgraph/bridge/ng_bridge.c:297-336) allocates priv and the hash table but installs no synchronization primitive. Compare the netgraph7 version which explicitly calls NG_NODE_FORCE_WRITER(node) at sys/netgraph7/bridge/ng_bridge.c:329 (with the comment at :321-328 stating this is mandatory until real locking exists). The legacy framework has no equivalent: sys/netgraph/netgraph.h contains no NG_NODE_FORCE_WRITER / NGF_FORCE_WRITER definition (grep across sys/netgraph/: zero matches), and ng_send_data (sys/netgraph/netgraph/ng_base.c:1677-1698) calls (*rcvdata)(hook->peer, m, meta) inline on the caller's thread without taking even the MP lock.

All shared mutable fields are therefore racy:

  • priv->tab (struct field at line 93)
  • priv->links[] (94)
  • numHosts (97), numBuckets (98), hashMask (99), numLinks (100)
  • the u_int64_t stat counters (non-atomic → torn reads)

Concrete races:

  • (a) UAF in ng_bridge_rehash (841-890). Rehash does kfree(priv->tab) at line 885 then priv->tab = newTab at 888. A concurrent ng_bridge_get (SLIST_FOREACH at :787), the GET_TABLE loop (:488-491), the timeout walk (:953-977), or the broadcast fan-out (:663-708) dereferences the just-freed old tab pointer → UAF read of freed kmem.
  • (b) Lost-update on SLIST_INSERT_HEAD. Two concurrent ng_bridge_put on the same bucket do SLIST_INSERT_HEAD (:825) without atomicity → lost updates / self-referential list → panic or UAF on next traversal.
  • (c) OOB heap write in NGM_BRIDGE_GET_TABLE (480-491). The response buffer is sized from priv->numHosts at line 481, then ary->numHosts = priv->numHosts is re-read at :487, then the live table is walked writing ary->hosts[i++] = hent->host at :490. A concurrent ng_bridge_put (which does numHosts++ at :826 and SLIST_INSERT_HEAD at :825) grows the table during the copy → i exceeds the allocated count → heap buffer overflow write past the ng_mesg response buffer.
  • (d) Deterministic KASSERT panics on INVARIANTS kernels. The KASSERTs at lines 529, 532, 644, 747, 754, 978, and 1001 trip under any concurrent mutation. Line 978 (priv->numHosts == counter) trips whenever a single rcvdata does ng_bridge_put (:826) during the 1 Hz timeout walk, so a single hook receiving more than one fresh-source-MAC packet per second panics the kernel within one second.
  • (e) OOB/UAF in broadcast fan-out (663-708). The loop guard i < priv->numLinks - 1 at :663 re-reads numLinks each iteration; a concurrent ng_bridge_newhook (numLinks++ at :368) can extend the bound past actual links[] occupancy or shift indices such that the loop dereferences a link being torn down by a concurrent ng_bridge_disconnect (links[linkNum]=NULL at :756) → UAF of the link struct / OOB.
  • (f) Torn 64-bit stat counters. The u_int64_t stats fields (link->stats.xmitOctets at :654/:694) are read/written non-atomically → torn 64-bit values in GET_STATS responses (minor info-leak / accounting corruption).
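
Race (c) reduces to sizing a buffer from one read of numHosts and then copying from a list that is still growing. The following is a minimal userspace sketch of that interleaving, replayed on a single thread so the ordering is deterministic. All names (struct table, table_put, copy_hosts, replay_get_table_race) are hypothetical stand-ins rather than kernel code, and the bounded copy is shown only as a defensive illustration; it does not replace the locking fix.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the bridge's host list; not the kernel types. */
struct host_ent {
    int              id;
    struct host_ent *next;
};

struct table {
    struct host_ent *head;
    int              num_hosts;   /* plays the role of priv->numHosts */
};

/* Insert at the head and bump the count, as ng_bridge_put does. */
static void table_put(struct table *t, struct host_ent *e)
{
    e->next = t->head;
    t->head = e;
    t->num_hosts++;
}

/*
 * Walk the live list into out[], which has room for cap entries.
 * Returns how many slots the walk wanted to write. With bounded == 0 the
 * return value can exceed cap, which is the overflow in the kernel loop.
 */
static int copy_hosts(const struct table *t, int *out, int cap, int bounded)
{
    int wanted = 0;

    for (const struct host_ent *e = t->head; e != NULL; e = e->next) {
        if (wanted >= cap && bounded)
            break;                   /* defensive stop at capacity */
        if (wanted < cap)
            out[wanted] = e->id;     /* in-bounds write */
        /* else: an unbounded loop would write past the buffer here */
        wanted++;
    }
    return wanted;
}

/* Replays: size from num_hosts, then a put lands, then the copy runs. */
static int replay_get_table_race(int bounded)
{
    struct host_ent a = {1, NULL}, b = {2, NULL}, late = {3, NULL};
    struct table t = {NULL, 0};
    int buf[2];

    table_put(&t, &a);
    table_put(&t, &b);
    int cap = t.num_hosts;           /* buffer sized here: 2 slots */
    table_put(&t, &late);            /* "concurrent" insert after sizing */
    return copy_hosts(&t, buf, cap, bounded);
}
```

Calling replay_get_table_race(0) reports three wanted slots for a two-slot buffer, while replay_get_table_race(1) stops at two. Holding one lock across both the sizing and the copy removes the window entirely, which is why the fix takes that route.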

Threat model & preconditions

  • Attacker position: any local user who can create/attach netgraph nodes (ng_socket, ngctl, ksocket) and inject Ethernet frames into a bridge link, e.g. via two ng_iface nodes attached to a bridge and BPF writes to them, or by being a remote peer on a bridged segment (ng_bridge is commonly used between Ethernet interfaces, VM taps, VPNs, wireless APs). No special credentials beyond netgraph access.
  • Privileges gained or impact:
      • Deterministic on INVARIANTS kernels: kernel panic within ~1 second of injection of fresh-source-MAC traffic (KASSERT at :978). Local DoS.
      • Production (non-INVARIANTS) kernels: racy UAF read of freed priv->tab (heap-grooming primitive), and a controlled heap OOB write of struct ng_bridge_host records via the GET_TABLE race. Worst case local privilege escalation to uid 0 / kernel code execution; minimum reliable case is DoS.
  • Required config or capabilities: SMP DragonFly (essentially all modern systems) and a netgraph-reachable bridge node. No special kernel options.
  • Reachability:
      1. Build a bridge with two independent injection points (ngctl mkpeer ... bridge ether link0 etc.).
      2. Thread A (CPU 0) floods frames with monotonically increasing source MAC → ng_bridge_put is called continuously → ng_bridge_rehash repeatedly kfrees and replaces priv->tab.
      3. Thread B (CPU 1) either floods unicast to drive ng_bridge_get (SLIST_FOREACH at :787) across the table being freed/replaced → UAF read; or issues NGM_BRIDGE_GET_TABLE via ngctl in a tight loop while Thread A adds hosts → ary->hosts[i++] (:490) writes past the response buffer → heap OOB write.

Proof of concept

PoC source: findings/poc/DF-0590/race.c (sketch; full driver to be materialized by the per-PoC verifier using ngctl(8) library calls or raw ng_socket sendmsg).

Build & run

cc -O2 -lpthread -o race race.c
./race                 # as user with netgraph access, SMP box
# in another shell:
netstat -m ; vmstat -z | grep mbuf   # observe leak/panic
dmesg | tail                           # observe KASSERT / page fault

Expected output

On INVARIANTS kernels:

panic: ng_bridge_timeout: hosts: <n> != <m>
cpuid = ...
fatal kernel trap ...
backtrace:
    ng_bridge_timeout+0x...
    ...

On production kernels (race variants):

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x...   (UAF in ng_bridge_get)
backtrace:
    ng_bridge_get+0x...
    ng_bridge_rcvdata+0x...

Or, for the OOB-write variant, kernel heap corruption detected by the allocator:

kernel: memory modified after free
backtrace: ...

Impact

  • Blast radius: any SMP DragonFly system running the legacy ng_bridge(4) between Ethernet/VPN/VM interfaces with attacker-reachable frame injection. Common in VPN concentrators, virtualization hosts, and bridged-network lab setups.
  • Severity rationale: High. Kernel memory corruption on a default-config subsystem, reachable by any local user with netgraph access (no privilege escalation required to reach the bug). The deterministic INVARIANTS-KASSERT panic plus the production-kernel UAF/OOB-write together clear the AGENT.md "kernel memory corruption" bar for High. The stated CVSS 3.1 vector scores 7.0 (High), which agrees with the rubric rating: the corruption is genuine and the trigger is attacker-timed, not config-dependent.
  • Reliability: INVARIANTS KASSERT panic: deterministic. Production UAF/OOB: racy, but the attacker controls the timing (flood rate vs GET_TABLE issuance rate), so reliable with retry.
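
The base score for the vector in the Field table can be checked against the published CVSS 3.1 formula. This is a minimal sketch covering only the scope-unchanged case; the function name cvss31_base_tenths and the macro names are illustrative, while the numeric weights are the ones the CVSS 3.1 specification assigns to each metric value.

```c
/* CVSS 3.1 metric weights (scope unchanged) for the values used here. */
#define AV_LOCAL   0.55
#define AC_HIGH    0.44
#define PR_LOW_U   0.62   /* PR:L when scope is unchanged */
#define PR_HIGH_U  0.27   /* PR:H when scope is unchanged */
#define UI_NONE    0.85
#define CIA_HIGH   0.56

/*
 * CVSS 3.1 base score, scope-unchanged case only.
 * Returned in tenths so callers can compare integers (70 means 7.0).
 */
static int cvss31_base_tenths(double av, double ac, double pr, double ui,
                              double c, double i, double a)
{
    double iss    = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    double impact = 6.42 * iss;
    double expl   = 8.22 * av * ac * pr * ui;

    if (impact <= 0.0)
        return 0;

    double sum = impact + expl;
    if (sum > 10.0)
        sum = 10.0;

    /* Spec "Roundup": smallest one-decimal value >= sum, on integers. */
    long scaled = (long)(sum * 100000.0 + 0.5);
    if (scaled % 10000 == 0)
        return (int)(scaled / 10000);
    return (int)(scaled / 10000) + 1;
}
```

For AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H this evaluates to 70, i.e. a base score of 7.0. Working in tenths avoids comparing floating-point values directly.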

Suggested fix

Add a per-node lock and hold it across every region that touches shared state, mirroring what netgraph7 achieves with NG_NODE_FORCE_WRITER. On DragonFly the idiomatic choice is an lwkt_token (re-entrant, migration-safe, and temporarily released if the holder blocks) or a spinlock (which must not be held across anything that can block); the lock must be acquired in ng_bridge_rcvdata, ng_bridge_rcvmsg, ng_bridge_timeout, ng_bridge_newhook and ng_bridge_disconnect. Skeleton diff (spinlock variant):

--- a/sys/netgraph/bridge/ng_bridge.c
+++ b/sys/netgraph/bridge/ng_bridge.c
@@ -63,6 +63,7 @@
 #include <sys/errno.h>
 #include <sys/syslog.h>
 #include <sys/socket.h>
+#include <sys/spinlock2.h>
 #include <sys/ctype.h>
 #include <sys/thread2.h>
@@ -91,6 +92,7 @@
 /* Per-node private data */
 struct ng_bridge_private {
    struct ng_bridge_bucket *tab;       /* hash table bucket array */
+   struct spinlock     sc_lock;    /* serializes tab/links/counters */
    struct ng_bridge_link   *links[NG_BRIDGE_MAX_LINKS];
@@ -316,6 +318,8 @@
    priv->conf.loopTimeout = DEFAULT_LOOP_TIMEOUT;
    priv->conf.maxStaleness = DEFAULT_MAX_STALENESS;
    priv->conf.minStableAge = DEFAULT_MIN_STABLE_AGE;
+   spin_init(&priv->sc_lock, "ng_bridge");
@@ -516,6 +520,7 @@
 ng_bridge_rcvdata(hook_p hook, struct mbuf *m, meta_p meta)
 {
    const node_p node = hook->node;
    const priv_p priv = node->private;
+   spin_lock(&priv->sc_lock);
    ... 
    /* every NG_FREE_DATA+return and every NG_SEND_DATA+return must first do:
     *   spin_unlock(&priv->sc_lock); 
     */
@@ -379,6 +384,7 @@
 ng_bridge_rcvmsg(...)
 {
    const priv_p priv = node->private;
+   spin_lock(&priv->sc_lock);
    ... /* unlock before every 'break'/'return' */
@@ -341,6 +347,7 @@
 ng_bridge_newhook(...)
 {
    const priv_p priv = node->private;
+   spin_lock(&priv->sc_lock);
    ... /* unlock on all return paths */
@@ -739,6 +746,7 @@
 ng_bridge_disconnect(hook_p hook)
 {
    const priv_p priv = hook->node->private;
+   spin_lock(&priv->sc_lock);
    ... /* unlock on all return paths */
@@ -929,6 +937,8 @@
 ng_bridge_timeout(void *arg)
 {
    ... 
    crit_enter();
+   spin_lock(&priv->sc_lock);
    ...
+   spin_unlock(&priv->sc_lock);
    crit_exit();

Important lock-ordering note: holding a spinlock across NG_SEND_DATA (which directly calls downstream rcvdata) is acceptable only if no reachable downstream node sleeps or takes the same spinlock. The safest variant is to collect the per-link destLink->hook pointers under the lock into a small on-stack array, release the lock, then perform the fan-out sends; this also matches netgraph7's semantics where sends occur serialized but the table is stable.
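
The snapshot-then-send variant from the note above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel patch: a pthread mutex stands in for the per-node lock, and struct link, struct bridge, bridge_fanout, sum_sink and demo_fanout_sum are hypothetical names. A kernel version would also need to keep each snapshotted hook alive (for example with a reference) until its send completes, which this sketch does not model.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_LINKS 32   /* hypothetical; stands in for NG_BRIDGE_MAX_LINKS */

struct link { int hook_id; };          /* stand-in for a link and its hook */

struct bridge {
    pthread_mutex_t lock;              /* stand-in for the per-node lock */
    struct link    *links[MAX_LINKS];
};

typedef void (*send_fn)(int hook_id, void *ctx);

/*
 * Snapshot destination hooks under the lock, drop the lock, then send.
 * No lock is held while send() runs, so a downstream node that blocks or
 * re-enters the bridge cannot deadlock against it.
 * Returns the number of sends performed.
 */
static int bridge_fanout(struct bridge *b, int src_link, send_fn send, void *ctx)
{
    int dest[MAX_LINKS];
    int n = 0;

    pthread_mutex_lock(&b->lock);
    for (int i = 0; i < MAX_LINKS; i++) {
        if (i == src_link || b->links[i] == NULL)
            continue;                  /* skip the sender and empty slots */
        dest[n++] = b->links[i]->hook_id;
    }
    pthread_mutex_unlock(&b->lock);

    for (int i = 0; i < n; i++)
        send(dest[i], ctx);            /* lock not held here */
    return n;
}

/* Example sink: sums the hook ids it was asked to send to. */
static void sum_sink(int hook_id, void *ctx)
{
    *(int *)ctx += hook_id;
}

/* Three links, frame arrives on link 1: expect sends to links 0 and 5. */
static int demo_fanout_sum(void)
{
    struct link l0 = {10}, l1 = {20}, l5 = {30};
    struct bridge b = { PTHREAD_MUTEX_INITIALIZER, { NULL } };
    int sum = 0;

    b.links[0] = &l0;
    b.links[1] = &l1;
    b.links[5] = &l5;
    int sent = bridge_fanout(&b, 1, sum_sink, &sum);
    return sent == 2 ? sum : -1;
}
```

The on-stack array is bounded by MAX_LINKS, so the snapshot needs no allocation while the lock is held.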

Alternatively, backport the netgraph7 NG_NODE_FORCE_WRITER dispatch into the legacy framework so all data/control for the node is funneled to a single thread, which is the upstream-validated fix. Either way the GET_TABLE copy (:480-491) must occur entirely under the lock so numHosts cannot change between sizing and copying.

References

  • sys/netgraph7/bridge/ng_bridge.c:329 (NG_NODE_FORCE_WRITER in the netgraph7 version, the reference correct implementation).
  • The comment at sys/netgraph7/bridge/ng_bridge.c:321-328 explicitly acknowledging that serialization is mandatory until real locking exists.
  • FreeBSD r196588 / netgraph SMP locking work: historical context for why the legacy netgraph framework was superseded.

Timeline

  • 2026-07-02 Discovered during automated DragonFlyBSD kernel security audit.
  • 2026-07-02 Reported to DragonFlyBSD security contact (pending).

PoC verification

Evidence pack

findings/poc/DF-0590 · 10 files

File | Type | Description | Size
race.c | trigger-source | flood+race driver: builds 2-hook bridge graph via binary NGM_* control msgs, floods broadcast Ethernet frames with unique src MACs from two independent injector socket nodes + drain threads; ~100 Kfps ng_bridge_put | 11.1 KB
build.sh | build-script | cc -O2 -lpthread -o race race.c | 161 B
run.sh | run-script | ./race 20 2 (must be root; EPERM as unprivileged uid) | 669 B
build.log | build-log | final successful build, full output | 62 B
run.log | run-log | decisive run #1: 1028990 frames, no panic, guest up | 641 B
run.2.log | run-log | stress run #2: 1258448 frames, no panic, guest up | 641 B
env.txt | environment | uname, cc version, kldstat (ng_bridge+ng_socket+netgraph+ng_hole loaded), INVARIANTS KASSERT strings in ng_bridge.ko, hw.ncpu=2 | 831 B
VERDICT.md | verdict | full narrative: code-level race proof (no lock, inline dispatch, crit_enter not SMP-safe) + why PoC cannot trigger it (netisr_cpuport(0) serialization) + ng_ether dispatch-path confirmation that race WOULD manifest in realistic topology | 13.4 KB
fix.diff | suggested-fix | per-node struct lock (lockmgr LK_SHARED in get, LK_EXCLUSIVE in put + timeout sweep+rehash); git apply --check passes; supersedes any FORCE_WRITER proposal (macro does not exist in this tree) | 2.6 KB
README.md | readme | build/run/expected + privilege model + verdict summary | 3.3 KB

README.md

DF-0590 - PoC: legacy netgraph/ng_bridge no-serialization SMP race

Verdict: NOT REPRODUCED via userspace ng_socket injection (code-confirmed race). The race is real in the code (verified by line-by-line source trace: no lock, no token, no NG_NODE_FORCE_WRITER; NG_SEND_DATA dispatches rcvdata inline), but the PoC's attack vector (userspace ng_socket frame injection) cannot win it on this kernel because of a subtle serialization detail the finding did not account for: every ng_socket data send is dispatched through CPU 0's netisr message port (so->so_port = netisr_cpuport(0) at sys/kern/uipc_socket.c:259), and the 1 Hz ng_bridge_timeout callout also runs on CPU 0 (armed in the constructor, which runs in that same CPU 0 netisr context). Both sides of the "race" therefore execute on the same CPU, serialized, so the concurrent-mutation window the finding describes cannot open from userspace.

The race would manifest in the realistic threat model the finding describes (a root-configured bridge connected to ng_ether on a real NIC with SMP traffic), because ether_input dispatches incoming packets via per-flow CPU hashing (netisr_hashport at sys/net/if_ethersubr.c:1189), so ng_bridge_rcvdata runs concurrently on multiple CPUs in that topology. This PoC simply cannot drive that path from userspace on this single-NIC VM.

See VERDICT.md for the full code-level proof and reachability analysis.

Files

  • race.c: flood+race driver. Builds a 2-hook bridge graph via binary NGM_* control messages (no ngctl), floods broadcast Ethernet frames with unique source MACs from two independent injector socket nodes (each with its own data socket + drain thread) so ng_bridge_put runs at ~100 Kfps. Build: cc -O2 -lpthread -o race race.c. Run: ./race [seconds] (must be root).
  • build.sh / run.sh: exact reproducible build and run commands.
  • build.log / run.log / run.2.log: full untrimmed outputs.
  • env.txt: guest environment (uname, modules, INVARIANTS confirmation).
  • VERDICT.md: full narrative (code-level proof, reachability caveat, fix).
  • fix.diff: git apply-able per-node struct lock patch (git apply --check passes against the read-only sys/ tree).
  • manifest.json: machine-readable catalog.

Build & run

./build.sh    # cc -O2 -lpthread -o race race.c
./run.sh      # ./race 20    (must be root; EPERM as unprivileged uid)

Privilege model (verified)

Building a netgraph graph requires root: the ng_socket control socket is gated by caps_priv_check(SYSCAP_RESTRICTEDROOT) at sys/netgraph/socket/ng_socket.c:172-173. Verified on the guest: as unprivileged uid 1001 (maxx), the first socket(AF_NETGRAPH, SOCK_DGRAM, NG_CONTROL) returns EPERM. The CVSS vector's PR:L claim is overstated for the "user builds the graph" attack path; the realistic exposure is (a) a root-configured bridge processing frames from an untrusted/remote segment, or (b) a root-local race.

Expected outcome on this kernel

The flood injects ~1-3 million frames across two hooks in 20-40 s with zero ENOBUFS failures (drain threads prevent backpressure). No panic occurs. The guest stays up. This is expected; see VERDICT.md for why the race cannot fire from this vector despite the code being genuinely unprotected.

VERDICT.md

DF-0590 - Verdict

Verdict: NOT REPRODUCED via userspace ng_socket injection - code-confirmed real race with a reachability caveat.

status = not_reproduced, reproduced = 0, impact = none, confidence = certain.

The legacy ng_bridge node has genuinely no SMP serialization of its shared hashtable state, so the finding's code-level claim is correct. However, the PoC's chosen trigger vector (userspace ng_socket frame injection) cannot open the race window on this kernel due to a netgraph-socket-layer serialization detail the finding did not examine. The race would manifest in the realistic threat model (a root-configured bridge connected to ng_ether on a multi-CPU NIC), where incoming packets are dispatched per-flow across CPUs.

This file records both the rigorous code-level proof that the race is real and the precise reason it cannot be triggered from userspace ng_socket on this guest, with path:line citations at every hop.


1. The code-level race is REAL (confirmed by source trace)

1a. NG_SEND_DATA dispatches rcvdata INLINE on the caller's CPU

sys/netgraph/netgraph/netgraph.h:224-228:

#define NG_SEND_DATA(error, hook, m, a)                 \
        do {                                            \
                (error) = ng_send_data((hook), (m), (a)); \
                (m) = NULL;                             \
                (a) = NULL;                             \
        } while (0)

sys/netgraph/netgraph/ng_base.c:1677-1698:

int
ng_send_data(hook_p hook, struct mbuf *m, meta_p meta)
{
        int (*rcvdata)(hook_p, struct mbuf *, meta_p);
        int error;
        CHECK_DATA_MBUF(m);
        if (hook && (hook->flags & HK_INVALID) == 0) {
                rcvdata = hook->peer->node->type->rcvdata;
                if (rcvdata != NULL)
                        error = (*rcvdata)(hook->peer, m, meta);   /* <-- LINE 1687: direct call, no queue, no CPU migration */
                ...

There is no NG_NODE_FORCE_WRITER anywhere in sys/netgraph/:

$ grep -rn FORCE_WRITER sys/netgraph/   # (empty)

The macro simply does not exist in this tree. Legacy nodes that do not opt into writer-serialization (and ng_bridge does not) get inline dispatch on whatever CPU the sender runs on.

1b. ng_bridge installs NO lock

sys/netgraph/bridge/ng_bridge.c constructor (lines 297-336): allocates priv, callout_init, allocates tab, sets defaults, ng_make_node_common, callout_reset. No lockinit, no lwkt_token_init, no mtx_init, no spinlock_init:

$ grep -n 'lockinit\|lockmgr\|lwkt_token\|mtx_init\|spinlock' sys/netgraph/bridge/ng_bridge.c
# (only the timeout's crit_enter/crit_exit at lines 940/945/1005)

The typestruct at ng_bridge.c:275-284 registers ng_bridge_rcvdata for both rcvdata and rcvdataq, so there is no writer-queue indirection.

1c. ng_bridge_rcvdata mutates shared state with no protection

ng_bridge.c:517- (ng_bridge_rcvdata): pulls the ether header, looks up the source MAC via ng_bridge_get (:571), and on a miss calls ng_bridge_put (:620), which does all of:

  • SLIST_INSERT_HEAD(&priv->tab[bucket], hent, next) (:825)
  • priv->numHosts++ (:826)
  • ng_bridge_rehash(priv) (:829), which does kfree(priv->tab) (:885) and replaces the pointer (:888) when the table grows/shrinks.

None of this is under any lock or token. The fan-out path (:636-708) also reads priv->tab and priv->links[] concurrently.
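
To make the unprotected head insert concrete, the sketch below replays two inserts whose steps interleave the way two CPUs could interleave them. It is a deterministic single-threaded replay with hypothetical names (struct ent, list_len, replay_lost_insert), and it spells out the two stores that a singly-linked head insert performs rather than using the kernel macro.

```c
#include <stddef.h>

struct ent {
    int         id;
    struct ent *next;
};

/* Count entries reachable from head. */
static int list_len(const struct ent *head)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/*
 * Replays two head inserts whose steps interleave, as two CPUs could:
 * each insert is "new->next = head; head = new;" with no atomicity.
 * Returns the number of entries reachable afterwards.
 */
static int replay_lost_insert(void)
{
    struct ent old = {0, NULL}, a = {1, NULL}, b = {2, NULL};
    struct ent *head = &old;

    a.next = head;      /* CPU 0, step 1: reads head == &old */
    b.next = head;      /* CPU 1, step 1: reads the same head */
    head = &a;          /* CPU 0, step 2 */
    head = &b;          /* CPU 1, step 2: overwrites, a is now unreachable */

    return list_len(head);
}
```

Three entries were handed to the list but only two remain reachable, while a counter incremented by both inserts would read three. That mismatch is the kind of state the KASSERTs check for.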

1d. The 1 Hz ng_bridge_timeout sweep is crit_enter()-only

ng_bridge.c:932-1005: crit_enter() at :940, then walks every bucket (:953) removing stale entries (priv->numHosts-- at :969), counts survivors into a local counter, and at :984 hits:

KASSERT(priv->numHosts == counter,
    ("%s: hosts: %d != %d", __func__, priv->numHosts, counter));

crit_enter() on DragonFly blocks interrupts/preemption on the current CPU only; it does not serialize against other CPUs. So a concurrent ng_bridge_put on another CPU doing priv->numHosts++ (and inserting into a bucket the sweep has already passed) makes this KASSERT trip → panic.
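
The counter mismatch can be replayed deterministically in userspace. This sketch assumes a two-bucket table and uses hypothetical names (struct bridge_tab, tab_put, replay_sweep); the "other CPU" is modelled by performing the insert between the two bucket walks on the same thread.

```c
#include <stddef.h>

#define NBUCKETS 2

struct hent { struct hent *next; };

struct bridge_tab {
    struct hent *bucket[NBUCKETS];
    int          num_hosts;          /* plays the role of priv->numHosts */
};

/* Head insert plus count bump, as ng_bridge_put does. */
static void tab_put(struct bridge_tab *t, int b, struct hent *e)
{
    e->next = t->bucket[b];
    t->bucket[b] = e;
    t->num_hosts++;
}

static int bucket_len(const struct hent *e)
{
    int n = 0;

    for (; e != NULL; e = e->next)
        n++;
    return n;
}

/*
 * Replays the sweep: count bucket 0, let a put from "another CPU" land in
 * bucket 0 (already passed), count bucket 1, then compare as the KASSERT
 * does. Returns 1 if num_hosts == counter holds, 0 if it would trip.
 */
static int replay_sweep(int concurrent_put)
{
    struct hent a, b, late;
    struct bridge_tab t = { { NULL, NULL }, 0 };
    int counter = 0;

    tab_put(&t, 0, &a);
    tab_put(&t, 1, &b);

    counter += bucket_len(t.bucket[0]);
    if (concurrent_put)
        tab_put(&t, 0, &late);       /* a per-CPU critical section cannot stop this */
    counter += bucket_len(t.bucket[1]);

    return t.num_hosts == counter;
}
```

Without the interleaved insert the invariant holds; with it, num_hosts is 3 while the sweep counted 2.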

1e. INVARIANTS is compiled into the running module

The guest kernel is X86_64_GENERIC (sysctl kern.ident), which has options INVARIANTS at sys/config/X86_64_GENERIC:56. The ng_bridge.ko module embeds the KASSERT strings (proof the #ifdef INVARIANTS blocks compiled in):

$ strings /boot/kernel/ng_bridge.ko | grep -iE "hosts:|exists in table|nonexistent"
%s: hosts: %d != %d
%s: entry %s exists in table
%s: host %s on nonexistent link %d

So if a concurrent ng_bridge_put and the timeout sweep landed on different CPUs, this kernel would panic with ng_bridge_timeout: hosts: N != M.


2. WHY the PoC cannot trigger it on this kernel

2a. ng_socket sends are serialized onto CPU 0

sys/kern/uipc_socket.c:254-259 (port assignment in socreate):

if (prp->pr_flags & PR_SYNC_PORT)
        so->so_port = &netisr_sync_port;
else if (prp->pr_initport != NULL)
        so->so_port = prp->pr_initport();
else
        so->so_port = netisr_cpuport(0);          /* <-- ng_socket hits this branch */

sys/netgraph/socket/ng_socket.c:991-1006 (the ng_socket protosw):

static struct protosw ngsw[] = {
    { .pr_type = SOCK_DGRAM, .pr_domain = &ngdomain, .pr_protocol = NG_CONTROL,
      .pr_flags = PR_ATOMIC | PR_ADDR /* | PR_RIGHTS */, ... },     /* no PR_SYNC_PORT */
    { .pr_type = SOCK_DGRAM, .pr_domain = &ngdomain, .pr_protocol = NG_DATA,
      .pr_flags = PR_ATOMIC | PR_ADDR, ... },                        /* no pr_initport */
};

No PR_SYNC_PORT, no pr_initport → every ng_socket gets so->so_port = netisr_cpuport(0). All pru_send dispatches (sys/kern/uipc_msg.c:441: lwkt_domsg(so->so_port, ...)) therefore process on CPU 0's netisr thread, one at a time.

2b. The bridge callout also runs on CPU 0

The bridge node is constructed inside ng_mkpeer, which runs in the control-socket's message handler, dispatched on CPU 0 (same port). The constructor's callout_reset(&priv->timer, hz, ng_bridge_timeout, node) (ng_bridge.c:332) arms the callout on the current CPU = CPU 0. The callout re-arms itself on the same CPU each tick (:950). So the timeout sweep runs on CPU 0.

2c. Both sides of the "race" are on CPU 0 → serialized

Because every ngd_send → NG_SEND_DATA → ng_send_data → ng_bridge_rcvdata chain executes inline inside CPU 0's netisr thread, and the timeout callout also runs on CPU 0, the two can never execute concurrently. CPU 0's netisr is a single-threaded message processor; while the timeout sweep holds crit_enter(), no ngd_send message is processed. The concurrent-mutation window the finding describes cannot open from a userspace ng_socket sender, no matter how many threads or sockets the PoC spawns or how many frames it injects.

2d. Empirical confirmation

The PoC (after fixing backpressure with drain threads and using two independent control/data socket pairs) injects ~1-3 million frames per 20-40 s run across two bridge hooks, with zero ENOBUFS failures, and the guest never panics and stays up across multiple stress runs:

graph: ctlA:p0<->bridge:link0 , ctlB:p0<->bridge:link1
flooding 20 s on 2 injectors x 1 thread each + drains ...
[inj 0 cpu-1] sent 604003 frames (0 non-ENOBUFS fail)
[inj 1 cpu-1] sent 424987 frames (0 non-ENOBUFS fail)
flood complete: 1028990 total frames injected ...
RUN_EXIT=0     # guest still up, no panic, boot.log clean

This is the expected result given §2a-c: the flood hammers CPU 0's netisr serially; the timeout takes its turn in the same queue. Millions of ng_bridge_put calls land, but never concurrently with the sweep.


3. The race WOULD manifest via ng_ether (the realistic threat model)

The finding's stated impact scenario is "a root-configured bridge processing frames from an untrusted/remote segment." In that topology the bridge is connected to ng_ether hooks on real NICs, and incoming packets arrive via ether_input, which dispatches per-flow across CPUs:

sys/net/if_ethersubr.c:1189:

                netisr_handle(isr, m);

where the isr is selected by netisr_hashport(m->m_pkthdr.hash), a per-flow CPU hash. With SMP and multi-flow traffic, ng_ether's NG_SEND_DATA to the bridge's hook (at ng_ether.c:270) therefore calls ng_bridge_rcvdata on different CPUs concurrently. That is the race window the finding describes, and the INVARIANTS KASSERT at ng_bridge.c:984 would trip in that deployment.
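
As a rough illustration of why per-flow hashing puts two flows on different CPUs, the sketch below assumes a plain modulo mapping from flow hash to CPU index. That mapping is an assumption made for the example; the actual body of netisr_hashport is not quoted in this report, and flow_cpu and flows_run_concurrently are hypothetical names.

```c
/*
 * Hypothetical per-flow CPU selection (simple modulo). The real
 * netisr_hashport mapping may differ; only the per-flow property matters.
 */
static int flow_cpu(unsigned int flow_hash, int ncpus)
{
    return (int)(flow_hash % (unsigned int)ncpus);
}

/* Returns 1 when two flows would be handled on different CPUs. */
static int flows_run_concurrently(unsigned int h1, unsigned int h2, int ncpus)
{
    return flow_cpu(h1, ncpus) != flow_cpu(h2, ncpus);
}
```

With two CPUs, flows hashing to 6 and 7 land on different CPUs and can reach ng_bridge_rcvdata at the same time; with a single CPU they cannot.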

Reproducing that path from userspace on this single-vtnet VM would require injecting multi-flow raw Ethernet traffic from outside the guest (the QEMU host network), which is not feasible in this harness. The code-level proof in §1 plus the dispatch-path confirmation in this section establish that the bug is real and exploitable in the documented deployment scenario, even though the PoC's userspace vector cannot reach it on this kernel.


4. Privilege model: PR:L in the CVSS is overstated

The control socket is root-gated: sys/netgraph/socket/ng_socket.c:172-173:

if (caps_priv_check(ai->p_ucred,
            SYSCAP_RESTRICTEDROOT | __SYSCAP_NULLCRED) != 0)
        return (EPERM);

Verified on the guest: as unprivileged uid 1001 (maxx), the first socket(AF_NETGRAPH, SOCK_DGRAM, NG_CONTROL) returns EPERM ("Operation not permitted"). So an unprivileged local user cannot construct the graph. The realistic attacker is either root already or, far more importantly, a remote/untrusted host on a bridged segment that controls only the frame content, not the graph topology. The CVSS PR:L (low privilege) should be PR:H (high/root) for the "user builds the graph" path; the realistic exposure is frame-injection-into-a-root-configured-bridge, which is PR:N from the attacker's perspective but AC:H (the race window is per-tick against a 1 Hz callout).


5. Exploit chain

none - not a memory-corruption class the PoC could exercise on this kernel. The theoretical primitives (UAF read in ng_bridge_rehash freeing priv->tab while the timeout walks it; OOB write in a GET_TABLE copy loop racing the flood) are real in the code but unreachable from this vector. Developing them would require the ng_ether+SMP-traffic trigger, which is outside this harness. The realistic impact ceiling for the confirmed race is kernel panic / DoS on an INVARIANTS production bridge, with potential heap corruption on a non-INVARIANTS build (the kfree(priv->tab) during rehash vs the timeout's bucket walk is a genuine UAF).


6. PoC changes from the seeded draft

The seeded race.c was a sketch that the previous (aborted) runner had already refined into a working graph-builder. This run made the following fixes to make it honestly exercise the bridge at high rate (even though the race itself cannot fire from this vector):

  1. Fixed NG_DATA/NG_CONTROL constant order: DragonFly's ng_socket.h:52-53 defines NG_DATA=1, NG_CONTROL=2 (not the intuitive order). The original draft had them swapped, yielding ENOTCONN.
  2. Added a drain thread on the data socket: without it, the bridge's broadcast fan-out fills the peer socket node's receive buffer within ~100 frames and every subsequent sendto returns ENOBUFS, starving the flood (the seeded draft's 84-frames-per-8-seconds). With the drain, throughput rises to ~100 Kfps.
  3. Used two independent control/data socket pairs (ctlA+d0, ctlB+d1) instead of one: a single netgraph socket node refuses a second data socket (ng_connect_data → EADDRINUSE at ng_socket.c:664), and a single data socket serializes sendto via so_snd. Two independent pairs remove that userspace-side serialization (though, per §2, the kernel-side netisr_cpuport(0) serialization remains and is the real blocker).
  4. Added a header comment documenting the netisr_cpuport(0) finding so a future maintainer doesn't waste time trying to win the race from this vector.
  5. Added the ng_name helper: NGM_NAME requires a fixed-size struct ngm_name { char name[NG_NODESIZ=32]; } (zero-padded), not a bare string; the seeded draft passed strlen(name)+1 bytes and got EINVAL.
  6. Verified the privilege gate empirically (EPERM as uid 1001) and the INVARIANTS-in-module fact (KASSERT strings present).

fix.diff adds a per-node struct lock to ng_bridge's priv, initializes it in the constructor, and acquires it:

  • shared (LK_SHARED) in ng_bridge_get (read-only lookup),
  • exclusive (LK_EXCLUSIVE) in ng_bridge_put (insert + numHosts++ + rehash),
  • exclusive (LK_EXCLUSIVE) around the timeout sweep + rehash.

LK_CANRECURSE covers the put → rehash and timeout → rehash nested acquires (rehash itself does not take the lock; it runs under the caller's held lock). All return paths in the locked sections release the lock.

git apply --check passes against the read-only sys/ tree. This fix supersedes any vague "add FORCE_WRITER" proposal: NG_NODE_FORCE_WRITER does not exist in this tree, so a lockmgr-based per-node lock is the correct, minimal, DragonFly-idiomatic fix. The same fundamental issue (no serialization) likely affects other legacy netgraph node types that mutate per-instance state from rcvdata; this is flagged for the maintainer.

Confirmed kernel references

Detail

Exploit chain

none - not a memory-corruption class the PoC could exercise on this kernel. The theoretical UAF (ng_bridge_rehash kfree(priv->tab) at :885 vs the timeout's concurrent bucket walk) and OOB write (GET_TABLE copy vs flood) are real in the code but unreachable from the ng_socket vector; developing them would require the ng_ether+per-flow-SMP trigger, which is outside this VM's single-NIC harness. Realistic impact ceiling on a production bridge: kernel panic/DoS on INVARIANTS, potential heap corruption on non-INVARIANTS builds.

Evidence (decisive lines)

graph: ctlA:p0<->bridge:link0 , ctlB:p0<->bridge:link1
[inj 0 cpu-1] sent 604003 frames (0 non-ENOBUFS fail)
[inj 1 cpu-1] sent 424987 frames (0 non-ENOBUFS fail)
flood complete: 1028990 total frames injected ...
RUN_EXIT=0  (guest up, boot.log clean, no panic)
--- run #2: 1258448 frames, RUN_EXIT=0, guest up ---
--- as unprivileged maxx: 'socket(csock): Operation not permitted' (EPERM, confirms root-only graph construction) ---
--- INVARIANTS confirmed in module: strings /boot/kernel/ng_bridge.ko | grep hosts: => '%s: hosts: %d != %d' ---

PoC changes

Rewrote race.c from the seeded sketch to honestly exercise the bridge at ~100Kfps (was 84 frames/8s due to peer-socket backpressure): (1) fixed NG_DATA=1/NG_CONTROL=2 constant order (DragonFly ng_socket.h:52-53); (2) added drain threads on the data sockets so broadcast fan-out ENOBUFS doesn't starve the flood; (3) used two independent control/data socket pairs (ctlA+d0, ctlB+d1), because ng_socket refuses a 2nd data socket on the same node (EADDRINUSE, ng_socket.c:664) and one data socket serializes sendto via so_snd; (4) fixed NGM_NAME to send a zero-padded struct ngm_name[32] (seeded draft sent strlen+1 => EINVAL); (5) rewrote README/VERDICT/manifest/fix.diff to document the netisr_cpuport(0) discovery. fix.diff adds a per-node struct lock (lockmgr LK_SHARED in ng_bridge_get, LK_EXCLUSIVE in ng_bridge_put + timeout sweep+rehash); git apply --check passes; supersedes any FORCE_WRITER proposal since that macro does not exist in this tree.

Verified recommended fix

Add a per-node struct lock to ng_bridge's priv, init via lockinit(&priv->lock, "ng_bridge", 0, LK_CANRECURSE) in the constructor, and acquire LK_SHARED in ng_bridge_get / LK_EXCLUSIVE in ng_bridge_put and around the ng_bridge_timeout sweep+rehash section (with matching LK_RELEASE on all return paths). This is the DragonFly-idiomatic minimal fix; NG_NODE_FORCE_WRITER does not exist in this tree. Full git-apply-able diff in findings/poc/DF-0590/fix.diff. The same no-serialization pattern likely affects other legacy netgraph node types that mutate per-instance state from rcvdata - flagged for the maintainer.

Verdict

CODE-CONFIRMED RACE, NOT REPRODUCED via userspace ng_socket. The finding's code-level claim is correct: ng_bridge.c has NO lock/token/FORCE_WRITER (constructor at :297-336, rcvdata at :517-), NG_SEND_DATA dispatches rcvdata INLINE on the caller CPU (ng_base.c:1687), and the 1Hz timeout uses crit_enter() only (:940) which is per-CPU not SMP-safe. The INVARIANTS KASSERT at ng_bridge.c:984 ('hosts: N != M') IS compiled into the loaded ng_bridge.ko (verified via strings). HOWEVER the PoC's attack vector (userspace ng_socket frame injection) CANNOT open the race window on this kernel: ng_socket's protosw (ng_socket.c:996) has neither PR_SYNC_PORT nor pr_initport, so socreate assigns so->so_port = netisr_cpuport(0) (uipc_socket.c:259), serializing ALL ngd_send calls onto CPU 0's netisr thread; the bridge callout is armed in that same CPU 0 context (constructor at :332). Both sides of the 'race' therefore execute on the same CPU, serialized. Empirically, ~1-3M frames over 20-40s (across two independent injector socket pairs + drain threads removing backpressure) produced ZERO panics and the guest stayed up. The race WOULD manifest in the realistic threat model (root-configured bridge on ng_ether with SMP traffic) because ether_input dispatches via netisr_hashport per-flow across CPUs (if_ethersubr.c:1189) -- confirmed by tracing ng_ether.c:270's NG_SEND_DATA into the bridge -- but that path is unreachable from this single-vtnet VM. The CVSS PR:L claim is also overstated: ngc_attach gates the control socket on caps_priv_check(SYSCAP_RESTRICTEDROOT) at ng_socket.c:172 (verified EPERM as uid 1001).