# DF-0008 — `vfs_setpublicfs()` use-after-`vput` of root vnode + refcount leak

## Verdict

**NOT REPRODUCED as an observable fault on this kernel, but the defect is
REAL and REACHABLE, and the vulnerable code path is CONFIRMED to execute on
the audited master DEV kernel.** This is neither a false positive nor
already fixed: master DEV (committed 2026-06-22) contains the exact buggy
lines unchanged. The vulnerable sequence `VFS_ROOT → VFS_VPTOFH → vput(rvp) →
vn_get_namelen(rvp)` ran end-to-end **528/528 times** (16 + 512) across my
test runs with no panic, for two reasons: the audited `X86_64_GENERIC` kernel
is a **non-DEBUG** build (no `INVARIANTS`/`WITNESS` to catch the unlocked
`VOP_PATHCONF`), and the UFS mount-root vnode retains references through the
race window, so `vnlru` does not reclaim it. The deterministic refcount-leak
variant (`:2259`) needs a filesystem whose `VFS_VPTOFH` fails, and none of the
export-capable filesystem types on this guest (hammer2/devfs/ufs/null/tmpfs)
has a failing `vptofh`. The fix (`fix.diff`) is warranted: the defect is
genuine.

Classification: `not_reproduced` (no panic/leak/uid0/dos observed), `impact:
none`, `confidence: likely`. The code-level defect itself is certain; the
absence of a live fault is a property of this kernel configuration, not of the
bug.

## The bug — confirmed line-by-line in master DEV

`vfs_setpublicfs()` (`sys/kern/vfs_subr.c:2255-2295`):

```c
2255:	if ((error = VFS_ROOT(mp, &rvp)))
2256:		return (error);
2257:
2258:	if ((error = VFS_VPTOFH(rvp, &nfs_pub.np_handle.fh_fid)))
2259:		return (error);            /* LEAK: no vput(rvp) */
2260:
2261:	vput(rvp);                     /* drops ref AND lock */
2262:
2266:	if (argp->ex_indexfile != NULL) {
2267:		int namelen;
2269:		error = vn_get_namelen(rvp, &namelen);   /* USE AFTER vput */
```

`vn_get_namelen()` (`sys/kern/vfs_subr.c:2547-2557`) unconditionally calls
`VOP_PATHCONF(vp, _PC_NAME_MAX, retval)`:

```c
2552:	error = VOP_PATHCONF(vp, _PC_NAME_MAX, retval);
```

Lock/refcount semantics confirmed against the audited source:

- `VFS_ROOT(mp, &rvp)` → `ufs_root` (`sys/vfs/ufs/ufs_vfsops.c:58-68`) →
  `VFS_VGET` → `vget`, which returns `rvp` **referenced and
  `LK_EXCLUSIVE`-locked**.
- `vput(vp)` (`sys/kern/vfs_lock.c:703-706`) is `vn_unlock(vp); vrele(vp);` —
  it drops **both** the lock and the usecount.

So at `:2269`/`:2552`, `VOP_PATHCONF` is dispatched on `rvp`, which by then is
**unlocked and has had its usecount decremented**. Two defects:

1. **Refcount leak (`:2259`)**: when `VFS_VPTOFH` fails, the function returns
   without calling `vput(rvp)`, leaking the usecount `VFS_ROOT` added (vnode
   pinned indefinitely). Because `VFS_ROOT` also returned `rvp` locked, that
   lock is not released on this path either. *Not reachable on UFS*
   (`ffs_vptofh`, `sys/vfs/ufs/ffs_vfsops.c:1228-1239`, always returns 0);
   reachable on any filesystem whose `vptofh` fails (e.g. one using the
   default `vfs_stdvptofh` → `EOPNOTSUPP`,
   `sys/kern/vfs_default.c:1562-1566`).
2. **Use-after-`vput` / lock-protocol violation (`:2261` then `:2269`)**:
   `VOP_PATHCONF` runs on an unlocked vnode. If the reference taken by
   `VFS_ROOT` was the only one, `vnlru` may have reclaimed `rvp`
   (`vgonel`/`vclean`) in the window, turning the call into a use-after-free
   in the VOP dispatch.
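
The two defects can be illustrated with a small userspace model. This is only a sketch: `struct toy_vnode` and every `toy_*` function are hypothetical stand-ins, not kernel code, and `toy_setpublic_reordered` shows one possible ordering under the lock/refcount semantics described above, not necessarily what `fix.diff` does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a vnode: only the state the argument depends on. */
struct toy_vnode {
    int  usecount;       /* references currently held */
    bool locked;         /* exclusive lock held by the caller */
    int  unlocked_vops;  /* VOP calls made while unlocked (protocol violations) */
};

/* Models VFS_ROOT: hands back the vnode referenced and locked. */
static void toy_root(struct toy_vnode *vp)
{
    vp->usecount++;
    vp->locked = true;
}

/* Models vput: release the lock, then drop the reference. */
static void toy_vput(struct toy_vnode *vp)
{
    assert(vp->usecount > 0);
    vp->locked = false;
    vp->usecount--;
}

/* Models VOP_PATHCONF: succeeds, but records a call on an unlocked vnode. */
static int toy_pathconf(struct toy_vnode *vp)
{
    if (!vp->locked)
        vp->unlocked_vops++;
    return 0;
}

/* Ordering from the excerpt: early return skips vput, pathconf runs after vput. */
static int toy_setpublic_buggy(struct toy_vnode *vp, int vptofh_error, bool indexfile)
{
    toy_root(vp);
    if (vptofh_error != 0)
        return vptofh_error;      /* reference and lock are never released */
    toy_vput(vp);
    if (indexfile)
        return toy_pathconf(vp);  /* vnode is already unlocked here */
    return 0;
}

/* Hypothetical reordering: do all work on the vnode, then vput exactly once. */
static int toy_setpublic_reordered(struct toy_vnode *vp, int vptofh_error, bool indexfile)
{
    int error;

    toy_root(vp);
    error = vptofh_error;
    if (error == 0 && indexfile)
        error = toy_pathconf(vp); /* still referenced and locked */
    toy_vput(vp);                 /* single release on every path */
    return error;
}
```

Starting from a vnode with one pre-existing reference, the buggy ordering ends with `usecount == 2` and the lock still held when the handle conversion fails, and with `unlocked_vops == 1` when it succeeds. The reordered variant leaves `usecount == 1`, the lock released, and no unlocked VOP call in both cases.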

## Reachability — CONFIRMED on the audited kernel

`vfs_setpublicfs()` is reachable from userspace via the export path:

```
mount(2)  →  ffs_vfsops.c:179 (copyin ufs_args)
           →  ffs_vfsops.c:187 (MNT_UPDATE branch)
           →  ffs_vfsops.c:270-273 (fspec==NULL → vfs_export)
           →  vfs_subr.c:2201-2204 (ex_flags & MNT_EXPUBLIC → vfs_setpublicfs)
           →  vfs_subr.c:2255-2295  ← the buggy sequence
```

The trigger (`expub.c`) issues an UPDATE mount of `/boot` (ufs) with
`struct ufs_args { .fspec=NULL, .export.ex_flags=MNT_EXPORTED|MNT_EXPUBLIC,
.export.ex_indexfile="index.html" }`. Each successful mount proves the entire
buggy path ran (the `:2266` indexfile branch is taken because `ex_indexfile !=
NULL`, so `:2269` `vn_get_namelen(rvp)` executes on the already-`vput` rvp).

Result on the audited kernel (`/root/expub /boot N`):

| run        | loops | path reached | panic | guest |
|------------|-------|--------------|-------|-------|
| run.log    | 16    | 16/16 ok     | none  | up    |
| run.stress | 512   | 512/512 ok   | none  | up    |

(Total: 16 + 512 = 528/528 successful triggering mounts.) Each `mount` returns 0,
so `vfs_setpublicfs` set `nfs_pub.np_valid=1` and `mp->mnt_flag |= MNT_EXPUBLIC`;
that is, the buggy `:2261`→`:2269` sequence completed end-to-end 528 times.

## Why no panic on this kernel (honest assessment)

1. **No `INVARIANTS`/`WITNESS`.** `sysctl kern.conftxt` shows no
   `INVARIANTS`/`WITNESS`/`DEBUG`/`VFS_DEBUG` in the `X86_64_GENERIC` config.
   A `WITNESS` build would flag the unlocked `VOP_PATHCONF` (vnode-lock
   protocol violation); an `INVARIANTS` build asserts `vn_lock` state in VOP
   entry and would panic. This kernel has neither, so the violation races
   silently.
2. **Mount-root vnode is not reclaimed in the window.** The `/boot` root vnode
   is held by the mount structure (`mp` keeps a reference to its root), so
   after `vput(rvp)` drops the usecount that `VFS_ROOT` added, the vnode's
   usecount remains ≥1 and `vnlru` does not reclaim it between `:2261` and
   `:2269`. The lock-protocol violation occurs, but the vnode is never
   actually freed, so no use-after-free follows. Concurrent vnode churn (the
   stress run) does not change this, because `vnlru` only reclaims vnodes
   with usecount 0.
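
The second point reduces to simple reference arithmetic, sketched below. The names `struct toy_rvp`, `toy_reclaimable`, and `toy_reclaimable_in_window` are hypothetical, and the "usecount 0 means eligible" rule is taken from the text above rather than from the `vnlru` source.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the root vnode: only its reference count. */
struct toy_rvp {
    int usecount;
};

/* Models the reclaim rule stated above: only usecount 0 is eligible. */
static bool toy_reclaimable(const struct toy_rvp *vp)
{
    return vp->usecount == 0;
}

/*
 * Reports whether the vnode is eligible for reclaim in the window between
 * vput and the later VOP call, given how many references other holders
 * (for example the mount structure) keep on it.
 */
static bool toy_reclaimable_in_window(int other_refs)
{
    struct toy_rvp vn;

    assert(other_refs >= 0);
    vn.usecount = other_refs;
    vn.usecount++;   /* VFS_ROOT adds a reference */
    vn.usecount--;   /* vput drops it again */
    return toy_reclaimable(&vn);
}
```

With one outside reference (the case observed on `/boot`), `toy_reclaimable_in_window(1)` is false; with none, `toy_reclaimable_in_window(0)` is true, which is the situation the use-after-free would need.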

So on a default kernel the use-after-`vput` half is a silent
correctness/robustness defect; on a `DEBUG`/`INVARIANTS` kernel (or with a
sufficiently adversarial vnode whose only reference was `VFS_ROOT`'s) it is
expected to panic. The leak half is deterministic but needs a
`vptofh`-failing export-capable filesystem, which this guest does not
provide.

## Impact

Genuine kernel memory-safety defect (CWE-416 use-after-free +
CWE-401 refcount leak + vnode-lock-protocol violation), but **gated behind
mount/export privilege** (`sys_mount` → `caps_priv_check_td` for the fs cap,
plus a separate check when `MNT_EXPORTED` is set, `vfs_syscalls.c:164-168`).
Direct exploitation requires root (or a delegated-mount / setuid-mount-helper
threat model). Rated Low. No panic/leak/uid0 was obtained on this kernel; the
value of the finding is the real, reachable code defect and the fix.

## Exploit chain

None developed. The defect is gated behind root and does not yield a
corruption primitive on this non-DEBUG kernel. A root attacker who could
trigger the true use-after-free variant (a root vnode whose only reference was
the one `VFS_ROOT` added, reclaimed inside the window) would get a vnode UAF
that could in principle be groomed into corruption, but that is not achievable
in the current guest configuration.

## PoC changes

Rewrote `expub.c` substantially. The original passed a bare `struct
export_args` (384 B) as the mount(2) data argument, but `ffs_mount` does
`copyin(data, &args, sizeof(struct ufs_args))` with `struct ufs_args` = 448 B
(`{ char *fspec; struct export_args export; }`). The original therefore fed
the kernel misaligned garbage and never reached the export path. The rewrite:

- Uses the correct `struct ufs_args` (`#include <vfs/ufs/ufsmount.h>`),
  `fspec=NULL`, `.export.ex_flags = MNT_EXPORTED|MNT_EXPUBLIC`,
  `.export.ex_indexfile = "index.html"`.
- Issues an **UPDATE** mount (`MNT_UPDATE`) of an already-mounted ufs fs
  (`/boot`), which is the only path in `ffs_mount` that processes export args
  (`ffs_vfsops.c:187,270-273`).
- Clears the public export with `MNT_DELEXPORT` between iterations so the
  singleton `nfs_pub` can be re-set (`vfs_subr.c:2246`).
- Loops and reports how many times the buggy path was reached.
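
Why the bare `struct export_args` never reached the export path can be seen with a layout model. The sketch below uses deliberately tiny, hypothetical `toy_*` structs (the real ones are much larger) and a `toy_copyin` that stands in for the kernel's fixed-size copy. It only demonstrates the field shift caused by the leading `fspec` pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; the real structs are larger and have more fields. */
struct toy_export_args {
    int   ex_flags;
    int   ex_root;
    char *ex_indexfile;
};

struct toy_ufs_args {
    char                  *fspec;
    struct toy_export_args export;
};

/* Models the kernel side: always copies sizeof(struct toy_ufs_args) bytes. */
static struct toy_ufs_args toy_copyin(const void *data)
{
    struct toy_ufs_args k;

    memcpy(&k, data, sizeof(k));
    return k;
}

/* True if the bytes the kernel reads as fspec are all zero. */
static bool toy_fspec_is_zero(const struct toy_ufs_args *k)
{
    static const unsigned char zero[sizeof(char *)];

    return memcmp(&k->fspec, zero, sizeof(zero)) == 0;
}

/* Original approach: a bare export struct sits at offset 0 of the buffer. */
static struct toy_ufs_args toy_pass_bare(int ex_flags)
{
    unsigned char buf[sizeof(struct toy_ufs_args)] = {0};
    struct toy_export_args bare;

    assert(sizeof(bare) < sizeof(buf));
    memset(&bare, 0, sizeof(bare));
    bare.ex_flags = ex_flags;
    memcpy(buf, &bare, sizeof(bare));
    return toy_copyin(buf);
}

/* Rewritten approach: export args wrapped in the struct the kernel expects. */
static struct toy_ufs_args toy_pass_wrapped(int ex_flags)
{
    struct toy_ufs_args a;

    memset(&a, 0, sizeof(a));   /* fspec bytes all zero, modelling NULL */
    a.export.ex_flags = ex_flags;
    return toy_copyin(&a);
}
```

In the bare case the flag bytes land where the kernel reads `fspec`, so the `fspec == NULL` branch is not taken and `export.ex_flags` holds unrelated bytes. In the wrapped case `fspec` is zero and the flags arrive intact.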

Added the repro glue (`build.sh`, `run.sh`) and the full evidence logs
(`build.log`, `run.log`, `run.stress.log`, `env.txt`, `manifest.json`,
`fix.diff`).

## How to reproduce

```
./build.sh                 # as maxx: cc -o expub expub.c
# as root:
cp expub /root/expub
/root/expub /boot 16       # reaches the buggy :2261->:2269 path 16 times
```

On a `DEBUG`/`INVARIANTS` kernel this is expected to panic in `VOP_PATHCONF`
on the unlocked vnode; on a default GENERIC kernel it races silently (the run
reports `vfs_setpublicfs UAF path reached ok=N`).

## References

- `sys/kern/vfs_subr.c:2255-2295` — `vfs_setpublicfs` (the buggy sequence).
- `sys/kern/vfs_subr.c:2547-2557` — `vn_get_namelen` → `VOP_PATHCONF` (needs locked vp).
- `sys/kern/vfs_lock.c:703-706` — `vput` = `vn_unlock` + `vrele` (drops lock AND ref).
- `sys/vfs/ufs/ufs_vfsops.c:58-68` — `ufs_root` → `VFS_VGET` (returns ref'd+locked).
- `sys/vfs/ufs/ffs_vfsops.c:1228-1239` — `ffs_vptofh` always returns 0 (leak path unreachable on UFS).
- `sys/kern/vfs_default.c:1562-1566` — `vfs_stdvptofh` returns `EOPNOTSUPP` (leak path for fs w/o vptofh).
- `sys/vfs/ufs/ffs_vfsops.c:187,270-273` — UPDATE-mount export path → `vfs_export`.
- `sys/kern/vfs_subr.c:2201-2204` — `vfs_export` → `vfs_setpublicfs` on `MNT_EXPUBLIC`.
- `sys/kern/vfs_syscalls.c:164-168` — export requires privilege (root-only).
- CWE-416 Use After Free; CWE-401 Missing Release of Memory after Effective Lifetime.
