# DF-0001 — Verdict

**Verdict: REPRODUCED** (deterministic kernel panic, confirmed across two
independent runs including one from a fresh `vm.sh reset`).

**Impact:** local denial-of-service (kernel panic) on `INVARIANTS`+quota
hosts; no memory corruption, no privilege escalation. Rated **Low** by the
finding — confirmed.

---

## What the bug is

`kern_truncate()` and `kern_ftruncate()` (both reachable from any local user
with write permission to a file via `truncate(2)`/`ftruncate(2)`) call
`VOP_GETATTR`/`VOP_GETATTR_FP` to read the file's uid/gid/size for quota
accounting, and **unconditionally `KASSERT` that the call succeeded**:

- `sys/kern/vfs_syscalls.c:4036-4042` — `kern_truncate`:
  ```c
  if (vfs_quota_enabled) {
      error = VOP_GETATTR(vp, &vattr);
      KASSERT(error == 0, ("kern_truncate(): VOP_GETATTR didn't return 0"));   /* :4038 */
      ...
  }
  ```
- `sys/kern/vfs_syscalls.c:4111-4117` — `kern_ftruncate`:
  ```c
  if (vfs_quota_enabled) {
      error = VOP_GETATTR_FP(vp, &vattr, fp);
      KASSERT(error == 0, ("kern_ftruncate(): VOP_GETATTR didn't return 0"));  /* :4113 */
      ...
  }
  ```

`KASSERT` expands to a `panic()` call under `options INVARIANTS` and to a
no-op otherwise (`sys/sys/systm.h:94-117`). The block is gated only by the
**global** `vfs_quota_enabled` (a boot loader tunable,
`sys/kern/vfs_quota.c:112-115`, `CTLFLAG_RD` so settable only at boot), not
by any per-mount check or any check of whether the filesystem's GETATTR can
fail. `VOP_GETATTR` is **not** guaranteed to succeed: any filesystem whose
`vop_getattr` can return a nonzero error after the vnode has been looked up
and locked trips the assertion. There is no privilege check before the KASSERT.
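
The gating described above can be condensed into a small userland model. This is only a sketch for reasoning about which combinations are fatal; `model_quota_block`, `MODEL_PANICS`, and `MODEL_CONTINUES` are hypothetical names, and nothing here is kernel code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcome labels for the model; not kernel identifiers. */
enum model_outcome { MODEL_CONTINUES, MODEL_PANICS };

/*
 * Models the quota block in kern_truncate()/kern_ftruncate():
 *   invariants     - kernel built with "options INVARIANTS"
 *   quota_enabled  - value of the global vfs_quota_enabled tunable
 *   getattr_error  - what VOP_GETATTR / VOP_GETATTR_FP returned
 */
static enum model_outcome
model_quota_block(bool invariants, bool quota_enabled, int getattr_error)
{
	if (!quota_enabled)
		return MODEL_CONTINUES;	/* block skipped, GETATTR never called */
	if (getattr_error != 0 && invariants)
		return MODEL_PANICS;	/* KASSERT(error == 0, ...) fails */
	/* GETATTR succeeded, or the KASSERT is compiled out. */
	return MODEL_CONTINUES;
}
```

In this model only the combination of `INVARIANTS`, quota enabled, and a nonzero GETATTR result is fatal, which corresponds to the preconditions checked in the next section.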

## Precondition check on the tested guest (master DEV, `X86_64_GENERIC`)

| Precondition                                   | Status on guest                                                                 |
|------------------------------------------------|---------------------------------------------------------------------------------|
| `options INVARIANTS` in kernel config          | **PRESENT** — `sys/config/X86_64_GENERIC` has `options INVARIANTS` (not commented) |
| KASSERT compiled in (panic, not no-op)         | **YES** — `strings /boot/kernel/kernel` shows **both** panic strings (`kern_truncate(): VOP_GETATTR didn't return 0`, `kern_ftruncate(): ...`) |
| `vfs.quota_enabled=1`                          | default `0`; set to `1` via `/boot/loader.conf` `vfs.quota_enabled="1"` + reboot (sysctl is `CTLFLAG_RD`) |
| Reachable from unprivileged user               | **YES** — `sys_truncate`→`kern_truncate`, `sys_ftruncate`→`kern_ftruncate`, no `suser`/`priv_check` before the KASSERT |
| A VFS whose `VOP_GETATTR` can return nonzero   | local hammer2/UFS GETATTR is effectively infallible → must use a network FS; loopback NFS used (see below) |

The guest's `X86_64_GENERIC` kernel **does** ship `INVARIANTS`, so the
KASSERT is a live `panic()` — contrary to the common assumption that
production GENERIC kernels leave `INVARIANTS` off. This is what makes the
bug fire on this exact kernel.

## How the panic was triggered (the ESTALE path)

The finding's narrative names two GETATTR-failure modes: an NFS transient
(`ESTALE`/`EIO`) and a forced-reclaim vnode. Empirically, on DragonFly
master:

- **Dead-server (transport-failure) does NOT reach the KASSERT.** When the
  NFS server is killed, the client logs `nfs server ... not responding` /
  `nfs send error 61` (ECONNREFUSED) and `nfs_getattr()`
  (`sys/vfs/nfs/nfs_vnops.c:685-738`) **returns cached/local attributes with
  `error=0`** rather than propagating the transport error (the attribute
  cache at `sys/vfs/nfs/nfs_subs.c:885-933` serves the GETATTR, and for a
  client-written file the `NLMODIFIED` local-attr path makes the hit
  sticky). So `truncate()`/`ftruncate()` propagate the error only from the
  later `VOP_SETATTR` RPC (as `EINTR`/`EIO`), and the KASSERT — which sits
  on `GETATTR`, before `SETATTR` — never sees a nonzero value. *The most
  obvious "kill the NFS server" scenario does not trip this bug on master.*
- **A genuine application-level GETATTR error DOES reach the KASSERT.** The
  clean way to force `nfs_getattr()` to return a nonzero error is **ESTALE**:
  the client holds an open `fd` (fixed vnode/filehandle), the server deletes
  and recreates the file (new inode → old filehandle is now stale), and the
  server — still **up and responding** — returns `NFSERR_STALE` for the
  GETATTR RPC on the stale handle. `nfsm_request` + the `NEGKEEPOUT`/
  `ERROROUT` macros (`sys/vfs/nfs/nfsm_subs.h:109-124`) propagate that
  error out of `nfs_getattr()` (`return (error)` at `nfs_vnops.c:737`), so
  `VOP_GETATTR_FP` returns `ESTALE` to `kern_ftruncate`, and the KASSERT at
  `vfs_syscalls.c:4113` fires.
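
The difference between the two failure modes can be pictured with a deliberately simplified model. It is a toy, not the DragonFly NFS client: `model_nfsnode`, `attrs_served_locally`, and `model_nfs_getattr` are hypothetical names, and the real `nfs_getattr()` has many more branches than this.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the client's per-file state. */
struct model_nfsnode {
	bool attrs_served_locally;	/* cached or locally modified attrs usable */
};

/*
 * rpc_error is what a GETATTR RPC would yield if one were sent:
 * a transport failure such as ECONNREFUSED, or a server reply
 * such as ESTALE.
 */
static int
model_nfs_getattr(const struct model_nfsnode *np, int rpc_error)
{
	if (np->attrs_served_locally)
		return 0;	/* no RPC result is consulted, so no error surfaces */
	return rpc_error;	/* the server's answer is propagated to the caller */
}
```

Under these assumptions the dead-server case corresponds to `attrs_served_locally = true` (the result is 0 regardless of the transport error), while the ESTALE case corresponds to an RPC that is actually answered with an error.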

### Trigger choreography (see `run.sh`)

1. Boot `vfs.quota_enabled=1` (loader tunable, reboot).
2. Stand up a loopback NFS server (`rpcbind`/`mountd`/`nfsd`) exporting
   `/export`; NFS-mount it **soft, UDP, attribute-cache disabled**
   (`mount_nfs -U -s -x 1 -t 1 -o acregmin=0,acregmax=0,...`).
3. As the unprivileged user `maxx` (uid 1001, not in `wheel`), `open()`
   `/mnt/estale_target` → `fd` holds a fixed vnode/filehandle; sleep.
4. **Server-side:** `rm /export/estale_target && touch /export/estale_target`
   (new inode; the client `fd`'s handle is now stale; server still UP).
5. The process wakes and calls `ftruncate(fd, 0)`:
   `kern_ftruncate` → `VOP_GETATTR_FP` → NFS GETATTR RPC on the stale
   filehandle → server returns `NFSERR_STALE` → `nfs_getattr` returns
   `ESTALE` → `KASSERT(error == 0, ...)` at `vfs_syscalls.c:4113` → **panic**.
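
The client side of steps 3 and 5 can be sketched as follows. This is an illustrative helper under stated assumptions, not the contents of `estale_trig.c`; the name `ftruncate_after_delay` and its parameters are hypothetical. The server-side `rm` and `touch` from step 4 are assumed to happen during the sleep.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the file, hold the descriptor across a delay
 * (the window for the server-side delete and recreate), then truncate
 * through the already-open fd. Returns 0 on success, otherwise the errno
 * of the call that failed.
 */
static int
ftruncate_after_delay(const char *path, unsigned int delay_sec)
{
	int fd, saved = 0;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return errno;

	sleep(delay_sec);		/* handle can go stale during this window */

	if (ftruncate(fd, 0) != 0)	/* enters kern_ftruncate() in the kernel */
		saved = errno;

	close(fd);
	return saved;
}
```

On a local filesystem this simply returns 0. On the affected configuration the `ftruncate()` call would not return at all, since the kernel panics inside `kern_ftruncate`; with an error-return fix in place, the call would presumably fail with `ESTALE` instead.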

## Decisive evidence

Serial-console panic signature (`dfbsd-qemu/boot.log`), identical across the
initial run and the fresh-`vm.sh reset` confirmation run:

```
panic: kern_ftruncate(): VOP_GETATTR didn't return 0
cpuid = 0
Trace beginning at frame 0xfffff800abb23798
kern_ftruncate() at kern_ftruncate+0x152 0xffffffff80705532
kern_ftruncate() at kern_ftruncate+0x152 0xffffffff80705532
sys_xsyscall() at sys_xsyscall+0x89 0xffffffff80bd6749
syscall2() at syscall2+0x11e 0xffffffff80bd611e
Debugger("panic")
Stopped at Debugger+0x7c: movb $0,0xbd77f9(%rip)
db>
```

The panic names exactly the function the finding cites (`kern_ftruncate`,
KASSERT at `:4113` → `+0x152` in the disassembly), reached via the normal
`syscall2` → `sys_xsyscall` → `kern_ftruncate` path from an unprivileged
`ftruncate(2)`. The `kern_truncate` twin at `:4038` is the same bug; the
`ftruncate` variant was used for the demonstration only because the ESTALE
choreography is cleanest on an open `fd`. (`trunc_panic.c` / `trunc_only.c`
are retained as the path-truncate variants and the local-FS/no-panic baselines.)

## Exploit chain

**None: this is not a memory-corruption bug class.** It is a reachable
assertion (CWE-617), and the only primitive is a kernel `panic()` (DoS).
No bytes are corrupted, no pointers are hijacked, no privilege changes. The
realistic impact ceiling is **denial of service of an INVARIANTS+quota host
by any local user with write access to a file on a GETATTR-failing (NFS)
mount.** No further primitive is derivable.

## PoC changes made during verification

- **Added `estale_trig.c`** — the trigger that actually fires the panic. The
  original `trunc_panic.c` (path-based `truncate` against a "failing GETATTR
  FS") does not fire on master because, as traced above, a *transport*
  GETATTR failure is papered over by the NFS attribute cache; only an
  *application-level* GETATTR error (ESTALE on a stale open-fd handle)
  reaches the KASSERT. `estale_trig.c` implements that choreography.
- **Added `trunc_only.c`** — errno-printing diagnostic that proved (via the
  `EINTR`-from-SETATTR result) that the dead-server path returns
  `GETATTR=0` and therefore cannot trip the KASSERT. Kept as the negative
  evidence / baseline.
- Updated `trunc_panic.c` to print `errno` values and to populate the target file.
- **Added `run.sh`** — the multi-step reproducer (quota reboot + loopback
  NFS + ESTALE handle invalidation), since the bug needs three runtime
  preconditions the clean-install guest lacks. `build.sh` builds all three
  sources as the unprivileged user.

## Recommended fix

Convert both KASSERTs into proper error returns. The `done:` labels already
perform the correct cleanup: `vput(vp)` for `kern_truncate` at `:4051` and
`fdrop(fp)` for `kern_ftruncate` at `:4128`. In `kern_ftruncate` the error
path must call `vn_unlock(vp)` before jumping, because its `done:` label sits
after the `vn_unlock` at `:4126`. The standalone `git apply`-able diff is
`fix.diff`; it applies cleanly to `sys/kern/vfs_syscalls.c`. **This matches
the finding markdown's `## Recommended fix` proposal** (same error-return +
cleanup shape); the runner's diff additionally carries the explicit
`vn_unlock(vp)` before the `goto done` in `kern_ftruncate` to avoid leaking
the vnode lock.
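
The control flow of the `kern_ftruncate` error path can be checked with a small userland model. This is a sketch of the shape only, not the contents of `fix.diff`: `model_state` and `model_kern_ftruncate_fixed` are hypothetical names, and counters stand in for the vnode lock and the file reference.

```c
#include <errno.h>

/* Hypothetical bookkeeping; each counter should be 0 again on return. */
struct model_state {
	int vnode_locks;	/* outstanding vnode lock holds */
	int file_refs;		/* outstanding references on fp */
};

static int
model_kern_ftruncate_fixed(struct model_state *st, int getattr_error)
{
	int error;

	st->file_refs++;		/* models the reference that fdrop(fp) releases */
	st->vnode_locks++;		/* models taking the vnode lock */

	error = getattr_error;		/* models VOP_GETATTR_FP(vp, &vattr, fp) */
	if (error != 0) {
		st->vnode_locks--;	/* models vn_unlock(vp) before the jump */
		goto done;
	}

	/* ... quota accounting and VOP_SETATTR would run here ... */

	st->vnode_locks--;		/* models the existing vn_unlock(vp) */
done:
	st->file_refs--;		/* models fdrop(fp) */
	return error;
}
```

In the model, both the success path and the GETATTR-failure path leave the two counters balanced. Removing the decrement inside the `if` reproduces the lock leak that the explicit `vn_unlock(vp)` is there to prevent.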
