# DF-0079 — VERDICT

## Verdict: REPRODUCED — trivial unprivileged local full-system DoS

A single `write(fd, buf, (size_t)1<<32)` on `/dev/null` by an unprivileged user
(uid 1001, not in wheel) wedges one CPU at 100% in kernel context **forever**.
The syscall never returns, the process is unkillable from userspace, and on a
small guest the whole machine becomes unreachable within ~1 second. Forking N
copies wedges N cores. Recovery is a hard reset only.

## Mechanism (every hop cited `path:line`, confirmed in audited master DEV)

1. `sys_write` (`sys/kern/sys_generic.c:336`) only rejects `(ssize_t)nbyte < 0`.
   `nbyte = 2³²` is positive as a 64-bit `ssize_t`, so it is accepted.
   `sys_generic.c:340` sets `aiov.iov_len = uap->nbyte` (= 2³²) and
   `:344` sets `auio.uio_resid = uap->nbyte` (= 2³²). No clamping anywhere.

2. The write reaches `mmwrite` → `mmrw` (`sys/kern/kern_memio.c:222`). The
   per-iteration byte count is declared **`u_int c;`** (32-bit) at `:225`.

3. `mmrw` enters `while (uio->uio_resid > 0 && error == 0)` at `:232`. The
   early guard `if (iov->iov_len == 0) { ... continue; }` at `:234` does NOT
   fire because the **full 64-bit** `iov_len` is 2³², not 0.

4. For `/dev/null` (minor 2), `case 2:` at `:292` returns early on read
   (`:296-297`) and on write executes `c = iov->iov_len;` at `:298` — a direct
   **`size_t` → `u_int` truncation**. With `iov_len = 0x100000000` the low 32
   bits are zero, so **`c = 0`**. There is **no `uiomove`/`copyin`** on this
   path (the user buffer pointer `(void*)0x1` is never dereferenced), so no
   `EFAULT` rescues us.

5. After `break` at `:299`, control falls to the bookkeeping at `:377-382`:
   `iov->iov_base += c` (+= 0), `iov->iov_len -= c` (-= 0 → still 2³²),
   `uio->uio_offset += c` (+= 0), `uio->uio_resid -= c` (-= 0 → still 2³²).

6. The `while` predicate at `:232` is still true (`uio_resid == 2³² > 0`,
   `error == 0`). Steps 3–5 repeat with **zero net change** to the loop state.
   `mmrw` spins forever in kernel context on the calling CPU.

7. The mem cdev is `D_MPSAFE | D_QUICK` (`:85`) and `mmrw` holds no lock while
   spinning, so the wedge is a **tight, unyielding kernel loop**, not a
   deadlock on a lock. It never blocks, never calls
   `lwkt_yield`/`uiomove`/`tsleep`, and never reaches a signal-check point, so:
   - the thread is stuck in state **`R`** (running) consuming ~100% of one CPU;
   - it cannot be preempted or signalled (SIGKILL never takes effect);
   - if that CPU also services the network IRQ (vtnet), sshd is starved and the
     guest becomes unreachable within ~1 s.
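
The two arithmetic facts that steps 1 and 4 rely on can be checked in userspace. This is a minimal sketch, not kernel code: `guard_rejects` and `truncate_len` are hypothetical helper names, `int64_t` stands in for a 64-bit `ssize_t`, and `uint32_t` stands in for `u_int`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the length guard from step 1: a length is rejected only if it
 * is negative when reinterpreted as a signed 64-bit value.
 * Assumption: int64_t matches the kernel's ssize_t on this platform, and
 * the unsigned-to-signed conversion wraps (true on mainstream compilers). */
static bool guard_rejects(uint64_t nbyte)
{
    return (int64_t)nbyte < 0;
}

/* Model of `c = iov->iov_len` from step 4 with a 32-bit `c`:
 * only the low 32 bits of the length survive the assignment. */
static uint32_t truncate_len(uint64_t iov_len)
{
    return (uint32_t)iov_len;
}
```

Under these assumptions, `guard_rejects(0x100000000ULL)` is false while `truncate_len(0x100000000ULL)` is 0, which is the combination the mechanism depends on: the length passes the guard and then truncates to a zero-byte step.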

`/dev/null` and `/dev/zero` are created mode `0666` (`:842`/`:847`), so the
attack is reachable by **any local user**. `/dev/zero` write (`case 12`,
`:363-365`) has the identical `c = iov->iov_len` truncation and the same
infinite loop. (`/dev/kmem` `:264` has it too but is root-only.)
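
The stalled loop in steps 3 to 6 can be modelled in userspace. The sketch below is an illustration under stated assumptions, not the kernel source: `struct model_uio` and `model_null_write_u32` are hypothetical names, only a single iovec is modelled, and an iteration cap is added because the real loop has no exit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields the loop's predicate and
 * bookkeeping depend on (one iovec only). */
struct model_uio {
    uint64_t iov_len;  /* models iov->iov_len  */
    uint64_t resid;    /* models uio->uio_resid */
};

/* Replays the /dev/null write bookkeeping with a 32-bit per-iteration
 * count. Returns the number of iterations executed; `cap` bounds the run
 * because the modelled loop may never terminate on its own. */
static uint64_t model_null_write_u32(struct model_uio *u, uint64_t cap)
{
    uint64_t iters = 0;

    assert(cap > 0);
    while (u->resid > 0 && iters < cap) {
        if (u->iov_len == 0)
            break;  /* real code would advance to the next iovec */
        uint32_t c = (uint32_t)u->iov_len;  /* truncating assignment */
        u->iov_len -= c;                    /* -= 0 when low 32 bits are 0 */
        u->resid -= c;
        iters++;
    }
    return iters;
}
```

Starting from `iov_len = resid = 0x100000000`, the model uses every iteration it is given and leaves `resid` unchanged. Within this model, a length such as `0x100000005` also stalls: the first pass consumes 5 bytes and every later pass consumes 0.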

## Reproduced evidence

- `serial_wedge_capture.txt` — a serial-console watcher (writing `ps`/`top` to
  `/dev/ttyd0`, which lands in the host `boot.log` and survives reset) captured
  the wedged process across 18 iterations. Decisive excerpt:
  ```
  PID PPID STAT UID %CPU  TIME    COMMAND
  852 1    R0   1001 0.0  0:00.50 ./df0079   (t+0.0s)
  852 1    R0   1001 0.0  0:01.68 ./df0079   (t+0.7s)
  ...
  852 1    R0   1001 0.0  0:20.56 ./df0079   (t+19.5s)
  CPU states: 0.0% user, 0.0% nice, 50.0% system, 0.0% interrupt, 50.0% idle
  ```
  STAT `R0` = running on CPU (not blocked); `UID 1001` = unprivileged `maxx`;
  cputime grows from 0:00.50 to 0:20.56 across the 19.5 s excerpt (~1.03 s per
  wall-second), i.e. ~100% of one of two CPUs; the residual never drains.
  (50% system / 50% idle = one CPU fully wedged; the fork-N variant takes the
  rest.)
- `run.log` — host-side timeline: a single detached `./df0079` (one process)
  made the guest **unresponsive to ssh within 1 second**; ssh stayed unreachable
  for the full 10 s observation window. No kernel panic (clean starvation, per
  `boot.log`). Only `vm.sh reset` recovered the guest. Reproduced across three
  independent runs (each requiring a reset).
- `build.log`, `env.txt` — clean build; `/dev/null` & `/dev/zero` both
  `crw-rw-rw-` (0666); the vulnerable `u_int c;` decl confirmed at `:225`.
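
The CPU-time rate can be cross-checked from the first and last samples in the `serial_wedge_capture.txt` excerpt, averaged over the whole window rather than a single interval. The helper below is a small illustrative sketch; `cpu_rate` is a hypothetical name, and the inputs are the TIME and `t+` values quoted above.

```c
/* Average CPU seconds accrued per wall-clock second between two samples.
 * cpu_* are the TIME column values in seconds; wall_* are the t+ offsets. */
static double cpu_rate(double cpu_start, double cpu_end,
                       double wall_start, double wall_end)
{
    return (cpu_end - cpu_start) / (wall_end - wall_start);
}
```

For the quoted samples, `cpu_rate(0.50, 20.56, 0.0, 19.5)` is 20.06 / 19.5, roughly 1.03, which is what one fully occupied CPU would accrue.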

## Exploit chain / weaponization

Not a memory-corruption primitive — there is no corruption, only an infinite
loop — so there is no escalation chain. The realistic impact ceiling is
**reliable unprivileged local full-system denial of service**:
- One `write()` per CPU pegs every core at 100% in-kernel and never returns.
- The wedged threads are unkillable (no signal-check point), so even an
  administrator cannot recover short of a reboot.
- Trivially scriptable: a one-line attacker
  (`python -c 'import os; os.write(os.open("/dev/null",1),b"x"*(1<<32))'`,
  which first has to allocate a 4 GiB userspace buffer, or the C PoC, which
  does not) run by any user, or auto-started at login by a compromised
  low-priv account, takes the whole machine down indefinitely.

## Fix (in `fix.diff`)

Root-cause, one-line fix: widen the per-iteration byte count from 32-bit to
64-bit so `c = iov->iov_len` cannot truncate:

```diff
--- a/sys/kern/kern_memio.c
+++ b/sys/kern/kern_memio.c
@@ -222,7 +222,7 @@
 mmrw(cdev_t dev, struct uio *uio, int flags)
 {
 	int o;
-	u_int c;
+	size_t		c;
```

With `c` being `size_t`, `c = iov->iov_len` (= 2³²) no longer truncates; the
bookkeeping at `:380`/`:382` subtracts the full 2³², draining `uio_resid` to 0
in a single iteration, so the `while` predicate at `:232` exits and the write
returns normally. This closes `/dev/null` (`:298`), `/dev/zero` (`:364`) **and**
`/dev/kmem` (`:264`) in one change. All existing function-argument uses of `c`
(`uiomove`/`read_random`/`add_buffer_randomness_src` at `:253/:289/:309/:314/:319/:328/:354/:372`)
operate on values already bounded by `min(…, PAGE_SIZE)`, so widening `c` to
`size_t` compiles cleanly and changes no bounded-path behaviour.

This **supersedes** the finding markdown's per-case `c = min(iov->iov_len, PAGE_SIZE)`
clamp proposal: that also works for `/dev/null` and `/dev/zero` but would need
separate edits at three sites and would not fix the latent `/dev/kmem` (`:264`)
truncation. Widening `c` is the single-change root-cause fix the finding itself
names as "cleanest".
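
The difference between the widening fix and the per-case clamp can be illustrated with a userspace model of the `/dev/null` bookkeeping. This is a sketch under assumptions, not the patched kernel code: `drain_widened`, `drain_clamped`, `model_min` and `MODEL_PAGE_SIZE` (assumed 4096) are hypothetical names, and a single iovec is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u  /* assumed page size for the model */

static uint64_t model_min(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Widened count: `c` holds the full 64-bit length, so one pass drains it.
 * Returns the number of iterations executed, bounded by `cap`. */
static uint64_t drain_widened(uint64_t resid, uint64_t cap)
{
    uint64_t iov_len = resid, iters = 0;

    assert(cap > 0);
    while (resid > 0 && iters < cap) {
        uint64_t c = iov_len;  /* no truncation */
        iov_len -= c;
        resid -= c;
        iters++;
    }
    return iters;
}

/* Clamped count: `c` stays 32-bit but is bounded by the page size before
 * the assignment, so it is never truncated to zero while data remains. */
static uint64_t drain_clamped(uint64_t resid, uint64_t cap)
{
    uint64_t iov_len = resid, iters = 0;

    assert(cap > 0);
    while (resid > 0 && iters < cap) {
        uint32_t c = (uint32_t)model_min(iov_len, MODEL_PAGE_SIZE);
        iov_len -= c;
        resid -= c;
        iters++;
    }
    return iters;
}
```

In this model both variants terminate for a 2³² byte write: the widened count finishes in one iteration, while the clamped count needs 2³² / 4096 = 1,048,576 iterations.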

## PoC changes made during verification

- `df0079.c`: rewritten with an explanatory header citing every relevant
  `kern_memio.c` line, a `trigger()` helper, and a fork-N mode. The core
  trigger is unchanged from the original PoC: `write(fd, (void*)0x1, 0x100000000ULL)`.
- Added `watch_df0079.sh` (serial-console observer), `build.sh`, `run.sh`,
  `env.txt`, `VERDICT.md`, this `README.md`, `fix.diff`, `manifest.json`, and
  the full logs (`build.log`, `run.log`, `serial_wedge_capture.txt`).
