Use bufio.Writer.AvailableBuffer to write the frame header directly
into bufio's internal buffer as a single append+Write, avoiding 5
separate WriteByte calls. Fall back to the existing writeUint32
byte-at-a-time path when the buffer has insufficient space.
```
name old ns/op new ns/op speedup
WriteFrameHeader-8 18.8 7.8 ~2.4x
(0 allocs/op in both)
```
Add TestWriteFrameHeader with correctness
checks, allocation assertions, and coverage of both fast and slow
write paths. Move BenchmarkReadFrameHeader from client_test.go to
derp_test.go alongside BenchmarkWriteFrameHeader, co-located with
the functions under test.
Updates tailscale/corp#38509
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>