tstest/largetailnet, tstest/integration/testcontrol: add in-process large-tailnet benchmark

Add a Go benchmark that exercises a single tailnet client (a [tsnet.Server]
running in the test process) against a synthetic large initial netmap and
a stream of caller-driven peer add/remove deltas, all in-process.

The harness is split in two parts:

  - tstest/largetailnet, a reusable package containing a [Streamer]
    that hijacks the map long-poll on a [testcontrol.Server] via the new
    AltMapStream hook, sends one initial MapResponse with N synthetic
    peers, and forwards caller-supplied delta MapResponses on the same
    stream. Helpers like MakePeer / AllocPeer build synthetic peers with
    unique IDs and addresses derived from the Tailscale ULA range.

  - tstest/largetailnet/largetailnet_test.go, BenchmarkGiantTailnet
    (headless tailscaled workload, no IPN bus subscriber) and
    BenchmarkGiantTailnetBusWatcher (GUI-client workload with one
    Notify subscriber attached). Both are gated on
    --actually-test-giant-tailnet (skipped by default), stand up an
    in-process testcontrol + tsnet.Server, let Up block until the
    initial N-peer netmap has been processed, then ResetTimer and run
    add+remove pairs via b.Loop. Each delta is synchronized through a
    test-only [ipnlocal.LocalBackend.AwaitNodeKeyForTest] channel that
    closes once the just-added peer's key appears in the netmap
    (no-watcher variant), or by draining Notify messages from the IPN
    bus (bus-watcher variant).
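
The close-on-first-appearance synchronization used by the no-watcher
variant can be sketched in isolation. This is a simplified stand-in, not
the code in this change: the registry type, its methods, and the string
keys (standing in for key.NodePublic) are hypothetical names chosen for
illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// registry is a hypothetical, simplified stand-in for a peer index plus a
// test-only waiter map. String keys stand in for key.NodePublic.
type registry struct {
	mu      sync.Mutex
	present map[string]bool
	waiters map[string]chan struct{}
}

func newRegistry() *registry {
	return &registry{
		present: make(map[string]bool),
		waiters: make(map[string]chan struct{}),
	}
}

// await returns a channel that is closed once k is present. If k is already
// present, the returned channel is already closed.
func (r *registry) await(k string) <-chan struct{} {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.present[k] {
		ch := make(chan struct{})
		close(ch)
		return ch
	}
	if ch, ok := r.waiters[k]; ok {
		return ch // share one channel among all waiters for k
	}
	ch := make(chan struct{})
	r.waiters[k] = ch
	return ch
}

// add marks k as present and wakes any waiter, as a netmap update would.
func (r *registry) add(k string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.present[k] = true
	if ch, ok := r.waiters[k]; ok {
		close(ch)
		delete(r.waiters, k)
	}
}

func main() {
	r := newRegistry()
	done := r.await("peer-1") // register before the delta is sent
	go r.add("peer-1")        // simulates the client applying the delta
	select {
	case <-done:
		fmt.Println("peer-1 applied")
	case <-time.After(time.Second):
		fmt.Println("timed out")
	}
}
```

Registering the waiter before sending the delta means the benchmark
blocks on a channel receive instead of polling, so the wait itself adds
little to the measured time.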

To support the hijack, [testcontrol.Server] grows an AltMapStream hook
and a small MapStreamWriter interface for benchmarks/stress tests that
need to drive a controlled MapResponse sequence; the normal serveMap
path is untouched when AltMapStream is nil. The streamer answers
non-streaming "lite" map polls (which controlclient issues before the
streaming long-poll to push HostInfo) with an empty MapResponse and
returns immediately, so the streaming poll that follows is the one
that gets the initial netmap.
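
As a rough illustration of the lite-versus-streaming split described
above, the dispatch can be sketched as follows. The hook and interface
signatures are not shown in this excerpt, so every type and function
name below (mapRequest, mapResponse, streamWriter, serveAltStream) is a
hypothetical stand-in rather than the testcontrol API.

```go
package main

import (
	"errors"
	"fmt"
)

// mapRequest and mapResponse are simplified stand-ins for the tailcfg
// request/response types; only the fields needed here are modeled.
type mapRequest struct {
	Stream bool // false for a non-streaming "lite" poll
}

type mapResponse struct {
	Peers []string
}

// streamWriter is a hypothetical stand-in for the MapStreamWriter interface.
type streamWriter interface {
	Send(*mapResponse) error
}

// serveAltStream answers lite polls with an empty response and returns.
// Streaming polls get the initial netmap followed by each delta until the
// deltas channel is closed.
func serveAltStream(req mapRequest, w streamWriter, initial *mapResponse, deltas <-chan *mapResponse) error {
	if !req.Stream {
		return w.Send(&mapResponse{})
	}
	if err := w.Send(initial); err != nil {
		return err
	}
	for d := range deltas {
		if err := w.Send(d); err != nil {
			return err
		}
	}
	return nil
}

// recorder collects everything sent to it, for inspection.
type recorder struct {
	sent []*mapResponse
}

func (r *recorder) Send(m *mapResponse) error {
	if m == nil {
		return errors.New("nil MapResponse")
	}
	r.sent = append(r.sent, m)
	return nil
}

func main() {
	initial := &mapResponse{Peers: []string{"peer-1", "peer-2"}}
	deltas := make(chan *mapResponse, 1)
	deltas <- &mapResponse{Peers: []string{"peer-3"}}
	close(deltas)

	lite := &recorder{}
	if err := serveAltStream(mapRequest{Stream: false}, lite, initial, deltas); err != nil {
		fmt.Println("lite poll error:", err)
	}
	stream := &recorder{}
	if err := serveAltStream(mapRequest{Stream: true}, stream, initial, deltas); err != nil {
		fmt.Println("streaming poll error:", err)
	}
	fmt.Println(len(lite.sent), len(stream.sent)) // prints "1 2"
}
```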

The benchmark is intended for before/after comparisons of netmap- and
delta-handling changes targeted at large tailnets. CPU profiles on
unmodified main show the expected O(N) hotspots:
setControlClientStatusLocked / authReconfigLocked /
userspaceEngine.Reconfig / setNetMapLocked, plus JSON encoding of the
full Notify.NetMap to bus watchers (which dominates the BusWatcher
variant).

Median ms/op over 10 runs on unmodified main, by tailnet size N:

       N      no-watcher   bus-watcher
   10000          32          166
   50000         222          865
  100000         504         1765
  250000        1551         4696

Recommended invocation:

	go test ./tstest/largetailnet/ -run=^$ \
	    -bench='BenchmarkGiantTailnet(BusWatcher)?$' \
	    -benchtime=2000x -timeout=10m \
	    --actually-test-giant-tailnet \
	    --giant-tailnet-n=250000 \
	    -cpuprofile=/tmp/giant.cpu.pprof

Updates #12542

Change-Id: I4f5b2bb271a36ba853d5a0ffe82054ef2b15c585
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Brad Fitzpatrick
2026-04-26 22:48:05 +00:00
committed by Brad Fitzpatrick
parent 33342aec32
commit ad5436af0d
6 changed files with 601 additions and 1 deletions
+8
@@ -2471,6 +2471,14 @@ func (b *LocalBackend) PeersForTest() []tailcfg.NodeView {
	return b.currentNode().PeersForTest()
}

// AwaitNodeKeyForTest returns a channel that is closed once a peer with the
// given node key first appears in the current netmap. If the peer is already
// present, the returned channel is already closed. See
// [nodeBackend.AwaitNodeKeyForTest].
func (b *LocalBackend) AwaitNodeKeyForTest(k key.NodePublic) <-chan struct{} {
	return b.currentNode().AwaitNodeKeyForTest(k)
}

func (b *LocalBackend) getNewControlClientFuncLocked() clientGen {
	if b.ccGen == nil {
		// Initialize it rather than just returning the
+45
@@ -29,6 +29,7 @@ import (
"tailscale.com/util/eventbus"
"tailscale.com/util/mak"
"tailscale.com/util/slicesx"
"tailscale.com/util/testenv"
"tailscale.com/wgengine/filter"
)
@@ -107,6 +108,12 @@ type nodeBackend struct {
	// nodeByKey is an index of node public key to node ID for fast lookups.
	// It is mutated in place (with mu held) and must not escape the [nodeBackend].
	nodeByKey map[key.NodePublic]tailcfg.NodeID

	// keyWaitersForTest is the test-only registry of channels waiting for
	// a given peer key to first appear in the netmap. See
	// [nodeBackend.AwaitNodeKeyForTest]. It is populated lazily and remains
	// nil in production, where no test installs a waiter.
	keyWaitersForTest map[key.NodePublic]chan struct{}
}

func newNodeBackend(ctx context.Context, logf logger.Logf, bus *eventbus.Bus) *nodeBackend {
@@ -421,6 +428,7 @@ func (nb *nodeBackend) SetNetMap(nm *netmap.NetworkMap) {
	nb.updateNodeByAddrLocked()
	nb.updateNodeByKeyLocked()
	nb.updatePeersLocked()
	nb.signalKeyWaitersForTestLocked()
	if nm != nil {
		nb.derpMapViewPub.Publish(nm.DERPMap.View())
	} else {
@@ -428,6 +436,43 @@ func (nb *nodeBackend) SetNetMap(nm *netmap.NetworkMap) {
	}
}

// AwaitNodeKeyForTest returns a channel that is closed once a peer with the
// given node key first appears in this nodeBackend's peer index, or
// immediately (a closed channel) if it's already present. It is intended for
// in-process benchmarks that drive synthetic netmap deltas and need a
// zero-overhead signal that the client has applied a delta, replacing
// poll-based [local.Client.WhoIsNodeKey] loops in tests. It panics outside
// of tests.
func (nb *nodeBackend) AwaitNodeKeyForTest(k key.NodePublic) <-chan struct{} {
	testenv.AssertInTest()
	nb.mu.Lock()
	defer nb.mu.Unlock()
	if _, ok := nb.nodeByKey[k]; ok {
		return syncs.ClosedChan()
	}
	if ch, ok := nb.keyWaitersForTest[k]; ok {
		return ch
	}
	ch := make(chan struct{})
	mak.Set(&nb.keyWaitersForTest, k, ch)
	return ch
}

// signalKeyWaitersForTestLocked closes any waiter channels whose keys now
// appear in nb.nodeByKey. It is cheap when there are no waiters, which is
// the common case in production. It is called from [nodeBackend.SetNetMap]
// after the per-key index has been rebuilt.
//
// Caller must hold nb.mu.
func (nb *nodeBackend) signalKeyWaitersForTestLocked() {
	for k, ch := range nb.keyWaitersForTest {
		if _, ok := nb.nodeByKey[k]; ok {
			close(ch)
			delete(nb.keyWaitersForTest, k)
		}
	}
}

func (nb *nodeBackend) updateNodeByAddrLocked() {
	nm := nb.netMap
	if nm == nil {