TestOnlyTaggedPeersCanBeDialed has a race condition:
- The test untags ps[2] and waits until ps[0] sees this tag dropped from
ps[2] in the netmap.
- Later the test tries to dial ps[2] from ps[0] and expects the dial to
fail as authorization to dial relies on the presence of the tag, now
removed from ps[2].
- However, the authorization layer caches the status used to consult peer
tags. When the dial happens before the cache times out, the stale status
still shows the tag on ps[2], so the dial is authorized and the test
fails.
- Due to a bug in testcontrol.Server.UpdateNode, which the test uses to
remove the tag, netmap updates are not immediately triggered. The test
has to wait for the next natural set of netmap updates, which on my
machine takes about 22 seconds. By then, the cache in the authorization
layer has timed out, and the test passes.
- If one fixes the bug in UpdateNode, then netmap updates happen
immediately, the cache has not yet timed out when the dial occurs, and
the test fails.
Fixes #18720
Updates #18703
Signed-off-by: Harry Harpham <harry@tailscale.com>