tstest/natlab, .github/workflows: add opt-in natlab CI workflow

The natlab vmtest suite (tstest/natlab/vmtest) and the integration nat
tests are gated behind --run-vm-tests because they need KVM and are
slow. Until now nothing in CI exercised them apart from a single
canary TestEasyEasy run on every PR.

Add .github/workflows/natlab-test.yml that runs the full opt-in suite
on demand (workflow_dispatch), on PRs labeled "natlab", and on main
every 12 hours via cron. The workflow has two phases:

- "prepare" builds the gokrazy VM image, downloads the Ubuntu and
  FreeBSD cloud images once via the new natlabprep tool, and emits
  a dynamic JSON matrix of every TestX function it finds in the two
  opt-in packages.
- "test" is a per-test matrix that depends on prepare. Each matrix
  job restores the shared caches and runs a single test, so adding
  a new TestFoo is automatically picked up on the next run without
  any workflow edits.

Rename the existing natlab-integrationtest.yml to natlab-basic.yml
since it's the small smoke variant (just TestEasyEasy on every PR);
the new natlab-test.yml is the bigger suite. The job inside is
renamed to EasyEasy for the same reason.

Move the macOS arm64 host check from vmtest.Env.Start into
vmtest.Env.AddNode so a test that adds a vmtest.MacOS node skips
immediately on a non-macOS host, and add an explicit
skipIfNotMacOSArm64 helper at the top of the two macOS-only tests
so the platform requirement is obvious to readers.

Quiet the takeAgentConnOne miss log in tstest/natlab/vnet by default
(it was the overwhelming majority of bytes in CI logs, with no signal
in healthy runs) and replace it with a periodic "still waiting" line
that only fires after 10s, so a truly stuck agent connection still
surfaces.

Updates #13038

Change-Id: I4582098d8865200fd5a73a9b696942319ccf3bf0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>

Committed by: Brad Fitzpatrick
Parent: 4eec4423b4
Commit: e062b46984

@@ -7,6 +7,7 @@ import (
 	"bytes"
 	"fmt"
 	"net/netip"
+	"runtime"
 	"strings"
 	"testing"
 	"time"
@@ -21,7 +22,20 @@ import (
 	"tailscale.com/types/netmap"
 )
 
+// skipIfNotMacOSArm64 skips the test when the host isn't a macOS arm64 host.
+// macOS VM tests require Apple Virtualization.framework via tailmac.
+// AddNode also enforces this when a macOS node is added, but having an
+// explicit skip at the top of macOS-only tests makes the requirement
+// obvious to readers.
+func skipIfNotMacOSArm64(t *testing.T) {
+	t.Helper()
+	if runtime.GOOS != "darwin" || runtime.GOARCH != "arm64" {
+		t.Skipf("macOS VM tests require a macOS arm64 host (got %s/%s)", runtime.GOOS, runtime.GOARCH)
+	}
+}
+
 func TestMacOSAndLinuxCanPing(t *testing.T) {
+	skipIfNotMacOSArm64(t)
 	env := vmtest.New(t)
 
 	lan := env.AddNetwork("192.168.1.1/24")
@@ -39,6 +53,7 @@ func TestMacOSAndLinuxCanPing(t *testing.T) {
 }
 
 func TestTwoMacOSVMsCanPing(t *testing.T) {
+	skipIfNotMacOSArm64(t)
 	env := vmtest.New(t)
 
 	lan := env.AddNetwork("192.168.1.1/24")
@@ -969,8 +984,18 @@ func TestCachedNetmapAfterRestart(t *testing.T) {
 	}
 	netmapCheckStep.End(nil)
 
+	// 90s is generous on purpose. After both nodes restart with stale cached
+	// netmap entries, a's first WG handshake to b's pre-restart endpoint
+	// hits the dead NAT mapping on b's side and is silently dropped (we
+	// see this as "no recent outgoing packet" NAT drops in the vnet log).
+	// Recovery then waits on wireguard-go's REKEY_TIMEOUT (~5s) before the
+	// next handshake attempt, and on disco-via-DERP to teach each side the
+	// other's new endpoint. On an idle host this converges in well under
+	// 15s; on a contended host (a 14/16-CPU-loaded local repro, or any
+	// shared CI runner) the same sequence has been observed at 50-60s
+	// because every timer fires multiple times under scheduling jitter.
 	pingStep.Begin()
-	if err := env.Ping(a, b, tailcfg.PingTSMP, 30*time.Second); err != nil {
+	if err := env.Ping(a, b, tailcfg.PingTSMP, 90*time.Second); err != nil {
 		pingStep.End(err)
 		t.Fatal(err)
 	}