tailscale

Author	SHA1	Message	Date
M. J. FrombergerandGitHub	9f48567bf1	ipn/ipnlocal,wgengine/magicsock: add basic counters for cached peer connectivity (#19699 ) Add new clientmetric counters for establishing contact with peers while using cached network map data. To do this, instrument the magicsock.Conn with a bit to indicate whether its peer data came from a cached netmap. If so, there are two conditions we will count as establishing connectivity to a peer: - Receipt of a CallMeMaybe from a peer via disco. - Establishing a valid endpoint address for a peer. In vmtest, add Env.ClientMetrics to scrape metrics from the specified node. Use this to check that counters were updated in caching tests. Updates https://github.com/tailscale/projects/issues/13 Updates #12639 Change-Id: Ie8cf3244ac8af4f5bcfe4d0d944078da2ba08990 Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>	2026-05-12 12:01:05 -07:00
James TuckerandJames Tucker	120bfcf1cc	util/eventbus: extract non-generic SubscriberFunc constructor body and cache type name Two changes that share the same intent of reducing per-T duplication in code that doesn't actually depend on T: 1. Hoist the non-generic portion of newSubscriberFunc[T] into a newSubscriberFuncCore() helper. The hoisted work is the timer setup, the subscriberFuncCore allocation, and the unregister closure (which captures only the non-generic reflect.Type and subscribeState). The generic body now does only the two T-bound things it has to: compute reflect.TypeFor[T] and create the dispatch closure. Effect on the per-shape-stencil body of newSubscriberFunc[T]: before: 523 B per shape (in synthetic test) after: 293 B per shape (-230 B per shape; -56% on this body) 2. Cache reflect.Type.String() once at construction (in core.typeName) instead of recomputing it every time the dispatch closure runs. The dispatch closure also now takes the subscriberFuncCore directly rather than building an intermediate dispatchFuncState struct on every call. Effect on the dispatch closure body (newSubscriberFunc[T].func1): before: 581 B per shape after: 480 B per shape (-101 B per shape; -17%) Combined effect on tailscaled (linux/amd64): named-symbol savings via symcost: ~7 KB stripped binary delta: -8 KB (page-quantized) arm64 binary delta: 0 (page-quantized) cumulative reduction from baseline (5167ff412): linux/amd64: -110,592 bytes (-0.391%) linux/arm64: -131,072 bytes (-0.499%) Throughput is also improved by the typeName cache: BenchmarkBasic goes from 2018 ns/op to 1864 ns/op (-7.6%) because the dispatch hot path no longer allocates a string on every event. Updates #12614 Change-Id: Ib3a3d6796785e16506330ec034e1144580d467a3 Signed-off-by: James Tucker <james@tailscale.com>	2026-05-12 11:16:04 -07:00
Brad FitzpatrickandBrad Fitzpatrick	758ebe9839	tstest/natlab/vmtest: use short paths for Unix sockets macOS limits Unix socket paths to 104 bytes. The Go test TempDir path (e.g. /var/folders/.../TestDirectConnection...679197086/001/) easily exceeds that, causing "bind: invalid argument". Create a short /tmp/vmtest* directory for all socket files (vnet, QMP, dgram) so the paths stay well under the limit on every platform. Updates #13038 Change-Id: I721d24561d1766aaa964692bc77f40a131aa9455 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-11 21:54:27 -07:00
Brad FitzpatrickandBrad Fitzpatrick	f4c5613156	tstest/natlab/vmtest: don't require KVM; use TCG on macOS startCloudQEMU hardcoded -machine q35,accel=kvm and -cpu host, which fails on any host without KVM (notably macOS). Replace with a qemuAccelArgs helper that probes /dev/kvm and falls back to QEMU's TCG software emulation, matching the pattern already used by tstest/integration/nat. Also wire the helper into startGokrazyQEMU so gokrazy VMs pick up KVM when available. Updates #13038 Change-Id: I7745518db823279b1880957bb14ca2ffdaab4c50 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-11 19:18:17 -07:00
Brad FitzpatrickandBrad Fitzpatrick	e062b46984	tstest/natlab, .github/workflows: add opt-in natlab CI workflow The natlab vmtest suite (tstest/natlab/vmtest) and the integration nat tests are gated behind --run-vm-tests because they need KVM and are slow. Until now nothing in CI exercised them apart from a single canary TestEasyEasy run on every PR. Add .github/workflows/natlab-test.yml that runs the full opt-in suite on demand (workflow_dispatch), on PRs labeled "natlab", and on main every 12 hours via cron. The workflow has two phases: - "prepare" builds the gokrazy VM image, downloads the Ubuntu and FreeBSD cloud images once via the new natlabprep tool, and emits a dynamic JSON matrix of every TestX function it finds in the two opt-in packages. - "test" is a per-test matrix that depends on prepare. Each matrix job restores the shared caches and runs a single test, so adding a new TestFoo is automatically picked up on the next run without any workflow edits. Rename the existing natlab-integrationtest.yml to natlab-basic.yml since it's the small smoke variant (just TestEasyEasy on every PR); the new natlab-test.yml is the bigger suite. The job inside is renamed to EasyEasy for the same reason. Move the macOS arm64 host check from vmtest.Env.Start into vmtest.Env.AddNode so a test that adds a vmtest.MacOS node skips immediately on a non-macOS host, and add an explicit skipIfNotMacOSArm64 helper at the top of the two macOS-only tests so the platform requirement is obvious to readers. Quiet the takeAgentConnOne miss log in tstest/natlab/vnet by default (it was the overwhelming majority of bytes in CI logs, with no signal in healthy runs) and replace it with a periodic "still waiting" line that only fires after 10s, so a truly stuck agent connection still surfaces. Updates #13038 Change-Id: I4582098d8865200fd5a73a9b696942319ccf3bf0 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-11 17:14:46 -07:00
James TuckerandJames Tucker	4eec4423b4	util/eventbus: move Publisher publisher-interface impl to a non-generic core Mirrors the same refactor previously applied to SubscriberFunc: - Publisher[T]: a thin user-facing facade. Holds a pointer to a non-generic publisherCore and exposes Publish/Close/ShouldPublish. - publisherCore: a non-generic struct that owns the Client back-pointer, stop flag, and cached reflect.Type. It implements the package-private publisher interface (publishType, Close). The bus's per-Client publisher set is set.Set[publisher] keyed on this single non-generic type. The publisher interface only exists to support diagnostic introspection (Debugger.PublishTypes returning the list of types a client publishes). Previously, satisfying that diagnostic-only interface forced Publisher[T] to be the implementor and cost a per-T itab, generic dictionary, and equality function on every event type ever passed through Publish[T]. Moving the implementation to a non-generic core lets the diagnostic surface work unchanged while charging zero per-T cost for the diagnostic-driven generic interface. Publisher[T].Publish is also slimmed: the channel/select/stopFlag loop is now a non-generic publish() helper that takes the value as 'any'. The per-T body is reduced to forwarding the boxed value to the helper. Measured impact (util/eventbus/sizetest): total per-flow binary cost: linux/amd64: 2252.8 B/flow -> 1900.5 B/flow (-352.3 B / -15.6%) linux/arm64: 2228.2 B/flow -> 1835.0 B/flow (-393.2 B / -17.6%) Publisher per-receiver attribution: linux/amd64: 635.2 B/flow -> 369.6 B/flow (-265.6 B / -41.8%) linux/arm64: 751.7 B/flow -> 373.2 B/flow (-378.5 B / -50.4%) Cumulative reduction from the original baseline (5167ff412): linux/amd64: 3096.6 B/flow -> 1900.5 B/flow (-1196.1 B / -38.6%) linux/arm64: 3145.7 B/flow -> 1835.0 B/flow (-1310.7 B / -41.7%) Dropped per-T symbols (200-flow eventbus binary): - .dict.Publisher[T] was 14,400 B (72 B/T) - type:.eq.Publisher[T] was 11,832 B (58 B/T) - go:itab.Publisher[T],publisher was 8,000 B (40 B/T) - (Publisher[T]).Close shape stencils collapsed to 1 Behavior is unchanged: BenchmarkBasicThroughput is within noise (2018 -> 2038 ns/op at -benchtime=2s) and all eventbus tests pass. Updates #12614 Change-Id: I61979c2bf95d2a711c2321e6e0b4b7d15980e9f5 Signed-off-by: James Tucker <james@tailscale.com>	2026-05-11 14:39:42 -07:00
James TuckerandJames Tucker	d72cde1a6b	util/eventbus: move SubscriberFunc subscriber-interface impl to a non-generic core Splits SubscriberFunc[T] into: - SubscriberFunc[T]: a thin user-facing facade that holds only a pointer to a non-generic core. It exposes Close() to user code, which forwards to the core. - subscriberFuncCore: a non-generic struct that owns all the subscriber state (stop flag, unregister, logf, slow timer, cached reflect.Type) and implements the bus's package-private subscriber interface. Its dispatch() invokes a closure captured at construction time that performs the vals.Peek().Event.(T) type assertion and runs the user callback on the unboxed value. The bus's outputs map and subscriber-interface itab are parameterized only by subscriberFuncCore, not by T, eliminating both the per-T itab and the per-T generic dictionary that previously scaled with the number of subscribed event types. Measured impact (util/eventbus/sizetest): total per-flow binary cost: linux/amd64: 3039.2 B/flow -> 2252.8 B/flow (-786.4 B / -25.9%) linux/arm64: 3145.7 B/flow -> 2228.2 B/flow (-917.5 B / -29.2%) SubscriberFunc per-receiver attribution: linux/amd64: 840.8 B/flow -> 300.8 B/flow (-540.0 B / -64.2%) linux/arm64: 849.9 B/flow -> 303.8 B/flow (-546.1 B / -64.3%) Dropped per-T symbols (200-flow eventbus binary): - (SubscriberFunc[T]).dispatch was 26,639 B total (130 B/T) - (SubscriberFunc[T]).subscribeType was 3,600 B total ( 18 B/T) - .dict.SubscriberFunc[T] was 14,400 B total ( 72 B/T) - go:itab.SubscriberFunc[T],... was 9,600 B total ( 48 B/T) Of the original 913 B/flow attributed to SubscriberFunc, 540 B/flow is now gone, dropping the receiver to 300 B/flow. Behavior is unchanged: BenchmarkBasicThroughput is within noise (1955 -> 1941 ns/op on the test box) and all eventbus tests pass. Updates #12614 Change-Id: I646b3b05fd8d95f9afead59bfd0f69cd18b7a709 Signed-off-by: James Tucker <james@tailscale.com>	2026-05-11 12:14:05 -07:00
Francois MarierandBrad Fitzpatrick	ead5ce65a3	cmd/pgproxy: fix client TLS handshake timeout There is a 30-second timeout set on client TLS connections but the handshake was called on the wrong connection and so the timeout was never used in practice. Signed-off-by: Francois Marier <francois@fmarier.org>	2026-05-11 11:12:11 -07:00
Fran Bull	2f45a6a9d8	feature/conn25: return expired assignments to address pools Make it possible to remove the least recently used expired address assignment from addrAssignments. Before checking out a new address from the IP pools, return a handful of expired addresses. Updates tailscale/corp#39975 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-05-08 14:33:06 -07:00
Fran Bull	82346f3882	feature/conn25: move addrAssignments to their own file Updates tailscale/corp#39975 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-05-08 14:33:06 -07:00
Claus LensbølandGitHub	469d356ed8	tstest/natlab/vmtest: add test for direct conn with cached netmap (#19660 ) When a peer is not able to connect to control after a restart and is using a cached netmap, that node should be able to connect to another peer in its tailnet (given that the home DERP of that peer has not changed in the meantime). Add a test that starts two peers and connects them to a tailnet with caching enabled. Then blackhole traffic to control from one peer and restart it. Verify that the connection between the two ends up direct. Adds facilities for expecting a certain path type between nodes. Updates: #19597 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2026-05-08 16:57:27 -04:00
Fran Bull	ee2378b141	feature/conn25: follow CNAMEs when rewriting DNS response If a DNS query for a domain that should be routed through a connector results in CNAME records in the response, collapse the CNAME chain to an A/AAAA record for the domain -> magic IP. Fixes tailscale/corp#39978 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-05-08 08:12:24 -07:00
Brad FitzpatrickandBrad Fitzpatrick	24eb157448	go.toolchain.rev: bump to Go 1.26.3 Updates tailscale/corp#41490 Change-Id: I35b67bdbcd71468fea03b033b17aeefe1319dc45 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-07 15:33:05 -07:00
Alex ChanandAlex Chan	d6ffc0d986	tka,ipn: reduce boilerplate in Tailnet Lock tests The `CreateStateForTest` helper reduces boilerplate in cases where the test only cares about the trusted keys and not the disablement values (and makes it more obvious where the disablement values are meaningful). The `setupChonkStorage` helper reduces the boilerplate when creating on-disk TKA storage in tests. The `fakeLocalBackend` helper reduces the boilerplate when setting up a `LocalBackend` instance in the IPN tests. Updates #cleanup Change-Id: Iacfba1be5f7fab208eec11e4369d63c7d7519da5 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-05-07 21:49:27 +01:00
Fernando SerbonciniandGitHub	495d3acc7b	tstest/natlab/vmtest: kill QEMU when test process dies (#19676 ) Re-exec the test binary as a thin wrapper that holds a pipe inherited from the test. When the test goes away (any reason, including SIGKILL, panic, or OOM), the kernel closes the pipe write end; the wrapper sees EOF and SIGKILLs itself, taking QEMU and its children with it. Updates #13038 Change-Id: Ib2151098193551396c1d7bb51b07da3bd6b2cfb4 Signed-off-by: Fernando Serboncini <fserb@tailscale.com>	2026-05-07 16:14:27 -04:00
Claus LensbølandGitHub	76248a68b2	tstest/natlab/vnet: close gonet sockets when test is done (#19677 ) Running all vmtests in tstest/natlab/vmtest locally was breaking later tasks in the queue. The goroutine dump on timeout had goroutines hanging around for 9 minutes, meaning that something was not getting cleaned up. goroutine 262 [select, 9 minutes]: gvisor.dev/gvisor/pkg/tcpip/adapters/gonet.commonRead({...}) Add a timeout of Now() to gonet TCP connections when the test ends (inspired by ServeUnixConn()), and wait for them to shut down before exiting the test. Updates #13038 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2026-05-07 14:57:07 -04:00
Hazel TandGitHub	33b9579c21	scripts/installer.sh: add openSUSE Slowroll as a Tumbleweed derivative (#19662 ) Fixes: #14927 Signed-off-by: Hazel T <hazel@tailscale.com>	2026-05-07 12:43:55 +01:00
Erisa AandGitHub	76712b32d9	.github: install ca-certificates on Kali to fix installer tests (#19673 ) Updates #cleanup Signed-off-by: Erisa A <erisa@tailscale.com>	2026-05-07 12:20:09 +01:00
James TuckerandJames Tucker	0def0f19bd	util/eventbus: extract SubscriberFunc.dispatch loop to a non-generic helper The (SubscriberFunc[T]).dispatch method body (a ~40-line select loop with slow-subscriber timer, snapshot handling, ctx-cancel draining, and a CI stack-dump branch) was previously fully duplicated by the Go compiler for every distinct GC shape of T. None of that body actually depends on T except for the type assertion and the user callback invocation. This change moves the loop body into a non-generic dispatchFunc() helper, leaving (SubscriberFunc[T]).dispatch as a tiny wrapper that: - performs the vals.Peek().Event.(T) type assertion - spawns the callback goroutine via `go runFuncCallback(s.read, t, callDone)`, a regular generic function call, not a closure, so that `go` binds the args to the goroutine's frame instead of allocating a closure on the heap. This preserves the zero-extra-allocation behavior of the original (*SubscriberFunc[T]).runCallback method. - resolves T's name via reflect.TypeFor[T]().String() (cached on the stack rather than recomputed on each %T formatting) - calls dispatchFunc with the callDone channel The %T formatting in the original logf calls is replaced with %s on the resolved name string, removing per-T fmt instantiations. A new BenchmarkBasicFuncThroughput is added alongside the existing BenchmarkBasicThroughput so per-event allocation behavior on the SubscribeFunc dispatch path is covered by the benchmark suite. Measured impact (util/eventbus/sizetest): SubscriberFunc per-flow attribution: linux/amd64: 912.5 B/flow -> 840.8 B/flow (-71.7 B/flow) linux/arm64: 917.5 B/flow -> 849.9 B/flow (-67.6 B/flow) The total per-flow size delta on amd64 dropped from 3,096.6 B to 3,039.2 B (-57 B/flow). The arm64 total stayed at 3,145.7 B because the linker's page-aligned section sizing absorbed the improvement on this binary; the symcost-attributed per-receiver number is the real signal. Behavior is unchanged: BenchmarkBasicThroughput stays at 0 allocs/op and BenchmarkBasicFuncThroughput holds at the same 2 allocs/op, 144 B/op as the prior eventbus implementation. All eventbus tests pass. Updates #12614 Change-Id: I85f933f50f58cd25bbfe5cc46bdda7aab22f0bf7 Signed-off-by: James Tucker <james@tailscale.com>	2026-05-06 18:56:09 -07:00
Brad FitzpatrickandBrad Fitzpatrick	87a74c3aa2	tsnet: make workload identity federation opt-in The tailscale.com/wif package brings in the AWS SDK (github.com/aws/aws-sdk-go-v2/{config,sts,...} and github.com/aws/smithy-go) to support fetching ID tokens from AWS IMDS for workload identity federation. Until now, tsnet pulled this in unconditionally via feature/condregister/identityfederation, costing ~70 unwanted deps for every tsnet program whether or not it uses workload identity federation. These AWS SDK deps were originally removed from tsnet on 2025-09-29 by commit `69c79cb9f` ("ipn/store, feature/condregister: move AWS + Kube store registration to condregister"). They were then accidentally added back on 2026-01-14 by commit `6a6aa805d` ("cmd,feature: add identity token auto generation for workload identity", PR #18373) when the new wif package was wired into tsnet via feature/identityfederation. Drop the blanket import. tsnet programs that want workload identity federation now opt in with: import _ "tailscale.com/feature/identityfederation" The hook lookup in resolveAuthKey already uses GetOk and degrades gracefully when the feature isn't linked, so existing programs that don't use workload identity federation see no behavior change. The tailscale CLI still imports the condregister wrapper directly, so its behavior is also unchanged. Lock this in with TestDeps additions: tailscale.com/wif as a BadDep, plus substring checks in OnDep that fail on any github.com/aws/ or k8s.io/ dependency creeping back in. Also, switch cmd/gitops-pusher from the condregister wrapper to a direct import of feature/identityfederation: gitops-pusher's auth flow calls HookExchangeJWTForTokenViaWIF directly, so it shouldn't be subject to the ts_omit_identityfederation build tag. Updates #12614 Change-Id: I70599f2bdd4d3666b26a859d5b76caa5d6b94507 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-06 18:43:45 -07:00
Adriano Sela AvilesandAdriano Sela Aviles	daddb14b8f	control/controlhttp: use ws:// when HTTPSPort is NoPort in JS dialer When HTTPS is explicitly disabled (HTTPSPort == NoPort), the JS WebSocket dialer should use ws:// instead of wss://. This matches the behavior of the non-JS client and fixes connections to development control servers e.g. http://localhost:31544. Updates tailscale/corp#40944 Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>	2026-05-06 15:58:58 -07:00
Brad FitzpatrickandBrad Fitzpatrick	d06cc56987	wgengine/magicsock: add more docs, checks to Test32bitAlignment Per recent chat with @raggi about all this, I went and looked at this test again. Updates #cleanup Change-Id: Icb7d87b1ed2cebf481ee4e358a3aa603e63fb8a4 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-06 15:29:44 -07:00
Brad FitzpatrickandBrad Fitzpatrick	15bb10dbce	tsnet: ban awsstore and kubestore as deps in TestDeps Commit `69c79cb9f` (Sep 2025) moved awsstore and kubestore registration behind condregister build tags so tsnet wouldn't pull in the AWS SDK and Kubernetes client by default. The accompanying TestDeps BadDeps entry was missed, so PR #19667 (which re-added those imports) wasn't caught by the test. Add the two packages to BadDeps so future regressions fail the test. Updates #19667 Updates #12614 Change-Id: I903b7c976e5e122cc0c0b956dc73740f5d474fac Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-06 14:57:47 -07:00
Tom ProctorandGitHub	b74eeda055	cmd/testwrapper: print unit for package duration (#19663 ) Include the unit (s) when printing the time taken to test each package. Updates #cleanup Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>	2026-05-06 22:31:48 +01:00
kari-tsandGitHub	c721189cef	ipn/ipnlocal: prefer one CGNAT route on Android (#19652 ) Android rebuilds its VpnService interface when the VPN route configuration changes, which tears down long lived TCP connections through the tunnel. Use the same automatic OneCGNATRoute behavior as macOS on Android, and prefer the single CGNAT route when no other interface is using the CGNAT, falling back to fine grained peer routes otherwise. Updates tailscale/tailscale#19591 Signed-off-by: kari <kari@tailscale.com>	2026-05-05 19:11:17 -07:00
Brad FitzpatrickandBrad Fitzpatrick	f844c8bc32	util/winutil/gp: deflake TestGroupPolicyReadLockClose The test goroutine read lockCnt immediately after Lock returned, racing with Close: close(lk.closing) wakes lockSlow's select, whose deferred Add(-2) on lockCnt can run before Close's CAS clears the LSB. When that happens, lockCnt is briefly 1 (3 - 2) instead of 0 (1 + 2 - 2 - 1), producing "lockCnt: got 1; want 0". Move the lockCnt assertion into the main test goroutine, after both Close has returned and the Lock goroutine has finished, so both updates have settled before we read. Fixes #19647 Change-Id: Ia67036ff73a1beb528cbd621460db9048f3066ad Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-05 14:02:35 -07:00
Jonathan NobelsandGitHub	872d79089e	VERSION.txt: this is v1.99.0 (#19645 ) Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>	2026-05-05 15:07:20 -04:00
Evan LowryandGitHub	aa21b0c008	client/systray: fix recommended exit node not showing as selected (#19627 ) When an exit node was set before launching systray, the recommended row in exit nodes rendered as not selected even when the active exit node was at the same location. This looks to be two different things: - suggestExitNode takes its own suggestion into account, and not the user's active exit node. When a Mullvad city is reached via the picker rather than the recommended row, the suggester's pick and prefs.ExitNodeID end up as distinct peers in the same city, resulting in an ID-only equality check missing the match. - Toggle state was constructed and mutated via .Check(), which for newly created elements may be cached (such as when launching systray, with an already active node). Fixes #19626 Signed-off-by: Evan Lowry <evan@tailscale.com>	2026-05-05 10:49:38 -03:00
Alex ChanandAlex Chan	eac531da8e	cmd/tailscale/cli: unhide `--report posture` flag in `up` This was originally hidden during the beta period in both `up` and `set`, then when device posture went GA we unhid the flag in `set` but not in `up`. This is confusing for users, because an error message can direct them to run `tailscale up` with this flag if they've set it previously, but the help text won't tell them what it does. Updates #5902 Updates #17972 Change-Id: I9a31946f4b3bb411feed0f5a6449d7ff9a5ba9d3 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-05-05 10:12:36 +01:00
Brad FitzpatrickandBrad Fitzpatrick	883d4fd2cd	wgengine/netstack, net/ping: stop using pro-bing and use our net/ping instead Fixes #19633 Fixes #13760 Change-Id: I0fa9423523a3a0fb1dfcde57de0f26e51723ff97 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-04 14:05:24 -07:00
Brad FitzpatrickandBrad Fitzpatrick	81569e891f	tstest/iosdeps: update import list to mirror ipn-go-bridge The purpose of this package is to test the iOS dependency closure, but it had drifted from the actual import list of the ipn-go-bridge package in the corp repo (the Go side of the iOS / macOS app). Update the imports to match ipn-go-bridge's GOOS=ios import list, adding many missing packages including wgengine/netstack, feature/{taildrop,syspolicy,condregister}, the util/syspolicy/* subpackages, types/{key,lazy,logid,netmap}, tsd, safesocket, util/{eventbus,must,set}, and several net/* and ipn/* packages. Drop two now-stale BadDeps entries (for now!): database/sql/driver and github.com/google/uuid are reached via wgengine/netstack -> github.com/prometheus-community/pro-bing, which netstack imports on darwin || ios for ICMP user-ping, so the iOS app already ships them. But we should fix that later. Updates #19633 Change-Id: Ic50779fdb195685a2e8ccd7c513eee91b0feeaf8 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-04 14:05:24 -07:00
Brad FitzpatrickandBrad Fitzpatrick	9bb7ca6116	cmd/vet/lowerell, drive/driveimpl: forbid variables named "l" or "I" Add a new vet checker that rejects variables, parameters, named return values, receivers, range/type-switch bindings, type parameters, struct fields, and constants named "l" (lowercase ell) or "I" (uppercase i). Both are hard to distinguish from the digit "1" and from each other in too many fonts. Rename the two pre-existing struct fields named "l" (both of type net.Listener) in drive/driveimpl/drive_test.go to "ln", matching the convention used elsewhere for net.Listener locals. Rename the test-fixture struct fields "I" (single int label) to "Int" in metrics/multilabelmap_test.go and util/deephash/deephash_test.go, preserving the "first letters of types" convention used alongside neighboring fields like I8/I16/U/U8. Also teach pkgdoc_test.go to skip testdata/ directories, which the go tool ignores; they are not real packages. Fixes #19631 Change-Id: I71ad2fa990705f7a070406ebcdb8cefa7487d849 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-04 14:03:28 -07:00
Andrew LytvynovandGitHub	0cf899610c	util/linuxfw/linuxfwtest: remove unused package (#19520 ) Added in 2022, this appears to be unused now. Updates #cleanup Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2026-05-04 12:33:12 -07:00
License UpdaterandWill Norris	ca2317439d	licenses: update license notices Signed-off-by: License Updater <noreply+license-updater@tailscale.com>	2026-05-04 10:34:27 -07:00
Jordan WhitedandJordan Whited	ce76f44df2	derp/derpserver: remove global rate limiter Which can be unfair around varying packet sizes. Updates tailscale/corp#40962 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2026-05-04 09:41:14 -07:00
Fernando SerbonciniandGitHub	29122506be	misc/git_hook: propagate shared HOOK_VERSION (#19476 ) Move HOOK_VERSION into the githook package and export it as githook.HookVersion, so tailscale/corp can reference it via the shared-code bump instead of having to bump HOOK_VERSION by hand. New launcher.sh composes the wanted version from 2 sources: the shared HOOK_VERSION and an optional repo local version, misc/git_hook/HOOK_VERSION, for repo-specific config bumps. Updates tailscale/corp#40381 Change-Id: I7cf16889ba53cb564cc2df7dfd7588748f542c55 Signed-off-by: Fernando Serboncini <fserb@tailscale.com>	2026-05-04 12:38:28 -04:00
George JonesandGitHub	290a6cc03c	appc, feature/conn25: handle exact and wildcard domains correctly (#19202 ) Installed SplitDNS routes are always treated as wildcard domains, so the domains that we pass to the local resolver should be normalized and have any leading *. wildcard prefix removed. When looking at DNS responses to see if the domain matches, we need to consider both exact matches and wildcard matches. We now keep separate maps of exact-match domains and wildcard domains. When matching, we first check the exact-match map; otherwise we check against the wildcard map until we find a match, removing a label after each check. Rather than looking for matching self-hosted domains (domains serviced by the connector being run on the self-node), the apps that are being serviced by the connector on the self-node are tracked instead. When checking to see if a DNS response should be rewritten, it is ignored if any of the matching apps for the domain are in the self-hosted apps set. Fixes tailscale/corp#39272 Signed-off-by: George Jones <george@tailscale.com>	2026-05-01 17:33:21 -04:00
Fran Bull	bdf3419e7d	net/dns: add custom scheme resolvers If another part of the client code registers a custom scheme with the forwarder, the forwarder will check resolver addresses to see if they match the scheme. If they do, the corresponding custom scheme handler will be called to find the actual address for the resolver at this moment. If the handler returns the empty string then that resolver will be ignored. This is useful if you want to dynamically determine where to send certain DNS requests. It is being added to support new app connector (conn25) work that would like to make sure it sends DNS requests to the current connector peer in a high availability configuration. Updates tailscale/corp#39858 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-05-01 14:01:10 -07:00
Rollie MaandGitHub	78126c5d9f	tailcfg: add node capability for services in desktop clients (#19605 ) Add a node capability to help determine if the desktop clients should show services list/menu/section Updates: https://github.com/tailscale/corp/issues/40900 Change-Id: Ie34b3362f921d710173b2a0dd190354352bb26f0 Signed-off-by: Rollie Ma <rollie@tailscale.com>	2026-05-01 12:07:33 -07:00
Tom MeadowsandGitHub	ee10f9881c	cmd/k8s-operator: add authkey reissuing to recorder reconciler (#19556 ) Also fixes a memory leak with the authKeyReissuing map on ProxyGroup reconciler authkey reissue. Updates #19311 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>	2026-05-01 18:26:55 +01:00
Alex ChanandAlex Chan	3ced30b0b6	tka: clarify that this limit is on disablement values not secrets Values get written into TKA state; secrets don't. Updates #cleanup Change-Id: Ief9831dcb1102f584a33b2e71b611b38ca463724 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-05-01 18:25:39 +01:00
Andrew LytvynovandGitHub	f15a4f4416	client/web: move API permission checks into handlers (#19576 ) There are only a couple endpoints that check peer capabilities. Keeping permission checks with the code that assumes they were performed, rather than with the routing layer, feels easier to reason about. Check that the caller is actually a peer and pass their capabilities via a context value for handlers that want to check them. Along with this, simplify the helper handler wrappers that are not needed for most of the endpoints. Updates #40851 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2026-05-01 09:01:53 -07:00
Brad FitzpatrickandBrad Fitzpatrick	bbcb8650d4	cmd/tailscale/cli: fetch netmap via current-netmap debug action Stop opening an IPN bus subscription with NotifyInitialNetMap purely to read the current netmap once. Use the LocalAPI debug current-netmap action (added in `159cf8707`) instead, which returns the current netmap synchronously without subscribing to the bus. Updates #12542 Change-Id: I8aa2096d65aaea4dfe62634f03ce06b5470e0e51 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-01 07:53:51 -07:00
Brad FitzpatrickandBrad Fitzpatrick	4c3ed5ab32	all: migrate code off Notify.NetMap to Notify.SelfChange Move tailscaled's in-tree reactive users off IPN bus Notify.NetMap updates to the narrower Notify.SelfChange signal introduced earlier in this series. Consumers that need additional state (peers, DNS config, etc.) fetch it on demand via the LocalAPI. It is a step toward the larger goal of not fanning Notify.NetMap out to every bus watcher on Linux/non-GUI hosts. A future change stops sending Notify.NetMap entirely on Linux and non-GUI platforms. (Eventually, once macOS/iOS/Windows migrate to the upcoming new Notify APIs, we'll remove ipn.Notify.NetMap entirely.) Updates #12542 Change-Id: I51ea9d86bdca1909d6ac0e7d5bd3934a3a4e8516 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-01 06:51:40 -07:00
Claus LensbølandGitHub	ff9c3f0e00	tstest/natlab/vmtest: add test loading netmap cache from disk (#19598) For testing the loading of netmap cache from disk, the cache needs to exist. The simple solution is to start two nodes and connect them to control, with the netmap caching capability set. Then cut the connection to control, restart the nodes, and ping between them. This tests that we can start from a cache and get to running state, but also that we are able to establish a connection between the nodes. For now this is not testing how the nodes are able to talk to each other (DERP vs direct). Updates #19597 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2026-05-01 09:46:19 -04:00
Brad FitzpatrickandBrad Fitzpatrick	89a78dc9b7	client/local, ipn/localapi, ipn/ipnlocal: add PeerByID Add a narrow LocalAPI accessor and matching client/LocalBackend method to look up a single peer's current full [tailcfg.Node] by NodeID, in O(1) time on the daemon side, without fetching the entire netmap. Useful for callers that need the latest state of a single peer (e.g. in response to a peer-mutation event on the IPN bus) without paying for a full netmap fetch. Updates #12542 Change-Id: I1cb2d350e6ad846a5dabc1f5368dfc8121387f7c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-01 06:20:46 -07:00
Alex ChanandAlex Chan	cac94f51cc	ipn/ipnlocal: don't compact TKA state on startup Compacting on startup means nodes may compact at a different cadence based on whether they're long-running or restarting frequently. We already compact after every sync, which only occurs when the TKA state has changed. Waiting for TKA changes to trigger compaction on nodes means compaction will occur more consistently across a tailnet. Updates tailscale/corp#33537 Change-Id: Ia0aa6d9e5e362e9ab08450fde69772841790d5b5 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-05-01 13:27:12 +01:00
Brad FitzpatrickandBrad Fitzpatrick	a6c5d23742	ipn, ipn/ipnlocal: add Notify.SelfChange Add a new bus signal that lets reactive consumers (containerboot, kube agents, sniproxy, tsconsensus, etc.) react to self-node updates without having to subscribe to the full netmap. Today those consumers either watch Notify.NetMap (which on large tailnets is expensive to encode and ship per watcher) or poll. SelfChange is a cheap, narrow alternative: addresses, name, key expiry, capabilities, etc. Consumers that need additional state can react to SelfChange and then fetch the relevant bits on demand via existing LocalClient methods. Producer-side, every netmap-bearing setControlClientStatus call now also publishes SelfChange. Future changes will migrate individual in-tree consumers off Notify.NetMap to this signal, and eventually gate the legacy NetMap emission to platforms whose host GUIs still require it. Updates #12542 Change-Id: I4441650b0e085d663eb6bf26a03748b7d961ca49 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-30 14:47:03 -07:00
Brad FitzpatrickandBrad Fitzpatrick	9f343fdc0c	client/local, ipn/localapi, all: add CertDomains and DNSConfig accessors Add two narrow LocalAPI accessors so callers don't have to subscribe to the IPN bus and pull a full *netmap.NetworkMap just to read DNS-shaped fields: - GET /localapi/v0/cert-domains returns DNS.CertDomains. - GET /localapi/v0/dns-config returns the full tailcfg.DNSConfig. Migrate in-tree callers off the netmap-on-the-bus pattern: - kube/certs.waitForCertDomain still wakes on the IPN bus but now queries CertDomains via LocalClient.CertDomains rather than reading n.NetMap.DNS.CertDomains. The kube LocalClient interface and FakeLocalClient gain a CertDomains method. - cmd/tailscale dns status calls LocalClient.DNSConfig directly instead of opening a NotifyInitialNetMap watcher. - cmd/tailscale configure kubeconfig switches from a netmap watcher + serviceDNSRecordFromNetMap to LocalClient.DNSConfig + serviceDNSRecordFromDNSConfig. This is part of a series moving callers away from depending on the netmap traveling on the IPN bus, so the bus payload can shrink in a later change. Updates #12542 Change-Id: Ie10204e141d085fbac183b4cfe497226b670ad6c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-30 13:50:46 -07:00
Michael Ben-Amiandmzbenami	822299642b	feature/conn25: centralize config on Conn25 with atomic access We have two sources of truth for configuration state: the node view (from the netmap/policy) and prefs (the --advertise-connector option). These come with two independent update paths: onSelfChange for node view changes and profileStateChange for pref changes. Centralize config on Conn25 so that onSelfChange and profileStateChange can update their independent parts without bundling changes together. The old bundled approach required read-modify-write, which opened the door to potential TOCTOU bugs. The node view config is stored as an atomic.Pointer[config] and the prefs-derived field (advertise-connector) becomes an independent atomic.Bool. onSelfChange creates a fresh config and stores it atomically. profileStateChange sets the bool. This also establishes clearer lines of responsibility: - Configuration state lives on Conn25. Methods that need to read config (isConnectorDomain, mapDNSResponse, the IPMapper methods) are on Conn25, and use the atomics for synchronization. - "Active" state (address allocations, transit IP mappings) lives on client and connector, and use a mutex for synchronization on that state, without conflicting with configuration synchronization. It's fine for active state to be out of sync with config — e.g. a transit IP allocated for an app should still be tracked, and gracefully expired, even if the app is removed from the node view. Removing config responsibility from client/connector makes these cases clearer to handle. - In cases where the client or connector does need access to config-derived state, e.g. a client reconfiguring its IP pools from the IPSets in the config, we can use closures for the client or connector to get just the latest state it needs from the config. See getIPSets() in this commit. - As of this commit, the connector doesn't need config-derived state at all. 
Fixes tailscale/corp#40872 Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>	2026-04-30 16:29:56 -04:00
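
The synchronization split described in the feature/conn25 commit above (a node-view config swapped in whole through an atomic.Pointer, plus an independent atomic.Bool for the prefs-derived flag) can be sketched roughly as follows. This is only an illustration under stated assumptions: the type and method names echo the commit message, but the field names, signatures, and the shape of config are hypothetical and not taken from the Tailscale source.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// config is a hypothetical immutable snapshot derived from the node view.
// The real config type in feature/conn25 is not shown in the commit message.
type config struct {
	connectorDomains map[string]bool
}

// Conn25 is an illustrative stand-in for the real type, not a copy of it.
type Conn25 struct {
	cfg                atomic.Pointer[config] // replaced wholesale on node view changes
	advertiseConnector atomic.Bool            // prefs-derived, updated independently
}

// onSelfChange builds a fresh config and stores it atomically, so there is
// no read-modify-write of shared state. (Hypothetical signature.)
func (c *Conn25) onSelfChange(domains []string) {
	next := &config{connectorDomains: make(map[string]bool, len(domains))}
	for _, d := range domains {
		next.connectorDomains[d] = true
	}
	c.cfg.Store(next)
}

// profileStateChange only touches the bool. (Hypothetical signature.)
func (c *Conn25) profileStateChange(advertise bool) {
	c.advertiseConnector.Store(advertise)
}

// isConnectorDomain reads whichever snapshot is current; nil means no
// node view has been seen yet.
func (c *Conn25) isConnectorDomain(name string) bool {
	cfg := c.cfg.Load()
	if cfg == nil {
		return false
	}
	return cfg.connectorDomains[name]
}

func main() {
	c := &Conn25{}
	c.onSelfChange([]string{"example.com"})
	c.profileStateChange(true)
	fmt.Println(c.isConnectorDomain("example.com"), c.advertiseConnector.Load())
}
```

Because each update path writes only its own atomic, neither has to read the other's state before storing, which is the TOCTOU exposure the commit message says the bundled approach had.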
