The old check was too aggressive and required TS_GO_NEXT=1 at runtime
as well, which is too strict and onerous.
This is a sanity check only (and an outdated one, at that); it's okay for
it to be slightly loose and permit two possible values. If either is working,
we're already way past the old bug that this was introduced to catch.
Updates tailscale/corp#36382
Change-Id: Ib9a62e10382cd889ba590c3539e6b8535c6b19fe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* cmd/containerboot,kube/services: support the ability to automatically advertise services on startup
Updates #17769
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
* cmd/containerboot: don't assume we want to use kube state store if in kubernetes
Fixes#8188
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
---------
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
The new version of app connector (conn25) needs to read DNS responses
for domains it is interested in and store and swap out IP addresses.
Add a hook to dns manager to enable this.
Give the conn25 updated netmaps so that it knows when to assign
connecting addresses and from what pool.
Assign an address when we see a DNS response for a domain we are
interested in, but don't do anything with the address yet.
Updates tailscale/corp#34252
Signed-off-by: Fran Bull <fran@tailscale.com>
Add more explicit `--bind-address` and `--bind-port` flags to the `tailscale netcheck` CLI to give users control over UDP probes' source IP and UDP port.
This was already supported in a less documented manner via the` TS_DEBUG_NETCHECK_UDP_BIND` environment variable. The environment variable reference is preserved and used as a fallback value in the absence of these new CLI flags.
Updates tailscale/corp#36833
Signed-off-by: Amal Bansode <amal@tailscale.com>
--auth is already best-effort, but we saw some CI failures due to
failing to fetch stats when cigocached was overwhelmed recently. Make
sure it fails more gracefully in the absence of cigocached like the rest
of cigocacher already does.
Updates tailscale/corp#37059
Change-Id: I0703b30b1c5a7f8c649879a87e6bcd2278610208
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
fixestailscale/corp#37048
We're duplicating logic in AnyInterfaceUp in the ChangeDelta
and we're duplicating it wrong. The new State has the logic
for this based on the HaveV6 and HaveV4 flags.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Stop stripping the "*." prefix from wildcard domains when used
as storage keys. Instead, replace "*" with "wildcard_" only at
the filesystem boundary in certFile and keyFile. This prevents
wildcard and non-wildcard certs from colliding in storage.
Updates #1196
Updates #7081
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
Updating a node on a testcontrol server should trigger netmap updates to
all connected streaming clients. This was not the case previous to this
change and consequently caused race conditions in tests. It was possible
for a test to call UpdateNode and for connected nodes to never see the
update propagate.
Updates #16340Fixes#18703
Signed-off-by: Harry Harpham <harry@tailscale.com>
This commit implements a reconciler for the new `ProxyGroupPolicy`
custom resource. When created, all `ProxyGroupPolicy` resources
within the same namespace are merged into two `ValidatingAdmissionPolicy`
resources, one for egress and one for ingress.
These policies use CEL expressions to limit the usage of the
"tailscale.com/proxy-group" annotation on `Service` and `Ingress`
resources on create & update.
Included here is also a new e2e test that ensures that resources that
violate the policy return an error on creation, and that once the
policy is changed to allow them they can be created.
Closes: https://github.com/tailscale/corp/issues/36830
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit is based on ff0978ab, and extends #18497 to connect network map
caching to the LocalBackend. As implemented, only "whole" netmap values are
stored, and we do not yet handle incremental updates. As-written, the feature must
be explicitly enabled via the TS_USE_CACHED_NETMAP envknob, and must be
considered experimental.
Updates #12639
Co-Authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I48a1e92facfbf7fb3a8e67cff7f2c9ab4ed62c83
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
TestOnlyTaggedPeersCanBeDialed has a race condition:
- The test untags ps[2] and waits until ps[0] sees this tag dropped from
ps[2] in the netmap.
- Later the test tries to dial ps[2] from ps[0] and expects the dial to
fail as authorization to dial relies on the presence of the tag, now
removed from ps[2].
- However, the authorization layer caches the status used to consult peer
tags. When the dial happens before the cache times out, the test fails.
- Due to a bug in testcontrol.Server.UpdateNode, which the test uses to
remove the tag, netmap updates are not immediately triggered. The test
has to wait for the next natural set of netmap updates, which on my
machine takes about 22 seconds. As a result, the cache in the
authorization layer times out and the test passes.
- If one fixes the bug in UpdateNode, then netmap updates happen
immediately, the cache is no longer timed out when the dial occurs, and
the test fails.
Fixes#18720
Updates #18703
Signed-off-by: Harry Harpham <harry@tailscale.com>
This adds a new ControlKnob to make MagicDNS IPv6 registration
(telling systemd/etc) opt-out rather than opt-in.
Updates #15404
Change-Id: If008e1cb046b792c6aff7bb1d7c58638f7d650b1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
OnPolicyChange can observe a conn in activeConns before authentication
completes. The previous `c.info == nil` guard was itself a data race
against clientAuth writing c.info, and even when c.info appeared
non-nil, c.localUser could still be nil, causing a nil pointer
dereference at c.localUser.Username.
Add an authCompleted atomic.Bool to conn, stored true after all auth
fields are written in clientAuth. OnPolicyChange checks this atomic
instead of c.info, which provides the memory barrier guaranteeing all
prior writes are visible to the concurrent reader.
Updates tailscale/corp#36268 (fixes, but we might want to cherry-pick)
Co-authored-by: Gesa Stupperich <gesa@tailscale.com>
Change-Id: I4c69843541f5f9f04add9bf431e320c65a203a39
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The gocross-wrapper.sh bash script already checks TS_GO_NEXT (as of
a374cc344e) to select go.toolchain.next.rev over go.toolchain.rev,
but when TS_USE_GOCROSS=1 the Go binary itself was hardcoded to read
go.toolchain.rev. This makes gocross also respect the TS_GO_NEXT=1
environment variable.
Updates tailscale/corp#36382
Change-Id: I04bef25a34e7ed3ccb1bfdb33a3a1f896236c6ee
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In PR #18681, we started logging which exit nodes were being
suggested. However, we did not log if there were errors encountered.
This patch corrects this oversight.
Updates: tailscale/corp#29964
Updates: tailscale/corp#36446
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This switches our gokrazy builds to use a new variant of cmd/gok called
opinionated about using monorepos: https://github.com/bradfitz/monogok
And with that, we can get rid of all the go.mod files and builddir forests
under gokrazy/**.
Updates #13038
Updates gokrazy/gokrazy#361
Change-Id: I9f18fbe59b8792286abc1e563d686ea9472c622d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(*health.Tracker).CurrentState() returns an empty state when there are no client-side
warnables, even when there are control-health messages, which is incorrect.
This fixes it.
Updates tailscale/corp#37275
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Store packet filter rules in the cache. The match expressions are derived
from the filter rules, so these do not need to be stored explicitly, but ensure
they are properly reconstructed when the cache is read back.
Update the tests to include these fields, and provide representative values.
Updates #12639
Change-Id: I9bdb972a86d2c6387177d393ada1f54805a2448b
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
fixestailscale/corp#36708
Sets up a set of metrics to report watchdog timeouts for wgengine and
reports an event for any watchdog timeout.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
In the absence of a better mechanism, writing unqualified hostnames to the hosts file may be required
for MagicDNS to work on some Windows environments, such as domain-joined machines. It can also
improve MagicDNS performance on non-domain joined devices when we are not the device's primary
DNS resolver.
At the same time, updating the hosts file can be slow and expensive, especially when it already contains
many entries, as was previously reported in #14327. It may also have negative side effects, such as interfering
with the system's DNS resolution policies.
Additionally, to fix#18712, we had to extend hosts file usage to domain-joined machines when we are not
the primary DNS resolver. For the reasons above, this change may introduce risk.
To allow customers to disable hosts file updates remotely without disabling MagicDNS entirely, whether on
domain-joined machines or not, this PR introduces the `disable-hosts-file-updates` node attribute.
Updates #18712
Updates #14327
Signed-off-by: Nick Khyl <nickk@tailscale.com>
On domain-joined Windows devices the primary search domain (the one the device is joined to)
always takes precedence over other search domains. This breaks MagicDNS when we are the primary
resolver on the device (see #18712). To work around this Windows behavior, we should write MagicDNS
host names the hosts file just as we do when we're not the primary resolver.
This commit does exactly that.
Fixes#18712
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This commit adds a new custom resource definition to the kubernetes
operator named `ProxyGroupPolicy`. This resource is namespace scoped
and is used as an allow list for which `ProxyGroup` resources can be
used within its namespace.
The `spec` contains two fields, `ingress` and `egress`. These should
contain the names of `ProxyGroup` resources to denote which can be
used as values in the `tailscale.com/proxy-group` annotation within
`Service` and `Ingress` resources.
The intention is for these policies to be merged within a namespace and
produce a `ValidatingAdmissionPolicy` and `ValidatingAdmissionPolicyBinding`
for both ingress and egress that prevents users from using names of
`ProxyGroup` resources in those annotations.
Closes: https://github.com/tailscale/corp/issues/36829
Signed-off-by: David Bond <davidsbond93@gmail.com>
Adds a new track for release candidates. Supports querying by track in
version and updating to RCs in update for supported platforms.
updates #18193
Signed-off-by: Will Hannah <willh@tailscale.com>
Two methods were recently added to the testcontrol.Server type:
AddDNSRecords and SetGlobalAppCaps. These two methods should trigger
netmap updates for all nodes connected to the Server instance, the way
that other state-change methods do (see SetNodeCapMap, for example).
This will also allow us to get rid of Server.ForceNetmapUpdate, which
was a band-aid fix to force the netmap updates which should have been
triggered by the aforementioned methods.
Fixestailscale/corp#37102
Signed-off-by: Harry Harpham <harry@tailscale.com>
Instead of relying on the local timezone, which may cause
non-deterministic behavior in some CIs, we force timezone
to be UTC on default created clocks.
Fixes: tailscale/corp#37005
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>