* move probing out of netcheck into new net/portmapper package
* use PCP ANNOUNCE op codes for PCP discovery, rather than causing
short-lived (sub-second) side effects with a 1-second-expiring map +
delete.
* track when we heard things from the router so we can be less wasteful
in querying the router's port mapping services in the future
* use portmapper from magicsock to map a public port
Fixes#1298Fixes#1080Fixes#1001
Updates #864
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Users in Amsterdam (as one example) were flipping back and forth
between equidistant London & Frankfurt relays too much.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
At least the Apple Airport Extreme doesn't allow hairpin
sends from a private socket until it's seen traffic from
that src IP:port to something else out on the internet.
See https://github.com/tailscale/tailscale/issues/188#issuecomment-600728643
And it seems that even sending to a likely-filtered RFC 5737
documentation-only IPv4 range is enough to set up the mapping.
So do that for now. In the future we might want to classify networks
that do and don't require this separately. But for now help it.
I've confirmed that this is enough to fix the hairpin check on Avery's
home network, even using the RFC 5737 IP.
Fixes#188
My earlier 3fa58303d0 tried to implement
the net/http.Tranhsport.DialTLSContext hook, but I didn't return a
*tls.Conn, so we ended up sending a plaintext HTTP request to an HTTPS
port. The response ended up being Go telling as such, not the
/derp/latency-check handler's response (which is currently still a
404). But we didn't even get the 404.
This happened to work well enough because Go's built-in error response
was still a valid HTTP response that we can measure for timing
purposes, but it's not a great answer. Notably, it means we wouldn't
be able to get a future handler to run server-side and count those
latency requests.
Also:
* add -verbose flag to cmd/tailscale netcheck
* remove some API from the interfaces package
* convert some of the interfaces package to netaddr.IP
* don't even send IPv4 probes on machines with no IPv4 (or only v4
loopback)
* and once three regions have replied, stop waiting for other probes
at 2x the slowest duration.
Updates #376
Under some conditions, code would try to look things up in the maps
before the first call to updateLatency. I don't see any reason to delay
initialization of the maps, so let's just init them right away when
creating the Report instance.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.
And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)
This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.
Also fix some test flakes & races.
Fixes#387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes#399 (wgengine: flaky tests)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If Australia's far away and not going to be used, it's still going to
be far away a minute later. No need to send backup
just-in-case-UDP-gets-lost STUN packets to the known far away
destinations. Those are the ones most likely to trigger retries due to
delay anyway (in random 50-250ms, currently). But we'll keep sending 1
packet to them, just in case our airplane landed.
Likewise, be less aggressive with IPv6. The main point is just to see
whether IPv6 works. No need to send up to 10 packets every round. Max
two is enough (except for the first round). This does mean our STUN
traffic graphs for IPv4-vs-IPv6 will change shape. Oh well. It was a
weird eyeball metric for IPv6 connectivity anyway and we have better
metrics.
We can tweak this policy over time. It's factored out and has tests
now.
It used to make assumptions based on having Anycast IPs that are super
near. Now we're intentionally going to a bunch of different distant
IPs to measure latency.
Also, optimize how the hairpin detection works. No need to STUN on
that socket. Just use that separate socket for sending, once we know
the other UDP4 socket's endpoint. The trick is: make our test probe
also a STUN packet, so it fits through magicsock's existing STUN
routing.
This drops netcheck from ~5 seconds to ~250-500ms.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Basically, don't trust the OS-level link monitor to only tell you
interesting things. Sanity check it.
Also, move the interfaces package into the net directory now that we
have it.
I started to write a full DNS caching resolver and I realized it was
overkill and wouldn't work on Windows even in Go 1.14 yet, so I'm
doing this tiny one instead for now, just for all our netcheck STUN
derp lookups, and connections to DERP servers. (This will be caching a
exactly 8 DNS entries, all ours.)
Fixes#145 (can be better later, of course)
* adds new packet "netcheck" to do the checking of UDP, IPv6, and
nearest DERP server, and the Report type for all that (and more
in the future, probably pulling in danderson's natprobe)
* new tailcfg.NetInfo type
* cmd/tailscale netcheck subcommand (tentative name, likely to
change/move) to print out the netcheck.Report.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>