ssh/tailssh: fix race in session termination message delivery
When a recording upload fails mid-session, the recording goroutine
cancels the session context. This triggers two concurrent paths:
exec.CommandContext kills the process (causing cmd.Wait to return),
and killProcessOnContextDone tries to write the termination message
via exitOnce.Do. If cmd.Wait returns first, the main goroutine's
exitOnce.Do(func(){}) steals the once, and the termination message
is never written to the client.
Fix by waiting for killProcessOnContextDone to finish writing the
termination message (via <-ss.exitHandled) before claiming exitOnce,
when the context is already done.
Also fix the fallback path when launchProcess itself fails due to
context cancellation: use SSHTerminationMessage() with the correct
"\r\n\r\n" framing instead of fmt.Fprintf with the internal error
string.
Deflakes TestSSHRecordingCancelsSessionsOnUploadFailure, which was
failing consistently at a low rate due to the exitOnce race. After
this fix, flakestress passes with 8,668 runs, 0 failures.
Fixes #7707 (again. hopefully for good.)
Change-Id: I5ab911c71574db8d3f9d979fb839f273be51ecf9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit is contained in:
committed by
Brad Fitzpatrick
parent
6e44c6828b
commit
2b1cfa7c4d
@@ -36,7 +36,6 @@ import (
|
||||
gliderssh "github.com/tailscale/gliderssh"
|
||||
"golang.org/x/net/http2"
|
||||
"golang.org/x/net/http2/h2c"
|
||||
"tailscale.com/cmd/testwrapper/flakytest"
|
||||
"tailscale.com/ipn/ipnlocal"
|
||||
"tailscale.com/ipn/store/mem"
|
||||
"tailscale.com/net/memnet"
|
||||
@@ -470,8 +469,6 @@ func newSSHRule(action *tailcfg.SSHAction) *tailcfg.SSHRule {
|
||||
}
|
||||
|
||||
func TestSSHRecordingCancelsSessionsOnUploadFailure(t *testing.T) {
|
||||
flakytest.Mark(t, "https://github.com/tailscale/tailscale/issues/7707")
|
||||
|
||||
if runtime.GOOS != "linux" && runtime.GOOS != "darwin" {
|
||||
t.Skipf("skipping on %q; only runs on linux and darwin", runtime.GOOS)
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user