Move Linux client & common packages into a public repo.

This commit is contained in:
Earl Lee
2020-02-05 14:16:58 -08:00
parent c955043dfe
commit a8d8b8719a
156 changed files with 17113 additions and 0 deletions
+6
View File
@@ -0,0 +1,6 @@
*~
*.out
/example/logadopt/logadopt
/example/logreprocess/logreprocess
/example/logtail/logtail
/logtail
+10
View File
@@ -0,0 +1,10 @@
# Tailscale Logs Service
This github repository contains libraries, documentation, and examples
for working with the public API of the tailscale logs service.
For a very quick introduction to the core features, read the
[API docs](api.md) and peruse the
[logs reprocessing](./example/logreprocess/demo.sh) example.
For more information, write to info@tailscale.io.
+195
View File
@@ -0,0 +1,195 @@
# Tailscale Logs Service
The Tailscale Logs Service defines a REST interface for configuring, storing,
retrieving, and processing log entries.
# Overview
HTTP requests are received at the service **base URL**
[https://log.tailscale.io](https://log.tailscale.io), and return JSON-encoded
responses using standard HTTP response codes.
Authorization for the configuration and retrieval APIs is done with a secret
API key passed as the HTTP basic auth username. Secret keys are generated via
the web UI at base URL. An example of using basic auth with curl:
curl -u <log_api_key>: https://log.tailscale.io/collections
In the future, an HTTP header will allow using MessagePack instead of JSON.
## Collections
Logs are organized into collections. Inside each collection is any number of
instances.
A collection is a domain name. It is a grouping of related logs. As a
guideline, create one collection per product using subdomains of your
company's domain name. Collections must be registered with the logs service
before any attempt is made to store logs.
## Instances
Each collection is a set of instances. There is one instance per machine
writing logs.
An instance has a name and a number. An instance has a **private** and
**public** ID. The private ID is a 32-byte random number encoded as hex.
The public ID is the SHA-256 hash of the private ID, encoded as hex.
The private ID is used to write logs. The only copy of the private ID
should be on the machine sending logs. Ideally it is generated on the
machine. Logs can be written as soon as a private ID is generated.
The public ID is used to read and adopt logs. It is designed to be sent
to a service that also holds a logs service API key.
The tailscale logs service will store any logs for a short period of time.
To enable logs retention, the log can be **adopted** using the public ID
and a logs service API key.
Once this is done, logs will be retained long-term (for the configured
retention period).
Unadopted instance logs are stored temporarily to help with debugging:
a misconfigured machine writing logs with a bad ID can be spotted by
reading the logs.
If a public ID is not adopted, storage is tightly capped and logs are
deleted after 12 hours.
# APIs
## Storage
### `POST /c/<collection-name>/<private-ID>` — send a log
The body of the request is JSON.
A **single message** is an object with properties:
`{ }`
The client may send any properties it wants in the JSON message, except
for the `logtail` property which has special meaning. Inside the logtail
object the client may only set the following properties:
- `client_time` in the format of RFC3339: "2006-01-02T15:04:05.999999999Z07:00"
A future version of the logs service API will also support:
- `client_time_offset` a integer of nanoseconds since the client was reset
- `client_time_reset` a boolean if set to true resets the time offset counter
On receipt by the server the `client_time_offset` is transformed into a
`client_time` based on the `server_time` when the first (or
client_time_reset) event was received.
If any other properties are set in the logtail object they are moved into
the "error" field, the message is saved and a 4xx status code is returned.
A **batch of messages** is a JSON array filled with single message objects:
`[ { }, { }, ... ]`
If any of the array entries are not objects, the content is converted
into a message with a `"logtail": { "error": ...}` property, saved, and
a 4xx status code is returned.
Similarly any other request content not matching one of these formats is
saved in a logtail error field, and a 4xx status code is returned.
An invalid collection name returns `{"error": "invalid collection name"}`
along with a 403 status code.
Clients are encouraged to:
- POST as rapidly as possible (if not battery constrained). This minimizes
both the time necessary to see logs in a log viewer and the chance of
losing logs.
- Use HTTP/2 when streaming logs, as it does a much better job of
maintaining a TLS connection to minimize overhead for subsequent posts.
A future version of logs service API will support sending requests with
`Content-Encoding: zstd`.
## Retrieval
### `GET /collections` — query the set of collections and instances
Returns a JSON object listing all of the named collections.
The caller can query-encode the following fields:
- `collection-name` — limit the results to one collection
```
{
"collections": {
"collection1.yourcompany.com": {
"instances": {
"<logtail.PublicID>" :{
"first-seen": "timestamp",
"size": 4096
},
"<logtail.PublicID>" :{
"first-seen": "timestamp",
"size": 512000,
"orphan": true,
}
}
}
}
}
```
### `GET /c/<collection_name>` — query stored logs
The caller can query-encode the following fields:
- `instances` — zero or more log collection instances to limit results to
- `time-start` — the earliest log to include
- One of:
- `time-end` — the latest log to include
- `max-count` — maximum number of logs to return, allows paging
- `stream` — boolean that keeps the response dangling, streaming in
logs like `tail -f`. Incompatible with logtail-time-end.
In **stream=false** mode, the response is a single JSON object:
{
// TODO: header fields
"logs": [ {}, {}, ... ]
}
In **stream=true** mode, the response begins with a JSON header object
similar to the storage format, and then is a sequence of JSON log
objects, `{...}`, one per line. The server continues to send these until
the client closes the connection.
## Configuration
For organizations with a small number of instances writing logs, the
Configuration API are best used by a trusted human operator, usually
through a GUI. Organizations with many instances will need to automate
the creation of tokens.
### `POST /collections` — create or delete a collection
The caller must set the `collection` property and `action=create` or
`action=delete`, either form encoded or JSON encoded. Its character set
is restricted to the mundane: [a-zA-Z0-9-_.]+
Collection names are a global space. Typically they are a domain name.
### `POST /instances` — adopt an instance into a collection
The caller must send the following properties, form encoded or JSON encoded:
- `collection` — a valid FQDN ([a-zA-Z0-9-_.]+)
- `instances` an instance public ID encoded as hex
The collection name must be claimed by a group the caller belongs to.
The pair (collection-name, instance-public-ID) may or may not already have
logs associated with it.
On failure, an error message is returned with a 4xx or 5xx status code:
`{"error": "what went wrong"}`
+49
View File
@@ -0,0 +1,49 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package backoff
import (
"context"
"log"
"math/rand"
"time"
)
const MAX_BACKOFF_MSEC = 30000
type Backoff struct {
n int
Name string
NewTimer func(d time.Duration) *time.Timer
}
func (b *Backoff) BackOff(ctx context.Context, err error) {
if ctx.Err() == nil && err != nil {
b.n++
// n^2 backoff timer is a little smoother than the
// common choice of 2^n.
msec := b.n * b.n * 10
if msec > MAX_BACKOFF_MSEC {
msec = MAX_BACKOFF_MSEC
}
// Randomize the delay between 0.5-1.5 x msec, in order
// to prevent accidental "thundering herd" problems.
msec = rand.Intn(msec) + msec/2
log.Printf("%s: backoff: %d msec\n", b.Name, msec)
newTimer := b.NewTimer
if newTimer == nil {
newTimer = time.NewTimer
}
t := newTimer(time.Duration(msec) * time.Millisecond)
select {
case <-ctx.Done():
t.Stop()
case <-t.C:
}
} else {
// not a regular error
b.n = 0
}
}
+82
View File
@@ -0,0 +1,82 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package logtail
import (
"bytes"
"errors"
"fmt"
"sync"
)
type Buffer interface {
// TryReadLine tries to read a log line from the ring buffer.
// If no line is available it returns a nil slice.
// If the ring buffer is closed it returns io.EOF.
TryReadLine() ([]byte, error)
// Write writes a log line into the ring buffer.
Write([]byte) (int, error)
}
func NewMemoryBuffer(numEntries int) Buffer {
return &memBuffer{
pending: make(chan qentry, numEntries),
}
}
type memBuffer struct {
next []byte
pending chan qentry
dropMu sync.Mutex
dropCount int
}
func (m *memBuffer) TryReadLine() ([]byte, error) {
if m.next != nil {
msg := m.next
m.next = nil
return msg, nil
}
select {
case ent := <-m.pending:
if ent.dropCount > 0 {
m.next = ent.msg
buf := new(bytes.Buffer)
fmt.Fprintf(buf, "----------- %d logs dropped ----------", ent.dropCount)
return buf.Bytes(), nil
}
return ent.msg, nil
default:
return nil, nil
}
}
func (m *memBuffer) Write(b []byte) (int, error) {
m.dropMu.Lock()
defer m.dropMu.Unlock()
ent := qentry{
msg: b,
dropCount: m.dropCount,
}
select {
case m.pending <- ent:
m.dropCount = 0
return len(b), nil
default:
m.dropCount++
return 0, errBufferFull
}
}
type qentry struct {
msg []byte
dropCount int
}
var errBufferFull = errors.New("logtail: buffer full")
+51
View File
@@ -0,0 +1,51 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package main
import (
"flag"
"io/ioutil"
"log"
"net/http"
"net/url"
"os"
"strings"
)
func main() {
collection := flag.String("c", "", "logtail collection name")
publicID := flag.String("m", "", "machine public identifier")
apiKey := flag.String("p", "", "logtail API key")
flag.Parse()
if len(flag.Args()) != 0 {
flag.Usage()
os.Exit(1)
}
log.SetFlags(0)
req, err := http.NewRequest("POST", "https://log.tailscale.io/instances", strings.NewReader(url.Values{
"collection": []string{*collection},
"instances": []string{*publicID},
"adopt": []string{"true"},
}.Encode()))
if err != nil {
log.Fatal(err)
}
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
req.SetBasicAuth(*apiKey, "")
resp, err := http.DefaultClient.Do(req)
if err != nil {
log.Fatal(err)
}
b, err := ioutil.ReadAll(resp.Body)
resp.Body.Close()
if err != nil {
log.Fatalf("logadopt: response read failed %d: %v", resp.StatusCode, err)
}
if resp.StatusCode != 200 {
log.Fatalf("adoption failed: %d: %s", resp.StatusCode, string(b))
}
log.Printf("%s", string(b))
}
+87
View File
@@ -0,0 +1,87 @@
#!/bin/bash
# Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
# Use of this source code is governed by a BSD-style
# license that can be found in the LICENSE file.
#
# This shell script demonstrates writing logs from machines
# and then reprocessing those logs to amalgamate python tracebacks
# into a single log entry in a new collection.
#
# To run this demo, first install the example applications:
#
# go install tailscale.com/logtail/example/...
#
# Then generate a LOGTAIL_API_KEY and two test collections by visiting:
#
# https://log.tailscale.io
#
# Then set the three variables below.
trap 'rv=$?; [ "$rv" = 0 ] || echo "-- exiting with code $rv"; exit $rv' EXIT
set -e
LOG_TEXT='server starting
config file loaded
answering queries
Traceback (most recent call last):
File "/Users/crawshaw/junk.py", line 6, in <module>
main()
File "/Users/crawshaw/junk.py", line 4, in main
raise Exception("oops")
Exception: oops'
die() {
echo "$0: $*" >&2
exit 1
}
msg() {
echo "-- $*" >&2
}
if [ -z "$LOGTAIL_API_KEY" ]; then
die "LOGTAIL_API_KEY is not set"
fi
if [ -z "$COLLECTION_IN" ]; then
die "COLLECTION_IN is not set"
fi
if [ -z "$COLLECTION_OUT" ]; then
die "COLLECTION_OUT is not set"
fi
# Private IDs are 32-bytes of random hex.
# Normally you'd keep the same private IDs from one run to the next, but
# this is just an example.
msg "Generating keys..."
privateid1=$(hexdump -n 32 -e '8/4 "%08X"' /dev/urandom)
privateid2=$(hexdump -n 32 -e '8/4 "%08X"' /dev/urandom)
privateid3=$(hexdump -n 32 -e '8/4 "%08X"' /dev/urandom)
# Public IDs are the SHA-256 of the private ID.
publicid1=$(echo -n $privateid1 | xxd -r -p - | shasum -a 256 | sed 's/ -//')
publicid2=$(echo -n $privateid2 | xxd -r -p - | shasum -a 256 | sed 's/ -//')
publicid3=$(echo -n $privateid3 | xxd -r -p - | shasum -a 256 | sed 's/ -//')
# Write the machine logs to the input collection.
# Notice that this doesn't require an API key.
msg "Producing new logs..."
echo "$LOG_TEXT" | logtail -c $COLLECTION_IN -k $privateid1 >/dev/null
echo "$LOG_TEXT" | logtail -c $COLLECTION_IN -k $privateid2 >/dev/null
# Adopt the logs, so they will be kept and are readable.
msg "Adopting logs..."
logadopt -p "$LOGTAIL_API_KEY" -c "$COLLECTION_IN" -m $publicid1
logadopt -p "$LOGTAIL_API_KEY" -c "$COLLECTION_IN" -m $publicid2
# Reprocess the logs, amalgamating python tracebacks.
#
# We'll take that reprocessed output and write it to a separate collection,
# again via logtail.
#
# Time out quickly because all our "interesting" logs (generated
# above) have already been processed.
msg "Reprocessing logs..."
logreprocess -t 3s -c "$COLLECTION_IN" -p "$LOGTAIL_API_KEY" 2>&1 |
logtail -c "$COLLECTION_OUT" -k $privateid3
@@ -0,0 +1,116 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// The logreprocess program tails a log and reprocesses it.
package main
import (
"bufio"
"encoding/json"
"flag"
"io/ioutil"
"log"
"net/http"
"os"
"strings"
"time"
"tailscale.com/logtail"
)
func main() {
collection := flag.String("c", "", "logtail collection name to read")
apiKey := flag.String("p", "", "logtail API key")
timeout := flag.Duration("t", 0, "timeout after which logreprocess quits")
flag.Parse()
if len(flag.Args()) != 0 {
flag.Usage()
os.Exit(1)
}
log.SetFlags(0)
if *timeout != 0 {
go func() {
<-time.After(*timeout)
log.Printf("logreprocess: timeout reached, quitting")
os.Exit(1)
}()
}
req, err := http.NewRequest("GET", "https://log.tailscale.io/c/"+*collection+"?stream=true", nil)
if err != nil {
log.Fatal(err)
}
req.SetBasicAuth(*apiKey, "")
resp, err := http.DefaultClient.Do(req)
if err != nil {
log.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
b, err := ioutil.ReadAll(resp.Body)
if err != nil {
log.Fatalf("logreprocess: read error %d: %v", resp.StatusCode, err)
}
log.Fatalf("logreprocess: read error %d: %s", resp.StatusCode, string(b))
}
tracebackCache := make(map[logtail.PublicID]*ProcessedMsg)
scanner := bufio.NewScanner(resp.Body)
for scanner.Scan() {
var msg Msg
if err := json.Unmarshal(scanner.Bytes(), &msg); err != nil {
log.Fatalf("logreprocess of %q: %v", string(scanner.Bytes()), err)
}
var pMsg *ProcessedMsg
if pMsg = tracebackCache[msg.Logtail.Instance]; pMsg != nil {
pMsg.Text += "\n" + msg.Text
if strings.HasPrefix(msg.Text, "Exception: ") {
delete(tracebackCache, msg.Logtail.Instance)
} else {
continue // write later
}
} else {
pMsg = &ProcessedMsg{
OrigInstance: msg.Logtail.Instance,
Text: msg.Text,
}
pMsg.Logtail.ClientTime = msg.Logtail.ClientTime
}
if strings.HasPrefix(msg.Text, "Traceback (most recent call last):") {
tracebackCache[msg.Logtail.Instance] = pMsg
continue // write later
}
b, err := json.Marshal(pMsg)
if err != nil {
log.Fatal(err)
}
log.Printf("%s", b)
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
}
type Msg struct {
Logtail struct {
Instance logtail.PublicID `json:"instance"`
ClientTime time.Time `json:"client_time"`
} `json:"logtail"`
Text string `json:"text"`
}
type ProcessedMsg struct {
Logtail struct {
ClientTime time.Time `json:"client_time"`
} `json:"logtail"`
OrigInstance logtail.PublicID `json:"orig_instance"`
Text string `json:"text"`
}
+46
View File
@@ -0,0 +1,46 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// The logtail program logs stdin.
package main
import (
"bufio"
"flag"
"io"
"log"
"os"
"tailscale.com/logtail"
)
func main() {
collection := flag.String("c", "", "logtail collection name")
privateID := flag.String("k", "", "machine private identifier, 32-bytes in hex")
flag.Parse()
if len(flag.Args()) != 0 {
flag.Usage()
os.Exit(1)
}
log.SetFlags(0)
var id logtail.PrivateID
if err := id.UnmarshalText([]byte(*privateID)); err != nil {
log.Fatalf("logtail: bad -privateid: %v", err)
}
logger := logtail.Log(logtail.Config{
Collection: *collection,
PrivateID: id,
})
log.SetOutput(io.MultiWriter(logger, os.Stdout))
defer logger.Flush()
defer log.Printf("logtail exited")
scanner := bufio.NewScanner(os.Stdin)
for scanner.Scan() {
log.Println(scanner.Text())
}
}
+238
View File
@@ -0,0 +1,238 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Package filch is a file system queue that pilfers your stderr.
// (A FILe CHannel that filches.)
package filch
import (
"bufio"
"bytes"
"fmt"
"io"
"os"
"sync"
)
var stderrFD = 2 // a variable for testing
type Options struct {
ReplaceStderr bool // dup over fd 2 so everything written to stderr comes here
}
type Filch struct {
OrigStderr *os.File
mu sync.Mutex
cur *os.File
alt *os.File
altscan *bufio.Scanner
recovered int64
}
func (f *Filch) TryReadLine() ([]byte, error) {
f.mu.Lock()
defer f.mu.Unlock()
if f.altscan != nil {
if b, err := f.scan(); b != nil || err != nil {
return b, err
}
}
f.cur, f.alt = f.alt, f.cur
if f.OrigStderr != nil {
if err := dup2Stderr(f.cur); err != nil {
return nil, err
}
}
if _, err := f.alt.Seek(0, os.SEEK_SET); err != nil {
return nil, err
}
f.altscan = bufio.NewScanner(f.alt)
f.altscan.Split(splitLines)
return f.scan()
}
func (f *Filch) scan() ([]byte, error) {
if f.altscan.Scan() {
return f.altscan.Bytes(), nil
}
err := f.altscan.Err()
err2 := f.alt.Truncate(0)
_, err3 := f.alt.Seek(0, os.SEEK_SET)
f.altscan = nil
if err != nil {
return nil, err
}
if err2 != nil {
return nil, err2
}
if err3 != nil {
return nil, err3
}
return nil, nil
}
func (f *Filch) Write(b []byte) (int, error) {
f.mu.Lock()
defer f.mu.Unlock()
if len(b) == 0 || b[len(b)-1] != '\n' {
bnl := make([]byte, len(b)+1)
copy(bnl, b)
bnl[len(bnl)-1] = '\n'
return f.cur.Write(bnl)
}
return f.cur.Write(b)
}
func (f *Filch) Close() (err error) {
f.mu.Lock()
defer f.mu.Unlock()
if f.OrigStderr != nil {
if err2 := unsaveStderr(f.OrigStderr); err == nil {
err = err2
}
f.OrigStderr = nil
}
if err2 := f.cur.Close(); err == nil {
err = err2
}
if err2 := f.alt.Close(); err == nil {
err = err2
}
return err
}
func New(filePrefix string, opts Options) (f *Filch, err error) {
var f1, f2 *os.File
defer func() {
if err != nil {
if f1 != nil {
f1.Close()
}
if f2 != nil {
f2.Close()
}
err = fmt.Errorf("filch: %s", err)
}
}()
path1 := filePrefix + ".log1.txt"
path2 := filePrefix + ".log2.txt"
f1, err = os.OpenFile(path1, os.O_CREATE|os.O_RDWR, 0666)
if err != nil {
return nil, err
}
f2, err = os.OpenFile(path2, os.O_CREATE|os.O_RDWR, 0666)
if err != nil {
return nil, err
}
fi1, err := f1.Stat()
if err != nil {
return nil, err
}
fi2, err := f2.Stat()
if err != nil {
return nil, err
}
f = &Filch{
OrigStderr: os.Stderr, // temporary, for past logs recovery
}
// Neither, either, or both files may exist and contain logs from
// the last time the process ran. The three cases are:
//
// - neither: all logs were read out and files were truncated
// - either: logs were being written into one of the files
// - both: the files were swapped and were starting to be
// read out, while new logs streamed into the other
// file, but the read out did not complete
if n := fi1.Size() + fi2.Size(); n > 0 {
f.recovered = n
}
switch {
case fi1.Size() > 0 && fi2.Size() == 0:
f.cur, f.alt = f2, f1
case fi2.Size() > 0 && fi1.Size() == 0:
f.cur, f.alt = f1, f2
case fi1.Size() > 0 && fi2.Size() > 0: // both
// We need to pick one of the files to be the elder,
// which we do using the mtime.
var older, newer *os.File
if fi1.ModTime().Before(fi2.ModTime()) {
older, newer = f1, f2
} else {
older, newer = f2, f1
}
if err := moveContents(older, newer); err != nil {
fmt.Fprintf(f.OrigStderr, "filch: recover move failed: %v\n", err)
fmt.Fprintf(older, "filch: recover move failed: %v\n", err)
}
f.cur, f.alt = newer, older
default:
f.cur, f.alt = f1, f2 // does not matter
}
if f.recovered > 0 {
f.altscan = bufio.NewScanner(f.alt)
f.altscan.Split(splitLines)
}
f.OrigStderr = nil
if opts.ReplaceStderr {
f.OrigStderr, err = saveStderr()
if err != nil {
return nil, err
}
if err := dup2Stderr(f.cur); err != nil {
return nil, err
}
}
return f, nil
}
func moveContents(dst, src *os.File) (err error) {
defer func() {
_, err2 := src.Seek(0, os.SEEK_SET)
err3 := src.Truncate(0)
_, err4 := dst.Seek(0, os.SEEK_SET)
if err == nil {
err = err2
}
if err == nil {
err = err3
}
if err == nil {
err = err4
}
}()
if _, err := src.Seek(0, os.SEEK_SET); err != nil {
return err
}
if _, err := io.Copy(dst, src); err != nil {
return err
}
return nil
}
func splitLines(data []byte, atEOF bool) (advance int, token []byte, err error) {
if atEOF && len(data) == 0 {
return 0, nil, nil
}
if i := bytes.IndexByte(data, '\n'); i >= 0 {
return i + 1, data[0 : i+1], nil
}
if atEOF {
return len(data), data, nil
}
return 0, nil, nil
}
+178
View File
@@ -0,0 +1,178 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package filch
import (
"fmt"
"io/ioutil"
"os"
"path/filepath"
"strings"
"testing"
"unicode"
)
type filchTest struct {
*Filch
}
func newFilchTest(t *testing.T, filePrefix string, opts Options) *filchTest {
f, err := New(filePrefix, opts)
if err != nil {
t.Fatal(err)
}
return &filchTest{Filch: f}
}
func (f *filchTest) write(t *testing.T, s string) {
t.Helper()
if _, err := f.Write([]byte(s)); err != nil {
t.Fatal(err)
}
}
func (f *filchTest) read(t *testing.T, want string) {
t.Helper()
if b, err := f.TryReadLine(); err != nil {
t.Fatalf("r.ReadLine() err=%v", err)
} else if got := strings.TrimRightFunc(string(b), unicode.IsSpace); got != want {
t.Errorf("r.ReadLine()=%q, want %q", got, want)
}
}
func (f *filchTest) readEOF(t *testing.T) {
t.Helper()
if b, err := f.TryReadLine(); b != nil || err != nil {
t.Fatalf("r.ReadLine()=%q err=%v, want nil slice", string(b), err)
}
}
func (f *filchTest) close(t *testing.T) {
t.Helper()
if err := f.Close(); err != nil {
t.Fatal(err)
}
}
func genFilePrefix(t *testing.T) string {
t.Helper()
filePrefix, err := ioutil.TempDir("", "filch")
if err != nil {
t.Fatal(err)
}
return filepath.Join(filePrefix, "ringbuffer-")
}
func TestQueue(t *testing.T) {
filePrefix := genFilePrefix(t)
defer os.RemoveAll(filepath.Dir(filePrefix))
f := newFilchTest(t, filePrefix, Options{ReplaceStderr: false})
f.readEOF(t)
const line1 = "Hello, World!"
const line2 = "This is a test."
const line3 = "Of filch."
f.write(t, line1)
f.write(t, line2)
f.read(t, line1)
f.write(t, line3)
f.read(t, line2)
f.read(t, line3)
f.readEOF(t)
f.write(t, line1)
f.read(t, line1)
f.readEOF(t)
f.close(t)
}
func TestRecover(t *testing.T) {
t.Run("empty", func(t *testing.T) {
filePrefix := genFilePrefix(t)
defer os.RemoveAll(filepath.Dir(filePrefix))
f := newFilchTest(t, filePrefix, Options{ReplaceStderr: false})
f.write(t, "hello")
f.read(t, "hello")
f.readEOF(t)
f.close(t)
f = newFilchTest(t, filePrefix, Options{ReplaceStderr: false})
f.readEOF(t)
f.close(t)
})
t.Run("cur", func(t *testing.T) {
filePrefix := genFilePrefix(t)
defer os.RemoveAll(filepath.Dir(filePrefix))
f := newFilchTest(t, filePrefix, Options{ReplaceStderr: false})
f.write(t, "hello")
f.close(t)
f = newFilchTest(t, filePrefix, Options{ReplaceStderr: false})
f.read(t, "hello")
f.readEOF(t)
f.close(t)
})
t.Run("alt", func(t *testing.T) {
t.Skip("currently broken on linux, passes on macOS")
/* --- FAIL: TestRecover/alt (0.00s)
filch_test.go:128: r.ReadLine()="world", want "hello"
filch_test.go:129: r.ReadLine()="hello", want "world"
*/
filePrefix := genFilePrefix(t)
defer os.RemoveAll(filepath.Dir(filePrefix))
f := newFilchTest(t, filePrefix, Options{ReplaceStderr: false})
f.write(t, "hello")
f.read(t, "hello")
f.write(t, "world")
f.close(t)
f = newFilchTest(t, filePrefix, Options{ReplaceStderr: false})
// TODO(crawshaw): The "hello" log is replayed in recovery.
// We could reduce replays by risking some logs loss.
// What should our policy here be?
f.read(t, "hello")
f.read(t, "world")
f.readEOF(t)
f.close(t)
})
}
func TestFilchStderr(t *testing.T) {
pipeR, pipeW, err := os.Pipe()
if err != nil {
t.Fatal(err)
}
defer pipeR.Close()
defer pipeW.Close()
stderrFD = int(pipeW.Fd())
defer func() {
stderrFD = 2
}()
filePrefix := genFilePrefix(t)
defer os.RemoveAll(filepath.Dir(filePrefix))
f := newFilchTest(t, filePrefix, Options{ReplaceStderr: true})
f.write(t, "hello")
if _, err := fmt.Fprintf(pipeW, "filch\n"); err != nil {
t.Fatal(err)
}
f.read(t, "hello")
f.read(t, "filch")
f.readEOF(t)
f.close(t)
pipeW.Close()
b, err := ioutil.ReadAll(pipeR)
if err != nil {
t.Fatal(err)
}
if len(b) > 0 {
t.Errorf("unexpected write to fake stderr: %s", b)
}
}
+30
View File
@@ -0,0 +1,30 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
//+build !windows
package filch
import (
"os"
"syscall"
)
func saveStderr() (*os.File, error) {
fd, err := syscall.Dup(stderrFD)
if err != nil {
return nil, err
}
return os.NewFile(uintptr(fd), "stderr"), nil
}
func unsaveStderr(f *os.File) error {
err := dup2Stderr(f)
f.Close()
return err
}
func dup2Stderr(f *os.File) error {
return syscall.Dup2(int(f.Fd()), stderrFD)
}
+44
View File
@@ -0,0 +1,44 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package filch
import (
"fmt"
"os"
"syscall"
)
var kernel32 = syscall.MustLoadDLL("kernel32.dll")
var procSetStdHandle = kernel32.MustFindProc("SetStdHandle")
func setStdHandle(stdHandle int32, handle syscall.Handle) error {
r, _, e := syscall.Syscall(procSetStdHandle.Addr(), 2, uintptr(stdHandle), uintptr(handle), 0)
if r == 0 {
if e != 0 {
return error(e)
}
return syscall.EINVAL
}
return nil
}
func saveStderr() (*os.File, error) {
return os.Stderr, nil
}
func unsaveStderr(f *os.File) error {
os.Stderr = f
return nil
}
func dup2Stderr(f *os.File) error {
fd := int(f.Fd())
err := setStdHandle(syscall.STD_ERROR_HANDLE, syscall.Handle(fd))
if err != nil {
return fmt.Errorf("dup2Stderr: %w", err)
}
os.Stderr = f
return nil
}
+103
View File
@@ -0,0 +1,103 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package logtail
import (
"crypto/rand"
"crypto/sha256"
"encoding/hex"
"fmt"
)
// PrivateID represents an instance that write logs.
// Private IDs are only shared with the server when writing logs.
type PrivateID [32]byte
// Safely generate a new PrivateId for use in Config objects.
// You should persist this across runs of an instance of your app, so that
// it can append to the same log file on each run.
func NewPrivateID() (id PrivateID, err error) {
_, err = rand.Read(id[:])
if err != nil {
return PrivateID{}, err
}
// Clamping, for future use.
id[0] &= 248
id[31] = (id[31] & 127) | 64
return id, nil
}
func (id PrivateID) MarshalText() ([]byte, error) {
b := make([]byte, hex.EncodedLen(len(id)))
if i := hex.Encode(b, id[:]); i != len(b) {
return nil, fmt.Errorf("logtail.PrivateID.MarhsalText: i=%d", i)
}
return b, nil
}
func (id *PrivateID) UnmarshalText(s []byte) error {
b, err := hex.DecodeString(string(s))
if err != nil {
return fmt.Errorf("logtail.PrivateID.UnmarshalText: %v", err)
}
if len(b) != len(id) {
return fmt.Errorf("logtail.PrivateID.UnmarshalText: invalid hex length: %d", len(b))
}
copy(id[:], b)
return nil
}
func (id PrivateID) String() string {
b, err := id.MarshalText()
if err != nil {
panic(err)
}
return string(b)
}
func (id PrivateID) Public() (pub PublicID) {
var emptyID PrivateID
if id == emptyID {
panic("invalid logtail.Public() on an empty private ID")
}
h := sha256.New()
h.Write(id[:])
if n := copy(pub[:], h.Sum(pub[:0])); n != len(pub) {
panic(fmt.Sprintf("public id short copy: %d", n))
}
return pub
}
// PublicID represents an instance in the logs service for reading and adoption.
// The public ID value is a SHA-256 hash of a private ID.
type PublicID [sha256.Size]byte
func (id PublicID) MarshalText() ([]byte, error) {
b := make([]byte, hex.EncodedLen(len(id)))
if i := hex.Encode(b, id[:]); i != len(b) {
return nil, fmt.Errorf("logtail.PublicID.MarhsalText: i=%d", i)
}
return b, nil
}
func (id *PublicID) UnmarshalText(s []byte) error {
b, err := hex.DecodeString(string(s))
if err != nil {
return fmt.Errorf("logtail.PublicID.UnmarshalText: %v", err)
}
if len(b) != len(id) {
return fmt.Errorf("logtail.PublicID.UnmarshalText: invalid hex length: %d", len(b))
}
copy(id[:], b)
return nil
}
func (id PublicID) String() string {
b, err := id.MarshalText()
if err != nil {
panic(err)
}
return string(b)
}
+54
View File
@@ -0,0 +1,54 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package logtail
import (
"testing"
)
func TestIDs(t *testing.T) {
id1, err := NewPrivateID()
if err != nil {
t.Fatal(err)
}
pub1 := id1.Public()
id2, err := NewPrivateID()
if err != nil {
t.Fatal(err)
}
pub2 := id2.Public()
if id1 == id2 {
t.Fatalf("subsequent private IDs match: %v", id1)
}
if pub1 == pub2 {
t.Fatalf("subsequent public IDs match: %v", id1)
}
if id1.String() == id2.String() {
t.Fatalf("id1.String()=%v equals id2.String()", id1.String())
}
if pub1.String() == pub2.String() {
t.Fatalf("pub1.String()=%v equals pub2.String()", pub1.String())
}
id1txt, err := id1.MarshalText()
if err != nil {
t.Fatal(err)
}
var id3 PrivateID
if err := id3.UnmarshalText(id1txt); err != nil {
t.Fatal(err)
}
if id1 != id3 {
t.Fatalf("id1 %v: marshal and unmarshal gives different key: %v", id1, id3)
}
if want, got := id1.Public(), id3.Public(); want != got {
t.Fatalf("id1.Public()=%v does not match id3.Public()=%v", want, got)
}
if id1.String() != id3.String() {
t.Fatalf("id1.String()=%v does not match id3.String()=%v", id1.String(), id3.String())
}
}
+464
View File
@@ -0,0 +1,464 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Package logtail sends logs to log.tailscale.io.
package logtail
import (
"bytes"
"context"
"crypto/rand"
"encoding/json"
"errors"
"fmt"
"io"
"io/ioutil"
"math/big"
"net/http"
"os"
"strconv"
"sync"
"time"
"tailscale.com/logtail/backoff"
)
type Logger interface {
// Write logs an encoded JSON blob.
//
// If the []byte passed to Write is not an encoded JSON blob,
// then contents is fit into a JSON blob and written.
//
// This is intended as an interface for the stdlib "log" package.
Write([]byte) (int, error)
// Flush uploads all logs to the server.
// It blocks until complete or there is an unrecoverable error.
Flush() error
// Shutdown gracefully shuts down the logger while completing any
// remaining uploads.
//
// It will block, continuing to try and upload unless the passed
// context object interrupts it by being done.
// If the shutdown is interrupted, an error is returned.
Shutdown(context.Context) error
// Close shuts down this logger object, the background log uploader
// process, and any associated goroutines.
//
// DEPRECATED: use Shutdown
Close()
}
type Encoder interface {
EncodeAll(src, dst []byte) []byte
Close() error
}
type Config struct {
Collection string // collection name, a domain name
PrivateID PrivateID // machine-specific private identifier
BaseURL string // if empty defaults to "https://log.tailscale.io"
HTTPC *http.Client // if empty defaults to http.DefaultClient
SkipClientTime bool // if true, client_time is not written to logs
LowMemory bool // if true, logtail minimizes memory use
TimeNow func() time.Time // if set, subsitutes uses of time.Now
Stderr io.Writer // if set, logs are sent here instead of os.Stderr
Buffer Buffer // temp storage, if nil a MemoryBuffer
CheckLogs <-chan struct{} // signals Logger to check for filched logs to upload
NewZstdEncoder func() Encoder // if set, used to compress logs for transmission
}
func Log(cfg Config) Logger {
if cfg.BaseURL == "" {
cfg.BaseURL = "https://log.tailscale.io"
}
if cfg.HTTPC == nil {
cfg.HTTPC = http.DefaultClient
}
if cfg.TimeNow == nil {
cfg.TimeNow = time.Now
}
if cfg.Stderr == nil {
cfg.Stderr = os.Stderr
}
if cfg.Buffer == nil {
pendingSize := 256
if cfg.LowMemory {
pendingSize = 64
}
cfg.Buffer = NewMemoryBuffer(pendingSize)
}
if cfg.CheckLogs == nil {
cfg.CheckLogs = make(chan struct{})
}
l := &logger{
stderr: cfg.Stderr,
httpc: cfg.HTTPC,
url: cfg.BaseURL + "/c/" + cfg.Collection + "/" + cfg.PrivateID.String(),
lowMem: cfg.LowMemory,
buffer: cfg.Buffer,
skipClientTime: cfg.SkipClientTime,
sent: make(chan struct{}, 1),
sentinel: make(chan int32, 16),
checkLogs: cfg.CheckLogs,
timeNow: cfg.TimeNow,
bo: backoff.Backoff{
Name: "logtail",
},
shutdownStart: make(chan struct{}),
shutdownDone: make(chan struct{}),
}
if cfg.NewZstdEncoder != nil {
l.zstdEncoder = cfg.NewZstdEncoder()
}
ctx, cancel := context.WithCancel(context.Background())
l.uploadCancel = cancel
go l.uploading(ctx)
l.Write([]byte("logtail started"))
return l
}
type logger struct {
stderr io.Writer
httpc *http.Client
url string
lowMem bool
skipClientTime bool
buffer Buffer
sent chan struct{} // signal to speed up drain
checkLogs <-chan struct{} // external signal to attempt a drain
sentinel chan int32
timeNow func() time.Time
bo backoff.Backoff
zstdEncoder Encoder
uploadCancel func()
shutdownStart chan struct{} // closed when shutdown begins
shutdownDone chan struct{} // closd when shutdown complete
dropMu sync.Mutex
dropCount int
}
func (l *logger) Shutdown(ctx context.Context) error {
if ctx == nil {
ctx = context.Background()
}
done := make(chan struct{})
go func() {
select {
case <-ctx.Done():
l.uploadCancel()
<-l.shutdownDone
case <-l.shutdownDone:
}
close(done)
}()
close(l.shutdownStart)
io.WriteString(l, "logger closing down\n")
<-done
if l.zstdEncoder != nil {
return l.zstdEncoder.Close()
}
return nil
}
func (l *logger) Close() {
l.Shutdown(nil)
}
func (l *logger) drainPending() (res []byte) {
buf := new(bytes.Buffer)
entries := 0
var batchDone bool
for buf.Len() < 1<<18 && !batchDone {
b, err := l.buffer.TryReadLine()
if err == io.EOF {
break
} else if err != nil {
b = []byte(fmt.Sprintf("reading ringbuffer: %v", err))
batchDone = true
} else if b == nil {
if entries > 0 {
break
}
select {
case <-l.shutdownStart:
batchDone = true
case <-l.checkLogs:
case <-l.sent:
}
continue
}
if len(b) == 0 {
continue
}
if b[0] != '{' || !json.Valid(b) {
// This is probably a log added to stderr by filch
// outside of the logtail logger. Encode it.
// Do not add a client time, as it could have been
// been written a long time ago.
b = l.encodeText(b, true)
}
switch {
case entries == 0:
buf.Write(b)
case entries == 1:
buf2 := new(bytes.Buffer)
buf2.WriteByte('[')
buf2.Write(buf.Bytes())
buf2.WriteByte(',')
buf2.Write(b)
buf.Reset()
buf.Write(buf2.Bytes())
default:
buf.WriteByte(',')
buf.Write(b)
}
entries++
}
if entries > 1 {
buf.WriteByte(']')
}
if buf.Len() == 0 {
return nil
}
return buf.Bytes()
}
var clientSentinelPrefix = []byte(`{"logtail":{"client_sentinel":`)
const (
noSentinel = 0
stopSentinel = 1
)
// newSentinel creates a client sentinel between 2 and maxint32.
// It does not generate the reserved values:
// 0 is no sentinel
// 1 is stop the logger
func newSentinel() ([]byte, int32) {
val, err := rand.Int(rand.Reader, big.NewInt(1<<31-2))
if err != nil {
panic(err)
}
v := int32(val.Int64()) + 2
buf := new(bytes.Buffer)
fmt.Fprintf(buf, "%s%d}}\n", clientSentinelPrefix, v)
return buf.Bytes(), v
}
// readSentinel reads a sentinel.
// If it is not a sentinel it reports 0.
func readSentinel(b []byte) int32 {
if !bytes.HasPrefix(b, clientSentinelPrefix) {
return 0
}
b = bytes.TrimPrefix(b, clientSentinelPrefix)
b = bytes.TrimSuffix(bytes.TrimSpace(b), []byte("}}"))
v, err := strconv.Atoi(string(b))
if err != nil {
return 0
}
return int32(v)
}
// This is the goroutine that repeatedly uploads logs in the background.
func (l *logger) uploading(ctx context.Context) {
defer close(l.shutdownDone)
for {
body := l.drainPending()
if l.zstdEncoder != nil {
body = l.zstdEncoder.EncodeAll(body, nil)
}
for len(body) > 0 {
select {
case <-ctx.Done():
return
default:
}
uploaded, err := l.upload(ctx, body)
if err != nil {
fmt.Fprintf(l.stderr, "logtail: upload: %v\n", err)
}
if uploaded {
break
}
l.bo.BackOff(ctx, err)
}
select {
case <-l.shutdownStart:
return
default:
}
}
}
func (l *logger) upload(ctx context.Context, body []byte) (uploaded bool, err error) {
req, err := http.NewRequest("POST", l.url, bytes.NewReader(body))
if err != nil {
// I know of no conditions under which this could fail.
// Report it very loudly.
// TODO record logs to disk
panic("logtail: cannot build http request: " + err.Error())
}
if l.zstdEncoder != nil {
req.Header.Add("Content-Encoding", "zstd")
}
maxUploadTime := 45 * time.Second
ctx, cancel := context.WithTimeout(ctx, maxUploadTime)
defer cancel()
req = req.WithContext(ctx)
compressedNote := "not-compressed"
if l.zstdEncoder != nil {
compressedNote = "compressed"
}
resp, err := l.httpc.Do(req)
if err != nil {
return false, fmt.Errorf("log upload of %d bytes %s failed: %v", len(body), compressedNote, err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
uploaded = resp.StatusCode == 400 // the server saved the logs anyway
b, _ := ioutil.ReadAll(resp.Body)
return uploaded, fmt.Errorf("log upload of %d bytes %s failed %d: %q", len(body), compressedNote, resp.StatusCode, string(b))
}
return true, nil
}
func (l *logger) Flush() error {
return nil
}
var errHasLogtail = errors.New("logtail: JSON log message contains reserved 'logtail' property")
func (l *logger) send(jsonBlob []byte) (int, error) {
n, err := l.buffer.Write(jsonBlob)
select {
case l.sent <- struct{}{}:
default:
}
return n, err
}
func (l *logger) encodeText(buf []byte, skipClientTime bool) []byte {
now := l.timeNow()
b := make([]byte, 0, len(buf)+16)
b = append(b, '{')
if !skipClientTime {
b = append(b, `"logtail": {"client_time": "`...)
b = now.AppendFormat(b, time.RFC3339Nano)
b = append(b, "\"}, "...)
}
b = append(b, "\"text\": \""...)
for i, c := range buf {
switch c {
case '\b':
b = append(b, '\\', 'b')
case '\f':
b = append(b, '\\', 'f')
case '\n':
b = append(b, '\\', 'n')
case '\r':
b = append(b, '\\', 'r')
case '\t':
b = append(b, '\\', 't')
case '"':
b = append(b, '\\', '"')
case '\\':
b = append(b, '\\', '\\')
default:
b = append(b, c)
}
if l.lowMem && i > 254 {
b = append(b, "…"...)
break
}
}
b = append(b, "\"}\n"...)
return b
}
func (l *logger) encode(buf []byte) []byte {
if buf[0] != '{' {
return l.encodeText(buf, l.skipClientTime) // text fast-path
}
now := l.timeNow()
obj := make(map[string]interface{})
if err := json.Unmarshal(buf, &obj); err != nil {
for k := range obj {
delete(obj, k)
}
obj["text"] = string(buf)
}
if txt, isStr := obj["text"].(string); l.lowMem && isStr && len(txt) > 254 {
// TODO(crawshaw): trim to unicode code point
obj["text"] = txt[:254] + "…"
}
hasLogtail := obj["logtail"] != nil
if hasLogtail {
obj["error_has_logtail"] = obj["logtail"]
obj["logtail"] = nil
}
if !l.skipClientTime {
obj["logtail"] = map[string]string{
"client_time": now.Format(time.RFC3339Nano),
}
}
b, err := json.Marshal(obj)
if err != nil {
fmt.Fprintf(l.stderr, "logtail: re-encoding JSON failed: %v\n", err)
// I know of no conditions under which this could fail.
// Report it very loudly.
panic("logtail: re-encoding JSON failed: " + err.Error())
}
b = append(b, '\n')
return b
}
func (l *logger) Write(buf []byte) (int, error) {
if len(buf) == 0 {
return 0, nil
}
if l.stderr != nil && l.stderr != ioutil.Discard {
if buf[len(buf)-1] == '\n' {
l.stderr.Write(buf)
} else {
// The log package always line-terminates logs,
// so this is an uncommon path.
bufnl := make([]byte, len(buf)+1)
copy(bufnl, buf)
bufnl[len(bufnl)-1] = '\n'
l.stderr.Write(bufnl)
}
}
b := l.encode(buf)
return l.send(b)
}
+20
View File
@@ -0,0 +1,20 @@
// Copyright (c) 2020 Tailscale Inc & AUTHORS All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package logtail
import (
"context"
"testing"
)
func TestFastShutdown(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
cancel()
l := Log(Config{
BaseURL: "http://localhost:1234",
})
l.Shutdown(ctx)
}