While not required, it’s highly recommended that you read Dave Cheney’s post on Dynamically Scoped Variables before continuing.
What and Why?
Consider an instance of context.Context as it flows through the code.
Typically it’s created at the beginning of a transaction and enriched or referenced throughout the transaction.
Sometimes it’s not passed everywhere – it certainly doesn’t need to be.
Consider a scenario where you have an encryption library and it contains a simple function, AESEncrypt:
package crypto

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"github.com/spf13/viper"
	"io"
)

func AESEncrypt(keyID string, plaintext []byte) (ciphertext, nonce []byte, err error) {
	key := viper.GetString(keyID) // retrieve the key from config file
	block, err := aes.NewCipher([]byte(key))
	if err != nil {
		return // key was malformed?
	}
	nonce = make([]byte, 12)
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		panic(err.Error()) // if you can't read your random, you've got problems
	}
	aesgcm, err := cipher.NewGCM(block)
	if err != nil {
		panic(err.Error()) // inputs are effectively static
	}
	ciphertext = aesgcm.Seal(nil, nonce, plaintext, nil)
	return
}
The function reads in an encryption key, builds a cipher, and encrypts the input. It works faithfully for a few months, until someone decides keys need to be stored on a keyserver and retrieved at runtime. Now the earlier call
viper.GetString(keyID) // retrieve the key from config file
must change to
key := keyserverClient.GetKey(context.TODO(), keyID) // retrieve the key from the keyserver
and now we have no context to provide.
In a perfect world, we’d refactor AESEncrypt and change its function signature to accept a context.
But I’m lazy and want a different approach.
We had the context further up the callstack, but it was omitted at some point in the stack. We want it back now.
How?
The Limitations
To begin this journey, let’s note some of the restrictions placed on context.Context.
Each restriction limits the work we have to do: a narrower scope means fewer scenarios to handle.
- A context SHOULD be the first parameter in the function signature. The go lint linter’s lintContextArgs check looks for this
- Context is an interface which contains four unexported implementations under the hood: deadline, cancel, value, and empty.
So we always know where a context will be in a function signature and also the legal values of its underlying concrete type.
The next tools for solving the problem come from understanding the internals of Go, specifically its interfaces.
Interfaces, such as the context.Context interface, are made up of two parts: an itab and the data.
The runtime2.go source defines it simply as:
type iface struct {
	tab  *itab
	data unsafe.Pointer
}
itab, short for “itable”, short for “interface table”, contains all the details the Go linker and runtime require for duck typing to work.
A more in-depth view is available at the go-internals Interfaces chapter.
For the time being, we need to know two things:
- The itab and data pointers make up an interface
- Different instances of the same concrete interface implementation will recycle the same itab
Interfaces at Runtime
At runtime, interfaces aren’t passed as the single iface struct we saw defined in runtime2.go.
Instead, they’re passed as the contents: the (itab, data) tuple.
We can demonstrate this with the trivial example below: a two-frame stack, where the caller passes a single interface as a parameter to the callee.
package main

import "context"

func main() {
	panicker(context.WithValue(context.Background(), "key", "oops!"))
}

//go:noinline
func panicker(ctx context.Context) {
	panic(ctx.Value("key"))
}
In the resulting output stacktrace, the callee shows two parameters:
panic: oops!
goroutine 1 [running]:
main.panicker(0x1099b60, 0xc000068060)
/Users/aidan/go/src/github.com/raidancampbell.github.io.source/content/scratch/abusing-context-part-ii.go:11 +0x61
main.main()
/Users/aidan/go/src/github.com/raidancampbell.github.io.source/content/scratch/abusing-context-part-ii.go:6 +0x7a
The bottom of the stack (using the “stack grows downwards” terminology, whereas the stacktrace shows current execution at the top) has the two values in question: main.panicker(0x1099b60, 0xc000068060).
While only one argument was actually passed to the function, two appeared in the call stack.
Why? I’m not sure why Go does it this way.
My guess is that the majority of the itab is constructed at compile (linking) time, and is required for the way interfaces are treated in Go.
This theory is reinforced by the memory address offset: the itab appears much earlier in memory compared to the data.
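A quick check of the two-word claim, added here as an aside rather than taken from the original post: an interface value occupies two pointer-sized words, and the traces in this post print argument words rather than source-level arguments. Note that Go 1.17 and later print each argument separately and wrap aggregate arguments such as interfaces in curly braces, so the exact text of a trace depends on the toolchain. wordsPerInterface is a name invented for this sketch.

```go
package main

import (
	"context"
	"fmt"
	"unsafe"
)

// wordsPerInterface reports how many pointer-sized words a context.Context value occupies.
func wordsPerInterface() uintptr {
	var ctx context.Context
	return unsafe.Sizeof(ctx) / unsafe.Sizeof(uintptr(0))
}

func main() {
	fmt.Println(wordsPerInterface()) // prints 2
}
```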
Rebuilding an interface
Let’s modify the above example to gain access to the stack as a string:
package main

import (
	"bufio"
	"bytes"
	"context"
	"fmt"
	"runtime"
)

func main() {
	panicker(context.WithValue(context.Background(), "key", "oops!"))
}

func panicker(_ context.Context) {
	var buf [8192]byte
	n := runtime.Stack(buf[:], false) // get the current callstack as a string
	sc := bufio.NewScanner(bytes.NewReader(buf[:n]))
	for sc.Scan() {
		fmt.Println(sc.Text())
	}
}
Minus the panic’s message, the results remain the same:
goroutine 1 [running]:
main.panicker(0x10ee820, 0xc000090180)
/Users/aidan/go/src/github.com/raidancampbell.github.io.source/content/scratch/abusing-context-part-ii.go:17 +0x69
main.main()
/Users/aidan/go/src/github.com/raidancampbell.github.io.source/content/scratch/abusing-context-part-ii.go:12 +0x7a
We know 0x10ee820 and 0xc000090180 compose the input context for panicker.
Using the unsafe package we can rebuild the incoming context:
package main

import (
	"bufio"
	"bytes"
	"context"
	"fmt"
	"regexp"
	"runtime"
	"unsafe"
)

func main() {
	panicker(context.WithValue(context.Background(), "key", "oops!"))
}

var twoParamPatt = regexp.MustCompile(`^.+[a-zA-Z][a-zA-Z0-9\-_]*\.[a-zA-Z][a-zA-Z0-9\-_]*\((?P<type_itab>0x[0-9a-f]+), (?P<type_data>0x[0-9a-f]+).+`)

func panicker(_ context.Context) {
	var buf [8192]byte
	n := runtime.Stack(buf[:], false) // get the current callstack as a string
	sc := bufio.NewScanner(bytes.NewReader(buf[:n]))
	for sc.Scan() {
		// match the expected regex. for this example, we're only expecting the below match (addrs will vary):
		// main.panicker(0x10ee820, 0xc000090180)
		matches := twoParamPatt.FindStringSubmatch(sc.Text())
		if matches == nil {
			continue
		}
		// grab the two memory addresses (itab and data value)
		var p1, p2 uintptr
		_, err1 := fmt.Sscanf(matches[1], "%v", &p1)
		_, err2 := fmt.Sscanf(matches[2], "%v", &p2)
		if err1 != nil || err2 != nil {
			continue
		}
		// put the two pointers into the iface layout
		idata := [2]uintptr{p1, p2}
		// declare that the iface is a context
		ctx := *(*context.Context)(unsafe.Pointer(&idata))
		// use the context
		fmt.Println(ctx.Value("key"))
	}
}
Walking through the above code:
var twoParamPatt = regexp.MustCompile(`^.+[a-zA-Z][a-zA-Z0-9\-_]*\.[a-zA-Z][a-zA-Z0-9\-_]*\((?P<type_itab>0x[0-9a-f]+), (?P<type_data>0x[0-9a-f]+).+`)
A rather nasty regex for matching the “.(arg1, arg2” part of a stacktrace.
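To see what the pattern does and does not accept, here is a small sketch that runs it against a few sample lines. firstTwoArgs is a helper invented for this example, and the third sample line is the brace-grouped form that Go 1.17 and later print for interface arguments, which this pattern was not written for.

```go
package main

import (
	"fmt"
	"regexp"
)

// twoParamPatt is the pattern from the walkthrough, copied unchanged.
var twoParamPatt = regexp.MustCompile(`^.+[a-zA-Z][a-zA-Z0-9\-_]*\.[a-zA-Z][a-zA-Z0-9\-_]*\((?P<type_itab>0x[0-9a-f]+), (?P<type_data>0x[0-9a-f]+).+`)

// firstTwoArgs returns the first two hex arguments of a traceback line,
// or ok=false when the line is not a frame with at least two hex arguments.
func firstTwoArgs(line string) (a, b string, ok bool) {
	m := twoParamPatt.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	for _, line := range []string{
		"main.panicker(0x10ee820, 0xc000090180)",   // matches
		"main.main()",                              // no arguments: no match
		"main.panicker({0x10ee820, 0xc000090180})", // brace-grouped form: no match
	} {
		a, b, ok := firstTwoArgs(line)
		fmt.Printf("%q -> %q %q %v\n", line, a, b, ok)
	}
}
```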
var p1, p2 uintptr
_, err1 := fmt.Sscanf(matches[1], "%v", &p1)
_, err2 := fmt.Sscanf(matches[2], "%v", &p2)
if err1 != nil || err2 != nil {
	continue
}
Extracting the string hex addresses into actual uintptr values. In the above execution, these would be 0x10ee820 and 0xc000090180.
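The parsing step can be tried in isolation. The sketch below uses made-up helper names (parseAddrScan, parseAddrStrconv): the first mirrors the Sscanf call above, and the second is an alternative using strconv.ParseUint with base 0, which infers the base from the 0x prefix.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseAddrScan mirrors the walkthrough: %v accepts the 0x prefix when scanning into an integer.
func parseAddrScan(s string) (uintptr, error) {
	var p uintptr
	_, err := fmt.Sscanf(s, "%v", &p)
	return p, err
}

// parseAddrStrconv is an alternative: base 0 tells ParseUint to infer the base from the prefix.
func parseAddrStrconv(s string) (uintptr, error) {
	v, err := strconv.ParseUint(s, 0, 64)
	return uintptr(v), err
}

func main() {
	a, errA := parseAddrScan("0x10ee820")
	b, errB := parseAddrStrconv("0x10ee820")
	fmt.Println(a == b, errA, errB)

	// An elided frame prints "..." instead of addresses, which neither parser accepts.
	_, err := parseAddrStrconv("...")
	fmt.Println(err != nil)
}
```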
idata := [2]uintptr{p1, p2}
This is placing two pointers in the same layout as the iface struct: itab pointer first, data pointer second.
Remember that the internal struct for an interface looks like this:
type iface struct {
	tab  *itab
	data unsafe.Pointer
}
ctx := *(*context.Context)(unsafe.Pointer(&idata))
Create an unsafe pointer to the effective iface struct instance, then cast that pointer to a context.Context pointer, then dereference the context.Context pointer for your final value.
At this point we’ve “recovered” the context, but it’s a bit moot since we were given the context to begin with.
How would we handle true context recovery if it existed somewhere up the stack, but was never passed to us?
Context recovery through dynamic scoping
The above code has a fatal flaw: it finds the first function up the stack with two parameters, and blindly jams them into a context.Context.
In reality we don’t know if the first two parameters are the itab and data pointers: they may be ints, or even an itab and data pointer for an unrelated interface.
Here we call upon our initial restrictions: context.Context should always be first, and we know the four possible implementations of context.
Knowing that there are four possible implementations allows us to enumerate the legal itab pointers on invocation:
func RecoverCtx() (context.Context, error) {
	return emptyItab(context.Background())
}

//go:noinline
func emptyItab(_ context.Context) (context.Context, error) {
	return valueItab(context.WithValue(context.Background(), "", ""))
}

//go:noinline
func valueItab(_ context.Context) (context.Context, error) {
	ctx, c := context.WithCancel(context.Background())
	defer c()
	return cancelItab(ctx)
}

//go:noinline
func cancelItab(_ context.Context) (context.Context, error) {
	ctx, c := context.WithDeadline(context.Background(), time.Now())
	defer c()
	return timerItab(ctx)
}

//go:noinline
func timerItab(_ context.Context) (context.Context, error) {
	return doGetCtx()
}

func doGetCtx() (context.Context, error) {
	// TODO: implemented later in this post
	panic("not implemented")
}
The above outlines a call chain: the entry point is the exported RecoverCtx function, which passes through several other functions on its way down to the actual implementation.
Each of these functions receives a context, then passes it to the next function.
When viewing the stack at runtime, it would look like this:
libraidan/pkg/runsafe.doGetCtx(0x0, 0x0, 0x0, 0x0)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:59 +0xbb
libraidan/pkg/runsafe.timerItab(0x14b5ce0, 0xc0000a44e0, 0x0, 0x0, 0x0, 0x0)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:52 +0x4c
libraidan/pkg/runsafe.cancelItab(0x14b5c60, 0xc0000e8c40, 0x0, 0x0, 0x0, 0x0)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:47 +0x1a5
libraidan/pkg/runsafe.valueItab(0x14b5d20, 0xc00009d320, 0x0, 0x0, 0x0, 0x0)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:40 +0x150
libraidan/pkg/runsafe.emptyItab(0x14b5ca0, 0xc0000a6008, 0x0, 0x0, 0x0, 0x0)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:33 +0xd7
libraidan/pkg/runsafe.RecoverCtx(0x0, 0x0, 0x0, 0x0)
Using this stack, we can build guarantees:
- we know how many functions live above our call to runtime.Stack
- we know the name of each of these functions
- we know the concrete input type (effectively the itab) each function receives
Why all the //go:noinline pragmas? The compiler would inline the trivial functions, and we’d lose the parameter addresses.
The same stack, with inlining left enabled:
libraidan/pkg/runsafe.doGetCtx(0x13a1520, 0xc0000a6008, 0xbfae3de322870128, 0xbc60e)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:59 +0x5b
libraidan/pkg/runsafe.timerItab(...)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:52
libraidan/pkg/runsafe.cancelItab(0x13a14e0, 0xc0000e09c0, 0x0, 0x0, 0x0, 0x0)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:47 +0x9e
libraidan/pkg/runsafe.valueItab(0x13a15a0, 0xc00009d320, 0x0, 0x0, 0x0, 0x0)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:40 +0x82
libraidan/pkg/runsafe.emptyItab(0x13a1520, 0xc0000a6008, 0xc000040680, 0x100e808, 0x30, 0x130a680)
/Users/aidan/go/src/libraidan/pkg/runsafe/context.go:33 +0x7e
libraidan/pkg/runsafe.RecoverCtx(...)
With the parameters elided, we lose the itab pointers and can’t enumerate the legal values.
In this scenario, we would be missing out on recovering certain context implementations.
With all the above in mind, we’re finally ready to implement the doGetCtx() function:
func doGetCtx() (context.Context, error) {
	var buf [8192]byte
	n := runtime.Stack(buf[:], false) // get the current callstack as a string
	sc := bufio.NewScanner(bytes.NewReader(buf[:n]))
	var (
		// hold the type itab pointers for each of the context implementations
		deadlineType, cancelType, valueType, emptyType uintptr
		// used to count our way up the stack,
		// as the stack is constant the lowest few levels and we need to leverage that
		stackMatch int
	)
	for sc.Scan() { // for each line (walking up the stack from here)
		// if the line doesn't match, skip.
		matches := pattern.FindStringSubmatch(sc.Text())
		if matches == nil {
			continue
		}
		// if this is the first iteration, then it's just our function. skip it.
		if stackMatch == 0 && strings.Contains(sc.Text(), "doGetCtx") {
			continue
		}
		stackMatch++
		// grab the two memory addresses (itab and data value)
		var p1, p2 uintptr
		_, err1 := fmt.Sscanf(matches[1], "%v", &p1)
		_, err2 := fmt.Sscanf(matches[2], "%v", &p2)
		if err1 != nil || err2 != nil {
			continue
		}
		// build up the legal values for each implementation of context
		// the stackMatch must match the known location in the stack.
		// Otherwise we might return a malformed context
		if stackMatch == 1 && strings.Contains(sc.Text(), "timerItab") {
			deadlineType = p1
		} else if stackMatch == 2 && strings.Contains(sc.Text(), "cancelItab") {
			cancelType = p1
		} else if stackMatch == 3 && strings.Contains(sc.Text(), "valueItab") {
			valueType = p1
		} else if stackMatch == 4 && strings.Contains(sc.Text(), "emptyItab") {
			emptyType = p1
		} else if p1 != emptyType && p1 != valueType && p1 != cancelType && p1 != deadlineType {
			// if we're in the caller's code, and the first parameter isn't a
			// known context implementation, then skip this stack frame
			continue
		}
		if stackMatch <= 4 { // we're still building the legal context implementations
			continue
		}
		// at this point we're done building the legal context implementations,
		// and this matched one. rebuild a context from the addresses, and return
		idata := [2]uintptr{p1, p2}
		return *(*context.Context)(unsafe.Pointer(&idata)), nil
	}
	// no context was found. Return a non-nil context to be polite, but also return an error.
	return context.Background(), UnrecoverableContext{}
}
We leverage the known frames as we move up the callstack from doGetCtx to RecoverCtx.
Each of these frames is verified by name and location, then the itab pointer value is stored.
Once we’ve left the comfort of our guaranteed callstack, we need to match the first address against one of the known itab values.
If one is found, then the resulting context is built and returned.
But there’s no guarantee this will happen: the context could be elided due to inlining in the caller’s code, or the caller could have never had a context to begin with!
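The matching step for caller frames can be exercised on plain text, without a live stack. The sketch below is a simplification under stated assumptions: findContextWords and framePatt are invented names, the set of known itab values is passed in rather than collected from the helper frames, and the name-and-position checks for those helper frames are left out. The sample lines are hypothetical frames in the older, ungrouped trace format.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// framePatt is the two-parameter pattern from earlier in the post, reused unchanged.
var framePatt = regexp.MustCompile(`^.+[a-zA-Z][a-zA-Z0-9\-_]*\.[a-zA-Z][a-zA-Z0-9\-_]*\((?P<type_itab>0x[0-9a-f]+), (?P<type_data>0x[0-9a-f]+).+`)

// findContextWords walks traceback lines from the innermost frame outwards and
// returns the first (itab, data) pair whose itab is in the known set.
func findContextWords(lines []string, known map[uintptr]bool) (itab, data uintptr, ok bool) {
	for _, line := range lines {
		m := framePatt.FindStringSubmatch(line)
		if m == nil {
			continue // not a frame line with at least two hex arguments
		}
		p1, err1 := strconv.ParseUint(m[1], 0, 64)
		p2, err2 := strconv.ParseUint(m[2], 0, 64)
		if err1 != nil || err2 != nil {
			continue
		}
		if known[uintptr(p1)] {
			return uintptr(p1), uintptr(p2), true
		}
	}
	return 0, 0, false
}

func main() {
	// Hypothetical caller frames; only the second starts with a known itab.
	lines := []string{
		"main.sum(0x2, 0x3, 0x5)",
		"main.handle(0x14b5d20, 0xc00009d320, 0x0, 0x0)",
		"main.main()",
	}
	known := map[uintptr]bool{0x14b5d20: true} // the valueItab address from the trace above
	itab, data, ok := findContextWords(lines, known)
	fmt.Printf("%#x %#x %v\n", itab, data, ok)

	// A stack whose frames were all inlined or never held a context yields nothing.
	_, _, ok = findContextWords([]string{"main.handle(...)", "main.main()"}, known)
	fmt.Println(ok)
}
```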
The full code, which I do not recommend using, is available in libraidan’s runsafe package.