Escape analysis in the Go compiler
Cuong Manh Le
2020-09-26
Software Engineer
Cuong Manh Le
2020-09-26
Software Engineer
This talk assumes you did know fundamental concepts of a compiler.
The term gc in this talk stands for go compiler, not garbage collector.
There are some compilers for Go: gc, gccgo, tinygo ... This talk is about gc, the official Go compiler.
This talk use go version 1.15
I recommend this course for anyone interested in learning about compilers.
3Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.
package main import "fmt" func main() { fmt.Println("Hello, 世界") }
Go was built to solve problems at Google:
See more details here
8A compiler is a computer program that translates high level human written computer code into machine executable code.
$ file main.go main.go: C source, UTF-8 Unicode text $ go tool compile main.go $ ls main.go main.o $ file main.o main.o: current ar archive $ ar x main.o $ ls _go_.o main.go main.o __.PKGDEF $ file _go_.o _go_.o: data
Translate go source code to machine code.
$ go tool compile main.go
Actually, "gc" is a separate executable, invoked by the "go" command:
$ go tool -n compile /home/cuonglm/sources/go/pkg/tool/linux_amd64/compile
Through many steps or phases, logically, there're four phases:
Includes both lexing and parsing
14a := 1
will be parsed/lexed and produce:
[]ast.Stmt { &ast.AssignStmt { Lhs: []ast.Expr { &ast.Ident {Name: "a"}, }, Tok: :=, Rhs: []ast.Expr { &ast.BasicLit { ValuePos: 32, Kind: INT, Value: "1", }, }, }, }
This uses "go/ast", compiler uses another parser in "cmd/compile/internal/syntax", but it's similar.
17package main func main() { var x string x = 1 println(x) }
The intermediate representation (IR) of source code in another form, specifically, it's a tree.
18Static single assignment form is another IR of the source code (not a tree):
b1: v1 = InitMem <mem> v2 = SP <uintptr> v3 = SB <uintptr> v4 = Addr <*uint8> {type.string} v3 v5 = Addr <*string> {""..stmp_0} v3 v6 = IMake <interface {}> v4 v5 (~arg0[interface {}]) v7 = ConstInterface <interface {}> v8 = ArrayMake1 <[1]interface {}> v7 v9 = VarDef <mem> {.autotmp_11} v1 ... v25 = ConstInterface <error> (fmt..autotmp_4[error], fmt.err[error]) DEAD v28 = OffPtr <*io.Writer> [0] v2 v29 = Addr <*uint8> {go.itab.*os.File,io.Writer} v3 v30 = Addr <**os.File> {os.Stdout} v3 ...
SSA helps avoiding unnecessary operations that don't affect the final form of a variable's use: nil-check, bound check ...
2000000 (5) TEXT "".main(SB), ABIInternal 00001 (5) FUNCDATA $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 00002 (5) FUNCDATA $1, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB) 00003 (5) FUNCDATA $3, "".main.stkobj(SB) v26 00004 (6) XORPS X0, X0 v11 00005 (6) MOVUPS X0, ""..autotmp_11-16(SP) v20 00006 (6) LEAQ type.string(SB), AX v14 00007 (6) MOVQ AX, ""..autotmp_11-16(SP) v38 00008 (6) LEAQ ""..stmp_0(SB), AX v17 00009 (6) MOVQ AX, ""..autotmp_11-8(SP) v27 00010 (?) NOP # $GOROOT/src/fmt/print.go ... v36 00019 (274) PCDATA $1, $0 v36 00020 (274) CALL fmt.Fprintln(SB) # main.go b4 00021 (6) RET 00022 (?) END
The second phase contains more sub-phases
The second phase contains more sub-phases
Escape Analysis is important to help the compiler decide if it should allocate variables on the heap or on the stack or how to store the associated data.
24Escape Analysis is a method for determining the dynamic scope of pointers – where in the program a pointer can be accessed. It is related to pointer analysis and shape analysis.
From https://en.wikipedia.org/wiki/Escape_analysis
25Escape Analysis is a method for determining the dynamic scope of pointers – where in the program a pointer can be accessed. It is related to pointer analysis and shape analysis.
"where" means stack or heap
26This is traditional way heap/stack work.
Within Go runtime:
The most benefit is helping convert heap allocation → stack allocation
28That means:
Stack
package main import "testing" type s struct { f *int } func stack() { x := s{} x.f = new(int) } func BenchmarkAlloc(b *testing.B) { b.ReportAllocs() for i := 0; i <= b.N; i++ { stack() } }
$ go test -bench=. benchstat_stack_test.go goos: linux goarch: amd64 BenchmarkAlloc-8 1000000000 0.241 ns/op 0 B/op 0 allocs/op BenchmarkAlloc-8 1000000000 0.241 ns/op 0 B/op 0 allocs/op BenchmarkAlloc-8 1000000000 0.239 ns/op 0 B/op 0 allocs/op BenchmarkAlloc-8 1000000000 0.239 ns/op 0 B/op 0 allocs/op BenchmarkAlloc-8 1000000000 0.239 ns/op 0 B/op 0 allocs/op BenchmarkAlloc-8 1000000000 0.239 ns/op 0 B/op 0 allocs/op BenchmarkAlloc-8 1000000000 0.240 ns/op 0 B/op 0 allocs/op BenchmarkAlloc-8 1000000000 0.240 ns/op 0 B/op 0 allocs/op BenchmarkAlloc-8 1000000000 0.241 ns/op 0 B/op 0 allocs/op BenchmarkAlloc-8 1000000000 0.239 ns/op 0 B/op 0 allocs/op PASS ok command-line-arguments 2.661s
Heap
package main import "testing" type s struct { f *int } func heap() { x := &s{} x.f = new(int) } func BenchmarkAlloc(b *testing.B) { b.ReportAllocs() for i := 0; i <= b.N; i++ { heap() } }
$ go test -bench=. benchstat_heap_test.go goos: linux goarch: amd64 BenchmarkAlloc-8 100000000 10.9 ns/op 8 B/op 1 allocs/op BenchmarkAlloc-8 100000000 11.6 ns/op 8 B/op 1 allocs/op BenchmarkAlloc-8 94421409 11.2 ns/op 8 B/op 1 allocs/op BenchmarkAlloc-8 100000000 10.9 ns/op 8 B/op 1 allocs/op BenchmarkAlloc-8 100000000 11.0 ns/op 8 B/op 1 allocs/op BenchmarkAlloc-8 100000000 11.5 ns/op 8 B/op 1 allocs/op BenchmarkAlloc-8 100000000 12.0 ns/op 8 B/op 1 allocs/op BenchmarkAlloc-8 98214091 11.1 ns/op 8 B/op 1 allocs/op BenchmarkAlloc-8 105177076 11.2 ns/op 8 B/op 1 allocs/op BenchmarkAlloc-8 100000000 11.1 ns/op 8 B/op 1 allocs/op PASS ok command-line-arguments 12.126s
$ benchstat stack.txt heap.txt name old time/op new time/op delta Alloc-8 0.24ns ± 1% 11.25ns ± 7% +4591.41% (p=0.000 n=10+10) name old alloc/op new alloc/op delta Alloc-8 0.00B 8.00B ± 0% +Inf% (p=0.000 n=10+10) name old allocs/op new allocs/op delta Alloc-8 0.00 1.00 ± 0% +Inf% (p=0.000 n=10+10)
With some rules:
var x struct { f, g *int } var u []*int x.f = u[0]
is modeled simply as:
x = *u
With some rules:
p = &q // -1 p = q // 0 p = *q // 1 p = **q // 2 p = **&**&q // 2
To calculate the minimal number of dereferences from a "location" to others.
Bellman-Ford is slower than Dijkstra, but it can handle negative weight (from addressing operations).
Also do not have to worry about negative cycles, because the compiler does not allow "distances" to go below 0.
var x int _ = &(&x) // invalid
With following code:
package p var px *int func foo() { var i int p := &i q := p px = q }
if __name__ == '__main__': graph = { 'i': {}, 'p': {'i': -1}, 'q': {'p': 0}, 'heap': {'q': 0} } distance, _ = bellman_ford(graph, source='heap') print distance
$ python main.py {'i': -1, 'p': 0, 'q': 0, 'heap': 0}
To determine whether a location outlives others, that means it may survive beyond other's lifetime if stack allocated.
A location outlives other if ...
42We don't know what the caller will do with the returned values.
Except for directly called closures:
var u int p := func() *int { return &u }() *p = 42
in the same function:
var l *int for { l = new(int) }
var l *int func() { l = new(int) }
Escape analysis also records information about whether a function's parameters will escape.
var global *int func f(p *int) { *p = 42 } func g(p *int) { global = p } func h() { f(new(int)) // does not escape g(new(int)) // does escape }
That is because escape analysis firstly analyzes just f, then analyzes just *g, then just h, but it uses the results of previously analyzing f and g.
46Go compiler provides tools for us to diagnose all of those things.
usage: compile [options] file.go... ... version,destination for JSON compiler/optimizer logging -l disable inlining -lang string release to compile for -linkobj file write linker-specific object to file -linkshared generate code that will be linked against Go shared libraries -live debug liveness analysis -m print optimization decisions -memprofile file write memory profile to file -memprofilerate rate set runtime.MemProfileRate to rate ...
10 func stack() { 11 x := s{} 12 x.f = new(int) 13 } 14 15 func heap() { 16 x := &s{} 17 x.f = new(int) 18 }
$ go tool compile -l -m stack_vs_heap_test.go stack_vs_heap_test.go:12:11: new(int) does not escape stack_vs_heap_test.go:16:7: &s{} does not escape stack_vs_heap_test.go:17:11: new(int) escapes to heap
$ go tool compile -l -m=2 stack_vs_heap_test.go stack_vs_heap_test.go:12:11: new(int) does not escape stack_vs_heap_test.go:17:11: new(int) escapes to heap: stack_vs_heap_test.go:17:11: flow: {heap} = &{storage for new(int)}: stack_vs_heap_test.go:17:11: from new(int) (spill) at stack_vs_heap_test.go:17:11 stack_vs_heap_test.go:17:11: from x.f = new(int) (assign) at stack_vs_heap_test.go:17:6 stack_vs_heap_test.go:16:7: &s{} does not escape stack_vs_heap_test.go:17:11: new(int) escapes to heap
$ go tool compile -l -m=3 stack_vs_heap_test.go stack_vs_heap_test.go:11:4:[1] stack stmt: x := s{} stack_vs_heap_test.go:11:2:[1] stack stmt: var x s stack_vs_heap_test.go:12:6:[1] stack stmt: x.f = new(int) stack_vs_heap_test.go:12:11: new(int) does not escape stack_vs_heap_test.go:16:4:[1] heap stmt: x := &s{} stack_vs_heap_test.go:16:2:[1] heap stmt: var x *s stack_vs_heap_test.go:17:6:[1] heap stmt: x.f = new(int) stack_vs_heap_test.go:17:11: new(int) escapes to heap: stack_vs_heap_test.go:17:11: flow: {heap} = &{storage for new(int)}: stack_vs_heap_test.go:17:11: from new(int) (spill) at stack_vs_heap_test.go:17:11 stack_vs_heap_test.go:17:11: from x.f = new(int) (assign) at stack_vs_heap_test.go:17:6 stack_vs_heap_test.go:16:7: &s{} does not escape stack_vs_heap_test.go:17:11: new(int) escapes to heap
var global *int func f(x bool, p *int) { if x { global = p } } func g() { f(false, new(int)) // BAD: new(int) is heap allocated, but would be safe to stack allocate here }
Or assigment through a pointer is conservatively treated as a store to the heap.
func heap() { x := &s{} x.f = new(int) }
These folks have helped me review many things like parts of this talk and code.
55