Escape analysis in the Go compiler

Cuong Manh Le

2020-09-26

Software Engineer

Agenda

2

Warm up

I recommend this course for anyone interested in learning about compilers.

3

Let's go

4

Go overview

5

Go

Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.

package main

import "fmt"

func main() {
    fmt.Println("Hello, 世界")
}
6

Why is Go great for modern distributed computing?

7

Go was built to solve problems at Google:

See more details here

8

Go Compiler Overview

9

What's compiler

A compiler is a computer program that translates high level human written computer code into machine executable code.

$ file main.go
main.go: C source, UTF-8 Unicode text

$ go tool compile main.go

$ ls
main.go main.o

$ file main.o
main.o: current ar archive

$ ar x main.o

$ ls
_go_.o main.go  main.o __.PKGDEF

$ file _go_.o
_go_.o: data
10

Go compiler (gc)

Translate go source code to machine code.

$ go tool compile main.go

Actually, "gc" is a separate executable, invoked by the "go" command:

$ go tool -n compile
/home/cuonglm/sources/go/pkg/tool/linux_amd64/compile
11

How does gc do it

Through many steps or phases, logically, there're four phases:

12

How does gc do it

Source

13

1 - Parsing

Includes both lexing and parsing

14

Lexer comes from lexical analysis which means converting a sequence of characters into a sequence of tokens/strings

Source

15

Source

16

Example

a := 1

will be parsed/lexed and produce:

[]ast.Stmt {
    &ast.AssignStmt {
        Lhs: []ast.Expr {
            &ast.Ident {Name: "a"},
        },
        Tok: :=,
        Rhs: []ast.Expr {
            &ast.BasicLit {
                ValuePos: 32,
                Kind: INT,
                Value: "1",
            },
        },
    },
}

This uses "go/ast", compiler uses another parser in "cmd/compile/internal/syntax", but it's similar.

17

2 - Type-checking and AST transformations

package main

func main() {
    var x string
    x = 1
    println(x)
}

The intermediate representation (IR) of source code in another form, specifically, it's a tree.

18

3 - SSA

Static single assignment form is another IR of the source code (not a tree):

  b1:
    v1 = InitMem <mem>
    v2 = SP <uintptr>
    v3 = SB <uintptr>
    v4 = Addr <*uint8> {type.string} v3
    v5 = Addr <*string> {""..stmp_0} v3
    v6 = IMake <interface {}> v4 v5 (~arg0[interface {}])
    v7 = ConstInterface <interface {}>
    v8 = ArrayMake1 <[1]interface {}> v7
    v9 = VarDef <mem> {.autotmp_11} v1
    ...
    v25 = ConstInterface <error> (fmt..autotmp_4[error], fmt.err[error]) DEAD
    v28 = OffPtr <*io.Writer> [0] v2
    v29 = Addr <*uint8> {go.itab.*os.File,io.Writer} v3
    v30 = Addr <**os.File> {os.Stdout} v3
    ...
19

SSA helps avoiding unnecessary operations that don't affect the final form of a variable's use: nil-check, bound check ...

20

4 - Generate machine code

        00000 (5)   TEXT    "".main(SB), ABIInternal
        00001 (5)   FUNCDATA    $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        00002 (5)   FUNCDATA    $1, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)
        00003 (5)   FUNCDATA    $3, "".main.stkobj(SB)
 v26    00004 (6)   XORPS   X0, X0
 v11    00005 (6)   MOVUPS  X0, ""..autotmp_11-16(SP)
 v20    00006 (6)   LEAQ    type.string(SB), AX
 v14    00007 (6)   MOVQ    AX, ""..autotmp_11-16(SP)
 v38    00008 (6)   LEAQ    ""..stmp_0(SB), AX
 v17    00009 (6)   MOVQ    AX, ""..autotmp_11-8(SP)
 v27    00010 (?)   NOP
# $GOROOT/src/fmt/print.go
 ...
v36    00019 (274) PCDATA  $1, $0
 v36    00020 (274) CALL    fmt.Fprintln(SB)
# main.go
 b4     00021 (6)   RET
        00022 (?)   END
21

That's 4 logical phases of go compiler

22

But ...

The second phase contains more sub-phases

23

But ...

The second phase contains more sub-phases

Escape Analysis is important to help the compiler decide if it should allocate variables on the heap or on the stack or how to store the associated data.

24

What is Escape Analysis

Escape Analysis is a method for determining the dynamic scope of pointers – where in the program a pointer can be accessed. It is related to pointer analysis and shape analysis.

From https://en.wikipedia.org/wiki/Escape_analysis

25

What is Escape Analysis

Escape Analysis is a method for determining the dynamic scope of pointers – where in the program a pointer can be accessed. It is related to pointer analysis and shape analysis.

"where" means stack or heap

26

Stack vs Heap

This is traditional way heap/stack work.

Within Go runtime:

27

Why is Escape Analysis necessary

The most benefit is helping convert heap allocation → stack allocation

28

Why is Escape Analysis necessary

That means:

29

Example

Stack

package main

import "testing"

type s struct {
    f *int
}

func stack() {
    x := s{}
    x.f = new(int)
}

func BenchmarkAlloc(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i <= b.N; i++ {
        stack()
    }
}
30

$ go test -bench=. benchstat_stack_test.go
goos: linux
goarch: amd64
BenchmarkAlloc-8       1000000000             0.241 ns/op           0 B/op           0 allocs/op
BenchmarkAlloc-8       1000000000             0.241 ns/op           0 B/op           0 allocs/op
BenchmarkAlloc-8       1000000000             0.239 ns/op           0 B/op           0 allocs/op
BenchmarkAlloc-8       1000000000             0.239 ns/op           0 B/op           0 allocs/op
BenchmarkAlloc-8       1000000000             0.239 ns/op           0 B/op           0 allocs/op
BenchmarkAlloc-8       1000000000             0.239 ns/op           0 B/op           0 allocs/op
BenchmarkAlloc-8       1000000000             0.240 ns/op           0 B/op           0 allocs/op
BenchmarkAlloc-8       1000000000             0.240 ns/op           0 B/op           0 allocs/op
BenchmarkAlloc-8       1000000000             0.241 ns/op           0 B/op           0 allocs/op
BenchmarkAlloc-8       1000000000             0.239 ns/op           0 B/op           0 allocs/op
PASS
ok      command-line-arguments    2.661s
31

Example

Heap

package main

import "testing"

type s struct {
    f *int
}

func heap() {
    x := &s{}
    x.f = new(int)
}

func BenchmarkAlloc(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i <= b.N; i++ {
        heap()
    }
}
32

$ go test -bench=. benchstat_heap_test.go
goos: linux
goarch: amd64
BenchmarkAlloc-8       100000000            10.9 ns/op           8 B/op           1 allocs/op
BenchmarkAlloc-8       100000000            11.6 ns/op           8 B/op           1 allocs/op
BenchmarkAlloc-8       94421409            11.2 ns/op           8 B/op           1 allocs/op
BenchmarkAlloc-8       100000000            10.9 ns/op           8 B/op           1 allocs/op
BenchmarkAlloc-8       100000000            11.0 ns/op           8 B/op           1 allocs/op
BenchmarkAlloc-8       100000000            11.5 ns/op           8 B/op           1 allocs/op
BenchmarkAlloc-8       100000000            12.0 ns/op           8 B/op           1 allocs/op
BenchmarkAlloc-8       98214091            11.1 ns/op           8 B/op           1 allocs/op
BenchmarkAlloc-8       105177076            11.2 ns/op           8 B/op           1 allocs/op
BenchmarkAlloc-8       100000000            11.1 ns/op           8 B/op           1 allocs/op
PASS
ok      command-line-arguments    12.126s
33

Example

$ benchstat stack.txt heap.txt
name     old time/op    new time/op    delta
Alloc-8    0.24ns ± 1%   11.25ns ± 7%  +4591.41%  (p=0.000 n=10+10)

name     old alloc/op   new alloc/op   delta
Alloc-8     0.00B          8.00B ± 0%      +Inf%  (p=0.000 n=10+10)

name     old allocs/op  new allocs/op  delta
Alloc-8      0.00           1.00 ± 0%      +Inf%  (p=0.000 n=10+10)
34

How is escape analysis implemented in the Go compiler?

35

Building a directed weighted graph

With some rules:

36

That means

var x struct { f, g *int }
var u []*int

x.f = u[0]

is modeled simply as:

x = *u
37

Building a directed weighted graph

With some rules:

p = &q    // -1
p = q     //  0
p = *q    //  1
p = **q   //  2

p = **&**&q  // 2
38

Run Bellman-Ford shortest path algorithm

To calculate the minimal number of dereferences from a "location" to others.

Bellman-Ford is slower than Dijkstra, but it can handle negative weight (from addressing operations).

Also do not have to worry about negative cycles, because the compiler does not allow "distances" to go below 0.

var x int
_ = &(&x)  // invalid
39

Example

With following code:

package p

var px *int

func foo() {
    var i int
    p := &i
    q := p
    px = q
}
40

Example

if __name__ == '__main__':
    graph = {
        'i': {},
        'p': {'i': -1},
        'q': {'p': 0},
        'heap': {'q': 0}
    }

    distance, _ = bellman_ford(graph, source='heap')

    print distance
$ python main.py
{'i': -1, 'p': 0, 'q': 0, 'heap': 0}

source

41

Static data-flow analysis

To determine whether a location outlives others, that means it may survive beyond other's lifetime if stack allocated.

A location outlives other if ...

42

Returned values

We don't know what the caller will do with the returned values.

Except for directly called closures:

var u int
p := func() *int { return &u }()
*p = 42
43

Higher loop scope

in the same function:

var l *int
for {
    l = new(int)
}
44

Other is declared within a child closure

var l *int
func() {
    l = new(int)
}
45

Escape analysis also records information about whether a function's parameters will escape.

var global *int

func f(p *int) { *p = 42 }
func g(p *int) { global = p }

func h() {
    f(new(int))  // does not escape
    g(new(int))  // does escape
}

That is because escape analysis firstly analyzes just f, then analyzes just *g, then just h, but it uses the results of previously analyzing f and g.

46

That's the theory

47

In practice

Go compiler provides tools for us to diagnose all of those things.

usage: compile [options] file.go...
   ...
        version,destination for JSON compiler/optimizer logging
  -l    disable inlining
  -lang string
        release to compile for
  -linkobj file
        write linker-specific object to file
  -linkshared
        generate code that will be linked against Go shared libraries
  -live
        debug liveness analysis
  -m    print optimization decisions
  -memprofile file
        write memory profile to file
  -memprofilerate rate
        set runtime.MemProfileRate to rate
  ...
48

Example

    10    func stack() {
    11        x := s{}
    12        x.f = new(int)
    13    }
    14
    15    func heap() {
    16        x := &s{}
    17        x.f = new(int)
    18    }
$ go tool compile -l -m stack_vs_heap_test.go
stack_vs_heap_test.go:12:11: new(int) does not escape
stack_vs_heap_test.go:16:7: &s{} does not escape
stack_vs_heap_test.go:17:11: new(int) escapes to heap
49

Example

$ go tool compile -l -m=2 stack_vs_heap_test.go
stack_vs_heap_test.go:12:11: new(int) does not escape
stack_vs_heap_test.go:17:11: new(int) escapes to heap:
stack_vs_heap_test.go:17:11:   flow: {heap} = &{storage for new(int)}:
stack_vs_heap_test.go:17:11:     from new(int) (spill) at stack_vs_heap_test.go:17:11
stack_vs_heap_test.go:17:11:     from x.f = new(int) (assign) at stack_vs_heap_test.go:17:6
stack_vs_heap_test.go:16:7: &s{} does not escape
stack_vs_heap_test.go:17:11: new(int) escapes to heap
50

Example

$ go tool compile -l -m=3 stack_vs_heap_test.go
stack_vs_heap_test.go:11:4:[1] stack stmt: x := s{}
stack_vs_heap_test.go:11:2:[1] stack stmt: var x s
stack_vs_heap_test.go:12:6:[1] stack stmt: x.f = new(int)
stack_vs_heap_test.go:12:11: new(int) does not escape
stack_vs_heap_test.go:16:4:[1] heap stmt: x := &s{}
stack_vs_heap_test.go:16:2:[1] heap stmt: var x *s
stack_vs_heap_test.go:17:6:[1] heap stmt: x.f = new(int)
stack_vs_heap_test.go:17:11: new(int) escapes to heap:
stack_vs_heap_test.go:17:11:   flow: {heap} = &{storage for new(int)}:
stack_vs_heap_test.go:17:11:     from new(int) (spill) at stack_vs_heap_test.go:17:11
stack_vs_heap_test.go:17:11:     from x.f = new(int) (assign) at stack_vs_heap_test.go:17:6
stack_vs_heap_test.go:16:7: &s{} does not escape
stack_vs_heap_test.go:17:11: new(int) escapes to heap
51

There's plenty of short-comings

var global *int

func f(x bool, p *int) { if x { global = p } }

func g() {
    f(false, new(int)) // BAD: new(int) is heap allocated, but would be safe to stack allocate here
}

Or assigment through a pointer is conservatively treated as a store to the heap.

func heap() {
    x := &s{}
    x.f = new(int)
}
52

Further reading

53

Q&A

54

Special thanks (listing order by reviewing date)

These folks have helped me review many things like parts of this talk and code.

55

Thank you

Cuong Manh Le

2020-09-26

Software Engineer

Use the left and right arrow keys or click the left and right edges of the page to navigate between slides.
(Press 'H' or navigate to hide this message.)