About golang: how Go's append grows a slice

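In the article 《切片传递的暗藏危机》 (The Hidden Dangers of Passing Slices), 小菜刀 briefly touched on slice growth. In the reader discussion group, someone posted the following example and asked for a sound explanation.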

package main

func main() {
    s := []int{1,2}
    s = append(s, 3,4,5)
    println(cap(s))
}
// output: 6

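Why is the result 6 rather than 5 or 8? 小菜刀's description of growth in that article was not precise enough, and it left readers confused. This article takes the opportunity to look closely at the append function and the growth mechanism behind it.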

As we know, append can be called directly without importing any package. It is a built-in function, declared in builtin.go of the builtin source package.

// The append built-in function appends elements to the end of a slice. If
// it has sufficient capacity, the destination is resliced to accommodate the
// new elements. If it does not, a new underlying array will be allocated.
// Append returns the updated slice. It is therefore necessary to store the
// result of append, often in the variable holding the slice itself:
//    slice = append(slice, elem1, elem2)
//    slice = append(slice, anotherSlice...)
// As a special case, it is legal to append a string to a byte slice, like this:
//    slice = append([]byte("hello "), "world"...)
func append(slice []Type, elems ...Type) []Type

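append adds one or more values to a slice, and those values are stored in the slice's underlying array. An array has a fixed length. If the array's remaining space can hold the appended values, they are simply written into it. Once the total length after appending exceeds the length of the original array, that array can no longer hold the data. What happens then?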
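Both outcomes can be observed from ordinary Go code. The following is a minimal sketch rather than anything taken from the original article; the helper name sharesBacking is introduced here purely for illustration and is not part of the standard library.

```go
package main

import "fmt"

// sharesBacking (hypothetical helper) reports whether two non-empty
// slices start at the same element of the same underlying array.
func sharesBacking(a, b []int) bool {
	return len(a) > 0 && len(b) > 0 && &a[0] == &b[0]
}

func main() {
	// len 2, cap 4: one more element fits, so append reslices in place.
	roomy := make([]int, 2, 4)
	grown := append(roomy, 9)
	fmt.Println(sharesBacking(roomy, grown)) // true

	// len 2, cap 2: no room left, so append must allocate a new array.
	full := []int{1, 2}
	moved := append(full, 9)
	fmt.Println(sharesBacking(full, moved)) // false
}
```

Comparing the addresses of the first elements is enough here because both slices in each pair start at index 0 of their arrays.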

We also notice that this file only declares the function signature and contains no implementation at all. So how is append actually implemented?

The compilation process

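To answer these questions, we can start from compilation. Compiling Go can be divided into four phases: lexical and syntax analysis, type checking and abstract syntax tree (AST) transformation, intermediate code generation, and final machine code generation.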

We mainly care about the code for the second and third phases. First, the type-checking logic in src/cmd/compile/internal/gc/typecheck.go:

func typecheck1(n *Node, top int) (res *Node) {
    ...
    switch n.Op {
    case OAPPEND:
    ...
}

Next, the AST transformation logic in src/cmd/compile/internal/gc/walk.go:

func walkexpr(n *Node, init *Nodes) *Node {
    ...
    case OAPPEND:
            // x = append(...)
            r := n.Right
            if r.Type.Elem().NotInHeap() {
                yyerror("%v can't be allocated in Go; it is incomplete (or unallocatable)", r.Type.Elem())
            }
            switch {
            case isAppendOfMake(r):
                // x = append(y, make([]T, y)...)
                r = extendslice(r, init)
            case r.IsDDD():
                r = appendslice(r, init) // also works for append(slice, string).
            default:
                r = walkappend(r, init, n)
            }
    ...
}

And finally, the intermediate code generation logic in src/cmd/compile/internal/gc/ssa.go:

// append converts an OAPPEND node to SSA.
// If inplace is false, it converts the OAPPEND expression n to an ssa.Value,
// adds it to s, and returns the Value.
// If inplace is true, it writes the result of the OAPPEND expression n
// back to the slice being appended to, and returns nil.
// inplace MUST be set to false if the slice can be SSA'd.
func (s *state) append(n *Node, inplace bool) *ssa.Value {
    ...
}

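Of these, the state.append method in the intermediate code generation phase is the one to focus on. Its inplace parameter indicates whether the result is written back over the original variable. When it is false, the expansion looks like the following (note: this is pseudocode to aid understanding, not the actual code in state.append). 小菜刀 also noticed that writing append(s, e1, e2, e3) as a bare statement, with nothing receiving the result, does not compile, and so was not yet sure where this case arises. One plausible reading is that it covers append used as a value, for example t := append(s, e1, e2, e3), where the result goes to a different variable.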

    // If inplace is false, process as expression "append(s, e1, e2, e3)":
    ptr, len, cap := s
    newlen := len + 3
    if newlen > cap {
        ptr, len, cap = growslice(s, newlen)
        newlen = len + 3 // recalculate to avoid a spill
    }
    // with write barriers, if needed:
    *(ptr+len) = e1
    *(ptr+len+1) = e2
    *(ptr+len+2) = e3
    return makeslice(ptr, newlen, cap)

When it is true, as in the statement slice = append(slice, 1, 2, 3), the result overwrites the original variable. The expansion looks like this:

    // If inplace is true, process as statement "s = append(s, e1, e2, e3)":
    a := &s
    ptr, len, cap := s
    newlen := len + 3
    if uint(newlen) > uint(cap) {
        newptr, len, newcap = growslice(ptr, len, cap, newlen)
        vardef(a)       // if necessary, advise liveness we are writing a new a
        *a.cap = newcap // write before ptr to avoid a spill
        *a.ptr = newptr // with write barrier
    }
    newlen = len + 3 // recalculate to avoid a spill
    *a.len = newlen
    // with write barriers, if needed:
    *(ptr+len) = e1
    *(ptr+len+1) = e2
    *(ptr+len+2) = e3

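Whether or not inplace is true, the code first reads the slice's array pointer, length, and capacity. If the new length after appending exceeds the original capacity, it calls runtime.growslice to grow the slice, and then writes the new elements into it one by one.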
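The shape of that expansion can be imitated in ordinary Go. The sketch below is illustrative only: appendSketch is a hypothetical name, it handles only []int, and its growth step is a deliberately naive stand-in for runtime.growslice that allocates exactly the needed capacity, whereas the real runtime usually allocates more.

```go
package main

import "fmt"

// appendSketch (hypothetical) mirrors the three steps of the expansion:
// check capacity, grow if needed, then store the new elements.
func appendSketch(s []int, elems ...int) []int {
	newlen := len(s) + len(elems)
	if newlen > cap(s) {
		// Naive stand-in for growslice: allocate exactly newlen
		// elements and copy the old data across.
		grown := make([]int, len(s), newlen)
		copy(grown, s)
		s = grown
	}
	s = s[:newlen]                     // update len
	copy(s[newlen-len(elems):], elems) // store e1, e2, e3, ...
	return s
}

func main() {
	s := []int{1, 2}
	s = appendSketch(s, 3, 4, 5)
	fmt.Println(s, len(s)) // [1 2 3 4 5] 5
}
```

Just as in the pseudocode, the growth branch only prepares the storage; writing the appended elements happens afterwards, on both paths.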

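Therefore, appending the element 1 to an int slice that already contains 1, 2, 3, using slice = append(slice, 1), falls into one of two cases.

Case 1: the slice's underlying array still has room for the appended element.

Case 2: the slice's underlying array has no room left for the appended element, so the growth function must be called.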

The growth function

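As mentioned above, when an append finds that the remaining space in the underlying array cannot hold the new elements, it calls growslice. The cap argument passed to it is the total length of the slice after the append.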

The code of growslice is fairly long, so we split it into three logical parts.

Initial estimate of the slice capacity

func growslice(et *_type, old slice, cap int) slice {
    ...
    newcap := old.cap
    doublecap := newcap + newcap
    if cap > doublecap {
        newcap = cap
    } else {
        if old.len < 1024 {
            newcap = doublecap
        } else {
            // Check 0 < newcap to detect overflow
            // and prevent an infinite loop.
            for 0 < newcap && newcap < cap {
                newcap += newcap / 4
            }
            // Set newcap to the requested cap when
            // the newcap calculation overflowed.
            if newcap <= 0 {
                newcap = cap
            }
        }
    }
    ...
}

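In this step, if the required capacity cap is more than twice the old capacity (doublecap), the required capacity is used directly as the new capacity newcap. Otherwise, when the old slice's length is less than 1024, the capacity is simply doubled. When the old slice's length is 1024 or more, the capacity is increased by 25% repeatedly until it is at least the required capacity.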
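To make the three branches concrete, the first stage can be lifted into a standalone function. This is a sketch based only on the excerpt quoted above; initialCap is a hypothetical name, and it takes plain integers instead of the runtime's slice struct.

```go
package main

import "fmt"

// initialCap (hypothetical) reproduces the first stage of the quoted
// growslice: need is the total length required after the append.
func initialCap(oldLen, oldCap, need int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if need > doublecap {
		return need // more than double: take exactly what is needed
	}
	if oldLen < 1024 {
		return doublecap // short slice: double
	}
	// Long slice: grow by 25% until the requirement is met.
	for 0 < newcap && newcap < need {
		newcap += newcap / 4
	}
	if newcap <= 0 { // the calculation overflowed
		newcap = need
	}
	return newcap
}

func main() {
	fmt.Println(initialCap(2, 2, 5))          // 5
	fmt.Println(initialCap(3, 3, 4))          // 6
	fmt.Println(initialCap(1024, 1024, 1025)) // 1280
}
```

These values are only the first-stage estimate; the later stages may still round them up.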

Computing the memory required for that capacity

    var overflow bool
    var lenmem, newlenmem, capmem uintptr
    switch {
    case et.size == 1:
        lenmem = uintptr(old.len)
        newlenmem = uintptr(cap)
        capmem = roundupsize(uintptr(newcap))
        overflow = uintptr(newcap) > maxAlloc
        newcap = int(capmem)
    case et.size == sys.PtrSize:
        lenmem = uintptr(old.len) * sys.PtrSize
        newlenmem = uintptr(cap) * sys.PtrSize
        capmem = roundupsize(uintptr(newcap) * sys.PtrSize)
        overflow = uintptr(newcap) > maxAlloc/sys.PtrSize
        newcap = int(capmem / sys.PtrSize)
    case isPowerOfTwo(et.size):
        var shift uintptr
        if sys.PtrSize == 8 {
            // Mask shift for better code generation.
            shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
        } else {
            shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
        }
        lenmem = uintptr(old.len) << shift
        newlenmem = uintptr(cap) << shift
        capmem = roundupsize(uintptr(newcap) << shift)
        overflow = uintptr(newcap) > (maxAlloc >> shift)
        newcap = int(capmem >> shift)
    default:
        lenmem = uintptr(old.len) * et.size
        newlenmem = uintptr(cap) * et.size
        capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
        capmem = roundupsize(capmem)
        newcap = int(capmem / et.size)
    }

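In this step, the code checks whether the element's size in bytes is 1, equal to the platform pointer size (4 on 32-bit, 8 on 64-bit), or a power of two, and takes the corresponding path to compute the required memory. Any other size falls through to the default branch, which uses a general multiplication with overflow checking.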
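The power-of-two branch exists because multiplying by a power of two can be done with a shift. The sketch below illustrates that idea with the standard math/bits package; isPowerOfTwo and bytesFor are names chosen for this example (the zero check in isPowerOfTwo is an addition of this sketch), and overflow handling is deliberately left out.

```go
package main

import (
	"fmt"
	"math/bits"
)

// isPowerOfTwo reports whether x has exactly one bit set.
func isPowerOfTwo(x uint64) bool {
	return x != 0 && x&(x-1) == 0
}

// bytesFor (hypothetical) returns n*elemSize, using a shift when the
// element size is a power of two. Overflow is ignored in this sketch.
func bytesFor(n, elemSize uint64) uint64 {
	if isPowerOfTwo(elemSize) {
		shift := uint(bits.TrailingZeros64(elemSize))
		return n << shift
	}
	return n * elemSize
}

func main() {
	fmt.Println(bytesFor(5, 16)) // 80  (5 << 4)
	fmt.Println(bytesFor(5, 24)) // 120 (general multiplication)
}
```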

The function to note here is roundupsize. Given a requested size, it returns the size of the memory block that mallocgc will actually allocate.

func roundupsize(size uintptr) uintptr {
    if size < _MaxSmallSize {
        if size <= smallSizeMax-8 {
            return uintptr(class_to_size[size_to_class8[divRoundUp(size, smallSizeDiv)]])
        } else {
            return uintptr(class_to_size[size_to_class128[divRoundUp(size-smallSizeMax, largeSizeDiv)]])
        }
    }
    // Go's memory manager uses a virtual-address page size of 8 KB (_PageSize).
    // If adding a page to size would overflow, skip the round-up and
    // return the requested size as-is.
    if size+_PageSize < size {
        return size
    }
    return alignUp(size, _PageSize)
}

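Go's allocator treats small and large objects differently. If the requested size is not a large object (less than _MaxSmallSize, that is, under 32 KB), it is rounded up to a size class: divRoundUp computes an index into size_to_class8 or size_to_class128, which yields a size class, and class_to_size gives that class's size in bytes. These tables let the allocator hand out memory in a fixed set of sizes, which improves allocation efficiency and reduces fragmentation.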

// _NumSizeClasses = 67: there are 67 fixed object size classes
var class_to_size = [_NumSizeClasses]uint16{0, 8, 16, 32, 48, 64, 80, 96, 112,...}

When the requested size is a large object, alignUp rounds it up to a multiple of the virtual page size (_PageSize).
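The two kinds of rounding can be sketched as follows. This is a simplification under stated assumptions: classSizes contains only the first few size classes quoted above (the full table is longer and differs between Go versions), roundUpSmall is a hypothetical helper that scans the list linearly instead of using the runtime's lookup tables, and it assumes a size of at least 1.

```go
package main

import "fmt"

// classSizes is a truncated copy of the size classes quoted in the
// article; it is NOT the complete runtime table.
var classSizes = []uintptr{8, 16, 32, 48, 64, 80, 96, 112}

const pageSize = 8192 // 8 KB, the value the article gives for _PageSize

// roundUpSmall (hypothetical) returns the smallest listed size class
// that can hold size, or false if size exceeds the truncated table.
func roundUpSmall(size uintptr) (uintptr, bool) {
	for _, c := range classSizes {
		if size <= c {
			return c, true
		}
	}
	return 0, false
}

// alignUp rounds n up to a multiple of a, where a is a power of two.
func alignUp(n, a uintptr) uintptr {
	return (n + a - 1) &^ (a - 1)
}

func main() {
	rounded, ok := roundUpSmall(40)
	fmt.Println(rounded, ok)              // 48 true
	fmt.Println(alignUp(40000, pageSize)) // 40960
}
```

With this table, a request for 40 bytes lands in the 48-byte class, which is the rounding that matters for small slices.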

Memory allocation

    if overflow || capmem > maxAlloc {
        panic(errorString("growslice: cap out of range"))
    }
    var p unsafe.Pointer
    if et.ptrdata == 0 {
        p = mallocgc(capmem, nil, false)
        memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
    } else {
        p = mallocgc(capmem, et, true)
        if lenmem > 0 && writeBarrier.enabled {
            bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(old.array), lenmem-et.size+et.ptrdata)
        }
    }
    memmove(p, old.array, lenmem)
    return slice{p, old.len, newcap}

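If the second step overflowed, or the memory to be allocated exceeds the maximum allocation limit, growslice panics.

mallocgc allocates an object of the capmem size computed earlier. A small object is allocated from the free list in the cache of the P that the current G is running on; a large object is allocated from the heap. If the element type contains no pointers (et.ptrdata == 0), memclrNoHeapPointers clears the region from newlenmem to capmem, that is, everything beyond the slice's length after the append. If the element type does contain pointers, the old slice is non-empty, and the write barrier is enabled, bulkBarrierPreWriteSrcOnly is called to mark the pointers in the old array; the new memory holds only nil pointers at this point.

Finally, memmove copies the contents of the old array into the newly allocated memory, and a new slice made of the new memory pointer p, the old length, and the new capacity is returned.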

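Note that when growslice finishes, it has only copied the existing data into new memory and computed the new capacity; it has not performed the append itself. For example, with len = 3 and cap = 3, the statement slice = append(slice, 1) makes growslice allocate a new array, copy the three existing elements into it, and return a slice whose len is still 3. The fourth element has not been written yet.

After growslice, the new slice already holds a copy of the old data, and its underlying array has enough spare room for the appended values. All that remains is to copy the appended values into that spare room and update len, which we will not examine further here.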

Summary

Let us return to the example from the beginning of the article.

package main

func main() {
    s := []int{1,2}
    s = append(s, 3,4,5)
    println(cap(s))
}

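The initial capacity of s is 2 and three elements are to be appended, so append must grow the slice and call growslice, with cap = 2 + 3 = 5. Doubling the old capacity gives doublecap = 2 + 2 = 4, which is less than cap, so the first stage yields newcap = 5. In the second stage, on a 64-bit platform the size of int equals sys.PtrSize (8 bytes), so the request is 5 × 8 = 40 bytes, which roundupsize rounds up to capmem = 48 bytes. The new slice's capacity is therefore 48 / 8 = 6, which explains the result.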

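When append finds that the underlying array has no room for the new elements, the slice must grow. Growing does not extend the existing array in place. Instead, a new block of memory is allocated to serve as the slice's underlying array, and both the existing data and the appended data are copied into it.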

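Determining the new capacity is relatively complicated: it depends on the CPU word size, the element size, whether the elements contain pointers, how many elements are appended, and more. Having read through the growth logic in the source, we find there is little point in agonizing over the exact capacity it produces.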
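One way to see this dependence is to run the same append against slices of different element types and print the resulting capacities. The sketch below does that; grownCaps is a hypothetical helper, and no expected output is given because the numbers depend on the Go version and platform.

```go
package main

import "fmt"

// grownCaps (hypothetical) appends three elements to two-element
// slices of three element types and returns the resulting capacities.
func grownCaps() (c8, c32, c64 int) {
	s8 := append([]int8{1, 2}, 3, 4, 5)
	s32 := append([]int32{1, 2}, 3, 4, 5)
	s64 := append([]int64{1, 2}, 3, 4, 5)
	return cap(s8), cap(s32), cap(s64)
}

func main() {
	c8, c32, c64 := grownCaps()
	// Each capacity is at least 5, but the exact values may differ
	// from one element type, Go version, or platform to another.
	fmt.Println("int8:", c8, "int32:", c32, "int64:", c64)
}
```

The only portable guarantee is that each capacity is large enough to hold all five elements.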

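In practice, if the capacity range of a slice can be determined in advance, the better approach is to allocate enough capacity when the slice is created. Subsequent append calls then incur no performance cost from growing.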

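package main

import "testing"

// BenchmarkAppendFixCap appends to a slice whose capacity is preallocated.
func BenchmarkAppendFixCap(b *testing.B) {
    for i := 0; i < b.N; i++ {
        a := make([]int, 0, 1000)
        for j := 0; j < 1000; j++ {
            a = append(a, j)
        }
    }
}

// BenchmarkAppend appends to a slice that starts with zero capacity.
func BenchmarkAppend(b *testing.B) {
    for i := 0; i < b.N; i++ {
        a := make([]int, 0)
        for j := 0; j < 1000; j++ {
            a = append(a, j)
        }
    }
}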

Their benchmark results are shown below, and it is clear at a glance which one performs better.

$ go test -bench=. -benchmem
BenchmarkAppendFixCap-8          1953373               617 ns/op               0 B/op          0 allocs/op
BenchmarkAppend-8                 426882              2832 ns/op           16376 B/op         11 allocs/op