Memory Allocation and GC

  • Go uses pass-by-value
  • The goroutine stack records the goroutine's execution state
  • Goroutine stacks live on the heap and are reclaimed by the GC
  • Closely related to compiler principles (escape analysis)

Escape Analysis

  • Local variables may be too large
  • Some variables must remain usable after their stack frame is reclaimed
  • Therefore not every variable can be placed on the goroutine stack

Scenarios that trigger an escape

  • Pointer escape

A function returns a pointer to an object (the object is accessible outside the function, so the variable is no longer a purely local variable)

func a() *int {
    v := 0
    return &v // v's address leaves the function, so v escapes to the heap
}

func main() {
    i := a()
    _ = i
}
  • Empty-interface escape
func b() {
    v := 0
    // Functions taking interface{} parameters (such as fmt.Println) often use reflection,
    // so v escapes to the heap
    fmt.Println(v)
}

func main() {
    b()
}
  • Large-variable escape
A variable that is too large would exhaust the stack; on 64-bit platforms, variables larger than roughly 64KB generally escape.
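A minimal sketch of this case, assuming the usual 64KB limit for implicitly allocated buffers; you can check with go build -gcflags="-m" which of the two make calls is reported as escaping:

package main

import "fmt"

func bigAlloc() byte {
    small := make([]byte, 1024)   // 1KB backing array: small enough to stay on the stack
    big := make([]byte, 128*1024) // 128KB backing array: exceeds the 64KB limit, expected to escape to the heap
    small[0], big[0] = 1, 2
    return small[0] + big[0]
}

func main() {
    fmt.Println(bigAlloc())
}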

Stack Growth

  • A Go goroutine stack starts with 2KB of space
  • Stack space is checked in each function's prologue (morestack)
  • The stack is grown when necessary
  • Early Go used segmented stacks; later versions use contiguous stacks
  • When space runs out, the stack grows to 2x its previous size
  • When utilization drops below 1/4, the stack shrinks to 1/2 its size
/* * support for morestack */// Called during function prolog when more stack is needed.//// The traceback routines see morestack on a g0 as being// the top of a stack (for example, morestack calling newstack// calling the scheduler calling newm calling gc), so we must// record an argument size. For that purpose, it has no arguments.TEXT runtime·morestack(SB),NOSPLIT,$0-0    // Cannot grow scheduler stack (m->g0).    get_tls(CX)    MOVQ    g(CX), BX    MOVQ    g_m(BX), BX    MOVQ    m_g0(BX), SI    CMPQ    g(CX), SI    JNE    3(PC)    CALL    runtime·badmorestackg0(SB)    CALL    runtime·abort(SB)    // Cannot grow signal stack (m->gsignal).    MOVQ    m_gsignal(BX), SI    CMPQ    g(CX), SI    JNE    3(PC)    CALL    runtime·badmorestackgsignal(SB)    CALL    runtime·abort(SB)    // Called from f.    // Set m->morebuf to f's caller.    NOP    SP    // tell vet SP changed - stop checking offsets    MOVQ    8(SP), AX    // f's caller's PC    MOVQ    AX, (m_morebuf+gobuf_pc)(BX)    LEAQ    16(SP), AX    // f's caller's SP    MOVQ    AX, (m_morebuf+gobuf_sp)(BX)    get_tls(CX)    MOVQ    g(CX), SI    MOVQ    SI, (m_morebuf+gobuf_g)(BX)    // Set g->sched to context in f.    MOVQ    0(SP), AX // f's PC    MOVQ    AX, (g_sched+gobuf_pc)(SI)    LEAQ    8(SP), AX // f's SP    MOVQ    AX, (g_sched+gobuf_sp)(SI)    MOVQ    BP, (g_sched+gobuf_bp)(SI)    MOVQ    DX, (g_sched+gobuf_ctxt)(SI)    // Call newstack on m->g0's stack.    MOVQ    m_g0(BX), BX    MOVQ    BX, g(CX)    MOVQ    (g_sched+gobuf_sp)(BX), SP    CALL    runtime·newstack(SB)    CALL    runtime·abort(SB)    // crash if newstack returns    RET
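A small experiment for the rules above (sizes here are illustrative): each recursive frame holds a ~0.5KB buffer, so thousands of frames need far more than the initial 2KB, and morestack/newstack will repeatedly copy the stack to a larger one.

package main

import "fmt"

//go:noinline
func deep(n int) int {
    var buf [512]byte // ~0.5KB per frame; a few frames already exceed the initial 2KB stack
    buf[0] = byte(n)
    if n == 0 {
        return int(buf[0])
    }
    return deep(n-1) + int(buf[0])
}

func main() {
    // Roughly 10000 frames * 0.5KB ≈ 5MB of stack, reached through repeated doubling.
    fmt.Println(deep(10000))
}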

heapArena

  • Go reserves virtual memory from the OS in 64MB units (arenas)
  • There can be at most 2^20 such virtual-memory units
  • All heapArenas together form mheap (the Go heap)
// A heapArena stores metadata for a heap arena. heapArenas are stored// outside of the Go heap and accessed via the mheap_.arenas index.////go:notinheap// 申请的信息type heapArena struct {    // bitmap stores the pointer/scalar bitmap for the words in    // this arena. See mbitmap.go for a description. Use the    // heapBits type to access this.    bitmap [heapArenaBitmapBytes]byte    // spans maps from virtual address page ID within this arena to *mspan.    // For allocated spans, their pages map to the span itself.    // For free spans, only the lowest and highest pages map to the span itself.    // Internal pages map to an arbitrary span.    // For pages that have never been allocated, spans entries are nil.    //    // Modifications are protected by mheap.lock. Reads can be    // performed without locking, but ONLY from indexes that are    // known to contain in-use or stack spans. This means there    // must not be a safe-point between establishing that an    // address is live and looking it up in the spans array.    // 记录mspan    spans [pagesPerArena]*mspan    // pageInUse is a bitmap that indicates which spans are in    // state mSpanInUse. This bitmap is indexed by page number,    // but only the bit corresponding to the first page in each    // span is used.    //    // Reads and writes are atomic.    pageInUse [pagesPerArena / 8]uint8    // pageMarks is a bitmap that indicates which spans have any    // marked objects on them. Like pageInUse, only the bit    // corresponding to the first page in each span is used.    //    // Writes are done atomically during marking. Reads are    // non-atomic and lock-free since they only occur during    // sweeping (and hence never race with writes).    //    // This is used to quickly find whole spans that can be freed.    //    // TODO(austin): It would be nice if this was uint64 for    // faster scanning, but we don't have 64-bit atomic bit    // operations.    pageMarks [pagesPerArena / 8]uint8    // pageSpecials is a bitmap that indicates which spans have    // specials (finalizers or other). Like pageInUse, only the bit    // corresponding to the first page in each span is used.    //    // Writes are done atomically whenever a special is added to    // a span and whenever the last special is removed from a span.    // Reads are done atomically to find spans containing specials    // during marking.    pageSpecials [pagesPerArena / 8]uint8    // checkmarks stores the debug.gccheckmark state. It is only    // used if debug.gccheckmark > 0.    checkmarks *checkmarksMap    // zeroedBase marks the first byte of the first page in this    // arena which hasn't been used yet and is therefore already    // zero. zeroedBase is relative to the arena base.    // Increases monotonically until it hits heapArenaBytes.    //    // This field is sufficient to determine if an allocation    // needs to be zeroed because the page allocator follows an    // address-ordered first-fit policy.    //    // Read atomically and written with an atomic CAS.    zeroedBase uintptr}
type mheap struct {    // lock must only be acquired on the system stack, otherwise a g    // could self-deadlock if its stack grows with the lock held.    lock  mutex    pages pageAlloc // page allocation data structure    sweepgen uint32 // sweep generation, see comment in mspan; written during STW    // allspans is a slice of all mspans ever created. Each mspan    // appears exactly once.    //    // The memory for allspans is manually managed and can be    // reallocated and move as the heap grows.    //    // In general, allspans is protected by mheap_.lock, which    // prevents concurrent access as well as freeing the backing    // store. Accesses during STW might not hold the lock, but    // must ensure that allocation cannot happen around the    // access (since that may free the backing store).    allspans []*mspan // all spans out there    // _ uint32 // align uint64 fields on 32-bit for atomics    // Proportional sweep    //    // These parameters represent a linear function from gcController.heapLive    // to page sweep count. The proportional sweep system works to    // stay in the black by keeping the current page sweep count    // above this line at the current gcController.heapLive.    //    // The line has slope sweepPagesPerByte and passes through a    // basis point at (sweepHeapLiveBasis, pagesSweptBasis). At    // any given time, the system is at (gcController.heapLive,    // pagesSwept) in this space.    //    // It is important that the line pass through a point we    // control rather than simply starting at a 0,0 origin    // because that lets us adjust sweep pacing at any time while    // accounting for current progress. If we could only adjust    // the slope, it would create a discontinuity in debt if any    // progress has already been made.    pagesInUse         atomic.Uint64 // pages of spans in stats mSpanInUse    pagesSwept         atomic.Uint64 // pages swept this cycle    pagesSweptBasis    atomic.Uint64 // pagesSwept to use as the origin of the sweep ratio    sweepHeapLiveBasis uint64        // value of gcController.heapLive to use as the origin of sweep ratio; written with lock, read without    sweepPagesPerByte  float64       // proportional sweep ratio; written with lock, read without    // TODO(austin): pagesInUse should be a uintptr, but the 386    // compiler can't 8-byte align fields.    // scavengeGoal is the amount of total retained heap memory (measured by    // heapRetained) that the runtime will try to maintain by returning memory    // to the OS.    //    // Accessed atomically.    scavengeGoal uint64    // Page reclaimer state    // reclaimIndex is the page index in allArenas of next page to    // reclaim. Specifically, it refers to page (i %    // pagesPerArena) of arena allArenas[i / pagesPerArena].    //    // If this is >= 1<<63, the page reclaimer is done scanning    // the page marks.    reclaimIndex atomic.Uint64    // reclaimCredit is spare credit for extra pages swept. Since    // the page reclaimer works in large chunks, it may reclaim    // more than requested. Any spare pages released go to this    // credit pool.    reclaimCredit atomic.Uintptr    // arenas is the heap arena map. It points to the metadata for    // the heap for every arena frame of the entire usable virtual    // address space.    //    // Use arenaIndex to compute indexes into this array.    //    // For regions of the address space that are not backed by the    // Go heap, the arena map contains nil.    //    // Modifications are protected by mheap_.lock. 
Reads can be    // performed without locking; however, a given entry can    // transition from nil to non-nil at any time when the lock    // isn't held. (Entries never transitions back to nil.)    //    // In general, this is a two-level mapping consisting of an L1    // map and possibly many L2 maps. This saves space when there    // are a huge number of arena frames. However, on many    // platforms (even 64-bit), arenaL1Bits is 0, making this    // effectively a single-level map. In this case, arenas[0]    // will never be nil.    // 堆由所有heapArena组成    arenas [1 << arenaL1Bits]*[1 << arenaL2Bits]*heapArena    // heapArenaAlloc is pre-reserved space for allocating heapArena    // objects. This is only used on 32-bit, where we pre-reserve    // this space to avoid interleaving it with the heap itself.    heapArenaAlloc linearAlloc    // arenaHints is a list of addresses at which to attempt to    // add more heap arenas. This is initially populated with a    // set of general hint addresses, and grown with the bounds of    // actual heap arena ranges.    arenaHints *arenaHint    // arena is a pre-reserved space for allocating heap arenas    // (the actual arenas). This is only used on 32-bit.    arena linearAlloc    // allArenas is the arenaIndex of every mapped arena. This can    // be used to iterate through the address space.    //    // Access is protected by mheap_.lock. However, since this is    // append-only and old backing arrays are never freed, it is    // safe to acquire mheap_.lock, copy the slice header, and    // then release mheap_.lock.    allArenas []arenaIdx    // sweepArenas is a snapshot of allArenas taken at the    // beginning of the sweep cycle. This can be read safely by    // simply blocking GC (by disabling preemption).    sweepArenas []arenaIdx    // markArenas is a snapshot of allArenas taken at the beginning    // of the mark cycle. Because allArenas is append-only, neither    // this slice nor its contents will change during the mark, so    // it can be read safely.    markArenas []arenaIdx    // curArena is the arena that the heap is currently growing    // into. This should always be physPageSize-aligned.    curArena struct {        base, end uintptr    }    _ uint32 // ensure 64-bit alignment of central    // central free lists for small size classes.    // the padding makes sure that the mcentrals are    // spaced CacheLinePadSize bytes apart, so that each mcentral.lock    // gets its own cache line.    // central is indexed by spanClass.    // 136个    central [numSpanClasses]struct {        mcentral mcentral        pad      [cpu.CacheLinePadSize - unsafe.Sizeof(mcentral{})%cpu.CacheLinePadSize]byte    }    spanalloc             fixalloc // allocator for span*    cachealloc            fixalloc // allocator for mcache*    specialfinalizeralloc fixalloc // allocator for specialfinalizer*    specialprofilealloc   fixalloc // allocator for specialprofile*    specialReachableAlloc fixalloc // allocator for specialReachable    speciallock           mutex    // lock for special record allocators.    arenaHintAlloc        fixalloc // allocator for arenaHints    unused *specialfinalizer // never set, just here to force the specialfinalizer type into DWARF}
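A rough sketch of how an address maps to its heapArena; the constants mirror linux/amd64 (64MB arenas, 8KB pages), and the real runtime additionally applies arenaBaseOffset and splits the index into the L1/L2 levels of the arenas field above.

package main

import "fmt"

const (
    heapArenaBytes = 64 << 20                  // 64MB per arena (linux/amd64)
    pageSize       = 8 << 10                   // 8KB pages
    pagesPerArena  = heapArenaBytes / pageSize // 8192, matching the spans array length above
)

// arenaIndexOf is a simplified stand-in for the runtime's arenaIndex:
// every 64MB window of the address space maps to one heapArena slot.
func arenaIndexOf(addr uintptr) uintptr {
    return addr / heapArenaBytes
}

func main() {
    fmt.Println(pagesPerArena, arenaIndexOf(0xc000010000)) // a typical Go heap address
}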

Allocation

  • Linear allocation
  • Free-list allocation
Linear allocation and free-list allocation both produce fragmentation (see the toy allocator sketch after this list)
  • Tiered (size-class) allocation
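To illustrate why the first two strategies fragment, here is a toy bump-pointer (linear) allocator, not how Go actually allocates: it can only hand memory out front-to-back, so space freed in the middle can never be reused. Go's tiered allocator avoids this by grouping equal-sized slots into spans.

package main

import "fmt"

// bumpAllocator is a toy linear allocator for illustration only.
type bumpAllocator struct {
    buf []byte
    off int
}

func (a *bumpAllocator) alloc(size int) []byte {
    if a.off+size > len(a.buf) {
        return nil // out of space; freeing an object in the middle would leave an unusable hole
    }
    p := a.buf[a.off : a.off+size : a.off+size]
    a.off += size
    return p
}

func main() {
    a := &bumpAllocator{buf: make([]byte, 64)}
    fmt.Println(len(a.alloc(48)), len(a.alloc(8)), a.alloc(16) == nil) // 48 8 true: the last 8 bytes are stranded
}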

Memory management unit: mspan

  • Following the segregated-fit strategy, memory in use is managed in units of mspans
  • Each mspan is divided into N equally sized slots
  • There are 67 predefined mspan size classes (plus class 0 for large objects)
// class  bytes/obj  bytes/span  objects  tail waste  max waste  min align//     1          8        8192     1024           0     87.50%          8//     2         16        8192      512           0     43.75%         16//     3         24        8192      341           8     29.24%          8//     4         32        8192      256           0     21.88%         32//     5         48        8192      170          32     31.52%         16//     6         64        8192      128           0     23.44%         64//     7         80        8192      102          32     19.07%         16//     8         96        8192       85          32     15.95%         32//     9        112        8192       73          16     13.56%         16//    10        128        8192       64           0     11.72%        128//    11        144        8192       56         128     11.82%         16//    12        160        8192       51          32      9.73%         32//    13        176        8192       46          96      9.59%         16//    14        192        8192       42         128      9.25%         64//    15        208        8192       39          80      8.12%         16//    16        224        8192       36         128      8.15%         32//    17        240        8192       34          32      6.62%         16//    18        256        8192       32           0      5.86%        256//    19        288        8192       28         128     12.16%         32//    20        320        8192       25         192     11.80%         64//    21        352        8192       23          96      9.88%         32//    22        384        8192       21         128      9.51%        128//    23        416        8192       19         288     10.71%         32//    24        448        8192       18         128      8.37%         64//    25        480        8192       17          32      6.82%         32//    26        512        8192       16           0      6.05%        512//    27        576        8192       14         128     12.33%         64//    28        640        8192       12         512     15.48%        128//    29        704        8192       11         448     13.93%         64//    30        768        8192       10         512     13.94%        256//    31        896        8192        9         128     15.52%        128//    32       1024        8192        8           0     12.40%       1024//    33       1152        8192        7         128     12.41%        128//    34       1280        8192        6         512     15.55%        256//    35       1408       16384       11         896     14.00%        128//    36       1536        8192        5         512     14.00%        512//    37       1792       16384        9         256     15.57%        256//    38       2048        8192        4           0     12.45%       2048//    39       2304       16384        7         256     12.46%        256//    40       2688        8192        3         128     15.59%        128//    41       3072       24576        8           0     12.47%       1024//    42       3200       16384        5         384      6.22%        128//    43       3456       24576        7         384      8.83%        128//    44       4096        8192        2           0     15.60%       4096//    45       4864       24576        5         256     16.65%        256//    46       5376       16384        3         256     10.92%        256//    47       6144       24576        4           0     12.48%       2048// 
   48       6528       32768        5         128      6.23%        128//    49       6784       40960        6         256      4.36%        128//    50       6912       49152        7         768      3.37%        256//    51       8192        8192        1           0     15.61%       8192//    52       9472       57344        6         512     14.28%        256//    53       9728       49152        5         512      3.64%        512//    54      10240       40960        4           0      4.99%       2048//    55      10880       32768        3         128      6.24%        128//    56      12288       24576        2           0     11.45%       4096//    57      13568       40960        3         256      9.99%        256//    58      14336       57344        4           0      5.35%       2048//    59      16384       16384        1           0     12.49%       8192//    60      18432       73728        4           0     11.11%       2048//    61      19072       57344        3         128      3.57%        128//    62      20480       40960        2           0      6.87%       4096//    63      21760       65536        3         256      6.25%        256//    64      24576       24576        1           0     11.45%       8192//    65      27264       81920        3         128     10.00%        128//    66      28672       57344        2           0      4.91%       4096//    67      32768       32768        1           0     12.50%       8192// alignment  bits  min obj size//         8     3             8//        16     4            32//        32     5           256//        64     6           512//       128     7           768//      4096    12         28672//      8192    13         32768
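A few rows of the table can be re-derived with simple arithmetic (objects = span size / object size, tail waste = the remainder); a quick sketch:

package main

import "fmt"

func main() {
    const spanSize = 8192 // these classes use one 8KB page per span
    for _, objSize := range []int{8, 24, 48, 112} {
        objects := spanSize / objSize
        tailWaste := spanSize % objSize
        // e.g. the 24-byte class: 341 objects and 8 bytes of tail waste, as in the table
        fmt.Printf("bytes/obj=%d objects=%d tail waste=%d\n", objSize, objects, tailWaste)
    }
}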
//go:notinheaptype mspan struct {    next *mspan     // next span in list, or nil if none    prev *mspan     // previous span in list, or nil if none    list *mSpanList // For debugging. TODO: Remove.    startAddr uintptr // address of first byte of span aka s.base()    npages    uintptr // number of pages in span    manualFreeList gclinkptr // list of free objects in mSpanManual spans    // freeindex is the slot index between 0 and nelems at which to begin scanning    // for the next free object in this span.    // Each allocation scans allocBits starting at freeindex until it encounters a 0    // indicating a free object. freeindex is then adjusted so that subsequent scans begin    // just past the newly discovered free object.    //    // If freeindex == nelem, this span has no free objects.    //    // allocBits is a bitmap of objects in this span.    // If n >= freeindex and allocBits[n/8] & (1<<(n%8)) is 0    // then object n is free;    // otherwise, object n is allocated. Bits starting at nelem are    // undefined and should never be referenced.    //    // Object n starts at address n*elemsize + (start << pageShift).    freeindex uintptr    // TODO: Look up nelems from sizeclass and remove this field if it    // helps performance.    nelems uintptr // number of object in the span.    // Cache of the allocBits at freeindex. allocCache is shifted    // such that the lowest bit corresponds to the bit freeindex.    // allocCache holds the complement of allocBits, thus allowing    // ctz (count trailing zero) to use it directly.    // allocCache may contain bits beyond s.nelems; the caller must ignore    // these.    allocCache uint64    // allocBits and gcmarkBits hold pointers to a span's mark and    // allocation bits. The pointers are 8 byte aligned.    // There are three arenas where this data is held.    // free: Dirty arenas that are no longer accessed    //       and can be reused.    // next: Holds information to be used in the next GC cycle.    // current: Information being used during this GC cycle.    // previous: Information being used during the last GC cycle.    // A new GC cycle starts with the call to finishsweep_m.    // finishsweep_m moves the previous arena to the free arena,    // the current arena to the previous arena, and    // the next arena to the current arena.    // The next arena is populated as the spans request    // memory to hold gcmarkBits for the next GC cycle as well    // as allocBits for newly allocated spans.    //    // The pointer arithmetic is done "by hand" instead of using    // arrays to avoid bounds checks along critical performance    // paths.    // The sweep will free the old allocBits and set allocBits to the    // gcmarkBits. The gcmarkBits are replaced with a fresh zeroed    // out memory.    
allocBits  *gcBits    gcmarkBits *gcBits    // sweep generation:    // if sweepgen == h->sweepgen - 2, the span needs sweeping    // if sweepgen == h->sweepgen - 1, the span is currently being swept    // if sweepgen == h->sweepgen, the span is swept and ready to use    // if sweepgen == h->sweepgen + 1, the span was cached before sweep began and is still cached, and needs sweeping    // if sweepgen == h->sweepgen + 3, the span was swept and then cached and is still cached    // h->sweepgen is incremented by 2 after every GC    sweepgen    uint32    divMul      uint32        // for divide by elemsize    allocCount  uint16        // number of allocated objects    spanclass   spanClass     // size class and noscan (uint8)    state       mSpanStateBox // mSpanInUse etc; accessed atomically (get/set methods)    needzero    uint8         // needs to be zeroed before allocation    elemsize    uintptr       // computed from sizeclass or from npages    limit       uintptr       // end of data in span    speciallock mutex         // guards specials list    specials    *special      // linked list of special records sorted by offset.}

The mix of mspans inside each heapArena is not fixed; the classes and counts vary with demand

Central index: mcentral

There are 136 mcentrals

68 for span classes that need no GC scanning (noscan) and 68 for those that do

// 给定大小的闲暇对象的地方列表// Central list of free objects of a given size.////go:notinheap// 保留雷同mspan的目录type mcentral struct {    spanclass spanClass    // partial and full contain two mspan sets: one of swept in-use    // spans, and one of unswept in-use spans. These two trade    // roles on each GC cycle. The unswept set is drained either by    // allocation or by the background sweeper in every GC cycle,    // so only two roles are necessary.    //    // sweepgen is increased by 2 on each GC cycle, so the swept    // spans are in partial[sweepgen/2%2] and the unswept spans are in    // partial[1-sweepgen/2%2]. Sweeping pops spans from the    // unswept set and pushes spans that are still in-use on the    // swept set. Likewise, allocating an in-use span pushes it    // on the swept set.    //    // Some parts of the sweeper can sweep arbitrary spans, and hence    // can't remove them from the unswept set, but will add the span    // to the appropriate swept list. As a result, the parts of the    // sweeper and mcentral that do consume from the unswept list may    // encounter swept spans, and these should be ignored.    partial [2]spanSet // list of spans with a free object    full    [2]spanSet // list of spans with no free objects}
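The 136 figure comes from 68 size classes (0 through 67) times two scannability variants; a sketch of the encoding, mirroring the runtime's makeSpanClass (the noscan flag is the low bit):

package main

import "fmt"

type spanClass uint8

// makeSpanClass packs a size class and a noscan flag into one byte:
// sizeclass<<1 | noscan, giving 68*2 = 136 possible span classes.
func makeSpanClass(sizeclass uint8, noscan bool) spanClass {
    sc := spanClass(sizeclass) << 1
    if noscan {
        sc |= 1
    }
    return sc
}

func main() {
    fmt.Println(makeSpanClass(67, true)) // 135, the last of the 136 span classes
}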

Per-P cache: mcache

  • Each P has its own mcache

// Per-thread (in Go, per-P) cache for small objects.// This includes a small object cache and local allocation stats.// No locking needed because it is per-thread (per-P).//// mcaches are allocated from non-GC'd memory, so any heap pointers// must be specially handled.////go:notinheaptype mcache struct {    // The following members are accessed on every malloc,    // so they are grouped here for better caching.    nextSample uintptr // trigger heap sample after allocating this many bytes    scanAlloc  uintptr // bytes of scannable heap allocated    // Allocator cache for tiny objects w/o pointers.    // See "Tiny allocator" comment in malloc.go.    // tiny points to the beginning of the current tiny block, or    // nil if there is no current tiny block.    //    // tiny is a heap pointer. Since mcache is in non-GC'd memory,    // we handle it by clearing it in releaseAll during mark    // termination.    //    // tinyAllocs is the number of tiny allocations performed    // by the P that owns this mcache.    tiny       uintptr    tinyoffset uintptr    tinyAllocs uintptr    // The rest is not accessed on every malloc.    alloc [numSpanClasses]*mspan // spans to allocate from, indexed by spanClass    stackcache [_NumStackOrders]stackfreelist    // flushGen indicates the sweepgen during which this mcache    // was last flushed. If flushGen != mheap_.sweepgen, the spans    // in this mcache are stale and need to the flushed so they    // can be swept. This is done in acquirep.    flushGen uint32}
type p struct {
    ...
    // per-P local mcache
    mcache *mcache
    ...
}

Allocating Heap Memory

Object size tiers

  • Tiny objects: (0, 16B), containing no pointers
  • Small objects: [16B, 32KB]
  • Large objects: (32KB, +∞)

Tiny and small objects are allocated into the regular mspans; a large object gets a class-0 mspan tailored to its size. A routing sketch follows below.
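A simplified sketch of how mallocgc (its source appears further below) routes an allocation by size and pointer content; the thresholds match the runtime's maxTinySize (16B) and maxSmallSize (32KB):

package main

import "fmt"

func allocPath(size uintptr, hasPointers bool) string {
    const maxTinySize, maxSmallSize = 16, 32 << 10
    switch {
    case size < maxTinySize && !hasPointers:
        return "tiny: packed into a 16B slot of the class-2 noscan mspan"
    case size <= maxSmallSize:
        return "small: rounded up to a size class and served from mcache/mcentral"
    default:
        return "large: allocLarge builds a dedicated class-0 span from mheap"
    }
}

func main() {
    fmt.Println(allocPath(8, false), "|", allocPath(100, true), "|", allocPath(64<<10, false))
}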

Tiny-object allocation

  • Take the class-2 (16B, noscan) mspan from the mcache
  • Pack multiple tiny objects together into a single 16-byte slot
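A small demonstration, hedged because it depends on escape analysis and allocation order: the three pointer-free values below escape to the heap via fmt, qualify for the tiny allocator, and their addresses will typically land inside the same 16-byte block.

package main

import "fmt"

func main() {
    a := new(int8)  // 1 byte, no pointers: a tiny-allocator candidate
    b := new(int16) // 2 bytes
    c := new(int32) // 4 bytes
    // If all three were packed into one tiny block, the printed addresses differ by only a few bytes.
    fmt.Printf("%p %p %p\n", a, b, c)
}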
// Allocate an object of size bytes.// Small objects are allocated from the per-P cache's free lists.// Large objects (> 32 kB) are allocated straight from the heap.// 小对象是从 per-P 缓存的闲暇列表中调配的。// 大对象 (> 32 kB) 间接从堆中调配。func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {    if gcphase == _GCmarktermination {        throw("mallocgc called with gcphase == _GCmarktermination")    }    if size == 0 {        return unsafe.Pointer(&zerobase)    }    userSize := size    if asanenabled {        // Refer to ASAN runtime library, the malloc() function allocates extra memory,        // the redzone, around the user requested memory region. And the redzones are marked        // as unaddressable. We perform the same operations in Go to detect the overflows or        // underflows.        size += computeRZlog(size)    }    if debug.malloc {        if debug.sbrk != 0 {            align := uintptr(16)            if typ != nil {                // TODO(austin): This should be just                //   align = uintptr(typ.align)                // but that's only 4 on 32-bit platforms,                // even if there's a uint64 field in typ (see #599).                // This causes 64-bit atomic accesses to panic.                // Hence, we use stricter alignment that matches                // the normal allocator better.                if size&7 == 0 {                    align = 8                } else if size&3 == 0 {                    align = 4                } else if size&1 == 0 {                    align = 2                } else {                    align = 1                }            }            return persistentalloc(size, align, &memstats.other_sys)        }        if inittrace.active && inittrace.id == getg().goid {            // Init functions are executed sequentially in a single goroutine.            inittrace.allocs += 1        }    }    // assistG is the G to charge for this allocation, or nil if    // GC is not currently active.    var assistG *g    if gcBlackenEnabled != 0 {        // Charge the current user G for this allocation.        assistG = getg()        if assistG.m.curg != nil {            assistG = assistG.m.curg        }        // Charge the allocation against the G. We'll account        // for internal fragmentation at the end of mallocgc.        assistG.gcAssistBytes -= int64(size)        if assistG.gcAssistBytes < 0 {            // This G is in debt. Assist the GC to correct            // this before allocating. This must happen            // before disabling preemption.            gcAssistAlloc(assistG)        }    }    // Set mp.mallocing to keep from being preempted by GC.    mp := acquirem()    if mp.mallocing != 0 {        throw("malloc deadlock")    }    if mp.gsignal == getg() {        throw("malloc during signal")    }    mp.mallocing = 1    shouldhelpgc := false    dataSize := userSize    c := getMCache(mp)    if c == nil {        throw("mallocgc called without a P or outside bootstrapping")    }    var span *mspan    var x unsafe.Pointer    noscan := typ == nil || typ.ptrdata == 0    // In some cases block zeroing can profitably (for latency reduction purposes)    // be delayed till preemption is possible; delayedZeroing tracks that state.    delayedZeroing := false    // <= 32KB    if size <= maxSmallSize {        // < 16B        if noscan && size < maxTinySize {            // Tiny allocator.            //            // Tiny allocator combines several tiny allocation requests            // into a single memory block. 
The resulting memory block            // is freed when all subobjects are unreachable. The subobjects            // must be noscan (don't have pointers), this ensures that            // the amount of potentially wasted memory is bounded.            //            // Size of the memory block used for combining (maxTinySize) is tunable.            // Current setting is 16 bytes, which relates to 2x worst case memory            // wastage (when all but one subobjects are unreachable).            // 8 bytes would result in no wastage at all, but provides less            // opportunities for combining.            // 32 bytes provides more opportunities for combining,            // but can lead to 4x worst case wastage.            // The best case winning is 8x regardless of block size.            //            // Objects obtained from tiny allocator must not be freed explicitly.            // So when an object will be freed explicitly, we ensure that            // its size >= maxTinySize.            //            // SetFinalizer has a special case for objects potentially coming            // from tiny allocator, it such case it allows to set finalizers            // for an inner byte of a memory block.            //            // The main targets of tiny allocator are small strings and            // standalone escaping variables. On a json benchmark            // the allocator reduces number of allocations by ~12% and            // reduces heap size by ~20%.            off := c.tinyoffset            // Align tiny pointer for required (conservative) alignment.            if size&7 == 0 {                off = alignUp(off, 8)            } else if goarch.PtrSize == 4 && size == 12 {                // Conservatively align 12-byte objects to 8 bytes on 32-bit                // systems so that objects whose first field is a 64-bit                // value is aligned to 8 bytes and does not cause a fault on                // atomic access. See issue 37262.                // TODO(mknyszek): Remove this workaround if/when issue 36606                // is resolved.                off = alignUp(off, 8)            } else if size&3 == 0 {                off = alignUp(off, 4)            } else if size&1 == 0 {                off = alignUp(off, 2)            }             //  该object适宜现有的tiny block。            if off+size <= maxTinySize && c.tiny != 0 {                // The object fits into existing tiny block.                x = unsafe.Pointer(c.tiny + off)                c.tinyoffset = off + size                c.tinyAllocs++                mp.mallocing = 0                releasem(mp)                return x            }            // Allocate a new maxTinySize block.             // 调配一个新的maxTinySize block。            span = c.alloc[tinySpanClass]            v := nextFreeFast(span)            if v == 0 {                v, span, shouldhelpgc = c.nextFree(tinySpanClass)            }            x = unsafe.Pointer(v)            (*[2]uint64)(x)[0] = 0            (*[2]uint64)(x)[1] = 0            // See if we need to replace the existing tiny block with the new one            // based on amount of remaining free space.            if !raceenabled && (size < c.tinyoffset || c.tiny == 0) {                // Note: disabled when race detector is on, see comment near end of this function.                
c.tiny = uintptr(x)                c.tinyoffset = size            }            size = maxTinySize        } else {            var sizeclass uint8             // 查表找实用的span             if size <= smallSizeMax-8 {                sizeclass = size_to_class8[divRoundUp(size, smallSizeDiv)]            } else {                sizeclass = size_to_class128[divRoundUp(size-smallSizeMax, largeSizeDiv)]            }            size = uintptr(class_to_size[sizeclass])            spc := makeSpanClass(sizeclass, noscan)            span = c.alloc[spc]            // 找到            v := nextFreeFast(span)            if v == 0 {                // 将span进行替换,全局与本地mache替换                v, span, shouldhelpgc = c.nextFree(spc)            }            x = unsafe.Pointer(v)            if needzero && span.needzero != 0 {                memclrNoHeapPointers(unsafe.Pointer(v), size)            }        }    } else {        shouldhelpgc = true        // For large allocations, keep track of zeroed state so that        // bulk zeroing can be happen later in a preemptible context.        // 定制0级span         span = c.allocLarge(size, noscan)        span.freeindex = 1        span.allocCount = 1        size = span.elemsize        x = unsafe.Pointer(span.base())        if needzero && span.needzero != 0 {            if noscan {                delayedZeroing = true            } else {                memclrNoHeapPointers(x, size)                // We've in theory cleared almost the whole span here,                // and could take the extra step of actually clearing                // the whole thing. However, don't. Any GC bits for the                // uncleared parts will be zero, and it's just going to                // be needzero = 1 once freed anyway.            }        }    }    var scanSize uintptr    if !noscan {        heapBitsSetType(uintptr(x), size, dataSize, typ)        if dataSize > typ.size {            // Array allocation. If there are any            // pointers, GC has to scan to the last            // element.            if typ.ptrdata != 0 {                scanSize = dataSize - typ.size + typ.ptrdata            }        } else {            scanSize = typ.ptrdata        }        c.scanAlloc += scanSize    }    // Ensure that the stores above that initialize x to    // type-safe memory and set the heap bits occur before    // the caller can make x observable to the garbage    // collector. Otherwise, on weakly ordered machines,    // the garbage collector could follow a pointer to x,    // but see uninitialized memory or stale heap bits.    publicationBarrier()    // Allocate black during GC.    // All slots hold nil so no scanning is needed.    // This may be racing with GC so do it atomically if there can be    // a race marking the bit.    if gcphase != _GCoff {        gcmarknewobject(span, uintptr(x), size, scanSize)    }    if raceenabled {        racemalloc(x, size)    }    if msanenabled {        msanmalloc(x, size)    }    if asanenabled {        // We should only read/write the memory with the size asked by the user.        // The rest of the allocated memory should be poisoned, so that we can report        // errors when accessing poisoned memory.        // The allocated memory is larger than required userSize, it will also include        // redzone and some other padding bytes.        
rzBeg := unsafe.Add(x, userSize)        asanpoison(rzBeg, size-userSize)        asanunpoison(x, userSize)    }    if rate := MemProfileRate; rate > 0 {        // Note cache c only valid while m acquired; see #47302        if rate != 1 && size < c.nextSample {            c.nextSample -= size        } else {            profilealloc(mp, x, size)        }    }    mp.mallocing = 0    releasem(mp)    // Pointerfree data can be zeroed late in a context where preemption can occur.    // x will keep the memory alive.    if delayedZeroing {        if !noscan {            throw("delayed zeroing on data that may contain pointers")        }        memclrNoHeapPointersChunked(size, x) // This is a possible preemption point: see #47302    }    if debug.malloc {        if debug.allocfreetrace != 0 {            tracealloc(x, size, typ)        }        if inittrace.active && inittrace.id == getg().goid {            // Init functions are executed sequentially in a single goroutine.            inittrace.bytes += uint64(size)        }    }    if assistG != nil {        // Account for internal fragmentation in the assist        // debt now that we know it.        assistG.gcAssistBytes -= int64(size - dataSize)    }    if shouldhelpgc {        if t := (gcTrigger{kind: gcTriggerHeap}); t.test() {            gcStart(t)        }    }    if raceenabled && noscan && dataSize < maxTinySize {        // Pad tinysize allocations so they are aligned with the end        // of the tinyalloc region. This ensures that any arithmetic        // that goes off the top end of the object will be detectable        // by checkptr (issue 38872).        // Note that we disable tinyalloc when raceenabled for this to work.        // TODO: This padding is only performed when the race detector        // is enabled. It would be nice to enable it if any package        // was compiled with checkptr, but there's no easy way to        // detect that (especially at compile time).        // TODO: enable this padding for all allocations, not just        // tinyalloc ones. It's tricky because of pointer maps.        // Maybe just all noscan objects?        x = add(x, size-dataSize)    }    return x}// allocLarge allocates a span for a large object.func (c *mcache) allocLarge(size uintptr, noscan bool) *mspan {    if size+_PageSize < size {        throw("out of memory")    }    npages := size >> _PageShift    if size&_PageMask != 0 {        npages++    }    // Deduct credit for this span allocation and sweep if    // necessary. mHeap_Alloc will also sweep npages, so this only    // pays the debt down to npage pages.    deductSweepCredit(npages*_PageSize, npages)    spc := makeSpanClass(0, noscan)    s := mheap_.alloc(npages, spc)    if s == nil {        throw("out of memory")    }    stats := memstats.heapStats.acquire()    atomic.Xadd64(&stats.largeAlloc, int64(npages*pageSize))    atomic.Xadd64(&stats.largeAllocCount, 1)    memstats.heapStats.release()    // Update heapLive.    gcController.update(int64(s.npages*pageSize), 0)    // Put the large span in the mcentral swept list so that it's    // visible to the background sweeper.    mheap_.central[spc].mcentral.fullSwept(mheap_.sweepgen).push(s)    s.limit = s.base() + size    heapBitsForAddr(s.base()).initSpan(s)    return s}
// nextFree returns the next free object from the cached span if one is available.// Otherwise it refills the cache with a span with an available object and// returns that object along with a flag indicating that this was a heavy// weight allocation. If it is a heavy weight allocation the caller must// determine whether a new GC cycle needs to be started or if the GC is active// whether this goroutine needs to assist the GC.//// Must run in a non-preemptible context since otherwise the owner of// c could change.func (c *mcache) nextFree(spc spanClass) (v gclinkptr, s *mspan, shouldhelpgc bool) {    s = c.alloc[spc]    shouldhelpgc = false    freeIndex := s.nextFreeIndex()    if freeIndex == s.nelems {        // The span is full.        if uintptr(s.allocCount) != s.nelems {            println("runtime: s.allocCount=", s.allocCount, "s.nelems=", s.nelems)            throw("s.allocCount != s.nelems && freeIndex == s.nelems")        }        c.refill(spc)        shouldhelpgc = true        s = c.alloc[spc]        freeIndex = s.nextFreeIndex()    }    if freeIndex >= s.nelems {        throw("freeIndex is not valid")    }    v = gclinkptr(freeIndex*s.elemsize + s.base())    s.allocCount++    if uintptr(s.allocCount) > s.nelems {        println("s.allocCount=", s.allocCount, "s.nelems=", s.nelems)        throw("s.allocCount > s.nelems")    }    return}
// refill acquires a new span of span class spc for c. This span will// have at least one free object. The current span in c must be full.//// Must run in a non-preemptible context since otherwise the owner of// c could change.func (c *mcache) refill(spc spanClass) {    // Return the current cached span to the central lists.    s := c.alloc[spc]    if uintptr(s.allocCount) != s.nelems {        throw("refill of span with free space remaining")    }    if s != &emptymspan {        // Mark this span as no longer cached.        if s.sweepgen != mheap_.sweepgen+3 {            throw("bad sweepgen in refill")        }        mheap_.central[spc].mcentral.uncacheSpan(s)    }    // Get a new cached span from the central lists.    s = mheap_.central[spc].mcentral.cacheSpan()    if s == nil {        throw("out of memory")    }    if uintptr(s.allocCount) == s.nelems {        throw("span has no free space")    }    // Indicate that this span is cached and prevent asynchronous    // sweeping in the next sweep phase.    s.sweepgen = mheap_.sweepgen + 3    // Assume all objects from this span will be allocated in the    // mcache. If it gets uncached, we'll adjust this.    stats := memstats.heapStats.acquire()    atomic.Xadd64(&stats.smallAllocCount[spc.sizeclass()], int64(s.nelems)-int64(s.allocCount))    // Flush tinyAllocs.    if spc == tinySpanClass {        atomic.Xadd64(&stats.tinyAllocCount, int64(c.tinyAllocs))        c.tinyAllocs = 0    }    memstats.heapStats.release()    // Update heapLive with the same assumption.    // While we're here, flush scanAlloc, since we have to call    // revise anyway.    usedBytes := uintptr(s.allocCount) * s.elemsize    gcController.update(int64(s.npages*pageSize)-int64(usedBytes), int64(c.scanAlloc))    c.scanAlloc = 0    c.alloc[spc] = s}
// Allocate a span to use in an mcache.func (c *mcentral) cacheSpan() *mspan {    // Deduct credit for this span allocation and sweep if necessary.    spanBytes := uintptr(class_to_allocnpages[c.spanclass.sizeclass()]) * _PageSize    deductSweepCredit(spanBytes, 0)    traceDone := false    if trace.enabled {        traceGCSweepStart()    }    // If we sweep spanBudget spans without finding any free    // space, just allocate a fresh span. This limits the amount    // of time we can spend trying to find free space and    // amortizes the cost of small object sweeping over the    // benefit of having a full free span to allocate from. By    // setting this to 100, we limit the space overhead to 1%.    //    // TODO(austin,mknyszek): This still has bad worst-case    // throughput. For example, this could find just one free slot    // on the 100th swept span. That limits allocation latency, but    // still has very poor throughput. We could instead keep a    // running free-to-used budget and switch to fresh span    // allocation if the budget runs low.    spanBudget := 100    var s *mspan    var sl sweepLocker    // Try partial swept spans first.    sg := mheap_.sweepgen    if s = c.partialSwept(sg).pop(); s != nil {        goto havespan    }    sl = sweep.active.begin()    if sl.valid {        // Now try partial unswept spans.        for ; spanBudget >= 0; spanBudget-- {            s = c.partialUnswept(sg).pop()            if s == nil {                break            }            if s, ok := sl.tryAcquire(s); ok {                // We got ownership of the span, so let's sweep it and use it.                s.sweep(true)                sweep.active.end(sl)                goto havespan            }            // We failed to get ownership of the span, which means it's being or            // has been swept by an asynchronous sweeper that just couldn't remove it            // from the unswept list. That sweeper took ownership of the span and            // responsibility for either freeing it to the heap or putting it on the            // right swept list. Either way, we should just ignore it (and it's unsafe            // for us to do anything else).        }        // Now try full unswept spans, sweeping them and putting them into the        // right list if we fail to get a span.        for ; spanBudget >= 0; spanBudget-- {            s = c.fullUnswept(sg).pop()            if s == nil {                break            }            if s, ok := sl.tryAcquire(s); ok {                // We got ownership of the span, so let's sweep it.                s.sweep(true)                // Check if there's any free space.                freeIndex := s.nextFreeIndex()                if freeIndex != s.nelems {                    s.freeindex = freeIndex                    sweep.active.end(sl)                    goto havespan                }                // Add it to the swept list, because sweeping didn't give us any free space.                c.fullSwept(sg).push(s.mspan)            }            // See comment for partial unswept spans.        }        sweep.active.end(sl)    }    if trace.enabled {        traceGCSweepDone()        traceDone = true    }    // We failed to get a span from the mcentral so get one from mheap.    
s = c.grow()    if s == nil {        return nil    }    // At this point s is a span that should have free slots.havespan:    if trace.enabled && !traceDone {        traceGCSweepDone()    }    n := int(s.nelems) - int(s.allocCount)    if n == 0 || s.freeindex == s.nelems || uintptr(s.allocCount) == s.nelems {        throw("span has no free objects")    }    freeByteBase := s.freeindex &^ (64 - 1)    whichByte := freeByteBase / 8    // Init alloc bits cache.    s.refillAllocCache(whichByte)    // Adjust the allocCache so that s.freeindex corresponds to the low bit in    // s.allocCache.    s.allocCache >>= s.freeindex % 64    return s}
// grow allocates a new empty span from the heap and initializes it for c's size class.func (c *mcentral) grow() *mspan {    npages := uintptr(class_to_allocnpages[c.spanclass.sizeclass()])    size := uintptr(class_to_size[c.spanclass.sizeclass()])    s := mheap_.alloc(npages, c.spanclass)    if s == nil {        return nil    }    // Use division by multiplication and shifts to quickly compute:    // n := (npages << _PageShift) / size    n := s.divideByElemSize(npages << _PageShift)    s.limit = s.base() + size*n    heapBitsForAddr(s.base()).initSpan(s)    return s}// allocLarge allocates a span for a large object.func (c *mcache) allocLarge(size uintptr, noscan bool) *mspan {    if size+_PageSize < size {        throw("out of memory")    }    npages := size >> _PageShift    if size&_PageMask != 0 {        npages++    }    // Deduct credit for this span allocation and sweep if    // necessary. mHeap_Alloc will also sweep npages, so this only    // pays the debt down to npage pages.    deductSweepCredit(npages*_PageSize, npages)    spc := makeSpanClass(0, noscan)    s := mheap_.alloc(npages, spc)    if s == nil {        throw("out of memory")    }    stats := memstats.heapStats.acquire()    atomic.Xadd64(&stats.largeAlloc, int64(npages*pageSize))    atomic.Xadd64(&stats.largeAllocCount, 1)    memstats.heapStats.release()    // Update heapLive.    gcController.update(int64(s.npages*pageSize), 0)    // Put the large span in the mcentral swept list so that it's    // visible to the background sweeper.    mheap_.central[spc].mcentral.fullSwept(mheap_.sweepgen).push(s)    s.limit = s.base() + size    heapBitsForAddr(s.base()).initSpan(s)    return s}
// alloc allocates a new span of npage pages from the GC'd heap.//// spanclass indicates the span's size class and scannability.//// Returns a span that has been fully initialized. span.needzero indicates// whether the span has been zeroed. Note that it may not be.func (h *mheap) alloc(npages uintptr, spanclass spanClass) *mspan {    // Don't do any operations that lock the heap on the G stack.    // It might trigger stack growth, and the stack growth code needs    // to be able to allocate heap.    var s *mspan    systemstack(func() {        // To prevent excessive heap growth, before allocating n pages        // we need to sweep and reclaim at least n pages.        if !isSweepDone() {            h.reclaim(npages)        }        s = h.allocSpan(npages, spanAllocHeap, spanclass)    })    return s}
// allocSpan allocates an mspan which owns npages worth of memory.//// If typ.manual() == false, allocSpan allocates a heap span of class spanclass// and updates heap accounting. If manual == true, allocSpan allocates a// manually-managed span (spanclass is ignored), and the caller is// responsible for any accounting related to its use of the span. Either// way, allocSpan will atomically add the bytes in the newly allocated// span to *sysStat.//// The returned span is fully initialized.//// h.lock must not be held.//// allocSpan must be called on the system stack both because it acquires// the heap lock and because it must block GC transitions.////go:systemstackfunc (h *mheap) allocSpan(npages uintptr, typ spanAllocType, spanclass spanClass) (s *mspan) {    // Function-global state.    gp := getg()    base, scav := uintptr(0), uintptr(0)    growth := uintptr(0)    // On some platforms we need to provide physical page aligned stack    // allocations. Where the page size is less than the physical page    // size, we already manage to do this by default.    needPhysPageAlign := physPageAlignedStacks && typ == spanAllocStack && pageSize < physPageSize    // If the allocation is small enough, try the page cache!    // The page cache does not support aligned allocations, so we cannot use    // it if we need to provide a physical page aligned stack allocation.    pp := gp.m.p.ptr()    if !needPhysPageAlign && pp != nil && npages < pageCachePages/4 {        c := &pp.pcache        // If the cache is empty, refill it.        if c.empty() {            lock(&h.lock)            *c = h.pages.allocToCache()            unlock(&h.lock)        }        // Try to allocate from the cache.        base, scav = c.alloc(npages)        if base != 0 {            s = h.tryAllocMSpan()            if s != nil {                goto HaveSpan            }            // We have a base but no mspan, so we need            // to lock the heap.        }    }    // For one reason or another, we couldn't get the    // whole job done without the heap lock.    lock(&h.lock)    if needPhysPageAlign {        // Overallocate by a physical page to allow for later alignment.        npages += physPageSize / pageSize    }    if base == 0 {        // Try to acquire a base address.        base, scav = h.pages.alloc(npages)        if base == 0 {            var ok bool            growth, ok = h.grow(npages)            if !ok {                unlock(&h.lock)                return nil            }            base, scav = h.pages.alloc(npages)            if base == 0 {                throw("grew heap, but no adequate free space found")            }        }    }    if s == nil {        // We failed to get an mspan earlier, so grab        // one now that we have the heap lock.        s = h.allocMSpanLocked()    }    if needPhysPageAlign {        allocBase, allocPages := base, npages        base = alignUp(allocBase, physPageSize)        npages -= physPageSize / pageSize        // Return memory around the aligned allocation.        spaceBefore := base - allocBase        if spaceBefore > 0 {            h.pages.free(allocBase, spaceBefore/pageSize, false)        }        spaceAfter := (allocPages-npages)*pageSize - spaceBefore        if spaceAfter > 0 {            h.pages.free(base+npages*pageSize, spaceAfter/pageSize, false)        }    }    unlock(&h.lock)    if growth > 0 {        // We just caused a heap growth, so scavenge down what will soon be used.        
// By scavenging inline we deal with the failure to allocate out of        // memory fragments by scavenging the memory fragments that are least        // likely to be re-used.        scavengeGoal := atomic.Load64(&h.scavengeGoal)        if retained := heapRetained(); retained+uint64(growth) > scavengeGoal {            // The scavenging algorithm requires the heap lock to be dropped so it            // can acquire it only sparingly. This is a potentially expensive operation            // so it frees up other goroutines to allocate in the meanwhile. In fact,            // they can make use of the growth we just created.            todo := growth            if overage := uintptr(retained + uint64(growth) - scavengeGoal); todo > overage {                todo = overage            }            h.pages.scavenge(todo)        }    }HaveSpan:    // At this point, both s != nil and base != 0, and the heap    // lock is no longer held. Initialize the span.    s.init(base, npages)    if h.allocNeedsZero(base, npages) {        s.needzero = 1    }    nbytes := npages * pageSize    if typ.manual() {        s.manualFreeList = 0        s.nelems = 0        s.limit = s.base() + s.npages*pageSize        s.state.set(mSpanManual)    } else {        // We must set span properties before the span is published anywhere        // since we're not holding the heap lock.        s.spanclass = spanclass        if sizeclass := spanclass.sizeclass(); sizeclass == 0 {            s.elemsize = nbytes            s.nelems = 1            s.divMul = 0        } else {            s.elemsize = uintptr(class_to_size[sizeclass])            s.nelems = nbytes / s.elemsize            s.divMul = class_to_divmagic[sizeclass]        }        // Initialize mark and allocation structures.        s.freeindex = 0        s.allocCache = ^uint64(0) // all 1s indicating all free.        s.gcmarkBits = newMarkBits(s.nelems)        s.allocBits = newAllocBits(s.nelems)        // It's safe to access h.sweepgen without the heap lock because it's        // only ever updated with the world stopped and we run on the        // systemstack which blocks a STW transition.        atomic.Store(&s.sweepgen, h.sweepgen)        // Now that the span is filled in, set its state. This        // is a publication barrier for the other fields in        // the span. While valid pointers into this span        // should never be visible until the span is returned,        // if the garbage collector finds an invalid pointer,        // access to the span may race with initialization of        // the span. We resolve this race by atomically        // setting the state after the span is fully        // initialized, and atomically checking the state in        // any situation where a pointer is suspect.        s.state.set(mSpanInUse)    }    // Commit and account for any scavenged memory that the span now owns.    if scav != 0 {        // sysUsed all the pages that are actually available        // in the span since some of them might be scavenged.        sysUsed(unsafe.Pointer(base), nbytes)        atomic.Xadd64(&memstats.heap_released, -int64(scav))    }    // Update stats.    if typ == spanAllocHeap {        atomic.Xadd64(&memstats.heap_inuse, int64(nbytes))    }    if typ.manual() {        // Manually managed memory doesn't count toward heap_sys.        memstats.heap_sys.add(-int64(nbytes))    }    // Update consistent stats.    
stats := memstats.heapStats.acquire()    atomic.Xaddint64(&stats.committed, int64(scav))    atomic.Xaddint64(&stats.released, -int64(scav))    switch typ {    case spanAllocHeap:        atomic.Xaddint64(&stats.inHeap, int64(nbytes))    case spanAllocStack:        atomic.Xaddint64(&stats.inStacks, int64(nbytes))    case spanAllocPtrScalarBits:        atomic.Xaddint64(&stats.inPtrScalarBits, int64(nbytes))    case spanAllocWorkBuf:        atomic.Xaddint64(&stats.inWorkBufs, int64(nbytes))    }    memstats.heapStats.release()    // Publish the span in various locations.    // This is safe to call without the lock held because the slots    // related to this span will only ever be read or modified by    // this thread until pointers into the span are published (and    // we execute a publication barrier at the end of this function    // before that happens) or pageInUse is updated.    h.setSpans(s.base(), npages, s)    if !typ.manual() {        // Mark in-use span in arena page bitmap.        //        // This publishes the span to the page sweeper, so        // it's imperative that the span be completely initialized        // prior to this line.        arena, pageIdx, pageMask := pageIndexOf(s.base())        atomic.Or8(&arena.pageInUse[pageIdx], pageMask)        // Update related page sweeper stats.        h.pagesInUse.Add(int64(npages))    }    // Make sure the newly allocated span will be observed    // by the GC before pointers into the span are published.    publicationBarrier()    return s}
// Try to add at least npage pages of memory to the heap,// returning how much the heap grew by and whether it worked.//// h.lock must be held.func (h *mheap) grow(npage uintptr) (uintptr, bool) {    assertLockHeld(&h.lock)    // We must grow the heap in whole palloc chunks.    // We call sysMap below but note that because we    // round up to pallocChunkPages which is on the order    // of MiB (generally >= to the huge page size) we    // won't be calling it too much.    ask := alignUp(npage, pallocChunkPages) * pageSize    totalGrowth := uintptr(0)    // This may overflow because ask could be very large    // and is otherwise unrelated to h.curArena.base.    end := h.curArena.base + ask    nBase := alignUp(end, physPageSize)    if nBase > h.curArena.end || /* overflow */ end < h.curArena.base {        // Not enough room in the current arena. Allocate more        // arena space. This may not be contiguous with the        // current arena, so we have to request the full ask.        av, asize := h.sysAlloc(ask)        if av == nil {            print("runtime: out of memory: cannot allocate ", ask, "-byte block (", memstats.heap_sys, " in use)\n")            return 0, false        }        if uintptr(av) == h.curArena.end {            // The new space is contiguous with the old            // space, so just extend the current space.            h.curArena.end = uintptr(av) + asize        } else {            // The new space is discontiguous. Track what            // remains of the current space and switch to            // the new space. This should be rare.            if size := h.curArena.end - h.curArena.base; size != 0 {                // Transition this space from Reserved to Prepared and mark it                // as released since we'll be able to start using it after updating                // the page allocator and releasing the lock at any time.                sysMap(unsafe.Pointer(h.curArena.base), size, &memstats.heap_sys)                // Update stats.                atomic.Xadd64(&memstats.heap_released, int64(size))                stats := memstats.heapStats.acquire()                atomic.Xaddint64(&stats.released, int64(size))                memstats.heapStats.release()                // Update the page allocator's structures to make this                // space ready for allocation.                h.pages.grow(h.curArena.base, size)                totalGrowth += size            }            // Switch to the new space.            h.curArena.base = uintptr(av)            h.curArena.end = uintptr(av) + asize        }        // Recalculate nBase.        // We know this won't overflow, because sysAlloc returned        // a valid region starting at h.curArena.base which is at        // least ask bytes in size.        nBase = alignUp(h.curArena.base+ask, physPageSize)    }    // Grow into the current arena.    v := h.curArena.base    h.curArena.base = nBase    // Transition the space we're going to use from Reserved to Prepared.    sysMap(unsafe.Pointer(v), nBase-v, &memstats.heap_sys)    // The memory just allocated counts as both released    // and idle, even though it's not yet backed by spans.    //    // The allocation is always aligned to the heap arena    // size which is always > physPageSize, so its safe to    // just add directly to heap_released.    
atomic.Xadd64(&memstats.heap_released, int64(nBase-v))    stats := memstats.heapStats.acquire()    atomic.Xaddint64(&stats.released, int64(nBase-v))    memstats.heapStats.release()    // Update the page allocator's structures to make this    // space ready for allocation.    h.pages.grow(v, nBase-v)    totalGrowth += nBase - v    return totalGrowth, true}

GC

The GC uses tri-color mark-and-sweep

Which objects the GC reclaims

Roots
  • Objects referenced by pointers on a stack
  • Objects referenced by pointers in global variables
  • Objects referenced by pointers held in registers

Objects reached only indirectly (through other live objects) are also not reclaimed

Tri-color marking

  • Black: referenced and fully scanned
  • Gray: referenced, currently being scanned
  • White: not yet scanned / garbage

Yuasa deletion barrier

  • During concurrent marking, when a pointer is deleted, the white object it used to reference is shaded gray (this prevents objects whose references are removed during marking from being swept)

Dijkstra write barrier

  • During concurrent marking, when a pointer is written, the white object it now references is shaded gray (this prevents newly inserted references from being missed and swept during marking)

Hybrid barrier

Go uses a hybrid write barrier that combines both
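The hybrid barrier (introduced in Go 1.8) is commonly summarized with the following pseudocode, where shade() greys a white object; this is a conceptual sketch, not runtime code.

// writePointer(slot, ptr):
//     shade(*slot)                          // Yuasa-style: protect the object the slot used to reference
//     if the current goroutine's stack is grey:
//         shade(ptr)                        // Dijkstra-style: protect the newly referenced object
//     *slot = ptr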

GC Analysis Tools

  • go tool pprof
  • go tool trace
  • go build -gcflags="-m"
  • GODEBUG=gctrace=1
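A minimal way to wire the first two tools into a service is to expose the net/http/pprof endpoints (the address below is just an example); the last two are used as written, e.g. go build -gcflags="-m" ./... and GODEBUG=gctrace=1 ./yourapp.

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func main() {
    // Then, for example:
    //   go tool pprof http://localhost:6060/debug/pprof/heap
    //   go tool trace trace.out   (after saving /debug/pprof/trace?seconds=5 to trace.out)
    log.Println(http.ListenAndServe("localhost:6060", nil))
}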