Golang: An Analysis of the Preemptive Scheduling Flow


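I. Prerequisites

This article walks through the runtime source code for scheduler preemption. It is easier to follow with the background below.

  1. Familiarity with Go basics and syntax.
  2. An understanding of Go's G/M/P model and of goroutines.
  3. An understanding of how user-space threads / stackful coroutines work.
  4. Recommended reading: Go Preemptive Scheduler Design Doc
  5. Environment used in this article: go1.12.5 GOOS=darwin GOARCH=amd64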

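II. Source Code Analysis

2.1 Overall Framework

The overall idea:

  1. sysmon periodically scans the Gs that are currently running, picks out those that have been running for too long, and marks them as needing preemption.
  2. At suitable points the marked G detects the preemption flag, and the runtime switches it out so that it yields the CPU.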

2.2 Marking a G for Preemption

2.2.1 Key Data Structures

sysmontick struct:

As the name suggests, sysmontick stores the context of sysmon's preemption checks.

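type sysmontick struct {
    schedtick   uint32 // value of p.schedtick seen by sysmon on its last check
    schedwhen   int64  // time at which sysmon last saw schedtick change
    syscalltick uint32 // value of p.syscalltick seen by sysmon on its last check
    syscallwhen int64  // time at which sysmon last saw syscalltick change
}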

p struct:

Only the fields of p that are relevant to preemption are shown here.

type p struct {
    lock mutex
    ......

    status      uint32     // one of pidle/prunning/...
    schedtick   uint32     // incremented on every scheduler call
    syscalltick uint32     // incremented on every system call
    sysmontick  sysmontick // the sysmontick above; sysmon updates it to record what it saw on its last check
    ......
}

2.2.2 Scanning and Marking

sysmon actually does many things. Here we only care about two points:

  • the scan period
  • the marking logic

func sysmon() {
    ......
    idle := 0 // how many cycles in succession we had not wokeup somebody
    delay := uint32(0)
    for {
        if idle == 0 { // start with 20us sleep...
            delay = 20
        } else if idle > 50 { // start doubling the sleep after 1ms...
            delay *= 2
        }
        if delay > 10*1000 { // up to 10ms
            delay = 10 * 1000
        }
        usleep(delay)
        ......
        // retake P's blocked in syscalls
        // and preempt long running G's
        if retake(now) != 0 {
            idle = 0
        } else {
            idle++
        }
    }
}

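As for the scan period: each loop iteration sleeps at least 20µs. Once more than 50 consecutive iterations have been idle (about 1ms in total), the sleep doubles on every iteration (exponential backoff), capped at 10ms. In the worst case, therefore, sysmon runs a preemption check roughly once every 10ms.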
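
The backoff described above can be replayed outside the runtime. The following is a minimal sketch, not runtime code: nextDelay copies the delay update from the sysmon excerpt, while simulate and the assumption that every pass is idle (retake always returns 0) are made up for illustration.

```go
package main

import "fmt"

// nextDelay mirrors the delay update at the top of the sysmon loop.
// idle is the number of consecutive passes that found nothing to do,
// delay is the previous sleep in microseconds.
func nextDelay(idle int, delay uint32) uint32 {
	if idle == 0 { // start with a 20us sleep
		delay = 20
	} else if idle > 50 { // start doubling after about 1ms of idle passes
		delay *= 2
	}
	if delay > 10*1000 { // cap at 10ms
		delay = 10 * 1000
	}
	return delay
}

// simulate runs n consecutive idle passes (assumption: retake always
// returns 0) and returns the sleep used on each pass.
func simulate(n int) []uint32 {
	delays := make([]uint32, 0, n)
	delay := uint32(0)
	for idle := 0; idle < n; idle++ {
		delay = nextDelay(idle, delay)
		delays = append(delays, delay)
	}
	return delays
}

func main() {
	d := simulate(70)
	for _, i := range []int{0, 50, 51, 52, 58, 59, 69} {
		fmt.Printf("pass %2d: sleep %5dus\n", i, d[i])
	}
}
```

In this sketch the sleep stays at 20µs through pass 50, starts doubling at pass 51, and hits the 10ms cap after nine doublings.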

Next, let's look at retake, which contains the actual check-and-mark logic.

// forcePreemptNS is the time slice given to a G before it is
// preempted.
const forcePreemptNS = 10 * 1000 * 1000 // 10ms

func retake(now int64) uint32 {
    n := 0
    lock(&allpLock)
    for i := 0; i < len(allp); i++ {
        _p_ := allp[i]
        if _p_ == nil {
            continue
        }
        pd := &_p_.sysmontick
        s := _p_.status
        if s == _Psyscall {
            ......
        } else if s == _Prunning {
            // Preempt G if it's running for too long.
            t := int64(_p_.schedtick)
            if int64(pd.schedtick) != t {
                pd.schedtick = uint32(t)
                pd.schedwhen = now
                continue
            }
            if pd.schedwhen+forcePreemptNS > now {
                continue
            }
            preemptone(_p_)
        }
    }
    unlock(&allpLock)
    return uint32(n)
}

  1. Walk the allp list to find the Gs that are currently running.
  2. Check the state of each running P:

    t := int64(_p_.schedtick)
    if int64(pd.schedtick) != t { // scheduling happened since the last pass, i.e. the G running on this P has changed
        pd.schedtick = uint32(t)
        pd.schedwhen = now // record the time of this preemption check
        continue
    }
    if pd.schedwhen+forcePreemptNS > now {
        continue
    }
    preemptone(_p_)

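As the key data structures above show, p.schedtick records the total number of scheduling events on this P (it only increases), so sysmon can tell whether any scheduling happened during the last period by comparing it with the schedtick it recorded on its previous pass.

Whether the G needs to be marked is then decided by comparing the recorded time with the current time: pd.schedwhen+forcePreemptNS > now means the time slice has not expired yet.

forcePreemptNS is 10ms: if no scheduling has happened on the P for more than 10ms, the running G needs to be preempted. Note that this does not guarantee that a G runs for at most 10ms.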
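
To see why 10ms is not a hard upper bound, the check can be isolated for a single P. This is a minimal sketch under stated assumptions: the p and sysmontick types are cut down to the two fields involved, and shouldPreempt is a hypothetical helper that returns true where the runtime would call preemptone.

```go
package main

import "fmt"

const forcePreemptNS = 10 * 1000 * 1000 // 10ms, as in the runtime excerpt

// sysmontick and p keep only the fields the check needs.
type sysmontick struct {
	schedtick uint32
	schedwhen int64
}

type p struct {
	schedtick  uint32
	sysmontick sysmontick
}

// shouldPreempt reproduces the _Prunning branch of retake for one P.
func shouldPreempt(pp *p, now int64) bool {
	pd := &pp.sysmontick
	if pd.schedtick != pp.schedtick {
		// The P scheduled something since the last pass: restart the clock.
		pd.schedtick = pp.schedtick
		pd.schedwhen = now
		return false
	}
	// Same G as on the last pass; mark it only once its slice has expired.
	return pd.schedwhen+forcePreemptNS <= now
}

func main() {
	const ms = int64(1000 * 1000)
	pp := &p{schedtick: 1} // a G was scheduled on this P at about t=0
	for _, t := range []int64{3, 12, 14} {
		fmt.Printf("sysmon pass at %2dms: preempt=%v\n", t, shouldPreempt(pp, t*ms))
	}
}
```

The clock starts at the first sysmon pass that notices the new schedtick (3ms here), not when the G actually started running, and the mark can only be set on a later pass. Both effects add to the 10ms slice.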

Finally, preemptone marks the current G as needing preemption.

func preemptone(_p_ *p) bool {
    mp := _p_.m.ptr()
    if mp == nil || mp == getg().m {
        return false
    }
    gp := mp.curg
    if gp == nil || gp == mp.g0 {
        return false
    }
    gp.preempt = true
    // Every call in a go routine checks for stack overflow by
    // comparing the current stack pointer to gp->stackguard0.
    // Setting gp->stackguard0 to StackPreempt folds
    // preemption into the normal stack overflow check.
    gp.stackguard0 = stackPreempt
    return true
}

The key is setting g.preempt = true and g.stackguard0 = stackPreempt.
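
Why overwriting stackguard0 is enough can be shown with a toy model of the prologue check. This is only a sketch: the g type is reduced to two fields, needMorestack and markPreempt are hypothetical helpers, the addresses are invented, and the value of stackPreempt is an assumption based on my reading of runtime/stack.go. What matters is that it is larger than any real stack pointer.

```go
package main

import "fmt"

// stackPreempt stands in for the runtime constant of the same name.
// Assumed value: the unsigned form of -1314, i.e. 0x...fade.
const stackPreempt = ^uintptr(0) - 1313

// g keeps only the fields touched by preemptone.
type g struct {
	preempt     bool
	stackguard0 uintptr
}

// needMorestack models the compare-and-branch in a function prologue:
// the slow path (morestack) is taken when SP is at or below stackguard0.
func needMorestack(sp uintptr, gp *g) bool {
	return sp <= gp.stackguard0
}

// markPreempt performs the last two assignments of preemptone.
func markPreempt(gp *g) {
	gp.preempt = true
	gp.stackguard0 = stackPreempt
}

func main() {
	gp := &g{stackguard0: 0x1000} // hypothetical guard address
	sp := uintptr(0x8000)         // hypothetical stack pointer, well above the guard

	fmt.Println("before mark:", needMorestack(sp, gp))
	markPreempt(gp)
	fmt.Println("after mark: ", needMorestack(sp, gp))
}
```

Once the guard has been overwritten, the comparison fails for every possible stack pointer, so the next function call made by the G is forced onto the morestack path even though the stack has plenty of room.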

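2.3 Triggering Preemption

The comment in preemptone already says it (the runtime's comments are very good): preemption is triggered when the target g makes a function call and runs the stack check in the function prologue.

func testfunc() (sum int) {
    var nums [100]int
    for _, num := range nums {
        sum += num
    }
    return
}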
"".testfunc STEXT size=132 args=0x8 locals=0x328
    0x0000 00000 (main.go:60)    TEXT    "".testfunc(SB), ABIInternal, $808-8
    0x0000 00000 (main.go:60)    MOVQ    (TLS), CX
    0x0009 00009 (main.go:60)    LEAQ    -680(SP), AX
    0x0011 00017 (main.go:60)    CMPQ    AX, 16(CX)
    0x0015 00021 (main.go:60)    JLS    122
    0x0017 00023 (main.go:60)    SUBQ    $808, SP
    0x001e 00030 (main.go:60)    MOVQ    BP, 800(SP)
    0x0026 00038 (main.go:60)    LEAQ    800(SP), BP
    0x002e 00046 (main.go:60)    FUNCDATA    $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x002e 00046 (main.go:60)    FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x002e 00046 (main.go:60)    FUNCDATA    $3, gclocals·39825eea4be6e41a70480a53a624f97b(SB)
    0x002e 00046 (main.go:62)    PCDATA    $2, $1
    0x002e 00046 (main.go:62)    PCDATA    $0, $0
    0x002e 00046 (main.go:62)    LEAQ    ""..autotmp_3(SP), DI
    0x0032 00050 (main.go:62)    XORPS    X0, X0
    0x0035 00053 (main.go:62)    PCDATA    $2, $0
    0x0035 00053 (main.go:62)    LEAQ    -32(DI), DI
    0x0039 00057 (main.go:62)    DUFFZERO    $64
    0x004c 00076 (main.go:62)    XORL    AX, AX
    0x004e 00078 (main.go:62)    XORL    CX, CX
    0x0050 00080 (main.go:62)    JMP    92
    0x0052 00082 (main.go:62)    MOVQ    ""..autotmp_3(SP)(AX*8), DX
    0x0056 00086 (main.go:62)    INCQ    AX
    0x0059 00089 (main.go:63)    ADDQ    DX, CX
    0x005c 00092 (main.go:62)    CMPQ    AX, $100
    0x0060 00096 (main.go:62)    JLT    82
    0x0062 00098 (main.go:65)    MOVQ    CX, "".sum+816(SP)
    0x006a 00106 (main.go:65)    MOVQ    800(SP), BP
    0x0072 00114 (main.go:65)    ADDQ    $808, SP
    0x0079 00121 (main.go:65)    RET
    0x007a 00122 (main.go:65)    NOP
    0x007a 00122 (main.go:60)    PCDATA    $0, $-1
    0x007a 00122 (main.go:60)    PCDATA    $2, $-1
    0x007a 00122 (main.go:60)    CALL    runtime.morestack_noctxt(SB) // grow the stack
    0x007f 00127 (main.go:60)    JMP    0

The call chain is: morestack_noctxt => runtime.morestack => runtime.newstack

func newstack() {
     ......
   ......
    preempt := atomic.Loaduintptr(&gp.stackguard0) == stackPreempt
    
    // Be conservative about where we preempt.
    // We are interested in preempting user Go code, not runtime code.
    // If we're holding locks, mallocing, or preemption is disabled, don't
    // preempt.
    // This check is very early in newstack so that even the status change
    // from Grunning to Gwaiting and back doesn't happen in this case.
    // That status change by itself can be viewed as a small preemption,
    // because the GC might change Gwaiting to Gscanwaiting, and then
    // this goroutine has to wait for the GC to finish before continuing.
    // If the GC is in some way dependent on this goroutine (for example,
    // it needs a lock held by the goroutine), that small preemption turns
    // into a real deadlock.
    if preempt {
        // Key point: in these cases the preemption request is not honored yet.
        if thisg.m.locks != 0 || thisg.m.mallocing != 0 || thisg.m.preemptoff != "" || thisg.m.p.ptr().status != _Prunning {
            // Let the goroutine keep running for now.
            // gp->preempt is set, so it will be preempted next time.
            gp.stackguard0 = gp.stack.lo + _StackGuard
            gogo(&gp.sched) // never return
        }
    }

 ......
 ......
 ......

    if preempt {
        if gp == thisg.m.g0 {
            throw("runtime: preempt g0")
        }
        if thisg.m.p == 0 && thisg.m.locks == 0 {
            throw("runtime: g is running but p is not")
        }
        // Synchronize with scang.
        casgstatus(gp, _Grunning, _Gwaiting)
        if gp.preemptscan {
            ......
        }
        // Act like goroutine called runtime.Gosched.
        casgstatus(gp, _Gwaiting, _Grunning)
        gopreempt_m(gp) // never return
    }
  ......
}
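
The conservative check at the top of newstack can be read as a single predicate. The following sketch restates it outside the runtime. The m type here is a made-up reduction to the fields involved, pRunning stands in for thisg.m.p.ptr().status == _Prunning, and safeToPreempt is a hypothetical name.

```go
package main

import "fmt"

// m holds the per-thread state newstack consults before honoring a
// preemption request.
type m struct {
	locks      int32
	mallocing  int32
	preemptoff string
	pRunning   bool
}

// safeToPreempt is the negation of the early-exit condition in newstack:
// the G is only switched out when the M holds no locks, is not allocating,
// has not disabled preemption, and its P is in the running state.
func safeToPreempt(mp m) bool {
	return mp.locks == 0 && mp.mallocing == 0 && mp.preemptoff == "" && mp.pRunning
}

func main() {
	cases := []struct {
		name string
		mp   m
	}{
		{"plain user code", m{pRunning: true}},
		{"holding a runtime lock", m{locks: 1, pRunning: true}},
		{"inside malloc", m{mallocing: 1, pRunning: true}},
		{"preemption disabled", m{preemptoff: "example reason", pRunning: true}},
		{"P not running", m{}},
	}
	for _, c := range cases {
		fmt.Printf("%-24s preempt now: %v\n", c.name, safeToPreempt(c.mp))
	}
}
```

When the predicate is false, the excerpt restores the normal stackguard0 and resumes the G with gogo. gp.preempt stays set, so the request is retried later.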

Finally, gopreempt_m preempts the current g. It ends up calling goschedImpl, which:

  • changes the current G's status to runnable and puts it on the global G queue;
  • calls schedule() to run another round of scheduling and continue with other runnable Gs.

func gopreempt_m(gp *g) {
    if trace.enabled {
        traceGoPreempt()
    }
    goschedImpl(gp)
}

func goschedImpl(gp *g) {
    status := readgstatus(gp)
    if status&^_Gscan != _Grunning {
        dumpgstatus(gp)
        throw("bad g status")
    }
    casgstatus(gp, _Grunning, _Grunnable)
    dropg()
    lock(&sched.lock)
    globrunqput(gp)
    unlock(&sched.lock)

    schedule()
}
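
The effect of putting a preempted G at the tail of the global queue and then rescheduling can be sketched as a small round-robin simulation. This is a toy model, not the runtime's scheduler: it ignores per-P local run queues, work stealing, and timing, and the sched type and run function are hypothetical. Only the names globrunqput and globrunqget are borrowed from the runtime.

```go
package main

import "fmt"

// g is a goroutine stand-in; slices counts the time slices it has run.
type g struct {
	id     int
	slices int
}

// sched models only the global run queue, as a FIFO.
type sched struct {
	runq []*g
}

// globrunqput appends gp to the tail of the global queue.
func (s *sched) globrunqput(gp *g) { s.runq = append(s.runq, gp) }

// globrunqget pops the head of the global queue, or returns nil if empty.
func (s *sched) globrunqget() *g {
	if len(s.runq) == 0 {
		return nil
	}
	gp := s.runq[0]
	s.runq = s.runq[1:]
	return gp
}

// run simulates `rounds` time slices on `nprocs` Ps.
func run(s *sched, nprocs, rounds int) {
	for r := 0; r < rounds; r++ {
		// schedule(): each P takes the next runnable G from the queue head.
		var running []*g
		for i := 0; i < nprocs; i++ {
			if gp := s.globrunqget(); gp != nil {
				running = append(running, gp)
			}
		}
		// The slice expires: every running G is preempted and requeued.
		for _, gp := range running {
			gp.slices++
			s.globrunqput(gp)
		}
	}
}

func main() {
	s := &sched{}
	gs := make([]*g, 5)
	for i := range gs {
		gs[i] = &g{id: i + 4} // ids are arbitrary labels
		s.globrunqput(gs[i])
	}
	run(s, 2, 50)
	for _, gp := range gs {
		fmt.Printf("G%d ran %d slices\n", gp.id, gp.slices)
	}
}
```

With 5 always-runnable Gs on 2 Ps, the FIFO hands every G the same number of slices in this model, which is the kind of even rotation one would hope to observe in practice.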

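III. Experiments

3.1 Test Program

Points to note:

  • Make sure the user code runs for longer than 10ms and calls nothing along the way that would cause a switch.
  • Make sure the user code can trigger the preempt-and-yield logic: it must make non-inlined function calls, so that the stack check in the callee's prologue can reach newstack.
  • No other interference such as network I/O.

package main

import (
    "flag"
    "runtime"
    "time"
)

// testfunc has a frame large enough that the compiler emits a stack
// check in its prologue; that check is where preemption can happen.
func testfunc() (sum int) {
    var nums [100]int
    for _, num := range nums {
        sum += num
    }
    return
}

func main() {
    var numOfG = 1

    runtime.GOMAXPROCS(2)
    flag.IntVar(&numOfG, "n", 1, "num of g")
    flag.Parse()
    // Start numOfG goroutines that spin forever.
    for i := 0; i < numOfG; i++ {
        go func() {
            for {
                testfunc()
            }
        }()
    }

    time.Sleep(time.Second * 100)
}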

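We set GOMAXPROCS=2 to make the behavior analyzed above easier to observe, and use busy loops to simulate user code that runs for too long. The sampling interval is 1000ms (the SCHED dumps below are in the format printed by GODEBUG=schedtrace=1000,scheddetail=1).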

3.2 Experimental Data

3.2.1 GOMAXPROCS > number of busy-loop Gs

SCHED 0ms: gomaxprocs=8 idleprocs=6 threads=3 spinningthreads=1 idlethreads=0 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=0 syscalltick=0 m=2 runqsize=0 gfreecnt=0
  P2: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P3: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P4: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P5: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P6: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P7: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  M2: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=1
  G1: status=4(chan receive) m=-1 lockedm=0
  G2: status=1() m=-1 lockedm=-1
  G3: status=1() m=-1 lockedm=-1
SCHED 1003ms: gomaxprocs=2 idleprocs=1 threads=5 spinningthreads=0 idlethreads=2 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=0 schedtick=2 syscalltick=6 m=-1 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=48 syscalltick=10 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=5 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=1() m=-1 lockedm=-1
  G5: status=3() m=0 lockedm=-1
SCHED 2005ms: gomaxprocs=2 idleprocs=1 threads=5 spinningthreads=0 idlethreads=2 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=0 schedtick=2 syscalltick=6 m=-1 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=91 syscalltick=10 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=5 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=1() m=-1 lockedm=-1
  G5: status=3() m=0 lockedm=-1
SCHED 3011ms: gomaxprocs=2 idleprocs=1 threads=5 spinningthreads=0 idlethreads=2 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=0 schedtick=2 syscalltick=6 m=-1 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=135 syscalltick=10 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=5 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=1() m=-1 lockedm=-1
  G5: status=3() m=0 lockedm=-1
SCHED 4015ms: gomaxprocs=2 idleprocs=1 threads=5 spinningthreads=0 idlethreads=2 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=0 schedtick=2 syscalltick=6 m=-1 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=179 syscalltick=10 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=4 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=5 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=2() m=3 lockedm=-1
  G5: status=3() m=0 lockedm=-1

Goroutine  Total   Execution  Network wait  Sync block  Blocking syscall  Scheduler wait  GC sweeping  GC pause
4          5002ms  4999ms     0ns           0ns         0ns               2485µs          0ns (0.0%)   0ns (0.0%)

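G4 is the busy-loop G, and it runs on P1 (P0 stays idle). P1's schedtick grows by about 44 per second (48, 91, 135, 179), so a scheduling event actually happens roughly every 20ms or so, about twice forcePreemptNS. This is consistent with the retake logic: the sysmon pass that first sees a new schedtick only records schedwhen, and the G can only be marked on a later pass once 10ms have elapsed, while sysmon itself may sleep up to 10ms between passes.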
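
The scheduling interval can be computed directly from the SCHED samples above. The sketch below uses the schedtick values of P1 from this experiment; the sample type and avgInterval are hypothetical helpers.

```go
package main

import "fmt"

// sample is one (timestamp, schedtick) pair read from a SCHED dump.
type sample struct {
	ms        int
	schedtick int
}

// avgInterval returns the mean time in milliseconds between two
// scheduling events, computed from the first and last sample.
func avgInterval(s []sample) float64 {
	first, last := s[0], s[len(s)-1]
	return float64(last.ms-first.ms) / float64(last.schedtick-first.schedtick)
}

func main() {
	// P1 in experiment 3.2.1.
	p1 := []sample{{1003, 48}, {2005, 91}, {3011, 135}, {4015, 179}}
	for i := 1; i < len(p1); i++ {
		fmt.Printf("%dms -> %dms: +%d ticks\n",
			p1[i-1].ms, p1[i].ms, p1[i].schedtick-p1[i-1].schedtick)
	}
	fmt.Printf("average interval: %.1fms\n", avgInterval(p1))
}
```

Over the three one-second windows the tick count rises by 43, 44, and 44, which works out to an average of about 23ms between scheduling events.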

3.2.2 GOMAXPROCS = number of busy-loop Gs

SCHED 0ms: gomaxprocs=8 idleprocs=6 threads=3 spinningthreads=1 idlethreads=0 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=0 syscalltick=0 m=2 runqsize=0 gfreecnt=0
  P2: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P3: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P4: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P5: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P6: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P7: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  M2: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=1
  G1: status=4(chan receive) m=-1 lockedm=0
  G2: status=1() m=-1 lockedm=-1
  G3: status=1() m=-1 lockedm=-1
SCHED 1006ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=1 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=47 syscalltick=6 m=2 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=49 syscalltick=8 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=5 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=0 curg=4 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=6 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=2() m=2 lockedm=-1
  G5: status=2() m=3 lockedm=-1
  G6: status=3() m=0 lockedm=-1
SCHED 2013ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=1 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=91 syscalltick=6 m=2 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=93 syscalltick=8 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=4 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=0 curg=5 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=6 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=2() m=3 lockedm=-1
  G5: status=2() m=2 lockedm=-1
  G6: status=3() m=0 lockedm=-1
SCHED 3018ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=1 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=135 syscalltick=6 m=2 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=137 syscalltick=8 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=4 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=0 curg=5 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=6 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=2() m=3 lockedm=-1
  G5: status=2() m=2 lockedm=-1
  G6: status=3() m=0 lockedm=-1
SCHED 4029ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=1 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=179 syscalltick=6 m=2 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=181 syscalltick=8 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=0 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=6 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=1() m=-1 lockedm=-1
  G5: status=1() m=-1 lockedm=-1
  G6: status=3() m=0 lockedm=-1
SCHED 5038ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=2 runqueue=1 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=226 syscalltick=8 m=0 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=227 syscalltick=8 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=4 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=0 curg=5 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  G1: status=4(semacquire) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=1(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=2() m=3 lockedm=-1
  G5: status=2() m=0 lockedm=-1
  G6: status=4(timer goroutine (idle)) m=-1 lockedm=-1

G4 and G5 are the busy-loop Gs. The data is similar to the first experiment.

Goroutine  Total   Execution  Network wait  Sync block  Blocking syscall  Scheduler wait  GC sweeping  GC pause
4          5021ms  5019ms     0ns           0ns         0ns               2156µs          0ns (0.0%)   0ns (0.0%)
5          5021ms  5018ms     0ns           0ns         0ns               2442µs          0ns (0.0%)   0ns (0.0%)

3.2.3 GOMAXPROCS < number of busy-loop Gs

We start 5 busy-loop Gs for this experiment.

SCHED 0ms: gomaxprocs=8 idleprocs=5 threads=5 spinningthreads=1 idlethreads=0 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=0 syscalltick=0 m=3 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=1 syscalltick=0 m=2 runqsize=0 gfreecnt=0
  P2: status=1 schedtick=0 syscalltick=0 m=4 runqsize=0 gfreecnt=0
  P3: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P4: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P5: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P6: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P7: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  M4: p=2 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=true blocked=false lockedg=-1
  M3: p=0 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=true blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=1
  G1: status=1(chan receive) m=-1 lockedm=0
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
SCHED 1000ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=1 runqueue=3 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=48 syscalltick=9 m=2 runqsize=0 gfreecnt=0
  P1: status=1 schedtick=49 syscalltick=6 m=3 runqsize=0 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=7 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=0 curg=4 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=9 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=2() m=2 lockedm=-1
  G5: status=1() m=-1 lockedm=-1
  G6: status=1() m=-1 lockedm=-1
  G7: status=2() m=3 lockedm=-1
  G8: status=1() m=-1 lockedm=-1
  G9: status=3() m=0 lockedm=-1
SCHED 2003ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=1 runqueue=1 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=91 syscalltick=9 m=2 runqsize=1 gfreecnt=0
  P1: status=1 schedtick=92 syscalltick=6 m=3 runqsize=1 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=8 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=0 curg=7 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=9 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=1() m=-1 lockedm=-1
  G5: status=1() m=-1 lockedm=-1
  G6: status=1() m=-1 lockedm=-1
  G7: status=2() m=2 lockedm=-1
  G8: status=2() m=3 lockedm=-1
  G9: status=3() m=0 lockedm=-1
SCHED 3013ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=1 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=134 syscalltick=9 m=2 runqsize=1 gfreecnt=0
  P1: status=1 schedtick=135 syscalltick=6 m=3 runqsize=2 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=0 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=9 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=1() m=-1 lockedm=-1
  G5: status=1() m=-1 lockedm=-1
  G6: status=1() m=-1 lockedm=-1
  G7: status=1() m=-1 lockedm=-1
  G8: status=1() m=-1 lockedm=-1
  G9: status=3() m=0 lockedm=-1
SCHED 4017ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=1 runqueue=1 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=178 syscalltick=9 m=2 runqsize=1 gfreecnt=0
  P1: status=1 schedtick=179 syscalltick=6 m=3 runqsize=1 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=7 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=0 curg=6 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=9 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=1() m=-1 lockedm=-1
  G5: status=1() m=-1 lockedm=-1
  G6: status=2() m=2 lockedm=-1
  G7: status=2() m=3 lockedm=-1
  G8: status=1() m=-1 lockedm=-1
  G9: status=3() m=0 lockedm=-1
SCHED 5019ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=2 runqueue=1 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=220 syscalltick=9 m=2 runqsize=2 gfreecnt=0
  P1: status=1 schedtick=221 syscalltick=6 m=3 runqsize=1 gfreecnt=0
  M4: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  M3: p=1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M2: p=0 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 spinning=false blocked=true lockedg=-1
  G1: status=4(sleep) m=-1 lockedm=-1
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
  G17: status=4(finalizer wait) m=-1 lockedm=-1
  G33: status=4(trace reader (blocked)) m=-1 lockedm=-1
  G4: status=1() m=-1 lockedm=-1
  G5: status=1() m=-1 lockedm=-1
  G6: status=1() m=-1 lockedm=-1
  G7: status=1() m=-1 lockedm=-1
  G8: status=1() m=-1 lockedm=-1
  G9: status=1() m=-1 lockedm=-1

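When there are more busy-loop Gs than GOMAXPROCS, the 5 busy-loop Gs clearly take turns (each is preempted and later re-associated with a P), the global G queue (runqueue) visibly changes between samples, and the execution time is spread fairly evenly.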

Goroutine  Total   Execution  Network wait  Sync block  Blocking syscall  Scheduler wait  GC sweeping  GC pause
4          5040ms  2173ms     0ns           0ns         0ns               2867ms          0ns (0.0%)   0ns (0.0%)
5          5040ms  1950ms     0ns           0ns         0ns               3090ms          0ns (0.0%)   0ns (0.0%)
6          5040ms  2042ms     0ns           0ns         0ns               2997ms          0ns (0.0%)   0ns (0.0%)
7          5040ms  1938ms     0ns           0ns         0ns               3102ms          0ns (0.0%)   0ns (0.0%)
8          5040ms  1971ms     0ns           0ns         0ns               3069ms          0ns (0.0%)   0ns (0.0%)
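
How even the split is can be checked with a little arithmetic on the Execution column. In the sketch below, fairShare is a hypothetical helper that assumes the 2 Ps are shared perfectly evenly among 5 always-runnable Gs; the numbers are copied from the table above.

```go
package main

import "fmt"

// fairShare returns the execution time each G would get if nprocs Ps
// were shared perfectly evenly among ngs always-runnable Gs for totalMs.
func fairShare(totalMs, nprocs, ngs int) int {
	return totalMs * nprocs / ngs
}

func main() {
	// Execution column for G4..G8, in milliseconds.
	exec := map[int]int{4: 2173, 5: 1950, 6: 2042, 7: 1938, 8: 1971}
	share := fairShare(5040, 2, 5)
	sum := 0
	for id := 4; id <= 8; id++ {
		sum += exec[id]
		fmt.Printf("G%d: %dms (%+dms vs fair share %dms)\n",
			id, exec[id], exec[id]-share, share)
	}
	fmt.Printf("total execution: %dms of %dms available on 2 Ps\n", sum, 5040*2)
}
```

The fair share comes out at 2016ms per G, and every measured value is within about 8% of it. The five execution times add up to 10074ms, almost exactly the 10080ms that 2 Ps provide over 5040ms, so both Ps were kept busy throughout.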

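IV. Summary

  • This article is only a brief analysis. Preemption is subject to other conditions as well, but it essentially targets user code, not the runtime. See the recommended reading for details.
  • To be continued.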

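That's all. If you spot any shortcomings, corrections are welcome: leave a comment or contact fobcrackgp@163.com.
This is an original article by 笃行, first published on 大题小作. Permalink: Golang 抢占调度流程分析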

https://www.ifobnn.com/golangpreempt.html
