About Go: a bug in the official Go implementation means the http Request Body is not always closed automatically

Code version: go1.20.2
We know that when using Go's http client we do not need to close the Request Body ourselves. Here is what the documentation in the source of Client.Do says:

// src/net/http/client.go
// ...
// Note: the doc comment below says the underlying Transport closes the request Body.
// The request Body, if non-nil, will be closed by the underlying
// Transport, even on errors.
//
// ...
func (c *Client) Do(req *Request) (*Response, error) {
    return c.do(req)
}

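Our project depends on this behavior: we rely on the client to close request.Body for us. However, we found a goroutine leak and eventually traced it to request.Body possibly not being closed. Could the official documentation be wrong? In the end we found a GitHub issue in which someone reported a scenario where request.Body is not closed under specific conditions, and the Go team later fixed it.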

Let's look at the issue first (https://github.com/golang/go/issues/49621):

The reporter also wrote a demo (https://play.golang.com/p/lku8lEgiPu6)

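The key point made in the issue is that in writeLoop() both pc.writech and pc.closech may be ready at the same time, and if the select takes <-pc.closech, Request.Body is never closed.
Let's look at the writeLoop() source first, paying attention to the added comments: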

// src/net/http/transport.go
func (pc *persistConn) writeLoop() {
    defer close(pc.writeLoopDone)
    for {
        select {
        case wr := <-pc.writech:
            startBytesWritten := pc.nwrite
            // Request.Body is closed inside this call; the details are skipped here.
            err := wr.req.Request.write(pc.bw, pc.isProxy, wr.req.extra, pc.waitForContinue(wr.continueCh))
            if bre, ok := err.(requestBodyReadError); ok {
                // ...
            }
            if err == nil {
                err = pc.bw.Flush()
            }
            if err != nil {
                if pc.nwrite == startBytesWritten {
                    err = nothingWrittenError{err}
                }
            }
            pc.writeErrCh <- err // to the body reader, which might recycle us
            wr.ch <- err         // to the roundTrip function
            if err != nil {
                pc.close(err)
                return
            }
        case <-pc.closech: // exits immediately, without touching any pending request
            return
        }
    }
}

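As we can see, for a normal request the loop must enter case wr := <-pc.writech before it touches the request, and only there is request.Body closed. If both case wr := <-pc.writech and case <-pc.closech are ready but the select enters case <-pc.closech, request.Body is never closed. So when can this happen?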
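When several cases of a select are ready, Go picks one of them pseudo-randomly, so neither case has priority. The following minimal sketch makes both channels ready before each select and counts the outcomes; the function name countSelect and the channel element types are made up for the illustration.

```go
package main

import "fmt"

// countSelect makes both channels ready before every select and counts
// which case the runtime picks.
func countSelect(rounds int) (writeWins, closeWins int) {
	for i := 0; i < rounds; i++ {
		writech := make(chan int, 1)
		closech := make(chan struct{})
		writech <- i   // a pending "request"
		close(closech) // the "connection" is already closed

		select {
		case <-writech:
			writeWins++
		case <-closech:
			closeWins++
		}
	}
	return writeWins, closeWins
}

func main() {
	w, c := countSelect(1000)
	fmt.Printf("writech chosen %d times, closech chosen %d times\n", w, c)
}
```

Both counters end up non-zero in practice, which is why a pending write request can lose to a closed connection.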

//src/net/http/transport.go
func (pc *persistConn) roundTrip(req *transportRequest) (resp *Response, err error) {
    // ...

    // Write the request concurrently with waiting for a response,
    // in case the server decides to reply before reading our full
    // request body.
    startBytesWritten := pc.nwrite
    writeErrCh := make(chan error, 1)
    // The request is sent on pc.writech here.
    pc.writech <- writeRequest{req, writeErrCh, continueCh}
    //...
}

The roundTrip() above sends on pc.writech, but pc.closech is closed from a different goroutine:

//src/net/http/transport.go
// close closes the underlying TCP connection and closes
// the pc.closech channel.
//
// The provided err is only for testing and debugging; in normal
// circumstances it should never be seen by users.
func (pc *persistConn) close(err error) {
    pc.mu.Lock()
    defer pc.mu.Unlock()
    pc.closeLocked(err)
}

func (pc *persistConn) closeLocked(err error) {
    if err == nil {
        panic("nil error")
    }
    pc.broken = true
    if pc.closed == nil {
        pc.closed = err
        pc.t.decConnsPerHost(pc.cacheKey)
        // Close HTTP/1 (pc.alt == nil) connection.
        // HTTP/2 closes its connection itself.
        if pc.alt == nil {
            if err != errCallerOwnsConn {
                pc.conn.Close()
            }
            close(pc.closech) // closing pc.closech wakes up every goroutine waiting on it
        }
    }
    pc.mutateHeaderFunc = nil
}

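As we can see, pc.closech is closed mainly when the persistConn is closed through close(). So the rough sequence is: the request obtains a persistConn, and then a Read or Write on that connection fails almost immediately. Because the two happen in different goroutines, pc.writech and pc.closech can both be ready at the same time. The Go team fixed this bug (https://go-review.googlesource.com/c/go/+/461675); let's see how: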
https://go-review.googlesource.com/c/go/+/461675/4/src/net/ht…

Judging from the changed lines, the fix makes (t *Transport) roundTrip(req *Request) try once more to close request.Body. Let's also look at the test case in this change, which is very clear:
https://go-review.googlesource.com/c/go/+/461675/4/src/net/ht…

The important parts are explained below:

// https://go.dev/issue/49621
func TestConnClosedBeforeRequestIsWritten(t *testing.T) {
    run(t, testConnClosedBeforeRequestIsWritten, testNotParallel, []testMode{http1Mode})
}

func testConnClosedBeforeRequestIsWritten(t *testing.T, mode testMode) {
    ts := newClientServerTest(t, mode, HandlerFunc(func(w ResponseWriter, r *Request) {}),
        func(tr *Transport) {
            tr.DialContext = func(_ context.Context, network, addr string) (net.Conn, error) {
                // The connection returns errors immediately.
                return &funcConn{
                    // A custom conn whose Read and Write both fail at once.
                    read: func([]byte) (int, error) {
                        return 0, errors.New("error")
                    },
                    write: func([]byte) (int, error) {
                        return 0, errors.New("error")
                    },
                }, nil
            }
        },
    ).ts
    // This hook sleeps briefly on entering RoundTrip, giving closech enough time to be closed.
    SetEnterRoundTripHook(func() {
        time.Sleep(1 * time.Millisecond)
    })
    defer SetEnterRoundTripHook(nil)
    var closes int
    _, err := ts.Client().Post(ts.URL, "text/plain", countCloseReader{&closes, strings.NewReader("hello")})
    if err == nil {
        t.Fatalf("expected request to fail, but it did not")
    }
    // closes should equal 1 here.
    if closes != 1 {
        t.Errorf("after RoundTrip, request body was closed %v times; want 1", closes)
    }
}

At the time of writing, this bug fix has been merged into master, but it is not yet known when it will ship in an official release.

Summary

Do not trust the official implementation blindly. It can have bugs too, so question it boldly and investigate.

Related links

  • bug fix: https://go-review.googlesource.com/c/go/+/461675
  • issue: https://github.com/golang/go/issues/49621
