Erlang: Cowboy handler process shutdown on nginx 499

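Overview

Cowboy is a web server framework for Erlang. When a client disconnects before the response is sent (nginx logs this as status 499), Cowboy kills the handler process outright. This easily leads to bugs.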

Example code

See https://ninenines.eu/docs/en/…
Consider the following handler:

-module(hello_handler).
-behavior(cowboy_handler).

-export([init/2]).

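init(Req0, State) ->
    erlang:display("before_sleep"),
    %% Simulate slow work so the client can disconnect mid-request.
    timer:sleep(3000),
    erlang:display("after_sleep"),
    %% cowboy_req:reply/4 returns an updated request object,
    %% so it is bound to a new variable.
    Req = cowboy_req:reply(
        200,
        #{<<"content-type">> => <<"text/plain">>},
        <<"Hello Erlang!">>,
        Req0
    ),
    {ok, Req, State}.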

A normal request with

curl http://localhost:8080

prints both markers in the Erlang shell:

([email protected])1> "before_sleep"
"after_sleep"

If the client gives up almost immediately:

curl http://localhost:8080 --max-time 0.001
curl: (28) Resolving timed out after 4 milliseconds

only the first marker is printed:

([email protected])1> "before_sleep"

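This shows that the handler process was forcibly cut off in the middle of its work. If the handler touches a resource outside the process, for example by taking a lock, the lock will never be released.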

Cause

See loop in cowboy_http.erl:

loop(State=#state{parent=Parent, socket=Socket, transport=Transport, opts=Opts,
        buffer=Buffer, timer=TimerRef, children=Children, in_streamid=InStreamID,
        last_streamid=LastStreamID}) ->
    Messages = Transport:messages(),
    InactivityTimeout = maps:get(inactivity_timeout, Opts, 300000),
    receive
        %% Discard data coming in after the last request
        %% we want to process was received fully.
        {OK, Socket, _} when OK =:= element(1, Messages), InStreamID > LastStreamID ->
            loop(State);
        %% Socket messages.
        {OK, Socket, Data} when OK =:= element(1, Messages) ->
            parse(<< Buffer/binary, Data/binary >>, State);
        {Closed, Socket} when Closed =:= element(2, Messages) ->
            terminate(State, {socket_error, closed, 'The socket has been closed.'});
        {Error, Socket, Reason} when Error =:= element(3, Messages) ->
            terminate(State, {socket_error, Reason, 'An error has occurred on the socket.'});
        {Passive, Socket} when Passive =:= element(4, Messages);
                %% Hardcoded for compatibility with Ranch 1.x.
                Passive =:= tcp_passive; Passive =:= ssl_passive ->
            setopts_active(State),
            loop(State);
        %% Timeouts.

Once the socket is closed, Cowboy ends up killing the children processes by sending them an exit signal:

-spec terminate(children()) -> ok.
terminate(Children) ->
    %% For each child, either ask for it to shut down,
    %% or cancel its shutdown timer if it already is.
    %%
    %% We do not need to flush stray timeout messages out because
    %% we are either terminating or switching protocols,
    %% and in the latter case we flush all messages.
    _ = [case TRef of
        undefined -> exit(Pid, shutdown);
        _ -> erlang:cancel_timer(TRef, [{async, true}, {info, false}])
    end || #child{pid=Pid, timer=TRef} <- Children],
    before_terminate_loop(Children).

Because the children do not trap exits, they die without any log output and without any chance to clean up.
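The effect of exit(Pid, shutdown) can be reproduced in plain Erlang, without Cowboy. The sketch below is only an illustration: the module name shutdown_demo and the functions run/0, probe/1 and worker/2 are hypothetical and do not come from Cowboy. It starts one worker that does not trap exits and one that does, sends each the same shutdown signal while it is still working, and records whether the worker got a chance to react.

```erlang
-module(shutdown_demo).
-export([run/0]).

%% Compare a worker that does not trap exits with one that does.
run() ->
    #{plain => probe(false), trapping => probe(true)}.

%% Start one worker, shut it down while it is still "working", and
%% report {ExitReason, WhetherItGotToCleanUp}.
probe(TrapExit) ->
    Parent = self(),
    {Pid, MRef} = spawn_monitor(fun() -> worker(Parent, TrapExit) end),
    %% Wait until the worker has set its flag and started working.
    receive {Pid, started} -> ok end,
    %% The same kind of signal the excerpt above sends to each child.
    exit(Pid, shutdown),
    receive
        {'DOWN', MRef, process, Pid, Reason} ->
            %% A cleanup notice, if any, was sent before the worker died,
            %% so it is already in the mailbox at this point.
            Cleaned = receive {Pid, cleaned_up} -> true after 0 -> false end,
            {Reason, Cleaned}
    end.

worker(Parent, TrapExit) ->
    process_flag(trap_exit, TrapExit),
    Parent ! {self(), started},
    receive
        %% Only reachable when trap_exit is true: the exit signal
        %% arrives as an ordinary message instead of killing the process.
        {'EXIT', _From, shutdown} ->
            Parent ! {self(), cleaned_up},
            exit(shutdown)
    after 3000 ->
        ok
    end.
```

Calling shutdown_demo:run() returns #{plain => {shutdown,false}, trapping => {shutdown,true}}: the plain worker disappears silently, which matches what happens to the handler above, while the trapping worker receives an 'EXIT' message and can release whatever it holds before exiting.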

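Summary

Because Cowboy kills the handler process outright when the peer disconnects, bugs are easy to introduce. Setting proxy_ignore_client_abort on in nginx keeps the client disconnect from propagating to the backend, which avoids the problem.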
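As a rough sketch of where that directive goes, assuming nginx proxies to the Cowboy listener on port 8080 used in the curl examples (the listen port and the location are assumptions, not taken from a real deployment):

```nginx
# Hypothetical reverse-proxy block; listen port and location are assumptions.
server {
    listen 80;

    location / {
        # Cowboy listener from the example above.
        proxy_pass http://127.0.0.1:8080;
        # Do not close the upstream connection when the client aborts,
        # so the handler is allowed to run to completion.
        proxy_ignore_client_abort on;
    }
}
```

The trade-off is that the backend keeps working on requests whose client has already gone away.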
