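Overview
Cowboy is a web server framework for Erlang. When a client disconnects early (nginx logs this as HTTP code 499), Cowboy kills the handler process outright. This can easily cause bugs.
Example code
Based on https://ninenines.eu/docs/en/…
Consider the following handler: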
-module(hello_handler).
-behavior(cowboy_handler).

-export([init/2]).

init(Req0, State) ->
    erlang:display("before_sleep"),
    %% Simulate slow work so the client can disconnect mid-request.
    timer:sleep(3000),
    erlang:display("after_sleep"),
    %% reply/4 returns an updated request, so bind it to a new variable.
    Req = cowboy_req:reply(
        200,
        #{<<"content-type">> => <<"text/plain">>},
        <<"Hello Erlang!">>,
        Req0
    ),
    {ok, Req, State}.
Running
curl http://localhost:8080
produces this output in the Erlang shell:
([email protected])1> "before_sleep"
"after_sleep"
If the client gives up early instead:
curl http://localhost:8080 --max-time 0.001
curl: (28) Resolving timed out after 4 milliseconds
only the first line is printed:
([email protected])1> "before_sleep"
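This shows that the handler process was forcibly cut off in the middle of its execution. If the code touches resources outside the process, for example by taking a lock, the lock will obviously never be released.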
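The same effect can be reproduced without Cowboy. The following is a minimal sketch, not Cowboy's code: the module name exit_demo and the function run/0 are made up for illustration. A worker that does not trap exits receives exit(Pid, shutdown) while it is sleeping, and it never reaches the code after the sleep.

```erlang
-module(exit_demo).
-export([run/0]).

%% Hypothetical demo: the worker mimics the handler above
%% (report, sleep, report), and the caller plays the role of
%% the connection process that shuts it down.
run() ->
    Parent = self(),
    Pid = spawn(fun() ->
        Parent ! {self(), before_sleep},
        timer:sleep(3000),
        %% Never reached when the exit signal arrives during the sleep.
        Parent ! {self(), after_sleep}
    end),
    MRef = erlang:monitor(process, Pid),
    %% Wait until the worker has started its work.
    receive {Pid, before_sleep} -> ok end,
    %% The worker does not trap exits, so this terminates it immediately.
    exit(Pid, shutdown),
    receive
        {'DOWN', MRef, process, Pid, Reason} -> {killed, Reason};
        {Pid, after_sleep} -> completed
    end.
```

Calling exit_demo:run() returns {killed, shutdown} rather than completed. Note that a try ... after block inside the worker would not help either, because an exit signal sent to a process that does not trap exits terminates it without running any cleanup code.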
Cause
See loop in cowboy_http.erl:
loop(State=#state{parent=Parent, socket=Socket, transport=Transport, opts=Opts,
        buffer=Buffer, timer=TimerRef, children=Children, in_streamid=InStreamID,
        last_streamid=LastStreamID}) ->
    Messages = Transport:messages(),
    InactivityTimeout = maps:get(inactivity_timeout, Opts, 300000),
    receive
        %% Discard data coming in after the last request
        %% we want to process was received fully.
        {OK, Socket, _} when OK =:= element(1, Messages), InStreamID > LastStreamID ->
            loop(State);
        %% Socket messages.
        {OK, Socket, Data} when OK =:= element(1, Messages) ->
            parse(<< Buffer/binary, Data/binary >>, State);
        {Closed, Socket} when Closed =:= element(2, Messages) ->
            terminate(State, {socket_error, closed, 'The socket has been closed.'});
        {Error, Socket, Reason} when Error =:= element(3, Messages) ->
            terminate(State, {socket_error, Reason, 'An error has occurred on the socket.'});
        {Passive, Socket} when Passive =:= element(4, Messages);
                %% Hardcoded for compatibility with Ranch 1.x.
                Passive =:= tcp_passive; Passive =:= ssl_passive ->
            setopts_active(State),
            loop(State);
        %% Timeouts.
When the socket is closed, the termination path eventually kills the child processes by sending them exit signals:
-spec terminate(children()) -> ok.
terminate(Children) ->
    %% For each child, either ask for it to shut down,
    %% or cancel its shutdown timer if it already is.
    %%
    %% We do not need to flush stray timeout messages out because
    %% we are either terminating or switching protocols,
    %% and in the latter case we flush all messages.
    _ = [case TRef of
        undefined -> exit(Pid, shutdown);
        _ -> erlang:cancel_timer(TRef, [{async, true}, {info, false}])
    end || #child{pid=Pid, timer=TRef} <- Children],
    before_terminate_loop(Children).
Because the child processes do not trap exits, they terminate without any log output and without any chance to handle the shutdown.
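One general Erlang pattern that tolerates this kind of abrupt exit is to let a separate process own the lock and monitor whoever holds it. The sketch below is an assumption about how that could look, not something Cowboy provides: the module lock_owner and its functions start/0, acquire/1 and holder/1 are hypothetical names. When the holder dies for any reason, including exit(Pid, shutdown), the owner receives a 'DOWN' message and frees the lock.

```erlang
-module(lock_owner).
-export([start/0, acquire/1, holder/1]).

%% Hypothetical lock server: a single lock, released automatically
%% when the process holding it terminates.
start() ->
    spawn(fun() -> loop(undefined) end).

%% Try to take the lock; returns ok or busy.
acquire(Server) ->
    call(Server, acquire).

%% Return the pid currently holding the lock, or undefined.
holder(Server) ->
    call(Server, holder).

call(Server, Request) ->
    Ref = make_ref(),
    Server ! {Request, self(), Ref},
    receive {Ref, Reply} -> Reply end.

%% Lock is free.
loop(undefined) ->
    receive
        {acquire, Pid, Ref} ->
            %% Monitor the new holder so its death releases the lock.
            MRef = erlang:monitor(process, Pid),
            Pid ! {Ref, ok},
            loop({Pid, MRef});
        {holder, Pid, Ref} ->
            Pid ! {Ref, undefined},
            loop(undefined)
    end;
%% Lock is held by HolderPid.
loop({HolderPid, MRef} = Held) ->
    receive
        {acquire, Pid, Ref} ->
            Pid ! {Ref, busy},
            loop(Held);
        {holder, Pid, Ref} ->
            Pid ! {Ref, HolderPid},
            loop(Held);
        {'DOWN', MRef, process, HolderPid, _Reason} ->
            %% The holder is gone, whatever the reason: free the lock.
            loop(undefined)
    end.
```

With this design a handler that calls lock_owner:acquire(Server) and is then killed mid-request does not leave the lock stuck, because the release does not depend on any code running inside the killed process.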
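Summary
Because Cowboy kills the handler process outright when the peer disconnects, bugs are easy to introduce. One way to avoid the problem is to set nginx's proxy_ignore_client_abort on, so that client disconnects are not propagated to the backend.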
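As a rough illustration of that setting, the fragment below assumes nginx is proxying to a Cowboy listener on 127.0.0.1:8080 (the port used in the curl examples); the location path and upstream address are placeholders to adapt.

```nginx
location / {
    # Assumed upstream: the Cowboy listener from the example above.
    proxy_pass http://127.0.0.1:8080;
    # Keep the upstream connection open even if the client
    # closes its connection before the response is ready.
    proxy_ignore_client_abort on;
}
```

With this in place nginx waits for the upstream response even though nobody is left to receive it, so the handler runs to completion at the cost of doing work for a client that has already gone.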