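Author: Li Xichao (李锡超)
A cheerful database engineer at Jiangsu Suning Bank, mainly responsible for routine database operations, automation, and DMP platform operations. Skilled in MySQL, Python, and Oracle; enjoys cycling and digging into technology.
Source: original contribution
* Produced by the ActionSky open source community (爱可生开源社区). Original content may not be used without authorization; to reprint, please contact the editor and credit the source.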
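1. Problem description:
On September 15, an alert reported that the MySQL MGR cluster of a system in the test environment was in an abnormal state.
The log contents are as follows: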
// The MySQL error log on node 10.x.y.97 reports that node 10.x.y.95 is unreachable:
2022-09-15T19:26:14.320181+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address 10.x.y.95:3306 has become unreachable.'
// The mgraliver (MGR probe) log then reports that a switchover took place:
"2022-09-15 19:26:16" : Exception: Invalid primary_member_ip: or secondary_node_list:""10.x.y.96", "10.x.y.97"" .
"2022-09-15 19:26:16" : Exception: MGR is likely to be switching, Sleep 1 sec and continue .
"2022-09-15 19:27:01" : Exception: MGR running with WARN. ONLINE nodes Pri:10.x.y.96 Sec:10.x.y.97 diff from conf_node:10.x.y.95,10.x.y.96,10.x.y.97 .
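In other words, the alert indicated that the application system's MySQL MGR had switched over: the three-node MGR cluster that existed before the failure had become a two-node cluster consisting of Pri:10.x.y.96 and Sec:10.x.y.97.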
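2. Preliminary analysis
After the alert was received, analysis of the MySQL error log, operating system logs, monitoring logs, and other information showed that the operating system time was abnormal.
The screenshot of the anomaly is not reproduced here; it showed a [Time has been changed] event.
The official MySQL documentation states explicitly that failure detection is time-based:
1) If a member does not receive a message from another member within 5 seconds, it suspects that member of having failed and lists that member's state as UNREACHABLE in its own Performance Schema table replication_group_members.
2) If the suspicion lasts for more than 10 seconds, the suspecting member tries to propagate its view that the suspected member is faulty to the other members of the group.
For details, see: https://dev.mysql.com/doc/ref…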
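The two documented rules can be read as a small timeline. The sketch below only illustrates that reading and is not Group Replication's implementation: `classify_member`, the enum, and both constants are hypothetical names, and it assumes that the 10 seconds are counted from the moment the suspicion is raised.

```c
/* Hypothetical constants taken from the two documented rules. */
#define SUSPECT_AFTER_SEC 5.0    /* silence before a member is suspected */
#define PROPAGATE_AFTER_SEC 10.0 /* suspicion age before it is propagated */

typedef enum {
  MEMBER_OK,          /* a message arrived recently */
  MEMBER_UNREACHABLE, /* suspected locally, listed as UNREACHABLE */
  MEMBER_PROPAGATE    /* suspicion is old enough to tell the other members */
} member_verdict;

/* silent_for: seconds since the last message from the peer. */
static member_verdict classify_member(double silent_for) {
  if (silent_for < SUSPECT_AFTER_SEC) return MEMBER_OK;
  /* Assumption: the suspicion age starts when the suspicion is raised. */
  double suspicion_age = silent_for - SUSPECT_AFTER_SEC;
  if (suspicion_age > PROPAGATE_AFTER_SEC) return MEMBER_PROPAGATE;
  return MEMBER_UNREACHABLE;
}
```

For example, `classify_member(6.0)` yields `MEMBER_UNREACHABLE`, while a peer that stays silent long enough moves on to `MEMBER_PROPAGATE`.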
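The initial suspicion was therefore that a jump in the operating system time had triggered the MySQL MGR failover.
The anomaly was reported to colleagues in the system management department for further confirmation.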
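3. Root cause
After in-depth analysis, the system experts confirmed that the root cause was a fault in the underlying server infrastructure during the problem period.
At around 19:27:13 this caused the virtual machine to hang (while hung, nothing on the virtual machine could run, including monitoring scripts, the clock, and MGR heartbeats).
Once the virtual machine hung, the other two nodes could no longer communicate with the faulty node, so they concluded that it had failed and expelled it, which ultimately produced the alerts shown above.
With that, the cause of the problem was finally confirmed.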
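4. The effect of time on MGR
The problem was ultimately confirmed to be the primary node being expelled because the virtual machine hung. But what if the virtual machine had not hung and only its time had jumped: would that still trigger a failover?
To find out, a test was run in the test environment, with the following conclusion:
When the time on one node of the MGR cluster changes (for example, it suddenly jumps 1 hour forward or 1 hour back), the replication state of the MGR cluster is not affected, and no obvious errors appear in the error_log.
That is, an abnormal time on an MGR node does not trigger an MGR failover.
So what mechanism is behind this? The source code was consulted to confirm.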
5. As we know, node-to-node probing in the source code mainly involves the following functions (alive_task/detector_task):
Overall call relationships:
alive_task
    task_now
    may_be_dead
    task_now
detector_task
    check_global_node_set
        DETECT
            task_now
    check_local_node_set
        DETECT
            task_now
// alive_task:
int alive_task(task_arg arg MY_ATTRIBUTE((unused))) {
  while (!xcom_shutdown) {
    ..
    // More than 0.5 seconds since this node was last active: broadcast "I am alive"
    if (server_active(site, get_nodeno(site)) < task_now() - 0.5) {
      replace_pax_msg(&ep->i_p, pax_msg_new(alive_synode, site));
      ep->i_p->op = i_am_alive_op;
      send_to_all_site(site, ep->i_p, "alive_task");
    }
    ...
    {
      double sec = task_now();
      // No heartbeat from node i for more than 4 seconds: ask "are you alive?"
      if (i != get_nodeno(site) && may_be_dead(site->detected, i, sec)) {
        replace_pax_msg(&ep->you_p, pax_msg_new(alive_synode, site));
        ep->you_p->op = are_you_alive_op;
        ep->you_p->a = new_app_data();
        ep->you_p->a->app_key.group_id = ep->you_p->a->group_id =
            get_group_id(site);
        ep->you_p->a->body.c_t = xcom_boot_type;
        init_node_list(1, &site->nodes.node_list_val[i],
                       &ep->you_p->a->body.app_u_u.nodes);
        send_server_msg(site, i, ep->you_p);
      }
    }
    TASK_DELAY(1.0);
  }
}
// DETECT
#define DETECTOR_LIVE_TIMEOUT 5.0
// Check whether the heartbeat has timed out
#define DETECT(site, i)        \
  (i == get_nodeno(site)) ||   \
      (site->detected[i] + DETECTOR_LIVE_TIMEOUT > task_now())
static void check_global_node_set(site_def *site, int *notify) {
  u_int i;
  u_int nodes = get_maxnodes(site);
  site->global_node_count = 0;
  for (i = 0; i < nodes && i < site->global_node_set.node_set_len; i++) {
    int detect = DETECT(site, i);
    if (site->global_node_set.node_set_val[i]) site->global_node_count++;
    // If the heartbeat state captured this time differs from the recorded one,
    // a node's heartbeat has changed: no heartbeat -> heartbeat, or heartbeat -> no heartbeat
    if (site->global_node_set.node_set_val[i] != detect) {
      // The global state change needs to be announced
      *notify = 1;
    }
    DBGOHK(FN; NDBG(i, u); NDBG(*notify, d));
  }
}
static void check_local_node_set(site_def *site, int *notify) {
  u_int i;
  u_int nodes = get_maxnodes(site);
  for (i = 0; i < nodes && i < site->global_node_set.node_set_len; i++) {
    int detect = DETECT(site, i);
    // If the heartbeat state captured this time differs from the recorded one,
    // a node's heartbeat has changed: no heartbeat -> heartbeat, or heartbeat -> no heartbeat
    if (site->local_node_set.node_set_val[i] != detect) {
      site->local_node_set.node_set_val[i] = detect;
      // A suspect node has been identified locally and must be reported
      *notify = 1;
    }
    DBGOHK(FN; NDBG(i, u); NDBG(*notify, d));
  }
}
// detector_task
int detector_task(task_arg arg [[maybe_unused]]) {
  while (!xcom_shutdown) {
    {
      site_def *x_site = get_executor_site_rw();
      if (x_site && get_nodeno(x_site) != VOID_NODE_NO) {
        if (x_site != last_x_site) {
          reset_disjunct_servers(last_x_site, x_site);
        }
        update_detected(x_site);
        if (x_site != last_x_site) {
          last_x_site = x_site;
          ep->notify = 1;
          ep->local_notify = 1;
        }
        // Check whether any node's heartbeat has timed out and a global view change must be announced
        check_global_node_set(x_site, &ep->notify);
        update_global_count(x_site);  // Update the global node count
        /* Send xcom message if node has changed state */
        if (ep->notify && iamtheleader(x_site) && enough_live_nodes(x_site)) {
          ep->notify = 0;
          // If a node's heartbeat is abnormal, the current node is the leader (node 0),
          // and a majority of nodes are alive, send a view-change notification: expel the
          // node with the abnormal heartbeat and handle nodes joining or leaving
          send_my_view(x_site);
        }
      }
      if (x_site && get_nodeno(x_site) != VOID_NODE_NO) {
        update_global_count(x_site);  // Update the global node count
        // Check whether any node's heartbeat has timed out and a suspicion must be raised
        check_local_node_set(x_site, &ep->local_notify);
        if (ep->local_notify) {
          ep->local_notify = 0;
          deliver_view_msg(x_site); /* To application */  // Expel the node with the abnormal heartbeat
        }
      }
    }
    TIMED_TASK_WAIT(&detector_wait, 1.0);
  }
}
Key point
From the code above, both alive_task and detector_task obtain the current time through task_now() and compare it against the corresponding thresholds.
The logic of task_now() is analyzed next:
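To make the threshold checks concrete, here is a minimal sketch that restates the two comparisons as pure functions over timestamps. The names `detect_alive` and `looks_dead` are hypothetical; the 5.0 value mirrors DETECTOR_LIVE_TIMEOUT from the excerpt, and the 4.0 value follows the "more than 4 seconds" comment in alive_task. The special case for the local node in DETECT is left out, and all timestamps are assumed to come from the same clock.

```c
#include <stdbool.h>

#define LIVE_TIMEOUT_SEC 5.0 /* mirrors DETECTOR_LIVE_TIMEOUT in the excerpt */
#define SILENT_SEC 4.0       /* per the "more than 4 seconds" comment in alive_task */

/* Same shape as DETECT: the peer counts as alive while last_seen + 5 s > now. */
static bool detect_alive(double last_seen, double now) {
  return last_seen + LIVE_TIMEOUT_SEC > now;
}

/* Assumed shape of may_be_dead: no heartbeat for more than 4 s. */
static bool looks_dead(double last_seen, double now) {
  return last_seen + SILENT_SEC < now;
}
```

Only the difference between `now` and `last_seen` matters here, so shifting both values by the same amount leaves the result unchanged. What matters is therefore how the clock behind `now` behaves over time.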
task_now
    xcom_init_clock
        xcom_monotonic_seconds
seconds
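static void xcom_init_clock(xcom_clock *clock) {
  // Call Linux clock_gettime to get the time elapsed since system boot, in seconds
  // (fractions of a second as decimals). This time is not affected by system time
  // changes and cannot be changed by the user.
  clock->monotonic_start = get_monotonic_time();
  // Call Linux clock_gettime to get the system time (as shown by date), in seconds
  // (fractions of a second as decimals). This time changes whenever the system time changes.
  clock->real_start = get_real_time();
  // Compute the difference (offset) between the system time and the boot-based time
  clock->offset = clock->real_start - clock->monotonic_start;
  // Compute the time as offset + boot-based time.
  xcom_monotonic_seconds(clock);
  // Set the initialization flag to 1. From then on, as long as MGR runs normally, the
  // time an MGR node obtains = the offset captured here + the boot-based time.
  // Therefore, no matter how the system time changes, MGR always obtains the "correct" time.
  // See the code below for the detailed logic:
  clock->done = 1;
}
static double xcom_monotonic_seconds(xcom_clock *clock) {
  // Offset captured at initialization + boot-based time.
  clock->now = get_monotonic_time() + clock->offset;
  return clock->now;
}
// Function used by subsequent operations to obtain the time:
double seconds() {
  // After the first initialization, !task_timer.done is always false
  if (!task_timer.done) {
    xcom_init_clock(&task_timer);
  }
  // So after initialization, seconds() returns xcom_monotonic_seconds(&task_timer)
  return xcom_monotonic_seconds(&task_timer);
}
double task_now() {
  // After the first initialization, !task_timer.done is always false.
  if (!task_timer.done) {
    xcom_init_clock(&task_timer);
  }
  // Directly return the previously computed task_timer.now
  return task_timer.now;
}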
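The offset technique can be reproduced in isolation. The following is a simplified sketch, not the xcom source: `sim_clock`, `sim_init`, and `sim_now` are hypothetical names, and the monotonic and wall-clock readings are passed in as plain numbers so that a wall-clock jump can be simulated without touching the system time.

```c
/* Simplified model of xcom_clock: the offset is fixed once at init time. */
typedef struct {
  double offset; /* wall clock minus monotonic clock, captured at init */
  int done;      /* set to 1 after initialisation */
} sim_clock;

/* Capture the offset once, from a pair of readings taken at startup. */
static void sim_init(sim_clock *c, double monotonic, double wall) {
  c->offset = wall - monotonic;
  c->done = 1;
}

/* Later readings use only the monotonic clock plus the stored offset,
   so the current wall-clock value is never consulted again. */
static double sim_now(const sim_clock *c, double monotonic) {
  return monotonic + c->offset;
}
```

If the wall clock jumps by an hour two seconds after `sim_init`, `sim_now` still advances by exactly two seconds, because the jumped value is never read. This matches the test result in section 4.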
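As for task_timer.now, tracing shows that many code paths refresh it by calling seconds(). Some of the call sites are shown in the following stacks:
/*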
group_replication.so!seconds() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:310)
group_replication.so!task_loop() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:1210)
group_replication.so!xcom_taskmain2(xcom_port listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.cc:1536)
group_replication.so!Gcs_xcom_proxy_impl::xcom_init(Gcs_xcom_proxy_impl * const this, xcom_port xcom_listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:243)
group_replication.so!xcom_taskmain_startup(void * ptr) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:102)
pfs_spawn_thread(void * arg) (/source_code/mysql-8.0.27/storage/perfschema/pfs.cc:2946)
libpthread.so.0!start_thread (unknown source:0)
libc.so.6!clone (unknown source:0)
group_replication.so!seconds() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:310)
group_replication.so!alive_task(task_arg arg) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_detector.cc:452)
group_replication.so!task_loop() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:1158)
group_replication.so!xcom_taskmain2(xcom_port listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.cc:1536)
group_replication.so!Gcs_xcom_proxy_impl::xcom_init(Gcs_xcom_proxy_impl * const this, xcom_port xcom_listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:243)
group_replication.so!xcom_taskmain_startup(void * ptr) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:102)
pfs_spawn_thread(void * arg) (/source_code/mysql-8.0.27/storage/perfschema/pfs.cc:2946)
libpthread.so.0!start_thread (unknown source:0)
libc.so.6!clone (unknown source:0)
group_replication.so!seconds() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:310)
group_replication.so!incoming_connection_task(task_arg arg) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.cc:761)
group_replication.so!task_loop() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:1158)
group_replication.so!xcom_taskmain2(xcom_port listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.cc:1536)
group_replication.so!Gcs_xcom_proxy_impl::xcom_init(Gcs_xcom_proxy_impl * const this, xcom_port xcom_listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:243)
group_replication.so!xcom_taskmain_startup(void * ptr) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:102)
pfs_spawn_thread(void * arg) (/source_code/mysql-8.0.27/storage/perfschema/pfs.cc:2946)
libpthread.so.0!start_thread (unknown source:0)
libc.so.6!clone (unknown source:0)
group_replication.so!seconds() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:310)
group_replication.so!cache_manager_task(task_arg arg) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_cache.cc:641)
group_replication.so!task_loop() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:1158)
group_replication.so!xcom_taskmain2(xcom_port listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.cc:1536)
group_replication.so!Gcs_xcom_proxy_impl::xcom_init(Gcs_xcom_proxy_impl * const this, xcom_port xcom_listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:243)
group_replication.so!xcom_taskmain_startup(void * ptr) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:102)
pfs_spawn_thread(void * arg) (/source_code/mysql-8.0.27/storage/perfschema/pfs.cc:2946)
libpthread.so.0!start_thread (unknown source:0)
libc.so.6!clone (unknown source:0)
group_replication.so!seconds() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:310)
group_replication.so!get_xcom_message(pax_machine ** p, synode_no msgno, int n) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.cc:2909)
group_replication.so!executor_task(task_arg arg) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.cc:4382)
group_replication.so!task_loop() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:1158)
group_replication.so!xcom_taskmain2(xcom_port listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.cc:1536)
group_replication.so!Gcs_xcom_proxy_impl::xcom_init(Gcs_xcom_proxy_impl * const this, xcom_port xcom_listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:243)
group_replication.so!xcom_taskmain_startup(void * ptr) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:102)
pfs_spawn_thread(void * arg) (/source_code/mysql-8.0.27/storage/perfschema/pfs.cc:2946)
libpthread.so.0!start_thread (unknown source:0)
libc.so.6!clone (unknown source:0)
group_replication.so!seconds() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:310)
group_replication.so!tcp_reaper_task(task_arg arg) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.cc:1773)
group_replication.so!task_loop() (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.cc:1158)
group_replication.so!xcom_taskmain2(xcom_port listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.cc:1536)
group_replication.so!Gcs_xcom_proxy_impl::xcom_init(Gcs_xcom_proxy_impl * const this, xcom_port xcom_listen_port) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:243)
group_replication.so!xcom_taskmain_startup(void * ptr) (/source_code/mysql-8.0.27/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:102)
pfs_spawn_thread(void * arg) (/source_code/mysql-8.0.27/storage/perfschema/pfs.cc:2946)
libpthread.so.0!start_thread (unknown source:0)
libc.so.6!clone (unknown source:0)
*/
6. Debug verification record
/*
[Switching to Thread 0x7fff4e7fc700 (LWP 21685)]
1: {*site->servers[0], *site->servers[1], *site->servers[2], sec} = <error: array elements must all be the same size>
-exec print {*site->servers[0], *site->servers[1], *site->servers[2], sec}
array elements must all be the same size
[New Thread 0x7fff4dffb700 (LWP 21717)]
[Thread 0x7fff4dffb700 (LWP 21717) exited]
[Switching to Thread 0x7ffff7ee7a00 (LWP 21476)]
[Switching to Thread 0x7fff4e7fc700 (LWP 21685)]
1: {*site->servers[0], *site->servers[1], *site->servers[2], sec} = <error: array elements must all be the same size>
-exec display {*site->servers[0], *site->servers[1], *site->servers[2]}
2: {*site->servers[0], *site->servers[1], *site->servers[2]} = {{garbage = 0, refcnt = 4, srv = 0x7fff42558400 "10.211.55.14", port = 33302, con = 0x7fff42556f40, active = 1664038872.1307986, detected = 1664038872.7222638, outgoing = {data = {type = 0, suc = 0x7fff42558640, pred = 0x7fff42558640}, queue = {type = 0, suc = 0x7fff425686a0, pred = 0x7fff425686a0}}, sender = 0x7fff425686a0, reply_handler = 0x7fff4256c5c0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\224\000\002\232\000\000\000\000\000\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000o\272\000\000\000\001", '\000' <repeats 11 times>, "\002\000\000\000\000\000\000\000\002\000\000\000\027~\020\363\060\000\000\000\000\000\000o\272\000\000\000\002\000\000\000\000\000\000\000\001\000\000\000\004\000\000\000\004", '\000' <repeats 35 times>, "\003~\020\363\060\000\000\000\000\000\000o\272\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\001\v\003\000\003\000\024\000\v\001\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\353\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 6, last_ping_received = 1664038820.1029122}, {garbage = 0, refcnt = 4, srv = 0x7fff425704e0 "10.211.55.14", port = 33303, con = 0x7fff42556f70, active = 1664038872.1307986, detected = 1664038872.7222638, outgoing = {data = {type = 0, suc = 0x7fff42570720, pred = 0x7fff42570720}, queue = {type = 0, suc = 0x7fff42580780, pred = 0x7fff42580780}}, sender = 0x7fff42580780, reply_handler = 0x7fff425846a0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\224\000\002\232\000\000\000\000\001\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000o\272\000\000\000\001", '\000' <repeats 11 times>, "\002\000\000\000\000\000\000\000\002\000\000\000\027~\020\363\060\000\000\000\000\000\000o\272\000\000\000\002\000\000\000\000\000\000\000\001\000\000\000\004\000\000\000\004", '\000' <repeats 35 times>, 
"\002~\020\363\060\000\000\000\000\000\000o\272\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\001\v\003\000\003\000\024\000\v\001\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\353\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 6, last_ping_received = 1664038820.1029122}, {garbage = 0, refcnt = 2, srv = 0x7fff42589a40 "10.211.55.14", port = 33301, con = 0x7fff42599ce0, active = 0, detected = 1664038872.1307986, outgoing = {data = {type = 0, suc = 0x7fff42589c80, pred = 0x7fff42589c80}, queue = {type = 0, suc = 0x7fff42599d10, pred = 0x7fff42599d10}}, sender = 0x7fff42599d10, reply_handler = 0x0, out_buf = {start = 0, n = 0, buf = '\000' <repeats 65535 times>}, invalid = 0, number_of_pings_received = 0, last_ping_received = 0}}
-exec display sec
3: sec = 1664038873.0992022
[New Thread 0x7fff4dffb700 (LWP 21742)]
[Thread 0x7fff4dffb700 (LWP 21742) exited]
Unable to perform this action because the process is running.
Unable to perform this action because the process is running.
Unable to perform this action because the process is running.
[Switching to Thread 0x7ffff7ee7a00 (LWP 21476)]
[Switching to Thread 0x7fff4e7fc700 (LWP 21685)]
3: sec = 1664038930.2414346
2: {*site->servers[0], *site->servers[1], *site->servers[2]} = {{garbage = 0, refcnt = 4, srv = 0x7fff42558400 "10.211.55.14", port = 33302, con = 0x7fff42556f40, active = 1664038929.2778521, detected = 1664038929.7121894, outgoing = {data = {type = 0, suc = 0x7fff42558640, pred = 0x7fff42558640}, queue = {type = 0, suc = 0x7fff425686a0, pred = 0x7fff425686a0}}, sender = 0x7fff425686a0, reply_handler = 0x7fff4256c5c0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\224\000\002\232\000\000\000\000\000\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000o\377\000\000\000\001", '\000' <repeats 11 times>, "\002\000\000\000\000\000\000\000\002\000\000\000\027~\020\363\060\000\000\000\000\000\000o\377\000\000\000\002\000\000\000\000\000\000\000\001\000\000\000\004\000\000\000\004", '\000' <repeats 35 times>, "\003~\020\363\060\000\000\000\000\000\000o\377\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\000\342\003\000\003\000\024\000\342", '\000' <repeats 11 times>, "\002\000\000\000\000\000\302\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 13, last_ping_received = 1664038889.2793722}, {garbage = 0, refcnt = 4, srv = 0x7fff425704e0 "10.211.55.14", port = 33303, con = 0x7fff42556f70, active = 1664038929.2778521, detected = 1664038929.712055, outgoing = {data = {type = 0, suc = 0x7fff42570720, pred = 0x7fff42570720}, queue = {type = 0, suc = 0x7fff42580780, pred = 0x7fff42580780}}, sender = 0x7fff42580780, reply_handler = 0x7fff425846a0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\224\000\002\232\000\000\000\000\001\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000o\377\000\000\000\001", '\000' <repeats 11 times>, "\002\000\000\000\000\000\000\000\002\000\000\000\027~\020\363\060\000\000\000\000\000\000o\377\000\000\000\002\000\000\000\000\000\000\000\001\000\000\000\004\000\000\000\004", '\000' <repeats 35 times>, 
"\002~\020\363\060\000\000\000\000\000\000o\377\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\000\342\003\000\003\000\024\000\342", '\000' <repeats 11 times>, "\002\000\000\000\000\000\302\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 13, last_ping_received = 1664038889.2793722}, {garbage = 0, refcnt = 2, srv = 0x7fff42589a40 "10.211.55.14", port = 33301, con = 0x7fff42599ce0, active = 0, detected = 1664038929.2778521, outgoing = {data = {type = 0, suc = 0x7fff42589c80, pred = 0x7fff42589c80}, queue = {type = 0, suc = 0x7fff42599d10, pred = 0x7fff42599d10}}, sender = 0x7fff42599d10, reply_handler = 0x0, out_buf = {start = 0, n = 0, buf = '\000' <repeats 65535 times>}, invalid = 0, number_of_pings_received = 0, last_ping_received = 0}}
1: {*site->servers[0], *site->servers[1], *site->servers[2], sec} = <error: array elements must all be the same size>
[New Thread 0x7fff4dffb700 (LWP 21749)]
[Thread 0x7fff4dffb700 (LWP 21749) exited]
Unable to perform this action because the process is running.
Unable to perform this action because the process is running.
[Switching to Thread 0x7ffff7ee7a00 (LWP 21476)]
[Switching to Thread 0x7fff4e7fc700 (LWP 21685)]
3: sec = 1664038967.5334535
2: {*site->servers[0], *site->servers[1], *site->servers[2]} = {{garbage = 0, refcnt = 4, srv = 0x7fff42558400 "10.211.55.14", port = 33302, con = 0x7fff42556f40, active = 1664038966.5543196, detected = 1664038967.5255749, outgoing = {data = {type = 0, suc = 0x7fff42558640, pred = 0x7fff42558640}, queue = {type = 0, suc = 0x7fff425686a0, pred = 0x7fff425686a0}}, sender = 0x7fff425686a0, reply_handler = 0x7fff4256c5c0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\224\000\002\232\000\000\000\000\000\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p,\000\000\000\001", '\000' <repeats 11 times>, "\002\000\000\000\000\000\000\000\002\000\000\000\027~\020\363\060\000\000\000\000\000\000p,\000\000\000\002\000\000\000\000\000\000\000\001\000\000\000\004\000\000\000\004", '\000' <repeats 35 times>, "\003~\020\363\060\000\000\000\000\000\000p,\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\000\342\003\000\003\000\024\000\342", '\000' <repeats 11 times>, "\002\000\000\000\000\000\302\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 6, last_ping_received = 1664038939.5367978}, {garbage = 0, refcnt = 4, srv = 0x7fff425704e0 "10.211.55.14", port = 33303, con = 0x7fff42556f70, active = 1664038966.5543196, detected = 1664038967.5255749, outgoing = {data = {type = 0, suc = 0x7fff42570720, pred = 0x7fff42570720}, queue = {type = 0, suc = 0x7fff42580780, pred = 0x7fff42580780}}, sender = 0x7fff42580780, reply_handler = 0x7fff425846a0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\224\000\002\232\000\000\000\000\001\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p,\000\000\000\001", '\000' <repeats 11 times>, "\002\000\000\000\000\000\000\000\002\000\000\000\027~\020\363\060\000\000\000\000\000\000p,\000\000\000\002\000\000\000\000\000\000\000\001\000\000\000\004\000\000\000\004", '\000' <repeats 35 times>, 
"\002~\020\363\060\000\000\000\000\000\000p,\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\000\342\003\000\003\000\024\000\342", '\000' <repeats 11 times>, "\002\000\000\000\000\000\302\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 6, last_ping_received = 1664038939.5367978}, {garbage = 0, refcnt = 2, srv = 0x7fff42589a40 "10.211.55.14", port = 33301, con = 0x7fff42599ce0, active = 0, detected = 1664038966.5543196, outgoing = {data = {type = 0, suc = 0x7fff42589c80, pred = 0x7fff42589c80}, queue = {type = 0, suc = 0x7fff42599d10, pred = 0x7fff42599d10}}, sender = 0x7fff42599d10, reply_handler = 0x0, out_buf = {start = 0, n = 0, buf = '\000' <repeats 65535 times>}, invalid = 0, number_of_pings_received = 0, last_ping_received = 0}}
1: {*site->servers[0], *site->servers[1], *site->servers[2], sec} = <error: array elements must all be the same size>
Unable to perform this action because the process is running.
Unable to perform this action because the process is running.
[Switching to Thread 0x7ffff7ee7a00 (LWP 21476)]
[Switching to Thread 0x7fff4e7fc700 (LWP 21685)]
3: sec = 1664039046.2713785
2: {*site->servers[0], *site->servers[1], *site->servers[2]} = {{garbage = 0, refcnt = 4, srv = 0x7fff42558400 "10.211.55.14", port = 33302, con = 0x7fff42556f40, active = 1664039045.2710295, detected = 1664039045.6940327, outgoing = {data = {type = 0, suc = 0x7fff42558640, pred = 0x7fff42558640}, queue = {type = 0, suc = 0x7fff425686a0, pred = 0x7fff425686a0}}, sender = 0x7fff425686a0, reply_handler = 0x7fff4256c5c0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\200\000\002\232\000\000\000\000\000\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p\212\000\000\000\002", '\000' <repeats 11 times>, "\002\377\377\377\377\000\000\000\002\000\000\000\017~\020\363\060\000\000\000\000\000\000p\213", '\000' <repeats 35 times>, "\004~\020\363\060\000\000\000\000\000\000p\213\000\000\000\000\000\000\000\n\000\000\000\000\000\000\000\000\000\000p\212\000\000\000\001\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\000\342\003\000\003\000\024\000\342", '\000' <repeats 11 times>, "\002\000\000\000\000\000\302\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 6, last_ping_received = 1664038939.5367978}, {garbage = 0, refcnt = 4, srv = 0x7fff425704e0 "10.211.55.14", port = 33303, con = 0x7fff42556f70, active = 1664039045.2710295, detected = 1664039045.6939404, outgoing = {data = {type = 0, suc = 0x7fff42570720, pred = 0x7fff42570720}, queue = {type = 0, suc = 0x7fff42580780, pred = 0x7fff42580780}}, sender = 0x7fff42580780, reply_handler = 0x7fff425846a0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\200\000\002\232\000\000\000\000\001\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p\212\000\000\000\002", '\000' <repeats 11 times>, "\002\377\377\377\377\000\000\000\002\000\000\000\017~\020\363\060\000\000\000\000\000\000p\213", '\000' <repeats 35 times>, 
"\003~\020\363\060\000\000\000\000\000\000p\213\000\000\000\000\000\000\000\n\000\000\000\000\000\000\000\000\000\000p\212\000\000\000\001\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\000\342\003\000\003\000\024\000\342", '\000' <repeats 11 times>, "\002\000\000\000\000\000\302\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 6, last_ping_received = 1664038939.5367978}, {garbage = 0, refcnt = 2, srv = 0x7fff42589a40 "10.211.55.14", port = 33301, con = 0x7fff42599ce0, active = 0, detected = 1664039045.2710295, outgoing = {data = {type = 0, suc = 0x7fff42589c80, pred = 0x7fff42589c80}, queue = {type = 0, suc = 0x7fff42599d10, pred = 0x7fff42599d10}}, sender = 0x7fff42599d10, reply_handler = 0x0, out_buf = {start = 0, n = 0, buf = '\000' <repeats 65535 times>}, invalid = 0, number_of_pings_received = 0, last_ping_received = 0}}
1: {*site->servers[0], *site->servers[1], *site->servers[2], sec} = <error: array elements must all be the same size>
[New Thread 0x7fff4dffb700 (LWP 21767)]
[Thread 0x7fff4dffb700 (LWP 21767) exited]
Unable to perform this action because the process is running.
Unable to perform this action because the process is running.
Unable to perform this action because the process is running.
[Switching to Thread 0x7ffff7ee7a00 (LWP 21476)]
[Switching to Thread 0x7fff4e7fc700 (LWP 21685)]
3: sec = 1664039245.357549
2: {*site->servers[0], *site->servers[1], *site->servers[2]} = {{garbage = 0, refcnt = 4, srv = 0x7fff42558400 "10.211.55.14", port = 33302, con = 0x7fff42556f40, active = 1664039244.8811147, detected = 1664039245.1229894, outgoing = {data = {type = 0, suc = 0x7fff42558640, pred = 0x7fff42558640}, queue = {type = 0, suc = 0x7fff425686a0, pred = 0x7fff425686a0}}, sender = 0x7fff425686a0, reply_handler = 0x7fff4256c5c0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\224\000\002\232\000\000\000\000\000\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p\250\000\000\000\001", '\000' <repeats 11 times>, "\002\000\000\000\000\000\000\000\002\000\000\000\027~\020\363\060\000\000\000\000\000\000p\250\000\000\000\002\000\000\000\000\000\000\000\001\000\000\000\004\000\000\000\004", '\000' <repeats 35 times>, "\003~\020\363\060\000\000\000\000\000\000p\250\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\000\342\003\000\003\000\024\000\342", '\000' <repeats 11 times>, "\002\000\000\000\000\000\302\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 3, last_ping_received = 1664039052.3651485}, {garbage = 0, refcnt = 4, srv = 0x7fff425704e0 "10.211.55.14", port = 33303, con = 0x7fff42556f70, active = 1664039244.8811147, detected = 1664039245.1229894, outgoing = {data = {type = 0, suc = 0x7fff42570720, pred = 0x7fff42570720}, queue = {type = 0, suc = 0x7fff42580780, pred = 0x7fff42580780}}, sender = 0x7fff42580780, reply_handler = 0x7fff425846a0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\224\000\002\232\000\000\000\000\001\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p\250\000\000\000\001", '\000' <repeats 11 times>, "\002\000\000\000\000\000\000\000\002\000\000\000\027~\020\363\060\000\000\000\000\000\000p\250\000\000\000\002\000\000\000\000\000\000\000\001\000\000\000\004\000\000\000\004", '\000' <repeats 35 times>, 
"\002~\020\363\060\000\000\000\000\000\000p\250\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\000\342\003\000\003\000\024\000\342", '\000' <repeats 11 times>, "\002\000\000\000\000\000\302\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 3, last_ping_received = 1664039052.3651485}, {garbage = 0, refcnt = 2, srv = 0x7fff42589a40 "10.211.55.14", port = 33301, con = 0x7fff42599ce0, active = 0, detected = 1664039244.8811147, outgoing = {data = {type = 0, suc = 0x7fff42589c80, pred = 0x7fff42589c80}, queue = {type = 0, suc = 0x7fff42599d10, pred = 0x7fff42599d10}}, sender = 0x7fff42599d10, reply_handler = 0x0, out_buf = {start = 0, n = 0, buf = '\000' <repeats 65535 times>}, invalid = 0, number_of_pings_received = 0, last_ping_received = 0}}
1: {*site->servers[0], *site->servers[1], *site->servers[2], sec} = <error: array elements must all be the same size>
[Switching to Thread 0x7ffff7ee7a00 (LWP 21476)]
[Switching to Thread 0x7fff4e7fc700 (LWP 21685)]
3: sec = 1664039271.9992189
2: {*site->servers[0], *site->servers[1], *site->servers[2]} = {{garbage = 0, refcnt = 4, srv = 0x7fff42558400 "10.211.55.14", port = 33302, con = 0x7fff42556f40, active = 1664039270.9997928, detected = 1664039271.6567574, outgoing = {data = {type = 0, suc = 0x7fff42558640, pred = 0x7fff42558640}, queue = {type = 0, suc = 0x7fff425686a0, pred = 0x7fff425686a0}}, sender = 0x7fff425686a0, reply_handler = 0x7fff4256c5c0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\200\000\002\232\000\000\000\000\000\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p\300\000\000\000\001", '\000' <repeats 11 times>, "\002\377\377\377\377\000\000\000\002\000\000\000\017~\020\363\060\000\000\000\000\000\000p\300\000\000\000\002", '\000' <repeats 31 times>, "\004~\020\363\060\000\000\000\000\000\000p\300\000\000\000\002\000\000\000\n\000\000\000\000\000\000\000\000\000\000p\277\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\001\v\003\000\003\000\024\000\v\001\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\353\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 3, last_ping_received = 1664039052.3651485}, {garbage = 0, refcnt = 4, srv = 0x7fff425704e0 "10.211.55.14", port = 33303, con = 0x7fff42556f70, active = 1664039270.9997928, detected = 1664039271.6567192, outgoing = {data = {type = 0, suc = 0x7fff42570720, pred = 0x7fff42570720}, queue = {type = 0, suc = 0x7fff42580780, pred = 0x7fff42580780}}, sender = 0x7fff42580780, reply_handler = 0x7fff425846a0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\200\000\002\232\000\000\000\000\001\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p\300\000\000\000\001", '\000' <repeats 11 times>, "\002\377\377\377\377\000\000\000\002\000\000\000\017~\020\363\060\000\000\000\000\000\000p\300\000\000\000\002", '\000' <repeats 31 times>, 
"\003~\020\363\060\000\000\000\000\000\000p\300\000\000\000\002\000\000\000\n\000\000\000\000\000\000\000\000\000\000p\277\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\001\v\003\000\003\000\024\000\v\001\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\353\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 3, last_ping_received = 1664039052.3651485}, {garbage = 0, refcnt = 2, srv = 0x7fff42589a40 "10.211.55.14", port = 33301, con = 0x7fff42599ce0, active = 0, detected = 1664039270.9997928, outgoing = {data = {type = 0, suc = 0x7fff42589c80, pred = 0x7fff42589c80}, queue = {type = 0, suc = 0x7fff42599d10, pred = 0x7fff42599d10}}, sender = 0x7fff42599d10, reply_handler = 0x0, out_buf = {start = 0, n = 0, buf = '\000' <repeats 65535 times>}, invalid = 0, number_of_pings_received = 0, last_ping_received = 0}}
1: {*site->servers[0], *site->servers[1], *site->servers[2], sec} = <error: array elements must all be the same size>
[Switching to Thread 0x7ffff7ee7a00 (LWP 21476)]
[Switching to Thread 0x7fff4e7fc700 (LWP 21685)]
3: sec = 1664039415.4698734
2: {*site->servers[0], *site->servers[1], *site->servers[2]} = {{garbage = 0, refcnt = 4, srv = 0x7fff42558400 "10.211.55.14", port = 33302, con = 0x7fff42556f40, active = 1664039414.4705608, detected = 1664039414.6241975, outgoing = {data = {type = 0, suc = 0x7fff42558640, pred = 0x7fff42558640}, queue = {type = 0, suc = 0x7fff425686a0, pred = 0x7fff425686a0}}, sender = 0x7fff425686a0, reply_handler = 0x7fff4256c5c0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\200\000\002\232\000\000\000\000\000\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p\300\000\000\000\001", '\000' <repeats 11 times>, "\002\377\377\377\377\000\000\000\002\000\000\000\017~\020\363\060\000\000\000\000\000\000p\300\000\000\000\002", '\000' <repeats 31 times>, "\004~\020\363\060\000\000\000\000\000\000p\300\000\000\000\002\000\000\000\n\000\000\000\000\000\000\000\000\000\000p\277\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\001\v\003\000\003\000\024\000\v\001\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\353\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 3, last_ping_received = 1664039052.3651485}, {garbage = 0, refcnt = 4, srv = 0x7fff425704e0 "10.211.55.14", port = 33303, con = 0x7fff42556f70, active = 1664039414.4705608, detected = 1664039414.6241059, outgoing = {data = {type = 0, suc = 0x7fff42570720, pred = 0x7fff42570720}, queue = {type = 0, suc = 0x7fff42580780, pred = 0x7fff42580780}}, sender = 0x7fff42580780, reply_handler = 0x7fff425846a0, out_buf = {start = 0, n = 0, buf = "\000\000\000\n\000\000\000\200\000\002\232\000\000\000\000\001\000\000\000\002~\020\363\060~\020\363\060\000\000\000\000\000\000p\300\000\000\000\001", '\000' <repeats 11 times>, "\002\377\377\377\377\000\000\000\002\000\000\000\017~\020\363\060\000\000\000\000\000\000p\300\000\000\000\002", '\000' <repeats 31 times>, 
"\003~\020\363\060\000\000\000\000\000\000p\300\000\000\000\002\000\000\000\n\000\000\000\000\000\000\000\000\000\000p\277\000\000\000\002\000\000\000\n", '\000' <repeats 15 times>, "\004\000\000\001\v\003\000\003\000\024\000\v\001\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\000\353\000\000\000\000\000\000\000"...}, invalid = 0, number_of_pings_received = 3, last_ping_received = 1664039052.3651485}, {garbage = 0, refcnt = 2, srv = 0x7fff42589a40 "10.211.55.14", port = 33301, con = 0x7fff42599ce0, active = 0, detected = 1664039414.4705608, outgoing = {data = {type = 0, suc = 0x7fff42589c80, pred = 0x7fff42589c80}, queue = {type = 0, suc = 0x7fff42599d10, pred = 0x7fff42599d10}}, sender = 0x7fff42599d10, reply_handler = 0x0, out_buf = {start = 0, n = 0, buf = '\000' <repeats 65535 times>}, invalid = 0, number_of_pings_received = 0, last_ping_received = 0}}
1: {*site->servers[0], *site->servers[1], *site->servers[2], sec} = <error: array elements must all be the same size>
[New Thread 0x7fff4dffb700 (LWP 21841)]
[Thread 0x7fff4dffb700 (LWP 21841) exited]
*/
/*
-exec display task_timer
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055750.7200146, done = 1}
[New Thread 0x7fff9c4f4700 (LWP 3837)]
[Thread 0x7fff9c4f4700 (LWP 3837) exited]
[Switching to Thread 0x7ffff7ee7a00 (LWP 32272)]
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055792.7309351, done = 1}
[Switching to Thread 0x7fff4e7fc700 (LWP 32449)]
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055792.8114834, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055794.829345, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055795.8460269, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055797.0455055, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055797.0455055, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055797.0455055, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055797.0455055, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055797.0455055, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055797.0455055, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055797.0455055, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055802.7471585, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055802.7471585, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055802.7471585, done = 1}
[New Thread 0x7fff9c4f4700 (LWP 3842)]
[Thread 0x7fff9c4f4700 (LWP 3842) exited]
// Change the system time:
[[email protected] logs]# date
Sun Sep 25 05:44:45 CST 2022
[[email protected] logs]# date -s "2022-09-25 00:01:00"
Sun Sep 25 00:01:00 CST 2022
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055893.4010971, done = 1}
[Switching to Thread 0x7fff4e7fc700 (LWP 32449)]
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055893.4823222, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055895.5633478, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055896.8602524, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055896.8602524, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055896.8602524, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055896.8602524, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055899.0611885, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055899.0611885, done = 1}
1: task_timer = {real_start = 1664049812.9187768, monotonic_start = 50635.304078640998, offset = 1663999177.6146982, now = 1664055899.0611885, done = 1}
*/
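Based on the code analysis above, the conclusions are as follows:
1). The time used by the MGR cluster = a fixed difference captured at initialization (offset) + the monotonic clock reading (time elapsed since startup). The offset never changes after initialization, and the monotonic clock increases at a steady rate while the OS is running normally, so the time used for MGR heartbeats is not controlled by the system time. The gdb output above shows this: after date -s moved the system clock back from 05:44:45 to 00:01:00, task_timer.now kept advancing (from 1664055802.7 to 1664055893.4) instead of jumping backwards. This is why the test in step 3 found no anomaly in the MGR cluster, and it confirms that this failure was indeed not caused by a change in the system time.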
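The relationship in conclusion 1) can be sketched in a few lines of C. This is only an illustration and not the XCom source: the struct and function names (sketch_timer, sketch_timer_init, sketch_timer_seconds, sketch_offset) are hypothetical, and the fields simply mirror those visible in the task_timer gdb output above. It assumes a POSIX system that provides clock_gettime with CLOCK_REALTIME and CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical structure; field names mirror the task_timer gdb output. */
struct sketch_timer {
  double real_start;      /* wall-clock time sampled once at init */
  double monotonic_start; /* monotonic clock sampled once at init */
  double offset;          /* real_start - monotonic_start, never updated */
  double now;             /* last value handed out */
};

/* Read a POSIX clock and return it as seconds in a double. */
double read_clock(clockid_t id) {
  struct timespec ts;
  clock_gettime(id, &ts);
  return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* The fixed difference between the two clocks at initialization. */
double sketch_offset(double real_start, double monotonic_start) {
  return real_start - monotonic_start;
}

/* Sample both clocks once; this is the only time the wall clock is read. */
void sketch_timer_init(struct sketch_timer *t) {
  t->real_start = read_clock(CLOCK_REALTIME);
  t->monotonic_start = read_clock(CLOCK_MONOTONIC);
  t->offset = sketch_offset(t->real_start, t->monotonic_start);
  t->now = t->real_start;
}

/* Current time = monotonic reading + fixed offset. A later "date -s"
 * changes CLOCK_REALTIME only, so this value does not jump. */
double sketch_timer_seconds(struct sketch_timer *t) {
  t->now = read_clock(CLOCK_MONOTONIC) + t->offset;
  return t->now;
}
```

Plugging in the values from the gdb session, sketch_offset(1664049812.9187768, 50635.304078640998) reproduces the displayed offset of about 1663999177.6146982. Because only the monotonic clock is read after initialization, two successive calls to sketch_timer_seconds never go backwards, even if the system time is changed between them.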