Databases: Oracle Case Study, ORA-00600: internal error code, arguments: [4187]

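The customer in this case is a provincial telecom operator. The alert log was full of ORA-00600 [4187] errors, which had already affected normal business operation.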
Fri Nov 19 16:07:09 2021
Errors in file /u01/oracle/app/oracle/diag/rdbms/lcfa/LCFA1/trace/LCFA1_smon_5811.trc  (incident=184182):
ORA-00600: internal error code, arguments: [4187], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 54671, block 287204 to scn 17162371413499
Recovery of Online Redo Log: Thread 1 Group 5 Seq 54671 Reading mem 0
  Mem# 0: +DATA/lcfa/onlinelog/group_5.287.904064243
Block recovery stopped at EOT rba 54671.287388.16
Block recovery completed at rba 54671.287388.16, scn 3995.3977065979
Non-fatal internal error happenned while SMON was doing flushing of monitored table stats.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Fri Nov 19 16:07:10 2021
Sweep [inc][184182]: completed
Fri Nov 19 16:07:10 2021
Errors in file /u01/oracle/app/oracle/diag/rdbms/lcfa/LCFA1/trace/LCFA1_ora_1734.trc  (incident=190317):
ORA-00600: internal error code, arguments: [4187], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Nov 19 16:09:04 2021
Block recovery from logseq 54671, block 287204 to scn 17162371413499
Recovery of Online Redo Log: Thread 1 Group 5 Seq 54671 Reading mem 0
  Mem# 0: +DATA/lcfa/onlinelog/group_5.287.904064243
Block recovery completed at rba 54671.287388.16, scn 3995.3977065982
Fri Nov 19 16:10:30 2021
Errors in file /u01/oracle/app/oracle/diag/rdbms/lcfa/LCFA1/trace/LCFA1_ora_6392.trc  (incident=184485):
ORA-00600: internal error code, arguments: [4187], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.

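The log shows that ORA-00600 [4187] is accompanied by block recovery. ORA-00600 [4XXX] errors are usually undo-related, and they all trigger block recovery (BRR). SMON has hit 1 internal error so far. If SMON hits 100 internal errors, the instance is restarted; the limit is controlled by the hidden parameter _smon_internal_errlimit.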

SQL> @sp smon_inter
 
-- show parameter by sp
 
-- show hidden parameter by sp
old   3: where x.indx=y.indx and ksppinm like '_%&p%'
new   3: where x.indx=y.indx and ksppinm like '_%smon_inter%'
 
NAME                                     VALUE      DESC
---------------------------------------- ---------- ------------------------------------------------------------------------------------------
_smon_internal_errlimit                  100        limit of SMON internal errors

Doc ID 19700135.8 explains ORA-00600 [4187] fairly clearly:

Description
ORA-600 [4187] can occur for undo segments where wrap# is close to the max value of 0xffffffff (KSQNMAXVAL).

This normally affects databases with high transaction rate that have existed for a relatively long time.

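In short: in an environment that has sustained a high TPS for a long time, binding a new transaction to a slot of an undo segment increments that slot's wrap#. If the incremented wrap# exceeds the maximum value KSQNMAXVAL (0xffffffff), ORA-00600 [4187] is raised.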

Reading further into the trace file, we find the undo segment header that raised the error:

 TRN CTL:: seq: 0xd14f chd: 0x0009 ctl: 0x0004 inc: 0x00000000 nfb: 0x0000
            mgc: 0xb000 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00c06434.d14f.2c scn: 0x0f9b.ed0bc826
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00000000.d14f.2b ext: 0x2  spc: 0x1dc   
    uba: 0x00000000.d14f.2a ext: 0x2  spc: 0x70e   
    uba: 0x00000000.d14b.02 ext: 0x1e spc: 0x1f02  
    uba: 0x00000000.ce3f.02 ext: 0x12 spc: 0x14da  
    uba: 0x00000000.3226.02 ext: 0x32 spc: 0x14ae  
  TRN TBL::
 
  index  state cflags  wrap#    uel         scn            dba            parent-xid    nub     stmt_num    cmt
  ------------------------------------------------------------------------------------------------
   0x00    9    0x00  0xfffffa0c  0x000c  0x0f9b.ed0bc8f1  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x01    9    0x00  0xfffff4ab  0x0016  0x0f9b.ed0bc84e  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x02    9    0x00  0xfffff3aa  0x0008  0x0f9b.ed0bc934  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x03    9    0x00  0xfffff8d9  0x000e  0x0f9b.ed0bc985  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x04    9    0x00  0xfffffce8  0xffff  0x0f9b.ed0bc9e7  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x05    9    0x00  0xfffff627  0x001a  0x0f9b.ed0bc833  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x06    9    0x00  0xfffff4e6  0x0004  0x0f9b.ed0bc9cb  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x07    9    0x00  0xffffece5  0x000b  0x0f9b.ed0bc85c  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x08    9    0x00  0xfffff724  0x0021  0x0f9b.ed0bc93a  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x09    9    0x00  0xfffffff3  0x0015  0x0f9b.ed0bc828  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x0a    9    0x00  0xfffffaf2  0x0018  0x0f9b.ed0bc90c  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x0b    9    0x00  0xfffff671  0x0010  0x0f9b.ed0bc867  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x0c    9    0x00  0xfffffec0  0x001e  0x0f9b.ed0bc900  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x0d    9    0x00  0xfffff8bf  0x0020  0x0f9b.ed0bc889  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x0e    9    0x00  0xfffff4ce  0x0013  0x0f9b.ed0bc9ab  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x0f    9    0x00  0xfffff64d  0x000d  0x0f9b.ed0bc875  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x10    9    0x00  0xfffff5ec  0x000f  0x0f9b.ed0bc86b  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x11    9    0x00  0xfffffccb  0x001c  0x0f9b.ed0bc950  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x12    9    0x00  0xfffff55a  0x001f  0x0f9b.ed0bc976  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x13    9    0x00  0xfffff659  0x0014  0x0f9b.ed0bc9b1  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x14    9    0x00  0xffffefb8  0x0006  0x0f9b.ed0bc9c2  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x15    9    0x00  0xffffed27  0x0005  0x0f9b.ed0bc82e  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x16    9    0x00  0xfffffd66  0x0007  0x0f9b.ed0bc854  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x17    9    0x00  0xfffffdd5  0x0000  0x0f9b.ed0bc8e6  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x18    9    0x00  0xfffff1f4  0x001d  0x0f9b.ed0bc917  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x19    9    0x00  0xfffff303  0x0002  0x0f9b.ed0bc927  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x1a    9    0x00  0xfffff592  0x0001  0x0f9b.ed0bc83b  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305927
   0x1b    9    0x00  0xfffff9f1  0x0017  0x0f9b.ed0bc8df  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x1c    9    0x00  0xffffeee0  0x0012  0x0f9b.ed0bc95b  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x1d    9    0x00  0xfffff23f  0x0019  0x0f9b.ed0bc91e  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x1e    9    0x00  0xfffff67e  0x000a  0x0f9b.ed0bc908  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x1f    9    0x00  0xfffff1ad  0x0003  0x0f9b.ed0bc982  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x20    9    0x00  0xfffffb0c  0x001b  0x0f9b.ed0bc8ba  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928
   0x21    9    0x00  0xfffff1eb  0x0011  0x0f9b.ed0bc943  0x00c06434  0x0000.000.00000000  0x00000001   0x00000000  1637305928

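The dump of the affected undo segment header shows that wrap# is very high in every slot. chd in ktuxc is 0x0009, which means the next transaction will use slot 9, and slot 9's wrap# of 0xfffffff3 is already very close to KSQNMAXVAL. But if each reuse only added 1 to wrap#, the value would not exceed KSQNMAXVAL here. So why is ORA-00600 [4187] raised?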

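The reason is that the "wrap# + 1 on reuse" rule is outdated. Currently, when ktubnd binds a transaction to an undo segment, it calls kjqghd to compute an increment, delta, for the reused slot. This delta is bounded: it must be less than 16 (defined by KTU_MAX_KSQN_DELTA). So 0xfffffff3 + delta can exceed KSQNMAXVAL.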
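The arithmetic can be checked with a small Python sketch. It only uses the numbers quoted above (KSQNMAXVAL, the bound of 16, and slot 9's wrap#). The lower bound of 1 for delta is an assumption made for illustration, and the function name is hypothetical rather than anything in Oracle.

```python
KSQNMAXVAL = 0xFFFFFFFF      # maximum wrap# quoted from Doc ID 19700135.8
KTU_MAX_KSQN_DELTA = 16      # the article states delta must be less than this


def overflowing_deltas(wrap):
    """Return the deltas in [1, KTU_MAX_KSQN_DELTA) that push wrap# past KSQNMAXVAL.

    The lower bound of 1 is an assumption; the article only gives the upper bound.
    """
    return [d for d in range(1, KTU_MAX_KSQN_DELTA) if wrap + d > KSQNMAXVAL]


slot9_wrap = 0xFFFFFFF3      # wrap# of slot 0x09 in the TRN TBL dump
headroom = KSQNMAXVAL - slot9_wrap

print(headroom)                        # 12
print(overflowing_deltas(slot9_wrap))  # [13, 14, 15]
```

With a headroom of 12, slot 9 survives a +1 increment but not a delta of 13, 14 or 15, which is consistent with the error being raised before wrap# reaches the maximum.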

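Once the cause is known, the fix is simple: drop the affected undo segment or rebuild the undo tablespace. If the undo segment cannot be dropped, for example because other transactions are still active on it, it can be masked with _corrupted_rollback_segments. MOS also provides a script to check which undo segments are exposed to this problem.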

 select b.segment_name, b.tablespace_name 
         ,a.ktuxeusn "Undo Segment Number"
         ,a.ktuxeslt "Slot"
         ,a.ktuxesqn "Wrap#"
   from  x$ktuxe a, dba_rollback_segs b
   where a.ktuxesqn > -429496730 and a.ktuxesqn < 0
       and a.ktuxeusn = b.segment_id;
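The negative bounds in this query are easier to read once converted to unsigned values. The sketch below assumes that ktuxesqn is shown as a signed 32-bit number, which the query's bounds suggest but the article does not state. The helper names are hypothetical.

```python
LOWER_BOUND = -429496730     # lower bound taken from the MOS query


def to_unsigned32(value):
    """Reinterpret a signed 32-bit integer as its unsigned bit pattern."""
    return value & 0xFFFFFFFF


def flagged(wrap_unsigned):
    """Apply the query's predicate to a wrap# given as an unsigned value."""
    # Values with the top bit set appear negative when read as signed 32-bit.
    signed = wrap_unsigned - 2**32 if wrap_unsigned >= 2**31 else wrap_unsigned
    return LOWER_BOUND < signed < 0


threshold = to_unsigned32(LOWER_BOUND)
print(hex(threshold))                 # 0xe6666666
print(round(threshold / 2**32, 3))    # 0.9
print(flagged(0xFFFFFFF3))            # True
```

Under that assumption, the query reports slots whose wrap# has passed roughly 90% of the 32-bit range, which covers every slot in the dump shown earlier.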

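One more point is worth considering: why did wrap# grow so large? Is high TPS the only reason? The principle for binding transactions to undo segments is to spread active transactions as evenly as possible across the undo segments. The algorithm is: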

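Among the online undo segments of the current undo tablespace, look for one whose transaction table has no active transaction;

If none is found, try to bring online an undo segment of the current undo tablespace that is currently offline;

If none is found, try to create a new undo segment in the current undo tablespace and bring it online;

If none can be created, pick the least recently used undo segment.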
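The four steps above can be sketched in Python as follows. This is only a model of the order of the fallbacks as described in the text. The UndoSegment class, its fields, the can_create flag and the generated segment name are hypothetical and do not correspond to Oracle structures.

```python
from dataclasses import dataclass


@dataclass
class UndoSegment:           # hypothetical model, not an Oracle structure
    name: str
    online: bool
    active_txns: int
    last_used: int           # logical timestamp; smaller means used longer ago


def pick_undo_segment(segments, can_create, clock):
    """Choose a segment for a new transaction following the four steps in the text.

    Assumes `segments` is non-empty whenever `can_create` is False.
    """
    # Step 1: an online segment with no active transaction.
    for seg in segments:
        if seg.online and seg.active_txns == 0:
            return seg
    # Step 2: bring an offline segment online.
    for seg in segments:
        if not seg.online:
            seg.online = True
            return seg
    # Step 3: create a new segment if the tablespace has room.
    if can_create:
        seg = UndoSegment(f"NEW_SEG_{len(segments) + 1}", True, 0, clock)
        segments.append(seg)
        return seg
    # Step 4: fall back to the least recently used segment.
    return min(segments, key=lambda s: s.last_used)


# A small, full tablespace: every segment is online and busy, none can be created.
busy = [UndoSegment("S1", True, 3, 10), UndoSegment("S2", True, 5, 7)]
print(pick_undo_segment(busy, can_create=False, clock=11).name)   # S2
```

When steps 2 and 3 are never available, every new transaction lands on the same few segments, so their slots are reused far more often.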

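A very likely explanation is that too few undo segments could be brought online. A check showed that the undo tablespace of this instance was 1.5 GB with autoextend disabled, which is what drove the wrap# of every slot in the undo transaction tables so high.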

The additional recommendation for this case is therefore to size the undo tablespace and set _rollback_segment_count appropriately for the peak TPS.

Original article on 墨天轮 (modb.pro): https://www.modb.pro/db/17494…

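About the author
Li Xiangyu (李翔宇) is a delivery technical consultant for the western region at 云和恩墨 (Enmotech). He has long served customers in the mobile carrier industry and is experienced in Oracle performance tuning, fault diagnosis, and special recovery.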
