收到一些库有会话占用了大量temp表空间的告警,通常的做法是查问v$sort_usage找到对应sql_id
select * from (select username, session_addr, sql_id, contents, segtype, blocks * 8 / 1024 / 1024 gb from v$sort_usage order by blocks desc) where rownum <= 20;
但有时会发现找进去的sql特地简略,怎么看都不像能占用几十几百G temp表空间的样子
为什么从v$sort_usage找到的sql_id有时是错的?应该怎么找正确的sql_id?
对于第一个问题,查找了网上的一些文章,发现v$sort_usage的sql_id并不对应v$session的sql_id,它对应的是v$session的prev_sql_id
有一个V$FIXED_VIEW_DEFINITION视图能够查看固定视图定义,查问能够看到v$sort_usage定义如下
select * from V$FIXED_VIEW_DEFINITION where view_name='V$SORT_USAGE';--输入如下select USERNAME , "USER" , SESSION_ADDR , SESSION_NUM , SQLADDR , SQLHASH, SQL_ID, TABLESPACE , CONTENTS , SEGTYPE , SEGFILE# , SEGBLK# ,EXTENTS , BLOCKS , SEGRFNO# from GV$SORT_USAGE where inst_id = USERENV('Instance');
能够看到它来自GV$SORT_USAGE,再查看GV$SORT_USAGE的定义
select * from V$FIXED_VIEW_DEFINITION where view_name='GV$SORT_USAGE';--输入如下select x$ktsso.inst_id, username, username, ktssoses, ktssosno, prev_sql_addr, prev_hash_value, prev_sql_id, ----------- 留神这个 ktssotsn, decode(ktssocnt, 0, 'PERMANENT', 1, 'TEMPORARY'), decode(ktssosegt, 1, 'SORT', 2, 'HASH', 3, 'DATA', 4, 'INDEX', 5, 'LOB_DATA', 6, 'LOB_INDEX', 'UNDEFINED'), ktssofno, ktssobno, ktssoexts, ktssoblks, ktssorfno from x$ktsso, v$session where ktssoses = v$session.saddr and ktssosno = v$session.serial#;
能够看到v$sort_usage的sql_id对应的是v$session的prev_sql_id。如果以后sql是最新正在执行的语句,查出来的后果是对的;如果过后执行了其它sql,看到就是谬误的语句。
上面有一个例子,来自 http://www.laoxiong.net/temporary_tablespace_excessive_usage_case.html
--依据dba_objects建一个测试表T1,使其数据量达到2000万行。 select count(*) from t1; COUNT(*) ----------- 20171200 --而后将SQL工作区设置为手动模式,设置sort内存大小限度为200M: alter session set workarea_size_policy=manual; alter session set sort_area_size=209715200; --查问失去以后的会话sid: select sid from v$mystat where rownum< =1; SID ----------- 2111 --执行这上面的代码: declare v_object_name varchar2(100); v_dummy varchar2(100); begin for rec in (select * from t1 order by object_id,object_name) loop select object_type into v_dummy from t1 where rownum<=1; select object_name into v_object_name from dba_objects where object_id=rec.object_id; dbms_lock.sleep(60*10); exit; end loop; end; / --这段代码会关上一个游标,对2000万的数据量进行排序,而后在循环中只取一条数据,而后就进入sleep。在另一个窗口中监控到2111这个会话的event变成了PL/SQL lock timer,就去查问v$sort_usage: select a.sql_id sort_sql_id,b.sql_id,b.prev_sql_id, contents,segtype,blocks*8/1024/1024 gb from v$sort_usage a,v$session b where a.session_addr=b.saddr and b.sid=2111; SORT_SQL_ID SQL_ID PREV_SQL_ID CONTENTS SEGTYPE GB ------------- ------------- ------------- --------- --------- ----------- fabh24prgk2sj bhzf316mdc07w fabh24prgk2sj TEMPORARY SORT 1.444824219 --能够看到v$sort_usage中的SQL_ID(即上述后果中SORT_SQL_ID)与v$session中的pre_sql_id统一,这条SQL是: @sqlbyid fabh24prgk2sj SQL_FULLTEXT -------------------------------------------------------- SELECT OBJECT_NAME FROM DBA_OBJECTS WHERE OBJECT_ID=:B1 --而实际上以后正在执行的SQL是: @sqlbyid bhzf316mdc07w SQL_FULLTEXT --------------------------------------------------------------------------- declare v_object_name varchar2(100); v_dummy varchar2(100); begin for rec in (select * from t1 order by object_id,object_name) loop select object_type into v_dummy from t1 where rownum<=1; select object_name into v_object_name from dba_objects where object_id=rec.object_id; dbms_lock.sleep(60*10); exit; end loop; end;
这个问题记录在了mos文档 Bug 17834663 - Include SQL ID for statement that created a temporary segment in GV$SORT_USAGE (文档 ID 17834663.8) 能够看到12.2以下版本都受该bug影响(查了下19c的定义还是用的prev_sql_id)
Affects:
Product (_Component_)
Oracle Server (Rdbms)
Range of versions _believed_ to be affected
Versions BELOW 12.2
Versions _confirmed_ as being affected
- 11.2.0.3
Platforms affected
Generic (all / most platforms affected)
Fixed:
The fix for 17834663 is first included in
- 12.2.0.1 (Base Release)
- 12.1.0.2 (Server Patch Set)
依据文档,从11.2.0.2版本之后,为v$sort_usage的基表减少了KTSSOSQLID列,它真正代表耗费高temp表空间的sql
x$ktsso is the base table that stores the temp tablespace usage information. With 11.2.0.1 or earlier, it stores no SQLIDs. From 11.2.0.2, a new column KTSSOSQLID had been added.
所以真正查问耗费高temp表空间的sql语句应该为
SELECT S.sid, S.serial# sid_serial, S.username, S.osuser, P.spid, S.module, S.program, SUM(T.blocks) * TBS.block_size / 1024 / 1024 mb_used, T.tablespace, COUNT(*) sort_ops FROM v$sort_usage T, v$session S, dba_tablespaces TBS, v$process P WHERE T.session_addr = S.saddr AND S.paddr = P.addr AND T.tablespace = TBS.tablespace_name GROUP BY S.sid, S.serial#, S.username, S.osuser, P.spid, S.module, S.program, TBS.block_size, T.tablespace HAVING SUM(blocks) > 10000 -- 80MB ORDER BY sid_serial; --将sid代入上面sql查出真正的sql_id select ktssosqlid from x$ktsso, v$session where ktssoses = v$session.saddr and ktssosno = v$session.serial# and v$session.sid=2111;
或者间接利用上面的sql
select k.inst_id "INST_ID", ktssoses "SADDR", sid, ktssosno "SERIAL#", username "USERNAME", osuser "OSUSER", ktssosqlid "SQL_ID", ktssotsn "TABLESPACE", decode(ktssocnt, 0, 'PERMANENT', 1, 'TEMPORARY') "CONTENTS", --留神在12c的v$sort_usage定义中TABLESPACE和CONTENTS曾经发生变化了。 decode(ktssosegt, 1, 'SORT', 2, 'HASH', 3, 'DATA', 4, 'INDEX', 5, 'LOB_DATA', 6, 'LOB_INDEX' , 'UNDEFINED') "SEGTYPE", ktssofno "SEGFILE#", ktssobno "SEGBLK#", ktssoexts "EXTENTS", ktssoblks "BLOCKS", round(ktssoblks*p.value/1024/1024, 2) "SIZE_MB", ktssorfno "SEGRFNO#" from x$ktsso k, v$session s, (select value from v$parameter where name='db_block_size') p where ktssoses = s.saddr and ktssosno = s.serial#;
v$sort_usage中的SEGTYPE列含意
- SORT:SQL排序应用的长期段,包含order by、group by、union、distinct、窗口函数、建索引等产生的排序。
- DATA:长期表(Global Temporary Table)存储数据应用的段。
- INDEX:长期表上建的索引应用的段。
- HASH:hash算法所应用的长期段。
- LOB_DATA和LOB_INDEX:长期LOB应用的长期段。
另外能够通过以下语句查看一段时间前占用temp表空间高的sql
select * from (select t.sample_time, s.PARSING_SCHEMA_NAME, t.sql_id, t.sql_child_number as sql_child, round(t.temp_space_allocated / 1024 / 1024 / 1024, 2) || ' G' as temp_used, round(t.temp_space_allocated / (select sum(decode(d.autoextensible, 'YES', d.maxbytes, d.bytes)) from dba_temp_files d), 2) * 100 || ' %' as temp_pct, t.program, t.module, s.SQL_TEXT from v$active_session_history t, v$sql s where t.sample_time > to_date('2019-07-15 22:00:00', 'yyyy-mm-dd hh24:mi:ss') and t.sample_time < to_date('2019-07-15 23:00:00', 'yyyy-mm-dd hh24:mi:ss') and t.temp_space_allocated is not null and t.sql_id = s.SQL_ID order by t.temp_space_allocated desc) where rownum < 50 order by temp_used desc;
参考
Bug 17834663 - Include SQL ID for statement that created a temporary segment in GV$SORT_USAGE (文档 ID 17834663.8)
http://www.laoxiong.net/temporary_tablespace_excessive_usage_case.html