乐趣区

关于数据库:Oracle-11g-查看TEMP实际占用转

收到一些库有会话占用了大量 temp 表空间的告警,通常的做法是查问 v$sort_usage 找到对应 sql_id

select *
  from (select username,
               session_addr,
               sql_id,
               contents,
               segtype,
               blocks * 8 / 1024 / 1024 gb
          from v$sort_usage
         order by blocks desc)
 where rownum <= 20;

但有时会发现找进去的 sql 特地简略,怎么看都不像能占用几十几百 G temp 表空间的样子
为什么从 v$sort_usage 找到的 sql_id 有时是错的?应该怎么找正确的 sql_id?
对于第一个问题,查找了网上的一些文章,发现 v$sort_usage 的 sql_id 并不对应 v$session 的 sql_id,它对应的是 v$session 的 prev_sql_id
有一个 V$FIXED_VIEW_DEFINITION 视图能够查看固定视图定义,查问能够看到 v$sort_usage 定义如下

select * from V$FIXED_VIEW_DEFINITION where view_name='V$SORT_USAGE';
-- 输入如下
select  USERNAME , "USER" , SESSION_ADDR , SESSION_NUM , SQLADDR , SQLHASH, SQL_ID, TABLESPACE , CONTENTS , SEGTYPE , SEGFILE# , SEGBLK# ,EXTENTS , BLOCKS , SEGRFNO# from GV$SORT_USAGE where inst_id = USERENV('Instance');

能够看到它来自 GV$SORT_USAGE,再查看 GV$SORT_USAGE 的定义

select * from V$FIXED_VIEW_DEFINITION where view_name='GV$SORT_USAGE';
-- 输入如下
select x$ktsso.inst_id,
       username,
       username,
       ktssoses,
       ktssosno,
       prev_sql_addr,
        prev_hash_value,
        prev_sql_id,  ----------- 留神这个
        ktssotsn,
        decode(ktssocnt, 0, 'PERMANENT', 1, 'TEMPORARY'),
        decode(ktssosegt,
               1,
               'SORT',
               2,
               'HASH',
               3,
               'DATA',
               4,
               'INDEX',
               5,
               'LOB_DATA',
               6,
               'LOB_INDEX',
               'UNDEFINED'),
        ktssofno,
        ktssobno,
        ktssoexts,
        ktssoblks,
        ktssorfno
   from x$ktsso, v$session
  where ktssoses = v$session.saddr
    and ktssosno = v$session.serial#;

能够看到 v$sort_usage 的 sql_id 对应的是 v$session 的 prev_sql_id。如果以后 sql 是最新正在执行的语句,查出来的后果是对的;如果过后执行了其它 sql,看到就是谬误的语句。
上面有一个例子,来自  http://www.laoxiong.net/temporary_tablespace_excessive_usage_case.html

-- 依据 dba_objects 建一个测试表 T1,使其数据量达到 2000 万行。select count(*) from t1;  
   COUNT(*)  
----------- 
   20171200  
-- 而后将 SQL 工作区设置为手动模式,设置 sort 内存大小限度为 200M:alter session set workarea_size_policy=manual;  
  alter session set sort_area_size=209715200;  
 -- 查问失去以后的会话 sid:select sid from v$mystat where rownum< =1;  
         SID  
 ----------- 
        2111  
 -- 执行这上面的代码:declare  
        v_object_name varchar2(100);  
        v_dummy varchar2(100);  
     begin  
       for rec in (select * from t1 order by object_id,object_name) loop  
          select object_type into v_dummy from t1 where rownum<=1;  
          select object_name into v_object_name from dba_objects where object_id=rec.object_id;  
          dbms_lock.sleep(60*10);  
          exit;  
      end loop;  
    end;  
    /  
 -- 这段代码会关上一个游标,对 2000 万的数据量进行排序,而后在循环中只取一条数据,而后就进入 sleep。在另一个窗口中监控到 2111 这个会话的 event 变成了 PL/SQL lock timer,就去查问 v$sort_usage: 
 select a.sql_id sort_sql_id,b.sql_id,b.prev_sql_id, contents,segtype,blocks*8/1024/1024 gb   
 from v$sort_usage a,v$session b   
 where a.session_addr=b.saddr and b.sid=2111;  
 SORT_SQL_ID   SQL_ID        PREV_SQL_ID   CONTENTS  SEGTYPE            GB  
 ------------- ------------- ------------- --------- --------- ----------- 
 fabh24prgk2sj bhzf316mdc07w fabh24prgk2sj TEMPORARY SORT      1.444824219  
 -- 能够看到 v$sort_usage 中的 SQL_ID(即上述后果中 SORT_SQL_ID)与 v$session 中的 pre_sql_id 统一,这条 SQL 是:@sqlbyid fabh24prgk2sj  
 SQL_FULLTEXT  
 -------------------------------------------------------- 
 SELECT OBJECT_NAME FROM DBA_OBJECTS WHERE OBJECT_ID=:B1  
 -- 而实际上以后正在执行的 SQL 是:@sqlbyid bhzf316mdc07w  
 SQL_FULLTEXT  
 --------------------------------------------------------------------------- 
 declare  
    v_object_name varchar2(100);  
    v_dummy varchar2(100);  
 begin  
   for rec in (select * from t1 order by object_id,object_name) loop  
      select object_type into v_dummy from t1 where rownum<=1;  
      select object_name into v_object_name from dba_objects where object_id=rec.object_id;  
      dbms_lock.sleep(60*10);  
      exit;  
   end loop;  
 end; 

这个问题记录在了 mos 文档 Bug 17834663 – Include SQL ID for statement that created a temporary segment in GV$SORT_USAGE (文档 ID 17834663.8) 能够看到 12.2 以下版本都受该 bug 影响(查了下 19c 的定义还是用的 prev_sql_id)

Affects:

Product (_Component_)

Oracle Server (Rdbms)

Range of versions _believed_ to be affected

Versions BELOW 12.2

Versions _confirmed_ as being affected

  • 11.2.0.3

Platforms affected

Generic (all / most platforms affected)

Fixed:

The fix for 17834663 is first included in

  • 12.2.0.1 (Base Release)
  • 12.1.0.2 (Server Patch Set)

依据文档,从 11.2.0.2 版本之后,为 v$sort_usage 的基表减少了 KTSSOSQLID 列,它真正代表耗费高 temp 表空间的 sql
x$ktsso is the base table that stores the temp tablespace usage information. With 11.2.0.1 or earlier, it stores no SQLIDs. From 11.2.0.2, a new column KTSSOSQLID had been added.

所以真正查问耗费高 temp 表空间的 sql 语句应该为

SELECT S.sid,
       S.serial# sid_serial,
       S.username,
       S.osuser,
       P.spid,
       S.module,
       S.program,
       SUM(T.blocks) * TBS.block_size / 1024 / 1024 mb_used,
       T.tablespace,
        COUNT(*) sort_ops
   FROM v$sort_usage T, v$session S, dba_tablespaces TBS, v$process P
  WHERE T.session_addr = S.saddr
    AND S.paddr = P.addr
    AND T.tablespace = TBS.tablespace_name
  GROUP BY S.sid,
           S.serial#,
           S.username,
           S.osuser,
           P.spid,
           S.module,
           S.program,
           TBS.block_size,
           T.tablespace
 HAVING SUM(blocks) > 10000 -- 80MB
  ORDER BY sid_serial;
 -- 将 sid 代入上面 sql 查出真正的 sql_id
 select ktssosqlid from x$ktsso, v$session where ktssoses = v$session.saddr and ktssosno = v$session.serial# and v$session.sid=2111; 

或者间接利用上面的 sql

select k.inst_id "INST_ID",  
       ktssoses "SADDR",  
       sid,  
       ktssosno "SERIAL#",  
       username "USERNAME",  
       osuser "OSUSER",   
       ktssosqlid "SQL_ID",  
       ktssotsn "TABLESPACE",  
       decode(ktssocnt, 0, 'PERMANENT', 1, 'TEMPORARY') "CONTENTS",  
        -- 留神在 12c 的 v$sort_usage 定义中 TABLESPACE 和 CONTENTS 曾经发生变化了。decode(ktssosegt, 1, 'SORT', 2, 'HASH', 3, 'DATA', 4, 'INDEX',   
           5, 'LOB_DATA', 6, 'LOB_INDEX' , 'UNDEFINED') "SEGTYPE",  
        ktssofno "SEGFILE#",  
        ktssobno "SEGBLK#",  
        ktssoexts "EXTENTS",  
        ktssoblks "BLOCKS",  
        round(ktssoblks*p.value/1024/1024, 2) "SIZE_MB",  
        ktssorfno "SEGRFNO#"  
 from x$ktsso k, v$session s,   
      (select value from v$parameter where name='db_block_size') p   
 where ktssoses = s.saddr  
   and ktssosno = s.serial#; 

v$sort_usage 中的 SEGTYPE 列含意

  • SORT:SQL 排序应用的长期段,包含 order by、group by、union、distinct、窗口函数、建索引等产生的排序。
  • DATA:长期表 (Global Temporary Table) 存储数据应用的段。
  • INDEX:长期表上建的索引应用的段。
  • HASH:hash 算法所应用的长期段。
  • LOB_DATA 和 LOB_INDEX:长期 LOB 应用的长期段。

另外能够通过以下语句查看一段时间前占用 temp 表空间高的 sql

select *
  from (select t.sample_time,
               s.PARSING_SCHEMA_NAME,
               t.sql_id,
               t.sql_child_number as sql_child,
               round(t.temp_space_allocated / 1024 / 1024 / 1024, 2) || 'G' as temp_used,
               round(t.temp_space_allocated /
                     (select sum(decode(d.autoextensible, 'YES', d.maxbytes, d.bytes))
                        from dba_temp_files d),
                      2) * 100 || '%' as temp_pct,
                t.program,
                t.module,
                s.SQL_TEXT
           from v$active_session_history t, v$sql s
          where t.sample_time > to_date('2019-07-15 22:00:00', 'yyyy-mm-dd hh24:mi:ss')
            and t.sample_time < to_date('2019-07-15 23:00:00', 'yyyy-mm-dd hh24:mi:ss')
            and t.temp_space_allocated is not null
            and t.sql_id = s.SQL_ID
          order by t.temp_space_allocated desc)
  where rownum < 50
  order by temp_used desc;

参考
Bug 17834663 – Include SQL ID for statement that created a temporary segment in GV$SORT_USAGE (文档 ID 17834663.8)
http://www.laoxiong.net/temporary_tablespace_excessive_usage_case.html

退出移动版