On Databases: Technical Practice | The PostgreSQL Extension pg_dirtyread: Flashback Queries

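Summary: In Oracle, when data has been deleted by accident and you want to query or restore it, you can run a flashback query using a SELECT statement with an AS OF clause.

Good news for PostgreSQL fans: this article introduces pg_dirtyread, an extension that offers a similar "flashback query" by reading dead rows that have not yet been removed by vacuum.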

GitHub home page: https://github.com/df7cb/pg_d…

1.2 release announcement: https://www.postgresql.org/me…

1. Let's look at the three official examples:

Syntax:

SELECT * FROM pg_dirtyread('tablename') AS t(col1 type1, col2 type2, ...);

Example 1: recovering deleted rows

  CREATE EXTENSION pg_dirtyread;

  -- Create table and disable autovacuum
  CREATE TABLE foo (bar bigint, baz text);

  -- To make testing easier, turn autovacuum off first.
  ALTER TABLE foo SET (autovacuum_enabled = false, toast.autovacuum_enabled = false);

  INSERT INTO foo VALUES (1, 'Test'), (2, 'New Test');
  DELETE FROM foo WHERE bar = 1;

  SELECT * FROM pg_dirtyread('foo') AS t(bar bigint, baz text);
   bar │   baz
  ─────┼──────────
     1 │ Test
     2 │ New Test

As you can see, the deleted row (1, 'Test') can still be queried.

Example 2: a dropped column

 CREATE TABLE ab(a text, b text);  
  INSERT INTO ab VALUES ('Hello', 'World');  
 
  ALTER TABLE ab DROP COLUMN b;  
  DELETE FROM ab;  
 
  SELECT * FROM pg_dirtyread('ab') ab(a text, dropped_2 text);
     a   │ dropped_2
  ───────┼───────────
   Hello │ World

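As you can see, although column b was dropped, its data can still be read.

How to specify the column: use dropped_N to access the N-th column, counting from 1.

Limitation: because PostgreSQL removes the type metadata of the original column, you must specify the correct type in the column alias list. Only a few sanity checks are performed on that type, covering type length, type alignment, type modifier, and pass-by-value.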
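The dropped_N naming rule above can be sketched as a tiny standalone parser. This is only an illustration under stated assumptions: `dropped_attnum` is a hypothetical helper, not a function from pg_dirtyread, and the upper bound of 1600 reflects PostgreSQL's limit on the number of columns in a table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: PostgreSQL allows at most 1600 columns per table. */
#define SKETCH_MAX_ATTNUM 1600

/*
 * Hypothetical helper (not part of pg_dirtyread): return N for an alias of
 * the form "dropped_N" with 1 <= N <= SKETCH_MAX_ATTNUM, or 0 if the name
 * does not follow that pattern.
 */
int
dropped_attnum(const char *name)
{
    static const char prefix[] = "dropped_";
    size_t      plen = sizeof(prefix) - 1;
    char       *end;
    long        n;

    assert(name != NULL);

    if (strncmp(name, prefix, plen) != 0)
        return 0;

    /* Require a digit right after the prefix (rejects "dropped_" and signs). */
    if (name[plen] < '0' || name[plen] > '9')
        return 0;

    n = strtol(name + plen, &end, 10);

    /* Reject trailing garbage and out-of-range column numbers. */
    if (*end != '\0' || n < 1 || n > SKETCH_MAX_ATTNUM)
        return 0;

    return (int) n;
}
```

With this sketch, the alias `dropped_2` from example 2 maps to column 2, while an ordinary name such as `a` yields 0 and would be matched by name instead.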

Example 3: system columns

  SELECT * FROM pg_dirtyread('foo')
      AS t(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,
           bar bigint, baz text);
   tableoid │ ctid  │ xmin │ xmax │ cmin │ cmax │ dead │ bar │        baz
  ──────────┼───────┼──────┼──────┼──────┼──────┼──────┼─────┼───────────────────
      41823 │ (0,1) │ 1484 │ 1485 │    0 │    0 │ t    │   1 │ Delete
      41823 │ (0,2) │ 1484 │    0 │    0 │    0 │ f    │   2 │ Insert
      41823 │ (0,3) │ 1484 │ 1486 │    0 │    0 │ t    │   3 │ Update
      41823 │ (0,4) │ 1484 │ 1488 │    0 │    0 │ f    │   4 │ Not deleted
      41823 │ (0,5) │ 1484 │ 1489 │    1 │    1 │ f    │   5 │ Not updated
      41823 │ (0,6) │ 1486 │    0 │    0 │    0 │ f    │   3 │ Updated
      41823 │ (0,7) │ 1489 │    0 │    1 │    1 │ t    │   5 │ Not quite updated
      41823 │ (0,8) │ 1490 │    0 │    2 │    2 │ t    │   6 │ Not inserted

As you can see, xmax and ctid can be recovered. The oid column can only be recovered in PostgreSQL 11 and earlier.

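2. Supported versions

PostgreSQL 10 and 11 are supported, and pg_dirtyread 2.0 and later also support PostgreSQL 12 and 13. The community is still quite active.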

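3. Implementation analysis

The core code has two parts:

1. dirtyread_tupconvert.c mainly implements dirtyread_convert_tuples_by_name, which converts tuples by column name and handles both columns whose metadata has been removed and table inheritance. The key piece is the attrMap[] array, whose entries are source attribute numbers counted from 1.

Let's focus on dirtyread_do_convert_tuple: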

HeapTuple
dirtyread_do_convert_tuple(HeapTuple tuple, TupleConversionMap *map, TransactionId oldest_xmin)
{
    /* Local declarations (attrMap, invalues, inisnull, outvalues, outisnull, outnatts, i) are omitted from this excerpt. */

    /*
     * Extract all the values of the old tuple, offsetting the arrays so that
     * invalues[0] is left NULL and invalues[1] is the first source attribute;
     * this exactly matches the numbering convention in attrMap.
     */
    /* The "+ 1" is there because numbering starts at 1; this fetches the values from the old tuple. */
    heap_deform_tuple(tuple, map->indesc, invalues + 1, inisnull + 1);

    /*
     * Transpose into proper fields of the new tuple.
     * This is the important part: the conversion happens here.
     */
    for (i = 0; i < outnatts; i++)
    {
        int         j = attrMap[i];

        if (j == DeadFakeAttributeNumber)
        {
            /*
             * Case 1: the "dead" pseudo-column. Call the kernel function
             * HeapTupleIsSurelyDead (defined in tqual.c) directly. Other
             * situations could use HeapTupleSatisfiesVacuum,
             * HeapTupleSatisfiesMVCC and so on, but here the question is
             * specifically whether the tuple is dead.
             */
            outvalues[i] = HeapTupleIsSurelyDead(tuple, oldest_xmin);
            outisnull[i] = false;
        }
        else if (j < 0)
            /* Case 2: a system column, handled by heap_getsysattr. */
            outvalues[i] = heap_getsysattr(tuple, j, map->indesc, &outisnull[i]);
        else
        {
            /* Case 3: the most common case, copy the value directly. */
            outvalues[i] = invalues[j];
            outisnull[i] = inisnull[j];
        }
    }

    /* Repackage the values in tuple format. */
    return heap_form_tuple(map->outdesc, outvalues, outisnull);
}
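To make the three cases easier to follow outside the PostgreSQL source tree, here is a minimal standalone model of the same transposition. It is a sketch under loud assumptions: `SketchTuple`, `sketch_convert`, `SKETCH_DEAD_ATTNO` and `SKETCH_XMAX_ATTNO` are hypothetical names and values invented for illustration, columns are plain `long` values rather than Datum, and only one system column (xmax) is modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the PostgreSQL definitions. */
#define SKETCH_MAX_ATTS     8
#define SKETCH_XMAX_ATTNO   (-1)     /* the only system column in this model */
#define SKETCH_DEAD_ATTNO   (-1000)  /* plays the role of DeadFakeAttributeNumber */

typedef struct SketchTuple
{
    int         natts;      /* number of user columns */
    const long *values;     /* user column values, 0-based */
    const bool *isnull;     /* per-column NULL flags, 0-based */
    long        xmax;       /* sample system column */
    bool        dead;       /* precomputed "surely dead" answer */
} SketchTuple;

/*
 * Fill outvalues/outisnull (outnatts entries each) from the input tuple.
 * attr_map[i] holds the 1-based source column for output column i, a
 * negative number for a system column, or 0 for "no source" (yields NULL).
 */
void
sketch_convert(const SketchTuple *in, const int *attr_map, int outnatts,
               long *outvalues, bool *outisnull)
{
    long        invalues[SKETCH_MAX_ATTS + 1];
    bool        inisnull[SKETCH_MAX_ATTS + 1];
    int         i;

    assert(in->natts <= SKETCH_MAX_ATTS);

    /* "Deform" with an offset: slot 0 stays NULL, slot 1 is the first column. */
    invalues[0] = 0;
    inisnull[0] = true;
    for (i = 0; i < in->natts; i++)
    {
        invalues[i + 1] = in->values[i];
        inisnull[i + 1] = in->isnull[i];
    }

    for (i = 0; i < outnatts; i++)
    {
        int         j = attr_map[i];

        if (j == SKETCH_DEAD_ATTNO)
        {
            /* Case 1: the "dead" pseudo-column */
            outvalues[i] = in->dead ? 1 : 0;
            outisnull[i] = false;
        }
        else if (j < 0)
        {
            /* Case 2: a system column (only xmax exists in this model) */
            assert(j == SKETCH_XMAX_ATTNO);
            outvalues[i] = in->xmax;
            outisnull[i] = false;
        }
        else
        {
            /* Case 3: an ordinary column, looked up by 1-based number */
            assert(j <= in->natts);
            outvalues[i] = invalues[j];
            outisnull[i] = inisnull[j];
        }
    }
}
```

Because the input arrays are shifted by one slot, a map entry can be used as an index without any arithmetic, and an entry of 0 naturally produces a NULL output column.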

2. pg_dirtyread.c implements the user-facing interface.

Let's focus on Datum pg_dirtyread(PG_FUNCTION_ARGS).

Part 1

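    if (SRF_IS_FIRSTCALL())   /* this part is fairly formulaic */
    {
        /* superuser check */
        /* PG_GETARG_OID: get the table's OID */
        /* heap_open: open the table */
        /* get_call_result_type: validate the declared result type, which must be a row (composite) type */
        /* BlessTupleDesc(tupdesc): obtain the tuple descriptor of the result */

        /* The key step: build the conversion map with dirtyread_convert_tuples_by_name. */
        usr_ctx->map = dirtyread_convert_tuples_by_name(usr_ctx->reltupdesc,
                        funcctx->tuple_desc, "Error converting tuple descriptors!");

        /* heap_beginscan(usr_ctx->rel, SnapshotAny...): start the table scan, using SnapshotAny */
    }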

Part 2: fetch the rows one at a time and convert each of them, until the scan ends.

    if ((tuplein = heap_getnext(usr_ctx->scan, ForwardScanDirection)) != NULL)
    {
        if (usr_ctx->map != NULL)
        {
            tuplein = dirtyread_do_convert_tuple(tuplein, usr_ctx->map, usr_ctx->oldest_xmin);
            SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuplein));
        }
        else
            SRF_RETURN_NEXT(funcctx, heap_copy_tuple_as_datum(tuplein, usr_ctx->reltupdesc));
    }
    else
    {
        heap_endscan(usr_ctx->scan);                /* end the scan */
        heap_close(usr_ctx->rel, AccessShareLock);  /* close the table */
        SRF_RETURN_DONE(funcctx);
    }
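The begin, fetch-next, end rhythm of the two parts, and the effect of scanning with SnapshotAny, can be imitated with a toy scan over an in-memory array. This is a deliberately simplified sketch, not PostgreSQL's MVCC logic: `DemoRow`, `DemoScan` and the `demo_*` functions are hypothetical, and a single `deleted` flag stands in for real visibility rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoRow
{
    long        bar;
    bool        deleted;    /* toy stand-in for "not visible to a normal query" */
} DemoRow;

typedef struct DemoScan
{
    const DemoRow *rows;
    int         nrows;
    int         pos;        /* index of the next row to examine */
    bool        any;        /* true: return every row version, as a SnapshotAny scan would */
} DemoScan;

/* Part 1 analogue: set up the scan state once. */
void
demo_beginscan(DemoScan *scan, const DemoRow *rows, int nrows, bool any)
{
    assert(scan != NULL && nrows >= 0);
    scan->rows = rows;
    scan->nrows = nrows;
    scan->pos = 0;
    scan->any = any;
}

/* Part 2 analogue: return the next qualifying row, or NULL when the scan is done. */
const DemoRow *
demo_getnext(DemoScan *scan)
{
    while (scan->pos < scan->nrows)
    {
        const DemoRow *row = &scan->rows[scan->pos++];

        if (scan->any || !row->deleted)
            return row;
    }
    return NULL;
}

/* Drive a whole scan and count how many rows it returns. */
int
demo_count(const DemoRow *rows, int nrows, bool any)
{
    DemoScan    scan;
    int         n = 0;

    demo_beginscan(&scan, rows, nrows, any);
    while (demo_getnext(&scan) != NULL)
        n++;
    return n;
}
```

Applied to the table from example 1 (row 1 deleted, row 2 live), a scan with `any` set returns both rows, while a scan without it returns only the live one, which mirrors the difference between pg_dirtyread and an ordinary SELECT.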

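Overall, the implementation is not very complicated. Once you understand it, you can add your own features on top of it. That is the charm of PostgreSQL: its open architecture lets developers quickly build small "apps" of their own.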


