About Oracle: An In-Depth Look at Instance Recovery

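You can tell whether instance recovery is needed from v$datafile.last_time and v$datafile.last_change#. When the database is shut down normally, the final change# and time are recorded in last_change# and last_time; after an abnormal shutdown those fields are left unset. Instance recovery proceeds in the order of the checkpoint queue (CKPT-Q), and the CKPT-Q order matches the order of the redo.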
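The check described above can be sketched in a few lines. This is a minimal illustration rather than a DBA tool: the `DatafileRow` type and the sample rows are hypothetical, and it assumes the rows were fetched from `v$datafile` while the database was mounted but not yet open, with SQL NULL mapped to `None`.

```python
from typing import Iterable, NamedTuple, Optional


class DatafileRow(NamedTuple):
    """One row of a hypothetical result set, for example from:
    SELECT file#, checkpoint_change#, last_change# FROM v$datafile
    """
    file_no: int
    checkpoint_change: int
    last_change: Optional[int]  # None models SQL NULL


def needs_instance_recovery(rows: Iterable[DatafileRow]) -> bool:
    """Return True if any datafile has no recorded last change number."""
    return any(row.last_change is None for row in rows)


# Made-up sample data: a clean shutdown versus a crash.
clean = [DatafileRow(1, 1000, 1000), DatafileRow(2, 1000, 1000)]
crashed = [DatafileRow(1, 1000, None), DatafileRow(2, 1000, None)]

print(needs_instance_recovery(clean))    # False
print(needs_instance_recovery(crashed))  # True
```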

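After a database server loses power unexpectedly and restarts, the database performs instance recovery. So what does Oracle actually do during instance recovery?

First, the definition of instance recovery: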

Instance recovery is the process of applying records in the online redo log to data files to reconstruct changes made after the most recent checkpoint. Instance recovery occurs automatically when an administrator attempts to open a database that was previously shut down inconsistently.

Oracle Database performs instance recovery automatically in the following situations:

The database opens for the first time after the failure of a single-instance database or all instances of an Oracle RAC database. This form of instance recovery is also called crash recovery. Oracle Database recovers the online redo threads of the terminated instances together.

Some but not all instances of an Oracle RAC database fail. Instance recovery is performed automatically by a surviving instance in the configuration.

The SMON background process performs instance recovery, applying online redo automatically. No user intervention is required.

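From this we know that an inconsistent shutdown triggers instance recovery (a consistent shutdown does not; see the official definition of shutdown immediate), and that when a RAC node goes down, instance recovery is performed on a surviving node. The process reconstructs the dirty blocks in memory and commits them, while rolling back whatever was uncommitted. The SMON background process is responsible for it.

Instance recovery has two phases:

1. Rolling Forward

Based on the records in the redo log, Oracle does the following:

1) For committed transactions, it reconstructs the dirty blocks in memory from the log, commits them, and lets them be written to disk by the normal mechanism.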

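2) For uncommitted transactions, it also reconstructs the dirty blocks from redo. (Why would redo for uncommitted transactions be on disk at all? Because the log is written in time order, so writing the redo of a committed transaction also forces out the redo of earlier transactions that have not committed yet.) These dirty blocks are only reconstructed; Oracle does nothing else to them in this phase.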
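To see why uncommitted redo ends up on disk, here is a toy model of an ordered log buffer. It is only a sketch under simplifying assumptions: `ToyLogBuffer` is a made-up class, and the real log writer has more flush triggers than a commit.

```python
from typing import List, Tuple

# Each redo entry is (transaction id, kind), where kind is "change" or "commit".
RedoEntry = Tuple[str, str]


class ToyLogBuffer:
    """Toy model: the buffer is always flushed strictly in arrival order."""

    def __init__(self) -> None:
        self.buffer: List[RedoEntry] = []   # redo still in memory
        self.on_disk: List[RedoEntry] = []  # redo already in the online log

    def change(self, txn: str) -> None:
        self.buffer.append((txn, "change"))

    def commit(self, txn: str) -> None:
        self.buffer.append((txn, "commit"))
        # A commit forces everything buffered so far to disk, in order,
        # including redo that belongs to other, still-open transactions.
        self.on_disk.extend(self.buffer)
        self.buffer.clear()


log = ToyLogBuffer()
log.change("T1")   # T1 never commits
log.change("T2")
log.commit("T2")   # flushes T1's redo along with T2's

committed = {txn for txn, kind in log.on_disk if kind == "commit"}
uncommitted_on_disk = sorted({txn for txn, _ in log.on_disk} - committed)
print(uncommitted_on_disk)  # ['T1']
```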

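Because the changes of some large uncommitted transactions have already been written to disk (still strictly following write-ahead logging), and because rolling forward itself produces dirty blocks that belong to uncommitted transactions, Oracle must perform the second step, rolling back.

2. Rolling Back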

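For every uncommitted dirty block, Oracle rolls back using the before-image in undo: the affected dirty blocks cached in memory are restored to their pre-change contents, and dirty blocks that had already been written to disk are overwritten from undo.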
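The two phases can be put together in a small simulation. This is a sketch of the idea, not Oracle's implementation: the `Redo` record and `recover` function are hypothetical, each block is a single integer, the before-image is carried inside the redo record instead of a separate undo segment, and redo is replayed unconditionally rather than compared against block SCNs.

```python
from typing import Dict, List, NamedTuple, Set


class Redo(NamedTuple):
    txn: str
    kind: str        # "change" or "commit"
    block: str = ""  # unused for commit records
    old: int = 0     # before-image (what undo would hold)
    new: int = 0     # after-image


def recover(disk: Dict[str, int], redo_after_ckpt: List[Redo]) -> Dict[str, int]:
    blocks = dict(disk)  # start from the on-disk state
    committed: Set[str] = set()
    applied: List[Redo] = []

    # Phase 1, roll forward: replay every change, committed or not.
    for rec in redo_after_ckpt:
        if rec.kind == "commit":
            committed.add(rec.txn)
        else:
            blocks[rec.block] = rec.new
            applied.append(rec)

    # Phase 2, roll back: undo uncommitted changes, newest first.
    for rec in reversed(applied):
        if rec.txn not in committed:
            blocks[rec.block] = rec.old
    return blocks


# Block C already holds T2's uncommitted value 30 on disk before the crash.
disk = {"A": 1, "B": 2, "C": 30}
redo = [
    Redo("T1", "change", "A", 1, 10),
    Redo("T2", "change", "C", 3, 30),
    Redo("T1", "commit"),
    Redo("T2", "change", "B", 2, 20),  # T2 never commits
]
result = recover(disk, redo)
print(result)  # {'A': 10, 'B': 2, 'C': 3}
```

T1's change to A survives because its commit record is in the redo, while both of T2's changes are reverted, including the one that had already reached disk.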

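Here is a figure from the official documentation:

Figure: Basic Instance Recovery Steps: Rolling Forward and Rolling Back

Explanation of the figure:

Before instance recovery, the records in the redo log correspond to four kinds of changed blocks (redo records only changes):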

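1) Changed blocks that were committed and already written to disk. Oracle does not need to do anything with these.

2) Changed blocks that were committed but not yet written to disk. During roll forward, Oracle reconstructs the dirty block in memory and then commits it through the normal mechanism.

3) Changed blocks that were neither committed nor written to disk.

4) Changed blocks that were not committed but were already written to disk.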

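Because rollback is processed per transaction, block types 3 and 4 are both handled entirely in the rollback phase: Oracle uses undo to roll back every uncommitted transaction, overwriting the data on disk or in memory with the before-image. That takes care of both type 3 and type 4.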
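The four cases reduce to a small decision table over two flags. The sketch below only restates the list above in code; `ACTIONS` and `recovery_action` are illustrative names, not anything Oracle exposes.

```python
from typing import Dict, Tuple

# Key: (committed, already written to disk before the crash)
ACTIONS: Dict[Tuple[bool, bool], str] = {
    (True, True): "no action needed",
    (True, False): "roll forward: rebuild the dirty block from redo",
    (False, False): "roll back: restore the before-image in memory",
    (False, True): "roll back: overwrite the on-disk block from undo",
}


def recovery_action(committed: bool, on_disk: bool) -> str:
    """Look up how a changed block is treated during instance recovery."""
    return ACTIONS[(committed, on_disk)]


for committed in (True, False):
    for on_disk in (True, False):
        print(committed, on_disk, "->", recovery_action(committed, on_disk))
```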

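It also follows from the above that Oracle assumes the transaction state recorded in undo is fully consistent with the state recorded in the redo log: there is no case where undo says a transaction committed while the redo log says it did not.

Not every situation matches what Oracle assumes, however. When a database loses power repeatedly, instance recovery may fail, and the only way forward is to use special measures that modify the datafile headers and the SCN.

Unless the situation is truly urgent, do not open a database with "folk remedies" such as BBED or forcibly advancing the SCN. For a successful DBA, backups and disaster recovery are always the most important work.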

Instance Recovery Phases
The first phase of instance recovery is called cache recovery or rolling forward, and involves reapplying all of the changes recorded in the online redo log to the data files. Because rollback data is recorded in the online redo log, rolling forward also regenerates the corresponding undo segments.

Rolling forward proceeds through as many online redo log files as necessary to bring the database forward in time. After rolling forward, the data blocks contain all committed changes recorded in the online redo log files. These files could also contain uncommitted changes that were either saved to the data files before the failure, or were recorded in the online redo log and introduced during cache recovery.

After the roll forward, any changes that were not committed must be undone. Oracle Database uses the checkpoint position, which guarantees that every committed change with an SCN lower than the checkpoint SCN is saved on disk. Oracle Database applies undo blocks to roll back uncommitted changes in data blocks that were written before the failure or introduced during cache recovery. This phase is called rolling back or transaction recovery.