Oracle ASM normal/high redundancy disk group: the impact of losing one failgroup (it keeps running but cannot restart)

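In an Oracle database, protecting against disk failures is essential to avoiding data corruption, but multiple failures can sometimes occur even when ASM provides redundancy. Recently a customer built 2 failgroups on two storage arrays, plus 1 quorum third voting disk on an NFS server. While testing high availability they disconnected the link to one storage array and ran into two questions: the workload hung for nearly 10 seconds during the disconnection, and after a node restart the instance could not start.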

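For the hang during the storage disconnection: because Linux multipath software is in use, the recommendation is to tune the number of retries and the timeout applied after a path failure. This article focuses on the second symptom: after the storage link for one failgroup is disconnected, the system keeps working, but CRS cannot start after a restart. This is expected behavior.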

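In an Oracle Automatic Storage Management (ASM) disk group with redundancy (such as normal redundancy), losing one failure group (failgroup) does not directly affect the availability of the database instance, because ASM is designed to tolerate the loss of one failgroup under normal redundancy. However, when Oracle Clusterware (CRS) is restarted, the missing failgroup can cause the startup to fail. Take, for example, a normal-redundancy ASM disk group that holds the voting disks, with 3 voting disks, where 1 of them has a problem. In scenario 1 the device disappears from the OS layer altogether; this will most likely trigger a drop disk directly at the ASM layer, without waiting for the repair time (disk_repair_time or failgroup_repair_time) to expire, and a restart is not affected. In scenario 2 the device is still visible at the OS layer but unusable; the corresponding ASM disk (voting disk) may then go OFFLINE, and that does affect CRS automatic startup.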
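To tell the two scenarios apart, it can help to look at how ASM currently sees each disk. The following is a minimal sketch, assuming a SQL*Plus session connected to a running ASM instance; the columns come from V$ASM_DISK and the choice of columns is only illustrative.

```sql
-- How ASM currently sees each disk. An OFFLINE disk with a non-zero
-- REPAIR_TIMER is still waiting for its repair time to expire before it is dropped.
select group_number,
       disk_number,
       name,
       failgroup,
       mount_status,   -- e.g. CACHED, MISSING, CLOSED
       mode_status,    -- ONLINE or OFFLINE
       state,          -- e.g. NORMAL, DROPPING, FORCING
       repair_timer,   -- seconds remaining before the disk is dropped
       voting_file     -- Y if the disk holds a voting file
from   v$asm_disk
order  by group_number, failgroup, disk_number;
```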

Detailed analysis:

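Availability of the database instance:
- Normal redundancy: with a normal-redundancy disk group (usually 2-way mirroring), ASM keeps two copies of the data in different failgroups. Losing one failgroup means ASM can still read the data from the remaining copy, so the database stays online with no data loss.
- In this situation, as long as the remaining failgroup is healthy, the database instance should stay online and unaffected.

Impact when restarting Oracle Clusterware (CRS):
- Restarting the Oracle Clusterware stack (CRS) restarts everything it manages in Oracle GI, including ASM and the database instances.
- Impact of the lost failgroup: when CRS restarts, the ASM instance may be unable to mount the disk group because the redundancy requirement is not met. In a normal-redundancy ASM disk group, losing one failgroup (for example, one storage server or set of disks) reduces redundancy, but CRS still expects to see the complete redundancy configuration when it tries to mount the disk group.
- If a failgroup is missing and ASM cannot satisfy the required redundancy (because required copies are missing from the disk group), it will not mount the disk group, CRS cannot start the dependent database instances, and the startup fails.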

Why does this happen?

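ASM disk group redundancy check: when CRS starts, it checks the redundancy of the ASM disk groups. If a required failgroup is missing or redundancy has been broken (for example, a failgroup is inaccessible or no longer exists), CRS does not allow the disk group to be mounted, because the redundancy constraint has been violated.
Cluster dependency: CRS depends on ASM to provide disk group availability for all Oracle services (including databases). If ASM cannot mount a disk group because of a missing failgroup, CRS cannot start the dependent database instances.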
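Before restarting CRS after a storage failure, it may be worth checking whether any disk group is already running with offline disks. This is a sketch only, assuming a connection to a running ASM instance; OFFLINE_DISKS and VOTING_FILES are columns of V$ASM_DISKGROUP.

```sql
-- Disk groups with their redundancy type and the number of offline disks.
select name,
       state,          -- MOUNTED, DISMOUNTED, ...
       type,           -- EXTERN, NORMAL or HIGH
       offline_disks,  -- number of disks currently offline in the group
       voting_files    -- Y if the group contains voting files
from   v$asm_diskgroup
order  by name;
```

A non-zero OFFLINE_DISKS value on the group that holds the voting files suggests that the next CRS restart could run into the behavior described here.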

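How to resolve:

Restore the storage link.
Alternatively, the administrator can issue "alter diskgroup mount force" to tell ASM to try to force-mount the disk group manually even though it cannot maintain the required redundancy; this also applies to a disk group that contains the voting disks. Run the following SQL:

alter diskgroup xxx mount force;

Or, if manual intervention is needed, start the cluster in exclusive mode: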

crsctl start crs -excl

Then connect to ASM and run:

alter diskgroup <dgname> mount force;

Then fix the underlying error (for example, add another disk to the other failgroup so that the data can be re-mirrored, and drop the failed disk).

After that, the cluster can start normally again.

After "alter diskgroup xxx mount force;" the disk group was mounted and Clusterware continued the startup process.

Mounting Disk Groups Using the FORCE Option

In the FORCE mode, ASM attempts to mount the disk group even if it cannot discover all of the devices that belong to the disk group. This setting is useful if some of the disks in a normal or high redundancy disk group became unavailable while the disk group was dismounted.

If ASM discovers all of the disks in the disk group, then MOUNT FORCE fails. Therefore, use the MOUNT FORCE setting only if some disks are unavailable. Otherwise, use NOFORCE [the default].

The disk group mount succeeds if ASM finds at least one complete set of extents in a disk group. If ASM determines that one or more disks are not available, then ASM moves those disks off line and drops [sic!] the disks after the DISK_REPAIR_TIME expires.

In clustered ASM environments, if an ASM instance is not the first instance to mount the disk group, then using the MOUNT FORCE statement fails. This is because the disks have been accessed by another instance and the disks are not locally accessible.
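How long ASM waits before dropping an offline disk is controlled by disk group attributes. The sketch below shows one way to view them, assuming the disk group is mounted (V$ASM_ATTRIBUTE only lists attributes of mounted disk groups). The disk group name data follows the examples in this article, and the value in the SET ATTRIBUTE statement is only an example, not a recommendation.

```sql
-- Current repair-time settings per disk group.
select g.name as diskgroup,
       a.name as attribute,
       a.value
from   v$asm_attribute a
join   v$asm_diskgroup g on g.group_number = a.group_number
where  a.name in ('disk_repair_time', 'failgroup_repair_time')
order  by g.name, a.name;

-- Example only: allow more time to repair a disk before ASM drops it.
alter diskgroup data set attribute 'disk_repair_time' = '8h';
```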

What if two disks fail?

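Imagine we have multiple disk cells, for example 7 disks forming one ASM disk group (normal redundancy), with each disk as its own failgroup, so that normal redundancy spreads 2 copies of the data across them. Suppose cell03 fails at 17:14 and, before disk_repair_time and failgroup_repair_time have expired, cell01 also fails at 17:18. Two disks are now lost before any rebalance has happened. If your files happen to live on cell06 and cell07, you may not lose any data, but what if you are not that lucky? Say cell03 is repaired at 18:00: its data is now at a different version, you cannot guarantee that everything is current, and the normal-redundancy disk group cannot be mounted.

If most of the files are still on the other 5 ASM disks and the database running on the ASM disk group is large (for example 10TB+), restoring the whole database could take several hours. If the failed disks hold only 1 tablespace or 1 file, recovery would be much faster. Can we mount the disk group first, assess the extent of the damage, and then restore less data? Starting with 12.1 (and some 11.2.0.4 versions), a new feature was introduced: mount restricted force for recovery.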

SQL> alter diskgroup data mount;
alter diskgroup data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "9" is missing from group number "1"
ORA-15042: ASM disk "1" is missing from group number "1"

-- cell03 (disk 9) has been repaired

SQL> alter diskgroup data mount;
alter diskgroup data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "1" in group "DATA" may result in a data loss

-- Decide to mount despite the inconsistency
SQL> alter diskgroup data mount restricted force for recovery;
Diskgroup altered.
SQL>

SQL> select NAME,FAILGROUP,LABEL,PATH from v$asm_disk order by FAILGROUP, label;
NAME                                     FAILGROUP                      LABEL                           PATH
---------------------------------------- ------------------------------ ------------------------------- ------------------------------------------------------------
CELLI01                                  CELLI01
CELLI02                                  CELLI02                        CELLI02                         ORCL:CELLI02
CELLI03                                  CELLI03
CELLI04                                  CELLI04                        CELLI04                         ORCL:CELLI04
CELLI05                                  CELLI05                        CELLI05                         ORCL:CELLI05
CELLI06                                  CELLI06                        CELLI06                         ORCL:CELLI06
CELLI07                                  CELLI07                        CELLI07                         ORCL:CELLI07
                                                                        CELLI03                         ORCL:CELLI03
10 rows selected.
SQL>

SQL> alter diskgroup data online disks in failgroup CELLI03;
Diskgroup altered.

SQL> select NAME,FAILGROUP,LABEL,PATH from v$asm_disk order by FAILGROUP, label;
NAME                                     FAILGROUP                      LABEL                           PATH
---------------------------------------- ------------------------------ ------------------------------- ------------------------------------------------------------
CELLI01                                  CELLI01
CELLI02                                  CELLI02                        CELLI02                         ORCL:CELLI02
CELLI03                                  CELLI03                        CELLI03                         ORCL:CELLI03
CELLI04                                  CELLI04                        CELLI04                         ORCL:CELLI04
CELLI05                                  CELLI05                        CELLI05                         ORCL:CELLI05
CELLI06                                  CELLI06                        CELLI06                         ORCL:CELLI06
CELLI07                                  CELLI07                        CELLI07                         ORCL:CELLI07

-- Rebalance is not allowed in restricted mode

-- Now dismount cleanly, then mount cleanly
SQL> alter diskgroup data dismount;
Diskgroup altered.

SQL> alter diskgroup data mount;
Diskgroup altered.

SQL> alter diskgroup DATA rebalance;
Diskgroup altered.

SQL> select NAME,FAILGROUP,LABEL,PATH from v$asm_disk order by FAILGROUP, label;
NAME                                     FAILGROUP                      LABEL                           PATH
---------------------------------------- ------------------------------ ------------------------------- ------------------------------------------------------------
CELLI01                                  CELLI01
CELLI02                                  CELLI02                        CELLI02                         ORCL:CELLI02
CELLI03                                  CELLI03                        CELLI03                         ORCL:CELLI03
CELLI04                                  CELLI04                        CELLI04                         ORCL:CELLI04
CELLI05                                  CELLI05                        CELLI05                         ORCL:CELLI05
CELLI06                                  CELLI06                        CELLI06                         ORCL:CELLI06
CELLI07                                  CELLI07                        CELLI07                         ORCL:CELLI07
9 rows selected.

After that you can validate the database, analyze the corrupt blocks, and perform data recovery at a finer granularity.
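One way to see what a validation run found is the V$DATABASE_BLOCK_CORRUPTION view, which RMAN populates when it validates datafiles. This is a sketch, assuming it is run in the database instance (not the ASM instance) after a validation such as RMAN's VALIDATE DATABASE has completed.

```sql
-- Corrupt block ranges reported by the last validation, with datafile names,
-- to help decide which files or tablespaces actually need to be restored.
select c.file#,
       f.name as datafile,
       c.block#,            -- first corrupt block in the range
       c.blocks,            -- number of corrupt blocks in the range
       c.corruption_type
from   v$database_block_corruption c
join   v$datafile f on f.file# = c.file#
order  by c.file#, c.block#;
```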

Remove cell01

SQL> ALTER DISKGROUP data DROP DISKS IN FAILGROUP CELLI01;
ALTER DISKGROUP data DROP DISKS IN FAILGROUP CELLI01
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15084: ASM disk "CELLI01" is offline and cannot be dropped.
SQL>
SQL> ALTER DISKGROUP data DROP DISKS IN FAILGROUP CELLI01 FORCE;
Diskgroup altered.

SQL> select NAME,FAILGROUP,LABEL,PATH from v$asm_disk order by FAILGROUP, label;
NAME                                     FAILGROUP                      LABEL                           PATH
---------------------------------------- ------------------------------ ------------------------------- ------------------------------------------------------------
_DROPPED_0001_DATA                       CELLI01
CELLI02                                  CELLI02                        CELLI02                         ORCL:CELLI02
CELLI03                                  CELLI03                        CELLI03                         ORCL:CELLI03
CELLI04                                  CELLI04                        CELLI04                         ORCL:CELLI04
CELLI05                                  CELLI05                        CELLI05                         ORCL:CELLI05
CELLI06                                  CELLI06                        CELLI06                         ORCL:CELLI06
CELLI07                                  CELLI07                        CELLI07                         ORCL:CELLI07

9 rows selected.

Once the rebalance completes, the _DROPPED_ record is cleaned up automatically.
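The rebalance that follows the forced drop can be watched from the ASM instance. A minimal sketch using V$ASM_OPERATION; if the query returns no rows, no operation is currently running.

```sql
-- Progress of long-running ASM operations such as a rebalance.
select group_number,
       operation,     -- REBAL for a rebalance
       state,         -- e.g. RUN, WAIT
       power,
       sofar,         -- work done so far
       est_work,      -- estimated total work
       est_minutes    -- estimated minutes remaining
from   v$asm_operation;
```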
— over —

References
https://www.fernandosimon.com/blog/asm-mount-restricted-force-for-recovery/