首页 » ORACLE 9i-23ai » 揭秘 Oracle 11gR2 RAC ASM Instance 启动

揭秘 Oracle 11gR2 RAC ASM Instance 启动

在ORACLE 11G RAC或使用ASM环境, 存储ASM instance spfile到ASM instance,听上去很奇怪,启动实例所需要的文件在自已的实例里,对于ASM和ASM spfile先有鸡还是先有蛋的问题,首先要解决的几个问题:
1,asm diskgroup 没mount,如何读spfile?
2,在oracle_home/dbs下没有参数文件,如何知道spfile在哪?
3,不知道spfile,怎么知道的asm_diskstring?

我在一个11203 2nodes linux rac做个实验

[grid@znode1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[grid@znode1 ~]$ crs_stat -t    
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.CRSDG.dg   ora....up.type ONLINE    ONLINE    znode1      
ora.DBDG.dg    ora....up.type ONLINE    ONLINE    znode1      
ora.FLRV.dg    ora....up.type ONLINE    ONLINE    znode1      
ora....ER.lsnr ora....er.type ONLINE    ONLINE    znode1      
ora....N1.lsnr ora....er.type ONLINE    ONLINE    znode1      
ora....N2.lsnr ora....er.type ONLINE    ONLINE    znode2      
ora....N3.lsnr ora....er.type ONLINE    ONLINE    znode2      
ora.asm        ora.asm.type   ONLINE    ONLINE    znode1      
ora.cvu        ora.cvu.type   ONLINE    ONLINE    znode2      
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE               
ora....network ora....rk.type ONLINE    ONLINE    znode1      
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    znode2      
ora.ons        ora.ons.type   ONLINE    ONLINE    znode1      
ora.rac.db     ora....se.type ONLINE    ONLINE    znode1      
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    znode1      
ora.scan2.vip  ora....ip.type ONLINE    ONLINE    znode2      
ora.scan3.vip  ora....ip.type ONLINE    ONLINE    znode2      
ora....SM1.asm application    ONLINE    ONLINE    znode1      
ora....E1.lsnr application    ONLINE    ONLINE    znode1      
ora.znode1.gsd application    OFFLINE   OFFLINE               
ora.znode1.ons application    ONLINE    ONLINE    znode1      
ora.znode1.vip ora....t1.type ONLINE    ONLINE    znode1      
ora....SM2.asm application    ONLINE    ONLINE    znode2      
ora....E2.lsnr application    ONLINE    ONLINE    znode2      
ora.znode2.gsd application    OFFLINE   OFFLINE               
ora.znode2.ons application    ONLINE    ONLINE    znode2      
ora.znode2.vip ora....t1.type ONLINE    ONLINE    znode2      
[grid@znode1 ~]$ srvctl stop database -d rac

[grid@znode1 ~]$ crsctl  stop resource  ora.FLRV.dg

[grid@znode1 ~]$ crsctl  stop resource  ora.DBDG.dg

[grid@znode1 ~]$ crsctl  stop resource  ora.CRSDG.dg

[grid@znode1 ~]$ crsctl  stop resource  ora.asm


[grid@znode1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Fri Jul 20 08:58:45 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

用strace命令跟踪一下进程,也可以用truss,dtrace等平台工具。

[session 2]
[root@znode1 ~]# ps -ef|grep oracle
grid     20642 20639  0 08:58 ?        00:00:00 oracle+ASM1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
root     20756 20709  0 09:00 pts/2    00:00:00 grep oracle

[root@znode1 ~]# ps -ef|grep 20639
grid     20639 20638  0 08:58 pts/3    00:00:00 sqlplus   as sysasm
grid     20946 20639  0 09:03 ?        00:00:00 oracle+ASM1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
root     21460 21401  0 09:09 pts/4    00:00:00 grep 20639

[root@znode1 ~]# strace -o asmstartup.tra -p 20642
Process 20642 attached - interrupt to quit

[session1]

SQL> startup nomount
ASM instance started

Total System Global Area  284565504 bytes
Fixed Size                  1344456 bytes
Variable Size             258055224 bytes
ASM Cache                  25165824 bytes
SQL> 


[session2 显示]
[root@znode1 ~]# strace -o asmstartup.tra -p 20642(不用再重新输入)
Process 20642 attached - interrupt to quit
Process 20642 detached
[root@znode1 ~]# 
[root@znode1 ~]# vi asmstartup.tra 

read(9, "\0001\0\0\6\0\0\0\0\0\3\212\6\376\377\377\377\0\1\0\0\376\377\377\377\376\377\377\377\0\0\0"..., 8208) = 49
mmap2(NULL, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6ab6000
times({tms_utime=0, tms_stime=3, tms_cutime=0, tms_cstime=0}) = 522261802
...省略
uname({sys="Linux", node="znode1", ...}) = 0
gettimeofday({1342746189, 667048}, NULL) = 0
getuid32()                              = 1100
getppid()                               = 20639
...省略
connect(6, {sa_family=AF_FILE, path="/var/tmp/.oracle/sOCSSD_LL_znode1_"...}, 110) = 0
recv(6, "0\0\0\0\1\0\1\1\1\0\1\0\1\0\0\0\0\0\0\0\1\0\0\0\254\305!\0\rq\0\0"..., 10240, 0) = 48
recv(6, 0x119c6768, 10240, 0)           = -1 EAGAIN (Resource temporarily unavailable)
recv(6, 0x119c6768, 10240, 0)           = -1 EAGAIN (Resource temporarily unavailable)
recv(6, 0x119c6768, 10240, 0)           = -1 EAGAIN (Resource temporarily unavailable)
recv(6, 0x119c6768, 10240, 0)           = -1 EAGAIN (Resource temporarily unavailable)
open("/u01/app/11.2.0/grid/auth/css/znode1/A6126639/9dc2315", O_WRONLY|O_CREAT|O_EXCL, 0644) = 13
fchmod(13, 0644)                        = 0
write(13, "k\205\235\177", 4)           = 4
close(13) 

TIP:从这段文本可以看到进程去连接OCSSD 验证自身身份,所以这也是为什么在启asm之前要先启css服务,如果连接失败自然ASM启动就会出错,如果成功就可以通过socket file /var/tmp/.oracle/sCRSD_UI_SOCK与OHASD通信,上篇也说过是OHASD负责OLR管理,所以就可以从OLR中得到SPFILE 路径信息和asm_diskstring,从接下来的输出可以看到

open("/etc/oracle/olr.loc", O_RDONLY)   = 13
fstat64(13, {st_mode=S_IFREG|0644, st_size=82, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6c77000
read(13, "olrconfig_loc=/u01/app/11.2.0/gr"..., 4096) = 82
read(13, "", 4096)                      = 0
close(13)                               = 0
munmap(0xb6c77000, 4096)                = 0
open("/etc/oracle/ocr.loc", O_RDONLY)   = 13
fstat64(13, {st_mode=S_IFREG|0644, st_size=38, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6c77000
read(13, "ocrconfig_loc=+crsdg\nlocal_only="..., 4096) = 38
read(13, "", 4096)                      = 0
close(13)                               = 0
...
stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=405, ...}) = 0
stat64("/u01/app/11.2.0/grid/gpnp/znode1/wallets/peer/cwallet.sso", {st_mode=S_IFREG|0700, st_size=6301, ...}) = 0
lstat64("/u01/app/11.2.0/grid/gpnp/znode1/wallets/peer/cwallet.sso", {st_mode=S_IFREG|0700, st_size=6301, ...}) = 0
open("/u01/app/11.2.0/grid/gpnp/znode1/wallets/peer/cwallet.sso", O_RDONLY|O_LARGEFILE) = 18

write(3, "kfspFileNameGet name=+CRSDG/rac/"..., 76) = 76
write(3, "\n", 1)                       = 1

times({tms_utime=2, tms_stime=19, tms_cutime=0, tms_cstime=0}) = 522261855
write(3, "kgfspb_shallow_discover dscstr=\""..., 33) = 33
write(3, "\n", 1)                       = 1
open("/dev/raw", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 21
fcntl64(21, F_SETFD, FD_CLOEXEC)        = 0
getdents64(21, /* 8 entries */, 32768)  = 192
getdents64(21, /* 0 entries */, 32768)  = 0
close(21)                               = 0
access("/dev/raw/raw4", R_OK|W_OK)      = 0
stat64("/dev/raw/raw4", {st_mode=S_IFCHR|0660, st_rdev=makedev(162, 4), ...}) = 0
open("/dev/raw/raw4", O_RDONLY|O_LARGEFILE) = 21

Tip:
正如人们所期望的那样,OHASD 从 OLR 读取此信息,它的路径,它来自 /etc/oracle/olr.loc。我想指出,对于 Oracle Restart 模式,在 Oracle RAC 环境信息中,它是正确的 存储在 GPNP 配置文件中,并且有维护和管理此配置文件信息的 GPNPD 进程

write(3, "kgfdp_query grp CRSDG", 21)   = 21
write(3, "\n", 1)                       = 1
times({tms_utime=2, tms_stime=20, tms_cutime=0, tms_cstime=0}) = 522261860
write(3, "kgfdp_create", 12)            = 12
write(3, "\n", 1)                       = 1
times({tms_utime=2, tms_stime=20, tms_cutime=0, tms_cstime=0}) = 522261861
write(3, "kgfdp_initCall ausize=1048576 bl"..., 61) = 61
write(3, "\n", 1)                       = 1
times({tms_utime=2, tms_stime=20, tms_cutime=0, tms_cstime=0}) = 522261861
write(3, "kgfdp_doQuery sync=2", 20)    = 20
write(3, "\n", 1)                       = 1
times({tms_utime=2, tms_stime=20, tms_cutime=0, tms_cstime=0}) = 522261861
write(3, "kgfdp_read", 10)              = 10
write(3, "\n", 1)                       = 1
times({tms_utime=2, tms_stime=20, tms_cutime=0, tms_cstime=0}) = 522261861
write(3, "kgfdp_readMeta totPst=6 blksz=40"..., 34) = 34
write(3, "\n", 1)                       = 1

TIP:
“kgfspb_shallow_discover dscstr=\”” 对应的是asm_diskstring,扫描ASM DISK的路径,如果是用ASMLIB或dev/raw/或/dev/oracleasm/disks/在linux环境中值为null,然后扫描了每个磁盘的文件头读取元数据,从文件头可以得到磁盘组信息、spfile、voting disk file 位置,分别是,kfdhdb.dskname,kfdhdb.spfile, kfdhdb.spfflg (first block and number of blocks)and kfdhdb.vfstart, kfdhdb.vfend (begin block and end block). 可以用$ORACLE_HOME/bin下的kfed工具读硬盘文件。

[grid@znode1 ~]$ kfed read /dev/raw/raw1|more

kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD  --Indicates that this is ASM disk header

kfdhdb.dsknum:                        0 ; 0x024: 0x0000           -- disk number
kfdhdb.grptyp:                        2 ; 0x026: KFDGTP_NORMAL    -- Indicates mirroring level, in my case it is NORMAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER    -- number of member 
kfdhdb.dskname:              CRSDG_0000 ; 0x028: length=10        -- disk name
kfdhdb.grpname:                   CRSDG ; 0x048: length=5         -- disk group name
kfdhdb.fgname:               CRSDG_0000 ; 0x068: length=10        -- To which failure group this disk belongs
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200           -- Disk sector size

kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000           -- Disk block size
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000       -- Disk au size
kfdhdb.mfact:                    113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize:                     964 ; 0x0c4: 0x000003c4       -- disk total size MB

kfdhdb.vfstart:                     256 ; 0x0ec: 0x00000100       -- Begin block address of voting disk file
kfdhdb.vfend:                       288 ; 0x0f0: 0x00000120       -- End block address of voting disk file
kfdhdb.spfile:                       59 ; 0x0f4: 0x0000003b       -- Begin block address of spfile
kfdhdb.spfflg:                        1 ; 0x0f8: 0x00000001       -- Number of blocks containing spfile

从X$KFFXP也可以查询到这个位置

X$KFFXP contains the physical allocation table for each ASM file, that is it contains the mapping between ASM files (identified by columns NUMBER_KFFXP and COMPOUND_KFFXP) and their location on disk is
Column Name       Description
ADDR              table address/identifier
INDX              row identifier
INST_ID           instance number (RAC)
NUMBER_KFFXP      ASM file number. Join with v$asm_file and v$asm_alias
COMPOUND_KFFXP    File identifier. Join with compound_index in v$asm_file
INCARN_KFFXP      File incarnation id. Join with incarnation in v$asm_file
PXN_KFFXP         Extent number per file
XNUM_KFFXP        Logical extent number per file (mirrored extents have the same value)
GROUP_KFFXP       ASM disk group number. Join with v$asm_disk and v$asm_diskgroup
DISK_KFFXP        Disk number where the extent is allocated. Join with v$asm_disk
AU_KFFXP          Relative position of the allocation unit from the beginning of the disk. The allocation unit size (1 MB) in v$asm_diskgroup
LXN_KFFXP         0,1 used to identify primary/mirror extent, 2 identifies file header allocation unit (hypothesis)
FLAGS_KFFXP       N.K.
CHK_KFFXP         N.K.

SQL> select NAME         "FILE NAME",              
  2                  AU_KFFXP     "AU NUMBER",     
  3                  NUMBER_KFFXP "FILE NUMBER",   
  4                  GROUP_Number,                 
  5                  DISK_KFFXP   "DISK NUMBER"    
  6             from x$kffxp, v$asm_alias          
  7            where GROUP_KFFXP = GROUP_NUMBER    
  8              and NUMBER_KFFXP = FILE_NUMBER    
  9       and lower(name) ='registry.253.787925627'
 10  ;

FILE NAME                                 AU NUMBER FILE NUMBER GROUP_NUMBER DISK NUMBER
---------------------------------------- ---------- ----------- ------------ -----------
REGISTRY.253.787925627                           59         253            1           0
REGISTRY.253.787925627                           59         253            1           2

SQL> select group_number,path,disk_number from v$asm_disk order by 1,3;

GROUP_NUMBER PATH                 DISK_NUMBER
------------ -------------------- -----------
           1 /dev/raw/raw1                  0
           1 /dev/raw/raw2                  1
           1 /dev/raw/raw3                  2
           2 /dev/raw/raw4                  0
           2 /dev/raw/raw5                  1
           3 /dev/raw/raw6                  0

6 rows selected.

SQL> select group_number,name from v$asm_diskgroup;

GROUP_NUMBER NAME
------------ ------------------------------
           1 CRSDG
           2 DBDG
           3 FLRV

Tip:
可以看出spfile可以在/dev/raw/raw1、/dev/raw/raw3 的59号AU上

[grid@znode1 ~]$ cat /etc/oracle/olr.loc
olrconfig_loc=/u01/app/11.2.0/grid/cdata/znode1.olr
crs_home=/u01/app/11.2.0/grid

strings /u01/app/11.2.0/grid/cdata/znode1.olr|grep  -I asmparameterfile
cat /u01/app/11.2.0/grid/gpnp/profiles/peer/profile.xml|grep -I asmparameterfile
gpnptool get

都可以得到下面的信息
/u01/app/11.2.0/grid/cdata/znode1.olr、OCR 和/u01/app/11.2.0/grid/gpnp/profiles/peer/profile.xml和spfile中是都同步的么?
做个实验alter system set asm_diskstring=’/dev/raw/raw*’,发现只有u01/app/11.2.0/grid/gpnp/profiles/peer/profile.xml的输出没有修改。

所以得出的结论是
OHASD–>OLR—–>GPNP profile(DiscoveryString)—>asm_disk物理路径—>扫描所有disk header,得到spfile物理位置—>直接路径读spfile—>启动ASM—>如有更新同步OLR

reference:
http://aychin.wordpress.com
http://www.askmaclean.com/

打赏

,

对不起,这篇文章暂时关闭评论。