Troubleshooting ORA-07445 [skgxpdmpmem()+22096]

Last month one of the nodes restarted. The database environment is a 2-node 10.2.0.5 RAC on HP-UX 11.31 ia64. Below is a brief record of what happened.

SQL>  select startup_time from gv$instance;
STARTUP_TIME
-------------------
2014-12-21 20:55:58
2013-08-05 23:21:21

# alert

Sun Dec 21 20:55:41 EAT 2014
Errors in file /opt/oracle/app/admin/anbob/bdump/anbob1_lms0_13499.trc:
ORA-07445: exception encountered: core dump [skgxpdmpmem()+22096] [SIGSEGV] [Address not mapped to object] [0x0002D0B8C] [] []
Sun Dec 21 20:55:43 EAT 2014
Trace dumping is performing id=[cdmp_20141221205543]
Sun Dec 21 20:55:45 EAT 2014
Errors in file /opt/oracle/app/admin/anbob/bdump/anbob1_pmon_13442.trc:
ORA-00484: LMS* process terminated with error
Sun Dec 21 20:55:45 EAT 2014
PMON: terminating instance due to error 484
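
The sequence in the alert log above (LMS0 hits ORA-07445, then PMON reports ORA-00484 and terminates the instance) can be pulled out of a large alert log automatically. The following is a minimal sketch, assuming the 10g plain-text alert log layout shown above, where a timestamp line precedes the messages it applies to. The names `ora_errors` and `SAMPLE` are hypothetical helpers introduced for illustration only.

```python
import re

# Assumption: 10g-style timestamp line, e.g. "Sun Dec 21 20:55:41 EAT 2014".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \S+ \d{4}$")
# Any ORA-nnnnn error code appearing in a message line.
ORA_RE = re.compile(r"\bORA-(\d{5})\b")


def ora_errors(lines):
    """Yield (timestamp, ora_code, message) for each line containing an ORA- code."""
    current_ts = None  # most recent timestamp line seen so far
    for raw in lines:
        line = raw.rstrip("\n")
        if TS_RE.match(line):
            current_ts = line
            continue
        match = ORA_RE.search(line)
        if match:
            yield current_ts, "ORA-" + match.group(1), line


# Excerpt copied from the alert log in this post.
SAMPLE = """\
Sun Dec 21 20:55:41 EAT 2014
Errors in file /opt/oracle/app/admin/anbob/bdump/anbob1_lms0_13499.trc:
ORA-07445: exception encountered: core dump [skgxpdmpmem()+22096] [SIGSEGV] [Address not mapped to object] [0x0002D0B8C] [] []
Sun Dec 21 20:55:43 EAT 2014
Trace dumping is performing id=[cdmp_20141221205543]
Sun Dec 21 20:55:45 EAT 2014
Errors in file /opt/oracle/app/admin/anbob/bdump/anbob1_pmon_13442.trc:
ORA-00484: LMS* process terminated with error
Sun Dec 21 20:55:45 EAT 2014
PMON: terminating instance due to error 484
"""

for ts, code, _msg in ora_errors(SAMPLE.splitlines()):
    print(ts, code)
```

The timestamp is kept as a string on purpose: time zone abbreviations such as `EAT` are not parsed reliably by `datetime.strptime`, and for ordering events within one log the raw text is enough.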

# anbob1_lms0_13499.trc

*** 2014-12-21 20:55:41.673
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x2d0b8c, PC: [0xc00000000bc88fa0, skgxpdmpmem()+22096]
  r1: 9ffffffffd7ef2e8       r20:                0       br5:                0
  r2: c00000000bc45bf0       r21:              21c       br6: c00000000042e670
  r3: 9fffffff5fb77c00       r22:             d08f       br7: c00000000bc6aff0
  r4:                0       r23:               64        ip: c00000000bc88fa0
  r5: c000000000000408       r24:             d088      iipa:                0
  r6: c0000000000443e0       r25: 9ffffffffce7a968       cfm:             4fa6
  r7: 9ffffffffd7f8de8       r26:              163        um:               1a
  r8: 9ffffffffc8ba078       r27: fffffffffffffffb       rsc:               1f
  r9:           2d0b8c       r28: ffffffffffff2f6d       bsp: 9ffffffffd8006a0
 r10: 9ffffffffd07bcf8       r29:             d08f  bspstore: 9ffffffffd8006a0
 r11: 9ffffffffd7ef530       r30:             d095      rnat:                0
 r12: 9fffffffffffaf70       r31: 9ffffffffd07bd70       ccv:                0
 r13: 9ffffffffd4554b0      NaTs:                0      unat:                0
 r14:                1       PRs:            28e97      fpsr:    9804c8a74433f
 r15: 600000000033ee18       br0: c00000000bc7b6c0       pfs: c000000000000b1d
 r16:             d08f       br1: c000000000294bc0        lc:                0
 r17:              1f8       br2:                0        ec:                0
 r18:             2b3c       br3:                0       isr: 9ffffffffd8006a0
 r19:                0       br4:                0       ifa:                0
Reason code: 0008
*** 2014-12-21 20:55:41.684
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [skgxpdmpmem()+22096] [SIGSEGV] [Address not mapped to object] [0x0002D0B8C] [] []
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+64          call     ksedst1()            000000001 ? 000000001 ?
ksedmp()+2176        call     ksedst()             000000001 ?
                                                   C000000000000D20 ?
ssexhd()+1264        call     ksedmp()             000000003 ?
                                                   6000000000230DA0 ?
                                                   60000000000C7420 ?
             call     ssexhd()             C0000002FF6101D8 ?
                                                   60000000000C9570 ?
skgxpdmpmem()+22096  call                  6000000000235200 ?
                                                   10000000B ?
                                                   6000000000235010 ?
skgxpmcpy()+50848    call     skgxpdmpmem()+22032  9FFFFFFFFD07BC08 ?
                                                   0002CE050 ?
                                                   60000000003136F0 ?
skgxppost()+29232    call     skgxpmcpy()+50752    9FFFFFFFFD07BC08 ?
                                                   60000000002CE050 ?
skgxppost()+4720     call     skgxppost()+28800    60000000002CE050 ?
                                                   9FFFFFFFFD7EF2E8 ?
                                                   60000000003136F0 ?
skgxpwait()+464      call     skgxppost()+1280     9FFFFFFFFFFFB830 ?
                                                   60000000002CEC80 ?
ksxpwait()+3296      call     skgxpwait()          9FFFFFFFFFFFB830 ?
                                                   60000000002CE050 ?
$cold_ksliwat()+148  call     ksxpwait()           00000001E ?
                                                   000000000 ?
kslwaitns_timed()+1  call     $cold_ksliwat()      000000003 ? 000000001 ?
12                                                 000000035 ? 000000000 ?
                                                   000000018 ? 000000000 ?
kskthbwt()+400       call     kslwaitns_timed()    000000003 ? 000000001 ?
                                                   000000035 ? 000000000 ?
                                                   000000018 ? 000000000 ?
                                                   000000000 ?
                                                   9FFFFFFFFFFFBB5C ?
kslwait()+640        call     kskthbwt()           000000003 ? 000000035 ?
                                                   000000000 ? 000000018 ?
                                                   000000000 ? 000000000 ?
                                                   00000000A ? 000000000 ?
ksxprcv()+944        call     kslwait()            000000003 ? 000000035 ?
                                                   000000000 ? 000000018 ?
                                                   000000000 ? 000000000 ?
kjctr_rksxp()+736    call     ksxprcv()            60000000000C6C98 ?
                                                   000000018 ?
                                                   9FFFFFFFFFFFC710 ?
kjctrcv()+448        call     kjctr_rksxp()        9FFFFFFFFD3BA408 ?
                                                   C0000002FEF88340 ?
kjcsrmg()+128        call     kjctrcv()            9FFFFFFFFD3BA408 ?
                                                   C0000002FEF88340 ?
kjmsm()+15152        call     kjcsrmg()            C0000002FCCF6591 ?
                                                   9FFFFFFFFFFFCBF4 ?
                                                   00002825D ?
ksbrdp()+2368        call     kjmsm()              9FFFFFFFFFFFD2B0 ?
                                                   9FFFFFFFFFFFCBF0 ?
                                                   000027119 ? 000000000 ?
opirip()+1184        call     ksbrdp()             9FFFFFFFFFFFD2C0 ?
                                                   60000000000BA268 ?
                                                   60000000000C6C98 ?
opidrv()+1184        call     opirip()             9FFFFFFFFFFFEC00 ?
                                                   000000004 ?
                                                   9FFFFFFFFFFFF220 ?
sou2o()+240          call     opidrv()             000000032 ?
                                                   60000000000C6C98 ?
                                                   9FFFFFFFFFFFF220 ?
opimai_real()+336    call     sou2o()              9FFFFFFFFFFFF240 ?
                                                   000000032 ? 000000004 ?
                                                   9FFFFFFFFFFFF220 ?
main()+240           call     opimai_real()        000000003 ? 000000000 ?
main_opd_entry()+80  call     main()               000000003 ?
                                                   9FFFFFFFFFFFF720 ?
                                                   60000000000BA268 ?
                                                   C000000000000004 ?
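
When searching MOS, it helps to reduce a call stack like the one above to a short chain of function names, since bug notes usually quote stacks in that form. The following is a minimal sketch, assuming each frame line starts with `name()+offset` followed by the word `call`, as in this trace. Continuation lines that carry only argument values, and frames with no location column, are skipped. `short_stack` and `TRACE` are hypothetical names used for illustration.

```python
import re

# A frame line begins with "name()+offset" and then the call type "call".
# A leading "$" is allowed for names such as "$cold_ksliwat".
FRAME_RE = re.compile(r"^(\$?\w+)\(\)\+\d+\s+call\b")


def short_stack(trace_lines):
    """Return the calling-function names in trace order (innermost first)."""
    names = []
    for line in trace_lines:
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


# A few frames copied from the LMS0 trace in this post.
TRACE = """\
ssexhd()+1264        call     ksedmp()             000000003 ?
                                                   6000000000230DA0 ?
skgxpdmpmem()+22096  call                  6000000000235200 ?
skgxpmcpy()+50848    call     skgxpdmpmem()+22032  9FFFFFFFFD07BC08 ?
skgxppost()+29232    call     skgxpmcpy()+50752    9FFFFFFFFD07BC08 ?
skgxpwait()+464      call     skgxppost()+1280     9FFFFFFFFFFFB830 ?
$cold_ksliwat()+148  call     ksxpwait()           00000001E ?
"""

print(" <- ".join(short_stack(TRACE.splitlines())))
```

A chain in this form can be compared directly with the stack quoted in a bug note. Function names that the trace wraps across two lines are still captured, because the name itself stays on the first line and only the offset digits wrap.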

#vi /opt/oracle/app/admin/anbob/bdump/anbob1_pmon_13442.trc

Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /opt/oracle/app/product/10.2.0/db_1
System name:    HP-UX
Node name:      anbob1
Release:        B.11.31
Version:        U
Machine:        ia64
Instance name: anbob1
Redo thread mounted by this instance: 1
Oracle process number: 2
Unix process pid: 13442, image: oracle@anbob1 (PMON)

*** 2014-12-21 20:55:45.367
*** SERVICE NAME:(SYS$BACKGROUND) 2014-12-21 20:55:45.366
*** SESSION ID:(885.1) 2014-12-21 20:55:45.366
Background process LMS0 found dead
Oracle pid = 7
OS pid (from detached process) = 13499
OS pid (from process state) = 13499
dtp = c000000100e11ec0, proc = c0000002fc552f00
error 484 detected in background process
ORA-00484: LMS* process terminated with error
ksuitm: waiting up to [5] seconds before killing DIAG(13491)

Searching the internal bugs on MOS turned up only one that closely resembles this case:
Bug 14196801 : ORA-7445 [SKGXPDMPMEM()+52481] INTERMI

----- SQL Statement (None)
----- Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
skdstdst <- ksedst1 <- ksedst <- dbkedDefDump <- ksedmp <- ssexhd <- <- skgxpdmpmem <- skgxpgetimd <- skgxppost <- skgxpvsnd <- ksxprcvimd <- kjctr_rksxp <- kjctrcv <- kjcsrmg <- kjmsm <- ksbrdp <- opirip <- opidrv <- sou2o <- opimai_real <- main <- main_opd_entry

WORKAROUND?
===========
No

RELATED ISSUES (bugs, forums, RFAs)
===================================
There isn't any bug with identical stack trace : skgxpdmpmem <- skgxpgetimd <- skgxppost
Other bugs with ora-7445 skgxpdmpmem related to LMS:
Bug 9029091: LNX64-10205-RAC: ORA-600: [KJBLREPLAY:DUP] / ORA-7445:[SKGXPDMPMEM()+11838], LMS
==> duplicate of Bug 8913462: AROLTP-D:LMS1 DIED WITH ORA-600 [KJBLREPLAY:DUP]
==> duplicate of Bug 6961928: ORA-600: INTERNAL ERROR CODE, ARGUMENTS: [KJBCLOSE:SH]
==> all of them show also ORA-600 kjbclose:sh
Bug 9097995: LNX64-10205-RAC: LMS HIT [SKGXPDMPMEM] AND [KJBLREPLAY:DUP],INSTANCE CRASHED
==> also duplicate of Bug 8913492
Bug 9009829: LNX64-10205-RAC: LMS HIT ORA-600 [KCLEXPANDLOCK_2], INSTANCE CRASHED
==> fixed on 10.2.0.5
Bug 8985365: LNX:ETL: ORA-7445 [SKGXPDMPMEM()+11838] [SIGSEGV] [UNKNOWN CODE] [0X000000000]
==> also fixed on 10.2.0.5

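In the end the SR did not identify a definite bug either. Support only said this case looks very similar to the one above. Because 10.2.0.5 is past its support window, the issue could not be escalated to development in the US, so the search was limited to known bugs. The cause is probably related to host resources, but OSWatcher (OSW) was not deployed on this machine, so some of the resource data could not be checked.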
