ORA-600 [3708] [910] internal error issue
Last week, while shutting down the database for a data center move, I ran into another ORA-600 problem. Recording it here.
Environment
OS version: CentOS 5
DB version: Oracle 10.2.0.4 single-instance
Physical Data Guard (remote site)
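What happened at the time
1. 23:30 The network administrator disconnected the external network. Note that from this point the remote Data Guard standby was already unreachable.
2. 00:22 Issued shutdown immediate
3. 00:37 ORA-600 [3708] appeared for the first time
4. Around 00:39, killed the Oracle processes at the OS level
5. Then removed the shared memory segments with ipcrm
6. Ran startup again, confirmed the database could be opened, then ran shutdown immediate; ORA-600 did not appear again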
SQL> shutdown immediate
ORA-00600: internal error code, arguments: [3708], [910], [], [], [], [], [], []
alert log file
#############################
Fri Aug 31 23:49:11 2012
Errors in file /oracle/admin/icme/bdump/icme_lns1_18491.trc:
ORA-03113: end-of-file on communication channel
Fri Aug 31 23:49:11 2012
LGWR: I/O error 3113 archiving log 2 to 'sdicme'
Sat Sep 1 00:22:11 2012
Starting background process EMN0
EMN0 started with pid=363, OS id=31070
Sat Sep 1 00:22:11 2012
Shutting down instance: further logons disabled
Sat Sep 1 00:22:11 2012
Stopping background process CJQ0
Sat Sep 1 00:22:11 2012
Stopping background process MMNL
Sat Sep 1 00:22:11 2012
Stopping background process MMON
Sat Sep 1 00:22:12 2012
Shutting down instance (immediate)
License high water mark = 1220
Sat Sep 1 00:22:12 2012
Stopping Job queue slave processes, flags = 7
Sat Sep 1 00:22:12 2012
Job queue slave processes stopped
All dispatchers and shared servers shutdown
Sat Sep 1 00:22:20 2012
ALTER DATABASE CLOSE NORMAL
Sat Sep 1 00:22:20 2012
SMON: disabling tx recovery
SMON: disabling cache recovery
Sat Sep 1 00:22:22 2012
Shutting down archive processes
Archiving is disabled
Sat Sep 1 00:22:27 2012
ARCH shutting down
ARC9: Archival stopped
Sat Sep 1 00:22:32 2012
ARCH shutting down
ARC8: Archival stopped
Sat Sep 1 00:22:37 2012
ARCH shutting down
ARC7: Archival stopped
Sat Sep 1 00:22:42 2012
ARCH shutting down
ARC6: Archival stopped
Sat Sep 1 00:22:47 2012
ARCH shutting down
ARC5: Archival stopped
Sat Sep 1 00:22:52 2012
ARCH shutting down
ARC4: Archival stopped
Sat Sep 1 00:22:57 2012
ARCH shutting down
ARC3: Archival stopped
Sat Sep 1 00:23:07 2012
ARCH shutting down
ARC1: Archival stopped
Sat Sep 1 00:23:12 2012
ARCH shutting down
ARC0: Archival stopped
Sat Sep 1 00:37:35 2012
Errors in file /oracle/admin/icme/bdump/icme_lgwr_29039.trc:
Sat Sep 1 00:37:39 2012
Errors in file /oracle/admin/icme/udump/icme_ora_31068.trc:
ORA-00600: internal error code, arguments: [3708], [910], [], [], [], [], [], []
Sat Sep 1 00:37:43 2012
CLOSE: Error 600 during database close
Sat Sep 1 00:37:43 2012
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Sep 1 00:37:43 2012
ORA-600 signalled during: ALTER DATABASE CLOSE NORMAL...
Sat Sep 1 00:38:52 2012
Thread 1 closed at log sequence 33821
Successful close of redo thread 1
Sat Sep 1 00:40:04 2012
Errors in file /oracle/admin/icme/bdump/icme_pmon_29029.trc:
ORA-00476: RECO process terminated with error
Sat Sep 1 00:40:04 2012
PMON: terminating instance due to error 476
Instance terminated by PMON, pid = 29029

trace file contents
##################################################
*** 2012-09-01 00:37:39.683
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [3708], [910], [], [], [], [], [], []
Current SQL statement for this session:
ALTER DATABASE CLOSE NORMAL
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FBFFF2500 ? 7FBFFF2560 ?
                                                   7FBFFF24A0 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FBFFF2500 ? 7FBFFF2560 ?
                                                   7FBFFF24A0 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FBFFF2500 ? 7FBFFF2560 ?
                                                   7FBFFF24A0 ? 000000000 ?
kgeriv()+176         call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FBFFF2500 ? 7FBFFF2560 ?
                                                   7FBFFF24A0 ? 000000000 ?
kgesiv()+119         call     kgeriv()             0066876E0 ? 006730B90 ?
                                                   000000000 ? 000000000 ?
                                                   7FBFFF24A0 ? 000000000 ?
ksesic1()+215        call     kgesiv()             0066876E0 ? 006730B90 ?
                                                   000000E7C ? 000000001 ?
                                                   7FBFFF3280 ? 000000000 ?
kcttsc()+695         call     ksesic1()            000000E7C ? 000000000 ?
                                                   00000038E ? 000000001 ?
                                                   000000000 ? 7FBFFF2F00 ?
kcfcld()+145         call     kcttsc()             000000003 ? 000000000 ?
                                                   000000003 ? 000000001 ?
                                                   000000000 ? FFFFFFFF000000BD ?
dbsclose()+498       call     kcfcld()             000000003 ? 000000000 ?
                                                   000000003 ? 000000001 ?
                                                   000000000 ? FFFFFFFF000000BD ?
adbdrv()+63033       call     dbsclose()           000000000 ? 000000000 ?
                                                   000000003 ? 000000001 ?
                                                   000000000 ? FFFFFFFF000000BD ?
opiexe()+13505       call     adbdrv()             000000000 ? 000000000 ?
                                                   2393BC048 ? 000000001 ?
                                                   000000000 ? FFFFFFFF000000BD ?
opiosq0()+3316       call     opiexe()             000000004 ? 000000000 ?
                                                   7FBFFFB0F8 ? 000000003 ?
                                                   000000000 ? FFFFFFFF000000BD ?
kpooprx()+315        call     opiosq0()            000000003 ? 00000000E ?
                                                   7FBFFFB268 ? 0000000A4 ?
                                                   000000000 ? FFFFFFFF000000BD ?
kpoal8()+799         call     kpooprx()            7FBFFFE414 ? 7FBFFFC440 ?
                                                   00000001B ? 000000001 ?
                                                   000000000 ? FFFFFFFF000000BD ?
opiodr()+984         call     kpoal8()             00000005E ? 000000017 ?
                                                   7FBFFFE410 ? 000000001 ?
                                                   000000001 ? FFFFFFFF000000BD ?
ttcpip()+1012        call     opiodr()             00000005E ? 000000017 ?
                                                   7FBFFFE410 ? 000000000 ?
                                                   0059B1290 ?
...

check archivelog info at that time
SQL> select * from (
       select dest_id,sequence#,first_time,next_time,creator,standby_dest
       from v$archived_log
       where sequence#>33818
       order by 2, 3
     ) where rownum<10;

   DEST_ID  SEQUENCE# FIRST_TIME          NEXT_TIME           CREATOR STA
---------- ---------- ------------------- ------------------- ------- ---
         2      33819 2012-08-31 11:04:47 2012-08-31 15:41:31 LGWR    YES
         1      33819 2012-08-31 11:04:47 2012-08-31 15:41:31 ARCH    NO
         1      33820 2012-08-31 15:41:31 2012-08-31 22:00:06 ARCH    NO
         2      33820 2012-08-31 15:41:31 2012-08-31 22:00:06 LGWR    YES
         1      33821 2012-08-31 22:00:06 2012-09-01 00:46:31 ARCH    NO
         1      33822 2012-09-01 00:46:31 2012-09-01 05:47:58 ARCH    NO
         1      33823 2012-09-01 05:47:58 2012-09-01 06:19:14 ARCH    NO
         1      33824 2012-09-01 06:19:14 2012-09-01 06:19:23 ARCH    NO
         1      33825 2012-09-01 06:19:23 2012-09-01 06:19:39 ARCH    NO

9 rows selected.
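The alert log above goes quiet between the last "Archival stopped" message and the first "Errors in file" entry. One quick way to spot that kind of stall is to scan the timestamp lines and report the longest silence. The following is only a minimal Python sketch: the helper names parse_ts and largest_gap are made up for illustration, and it assumes the 10g-style alert log layout where each timestamp sits on its own line.

```python
import re
from datetime import datetime

# Alert-log timestamp lines look like "Sat Sep 1 00:22:11 2012".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$")


def parse_ts(line):
    """Return a datetime for an alert-log timestamp line, or None otherwise."""
    line = line.strip()
    if not TS_RE.match(line):
        return None
    # Collapse repeated spaces so "Sep  1" and "Sep 1" both parse.
    return datetime.strptime(" ".join(line.split()), "%a %b %d %H:%M:%S %Y")


def largest_gap(lines):
    """Return (gap_seconds, before, after) for the longest silence in the log."""
    stamps = [ts for ts in map(parse_ts, lines) if ts is not None]
    best = (0.0, None, None)
    for prev, cur in zip(stamps, stamps[1:]):
        gap = (cur - prev).total_seconds()
        if gap > best[0]:
            best = (gap, prev, cur)
    return best


# A few lines copied from the alert log in this post.
sample = """Sat Sep 1 00:23:12 2012
ARCH shutting down
ARC0: Archival stopped
Sat Sep 1 00:37:35 2012
Errors in file /oracle/admin/icme/bdump/icme_lgwr_29039.trc:
Sat Sep 1 00:37:39 2012
""".splitlines()

gap, before, after = largest_gap(sample)
print(int(gap))  # 863
```

On these lines the longest silence is 863 seconds, between 00:23:12 and 00:37:35, which is the window in which the shutdown was hanging.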
cause:
Basically, this error is raised because LGWR timed out.
During shutdown, an Oracle routine (kcttsc()) sends a message to LGWR to change the state of a redo thread and waits for confirmation. If the return message never comes, LGWR times out after about 15 minutes, which is the second argument of the ORA-600 (910 seconds, i.e. roughly 15 minutes). In the situation we faced, LGWR held the RT enqueue and was waiting on 'direct path read' on file 197. Because of some OS or network issue, the read took more than 15 minutes and LGWR timed out.
This is bug 6512622.
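As a sanity check, the 910-second figure can be compared against the alert log timestamps. The small Python sketch below assumes the wait starts roughly when ALTER DATABASE CLOSE NORMAL is logged (00:22:20); that starting point is my assumption, not something the trace states, and the function name timeout_deadline is made up for illustration.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def timeout_deadline(start, seconds=910):
    """Return the moment a wait that began at `start` would time out."""
    return start + timedelta(seconds=seconds)


# Timestamps taken from the alert log in this post.
close_issued = datetime.strptime("2012-09-01 00:22:20", FMT)  # ALTER DATABASE CLOSE NORMAL
first_error = datetime.strptime("2012-09-01 00:37:35", FMT)   # first "Errors in file" (LGWR trace)

deadline = timeout_deadline(close_issued)
print(deadline.strftime(FMT))                    # 2012-09-01 00:37:30
print((first_error - deadline).total_seconds())  # 5.0
```

Under that assumption the expected timeout falls at 00:37:30, five seconds before the first error is logged, which fits the explanation that the second argument is a timeout in seconds.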
solution:
1. Apply the 10.2.0.5 patch set, where the bug is fixed
or
2. Apply one-off Patch 6512622, if it is available on My Oracle Support for your platform and Oracle version
or
3. Upgrade to 11.1.0.6, where the bug is fixed
I think the problem can also be avoided by improving I/O performance, or by shutting down the database before the network is disconnected.