ORA-600 [kdBlkCheckError][X],[X],[38504] and ORA-600[4194],[],[] in 11.2.0.4
Symptom:
The Oracle database crashes within a few minutes of being started. The environment is a single-instance 11.2.0.4 database on Linux, running on VMware 6. A dynamic disk allocation had been performed before the problem occurred. The alert log shows ORA-600 [kdBlkCheckError] and ORA-600 [4194] errors.
SUGGESTIONS:
ORA-600 [kdBlkCheckError]
Kernel Data Block Check Error. It is raised when a logically corrupted data block is detected. The usual cause is an Oracle bug or memory corruption.
ORA-600 [4194] [a] [b]
VERSIONS:
versions 6.0 to 10.1
DESCRIPTION:
A mismatch has been detected between Redo records and rollback (Undo) records.
We are validating the Undo record number relating to the change being applied against the maximum undo record number recorded in the undo block. This error is reported when the validation fails.
ARGUMENTS:
Arg [a] Maximum Undo record number in Undo block
Arg [b] Undo record number from Redo block
FUNCTIONALITY:
Kernel Transaction Undo called from Cache layer
Note: when ORA-600 [4194] is raised with arguments [a] and [b] both null, as in this case, the undo tablespace is probably corrupted. In rare cases (usually DBA error) the Oracle UNDO tablespace can become corrupted. This can also manifest with this error:
ORA-00376: file xx cannot be read at this time
Alert log
=======================
Errors in file /oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_3058.trc (incident=19319):
ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [224], [38504], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/orcl/orcl/incident/incdir_19319/orcl_smon_3058_i19319.trc
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Mon Sep 01 15:24:13 2014
QMNC started with pid=22, OS id=3091
Completed: alter database open
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 123, block 58 to scn 2959315
Recovery of Online Redo Log: Thread 1 Group 3 Seq 123 Reading mem 0
  Mem# 0: /oracle/oradata/orcl/redo03.log
Block recovery completed at rba 123.103.16, scn 0.2959316
Errors in file /oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_3058.trc:
ORA-01595: error freeing extent (3) of rollback segment (7))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [224], [38504], [], [], [], [], [], [], [], []
Starting background process CJQ0
Mon Sep 01 15:24:15 2014
CJQ0 started with pid=24, OS id=3105
Mon Sep 01 15:24:15 2014
Dumping diagnostic data in directory=[cdmp_20140901152415], requested by (instance=1, osid=3058 (SMON)), summary=[incident=19319].
Mon Sep 01 15:24:15 2014
Errors in file /oracle/diag/rdbms/orcl/orcl/trace/orcl_m000_3103.trc (incident=19399):
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/orcl/orcl/incident/incdir_19399/orcl_m000_3103_i19399.trc
Use ADRCI or Support Workbench to package the incident.
PMON trace
Flush retried for xcb 0xbcedf500, pmd 0xbffcf1c8
kti: Reconstructing undo block 0xc002e2 for xcb 0xbcedf500
Doing block recovery for file 3 block 738
Block header before block recovery:
buffer tsn: 2 rdba: 0x00c002e2 (3/738)
scn: 0x0000.00292004 seq: 0x01 flg: 0x04 tail: 0x20040201
frmt: 0x02 chkval: 0xa0c7 type: 0x02=KTU UNDO BLOCK
Resuming block recovery (PMON) for file 3 block 738
Block recovery from logseq 123, block 58 to scn 2959315
*** 2014-09-01 15:24:26.856
Recovery of Online Redo Log: Thread 1 Group 3 Seq 123 Reading mem 0
Block recovery completed at rba 123.103.16, scn 0.2959317
==== Redo read statistics for thread 1 ====
Total physical reads (from disk and memory): 362Kb
-- Redo read_disk statistics --
Read rate (ASYNC): 0Kb in 0.01s => 0.00 Mb/sec
-- Redo read_memory statistics --
Read disk 0Kb and read memory 362Kb, hit-ratio=1.00
Longest record: 1Kb, moves: 0/71 (0%)
Longest LWN: 6Kb, moves: 0/17 (0%), moved: 0Mb
Last redo scn: 0x0000.002d27d3 (2959315)
-------------------------------------------------------
IMU redo block change list
------------------------------------------------------
tsn 1 rdba 0x814d80 bh 0xa8ff4cf0 cv 0xbcfb8070
------------------------------------------------------
KTB Redo
op: 0x11 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: F xid: 0x0007.00d.0000052f uba: 0x00c002e2.02d5.17
Block cleanout record, scn: 0x0000.002d28d6 ver: 0x01 opt: 0x02, entries follow...
  itli: 1 flg: 2 scn: 0x0000.002d28cd
  itli: 2 flg: 2 scn: 0x0000.002d28d5
KDO Op code: DRP row dependencies Disabled
  xtype: XA flags: 0x00000000 bdba: 0x00814d80 hdba: 0x00800eba
itli: 1 ispac: 0 maxfr: 4858
tabn: 0 slot: 28(0x1c)
------------------------------------------------------
tsn 2 rdba 0xc000e0 bh 0xaafa81a8 cv 0xbcfb8178
------------------------------------------------------
ktudh redo: slt: 0x000d sqn: 0x0000052f flg: 0x0012 siz: 304 fbi: 0
            uba: 0x00c002e2.02d5.17 pxid: 0x0000.000.00000000
------------------------------------------------------
tsn 1 rdba 0x800ec5 bh 0xaafac378 cv 0xbcfb8248
------------------------------------------------------
index redo (kdxlde): delete leaf row
KTB Redo
op: 0x01 ver: 0x01
compat bit: 4 (post-11) padding: 1
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
kjzduptcctx: Notifying DIAG for crash event
----- Abridged Call Stack Trace -----
ksedsts()+465<-kjzdssdmp()+267<-kjzduptcctx()+232<-kjzdicrshnfy()+63<-ksuitm()+5570<-ksbrdp()+3507<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
----- End of Abridged Call Stack Trace -----
SMON trace
Block Checking: DBA = 12583136, Block Type = System Managed Segment Header Block
ERROR: SMU Segment Header Corrupted. Error Code = 38504
ktu4smck: SCN commited txn list is not sorted.
previous txn slot=25, scn=0x0000.00291da4
offending txn slot=20, scn=0x0000.00291517
TRN CTL:: seq: 0x02d5 chd: 0x0005 ctl: 0x0005 inc: 0x00000000 nfb: 0x0001
          mgc: 0xb000 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
          uba: 0x00c002e2.02d5.17 scn: 0x0000.00291c12
Version: 0x01
FREE BLOCK POOL::
  uba: 0x00000000.02d5.16 ext: 0x2 spc: 0x1496
  uba: 0x00c002e0.02d5.0b ext: 0x2 spc: 0xb74
  uba: 0x00000000.02ab.1d ext: 0x7 spc: 0x8cc
  uba: 0x00000000.0225.01 ext: 0x2 spc: 0x1f84
  uba: 0x00000000.0000.00 ext: 0x0 spc: 0x0
TRN TBL::
index state cflags wrap#  uel    scn             dba        parent-xid          nub        bcl        cmt
-----------------------------------------------------------------------------------------
0x00  9     0x00   0x052d 0x0008 0x0000.00291641 0x00c00121 0x0000.000.00000000 0x00000001 0x00000000 1405993807
0x01  9     0x00   0x052a 0x0021 0x0000.00291b21 0x00c0012b 0x0000.000.00000000 0x00000001 0x00000000 1405985479
0x02  9     0x00   0x0511 0x0007 0x0000.00291d56 0x00c0012d 0x0000.000.00000000 0x00000003 0x00000000 1406001110
0x03  9     0x00   0x052e 0x000b 0x0000.0029187c 0x00c00127 0x0000.000.00000000 0x00000001 0x00000000 1405995007
...
0x0d  10    0x00   0x052f 0x0002 0x0000.002d27b1 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 0
0x0e  9     0x00   0x052d 0xffdd 0x0000.002915f2 0x00c008de 0x0000.000.00000000 0x00000001 0x00000000 1405987108
0x0f  9     0x00   0x052e 0x0002 0x0000.0029175e 0x00c00125 0x0000.000.00000000 0x00000003 0x00000000 1405994408
EXT TRN CTL::
usn: 7
sp1:0x00000000 sp2:0x00000000 sp3:0x00000000 sp4:0x00000000
sp5:0x00000000 sp6:0x00000000 sp7:0x00000000 sp8:0x00000000
TYP:0 CLS:29 AFN:3 DBA:0x00c000e0 OBJ:4294967295 SCN:0x0000.00292161 SEQ:1 OP:5.2 ENC:0 RBL:0
ktudh redo: slt: 0x000d sqn: 0x0000052f flg: 0x0411 siz: 80 fbi: 0
            uba: 0x00c002e2.02d5.17 pxid: 0x0000.000.00000000
...
Disk Block image:
buffer rdba: 0x00c000e0
scn: 0x0000.00292161 seq: 0x01 flg: 0x04 tail: 0x21612601
frmt: 0x02 chkval: 0xcc65 type: 0x26=KTU SMU HEADER BLOCK
...
SMON: following errors trapped and ignored:
ORA-01595: error freeing extent (3) of rollback segment (7))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [224], [38504], [], [], [], [], [], [], [], []
Tip:
The trace file content above has been truncated.
CLS ==> Block class. Classes above 16 are reserved for undo segments, and the block class depends on the undo segment number. Each undo segment has two block classes: one for the undo segment header and one for the undo segment blocks. Class 29 is therefore the undo header of undo segment 7.
AFN ==> Absolute File Number
DBA:0x00c000e0 ==> datafile 3, block 224
OP ==> Redo operation. At the start of a transaction the redo operation is 5.2.
In the above example the class (CLS) is 29, so we can determine that this transaction is using undo segment number 7. We can also see that the slot number (slt) is 0x000d and the sequence number (sqn) is 0x0000052f.
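The decoding above can be reproduced with a little arithmetic. The following is a minimal sketch, not Oracle code: it assumes the commonly described layout in which a 32-bit DBA holds the file number in the top 10 bits and the block number in the low 22 bits, and in which undo segment n uses block class 15 + 2n for its header and 16 + 2n for its undo blocks. Both assumptions agree with the values in this trace. The function names are made up for illustration.

```python
def decode_dba(dba):
    """Split a 32-bit data block address into (file number, block number).

    Assumes 10 bits of file number followed by 22 bits of block number.
    """
    return dba >> 22, dba & 0x3FFFFF


def undo_class_to_usn(cls):
    """Map an undo block class to (undo segment number, kind of block).

    Assumes header class = 15 + 2 * usn and undo block class = 16 + 2 * usn.
    """
    if cls < 15:
        raise ValueError("class %d is not an undo block class" % cls)
    usn, remainder = divmod(cls - 15, 2)
    return usn, ("undo header" if remainder == 0 else "undo block")


print(decode_dba(0x00c000e0))   # segment header from the SMON trace: (3, 224)
print(decode_dba(0x00c002e2))   # undo block from the PMON trace: (3, 738)
print(undo_class_to_usn(29))    # (7, 'undo header')
```

The first result matches the arguments [3], [224] reported with kdBlkCheckError, and the second matches "file 3 block 738" in the PMON trace.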
For this transaction, the XID will be
XID: 0x0007.00d.0000052f
# The xid is 8 bytes, composed of the undo segment number, the transaction table slot in the undo segment header, and the sequence number (wrap#).
# The uba is 8 bytes, composed of the DBA of the undo block, the sequence number, and the record number in the block.
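As a small illustration of these two layouts, the sketch below builds the XID string from the usn, slt and sqn values seen in the trace, and splits the UBA string back into its parts. The helper names are hypothetical, and the file/block split of the DBA assumes the usual 10-bit/22-bit layout.

```python
def format_xid(usn, slot, sqn):
    """Render an XID the way the trace prints it: usn.slot.sequence in hex."""
    return "0x%04x.%03x.%08x" % (usn, slot, sqn)


def parse_uba(uba):
    """Split a UBA string 'dba.seq.rec' (all hex) into its components."""
    dba, seq, rec = (int(part, 16) for part in uba.split("."))
    return {
        "file": dba >> 22,          # assumed: top 10 bits of the DBA
        "block": dba & 0x3FFFFF,    # assumed: low 22 bits of the DBA
        "seq": seq,
        "rec": rec,
    }


# usn 7 (from CLS 29), slt 0x000d and sqn 0x0000052f from the ktudh redo record
print(format_xid(7, 0x000d, 0x0000052f))   # 0x0007.00d.0000052f
print(parse_uba("0x00c002e2.02d5.17"))
```

The parsed UBA points at file 3, block 738, which is the undo block PMON tries to reconstruct in its trace.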
Transaction recovery (the process of rolling back transactions) is performed:
- By the shadow process when a rollback statement is issued
- By PMON when a session (process) crashes with a transaction in progress
- By SMON or a shadow process on opening a database that crashed with active transactions
If the RDBMS instance crashes before the transaction is committed, there is no time to roll back the transaction. The next time the database is opened, crash recovery rolls the database forward, returning it to its pre-crash state. It is then the responsibility of transaction recovery to remove any incomplete transactions.
Transaction recovery at database open:
- Active transactions in the SYSTEM rollback segment are immediately rolled back
- Active transactions in other rollback segments are marked as “dead”
- At a later time, SMON scans the segments again and performs a rollback on dead transactions
This is especially useful when downtime must be kept to an absolute minimum. A side effect of this behavior is that databases now open, in most cases, even if a rollback segment is corrupt, with errors logged to the alert log and the SMON trace file. This makes it much easier to diagnose the failure, because you have access to the database. However, customers could easily run for some time without realizing that they have a problem. Note that if the server is unable to read the rollback segment header, because the data file is offline or corrupted, you still cannot open the database. All rollback segments found in undo$, not just those specified with the rollback_segments parameter, are checked for active transactions when the database is opened.
We can also find dead transactions by dumping the rollback segment header and checking the state column of the transaction table dump. To dump the rollback segment header, use the following command:
SQL> alter system dump undo header '<rollback segment name>';
An active transaction is identified by state = 10.
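Scanning a long transaction table by eye is error-prone, so here is a rough sketch of picking out the state 10 rows from the TRN TBL section of a dump. It assumes the rows have already been read from the trace file as text lines in the column order shown in this post (index, state, cflags, wrap#, ...). The function name is hypothetical, and the sample rows are copied from the SMON trace above.

```python
def active_slots(trn_tbl_lines):
    """Return (slot index, wrap#) for every row whose state column is 10."""
    active = []
    for line in trn_tbl_lines:
        fields = line.split()
        # Data rows start with a hex slot index such as 0x0d;
        # this skips the column header, rulers and "..." markers.
        if len(fields) < 4 or not fields[0].startswith("0x"):
            continue
        index = int(fields[0], 16)
        state = int(fields[1])
        wrap = int(fields[3], 16)
        if state == 10:
            active.append((index, wrap))
    return active


sample = [
    "index state cflags wrap# uel scn dba parent-xid nub bcl cmt",
    "0x03 9 0x00 0x052e 0x000b 0x0000.0029187c 0x00c00127 0x0000.000.00000000 0x00000001 0x00000000 1405995007",
    "0x0d 10 0x00 0x052f 0x0002 0x0000.002d27b1 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 0",
]
print(active_slots(sample))   # [(13, 1327)], i.e. slot 0x0d, wrap# 0x052f
```

Slot 0x0d with wrap# 0x052f is the same transaction as XID 0x0007.00d.0000052f in the PMON trace.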
Take an example:
#session 1
SQL> delete tt;   --no commit

#session 2
sys@ANBOB>select xidusn,xidslot,start_scnb,status from v$transaction;

              XIDUSN              XIDSLOT           START_SCNB STATUS
-------------------- -------------------- -------------------- ----------------
                   2                   22            242359942 ACTIVE

sys@ANBOB>select usn,extents,status,curblk from v$rollstat where XACTS>0;

                 USN              EXTENTS STATUS                        CURBLK
-------------------- -------------------- --------------- --------------------
                   2                    3 ONLINE                             2

sys@ANBOB>select * from v$rollname;

                 USN NAME
-------------------- ------------------------------
                   0 SYSTEM
                   1 _SYSSMU1_1240252155$
                   2 _SYSSMU2_111974964$
                   3 _SYSSMU3_4004931649$
                   4 _SYSSMU4_1126976075$
                   5 _SYSSMU5_4011504098$
                   6 _SYSSMU6_3654194381$
                   7 _SYSSMU7_4222772309$
                   8 _SYSSMU8_3612859353$
                   9 _SYSSMU9_3945653786$
                  10 _SYSSMU10_3271578125$

sys@ANBOB>alter system dump undo header '_SYSSMU2_111974964$';
sys@ANBOB>select * from v$diag_info;
trace file
======================================
Version: 0x01
FREE BLOCK POOL::
  uba: 0x00000000.1bf4.16 ext: 0x1 spc: 0x1496
  uba: 0x00000000.1bf4.02 ext: 0x1 spc: 0x1f06
  uba: 0x00000000.1bf3.06 ext: 0x0 spc: 0x136a
  uba: 0x00000000.19f6.42 ext: 0x2 spc: 0x9c2
  uba: 0x00000000.081c.04 ext: 0x34 spc: 0x1dae
TRN TBL::
index state cflags wrap#  uel    scn             dba        parent-xid          nub        stmt_num   cmt
------------------------------------------------------------------------------------------------
0x00  9     0x00   0x610a 0x000a 0x0008.0e72174d 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1409648053
0x01  9     0x00   0x6112 0x001c 0x0008.0e7219a0 0x00c00096 0x0000.000.00000000 0x00000001 0x00000000 1409648434
0x02  9     0x00   0x6115 0x0018 0x0008.0e721ca1 0x00c00089 0x0000.000.00000000 0x00000001 0x00000000 1409649254
0x03  9     0x00   0x610d 0x0019 0x0008.0e72198c 0x00c00096 0x0000.000.00000000 0x00000001 0x00000000 1409648434
... (truncated)
0x12  9     0x00   0x6110 0x001e 0x0008.0e721dec 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1409649794
0x13  9     0x00   0x6108 0x0020 0x0008.0e721b8c 0x00c00088 0x0000.000.00000000 0x00000001 0x00000000 1409649033
0x14  9     0x00   0x60fe 0x001b 0x0008.0e72190d 0x00c00095 0x0000.000.00000000 0x00000001 0x00000000 1409648433
0x15  9     0x00   0x610c 0x0009 0x0008.0e721ba5 0x00c00088 0x0000.000.00000000 0x00000001 0x00000000 1409649033
0x16  10    0x80   0x6108 0x0001 0x0008.0e721e86 0x00c0008a 0x0000.000.00000000 0x00000001 0x00000000 0
0x17  9     0x00   0x610b 0x0010 0x0008.0e72189c 0x00c00aed 0x0000.000.00000000 0x00000001 0x00000000 1409648413
0x18  9     0x00   0x60fb 0x0012 0x0008.0e721cac 0x00c00089 0x0000.000.00000000 0x00000001 0x00000000 1409649254

Using the _offline_rollback_segments or _corrupted_rollback_segments parameters changes the behavior of the RDBMS when:
- Opening the database
- Performing consistent read and delayed block cleanout
- Dropping a rollback segment
When opening a database, any rollback segments listed in _offline or _corrupted parameters:
- Are not scanned, and any active transactions are neither marked as dead nor rolled back
- Appear offline in dba_rollback_segs (undo$)
- Cannot be acquired by the instance for new transactions
Solution:
In cases of UNDO log corruption, you must:
- Change the undo_management parameter from “AUTO” to “MANUAL”
- Drop the old UNDO tablespace
- Create a new UNDO tablespace
- Change the undo_management parameter from “MANUAL” to “AUTO”
If the database can be opened, refer to How to Change the Existing Undo Tablespace to a New Undo Tablespace (Doc ID 431652.1).
If the database cannot be opened, or stays open only for a short time:
1. Identify the bad segment:
select segment_name,status from dba_rollback_segs where tablespace_name='xxx' and status = 'NEEDS RECOVERY';
or
#strings system01.dbf | grep _SYSSMU | cut -d $ -f 1 | sort -u
2. Bounce the instance with the hidden parameter “_offline_rollback_segments” or “_corrupted_rollback_segments” in init.ora (or using spfile), specifying the bad segment name:
*.undo_management='MANUAL'
*.undo_tablespace='SYSTEM'
*._offline_rollback_segments=('_SYSSMU7_1394480367$','xxx')
Tip: in this case only rollback segment 7 needs to be listed.
Normally you cannot drop a rollback segment if it contains active transactions. You can circumvent this by using these parameters, but if you drop an _offline or _corrupted rollback segment that contains an active transaction, you risk logical corruption, possibly in the data dictionary.
3. Bounce database, nuke the corrupt segment and tablespace:
startup pfile=''
drop rollback segment "_SYSSMU7_1394480367$";
drop tablespace UNDOTBS1 including contents and datafiles;
4. Create a new undo tablespace and set it as the default undo tablespace, restore UNDO management to AUTO in the pfile, and restart the instance.
5. Export the full database, drop and recreate the database, then import (recommended).
Always make sure to change your database back into a supported state by solving the problems, removing the special settings in your parameter file, shutting the instance down, and performing a normal startup. Although the database may seem to run smoothly, certain corruption problems can come back even after a long time, potentially causing a lot more problems than they did in the first place.
NOTE:
When using these undocumented parameters, the transaction table is not read when the database is opened, so transactions are not marked as dead or rolled back. The database is in an unsupported state.
Undocumented Parameters: More Effects On CR and Cleanout
If an open ITL is found to be associated with an _offline segment, the segment is read to find the transaction status:
- If committed, the block is cleaned out
- If active and you want to read the block, a CR copy is constructed using undo from the segment
- If active and you want to lock the row, undesirable behavior may result
If an open ITL is found to be associated with a _corrupted segment, the segment is not read to find the transaction status:
- It is as if the rollback segment had been dropped; the transaction is assumed to be committed and delayed block cleanout is performed
- If the transaction was not committed, logical corruption will occur
Reference MOS and DSI