Performance tuning ‘wait for a undo record’ event

前几日一个数据库的负载非常高,查看数据库的活动会话发现大部分session在等待’wait for a undo record’的事件, 该事件通常发生在fast-start parallel rollback, 该特性通常是在SMON进程发现存在一个长时间的事务需要回滚,或遇到PARALLE DML需要回滚时,超过一定量的回滚任务时自动启动多个server process的并行回滚

Troubleshooting Instance crash after ora-7445[opiaba],ora-600 [kgh_heap_sizes:ds], ora-600 [17147]

ORA-00600: 内部错误代码, 参数: [kgh_heap_sizes:ds], [0x700000E82B30018], [], [], [], [], [], [], [], [], [], []
ORA-07445: 出现异常错误: 核心转储 [opiaba()+788] [SIGSEGV] [ADDR:0xF00000E161B7CD2] [PC:0x10662AFD4] [Address not mapped to object] []

Listener no register service& INTERMEDIATE status with “Not All Endpoints Registered” in 11gR2 RAC

是一套11GR2 的RAC 环境, CRSCTL CHECK CRS检查CRS 服务已无法通讯,当时也让他查询了crsd.bin 进程确认不存在了, 当时通知重启CRS便可以解决,但是后来通知客户端依旧有个节点无法连接,检查LISTNER 并没有注册任何SERVICE,而且当时也只监听在PUBLIC IP, 检查DB PARAMETER LOCAL_LISTENER 是绑定VIP,

,

Troubleshooting ora-7445 [kkqteParseSqlForPG()+1840] in 11g r2, Table expansion transformation

Exception [type: SIGSEGV, Invalid permissions for mapped object] [ADDR:0x63] [PC:0x107C00D30, kkqteParseSqlForPG()+1840] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_22283288.trc (incident=253347):
ORA-07445: 出现异常错误: 核心转储 [kkqteParseSqlForPG()+1840] [SIGSEGV] [ADDR:0x63] [PC:0x107C00D30] [Invalid permissions for mapped object] []

Troubleshooting ora-07445 [__lwp_kill()+48] [SIGIOT] error and instance crash

Exception [type: SIGIOT, unknown code] [ADDR:0x6CA9] [PC:0xC0000000003125F0, __lwp_kill()+48] [exception issued by pid: 27817, uid: 1024] [flags: 0x0, count: 1]
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_lms3_27817.trc (incident=704134):
ORA-07445: exception encountered: core dump [__lwp_kill()+48] [SIGIOT] [ADDR:0x6CA9] [PC:0xC0000000003125F0] [unknown code] []

,

After OS reboot, Ohasd(cssd) start fail is due to OLR corrupted

前几天帮助同事处理了个案例, 主机意外重启后数据库无法启动, 环境是11.2.0.3 standalone o […]

数据去哪了?现实版 (partition data invalid)

前几天有业务部门反应有个表的数据带上条件查询不出来,不带数据则可以,表没有做特殊处理,11.2.0.3 RAC […]

Troubleshooting Instance crash when modify db_cache_size, ora-600 [kmgs_pre_process_request_6]

ORA-00600: internal error code, arguments: [kmgs_pre_process_request_6], [6], [895], [0], [3], [0x459C1F3D8], [], []
Mon Dec 22 22:40:43 2014
MMAN: terminating instance due to error 822
Instance terminated by MMAN, pid = 31205

Troubleshooting ORA-12012&ORA-29283&ORA-06512 issue

ORA-12012: error on auto execute of job “ORACLE_OCM”.”MGMT_CONFIG_JOB_2_2″
ORA-29283: invalid file operation
ORA-06512: at “SYS.UTL_FILE”, line 536
ORA-29283: invalid file operation
ORA-06512: at “ORACLE_OCM.MGMT_DB_LL_METRICS”, line 2436
ORA-06512: at line 1

Oracle Database History

In 1977 Larry Ellison saw a IBM Journal of Research and Development, discovered a research paper that described a working prototype for a relational database management system (RDBMS). Showing it to coworkers Bob Miner and Ed Oates at Ampex .

How to release still “killed“ status session in v$session?(释放killed的session)

最近在一套生产库上发现了几个已经killed的会话一直保持在v$session 会话中,会话是user type的连接,而且已经killed了很久,通过SPID 发现操作系统层面已不存在该进程, 下面是我多次尝试后最终释放,

How to find out who caused the database user locked(ora-1017 or ORA-28000)(捕捉登录失败)

昨天有应用反应server日志中有如下错误,提示用户locked,但是重启了下web server又恢复了正常,但是这期间我们也没有人为的给db user unlock, 下面记录一下, 其实事情的经过是这样子的…

Corrupted free block & ORA-19566 when using rman backup after restore DB

RMAN-00569: ========= ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: =====================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 03/10/2015 12:18:24
ORA-19566: exceeded limit of 0 corrupt blocks for file E:\ORACLE2\PRODUCT\10.1.0\ORADATA\ORCL\USERS01.DBF

Opatch auto PSU for RAC on hpux ia

to Opatch 11.2.0.4.5 PSU for RAC on hpux ia-31

kjfspseudorcfg and kjxgrrcfgchk some reason #

kjxgrrcfgchk: Initiating reconfig, reason=3 #######<<<<<<<<< kjxgrrcfgchk: COMM rcfg - Disk Vote Required kjfmReceiverHealthCB_CheckAll: Recievers are healthy.

,

Online Redefinition Partition Existing Table, ora-600 [kkzuord_copycolcomcb.2.exec] and ORA-23539

从9i起可以重定义表结构可以在线,对于在线重定义的好处很多站点都有这里不再叙述,原理也是利用了mview及mview log 的低层操作, 满足对于7*24 小时业务的在线调整, 但是需要增加原大小一倍的空间存放临时数据

, ,

Troubleshooting TNS-12547&TNS-12560 AIX error 32: Broken pipe caused by tcp socket leak

上周出现个蹊跷案例,最近一直在忙今天简单的记录一下, 中间件反馈数据库连接时失败,在数据库使用lsnrctl status 查看监听状态会发现Listener一会儿正常,一会儿报错,但是在Listener正常时可以看出listener的start date并没有restart过

Tuning: latch: cache buffers chains 又一案例

CBC latch竞争的原因很多,通常也可以理解为一种热块, 对于CBC 通常都是从session wait中找到child latch address 然后再去x$bh的hladdr字段找到相应的obj,1个BH handle可以会关连多个obj, 再参考TCH 列确认比较hot的对象。大多数CBC的OLTP系统多数应该注意一下sql 的执行计划中使用了NL join的方式;

Troubleshooting ora-12519 or ora-12516 ,listener service ‘blocked’, and wait event ‘latch: ges resource hash

通常当出现ora-12519 or 12516时都是因为数据库进程数超过了数据库参数processes 或 sessions 时, 并且通常在db alert 中出现ora-20 or ora-18 的错误信息,如果当时查看监听服务状态使用lsnrctl service 会发现service 当时是”blocked”状态

How to list all db links in oracle DB to generate a flat file? (生成dblink列表文件)

自己整理了个简单的SHELL 去收集LOCAL 的所有DB LINKS,功能是如果DB LINK创建使用的是简单方式(没有配置TNSNAMES.ORA)直接取IP:PORT, 或如果使用TNSNAME Alias Name调用TNSPING 转换成IP, 同时还会判断tnsping ip port 里否通?