Troubleshooting Slower I/O Performance on Veritas VxFS with 11.2.0.4 Compared to 10gR2 on Raw Devices after an RMAN Migration
A quick case note. A database on the same storage was migrated from Oracle 10.2.0.4 on raw devices to an Oracle 11.2.0.4 RAC on Veritas VxFS volumes, with ODM (Oracle Disk Manager) in use; in theory ODM is a raw device in disguise, supporting asynchronous I/O and avoiding double buffering. The user's most direct observation was that after the migration via RMAN backup sets, the database was many times slower than before: a count(*) on a 6 GB table went from under 1 minute to more than 10 minutes, and an RMAN backup of the same-sized database went from about 2 hours to well over 10 hours, with very high CPU usage during the backup and vmstat showing high CPU wait.
SQL ID: 8h17yarx41rhx Plan Hash: 2304749046

select count(*) from TAB_ANBOB

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     36.22     824.27     857822     857990          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     36.23     824.28     857822     857990          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=857990 pr=857822 pw=0 time=824274020 us)
  42799324   42799324   42799324   TABLE ACCESS FULL TAB_ANBOB (cr=857990 pr=857822 pw=0 time=809921707 us cost=233014 size=0 card=42815870)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  library cache lock                              1        0.00          0.00
  row cache lock                                  2        0.00          0.00
  library cache pin                               1        0.00          0.00
  SQL*Net message to client                       2        0.00          0.00
  Disk file operations I/O                        8        0.00          0.00
  db file sequential read                         4        0.01          0.02
  db file scattered read                       6713        0.30        793.87
  gc cr multi block request                    8388        0.05          8.79
  gc cr grant 2-way                               3        0.00          0.00
  latch: gc element                               5        0.00          0.00
  SQL*Net message from client                     2       53.55         53.55
Note:
A SQL trace of the count(*) shows the time is spent almost entirely on multi-block scattered reads (db file scattered read); performance is indeed far from acceptable.
IOStat by Function summary

Function Name | Reads: Data | Reqs per sec | Data per sec | Writes: Data | Reqs per sec | Data per sec | Waits: Count | Avg Tm(ms) |
---|---|---|---|---|---|---|---|---|
RMAN | 775.7G | 73.63 | 36.781M | 699.4G | 33.17 | 33.163M | 1456 | 2.61 |
Buffer Cache Reads | 1.7G | 4.21 | .083M | 0M | 0.00 | 0M | 47.4K | 10.20 |
Others | 1.1G | 3.35 | .052M | 277M | 0.82 | .013M | 72.4K | 0.07 |
DBWR | 0M | 0.00 | 0M | 1.1G | 2.99 | .051M | 0 | |
LGWR | 0M | 0.00 | 0M | 893M | 21.15 | .041M | 0 | |
Direct Writes | 0M | 0.00 | 0M | 6M | 0.00 | 0M | 0 | |
Direct Reads | 1M | 0.01 | 0M | 0M | 0.00 | 0M | 0 | |
TOTAL: | 778.5G | 81.20 | 36.915M | 701.6G | 58.13 | 33.269M | 121.2K | 4.06 |
Note:
The RMAN backup likewise sustains only about 30 MB+/s of throughput, yet, as with the count(*), single-read I/O latency is around 3 ms (SAS), which shows the storage response time itself is still decent.
Common causes for such a large performance gap:
1. If the backup goes to a tape library, the path may have changed from the original LAN-free configuration to going over the LAN, hitting a network bottleneck.
2. Too much VxFS fragmentation.
Block size
The file system block size cannot be changed once the file system has been created. Recommended file system block sizes for Oracle (assuming your Oracle database block size is 8 KB or larger) are:
File System | Block Size |
---|---|
Oracle software and dump/diagnostic directories | 1KB |
Redo log directory | 512 bytes for Solaris, AIX, Windows, Linux and 1KB for HP-UX |
Archived log directory | 1KB |
Control files directory | 8KB (control files block size is 16KB starting with Oracle 10g) |
Data, index, undo, system/sysaux and temporary directories | 8KB |
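Because the block size is fixed at creation, it must be chosen when the file system is built. A minimal sketch, assuming a Solaris-style mkfs and an illustrative disk group/volume name (on Linux the type is given as -t vxfs, as in the fsadm example below):

# mkfs -F vxfs -o bsize=8192 /dev/vx/rdsk/oradg/datavol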
You can check the control file block size with (Linux RedHat 5.5 and Oracle 11.2.0.3):
SQL> select cfbsz from x$kcccf;

     CFBSZ
----------
     16384
     16384
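The database block size itself, which drives the 8 KB recommendation for the data file systems above, can be checked the same way:

SQL> show parameter db_block_size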
File extents
As with Oracle table extents, you can change the default extent allocation policy and/or preallocate space for a file:
# getext system01.dbf
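As a hedged sketch of preallocation with the VxFS setext command (the counts are in file system blocks and purely illustrative): reserve space up front and request a fixed extent size so the file grows in large contiguous chunks, then re-check the policy with getext:

# setext -e 2048 -r 1048576 -f contig system01.dbf
# getext system01.dbf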
File system fragmentation
To display fragmentation, issue:
# /opt/VRTS/bin/fsadm -t vxfs -D /oracle/data
# /opt/VRTS/bin/df -o s /ora_prisma/data/
# /opt/VRTS/bin/fsmap -A $datafile
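Since each fsmap output row corresponds to one extent (see the root-cause analysis below), a quick hedged sketch for ranking datafiles by fragmentation (paths illustrative; the count includes fsmap's header line, so treat it as approximate):

for f in /ora_prisma/data/*.dbf
do
    echo "$f: $(/opt/VRTS/bin/fsmap -A "$f" | wc -l) extents"
done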
3. Double buffering.
Sun Solaris using a Veritas File system
Oracle import takes longer on buffered VxFS than on unbuffered VxFS. Quick I/O allows regular files built on VxFS to be accessed as a raw device, bypassing normal file system buffering and allowing direct I/O. When Quick I/O is NOT enabled, the convosync and mincache mount options alter the caching behavior of a VxFS file system; mounting with convosync=direct and mincache=direct avoids double buffering: data goes to disk synchronously, bypassing the page cache, and metadata updates are synchronous. These options are normally used for datafiles and redo logs. __fdsync() system calls in the trace indicate the problem is OS-related; setting mincache=direct in the Veritas file system mount options solves the performance problem.
mount -F vxfs -o mincache=direct /dev/vx/dsk/testdg/testvol /test
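To make this persist across reboots, the same options belong in the mount table; a Solaris /etc/vfstab sketch reusing the testdg/testvol names above:

/dev/vx/dsk/testdg/testvol  /dev/vx/rdsk/testdg/testvol  /test  vxfs  2  yes  mincache=direct,convosync=direct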
Mount options
Suggested mount options for Oracle databases:
File System | Normal Mount Options (VxFS) | Advanced Mount Options (VxFS) |
---|---|---|
Oracle software and dump/diagnostic directories | delaylog,datainlog,nolargefiles | delaylog,nodatainlog,nolargefiles |
Redo log directory | delaylog,datainlog,largefiles | delaylog,nodatainlog,convosync=direct,mincache=direct,largefiles |
Archived log directory | delaylog,datainlog,nolargefiles | delaylog,nodatainlog,convosync=direct,mincache=direct,nolargefiles |
Control files directory | delaylog,datainlog,nolargefiles | delaylog,datainlog,nolargefiles |
Data, index, undo, system/sysaux and temporary directories | delaylog,datainlog,largefiles | delaylog,nodatainlog,convosync=direct,mincache=direct,largefiles |
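Putting the advanced row for data file systems into practice, a hedged one-liner (device name illustrative, mount point reusing /oracle/data from earlier):

# mount -F vxfs -o delaylog,nodatainlog,convosync=direct,mincache=direct,largefiles /dev/vx/dsk/oradg/datavol /oracle/data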
4. Veritas (ODM) library bug.
LGWR appears to encounter regular problems where it hangs for 3 seconds at a time; during this time, iostat -x on the redo devices shows zero I/O being pushed through. Upgrading the Veritas library (which includes ODM) from 5.0 to 5.1 resolved the LGWR issue.
Intermittent Long ‘log file sync’ Waits on Veritas ODM, LGWR Posting Long Write Times, I/O Portion of Wait Minimal (Doc ID 1278149.1)
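To confirm which ODM library an instance is actually linked against, a hedged sketch (the library file name and alert log wording vary by platform and Oracle version; on 11g the Veritas ODM library is typically symlinked over libodm11.so):

# ls -l $ORACLE_HOME/lib/libodm11.so
# grep -i "running with ODM" alert_${ORACLE_SID}.log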
Root cause
Back to this case. Since both the SQL query and an RMAN backup to local VxFS were slow, the first possibility (the backup network) was ruled out. Further analysis found massive fragmentation in the VxFS file system: fsmap on one 8 GB datafile showed about 170,000 fragments (as with Oracle table extents, each output row represents one fragment). With that many fragments, every operation on the file must retrieve each extent's information from the file system metadata, and metadata operations at that volume inevitably burn a great deal of CPU and time; this is also why the RMAN backup showed CPU wait.
On inspection, the reason these datafiles carry such an enormous number of extents is how they were created: RMAN wrote them in parallel, so multiple datafiles were being appended to simultaneously, with no contiguous space preallocated for any of them. The data therefore did not land on the underlying media as one contiguous whole; new extents were allocated constantly, producing a huge number of extents and ultimately the file I/O performance problem.
Solution
You can either run Veritas' file system defragmentation, or rebuild the datafiles on the file system.
Defragmentation method (note: it can run online; the argument is the file system mount point; it takes quite a while):
# nohup /opt/VRTS/bin/fsadm -EeDd /acct/oradata01/ &
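To gauge the result, the extent fragmentation report alone (the -E flag, without the reorganizing -e/-d passes) can be run before and after; a hedged sketch reusing the invocation style from earlier:

# /opt/VRTS/bin/fsadm -t vxfs -E /acct/oradata01/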