Oracle insert values 差异Exadata X5 和Exadata X8 分析思路
最近有个客户的某业务其中一条insert into txxxx values(seq.nextval, xx, xx ,xx);的SQL从Oracle Exadata X5迁移到X8上以后,sql性能有几百ms的差异,因该SQL是应用APP界面中登录写日志的一个SQL, 比较影响体验的一个关键SQL, 希望找到其中的原因,是否其它SQL存在相同的问题?
sqlstat对比
================x8 ====================
Avg Avg Plan Snapshot Avg LIO Avg PIO CPU (secs) Elapsed (secs) IO(secs) Clus(secs) APP (secs) Conc(secs) Celoff(secs) Hash Value Time INSTANCE_NUMBER Execs LOADS Per Exec Per Exec Per Exec Per Exec Per Exec Per Exec Per Exec Per Exec Per Exec ---------- ------------ --------------- -------- ---------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- 906344217 10-JUL 15:00 1 34 0 15.97 0.00 0.00 0.04 0.00 0.04 0.00 0.00 0.00 906344217 10-JUL 16:00 1 26 0 17.54 0.00 0.00 0.02 0.00 0.02 0.00 0.00 0.00 906344217 10-JUL 16:00 2 22 2 16.55 0.00 0.00 0.03 0.00 0.03 0.00 0.00 0.00 906344217 10-JUL 17:00 2 30 0 15.83 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.00 906344217 10-JUL 17:00 1 26 0 16.85 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.00 906344217 10-JUL 18:00 2 35 2 18.00 0.00 0.00 0.02 0.00 0.01 0.00 0.00 0.00 906344217 10-JUL 18:00 1 37 0 16.05 0.00 0.00 0.02 0.00 0.02 0.00 0.00 0.00 906344217 10-JUL 19:00 2 24 2 18.54 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 906344217 10-JUL 19:00 1 13 0 16.46 0.00 0.00 0.03 0.00 0.03 0.00 0.00 0.00 906344217 10-JUL 20:00 1 15 0 15.80 0.00 0.00 0.03 0.00 0.03 0.00 0.00 0.00 906344217 10-JUL 21:00 2 12 2 16.00 0.00 0.00 0.04 0.00 0.04 0.00 0.00 0.00 906344217 11-JUL 05:00 1 5 0 17.20 0.00 0.00 0.04 0.00 0.04 0.00 0.00 0.00 906344217 11-JUL 07:00 1 5 0 23.40 0.00 0.00 0.05 0.00 0.05 0.00 0.00 0.00 906344217 11-JUL 08:00 2 31 2 16.42 0.00 0.00 0.02 0.00 0.02 0.00 0.00 0.00 906344217 11-JUL 08:00 1 35 0 17.66 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.00 906344217 11-JUL 09:00 2 28 0 16.75 0.00 0.00 0.03 0.00 0.03 0.00 0.00 0.00 906344217 11-JUL 09:00 1 59 0 17.10 0.00 0.00 0.02 0.00 0.02 0.00 0.00 0.00 906344217 11-JUL 10:00 2 38 0 16.37 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.00 906344217 11-JUL 11:00 2 41 2 16.15 0.00 0.00 0.02 0.00 0.02 0.00 0.00 0.00 906344217 11-JUL 11:00 1 48 0 16.73 0.00 0.00 0.03 0.00 0.03 0.00 0.00 0.00 906344217 11-JUL 12:00 1 32 0 16.19 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.00 906344217 11-JUL 15:00 2 29 0 16.34 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.00 906344217 11-JUL 15:00 1 25 0 16.56 0.00 0.00 0.07 0.00 0.07 0.00 0.00 0.00 118 rows selected.
================x5 ====================
Avg Plan Snapshot Avg LIO Avg PIO CPU (secs) Elapsed (secs) IO(secs) Clus(secs) APP (secs) Conc(secs) Celoff(secs) Hash Value Time INSTANCE_NUMBER Execs LOADS Per Exec Per Exec Per Exec Per Exec Per Exec Per Exec Per Exec Per Exec Per Exec ---------- ------------ --------------- -------- ---------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- 906344217 03-JUL 12:00 1 2,578 0 27.11 1.33 0.00 28.61 0.05 0.05 0.00 27.42 0.00 906344217 03-JUL 12:00 2 2,751 0 75.30 3.21 0.00 23.98 0.10 0.02 0.00 20.68 0.00 906344217 04-JUL 20:00 2 487 0 16.37 2.24 0.00 0.44 0.01 0.00 0.00 0.00 0.00 906344217 06-JUL 09:00 1 2,551 95 22.66 4.08 0.00 2.17 0.09 0.00 0.00 1.48 0.00 906344217 06-JUL 10:04 1 2,674 0 22.80 4.24 0.00 1.68 0.23 0.01 0.00 1.25 0.00 906344217 07-JUL 07:00 2 555 1 15.99 2.50 0.00 0.35 0.00 0.00 0.00 0.00 0.00 906344217 09-JUL 08:00 2 1,145 0 16.39 2.47 0.00 0.28 0.00 0.00 0.00 0.00 0.00 906344217 09-JUL 10:00 1 764 4 17.22 2.87 0.00 0.33 0.00 0.00 0.00 0.00 0.00 906344217 09-JUL 11:00 1 961 0 17.16 2.70 0.00 0.27 0.01 0.01 0.00 0.00 0.00 906344217 09-JUL 12:00 2 800 0 17.02 2.82 0.00 0.37 0.00 0.00 0.00 0.00 0.00 906344217 09-JUL 13:00 1 352 0 16.76 3.16 0.00 0.52 0.00 0.00 0.00 0.01 0.00 906344217 09-JUL 13:00 2 893 1 17.27 2.97 0.00 0.28 0.00 0.00 0.00 0.00 0.00 906344217 09-JUL 14:00 1 572 1 17.66 3.00 0.00 0.42 0.00 0.00 0.00 0.00 0.00 906344217 09-JUL 15:00 2 1,127 0 16.63 2.84 0.00 0.24 0.01 0.00 0.00 0.00 0.00 906344217 09-JUL 16:00 1 333 0 16.40 3.07 0.00 0.79 0.00 0.00 0.00 0.00 0.00 906344217 09-JUL 16:00 2 1,364 0 17.17 2.99 0.00 0.20 0.01 0.00 0.00 0.00 0.00 906344217 09-JUL 17:00 1 651 0 17.92 2.80 0.00 0.37 0.00 0.00 0.00 0.00 0.00 906344217 10-JUL 07:00 2 374 0 17.33 2.95 0.00 0.60 0.00 0.00 0.00 0.00 0.00 906344217 10-JUL 07:00 1 456 0 16.55 2.75 0.00 0.43 0.00 0.00 0.00 0.00 0.00 906344217 10-JUL 08:00 2 709 0 16.98 3.17 0.00 0.37 0.01 0.00 0.00 0.00 0.00 906344217 10-JUL 13:00 2 1,019 0 16.85 3.43 0.00 0.24 0.00 0.00 0.00 0.00 0.00 906344217 11-JUL 07:00 2 520 0 17.15 3.01 0.00 0.51 0.00 0.00 0.00 0.00 0.00 22 rows selected. SQL>
Note:
x8上SQL执行时间比X5快了ms一个量级, X8每次执行是0.0N ms, 而X5是0.N ms, 虽然执行量不大,相差ms级也不大,但客户如果希望找到这其中的原因呢?并没有使用批量insert, 该SQL只是业务应用登录中的1次调用。同时在X5上显示每次执行还有3个物理IO read.
分析思路
硬件差异SSD or HDD?
DB/OS CPU/IO负载
收集SQLHC报告
表的创建DDL
表上索引及其他附属对象如trigger
数据库特性参数不同(db_lost_write_protect、db_block_checking、db_block_checksum等)
AUDIT(FGA)
DB version(11g和19c 这点是确认的)
对比SQL做sql trace(10046)
对比session的statistics(可借助tom’s runstats)
物理I/O 差异(db cache size, table cached), 通常table seg block +index block read
在Oracle heap table通常insert不会互相影响,通常是UK,PK约束和sequence争用(enq: sq ,row cache lock), X5的前2个snap 的降速就是exec并发高时concurrency 发生在sequence,因为sequence 为nocache,还是RAC环境2节点部署的应用。分析该问题最好是sql trace结合session statistics找原因。
样例:
# session 1 SQL> set serveroutput on SQL> set timing on SQL> @i USERNAME INST_NAME HOST_NAME I# SID SERIAL# VERSION STARTED SPID OPID CPID SADDR PADDR -------------------- -------------------- ------------------------- --- ----- -------- ---------- -------- ---------- ----- --------------- ---------------- ---------------- ANBOB PDB1-anbob19c oel7db1 1 35 17322 19.0.0.0.0 20230712 21834 34 21832 00000000780D5CF8 0000000078CF94A8 Elapsed: 00:00:00.34 ## enable sql trace SQL> alter session set tracefile_identifier=weejar; Session altered. SQL> exec dbms_session.session_trace_enable(true, false); PL/SQL procedure successfully completed. # session 2 create table sn1 as select ses.sid, sn.name, ses.value from v$sesstat ses, v$statname sn where sn.statistic# = ses.statistic# and ses.sid in (35) and lower(sn.name) like lower('%%') and value>0; # session 1 ## do insert declare TYPE T_HOUSE_T_ARRAY IS TABLE OF VARCHAR2(36); L_DATA T_HOUSE_T_ARRAY; CURSOR c IS SELECT sys_guid() from dual connect by level <= 50000 order by dbms_random.value; BEGIN for i in 1..100000 loop begin INSERT INTO house_t (HOUSEID, HOUSEGUID, AOGUID) VALUES (sys_guid(), sys_guid(), sys_guid()); commit; end; end loop; END; 16 / PL/SQL procedure successfully completed. # session 2 create table sn2 as select ses.sid, sn.name, ses.value from v$sesstat ses, v$statname sn where sn.statistic# = ses.statistic# and ses.sid in (35) and lower(sn.name) like lower('%%') 14 and value>0; Table created. # session 1 ## disable sql trace SQL> exec dbms_session.session_trace_disable; PL/SQL procedure successfully completed.
分析SQL TRACE
Parsing user id: SYS (recursive depth: 2) ******************************************************************************** SQL ID: 806prp55vvg4p Plan Hash: 0 INSERT INTO HOUSE_T (HOUSEID, HOUSEGUID, AOGUID) VALUES (SYS_GUID(), SYS_GUID(), SYS_GUID()) call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 100000 9.43 13.81 6 52176 1177174 100000 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 100001 9.43 13.81 6 52176 1177174 100000 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 106 (recursive depth: 1) Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 0 0 0 LOAD TABLE CONVENTIONAL HOUSE_T (cr=0 pr=3 pw=0 time=26493 us starts=1) Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ db file sequential read 6 0.04 0.11 log file switch (private strand flush incomplete) 2 0.16 0.20 log buffer space 5 1.30 2.68 ********************************************************************************
分析Session statistics
select sn2.name,sn2.value-nvl(sn1.value,0) diff, sn2.value s2v,sn1.value s1v from sn2 left join sn1 on sn2.name=sn1.name where sn2.value<> nvl(sn1.value,0) order by 1; SQL> select sn2.name,sn2.value-nvl(sn1.value,0) diff, sn2.value s2v,sn1.value s1v from sn2 left join sn1 on sn2.name=sn1.name where sn2.value<> nvl(sn1.value,0) and sn2.name like '%split%' order by 1; NAME DIFF S2V S1V ---------------------------------------------------------------- ---------- ---------- ---------- branch node splits 5 5 leaf node 90-10 splits 1646 1646 leaf node splits 1646 1646 root node splits 3 3 SQL> select sn2.name,sn2.value-nvl(sn1.value,0) diff, sn2.value s2v,sn1.value s1v from sn2 left join sn1 on sn2.name=sn1.name where sn2.value<> nvl(sn1.value,0) and sn2.name like '%redo%' order by 1; NAME DIFF S2V S1V ---------------------------------------------------------------- ---------- ---------- ---------- redo buffer allocation retries 21 23 2 redo entries 259339 266490 7151 redo entries for lost write detection 47 425 378 redo log space requests 5 5 redo log space wait time 21 21 redo size 221612560 236190448 14577888 redo size for lost write detection 3704 35448 31744 redo synch time 1 6 5 redo synch time (usec) 8282 57222 48940 redo synch time overhead (usec) 119 450 331 redo synch time overhead count ( 2ms) 1 8 7 redo synch writes 1 8 7 redo write info find 1 8 7 13 rows selected.
数据库参数对比
可以创建dblink,直接对比所有参数(含隐藏参数),或者导出临时表,判断是否在2台数据库环境因为某特性变化产生了问题。 比如在某一套环境发现有redo size for lost write detection ,而另一实例没有,从参数我们能对比出db_lost_write_protect并不一样,之前记录过一个案例《Troubleshooting Select 产生Redo分析案例》
对于XD可能存在一些优化产生性能的提升,同时X5上是Oracle 11G r2 而X8上是oracle 19C,硬件与软件版本都不一致,但希望可以找找其中的差异, 目前因为缺少数据,尚未定位最终原因,仅记录分析方法。
对不起,这篇文章暂时关闭评论。