Troubleshooting DB session spin call java function
一天客户问有个SQL执行中好像是hang了,环境oracle 19c RAC,当前的session对应的v$session.event是‘gc current grant 2-way ’, 之前运行都是秒级,现在已经跑了1个小时还未结束,通常看到这个event会怀疑是GC出了问题,然后就走错了路, 下面看一下这个案例。
SQL> select username,sid,sql_id,event,seconds_in_wait,state,last_call_et from v$session where sql_id='0ax014ztkh0r4'; USERNAME SID SQL_ID EVENT SECONDS_IN_WAIT STATE LAST_CALL_ET ---------- ---------- ------------- ------------------------------------------------------------ --------------- ------------------- -------------- ANBOB_U1 1651 0ax014ztkh0r4 gc current grant 2-way 3834 WAITED SHORT TIME 3836 SQL> select username,sid,sql_id,event,seconds_in_wait,state,last_call_et,blocking_instance,blocking_session from v$session where sql_id='0ax014ztkh0r4'; USERNAME SID SQL_ID EVENT SECONDS_IN_WAIT STATE LAST_CALL_ET BLOCKING_INSTANCE BLOCKING_SESSION ---------- ---------- ------------- ------------------------------------------------------------ --------------- ------------------- -------------- ----------------- ---------------- ANBOB_U1 1651 0ax014ztkh0r4 gc current grant 2-way 3898 WAITED SHORT TIME 3900
v$session 和v$session_wait的event是表示不只是当前的wait event,还有可能是上一个(刚刚完成的)event, 如何判断是否当前正在等待呢?可以查看当state列如果是”WAITING”时. 也可以参考WAIT_TIME列,当值为0时表示会话当前正在等待,同时SECONDS_IN_WAIT列是当前等待所花费的时间,从11g以后不再建议使用此列,用另外两列代替,如wait_time_micro表示等待的时间(以微秒为单位),如果会话当前正在等待中,则此列的值为当前等待中花费的时间,如果会话当前不在等待中,则该值为上一次等待中等待的时间。而对于另一列TIME_REMAINING_MICRO 是估算等待的剩余时间,如果为null 表示当前会话未等待。(不过我觉的这列仅供参考)
上面这个会话当前session.state 是’WAITED SHORT TIME’, 所以该event并不是当前会话的正在等待的event, 如果使用我的ase.sql查看活动会话,会显示ON cpu.
下面分析SQL 使用sql monitor 利器
SQL> select DBMS_SQLTUNE.REPORT_SQL_MONITOR(sql_id=>'&sql_id',report_level=>'ALL',type=>'text') from dual; Enter value for sql_id: 0ax014ztkh0r4 DBMS_SQLTUNE.REPORT_SQL_MONITOR(SQL_ID=>'0AX014ZTKH0R4',REPORT_LEVEL=>'ALL',TYPE=>'TEXT') -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SQL Monitoring Report SQL Text ------------------------------ UPDATE ANBOB_ENCRYPT_20210315 A SET PASSWORD_NEW = JAVA_ENCRY_F(A.ATTR_VALUE) WHERE PROD_INST_ID_1 = :B1 Global Information ------------------------------ Status : EXECUTING Instance ID : 1 Session : ANBOB_U1 (1651:36742) SQL ID : 0ax014ztkh0r4 SQL Execution ID : 16806606 Execution Started : 03/16/2021 14:52:26 First Refresh Time : 03/16/2021 14:52:32 Last Refresh Time : 03/16/2021 15:53:50 Duration : 3685s Module/Action : SQL*Plus/- Service : xxxxxxxxxxx Program : sqlplus@xxx-11fb-g06-sev-r730-02u22 (TNS V1-V PLSQL Entry Ids (Object/Subprogram) : 1081201,1 PLSQL Current Ids (Object/Subprogram) : 1081201,1 Binds ======================================================================================================================== | Name | Position | Type | Value | ======================================================================================================================== | :B1 | 1 | NUMBER | 925559560 | ======================================================================================================================== Global Stats =================================================== | Elapsed | Cpu | Java | Other | Buffer | | Time(s) | Time(s) | Time(s) | Waits(s) | Gets | =================================================== | 3686 | 3681 | 3686 | 4.59 | 5 | #buffer get 5,cpu time 3681s,java time 3686s =================================================== SQL Plan Monitoring Details (Plan Hash Value=3801314702) =============================================================================================================================================== | Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Activity | Activity Detail | | | | | (Estim) | | Active(s) | Active | | (Actual) | (%) | (# samples) | =============================================================================================================================================== | 0 | UPDATE STATEMENT | | | | 1 | +6 | 1 | 0 | | | | 1 | UPDATE | ANBOB_ENCRYPT_20210315 | | | 3686 | +1 | 1 | 0 | 100.00 | Cpu (3672) | | -> 2 | INDEX RANGE SCAN | ANBOB_ENCRYPT_20210315_1 | 1 | 1 | 3683 | +6 | 1 | 1 | | | ===============================================================================================================================================
Note:
session 确实已经running了1个多小时,更新1行,buffer gets 5, 时间都记入了Java TIME.
看session call stack
SQL> oradebug short_stack ksedsts()+426<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-PassWord__encode()+1753<-sjoninvk_jit()+1041<-joevm_joe_run_jit_somersault()+458<-joe_invoke()+1631<-joet_aux_thread_main()+1674<-seoa_note_stack_outside()+34<-joet_thread_main()+64<-sjontlo_initialize()+178<-joe_enter_vm()+1197<-ioei_call_java()+4716<-ioesub_CALL_JAVA()+569<-seoa_note_stack_outside()+34<-ioe_call_java()+292<-jox_invoke_java_()+4133<-kkxmjexe()+1493<-kgmexcb()+56<-kkxmswu()+91<-kgmexwi()+1011<-kgmexec()+1452<-evapls()+1251<-evaopn2()+737<-upderh()+1252<-upduaw()+193<-kdusru()+617<-kauupd()+356<-updrow()+1693<-qerupUpdRow()+725<-qerupFetch()+644<-updaul()+1416<-updThreePhaseExe()+340<-updexe()+443<-opiexe()+11783<-opipls()+2086<-opiodr()+1202<-rpidrus()+198<-skgmstack()+65<-rpidru()+132<-rpiswu2()+541<-rpidrv()+1248<-psddr0()+467<-psdnal()+624<-pevm_EXECC()+306<-pfrinstr_EXECC()+56<-pfrrun_no_tool()+52<-pfrrun()+902<-plsql_run()+752<-peicnt()+282<-kkxexe()+720<-opiexe()+30719<-kpoal8()+2387<-opiodr()+1202<-ttcpip()+1222<-opitsk()+1897<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main()+245 SQL> oradebug short_stack ksedsts()+426<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-PassWord__encode()+2131<-sjoninvk_jit()+1041<-joevm_joe_run_jit_somersault()+458<-joe_invoke()+1631<-joet_aux_thread_main()+1674<-seoa_note_stack_outside()+34<-joet_thread_main()+64<-sjontlo_initialize()+178<-joe_enter_vm()+1197<-ioei_call_java()+4716<-ioesub_CALL_JAVA()+569<-seoa_note_stack_outside()+34<-ioe_call_java()+292<-jox_invoke_java_()+4133<-kkxmjexe()+1493<-kgmexcb()+56<-kkxmswu()+91<-kgmexwi()+1011<-kgmexec()+1452<-evapls()+1251<-evaopn2()+737<-upderh()+1252<-upduaw()+193<-kdusru()+617<-kauupd()+356<-updrow()+1693<-qerupUpdRow()+725<-qerupFetch()+644<-updaul()+1416<-updThreePhaseExe()+340<-updexe()+443<-opiexe()+11783<-opipls()+2086<-opiodr()+1202<-rpidrus()+198<-skgmstack()+65<-rpidru()+132<-rpiswu2()+541<-rpidrv()+1248<-psddr0()+467<-psdnal()+624<-pevm_EXECC()+306<-pfrinstr_EXECC()+56<-pfrrun_no_tool()+52<-pfrrun()+902<-plsql_run()+752<-peicnt()+282<-kkxexe()+720<-opiexe()+30719<-kpoal8()+2387<-opiodr()+1202<-ttcpip()+1222<-opitsk()+1897<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main()+245
因为当前的SQL正在执行java代码部分,说明该存储过程有java语言,需要排查java代码 。
SQL> @ddl ANBOB_U1.JAVA_ENCRY_F
PL/SQL procedure successfully completed.
DBMS_METADATA.GET_DDL(OBJECT_TYPE,OBJECT_NAME,OWNER)
------------------------------------------------------------------------------------------------------
CREATE OR REPLACE EDITIONABLE FUNCTION "ANBOB_U1"."JAVA_ENCRY_F" (code VARCHAR2)
RETURN VARCHAR2
AS LANGUAGE JAVA
NAME 'PassWord.encode(java.lang.String) return java.lang.String';
/
再继续分析JAVA 代码部分发现,对应的表中有一条记录,处理进入列循环, while(true) { if xxx } 判断出了问题。
所以到此能明确是因为当前会话执行的SQL,使用了java function, 因为java程序代码缺陷导致SQL一直处理spin(无限循环中). 对于这类问题我个人觉的是程序设计问题,为什么一个加密操作又是java又是db?如果是我,我想要么1,加密在程序中完成,计算后的值存入DB;2,完全使用数据库的加密,如使用dbms_crypto等内置package完成。
对不起,这篇文章暂时关闭评论。