Troubleshooting ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^xx”,”kglseshtTable”)
最近在客户的环境中遇到一个较为罕见的问题,数据库实例的alert日志持续报告ORA-4031错误,导致一系列级联问题,包括ORA-600错误、归档失败、ASM实例通信失败等现象。同时,INACTIVE连接数异常增加,进一步可能出现数据库链接数耗尽。问题的核心出现在ASM实例,日志中反复出现ORA-4031错误kglseshtTable ,尽管ASM实例的SGA已经配置为4GB,这个大小在正常情况下应当足够使用,但此次却是首次出现此类问题。
该环境为Oracle 11.2.0.3版本,部署在2节点RAC架构下,运行于RHEL Linux 6.9操作系统,并且已启用HugePages。针对这些问题,我们需要深入分析ASM实例的内存使用情况,排查是否存在内存泄漏、过度内存分配或其他潜在的内存瓶颈。之所以说罕见是因为在主流的11204或12c 19c都没有出现过该问题。记录一下该问题。
ASM ALERT
2024-11-12 11:22:13.958000 +08:00
Errors in file /u/app/base/diag/asm/+asm/+ASM1/trace/+ASM1_m000_3981.trc (incident=535046):
ORA-04031: unable to allocate 12312 bytes of shared memory ("shared pool","unknown object","KKSSP^2109","kglseshtTable")
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u/app/base/diag/asm/+asm/+ASM1/trace/+ASM1_m000_3981.trc:
ORA-04031: unable to allocate 12312 bytes of shared memory ("shared pool","unknown object","KKSSP^2109","kglseshtTable")
Process m000 died, see its trace file
2024-11-12 11:23:11.023000 +08:00
Sweep [inc][535106]: completed
Sweep [inc][535046]: completed
2024-11-12 11:27:44.991000 +08:00
Errors in file /u/app/base/diag/asm/+asm/+ASM1/trace/+ASM1_lmd0_47867.trc (incident=535073):
ORA-04031: unable to allocate 7072 bytes of shared memory ("shared pool","unknown object","sga heap(3,1)","ges resource ")
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2024-11-12 11:27:47.127000 +08:00
Sweep [inc][535073]: completed
2024-11-12 11:30:50.379000 +08:00
Errors in file /u/app/base/diag/asm/+asm/+ASM1/trace/+ASM1_ora_7788.trc (incident=535123):
ORA-00600: internal error code, arguments: [ksxp_rm_check0], [0x7FAFA41397A8], [0], [0x08474B2E0], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u/app/base/diag/asm/+asm/+ASM1/trace/+ASM1_ora_7791.trc (incident=535124):
ORA-00600: internal error code, arguments: [ksxp_rm_check0], [0x7F190B2127A8], [0], [0x08474B2E0], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
...
2024-11-12 12:37:58.735000 +08:00
Errors in file /u/app/base/diag/asm/+asm/+ASM1/trace/+ASM1_ora_34132.trc (incident=535116):
ORA-04031: unable to allocate 12312 bytes of shared memory ("shared pool","unknown object","KKSSP^2233","kglseshtTable")
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2024-11-12 12:38:43.676000 +08:00
Sweep [inc][535116]: completed
2024-11-12 12:38:48.875000 +08:00
Errors in file /u/app/base/diag/asm/+asm/+ASM1/trace/+ASM1_m001_34462.trc (incident=534891):
ORA-04031: unable to allocate 12312 bytes of shared memory ("shared pool","unknown object","KKSSP^2605","kglseshtTable")
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u/app/base/diag/asm/+asm/+ASM1/trace/+ASM1_m001_34462.trc:
ORA-04031: unable to allocate 12312 bytes of shared memory ("shared pool","unknown object","KKSSP^2605","kglseshtTable")
Process m001 died, see its trace file
2024-11-12 12:39:14.190000 +08:00
Errors in file /u/app/base/diag/asm/+asm/+ASM1/trace/+ASM1_lmd0_47867.trc (incident=535175):
ORA-04031: unable to allocate 7072 bytes of shared memory ("shared pool","unknown object","sga heap(3,1)","ges resource ")
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
10043 trace
dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 00BAF3FA0 ? 7F3A29CD6460 ? 66 000000FBF ? 7F3A29A9CFB0 ? 100000000 ? 000000002 ? kgeade()+351 call dbkePostKGE_kgsf() 00BAF3FA0 ? 7F3A29CD6460 ? 000000FBF ? 7F3A29A9CFB0 ? 100000000 ? 000000002 ? kgerelv()+135 call kgeade() 00BAF3FA0 ? 00BAF4150 ? 7F3A29CD6460 ? 000000FBF ? 100000000 ? 000000002 ? kgerev()+29 call kgerelv() 00BAF3FA0 ? 7F3A29CD6460 ? 000000FBF ? 00A3CE094 ? 7FFF65F48480 ? 000000000 ? kserec2()+165 call kgerev() 00BAF3FA0 ? 7F3A29CD6460 ? 000000FBF ? 000000000 ? 7FFF65F48480 ? 000000000 ? kfnRecordErr()+436 call kserec2() 00BAF3FA0 ? 000000001 ? 00000000D ? 00951BDE8 ? 000000001 ? 000000079 ? kfnConnectPriv()+11 call kfnRecordErr() 7F3A29AA9640 ? 7F3A29A6D950 ? 449 000000000 ? 00951BDE8 ? 000000001 ? 000000079 ? kfnUseConnFl()+213 call kfnConnectPriv() 000000000 ? 7FFF65F49D08 ? 000000000 ? 000000001 ? 000000000 ? 000000000 ? kfncFileIdentify()+ call kfnUseConnFl() 000000000 ? 000000000 ? 684 000000000 ? 000000001 ? 000000000 ? 000000000 ? kfioIdentify()+1701 call kfncFileIdentify() EB92B79C8 ? EC87AAA9A ? 000000030 ? 0003FE83E ? 00004342B ? EC87AAA68 ? ksfd_osmopn()+1115 call kfioIdentify() 7FFF65F4A862 ? 000000030 ? 0003FE83E ? 000000200 ? 000000003 ? EC87AAA48 ? ksfdopn1()+3064 call ksfd_osmopn() 7FFF65F4A862 ? 000000030 ? 000000200 ? 000000003 ? 000000000 ? F08EE1368 ? ksfdopn()+212 call ksfdopn1() 7FFF65F4A862 ? 000000030 ? 000000200 ? 000000003 ? 000000000 ? F08EE1368 ? kcropn()+1331 call ksfdopn() 7FFF65F4A862 ? 000000030 ? 000000200 ? 000000003 ? 000000000 ? 000000000 ? kcroio()+7 call kcropn() 000000000 ? 7FFF65F4AFD8 ? 000000000 ? 7FFF65F4EDF0 ? 000000000 ? 000000000 ? krsk_rlh_get_info() call kcroio() 000000000 ? 7FFF65F4AFD8 ? +393 000000000 ? 7FFF65F4EDF0 ? 000000000 ? 000000000 ? krse_arc_source_ini call krsk_rlh_get_info() 000000000 ? 7FFF65F4EE80 ? t()+185 7FFF65F4EDF0 ? 000000001 ? 7F3A285543C4 ? 7F3A285543C8 ? krse_arc_driver_cor call krse_arc_source_ini 000000000 ? 7FFF65F4EE80 ?
函数调用
kfnRecordErr()+436 call kserec2() 00BAF3FA0 ? 000000001 ?
kernel automatic storage management networking subsystem [partial hit for: kfn ] => kernel service error [partial hit for: kse ]
kfnConnectPriv()+11 call kfnRecordErr() 7F3A29AA9640 ? 7F3A29A6D950 ?
kernel automatic storage management networking subsystem [partial hit for: kfn ] => kernel automatic storage management networking subsystem [partial hit for: kfn ]
kfnUseConnFl()+213 call kfnConnectPriv() 000000000 ? 7FFF65F49D08 ?
kernel automatic storage management networking subsystem [partial hit for: kfn ] => kernel automatic storage management networking subsystem [partial hit for: kfn ]
kfncFileIdentify()+ call kfnUseConnFl() 000000000 ? 000000000 ?
kernel automatic storage management networking subsystem client networking [partial hit for: kfnc ] => kernel automatic storage management networking subsystem [partial hit for: kfn ]
kfioIdentify()+1701 call kfncFileIdentify() EB92B79C8 ? EC87AAA9A ?
kernel automatic storage management translation IO layer [partial hit for: kfio ] => kernel automatic storage management networking subsystem client networking [partial hit for: kfnc ]
ksfd_osmopn()+1115 call kfioIdentify() 7FFF65F4A862 ? 000000030 ?
kernel service functions disk IO [partial hit for: ksfd ] => kernel automatic storage management translation IO layer [partial hit for: kfio ]
ksfdopn1()+3064 call ksfd_osmopn() 7FFF65F4A862 ? 000000030 ?
kernel service functions disk IO open random access file fname [partial hit for: ksfdopn ] => kernel service functions disk IO [partial hit for: ksfd ]
ksfdopn()+212 call ksfdopn1() 7FFF65F4A862 ? 000000030 ?
kernel service functions disk IO open random access file fname => kernel service functions disk IO open random access file fname [partial hit for: ksfdopn ]
异常的ASM 实例
$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.3.0 Production on Thu Nov 14 23:53:21 2024 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> show parameter sga ORA-00604: error occurred at recursive SQL level 2 ORA-04031: unable to allocate 3000 bytes of shared memory ("shared pool","GV$PARAMETER","sga heap(2,0)","call") SQL> show parameter target ORA-00604: error occurred at recursive SQL level 2 ORA-04031: unable to allocate 3000 bytes of shared memory ("shared pool","GV$PARAMETER","sga heap(2,0)","call")
暂时正常的ASM 实例节点
$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.3.0 Production on Thu Nov 14 23:54:19 2024 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> show parameter target NAME TYPE VALUE ------------------------------------ ---------------- ----------------- memory_max_target big integer 4G memory_target big integer 4G pga_aggregate_target big integer 0 sga_target big integer 0 $ cat /proc/meminfo |grep -i hugepage AnonHugePages: 1079296 kB HugePages_Total: 33239 HugePages_Free: 5820 HugePages_Rsvd: 3302 HugePages_Surp: 0 Hugepagesize: 2048 kB
Cause
You may have both Automatic Memory Management (AMM) and hugepages configured at the same time on Linux platform.. According to both of the following documents, you cannot do this.
With AMM all SGA memory is allocated by creating files under /dev/shm. Please also note that ramfs (instead of tmpfs mount over /dev/shm) is not supported for AMM at all. Although it is possible to start a database with AMM even if the HugePages are configured on the system, it highly depends on how /dev/shm is mounted, and how much space is reserved for it. ORA-4031 errors could occur as a result of using this kind of mixed environment if /dev/shm is not correctly sized or mounted properly.
Using both AMM and Huge pages is not supported. AMM is the default for ASM instances
Solution
Disable Automatic Memory Management (AMM) on ASM instance:
1. Unset MEMORY_TARGET/MEMORY_MAX_TARGET SQL> alter system reset memory_target; SQL> alter system reset memory_max_target; 2. Set memory shared_pool_size 3. restart the ASM database
Known issue
Bug 12925089 ORA-600 [ksxp_rm_check0] can occur on process startup 11.2.0.3
Workaround
Address the cause of the preceding error.
eg: Adjust memory parameters if possible to avoid memory exhaustion.
Non-published Bug:12925089 which fix is included starting 11.2.0.4
Bug 12925089 ORA-600 [ksxp_rm_check0] can occur on process startup
19c
ORA-04031: unable to allocate 12312 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^1489″,”kglseshtTable”)
Cause
This is due to a product defect. This was investigated in unpublished Bug
33647820 DB ENCOUNTERED ORA-04031 “KGLSESHTTABLE” AFTER UPGRADING TO 19.12DBRU FROM 19.10
This is due to due to a reserved shared pool imbalance.
对不起,这篇文章暂时关闭评论。