Troubleshooting ORA-04031: unable to allocate 13840 bytes of shared memory “ges resource dynamic” in 12C+
在12c 版本以后”ges resource dynamic”逐渐增长最终导致shared_pool可能会超过手动管理的shared pool size达到sga_max_size后出现ora-4031. 与之相关的oracle bug就好几个,这最近因为这个问题导致lmd hang堵塞了其它实例的前台进程,关掉了这个节点临时恢复,简单记录。
#db alert log
2020-08-01 09:19:24.538000 +08:00
Thread 3 advanced to log sequence 10912 (LGWR switch)
Current log# 16 seq# 10912 mem# 0: +DATADG/ANBOB/ONLINELOG/group_16.280.1025954009
Archived Log entry 121814 added for T-3.S-10911 ID 0xd370ac7 LAD:1
2020-08-01 09:19:25.785000 +08:00
TT02: Standby redo logfile selected for thread 3 sequence 10912 for destination LOG_ARCHIVE_DEST_2
2020-08-01 09:32:07.640000 +08:00
Errors in file /oracle/app/oracle/diag/rdbms/ANBOB/anbob3/trace/billb3_lmd2_45682.trc (incident=256225):
ORA-04031: unable to allocate 13840 bytes of shared memory ("shared pool","unknown object","sga heap(1,0)","ges resource dynamic")
Incident details in: /oracle/app/oracle/diag/rdbms/ANBOB/anbob3/incident/incdir_256225/billb3_lmd2_45682_i256225.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2020-08-01 09:32:10.382000 +08:00
Errors in file /oracle/app/oracle/diag/rdbms/ANBOB/anbob3/trace/billb3_lmd2_45682.trc (incident=256226):
ORA-04031: unable to allocate 13840 bytes of shared memory ("shared pool","unknown object","sga heap(1,0)","ges resource dynamic")
Incident details in: /oracle/app/oracle/diag/rdbms/ANBOB/anbob3/incident/incdir_256226/billb3_lmd2_45682_i256226.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2020-08-01 09:32:13.441000 +08:00
Errors in file /oracle/app/oracle/diag/rdbms/ANBOB/anbob3/trace/billb3_lmd2_45682.trc (incident=256227):
ORA-04031: unable to allocate 13840 bytes of shared memory ("shared pool","unknown object","sga heap(1,0)","ges resource dynamic")
# 本次出现ora-4031的每一次dump trace中的heapdump 可以以”TOP”为关键字查找
*** 2020-08-01T09:32:07.644132+08:00 ================================= Begin 4031 Diagnostic Information /TOP 7f7f4e265000-7f7f4e26e000 r--s 0011b000 fe:07 1100698 /oracle/app/oracle/product/12.2.0/db_1/lib/libskgxp12.so 7f7f4e26e000-7f7f4e26f000 r--s 0011b000 fe:07 1100698 /oracle/app/oracle/product/12.2.0/db_1/lib/libskgxp12.so 7f7f4e26f000-7f7f4e271000 r--s 18592000 fe:07 1101565 /oracle/app/oracle/product/12.2.0/db_1/bin/oracle 7f7f4e271000-7f7f4e272000 rw-s 00000000 fe:07 4743170 /oracle/app/oracle/product/12.2.0/db_1/dbs/hc_billb3.dat 7f7f4e272000-7f7f4e275000 rw-p 00000000 00:00 0 7f7f4e275000-7f7f4e276000 r--p 00020000 fe:00 1155074 /lib64/ld-2.22.so 7f7f4e276000-7f7f4e277000 rw-p 00021000 fe:00 1155074 /lib64/ld-2.22.so 7f7f4e277000-7f7f4e278000 rw-p 00000000 00:00 0 7fff927e7000-7fff92822000 rw-p 00000000 00:00 0 [stack] 7fff92919000-7fff9291c000 r--p 00000000 00:00 0 [vvar] 7fff9291c000-7fff9291e000 r-xp 00000000 00:00 0 [vdso] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] *********************** End of process map dump **************** Maximum map count configured per process: 65530 3< ***** process_map_dump ***** [TOC00005-END] ============================================== TOP 10 MEMORY USES FOR SGA HEAP SUB POOL 1 ---------------------------------------------- "ges resource dynamic " 14 GB 62% "ges enqueues " 6642 MB 29% "free memory " 721 MB 3% "gcs resources " 352 MB 2% "gcs shadows " 225 MB 1% "gc name table " 128 MB 1% "gcs resv res hash bucket " 107 MB 0% "db_block_hash_buckets " 101 MB 0% "ges resource permanent " 49 MB 0% "Checkpoint queue " 46 MB 0% ----------------------------------------- free memory 721 MB memory alloc. 22 GB Sub total 23 GB ============================================== TOP 10 MAXIMUM MEMORY USES FOR SGA HEAP SUB POOL 1 ---------------------------------------------- "ges resource dynamic " 14 GB "ges enqueues " 6642 MB "free memory " 1201 MB "gcs resources " 352 MB "gcs shadows " 225 MB "gc name table " 128 MB "gcs resv res hash bucket " 107 MB "db_block_hash_buckets " 101 MB "ges resource permanent " 49 MB "Checkpoint queue " 46 MB ============================================== TOP 10 MEMORY USES FOR SGA HEAP SUB POOL 2 ---------------------------------------------- "ges enqueues " 2909 MB 63% "free memory " 443 MB 10% "gcs resources " 352 MB 8% "gcs shadows " 224 MB 5% "gc name table " 128 MB 3% "gcs resv res hash bucket " 107 MB 2% "db_block_hash_buckets " 102 MB 2% "ges resource permanent " 49 MB 1% "Checkpoint queue " 46 MB 1% "event statistics per sess " 26 MB 1% ----------------------------------------- free memory 443 MB memory alloc. 4165 MB Sub total 4608 MB ============================================== TOP 10 MAXIMUM MEMORY USES FOR SGA HEAP SUB POOL 2 ---------------------------------------------- "ges enqueues " 2909 MB "free memory " 715 MB "gcs resources " 352 MB "gcs shadows " 224 MB "gc name table " 128 MB "gcs resv res hash bucket " 107 MB "db_block_hash_buckets " 102 MB "ges resource permanent " 49 MB "Checkpoint queue " 46 MB "event statistics per sess " 26 MB ============================================== TOP 10 MEMORY USES FOR SGA HEAP SUB POOL 3 ---------------------------------------------- "ges resource dynamic " 14 GB 61% "ges enqueues " 6832 MB 29% "free memory " 1011 MB 4% "gcs resources " 353 MB 1% "gcs shadows " 226 MB 1% "gc name table " 128 MB 1% "gcs resv res hash bucket " 106 MB 0% "db_block_hash_buckets " 101 MB 0% "ges resource permanent " 49 MB 0% "Checkpoint queue " 46 MB 0% ----------------------------------------- free memory 1011 MB memory alloc. 22 GB Sub total 23 GB ============================================== TOP 10 MAXIMUM MEMORY USES FOR SGA HEAP SUB POOL 3 ---------------------------------------------- "ges resource dynamic " 14 GB "ges enqueues " 6832 MB "free memory " 1232 MB "KGH: NO ACCESS " 512 MB "gcs resources " 353 MB "gcs shadows " 226 MB "gc name table " 128 MB "gcs resv res hash bucket " 106 MB "db_block_hash_buckets " 101 MB "ges resource permanent " 49 MB ============================================== ...
# 从lmhb trace文件中grep 几个bug关键字
# grep "library cache pin wait" *lmhb*.trc billb3_lmhb_2380.trc:kjgcr_ServiceGCR: KJGCR_METRICS: Local metric library cache pin wait check, id 11 failed billb3_lmhb_2380.trc:kjgcr_ChkGlobalMetric: metric 11 (library cache pin wait check) failed for dbname ANBOB, inst 3, node 3 billb3_lmhb_2380.trc:kjgcr_ChkGlobalMetric: metric 11 (library cache pin wait check) succeed for dbname ANBOB, inst 1, node 1 billb3_lmhb_2380.trc:kjgcr_ServiceGCR: KJGCR_METRICS: Local metric library cache pin wait check, id 11 succeed billb3_lmhb_2380.trc:kjgcr_ServiceGCR: KJGCR_METRICS: Local metric library cache pin wait check, id 11 failed billb3_lmhb_2380.trc:kjgcr_ChkGlobalMetric: metric 11 (library cache pin wait check) failed for dbname ANBOB, inst 3, node 3 billb3_lmhb_2380.trc:kjgcr_ChkGlobalMetric: metric 11 (library cache pin wait check) succeed for dbname ANBOB, inst 1, node 1 billb3_lmhb_2380.trc:kjgcr_ServiceGCR: KJGCR_METRICS: Local metric library cache pin wait check, id 11 succeed billb3_lmhb_2380.trc:kjgcr_ServiceGCR: KJGCR_METRICS: Local metric library cache pin wait check, id 11 failed billb3_lmhb_2380.trc:kjgcr_ChkGlobalMetric: metric 11 (library cache pin wait check) failed for dbname ANBOB, inst 3, node 3 billb3_lmhb_2380.trc:kjgcr_ChkGlobalMetric: metric 11 (library cache pin wait check) succeed for dbname ANBOB, inst 1, node 1 billb3_lmhb_2380.trc:kjgcr_ServiceGCR: KJGCR_METRICS: Local metric library cache pin wait check, id 11 succeed billb3_lmhb_2380.trc:kjgcr_ServiceGCR: KJGCR_METRICS: Local metric library cache pin wait check, id 11 failed billb3_lmhb_2380.trc:kjgcr_ChkGlobalMetric: metric 11 (library cache pin wait check) failed for dbname ANBOB, inst 3, node 3 billb3_lmhb_2380.trc:kjgcr_ChkGlobalMetric: metric 11 (library cache pin wait check) succeed for dbname ANBOB, inst 1, node 1 billb3_lmhb_2380.trc:kjgcr_ServiceGCR: KJGCR_METRICS: Local metric library cache pin wait check, id 11 succeed billb3_lmhb_2380.trc:kjgcr_ServiceGCR: KJGCR_METRICS: Local metric library cache pin wait check, id 11 failed billb3_lmhb_2380.trc:kjgcr_ChkGlobalMetric: metric 11 (library cache pin wait check) failed for dbname ANBOB, inst 3, node 3 。。。 # grep kjgcr_GrowR *lmhb*.trc billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth billb3_lmhb_2380.trc:kjgcr_GrowResourceCache: LCP waits high, allowing res cache growth # trace file *** 2020-08-01T10:15:45.376086+08:00 kjgcr_ChkGlobalMetric: metric 7 (check lck heartbeat) failed for dbname ANBOB, inst 2, node 2 *** 2020-08-01T10:15:45.380376+08:00 kjgcr_ChkGlobalMetric: metric 19 (check lgwr heartbeat) succeed for dbname +ASM, inst 3, node 3 *** 2020-08-01T10:15:45.380413+08:00 kjgcr_ChkGlobalMetric: metric 19 (check lgwr heartbeat) failed for dbname ANBOB, inst 2, node 2 *** 2020-08-01T10:15:47.362926+08:00 ============================== LCK1 (ospid: 45689) has not moved for 1249 sec (1596248147.1596246898) *** 2020-08-01T10:15:47.367616+08:00 ============================== LGWR (ospid: 45718) has not moved for 1241 sec (1596248147.1596246906) *** 2020-08-01T10:15:47.367682+08:00 kjgcr_ChkGlobalMetric: metric 4 (check lmd heartbeat) failed for dbname ANBOB, inst 3, node 3 *** 2020-08-01T10:15:47.367708+08:00 kjgcr_ChkGlobalMetric: metric 4 (check lmd heartbeat) succeed for dbname +ASM, inst 3, node 3 *** 2020-08-01T10:15:47.367738+08:00 kjgcr_ChkGlobalMetric: metric 6 (check lmon heartbeat) failed for dbname ANBOB, inst 3, node 3 *** 2020-08-01T10:15:47.367756+08:00 kjgcr_ChkGlobalMetric: metric 6 (check lmon heartbeat) succeed for dbname +ASM, inst 3, node 3 *** 2020-08-01T10:15:47.367778+08:00
这是个12c r2 201804 RU与之相关的有两个bug.
这个问题相关的bug
1, For Oracle Versions >=19.1 but BELOW 20.1 and ASM
Unpublished Bug:30497120 – 影响ASM Rebalance Produces AD Enqueue Leak. 在ASM 19c中频繁RB可能会导致”ges resource dynamic”增长出现ora-4031
Workaround: Set parameter _lm_broadcast_res=disable.
2. For Oracle Versions >= 12.2 but BELOW 19.1
Bug:26405036 – VERY HIGH “GES ENQUEUES” ON THE SHARED POOL
现象LMHB trace中出现”memory load check” failure.和”library cache pin wait check” failure. heapdump 中”ges resource dynamic”使用高.
Workaround: on R12.2 or above, Start pseudo reconfiguration by below command is workaround.
SQL> oradebug setmypid
SQL> oradebug lkdebug -m reconfig lkdebug
Bug:27824540 – ORA-04031 (“SHARED POOL”,”UNKNOWN OBJECT”,”SGA HEAP(1,0)”,”GES RESOURCE DYNAMIC”
现象 “ges resource dynamic” 在head dump中使用高,LMHB trace文件中出现kjgcr_GrowResourceCache和(check lck heartbeat) fail
Workaround: There are 2 possible workarounds.
1) Disable the action 11.
SQL> oradebug dyn_gcr -a 11 -disable
Note: This oradebug command is available on 12.2 and later.
2) Disable the GES resource cache; set the initialization parameter “_ges_direct_free” to TRUE.
Note: 完全禁用 GES resource cache可能会影响insert的TM资源, 并且调这个参数还可能引入另一个Bug 30998759直到19.8和21c base版才修复.
3. For Oracle Versions >= 12.1.0.2 but BELOW 12.2
Unpublished Bug:21260431 – GETTING ORA-4031 AFTER 12C UPGRADE
现象 “ges resource dynamic” 在head dump中使用高, gv$ges_resource记录数只增不减。
Workaround: None. 安装oneoff patch
Unpublished Bug:21373473 – INSTANCE TERMINATED AS LMD0 AND LMD2 HUNG FOR MORE THAN 70 SECS
现象 “ges resource dynamic”使用高, gv$ges_resource中记录比预期高, DX and BB locks cached不释放. LMD可能无响应.
WORKAROUND: set _ges_direct_free_res_type=”CTARAHDXBB” 需要重启实例
4. For Oracle Versions = 12.1
Unpublished BUG:27860058, Fix For Bug 26405036 On 12.1, shown above.
WORKAROUND:只能重启实例 ,或安装oneoff patch
Unpublished Bug:28300808, Fix For Bug 27824540 On 12.1, shown above.
Workaround: Disable the GES resource cache; “_ges_direct_free” to TRUE影响上面提到过。 或安装oneoff patch
Reference Error ORA-04031 in the Shared Pool with Huge Allocation in Memory Type “ges resource dynamic” or “ges enqueues” memory” (Doc ID 2631592.1)
对不起,这篇文章暂时关闭评论。