再续PGA 和max_map_count、_realfree_heap_pagesize
之前分享过《PGA 一点认识》《Know more about PGA_AGGREGATE_LIMIT 12c 19c》 看到限制PGA的参数有_PGA_MAX_SIZE、 _SMM_(PX)_MAX_SIZE、PGA_AGGREGATE_TARGET、PGA_AGGREGATE_LIMIT ,为了提高SQL性能手动增加_PGA_MAX_SIZE参数值(The hidden parameter _pga_max_size does not limit a process size, only the work area),如在数据仓库环境可能更明显,最近看到一个案例在执行存储过程时,当把参数_PGA_MAX_SIZE 调到4GB以上时(default 2GB),依旧遇到了ora-4030 错误, trace 文件中显示上限达4GB的上限,Process Map Dump 达65537行。
在MOS中查找不难发现PLSQL Procedure Causing ORA-04030: (pga heap,control file i/o buffer) And ORA-04030: (koh-kghu sessi,pmuccst: adt/record) or ORA-04030: (koh-kghucall ,pmucalm coll) Errors (Doc ID 1325100.1) 匹配,提到了解决方案有2种
1,增加操作系统级page map的个数
more /proc/sys/vm/max_map_count sysctl -w vm.max_map_count=262144 (for example) -- add /etc/sysctl.conf -- or --
2, 增加数据库级realfree heap pagesize大小,需要重启数据库
For versions 11.2.0.4 and lower:
_use_realfree_heap=TRUE _realfree_heap_pagesize_hint = 262144
For 12.1 and higher:
_use_realfree_heap=TRUE _realfree_heap_pagesize = 262144
检查默认配置
[oracle@oel7db1 ~]$ sysctl -a|grep max_map vm.max_map_count = 65530 SQL> @pd realfree Show all parameters and session values from x$ksppi/x$ksppcv... NUM N_HEX NAME VALUE DESCRIPTION ---------- ----- -------------------------------------------------------- ------------------------------ --------------------------------------------------------------------------------------------------- 192 C0 _realfree_heap_max_size 32768 minimum max total heap size, in Kbytes 193 C1 _realfree_heap_pagesize 65536 hint for real-free page size in bytes 194 C2 _realfree_pq_heap_pagesize 65536 hint for pq real-free page size in bytes 195 C3 _realfree_heap_mode 0 mode flags for real-free heap 196 C4 _use_realfree_heap TRUE use real-free based allocator for PGA memory [oracle@oel7db1 ~]$ echo 65536*65530/1024/1024|bc 4095
默认数据库realfree allocator pagesize 是 64 kB (65536), 操作系统vm.max_map_count默认是65530, 64*65530=4GB. 如果把realfree heap pagesize增加到256kB (262144) 上限就成了16GB.对于多租户修改CDB级参数即可,PDB继承。
解决问题一方面增加单个进程PGA上限,另一方面要检查为什么单个进程PGA使用这么高?是否是合理?当发现有些进程使用PGA或内存使用较高时,如11g, 12, 19c都存在的bug ASM的dia0/diag 进程内存较大时,检查方法:
column name format a25 column pname format a12 column "MegaBytes" format a10 set numwidth 6 select ss.sid, p.pid, p.pname, sn.name, round(ss.value/(1024 *1024))||'Mb' "MegaBytes" from v$sesstat ss, v$statname sn, v$session s, v$process p where s.paddr = p.addr and sn.statistic# = ss.statistic# and s.sid = ss.sid and sn.name in ('session pga memory' , 'session pga memory max') and p.pname='DIA0' order by ss.value / select s.sid, s.program, p.pga_used_mem, p.pga_alloc_mem from v$session s, v$process p where s.paddr=p.addr and p.pga_used_mem >xxx; oradebug setorapid xxx oradebug unlimit oradebug dump heapdump 1 oradebug tracefile_name -- format trace file using heap.awk
查找内存使用相关的SQL和heap名称,查询是否有bug? 如上面提到的bug diag/dia进程可以kill 自动重启进程释放内存。
在Linux系统中可以使用pmap查看Process Map 记录数,但是当我在RAC on RHEL 7环境使用oracle用户用pmap查看server process(LOCAL=NO),提示无权限。如果使用strace跟踪是在read /proc/PID/maps文件时. 查看文件系统权限时:
$ ls -l /proc/NNNN/
-r--r--r-- 1 root root 0 Aug 9 16:06 maps
-rw------- 1 root root 0 Aug 9 16:06 mem
...
-r--r--r-- 1 root root 0 Aug 9 16:06 smaps
-r--r--r-- 1 root root 0 Aug 9 16:06 stack
-r--r--r-- 1 root root 0 Aug 9 16:06 stat
-r--r--r-- 1 root root 0 Aug 9 16:06 statm
-r--r--r-- 1 root root 0 Aug 9 16:06 status
注意是-r-r-r的权限,单从文件权限看other是有read权限的,使用oracle用户cat maps文件无权限,如果尝cat status文件是可以,后来从官方《Permission denied when reading /proc/$PID/* owned by the user》 和stackoverflow 找到了答案。
It’s because the file permissions are not the only protection you’re encountering.
Those aren’t just regular text files on a file system, procfs is a window into process internals and you have to get past both the file permissions plus whatever other protections are in place.
The maps show potentially dangerous information about memory usage and where executable code is located within the process space. If you look into ASLR, you’ll see this was a method of preventing potential attackers from knowing where code was loaded and it wouldn’t make sense to reveal it in a world-readable entry in procfs.
This protection was added way back in 2007:
This change implements a check using “ptrace_may_attach” before allowing access to read the maps contents. To control this protection, the new knob /proc/sys/kernel/maps_protect has been added, with corresponding updates to the procfs documentation.
Within ptrace_may_attach() (actually within one of the functions it calls) lies the following code:
if (((current->uid != task->euid) ||
(current->uid != task->suid) ||
(current->uid != task->uid) ||
(current->gid != task->egid) ||
(current->gid != task->sgid) ||
(current->gid != task->gid)) && !capable(CAP_SYS_PTRACE))
return -EPERM;
so that, unless you have the same real user/group ID, saved user/group ID and effective user/group ID (i.e., no sneaky setuid stuff) and they’re the same as the user/group ID that owns the process, you’re not allowed to see inside that “file” (unless your process has the CAP_SYS_PTRACE capability of course).
文件权限只是控制文件读写的一方面,对于/proc文件系统中文件,出于安全原因,从其他进程内存读取有额外的权限检查, /proc/$PID/maps显示了有关内存使用情况,以及可执行代码在进程空间中的位置的潜在危险信息, 操作系统在内核级使用CAP(Capabilities)机制控制,对于内核调用时可以分类分别控制,如果文件没有相同的UID/GID,还要确保使用 $PID 的进程对核心转储功能没有限制,可以临时启用审计规则auditctl 调测。简而言之就是使用pmap跟踪程序时,尝试read /proc/PID的文件时需要CAP_SYS_PTRACE Capabilities, oracle读取root owner的maps文件,可以通过以下方式授权:
su - root # setcap cap_sys_ptrace=eip /usr/bin/pmap
OOM KILL
In Linux oom_kill.c, out_of_memory() calls select_bad_process() to find processes to be killed.
If found, kill by oom_kill_process().
If not found, panic the system (halt the system, never return) by:
panic(“Out of memory and no killable processes…\n”);
TEST CODE
ksun’s SQL Scripts to test PGA
create or replace package pga_mem_test as procedure allo (p_mb int, p_sleep number := 0); end; / create or replace package body pga_mem_test as type t_tab_kb is table of char(1024); -- 1KB p_tab_1mb t_tab_kb := t_tab_kb(); type t_tab_mb is table of t_tab_kb; p_tab_mb t_tab_mb := t_tab_mb(); p_sid number := sys.dbms_support.mysid; ------------------------------------------- procedure rpt(l_name varchar) is l_v$process_mem varchar2(4000); l_v$process_memory_mem varchar2(4000); begin select 'Used/Alloc/Freeable/Max >>> '|| round(pga_used_mem/1024/1024) ||'/'||round(pga_alloc_mem/1024/1024)||'/'|| round(pga_freeable_mem/1024/1024)||'/'||round(pga_max_mem/1024/1024) into l_v$process_mem from v$process where addr = (select paddr from v$session where sid = p_sid); select 'Category(Alloc/Used/Max) >>> '|| listagg(Category||'('||round(allocated/1024/1024)||'/'|| round(used/1024/1024)||'/'||round(max_allocated/1024/1024)||') > ') within group (order by Category desc) name_usage_list into l_v$process_memory_mem from v$process_memory where pid = (select pid from v$process where addr = (select paddr from v$session where sid = p_sid)); dbms_output.put_line(rpad(l_name, 20)||' > '||rpad(l_v$process_mem, 50)); dbms_output.put_line(' ------ '||l_v$process_memory_mem); end rpt; ------------------------------------------- procedure allo (p_mb int, p_sleep number) is begin rpt('Start allocate: '||p_mb||' MB'); select 'M' bulk collect into p_tab_1mb from dual connect by level <= 1024; -- 1MB for i in 1..p_mb loop -- p_mb MB p_tab_mb.extend; p_tab_mb(i) := p_tab_1mb; end loop; rpt('End allocate: '||p_mb||' MB'); dbms_lock.sleep(p_sleep); end allo; end; / /* exec dbms_session.reset_package; set serveroutput on exec pga_mem_test.allo(1024*1, 30); -- allocate 1GB */ ------------------------------------------- create or replace procedure pga_mem_test_jobs(p_job_cnt number, p_mb number, p_sleep number := 0) as l_job_id pls_integer; begin for i in 1.. p_job_cnt loop dbms_job.submit(l_job_id, 'begin pga_mem_test.allo('||p_mb||', '||p_sleep||'); end;'); end loop; commit; end; / --exec pga_mem_test_jobs(4, 1024*2, 60); -- 4 Jobs, each allocates 2 GB, sleeping 60 seconds
对不起,这篇文章暂时关闭评论。