Shell: extract more from listener.log (分析oracle监听日志连接频率)
最近遇到了两起数据库连接数不足的问题, 通常都会预留一些会话增加的情况, 但在一些特殊情况下如连接风暴(logon storm), 如果在监听中没有做rate限流,对数据库来说巨大的冲击可能会导致数据库Hang 或 ora-20 或ora-18 错误。 对于Hang并伴有进程数不足的情况,AWR、ASH 都可能无法升成,甚至数据库都无法登录或做SSD 都不成功, 这时候LISTENER.LOG 就成了“破案”时关键的线索。 下面记录分享一些分析listener.log的一些脚本.(Note:在不同UNIX下可能稍有变化)
统计一天内每小时的session请求数
# fgrep "13-JAN-2015 " listener.log |awk '/establish/ {print substr($0,1,15)}' |sort |uniq -c 2978 13-JAN-2015 00: 2883 13-JAN-2015 01:
指定的一小时每分钟session请求数
# fgrep "13-JAN-2015 11:" listener.log |awk '/establish/ {print substr($0,1,17)}' |sort |uniq -c 94 13-JAN-2015 11:00 44 13-JAN-2015 11:01 80 13-JAN-2015 11:02 119 13-JAN-2015 11:03 ...
指定的一小时每秒session请求数
# fgrep "13-JAN-2015 11:30" listener.log |awk '/establish/ {print substr($0,1,20)}' |sort |uniq -c 7 13-JAN-2015 11:30:00 3 13-JAN-2015 11:30:01 4 13-JAN-2015 11:30:02 4 13-JAN-2015 11:30:03 ...
指定的一小时内每分钟连接创建失败数
#fgrep "11-JAN-2015 11:" listener.log |awk '/establish/ { if ( $NF != 0 ) print substr($0,1,17)}' |sort |uniq -c 474 11-JAN-2015 11:38 10 11-JAN-2015 11:39
指定的一小时内每IP请求数
#fgrep "11-JAN-2015 11:" anbob_listener.log|fgrep "establish"|awk -F* '{print $3}'|awk -F= '{ print $4}'|sed -e 's/......$//g'|sort |uniq -c|sort -n 1 136.142.26.139 2 136.142.10.212 2 136.142.21.171 8 136.142.21.172 13 136.142.26.133 13 136.142.29.17 14 136.142.29.20 18 136.142.26.35 23 136.142.29.29 ...
指定的分钟内每IP请求数
#fgrep "11-JAN-2015 11:30" listener.log|fgrep "establish"|awk -F* '{print $3}'|awk -F= '{ print $4}'|sed -e 's/......$//g'|sort |uniq -c|sort -n 1 136.142.26.35 1 136.142.29.149 1 136.142.29.156 1 136.142.29.17 2 136.142.30.189 3 136.142.26.133 4 136.142.26.136 4 136.142.29.157 7 136.142.29.20 9 136.142.29.22 10 136.142.26.34 ...
指定时间段内的每IP请求数
sed -n '/22-MAY-2019 12:01:00/,/22-MAY-2019 12:05:00/p' listener.log |fgrep "establish"|awk -F* '{print $3}'|awk -F= '{ print $4}'|sed -e 's/......$//g'|sort |uniq -c|sort -n 19 133.96.43.75 36 133.96.65.47 53 133.96.66.41 62 133.96.66.42 80 133.96.65.48 85 133.96.43.71 96 133.96.65.41 360 133.96.65.46 384 133.96.65.86 681 133.96.65.45
全天每小时每个IP请求数
fgrep "09-JAN-2015 " listener.log|fgrep "establish"|awk -F* '{print $1 " " $3}'|awk -F= '{ print $1 " " $4}'|sed -e 's/......$//g'| awk '{print $1 " " $2 " " $4}'|cut -b-14,21- |sort |uniq -c 1 09-JAN-2015 01 136.142.21.172 66 09-JAN-2015 01 136.142.21.85 11 09-JAN-2015 01 136.142.26.131 5 09-JAN-2015 01 136.142.26.133 21 09-JAN-2015 01 136.142.26.136 5113 09-JAN-2015 01 136.142.26.24 49 09-JAN-2015 01 136.142.26.34 6 09-JAN-2015 01 136.142.29.141 28 09-JAN-2015 01 136.142.29.148 49 09-JAN-2015 01 136.142.29.149 85 09-JAN-2015 01 136.142.29.150 2 09-JAN-2015 01 136.142.29.151 6 09-JAN-2015 01 136.142.29.156 6 09-JAN-2015 01 136.142.29.157 2 09-JAN-2015 01 136.142.29.162 58 09-JAN-2015 01 136.142.29.164 4 09-JAN-2015 01 136.142.29.17 4 09-JAN-2015 01 136.142.29.184 207 09-JAN-2015 01 136.142.29.192
指定时间指定主机上不同主机USER的请求统计
$ fgrep "04-JUL-2018 15:" listener.log|fgrep "133.96.65.45" | awk -F= '$11 ~ /USER/ {print $12 }' | sed 's/).*$//'|sort|uniq -c|sort -n
10815 taskmon
指定时间指定条件的PROGRAM的请求统计
$ fgrep "01-NOV-2018 " listener.log|sed 's/^.*PROGRAM//;s/).*$//;s/^.*:...\*//;s/).*$//;s/\*.*$//'|sort|uniq -c|sort -n 1 sqlldr@kinjk3 2789 sqlplus 8025 sqlplus@kinjk3
Note:
PROGRAM=oracle的可能是DBLINK,有些使用sid连接的的可能program为空如hpux下面这条记录:
# hpux 01-NOV-2018 00:00:06 * (CONNECT_DATA=(SID=tbcsa1)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=))) * (ADDRESS=(PROTOCOL=tcp)(HOST=133.96.65.86)(PORT=63438)) * establish * tbcsa1 * 0
指定时间段的PROGRAM的请求统计
sed -n '/22-MAY-2019 09:01:00/,/22-MAY-2019 09:05:00/p' listener.log | fgrep "establish"|sed 's/^.*PROGRAM//;s/).*$//;s/^.*:...\*//;s/).*$//;s/\*.*$//'|sort|uniq -c|sort -n 1 establish 2 =python 5 =JDBC Thin Client 6 =VpnOrderSyn 6 =rman 40 =IntClient 79 =oracle 95 =SynSrpFeedBackResult 2472 =
分析监听 网络请求队列
netstat -ano |sed -n '/.*.1521.* LISTEN$/,/TCP:/p' |awk '{if($0 ~ "LISTEN"){ip=$4;}if($0 ~ "qlen"){print ip" "$0}}'
统计多个监听
$ su - gird (listener owner) $ cd $ORACLE_BASE/diag/tnslsnr/`hostname` $ find . -name "listener*.log"|while read LINE do echo ********************** echo $LINE echo "==============================" ----do something -- fgrep "03-SEP-2020 13" $LINE |awk '{ if ( $NF != 0 ) print $0 }'|awk '{print $1 " " $2}' |awk -F: '{print $1 ":" $2 }' |sort |uniq -c|sort -n done
使用ADRCI一条命令收集每小时连接频率
注意:如果是多个监听时需要配置adr home
adrci exec='set home diag/tnslsnr ; show alert -term -p \"' "MESSAGE_TEXT like \'%establish%\' and ORIGINATING_TIMESTAMP > \'2019-08-31 00:00:00\'" '\"' | awk '/establish/{sub(/:.*/,"",$2);print "connection/hour at "$1,$2":00-"$2":59"}'|uniq -c
收集白名单
从监听日志里分析白名单的IP段(注意:日志里只是当前保留的日志时间段的访问记录,并非全部。)
cd /oracle/app/grid/diag/tnslsnr/$(hostname)/listener/trace grep "establish" listener.log|awk -F* '{print $3}'|awk -F= '{print $4}'|sed -e 's/......$//g'|awk -F. '{print $1 "." $2 "." $3}'|sort|uniq | awk 'BEGIN{RS=EOF}{gsub(/\n/,".*,");print}'
SQL脚本
-- file: session_rpt.sql -- Purpose: To Collect all session information -- Author: weejar -- Copyright: (c) ANBOB - http://www.anbob.com.com - All rights reserved. -- version 2.6 col spoolname new_value spoolname select 'session_rpt_'||to_char(sysdate,'yyyymmdd') spoolname from dual; spool '&spoolname' prom list of sessions set lines 300 pages 1000 col current_time for a50 select 'anbob.com' author,to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') current_time,instance_name,version,status,instance_role from v$instance / select * from v$resource_limit where RESOURCE_NAME in('processes','sessions'); col sid form 99999 col serial# form 99999 col spid form a6 col program heading 'program' for a25 trunc col username form a15 col osuser form a10 col idle form a30 heading "Idle" col terminal form a12 col logon_time form a18 col machine for a15 trunc col rn for 9999 col service_name for a30 set lines 150 pages 1000 break on report compute sum of cnt on report select username,status,count(*) cnt from v$session group by username,status order by 1 / select username,machine,count(*) cnt from v$session group by username,machine order by 1,2 / select username,machine,failed_over,count(*) cnt from v$session where failed_over='YES' group by username,machine,failed_over order by 1,2 / select server,status,count(*) from v$session group by server,status / select inst_id,service_name,count(*) cnt from gv$session group by inst_id,service_name order by 1,2 / select inst_id,pname,username,count(*) cnt from gv$process group by inst_id,pname,username / select machine,program,count(*) from v$session where type='USER' group by machine,program order by 1,2 / select machine,server,username, count(*) cnt from v$session -- where program like 'oracle@qdyy%(TNS V1-V3)' -- and machine in('qdyya1') group by machine,server,username / ttitle - center 'displays the top 50 longest idle times' skip 2 select a.* from ( select sid,serial#,username,status, to_char(logon_time,'dd-mm-yy hh:mi:ss') logon_time , floor(last_call_et/3600)||' hours ' || floor(mod(last_call_et,3600)/60)||' mins ' || mod(mod(last_call_et,3600),60)||' secs' idle , machine ,row_number() over(order by last_call_et desc ) rn from v$session where type='USER' ) a where rn<= 50 / ttitle off column event heading 'wait event' for a30 trunc ttitle - center 'displays active session' skip 2 select sid,serial#,username,event,program,MACHINE,sql_id,BLOCKING_SESSION from v$session where status='ACTIVE' and username is not null; ttitle off
对不起,这篇文章暂时关闭评论。