Impact of DNS Performance Problems or Failures on SCAN Listener Use in an Oracle RAC Environment

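Oracle 11g Release 2 introduced the SCAN listener feature, which simplifies the connect strings that client applications have to configure. A client request sent to a SCAN listener is still redirected to a VIP listener, which then establishes the connection with the client. The standard recommendation is to register the SCAN name in DNS; without DNS, the name can resolve to only one SCAN IP, configured in /etc/hosts. Using a DNS server places high demands on DNS response time and also adds another point of failure. For mitigating DNS latency, the MOS note Reducing Client Connection Delays When DNS Is Unresponsive (Doc ID 1449843.1) offers recommendations. Recently a customer's DNS failed suddenly and affected connections from some applications that use SCAN, so here is a brief record.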
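Because the SCAN setup depends on DNS response time, it can help to measure how long a single lookup of the SCAN name takes from a client or database host. The following is a minimal Python sketch, not part of the MOS note; the function name `time_resolution` and the injectable `resolver` parameter are assumptions made for illustration.

```python
import socket
import time

def time_resolution(name, resolver=socket.getaddrinfo, port=1521):
    """Time one lookup of `name`; return (elapsed_seconds, sorted unique addresses).

    `resolver` defaults to socket.getaddrinfo and can be swapped out for testing.
    """
    start = time.monotonic()
    try:
        results = resolver(name, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Resolution failed (unknown name, or DNS unreachable after its own timeout).
        results = []
    elapsed = time.monotonic() - start
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address string.
    addresses = sorted({info[4][0] for info in results})
    return elapsed, addresses
```

Calling `time_resolution("rac-scan.example.com")` (a placeholder SCAN name) returns the elapsed time and the addresses found. A healthy SCAN name in DNS would normally return several addresses quickly; a long elapsed time or an empty list points at the kind of DNS problem described here.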

Server side

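On the server side, when the CRS agent checks the listener resources it tries to resolve names but cannot reach DNS, so the check times out. oraagent_oracle.log shows the following: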

2021-11-03 10:02:10.337: [   AGENT][1543]{1:45270:26878} {1:45270:26878} Created alert : (:CRSAGF00113:) :  Aborting the command: check for resource: ora.LISTENER_SCAN1.lsnr 1 1
2021-11-03 10:02:14.330: [    AGFW][1543]{1:45270:26878} Command: check for resource: ora.LISTENER_SCAN1.lsnr 1 1 completed with status: TIMEDOUT
2021-11-03 10:05:09.543: [   AGENT][6455]{1:45270:26878} {1:45270:26878} Created alert : (:CRSAGF00113:) :  Aborting the command: check for resource: ora.LISTENER_SCAN1.lsnr 1 1
2021-11-03 10:05:13.528: [    AGFW][6455]{1:45270:26878} Command: check for resource: ora.LISTENER_SCAN1.lsnr 1 1 completed with status: TIMEDOUT
2021-11-03 10:13:10.364: [   AGENT][1543]{1:45270:26878} {1:45270:26878} Created alert : (:CRSAGF00113:) :  Aborting the command: check for resource: ora.LISTENER_SCAN1.lsnr 1 1
2021-11-03 10:13:14.349: [    AGFW][1543]{1:45270:26878} Command: check for resource: ora.LISTENER_SCAN1.lsnr 1 1 completed with status: TIMEDOUT
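To see how often the check timed out during an incident, the TIMEDOUT lines can be pulled out of the agent log programmatically. The sketch below is only an illustration based on the line format in the excerpt above; the function name `timed_out_checks` is hypothetical, and other Grid Infrastructure versions may format these lines differently.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2021-11-03 10:02:14.330: [    AGFW][1543]{...} Command: check for resource:
#   ora.LISTENER_SCAN1.lsnr 1 1 completed with status: TIMEDOUT
TIMEDOUT = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): .*"
    r"check for resource: (\S+) .*completed with status: TIMEDOUT"
)

def timed_out_checks(lines):
    """Return a list of (timestamp, resource_name) for every timed-out check."""
    events = []
    for line in lines:
        match = TIMEDOUT.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            events.append((stamp, match.group(2)))
    return events
```

Passing an open log file to it, for example `timed_out_checks(open("oraagent_oracle.log"))`, gives the timestamps that can then be lined up against the DNS outage window.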


crsd.log shows:

2021-11-03 10:02:14.334: [    AGFW][9768]{0:7:5} Verifying msg rid = ora.LISTENER_SCAN1.lsnr 1 1
2021-11-03 10:02:14.335: [    AGFW][9768]{0:7:5} Received state change for ora.LISTENER_SCAN1.lsnr 1 1 [old state = ONLINE, new state = UNKNOWN]
2021-11-03 10:02:14.335: [    AGFW][9768]{0:7:5} Received state LABEL change for ora.LISTENER_SCAN1.lsnr 1 1 [old label  = , new label = CHECK TIMED OUT]
2021-11-03 10:02:14.344: [   CRSPE][11053]{0:7:5} State change received from hbjcdb02 for ora.LISTENER_SCAN1.lsnr 1 1
2021-11-03 10:02:14.454: [   CRSPE][11053]{0:7:5} Processing PE command id=96064. Description: [Resource State Change (ora.LISTENER_SCAN1.lsnr 1 1) : 11ada1550]
2021-11-03 10:02:14.461: [   CRSPE][11053]{0:7:5} RI [ora.LISTENER_SCAN1.lsnr 1 1] new external state [INTERMEDIATE] old value: [ONLINE] on hbjcdb02 label = [CHECK TIMED OUT]
2021-11-03 10:02:14.461: [   CRSPE][11053]{0:7:5} Set State Details to [CHECK TIMED OUT] from [ ] for [ora.LISTENER_SCAN1.lsnr 1 1]
2021-11-03 10:02:14.462: [   CRSPE][11053]{0:7:5} Processing unplanned state change for [ora.LISTENER_SCAN1.lsnr 1 1]
2021-11-03 10:02:14.463: [  CRSRPT][11310]{0:7:5} Published to EVM CRS_RESOURCE_STATE_CHANGE for ora.LISTENER_SCAN1.lsnr
2021-11-03 10:02:14.471: [   CRSPE][11053]{0:7:5} PE Command [ Resource State Change (ora.LISTENER_SCAN1.lsnr 1 1) : 11ada1550 ] has completed

lsnrctl status may hang, and the SCAN listener log shows:

 grep -B 1 TNS listener_scan1.log
03-NOV-2021 10:04:16 *  * (ADDRESS=(PROTOCOL=tcp)(HOST=::1)(PORT=9996)) * status *  * 12525
TNS-12525: TNS:listener has not received client's request in time allowed
 TNS-12535: TNS:operation timed out
  TNS-12606: TNS: Application timeout occurred

Client side

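Where DNS uptime is a concern or DNS latency may cause problems, the client connect descriptor can be configured to load-balance connections across the SCAN listeners by IP address instead of by name, for example: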

RAC =
   (DESCRIPTION=
     (ADDRESS_LIST=
       (LOAD_BALANCE=on)(FAILOVER=ON)
       (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
       (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
       (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
     )
     (CONNECT_DATA=(SERVICE_NAME= rac.example.com))
   )
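When many clients need this kind of entry, generating it from a list of SCAN IPs keeps the descriptors consistent. The following Python sketch is an assumption-laden illustration: the function name `scan_descriptor` is hypothetical, and the addresses used in the usage note come from the documentation range 192.0.2.0/24, not from the environment described here.

```python
import ipaddress

def scan_descriptor(alias, scan_ips, service_name, port=1521):
    """Build a tnsnames.ora entry that lists SCAN listeners by IP address."""
    for ip in scan_ips:
        # Reject host names: the point of this entry is to avoid DNS lookups.
        ipaddress.ip_address(ip)  # raises ValueError if `ip` is not an IP literal
    addresses = "\n".join(
        f"       (ADDRESS=(PROTOCOL=tcp)(HOST={ip})(PORT={port}))"
        for ip in scan_ips
    )
    return (
        f"{alias} =\n"
        "   (DESCRIPTION=\n"
        "     (ADDRESS_LIST=\n"
        "       (LOAD_BALANCE=on)(FAILOVER=ON)\n"
        f"{addresses}\n"
        "     )\n"
        f"     (CONNECT_DATA=(SERVICE_NAME={service_name}))\n"
        "   )"
    )
```

For example, `print(scan_descriptor("RAC", ["192.0.2.10", "192.0.2.11", "192.0.2.12"], "rac.example.com"))` prints an entry with the same shape as the one above, with one ADDRESS line per SCAN IP.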

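Note, however, that the SCAN listener still redirects the client by returning the VIP name, which the client must resolve again through DNS unless files comes before dns in the name service order. With the JDBC thin driver, a DNS lookup is still required even when files comes first. MOS describes this as follows: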

There is an additional DNS query performed after the client is redirected by the scan listener to the vip/database listener. The scan listener supplies the client with the vip name which the client must again resolve through DNS. Having each of the node vip names mapped to IPs in the client /etc/hosts file will suppress the DNS query provided nsswitch.conf is configured to use files before dns.

hosts: files dns

This workaround is suitable perhaps for a single or small group of systems (e.g., application servers) and will eliminate a bit more of any DNS related connection delays.

JDBC thin driver architecture differs from OCI in that the Listener must perform a DNS query with each incoming (JDBC thin) client connection. In that situation and if DNS response is slow all incoming client connections through the listener may be affected or delayed including other coinciding OCI connections. A recent performance enhancement to the JDBC thin client modifies the connection properties to lessen this DNS dependency at the listener. This improved JDBC connection behavior takes effect starting with Oracle 12.1.0.2.0 but the same enhancement can also be backported to earlier versions using bug 18369949.
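The `hosts: files dns` ordering quoted above is easy to verify on each client before relying on /etc/hosts entries for the VIP names. Below is a minimal Python sketch under the assumption of a standard glibc-style nsswitch.conf; the function name `files_before_dns` is hypothetical.

```python
import re

def files_before_dns(nsswitch_text):
    """Check the hosts line of an nsswitch.conf.

    Returns True if `files` is listed and comes before `dns` (or dns is absent),
    False if `files` is missing or listed after `dns`,
    and None if there is no hosts line at all.
    """
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line.startswith("hosts:"):
            continue
        # Remove action items such as [NOTFOUND=return] before splitting.
        body = re.sub(r"\[[^\]]*\]", " ", line[len("hosts:"):])
        sources = body.split()
        if "files" not in sources:
            return False
        if "dns" not in sources:
            return True
        return sources.index("files") < sources.index("dns")
    return None
```

On a client it could be called as `files_before_dns(open("/etc/nsswitch.conf").read())`. As the MOS text notes, this only helps the client-side lookup after the redirect; it does not remove the DNS query that the listener performs for JDBC thin connections.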
