Troubleshooting Oracle 11.2 DB Service Res stoped ORA-12514 after ‘stuck archiver’ ORA-257
最近遇到一起因oracle 数据库的归档空间耗尽,导致部分应用连接数据库时提示ORA-12514错误,而使用监听上现在的服务名连接数据库提示ora-257, DBA应该知道,ORA-12514错误显然是应用连接使用的服务名在listener没有办法匹配,而ora-257是归档空间满无法完成归档,怎么会ora-12514又和归档联系起来了呢?
ORA-12514: TNS:listener could not resolve SERVICE_NAME given in connect descriptor
ORA-12514 是 Oracle 数据库中常见的连接错误,表示客户端尝试连接到数据库实例时,监听器无法找到与指定的连接描述符相匹配的服务处理程序。
常见原因
1. 服务名配置错误:
客户端(含JDBC)提供的 SERVICE_NAME 与监听器配置文件(listener.ora)中定义的服务名不匹配。
2. 监听器未启动:
监听器服务未启动或未正确配置。
3. 数据库实例未注册:
数据库实例未注册到监听器,或者注册失败。
4. tnsnames.ora 配置错误:
客户端的 tnsnames.ora 文件配置错误,导致连接描述符不正确。
确认应用使用的服务名后,发现应用使用的是自已创建的service。而不是默认和db name同名的service, 通过v$service或下面的命令可以查询服务名或启动。
查看服务 $ srvctl config database $ srvctl config service -d <dbname> 启动服务 $ srvctl start service -d <dbname> -s <service name>
为什么归档满什么导致自建服务名stop呢?
梳理CRS层的alert.log和crsd.log,可以发现先是因为db资源label打上stuck archiver,然后从online 状态变为intermediate,然后Evm 还是通知db 资源已更改,然后service服务收到了消息,级联的改变状态,从online变为UNKOWN,然后CRS开始自动重启srv服务,但是无法成功,尝试failover,因为归档目录满无法启动,最后srv服务资源达到最大restart attempt exhausted上限停止。 因为srv服务资源依赖db资源。
–Oracle 23c RAC
[oracle@db1 ~]$ srvctl add service -db anbob -service my_1_2_srv -preferred "anbob1" -available "anbob2" [oracle@db1 ~]$ srvctl status service -db anbob Service my_1_2_srv is not running. [oracle@db1 ~]$ srvctl start service -db anbob -service my_1_2_srv [oracle@db1 ~]$ srvctl status service -db anbob Service my_1_2_srv is running on instance(s) anbob1 [oracle@db1 ~]$ /u01/app/23.0.0/grid/bin/crsctl stat res -t -------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.LISTENER.lsnr ONLINE ONLINE db1 STABLE ONLINE ONLINE db2 STABLE ... -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- .. ora.db1.vip 1 ONLINE ONLINE db1 STABLE ora.db2.vip 1 ONLINE ONLINE db2 STABLE ora.anbob.db 1 ONLINE ONLINE db1 Open,HOME=/u01/app/o racle/product/23.0.0 /dbhome_1,STABLE 2 ONLINE ONLINE db2 Open,HOME=/u01/app/o racle/product/23.0.0 /dbhome_1,STABLE ora.anbob.my_1_2_srv.svc 1 ONLINE ONLINE db1 STABLE ora.anbob.test.pdb 1 ONLINE ONLINE db1 STABLE 2 ONLINE ONLINE db2 STABLE ... -------------------------------------------------------------------------------- [oracle@db1 ~]$ /u01/app/23.0.0/grid/bin/crsctl stat res "ora.anbob.my_1_2_srv.svc" -p NAME=ora.anbob.my_1_2_srv.svc TYPE=ora.service.type ACL=owner:oracle:rwx,pgrp:oinstall:r--,other::r--,group:dba:r-x,group:oper:r-x,group:racdba:r-x,user:grid:r-x ACTIONS=isJavaService,group:"dba",group:"oper",group:"racdba",user:"grid",user:"oracle" rf_action,group:"dba",group:"oper",group:"racdba",user:"grid",user:"oracle" ACTION_SCRIPT= ACTION_TIMEOUT=600 ACTIVE_PLACEMENT=0 ADC_SERVICE= AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX% AGENT_PARAMETERS= ALLOW_RESTART=default AQ_HA_NOTIFICATION=0 AUTO_START=restore CARDINALITY=1 CHECK_INTERVAL=0 CHECK_TIMEOUT=30 CLB_GOAL=LONG CLEAN_TIMEOUT=60 COMMIT_OUTCOME=0 COMMIT_OUTCOME_FASTPATH=0 CSS_CRITICAL=no DELETE_TIMEOUT=60 DESCRIPTION=Oracle Service resource DRAIN_ID= DRAIN_TIMEOUT= DTP=0 EDITION= ENABLED=1 FAILOVER_DELAY=0 FAILOVER_METHOD= FAILOVER_RESTORE=NONE FAILOVER_RETRIES= FAILOVER_TYPE=NONE FAILURE_INTERVAL=0 FAILURE_THRESHOLD=0 GEN_CHILD_SERVICES= GEN_PDB_DB_DOMAIN= GEN_SERVICE_NAME=my_1_2_srv GEN_TENANT_ID= GLOBAL=false GSM_FLAGS=0 HOSTING_MEMBERS=db1 db2 HUB_SERVICE= IGNORE_TARGET_ON_FAILURE=no INSTANCE_FAILOVER=1 INTERMEDIATE_TIMEOUT=0 LOAD=1 LOGGING_LEVEL=1 MANAGEMENT_POLICY=AUTOMATIC MAX_LAG_TIME=ANY MODIFY_TIMEOUT=60 NLS_LANG= OFFLINE_CHECK_INTERVAL=0 PLACEMENT=restricted PLACEMENT_POLICY= PLACEMENT_TEMPLATE= PLUGGABLE_DATABASE= RANK=0 REGISTER_ENTRYPOINT_REQUIRED=everywhere REGISTER_TIMEOUT=60 RELOCATE_BY_DEPENDENCY=1 RELOCATE_KIND=online REPLAY_INITIATION_TIME=300 RESET_SESSION_STATE=NONE RESOURCE_GROUP= RESTART_ATTEMPTS=0 RESTART_DELAY=0 RETENTION=86400 RF_PARAMETERS= RF_SERVICE=0 RLB_GOAL=NONE ROLE=PRIMARY SCRIPT_TIMEOUT=60 SERVER_CATEGORY= SERVER_POOLS= SERVICE_NAME=my_1_2_srv SERVICE_NAME_PQ= SERVICE_TYPE=MAIN SESSION_NOREPLAY=false SESSION_STATE_CONSISTENCY= SQL_TRANSLATION_PROFILE= START_CONCURRENCY=0 START_DEPENDENCIES=hard(ora.anbob.db,type:ora.cluster_vip_net1.type) weak(type:ora.listener.type) pullup(type:ora.cluster_vip_net1.type) pullup:always(ora.anbob.db) START_DEPENDENCIES_RTE_INTERNAL=MAINTrueFalseora.anbob.db0Trueora.anbob.dbora.cluster_vip_net1.typeAUTOMATIC START_DEPENDENCIES_TEMPLATE= START_TIMEOUT=600 STOP_CONCURRENCY=0 STOP_DEPENDENCIES=hard(intermediate:ora.anbob.db,type:ora.cluster_vip_net1.type) STOP_DEPENDENCIES_RTE_INTERNAL=TrueFalseora.anbob.dbora.cluster_vip_net1.type STOP_DEPENDENCIES_TEMPLATE= STOP_OPTION= STOP_TIMEOUT=600 TABLE_FAMILY_ID= TAF_FAILOVER_DELAY= TAF_POLICY=NONE TARGET_DEFAULT=default TEMPLATE_TIMEOUT=86400 TYPE_VERSION=3.2 UPTIME_THRESHOLD=1h USER_WORKLOAD=yes USE_STICKINESS=0 USR_ORA_DISCONNECT=false USR_ORA_ENV= USR_ORA_FLAGS= USR_ORA_OPEN_MODE= USR_ORA_OPI=false USR_ORA_STOP_MODE= WORKLOAD_CPU=0 WORKLOAD_CPU_CAP=0 WORKLOAD_MEMORY_MAX=0 WORKLOAD_MEMORY_TARGET=0 [oracle@db1 ~]$ srvctl stop service -db anbob -service my_1_2_srv [oracle@db1 ~]$ srvctl remove service -db anbob -service my_1_2_srv [oracle@db1 ~]$ srvctl add service -db anbob -service my_1_2_srv -preferred anbob1,anbob2 -available anbob2 PRCD-2097 : failed to add or modify service using instance anbob because it is specified multiple times [oracle@db1 ~]$ srvctl add service -db anbob -service my_1_2_srv -preferred anbob1,anbob2 ... ora.anbob.my_1_2_srv.svc 1 OFFLINE OFFLINE STABLE 2 OFFLINE OFFLINE STABLE ...
正常情况下,如果归档清理后, db的状态应该改变并触发srv服务状态的状态改变,为什么没有呢? 客户环境是11.2.0.4, MOS查询了一下,存在一个bug. 影响 11.2.0.3, 11.2.0.4
Bug 17592037 Service does not failover when cannot logon to instance with archive error – superseded
When online logs cannot be archived to its targetted destination,oracle instance is in ‘Stuck Archiver’ condition and normal users may not login to the instance due to ORA-257.
However, under such conditions, service resource was not failing over to other available nodes.
his fix is superseded by the fix in bug 24451580.
Bug 24451580 Service resoure becomes UNKNOWN when database becomes stuck archive
When database becomes stuck archiver status due to archive process error,service resource becomes UNKNOWN whereas it is expected to become offline to failover.And even stuck archiver status is cleared, service resource does not back to online.
This problem is considered to happen from 11.2.0.4.160119 OCWPSU and 12.1.0.2.160119 OCWPSU.
目前这篇文章还没有评论(Rss)