首页 » ORACLE 9i-23ai » Troubleshooting Oracle 11.2 DB Service Res stoped ORA-12514 after ‘stuck archiver’ ORA-257

Troubleshooting Oracle 11.2 DB Service Res stoped ORA-12514 after ‘stuck archiver’ ORA-257

最近遇到一起因oracle 数据库的归档空间耗尽,导致部分应用连接数据库时提示ORA-12514错误,而使用监听上现在的服务名连接数据库提示ora-257, DBA应该知道,ORA-12514错误显然是应用连接使用的服务名在listener没有办法匹配,而ora-257是归档空间满无法完成归档,怎么会ora-12514又和归档联系起来了呢?

ORA-12514: TNS:listener could not resolve SERVICE_NAME given in connect descriptor

ORA-12514 是 Oracle 数据库中常见的连接错误,表示客户端尝试连接到数据库实例时,监听器无法找到与指定的连接描述符相匹配的服务处理程序。

常见原因
1. 服务名配置错误:
客户端(含JDBC)提供的 SERVICE_NAME 与监听器配置文件(listener.ora)中定义的服务名不匹配。
2. 监听器未启动:
监听器服务未启动或未正确配置。
3. 数据库实例未注册:
数据库实例未注册到监听器,或者注册失败。
4. tnsnames.ora 配置错误:
客户端的 tnsnames.ora 文件配置错误,导致连接描述符不正确。

确认应用使用的服务名后,发现应用使用的是自已创建的service。而不是默认和db name同名的service,  通过v$service或下面的命令可以查询服务名或启动。

查看服务
$ srvctl config database
$ srvctl config service   -d  <dbname>
启动服务
$ srvctl start service -d <dbname>   -s <service name>

为什么归档满什么导致自建服务名stop呢?
梳理CRS层的alert.log和crsd.log,可以发现先是因为db资源label打上stuck archiver,然后从online 状态变为intermediate,然后Evm 还是通知db 资源已更改,然后service服务收到了消息,级联的改变状态,从online变为UNKOWN,然后CRS开始自动重启srv服务,但是无法成功,尝试failover,因为归档目录满无法启动,最后srv服务资源达到最大restart attempt exhausted上限停止。 因为srv服务资源依赖db资源。

–Oracle 23c RAC

[oracle@db1 ~]$ srvctl add service -db anbob -service my_1_2_srv -preferred "anbob1" -available "anbob2"
[oracle@db1 ~]$ srvctl status service -db anbob
Service my_1_2_srv is not running.

[oracle@db1 ~]$ srvctl start service -db anbob -service my_1_2_srv
[oracle@db1 ~]$ srvctl status service -db anbob
Service my_1_2_srv is running on instance(s) anbob1

[oracle@db1 ~]$ /u01/app/23.0.0/grid/bin/crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1                      STABLE
               ONLINE  ONLINE       db2                      STABLE
...
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
..
ora.db1.vip
      1        ONLINE  ONLINE       db1                      STABLE
ora.db2.vip
      1        ONLINE  ONLINE       db2                      STABLE
ora.anbob.db
      1        ONLINE  ONLINE       db1                      Open,HOME=/u01/app/o
                                                             racle/product/23.0.0
                                                             /dbhome_1,STABLE
      2        ONLINE  ONLINE       db2                      Open,HOME=/u01/app/o
                                                             racle/product/23.0.0
                                                             /dbhome_1,STABLE
ora.anbob.my_1_2_srv.svc
      1        ONLINE  ONLINE       db1                      STABLE
ora.anbob.test.pdb
      1        ONLINE  ONLINE       db1                      STABLE
      2        ONLINE  ONLINE       db2                      STABLE
...
--------------------------------------------------------------------------------


[oracle@db1 ~]$ /u01/app/23.0.0/grid/bin/crsctl stat res "ora.anbob.my_1_2_srv.svc" -p
NAME=ora.anbob.my_1_2_srv.svc
TYPE=ora.service.type
ACL=owner:oracle:rwx,pgrp:oinstall:r--,other::r--,group:dba:r-x,group:oper:r-x,group:racdba:r-x,user:grid:r-x
ACTIONS=isJavaService,group:"dba",group:"oper",group:"racdba",user:"grid",user:"oracle" rf_action,group:"dba",group:"oper",group:"racdba",user:"grid",user:"oracle"
ACTION_SCRIPT=
ACTION_TIMEOUT=600
ACTIVE_PLACEMENT=0
ADC_SERVICE=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AGENT_PARAMETERS=
ALLOW_RESTART=default
AQ_HA_NOTIFICATION=0
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=0
CHECK_TIMEOUT=30
CLB_GOAL=LONG
CLEAN_TIMEOUT=60
COMMIT_OUTCOME=0
COMMIT_OUTCOME_FASTPATH=0
CSS_CRITICAL=no
DELETE_TIMEOUT=60
DESCRIPTION=Oracle Service resource
DRAIN_ID=
DRAIN_TIMEOUT=
DTP=0
EDITION=
ENABLED=1
FAILOVER_DELAY=0
FAILOVER_METHOD=
FAILOVER_RESTORE=NONE
FAILOVER_RETRIES=
FAILOVER_TYPE=NONE
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
GEN_CHILD_SERVICES=
GEN_PDB_DB_DOMAIN=
GEN_SERVICE_NAME=my_1_2_srv
GEN_TENANT_ID=
GLOBAL=false
GSM_FLAGS=0
HOSTING_MEMBERS=db1 db2
HUB_SERVICE=
IGNORE_TARGET_ON_FAILURE=no
INSTANCE_FAILOVER=1
INTERMEDIATE_TIMEOUT=0
LOAD=1
LOGGING_LEVEL=1
MANAGEMENT_POLICY=AUTOMATIC
MAX_LAG_TIME=ANY
MODIFY_TIMEOUT=60
NLS_LANG=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=restricted
PLACEMENT_POLICY=
PLACEMENT_TEMPLATE=
PLUGGABLE_DATABASE=
RANK=0
REGISTER_ENTRYPOINT_REQUIRED=everywhere
REGISTER_TIMEOUT=60
RELOCATE_BY_DEPENDENCY=1
RELOCATE_KIND=online
REPLAY_INITIATION_TIME=300
RESET_SESSION_STATE=NONE
RESOURCE_GROUP=
RESTART_ATTEMPTS=0
RESTART_DELAY=0
RETENTION=86400
RF_PARAMETERS=
RF_SERVICE=0
RLB_GOAL=NONE
ROLE=PRIMARY
SCRIPT_TIMEOUT=60
SERVER_CATEGORY=
SERVER_POOLS=
SERVICE_NAME=my_1_2_srv
SERVICE_NAME_PQ=
SERVICE_TYPE=MAIN
SESSION_NOREPLAY=false
SESSION_STATE_CONSISTENCY=
SQL_TRANSLATION_PROFILE=
START_CONCURRENCY=0
START_DEPENDENCIES=hard(ora.anbob.db,type:ora.cluster_vip_net1.type) weak(type:ora.listener.type) pullup(type:ora.cluster_vip_net1.type) pullup:always(ora.anbob.db)
START_DEPENDENCIES_RTE_INTERNAL=MAINTrueFalseora.anbob.db0Trueora.anbob.dbora.cluster_vip_net1.typeAUTOMATIC
START_DEPENDENCIES_TEMPLATE=
START_TIMEOUT=600
STOP_CONCURRENCY=0
STOP_DEPENDENCIES=hard(intermediate:ora.anbob.db,type:ora.cluster_vip_net1.type)
STOP_DEPENDENCIES_RTE_INTERNAL=TrueFalseora.anbob.dbora.cluster_vip_net1.type
STOP_DEPENDENCIES_TEMPLATE=
STOP_OPTION=
STOP_TIMEOUT=600
TABLE_FAMILY_ID=
TAF_FAILOVER_DELAY=
TAF_POLICY=NONE
TARGET_DEFAULT=default
TEMPLATE_TIMEOUT=86400
TYPE_VERSION=3.2
UPTIME_THRESHOLD=1h
USER_WORKLOAD=yes
USE_STICKINESS=0
USR_ORA_DISCONNECT=false
USR_ORA_ENV=
USR_ORA_FLAGS=
USR_ORA_OPEN_MODE=
USR_ORA_OPI=false
USR_ORA_STOP_MODE=
WORKLOAD_CPU=0
WORKLOAD_CPU_CAP=0
WORKLOAD_MEMORY_MAX=0
WORKLOAD_MEMORY_TARGET=0

[oracle@db1 ~]$ srvctl stop service -db anbob -service my_1_2_srv
[oracle@db1 ~]$ srvctl remove service -db anbob -service my_1_2_srv

[oracle@db1 ~]$ srvctl add service -db anbob -service my_1_2_srv -preferred anbob1,anbob2 -available anbob2
PRCD-2097 : failed to add or modify service using instance anbob because it is specified multiple times

[oracle@db1 ~]$ srvctl add service -db anbob -service my_1_2_srv -preferred anbob1,anbob2

...
ora.anbob.my_1_2_srv.svc
      1        OFFLINE OFFLINE                               STABLE
      2        OFFLINE OFFLINE                               STABLE
...

正常情况下,如果归档清理后, db的状态应该改变并触发srv服务状态的状态改变,为什么没有呢? 客户环境是11.2.0.4, MOS查询了一下,存在一个bug. 影响 11.2.0.3, 11.2.0.4

Bug 17592037 Service does not failover when cannot logon to instance with archive error – superseded

When online logs cannot be archived to its targetted destination,oracle instance is in ‘Stuck Archiver’ condition and normal users  may not login to the instance due to ORA-257.

However, under such conditions, service resource was not failing over to other available nodes.

his fix is superseded by the fix in bug 24451580.

Bug 24451580 Service resoure becomes UNKNOWN when database becomes stuck archive

When database becomes stuck archiver status due to archive process error,service resource becomes UNKNOWN whereas it is expected to become offline to failover.And even stuck archiver status is cleared, service resource does not back to online.

This problem is considered to happen from 11.2.0.4.160119 OCWPSU and 12.1.0.2.160119 OCWPSU.

打赏

, , ,

目前这篇文章还没有评论(Rss)

我要评论