首页 » OceanBase, ORACLE 9i-23ai » Oceanbase不建议你模仿Oracle的错误编号(ORA-NNNNN)!

Oceanbase不建议你模仿Oracle的错误编号(ORA-NNNNN)!

故事从一个客户正在将数据库平台从Oracle迁移到OceanBase的过程中说起。在并轨生产运行过程中,应用报告了ORA-54错误号,这让Oracle的数据库管理员感到震惊,并花费了大量时间来进行分析。然而,他们最终发现这个错误是由OceanBase的驱动程序抛出的。在尤其有些客户一点儿问题全中心排查躁动,这类报错让Oracle DBA分析oracle数据库,确实会让人要想爆粗,浪费了甲方的付费的乙方资源,同时还不利于故障定位。 下面对比一下oracle和oceanbase的resource busy报错。

oracle

-- session 1 sid=1295
SQL> create table test2(id int primary key, name varchar2(10));
Table created.

SQL> insert into test2 values(1,'anbob');
1 row created.

SQL> insert into test2 values(2,'anbob.com');
1 row created.

SQL> select * from test2;
        ID NAME
---------- ----------
         1 anbob
         2 anbob.com

SQL> commit;
Commit complete.

SQL> update test2 set name='www.anbob' where id=1;
1 row updated.
------ no commit;


-- session 2 sid=1040
SQL> alter system set ddl_lock_timeout=0;
System altered.

SQL> alter table test2 modify name varchar2(100);
alter table test2 modify name varchar2(100)
            *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

SQL> select * from test2 for update nowait;
select * from test2 for update nowait
              *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired


SQL> select * from test2 for update wait 5;
select * from test2 for update wait 5
-- wait 5 sec
              *
ERROR at line 1:
ORA-30006: resource busy; acquire with WAIT timeout expired

SQL> select * from test2 for update;
select * from test2 for update
-- waiting until ctrl+c
              *
ERROR at line 1:
ORA-01013: user requested cancel of current operation

SQL> @lock 'sid in(1295,1040)'

    SID TY      LMODE MODE_HELD          REQUEST MODE_REQUESTED  LOCK_ID1   LOCK_ID2        CTIME      BLOCK
------- -- ---------- --------------- ---------- --------------- ---------- ---------- ---------- ----------
   1040 TX          0 None                     6 Exclusive       13369353   3874               33          0
   1040 TM          3 Row-X (SX)               0 None            181384     0                  33          2
   1040 AE          4 Share                    0 None            134        1                 811          2
   1295 TX          6 Exclusive                0 None            13369353   3874              817          1
   1295 AE          4 Share                    0 None            134        1                 878          2
   1295 TM          3 Row-X (SX)               0 None            181384     0                 817          2
6 rows selected.

SQL> select * from dba_waiters;

WAITING_SESSION WAITING_CON_ID HOLDING_SESSION HOLDING_CON_ID LOCK_TYPE                  MODE_HELD                                MODE_REQUESTED                             LOCK_ID1   LOCK_ID2
--------------- -------------- --------------- -------------- -------------------------- ---------------------------------------- ---------------------------------------- ---------- ----------
           1040              1            1295              1 Transaction                Exclusive                                Exclusive                                  13369353       3874

Note:
DDL受ddl_lock_timeout参数影响单位s,超时报ora-54。但是select for update不带wait限制, 可能是for ever吧. 如果for update nowait是报ora-54, 如果是for update wait N seconds超时后是报ORA-30006, 解决可以从后台找到block session 事务没有提交的会话kill .

另外之前blog也提示过,oracle官方退出了oracle error-help功能网站,可以根据错误编号查找更多的处理信息,如https://docs.oracle.com/en/error-help/db/ora-00054/index.html,如下

ORA-00054

resource busy and acquire with NOWAIT specified or timeout expired

Cause

Interested resource is busy.


Action

Retry if necessary or increase timeout.


Additional Information

To identify the session that is holding the lock in question, the following statement can be run (note, the query below requires SELECT privileges on gv$session and gv$lock):

SELECT l.inst_id, s.sid, s.serial#, s.username, s.sql_id,
       l.ctime AS "LOCK_HOLD_TIME_ELAPSED_SECONDS",
       s.client_identifier, s.module, s.action
FROM gv$session s, gv$lock l
WHERE l.sid     = s.sid
  AND l.inst_id = l.inst_id
  AND l.type    = '<lock_type>'
  AND l.id1     = <resource_id1_from_error_message>
  AND l.id2     = <resource_id2_from_error_message>;

The column LOCK_HOLD_TIME_ELAPSED_SECONDS (ctime) is the time since the current lock was granted. In order words, it is the number of seconds the other session has held the lock for.

Oceanbase for oracle

[admin@OceanBase1 ~]$ obclient -h172.xxxxxx -P2883 -uanbob@orauser#obdemo:10001 -p
Enter password:
Welcome to the OceanBase.  Commands end with ; or \g.
Your OceanBase connection id is 25041
Server version: OceanBase 3.2.4.1   (Built Jan  8 2023 22:52:43)
Copyright (c) 2000, 2018, OceanBase and/or its affiliates. All rights reserved.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

obclient [ANBOB]> create table test2(id int primary key, name varchar2(10));
Query OK, 0 rows affected (0.160 sec)

obclient [ANBOB]> insert into test2 values(1,'anbob');
Query OK, 1 row affected (0.019 sec)

obclient [ANBOB]> insert into test2 values(2,'anbob.com');
Query OK, 1 row affected (0.002 sec)
obclient [ANBOB]> commit;
obclient [ANBOB]> select * from test2;
+----+-----------+
| ID | NAME      |
+----+-----------+
|  1 | anbob     |
|  2 | anbob.com |
+----+-----------+
2 rows in set (0.014 sec)
# session 1
obclient [ANBOB]> update test2 set name='www.anbob' where id=1;
Query OK, 1 row affected (0.005 sec)
Rows matched: 1  Changed: 1  Warnings: 0
-- no commit;
# session 2

obclient [ANBOB]> select * from test2;
+----+-----------+
| ID | NAME      |
+----+-----------+
|  1 | anbob     |
|  2 | anbob.com |
+----+-----------+
2 rows in set (0.002 sec)

obclient [ANBOB]> alter table test2 modify name varchar2(100);
Query OK, 0 rows affected (0.067 sec)

obclient [ANBOB]> desc test2;
+-------+---------------+------+-----+---------+-------+
| FIELD | TYPE          | NULL | KEY | DEFAULT | EXTRA |
+-------+---------------+------+-----+---------+-------+
| ID    | NUMBER(38)    | NO   | PRI | NULL    | NULL  |
| NAME  | VARCHAR2(100) | YES  | NULL | NULL    | NULL  |
+-------+---------------+------+-----+---------+-------+
2 rows in set (0.007 sec)

# session 1
obclient [ANBOB]> desc test2;
+-------+---------------+------+-----+---------+-------+
| FIELD | TYPE          | NULL | KEY | DEFAULT | EXTRA |
+-------+---------------+------+-----+---------+-------+
| ID    | NUMBER(38)    | NO   | PRI | NULL    | NULL  |
| NAME  | VARCHAR2(100) | YES  | NULL | NULL    | NULL  |
+-------+---------------+------+-----+---------+-------+
2 rows in set (0.005 sec)

obclient [ANBOB]> select * from test2;
+----+-----------+
| ID | NAME      |
+----+-----------+
|  1 | www.anbob |
|  2 | anbob.com |
+----+-----------+
2 rows in set (0.004 sec)

Note:
在Oceanbase(以下简称OB) 事务并不会堵塞DDL.这点与oracle不同。

测试OB中的select for update

# session 1
obclient [ANBOB]> SHOW VARIABLES LIKE 'ob_query_timeout';
+------------------+----------------+
| VARIABLE_NAME    | VALUE          |
+------------------+----------------+
| ob_query_timeout | 10000000000000 |
+------------------+----------------+
1 row in set (0.006 sec)

obclient [ANBOB]>  ALTER SYSTEM SET ob_query_timeout = 72;
Query OK, 0 rows affected (0.003 sec)

obclient [ANBOB]> select * from test2 where id=1 for update nowait;
ORA-00600: internal error code, arguments: -4012, Timeout

obclient [ANBOB]> ALTER SESSION SET ob_query_timeout = 7200;
Query OK, 0 rows affected (0.002 sec)

obclient [ANBOB]> SHOW VARIABLES LIKE 'ob_query_timeout';
+------------------+-------+
| VARIABLE_NAME    | VALUE |
+------------------+-------+
| ob_query_timeout | 7200  |
+------------------+-------+
1 row in set (0.005 sec)

obclient [ANBOB]> select * from test2 where id=1 for update nowait;
ORA-00600: internal error code, arguments: -6212, Statement is timeout

Note:
原来ob_query_timeout参数配置的太小,自己的SQL都没执行完,单位是微秒,(这单位是否合理?自行判断),建议OB在数据库参数上也限制一下最小值。修改此参数意为让for update尽快报错.不过上面看到报出来oracle DBA一个很有挑战的错误代码ORA-600 Interal error。 不过参数还不一样4012是timeout, 6212 是statement is timeout. 不纠结,知道了单位我把配置改为10秒.

obclient [ANBOB]> alter session set ob_query_timeout =10000000;
Query OK, 0 rows affected (0.002 sec)

#session 1
obclient [ANBOB]> select * from test2 where id=1 for update ;
+----+-----------+
| ID | NAME      |
+----+-----------+
|  1 | www.anbob |
+----+-----------+
1 row in set (0.002 sec)

# session 2
obclient [ANBOB]> select * from test2 where id=1 for update nowait;
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired    
--立即返回报错与oracle相同

obclient [ANBOB]> select * from test2 where id=1 for update ;     --受ob_query_timeout 影响
-- wait 10s
ORA-30006: resource busy; acquire with WAIT timeout expired

obclient [ANBOB]> alter session set ob_query_timeout =20000000;
Query OK, 0 rows affected (0.002 sec)

obclient [ANBOB]> select * from test2 where id=1 for update ;
-- wait 20s
ORA-30006: resource busy; acquire with WAIT timeout expired

Note:
select for update与oracle的预期一样。报错的文本内容和oracle都一模一样,这让兼容性是好是坏?

ORACLE 错误编号

oracle@19c1:/home/oracle/tpt-oracle-master $ oerr ora 30006
30006, 00000, "resource busy; acquire with WAIT timeout expired"
// *Cause:  The requested resource is busy.
// *Action: Retry the operation later.
oracle@19c1:/home/oracle/tpt-oracle-master $ oerr ora 54
00054, 00000, "resource busy and acquire with NOWAIT specified or timeout expired"
// *Cause:  Interested resource is busy.
// *Action: Retry if necessary or increase timeout.
oracle@19c1:/home/oracle/tpt-oracle-master $

OCEANBASE ORA错误
https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000000218668

在OB中如何解锁?

obclient [oceanbase]> select * from gv$table where table_id=1100611139453790 \G
*************************** 1. row ***************************
         tenant_id: 1001
       tenant_name: orauser
          table_id: 1100611139453790
        table_name: TEST2
       database_id: 1100611139404831
     database_name: ANBOB
     tablegroup_id: -1
   tablegroup_name: NULL
        table_type: 3
         zone_list: zone1;zone2;zone3
      primary_zone: zone1;zone2;zone3
    collation_type: 46
          locality: FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3
    schema_version: 1703594038143136
         read_only: 0
           comment:
      index_status: 1
        index_type: 0
        part_level: 0
    part_func_type: 0
    part_func_expr:
          part_num: 1
sub_part_func_type: 0
sub_part_func_expr:
      sub_part_num: 1
               dop: 1
         auto_part: 0
    auto_part_size: -1
1 row in set (0.019 sec)

obclient [oceanbase]> SELECT * FROM __all_virtual_trans_lock_stat ORDER BY ctx_create_time limit 8\G
*************************** 1. row ***************************
      tenant_id: 1001
       trans_id: {hash:14812608598155958701, inc:88587874, addr:"172.20.xxxx:2882", t:1703597571331695}
         svr_ip: 172.20.xxx
       svr_port: 2882
      partition: {tid:1100611139453790, partition_id:0, part_cnt:0}
       table_id: 1100611139453790
         rowkey: table_id=1100611139453790 rowkey_object=[{"DECIMAL":"1"}]
     session_id: 3221600981
       proxy_id: "172.20.yyyyy:2883"
ctx_create_time: 2023-12-26 21:32:51.331695
   expired_time: 2024-01-07 11:19:31.331902
  row_lock_addr: 140012405040624
1 row in set (0.059 sec)

obclient [oceanbase]> kill query 3221600981;
Query OK, 0 rows affected (0.001 sec)

obclient [oceanbase]> SELECT * FROM __all_virtual_trans_lock_stat ORDER BY ctx_create_time limit 8\G
*************************** 1. row ***************************
      tenant_id: 1001
       trans_id: {hash:14812608598155958701, inc:88587874, addr:"172.20.xxxxx:2882", t:1703597571331695}
         svr_ip: 172.20.xxxx
       svr_port: 2882
      partition: {tid:1100611139453790, partition_id:0, part_cnt:0}
       table_id: 1100611139453790
         rowkey: table_id=1100611139453790 rowkey_object=[{"DECIMAL":"1"}]
     session_id: 3221600981
       proxy_id: "172.20.yyyy:2883"
ctx_create_time: 2023-12-26 21:32:51.331695
   expired_time: 2024-01-07 11:19:31.331902
  row_lock_addr: 140012405040624
1 row in set (0.062 sec)

obclient [oceanbase]> kill connection 3221600981;
Query OK, 0 rows affected (0.002 sec)

obclient [oceanbase]> SELECT * FROM __all_virtual_trans_lock_stat ORDER BY ctx_create_time limit 8\G
Empty set (0.072 sec)



# session 1
obclient [ANBOB]> select * from test2 where id=1 for update ;
ERROR-02013: Lost connection to MySQL server during query
obclient [ANBOB]>

Note:
从__all_virtual_trans_lock_stat可以查询到lock资源(但在V3版本可能存在查询不到的现象),而且kill query也并未生效,最后使用kill connection终止连接释放。

Summary:
oceanbase存在一个错误编号与oracle相同,相同的编号,错误message有的相同有的不同,且触发机制可能存在差异, 如果在当前的生产环境中存在oracle和oceanbase两类数据库的应用,应用中的错误日志根据错误编号不容易区分报错数据源是oracle还是oceanbase. 而且oracle的每个版本相同的错误code可能存在错误message的调整,如资源限制类。 恐怕OCEANBASE很难保持与ORACLE的同步, 而且OB的错误处理我相信要比ORACLE多,恐怕有一天会出现由于两种数据库系统在错误编号上可能存在相似之处,但错误消息却有所不同,这会增加诊断和故障排除的复杂性。

打赏

, , ,

对不起,这篇文章暂时关闭评论。