首页 » PostgreSQL/GaussDB » PostgreSQL系(GaussDB、KingBase…) 在 SELECT 和 TRUNCATE之间lock冲突

PostgreSQL系(GaussDB、KingBase…) 在 SELECT 和 TRUNCATE之间lock冲突

在从Oracle迁移到PostgreSQL后,SELECT和TRUNCATE操作互相阻塞是一个常见问题,这主要是由于两种数据库的锁机制不同导致的。目前一客户项目在我们迁移到postgresql系的国产数据库生产环境后,时常在一个存储过程中包含truncate又有select 长事务时发生互相堵塞,简单记录.

原因分析

  1. 锁机制差异

    • Oracle的TRUNCATE是DDL操作,Select Null 锁,通常不会与SELECT冲突

    • PostgreSQL中TRUNCATE需要获取ACCESS EXCLUSIVE锁,这是最强的锁级别,会阻塞所有其他操作

  2. MVCC实现差异

    • PostgreSQL的MVCC实现方式与Oracle不同

    • TRUNCATE在PostgreSQL中实际上是创建新文件并删除旧文件,需要完全独占访问

演示
— 这里使用kingbase

-- session 1
test=# create table test(id int);
CREATE TABLE
test=# insert into test values (1);
INSERT 0 1
test=# commit;
WARNING:  there is no transaction in progress
COMMIT
test=# begin;
BEGIN
test=# select * from test;
 id
----
  1
(1 row)


-- session 2
kingbase=# \c test
You are now connected to database "test" as userName "system".
test=# begin;
BEGIN
test=# truncate table test;
 waiting...

-- session 3
kingbase=# \c test;
You are now connected to database "test" as userName "system".
test=# select * from test;
 waiting...

Note:
看到堵塞session 1 select — session 2 truncate — session select 堵塞链

kingbase=# select datname,pid,xact_start,query_start,wait_event_type,wait_event,state,query from pg_stat_activity;
 datname  |   pid   |          xact_start           |          query_start          | wait_event_type |     wait_event      |        state        |
            query
----------+---------+-------------------------------+-------------------------------+-----------------+---------------------+---------------------+--------------------------------------
-------------------------------------------------------------------
 ....
 kingbase | 2424743 | 2025-04-27 17:08:39.106147+08 | 2025-04-27 17:08:39.106147+08 |                 |                     | active              | select datname,pid,xact_start,query_s
tart,wait_event_type,wait_event,state,query from pg_stat_activity;
 test     | 2380294 | 2025-04-27 17:06:00.351627+08 | 2025-04-27 17:06:07.249354+08 | Client          | ClientRead          | idle in transaction | select * from test;
 test     | 2381560 | 2025-04-27 17:06:13.855348+08 | 2025-04-27 17:06:15.955823+08 | Lock            | relation            | active              | truncate table test;
 test     | 2423637 | 2025-04-27 17:06:26.795951+08 | 2025-04-27 17:06:26.795951+08 | Lock            | relation            | active              | select * from test;
...
(11 rows)

kingbase=# select * from pg_locks;
   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |   pid   |        mode         | granted | fastpath
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+---------------------+---------+----------
 virtualxid    |          |          |      |       | 8/393      |               |         |       |          | 8/393              | 2423637 | ExclusiveLock       | t       | t
 virtualxid    |          |          |      |       | 7/6        |               |         |       |          | 7/6                | 2381560 | ExclusiveLock       | t       | t
 virtualxid    |          |          |      |       | 6/126      |               |         |       |          | 6/126              | 2380294 | ExclusiveLock       | t       | t
 relation      |    13591 |    12085 |      |       |            |               |         |       |          | 5/834432           | 2424743 | AccessShareLock     | t       | t
 virtualxid    |          |          |      |       | 5/834432   |               |         |       |          | 5/834432           | 2424743 | ExclusiveLock       | t       | t
 relation      |    13590 |    16799 |      |       |            |               |         |       |          | 8/393              | 2423637 | AccessShareLock     | f       | f
 transactionid |          |          |      |       |            |        459127 |         |       |          | 7/6                | 2381560 | ExclusiveLock       | t       | f
 relation      |    13590 |    16799 |      |       |            |               |         |       |          | 6/126              | 2380294 | AccessShareLock     | t       | f
 relation      |    13590 |    16799 |      |       |            |               |         |       |          | 7/6                | 2381560 | AccessExclusiveLock | f       | f
(9 rows)


Note:
注意TRUNCATE 需要AccessExclusiveLock (table) ,而select 需要 AccessShareLock(Table), Truncate的互斥锁排他与很多事务都有锁冲突。可以见文档PostgreSQL Lock Conflicts

我之前的文章有记录过一些<脚本:用于分析PostgreSQL lock trees(堵塞树) 的查询>

SELECT
    activity.pid,
    activity.usename,
    activity.query,
    blocking.pid AS blocking_id,
    blocking.query AS blocking_query
FROM pg_stat_activity AS activity
JOIN pg_stat_activity AS blocking ON blocking.pid = ANY(pg_blocking_pids(activity.pid));
  pid   | usename |        query         | blocking_id |    blocking_query
---------+---------+----------------------+-------------+----------------------
 2381560 | system  | truncate table test; |     2380294 | select * from test;
 2423637 | system  | select * from test;  |     2381560 | truncate table test;

kingbase=# select pg_blocking_pids(2380294);
 pg_blocking_pids
------------------
 {}
(1 row)

kingbase=# select pg_blocking_pids(2381560);
 pg_blocking_pids
------------------
 {2380294}
(1 row)

kingbase=# select pg_blocking_pids(2423637);
 pg_blocking_pids
------------------
 {2381560}
(1 row)

  SELECT blocked_locks.pid     AS blocked_pid,
         blocked_activity.usename  AS blocked_user,
         blocking_locks.pid     AS blocking_pid,
         blocking_activity.usename AS blocking_user,
         blocked_activity.query    AS blocked_statement,
         blocking_activity.query   AS current_statement_in_blocking_process
   FROM  pg_catalog.pg_locks         blocked_locks
    JOIN pg_catalog.pg_stat_activity blocked_activity  ON blocked_activity.pid = blocked_locks.pid
    JOIN pg_catalog.pg_locks         blocking_locks 
        ON blocking_locks.locktype = blocked_locks.locktype
        AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
        AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
        AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
        AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
        AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
        AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
        AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
        AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
        AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
        AND blocking_locks.pid != blocked_locks.pid

    JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
   WHERE NOT blocked_locks.granted;
 blocked_pid | blocked_user | blocking_pid | blocking_user |  blocked_statement   | current_statement_in_blocking_process
-------------+--------------+--------------+---------------+----------------------+---------------------------------------
     2381560 | system       |      2423637 | system        | truncate table test; | select * from test;
     2381560 | system       |      2380294 | system        | truncate table test; | select * from test;
     2423637 | system       |      2380294 | system        | select * from test;  | select * from test;
     2423637 | system       |      2381560 | system        | select * from test;  | truncate table test;

或使用 lock monitoring view

kingbase=# select * from v_locks_monitor;
 locktype | datname | relation | page | tuple | transactionid | classid | objid | objsubid |                                                                   lock_conflict

----------+---------+----------+------+-------+---------------+---------+-------+----------+---------------------------------------------------------------------------------------------
-------------------------------------------------------
 relation | test    | 16799    |      |       |               |         |       |          | Pid: 2381560
                                                      +
          |         |          |      |       |               |         |       |          | Lock_Granted: false , Mode: AccessExclusiveLock , Username: system , Database: test , Client
_Addr: NULL , Client_Port: -1 , Application_Name: ksql+
          |         |          |      |       |               |         |       |          |  , Xact_Start: 2025-04-27 17:06:13.855348+08 , Query_Start: 2025-04-27 17:06:15.955823+08 ,
Xact_Elapse: +000000000 00:17:23.483185000            +
          |         |          |      |       |               |         |       |          | --------
                                                      +
          |         |          |      |       |               |         |       |          | 0Pid: 2380294
                                                      +
          |         |          |      |       |               |         |       |          | Lock_Granted: true , Mode: AccessShareLock , Username: system , Database: test , Client_Addr
: NULL , Client_Port: -1 , Application_Name: ksql     +
          |         |          |      |       |               |         |       |          |  , Xact_Start: 2025-04-27 17:06:00.351627+08 , Query_Start: 2025-04-27 17:06:07.249354+08 ,
Xact_Elapse: +000000000 00:17:36.986906000            +
          |         |          |      |       |               |         |       |          | --------
                                                      +
          |         |          |      |       |               |         |       |          | 0Pid: 2380294
                                                      +
          |         |          |      |       |               |         |       |          | Lock_Granted: true , Mode: AccessShareLock , Username: system , Database: test , Client_Addr
: NULL , Client_Port: -1 , Application_Name: ksql     +
          |         |          |      |       |               |         |       |          |  , Xact_Start: 2025-04-27 17:06:00.351627+08 , Query_Start: 2025-04-27 17:06:07.249354+08 ,
Xact_Elapse: +000000000 00:17:36.986906000            +
          |         |          |      |       |               |         |       |          | --------
                                                      +
          |         |          |      |       |               |         |       |          | 1Pid: 2423637
                                                      +
          |         |          |      |       |               |         |       |          | Lock_Granted: false , Mode: AccessShareLock , Username: system , Database: test , Client_Add
r: NULL , Client_Port: -1 , Application_Name: ksql    +
          |         |          |      |       |               |         |       |          |  , Xact_Start: 2025-04-27 17:06:26.795951+08 , Query_Start: 2025-04-27 17:06:26.795951+08 ,
Xact_Elapse: +000000000 00:17:10.542582000            +
          |         |          |      |       |               |         |       |          | --------
                                                      +
          |         |          |      |       |               |         |       |          |
(1 row)

预防措施

在迁移前充分测试锁行为
评估所有TRUNCATE操作的使用场景
培训团队了解PostgreSQL与Oracle在并发控制方面的差异

解决方案

在低峰期执行TRUNCATE操作
使用短事务,避免长事务与TRUNCATE冲突
通过设置较低的 lock_timeout 来优雅地处理 TRUNCATE 操作可能导致的阻塞问题。

DO $$
BEGIN
  FOR i IN 1..10 LOOP  -- 最多尝试10次
    BEGIN
      SET LOCAL lock_timeout = '100ms';  -- 设置很短的超时
      TRUNCATE TABLE target_table;
      EXIT;  -- 成功则退出循环
    EXCEPTION WHEN OTHERS THEN
      RAISE NOTICE 'Attempt % failed, retrying...', i;
      PERFORM pg_sleep(0.1 * i);  -- 指数退避
    END;
  END LOOP;
END $$;

-- 封装函数
CREATE OR REPLACE FUNCTION safe_truncate(table_name text, max_attempts int DEFAULT 5) 
RETURNS void AS $$
DECLARE
  attempt int := 1;
  success boolean := false;
BEGIN
  WHILE attempt <= max_attempts AND NOT success LOOP
    BEGIN
      EXECUTE format('SET LOCAL lock_timeout = %L', 
                     CASE 
                       WHEN attempt <= 2 THEN '50ms'
                       WHEN attempt <= 4 THEN '100ms'
                       ELSE '200ms'
                     END);
      
      EXECUTE format('TRUNCATE TABLE %I', table_name);
      success := true;
      
    EXCEPTION WHEN OTHERS THEN
      RAISE NOTICE 'Truncate attempt %/% failed on %: %', 
                   attempt, max_attempts, table_name, SQLERRM;
      PERFORM pg_sleep(0.2 * attempt);  -- 指数退避
      attempt := attempt + 1;
    END;
  END LOOP;
  
  IF NOT success THEN
    RAISE EXCEPTION 'Failed to truncate % after % attempts', table_name, max_attempts;
  END IF;
END;
$$ LANGUAGE plpgsql;

-- 使用示例
SELECT safe_truncate('my_large_table');

Note:
也可以如果多次失败,回退到DELETE + VACUUM方案

打赏

,

目前这篇文章还没有评论(Rss)

我要评论