如何在openGauss/PostgreSQL/KingBASE手动清理XLOG/WAL 文件?
openGauss/PostgreSQL中的预写式日志WAL(Write Ahead Log),又名Xlog或redo log,相当于oracle的online redo log, 不同的是oracle online redo log是提前创建几组滚动使用,但在opengauss中只需要本配置参数控制WAL日志的周期,数据库会一直的创建并自动清理,但存在一些情况WAL日志未清理导致目录空间耗尽,或目录空间紧张时手动删除wal日志时,比如如何确认在非归档模式下哪些WAL日志文件可以安全删除?
数据库参数
先看几个数据库实例控制的与wal日志相关的, 这里是这的测试环境openGauss 5.0,默认值。
openGauss=# \! sh show wal enable_mix_replication | off | All the replication log sent by the wal streaming. enable_wal_shipping_compression | off | enable compress xlog during xlog shipping. hadr_max_size_for_xlog_receiver | 256GB | This param set by user is used for xlog to stop receiving when the gap is larger than this param between replay xlog location and walreceiver r max_wal_senders | 4 | Sets the maximum number of simultaneously running WAL sender processes. wal_block_size | 8192 | Shows the block size in the write ahead log. wal_buffers | 16MB | Sets the number of disk-page buffers in shared memory for WAL. wal_file_init_num | 10 | Sets the number of xlog segment files that WAL writer auxiliary thread creates at one time. wal_flush_delay | 1 | set delay time when iterator table entry. wal_flush_timeout | 2 | set timeout when iterator table entry. wal_keep_segments | 16 | Sets the number of WAL files held for standby servers. wal_level | hot_standby | Sets the level of information written to the WAL. wal_log_hints | on | Writes full pages to WAL when first modified after a checkpoint, even for a non-critical modifications. wal_receiver_buffer_size | 64MB | Sets the buffer size to receive data from master. wal_receiver_connect_retries | 1 | Sets the maximum retries to connect master. wal_receiver_connect_timeout | 2s | Sets the maximum wait time to connect master. wal_receiver_status_interval | 5s | Sets the maximum interval between WAL receiver status reports to the primary. wal_receiver_timeout | 6s | Sets the maximum wait time to receive data from master. wal_segment_size | 16MB | Shows the number of pages per write ahead log segment. wal_sender_timeout | 6s | Sets the maximum time to wait for WAL replication. wal_sync_method | fdatasync | Selects the method used for forcing WAL updates to disk. wal_writer_delay | 200ms | WAL writer sleep time between WAL flushes. walsender_max_send_size | 8MB | Size of walsender max send size. walwriter_cpu_bind | -1 | Sets the binding CPU number for the WAL writer thread. walwriter_sleep_threshold | 500 | Number of idle xlog flushes before xlog flusher goes to sleep. openGauss=# \! sh show xlog advance_xlog_file_num | 0 | Sets the number of xlog files to be initialized in advance. archive_timeout | 0 | Forces a switch to the next xlog file if a new file has not been started within N seconds. enable_wal_shipping_compression | off | enable compress xlog during xlog shipping. enable_xlog_prune | on | Enable xlog prune when not all standys connected and xlog size is largger than max_xlog_size hadr_max_size_for_xlog_receiver | 256GB | This param set by user is used for xlog to stop receiving when the gap is larger than this param between replay xlog location and walreceiver r max_size_for_xlog_prune | 2147483647kB | This param set by user is used for xlog to be recycled when not all are connected and the param enable_xlog_prune is on. recovery_parse_workers | 1 | The number of recovery threads to do xlog parse. recovery_redo_workers | 1 | The number belonging to one parse worker to do xlog redo. wal_file_init_num | 10 | Sets the number of xlog segment files that WAL writer auxiliary thread creates at one time. walwriter_sleep_threshold | 500 | Number of idle xlog flushes before xlog flusher goes to sleep. xlog_file_path | | use only one xlog file, the path of it xlog_file_size | 549755813888 | share storage xlog file size xlog_lock_file_path | | used to control write to xlog_file_path xloginsert_locks | 8 | Sets the number of locks used for concurrent xlog insertions. openGauss=# show archive_mode; archive_mode -------------- off (1 row)
自动清理WAL
WAL日志并不是一直存在,会自动的清理。wal_keep_segments参数控制wal日志保留的个数,默认保留最近16个。“pg_xlog”目录下保留事务日志文件的最小数目。
另一个参数max_size_for_xlog_prune参数,在enable_xlog_prune打开时生效,如果有备机断连且xlog日志大小大于此阈值,则回收日志。自动清理时会在pg_log中记录:
[BACKEND] LOG: attempting to remove WAL segments older than log file 000000010000000000000008 [BACKEND] LOG: attempting to remove WAL segments older than log file 000000010000000000000009
我们看一下是哪16个
[og@oel7db1 pg_xlog]$ ls -lrt|cat -n
1 total 425988
2 drwx------ 2 og og 4096 May 8 11:36 archive_status
3 -rw------- 1 og og 16777216 May 10 17:17 00000001000000000000001C
4 -rw------- 1 og og 16777216 May 27 15:58 00000001000000000000001D
5 -rw------- 1 og og 16777216 May 27 15:59 00000001000000000000001E
6 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000001F
7 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000020
8 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000021
9 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000022
10 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000023
11 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000A
12 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000B
13 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000C
14 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000D
15 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000E
16 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000F
17 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000010
18 -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000011
19 -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000012
20 -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000013
21 -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000014
22 -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000015
23 -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000016
24 -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000017
25 -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000018
26 -rw------- 1 og og 16777216 May 27 16:52 000000010000000000000019
27 -rw------- 1 og og 16777216 May 27 16:52 00000001000000000000001A
28 -rw------- 1 og og 16777216 May 27 16:53 00000001000000000000001B
note:
减去当前和之后的。
手动日志切换
相当于Oracle的alter system switch logfile;
penGauss=# select pg_switch_xlog(); pg_switch_xlog ---------------- 0/F000168 (1 row) openGauss=# select pg_switch_xlog(); pg_switch_xlog ---------------- 0/10000168 (1 row)
查看当前的WAL日志
# DB内部 openGauss=# select * from pg_ls_waldir() order by modification asc; name | size | modification --------------------------+----------+------------------------ 000000010000000000000001 | 16777216 | 2023-05-08 11:37:00+08 000000010000000000000002 | 16777216 | 2023-05-10 17:17:55+08 000000010000000000000003 | 16777216 | 2023-05-27 15:58:58+08 000000010000000000000004 | 16777216 | 2023-05-27 15:59:02+08 000000010000000000000005 | 16777216 | 2023-05-27 16:00:24+08 000000010000000000000006 | 16777216 | 2023-05-27 16:00:27+08 000000010000000000000007 | 16777216 | 2023-05-27 16:00:29+08 000000010000000000000008 | 16777216 | 2023-05-27 16:00:31+08 000000010000000000000009 | 16777216 | 2023-05-27 16:00:33+08 00000001000000000000000A | 16777216 | 2023-05-27 16:00:34+08 00000001000000000000000B | 16777216 | 2023-05-27 16:00:36+08 00000001000000000000000C | 16777216 | 2023-05-27 16:00:38+08 00000001000000000000000D | 16777216 | 2023-05-27 16:00:39+08 00000001000000000000000E | 16777216 | 2023-05-27 16:00:41+08 00000001000000000000000F | 16777216 | 2023-05-27 16:00:43+08 000000010000000000000012 | 16777216 | 2023-05-27 16:00:44+08 000000010000000000000013 | 16777216 | 2023-05-27 16:00:44+08 000000010000000000000014 | 16777216 | 2023-05-27 16:00:45+08 000000010000000000000016 | 16777216 | 2023-05-27 16:00:45+08 000000010000000000000010 | 16777216 | 2023-05-27 16:00:45+08 000000010000000000000015 | 16777216 | 2023-05-27 16:00:45+08 000000010000000000000017 | 16777216 | 2023-05-27 16:00:46+08 000000010000000000000019 | 16777216 | 2023-05-27 16:00:46+08 000000010000000000000018 | 16777216 | 2023-05-27 16:00:46+08 00000001000000000000001A | 16777216 | 2023-05-27 16:00:47+08 000000010000000000000011 | 16777216 | 2023-05-27 16:03:45+08 (26 rows) # 操作系统 [og@oel7db1 pg_xlog]$ ls -lrt|cat -n 1 total 425988 2 drwx------ 2 og og 4096 May 8 11:36 archive_status 3 -rw------- 1 og og 16777216 May 8 11:37 000000010000000000000001 4 -rw------- 1 og og 16777216 May 10 17:17 000000010000000000000002 5 -rw------- 1 og og 16777216 May 27 15:58 000000010000000000000003 6 -rw------- 1 og og 16777216 May 27 15:59 000000010000000000000004 7 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000005 8 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000006 9 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000007 10 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000008 11 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000009 12 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000A 13 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000B 14 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000C 15 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000D 16 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000E 17 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000F 18 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000012 19 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000013 20 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000014 21 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000015 22 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000010 23 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000016 24 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000017 25 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000018 26 -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000019 27 -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000001A 28 -rw------- 1 og og 16777216 May 27 16:06 000000010000000000000011
日志文件命名格式
000000010000000000000011
WAL日志三个组成部分:
第一部分,叫做时间线,是从1开始递增地数字(类似ORACLE 的SCN的wrap), 当低位满时,前一段加1.
第二部分,叫做LogId,是LSN的高32位(从0开始)
第三部分,叫做LogSeg,是LSN的低32位除以WAL文件的大小,WAL文件的大小默认是16M
WAL日志手动清理
[og@oel7db1 ~]$ ps -ef|grep gauss og 1989 1 1 11:48 pts/0 00:04:13 gaussdb -D /opensource/opengauss/5.0/db --single_node og 26086 25871 0 15:59 pts/2 00:00:00 grep --color=auto gauss [og@oel7db1 pg_xlog]$ pg_controldata /opensource/opengauss/5.0/db/ pg_control version number: 923 Catalog version number: 201611171 Database system identifier: 576503776236165954 Database cluster state: in production pg_control last modified: Sat 27 May 2023 04:01:45 PM CST Latest checkpoint location: 0/110001C8 Prior checkpoint location: 0/110000A8 Latest checkpoint's REDO location: 0/11000148 Latest checkpoint's TimeLineID: 1 Latest checkpoint's full_page_writes: off Latest checkpoint's NextXID: 16617 Latest checkpoint's NextOID: 16431 Latest checkpoint's NextMultiXactId: 2 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 12664 Latest checkpoint's oldestXID's DB: 15645 Latest checkpoint's oldestActiveXID: 16617 Latest checkpoint's remove lsn: 0/1 Time of latest checkpoint: Sat 27 May 2023 04:01:45 PM CST Minimum recovery ending location: 0/0 Backup start location: 0/0 Backup end location: 0/0 End-of-backup record required: no Current wal_level setting: hot_standby Current max_connections setting: 250 Current max_prepared_xacts setting: 200 Current max_locks_per_xact setting: 256 Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Date/time type storage: 64-bit integers Float4 argument passing: by value Float8 argument passing: by value Database system TimeLine: 8 openGauss=# select pg_xlogfile_name('0/110001C8'); pg_xlogfile_name -------------------------- 000000010000000000000011 (1 row)
Note:
说明000000010000000000000011(#11)之前的wal日志都可以清理。
可以手动rm 也可以使用pg_archivecleanup,如果当前opengauss中没有该工具,可以从postgresql中复制,目前在openGauss的商业发行版Mogdb中有该工具的说明。 pg_archivecleanup 是一个用于清理旧的归档日志的工具。
pg_archivecleanup [OPTION]... ARCHIVELOCATION OLDESTKEPTWALFILE e.g. pg_archivecleanup -d /opensource/opengauss/5.0/db/ 000000010000000000000011
主库查看Xlog位置
主库也可以使用pg_current_xlog_location函数查看当前的xlog位置,使用pg_xlogfile_name转换日志文件名。如
openGauss=# select pg_current_xlog_location(),pg_xlogfile_name(pg_current_xlog_location()),txid_current(),now(); pg_current_xlog_location | pg_xlogfile_name | txid_current | now --------------------------+--------------------------+--------------+------------------------------- 0/200001E8 | 000000010000000000000020 | 21140 | 2022-10-03 14:09:52.749551+08 (1 row)
备库查看 xlog位置
主从同步的模式下,在opengauss standby端可以使用 pg_last_xlog_receive_location() 、pg_last_xlog_replay_location()、pg_last_xact_replay_timestamp()分别查看xlog接受到的位置,应用到的位置,应用到的时间。如果想利用move走xlog控制应用到的位置,可以使用 pg_xlog_replay_pause() pg_xlog_replay_resume()对应用XLOG的进度暂停与恢复应用,如果下一下XLOG不存在,日志中会提示无法open 某xlog,而停止。 注意:如果因缺少xlog standby出现实例crash, 在启动时一定注意增加standby参数,因为opengauss的垃圾设计,是依赖进程启动模式判断database role, 直接启动standby 实例(gs_ctl start 不带-M standby)相当于激活db, 也就是oracle的failover standby, 没有办法再回退到standby模式。
KingBASE
[kingbase@kingbase1 sys_wal]$ pwd /home/kingbase/cluster/project/kingbase/kingbase/data/sys_wal [kingbase@kingbase1 bin]$ sys_controldata /home/kingbase/cluster/project/kingbase/kingbase/data sys_control version number: 1201 Catalog version number: 202204191 Database system identifier: 7353544381029524840 Database cluster state: in production sys_control last modified: Wed 05 Jun 2024 11:59:59 PM CST Latest checkpoint location: 64/4D006218 Latest checkpoint's REDO location: 64/4D0061E8 Latest checkpoint's REDO WAL file: 00000001000000640000004D Latest checkpoint's TimeLineID: 1 Latest checkpoint's PrevTimeLineID: 1 Latest checkpoint's full_page_writes: on Latest checkpoint's NextXID: 0:567330 Latest checkpoint's NextOID: 49469 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 899 Latest checkpoint's oldestXID's DB: 1 Latest checkpoint's oldestActiveXID: 567330 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 1 Latest checkpoint's oldestCommitTsXid:0 Latest checkpoint's newestCommitTsXid:0 Time of latest checkpoint: Wed 05 Jun 2024 11:59:59 PM CST Fake LSN counter for unlogged rels: 0/3E8 Minimum recovery ending location: 0/0 Min recovery ending loc's timeline: 0 Backup start location: 0/0 Backup end location: 0/0 End-of-backup record required: no wal_level setting: replica wal_log_hints setting: on max_connections setting: 100 max_worker_processes setting: 8 max_wal_senders setting: 32 max_prepared_xacts setting: 300 max_locks_per_xact setting: 64 track_commit_timestamp setting: off Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1988 Size of a large-object chunk: 2048 Date/time type storage: 64-bit integers Float4 argument passing: by value Float8 argument passing: by value Data page checksum version: 1 Mock authentication nonce: 72324c57b7671eb6e91c24cd926909bb9424757d117d6ba2f79685667f6c7e48 database mode: 1 auth method mode: 0 kingbase=# select pg_walfile_name('64/4D006218'); pg_walfile_name -------------------------- 00000001000000640000004D (1 row)
在KINGBASE和OG检查差不多,只是命令函数改变。
对不起,这篇文章暂时关闭评论。