什么是 pg_filenode.map 文件?
在PostgreSQL中,每个数据库对象都有一个唯一的文件节点号,用于标识对象在磁盘上的位置。这些文件节点号与实际的物理文件相对应,例如表的数据文件、索引文件等。pg_filenode.map文件将这些文件节点号与相应的文件路径进行映射,以便数据库系统能够准确地找到和访问这些对象, 通常从数据字典pg_class可以根据relfilenode字段查找对应的文件。试想一下,在数据库启动时,数据库都没有open如何加载pg_class?如同oracle 数据库一样,同样要加载dict对象,dict对象又是在Bootstrap$, 它一般是在datafile 1# block 520#位置,那PG呢? 在pg_class表中能看到像pg_class、pg_attribute等一些特定的表,在relfilenode字段中的查询结果为0,说明也是在db dict的外部。
OpenGauss 5
/c postgres select oid, relname, relfilenode,reltablespace from pg_class where relfilenode = 0 and relkind = 'r' openGauss-# order by reltablespace,oid; oid | relname | relfilenode | reltablespace ------+------------------------------+-------------+--------------- 1247 | pg_type | 0 | 0 1249 | pg_attribute | 0 | 0 1255 | pg_proc | 0 | 0 1259 | pg_class | 0 | 0 7815 | gs_package | 0 | 0 1136 | pg_pltemplate | 0 | 1664 1213 | pg_tablespace | 0 | 1664 1214 | pg_shdepend | 0 | 1664 1260 | pg_authid | 0 | 1664 1261 | pg_auth_members | 0 | 1664 1262 | pg_database | 0 | 1664 2396 | pg_shdescription | 0 | 1664 2964 | pg_db_role_setting | 0 | 1664 3450 | pg_resource_pool | 0 | 1664 3451 | pg_workload_group | 0 | 1664 3457 | pg_auth_history | 0 | 1664 3460 | pg_user_status | 0 | 1664 3464 | pg_app_workloadgroup_mapping | 0 | 1664 3592 | pg_shseclabel | 0 | 1664 4211 | pg_extension_data_source | 0 | 1664 5680 | gs_obsscaninfo | 0 | 1664 6126 | pg_subscription | 0 | 1664 6134 | pg_replication_origin | 0 | 1664 9014 | pgxc_group | 0 | 1664 9015 | pgxc_node | 0 | 1664 9022 | pg_job | 0 | 1664 9023 | pg_job_proc | 0 | 1664 9080 | gs_global_config | 0 | 1664 (28 rows)
Note:
一共有28个 relfilenode=0的r对象, 其中有5个reltablespace=0, (在postgresql中因为不支持package通常是4个)称为 “nailed” catalogs,其余在tablespace 1664名为pg_global ,称为共享表。
test=# select oid,* from pg_tablespace; oid | spcname | spcowner | spcacl | spcoptions | spcmaxsize | relative ------+------------+----------+--------+------------+------------+---------- 1663 | pg_default | 10 | | | | f 1664 | pg_global | 10 | | | | f (2 rows)
那些未在pg_class中的关系表就是在pg_filenode.map文件中, 从postgresql 9中引入,是一个特殊的文件,相当于pg系的bootstrap$ file 1,每个数据库有一个文件,整个群集有一个文件。pg_filenode.map
需要注意的是,pg_filenode.map文件是由PostgreSQL自动生成和维护的,一般情况下不需要手动修改或操作该文件。它在数据库启动时会被加载,并在需要时进行更新和维护。如果需要备份或迁移数据库,通常需要同时备份和恢复pg_filenode.map文件,以保证数据库对象的正确映射关系。目前,每个文件都保持在 512 字节(旨在占用一个标准大小的磁盘扇区),以最大程度地降低更新corrupted的风险;相应的 Struct 设计为恰好占用 512 个字节,最多可以包含 62 个条目 (MAX_MAPPINGS) ,每个条目 8 个字节. 前后有4字节的文件头、4字节的map个数和8字节文件尾校验值。4+4+62*8+8=512
pg_filenode.map文件结构
源码relmapper.c
/*------------------------------------------------------------------------- 2 * 3 * relmapper.c 4 * Catalog-to-filenumber mapping 5 * 6 * For most tables, the physical file underlying the table is specified by 7 * pg_class.relfilenode. However, that obviously won't work for pg_class 8 * itself, nor for the other "nailed" catalogs for which we have to be able 9 * to set up working Relation entries without access to pg_class. It also 10 * does not work for shared catalogs, since there is no practical way to 11 * update other databases' pg_class entries when relocating a shared catalog. 12 * Therefore, for these special catalogs (henceforth referred to as "mapped 13 * catalogs") we rely on a separately maintained file that shows the mapping 14 * from catalog OIDs to filenumbers. Each database has a map file for 15 * its local mapped catalogs, and there is a separate map file for shared 16 * catalogs. Mapped catalogs have zero in their pg_class.relfilenode entries. 17 * 18 * Relocation of a normal table is committed (ie, the new physical file becomes 19 * authoritative) when the pg_class row update commits. For mapped catalogs, 20 * the act of updating the map file is effectively commit of the relocation. 21 * We postpone the file update till just before commit of the transaction 22 * doing the rewrite, but there is necessarily a window between. Therefore 23 * mapped catalogs can only be relocated by operations such as VACUUM FULL 24 * and CLUSTER, which make no transactionally-significant changes: it must be 25 * safe for the new file to replace the old, even if the transaction itself 26 * aborts. An important factor here is that the indexes and toast table of 27 * a mapped catalog must also be mapped, so that the rewrites/relocations of 28 * all these files commit in a single map file update rather than being tied 29 * to transaction commit. 30 * 31 * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group 32 * Portions Copyright (c) 1994, Regents of the University of California 33 * 34 * 35 * IDENTIFICATION 36 * src/backend/utils/cache/relmapper.c 37 * 38 *------------------------------------------------------------------------- 39 */ 59 60 /* 61 * The map file is critical data: we have no automatic method for recovering 62 * from loss or corruption of it. We use a CRC so that we can detect 63 * corruption. Since the file might be more than one standard-size disk 64 * sector in size, we cannot rely on overwrite-in-place. Instead, we generate 65 * a new file and rename it into place, atomically replacing the original file. 66 * 67 * Entries in the mappings[] array are in no particular order. We could 68 * speed searching by insisting on OID order, but it really shouldn't be 69 * worth the trouble given the intended size of the mapping sets. 70 */ 71 #define RELMAPPER_FILENAME "pg_filenode.map" 72 #define RELMAPPER_TEMP_FILENAME "pg_filenode.map.tmp" 73 74 #define RELMAPPER_FILEMAGIC 0x592717 /* version ID value */ 75 76 /* 77 * There's no need for this constant to have any particular value, and we 78 * can raise it as necessary if we end up with more mapped relations. For 79 * now, we just pick a round number that is modestly larger than the expected 80 * number of mappings. 81 */ 82 #define MAX_MAPPINGS 64 <<<<<<<<<<<<<< 83 84 typedef struct RelMapping 85 { 86 Oid mapoid; /* OID of a catalog */ 87 RelFileNumber mapfilenumber; /* its rel file number */ 88 } RelMapping; 89 90 typedef struct RelMapFile 91 { 92 int32 magic; /* always RELMAPPER_FILEMAGIC */ 93 int32 num_mappings; /* number of valid RelMapping entries */ 94 RelMapping mappings[MAX_MAPPINGS]; 95 pg_crc32c crc; /* CRC of all above */ 96 } RelMapFile; 97
数据目录查找
[og@oel7db1 db]$ find . -name pg_filenode.map -exec ls -l {} \; -rw------- 1 og og 4096 May 8 11:37 ./base/1/pg_filenode.map -rw------- 1 og og 4096 May 8 11:37 ./base/15650/pg_filenode.map -rw------- 1 og og 4096 May 10 11:49 ./base/16385/pg_filenode.map -rw------- 1 og og 4096 May 8 11:37 ./base/15645/pg_filenode.map -rw------- 1 og og 4096 Sep 5 18:23 ./base/16549/pg_filenode.map -rw------- 1 og og 4096 May 8 11:37 ./global/pg_filenode.map anbob=# select oid,datname,dattablespace from pg_database; oid | datname | dattablespace -------+-----------+--------------- 1 | template1 | 1663 16385 | anbob | 1663 16549 | test | 1663 15645 | template0 | 1663 15650 | postgres | 1663 (5 rows)
dump 文件结构
[og@oel7db1 16549]$ hexdump pg_filenode.map 0000000 2718 0059 0015 0000 04eb 0000 3b35 0000 -- RELMAPPER_FILEMAGIC、num_mappings x0015=21 0000010 04e1 0000 3b22 0000 04e7 0000 3b26 0000 0000020 04df 0000 3ac7 0000 1e87 0000 3b2e 0000 0000030 0b14 0000 3b28 0000 0b15 0000 3b2a 0000 0000040 1f42 0000 3b30 0000 1f43 0000 3b32 0000 0000050 0a62 0000 3b24 0000 0a63 0000 3b25 0000 0000060 0a66 0000 3b37 0000 0a67 0000 3b38 0000 0000070 26fd 0000 3b39 0000 0a82 0000 3b2b 0000 0000080 25c2 0000 3b2d 0000 24a2 0000 3b2c 0000 0000090 2709 0000 3b34 0000 2608 0000 3b33 0000 00000a0 0a8f 0000 3ac9 0000 0a90 0000 3aca 0000 00000b0 0000 0000 0000 0000 0000 0000 0000 0000 * 0000ff0 0000 0000 0000 0000 06a9 e617 0000 0000 0001000
Note:
第一部分是magic文件头,第二部分是21个map条目个数,第三部分是21个具体的oid与relfilenode对应,第四部分是基于CRC算法的校验值
dump一个global pg_filenode.map
[og@oel7db1 db]$ od -j 8 -N $((512-8-8)) -td4 ./global/pg_filenode.map 0000010 1262 15646 2964 15289 0000030 1213 15295 1136 15299 0000050 1260 15054 1261 15302 0000070 1214 15306 2396 15310 0000110 3457 15562 3460 15558 0000130 9015 15355 9014 15360 0000150 3450 15367 3451 15371 0000170 3464 15375 3592 15399 0000210 9022 15430 9023 15437 0000230 4211 15566 5680 15448 0000250 9080 15467 6134 15536 0000270 6126 15134 2846 15312 0000310 2847 15314 2966 15291 0000330 2967 15293 5504 15362 0000350 5505 15364 2676 15056 0000370 2677 15057 2694 15304 0000410 2695 15305 2671 15648 0000430 2672 15649 7166 15568 0000450 7167 15569 2397 15315 0000470 1137 15301 1232 15308 0000510 1233 15309 2697 15297 0000530 2698 15298 9010 15358 0000550 9024 15359 9012 15365 0000570 9013 15366 9000 15369 0000610 9017 15370 9018 15373 0000630 9019 15374 9020 15377 0000650 9021 15378 9003 15357 0000670 2965 15294 3593 15401 0000710 3458 15564 3459 15565 0000730 3461 15560 3462 15561 0000750 3453 15432 3454 15433 0000770 openGauss=# select pg_relation_filenode(1213); pg_relation_filenode ---------------------- 15295 (1 row)
Note:
正好是62个,如oid 1213 对应 RelFileNumber 15295,使用pg_relation_filenode 函数也可以转换确认.
dump 非global 库下的pg_filenode.map
[og@oel7db1 db]$ od -j 8 -N $((512-8-8)) -td4 ./base/16549/pg_filenode.map 0000010 1259 15157 1249 15138 0000030 1255 15142 1247 15047 0000050 7815 15150 2836 15144 0000070 2837 15146 8002 15152 0000110 8003 15154 2658 15140 0000130 2659 15141 2662 15159 0000150 2663 15160 9981 15161 0000170 2690 15147 9666 15149 0000210 9378 15148 9993 15156 0000230 9736 15155 2703 15049 0000250 2704 15050 0 0 0000270 0 0 0 0 * 0000770 [og@oel7db1 db]$ od -j 8 -N $((512-8-8)) -td4 ./base/15650/pg_filenode.map 0000010 1259 15157 1249 15138 0000030 1255 15142 1247 15047 0000050 7815 15150 2836 15144 0000070 2837 15146 8002 15152 0000110 8003 15154 2658 15140 0000130 2659 15141 2662 15159 0000150 2663 15160 9981 15161 0000170 2690 15147 9666 15149 0000210 9378 15148 9993 15156 0000230 9736 15155 2703 15049 0000250 2704 15050 0 0 0000270 0 0 0 0 * 0000770 [og@oel7db1 db]$ od -j 8 -N $((512-8-8)) -td4 ./base/16385/pg_filenode.map 0000010 1259 15157 1249 15138 0000030 1255 15142 1247 15047 0000050 7815 15150 2836 15144 0000070 2837 15146 8002 15152 0000110 8003 15154 2658 15140 0000130 2659 15141 2662 15159 0000150 2663 15160 9981 15161 0000170 2690 15147 9666 15149 0000210 9378 15148 9993 15156 0000230 9736 15155 2703 15049 0000250 2704 15050 0 0 0000270 0 0 0 0 * 0000770
非global下的pg_filenode.map或叫local map file的内容基本上是一样的,因为很少对catalog 对象做vacuum full(会导致relfilenode改变)。 所以如果当Local mapfile 丢失或误删除,影响该库的访问时,可以从其它数据库先copy一份pg_filenode.map做为恢复尝试。
加载relmap的函数
/* 755 * load_relmap_file -- load the shared or local map file 756 * 757 * Because these files are essential for access to core system catalogs, 758 * failure to load either of them is a fatal error. 759 * 760 * Note that the local case requires DatabasePath to be set up. 761 */ 762 static void 763 load_relmap_file(bool shared, bool lock_held) 764 { 765 if (shared) 766 read_relmap_file(&shared_map, "global", lock_held, FATAL); 767 else 768 read_relmap_file(&local_map, DatabasePath, lock_held, FATAL); 769 } 770 771 /* 772 * read_relmap_file -- load data from any relation mapper file 773 * 774 * dbpath must be the relevant database path, or "global" for shared relations. 775 * 776 * RelationMappingLock will be acquired released unless lock_held = true. 777 * 778 * Errors will be reported at the indicated elevel, which should be at least 779 * ERROR. 780 */ 781 static void 782 read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel) 783 { 784 char mapfilename[MAXPGPATH]; 785 pg_crc32c crc; 786 int fd; 787 int r; 788 789 Assert(elevel >= ERROR);
postgresql提供了一组功能函数可以查询oid与relfilenodel转换
openGauss=# select pg_relation_filenode(8003); pg_relation_filenode ---------------------- 15154 (1 row) openGauss=# select pg_filenode_relation(0,15154)::oid; --0 表未默认表空间 pg_filenode_relation ---------------------- 8003 (1 row) openGauss=# select pg_filenode_relation(0,15154); pg_filenode_relation ------------------------------ pg_toast.pg_toast_7815_index (1 row)
当数据库打开始可以使用上面的方法转换,如果数据库未打开始,对于非普通表以外的就是使用pg_filenode.map得到oid与relfilenode的对应关系,然后可以找到对应的db目录,从对应db目录下找对应的pg_class得到对应的名称.
使用第三方插件pg_filenodemapdata可以 转换硬编码的名称转换
postgres@oel7db1 16447]$ pg_filenodemapdata ./pg_filenode.map magic: 0x00592717 num_mappings: 17 0) 1259 - pg_class: 1259 1) 1249 - pg_attribute: 1249 2) 1255 - pg_proc: 1255 3) 1247 - pg_type: 1247 4) 2836 - pg_toast_1255: 2836 5) 2837 - pg_toast_1255_index: 2837 6) 4171 - unlisted system catalog relation: 4171 7) 4172 - unlisted system catalog relation: 4172 8) 2658 - pg_attribute_relid_attnam_index: 2658 9) 2659 - pg_attribute_relid_attnum_index: 2659 10) 2662 - pg_class_oid_index: 2662 11) 2663 - pg_class_relname_nsp_index: 2663 12) 3455 - pg_class_tblspc_relfilenode_index: 3455 13) 2690 - pg_proc_oid_index: 2690 14) 2691 - pg_proc_proname_args_nsp_index: 2691 15) 2703 - pg_type_oid_index: 2703 16) 2704 - pg_type_typname_nsp_index: 2704 file checksum: 0x546114BB
对于opengauss数据库下对于pg_package未编码,所以可能是未知。
对于表文件路径的样式有三类
* For files in the default tablespace, base/database_oid/filenode id for the relation * For files in other tablespaces: pg_tblspc / tablespace_oid / tablespace_version_subdir / database_oid / filenode id for the relation * For shared relations (see below): global/filenode id for the relation
如果pg_class中relfilenode为零,在这种情况下,文件通过pg_relfilenode.map定位。这是共享目录和某些系统目录的典型情况,例如pg_database、pg_class和pg_proc等。
— over —
对不起,这篇文章暂时关闭评论。