PostgreSQL性能调优相关的参数配置 (最佳实践)
在PostgreSQL数据库中,与硬件基础环境依赖相关的参数,默认配置通常不适合生产环境。正如 PostgreSQL wiki 页面所述,默认配置的选择是为了在各种可能缺乏大量资源的设备上更容易地安装数据库。Oracle数据库也存在类似的问题,因此在内存相关的参数上基于Advisor框架推出了ASMM和AMM自动管理,但大部分参数如优化器相关的,仍然是针对通用场景设计的。当然,也有大量的Oracle DBA基于自己的经验整理出了一些最佳实践参数。相较之下,PostgreSQL的参数数量远少于Oracle。下面整理了几个基础的数据库调优参数。
任何数据库对于生产环境,配置合理的性能参数,都可以从中收益,之前整理过<Linux最佳实践for Postgresql>,同样也是希望可以从操作系统层提升数据库的稳定性和性能。PG中重点关注参数有:
)max_connections
)shared_buffers
)effective_cache_size
)work_mem
)maintenance_work_mem
)autovacuum_max_workers
)wal_buffers
)effective_io_concurrency
)random_page_cost
)seq_page_cost
)log_min_duration_statement
从视图pg_settings中可以查询确认当前值 。在opengauss DB中同样受用,以下是在Mogdb分支中的查询
[omm@mogdb1 ~]$ gsql -r gsql ((MogDB 5.0.1 build ae6d2ada) compiled at 2023-08-16 09:07:43 commit 0 last mr 1804 ) Non-SSL connection (SSL connection is recommended when requiring high-security) Type "help" for help. MogDB=# SELECT MogDB-# name AS setting_name, MogDB-# setting AS setting_value, MogDB-# unit AS setting_unit MogDB-# FROM pg_settings MogDB-# WHERE name IN ( MogDB(# 'max_connections', MogDB(# 'shared_buffers', MogDB(# 'effective_cache_size', MogDB(# 'work_mem', MogDB(# 'maintenance_work_mem', MogDB(# 'autovacuum_max_workers', MogDB(# 'wal_buffers', MogDB(# 'effective_io_concurrency', MogDB(# 'random_page_cost', MogDB(# 'seq_page_cost', MogDB(# 'log_min_duration_statement' MogDB(# ); setting_name | setting_value | setting_unit ----------------------------+---------------+-------------- autovacuum_max_workers | 10 | effective_cache_size | 16384 | 8kB effective_io_concurrency | 1 | log_min_duration_statement | 200 | ms maintenance_work_mem | 16384 | kB max_connections | 200 | random_page_cost | 4 | seq_page_cost | 1 | shared_buffers | 131072 | 8kB wal_buffers | 2048 | 8kB work_mem | 65536 | kB (11 rows)
max_connections
max_connections 我们需要将最大连接数限制为可以提供最高系统吞吐量的值。如同通用可伸缩性定律指出,如果在系统达到最大吞吐量后继续增加并发性成本只会降低系统吞吐量。建议配置连接池(如pgbouncer)降低连接数。
shared_buffers
PostgreSQL的shared_buffer参数,多数建议为RAM的25% ~ 40%, 这与 Oracle、MySQL 或 SQL Server 不同(配置大量的buffer 减少物理IO),pg是两级缓存shared buffer+OS cache, PostgreSQL 文档建议将该值设置为最多可用 RAM 的 25%, 剩余的 RAM 可以分配给操作系统内核、操作系统运行的其他服务、客户端连接和操作系统缓存。
work_mem
work_mem配置允许将temporary数据写入磁盘之前设置查询操作可以使用的最大内存量,如同ORACLE PGA中的workarea, 在PG中默认值仅为 4MB,如果并发100,那最多允许4M*100 瞬态内存进行内存中查询(例如ORDER BY、Hash Joins、Hash Aggregate、Window Functions)。此设置可以基于每个会话进行配置,因此我们可以使用较小的常规值,并仅针对应该运行需要更多内存进行排序或哈希的查询的会话增加该值。
maintenance_work_mem
PostgreSQL 可用于维护操作(如 索引创建或其他特定于 DDL 的操作)的内存量,默认值为 64 MB, 总内存使用是maintenance_work_mem * autovacuum_max_workers(default 3)计算的, Robert Haas 建议是基于最大表的行数*1.2M 做为maintenance_work_mem 的初始化配置. 我们建议是也是session级临时更改值,加速维护操作,如同在加速索引创建的技巧.
wal_buffers
wal_buffers
参数在PostgreSQL中用于设置用于写入预写日志(Write-Ahead Logging, WAL)的共享内存缓冲区的大小。和oracle log buffer一样,这个缓冲区在数据库事务提交之前暂时存储WAL数据。默认是1/32的shared_buffer, 如果shared_buffer较大,中型规模建议配置16MB 或32MB,大型数据库建议64MB或更大,在某些版本中值-1 ,表示PostgreSQL会自动设置一个值。wal_buffers
的具体设置应根据数据库的规模、写入频率以及系统的整体资源情况进行调整。
effective_io_concurrency
该设置定义基础磁盘可以同时操作的读取和写入操作数, 0值表示禁用asynchronous I/O 。default 0, 如果值为2,这意味着 PostgreSQL 可以同时发出 2 个 I/0 请求,如果基础磁盘可以同时处理多个请求,则应增加该值并测试哪个值提供最佳应用程序性能。 EnterpriseDB: 2
for HDD and 200
for SSD.
random_page_cos&seq_page_cost
参数 random_page_cost和seq_page_cost 用于计算随机IO的Cost, 默认PostgreSQL 会认为随机访问读取比顺序读取慢 4 倍,如果您使用的是固态硬盘 (SSD),可能不会存在如此大的差距, 我们建议配置相同参数 1 for SSD, defult for HDD。
log_min_duration_statement
如同mysql的慢日志记录配置, 该参数控制执行时间超过提供的阈值的语句到日志,默认情况下-1,此功能处于禁用状态,单位ms, 如果您对慢日志分析比较敏感可以初始化配置1000ms,以记录SQL的量级调整值,毕竟记入太多的日志短时间也不能优化来减少。也可以使用一些SQL审核的专用工具。
Cybertec 出品的PostgreSQL Configurator工具也可以在线给我们生成PG的参数配置建议,如PostgreSQL v15, 64CPUs \ mem 512G \4*1T SSD \200 并发、2个replicat ,使用pg_dump备份配置建议:
# DISCLAIMER - Software and the resulting config files are provided AS IS - IN NO EVENT SHALL # BE THE CREATOR LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL # DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION. # Connectivity max_connections = 200 superuser_reserved_connections = 3 # Memory Settings shared_buffers = '131072 MB' work_mem = '256 MB' maintenance_work_mem = '2720 MB' huge_pages = try # NB! requires also activation of huge pages via kernel params, see here for more: https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-HUGE-PAGES effective_cache_size = '358 GB' effective_io_concurrency = 200 # concurrent IO only really activated if OS supports posix_fadvise function random_page_cost = 1.25 # speed of random disk access relative to sequential access (1.0) # Monitoring shared_preload_libraries = 'pg_stat_statements' # per statement resource usage stats track_io_timing=on # measure exact block IO times track_functions=pl # track execution times of pl-language procedures if any # Replication wal_level = replica # consider using at least 'replica' max_wal_senders = 10 synchronous_commit = on # Checkpointing: checkpoint_timeout = '15 min' checkpoint_completion_target = 0.9 max_wal_size = '10240 MB' min_wal_size = '5120 MB' # WAL archiving archive_mode = on # having it on enables activating P.I.T.R. at a later time without restart› archive_command = '/bin/true' # not doing anything yet with WAL-s # WAL writing wal_compression = on wal_buffers = -1 # auto-tuned by Postgres till maximum of segment size (16MB by default) wal_writer_delay = 200ms wal_writer_flush_after = 1MB wal_keep_size = '22080 MB' # Background writer bgwriter_delay = 200ms bgwriter_lru_maxpages = 100 bgwriter_lru_multiplier = 2.0 bgwriter_flush_after = 0 # Parallel queries: max_worker_processes = 65 max_parallel_workers_per_gather = 33 max_parallel_maintenance_workers = 33 max_parallel_workers = 65 parallel_leader_participation = on # Advanced features enable_partitionwise_join = on enable_partitionwise_aggregate = on jit = on max_slot_wal_keep_size = '1000 MB' track_wal_io_timing = on maintenance_io_concurrency = 200 wal_recycle = on # General notes: # Note that not all settings are automatically tuned. # Consider contacting experts at # https://www.cybertec-postgresql.com # for more professional expertise.
小结
默认情况下,PostgreSQL 不会针对我们将运行数据库生产系统的现代硬件进行预配置。意味着数据库可能无法充分利用底层的硬件资源,导致查询性能不佳,当前以上配置仅做参考,应该根据自己的应用场景调整数据库参数。
Reference
https://vladmihalcea.com/postgresql-performance-tuning-settings/
对不起,这篇文章暂时关闭评论。