首页 » PostgreSQL/GaussDB » Create Index on TimstmapTZ::date On PostgreSQL

Create Index on TimstmapTZ::date On PostgreSQL

在PostgreSQL中的时间类型较多,如TIME, DATE, INTERVAL, TIMESTAMP,和Timestamp With Timezone(TIMESTAMPTZ) , 而timestamp类型精确度非常高,如date类型是只有年月日,time又只有时间分秒,而timestamp是两者组合,同时可以带时区,如有在些基于postgresql国产数据库中为了兼容sysdate,注意函数的返回精确度,防止数据迁移丢失精度,但有时用户希望使用精度较低,如按日期检索和创建索引时有点有趣。为了说明这一点这里测试一下。

构建测试环境
highgoV9(pg14)

highgo=# create table date_test(event_time timestamp(6) with time zone);
CREATE TABLE

highgo=# insert into date_test values(now());
INSERT 0 1
highgo=# select * from date_test;
          event_time
-------------------------------
 2024-12-09 11:25:39.046972+08
(1 row)

INSERT INTO date_test
SELECT
(
    SELECT '2023-03-01 00:00:00'::timestamptz +
           (floor(random() *
            (extract(EPOCH FROM '2023-04-01'::timestamptz) -
             extract(EPOCH FROM '2023-03-01'::timestamptz)) +
            b * 0)::integer || 'seconds')::interval
)
FROM generate_series(1, 100000) AS a(b);
 
ANALYZE date_test;

Timestamp vs TimstampTZ

The TIMESTAMP and TIMESTAMPTZ data types are similar; the only difference is that one includes the time zone information while the other doesn’t.

highgo=# SET timezone = 'America/Los_Angeles';
SET
highgo=#  SELECT timestamptz '2022-01-01 00:00:00+08'::timestamptz;
      timestamptz
------------------------
 2021-12-31 08:00:00-08

highgo=# SELECT
highgo-#       typname,
highgo-#       typlen
highgo-# FROM
highgo-#       pg_type
highgo-# WHERE
highgo-#       typname ~ '^timestamp';
   typname   | typlen
-------------+--------
 timestamp   |      8
 timestamptz |      8
(2 rows)

有时应用返回数据是带时区,而表定义不带,可能会导致数据报错,可以修改表定义,也可以创建cast
create cast (pg_catalog.timestamptz as timestamp) with inout as implicit;  

如有时我们需要要查某一天的数据,而数据类型是timestampTZ, 一种做法是创建默认索引,然后写SQL 范围区间;或者是创建一个转换数据库date的索引;或创建一个date_trunc函数索引;

1, 范围查询

highgo=# CREATE INDEX i_date_test_nyc_2 on date_test(event_time);
CREATE INDEX

highgo=# explain (verbose,buffers) select count(*) from date_test where event_time='2023-03-09';
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.31..4.32 rows=1 width=8)
   Output: count(*)
   ->  Index Only Scan using i_date_test_nyc_2 on public.date_test  (cost=0.29..4.31 rows=1 width=0)
         Output: event_time
         Index Cond: (date_test.event_time = '2023-03-09 00:00:00+08'::timestamp with time zone)
 Planning:
   Buffers: shared hit=15 read=1
   I/O Timings: read=0.050
(8 rows)

highgo=# explain (verbose,buffers) select count(*) from date_test where event_time=to_timestamp('2023-03-09','yyyy-mm-dd');
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.32..4.33 rows=1 width=8)
   Output: count(*)
   ->  Index Only Scan using i_date_test_nyc_2 on public.date_test  (cost=0.29..4.31 rows=1 width=0)
         Output: event_time
         Index Cond: (date_test.event_time = to_timestamp('2023-03-09'::text, 'yyyy-mm-dd'::text))
(5 rows)

highgo=# explain (buffers,costs off,verbose) select count(*) from date_test
        where event_time>=to_date('2023-03-09','yyyy-mm-dd') and event_time<to_date('2023-03-10','yyyy-mm-dd');
 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Aggregate
 Output: count(*)
 -> Index Only Scan using i_date_test_nyc_2 on public.date_test
 Output: event_time
 Index Cond: ((date_test.event_time >= to_date('2023-03-09'::text, 'yyyy-mm-dd'::text)) AND (date_test.event_time < to_date('2023-03-10'::text, 'yyyy-mm-dd'::text)))
(5 rows)

2. timestamptz::date index

初始的按日期索引方法可能是在 timestamp 列上创建一个索引,强制转换为 date:

highgo=# CREATE INDEX i_date_test ON date_test ((event_time::date));
ERROR:  functions in index expression must be marked IMMUTABLE

因为具有不同时区值的会话可能会访问索引。使用 如果当前会话的 Timezone 值被具有不同 TimeZone 值的 Session 使用,则会产生不正确的结果,因此无法创建此类索引。(查询可用时区从pg_timezone_names) ,一种解决方案是 创建具有特定时区的索引:

highgo=# CREATE INDEX i_date_test_nyc ON date_test (((event_time AT TIME ZONE 'Asia/Shanghai')::date));
CREATE INDEX

但只有在查询中指定了匹配的时区时,才能使用索引:

highgo=# explain (verbose,buffers) select count(*) from date_test where event_time=to_timestamp('2023-03-09','yyyy-mm-dd')::date;  --wrong query
                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2193.02..2193.03 rows=1 width=8)
   Output: count(*)
   ->  Seq Scan on public.date_test  (cost=0.00..2193.02 rows=1 width=0)
         Output: event_time
         Filter: (date_test.event_time = (to_timestamp('2023-03-09'::text, 'yyyy-mm-dd'::text))::date)
(5 rows)

highgo=# explain (verbose,buffers) select count(*) from date_test where (event_time AT TIME ZONE 'Asia/Shanghai')::date='2023-03-09';
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=482.25..482.26 rows=1 width=8)
   Output: count(*)
   ->  Bitmap Heap Scan on public.date_test  (cost=8.17..481.00 rows=500 width=0)
         Recheck Cond: (((date_test.event_time AT TIME ZONE 'Asia/Shanghai'::text))::date = '2023-03-09'::date)
         ->  Bitmap Index Scan on i_date_test_nyc  (cost=0.00..8.04 rows=500 width=0)
               Index Cond: (((date_test.event_time AT TIME ZONE 'Asia/Shanghai'::text))::date = '2023-03-09'::date)
 Planning:
   Buffers: shared hit=11
(8 rows)

highgo=# explain (verbose,buffers,costs off) select count(*) from date_test where (event_time AT TIME ZONE 'America/New_York')::date='2023-03-09';
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Aggregate
   Output: count(*)
   ->  Seq Scan on public.date_test
         Output: event_time
         Filter: (((date_test.event_time AT TIME ZONE 'America/New_York'::text))::date = '2023-03-09'::date)
(5 rows)

3,date_trunc 函数索引

highgo=# CREATE INDEX i_date_test_trunc ON date_test ((date_trunc('day', event_time)));
ERROR:  functions in index expression must be marked IMMUTABLE

highgo=# \df+ date_trunc
                                                                                                                                  List of functions
   Schema   |    Name    |      Result data type       |         Argument data types          | Type | Volatility | Parallel | Owner | Security | Access privileges | Language |      Sou
 Description
------------+------------+-----------------------------+--------------------------------------+------+------------+----------+-------+----------+-------------------+----------+---------
---------------------------------------------
 pg_catalog | date_trunc | interval                    | text, interval                       | func | immutable  | safe     | hg    | invoker  |                   | internal | interval
units
 pg_catalog | date_trunc | timestamp without time zone | text, timestamp without time zone    | func | immutable  | safe     | hg    | invoker  |                   | internal | timestam
 units
 pg_catalog | date_trunc | timestamp with time zone    | text, timestamp with time zone       | func | stable     | safe     | hg    | invoker  |                   | internal | timestam
ne to specified units
 pg_catalog | date_trunc | timestamp with time zone    | text, timestamp with time zone, text | func | stable     | safe     | hg    | invoker  |                   | internal | timestam
ne to specified units in specified time zone
(4 rows)

查看PostgreSQL 函数

postgres=# \df+ date_trunc
                                                                                                                                    List of functions
   Schema   |    Name    |      Result data type       |         Argument data types          | Type | Volatility | Parallel |  Owner   | Security | Access privileges | Language |     I
nternal name      |                                 Description
------------+------------+-----------------------------+--------------------------------------+------+------------+----------+----------+----------+-------------------+----------+------
------------------+-----------------------------------------------------------------------------
 pg_catalog | date_trunc | interval                    | text, interval                       | func | immutable  | safe     | postgres | invoker  |                   | internal | inter
val_trunc         | truncate interval to specified units
 pg_catalog | date_trunc | timestamp with time zone    | text, timestamp with time zone       | func | stable     | safe     | postgres | invoker  |                   | internal | times
tamptz_trunc      | truncate timestamp with time zone to specified units
 pg_catalog | date_trunc | timestamp with time zone    | text, timestamp with time zone, text | func | immutable  | safe     | postgres | invoker  |                   | internal | times
tamptz_trunc_zone | truncate timestamp with time zone to specified units in specified time zone
 pg_catalog | date_trunc | timestamp without time zone | text, timestamp without time zone    | func | immutable  | safe     | postgres | invoker  |                   | internal | times
tamp_trunc        | truncate timestamp to specified units
(4 rows)

postgres=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 17.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

-- kingbase
kingbase=# \df+ date_trunc
                                                                                                                                   List of functions
   Schema   |    Name    |      Result data type       |         Argument data types          | Type | Volatility | Parallel | Owner  | Security | Access privileges | Language |      So
  Description
------------+------------+-----------------------------+--------------------------------------+------+------------+----------+--------+----------+-------------------+----------+--------
----------------------------------------------
 pg_catalog | date_trunc | pg_catalog.interval         | text, pg_catalog.interval            | func | immutable  | safe     | system | invoker  |                   | internal | interva
 units
 pg_catalog | date_trunc | timestamp without time zone | text, timestamp without time zone    | func | immutable  | safe     | system | invoker  |                   | internal | timesta
d units
 pg_catalog | date_trunc | timestamp with time zone    | text, timestamp with time zone       | func | stable     | safe     | system | invoker  |                   | internal | timesta
one to specified units
 pg_catalog | date_trunc | timestamp with time zone    | text, timestamp with time zone, text | func | stable     | safe     | system | invoker  |                   | internal | timesta
one to specified units in specified time zone
 sys        | date_trunc | dsinterval                  | text, dsinterval                     | func | immutable  | safe     | system | invoker  |                   | internal | orainte
ed units
 sys        | date_trunc | yminterval                  | text, yminterval                     | func | immutable  | safe     | system | invoker  |                   | internal | orainte
ed units
(6 rows)

参考Postgresql修改(text, timestamp with time zone, text)稳定

highgo=#  alter function date_trunc(text, timestamp with time zone, text) immutable;
ALTER FUNCTION

highgo=# CREATE INDEX i_date_test_trunc ON date_test ((date_trunc('day', event_time,'Asia/Shanghai')));
CREATE INDEX

highgo=# explain verbose select count(*) from date_test where date_trunc('day', event_time, 'America/New_York') = '2023-03-09';
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1944.26..1944.27 rows=1 width=8)
   Output: count(*)
   ->  Seq Scan on public.date_test  (cost=0.00..1943.01 rows=500 width=0)
         Output: event_time
         Filter: (date_trunc('day'::text, date_test.event_time, 'America/New_York'::text) = '2023-03-09 00:00:00+08'::timestamp with time zone)
(5 rows)

highgo=# explain verbose select count(*) from date_test where date_trunc('day', event_time, 'Asia/Shanghai') = '2023-03-09';
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=481.00..481.01 rows=1 width=8)
   Output: count(*)
   ->  Bitmap Heap Scan on public.date_test  (cost=8.17..479.75 rows=500 width=0)
         Recheck Cond: (date_trunc('day'::text, date_test.event_time, 'Asia/Shanghai'::text) = '2023-03-09 00:00:00+08'::timestamp with time zone)
         ->  Bitmap Index Scan on i_date_test_trunc  (cost=0.00..8.04 rows=500 width=0)
               Index Cond: (date_trunc('day'::text, date_test.event_time, 'Asia/Shanghai'::text) = '2023-03-09 00:00:00+08'::timestamp with time zone)
(6 rows)

小结:
我们通过创建具有时区规范的 date 索引、具有时区规范的 date_trunc() 以及具有查询中使用的范围的 timestamp 来显示索引时间戳值。

  • date索引强制转换:仅日期粒度,固定到时区
  • date_trunc():固定到粒度和时区
  • 时间戳索引:不限颗粒度,灵活的时区
打赏

, ,

目前这篇文章还没有评论(Rss)

我要评论