Table maintenance: do I need to REINDEX a table after truncating and repopulating it?

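I have a table with about 2 million rows of transactional data that we use for analytics. Every week we reload it with new data, so we have been using TRUNCATE to clear it out and then inserting the new rows.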

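There are a couple of indexes on the table. If I don't drop and re-create the indexes, do I need to REINDEX after every truncate and repopulate, or is that unnecessary? Should I be running VACUUM after TRUNCATE, or is that also unnecessary?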

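No, you don't generally need to REINDEX after TRUNCATE. If you find that you do, you are much better off dropping the indexes, loading the data, and then re-creating the indexes at the end.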

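It's somewhat similar to this answer about CLUSTER: Pg effectively drops the index contents during TRUNCATE and then rebuilds the index incrementally as you insert data, so no index bloat is carried over from before the TRUNCATE.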
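One way to see this for yourself is to compare the total index size before and after a TRUNCATE. The following is a minimal sketch; the table `truncate_check` and index `truncate_check_idx` are hypothetical names made up for illustration, and the sizes you see will depend on your PostgreSQL version and data.

```sql
-- Hypothetical scratch table and index, used only for this check.
CREATE TABLE truncate_check (id integer, payload text);
CREATE INDEX truncate_check_idx ON truncate_check (payload, id);

-- Populate the table so the index grows incrementally.
INSERT INTO truncate_check (id, payload)
SELECT x, (x % 100)::text FROM generate_series(1, 100000) x;

-- Total size of all indexes on the table while it is loaded.
SELECT pg_size_pretty(pg_indexes_size('truncate_check'::regclass)) AS index_size_loaded;

-- TRUNCATE empties the table and its indexes together.
TRUNCATE TABLE truncate_check;

-- The index should now be back to a minimal size.
SELECT pg_size_pretty(pg_indexes_size('truncate_check'::regclass)) AS index_size_after_truncate;

-- Clean up the scratch table.
DROP TABLE truncate_check;
```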

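You may get somewhat more compact and efficient indexes if you drop the indexes, truncate, insert the data, and then re-create the indexes. They will certainly be faster to build. Once the indexes are built, the difference in index performance is unlikely to justify the extra effort for most applications that use only b-tree indexes, but the difference in the time required to populate the table can be well worth it. If you are using GiST or (especially) GIN, it really is best to drop the index and re-create it at the end.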

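If it's convenient to do so, drop the indexes and add them back at the end. If that isn't practical for you, don't worry about it too much.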
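As a rough sketch of that drop, truncate, load, re-create sequence: the table `weekly_tx`, the staging table `weekly_tx_staging`, the index name, and the columns below are all hypothetical stand-ins for your own schema. Wrapping the steps in one transaction is a design choice rather than a requirement; it means other sessions never see a half-loaded table, at the cost of holding the ACCESS EXCLUSIVE lock taken by TRUNCATE until COMMIT.

```sql
-- Hypothetical schema, created here only so the sketch is self-contained.
CREATE TABLE IF NOT EXISTS weekly_tx (account_id integer, amount numeric, tx_date date);
CREATE TABLE IF NOT EXISTS weekly_tx_staging (account_id integer, amount numeric, tx_date date);

BEGIN;

-- 1. Drop the index so the load does not have to maintain it row by row.
DROP INDEX IF EXISTS weekly_tx_account_idx;

-- 2. Clear out last week's data.
TRUNCATE TABLE weekly_tx;

-- 3. Load the new data into the now index-free table.
INSERT INTO weekly_tx (account_id, amount, tx_date)
SELECT account_id, amount, tx_date
FROM weekly_tx_staging;

-- 4. Build the index once, over the full data set.
CREATE INDEX weekly_tx_account_idx ON weekly_tx (account_id, tx_date);

COMMIT;
```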

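In my test with a regular b-tree, an incrementally built composite index was 3720 kB, versus 2208 kB for an index created in one pass. Build time was 164 ms (inserts) + 347 ms (index) versus 742 ms (inserts with the index in place). The difference is significant, but not enough to be a huge concern unless you are doing large-scale DW. A REINDEX after the inserts-with-index run took a further 342 ms. See the demo session below.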

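So @TomTom is right (unsurprisingly) that it's worth dropping and re-creating the indexes if it's convenient to do so, for example when you are bulk-populating a table for OLAP work.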

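However, reindexing is likely to be the wrong answer, because it means you do a whole lot of expensive work to build an index that you then throw away. Drop the index and re-create it instead of reindexing.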

Demo session:

    regress=# -- Create, populate, then create indexes:
    regress=# CREATE TABLE demo (someint integer, sometext text);
    CREATE TABLE
    regress=# \timing on
    regress=# INSERT INTO demo (someint, sometext) SELECT x, (x%100)::text FROM generate_series(1,100000) x;
    INSERT 0 100000
    Time: 164.678 ms
    regress=# CREATE INDEX composite_idx ON demo(sometext, someint);
    CREATE INDEX
    Time: 347.958 ms
    regress=# SELECT pg_size_pretty(pg_indexes_size('demo'::regclass));
     pg_size_pretty
    ----------------
     2208 kB
    (1 row)

    regress=# -- Total time: 347.958+164.678=512.636ms, index size 2208kB
    regress=# -- Now, with truncate and insert:
    regress=# TRUNCATE TABLE demo;
    TRUNCATE TABLE
    regress=# INSERT INTO demo (someint, sometext) SELECT x, (x%100)::text FROM generate_series(1,100000) x;
    INSERT 0 100000
    Time: 742.813 ms
    regress=# SELECT pg_size_pretty(pg_indexes_size('demo'::regclass));
     pg_size_pretty
    ----------------
     3720 kB
    (1 row)

    regress=# -- Total time 742ms, index size 3720kB
    regress=# -- Difference: about 45% time increase, about 68% index size increase.
    regress=# -- Big-ish, but whether you care depends on your application. Now:
    regress=# REINDEX INDEX composite_idx ;
    REINDEX
    Time: 342.283 ms
    regress=# SELECT pg_size_pretty(pg_indexes_size('demo'::regclass));
     pg_size_pretty
    ----------------
     2208 kB
    (1 row)

    regress=# -- Index is back to same size, but total time for insert with progressive
    regress=# -- index build plus reindex at the end is up to 742.813+342.283=1085.096ms,
    regress=# -- about twice as long as dropping the indexes, inserting the data, and
    regress=# -- re-creating the indexes took.

So:

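For OLAP, drop the indexes, insert, then re-create the indexes.
For OLTP, you will probably just want to stick with progressive index builds. Consider a non-100% fillfactor on the indexes to reduce insert costs.
Avoid inserting with progressive index builds and then reindexing; it's the worst of both worlds.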
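For the fillfactor suggestion, here is a minimal sketch of setting it when the index is created. The table and index names are hypothetical, and 70 is only an illustrative value, not a recommendation; PostgreSQL b-tree indexes accept a fillfactor from 10 to 100 and default to 90, so test what suits your insert pattern.

```sql
-- Hypothetical table mirroring the demo schema.
CREATE TABLE fillfactor_demo (someint integer, sometext text);

-- Leave roughly 30% of each leaf page free at build time, so that later
-- inserts are less likely to force page splits. The value 70 is an assumption
-- chosen for illustration.
CREATE INDEX fillfactor_demo_idx
    ON fillfactor_demo (sometext, someint)
    WITH (fillfactor = 70);
```

A lower fillfactor trades a larger index on disk for cheaper inserts, so it makes most sense for tables that keep receiving writes after the initial load.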

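Of course, the sizes used in this test are toy table sizes, so you should repeat the test on a sample of your real-world data and indexes to get a firm idea of how much difference it makes for you. I repeated these tests at a scale factor 100 times greater than the above and consistently found that the index was almost exactly twice the size when built incrementally, though the relative difference in build time actually fell for this particular test.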
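When repeating the test on a table with several indexes, it can help to look at each index separately rather than only the total from pg_indexes_size. A possible query, assuming the table is named `demo` as in the session above (substitute your own table name):

```sql
-- List every index on the table with its on-disk size, largest first.
SELECT c.relname AS index_name,
       pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE i.indrelid = 'demo'::regclass
ORDER BY pg_relation_size(i.indexrelid) DESC;
```

Running it once after an incremental load and once after a drop-and-re-create load gives a per-index comparison for your own data.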

So: test with your own data and schema.