Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
642 views
in Technique[技术] by (71.8m points)

sql - Add new column without table lock?

In my project having 23 million records and around 6 fields has been indexed of that table.

Earlier I tested to add delta column for Thinking Sphinx search but it turns in holding the whole database lock for an hour. Afterwards when the file is added and I try to rebuild indexes this is the query that holds the database lock for around 4 hours:

"update user_messages set delta = false where delta = true"

Well for making the server up I created a new database from db dump and promote it as database so server can be turned live.

Now what I am looking is that adding delta column in my table with out table lock is it possible? And once the column delta is added then why is the above query executed when I run the index rebuild command and why does it block the server for so long?

PS.: I am on Heroku and using Postgres with ika db model.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Postgres 11 or later

Since Postgres 11, only volatile default values still require a table rewrite. The manual:

Adding a column with a volatile DEFAULT or changing the type of an existing column will require the entire table and its indexes to be rewritten.

Bold emphasis mine. false is immutable. So just add the column with DEFAULT false. Super fast, job done:

ALTER TABLE tbl ADD column delta boolean DEFAULT false;

Postgres 10 or older, or for volatile DEFAULT

Adding a new column without DEFAULT or DEFAULT NULL will not normally force a table rewrite and is very cheap. Only writing actual values to it creates new rows. But, quoting the manual:

Adding a column with a DEFAULT clause or changing the type of an existing column will require the entire table and its indexes to be rewritten.

UPDATE in PostgreSQL writes a new version of the row. Your question does not provide all the information, but that probably means writing millions of new rows.

While doing the UPDATE in place, if a major portion of the table is affected and you are free to lock the table exclusively, remove all indexes before doing the mass UPDATE and recreate them afterwards. It's faster this way. Related advice in the manual.

If your data model and available disk space allow for it, CREATE a new table in the background and then, in one transaction: DROP the old table, and RENAME the new one. Related:

While creating the new table in the background: Apply all changes to the same row at once. Repeated updates create new row versions and leave dead tuples behind.

If you cannot remove the original table because of constraints, another fast way is to build a temporary table, TRUNCATE the original one and mass INSERT the new rows - sorted, if that helps performance. All in one transaction. Something like this:

BEGIN

SET temp_buffers = 1000MB;  -- or whatever you can spare temporarily

-- write-lock table here to prevent concurrent writes - if needed
LOCK TABLE tbl IN SHARE MODE;    

CREATE TEMP TABLE tmp AS
SELECT *, false AS delta
FROM   tbl;                -- copy existing rows plus new value
-- ORDER BY ???            -- opportune moment to cluster rows

-- DROP all indexes here

TRUNCATE tbl;              -- empty table - truncate is super fast

ALTER TABLE tbl ADD column delta boolean DEFAULT FALSE; -- NOT NULL?

INSERT INTO tbl
TABLE tmp;                 -- insert back surviving rows.

-- recreate all indexes here

COMMIT;

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...