Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
534 views
in Technique[技术] by (71.8m points)

sql server - What is the best practice for inserting a record if it doesn't already exist?

I know at least three ways to insert a record if it doesn't already exist in a table:

  1. The first one is using if not exist:

    IF NOT EXISTS(select 1 from table where <condition>)
        INSERT...VALUES
    
  2. The second one is using merge:

    MERGE table AS target  
    USING (SELECT values) AS source 
    ON (condition)  
    WHEN NOT MATCHED THEN  
    INSERT ... VALUES ...
    
  3. The third one is using insert...select:

    INSERT INTO table (<values list>)
    SELECT <values list>
    WHERE NOT EXISTS(select 1 from table where <condition>)
    

But which one is the best?

The first option seems to be not thread-safe, as the record might be inserted between the select statement in the if and the insert statement that follows, if two or more users try to insert the same record.

As for the second option, merge seems to be an overkill for this, as the documentation states:

Performance Tip: The conditional behavior described for the MERGE statement works best when the two tables have a complex mixture of matching characteristics. For example, inserting a row if it does not exist, or updating the row if it does match. When simply updating one table based on the rows of another table, improved performance and scalability can be achieved with basic INSERT, UPDATE, and DELETE statements.

So I think the third option is the best for this scenario (only insert the record if it doesn't already exist, no need to update if it does), but I would like to know what SQL Server experts think.

Please note that after the insert, I'm not interested to know whether the record was already there or whether it's a brand new record, I just need it to be there so that I can carry on with the rest of the stored procedure.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

When you need to guarantee the uniqueness of records on a condition that can not to be expressed by a UNIQUE or PRIMARY KEY constraint, you indeed need to make sure that the check for existence and insert are being done in one transaction. You can achieve this by either:

  1. Using one SQL statement performing the check and the insert (your third option)
  2. Using a transaction with the appropriate isolation level

There is a fourth way though that will help you better structure your code and also make it work in situations where you need to process a batch of records at once. You can create a TABLE variable or a temporary table, insert all of the records that need to be inserted in there and then write the INSERT, UPDATE and DELETE statements based on this variable.

Below is (pseudo)code demonstrating this approach:

-- Logic to create the data to be inserted if necessary

DECLARE @toInsert TABLE (idCol INT PRIMARY KEY,dataCol VARCHAR(MAX))
INSERT INTO @toInsert (idCol,dataCol) VALUES (1,'row 1'),(2,'row 2'),(3,'row 3')

-- Logic to insert the data

INSERT INTO realTable (idCol,dataCol)
SELECT TI.*
FROM @toInsert TI
WHERE NOT EXISTS (SELECT 1 FROM realTable RT WHERE RT.dataCol=TI.dataCol)

In many situations I use this approach as it makes the TSQL code easier to read, possible to refactor and apply unit tests to.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...