sql - How Do I Deep Copy a Set of Data, and Change FK References to Point to All the Copies?

Question

Welcome To Ask or Share your Answers For Others

sql - How Do I Deep Copy a Set of Data, and Change FK References to Point to All the Copies?

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

sql - How Do I Deep Copy a Set of Data, and Change FK References to Point to All the Copies?

Suppose I have Table A and Table B. Table B references Table A. I want to deep copy a set of rows in Table A and Table B. I want all of the new Table B rows to reference the new Table A rows.

Note that I'm not copying the rows into any other tables. The rows in table A will be copied into table A, and the rows in table B will be copied into table B.

How can I ensure that the foreign key references get readjusted as part of the copy?

To clarify, I'm trying to find a generic way to do this. The example I'm giving involves two tables, but in practice the dependency graph may be much more complicated. Even a generic way to dynamically generate SQL to do the work would be fine.

UPDATE:

People are asking why this is necessary, so I'll give some background. It may be way too much, but here goes:

I'm working with an old desktop application that's been moved to a client-server model. But, the application still uses a rudimentary in-house binary file format for storing data for its tables. A data file is just a header followed by a series of rows, each of which is just the binary serialized field values, the order of which is determined by a schema text file. The only thing good about it is that it's very fast. It's terrible in every other respect. I'm moving the application to SQL Server and trying not to degrade the performance too badly.

This is a kind of scheduling application; the data's not critical to anybody, and there's no audit tracking, etc. necessary. It's not a supermassive amount of data, and we don't necessarily need to keep very old data around if the database grows too large.

One feature that they are accustomed to is the ability to duplicate entire schedules in order to create "what-if" scenarios that they can muck with. Any user can do this as many times as they want, as often as they want. In the old database, the data files for each schedule are stored in their own data folder, identified by name. So, copying a schedule was as simple as copying the data folder and renaming it.

I must be able to do effectively the same thing with SQL Server or the migration will not work. Maybe you're thinking that I can just only copy the data that actually gets changed in order to avoid redundancy; but that honestly sounds too complicated to be feasible.

To throw another wrench into the mix, there can be a hierarchy of schedule data folders. So, a data folder may contain a data folder, which may contain a data folder. And the copying can occur at any level.

In SQL Server, I'm implementing a nested set hierarchy to mimic this. I have a DATA_SET table like this:

CREATE TABLE dbo.DATA_SET
(
    DATA_SET_ID UNIQUEIDENTIFIER PRIMARY KEY,
    NAME NVARCHAR(128) NOT NULL,
    LFT INT NOT NULL,
    RGT INT NOT NULL
)

So, there's a tree structure of data sets. Each data set represents a schedule, and may contain child data sets. Every row in every table has a DATA_SET_ID FK reference, indicating which data set it belongs to. Whenever I copy a data set, I copy all the rows in the table for that data set, and every other data set, into the same table, but referencing new data sets.

So, here's a simple concrete example:

CREATE TABLE FOO
(
    FOO_ID BIGINT PRIMARY KEY,
    DATA_SET_ID BIGINT FOREIGN KEY REFERENCES DATA_SET(DATA_SET_ID) NOT NULL
)


CREATE TABLE BAR
(
    BAR_ID BIGINT PRIMARY KEY,
    DATA_SET_ID BIGINT FOREIGN KEY REFERENCES DATA_SET(DATA_SET_ID) NOT NULL,
    FOO_ID UNIQUEIDENTIFIER PRIMARY KEY
)

INSERT INTO FOO
SELECT 1, 1 UNION ALL
SELECT 2, 1 UNION ALL
SELECT 3, 1 UNION ALL

INSERT INTO BAR
SELECT 1, 1, 1
SELECT 2, 1, 2
SELECT 3, 1, 3

So, let's say I copy data set 1 into a new data set of ID 2. After I copy, the tables will look like this:

FOO
FOO_ID, DATA_SET_ID
1    1
2    1
3    1
4    2
5    2
6    2

BAR
BAR_ID, DATA_SET_ID, FOO_ID
1    1    1
2    1    2
3    1    3
4    2    4
5    2    5
6    2    6

As you can see, the new BAR rows are referencing the new FOO rows. It's not the rewiring of the DATA_SET_ID's that I'm asking about. I'm asking about rewiring the foreign keys in general.

So, that was surely too much information, but there you go.

I'm sure there are a lot of concerns about performance with the idea of bulk copying the data like this. The tables are not going to be huge. I'm not expecting more than 1000 records in any table, and most of the tables will be much much smaller than that. Old data sets can be deleted outright with no repercussions.

Thanks, Tedderz

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T19:27:35+0000

Here is an example with three tables that can probably get you started.

DB schema

CREATE TABLE users
    (user_id int auto_increment PRIMARY KEY, 
     user_name varchar(32));
CREATE TABLE agenda
    (agenda_id int auto_increment PRIMARY KEY, 
     `user_id` int, `agenda_name` varchar(7));
CREATE TABLE events
    (event_id int auto_increment PRIMARY KEY, 
     `agenda_id` int, 
     `event_name` varchar(8));

An SP to clone a user with his agenda and events records

DELIMITER $$
CREATE PROCEDURE clone_user(IN uid INT)
BEGIN
    DECLARE last_user_id INT DEFAULT 0;

    INSERT INTO users (user_name)
    SELECT user_name
      FROM users
     WHERE user_id = uid;

    SET last_user_id = LAST_INSERT_ID();

    INSERT INTO agenda (user_id, agenda_name)
    SELECT last_user_id, agenda_name
      FROM agenda
     WHERE user_id = uid;

    INSERT INTO events (agenda_id, event_name)
    SELECT a3.agenda_id_new, e.event_name
      FROM events e JOIN
    (SELECT a1.agenda_id agenda_id_old, 
           a2.agenda_id agenda_id_new
      FROM
    (SELECT agenda_id, @n := @n + 1 n 
       FROM agenda, (SELECT @n := 0) n 
      WHERE user_id = uid 
      ORDER BY agenda_id) a1 JOIN
    (SELECT agenda_id, @m := @m + 1 m 
       FROM agenda, (SELECT @m := 0) m 
      WHERE user_id = last_user_id 
      ORDER BY agenda_id) a2 ON a1.n = a2.m) a3 
         ON e.agenda_id = a3.agenda_id_old;
END$$
DELIMITER ;

To clone a user

CALL clone_user(3);

Here is SQLFiddle demo.

Categories

sql - How Do I Deep Copy a Set of Data, and Change FK References to Point to All the Copies?

sql - How Do I Deep Copy a Set of Data, and Change FK References to Point to All the Copies?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags