Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
737 views
in Technique[技术] by (71.8m points)

storage - Should a database avoid deleting resources?

Allow me to elaborate: You know how every database has a users table which allows to store the user's account data. In that table there is a column (usually) named is_active which is just a boolean. And when a user "deletes" his/her account, we do not actually delete the data, we just change the column is_active to false. Okey all of this is fine and I understand why we do it. My question is, we don`t we do this for every other resource (for example, posts or comments)? Besides the enormous amount of data being stored is there any other reason not to this?

question from:https://stackoverflow.com/questions/66051644/should-a-database-avoid-deleting-resources

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Before saying anything, you should realize that "every database" is a very, very, very wide term, and by no meaning your assumption can be generalized beyond just a few specific cases. But I think I know what you mean, so let me try to explain.

The basic problem with deleting data is that it violates the laws of physics - namely, the fundamental principle that time only flows forward.

Data in a database is typically connected via some kind of references. The kind of references of course depends on the database itself, but let's say we are talking about relational databases such as MySQL or Oracle. In a well-designed schema, pretty much every row of every table will be connected to some other row in some other table.

When you are deleting a row, you will need to decide what to do with other rows that are referencing it. Two common scenarios are to set them to NULL, or to also delete the referencing rows (CASCADE). But neither of them solves the fundamental problem.

To illustrate, let's say you wanted to delete Hitler from your history database. Since there was a chain of events that led from his birth to where the world is today, and since every fact is somehow linked to every other fact, you would effectively have to rewrite entire history from that point on. Would there be a second world war? What would be the geo-political situation today? etc. Not a simple task.

Similarly, if you have a social site and you delete a user, what are you going to do with his content? If other people have commented on his posts, are you going to also delete that? And if those users participated in other discussions, are you going to delete that too? So again, you come to the point where you (a) either have to fake the history in such a way that it will "look" consistent, (b) to delete everything again and start over from scratch, or (c) to leave things where they are.

Since option (a) is theoretically impossible and (b) is practically impossible, you are left only with (c) - to leave data where it is, hide it from the view, and just buy more storage.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...