Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

sql - Anonymizing customer data for development or testing

I need to take production data with real customer info (names, addresses, phone numbers, etc.) and move it into a dev environment, but I'd like to remove any trace of the real customer info.

Some of the answers to this question can help me generate new test data, but how do I then replace those columns in my production data while keeping the other relevant columns?

Let's say I had a table with 10000 fake names. Should I do a cross-join with a SQL update? Or do something like

UPDATE table
SET lastname = (SELECT TOP 1 name FROM samplenames ORDER BY NEWID())

1 Reply

This is easier than it sounds if you understand the database. The one thing you must know is where personal info is not normalized. For instance, the customer master file will have a name and address, but the order file will also have a name and address, and those might be different.
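One rough way to find those places is to scan the schema for column names that look personal. The sketch below uses Python's sqlite3 module only so that it runs anywhere; the tables, columns, and the PII_HINTS list are hypothetical, and on SQL Server you would query INFORMATION_SCHEMA.COLUMNS rather than use PRAGMA table_info. A name-based scan can miss columns with unhelpful names, so treat the result as a starting list to review by hand.

```python
import sqlite3

# Hypothetical name fragments that hint at personal data; extend for your schema.
PII_HINTS = ("name", "address", "phone", "email")

def find_pii_columns(conn):
    """Return sorted (table, column) pairs whose column name contains a hint."""
    hits = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        for col in conn.execute(f'PRAGMA table_info("{table}")'):
            if any(hint in col[1].lower() for hint in PII_HINTS):
                hits.append((table, col[1]))
    return sorted(hits)

# Hypothetical schema: a master table plus a non-normalized copy on orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, lastname TEXT,
                            address TEXT, phone TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                         ship_name TEXT, ship_address TEXT, total REAL);
""")

pii_columns = find_pii_columns(conn)
for table, column in pii_columns:
    print(table, column)
```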

My basic process:

  1. Identify the data (i.e. the columns) and the tables that contain those columns.
  2. Identify the "master" tables for those columns, and also the non-normalized instances of those columns.
  3. Adjust the master files. Rather than trying to randomize the values (or make them phony), derive them from the key of the file. For customer 123, set the name to name123, the address to 123 123rd St, 123town, CA, USA, and the phone to 1231231231. This has the added bonus of making debugging very easy!
  4. Change the non-normalized instances, either by updating them from the master file or by applying the same kind of de-personalization.
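Steps 3 and 4 might look roughly like the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module so it can be tried as-is; the customers and orders tables and their columns are hypothetical, and the SQL would need adapting for SQL Server (for example, + or CONCAT instead of ||). For simplicity the address is built as "123 123 St" without the ordinal suffix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: customers is the master table, orders holds a
# non-normalized copy of the name and address.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT,
                            address TEXT, phone TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                         ship_name TEXT, ship_address TEXT, total REAL);
    INSERT INTO customers VALUES (123, 'Real Person', '9 Real Rd', '5550001111');
    INSERT INTO orders VALUES (1, 123, 'Real Person', '77 Other Ave', 19.99);
""")

# Step 3: derive every personal value in the master table from the key.
# The phone is the id repeated and cut to 10 characters.
conn.execute("""
    UPDATE customers
    SET name    = 'name' || id,
        address = id || ' ' || id || ' St, ' || id || 'town, CA, USA',
        phone   = substr(id || id || id || id || id ||
                         id || id || id || id || id, 1, 10)
""")

# Step 4: overwrite the non-normalized copies from the master table.
# Non-personal columns such as total are left untouched.
conn.execute("""
    UPDATE orders
    SET ship_name    = (SELECT name FROM customers c
                        WHERE c.id = orders.customer_id),
        ship_address = (SELECT address FROM customers c
                        WHERE c.id = orders.customer_id)
""")
conn.commit()

customer = conn.execute("SELECT * FROM customers").fetchone()
order = conn.execute("SELECT * FROM orders").fetchone()
print(customer)
print(order)
```

Because every value is a function of the key, running the script again produces the same result, and anyone looking at an order can tell at a glance which customer it belongs to.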

It doesn't look pretty, but it works.

