Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

append - Problems with an m:m Merge (Stata)

I am trying to merge two datasets that hold unemployment rates from different sources. The first has over 30 variables; I list only a few here as an example. In addition, each observation is measured at one year only; for Egypt it is 2005. It is structured as below:

year    country  Gender  Unemployment
2005    EGY      Female  7.6
2005    EGY      Male    9.2
2005    EGY      Total   .
2006    EGY      Female  7.6
2006    EGY      Male    9
2006    EGY      Total   .

The second comes from an annual survey, so each country has three entries per year (Total, Male, Female), and each country is covered from 1995 to 2019. It is structured as below:

country  Gender  year  Unemployment
EGY      Total   2005  12
EGY      Male    2005  7
EGY      Female  2005  17.5

I tried to merge the two datasets on country and year with both a 1:1 and a 1:m merge, and for both I get: "variables country year do not uniquely identify observations in the master data"
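
The error can be reproduced and diagnosed before any merge is attempted. The following is only a sketch: it re-enters the first example table by hand (the storage types are assumptions, not taken from the real file) and uses duplicates report and isid to check which sets of variables identify observations uniquely.

```stata
* Re-enter the first example table; values are copied from the question,
* storage types (int, str3, str6, double) are assumed for illustration.
clear
input int year str3 country str6 Gender double Unemployment
2005 "EGY" "Female" 7.6
2005 "EGY" "Male"   9.2
2005 "EGY" "Total"  .
2006 "EGY" "Female" 7.6
2006 "EGY" "Male"   9
2006 "EGY" "Total"  .
end

* Each country-year pair occurs three times, once per Gender.
duplicates report country year

* isid exits with an error when the listed variables are not a unique key;
* capture stores the return code in _rc instead of stopping the do-file.
capture isid country year
display "country year:        return code = " _rc

capture isid country year Gender
display "country year Gender: return code = " _rc
```

A nonzero return code for country year alongside a zero return code for country year Gender would indicate that Gender has to be part of the merge key.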

However, the merge ran with an m:m merge, as below:

merge m:m  country year using "DocumentsLMI.dta"
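
That an m:m merge runs without error does not mean it matched the right rows. As a hedged illustration, the sketch below rebuilds the 2005 rows of both example tables, renames the non-key variables of the second source (Gender_u and Unemployment_u are made-up names) so that they remain visible, and then runs the m:m merge on country and year. Both datasets are sorted with the stable option first so that the row order shown in the question is preserved.

```stata
* Second source, 2005 rows from the question. The renamed variables
* (Gender_u, Unemployment_u) are hypothetical names used only here.
clear
input str3 country str6 Gender int year double Unemployment
"EGY" "Total"  2005 12
"EGY" "Male"   2005 7
"EGY" "Female" 2005 17.5
end
rename (Gender Unemployment) (Gender_u Unemployment_u)
sort country year, stable
tempfile second
save `second'

* First source, 2005 rows only.
clear
input int year str3 country str6 Gender double Unemployment
2005 "EGY" "Female" 7.6
2005 "EGY" "Male"   9.2
2005 "EGY" "Total"  .
end
sort country year, stable

* Within each country-year group, m:m matches the first master row to the
* first using row, the second to the second, and so on. Gender plays no part.
merge m:m country year using `second'
list Gender Gender_u Unemployment Unemployment_u, noobs
```

With the row order above, the listing should show rows where Gender and Gender_u disagree, which is why the result of an m:m merge depends on sort order rather than on the data's meaning.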

Thanks to Nick's advice, I merged on the triple of country, year and Gender:

merge 1:1 country year Gender using "DocumentsLMI.dta"

And it worked well!

question from:https://stackoverflow.com/questions/65941898/problems-with-an-mm-merge-stata


1 Reply


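On the face of it, your datasets are identified by triples of country, year and Gender, and so qualify for merge 1:1 on those variables. An m:m merge instead pairs observations within each country-year group by their order in each dataset, regardless of Gender. So, the downside of an m:m merge appears to be that it is quite wrong.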

That statement doesn't solve any of the problems that come next:

  1. Unemployment is so named in both sets, so what do you expect or want Stata to do?

  2. In your data example, the values of Unemployment differ between the two datasets, although perhaps this is not true of the real data.
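
One possible way to handle both points, offered as a sketch rather than as the intended solution: by default merge keeps the master's values for any non-key variable that exists in both datasets (the update and replace options change that), so renaming Unemployment in the second source before merging keeps both series side by side for comparison. The name Unemployment_survey and the temporary file are assumptions made for this example.

```stata
* Second source from the question, with its rate renamed so that it is
* not silently dropped in favour of the master's Unemployment.
clear
input str3 country str6 Gender int year double Unemployment
"EGY" "Total"  2005 12
"EGY" "Male"   2005 7
"EGY" "Female" 2005 17.5
end
rename Unemployment Unemployment_survey
tempfile survey
save `survey'

* First source from the question.
clear
input int year str3 country str6 Gender double Unemployment
2005 "EGY" "Female" 7.6
2005 "EGY" "Male"   9.2
2005 "EGY" "Total"  .
2006 "EGY" "Female" 7.6
2006 "EGY" "Male"   9
2006 "EGY" "Total"  .
end

* Merge on the full key, then inspect which rows matched.
merge 1:1 country year Gender using `survey'
tabulate _merge

* Compare the two sources where both are present.
generate double diff = Unemployment - Unemployment_survey
list country year Gender Unemployment Unemployment_survey diff if _merge == 3, noobs
```

In this toy data only the 2005 rows exist in both sources, so the 2006 rows remain as master-only observations. Which series to trust when the two disagree is a substantive question that the merge cannot answer.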



...