Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


merge - Subsetting and Merging from 2 Related Data Frames in R

I have searched the archives without success for a solution to a problem involving two related data frames. One data frame is a key and the other is an annual list; I'd like to use the key to subset the annual list and to create an index. I have tried several subsetting expressions, but my code does not meet my criteria. Here is the data:

players <- c('Albert Belle','Reggie Jackson', 'Reggie Jackson')
contract_start_season <- c(1999,1977,1982)
contract_end_season <- c(2003, 1981, 1985)
key <- data.frame (player = players, contract_start_season, contract_end_season)
player_data <- data.frame( season = c(seq(1975,1985),seq(1997,2003)), player = c(rep('Reggie Jackson',times=11),rep('Albert Belle', times=7)))

I want to use the key to subset the player data to the contract years: 1977 to 1981 and then 1982 to 1985 for Reggie Jackson, and 1999 to 2003 for Albert Belle. I'd also like to create an index, so that, for example, Reggie Jackson's 1977 season would be year 1, 1978 would be year 2, etc.

The code I have tried without merging looks like this and it isn't working:

player_data[player_data$season >= key$contract_start_season&player_data$season <= key$contract_end_season,]

I am also running into problems when merging, because Reggie Jackson has two separate contracts and the merge tries to match both of them.

Any help or advice on this would be super appreciated.


1 Reply


Are you trying to do something along the following lines?

library(data.table)

key <- data.table(key)
player_data <- data.table(player_data)

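# Add a column called season (set to the contract start) to join on later
key[, season := contract_start_season]

# Set the same key columns on both tables; the join uses them in this order
setkeyv(key, c("player", "season"))
setkeyv(player_data, c("player", "season"))

# roll = Inf turns the exact join on season into a rolling join: each season is
# matched to the latest contract of that player that started on or before it
# (NA if there is none). It does not check contract_end_season, so seasons
# after a contract has ended would still need to be filtered out afterwards.
key[player_data, roll = Inf]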

Output:

> key[player_data, roll = Inf]
            player season contract_start_season contract_end_season
 1:   Albert Belle   1997                    NA                  NA
 2:   Albert Belle   1998                    NA                  NA
 3:   Albert Belle   1999                  1999                2003
 4:   Albert Belle   2000                  1999                2003
 5:   Albert Belle   2001                  1999                2003
 6:   Albert Belle   2002                  1999                2003
 7:   Albert Belle   2003                  1999                2003
 8: Reggie Jackson   1975                    NA                  NA
 9: Reggie Jackson   1976                    NA                  NA
10: Reggie Jackson   1977                  1977                1981
11: Reggie Jackson   1978                  1977                1981
12: Reggie Jackson   1979                  1977                1981
13: Reggie Jackson   1980                  1977                1981
14: Reggie Jackson   1981                  1977                1981
15: Reggie Jackson   1982                  1982                1985
16: Reggie Jackson   1983                  1982                1985
17: Reggie Jackson   1984                  1982                1985
18: Reggie Jackson   1985                  1982                1985
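
To get from that joined table to what the question asks for, the rows outside any contract still have to be dropped and the index added. The following is only a sketch under a few assumptions: the key table is renamed to `contracts` (a hypothetical name, chosen so it does not shadow data.table's `key()` function), the join column is cast to integer so both tables share a type, and the index (called `contract_year` here, also a made-up name) is assumed to restart at 1 for each contract, which the question does not state explicitly.

```r
library(data.table)

# Data from the question; 'contracts' stands in for the original 'key' table
contracts <- data.table(
  player = c("Albert Belle", "Reggie Jackson", "Reggie Jackson"),
  contract_start_season = c(1999, 1977, 1982),
  contract_end_season   = c(2003, 1981, 1985)
)
player_data <- data.table(
  season = c(seq(1975, 1985), seq(1997, 2003)),
  player = c(rep("Reggie Jackson", times = 11), rep("Albert Belle", times = 7))
)

# Join column: the contract start, as integer to match player_data$season
contracts[, season := as.integer(contract_start_season)]
setkeyv(contracts, c("player", "season"))
setkeyv(player_data, c("player", "season"))

# Rolling join: each season picks up the latest contract that started
# on or before it (NA if the player had no contract yet)
joined <- contracts[player_data, roll = Inf]

# Keep only seasons that fall inside the matched contract
in_contract <- joined[!is.na(contract_start_season) &
                        season <= contract_end_season]

# Index within each contract (assumption: restarts at 1 per contract)
in_contract[, contract_year := season - contract_start_season + 1]

print(in_contract)
```

With the sample data this keeps 14 of the 18 rows: Reggie Jackson's 1977 season gets `contract_year` 1, and his 1982 season starts again at 1 under the second contract. If the index should instead run continuously across a player's contracts, something like `in_contract[, idx := seq_len(.N), by = player]` might be closer to what is wanted.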
