This problem has to be solved in R only, not SQL.
I have been given the dataset described below.
Data dictionary:
UserID – 4848 customers who provided movie ratings (rows)
Movie 1 to Movie 206 – 206 movies rated by the 4848 distinct users (columns)
1) Which movies have the maximum views/ratings?
2) Which are the top 5 movies with the least audience?
I was able to get the maximum rating for each movie (column) with the code below. But after this, how do I limit the result to the movies with the highest rating? What kind of filter or function can be used?
I used this:
dataset <- read.csv("Amazon - Movies and TV Ratings.csv", row.names = 1)
sapply(dataset,max,na.rm=TRUE)
This gives me one row with the maximum value for each column (5, 5, 2, 5, 3, etc.).
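Since sapply() on a data frame returns a named vector, one possible way to narrow it down is to compare it against its own maximum, or to sort it and take the first few entries. The following is only a sketch on a made-up toy data frame (`toy` and its values are hypothetical, not the real file):

```r
# Toy stand-in for the real file (hypothetical values, not the actual data)
toy <- data.frame(
  Movie1 = c(5, NA, NA),
  Movie2 = c(NA, 2, NA),
  Movie3 = c(NA, 5, 3),
  row.names = c("User1", "User2", "User3")
)

# Named numeric vector: one maximum per movie.
# Note: a column that is entirely NA would give -Inf (with a warning).
col_max <- sapply(toy, max, na.rm = TRUE)

# Keep only the movies whose maximum equals the overall maximum
top_rated <- col_max[col_max == max(col_max)]
print(names(top_rated))

# Alternatively, sort by maximum rating and keep the first n movies
print(head(sort(col_max, decreasing = TRUE), 2))
```

With ratings on a 1 to 5 scale, many movies will probably share the same maximum, so the per-column maximum may not separate them very well.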
Sample dataset:
Movie1 Movie2 Movie3 Movie4 Movie5 Movie6
USer1 5 5 NA NA NA NA
USer2 NA NA 2 NA NA NA
USer3 NA NA NA 5 NA NA
USer4 NA NA NA 5 NA NA
USer5 NA NA NA NA 5 NA
USer6 NA NA NA NA 2 NA
USer7 NA NA NA NA 5 NA
USer8 NA NA NA NA 2 NA
USer9 NA NA NA NA 5 NA
USer10 NA NA NA NA 5 NA
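If "views" is taken to mean the number of users who rated a movie (an assumption, since the data only records ratings), then counting the non-NA cells per column would be one way to approach both questions. A minimal sketch on the ten sample rows above, rebuilt by hand (`sample_df`, `views`, `most_viewed` and `least_viewed` are names made up for this illustration):

```r
# The ten sample rows from above, rebuilt by hand
sample_df <- data.frame(
  Movie1 = c(5, rep(NA, 9)),
  Movie2 = c(5, rep(NA, 9)),
  Movie3 = c(NA, 2, rep(NA, 8)),
  Movie4 = c(NA, NA, 5, 5, rep(NA, 6)),
  Movie5 = c(rep(NA, 4), 5, 2, 5, 2, 5, 5),
  Movie6 = rep(NA_real_, 10),
  row.names = paste0("USer", 1:10)
)

# Number of users who rated each movie (non-NA cells per column)
views <- colSums(!is.na(sample_df))

# 1) Movie(s) with the most ratings; ties are all kept
most_viewed <- names(views)[views == max(views)]
print(most_viewed)

# 2) The five movies with the fewest ratings (ascending sort, first five)
least_viewed <- head(sort(views), 5)
print(least_viewed)
```

On the full file the same calls would be applied to `dataset` instead of `sample_df`. Note that head() cuts ties at the fifth position arbitrarily, so movies with the same count as the fifth one may be left out.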
Sample data screenshot: Amazon rating dataset