I have a CSV file with several groups, each identified by an ID, something like:
ID,X
aaa,3
aaa,5
aaa,4
bbb,50
bbb,54
bbb,52
I need to:
- calculate the mean of x in each group;
- divide each value of x by the mean of x for that specific group.
So, in my example above, the mean in the 'aaa' group is 4, while in 'bbb' it's 52.
I need to obtain a new dataframe with a third column, where in each row I have the original value of x divided by the group average:
ID,X,x/group_mean
aaa,3,3/4
aaa,5,5/4
aaa,4,4/4
bbb,50,50/52
bbb,54,54/52
bbb,52,52/52
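I can group the dataframe and calculate each group's mean with:
import pandas as pd

# index_col=0 makes 'ID' the index; groupby('ID') then groups on that index level
df_data = pd.read_csv('test.csv', index_col=0)
df_grouped = df_data.groupby('ID')
for group_name, group_content in df_grouped:
    # the column is named 'X' in the file header
    mean_x_group = group_content['X'].mean()
    print(f'mean = {mean_x_group}')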
but how do I add the third column?
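For reference, one possible approach is sketched below. It assumes the column is named 'X' as in the sample header, and it inlines the sample data through io.StringIO instead of reading 'test.csv' so that it runs on its own. The variable names csv_text and group_mean are only illustrative. The idea is that groupby(...).transform('mean') returns one value per original row (the mean of that row's group), so it lines up with the original column and can be used directly as a divisor.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
csv_text = """ID,X
aaa,3
aaa,5
aaa,4
bbb,50
bbb,54
bbb,52
"""
df = pd.read_csv(io.StringIO(csv_text))

# transform('mean') broadcasts each group's mean back to every row of that
# group, so the result has the same length and index as df.
group_mean = df.groupby('ID')['X'].transform('mean')

# Element-wise division gives the value of X relative to its group mean.
df['x/group_mean'] = df['X'] / group_mean

print(df)
```

If 'ID' is loaded as the index (index_col=0, as in the snippet above), the same groupby('ID') call should still work, since pandas can group on a named index level.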