python - Pandas row filters and division from specific rows and columns


posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)


I have the following DataFrame:

traffic_type    date        region   total_views
desktop         01/04/2018  aug      50
mobileweb       01/04/2018  aug      60
total           01/04/2018  aug      200
desktop         01/04/2018  world    20
mobileweb       01/04/2018  world    30
total           01/04/2018  world    40
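
For anyone who wants to reproduce this, the sample frame can be built roughly as below. This is only a sketch: the name `df` is an assumption chosen to match the answer (my own code further down calls the same frame `df2`), the dates are kept as plain strings, and the aug total is taken as 200 so that it agrees with the expected output.

```python
import pandas as pd

# Sample data from the question; dates are left as strings for simplicity.
# Assumption: the aug "total" row is 200, consistent with the expected 0.25 share.
df = pd.DataFrame({
    "traffic_type": ["desktop", "mobileweb", "total"] * 2,
    "date": ["01/04/2018"] * 6,
    "region": ["aug"] * 3 + ["world"] * 3,
    "total_views": [50, 60, 200, 20, 30, 40],
})
print(df)
```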

For each date and region, I need to find the row where traffic_type is total and, in that same row, create a desktop_share column equal to the total_views of the desktop row divided by the total_views of the total row (for aug, 50 / 200 = 0.25). The rest of the rows should be blank in this column. Expected output:

traffic_type    date        region   total_views desktop_share
desktop         01/04/2018  aug      50
mobileweb       01/04/2018  aug      60
total           01/04/2018  aug      200          0.25
desktop         01/04/2018  world    20
mobileweb       01/04/2018  world    30
total           01/04/2018  world    40           0.5

I have a long approach that works, but I am looking for something more concise using numpy or just pandas. My solution:

df1 = df2.loc[df2.traffic_type == 'desktop']
df1 = df1[['date', 'region', 'total_views']]
df1 = df2.merge(df1, how='left', on=['region', 'date'], suffixes=('', '_desktop'))
df1 = df1.loc[df1.traffic_type == 'total']
df1['desktop_share'] = df1['total_views_desktop'] / df1['total_views']
df1 = df1[['date', 'region', 'desktop_share', 'traffic_type']]

dfinal = df2.merge(df1, how='left', on=['region', 'date', 'traffic_type'])


1 Reply

深蓝 · Answer 1 · 2021-10-17T03:10:13+0000

One idea with pivoting:

# Wide table: one row per (date, region), one column per traffic_type
df1 = df.pivot_table(index=['date', 'region'],
                     columns='traffic_type',
                     values='total_views',
                     aggfunc='sum')
print(df1)
traffic_type       desktop  mobileweb  total
date       region                           
01/04/2018 aug          50         60    200
           world        20         30     40

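# Share = desktop / total per (date, region), tagged as belonging to the 'total' rows
df2 = (df1['desktop'].div(df1['total'])
                     .reset_index(name='desktop_share')
                     .assign(traffic_type='total'))

# Left merge on the shared columns (date, region, traffic_type); other rows get NaN
df = df.merge(df2, how='left')
print(df)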
  traffic_type        date region  total_views  desktop_share
0      desktop  01/04/2018    aug           50            NaN
1    mobileweb  01/04/2018    aug           60            NaN
2        total  01/04/2018    aug          200           0.25
3      desktop  01/04/2018  world           20            NaN
4    mobileweb  01/04/2018  world           30            NaN
5        total  01/04/2018  world           40           0.50

Another idea with MultiIndex:

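df1 = df.set_index(['traffic_type', 'date', 'region'])

# Relabel the desktop rows as 'total' so both Series share the same index
a = df1.xs('desktop', drop_level=False).rename({'desktop': 'total'})
b = df1.xs('total', drop_level=False)

# Division aligns on the index; rows absent from the result get NaN
df = df1.assign(desktop_share=a['total_views'].div(b['total_views'])).reset_index()
print(df)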
  traffic_type        date region  total_views  desktop_share
0      desktop  01/04/2018    aug           50            NaN
1    mobileweb  01/04/2018    aug           60            NaN
2        total  01/04/2018    aug          200           0.25
3      desktop  01/04/2018  world           20            NaN
4    mobileweb  01/04/2018  world           30            NaN
5        total  01/04/2018  world           40           0.50
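
A third possibility, offered only as a sketch and not as part of the ideas above, avoids both the pivot and the merge by broadcasting the desktop value to every row of its (date, region) group with `groupby(...).transform('sum')` and then masking with `Series.where`. It assumes there is at most one desktop row per date and region (otherwise the desktop values are summed) and rebuilds the sample data so it can run on its own; the helper names `desktop`, `desktop_per_group` and `is_total` are made up for illustration.

```python
import pandas as pd

# Sample data from the question (aug total taken as 200, as in the expected output).
df = pd.DataFrame({
    "traffic_type": ["desktop", "mobileweb", "total"] * 2,
    "date": ["01/04/2018"] * 6,
    "region": ["aug"] * 3 + ["world"] * 3,
    "total_views": [50, 60, 200, 20, 30, 40],
})

# Keep desktop views, set every other row to 0, then sum within each
# (date, region) group and broadcast that sum back to all rows of the group.
desktop = df["total_views"].where(df["traffic_type"].eq("desktop"), 0)
desktop_per_group = desktop.groupby([df["date"], df["region"]]).transform("sum")

# Compute the ratio everywhere, but keep it only on the 'total' rows;
# all other rows become NaN.
is_total = df["traffic_type"].eq("total")
df["desktop_share"] = (desktop_per_group / df["total_views"]).where(is_total)
print(df)
```

Because nothing is merged or reindexed, the original row order and index are left untouched, which may be convenient if the frame has other columns that should stay as they are.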
