Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python 3.x - How to sum the last 7 days in Pandas between two dates

Here is my raw data

[Image: Raw Data]

Here is the data (including types) after I add on the column 'Date_2wks_Ago' within Pandas

[Image not available]

I would like to add on a new column 'Rainfall_Last7Days' that calculates, for each day, the total amount of rainfall for the last week.

So (ignoring the other columns that aren't relevant) it would look a little like this...

[Image: Ideal Dataset]

Anyone know how to do this in Pandas?

My data is about 1000 observations long, so not huge.

Question from: https://stackoverflow.com/questions/66050401/how-to-sum-the-last-7-days-in-pandas-between-two-dates

He who fights dragons for too long becomes a dragon himself; gaze into the abyss for too long, and the abyss gazes back…

1 Reply


I think what you are looking for is the rolling() function.

This section recreates a simplified version of your table:

import pandas as pd

# Daily rainfall readings, one per date in the list below
rainfall_from_9am = [
    4.6, 0.4, 3.6, 3.5, 3.2, 5.5, 2.2, 1.3,
    0, 0, 0.04, 0, 0, 0, 0.04, 0.4,
]

date = [
    '2019-02-03', '2019-02-04', '2019-02-05', '2019-02-06',
    '2019-02-07', '2019-02-08', '2019-02-09', '2019-02-10',
    '2019-02-11', '2019-02-12', '2019-02-13', '2019-02-14',
    '2019-02-15', '2019-02-16', '2019-02-17', '2019-02-18',
]

# Create df from the lists ('date' first, to match the column order in the output below)
df = pd.DataFrame({
    'date': date,
    'rainfall_from_9am': rainfall_from_9am,
})

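This part calculates the rolling sum of rainfall over the current record and the previous 6 records. The first 6 rows come out as NaN because they do not yet have a full window of 7 records.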

df['rain_last7days']=df['rainfall_from_9am'].rolling(7).sum()

print(df)
          

Output:

          date  rainfall_from_9am  rain_last7days
0   2019-02-03               4.60             NaN
1   2019-02-04               0.40             NaN
2   2019-02-05               3.60             NaN
3   2019-02-06               3.50             NaN
4   2019-02-07               3.20             NaN
5   2019-02-08               5.50             NaN
6   2019-02-09               2.20           23.00
7   2019-02-10               1.30           19.70
8   2019-02-11               0.00           19.30
9   2019-02-12               0.00           15.70
10  2019-02-13               0.04           12.24
11  2019-02-14               0.00            9.04
12  2019-02-15               0.00            3.54
13  2019-02-16               0.00            1.34
14  2019-02-17               0.04            0.08
15  2019-02-18               0.40            0.48
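
One thing to keep in mind: rolling(7) counts records, not calendar days, so if a date is missing from the data the window reaches further back than a week. Below is a minimal sketch of a date-based window instead. It assumes the date column can be parsed with pd.to_datetime and is sorted ascending; df_gap and the dropped day (2019-02-06) are made up purely for illustration.

```python
import pandas as pd

# Illustrative data only: the first 8 days from above, with 2019-02-06 removed
# to show how a calendar-based window treats a gap.
df_gap = pd.DataFrame({
    'date': pd.to_datetime(['2019-02-03', '2019-02-04', '2019-02-05',
                            '2019-02-07', '2019-02-08', '2019-02-09',
                            '2019-02-10']),
    'rainfall_from_9am': [4.6, 0.4, 3.6, 3.2, 5.5, 2.2, 1.3],
})

# '7D' is a time offset: each window covers the 7 calendar days ending on
# (and including) that row's date. This needs a sorted DatetimeIndex.
rolled = df_gap.set_index('date')['rainfall_from_9am'].rolling('7D').sum()

# Copy the values back by position, since df_gap still has its integer index
df_gap['rain_last7days'] = rolled.to_numpy()

print(df_gap)
```

With an offset window there are no leading NaN values, because pandas sums whatever rows fall inside the 7 days. For 2019-02-10 the window runs from 2019-02-04 to 2019-02-10, so the reading on 2019-02-03 is left out even though only 6 records are in range.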

I'm conscious that this output does not exactly match the example in your original question. Could you confirm the logic you are after?
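
If the difference comes down to how the window is defined, here are two variants that might be closer to what you want. This is only a sketch on the first 8 rainfall values from above; the names rain, partial and previous_only are mine, and which variant (if either) is right depends on the logic you confirm.

```python
import pandas as pd

# First 8 rainfall values from the example above
rain = pd.Series([4.6, 0.4, 3.6, 3.5, 3.2, 5.5, 2.2, 1.3])

# Variant 1: allow partial windows, so the early rows sum whatever
# history exists instead of returning NaN
partial = rain.rolling(7, min_periods=1).sum()

# Variant 2: sum the 7 records *before* the current one, excluding the
# current record itself (shift moves every value down by one row first)
previous_only = rain.shift(1).rolling(7).sum()

print(pd.DataFrame({'rain': rain,
                    'partial': partial,
                    'previous_only': previous_only}))
```

In variant 2 the first 7 rows are NaN, and row 7 holds the total of rows 0 to 6 (about 23.0), which is the same figure the original rolling(7) puts on row 6.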

