Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Pandas rolling window over irregular series with float index

I have a time series with a float index representing minutes from the start of an experiment. The observations are not perfectly equally spaced. I am computing a rolling mean. Here is some example data:

import pandas as pd

S = pd.Series([0,3,2,6,4,7,7,9,11,13,12,12,11,9,6,7,3,5,4],
              index=[0.01,0.13,0.2,0.29,0.4,0.5,0.59,0.68,0.79,0.9,1.0,1.1,1.19,1.29,1.4,1.5,1.6,1.71,1.8])
Sr = S.rolling(3, win_type='triang', center=True).mean()

In my real data the window spans several hundred data points. Thus, I would like it to always span the same time (in index units) instead of a fixed number of observations. I found that this is possible with datetime indexes; however, I need the index to be float for further calculations. Is there any way of doing this without converting the index to datetime and back again?

Pseudo-function:

Sr = S.rolling(0.3, win_type='triang', center=True, on=index).mean()  # pseudo-code, not a valid pandas call

Expected output for this example:

for each index i: mean over window from i-0.15 to i+0.15 (with triangular weighting according to distance from i)



1 Reply


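I do not think this can be done with the rolling method directly: as far as I know, a weighted window (win_type) requires an integer window size, and offset-based windows require a datetime-like index.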

Out of interest, it can be done manually as follows:

from scipy.signal.windows import triang
import numpy as np
import pandas as pd

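def triangular(a):
    # Triangular-weighted mean of a. Weights follow position in the window
    # (not distance from the centre) and are normalised to sum to 1, which
    # also holds for windows with more than three points.
    w = triang(a.size)
    return w @ a / w.sum()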

S = pd.Series([0,3,2,6,4,7,7,9,11,13,12,12,11,9,6,7,3,5,4],
              index=[0.01,0.13,0.2,0.29,0.4,0.5,0.59,0.68,0.79,0.9,1.0,1.1,1.19,1.29,1.4,1.5,1.6,1.71,1.8])

df = pd.DataFrame({'S': S})
# For each index value x, collect the observations in the window (x - 0.15, x + 0.15].
df['neighbours'] = df.index.to_series().apply(lambda x: [df.loc[index, 'S'] for index in df.index if x - 0.15 < index <= x + 0.15])
df['rolling_mean'] = df.neighbours.apply(lambda x: triangular(np.array(x)))
df.drop('neighbours', axis=1, inplace=True)

print(df)

Output:

       S  rolling_mean
0.01   0          1.50
0.13   3          2.00
0.20   2          3.25
0.29   6          4.50
0.40   4          5.25
0.50   7          6.25
0.59   7          7.50
0.68   9          9.00
0.79  11         11.00
0.90  13         12.25
1.00  12         12.25
1.10  12         11.75
1.19  11         10.75
1.29   9          8.75
1.40   6          7.00
1.50   7          5.75
1.60   3          4.50
1.71   5          4.25
1.80   4          4.50
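
Note that the code above weights neighbours by their position inside the window, whereas the question asks for weights that depend on the distance from i. A minimal sketch of that variant follows, assuming weights that fall linearly from 1 at distance 0 to 0 at half the window width. The function name triangular_rolling_mean and its width parameter are made up for this illustration and are not part of pandas. Its results will differ from the table above.

```python
import numpy as np
import pandas as pd

def triangular_rolling_mean(s, width):
    """Mean over [i - width/2, i + width/2], weighted by distance from i."""
    x = s.index.to_numpy(dtype=float)
    y = s.to_numpy(dtype=float)
    # dist[i, j] = |x[j] - x[i]|; this needs O(n^2) memory.
    dist = np.abs(x[None, :] - x[:, None])
    # Weight 1 at distance 0, falling linearly to 0 at distance width/2.
    # Points outside the window get weight 0.
    weights = np.clip(1.0 - dist / (width / 2), 0.0, None)
    # Each row contains its own point with weight 1, so the sum is never 0.
    return pd.Series(weights @ y / weights.sum(axis=1), index=s.index)

S = pd.Series([0,3,2,6,4,7,7,9,11,13,12,12,11,9,6,7,3,5,4],
              index=[0.01,0.13,0.2,0.29,0.4,0.5,0.59,0.68,0.79,0.9,1.0,1.1,1.19,1.29,1.4,1.5,1.6,1.71,1.8])
Sr = triangular_rolling_mean(S, 0.3)
print(Sr)
```

The index stays float throughout. The full distance matrix is fine for a few thousand points; for much longer series a loop over windows located with np.searchsorted would probably be the better trade-off.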

I doubt, however, that this is simpler than converting the float index into datetime and then back.
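
For comparison, the conversion route might look roughly like the sketch below. It assumes a pandas version that supports center=True for time-based windows (1.3 or later, as far as I know) and it computes a plain, unweighted mean, because to my knowledge win_type cannot be combined with time-based windows. The variable names td and rolled are only illustrative.

```python
import pandas as pd

S = pd.Series([0,3,2,6,4,7,7,9,11,13,12,12,11,9,6,7,3,5,4],
              index=[0.01,0.13,0.2,0.29,0.4,0.5,0.59,0.68,0.79,0.9,1.0,1.1,1.19,1.29,1.4,1.5,1.6,1.71,1.8])

# Interpret the float index as minutes and convert it to a TimedeltaIndex.
td = S.copy()
td.index = pd.to_timedelta(S.index.to_numpy(), unit="min")

# 0.3 minutes wide, centred on each observation; unweighted mean.
rolled = td.rolling(pd.Timedelta(minutes=0.3), center=True).mean()

# Put the original float index back for further calculations.
rolled.index = S.index
print(rolled)
```

Reassigning the saved float index avoids any rounding from converting the timedeltas back to minutes.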

