Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - pandas dataframe resample per day without a datetime index

I have a dataframe in pandas of the following form:

             timestamps   light
7   2004-02-28 00:58:45  150.88
26  2004-02-28 00:59:45  143.52
34  2004-02-28 01:00:45  150.88
42  2004-02-28 01:01:15  150.88
59  2004-02-28 01:02:15  150.88

Note that the timestamps column is not the index.

I want to resample (or otherwise bin) the data to get the average value of the light column per minute, hour, day, etc. I have looked into the resample method that pandas offers, and it seems to require the dataframe to have a datetime index (unless I've misunderstood this).

  1. Can I re-index the dataframe to use timestamps as the index? Note that timestamps are not unique: each timestamp appears in about 30 rows, one per sensor.

  2. If not, is there another way to get a dataframe with the average value of light per hour, per day, per month, etc.?

Any help would be appreciated.

  Asked by Nikhil; translated from Stack Overflow

1 Reply

You are right: resample needs a DatetimeIndex, TimedeltaIndex or PeriodIndex; otherwise it raises this error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'
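
The error can be reproduced with a small frame shaped like the one in the question. This is only a sketch: the frame below is rebuilt by hand from the sample rows, and the exact wording of the message (in particular the index class named at the end) may differ between pandas versions.

```python
import pandas as pd

# Sample frame mirroring the question: a plain integer index
# and a separate timestamps column.
df = pd.DataFrame(
    {
        "timestamps": pd.to_datetime(
            ["2004-02-28 00:58:45", "2004-02-28 00:59:45", "2004-02-28 01:00:45"]
        ),
        "light": [150.88, 143.52, 150.88],
    },
    index=[7, 26, 34],
)

# resample bins by the index, and the index here is not datetime-like,
# so pandas refuses with a TypeError.
try:
    df.resample("1D").mean()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```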

So, if the original index is important, first call reset_index to keep it as a column and then set_index:

print (df.reset_index().set_index('timestamps'))
                     index   light
timestamps                        
2004-02-28 00:58:45      7  150.88
2004-02-28 00:59:45     26  143.52
2004-02-28 01:00:45     34  150.88
2004-02-28 01:01:15     42  150.88
2004-02-28 01:02:15     59  150.88

If the original index is not needed, set_index alone is enough:

print (df.set_index('timestamps'))
                      light
timestamps                 
2004-02-28 00:58:45  150.88
2004-02-28 00:59:45  143.52
2004-02-28 01:00:45  150.88
2004-02-28 01:01:15  150.88
2004-02-28 01:02:15  150.88
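
On the question of repeated timestamps: as far as I know, set_index does not require unique values, so several rows can share one timestamp. The sketch below uses made-up readings (two hypothetical sensors per timestamp, stored as strings) to show that, and also that the column has to be converted with pd.to_datetime first if it is not already a datetime column.

```python
import pandas as pd

# Hypothetical readings: two sensors report at each timestamp,
# so every timestamp appears twice.
raw = pd.DataFrame(
    {
        "timestamps": [
            "2004-02-28 00:58:45",
            "2004-02-28 00:58:45",
            "2004-02-28 01:00:45",
            "2004-02-28 01:00:45",
        ],
        "light": [150.0, 140.0, 100.0, 120.0],
    }
)

# Strings are not enough for resample; convert the column to datetime64 first.
raw["timestamps"] = pd.to_datetime(raw["timestamps"])

# set_index accepts duplicate values (verify_integrity defaults to False).
indexed = raw.set_index("timestamps")
has_duplicates = indexed.index.has_duplicates

# Rows sharing a timestamp fall into the same bin and are averaged together.
hourly = indexed.resample("h").mean()
print(hourly)
```

Here the two 00:58:45 rows land in the 00:00 bin and the two 01:00:45 rows in the 01:00 bin.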

Then call resample:

print (df.reset_index().set_index('timestamps').resample('1D').mean())
            index    light
timestamps                
2004-02-28   33.6  149.408
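
As an alternative that is not part of the answer above, resample also takes an on= argument naming a datetime column, and groupby with pd.Grouper bins the same way, so the index can be left untouched. A minimal sketch, assuming the timestamps column already has a datetime dtype and using the five sample rows from the question:

```python
import pandas as pd

# The five sample rows from the question, with the original integer index.
df = pd.DataFrame(
    {
        "timestamps": pd.to_datetime(
            [
                "2004-02-28 00:58:45",
                "2004-02-28 00:59:45",
                "2004-02-28 01:00:45",
                "2004-02-28 01:01:15",
                "2004-02-28 01:02:15",
            ]
        ),
        "light": [150.88, 143.52, 150.88, 150.88, 150.88],
    },
    index=[7, 26, 34, 42, 59],
)

# on= tells resample to bin by a datetime column instead of the index.
per_hour = df.resample("h", on="timestamps")["light"].mean()
per_day = df.resample("D", on="timestamps")["light"].mean()

# pd.Grouper expresses the same binning through groupby.
per_day_grouped = df.groupby(pd.Grouper(key="timestamps", freq="D"))["light"].mean()

print(per_hour)
print(per_day)
```

Selecting the light column before calling mean keeps the old integer index out of the averages, which avoids the meaningless index column in the output above.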
