Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Resampling pandas dataframe is deleting column

                    Val         ts  year  doy     interpolat  region_id
2000-02-18          NaN  950832000  2000   49           NaN      19987
2000-03-05          NaN  952214400  2000   65           NaN      19987
2000-03-21          NaN  953596800  2000   81           NaN      19987
2000-04-06  0.402539365  954979200  2000   97           NaN      19987
2000-04-22   0.54021746  956361600  2000  113           NaN      19987

The above dataframe has a datetime index. I resample it like so:

df = df.resample('D')

However, this resampling results in this dataframe:

                    ts  year  doy    interpolat  region_id
2000-01-01  1199180160  2008    1             1      19990
2000-01-02         NaN   NaN  NaN           NaN        NaN
2000-01-03         NaN   NaN  NaN           NaN        NaN
2000-01-04         NaN   NaN  NaN           NaN        NaN
2000-01-05         NaN   NaN  NaN           NaN        NaN

Why did the 'Val' column disappear? All the other columns seem to be messed up too. See "Linearly interpolate missing rows in pandas dataframe" for an explanation of where the dataframe comes from.

-- EDIT: Based on @unutbu's questions:

df.reset_index().to_dict('list')

{'index': [Timestamp('2000-02-18 00:00:00'), Timestamp('2000-03-05 00:00:00'), Timestamp('2000-03-21 00:00:00'), ... '0.670709965', '0.631584375', '0.562112815', '0.50740686', '0.4447712', '0.47880806', nan, nan]}

-- EDIT: The csv file for the above data frame in its entirety is here:

https://www.dropbox.com/s/dp76hk6yfs6c1og/test.csv?dl=0

1 Reply

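The Val column probably does not have a numeric dtype: the to_dict output in the question shows its values as strings (e.g. '0.670709965'), which means the column has object dtype. resample drops all non-numeric columns (e.g. object dtype) when it aggregates, which is why Val disappears.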

To check, look at the output of df.info(), which lists the dtype of each column.
To convert Val to a numeric column, you can use astype(float) or convert_objects (replaced by pd.to_numeric starting from v0.17).
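As a rough sketch of the fix described above: the small frame below is made-up data that only mimics the question's layout (a datetime index and a Val column stored as strings), and the use of errors="coerce" and .mean() are my assumptions rather than something the question states. Recent pandas versions require an explicit aggregation such as .mean() after resample, and may raise an error on object columns instead of silently dropping them, so converting first is the safer order.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: Val holds numbers stored as strings, as in the question.
idx = pd.to_datetime(["2000-04-06", "2000-04-22", "2000-05-08"])
df = pd.DataFrame(
    {
        "Val": ["0.402539365", "0.54021746", np.nan],
        "region_id": [19987, 19987, 19987],
    },
    index=idx,
)

# Val shows up as object dtype, which is what makes resample drop it.
print(df.dtypes)

# Convert the strings to floats; values that cannot be parsed become NaN.
df["Val"] = pd.to_numeric(df["Val"], errors="coerce")

# Resample to daily frequency. Days without data are filled with NaN,
# and Val is kept because it is now numeric.
daily = df.resample("D").mean()
print(daily.head())
```

After the conversion, df.info() should report Val as float64, and the resampled frame keeps the column.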
