Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
207 views
in Technique[技术] by (71.8m points)

python 2.7 - group data by season according to the exact dates

i have a csv file containing 4 years of data and i am trying to group data per season over the 4 years , differently saying, i need to summarize and plot my whole data into 4 season only . here's a look on my data file :

timestamp,heure,lat,lon,impact,type
2006-01-01 00:00:00,13:58:43,33.837,-9.205,10.3,1
2006-01-02 00:00:00,00:07:28,34.5293,-10.2384,17.7,1
2007-02-01 00:00:00,23:01:03,35.0617,-1.435,-17.1,2
2007-02-02 00:00:00,01:14:29,36.5685,0.9043,36.8,1
2008-01-01 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1
2008-01-02 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1
....
2011-12-31 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1

and here's my desired output :

winter     (the mean value of impacts)
summer     (the mean value of impacts)
autumn      ....
spring      .....

Actually i've tried this code :

names =["timestamp","heure","lat","lon","impact","type"]
data = pd.read_csv('flash.txt',names=names, parse_dates=['timestamp'],index_col=['timestamp'], dayfirst=True)

spring = range(80, 172)
summer = range(172, 264)
fall = range(264, 355)

def season(x):
    if x in spring:
       return 'Spring'
    if x in summer:
       return 'Summer'
    if x in fall:
       return 'Fall'
   else :
       return 'Winter'

 data['SEASON'] = data.index.to_series().dt.month.map(lambda x : season(x))
 data['impact'] = data['impact'].abs()
 seasonly = data.groupby('SEASON')['impact'].mean()

and i got this horrible result : enter image description here

where am i mistaken ?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You need DatetimeIndex.dayofyear:

data['SEASON'] = data.index.dayofyear.map(season)

Another solution with pandas.cut:

bins = [0, 91, 183, 275, 366]
labels=['Winter', 'Spring', 'Summer', 'Fall']
doy = data.index.dayofyear
data['SEASON1'] = pd.cut(doy + 11 - 366*(doy > 355), bins=bins, labels=labels)

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...