Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Parsing CSV file using pandas

I have been using matplotlib for quite some time now and it is great. However, I want to switch to pandas, and my first attempt at it didn't go so well.

My data set looks like this:

sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
zack,145,784,2.6,547
mike,110,984,2.6,548
etc.....

First, I want to search the CSV for every row with the name mike and put those rows in their own list. With this list I want to be able to do some math, for example add sam[3] + winter[4] or compute sam[1]/10. The last part would be to plot its columns against each other.

Going through this page

http://pandas.pydata.org/pandas-docs/stable/io.html#io-read-csv-table

The only examples I see assume the file has a column header. However, I don't have any headers; I only know the position in a row of the values I want.

So my questions are:

  1. How do I create a list for each name (sam, winter, summer)?
  2. Is this method efficient if my CSV has millions of data points?
  3. Can I use matplotlib to plot a pandas DataFrame?

For example:

fig1 = plt.figure(figsize= (10,10))
ax = fig1.add_subplot(211)
ax.plot(mike[1], winter[3], label='Mike vs Winter speed', color = 'red')

1 Reply


You can read a CSV without headers by passing header=None:

import pandas as pd

data = pd.read_csv(filepath, header=None)
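
As a minimal sketch of that call, assuming the sample rows from the question and using io.StringIO in place of a real file. The column names passed to names= are invented for illustration; the question does not say what the columns mean.

```python
import io

import pandas as pd

# Sample rows copied from the question; StringIO stands in for a file on disk.
csv_text = """sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
zack,145,784,2.6,547
mike,110,984,2.6,548
"""

# header=None: the first line is treated as data, so columns are labelled 0, 1, 2, ...
data = pd.read_csv(io.StringIO(csv_text), header=None)

# Optional: attach labels at read time. These names are hypothetical.
named = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["name", "a", "b", "c", "d"],
)

print(data.shape)
print(list(named.columns))
```

With names= supplied, columns can be addressed as named["a"] instead of data[1], which tends to be easier to read once the file has more than a handful of columns.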

Columns are numbered starting from 0. To select the rows that match a name, filter on column 0:

all_summers = data[data[0] == 'summer']
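
The same filter covers the arithmetic from the question (sam[1]/10 and sam[3] + winter[4]). A rough sketch, again assuming the question's sample rows; the variable names are only illustrative:

```python
import io

import pandas as pd

csv_text = """sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
zack,145,784,2.6,547
mike,110,984,2.6,548
"""
data = pd.read_csv(io.StringIO(csv_text), header=None)

# Boolean mask: keep only the rows whose first column equals the given name.
sam = data[data[0] == "sam"]
winter = data[data[0] == "winter"]

# Arithmetic with a scalar is applied to every element of the column.
sam_scaled = sam[1] / 10

# Adding two Series aligns them on the row index, and sam and winter come
# from different rows. Converting to NumPy arrays adds them position by
# position instead; this only works when both have the same number of rows.
total = sam[3].to_numpy() + winter[4].to_numpy()

print(sam_scaled.tolist())
print(total.tolist())
```

Adding sam[3] + winter[4] directly as Series would yield NaN for every row here, because the two selections share no index labels.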

If you want to do some operations grouping by the first column, it will look like this:

data.groupby(0).sum()
data.groupby(0).count()
...

Selecting a row after grouping:

sums = data.groupby(0).sum()
sums.loc['sam']
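
For the first question (one collection per name), iterating over the groupby object is one possible approach. A sketch assuming the question's sample rows; the dictionary name groups is an arbitrary choice:

```python
import io

import pandas as pd

csv_text = """sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
zack,145,784,2.6,547
mike,110,984,2.6,548
"""
data = pd.read_csv(io.StringIO(csv_text), header=None)

# Iterating a groupby yields (key, sub-DataFrame) pairs, one per distinct name.
groups = {name: frame for name, frame in data.groupby(0)}

# Aggregates are indexed by the group key, so .loc looks a name up directly.
sums = data.groupby(0).sum()
counts = data.groupby(0).count()

print(sorted(groups))
print(sums.loc["sam"].tolist())
print(counts.loc["summer", 1])
```

groups["mike"] is then a DataFrame holding only the mike rows, and groups["mike"][1] is the column the question writes as mike[1].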

Plotting example:

import matplotlib.pyplot as plt

sums.plot()
plt.show()
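
On the third question: matplotlib's plotting functions accept pandas Series as x and y, and DataFrame.plot takes an ax= argument, so the two can share a figure. A sketch under the same sample-data assumption; the choice of which columns to plot is arbitrary, and the Agg backend is selected only so the example runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # off-screen backend; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

csv_text = """sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
zack,145,784,2.6,547
mike,110,984,2.6,548
"""
data = pd.read_csv(io.StringIO(csv_text), header=None)

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))

# Plain matplotlib call: pandas columns are passed directly as x and y.
ax1.plot(data[2], data[4], color="red", label="column 2 vs column 4")
ax1.legend()

# pandas plotting drawn onto an existing matplotlib Axes via ax=.
sums = data.groupby(0).sum()
sums.plot(ax=ax2)

# plt.show() would display the figure when an interactive backend is in use.
```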

For more details about plotting, see: http://pandas.pydata.org/pandas-docs/version/0.18.1/visualization.html
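
On the second question (millions of data points): by default read_csv loads the whole file into memory. If that becomes a problem, one option is its chunksize argument, which returns an iterator of DataFrames instead of a single one. A rough sketch, assuming per-name sums are the goal; the chunk size of 3 is only there so the small sample splits into several pieces.

```python
import io

import pandas as pd

csv_text = """sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
zack,145,784,2.6,547
mike,110,984,2.6,548
"""

partial_sums = []
# With chunksize set, read_csv yields DataFrames of at most that many rows.
for chunk in pd.read_csv(io.StringIO(csv_text), header=None, chunksize=3):
    partial_sums.append(chunk.groupby(0).sum())

# A name can appear in several chunks, so group the partial results again
# by the index (the name) and add them up.
sums = pd.concat(partial_sums).groupby(level=0).sum()

print(sums.loc["sam"].tolist())
```

This pattern works for aggregates that can be combined from partial results, such as sums and counts; something like a median would need a different approach.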

