Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - limit how much data is read with numpy.genfromtxt for matplotlib

I am creating a graph in Python, using a text file as the source data and matplotlib to plot it. The simple logic below works well.

But is there a way to have numpy.genfromtxt read only the first 50 lines of the file 'temperature_logging'? Currently it reads the entire file.

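import numpy
import matplotlib.dates
import pylab

# Column 0 holds the temperature, column 1 the timestamp
temp = numpy.genfromtxt('temperature_logging',dtype=None,usecols=(0))
time = numpy.genfromtxt('temperature_logging',dtype=None,usecols=(1))

# Convert the timestamp strings to matplotlib date numbers
dates = matplotlib.dates.datestr2num(time)

pylab.plot_date(dates,temp,xdate=True,fmt='b-')

pylab.savefig('gp.png')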

Contents of temperature_logging:

21.75 12-01-2012-15:53:35    
21.75 12-01-2012-15:54:35    
21.75 12-01-2012-15:55:35    
.
.
.


1 Reply


numpy.genfromtxt accepts iterators as well as files. That means it will accept the output of itertools.islice. Here, test.txt is a five-line file:

>>> import itertools, numpy
>>> with open('test.txt') as t_in:
...     numpy.genfromtxt(itertools.islice(t_in, 3))
... 
array([[  1.,   2.,   3.,   4.,   5.],
       [  6.,   7.,   8.,   9.,  10.],
       [ 11.,  12.,  13.,  14.,  15.]])
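Applied to the question, the same idea might look like the sketch below. The helper name read_first_rows and the temporary sample file are illustrative assumptions, not part of the original post; the sample only mimics the two-column layout of 'temperature_logging'. It also assumes a NumPy version that accepts text-mode (str) lines from an iterator.

```python
import itertools
import os
import tempfile

import numpy


def read_first_rows(path, nrows, **kwargs):
    """Parse only the first `nrows` lines of `path` with numpy.genfromtxt.

    Hypothetical helper; extra keyword arguments go straight to genfromtxt.
    """
    with open(path) as handle:
        # islice stops yielding after nrows lines, so genfromtxt never
        # parses the remainder of the file.
        return numpy.genfromtxt(itertools.islice(handle, nrows), **kwargs)


# Build a small stand-in for 'temperature_logging' (made-up sample rows
# in the same "value timestamp" layout as the question).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    for minute in range(53, 59):
        tmp.write('21.75 12-01-2012-15:%02d:35\n' % minute)
    sample_path = tmp.name

# Read the temperature column from the first 3 of the 6 lines.
temp = read_first_rows(sample_path, 3, usecols=0)
os.remove(sample_path)
print(temp)
```

For the original file, the call would presumably be read_first_rows('temperature_logging', 50, usecols=0), with a second call for the timestamp column.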

One might expect this to be slower than letting numpy handle the file I/O, but a quick test suggests otherwise. If you know how long the file is, genfromtxt also provides a skip_footer keyword argument that you can use...

>>> numpy.genfromtxt('test.txt', skip_footer=2)
array([[  1.,   2.,   3.,   4.,   5.],
       [  6.,   7.,   8.,   9.,  10.],
       [ 11.,  12.,  13.,  14.,  15.]])

...but a few informal tests on a 1000-line file suggest that using islice is faster even if you skip only a few lines:

>>> def get(nlines, islice=itertools.islice):
...     with open('test.txt') as t_in:
...         numpy.genfromtxt(islice(t_in, nlines))
...         
>>> %timeit get(3)
1000 loops, best of 3: 338 us per loop
>>> %timeit numpy.genfromtxt('test.txt', skip_footer=997)
100 loops, best of 3: 4.92 ms per loop
>>> %timeit get(300)
100 loops, best of 3: 5.04 ms per loop
>>> %timeit numpy.genfromtxt('test.txt', skip_footer=700)
100 loops, best of 3: 8.48 ms per loop
>>> %timeit get(999)
100 loops, best of 3: 16.2 ms per loop
>>> %timeit numpy.genfromtxt('test.txt', skip_footer=1)
100 loops, best of 3: 16.7 ms per loop
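As a side note that is not part of the original answer: NumPy 1.10 and later also give genfromtxt a max_rows keyword, which limits the number of rows read without needing islice or the file length. A minimal sketch, assuming such a NumPy version and using an in-memory five-line stand-in for test.txt:

```python
import io

import numpy

# In-memory stand-in for the five-line test.txt used above.
five_lines = (
    "1 2 3 4 5\n"
    "6 7 8 9 10\n"
    "11 12 13 14 15\n"
    "16 17 18 19 20\n"
    "21 22 23 24 25\n"
)

# max_rows stops after three rows; no knowledge of the file length needed.
first_three = numpy.genfromtxt(io.StringIO(five_lines), max_rows=3)

# skip_footer needs the total length (5 - 3 = 2) to give the same rows.
same_rows = numpy.genfromtxt(io.StringIO(five_lines), skip_footer=2)

print(numpy.array_equal(first_three, same_rows))
```

The timings above predate this keyword, so they say nothing about how max_rows compares with islice; that would need its own measurement.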
