Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - pandas.read_csv: how to skip comment lines

I think I misunderstand the intention of read_csv. If I have a file 'j' like

# notes
a,b,c
# more notes
1,2,3

How can I read this file with pandas.read_csv, skipping any lines commented out with '#'? The help says that commenting out whole lines is not supported, but it indicates that an empty line should be returned. Instead, I get an error:

df = pandas.read_csv('j', comment='#')

CParserError: Error tokenizing data. C error: Expected 1 fields in line 2, saw 3

I'm currently on

In [15]: pandas.__version__
Out[15]: '0.12.0rc1'

On version '0.12.0-199-g4c8ad82':

In [43]: df = pandas.read_csv('j', comment='#', header=None)

CParserError: Error tokenizing data. C error: Expected 1 fields in line 2, saw 3


1 Reply


I believe that in recent releases of pandas (as of version 0.16.0), you can pass the comment='#' parameter to pd.read_csv, and this should skip commented-out lines.

These GitHub issues show that you can do this:

See the documentation on read_csv: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
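
As a minimal sketch of the above, assuming a pandas release where comment='#' skips fully commented lines: the file 'j' from the question is reproduced here as an in-memory string so the example is self-contained. The helper read_csv_prefiltered is a hypothetical name, not part of pandas; it is only a possible fallback for older versions that raise the tokenizing error, and it drops comment lines before the parser sees them.

```python
import io

import pandas as pd

# Same contents as the file 'j' in the question, held in memory for the demo.
data = "# notes\na,b,c\n# more notes\n1,2,3\n"

# comment='#' tells the parser to ignore everything from '#' to the end of
# the line; lines that consist only of a comment are skipped entirely.
df = pd.read_csv(io.StringIO(data), comment="#")
print(df)


def read_csv_prefiltered(lines, prefix="#", **kwargs):
    """Hypothetical fallback: drop lines starting with `prefix`, then parse.

    `lines` is any iterable of text lines (for example, an open file).
    Extra keyword arguments are passed through to pd.read_csv.
    """
    kept = [line for line in lines if not line.lstrip().startswith(prefix)]
    return pd.read_csv(io.StringIO("".join(kept)), **kwargs)


# With a real file this would be: with open('j') as f: read_csv_prefiltered(f)
df_old = read_csv_prefiltered(io.StringIO(data))
print(df_old)
```

Both calls should give a single row with columns a, b, c. The pre-filtering version only removes whole-line comments; trailing comments after data on the same line would still need comment='#'.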

