Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
141 views
in Technique[技术] by (71.8m points)

python - reading multiple files contained in a zip file with pandas

I have multiple zip files containing different types of txt files. Like below:

zip1 
  - file1.txt
  - file2.txt
  - file3.txt

How can I use pandas to read in each of those files without extracting them?

I know if they were 1 file per zip I could use the compression method with read_csv like below:

df = pd.read_csv(textfile.zip, compression='zip') 

Any help on how to do this would be great.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can pass ZipFile.open() to pandas.read_csv() to construct a pandas.DataFrame from a csv-file packed into a multi-file zip.

Code:

pd.read_csv(zip_file.open('file3.txt'))

Example to read all .csv into a dict:

from zipfile import ZipFile

zip_file = ZipFile('textfile.zip')
dfs = {text_file.filename: pd.read_csv(zip_file.open(text_file.filename))
       for text_file in zip_file.infolist()
       if text_file.filename.endswith('.csv')}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...