Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.1k views
in Technique[技术] by (71.8m points)

performance - Python copy larger file too slow

I am trying to copy a large file (> 1 GB) from hard disk to usb drive using shutil.copy. A simple script depicting what I am trying to do is:-

import shutil
src_file = "sourceolargefile"
dest = "destinationdirectory"
shutil.copy(src_file, dest)

It takes only 2-3 min on linux. But the same file copy on same file takes more that 10-15 min under Windows. Can somebody explain why and give some solution, preferably using python code?

Update 1

Saved the file as test.pySource file size is 1 GB. Destinantion directory is in USB drive. Calculated file copy time with ptime. Result is here:-

ptime.exe test.py

ptime 1.0 for Win32, Freeware - http://www.
Copyright(C) 2002, Jem Berkes <jberkes@pc-t

===  test.py ===

Execution time: 542.479 s

542.479 s == 9 min. I don't think shutil.copy should take 9 min for copying 1 GB file.

Update 2

Health of the USB is good as same script works well under Linux. Calculated time with same file under windows native xcopy.Here is the result.

ptime 1.0 for Win32, Freeware - http://www.pc-tools.net/
Copyright(C) 2002, Jem Berkes <jberkes@pc-tools.net>

===  xcopy F:est.iso L:usbest.iso
1 File(s) copied

Execution time: 128.144 s

128.144 s == 2.13 min. I have 1.7 GB free space even after copying test file.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Just to add some interesting information: WIndows does not like the tiny buffer used on the internals of the shutil implementation.

I've quick tried the following:

  • Copied the original shutil.py file to the example script folder and renamed it to myshutil.py
  • Changed the 1st line to import myshutil
  • Edited the myshutil.py file and changed the copyfileobj from

def copyfileobj(fsrc, fdst, length=16*1024):

to

def copyfileobj(fsrc, fdst, length=16*1024*1024):

Using a 16 MB buffer instead of 16 KB caused a huge performance improvement.

Maybe Python needs some tuning targeting Windows internal filesystem characteristics?

Edit:

Came to a better solution here. At the start of your file, add the following:

import shutil

def _copyfileobj_patched(fsrc, fdst, length=16*1024*1024):
    """Patches shutil method to hugely improve copy speed"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)
shutil.copyfileobj = _copyfileobj_patched

This is a simple patch to the current implementation and worked flawlessly here.

Python 3.8+: Python 3.8 includes a major overhaul of shutil, including increasing the windows buffer from 16KB to 1MB (still less than the 16MB suggested in this ticket). See ticket , diff


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...