Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

http - Python: sending and receiving large files over POST using cherrypy

I have a CherryPy web server that needs to be able to receive large files over HTTP POST. I have something working at the moment, but it fails once the files being sent get too big (around 200 MB). I'm using curl to send test POST requests, and when I try to send a file that's too big, curl prints "The entity sent with the request exceeds the maximum allowed bytes." Searching around, this seems to be an error from CherryPy.

So I'm guessing that the file needs to be sent in chunks? I tried something with mmap, but I couldn't get it to work. Does the method that handles the file upload need to be able to accept the data in chunks too?

1 Reply

I took DirectToDiskFileUpload as a starting point. The changes it makes to handle big uploads are:

  1. Setting server.max_request_body_size to 0, which removes the limit (the default is 100 MB),
  2. Setting server.socket_timeout to 60 (the default is 10 s),
  3. Setting response.timeout to 3600 (the default is 300 s),
  4. Avoiding a double copy by using tempfile.NamedTemporaryFile.

It also goes to some trouble to avoid holding the upload in memory: it disables the standard CherryPy body processing and uses cgi.FieldStorage manually instead. That is unnecessary, because cherrypy._cpreqbody.Part.maxrambytes already takes care of it. Its documentation reads:

"The threshold of bytes after which point the Part will store its data in a file instead of a string. Defaults to 1000, just like the cgi module in Python's standard library."
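
To make the maxrambytes behaviour concrete, here is a minimal sketch of the spill-to-disk idea in plain Python. It is not CherryPy's code: the SpillingBuffer class and its attributes are hypothetical names made up for illustration, and only the maxrambytes threshold and the make_file hook mirror names from the text above.

```python
import io
import tempfile


class SpillingBuffer(object):
    """Hypothetical stand-in for the spill-to-disk idea; not CherryPy code."""

    def __init__(self, maxrambytes=1000):
        self.maxrambytes = maxrambytes
        self.file = io.BytesIO()  # data starts out in memory
        self.on_disk = False
        self.size = 0

    def make_file(self):
        # Plays the same role as Part.make_file: supplies the on-disk file.
        return tempfile.TemporaryFile()

    def write(self, chunk):
        self.size += len(chunk)
        if not self.on_disk and self.size > self.maxrambytes:
            # Threshold crossed: move what was buffered so far to disk.
            disk_file = self.make_file()
            disk_file.write(self.file.getvalue())
            self.file = disk_file
            self.on_disk = True
        self.file.write(chunk)


# A small body stays in memory.
small = SpillingBuffer()
small.write(b"x" * 500)

# A larger body, fed in pieces, ends up in a temporary file.
large = SpillingBuffer()
for _ in range(10):
    large.write(b"x" * 500)
```

With a threshold this low, only a small fraction of a large upload is ever held in RAM, which fits the memory figures reported below.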

I experimented with the following code (Python 2.7.4, CherryPy 3.6) and a 1.4 GB file. Memory usage (in gnome-system-monitor) never exceeded 10 MiB. As for the bytes actually written to disk, write_bytes in the output of cat /proc/PID/io was almost exactly the size of the file. With the standard cherrypy._cpreqbody.Part and shutil.copyfileobj that figure is, unsurprisingly, doubled.

#!/usr/bin/env python
# -*- coding: utf-8 -*-


import os
import tempfile

import cherrypy


config = {
  'global' : {
    'server.socket_host' : '127.0.0.1',
    'server.socket_port' : 8080,
    'server.thread_pool' : 8,
    # remove any limit on the request body size; cherrypy's default is 100MB
    'server.max_request_body_size' : 0,
    # increase server socket timeout to 60s; cherrypy's default is 10s
    'server.socket_timeout' : 60
  }
}


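class NamedPart(cherrypy._cpreqbody.Part):
  '''Part that spools to a named temp file, so it can be hard-linked later.'''

  def make_file(self):
    return tempfile.NamedTemporaryFile()

# make CherryPy use NamedPart for every multipart body part
cherrypy._cpreqbody.Entity.part_class = NamedPart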


class App:

  @cherrypy.expose
  def index(self):
    return '''<!DOCTYPE html>
      <html>
      <body>
        <form action='upload' method='post' enctype='multipart/form-data'>
          File: <input type='file' name='videoFile'/> <br/>
          <input type='submit' value='Upload'/>
        </form>
      </body>
      </html>
    '''

  @cherrypy.config(**{'response.timeout': 3600}) # default is 300s
  @cherrypy.expose()
  def upload(self, videoFile):
    assert isinstance(videoFile, cherrypy._cpreqbody.Part)

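    # keep only the base name so a client-supplied path cannot escape the target directory
    destination = os.path.join('/home/user/', os.path.basename(videoFile.filename))

    # Hard-link the temp file to its destination instead of copying it.
    # tempfile.NamedTemporaryFile deletes the original name when the file is
    # closed; the data stays reachable through the new link. os.link needs
    # both paths to be on the same filesystem.
    os.link(videoFile.file.name, destination)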

    # Double copy with standard ``cherrypy._cpreqbody.Part``
    #import shutil
    #with open(destination, 'wb') as f:
    #  shutil.copyfileobj(videoFile.file, f)

    return 'Okay'


if __name__ == '__main__':
  cherrypy.quickstart(App(), '/', config)
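
The os.link step in upload is what avoids the second copy, so here is a small standalone sketch of it, assuming a POSIX system. The scratch directory and the upload.bin name are hypothetical. Both paths are placed in the same directory on purpose, because os.link fails across filesystems; if the default temporary directory and the destination live on different filesystems, passing dir= to NamedTemporaryFile is one possible way around that.

```python
import os
import shutil
import tempfile

# Scratch directory so the temp file and the destination share a filesystem.
workdir = tempfile.mkdtemp()
destination = os.path.join(workdir, 'upload.bin')  # hypothetical target name

tmp = tempfile.NamedTemporaryFile(dir=workdir)
tmp.write(b'payload')
tmp.flush()

# Add a second name for the same inode; no file data is copied.
os.link(tmp.name, destination)
same_inode = os.stat(tmp.name).st_ino == os.stat(destination).st_ino

tmp_name = tmp.name
tmp.close()  # removes the temporary name only

temp_name_gone = not os.path.exists(tmp_name)
with open(destination, 'rb') as f:
    surviving_data = f.read()

shutil.rmtree(workdir)  # clean up the scratch directory
```

After tmp.close() the temporary name is gone, but the data is still reachable through destination, which is why the handler can return without copying anything.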
