Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Threadsafe and fault-tolerant file writes

I have a long-running process which writes a lot of data to a file. The result should be everything or nothing, so I'm writing to a temporary file and renaming it to the real name at the end. Currently, my code looks like this:

import os
import time

filename = 'whatever'
tmpname = 'whatever' + str(time.time())

# stuff and more_stuff stand for bytes objects produced elsewhere
with open(tmpname, 'wb') as fp:
    fp.write(stuff)
    fp.write(more_stuff)

if os.path.exists(filename):
    os.unlink(filename)
os.rename(tmpname, filename)

I'm not happy with that for several reasons:

  • it doesn't clean up properly if an exception occurs
  • it ignores concurrency issues
  • it isn't reusable (I need this in different places in my program)

Any suggestions how to improve my code? Is there a library that can help me out?

1 Reply

You can use Python's tempfile module to get a temporary file name. It creates temporary files in a thread-safe manner, whereas a name built from time.time() can collide when several threads generate one at the same moment.
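To illustrate the point about name collisions, here is a minimal sketch that creates temporary files from several threads with tempfile.mkstemp() and checks that every path is distinct. The helper name make_temp_names and the thread count are made up for this example.

```python
import os
import tempfile
import threading


def make_temp_names(count, directory):
    """Create `count` temp files from separate threads; return their paths."""
    paths = []
    lock = threading.Lock()

    def worker():
        # mkstemp creates the file with O_CREAT | O_EXCL, so two callers
        # cannot end up with the same path.
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        with lock:
            paths.append(path)

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return paths


with tempfile.TemporaryDirectory() as demo_dir:
    names = make_temp_names(8, demo_dir)
    all_unique = len(set(names)) == len(names)
```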

As suggested in a comment on your question, this can be combined with a context manager. You can get some ideas on how to implement what you want by looking at the source of Python's tempfile.py.
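As one possible shape for the context-manager idea, here is a minimal function-based sketch built on contextlib. The name atomic_write is hypothetical, and it assumes Python 3.3+ for os.replace(); treat it as a starting point rather than a finished implementation.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def atomic_write(final_path, mode='wb'):
    """Yield a file object; move it to final_path only if the block succeeds."""
    # Same directory as the destination, so the move stays on one file system.
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, mode) as fp:
            yield fp
        # The file is closed at this point; move it into place.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Something went wrong: drop the partial temporary file.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise


# Usage, inside a throwaway directory so the example cleans up after itself.
with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, 'whatever')
    with atomic_write(target) as fp:
        fp.write(b'stuff')
    with open(target, 'rb') as fp:
        saved = fp.read()
```

Note that mkstemp() creates the file readable and writable only by the current user, so the final file keeps those permissions unless you change them before the move.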

The following code snippet may do what you want. It uses some of the internals of the objects returned from tempfile.

  • Creation of temporary files is thread safe.
  • Renaming the file on successful completion is atomic, at least on Linux. There is no separate os.path.exists() check before the rename, which would introduce a race condition. For the rename to be atomic on Linux, the source and destination must be on the same file system, which is why this code places the temporary file in the same directory as the destination file.
  • The RenamedTemporaryFile class should behave like a NamedTemporaryFile for most purposes, except that when it is closed via the context manager, the file is renamed to the final path.
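If you let callers pass their own dir, the same-file-system requirement can be checked up front. The sketch below compares the device IDs reported by os.stat(); the helper name same_filesystem is made up for this example, and the check is an assumption that holds on typical POSIX systems rather than a guarantee for every platform or mount setup.

```python
import os
import tempfile


def same_filesystem(path_a, path_b):
    """Return True if both existing paths report the same device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


with tempfile.TemporaryDirectory() as dest_dir:
    # A temp file created next to the destination shares its file system.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    os.close(fd)
    result = same_filesystem(tmp_path, dest_dir)
    os.unlink(tmp_path)
```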

Sample:

import os
import tempfile


class RenamedTemporaryFile(object):
    """
    A temporary file object which will be renamed to the specified
    path on exit.
    """
    def __init__(self, final_path, **kwargs):
        tmpfile_dir = kwargs.pop('dir', None)

        # Put temporary file in the same directory as the location for the
        # final file so that an atomic move into place can occur.
        if tmpfile_dir is None:
            tmpfile_dir = os.path.dirname(os.path.abspath(final_path))

        # Create with delete=False: on Python 3, setting tmpfile.delete
        # after creation does not stop close() from removing the file,
        # so cleanup is done by hand in __exit__ instead.
        kwargs['delete'] = False
        self.tmpfile = tempfile.NamedTemporaryFile(dir=tmpfile_dir, **kwargs)
        self.final_path = final_path

    def __getattr__(self, attr):
        """
        Delegate attribute access to the underlying temporary file object.
        """
        return getattr(self.tmpfile, attr)

    def __enter__(self):
        self.tmpfile.__enter__()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Close the file first so that buffered data is flushed to disk.
        result = self.tmpfile.__exit__(exc_type, exc_val, exc_tb)
        if exc_type is None:
            # os.replace (Python 3.3+) overwrites an existing destination
            # on Windows too, where os.rename would raise.
            os.replace(self.tmpfile.name, self.final_path)
        else:
            # Writing failed: discard the partial temporary file.
            os.unlink(self.tmpfile.name)

        return result

You can then use it like this:

with RenamedTemporaryFile('whatever') as f:
    f.write(b'stuff')  # the default mode is binary ('w+b'); pass mode='w' for text

During writing, the contents go to a temporary file; on exit, the file is renamed. This code will probably need some tweaks, but the general idea should help you get started.

