A fairly straightforward question.
Two answers.
One saying, “Yes.”
The other saying, “No!”
Both with significant upvotes.
Who to believe? Let me attempt to clarify.
Both answers have some truth to them, and it depends on what you mean by a
file being closed.
First, consider what is meant by closing a file from the operating system’s
perspective.
When a process exits, the operating system cleans up all the resources
that only that process had open. Otherwise, badly-behaved programs that
crashed without freeing their resources could eventually consume all of the
system's resources.
If Python is the only process that has that file open, then the file will
be closed. Similarly, the operating system will reclaim memory allocated by
the process, close any networking ports that were still open, and clean up
most other things. There are a few exceptional functions, like shmget, that
create objects that persist beyond the process, but for the most part the
operating system takes care of everything.
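The split between what the operating system cleans up and what only Python can do is easy to see with os._exit(), which terminates the process with no interpreter cleanup at all. Here is a small sketch (the filename lost.txt is arbitrary):

```python
import os
import subprocess
import sys

# A child program that writes into Python's user-space buffer and then
# terminates abruptly with os._exit(), skipping all interpreter cleanup.
child = """
import os
f = open('lost.txt', 'w')
f.write('hello')   # sits in Python's buffer, not yet on disk
os._exit(0)        # OS closes the descriptor; nothing flushes the buffer
"""

subprocess.run([sys.executable, "-c", child], check=True)

# The OS created the file at open() time, but the buffered bytes were lost:
print(os.path.getsize("lost.txt"))  # 0
```

The operating system dutifully closes the file descriptor when the child dies, but only Python knows about the buffered data, so it is lost.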
Now, what about closing files from Python's perspective? When a program
written in any language exits, the operating system cleans up most
resources, as described above. But what cleanup does Python itself perform
before the process exits?
The standard CPython implementation of Python—as opposed to other Python
implementations like Jython—uses reference counting to do most of its
garbage collection. Every object carries a reference count field. Every
time something in Python takes a reference to an object, the reference
count of the referred-to object is incremented. When a reference is lost,
e.g., because a variable goes out of scope, the reference count is
decremented. When the reference count hits zero, no Python code can reach
the object anymore, so the object gets deallocated. And when it gets
deallocated, Python calls its __del__() destructor.
Python’s __del__() method for files flushes the buffers and closes the
file from the operating system’s point of view. Because of reference
counting, in CPython, if you open a file in a function and don’t return the
file object, then the reference count on the file drops to zero when
the function exits, and the file is automatically flushed and closed. When
the program ends, CPython dereferences all objects, and all objects have
their destructors called, even if the program ends due to an unhandled
exception. (This does technically fail in the pathological case where you
have a cycle of objects with destructors, at least in Python versions
before 3.4.)
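On CPython, this automatic flush-and-close at refcount zero can be demonstrated by opening a file in a function and never closing it (the filename demo.txt is arbitrary; other Python implementations may not flush this promptly):

```python
def write_greeting():
    f = open('demo.txt', 'w')
    f.write('hello')
    # No close(): when the function returns, f's reference count drops to
    # zero, and in CPython __del__() flushes the buffer and closes the file.

write_greeting()

# On CPython, the data has already reached the operating system:
with open('demo.txt') as f:
    print(f.read())  # hello
```

On an implementation without reference counting, the read-back could just as easily find an empty file, which is exactly why relying on this behavior is a bad idea.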
But that’s just the CPython implementation. Python the language is defined
in the Python language reference, which is what all Python
implementations are required to follow in order to call themselves
Python-compatible.
The language reference explains resource management in its data model
section:
Some objects contain references to “external” resources such as open
files or windows. It is understood that these resources are freed when
the object is garbage-collected, but since garbage collection is not
guaranteed to happen, such objects also provide an explicit way to
release the external resource, usually a close() method. Programs are
strongly recommended to explicitly close such objects. The
‘try...finally’ statement and the ‘with’ statement provide convenient
ways to do this.
That is, CPython will usually immediately close the object, but that may
change in a future release, and other Python implementations aren’t even
required to close the object at all.
So, for portability and because explicit is better than implicit,
it’s highly recommended to call close() on everything that can be
closed, and to do so in a finally block if there is code between the
object’s creation and close() that might raise an exception. Or to use
the with statement, which is syntactic sugar that accomplishes the same
thing. If you do that, then buffers on files will be flushed even if an
exception is raised.
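Concretely, the two recommended patterns look like this (data.txt is a placeholder name):

```python
# Pattern 1: explicit close() in a finally block, so the file is closed
# even if the code in between raises an exception.
f = open('data.txt', 'w')
try:
    f.write('important')
finally:
    f.close()

# Pattern 2: the with statement, which calls close() automatically when
# the block is exited, whether normally or via an exception.
with open('data.txt', 'w') as f:
    f.write('important')

print(f.closed)  # True
```

The with form is shorter and harder to get wrong, which is why it is the idiom you will see in most modern Python code.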
However, even with the with statement, the same underlying mechanisms are
at work. If the program crashes in a way that doesn’t give Python’s
__del__() method a chance to run, you can still end up with a corrupt
file on disk:
#!/usr/bin/env python3.3
import ctypes

# Cast the memory address 0x0001 to the C function int f()
prototype = ctypes.CFUNCTYPE(ctypes.c_int)
f = prototype(1)

with open('foo.txt', 'w') as x:
    x.write('hi')
    # Segfault
    print(f())
This program produces a zero-length file. It’s an abnormal case, but it
shows that even with the with statement, resources won’t always be cleaned
up the way you expect. Python tells the operating system to open a file
for writing, which creates it on disk; Python writes hi into its
user-space I/O buffers; and then it crashes before the with statement
ends. Because of the apparent memory corruption, it’s not safe to trust
the remains of the buffer, and they never get flushed to disk. So the
program fails to clean up properly even though there’s a with statement.
Whoops. Despite this, close() and with almost always work, and your
program is always better off having them than not having them.
So the answer is neither yes nor no. The with statement and close() are
technically not necessary for most ordinary CPython programs. But not
using them results in non-portable code that will look wrong. And while
they are extremely helpful, it is still possible for them to fail in
pathological cases.