Frequently, I use subprocess in Python to spawn other tools to do some tasks I need. Recently, we moved a large piece of software from Linux-based to Windows-based servers. And the nightmare began.
There are many differences between Linux and Windows in how file descriptors (or file handles) are handled by the Python interpreter. Let's see.
Subprocess
It is fairly easy to use subprocess in Python. You just use the Popen constructor. Here is an example calling the ls command and retrieving the output.
import subprocess

p = subprocess.Popen(['ls'], stdout=subprocess.PIPE)
# stderr is None here, because only stdout is redirected to a pipe
stdout, stderr = p.communicate()
print stdout
File Descriptors (or File Handles)
File descriptors are the integers used to identify the open files (or pipes, FIFOs, TTYs, anything) that a running program currently has open. If you do not know what a file descriptor is, have a look at the file descriptor Wikipedia article.
When any program is "forked", the new process (usually) inherits the file descriptors of its parent.
Python and File Descriptors
Python works more or less in a similar way. File descriptors in Python are usually not handled directly. Instead, they are managed through higher-level objects, like Python file objects, which automatically handle creating and destroying file descriptors using underlying C libraries.
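As a small illustration of how a file object wraps a descriptor, the sketch below opens a temporary file, looks at its descriptor, and then checks that the descriptor is gone once the file object is closed. The helper name describe_file_object is made up for this example, and it assumes nothing else reuses the descriptor number in the meantime.

```python
import os
import stat
import tempfile


def describe_file_object():
    """Return (fd, is_regular_file, still_open_after_close)."""
    with tempfile.TemporaryFile() as f:
        # The file object owns an integer descriptor underneath.
        fd = f.fileno()
        is_regular = stat.S_ISREG(os.fstat(fd).st_mode)
    # Leaving the with-block closed the file object and its descriptor.
    try:
        os.fstat(fd)
        still_open = True
    except OSError:
        # fstat fails once the descriptor has been closed.
        still_open = False
    return fd, is_regular, still_open


if __name__ == '__main__':
    print(describe_file_object())
```

The point is only that closing (or destroying) the file object is what releases the descriptor; you rarely need to touch the integer yourself.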
How to Get Opened File Descriptors in Python
To troubleshoot our issues, we needed a way to list the valid file descriptors in a Python script. So, we crafted the following script. Note that we only check file descriptors from 0 to 99, since we do not open that many files concurrently.
fd_table_status.py:
import os
import stat

# (label, predicate) pairs used to classify a descriptor by its st_mode
_fd_types = (
    ('REG', stat.S_ISREG),
    ('FIFO', stat.S_ISFIFO),
    ('DIR', stat.S_ISDIR),
    ('CHR', stat.S_ISCHR),
    ('BLK', stat.S_ISBLK),
    ('LNK', stat.S_ISLNK),
    ('SOCK', stat.S_ISSOCK)
)

def fd_table_status():
    # Return a list of (fd, type) pairs for the open descriptors 0..99
    result = []
    for fd in range(100):
        try:
            s = os.fstat(fd)
        except OSError:
            # fstat fails when the descriptor is not open
            continue
        for fd_type, func in _fd_types:
            if func(s.st_mode):
                break
        else:
            # no predicate matched: fall back to the raw mode value
            fd_type = str(s.st_mode)
        result.append((fd, fd_type))
    return result

def fd_table_status_logify(fd_table_result):
    return ('Open file handles: ' +
            ', '.join(['{0}: {1}'.format(*i) for i in fd_table_result]))

def fd_table_status_str():
    return fd_table_status_logify(fd_table_status())

if __name__ == '__main__':
    print fd_table_status_str()
When simply run, it will show all open file descriptors and their respective type:
$> python fd_table_status.py
Open file handles: 0: CHR, 1: CHR, 2: CHR
$>
Calling fd_table_status_str() from another script returns the same string.
Inherited File Descriptors - Linux vs Windows
To see the behavior, we will run the following script in Windows and in Linux. Pay particular attention to steps 4 and 7, where the two platforms behave differently.
test_fd_handling.py:
import fd_table_status
import subprocess
import platform

fds = fd_table_status.fd_table_status_str

if platform.system() == 'Windows':
    python_exe = r'C:\Python27\python.exe'
else:
    python_exe = 'python'

print '1) Initial file descriptors:\n' + fds()
f = open('fd_table_status.py', 'r')
print '2) After file open, before Popen:\n' + fds()
# use the interpreter selected above, so the same script runs on both platforms
p = subprocess.Popen([python_exe, 'fd_table_status.py'],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)
print '3) After Popen, before reading piped output:\n' + fds()
result = p.communicate()
print '4) After Popen.communicate():\n' + fds()
del p
print '5) After deleting reference to Popen instance:\n' + fds()
del f
print '6) After deleting reference to file instance:\n' + fds()
print '7) child process had the following file descriptors:'
# result[0] is the child's stdout; drop the trailing newline
print result[0][:-1]
Linux output
1) Initial file descriptors:
Open file handles: 0: CHR, 1: CHR, 2: CHR
2) After file open, before Popen:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
3) After Popen, before reading piped output:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG, 5: FIFO, 6: FIFO, 8: FIFO
4) After Popen.communicate():
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
5) After deleting reference to Popen instance:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
6) After deleting reference to file instance:
Open file handles: 0: CHR, 1: CHR, 2: CHR
7) child process had the following file descriptors:
Open file handles: 0: FIFO, 1: FIFO, 2: FIFO, 3: REG
Windows output
1) Initial file descriptors:
Open file handles: 0: CHR, 1: CHR, 2: CHR
2) After file open, before Popen:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
3) After Popen, before reading piped output:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG, 4: FIFO, 5: FIFO, 6: FIFO
4) After Popen.communicate():
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG, 5: FIFO, 6: FIFO
5) After deleting reference to Popen instance:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
6) After deleting reference to file instance:
Open file handles: 0: CHR, 1: CHR, 2: CHR
7) child process had the following file descriptors:
Open file handles: 0: FIFO, 1: FIFO, 2: FIFO
Step-by-step Output Review
I will trace the output of both runs step-by-step and explain (or try to explain) it.
- The 3 initial file descriptors 0, 1, 2 (stdin, stdout, stderr) are connected to the TTY (the controlling terminal) in both cases. Type CHR indicates a special character device, which is our attached terminal (or pseudo-terminal).
- After a file is opened (for reading, but it does not matter), a new file descriptor (3) is added in both cases. Type REG indicates a regular file.
- After the Popen call, three additional file descriptors are added, one for each anonymous pipe created during the spawn of the new program. The numbers of the file descriptors do not matter, so do not pay attention to them. It is up to the OS to assign file descriptor values, so the difference is expected. All goes well up to here. Type FIFO indicates a pipe (named pipe or anonymous pipe).
- Popen.communicate() does the following: a) sends any input to the child process (we do not specify any), b) closes the input pipe, and c) reads all available output (both stdout and stderr) until the child terminates. In this step, Windows and Linux behave differently. In Linux, all three file descriptors in the parent process are closed, while in Windows the two output file descriptors (used for stdout and stderr by the child process) are retained. Even though the child process has terminated, in Windows the two pipes remain open. This has a significant implication: if the Popen instance is not destroyed (e.g. because a reference is kept somewhere, which prevents the garbage collector from destroying the object), the pipes remain open. If this is repeated in further calls, the "zombie" file descriptors from the pipes will pile up and eventually cause the Python exception "IOError: [Errno 24] Too many open files". Remember, this happens only in Windows!
- The reference to the Popen instance is removed, the garbage collector destroys the object, and the remaining pipes are also destroyed in Windows. Thus, the file descriptors are now the same as in the Linux run. Expected behavior.
- The same happens when deleting the reference to the file object. Expected behavior.
- Now, we look at the file descriptors of the child process (we retrieved the output of the child program when we called communicate() previously). In the Linux run, the child process has the previously opened file available as a valid file descriptor, which is the expected behavior. In the Windows run, however, the child process does not have this file descriptor available! This is probably caused by the Windows version of the Python interpreter. I have run other command-line tools and examined their behavior, and it seems that the Python interpreter in Windows closes all file descriptors (besides 0, 1, 2) upon start-up. This requires further investigation to confirm the observed behavior. Keep in mind that a descriptor missing from this listing does not necessarily mean that the underlying Windows handle was not inherited: the listing only shows the descriptor table of the child's C runtime, while handle inheritance is decided at the OS level.
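Regarding step 4, one way to avoid depending on the garbage collector is to close the parent-side pipe objects explicitly once communicate() has returned. The following is only a sketch of that idea; the helper name run_and_release is made up for this example, and closing an already closed pipe is harmless.

```python
import subprocess
import sys


def run_and_release(args):
    """Run a command, capture its output, and close our pipe ends explicitly."""
    p = subprocess.Popen(args,
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    try:
        out, err = p.communicate()
    finally:
        # Do not wait for the Popen object to be garbage collected:
        # release whatever pipe ends are still open on our side.
        for pipe in (p.stdin, p.stdout, p.stderr):
            if pipe is not None and not pipe.closed:
                pipe.close()
    return p.returncode, out, err


if __name__ == '__main__':
    print(run_and_release([sys.executable, '-c', 'print("hi")']))
```

With this, keeping a reference to the Popen instance around (for its returncode, say) should no longer keep the pipes open.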
Wrapper Python Script to Close All File Descriptors
Two simple scripts follow, which close all file descriptors before running a command. Both are called with the command to run as the 1st command-line argument, and the arguments to the command as the following command-line arguments.
Running the final program as a subprocess.
import sys
import os
import subprocess

if __name__ == '__main__':
    # close descriptors 3..99, so the child has nothing extra to inherit
    os.closerange(3, 100)
    subprocess_args = sys.argv[1:]
    if not subprocess_args:
        print ("USAGE: python {0} ARGS .....\n\n"
               "where ARGS are arguments passed to subprocess.Popen"
               .format(sys.argv[0]))
        sys.exit(1)
    # hand our own stdin, stdout and stderr over to the child
    popen = subprocess.Popen(subprocess_args, stdin=0, stdout=1, stderr=2)
    exit_status = popen.wait()
    sys.exit(exit_status)
Running the final program using execv.
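import sys
import os

if __name__ == '__main__':
    # close descriptors 3..99, so the new program has nothing extra to inherit
    os.closerange(3, 100)
    subprocess_args = sys.argv[1:]
    if not subprocess_args:
        print ("USAGE: python {0} ARGS .....\n\n"
               "where ARGS are arguments passed to os.execvp()"
               .format(sys.argv[0]))
        sys.exit(1)
    # replace the current process with the requested program
    os.execvp(subprocess_args[0], subprocess_args)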
Conclusions
Some important conclusions:
- Make sure that Popen instances are destroyed.
- If you:
a) are in Windows, and
b) are spawning an external tool to do some processing, and
c) have opened files that you want to work on (e.g. move) concurrently,
THEN: use one of the above wrapper scripts. It might save you a lot of trouble.
Further Reading and Information
Some more information can be found in Python PEP-446, which might explain the behavior observed above. Maybe a better solution can be found for the case of inherited file handles in Windows.
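PEP 446 was implemented in Python 3.4: since then, file descriptors created by Python are non-inheritable by default, and os.get_inheritable() and os.set_inheritable() expose the flag. The sketch below assumes Python 3.4 or later (these functions do not exist in the Python 2.7 used in this post); the helper name inheritable_flags is made up for this example.

```python
import os
import tempfile


def inheritable_flags():
    """Return the inheritable flag of a fresh descriptor, before and after setting it."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # Since PEP 446, new descriptors start out non-inheritable.
        default = os.get_inheritable(fd)
        # Opt in explicitly when a child process really needs the descriptor.
        os.set_inheritable(fd, True)
        after_set = os.get_inheritable(fd)
    return default, after_set


if __name__ == '__main__':
    print(inheritable_flags())
```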
An alternative (and probably better) way of achieving the same result (but avoiding the use of the above wrapper scripts) is to make a Win32 system call and mark a file descriptor as non-inheritable. In the following example code, I define an open() replacement that opens the file and marks the resulting file descriptor as non-inheritable.
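import msvcrt
# win32api and win32con come from the pywin32 package
import win32api
import win32con

def my_open(*args, **kwargs):
    f = open(*args, **kwargs)
    # clear the inherit flag on the Windows handle behind the descriptor
    win32api.SetHandleInformation(
        msvcrt.get_osfhandle(f.fileno()),
        win32con.HANDLE_FLAG_INHERIT,
        0)
    return f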
Although this behavior does not cause the same issues in Linux as it does in Windows, you can achieve the same behavior by setting close_fds=True in the subprocess.Popen() constructor. In Linux, it works fine.
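To see the effect of close_fds on Linux, the sketch below starts a child with and without it and lets the child report whether it can still reach the parent's file. It assumes a POSIX system and Python 3.4 or later, where the descriptor has to be marked inheritable first (in Python 2.7, descriptors are inheritable by default and os.set_inheritable() does not exist). The names CHILD_CODE and child_sees_fd are made up for this example.

```python
import os
import subprocess
import sys
import tempfile

# The child exits with 0 only if the given descriptor refers to the same file
# (same device and inode) that the parent opened.
CHILD_CODE = (
    "import os, sys\n"
    "fd, dev, ino = [int(a) for a in sys.argv[1:4]]\n"
    "try:\n"
    "    s = os.fstat(fd)\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
    "sys.exit(0 if (s.st_dev, s.st_ino) == (dev, ino) else 1)\n"
)


def child_sees_fd(close_fds):
    """Return True if a child started with the given close_fds sees our file."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # Assumption: Python 3.4+, where descriptors are non-inheritable by default.
        os.set_inheritable(fd, True)
        s = os.fstat(fd)
        rc = subprocess.call(
            [sys.executable, '-c', CHILD_CODE,
             str(fd), str(s.st_dev), str(s.st_ino)],
            close_fds=close_fds)
    return rc == 0


if __name__ == '__main__':
    print('close_fds=False: ' + str(child_sees_fd(False)))
    print('close_fds=True: ' + str(child_sees_fd(True)))
```

Comparing device and inode, rather than just checking that the descriptor number is open, avoids a false positive if the child happens to open something else under the same number.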