mkstemp opening too many files - python

I'm using subprocess.run in a loop (more than 10 000 times) to call some java command.
Like this:
import subprocess
import tempfile
for i in range(10000):
    ret = subprocess.run(["ls"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    (_, name) = tempfile.mkstemp()
    with open(name, 'w+') as fp:
        fp.write(ret.stdout.decode())
However, after some time, I got the following exception:
Traceback (most recent call last):
File "mwe.py", line 5, in <module>
ret = subprocess.run(["ls"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
File "/usr/lib/python3.5/subprocess.py", line 693, in run
with Popen(*popenargs, **kwargs) as process:
File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.5/subprocess.py", line 1454, in _execute_child
errpipe_read, errpipe_write = os.pipe()
OSError: [Errno 24] Too many open files
Am I missing something to close some file descriptor?
Thanks

mkstemp returns an already open file descriptor fd followed by the filename. You are ignoring the file descriptor (your choice of the name _ suggests you have explicitly chosen to ignore it) and as a result you are neglecting to close it. Instead, you open the file a second time using the filename, creating a file object that contains a second file descriptor for the same file. Regardless of whether you close that second one, the first one remains open.
Here's a fix to the mkstemp approach:
temporaryFiles = []
for i in range(1000):
    ...
    fd, name = tempfile.mkstemp()
    os.write(fd, ... )
    os.close(fd)
    temporaryFiles.append(name) # remember the filename for future processing/deletion
Building on Wyrmwood's suggestion in the comments, an even better approach would be:
temporaryFiles = []
for i in range(1000):
    ...
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        # tmp is a context manager that will automatically close the file when you exit this clause
        tmp.file.write( ... )
    temporaryFiles.append(tmp.name) # remember the filename for future processing/deletion
Note that both mkstemp and the NamedTemporaryFile constructor have arguments that allow you to be more specific about the file's location (dir) and naming (prefix, suffix). If you want to keep the files, you should specify dir so that you keep them out of the default temporary directory, since the default location may get cleaned up by the OS.
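If you would rather keep mkstemp, another option is to wrap the descriptor it returns with os.fdopen, so that a with block closes it for you. This is only a minimal sketch: write_temp is a hypothetical helper name, and the text written is placeholder data.

```python
import os
import tempfile


def write_temp(text, directory=None):
    """Write text to a new temporary file and return its path.

    write_temp is a hypothetical helper, not part of the tempfile module.
    """
    fd, name = tempfile.mkstemp(dir=directory, text=True)
    # os.fdopen wraps the descriptor mkstemp already opened, so no second
    # descriptor is created; leaving the with block closes it.
    with os.fdopen(fd, "w") as fp:
        fp.write(text)
    return name


if __name__ == "__main__":
    names = [write_temp("run %d\n" % i) for i in range(5)]
    print(len(names), "files written")
    for name in names:
        os.remove(name)  # clean up the demo files
```

Because every descriptor is closed before the next iteration starts, the loop can run far more times than the per-process limit on open files.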

Related

FileNotFoundError: [WinError 2] The system cannot find the file specified while trying to use pysndfx

I'm trying to process a wav file with Python and pysndfx, but I'm getting this weird error. I've tried many different path formats and many different paths. Even though os.path.isfile() returns True, the error still comes up. Any help would be greatly appreciated.
from pysndfx import AudioEffectsChain
import os
in_file = os.getcwd() + "\\" + "a.mp3"
in_file = in_file.replace("\\", "//")  # tried many things here, also tried it without any replacing
if os.path.isfile(in_file):
    print("fileyes") #This returns true
else:
    print("not a file")
print(in_file)
fs = 44100
fx = (AudioEffectsChain().
      reverb().
      delay().
      phaser()
      )
fx(in_file,"apro.mp3")
Here's the error
fileyes
E://PyEarTraning//Test//a.mp3
Traceback (most recent call last):
File "e:/PyEarTraning/Test/test.py", line 28, in <module>
fx(in_file,"E:\\PyEarTraning\\Test\\apro.mp3")
File "C:\Program Files (x86)\Python38-32\lib\site-packages\pysndfx\dsp.py", line 368, in __call__
infile = FilePathInput(src)
File "C:\Program Files (x86)\Python38-32\lib\site-packages\pysndfx\sndfiles.py", line 29, in __init__
stdout, stderr = Popen(shlex.split(info_cmd, posix=False),
File "C:\Program Files (x86)\Python38-32\lib\subprocess.py", line 854, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Program Files (x86)\Python38-32\lib\subprocess.py", line 1307, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
I can't comment (yet), so I'll ask here: is your file in the same directory (folder) as the Python program? If not, it won't work even if the file does exist somewhere else. Try copying or moving both your file and your program code into the same folder.
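One more thing worth checking, offered as a guess rather than a confirmed diagnosis: the traceback ends in CreateProcess, which raises WinError 2 when the program being launched cannot be found, not when a data file is missing. As far as I know, pysndfx is a wrapper around the external SoX command-line tool, so the sketch below checks whether an executable is reachable on PATH. The name "sox" is an assumption about which program is being launched, and find_program is a hypothetical helper.

```python
import shutil


def find_program(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    # "sox" is an assumption about which program pysndfx tries to launch.
    path = find_program("sox")
    if path is None:
        print("sox was not found on PATH")
    else:
        print("sox found at", path)
```

If nothing is found, installing the tool or adding its folder to PATH would be the next thing to try.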

Closing a PDF file, generated when text is updated - Tkinter

So I have a function which generates a pdf file from a latex file in Tkinter - but only when a button is clicked.
What I am trying to do now is write a function that updates the PDF file every few seconds, so that the user can see what their text looks like so far. I run the function that generates the PDF every few seconds, along with another function that is supposed to close the file after a few seconds. However, closing the PDF files does not work, so I have to close them manually; otherwise the updated version of the PDF does not appear on the screen.
Here is the code I have used:
def generate_pdf(self):
    global mainName
    global pdfDirectory
    name=self.getName(self.fl)
    f = open('%s.tex'%name,'w')
    tex = self.txt.get(1.0,"end-1c")
    f.write(tex)
    f.close()
    proc=subprocess.Popen(['pdflatex','-output-directory', pdfDirectory,'%s.tex'%name])
    proc.communicate()
    self.open_file(name)
    self.master.after(20000,self.generate_pdf)
    self.close_file(name)
def open_file(self,filename):
    if sys.platform == "win32":
        os.startfile('%s.pdf'%filename)
        #os.unlink('%s.tex'%filename)
        os.unlink('%s.log'%filename)
        os.unlink('%s.aux'%filename)
    else:
        opener ="open" if sys.platform == "darwin" else "xdg-open"
        subprocess.call([opener, '%s.pdf'%filename])
def close_file(self,filename):
    if sys.platform == "win32":
        os.close('%s.pdf'%filename)
    else:
        closer ="close" if sys.platform == "darwin" else "xdg-close"
        subprocess.call([closer, '%s.pdf'%filename])
    self.master.after(29000,self.close_file)
The error I get when running it in Windows is:
os.close('%s.pdf'%filename)
TypeError: an integer is required
The error I get when running it in Linux is:
File "interface_updated_Linux.py", line 716, in generate_pdf
self.close_file(name)
File "interface_updated_Linux.py", line 734, in close_file
subprocess.call([closer, '%s.pdf'%filename])
File "/usr/lib64/python2.6/subprocess.py", line 478, in call
p = Popen(*popenargs, **kwargs)
File "/usr/lib64/python2.6/subprocess.py", line 642, in __init__
errread, errwrite)
File "/usr/lib64/python2.6/subprocess.py", line 1238, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
From the documentation:
os.close(fd)
Close file descriptor fd.
Availability: Unix, Windows.
Note This function is intended for low-level I/O and must be applied
to a file descriptor as returned by os.open() or pipe(). To close a
“file object” returned by the built-in function open() or by popen()
or fdopen(), use its close() method.
You're trying to apply it to a string here.
You should figure out how to pass the file descriptor around among the methods in your class. This might be a little more complicated where you're using subprocess to open/close files, but in general you're going to need a handle of some kind (file handle, file descriptor, or process ID) to close a file. Simply using the name is not sufficient: for example, you could have two handles open to the same file, so how would it know which one to close?
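As a rough sketch of what "passing a handle around" could look like when the file is shown by an external program: keep the Popen object returned when the program is started, and close the program through that object instead of through the file name. ViewerProcess is a hypothetical class, and the demo uses a sleeping Python process as a stand-in for a real PDF viewer. This assumes the viewer is started directly; os.startfile returns no handle at all, and a launcher such as xdg-open may hand the file to a separate program that the Popen object does not control.

```python
import subprocess
import sys


class ViewerProcess(object):
    """Keep the handle of a launched program so it can be closed later.

    Hypothetical helper: command is whatever program displays the file.
    """

    def __init__(self, command):
        self.command = list(command)
        self.proc = None

    def start(self):
        # Popen returns a handle to the process that was started.
        self.proc = subprocess.Popen(self.command)

    def is_open(self):
        return self.proc is not None and self.proc.poll() is None

    def stop(self):
        # Terminate the process through its handle, not through a file name.
        if self.is_open():
            self.proc.terminate()
            self.proc.wait()
        self.proc = None


if __name__ == "__main__":
    # Stand-in for a viewer: a Python process that just sleeps.
    viewer = ViewerProcess([sys.executable, "-c", "import time; time.sleep(30)"])
    viewer.start()
    print("open:", viewer.is_open())
    viewer.stop()
    print("open:", viewer.is_open())
```

In the Tkinter class from the question, an object like this could be stored on self so that the method that regenerates the PDF can stop the old viewer before starting a new one.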

OSError: [Errno 36] File name too long while using Popen - Python

As I started asking on a previous question, I'm extracting a tarball using the tarfile module of python. I don't want the extracted files to be written on the disk, but rather get piped directly to another program, specifically bgzip.
#!/usr/bin/env python
import tarfile, subprocess, re
mov = []
def clean(s):
    s = re.sub('[^0-9a-zA-Z_]', '', s)
    s = re.sub('^[^a-zA-Z_]+', '', s)
    return s
with tarfile.open("SomeTarballHere.tar.gz", "r:gz") as tar:
    for file in tar.getmembers():
        if file.isreg():
            mov = file.name
            proc = subprocess.Popen(tar.extractfile(file).read(), stdout = subprocess.PIPE)
            proc2 = subprocess.Popen('bgzip -c > ' + clean(mov), stdin = proc, stdout = subprocess.PIPE)
            mov = None
But now I get stuck on this:
Traceback (most recent call last):
File "preformat.py", line 12, in <module>
proc = subprocess.Popen(tar.extractfile(file).read(), stdout = subprocess.PIPE)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1335, in _execute_child
raise child_exception
OSError: [Errno 36] File name too long
Is there any workaround for this? I have been using the LightTableLinux.tar.gz (it contains the files for a text editor program) as a tarball to test the script on it.
The exception is raised in the forked-off child process when trying to execute the target program from this invocation:
proc = subprocess.Popen(tar.extractfile(file).read(), stdout = subprocess.PIPE)
This reads the contents of an entry in the tar file and then tries to execute a program whose name is the contents of that entry.
Also your second invocation won't work, as you are trying to use shell redirection without using shell=True in Popen():
proc2 = subprocess.Popen('bgzip -c > ' + clean(mov), stdin = proc, stdout = subprocess.PIPE)
The redirect may also not be necessary, as you should be able to simply redirect the output from bgzip to a file from python directly.
Edit: Unfortunately, despite extractfile() returning a file-like object, Popen() expects a real file (with a fileno). Hence, a little wrapping is required:
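source = tar.extractfile(file)  # file-like object for this tar member
# open() is used because the loop variable "file" shadows the file() builtin
with open(clean(mov), 'wb') as outfile:
    proc = subprocess.Popen(
        ('bgzip', '-c'),
        stdin=subprocess.PIPE,
        stdout=outfile,
    )
    shutil.copyfileobj(source, proc.stdin)  # requires "import shutil"
    proc.stdin.close()  # signal end of input to bgzip
    proc.wait()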
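For anyone who wants to try the pattern without a tarball or bgzip at hand, here is a self-contained Python 3 sketch. pipe_member, make_demo_tar and COPY_COMMAND are hypothetical names; the child process is a small Python one-liner that copies stdin to stdout, standing in for ('bgzip', '-c').

```python
import io
import os
import shutil
import subprocess
import sys
import tarfile
import tempfile


def pipe_member(tar, member, command, out_path):
    """Stream one tar member into command's stdin; save its stdout to out_path."""
    source = tar.extractfile(member)
    with open(out_path, "wb") as outfile:
        proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=outfile)
        shutil.copyfileobj(source, proc.stdin)
        proc.stdin.close()  # signals end of input to the child
        return proc.wait()


def make_demo_tar(payload):
    """Build a small in-memory tar archive with one member (demo data only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo("hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


# Stand-in for ('bgzip', '-c'): a child process that copies stdin to stdout.
COPY_COMMAND = [
    sys.executable, "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]

if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    with tarfile.open(fileobj=make_demo_tar(b"hello\n"), mode="r") as tar:
        for member in tar.getmembers():
            if member.isreg():
                target = os.path.join(out_dir, member.name)
                print(target, pipe_member(tar, member, COPY_COMMAND, target))
```

The member is copied in chunks by shutil.copyfileobj, so the whole file never has to sit in memory at once.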

File not found error using python

I want to use MafftCommandline to align my data, but I get the following error:
Traceback (most recent call last):
File "C:\Users\Rimvis\Desktop\asd\bioinformatika2_Working.py", line 35, in <module>
stdout, stderr = mafftCline() # Note that MAFFT will write the alignment to stdout, which you may want to save to a file and then parse
File "C:\Python27\lib\site-packages\Bio\Application\__init__.py", line 475, in __call__
shell=(sys.platform!="win32"))
File "C:\Python27\lib\subprocess.py", line 679, in __init__
errread, errwrite)
File "C:\Python27\lib\subprocess.py", line 896, in _execute_child
startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
My code is as follows:
dataToProcess = "dataToProcess.fa"
file = open(dataToProcess, "w")
arrayOfSequences = []
for sequence in blast.alignments:
    if sequence.length > blast.alignments[0].length * 80 / 100:
        sequenceToAppend = SeqRecord(Seq(sequence.hsps[0].sbjct), id=sequence.title)
        arrayOfSequences.append(sequenceToAppend)
SeqIO.write(arrayOfSequences, file, "fasta")
file.close()
maffPath = "..\mafft-win\mafft.bat"
mafftCline = MafftCommandline(maffPath, input=dataToProcess)
stdout, stderr = mafftCline() # Note that MAFFT will write the alignment to stdout, which you may want to save to a file and then parse
alignedData = "aligned.fa"
alignedFile = open(alignedData, "w")
alignedFile.write(stdout)
alignedFile.close()
aligned = AlignIO.read(alignedData, "fasta")
I was using this tutorial as an example.
As @willOEM has said, the script is looking for a file in a relative directory.
Your script assumes that its file is located in the same directory as your "dataToProcess" fasta file.
If you have moved your script or are trying to open a file located elsewhere then it will raise this error.
You'll need to change your dataToProcess, maffPath and alignedFile to refer to the absolute path.
The problem was that I needed to escape the backslashes and use maffPath = "..\\mafft-win\\mafft.bat" instead of maffPath = "..\mafft-win\mafft.bat".
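A short sketch of the usual ways to write a Windows path in Python so that backslashes cannot be misread as escape sequences. The "..\tools" path is a made-up example chosen because "\t" is a real escape sequence (a tab); ntpath is used so that the join produces backslashes on any platform.

```python
import ntpath

# Three spellings of the same Windows path:
escaped = "..\\mafft-win\\mafft.bat"                    # backslashes doubled
raw = r"..\mafft-win\mafft.bat"                         # raw string literal
joined = ntpath.join("..", "mafft-win", "mafft.bat")    # built from parts

# Hypothetical path showing what can go wrong with single backslashes:
# "\t" is read as one tab character, so the path is silently altered.
broken = "..\tools"
intended = r"..\tools"

if __name__ == "__main__":
    print(escaped == raw == joined)
    print(repr(broken), repr(intended))
```

Raw strings or a join function tend to be easier to read than doubled backslashes once paths get long.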

closed fd with subprocesses, IPC, SMP

Given the function
def get_files_from_sha(sha, files):
    from subprocess import Popen, PIPE
    import tarfile
    if 0 == len(files):
        return {}
    p = Popen(["git", "archive", sha], bufsize=10240, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    tar = tarfile.open(fileobj=p.stdout, mode='r|')
    p.communicate()
    contents = {}
    doall = files == '*'
    if not doall:
        files = set(files)
    for entry in tar:
        if (isinstance(files, set) and entry.name in files) or doall:
            tf = tar.extractfile(entry)
            contents[entry.name] = tf.read()
            if not doall:
                files.discard(entry.name)
    if not doall:
        for fname in files:
            contents[fname] = None
    tar.close()
    return contents
which is called in a loop for some values of sha, after a while (in my case, 4 iterations) it starts to fail at the call to tf.read(), with the message:
Traceback (most recent call last):
File "../yap-analysis/extract.py", line 243, in <module>
commits, identities, identities_by_name, identities_by_email, identities_freq = build_commits(commits)
File "../yap-analysis/extract.py", line 186, in build_commits
commit = get_commit(commit)
File "../yap-analysis/extract.py", line 84, in get_commit
contents = get_files_from_sha(commit['sha'], files)
File "../yap-analysis/extract.py", line 42, in get_files_from_sha
contents[entry.name] = tf.read()
File "/usr/lib/python2.7/tarfile.py", line 817, in read
buf += self.fileobj.read()
File "/usr/lib/python2.7/tarfile.py", line 737, in read
return self.readnormal(size)
File "/usr/lib/python2.7/tarfile.py", line 746, in readnormal
return self.fileobj.read(size)
File "/usr/lib/python2.7/tarfile.py", line 573, in read
buf = self._read(size)
File "/usr/lib/python2.7/tarfile.py", line 581, in _read
return self.__read(size)
File "/usr/lib/python2.7/tarfile.py", line 606, in __read
buf = self.fileobj.read(self.bufsize)
ValueError: I/O operation on closed file
I suspect there is some parallelization that subprocess attempts to make (?).
What is the actual cause and how to solve it in a clean and robust way on python2?
Do not use .communicate() on the Popen instance; it'll read the stdout stream until it is finished. From the documentation:
Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached.
The code for .communicate() even adds an explicit .close() call on the stdout of the pipe.
Simply removing the call to .communicate() should be enough, but do also add a .wait() after reading the tarfile contents:
tar.close()
p.stdout.close()
p.wait()
It could be that tar.close() also closes p.stdout, but an extra .close() there should not hurt.
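The order of operations described above (read the tar stream first, then close the pipe and wait) can be sketched as follows in Python 3. read_tar_stream and DEMO_COMMAND are hypothetical names; the child process is a Python one-liner that writes a tiny tar archive to stdout, standing in for ["git", "archive", sha]. Only stdout is piped in this sketch, on the assumption that stdin and stderr are not needed; an unread stderr pipe could fill up and block the child.

```python
import subprocess
import sys
import tarfile


def read_tar_stream(command):
    """Run command and return {member name: bytes} from the tar on its stdout."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    contents = {}
    try:
        # mode "r|" reads the archive as a forward-only stream from the pipe.
        with tarfile.open(fileobj=proc.stdout, mode="r|") as tar:
            for entry in tar:
                if entry.isreg():
                    contents[entry.name] = tar.extractfile(entry).read()
    finally:
        proc.stdout.close()
        proc.wait()  # reap the child only after the stream has been consumed
    return contents


# Stand-in for ["git", "archive", sha]: a child that writes a tiny tar to stdout.
DEMO_COMMAND = [sys.executable, "-c", (
    "import io, sys, tarfile\n"
    "data = b'hello'\n"
    "info = tarfile.TarInfo('a.txt'); info.size = len(data)\n"
    "tar = tarfile.open(fileobj=sys.stdout.buffer, mode='w|')\n"
    "tar.addfile(info, io.BytesIO(data))\n"
    "tar.close()\n"
)]

if __name__ == "__main__":
    print(read_tar_stream(DEMO_COMMAND))
```

Nothing calls communicate() here, so the pipe stays open for tarfile until every entry has been read.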
I think your problem is the p.communicate(). This method sends to stdin, reads from stdout and stderr (which you are not capturing) and waits for the process to terminate.
tarfile is trying to read from the process's stdout, but by the time it does, the process has finished and the pipe has been closed, hence the error.
I have not tried running your code (I don't have access to git), but you probably don't want the p.communicate() call at all; try commenting it out.
