Unable to delete temporary file with Python

I am using Django views. I create a temp_dir using tempfile.gettempdir(), write a gzipped text file in there, and then scp the file elsewhere. When these tasks are complete I try to delete the temp_dir:
if os.path.exists(temp_dir):
    shutil.rmtree(temp_dir)
However, occasionally I get this error back:
Operation not permitted: '/tmp/.ICE-unix'
Any ideas what this error means and how to best handle this situation?

tempfile.gettempdir() does not create a temp directory - it returns your system's standard tmp directory. DO NOT DELETE IT! That would blow away everybody's temp files. You can delete the file you created inside the temp dir, or you can create your own temp dir, but leave this one alone.
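If you do want a directory that is safe to delete, a minimal sketch using only the standard library (the prefix and filename below are just illustrative):

```python
import os
import shutil
import tempfile

# Create a private temp directory instead of reusing the system-wide one;
# deleting it afterwards then only affects your own files.
temp_dir = tempfile.mkdtemp(prefix="myapp-")
out_path = os.path.join(temp_dir, "data.txt.gz")
with open(out_path, "wb") as f:
    f.write(b"gzipped payload would go here")

# ... scp the file elsewhere ...

# Safe: this directory belongs to this process alone.
if os.path.exists(temp_dir):
    shutil.rmtree(temp_dir)
```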

The value for temp_dir is taken from the OS environment variables, and apparently some other process is also using it to create files. One of those other files may be in use or locked, which prevents you from deleting it.
Q: What is /tmp/.ICE-unix ?
A: It's a directory where X Window System session information is saved.

I am no expert, but try running the Python program (or whatever you're using to do this) as an administrator; then it will most likely allow the operation to complete.

Related

PyMuPDF (fitz) not properly closing files, resulting in PermissionError [WinError 32]

I can't figure out why I'm getting a PermissionError when trying to clean up some temporary pdf files that are no longer needed.
My script downloads a bunch of single-page pdfs into a /temp folder, then uses PyMuPDF to merge them into a single pdf. At the end of the script, when the merged file has been created, a cleanup function is supposed to move the pdfs from the temp folder to another folder so I can delete the temp folder. It's when everything else is done, at the end, that I get the permission error when trying to move the temp files.
I tried two methods to generate the pdf without leaving files open at the end: one as per the fitz wiki using open() and then close(), and the other using with to ensure that nothing was left open unintentionally. Below is a simplification of what I'm trying to do, which results in exactly the same PermissionError. Both methods are included and can be tried by commenting out one of them when instantiating the object. It is available with the folders and files as used in the script on my github. The script assumes some things to be present, as defined in the __init__ of the PdfOut class:
import os, fitz, time

class PdfOut:
    def __init__(self):
        cwd = os.getcwd()
        # 3 pdf files exist in the /temp folder
        self.files = ['pdf1.pdf', 'pdf2.pdf', 'pdf3.pdf']
        self.dir_in = os.path.join(cwd, 'temp')
        # /archive directory exists - this is where the composite pdf will be saved
        self.dir_out = os.path.join(cwd, 'archive')
        # /raw directory exists - this is where the single-page pdfs must be moved at the end of the script
        self.dir_store = os.path.join(cwd, 'raw')
        self.bookmarks = ['file1', 'file2', 'file3']
        self.file_out = "Combined_File.pdf"

    def writePDFusingClose(self):
        composite_pdf = fitz.open()
        for f in self.files:
            new_page = fitz.open(os.path.join(self.dir_in, f))
            composite_pdf.insert_pdf(new_page)
            new_page.close()
        new_toc = []
        page_count = 1
        for item in self.bookmarks:
            entry = [1, item, page_count]
            new_toc.append(entry)
            page_count += 1
        composite_pdf.set_toc(new_toc)
        composite_pdf.save(os.path.join(self.dir_out, self.file_out), deflate=True, garbage=3)
        composite_pdf.close()

    def writePDFusingWith(self):
        with fitz.open() as composite_pdf:
            for f in self.files:
                with fitz.open(os.path.join(self.dir_in, f)) as new_page:
                    composite_pdf.insert_pdf(new_page)
            new_toc = []
            page_count = 1
            for item in self.bookmarks:
                entry = [1, item, page_count]
                new_toc.append(entry)
                page_count += 1
            composite_pdf.set_toc(new_toc)
            composite_pdf.save(os.path.join(self.dir_out, self.file_out), deflate=True, garbage=3)

    def cleanUp(self):
        for file_name in os.listdir(self.dir_in):
            os.replace(os.path.join(self.dir_in, file_name), os.path.join(self.dir_store, file_name))
        os.rmdir(self.dir_in)

new_file = PdfOut()
new_file.writePDFusingClose()
# new_file.writePDFusingWith()
# time.sleep(10)
new_file.cleanUp()
As you can see I even tried putting in a 10-second delay to allow any scanning or system background operations to finish, but it didn't make a difference. In fact I tried manually deleting the files in Windows Explorer while the delay was ticking, and it told me the file was locked by Python (so not by some other system process). This leads me to believe PyMuPDF/fitz somehow keeps those files open in the Python process, even though the use of with should cause it to relinquish the files when that specific operation completes.
This is the error message it generates:
Traceback (most recent call last):
  File "d:\GitHub\TestPDFmergeandclean\main.py", line 52, in <module>
    new_file.cleanUp()
  File "d:\GitHub\TestPDFmergeandclean\main.py", line 45, in cleanUp
    os.replace(os.path.join(self.dir_in, file_name), os.path.join(self.dir_store, file_name))
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\\GitHub\\TestPDFmergeandclean\\temp\\pdf1.pdf' -> 'D:\\GitHub\\TestPDFmergeandclean\\raw\\pdf1.pdf'
Everything works as expected, the combined pdf is generated with the ToC in the folder where it's supposed to go, it's just the problem with the cleanup of the temp folder. For the life of me I can't find anywhere in the PyMuPDF documentation any other way of forcibly closing out docs other than with the use of .close() ...
Anybody have any idea what I'm doing wrong, or another suggestion to achieve the cleanup of the temp folder as I'm trying to achieve?
EDIT:
Once the main script completes I can manually move/delete the pdfs, indicating that they are indeed relinquished by Python when the script finishes. But that's kinda the point of my question, why can't I get Python to relinquish the files without having to end main.py and rerun another one? In my project I tried moving the cleanUp method to the main.py script and elsewhere, to separate it from the output.py (where the merged pdf is created), that didn't solve the issue unfortunately.
If you're interested you can see the full setup on my github (https://github.com/flyingbelgian/AU_AIP_crawler/tree/CombinePDF). In the project you will see that I also create temporary html files which are then moved to another folder without issue, even when cleanup was in the same .py as the creation of the temp html files. It's only the pdfs touched by PyMuPDF that appear to remain open even after calling .close() on them.
EDIT 2:
I added a more explicit
print("All methods completed, starting 20sec sleep")
time.sleep(20)
before the final cleanUp call, allowing me to check if the files are relinquished by PyMuPDF when all the pdf handling is completed.
This confirms that the files are being held open by Python and not by some other Windows process.
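One possible workaround (an assumption, not a confirmed fix for the question above): PyMuPDF can open a document from a bytes object via fitz.open(stream=..., filetype="pdf"), so if you read the source pdf into memory yourself, no library ever holds the on-disk file open and the move should succeed. A stdlib-only sketch of the pattern, with the fitz call left as a comment:

```python
import os
import pathlib
import tempfile

# Read the file's bytes yourself; the handle is closed immediately, so
# nothing blocks a later os.replace() on the on-disk file.
src_dir = tempfile.mkdtemp()
dst_dir = tempfile.mkdtemp()
path = os.path.join(src_dir, "pdf1.pdf")
pathlib.Path(path).write_bytes(b"%PDF-1.4 dummy")  # stand-in for a real pdf

data = pathlib.Path(path).read_bytes()  # file handle released right away
# doc = fitz.open(stream=data, filetype="pdf")  # hand PyMuPDF the bytes

os.replace(path, os.path.join(dst_dir, "pdf1.pdf"))  # no lock in the way
```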

How to delete a file without an extension?

I have made a function for deleting files:
def deleteFile(deleteFile):
    if os.path.isfile(deleteFile):
        os.remove(deleteFile)
However, when passing a FIFO filename (which has no file extension), the os module does not accept it.
Specifically I have a subprocess create a FIFO-file named 'Testpipe'.
When calling:
os.path.isfile('Testpipe')
It returns False. The file is not in use or open or anything like that. Python runs under Linux.
How can you correctly delete a file like that?
isfile checks for a regular file, and a FIFO is not one.
You could workaround it like this by checking if it exists but not a directory or a symlink:
def deleteFile(filename):
    if os.path.exists(filename) and not os.path.isdir(filename) and not os.path.islink(filename):
        os.remove(filename)
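A quick demonstration of the distinction (POSIX/Linux only, since it uses os.mkfifo): the path exists and os.remove works on it, but os.path.isfile is False because a FIFO is not a regular file.

```python
import os
import stat
import tempfile

# Create a FIFO and show why os.path.isfile() rejects it.
d = tempfile.mkdtemp()
fifo = os.path.join(d, "Testpipe")
os.mkfifo(fifo)

assert os.path.exists(fifo)                  # the path is there
assert not os.path.isfile(fifo)              # but it's not a *regular* file
assert stat.S_ISFIFO(os.stat(fifo).st_mode)  # it's a FIFO

os.remove(fifo)                              # plain remove works fine
os.rmdir(d)
```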

Python temporary directory to execute other processes?

I have a string of Java source code in Python that I want to compile, execute, and collect the output (stdout and stderr). Unfortunately, as far as I can tell, javac and java require real files, so I have to create a temporary directory.
What is the best way to do this? The tempfile module seems oriented towards creating files and directories that are only visible to the Python process, but in this case I need Java to be able to see them too. However, I also want the housekeeping handled intelligently if possible (such as deleting the folder when done, or using the appropriate system temp folder).
tempfile.NamedTemporaryFile and tempfile.TemporaryDirectory work perfectly fine for your purposes. The resulting objects have a .name attribute that provides a file-system-visible name that java/javac can handle just fine; just make sure to:
Set the suffix appropriately if the compiler insists on files being named with a .java extension
Always call .flush() on the file handle before handing the .name of a NamedTemporaryFile to an external process or it may (usually will) see an incomplete file
If you don't want Python cleaning up the files when you close the objects, either pass delete=False to NamedTemporaryFile's constructor, or use the mkstemp and mkdtemp functions (which create the objects, but don't clean them up for you).
So for example, you might do:
# Create temporary directory for source and class files
with tempfile.TemporaryDirectory() as d:
# Write source code
srcpath = os.path.join(d.name, "myclass.java")
with open(srcpath, "w") as srcfile:
srcfile.write('source code goes here')
# Compile source code
subprocess.check_call(['javac', srcpath])
# Run source code
# Been a while since I've java-ed; you don't include .java or .class
# when running, right?
invokename = os.path.splitext(srcpath)[0]
subprocess.check_call(['java', invokename])
... with block for TemporaryDirectory done, temp directory cleaned up ...
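The flush() point above can be sketched on its own; this is a POSIX example that uses cat as a stand-in for the external process (javac in the question):

```python
import subprocess
import tempfile

# Hand the temp file's path to an external process; without flush() the
# child process may see an empty or truncated file.
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt") as tmp:
    tmp.write("hello from python")
    tmp.flush()  # push buffered data to disk before the child reads it
    result = subprocess.run(["cat", tmp.name],
                            capture_output=True, text=True)
```

Note this relies on the child being able to open the file by name while Python still has it open, which works on Unix but not on Windows (see the NamedTemporaryFile question below).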
tempfile.mkstemp creates a file that is normally visible in the filesystem and returns both an OS-level file descriptor and the path. You should be able to use this to create your input and output files - assuming javac atomically overwrites the output file if it exists, there should be no race condition as long as other processes on your system don't misbehave.
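A minimal mkstemp sketch (the .java suffix and contents are just illustrative); unlike the object-based helpers, cleanup is your job:

```python
import os
import tempfile

# mkstemp returns an OS-level file descriptor plus a real path; the file
# is NOT deleted automatically, so remove it yourself when done.
fd, path = tempfile.mkstemp(suffix=".java")
try:
    with os.fdopen(fd, "w") as f:  # wrap the fd so it gets closed properly
        f.write("class Hello {}")
    # an external process (javac, etc.) can now open `path` by name
finally:
    os.remove(path)
```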

Can't access temporary files created with tempfile

I am using tempfile.NamedTemporaryFile() to store some text until the program ends. On Unix it works without any issues, but on Windows the returned file isn't accessible for reading or writing: Python gives Errno 13. The only way is to set delete=False and manually delete the file with os.remove(). Why?
The IOError occurs because, on Windows, the file can only be opened once after it is created.
The reason is because NamedTemporaryFile creates the file with FILE_SHARE_DELETE flag on Windows. On Windows when a file has been created/opened with specific share flag all subsequent open operations have to pass this share flag. It's not the case with Python's open function which does not pass FILE_SHARE_DELETE flag. See my answer on How to create a temporary file that can be read by a subprocess? question for more details and a workaround.
Take a look: http://docs.python.org/2/library/tempfile.html
tempfile.NamedTemporaryFile([mode='w+b'[, bufsize=-1[, suffix=''[, prefix='tmp'[, dir=None[, delete=True]]]]]])
This function operates exactly as TemporaryFile() does, except that the file is guaranteed to have a visible name in the file system (on Unix, the directory entry is not unlinked). That name can be retrieved from the name attribute of the file object. Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later). If delete is true (the default), the file is deleted as soon as it is closed.
Thanks to @Rnhmjoj, here is a working solution:
file = NamedTemporaryFile(delete=False)
file.close()
You have to keep the file with the delete-flag and then close it after creation. This way, Windows will unlock the file and you can do stuff with it!
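Putting that together, a small sketch of the Windows-friendly pattern (file contents here are just an example):

```python
import os
from tempfile import NamedTemporaryFile

# Create with delete=False and close the handle; on Windows this releases
# the lock so the path can be reopened normally.
tmp = NamedTemporaryFile(delete=False)
tmp.write(b"some text")
tmp.close()

with open(tmp.name) as f:  # reopening by name now works on all platforms
    data = f.read()

os.remove(tmp.name)  # manual cleanup, since delete=False was passed
```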

Python. Unchroot directory

I chrooted into a directory using the following command:
os.chroot("/mydir")
How can I return to the previous directory, from before the chroot?
Is it possible to undo the chroot?
SOLUTION:
Thanks to Phihag. I found a solution. Simple example:
import os

os.mkdir('/tmp/new_dir')
dir1 = os.open('.', os.O_RDONLY)
dir2 = os.open('/tmp/new_dir', os.O_RDONLY)
os.getcwd()                # we are in '/tmp'
os.chroot('/tmp/new_dir')  # chroot into 'new_dir'
os.fchdir(dir2)
os.getcwd()                # inside the chroot the path shows as '/'; that's OK
os.fchdir(dir1)
os.getcwd()                # we came back to the un-chrooted 'tmp' directory
os.close(dir1)
os.close(dir2)
More info
If you haven't changed your current working directory, you can simply call
os.chroot('../..') # Add '../' as needed
Of course, this requires the CAP_SYS_CHROOT capability (usually only given to root).
If you have changed your working directory, you can still escape, but it's harder:
os.mkdir('tmp')
os.chroot('tmp')
os.chdir('../../') # Add '../' as needed
os.chroot('.')
If chroot changes the current working directory, you can get around that by opening the directory, and using fchdir to go back.
Of course, if you intend to escape a chroot in the course of a normal program (i.e. not a demonstration or security exploit), you should rethink your program. First of all, do you really need to escape the chroot? Why can't you just copy the required info into it beforehand?
Also, consider using a second process that stays outside the chrooted environment and answers requests from the chrooted one.
