I am doing some file processing, and to generate the output file I need to create a temporary file from existing data and then use that file as input to my function.
But I am confused about where I should save that file and when to delete it.
Is there a temp location where files automatically get deleted after the user session?
Python has the tempfile module for exactly this purpose. You do not need to worry about the location or deletion of the file, and it works on all supported platforms.
There are three types of temporary files:
tempfile.TemporaryFile - just basic temporary file,
tempfile.NamedTemporaryFile - "This function operates exactly as TemporaryFile() does, except that the file is guaranteed to have a visible name in the file system (on Unix, the directory entry is not unlinked). That name can be retrieved from the name attribute of the file object.",
tempfile.SpooledTemporaryFile - "This function operates exactly as TemporaryFile() does, except that data is spooled in memory until the file size exceeds max_size, or until the file’s fileno() method is called, at which point the contents are written to disk and operation proceeds as with TemporaryFile().",
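As a rough illustration of the differences listed above, the following sketch uses NamedTemporaryFile and SpooledTemporaryFile side by side. The variable names, the ".txt" suffix and the max_size value are arbitrary choices for the example, not anything prescribed by the module.

```python
import os
import tempfile

# NamedTemporaryFile: the file has a path that can be looked up while it is open.
with tempfile.NamedTemporaryFile(mode="w+", suffix=".txt") as named:
    named.write("hello")
    named.flush()  # push buffered data to disk before anything else reads it
    named_path = named.name
    existed_while_open = os.path.exists(named_path)

# Leaving the with block closes the file, which also deletes it by default.
exists_after_close = os.path.exists(named_path)

# SpooledTemporaryFile: data stays in memory until it grows past max_size.
with tempfile.SpooledTemporaryFile(max_size=1024, mode="w+") as spooled:
    spooled.write("small payload")
    spooled.seek(0)
    spooled_contents = spooled.read()

print(existed_while_open, exists_after_close, spooled_contents)
```

For small payloads the spooled variant may never touch the disk at all, which can be useful when most inputs are tiny.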
EDIT: The example usage you asked for could look like this:
from tempfile import TemporaryFile

with TemporaryFile(mode='w+') as f:  # text mode; the default 'w+b' expects bytes
    f.write('abcdefg')
    f.seek(0)  # go back to the beginning of the file
    print(f.read())  # prints: abcdefg
You should use something from the tempfile module. I think that it has everything you need.
I would add that Django has a built-in NamedTemporaryFile in django.core.files.temp, which is recommended for Windows users over the tempfile module. This is because Python's implementation uses the O_TEMPORARY flag on Windows, which prevents the file from being reopened unless the same flag is provided; Django's version avoids this so the file can be reopened in the same process, as explained in the comments in the code base.
Using this would look something like:
from django.core.files.temp import NamedTemporaryFile
temp_file = NamedTemporaryFile(delete=True)
Here is a nice little tutorial about it and working with in-memory files, credit to Mayank Jain.
I just added some important changes: convert str to bytes and a command call to show how external programs can access the file when a path is given.
from subprocess import call
from tempfile import NamedTemporaryFile

with NamedTemporaryFile(mode='w+b') as temp:
    # Encode your text in order to write bytes
    temp.write('abcdefg'.encode())
    # Put the file buffer back to offset=0
    temp.seek(0)
    # Make sure buffered data is on disk before another program reads it
    temp.flush()
    # Use the temp file: pass its path to an external program (Unix "cat")
    print(call(['cat', temp.name]))
I'm trying, just for fun, to understand if I can extract the full path of my file while using the with statement (python 3.8)
I have this simple code:
with open('tmp.txt', 'r') as file:
    print(os.path.basename(file))
But I keep getting an error that it's not a suitable type format.
I've been trying also with the relpath, abspath, and so on.
It says that the input should be a string, but even after casting it into string, I'm getting something that I can't manipulate.
Perhaps there isn't an actual way to extract that full path name, but I think there is. I just can't find it, yet.
You could try:
import os

with open("tmp.txt", "r") as file_handle:
    print(os.path.abspath(file_handle.name))
The functions in os.path accept strings or path-like objects. You are attempting to pass in a file instead. There are lots of reasons the types aren't interchangeable.
Since you opened the file for text reading, file is an instance of io.TextIOWrapper. This class is just an interface that provides text encoding and decoding for some underlying data. It is not associated with a path in general: the underlying stream can be a file on disk, but also a pipe, a network socket, or an in-memory buffer (like io.StringIO). None of the latter are associated with a path or filename in the way that you are thinking, even though you would interface with them as though they were normal file objects.
If your file-like object is backed by an io.FileIO, it will have a name attribute to keep track of this information for you. Other sources of data will not. Since the file in your question is ultimately backed by a FileIO (the text wrapper exposes the underlying name), you can do
with open('tmp.txt', 'r') as file:
    print(os.path.abspath(file.name))
The full file path is given by os.path.abspath.
That being said, since file objects don't generally care about file names, it is probably better for you to keep track of that info yourself, in case one day you decide to use something else as input. Python 3.8+ allows you to do this without changing your line count using the walrus operator:
with open((filename := 'tmp.txt'), 'r') as file:
    print(os.path.abspath(filename))
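The distinction drawn above between streams that are backed by a path and streams that are not can be checked with a small helper. This is only a sketch: describe_source is a hypothetical name invented for the example, and the temporary directory is used just so that the snippet does not depend on an existing tmp.txt.

```python
import io
import os
import tempfile


def describe_source(stream):
    """Return the absolute path behind a stream, or None if it has no usable name."""
    name = getattr(stream, "name", None)
    # Streams opened from a file descriptor report an int here, not a path.
    if isinstance(name, (str, os.PathLike)):
        return os.path.abspath(name)
    return None


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tmp.txt")
    with open(path, "w") as handle:
        handle.write("data")
    with open(path, "r") as handle:
        on_disk = describe_source(handle)

# An in-memory text buffer has no name attribute at all.
in_memory = describe_source(io.StringIO("data"))

print(on_disk)
print(in_memory)
```

Returning None rather than raising keeps the calling code simple when the input may come from either kind of stream.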
Using a library such as Python's io, I can create a file e.g. csv format, in memory. However, I cannot get a UNC (Universal Naming Convention) referencing this file. How would I be able to assign such a name to an in-memory file created by Python?
A "file" created in memory using the io module is not a "file" as far as the OS is concerned. You can't open it by name or access it from outside the process. This file only exists as a variable in the program.
You can get the string contents directly, as in this example from the documentation (https://docs.python.org/3/library/io.html#io.StringIO):
import io
output = io.StringIO()
output.write('First line.\n')
print('Second line.', file=output)
contents = output.getvalue()
or you can rewind the file and read it back:
output.seek(0)
contents = output.read()
I have a library I need to call that takes a local file path as input and runs open(local_path, 'rb'). However, I don't have a local file--I have an in memory text string. Right now I am writing that to a temp file and passing that, but it seems wasteful. Is there a better way to do this, given that I need to be able to run open(local_path, 'rb') on it?
Current code:
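import tempfile

text = "Some text"
# NamedTemporaryFile (not TemporaryFile) accepts delete=False and exposes a usable path in .name
temp = tempfile.NamedTemporaryFile(delete=False)
temp.write(bytes(text, 'UTF-8'))
temp.seek(0)
temp.close()
# call external lib here, passing in temp.name as the local_path input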
Later, inside the lib I need to use (I can't edit this):
with open(local_path, 'rb') as content_file:
    file_content = content_file.read()
Since the function you call in turn calls open() with the passed parameter, you must give it a str or a PathLike. This means you basically need a file which exists in the file system. You won't be able to pass an in-memory object like I was originally thinking.
Original answer:
I suggest looking at the io package. Specifically, StringIO provides a file-like wrapper on an in-memory string object. If you need binary, then try BytesIO.
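Since a real path is required, one way to keep the temporary file tidy is to wrap its creation and removal in a context manager. The sketch below is only an illustration: text_as_temp_path and library_read are hypothetical names, and library_read merely stands in for the external library call from the question.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def text_as_temp_path(text, encoding="utf-8"):
    """Write text to a temporary file, yield its path, and remove it afterwards."""
    # delete=False keeps the file on disk after close() so it can be reopened by path.
    with tempfile.NamedTemporaryFile(mode="wb", delete=False) as temp:
        temp.write(text.encode(encoding))
    try:
        yield temp.name
    finally:
        os.remove(temp.name)


def library_read(local_path):
    # Hypothetical stand-in for the external library, which opens the path itself.
    with open(local_path, "rb") as content_file:
        return content_file.read()


with text_as_temp_path("Some text") as local_path:
    file_content = library_read(local_path)

path_removed = not os.path.exists(local_path)
print(file_content, path_removed)
```

Closing the file before handing out its path means the library reopens a fully written file, and the finally clause removes it even if the library raises.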
I am using pdftk like this:
pdftk template.pdf fill_form /temp/input.fdf output /temp/output.pdf
This is working fine.
But now I generate a temporary file instead of /temp/input.fdf, like this:
myfile = tempfile.NamedTemporaryFile()
myfile.write(fdf)
myfile.seek(0)
myfile.close()
Now I don't know how I can pass myfile as input to pdftk.
myfile.name will get you the file path.
Note that tempfiles do not exist after close(). From the docs:
tempfile.TemporaryFile([mode='w+b'[, bufsize=-1[, suffix=''[,
prefix='tmp'[, dir=None]]]]])
Return a file-like object that can be used as a temporary storage
area. The file is created using mkstemp(). It will be destroyed as
soon as it is closed (including an implicit close when the object is
garbage collected). Under Unix, the directory entry for the file is
removed immediately after the file is created. Other platforms do not
support this; your code should not rely on a temporary file created
using this function having or not having a visible name in the file
system.
Source: http://docs.python.org/2/library/tempfile.html
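Because the file disappears on close(), the external command has to run while the file is still open, after flushing the written data. The following is a minimal sketch of that ordering. The fdf bytes are a placeholder, and a small Python child process stands in for pdftk so that the example runs without pdftk installed; on Windows a file held open this way may not be readable by another process.

```python
import subprocess
import sys
import tempfile

fdf = b"%FDF-1.2 placeholder"  # hypothetical stand-in for the real FDF data

with tempfile.NamedTemporaryFile(suffix=".fdf") as myfile:
    myfile.write(fdf)
    myfile.flush()  # make sure the data is on disk before another process reads it
    # With pdftk the command from the question would become something like:
    # ["pdftk", "template.pdf", "fill_form", myfile.name, "output", "/temp/output.pdf"]
    result = subprocess.run(
        [sys.executable, "-c",
         "import sys; print(len(open(sys.argv[1], 'rb').read()))",
         myfile.name],
        capture_output=True, text=True, check=True,
    )

# The child process reports how many bytes it could read from the temp file.
reported_size = int(result.stdout)
print(reported_size)
```

Once the with block exits, the file is closed and deleted, so myfile.name no longer points at anything.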
Can't you get the name using myfile.name, like this?
myfile = tempfile.NamedTemporaryFile()
myfile.write(fdf)
myfile.seek(0)
myfile.close()
print(myfile.name)
After some frustration with unzip(1L), I've been trying to create a script that will unzip and print out raw data from all of the files inside a zip archive that is coming from stdin. I currently have the following, which works:
import sys, zipfile, StringIO
stdin = StringIO.StringIO(sys.stdin.read())
zipselect = zipfile.ZipFile(stdin)
filelist = zipselect.namelist()
for filename in filelist:
    print filename, ':'
    print zipselect.read(filename)
When I try to add validation to check if it truly is a zip file, however, it doesn't like it.
...
zipcheck = zipfile.is_zipfile(zipselect)
if zipcheck is not None:
    print 'Input is not a zip file.'
    sys.exit(1)
...
results in
  File "/home/chris/simple/zipcat/zipcat.py", line 13, in <module>
    zipcheck = zipfile.is_zipfile(zipselect)
  File "/usr/lib/python2.7/zipfile.py", line 149, in is_zipfile
    result = _check_zipfile(fp=filename)
  File "/usr/lib/python2.7/zipfile.py", line 135, in _check_zipfile
    if _EndRecData(fp):
  File "/usr/lib/python2.7/zipfile.py", line 203, in _EndRecData
    fpin.seek(0, 2)
AttributeError: ZipFile instance has no attribute 'seek'
I assume it can't seek because it is not a file, as such?
Sorry if this is obvious, this is my first 'go' with Python.
You should pass stdin to is_zipfile, not zipselect. is_zipfile takes a path to a file or a file object, not a ZipFile.
See the zipfile.is_zipfile documentation.
You are correct that a ZipFile can't seek because it isn't a file. It's an archive, so it can contain many files.
To do this entirely in memory will take some work. The AttributeError message means that is_zipfile is trying to call the seek method of the object you pass it, and a ZipFile instance has no such method. Standard input itself is not seekable either when it is a pipe, so its contents have to be buffered before they can be checked.
If you really, really can't store the file on disk temporarily, then you could buffer the entire file in memory (you would need to enforce a size limit for security), and then implement some "duck" code that looks and acts like a seekable file object but really just uses the byte-string in memory.
It is possible that you could cheat and buffer only enough of the data for is_zipfile to do its work, but I seem to recall that the table-of-contents for ZIP is at the end of the file. I could be wrong about that though.
Your 2011 python2 fragment was: StringIO.StringIO(sys.stdin.read())
In 2018 a python3 programmer might phrase that as: io.StringIO(...).
What you wanted was the following python3 fragment: io.BytesIO(...).
Certainly that works well for me when using the requests module to download binary ZIP files from webservers:
zf = zipfile.ZipFile(io.BytesIO(req.content))
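Putting the pieces together, a Python 3 version of the validation and extraction might look like the sketch below. read_zip_bytes is a hypothetical helper name, and the sample archive is built in memory only so that the snippet is self-contained.

```python
import io
import zipfile


def read_zip_bytes(data):
    """Return a dict of member name to contents for ZIP data held in memory."""
    buffer = io.BytesIO(data)
    # is_zipfile accepts a path or a seekable file object, not a ZipFile instance.
    if not zipfile.is_zipfile(buffer):
        raise ValueError("Input is not a zip file.")
    buffer.seek(0)  # rewind in case the check moved the file position
    with zipfile.ZipFile(buffer) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Build a small archive in memory to exercise the helper.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("hello.txt", "hello world")

contents = read_zip_bytes(sample.getvalue())
print(contents)
```

In a script that reads the archive from standard input, the bytes could be obtained with sys.stdin.buffer.read() and passed to the helper in the same way.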