Python: best way to use "open" with an in-memory str

I have a library I need to call that takes a local file path as input and runs open(local_path, 'rb'). However, I don't have a local file--I have an in memory text string. Right now I am writing that to a temp file and passing that, but it seems wasteful. Is there a better way to do this, given that I need to be able to run open(local_path, 'rb') on it?
Current code:
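import tempfile

text = "Some text"
# NamedTemporaryFile (not TemporaryFile) accepts delete=False and exposes a usable .name
temp = tempfile.NamedTemporaryFile(delete=False)
temp.write(bytes(text, 'UTF-8'))
temp.seek(0)
temp.close()
# call external lib here, passing in temp.name as the local_path input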
Later, inside the lib I need to use (I can't edit this):
with open(local_path, 'rb') as content_file:
    file_content = content_file.read()

Since the function you call in turn calls open() with the passed parameter, you must give it a str or a PathLike. This means you basically need a file which exists in the file system. You won't be able to pass an in-memory object like I was originally thinking.
Original answer:
I suggest looking at the io package. Specifically, StringIO provides a file-like wrapper on an in-memory string object. If you need binary, then try BytesIO.
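Because the library only accepts a path, a temporary file is still the practical route, but the cleanup can be wrapped up in one place. What follows is a minimal sketch, not a definitive implementation: the helper name text_as_temp_path is hypothetical, and read_file is only a stand-in for the library call that cannot be edited.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def text_as_temp_path(text, encoding="utf-8"):
    """Write text to a named temporary file and yield its path."""
    # delete=False lets the file be reopened by path on every platform,
    # including Windows; the file is removed explicitly afterwards.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        tmp.write(text.encode(encoding))
        tmp.close()  # flush to disk before anything else opens the path
        yield tmp.name
    finally:
        tmp.close()  # harmless if the file is already closed
        os.remove(tmp.name)


def read_file(local_path):
    # Hypothetical stand-in for the external library function.
    with open(local_path, "rb") as content_file:
        return content_file.read()


with text_as_temp_path("Some text") as path:
    file_content = read_file(path)
```

The file exists only for the duration of the with block, so nothing is left behind even if the library call raises.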

Related

How to extract the full path from a file while using the "with" statement?

I'm trying, just for fun, to understand if I can extract the full path of my file while using the with statement (python 3.8)
I have this simple code:
with open('tmp.txt', 'r') as file:
    print(os.path.basename(file))
But I keep getting an error that it's not a suitable type format.
I've been trying also with the relpath, abspath, and so on.
It says that the input should be a string, but even after casting it into string, I'm getting something that I can't manipulate.
Perhaps there isn't an actual way to extract that full path name, but I think there is. I just can't find it, yet.
You could try:
import os
with open("tmp.txt", "r") as file_handle:
    print(os.path.abspath(file_handle.name))
The functions in os.path accept strings or path-like objects. You are attempting to pass in a file object instead. There are lots of reasons the types aren't interchangeable.
Since you opened the file for text reading, file is an instance of io.TextIOWrapper. This class is just an interface that provides text encoding and decoding for some underlying data. It is not associated with a path in general: the underlying stream can be a file on disk, but also a pipe, a network socket, or an in-memory buffer (like io.StringIO). None of the latter are associated with a path or filename in the way that you are thinking, even though you would interact with them as though they were normal file objects.
If your file-like object is backed by an io.FileIO, it will have a name attribute to keep track of this information for you. Other sources of data will not. Since the example in your question uses FileIO, you can do
with open('tmp.txt', 'r') as file:
    print(os.path.abspath(file.name))
The full file path is given by os.path.abspath.
That being said, since file objects don't generally care about file names, it is probably better for you to keep track of that info yourself, in case one day you decide to use something else as input. Python 3.8+ allows you to do this without changing your line count using the walrus operator:
with open((filename := 'tmp.txt'), 'r') as file:
    print(os.path.abspath(filename))
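To make the difference between disk-backed and in-memory streams concrete, here is a small sketch. The helper name describe_source is hypothetical; it just probes for a name attribute and treats anything that is not a path-like value as having no path.

```python
import io
import os
import tempfile


def describe_source(stream):
    """Return an absolute path for disk-backed streams, else None."""
    name = getattr(stream, "name", None)
    # Files opened from a raw descriptor report an int here, not a path.
    if isinstance(name, (str, bytes, os.PathLike)):
        return os.path.abspath(name)
    return None


# An in-memory text stream has no name attribute at all.
in_memory = io.StringIO("no path here")
memory_result = describe_source(in_memory)

# A file opened by path remembers the path it was opened with.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tmp.txt")
    with open(path, "w") as handle:
        disk_result = describe_source(handle)
```

Here memory_result is None, while disk_result is the absolute path of tmp.txt.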

Convert file into BytesIO object using python

I have a file and want to convert it into BytesIO object so that it can be stored in database's varbinary column.
Please can anyone help me convert it using python.
Below is my code:
f = open(filepath, "rb")
print(f.read())
myBytesIO = io.BytesIO(f)
myBytesIO.seek(0)
print(type(myBytesIO))
Opening a file with open and mode read-binary already gives you a Binary I/O object.
Documentation:
The easiest way to create a binary stream is with open() with 'b' in the mode string:
f = open("myfile.jpg", "rb")
So in normal circumstances, you'd be fine just passing the file handle wherever you need to supply it. If you really want/need to get a BytesIO instance, just pass the bytes you've read from the file when creating your BytesIO instance like so:
from io import BytesIO
with open(filepath, "rb") as fh:
    buf = BytesIO(fh.read())
This has the disadvantage of loading the entire file into memory, which might be avoidable if the code you're passing the instance to is smart enough to stream the file without keeping it in memory. Note that the example uses open as a context manager that will reliably close the file, even in case of errors.

Should I open a file outside or inside a function?

Maybe this question makes no sense, but I was wondering if there was a "recommended practice" on how to pass a file to a function in Python.
Should I pass the file's path or the opened file itself ?
Should I do :
def func(file):
    file.write(...)

with open(file_path, 'w') as file:
    func(file)
...or :
def func(file_path):
    with open(file_path, 'w') as file:
        file.write(...)

func(file_path)
?
Is there some reason to use one method instead of the other ?
Both ways have their advantages and disadvantages. When a function takes an open file object, it becomes easier to use with other file-like objects such as io.StringIO. On the other hand, using a with statement inside a function is very elegant. A hybrid solution would be to accept either a path (string) or a file-like object. Several libraries do that.
Passing a file-like object is recommended over passing a path. It makes your function easier to reuse with other kinds of files, not just ones with a path on disk, such as BytesIO (https://docs.python.org/3/library/io.html#io.BytesIO).
You can still use the with statement on the file-like object; it isn't limited to the place where you open it.
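The hybrid approach mentioned above could look roughly like this. It is only a sketch under assumptions: write_report is a hypothetical function name, and the rule "whoever opens the file closes it" is one reasonable convention, not the only one.

```python
import io
import os
from contextlib import nullcontext


def write_report(target, text):
    """Write text to a path or to an already-open text stream."""
    if isinstance(target, (str, bytes, os.PathLike)):
        # We open the file here, so we are responsible for closing it.
        context = open(target, "w")
    else:
        # The caller owns the stream; nullcontext leaves it open on exit.
        context = nullcontext(target)
    with context as stream:
        stream.write(text)


# Works with an in-memory stream...
buffer = io.StringIO()
write_report(buffer, "hello")
# ...and equally with a path: write_report("report.txt", "hello")
```

The caller's stream stays open after the call, so several functions can write to the same buffer in turn.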

How to generate temporary file in django and then destroy

I am doing some file processing, and to generate the output file I need to create a temporary file from existing data and then use that file as input to my function.
But I am confused about where I should save that file and when to delete it.
Is there a temp location where files are automatically deleted after the user session?
Python has the tempfile module for exactly this purpose. You do not need to worry about the location/deletion of the file, it works on all supported platforms.
There are three types of temporary files:
tempfile.TemporaryFile - just basic temporary file,
tempfile.NamedTemporaryFile - "This function operates exactly as TemporaryFile() does, except that the file is guaranteed to have a visible name in the file system (on Unix, the directory entry is not unlinked). That name can be retrieved from the name attribute of the file object.",
tempfile.SpooledTemporaryFile - "This function operates exactly as TemporaryFile() does, except that data is spooled in memory until the file size exceeds max_size, or until the file’s fileno() method is called, at which point the contents are written to disk and operation proceeds as with TemporaryFile().",
EDIT: The example usage you asked for could look like this:
from tempfile import TemporaryFile

with TemporaryFile('w+') as f:  # text mode; the default 'w+b' expects bytes
    f.write('abcdefg')
    f.seek(0)  # go back to the beginning of the file
    print(f.read())  # prints: abcdefg
You should use something from the tempfile module. I think that it has everything you need.
I would add that Django has a built-in NamedTemporaryFile in django.core.files.temp, which is recommended for Windows users over the tempfile module. This is because the Django version uses the O_TEMPORARY flag on Windows, which prevents the file from being re-opened unless the same flag is provided, as explained in the Django code base.
Using this would look something like:
from django.core.files.temp import NamedTemporaryFile
temp_file = NamedTemporaryFile(delete=True)
Here is a nice little tutorial about it and working with in-memory files, credit to Mayank Jain.
I just added some important changes: convert str to bytes and a command call to show how external programs can access the file when a path is given.
from subprocess import call
from tempfile import NamedTemporaryFile

with NamedTemporaryFile(mode='w+b') as temp:
    # Encode your text in order to write bytes
    temp.write('abcdefg'.encode())
    # make sure the data is on disk, then put the file offset back to 0
    temp.flush()
    temp.seek(0)
    # use the temp file from an external command while it still exists
    print(call(['cat', temp.name]))

writing to a file via FTP in python

So i've followed the docs on this page:
http://docs.python.org/library/ftplib.html#ftplib.FTP.retrbinary
And maybe I'm confused about what 'retrbinary' does... I'm thinking it retrieves a binary file, and from there I can open it and write out to that file.
here's the line that is giving me problems...
ftp.retrbinary('RETR temp.txt',open('temp.txt','wb').write)
what i don't understand is i'd like to write out to temp.txt, so i was trying
ftp.retrbinary('RETR temp.txt',open('temp.txt','wb').write('some new txt'))
but i was getting errors, i'm able to make a FTP connection, do pwd(), cwd(), rename(), etc.
p.s. i'm trying to google this as much as possible, thanks!
It looks like the original code should have worked if you were trying to download a file from the server. retrbinary accepts a function object you specify (that is, the name of the function with no () after it) and calls it each time a block of data arrives. In this case, it will call the write method of the file you opened. This is slightly different from retrlines, which assumes the data is a text file and handles line endings accordingly (but would corrupt, say, images).
On further reading, it looks like you're trying to write to a file on the server. In that case, you'll need to pass a file object (or some other object with a read method that behaves like a file) to storbinary, which reads from it:
ftp.storbinary("STOR test.txt", open("file_on_my_computer.txt", "rb"))
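If the text to upload only exists in memory, as in the question, it does not need to be written to a local file first. A minimal sketch, assuming ftp is an already-connected ftplib.FTP instance; the helper name upload_text is hypothetical.

```python
import io


def upload_text(ftp, remote_name, text, encoding="utf-8"):
    """Store an in-memory string on the server as remote_name."""
    # storbinary() reads from any binary file-like object, so a BytesIO
    # wrapping the encoded text can stand in for a file on disk.
    payload = io.BytesIO(text.encode(encoding))
    return ftp.storbinary(f"STOR {remote_name}", payload)
```

With a live connection this would be called as upload_text(ftp, "temp.txt", "some new txt").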
ftp.retrbinary takes a callback function as its second argument. It can be the write method of a file object, i.e. open('temp.txt','wb').write, but in your second attempt you are calling write yourself and passing its return value instead. You may supply your own callback and do whatever you want with the data:
def mywriter(data):
    print(data)

ftp.retrbinary('RETR temp.txt', mywriter)
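The same callback mechanism can collect a download in memory instead of printing it. This is a sketch under the assumption that ftp is a connected ftplib.FTP instance; download_bytes is a hypothetical helper name.

```python
import io


def download_bytes(ftp, remote_name):
    """Fetch remote_name and return its contents as bytes."""
    buffer = io.BytesIO()
    # Pass the bound method itself (no parentheses); retrbinary calls it
    # once for every block of data received.
    ftp.retrbinary(f"RETR {remote_name}", buffer.write)
    return buffer.getvalue()
```

The returned bytes can then be decoded, modified, and sent back with storbinary if the goal is to change the remote file.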
