Run python zip file from memory at runtime? - python

I am trying to run a Python zip file that is retrieved using requests.get. The zip file contains several directories of Python files in addition to __main__.py, so I zip it in order to send it as a single file.
I know the file is being sent correctly, because I can save it to a file and then run it. However, I want to execute it without writing it to storage.
The working part is more or less as follows:
import requests
response = requests.get("http://myurl.com/get_zip")
I can write the zip to file using
f = open("myapp.zip","wb")
f.write(response.content)
f.close()
and manually run it from command line. However, I want something more like
exec(response.content)
This doesn't work since it's still compressed, but you get the idea.
I am also open to ideas that replace the zip with some other format of sending the code over internet, if it's easier to execute it from memory.

A possible solution is this
import io
import requests
from zipfile import ZipFile
response = requests.get("http://myurl.com/get_zip")
# Wrap the downloaded bytes in a file-like object.
binary_zip = io.BytesIO(response.content)
# Open the in-memory file as a ZipFile.
zip_file = ZipFile(binary_zip, "r")
# Iterate over all entries in the zip (folders should also be ok) and execute each one.
# Note: every file is executed on its own, so imports between the zipped modules will not resolve.
for script_file in zip_file.namelist():
    exec(zip_file.read(script_file))
But it is a bit convoluted and probably can be improved.
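One way to improve on this is to execute only __main__.py and let an import hook serve the other modules from the in-memory archive. The following is a minimal sketch, not a definitive implementation: the names InMemoryZipImporter, run_zip_from_memory and build_demo_zip are hypothetical, the demo archive stands in for response.content, and only plain .py modules and packages are handled.

```python
import importlib.abc
import importlib.util
import io
import sys
import zipfile


class InMemoryZipImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical helper: resolves imports from a zip held in memory."""

    def __init__(self, zip_bytes):
        self.archive = zipfile.ZipFile(io.BytesIO(zip_bytes))
        self.names = set(self.archive.namelist())

    def _locate(self, fullname):
        # Map "pkg.mod" to "pkg/mod/__init__.py" (package) or "pkg/mod.py" (module).
        base = fullname.replace(".", "/")
        if base + "/__init__.py" in self.names:
            return base + "/__init__.py", True
        if base + ".py" in self.names:
            return base + ".py", False
        return None, False

    def find_spec(self, fullname, path=None, target=None):
        member, is_package = self._locate(fullname)
        if member is None:
            return None  # not in the archive, let the other finders try
        return importlib.util.spec_from_loader(fullname, self, is_package=is_package)

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        member, _ = self._locate(module.__name__)
        exec(compile(self.archive.read(member), member, "exec"), module.__dict__)


def run_zip_from_memory(zip_bytes):
    """Execute __main__.py from the archive and return its namespace."""
    importer = InMemoryZipImporter(zip_bytes)
    sys.meta_path.insert(0, importer)
    try:
        namespace = {"__name__": "__main__"}
        source = importer.archive.read("__main__.py")
        exec(compile(source, "__main__.py", "exec"), namespace)
        return namespace
    finally:
        sys.meta_path.remove(importer)


def build_demo_zip():
    """Stand-in for response.content: a tiny app with one package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("memzip_demo/__init__.py", "")
        zf.writestr("memzip_demo/helper.py", "def greet(name):\n    return 'hello ' + name\n")
        zf.writestr("__main__.py", "from memzip_demo.helper import greet\nresult = greet('zip')\n")
    return buffer.getvalue()
```

With the original download, the call would presumably be run_zip_from_memory(response.content). The importer is placed first in sys.meta_path, so a module in the archive would shadow an installed module of the same name; appending it instead reverses that priority.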

Related

Archive files directly from memory in Python

I'm writing a program where I get a number of files and zip them with encryption using pyzipper. I write these files to io.BytesIO() objects so that they stay in memory. Now, after some other additions, I want to take all of these in-memory files and zip them together into a single encrypted zip file, again using pyzipper.
The code looks something like this:
# Create the in-memory file object
in_memory = BytesIO()
# Create the zip file and open in write mode
with pyzipper.AESZipFile(in_memory, "w", compression=pyzipper.ZIP_LZMA, encryption=pyzipper.WZ_AES) as zip_file:
    # Set password
    zip_file.setpassword(b"password")
    # Save "data" with file_name
    zip_file.writestr(file_name, data)
# Go to the beginning
in_memory.seek(0)
# Read the zip file data
data = in_memory.read()
# Add the data to a list
files.append(data)
So, as you may guess the "files" list is an attribute from a class and the whole thing above is a function that does this a number of times and then you get the full files list. For simplicity's sake, I removed most of the irrelevant parts.
I get no errors for now, but when I try to write all files to a new zip file I get an error. Here's the code:
with pyzipper.AESZipFile(test_name, "w", compression=pyzipper.ZIP_LZMA, encryption=pyzipper.WZ_AES) as zfile:
    zfile.setpassword(b"pass")
    for file in files:
        zfile.write(file)
I get a ValueError because of os.stat:
File "C:\Users\vulka\AppData\Local\Programs\Python\Python310\lib\site-packages\pyzipper\zipfile.py", line 820, in from_file
st = os.stat(filename)
ValueError: stat: embedded null character in path
[WHAT I TRIED]
So, I tried using mmap for this purpose but I don't think this can help me and if it can - then I have no idea how to make it work.
I also tried using fs.memoryfs.MemoryFS to temporarily create a virtual filesystem in memory to store all the files, then get them back, zip everything together and save the result to disk. Again, this failed. I got tons of different errors in my tests and, TBH, there's very little information out there on this fs method, so even if what I'm trying to do is possible, I couldn't figure it out.
P.S: I don't know if pyzipper (almost 1:1 zipfile with the addition of encryption) supports nested zip files at all. This could be the problem I'm facing, but if it doesn't, I'm open to any suggestions for a new approach. Also, I don't want to rely on 3rd party software, even if it is open source! (I'm talking about using 7zip to do all the archiving and encryption, even though it shouldn't even be possible to use it without saving the files to disk in the first place, which is the main thing I'm trying to avoid.)
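The traceback points at os.stat, which suggests that write() treats its argument as a filesystem path, while the list holds raw zip bytes. In the standard zipfile module, writestr() takes a member name plus the data itself, so in-memory archives can be nested without touching the disk. The sketch below uses plain zipfile so that it runs without extra packages; that pyzipper.AESZipFile accepts the same writestr(name, data) call is an assumption based on the question's own usage, and the helper and member names are hypothetical.

```python
import io
import zipfile


def make_inner_zip(member_name, data):
    """Build one small archive in memory and return its raw bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member_name, data)
    # The archive is complete only after the with block has closed it.
    return buffer.getvalue()


def bundle_archives(archives):
    """archives maps a member name (e.g. "first.zip") to zip bytes."""
    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, payload in archives.items():
            # writestr stores bytes under a name; write() would expect a path.
            zf.writestr(name, payload)
    return outer.getvalue()
```

This means the files list would need to keep a name alongside each blob, for example as (name, data) pairs, because the outer archive needs a member name for every entry.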

Get zip file from url with python3 request : make it more verbose

I try this to load a zip file from a url.
import requests
resp = requests.get('https://nlp.stanford.edu/data/glove.6B.zip')
I know the file is colossal, and while it downloads I cannot tell whether everything is going well or not.
(1) Is there a way to make the loading more verbose ?
(2) How do I know where data are loaded, and is there a relative path for it, which I can use for implementing the rest of my script ?
(3) How to nicely unzip ?
(4) How to either choose/set a file name or get the file name for the downloaded file ?
Is there a way to make the loading more verbose ?
If you want to download the file to disk and see how many bytes have been downloaded so far, you can use urllib.request.urlretrieve from the built-in module urllib.request. It accepts an optional reporthook. This should be a function that accepts 3 arguments; it is called once when the connection is established and once after each chunk is read, with:
number of chunks transferred so far
size of a chunk in bytes
total size, or -1 if unknown
from urllib.request import urlretrieve
def report(num, size, total):
    print(num*size, '/', total)
urlretrieve("http://www.example.com","index.html",reporthook=report)
This downloads www.example.com to the current working directory as index.html and reports progress by printing. Note that the fraction might be > 1, because the last chunk is usually smaller than the chunk size, and should be treated as an estimate.
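A slightly more readable variant of the hook can clamp the estimate at 100% and fall back to a byte count when the total size is not reported. This is a small sketch; format_progress is a hypothetical helper name, and only the three-argument reporthook signature comes from urlretrieve itself.

```python
def format_progress(block_num, block_size, total_size):
    """Turn the three reporthook arguments into a short progress string."""
    downloaded = block_num * block_size
    if total_size <= 0:
        # The server did not report a size, so only the byte count is known.
        return f"{downloaded} bytes"
    # Clamp, because the last block is usually only partially filled.
    percent = min(100.0, downloaded * 100.0 / total_size)
    return f"{percent:.1f}%"


def report(block_num, block_size, total_size):
    print(format_progress(block_num, block_size, total_size))
```

It would be passed the same way as above, as reporthook=report.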
EDIT: After the download of the zip file finishes, if you just want to unpack the whole archive, you can use shutil.unpack_archive from the built-in shutil module. If more fine-grained control is desired, you can use the built-in zipfile module; the PyMOTW3 entry for zipfile has examples such as listing the files inside a ZIP archive, reading a selected file from it, and reading the metadata of a file inside it.
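Both options can be sketched in a few lines. The function names list_members and unpack are hypothetical, and the paths are whatever was passed to urlretrieve as the target file name.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def list_members(zip_path):
    """Return the member names without extracting anything."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


def unpack(zip_path, target_dir):
    """Extract the whole archive and return the extracted files as relative paths."""
    shutil.unpack_archive(str(zip_path), str(target_dir), format="zip")
    root = Path(target_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())
```

For example, unpack("glove.6B.zip", "glove") would extract into a glove directory next to the script, assuming the archive was saved under that name.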

How to write file to memory filepath and read from memory filepath in Python?

An existing Python package requires a filepath as input parameter for a method to be able to parse the file from the filepath. I want to use this very specific Python package in a cloud environment, where I can't write files to the harddrive. I don't have direct control over the code in the existing Python package, and it's not easy to switch to another environment, where I would be able to write files to the harddrive. So I'm looking for a solution that is able to write a file to a memory filepath, and let the parser read directly from this memory filepath. Is this possible in Python? Or are there any other solutions?
Example Python code that works by using harddrive, which should be changed so that no harddrive is used:
temp_filepath = "./temp.txt"
with open(temp_filepath, "wb") as file:
    # The file is opened in binary mode, so the payload must be bytes.
    file.write(b"some binary data")
model = Model()
model.parse(temp_filepath)
Example Python code that uses memory filesystem to store file, but which does not let parser read file from memory filesystem:
from fs import open_fs
temp_filepath = "./temp.txt"
with open_fs('osfs://~/') as home_fs:
    home_fs.writetext(temp_filepath, "some binary data")
model = Model()
model.parse(temp_filepath)
You're probably looking for StringIO or BytesIO from io
import io
with io.BytesIO() as tmp:
    tmp.write(content)
    # to continue working, rewind file pointer
    tmp.seek(0)
    # work with tmp
pathlib may also be an advantage
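Note that BytesIO only helps if the parser accepts a file object. If it insists on a real path, one possible workaround is a temporary file in a RAM-backed directory. The sketch below assumes a Linux system where /dev/shm exists and is writable, and otherwise falls back to the default temporary directory; whether the cloud environment in question allows either is unknown. call_with_temp_path is a hypothetical helper, and consumer stands in for something like model.parse.

```python
import os
import tempfile


def call_with_temp_path(data, consumer, suffix=".txt"):
    """Write data to a short-lived file and hand its path to consumer."""
    # Assumption: /dev/shm is a RAM-backed tmpfs, as on most Linux systems.
    shm = "/dev/shm"
    ram_dir = shm if os.path.isdir(shm) and os.access(shm, os.W_OK) else None
    handle = tempfile.NamedTemporaryFile(dir=ram_dir, suffix=suffix, delete=False)
    try:
        handle.write(data)
        handle.close()  # flush and release before the consumer reopens it
        return consumer(handle.name)
    finally:
        handle.close()
        os.unlink(handle.name)  # remove the file even if the consumer fails
```

With the question's code this would presumably be called as call_with_temp_path(b"some binary data", model.parse).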

How to save many CSV files from URL

I have many CSV files that I need to get from a URL. I found this reference: How to read a CSV file from a URL with Python?
It does almost the thing I want, but I don't want to go through Python to read the CSV and then have to save it. I just want to directly save the CSV file from the URL to my hard drive.
I have no problem with for loops and cycling through my URLs. It is simply a matter of saving the CSV file.
If all you want to do is save a csv, then I wouldn't suggest using python at all. In fact this is more of a unix question. Making the assumption here that you're working on some kind of *nix system I would suggest just using wget. For instance:
wget http://someurl/path/to/file.csv
You can run this command directly from python like so:
import subprocess
# Map each URL to the local file name it should be saved under.
save_locations = {'http://someurl/path/to/file.csv': 'test.csv'}
for url, filename in save_locations.items():
    # Pass the arguments as a list so that URLs or names containing spaces are not split.
    process = subprocess.Popen(["wget", "-O", filename, url], stdout=subprocess.PIPE)
    output = process.communicate()[0]
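If wget is not available, the same result can be had with the standard library alone, copying the response straight to disk without parsing the CSV. This is a sketch with a hypothetical helper name, save_url; it does no retrying or error handling.

```python
import shutil
import urllib.request


def save_url(url, filename):
    """Stream the response body to filename without parsing it."""
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
    return filename
```

Looping over save_locations.items() and calling save_url(url, filename) would then replace the subprocess call.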

How to get a list of files from the archive (rar or zip) attached to the E-mail using Python?

How can I get a list of files from an archive (rar or zip) attached to an e-mail using Python? That is, I have an EML file. I do not need to unzip the files, just to get a list. In theory the attached file could be very large, and processing the extracted attachments could take a lot of time and resources.
Here's how to do that with the stdlib, getting the first attachment in a simple multi-part message stored as message.eml:
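import email
import io
import zipfile
# Open in binary mode so the MIME parts are parsed from raw bytes (Python 3).
with open('message.eml', 'rb') as f:
    msg = email.message_from_binary_file(f)
# Index 1 is the second MIME part, i.e. the first attachment after the body.
attachment = msg.get_payload(1)
# decode=True undoes the transfer encoding (e.g. base64) and returns bytes.
zipf = io.BytesIO(attachment.get_payload(decode=True))
with zipfile.ZipFile(zipf) as archive:
    filenames = archive.namelist()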
This will parse the whole MIME envelope, decode the whole attachment, and read the ZIP directory of that attachment… but at least it won't uncompress any of the files in the ZIP archive, so I suspect you won't actually have any performance problem to worry about.
This answer tells you how to get the file object (for a zip archive, use the ZipFile constructor to open the file, rather than the normal open() function). Then you can use ZipFile.namelist() to get the names of the archive members.
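For messages with several attachments, the parts can be walked instead of picking a fixed index. The following is a sketch for zip attachments only (the standard library has no rar reader). The helper names are hypothetical, and build_demo_message exists only to make the example self-contained.

```python
import email
import email.policy
import io
import zipfile
from email.message import EmailMessage


def list_zip_attachments(raw_message):
    """Map each .zip attachment's file name to the names of its members."""
    msg = email.message_from_bytes(raw_message, policy=email.policy.default)
    listing = {}
    for part in msg.iter_attachments():
        filename = part.get_filename() or ""
        if filename.lower().endswith(".zip"):
            payload = part.get_payload(decode=True)  # bytes, base64 undone
            with zipfile.ZipFile(io.BytesIO(payload)) as archive:
                # namelist() reads only the central directory, nothing is extracted.
                listing[filename] = archive.namelist()
    return listing


def build_demo_message():
    """Create a small message with one zip attachment, for illustration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("report.txt", "hello")
        archive.writestr("data/values.csv", "a,b\n1,2\n")
    msg = EmailMessage()
    msg["Subject"] = "demo"
    msg.set_content("See attachment.")
    msg.add_attachment(buffer.getvalue(), maintype="application",
                       subtype="zip", filename="files.zip")
    return msg.as_bytes()
```

For a file on disk, the bytes would come from open('message.eml', 'rb').read().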
