How to compress with 7zip instead of zip, code changing - python

I have code that compresses every file in a specific folder with zip, but I want to compress with 7zip instead. How can I do that?
This is what I have so far:
for date in dict_date:  # zip each folder; the archive gets the folder's name
    with ZipFile(os.path.join(src, '{0}.7z'.format(date)), 'w') as myzip:
        for subFolder in dict_date[date]:
            for fil in os.listdir(os.path.join(src, date, subFolder)):
                if not fil.endswith('.7z'):
                    myzip.write(os.path.join(src, date, subFolder, fil))

You can try the command-line method:
import subprocess
subprocess.call(['7z', 'a', filename+'.7z', filename])
or, for all files in the folder:
subprocess.call(['7z', 'a', filename+'.7z', "*.*"])

There doesn't appear to be a good Python module for creating a 7z archive (despite what the documentation says, py7zlib can only read them).
A workaround is to download the 7z SDK (http://www.7-zip.org/sdk.html) and use the 7zr executables that come with it via the subprocess module. 7z is in the public domain so you can carry this standalone program around without restriction.
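As a minimal sketch of that workaround, the helpers below look for a 7-Zip command-line executable on the PATH and call it through subprocess. The function names (find_7z, build_7z_command, compress_with_7z) are hypothetical, and the candidate executable names 7z, 7zr and 7za are an assumption about what is installed.

```python
import shutil
import subprocess


def find_7z(candidates=("7z", "7zr", "7za")):
    """Return the path of the first 7-Zip executable found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_7z_command(exe, archive, sources):
    """Build the argument list for '<exe> a <archive> <sources...>'."""
    return [exe, "a", archive, *sources]


def compress_with_7z(archive, sources):
    """Add the given files or folders to a .7z archive; raise if 7-Zip fails."""
    exe = find_7z()
    if exe is None:
        raise FileNotFoundError("no 7-Zip executable (7z, 7zr, 7za) found on PATH")
    # check=True raises CalledProcessError on a non-zero exit code
    subprocess.run(build_7z_command(exe, archive, sources), check=True)
```

In the loop from the question, the ZipFile block could then be replaced by one call per date, for example compress_with_7z(os.path.join(src, date + '.7z'), [os.path.join(src, date)]).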

Related

How can I unpack multi-part archives (zip/rar) in Python?

I have a 2 GB archive (preferably .zip or .rar) split into parts (let's assume 100 parts x 20 MB), and I am trying to find a way to unpack it properly. I started with a .zip archive; I had files like test.zip, test.z01, test.z02...test.z99, etc. When I merge them in Python like this:
for zipName in zips:
    with open(os.path.join(path_to_zip_file, "test.zip"), "ab") as f:
        with open(os.path.join(path_to_zip_file, zipName), "rb") as z:
            f.write(z.read())
and then, after the merge, unpack it like this:
with zipfile.ZipFile(os.path.join(path_to_zip_file, "test.zip"), "r") as zipObj:
    zipObj.extractall(path_to_zip_file)
I get errors like:
test.zip file isn't zip file.
So then I tried with a .rar archive. I tried to unpack just the first file to see if my code would intelligently look for and pick up the remaining archive fragments, but it did not. So again I merged the .rar files (just like in the .zip case), and then tried to unpack it by using patoolib:
patoolib.extract_archive("test.rar", outdir="path here")
When I do that, I get errors like:
patoolib.util.PatoolError: could not find an executable program to extract format rar; candidates are (rar,unrar,7z)
After some work I figured out that these merged files are corrupted (I copied one and tried to unpack it normally on Windows using WinRAR, and encountered some problems). So I tried other ways to merge, for example using cat (cat test.part.* > test.rar), but those didn't help.
How can I merge and then unpack these archive files properly in Python?
Calling 7z out of python
Rename the .zip to .zip.001, the .z01 to .zip.002, and so on.
Then call 7z on the .001 file (7z x test.zip.001):
import subprocess
cmd = ['7z', 'x', 'test.zip.001']
sp = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
CAT
cat test.zip* > test.zip should also work, but not always, imho. I tried it on a single file and it worked, but it failed with subfolders. Maintaining the right order is mandatory.
Testing:
7z -v1m a test.zip 12MFile
cat test.zip* > test.zip
7z t test.zip
>> Everything is Ok
Can't check with "official" WinRAR (does this even still exist?!) nor WinZIP Files.
Merge File in Python
If you want to stay in Python, this works too (again, for my 7z test files):
import shutil
import glob
with open('output_file.zip', 'wb') as wfd:
    for f in sorted(glob.glob('test.zip.*')):  # all parts, sorted so they are joined in order
        with open(f, 'rb') as fd:
            shutil.copyfileobj(fd, wfd)  # concatenate
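Since the order of the parts is what makes or breaks the merge, here is a small self-contained sketch that sorts the parts by their numeric suffix and then checks the result with zipfile.is_zipfile. It assumes the parts are raw byte slices named like test.zip.001 (as 7z -v produces); merge_parts and part_number are hypothetical names, and the demo data is made up.

```python
import glob
import os
import shutil
import tempfile
import zipfile


def part_number(path):
    """Sort key: the numeric suffix of a name such as 'test.zip.002'."""
    return int(path.rsplit(".", 1)[1])


def merge_parts(pattern, output_path):
    """Concatenate all files matching pattern, in numeric order, into output_path."""
    parts = sorted(glob.glob(pattern), key=part_number)
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return parts


# Demo: build a small zip, cut it into raw 100-byte pieces, merge them back.
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "original.zip")
with zipfile.ZipFile(original, "w") as zf:
    zf.writestr("hello.txt", "hello " * 200)

with open(original, "rb") as f:
    data = f.read()
for i in range(0, len(data), 100):
    part_path = os.path.join(workdir, "test.zip.%03d" % (i // 100 + 1))
    with open(part_path, "wb") as part:
        part.write(data[i:i + 100])

merged = os.path.join(workdir, "merged.zip")
merge_parts(os.path.join(workdir, "test.zip.*"), merged)
print(zipfile.is_zipfile(merged))
```

Sorting on the integer suffix rather than on the file name keeps the order right even when the suffixes are not zero-padded to the same width.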
Further remarks
pyunpack (Python frontend) with patool (Python backend), plus an installed unrar or p7zip-rar (7z with the non-free rar support) on Linux or 7z on Windows, can handle zip and rar (and many more formats) in Python.
7z x has a -t flag to explicitly mark the input as a split archive (this may help if the file is not named .001), e.g. 7z x -trar.split or 7z x -tzip.split or something similar.

Python 7zip (py7zr) not registering file type

I have a script to extract all the contents of an .exe file; however, the register_archive_format and register_unpack_format functions don't seem to work as expected. Here's a short version of my script:
import os
import re
import py7zr
import wget
import shutil
import zipfile
versions = ["1.10", "2.0", "2.1pre"]
shutil.register_archive_format('exe', py7zr.pack_7zarchive, description="exe archive")
shutil.register_unpack_format('exe', ['.exe'], py7zr.unpack_7zarchive)
print("Supported formats:")
formats = shutil.get_unpack_formats()
print(formats, "\n")
with py7zr.SevenZipFile(f"C:/Users/Me/Documents/Builds/{version}/{filePath}", 'r') as zip_ref:
    folderName = re.search("^([^_]+)(-installer)([^.]*)", fileNameOnly)
    folderName = folderName[1] + folderName[3]
    #zip_ref.extractall(f"C:/Users/Me/Documents/Builds/{version}/{folderName}")
    shutil.unpack_archive(zip_ref, f"C:/Users/Me/Documents/Builds/{version}/{folderName}")
The code prints the list of supported formats from shutil.get_unpack_formats() and seems to correctly show the exe file registered. But when the code reaches the shutil.unpack_archive() function it throws py7zr.exceptions.Bad7zFile: not a 7z file.
Is there a step I'm missing in order to extract from an exe file? I know I can extract from the exe as I do that manually through the context menu of the exe file.
py7zr supports only plain 7z files. Such a file is conventionally named like foo.7z (the extension is not mandatory), and its binary content must start with the '7z' magic bytes.
You probably want to extract a self-extracting archive (https://en.wikipedia.org/wiki/Self-extracting_archive), which starts with magic bytes that mark it as an executable. py7zr does not support that.
That is why py7zr says it is not a (plain) 7z file.
Please see details at https://py7zr.readthedocs.io/en/latest/py7zr.html
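To see which case a given file falls into before handing it to py7zr, the leading bytes can be checked directly. This is a rough sketch using only the standard library: sniff_container is a hypothetical helper, and it only looks at the very start of the file, so it cannot tell whether an executable has a 7z payload appended to it.

```python
SEVENZIP_MAGIC = b"7z\xbc\xaf\x27\x1c"  # signature at the start of a plain .7z file
EXE_MAGIC = b"MZ"  # signature at the start of a Windows executable


def sniff_container(path):
    """Classify a file as '7z', 'exe' or 'unknown' from its first bytes."""
    with open(path, "rb") as f:
        head = f.read(len(SEVENZIP_MAGIC))
    if head == SEVENZIP_MAGIC:
        return "7z"
    if head.startswith(EXE_MAGIC):
        return "exe"
    return "unknown"
```

A result of 'exe' for the installer in the question would match the Bad7zFile error: the file is an executable wrapper, not a plain 7z archive.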

Python ZipFile giving different namelist than unzipping utility

I have a bunch of timestamped .jpgs in a zip file, and when I open that zip file using Python's zipfile module, I see three files:
>>> cameraZip = zipfile.ZipFile(zipPath, 'r')
>>> cameraZip.namelist()
['20131108_200152.jpg', '20131108_203158.jpg', '20131108_205521.jpg']
When I unpack the file using Mac OSX's default .zip unexpander, I get 371 files, from '20131101_000159.jpg' up to '20131108_193152.jpg'.
Unzipping this file gives the same result as the .zip unexpander:
$ unzip 2013.11.zip
extracting: 20131101_000159.jpg
extracting: 20131101_003156.jpg
...
extracting: 20131108_190155.jpg
extracting: 20131108_193152.jpg
Anybody have any idea what's going on?
Most likely the problem is in the zip central directory record, which wasn't correctly flushed when the zip file was created. Python looks for the central directory (I guess), while other implementations process the local file headers and find all of them.
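One rough way to test that guess is to compare the number of entries zipfile reads from the central directory with the number of local file header signatures in the raw bytes. The sketch below does this on an in-memory archive; count_entries is a hypothetical helper, it loads the whole file into memory, and a plain signature scan can overcount if the same four bytes happen to occur inside file data.

```python
import io
import zipfile

LOCAL_HEADER_SIG = b"PK\x03\x04"  # local file header signature


def count_entries(data):
    """Return (entries in the central directory, local file header signatures)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        central = len(zf.namelist())
    local = data.count(LOCAL_HEADER_SIG)
    return central, local


# Demo on a healthy in-memory archive: both counts should agree.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name in ("a.jpg", "b.jpg", "c.jpg"):
        zf.writestr(name, "x" * 10)
print(count_entries(buf.getvalue()))
```

For the archive in the question, a central count of 3 next to a local count in the hundreds would support the explanation above.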

Compare archiwum.rar content and extracted data from .rar in the folder on Windows 7

Does anyone know how to compare the number and size of the files in archiwum.rar with its extracted content in the folder?
The reason I want to do this is that the server I'm working on was restarted a couple of times during extraction, and I'm not sure whether all the files were extracted correctly.
The .rar files are more than 100 GB each and the server is not that fast.
Any ideas?
P.S. If the solution is some code instead of a standalone program, my preference is Python.
Thanks
In Python you can use the rarfile module. Its usage is similar to the built-in zipfile module.
import rarfile
import os.path
extracted_dir_name = "samples/sample"  # directory with the extracted files
archive = rarfile.RarFile("samples/sample.rar", "r")
# list file information
for info in archive.infolist():
    print(info.filename, info.date_time, info.file_size)
    # compare with the extracted file here
    extracted_file = os.path.join(extracted_dir_name, info.filename)
    if info.file_size != os.path.getsize(extracted_file):
        print("Different size!")
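Because an interrupted extraction can also leave files missing altogether, a slightly more defensive variant collects missing files and size mismatches separately. This is only a sketch: compare_listing is a hypothetical helper that takes plain (name, size) pairs, which are assumed to come from the archive listing, for example from the filename and file_size attributes used above.

```python
import os


def compare_listing(entries, extracted_dir):
    """Compare (name, size) pairs from an archive listing with files on disk.

    Returns two lists: names missing from extracted_dir, and names whose
    size on disk differs from the size recorded in the archive.
    """
    missing, wrong_size = [], []
    for name, size in entries:
        path = os.path.join(extracted_dir, name)
        if not os.path.isfile(path):
            missing.append(name)
        elif os.path.getsize(path) != size:
            wrong_size.append(name)
    return missing, wrong_size
```

Two empty lists mean that every listed file exists with the expected size. Directory entries should be left out of the list of pairs, since the check only looks for regular files.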

Python ftplib - uploading multiple files?

I've googled but I could only find how to upload one file... and I'm trying to upload all files from local directory to remote ftp directory. Any ideas how to achieve this?
With a loop?
Edit: in the general case, uploading only the files would look like this:
import os
# ftp is assumed to be an already connected, logged-in ftplib.FTP instance
for root, dirs, files in os.walk('path/to/local/dir'):
    for fname in files:
        full_fname = os.path.join(root, fname)
        with open(full_fname, 'rb') as fh:
            ftp.storbinary('STOR remote/dir/' + fname, fh)
Obviously, you need to look out for name collisions if you're just preserving file names like this.
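One way to sidestep the collision problem is to upload only the files that sit directly in one directory, without walking into subdirectories. The sketch below does that; upload_directory is a hypothetical helper, and its ftp argument is assumed to behave like a connected, logged-in ftplib.FTP object (only storbinary is called on it).

```python
import os


def upload_directory(ftp, local_dir, remote_dir):
    """Upload every regular file directly inside local_dir to remote_dir.

    ftp is expected to behave like a connected, logged-in ftplib.FTP object.
    Returns the list of uploaded file names.
    """
    uploaded = []
    for name in sorted(os.listdir(local_dir)):
        local_path = os.path.join(local_dir, name)
        if not os.path.isfile(local_path):
            continue  # skip subdirectories
        with open(local_path, "rb") as fh:
            # FTP paths use forward slashes regardless of the local OS
            ftp.storbinary("STOR %s/%s" % (remote_dir.rstrip("/"), name), fh)
        uploaded.append(name)
    return uploaded
```

Within a single directory the file names are unique, so nothing gets overwritten by a same-named file from another folder.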
Look at Python-scriptlines required to make upload-files from JSON-Call and next FTPlib-operation: why some uploads, but others not?
Although a different starting position than your question, in the Answer of that first url you see an example construction to upload by ftplib a json-file plus an xml-file: look at scriptline 024 and further.
In the second url you see some other aspects related to upload of more files.
Also applicable for other file-types than json and xml, obviously with a different 'entry' before the 2 final sections which define and realize the FTP_Upload-function.
Create an FTP batch file (with a list of the files that you need to transfer). Use Python to execute ftp.exe with the "-s" option and pass in the batch file.
This is kludgy, but apparently ftplib does not accept multiple files in its STOR command.
Here is a sample FTP batch file:
*
OPEN inetxxx
myuser mypasswd
binary
prompt off
cd ~/my_reg/cronjobs/k_load/incoming
mput *.csv
bye
If the above contents were in a file called "abc.ftp", then my ftp command would be
ftp -s:abc.ftp
Hope that helps.
