Python error: "That compression method is not supported" - python

I am trying to decompress some .zip or .rar archives, and I am getting the error "That compression method is not supported". All the files in this directory are .zip files.
import rarfile
import sys
import os, zipfile
from tkinter import *
from tkinter import filedialog
from tkinter import messagebox
ZipExtension='.zip'
RarExtension='.rar'
#filesZIP="..\directory"
try:
    os.chdir(filesZIP) # change directory from working dir to dir with files
except:
    messagebox.showerror("Error","The folder with the archives was not selected! Please run the app again and select the folder.")
    sys.exit()
for item in os.listdir(filesZIP): # loop through items in dir
    if item.endswith(ZipExtension): # check for ".zip" extension
        file_name = os.path.abspath(item) # get full path of files
        zip_ref = zipfile.ZipFile(file_name) # create zipfile object
        zip_ref.extractall(filesZIP) # extract file to dir
        zip_ref.close() # close file
for item in os.listdir(filesZIP):
    if item.endswith(RarExtension):
        file_name = os.path.abspath(item)
        rar_ref = rarfile.RarFile(file_name)
        rar_ref.extractall()
        rar_ref.close()
messagebox.showinfo("Information",'Successful!')
The problem is that sometimes it works, but in some cases, like the one above, it gives me that error, even though the files are all .zip files with no password.

Background
By design, zip archives support a lot of different compression methods. Python's support for these methods varies depending on the version of the zipfile module you are running.
With Python 2.x, I see zipfile supports only deflate and store
zipfile.ZIP_STORED
The numeric constant for an uncompressed archive member.
zipfile.ZIP_DEFLATED
The numeric constant for the usual ZIP compression method. This requires the zlib module. No other compression methods are currently supported.
while with Python 3, zipfile supports a few more
zipfile.ZIP_STORED
The numeric constant for an uncompressed archive member.
zipfile.ZIP_DEFLATED
The numeric constant for the usual ZIP compression method. This requires the zlib module.
zipfile.ZIP_BZIP2
The numeric constant for the BZIP2 compression method. This requires the bz2 module.
New in version 3.3.
zipfile.ZIP_LZMA
The numeric constant for the LZMA compression method. This requires the lzma module
What Compression Methods are being used?
To see if this is your issue, you first need to see what compression method is actually being used in your zip files.
Let me work through an example to see how that works.
First create a zip file using bzip2 compression
zip -Z bzip2 /tmp/try.zip /tmp/in.txt
Let's check what unzip can tell us about the compression method it actually used.
$ unzip -lv try.zip
Archive:  try.zip
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
  387776  BZip2     30986  92% 2022-09-20 14:11 f3d1fbaf  in.txt
--------          -------  ---                            -------
  387776            30986  92%                            1 file
In unzip the Method column says it is using Bzip2 compression. I'm sure that WinZip has an equivalent report.
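The same check can be done from Python itself, since zipfile reads the central directory without needing to decompress anything. This is a minimal sketch for Python 3.3 or later; the helper name list_compression_methods and the METHOD_NAMES table are illustrative and not part of the zipfile API.

```python
import zipfile

# Human-readable names for the method ids discussed above (illustrative table,
# not part of the zipfile API; any other id is reported as "unknown").
METHOD_NAMES = {
    zipfile.ZIP_STORED: "stored",
    zipfile.ZIP_DEFLATED: "deflated",
    zipfile.ZIP_BZIP2: "bzip2",
    zipfile.ZIP_LZMA: "lzma",
}


def list_compression_methods(path):
    """Return (member name, method id, method name) for every archive member."""
    with zipfile.ZipFile(path) as zf:
        return [
            (info.filename, info.compress_type,
             METHOD_NAMES.get(info.compress_type, "unknown"))
            for info in zf.infolist()
        ]
```

Running list_compression_methods('/tmp/try.zip') on the archive created above should report the bzip2 method for in.txt. Any member reported as "unknown" is a likely cause of the error.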
Unzip with Python 2.7
Next try uncompressing this zip file with Python 2.7. I'll use the code below with both Python 2 and Python 3.
import zipfile
zip_ref = zipfile.ZipFile('/tmp/try.zip')
if zip_ref.testzip() is None:
    print("zip file is ok")
zip_ref.close()
First Python 2.7. The result matches what you are seeing, which confirms that zipfile with Python 2.7 doesn't support bzip2 compression.
$ python2.7 /tmp/z.py
Traceback (most recent call last):
  File "/tmp/z.py", line 4, in <module>
    if zip_ref.testzip() is None:
  File "/usr/lib/python2.7/zipfile.py", line 921, in testzip
    with self.open(zinfo.filename, "r") as f:
  File "/usr/lib/python2.7/zipfile.py", line 1033, in open
    close_fileobj=should_close)
  File "/usr/lib/python2.7/zipfile.py", line 553, in __init__
    raise NotImplementedError("compression type %d (%s)" % (self._compress_type, descr))
NotImplementedError: compression type 12 (bzip2)
Unzip with Python 3.10
Next with Python 3.10.
$ python3.10 /tmp/z.py
zip file is ok
As expected, all is fine in this instance: zipfile with Python 3 does support bzip2 compression.
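If a folder mixes archives that your interpreter can and cannot decode, it may help to extract them one by one and collect the failures instead of stopping at the first error. This is a rough sketch for Python 3, not a drop-in replacement for the script in the question; the function name extract_all_zips is made up for illustration.

```python
import os
import zipfile


def extract_all_zips(folder):
    """Extract every .zip in folder; return {archive name: error message} for failures."""
    failures = {}
    # The listing is taken once, before extraction adds new files to the folder
    for item in sorted(os.listdir(folder)):
        if not item.lower().endswith(".zip"):
            continue
        path = os.path.join(folder, item)
        try:
            with zipfile.ZipFile(path) as zf:
                zf.extractall(folder)
        except NotImplementedError as exc:
            # Raised when a member uses a method this zipfile build cannot decode
            failures[item] = "unsupported compression: %s" % exc
        except zipfile.BadZipFile as exc:
            # Raised when the file is not a valid zip archive at all
            failures[item] = "not a valid zip: %s" % exc
    return failures
```

The returned dictionary tells you which archives need another tool (or a newer Python) rather than leaving you with a single opaque error.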

Related

Transform zarr directory storage to zip storage

Code:
store = zarr.ZipStore("/mnt/test.zip", "r")
Problem description:
Hi, sorry to bother you. I found this statement in the official Zarr documentation about ZipStore:
Alternatively, use a DirectoryStore when writing the data, then manually Zip the directory and use the Zip file for subsequent reads.
I am trying to convert a Zarr dataset from DirectoryStore format to ZipStore format. I used the zip command provided by Linux:
zip -r test.zip test.zarr
Here test.zarr is a DirectoryStore dataset containing three groups. However, when I try to open the archive with the code above, I get the error below:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/eddie/miniconda3/envs/train/lib/python3.8/site-packages/zarr/storage.py", line 1445, in __init__
    self.zf = zipfile.ZipFile(path, mode=mode, compression=compression,
  File "/home/eddie/miniconda3/envs/train/lib/python3.8/zipfile.py", line 1190, in __init__
    _check_compression(compression)
  File "/home/eddie/miniconda3/envs/train/lib/python3.8/zipfile.py", line 686, in _check_compression
    raise NotImplementedError("That compression method is not supported")
NotImplementedError: That compression method is not supported
I wonder if my compression method is wrong, and whether there are workarounds to convert directory storage to zip storage or some other DB format, because as the number of groups grows, the directory storage contains so many nodes that it is inconvenient to transport. Thanks in advance.
Version and installation information
Value of zarr.__version__: 2.8.1
Value of numcodecs.__version__: 0.7.3
Version of Python interpreter: 3.8.0
Operating system (Linux/Windows/Mac): linux ubuntu 18.04
How Zarr was installed: pip
Because zarr already uses compression, there is no need to use compression when creating the zip archive. I.e., you can use zip -r -0 to store files in the zip archive only, without compression.
Also, you might need to be careful about the paths that get stored within the zip archive. E.g., if I have a zarr hierarchy in some directory "/path/to/foo" and I want to store this into a zip file at "/path/to/bar.zip" I would do:
cd /path/to/foo
zip -r0 /path/to/bar.zip .
This ensures that the paths that get stored within the zip archive are relative to the original root directory.
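After zipping with the -r0 option, you can open the archive with store = zarr.ZipStore("/mnt/test.zip"), and you won't get the error any more. Note that in the original call the "r" is passed positionally, and in zarr 2.x the second positional parameter of ZipStore is compression rather than mode, which is consistent with the traceback failing in _check_compression. If you need read-only mode, pass it by keyword: mode="r".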
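If you would rather build the archive from Python than from the shell, the same two points (no compression, paths relative to the root of the hierarchy) can be sketched with the standard zipfile module. The helper name zip_directory_stored is hypothetical, and zip_path is assumed to lie outside src_dir so the archive does not try to include itself.

```python
import os
import zipfile


def zip_directory_stored(src_dir, zip_path):
    """Write src_dir into zip_path uncompressed, with paths relative to src_dir."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # arcname relative to src_dir, mirroring `cd src_dir; zip -r0 out.zip .`
                zf.write(full, arcname=os.path.relpath(full, src_dir))
```

os.walk also visits dot-files such as .zgroup and .zarray, so the metadata files end up in the archive alongside the chunks.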

Python 7zip (py7zr) not registering file type

I have a script to extract all the contents of an .exe file however, the register_archive_format and register_unpack_format functions don't seem to work as expected. Here's a short version of my script:
import os
import re
import py7zr
import wget
import shutil
import zipfile
versions = ["1.10", "2.0", "2.1pre"]
shutil.register_archive_format('exe', py7zr.pack_7zarchive, description="exe archive")
shutil.register_unpack_format('exe', ['.exe'], py7zr.unpack_7zarchive)
print("Supported formats:")
formats = shutil.get_unpack_formats()
print(formats, "\n")
with py7zr.SevenZipFile(f"C:/Users/Me/Documents/Builds/{version}/{filePath}", 'r') as zip_ref:
    folderName = re.search("^([^_]+)(-installer)([^.]*)", fileNameOnly)
    folderName = folderName[1] + folderName[3]
    #zip_ref.extractall(f"C:/Users/Me/Documents/Builds/{version}/{folderName}")
    shutil.unpack_archive(zip_ref, f"C:/Users/Me/Documents/Builds/{version}/{folderName}")
The code prints the list of supported formats from shutil.get_unpack_formats() and seems to correctly show the exe file registered. But when the code reaches the shutil.unpack_archive() function it throws py7zr.exceptions.Bad7zFile: not a 7z file.
Is there a step I'm missing in order to extract from an exe file? I know I can extract from the exe as I do that manually through the context menu of the exe file.
py7zr supports only plain 7z files. The .7z extension is not mandatory, but the binary must start with the '7z' magic bytes.
You are probably trying to extract a self-extracting archive (https://en.wikipedia.org/wiki/Self-extracting_archive), which starts with magic bytes that mark it as an executable. That is not supported by the tool.
That is why py7zr says it is not a (plain) 7z file.
Please see details at https://py7zr.readthedocs.io/en/latest/py7zr.html
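To see which case you are in before calling py7zr, you can look at the first bytes of the file yourself. This is only a sketch using the standard library; the function name sniff_container is made up, and it recognises just the two signatures relevant here (the 7z signature and the "MZ" header of Windows executables). As far as I know py7zr also documents an is_7zfile() helper for the first check.

```python
SEVENZ_MAGIC = b"7z\xbc\xaf\x27\x1c"  # signature at the start of a plain 7z archive
EXE_MAGIC = b"MZ"  # DOS/PE header at the start of a Windows executable


def sniff_container(path):
    """Return '7z', 'exe', or 'unknown' based on the file's leading bytes."""
    with open(path, "rb") as f:
        head = f.read(len(SEVENZ_MAGIC))
    if head.startswith(SEVENZ_MAGIC):
        return "7z"
    if head.startswith(EXE_MAGIC):
        return "exe"
    return "unknown"
```

A self-extracting installer will come back as 'exe', which matches the Bad7zFile error above.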

Python: Assign a compression level to tarfile

My question is a follow up to this one.
I would like to know how I can modify the following code so that I can assign a compression level:
import os
import tarfile
home = '//global//scratch//chamar//parsed_data//batch0'
backup_dir = '//global//scratch//chamar//parsed_data//'
home_dirs = [ name for name in os.listdir(home) if os.path.isdir(os.path.join(home, name)) ]
for directory in home_dirs:
    full_dir = os.path.join(home, directory)
    tar = tarfile.open(os.path.join(backup_dir, directory+'.tar.gz'), 'w:gz')
    tar.add(full_dir, arcname=directory)
    tar.close()
Basically, what the code does is that I loop through each directory in batch0 and compress each directory (where in each directory there are 6000+ files) and create a tar.gz compressed file for each directory in //global//scratch//chamar//parsed_data//.
I think the default compression level is 9, but it takes a lot of time to compress. I don't need a lot of compression; level 5 would be enough. How can I modify the above code to set a compression level?
There is a compresslevel keyword argument you can pass to open() (no need to use gzopen() directly):
tar = tarfile.open(filename, "w:gz", compresslevel=5)
From the gzip documentation, compresslevel can be a number between 0 and 9 (9 is the default): 0 means no compression, 1 is the fastest and least compressed, and 9 is the slowest and most compressed.
[See also: tarfile documentation]
There is a compresslevel option in the gzopen method. The line below should replace the one with the tarfile.open call in your example:
tar = tarfile.TarFile.gzopen(os.path.join(backup_dir, directory+'.tar.gz'), mode='w', compresslevel=5)
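Putting the first suggestion into a small reusable function might look like the sketch below. The name make_targz and its default level are my own choices for illustration; the only library feature relied on is the compresslevel keyword of tarfile.open() in "w:gz" mode.

```python
import os
import tarfile


def make_targz(src_dir, out_path, level=5):
    """Pack src_dir into a gzip-compressed tarball at the given compression level."""
    with tarfile.open(out_path, "w:gz", compresslevel=level) as tar:
        # Store the directory under its own name rather than its full path
        tar.add(src_dir, arcname=os.path.basename(os.path.normpath(src_dir)))
    return out_path
```

In the loop from the question this would be called as make_targz(full_dir, os.path.join(backup_dir, directory + '.tar.gz'), level=5).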

Python zipfile module doesn't compress files

I have a problem with compression in Python.
I know I should call the ZIP_DEFLATED method when writing to make the zip file compressed, but it does not work for me.
I have 3 PDF documents in the C:\zip directory.
When I run the following code it works just fine:
import os,sys
from zipfile import ZipFile, ZIP_DEFLATED
list = os.listdir(r'C:\zip')
file = ZipFile('test.zip','w')
for item in list:
    file.write(item)
file.close()
It makes the test.zip file without the compression.
When I change the fourth row to this:
file = ZipFile('test.zip','w', compression = ZIP_DEFLATED)
It also makes the test.zip file without the compression.
I also tried to change the write method to give it the compress_type argument:
file.write(item, compress_type = ZIP_DEFLATED)
But that doesn't work either.
I use Python version 2.7.4 with Win7.
I tried the code on another computer (same circumstances, Win7 and Python 2.7.4), and it made the test.zip file compressed just like it should.
I know the zlib module should be available, when I run this:
import zlib
It doesn't return an error. Also, if there were something wrong with the zlib module, the code at the top should have returned an error too, so I suspect that zlib isn't the problem.
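By default the zipfile module only stores data. To compress it, you can do this:
import zipfile
try:
    import zlib
    mode = zipfile.ZIP_DEFLATED
except ImportError:
    # zlib is unavailable, so fall back to storing without compression
    mode = zipfile.ZIP_STORED
zip = zipfile.ZipFile('zipfilename', 'w', mode)
zip.write(item)  # item is the path of a file to add
zip.close()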
In case you get here as I did, I'll add something.
If you use ZipInfo objects, they always override the compression method specified while creating the ZipFile, which is then useless.
So either you set their compression method (no parameter on the constructor, you must set the attribute) or specify the compression method when calling write (or writestr).
import io
import zlib
from zipfile import ZipFile, ZipInfo, ZIP_DEFLATED
def write_things():
    zip_buffer = io.BytesIO()
    with ZipFile(file = zip_buffer, mode = "w", compression = ZIP_DEFLATED) as zipper:
        # Get some data to write (get_file_data is a placeholder for your own code)
        fname, content, zip_ts = get_file_data()
        file_object = ZipInfo(fname, zip_ts)
        zipper.writestr(file_object, content) # Surprise, no compression
        # This is required to get compression
        # zipper.writestr(file_object, content, compress_type = ZIP_DEFLATED)
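You can check this behaviour yourself by writing the same payload in three ways and reading back the method recorded for each member. This is a small self-contained sketch for Python 3; the function name methods_used and the member names are invented for the demonstration.

```python
import io
from zipfile import ZipFile, ZipInfo, ZIP_DEFLATED


def methods_used(payload=b"a" * 10000):
    """Write the same payload three ways and return each member's compress_type."""
    buf = io.BytesIO()
    with ZipFile(buf, "w", compression=ZIP_DEFLATED) as zf:
        # Plain name: the member inherits the archive-level compression
        zf.writestr("by_name.txt", payload)
        # ZipInfo object: its own compress_type is used instead
        zf.writestr(ZipInfo("by_info.txt"), payload)
        # ZipInfo with the attribute set explicitly
        fixed = ZipInfo("by_info_fixed.txt")
        fixed.compress_type = ZIP_DEFLATED
        zf.writestr(fixed, payload)
    with ZipFile(buf) as zf:
        return {info.filename: info.compress_type for info in zf.infolist()}
```

Comparing the values against ZIP_STORED and ZIP_DEFLATED shows which of the three members actually got compressed.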

How to check type of files without extensions? [duplicate]

This question already has answers here:
How to find the mime type of a file in python?
I have a folder full of files and they don't have an extension. How can I check file types? I want to check the file type and change the filename accordingly. Let's assume a function filetype(x) returns a file type like png. I want to do this:
files = os.listdir(".")
for f in files:
    os.rename(f, f + '.' + filetype(f))
How do I do this?
There are Python libraries that can recognize files based on their content (usually a header / magic number) and that don't rely on the file name or extension.
If you're addressing many different file types, you can use python-magic. That's just a Python binding for the well-established magic library. This has a good reputation and (small endorsement) in the limited use I've made of it, it has been solid.
There are also libraries for more specialized file types. For example, the Python standard library has the imghdr module that does the same thing just for image file types (note that imghdr was deprecated in Python 3.11 and removed in 3.13).
If you need dependency-free (pure Python) file type checking, see filetype.
The Python Magic library provides the functionality you need.
You can install the library with pip install python-magic and use it as follows:
>>> import magic
>>> magic.from_file('iceland.jpg')
'JPEG image data, JFIF standard 1.01'
>>> magic.from_file('iceland.jpg', mime=True)
'image/jpeg'
>>> magic.from_file('greenland.png')
'PNG image data, 600 x 1000, 8-bit colormap, non-interlaced'
>>> magic.from_file('greenland.png', mime=True)
'image/png'
The Python code in this case is calling to libmagic beneath the hood, which is the same library used by the *NIX file command. Thus, this does the same thing as the subprocess/shell-based answers, but without that overhead.
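To get from a detected MIME type to the rename the question asks for, the standard mimetypes module can map the type to an extension. This is a sketch under a few assumptions: the function name rename_with_extension is made up, and the detector is passed in as a callable so that any of the approaches on this page can be plugged in.

```python
import mimetypes
import os


def rename_with_extension(path, detect_mime):
    """Append an extension guessed from the MIME type returned by detect_mime(path)."""
    mime = detect_mime(path)
    ext = mimetypes.guess_extension(mime)  # None when the type is not known
    if ext is None:
        return path  # leave the file untouched when no extension is known
    new_path = path + ext
    os.rename(path, new_path)
    return new_path
```

With python-magic installed, the detector could be lambda p: magic.from_file(p, mime=True), as in the session above.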
On Unix and Linux there is the file command to guess file types. There's even a Windows port.
From the man page:
File tests each argument in an attempt to classify it. There are three
sets of tests, performed in this order: filesystem tests, magic number
tests, and language tests. The first test that succeeds causes the
file type to be printed.
You would need to run the file command with the subprocess module and then parse the results to figure out an extension.
edit: Ignore my answer. Use Chris Johnson's answer instead.
In the case of images, you can use the imghdr module.
>>> import imghdr
>>> imghdr.what('8e5d7e9d873e2a9db0e31f9dfc11cf47') # You can pass a file name or a file object as first param. See doc for optional 2nd param.
'png'
Python 2 imghdr doc
Python 3 imghdr doc
import subprocess as sub
# Pass the command as a list so it runs without shell=True
p = sub.Popen(['file', 'yourfile.txt'], stdout=sub.PIPE, stderr=sub.PIPE)
output, errors = p.communicate()
print(output)
As Steven pointed out, subprocess is the way to go. You can capture the command output as shown above.
You can also install the official file binding for Python, a library called file-magic (it does not use ctypes, like python-magic).
It's available on PyPI as file-magic and on Debian as python-magic. For me this library is the best to use since it's available on PyPI and on Debian (and probably other distributions), making the process of deploying your software easier.
I've blogged about how to use it, also.
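With the newer subprocess API, you can now use the following code (*nix only solution):
import subprocess
import shlex
filename = 'your_file'
# Quote the file name so that names containing spaces survive shlex.split
cmd = shlex.split('file --mime-type {0}'.format(shlex.quote(filename)))
result = subprocess.check_output(cmd)
# check_output returns bytes in Python 3, so decode the last field
mime_type = result.split()[-1].decode()
print(mime_type)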
You can also use this code (pure Python, checks the first 3 bytes of the file header), without installing any package. It is updated here for Python 3 and wrapped in a function so that the return statements are valid:
import os
def get_mime_type(pathfile):
    # MEDIA_ROOT is assumed to be defined elsewhere, e.g. in your project settings
    full_path = os.path.join(MEDIA_ROOT, pathfile)
    try:
        with open(full_path, "rb") as f:
            header_byte = f.read(3).hex()  # first 3 bytes as lowercase hex
    except IOError:
        return "Incorrect Request :( !!!"
    if header_byte == '474946':
        return "image/gif"
    elif header_byte == '89504e':
        return "image/png"
    elif header_byte == 'ffd8ff':
        return "image/jpeg"
    else:
        return "binary file"
This only works on Linux, but using the "sh" Python module you can simply call any shell command:
https://pypi.org/project/sh/
pip install sh
import sh
sh.file("/root/file")
Output:
/root/file: ASCII text
This code lists all files of a given type in a given folder, recursively:
import magic
import glob
from os.path import isfile
ROOT_DIR = 'backup'
WANTED_EXTENSION = 'sqlite'
for filename in glob.iglob(ROOT_DIR + '/**', recursive=True):
    if isfile(filename):
        # Despite its name, this holds the MIME type reported by libmagic
        extension = magic.from_file(filename, mime = True)
        if WANTED_EXTENSION in extension:
            print(filename)
https://gist.github.com/izmcm/6a5d6fa8d4ec65fd9851a1c06c8946ac
