IOError: encoder zip not available ubuntu - python

I am getting the following error when I try to run my Python program, which uses PIL.
Generate_Dot.py:14: RuntimeWarning: the frombuffer defaults may change in a future release; for portability, change the call to read:
frombuffer(mode, size, data, 'raw', mode, 0, 1)
img = Image.frombuffer('L', size, data)
Traceback (most recent call last):
File "Generate_Dot.py", line 15, in <module>
img.save('image.png')
File "/home/kapil/python/lib/python2.7/site-packages/PIL-1.1.7-py2.7-linux-i686.egg/PIL/Image.py", line 1439, in save
save_handler(self, fp, filename)
File "/home/kapil/python/lib/python2.7/site-packages/PIL-1.1.7-py2.7-linux-i686.egg/PIL/PngImagePlugin.py", line 572, in _save
ImageFile._save(im, _idat(fp, chunk), [("zip", (0,0)+im.size, 0, rawmode)])
File "/home/kapil/python/lib/python2.7/site-packages/PIL-1.1.7-py2.7-linux-i686.egg/PIL/ImageFile.py", line 481, in _save
e = Image._getencoder(im.mode, e, a, im.encoderconfig)
File "/home/kapil/python/lib/python2.7/site-packages/PIL-1.1.7-py2.7-linux-i686.egg/PIL/Image.py", line 401, in _getencoder
raise IOError("encoder %s not available" % encoder_name)
IOError: encoder zip not available

PIL's native modules depend on third-party C libraries being installed on your system. In this case the 'zip' encoder, which PIL uses to write PNGs, is backed by zlib; the error means your PIL build was compiled without it. On Ubuntu you should be able to fix this by installing the zlib development headers with apt-get (the package is zlib1g-dev) and then rebuilding/reinstalling PIL so the encoder gets compiled in.
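A quick way to confirm the encoder is present after reinstalling (a minimal sketch; the output path is arbitrary):

from PIL import Image

im = Image.new('L', (1, 1))
try:
    im.save('/tmp/zlib_check.png')  # PNG saving goes through the "zip" (zlib) encoder
    print('zip encoder available')
except IOError as e:
    print(e)  # "encoder zip not available" means PIL was still built without zlib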

Related

BadRarFile when extracting single file using RarFile in Python

I need to extract a single file (~10 kB) from many very large RAR files (>1 GB). The code below shows a basic implementation of how I'm doing this.
from rarfile import RarFile
rar_file='D:\\File.rar'
file_of_interest='Folder 1/Subfolder 2/File.dat'
output_folder='D:/Output'
rardata = RarFile(rar_file)
rardata.extract(file_of_interest, output_folder)
rardata.close()
However, the extract instruction is returning the following error: rarfile.BadRarFile: Failed the read enough data: req=16384 got=52
When I open the file using WinRAR, I can extract the file successfully, so I'm sure the file isn't corrupted.
I've found some similar questions, but not a definite answer that worked for me.
Can someone help me to solve this error?
Additional info:
Windows 10 build 1909
Spyder 5.0.0
Python 3.8.1
Complete traceback of the error:
Traceback (most recent call last):
File "D:\Test\teste_rar_2.py", line 27, in <module>
rardata.extract(file_of_interest, output_folder)
File "C:\Users\bernard.kusel\AppData\Local\Continuum\anaconda3\lib\site-packages\rarfile.py", line 826, in extract
return self._extract_one(inf, path, pwd, True)
File "C:\Users\bernard.kusel\AppData\Local\Continuum\anaconda3\lib\site-packages\rarfile.py", line 912, in _extract_one
return self._make_file(info, dstfn, pwd, set_attrs)
File "C:\Users\bernard.kusel\AppData\Local\Continuum\anaconda3\lib\site-packages\rarfile.py", line 927, in _make_file
shutil.copyfileobj(src, dst)
File "C:\Users\bernard.kusel\AppData\Local\Continuum\anaconda3\lib\shutil.py", line 79, in copyfileobj
buf = fsrc.read(length)
File "C:\Users\bernard.kusel\AppData\Local\Continuum\anaconda3\lib\site-packages\rarfile.py", line 2197, in read
raise BadRarFile("Failed the read enough data: req=%d got=%d" % (orig, len(data)))
BadRarFile: Failed the read enough data: req=16384 got=52
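One thing worth checking here (this is an assumption on my part, not something established above): rarfile hands the actual decompression off to an external UnRAR tool, and a missing or outdated backend can surface as exactly this kind of short read. A minimal sketch that points rarfile at WinRAR's bundled UnRAR.exe before extracting (the path is hypothetical; adjust it to your installation):

import rarfile

# Hypothetical location -- point this at whatever UnRAR executable WinRAR installed.
rarfile.UNRAR_TOOL = r'C:\Program Files\WinRAR\UnRAR.exe'

rardata = rarfile.RarFile(r'D:\File.rar')
rardata.extract('Folder 1/Subfolder 2/File.dat', r'D:/Output')
rardata.close()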

Python Pillow Image to PDF and then merging memory issues

Goal:
Convert a finite number of .jpg files and merge them into one PDF file.
Expected result:
Files from the folder are successfully converted and merged into one PDF file at the specified location.
Problem:
When the size of the files exceeds a certain threshold, which in my tests was around 400 MB, the program crashes with the following message:
Traceback (most recent call last):
File "C:\Users\kaczk\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PIL\ImageFile.py", line 498, in _save
fh = fp.fileno()
io.UnsupportedOperation: fileno
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "MakePDF.py", line 10, in <module>
im1.save(pdf1_filename, "PDF" ,resolution=1000.0, save_all=True, append_images=imageList)
File "C:\Users\kaczk\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PIL\Image.py", line 2084, in save
save_handler(self, fp, filename)
File "C:\Users\kaczk\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PIL\PdfImagePlugin.py", line 46, in _save_all
_save(im, fp, filename, save_all=True)
File "C:\Users\kaczk\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PIL\PdfImagePlugin.py", line 175, in _save
Image.SAVE["JPEG"](im, op, filename)
File "C:\Users\kaczk\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PIL\JpegImagePlugin.py", line 770, in _save
ImageFile._save(im, fp, [("jpeg", (0, 0) + im.size, 0, rawmode)], bufsize)
File "C:\Users\kaczk\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PIL\ImageFile.py", line 513, in _save
fp.write(d)
MemoryError
After running the program while watching Task Manager, I noticed that the computer does indeed run out of RAM when executing it. Below is the code used.
import os
from PIL import Image

fileList = os.listdir(r'C:\location\of\photos\folder')
imageList = []
im1 = Image.open(os.path.join(r'C:\location\of\photos\folder', fileList[0]))
for file in fileList[1:]:
    imageList.append(Image.open(os.path.join(r'C:\location\of\photos\folder', file)))
pdf1_filename = r'C:\location\of\pdf\destination.pdf'
im1.save(pdf1_filename, "PDF", resolution=500.0, save_all=True, append_images=imageList)
Is there an easy mistake I am making here regarding memory usage? Is there a different module that would make the task easier while working with more and larger files? I will be very grateful for any help.
This question is quite old, but since I got here struggling with the same issue, here is an answer.
You simply have to close your images after using them:
im1.close()
for i in imageList:
    i.close()
This solved it for me.
PS: take a look at glob, it eases working with paths a lot.
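For completeness, a minimal sketch that folds the close() calls and glob together (the folder and output paths are the placeholders from the question, and all inputs are assumed to be .jpg files):

import glob
from PIL import Image

jpg_paths = sorted(glob.glob(r'C:\location\of\photos\folder\*.jpg'))
first_image = Image.open(jpg_paths[0])
rest = [Image.open(p) for p in jpg_paths[1:]]

first_image.save(r'C:\location\of\pdf\destination.pdf', "PDF",
                 resolution=500.0, save_all=True, append_images=rest)

# Release the file handles and decoded pixel data once the PDF is written.
first_image.close()
for im in rest:
    im.close()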

How can I read Minecraft .mca files so that I can extract individual blocks in Python?

I can't find a way of reading Minecraft world files that I could use in Python.
I've looked around the internet but found no tutorials, and only a few libraries that claim they can do this but never actually work.
from nbt import *
nbtfile = nbt.NBTFile("r.0.0.mca",'rb')
I expected this to work, but instead I got errors about the file not being compressed, or something of the sort.
Full error:
Traceback (most recent call last):
File "C:\Users\rober\Desktop\MinePy\MinecraftWorldReader.py", line 2, in <module>
nbtfile = nbt.NBTFile("r.0.0.mca",'rb')
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 628, in __init__
self.parse_file()
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 652, in parse_file
type = TAG_Byte(buffer=self.file)
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 99, in __init__
self._parse_buffer(buffer)
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 105, in _parse_buffer
self.value = self.fmt.unpack(buffer.read(self.fmt.size))[0]
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\gzip.py", line 276, in read
return self._buffer.read(size)
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\gzip.py", line 463, in read
if not self._read_gzip_header():
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\gzip.py", line 411, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'\x00\x00')
Use anvil-parser (install it with pip install anvil-parser).
Reading
import anvil
region = anvil.Region.from_file('r.0.0.mca')
# You can also provide the region file name instead of the object
chunk = anvil.Chunk.from_region(region, 0, 0)
# If `section` is not provided, will get it from the y coords
# and assume it's global
block = chunk.get_block(0, 0, 0)
print(block) # <Block(minecraft:air)>
print(block.id) # air
print(block.properties) # {}
https://pypi.org/project/anvil-parser/
According to this page, the .mca file is not simply an NBT file. It begins with an 8 KiB header that stores the offsets of the chunks within the region file and the timestamps of their last updates.
I recommend reading the official announcement and this page for more information.
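That header is also why the nbt call in the question fails with 'Not a gzipped file': the region file does not start with NBT data at all. A minimal sketch of pulling one chunk out by hand, based on the format described above (offsets are stored in 4 KiB sectors, and chunk payloads are usually zlib-compressed NBT):

import gzip
import struct
import zlib

with open('r.0.0.mca', 'rb') as f:
    header = f.read(8192)                        # 4 KiB of chunk locations + 4 KiB of timestamps
    entry = struct.unpack('>I', header[0:4])[0]  # location entry for chunk (0, 0)
    sector_offset, sector_count = entry >> 8, entry & 0xFF
    if sector_count == 0:
        raise SystemExit('chunk (0, 0) has not been generated in this region')
    f.seek(sector_offset * 4096)
    length, compression = struct.unpack('>IB', f.read(5))
    payload = f.read(length - 1)
    nbt_bytes = zlib.decompress(payload) if compression == 2 else gzip.decompress(payload)
    print(len(nbt_bytes), 'bytes of uncompressed NBT for chunk (0, 0)')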

Persisting a Large scipy.sparse.csr_matrix

I have a very large sparse scipy matrix. Attempting to use save_npz resulted in the following error:
>>> sp.save_npz('/projects/BIGmatrix.npz',W)
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/numpy/lib/npyio.py", line 716, in _savez
pickle_kwargs=pickle_kwargs)
File "/usr/local/lib/python3.5/dist-packages/numpy/lib/format.py", line 597, in write_array
array.tofile(fp)
OSError: 6257005295 requested and 3283815408 written
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/scipy/sparse/_matrix_io.py", line 78, in save_npz
np.savez_compressed(file, **arrays_dict)
File "/usr/local/lib/python3.5/dist-packages/numpy/lib/npyio.py", line 659, in savez_compressed
_savez(file, args, kwds, True)
File "/usr/local/lib/python3.5/dist-packages/numpy/lib/npyio.py", line 721, in _savez
raise IOError("Failed to write to %s: %s" % (tmpfile, exc))
OSError: Failed to write to /projects/BIGmatrix.npzg6ub_z3y-numpy.npy: 6257005295 requested and 3283815408 written
As such, I wanted to try persisting it to Postgres via psycopg2, but I haven't found a method of iterating over all the nonzeros so that I can persist them as rows in a table.
What is the best way to handle this task?
Save the matrix's defining attributes (data, indices, indptr and shape), and recreate the csr_matrix when loading:
from scipy import sparse
import numpy as np
a = np.zeros((1000, 2000))
a[np.random.randint(0, 1000, 100), np.random.randint(0, 2000, 100)] = np.random.randn(100)
b = sparse.csr_matrix(a)
np.savez("tmp", data=b.data, indices=b.indices, indptr=b.indptr, shape=np.array(b.shape))
f = np.load("tmp.npz")
b2 = sparse.csr_matrix((f["data"], f["indices"], f["indptr"]), shape=f["shape"])
(b != b2).sum()  # 0 means the reloaded matrix matches the original
It seems that the way things go is: when you invoke scipy.sparse.save_npz(), by default it saves as a compressed file; however, in order to do so it first creates a temporary uncompressed version of the target file that it then compresses down to the final result. This means that whatever drive you save to needs to be large enough to accommodate the uncompressed temp file, which in my case was 47 GB.
I re-tried the save on a larger drive and the process completed without incident.
Note: The compression can take quite a long time.
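As for the other half of the question, iterating over the nonzeros so they can be persisted as table rows: converting the matrix to COO format exposes parallel row/column/value arrays you can stream into any database. A minimal sketch (the table name and the psycopg2 call are placeholders, not from the original post):

from scipy import sparse

# Stand-in for the large matrix W from the question.
W = sparse.random(1000, 2000, density=0.001, format='csr')

coo = W.tocoo()  # COO format exposes parallel .row, .col and .data arrays
for i, j, value in zip(coo.row, coo.col, coo.data):
    # e.g. cur.execute("INSERT INTO big_matrix (i, j, value) VALUES (%s, %s, %s)",
    #                  (int(i), int(j), float(value)))
    print(int(i), int(j), float(value))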

OSError: raw write() returned invalid length when using print() in python

I'm using TensorFlow in Python to train a model to recognise images, but I'm getting the error below when trying to execute train.py from GitHub.
Traceback (most recent call last):
File "train.py", line 1023, in <module>
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
File "C:\Users\sande\Anaconda3\envs\tensorflow\lib\site-
packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train.py", line 766, in main
bottleneck_tensor)
File "train.py", line 393, in cache_bottlenecks
jpeg_data_tensor, bottleneck_tensor)
File "train.py", line 341, in get_or_create_bottleneck
bottleneck_tensor)
File "train.py", line 290, in create_bottleneck_file
print('Creating bottleneck at ' + bottleneck_path)
OSError: raw write() returned invalid length 112 (should have been between 0 and 56)
Below is the code for create_bottleneck_file()
def create_bottleneck_file(bottleneck_path, image_lists, label_name, index,
                           image_dir, category, sess, jpeg_data_tensor,
                           bottleneck_tensor):
    """Create a single bottleneck file."""
    print('Creating bottleneck at ' + bottleneck_path)
    image_path = get_image_path(image_lists, label_name, index,
                                image_dir, category)
    if not gfile.Exists(image_path):
        tf.logging.fatal('File does not exist %s', image_path)
    image_data = gfile.FastGFile(image_path, 'rb').read()
    try:
        bottleneck_values = run_bottleneck_on_image(
            sess, image_data, jpeg_data_tensor, bottleneck_tensor)
    except:
        raise RuntimeError('Error during processing file %s' % image_path)
    bottleneck_string = ','.join(str(x) for x in bottleneck_values)
    with open(bottleneck_path, 'w') as bottleneck_file:
        bottleneck_file.write(bottleneck_string)
I tried shortening the file names so that bottleneck_path would be a small value, but that did not work. I searched online for this error but did not find anything useful. Please let me know if you have a fix for this issue.
If, like me, you're unable to migrate to Python 3.6 or away from Windows, install the win_unicode_console package, import it, and add this line at the beginning of your script to enable it:
win_unicode_console.enable()
This issue appears to be unique to pre-3.6 Python, as the code responsible for handling console text output was rewritten in 3.6. This also means that we will most likely not see a fix for it in older versions.
Source: https://bugs.python.org/issue32245
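For completeness, the top of the affected script then looks like this (a minimal sketch):

import win_unicode_console

win_unicode_console.enable()  # patch the console streams before any print() calls

print('Creating bottleneck at C:\\some\\long\\path')  # prints without the raw write() OSError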
I think this is a bug in the stdout/stderr streams introduced by the November Creators Update; it happens in both powershell.exe and cmd.exe.
It seems to only happen on Windows 10 Version 1709 (OS Build 16299.64). My guess is that it is Unicode related (the output size is twice the expected length).
A (very) quick and dirty fix is to only output ASCII on your console:
mystring.encode("ascii", "replace").decode("ascii")
https://github.com/Microsoft/vscode/issues/39149#issuecomment-347260954
Adding more to AMSAntiago's answer: you could run win_unicode_console.enable(), but instead of adding it to every file, you can enable it for every Python invocation (see the docs). That works for me.
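A minimal sketch of that per-invocation approach, assuming a usercustomize.py on your user site-packages path (the file location is an assumption on my part; check the win_unicode_console docs for the exact mechanism they recommend):

# usercustomize.py -- imported automatically at interpreter startup when user site-packages is enabled
import win_unicode_console

win_unicode_console.enable()  # every Python invocation now gets patched console streams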
