I have the following piece of Python code to convert a PDF to JPG:
from wand.color import Color
from wand.image import Image as Img

with Img(filename=pdfName, resolution=300) as pic:
    pic.compression_quality = self.compressionQuality
    pic.background_color = Color("white")
    pic.alpha_channel = 'remove'
    pic.save(filename=output)
My problem is that with a large PDF file (10 MB) I get the following error:
File "/home/nathan/PycharmProjects/oc_for_maarch/worker.py", line 44, in <module>
launch(args)
File "/home/nathan/PycharmProjects/oc_for_maarch/src/main.py", line 105, in launch
q = process(args, path + file, Log, Separator, Config, Image, Ocr, Locale, WebService, q)
File "/home/nathan/PycharmProjects/oc_for_maarch/src/process/OCForMaarch.py", line 48, in process
Image.pdf_to_jpg(file + '[0]')
File "/home/nathan/PycharmProjects/oc_for_maarch/src/classes/Images.py", line 36, in pdf_to_jpg
self.save_img_with_wand(pdfName, self.jpgName)
File "/home/nathan/PycharmProjects/oc_for_maarch/src/classes/Images.py", line 46, in save_img_with_wand
with Img(filename=pdfName, resolution=300) as pic:
File "/home/nathan/Documents/OpenCV/lib/python3.7/site-packages/wand/image.py", line 6406, in __init__
self.read(filename=filename, resolution=resolution)
File "/home/nathan/Documents/OpenCV/lib/python3.7/site-packages/wand/image.py", line 6799, in read
raise WandRuntimeError(msg)
wand.exceptions.WandRuntimeError: MagickReadImage returns false, but did not raise ImageMagick exception. This can occurs when a delegate is missing, or returns EXIT_SUCCESS without generating a raster.
I checked around online, and from what I've seen the problem is usually related to Ghostscript, but it is installed.
I have the problem on Debian 10 and Ubuntu 19.04, using Python 3.7.
EDIT: if I set the resolution to 100 instead of 300, the issue goes away.
When you rasterize at a high density, you can produce a very large image from your PDF, so it sounds like you may be running out of RAM. If so, you need to edit your ImageMagick policy.xml file to allow more memory or map space. See policy.xml at https://imagemagick.org/script/resources.php. It controls your resources, which you can view with the command-line command:
convert -list resource
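For reference, the resource entries in policy.xml look like the fragment below. The values shown are only illustrative; raise them to fit your machine, and note the file's location varies by install (e.g. /etc/ImageMagick-6/policy.xml on Debian).

```xml
<!-- Illustrative resource limits; these lines live inside <policymap> -->
<policy domain="resource" name="memory" value="2GiB"/>
<policy domain="resource" name="map" value="4GiB"/>
<policy domain="resource" name="disk" value="8GiB"/>
```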
The below error is thrown when trying to read the image URL referenced below. (Note, I can't even upload the image to SO because it throws an error when I try to upload it.)
https://s3.amazonaws.com/comicgeeks/comics/covers/large-7441962.jpg
image = imread('https://s3.amazonaws.com/comicgeeks/comics/covers/large-7441962.jpg', as_gray=True)
This is the stack trace.
Traceback (most recent call last):
....
return imread(image_or_path, as_gray=True)
File "skimage/io/_io.py", line 48, in imread
img = call_plugin('imread', fname, plugin=plugin, **plugin_args)
File "skimage/io/manage_plugins.py", line 209, in call_plugin
return func(*args, **kwargs)
File "skimage/io/_plugins/imageio_plugin.py", line 10, in imread
return np.asarray(imageio_imread(*args, **kwargs))
File "imageio/__init__.py", line 86, in imread
return imread_v2(uri, format=format, **kwargs)
File "imageio/v2.py", line 159, in imread
with imopen(uri, "ri", plugin=format) as file:
File "imageio/core/imopen.py", line 333, in imopen
raise err_type(err_msg)
ValueError: Could not find a backend to open `/var/folders/82/rky4yjcx75n1zskhy5570v0m0000gn/T/tmpy9xg7dvb.jpg` with iomode `ri`.
After looking into this further I believe the image was originally a TIFF file that was just renamed to a .jpg file manually, but I'm not sure. If I download the file and try to open it with Photoshop I get the following message.
Could not open “large-7441962.jpeg” because an unknown or invalid JPEG marker type is found.
If I simply change the extension to .tiff, it will not open either; it states that it is an invalid TIFF file.
The only way I can open it in Photoshop is to open it with Preview.app first and save a copy of the image as a .tiff file; then it opens in Photoshop.
This is an issue with a potentially large number of images so re-saving them one-by-one is not an option.
Are there any possible ways to re-save this file when this error is thrown? Or somehow figure out how to handle it even though imread() is failing?
I was able to work around this by using the following:
import requests
from PIL import Image
from skimage.io import imread

try:
    image = imread(url, as_gray=True)
    return image
except Exception:
    image = Image.open(requests.get(url, stream=True).raw)
    return image
It is worth noting, however, that the fallback request through PIL is significantly slower.
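One wrinkle with the fallback above is that the two branches return different types (a NumPy array vs. a PIL Image). A sketch of normalizing the PIL branch to a grayscale float array, roughly matching imread(..., as_gray=True) (the in-memory TIFF below stands in for the mislabeled download; PIL's luma weights differ slightly from skimage's):

```python
import io

import numpy as np
from PIL import Image

def to_gray_array(raw_bytes):
    """Decode image bytes with Pillow (which sniffs the real format) and
    return a grayscale float array in [0, 1], close to skimage's as_gray=True."""
    img = Image.open(io.BytesIO(raw_bytes)).convert("L")
    return np.asarray(img, dtype=np.float64) / 255.0

# Demo with an in-memory TIFF, standing in for the mislabeled download:
buf = io.BytesIO()
Image.new("RGB", (4, 3), (255, 0, 0)).save(buf, format="TIFF")
arr = to_gray_array(buf.getvalue())
print(arr.shape)  # (3, 4)
```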
Python 3.8
h5py 2.10.0
Windows 10
I have found that when the file is opened in any mode other than "r", attributes are not accessible and an error is raised.
_casa.h5 is the HDF file which contains numerous links to external files.
"/149901/20118/VRM_DATA" is the path to a group within one of the external files.
This works:
# open file in read only mode
hfile = h5py.File("..\casa\_data\_casa.h5", "r")
hfile["/149901/20118/VRM_DATA"].attrs
Out[21]: <Attributes of HDF5 object at 1660502855488>
hfile.close()
This does not work:
# Open file in Read/write, file must exist mode
hfile = h5py.File("..\casa\_data\_casa.h5", "r+")
hfile["/149901/20118/VRM_DATA"].attrs
Traceback (most recent call last):
File "C:\Users\HIAPRC\Miniconda3\envs\py38\lib\site-packages\IPython\core\interactiveshell.py", line 3418, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
hfile["/149901/20118/VRM_DATA"].attrs
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "C:\Users\HIAPRC\Miniconda3\envs\py38\lib\site-packages\h5py\_hl\group.py", line 264, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py\h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (unable to open external file, external link file name = '.\sub_files\casa_2021-01-13_10-03-51.h5')"
I consulted the documentation and Googled around and did not find any answer to this.
Is this a result of design or a bug?
If this is by design then I guess I will have to edit attributes by opening the external file directly and make the changes there.
[EDIT] - HDFView 3.1.1 has no problem opening the _casa.h5 file, editing the attributes of groups in external files, and saving those edits.
Thanks in advance.
I pulled code from other SO answers into a single example. The code below creates 3 files, each with a single dataset with an attribute. It then creates another HDF5 file with ExternalLinks to the previously created files. The link objects also have attributes. All 4 files are then reopened in 'r' mode and attributes are accessed and printed. Maybe this will help you diagnose your problem. See below. Note: I am running Python 3.8.3 with h5py 2.10.0 on Windows 7 (Anaconda distribution to be precise).
import h5py
import numpy as np

for fcnt in range(1,4,1):
    fname = 'file' + str(fcnt) + '.h5'
    arr = np.random.random(50).reshape(10,5)
    with h5py.File(fname,'w') as h5fw:
        h5fw.create_dataset('data_'+str(fcnt), data=arr)
        h5fw['data_'+str(fcnt)].attrs['ds_attr'] = 'attribute '+str(fcnt)

with h5py.File('SO_65705770.h5', mode='w') as h5fw:
    for fcnt in range(1,4,1):
        h5name = 'file' + str(fcnt) + '.h5'
        h5fw['link'+str(fcnt)] = h5py.ExternalLink(h5name, '/')
        h5fw['link'+str(fcnt)].attrs['link_attr'] = 'attr '+str(fcnt)

for fcnt in range(1,4,1):
    fname = 'file' + str(fcnt) + '.h5'
    print(fname)
    with h5py.File(fname,'r') as h5fr:
        print( h5fr['data_'+str(fcnt)].attrs['ds_attr'] )

with h5py.File('SO_65705770.h5', mode='r') as h5fr:
    for fcnt in range(1,4,1):
        print('file', fcnt, ":")
        print('link attr:', h5fr['link'+str(fcnt)].attrs['link_attr'] )
        print('linked ds attr:', h5fr['link'+str(fcnt)]['data_'+str(fcnt)].attrs['ds_attr'] )
The h5py documentation on external links was either not clear or I didn't understand it.
Anyway, it turned out that I had not assigned the external links properly.
Why it worked in one mode and not another is beyond me. If the way I did the external link assignment had failed for all modes, I probably would have figured out the cause sooner.
EDIT: What I was doing wrong and how to do it correctly to get what I wanted...
External File and what I want it to look like in the file containing the links.
What I did wrong.
with h5py.File("file_with_links.h5", mode="w") as h5fw:
    h5fw["/901/20111/VRM_DATA"] = h5py.ExternalLink(h5name, "/901/20111/VRM_DATA")
What I should have done:
with h5py.File("file_with_links.h5", mode="w") as h5fw:
    h5fw["/901/20111"] = h5py.ExternalLink(h5name, "/901/20111")
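A self-contained sketch of that parent-group linking, showing the attributes of a group inside the external file staying reachable in "r+" mode (file and group names here are illustrative, not the original _casa.h5 layout):

```python
import h5py

# Create an external file with a nested group carrying an attribute.
with h5py.File("ext_file.h5", "w") as f:
    f.create_group("/901/20111/VRM_DATA").attrs["who"] = "external"

# Link the parent group rather than the deep target path.
with h5py.File("file_with_links.h5", "w") as h5fw:
    h5fw["/901/20111"] = h5py.ExternalLink("ext_file.h5", "/901/20111")

# Reopen in read/write mode and read the attribute through the link.
with h5py.File("file_with_links.h5", "r+") as h5fr:
    val = dict(h5fr["/901/20111/VRM_DATA"].attrs)
    print(val)  # {'who': 'external'}
```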
How can you use Flow-guided video completion (FGVC) for a personal file?
How to do this is not explained in the various official sources for those who would like to use FGVC freely from the Google Colab platform (https://colab.research.google.com/drive/1pb6FjWdwq_q445rG2NP0dubw7LKNUkqc?usp=sharing).
As a test, I uploaded a video to Google Drive (on the same account from which I was running Google Colab's scripts), split into frames and packed into a .zip archive called "demo1.zip".
I then ran the first script in the sequence, called "Prepare environment", enabled sharing of the video via a public link, and copied the link into the second script (immediately after "wget --quiet"); in the first "rm" entry I put "demo1.zip", to match the name of my video file.
I proceeded like this after reading the description just above the run button of the second script: "We show a demo on a 15-frames sequence. To process your own data, simply upload the sequence and specify the path."
Running the second script also succeeds, and my video file is loaded.
I then reach the fourth (and last) script, which processes the content through an AI to obtain the final product with an enlarged field of view (FOV => larger aspect ratio).
After a few seconds of running, the process ends with an error:
File "video_completion.py", line 613, in <module>
main(args)
File "video_completion.py", line 576, in main
video_completion_sphere(args)
File "video_completion.py", line 383, in video_completion_sphere
RAFT_model = initialize_RAFT(args)
File "video_completion.py", line 78, in initialize_RAFT
model.load_state_dict(torch.load(args.model))
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 594, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 853, in _load
result = unpickler.load()
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 845, in persistent_load
load_tensor(data_type, size, key, _maybe_decode_ascii(location))
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 834, in load_tensor
loaded_storages[key] = restore_location(storage, location)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 175, in default_restore_location
result = fn(storage, location)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 151, in _cuda_deserialize
device = validate_cuda_device(location)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 135, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA'
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
What's going wrong in the execution? Is there a way to fix it and let the process finish in Google Colab?
Let me know!
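The RuntimeError itself points at a workaround: when no GPU is available, pass map_location to torch.load so the checkpoint's CUDA tensors are remapped to the CPU (in Colab, the other route is enabling a GPU under Runtime > Change runtime type). A minimal sketch of the call; "demo_ckpt.pth" is an illustrative file, not the actual FGVC/RAFT checkpoint:

```python
import torch

# Save a tiny checkpoint, then reload it forcing all tensors onto the CPU,
# as the error message suggests for machines where CUDA is unavailable.
torch.save({"w": torch.zeros(2, 2)}, "demo_ckpt.pth")
state = torch.load("demo_ckpt.pth", map_location=torch.device("cpu"))
print(state["w"].device)  # cpu
```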
I'm using Python TensorFlow to train a model to recognise images, but I'm getting the error below when trying to execute train.py from GitHub:
Traceback (most recent call last):
File "train.py", line 1023, in <module>
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
File "C:\Users\sande\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train.py", line 766, in main
bottleneck_tensor)
File "train.py", line 393, in cache_bottlenecks
jpeg_data_tensor, bottleneck_tensor)
File "train.py", line 341, in get_or_create_bottleneck
bottleneck_tensor)
File "train.py", line 290, in create_bottleneck_file
print('Creating bottleneck at ' + bottleneck_path)
OSError: raw write() returned invalid length 112 (should have been between 0 and 56)
Below is the code for create_bottleneck_file()
def create_bottleneck_file(bottleneck_path, image_lists, label_name, index,
                           image_dir, category, sess, jpeg_data_tensor,
                           bottleneck_tensor):
  """Create a single bottleneck file."""
  print('Creating bottleneck at ' + bottleneck_path)
  image_path = get_image_path(image_lists, label_name, index,
                              image_dir, category)
  if not gfile.Exists(image_path):
    tf.logging.fatal('File does not exist %s', image_path)
  image_data = gfile.FastGFile(image_path, 'rb').read()
  try:
    bottleneck_values = run_bottleneck_on_image(
        sess, image_data, jpeg_data_tensor, bottleneck_tensor)
  except:
    raise RuntimeError('Error during processing file %s' % image_path)
  bottleneck_string = ','.join(str(x) for x in bottleneck_values)
  with open(bottleneck_path, 'w') as bottleneck_file:
    bottleneck_file.write(bottleneck_string)
I tried shortening the file names so that bottleneck_path would be a small value, but that did not work. I searched online for this error but did not find anything useful. Please let me know if you have a fix for this issue.
If you're unable to migrate to 3.6 or away from Windows like me, install the win_unicode_console package, import it, and add this line at the beginning of your script to enable it:
win_unicode_console.enable()
This issue appears to be generally unique to pre-3.6 Python as the code responsible for handling text output was rewritten for this latest version. This also means that we will most likely not see a fix coming for this issue.
Source: https://bugs.python.org/issue32245
I think this is a bug in the stdout/stderr streams introduced by November's Creators Update; it happens in both powershell.exe and cmd.exe.
It seems to only happen on Windows 10 Version 1709 (OS Build 16299.64). My guess is that it is Unicode-related (the output size is twice the expected length).
A (very) quick and dirty fix is to only output ASCII on your console:
mystring.encode("ascii", "replace").decode("ascii")
https://github.com/Microsoft/vscode/issues/39149#issuecomment-347260954
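For example, encoding with errors="replace" substitutes non-ASCII characters instead of raising a UnicodeEncodeError:

```python
# Sketch of the ASCII-only fallback: non-ASCII characters become "?".
mystring = "café ☕"
safe = mystring.encode("ascii", "replace").decode("ascii")
print(safe)  # caf? ?
```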
Adding to @AMSAntiago's answer: you could run win_unicode_console.enable(), but instead of adding it to every file, you can enable it for every Python invocation (see the docs). That works for me.
TL;DR: I'm trying to take a TIFF, resize it, then save it. However, it returns an error. This works fine if I change the saved file type to png or jpg.
System: Windows 7
Tried using both Python 3.4 and 2.7.
Code:
from PIL import Image

try:  # test file exists
    im = Image.open(r"c:\temp\file.tif")
except:
    print("Error opening image")

multiply = 5  # how much bigger
processing = tuple([multiply*x for x in im.size])  # maths
saved = r"c:\temp\biggerfile.tif"  # save location

imB = im.resize(processing)  # resizing
imB.save(saved)  # saving
I need to resize a TIFF because I'm using tesseract-ocr, and resizing the image to get a better output. The program seems to work best with a TIFF.
The error I receive is:
_TIFFVSetField: c:\temp\biggerfile.tif: Bad value 2 for "ExtraSamples" tag.
Traceback (most recent call last):
File "step1.py", line 15, in <module>
imB.save(saved)
File "C:\Python34\lib\site-packages\PIL\Image.py", line 1684, in save
save_handler(self, fp, filename)
File "C:\Python34\lib\site-packages\PIL\TiffImagePlugin.py", line 1185, in _save
e = Image._getencoder(im.mode, 'libtiff', a, im.encoderconfig)
File "C:\Python34\lib\site-packages\PIL\Image.py", line 430, in _getencoder
return encoder(mode, *args + extra)
RuntimeError: Error setting from dictionary
Thanks!
Try to install libtiff
http://gnuwin32.sourceforge.net/packages/tiff.htm
File "C:\Python34\lib\site-packages\PIL\TiffImagePlugin.py", line 1185, in _save
e = Image._getencoder(im.mode, 'libtiff', a, im.encoderconfig)
Looks like that's the error that is holding you up. It's trying to access libtiff and you don't have it installed so it's failing.
Had the same issue when using PIL to combine multiple images into one and adding a label.
I could fix it easily by converting the .tif file to a .png file in MS Paint (pls don't hate me for using MS :D). The quality of the final merged image was not reduced.
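The Paint round-trip can also be scripted with PIL itself, re-encoding the TIFF as PNG. A sketch using in-memory bytes (stand-ins for files on disk), so it makes no assumptions about paths:

```python
import io

from PIL import Image

def tiff_to_png_bytes(tiff_bytes):
    """Re-encode TIFF bytes as PNG, mirroring the manual Paint conversion."""
    im = Image.open(io.BytesIO(tiff_bytes))
    out = io.BytesIO()
    im.save(out, format="PNG")
    return out.getvalue()

# Demo on an in-memory TIFF standing in for a file on disk:
buf = io.BytesIO()
Image.new("RGB", (8, 8), (0, 128, 255)).save(buf, format="TIFF")
png_format = Image.open(io.BytesIO(tiff_to_png_bytes(buf.getvalue()))).format
print(png_format)  # PNG
```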