Python: SpooledTemporaryFile suffix not working - python

I want to write an image using opencv to a temporary file, get the path of that temporary file and pass that path to a function.
import cv2 as cv
from tempfile import NamedTemporaryFile, SpooledTemporaryFile
img = create_my_awesome_image()
with NamedTemporaryFile(suffix=".png") as temp:
    print(temp.name)
    cv.imwrite(temp.name, img)  # this one sparks joy
with SpooledTemporaryFile(max_size=1000000, suffix=".png") as temp:
    print(temp.name)
    cv.imwrite(temp.name, img)  # this one does not
The first print outputs C:\Users\FLORIA~1\AppData\Local\Temp\tmpl2i6nc47.png, while the second prints None.
Using NamedTemporaryFile works perfectly fine. However, because the second print prints None, I cannot use the SpooledTemporaryFile together with opencv. Any ideas why the suffix argument of SpooledTemporaryFile is ignored?

The problem is that a spooled file (such as a SpooledTemporaryFile) is held in memory until it grows past max_size, so it doesn't exist on the disk and doesn't have a name. The suffix argument is not ignored; it is only passed on to the on-disk file that is created if the data is later rolled over.
However, note that cv2.imwrite() takes a file name as an argument, meaning that it handles the file opening itself and doesn't support spooled files.
The variable img already contains the decoded image data in memory, so there is nothing else for you to do: just call cv2.imwrite() when you want to save it to the disk. If you want to use a temporary file, it has to be a NamedTemporaryFile.
If you want to handle an encoded image format in memory, such as jpg or png, you can use cv2.imencode() for that purpose, as in this answer.
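The difference in name can be checked with the standard library alone. The following is a minimal sketch: the helper describe_temp_files and the payload bytes are made up for illustration, and it only inspects the two file objects rather than calling OpenCV.

```python
from tempfile import NamedTemporaryFile, SpooledTemporaryFile


def describe_temp_files(payload: bytes):
    """Return (named_path, spooled_name, spooled_data) for the same payload."""
    with NamedTemporaryFile(suffix=".png") as named:
        named.write(payload)
        named_path = named.name  # a real path on disk, ending in .png

    with SpooledTemporaryFile(max_size=1_000_000, suffix=".png") as spooled:
        spooled.write(payload)       # stays in memory while under max_size
        spooled_name = spooled.name  # None: nothing exists on disk yet
        spooled.seek(0)
        spooled_data = spooled.read()  # still usable as a file object

    return named_path, spooled_name, spooled_data


if __name__ == "__main__":
    path, name, data = describe_temp_files(b"not really a png")
    print(path)
    print(name)
    print(len(data))
```

The spooled file is still useful for APIs that accept a file object; it just cannot be handed to functions that insist on a path.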

Related

Cannot save multiple files with PIL save method

I have modified a vk4 converter to allow for the conversion of several .vk4 files into .jpg image files. When run, IDLE does not give me an error, but it only manages to convert one file before ending the process. I believe the issue is that image.save() only seems to affect a single file and I have been unsuccessful in looping that command to extend to all other files in the directory.
Code:
import numpy as np
from PIL import Image
import vk4extract
import os
os.chdir(r'path\to\directory')
root = ('.\\')
vkimages = os.listdir(root)
for img in vkimages:
    if (img.endswith('.vk4')):
        with open(img, 'rb') as in_file:
            offsets = vk4extract.extract_offsets(in_file)
            rgb_dict = vk4extract.extract_color_data(offsets, 'peak', in_file)
        rgb_data = rgb_dict['data']
        height = rgb_dict['height']
        width = rgb_dict['width']
        rgb_matrix = np.reshape(rgb_data, (height, width, 3))
        image = Image.fromarray(rgb_matrix, 'RGB')
        image.save('sample.jpeg', 'JPEG')
How do I prevent the converted files from being overwritten while using the PIL module?
Thank you.
It is saving every file, but since you always pass the same name (image.save('sample.jpeg', 'JPEG')), each save overwrites the previous one and only the last image survives. You need to give every file a different name. There are several ways of doing it. One is adding the index when looping using enumerate():
for i, img in enumerate(vkimages):
and then using the i on the name of the file when saving:
image.save(f'sample_{i}.jpeg', 'JPEG')
Another way is to use the original filename and replace the extension. From your code, it looks like the files are .vk4 files. So another possibility is to save with the same name but replacing .vk4 with .jpeg:
image.save(img.replace('.vk4', '.jpeg'), 'JPEG')
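One caveat with str.replace() is that it changes every occurrence of '.vk4' in the name, not only the extension. As a hedged alternative, the sketch below derives the output name with pathlib; the helper names jpeg_name_for and plan_outputs are hypothetical and not part of the original code.

```python
from pathlib import Path


def jpeg_name_for(vk4_name: str) -> str:
    """Map 'scan01.vk4' to 'scan01.jpeg', changing only the final extension."""
    return str(Path(vk4_name).with_suffix(".jpeg"))


def plan_outputs(filenames):
    """Pair each .vk4 input with its own output name; other files are skipped."""
    return {name: jpeg_name_for(name) for name in filenames if name.endswith(".vk4")}


if __name__ == "__main__":
    print(plan_outputs(["a.vk4", "b.vk4", "notes.txt"]))
```

Inside the loop, the save call would then become image.save(jpeg_name_for(img), 'JPEG').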

skimage imread from temporary file

I'm trying to skimage.io.imread() an image (say a tiff file, for concreteness) that was previously written to a tempfile.TemporaryFile(). However, skimage complains by saying
ValueError: Cannot determine type of file b'<_io.BufferedRandom name=6>'
I am doing this because another program writes the image to standard output.
I collect it with subprocess.check_output and write it to the temporary file, thus avoiding saving the image to disk.
Does anyone know how to achieve this, or has got a better idea on how to pipe an image from stdout into a python image, ultimately to be treated as a numpy.ndarray?
A solution is the following:
with NamedTemporaryFile() as f:
    # write the captured image bytes to f and flush them before reading
    skimage.io.imread(f.name, plugin="tifffile")
Alternatively, the freeimage plugin can be used in place of tifffile.
Earlier I was passing the file object, but imread actually wants a filename.
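The write-then-read-by-name pattern can be wrapped in a small helper. This is only a sketch under assumptions: read_via_temp_path and file_size are hypothetical names, and a stand-in reader is used so the example runs without skimage. It uses delete=False and removes the file manually so that the reader can open the path after the file has been closed.

```python
import os
from tempfile import NamedTemporaryFile


def read_via_temp_path(data: bytes, reader, suffix: str = ".tif"):
    """Write data to a named temporary file and return reader(path)."""
    # delete=False lets us close the file before the reader opens it by name;
    # the file is removed manually once the reader has finished.
    tmp = NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.write(data)
        tmp.close()  # closing flushes the bytes to disk
        return reader(tmp.name)
    finally:
        tmp.close()  # harmless if already closed
        os.remove(tmp.name)


def file_size(path):
    """Stand-in reader for the demo: returns the number of bytes at path."""
    with open(path, "rb") as fh:
        return len(fh.read())


if __name__ == "__main__":
    print(read_via_temp_path(b"12345", file_size))
```

With skimage installed, the bytes collected from subprocess.check_output could be passed as data and skimage.io.imread as the reader.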

having cv2.imread reading images from file objects or memory-stream-like data (here non-extracted tar)

I have a .tar file containing several hundreds of pictures (.png). I need to process them via opencv.
I am wondering whether, for efficiency reasons, it is possible to process them without going through the disk. In other words, I want to read the pictures from the memory stream related to the tar file.
Consider for instance
import tarfile
import cv2
tar0 = tarfile.open('mytar.tar')
im = cv2.imread( tar0.extractfile('fname.png').read() )
The last line doesn't work as imread expects a file name rather than a stream.
Consider that this way of reading directly from the tar stream can be achieved e.g. for text (see e.g. this SO question).
Any suggestion to open the stream with the correct png encoding?
Untarring to ramdisk is of course an option, although I was looking for something more cachable.
Thanks to the suggestion of @abarry and this SO answer I managed to find the answer.
Consider the following
import tarfile
import cv2
import numpy as np
def get_np_array_from_tar_object(tar_extractfl):
    '''Converts a buffer from a tar file into an np.array.'''
    return np.asarray(bytearray(tar_extractfl.read()), dtype=np.uint8)
tar0 = tarfile.open('mytar.tar')
# The flag 0 decodes the image as grayscale.
im0 = cv2.imdecode(
    get_np_array_from_tar_object(tar0.extractfile('fname.png')), 0)
Perhaps use imdecode with a buffer coming out of the tar file? I haven't tried it but seems promising.
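The in-memory part of this approach can be tried without OpenCV. Below is a minimal sketch, assuming the hypothetical helpers read_member_bytes and build_demo_tar; the archive is built in memory with fake bytes purely for the demo, and the decoding step is only described in a comment.

```python
import io
import tarfile


def read_member_bytes(tar: tarfile.TarFile, member_name: str) -> bytes:
    """Return the raw bytes of one archive member without extracting to disk."""
    extracted = tar.extractfile(member_name)
    if extracted is None:  # directories and other non-file members
        raise ValueError(f"{member_name!r} is not a regular file")
    with extracted:
        return extracted.read()


def build_demo_tar(files: dict) -> bytes:
    """Build a small in-memory tar archive (demo helper only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    archive = build_demo_tar({"fname.png": b"\x89PNG fake bytes"})
    with tarfile.open(fileobj=io.BytesIO(archive)) as tar0:
        raw = read_member_bytes(tar0, "fname.png")
    print(len(raw))
    # With NumPy and OpenCV installed, real image bytes could then be decoded via
    # cv2.imdecode(np.frombuffer(raw, dtype=np.uint8), cv2.IMREAD_UNCHANGED).
```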

PIL: How to reopen an image after verifying?

I need to open an image, verify the image, then reopen it (see the last sentence of the quote below from the PIL docs):
im.verify()
Attempts to determine if the file is broken, without actually decoding
the image data. If this method finds any problems, it raises suitable
exceptions. This method only works on a newly opened image; if the
image has already been loaded, the result is undefined. Also, if you
need to load the image after using this method, you must reopen the
image file.
This is what I have in my code, where picture is a django InMemoryUploadedFile object:
img = Image.open(picture)
img.verify()
img = Image.open(picture)
The first two lines work fine, but I get the following error for the third line (where I'm attempting to "reopen" the image):
IOError: cannot identify image file
What is the proper way to reopen the image file, as the docs suggest?
This is no different than doing
f = open('x.png', 'rb')
Image.open(f)
Image.open(f)
The code above does not work because PIL advances in the file while reading its first few bytes to (attempt to) identify its format. Trying to use a second Image.open in this situation will fail as noted because now the current position in the file is past its image's header. To confirm this, you can verify what f.tell() returns. To solve this issue you have to go back to the start of the file either by doing f.seek(0) between the two calls to Image.open, or closing and reopening the file.
Try doing a del img between the verify and second open.
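The effect of the file position can be reproduced with a plain in-memory stream. This is a sketch only: looks_like_png is a hypothetical stand-in that checks the 8-byte PNG signature, not PIL's actual identification logic.

```python
import io

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(stream) -> bool:
    """Read the first 8 bytes from the current position and compare them
    with the PNG signature; this advances the stream like a format sniffer."""
    return stream.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


if __name__ == "__main__":
    f = io.BytesIO(PNG_SIGNATURE + b"rest of the file")
    print(looks_like_png(f))  # first check starts at offset 0
    print(f.tell())           # position has moved past the header
    print(looks_like_png(f))  # second check reads from the wrong offset
    f.seek(0)                 # rewind before checking again
    print(looks_like_png(f))
```

For an uploaded file object, the equivalent step would be calling picture.seek(0) between img.verify() and the second Image.open(picture).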

md5 from pil object

How can I get the MD5 of a PIL object without saving it to a file?
imq.save('out.png')
hash = hashlib.md5(open('out.png','rb').read()).hexdigest()
Actually there is a simpler solution (in current Pillow the method is tobytes(); older PIL versions called it tostring()):
hashlib.md5(img.tobytes()).hexdigest()
Turning @Ignacio's answer into code, using this answer to help:
import StringIO, hashlib
output = StringIO.StringIO()
img.save(output)
hash = hashlib.md5(output.getvalue()).hexdigest()
As the referenced other answer notes, this might lead to a KeyError if PIL tries to automatically detect the output format. To avoid this problem you can specify the format manually:
img.save(output, format='GIF')
(Note: I've used "img" as the variable, rather than your "imq" which I assumed was a typo.)
You could write it to a StringIO instead, and then take the hash of that.
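On Python 3, io.BytesIO plays the role that StringIO.StringIO plays in the code above. A minimal sketch of the same idea follows; md5_of_written_bytes is a hypothetical helper, and a stand-in writer is used so the example runs without Pillow.

```python
import hashlib
import io


def md5_of_written_bytes(write_to) -> str:
    """Call write_to(buffer) with an in-memory buffer and hash what it wrote."""
    buffer = io.BytesIO()
    write_to(buffer)
    return hashlib.md5(buffer.getvalue()).hexdigest()


if __name__ == "__main__":
    # Stand-in writer; with Pillow this could be
    # lambda buf: img.save(buf, format="PNG")
    print(md5_of_written_bytes(lambda buf: buf.write(b"example bytes")))
```

Because the hash covers the encoded bytes, the same image saved in a different format will produce a different digest.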
You could use the following PIL Image class method to get the raw image data to feed to md5().
im.getdata() => sequence
Returns the contents of an image as a
sequence object containing pixel
values. The sequence object is
flattened, so that values for line one
follow directly after the values of
line zero, and so on.
Note that the MD5 hash computed this way won't be the same as the one from your sample code, because it is (at least partially) independent of the particular image file format used to save the image. It could be useful if you wanted to compare actual images independent of the particular image file format they may be saved in.
To use it you would need to store the MD5 hash of the image data somewhere independent of any image file where it could be retrieved when needed -- as opposed to generating it by reading the entire file into memory as binary data like the code in your question does. Instead you would need to always load the image into PIL and then use the getdata() method on it to compute hashes.
