Convert mp3 song image from png to jpg - python

I have a set of many songs, some of which have png images in metadata, and I need to convert these to jpg.
I know how to convert png images to jpg in general, but I am currently accessing metadata using eyed3, which returns ImageFrame objects, and I don't know how to manipulate these. I can, for instance, access the image type with
print(img.mime_type)
which returns
image/png
but I don't know how to progress from here. Very naively I tried loading the image with OpenCV, but it is either not a compatible format or I didn't do it properly. And anyway I wouldn't know how to update the old image with the new one either!
Note: While I am currently working with eyed3, it is perfectly fine if I can solve this any other way.

I was finally able to solve this, although in a not very elegant way.
The first step is to load the image. For some reason I could not make this work with eyed3, but TinyTag does the job:
import io
from PIL import Image
from tinytag import TinyTag
tag = TinyTag.get(mp3_path, image=True)
image_data = tag.get_image() # raw bytes of the embedded cover image
img_bytes = io.BytesIO(image_data)
photo = Image.open(img_bytes)
Then I manipulate it. For example, we can resize it and save it as a JPEG. Because we are using Pillow (PIL) for these operations, I save the image to a temporary file and then read it back to get the binary data (this round trip is probably the part of the process that could be improved).
photo = photo.resize((500, 500)) # suppose we want 500 x 500 pixels
rgb_photo = photo.convert("RGB")
rgb_photo.save(temp_file_path, format="JPEG")
The last step is then to load the saved image and set it as metadata. More details about this step are in this answer:
audio_file = eyed3.load(mp3_path) # this has been loaded before
with open(temp_file_path, "rb") as f:
    audio_file.tag.images.set(3, f.read(), "image/jpeg") # 3 = front cover
audio_file.tag.save()
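The temporary file in the steps above can probably be avoided by letting Pillow write into an in-memory buffer. The following is a minimal sketch, not a tested drop-in replacement: the helper name `to_jpeg_bytes` and its `size` and `quality` parameters are made up for illustration.

```python
import io

from PIL import Image


def to_jpeg_bytes(image_data, size=None, quality=90):
    """Re-encode encoded image bytes (e.g. PNG) as JPEG bytes, entirely in memory."""
    photo = Image.open(io.BytesIO(image_data))
    if size is not None:
        photo = photo.resize(size)
    rgb_photo = photo.convert("RGB")  # JPEG cannot store an alpha channel
    out = io.BytesIO()
    rgb_photo.save(out, format="JPEG", quality=quality)
    return out.getvalue()
```

The result could then be passed straight to the call used above, for example audio_file.tag.images.set(3, to_jpeg_bytes(image_data, (500, 500)), "image/jpeg").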

Related

OpenCV read images from pyspark and pass to a Keras model

This is a follow-up question to the answer posted here. I'm using PySpark 2.4.4. I have a bunch of images (some .png some .jpeg) stored on Google Cloud Storage (GCS) that I need to pass to a Tensorflow model. I'm getting my images like this.
images = spark.read.format("image").option("dropInvalid", False).load("gs://my-bucket/my_image.jpg")
images = images.collect()
image = cv2.imdecode(np.frombuffer(images[0].image.data, np.uint8), cv2.IMREAD_COLOR)
Based on the OpenCV documentation I've read, it seems like OpenCV isn't able to understand my data format. I know this because cv2.imdecode(...) returns None. The official Spark documentation explicitly mentions compatibility with OpenCV, so I know it's possible.
Eventually I want to be able to do this.
prediction = model.predict(np.array([image]))[0]
Outside of Spark, if I get my image not from GCS but from an http endpoint, all I have to do is this, which works.
resp = urllib.request.urlopen(image_url)
image = resp.read()
prediction = model.predict(np.array([image]))[0]
To get a better sense of what the model is looking for, this is what the data should look like before it's passed into the np.array([...]) part.
print(resp.read())
>>> b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\ ...'
I can confirm that the images aren't corrupted when they're on GCS. When I download the same image from GCS to my laptop, and then read it like this, I get a similarly looking format. The model is also able to consume the image this way. I've also visually inspected the downloaded GCS image, and it looks fine.
with open("./my_image.jpeg", "rb") as image:
    print(image.read())
>>> b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\ ...'
Not sure if this is what you are looking for, but I was able to achieve this by converting PIL images to cv2 images.
Spark loading :
images = sc.binaryFiles('/tmp/images/*', 10)
df = images.map(lambda img: extract_line_coords(img)).toDF()
df.show(5, False)
Function
def extract_line_coords(binary_images):
    name, img = binary_images
    pil_image = Image.open(io.BytesIO(img)).convert('RGB')
    cv2_image = numpy.array(pil_image)
    cv2_image = cv2_image[:, :, ::-1].copy() # RGB -> BGR, the channel order OpenCV expects
    gray = cv2.cvtColor(cv2_image, cv2.COLOR_BGR2GRAY)
    ...
    ...
Reference : Convert image from PIL to openCV format
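As for why cv2.imdecode(...) returns None in the question: Spark's "image" data source stores already decoded pixels (row-wise, normally BGR) in image.data, not the encoded JPEG/PNG bytes, so there is nothing left to decode. A small sketch of reshaping that buffer instead is shown below. The helper name is made up, and it assumes 8-bit unsigned pixel data.

```python
import numpy as np


def spark_image_to_array(height, width, n_channels, data):
    """Turn the decoded pixel buffer of a Spark image row into an OpenCV-style array.

    Assumes 8-bit unsigned pixels laid out row by row, as Spark's image schema
    documents for the common case.
    """
    pixels = np.frombuffer(data, dtype=np.uint8)
    return pixels.reshape(height, width, n_channels)
```

With the rows collected in the question, this would be called as row = images[0].image followed by spark_image_to_array(row.height, row.width, row.nChannels, row.data). If the model expects the encoded bytes rather than a pixel array, sc.binaryFiles as used in the answer above is likely the better fit.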

Using Pillow and img2pdf to convert images to pdf

I have a task that requires me to get data from an image upload (jpg or png), resize it based on the requirement, and then transform it into pdf and then store in s3.
The file comes in as a BytesIO object.
I have Pillow available, so I can resize the image with it.
Now the object's type is class 'PIL.Image.Image' and I don't know how to proceed.
I found img2pdf library on https://gitlab.mister-muffin.de/josch/img2pdf, but I don't know how to use it when I have PIL format (use tobytes()?)
The s3 upload also looks like a file-like object, so I don't want to save it into a temp file before loading it again. Do I even need img2pdf in this case then?
How do I achieve this goal?
EDIT: I tried using tobytes() and uploading to s3 directly. The upload was successful. However, when I download the file to see the content, it shows an empty page. It seems the image data is not written into the pdf file.
EDIT 2: I actually went to s3 and checked the stored file. When I download it and open it, it says the file cannot be opened.
EDIT 3: I don't really have working code as I'm still experimenting with what could work, but here's the gist:
data = request.FILES['file'].file # where the data is
im = Image.open(data)
(width, height) = (im.width // 2, im.height // 2) # example action I wanna take with Pillow
im_resized = im.resize((width, height))
data = im_resized.tobytes()
# potential step for using img2pdf here but I don't know how
# img2pdf.convert(data) # this fails because "ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO..."
# img2pdf.convert(im_resized) # this also fails because "TypeError: Neither implements read() nor is str or bytes"
upload_to_s3(data) # some function that utilizes boto3 to upload to s3
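The problem is that you are passing an Image.Image object (or its raw pixel data from tobytes(), which has no file header) instead of an encoded image such as JPEG or PNG.
Try this:
bytes_io = io.BytesIO()
image.save(bytes_io, 'PNG') # image is the resized PIL image
with open(output_pdf, "wb") as f:
    f.write(img2pdf.convert(bytes_io.getvalue()))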
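On the question of whether img2pdf is needed at all: Pillow can itself write a PDF into an in-memory buffer, which avoids both the temp file and the extra library, although unlike img2pdf it re-encodes the image data. A rough sketch follows. The function name and the `scale` parameter are invented for illustration.

```python
import io

from PIL import Image


def resized_pdf_bytes(file_obj, scale=0.5):
    """Open an uploaded image, shrink it, and return the bytes of a one-page PDF."""
    im = Image.open(file_obj)
    new_size = (max(1, int(im.width * scale)), max(1, int(im.height * scale)))
    im_resized = im.resize(new_size).convert("RGB")  # Pillow's PDF writer rejects RGBA
    buffer = io.BytesIO()
    im_resized.save(buffer, format="PDF")
    return buffer.getvalue()
```

The returned bytes can be wrapped in io.BytesIO(...) again if the upload function from the question (upload_to_s3) wants a file-like object.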

Failed to GET matplotlib generated png in django

I want to serve matplotlib generated images with django.
If the image is a static png file, the following code works great:
from django.http import HttpResponse
def static_image_view(request):
    response = HttpResponse(mimetype='image/png')
    with open('test.png', 'rb') as f:
        response.write(f.read())
    return response
However, if the image is dynamically generated:
import numpy as np
import matplotlib
matplotlib.use('Agg')
from matplotlib import pyplot as plt
def dynamic_image_view(request):
    response = HttpResponse(mimetype='image/png')
    fig = plt.figure()
    plt.plot(np.random.rand(100))
    plt.savefig(response, format='png')
    plt.close(fig)
    return response
When accessing the url in Chrome (v36.0), the image will show up for a few seconds, then disappear and turn to the alt text. It seems that the browser doesn't know the image has already finished loading and waits until timeout. Checking with Chrome > Tools > Developer tools > Network supports this hypothesis: although the image appears after only about 1 sec, the status of the corresponding http request becomes "failed" after about 5 sec.
Note again, this strange phenomenon occurs only with the dynamically generated image, so it shouldn't be Chrome's problem (though it doesn't happen with IE or FireFox, presumably due to different rules in dealing with timeout requests).
To make it more tricky (i.e., hard to reproduce), it seems to be network speed dependent. It happens if I access the url from an IP in China, but not if via a proxy in the US (which seems to be faster visiting the host on which django is running)...
Following @HSquirrel's suggestion, I tested writing the png to a temporary file on disk. Strangely, saving the file with matplotlib didn't work,
plt.savefig('MPL.png', format='png')
with open('MPL.png', 'rb') as f:
    response.write(f.read())
while saving file with PIL worked:
import io
from PIL import Image
f = io.BytesIO()
plt.savefig(f, format='png')
f.seek(0)
im = Image.open(f)
im.save('PIL.png', 'PNG')
An attempt to get rid of the temp file failed:
im.save(response, 'PNG')
However, if I generate the image data stream with PIL rather than matplotlib, the temporary disk file is unnecessary. The following code works:
from PIL import Image, ImageDraw
im = Image.new('RGBA', (256,256), (0,255,0,255))
draw = ImageDraw.Draw(im)
draw.line((100,100, 150,200), fill=128, width=3)
im.save(response, 'PNG')
Finally, plt.savefig(response, format='jpeg') has no problem at all.
Have you tried saving the image to disk and then returning that? (you can periodically clear your disk of such generated images based on their time of creation)
If that gives the same problem, it might be a problem with the way the png is generated. Then you could use some kind of image library (like PIL) to make sure all your PNGs are (re)generated in a way that works with all browsers.
EDIT:
I've checked the png you've linked and I've played around with it a bit, opening and saving it with different programs and with PIL. I get different binary data every time. It seems each program decides which chunks to keep and which to remove. They all encode the png image data differently as well (as far as I can see, I am by no means a specialist in this, I just looked at the binary data based on the specs).
There are a few different paths you can take:
1. The quick and dirty one:
import io
from PIL import Image
f = io.BytesIO()
plt.savefig(f, format='png')
f.seek(0)
im = Image.open(f)
tempfilename = generatetempfilename()
im.save(tempfilename, 'PNG')
with open(tempfilename, 'rb') as f:
    response.write(f.read())
2. Adapt how matplotlib makes PNG files (possibly by just using PIL for it as well). See http://matplotlib.org/users/customizing.html#customizing-matplotlib
3. If it's an option for you, use jpeg.
4. Figure out what's wrong with the PNG generated by matplotlib and fix it at the binary level (I don't recommend this). You can use xxd (linux command: xxd test.png) to see how the files look in binary and then check them against the png spec: overview chunk spec
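For option 4, reading a hex dump by eye is tedious. A short script can list the chunks instead, which makes it easier to compare the matplotlib output with the PIL output. This is only a sketch based on the PNG layout (8-byte signature, then chunks of length, type, data and CRC). The function name is made up.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def list_png_chunks(data):
    """Return a list of (chunk_type, data_length, crc_ok) tuples for PNG bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks = []
    pos = 8
    while pos + 12 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        if pos + 12 + length > len(data):
            raise ValueError("truncated chunk")
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        crc_ok = (zlib.crc32(ctype + body) & 0xFFFFFFFF) == crc
        chunks.append((ctype.decode("ascii"), length, crc_ok))
        pos += 12 + length
        if ctype == b"IEND":
            break
    return chunks
```

Running it on both files, e.g. list_png_chunks(open('MPL.png', 'rb').read()), shows which chunks each writer emits and whether any CRC is off.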

Python: Remove Exif info from images

In order to reduce the size of images to be used in a website, I reduced the quality to 80-85%. This decreases the image size quite a bit, up to an extent.
To reduce the size further without compromising the quality, my friend pointed out that raw images from cameras have a lot of metadata called Exif info. Since there is no need to retain this Exif info for images in a website, we can remove it. This will further reduce the size by 3-10 kB.
But I'm not able to find an appropriate library to do this in my Python code. I have browsed through related questions and tried out some of the methods:
Original image: http://mdb.ibcdn.com/8snmhp4sjd75vdr27gbadolc003i.jpg
Mogrify
/usr/local/bin/mogrify -strip filename
Result: http://s23.postimg.org/aeaw5x7ez/8snmhp4sjd75vdr27gbadolc003i_mogrify.jpg
This method reduces the size from 105 kB to 99.6 kB, but also changed the color quality.
Exif-tool
exiftool -all= filename
Result: http://s22.postimg.org/aiq99o775/8snmhp4sjd75vdr27gbadolc003i_exiftool.jpg
This method reduces the size from 105 kB to 72.7 kB, but also changed the color quality.
This answer explains in detail how to manipulate the Exif info, but how do I use it to remove the info?
Can anyone please help me remove all the extra metadata without changing the colours, dimensions, and other properties of an image?
from PIL import Image
image = Image.open('image_file.jpeg')
# next 3 lines strip exif
data = list(image.getdata())
image_without_exif = Image.new(image.mode, image.size)
image_without_exif.putdata(data)
image_without_exif.save('image_file_without_exif.jpeg')
For me, gexiv2 works fine:
#!/usr/bin/python3
from gi.repository import GExiv2
exif = GExiv2.Metadata('8snmhp4sjd75vdr27gbadolc003i.jpg')
exif.clear_exif()
exif.clear_xmp()
exif.save_file()
See also Exif manipulation library for python, which you linked, but didn't read all answers ;)
You can try loading the image with the Python Image Library (PIL) and then save it again to a different file. That should remove the metadata.
You don't even need to do the extra steps @user2141737 suggested. Just opening it up with PIL and saving it again seems to do the trick just fine:
from PIL import Image
image = Image.open('path/to/image')
image.save('new/path/' + file_name)
As of pillow==9.2.0
This seems to print exif data, a mutable mapping
print(im.info)
This seems to clear exif data for PNG
def clear_exif():
    with Image.open('./my_image.png', mode='r', formats=['PNG']) as im:
        fields_to_keep = ('transparency', )
        exif_fields = list(im.info.keys())
        for k in exif_fields:
            if k not in fields_to_keep:
                del im.info[k]
        im.save('./my_image.png', format='PNG')
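The PIL answers above all re-encode the image, which by itself can change pixel values in a JPEG. An alternative that leaves the compressed image data untouched is to copy the file segment by segment and drop only the Exif APP1 segment. The sketch below is an illustration, not a robust tool: the function name is made up, and it assumes a simple layout in which every segment before the start-of-scan marker carries a length field. Other segments (JFIF, ICC profile, XMP) are copied as they are.

```python
import struct


def strip_exif_segments(jpeg_bytes):
    """Return a copy of the JPEG bytes with Exif APP1 segments removed."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file")
    out = bytearray(b"\xff\xd8")  # SOI marker
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker was expected")
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:  # SOS: compressed data follows, copy the rest verbatim
            break
        # The 2-byte length counts itself but not the 2-byte marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        segment = jpeg_bytes[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length
    out += jpeg_bytes[pos:]
    return bytes(out)
```

Usage would be along the lines of reading the file in binary mode, passing the bytes through strip_exif_segments, and writing the result to a new file.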

Resize image in Python without losing EXIF data

I need to resize jpg images with Python without losing the original image's EXIF data (metadata about date taken, camera model etc.). All google searches about python and images point to the PIL library which I'm currently using, but doesn't seem to be able to retain the metadata. The code I have so far (using PIL) is this:
img = Image.open('foo.jpg')
width,height = 800,600
if img.size[0] < img.size[1]:
    width,height = height,width
resized_img = img.resize((width, height), Image.ANTIALIAS) # best down-sizing filter
resized_img.save('foo-resized.jpg')
Any ideas? Or other libraries that I could be using?
There is actually a really simple way of copying EXIF data from one picture to another with only PIL, though it doesn't let you modify the EXIF tags.
image = Image.open('test.jpg')
exif = image.info['exif']
# Your picture process here
image = image.rotate(90)
image.save('test_rotated.jpg', 'JPEG', exif=exif)
As you can see, the save function accepts an exif argument, which copies the raw EXIF data into the new image when saving. You don't actually need any other library if that's all you want to do. I can't seem to find any documentation on the save options, and I don't know whether this is specific to Pillow or works with the original PIL too. (If someone has a link, I would appreciate it if they posted it in the comments.)
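One caveat with this approach is that image.info['exif'] raises a KeyError when the source file has no EXIF segment. A slightly more defensive sketch, combining it with the resize from the question, might look like this. The function name is hypothetical, and the EXIF size tags are copied as they are, not updated to the new dimensions.

```python
import io

from PIL import Image


def resize_keep_exif(src, dest, max_size):
    """Shrink a JPEG to fit within max_size and carry over its raw EXIF block, if any.

    src and dest may be file paths or binary file objects.
    """
    image = Image.open(src)
    exif = image.info.get("exif")  # None when the file has no EXIF segment
    image.thumbnail(max_size)  # in place, keeps the aspect ratio
    save_kwargs = {"exif": exif} if exif else {}
    image.save(dest, "JPEG", **save_kwargs)
```

For the file in the question this would be resize_keep_exif('foo.jpg', 'foo-resized.jpg', (800, 600)).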
import jpeg
jpeg.setExif(jpeg.getExif('foo.jpg'), 'foo-resized.jpg')
http://www.emilas.com/jpeg/
You can use pyexiv2 to copy EXIF data from source image. In the following example image is resized using PIL library, EXIF data copied with pyexiv2 and image size EXIF fields are set with new size.
def resize_image(source_path, dest_path, size):
    # resize image
    image = Image.open(source_path)
    image.thumbnail(size, Image.ANTIALIAS)
    image.save(dest_path, "JPEG")
    # copy EXIF data
    source_image = pyexiv2.Image(source_path)
    source_image.readMetadata()
    dest_image = pyexiv2.Image(dest_path)
    dest_image.readMetadata()
    source_image.copyMetadataTo(dest_image)
    # set EXIF image size info to resized size
    dest_image["Exif.Photo.PixelXDimension"] = image.size[0]
    dest_image["Exif.Photo.PixelYDimension"] = image.size[1]
    dest_image.writeMetadata()

# resizing local file
resize_image("41965749.jpg", "resized.jpg", (600,400))
Why not use ImageMagick?
It is quite a standard tool (for instance, it is the standard tool used by Gallery 2). I have never used it myself, but it has a Python interface as well (or you can simply spawn the command) and, most importantly, it should preserve EXIF information across all transformations.
Here's an updated answer as of 2018. piexif is a pure python library that for me installed easily via pip (pip install piexif) and worked beautifully (thank you, maintainers!). https://pypi.org/project/piexif/
The usage is very simple, a single line will replicate the accepted answer and copy all EXIF tags from the original image to the resized image:
import piexif
piexif.transplant("foo.jpg", "foo-resized.jpg")
I haven't tried it yet, but it looks like you could also perform modifications easily by using the load, dump, and insert functions as described in the linked documentation.
For pyexiv2 v0.3.2, the API documentation refers to the copy method to carry over EXIF data from one image to another. In this case it would be the EXIF data of the original image over to the resized image.
Going off @Maksym Kozlenko's answer, the updated code for copying EXIF data is:
source_image = pyexiv2.ImageMetadata(source_path)
source_image.read()
dest_image = pyexiv2.ImageMetadata(dest_path)
dest_image.read()
source_image.copy(dest_image,exif=True)
dest_image.write()
You can use pyexiv2 to modify the file after saving it.
from PIL import Image
img_path = "/tmp/img.jpg"
img = Image.open(img_path)
exif = img.info['exif']
img.save("output_"+img_path, exif=exif)
Tested in Pillow 2.5.3
It seems @Depado's solution does not work for me: in my scenario the image does not even contain an EXIF segment.
pyexiv2 is hard to install on my Mac, so I use the pexif module instead: https://github.com/bennoleslie/pexif/blob/master/pexif.py. To add an EXIF segment to an image that does not contain one, I imported the EXIF info from another image that does have an EXIF segment:
from pexif import JpegFile
#get exif segment from an image
jpeg = JpegFile.fromFile(path_with_exif)
jpeg_exif = jpeg.get_exif()
#import the exif segment above to the image file which does not contain exif segment
jpeg = JpegFile.fromFile(path_without_exif)
exif = jpeg.import_exif(jpeg_exif)
jpeg.writeFile(path_without_exif)
Updated version of Maksym Kozlenko's answer
Python3 and py3exiv2 v0.7
# Resize image and update Exif data
from PIL import Image
import pyexiv2
def resize_image(source_path, dest_path, size):
    # resize image
    image = Image.open(source_path)
    # Using thumbnail, then 'size' is MAX width or height
    # so will retain aspect ratio
    image.thumbnail(size, Image.ANTIALIAS)
    image.save(dest_path, "JPEG")
    # copy EXIF data
    source_exif = pyexiv2.ImageMetadata(source_path)
    source_exif.read()
    dest_exif = pyexiv2.ImageMetadata(dest_path)
    dest_exif.read()
    source_exif.copy(dest_exif,exif=True)
    # set EXIF image size info to resized size
    dest_exif["Exif.Photo.PixelXDimension"] = image.size[0]
    dest_exif["Exif.Photo.PixelYDimension"] = image.size[1]
    dest_exif.write()
PIL handles EXIF data, doesn't it? Look in PIL.ExifTags.
