What is the best way to save image metadata alongside a tif?

In my work as a grad student, I capture microscope images and use python to save them as raw tif's. I would like to add metadata such as the name of the microscope I am using, the magnification level, and the imaging laser wavelength. These details are all important for how I post-process the images.
I should be able to do this with a tif, right? Since it has a header?
I was able to add to the info in a PIL image:
im.info['microscope'] = 'george'
but when I save and load that image, the info I added is gone.
I'm open to all suggestions. If I have to, I'll just save a separate .txt file with the metadata, but it would be really nice to have it embedded in the image.

Tifffile is one option for saving microscopy images with lots of metadata in python.
It doesn't have a lot of external documentation, but the docstrings are great, so you can get a lot of info just by typing help(tifffile) in python, or by looking at the source code.
You can look at the TiffWriter.save function in the source code (line 750) for the different keyword arguments you can use to write metadata.
One is to use description, which accepts a string. It will show up as the tag "ImageDescription" when you read your image.
Another is to use the extratags argument, which accepts a list of tuples. That allows you to write any tag name that exists in TIFF.TAGS(). One of the easiest ways is to write them as strings, because then you don't have to specify the length.
You can also write ImageJ metadata with ijmetadata, for which the acceptable types are listed in the source code here.
As an example, if you write the following:
import json
import numpy as np
import tifffile
im = np.random.randint(0, 255, size=(150, 100), dtype=np.uint8)
# Description
description = "This is my description"
# Extratags
metadata_tag = json.dumps({"ChannelIndex": 1, "Slice": 5})
extra_tags = [("MicroManagerMetadata", 's', 0, metadata_tag, True),
              ("ProcessingSoftware", 's', 0, "my_spaghetti_code", True)]
# ImageJ metadata. 'Info' tag needs to be a string
ijinfo = {"InitialPositionList": [{"Label": "Pos1"}, {"Label": "Pos3"}]}
ijmetadata = {"Info": json.dumps(ijinfo)}
# Write file
save_name = "example.tif"  # output path
tifffile.imsave(
    save_name,
    im,
    ijmetadata=ijmetadata,
    description=description,
    extratags=extra_tags,
)
You can see the following tags when you read the image:
frames = tifffile.TiffFile(save_name)
page = frames.pages[0]
print(page.tags["ImageDescription"].value)
Out: This is my description
print(page.tags["MicroManagerMetadata"].value)
Out: {'ChannelIndex': 1, 'Slice': 5}
print(page.tags["ProcessingSoftware"].value)
Out: my_spaghetti_code

For internal use, try saving the metadata as JSON in the TIFF ImageDescription tag, e.g.
from __future__ import print_function, unicode_literals
import json
import numpy
import tifffile # http://www.lfd.uci.edu/~gohlke/code/tifffile.py.html
data = numpy.arange(256).reshape((16, 16)).astype('u1')
metadata = dict(microscope='george', shape=data.shape, dtype=data.dtype.str)
print(data.shape, data.dtype, metadata['microscope'])
metadata = json.dumps(metadata)
tifffile.imsave('microscope.tif', data, description=metadata)
with tifffile.TiffFile('microscope.tif') as tif:
    data = tif.asarray()
    metadata = tif[0].image_description
metadata = json.loads(metadata.decode('utf-8'))
print(data.shape, data.dtype, metadata['microscope'])
Note that JSON uses unicode strings.
To be compatible with other microscopy software, consider saving OME-TIFF files, which store defined metadata as XML in the ImageDescription tag.

I should be able to do this with a tif, right? Since it has a header?
No.
First, your premise is wrong, but that's a red herring. TIFF does have a header, but it doesn't allow you to store arbitrary metadata in it.
But TIFF is a tagged file format, a series of chunks of different types, so the header isn't important here. And you can always create your own private chunk (any ID > 32767) and store anything you want there.
The problem is, nothing but your own code will have any idea what you stored there. So, what you probably want is to store EXIF or XMP or some other standardized format for extending TIFF with metadata. But even there, EXIF or whatever you choose isn't going to have a tag for "microscope", so ultimately you're going to end up having to store something like "microscope=george\nspam=eggs\n" in some string field, and then parse it back yourself.
But the real problem is that PIL/Pillow doesn't give you an easy way to store EXIF or XMP or anything else like that.
First, Image.info isn't for arbitrary extra data. At save time, it's generally ignored.
If you look at the PIL docs for TIFF, you'll see that it reads additional data into a special attribute, Image.tag, and can save data by passing a tiffinfo keyword argument to the Image.save method. But that additional data is a mapping from TIFF tag IDs to binary hunks of data. You can get the Exif tag IDs from the undocumented PIL.ExifTags.TAGS dict (or by looking them up online yourself), but that's as much support as PIL is going to give you.
Also, note that accessing tag and using tiffinfo in the first place requires a reasonably up-to-date version of Pillow; older versions, and classic PIL, didn't support it. (Ironically, they did have partial EXIF support for JPG files, which was never finished and has been stripped out…) Also, although it doesn't seem to be documented, if you built Pillow without libtiff it seems to ignore tiffinfo.
So ultimately, what you're probably going to want to do is:
Pick a metadata format you want.
Use a different library than PIL/Pillow to read and write that metadata. (For example, you can use GExiv2 or pyexif for EXIF.)
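As a rough illustration of the "store it in some string field, and then parse it back yourself" idea above, here is a minimal sketch using only the standard library. The function names encode_metadata and decode_metadata are made up for this example, and it assumes that keys contain no "=" and that neither keys nor values contain newlines.

```python
def encode_metadata(metadata):
    """Serialize a dict as one 'key=value' line per entry."""
    return "".join("{}={}\n".format(key, value) for key, value in metadata.items())


def decode_metadata(text):
    """Parse 'key=value' lines back into a dict of strings."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition("=")
        result[key] = value
    return result


encoded = encode_metadata({"microscope": "george", "spam": "eggs"})
decoded = decode_metadata(encoded)
print(repr(encoded))
print(decoded)
```

Note that every value comes back as a string, so numbers such as a magnification level would need to be converted again after parsing.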

You could try setting tags in the tag property of a TIFF image. This is an ImageFileDirectory object. See TiffImagePlugin.py.
Or, if you have libtiff installed, you can use the subprocess module to call the tiffset command to set a field in the header after you have saved the file. There are online references of available tags.
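The tiffset suggestion could look roughly like the sketch below. It assumes the tiffset binary from libtiff is on your PATH and uses tag number 270 (ImageDescription); the helper names build_tiffset_command and set_tiff_field, the file name image.tif and the description text are placeholders for illustration.

```python
import subprocess

IMAGE_DESCRIPTION_TAG = 270  # TIFF tag number of ImageDescription


def build_tiffset_command(tif_path, tag, value):
    """Return the argument list for: tiffset -s <tag> <value> <file>."""
    return ["tiffset", "-s", str(tag), str(value), str(tif_path)]


def set_tiff_field(tif_path, tag, value):
    """Run tiffset on an already saved file; raises CalledProcessError on failure."""
    subprocess.run(build_tiffset_command(tif_path, tag, value), check=True)


command = build_tiffset_command("image.tif", IMAGE_DESCRIPTION_TAG, "microscope=george")
print(command)
```

Calling set_tiff_field changes the file in place, so work on a copy if you need to keep the original.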
According to this page:
If one needs more than 10 private tags or so, the TIFF specification suggests that, rather then using a large amount of private tags, one should instead allocate a single private tag, define it as datatype IFD, and use it to point to a socalled 'private IFD'. In that private IFD, one can next use whatever tags one wants. These private IFD tags do not need to be properly registered with Adobe, they live in a namespace of their own, private to the particular type of IFD.
Not sure if PIL supports this, though.

Related

How do I modify TIFF physical resolution metadata

I have several pyramidal, tiled TIFF images that were converted from a different format. The converter program wrote incorrect data to the XResolution and YResolution TIFF metadata. How can I modify these fields?
tiff.ResolutionUnit: 'centimeter'
tiff.XResolution: '0.34703996762331574'
tiff.YResolution: '0.34704136833246829'
Ideally I would like to use Python or a command-line tool.

One can use tifftools.tiff_set from Tiff Tools.
import tifftools
tifftools.tiff_set(
    PATH_TO_ORIG_IMAGE,
    PATH_TO_NEW_IMAGE,
    overwrite=False,
    setlist=[
        (
            tifftools.Tag.RESOLUTIONUNIT,
            tifftools.constants.ResolutionUnit.CENTIMETER.value,
        ),
        (tifftools.Tag.XRESOLUTION, xresolution),
        (tifftools.Tag.YRESOLUTION, yresolution),
    ],
)
Replace xresolution and yresolution with the desired values. These values must be floats. In this example, the resolution unit is centimeter.
This is also possible with the excellent tifffile package. In fact there is an example of this use case in the README.
from tifffile import TiffFile
with TiffFile('temp.tif', mode='r+') as tif:
    _ = tif.pages[0].tags['XResolution'].overwrite((96000, 1000))
Be aware that this will overwrite the original image. If this is not desired, make a copy of the image first and then overwrite the tags.

Python - add arbitrary EXIF data to image (UserComment field)?

I need to add arbitrary data to a JPEG image. Specifically, I need to store two integers. From reading about EXIF data, I'm under the impression that it is not possible to make your own custom fields, but rather the EXIF standard fields must be used.
This post Custom Exif Tags however mentions a UserComment field which I gather it is possible to write a string to. If this is the only option it's fine since I can store two integers in a comma-delimited string, ex '2,5' to store the integers 2 and 5, so if I only have one string of storage to work with it's still sufficient.
I downloaded a few random images from a Google image search and found they don't seem to have EXIF data, perhaps it's stripped off purposefully by Google? Also I took a few images with my cell phone and found that as expected they have a significant amount of EXIF data (image size, GPS location, etc.)
After some Googling I found this example of how to read/dump EXIF data:
from PIL import Image
image = Image.open('image.jpg')
exifData = image._getexif()
print('exifData = ' + str(exifData))
This works great, if I run this on an image with no EXIF data I get:
exifData = None
and if I run this on an image with EXIF data I get a dictionary showing the EXIF fields as expected.
Now my question is, how can I add to the EXIF data? Using the UserComment 37510 field mentioned in the above linked post, and also here https://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html, and using piexif this is my best attempt so far:
from PIL import Image
import piexif
image = Image.open('image.jpg')
exifData = image._getexif()
if exifData is None:
    exifData = {}
# end if
exifData[37510] = 'my message'
exifDataBytes = piexif.dump(exifData)
image.save('image_mod.jpg', format='jpeg', exif=exifDataBytes)
If I then run the 1st code above on image_mod.jpg I get:
exifData = {}
So clearly the 37510 message was not properly written. I get this same empty dictionary result whether I'm using an image that has EXIF data or an image without EXIF data to begin with.
Before somebody marks this as a duplicate, I also tried what this post How can I insert EXIF/other metadata into a JPEG stored in a memory buffer? mentions in the highest-rated answer and got the same result when attempting to read the EXIF data (empty dictionary).
What am I doing wrong? How can I properly add custom EXIF data to an image using 37510, or any other means?

You're missing a step in handling the data passed to piexif.dump:
exif_ifd = {piexif.ExifIFD.UserComment: 'my message'.encode()}
exif_dict = {"0th": {}, "Exif": exif_ifd, "1st": {},
             "thumbnail": None, "GPS": {}}
exif_dat = piexif.dump(exif_dict)
# `image` is the PIL image opened in the question's code
image.save('image_mod.jpg', exif=exif_dat)
You should be able to read it back out after this. See also this answer for dealing with custom metadata.

Rasterio tags are the easiest way to add metadata of any kind to an image. Example:
import rasterio
old_file=rasterio.open('old_image.tif')
profile=old_file.profile
data=old_file.read()
with rasterio.open('new_image.tif', 'w', **profile) as dst:
    dst.update_tags(a='1', b='2')
    dst.write(data)
# the with block closes dst automatically
#now access the tags like below:
im=rasterio.open('new_image.tif')
print(im.tags())

Cannot Embed Cover Art To Mp3 in Python 3.5.2

I have this file "image.jp
and this .mp3 file:
"Green Day - When I Come Around [Official Music Video].mp3"
in the directory "test"
I have already successfully set tags as Author, Title, Album and etc using eyeD3 library.
and then I try to set the Cover Art.
I've tried two possibilities, but neither of them worked:
First one: Mutagen:
from mutagen.mp3 import MP3
from mutagen.id3 import ID3, APIC, error
complete_file_path = "test\\"+"Green Day - When I Come Around [Official Music Video].mp3"
path_to_thumb_wf = "test\\"+"image.jpg"
audio = MP3(complete_file_path, ID3=ID3)
# add ID3 tag if it doesn't exist
try:
    audio.add_tags()
except error:
    pass
print(path_to_thumb_wf)
audio.tags.add(
    APIC(
        encoding=3, # 3 is for utf-8
        mime='image/jpg', # image/jpeg or image/png
        type=3, # 3 is for the cover image
        desc=u'Cover',
        data=open(path_to_thumb_wf, 'rb').read()
    )
)
audio.save(v2_version=3)
And the solution using eyeD3
audiofile = eyed3.load(complete_file_path)
# read image into memory
imagedata = open(path_to_thumb_wf,"rb").read()
# append image to tags
audiofile.tag.images.set(3,imagedata,"image/jpeg", u"you can put a description here")
audiofile.tag.save()
I'm using Python 3.5.2 on Windows 10. I don't know whether it influences the result, but I'll mention it anyway: the song already has cover art, which I'd like to change.

As explained in the ID3v2.3 section on APIC:
There may be several pictures attached to one file, each in their individual "APIC" frame, but only one with the same content descriptor. There may only be one picture with the picture type declared as picture type $01 and $02 respectively.
In v2.3, IIRC, "content descriptor" isn't actually documented anywhere, so different clients may do slightly different things here, but most tools will treat it either as the picture type plus description string, or as the entire header (text encoding, MIME type, picture type, and encoded description) as a binary blob. (And some tools just ignore it and allow you to store pictures with completely identical frame headers, but I don't think that's relevant with Mutagen.)
At any rate, this means you're probably just adding another Cover (front) picture, named 'Cover', rather than replacing any existing one.
You haven't explained how you're looking at the file. But I'm guessing you're trying to open it in Windows Media Player or iTunes or some other player, or view it in Windows Explorer (which I think just asks WMP to read the tag), or something like that?
Almost all such tools, when faced with multiple images, just show you the first one. (Some of them don't even distinguish on picture type, and show you the first image of any type, even if it's a 32x32 file icon…)
Some do have a way to view the other pictures, however. For example, in iTunes, if you Get Info or Properties on the track, then go to the Cover Art or similar tab (sorry for the vagueness, but the names have changed across versions), you can see all of the pictures in the tag.
At any rate, if you want to replace the APIC with a different one, you either need to exactly match the descriptor (and, again, that can mean different things to different libraries…), or, more simply, just delete the old one as well as adding the new one.
One more thing to watch out for: both iTunes and WMP cache cover art, and assume that it's never going to change once the file has been imported. And WMP also has various things that can override the image in the file, such as a properly-UUID'd folder cover art image in the same directory.

How to send embedded image created using PIL/pillow as email (Python 3)

I am creating an image that I would like to embed in an e-mail. I cannot figure out how to get the image as binary data and pass it into MIMEImage. Below is the code I have; I get an error when I try to read the image object: "AttributeError: 'NoneType' object has no attribute 'read'".
image=Image.new("RGBA",(300,400),(255,255,255))
image_base=ImageDraw.Draw(image)
emailed_password_pic=image_base.text((150,200),emailed_password,(0,0,0))
imgObj=emailed_password_pic.read()
msg=MIMEMultipart()
html="""<p>Please finish registration <br/><img src="cid:image.jpg"></p>"""
img_file='image.jpg'
msgText = MIMEText(html,'html')
msgImg=MIMEImage(imgObj)
msgImg.add_header('Content-ID',img_file)
msg.attach(msgImg)
msg.attach(msgText)
If you look at line 4 - I am trying to read image so that I can pass it into MIMEImage. Apparently, image needs to be read as binary. However, I don't know how to convert it to binary so that .read() can process it.
FOLLOW-UP
I edited code per suggestions from jsbueno - thank you very much!!!:
emailed_password=os.urandom(16)
image=Image.new("RGBA",(300,400),(255,255,255))
image_base=ImageDraw.Draw(image)
emailed_password_pic=image_base.text((150,200),emailed_password,(0,0,0))
stream_bytes=BytesIO()
image.save(stream_bytes,format='png')
stream_bytes.seek(0)
#in_memory_file=stream_bytes.getvalue()
#imgObj=in_memory_file.read()
imgObj=stream_bytes.read()
msg=MIMEMultipart()
sender='xxx#abc.com'
receiver='jjjj#gmail.com'
subject_header='Please use code provided in this e-mail to confirm your subscription.'
msg["To"]=receiver
msg["From"]=sender
msg["Subject"]=subject_header
html="""<p>Please finish registration by loging into your account and typing in code from this e-mail.<br/><img src="cid:image.png"></p>"""
img_file='image.png'
msgText=MIMEText(html,'html')
msgImg=MIMEImage(imgObj) #Is mistake here?
msgImg.add_header('Content-ID',img_file)
msg.attach(msgImg)
msg.attach(msgText)
smtpObj=smtplib.SMTP('smtp.mandrillapp.com', 587)
smtpObj.login(userName,userPassword)
smtpObj.sendmail(sender,receiver,msg.as_string())
I am not getting errors now, but the e-mail does not have the image in it. I am confused about how the image gets attached and how it is referenced from the HTML part of the e-mail. Any help is appreciated!
UPDATE:
This code actually works - I just had minor typo in the code on my PC.

There are a couple of conceptual errors there, both in using PIL and on what format an image should be in order to be incorporated into an e-mail.
In PIL: the ImageDraw class operates in place, unlike the Image class methods, which usually return a new image after each operation. In your code, this means that the call to image_base.text actually changes the pixel data of the object held in your image variable. The call itself returns None, which is why the following line raises "AttributeError: 'NoneType' object has no attribute 'read'".
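A small sketch of this in-place behaviour, assuming Pillow is installed and using its default font; the image size and the text are arbitrary:

```python
from PIL import Image, ImageDraw

image = Image.new("RGB", (120, 40), (255, 255, 255))
before = image.tobytes()

draw = ImageDraw.Draw(image)
result = draw.text((10, 10), "hello", fill=(0, 0, 0))

print(result)                     # the return value of text()
print(image.tobytes() != before)  # whether the pixels of `image` changed
```

The drawing ends up in image itself, so image (not the return value of text) is the object to save or attach.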
Past that (that is, you should fetch the data from your image variable to attach it to the e-mail) comes the second issue: PIL, for obvious reasons, holds images in memory in an uncompressed, raw pixel data format. When attaching images to e-mails we usually want them neatly packaged inside a file. PNG or JPG are usually the better formats, depending on the intent; let's just stay with PNG. So, you have to create the file data using PIL, and then attach the file data (i.e. the data comprising a PNG file, including headers, metadata, and the actual pixel data in compressed form). Otherwise you'd be putting a bunch of uncompressed pixel data in your e-mail that the receiving party would have no way to assemble back into an image (even if they treated the data as pixels, raw pixel data does not contain the image shape).
You have two options: either generate the file bytes in memory, or write them to an actual file on disk and re-read that file for attaching. The second form is easier to follow. The first is both more efficient and "the right thing to do" - so let's keep it:
from io import BytesIO
# In Python 2.x:
# from StringIO import StringIO as BytesIO
from PIL import Image, ImageDraw
image=Image.new("RGBA",(300,400),(255,255,255))
image_base=ImageDraw.Draw(image)
# this actually modifies "image"
emailed_password_pic=image_base.text((150,200),emailed_password,(0,0,0))
stream = BytesIO()
image.save(stream, format="png")
stream.seek(0)
imgObj=stream.read()
...
(NB: I have not checked the part dealing with mail and mime proper in your code - if you are using it correctly, it should work now)
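Since the mail part was not checked above, here is a hedged sketch of how the PNG bytes might be attached and referenced from the HTML. It is not taken from the original posts: the "related" multipart type and the angle brackets around the Content-ID value are common conventions assumed here, and the subject line and text are placeholders. Nothing is sent; the sketch only builds the message.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from io import BytesIO

from PIL import Image, ImageDraw

# Render the picture and serialize it as PNG bytes in memory.
image = Image.new("RGB", (300, 100), (255, 255, 255))
ImageDraw.Draw(image).text((20, 40), "example code", fill=(0, 0, 0))
stream = BytesIO()
image.save(stream, format="png")
png_bytes = stream.getvalue()

# "related" signals that the HTML part refers to the other parts.
msg = MIMEMultipart("related")
msg["Subject"] = "Example subject"  # placeholder header
html = '<p>Please finish registration<br/><img src="cid:image.png"></p>'
msg.attach(MIMEText(html, "html"))

msg_img = MIMEImage(png_bytes, _subtype="png")
# The header value is wrapped in angle brackets; the HTML uses it without them.
msg_img.add_header("Content-ID", "<image.png>")
msg_img.add_header("Content-Disposition", "inline", filename="image.png")
msg.attach(msg_img)

raw_message = msg.as_string()  # what would be handed to smtplib's sendmail
```

The name after "cid:" in the HTML has to match the Content-ID of the image part, which is the link between the two that the question asks about.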

Is it possible to code images into a python script?

Instead of using directories to reference an image, is it possible to code an image into the program directly?

You can use the base64 module to embed data into your programs. From the base64 documentation:
>>> import base64
>>> encoded = base64.b64encode(b'data to be encoded')
>>> encoded
b'ZGF0YSB0byBiZSBlbmNvZGVk'
>>> data = base64.b64decode(encoded)
>>> data
b'data to be encoded'
Using this ability you can base64 encode an image and embed the resulting string in your program. To get the original image data you would pass that string to base64.b64decode.
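Applied to image data, the round trip might look like the sketch below. The bytes in raw are a stand-in for the contents of a real file (for example the result of open("image.png", "rb").read()), and the constant name IMAGE_B64 is just an example.

```python
import base64

# Stand-in for the bytes of a real image file.
raw = b"\x89PNG\r\n\x1a\n" + bytes(range(16))

# One-off step: produce the text you paste into your source code.
IMAGE_B64 = base64.b64encode(raw).decode("ascii")
print("IMAGE_B64 = " + repr(IMAGE_B64))

# At runtime: turn the embedded text back into the original bytes.
restored = base64.b64decode(IMAGE_B64)
```

The restored bytes can then be wrapped in io.BytesIO and handed to whatever image library you use.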

Try the img2py script. It's included as part of wxPython (google to see if you can download it separately).
img2py.py -- Convert an image to PNG format and embed it in a Python module with appropriate code so it can be loaded into a program at runtime. The benefit is that since it is Python source code it can be delivered as a .pyc or 'compiled' into the program using freeze, py2exe, etc.
Usage:
img2py.py [options] image_file python_file

There is no need to base64 encode the string, just paste its repr into the code.
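A minimal sketch of the repr approach, assuming Python 3, where file contents are bytes; raw is a stand-in for real image data and IMAGE_BYTES is an example name:

```python
import ast

# Stand-in for the bytes of a real image file.
raw = b"\x89PNG\r\n\x1a\n\x00\x01\x02"

# repr() yields a bytes literal that is valid Python source.
literal = repr(raw)
print("IMAGE_BYTES = " + literal)

# Evaluating the literal gives back the identical bytes.
restored = ast.literal_eval(literal)
```

For binary data the repr is mostly \x escapes, so the pasted literal tends to be longer than the base64 form.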

If you mean, storing the bytes that represent the image in the program code itself, you could do it by base64 encoding the image file, and setting a variable to that string.
You could also declare a byte array, where the contents of the array are the bytes that represent the image.
In both cases, if you want to operate on the image, you may need to decode the value that you have included in your source code.
Warning: you may be treading on a performance minefield here.
A better way might be to store the images in the directory structure of your module and then load them on demand (perhaps even caching them). You could write a generalized function that loads the right image based on some identifier that maps to the particular image file name shipped as part of your module.
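Such a loader could be sketched as follows. The names IMAGE_DIR, IMAGE_FILES and load_image_bytes are hypothetical, and a temporary directory with a placeholder file stands in for an images directory inside your module so that the example runs on its own.

```python
import tempfile
from functools import lru_cache
from pathlib import Path

# Stand-in for something like Path(__file__).parent / "images".
IMAGE_DIR = Path(tempfile.mkdtemp())
IMAGE_FILES = {"logo": "logo.png"}  # identifier -> file name

# Create a placeholder file so the sketch is self-contained.
(IMAGE_DIR / "logo.png").write_bytes(b"placeholder image bytes")


@lru_cache(maxsize=None)
def load_image_bytes(identifier):
    """Read the file for `identifier` on first use; later calls hit the cache."""
    return (IMAGE_DIR / IMAGE_FILES[identifier]).read_bytes()


first = load_image_bytes("logo")
second = load_image_bytes("logo")  # served from the cache, no second disk read
```

Keeping the mapping in one dictionary means the rest of the program refers to images by identifier only, so files can be renamed or moved without touching the callers.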
