python image taken date and time - python

I'm trying to create an array that contains the filenames of all images in a folder in the first column and the "time taken" of the image in the second column. This time should be in hh:mm:ss:msmsms (or hhmmssmsmsms), where "ms" is milliseconds.
I found a piece of code that uses the Pillow library to pull the EXIF tag data of the image. I realize that I would need the DateTimeOriginal and SubsecTimeOriginal tags to get the data I want.
Now the problem is that I just don't understand how the code below pulls the data from the image and how I would be able to create the desired array. If anyone knows how the PIL.ExifTags module and the ._getexif() method work, some explanation would be appreciated.
code:
from PIL import Image
from PIL.ExifTags import TAGS
file_path = 'IMG_20200528_125319.jpg'
results = {}
i = Image.open(file_path)
info = i._getexif()
for tag, value in info.items():
    decoded = TAGS.get(tag, tag)  # translate the numeric tag id into its name
    results[decoded] = value
print(results)

Sadly the info I was looking for is not in the exif tags of the picture. See Mark Setchell's comments.
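For images that do carry both tags, the table described in the question could be built roughly as follows. This is a minimal sketch, not a tested solution: the helper names format_time_taken and time_taken_table are made up for illustration, and it assumes the sub-second value is a string of decimal fraction digits (so "5" means 500 ms). _getexif() returns a dictionary keyed by numeric tag id, which is why the loop in the question translates ids to names through TAGS; here the two ids are used directly.

```python
from datetime import datetime
from pathlib import Path

# Numeric EXIF tag ids (TAGS in PIL.ExifTags maps these ids to names).
DATETIME_ORIGINAL = 36867     # "DateTimeOriginal", e.g. "2020:05:28 12:53:19"
SUBSEC_TIME_ORIGINAL = 37521  # "SubsecTimeOriginal", e.g. "123"


def format_time_taken(date_time_original, subsec=None):
    """Return 'hh:mm:ss:mmm' from the two EXIF strings, or None without a timestamp."""
    if not date_time_original:
        return None
    moment = datetime.strptime(date_time_original.strip(), "%Y:%m:%d %H:%M:%S")
    # Assumption: the digits are a decimal fraction, so "5" means 0.5 s = 500 ms.
    digits = "".join(ch for ch in str(subsec or "") if ch.isdigit())
    millis = (digits + "000")[:3]  # pad or truncate to exactly three digits
    return f"{moment:%H:%M:%S}:{millis}"


def time_taken_table(folder):
    """Return [[filename, time_taken], ...] for every .jpg file in folder."""
    # Pillow is imported here so format_time_taken can be used without it.
    from PIL import Image

    rows = []
    for path in sorted(Path(folder).glob("*.jpg")):
        with Image.open(path) as img:
            exif = img._getexif() or {}  # None when the file has no EXIF block
        rows.append([
            path.name,
            format_time_taken(exif.get(DATETIME_ORIGINAL),
                              exif.get(SUBSEC_TIME_ORIGINAL)),
        ])
    return rows
```

Files without a DateTimeOriginal tag end up with None in the second column, which makes the situation described above (the tags simply not being present) easy to spot.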

Related

How to extract images, video and audio from a pdf file using python

I need a Python program that can extract videos, audio, and images from a PDF. I have tried using libraries such as PyPDF2 and Pillow, but I was unable to get even one of the three to work, let alone all of them.
I think you could achieve this using pymupdf.
To extract images see the following: https://pymupdf.readthedocs.io/en/latest/recipes-images.html#how-to-extract-images-pdf-documents
Sound and video are essentially annotation types.
The following "annots" function would get all the annotations of a specific type for a PDF page:
https://pymupdf.readthedocs.io/en/latest/page.html#Page.annots
Annotation types are as follows:
https://pymupdf.readthedocs.io/en/latest/vars.html#annotationtypes
Once you have acquired an annotation I think you can use the get_file method to extract the content ( see: https://pymupdf.readthedocs.io/en/latest/annot.html#Annot.get_file)
Hope this helps!
@George Davis-Diver, can you please let me have an example PDF with video?
Sounds and videos are embedded in their specific annotation types. Neither is a FileAttachment annotation, so the respective methods cannot be used.
For a sound annotation, you must use `annot.get_sound()`, which returns a dictionary where one of the keys is the binary sound stream.
Images, on the other hand, may certainly be embedded as FileAttachment annotations, but this is unusual. Normally they are displayed on the page independently. You can list a page's images like this:
import fitz
from pprint import pprint
doc = fitz.open("your.pdf")
page = doc[0]  # first page - use 0-based page numbers
pprint(page.get_images())
# example output:
# [(1114, 0, 1200, 1200, 8, 'DeviceRGB', '', 'Im1', 'FlateDecode')]
# extract the image stored under xref 1114:
img = doc.extract_image(1114)
This is a dictionary with image metadata and the binary image stream.
Note that PDF stores an image's transparency data separately, which therefore needs some additional care, but let us postpone that until it actually comes up.
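To extend the single-image example to a whole document, something along these lines might work. It is only a sketch: the function names, the out_dir parameter and the img-<xref> file naming are invented for illustration, it relies on extract_image returning the "ext" and "image" keys, and it ignores the separately stored transparency data mentioned above.

```python
import pathlib


def unique_image_xrefs(image_lists):
    """Collect each image xref once, in first-seen order.

    image_lists is an iterable of page.get_images() results; the xref is the
    first item of every tuple. The same image can be referenced by several pages.
    """
    seen = []
    for images in image_lists:
        for item in images:
            if item[0] not in seen:
                seen.append(item[0])
    return seen


def save_pdf_images(pdf_path, out_dir="."):
    """Write every embedded image of the PDF to out_dir and return the paths."""
    # PyMuPDF is imported here so unique_image_xrefs has no dependency on it.
    import fitz

    doc = fitz.open(pdf_path)
    written = []
    for xref in unique_image_xrefs(page.get_images() for page in doc):
        img = doc.extract_image(xref)  # dict with "ext", "image" and metadata
        target = pathlib.Path(out_dir) / f"img-{xref}.{img['ext']}"
        target.write_bytes(img["image"])
        written.append(target)
    return written
```

Collecting the xrefs first avoids writing the same image several times when, for example, a logo is reused on every page.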
Extracting video from RichMedia annotations is currently possible in PyMuPDF low-level code only.
@George Davis-Diver - thanks for the example file!
Here is code that extracts video content:
import sys
import pathlib
import fitz
doc = fitz.open("vid.pdf") # open PDF
page = doc[0] # load desired page (0-based)
annot = page.first_annot # access the desired annot (first one in example)
if annot.type[0] != fitz.PDF_ANNOT_RICH_MEDIA:
    print(f"Annotation type is {annot.type[1]}")
    print("Only support RichMedia currently")
    sys.exit()
cont = doc.xref_get_key(annot.xref, "RichMediaContent/Assets/Names")
if cont[0] != "array": # should be PDF array
    sys.exit("unexpected: RichMediaContent/Assets/Names is no array")
array = cont[1][1:-1] # remove array delimiters
# jump over the name / title: we will get it later
if array[0] == "(":
    i = array.find(")")
else:
    i = array.find(">")
xref = array[i + 1 :] # here is the xref of the actual video stream
if not xref.endswith(" 0 R"):
    sys.exit("media contents array has more than one entry")
xref = int(xref[:-4]) # xref of video stream file
video_filename = doc.xref_get_key(xref, "F")[1]
video_xref = doc.xref_get_key(xref, "EF/F")[1]
video_xref = int(video_xref.split()[0])
video_stream = doc.xref_stream_raw(video_xref)
pathlib.Path(video_filename).write_bytes(video_stream)
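The script above stops when the Names array holds more than one entry. A possible way to handle several entries is to parse the array string with a regular expression instead of slicing it by hand. The sketch below is an assumption-laden illustration, not part of PyMuPDF: parse_names_array is a hypothetical helper, and it assumes that literal strings in the array contain no unescaped nested parentheses.

```python
import re

# One entry of the array: a literal "(...)" or hex "<...>" string followed by
# an indirect reference "N G R". Assumption: no nested parentheses in names.
_ENTRY = re.compile(
    r"(\((?:\\.|[^\\()])*\)|<[0-9A-Fa-f\s]*>)\s*(\d+)\s+\d+\s+R"
)


def parse_names_array(value):
    """Return [(name_token, xref), ...] from a string like '[(a.mp4) 12 0 R]'."""
    return [(m.group(1), int(m.group(2))) for m in _ENTRY.finditer(value)]
```

With this helper, the single-entry logic could be replaced by a loop such as `for name, xref in parse_names_array(cont[1]):`, applying the same "F" and "EF/F" lookups to each xref.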

Python - add arbitrary EXIF data to image (UserComment field)?

I need to add arbitrary data to a JPEG image. Specifically, I need to store two integers. From reading about EXIF data, I'm under the impression that it is not possible to make your own custom fields, but rather the EXIF standard fields must be used.
This post Custom Exif Tags however mentions a UserComment field which I gather it is possible to write a string to. If this is the only option it's fine since I can store two integers in a comma-delimited string, ex '2,5' to store the integers 2 and 5, so if I only have one string of storage to work with it's still sufficient.
I downloaded a few random images from a Google image search and found they don't seem to have EXIF data, perhaps it's stripped off purposefully by Google? Also I took a few images with my cell phone and found that as expected they have a significant amount of EXIF data (image size, GPS location, etc.)
After some Googling I found this example of how to read/dump EXIF data:
from PIL import Image
image = Image.open('image.jpg')
exifData = image._getexif()
print('exifData = ' + str(exifData))
This works great, if I run this on an image with no EXIF data I get:
exifData = None
and if I run this on an image with EXIF data I get a dictionary showing the EXIF fields as expected.
Now my question is, how can I add to the EXIF data? Using the UserComment 37510 field mentioned in the above linked post, and also here https://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html, and using piexif this is my best attempt so far:
from PIL import Image
import piexif
image = Image.open('image.jpg')
exifData = image._getexif()
if exifData is None:
    exifData = {}
# end if
exifData[37510] = 'my message'
exifDataBytes = piexif.dump(exifData)
image.save('image_mod.jpg', format='jpeg', exif=exifDataBytes)
If I then run the 1st code above on image_mod.jpg I get:
exifData = {}
So clearly the 37510 message was not properly written. I get this same empty dictionary result whether I'm using an image that has EXIF data or an image without EXIF data to begin with.
Before somebody marks this as a duplicate, I also tried what this post How can I insert EXIF/other metadata into a JPEG stored in a memory buffer? mentions in the highest-rated answer and got the same result when attempting to read the EXIF data (empty dictionary).
What am I doing wrong? How can I properly add custom EXIF data to an image using 37510, or any other means?
You're missing a step in handling the data passed to piexif.dump:
exif_ifd = {piexif.ExifIFD.UserComment: 'my message'.encode()}
exif_dict = {"0th": {}, "Exif": exif_ifd, "1st": {},
             "thumbnail": None, "GPS": {}}
exif_dat = piexif.dump(exif_dict)
image.save('image_mod.jpg', exif=exif_dat)  # image is the PIL image opened in the question
You should be able to read it back out after this. See also this answer for dealing with custom metadata.
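For the original goal of storing two integers, the comma-delimited idea from the question can be combined with the answer above. The sketch below is illustrative only: all function names are invented, and it assumes a JPEG source file. The EXIF standard expects a UserComment value to begin with an 8-byte character code, so the helpers prepend the ASCII one; existing EXIF data is kept by loading image.info["exif"] when it is present.

```python
ASCII_PREFIX = b"ASCII\x00\x00\x00"  # 8-byte character code for ASCII comments
USER_COMMENT = 37510                 # same value as piexif.ExifIFD.UserComment


def encode_two_ints(a, b):
    """Pack two integers as a UserComment payload: prefix + b'a,b'."""
    return ASCII_PREFIX + f"{a},{b}".encode("ascii")


def decode_two_ints(payload):
    """Inverse of encode_two_ints; also accepts a payload without the prefix."""
    if payload.startswith(ASCII_PREFIX):
        payload = payload[len(ASCII_PREFIX):]
    first, second = payload.decode("ascii").split(",")
    return int(first), int(second)


def save_with_ints(src, dst, a, b):
    """Save src as JPEG dst, keeping existing EXIF and adding the two integers."""
    # Third-party imports live here so the two helpers above need no libraries.
    import piexif
    from PIL import Image

    image = Image.open(src)
    raw = image.info.get("exif")
    if raw:
        exif_dict = piexif.load(raw)
    else:
        exif_dict = {"0th": {}, "Exif": {}, "GPS": {}, "1st": {},
                     "thumbnail": None}
    exif_dict["Exif"][USER_COMMENT] = encode_two_ints(a, b)
    image.save(dst, format="jpeg", exif=piexif.dump(exif_dict))


def read_ints(path):
    """Read the two integers back from a file written by save_with_ints."""
    import piexif

    return decode_two_ints(piexif.load(path)["Exif"][USER_COMMENT])
```

A call such as save_with_ints('image.jpg', 'image_mod.jpg', 2, 5) followed by read_ints('image_mod.jpg') would then be expected to give back the pair (2, 5).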
Rasterio tags are an easy and practical way to add metadata of any kind to an image. Example:
import rasterio
old_file = rasterio.open('old_image.tif')
profile = old_file.profile
data = old_file.read()
with rasterio.open('new_image.tif', 'w', **profile) as dst:
    dst.update_tags(a='1', b='2')
    dst.write(data)
# the with block closes dst, so no explicit close() is needed
# now access the tags like below:
im = rasterio.open('new_image.tif')
print(im.tags())

Python-Retain both geocoding and orientation while image save/overwrite

I am using PIL to enhance my images. While saving I need both the geographic coordinates and the orientation angles written to the header of the enhanced image. So far I have failed to find a way to write the orientation angles.
I could write the coordinates using piexif after reading Preserve exif data of image with PIL when resize(create thumbnail). But this seems not enough to write the orientation also, or maybe I am missing something.
import piexif
from PIL import Image, ImageEnhance
# direc, s, directory and filename are defined elsewhere in the script
im = Image.open(direc + '\\' + filename)
exif_dict = piexif.load(im.info["exif"])
exif_bytes = piexif.dump(exif_dict)
enhancer = ImageEnhance.Brightness(im)
enhanced_im = enhancer.enhance(1.8)
enhanced_im.save(s + 'enhanced\\' + directory + "\\e_" + filename, "JPEG", exif=exif_bytes)
When I print my exif_dict I see two main keys, 0th and Exif, with reasonable key-value pairs under each of them, plus a lot of characters such as \x00\x00\x00q\x00\x00\x00g\x00\x00\x00r\x00\x00\x00l\x00\x00\x0... which continue even after the closing brace of the dictionary. Please advise.
You could write a world file for every image: https://en.wikipedia.org/wiki/World_file
Create a text file, calculate the values, write them in the text file and add the corresponding extension to the file name.
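The world file steps above could be sketched like this. The function and parameter names are made up for illustration, and the six coefficients have to come from your own georeferencing; the code only handles the file naming and the line order (A, D, B, E, C, F) of the world file format. It assumes the image file name has an extension.

```python
from pathlib import Path


def world_file_path(image_path):
    """'photo.jpg' -> 'photo.jgw': first and last letter of the extension + 'w'."""
    path = Path(image_path)
    ext = path.suffix.lstrip(".")  # assumes a non-empty extension
    return path.with_suffix("." + ext[0] + ext[-1] + "w")


def write_world_file(image_path, x_scale, y_skew, x_skew, y_scale,
                     x_origin, y_origin):
    """Write the six coefficients in world-file order A, D, B, E, C, F.

    x_scale (A) and y_scale (E, usually negative) are the pixel sizes,
    y_skew (D) and x_skew (B) are the rotation terms, and x_origin (C) and
    y_origin (F) locate the centre of the upper-left pixel.
    """
    values = (x_scale, y_skew, x_skew, y_scale, x_origin, y_origin)
    target = world_file_path(image_path)
    target.write_text("".join(f"{float(v)}\n" for v in values))
    return target
```

For a north-up image with no rotation, both skew terms are simply 0.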
EDIT: If you need to change the exif values, I would recommend looking at the tags that are already in the exif data and changing or adding the orientation tag (How to modify EXIF data in python).
If you search for the exif orientation tag on Google, you can find an explanation of the values. They are also explained on this page: https://sno.phy.queensu.ca/~phil/exiftool/TagNames/EXIF.html.
This page also explains how to change the orientation: https://magnushoff.com/jpeg-orientation.html.
Hopefully this helps.
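As a rough illustration of the orientation tag suggestion, the snippet below lists the eight defined values and sets the tag in a dictionary shaped like the one piexif.load returns. set_orientation is a hypothetical helper, not a piexif function. Note that this tag can only express these eight flips and quarter turns, not arbitrary angles, so it may not cover what the question calls orientation angles.

```python
ORIENTATION_TAG = 274  # EXIF "Orientation"; same value as piexif.ImageIFD.Orientation

ORIENTATION_MEANING = {
    1: "horizontal (normal)",
    2: "mirror horizontal",
    3: "rotate 180",
    4: "mirror vertical",
    5: "mirror horizontal and rotate 270 CW",
    6: "rotate 90 CW",
    7: "mirror horizontal and rotate 90 CW",
    8: "rotate 270 CW",
}


def set_orientation(exif_dict, value):
    """Set the Orientation tag in the "0th" IFD of a piexif-style dict."""
    if value not in ORIENTATION_MEANING:
        raise ValueError(f"orientation must be 1..8, got {value!r}")
    exif_dict.setdefault("0th", {})[ORIENTATION_TAG] = value
    return exif_dict
```

The modified dictionary can then go through piexif.dump and the exif= argument of save, in the same way as the code in the question.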

Can I convert PDF blob to image using Python and Wand?

I'm trying to convert a PDF's first page to an image. However, the PDF is coming straight from the database in a base64 format. I then convert it to a blob. I want to know if it's possible to convert the first page of the PDF to an image within my Python code.
I'm familiar with being able to use filename in the Image object:
with Image(filename="test.pdf[0]") as img:
The issue I'm facing is there is not an actual filename, just a blob. This is what I have so far, any suggestions would be appreciated.
x = object['file']
fileBlob = base64.b64decode(x)  # decode the variable x, not the literal string 'x'
with Image(**what do I put here for pdf blob?**) as img:
    more code
This works for me:
all_pages = Image(blob=blob_pdf) # PDF will have several pages.
single_image = all_pages.sequence[0] # Just work on first page
with Image(single_image) as i:
    ...
The documentation says something about blobs.
So it should be:
with Image(blob=fileBlob) as img:
    # etc etc
I didn't test that, but I think this is what you are after.
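Putting both answers together, the whole path from base64 text to a first-page image might look like the sketch below. The function names and the resolution default are assumptions for illustration, and rendering PDFs through Wand needs an ImageMagick installation with a PDF delegate such as Ghostscript.

```python
import base64


def decode_pdf_blob(b64_text):
    """Decode base64 text and check that the result looks like a PDF."""
    blob = base64.b64decode(b64_text)
    if not blob.lstrip().startswith(b"%PDF-"):
        raise ValueError("decoded data does not start with a PDF header")
    return blob


def first_page_png(pdf_blob, resolution=150):
    """Render page 1 of an in-memory PDF to PNG bytes with Wand."""
    # Wand is imported here so decode_pdf_blob works without ImageMagick.
    from wand.image import Image

    with Image(blob=pdf_blob, resolution=resolution) as pages:
        with Image(image=pages.sequence[0]) as page:  # first page only
            page.format = "png"
            return page.make_blob()
```

Checking the header before handing the bytes to Wand makes a bad decode (for example decoding the wrong variable) fail early with a clear message.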

pyexiv2 - Empty XMP and IPTC Tags?

I'm trying to manipulate an image's exif, XMP and IPTC tags with Python 2.7 and pyexiv2 in Windows 7. I can get a list of exif tags, but for some reason the XMP and IPTC lists are coming back empty, even though those tags exist in my test image (at least according to the mapping presented here). Has anyone else run into this issue and been able to solve it? Many thanks for any feedback!
Code:
import pyexiv2
img = r'pathToImage'
metadata = pyexiv2.ImageMetadata(img)
metadata.read()
exifTags = metadata.exif_keys
print exifTags
xmpTags = metadata.xmp_keys
print xmpTags
iptcTags = metadata.iptc_keys
print iptcTags
metadata.exif_keys gives you a list of the EXIF tag keys in the image.
To view the keys with their respective values you might want to use a small loop:
for key in exifTags:
    print metadata[key]
Additionally, you can use metadata[key].value or metadata[key].raw_value to access the values themselves.
You will find it all neatly explained in the pyexiv2 tutorial.
