Python SAX Parser: resolveEntity - python

I am having a hard time figuring out how to bind an entity resolver of my own to a SAX parser. On SO there is this answer, but unfortunately I cannot reproduce the result there.
When I run the following code, which is actually copied from the aforementioned answer, just updated to Python 3,
import io
import xml.sax
from xml.sax.handler import ContentHandler

# Inheriting from EntityResolver and DTDHandler is not necessary
class TestHandler(ContentHandler):
    # This method is only called for external entities. Must return a value.
    def resolveEntity(self, publicID, systemID):
        print("TestHandler.resolveEntity(): %s %s" % (publicID, systemID))
        return systemID

    def skippedEntity(self, name):
        print("TestHandler.skippedEntity(): %s" % (name))

    def unparsedEntityDecl(self, name, publicID, systemID, ndata):
        print("TestHandler.unparsedEntityDecl(): %s %s" % (publicID, systemID))

    def startElement(self, name, attrs):
        summary = attrs.get('summary', '')
        print('TestHandler.startElement():', summary)

def main(xml_string):
    try:
        parser = xml.sax.make_parser()
        curHandler = TestHandler()
        parser.setContentHandler(curHandler)
        parser.setEntityResolver(curHandler)
        parser.setDTDHandler(curHandler)
        stream = io.StringIO(xml_string)
        parser.parse(stream)
        stream.close()
    except xml.sax.SAXParseException as e:
        print("ERROR %s" % e)

XML = """<!DOCTYPE test SYSTEM "test.dtd">
<test summary='step: &num;'>Entity: &not;</test>
"""

main(XML)
and the external test.dtd
<!ENTITY num "FOO">
<!ENTITY pic SYSTEM 'bar.gif' NDATA gif>
What I got is
TestHandler.startElement(): step:
TestHandler.skippedEntity(): not
Process finished with exit code 0
So my questions are:
Why was resolveEntity never called?
How do I bind an entity resolver to my parser?

What you are seeing has to do with a change in Python 3.7.1:
Changed in version 3.7.1: The SAX parser no longer processes general external entities by default to increase security. Before, the parser created network connections to fetch remote files or loaded local files from the file system for DTD and entities. The feature can be enabled again with method setFeature() on the parser object and argument feature_external_ges.
To get the same behaviour as in earlier versions, add these lines:
from xml.sax.handler import feature_external_ges
and (in the main function)
parser.setFeature(feature_external_ges, True)
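Putting the fix together, here is a minimal self-contained sketch (assuming Python 3.7.1+; it parses from a file path rather than a StringIO so the relative system ID "test.dtd" can be resolved, and records handler calls in a list to make the behaviour visible):

```python
import os
import tempfile
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges

events = []  # record handler calls so the behaviour is visible

class TestHandler(ContentHandler):
    def resolveEntity(self, publicID, systemID):
        events.append(('resolve', systemID))
        return systemID  # let the parser load it relative to the document

    def startElement(self, name, attrs):
        events.append(('start', attrs.get('summary', '')))

# Write the document and its DTD to a temporary directory.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, 'test.dtd'), 'w') as f:
    f.write('<!ENTITY num "FOO">\n')
with open(os.path.join(tmp, 'test.xml'), 'w') as f:
    f.write('<!DOCTYPE test SYSTEM "test.dtd">\n'
            "<test summary='step: &num;'>text</test>\n")

parser = xml.sax.make_parser()
handler = TestHandler()
parser.setContentHandler(handler)
parser.setEntityResolver(handler)
parser.setFeature(feature_external_ges, True)  # the crucial line
parser.parse(os.path.join(tmp, 'test.xml'))

print(events)
```

With the feature enabled, resolveEntity is called for test.dtd and the external entity &num; is expanded in the attribute value.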


Sublime plugin for executing a command

I've been writing markdown files lately, and have been using the awesome table of contents generator (github-markdown-toc) tool/script on a daily basis, but I'd like the TOC to be regenerated automatically each time I press Ctrl+S, right before saving the .md file in my Sublime Text 3 environment.
What I have done till now was to generate it from the shell manually, using:
gh-md-toc --insert my_file.md
So I wrote a simple plugin, but for some reason I can't see the result I wanted.
I see my print output, but the TOC is not generated.
Does anybody have any suggestions? What's wrong?
import sublime, sublime_plugin
import subprocess

class AutoRunTOCOnSave(sublime_plugin.EventListener):
    """ A class to listen for events triggered by ST. """

    def on_post_save_async(self, view):
        """
        This is called after a view has been saved. It runs in a separate thread
        and does not block the application.
        """
        file_path = view.file_name()
        if not file_path:
            return
        NOT_FOUND = -1
        pos_dot = file_path.rfind(".")
        if pos_dot == NOT_FOUND:
            return
        file_extension = file_path[pos_dot:]
        if file_extension.lower() == ".md":
            print("Markdown TOC was invoked: handling with *.md file")
            subprocess.Popen(["gh-md-toc", "--insert ", file_path])
Here's a slightly modified version of your plugin:
import sublime
import sublime_plugin
import subprocess

class AutoRunTOCOnSaveListener(sublime_plugin.EventListener):
    """ A class to listen for events triggered by ST. """

    def on_post_save_async(self, view):
        """
        This is called after a view has been saved. It runs in a separate thread
        and does not block the application.
        """
        file_path = view.file_name()
        if not file_path:
            return
        if file_path.split(".")[-1].lower() == "md":
            print("Markdown TOC was invoked: handling with *.md file")
            subprocess.Popen(["/full/path/to/gh-md-toc", "--insert ", file_path])
I changed a couple of things, along with the name of the class. First, I simplified your test for determining whether the current file is a Markdown document (fewer operations means less room for error). Second, you should include the full path to the gh-md-toc command, as it's possible subprocess.Popen can't find it on the default path.
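As a quick sanity check of that simplified extension test (the file names here are made up):

```python
# The last dot-separated component, lowercased, is the extension test.
names = ["notes.md", "README.MD", "archive.tar.gz", "no_extension"]
markdown = [n for n in names if n.split(".")[-1].lower() == "md"]
print(markdown)  # ['notes.md', 'README.MD']
```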
I figured it out: since gh-md-toc is a bash script, I replaced the following line:
subprocess.Popen(["gh-md-toc", "--insert ", file_path])
with:
subprocess.check_call("gh-md-toc --insert %s" % file_path, shell=True)
So now it works well, on each save.
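For what it's worth, the trailing space inside "--insert " may have been the real culprit with the list form: subprocess passes list arguments to the child process verbatim, so the tool would see an unknown flag "--insert " rather than --insert. A small self-contained demonstration, using Python itself as the child process:

```python
import subprocess
import sys

# List-form arguments reach the child verbatim: the trailing space in
# "--insert " is preserved, so a tool parsing its flags would not
# recognize it.
out = subprocess.run(
    [sys.executable, "-c", "import sys; print(repr(sys.argv[1]))", "--insert "],
    capture_output=True, text=True,
)
print(out.stdout.strip())  # "'--insert '"
```

If that's the cause, `["/full/path/to/gh-md-toc", "--insert", file_path]` (no trailing space) would presumably work too, and the list form avoids the quoting pitfalls of shell=True with paths containing spaces.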

Creating a central registry that can be loaded in a distributed way across modules

I am trying to create a registry that I can load with name-factory_method pairs, so that client code is able to use the registry to instantiate these objects by their given names. I can get this to work if I load the registry with pairs within the registry module.
However, I cannot seem to get the registry loaded if I distribute the loading among other modules (e.g. alongside the factory methods). I would prefer that option, as then the registry module doesn't have to be aware of all the potential factory methods.
I have created a simple three module version that works and then one that fails below:
Working version
registry.py
registry = {}

def register_thing(description, thingmaker):
    registry[description] = thingmaker

def get_thing(description, *args, **kwargs):
    thingmaker = registry[description]
    return thingmaker(*args, **kwargs)

def show_things():
    return registry.keys()

from things import Thing1
from things import Thing2

register_thing("Thing1", Thing1)
register_thing("Thing2", Thing2)
things.py
class Thing1(object):
    def __init__(self):
        pass

    def message(self):
        return "This is a thing"

class Thing2(object):
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def message(self):
        return "This is a different thing with args %r and kwargs %r" \
            % (self.args, self.kwargs)
use_things.py
import registry
print("The things in the registry are: %r" % registry.show_things())
print("Getting a Thing1")
thing = registry.get_thing("Thing1")
print("It has message %s" % thing.message())
print("Getting a Thing2")
thing = registry.get_thing("Thing2", "kite", on_string="Mothers new gown")
print("It has message %s" % thing.message())
Running use_things.py gives:
The things in the registry are: dict_keys(['Thing1', 'Thing2'])
Getting a Thing1
It has message This is a thing
Getting a Thing2
It has message This is a different thing with args ('kite',) and kwargs {'on_string': 'Mothers new gown'}
Failing distributed version
registry.py
registry = {}

def register_thing(description, thingmaker):
    registry[description] = thingmaker

def get_thing(description, *args, **kwargs):
    thingmaker = registry[description]
    return thingmaker(*args, **kwargs)

def show_things():
    return registry.keys()
things.py
import registry

class Thing1(object):
    def __init__(self):
        pass

    def message(self):
        return "This is a thing"

class Thing2(object):
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def message(self):
        return "This is a different thing with args %r and kwargs %r" \
            % (self.args, self.kwargs)

registry.register_thing("Thing1", Thing1)
registry.register_thing("Thing2", Thing2)
use_things.py (as before)
Now if I run use_things.py I get the following:
The things in the registry are: dict_keys([])
Getting a Thing1
Traceback (most recent call last):
  File "use_things.py", line 6, in <module>
    thing = registry.get_thing("Thing1")
  File "/home/luke/scratch/registry_example/registry.py", line 7, in get_thing
    thingmaker = registry[description]
KeyError: 'Thing1'
Clearly, the things.py module is never imported, so it never populates the registry.
If I re-add the following line at the bottom of registry.py it again works:
import things
But again this requires registry.py to be aware of the modules needed. I would prefer the registry to be populated automatically by modules below a certain directory but I cannot seem to get this to work. Can anybody help?
What you are describing is basically a "plug-in" software architecture, and there are different ways of implementing one. I personally think using a Python package to do it is a good approach because it's a well-defined, "pythonic" way to organize modules, and the language supports it directly, which makes some of the things involved a little easier.
Here's something that I think does basically everything you want. It's based on my answer to the question How to import members of all modules within a package? which requires putting all the factory scripts in a package directory, in a file hierarchy like this:
use_things.py
things/
    __init__.py
    thing1.py
    thing2.py
The names of the package and factory scripts can easily be changed to something else if you wish.
Instead of having an explicit public registry, it just uses the package's name, things in this example. (There is a private _registry dictionary in the module, though, if you feel you really need one for some reason.)
Although the package does have to be explicitly imported, its __init__.py initialization script will import the rest of the files in the subdirectory automatically — so adding or deleting one is simply a matter of placing its script in the subdirectory or removing it from there.
There's no register_thing() function in this implementation, because the private _import_all_modules() function in the __init__.py script effectively does it automatically — but note that it "auto-registers" everything public in each factory module script. You can, of course, modify how this works if you want it done in a different manner. (I have a couple of ideas if you're interested.)
Here's the contents of each of the files as outlined above:
use_things.py:
import things # Import package.
print("The things in the package are: %r" % things.show_things())
print("Getting a Thing1")
thing = things.get_thing("Thing1")
print(f"It has message {thing.message()!r}")
print("Getting a Thing2")
thing = things.get_thing("Thing2", "kite", on_string="Mothers new gown")
print(f"It has message {thing.message()!r}")
things/__init__.py:
def _import_all_modules():
    """ Dynamically imports all modules in this package directory. """
    import traceback
    import os
    globals_, locals_ = globals(), locals()
    registry = {}
    # Dynamically import all the package modules in this file's directory.
    # (Note: os.listdir(__name__) assumes the process' current working
    # directory contains the package directory.)
    for filename in os.listdir(__name__):
        # Process all python files in directory that don't start with an underscore
        # (which also prevents this module from importing itself).
        if filename[0] != '_' and filename.split('.')[-1] in ('py', 'pyw'):
            modulename = filename.split('.')[0]  # Filename sans extension.
            package_module = '.'.join([__name__, modulename])
            try:
                module = __import__(package_module, globals_, locals_, [modulename])
            except:
                traceback.print_exc()
                raise
            for name in module.__dict__:
                if not name.startswith('_'):
                    registry[name] = module.__dict__[name]
    return registry

_registry = _import_all_modules()

def get_thing(description, *args, **kwargs):
    thingmaker = _registry[description]
    return thingmaker(*args, **kwargs)

def show_things():
    return list(_registry.keys())
things/thing1.py
class Thing1(object):
    def __init__(self):
        pass

    def message(self):
        return f'This is a {type(self).__name__}'
things/thing2.py:
class Thing2(object):
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def message(self):
        return (f"This is a different thing with args {self.args}"
                f" and kwargs {self.kwargs}")
Running use_things.py gives:
The things in the package are: ['Thing1', 'Thing2']
Getting a Thing1
It has message 'This is a Thing1'
Getting a Thing2
It has message "This is a different thing with args ('kite',) and kwargs {'on_string': 'Mothers new gown'}"
Note: martineau has mostly answered my question, and the sophisticated stuff is there. However, there was an additional requirement in my question which wasn't very clear. I have used martineau's answer to create a full answer, shared here for anyone wanting to see it.
The additional requirements were that I could use any factory method (not just a class's __init__) and that I wanted to explicitly register only the ones I chose in my registry.
So here is my final version...
I use the same directory structure as Martineau:
use_things.py
things/
    __init__.py
    thing1.py
    thing2.py
To demonstrate the other type of factory_method I have extended use_things.py by a couple of lines:
import things # Import package.
print("The things in the package are: %r" % things.show_things())
print("Getting a Thing1")
thing = things.get_thing("Thing1")
print(f"It has message {thing.message()!r}")
print("Getting a Thing2")
thing = things.get_thing("Thing2", "kite", on_string="Mothers new gown")
print(f"It has message {thing.message()!r}")
print("Getting a Thing2 in a net")
thing = things.get_thing("Thing2_in_net", "kite", on_string="Mothers new gown")
print(f"It has message {thing.message()!r}")
Note that getting Thing2_in_net constructs an object of type Thing2 but with some precomputation applied.
thing1.py now explicitly registers Thing1's constructor (__init__) by declaring a tuple with a name starting with _register_<something>. Another class, UnregisteredThing, is not registered.
class Thing1(object):
    def __init__(self):
        pass

    def message(self):
        return f'This is a {type(self).__name__}'

_register_thing1 = ('Thing1', Thing1)

class UnregisteredThing(object):
    def __init__(self):
        pass

    def message(self):
        return f'This is an unregistered thing'
And thing2.py registers two makers, one the basic constructor of Thing2 and one from a factory method:
class Thing2(object):
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def message(self):
        return (f"This is a different thing with args {self.args}"
                f" and kwargs {self.kwargs}")

def build_thing2_in_net(*args, **kwargs):
    return Thing2(*args, located='in net', **kwargs)

_register_thing2 = ('Thing2', Thing2)
_register_thing2_in_net = ('Thing2_in_net', build_thing2_in_net)
Finally, the __init__.py script is modified to look specifically for module attributes named _register_<something> and treat each one as a key/maker pair to register:
def build_registry():
    """ Dynamically imports all modules in this package directory. """
    import traceback
    import os
    globals_, locals_ = globals(), locals()
    registry = {}
    for filename in os.listdir(__name__):
        # Process all python files in directory that don't start with an underscore
        # (which also prevents this module from importing itself).
        if filename[0] != '_' and filename.split('.')[-1] in ('py', 'pyw'):
            modulename = filename.split('.')[0]  # Filename sans extension.
            package_module = '.'.join([__name__, modulename])
            try:
                module = __import__(
                    package_module, globals_, locals_, [modulename])
            except:
                traceback.print_exc()
                raise
            for name in module.__dict__:
                # Look for attributes of the module starting with _register_.
                if name.startswith('_register_'):
                    # If so, assume they are a key/maker pair and register them.
                    key, maker = module.__dict__[name]
                    registry[key] = maker
    return registry

_registry = build_registry()

def get_thing(description, *args, **kwargs):
    thingmaker = _registry[description]
    return thingmaker(*args, **kwargs)

def show_things():
    return list(_registry.keys())
The resulting output shows that only registered things appear in the registry and these can be any method that constructs an object:
The things in the package are: ['Thing2', 'Thing2_in_net', 'Thing1']
Getting a Thing1
It has message 'This is a Thing1'
Getting a Thing2
It has message "This is a different thing with args ('kite',) and kwargs {'on_string': 'Mothers new gown'}"
Getting a Thing2 in a net
It has message "This is a different thing with args ('kite',) and kwargs {'located': 'in net', 'on_string': 'Mothers new gown'}"

How do I get the Python line number and file name of the point this function was called from? [duplicate]

In C++, I can print debug output like this:
printf(
    "FILE: %s, FUNC: %s, LINE: %d, LOG: %s\n",
    __FILE__,
    __FUNCTION__,
    __LINE__,
    logmessage
);
How can I do something similar in Python?
There is a module named inspect which provides this information.
Example usage:
import inspect

def PrintFrame():
    callerframerecord = inspect.stack()[1]  # 0 represents this line
                                            # 1 represents line at caller
    frame = callerframerecord[0]
    info = inspect.getframeinfo(frame)
    print(info.filename)   # __FILE__ -> Test.py
    print(info.function)   # __FUNCTION__ -> Main
    print(info.lineno)     # __LINE__ -> 13

def Main():
    PrintFrame()  # for this line

Main()
However, please remember that there is an easier way to obtain the name of the currently executing file:
print(__file__)
For example:

import inspect

frame = inspect.currentframe()
# __FILE__
fileName = frame.f_code.co_filename
# __LINE__
fileNo = frame.f_lineno
There's more here: http://docs.python.org/library/inspect.html
Building on geowar's answer:
import sys

class __LINE__(object):
    # Note: `import sys` must be at module level; names bound in a class
    # body are not in scope inside its methods.
    def __repr__(self):
        try:
            raise Exception
        except Exception:
            return str(sys.exc_info()[2].tb_frame.f_back.f_lineno)

__LINE__ = __LINE__()
If you normally want to use __LINE__ in e.g. print (or any other time an implicit str() or repr() is taken), the above will allow you to omit the ()s.
(Obvious extension to add a __call__ left as an exercise to the reader.)
You can refer to my answer:
https://stackoverflow.com/a/45973480/1591700

import sys
print(sys._getframe().f_lineno)
You can also make a lambda function.
I was also interested in a __LINE__ command in python.
My starting point was https://stackoverflow.com/a/6811020 and I extended it with a metaclass. With this modification it has the same behavior as in C++.
import inspect

class Meta(type):
    def __repr__(self):
        # Inspiration: https://stackoverflow.com/a/6811020
        callerframerecord = inspect.stack()[1]  # 0 represents this line
                                                # 1 represents line at caller
        frame = callerframerecord[0]
        info = inspect.getframeinfo(frame)
        # print(info.filename)  # __FILE__ -> Test.py
        # print(info.function)  # __FUNCTION__ -> Main
        # print(info.lineno)    # __LINE__ -> 13
        return str(info.lineno)

class __LINE__(metaclass=Meta):
    pass

print(__LINE__)  # prints for example 18
Wow, a 7-year-old question :)
Anyway, taking Tugrul's answer and writing it as a debug-type method, it can look something like:
def debug(message):
    import inspect
    callerframerecord = inspect.stack()[1]
    frame = callerframerecord[0]
    info = inspect.getframeinfo(frame)
    print(info.filename, 'func=%s' % info.function,
          'line=%s:' % info.lineno, message)

def somefunc():
    debug('inside some func')

debug('this')
debug('is a')
debug('test message')

somefunc()
Output:
/tmp/test2.py func=<module> line=12: this
/tmp/test2.py func=<module> line=13: is a
/tmp/test2.py func=<module> line=14: test message
/tmp/test2.py func=somefunc line=10: inside some func
import inspect
import sys
.
.
.
def __LINE__():
    try:
        raise Exception
    except Exception:
        return sys.exc_info()[2].tb_frame.f_back.f_lineno

def __FILE__():
    return inspect.currentframe().f_code.co_filename
.
.
.
print("file: '%s', line: %d" % (__FILE__(), __LINE__()))
Here is a tool to answer this old yet new question!
I recommend using icecream!
Do you ever use print() or log() to debug your code? Of course, you
do. IceCream, or ic for short, makes print debugging a little sweeter.
ic() is like print(), but better:
It prints both expressions/variable names and their values.
It's 40% faster to type.
Data structures are pretty printed.
Output is syntax highlighted.
It optionally includes program context: filename, line number, and parent function.
For example, I created a module icecream_test.py, and put the following code inside it.
from icecream import ic
ic.configureOutput(includeContext=True)

def foo(i):
    return i + 333

ic(foo(123))
Prints
ic| icecream_test.py:6 in <module>- foo(123): 456
To get the line number in Python without importing the whole sys module...
First import the _getframe function:
from sys import _getframe
Then call _getframe and use its f_lineno attribute whenever you want to know the line number:
print(_getframe().f_lineno)  # prints the line number
From the interpreter:
>>> from sys import _getframe
... _getframe().f_lineno # 2
Word of caution from the official Python Docs:
CPython implementation detail: This function should be used for internal and specialized purposes only. It is not guaranteed to exist in all implementations of Python.
In other words: Only use this code for personal testing / debugging reasons.
See the official Python documentation on sys._getframe for more information on the sys module and the _getframe() function.
Based on Mohammad Shahid's answer (above).

Example of use libextractor3 from python

I'm using python-extractor to work with libextractor3. I cannot find any examples of it. Does anyone have any documentation or examples?
Source package of python-extractor contains a file named extract.py which has a small demo on how to use libextractor Python binding.
Content from extract.py
import extractor
import sys
from ctypes import *
import struct

xtract = extractor.Extractor()

def print_k(xt, plugin, type, format, mime, data, datalen):
    mstr = cast(data, c_char_p)
    # FIXME: this ignores 'datalen', not that great...
    # (in general, depending on the mime type and format, only
    # the first 'datalen' bytes in 'data' should be used).
    if (format == extractor.EXTRACTOR_METAFORMAT_UTF8):
        print "%s - %s" % (xtract.keywordTypes()[type], mstr.value)
    return 0

for arg in sys.argv[1:]:
    print "Keywords from %s:" % arg
    xtract.extract(print_k, None, arg)
To get a better understanding of python-extractor, go through the source code in extractor.py.

Speed up nautilus python-extensions for reading image's Exif

I've written a Nautilus extension which reads picture's metadata (executing exiftool), but when I open folders with many files, it really slows down the file manager and hangs until it finishes reading the file's data.
Is there a way to make Nautilus keep its work while it runs my extension? Perhaps the Exif data could appear gradually in the columns while I go on with my work.
#!/usr/bin/python
# Requires:
#   nautilus-python
#   exiftool
#   gconf-python
# Version 0.15
import gobject
import nautilus
from subprocess import Popen, PIPE
from urllib import unquote
import gconf

def getexiftool(filename):
    options = '-fast2 -f -m -q -q -s3 -ExifIFD:DateTimeOriginal -IFD0:Software -ExifIFD:Flash -Composite:ImageSize -IFD0:Model'
    exiftool = Popen(['/usr/bin/exiftool'] + options.split() + [filename],
                     stdout=PIPE, stderr=PIPE)
    # '-Nikon:ShutterCount' cannot be used with the -fast2 argument
    output, errors = exiftool.communicate()
    return output.split('\n')

class ColumnExtension(nautilus.ColumnProvider, nautilus.InfoProvider, gobject.GObject):
    def __init__(self):
        pass

    def get_columns(self):
        return (
            nautilus.Column("NautilusPython::ExifIFD:DateTimeOriginal", "ExifIFD:DateTimeOriginal", "Data (ExifIFD)", "Data di scatto"),
            nautilus.Column("NautilusPython::IFD0:Software", "IFD0:Software", "Software (IFD0)", "Software utilizzato"),
            nautilus.Column("NautilusPython::ExifIFD:Flash", "ExifIFD:Flash", "Flash (ExifIFD)", "Modalità del flash"),
            nautilus.Column("NautilusPython::Composite:ImageSize", "Composite:ImageSize", "Risoluzione (Exif)", "Risoluzione dell'immagine"),
            nautilus.Column("NautilusPython::IFD0:Model", "IFD0:Model", "Fotocamera (IFD0)", "Modello fotocamera"),
            #nautilus.Column("NautilusPython::Nikon:ShutterCount", "Nikon:ShutterCount", "Contatore scatti (Nikon)", "Numero di scatti effettuati dalla macchina a questo file"),
            nautilus.Column("NautilusPython::Mp", "Mp", "Megapixel (Exif)", "Dimensione dell'immagine in megapixel"),
        )

    def update_file_info_full(self, provider, handle, closure, file):
        client = gconf.client_get_default()
        if not client.get_bool('/apps/nautilus/nautilus-metadata/enable'):
            client.set_bool('/apps/nautilus/nautilus-metadata/enable', 0)
            return
        if file.get_uri_scheme() != 'file':
            return
        if file.get_mime_type() in ('image/jpeg', 'image/png', 'image/gif', 'image/bmp', 'image/x-nikon-nef', 'image/x-xcf', 'image/vnd.adobe.photoshop'):
            gobject.timeout_add_seconds(1, self.update_exif, provider, handle, closure, file)
            return Nautilus.OperationResult.IN_PROGRESS
        file.add_string_attribute('ExifIFD:DateTimeOriginal', '')
        file.add_string_attribute('IFD0:Software', '')
        file.add_string_attribute('ExifIFD:Flash', '')
        file.add_string_attribute('Composite:ImageSize', '')
        file.add_string_attribute('IFD0:Model', '')
        file.add_string_attribute('Nikon:ShutterCount', '')
        file.add_string_attribute('Mp', '')
        return Nautilus.OperationResult.COMPLETE

    def update_exif(self, provider, handle, closure, file):
        filename = unquote(file.get_uri()[7:])
        data = getexiftool(filename)
        file.add_string_attribute('ExifIFD:DateTimeOriginal', data[0].replace(':', '-', 2))
        file.add_string_attribute('IFD0:Software', data[1])
        file.add_string_attribute('ExifIFD:Flash', data[2])
        file.add_string_attribute('Composite:ImageSize', data[3])
        file.add_string_attribute('IFD0:Model', data[4])
        #file.add_string_attribute('Nikon:ShutterCount', data[5])
        width, height = data[3].split('x')
        mp = "%.2f" % (float(width) * float(height) / 1000000)
        file.add_string_attribute('Mp', mp + ' Mp')
        Nautilus.info_provider_update_complete_invoke(closure, provider, handle, Nautilus.OperationResult.COMPLETE)
        return False
That happens because you are invoking update_file_info, which is part of the asynchronous IO system of Nautilus. Therefore, it blocks Nautilus if the operations are not fast enough.
In your case it is exacerbated because you are calling an external program, which is an expensive operation. Notice that update_file_info is called once per file: if you have 100 files, you will call the external program 100 times, and Nautilus will have to wait for each call before processing the next one.
Since nautilus-python 0.7, update_file_info_full and cancel_update are available, which allow you to program async calls. You can check the nautilus-python 0.7 documentation for more details.
It's worth mentioning that this was a limitation of nautilus-python only, which previously did not expose those methods available in the C API.
EDIT: Added a couple of examples.
The trick is to make the process as fast as possible, or to make it asynchronous.
Example 1: Invoking an external program
Using a simplified version of your code, we make it asynchronous using GObject.timeout_add_seconds in update_file_info_full.
from gi.repository import Nautilus, GObject
from urllib import unquote
from subprocess import Popen, PIPE

def getexiftool(filename):
    options = '-fast2 -f -m -q -q -s3 -ExifIFD:DateTimeOriginal'
    exiftool = Popen(['/usr/bin/exiftool'] + options.split() + [filename],
                     stdout=PIPE, stderr=PIPE)
    output, errors = exiftool.communicate()
    return output.split('\n')

class MyExtension(Nautilus.ColumnProvider, Nautilus.InfoProvider, GObject.GObject):
    def __init__(self):
        pass

    def get_columns(self):
        return (
            Nautilus.Column(name='MyExif::DateTime',
                            attribute='Exif:Image:DateTime',
                            label='Date Original',
                            description='Data time original'),
        )

    def update_file_info_full(self, provider, handle, closure, file_info):
        if file_info.get_uri_scheme() != 'file':
            return
        filename = unquote(file_info.get_uri()[7:])
        attr = ''
        if file_info.get_mime_type() in ('image/jpeg', 'image/png'):
            GObject.timeout_add_seconds(1, self.update_exif,
                                        provider, handle, closure, file_info)
            return Nautilus.OperationResult.IN_PROGRESS
        file_info.add_string_attribute('Exif:Image:DateTime', attr)
        return Nautilus.OperationResult.COMPLETE

    def update_exif(self, provider, handle, closure, file_info):
        filename = unquote(file_info.get_uri()[7:])
        try:
            data = getexiftool(filename)
            attr = data[0]
        except:
            attr = ''
        file_info.add_string_attribute('Exif:Image:DateTime', attr)
        Nautilus.info_provider_update_complete_invoke(
            closure, provider, handle, Nautilus.OperationResult.COMPLETE)
        return False
The code above will not block Nautilus, and if the column 'Date Original' is available in the column view, the JPEG and PNG images will initially show the 'unknown' value and then slowly be updated (the subprocess is called after 1 second).
Example 2: Using a library
Rather than invoking an external program, it could be better to use a library, as in the example below:
from gi.repository import Nautilus, GObject
from urllib import unquote
import pyexiv2

class MyExtension(Nautilus.ColumnProvider, Nautilus.InfoProvider, GObject.GObject):
    def __init__(self):
        pass

    def get_columns(self):
        return (
            Nautilus.Column(name='MyExif::DateTime',
                            attribute='Exif:Image:DateTime',
                            label='Date Original',
                            description='Data time original'),
        )

    def update_file_info_full(self, provider, handle, closure, file_info):
        if file_info.get_uri_scheme() != 'file':
            return
        filename = unquote(file_info.get_uri()[7:])
        attr = ''
        if file_info.get_mime_type() in ('image/jpeg', 'image/png'):
            metadata = pyexiv2.ImageMetadata(filename)
            metadata.read()
            try:
                tag = metadata['Exif.Image.DateTime'].value
                attr = tag.strftime('%Y-%m-%d %H:%M')
            except:
                attr = ''
        file_info.add_string_attribute('Exif:Image:DateTime', attr)
        return Nautilus.OperationResult.COMPLETE
Eventually, if the routine is slow you would need to make it asynchronous (maybe using something better than GObject.timeout_add_seconds).
Last but not least, in my examples I used GObject Introspection (typically for Nautilus 3), but it is easy to change them to use the nautilus module directly.
The above solution is only partly correct.
Between state changes for file_info metadata, the user should call file_info.invalidate_extension_info() to notify Nautilus of the change.
Failing to do this could leave 'unknown' appearing in your columns.
file_info.add_string_attribute('video_width', video_width)
file_info.add_string_attribute('video_height', video_height)
file_info.add_string_attribute('name_suggestion', name_suggestion)
file_info.invalidate_extension_info()
Nautilus.info_provider_update_complete_invoke(closure, provider, handle, Nautilus.OperationResult.COMPLETE)
Full working example here:
Fully working example
API Documentation
Thanks to Dave! I was looking for a solution to the 'unknown' text in the column for ages.
file_info.invalidate_extension_info()
fixed the issue for me right away :)
Per the API documentation:
https://projects-old.gnome.org/nautilus-python/documentation/html/class-nautilus-python-file-info.html#method-nautilus-python-file-info--invalidate-extension-info
Nautilus.FileInfo.invalidate_extension_info
def invalidate_extension_info()
Invalidates the information Nautilus has about this file, which causes it to request new information from its Nautilus.InfoProvider providers.
