best way to implement custom pretty-printers - python

Customizing pprint.PrettyPrinter
The documentation for the pprint module mentions that the method PrettyPrinter.format is intended to make it possible to customize formatting.
I gather that it's possible to override this method in a subclass, but this doesn't seem to provide a way to have the base class methods apply line wrapping and indentation.
Am I missing something here?
Is there a better way to do this (e.g. another module)?
Alternatives?
I've checked out the pretty module, which looks interesting, but doesn't seem to provide a way to customize formatting of classes from other modules without modifying those modules.
I think what I'm looking for is something that would allow me to provide a mapping from types (or maybe functions that identify types) to routines that process a node. The routines that process a node would take a node and return its string representation, along with a list of child nodes. And so on.
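Just to illustrate the shape of the API I have in mind, here is a rough sketch (the handler registration and all names are made up, not taken from any existing module):
import xml.etree.ElementTree as ET

# type -> handler dispatch; each handler returns (label, child_nodes)
def format_element(node):
    return f"<{node.tag}>", list(node)       # tag name plus child elements

HANDLERS = {ET.Element: format_element}

def pretty(node, indent=0):
    for node_type, handler in HANDLERS.items():
        if isinstance(node, node_type):
            label, children = handler(node)
            break
    else:
        label, children = repr(node), []     # fallback for unregistered types
    print(" " * indent + label)
    for child in children:
        pretty(child, indent + 2)

pretty(ET.fromstring("<book><chapter><para/></chapter></book>"))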
Why I’m looking into pretty-printing
My end goal is to compactly print custom-formatted sections of a DocBook-formatted xml.etree.ElementTree.
(I was surprised to not find more Python support for DocBook. Maybe I missed something there.)
I built some basic functionality into a client called xmlearn that uses lxml. For example, to dump a Docbook file, you could:
xmlearn -i docbook_file.xml dump -f docbook -r book
It's pretty half-ass, but it got me the info I was looking for.
xmlearn has other features too, like the ability to build a graph image and do dumps showing the relationships between tags in an XML document. These are pretty much totally unrelated to this question.
You can also perform a dump to an arbitrary depth, or specify an XPath as a set of starting points. The XPath stuff sort of obsoleted the docbook-specific format, so that isn't really well-developed.
This still isn't really an answer for the question. I'm still hoping that there's a readily customizable pretty printer out there somewhere.

My solution was to replace pprint.PrettyPrinter with a simple wrapper that formats any floats it finds before calling the original printer.
from __future__ import division
import pprint

if not hasattr(pprint, 'old_printer'):
    pprint.old_printer = pprint.PrettyPrinter

class MyPrettyPrinter(pprint.old_printer):
    def _format(self, obj, *args, **kwargs):
        if isinstance(obj, float):
            obj = round(obj, 4)
        return pprint.old_printer._format(self, obj, *args, **kwargs)

pprint.PrettyPrinter = MyPrettyPrinter

def pp(obj):
    pprint.pprint(obj)

if __name__ == '__main__':
    x = [1, 2, 4, 6, 457, 3, 8, 3, 4]
    x = [_/17 for _ in x]
    pp(x)

This question may be a duplicate of:
Any way to properly pretty-print ordered dictionaries in Python?
Using pprint.PrettyPrinter
I looked through the source of pprint. It seems to suggest that, in order to enhance pprint(), you’d need to:
subclass PrettyPrinter
override _format()
test with issubclass(), and
(if it's not your class) pass the call back to the base _format()
Alternative
I think a better approach would be just to have your own pprint(), which defers to pprint.pformat when it doesn't know what's up.
For example:
'''Extending pprint'''

from pprint import pformat

class CrazyClass: pass

def prettyformat(obj):
    if isinstance(obj, CrazyClass):
        return "^CrazyFoSho^"
    else:
        return pformat(obj)

def prettyp(obj):
    print(prettyformat(obj))

# test
prettyp([1]*100)
prettyp(CrazyClass())
The big upside here is that you don't depend on pprint internals. It’s explicit and concise.
The downside is that you’ll have to take care of indentation manually.

If you would like to modify the default pretty printer without subclassing, you can use the internal _dispatch table on the pprint.PrettyPrinter class. You can see examples of how dispatching is added for internal types like dictionaries and lists in the source.
Here is how I added a custom pretty printer for MatchPy's Operation type:
import pprint
import matchpy

def _pprint_operation(self, object, stream, indent, allowance, context, level):
    """
    Modified from pprint dict https://github.com/python/cpython/blob/3.7/Lib/pprint.py#L194
    """
    operands = object.operands
    if not operands:
        stream.write(repr(object))
        return
    cls = object.__class__
    stream.write(cls.__name__ + "(")
    self._format_items(
        operands, stream, indent + len(cls.__name__), allowance + 1, context, level
    )
    stream.write(")")

pprint.PrettyPrinter._dispatch[matchpy.Operation.__repr__] = _pprint_operation
Now if I use pprint.pprint on any object that has the same __repr__ as matchpy.Operation, it will use this method to pretty-print it. This works on subclasses as well, as long as they don't override __repr__, which makes some sense: if you have the same __repr__, you get the same pretty-printing behavior.
Here is an example of pretty-printing some MatchPy operations now:
ReshapeVector(Vector(Scalar('1')),
Vector(Index(Vector(Scalar('0')),
If(Scalar('True'),
Scalar("ReshapeVector(Vector(Scalar('2'), Scalar('2')), Iota(Scalar('10')))"),
Scalar("ReshapeVector(Vector(Scalar('2'), Scalar('2')), Ravel(Iota(Scalar('10'))))")))))

Consider using the pretty module:
http://pypi.python.org/pypi/pretty/0.1

Related

python string to a function call with arguments, without using eval

I have a string stored in a database that represents a class instantiation, for example module1.CustomHandler(filename="abc.csv", mode="rb"), where CustomHandler is a class defined in module1.
I would like to evaluate this string to create a class instance for a one time use. Right now I am using something like this
statement = r'module1.CustomHandler(filename="abc.csv", mode="rb")' # actually read from db
exec(f'from parent.module import {statement.split(".")[0]}')
func_or_instance = eval(statement) # this is what I need
Only knowledgeable developers can insert such records into the database, so I am not worried about eval running unwanted code. But I've read several posts saying eval is unsafe and that there is always a better way. Is there a way I can achieve this without using eval?
You might want to take a look at the ast Python module, which stands for abstract syntax trees. It's mainly used when you need to process the grammar of the programming language or work with code in string form; many more functions are described in the official documentation.
In this case the eval() function looks like the clearest and most readable solution, but it is safe only under certain conditions. For example, if you try to evaluate code that references a class not defined in your program, it will throw an exception. That's one of the reasons eval is sometimes considered unsafe.
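If you do want to avoid eval entirely, here is a minimal sketch of the ast route. It assumes the stored string always has the form module.ClassName(keyword=literal, ...) and that the module is importable under that name (adjust the import path if, as in your exec line, it lives inside a package); instantiate is a made-up helper name.
import ast
import importlib

def instantiate(statement):
    call = ast.parse(statement, mode="eval").body             # an ast.Call node
    module_name = call.func.value.id                          # e.g. "module1"
    class_name = call.func.attr                               # e.g. "CustomHandler"
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**kwargs)

handler = instantiate('module1.CustomHandler(filename="abc.csv", mode="rb")')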

Is it a good idea to inherit from dict or list classes? [duplicate]

Here's my general problem space:
I have a byte/bit protocol with a device over I2C.
I've got a "database" of the commands to fully describe all the bitfields types and values and enumerations.
I have a class to consume the database and a i2c driver/transactor so that I can then call commands and get responses.
MyProtocol = Protocol('database.xml',I2CDriver())
theStatus = MyProtocol.GET_STATUS()
creates the proper byte stream for the GET_STATUS command, sends it over the i2c and returns the response as a byte array currently. I can get it to pretty print the response inside of the GET_STATUS() implementation, but I want to move that behavior to return object, rather than in the command.
I want my return object to be 'smart': theStatus needs to have the list/array of bytes plus a reference to its field definitions.
I want theStatus to act like a list/bytearray, so I can directly inspect the bytes. I don't care if slices are anything other than lists of bytes or bytearrays; once they've been sliced out, they're just bytes.
I want theStatus to be printable with print(theStatus) and have it pretty-print all the fields in the status. I'm comfortable with how to make this happen once I settle on a workable data structure that lets me access both the bytes and the database.
I want to inspect theStatus by field name with something like theStatus.FIELDNAME or maybe theStatus['FIELDNAME']. Same thing: once I have a workable data structure that has the byte array and database as members, I can make this happen.
The problem is I don't know the "right" data structure to cause the least amount of problems.
Any suggestions on the most Pythonic way to accomplish this? My initial thought was to subclass list and add the field definitions as a member, but it seems like Python doesn't like that idea at all.
Composition seems like the next best bet, but getting it to act like a proper list seems like it might be a bunch of work to get 'right'.
What you really want is to implement a new sequence type, one that is perhaps mutable. You can either create one from scratch by implementing the special methods needed to emulate container types, or you can use a suitable collections.abc collection ABC as a base.
The latter is probably the easiest path to take, as the ABCs provide implementations for many of the methods as base versions that rely on a few abstract methods you must implement.
For example, the (immutable) Sequence ABC only requires you to provide implementations for __getitem__ and __len__; the base ABC implementation provides the rest:
from collections.abc import Sequence

class StatusBytes(Sequence):
    def __init__(self, statusbytes):
        self._bytes = statusbytes

    def __getitem__(self, idx_or_name):
        try:
            return self._bytes[idx_or_name]
        except TypeError:
            # not an integer index or slice; assume it is a field name
            # (FIELDNAMES stands in for your name -> offset lookup)
            return FIELDNAMES[idx_or_name]

    def __len__(self):
        return len(self._bytes)
If you really need a full list implementation, including support for rich comparisons (list_a <= list_b), sorting (list_a.sort()), copying (list_a.copy()) and multiplication (list_a * 3), then there is also the collections.UserList() class. This class inherits from collections.abc.MutableSequence, and adds the extra functionality that list offers over the base sequence ABCs. If you don't need that extra functionality, stick to the base ABCs.
It sounds like you're looking for collections.UserList.
Make a subclass that inherits from collections.UserList; that's exactly what it's for: https://docs.python.org/3/library/collections.html#collections.UserList
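A minimal sketch of what that could look like for theStatus (field_defs and the field names are hypothetical stand-ins for the database of field definitions, not the real protocol):
from collections import UserList

class StatusBytes(UserList):
    """Response bytes plus the field definitions that describe them."""

    def __init__(self, raw_bytes=None, field_defs=None):
        super().__init__(raw_bytes)              # self.data holds the bytes
        self._field_defs = field_defs or {}      # e.g. {"READY": 0, "ERROR": 1}

    def __getattr__(self, name):
        defs = self.__dict__.get("_field_defs", {})
        if name in defs:
            return self.data[defs[name]]
        raise AttributeError(name)

    def __str__(self):
        return "\n".join(f"{name}: {self.data[idx]:#04x}"
                         for name, idx in self._field_defs.items())

theStatus = StatusBytes([0x01, 0xFF], {"READY": 0, "ERROR": 1})
print(theStatus.READY)     # 1
print(theStatus[1])        # 255 -- indexing still works like a list
print(theStatus)           # READY: 0x01, then ERROR: 0xff on the next line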

How to change mapping type in pyyaml?

I want to load some YAML data in a Python script, but instead of using a regular dict for mappings, I would like to use my custom class (which keeps the insertion order and also merges keys/subdictionaries instead of overwriting them). I don't want to alter any other YAML types, only mappings. Browsing through the net, the pyyaml documentation and SO, I don't see any clear and generic solution - in almost all cases undocumented features of pyyaml are used (for example, in most of the solutions here). Initially I was thinking about inheriting from yaml.Loader and reimplementing construct_mapping(), but it seems that this would also require using some pyyaml internals... But maybe it would "just work" when done like this:
# in my custom loader
def construct_mapping(self, node, deep=False):
    mapping = yaml.Loader.construct_mapping(self, node, deep)
    # do my own stuff with mapping, changing it to my own type
    return mapping
?
Maybe I should just add my custom constructor and use !!map as the YAML tag to be matched? Will this work, even though a mapping typically has no explicit tag in the YAML file?
Am I missing some obvious solution here, or maybe reimplementing construct_mapping() is the easiest approach?
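For reference, here is a minimal sketch of the constructor-registration approach I have in mind (MyMapping stands in for my custom class, and I'm assuming the SafeLoader here):
import yaml

class MyMapping(dict):
    """Stand-in for the custom mapping class."""

def construct_my_mapping(loader, node):
    # deep=True so nested mappings are constructed before we wrap them
    return MyMapping(loader.construct_mapping(node, deep=True))

# Register for the implicit mapping tag, so plain YAML mappings are matched
yaml.add_constructor(
    yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,   # "tag:yaml.org,2002:map"
    construct_my_mapping,
    Loader=yaml.SafeLoader,
)

data = yaml.load("a: 1\nb:\n  c: 2\n", Loader=yaml.SafeLoader)
print(type(data), type(data["b"]))   # both MyMapping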

What is the Python equivalent of Ruby's "inspect"?

I just want to quickly see the properties and values of an object in Python. How do I do that in the terminal on a Mac (very basic stuff, I've never used Python)?
Specifically, I want to see what message.attachments are in this Google App Engine MailHandler example (images, videos, docs, etc.).
If you want to dump the entire object, you can use the pprint module to get a pretty-printed version of it.
from pprint import pprint
pprint(my_object)
# If there are many levels of recursion, and you don't want to see them all
# you can use the depth parameter to limit how many levels it goes down
pprint(my_object, depth=2)
Edit: I may have misread what you meant by 'object' - if you're wanting to look at class instances, as opposed to basic data structures like dicts, you may want to look at the inspect module instead.
Use the getmembers function of the inspect module.
It returns a list of (key, value) tuples. It gets each value from obj.__dict__ if available and uses getattr if there is no corresponding entry in obj.__dict__. It can save you from writing a few lines of code for this purpose.
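For example (Message is just a made-up class here):
import inspect
from pprint import pprint

class Message:
    def __init__(self):
        self.subject = "hi"
        self.attachments = ["photo.png"]

# Filter out the dunder attributes to keep the output readable
pprint([(name, value) for name, value in inspect.getmembers(Message())
        if not name.startswith("__")])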
Update
There are better ways to do this than dir. See other answers.
Original Answer
Use the built-in function dir(fp) to see the attributes of fp.
I'm surprised no one else has mentioned Python's __str__ method, which provides a string representation of an object. Unfortunately, it doesn't seem to print automatically in pdb.
One can also use __repr__ for that, but __repr__ has other requirements: for one thing, you are (at least in theory) supposed to be able to eval() the output of __repr__, though that requirement seems to be enforced only rarely.
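A small illustration with a made-up class:
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        return f"({self.x}, {self.y})"             # what print() shows

    def __repr__(self):
        return f"Point({self.x!r}, {self.y!r})"    # eval()-able, in theory

p = Point(1, 2)
print(p)        # (1, 2)
print([p])      # [Point(1, 2)] -- containers always use __repr__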
Try
repr(obj) # returns a printable representation of the given object
or
dir(obj) # the list of the object's attributes and methods
or
obj.__dict__ # object variables
Or combine Abrer's and Mazur's answers and get:
from pprint import pprint
pprint(my_object.__dict__)

Python __setattr__ and __getattr__ for global scope?

Suppose I need to create my own small DSL that would use Python to describe a certain data structure. E.g. I'd like to be able to write something like
f(x) = some_stuff(a,b,c)
and have Python, instead of complaining about undeclared identifiers or attempting to invoke the function some_stuff, convert it to a literal expression for my further convenience.
It is possible to get a reasonable approximation to this by creating a class with properly redefined __getattr__ and __setattr__ methods and use it as follows:
e = Expression()
e.f[e.x] = e.some_stuff(e.a, e.b, e.c)
It would be cool though, if it were possible to get rid of the annoying "e." prefixes and maybe even avoid the use of []. So I was wondering, is it possible to somehow temporarily "redefine" global name lookups and assignments? On a related note, maybe there are good packages for easily achieving such "quoting" functionality for Python expressions?
I'm not sure it's a good idea, but I thought I'd give it a try. To summarize:
class PermissiveDict(dict):
    default = None

    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            return self.default

def exec_with_default(code, default=None):
    ns = PermissiveDict()
    ns.default = default
    exec(code, ns)
    return ns
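Hypothetical usage of the above: any name the code doesn't define just comes back as the default instead of raising a NameError.
ns = exec_with_default("result = unknown_name or 42")
print(ns["result"])   # 42, because unknown_name quietly resolved to None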
You might want to take a look at the ast or parser modules included with Python to parse, access and transform the abstract syntax tree (or parse tree, respectively) of the input code. As far as I know, the Sage mathematical system, written in Python, has a similar sort of precompiler.
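For instance, to look at the structure of the quoted expression from the question without executing anything:
import ast

tree = ast.parse("e.f[e.x] = e.some_stuff(e.a, e.b, e.c)")
print(ast.dump(tree.body[0]))   # an Assign node you can walk and rewrite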
In response to Wai's comment, here's one fun solution that I've found. First of all, to explain once more what it does, suppose that you have the following code:
definitions = Structure()
definitions.add_definition('f[x]', 'x*2')
definitions.add_definition('f[z]', 'some_function(z)')
definitions.add_definition('g.i', 'some_object[i].method(param=value)')
where adding definitions implies parsing the left-hand sides and the right-hand sides and doing other ugly stuff. Now one (not necessarily good, but certainly fun) approach here would allow you to write the above code as follows:
#my_dsl
def definitions():
    f[x] = x*2
    f[z] = some_function(z)
    g.i = some_object[i].method(param=value)
and have Python do most of the parsing under the hood.
The idea is based on the simple exec(code, environment) call, mentioned by Ian, with one hackish addition. Namely, the bytecode of the function must be slightly tweaked so that all local variable access operations (LOAD_FAST) are switched to variable access from the environment (LOAD_NAME).
It is easier shown than explained: http://fouryears.eu/wp-content/uploads/pydsl/
There are various tricks you may want to do to make it practical. For example, in the code presented at the link above you can't use builtin functions and language constructions like for loops and if statements within a #my_dsl function. You can make those work, however, by adding more behaviour to the Env class.
Update. Here is a slightly more verbose explanation of the same thing.
