Pythonic way to parse command line output into a container object - python

Please read this whole question before answering, as it's not what you think... I'm looking at creating python object wrappers that represent hardware devices on a system (trimmed example below).
class TPM(object):
    @property
    def attr1(self):
        """
        Protects value from being accidentally modified after
        constructor is called.
        """
        return self._attr1
    def __init__(self, attr1, ...):
        self._attr1 = attr1
        ...
    @classmethod
    def scan(cls):
        """Calls Popen, parses to dict, and passes **dict to constructor"""
Most of the constructor inputs come from running command line tools with subprocess.Popen and then parsing the output to fill in object attributes. I've come up with a few ways to handle this, but I'm unsatisfied with what I've put together thus far and am trying to find a better solution. Here are the common catches that I've found. (Quick note: tool versions are tightly controlled, so parsed outputs don't change unexpectedly.)
Many tools produce variant outputs, sometimes including fields and sometimes not. This means that if you assemble a dict to be wrapped in a container object, the constructor is more or less forced to take **kwargs and not really have defined fields. I don't like this because it makes static analysis via pylint, etc less than useful. I'd prefer a defined interface so that sphinx documentation is clearer and errors can be more reliably detected.
In lieu of **kwargs, I've also tried setting default args to None for many of the fields, with what ends up as pretty ugly results. One thing I dislike strongly about this option is that optional fields don't always come at the end of the command line tool output. This makes it a little mind-bending to look at the constructor and match it up to tool output.
I'd greatly prefer to avoid constructing a dictionary in the first place, but using setattr to create attributes will make pylint unable to detect the _attr1, etc... and create warnings. Any ideas here are welcome...
Basically, I am looking for the proper Pythonic way to do this. To summarize, my requirements are the following:
Command line tool output parsed into a container object.
Container object protects attributes via properties post-construction.
Varying number of inputs to constructor, with working static analysis and error detection for missing required fields during runtime.
Is there a good way of doing this (hopefully without a ton of boilerplate code) in Python? If so, what is it?
EDIT:
Per some of the clarification requests, we can take a look at the tpm_version command. Here's the output for my laptop, but for this TPM it doesn't include every possible attribute. Sometimes, the command will return extra attributes that I also want to capture. This makes parsing to known attribute names on a container object fairly difficult.
TPM 1.2 Version Info:
Chip Version: 1.2.4.40
Spec Level: 2
Errata Revision: 3
TPM Vendor ID: IFX
Vendor Specific data: 04280077 0074706d 3631ffff ff
TPM Version: 01010000
Manufacturer Info: 49465800
Example code (ignore lack of sanity checks, please. trimmed for brevity):
    def __init__(self, chip_version, spec_level, errata_revision,
                 tpm_vendor_id, vendor_specific_data, tpm_version,
                 manufacturer_info):
        self._chip_version = chip_version
        ...
    @classmethod
    def scan(cls):
        # Requires: from subprocess import Popen, PIPE
        tpm_proc = Popen(["/usr/sbin/tpm_version"], stdout=PIPE,
                         universal_newlines=True)
        stdout, _ = tpm_proc.communicate()
        tpm_dict = dict()
        for line in stdout.splitlines():
            # Skip the header line and anything that is not "name: value".
            if "Version Info:" in line or ":" not in line:
                continue
            # Split on the first colon only, in case a value contains one.
            name, value = line.split(":", 1)
            attribute_name = name.strip().replace(' ', '_').lower()
            tpm_dict[attribute_name] = value.strip()
        return cls(**tpm_dict)
The problem here is that this tool (or a different one whose source I may not be able to review to learn every possible field) could add extra fields, in which case my parser would still work but my object would not capture them. That's what I'm really trying to solve in an elegant way.

I've been working on a more solid answer to this over the last few months, as I basically work on hardware support libraries, and have finally come up with a satisfactory (though pretty verbose) answer.
Parse the tool outputs, whatever they look like, into object structures that match how the tool views the device. These can have very generic dict structures, but should be broken out as much as possible.
Create another container class on top of that which uses attributes to access items in the tool-container objects. This enforces an API and can return sane errors across multiple versions of the tool, and across differing tool outputs!
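The two layers described above might look roughly like the sketch below. This is only an illustration under assumptions: the names parse_tool_output, TPMInfo, REQUIRED and extra are hypothetical, the choice of which fields are required is made up, and the sample text is an abbreviated copy of the tpm_version output from the question rather than a live Popen call.

```python
import re


def parse_tool_output(text):
    """Layer 1: generic parse of 'Name: value' lines into a plain dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        # Skip blank lines and headers such as "TPM 1.2 Version Info:".
        if not sep or not value.strip():
            continue
        key = re.sub(r"\W+", "_", name.strip().lower())
        fields[key] = value.strip()
    return fields


class TPMInfo(object):
    """Layer 2: fixed, read-only API on top of the raw parsed fields."""

    # Hypothetical choice of required fields, for illustration only.
    REQUIRED = ("chip_version", "tpm_vendor_id")

    def __init__(self, raw):
        missing = [name for name in self.REQUIRED if name not in raw]
        if missing:
            raise ValueError("missing required fields: %s" % ", ".join(missing))
        self._raw = dict(raw)

    @property
    def chip_version(self):
        return self._raw["chip_version"]

    @property
    def tpm_vendor_id(self):
        return self._raw["tpm_vendor_id"]

    @property
    def spec_level(self):
        # Optional field: None when this tool version does not report it.
        return self._raw.get("spec_level")

    @property
    def extra(self):
        """Fields the tool reported that this class has no property for."""
        known = set(self.REQUIRED) | {"spec_level"}
        return {k: v for k, v in self._raw.items() if k not in known}


SAMPLE = """\
  TPM 1.2 Version Info:
  Chip Version:        1.2.4.40
  Spec Level:          2
  TPM Vendor ID:       IFX
  Manufacturer Info:   49465800
"""

tpm = TPMInfo(parse_tool_output(SAMPLE))
print(tpm.chip_version, tpm.spec_level, tpm.extra)
```

Because every public attribute is an explicit property, pylint and Sphinx see a defined interface, while fields the wrapper does not know about still survive in extra instead of breaking the constructor.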

Related

objc.error: NSInternalInconsistencyException - readFromData:ofType:error: is a subclass responsibility but has not been overridden

I am using PyObjC and am writing some methods in a Python file to read and write files using NSDocument, which uses the abstract NSFileCoordinator class. Accessing files this way instead of just using Python's open lets these classes handle things for me, such as preventing files from being edited by more than one program at a time or giving read/write operations enough time to finish so that they don't deadlock.
These features are very important, and I want the app I am building to be as up to standard as I can make it here.
I have this code that instantiates a NSDocument object that contains the content of whatever file path you put into it, as a function:
@classmethod
def write(cls, file: str):
    path = NSURL.fileURLWithPath_(file)
    ext = file.split('.')[-1]
    doc = NSDocument.alloc().initWithContentsOfURL_ofType_error_(path, ext, None)
When I call this function with a valid file path I get this error:
File "/Users/user123/PycharmProjects/shoutout/src/sutils/cfiles.py", line 27, in write
doc = NSDocument.alloc().initWithContentsOfURL_ofType_error_(path, ext, None)
objc.error: NSInternalInconsistencyException - readFromData:ofType:error: is a subclass responsibility but has not been overridden.
I have searched Objective-C, Swift, and PyObjC forums, googled phrases such as "objective-c is a subclass responsibility but has not been overridden", and checked Stack Overflow and GitHub for existing posts on this error, but I could find none.
As I understand it, Objective-C being polymorphic, my call to initWithContentsOfURL:ofType:error: in turn calls readFromData:ofType:error:, among other methods. However, I don't understand exactly what it means when it says a method "is a subclass responsibility but has not been overridden." I am also not sure what it means to override a class, or for something to be a responsibility, so that doesn't help on my part.
An NSInternalInconsistencyException is raised "when an internal assertion fails and implies an unexpected condition within the called code." I'm not sure what an internal "assertion" is either, or what this could mean.
Any idea of what I could do to fix this?
NSDocument is an abstract class that requires you to subclass it and implement a number of methods to make it usable. This is documented in Apple's documentation for the class.
The older Document-Based App Programming Guide for Mac gives more information on this.
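To show what "subclass responsibility" means, here is the same pattern in plain Python. This is an analogy only, not PyObjC code: the classes Document and TextDocument and their methods are hypothetical stand-ins for NSDocument and a subclass of it.

```python
class Document(object):
    """Abstract base: drives the loading workflow, cannot decode data itself."""

    def init_with_contents(self, data):
        # The base class delegates the decoding step to a hook method.
        self.read_from_data(data)
        return self

    def read_from_data(self, data):
        # "Subclass responsibility": the base class has no implementation.
        raise NotImplementedError(
            "read_from_data is a subclass responsibility "
            "but has not been overridden")


class TextDocument(Document):
    """Concrete subclass that overrides the hook."""

    def read_from_data(self, data):
        self.text = data.decode("utf-8")


doc = TextDocument().init_with_contents(b"hello")
print(doc.text)
```

Calling init_with_contents on the bare Document class fails in the same spirit as the error in the question; the call only works on a subclass that supplies the missing method.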

python what data type is this?

I'm pretty new to Python and am currently playing with the zeroconf library.
When I try to register a service on the network, I see this function definition:
def register_service(self, info, ttl=_DNS_TTL):
    """Registers service information to the network with a default TTL
    of 60 seconds. Zeroconf will then respond to requests for
    information for that service. The name of the service may be
    changed if needed to make it unique on the network."""
    self.check_service(info)
    self.services[info.name.lower()] = info
    if info.type in self.servicetypes:
        self.servicetypes[info.type] += 1
    else:
        self.servicetypes[info.type] = 1
    now = current_time_millis()
    next_time = now
    i = 0
    while i < 3:
        if now < next_time:
            self.wait(next_time - now)
            now = current_time_millis()
            continue
        out = DNSOutgoing(_FLAGS_QR_RESPONSE | _FLAGS_AA)
        out.add_answer_at_time(DNSPointer(info.type, _TYPE_PTR,
            _CLASS_IN, ttl, info.name), 0)
        out.add_answer_at_time(DNSService(info.name, _TYPE_SRV,
            _CLASS_IN, ttl, info.priority, info.weight, info.port,
            info.server), 0)
        out.add_answer_at_time(DNSText(info.name, _TYPE_TXT, _CLASS_IN,
            ttl, info.text), 0)
        if info.address:
            out.add_answer_at_time(DNSAddress(info.server, _TYPE_A,
                _CLASS_IN, ttl, info.address), 0)
        self.send(out)
        i += 1
        next_time += _REGISTER_TIME
Anyone know what type info is meant to be?
EDIT
Thanks for providing the answer that it's an instance of the ServiceInfo class. Setting aside the fact that the docstring provides this answer when one goes searching for it, I'm still unclear on:
the process expert Python programmers follow when encountering this sort of situation: what steps would you take to find the data type of info if, say, the docstring wasn't available?
how does the Python interpreter know info is of the ServiceInfo class when we don't specify the class as part of the input parameter for register_service? How does it know that info.type is a valid property and that, say, info.my_property isn't?
It is an instance of ServiceInfo class.
It can be deduced from reading the code and docstrings. register_service invokes check_service function which, I quote, "checks the network for a unique service name, modifying the ServiceInfo passed in if it is not unique".
It looks like it should be a ServiceInfo. Found in the examples of the repository:
https://github.com/jstasiak/python-zeroconf/blob/master/examples/registration.py
Edit
I'm not really sure what to say besides "any way I have to". In practice I can't really remember a time when the contract of the interface wasn't made perfectly clear, because that's just part of using Python. Documentation is more a requirement for this reason.
The short answer is, "it doesn't". Python uses the concept of "duck typing" in which any object that supports the necessary operations of the contract is valid. You could have given it any value that has all the properties the code uses and it wouldn't know the difference. So, per part 1, worst case you just have to trace every use of the object back as far as it is passed around and provide an object that meets all the requirements, and if you miss a piece, you'll get a runtime error for any code path that uses it.
My preference is for static typing as well. Largely I think documentation and unit tests just become "harder requirements" when working with dynamic typing since the compiler can't do any of that work for you.
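A small sketch of the duck typing described in this answer. FakeInfo and describe are hypothetical names made up for the illustration and are not part of zeroconf; the point is only that attribute lookups happen at the moment the code runs.

```python
class FakeInfo(object):
    """Hypothetical stand-in: not a ServiceInfo, but has the same attributes."""

    def __init__(self, name, type_):
        self.name = name
        self.type = type_


def describe(info):
    # Python only looks up .name and .type when this line executes.
    return "%s (%s)" % (info.name.lower(), info.type)


print(describe(FakeInfo("Printer._http._tcp.local.", "_http._tcp.local.")))

try:
    describe(42)  # an int has no .name attribute
except AttributeError as exc:
    print("runtime error:", exc)
```

Nothing checks the argument when describe is called; the failure for the int only appears when the missing attribute is actually accessed.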

How do I use the GeometryConstraint class?

I've been trying to get this to work for so long now, I've read the docs here, but I can't seem to understand how to implement the GeometryConstraint.
Normally, the derivative version of this would be:
geometryConstraintNode = pm.geometryConstraint(target, object)
However, in PyMEL it looks a little nicer when setting attributes, which is why I want to use it: it's much more readable.
I've tried this:
geometryConstraintNode = nt.GeometryConstraint(target, object).setName('geoConstraint')
But no luck, can someone take a look?
Shannon
This doesn't work for you?
import pymel.core as pm
const = pm.geometryConstraint('pSphere1', 'locator1', n='geoConstraint')
print(const)
const.rename('fred')
print(const)
The output would be
geoConstraint
fred
and a constraint object named 'fred'.
The pymel node is the return value that comes back from the command defined in pm.animation.geometryConstraint. What it returns is a class wrapper for the actual in-scene constraint, which is defined in pm.nodetypes.GeometryConstraint. It's the class version where you get to do all the attribute setting, etc; the command version is a match for the same thing in maya.cmds with sometimes a little syntactic sugar added.
In this case, the pymel node is like any other pymel node, so things like renaming use the same '.rename' functionality inherited from DagNode. You could also use functions inherited from Transform, like 'getChildren()' or 'setParent()'. The docs make this clear in a roundabout way by including the inheritance tree at the top of the nodetype's page. Basically all pynode returns will share at least DagNode (stuff like naming) and usually Transform (things like move, rotate, parent) or Shape (query components, etc.).

How can I tell PyCharm/IDEA what type an instance or class variable is expected to be? [duplicate]

When it comes to constructors, and assignments, and method calls, the PyCharm IDE is pretty good at analyzing my source code and figuring out what type each variable should be. I like it when it's right, because it gives me good code-completion and parameter info, and it gives me warnings if I try to access an attribute that doesn't exist.
But when it comes to parameters, it knows nothing. The code-completion dropdowns can't show anything, because they don't know what type the parameter will be. The code analysis can't look for warnings.
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
peasant = Person("Dennis", 37)
# PyCharm knows that the "peasant" variable is of type Person
peasant.dig_filth() # shows warning -- Person doesn't have a dig_filth method
class King:
    def repress(self, peasant):
        # PyCharm has no idea what type the "peasant" parameter should be
        peasant.knock_over() # no warning even though knock_over doesn't exist
King().repress(peasant)
# Even if I call the method once with a Person instance, PyCharm doesn't
# consider that to mean that the "peasant" parameter should always be a Person
This makes a certain amount of sense. Other call sites could pass anything for that parameter. But if my method expects a parameter to be of type, say, pygame.Surface, I'd like to be able to indicate that to PyCharm somehow, so it can show me all of Surface's attributes in its code-completion dropdown, and highlight warnings if I call the wrong method, and so on.
Is there a way I can give PyCharm a hint, and say "psst, this parameter is supposed to be of type X"? (Or perhaps, in the spirit of dynamic languages, "this parameter is supposed to quack like an X"? I'd be fine with that.)
EDIT: CrazyCoder's answer, below, does the trick. For any newcomers like me who want the quick summary, here it is:
class King:
    def repress(self, peasant):
        """
        Exploit the workers by hanging on to outdated imperialist dogma which
        perpetuates the economic and social differences in our society.
        @type peasant: Person
        @param peasant: Person to repress.
        """
        peasant.knock_over() # Shows a warning. And there was much rejoicing.
The relevant part is the @type peasant: Person line of the docstring.
If you also go to File > Settings > Python Integrated Tools and set "Docstring format" to "Epytext", then PyCharm's View > Quick Documentation Lookup will pretty-print the parameter information instead of just printing all the @-lines as-is.
Yes, you can use a special documentation format for methods and their parameters so that PyCharm can know the type. Recent PyCharm versions support most common doc formats.
For example, PyCharm extracts types from @param style comments.
See also reStructuredText and docstring conventions (PEP 257).
Another option is Python 3 annotations.
Please refer to the PyCharm documentation section for more details and samples.
If you are using Python 3.0 or later, you can also use annotations on functions and parameters. PyCharm will interpret these as the type the arguments or return values are expected to have:
class King:
    def repress(self, peasant: Person) -> bool:
        peasant.knock_over() # Shows a warning. And there was much rejoicing.
        return peasant.badly_hurt() # Let's say it's not known from here that this method will always return a bool
Sometimes this is useful for non-public methods, that do not need a docstring. As an added benefit, those annotations can be accessed by code:
>>> King.repress.__annotations__
{'peasant': <class '__main__.Person'>, 'return': <class 'bool'>}
Update: As of PEP 484, which has been accepted for Python 3.5, it is also the official convention to specify argument and return types using annotations.
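As a small runnable illustration of the annotation approach: the standard typing.get_type_hints function returns the same information that tools read, and the interpreter itself does not enforce the hints. The Duck class is a hypothetical addition for the example, and Person is given a knock_over method here only so that the code runs.

```python
import typing


class Person:
    def knock_over(self) -> bool:
        return True


class King:
    def repress(self, peasant: Person) -> bool:
        return peasant.knock_over()


class Duck:
    """Not a Person, but has a compatible knock_over method."""

    def knock_over(self) -> bool:
        return False


# The hints are available to tools (and to your own code) as real classes.
hints = typing.get_type_hints(King.repress)
print(hints)

# Annotations are not checked at runtime, so this call still works.
print(King().repress(Duck()))
```

So the annotations help the IDE and static checkers, but they do not replace tests or runtime validation.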
PyCharm extracts types from a @type pydoc string. See PyCharm docs here and here, and Epydoc docs. It's in the 'legacy' section of PyCharm, perhaps it lacks some functionality.
class King:
    def repress(self, peasant):
        """
        Exploit the workers by hanging on to outdated imperialist dogma which
        perpetuates the economic and social differences in our society.
        @type peasant: Person
        @param peasant: Person to repress.
        """
        peasant.knock_over() # Shows a warning. And there was much rejoicing.
The relevant part is the @type peasant: Person line of the docstring.
My intention is not to steal points from CrazyCoder or the original questioner, by all means give them their points. I just thought the simple answer should be in an 'answer' slot.
I'm using PyCharm Professional 2016.1 writing py2.6-2.7 code, and I found that using reStructuredText I can express types in a more succinct way:
class Replicant(object):
    pass
class Hunter(object):
    def retire(self, replicant):
        """ Retire the rogue or non-functional replicant.
        :param Replicant replicant: the replicant to retire.
        """
        replicant.knock_over() # Shows a warning.
See: https://www.jetbrains.com/help/pycharm/2016.1/type-hinting-in-pycharm.html#legacy
You can also assert for a type and PyCharm will infer it:
def my_function(an_int):
    assert isinstance(an_int, int)
    # PyCharm now knows that an_int is of type int
    pass

How to design a program with many configuration options?

Let's say I have a program that has a large number of configuration options. The user can specify them in a config file. My program can parse this config file, but how should it internally store and pass around the options?
In my case, the software is used to perform a scientific simulation. There are about 200 options most of which have sane defaults. Typically the user only has to specify a dozen or so. The difficulty I face is how to design my internal code. Many of the objects that need to be constructed depend on many configuration options. For example an object might need several paths (for where data will be stored), some options that need to be passed to algorithms that the object will call, and some options that are used directly by the object itself.
This leads to objects needing a very large number of arguments to be constructed. Additionally, as my codebase is under very active development, it is a big pain to go through the call stack and pass along a new configuration option all the way down to where it is needed.
One way to prevent that pain is to have a global configuration object that can be freely used anywhere in the code. I don't particularly like this approach as it leads to functions and classes that don't take any (or only one) argument and it isn't obvious to the reader what data the function/class deals with. It also prevents code reuse as all of the code depends on a giant config object.
Can anyone give me some advice about how a program like this should be structured?
Here is an example of what I mean for the configuration option passing style:
class A:
    def __init__(self, opt_a, opt_b, ..., opt_z):
        self.opt_a = opt_a
        self.opt_b = opt_b
        ...
        self.opt_z = opt_z
    def foo(self, arg):
        algo(arg, self.opt_a, self.opt_e)
Here is an example of the global config style:
class A:
    def __init__(self, config):
        self.config = config
    def foo(self, arg):
        algo(arg, self.config)
The examples are in Python but my question stands for any similar programming language.
matplotlib is a large package with many configuration options. It uses rcParams to manage all the default parameters. rcParams stores all the defaults in a dict.
Every function then takes its options as keyword arguments, for example:
def f(x, y, opt_a=None, opt_b=None):
    if opt_a is None: opt_a = rcParams['group1.opt_a']
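A self-contained sketch of this defaults-dict pattern follows. It does not use matplotlib: rc_params is a hypothetical plain dict playing the role of rcParams, and the option names and values are made up for the example.

```python
# Hypothetical module-level defaults, playing the role of matplotlib's rcParams.
rc_params = {
    "group1.opt_a": 10,
    "group1.opt_b": "fast",
}


def f(x, y, opt_a=None, opt_b=None):
    # None means "not given by the caller": fall back to the shared default.
    if opt_a is None:
        opt_a = rc_params["group1.opt_a"]
    if opt_b is None:
        opt_b = rc_params["group1.opt_b"]
    return (x + y) * opt_a, opt_b


print(f(1, 2))           # uses both defaults
print(f(1, 2, opt_a=3))  # overrides one option for this call only

# For example, after loading the user's config file:
rc_params["group1.opt_b"] = "accurate"
print(f(1, 2))
```

The function signature still documents which options the function depends on, while callers only pass what they want to change.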
A few design patterns will help
Prototype
Factory and Abstract Factory
Use these two patterns with configuration objects. Each method will then take a configuration object and use what it needs. Also consider applying a logical grouping to config parameters and think about ways to reduce the number of inputs.
Pseudocode:
// Consider we can run three different kinds of Simulations. sim1, sim2, sim3
ConfigFactory configFactory = new ConfigFactory("/path/to/option/file");
....
Simulation1 sim1;
Simulation2 sim2;
Simulation3 sim3;
sim1.run( configFactory.ConfigForSim1() );
sim2.run( configFactory.ConfigForSim2() );
sim3.run( configFactory.ConfigForSim3() );
Inside of each factory method it might create a configuration from a prototype object (that has all of the "sane" defaults) and the option file becomes just the things that are different from default. This would be paired with clear documentation on what these defaults are and when a person (or other program) might want to change them.
** Edit: **
Also consider that each config returned by the factory is a subset of the overall config.
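In Python, the prototype-plus-factory idea from this answer could be sketched as below. All names (DEFAULTS, ConfigFactory, config_for, Simulation1) and the option values are hypothetical, and the overrides are passed in as a dict rather than parsed from an option file.

```python
import copy

# Prototype: every option with its default (names here are hypothetical).
DEFAULTS = {
    "sim1": {"steps": 1000, "output_dir": "out/sim1"},
    "sim2": {"steps": 500, "tolerance": 1e-6},
}


class ConfigFactory(object):
    def __init__(self, overrides):
        # overrides holds only what the user changed, e.g. parsed from a file.
        self._overrides = overrides

    def config_for(self, section):
        # Clone the prototype, then apply the user's changes for this section.
        config = copy.deepcopy(DEFAULTS[section])
        user = self._overrides.get(section, {})
        unknown = set(user) - set(config)
        if unknown:
            raise KeyError("unknown options for %s: %s"
                           % (section, sorted(unknown)))
        config.update(user)
        return config


class Simulation1(object):
    def run(self, config):
        # Only sees its own subset of the overall configuration.
        return "running %d steps into %s" % (config["steps"],
                                             config["output_dir"])


factory = ConfigFactory({"sim1": {"steps": 20}})
print(Simulation1().run(factory.config_for("sim1")))
```

Each simulation receives a small config of its own, so it stays reusable and its dependencies remain visible, while the defaults live in one documented place.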
Pass around either the config parsing class, or write a class that wraps it and intelligently pulls out the requested options.
Python's standard library configparser exposes the sections and options of an INI style configuration file using the mapping protocol, and so you can retrieve your options directly from that as though it were a dictionary.
import configparser
myconf = configparser.ConfigParser()
myconf.read('myconf.ini')
what_to_do = myconf['section']['option']
If you explicitly want to provide the options using the attribute notation, create a class that overrides __getattr__:
class MyConf:
    def __init__(self, path):
        self._parser = configparser.ConfigParser()
        self._parser.read(path)
    def __getattr__(self, option):
        # Map each option name to the section it lives in.
        return self._parser[{'what_to_do': 'section'}[option]][option]
myconf = MyConf('myconf.ini')
what_to_do = myconf.what_to_do
Have a module load the params to its namespace, then import it and use wherever you want.
Also see related question here
