My goal is to provide an easy way to “filter” previously defined nodes. Consider this fictional YAML file:
%YAML 1.1
---
- fruit: &fruitref { type: banana, color: yellow }
- another_fruit: !rotten *fruitref
What do I need to define, in either the YAML file or the Python code that parses it, in order to call a custom function with *fruitref (i.e. the previously defined object, in this case a map) as its argument and get the return value? The goal is the simplest and tersest possible syntax for “filtering” a previously defined value (map, sequence, whatever).
Note
It seems to me that the construct !tag *alias is invalid YAML, because of the error:
expected <block end>, but found '<alias>'
in "/tmp/test.yaml", line 4, column 21
which most possibly implies that I won't be able to achieve the required syntax, but I do care about terseness (or rather, the target users will).
Routes taken
YAML: !!python/object/apply:__main__.rotten [*fruitref]
It works but it is too verbose for the intended use; and there is no need for multiple arguments, the use case is ALWAYS a filter for an alias (a previously defined map/sequence/object).
YAML: %TAG !f! !!python/object/apply:__main__.
Perhaps !f!rotten [*fruitref] would be acceptable, but I can't find how to make use of the %TAG directive.
EDIT: I discovered that the !! shorthand doesn't work for PyYAML 3.10; it has to be the complete tag URI, like this: %TAG !f! tag:yaml.org,2002:python/object/apply:__main__.
Python: yaml.add_constructor
I already use add_constructor for “casting” maps to specific instances of my classes; the caveat is that !tag *alias seems to be invalid YAML.
Best so far
add_constructor('!rotten', filter_rotten) in Python and !rotten [*fruitref] in YAML seem to work, but I'm wondering how to omit the square brackets if possible.
It seems that it is not possible to apply a tag to an already tagged reference, so:
!tag *reference
is not acceptable. The best possible solution is to enclose the reference in square brackets (creating a one-element sequence) and make the tag either a function call or a special constructor expecting a sequence of one object, so the tersest syntax available is:
!prefix!suffix [*reference]
or
!tag [*reference]
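For completeness, a minimal sketch of that best-so-far approach (the body of filter_rotten is invented for illustration, and newer PyYAML releases also want an explicit Loader):
import yaml

def filter_rotten(loader, node):
    # The alias arrives wrapped in a one-element sequence.
    (fruit,) = loader.construct_sequence(node, deep=True)
    rotten = dict(fruit)
    rotten['rotten'] = True  # whatever "filtering" means for your use case
    return rotten

yaml.add_constructor('!rotten', filter_rotten)

doc = """\
- fruit: &fruitref { type: banana, color: yellow }
- another_fruit: !rotten [*fruitref]
"""
print(yaml.load(doc, Loader=yaml.Loader))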
I have a variable, abc, defined as 'batting_stats', and I want to run the line of code pyb.batting_stats(2020). I want to build that line by joining abc with 'pyb.' and '(2020)' and then run it. How can I do this? I only seem to be creating a longer string, not a runnable line of code. Cheers!
You probably don't want to do this; it's possible, but in the vast majority of circumstances, it's a bad idea.
Options:
If possible, try to rewrite the code so you don't need to look up by name at all; for instance, you could change the other code so that it stores pyb.batting_stats as a function rather than as a string:
abc = pyb.batting_stats  # note: no parentheses

# later
result = abc(2020)
If you do need to look up by name, you can use getattr, like this:
# At the top of the script
ALLOWED_NAMES = ['batting_stats', ...]

# In the code where you need it
if abc not in ALLOWED_NAMES:
    raise ValueError("Invalid name passed: %s" % abc)
result = getattr(pyb, abc)(2020)
Probably a better way would be to use a dictionary as a dispatch table:
dispatch_table = {
    'batting_stats': pyb.batting_stats,
    ...: ...,
}
result = dispatch_table[abc](2020)
This automatically raises an exception if an unexpected name is passed.
It also has the benefit that you can use a different string in the abc variable than the method name; for example, if you need to rename the function but maintain the names in an API or vice versa:
dispatch_table = {
    'battingstats': pyb.batting_stats,  # legacy name
    'batting_stats': pyb.batting_stats,
    ...: ...,
}
result = dispatch_table[abc](2020)
If you absolutely must run a piece of code from a string, you can use the eval or exec builtin functions; however, it's almost always a bad idea.
Use of eval and exec is so frequently dangerous and insecure (Common Weakness #95) that it's better to avoid it altogether. Luckily, at least in Python, there's almost always an alternative; moreover, those alternatives are typically cleaner, more flexible, faster to run and easier to debug. Cases where there's no alternative are vanishingly rare.
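To make the risk concrete, here is a sketch of how a string-built eval call can be hijacked (the hostile value of abc is invented for illustration):
abc = "batting_stats"
result = eval("pyb." + abc + "(2020)")   # does what you meant...

# ...but eval runs whatever expression the string happens to contain:
abc = "batting_stats and __import__('os').system('echo owned')  #"
eval("pyb." + abc + "(2020)")            # the injected os.system call executes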
For example, I have a solid named initiate_load which yields a dictionary and an integer, something like:
@solid(
    output_defs=[
        OutputDefinition(name='l_dict', is_required=False),
        OutputDefinition(name='l_int', is_required=False)
    ],
)
def initiate_load(context):
    ...
    yield Output(l_dict, output_name='l_dict')
    yield Output(l_int, output_name='l_int')
I also have a composite_solid, let's say call_other_solid_composite, and I am passing l_dict and l_int to it. I am using l_dict to get the values mapped to its keys, something like:
@composite_solid
def call_other_solid_composite(p_dict, p_int):
    l_val1 = p_dict['val1']
    ...
Then I get an error: TypeError: 'InputMappingNode' object is not subscriptable.
I searched everywhere but can't find a solution, and the documentation is of no help either. My use case requires parsing those values.
Any help will be appreciated.
Similar to methods decorated with @pipeline, you should not think of methods decorated with @composite_solid as regular Python methods. Dagster wraps them and turns them into something completely different. That's why the p_dict parameter cannot be used inside the method as a regular method parameter.
To achieve what you want, you have a couple of options:
pass the p_dict parameter directly into another solid; inside that solid you will be able to do l_val1 = p_dict['val1'] (see the sketch after this list)
next to the yields you already have in the initiate_load method, you can yield p_dict['val1'] as an output as well; this allows you to use both the dict and the 'val1' value as inputs in other solids (including your composite)
you can have a solid inside your composite solid that yields p_dict['val1'], which allows you to use this value as an input for other solids inside the composite
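A minimal sketch of the first option (the solid name extract_val1 is illustrative; this assumes the @solid/@composite_solid API from the question):
from dagster import composite_solid, solid

@solid
def extract_val1(context, p_dict):
    # At run time p_dict is a real dict here, so subscripting works.
    return p_dict['val1']

@composite_solid
def call_other_solid_composite(p_dict, p_int):
    # Wire the input into a solid instead of subscripting it directly.
    l_val1 = extract_val1(p_dict)
    ...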
Hope this helps. For reference, the documentation about the composite solids can be found here.
A small remark on the snippets you provided: Dagster has a very neat typing system, and it is best practice to use it as much as possible.
Recently I have been using AppKit to write a program that automatically clicks inside an application window. Wanting to activate the window first, I went through the NSRunningApplication documentation and found a method called "activateWithOptions", so I wrote a simple program like the following.
Apps = NSWorkspace.sharedWorkspace().runningApplications()
for app in Apps:
    print(app.localizedName())
    app.activateWithOptions(NSApplicationActivateAllWindows)
Here are my questions.
The first question: inside the documentation, the localizedName attribute is a variable, but in Python you must call it as a function to get the name. Why the difference?
The second question: if you run the program, it throws the error below. But if you change the call to app.activateWithOptions_(NSApplicationActivateAllWindows), the code passes. Why is the documentation inconsistent with my usage?
AttributeError: 'NSRunningApplication' object has no attribute 'activateWithOptions'
Localized name
It's not a variable, it's a property declared as:
@property(readonly, copy) NSString *localizedName;
It's synthesized into the _localizedName instance variable and this function:
- (NSString *)localizedName {
return _localizedName;
}
Underscore
PyObjC documentation - Underscores, and lots of them:
An Objective-C message looks like this:
[someObject doSomething:arg1 withSomethingElse:arg2];
The selector (message name) for the above snippet is this (note the colons):
doSomething:withSomethingElse:
In order to have a lossless and unambiguous translation between Objective-C messages and Python methods, the Python method name equivalent is simply the selector with colons replaced by underscores. Since each colon in an Objective-C selector is a placeholder for an argument, the number of underscores in the PyObjC-ified method name is the number of arguments that should be given.
The PyObjC translation of the above selector is (note the underscores):
doSomething_withSomethingElse_
activateWithOptions: -> activateWithOptions_
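Applied to the snippet from the question (assuming both names are importable from the AppKit module that PyObjC provides):
from AppKit import NSApplicationActivateAllWindows, NSWorkspace

for app in NSWorkspace.sharedWorkspace().runningApplications():
    # localizedName takes no arguments: no colon in the selector,
    # so no underscore in the Python name -- but it is still a method call.
    print(app.localizedName())
    # activateWithOptions: has one colon, hence one trailing underscore.
    app.activateWithOptions_(NSApplicationActivateAllWindows)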
This happens all the time. A function returns an object that I can't read. Here:
discoverer = GstPbutils.Discoverer()
discoverer.connect('discovered', on_discovered)
vinfo = discoverer.discover_uri(self.loaded_file)
print(vinfo.get_tags())
Returns this:
<Gst.TagList object at 0x7f00a360c0a8 (GstTagList at 0x7f00880024a0)>
But when I try to do this:
tags = vinfo.get_tags()
for tag in tags:
print (tag)
I get this:
TypeError: 'TagList' object is not iterable
But when I read the docs for this data structure, I seem to understand it's a... List? Beyond telling me how to get the tags, can somebody show me how to read those docs? Also, am I missing some introspection methods and tools that I could use to discover what the objects I encounter are and how they work?
This is all hypothetical, as I have never used Python with GStreamer:
According to the documentation, yes, it is said to be a list, but it may be represented as an internal structure. Remember that the Python bindings are just bindings: it all works similarly (if not identically) to the C API, so ask yourself what you would do in C to iterate the tags. Don't ask me how I found this out; you have to look around the docs, checking all the available functions.
You have to think about how the object you are using may be implemented, along with what it represents. This is a list of tags where each tag has a different type (one is a string, another is an int, etc.), so you cannot easily iterate over it.
So I think you have two options, depending on what you want to do with the tags:
1. Serialize to a string and work with that:
I am not sure, but C has a to_string function which may be exposed the same way in Python; try that if you are only interested in the tag names, or whatever it returns (a sketch follows).
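Something like this, assuming the introspected bindings expose it as to_string() the way the C function gst_tag_list_to_string suggests:
tags = vinfo.get_tags()
print(tags.to_string())  # serialized form, e.g. taglist, title=(string)"..."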
2. Use the built-in foreach with its callback:
tags = vinfo.get_tags()
tags.foreach(my_callback, self)
And in your callback:
def my_callback(tag_list, tag, user_data):
    print(tag)
    # do whatever you want with tag_list
    # no cast is needed in Python; user_data is already the object you passed in
    ptr = user_data
    ptr.your_method(whatever, tag)
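If foreach feels awkward, an index-based walk may also work, assuming the bindings expose n_tags(), nth_tag_name(), and get_value_index() like their C counterparts (get_value_index may hand back a GValue rather than a plain Python value, depending on the binding version):
tags = vinfo.get_tags()
for i in range(tags.n_tags()):
    name = tags.nth_tag_name(i)
    # Each tag can hold several values; take the first one here.
    print(name, tags.get_value_index(name, 0))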
Please read this whole question before answering, as it's not what you think... I'm looking at creating python object wrappers that represent hardware devices on a system (trimmed example below).
class TPM(object):
    @property
    def attr1(self):
        """
        Protects value from being accidentally modified after
        constructor is called.
        """
        return self._attr1

    def __init__(self, attr1, ...):
        self._attr1 = attr1
        ...

    @classmethod
    def scan(cls):
        """Calls Popen, parses to dict, and passes **dict to constructor"""
Most of the constructor inputs come from running command line tools via subprocess.Popen and then parsing the output to fill in object attributes. I've come up with a few ways to handle this, but I'm unsatisfied with what I've put together so far and am trying to find a better solution. Here are the common catches I've found. (Quick note: tool versions are tightly controlled, so parsed outputs don't change unexpectedly.)
Many tools produce variant outputs, sometimes including fields and sometimes not. This means that if you assemble a dict to be wrapped in a container object, the constructor is more or less forced to take **kwargs and not have defined fields. I don't like this because it makes static analysis via pylint etc. less useful. I'd prefer a defined interface so that Sphinx documentation is clearer and errors can be detected more reliably.
In lieu of **kwargs, I've also tried setting default args to None for many of the fields, with what ends up as pretty ugly results. One thing I dislike strongly about this option is that optional fields don't always come at the end of the command line tool output. This makes it a little mind-bending to look at the constructor and match it up to tool output.
I'd greatly prefer to avoid constructing a dictionary in the first place, but using setattr to create attributes makes pylint unable to detect _attr1, etc., and creates warnings. Any ideas here are welcome...
Basically, I am looking for the proper Pythonic way to do this. My requirements, for a re-summary are the following:
Command line tool output parsed into a container object.
Container object protects attributes via properties post-construction.
Varying number of inputs to the constructor, with working static analysis and runtime error detection for missing required fields.
Is there a good way of doing this (hopefully without a ton of boilerplate code) in Python? If so, what is it?
EDIT:
Per some of the clarification requests, let's take a look at the tpm_version command. Here's the output for my laptop; for this TPM it doesn't include every possible attribute. Sometimes the command returns extra attributes that I also want to capture. This makes parsing to known attribute names on a container object fairly difficult.
TPM 1.2 Version Info:
Chip Version: 1.2.4.40
Spec Level: 2
Errata Revision: 3
TPM Vendor ID: IFX
Vendor Specific data: 04280077 0074706d 3631ffff ff
TPM Version: 01010000
Manufacturer Info: 49465800
Example code (ignore the lack of sanity checks, please; trimmed for brevity):
def __init__(self, chip_version, spec_level, errata_revision,
             tpm_vendor_id, vendor_specific_data, tpm_version,
             manufacturer_info):
    self._chip_version = chip_version
    ...

@classmethod
def scan(cls):
    # stdout=PIPE is required for communicate() to capture the output
    tpm_proc = Popen("/usr/sbin/tpm_version", stdout=PIPE)
    stdout, stderr = tpm_proc.communicate()
    tpm_dict = dict()
    for line in stdout.splitlines():
        if "Version Info:" in line:
            continue  # skip the header line
        split_line = line.split(":")
        attribute_name = (
            split_line[0].strip().replace(' ', '_').lower())
        tpm_dict[attribute_name] = split_line[1].strip()
    return cls(**tpm_dict)
The problem here is that this tool (or a different one whose source I may not be able to review to learn every possible field) could add extra fields that my parser handles but my object fails to capture. That's what I'm really trying to solve in an elegant way.
I've been working on a more solid answer to this over the last few months, as I basically work on hardware support libraries, and have finally come up with a satisfactory (though pretty verbose) answer:
Parse the tool outputs, whatever they look like, into object structures that match how the tool views the device. These can have very generic dict structures, but should be broken out as much as possible.
Create another container class on top of that which uses attributes to access items in the tool-container objects. This enforces an API and can return sane errors across multiple versions of the tool, and across differing tool outputs!
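A compact sketch of that layering (all names are hypothetical):
class ToolOutput(object):
    """Generic layer: holds whatever fields this tpm_version run reported."""
    def __init__(self, fields):
        self._fields = dict(fields)

    def get(self, name):
        return self._fields.get(name)


class TPM(object):
    """API layer: fixed, documented attributes with sane errors."""
    def __init__(self, tool_output):
        self._output = tool_output

    @property
    def chip_version(self):
        value = self._output.get('chip_version')
        if value is None:
            raise AttributeError("tool output did not report chip_version")
        return value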