What does isinstance mean combined with ScalarNode, SequenceNode, and MappingNode?

I have googled without success, so I would like to ask: what does isinstance mean when combined with the YAML classes ScalarNode, SequenceNode and MappingNode?
(I already know what isinstance is.)
For example:
if isinstance(v, yaml.ScalarNode):
    pass  # do something
elif isinstance(v, yaml.SequenceNode):
    pass  # something else
elif isinstance(v, yaml.MappingNode):
    pass  # another thing

Node types are a part of YAML's Representation data structure. YAML defines its (de)serialization pipeline as follows:
[Figure: YAML processing overview diagram (source: yaml.org)]
The Representation is a, potentially cyclic, graph of nodes. In it, anchors and aliases have been resolved. In PyYAML, you typically use subgraphs of this data structure to implement custom constructors and representers that generate your native objects, as indicated by the arrows in the diagram.
A ScalarNode is a node representing a single scalar in the YAML source or output. A single scalar can be a plain scalar (e.g. foo), a quoted scalar ('foo' or "foo"), or a block scalar (starting with | or >). The scalar content, with escape sequences, newlines, indentation already processed, is available in the field .value as string. This is even true for values that are by default constructed into non-strings. For example, true will by default generate a boolean value, but as ScalarNode, it contains the value "true" as string.
SequenceNode is a node representing a sequence. The value field of a SequenceNode contains a list of nodes that correspond to the items in the sequence.
MappingNode is a node representing a mapping. The value field of a MappingNode contains a list of tuples, where each tuple consists of the key node and the value node.
All nodes have a field tag that contains the resolved tag of the node. For example, a ScalarNode with value true would typically have the tag tag:yaml.org,2002:bool. The resolved tag depends on the loader you use: with PyYAML's BaseLoader, for example, true resolves to a normal string, which is tag:yaml.org,2002:str. In any case, if there was an explicit tag on the node (e.g. !!str), that tag will be in the tag field.
Coming back to the question, this kind of code is typically used in custom constructors. They get a node as input and are to produce a native value and return it. Usually, a custom constructor expects a specific kind of node but if you want to do proper error reporting, you still want to check whether you actually got the kind of node you need. For this, you use the code you posted.
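As a rough illustration of the above, here is a minimal sketch of a custom constructor that checks the node kind before building a value. It assumes PyYAML is installed; the !point tag, the PointLoader class and the construct_point function are made up for this example and are not part of PyYAML or of the question.

```python
import yaml
from yaml.constructor import ConstructorError


def construct_point(loader, node):
    """Hypothetical constructor for a made-up !point tag."""
    if isinstance(node, yaml.SequenceNode):
        # node.value is a list of item nodes; let the loader build them.
        x, y = loader.construct_sequence(node)
        return (x, y)
    elif isinstance(node, yaml.MappingNode):
        # node.value is a list of (key node, value node) tuples.
        data = loader.construct_mapping(node)
        return (data["x"], data["y"])
    elif isinstance(node, yaml.ScalarNode):
        # node.value is always a string here, whatever the resolved tag is.
        raise ConstructorError(
            None, None,
            "expected a sequence or mapping for !point, got scalar %r" % node.value,
            node.start_mark,
        )
    raise ConstructorError(None, None, "unexpected node kind", node.start_mark)


class PointLoader(yaml.SafeLoader):
    """Hypothetical loader subclass so the global SafeLoader stays untouched."""


PointLoader.add_constructor("!point", construct_point)

# yaml.compose returns the node graph without constructing native objects.
root = yaml.compose("flag: true\n")
key_node, value_node = root.value[0]
print(type(root).__name__, value_node.value, value_node.tag)

document = "origin: !point [0, 0]\ncorner: !point {x: 3, y: 4}\n"
print(yaml.load(document, Loader=PointLoader))
```

Loading a document such as "bad: !point 7" with this loader would end up in the ScalarNode branch and raise a ConstructorError that points at the offending position in the source.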

Related

Ansible - Wrapper of python interpreter

Ansible uses YAML syntax, which mainly consists of key-value pairs, where every value can be
a simple value (number or string)
or
a list
or
a key-value pair (nested)
Anchoring a value and type conversion in YAML are just pre-processing options.
1)
From the data structure aspect,
Is YAML syntax a dictionary of dictionary?
2)
For command: ansible -m shell 'hostname' all, Is ansible a wrapper of python interpreter? taking multiple command line options...

From the data structure aspect,
Is YAML syntax a dictionary of dictionary?
No. YAML syntax models a directed graph. Your assumptions on YAML given initially are wrong. In YAML, a value is one of three things:
A scalar (number, string, date, …)
A sequence (list of values)
A mapping (list of key-value pairs where both keys and values are any kind of value)
Since any non-scalar value can contain other non-scalar values, YAML can represent a tree of arbitrary depth – so it's not necessarily a dictionary of dictionaries.
Now, YAML also allows you to put an anchor on any value and to reference that value later via an alias:
anchored value: &anchor My value
alias: *anchor
Here, *anchor references the anchored scalar value My value. This can be used to define cyclic graphs:
--- &root # this annotates the root sequence;
- one
- two # simple sequence items
- three
- *root # reference to the sequence, meaning that the sequence contains itself
Mind that both sequences and mappings are usually started implicitly in YAML syntax. If children are key/value pairs, it's a mapping (first example); if children are list items, it's a sequence (second example). --- starts the document and is usually omitted.
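To see the two examples from the Python side, here is a small sketch that loads them with PyYAML (assumed to be installed). The constant names ALIAS_DOC and CYCLIC_DOC are only for this illustration.

```python
import yaml

# First example: an alias pointing at an anchored scalar.
ALIAS_DOC = """
anchored value: &anchor My value
alias: *anchor
"""

# Second example: a sequence whose last item is the sequence itself.
CYCLIC_DOC = """
--- &root
- one
- two
- three
- *root
"""

mapping = yaml.safe_load(ALIAS_DOC)
cyclic = yaml.safe_load(CYCLIC_DOC)

# The alias resolves to the same value as the anchored entry.
print(mapping["alias"] == mapping["anchored value"])

# The fourth item is the list object itself, so the structure is a
# graph with a cycle rather than a tree.
print(cyclic[3] is cyclic)
```

Because the loaded list contains itself, code that walks it recursively needs to track visited objects, otherwise it will not terminate.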
For command: ansible -m shell 'hostname' all, Is ansible a wrapper of python interpreter? taking multiple command line options...
See the man page of the ansible command. You are probably looking for the -a ARGS option. I am unsure what you would consider a wrapper of the Python interpreter and you may want to clarify what you actually want to do. Generally, the answer to that is no.

generate array element if not exists in python

What is the right way to check if an element does not exist in Python?
The element is expected to be present most of the time, and if it is empty it is not an "error" and needs to be processed normally:
def checkElement(self, x, y):
    if not (self.map[x][y]):
        self.map[x][y] = 'element {}:{}'.format(x, y)
    return self.map[x][y]

tldr
Your own code together with triplee's answer cover the common cases. I want to point out ambiguity in your question. How you check for "empty" very much depends on what your definition of empty is.
This is a tricky question because the semantics of "empty" are not exactly clear. Assuming that the data structure is a nested dict as could be inferred from your example, then it could be the case that empty means the inner/outer key is not contained in the dictionary. In that case you'd want to go with what triplee suggests. Similarly if the container is a nested list, but instead of KeyError you'd catch IndexError.
Alternatively, it could also be the case that "empty" means both the inner and outer keys are in the dictionary (or list) but the value at that position is some signifier for "empty". In this case the most natural "empty" in Python would be None, so you'd want to check if the value under those keys is None. None evaluates to False in boolean expressions so your code would work just fine.
However, depending on how your application defines empty, these are not the only alternatives. If you're loading JSON data and the producer of said JSON has been prudent, empty values are null in JSON and map to None when loaded into Python. More often than one would like, the producer of the JSON has not been prudent and empty values are actually just empty strings, as in {firstName:''}. It turns out that if not self.map[x][y] works in this case as well, because an empty string also evaluates to False; the same applies to an empty list, an empty set and an empty dict.
We can generalise the meaning of empty further and say that "empty" is any value that is not recognised as actionable or valid content by the application and should therefore be considered "empty", but you can already see how this is completely dependent on what the application is. Would {firstName: ' '}, a string that only contains white space, be empty? Is a partially filled in email address empty?
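The difference between these definitions can be sketched in a few lines. The helper is_blank below is a hypothetical, stricter application-level check invented for this illustration; it is not taken from the question.

```python
def is_blank(value):
    """Hypothetical check: None, or a string made only of whitespace."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    return False


# Values that the plain truthiness test "if not value" treats as empty.
falsy_values = [None, "", [], {}, set()]
print(all(not value for value in falsy_values))

# A whitespace-only string is truthy, so "if not value" does not catch it,
# while the stricter helper does.
print(bool(" "), is_blank(" "))

# Note that 0 is falsy too, which matters if 0 is valid content.
print(not 0, is_blank(0))
```

Which of these checks is appropriate depends on what the application considers valid content, as discussed above.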

The best way to check whether an element of a container (list, dict, etc.) exists is to wrap the access in a try...except block. Your checkElement function could be rewritten thus:
def checkElement(self, x, y):
    try:
        return self.map[x][y]
    except KeyError:
        # Handle the case where self.map[x][y] isn't set
        # (a list would raise IndexError instead of KeyError).
        self.map[x][y] = 'element {}:{}'.format(x, y)
        return self.map[x][y]

The answer to what you seem to be asking is simply
try:
    result = self.map[x][y]
except KeyError:
    result = 'element {}:{}'.format(x, y)
    self.map[x][y] = result
return result
Of course, if self.map[x] might also not exist, you have to apply something similar to that; or perhaps redefine self.map to be a defaultdict instead, or perhaps something else entirely, depending on what sort of structure this is.
KeyError makes sense for a dict; if self.map[x] is a list, you would probably trap IndexError instead.
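A minimal sketch of the defaultdict variant, assuming the structure is a dict of dicts. The class name Grid and the method name check_element are invented for this example.

```python
from collections import defaultdict


class Grid:
    """Hypothetical container standing in for the asker's class."""

    def __init__(self):
        # A missing outer key creates an empty inner dict on first access,
        # so only the inner lookup can raise KeyError.
        self.map = defaultdict(dict)

    def check_element(self, x, y):
        try:
            return self.map[x][y]
        except KeyError:
            result = 'element {}:{}'.format(x, y)
            self.map[x][y] = result
            return result


grid = Grid()
print(grid.check_element(1, 2))   # generated on first access
grid.map[3][4] = 'custom'
print(grid.check_element(3, 4))   # existing value is returned unchanged
```

Unlike the truthiness test in the question, this only generates a value when the key is really missing, so stored values such as '' or 0 are left alone.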

passing fields of an array of collections through functions in python

Is there a way of passing a field of an array of collections into a function so that it can still be used to access an element in the collection in Python? I am attempting to search through an array of collections to locate a particular item by comparing it with an identifier. This identifier and the field being compared will change as the function is called at different stages of the program. Is there a way of passing the field to the function, to access the required element for comparison?
This is the code that I have tried thus far:
code ...

In your code, M_work is a list. Lists are accessed using an index and this syntax: myList[index]. So that would translate to M_work[place] in your case. Then you say that M_work stores objects which have fields, and you want to access one of these fields by name. To do that, use getattr like this: getattr(M_work[place], field). You can compare the return value to identifier.
Other mistakes in the code you show:
place is misspelled pace at one point.
True is misspelled true at one point.
The body of your loop always returns at the first iteration: there is a return in both the if found == True and else branches. I don't think this is what you want.
You could improve your code by:
noticing that if found == True is equivalent to if found.
finding how you don't actually need the found variable.
looking at Python's for...in loop.
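Putting these suggestions together, a search function could look roughly like the sketch below. Since the original code is not shown, the Worker class, its fields and the function name find_by_field are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical record type standing in for the items of M_work."""
    name: str
    badge: int


def find_by_field(items, field, identifier):
    """Return the first item whose attribute named `field` equals `identifier`."""
    for item in items:                      # for...in instead of manual indexing
        if getattr(item, field) == identifier:
            return item                     # return only once a match is found
    return None                             # reached only after the whole loop


M_work = [Worker("ann", 1), Worker("bob", 2)]
print(find_by_field(M_work, "name", "bob"))
print(find_by_field(M_work, "badge", 1))
print(find_by_field(M_work, "badge", 99))
```

The field is passed as a string, so the same function can compare on a different attribute at each call, and no found variable is needed.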

PyXB with XML schema containing choice statement

I am using PyXB for XML binding.
The schema I use has choice elements,
so when I convert XML into a Python instance
I don't know exactly which element was chosen in the choice.
To distinguish them, I have had to use if/else statements covering all cases.
For example, if the choice element has a and b, to tell which of a and b is present:
A = binder.CreateFromDocument(xml)  # bind into a Python instance
# At this point, I don't know which element is included,
# so I have to check using if/elif
if A.a:
    # processing in the case of a
    A.a.aa = 'a'
elif A.b:
    # processing in the case of b
    A.b.bb = 'b'
This example is simple and if/else looks good enough, but if the choice has many elements, say more than 100,
that processing (repeated if/else) becomes very bad.
Is there any other way to know which element was chosen?

Yes; there is a method orderedContent on complex type instances that can be used to determine what elements are present in the instance. This can also be used to recover the document order of elements when order is not enforced by the schema, as described in the user documentation.
Note that the members of the orderedContent list are wrapped in objects that provide information about them, so to get the underlying content binding you have to drill down through the wrapper's value property.

C data structures

Is there a C data structure equatable to the following python structure?
data = {'X': 1, 'Y': 2}
Basically I want a structure where I can give it a pre-defined string and have it come out with an integer.

The data-structure you are looking for is called a "hash table" (or "hash map"). You can find the source code for one here.
A hash table is a mutable mapping from a key (usually a string, reduced to an integer by a hash function) to another value, just like the dict from Python, which your sample code instantiates.
It's called a "hash table" because it applies a hash function to the string to get an integer, and then uses that integer as an index into a table to locate your desired data directly.
This makes it extremely quick to access and change your information, even if you have tons of it. It also means that the data is unordered, because a good hash function spreads keys roughly uniformly, and unpredictably, all over the table (in a perfect world).

Also note that if you only need a quick one-off hash, such as a static lookup with two or three fixed keys, look at gperf, which generates a perfect hash function and simple code for that hash.

The above data structure is a dict type.
In C/C++ parlance, a hashmap should be equivalent; search for a hashmap implementation.

There's nothing built into the language or standard library itself but, depending on your requirements, there are a number of ways to do it.
If the data set will remain relatively small, the easiest solution is probably just to have an array of structures along the lines of:
typedef struct {
    char *key;
    int val;
} tElement;
then use a sequential search to look them up. Have functions which insert keys, delete keys and look up keys so that, if you need to change it in future, the API itself won't change. Pseudo-code:
def init:
    create g.key[100] as string
    create g.val[100] as integer
    set g.size to 0
def add (key, val):
    if lookup (key) != not_found:
        return already_exists
    if g.size == 100:
        return no_space
    g.key[g.size] = key
    g.val[g.size] = val
    g.size = g.size + 1
    return okay
def del (key):
    pos = lookup (key)
    if pos == not_found:
        return no_such_key
    if pos < g.size - 1:
        g.key[pos] = g.key[g.size-1]
        g.val[pos] = g.val[g.size-1]
    g.size = g.size - 1
    return okay
def lookup (key):
    for pos goes from 0 to g.size-1:
        if g.key[pos] == key:
            return pos
    return not_found
Insertion means ensuring it doesn't already exist then just tacking an element on to the end (you'll maintain a separate size variable for the structure). Deletion means finding the element then simply overwriting it with the last used element and decrementing the size variable.
Now this isn't the most efficient method in the world but you need to keep in mind that it usually only makes a difference as your dataset gets much larger. The difference between a binary tree or hash and a sequential search is irrelevant for, say, 20 entries. I've even used bubble sort for small data sets where a more efficient one wasn't available. That's because it's massively quick to code up and the performance is irrelevant.
Stepping up from there, you can remove the fixed upper size by using a linked list. The search is still relatively inefficient since you're doing it sequentially but the same caveats apply as for the array solution above. The cost of removing the upper bound is a slight penalty for insertion and deletion.
If you want a little more performance and a non-fixed upper limit, you can use a binary tree to store the elements. This gets rid of the sequential search when looking for keys and is suited to somewhat larger data sets.
If you don't know how big your data set will be getting, I would consider this the absolute minimum.
A hash is probably the next step up from there. This performs a function on the string to get a bucket number (usually treated as an array index of some sort). This is O(1) lookup but the aim is to have a hash function that only allocates one item per bucket, so that no further processing is required to get the value.
A degenerate case of "all items in the same bucket" is no different to an array or linked list.
For maximum performance, and assuming the keys are fixed and known in advance, you can actually create your own hashing function based on the keys themselves.
Knowing the keys up front, you have extra information that allows you to fully optimise a hashing function to generate the actual value so you don't even involve buckets - the value generated by the hashing function can be the desired value itself rather than a bucket to get the value from.
I had to put one of these together recently for converting textual months ("January", etc) in to month numbers. You can see the process here.
I mention this possibility because of your "pre-defined string" comment. If your keys are limited to "X" and "Y" (as in your example) and you're using a character set with contiguous {W,X,Y} characters (which even covers EBCDIC as well as ASCII though not necessarily every esoteric character set allowed by ISO), the simplest hashing function would be:
char *s = "X";
int val = *s - 'W';
Note that this doesn't work well if you feed it bad data. These are ideal for when the data is known to be restricted to certain values. The cost of checking data can often swamp the saving given by a pre-optimised hash function like this.

C doesn't have any collection classes. C++ has std::map.
You might try searching for C implementations of maps, e.g. http://elliottback.com/wp/hashmap-implementation-in-c/

A 'trie' or a 'hashmap' should do. The simplest implementation is an array of struct { char *s; int i; } pairs.
Check out 'trie' in 'include/nscript.h' and 'src/trie.c' here: http://github.com/nikki93/nscript . Change the 'trie_info' type to 'int'.

Try a Trie for strings, or a Tree of some sort for integer/pointer types (or anything that can be compared as "less than" or "greater than" another key). Wikipedia has reasonably good articles on both, and they can be implemented in C.
