Simple list of branch for a Node Python AnyTree - python

Given a simple tree in AnyTree, when I look at a node it tells me its path to the root, but I cannot find anything that will simply give me that path as a list of the node names, or as a string which I can simply turn into a list of the names.
Typical example
from anytree import Node, RenderTree, AsciiStyle
f = Node("f")
b = Node("b", parent=f)
a = Node("a", parent=b)
d = Node("d", parent=b)
c = Node("c", parent=d)
e = Node("e", parent=d)
g = Node("g", parent=f)
i = Node("i", parent=g)
h = Node("h", parent=i)
print(RenderTree(f, style=AsciiStyle()).by_attr())
Which renders up as the tree:
f
|-- b
|   |-- a
|   +-- d
|       |-- c
|       +-- e
+-- g
    +-- i
        +-- h
I just want the path from a leaf to the top or root. So some command that will do this:
>>> ShowMePath(e,f) # a command like this
["e", "d", "b", "f"]
I'd even be happy if I could just get a string version of the node name which I could .split() to get that string.
>>> e
Node('/f/b/d/e') < can I get this as a string to split it?
All the iterator methods examples (eg PostOrderIter) seem to return parallel branches rather than just a simple path to the top. I've looked through the docs but don't see this simple give-me-the-path option. What am I missing? Isn't this something everyone needs?

Okay, a solution. It turns out there is a .path attribute which returns a tuple of the nodes on the path from the root down to the node. I didn't see that when digging through the docs initially.
So I can get to my desired list in the example above with:
pathList = list(e.path)
namesList = [n.name for n in pathList]
and I'm there.
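Note that .path runs from the root down to the node, so the list above comes out root first; with anytree itself, [n.name for n in reversed(e.path)] should give the leaf-to-root order asked for in the question. The same idea can be sketched without the library by following .parent until it is None. The MiniNode class and the names_to_root helper below are hypothetical stand-ins for illustration, not part of anytree; they only assume the name and parent attributes that anytree nodes also have.

```python
class MiniNode:
    """Hypothetical stand-in exposing the same `name`/`parent` attributes as anytree.Node."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def names_to_root(node):
    """Collect node names from `node` up to the root by following `.parent`."""
    names = []
    while node is not None:
        names.append(node.name)
        node = node.parent  # the root's parent is None, which ends the loop
    return names


f = MiniNode("f")
b = MiniNode("b", parent=f)
d = MiniNode("d", parent=b)
e = MiniNode("e", parent=d)

print(names_to_root(e))  # ['e', 'd', 'b', 'f']
```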

Related

How to parse individual variables from python code in case of multiple variables assignments?

I have this simple python code example that parses a python code and extracts the variables assigned in it:
import ast
import sys
import astunparse
import json
tree=ast.parse('''\
a = 10
b,c=5,6
[d,e]=7,8
(f,g)=9,10
h=20
''',mode="exec")
for thing in tree.body:
    if isinstance(thing, ast.Assign):
        print(astunparse.unparse(thing).split('=')[0].strip())
I also tried a similar method with a NodeVisitor:
import ast
import sys
import astunparse
import json
tree=ast.parse('''\
a = 10
b,c=5,6
[d,e]=7,8
(f,g)=9,10
h=20
''',mode="exec")
class AnalysisNodeVisitor2(ast.NodeVisitor):
    def visit_Assign(self, node):
        print(astunparse.unparse(node.targets))

Analyzer = AnalysisNodeVisitor2()
Analyzer.visit(tree)
But both methods gave me the same result:
a
(b, c)
[d, e]
(f, g)
h
But the output I'm trying to get is individual variables like this:
a
b
c
d
e
f
g
h
Is there a way to achieve this?
Each target is either an object with an id attribute, or a sequence of targets.
for thing in tree.body:
    if isinstance(thing, ast.Assign):
        for t in thing.targets:
            try:
                print(t.id)
            except AttributeError:
                for x in t.elts:
                    print(x.id)
This, of course, doesn't handle more complicated possible assignments like a, (b, c) = [3, [4,5]], but I leave it as an exercise to write a recursive function that walks the tree, printing target names as they are found. (You may also need to adjust the code to handle things like a[3] = 5 or a.b = 10.)
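One possible shape for that recursive walk is sketched below, using only the standard ast module. The helper name target_names is made up for this example. It assumes you only care about plain names, so subscript and attribute targets such as a[3] = 5 or a.b = 10 are deliberately skipped.

```python
import ast


def target_names(target):
    """Yield the plain variable names bound by an assignment target, recursively."""
    if isinstance(target, ast.Name):
        yield target.id
    elif isinstance(target, (ast.Tuple, ast.List)):
        # Nested unpacking: each element is itself a target.
        for element in target.elts:
            yield from target_names(element)
    elif isinstance(target, ast.Starred):
        # Handles targets like `first, *rest = ...`.
        yield from target_names(target.value)
    # ast.Subscript / ast.Attribute targets bind no new name and are ignored here.


source = """\
a = 10
b, c = 5, 6
[d, e] = 7, 8
(f, g) = 9, 10
h = 20
i, (j, k) = [3, [4, 5]]
"""

names = []
for stmt in ast.parse(source).body:
    if isinstance(stmt, ast.Assign):
        for t in stmt.targets:
            names.extend(target_names(t))

print(names)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k']
```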

Return list of all files in a directory trouble

I'm still in an introductory Python course. I'm trying to write a program that returns a list of all the file names in a directory using recursion, but for some reason it is not working as expected. Here is my code. Thank you.
from pathlib import Path
p = Path('/Users/name/Documents/')
def directory_files (dirct: Path) -> list:
    Lf = []
    if dirct.is_file:
        Lf.append(dirct)
    else:
        for d in list(dirct.iterdir()):
            directory_files(d)
    return Lf
You forgot to call is_file: it is a method, so it needs parentheses. Also, if you want to fill your list with files, you should check whether dirct is a directory, because not being a file does not make something a directory. You can then simply extend your list with the result of recursing on each entry:
def directory_files(dirct: Path) -> list:
    Lf = []
    if dirct.is_dir():
        for d in dirct.iterdir():
            Lf.extend(directory_files(d))
    else:
        Lf.append(dirct)
    return Lf
Demo:
In [6]: ls
bar.txt foo.txt test2/
In [7]: p = Path(".")
In [8]: directory_files(p)
Out[8]:
[PosixPath('foo.txt'),
PosixPath('bar.txt'),
PosixPath('test2/bar2.txt'),
PosixPath('test2/foo2.txt')]
If you want just the names use the .name attribute:
def directory_files(dirct: Path) -> list:
    Lf = []
    if dirct.is_dir():
        for d in dirct.iterdir():
            Lf.extend(directory_files(d))
    else:
        Lf.append(dirct.name)
    return Lf
Demo:
In [10]: directory_files(p)
Out[10]: ['foo.txt', 'bar.txt', 'bar2.txt', 'foo2.txt']
is_file, like is_dir, is a method and needs to be called. Writing if dirct.is_file tests the truthiness of the method object itself, which is always True. Only with the parentheses, is_file(), is the method actually called and a boolean returned.
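The difference is easy to check in isolation. The sketch below uses a temporary directory (so it does not depend on any particular path existing) and compares the truthiness of the method object with the result of calling it; the variable names are only for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    # A bound method object is always truthy, whatever the path points at.
    method_truthy = bool(directory.is_file)
    # Calling the method performs the real check; a directory is not a file.
    call_result = directory.is_file()

print(method_truthy, call_result)  # True False
```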
If you want to explicitly check for files you also need an is_file() test. A named pipe, for instance, is not a directory but would not pass an is_file() test, so:
def directory_files(dirct: Path) -> list:
    Lf = []
    if dirct.is_dir():
        for d in dirct.iterdir():
            Lf.extend(directory_files(d))
    elif dirct.is_file():
        Lf.append(dirct.name)
    return Lf
You can see the difference in the output:
In [27]: ls
bar.txt foo_pipe| foo.txt test2/
In [28]: p = Path(".")
In [29]: directory_files1(p) # has if dirct.is_file()
Out[29]:
[PosixPath('foo.txt'),
PosixPath('bar.txt'),
PosixPath('test2/bar2.txt'),
PosixPath('test2/foo2.txt')]
In [30]: directory_files(p)
Out[30]:
[PosixPath('foo.txt'),
PosixPath('bar.txt'),
PosixPath('test2/bar2.txt'),
PosixPath('test2/foo2.txt'),
PosixPath('foo_pipe')]
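If the recursion is not itself the point of the exercise, pathlib's Path.rglob("*") walks the directory tree for you. The sketch below is only an illustration under assumed conditions: it builds a small throwaway directory with tempfile (the file names mirror the demo above) and checks that the recursive function and an rglob-based comprehension find the same regular files.

```python
import tempfile
from pathlib import Path


def directory_files(dirct: Path) -> list:
    """Recursively collect regular files below dirct (same logic as the answer)."""
    files = []
    if dirct.is_dir():
        for d in dirct.iterdir():
            files.extend(directory_files(d))
    elif dirct.is_file():
        files.append(dirct)
    return files


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "test2").mkdir()
    for rel in ("foo.txt", "bar.txt", "test2/foo2.txt", "test2/bar2.txt"):
        (root / rel).write_text("")

    # Sort relative paths so the two approaches can be compared directly.
    recursive = sorted(p.relative_to(root).as_posix() for p in directory_files(root))
    globbed = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )

print(recursive)  # ['bar.txt', 'foo.txt', 'test2/bar2.txt', 'test2/foo2.txt']
print(recursive == globbed)  # True
```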
You may find rcviz a nice tool to help your understanding of recursion. It can create a graph where:
1. The edges are numbered by the order in which they were traversed by the execution.
2. The edges are colored from black to grey to indicate order of traversal: black edges first, grey edges last.

Is it possible to have a scenario outline table with empty values?

Scenario Outline: Blah de blah
  When I enter <a> and <b> on the input field
  Then Everything is good
  Examples:
    | a | b |
    | 1 | 2 |
    |   | 3 |
The above scenario throws the following error in BDD Behave:
Test undefined
Please define test
I am not sure how I can come around this.
Any suggestions?
Use Custom Type Conversions described in https://pypi.org/project/parse/
import parse
from behave import given, register_type

@parse.with_pattern(r'.*')
def parse_nullable_string(text):
    return text

register_type(NullableString=parse_nullable_string)

@given('params "{a:NullableString}" and "{b:NullableString}"')
def set_params(context, a, b):
    # a or b will be empty if they are blank in the Examples
    context.a = a
    context.b = b
Now the feature file could look like this,
Given params "<a>" and "<b>"
# Rest of the steps
Examples:
| a | b |
| 1 | 2 |
| | 3 |
It is indeed possible to use empty table cells (as in your example, not using "" or something) if you can go without explicitly mentioning your parameters in your given/when/then steps.
In your example, that would mean that you must not write your step definitions like this
Given two parameters <a> and <b>
...
@given('two parameters {a} and {b}')
def step(context, a, b):
    # test something with a and b
but rather like this:
Given two parameters a and b # <-- note the missing angle brackets
...
@given('two parameters a and b')  # <-- note the missing curly brackets
def step(context):  # <-- note the missing function arguments for a and b
    # ...
Now, in order to access the current table row, you can use context.active_outline (which is a bit hidden in the appendix of the documentation).
context.active_outline returns a behave.model.Row object which can be accessed in the following ways:
context.active_outline.headings returns a list of the table headers, no matter what the currently iterated row is (a and b in the example from the question )
context.active_outline.cells returns a list of the cell values for the currently iterated row (1, 2 and '' (empty string that can be tested with if not...), 3 in the example from the question)
index-based access like context.active_outline[0] returns the cell value from the first column (no matter the heading) etc.
named-based access like context.active_outline['a'] returns the cell value for the column with the a header, no matter its index
As context.active_outline.headings and context.active_outline.cells return lists, one can also do useful stuff like for heading, cell in zip(context.active_outline.headings, context.active_outline.cells) to iterate over the heading-value pairs etc.
As far as I know you can't do it. But you can use either an empty string or a placeholder value (e.g. 'N/A') that you can look out for in your step definitions.

Nodes of a graph in Python

I am trying to figure out the best way of coding a Node Class (to be used in a binary tree) that would contain the attributes key, left and right.
I thought I would do something like
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
and then do
a = Node('A')
b = Node('B')
c = Node('C')
a.left = b
a.right = c
Here I am a little bit confused: are (under the hood) left and right pointers? Or is a containing a copy of the whole tree?
If I add
d = Node('B') # same key as before, but an entirely different node
c.right = d
then are b and d two different objects even if they have the same attributes? I would think so because they don't share any memory.
Finally, if I want to do a deep copy of one of my nodes, is
e = Node(a.key)
sufficient?
Python is dynamically typed, so left and right are not declared as references: an attribute can hold an integer, a float, or anything else, and the type of what it holds can change over time. But when you assign an object to one of them, the attribute stores a reference to that object (effectively a pointer), not a copy of it. So a does not contain a copy of the whole tree.
For your second question, it depends on how you see deep copying. If your node contains references to other nodes, do you want to copy these nodes as well?
If you are interested only in generating a new node with the same value but with references to the same other nodes, then use: copy.copy, otherwise use copy.deepcopy.
The difference is:
B <- A -> C    B' <- D -> C'
^         ^
|         |
\--- S ---/
With S a shallow copy and D a deep copy. Note that a deep copy thus results in new nodes B' and C'. You can imagine that if you deep copy a huge tree this can result in a large memory and CPU footprint.
Your code
e = Node(a.key)
is not completely correct, since it does not copy the (potential) references to the left and right nodes. It is also fragile design: if you later attach more attributes to the node, you have to update your copy function(s) each time. Using the functions from the copy module is thus safer.
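The difference between the two kinds of copy can be checked with identity tests. The sketch below reuses the Node class from the question and the standard copy module; the variable names s and d follow the diagram above (shallow and deep copy of a).

```python
import copy


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


a = Node('A')
a.left = Node('B')
a.right = Node('C')

s = copy.copy(a)      # shallow: a new node that shares a's children
d = copy.deepcopy(a)  # deep: a new node with newly created children

print(s is a)            # False: s is a distinct object
print(s.left is a.left)  # True: the same B node is shared
print(d.left is a.left)  # False: B' is a fresh copy
print(d.left.key)        # B: the copied node keeps the same key
```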
Yes b and d have the same attributes, but they are two independent instances. To see this:
print(id(b))  # one number
print(id(d))  # new number
This proves that they are two different objects in memory. To see that a.right is the same object as c use the same technique, checking for equivalent id values.
print(id(a.right))
print(id(c))  # same
Yes these are just references to the left or right object.
Every time you call Node("some_str"), a new object is created. So b and d are different objects, and e = Node(a.key) creates yet another new object.
By contrast, after e = Node('E') and f = e, both f and e refer to the same object.

Flattening a tree with parents/children and return all nodes

It's probably too late, but I can't sleep until it's solved:
I've got a tree with some parents, which have children, which have also children etc.
Now I need a function to get all nodes from the tree.
This is what currently works, but only with one level of depth:
def nodes_from_tree(tree, parent):
    r = []
    if len(tree.get_children(parent)) == 0:
        return parent
    for child in tree.get_children(parent):
        r.append(nodes_from_tree(tree, child))
    return r
Then I tried to pass r through, so it remembers the children, but I'm using the function more than once and r cumulatively stores all nodes, although I'm setting it to r=[]:
def nodes_from_tree(tree, parent, r=[]):
    r = []
    if len(tree.get_children(parent)) == 0:
        return parent
    for child in tree.get_children(parent):
        r.append(nodes_from_tree(tree, child, r))
    return r
Edit: This is the tree structure:
parent1 parent2 parent3
| | |
| | |
child | |
| |
+--------------+ |
| | | |
child child child |
| |
+---+---+ |
child child +---+---+
| |
child |
|
+-----+-----+-----+
| | | |
child child child child
Available methods:
tree.get_parents() # returns the nodes of the very top level
tree.get_children(node) # returns the children of parent or child
I think your problem is just that you're accumulating things incorrectly.
First, if you hit an intermediate node, each child should return a list, but you're appending that list instead of extending it. So, instead of [1, 2, 3, 4] you're going to get something like [[1, 2], [3, 4]]—in other words, you're just transforming it into a list-of-list tree, not a flat list. Change this to extend.
Second, if you hit a leaf node, you're not returning a list at all, just parent. Change this to return [parent].
Third, if you hit an intermediate node, you don't include parent anywhere, so you're only going to end up with the leaves. But you wanted all the nodes. So change the r = [] to r = [parent].
And with that last change, you don't need the if block at all. If there are no children, the loop will happen 0 times, and you'll end up returning [parent] as-is, exactly as you wanted to.
So:
def nodes_from_tree(tree, parent, r=[]):
    r = [parent]
    for child in tree.get_children(parent):
        r.extend(nodes_from_tree(tree, child, r))
    return r
Meanwhile, while this version will work, it's still confused. You're mixing up two different styles of recursion. Passing an accumulator down the chain and adding to it on the way down is one way to do it; returning values up the chain and accumulating results on the way up is the other. You're doing half of each.
As it turns out, the way you're doing the upstream recursion is making the downstream recursion have no effect at all. While you do pass an r down to each child, you never modify it, or even use it; you just create a new r list and return that.
The easiest way to fix that is to just remove the accumulator argument:
def nodes_from_tree(tree, parent):
    r = [parent]
    for child in tree.get_children(parent):
        r.extend(nodes_from_tree(tree, child))
    return r
(It's worth noting that branching recursion can only be tail-call-optimized if you do it in downstream accumulator style instead of upstream gathering style. But that doesn't really matter in Python, because Python doesn't do tail call optimization. So, write whichever one makes more sense to you.)
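To make the two styles concrete, here is a small runnable sketch. The question does not show the tree class, so DictTree below is a hypothetical stand-in that only provides the two methods listed in the question (get_parents and get_children); collect_nodes is likewise an illustrative name for the accumulator-style variant.

```python
class DictTree:
    """Hypothetical stand-in for the question's tree, backed by a dict."""

    def __init__(self, roots, children):
        self._roots = roots        # top-level nodes
        self._children = children  # maps a node to its list of children

    def get_parents(self):
        return list(self._roots)

    def get_children(self, node):
        return self._children.get(node, [])


def nodes_from_tree(tree, parent):
    """Upstream style: each call returns its own list, merged by the caller."""
    r = [parent]
    for child in tree.get_children(parent):
        r.extend(nodes_from_tree(tree, child))
    return r


def collect_nodes(tree, parent, acc):
    """Downstream style: one shared list is passed down and appended to."""
    acc.append(parent)
    for child in tree.get_children(parent):
        collect_nodes(tree, child, acc)


tree = DictTree(
    roots=["p1", "p2"],
    children={"p1": ["c1"], "p2": ["c2", "c3"], "c3": ["c4", "c5"]},
)

all_nodes = []
for top in tree.get_parents():
    all_nodes.extend(nodes_from_tree(tree, top))
print(all_nodes)  # ['p1', 'c1', 'p2', 'c2', 'c3', 'c4', 'c5']

acc = []  # a fresh list per traversal, never a mutable default argument
for top in tree.get_parents():
    collect_nodes(tree, top, acc)
print(acc == all_nodes)  # True
```

Passing the accumulator explicitly, rather than as a default argument like r=[], avoids the list being shared between separate top-level calls.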
If I understand your question, you want to make a flat list containing all the values in a tree. In that case, for a tree represented by nested tuples, the following would work:
def nodes_from_tree(tree, nodes=list()):
    if isinstance(tree, tuple):
        for child in tree:
            nodes_from_tree(child, nodes=nodes)
    else:
        nodes.append(tree)

mynodes = []
tree = (('Root',
    ('Parent', (
        ('Child1',),
        ('Child2',)
        )
    ),
    ('Parent2', (
        ('child1', (
            ('childchild1', 'childchild2')
        )),
        ('child2',),
        ('child3',)
    )),
    ('Parent3', (
        ('child1',),
        ('child2', (
            ('childchild1',),
            ('childchild2',),
            ('childchild3',),
            ('childchild4',)
        ))
    ))
))
nodes_from_tree(tree, nodes=mynodes)
print(mynodes)
Produces
['Root', 'Parent', 'Child1', 'Child2', 'Parent2', 'child1', 'childchild1', 'childchild2',
'child2', 'child3', 'Parent3', 'child1', 'child2', 'childchild1', 'childchild2', 'childchild3', 'childchild4']
