pydot has a huge number of bound methods for getting and setting every little thing in a dot graph, reading and writing, you-name-it, but I can't seem to find a simple membership test.
>>> d = pydot.Dot()
>>> n = pydot.Node('foobar')
>>> d.add_node(n)
>>> n in d.get_nodes()
False
is just one of many things that didn't work. It appears that nodes, once added to a graph, acquire a new identity:
>>> d.get_nodes()[0]
<pydot.Node object at 0x171d6b0>
>>> n
<pydot.Node object at 0x1534650>
Can anyone suggest a way to create a node and test to see if it's in a graph before adding it so you could do something like this:
d = pydot.Dot()
n = pydot.Node('foobar')
if n not in d:
    d.add_node(n)
Looking through the source code, http://code.google.com/p/pydot/source/browse/trunk/pydot.py, it seems that node names are unique values, used as the keys to locate the nodes within a graph's node dictionary (though, interestingly, rather than return an error for an existing node, it simply adds the attributes of the new node to those of the existing one).
So unless you want to add a __contains__() implementation that performs this check to one of the classes in pydot.py, you can just do the following in your own code:
if n.get_name() not in d.obj_dict['nodes']:
    d.add_node(n)
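The __contains__() idea mentioned above can be sketched without touching pydot itself. The classes below are simplified stand-ins written only for illustration (SimpleNode and NameKeyedGraph are hypothetical names, not pydot classes). They mimic the behaviour described in this thread: nodes are stored under their names, re-adding a name merges attributes, and get_nodes() hands back fresh wrapper objects.

```python
class SimpleNode:
    """Hypothetical stand-in for a node that is identified by its name."""

    def __init__(self, name, **attrs):
        self.name = name
        self.attrs = dict(attrs)


class NameKeyedGraph:
    """Hypothetical stand-in for a graph that keeps nodes in a dict keyed by name."""

    def __init__(self):
        self._nodes = {}  # name -> attribute dict

    def add_node(self, node):
        # Re-adding an existing name merges the new attributes into the old ones.
        self._nodes.setdefault(node.name, {}).update(node.attrs)

    def get_nodes(self):
        # Builds new wrapper objects on every call, so identity differs
        # from the objects that were originally added.
        return [SimpleNode(name, **attrs) for name, attrs in self._nodes.items()]

    def __contains__(self, node):
        # Membership is decided by name, not by object identity.
        return node.name in self._nodes


d = NameKeyedGraph()
n = SimpleNode('foobar')
if n not in d:
    d.add_node(n)

print(n in d.get_nodes())  # False: the list holds different objects
print(n in d)              # True: the name is a key in the node dict
```

With real pydot objects, the one-line name check shown in the answer achieves the same effect without subclassing anything.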
I have a MultiDiGraph with all my data in it, now I want to do some math on a filtered view of it that has only single directed edges between nodes.
>>> filtered_view[0][1]
Out[23]: AtlasView(FilterAtlas({0: {'d': 0.038, 'l': 2, 'showfl': True, 'type': 'pipe', 'q': 0.0001}}, <function FilterMultiInner.__getitem__.<locals>.new_node_ok at 0x7fa0987b55a0>))
I already have a lot of code that works on a DiGraph, and much of it would no longer work because of the differences in how the information is stored and accessed. Hence my question:
Is there a way to have the view behave like a DiGraph?
Alternatively, I can do ndg = nx.DiGraph(filtered_view) to get a DiGraph, but is there a smart (simple, clear, error-free) way of merging it back into the main graph?
This is the implementation I came up with. It either merges only the data on existing nodes and edges (allnodes=False) or merges the entire results_graph, which is a DiGraph (allnodes=True). The condition is that the MultiDiGraph has not changed since the filtered view was created.
def merge_results_back(results_graph, multidigraph, allnodes=False):
    for n in results_graph.nodes:
        if n not in multidigraph.nodes and allnodes:
            multidigraph.add_node(n)
        if n in multidigraph.nodes:
            nx.set_node_attributes(multidigraph, {n : results_graph.nodes[n]})
    for e in results_graph.edges:
        if e in multidigraph.edges:
            for ed1, ed2, key, data in multidigraph.edges(e[0], keys=True, data=True):
                if data['type'] == results_graph.edges[e]['type']:
                    nx.set_edge_attributes(multidigraph, {(e[0], e[1], key) : results_graph.edges[e]})
        else:
            nx.set_edge_attributes(multidigraph, {(e[0], e[1], 0): results_graph.edges[e]})
Offering a couple of suggestions for improvement here based on the code that you posted. It's unclear under what circumstances a node would be added (if the DiGraph is based on the MultiDiGraph, how is a new node possible?), so I'll leave that part alone.
In the loop for modifying edges, you end up looping through multidigraph every time a common edge is found. As an improvement, I'd suggest the following (assuming the type attribute differs based on the edge index, which wasn't clear in your question):
for u, v, data in results_graph.edges(data=True):
    # only loops through each edge in the multidigraph one time
    for i in range(multidigraph.number_of_edges(u, v)):
        if multidigraph.edges[u, v, i]['type'] == data['type']:
            multidigraph.edges[u, v, i].update(data)
If the type doesn't change based on the index, just eliminate that if statement line.
I think you can also get rid of the else block:
else:
    nx.set_edge_attributes(multidigraph, {(e[0], e[1], 0): results_graph.edges[e]})
If edge e from results_graph isn't in multidigraph, then setting the edge attributes won't create edge e and it will be silently ignored. If you have a new edge and attributes though (again, unclear how this is possible if results_graph was created from multidigraph), you can add the following directly under the for u, v, data... line:
if (u, v) not in multidigraph.edges:
    multidigraph.add_edge(u, v, **data)
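Putting these suggestions together, a minimal sketch could look like the following. It assumes networkx 2.x or later and that the 'type' attribute is what pairs a DiGraph edge with one of the parallel edges. The function name merge_edge_data and the sample graphs are made up for illustration. Unlike the range-based loop above, it iterates over the edge keys that actually exist, so it does not rely on the keys being 0, 1, 2, and so on.

```python
import networkx as nx


def merge_edge_data(results_graph, multidigraph):
    """Copy edge attributes from a DiGraph onto matching MultiDiGraph edges."""
    for u, v, data in results_graph.edges(data=True):
        if not multidigraph.has_edge(u, v):
            # Edge only exists in the results graph: add it with its data.
            multidigraph.add_edge(u, v, **data)
            continue
        # multidigraph[u][v] maps each parallel-edge key to its attribute dict.
        for key in multidigraph[u][v]:
            attrs = multidigraph.edges[u, v, key]
            if attrs.get('type') == data.get('type'):
                attrs.update(data)


# Hypothetical sample data: two parallel edges 0 -> 1 of different types.
M = nx.MultiDiGraph()
M.add_edge(0, 1, type='pipe', d=0.038)
M.add_edge(0, 1, type='pump')

R = nx.DiGraph()
R.add_edge(0, 1, type='pipe', q=0.0001)  # result for the 'pipe' edge
R.add_edge(1, 2, type='pipe', q=0.5)     # edge that M does not have yet

merge_edge_data(R, M)
print(M.edges[0, 1, 0])
print(M.edges[0, 1, 1])
print(M.has_edge(1, 2))
```

Only the parallel edge whose 'type' matches receives the new attributes; the other one is left untouched.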
I want to individually update nodes for a certain new custom attribute called 'journeys', but am having severe difficulties with it.
Say I have some graph:
import pandas as pd
import networkx as nx
cols = ['node_a','node_b','travel_time','attribute']
data = [['A','B',3,'attribute1'],
        ['B','C',1,'attribute1'],
        ['C','D',7,'attribute1'],
        ['D','E',3,'attribute1'],
        ['E','F',2,'attribute1'],
        ['F','G',4,'attribute1'],
        ['A','L',4,'attribute2'],
        ['L','D',3,'attribute2']
       ]
edges = pd.DataFrame(data)
edges.columns = cols
G=nx.convert_matrix.from_pandas_edgelist(edges,'node_a','node_b', ['travel_time','attribute'])
For each node, I want to add an attribute of the form {direction: [[id, timestamp, set_of_carrying_items]]}, where the value is a list of lists, because I want to append more lists of the form [id, timestamp, carrying_items] to it.
Example: Update a particular node A with
new_attribute = {'A':{'up': [[0, Timestamp('1900-01-01 05:31:00'), set()]]}}
However, no matter what I try, the node doesn't get updated correctly. nx.get_node_attributes(G, 'A') returns an empty dictionary, but nx.get_node_attributes(G, 'up') returns the attribute!
It seems I'm setting it wrongly, but I can't figure out how. Does anyone know the proper way?
Using networkx 2.4
Solved it.
nx.get_node_attributes(G, X) returns the attribute called X for all nodes. To read the attributes of a single node, you're looking for G.nodes['nodeA'] (spelled G.node['nodeA'] in older networkx releases).
If your attribute dictionary is of the form {'nodeA': {'attribute_name': value}}, set_node_attributes sets attribute_name to value on node nodeA.
For example, in my case:
att = {'nodeA':{'up': [[1, Timestamp('1900-01-01 22:31:00'), set()]]}}
nx.set_node_attributes(G,att)
works.
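The difference between the two lookups can be seen in a short sketch. It assumes a networkx 2.x-style API and, to stay free of pandas, uses plain strings where the question uses Timestamp objects; the tiny two-node graph is made up for illustration.

```python
import networkx as nx

G = nx.Graph()
G.add_edge('A', 'B', travel_time=3)

# Outer key: node name. Inner key: attribute name.
nx.set_node_attributes(G, {'A': {'up': [[0, '1900-01-01 05:31:00', set()]]}})

per_node = G.nodes['A']                    # every attribute stored on node 'A'
by_name = nx.get_node_attributes(G, 'up')  # attribute 'up' for each node that has it
missing = nx.get_node_attributes(G, 'A')   # empty: no node has an attribute named 'A'

# Appending another journey mutates the list stored on the node.
G.nodes['A']['up'].append([1, '1900-01-01 22:31:00', set()])

print(per_node)
print(by_name)
print(missing)
```

Because the list is stored on the node itself, further [id, timestamp, carrying_items] entries can be appended in place without calling set_node_attributes again.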
I have overcome the problem of creating duplicate nodes in my DB by using the merge_one function, which works like this:
t=graph.merge_one("User","ID","someID")
which creates the node with a unique ID. My problem is that I can't find a way to add multiple attributes/properties (a date, for example) to my node along with the ID, which is added automatically.
I managed to do this the old "duplicate" way, but that doesn't work now because merge_one doesn't accept more arguments. Any ideas?
Graph.merge_one only allows you to specify one key-value pair because it's meant to be used with a uniqueness constraint on a node label and property. Is there anything wrong with finding the node by its unique id with merge_one and then setting the properties?
t = graph.merge_one("User", "ID", "someID")
t['name'] = 'Nicole'
t['age'] = 23
t.push()
I know I am a bit late, but I think this is still useful.
Using py2neo==2.0.7 and the docs (about Node.properties):
... and the latter is an instance of PropertySet which extends dict.
So the following worked for me:
m = graph.merge_one("Model", "mid", MID_SR)
m.properties.update({
    'vendor':"XX",
    'model':"XYZ",
    'software':"OS",
    'modelVersion':"",
    'hardware':"",
    'softwareVesion':"12.06"
})
graph.push(m)
This hacky function iterates through the labels, properties and values, gradually eliminating every node that doesn't match each criterion submitted. The final result is a list of all (if any) nodes that match all of the properties and labels supplied.
def find_multiProp(graph, *labels, **properties):
    # look up the nodes that carry label l and have property k set to v
    genNodes = lambda l, k, v: graph.find(l, property_key=k, property_value=v)
    results = None
    for l in labels:
        for k, v in properties.items():
            if results is None:
                # first criterion: start from every node that matches it
                results = [r for r in genNodes(l, k, v)]
                continue
            # later criteria: keep only the nodes that matched everything so far
            prevResults = results
            results = [n for n in genNodes(l, k, v) if n in prevResults]
    return results
The final result can be used to assess uniqueness and (if empty) create a new node, by combining the two functions together...
def merge_one_multiProp(graph, *labels, **properties):
    r = find_multiProp(graph, *labels, **properties)
    if not r:
        # remove tuple association
        node, = graph.create(Node(*labels, **properties))
    else:
        node = r[0]
    return node
example...
from py2neo import Node, Graph
graph = Graph()
properties = {'p1':'v1', 'p2':'v2'}
labels = ('label1', 'label2')
graph.create(Node(*labels, **properties))
for l in labels:
    graph.create(Node(l, **properties))
graph.create(Node(*labels, p1='v1'))
node = merge_one_multiProp(graph, *labels, **properties)
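To make the matching rule easier to see in isolation, here is a minimal sketch that applies the same "every label and every property must match" test to plain dictionaries. It does not use py2neo: the record layout ('labels' and 'props' keys) and the name find_all_matching are assumptions made up for illustration, and the sample records mirror the four nodes created in the example above.

```python
def find_all_matching(nodes, labels, properties):
    """Return the records that carry every label and every property value."""
    wanted_labels = set(labels)
    return [
        node for node in nodes
        if wanted_labels <= node['labels']
        and all(node['props'].get(k) == v for k, v in properties.items())
    ]


# Hypothetical in-memory records standing in for database nodes.
nodes = [
    {'labels': {'label1', 'label2'}, 'props': {'p1': 'v1', 'p2': 'v2'}},
    {'labels': {'label1'}, 'props': {'p1': 'v1', 'p2': 'v2'}},
    {'labels': {'label2'}, 'props': {'p1': 'v1', 'p2': 'v2'}},
    {'labels': {'label1', 'label2'}, 'props': {'p1': 'v1'}},
]

matches = find_all_matching(nodes, ('label1', 'label2'), {'p1': 'v1', 'p2': 'v2'})
print(len(matches))  # 1: only the first record has both labels and both properties
```

Only the record with both labels and both properties survives, which is the case in which the merge function above would reuse an existing node instead of creating one.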
Say I'm given a tuple of strings, representing relationships between objects, for example:
connections = ("dr101-mr99", "mr99-out00", "dr101-out00", "scout1-scout2","scout3-scout1", "scout1-scout4", "scout4-sscout", "sscout-super")
each dash "-" shows a relationship between the two items in the string. Then I'm given two items:
first = "scout2"
second = "scout3"
How might I go about finding if first and second are interrelated, meaning I could find a path that connects them, not necessarily if they are just in a string group.
You can try concatenating the strings and using the in operator to check if it is an element of the tuple connections:
if first + "-" + second in connections:
    # ...
Edit:
You can also use the join() function:
if "-".join((first, second)) in connections:
    # ...
If you plan on doing this any number of times, I'd consider frozensets...
connections_set = set(frozenset(c.split('-')) for c in connections)
Now you can do something like:
if frozenset((first, second)) in connections_set:
    ...
and you have an O(1) solution (plus the O(N) upfront investment). Note that I'm assuming the order of the pairs is irrelevant. If it's relevant, just use a tuple instead of frozenset and you're good to go.
If you actually need to walk through a graph, an adjacency list implementation might be a little better.
from collections import defaultdict

adjacency_dict = defaultdict(list)
for c in connections:
    left, right = c.split('-')
    adjacency_dict[left].append(right)
    # if undirected: adjacency_dict[right].append(left)

class DFS(object):
    def __init__(self, graph):
        self.graph = graph

    def is_connected(self, node1, node2):
        # collect every node reachable from node1, then test node2
        self._seen = set()
        self._walk_connections(node1)
        output = node2 in self._seen
        del self._seen
        return output

    def _walk_connections(self, node):
        if node in self._seen:
            return
        self._seen.add(node)
        for subnode in self.graph[node]:
            self._walk_connections(subnode)

print(DFS(adjacency_dict).is_connected(first, second))
Note that this implementation is definitely suboptimal (for example, I don't stop when I find the node I'm looking for), and I don't check for an optimal path from node1 to node2. For that, you'd want something like Dijkstra's algorithm.
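As a rough illustration of the early-exit point, here is a sketch of an iterative breadth-first search that returns as soon as the target is reached. It treats every connection as undirected, which is an assumption about what the question intends, and the helper names build_adjacency and is_connected are made up for this example.

```python
from collections import defaultdict, deque


def build_adjacency(connections):
    """Build an undirected adjacency map from 'a-b' strings."""
    adjacency = defaultdict(set)
    for c in connections:
        left, right = c.split('-')
        adjacency[left].add(right)
        adjacency[right].add(left)
    return adjacency


def is_connected(adjacency, start, goal):
    """Breadth-first search that stops as soon as goal is reached."""
    if start == goal:
        return True
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # .get avoids creating empty entries for unknown nodes
        for neighbour in adjacency.get(node, ()):
            if neighbour == goal:
                return True
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


connections = ("dr101-mr99", "mr99-out00", "dr101-out00", "scout1-scout2",
               "scout3-scout1", "scout1-scout4", "scout4-sscout", "sscout-super")
adjacency = build_adjacency(connections)
print(is_connected(adjacency, "scout2", "scout3"))  # True, via scout1
print(is_connected(adjacency, "scout2", "dr101"))   # False, different group
```

Using an explicit queue instead of recursion also avoids hitting Python's recursion limit on long chains of connections.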
You could use a set of pairs (tuples):
connections = {("dr101", "mr99"), ("mr99", "out00"), ("dr101", "out00")} # ...
if ("scout2", "scout3") in connections:
    print("scout2-scout3 in connections")
This only works if the 2 elements are already in the right order, though, because ("scout3", "scout2") != ("scout2", "scout3"), but maybe this is what you want.
If the order of the items in the connection is not significant, you can use a set of frozensets instead (see mgilson's answer). Then you can look up pairs of items regardless of the order in which they appear, but the order of the original pairs in connections is lost.
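If both properties matter, one possible compromise is a dictionary keyed by frozenset whose values remember the original orientation. This is only a sketch; the name original_pair is made up, and it assumes item names never contain a dash.

```python
connections = ("dr101-mr99", "mr99-out00", "dr101-out00", "scout1-scout2",
               "scout3-scout1", "scout1-scout4", "scout4-sscout", "sscout-super")

# Map each unordered pair to the ordered pair as it was originally written.
original_pair = {}
for c in connections:
    left, right = c.split('-')
    original_pair[frozenset((left, right))] = (left, right)

key = frozenset(("scout1", "scout3"))  # looked up in the "wrong" order
print(key in original_pair)            # True
print(original_pair[key])              # ('scout3', 'scout1')
```

The lookup stays order-independent, while the stored value still tells you which way round the pair was given.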
I've found related methods:
find - doesn't work because this version of neo4j doesn't support labels.
match - doesn't work because I cannot specify a relation, because the node has no relations yet.
match_one - same as match.
node - doesn't work because I don't know the id of the node.
I need an equivalent of:
start n = node(*) where n.name? = "wvxvw" return n;
Cypher query. Seems like it should be basic, but it really isn't...
PS. I'm opposed to using Cypher for too many reasons to mention. So that's not an option either.
Well, you should create indexes so that the set of start nodes is reduced. This will be taken care of automatically once labels are available, but in the meantime there is a workaround.
Create an index, say "label", which will have keys pointing to the different types of nodes you will have (in your case, say 'Person')
Now while searching you can write the following query :
START n = node:label(key_name='Person') WHERE n.name = 'wvxvw' RETURN n; //key_name is the key's name you will assign while creating the node.
user797257 seems to be out of the game, but I think this could still be useful:
If you want to get nodes, you need to create an index. An index in Neo4j is the same as in MySQL or any other database (if I understand correctly). Labels are basically auto-indexes, but an index offers additional speed. (I use both.)
Somewhere near the top of your code, or in neo4j itself, create an index:
index = graph_db.get_or_create_index(neo4j.Node, "index_name")
Then, create your node as usual, but do add it to the index:
new_node = batch.create(node({"key":"value"}))
batch.add_indexed_node(index, "key", "value", new_node)
Now, if you need to find your new_node, execute this:
new_node_ref = index.get("key", "value")
This returns a list. new_node_ref[0] has the top item, in case you want/expect a single node.
Use a selector to obtain a node from the graph.
The following code fetches the first node from the list of nodes matching the search:
from py2neo import NodeSelector
selector = NodeSelector(graph)
node = selector.select("Label",key='value')
nodelist=list(node)
m_node=node.first()
Using py2neo, this hacky function iterates through the labels, properties and values, gradually eliminating every node that doesn't match each criterion submitted. The final result is a list of all (if any) nodes that match all of the properties and labels supplied.
def find_multiProp(graph, *labels, **properties):
    # look up the nodes that carry label l and have property k set to v
    genNodes = lambda l, k, v: graph.find(l, property_key=k, property_value=v)
    results = None
    for l in labels:
        for k, v in properties.items():
            if results is None:
                # first criterion: start from every node that matches it
                results = [r for r in genNodes(l, k, v)]
                continue
            # later criteria: keep only the nodes that matched everything so far
            prevResults = results
            results = [n for n in genNodes(l, k, v) if n in prevResults]
    return results
See my other answer for creating a merge_one() that accepts multiple properties.