neo4j query for a node using python rest client - python

I have nodes in an index with the following properties:
{'user_id': u'00050714572570434939', 'hosts': [u'http://shyjive.blogspot.com/'], 'follows': ['null']}
Now I have an index and am trying a simple query against it to get the nodes:
index = gdb.nodes.indexes.create('blogger2')
uid = gdb.nodes.create()
uid["hosts"] = ['http://shyjive.blogspot.com/']
uid["user_id"] = "00050714572570434939"
uid["follows"] = ['null']
print index["user_id"]["00050714572570434939"][:]
This returns []. What is wrong here?
The reason I am using a Python list, as suggested by developers on the Neo4j group, is that I want to store multi-valued properties on the node, so I am using a list here instead of an array.

You first need to index the node. If you are not using automatic indexing, the code for neo4j-rest-client would be:
index["user_id"]["00050714572570434939"] = uid
Now you have:
>>> index["user_id"]["00050714572570434939"][:]
[<Neo4j Node: http://localhost:7474/db/data/node/38>]

Related

Cypher query problem when trying to find max of a returned column under certain relation id

I am facing a strange problem. I call the same function get_objects() four times and take the maximum of the returned column. The item 10172, which should be returned as the maximum, is present in the result list, but I get 9998 instead, which is not the maximum. For two of the other calls, with different parameters, the results are correct.
I ran the statement in the Neo4j browser and saw the same behavior, as if that node did not exist. But when I search for node 10172 individually, it does exist in the database. Why is it not returned as the maximum in the final result?
I also exported a CSV file from Neo4j to double-check the relation and the presence of that node. It exists. Where am I going wrong?
My data is stored in a graph database as 4 types of nodes connected by 4 different relations, with the relation id attribute taking the values (1, 2, 3, 4). In the Cypher query I am trying to get the maximum paper id for relation 1. The problem seems to occur with the calls for relation 1 and relation 4, but I rechecked the database and these nodes are present under those relations.
Here is what I have tried so far.
def get_objects(x):
    par = str(x)
    query = ''' MATCH (p)-[r]->(a) WHERE r.id = $par RETURN a.id '''
    resultNodes = session.run(query, par = par)
    df = DataFrame(resultNodes)
    return df[0]

def find_max_1():
    authors,terms,venues,papers=0,0,0,0
    authors=get_objects(1).max()
    terms=get_objects(2).max()
    venues=get_objects(3).max()
    papers=get_objects(4).max()
    return authors,terms,venues,papers

def main():
    m = find_max_1()

if __name__ == "__main__":
    main()
The output is:
[9998, 14669, 10190, 9999]
Expected output:
[10172, 14669, 10190, 15648]
Any kind of help would be appreciated!
Thanks in advance.
The problem was that the returned values were strings, so max() was comparing them as strings (character by character) instead of as integers.
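A minimal sketch of the effect, using made-up ID strings rather than the asker's actual data. Strings are compared character by character, so "9998" ranks above "10172"; converting to int before taking the maximum gives the numeric answer.

```python
# Hypothetical IDs, returned as strings (as happens when the property is stored as text).
ids = ["9998", "10172", "512"]

# Lexicographic comparison: "9" > "1", so "9998" wins.
string_max = max(ids)

# Convert to int first to compare numerically.
numeric_max = max(int(i) for i in ids)

print(string_max)   # 9998
print(numeric_max)  # 10172
```

With a pandas Series such as the one returned by get_objects(), the equivalent conversion would presumably be df[0].astype(int).max(), assuming every value parses as an integer.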

Correct way to extract values from Django Query

How do you assign the results of a Django query to a new set of variables?
For example, when I run this query in my Django shell:
getDefaults = defaultParams.objects.filter(isDefault = True, device = 41)
then run:
for val in getDefaults.values():
    print(val)
It returns:
{'id': 2, 'device_id': 41, 'customerTag': 'ABCD001', 'isDefault': True}
I would like to use the values in this dictionary to save a new record into my database, but can't seem to extract the values from the dictionary?
I thought it would be something like:
device_id = getDefaults.device_id
NewCustomerTag = getDefaults.customerTag
values() returns a QuerySet that yields dictionaries, so you would need to use dictionary syntax: val['device_id'].
But in your case there is no reason to do that. Skip the values call altogether, and you will get instances of defaultParams on which you can use normal attribute lookups:
for val in getDefaults:
    print(val.device_id)
You cannot do getDefaults.device_id, since defaultParams.objects.filter returns a QuerySet of matched objects, not a single object.
If you are sure you get a single match, do as follows:
device_id = getDefaults[0].device_id
NewCustomerTag = getDefaults[0].customerTag
Or you can iterate through the QuerySet and use each object's attributes:
for val in getDefaults:
    device_id = val.device_id
    NewCustomerTag = val.customerTag
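If you want to save the retrieved object as a new object, I would suggest the following approach. Fetch the instance once and keep a reference to it, because each getDefaults[0] lookup on an unevaluated QuerySet runs a new query and returns a fresh instance:
getDefaults = defaultParams.objects.filter(isDefault = True, device = 41)
new_obj = getDefaults[0]
new_obj.pk = None  # clearing the primary key makes save() insert a new row
new_obj.save()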
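If you do want to work from the values() dictionary instead, the following sketch shows the dictionary access on the row printed in the question. It runs without Django; the final create() call is only mentioned in a comment and is an assumption about how the fields would be used.

```python
# The dictionary printed in the question, as yielded by getDefaults.values().
val = {'id': 2, 'device_id': 41, 'customerTag': 'ABCD001', 'isDefault': True}

# Individual values are read with dictionary syntax, not attribute access.
device_id = val['device_id']
new_customer_tag = val['customerTag']

# Drop the primary key so the remaining fields can describe a new record.
new_fields = {k: v for k, v in val.items() if k != 'id'}
print(new_fields)

# Assumption: inside a Django project these fields could then be passed on as
# defaultParams.objects.create(**new_fields)
```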

Parsing py2neo paths into Pandas

We are returning paths from a Cypher query using py2neo. We would like to parse the result into a Pandas DataFrame. The Cypher query is similar to the following:
query = '''MATCH p=allShortestPaths((p1:Type1)-[r*..3]-(p2:Type1))
WHERE p1.ID = 123456
RETURN DISTINCT p'''
result = graph.run(query)
The resulting object is a walkable object - which can be traversed. It should be noted that the Nodes and Relationships don't have the same properties.
What would be the most Pythonic way to iterate over the object? Is it necessary to process the entire path, or, since the object is a dictionary, is it possible to use the pandas DataFrame.from_dict method? One issue is that the paths are not always the same length.
Currently we enumerate the object: if the index is even, we process the item as a Node; otherwise we process it as a Relationship.
for index, item in enumerate(paths):
    if index % 2 == 0:
        pass  # process as Node
    else:
        pass  # process as Relationship
We can use the isinstance method, i.e.
if isinstance(item, py2neo.types.Node):
    pass  # process as Node
But that still requires processing every element separately.
I solved the problem as follows:
I wrote a function that receives a list of paths together with the property names of the nodes and relationships:
# Imports assumed for the py2neo version in use
from collections import OrderedDict
from py2neo import Node, Relationship, walk

def neo4j_graph_to_dict(paths, node_properties, rels_properties):
    paths_dict = OrderedDict()
    for (pathID, path) in enumerate(paths):
        paths_dict[pathID] = {}
        for (i, node_rel) in enumerate(path):
            # Look up every known property on the current element
            n_properties = [node_rel[np] for np in node_properties]
            r_properties = [node_rel[rp] for rp in rels_properties]
            if isinstance(node_rel, Node):
                node_format = [np + ': {}|' for np in node_properties]
                paths_dict[pathID]['Node' + str(i)] = ('{}: ' + ' '.join(node_format)).format(list(node_rel.labels())[0], *n_properties)
            elif isinstance(node_rel, Relationship):
                rel_format = [rp + ': {}|' for rp in rels_properties]
                reltype = 'Rel' + str(i - 1)
                paths_dict[pathID][reltype] = ('{}: ' + ' '.join(rel_format)).format(node_rel.type(), *r_properties)
    return paths_dict
Assuming the query returns the paths, nodes and relationships we can run the following code:
query='''MATCH paths=allShortestPaths(
(pr1:Type1 {ID:'123456'})-[r*1..9]-(pr2:Type2 {ID:'654321'}))
RETURN paths, nodes(paths) as nodes, rels(paths) as rels'''
df_qf = pd.DataFrame(graph.data(query))
node_properties = set([k for series in df_qf.nodes for node in series for k in node.keys() ]) # get unique property names for Nodes
rels_properties = set([k for series in df_qf.rels for rel in series for k in rel.keys() ]) # get unique property names for Rels
wg = [(walk(path)) for path in df_qf.paths ]
paths_dict = neo4j_graph_to_dict(wg, node_properties, rels_properties)
df = pd.DataFrame(paths_dict).transpose()
df = pd.DataFrame(df, columns=paths_dict[0].keys()).drop_duplicates()
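To see the flattening idea without a database, here is a rough sketch that uses stand-in classes instead of py2neo objects. FakeNode, FakeRel and walk_to_row are hypothetical names for illustration only. It shows how an alternating node/relationship walk can become one row per path, with shorter paths padded with None.

```python
from collections import OrderedDict

# Stand-ins for py2neo's Node and Relationship (hypothetical, illustration only).
class FakeNode(dict):
    pass

class FakeRel(dict):
    pass

def walk_to_row(walk):
    """Flatten an alternating node/relationship sequence into one flat row."""
    row = OrderedDict()
    node_count = rel_count = 0
    for item in walk:
        if isinstance(item, FakeNode):
            prefix = "Node%d" % node_count
            node_count += 1
        else:
            prefix = "Rel%d" % rel_count
            rel_count += 1
        # Prefix each property with its position in the path.
        for key, value in item.items():
            row["%s.%s" % (prefix, key)] = value
    return row

short_path = [FakeNode(ID=1), FakeRel(kind="KNOWS"), FakeNode(ID=2)]
long_path = short_path + [FakeRel(kind="OWNS"), FakeNode(ID=3)]

rows = [walk_to_row(p) for p in (short_path, long_path)]

# Collect every column seen, then pad shorter paths with None.
columns = []
for row in rows:
    for col in row:
        if col not in columns:
            columns.append(col)
table = [[row.get(col) for col in columns] for row in rows]
print(columns)
print(table)
```

A list of row dictionaries like rows can also be handed to pd.DataFrame(rows), which fills the missing columns of shorter paths with NaN.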

py2neo how to retrieve a node based on node's property?

I've found related methods:
find - doesn't work because this version of neo4j doesn't support labels.
match - doesn't work because I cannot specify a relation, because the node has no relations yet.
match_one - same as match.
node - doesn't work because I don't know the id of the node.
I need an equivalent of:
start n = node(*) where n.name? = "wvxvw" return n;
Cypher query. Seems like it should be basic, but it really isn't...
PS. I'm opposed to using Cypher for too many reasons to mention. So that's not an option either.
Well, you should create indexes so that the set of start nodes is reduced. This will be taken care of automatically with the use of labels, but in the meantime there is a workaround.
Create an index, say "label", whose keys point to the different types of nodes you will have (in your case, say 'Person').
Now, when searching, you can write the following query:
START n = node:label(key_name='Person') WHERE n.name = 'wvxvw' RETURN n; //key_name is the key's name you will assign while creating the node.
user797257 seems to be out of the game, but I think this could still be useful:
If you want to get nodes, you need to create an index. An index in Neo4j is the same as in MySQL or any other database (if I understand correctly). Labels are basically auto-indexes, but an index offers additional speed. (I use both.)
Somewhere near the top of your code, or in Neo4j itself, create an index:
index = graph_db.get_or_create_index(neo4j.Node, "index_name")
Then, create your node as usual, but do add it to the index:
new_node = batch.create(node({"key":"value"}))
batch.add_indexed_node(index, "key", "value", new_node)
Now, if you need to find your new_node, execute this:
new_node_ref = index.get("key", "value")
This returns a list. new_node_ref[0] has the top item, in case you want/expect a single node.
Use a NodeSelector to obtain the node from the graph.
The following code fetches the first node from the list of nodes matching the search:
selector = NodeSelector(graph)
node = selector.select("Label",key='value')
nodelist=list(node)
m_node=node.first()
Using py2neo, this hacky function iterates through the labels and properties, gradually eliminating all nodes that don't match each criterion submitted. The final result is a list of all (if any) nodes that match all the properties and labels supplied.
def find_multiProp(graph, *labels, **properties):
    results = None
    for l in labels:
        for k, v in properties.iteritems():
            # Nodes with label l whose property k equals v
            matches = list(graph.find(l, property_key=k, property_value=v))
            if results is None:
                results = matches
                continue
            # Keep only nodes that also matched every earlier criterion
            prevResults = results
            results = [n for n in matches if n in prevResults]
    return results
See my other answer for creating a merge_one() that will accept multiple properties...
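The elimination logic can be tried in isolation, with plain dictionaries standing in for nodes. This is only a sketch: match_all and the sample data are hypothetical and not part of py2neo.

```python
def match_all(nodes, *labels, **properties):
    """Return the nodes that carry every given label and every given property value."""
    results = list(nodes)
    for label in labels:
        results = [n for n in results if label in n["labels"]]
    for key, value in properties.items():
        results = [n for n in results if n["props"].get(key) == value]
    return results

# Made-up sample data: each "node" has a set of labels and a property dict.
nodes = [
    {"labels": {"Person"}, "props": {"name": "wvxvw", "age": 30}},
    {"labels": {"Person"}, "props": {"name": "wvxvw", "age": 41}},
    {"labels": {"Robot"}, "props": {"name": "wvxvw", "age": 30}},
]

found = match_all(nodes, "Person", name="wvxvw", age=30)
print(found)
```

Each pass narrows the previous result, so only the first sample node survives all three criteria.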

Indexing nodes in neo4j in python

I'm building a database with tag nodes and URL nodes, where the URL nodes are connected to tag nodes. If the same URL is inserted into the database again, it should link to the tag node rather than create a duplicate URL node. I think indexing would solve this problem. How can I do indexing and traversal with neo4jrestclient? A link to a tutorial would be fine. I'm currently using versae's neo4jrestclient.
Thanks
neo4jrestclient supports both indexing and traversing the graph, but I think indexing alone could be enough for your use case. However, I'm not sure I understood your problem properly. Anyway, something like this could work:
>>> from neo4jrestclient.client import GraphDatabase
>>> gdb = GraphDatabase("http://localhost:7474/db/data/")
>>> idx = gdb.nodes.indexes.create("urltags")
>>> url_node = gdb.nodes.create(url="http://foo.bar", type="URL")
>>> tag_node = gdb.nodes.create(tag="foobar", type="TAG")
We add the property count to the relationship to keep track of the number of URLs "http://foo.bar" tagged with the tag foobar.
>>> url_node.relationships.create(tag_node["tag"], tag_node, count=1)
After that, we index the URL node by the value of its URL.
>>> idx["url"][url_node["url"]] = url_node
Then, when we need to create a new URL node tagged with a TAG node, we first query the index to check whether the URL is already indexed. If it is, we increment the count on the existing relationship; otherwise, we create the node and index it.
>>> new_url = "http://foo.bar2"
>>> nodes = idx["url"][new_url]
>>> if len(nodes):
...     rel = nodes[0].relationships.all(types=[tag_node["tag"]])[0]
...     rel["count"] += 1
... else:
...     new_url_node = gdb.nodes.create(url=new_url, type="URL")
...     new_url_node.relationships.create(tag_node["tag"], tag_node, count=1)
...     idx["url"][new_url_node["url"]] = new_url_node
An important concept is that the indexes are key/value/object triplets where the object is either a node or a relationship you want to index.
Steps to create and use the index:
Create an instance of the graph database rest client.
from neo4jrestclient.client import GraphDatabase
gdb = GraphDatabase("http://localhost:7474/db/data/")
Create a node or relationship index (Creating a node index here)
index = gdb.nodes.indexes.create('latin_genre')
Add nodes to the index
nelly = gdb.nodes.create(name='Nelly Furtado')
shakira = gdb.nodes.create(name='Shakira')
index['latin_genre'][nelly.get('name')] = nelly
index['latin_genre'][shakira.get('name')] = shakira
Fetch nodes based on the index and do further processing:
for artist in index['latin_genre']['Shakira']:
    print artist.get('name')
More details can be found in the notes in the webadmin:
Neo4j has two types of indexes, node and relationship indexes. With
node indexes you index and find nodes, and with relationship indexes
you do the same for relationships.
Each index has a provider, which is the underlying implementation
handling that index. The default provider is lucene, but you can
create your own index providers if you like.
Neo4j indexes take key/value/object triplets ("object" being a node or
a relationship), it will index the key/value pair, and associate this
with the object provided. After you have indexed a set of
key/value/object triplets, you can query the index and get back
objects that were indexed with key/value pairs matching your query.
For instance, if you have "User" nodes in your database, and want to
rapidly find them by username or email, you could create a node index
named "Users", and for each user index username and email. With the
default lucene configuration, you can then search the "Users" index
with a query like: "username:bob OR email:bob@gmail.com".
You can use the data browser to query your indexes this way, the
syntax for the above query is "node:index:Users:username:bob OR
email:bob@gmail.com".
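The key/value/object idea, and the "look up before creating" pattern it enables for avoiding duplicate URL nodes, can be modeled in a few lines of plain Python. This is a toy in-memory model, not the Neo4j or neo4jrestclient API; ToyIndex and get_or_create_url are hypothetical names.

```python
from collections import defaultdict

class ToyIndex:
    """In-memory model of a key/value/object index (not the Neo4j API)."""

    def __init__(self):
        self._entries = defaultdict(list)

    def add(self, key, value, obj):
        # Associate the object with the (key, value) pair.
        self._entries[(key, value)].append(obj)

    def get(self, key, value):
        # Return every object indexed under the (key, value) pair.
        return list(self._entries.get((key, value), []))

def get_or_create_url(index, url):
    """Reuse the indexed node for a URL, or create and index a new one."""
    hits = index.get("url", url)
    if hits:
        return hits[0], False
    node = {"url": url, "type": "URL"}
    index.add("url", url, node)
    return node, True

idx = ToyIndex()
# A lookup returns nothing until an object has been added under that key/value.
empty_lookup = idx.get("url", "http://foo.bar")
first, created_first = get_or_create_url(idx, "http://foo.bar")
second, created_second = get_or_create_url(idx, "http://foo.bar")
print(empty_lookup, created_first, created_second, first is second)
```

The empty first lookup mirrors the behavior in the first question above: creating a node does not by itself put it in an index, so a query returns an empty list until the node is added explicitly.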
