NEO4J APOC LOAD JSON FROM EXTERNAL VARIABLE - python

I'm trying to load a JSON document into Neo4j but, if possible, I'd like to avoid going through a file because, in my case, it's a needless extra step.
WHAT I'M DOING NOW:
1. Query Elasticsearch from Python.
2. Push the results into a .json file.
3. From the Neo4j Python library, run apoc.load.json('file:///file.json').
WHAT I WANT TO DO:
1. Query Elasticsearch from Python.
2. From the Neo4j Python library, run apoc.load.json() directly on the in-memory result.
Is there any syntax that could help me with that? Thank you.

If you already have APOC installed, you can use APOC's Elasticsearch procedures directly, without going through apoc.load.json.
Here is an example from the documentation:
CALL apoc.es.query("localhost", "bank", "_doc", null, {
  query: { match_all: {} },
  sort: [
    { account_number: "asc" }
  ]
})
YIELD value
UNWIND value.hits.hits AS hit
RETURN hit;
Link to docs: https://neo4j.com/labs/apoc/4.1/overview/apoc.es/
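Alternatively, you can skip files entirely by passing the Elasticsearch results to Neo4j as a query parameter and UNWINDing them in Cypher. A minimal sketch, assuming the official neo4j Python driver; the connection details and the es_hits variable are illustrative, not from the question:

from neo4j import GraphDatabase

# Hypothetical connection details; adjust to your setup.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# es_hits stands in for the documents you already pulled from Elasticsearch.
es_hits = [{"account_number": 1, "balance": 100}]

with driver.session() as session:
    # UNWIND turns the parameter list into rows, so no intermediate file is needed.
    session.run(
        "UNWIND $rows AS row "
        "MERGE (a:Account {number: row.account_number}) "
        "SET a.balance = row.balance",
        rows=es_hits,
    )

driver.close()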

Error: "Cannot find reference to loads in json.py"

I am a newbie to Python and am learning how Python works with JSON.
After writing this code in PyCharm, I get unresolved references at several locations:
"Import resolves to its containing file".
"Cannot find reference to dumps in json.py"
"Cannot find reference to loads in json.py"
These errors appear where json is imported and where the loads() and dumps() methods are called.
This is the video from which I am learning Python:
https://www.youtube.com/watch?v=9N6a-VLBa2I&list=PL-osiE80TeTt2d9bfVyTiXJA-UTHn6WwU&index=44
Please help me resolve this.
import json

# Decoding a JSON string to Python.
# This is a Python string that happens to be valid JSON as well.
people_string = '''
{
  "people": [
    {
      "name": "Sumedha",
      "phone": "0987654312",
      "City": "Middletown"
    },
    {
      "name": "Ankit",
      "phone": "9999999999",
      "City": "Middletown2"
    },
    {
      "name": "Hemlata",
      "phone": "9865656475",
      "City": "Chandigarh"
    }
  ]
}
'''

# loads() parses the JSON string into a Python dict.
data = json.loads(people_string)

for person in data['people']:
    print(person['name'])
    del person['phone']

# dumps() serializes the (modified) data back to a JSON string.
new_string = json.dumps(data, indent=2, sort_keys=True)
You named your test script json.py, so it shadows the built-in json module: import json ends up importing your own file instead of the standard library (that is what "Import resolves to its containing file" is warning you about). Rename your script to something else (e.g. jsontest.py) and it will work.
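A quick way to confirm the shadowing, as a sketch:

import json

# If this prints a path inside your project rather than the Python
# installation, a local json.py is shadowing the standard library module.
print(json.__file__)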
For me, the error appeared when both the JSON and pandas packages were in the same Python environment.
A cheap workaround was to keep separate project files for pandas and JSON, so the packages are installed into different virtual environments.

Azure function app - output CosmosDB

I'm using Python, and there is no documentation on doing this in Python. I have Blob Storage working with Python; now I am trying to save data to Cosmos DB. What am I supposed to do in the Azure Function?
cosmosdb_data = open(os.environ['outputDocument'], 'wb')
Would really appreciate any help on this!
EDIT:
I got it to store, but it complains that the document is corrupt and the _id field is missing. Does this mean you have to set your own id?
data = {
    "timestamp": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    "image": "path/image.jpg",
    "device": subject.split(",")[1],
    "detected": "false",
    "detection_type": "null"
}
document = open(os.environ['outputCosmosDB'], 'w')
document.write('%s' % data)
document.close()
document.write('%s' % data) doesn't output valid JSON: it writes the dict's Python repr, which uses single quotes instead of double quotes. You need to make sure the output is valid JSON.
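A minimal fix, as a sketch, reusing the data dict and the outputCosmosDB binding from the question:

import json

# json.dumps produces real JSON (double quotes, proper escaping),
# unlike the repr written by '%s' % data.
document = open(os.environ['outputCosmosDB'], 'w')
document.write(json.dumps(data))
document.close()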
_id is not necessary.
Also, Python on Azure Functions v1 is not very good and I'd recommend not using it. We're actively working on a new version of Python for v2 which will work properly for this kind of thing.
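For reference, here is a hedged sketch of what a Cosmos DB output binding looks like with the newer Python worker; it assumes the azure-functions package and a function.json output binding named doc, none of which comes from the question:

import azure.functions as func

def main(req: func.HttpRequest, doc: func.Out[func.Document]) -> func.HttpResponse:
    data = {"detected": "false", "detection_type": "null"}
    # The runtime serializes the Document for you; no manual file writing needed.
    doc.set(func.Document.from_dict(data))
    return func.HttpResponse("stored")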

storing full text from txt file into mongodb

I have created a Python script that automates a workflow converting PDFs to .txt files. I want to be able to store and query these files in MongoDB. Do I need to turn the .txt files into JSON/BSON? Should I be using a library like PyMongo?
I am just not sure what the steps of such a project would be, let alone which tools would help.
I've looked at this post: How can one add text files in Mongodb?, which makes me think I need to convert the file to a JSON file, and possibly integrate GridFS?
You don't need to JSON/BSON-encode it yourself if you're using a driver. It's only with the MongoDB shell that you'd need to worry about escaping when pasting the contents.
You'd likely want to use the Python MongoDB driver:
from pymongo import MongoClient

client = MongoClient()
db = client.test_database   # use a database called "test_database"
collection = db.files       # and inside that DB, a collection called "files"

with open('test_file_name.txt') as f:  # open a file
    text = f.read()                    # read the entire contents, should be UTF-8 text

# build a document to be inserted
text_file_doc = {"file_name": "test_file_name.txt", "contents": text}
# insert the document into the "files" collection
# (insert_one is the modern replacement for the deprecated insert)
collection.insert_one(text_file_doc)
(Untested code)
If you made sure that the file names are unique, you could set the _id property of the document and retrieve it like:
text_file_doc = collection.find_one({"_id": "test_file_name.txt"})
Or, you could ensure the file_name property as shown above is indexed and do:
text_file_doc = collection.find_one({"file_name": "test_file_name.txt"})
Your other option is to use GridFS, although it's often not recommended for small files.
There's a starter here for Python and GridFS.
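For completeness, a minimal GridFS sketch (untested, assuming the same local MongoDB setup as above):

import gridfs
from pymongo import MongoClient

client = MongoClient()
db = client.test_database
fs = gridfs.GridFS(db)

# store the raw bytes of the file
with open('test_file_name.txt', 'rb') as f:
    file_id = fs.put(f, filename='test_file_name.txt')

# read it back
text = fs.get(file_id).read().decode('utf-8')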
Yes, you must convert your file to JSON. There is a trivial way to do that: use something like {"text": "your text"}. It's easy to extend and update such records later.
Of course you'd need to escape the " occurrences in your text. I assume you'd use a JSON library and/or the MongoDB library of your favorite language to do all the formatting.
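To illustrate the escaping point, a short sketch with Python's built-in json module, which handles the quotes for you:

import json

text = 'He said "hello" and left.'
print(json.dumps({"text": text}))
# {"text": "He said \"hello\" and left."}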

bsddb and reprepro (berkeley) database

I'm trying to read the database files created by reprepro. I don't have that much experience with Berkeley DB, so I might be confused here, but it looks like the database is layered in some way.
If I simply do btopen('path/to/packages.db', 'r'), I get the database object with contents like:
In [4]: packages.items()
Out[4]:
[('local-lenny|main|amd64', '\x00\x00\x00\x04'),
('local-lenny|main|i386', '\x00\x00\x00\x02'),
('local-lenny|main|powerpc', '\x00\x00\x00\x14'),
('local-lenny|main|source', '\x00\x00\x00\x06'),
('local-lenny|main|sparc', '\x00\x00\x00\x12')]
However the db4.6_dump shows:
VERSION=3
format=bytevalue
database=local-lenny|main|sparc
type=btree
db_pagesize=4096
HEADER=END
<loads of data>
The file itself is identified as: /var/packages/db/packages.db: Berkeley DB (Btree, version 9, native byte-order) by file.
How do I get to that content? If I understand correctly, keys() gives me only the names of the actual sub-databases. How do I get to the contents of those DBs now?
And the answer seems to be that the "nice" version of the bsddb interface doesn't support multiple B-tree tables inside one file. You can open such a table explicitly via bsddb.db, using:

from bsddb import db  # the lower-level interface, next to bsddb.btopen

env = db.DBEnv()
env.open(None, db.DB_CREATE | db.DB_INIT_MPOOL)

internal_db = db.DB(env)
internal_db.open("the filename", "the internal db name", db.DB_BTREE, db.DB_RDONLY)
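Once the sub-database is open, you can walk its records with a cursor. A sketch building on the internal_db handle above (my understanding is that the cursor calls return None once the records are exhausted):

# iterate over all key/value pairs in the sub-database
cursor = internal_db.cursor()
record = cursor.first()
while record is not None:
    key, value = record
    print(repr(key), repr(value))
    record = cursor.next()

internal_db.close()
env.close()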

How can I parse JSON in Google App Engine?

I'd like to parse a JSON string into an object under Google App Engine (Python). What do you recommend? Something to encode/stringify would be nice too. Is what you recommend built in, or a library that I'd have to include in my app? Is it secure? Thanks.
Consider using Django's json lib, which is included with GAE.
from django.utils import simplejson as json
# load the object from a string
obj = json.loads( string )
The link above has examples of Django's serializer, and here's the link for simplejson's documentation.
If you're looking at storing Python class instances or objects (as opposed to compositions of lists, strings, numbers, and dictionaries), you probably want to look at pickle.
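For example, a tiny pickle sketch (the User class is just an illustration):

import pickle

class User(object):
    def __init__(self, name):
        self.name = name

# pickle round-trips arbitrary Python objects, which JSON cannot do
data = pickle.dumps(User('alice'))
restored = pickle.loads(data)
print(restored.name)  # alice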
Incidentally, to get Django 1.0 (instead of Django 0.96) running on GAE, you can use the following call in your main.py, per this article:
from google.appengine.dist import use_library
use_library('django', '1.0')
Edit: Native JSON support in Google App Engine 1.6.0 with Python 2.7
As of Google App Engine 1.6.0, you can use the Python 2.7 runtime by adding runtime: python27 in app.yaml, and then you can import the native JSON library with import json.
Google App Engine now supports Python 2.7. If you're using Python 2.7, you can do the following:
import json
structured_dictionary = json.loads(string_received)
Include the simplejson library with your app?
This is an old question, but I thought I'd give an updated, more detailed answer. For those landing here now, you are almost certainly using Python 2.6 or greater, so you can use the built-in json module for Python 2 (or for Python 3, since Google recently added support for Python 3 on GAE). Importing is as easy as import json. Here are some examples of how to use the json module:
import json

# parse json_string into a dict
json_string = '{"key_one": "value_one", "key_two": 1234}'
json_dict = json.loads(json_string)
# json_dict: {u'key_two': 1234, u'key_one': u'value_one'}

# generate json from a dict
json_dict = {'key': 'value', 'key_two': 1234, 'key_three': True}
json_string = json.dumps(json_dict)
# json_string: '{"key_two": 1234, "key": "value", "key_three": true}'
If you are using an older version of Python, stick to @Brian M. Hunt's answer.
Again, here is the doc page for the json module for Python 2, and here it is for Python 3.
If you're using Python 2.6 or greater, I've used the built-in json module with success. Otherwise, simplejson works on 2.4 without dependencies.
Look at the Python section of json.org. Standard library support for JSON arrived in Python 2.6, which I believe is newer than what App Engine provides. Maybe one of the other options listed there?
