pymongo converts dotted field names into a nested dict - python

I am inserting data into a MongoDB collection through PyMongo. I have logged all the information and data being sent to the update_one statement.
Data logged just before the update_one statement:
data = {'a': 'h9421976fc124d5756497d3b', 'b': 1611046532.4558306, 'kw_trigger_thing_name': 'ThingName.a', 'ThingName.a_capability_temperature': 44, 'ThingName.a_capability_humidity': '288', 'ThingName.a_kw_thing_name': 'ThingName.a'}
But when it was inserted into "test", it ended up nested like this:
inserted_data = { "_id" : ObjectId("6005d317525e0d67866c564f"), "a" : "h9421976fc124d5756497d3b", "b" : 1611046532.4558306, "ThingName" : { "a_capability_humidity" : "288", "a_capability_temperature" : 44, "a_kw_thing_name" : "ThingName.a" } }
Using this to update the document:
collection.update_one({"a": "h9421976fc124d5756497d3b"}, {"$set": data}, upsert=True)
So here you can see that the keys sharing the prefix ThingName. were converted into a nested document in the collection, with ThingName as the key.
Why is this happening, and how can we override it?

That's perfectly valid: in an update with $set, a dotted field name is treated as a path, so updating with thing.a: x stores the embedded document thing: { a: x }.
Field name restrictions
Though MongoDB supports dots in field names in recent versions, the drivers' update operators still treat them as path separators, hence the conversion still happens.
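If you need to keep the dotted names as literal flat keys, one common workaround is to sanitize the keys before the update. A minimal sketch, where the replacement character is an arbitrary choice rather than anything pymongo mandates:

# Replace the dots so "$set" cannot interpret the keys as nested paths.
safe_data = {k.replace('.', '_'): v for k, v in data.items()}
collection.update_one(
    {"a": "h9421976fc124d5756497d3b"},
    {"$set": safe_data},
    upsert=True,
)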

Related

Is there a way in Python to write JSON to Redis record by record?

I have a JSON file with the following content:
{
    "transactions": {
        "tr1": {
            "type": "deposit",
            "account_id": 123456789012345,
            "amount": 20000.0
        },
        "tr2": {
            "type": "deposit",
            "account_id": 555456789012345,
            "amount": 20000.0
        },
        "tr3": {
            "type": "payment",
            "account_id": 123456789012345,
            "amount": 20000.0
        },
        "tr4": {
            "type": "transfer",
            "from": 555456789012345,
            "to": 123456789012345,
            "amount": 20000.0
        }
    }
}
I need to write the information of each transaction to Redis using a Python program. Is there a way to do this?
I have tried this, but it inserts everything under one key, 'traninfo':
data = json.load(f)
for i in data['transactions']:
    r.execute_command('JSON.SET', 'traninfo', '.', json.dumps(data))
The problem with your snippet is that on each iteration you write the whole JSON as the value of the same key. Since the key never changes, you write the same value to Redis four times.
From your JSON, I guess you want one key per transaction in Redis: "tr1", "tr2", "tr3", "tr4". If you do:
for i in data['transactions']:
    print(i)
it will print:
tr1
tr2
tr3
tr4
You can modify your existing code like this:
data = json.load(f)
for i in data['transactions']:
    r.execute_command('JSON.SET', i, '.', json.dumps(data['transactions'][i]))
This will do what you want. But there is a better way: the items method lets you iterate over keys and values at the same time:
data = json.load(f)
for key, value in data['transactions'].items():
    r.execute_command('JSON.SET', key, '.', json.dumps(value))
I'll leave it to you to apply @Guy Korland's improvement regarding the Redis API.
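That improvement presumably refers to redis-py's higher-level JSON interface. A minimal sketch, assuming redis-py 4+ talking to a Redis server with the RedisJSON module loaded:

import json
import redis

r = redis.Redis()  # assumes a local Redis with RedisJSON available

with open('transactions.json') as f:  # hypothetical file name
    data = json.load(f)

for key, value in data['transactions'].items():
    # r.json().set accepts the Python object directly, no json.dumps needed
    r.json().set(key, '$', value)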

Updating a pre-existing field's datatype (string => date) in a MongoDB collection

I am trying to update a field that was initially captured as a string instead of a date type.
The query that inserts into the collection has since been modified, so that future inserts to that field use the date data type.
However, I am trying to update the previously inserted data, from before the query modification, that still has the string data type.
Here is what I tried, but it gives an error:
db.collection.update_one({"clusterTime": {"$type": "string"}}, {"$set": {"clusterTime": datetime.datetime.strptime('$clusterTime', '%y-%m-%d').date()}})
I really would appreciate contributions.
Thank you.
Loop over the records using forEach in the mongo shell, convert the field to a date, and save each document:
db.collectionName.find({ 'clusterTime' : { '$type' : 'string' }}).forEach(function (doc) {
    doc.clusterTime = new Date(doc.clusterTime); // convert the field to a date
    db.collectionName.save(doc);
});
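A rough PyMongo equivalent of that shell loop might look like this sketch (it assumes the strings follow a %Y-%m-%d format, which you should adjust to your data):

from datetime import datetime

for doc in db.collection.find({"clusterTime": {"$type": "string"}}):
    converted = datetime.strptime(doc["clusterTime"], "%Y-%m-%d")  # assumed format
    db.collection.update_one({"_id": doc["_id"]}, {"$set": {"clusterTime": converted}})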
Alternatively, the Python version for updating the datatype uses an update with an aggregation pipeline (supported on MongoDB 4.2+):
# update_many replaces the deprecated update()
db.collection.update_many(
    {
        "clusterTime": {
            "$type": "string"
        }
    },
    [
        {
            "$set": {
                "clusterTime": {
                    "$dateFromString": {
                        "dateString": "$clusterTime",
                        "format": "%Y-%m-%d"
                    }
                }
            }
        }
    ]
)

Can I sort by a child field in Firebase Firestore with the Python SDK?

I have a Firebase Firestore database with data structured as below, but I am having issues sorting by a child field.
{
    'agreement' : 'ALR87HJLKJJ78954',
    'agreementDetails' : {
        'name' : 'johnny5isalive',
        'country' : 'Spain'
    }
}
My query looks like the following:
query = db.collection('rentalAgreements').where('agreement', '==', agreement_number).order_by('/agreementDetails/name', direction=firestore.Query.ASCENDING).limit(10)
results = query.get()
I get the following error:
ValueError: Path /agreementDetails/name not consumed, residue:
/agreementDetails/name
I thought I was chancing it a bit, and through Google I found a number of references to order_by_child, which I also tried, but it came back with an error:
AttributeError: 'Query' object has no attribute 'order_by_child'
Is this possible?
When referencing an object property, you will need to use dot notation in the field path:
query = (
    db.collection('rentalAgreements')
    .where('agreement', '==', agreement_number)
    .order_by('agreementDetails.name', direction=firestore.Query.ASCENDING)
    .limit(10)
)
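For completeness, a minimal sketch of consuming the results (note that combining where on one field with order_by on another generally requires a composite index, which Firestore will suggest creating if it is missing):

results = query.get()
for doc in results:
    data = doc.to_dict()
    print(doc.id, data['agreementDetails']['name'])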

How does the Python JSON library deal with time?

So I'm currently learning MongoDB, and I'm using PyMongo rather than the MongoDB shell.
When I started trying the basic CRUD operations, I found it hard to load the bios data using PyMongo, since the original data posted on the website uses a strange ISODate for time.
The standard Python JSON library does not seem to support this, and mongoimport does not seem to either (not sure). But I found that after modifying it into {"$date": "2017-04-01T05:00:00Z"}, mongoimport worked.
Right now I'm using subprocess to call an external command to import the data. So my question is: how can I correctly read the JSON data in Python and insert it using PyMongo?
Details
The bios data in the MongoDB documentation looks like this:
{
    "_id" : 1,
    "name" : {
        "first" : "John",
        "last" : "Backus"
    },
    "birth" : ISODate("1924-12-03T05:00:00Z"),
    "death" : ISODate("2007-03-17T04:00:00Z"),
    "contribs" : [
        "Fortran",
        "ALGOL",
        "Backus-Naur Form",
        "FP"
    ],
    "awards" : [
        {
            "award" : "W.W. McDowell Award",
            "year" : 1967,
            "by" : "IEEE Computer Society"
        },
        {
            "award" : "National Medal of Science",
            "year" : 1975,
            "by" : "National Science Foundation"
        },
        {
            "award" : "Turing Award",
            "year" : 1977,
            "by" : "ACM"
        },
        {
            "award" : "Draper Prize",
            "year" : 1993,
            "by" : "National Academy of Engineering"
        }
    ]
}
And when I try to parse it with Python's JSON library, I get a json.decoder.JSONDecodeError because of the line "birth" : ISODate("1924-12-03T05:00:00Z"),. mongoimport cannot parse this for the same reason.
When I modified
"birth" : ISODate("1924-12-03T05:00:00Z") into
"birth" : {"$date": "2017-04-01T05:00:00Z"}
mongoimport worked, but Python still wasn't able to parse it.
What I am asking for here is a way to deal with this problem within Python and PyMongo rather than calling an external command.
The example that you're looking at was probably intended to be used within the mongo shell, where the ISODate BSON type can be parsed as shown.
Outside of that, we have the challenge that JSON does not have a date datatype, nor does it have a standard way of representing dates. To deal with this challenge, MongoDB created something called Extended JSON, which can encode dates in JSON similar to how you have shown with $date.
In order to work with Extended JSON in Python / PyMongo, you could use json_util.
Here's a brief example:
from bson.json_util import loads
from pymongo import MongoClient

json = '''
{
    "_id" : 1,
    "name" : {
        "first" : "John",
        "last" : "Backus"
    },
    "birth" : {"$date": "2017-04-01T05:00:00.000Z"},
    "death" : {"$date": "2017-04-01T05:00:00.000Z"}
}
'''

bson = loads(json)  # json_util parses Extended JSON, turning $date into datetime
print(str(bson))

db = MongoClient().test
collection = db.bios
collection.insert_one(bson)  # insert_one replaces the deprecated insert()
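Going the other way, bson.json_util also provides dumps, which serializes documents containing datetimes back into Extended JSON, for example:

from bson.json_util import dumps

doc = collection.find_one({"_id": 1})
print(dumps(doc))  # datetime values are rendered as {"$date": ...}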

pyarango driver for arangoDB: validation

I am using the pyArango driver (https://github.com/tariqdaouda/pyArango) for ArangoDB, but I cannot understand how field validation works. I have set the fields of a collection as in the GitHub example:
import pyArango.Collection as COL
import pyArango.Validator as VAL
from pyArango.theExceptions import ValidationError
import types

class String_val(VAL.Validator):
    def validate(self, value):
        if type(value) is not types.StringType:  # Python 2 check; use isinstance(value, str) on Python 3
            raise ValidationError("Field value must be a string")
        return True

class Humans(COL.Collection):
    _validation = {
        'on_save': True,
        'on_set': True,
        'allow_foreign_fields': True  # allow fields that are not part of the schema
    }
    _fields = {
        'name': COL.Field(validators=[VAL.NotNull(), String_val()]),
        'anything': COL.Field(),
        'species': COL.Field(validators=[VAL.NotNull(), VAL.Length(5, 15), String_val()])
    }
So I was expecting that when I try to add a document to the "Humans" collection, if the 'name' field is not a string, an error would be raised. But it didn't seem to work that easily.
This is how I add documents to the collection:
myjson = json.loads(open('file.json').read())
collection_name = "Humans"
bindVars = {"doc": myjson, '@collection': collection_name}
aql = "FOR d IN @doc INSERT d INTO @@collection LET newDoc = NEW RETURN newDoc"
queryResult = db.AQLQuery(aql, bindVars=bindVars, batchSize=100)
So if 'name' is not a string, I actually don't get any error and the document is uploaded into the collection.
Does someone know how I can check whether a document contains proper fields for that collection, using the built-in validation of pyArango?
I don't see anything wrong with your validator; it's just that if you're using AQL queries to insert your documents, pyArango has no way of knowing the contents prior to insertion.
Validators only work on pyArango documents if you do:
humans = db["Humans"]
doc = humans.createDocument()
doc["name"] = 101
That should trigger the exception because you've defined:
'on_set': True
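For illustration, a small sketch of observing that rejection, reusing the ValidationError import from the question (and assuming db is a connected pyArango database where the Humans collection exists):

from pyArango.theExceptions import ValidationError

humans = db["Humans"]
doc = humans.createDocument()
try:
    doc["name"] = 101  # not a string, so the on_set validator should reject it
except ValidationError as e:
    print("rejected:", e)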
ArangoDB as a document store doesn't itself enforce schemas, and neither do the drivers.
If you need schema validation, it can be done on top of the driver or inside ArangoDB using a Foxx service (via the joi validation library).
One possible solution is to use JSON Schema, with its Python implementation, on top of the driver in your application:
from jsonschema import validate

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "species": {"type": "string"},
    },
}
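A possible usage sketch with that schema (the sample document here is made up to mirror the Humans collection above):

from jsonschema import validate, ValidationError

doc = {"name": 101, "species": "human"}  # hypothetical document; name is intentionally not a string
try:
    validate(instance=doc, schema=schema)
except ValidationError as e:
    print("Invalid document:", e.message)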
Another real life example using JSON Schema is swagger.io, which is also used to document the ArangoDB REST API and ArangoDB Foxx services.
I don't know yet what was wrong with the code I posted, but it now seems to work. However, I had to convert unicode to utf-8 when reading the JSON file, otherwise it was not able to identify strings. I know ArangoDB itself does not enforce schemas, but I am using pyArango, which has built-in validation.
For those interested in the built-in validation of ArangoDB using Python, visit the pyArango GitHub page.
