SyntaxError with pymongo and $ne : null - python

This is my code:
#! /usr/bin/python
import os
from pymongo.connection import Connection
from pymongo.master_slave_connection import MasterSlaveConnection
database = 'toto'
collection = 'logs'
master = Connection(host="X.X.X.X", port=27017)
slave1 = Connection(host="X.X.X.X", port=27017)
con = MasterSlaveConnection(master, slaves=[slave1, master])
db = getattr(con,database)
#host_name.append("getattr(db,collection).distinct( 'host_name' )")
#print host_name[1]
hosts = db.logs.distinct( 'host_name' )
services = db.logs.distinct("service_description" , { "service_description" : { $ne : null } } )
#print hosts
print services
I get this error:
File "./rapport.py", line 23
services = db.logs.distinct("service_description" , { "service_description" : { $ne : null } } )
^
SyntaxError: invalid syntax
Why can't I use "$ne : null" in my code? I don't understand, because when I execute the query "db.logs.distinct("service_description" , { "service_description" : { $ne : null } } )" directly in the mongo shell, it works.
I also tried this, but it doesn't work either:
services = db.logs.distinct("service_description", { "service_description" : { "$ne" : None } } )
Thanks for your help.

You need to quote $ne and use None instead of null.
Pymongo takes Python dicts as query parameters.
asdf = "something"
{ asdf: "foo"}
is a valid declaration that uses the value of asdf, "something", as the key.
If you compare that with
{$ne: "foo"}
the interpreter expects an expression such as a variable name or a string literal as the key, and $ne is not a valid Python identifier, hence the SyntaxError.
Also, null is not defined in Python, so use None instead.
Combined with the fluent interface in pymongo, your query should be:
db.logs.find({"service_description": {"$ne" : None}}).distinct('service_description')
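To make the point about dict keys concrete, here is a minimal sketch. It assumes pymongo 3 or later, where Collection.distinct accepts a filter as its second argument; the helper name distinct_services is hypothetical and is not called against a real server here.

```python
import json

# The filter is an ordinary Python dict: the operator is a string key,
# and None is Python's counterpart of null.
query_filter = {"service_description": {"$ne": None}}

# Serialising to JSON shows the shape the dict takes in shell-like syntax.
shell_form = json.dumps(query_filter)
print(shell_form)  # {"service_description": {"$ne": null}}


def distinct_services(collection):
    """Return the distinct non-null service descriptions.

    `collection` is assumed to be a pymongo Collection (hypothetical here).
    """
    return collection.distinct("service_description", query_filter)
```

With a live connection, this would be called as distinct_services(db.logs).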

Related

boto3, python: appending value to DynamoDB String Set

I have an object in DynamoDB:
{ 'UserID' : 'Hank', 'ConnectionList' : {'con1', 'con2'} }
By using boto3 in lambda functions, I would like to add 'con3' to the String Set.
So far, I have been trying with the following code without success:
ddbClient = boto3.resource('dynamodb')
table = ddbClient.Table("UserInfo")
table.update_item(
Key={
"UserId" : 'Hank'
},
UpdateExpression =
"SET ConnectionList = list_append(ConnectionList, :i)",
ExpressionAttributeValues = {
":i": { "S": "Something" }
},
ReturnValues="ALL_NEW"
)
However, no matter how I try to put the information into the String Set, it always raises an error.
list_append only works on List attributes; to add elements to a set, use the ADD action. And since you're using the resource API, the value has to be a Python set:
table.update_item(
Key={
"UserId" : 'Hank'
},
UpdateExpression =
"ADD ConnectionList :i",
ExpressionAttributeValues = {
":i": {"Something"}, # needs to be a set type
},
ReturnValues="ALL_NEW"
)
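As a small sketch of the same call, the arguments can be built separately so the set-typed value is easy to inspect. The helper name build_add_to_set_kwargs is hypothetical; the attribute and key names are taken from the answer above.

```python
def build_add_to_set_kwargs(user_id, new_connection):
    """Build keyword arguments for Table.update_item (hypothetical helper).

    The value bound to :i is a Python set, which the boto3 resource API
    serialises as a DynamoDB set type.
    """
    return {
        "Key": {"UserId": user_id},
        "UpdateExpression": "ADD ConnectionList :i",
        "ExpressionAttributeValues": {":i": {new_connection}},
        "ReturnValues": "ALL_NEW",
    }


kwargs = build_add_to_set_kwargs("Hank", "con3")
# With a boto3 Table object this would be passed as:
#     table.update_item(**kwargs)
```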

Improve the performance of a MongoDB query which uses a "$where" expression

I need to run the following query on a MongoDB server:
QUERY = {
"$and" : [
{"x" : {'$gt' : 1.0}},
{"y" : {'$gt' : 0.1}},
{"$where" : 'this.s1.length < this.s2.length+3'}
]
}
This query is very slow, due to the JavaScript expression which the server needs to execute on every document in the collection.
Is there any way for me to optimize it?
I thought about using the $size operator, but I'm not really sure that it works on strings, and I'm even less sure on how to compare its output on a pair of strings (as is the case here).
Here is the rest of my script, in case needed:
from pymongo import MongoClient
USERNAME = ...
PASSWORD = ...
SERVER_NAME = ...
DATABASE_NAME = ...
COLLECTION_NAME = ...
uri = 'mongodb://{}:{}@{}/{}'.format(USERNAME,PASSWORD,SERVER_NAME,DATABASE_NAME)
mongoClient = MongoClient(uri)
collection = mongoClient[DATABASE_NAME][COLLECTION_NAME]
cursor = collection.find(QUERY)
print cursor.count()
The pymongo version is 3.4.
You can use the aggregation framework, which provides $strLenCP to get the length of a string and $cmp to compare two values ($cmp returns -1 when the first value is smaller than the second):
db.collection.aggregate(
[
{
$match: {
"x" : {'$gt' : 1.0},
"y" : {'$gt' : 0.1}
}
},
{
$addFields: {
str_cmp: { $cmp: [ { $strLenCP: "$s1" }, { $add: [ { $strLenCP: "$s2" }, 3 ] } ] }
}
},
{
$match: {
"str_cmp": -1,
}
}
]
)
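The shell pipeline above could be written for pymongo roughly as follows. This is a sketch, not a tested query against real data: every operator has to be a quoted string key, and count_matches is a hypothetical helper that assumes `collection` is a pymongo Collection.

```python
# Same three stages as the shell pipeline, as plain Python dicts.
pipeline = [
    {"$match": {"x": {"$gt": 1.0}, "y": {"$gt": 0.1}}},
    {
        "$addFields": {
            "str_cmp": {
                "$cmp": [
                    {"$strLenCP": "$s1"},
                    {"$add": [{"$strLenCP": "$s2"}, 3]},
                ]
            }
        }
    },
    # $cmp yields -1 when len(s1) < len(s2) + 3.
    {"$match": {"str_cmp": -1}},
]


def count_matches(collection):
    """Count the documents produced by the pipeline (hypothetical helper)."""
    return sum(1 for _ in collection.aggregate(pipeline))
```

Putting the cheap $match on x and y first means the string lengths are only computed for documents that already pass those filters.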

Pymongo $in + aggregate with regex not working in pymongo

I am running the following query in the Mongo shell:
db.coll.aggregate([ { "$match" : { "_id":{"$in" : [/^4_.*/,/^3_.*/]}}},
{ "$unwind" : "$rp"},
{"$group":{"_id": "$_id", "rp": { "$push": "$rp" }}} , {"$limit":120}],{allowDiskUse:true})
This works correctly. But when I try the same in pymongo:
ids_list = [3,4]
ids_list = ["^" + str(c_id) + "_.*" for c_id in ids_list]
pipe = [ { "$match" : { "_id":{"$in" : ids_list}}},
{ "$unwind" : "$rp"},
{"$group":{"_id": "$_id", "rp": { "$push": "$rp" }}} , {"$limit":500}]
res = list(db.coll.aggregate(pipeline = pipe,allowDiskUse=True))
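This does not work. I am new to Mongo queries.
I changed the list comprehension so that each element is compiled with the re module (plain strings are matched literally by $in, whereas pymongo sends compiled patterns as regular expressions), i.e.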
ids_list = [re.compile("^" + str(c_id) + "_.*") for c_id in ids_list]
and it worked :)
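A self-contained sketch of the fix: the patterns are compiled with re before being placed in the $in list. The sample_ids list is made up for illustration and only demonstrates locally which _id strings the patterns would accept.

```python
import re

ids = [3, 4]
# Compiled patterns, not plain strings, so they are treated as regexes.
patterns = [re.compile("^" + str(c_id) + "_.*") for c_id in ids]

pipe = [
    {"$match": {"_id": {"$in": patterns}}},
    {"$unwind": "$rp"},
    {"$group": {"_id": "$_id", "rp": {"$push": "$rp"}}},
    {"$limit": 500},
]

# Hypothetical _id values, checked locally against the same patterns.
sample_ids = ["3_a", "4_b", "13_c", "5_d"]
matched = [s for s in sample_ids if any(p.search(s) for p in patterns)]
```

With a live connection, the pipeline would be run as db.coll.aggregate(pipe, allowDiskUse=True).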

Pymongo find by _id in subdocuments

Assume that this is one item of my database:
{"_id" : ObjectID("526fdde0ef501a7b0a51270e"),
"info": "foo",
"status": true,
"subitems" : [ {"subitem_id" : ObjectID("65sfdde0ef501a7b0a51e270"),
//more},
{....}
],
//more
}
I want to find (or find_one, doesn't matter) the document(s) with "subitems.subitem_id" : xxx.
I have tried the following. All of them return an empty list.
from pymongo import MongoClient,errors
from bson.objectid import ObjectId
id = '65sfdde0ef501a7b0a51e270'
db.col.find({"subitems.subitem_id" : id } ) #obviously wrong
db.col.find({"subitems.subitem_id" : Objectid(id) })
db.col.find({"subitems.subitem_id" : {"$oid":id} })
db.col.find({"subitems.subitem_id.$oid" : id })
db.col.find({"subitems.$.subitem_id" : Objectid(id) })
In the mongo shell, however, this one works:
find({"subitems.subitem_id" : { "$oid" : "65sfdde0ef501a7b0a51e270" } })
The literal 65sfdde0ef501a7b0a51e270 is not hexadecimal (it contains an "s") and hence is not a valid ObjectId.
Python names are also case-sensitive: the class is ObjectId, so Objectid(id) raises a NameError.
Also, id is a Python built-in function. Avoid rebinding it.
Finally, you execute a find but never iterate over the result, so you do not see any documents. Remember that pymongo cursors are lazy.
Try this:
from pymongo import MongoClient
from bson.objectid import ObjectId
db = MongoClient().database
oid = '65cfdde0ef501a7b0a51e270'
x = db.col.find({"subitems.subitem_id" : ObjectId(oid)})
print list(x)
Notice that I adjusted oid to a valid hexadecimal string.
The same query in the Mongo JavaScript shell:
db.col.find({"subitems.subitem_id" : new ObjectId("65cfdde0ef501a7b0a51e270")})
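To check a candidate string before wrapping it in ObjectId, a plain-Python test for "24 hexadecimal characters" is enough for illustration. The helper name looks_like_object_id is hypothetical; the bson package offers ObjectId.is_valid for the same purpose.

```python
import string


def looks_like_object_id(text):
    """Return True if `text` is a 24-character hexadecimal string."""
    return len(text) == 24 and all(ch in string.hexdigits for ch in text)


original = "65sfdde0ef501a7b0a51e270"   # contains an 's', so not hexadecimal
adjusted = "65cfdde0ef501a7b0a51e270"   # the corrected value from the answer

original_ok = looks_like_object_id(original)
adjusted_ok = looks_like_object_id(adjusted)
```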
Double-checked: the right answer is db.col.find({"subitems.subitem_id" : ObjectId(id)})
Be aware that this query returns the full document, not just the matching part of the sub-array.
Mongo shell:
a = ObjectId("5273e7d989800e7f4959526a")
db.m.insert({"subitems": [{"subitem_id":a},
{"subitem_id":ObjectId()}]})
db.m.insert({"subitems": [{"subitem_id":ObjectId()},
{"subitem_id":ObjectId()}]})
db.m.find({"subitems.subitem_id" : a })
>>> { "_id" : ObjectId("5273e8e189800e7f4959526d"),
"subitems" :
[
{"subitem_id" : ObjectId("5273e7d989800e7f4959526a") },
{"subitem_id" : ObjectId("5273e8e189800e7f4959526c")}
]}

MongoDB Python syntax to get array value

so please be gentle
I have a Mongo document like this:
{ "Institute" : "Ucambridge",
"Project" : [ #array of projects
{"Sample":[ #array of samples
{ "workflow" : "abc", "owner" : "peter" }
],
"pname":"project1",
"dir" : "C drive"
}
]
}
I am aware that having nested loops in Mongo isn't a great idea; however, this is the way the data is being handed to me.
I am trying to loop over all my projects and extract the project name on my Python server.
So I get a cursor:
u = mongo.db.testpymongo.find( )
I can get Institute with:
for x in u:
    print x["Institute"]
I can get Project with:
for x in u:
    print x["Project"]
which returns:
[{u'Sample': [{u'workflow': u'wf', u'owner': u'peter'}], u'pname': u'project1 ', u'dir': u'C drive'}]
But how do I get access to just my pname value from the cursor?
I have tried:
1. print x["Project:pname"] # does not work
2. print x["Project":"pname"] # gives unhashable type error
3. print x["pname"] # gives KeyError
4. print x["Project"].["pname"] # gives syntax error
5. print x["Project.pname"] # gives KeyError
Should I be using attributes in the find() function to return only part of the document?
I.e., like so?
d = mongo.db.testpymongo.find( {"Institute":"UCambridge", "Project.pname": "project 1" } )
Thank you !
You would need to use $elemMatch (quoted as "$elemMatch" when called from pymongo):
http://docs.mongodb.org/manual/reference/projection/elemMatch/
db.testpymongo.find( { "Project": { $elemMatch: { "pname": "project1" } } } )
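On the Python side, x["Project"] is a list of dicts, so pname is reached by iterating over that list rather than by a dotted key. A minimal sketch with a sample document shaped like the one in the question follows; the helper name project_names is hypothetical.

```python
# Sample document with the same shape as in the question.
doc = {
    "Institute": "Ucambridge",
    "Project": [
        {
            "Sample": [{"workflow": "abc", "owner": "peter"}],
            "pname": "project1",
            "dir": "C drive",
        }
    ],
}


def project_names(document):
    """Collect pname from every entry of the Project array."""
    return [project["pname"] for project in document.get("Project", [])]


names = project_names(doc)
print(names)  # ['project1']

# The $elemMatch filter from the answer, written as a Python dict.
elem_match_filter = {"Project": {"$elemMatch": {"pname": "project1"}}}
```

With a cursor from find(), the same helper would be applied to each returned document in the for loop.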
