I am running the following query in the Mongo shell:
db.coll.aggregate([ { "$match" : { "_id":{"$in" : [/^4_.*/,/^3_.*/]}}},
{ "$unwind" : "$rp"},
{"$group":{"_id": "$_id", "rp": { "$push": "$rp" }}} , {"$limit":120}],{allowDiskUse:true})
This works correctly. But when I try the same thing in pymongo:
ids_list = [3,4]
ids_list = ["^" + str(c_id) + "_.*" for c_id in ids_list]
pipe = [ { "$match" : { "_id":{"$in" : ids_list}}},
{ "$unwind" : "$rp"},
{"$group":{"_id": "$_id", "rp": { "$push": "$rp" }}} , {"$limit":500}]
res = list(db.coll.aggregate(pipeline = pipe,allowDiskUse=True))
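it does not work. I am new to Mongo queries.
In the shell, /^4_.*/ is a regex literal, whereas a plain Python string inside $in is compared literally. I changed the list comprehension so that each element is compiled with the re module, i.e.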
ids_list = [re.compile("^" + str(c_id) + "_.*") for c_id in ids_list]
and it worked :)
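To make the fix above concrete, here is a minimal sketch of building the same pipeline with compiled patterns. The helper name build_pipeline is hypothetical, and the trailing ".*" is dropped because an anchored prefix match does not need it; pymongo encodes Python pattern objects as BSON regular expressions.

```python
import re


def build_pipeline(prefixes, limit=500):
    """Build the aggregation pipeline for _id values that start with '<prefix>_'.

    build_pipeline is a hypothetical helper name used only for illustration.
    """
    # Compiled patterns are sent as regular expressions;
    # plain strings in $in would be compared literally.
    patterns = [re.compile("^" + re.escape(str(p)) + "_") for p in prefixes]
    return [
        {"$match": {"_id": {"$in": patterns}}},
        {"$unwind": "$rp"},
        {"$group": {"_id": "$_id", "rp": {"$push": "$rp"}}},
        {"$limit": limit},
    ]


pipe = build_pipeline([3, 4])
# With a live connection (db is assumed to be a pymongo Database object):
# res = list(db.coll.aggregate(pipe, allowDiskUse=True))
```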
I need to run the following query on a MongoDB server:
QUERY = {
"$and" : [
{"x" : {'$gt' : 1.0}},
{"y" : {'$gt' : 0.1}},
{"$where" : 'this.s1.length < this.s2.length+3'}
]
}
This query is very slow, due to the JavaScript expression which the server needs to execute on every document in the collection.
Is there any way for me to optimize it?
I thought about using the $size operator, but I'm not really sure that it works on strings, and I'm even less sure on how to compare its output on a pair of strings (as is the case here).
Here is the rest of my script, in case needed:
from pymongo import MongoClient
USERNAME = ...
PASSWORD = ...
SERVER_NAME = ...
DATABASE_NAME = ...
COLLECTION_NAME = ...
uri = 'mongodb://{}:{}@{}/{}'.format(USERNAME,PASSWORD,SERVER_NAME,DATABASE_NAME)
mongoClient = MongoClient(uri)
collection = mongoClient[DATABASE_NAME][COLLECTION_NAME]
cursor = collection.find(QUERY)
print cursor.count()
The pymongo version is 3.4.
You can use the aggregation framework, which provides $strLenCP to get the length of a string and $cmp to compare two values ($cmp returns -1 when the first value is smaller):
db.collection.aggregate(
[
{
$match: {
"x" : {'$gt' : 1.0},
"y" : {'$gt' : 0.1}
}
},
{
$addFields: {
str_cmp: { $cmp: [ { $strLenCP: "$s1" }, { $add: [ { $strLenCP: "$s2" }, 3 ] } ] }
}
},
{
$match: {
"str_cmp": -1,
}
}
]
)
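For the pymongo script in the question, the shell pipeline above could be written roughly as follows. This is a sketch: the helper name build_length_pipeline and its parameters are hypothetical, and it assumes a server version that supports $addFields and $strLenCP (MongoDB 3.4 or later).

```python
def build_length_pipeline(min_x=1.0, min_y=0.1, slack=3):
    """Pipeline keeping documents where len(s1) < len(s2) + slack.

    build_length_pipeline is a hypothetical helper name for illustration.
    """
    return [
        # Plain field comparisons go first so the server can filter early.
        {"$match": {"x": {"$gt": min_x}, "y": {"$gt": min_y}}},
        # $cmp yields -1 when its first operand is the smaller one.
        {"$addFields": {
            "str_cmp": {
                "$cmp": [
                    {"$strLenCP": "$s1"},
                    {"$add": [{"$strLenCP": "$s2"}, slack]},
                ]
            }
        }},
        {"$match": {"str_cmp": -1}},
    ]


pipeline = build_length_pipeline()
# With the collection object from the question's script:
# cursor = collection.aggregate(pipeline)
```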
I'm unwinding a field that is an array of date objects; in some cases the array is empty, which is fine. I'd like to keep using a pipeline, but in some cases I want to select only the results that have an empty array.
pipeline = []
pipeline.append({"$unwind": "$date_object"})
pipeline.append({"$sort": {"date_object" : 1}})
I want to use the pipeline format, however the following code does not return any records:
pipeline.append({"$match": {"date_object": {'$exists': False }}})
nor does the following work:
pipeline.append({"$match": {"date_object": []}})
and then:
results = mongo.db.xxxx.aggregate(pipeline)
I'm also trying:
pipeline.append({ "$cond" : [ { "$eq" : [ "$date_object", [] ] }, [ { '$value' : 0 } ], '$date_object' ] } )
But with this I get the following error:
.$cmd failed: exception: Unrecognized pipeline stage name: '$cond'
However, if I query using find, such as find({"date_object": []}), I can get these results. How can I make this work with the pipeline?
I've done this in the MongoDB shell, but it can easily be translated into Python.
Is this what you need?
I assume your documents have a structure like this:
db.collection.save({foo:1, date_object:[new Date(), new Date(2016,1,01,1,0,0,0)]})
db.collection.save({foo:2, date_object:[new Date(2016,0,16,1,0,0,0),new Date(2016,0,5,1,0,0,0)]})
db.collection.save({foo:3, date_object:[]})
db.collection.save({foo:4, date_object:[new Date(2016,1,05,1,0,0,0), new Date(2016,1,06,1,0,0,0)]})
db.collection.save({foo:5, date_object:[]})
// Get empty arrays after unwind
db.collection.aggregate([
{$project:{_id:"$_id", foo:"$foo",
date_object:{
$cond: [ {"$eq": [{ $size:"$date_object" }, 0]}, [null], "$date_object" ]
}
}
},
{$unwind:"$date_object"},
{$match:{"date_object":null}}
])
// Get empty arrays before unwind
db.collection.aggregate([
{$match:{"date_object.0":{$exists:false}}},
{$project:{_id:"$_id", foo:"$foo",
date_object:{
$cond: [ {"$eq": [{ $size:"$date_object" }, 0]}, [null], "$date_object" ]
}
}
},
{$unwind:"$date_object"}
])
Only empty date_object
[
{
"_id" : ObjectId("56eb0bd618d4d09d4b51087a"),
"foo" : 3,
"date_object" : null
},
{
"_id" : ObjectId("56eb0bd618d4d09d4b51087c"),
"foo" : 5,
"date_object" : null
}
]
Finally, if you need only the documents with an empty date_object, you don't need to aggregate; you can easily achieve it with find:
db.collection.find({"date_object.0":{$exists:false}},{date_object:0})
Output
{
"_id" : ObjectId("56eb0bd618d4d09d4b51087a"),
"foo" : 3
}
{
"_id" : ObjectId("56eb0bd618d4d09d4b51087c"),
"foo" : 5
}
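Since the question uses pymongo, here is a rough Python translation of the "Get empty arrays after unwind" pipeline from this answer. $unwind emits nothing for a document whose array is empty, which is why a $match placed after a plain $unwind finds no records; replacing the empty array with [null] first keeps those documents. The helper name empty_array_pipeline is hypothetical.

```python
def empty_array_pipeline(field="date_object"):
    """Pipeline returning only documents whose array field was empty.

    empty_array_pipeline is a hypothetical helper name for illustration.
    """
    ref = "$" + field
    return [
        {"$project": {
            "_id": "$_id",
            "foo": "$foo",
            # Swap an empty array for [None] so that $unwind keeps the document.
            field: {"$cond": [{"$eq": [{"$size": ref}, 0]}, [None], ref]},
        }},
        {"$unwind": ref},
        # After unwinding, formerly empty arrays show up as a null value.
        {"$match": {field: None}},
    ]


pipeline = empty_array_pipeline()
# With the handle from the question:
# results = mongo.db.xxxx.aggregate(pipeline)
```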
I have this Document in mongoengine:
class Mydoc(db.Document):
x = db.DictField()
item_number = db.IntField()
And I have this data in the collection:
{
"_id" : ObjectId("55e360cce725070909af4953"),
"x" : {
"mongo" : [
{
"list" : "lista"
},
{
"list" : "listb"
}
],
"hello" : "world"
},
"item_number" : 1
}
If I want to push to the mongo list using mongoengine, I do this:
Mydoc.objects(item_number=1).update_one(push__x__mongo={"list" : "listc"})
That works well; if I query the database again I get this:
{
"_id" : ObjectId("55e360cce725070909af4953"),
"x" : {
"mongo" : [
{
"list" : "lista"
},
{
"list" : "listb"
},
{
"list" : "listc"
}
],
"hello" : "world"
},
"item_number" : 1
}
But when I try to pull from the same list using pull in mongoengine:
Mydoc.objects(item_number=1).update_one(pull__x__mongo={'list': 'lista'})
I get this error:
mongoengine.errors.OperationError: Update failed (Cannot apply $pull
to a non-array value)
Comparing the two statements:
Mydoc.objects(item_number=1).update_one(push__x__mongo={"list" : "listc"}) # Works
Mydoc.objects(item_number=1).update_one(pull__x__mongo={"list" : "listc"}) # Error
How can I pull from this list?
I appreciate any help
I believe the problem is that mongoengine doesn't know the structure of your x document. You declared it as a DictField, so mongoengine thinks you are pulling from a DictField, not from a ListField. Declare x as ListField and both queries should work just fine.
I suggest you also open an issue for this:
https://github.com/MongoEngine/mongoengine/issues
As a workaround, you can use a raw query:
Mydoc.objects(item_number=1).update_one(__raw__={'$pull': {'x.mongo': {'list': 'listc'}}})
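As a small sketch of what the raw workaround sends, the update document can be built as a plain dict. The helpers pull_update and simulate_pull are hypothetical names; simulate_pull is only a local illustration of how $pull with simple equality conditions removes array elements, not MongoDB code.

```python
def pull_update(array_path, condition):
    """Build a raw $pull update document (hypothetical helper)."""
    return {"$pull": {array_path: condition}}


def simulate_pull(items, condition):
    """Local illustration only: drop elements whose fields equal the condition."""
    return [
        item for item in items
        if not all(item.get(key) == value for key, value in condition.items())
    ]


update = pull_update("x.mongo", {"list": "listc"})
# With mongoengine, as in the workaround above:
# Mydoc.objects(item_number=1).update_one(__raw__=update)

before = [{"list": "lista"}, {"list": "listb"}, {"list": "listc"}]
after = simulate_pull(before, {"list": "listc"})
```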
I'm using Python and store datetime.utcnow() values in MongoDB.
What is wrong with my code?
deltaTime = timedelta(minutes=1)
s.find({"status" : "pending",
"$and" : [{"time" : {"$lt" : datetime.utcnow()}},
{"time" : {"$gt" : datetime.utcnow() - deltaTime }}
]
}, page=0 , perpage=15 )
It doesn't work, and the same query in the MongoDB shell does not work either:
db.s.find(
{"status" : "pending" ,
"$and" :
[
{"time" : {"$lt" : ISODate("2014-06-05 06:59:31.442Z") } }
,
{"time" : {"$gt" : ISODate("2014-06-05 05:59:31.442Z") } }
]
}
)
MongoDB says "Script executed successfully but there is no result to show".
I do have records in that range, but there is no result. I have also tried the MongoDB query without ISODate() and there is still no result.
I've solved this issue by the following query:
db.s.find(
{
"status" : "Pending"
,
"time" : {
"$gt" : ISODate("2014-06-05 06:01:29.397069") ,
"$lt" : ISODate("2014-06-05 07:01:29.397069")
}
}
)
The "$and" form did not work for me in this case.
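In pymongo, the same range query can be sketched by putting both bounds under a single "time" key. The helper name pending_window_query and its parameters are hypothetical. Note also that the working query uses "Pending" with a capital P while the original used "pending"; string equality in MongoDB is case-sensitive, so the status value may have mattered as much as dropping "$and".

```python
from datetime import datetime, timedelta, timezone


def pending_window_query(minutes=1, now=None, status="Pending"):
    """Query for documents with the given status in the last `minutes` minutes.

    pending_window_query is a hypothetical helper name for illustration.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "status": status,  # must match the stored value exactly, including case
        # Both bounds live under one key, so no $and is needed.
        "time": {"$gt": now - timedelta(minutes=minutes), "$lt": now},
    }


fixed_now = datetime(2014, 6, 5, 7, 1, 29, tzinfo=timezone.utc)
query = pending_window_query(minutes=60, now=fixed_now)
# With a live connection (db is assumed to be a pymongo Database object):
# cursor = db.s.find(query)
```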
This is my code:
#! /usr/bin/python
import os
from pymongo.connection import Connection
from pymongo.master_slave_connection import MasterSlaveConnection
database = 'toto'
collection = 'logs'
master = Connection(host="X.X.X.X", port=27017)
slave1 = Connection(host="X.X.X.X", port=27017)
con = MasterSlaveConnection(master, slaves=[slave1, master])
db = getattr(con,database)
#host_name.append("getattr(db,collection).distinct( 'host_name' )")
#print host_name[1]
hosts = db.logs.distinct( 'host_name' )
services = db.logs.distinct("service_description" , { "service_description" : { $ne : null } } )
#print hosts
print services
I got this error:
File "./rapport.py", line 23
services = db.logs.distinct("service_description" , { "service_description" : { $ne : null } } )
^
SyntaxError: invalid syntax
Why can't I use "$ne : null" in my code? I don't understand, because when I execute the query db.logs.distinct("service_description" , { "service_description" : { $ne : null } } ) directly in MongoDB it works.
I also tried this, but it doesn't work:
services = db.logs.distinct("service_description", { "service_description" : { "$ne" : None } } )
Thanks for your help.
You need to quote the $ne and use None instead of null.
Pymongo uses dicts as parameters.
asdf = "something"
{ asdf: "foo"}
is a valid declaration, using "something" as key.
If you compare that with
{$ne: "foo"}
the interpreter expects a valid Python expression as the key, and $ne is not a valid name, hence the syntax error.
Also, null is not predefined in Python, so use None instead.
Combined with the fluent interface in pymongo, your query should be:
db.logs.find({"service_description": {"$ne" : None}}).distinct('service_description')
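A short sketch of the point above: with the operator quoted and None in place of null, the filter is an ordinary Python dict, and None serializes to null. The helper name not_null_filter is hypothetical, and json is used here only to show the mapping.

```python
import json


def not_null_filter(field):
    """Filter matching documents where `field` is not null (hypothetical helper)."""
    # "$ne" must be a quoted string key; None is Python's equivalent of null.
    return {field: {"$ne": None}}


query = not_null_filter("service_description")
print(json.dumps(query))  # {"service_description": {"$ne": null}}

# With a live connection (db is assumed to be a pymongo Database object):
# services = db.logs.find(query).distinct("service_description")
# In pymongo 3 and later, the filter can also be passed to distinct directly:
# services = db.logs.distinct("service_description", query)
```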