MongoDB TTL index doesn't delete expired documents - python

I'm trying to create a collection named ttl, and using a TTL index, make the documents in that collection expire after 30 seconds.
I've created the collection using mongoengine, like so:
class Ttl(Document):
    meta = {
        'indexes': [
            {
                'name': 'TTL_index',
                'fields': ['expire_at'],
                'expireAfterSeconds': 0
            }
        ]
    }
    expire_at = DateTimeField()
The index was created, and Robo3T shows it as expected.
The actual documents are inserted to the collection using mongoengine as well:
current_ttl = models.monkey.Ttl(expire_at=datetime.now() + timedelta(seconds=30))
current_ttl.save()
The save is successful (the document is inserted into the DB), but it never expires!
How can I make the documents expire?
I'm adding the collection's contents here as well in case I'm saving them wrong. These are the results of running db.getCollection('ttl').find({}):
/* 1 */
{
"_id" : ObjectId("5ccf0f5a4bdc6edcd3773cd6"),
"created_at" : ISODate("2019-05-05T19:31:10.715Z")
}
/* 2 */
{
"_id" : ObjectId("5ccf121c0b792dae8f55cc80"),
"expire_at" : ISODate("2019-05-05T19:41:08.220Z")
}
/* 3 */
{
"_id" : ObjectId("5ccf127d6729084a24772fad"),
"expire_at" : ISODate("2019-05-05T19:42:47.522Z")
}
/* 4 */
{
"_id" : ObjectId("5ccf15bab124a97322da28de"),
"expire_at" : ISODate("2019-05-05T19:56:56.359Z")
}
The indexes themselves, as per the results of db.getCollection('ttl').getIndexes(), are:
/* 1 */
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "monkeyisland.ttl"
},
{
"v" : 2,
"key" : {
"expire_at" : 1
},
"name" : "TTL_index",
"ns" : "monkeyisland.ttl",
"background" : false,
"expireAfterSeconds" : 0
}
]
My db.version() is 4.0.8 and it's running on Ubuntu 18.04.

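The issue is with this line:
current_ttl = models.monkey.Ttl(expire_at=datetime.now() + timedelta(seconds=30))
It should be:
current_ttl = models.monkey.Ttl(expire_at=datetime.utcnow() + timedelta(seconds=30))
MongoDB stores dates in UTC, and the TTL monitor compares expire_at against the current UTC time. datetime.now() returns a naive local time, which is stored as if it were UTC, so on a machine whose clock is ahead of UTC the documents only expire hours later. Also note that the TTL background task runs about once every 60 seconds, so a document can remain in the collection for up to a minute or so after its expiry time.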
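As a minimal sketch of the same idea without a database: the hypothetical helper expiry_in below builds a timezone-aware UTC expiry time, which avoids the naive-local-time trap (and datetime.utcnow(), which is deprecated in recent Python versions). It assumes the driver converts aware datetimes to UTC when saving, as PyMongo does. The second half only measures how far the local clock is from UTC on the current machine.

```python
from datetime import datetime, timedelta, timezone


def expiry_in(seconds, now=None):
    """Return a timezone-aware UTC datetime `seconds` after `now`."""
    if now is None:
        now = datetime.now(timezone.utc)  # aware "now", unambiguous in any local zone
    return now + timedelta(seconds=seconds)


# Value one might pass as expire_at (assumption: the driver stores aware datetimes as UTC).
expire_at = expiry_in(30)

# Difference between the naive local clock and UTC on this machine; this is the
# error introduced when datetime.now() is stored as if it were UTC.
naive_local = datetime.now()
naive_utc = datetime.now(timezone.utc).replace(tzinfo=None)
skew = naive_local - naive_utc

print("expire_at (UTC):", expire_at.isoformat())
print("local clock minus UTC, in hours:", round(skew.total_seconds() / 3600, 2))
```

If the printed difference is not zero, every expire_at built from datetime.now() is off by that amount.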

Related

How to check in pymongo that item is not in list field?

Let's say I have records like this:
{ "name" : "Kobe Bryant",
"jersey_numbers" : [8,24]
}
{ "name" : "Michael Jordan",
"jersey_numbers" : [23]
}
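How can I find all the records where the field "jersey_numbers" does not include the number 23?
You can use the $nin operator (https://docs.mongodb.com/manual/reference/operator/query/nin/). On an array field it matches documents whose array contains none of the listed values:
players = db['players'].find({ 'jersey_numbers': { "$nin": [ 23 ] } })
For a single value, { 'jersey_numbers': { "$ne": 23 } } selects the same documents.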
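To illustrate which records that filter selects, here is a small plain-Python sketch that needs no database. The helper matches_nin is hypothetical and only mimics how $nin behaves on an array field; it is not part of PyMongo.

```python
players = [
    {"name": "Kobe Bryant", "jersey_numbers": [8, 24]},
    {"name": "Michael Jordan", "jersey_numbers": [23]},
]

# The same filter document that would be passed to find().
query = {"jersey_numbers": {"$nin": [23]}}


def matches_nin(doc, field, excluded):
    """Mimic $nin: true when the field holds none of the excluded values."""
    values = doc.get(field, [])
    if not isinstance(values, list):
        values = [values]  # a scalar field is compared directly
    return not any(value in excluded for value in values)


excluded = query["jersey_numbers"]["$nin"]
result = [p["name"] for p in players if matches_nin(p, "jersey_numbers", excluded)]
print(result)  # ['Kobe Bryant']
```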

MongoDB pipeline unwind and check for empty array

I'm unwinding a field that is an array of date objects. In some cases the array is empty, which is fine. I'd like to handle everything in a pipeline, but in some cases I want to filter for the results that have an empty array.
pipeline = []
pipeline.append({"$unwind": "$date_object"})
pipeline.append({"$sort": {"date_object" : 1}})
I want to use the pipeline format, however the following code does not return any records:
pipeline.append({"$match": {"date_object": {'$exists': False }}})
nor does the following work:
pipeline.append({"$match": {"date_object": []}})
and then:
results = mongo.db.xxxx.aggregate(pipeline)
I'm also trying:
pipeline.append({ "$cond" : [ { "$eq" : [ "$date_object", [] ] }, [ { '$value' : 0 } ], '$date_object' ] } )
But with this I get the following error:
.$cmd failed: exception: Unrecognized pipeline stage name: '$cond'
However, if I query using find, such as find({"date_object": []}), I get these results. How can I make this work with the pipeline?
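$unwind emits no output for a document whose array is empty, so a $match placed after it never sees those documents. And $cond is an expression operator, not a pipeline stage, so it has to be used inside a stage such as $project.
I did this in the MongoDB shell, but it translates easily to Python.
Is this what you need?
I assume your data has this structure: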
db.collection.save({foo:1, date_object:[new Date(), new Date(2016,1,01,1,0,0,0)]})
db.collection.save({foo:2, date_object:[new Date(2016,0,16,1,0,0,0),new Date(2016,0,5,1,0,0,0)]})
db.collection.save({foo:3, date_object:[]})
db.collection.save({foo:4, date_object:[new Date(2016,1,05,1,0,0,0), new Date(2016,1,06,1,0,0,0)]})
db.collection.save({foo:5, date_object:[]})
// Get empty arrays after unwind
db.collection.aggregate([
    {$project: {_id: "$_id", foo: "$foo",
        date_object: {
            $cond: [ {"$eq": [{ $size: "$date_object" }, 0]}, [null], "$date_object" ]
        }
    }},
    {$unwind: "$date_object"},
    {$match: {"date_object": null}}
])
// Get empty arrays before unwind
// "date_object.0" does not exist only when the array has no first element
db.collection.aggregate([
    {$match: {"date_object.0": {$exists: false}}},
    {$project: {_id: "$_id", foo: "$foo",
        date_object: {
            $cond: [ {"$eq": [{ $size: "$date_object" }, 0]}, [null], "$date_object" ]
        }
    }},
    {$unwind: "$date_object"}
])
Result (only documents with an empty date_object):
[
{
"_id" : ObjectId("56eb0bd618d4d09d4b51087a"),
"foo" : 3,
"date_object" : null
},
{
"_id" : ObjectId("56eb0bd618d4d09d4b51087c"),
"foo" : 5,
"date_object" : null
}
]
Finally, if you need only the documents with an empty date_object, you don't need to aggregate; you can easily get them with find by matching on the absence of a first array element:
db.collection.find({"date_object.0":{$exists:false}},{date_object:0})
Output
{
"_id" : ObjectId("56eb0bd618d4d09d4b51087a"),
"foo" : 3
}
{
"_id" : ObjectId("56eb0bd618d4d09d4b51087c"),
"foo" : 5
}
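As a rough Python translation of the "empty arrays after unwind" pipeline, the sketch below builds the stages as plain dicts. The function name empty_array_pipeline is hypothetical, and the aggregate call in the final comment reuses the collection handle from the question, so nothing here touches a database.

```python
def empty_array_pipeline(field="date_object"):
    """Build the 'empty arrays after unwind' pipeline for `field`."""
    ref = "$" + field
    return [
        # Replace an empty array with [null] so $unwind still emits the document.
        {"$project": {
            "foo": 1,
            field: {"$cond": [{"$eq": [{"$size": ref}, 0]}, [None], ref]},
        }},
        {"$unwind": ref},
        # Keep only the documents whose array was empty.
        {"$match": {field: None}},
    ]


pipeline = empty_array_pipeline()
for stage in pipeline:
    print(stage)
# With PyMongo this would be run as: results = mongo.db.xxxx.aggregate(pipeline)
```

Python's None is encoded as BSON null, which is why [None] stands in for [null] from the shell version.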

Pull from a list in a dict using mongoengine

I have this Document in mongoengine:
class Mydoc(db.Document):
    x = db.DictField()
    item_number = IntField()
And I have this data in the collection:
{
"_id" : ObjectId("55e360cce725070909af4953"),
"x" : {
"mongo" : [
{
"list" : "lista"
},
{
"list" : "listb"
}
],
"hello" : "world"
},
"item_number" : 1
}
If I want to push to the mongo list using mongoengine, I do this:
Mydoc.objects(item_number=1).update_one(push__x__mongo={"list" : "listc"})
That works well. If I query the database again, I get this:
{
"_id" : ObjectId("55e360cce725070909af4953"),
"x" : {
"mongo" : [
{
"list" : "lista"
},
{
"list" : "listb"
},
{
"list" : "listc"
}
],
"hello" : "world"
},
"item_number" : 1
}
But when I try to pull from the same list using pull in mongoengine:
Mydoc.objects(item_number=1).update_one(pull__x__mongo={'list': 'lista'})
I get this error:
mongoengine.errors.OperationError: Update failed (Cannot apply $pull
to a non-array value)
Comparing the two statements:
Mydoc.objects(item_number=1).update_one(push__x__mongo={"list" : "listc"}) # Works
Mydoc.objects(item_number=1).update_one(pull__x__mongo={"list" : "listc"}) # Error
How can I pull from this list?
I'd appreciate any help.
I believe the problem is that mongoengine doesn't know the structure of your x document. You declared it as a DictField, so mongoengine thinks you are pulling from a DictField, not from a ListField. Declare x as a ListField and both queries should work just fine.
I suggest you also create an issue for this:
https://github.com/MongoEngine/mongoengine/issues
As a workaround, you can use a raw query:
Mydoc.objects(item_number=1).update_one(__raw__={'$pull': {'x.mongo': {'list': 'listc'}}})
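To make the effect of that raw $pull concrete, here is a plain-Python sketch of what the update does to the document. The function pull is a hypothetical stand-in that mimics the server-side behaviour for a list of sub-documents; it is not a mongoengine or PyMongo API.

```python
import copy


def pull(doc, dotted_path, condition):
    """Return a copy of `doc` with matching items removed from the list at `dotted_path`."""
    result = copy.deepcopy(doc)
    *parents, leaf = dotted_path.split(".")
    target = result
    for key in parents:
        target = target[key]
    if not isinstance(target[leaf], list):
        raise TypeError("Cannot apply $pull to a non-array value")
    # Drop every sub-document whose fields all equal the condition's values.
    target[leaf] = [
        item for item in target[leaf]
        if not all(item.get(k) == v for k, v in condition.items())
    ]
    return result


doc = {
    "x": {
        "mongo": [{"list": "lista"}, {"list": "listb"}, {"list": "listc"}],
        "hello": "world",
    },
    "item_number": 1,
}
updated = pull(doc, "x.mongo", {"list": "listc"})
print(updated["x"]["mongo"])  # [{'list': 'lista'}, {'list': 'listb'}]
```

Pointing the same helper at "x.hello" raises the "non-array value" error, which mirrors what happens when the update targets something that is not a list.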

How to Query this in MongoDB?

My items are stored in MongoDB like this:
{"ProductName":"XXXX",
"Catalogs" : [
{
"50008064" : "Apple"
},
{
"50010566" : "Box"
},
{
"50016422" : "Water"
}
]}
Now I want to query all the items that belong to catalog 50008064. How can I do that?
(The catalog id is "50008064" and the catalog name is "Apple".)
You cannot query this in an efficient manner and performance will decrease as your data grows. As such I would consider it a schema bug and you should refactor/migrate to the following model which does allow for indexing :
{"ProductName":"XXXX",
"Catalogs" : [
{
id : "50008064",
value : "Apple"
},
{
id : "50010566",
value : "Box"
},
{
id : "50016422",
value : "Water"
}
]}
And then create an index:
db.products.createIndex({'Catalogs.id': 1})
Again, I strongly suggest you change your schema, as this is a potential performance bottleneck you cannot fix any other way.
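A migration to the suggested model could be sketched in Python along these lines. The function normalize_catalogs is a hypothetical helper that only reshapes one document in memory; reading the documents and writing them back is left out.

```python
def normalize_catalogs(product):
    """Turn [{"<id>": "<name>"}, ...] into [{"id": "<id>", "value": "<name>"}, ...]."""
    catalogs = [
        {"id": catalog_id, "value": name}
        for entry in product.get("Catalogs", [])
        for catalog_id, name in entry.items()
    ]
    # Copy the other fields unchanged and swap in the reshaped list.
    return {**product, "Catalogs": catalogs}


old = {
    "ProductName": "XXXX",
    "Catalogs": [
        {"50008064": "Apple"},
        {"50010566": "Box"},
        {"50016422": "Water"},
    ],
}
new = normalize_catalogs(old)
print(new["Catalogs"][0])  # {'id': '50008064', 'value': 'Apple'}
```

With the new shape, the catalog lookup becomes a plain equality filter such as {"Catalogs.id": "50008064"}, which an index on Catalogs.id can serve.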
This should probably work according to the entry here, although it won't be very fast, as stated in the link.
db.products.find({ "Catalogs.50008064" : { $exists: true } } )

Get child dict values use Mongo Map/Reduce

I have a Mongo collection, and I want to get the total value of 'number_of_ad_clicks' for a given site name, timestamp and variant id. Because we have a large amount of data, it would be better to use map/reduce. Could anyone give me a suggestion?
Here is the JSON format of my collection:
{ "_id" : ObjectId( "4e3c280ecacbd1333b00f5ff" ),
"timestamp" : "20110805",
"variants" : { "94" : { "number_of_ad_clicks" : 41,
"number_of_search_keywords" : 9,
"total_duration" : 0,
"os" : { "os_2" : 2,
"os_1" : 1,
"os_0" : 0 },
"countries" : { "ge" : 6,
"ca" : 1,
"fr" : 8,
"uk" : 4,
"us" : 6 },
"screen_resolutions" : { "(320, 240)" : 1,
"(640, 480)" : 5,
"(1024, 960)" : 5,
"(1280, 768)" : 5 },
"widgets" : { "widget_1" : 1,
"widget_0" : 0 },
"languages" : { "ua_uk" : 8,
"ca_en" : 2,
"ca_fr" : 2,
"us_en" : 5 },
"search_keywords" : { "search_keyword_8" : 8,
"search_keyword_5" : 5,
"search_keyword_4" : 4,
"search_keyword_7" : 7,
"search_keyword_6" : 6,
"search_keyword_1" : 1,
"search_keyword_3" : 3,
"search_keyword_2" : 2 },
"number_of_pageviews" : 18,
"browsers" : { "browser_4" : 4,
"browser_0" : 0,
"browser_1" : 1,
"browser_2" : 2,
"browser_3" : 3 },
"keywords" : { "keyword_5" : 5,
"keyword_4" : 4,
"keyword_1" : 1,
"keyword_0" : 0,
"keyword_3" : 3,
"keyword_2" : 2 },
"number_of_keyword_clicks" : 83,
"number_of_visits" : 96 } },
"site_name" : "fonter.com",
"number_of_variants" : 1 }
Here is my attempt, which failed:
m = function() {
    emit(this.query, {variants: this.variants});
}
r = function(key , vals) {
    var clicks = 0 ;
    for(var i = 0; i < vals.length(); i++){
        clicks = vals[i]['number_of_ad_clicks'];
    }
    return clicks;
}
res = db.variant_daily_collection.mapReduce(m, r, {out : "myoutput", "query":{"site_name": 'fonter.com', 'timestamp': '20110805'}})
db.myoutput.find()
Could somebody give me a suggestion?
Thank you very much. I tried your solution, but nothing is returned.
I invoke the map/reduce as follows. Is there anything wrong?
res = db.variant_daily_collection.mapReduce(map, reduce, {out : "myoutput", "query":{"site_name": 'facee.com', 'timestamp': '20110809', 'variant_id': '305'}})
db.myoutput.find()
The emit function emits both a key and a value.
If you are used to SQL, think of the key as your GROUP BY and the value as your SUM(), AVG(), etc.
In your case you want to "group by": site_name, timestamp and variant id. It looks like you may have more than one variant, so you will need to loop through the variants, like this:
map = function() {
    // variants lives on the current document, so it must be read via this
    for(var i in this.variants){
        var key = {};
        key.timestamp = this.timestamp;
        key.site_name = this.site_name;
        key.variant_id = i; // that's the "94" string.
        var value = {};
        value.clicks = this.variants[i].number_of_ad_clicks;
        emit(key, value);
    }
}
The reduce function will get an array of values each one like this { clicks: 41 }. The function needs to return one object that looks the same.
So if you get values = [ {clicks:21}, {clicks:10}, {clicks:5} ] you must output {clicks:36}.
So you do something like this:
reduce = function(key , vals) {
    var returnValue = { clicks: 0 }; // initializing to zero
    // length is a property of the array, not a function
    for(var i = 0; i < vals.length; i++){
        returnValue.clicks += vals[i].clicks;
    }
    return returnValue;
}
Note that the value from map has the same shape as the return from reduce.
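The same map and reduce logic can be checked in plain Python before running it on the server. The sketch below is only a simulation: map_doc, reduce_values and map_reduce are hypothetical names, the tuple key stands in for the key object, and the sample documents are trimmed to the fields the job uses.

```python
from collections import defaultdict


def map_doc(doc):
    """Emit one (key, value) pair per variant, like the map function."""
    for variant_id, stats in doc["variants"].items():
        key = (doc["timestamp"], doc["site_name"], variant_id)
        yield key, {"clicks": stats["number_of_ad_clicks"]}


def reduce_values(key, values):
    """Sum the clicks; the result has the same shape as each mapped value."""
    return {"clicks": sum(value["clicks"] for value in values)}


def map_reduce(docs):
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(key, values) for key, values in grouped.items()}


docs = [
    {"timestamp": "20110805", "site_name": "fonter.com",
     "variants": {"94": {"number_of_ad_clicks": 41}}},
    {"timestamp": "20110805", "site_name": "fonter.com",
     "variants": {"94": {"number_of_ad_clicks": 10},
                  "95": {"number_of_ad_clicks": 5}}},
]
totals = map_reduce(docs)
print(totals[("20110805", "fonter.com", "94")])  # {'clicks': 51}
```

Because reduce_values returns the same shape it receives, its output can be fed back into another reduce call, which is what MongoDB may do when it reduces a key in several passes.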
