I'm using MongoDB 3.2.1 / Python 3.4 / pymongo / pandas 0.17 (although the latter two are probably completely irrelevant to this question).
I'm seeing really strange (and wrong) behavior with MongoDB find.
I have a collection, containing a document like this:
{
"_id" : NumberLong(-1819413477243867792),
"targetentity" : "NODOGENERICO .ag.HP_BAR_DEG_APP_1",
"tx" : false,
"ocname" : ".oc.serv6",
"specificproblem" : null,
"saf" : false,
"iscriticalnode" : null,
"checkmask" : null,
"notificationidentifier" : 1347592,
"province" : null,
"usertext" : null,
"additionaltext" : "AAA Invalid Response",
"director" : ".temip.madrids01_director",
"problemoccurences" : 1,
"usertags" : null,
"managedobject" : "NODOGENERICO .ag.HP_BAR_DEG_APP_1",
"isacceptednode" : null,
"elementcode" : null,
"state" : "Terminated",
"probablecause" : "Unknown",
"ran" : false,
"counttotal" : 1,
"locationcode" : "NULL",
"problemstatus" : "Closed",
"structurednotes" : null,
"collection" : "serv6",
"operatornotes" : null,
"alarmtype" : "CommunicationsAlarm",
"workinfo" : null,
"perceivedseverity" : "Major",
"core" : true,
"eventtime" : NumberLong(1467342666000),
"originalseverity" : "Major",
"vendor" : "Several",
"controlelementcode" : null,
"outageflag" : false,
"incident" : null,
}
This "_id" is basically a hash computed with Python 3.4's built-in hash() function.
The problem is that I cannot find any element with this id after I insert it.
I've tried the following (at this point I'm running it directly in the mongo terminal, but pymongo gives me the same results):
db.getCollection('unique_alarm').find({"_id": NumberLong(-1819413477243867792)})
and
db.getCollection('unique_alarm').find({"_id": -1819413477243867792})
And for both I get this:
Fetched 0 record(s) in 1ms
I thought the problem was about how I deal with NumberLong, but for field eventtime (which has the same type) I have absolutely no problem.
I.e., for the eventtime if I query:
db.getCollection('unique_alarm').find({"eventtime" : NumberLong(1467342666000)})
or by:
db.getCollection('unique_alarm').find({"eventtime" :1467342666000})
Both these queries return this first document again, no problem.
Any clues on what is happening? Why are the first two queries returning 0 results?
More information on my trial and error:
It doesn't matter whether the field is "_id" or any other field; I cannot search for these numbers.
I'm inserting these documents using pymongo.
If I try to insert this document again (either using pymongo or the mongodb terminal), I get a duplicate key error.
The answer may be trivial: it all comes down to the quotes " around the NumberLong value.
Both when inserting data and when querying, the value needs to be quoted:
db.sofia.find({"_id" : NumberLong("-1819413477243867792")}).pretty()
{
"_id" : NumberLong("-1819413477243867792"),
"targetentity" : "NODOGENERICO .ag.HP_BAR_DEG_APP_1",
"tx" : false,
"ocname" : ".oc.serv6",
....
}
I think you're hitting some sort of limit within mongo's NumberLong.
I've opened a mongo console and this is the output:
> NumberLong(-1819413477243867792)
NumberLong("-1819413477243867904")
So I would assume that if you query by NumberLong("-1819413477243867904") you would magically find your record, which would probably prove that your hash is hitting some sort of MongoDB limit of NumberLong.
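The rounding seen in the console can be reproduced outside MongoDB. The mongo shell is JavaScript, where an unquoted numeric literal is an IEEE 754 double, and Python's float is the same type. The sketch below is plain Python, not MongoDB code, and the variable names are only illustrative:

```python
# A double holds only 53 bits of integer precision, so large 64-bit
# integers are rounded when they pass through one.
hash_id = -1819413477243867792   # the _id from the question
event_time = 1467342666000       # the eventtime from the question

# Round-trip each value through a double, as an unquoted shell literal would be.
hash_id_as_double = int(float(hash_id))
event_time_as_double = int(float(event_time))

print(hash_id_as_double)                   # -1819413477243867904, as in the console output
print(hash_id_as_double == hash_id)        # False: the value was rounded
print(event_time_as_double == event_time)  # True: the value is below 2**53
```

This would also explain why the eventtime queries work with or without NumberLong, and why passing the id as a string, NumberLong("-1819413477243867792"), avoids the rounding.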
Related
I'm using db.collection.find({}, {'_id': False}).limit(2000) to get the documents from a collection. These documents are sent to a Facebook API; after the API returns success, they need to be deleted from the collection.
My main question is:
Is there a way to delete all of these 2000 documents without using a for loop? I know that collection.find returns a cursor; is there a way to use this cursor in a delete_many?
The structure of my document is:
{
"_id" : ObjectId("61608068887f1a0e2162d94b"),
"event_time" : "1632582893",
"value" : "549.9000",
"contents" : [
{
"product_id" : "1-1",
"quantity" : "1.000000",
"value" : "10"
}
]
}
To solve this problem, based on the comments of @adarsh and @J.F, I've used the following code:
rm = [x['_id'] for x in MongoDB(mongo).db.get_collection("DataToSend").find({}, {'_id' : 1}).limit(2000)]
MongoDB(mongo).db.get_collection("DataToSend").delete_many({'_id' : { '$in' : list(rm)}})
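The same two steps can be wrapped in a small helper. This is only a sketch: delete_first_batch is a hypothetical name, collection is assumed to be a pymongo Collection, and it assumes nothing else writes to the collection between the find and the delete.

```python
def delete_first_batch(collection, batch_size=2000):
    """Delete up to batch_size documents and return how many were removed."""
    # Fetch only the _id of each document in the batch.
    cursor = collection.find({}, {"_id": 1}).limit(batch_size)
    ids = [doc["_id"] for doc in cursor]
    if not ids:
        return 0  # nothing to delete, so skip the second round trip
    # A single delete_many call removes every document whose _id is in the list.
    result = collection.delete_many({"_id": {"$in": ids}})
    return result.deleted_count
```

It could be called as delete_first_batch(db.get_collection("DataToSend")) once the API has reported success.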
I'm developing a Django application using Python and MongoDB. I'm building a form that takes user input and saves it to the DB.
Before inserting, I want to check whether the data is already present in the DB.
I have a mongo collection which looks something like below :
coll_1 :
{ "_id" : ObjectId("56e0a3a2d59feaa43fba49d5"), "timestamp" : ISODate("2017-11-18T10:23:29.620Z"), "City_list" : "[PN-City1, PN-City2,PN-City3, PN-City4]", "LDE" : "LDE-1234, LDE-345, LDE-456" , "Name": "ABC"}
{ "_id" : ObjectId("56e0a3a2d59feaa43fba49d6"), "timestamp" : ISODate("2016-12-18T10:23:29.620Z"), "City_list" : "[PN-City4, PN-City5,PN-City6,PN- City7]", "LDE" : "LDE-444, LDE-3445, LDE-456", "Name": "BCD"}
{ "_id" : ObjectId("56e0a3a2d59feaa43fd67873"), "timestamp" : ISODate("2016-12-18T10:23:29.620Z"), "City_list" : "[PN-City1, PN-City6,PN-City9,PN- City10]", "LDE" : "LDE-444, LDE-3445, LDE-456", "Name": "XYZ"}
I have a form from which I take user inputs: Name, Cities (one or more, comma separated), LDE (comma separated).
In my script I want to check the following before inserting into MongoDB:
If the user is a new user, insert directly into the DB.
If it is an existing user, check whether the cities entered by the user are already present in the DB. If not, update the DB; otherwise send a message to the HTML page saying the city is already present in the DB.
Say my input is something like this :
Name: PQR
City_list : PN-City4, PN-City12
LDE: LDE-6767
My code is as below :
if 'Name' in pdata and ('city_list' in pdata and re.match("(PN-\w*-\d)(PN-\w*-\d)*", pdata['city_list'])):
    user_input = pdata['city_list'].split(",")
    pname = pdata['Name']
    for data in user_input:
        if db.coll_1.find({"Name": pname , 'City_list': { "$in": data}})
This gives me an error.
How do I achieve this?
I tried something like this:
for data in user_input:
    data = str(data) # it was taking as unicode
    if (db.coll_1.find({"Name": pname , 'City_list': { "$in": data}}).count() > 0):
This gives the error: OperationFailure: $in needs an array
City_list is a string.
Can someone please help me with this?
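The error itself says that $in expects a list, so {"$in": data} with a plain string fails. Because City_list is stored as one string rather than an array, one possible approach is to fetch the user's document and compare the cities in Python. The sketch below is only an illustration under that assumption; parse_city_list and already_present are hypothetical helper names, not part of pymongo.

```python
def parse_city_list(raw):
    """Turn a string such as "[PN-City1, PN-City2]" into a list of city names."""
    inner = raw.strip().strip("[]")
    return [name.strip() for name in inner.split(",") if name.strip()]


def already_present(stored_raw, user_input):
    """Return the cities from user_input that already appear in the stored string."""
    stored = set(parse_city_list(stored_raw))
    return [city for city in parse_city_list(user_input) if city in stored]


stored_value = "[PN-City1, PN-City2,PN-City3, PN-City4]"  # City_list as stored in coll_1
print(already_present(stored_value, "PN-City4, PN-City12"))  # ['PN-City4']
```

If City_list were stored as a real array instead of a string, a query such as {"City_list": {"$in": ["PN-City4", "PN-City12"]}} could do this check on the server.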
I need to update a document in an array inside another document in Mongo DB.
{
"_id" : ObjectId("51cff693d342704b5047e6d8"),
"author" : "test",
"body" : "sdfkj dsfhk asdfjad ",
"comments" : [
{
"author" : "test",
"body" : "sdfkjdj\r\nasdjgkfdfj",
"email" : "test#tes.com"
},
{
"author" : "hola",
"body" : "sdfl\r\nhola \r\nwork here"
}
],
"date" : ISODate("2013-06-30T09:12:51.629Z"),
"permalink" : "mxwnnnqafl",
"tags" : [
"ab"
],
"title" : "cd"
}
If I try to update the first document in the comments array with the command below, it works.
db.posts.update({'permalink':"cxzdzjkztkqraoqlgcru"},{'$inc': {"comments.0.num_likes": 1}})
But if I put the same thing in Python code like below, I get a WriteError saying that it can't traverse the element. I don't understand what is missing.
Can anyone help me out, please?
post = self.posts.find_one({'permalink': permalink})
response = self.posts.update({'permalink': permalink},
                             {'$inc': {"comments.comment_ordinal.num_likes": 1}})
WriteError: cannot use the part (comments of comments.comment_ordinal.num_likes) to traverse the element
comment_ordinal needs to be substituted into the field path. Inside the string literal it is just the text "comment_ordinal", not the value of your variable, so MongoDB looks for a field with that name instead of an array index. You should do something like:
updated_field = "comments." + str(comment_ordinal) + ".num_likes"
response = self.posts.update({'permalink': permalink}, {'$inc': {updated_field: 1}})
Hope this helps.
You are doing it wrong: you need to build your query dynamically, and the best way to do that is with the str.format method.
response = self.posts.update_one(
    {'permalink': permalink},
    {'$inc': {"comments.{}.num_likes".format(comment_ordinal): 1}}
)
You should also consider using the update_one method for a single update, and update_many if you need to update multiple documents, because update is deprecated.
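Both answers come down to building the field path as a string before handing it to MongoDB. A minimal sketch of that step on its own, with no database involved (build_like_update is a hypothetical helper name):

```python
def build_like_update(comment_ordinal):
    """Return the $inc document for the comment at position comment_ordinal."""
    # int() guards against the ordinal arriving as a string from a web form.
    field = "comments.{}.num_likes".format(int(comment_ordinal))
    return {"$inc": {field: 1}}


print(build_like_update(0))  # {'$inc': {'comments.0.num_likes': 1}}
```

The result can then be passed as the second argument to update_one.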
I am using pymongo and I am trying to insert dicts into mongodb database. My dictionaries look like this
{
"name" : "abc",
"Jobs" : [
{
"position" : "Systems Engineer (Data Analyst)",
"time" : [
"October 2014",
"May 2015"
],
"is_current" : 1,
"location" : "xyz",
"organization" : "xyz"
},
{
"position" : "Systems Engineer (MDM Support Lead)",
"time" : [
"January 2014",
"October 2014"
],
"is_current" : 1,
"location" : "xxx",
"organization" : "xxx"
},
{
"position" : "Asst. Systems Engineer (ETL Support Executive)",
"time" : [
"May 2012",
"December 2013"
],
"is_current" : 1,
"location" : "zzz",
"organization" : "xzx"
},
],
"location" : "Buffalo, New York",
"education" : [
{
"school" : "State University of New York at Buffalo - School of Management",
"major" : "Management Information Systems, General",
"degree" : "Master of Science (MS), "
},
{
"school" : "Rajiv Gandhi Prodyogiki Vishwavidyalaya",
"major" : "Electrical and Electronics Engineering",
"degree" : "Bachelor of Engineering (B.E.), "
}
],
"id" : "abc123",
"profile_link" : "example.com",
"html_source" : "<html> some_source_code </html>"
}
I am getting this error:
pymongo.errors.DuplicateKeyError: E11000 duplicate key error index:
Linkedin_DB.employee_info.$id dup key: { :
ObjectId('56b64f6071c54604f02510a8') }
When I run my program, the 1st document gets inserted properly, but when I insert the second document I get this error. When I start my script again, the document which was not inserted because of this error gets inserted properly, the error comes for the next document, and this continues.
Clearly mongodb is using the same ObjectId during two inserts. I don't understand why mongodb is failing to generate a unique ID for new documents.
My code to save passed data:
class Mongosave:
    """
    Pass collection_name and dict data
    This module stores the passed dict in collection
    """

    def __init__(self):
        self.connection = pymongo.MongoClient()
        self.db = self.connection['Linkedin_DB']

    def _exists(self, id):
        # To check if user already exists
        return True if list(self.collection.find({'id': id})) else False

    def save(self, collection_name, data):
        self.collection = self.db[collection_name]
        if not self._exists(data['id']):
            print (data['id'])
            self.collection.insert(data)
        else:
            self.collection.update({'id':data['id']}, {"$set": data})
I can't figure out why this is happening. Any help is appreciated.
The problem is that your save method is using a field called "id" to decide if it should do an insert or an update. You want to use "_id" instead. You can read about the _id field and index here. PyMongo automatically adds an _id to your document if one is not already present. You can read more about that here.
You might have inserted two copies of the same document into your collection in one run.
I cannot quite understand what you mean by:
When I start my script again, the document which was not inserted because of this error gets inserted properly, the error comes for the next document, and this continues.
What I do know is if you do:
from pymongo import MongoClient
client = MongoClient()
db = client['someDB']
collection = db['someCollection']
someDocument = {'x': 1}
for i in range(10):
    collection.insert_one(someDocument)
You'll get a:
pymongo.errors.DuplicateKeyError: E11000 duplicate key error index:
This makes me think that, although pymongo generates a unique _id for you if you don't provide one, that _id is tied to the object you pass in. insert_one adds the generated _id to the dict itself, so inserting the same dict a second time sends the same _id again.
Try generating your own _id and see if it happens again.
Edit:
I just tried this and it works:
for i in range(10):
    collection.insert_one({'x':1})
This makes me think the way pymongo generates the _id is associated with the object you feed into it. This time I'm not referring to the same object anymore, and the problem disappeared.
Are you giving your database two references to the same object?
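pymongo's insert_one adds an "_id" key to the dict it receives when one is missing, which is why reusing one dict object repeats the same _id. The sketch below imitates that behaviour without a database: FakeCollection is a hypothetical stand-in, not a pymongo class, and it uses a uuid where pymongo would generate an ObjectId.

```python
import uuid


class FakeCollection:
    """Hypothetical stand-in that only mimics insert_one adding "_id" in place."""

    def __init__(self):
        self._docs = {}

    def insert_one(self, doc):
        if "_id" not in doc:
            doc["_id"] = uuid.uuid4().hex  # pymongo would create an ObjectId here
        if doc["_id"] in self._docs:
            raise ValueError("duplicate key: {}".format(doc["_id"]))
        self._docs[doc["_id"]] = dict(doc)


def insert_copies(collection, template, count):
    """Insert count documents, giving every insert its own dict."""
    for _ in range(count):
        # dict(template) is a shallow copy, so template itself never gains an _id.
        collection.insert_one(dict(template))


collection = FakeCollection()
some_document = {"x": 1}
insert_copies(collection, some_document, 10)
print(len(collection._docs))   # 10
print("_id" in some_document)  # False
```

With a real collection the same idea applies: pass a fresh dict (or a copy) to each insert_one call.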
I am trying to update or create a dataset, combining the previous value with the new one.
This is what it looks like in my Python script right now.
dailyDataset = {
    "pId" : pub,
    "oId" : off,
    "payout" : +addPayout,
}
db[dbName].update( { 'pId' : publisher, 'oId' : offer.id }, {"$set" : dailyDataset }, True)
What I'm trying to achieve is: if a dataset with this pId and oId exists, take the current value of "payout" and add the value of addPayout to it.
E.g. with payout = 1.22 and addPayout = 1.22, the result should be 2.44.
Any tip is welcome.
Thanks!
You can use the $inc operator:
db[dbName].update( { 'pId' : publisher, 'oId' : offer.id }, {"$inc" : {'payout':1.22}}, True)
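The same idea written with update_one and an explicit upsert flag might look like the sketch below. add_payout is a hypothetical helper name and collection is assumed to be a pymongo Collection. With upsert=True, a missing document is created from the filter fields and payout starts at the given amount.

```python
def add_payout(collection, publisher_id, offer_id, amount):
    """Add amount to the payout of the matching dataset, creating it if needed."""
    return collection.update_one(
        {"pId": publisher_id, "oId": offer_id},  # which dataset to update
        {"$inc": {"payout": amount}},            # add to the stored value
        upsert=True,                             # insert when nothing matches
    )
```

Calling add_payout(db[dbName], publisher, offer.id, addPayout) twice with 1.22 would leave payout at about 2.44, subject to ordinary floating point rounding.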