I have a problem when I try to select data in MongoDB with PyMongo. This is my code:
import pymongo
from pymongo import MongoClient
import sys
from datetime import datetime
try:
    conn=pymongo.MongoClient('10.33.109.228',27017)
    db=conn.mnemosyne
    data_ip=db.session.aggregate({'$match':{'timestamp':{'$gte': ISODate('2016-11-11T00:00:00.000Z'),'$lte': ISODate('2016-11-11T23:59:59.000Z')}}},{'$group':{'_id':'$source_ip'}})
    for f in data_ip:
        print f['_id']
except pymongo.errors.ConnectionFailure, e:
    print "Could not connect to MongoDB: %s" % e
When I execute it, I get this error:
Traceback (most recent call last):
File "test.py", line 9, in <module>
data_ip=db.session.aggregate({'$match':{'timestamp':{'$gte': ISODate('2016-11-11T00:00:00.000Z'),'$lte': ISODate('2016-11-11T23:59:59.000Z')}}},{'$group':{'_id':'$source_ip'}})
NameError: name 'ISODate' is not defined
I want a result like this:
{ "_id" : "60.18.133.207" }
{ "_id" : "178.254.52.96" }
{ "_id" : "42.229.218.192" }
{ "_id" : "92.82.171.117" }
{ "_id" : "103.208.120.205" }
{ "_id" : "185.153.208.142" }
This is an example of the structure of my database:
> db.session.findOne()
{
"_id" : ObjectId("5786398d1f50070f31f27f7c"),
"protocol" : "epmapper",
"hpfeed_id" : ObjectId("5786398d1f50070f31f27f7b"),
"timestamp" : ISODate("2016-07-13T12:52:29.112Z"),
"source_ip" : "23.251.55.182",
"source_port" : 2713,
"destination_port" : 135,
"identifier" : "d3374f14-48f7-11e6-9e19-0050569163b4",
"honeypot" : "dionaea"
}
Please help me fix the error.
ISODate is a function in the Mongo shell, which is a JavaScript environment; it is not available in Python.
You can use dateutil to convert a string to a datetime object in Python:
import dateutil.parser
dateStr = "2016-11-11T00:00:00.000Z"
dateutil.parser.parse(dateStr)  # returns datetime.datetime(2016, 11, 11, 0, 0, tzinfo=tzutc())
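Using PyMongo, if you want to insert a datetime into MongoDB, you can simply do the following:
import pymongo
import dateutil.parser  # importing dateutil alone does not load the parser submodule
dateStr = '2016-11-11T00:00:00.000Z'
myDatetime = dateutil.parser.parse(dateStr)
client = pymongo.MongoClient()
# insert() is deprecated since PyMongo 3.0 and removed in 4.0; insert_one() replaces it
client.db.collection.insert_one({'date': myDatetime})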
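If installing dateutil is not an option, the standard library can parse this particular timestamp layout as well. The sketch below assumes the string always has the exact shape shown above (a fractional-seconds part and a trailing Z); parse_iso_z is a hypothetical helper name, not part of PyMongo or dateutil.

```python
from datetime import datetime, timezone

def parse_iso_z(text):
    """Parse a string such as '2016-11-11T00:00:00.000Z' into an aware UTC datetime."""
    # %f accepts 1 to 6 fractional digits; the literal Z is matched as plain text
    naive = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    # The trailing Z means UTC, so attach that timezone explicitly
    return naive.replace(tzinfo=timezone.utc)

start = parse_iso_z("2016-11-11T00:00:00.000Z")
end = parse_iso_z("2016-11-11T23:59:59.000Z")
```

The resulting datetime objects can be used directly as $gte / $lte bounds in a PyMongo query.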
ISODate is a JavaScript Date object. To query a range of dates using PyMongo, you need to use datetime.datetime instances, which PyMongo encodes as the BSON date type. You don't need any third-party library.
Also, you don't need the Aggregation Framework for this: you only want the unique source_ip values within a date range, which makes this a perfect job for the distinct() method.
import datetime
start = datetime.datetime(2016, 11, 11)
end = datetime.datetime(2016, 11, 11, 23, 59, 59)
db.session.distinct('source_ip', {'timestamp': {'$gte': start, '$lte': end}})
If you really need to use the aggregate() method, your $match stage must look like this:
{'$match': {'timestamp': {'$gte': start, '$lte': end}}}
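Put together, the whole aggregation might look like the sketch below. Note that PyMongo's aggregate() expects the pipeline as a single list of stages, whereas the code in the question passes the stages as separate arguments. build_pipeline is a hypothetical helper introduced only for illustration; the field names come from the question.

```python
import datetime

def build_pipeline(start, end):
    """Return the stages that list the distinct source_ip values seen between start and end."""
    return [
        # Keep only sessions whose timestamp falls inside the range
        {'$match': {'timestamp': {'$gte': start, '$lte': end}}},
        # Emit one output document per distinct source_ip
        {'$group': {'_id': '$source_ip'}},
    ]

start = datetime.datetime(2016, 11, 11)
end = datetime.datetime(2016, 11, 11, 23, 59, 59)
pipeline = build_pipeline(start, end)

# With a live connection (db as defined in the question), the call would be:
# for doc in db.session.aggregate(pipeline):
#     print(doc['_id'])
```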
I have an existing document like the one below:
"_id" : "12345",
"vals" : {
"dynamickey1" : {}
}
I need to add
"vals" : {
"dynamickey2" : {}
}
I tried this in Python 2.7 with PyMongo 2.8:
col.update({'_id': id},{'$push': {'vals': {"dynamickey2":{"values"}}}})
Error log:
pymongo.errors.OperationFailure: The field 'vals' must be an array but is of type object in document
Expected Output:
"_id" : "12345",
"vals" : {
"dynamickey1" : {},
"dynamickey2" : {}
}
Edited following question edit:
There are two options: use $set with dot notation, or manipulate the dict in Python.
The first method is more MongoDB-native and is one line of code; the second is a bit more work but gives more flexibility if your use case is more nuanced.
Method 1:
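from pymongo import MongoClient
from bson.json_util import dumps
db = MongoClient()['mydatabase']
# Seed a document whose "vals" field is an embedded document, not an array
db.mycollection.insert_one({
    "_id": "12345",
    "vals": {
        "dynamickey1": {},
    }
})
# Dot notation sets one key inside "vals" and leaves the other keys untouched
db.mycollection.update_one({'_id': '12345'}, {'$set': {'vals.dynamickey2': {}}})
print(dumps(db.mycollection.find_one({}), indent=4))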
Method 2:
from pymongo import MongoClient
from bson.json_util import dumps
db = MongoClient()['mydatabase']
db.mycollection.insert_one({
    "_id": "12345",
    "vals": {
        "dynamickey1": {},
    }
})
# Read the document, change the embedded dict in Python, then write the whole dict back
record = db.mycollection.find_one({'_id': '12345'})
vals = record['vals']
vals['dynamickey2'] = {}
db.mycollection.update_one({'_id': record['_id']}, {'$set': {'vals': vals}})
print(dumps(db.mycollection.find_one({}), indent=4))
Either way gives:
{
"_id": "12345",
"vals": {
"dynamickey1": {},
"dynamickey2": {}
}
}
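Because the key is dynamic, the dotted field path usually has to be built at runtime. The sketch below shows one way to do that; set_nested_key is a hypothetical helper, and the field name vals comes from the question. It only builds the update document, so it can be inspected without a database.

```python
def set_nested_key(parent, key, value):
    """Build a $set update that adds or overwrites one key inside an embedded document."""
    # MongoDB reads dots in the path as nesting, so a key containing '.' would target another field
    if '.' in key or key.startswith('$'):
        raise ValueError('unsupported key for a dotted path: %r' % key)
    return {'$set': {'%s.%s' % (parent, key): value}}

update = set_nested_key('vals', 'dynamickey2', {})

# With a live connection (db as in Method 1), the call would be:
# db.mycollection.update_one({'_id': '12345'}, update)
```

The original $push attempt failed because, as the error message says, $push only applies to array fields, whereas vals is an embedded document.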
Previous answer
Your expected output has an object with duplicate fields (vals); this isn't allowed.
So whatever you are trying to do, it isn't going to work.
I am a newbie to MongoDB. I want to retrieve the values of certain fields for a specified date range from MongoDB using Python. My MongoDB collection looks like this:
{
"_id" : ObjectId("5d9d7eec7c6265a42e352d6d"),
"browser" : "Chrome",
"countryCode" : "IN",
"Page" : "http://192.168.1.34/third.html",
"date" : "2019-10-09T10:32:08.438660"
}
{
"_id" : ObjectId("5d9d7eec7c6265a42e352d6e"),
"browser" : "Chrome",
"countryCode" : "IN",
"Page" : "http://192.168.1.14/fourth.html",
"date" : "2019-10-12T10:32:08.438662"
}
and so on
I retrieved the data by using the following query in the mongo shell:
db.collection_name.find({"date": {'$gte': "2019-10-09T10:32:08.438660", '$lte': "2019-10-10T10:32:08.438661"}},{}, {Page:[], _id:0})
I want to get that data using PyMongo in Python. Here's the code I tried:
from pymongo import MongoClient
import pymongo
from bson.raw_bson import RawBSONDocument
myclient = pymongo.MongoClient(
"mongodb://localhost:27017/", document_class=RawBSONDocument)
mydb = myclient['smackcoders']
mycol = mydb['logs']
from_date = "2019-10-09T10:32:08.438663"
to_date = "2019-10-12T10:32:08.438671"
for doc in mycol.find({"date": {'$gte': from_date, '$lte': to_date}}, {}, {'Page': [], '_id': 0}):
    print(doc)
It shows error:
Traceback (most recent call last):
File "temp3.py", line 20, in <module>
for doc in mycol.find({"date": {'$gte': from_date, '$lte': to_date}}, {}, {'url': [], '_id': 0}):
File "/home/paulsteven/.local/lib/python3.7/site-packages/pymongo/collection.py", line 1460, in find
return Cursor(self, *args, **kwargs)
File "/home/paulsteven/.local/lib/python3.7/site-packages/pymongo/cursor.py", line 145, in __init__
raise TypeError("skip must be an instance of int")
TypeError: skip must be an instance of int
Output Required:
["http://192.168.1.34/third.html","http://192.168.1.14/fourth.html",.....and goes on for a specified date]
I don't know how to make it work. The query works in the mongo shell, but in Python it fails. Please help me with a solution.
You've passed three parameters to find(); you probably only need two: a query and a projection. In PyMongo the third positional parameter is skip, which must be an int, and that is why it fails with that error.
The Mongo shell only takes two parameters, so it is likely ignoring the third, which is why it looks like it is working.
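A two-argument version of the call might look like the sketch below. It assumes the goal is the list of Page values shown in the question, and that the date strings all share the same ISO 8601 layout, so that plain string comparison orders them chronologically. page_query and collect_pages are hypothetical helper names.

```python
def page_query(from_date, to_date):
    """Return the (filter, projection) pair to pass to find()."""
    query = {'date': {'$gte': from_date, '$lte': to_date}}
    # Projection: include Page, exclude _id
    projection = {'Page': 1, '_id': 0}
    return query, projection

def collect_pages(docs):
    """Flatten the returned documents into a plain list of URLs."""
    return [doc['Page'] for doc in docs]

query, projection = page_query("2019-10-09T10:32:08.438663", "2019-10-12T10:32:08.438671")

# With a live connection (mycol as defined in the question), the call would be:
# pages = collect_pages(mycol.find(query, projection))
```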
I have a MongoDB collection with date fields. A document looks like this:
"
{
"_id" : ObjectId("5d019fbdace49e498de7d915"),
"created_date" : ISODate("2018-05-18T16:00:00.000Z"),
"published_date" : ISODate("2018-05-18T16:00:00.000Z")
}
"
The Mongoengine model looks like this:
class MyObject(Document):
    created_date = DateTimeField(default=datetime.datetime.utcnow)
When I get an object from database it comes to Python as:
'created_date':{'$date': 1463587200000}
I get objects by calling to_json() and then from_json(). The to_json() function converts the Python datetime to this format. The problem is that I don't know how to convert it back.
If I just try to save the object back (even if I don't touch this field; data is the parsed JSON):
doc = MyDoc(**data)
doc.save()
I have the following exception:
mongoengine.errors.ValidationError: ValidationError (MyObject:5d019aca1c9d4400008cb934) (cannot parse date "{'$date': 1463587200000}"
to_json() doesn't help to handle this date conversion. The possible solution is to convert data using to_mongo() after retrieving from DB and after that convert it to Python-object using to_dict().
I don't have any problem using the data you mentioned.
I've inserted it in mongo with:
obj = {
"_id" : ObjectId("5d019fbdace49e498de7d915"),
"created_date" : ISODate("2018-05-18T16:00:00.000Z"),
"published_date" : ISODate("2018-05-18T16:00:00.000Z")
}
db.my_doc.insertOne(obj)
Then I'm able to read it, modify and save in mongoengine:
from mongoengine import *
import datetime as dt
connect()
class MyDoc(Document):
    created_date = DateTimeField(default=dt.datetime.utcnow)
    published_date = DateTimeField(default=dt.datetime.utcnow)
doc = MyDoc.objects.first()
assert doc.created_date == dt.datetime(2018, 5, 18, 16, 0)
assert doc.published_date == dt.datetime(2018, 5, 18, 16, 0)
doc.created_date = dt.datetime.utcnow()
doc.save()
But if I insert the following:
{
"_id" : ObjectId("5d019fbdace49e498de7d916"),
"created_date" : {
"$date" : 1463587200000
},
"published_date" : ISODate("2018-05-18T16:00:00Z")
}
Then I get the ValidationError you mentioned. Mongoengine's DateTimeField is stored as an ISODate in MongoDB, so it sounds like some of your documents have a different structure (or perhaps it's your schema that should be changed). Double-check the raw documents in MongoDB and make sure they all have the same shape.
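If the {'$date': ...} value comes from a to_json() round trip, as the question describes, it has to be turned back into a datetime before it reaches the document constructor. PyMongo's bson.json_util.loads() understands this extended JSON form; the sketch below does the same conversion by hand using only the standard library, assuming the value is always an integer number of milliseconds since the Unix epoch. revive_dates is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def revive_dates(value):
    """Recursively replace {'$date': <ms since epoch>} wrappers with aware UTC datetimes."""
    if isinstance(value, dict):
        # A wrapper is a dict whose only key is '$date'
        if set(value) == {'$date'} and isinstance(value['$date'], int):
            return EPOCH + timedelta(milliseconds=value['$date'])
        return {key: revive_dates(item) for key, item in value.items()}
    if isinstance(value, list):
        return [revive_dates(item) for item in value]
    return value

data = revive_dates({'created_date': {'$date': 1463587200000}})
```

After this step, created_date holds a real datetime rather than a nested dict, which is the kind of value a DateTimeField expects.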
I am using python to query a mongo collection and retrieve a value from it:
subquery = db.partsupp.aggregate([
{"$match": {"r_name": region }},
{"$group": {
"_id" : 0,
"minim": {"$min": "$supplycost"}
}
}
])
This query works just fine and it outputs:
[{'_id': 0, 'minim': 10}]
What I am trying to do now is to get the minim value from this aggregation.
Initially what I wanted was an 'if' to check if the query had any results, like this:
if len(subselect['result']) > 0 :
    minim = subquery['result'][0]['minim']
else:
    return subselect
But doing this only gets me the following error:
Traceback (most recent call last):
File "query2.py", line 195, in <module>
pprint( list(query2('Catalonia', 1, 1)) )
File "query2.py", line 72, in query2
if len(subquery['result']) > 0 :
TypeError: 'CommandCursor' object is not subscriptable
It looks like the result of the subselect query is not subscriptable or something like that. How can I solve this?
I am using Python 3.4.3 and pymongo 3.0.1.
PyMongo 3.0.1 returns aggregation results as a cursor, which means you can't access the result with subquery['result']. To disable the cursor and force PyMongo to return a document of the form {'result': [...]} instead, use this:
subquery = db.partsupp.aggregate([
{"$match": {"r_name": region }},
{"$group": {
"_id" : 0,
"minim": {"$min": "$supplycost"}
}
}
], useCursor=False)
From PyMongo 4.0, useCursor is no longer available; use list() to convert the cursor to a list:
cursor = db.partsupp.aggregate([
{"$match": {"r_name": region }},
{"$group": {
"_id" : 0,
"minim": {"$min": "$supplycost"}
}
}
])
subquery = {'result': list(cursor)}
Since useCursor is deprecated and will be removed in PyMongo 4.0, I suggest iterating over the results:
subquery = db.partsupp.aggregate([
{"$match": {"r_name": region }},
{"$group": {
"_id" : 0,
"minim": {"$min": "$supplycost"}
}
}
])
results = [doc for doc in subquery]
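To pull out just the minim value and still handle an empty result, as the if/else in the question intended, one option is to ask the cursor for its first document with a default. first_minim is a hypothetical helper; it works on any iterator of documents, so the cursor returned by aggregate() can be passed in directly.

```python
def first_minim(cursor):
    """Return the 'minim' field of the first result, or None when nothing matched."""
    # next() with a default avoids StopIteration on an empty cursor
    first = next(cursor, None)
    return None if first is None else first['minim']

# Stand-ins for the cursor returned by db.partsupp.aggregate([...])
found = first_minim(iter([{'_id': 0, 'minim': 10}]))
empty = first_minim(iter([]))
```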
Assume that this is one item in my database:
{"_id" : ObjectID("526fdde0ef501a7b0a51270e"),
"info": "foo",
"status": true,
"subitems" : [ {"subitem_id" : ObjectID("65sfdde0ef501a7b0a51e270"),
//more},
{....}
],
//more
}
I want to find (or find_one, doesn't matter) the document(s) with "subitems.subitem_id" : xxx.
I have tried the following. All of them return an empty list.
from pymongo import MongoClient,errors
from bson.objectid import ObjectId
id = '65sfdde0ef501a7b0a51e270'
db.col.find({"subitems.subitem_id" : id } ) #obviously wrong
db.col.find({"subitems.subitem_id" : ObjectId(id) })
db.col.find({"subitems.subitem_id" : {"$oid":id} })
db.col.find({"subitems.subitem_id.$oid" : id })
db.col.find({"subitems.$.subitem_id" : ObjectId(id) })
In mongoshell this one works however:
find({"subitems.subitem_id" : { "$oid" : "65sfdde0ef501a7b0a51e270" } })
The literal 65sfdde0ef501a7b0a51e270 is not hexadecimal (it contains an "s") and is therefore not a valid ObjectId.
Also, id is a Python built-in function. Avoid shadowing it.
Finally, you execute a find but never iterate over the result, so you do not see any output. Remember that PyMongo cursors are lazy.
Try this.
from pymongo import MongoClient
from bson.objectid import ObjectId
db = MongoClient().database
oid = '65cfdde0ef501a7b0a51e270'
x = db.col.find({"subitems.subitem_id" : ObjectId(oid)})
print(list(x))
Notice I adjusted oid to a valid hexadecimal string.
Same query in the Mongo JavaScript shell.
db.col.find({"subitems.subitem_id" : new ObjectId("65cfdde0ef501a7b0a51e270")})
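A quick way to catch a malformed id before querying is to check that it is exactly 24 hexadecimal characters. ObjectId.is_valid() in bson.objectid performs this kind of check when PyMongo is installed; the sketch below is a standard-library approximation for strings only, and looks_like_object_id is a hypothetical helper name.

```python
import re

_HEX24 = re.compile(r'[0-9a-fA-F]{24}')

def looks_like_object_id(text):
    """Return True when text is a 24-character hexadecimal string."""
    return isinstance(text, str) and _HEX24.fullmatch(text) is not None

bad = looks_like_object_id('65sfdde0ef501a7b0a51e270')   # contains 's', which is not a hex digit
good = looks_like_object_id('65cfdde0ef501a7b0a51e270')  # the corrected value used above
```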
Double-checked. The right answer is db.col.find({"subitems.subitem_id" : ObjectId(id)})
Be aware that this query returns the full document, not just the matching element of the sub-array.
Mongo shell:
a = ObjectId("5273e7d989800e7f4959526a")
db.m.insert({"subitems": [{"subitem_id":a},
{"subitem_id":ObjectId()}]})
db.m.insert({"subitems": [{"subitem_id":ObjectId()},
{"subitem_id":ObjectId()}]})
db.m.find({"subitems.subitem_id" : a })
>>> { "_id" : ObjectId("5273e8e189800e7f4959526d"),
"subitems" :
[
{"subitem_id" : ObjectId("5273e7d989800e7f4959526a") },
{"subitem_id" : ObjectId("5273e8e189800e7f4959526c")}
]}