pymongo shows skip must be an instance of int - error - python

I am new to MongoDB. I want to retrieve the values of certain fields for a specified date range from MongoDB using Python. My MongoDB collection looks like this:
{
"_id" : ObjectId("5d9d7eec7c6265a42e352d6d"),
"browser" : "Chrome",
"countryCode" : "IN",
"Page" : "http://192.168.1.34/third.html",
"date" : "2019-10-09T10:32:08.438660"
}
{
"_id" : ObjectId("5d9d7eec7c6265a42e352d6e"),
"browser" : "Chrome",
"countryCode" : "IN",
"Page" : "http://192.168.1.14/fourth.html",
"date" : "2019-10-12T10:32:08.438662"
}
and so on
I retrieved the data from mongodb by using the following query in mongodb
db.collection_name.find({"date": {'$gte': "2019-10-09T10:32:08.438660", '$lte': "2019-10-10T10:32:08.438661"}},{}, {Page:[], _id:0})
I want to get that data using pymongo in python. Here's the Code I tried,
from pymongo import MongoClient
import pymongo
from bson.raw_bson import RawBSONDocument
myclient = pymongo.MongoClient(
"mongodb://localhost:27017/", document_class=RawBSONDocument)
mydb = myclient['smackcoders']
mycol = mydb['logs']
from_date = "2019-10-09T10:32:08.438663"
to_date = "2019-10-12T10:32:08.438671"
for doc in mycol.find({"date": {'$gte': from_date, '$lte': to_date}}, {}, {'Page': [], '_id': 0}):
    print(doc)
It shows error:
Traceback (most recent call last):
File "temp3.py", line 20, in <module>
for doc in mycol.find({"date": {'$gte': from_date, '$lte': to_date}}, {}, {'url': [], '_id': 0}):
File "/home/paulsteven/.local/lib/python3.7/site-packages/pymongo/collection.py", line 1460, in find
return Cursor(self, *args, **kwargs)
File "/home/paulsteven/.local/lib/python3.7/site-packages/pymongo/cursor.py", line 145, in __init__
raise TypeError("skip must be an instance of int")
TypeError: skip must be an instance of int
Output Required:
["http://192.168.1.34/third.html","http://192.168.1.14/fourth.html",.....and goes on for a specified date]
I don't know how to make it work. The query works in the mongo shell, but it fails in Python. Please help me with a solution.

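You've passed three positional arguments to find(), but you probably only need two: a query filter and a projection. In PyMongo the third positional parameter is skip, which is why the call fails with that error.
The mongo shell's find() is documented with only a query and a projection, so it is likely ignoring the third argument, which is why the query looks like it is working there. Note that in your shell query the empty {} sits in the projection position, so the {Page:[], _id:0} part has no effect.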
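A minimal sketch of the two-argument form, assuming the collection passed in is the PyMongo collection from the question (mycol). The helper name pages_between is made up for illustration and is not part of PyMongo. The projection {"Page": 1, "_id": 0} asks for the Page field only, and the list comprehension builds the flat list the question asks for. Because the dates are stored as ISO-8601 strings, the range test is a plain string comparison.

```python
def pages_between(collection, from_date, to_date):
    """Return the Page value of every document whose date is in range.

    `collection` is assumed to be a pymongo Collection; this helper is
    hypothetical and only illustrates the filter + projection call.
    """
    query = {"date": {"$gte": from_date, "$lte": to_date}}
    projection = {"Page": 1, "_id": 0}  # second argument: fields to return
    # Skip any document that happens to lack a Page field.
    return [doc["Page"] for doc in collection.find(query, projection)
            if "Page" in doc]
```

With the names from the question, the call would look like pages_between(mycol, from_date, to_date).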

Related

How to append a new array of values to an existing array document in mongodb using pymongo?

An existing collection like as below:
"_id" : "12345",
"vals" : {
"dynamickey1" : {}
}
I need to add
"vals" : {
"dynamickey2" : {}
}
I have tried in python 2.7 with pymongo 2.8:
col.update({'_id': id)},{'$push': {'vals': {"dynamickey2":{"values"}}}})
Error log:
pymongo.errors.OperationFailure: The field 'vals' must be an array but is of type object in document
Expected Output:
"_id" : "12345",
"vals" : {
"dynamickey1" : {},
"dynamickey2" : {}
}
Edited following question edit:
Two options: use $set with dot notation, or use Python dict manipulation.
The first method is more MongoDB-native and is one line of code; the second is a bit more work but gives more flexibility if your use case is more nuanced.
Method 1:
from pymongo import MongoClient
from bson.json_util import dumps
db = MongoClient()['mydatabase']
db.mycollection.insert_one({
"_id": "12345",
"vals": {
"dynamickey1": {},
}
})
db.mycollection.update_one({'_id': '12345'},{'$set': {'vals.dynamickey2':{}}})
print(dumps(db.mycollection.find_one({}), indent=4))
Method 2:
from pymongo import MongoClient
from bson.json_util import dumps
db = MongoClient()['mydatabase']
db.mycollection.insert_one({
"_id": "12345",
"vals": {
"dynamickey1": {},
}
})
record = db.mycollection.find_one({'_id': '12345'})
vals = record['vals']
vals['dynamickey2'] = {}
record = db.mycollection.update_one({'_id': record['_id']}, {'$set': {'vals': vals}})
print(dumps(db.mycollection.find_one({}), indent=4))
Either way gives:
{
"_id": "12345",
"vals": {
"dynamickey1": {},
"dynamickey2": {}
}
}
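Since the key is dynamic, the dotted path used in Method 1 usually has to be built at runtime. A small sketch of that step, where set_nested_key is a hypothetical helper and not part of PyMongo; it only builds the update document that would then be passed to update_one:

```python
def set_nested_key(parent, key, value):
    """Build a $set update that adds `key` under the embedded document `parent`."""
    if "." in key or key.startswith("$"):
        # Dots and a leading $ have special meaning in MongoDB field paths.
        raise ValueError("unsupported key: %r" % key)
    return {"$set": {"%s.%s" % (parent, key): value}}

update = set_nested_key("vals", "dynamickey2", {})
print(update)  # {'$set': {'vals.dynamickey2': {}}}
```

The result can be used as db.mycollection.update_one({'_id': '12345'}, update).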
Previous answer
Your expected output has an object with duplicate fields (vals); this isn't allowed.
So whatever you are trying to do, it isn't going to work.

Nested complex query to MongoDB with Python

I have the following document in my MongoDB:
{
"_id" : ObjectId("5a672fe5c9afd19e04d011ca"),
"data" : [
{
"name" : "Smith",
"age" : 10,
"spouse" : "Lopez"
},
{
"name" : "Davis",
"age" : 10,
"spouse" : "Peter"
},
{
"name" : "Clark",
"age" : 10
}
],
"header" : {
"sourece" : "http://www.some.com/api/json/data?department=security&gender=female",
"fetch_time" : "2018-01-23T09:35:51"
}
}
Now I want to:
- Get all the data under the "data" node.
- Get all the people who have a "spouse" node.
The following code doesn't work:
from pymongo import MongoClient
from pprint import pprint
client = MongoClient('mongodb://localhost:27017/')
db = client['test']
coll = db['test_2']
print('All content:')
for item in coll.find():
    pprint(item)
print('-'*20)
print("Content under 'data':")
for item in coll.find({"data": "$all"}):
    pprint(item)
for item in coll.find({"data": []}):
    pprint(item)
for item in coll.find({"data": ["$all"]}):
    pprint(item)
print('-'*20)
print("People who have 'spouse':")
for item in coll.find({"data": [{"spouse":"$all"}]}):
    pprint(item)
The above code outputs the following:
All content:
{u'_id': ObjectId('5a672fe5c9afd19e04d011ca'),
u'data': [{u'age': 10, u'name': u'Smith', u'spouse': u'Lopez'},
{u'age': 10, u'name': u'Davis', u'spouse': u'Peter'},
{u'age': 10, u'name': u'Clark'}],
u'header': {u'fetch_time': u'2018-01-23T09:35:51',
u'sourece': u'http://www.some.com/api/json/data?department=security&gender=female'}}
--------------------
Content under 'data':
--------------------
People who have 'spouse':
I can get all the content from my MongoDB, which means the data is there in the database. But when I run the subsequent code, nothing was printed. I tried different ways but none of them work.
Moreover, is there any document like, say Oracle SQL reference.pdf stating the query statement grammar with strict structure specification so I can build any query statement based on it?
There is no need to fetch the whole documents.
First part (regular query):
- Use a projection with an empty query filter to return only the "data" field.
Something like coll.find({},{"data": 1}).
Second part (aggregation query): use $match to keep the documents where "data" has at least one array element with a spouse field, then $project with $filter, using a $type expression to drop the array elements where spouse is missing.
Something like
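coll.aggregate([
    # Keep documents where at least one element of "data" has a spouse field.
    {"$match": {"data.spouse": {"$exists": True}}},
    {"$project": {
        "data": {
            # Keep only the array elements whose spouse field is present.
            "$filter": {
                "input": "$data",
                "as": "result",
                "cond": {"$ne": [{"$type": "$$result.spouse"}, "missing"]}
            }
        }
    }}
])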
Also note that query operators are different from aggregation comparison operators.
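To make the intent of that selection concrete, here is a plain-Python sketch of the same filtering applied to the sample document from the question. It does not talk to MongoDB, and people_with_spouse is a name made up for illustration.

```python
doc = {
    "data": [
        {"name": "Smith", "age": 10, "spouse": "Lopez"},
        {"name": "Davis", "age": 10, "spouse": "Peter"},
        {"name": "Clark", "age": 10},
    ]
}

def people_with_spouse(document):
    """Mimic the $filter step: keep array elements that have a spouse field."""
    return [person for person in document.get("data", []) if "spouse" in person]

names = [person["name"] for person in people_with_spouse(doc)]
print(names)  # ['Smith', 'Davis']
```

The membership test checks only that the field is present, which matches the "not missing" condition rather than testing for a particular value.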

pymongo: name 'ISODate' is not defined

I have a problem when I try to select data in MongoDB with pymongo. This is my code:
import pymongo
from pymongo import MongoClient
import sys
from datetime import datetime
try:
    conn=pymongo.MongoClient('10.33.109.228',27017)
    db=conn.mnemosyne
    data_ip=db.session.aggregate({'$match':{'timestamp':{'$gte': ISODate('2016-11-11T00:00:00.000Z'),'$lte': ISODate('2016-11-11T23:59:59.000Z')}}},{'$group':{'_id':'$source_ip'}})
    for f in data_ip:
        print f['_id']
except pymongo.errors.ConnectionFailure, e:
    print "Could not connect to MongoDB: %s" % e
When I execute it, I get this error:
Traceback (most recent call last):
File "test.py", line 9, in <module>
data_ip=db.session.aggregate({'$match':{'timestamp':{'$gte': ISODate('2016-11-11T00:00:00.000Z'),'$lte': ISODate('2016-11-11T23:59:59.000Z')}}},{'$group':{'_id':'$source_ip'}})
NameError: name 'ISODate' is not defined
I want the result like this:
{ "_id" : "60.18.133.207" }
{ "_id" : "178.254.52.96" }
{ "_id" : "42.229.218.192" }
{ "_id" : "92.82.171.117" }
{ "_id" : "103.208.120.205" }
{ "_id" : "185.153.208.142" }
This is an example document from my database:
> db.session.findOne()
{
"_id" : ObjectId("5786398d1f50070f31f27f7c"),
"protocol" : "epmapper",
"hpfeed_id" : ObjectId("5786398d1f50070f31f27f7b"),
"timestamp" : ISODate("2016-07-13T12:52:29.112Z"),
"source_ip" : "23.251.55.182",
"source_port" : 2713,
"destination_port" : 135,
"identifier" : "d3374f14-48f7-11e6-9e19-0050569163b4",
"honeypot" : "dionaea"
}
Please help me to fix the error
ISODate is a function in the mongo shell, which is a JavaScript environment; it is not available in Python.
You can use dateutil to convert a string to a datetime object in Python:
import dateutil.parser
dateStr = "2016-11-11T00:00:00.000Z"
dateutil.parser.parse(dateStr)  # returns datetime.datetime(2016, 11, 11, 0, 0, tzinfo=tzutc())
Using PyMongo, if you want to insert datetime in MongoDB you can simply do the following:
import pymongo
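import dateutil.parser  # the parser submodule must be imported explicitly
dateStr = '2016-11-11T00:00:00.000Z'
myDatetime = dateutil.parser.parse(dateStr)
client = pymongo.MongoClient()
client.db.collection.insert_one({'date': myDatetime})  # insert() is deprecated since PyMongo 3.0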
In the mongo shell, ISODate returns a JavaScript Date object. To query a date range with PyMongo, use datetime.datetime instances, which PyMongo encodes as the BSON date type. You don't need any third-party library.
Also, you don't need the Aggregation Framework here: you are grouping on source_ip only to list its unique values, which is a perfect job for the distinct() method.
import datetime
start = datetime.datetime(2016, 11, 11)
end = datetime.datetime(2016, 11, 11, 23, 59, 59)
db.session.distinct('source_ip', {'timestamp': {'$gte': start, '$lte': end}})
If you really need to use the aggregate() method, your $match stage must look like this:
{'$match': {'timestamp': {'$gte': start, '$lte': end}}}
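If the dates arrive as strings in the shell's format, the standard library can parse them. The sketch below builds the full pipeline from the question as a list of stages. The helper name parse_shell_date is made up for illustration, and it assumes the strings always end in a literal Z with fractional seconds. PyMongo treats naive datetime objects as UTC when encoding them.

```python
from datetime import datetime

def parse_shell_date(text):
    """Parse a string such as '2016-11-11T00:00:00.000Z' into a naive datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")

start = parse_shell_date("2016-11-11T00:00:00.000Z")
end = parse_shell_date("2016-11-11T23:59:59.000Z")

# The stages go in a single list, not as separate arguments.
pipeline = [
    {"$match": {"timestamp": {"$gte": start, "$lte": end}}},
    {"$group": {"_id": "$source_ip"}},
]
```

The pipeline would then be passed as db.session.aggregate(pipeline).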

Get result from mongo aggregation using pymongo 3.0

I am using python to query a mongo collection and retrieve a value from it:
subquery = db.partsupp.aggregate([
{"$match": {"r_name": region }},
{"$group": {
"_id" : 0,
"minim": {"$min": "$supplycost"}
}
}
])
This query works just fine and it outputs:
[{'_id': 0, 'minim': 10}]
What I am trying to do now is to get the minim value from this aggregation.
Initially what I wanted was an 'if' to check if the query had any results, like this:
if len(subselect['result']) > 0 :
    minim = subquery['result'][0]['minim']
else:
    return subselect
But doing this only gets me the following error:
Traceback (most recent call last):
File "query2.py", line 195, in <module>
pprint( list(query2('Catalonia', 1, 1)) )
File "query2.py", line 72, in query2
if len(subquery['result']) > 0 :
TypeError: 'CommandCursor' object is not subscriptable
It looks like the result from the subselect query is not iterable or something like that, how can I solve this?
I am using Python 3.4.3 and pymongo 3.0.1.
PyMongo 3.0.1 returns aggregation results as a cursor, which means you can't access the result with subquery['result']. To disable the cursor and force PyMongo to return a document with {'result':{...}} instead, use this:
subquery = db.partsupp.aggregate([
{"$match": {"r_name": region }},
{"$group": {
"_id" : 0,
"minim": {"$min": "$supplycost"}
}
}
], useCursor=False)
From PyMongo 4.0, useCursor is no longer available; use list() to convert the cursor to a list:
cursor = db.partsupp.aggregate([
{"$match": {"r_name": region }},
{"$group": {
"_id" : 0,
"minim": {"$min": "$supplycost"}
}
}
])
subquery = {'result': list(cursor)}
Since useCursor is deprecated and will be removed in PyMongo 4.0, I suggest iterating over the results:
subquery = db.partsupp.aggregate([
{"$match": {"r_name": region }},
{"$group": {
"_id" : 0,
"minim": {"$min": "$supplycost"}
}
}
])
results = [doc for doc in subquery]
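For the original goal of reading the single minim value, or detecting that the query matched nothing, the first document can be pulled from the cursor directly. A sketch, where first_minim is a hypothetical helper that accepts any iterable of result documents, including the cursor returned by aggregate():

```python
def first_minim(cursor):
    """Return the 'minim' value of the first result, or None if there is none."""
    first = next(iter(cursor), None)  # None when the aggregation matched nothing
    return None if first is None else first["minim"]
```

With the query above this would be called as minim = first_minim(subquery), followed by a check for None in place of the len() test.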

Pymongo - Mod on _id not allowed

I have a Mongo Collection that I need to update, and I'm trying to use the collection.update command to no avail.
Code below:
import pymongo
from pymongo import MongoClient
client = MongoClient()
db = client.SensorDB
sensors = db.Sensor
for sensor in sensors.find():
    lat = sensor['location']['latitude']
    lng = sensor['location']['longitude']
    sensor['location'] = {
        "type" : "Feature",
        "geometry" : {
            "type" : "Point",
            "coordinates" : [lat ,lng]
        },
        "properties": {
            "name": sensor['name']
        }
    }
    sensors.update({'webid': sensor['webid']} , {"$set": sensor}, upsert=True)
However, running this gets me the following:
Traceback (most recent call last):
File "purgeDB.py", line 21, in <module>
cameras.update({'webid': sensor['webid']} , {"$set": sensor}, upsert=True)
File "C:\Anaconda\lib\site-packages\pymongo\collection.py", line 561, in update
check_keys, self.uuid_subtype), safe)
File "C:\Anaconda\lib\site-packages\pymongo\mongo_client.py", line 1118, in _send_message
rv = self.__check_response_to_last_error(response, command)
File "C:\Anaconda\lib\site-packages\pymongo\mongo_client.py", line 1060, in __check_response_to_last_error
raise OperationFailure(details["err"], code, result)
pymongo.errors.OperationFailure: Mod on _id not allowed
Change this line:
for sensor in sensors.find():
to this:
for sensor in sensors.find({}, {'_id': 0}):
What this does is prevent Mongo from returning the _id field, since you aren't using it, and it's causing your problem later in your update() call since you cannot "update" _id.
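If the full document is already in memory, for example because _id is needed elsewhere, a similar effect can be had by dropping _id just before building the $set document. A sketch, with without_id as a hypothetical helper and a made-up sample document:

```python
def without_id(document):
    """Return a shallow copy of `document` with the immutable _id field removed."""
    return {key: value for key, value in document.items() if key != "_id"}

sensor = {"_id": 1, "webid": "abc", "name": "cam-1"}
print(without_id(sensor))  # {'webid': 'abc', 'name': 'cam-1'}
```

The update call would then pass {"$set": without_id(sensor)} as its second argument.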
An even better solution (Only write the data that is needed)
for sensor in sensors.find():
    lat = sensor['location']['latitude']
    lng = sensor['location']['longitude']
    location = {
        "type" : "Feature",
        "geometry" : {
            "type" : "Point",
            "coordinates" : [lat ,lng]
        },
        "properties": {
            "name": sensor['name']
        }
    }
    sensors.update({'webid': sensor['webid']} , {"$set": {'location': location}})
Edit:
As mentioned by Loïc Faure-Lacroix, you also do not need the upsert flag in your case - your code in this case is always updating, and never inserting.
Edit2:
Surrounded _id in quotes for first solution.
