Upsert and Multi flag in pymongo - python

I am working on pymongo and this is my document:
{
"_id": ObjectId("51211b57f07ddaa377000000"),
"assignments": {
"0": {
"0": {
"_id": ObjectId("5120dd7400a4453d58a0d0ec")
},
"1": {
"_id": ObjectId("5120dd8e00a4453d58a0d0ed")
},
"2": {
"_id": ObjectId("5120ddad00a4453d58a0d0ee")
}
}
},
"password": "my_passwd",
"username": "john"
}
I would like to unset the "assignments" property of all such docs.
I was able to achieve this on the mongo shell by doing:
db.users.update({}, {$unset: {"assignments": 1}}, false, true)
i.e., I passed the upsert and multi flags as the last two parameters to the update function on the users collection.
However I did this with pymongo:
db.users.update({}, {"$unset": {"assignments": 1}}, False, True)
But the python interpreter threw an error as follows:
File "notes/assignment.py", line 34, in <module>
db.users.update({}, {"$unset": {"assignments": 1}}, False, True)
File "/usr/local/lib/python2.7/dist-packages/pymongo/collection.py", line 481, in update
check_keys, self.__uuid_subtype), safe)
File "/usr/local/lib/python2.7/dist-packages/pymongo/mongo_client.py", line 852, in _send_message
rv = self.__check_response_to_last_error(response)
File "/usr/local/lib/python2.7/dist-packages/pymongo/mongo_client.py", line 795, in __check_response_to_last_error
raise OperationFailure(details["err"], details["code"])
pymongo.errors.OperationFailure: Modifiers and non-modifiers cannot be mixed
Where am I going wrong?

The problem is that the two flags you are passing in aren't being read as upsert and multi. Based on the documentation of PyMongo's Collection.update (found here), it looks like you are passing values for the upsert and manipulate options instead, although I am not certain.
All you have to do to solve this is use keyword (named) arguments, one of Python's most useful features. By naming the options you pass to update, you make the code clearer and make sure accidents like this don't happen.
In this case, we want to pass the options upsert=False and multi=True.
db.users.update({}, { "$unset": { "assignments": 1 } }, upsert=False, multi=True)
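To illustrate how positional flags can land in the wrong slot, here is a minimal sketch. The `update` function below is a stand-in written for this example, not PyMongo's method; its parameter order is an assumption modeled on the answer's guess that a `manipulate` option sits between `upsert` and `multi`.

```python
# Stand-in for illustration only: this is NOT PyMongo's Collection.update.
# The parameter order is an assumption taken from the answer above.
def update(spec, document, upsert=False, manipulate=False, safe=None, multi=False):
    """Report how the arguments were bound instead of touching a database."""
    return {"upsert": upsert, "manipulate": manipulate, "multi": multi}

# Positional call: the fourth value is bound to `manipulate`, not `multi`.
positional = update({}, {"$unset": {"assignments": 1}}, False, True)

# Keyword call: each flag goes exactly where its name says.
keyword = update({}, {"$unset": {"assignments": 1}}, upsert=False, multi=True)

print(positional)
print(keyword)
```

With keyword arguments the call no longer depends on the order of the optional parameters, which is the point the answer makes.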

Related

MongoDB using $cond with Update ($inc)

Is there any way to use $cond along with ($set, $inc, ...) operators in update? (MongoDB 4.2)
I want to increment a field in my document by "myDataInt" using $inc if a condition is true, and otherwise leave it as it is:
db.mycoll.update(
    {"_id" : "5e9e5da03da783817d231dc4"},
    {"$inc" : {
        "my_data_sum" : {
            "$cond" : [
                {"$ne" : ["snapshot_time", new_snapshot_time]},
                myDataInt,
                0
            ]
        }
    }},
    upsert=True, multi=False
)
However, this gives an error in pymongo:
raise WriteError(error.get("errmsg"), error.get("code"), error)
pymongo.errors.WriteError: The dollar ($) prefixed field '$cond' in 'my_data_sum.$cond' is not valid for storage.
Any idea to avoid using find() before update in this case?
Update:
If I use the approach that Joe has mentioned, an exception will be raised in PyMongo (v3.10.1) due to using 'list' as a parameter in update_many() instead of 'dict':
from pymongo import MongoClient
db = MongoClient()['mydb']
db.mycoll.update_many(
    {"_id" : "5e9e5da03da783817d231dc4"},
    [{"$set" : {
        "my_data_sum" : {
            "$sum": [
                "$my_data_sum",
                {"$cond" : [
                    {"$ne" : ["$snapshot_time", new_snapshot_time]},
                    myDataInt,
                    0
                ]}
            ]
        }
    }}],
    upsert=True
)
That ends up with this error:
File "/usr/local/lib64/python3.6/site-packages/pymongo/collection.py", line 1076, in update_many session=session),
File "/usr/local/lib64/python3.6/site-packages/pymongo/collection.py", line 856, in _update_retryable _update, session)
File "/usr/local/lib64/python3.6/site-packages/pymongo/mongo_client.py", line 1491, in _retryable_write return self._retry_with_session(retryable, func, s, None)
File "/usr/local/lib64/python3.6/site-packages/pymongo/mongo_client.py", line 1384, in _retry_with_session return func(session, sock_info, retryable)
File "/usr/local/lib64/python3.6/site-packages/pymongo/collection.py", line 852, in _update retryable_write=retryable_write)
File "/usr/local/lib64/python3.6/site-packages/pymongo/collection.py", line 823, in _update _check_write_command_response(result)
File "/usr/local/lib64/python3.6/site-packages/pymongo/helpers.py", line 221, in _check_write_command_response _raise_last_write_error(write_errors)
File "/usr/local/lib64/python3.6/site-packages/pymongo/helpers.py", line 203, in _raise_last_write_error raise WriteError(error.get("errmsg"), error.get("code"), error)
pymongo.errors.WriteError: Modifiers operate on fields but we found type array instead. For example: {$mod: {<field>: ...}} not {$set: [ { $set: { my_data_sum: { $sum: [ "$my_data_sum", { $cond: [ { $ne: [ "$snapshot_time", 1586910283 ] }, 1073741824, 0 ] } ] } } } ]}
If you are using MongoDB 4.2, you can use aggregation operators with updates. $inc is not an aggregation operator, but $sum is. To specify a pipeline, pass an array as the second argument to update:
db.coll.update(
{"_id" : "5e9e5da03da783817d231dc4"},
[{"$set" : {
"my_data_sum" : {
"$sum": [
"$my_data_sum",
{"$cond" : [
{"$ne" : ["$snapshot_time", new_snapshot_time]},
myDataInt,
0
]}
]
}
}}],
{upsert:true, multi:false}
)
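As a rough sketch of the same idea on the Python side, the pipeline can be built as a plain list of dicts. The helper name `conditional_sum_pipeline` and its parameters are made up for this example; the two sample numbers are the ones visible in the error message quoted in the question. Field references inside aggregation expressions need a `$` prefix, which the helper adds.

```python
# Hypothetical helper (not part of any library): builds the update pipeline
# as plain Python data so it can be inspected before it is sent anywhere.
def conditional_sum_pipeline(target_field, guard_field, new_guard_value, amount):
    """Add `amount` to target_field only when guard_field != new_guard_value."""
    return [{
        "$set": {
            target_field: {
                "$sum": [
                    "$" + target_field,  # current value of the field
                    {"$cond": [
                        {"$ne": ["$" + guard_field, new_guard_value]},
                        amount,  # added when the condition holds
                        0,       # otherwise the sum is unchanged
                    ]},
                ]
            }
        }
    }]

pipeline = conditional_sum_pipeline("my_data_sum", "snapshot_time", 1586910283, 1073741824)
print(pipeline)
```

The resulting list is what goes in the second argument of the update call (or in the "u" field of an update command); whether a particular driver version accepts a list there is not checked by this sketch.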
After spending some time searching online, it seemed to me that the update_many(), update_one(), and update() methods of the Collection object in PyMongo did not accept a list as the update parameter, which is what the aggregation pipeline form of the update operation in MongoDB 4.2+ requires. (At least, I could not get this to work with PyMongo v3.10.)
However, I could use the command method of the Database object in PyMongo, which wraps MongoDB's runCommand, and it worked just fine for me:
from pymongo import MongoClient
db = MongoClient()['mydb']
result = db.command(
{
"update" : "mycoll",
"updates" : [{
"q" : {"_id" : "5e9e5da03da783817d231dc4"},
"u" : [
{"$set" : {
"my_data_sum" : {
"$sum": [
"$my_data_sum",
{"$cond" : [
{"$ne" : ["$snapshot_time", new_snapshot_time]},
myDataInt,
0
]}
]
}
}}
],
"upsert" : True,
"multi" : True
}],
"ordered": False
}
)
The command method of the Database object takes a dict describing the whole command as its first argument, and the aggregation pipeline list can be included inside that dict (q is the update query, and u defines the fields to be updated).
result is the acknowledgement dictionary returned by MongoDB, which can contain keys such as 'nModified', 'upserted', and 'writeErrors'.
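A small sketch of how such a result dictionary might be inspected follows. The helper `summarize_update_result` and the sample dictionaries are hypothetical and written for this example; only the key names come from the update command's reply.

```python
# Hypothetical helper for illustration; it works on a plain dict and does
# not talk to a server.
def summarize_update_result(result):
    """Collect the interesting parts of an update command reply."""
    errors = result.get("writeErrors", [])
    return {
        "modified": result.get("nModified", 0),
        "upserted_ids": [entry.get("_id") for entry in result.get("upserted", [])],
        "failed": bool(errors),
        "messages": [error.get("errmsg") for error in errors],
    }

# Made-up sample replies, shaped like an update command acknowledgement.
ok_reply = {"n": 1, "nModified": 1, "ok": 1.0}
bad_reply = {"n": 0, "nModified": 0, "ok": 1.0,
             "writeErrors": [{"index": 0, "code": 9, "errmsg": "example failure"}]}

print(summarize_update_result(ok_reply))
print(summarize_update_result(bad_reply))
```

Because 'upserted' and 'writeErrors' are only present when they apply, the sketch reads them with .get() and a default rather than indexing directly.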
https://mongoplayground.net/p/1AklFKuhFi6
[
{
"id": 1,
"like": 3
},
{
"id": 2,
"like": 1
}
]
Let value = 1. The pipeline below subtracts value from like, so if you want to increment instead, negate it first:
value = -1 * value
db.collection.aggregate([
{
"$match": {
"id": 1
}
},
{
"$set": {
"count": {
$cond: {
if: {
$gt: [
"$like",
0
]
},
then: {
"$subtract": [
"$like",
value
]
},
else: 0
}
}
}
}
])

Getting KeyError when parsing JSON in Python for following response

TL;DR:
Confused on how to parse following JSON response and get the value of [status of 12345 of dynamicValue_GGG of payload] in this case.
Full question:
I get the following as (sanitized) response upon hitting a REST API via Python code below:
response = requests.request("POST", url, data=payload,
headers=headers).json()
{
"payload": {
"name": "asdasdasdasd",
"dynamicValue_GGG": {
"12345": {
"model": "asad",
"status": "active",
"subModel1": {
"dynamicValue_67890": {
"model": "qwerty",
"status": "active"
},
"subModel2": {
"dynamicValue_33445": {
"model": "gghjjj",
"status": "active"
},
"subModel3": {
"dynamicValue_66778": {
"model": "tyutyu",
"status": "active"
}
}
}
},
"date": "2016-02-04"
},
"design": "asdasdWWWsaasdasQ"
}
If I do a type(response['payload']), it gives me 'dict'.
Now, I'm trying to parse the response above and fetch certain keys and values out of it. The problem is that I'm not able to iterate through using "index" and rather have to specify the "key", but then the response has certain "keys" that are dynamically generated and sent over. For instance, the keys called "dynamicValue_GGG", "dynamicValue_66778" etc are not static unlike the "status" key.
I can successfully parse by mentioning like:
print response['payload']['dynamicValue_GGG']['12345']['status']
in which case I get the expected output = 'active'.
However, since I have no control over 'dynamicValue_GGG', it would work only if I could specify something like this instead:
print response['payload'][0][0]['status']
But the above line gives me the error "KeyError: 0" when the Python code is executed.
Is there some way in which I can use both keys and indexes together in this case?
A Python dictionary cannot be indexed by position (and before Python 3.7 the order of its items was not even guaranteed), so you cannot use indexing here. You'll have to iterate over all elements, potentially recursively, and test whether each one is the thing you're looking for. For example:
def find_submodels(your_dict):
    for item_key, item_values in your_dict.items():
        # Only nested dictionaries can hold a 'status' key or further sub-models.
        if isinstance(item_values, dict):
            if 'status' in item_values:
                print("%s %s" % (item_key, item_values['status']))
            find_submodels(item_values)

find_submodels(your_dict)
Which would output:
12345 active
dynamicValue_67890 active
dynamicValue_66778 active
dynamicValue_33445 active
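If the statuses are needed as data rather than printed, the same recursive walk can be written as a generator. This is a sketch for Python 3; the name `iter_statuses` and the trimmed-down sample response are made up for the example.

```python
def iter_statuses(node):
    """Yield (key, status) for every nested dict that has a 'status' key."""
    for key, value in node.items():
        if isinstance(value, dict):
            if "status" in value:
                yield key, value["status"]
            # Keep descending so dynamically named sub-models are found too.
            yield from iter_statuses(value)

# Shortened sample shaped like the response in the question.
sample = {
    "payload": {
        "name": "asdasdasdasd",
        "dynamicValue_GGG": {
            "12345": {
                "model": "asad",
                "status": "active",
                "subModel1": {
                    "dynamicValue_67890": {"model": "qwerty", "status": "active"}
                },
            }
        },
    }
}

found = list(iter_statuses(sample))
print(found)
```

None of the dynamic key names appear in the function, so it keeps working when the server changes them.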

export json data to csv from mongodb

I have a problem with a missing field name in a Python script that exports data from MongoDB to CSV. The type field exists in the first record, but it does not appear in the rest of the records. How can I write the Python script so that it gives a null value for the type field if it does not exist?
A sample of the MongoDB collection:
"stages": [
{
"interview": false,
"hmNotification": false,
"hmStage": false,
"type": "new",
"isEditable": false,
"order": 0,
"name": {
"en": "New"
},
"stageId": "51d1a2f4c0d9887b214f3694"
},
{
"interview": false,
"hmNotification": true,
"isEditable": true,
"order": 1,
"hmStage": true,
"name": {
"en": "Pre-Screen"
},
"stageId": "51f0078d7297363f62059699"
},
{
"interview": false,
"hmNotification": false,
"hmStage": false,
"isEditable": true,
"order": 2,
"name": {
"en": "Phone Screen"
},
"stageId": "51d1a326c0d9887721778eae"
}]
A sample of the Python script:
import csv
cursor = db.workflows.find( {}, {'_id': 1, 'stages.interview': 1, 'stages.hmNotification': 1, 'stages.hmStage': 1, 'stages.type':1, 'stages.isEditable':1, 'stages.order':1,
                                 'stages.name':1, 'stages.stageId':1 })
flattened_records = []
for stages_record in cursor:
    stages_record_id = stages_record['_id']
    for stage_record in stages_record['stages']:
        flattened_record = {
            '_id': stages_record_id,
            'stages.interview': stage_record['interview'],
            'stages.hmNotification': stage_record['hmNotification'],
            'stages.hmStage': stage_record['hmStage'],
            'stages.type': stage_record['type'],
            'stages.isEditable': stage_record['isEditable'],
            'stages.order': stage_record['order'],
            'stages.name': stage_record['name'],
            'stages.stageId': stage_record['stageId']}
        flattened_records.append(flattened_record)
When I run the Python script, it raises KeyError: 'type'. How can I handle the missing field name in the script?
When you're trying to fetch values that might not exist in a Python dictionary, you can use the .get() method of the dict class.
For instance, let's say you have a dictionary like this:
my_dict = {'a': 1,
'b': 2,
'c': 3}
You can use the get method to get one of the keys that exist:
>>> print(my_dict.get('a'))
1
But if you try to get a key that doesn't exist (such as does_not_exist), you will get None by default:
>>> print(my_dict.get("does_not_exist"))
None
As mentioned in the documentation, you can also provide a default value that will be returned when the key doesn't exist:
>>> print(my_dict.get("does_not_exist", "default_value"))
default_value
But this default value won't be used if the key does exist in the dictionary (if the key does exist, you'll get its value):
>>> print(my_dict.get("a", "default_value"))
1
Knowing that, when you build your flattened_record you can do:
'stages.hmStage': stage_record['hmStage'],
'stages.type': stage_record.get('type', ""),
'stages.isEditable': stage_record['isEditable'],
So if the stage_record dictionary doesn't contain a type key, stage_record.get('type', "") will return an empty string.
You can also try with just:
'stages.hmStage': stage_record['hmStage'],
'stages.type': stage_record.get('type'),
'stages.isEditable': stage_record['isEditable'],
and then stage_record.get('type') will return None when that stage_record doesn't contain a type key.
Or you could make the default "UNKNOWN"
'stages.type': stage_record.get('type', "UNKNOWN"),
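Putting the pieces together, here is a minimal sketch that flattens the stages with .get() and writes them out with csv.DictWriter. It uses an in-memory sample document instead of a MongoDB cursor; the helper name `flatten_stages` and the "wf1" id are made up for the example, and an empty string is assumed as the default for any missing field.

```python
import csv
import io

FIELDS = ["interview", "hmNotification", "hmStage", "type",
          "isEditable", "order", "name", "stageId"]

def flatten_stages(workflow):
    """Return one flat row per stage, using "" for fields a stage lacks."""
    rows = []
    for stage in workflow.get("stages", []):
        row = {"_id": workflow.get("_id")}
        for field in FIELDS:
            row["stages." + field] = stage.get(field, "")
        rows.append(row)
    return rows

# Hypothetical document shaped like the sample collection above.
sample = {"_id": "wf1", "stages": [
    {"interview": False, "hmNotification": False, "hmStage": False,
     "type": "new", "isEditable": False, "order": 0,
     "name": {"en": "New"}, "stageId": "51d1a2f4c0d9887b214f3694"},
    {"interview": False, "hmNotification": True, "hmStage": True,
     "isEditable": True, "order": 1,
     "name": {"en": "Pre-Screen"}, "stageId": "51f0078d7297363f62059699"},
]}

rows = flatten_stages(sample)
buffer = io.StringIO()  # stands in for an open CSV file
writer = csv.DictWriter(buffer, fieldnames=["_id"] + ["stages." + f for f in FIELDS])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Applying .get() to every field, not only type, means a stage that lacks some other key will not raise a KeyError either.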

Pymongo - Mod on _id not allowed

I have a Mongo Collection that I need to update, and I'm trying to use the collection.update command to no avail.
Code below:
import pymongo
from pymongo import MongoClient
client = MongoClient()
db = client.SensorDB
sensors = db.Sensor
for sensor in sensors.find():
    lat = sensor['location']['latitude']
    lng = sensor['location']['longitude']
    sensor['location'] = {
        "type" : "Feature",
        "geometry" : {
            "type" : "Point",
            "coordinates" : [lat ,lng]
        },
        "properties": {
            "name": sensor['name']
        }
    }
    sensors.update({'webid': sensor['webid']} , {"$set": sensor}, upsert=True)
However, running this gets me the following:
Traceback (most recent call last):
File "purgeDB.py", line 21, in <module>
cameras.update({'webid': sensor['webid']} , {"$set": sensor}, upsert=True)
File "C:\Anaconda\lib\site-packages\pymongo\collection.py", line 561, in update
check_keys, self.uuid_subtype), safe)
File "C:\Anaconda\lib\site-packages\pymongo\mongo_client.py", line 1118, in _send_message
rv = self.__check_response_to_last_error(response, command)
File "C:\Anaconda\lib\site-packages\pymongo\mongo_client.py", line 1060, in __check_response_to_last_error
raise OperationFailure(details["err"], code, result)
pymongo.errors.OperationFailure: Mod on _id not allowed
Change this line:
for sensor in sensors.find():
to this:
for sensor in sensors.find({}, {'_id': 0}):
This prevents Mongo from returning the _id field, which you aren't using. The field causes the problem later in your update() call, because sensor still contains _id and you cannot "update" _id with $set.
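The same effect can be had on the Python side by leaving _id out of the $set payload. The sketch below only builds the filter and update documents from a plain dict; the helper name `build_set_update` and the sample sensor values are hypothetical.

```python
# Hypothetical helper: builds the arguments for an update without _id.
def build_set_update(document, key_field="webid"):
    """Return (filter, update) where the $set payload never contains _id."""
    fields = {key: value for key, value in document.items() if key != "_id"}
    return {key_field: document[key_field]}, {"$set": fields}

# Made-up sensor document for illustration.
sensor = {"_id": "abc123", "webid": 42, "name": "cam-1"}
query, update = build_set_update(sensor)

print(query)
print(update)
```

The original dict is left untouched, so _id is still available if it is needed elsewhere in the loop.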
An even better solution (only write the data that is needed):
for sensor in sensors.find():
    lat = sensor['location']['latitude']
    lng = sensor['location']['longitude']
    location = {
        "type" : "Feature",
        "geometry" : {
            "type" : "Point",
            "coordinates" : [lat ,lng]
        },
        "properties": {
            "name": sensor['name']
        }
    }
    sensors.update({'webid': sensor['webid']} , {"$set": {'location': location}})
Edit:
As mentioned by Loïc Faure-Lacroix, you also do not need the upsert flag in your case - your code in this case is always updating, and never inserting.
Edit2:
Surrounded _id in quotes for first solution.

Problem decoding json strings using json module

After contacting a server I get the following strings as response
{"kind": "t2", "data": {"has_mail": null, "name": "shadyabhi", "created": 1273919273.0, "created_utc": 1273919273.0, "link_karma": 1343, "comment_karma": 301, "is_gold": false, "is_mod": false, "id": "425zf", "has_mod_mail": null}}
which is stored as type 'str' in my script.
Now, when I try to decode it using json.dumps(mystring, sort_keys=True, indent=4), I get this.
"{\"kind\": \"t2\", \"data\": {\"has_mail\": null, \"name\": \"shadyabhi\", \"created\": 1273919273.0, \"created_utc\": 1273919273.0, \"link_karma\": 1343, \"comment_karma\": 301, \"is_gold\": false, \"is_mod\": false, \"id\": \"425zf\", \"has_mod_mail\": null}}"
which should really be like this
shadyabhi@archlinux ~ $ echo '{"kind": "t2", "data": {"has_mail": "null", "name": "shadyabhi", "created": 1273919273.0, "created_utc": 1273919273.0, "link_karma": 1343, "comment_karma": 299, "is_gold": "false", "is_mod": "false", "id": "425zf", "has_mod_mail": "null"}}' | python2 -mjson.tool
{
"data": {
"comment_karma": 299,
"created": 1273919273.0,
"created_utc": 1273919273.0,
"has_mail": "null",
"has_mod_mail": "null",
"id": "425zf",
"is_gold": "false",
"is_mod": "false",
"link_karma": 1343,
"name": "shadyabhi"
},
"kind": "t2"
}
shadyabhi@archlinux ~ $
So, what is it that's going wrong?
You need to load it before you can dump it. Try this:
data = json.loads(returnFromWebService)
json.dumps(data, sort_keys=True, indent=4)
To add a bit more detail - you're receiving a string, and then asking the json library to dump it to a string. That doesn't make a great deal of sense. What you need to do first is put the data into a more meaningful container. By calling loads you take the string value of the return and parse it into an actual Python Dictionary. Then, you can pass that data to dumps which outputs a string using your requested formatting.
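A short sketch of the round trip, using a shortened copy of the response string from the question (the variable names are made up for the example):

```python
import json

# Shortened copy of the server response, held as a plain string.
raw = '{"kind": "t2", "data": {"has_mail": null, "name": "shadyabhi", "link_karma": 1343, "is_gold": false}}'

# Dumping the string directly only wraps it in quotes and escapes it.
escaped = json.dumps(raw)

# Parse first, then dump the resulting dict with the desired formatting.
data = json.loads(raw)
pretty = json.dumps(data, sort_keys=True, indent=4)

print(pretty)
```

After loads, JSON null and false become Python None and False, and dumps turns them back into null and false in the formatted output.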
You have things backwards. If you want to convert a string to a data structure you need to use json.loads(thestring). json.dumps() is for converting a data structure to a json encoded string.
You are supposed to dump an object (like a dictionary) which then becomes a string, not the other way round... see here.
Use json.loads() instead.
You want json.loads. The dumps method is for going the other way (dumping an object to a json string).
