Pymongo Insert Doc with String ID's rather than ObjectIDs

I am trying to get PyMongo to insert new documents whose ids are strings rather than ObjectIds. The app I am building integrates Meteor and Python, and Meteor inserts string ids, so having to work with both strings and ObjectIds adds complexity.
Example:
Meteor-inserted doc:
{
"_id" : "22FHWpvqrAeyfvh7B"
}
Pymongo-inserted doc:
{
"_id" : ObjectId("5880387d1fd21c2dc66e9b7d")
}

You could switch your Meteor app to insert ObjectIds instead of strings. Set the idGeneration option to 'MONGO' when you create the collection.
Here is an example.
var todos = new Mongo.Collection('todos', {
    idGeneration: 'MONGO'
});
This option is described in the Meteor docs.
Or, if you want Meteor to keep string ids and can't figure out how to configure PyMongo to store strings, you can convert between ObjectIds and strings in PyMongo.
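A third option is to supply the string _id from Python. PyMongo only generates an ObjectId when the document has no _id field, so setting one yourself is enough. The sketch below is a minimal illustration: the helper names, the 17-character length, and the alphabet are assumptions chosen to resemble Meteor-style ids, not something taken from the Meteor or PyMongo docs.

```python
import secrets

# Assumed alphabet: letters and digits with look-alike characters left out,
# chosen to resemble Meteor-style ids (an assumption, not an official spec).
ID_ALPHABET = "23456789ABCDEFGHJKLMNPQRSTWXYZabcdefghijkmnopqrstuvwxyz"


def new_string_id(length=17):
    """Return a random string id of the given length."""
    return "".join(secrets.choice(ID_ALPHABET) for _ in range(length))


def insert_with_string_id(collection, doc):
    """Insert a copy of doc with a string _id and return that id.

    `collection` is expected to behave like a PyMongo collection,
    i.e. to provide insert_one().
    """
    doc_to_insert = dict(doc)
    # Keep an _id the caller already set; otherwise generate a string one.
    doc_to_insert.setdefault("_id", new_string_id())
    collection.insert_one(doc_to_insert)
    return doc_to_insert["_id"]
```

With a real connection this would be called as insert_with_string_id(MongoClient().test.todos, {"title": "demo"}), where the database and collection names are placeholders.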

Related

Python and MongoDB Convert data type from string to long

I am new to MongoDB and PyMongo and need some help converting a field's data type from string to long. Fieldname = sc_2g
The following works fine in the mongo shell:
db.collection.aggregate({$set: {sc_g: { $toLong: "$sc_g" }}},{$out:"collection"})
but I need the equivalent in Python. Can anyone help?

You can use aggregation in PyMongo too. The pipeline must be passed as a list of stages, and the operator names must be quoted strings.
pipeline = [{"$set": {"sc_g": {"$toLong": "$sc_g"}}}, {"$out": "collection"}]
list(db.collection.aggregate(pipeline))
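As an alternative to rewriting the whole collection with $out, the conversion can be done in place with a pipeline-style update. This is a sketch under assumptions: it needs MongoDB 4.2 or later and a PyMongo version that accepts a pipeline as the update argument, and the helper name is made up for illustration.

```python
def convert_field_to_long(collection, field):
    """Convert string values of `field` to 64-bit integers in place.

    `collection` is expected to behave like a PyMongo collection.
    """
    return collection.update_many(
        # Only touch documents where the field is still a string.
        {field: {"$type": "string"}},
        # A list as the update argument means "aggregation pipeline update".
        [{"$set": {field: {"$toLong": "$" + field}}}],
    )
```

For the field in the shell command above, the call would be convert_field_to_long(db.collection, "sc_g").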

Parsing a json string within pymongo

I have a field called financials that contains a json string.
{
"_id" : ObjectId("57506d74c469888f0d631be6"),
"financials" : "{\"year\":[2015], ...}"
}
What I currently do is extract the data, convert it to a pandas dataframe, parse the string using json.loads and fiddle with the financial data from there.
Is there any way to parse the json string in pymongo, preferably as part of the aggregate pipeline as I wish to use some functions (namely $unwind) within pymongo?

I do not know how to do it via PyMongo. That probably means there is no way to do it inside the pipeline; for example, the $convert operator has no option for parsing a string into JSON. A different solution is to use the mongo shell with JSON.parse and store the parsed value back in the document:
db.YourCollection.find().forEach(function (doc) {
    var modified_data = JSON.parse(doc.financials);
    db.YourCollection.updateOne({ _id: doc._id }, { $set: { financials: modified_data } });
});
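The same one-off migration can be sketched from Python, parsing the string on the client with json.loads and writing the result back. The helper name is hypothetical, and the code assumes every string value in the field is valid JSON. Once the field holds a real sub-document, pipeline stages such as $unwind can work on it.

```python
import json


def parse_json_field(collection, field):
    """Replace JSON strings stored in `field` with their parsed value.

    Returns the number of documents updated. `collection` is expected
    to behave like a PyMongo collection.
    """
    # Read the matching documents up front so the updates below do not
    # interfere with an open cursor.
    docs = list(collection.find({field: {"$type": "string"}}, {field: 1}))
    updated = 0
    for doc in docs:
        parsed = json.loads(doc[field])  # raises ValueError on invalid JSON
        collection.update_one({"_id": doc["_id"]}, {"$set": {field: parsed}})
        updated += 1
    return updated
```

For the document shown in the question, the call would be parse_json_field(db.YourCollection, "financials").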

How to use MongoDB find() to perform range queries on numeric strings?

How can I run a find() in MongoDB with a >= (or >) comparison against some value when the stored value is a numeric string?
If I run the following line (that searches the MongoDB database for modes higher than 1):
cursor = db.foo.find({"mode": {"$gt": 1}})
This will work only if the data in MongoDB is in the format:
data = {"mode":3}
But I need to use the find() with this data:
data = {"mode":'3'} # as string
How can I do this?
Here is my example:
from pymongo import MongoClient

client = MongoClient()
db = client.test
db.foo.drop()
data = {"mode": 3}    # Works because this is numeric
data = {"mode": '3'}  # Won't work! But my database contains only numeric strings. How can I query data like this?
db.foo.insert_one(data)
print(db.foo.count_documents({}))  # Collection.count() was removed in PyMongo 4
cursor = db.foo.find({"mode": {"$gt": 1}})
for document in cursor:
    print(document)

If you leave your numeric data stored in the database as strings, in order to query your data with range operators such as $gt and $lt you're going to have to use one of two approaches.
First, you can use JavaScript's automatic conversion to run your range queries. This works as shown below, but it is very limited as you will not be able to use any indexes, as explained in the comments to previous answers. Thus for big data sets, this will be prohibitively slow.
db.foo.find("this.mode > 1");
A second approach would involve regular expressions. You will have to figure out what regex to use, but once you have that, you can use the syntax below to run your query or use the $regex operator as highlighted here.
db.foo.find({ mode: /pattern/<options> });
Aside from having to figure out some complex regex, again there are possible performance issues with this approach, as explained here (see extract below). Most likely, you will also run into issues where your query is not taking advantage of indexes.
If an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a “prefix expression”, which means that all potential matches start with the same string. This allows MongoDB to construct a “range” from that prefix and only match against those values from the index that fall within that range.
Because of this, if you're going to be running these queries often, I would recommend that you follow a third approach, which would be to change your schema and store your data as numbers. You can achieve this with a simple migration script such as the following in JavaScript, which you could run in the shell.
var cursor = db.foo.find();
while (cursor.hasNext()) {
    var doc = cursor.next();
    var _id = doc._id;
    if (doc.mode) {
        var modeString = doc.mode;
        var modeInt = parseInt(modeString, 10); // always parse as base 10
        db.foo.update({ _id: _id }, { $set: { mode: modeInt } });
    }
}
Having done that you will be able to query your data using operators such as $gt and $lt, sort it without much hassle, and take advantage of indexes.
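If changing the schema is not possible right away, newer servers offer one more query-time option that the approaches above do not cover: converting the string inside the query with $expr and $toInt. This is only a sketch, assuming MongoDB 4.0 or later; the helper name is hypothetical. Like the JavaScript approach, it converts every value at query time, so it generally cannot benefit from an index on the field, and $toInt raises an error if a value is not a valid integer string.

```python
def numeric_string_gt(field, threshold):
    """Build a filter matching documents whose numeric-string `field`
    is greater than `threshold` after conversion to an integer."""
    return {"$expr": {"$gt": [{"$toInt": "$" + field}, threshold]}}
```

With PyMongo, the query from the question would become db.foo.find(numeric_string_gt("mode", 1)).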

From Mongo docs,
$type selects the documents where the value of the field is an instance of the specified BSON type. Querying by data type is useful when dealing with highly unstructured data where data types are not predictable.
{ field: { $type: BSON type number | String alias } }
$type returns documents where the BSON type of the field matches the BSON type passed to $type.
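I guess you'll have to pass $type explicitly in your case. Note that this only selects the documents whose mode is stored as a string; it does not perform the numeric comparison by itself:
cursor = db.foo.find({"mode": {"$type": "string"}})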

You could try this syntax, which relies on JavaScript's automatic conversion:
db.test.find("this.mode > 1")

including a NumberInt in a dict for pymongo

I need to load a list of dicts (see below) into MongoDB. Within the mongo shell, you have to define an int type as NumberInt(). Python doesn't recognize this as a valid type for a dict. I've found pages on custom encoding for PyMongo that don't actually do what I need. I'm totally stuck. Someone has to have encountered this before!
I need to insert a list of dicts like this into MongoDB from Python.
agg = {
'_id' : unique_id_str,
'total' : NumberInt(int(total)),
'mode' : NumberInt(int(mymode))
}

You should be able to insert the dict with plain Python ints. I've never needed NumberInt to insert documents using PyMongo.
Also, FWIW, folks at MongoDB told me that letting MongoDB create the _id itself tends to be more efficient, but defining it yourself may work better in your case.
agg = {
    '_id': unique_id_str,
    'total': int(total),
    'mode': int(mymode)
}
This should work.
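Since the question is about a list of dicts, the same idea can be sketched for a batch. As far as I know, PyMongo stores a Python int as a 32-bit BSON integer when the value fits and as a 64-bit integer otherwise, so no NumberInt wrapper is needed. The function names and the (id, total, mode) row layout are assumptions made for the example.

```python
def build_aggregate(unique_id_str, total, mymode):
    """Build one document with plain Python ints for the numeric fields."""
    return {"_id": unique_id_str, "total": int(total), "mode": int(mymode)}


def insert_aggregates(collection, rows):
    """Insert one document per (id, total, mode) row and return the documents.

    `collection` is expected to behave like a PyMongo collection.
    """
    docs = [build_aggregate(*row) for row in rows]
    collection.insert_many(docs)
    return docs
```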

How to use $push/$addToSet mongodb update modifiers on python dicts

MongoDB updates provide the $push modifier to append to an array. My problem is that I want this to happen on a dict, e.g.
If my record looks like this initially:
{"collaborations":{'id1':{'role':'dev','scope':'dev'}}}
I want to add another item("id2" below) to the "collaborations" field dict to look something like this:
{"collaborations":{'id1':{'role':'dev','scope':'dev'},'id2':{'role':'qa','scope':'qa'}}}
I am trying with $push:
my_record.update({match_criteria},{"$push": {"collaborations":{'id2':{'role':'qa','scope':'qa'}}}})
and also with $addToSet:
my_record.update({match_criteria},{"$addToSet": {"collaborations":{'id2':{'role':'qa','scope':'qa'}}}})
With both of these, MongoDB throws the error "Cannot apply $addToSet($push) modifier to non-array".
How can this be done for dict types? Any ideas?

The problem is that the $addToSet and $push modifiers only work with arrays.
To update a sub-document (which is what you need here), use the $set modifier with dot notation to access the sub-document (field.subfield):
my_record.update_one(
    match_criteria,
    {
        '$set': {
            'collaborations.id2': {
                # new document fields here
            }
        }
    }
)
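The dot-notation update can be wrapped in a small helper so the key is built from the id being added. This is a sketch: the function name and parameters are hypothetical, and it assumes the id contains no dots, since a dot would be read as another level of nesting.

```python
def add_collaboration(collection, match_criteria, collab_id, details):
    """Add or overwrite collaborations.<collab_id> on one matching document.

    `collection` is expected to behave like a PyMongo collection.
    """
    if "." in collab_id:
        raise ValueError("collab_id must not contain dots")
    return collection.update_one(
        match_criteria,
        # Dot notation targets a single key inside the sub-document.
        {"$set": {"collaborations." + collab_id: details}},
    )
```

For the record in the question, the call would be add_collaboration(my_record, match_criteria, "id2", {"role": "qa", "scope": "qa"}). Existing keys such as id1 are left untouched.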
