Adding multiple fields to documents in mongodb using pymongo - python

I have a sample collection of documents in MongoDB like the one below:
[{"name":"hans","age":30,"test":"pass","pre":"no","calc":"no"},
{"name":"abs","age":20,"test":"not_pass","pre":"yes","calc":"no"},
{"name":"cdf","age":40,"test":"pass"},
{"name":"cvf","age":30,"test":"not_pass","pre":"no","calc":"yes"},
{"name":"cdf","age":23,"test":"pass"},
{"name":"asd","age":35,"test":"not_pass"}]
For some documents, the fields pre and calc are not present. I want to add those two fields, both with the value "null" ("pre":"null", "calc":"null"), to the documents that don't have them.
The final documents should look like this:
[{"name":"hans","age":30,"test":"pass","pre":"no","calc":"no"},
{"name":"abs","age":20,"test":"not_pass","pre":"yes","calc":"no"},
{"name":"cdf","age":40,"test":"pass","pre":"null","calc":"null"},
{"name":"cvf","age":30,"test":"not_pass","pre":"no","calc":"yes"},
{"name":"cdf","age":23,"test":"pass","pre":"null","calc":"null"},
{"name":"asd","age":35,"test":"not_pass","pre":"null","calc":"null"}]
I tried the following, but it didn't work:
db.users.update({}, { "$set" : { "pre":"null","calc":"null" }}, false,true)

I think you need an update with an aggregation pipeline that uses the $ifNull operator:
db.users.update({},
[
{
"$set": {
"pre": {
$ifNull: [
"$pre",
"null"
]
},
"calc": {
$ifNull: [
"$calc",
"null"
]
}
}
}
],
false,
true
)
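Since the question asks about pymongo, here is a minimal sketch of the same pipeline update in Python. The helper names `null_default_pipeline` and `fill_missing` are made up for illustration, and `collection` is assumed to be a pymongo `Collection` connected to a MongoDB 4.2+ server (pipeline updates are not available on older servers). Note that `$ifNull` also replaces fields that exist but hold an actual null value.

```python
def null_default_pipeline(fields, default="null"):
    """Build a one-stage update pipeline that fills each missing
    (or null) field in `fields` with `default`."""
    return [
        {"$set": {name: {"$ifNull": ["$" + name, default]} for name in fields}}
    ]


def fill_missing(collection, fields, default="null"):
    # Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    # An empty filter applies the pipeline to every document.
    return collection.update_many({}, null_default_pipeline(fields, default))


# The pipeline for the two fields from the question:
pipeline = null_default_pipeline(["pre", "calc"])
```

Calling `fill_missing(db.users, ["pre", "calc"])` would then play the role of the shell command above, with `update_many` taking the place of the `multi` flag.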

The easiest option is to run this query once for every missing field, for example for pre:
db.collection.update({
pre: {
$exists: false
}
},
{
"$set": {
"pre": null
}
},
{
multi: true
})
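A rough pymongo version of this per-field approach might look like the sketch below. The helper names are hypothetical, and `collection` is assumed to be a pymongo `Collection`. The default here is Python's `None` (stored as a real null); pass `"null"` instead if the string from the question is what you want.

```python
def missing_field_updates(fields, default=None):
    """Return one (filter, update) pair per field: the filter selects
    documents where the field is absent, the update sets the default."""
    return [
        ({name: {"$exists": False}}, {"$set": {name: default}})
        for name in fields
    ]


def add_missing_fields(collection, fields, default=None):
    # Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    # update_many touches every matching document, like multi: true in the shell.
    modified = 0
    for flt, update in missing_field_updates(fields, default):
        modified += collection.update_many(flt, update).modified_count
    return modified
```

Unlike the `$ifNull` pipeline, this only touches documents where the field is truly absent, so existing values (including explicit nulls) are left alone.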

Related

How do I get a specific element in array - MongoDB

Documents are stored in MongoDB in the following form:
{
"_id" : ObjectId("54fa059ce4b01b3e086c83e9"),
"field1" : "value1",
"field2" : "value2",
"field3" : [
{
"abc123": ["somevalue", "somevalue"]
},
{
"xyz345": ["somevalue", "somevalue"]
}
]
}
Whenever I pass abc123 in a pymongo query, I want the result in the following form:
{
"abc123": ["somevalue", "somevalue"]
}
or
["somevalue", "somevalue"]
Please suggest a mongo query for it. Thanks
Maybe something like this:
db.collection.aggregate([
{
$project: {
field3: {
"$filter": {
"input": "$field3",
"as": "f",
"cond": {
$ne: [
"$$f.abc123",
undefined
]
}
}
}
}
},
{
$unwind: "$field3"
},
{
"$replaceRoot": {
"newRoot": "$field3"
}
}
])
Explained:
Use the MongoDB aggregation framework with the three stages below:
1) $project with $filter: keep only the elements of the field3 array that contain the requested key, if any
2) $unwind: unwind the field3 array
3) $replaceRoot: replace the root document with the content of field3
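The shell pipeline above compares against `undefined`, which has no direct Python literal. One possible pymongo-friendly variant, sketched under the assumption that the key is a plain field name (no dots, no leading `$`), checks instead that `$type` of the field is not `"missing"`. The function name `element_by_key_pipeline` is hypothetical.

```python
def element_by_key_pipeline(key, array_field="field3"):
    """Build a pipeline that returns the array elements containing `key`."""
    return [
        # Stage 1: keep only array elements in which `key` exists.
        {"$project": {
            array_field: {
                "$filter": {
                    "input": "$" + array_field,
                    "as": "f",
                    # $type yields "missing" when the field is absent
                    "cond": {"$ne": [{"$type": "$$f." + key}, "missing"]},
                }
            }
        }},
        # Stage 2: one output document per remaining element.
        {"$unwind": "$" + array_field},
        # Stage 3: promote the element itself to the root.
        {"$replaceRoot": {"newRoot": "$" + array_field}},
    ]


pipeline = element_by_key_pipeline("abc123")
```

With a pymongo collection, `list(db.collection.aggregate(pipeline))` should then yield documents of the form `{"abc123": [...]}`.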

Exclude _id field during a join query

I am trying to create a join query and exclude the _id field from the result.
stage_lookup_comments = {
"$lookup": {
"from": "products",
"localField": "product_codename",
"foreignField": "codename",
"as": "product",
}
}
pipeline = [
{ "$match": {
"category":category,
"archived_at":{"$eq": None}
}
},
stage_lookup_comments
]
array = await db[collection].aggregate(pipeline).to_list(CURSOR_LIMIT)
return array
I don't know the syntax for adding the "_id": 0 parameter to my query.
You should be able to use a $project stage in your pipeline to select only the fields you want to return. In this case, you can exclude the _id field with "_id": 0, as you already mentioned.
See the $project documentation for more details.
I didn't test it, but your query should look similar to the following:
stage_lookup_comments = {
"$lookup": {
"from": "products",
"localField": "product_codename",
"foreignField": "codename",
"as": "product",
}
}
pipeline = [
{
"$match": {
"category":category,
"archived_at":{"$eq": None}
}
},
stage_lookup_comments,
{
"$project": { "_id": 0 }
}
]
array = await db[collection].aggregate(pipeline).to_list(CURSOR_LIMIT)
return array
EDIT:
Also, starting in MongoDB 4.2, you can use the $unset pipeline stage to explicitly remove a field from the documents (see the $unset documentation):
{ "$unset": ["_id"] }
A very similar question on Stack Overflow covers this in more detail.
I hope this works!
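One detail the pipeline above leaves open: the documents joined into "product" carry their own _id. Assuming those should be dropped as well (the question does not say so), a dotted path in the same $project stage can exclude them. The function name `lookup_pipeline` is hypothetical; the stage contents mirror the question.

```python
def lookup_pipeline(category):
    """Build the match + lookup pipeline, excluding _id at both levels."""
    return [
        {"$match": {"category": category, "archived_at": {"$eq": None}}},
        {"$lookup": {
            "from": "products",
            "localField": "product_codename",
            "foreignField": "codename",
            "as": "product",
        }},
        # "_id": 0 drops the top-level id; "product._id": 0 drops the id
        # of every joined product document (an assumption about intent).
        {"$project": {"_id": 0, "product._id": 0}},
    ]


pipeline = lookup_pipeline("books")  # "books" is a made-up category value
```

The resulting list can be passed to `aggregate` exactly as in the code above.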

Update document if value there is no match

In Mongodb, how do you skip an update if one field of the document exists?
To give an example, I have the following document structure, and I'd like to only update it if the link key is not matching.
{
"_id": {
"$oid": "56e9978732beb44a2f2ac6ae"
},
"domain": "example.co.uk",
"good": [
{
"crawled": true,
"added": {
"$date": "2016-03-16T17:27:17.461Z"
},
"link": "/url-1"
},
{
"crawled": false,
"added": {
"$date": "2016-03-16T17:27:17.461Z"
},
"link": "url-2"
}
]
}
My update query is:
links.update({
"domain": "example.co.uk"
},
{'$addToSet':
{'good':
{"crawled": False, 'link':"/url-1"} }}, True)
Part of the problem is that the crawled field could be set to True or False, and the date will always be different. I don't want to add to the array if the URL exists, regardless of the crawled status.
Update:
Just for clarity, if the URL is not within the document, I want it to be added to the existing array, for example, if /url-3 was introduced, the document would look like this:
{
"_id": {
"$oid": "56e9978732beb44a2f2ac6ae"
},
"domain": "example.co.uk",
"good": [
{
"crawled": true,
"added": {
"$date": "2016-03-16T17:27:17.461Z"
},
"link": "/url-1"
},
{
"crawled": false,
"added": {
"$date": "2016-03-16T17:27:17.461Z"
},
"link": "url-2"
},
{
"crawled": false,
"added": {
"$date": "2016-04-16T17:27:17.461Z"
},
"link": "url-3"
}
]
}
The domain will be unique and specific to the link and I want it to insert the link within the good array if it doesn't exist and do nothing if it does exist.
The only way to do this is to first check, using the find_one method, whether any document in the collection matches your criteria; the filter must include the "good.link" field. If no document matches, run your update query using the update_one method, this time without the "good.link" field in the filter. You also don't need the $addToSet operator, as it has no effect here; simply use the $push update operator, which makes your intention clear. You don't need the "upsert" option here either.
if not links.find_one({"domain": "example.co.uk", "good.link": "/url-1"}):
    links.update_one({"domain": "example.co.uk"},
                     {"$push": {"good": {"crawled": False, "link": "/url-1"}}})
In the filter part of your update, you are matching all documents where
"domain": "example.co.uk"
You also need to state that you don't want to match documents that already contain
'good.link':"/url-1"
So try this filter:
{
"domain": "example.co.uk",
"good.link": {"$ne": "/url-1"}
}
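Combined with $push, that filter gives a single-call version of the update. The sketch below assumes `collection` is a pymongo `Collection`; the helper name `add_link_if_absent` is made up. It deliberately does not pass `upsert=True`: with this filter, an upsert would insert a second document for the domain whenever the link is already present.

```python
def add_link_if_absent(collection, domain, link):
    """Push `link` onto the domain's "good" array unless some element
    already has that link. Returns True when a document was modified."""
    result = collection.update_one(
        # $ne on "good.link" matches only if no array element has this link
        {"domain": domain, "good.link": {"$ne": link}},
        {"$push": {"good": {"crawled": False, "link": link}}},
    )
    return result.modified_count == 1
```

For example, `add_link_if_absent(links, "example.co.uk", "/url-3")` would append the new entry, while repeating the call with the same link would match nothing and leave the document unchanged.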
The accepted answer is incorrect in saying that the only way to do this is to call findOne first.
You can do it in a single database call with a pipeline update (available since MongoDB 4.2), which lets you use aggregation operators within an update. The strategy is to concatenate two arrays: the first is always the existing "good" array, and the second is either [new link] or an empty array, chosen with $cond depending on whether the link already exists:
links.update({
"domain": "example.co.uk"
},
[
{
"$set": {
"good": {
"$ifNull": [
"$good",
[]
]
}
}
},
{
"$set": {
"good": {
"$concatArrays": [
"$good",
{
"$cond": [
{
"$in": [
"/url-1",
"$good.link"
]
},
[],
[
{
"crawled": False,
"link": "/url-1"
}
]
]
}
]
}
}
}
], True)

highlighting based on term or bool query match in elasticsearch

I have two queries.
{'query':
{'bool':
{'must': [
{'terms': {'metadata.loc': ['ten','twenty']}},
{'terms': {'metadata.doc': ['prince','queen']}}
],
'should': [
{'match': {'text': 'kingdom of dreams'}}
]
}
},
'highlight':
{'fields':
{'text':
{'type': 'fvh',
'matched_fields': ['metadata.doc','text']
}
}
}
}
I have two questions:
1) Why are documents that match the should query highlighted, whereas documents that match only the must terms queries are not?
2) Is there any way to specify a highlight condition specific to each terms query above?
That is, one highlight condition for { 'terms': 'metadata.loc':['ten','twenty']}
and a separate highlight condition for { 'terms': 'metadata.doc':['prince','queen']}
1) Only documents matching the should query are highlighted because you are highlighting only the text field, which is the field your should clause queries. Although you are using matched_fields, the highlight is still defined only on the text field.
From the docs:
All matched_fields must have term_vector set to with_positions_offsets but only the field to which the matches are combined is loaded so only that field would benefit from having store set to yes.
You are also combining two very different fields, 'matched_fields':['metadata.doc','text'], which is hard to reason about. Again from the docs:
Technically it is also fine to add fields to matched_fields that don’t share the same underlying string as the field to which the matches are combined. The results might not make much sense and if one of the matches is off the end of the text then the whole query will fail.
2) You can specify a highlight condition specific to each terms query with highlight_query.
Try this in the highlight part of your request:
{
"query": {
...your query...
},
"highlight": {
"fields": {
"text": {
"type": "fvh",
"matched_fields": [
"text",
"metadata.doc"
]
},
"metadata.doc": {
"highlight_query": {
"terms": {
"metadata.doc": [
"prince",
"queen"
]
}
}
},
"metadata.loc": {
"highlight_query": {
"terms": {
"metadata.loc": [
"ten",
"twenty"
]
}
}
}
}
}
}
Does this help?

pymongo nested embedded document field update

I have a document as:
{
"name": "restaurant 1",
"rooms":
[
{"name": "room1",
"desks": [
{
"name": "desk1",
"unique": "abcde",
"busy": False
},
{
"name": "desk2",
"unique": "abcdf",
"busy": True
}
]},
{"name": "room2",
"desks": [
{
"name": "desk1",
"unique": "bbcde",
"busy": False
},
{
"name": "desk2",
"unique": "bbcdf",
"busy": False
}
]}
]
}
My pymongo update query:
db.restaurants.update(
{'rooms.desks.unique': 'bbcdf'},
{'$set': {'rooms.$.desks.$$.busy': True}}
)
I couldn't update the "busy" field of the desk; the $$ part didn't work. What should I replace "$$" with?
Or, how can I find the index of the desk?
Thanks in advance.
According to the documentation, this is not possible with the positional $ operator:
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value.
Most likely you will need to redesign your database schema.
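The limitation quoted above concerns the positional $ operator. Since MongoDB 3.6, the all-positional operator $[] and the filtered positional operator $[<identifier>] with arrayFilters can reach into nested arrays, so a schema change may not be required on newer servers. A minimal sketch, assuming `collection` is a pymongo `Collection` (pymongo 3.6+ accepts the `array_filters` argument); the helper name `set_desk_busy` and the identifier `desk` are made up for illustration.

```python
def set_desk_busy(collection, unique, busy=True):
    """Set "busy" on the desk whose "unique" value matches, in any room.
    Returns the number of modified documents."""
    result = collection.update_one(
        {"rooms.desks.unique": unique},
        # $[] walks every room; $[desk] selects only the desks that
        # satisfy the matching entry in array_filters.
        {"$set": {"rooms.$[].desks.$[desk].busy": busy}},
        array_filters=[{"desk.unique": unique}],
    )
    return result.modified_count
```

Calling `set_desk_busy(db.restaurants, "bbcdf")` would then mark desk2 of room2 as busy in the sample document.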
