Exclude _id field during a join query - python

I am trying to write a join query that excludes the _id field from the result:
stage_lookup_comments = {
"$lookup": {
"from": "products",
"localField": "product_codename",
"foreignField": "codename",
"as": "product",
}
}
pipeline = [
{ "$match": {
"category":category,
"archived_at":{"$eq": None}
}
},
stage_lookup_comments
]
array = await db[collection].aggregate(pipeline).to_list(CURSOR_LIMIT)
return array
I don't know the syntax for adding the "_id": 0 parameter to my query.

You can add a $project stage to your pipeline to select only the fields you want returned. In this case you can exclude the _id field with "_id": 0, as you already mentioned.
See the $project documentation for more details.
I haven't tested it, but your query should look something like this:
stage_lookup_comments = {
"$lookup": {
"from": "products",
"localField": "product_codename",
"foreignField": "codename",
"as": "product",
}
}
pipeline = [
{
"$match": {
"category":category,
"archived_at":{"$eq": None}
}
},
stage_lookup_comments,
{
"$project": { "_id": 0 }
}
]
array = await db[collection].aggregate(pipeline).to_list(CURSOR_LIMIT)
return array
EDIT:
Also, starting in MongoDB 4.2, you can use the $unset stage to explicitly remove a field from the documents (see the documentation):
{ "$unset": ["_id"] }
You can read more about this in a very similar question on Stack Overflow.
I hope this works!
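Note that "_id": 0 only removes the top-level _id; the documents joined into "product" keep their own _id. A minimal sketch, assuming you also want those removed, is to exclude the dotted path "product._id" in the same $project stage. The build_pipeline helper and the "books" category are hypothetical names used only for illustration:

```python
def build_pipeline(category):
    """Build a $match -> $lookup -> $project pipeline (illustrative sketch)."""
    return [
        {"$match": {"category": category, "archived_at": {"$eq": None}}},
        {
            "$lookup": {
                "from": "products",
                "localField": "product_codename",
                "foreignField": "codename",
                "as": "product",
            }
        },
        # Drop the top-level _id and the _id of every joined product document.
        {"$project": {"_id": 0, "product._id": 0}},
    ]


pipeline = build_pipeline("books")  # "books" is a placeholder category

# With Motor, as in the question:
# array = await db[collection].aggregate(pipeline).to_list(CURSOR_LIMIT)
```

Keeping the pipeline in a small function makes it easy to inspect or log before sending it to the server.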

Related

How can I retrieve relative document in MongoDB?

I'm using Flask with the Jinja2 template engine and MongoDB via pymongo. These are my documents from two collections (phone and factory):
phone = db.get_collection("phone")
{
"_id": ObjectId("63d8d39206c9f93e68d27206"),
"brand": "Apple",
"model": "iPhone XR",
"year": NumberInt("2016"),
"image": "https://apple-mania.com.ua/media/catalog/product/cache/e026f651b05122a6916299262b60c47d/a/p/apple-iphone-xr-yellow_1.png",
"CPU": {
"manufacturer": "A12 Bionic",
"cores": NumberInt("10")
},
"misc": [
"Bluetooth 5.0",
"NFC",
"GPS"
],
"factory_id": ObjectId("63d8d42b7a4d7a7e825ef956")
}
factory = db.get_collection("factory")
{
"_id": ObjectId("63d8d42b7a4d7a7e825ef956"),
"name": "Foxconn",
"stock": NumberInt("1000")
}
In my Python code I retrieve the data like this:
models = list(
phone.find({"brand": brand}, projection={"model": True, "image": True, "factory_id": True})
)
How can I retrieve the related factory document by factory_id and embed it in each item of the models list?
I think you are looking for an aggregation that uses the $lookup stage.
The query does the following:
First, $match on your desired brand.
Then do a "join" between the collections based on factory_id and store the result in an array called "factories". The $lookup output is always an array because there can be more than one match.
Finally, $project only the values you want. In this case, as _id is unique, you can get the single matching factory using $arrayElemAt at position 0.
So the code could look like this (I'm not a Python expert):
models = list(
phone.aggregate([
{
"$match": {
"brand": brand
}
},
{
"$lookup": {
"from": "factory",
"localField": "factory_id",
"foreignField": "_id",
"as": "factories"
}
},
{
"$project": {
"model": True,
"image": True,
"factory": {
"$arrayElemAt": [
"$factories",
0
]
}
}
}
])
)
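To make the shape of the result concrete, here is a simplified in-memory emulation of what the $lookup plus $arrayElemAt steps produce. It is only a sketch for reasoning about the output, not a replacement for the aggregation: lookup_first is a hypothetical helper, and the sample data uses plain strings in place of ObjectId values.

```python
def lookup_first(phones, factories, keep=("_id", "model", "image")):
    """Simplified in-memory emulation of $lookup followed by $arrayElemAt 0."""
    by_id = {}
    for factory in factories:
        by_id.setdefault(factory["_id"], []).append(factory)

    result = []
    for phone in phones:
        # Keep only the projected fields that are present on the phone.
        doc = {key: phone[key] for key in keep if key in phone}
        matches = by_id.get(phone.get("factory_id"), [])
        if matches:  # an empty match list leaves "factory" out of the document
            doc["factory"] = matches[0]
        result.append(doc)
    return result


# Placeholder data: plain strings stand in for ObjectId values.
phones = [
    {"_id": "p1", "brand": "Apple", "model": "iPhone XR", "image": "xr.png", "factory_id": "f1"},
    {"_id": "p2", "brand": "Apple", "model": "iPhone 8", "image": "8.png", "factory_id": "f9"},
]
factories = [{"_id": "f1", "name": "Foxconn", "stock": 1000}]

models = lookup_first(phones, factories)
```

A phone whose factory_id has no match simply ends up without a "factory" key, so a Jinja2 template should check for it before accessing its fields.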

ElasticSearch - Compile Error on Adding a Field?

Using Python, I'm trying to go row-by-row through an Elasticsearch index with 12 billion documents and add a field to each document. The field is named direction and will contain "e" for some values of the field src and "e" for others. For this particular _id, the field should contain an "e".
from elasticsearch import Elasticsearch
es = Elasticsearch(["https://myESserver:9200"],
http_auth=('myUsername', 'myPassword'))
query_to_add_direction_field = {
"script": {
"inline": "direction=\"e\"",
"lang": "painless"
},
"query": {"constant_score": {
"filter": {"bool": {"must": [{"match": {"_id": "YKReAoQBk7dLIXMBhYBF"}}]}}}}
}
results = es.update_by_query(index="myIndex-*", body=query_to_add_direction_field)
I'm getting this error:
elasticsearch.BadRequestError: BadRequestError(400, 'script_exception', 'compile error')
I'm new to Elasticsearch. How can I correct my query so that it does not throw an error?
UPDATE:
I updated the code like this:
query_find_id = {
"size": "1",
"query": {
"bool": {
"filter": {
"term": {
"_id": "YKReAoQBk7dLIXMBhYBF"
}
}
}
}
}
query_to_add_direction_field = {
"script": {
"source": "ctx._source['egress'] = true",
"lang": "painless"
},
"query": {
"bool": {
"filter": {
"term": {
"_id": "YKReAoQBk7dLIXMBhYBF"
}
}
}
}
}
results = es.search(index="traffic-*", body=query_find_id)
results = es.update_by_query(index="traffic-*", body=query_to_add_direction_field)
results_after_update = es.search(index="traffic-*", body=query_find_id)
The code now runs without errors... I think I may have fixed it.
I say I think I may have fixed it because if I run the same code again, I get a version_conflict_engine_exception error on the call to update_by_query... but I think that just means the big 12B-row index is still being updated to match the change I made. Does that sound possibly accurate?
Please try the following query:
{
"script": {
"source": "ctx._source.direction = 'e'",
"lang": "painless"
},
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"match": {
"_id": "YKReAoQBk7dLIXMBhYBF"
}
}
]
}
}
}
}
}
The version_conflict_engine_exception happens because the version of the document is not the one that the update_by_query operation expects, for example because another process updated that document at the same time.
You can add conflicts=proceed to the request (/_update_by_query?conflicts=proceed) to work around the issue.
Read more about conflicts here:
https://www.elastic.co/guide/en/elasticsearch/reference/8.5/docs-update-by-query.html#docs-update-by-query-api-desc
If you think the conflict is temporary, the single-document Update API also offers retry_on_conflict to try again after a conflict:
retry_on_conflict
(Optional, integer) Specify how many times should the operation be retried when a conflict occurs. Default: 0.
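In the Python client, conflicts is passed as a keyword argument to update_by_query. The sketch below only builds the request so it can be inspected without a cluster; build_update_request is a hypothetical helper, and passing the value through script params instead of hard-coding it in the source is a design choice, not something the original answer requires.

```python
def build_update_request(index, doc_id, direction):
    """Build keyword arguments for es.update_by_query (illustrative sketch)."""
    return {
        "index": index,
        # Count version conflicts instead of aborting on the first one.
        "conflicts": "proceed",
        "body": {
            "script": {
                "lang": "painless",
                "source": "ctx._source.direction = params.direction",
                "params": {"direction": direction},
            },
            "query": {"term": {"_id": doc_id}},
        },
    }


request = build_update_request("traffic-*", "YKReAoQBk7dLIXMBhYBF", "e")

# With a connected client, as in the question:
# results = es.update_by_query(**request)
# print(results["version_conflicts"])
```

With conflicts set to proceed, the response reports how many documents were skipped because of version conflicts, so you can decide whether to run the update again.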

How do I get a specific element in array - MongoDB

Documents are stored in MongoDB in the following form:
{
"_id" : ObjectId("54fa059ce4b01b3e086c83e9"),
"field1" : "value1",
"field2" : "value2",
"field3" : [
{
"abc123": ["somevalue", "somevalue"]
},
{
"xyz345": ["somevalue", "somevalue"]
}
]
}
Whenever I pass abc123 in a pymongo query, I need the result in the following form:
{
"abc123": ["somevalue", "somevalue"]
}
or
["somevalue", "somevalue"]
Please suggest a mongo query for it. Thanks
Maybe something like this:
db.collection.aggregate([
{
$project: {
field3: {
"$filter": {
"input": "$field3",
"as": "f",
"cond": {
$ne: [
"$$f.abc123",
undefined
]
}
}
}
}
},
{
$unwind: "$field3"
},
{
"$replaceRoot": {
"newRoot": "$field3"
}
}
])
Explained:
Use the MongoDB aggregation framework with the following three stages:
$project with $filter: keep only the field3 elements that contain the requested key, if any exist
$unwind: unwind the field3 array
$replaceRoot: replace the root document with the content of field3
playground
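The shell query above compares against undefined, which has no direct Python equivalent. One possible pymongo translation, sketched here under that assumption, checks the field with the $type operator, which returns "missing" for absent fields. The build_key_pipeline helper is a hypothetical name:

```python
def build_key_pipeline(key):
    """Build a pipeline returning the field3 element that contains `key` (sketch)."""
    return [
        {
            "$project": {
                "field3": {
                    "$filter": {
                        "input": "$field3",
                        "as": "f",
                        # $type yields "missing" when the element lacks the key.
                        "cond": {"$ne": [{"$type": f"$$f.{key}"}, "missing"]},
                    }
                }
            }
        },
        {"$unwind": "$field3"},
        {"$replaceRoot": {"newRoot": "$field3"}},
    ]


pipeline = build_key_pipeline("abc123")

# With pymongo:
# results = list(db.collection.aggregate(pipeline))
```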

Adding multiple fields to documents in mongodb using pymongo

I have a sample collection of documents in mongo db like below
[{"name":"hans","age":30,"test":"pass","pre":"no","calc":"no"},
{"name":"abs","age":20,"test":"not_pass","pre":"yes","calc":"no"},
{"name":"cdf","age":40,"test":"pass"},
{"name":"cvf","age":30,"test":"not_pass","pre":"no","calc":"yes"},
{"name":"cdf","age":23,"test":"pass"},
{"name":"asd","age":35,"test":"not_pass"}]
For some documents the fields pre and calc are not present. I want to add those two fields, with the value "null", to the documents that don't have them: "pre":"null", "calc":"null".
The final document should look like
[{"name":"hans","age":30,"test":"pass","pre":"no","calc":"no"},
{"name":"abs","age":20,"test":"not_pass","pre":"yes","calc":"no"},
{"name":"cdf","age":40,"test":"pass","pre":"null","calc":"null"},
{"name":"cvf","age":30,"test":"not_pass","pre":"no","calc":"yes"},
{"name":"cdf","age":23,"test":"pass","pre":"null","calc":"null"},
{"name":"asd","age":35,"test":"not_pass","pre":"null","calc":"null"}]
I tried this, but it didn't work:
db.users.update({}, { "$set" : { "pre":"null","calc":"null" }}, false,true)
I think you need an update with an aggregation pipeline, using the $ifNull operator:
db.users.update({},
[
{
"$set": {
"pre": {
$ifNull: [
"$pre",
"null"
]
},
"calc": {
$ifNull: [
"$calc",
"null"
]
}
}
}
],
false,
true
)
Sample Mongo Playground
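Since the question is about pymongo, the same pipeline update can be sketched in Python. Pipeline-style updates need MongoDB 4.2 or later, and update_many replaces the shell's multi flag. The build_defaults_update helper is a hypothetical name; note that $ifNull also replaces fields that are present but explicitly null.

```python
def build_defaults_update(fields, default="null"):
    """Build a pipeline update that fills missing (or null) fields with a default."""
    return [
        {
            "$set": {
                name: {"$ifNull": [f"${name}", default]}
                for name in fields
            }
        }
    ]


update = build_defaults_update(["pre", "calc"])

# With pymongo:
# db.users.update_many({}, update)
```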
The easiest option is to run this query for every missing field, for example for pre:
db.collection.update({
pre: {
$exists: false
}
},
{
"$set": {
"pre": null
}
},
{
multi: true
})
Playground
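Running that query once per field is easy to script. A minimal sketch, assuming a hypothetical build_missing_field_updates helper that returns one (filter, update) pair per field; the default value is a parameter so you can choose between a real null (None) and the string "null" from the question:

```python
def build_missing_field_updates(fields, default=None):
    """Return one (filter, update) pair per field that may be missing."""
    return [
        ({name: {"$exists": False}}, {"$set": {name: default}})
        for name in fields
    ]


updates = build_missing_field_updates(["pre", "calc"], default="null")

# With pymongo, apply each pair in turn:
# for flt, upd in updates:
#     db.users.update_many(flt, upd)
```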

How to join multiple collections in MongoDB (one to many relationship)?

I have two collections: document and citation. Their structures are shown below:
# document
{id:001, title:'foo'}
{id:002, title:'bar'}
{id:003, title:'abc'}
# citation
{from_id:001, to_id:002}
{from_id:001, to_id:003}
I want to query the information of the documents cited by each document (called references, denoted by to_id). In SQL, I would left join the document table with citation, and then left join document again to get the full information of the references (not just their ids).
However, I can only achieve the first step with $lookup in MongoDB. Here is my aggregate pipeline:
[
{'$lookup':{
'from': 'citation',
'localField': 'id',
'foreignField': 'from_id',
'as': 'references'
}}
]
I am able to get the following results with this pipeline:
{
id:001,
title:'foo',
references:[{from_id:001, to_id:002}, {from_id:001, to_id:003}]
}
The desired result is:
{
id:001,
title:'foo',
references:[{id:002, title:'bar'}, {id:003, title:'abc'}]
}
I have found this answer but it seems to be a one-to-one relationship that is not applicable in my case.
EDIT: Some people said that joins should be avoided in MongoDB because it's not a relational database. I chose MongoDB because it's much faster than MySQL in my case.
You need to $unwind the references, $lookup again (this time against the document collection), and then $group by _id to get the desired result.
Try the following:
[
    {
        "$lookup": {
            "from": "citation",
            "localField": "id",
            "foreignField": "from_id",
            "as": "references"
        }
    },
    {
        "$unwind": "$references"
    },
    {
        "$lookup": {
            "from": "document",
            "localField": "references.to_id",
            "foreignField": "id",
            "as": "map"
        }
    },
    {
        "$unwind": "$map"
    },
    {
        "$project": {
            "id": 1,
            "title": 1,
            "map_id": "$map.id",
            "map_title": "$map.title"
        }
    },
    {
        "$group": {
            "_id": "$_id",
            "id": {
                "$first": "$id"
            },
            "title": {
                "$first": "$title"
            },
            "references": {
                "$push": {
                    "id": "$map_id",
                    "title": "$map_title"
                }
            }
        }
    }
]
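To sanity-check what this kind of pipeline should return, the same one-to-many join can be emulated in plain Python on the sample data from the question. This is only a sketch for reasoning about the output shape; join_references is a hypothetical helper and the ids are written as plain integers.

```python
documents = [
    {"id": 1, "title": "foo"},
    {"id": 2, "title": "bar"},
    {"id": 3, "title": "abc"},
]
citations = [
    {"from_id": 1, "to_id": 2},
    {"from_id": 1, "to_id": 3},
]


def join_references(documents, citations):
    """Emulate lookup -> unwind -> lookup -> unwind -> group in memory."""
    by_id = {doc["id"]: doc for doc in documents}
    result = []
    for doc in documents:
        refs = [
            {"id": by_id[c["to_id"]]["id"], "title": by_id[c["to_id"]]["title"]}
            for c in citations
            if c["from_id"] == doc["id"] and c["to_id"] in by_id
        ]
        if refs:  # like $unwind, documents without any citation are dropped
            result.append({"id": doc["id"], "title": doc["title"], "references": refs})
    return result


joined = join_references(documents, citations)
```

Because a plain $unwind drops documents whose array is empty, documents that cite nothing do not appear in the result; if you need them, $unwind accepts a preserveNullAndEmptyArrays option.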
