MongoDB Query in Pymongo - python

I have a collection in this format:
{
"name": ....,
"users": [....,....,....,....]
}
I have two different names and I want to find the total number of users that belong to both documents. Currently I do this in Python: I download the document for name 1 and the document for name 2 and check how many users appear in both. Is there a way to do this in MongoDB alone and return just the number?
Example:
{
"name": "John",
"users": ["001","003","008","010"]
}
{
"name": "Peter",
"users": ["002", "003", "004","005","006","008"]
}
The result would be 2, since users 003 and 008 belong to both documents.
How I do it:
doc1 = db.collection.find_one({"name":"John"})
doc2 = db.collection.find_one({"name":"Peter"})
total = 0
for user in doc1["users"]:
    if user in doc2["users"]:
        total += 1

You could also do this with the aggregation framework. I think it only really makes sense if you are comparing more than two names, but you could use it this way as well:
db.users.aggregate([
    { "$match": {
        "name": { "$in": [ "John", "Peter" ] }
    }},
    { "$unwind": "$users" },
    { "$group": {
        "_id": "$users",
        "count": { "$sum": 1 }
    }},
    { "$match": { "count": { "$gt": 1 } }},
    { "$group": {
        "_id": null,
        "count": { "$sum": 1 }
    }}
])
That lets you find the same count over whatever names you supply to $in in the $match stage.
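For use from pymongo, the same pipeline can be wrapped in a small helper. This is a minimal sketch: the function names `shared_user_pipeline` and `count_shared_users` are made up for illustration, and `collection` is assumed to be whichever pymongo collection holds these documents. It also assumes a user ID appears at most once in each document's users array, since a duplicate inside one document would be counted as a match.

```python
def shared_user_pipeline(names):
    """Build the aggregation pipeline from the answer for the given names."""
    return [
        # Keep only the documents for the requested names.
        {"$match": {"name": {"$in": list(names)}}},
        # One output document per (name, user) pair.
        {"$unwind": "$users"},
        # Count how many of the matched documents contain each user.
        {"$group": {"_id": "$users", "count": {"$sum": 1}}},
        # Keep users that appear in more than one document.
        {"$match": {"count": {"$gt": 1}}},
        # Collapse to a single document holding the number of such users.
        {"$group": {"_id": None, "count": {"$sum": 1}}},
    ]


def count_shared_users(collection, names):
    """Return the shared-user count, or 0 when nothing is shared."""
    # The final $group emits no document at all when its input is empty.
    result = list(collection.aggregate(shared_user_pipeline(names)))
    return result[0]["count"] if result else 0
```

Called as count_shared_users(db.users, ["John", "Peter"]), this would be expected to return 2 for the sample documents in the question.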

Related

How to filter ElasticSearch results without having it affect the document score?

I am trying to filter my results on the "publication_year" field without affecting the document score. But whether I add the "range" clause to the query or to "filter", it seems to affect the score: documents whose "publication_year" is closer to the "lte" (less than or equal to) upper limit of the range are scored higher.
My query:
query = {
    'bool': {
        'should': [
            {
                'match_phrase': {
                    "title": keywords
                }
            },
            {
                'match_phrase': {
                    "abstract": keywords
                }
            },
        ]
    }
}
if publication_year_constraint:
    range_query = {"range":{"publication_year":{"gte":publication_year_constraint, "lte": datetime.datetime.today().year}}}
    query["bool"]["filter"] = [range_query]
I tried putting the "range" inside the "should" block as well, with similar results.
Try using filter context.
In a filter context, a query clause answers the question “Does this
document match this query clause?” The answer is a simple Yes or
No — no scores are calculated.
Example:
{
    "query": {
        "bool": {
            "must": [
                { "match": { "title": "Search" }},
                { "match": { "content": "Elasticsearch" }}
            ],
            "filter": [
                { "term": { "status": "published" }},
                { "range": { "publish_date": { "gte": "2015-01-01" }}}
            ]
        }
    }
}
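Applied to the query from the question, that might look like the sketch below. The function name `build_query` and its `current_year` parameter are hypothetical, added only so the example is self-contained. One detail worth checking against your Elasticsearch version: when a bool query contains a filter clause, its should clauses become optional by default, so the sketch sets minimum_should_match to 1 to keep requiring a title or abstract match.

```python
import datetime


def build_query(keywords, publication_year_constraint=None, current_year=None):
    """Score on title/abstract phrases; restrict by year without scoring."""
    query = {
        "bool": {
            "should": [
                {"match_phrase": {"title": keywords}},
                {"match_phrase": {"abstract": keywords}},
            ],
            # With a filter present, "should" would otherwise be optional.
            "minimum_should_match": 1,
        }
    }
    if publication_year_constraint:
        if current_year is None:
            current_year = datetime.date.today().year
        # Filter context: a yes/no check that contributes nothing to _score.
        query["bool"]["filter"] = [
            {
                "range": {
                    "publication_year": {
                        "gte": publication_year_constraint,
                        "lte": current_year,
                    }
                }
            }
        ]
    return query
```

If the scores still differ after this change, it may be worth confirming that the range clause is not also present somewhere in the scoring part of the query.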

How can I retrieve a related document in MongoDB?

I'm using Flask with the Jinja2 template engine and MongoDB via pymongo. These are my documents from two collections (phone and factory):
phone = db.get_collection("phone")
{
"_id": ObjectId("63d8d39206c9f93e68d27206"),
"brand": "Apple",
"model": "iPhone XR",
"year": NumberInt("2016"),
"image": "https://apple-mania.com.ua/media/catalog/product/cache/e026f651b05122a6916299262b60c47d/a/p/apple-iphone-xr-yellow_1.png",
"CPU": {
"manufacturer": "A12 Bionic",
"cores": NumberInt("10")
},
"misc": [
"Bluetooth 5.0",
"NFC",
"GPS"
],
"factory_id": ObjectId("63d8d42b7a4d7a7e825ef956")
}
factory = db.get_collection("factory")
{
"_id": ObjectId("63d8d42b7a4d7a7e825ef956"),
"name": "Foxconn",
"stock": NumberInt("1000")
}
In my python code to retrieve the data I do:
models = list(
phone.find({"brand": brand}, projection={"model": True, "image": True, "factory_id": True})
)
How can I retrieve the related factory document by factory_id and have it as an embedded document in the models list?
I think you are looking for the $lookup aggregation stage.
The query works as follows:
First, $match by your desired brand.
Then do a "join" between the collections based on factory_id and store the result in an array called "factories". The $lookup output is always an array because there can be more than one match.
Finally, $project only the values you want. Since _id is unique, you can get the factory with $arrayElemAt at position 0.
So the code can look like this (I'm not a Python expert):
models = list(
    phone.aggregate([
        {
            "$match": {
                "brand": brand
            }
        },
        {
            "$lookup": {
                "from": "factory",
                "localField": "factory_id",
                "foreignField": "_id",
                "as": "factories"
            }
        },
        {
            "$project": {
                "model": True,
                "image": True,
                "factory": {
                    "$arrayElemAt": [
                        "$factories",
                        0
                    ]
                }
            }
        }
    ])
)
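If $lookup is not available or not convenient, roughly the same result can be assembled on the client with two queries. The sketch below is an alternative to the aggregation above, not part of the original answer; `attach_factories` is a hypothetical helper name, and it works on plain lists of dicts so it can be tried without a database.

```python
def attach_factories(models, factories):
    """Embed each model's factory document under the "factory" key."""
    # Index the factory documents by _id for constant-time lookups.
    factories_by_id = {f["_id"]: f for f in factories}
    joined = []
    for model in models:
        # Copy everything except the foreign key, then embed the factory.
        doc = {k: v for k, v in model.items() if k != "factory_id"}
        # None when the model has no factory_id or no factory matches.
        doc["factory"] = factories_by_id.get(model.get("factory_id"))
        joined.append(doc)
    return joined
```

With pymongo, the second argument could come from factory.find({"_id": {"$in": ids}}), where ids is the list of factory_id values collected from the models returned by the question's find() call.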

How to paginate subdocuments in a MongoDB collection?

I have a MongoDB collection with the following data structure:
[
{
"_id": "1",
"name": "businessName1",
"reviews": [
{
"_id": "1",
"comment": "comment1",
},
{
"_id": "2",
"comment": "comment1",
},
...
]
}
]
As you can see, the reviews for each business are a subdocument within the collection, where businessName1 has a total of 2 reviews. In my real MongoDB collection, each business has 100s of reviews. I want to view only 10 on one page using pagination.
I currently have a find_one() function in Python that retrieves this single business, but it also retrieves all of its reviews as well.
businesses.find_one(
    { "_id" : ObjectId(1) },
    { "reviews" : 1, "_id" : 0 } )
I'm aware of the skip() and limit() methods in Python, where you can limit the number of results that are retrieved, but as far as I'm aware, you can only perform these methods on the find() method. Is this correct?
Option 1: You can use $slice for pagination as follows:
db.collection.find({
    _id: 1
},
{
    _id: 0,
    reviews: {
        $slice: [
            3,
            5
        ]
    }
})
Option 2: Use aggregation, which can also return the total array size and so may be the better choice:
db.collection.aggregate([
    {
        $project: {
            _id: 0,
            reviews: {
                $slice: [
                    "$reviews",
                    3,
                    5
                ]
            },
            total: {
                $size: "$reviews"
            }
        }
    }
])
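On the pymongo side of the question: skip() and limit() apply to the cursor of matched documents, not to an array inside one document, but find_one() accepts the same $slice projection. A minimal sketch, where `reviews_page_projection` is a made-up helper name and pages are assumed to be numbered from 1:

```python
def reviews_page_projection(page, page_size=10):
    """Projection that returns one page of the reviews array."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    skip = (page - 1) * page_size
    # $slice: [skip, limit] skips `skip` elements, then returns `limit`.
    return {"_id": 0, "reviews": {"$slice": [skip, page_size]}}
```

For example, businesses.find_one({"_id": business_id}, reviews_page_projection(2)) would ask for reviews 11 to 20, with business_id standing in for whatever identifier the application uses.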

MongoDB - How to aggregate data for each record

I have some stored data like this:
{
"_id" : 1,
"serverAddresses" : {
"name" : "0.0.0.0:8000",
"name2": "0.0.0.0:8001"
}
}
I need aggregated data to this:
[
{
"gameId": "1",
"name": "name1",
"url": "0.0.0.0:8000"
},
{
"gameId": "1",
"name": "name2",
"url": "0.0.0.0:8001"
}
]
What is the solution without using a for loop?
$project - Add an addresses field by converting $serverAddresses to a key-value array.
$unwind - Deconstruct the addresses field into multiple documents.
$replaceRoot - Shape the output document from each unwound addresses entry.
db.collection.aggregate([
    {
        "$project": {
            "addresses": {
                "$objectToArray": "$serverAddresses"
            }
        }
    },
    {
        $unwind: "$addresses"
    },
    {
        "$replaceRoot": {
            "newRoot": {
                gameId: "$_id",
                name: "$addresses.k",
                url: "$addresses.v"
            }
        }
    }
])
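In pymongo the stage names and field names must be quoted strings, so the pipeline is written as a list of dicts. The sketch below uses the `url` field name from the desired output; `PIPELINE` and `flatten_server_addresses` are hypothetical names, and the function is only a pure-Python mirror of the three stages to show what each document turns into, not a replacement for running the pipeline on the server.

```python
PIPELINE = [
    # {"name": "0.0.0.0:8000"} -> [{"k": "name", "v": "0.0.0.0:8000"}]
    {"$project": {"addresses": {"$objectToArray": "$serverAddresses"}}},
    # One document per key-value pair.
    {"$unwind": "$addresses"},
    # Promote the fields of interest to the top level.
    {
        "$replaceRoot": {
            "newRoot": {
                "gameId": "$_id",
                "name": "$addresses.k",
                "url": "$addresses.v",
            }
        }
    },
]


def flatten_server_addresses(doc):
    """Pure-Python illustration of what PIPELINE does to one document."""
    return [
        {"gameId": doc["_id"], "name": key, "url": value}
        for key, value in doc["serverAddresses"].items()
    ]
```

The pipeline itself would be run with list(db.collection.aggregate(PIPELINE)). Note that gameId keeps the type of _id (a number in the sample), whereas the desired output shows a string; wrapping it as {"$toString": "$_id"} would be one way to convert it on MongoDB 4.0 or later.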

How to filter elements of array in mongodb

I have a document in MongoDB:
{
"company": "npcompany",
"department": [
{
"name": "it",
"employeeIds": [
"emp1",
"emp2",
"emp3"
]
},
{
"name": "economy",
"employeeIds": [
"emp1",
"emp3",
"emp4"
]
}
]
}
I want to find "emp4". In this case I want to get the "economy" department data only. If I search for "emp1", I want to get the "npcompany" and "economy" data. How can I do this in MongoDB (or pymongo)?
db.collection.aggregate([ // As you need to fetch all matching array elements, reshape them
    {
        $unwind: "$department"
    },
    {
        "$match": { // look for a match
            "department.employeeIds": "emp4"
        }
    },
    {
        $group: { // regroup them
            "_id": "$_id",
            data: {
                "$push": "$$ROOT"
            }
        }
    }
])
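An alternative that avoids $unwind and $group is to filter the array in place with $filter. This swaps in a different technique from the answer above, so treat it as a sketch; `departments_with_employee` is a hypothetical helper name, and the field names are taken from the sample document.

```python
def departments_with_employee(employee_id):
    """Pipeline keeping only departments that list the given employee."""
    return [
        # Skip documents where no department contains the employee.
        {"$match": {"department.employeeIds": employee_id}},
        {
            "$project": {
                "company": 1,
                # Keep the array elements whose employeeIds include the id.
                "department": {
                    "$filter": {
                        "input": "$department",
                        "as": "dept",
                        "cond": {"$in": [employee_id, "$$dept.employeeIds"]},
                    }
                },
            }
        },
    ]
```

With pymongo this would be run as list(db.collection.aggregate(departments_with_employee("emp4"))). For the sample document, the filter condition keeps only the economy department, since it is the only one listing emp4.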
