Get elements from array between two dates [duplicate] - python

Suppose I have the following documents in my collection:
{
"_id":ObjectId("562e7c594c12942f08fe4192"),
"shapes":[
{
"shape":"square",
"color":"blue"
},
{
"shape":"circle",
"color":"red"
}
]
},
{
"_id":ObjectId("562e7c594c12942f08fe4193"),
"shapes":[
{
"shape":"square",
"color":"black"
},
{
"shape":"circle",
"color":"green"
}
]
}
I run this query:
db.test.find({"shapes.color": "red"}, {"shapes.color": 1})
Or
db.test.find({shapes: {"$elemMatch": {color: "red"}}}, {"shapes.color": 1})
This returns the matched document (Document 1), but always with ALL array items in shapes:
{ "shapes":
[
{"shape": "square", "color": "blue"},
{"shape": "circle", "color": "red"}
]
}
However, I'd like to get the document (Document 1) with only the array element that has color=red:
{ "shapes":
[
{"shape": "circle", "color": "red"}
]
}
How can I do this?

MongoDB 2.2's new $elemMatch projection operator provides another way to alter the returned document to contain only the first matched shapes element:
db.test.find(
{"shapes.color": "red"},
{_id: 0, shapes: {$elemMatch: {color: "red"}}});
Returns:
{"shapes" : [{"shape": "circle", "color": "red"}]}
In 2.2 you can also do this using the $ projection operator, where the $ in a projection object field name represents the index of the field's first matching array element from the query. The following returns the same results as above:
db.test.find({"shapes.color": "red"}, {_id: 0, 'shapes.$': 1});
MongoDB 3.2 Update
Starting with the 3.2 release, you can use the new $filter aggregation operator to filter an array during projection, which has the benefit of including all matches, instead of just the first one.
db.test.aggregate([
// Get just the docs that contain a shapes element where color is 'red'
{$match: {'shapes.color': 'red'}},
{$project: {
shapes: {$filter: {
input: '$shapes',
as: 'shape',
cond: {$eq: ['$$shape.color', 'red']}
}},
_id: 0
}}
])
Results:
[
{
"shapes" : [
{
"shape" : "circle",
"color" : "red"
}
]
}
]
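The difference between the first-match projections ($elemMatch, $) and $filter is easiest to see on an array with more than one red element. The following is a plain-Python model of the two behaviours, not a MongoDB call; the sample document and the helper names first_match and all_matches are made up for illustration.

```python
# Plain-Python model of the two projection behaviours; no MongoDB connection needed.
# The document below is a hypothetical example with two red shapes.
doc = {
    "shapes": [
        {"shape": "square", "color": "red"},
        {"shape": "circle", "color": "blue"},
        {"shape": "triangle", "color": "red"},
    ]
}


def first_match(elements, color):
    """Model of the $elemMatch / $ projection: at most one element is kept."""
    for element in elements:
        if element.get("color") == color:
            return [element]
    return []


def all_matches(elements, color):
    """Model of $filter: every element that satisfies the condition is kept."""
    return [element for element in elements if element.get("color") == color]


print(first_match(doc["shapes"], "red"))
print(all_matches(doc["shapes"], "red"))
```

The first call keeps only the square, while the second keeps both the square and the triangle.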

The new Aggregation Framework in MongoDB 2.2+ provides an alternative to Map/Reduce. The $unwind operator can be used to separate your shapes array into a stream of documents that can be matched:
db.test.aggregate([
    // Start with a $match stage, which can take advantage of an index and limit the documents processed
    { $match : {
        "shapes.color": "red"
    }},
    // Turn each shapes element into its own document
    { $unwind : "$shapes" },
    // Keep only the unwound elements that are red
    { $match : {
        "shapes.color": "red"
    }}
])
Results in:
{
"result" : [
{
"_id" : ObjectId("504425059b7c9fa7ec92beec"),
"shapes" : {
"shape" : "circle",
"color" : "red"
}
}
],
"ok" : 1
}

Caution: This answer provides a solution that was relevant at that time, before the new features of MongoDB 2.2 and up were introduced. See the other answers if you are using a more recent version of MongoDB.
The field selector parameter is limited to complete properties. It cannot be used to select part of an array, only the entire array. I tried using the $ positional operator, but that didn't work.
The easiest way is to just filter the shapes in the client.
If you really need the correct output directly from MongoDB, you can use a map-reduce to filter the shapes.
function map() {
filteredShapes = [];
this.shapes.forEach(function (s) {
if (s.color === "red") {
filteredShapes.push(s);
}
});
emit(this._id, { shapes: filteredShapes });
}
function reduce(key, values) {
return values[0];
}
res = db.test.mapReduce(map, reduce, { query: { "shapes.color": "red" } })
db[res.result].find()

Another interesting way is to use $redact, which is one of the new aggregation features of MongoDB 2.6. If you are using 2.6, you don't need an $unwind, which might cause performance problems if you have large arrays.
db.test.aggregate([
{ $match: {
shapes: { $elemMatch: {color: "red"} }
}},
{ $redact : {
$cond: {
if: { $or : [{ $eq: ["$color","red"] }, { $not : "$color" }]},
then: "$$DESCEND",
else: "$$PRUNE"
}
}}]);
$redact "restricts the contents of the documents based on information stored in the documents themselves", so it operates only within each document. It scans the document from top to bottom and evaluates the if condition in $cond at each level: on a match it keeps the content and continues into embedded documents ($$DESCEND); otherwise it removes that level ($$PRUNE).
In the example above, first $match returns the whole shapes array, and $redact strips it down to the expected result.
Note that {$not:"$color"} is necessary because $redact evaluates the top-level document as well. Without it, the condition would be false at the top level, where there is no color field, and the whole document would be stripped, which we don't want.
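To make the top-to-bottom scan concrete, here is a simplified Python model of how $$DESCEND and $$PRUNE behave for this particular condition. It is only a sketch: the function name redact is made up, and "no color field" is approximated as `None`, which does not cover every value that $not treats as false.

```python
def redact(node):
    """Apply the answer's condition at this level; return None if the level is pruned."""
    color = node.get("color")
    # Same test as the $cond above: color is "red", or there is no color at this level.
    if not (color == "red" or color is None):
        return None  # $$PRUNE: drop this level entirely

    kept = {}
    for key, value in node.items():  # $$DESCEND: keep the fields, then look deeper
        if isinstance(value, dict):
            child = redact(value)
            if child is not None:
                kept[key] = child
        elif isinstance(value, list):
            items = []
            for item in value:
                if isinstance(item, dict):
                    item = redact(item)
                    if item is None:
                        continue  # pruned array element
                items.append(item)
            kept[key] = items
        else:
            kept[key] = value
    return kept


doc = {
    "_id": 1,
    "shapes": [
        {"shape": "square", "color": "blue"},
        {"shape": "circle", "color": "red"},
    ],
}
print(redact(doc))
```

The top level has no color, so it is kept and descended into; the blue square is pruned and the red circle survives.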

A better option is to project the matching array element, which is helpful for returning only the significant object in an array.
db.test.find({"shapes.color" : "blue"}, {"shapes.$" : 1})
$slice is helpful when you know the index of the element, but sometimes you want
whichever array element matched your criteria. You can return the matching element
with the $ operator.

db.getCollection('aj').find({"shapes.color":"red"},{"shapes.$":1})
OUTPUTS
{
"shapes" : [
{
"shape" : "circle",
"color" : "red"
}
]
}

The syntax for find in mongodb is
db.<collection name>.find(query, projection);
and the second query that you have written, that is
db.test.find(
{shapes: {"$elemMatch": {color: "red"}}},
{"shapes.color":1})
uses the $elemMatch operator in the query part. If you use this operator in the projection part instead, you will get the desired result. You can write your query as
db.test.find(
{"shapes.color":"red"},
{_id:0, shapes: {$elemMatch : {color: "red"}}})
This will give you the desired result.

Thanks to JohnnyHK.
Here I just want to add some more complex usage.
// Document
{
"_id" : 1,
"shapes" : [
{"shape" : "square", "color" : "red"},
{"shape" : "circle", "color" : "green"}
]
}
{
"_id" : 2,
"shapes" : [
{"shape" : "square", "color" : "red"},
{"shape" : "circle", "color" : "green"}
]
}
// The Query
db.contents.find({
"_id" : 1,
"shapes.color":"red"
},{
"_id": 0,
"shapes" :{
"$elemMatch":{
"color" : "red"
}
}
})
//And the Result
{"shapes":[
{
"shape" : "square",
"color" : "red"
}
]}

You just need to run query
db.test.find(
{"shapes.color": "red"},
{shapes: {$elemMatch: {color: "red"}}});
output of this query is
{
"_id" : ObjectId("562e7c594c12942f08fe4192"),
"shapes" : [
{"shape" : "circle", "color" : "red"}
]
}
As you expected, it gives exactly the array element that matches color: 'red'.

It is more appropriate to combine this with $project; otherwise the matching elements will be lumped together with the other fields in the document.
db.test.aggregate(
{ "$unwind" : "$shapes" },
{ "$match" : { "shapes.color": "red" } },
{
"$project": {
"_id":1,
"item":1
}
}
)

Likewise, you can filter on multiple values:
db.getCollection('localData').aggregate([
// Get just the docs that contain a shapes element where color is 'red' or 'yellow'
{$match: {'shapes.color': {$in : ['red','yellow'] } }},
{$project: {
shapes: {$filter: {
input: '$shapes',
as: 'shape',
cond: {$in: ['$$shape.color', ['red', 'yellow']]}
}}
}}
])
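Since the page is tagged python, the same pipeline can be written as ordinary Python dicts and lists. This is a minimal sketch: the helper name shapes_filter_pipeline is hypothetical, and the actual aggregate call is left out so the snippet runs without a server.

```python
def shapes_filter_pipeline(colors):
    """Build a $match + $project/$filter pipeline for a list of colors."""
    colors = list(colors)
    return [
        # Keep only documents with at least one shape in the wanted colors.
        {"$match": {"shapes.color": {"$in": colors}}},
        # Trim the shapes array down to the elements in the wanted colors.
        {"$project": {
            "shapes": {"$filter": {
                "input": "$shapes",
                "as": "shape",
                "cond": {"$in": ["$$shape.color", colors]},
            }}
        }},
    ]


pipeline = shapes_filter_pipeline(["red", "yellow"])
print(pipeline)
# With PyMongo and a reachable server, this list would be passed to
# collection.aggregate(pipeline); that call is omitted here on purpose.
```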

db.test.find( {"shapes.color": "red"}, {_id: 0})

Use the aggregate function with $project to get a specific field of the document:
db.getCollection('geolocations').aggregate([ { $project : { geolocation : 1} } ])
result:
{
"_id" : ObjectId("5e3ee15968879c0d5942464b"),
"geolocation" : [
{
"_id" : ObjectId("5e3ee3ee68879c0d5942465e"),
"latitude" : 12.9718313,
"longitude" : 77.593551,
"country" : "India",
"city" : "Chennai",
"zipcode" : "560001",
"streetName" : "Sidney Road",
"countryCode" : "in",
"ip" : "116.75.115.248",
"date" : ISODate("2020-02-08T16:38:06.584Z")
}
]
}

Although the question was asked 9.6 years ago, this has been of immense help to numerous people, me being one of them. Thank you everyone for all your queries, hints and answers. Picking up from one of the answers here, I found that the following method can also be used to project other fields in the parent document. This may be helpful to someone.
For the following document, the need was to find out if an employee (emp #7839) has his leave history set for the year 2020. Leave history is implemented as an embedded document within the parent Employee document.
db.employees.find( {"leave_history.calendar_year": 2020},
{leave_history: {$elemMatch: {calendar_year: 2020}},empno:true,ename:true}).pretty()
{
"_id" : ObjectId("5e907ad23997181dde06e8fc"),
"empno" : 7839,
"ename" : "KING",
"mgrno" : 0,
"hiredate" : "1990-05-09",
"sal" : 100000,
"deptno" : {
"_id" : ObjectId("5e9065f53997181dde06e8f8")
},
"username" : "none",
"password" : "none",
"is_admin" : "N",
"is_approver" : "Y",
"is_manager" : "Y",
"user_role" : "AP",
"admin_approval_received" : "Y",
"active" : "Y",
"created_date" : "2020-04-10",
"updated_date" : "2020-04-10",
"application_usage_log" : [
{
"logged_in_as" : "AP",
"log_in_date" : "2020-04-10"
},
{
"logged_in_as" : "EM",
"log_in_date" : ISODate("2020-04-16T07:28:11.959Z")
}
],
"leave_history" : [
{
"calendar_year" : 2020,
"pl_used" : 0,
"cl_used" : 0,
"sl_used" : 0
},
{
"calendar_year" : 2021,
"pl_used" : 0,
"cl_used" : 0,
"sl_used" : 0
}
]
}

If you want to filter, set and find at the same time:
let post = await Post.findOneAndUpdate(
{
_id: req.params.id,
tasks: {
$elemMatch: {
id: req.params.jobId,
date,
},
},
},
{
$set: {
'jobs.$[i].performer': performer,
'jobs.$[i].status': status,
'jobs.$[i].type': type,
},
},
{
arrayFilters: [
{
'i.id': req.params.jobId,
},
],
new: true,
}
);

This answer does not fully answer the question, but it's related, and I'm writing it down because someone decided to close another question, marking this one as a duplicate (which it is not).
In my case I only wanted to filter the array elements but still return the full elements of the array. All previous answers (including the solution given in the question) gave me headaches when applying them to my particular case because:
I needed my solution to be able to return multiple results of the subarray elements.
Using $unwind + $match + $group resulted in losing root documents without matching array elements, which I didn't want to in my case because in fact I was only looking to filter out unwanted elements.
Using $project > $filter resulted in losing the rest of the fields of the root documents, or forced me to specify all of them in the projection as well, which was not desirable.
So in the end I fixed all of these problems with an $addFields > $filter like this:
db.test.aggregate([
{ $match: { 'shapes.color': 'red' } },
{ $addFields: { 'shapes': { $filter: {
input: '$shapes',
as: 'shape',
cond: { $eq: ['$$shape.color', 'red'] }
} } } },
])
Explanation:
First match documents with a red coloured shape.
For those documents, add a field called shapes, which in this case will replace the original field called the same way.
To calculate the new value of shapes, $filter the elements of the original $shapes array, temporarily naming each of the array elements as shape so that later we can check if the $$shape.color is red.
Now the new shapes array only contains the desired elements.
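The practical effect of $addFields here, compared with $project, is that every other root field survives. A small Python model of that behaviour is sketched below; the function name add_filtered_shapes and the owner field are hypothetical and serve only to show that unrelated fields are untouched.

```python
def add_filtered_shapes(doc, color):
    """Return a copy of doc whose shapes list holds only the matching elements."""
    updated = dict(doc)  # shallow copy: every root-level field is carried over
    updated["shapes"] = [s for s in doc.get("shapes", []) if s.get("color") == color]
    return updated


doc = {
    "_id": 1,
    "owner": "hypothetical extra field",
    "shapes": [
        {"shape": "square", "color": "blue"},
        {"shape": "circle", "color": "red"},
    ],
}
result = add_filtered_shapes(doc, "red")
print(result)
```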

For more details, refer to the official MongoDB reference.
Suppose you have a document like this (you can have multiple documents too):
{
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44b"
},
"results": [
{
"yearOfRelease": "2022",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/d/d4/The_Kashmir_Files_poster.jpg",
"title": "The Kashmir Files",
"overview": "Krishna endeavours to uncover the reason behind his parents' brutal killings in Kashmir. He is shocked to uncover a web of lies and conspiracies in connection with the massive genocide.",
"originalLanguage": "hi",
"imdbRating": "8.3",
"isbookMark": null,
"originCountry": "india",
"productionHouse": [
"Zee Studios"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44c"
}
},
{
"yearOfRelease": "2022",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/a/a9/Black_Adam_%28film%29_poster.jpg",
"title": "Black Adam",
"overview": "In ancient Kahndaq, Teth Adam was bestowed the almighty powers of the gods. After using these powers for vengeance, he was imprisoned, becoming Black Adam. Nearly 5,000 years have passed, and Black Adam has gone from man to myth to legend. Now free, his unique form of justice, born out of rage, is challenged by modern-day heroes who form the Justice Society: Hawkman, Dr. Fate, Atom Smasher and Cyclone",
"originalLanguage": "en",
"imdbRating": "8.3",
"isbookMark": null,
"originCountry": "United States of America",
"productionHouse": [
"DC Comics"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44d"
}
},
{
"yearOfRelease": "2022",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/0/09/The_Sea_Beast_film_poster.png",
"title": "The Sea Beast",
"overview": "A young girl stows away on the ship of a legendary sea monster hunter, turning his life upside down as they venture into uncharted waters.",
"originalLanguage": "en",
"imdbRating": "7.1",
"isbookMark": null,
"originCountry": "United States Canada",
"productionHouse": [
"Netflix Animation"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44e"
}
},
{
"yearOfRelease": "2021",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/7/7d/Hum_Do_Hamare_Do_poster.jpg",
"title": "Hum Do Hamare Do",
"overview": "Dhruv, who grew up an orphan, is in love with a woman who wishes to marry someone with a family. In order to fulfil his lover's wish, he hires two older individuals to pose as his parents.",
"originalLanguage": "hi",
"imdbRating": "6.0",
"isbookMark": null,
"originCountry": "india",
"productionHouse": [
"Maddock Films"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44f"
}
},
{
"yearOfRelease": "2021",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/7/74/Shang-Chi_and_the_Legend_of_the_Ten_Rings_poster.jpeg",
"title": "Shang-Chi and the Legend of the Ten Rings",
"overview": "Shang-Chi, a martial artist, lives a quiet life after he leaves his father and the shadowy Ten Rings organisation behind. Years later, he is forced to confront his past when the Ten Rings attack him.",
"originalLanguage": "en",
"imdbRating": "7.4",
"isbookMark": null,
"originCountry": "United States of America",
"productionHouse": [
"Marvel Entertainment"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c450"
}
}
],
"__v": 0
}
The MongoDB query, using the aggregate command:
mongomodels.movieMainPageSchema.aggregate(
[
{
$project: {
_id:0, // suppress the _id field
results: {
$filter: {
input: "$results",
as: "result",
cond: { $eq: [ "$$result.yearOfRelease", "2022" ] }
}
}
}
}
]
)

For the new version of MongoDB, it's slightly different.
For db.collection.find, you can use the second parameter of find with projection as the key. Note that inclusion and exclusion cannot be mixed in one projection, except for excluding _id:
db.collection.find({}, {projection: {name: 1, _id: 0}});
You can also use the .project() method.
However, it is not a native MongoDB method; it is provided by most MongoDB drivers, such as Mongoose and the MongoDB Node.js driver.
db.collection.find({}).project({name: 1, _id: 0});
And if you want to use findOne, it works the same way as with find:
db.collection.findOne({}, {projection: {name: 1, _id: 0}});
But findOne doesn't have a .project() method.

Related

Query a 3rd level nested

I have a DB in MongoDB that has 3 levels, and I want to get the value from the last level. The structure is the following:
{
"_id" : "10000",
"Values" : [
{
"Value1" : "Article 1",
"Value2" : [
{
"Value2_1" : 1,
"Value2_2" : 2,
}
]
}
]
}
I need to get the value from the label "Value2_1".
So far my code is the following:
for row in collection.find({"_id":1, "Values.Value2.Value2_1":1}):
    print(row)
The output is always "None".
Any ideas about how to make the correct query?
Thanks!
By using dot (.) notation you can get your expected result.
db.collection.find({"Values.Value2.Value2_1" : 100})
The above query selects all documents where the Values array contains a Value2 array with an element whose Value2_1 equals 100.
Output:
{
"_id" : ObjectId("5b86bd1172876096c7a9d6cf"),
"Values" : [
{
"Value1" : "Article 1",
"Value2" : [
{
"Value2_1" : 100.0,
"Value2_2" : 200.0
},
{
"Value2_1" : 15.0,
"Value2_2" : 25.0
}
]
}
]
}
And if you try to search with _id then you don't need to use your
second condition because by definition _id is always unique.
This following query will also show the same result as above.
db.collection.find({"_id" : ObjectId("5b86bd1172876096c7a9d6cf")})
If you specifically want only those items of the inner array that fulfil your inner conditions, you can use an aggregation.
PS - My 2 cents - I do not know if you require this, as I could not tell clearly from your question; I just thought you might be asking for it.
db.coll.aggregate([{
$unwind: '$Values'
}, {
$project: {
'Values_F': {
$filter: {
input: "$Values.Value2",
as: "value2",
cond: {
$eq: ["$$value2.Value2_1", 1]
}
}
}
}
}, {
$project: {
'Values_F': 1,
'total': {
$size: '$Values_F'
}
}
}, {
$match: {
total: {
$gte: 1
}
}
}
])
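Once a whole document has been returned to Python, the innermost values can also be pulled out on the client. A minimal sketch, assuming the document shape shown in the question; the helper name value2_1_values is made up for illustration.

```python
def value2_1_values(doc):
    """Collect every Value2_1 found under Values -> Value2 in one document."""
    return [
        inner["Value2_1"]
        for outer in doc.get("Values", [])
        for inner in outer.get("Value2", [])
        if "Value2_1" in inner
    ]


# Same structure as the document in the question.
doc = {
    "_id": "10000",
    "Values": [
        {
            "Value1": "Article 1",
            "Value2": [
                {"Value2_1": 1, "Value2_2": 2},
            ],
        }
    ],
}
print(value2_1_values(doc))
```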

How can I merge rankings from several Elasticsearch queries?

I would like to merge the rankings obtained from querying separate fields of an Elasticsearch index, so to obtain a "compound" ranking.
As a (silly) "matchmaking" example, suppose I wanted to retrieve best-matching results on an index of people containing their favorite music, food, sports.
The separate queries could be e.g.
"query": { "match" : { "music" : "indie classical metal" } }
which would yield me as ranked results:
1. Alice, 2. Bob, 3. Charlie;
"query": { "match" : { "foods" : "falafel strawberries coffee" } }
yielding
1. Alice, 2. Charlie, 3. Bob;
and
"query": { "match" : { "sports" : "basketball ski" } }
yielding
1. Charlie, 2. Alice, 3. Bob.
Now, I would like to obtain an "aggregate" ranking based on the rankings above, e.g. using the voting methods listed in How to merge a collection of ordered preferences.
So far, to achieve something along these lines I used syntax for compound queries such as
"query": {
"bool": {
"should": [
{ "match" : { "music" : "indie classical metal" } },
{ "match" : { "foods" : "falafel strawberries coffee" } },
{ "match" : { "sports" : "basketball ski" } },
]
}
}
or
"query": {
"dis_max": {
"queries": [
{ "match" : { "music" : "indie classical metal" } },
{ "match" : { "foods" : "falafel strawberries coffee" } },
{ "match" : { "sports" : "basketball ski" } },
]
}
}
but (AFAIK) these don't do what I am looking for (which is using ranks, not scores). I understand that it's fairly straightforward to post-process the rankings (e.g. using elasticsearch-py and then a few lines of Python), but is it possible to do the things above directly with an Elasticsearch query?
(bonus question: could you suggest alternative strategies to merge rankings from multiple fields, beyond bool+should and dis_max that I could try out?)
Have a look at Function Score Query - it should allow you to do what you’re looking for. But be aware that it might result in slower query execution.
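For comparison, the post-processing route mentioned in the question can be sketched in a few lines of Python. The example uses a Borda count, which is just one voting method picked here for illustration; neither the question nor the answer prescribes it, and the function name borda_merge is hypothetical. The three input lists are the rankings from the question.

```python
from collections import defaultdict


def borda_merge(rankings):
    """Merge ranked lists with a Borda count: first place in a list of n earns n-1 points."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, name in enumerate(ranking):
            scores[name] += n - 1 - position
    # Highest total first; ties are broken alphabetically to keep the order deterministic.
    return sorted(scores, key=lambda name: (-scores[name], name))


rankings = [
    ["Alice", "Bob", "Charlie"],   # music
    ["Alice", "Charlie", "Bob"],   # foods
    ["Charlie", "Alice", "Bob"],   # sports
]
merged = borda_merge(rankings)
print(merged)
```

With these inputs Alice collects 5 points, Charlie 3 and Bob 1, so the merged order is Alice, Charlie, Bob.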

Collect records into single array in Eve/mongodb to reduce bandwidth

I have a record which is a dictionary of performance sampling at a specific revision of our source code. I am storing this in our eve database. We do this performance test for every revision. We have over 20,000 revisions.
I can get the values using http://host/api/performance?projection={"FileIO.Reads":1,"Revision":1}, which gives me 20,000 records with the following:
{
"_items" : [
{ "_id" : ... ,
"_updated": ...,
"_created":...,
"_etag":...,
"Revision":1000,
"FileIO" : {
{ "Reads": [20.34,10,30] } # avg/min/max
}
},
# next item
{ "_id" : ... ,
"_updated": ...,
"_created":...,
"_etag":...,
"Revision":1001,
"FileIO" : {
{ "Reads": [23,10,50] } # avg/min/max
}
}
# and so on
]
}
Is there some way to ask Eve, or even better MongoDB, to group all of these into a single value of the form of [ [Revision, Reads], [Revision, Reads]... ] or even [Revision, Avg, Min, Max] to minimize the JSON conversion, performance and bandwidth cost?
Should I do my own processing in the event hooks? If so, in what way?
I think I should be able to do this with aggregation of some type but it isn't clear how to merge my revision with my FileIO Reads.
I don't really have any other ideas how to store this data - we just have a dictionary of performance values per revision.
I did some sleuthing and mucking about and came up with the following aggregation pipeline. I don't know if it is efficient but it does what I need it to do. I guess I kind-of understand how it works but the double grouping seems like it should be unnecessary.
db.getCollection('test_profiles').aggregate( [
{ $group: {
_id : { revision :"$revision", value : "$FileIO.Reads" }
}},
{ $unwind : "$_id"},
{ $group: {
_id : null,
values:
{ $push: "$_id" }
}}
])
This yields the following kind of record:
{
"_id" : null,
"values" : [
{
"revision" : 109999,
"value" : [
0.903873742,
0.00723229861,
1.23190153
]
},
{
"revision" : 109998,
"value" : [
0.903873742,
0.00723229861,
1.23190153
]
},
// .. and on and on
]
}
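If the reshaping is done on the client or in an Eve event hook instead, the compact [Revision, Avg, Min, Max] form asked for in the question takes only a few lines of Python. This is a sketch under the assumption that FileIO is a plain dictionary holding the Reads list; the function name to_rows is made up, and no Eve-specific hook signature is assumed.

```python
def to_rows(items):
    """Turn performance records into compact [Revision, Avg, Min, Max] rows."""
    rows = []
    for item in items:
        avg, minimum, maximum = item["FileIO"]["Reads"]
        rows.append([item["Revision"], avg, minimum, maximum])
    return rows


# Two records with the values shown in the question (metadata fields omitted).
items = [
    {"Revision": 1000, "FileIO": {"Reads": [20.34, 10, 30]}},
    {"Revision": 1001, "FileIO": {"Reads": [23, 10, 50]}},
]
rows = to_rows(items)
print(rows)
```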

MongoDB documents combination

I have the collection with document structure like this:
{
"_id" : "Host CPU Utilization (%)",
"count" : 1,
"avg" : NumberDecimal("20.2397439956"),
"flaga" : 4
},
{
"_id" : "Active Sessions Using CPU",
"count" : 1,
"avg" : NumberDecimal("4.0580000000"),
"flaga" : 4
},
{
"_id" : "Wait Time (%)",
"count" : 1,
"avg" : NumberDecimal("1795.2150000000"),
"flaga" : 999
}
Is it possible to use pymongo to change the data into something like this:
{
"_id" : 4,
"Host CPU Utilization (%)" : NumberDecimal("20.2397439956"),
"Active Sessions Using CPU" : NumberDecimal("4.0580000000")
},
{
"_id" : 999,
"Wait Time (%)" : NumberDecimal("1795.2150000000")
}
I have tried to use the update command with $rename, but I can't do it dynamically and can't combine two documents into one. If I use the aggregation framework, I don't know how to $put documents with a variable field name.
Since version 3.4 we have $arrayToObject operator which might be helpful. You can try grouping by flaga field and then using mentioned operator.
db.myCollection.aggregate([
{
$group: {
"_id": "$flaga",
"values": {
"$push": {
"k": "$_id",
"v": "$avg"
}
}
}
},
{
$project: {
"_id": 1,
"values": { $arrayToObject: "$values" }
}
}
])
This will give you results like:
{
"_id":4,
"values":{
"Host CPU Utilization (%)":NumberDecimal("20.2397439956"),
"Active Sessions Using CPU":NumberDecimal("4.0580000000")
}
}
You can add a further pipeline stage with $replaceRoot to get rid of this nesting, but unfortunately you'll lose the _id field, and I bet that's not what you're looking for, so you should probably perform this post-processing in your business logic code.
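That post-processing step is short in Python. The sketch below flattens one grouped result into the shape asked for in the question; the function name flatten_group is hypothetical, and `Decimal` stands in for the NumberDecimal values a driver would return.

```python
from decimal import Decimal


def flatten_group(grouped):
    """Lift the entries of 'values' to the top level while keeping _id."""
    flat = {"_id": grouped["_id"]}
    # Caveat: a metric that happened to be named "_id" would overwrite the group key.
    flat.update(grouped["values"])
    return flat


# One result document from the pipeline above, with Decimal as a stand-in type.
grouped = {
    "_id": 4,
    "values": {
        "Host CPU Utilization (%)": Decimal("20.2397439956"),
        "Active Sessions Using CPU": Decimal("4.0580000000"),
    },
}
flat = flatten_group(grouped)
print(flat)
```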

Elastic Search: including #/hashtags in search results

Using elastic search's query DSL this is how I am currently constructing my query:
elastic_sort = [
{ "timestamp": {"order": "desc" }},
"_score",
{ "name": { "order": "desc" }},
{ "channel": { "order": "desc" }},
]
elastic_query = {
"fuzzy_like_this" : {
"fields" : [ "msgs.channel", "msgs.msg", "msgs.name" ],
"like_text" : search_string,
"max_query_terms" : 10,
"fuzziness": 0.7,
}
}
res = self.es.search(index="chat", body={
"from" : from_result, "size" : results_per_page,
"track_scores": True,
"query": elastic_query,
"sort": elastic_sort,
})
I've been trying to implement a filter or an analyzer that will allow the inclusion of "#" in searches (I want a search for "#thing" to return results that include "#thing"), but I am coming up short. The error messages I am getting are not helpful and just telling me that my query is malformed.
I attempted to incorporate the method found here : http://www.fullscale.co/blog/2013/03/04/preserving_specific_characters_during_tokenizing_in_elasticsearch.html but it doesn't make any sense to me in context.
Does anyone have a clue how I can do this?
Did you create a mapping for your index? You can specify within your mapping that certain fields should not be analyzed.
For example, a tweet mapping can be something like:
"tweet": {
"properties": {
"id": {
"type": "long"
},
"msg": {
"type": "string"
},
"hashtags": {
"type": "string",
"index": "not_analyzed"
}
}
}
You can then perform a term query on "hashtags" for an exact string match, including "#" character.
If you want "hashtags" to be tokenized as well, you can always create a multi-field for "hashtags".
