Update document if value there is no match - python

In Mongodb, how do you skip an update if one field of the document exists?
To give an example, I have the following document structure, and I'd like to only update it if the link key is not matching.
{
"_id": {
"$oid": "56e9978732beb44a2f2ac6ae"
},
"domain": "example.co.uk",
"good": [
{
"crawled": true,
"added": {
"$date": "2016-03-16T17:27:17.461Z"
},
"link": "/url-1"
},
{
"crawled": false,
"added": {
"$date": "2016-03-16T17:27:17.461Z"
},
"link": "url-2"
}
]
}
My update query is:
links.update({
"domain": "example.co.uk"
},
{'$addToSet':
{'good':
{"crawled": False, 'link':"/url-1"} }}, True)
Part of the problem is the crawl field could be set to True or False and the date will also always be different - I don't want to add to the array if the URL exists, regardless of the crawled status.
Update:
Just for clarity, if the URL is not within the document, I want it to be added to the existing array, for example, if /url-3 was introduced, the document would look like this:
{
"_id": {
"$oid": "56e9978732beb44a2f2ac6ae"
},
"domain": "example.co.uk",
"good": [
{
"crawled": true,
"added": {
"$date": "2016-03-16T17:27:17.461Z"
},
"link": "/url-1"
},
{
"crawled": false,
"added": {
"$date": "2016-03-16T17:27:17.461Z"
},
"link": "url-2"
},
{
"crawled": false,
"added": {
"$date": "2016-04-16T17:27:17.461Z"
},
"link": "url-3"
}
]
}
The domain will be unique and specific to the link and I want it to insert the link within the good array if it doesn't exist and do nothing if it does exist.

The only way to do this is to find if there is any document in the collection that matches your criteria using the find_one method, also you need to consider the "good.link" field in your filter criteria. If no document matches you run your update query using the update_one method, but this time you don't use the "good.link" field in your query criteria. Also you don't need the $addToSet operator as it's not doing anything simple use the $push update operator, it makes your intention clear. You also don't need to "upsert" option here.
if not link.find_one({"domain": "example.co.uk", "good.link": "/url-1"}):
link.update_one({"domain": "example.co.uk"},
{"$push": {"good": {"crawled": False, 'link':"/url-1"}}})

in your find section of the query you are matching all documents where
"domain": "example.co.uk"
you need to add that you don't want to match
'good.link':"/url-1"
so try
{
"domain": "example.co.uk",
"good.link": {$ne: "/url-1"}
}

The accepted answer is not correct by saying the only way to do it is using findOne first.
You can do it in a single db call by using the aggregation pipelined updates feature, this allows you to use aggregation operators within an update, now the strategy will be to concat two arrays, the first array will always be the "good" array, the second array will either be [new link] or an empty array based on the condition if the links exists or not using $cond, like so:
links.update({
"domain": "example.co.uk"
},
[
{
"$set": {
"good": {
"$ifNull": [
"$good",
[]
]
}
}
},
{
"$set": {
"good": {
"$concatArrays": [
"$good",
{
"$cond": [
{
"$in": [
"/url-1",
"$good.link"
]
},
[],
[
{
"crawled": False,
"link": "/url-1"
}
]
]
}
]
}
}
}
], True)
Mongo Playground

Related

How can I retrieve relative document in MongoDB?

I'm using Flask with Jinja2 template engine and MongoDB via pymongo. This are my documents from two collections (phone and factory):
phone = db.get_collection("phone")
{
"_id": ObjectId("63d8d39206c9f93e68d27206"),
"brand": "Apple",
"model": "iPhone XR",
"year": NumberInt("2016"),
"image": "https://apple-mania.com.ua/media/catalog/product/cache/e026f651b05122a6916299262b60c47d/a/p/apple-iphone-xr-yellow_1.png",
"CPU": {
"manufacturer": "A12 Bionic",
"cores": NumberInt("10")
},
"misc": [
"Bluetooth 5.0",
"NFC",
"GPS"
],
"factory_id": ObjectId("63d8d42b7a4d7a7e825ef956")
}
factory = db.get_collection("factory")
{
"_id": ObjectId("63d8d42b7a4d7a7e825ef956"),
"name": "Foxconn",
"stock": NumberInt("1000")
}
In my python code to retrieve the data I do:
models = list(
phone.find({"brand": brand}, projection={"model": True, "image": True, "factory_id": True})
)
How can I retrieve relative factory document by factory_id and have it as an embedded document in a models list?
I think you are looking for this query using aggregation stage $lookup.
So this query:
First $match by your desired brand.
Then do a "join" between collections based on the factory_id and store it in an array called "factory". The $lookup output is always an array because can be more than one match.
Last project only values you want. In this case, as _id is unique you can get the factory using $arrayElemAt position 0.
So the code can be like this (I'm not a python expert)
models = list(
phone.aggregate([
{
"$match": {
"brand": brand
}
},
{
"$lookup": {
"from": "factory",
"localField": "factory_id",
"foreignField": "_id",
"as": "factories"
}
},
{
"$project": {
"model": True,
"image": True,
"factory": {
"$arrayElemAt": [
"$factories",
0
]
}
}
}
])
)

check if a property contains a specific value in a document with pymongo

I have a collection of documents that looks like this
{
"_id": "4",
"contacts": [
{
"email": "mail#mail.com",
"name": "A1",
"phone": "00",
"crashNotificationEnabled": false,
"locationShared": true,
"creationDate": ISODate("2020-10-19T15:19:04.498Z")
},
{
"email": "mail#mail.com",
"name": "AG2",
"phone": "00",
"crashNotificationEnabled": false,
"locationShared": false,
"creationDate": ISODate("2020-10-19T15:19:04.498Z")
}
],
"creationDate": ISODate("2020-10-19T15:19:04.498Z"),
"_class": ".model.UserContacts"
}
And i would like to iterate through all documents to check if either crashNotificationEnabled or locationShared is true and add +1 to a counter if its the case, im quite new to python and mongosql so i actually have a hard time trying to do that, i tried a lot of things but there is my last try :
def users_with_guardian_angel(mongoclient):
try:
mydb = mongoclient["main"]
userContacts = mydb["userContacts"]
users = userContacts.find()
for user in users:
result = userContacts.find_one({contacts : { $in: [true]}})
if result:
count_users = count_users + 1
print(f"{count_users} have at least one notificiation enabled")
But the result variable stays empty all the time, so if somebody could help me to accomplish what i want to do and tell what i did wrong here ?
Thanks !
Here's one way you could do it by letting the MongoDB server do all the work.
N.B.: This doesn't consider the possibility of multiple entries of the same user.
db.userContacts.aggregate([
{
"$unwind": "$contacts"
},
{
"$match": {
"$expr": {
"$or": [
"$contacts.crashNotificationEnabled",
"$contacts.locationShared"
]
}
}
},
{
"$count": "userCountWithNotificationsEnabled"
}
])
Try it on mongoplayground.net.
Example output:
[
{
"userCountWithNotificationsEnabled": 436
}
]

Elasticsearch not returning result for single word query

I have a basic Elasticsearch index that consists of a variety of help articles. Users can search for them in my Python/Django app.
The index has the following mappings:
{
"mappings": {
"properties": {
"body": {
"type": "text"
},
"category": {
"type": "nested",
"properties": {
"category_id": {
"type": "long"
},
"category_title": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"type": "text"
}
}
},
"title": {
"type": "keyword"
},
"date_updated": {
"type": "date"
},
"position": {
"type": "integer"
}
}
}
}
I basically want the user to be able to search for a query and get any results that match the article title or category.
Say I have an article called "I Can't Remember My Password" in the "Your Account" category.
If I search for the article title exactly, I see the result. If I search for the category title exactly, I also see the result.
But if I search for just "password", I get nothing. What do I need to change in my setup/query to make it so that this query (or similarly non-exact queries) also returns the result?
My query looks like:
{
"query": {
"bool": {
"should": [{
"multi_match": {
"fields": ["title"],
"query": "password"
}
},
{
"nested": {
"path": "category",
"query": {
"multi_match": {
"fields": ["category.category_title"],
"query": "password"
}
}
}
}
]
}
}
}
I have read other questions and experimented with various settings but no luck so far. I am not doing anything particularly special at index time in terms of preparing the fields so I don't know if that's something to look at. I'm just using the elasticsearch-dsl defaults.
The solution was to reindex the title field as text rather than keyword. The latter only allows exact matching.
Credit to LeBigCat for pointing that out in the comments. They haven't posted it as an answer so I'm doing it on their behalf to improve visibility.

pymongo nested embedded document field update

I have a document as:
{
"name": "restaurant 1",
"rooms":
[
{"name": "room1",
"desks": [
{
"name": "desk1",
"unique": "abcde",
"busy": False
},
{
"name": "desk2",
"unique": "abcdf",
"busy": True
}
]},
{"name": "room2",
"desks": [
{
"name": "desk1",
"unique": "bbcde",
"busy": False
},
{
"name": "desk2",
"unique": "bbcdf",
"busy": False
}
]}
]
}
My pymongo search query:
db.restaurants.update(
{'rooms.desks.unique': 'bbcdf')},
{'$set': {'rooms.$.desks.$$.busy': True}}
)
I couldn't update "busy" field of the desk. $$ part didn't work. What should I replace "$$" with?
or
How can I find the index of the desk.
Thanks in advance
According to documentation it's not possible:
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value.
Most likely you will need to redesign your database schema.

Elastic Search: including #/hashtags in search results

Using elastic search's query DSL this is how I am currently constructing my query:
elastic_sort = [
{ "timestamp": {"order": "desc" }},
"_score",
{ "name": { "order": "desc" }},
{ "channel": { "order": "desc" }},
]
elastic_query = {
"fuzzy_like_this" : {
"fields" : [ "msgs.channel", "msgs.msg", "msgs.name" ],
"like_text" : search_string,
"max_query_terms" : 10,
"fuzziness": 0.7,
}
}
res = self.es.search(index="chat", body={
"from" : from_result, "size" : results_per_page,
"track_scores": True,
"query": elastic_query,
"sort": elastic_sort,
})
I've been trying to implement a filter or an analyzer that will allow the inclusion of "#" in searches (I want a search for "#thing" to return results that include "#thing"), but I am coming up short. The error messages I am getting are not helpful and just telling me that my query is malformed.
I attempted to incorporate the method found here : http://www.fullscale.co/blog/2013/03/04/preserving_specific_characters_during_tokenizing_in_elasticsearch.html but it doesn't make any sense to me in context.
Does anyone have a clue how I can do this?
Did you create a mapping for you index? You can specify within your mapping to not analyze certain fields.
For example, a tweet mapping can be something like:
"tweet": {
"properties": {
"id": {
"type": "long"
},
"msg": {
"type": "string"
},
"hashtags": {
"type": "string",
"index": "not_analyzed"
}
}
}
You can then perform a term query on "hashtags" for an exact string match, including "#" character.
If you want "hashtags" to be tokenized as well, you can always create a multi-field for "hashtags".

Categories

Resources