Elasticsearch multi-field query request in Python

I'm a beginner in Elasticsearch and Python. I have an index in Elasticsearch with some data, and I want to query that data from Python. This is the mapping, created in Kibana's Dev Tools:
PUT /main-news-test-data
{
"mappings": {
"properties": {
"content": {
"type": "text"
},
"title": {
"type": "text"
},
"lead": {
"type": "text"
},
"agency": {
"type": "keyword"
},
"date_created": {
"type": "date"
},
"url": {
"type": "keyword"
},
"image": {
"type": "keyword"
},
"category": {
"type": "keyword"
},
"id":{
"type": "keyword"
}
}
}
}
Here is my Python code. It takes a keyword and a category number. It should look for the keyword in the title, lead and content fields, compare the entered category number with each document's category, and return or print every document that matches these criteria:
from elasticsearch import Elasticsearch

# Single client pointing at the local node.
es = Elasticsearch("http://localhost:9200")


def QueryMaker(keyword, category):
    # This multi_match body is the part that triggers the error quoted below.
    response = es.search(index="main-news-test-data", body={"from": 0, "size": 5, "query": {"multi_match": {
        "content": keyword, "category": category, "title": keyword, "lead": keyword}}})
    return response


if __name__ == '__main__':
    keyword = input('Enter Keyword: ')
    category = input('Enter Category: ')
    #startDate = input('Enter StartDate: ')
    #endDate = input('Enter EndDate: ')
    data = QueryMaker(keyword, category)
    print(data)
But I receive this error when I enter the inputs:
elasticsearch.exceptions.RequestError: RequestError(400, 'parsing_exception', '[multi_match] query does not support [content]')
What am I doing wrong?
Edit: the keyword has to appear in the title, lead and content, but it does not have to be identical to them.

Your multi_match query syntax is wrong: multi_match takes the search text in "query" and the list of fields in "fields", not one key per field. I think you need something like this (see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html):
{
"from":0,
"size":5,
"query": {
"bool": {
"should": [
{
"multi_match" : {
"query": keyword,
"fields": [ "content", "title","lead" ]
}
},
{
"multi_match" : {
"query": category,
"fields": [ "category" ]
}
}
]
}
}
}
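Note that "should" clauses are alternatives, so the query above also returns documents that match only the keyword or only the category. If both conditions must hold, as the question describes, one option is to put the multi_match in "must" and the category in a "filter" with a term query, which suits a keyword field. A minimal sketch, assuming the mapping from the question; build_news_query is a hypothetical helper name, not part of the Elasticsearch client:

```python
def build_news_query(keyword, category, size=5):
    """Build a search body: keyword in any text field AND an exact category."""
    return {
        "from": 0,
        "size": size,
        "query": {
            "bool": {
                # Full-text match of the keyword against the three text fields.
                "must": [
                    {
                        "multi_match": {
                            "query": keyword,
                            "fields": ["content", "title", "lead"],
                        }
                    }
                ],
                # Exact, non-scoring match on the keyword-typed category field.
                "filter": [{"term": {"category": category}}],
            }
        },
    }


query_body = build_news_query("election", "3")
```

The resulting dict can be passed as the body argument of es.search(index="main-news-test-data", body=query_body), in the same way as in the question's code.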

Related

ElasticSearch 7 & Kibana unexpected behavior

I am trying to store data in an Elasticsearch index. The data in one column looks like this:
C ID
1234
5678
NA
123D D5614 A7890
I know this data is mixed, so I chose the text type for this field, with the properties below:
"mappings": {
"properties":{
"C ID":{"type":"text" , "fields" :{'keyword': {'type':'keyword'}}},
}
}
Even after this, I always get the error:
failed to parse field[C ID] of type long in document id 4
Please help me out with this. I have not declared long anywhere, so I don't know why I am getting this error.
Update
My code base
from elasticsearch import Elasticsearch

# ESConnector is a class responsible for Kerberos login. We call Elasticsearch under the ESConnector class.
es = ESConnector()
if not es.indices.exists(INDEX):
    settings = {"settings": {"index": {"number_of_shards": 1, "number_of_replicas": 1}}}
    es.indices.create(INDEX, body=settings)
mbody = {
    "mappings": {
        "properties": {
            "C ID": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
        }
    }
}
es.indices.put_mapping(INDEX, body=mbody)
You can create the index together with the mapping in a single call:
if not es.indices.exists(INDEX):
    body = {
        "settings": {
            "index": {
                "number_of_shards": 1,
                "number_of_replicas": 1
            }
        },
        "mappings": {
            "properties": {
                "C ID": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword"
                        }
                    }
                }
            }
        }
    }
    es.indices.create(INDEX, body=body)
It should work this way.
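The error message says the field is mapped as long. That suggests (an assumption, since the question does not show it) that the index already existed with a dynamically inferred mapping before the explicit one was applied. The type of an existing field cannot be changed in place, so in that case the index would have to be deleted and recreated, or the data reindexed. A small sketch for checking which type a field actually has, given the dict returned by es.indices.get_mapping(index=INDEX) in Elasticsearch 7; mapped_type is a hypothetical helper and the sample response is made up for illustration:

```python
def mapped_type(mapping_response, index, field):
    """Return the mapped type of a top-level field, or None if it is unmapped.

    Expects the typeless (Elasticsearch 7) response shape:
    {index: {"mappings": {"properties": {field: {"type": ...}}}}}
    """
    properties = (
        mapping_response.get(index, {}).get("mappings", {}).get("properties", {})
    )
    return properties.get(field, {}).get("type")


# Made-up sample of what an index with a dynamically inferred mapping might return.
sample_response = {
    "my-index": {"mappings": {"properties": {"C ID": {"type": "long"}}}}
}

print(mapped_type(sample_response, "my-index", "C ID"))  # prints: long
```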

Elasticsearch not returning result for single word query

I have a basic Elasticsearch index that consists of a variety of help articles. Users can search for them in my Python/Django app.
The index has the following mappings:
{
"mappings": {
"properties": {
"body": {
"type": "text"
},
"category": {
"type": "nested",
"properties": {
"category_id": {
"type": "long"
},
"category_title": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"type": "text"
}
}
},
"title": {
"type": "keyword"
},
"date_updated": {
"type": "date"
},
"position": {
"type": "integer"
}
}
}
}
I basically want the user to be able to search for a query and get any results that match the article title or category.
Say I have an article called "I Can't Remember My Password" in the "Your Account" category.
If I search for the article title exactly, I see the result. If I search for the category title exactly, I also see the result.
But if I search for just "password", I get nothing. What do I need to change in my setup/query to make it so that this query (or similarly non-exact queries) also returns the result?
My query looks like:
{
"query": {
"bool": {
"should": [{
"multi_match": {
"fields": ["title"],
"query": "password"
}
},
{
"nested": {
"path": "category",
"query": {
"multi_match": {
"fields": ["category.category_title"],
"query": "password"
}
}
}
}
]
}
}
}
I have read other questions and experimented with various settings but no luck so far. I am not doing anything particularly special at index time in terms of preparing the fields so I don't know if that's something to look at. I'm just using the elasticsearch-dsl defaults.
The solution was to reindex the title field as text rather than keyword. The latter only allows exact matching.
Credit to LeBigCat for pointing that out in the comments. They haven't posted it as an answer so I'm doing it on their behalf to improve visibility.
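One way to keep exact matching available while enabling word-level search is a text field with a keyword sub-field, the same pattern category_title already uses in the mapping above. A sketch of the changed part of the mapping as a Python dict; the sub-field name "raw" is an arbitrary choice for illustration, and changing the type of an existing field means creating a new index and reindexing:

```python
# Revised mapping for the "title" field only; the other fields stay as in the question.
title_mapping = {
    "title": {
        # Analyzed text: a search for "password" can match a single word of the title.
        "type": "text",
        "fields": {
            # Unanalyzed copy, addressable as "title.raw", for exact matches and sorting.
            "raw": {"type": "keyword", "ignore_above": 256}
        },
    }
}
```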

How to specify Stopwords in Elasticsearch mapping using python

I have this Python code, in which I first create an Elasticsearch mapping and then, after the data is inserted, search that data:
# Create Data mapping
data_mapping = {
"mappings": {
(doc_type): {
"properties": {
"data_id": {
"type": "string",
"fields": {
"stemmed": {
"type": "string",
"analyzer": "english"
}
}
},
"data":{
"type": "array",
"fields": {
"stemmed": {
"type": "string",
"analyzer": "english"
}
}
},
"resp": {
"type": "string",
"fields": {
"stemmed": {
"type": "string",
"analyzer": "english"
}
}
},
"update": {
"type": "integer",
"fields": {
"stemmed": {
"type": "integer",
"analyzer": "english"
}
}
}
}
}
}
}
#Search
data_search = {
"query": {
"function_score": {
"query": {
"match": {
'data': question
}
},
"field_value_factor": {
"field": "update",
"modifier": "log2p"
}
}
}
}
response = es.search(index=doc_type, body=data_search)
What I am unable to figure out is where and how to specify stopwords in the above code. This link gives an example of using stopwords, but I am unable to relate it to my code. Do I need to specify them in the data mapping section, the search section, or both? And how do I specify them?
Any example would be appreciated!
UPDATE: Some comments suggest adding either an analysis section or a settings section, but I am not sure how to add those to the mapping I have written above.
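Stopwords are an analysis concern, so they are normally declared in the index settings under "analysis" and then referenced by analyzer name from the mapping. The search body does not need to change, because a match query uses the field's analyzer by default. Note also that the built-in english analyzer already removes a default English stopword list. A minimal sketch, assuming a custom list is wanted for the stemmed "resp" sub-field: build_index_body and english_custom_stop are hypothetical names, and the "string" type and per-doc_type mapping mirror the question's (older) Elasticsearch version.

```python
def build_index_body(doc_type, stopwords):
    """Return an index-creation body with a custom stopword list.

    The analyzer is defined once under settings.analysis and then
    referenced by name from the field mapping.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # The built-in "english" analyzer configured with a custom list.
                    "english_custom_stop": {
                        "type": "english",
                        "stopwords": stopwords,
                    }
                }
            }
        },
        "mappings": {
            doc_type: {
                "properties": {
                    "resp": {
                        "type": "string",
                        "fields": {
                            "stemmed": {
                                "type": "string",
                                # Refers to the analyzer declared in settings above.
                                "analyzer": "english_custom_stop",
                            }
                        },
                    }
                }
            }
        },
    }


body = build_index_body("my_doc_type", ["a", "an", "the", "please"])
```

The body would be passed to es.indices.create(index=..., body=body) when the index is first created; the other fields from the question's mapping can reference the same analyzer name.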

How to push object data into an Elasticsearch array field

I want to push objects into an array field, but I get an error. I think the problem is the mapping.
I use Python to work with Elasticsearch.
Create the mappings for the customer:
def create_customer_index():
    ''' Start creating customers index manually '''
    mapping = {
        "mappings": {
            "customer": {
                "properties": {
                    "created": {
                        "type": "long"
                    },
                    "updated": {
                        "type": "long"
                    },
                    "shopping_cart": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                }
            }
        }
    }
    es.indices.create(index='customers', body=mapping)
#end
With this method I add the product the user selected, and its quantity, to the array:
def add_item_in_shopping_cart(uid, product_id, quantity=1):
    ''' Add new item in customer shopping cart '''
    print('Start adding a new item in shopping cart')
    # print('UID => ', uid)
    print('Product id to add is: => ', product_id)
    # 1st check if the user has this product on favorites,
    # 2nd if the user doesn't have this remove it from the list
    customer_shoping_cart = get_customer_shopping_cart(uid)
    print('CUSTUMER SHOPING CART => ', customer_shoping_cart)
    x = False
    for item in customer_shoping_cart:
        if item == product_id:
            x = True
    if x:
        print('Item already exist in shopping_cart')
        return 'item_already_exist_in_shopping_cart', 500
    else:
        print('Item dont exist in shopping_cart, i gona added')
        timestamp = int(round(time.time() * 1000))
        schema = {
            "product_id": product_id,
            "quantity" : 10
        }
        doc = {
            "script" : {
                "inline":"ctx._source.shopping_cart.add(params.data)",
                "params":{
                    "data":{
                        "product_id": product_id,
                        "quantity" : quantity
                    }
                }
            }
        }
        try:
            es.update(index="customers", doc_type='customer', id=uid, body=doc)
            es.indices.refresh(index="customers")
            return jsonify({'message': 'item_added_in_shopping_cart'}), 200
        except Exception as e:
            print('ERROR => ', e)
            return 'we have error', 500
#end
And I get this error:
ERROR => TransportError(400, 'mapper_parsing_exception', 'failed to parse [shopping_cart]')
From the mapping you posted it looks like ElasticSearch expects documents like this:
{
"created": 1510094303,
"updated": 1510094303,
"shopping_cart": "I am in a shopping cart"
}
Or like this:
{
"created": 1510094303,
"updated": 1510094303,
"shopping_cart": [
"I am in a shopping cart",
"me too!"
]
}
And you are trying to treat "shopping_cart" as an array of objects, which it is not (it is an array of strings). Elasticsearch does not allow you to index objects that do not fit the mapping.
What you should try first is to change your mapping to something like this:
mapping = {
"mappings": {
"customer": {
"properties": {
"created": {
"type": "long"
},
"updated": {
"type": "long"
},
"shopping_cart": {
"properties": {
"product_id": {
"type": "keyword"
},
"quantity": {
"type": "integer"
}
}
},
}
}
}
}
Moreover, consider also changing the document entirely on the client side, i.e. in your script, and putting the new version document in the index (replacing the previous one), since it will probably be easier in the implementation logic (for instance, no need for ElasticSearch-side scripts to update the document).
Hope that helps.
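The client-side approach from the last paragraph can be sketched as a pure function that returns the updated cart, which the caller then writes back together with the rest of the document. add_item_to_cart is a hypothetical helper, and treating an already-present product as an error mirrors the question's code:

```python
def add_item_to_cart(cart, product_id, quantity=1):
    """Return a new cart list with the item appended.

    Raises ValueError if the product is already in the cart.
    """
    if any(item["product_id"] == product_id for item in cart):
        raise ValueError("item_already_exist_in_shopping_cart")
    # Build a new list instead of mutating, so the caller decides when to persist.
    return cart + [{"product_id": product_id, "quantity": quantity}]


cart = [{"product_id": "p-1", "quantity": 2}]
updated = add_item_to_cart(cart, "p-2", quantity=3)
print(updated)
```

The caller would then index the full customer document again with "shopping_cart" set to the updated list, replacing the previous version.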

Elastic Search: including #/hashtags in search results

Using Elasticsearch's query DSL, this is how I currently construct my query:
elastic_sort = [
{ "timestamp": {"order": "desc" }},
"_score",
{ "name": { "order": "desc" }},
{ "channel": { "order": "desc" }},
]
elastic_query = {
"fuzzy_like_this" : {
"fields" : [ "msgs.channel", "msgs.msg", "msgs.name" ],
"like_text" : search_string,
"max_query_terms" : 10,
"fuzziness": 0.7,
}
}
res = self.es.search(index="chat", body={
"from" : from_result, "size" : results_per_page,
"track_scores": True,
"query": elastic_query,
"sort": elastic_sort,
})
I've been trying to implement a filter or an analyzer that will allow the inclusion of "#" in searches (I want a search for "#thing" to return results that include "#thing"), but I am coming up short. The error messages I am getting are not helpful and just telling me that my query is malformed.
I attempted to incorporate the method found here : http://www.fullscale.co/blog/2013/03/04/preserving_specific_characters_during_tokenizing_in_elasticsearch.html but it doesn't make any sense to me in context.
Does anyone have a clue how I can do this?
Did you create a mapping for your index? You can specify within your mapping that certain fields should not be analyzed.
For example, a tweet mapping can be something like:
"tweet": {
"properties": {
"id": {
"type": "long"
},
"msg": {
"type": "string"
},
"hashtags": {
"type": "string",
"index": "not_analyzed"
}
}
}
You can then perform a term query on "hashtags" for an exact string match, including "#" character.
If you want "hashtags" to be tokenized as well, you can always create a multi-field for "hashtags".
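A sketch of the exact-match search described above, assuming the "hashtags" field from the example mapping; build_hashtag_query is a hypothetical helper. On Elasticsearch 5 and later, the keyword type replaces a string field with "index": "not_analyzed".

```python
def build_hashtag_query(tag):
    """Build a search body that matches the hashtag exactly, '#' included."""
    # A term query is not analyzed, so "#thing" is compared to the stored value as-is.
    return {"query": {"term": {"hashtags": tag}}}


print(build_hashtag_query("#thing"))
```

The returned dict can be passed as the body of a search call such as self.es.search(index="chat", body=...), as in the question's code.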
