partial word search with haystack/elasticsearch


We are currently running haystack with an elasticsearch backend, and we are having trouble getting partial word search to work correctly.
Our index has an EdgeNgramField. I have tried searching on this field, but I get no results unless the query is an exact match. I'm using this to find products, so for example, if I type "sun", I don't get the result for "sunglasses".
I started running curl commands directly against elasticsearch to see if I could figure out what was happening. I even created my own index directly with curl, along with an ngram analyzer, and partial word searches on that index return the proper results.
Another interesting thing: if I run the _mapping command with curl against the test index I created by hand, I get the following: "testfield":{"type":"string", "analyzer":"test_analyzer"}. However, if I run the same command against the index created by haystack, the field only has "type":"string". It says nothing about the edgengram_analyzer that it should be using.
Any ideas?

I think there's a bug in haystack's elasticsearch_backend.py, which does not call pyelasticsearch properly. Line 868 looks like:
self.conn.put_mapping('modelresult', current_mapping, index=self.index_name)
If you replace it with:
self.conn.put_mapping(doc_type='modelresult', mapping=current_mapping, index=self.index_name)
which is how pyelasticsearch expects it, then you will see that the edgengram_analyzer is added to your EdgeNgramField field. At least it works for me.
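
For context on why the missing analyzer matters: an edge n-gram analyzer indexes the leading prefixes of each word, so a query for "sun" can match the stored prefix of "sunglasses". Without it, only the whole word is indexed. The pure-Python sketch below only illustrates the idea; the function name and the min_gram/max_gram values are assumptions for illustration, not haystack's or elasticsearch's actual code.

```python
def edge_ngrams(word, min_gram=2, max_gram=15):
    """Return the leading prefixes of `word`, as an edge n-gram tokenizer would.

    min_gram and max_gram are illustrative values, not a confirmed config.
    """
    word = word.lower()
    upper = min(max_gram, len(word))
    return [word[:n] for n in range(min_gram, upper + 1)]


grams = edge_ngrams("sunglasses")
print(grams)           # prefixes from "su" up to the full word
print("sun" in grams)  # True: the partial query matches an indexed prefix
```

If the mapping only says "type":"string", none of these prefixes are indexed, which is consistent with seeing exact matches only.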

Related

Unable to search for words containing hyphens using Postgres TSVector

I've got a system in place that uses Postgres and TSVector to search for relevant terms. However, whenever I search for text containing a hyphen (-), the results are not very useful. Example: searching for IL-7 returns results with April 7.
Is there a configuration option I'm missing?
For background I'm using sqlalchemy-searchable as a frontend.
https://sqlalchemy-searchable.readthedocs.io/en/latest/configuration.html
I've tried modifying my search parameters but I don't believe that to be the issue.

How to use elasticsearch 8.x ingest-attachment with Python

I am trying to read a pdf from python and send it to elasticsearch.
I tried to use ingest-attachment to help with that, but I don't know how.
https://www.elastic.co/guide/en/elasticsearch/reference/master/attachment.html
When I followed the official documentation, it worked. However, the official documentation doesn't seem to show how to do this from Python.
So I looked at the official documentation and created my own mapping.
The data goes in, but the attachment is not processed.
I looked around and found this, but I don't know how to use it.
elasticsearch.exceptions.RequestError: RequestError(400, 'invalid_index_name_exception', 'Invalid index name [_ingest/pipeline/attachment_pipeline], must not contain the following characters ['\','/','*','?','"','<','>','|',' ',',']')
If I just run it, this is what comes out. It looks like there is a / in the index name, so this time I only ran the bottom part.
elasticsearch.exceptions.RequestError: RequestError(400, 'illegal_argument_exception', 'pipeline with id [attachment_pipeline] does not exist')
I want to know how to use it.
How can I make the attachment plugin work from Python?
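
The two errors above suggest the pipeline path was passed as an index name, and then the pipeline was never created. A possible approach, sketched under the assumption that the 8.x elasticsearch Python client is in use and the attachment processor is available on the cluster, is to create the pipeline through the client's ingest API and then index the base64-encoded PDF with that pipeline. The function name, the "data" field, and the index name are hypothetical.

```python
import base64


def index_pdf(es, pdf_bytes, index_name, pipeline_id="attachment_pipeline"):
    """Create an attachment pipeline, then index one PDF through it.

    `es` is expected to be an elasticsearch.Elasticsearch (8.x) client.
    The field name "data" is an assumption chosen for this sketch.
    """
    # Create (or overwrite) the ingest pipeline via the ingest API,
    # rather than treating "_ingest/pipeline/..." as an index name.
    es.ingest.put_pipeline(
        id=pipeline_id,
        description="Extract attachment information",
        processors=[{"attachment": {"field": "data"}}],
    )
    # The attachment processor expects base64-encoded content.
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return es.index(
        index=index_name,
        pipeline=pipeline_id,
        document={"data": encoded},
    )


# Usage (assumes a reachable cluster; URL and file name are placeholders):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# with open("example.pdf", "rb") as f:
#     index_pdf(es, f.read(), "my-index")
```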

Is there a way to get the response after saving data in Firestore?

I'm using Python to add data to Firestore, and I would like to get a response from Firestore so that I know whether the data was successfully added. Is there a way to do this using Python? Newbie here.
I've already searched other resources but can't find anything.

According to the Firestore documentation in the code, the DocumentReference.set() method returns a:
google.cloud.firestore_v1.types.WriteResult: The write result corresponding to the committed document. A write result contains an update_time field.
I can't find any Python examples, but it should work similarly to the Node.js version here: https://cloud.google.com/nodejs/docs/reference/firestore/1.2.x/WriteResult
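
A minimal sketch of how that could look in Python, assuming the google-cloud-firestore client: set() returns a WriteResult on success, and the sketch assumes a failed write surfaces as an exception, so reaching the return statement means the write was committed. The helper name and the collection and document names in the usage comment are hypothetical.

```python
def save_and_confirm(doc_ref, data):
    """Write `data` and return the commit time, or None if the write failed.

    `doc_ref` is expected to be a Firestore DocumentReference.
    """
    try:
        write_result = doc_ref.set(data)
    except Exception as exc:  # broad catch, for illustration only
        print("write failed:", exc)
        return None
    # WriteResult carries the time the document was committed.
    return write_result.update_time


# Usage (assumes credentials are configured; names are placeholders):
# from google.cloud import firestore
# db = firestore.Client()
# doc_ref = db.collection("users").document("alice")
# print(save_and_confirm(doc_ref, {"name": "Alice"}))
```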

I cannot calculate a working AWS signature version 4 (hexadecimal string) for curl commands to work to test the REST API

I have never been able to get REST APIs to work completely with AWS. The error messages I have seen have been about the time not being correct or the command not being recognized (e.g., list-users). I have verified against AWS's website documentation that the "version" was appropriate for the command.
I am trying to use curl with Linux to list the users or instances in my AWS account. I have a problem when I run it. My current error, that I would like to focus on, is "request signatures calculated does not match the signature provided." I went through the process of creating a signature carefully. It wasn't that surprising that it did not work given the hours of trouble and the seemingly many potential pitfalls in the tedious task of creating a signature.
I used this link to generate the hexadecimal string for the signature:
http://docs.aws.amazon.com/general/latest/gr/signature-v4-examples.html#signature-v4-examples-python
I analyzed the output of the signatureKey using a modification of the Python code in the above link. The result is not hexadecimal nor alphanumeric. The result is a combination of special non-alphabet, non-numeric symbols and alphabet letters. I tried to work around this problem by using import binascii and binascii.hexlify. I was able to get a hexadecimal string from otherwise strictly adhering to the sample of Python code from the above link. I tend to think my signatureKey is not right because of this binascii work that I had to do. But what did I do wrong? How is that Python code supposed to calculate a signature?
Alternatively, are there thorough directions not written by Amazon to create a signature key? The process is not simple and seemingly error prone. I could start over with creating a signature if someone cannot clearly tell me how to create a signature. Amazon's forums have few postings related to this topic. I'd prefer to create the signature with Python. If someone recommends Ruby (an accessible language for me), I could try something like that.
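
On the non-hex output: in the Python sample on that AWS page, the signing key is the raw bytes of a chain of HMAC-SHA256 operations, so it is expected to look like binary noise when printed. Only the final signature, computed over the string to sign, is hex-encoded. Below is a minimal sketch using only the standard library. The secret key, date, region, service, and string to sign are placeholders, and building the canonical request and the real string to sign is not shown.

```python
import hashlib
import hmac


def sign(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step; returns raw bytes, not hex."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def get_signature_key(secret_key, date_stamp, region, service):
    """Derive the Signature Version 4 signing key (32 raw bytes)."""
    k_date = sign(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = sign(k_date, region)
    k_service = sign(k_region, service)
    return sign(k_service, "aws4_request")


def signature_hex(signing_key: bytes, string_to_sign: str) -> str:
    """Final signature for the Authorization header, as a hex string."""
    return hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# Placeholder inputs, for illustration only.
key = get_signature_key("EXAMPLE_SECRET_KEY", "20150830", "us-east-1", "iam")
sig = signature_hex(key, "placeholder string to sign")
print(len(key), len(sig))  # 32 raw bytes, 64 hex characters
```

Running binascii.hexlify on the signing key is fine for debugging, but the raw bytes, not the hex string, are what must be used as the HMAC key in the final step. Passing the hex form instead is one plausible cause of a signature mismatch.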

Imgur TagVote issue python

I'm a beginner, and I am using the imgur Python library to get the tags related to an image. For this I am using the gallery_item_tags method as mentioned here.
However, whenever I call the method, it gives me the output shown here.
I have followed the authorization procedure using the needed client ID and client secret, and I can run all methods that don't involve the TagVotes array. How can I get the required information from this?

You're getting a list of TagVote instances. You probably want the name, which you can access like this:
for tag in tags:
    print(tag.name)
