Elasticsearch results exactly as parameter - python

I'm trying to filter logs based on the domain name. For example, I only want the results for the domain bh250.example.com.
When I use the following query:
http://localhost:9200/_search?pretty&size=150&q=domainname=bh250.example.com
the first 3 results have the domain name bh250.example.com, but the 4th has bh500.example.com.
I have read several pieces of documentation on how to query Elasticsearch, but I seem to be missing something. I only want results that match the parameter exactly.
UPDATE!! After question from Val
queryFilter = Q("match", domainname="bh250.example.com")
search=Search(using=dev_client, index="logstash-2016.09.21").query("bool", filter=queryFilter)[0:20]

You're almost there, you just need to make a small change:
http://localhost:9200/_search?pretty&size=150&q=domainname:"bh250.example.com"
Use a colon instead of the equals sign, and wrap the value in double quotes.
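Since the question is tagged Python, the corrected query string can also be assembled in code. The following is a minimal sketch using only the standard library; the helper name build_search_url is made up for illustration, and it assumes the value itself contains no double quotes.

```python
from urllib.parse import urlencode


def build_search_url(base_url, field, value, size=150):
    """Build a URI search URL that asks for `value` as a quoted phrase.

    `build_search_url` is a hypothetical helper, not part of any
    Elasticsearch client. It assumes `value` contains no double quotes.
    """
    # field:"value" -> colon between field and value, value in double quotes
    query = '{}:"{}"'.format(field, value)
    # urlencode percent-encodes the colon and the quotes
    params = urlencode({"pretty": "true", "size": size, "q": query})
    return "{}/_search?{}".format(base_url, params)


url = build_search_url("http://localhost:9200", "domainname", "bh250.example.com")
print(url)
```

Letting urlencode handle the escaping avoids having to worry about how the quotes survive in a hand-written URL.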

Check if string is in certain format in Python

I have a string as below.
/customer/v1/123456789/account/
The id in the URL is dynamic.
What I want to check is whether a given string matches the structure below: /customer/v1/<customer_id>/account
This is what I have done so far. However, I want to check whether the endpoint matches the structure completely.
endpoint_structure = '/customer/v1/'
endpoint = '/customer/v1/123456789/account/'
if endpoint_structure in endpoint:
    return True
return False
The endpoint structure might change as well.
For example, it could be /customer/v1/<customer_id>/documents/<document_id>/, and I would again need to check whether a given endpoint fits that structure.
You can use a regular expression:
import re
return re.match(r'^/customer/v1/\d+/account/$', endpoint)
or you can examine the beginning and the end:
return endpoint.startswith('/customer/v1/') and endpoint.endswith('/account/')
... though this doesn't attempt to verify that the stuff between the beginning and the end is numeric.
You can solve this using a regular expression:
^(/customer/v1/)(\d)+(/account/)$
If you also want to enforce a minimum length for customer_id (/customer/v1/<customer_id>/account), use the following regular expression:
^(/customer/v1/)(\d){5,}(/account/)$
Here the customer_id must be at least 5 digits long.
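Because the question says the structure itself may change, the regular expression can also be generated from the structure string instead of being written by hand. This is a rough sketch, assuming every <placeholder> stands for one or more digits; the function names structure_to_regex and endpoint_matches are made up for illustration.

```python
import re

# Matches placeholders such as <customer_id>; the group keeps them in split()
PLACEHOLDER = re.compile(r"(<\w+>)")


def structure_to_regex(structure):
    """Turn '/customer/v1/<customer_id>/account/' into a compiled regex.

    Assumption: every <placeholder> stands for one or more digits.
    """
    pieces = []
    for part in PLACEHOLDER.split(structure):
        if PLACEHOLDER.fullmatch(part):
            # <customer_id> -> named group that accepts digits only
            pieces.append(r"(?P<{}>\d+)".format(part[1:-1]))
        else:
            # literal text is escaped so it is matched character by character
            pieces.append(re.escape(part))
    return re.compile("".join(pieces))


def endpoint_matches(structure, endpoint):
    """Return True only if the whole endpoint fits the structure."""
    return structure_to_regex(structure).fullmatch(endpoint) is not None


print(endpoint_matches("/customer/v1/<customer_id>/account/",
                       "/customer/v1/123456789/account/"))
```

Using fullmatch rather than match is what makes the check strict: trailing or leading extra text causes a mismatch.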

Flask SQLAlchemy Contains/Ilike producing different results?

I am trying to query a column from a database with contains and ilike, but they are producing different results. Any idea why?
My current code:
search = 'nel'
find = Clients.query.filter(Clients.lastName.ilike(search)).all()
# THE ABOVE LINE PRODUCES 0 RESULTS
find = Clients.query.filter(Clients.lastName.contains(search)).all()
# THE ABOVE LINE PRODUCES THE DESIRED RESULTS
for row in find:
    print(row.lastName)
Am I missing something? I have read that 'contains' does not always work either. Is there a better way to do what I am doing?
For ilike and like, you need to include wildcards in your search like this:
Clients.lastName.ilike(r"%{}%".format(search))
As the Postgres docs say:
LIKE pattern matching always covers the entire string. Therefore, to match a sequence anywhere within a string, the pattern must start and end with a percent sign.
The other difference is case sensitivity: contains renders a LIKE, which is case-sensitive on Postgres, while ilike is case-insensitive.
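The whole-string behaviour of LIKE is easy to reproduce outside Flask. The sketch below uses the standard-library sqlite3 module instead of SQLAlchemy and Postgres; the clients table and the three names are made up for illustration. Note that SQLite's LIKE ignores ASCII case by default, so here it behaves like ilike.

```python
import sqlite3

# In-memory database with a few hypothetical last names
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (last_name TEXT)")
conn.executemany(
    "INSERT INTO clients VALUES (?)",
    [("Nelson",), ("Cornell",), ("Smith",)],
)


def search_last_names(pattern):
    """Return the last names matching a LIKE pattern, sorted by name."""
    rows = conn.execute(
        "SELECT last_name FROM clients WHERE last_name LIKE ? ORDER BY last_name",
        (pattern,),
    )
    return [row[0] for row in rows]


# Without wildcards the pattern must equal the whole value
no_wildcards = search_last_names("nel")
# With % on both sides the pattern may appear anywhere in the value
with_wildcards = search_last_names("%nel%")

print(no_wildcards, with_wildcards)
```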

Python Syntax Incorrect for Email Creation

I am trying to write some basic Python for my Kolab email server. For the primary_mail, I want it to be first initial plus last name, such as jdoe. The default is first name (dot) last name: john.doe@domain.com
I have come up with the following:
primary_mail ='%(givenname)s'[0:1]%(surname)s@%(domain)s
which I basically want to produce jdoe@domain.com
givenname would be someone's first name (e.g. John)
surname would be someone's last name (e.g. Doe)
domain is the email domain (e.g. domain.com)
When Python goes to canonify it, it comes up with some mumbo jumbo like so:
'john[0:1]'doe@domain.com
Can someone help me out with correcting this? I am so close.
EDIT:
According to the Kolab documentation, it looks like it is something like:
"{0}@{1}": "format('%(uid)s', '%(domain)s')"
This of course doesn't work for me though....
EDIT 2:
I am getting the following in my error logs:
imaps[1916]: ptload completely failed: unable to canonify identifier: 'john'[0:1]doe@domain.com
String formatting is by far the easiest, most readable and preferred way of accomplishing this:
first_name = 'John'
surname = 'Smith'
domain = 'company.com'
primary_mail = '{initial}{surname}@{domain}'.format(initial=first_name[0].lower(), surname=surname.lower(), domain=domain)
primary_mail now equals 'jsmith@company.com'. You define a string containing named placeholders in braces, then call the format method to have those placeholders replaced at runtime with the appropriate values. Here, we take the first character of first_name and convert it to lower case, convert the entirety of surname to lower case as well, and leave domain unchanged.
You can read more on string formatting at the Python 2.7 docs.
James Scholes is right that format is a better way of doing it. However, reading the Kolab documentation, it seems that you can only supply the format string; Kolab applies the % style formatter internally, and you can't change that. From the Kolab 'primary_mail' documentation:
primary_mail = %(givenname)s.%(surname)s@%(domain)s
The equivalent of the following Python is then executed:
primary_mail = "%(givenname)s.%(surname)s@%(domain)s" % {
    "givenname": "Maria",
    "surname": "Moller",
    "preferredlanguage": "en_US"
}
In this case, we need a modifier to the format conversion. We have %(givenname)s, which ensures that givenname is a string. We can also specify a minimum length, followed by a . and then a precision. This is normally only used with numbers, but we can use it for strings, too. Here is a format string with no minimum length, but a maximum length (precision) of 1 character:
"%(givenname).1s"
So you probably want a string like this:
"%(givenname).1s%(surname)s@%(domain)s"

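To see the precision modifier at work, here is a small sketch in plain Python. The values are the ones from the question; the final lower() call is only there to show the desired address and says nothing about whether Kolab lowercases the result itself.

```python
# Format string with a precision of 1 on givenname: only the first
# character of the given name is used.
template = "%(givenname).1s%(surname)s@%(domain)s"

# Example values taken from the question
values = {"givenname": "John", "surname": "Doe", "domain": "domain.com"}

formatted = template % values
# Lowercasing is done here purely for illustration (an assumption,
# not something the Kolab documentation is claimed to do).
primary_mail = formatted.lower()

print(formatted)
print(primary_mail)
```
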
procmail recipe to pass values to my Python script

I have never used procmail before, but I believe (from my R&D) that it is likely my best choice to crack my riddle. Our system receives an email, out of which I need 3 values:
a 4-digit or 5-digit integer from the Subject line (we will refer to it as "N")
the email alias from the Reply-To line (we will refer to it as "R")
the type of email, by which I mean a "case" or a "project" (we will refer to it as "T"). This value would be parsed out of the Subject line.
If anyone could help me with that recipe, I would be most appreciative.
The next thing I need to do is:
send these 3 values to a Python script (can I do this directly from procmail? pipe? something else?)
delete the email messages
I need to accept these emails from only 4 domain names, such as:
(@sjobeck.com|@cases.example.com|@messages.example.com|@bounces.example.com)
Last is to pipe these 3 values into the second script, and I would like some advice on the best syntax to do so. Any advice here is much appreciated. Would this be something like this:
this-recipe $N $T $R | second-script.py
Or exactly how would that look? Or is this not a procmail issue but a Python issue? (If it is, that's fine, I'll handle it over there.)
Thanks so much!
Jason
Procmail can extract those values, or you can just pass the whole message to Python on stdin.
Assuming you want the final digits and you require there to be 4 or 5, something like this:
R=`formail -zxReply-to: | sed 's/.*<//;s/>.*//'`
:0
* ^From:.*@(helpicantfindgoogle\.com|searchengineshateme\.net|disabled\.org)\>
* ^Subject:(.*[^0-9])?\/[0-9][0-9][0-9][0-9][0-9]?$
| scriptname.py --reply-to "$R" --number "$MATCH"
This illustrates two different techniques for extracting a header value; the Reply-To header is extracted by invoking formail (this will extract just the email terminus, as per your comment; if you mean something else by "alias" then please define it properly) while the trailing 4- or 5-digit integer from the Subject is grabbed by matching it in the condition with the special operator \/.
Update: Added an additional condition to only process email where the From: header indicates a sender in one of the domains helpicantfindgoogle.com, searchengineshateme.net, or disabled.org.
As implied by the pipe action, your script will be able to read the triggering message on its standard input, but if you don't need it, just don't read standard input.
If delivery is successful, Procmail will stop processing when this recipe finishes. Thus you should not need to explicitly discard a matching message. (If you want to keep going, use :0c instead of just :0.)
As an efficiency tweak (if you receive a lot of email, and only a small fraction of it needs to be passed to this script, for example) you might want to refactor to only extract the Reply-To: when the conditions match.
:0
* ^From:.*@(helpicantfindgoogle\.com|searchengineshateme\.net|disabled\.org)\>
* ^Subject:(.*[^0-9])?\/[0-9][0-9][0-9][0-9][0-9]?$
{
  R=`formail -zxReply-To: | sed 's/.*<//;s/>.*//'`
  :0
  | scriptname.py --reply-to "$R" --number "$MATCH"
}
The block (the stuff between { and }) will only be entered when both the conditions are met. The extraction of the number from the Subject: header into $MATCH works as before; if the From: condition matched and the Subject: condition matched, the extracted number will be in $MATCH.
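If you take the other route and pass the whole message to Python on stdin, the three values can be pulled out with the standard library. This is only a sketch: the function name extract_values is hypothetical, the sample addresses in any test message are made up, and it assumes the type appears in the Subject as the literal word "case" or "project", which the question does not actually specify.

```python
import email
import re
from email.utils import parseaddr

# The four sender domains listed in the question
ALLOWED_DOMAINS = {
    "sjobeck.com",
    "cases.example.com",
    "messages.example.com",
    "bounces.example.com",
}


def extract_values(raw_message):
    """Return (N, R, T) from a raw message, or None if it does not qualify.

    N: trailing 4- or 5-digit number in the Subject
    R: address from the Reply-To header
    T: "case" or "project" (assumed to appear as a word in the Subject)
    """
    msg = email.message_from_string(raw_message)

    # Only accept senders from the allowed domains
    sender = parseaddr(msg.get("From", ""))[1]
    if sender.rpartition("@")[2].lower() not in ALLOWED_DOMAINS:
        return None

    subject = msg.get("Subject", "").strip()
    # Exactly 4 or 5 digits at the end, not preceded by another digit
    number = re.search(r"(?<!\d)(\d{4,5})$", subject)
    kind = re.search(r"\b(case|project)\b", subject, re.IGNORECASE)
    if not number or not kind:
        return None

    reply_to = parseaddr(msg.get("Reply-To", ""))[1]
    return number.group(1), reply_to, kind.group(1).lower()
```

From procmail, a plain pipe action such as `| scriptname.py` delivers the message on standard input, so the script could call extract_values(sys.stdin.read()) and hand the result to whatever comes next.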

How to make complex contains queries in Django?

I need to make a query like this:
WHERE Comment like '%ev% 3628%' or Comment like '%ew% 3628%'
The number '3628' is a parameter. So I've tried this in my view:
First try:
wherestr = "Comment like '%%ev%% %s%%' or Comment like '%%ew%% %s%%'" % (rev_number, rev_number)
comment_o = Issuecomments.objects.extra(where=[wherestr])
but I've got:
TypeError at /comments_by_rev/3628/
not enough arguments for format string
Request Method: GET
Request URL: http://127.0.0.1:8001/comments_by_rev/3628/
Exception Type: TypeError
Exception Value:
not enough arguments for format string
Second try:
comment = IssuetrackerIssuecomments.objects.filter(Q(comment__contains=rev_number), Q(comment__contains='ew') | Q(comment__contains='ev'))
but it's not exactly the same.
Have you people of wisdom any idea how to accomplish this?
You need something similar to this:
from django.db.models import Q
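def myview(request):
    query = "hi"  # string to search for
    # self is undefined in a view function; query the model from the question instead
    items = Issuecomments.objects.filter(Q(comment__contains=query) | Q(comment__contains=query))
    ...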
Just make sure the query string is properly escaped.
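You almost got it right... The problem is that your % signs are being substituted twice: once by your own % formatting and once more when the query is executed. Django has a way of passing parameters to the extra clause. Leave only %s placeholders in the where string and build the wildcard patterns in Python, so that the database driver quotes them:
wherestr = "Comment LIKE %s OR Comment LIKE %s"
params = ['%ev% ' + str(rev_number) + '%', '%ew% ' + str(rev_number) + '%']
comment_o = Issuecomments.objects.extra(where=[wherestr], params=params)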
This is a better way of passing the parameters as it won't leave you open to SQL injection attacks like your way will.
Take a look at http://docs.djangoproject.com/en/dev/ref/models/querysets/, specifically
icontains: Case-insensitive containment test.
Example: Entry.objects.get(headline__icontains='Lennon')
SQL equivalent: SELECT ... WHERE headline ILIKE '%Lennon%';
Since you're looking for a pattern like %ev% or %ew%, consider the iregex or regex lookups as well.
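To see how a single regular expression captures the same ordering constraint as the two LIKE patterns, here is a small sketch using Python's re module rather than the database. Regex dialects differ between databases, so treat the pattern as an illustration to adapt, not as something verified against your backend; the helper names are made up.

```python
import re


def build_pattern(rev_number):
    """'ev' or 'ew', then anything, then a space and the revision number.

    Mirrors: LIKE '%ev% N%' OR LIKE '%ew% N%'
    """
    return re.compile(r"e[vw].* " + re.escape(str(rev_number)), re.DOTALL)


def loose_match(comment, rev_number):
    """Mimics the question's second attempt: order is not checked."""
    return str(rev_number) in comment and ("ev" in comment or "ew" in comment)


pattern = build_pattern(3628)
in_order = "Fixed in rev. 3628"
out_of_order = "3628 was reverted"

# The regex requires 'ev'/'ew' to come before ' 3628'
print(bool(pattern.search(in_order)), bool(pattern.search(out_of_order)))
# The loose check accepts both, which is why it is "not exactly the same"
print(loose_match(in_order, 3628), loose_match(out_of_order, 3628))
```

In Django the same pattern string could presumably be passed through the regex lookup, e.g. filter(comment__regex=...), but that is untested here.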
Lastly, consider performing the search differently...perhaps parse out the interesting parts of the message and put them in their own indexed columns for querying later. You'll regret doing this search once the table gets large:).
