Python elasticsearch range query - python

I know that there are several alternative elasticsearch clients for python beyond this one. However, I do not have access to those. How can I write a query that has a 'less than or equal' logic for a timestamp? My current way of doing this is:
query = 'group_id:"' + gid + '" AND data_model.fields.price:' + price
less_than_time = ...  # datetime object (upper bound)
data = self.es.search(index=self.es_index, q=query, size=searchsize)
hits = data['hits']['hits']
results = []
for hit in hits:
    time = datetime.strptime(hit['_source']['data_model']['utc_time'], time_format)
    # keep only hits at or before the upper bound
    if time <= less_than_time:
        results.append(hit)
This is a really clumsy way of doing it. Is there a way I can keep my query generation using strings and include a range?

I have a little script that generates a query for me. The query, however, is in JSON notation (which I believe the client can use).
Here's my script:
#!/usr/bin/python
from datetime import datetime
import sys
RANGE = '"range":{"#timestamp":{"gte":"%s","lt":"%s"}}'
QUERY = '{"query":{"bool":{"must":[{"prefix": {"myType":"test"}},{%s}]}}}'
if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("\nERROR: 2 Date arguments needed: From and To, for example:\n\n./range_query.py 2016-08-10T00:00:00.000Z 2016-08-10T00:00:00.000Z\n\n")
        sys.exit(1)
    try:
        # parsed only to validate the format; the original strings go into the query
        date1 = datetime.strptime(sys.argv[1], "%Y-%m-%dT%H:%M:%S.%fZ")
        date2 = datetime.strptime(sys.argv[2], "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        print("\nERROR: Invalid dates. From: %s, To: %s" % (sys.argv[1], sys.argv[2]) + "\n\nValid date format: %Y-%m-%dT%H:%M:%S.%fZ\n")
        sys.exit(1)
    range_q = RANGE % (sys.argv[1], sys.argv[2])
    print(QUERY % range_q)
The script also uses a bool query; it should be fairly easy to remove that and keep only the time constraints for the range.
I hope this is what you're looking for.
The script can be called as follows and prints a query such as:
./range_prefix_query.py.tmp 2016-08-10T00:00:00.000Z 2016-08-10T00:00:00.000Z
{"query":{"bool":{"must":[{"prefix": {"myType":"test"}},{"range":{"#timestamp":{"gte":"2016-08-10T00:00:00.000Z","lt":"2016-08-10T00:00:00.000Z"}}}]}}}
Artur

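Take a look at elasticsearch-dsl: https://elasticsearch-dsl.readthedocs.io/en/latest/
from elasticsearch_dsl import Search
# name, q and paging are placeholders that must be defined elsewhere
s = Search()\
    .filter("term", **{"name": name})\
    .query(q)\
    .extra(**paging)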
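Coming back to the original question of keeping the query as a string: the q parameter uses the Lucene query string syntax, which supports ranges of the form field:[min TO max] (square brackets are inclusive, * means unbounded). Below is a minimal sketch, not a definitive implementation. It assumes that data_model.utc_time is mapped as a date field and accepts ISO 8601 values, which the question does not confirm; build_query is a hypothetical helper name.

```python
from datetime import datetime

def build_query(gid, price, less_than_time):
    """Return a query string with an inclusive upper bound on the timestamp."""
    # Assumption: data_model.utc_time is a date field that accepts ISO 8601.
    upper = less_than_time.strftime("%Y-%m-%dT%H:%M:%S")
    return (
        'group_id:"{gid}" AND data_model.fields.price:{price} '
        'AND data_model.utc_time:[* TO "{upper}"]'
    ).format(gid=gid, price=price, upper=upper)

query = build_query("abc", "10", datetime(2016, 8, 10, 0, 0, 0))
print(query)
```

The resulting string could then be passed as q=query to the same self.es.search call as in the question, so the filtering happens on the server instead of in the loop.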

Related

How can I make a time string match up if the milliseconds do not match

I am trying to make a clock that stops at a certain time. This is the code I currently have:
import time as t
import datetime as dt
import os
tc = input("When do you want this to stop? (military time please) ")
exit = False
date = str(dt.datetime.now().date())
while (exit == False):
    if dt.datetime.now() == date + " " + tc + ":00.0000":
        exit = True
    else:
        print(dt.datetime.now())
        t.sleep(0.01)
        os.system('cls')
The problem is that the time never lands exactly on the target for the parts smaller than a second, so how do I get it to stop?
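Do you mean like this?
if dt.datetime.now().strftime("%Y-%m-%d %H:%M:%S") >= date + " " + tc + ":00":
Note that datetime.now() has to be formatted to the string you want first,
using something like datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), because a datetime object cannot be compared with a string directly.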
You could check whether the time has passed, something like
if dt.datetime.now() >= dt.datetime.strptime(date + " " + tc, "%Y-%m-%d %H:%M"):
datetime objects support comparison operators such as >= with other datetime objects, so the target string is parsed with strptime first (this assumes tc is entered as HH:MM). The idea is simply to check whether the current time is past the desired time instead of exactly equal to it.
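To make the "has the time passed" idea concrete, here is a small sketch that builds the target as a datetime once and then compares with >=. It assumes the user enters the time as HH:MM; parse_stop_time and should_stop are hypothetical helper names, and the fixed dates are only there so the example is reproducible.

```python
import datetime as dt

def parse_stop_time(today, tc):
    """Combine a date with an 'HH:MM' string into a datetime target."""
    stop = dt.datetime.strptime(tc, "%H:%M").time()
    return dt.datetime.combine(today, stop)

def should_stop(now, target):
    """True once the clock has reached or passed the target."""
    return now >= target

target = parse_stop_time(dt.date(2024, 1, 15), "14:30")
# 10 ms before the target: keep running
print(should_stop(dt.datetime(2024, 1, 15, 14, 29, 59, 990000), target))  # False
# 7 ms after the target: stop, even though the microseconds never matched exactly
print(should_stop(dt.datetime(2024, 1, 15, 14, 30, 0, 7000), target))     # True
```

In the loop from the question, the condition would become should_stop(dt.datetime.now(), target), with target computed once before the loop.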

Python filter a DBF by a range of date (between two dates)

I'm using the dbf library with python3.5.
The DBF table has one column with only dates (no time) and another with only times. I want to retrieve the records from the last five minutes.
I'm new to this module and currently see just two approaches to get a portion of the data stored in a DBF:
First, with the SQL-like query:
records = table.query("SELECT * WHERE (SA03 BETWEEN " + beforedfilter + " AND " + nowdfilter + ") AND (SA04 BETWEEN " + beforetfilter + " AND " + nowtfilter + ")")
This would be a familiar approach, but the records returned are the first records in the file, not those within the given time range. Is that because SQL querying is not well supported by the module, or am I making a mistake in my query? Another oddity is that after a few records are printed I get an exception: UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 3: ordinal not in range(128). To my knowledge there are no non-ASCII characters in the table.
The other approach is the module's default way of narrowing down records. I got stuck on the filtering: I could use it to find one specific date and time, but for a range I have no idea how to proceed.
index = table.create_index(lambda rec: rec.SA03)
records = index.search(match=(?!))
The simplest way is to have a filter function that only tracks matching records:
# lightly tested
import datetime
import dbf
def last_five_minutes(record, date_field, time_field):
    now = dbf.DateTime.now()
    record_date = record[date_field]
    try:
        # if time is stored as HH:MM:SS
        record_time = dbf.DateTime.strptime(record[time_field], '%H:%M:%S').time()
        moment = dbf.DateTime.combine(record_date, record_time)
        lapsed = now - moment
    except (ValueError, TypeError):
        # should log exceptions, not just ignore them
        return dbf.DoNotIndex
    if lapsed <= datetime.timedelta(seconds=300):
        # return value to sort on
        return moment
    else:
        # do not include this record
        return dbf.DoNotIndex
and then use it:
index = table.create_index(
    lambda rec: last_five_minutes(rec, 'date_field', 'time_field'))
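The core of the filter, independent of the dbf module, is to combine the separate date and time values into one moment before comparing. That also keeps the window correct across midnight, which separate range checks on the date and time columns cannot do. A rough sketch with the standard library only; within_window is a hypothetical helper, and it assumes the two fields have already been converted to datetime.date and datetime.time values.

```python
import datetime

FIVE_MINUTES = datetime.timedelta(minutes=5)

def within_window(record_date, record_time, now, window=FIVE_MINUTES):
    """True if the combined date and time lies in the last `window` before `now`."""
    moment = datetime.datetime.combine(record_date, record_time)
    lapsed = now - moment
    # a negative lapse means the record is in the future, so it is excluded
    return datetime.timedelta(0) <= lapsed <= window

# two minutes after midnight: a record from 23:58:30 the day before still counts
now = datetime.datetime(2024, 1, 2, 0, 2, 0)
print(within_window(datetime.date(2024, 1, 1), datetime.time(23, 58, 30), now))  # True
print(within_window(datetime.date(2024, 1, 1), datetime.time(23, 50, 0), now))   # False
```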

How to validate time format?

This is what I have so far, it probably is completely junk. What I want to do is validate caminput1, so that the format is HH:MM:SS.
The hashes are from when I was testing.
def cameraspeedcheck():
    timeformat = ("%H:%M:%S")
    caminput1 = input("At what time did sensor 1 actuate? ")
    # is caminput1 = time(HH:MM:SS)
    # time.strptime(caminput1[%H:%M:%S])
    caminput1.strptime(timeformat)
    # else cameraspeedcheck()
I am not very experienced with the syntax of all this stuff, or with coding in general, but before you tell me to go and look it up:
I have been looking around for ages, and I cannot find anything that explains the whole process.
strptime is a class method of datetime.datetime that accepts the string to parse as its first argument and the format as its second argument. So you should do:
import datetime
def cameraspeedcheck():
    timeformat = "%H:%M:%S"
    caminput1 = input("At what time did sensor 1 actuate? ")
    try:
        validtime = datetime.datetime.strptime(caminput1, timeformat)
        # Do your logic with validtime, which is a valid format
    except ValueError:
        # Do your logic for invalid format (maybe print some message?)
        print("Invalid time, expected HH:MM:SS")
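If only a yes/no answer is needed, the same try/except can be wrapped in a small predicate. This is a sketch along the lines of the answer above; is_valid_time is a hypothetical helper name, not something from the question.

```python
import datetime

def is_valid_time(text, timeformat="%H:%M:%S"):
    """Return True if text parses with the given format, False otherwise."""
    try:
        datetime.datetime.strptime(text, timeformat)
    except ValueError:
        return False
    return True

print(is_valid_time("12:34:56"))  # True
print(is_valid_time("25:00:00"))  # False, hour out of range
print(is_valid_time("12:34"))     # False, seconds missing
```

The input prompt can then sit in a loop that repeats until is_valid_time returns True, instead of the function calling itself again.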

Python script for EC2 snapshots, use datetime to delete old snapshots

I am a beginner with Python and I have written a Python script which takes a snapshot of a specified volume and then retains only the number of snapshots requested for that volume.
#Built with Python 3.3.2
import boto.ec2
from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo
from boto.ec2.snapshot import Snapshot
from datetime import datetime
from functools import cmp_to_key
import sys
aws_access_key = str(input("AWS Access Key: "))
aws_secret_key = str(input("AWS Secret Key: "))
regionname = str(input("AWS Region Name: "))
regionendpoint = str(input("AWS Region Endpoint: "))
region = RegionInfo(name=regionname, endpoint=regionendpoint)
conn = EC2Connection(aws_access_key_id = aws_access_key, aws_secret_access_key = aws_secret_key, region = region)
print (conn)
volumes = conn.get_all_volumes()
print ("%s" % repr(volumes))
vol_id = str(input("Enter Volume ID to snapshot: "))
keep = int(input("Enter number of snapshots to keep: "))
volume = volumes[0]
description = str(input("Enter volume snapshot description: "))
if volume.create_snapshot(description):
    print ('Snapshot created with description: %s' % description)
snapshots = volume.snapshots()
print (snapshots)
def date_compare(snap1, snap2):
    if snap1.start_time < snap2.start_time:
        return -1
    elif snap1.start_time == snap2.start_time:
        return 0
    return 1
snapshots.sort(key=cmp_to_key(date_compare))
delta = len(snapshots) - keep
for i in range(delta):
    print ('Deleting snapshot %s' % snapshots[i].description)
    snapshots[i].delete()
What I want to do now is, rather than specify the number of snapshots to keep, specify the date range of the snapshots to keep, for example delete anything older than a specific date and time. I have a rough idea where to start, and with the above script I already have the list of snapshots sorted by date. I would like to prompt the user for the date and time before which snapshots should be deleted, e.g. 2015-3-4 14:00:00; anything older than this would be deleted. Hoping someone can get me started here.
Thanks!!
First, you can prompt the user to specify the date and time before which snapshots should be deleted.
import datetime
user_time = str(input("Enter datetime from when you want to delete, like this format 2015-3-4 14:00:00:"))
real_user_time = datetime.datetime.strptime(user_time, '%Y-%m-%d %H:%M:%S')
print(real_user_time) # as you can see here, user time has been changed from a string to a datetime object
Second, delete anything older than that.
SOLUTION ONE:
for snap in snapshots:
    # drop the trailing ".000Z" before parsing
    start_time = datetime.datetime.strptime(snap.start_time[:-5], '%Y-%m-%dT%H:%M:%S')
    # older than the given time means a smaller start_time
    if start_time < real_user_time:
        snap.delete()
SOLUTION TWO:
Since snapshots is sorted oldest first, you can delete from the start of the list and stop at the first snapshot that is not older than real_user_time.
for snap in snapshots:
    # if snap.start_time is not a datetime object, you will have to parse it first like above
    start_time = datetime.datetime.strptime(snap.start_time[:-5], '%Y-%m-%dT%H:%M:%S')
    if start_time >= real_user_time:
        # everything from here on is newer, so stop
        break
    snap.delete()
Hope it helps. :)
Be careful. Make sure to normalize the start time values (e.g., convert them to UTC). It doesn't make sense to compare the time in user local timezone with whatever timezone is used on the server. Also the local timezone may have different utc offsets at different times anyway. See Find if 24 hrs have passed between datetimes - Python.
If all dates are in UTC then you could sort the snapshots as:
from operator import attrgetter
snapshots.sort(key=attrgetter('start_time'))
If snapshots is sorted then you could "delete anything older than a specific date & time" using bisect module:
from bisect import bisect
class Seq(object):
    def __init__(self, seq):
        self.seq = seq
    def __len__(self):
        return len(self.seq)
    def __getitem__(self, i):
        return self.seq[i].start_time
del snapshots[:bisect(Seq(snapshots), given_time)]
It removes all snapshots with start_time <= given_time from the list.
You could also remove older snapshots without sorting:
snapshots[:] = [s for s in snapshots if s.start_time > given_time]
If you want to call .delete() method explicitly without changing snapshots list:
for s in snapshots:
    if s.start_time <= given_time:
        s.delete()
If s.start_time is a string in the 2015-03-04T06:35:18.000Z format, then given_time should also be in that format (note: Z here means that the time is in UTC). If the user uses a different timezone, you have to convert the time before the comparison (str -> datetime -> datetime in UTC -> str). If given_time is already a string in the correct format, then you can compare the strings directly without converting them to datetime first.
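The conversion chain and the bisect idea can be tried out without AWS. The sketch below is only an illustration: Snap is a hypothetical stand-in for the boto snapshot objects, the UTC+08:00 offset is an assumed user timezone, and nothing is actually deleted.

```python
from bisect import bisect_right
from collections import namedtuple
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for a snapshot; only start_time matters here.
Snap = namedtuple("Snap", "id start_time")

def to_utc_string(aware_dt):
    """Convert a timezone-aware datetime to the 2015-03-04T06:35:18.000Z form."""
    return aware_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")

def split_by_cutoff(snapshots, cutoff):
    """Return (old, kept), where old holds snapshots with start_time <= cutoff."""
    ordered = sorted(snapshots, key=lambda s: s.start_time)
    starts = [s.start_time for s in ordered]
    # fixed-width UTC strings sort the same way as the times they represent
    i = bisect_right(starts, cutoff)
    return ordered[:i], ordered[i:]

snaps = [
    Snap("snap-c", "2015-03-05T09:00:00.000Z"),
    Snap("snap-a", "2015-03-03T10:00:00.000Z"),
    Snap("snap-b", "2015-03-04T06:35:18.000Z"),
]
# The user enters 2015-3-4 14:00:00 in a UTC+08:00 zone (assumed offset).
user_time = datetime(2015, 3, 4, 14, 0, 0, tzinfo=timezone(timedelta(hours=8)))
cutoff = to_utc_string(user_time)
old, kept = split_by_cutoff(snaps, cutoff)
print(cutoff)                # 2015-03-04T06:00:00.000Z
print([s.id for s in old])   # ['snap-a']
print([s.id for s in kept])  # ['snap-b', 'snap-c']
```

With real snapshots, .delete() would be called on each element of old. Note that snap-b would have been deleted by mistake if 14:00 had been compared as if it were UTC.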

Regex is not validating date correctly

def chkDay(x, size, part):
    dayre = re.compile('[0-3][0-9]') # day digit 0-9
    if (dayre.match(x)):
        if (len(x) > size):
            return tkMessageBox.showerror("Warning", "This "+ part +" is invalid")
            app.destroy
        else:
            tkMessageBox.showinfo("OK", "Thanks for inserting a valid "+ part)
    else:
        tkMessageBox.showerror("Warning", part + " not entered correctly!")
        root.destroy
#when clicked
chkDay(vDay.get(),31, "Day")
#interface of tkinter
vDay = StringVar()
Entry(root, textvariable=vDay).pack()
Problem:
Not validating, I can put in a day greater than 31 and it still shows: OK
root (application) does not close when I call root.destroy
Validating date with regex is hard. You can use some patterns from: http://regexlib.com/DisplayPatterns.aspx?cattabindex=4&categoryId=5&AspxAutoDetectCookieSupport=1
or from http://answers.oreilly.com/topic/226-how-to-validate-traditional-date-formats-with-regular-expressions/
Remember that it is especially hard to check whether a year is a leap year; for example, is the date 2011-02-29 valid or not?
I think it is better to use specialized functions to parse and validate dates. You can use strptime() from the datetime module.
Let the standard datetime library handle your datetime data as well as parsing:
import datetime
try:
    dt = datetime.datetime.strptime(date_string, '%Y-%m-%d')
except ValueError:
    # insert error handling
    print("invalid date: " + date_string)
else:
    # date_string is ok, it represents the date stored in dt, now use it
    print(dt)
A day such as 32 is accepted by your regex because [0-3][0-9] matches any two digits from 00 to 39, and len(x) > size compares the length of the string with 31, not the value of the day.
It would be better to convert it to an int and explicitly check its bounds.
Otherwise, the correct regex would be ([0-2]?\d|3[01]) to match a number from 0 up to 31.
In order to limit the values between 1 and 31, you could use:
[1-9]|[12][0-9]|3[01]
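Both suggestions can be compared side by side. One caveat with the pattern above: re.match only anchors at the start of the string, so the whole input has to be checked, for example with fullmatch. The sketch below assumes Python 3.4 or later for re.fullmatch (the question itself uses the Python 2 tkMessageBox module); both function names are hypothetical.

```python
import re

# Days 1 to 31 without a leading zero, as in the pattern above.
DAY_RE = re.compile(r"[1-9]|[12][0-9]|3[01]")

def is_valid_day_regex(text):
    """Regex check; fullmatch requires the whole string to match."""
    return DAY_RE.fullmatch(text) is not None

def is_valid_day_int(text):
    """Convert to int and check the bounds explicitly."""
    try:
        day = int(text)
    except ValueError:
        return False
    return 1 <= day <= 31

for value in ["9", "31", "32", "0", "07", "abc"]:
    print(value, is_valid_day_regex(value), is_valid_day_int(value))
```

The two checks agree except for "07": the int version accepts it as 7, while the regex rejects the leading zero. Which behaviour is wanted depends on the input format the form should allow.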
