Python script for EC2 snapshots, use datetime to delete old snapshots - python

I am a beginner with Python and I have written a Python script which takes a snapshot of a specified volume and then retains only the number of snapshots requested for that volume.
#Built with Python 3.3.2
import boto.ec2
from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo
from boto.ec2.snapshot import Snapshot
from datetime import datetime
from functools import cmp_to_key
import sys
aws_access_key = str(input("AWS Access Key: "))
aws_secret_key = str(input("AWS Secret Key: "))
regionname = str(input("AWS Region Name: "))
regionendpoint = str(input("AWS Region Endpoint: "))
region = RegionInfo(name=regionname, endpoint=regionendpoint)
conn = EC2Connection(aws_access_key_id = aws_access_key, aws_secret_access_key = aws_secret_key, region = region)
print (conn)
volumes = conn.get_all_volumes()
print ("%s" % repr(volumes))
vol_id = str(input("Enter Volume ID to snapshot: "))
keep = int(input("Enter number of snapshots to keep: "))
volume = volumes[0]
description = str(input("Enter volume snapshot description: "))
if volume.create_snapshot(description):
    print ('Snapshot created with description: %s' % description)
snapshots = volume.snapshots()
print (snapshots)
def date_compare(snap1, snap2):
    if snap1.start_time < snap2.start_time:
        return -1
    elif snap1.start_time == snap2.start_time:
        return 0
    return 1
snapshots.sort(key=cmp_to_key(date_compare))
delta = len(snapshots) - keep
for i in range(delta):
    print ('Deleting snapshot %s' % snapshots[i].description)
    snapshots[i].delete()
What I want to do now is, rather than use the number of snapshots to keep, specify the date range of the snapshots to keep: for example, delete anything older than a specific date and time. I have a rough idea of where to start, and based on the above script I already have the list of snapshots sorted by date. What I would like to do is prompt the user for the date and time before which snapshots should be deleted, e.g. 2015-3-4 14:00:00; anything older than this would be deleted. Hoping someone can get me started here.
Thanks!!

First, you can prompt the user for the date and time before which snapshots should be deleted.
import datetime
user_time = input("Enter the cutoff datetime in this format, 2015-3-4 14:00:00: ")
real_user_time = datetime.datetime.strptime(user_time, '%Y-%m-%d %H:%M:%S')
print(real_user_time) # user_time has been converted from a string to a datetime object
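Second, delete anything older than that.
SOLUTION ONE:
for snap in snapshots:
    # snap.start_time is assumed to be a string like 2015-03-04T06:35:18.000Z; drop the trailing ".000Z" before parsing
    start_time = datetime.datetime.strptime(snap.start_time[:-5], '%Y-%m-%dT%H:%M:%S')
    # "older" means the snapshot started before the cutoff
    if start_time < real_user_time:
        snap.delete()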
SOLUTION TWO:
Since snapshots is sorted oldest first, you can delete from the start of the list and stop at the first snapshot that is not older than real_user_time.
for snap in snapshots:
    # if snap.start_time is not already a datetime object, you have to parse it first, as above
    start_time = datetime.datetime.strptime(snap.start_time[:-5], '%Y-%m-%dT%H:%M:%S')
    if start_time >= real_user_time:
        # everything from here on is newer than the cutoff, so stop
        break
    snap.delete()
Hope it helps. :)
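If you want to try the cutoff logic without touching AWS, here is a minimal sketch. FakeSnapshot and delete_older_than are hypothetical names made up for illustration (they are not part of boto), and the start_time format is assumed to look like 2015-03-04T06:35:18.000Z.

```python
from datetime import datetime

# Assumed shape of start_time strings, e.g. 2015-03-04T06:35:18.000Z
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


class FakeSnapshot:
    """Hypothetical stand-in for a boto snapshot, for illustration only."""

    def __init__(self, snap_id, start_time):
        self.id = snap_id
        self.start_time = start_time
        self.deleted = False

    def delete(self):
        # A real snapshot would call the EC2 API here.
        self.deleted = True


def delete_older_than(snapshots, cutoff):
    """Call delete() on every snapshot started before cutoff; return their ids."""
    removed = []
    for snap in snapshots:
        started = datetime.strptime(snap.start_time, STAMP_FORMAT)
        if started < cutoff:
            snap.delete()
            removed.append(snap.id)
    return removed


snaps = [
    FakeSnapshot("snap-old", "2015-03-01T06:35:18.000Z"),
    FakeSnapshot("snap-new", "2015-03-05T06:35:18.000Z"),
]
cutoff = datetime.strptime("2015-3-4 14:00:00", "%Y-%m-%d %H:%M:%S")
removed = delete_older_than(snaps, cutoff)
print(removed)
```

Both values are treated as naive datetimes here, so this only makes sense if the cutoff is entered in the same timezone as the snapshot times.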

Be careful. Make sure to normalize the start time values (e.g., convert them to UTC). It doesn't make sense to compare a time in the user's local timezone with whatever timezone is used on the server. Also, the local timezone may have different UTC offsets at different times anyway. See Find if 24 hrs have passed between datetimes - Python.
If all dates are in UTC then you could sort the snapshots as:
from operator import attrgetter
snapshots.sort(key=attrgetter('start_time'))
If snapshots is sorted then you could "delete anything older than a specific date & time" using bisect module:
from bisect import bisect
class Seq(object):
    def __init__(self, seq):
        self.seq = seq
    def __len__(self):
        return len(self.seq)
    def __getitem__(self, i):
        return self.seq[i].start_time
del snapshots[:bisect(Seq(snapshots), given_time)]
This removes all snapshots with start_time <= given_time from the list.
You could also remove older snapshots without sorting:
snapshots[:] = [s for s in snapshots if s.start_time > given_time]
If you want to call .delete() method explicitly without changing snapshots list:
for s in snapshots:
    if s.start_time <= given_time:
        s.delete()
If s.start_time is a string in the 2015-03-04T06:35:18.000Z format, then given_time should be in that format too (note: Z here means that the time is in UTC). If the user uses a different timezone, you have to convert the time before comparison (str -> datetime -> datetime in UTC -> str). If given_time is already a string in the correct format, you can compare the strings directly without converting them to datetime first.
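The str -> datetime -> datetime in UTC -> str chain could look roughly like this. It is only a sketch: to_utc_stamp is a made-up helper name, and it assumes the user's zone can be described by a fixed UTC offset, which ignores daylight saving changes (use a real timezone library if that matters).

```python
from datetime import datetime, timedelta, timezone


def to_utc_stamp(text, utc_offset_hours, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert local time text to a UTC string like 2015-03-04T06:35:18.000Z.

    utc_offset_hours is an assumption: a fixed offset for the user's zone.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # str -> naive datetime -> aware datetime in the user's zone
    aware = datetime.strptime(text, fmt).replace(tzinfo=local_tz)
    # aware datetime -> UTC -> str in the snapshot format
    return aware.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")


given_time = to_utc_stamp("2015-3-4 14:00:00", 2)
print(given_time)
# Strings in this format sort chronologically, so they can be compared directly.
print("2015-03-04T06:35:18.000Z" <= given_time)
```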

Related

Python how to make datetime update time

I'm using datetime with pytz, but I can't get the time to update.
format = "[%B %d %H:%M]"
now_utc = datetime.now(timezone('UTC'))
greece = now_utc.astimezone(timezone('Europe/Athens'))
date = greece.strftime(format)
For example, if I print(date) at 11:30, it stays like that.
Any idea?
As it is, date remains the same throughout the runtime. Nothing updates it to the current time. If you want to check and print the time at regular intervals, you need to define a function and have your script call it each time that interval has passed.
import time
from datetime import datetime
from pytz import timezone
fmt = "[%B %d %H:%M]"
def print_now():
    # recompute the time on every call so the output is always current
    now_utc = datetime.now(timezone('UTC'))
    greece = now_utc.astimezone(timezone('Europe/Athens'))
    date = greece.strftime(fmt)
    print(date)
while True:
    print_now()
    time.sleep(60) # argument is time to wait in seconds
As long as True is True (which is always), the loop will continue, unless you define some condition to force it to end at some point. Of course, you could put the contents of print_now() directly inside the while loop, but it's a bit cleaner to have them in their own function.

Is there a function to 'autocomplete' a variable to a desired library?

I'm trying to set a variable to one I have in a library. Is there a command to do this?
I'm trying to make a simple time zone converter and I want to check the input variable, but I can only check it against the names in the list from pytz, so I want to 'autocomplete' the variable. Can I do this?
import time
import pytz
country = input("enter country")
from datetime import datetime
from pytz import timezone
fmt = "%H:%M %p"
now_utc = datetime.now(timezone('UTC'))
print (now_utc.strftime(fmt))
from pytz import all_timezones
if country in all_timezones:
    country = #completed country in list 'all_timezones'
    timecountry = now_utc.astimezone(timezone(country))
    print (timecountry.strftime(fmt))
So what you are looking for is a way to match the user input against the strings in all_timezones and find a valid timezone.
As far as I know, there is no built-in function that does it; you have to do it yourself.
It's not a trivial task, as there may be multiple matches (let's say the user inputs just 'Europe') and you have to take this into consideration.
One possible way to do it is the following:
import datetime
import time
import pytz
country = input("Country name: ")
now_utc = datetime.datetime.now(pytz.timezone('UTC'))
fmt = "%H:%M %p"
while True:
    # every timezone name that contains the user input as a substring
    possible_countries = [ac for ac in pytz.all_timezones if country in ac]
    if len(possible_countries) == 1:
        cc = possible_countries[0]
        timecountry = now_utc.astimezone(pytz.timezone(cc))
        print(timecountry.strftime(fmt))
        break
    elif len(possible_countries) > 1:
        print("Multiple countries are possible, please rewrite the country name")
        for cs in possible_countries:
            print(cs)
        country = input("Country name: ")
    else:
        print("No idea of the country, here are the possible choices")
        for cs in pytz.all_timezones:
            print(cs)
        country = input("Country name: ")
With a list comprehension I look for all the strings in all_timezones which contain the user input. If there is just one, the script assumes that it is the correct one and performs the task. Otherwise, if there are multiple possibilities, it prints them (one per row with the for loop, but you may just print the list so it's shorter on the screen) and then asks the user to rewrite the country name. If there is no match, it just prints all the possibilities. You may find it ugly to see on the command line, but you should get the idea and then improve it.
If you wish to check also for spelling errors in the user input... that is a lot more difficult.
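For simple typos, the standard library's difflib.get_close_matches can get you part of the way. The sketch below is only an illustration: suggest_zones is a made-up helper, SAMPLE_ZONES is a tiny hand-written list standing in for pytz.all_timezones, and the 0.6 cutoff is just the difflib default, not a tuned value.

```python
import difflib

# Tiny stand-in for pytz.all_timezones so the example runs without pytz.
SAMPLE_ZONES = ["Europe/Athens", "Europe/Rome", "America/New_York", "Asia/Tokyo"]


def suggest_zones(user_text, zone_names, limit=3):
    """Return up to limit zone names that look similar to user_text."""
    # Compare in lowercase so 'europe/athens' can still find 'Europe/Athens'.
    lowered = {name.lower(): name for name in zone_names}
    hits = difflib.get_close_matches(
        user_text.lower(), list(lowered), n=limit, cutoff=0.6
    )
    return [lowered[hit] for hit in hits]


suggestions = suggest_zones("europe/athns", SAMPLE_ZONES)
print(suggestions)
```

The best match comes first, so you could offer suggestions[0] as a "did you mean" prompt and fall back to the substring search above when the list is empty.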

Python elasticsearch range query

I know that there are several alternative elasticsearch clients for python beyond this one. However, I do not have access to those. How can I write a query that has a 'less than or equal' logic for a timestamp? My current way of doing this is:
query = 'group_id:"' + gid + '" AND data_model.fields.price:' + price
less_than_time = # datetime object
data = self.es.search(index=self.es_index, q=query, size=searchsize)
hits = data['hits']['hits']
results = []
for hit in hits:
    time = datetime.strptime(hit['_source']['data_model']['utc_time'], time_format)
    dt = abs(time - less_than_time).seconds
    if dt <= 0:
        results.append(hit)
This is a really clumsy way of doing it. Is there a way I can keep my query generation using strings and include a range?
I have a little script that generates a query for me. The query, however, is in JSON notation (which I believe the client can use).
Here's my script:
#!/usr/bin/python
from datetime import datetime
import sys
RANGE = '"range":{"#timestamp":{"gte":"%s","lt":"%s"}}'
QUERY = '{"query":{"bool":{"must":[{"prefix": {"myType":"test"}},{%s}]}}}'
if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("\nERROR: 2 Date arguments needed: From and To, for example:\n\n./range_query.py 2016-08-10T00:00:00.000Z 2016-08-10T00:00:00.000Z\n\n")
        sys.exit(1)
    try:
        # parse only to validate the format; the original strings go into the query
        date1 = datetime.strptime(sys.argv[1], "%Y-%m-%dT%H:%M:%S.%fZ")
        date2 = datetime.strptime(sys.argv[2], "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        print("\nERROR: Invalid dates. From: %s, To: %s" % (sys.argv[1], sys.argv[2]) + "\n\nValid date format: %Y-%m-%dT%H:%M:%S.%fZ\n")
        sys.exit(1)
    range_q = RANGE % (sys.argv[1], sys.argv[2])
    print(QUERY % (range_q))
The script also uses a bool query. It should be fairly easy to remove that and use only the time constraints for the range.
I hope this is what you're looking for.
This can be called and spits out a query such as:
./range_prefix_query.py.tmp 2016-08-10T00:00:00.000Z 2016-08-10T00:00:00.000Z
{"query":{"bool":{"must":[{"prefix": {"myType":"test"}},{"range":{"#timestamp":{"gte":"2016-08-10T00:00:00.000Z","lt":"2016-08-10T00:00:00.000Z"}}}]}}}
Artur
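If you only need the 'less than or equal' part, the bool wrapper can be dropped and the body built as a plain dict instead of by string formatting. This is a sketch under assumptions: build_range_query is a made-up helper, and the field name data_model.utc_time is guessed from the path used in the question.

```python
import json


def build_range_query(field, lte):
    """Return a search body matching documents whose field is <= lte."""
    return {"query": {"range": {field: {"lte": lte}}}}


# Field name assumed from hit['_source']['data_model']['utc_time'] in the question.
body = build_range_query("data_model.utc_time", "2016-08-10T00:00:00.000Z")
print(json.dumps(body))
```

Depending on your client version, a dict like this is typically passed to search as the request body in place of the q string, so check the documentation for the version you have installed.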
Take a look at https://elasticsearch-dsl.readthedocs.io/en/latest/
s = Search()\
.filter("term", **{"name": name})\
.query(q)\
.extra(**paging)

Batch filename rename with conditional or math operation

This is my first question here. Thanks in advance.
I have automatically uploaded hundreds of images to a Webfaction server with an incorrect timestamp in the filename and now I have to rename them.
Here is an example:
tl0036_20161121-120558.jpg
myCode_yyyymmdd_hhmmss.jpg
I have to change the "hh" characters 1 hour back, so 12 should be 11. They are the 17th and 18th position. I imagine two ways of doing it:
Math operation: Images are taken from 1am to 11pm, so I see no problem in doing just a math operation like 12-1=11 for all of them.
Conditional: if 17th and 18th characters value are 12, then rename to 11. 24 conditions should be written, from 01 to 23 starting value.
I have found many answers here to replace/rename a fixed string and others about conditional operation, but nothing about my kind of mixed replacement.
Please, I need advice on what the script should look like, assuming it will be executed inside the folder containing the files. I am a novice, used to working with bash or Python.
Thank you!
Solution using datetime in Python
import time
import datetime
def change_filename(fn):
    # EXTRACT JUST THE TIMESTAMP AS A STRING
    timestamp_str = fn[7:22]
    # CONVERT TO UNIX TIMESTAMP
    timestamp = time.mktime(datetime.datetime.strptime(timestamp_str, "%Y%m%d-%H%M%S").timetuple())
    # SUBTRACT AN HOUR (3600 SECONDS)
    timestamp = timestamp - 3600
    # CHANGE BACK TO A STRING
    timestamp_str = datetime.datetime.fromtimestamp(timestamp).strftime("%Y%m%d-%H%M%S")
    # RETURN THE FILENAME WITH THE NEW TIMESTAMP
    return fn[:7] + timestamp_str + fn[22:]
This takes into account possible changes in the day, month, and year that could happen by putting the timestamp back an hour. If you're using a 12-hour time rather than 24 hour, you can use the format string "%Y%m%d-%I%M%S" instead; see the Python docs.
Credit to: Convert string date to timestamp in Python and Converting unix timestamp string to readable date in Python
This assumes that your myCode is of a fixed length, if not you could use the str.split method to pull out the hours from after the -, or if your filenames have an unknown number/placement of -s, you could look at using regular expressions to find the hours and replace them using capturing groups.
In Python, you can use a combination of glob and shutil.move to walk through your files and rename them using that function. You might want to use a regular expression to ensure that you only operate on files matching your naming scheme, if there are other files also in the directory/ies.
Naive Solution
With the caveats about the length of myCode and filename format as above.
If your timestamps are using the 24 hour format (00-23 hours), then you can replace the hours by subtracting one, as you say; but you'd have to use conditionals to ensure that you take care of turning 23 into 00, and take care of adding a leading zero to hours less than 10.
An example in Python would be:
def change_filename(fn):
    hh = int(fn[16:18])
    if hh == 0:
        hh = 23
    else:
        hh -= 1
    hh = str(hh)
    # ADD LEADING ZERO IF hh < 10
    if len(hh) == 1:
        hh = '0' + hh
    return fn[:16] + str(hh) + fn[18:]
As pointed out above, an important point to bear in mind is that this approach would not put the day back by one if the hour is 00 and is changed to 23, so you would have to take that into account as well. The same could happen for the month, and the year, and you'd have to take these all into account. It's much better to use datetime.
For your file renaming logic, not only are you going to have issues over day boundaries, but also month, year, and leap-year boundaries. For example, if your filename is tl0036_20160101-000558.jpg, it needs to change to tl0036_20151231-230558.jpg. Another example: tl0036_20160301-000558.jpg will be tl0036_20160229-230558.jpg.
Creating this logic from scratch will be very time consuming - but luckily there's the datetime module in Python that has all this logic already built in.
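To see datetime handle those boundaries, here is a small sketch that shifts just the timestamp part of the two example filenames. shift_stamp is a made-up helper name for illustration.

```python
from datetime import datetime, timedelta

STAMP = "%Y%m%d-%H%M%S"  # yyyymmdd-hhmmss, as in the filenames


def shift_stamp(stamp, hours=-1):
    """Return the timestamp string moved by the given number of hours."""
    moved = datetime.strptime(stamp, STAMP) + timedelta(hours=hours)
    return moved.strftime(STAMP)


print(shift_stamp("20160101-000558"))  # year boundary
print(shift_stamp("20160301-000558"))  # leap-year boundary
```

The first call rolls back into 2015 and the second lands on February 29, without any hand-written conditionals.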
Your script should consist of the following steps:
1. Iterate through each '.jpg' file in your folder.
2. Try to match your timestamp file name pattern for each '.jpg'.
3. Extract the timestamp values and create a datetime.datetime object out of those values.
4. Subtract a datetime.timedelta object equal to 1 hour from that datetime.datetime object, and set that as your new timestamp.
5. Construct a new filename with your new timestamp.
6. Replace the old filename with the new filename.
Here's an example implementation:
import datetime
import glob
import os
import re
def change_timestamps(source_folder, hourdiff = -1):
    file_re = re.compile(r'(.*)_(\d{8}-\d{6})\.jpg')
    for source_file_name in glob.iglob(os.path.join(source_folder, '*.jpg')):
        # match against the bare file name so group(1) does not include the folder path
        file_match = file_re.match(os.path.basename(source_file_name))
        if file_match is not None:
            old_time = datetime.datetime.strptime(file_match.group(2), "%Y%m%d-%H%M%S")
            new_time = old_time + datetime.timedelta(hours = hourdiff)
            new_file_str = '{}_{}.jpg'.format(file_match.group(1), new_time.strftime("%Y%m%d-%H%M%S"))
            new_file_name = os.path.join(source_folder, new_file_str)
            os.replace(source_file_name, new_file_name)

Python 2.7 - Add user inputed number to current time?

So basically I need to add a number that the user inputs, I'm just using raw_input, and I want to add that number, in minutes, to the current time.
So:
breakTime = int(raw_input("How long do you want to have a break for?"))
And I want to add whatever they type to
datetime.datetime.now()
Is this possible?
Thanks :)
You can use a datetime.timedelta for that:
import datetime
minutes = int(raw_input("break time"))
dt = datetime.timedelta(minutes = minutes)
later = datetime.datetime.now() + dt
