Getting API Report Results into a pandas dataframe - python

I am having an issue with one of my vendors. Whenever I run a report through their statistics API, it is always run in Pacific Standard Time, even though I am in Eastern Standard Time. To account for this, I have to run the report with the start and end dates dialed back by three hours, then manually move the "TimeStamp" column forward by three hours. Finally, I need all the results inserted into my MS SQL instance. I can get the results back, but I am stuck on what to do next. My instinct says it is probably a pandas solution, but I am not sure how to get the results into a pandas DataFrame. Here is what I have so far (the vendor is Five9, and I found a library that helps me connect to their API and get the report results I want):
from five9 import Five9
from datetime import datetime, timedelta
import time
from pytz import timezone
import pyodbc
import json
now_utc = datetime.now(timezone('UTC'))
now_eastern = now_utc.astimezone(timezone('US/Eastern'))
#Change days from current time
startreportime = now_eastern - timedelta(days=2)
endreportime = now_eastern - timedelta(days=1)
#Set start and end time for report criteria
starttime = f"{(startreportime):%Y-%m-%d}" + 'T21:00:00.000'
endtime = f"{(endreportime):%Y-%m-%d}" + 'T20:59:00.000'
#connect to API
client = Five9('MyUID','MyPWD')
#Set variables as start and end
start = starttime
end = endtime
#set criteria using variables
criteria = {'time':{'end':end, 'start':start}}
#Get report and set criteria for report
identifier = client.configuration.runReport(folderName='Five9 Import Data',
                                            reportName='Agent State Details', criteria=criteria)
#Sleep so report has time to complete
time.sleep(30)
#Get report results
get_results = client.configuration.getReportResult(identifier)
results = get_results['records']
print(results)
Using this I get these kinds of results:
[{
    'values': {
        'data': [
            'Mon, 22 Feb 2021 21:00:00',
            'abowling#*****.com',
            'Adam',
            'Bowling',
            'Login',
            None,
            None,
            'TUPSS, Telamon Inbound, Stericycle Environment Inbound, Stericycle ComSol Inbound,
            '01:18:05',
            '08 - TS'
        ]
    }
If I could get these results into a dataframe I am pretty sure I could manage the rest. I know how to use a timedelta to handle the timestamp issues, and I can handle getting it from a dataframe to sql. I am just having a heck of a time trying to figure out how to get these results into a dataframe.
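One way to go from the 'records' list to a DataFrame is to pull each row's ['values']['data'] list out and hand the list of lists to pandas. The sketch below is only an illustration: the sample record is made up to match the shape shown above, and the column names are assumptions that you would replace with whatever your report actually returns. It also converts the timestamp from Pacific to Eastern with time zone names rather than a fixed three-hour shift.

```python
import pandas as pd

# Hypothetical sample shaped like the 'records' list printed above
results = [
    {"values": {"data": [
        "Mon, 22 Feb 2021 21:00:00", "agent@example.com", "Adam", "Bowling",
        "Login", None, None, "Skill A, Skill B", "01:18:05", "08 - TS",
    ]}},
]

# Assumed column names, one per position in the 'data' list
columns = ["TimeStamp", "Agent", "FirstName", "LastName", "State",
           "ReasonCode", "Media", "Skill", "StateTime", "Group"]

# Each record becomes one row
df = pd.DataFrame([r["values"]["data"] for r in results], columns=columns)

# Parse the naive Pacific timestamps, convert to Eastern,
# then drop the zone so the column is a plain datetime again
df["TimeStamp"] = (
    pd.to_datetime(df["TimeStamp"], format="%a, %d %b %Y %H:%M:%S")
    .dt.tz_localize("America/Los_Angeles")
    .dt.tz_convert("America/New_York")
    .dt.tz_localize(None)
)

print(df)
```

From there the frame can be written to SQL however you prefer, for example row by row from df.values.tolist().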

Not sure if anyone will read this, but I got it to work with the following:
def process_rows(rows):
    pacific = timezone("US/Pacific")
    eastern = timezone("US/Eastern")
    for row in rows:
        # The API returns naive Pacific timestamps; localize() attaches that zone.
        # (astimezone() on a naive datetime would assume the machine's local zone instead.)
        date1 = datetime.strptime(row['values']['data'][0], '%a, %d %b %Y %H:%M:%S')
        date2 = pacific.localize(date1).astimezone(eastern)
        cloned_row = list(row['values']['data'])
        cloned_row[0] = date2.strftime('%Y-%m-%d %H:%M:%S')
        # Normalize a state time reported as 24:00:00
        if cloned_row[8] == '24:00:00':
            cloned_row[8] = '00:00:00'
        yield cloned_row
args = process_rows(results)
insertSQL = ('''
INSERT INTO [Reporting].[dbo].[AgentState]
(TimeStamp, Agent, FirstName, LastName, State, ReasonCode, Media, Skill, StateTime, [Group])
VALUES (?,?,?,?,?,?,?,?,?,?)
'''
)
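# conn is assumed to be an open pyodbc connection to the SQL Server instance
cursor = conn.cursor()
cursor.fast_executemany = True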
cursor.executemany(insertSQL, args)
conn.commit()

Related

How to write a python function that returns the number of days between two dates

I am new to functions and I am trying to write a function that returns the number of days between two dates:
My attempt:
import datetime
from dateutil.parser import parse
def get_x_days_ago (date_from, current_date = None):
    td = current_date - parse(date_from)
    if current_date is None:
        current_date = datetime.datetime.today()
    else:
        current_date = datetime.datetime.strptime(date_from, "%Y-%m-%d")
    return td.days
print(get_x_days_ago(date_from="2021-04-10", current_date="2021-04-11"))
Expected outcome in days:
1
So there seem to be multiple issues, and as I said in the comments, a good idea would be to separate the parsing and the logic.
import datetime
from dateutil.parser import parse
def get_x_days_ago(date_from, current_date=None):
    if current_date is None:
        current_date = datetime.datetime.today()
    return (current_date - date_from).days
# Some other code, depending on where you are getting the dates from.
# Using the correct data types as the input to get_x_days_ago (datetime.datetime in this case) will avoid
# polluting the actual logic with the parsing/formatting.
# If it's a web framework, convert to dates in the View; if it's a CLI, convert in the CLI handling code.
date_from = parse('April 11th 2020')
date_to = None  # or parse('April 10th 2020')
days = get_x_days_ago(date_from, date_to)
print(days)
The error you get is from this line (as you should see in the traceback)
td = current_date - parse(date_from)
Since current_date="2021-04-11" is still a string while date_from has been parsed by parse(date_from), you are trying to subtract a datetime from a str.
P.S. If you have neither a web framework nor a CLI, you can put this parsing code into def main, or any other point in the code where you first get the initial strings representing the dates.
It looks like you're already aware that you can subtract a datetime from a datetime. I think, perhaps, you're really looking for this:
https://stackoverflow.com/a/23581184/2649560
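For the exact input from the question, a standard-library-only version of the same idea might look like the sketch below. The function name days_between is made up for illustration. The strings are parsed at the call site with date.fromisoformat, so the function itself only ever sees date objects.

```python
from datetime import date

def days_between(date_from, current_date=None):
    """Return the number of whole days from date_from to current_date (default: today)."""
    if current_date is None:
        current_date = date.today()
    return (current_date - date_from).days

# Parse the strings where they enter the program, not inside the function
start = date.fromisoformat("2021-04-10")
end = date.fromisoformat("2021-04-11")
print(days_between(start, end))  # 1
```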

Convert timestamp to time only

I'm getting the calendar results from outlook, fetching only the Start time and the Subject of each calendar item.
import win32com, win32com.client
import datetime, time, pytz
def getCalendarEntries():
    Outlook = win32com.client.Dispatch("Outlook.Application")
    appointments = Outlook.GetNamespace("MAPI").GetDefaultFolder(9).Items
    appointments.Sort("[Start]")
    appointments.IncludeRecurrences = "True"
    today = datetime.datetime.today().date().strftime("%Y-%d-%m")
    tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).strftime("%Y-%d-%m")
    appointments = appointments.Restrict("[Start] >= '" + today + "' AND [Start] < '" + tomorrow + "'")
    events = {'Start': [], 'Subject': []}
    for a in appointments:
        events['Start'].append(a.Start)
        events['Subject'].append(a.Subject)
    return events
calendar = getCalendarEntries()
n = len(calendar['Start'])
i = 0
while n:
    print(calendar['Start'][i], calendar['Subject'][i])
    n -= 1
    i += 1
This is the result, and it is correct:
$ py test_outlook.py
2019-12-06 10:00:00+00:00 test apointment
What I need now is to manipulate the data above to get only the time, 10:00, so that I can do calculations and find out how much time there is until the event starts (for example, whether it is 10 minutes away, 1 hour away, etc.).
I really have no idea how to do it. Does anyone have any ideas?
Uri Goren seems to have answered the question here: https://stackoverflow.com/a/38992623/8678978
You need to use strptime with the datetime format to get a datetime object, and then you can extract the time portion.
dateString = '2019-12-06 10:00:00+00:00'
dateObject = datetime.datetime.strptime(dateString[0:19], '%Y-%m-%d %H:%M:%S')
Now you have a datetime object and can get the time parts using:
dateObject.hour
dateObject.minute
dateObject.second
I am not sure what type getCalendarEntries returns. You can find out by adding an additional temporary line in your program:
print(type(calendar['Start'][i]))
If it is a datetime object, you can simply query the hour attribute:
hours = calendar['Start'][i].hour
If getCalendarEntries returns a POSIX timestamp, you can first convert it to a Python datetime object and then query the hour
dt = datetime.datetime.fromtimestamp(calendar['Start'][i])
hours = dt.hour
If it is a string, you can parse it using datetime.fromisoformat:
dt = datetime.datetime.fromisoformat(calendar['Start'][i])
hours = dt.hour
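Once the start value is a timezone-aware datetime, the "how long until it starts" part is a plain subtraction that yields a timedelta. The sketch below assumes the start value parses like the printed output in the question. The helper name time_until and the fixed "now" are made up so the example is repeatable.

```python
from datetime import datetime, timedelta, timezone

def time_until(start, now=None):
    """Return a timedelta from now until start (negative if start has passed)."""
    if now is None:
        # Use the event's own time zone so both sides are timezone-aware
        now = datetime.now(start.tzinfo)
    return start - now

# Same shape as the value printed in the question
start = datetime.fromisoformat("2019-12-06 10:00:00+00:00")
# Fixed "now" for the demo; omit it to use the real current time
now = datetime(2019, 12, 6, 9, 50, tzinfo=timezone.utc)

remaining = time_until(start, now)
print(start.strftime("%H:%M"))                # time of day only
print(remaining)                              # timedelta until the event
print(int(remaining.total_seconds() // 60))   # whole minutes until the event
```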

Using datetime to evaluate if a given time is already in the past

This is my current code:
import requests
import json
res = requests.get("http://transport.opendata.ch/v1/connections?from=Baldegg_kloster&to=Luzern&fields[]=connections/from/prognosis/departure")
parsed_json = res.json()
time_1 = parsed_json['connections'][0]['from']['prognosis']
time_2 = parsed_json['connections'][1]['from']['prognosis']
time_3 = parsed_json['connections'][2]['from']['prognosis']
The JSON data looks like this:
{
"connections": [
{"from": {"prognosis": {"departure": "2018-08-04T14:21:00+0200"}}},
{"from": {"prognosis": {"departure": "2018-08-04T14:53:00+0200"}}},
{"from": {"prognosis": {"departure": "2018-08-04T15:22:00+0200"}}},
{"from": {"prognosis": {"departure": "2018-08-04T15:53:00+0200"}}}
]
}
Time_1, 2 and 3 all contain different times where the train departs. I want to check if time_1 is already in the past, and time_2 now is the relevant time. In my opinion, using datetime.now to get the current time and then using If / elif to check if time_1 is sooner than datetime.now would be a viable option. I am new to coding, so I am unsure if this is a good way of doing it. Would this work and are there any better ways?
PS: I am planning to make a display that displays the time the next train leaves. Therefore, it would have to check if the time is still relevant over and over again.
The following code extracts all the departure time strings from the JSON data, and converts the valid time strings to datetime objects. It then prints the current time, and then a list of the departure times that are still in the future.
Sometimes the converted JSON has None for a departure time, so we need to deal with that. And we need to get the current time as a timezone-aware object. We could just use the UTC timezone, but it's more convenient to use the local timezone from the JSON data.
import json
from datetime import datetime
import requests
url = "http://transport.opendata.ch/v1/connections?from=Baldegg_kloster&to=Luzern&fields[]=connections/from/prognosis/departure"
res = requests.get(url)
parsed_json = res.json()
# Extract all the departure time strings from the JSON data
time_strings = [d["from"]["prognosis"]["departure"]
                for d in parsed_json["connections"]]
#print(time_strings)
# The format string to parse ISO 8601 date + time strings
iso_format = "%Y-%m-%dT%H:%M:%S%z"
# Convert the valid time strings to datetime objects
times = [datetime.strptime(ts, iso_format)
         for ts in time_strings if ts is not None]
# Grab the timezone info from the first time
tz = times[0].tzinfo
# The current time, using the same timezone
nowtime = datetime.now(tz)
# Get rid of the microseconds
nowtime = nowtime.replace(microsecond=0)
print('Now', nowtime)
# Print the times that are still in the future
for i, t in enumerate(times):
    if t > nowtime:
        diff = t - nowtime
        print('{}. {} departing in {}'.format(i, t, diff))
output
Now 2018-08-04 17:17:25+02:00
1. 2018-08-04 17:22:00+02:00 departing in 0:04:35
2. 2018-08-04 17:53:00+02:00 departing in 0:35:35
3. 2018-08-04 18:22:00+02:00 departing in 1:04:35
That query URL is a bit ugly, and not convenient if you want to check on other stations. It's better to let requests build the query URL for you from a dictionary of parameters. And we should also check that the request was successful, which we can do with the raise_for_status method.
Just replace the top section of the script with this:
import json
from datetime import datetime
import requests
endpoint = "http://transport.opendata.ch/v1/connections"
params = {
    "from": "Baldegg_kloster",
    "to": "Luzern",
    "fields[]": "connections/from/prognosis/departure",
}
res = requests.get(endpoint, params=params)
res.raise_for_status()
parsed_json = res.json()
If you've never used enumerate before, it can be a little confusing at first. Here's a brief demo of three different ways to loop over a list of items and print each item and its index number.
things = ['zero', 'one', 'two', 'three']
for i, word in enumerate(things):
    print(i, word)
for i in range(len(things)):
    word = things[i]
    print(i, word)
i = 0
while i < len(things):
    word = things[i]
    print(i, word)
    i = i + 1
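For the display mentioned in the question, which only needs the next train, the filtering step can be wrapped in a small function that is called on every refresh. This is a sketch rather than part of the script above: next_departure is a made-up name, and the sample strings and the fixed "now" are there only to make the example repeatable without calling the API.

```python
from datetime import datetime

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S%z"

def next_departure(time_strings, now=None):
    """Return the earliest departure still in the future, or None if there is none."""
    # Skip missing prognosis values, as in the script above
    times = [datetime.strptime(ts, ISO_FORMAT) for ts in time_strings if ts is not None]
    if not times:
        return None
    if now is None:
        # Current time in the same zone as the departures
        now = datetime.now(times[0].tzinfo)
    upcoming = [t for t in times if t > now]
    return min(upcoming) if upcoming else None

# Made-up sample data in the same format as the API response
departures = ["2018-08-04T14:21:00+0200", None,
              "2018-08-04T14:53:00+0200", "2018-08-04T15:22:00+0200"]
now = datetime.strptime("2018-08-04T14:30:00+0200", ISO_FORMAT)

nxt = next_departure(departures, now)
print(nxt)
```

Calling next_departure(time_strings) without the second argument compares against the real current time, so a display loop can call it again every few seconds.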
I am not sure I fully understood your question, but I think you are trying to compare two times.
First let's see the contents of time_1:
{'departure': '2018-08-04T15:24:00+0200'}
So add the departure key to access the time string. To parse the date and time string into a Python datetime, use the datetime.strptime() method; see its documentation for a description of the format codes.
The modified version of your code that does the time comparison:
import requests
import json
from datetime import datetime
res = requests.get("http://transport.opendata.ch/v1/connections?from=Baldegg_kloster&to=Luzern&fields[]=connections/from/prognosis/departure")
parsed_json = res.json()
time_1 = parsed_json['connections'][0]['from']['prognosis']['departure']
time_2 = parsed_json['connections'][1]['from']['prognosis']['departure']
time_3 = parsed_json['connections'][2]['from']['prognosis']['departure']
mod_time_1 = datetime.strptime(time_1,'%Y-%m-%dT%H:%M:%S%z')
mod_time_2 = datetime.strptime(time_2,'%Y-%m-%dT%H:%M:%S%z')
# you need to provide datetime.now() your timezone.
timezone = mod_time_1.tzinfo
time_now = datetime.now(timezone)
print(time_now > mod_time_1)

How to resolve created events' time mismatches due to Calendar API upgrade?

For reference, my timezone is Eastern - New York.
I am inserting events from a PostgreSQL database into a Google Calendar. I have been using UTC-4 since early June, when I finally got my app moved from v2 to v3, and for a couple of years before that in v2. Up until August 18, that gave me the correct time. On August 18 the time was off by one hour, so I changed the setting to UTC-5. That worked for about 2 hours, and then I had to reset it back to UTC-4.
Now today, August 21, it is off an hour again and I have set the UTC back to -5. The events are getting inserted as they should with the exception of an event being an hour off and the UTC needing to be changed sometimes. The system time is correct on my server.
Any ideas on what is happening?
Some of my code snippets:
#get an event from a PostgreSQL database to insert into a Google Calendar
curs.execute("SELECT c_event_title,c_name,c_event_date,c_event_starttime,c_event_endtime,c_department,seat_arrange,c_attendee_count from sched_421 where sched_id_421=%i;" %recnum)
mit=curs.fetchall() # mit IS NOW ALL THE RESULTS OF THE QUERY
for myrec in mit: # FOR THE ONE RECORD (EVENT) IN THE QUERY RESULTS
    myend_time = time.strftime("%I:%M %p", time.strptime(str(myrec[4]),"%H:%M:%S"))
    if myend_time[0]=='0': # Remove leading zero for 01:00 - 09:00
        myend_time = myend_time[1:]
    title = ' - %s %s - Group:%s' %(myend_time,myrec[0],myrec[5])
    mycontent = myrec[0]+' - '+ myrec[5]
    content = mycontent
    where = where_dict[room_calendar]
    # THIS IS WHERE THE UTC IS, SOMETIMES 4 WORKS SOMETIMES 5 WORKS
    start_time = '%sT%s-05:00' %(myrec[2],myrec[3]) # Google format
    end_time = '%sT%s-05:00' %(myrec[2],myrec[4]) # Google format
    myend_time = '%s' %myrec[4] # User format (am/pm)
    seat_arrange = '\nSeating - %s' %str(myrec[6])
    attendee_count = '\nNumber of participants: %s' %str(myrec[7])
    descript = str(myrec[0]) + ' ' + seat_arrange + attendee_count+ "\n Created By: me#somewhere.com"
    # upload the event to the calendar
    created_event = service.events().insert(calendarId=calendar_dict[room_calendar], body=event).execute()
Are the dates you are looking at on different sides of the daylight savings switch?
Eastern Time Zone is UTC-4:00 from March to November and UTC-5:00 from November to March.
Hard-coding the TZ Offset like that is a bad idea, especially in a TZ that uses daylight savings. It would be best to store all the times as UTC and just apply the TZ information at the endpoints (data input and data display).
At the very least, you will want to have something calculate the correct TZ offset, based on the date, like a helper function or some block of logic.
I'm not sure how much control you have over the data in the database, so that would dictate which path you choose.
Ideally, you could change the 3 fields (date, start time, end time) in the database into 2 (start datetime UTC, end datetime UTC)
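As a rough illustration of letting the time zone database pick the offset instead of hard-coding it, the sketch below combines a date and a time with the America/New_York zone and formats the result with the correct offset for that date. It assumes Python 3.9+ for zoneinfo and that the database columns arrive as datetime.date and datetime.time objects. The helper name to_rfc3339 is made up.

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")

def to_rfc3339(event_date, event_time, tz=EASTERN):
    """Combine a date and a wall-clock time into a string with the right UTC offset."""
    return datetime.combine(event_date, event_time, tzinfo=tz).isoformat()

# A summer date falls in daylight saving time, a winter date does not
print(to_rfc3339(date(2013, 8, 21), time(9, 30)))   # 2013-08-21T09:30:00-04:00
print(to_rfc3339(date(2013, 12, 2), time(9, 30)))   # 2013-12-02T09:30:00-05:00
```

The offset then follows the event's own date rather than the date on which the script happens to run.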
I have had to change this code:
# THIS IS WHERE THE UTC IS, SOMETIMES 4 WORKS SOMETIMES 5 WORKS
start_time = '%sT%s-05:00' %(myrec[2],myrec[3]) # Google format
end_time = '%sT%s-05:00' %(myrec[2],myrec[4]) # Google format
to the following, which checks whether the event falls in daylight saving time (this was not necessary with v2):
if bool (pytz.timezone('America/New_York').dst(datetime.datetime(myrec[2].year,myrec[2].month,myrec[2].day), is_dst=None)):
    utc_offset = '4'
else:
    utc_offset = '5'
start_time = '%sT%s-0%s:00' %(myrec[2],myrec[3],utc_offset)
end_time = '%sT%s-0%s:00' %(myrec[2],myrec[4],utc_offset)

Formatting time from Google Calendar API with Python

I am trying to get an easy to read time format to list events from google calendar for the current day. I can pull in the data, but I'm having a problem formatting the data to be just the Hour and minute for both start time and end time.
I want to display the information in an easy to read list, so I want to drop the date and seconds and only display the time in order. I have tried several different methods including slicing and trying to convert into date time with no luck.
date = datetime.datetime.now()
tomorrow = date.today() + datetime.timedelta(days=2)
yesterday = date.today() - datetime.timedelta(days=1)
now = str
data = '{:%Y-%m-%d}'.format(date)
tdata = '{:%Y-%m-%d}'.format(tomorrow)
ydata = '{:%Y-%m-%d}'.format(yesterday)
def DateQuery(calendar_service, start_date=data, end_date=tdata):
    print 'Date query for events on Primary Calendar: %s to %s' % (start_date, end_date,)
    query = gdata.calendar.service.CalendarEventQuery('default', 'private', 'full')
    query.start_min = start_date
    query.start_max = end_date
    feed = calendar_service.CalendarQuery(query)
    for i, an_event in enumerate(feed.entry):
        print '\'%s\'' % (an_event.title.text)
        for a_when in an_event.when:
            dstime = (a_when.start_time,)
            detime = (a_when.end_time,)
            print '\t\tStart time: %s' % (dstime)
            print '\t\tEnd time: %s' % (detime)
It prints like this
End time: 2013-03-23T04:00:00.000-05:00
and I would prefer it be
End time: 04:00
Using the dateutil module:
>>> import dateutil.parser
>>> dateutil.parser.parse('2013-03-23T04:00:00.000-05:00')
>>> dt = dateutil.parser.parse('2013-03-23T04:00:00.000-05:00')
>>> dt.strftime('%I:%M')
'04:00'
If you don't want to use dateutil, you can also parse the string using the specific format with strptime.
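A strptime version might look like the sketch below. It assumes Python 3.7 or later, where %z accepts an offset written with a colon such as -05:00. The sample string is the one from the question.

```python
from datetime import datetime

raw = "2013-03-23T04:00:00.000-05:00"

# %f consumes the fractional seconds, %z the UTC offset
dt = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f%z")

# Keep only the hour and minute for display
label = dt.strftime("%H:%M")
print("End time:", label)
```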
