I need to repeatedly call a web URL that ends with a timestamp.
Example URL:
'https://mywebApi/StartTime=2019-05-01%2000:00:00&&endTime=2019-05-01%2003:59:59'
StartTime=2019-05-01%2000:00:00
is the URL representation of the time 2019-05-01 00:00:00
endTime=2019-05-01%2003:59:59
is the URL representation of the time 2019-05-01 03:59:59
The requirement is to make repeated calls, each covering a 4-hour window.
Adding 4 hours may change the date.
Is there a lean way to generate the URL string?
Something like:
baseUrl = 'https://mywebApi/StartTime='
startTime = DateTime(2018-05-03 00:01:00)
terminationTime = DateTime(2019-05-03 00:05:00)
while (startTime < terminationTime):
    endTime = startTime + hours(4)
    url = baseUrl+str(startTime)+"endtime="+str(startTime)
    # request get url
    startTime = startTime + hours(1)
You can use datetime.timedelta together with the strftime function, as follows:
from datetime import datetime, timedelta
baseUrl = 'https://mywebApi/StartTime='
# %% emits a literal %, so the space between date and time is written as %20
fmt = "%Y-%m-%d%%20%H:%M:%S"
startTime = datetime(year=2018, month=5, day=3, hour=0, minute=1, second=0)
terminationTime = datetime(year=2018, month=5, day=3, hour=3, minute=59, second=59)
while startTime < terminationTime:
    endTime = startTime + timedelta(hours=4)
    # "&&endTime=" matches the separator used in the example URL
    url = baseUrl + startTime.strftime(fmt) + "&&endTime=" + endTime.strftime(fmt)
    # request get url
    startTime = endTime
The following link is useful https://www.guru99.com/date-time-and-datetime-classes-in-python.html or you can look at the official datetime documentation.
edit: using what u/John Gordan said to declare the initial dates
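For what it's worth, here is a minimal sketch of the same idea as a generator, assuming the placeholder host from the question and that each window should end one second before the next one starts (as in the example URL). The names window_urls and BASE_URL are made up for illustration, and urllib.parse.quote is used to encode the space instead of writing %20 by hand.

```python
from datetime import datetime, timedelta
from urllib.parse import quote

BASE_URL = "https://mywebApi/StartTime="  # placeholder host from the question


def window_urls(start, stop, hours=4):
    """Yield one URL per consecutive window of `hours` between start and stop."""
    step = timedelta(hours=hours)
    while start < stop:
        # End one second before the next window starts, as in the example URL.
        end = min(start + step, stop) - timedelta(seconds=1)
        # quote() turns the space into %20; safe=":" leaves the colons alone.
        s = quote(start.strftime("%Y-%m-%d %H:%M:%S"), safe=":")
        e = quote(end.strftime("%Y-%m-%d %H:%M:%S"), safe=":")
        yield f"{BASE_URL}{s}&&endTime={e}"
        start += step


# The windows cross midnight, so the date part changes on its own.
urls = list(window_urls(datetime(2019, 4, 30, 20), datetime(2019, 5, 1, 4)))
for u in urls:
    print(u)
```

Because the arithmetic is done on datetime objects and formatting happens last, the date rollover needs no special handling.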
Related
I am scraping data from Twitter using snscrape, but I am unable to get only the tweets posted within the past hour.
import pandas as pd
import snscrape.modules.twitter as sntwitter
from datetime import datetime, time
from datetime import timedelta
now = datetime.utcnow()
since = now - timedelta(hours=1)
since_str = since.strftime('%Y-%m-%d %H:%M:%S.%f%z')
until_str = now.strftime('%Y-%m-%d %H:%M:%S.%f%z')
# Query tweets with hashtag #SOSREX in the last one hour
query = '#SOSREX Since:' + since_str + ' until:' + until_str
SOSREX_data = []
for tweet in sntwitter.TwitterSearchScraper(query).get_items():
    if len(SOSREX_data)>100:
        break
    else:
        SOSREX_data.append([tweet.date,tweet.user.username,tweet.user.displayname,
                            tweet.content,tweet.likeCount,tweet.retweetCount,
                            tweet.sourceLabel,tweet.user.followersCount,tweet.user.location
                            ])
Tweets_data = pd.DataFrame(SOSREX_data,
                           columns=["Date_tweeted","username","display_name",
                                    "Tweets","Number_of_Likes","Number_retweets",
                                    "Source_of_Tweet",
                                    "number_of_followers","location"
                                    ])
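One possible workaround, which does not depend on how the search operators parse timestamps, is to filter on the date of each item after fetching it. This is only a sketch: it assumes each item has a timezone-aware date attribute (as tweet.date is used above), the helper name within_last_hour is made up, and stand-in objects are used here instead of a real scraper call.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def within_last_hour(items, now=None):
    """Yield items whose .date falls in the hour before `now` (timezone-aware)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=1)
    for item in items:
        if cutoff <= item.date <= now:
            yield item


# Stand-in objects; with snscrape these would come from get_items().
now = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
fake = [SimpleNamespace(date=now - timedelta(minutes=m)) for m in (5, 30, 90)]
recent = list(within_last_hour(fake, now=now))
print(len(recent))
```

Note that datetime.utcnow() returns a naive datetime, which cannot be compared with an aware one, hence datetime.now(timezone.utc) in the sketch.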
What is the best way to filter a REST request by date?
Would passing a variable work, maybe like this:
today = date.today()
today_90 = today - timedelta(days = 90)
service-now.com/api/now/table/incident?sysparm_limit=1000&sysparm_query=sys_created_on**dates values here?**
If I understand your problem correctly:
from datetime import date, timedelta
import requests
today = date.today()
today_90 = today - timedelta(days = 90)
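# ^ joins conditions (AND) in a ServiceNow encoded query, so both bounds go in one sysparm_query
query = 'sys_created_on>' + str(today_90) + '^sys_created_on<' + str(today)
r = requests.get('https://xxxx.service-now.com/api/now/table/incident', params={'sysparm_limit': 1000, 'sysparm_query': query})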
This is a side project I am doing as I am attempting to learn Python.
I am trying to write a Python script that iterates through a date range and uses each date in a GET request URL.
The URL uses a LastModified parameter and limits GET requests to a 24-hour period, so I would like to run the GET request for each day from the start date.
Below is what I have so far. The main issue is how to separate the returned dates so that I can use each one in its own GET request; the GET will presumably also need to be looped over the dates.
Any pointers in the right direction would be helpful, as I am trying to learn as much as possible.
import datetime
start_date = datetime.date(2020, 1, 1)
end_date = datetime.date.today()
delta = datetime.timedelta(days=1)
while start_date <= end_date:
    last_mod = start_date + delta
    print(last_mod)
    start_date += delta
import requests
from requests.auth import HTTPBasicAuth
vend_key = 'REDACTED'
user_key = 'REDACTED'
metrc_license = 'A12-0000015-LIC'
base_url = 'https://sandbox-api-ca.metrc.com'
last_mod_date = ''
a = HTTPBasicAuth(vend_key, user_key)
def get(path):
    url = '{}/{}/?licenseNumber={}&lastModifiedStart={}'.format(base_url, path, metrc_license, last_mod_date, )
    print('URL:', url)
    r = requests.get(url, auth=a)
    print("The server response is: ", r.status_code)
    if r.status_code == 200:
        return r.json()
    # Would like an elif that is r.status_code is 500 wait _ seconds and try again
    elif r.status_code == 500:
        print("500 error, try again.")
    else:
        print("Error")
print((get('/packages/v1/active')))
Here is an example of what the current script returns. I do not need it to print each date, so I can remove the print, but how can I make each date produced by the loop a variable that I can use in a loop around the GET?
2020-01-02
2020-01-03
2020-01-04
2020-01-05
2020-01-06
etc...
etc...
etc...
2020-05-24
2020-05-25
2020-05-26
2020-05-27
URL: https://sandbox-api-ca.metrc.com//packages/v1/active/?licenseNumber=A12-0000015-LIC&lastModifiedStart=2020-05-27
The server response is: 200
[]
It's simple: move the while loop that generates these dates into your get() function. Here is what I mean:
import datetime
import requests
from requests.auth import HTTPBasicAuth
vend_key = 'REDACTED'
user_key = 'REDACTED'
metrc_license = 'A12-0000015-LIC'
base_url = 'https://sandbox-api-ca.metrc.com'
a = HTTPBasicAuth(vend_key, user_key)
def get(path):
    results = []
    start_date = datetime.date(2020, 1, 1)
    end_date = datetime.date.today()
    delta = datetime.timedelta(days=1)
    while start_date <= end_date:
        last_mod_date = start_date + delta
        print(last_mod_date)
        start_date += delta
        url = '{}/{}/?licenseNumber={}&lastModifiedStart={}'.format(base_url, path, metrc_license, last_mod_date, )
        print('URL:', url)
        r = requests.get(url, auth=a)
        print("The server response is: ", r.status_code)
        if r.status_code == 200:
            # Collect each day's result; a return here would stop after the first date
            results.append(r.json())
        # Would like an elif that is r.status_code is 500 wait _ seconds and try again
        elif r.status_code == 500:
            print("500 error, try again.")
        else:
            print("Error")
    return results
print((get('/packages/v1/active')))
One thing you could do is call your get function inside the while loop. First modify the get function to take a new parameter date and then use this parameter when you build your url.
For instance:
def get(path, date):
    url = '{}/{}/?licenseNumber={}&lastModifiedStart={}'.format(base_url, path, metrc_license, date, )
    ...
And then call get inside the while loop.
while start_date <= end_date:
    last_mod = start_date + delta
    get(some_path, last_mod)
    start_date += delta
This would make a lot of GET requests in a short period of time, so you might want to be careful not to overload the server with requests.
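To go with that warning, and with the "wait and try again on 500" comment in the question, here is a rough sketch of a per-day loop with a simple retry. Everything named here (daterange, fetch_with_retry, fake_fetch) is hypothetical; fake_fetch stands in for the real requests.get call so the example runs without a network, and the retry count and wait time are arbitrary.

```python
import time
from datetime import date, timedelta


def daterange(start, end):
    """Yield every date from start to end, inclusive."""
    while start <= end:
        yield start
        start += timedelta(days=1)


def fetch_with_retry(fetch, day, retries=3, wait=5.0, sleep=time.sleep):
    """Call fetch(day) -> (status, body); retry on HTTP 500 after waiting."""
    for attempt in range(retries):
        status, body = fetch(day)
        if status == 200:
            return body
        if status != 500:
            break  # other errors are not retried in this sketch
        sleep(wait)
    return None


# Hypothetical fetch function standing in for the real requests.get call.
calls = []
def fake_fetch(day):
    calls.append(day)
    # Fail once with 500 on the very first call, then succeed.
    return (500, None) if len(calls) == 1 else (200, [str(day)])


results = {}
for day in daterange(date(2020, 1, 1), date(2020, 1, 3)):
    # sleep is swapped out here so the example finishes instantly
    results[day] = fetch_with_retry(fake_fetch, day, sleep=lambda s: None)
print(results)
```

In real use you would leave sleep at its default and probably add a short time.sleep between days as well, so the server is not hit with hundreds of requests back to back.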
The goal is to use datetime to iterate over
http://www.harness.org.au/racing/results/?firstDate=01-01-2019
http://www.harness.org.au/racing/results/?firstDate=02-01-2019 ... up to yesterday's date
(this should be done in new_url = base_url + str(enddate1)).
Then, on each of those pages, I want to loop over the meetingListFull table to get each name and href, and then get the results data from each track for that day.
My current error is '<=' not supported between instances of 'datetime.timedelta' and 'str', which comes from my while loop. Why is this? I have never used datetime before.
from datetime import datetime, date, timedelta
import requests
import re
from bs4 import BeautifulSoup
base_url = "http://www.harness.org.au/racing/results/?firstDate="
base1_url = "http://www.harness.org.au"
webpage_response = requests.get('http://www.harness.org.au/racing/results/?firstDate=')
soup = BeautifulSoup(webpage_response.content, "html.parser")
format = "%d-%m-%y"
delta = timedelta(days=1)
yesterday = datetime.today() - timedelta(days=1)
yesterday1 = yesterday.strftime(format)
enddate = datetime(2019, 1, 1)
enddate1 = enddate.strftime(format)
while enddate1 <= yesterday1:
    enddate1 =+ timedelta(days=1)
    new_url = base_url + str(enddate1)
    soup12 = requests.get(new_url)
    soup1 = BeautifulSoup(soup12.content, "html.parser")
    table1 = soup1.find('table', class_='meetingListFull')
    for tr in table1.find_all('tr'):
        all_cells = tr.find_all('td')
        track = all_cells.a.href.get_text()
        href = all_cells.get('href')
        trackresults = base1_url + href
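This
yesterday1 = yesterday.strftime(format)
is a string. In addition, enddate1 =+ timedelta(days=1) assigns a timedelta to enddate1, because =+ is a plain assignment of +timedelta(...), not +=. On the next pass the loop compares that timedelta with the string, which is why you are getting that error. Compare the datetime objects themselves and format the date only when building the URL.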
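A minimal sketch of the date loop along those lines, leaving out the scraping part. The function name result_urls is made up for illustration, and %Y is used because the example URLs show a four-digit year.

```python
from datetime import date, timedelta

base_url = "http://www.harness.org.au/racing/results/?firstDate="


def result_urls(first, last):
    """Yield one results URL per day from first to last, inclusive."""
    day = first
    while day <= last:  # compare date objects, not formatted strings
        # %Y gives the four-digit year used in the example URLs
        yield base_url + day.strftime("%d-%m-%Y")
        day += timedelta(days=1)  # += advances; =+ would assign +timedelta


urls = list(result_urls(date(2019, 1, 1), date(2019, 1, 3)))
for u in urls:
    print(u)
```

For the real run, the last argument would be date.today() - timedelta(days=1).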
I am required to extract the time of the day from the datetime.datetime object returned by the created_at attribute, but how can I do that?
This is my code for getting the datetime.datetime object.
from datetime import *
import tweepy
consumer_key = ''
consumer_secret = ''
access_token = ''
access_secret = ''
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)
tweets = tweepy.Cursor(api.home_timeline).items(limit = 2)
t1 = datetime.strptime('Wed Jun 01 12:53:42 +0000 2011', '%a %b %d %H:%M:%S +0000 %Y')
for tweet in tweets:
    print (tweet.created_at - t1)
    t1 = tweet.created_at
I need to only extract the hour and minutes from t1.
I don't know how you want to format it, but you can do:
print("Created at %s:%s" % (t1.hour, t1.minute))
for example.
If the time is 11:03, then afrendeiro's answer will print 11:3.
You could zero-pad the minutes:
"Created at {:d}:{:02d}".format(tdate.hour, tdate.minute)
Or go another way and use tdate.time() and only take the hour/minute part:
str(tdate.time())[0:5]
import datetime
YEAR = datetime.date.today().year # the current year
MONTH = datetime.date.today().month # the current month
DATE = datetime.date.today().day # the current day
HOUR = datetime.datetime.now().hour # the current hour
MINUTE = datetime.datetime.now().minute # the current minute
SECONDS = datetime.datetime.now().second #the current second
print(YEAR, MONTH, DATE, HOUR, MINUTE, SECONDS)
2021 3 11 19 20 57
It's easier to use the timestamp for these things since Tweepy gets both (here t1 is assumed to be a numeric Unix timestamp, not a datetime):
import datetime
print(datetime.datetime.fromtimestamp(int(t1)).strftime('%H:%M'))
datetime has the attributes hour and minute. So to get the hours and minutes, you would use t1.hour and t1.minute.
However, when you subtract two datetimes, the result is a timedelta, which only stores days, seconds, and microseconds. So you'll need to divide and multiply as necessary to get the numbers you need.
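As a small illustration of that arithmetic, here is one way to turn a timedelta into hours and minutes. The two datetimes are made-up values, and total_seconds() is used so the days part is included.

```python
from datetime import datetime

# Made-up example values; with Tweepy these would be two created_at datetimes.
earlier = datetime(2011, 6, 1, 12, 53, 42)
later = datetime(2011, 6, 2, 15, 8, 42)

delta = later - earlier  # a timedelta: 1 day, 2:15:00
total_minutes = int(delta.total_seconds()) // 60
hours, minutes = divmod(total_minutes, 60)
text = f"{hours}:{minutes:02d}"
print(text)
```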