How do you scrape daily data from a dynamic website using Python?

The following code works but stops after the 29th of February. The website returns "You have entered an invalid date. Please re-enter your search", which requires clicking "OK". How do I get around this?
country_search("United States")
time.sleep(2)
date_select = Select(driver.find_element_by_name("dr"))
date_select.select_by_visible_text("Enter date range...") #All Dates
select_economic_news()
#btnModifySearch
for month in range(1,9):
    for day in range(1,32):
        try:
            set_from_month(month)
            set_from_date(day)
            set_from_year("2020")
            set_to_month(month)
            set_to_date(day)
            set_to_year("2020")
            time.sleep(5)
            #select_economic_news()
            time.sleep(5)
            search_now()
            time.sleep(8)
            export_csv()
            modify_search()
            time.sleep(5)
            #country_remove()
        except ElementClickInterceptedException:
            break
logout()

If you can only use the methods featured in the initial post then I would try something like:
set_from_year('2020')
set_to_year('2020')
for month in range(1, 9):
    # range(1, 9) covers Jan to Aug
    month_str = '0' + str(month)
    set_from_month(month_str)
    set_to_month(month_str)
    for day in range(1, 32):
        # Assuming an error is thrown for invalid days
        try:
            set_from_date(day)
            set_to_date(day)
            # Run the search and store data as needed
        except Exception as e:
            # print(e) to learn from the error if needed
            pass
There is a lot more that goes into this if it turns out that you're writing these methods yourself and need to loop through HTML and find a pattern for daily data.

I believe you want to dynamically obtain the number of days in a month, so that you can loop over that number to get data for each date. You can do this as follows:
from datetime import datetime
currentDay = datetime.today()
# You can set currentDay like this if you want the data up to the current date, or
# up to whenever your scheduler runs the job.
# Now you need the number of days in each month from the chosen start date. You can
# add a corresponding function such as getStartMonth() to your program that
# returns the starting month.
from calendar import monthrange
daysPerMonth = {}
year = currentDay.year #TODO : change this to getStartYear()
startMonth = 3 # TODO : Implement getStartMonth() in your code.
for month in range(startMonth, currentDay.month+1):
    # monthrange returns (weekday of the first day, number of days in that month)
    daysPerMonth[month] = monthrange(year, month)[1]
for month in daysPerMonth.items():
    print(month[0], '-',month[1])
This will output something like the following (the number of days in each month from March 2020 to August 2020):
3 - 31
4 - 30
5 - 31
6 - 30
7 - 31
8 - 31
You can then loop over the number of days in each month, taking the range from the dict you've obtained.
NOTE: In the function where you run the loop to get data for each date, add an if condition that checks whether it's the last day of the year and updates the year accordingly.
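An alternative that sidesteps both the days-per-month lookup and the year rollover is to step through real dates with timedelta, so an invalid date such as 30 February is never produced. This is only a sketch: iter_dates is a made-up helper name, the date range is an example, and the set_from_* helpers mentioned in the comment are the ones from the question.

```python
from datetime import date, timedelta

def iter_dates(start, end):
    """Yield every calendar date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)

# Example range (an assumption): 1 January 2020 to 31 August 2020.
for d in iter_dates(date(2020, 1, 1), date(2020, 8, 31)):
    # Here you would call your own helpers, for example
    # set_from_month(d.month), set_from_date(d.day), set_from_year(str(d.year)).
    pass
```

Because the loop only ever sees dates that exist, the "invalid date" dialog should not be triggered by the date itself.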

Maybe you can use this function to get the number of days in a month:
import datetime
def get_month_days_count(year: int, month: int) -> int:
    date = datetime.datetime(year, month, 1)
    # Step forward one day at a time until the next day falls in a different month
    while (date + datetime.timedelta(days=1)).month == month:
        date = date + datetime.timedelta(days=1)
    return date.day

Related

How to check whether a date is in the next week, python

Basically, I'm trying to check whether a date, e.g. 2021-07-08, is in the next week, or the week after that, or neither.
#I can call the start and end dates of the current week
start = tday - timedelta(days=tday.weekday())
end = start + timedelta(days=6)
print("Today: " + str(tday))
print("Start: " + str(start))
print("End: " + str(end))
# and I can get the current week number.
curr_week = datetime.date.today().strftime("%V")
print(curr_week)
Is there a better way than getting a list of dates in curr_week + 1 and then checking whether the date is in that list?
Thanks so much
GENERAL ANSWER
It is best to stick to datetime and timedelta, since they handle edge cases such as year changes and years with 53 weeks.
So find the number of the next week, and compare the week number of the date you want to check against it.
import datetime
# Date to check in date format.
# Note: "%Y-%d-%m" reads the string as year-day-month; use "%Y-%m-%d" for year-month-day strings.
check_date = datetime.datetime.strptime("2021-09-08", "%Y-%d-%m").date()
# Current week number:
curr_week = datetime.date.today().strftime("%V")
# number of next week
next_week = (datetime.date.today()+datetime.timedelta(weeks=1)).strftime("%V")
# number of the week after that
week_after_next_week = (datetime.date.today()+datetime.timedelta(weeks=2)).strftime("%V")
# Compare week numbers of next weeks to the week number of the date to check:
if next_week == check_date.strftime("%V"):
    # Date is within next week, put code here
    pass
elif week_after_next_week == check_date.strftime("%V"):
    # Date is the week after next week, put code here
    pass
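One caveat with comparing week numbers alone is that a date in the same week number of a different year would also match. A possible variation, sketched below, compares the ISO year together with the ISO week. The helper names iso_year_week and which_week are made up for this example, and today is passed in explicitly only so the result is reproducible.

```python
from datetime import date, timedelta

def iso_year_week(d):
    """Return (ISO year, ISO week number) for a date."""
    iso = d.isocalendar()
    return (iso[0], iso[1])

def which_week(check_date, today):
    """Return 'next', 'after next', or None for check_date relative to today."""
    target = iso_year_week(check_date)
    if target == iso_year_week(today + timedelta(weeks=1)):
        return "next"
    if target == iso_year_week(today + timedelta(weeks=2)):
        return "after next"
    return None

today = date(2021, 8, 11)  # fixed example date
print(which_week(date(2021, 8, 18), today))  # next
print(which_week(date(2022, 8, 17), today))  # None (same week number, different year)
```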
OLD ANSWER
This messes up around year changes, and modulo doesn't fix it because there are years with 53 weeks.
You can compare the week numbers by converting them to integers. You don't need to create a list of all dates within the next week.
import datetime
# Date to check in date format (note: "%Y-%d-%m" reads the string as year-day-month):
check_date = datetime.datetime.strptime("2021-07-08", "%Y-%d-%m").date()
# Current week number as an integer:
curr_week = int(datetime.date.today().strftime("%V"))
# Compare week numbers:
if curr_week == (int(check_date.strftime("%V"))-1):
    # Date is within next week, put code here
    pass
elif curr_week == (int(check_date.strftime("%V"))-2):
    # Date is the week after next week, put code here
    pass
You can parse the date you want to check into a datetime, and then compare the week numbers.
import datetime
# date you want to check
date = datetime.datetime.strptime("2021-07-08","%Y-%m-%d")
# current date
tday = datetime.date.today()
# compare the weeks
print(date.strftime("%V"))
print(tday.strftime("%V"))
27
32
[see Alfred's answer]
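You can get the week number directly as an integer from the IsoCalendarDate representation of each date.
from datetime import datetime
date_format = '%Y-%m-%d'
t_now = datetime.strptime('2021-08-11', date_format)
target_date = datetime.strptime('2021-08-18', date_format)
# isocalendar() returns (ISO year, ISO week number, ISO weekday)
week_now = t_now.isocalendar()[1]
week_target = target_date.isocalendar()[1]
print(week_now, week_target)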
Just compare dates using datetime:
from datetime import datetime, timedelta
def in_next_week(date):
    """ -1: before; 0: in; 1: after next week;"""
    today = datetime.today()
    this_monday = today.date() - timedelta(today.weekday())
    start = this_monday + timedelta(weeks=1)
    end = this_monday + timedelta(weeks=2)
    return -1 if date < start else 0 if date < end else 1
Test cases:
for i in range(14):
    dt = datetime.today().date() + timedelta(days=i)
    print(dt, in_next_week(dt))

Datetime usage in Python for finance related task

I am a complete beginner in Python and this is my first question on Stack Overflow. I have tried numerous tutorials on YouTube plus some additional Google searching, but haven't really been able to solve my task completely. Briefly, it is as follows:
We have a dataset of futures prices (values) for next 12-36 months. Each value corresponds to one month in future. The idea for the code is to have an input of following:
starting date in days (like 2nd of Feb 2021 or any other)
duration of given days (say 95 or 150 days or 425 days)
The code has to calculate the number of days from each given month between starting and ending date (which is starting + duration) and then to use appropriate values from corresponding month to calculate an average price for this particular duration in time.
Example:
Starting date is 2nd of Feb 2021 and duration is 95 days (end date 8th of May). Values are Feb - 7750, Mar - 9200, April - 9500, May is 10100.
I have managed to do the same in Excel (which was very clumsy and too complicated to use on a daily basis) and the average comes to around 8949 given all of the above. But I can't figure out how to code the same "interval" with days per month in Python. All of the articles simply point to the "monthrange" function, but how can that be applied to this task?
I appreciate your patience with a newbie question, and I'm sorry that I lack the knowledge to express my thoughts more clearly.
Looking forward to any help relative to above.
You can use pandas.to_datetime() to construct your code. If you need further help, just press Ctrl + Tab within your code to see the inputs and their usage.
You can try the following code.
The input_start_date() function reads the start date from the user and returns it.
Once we have the start date, we read the duration in days.
Then we simply add the two using timedelta.
For the distribution of days across the months, credit goes to SO user @wwii.
import datetime
from datetime import timedelta
def input_start_date():
    YEAR = int(input('Enter the year : '))
    MONTH = int(input('Enter the month : '))
    DAY = int(input('Enter the day : '))
    DATE = datetime.date(YEAR, MONTH, DAY)
    return DATE
# get the start date:
Start_date = input_start_date()
# get the Duration
Duration = int(input('Enter the duration : '))
print('Start Date : ', Start_date)
print('Duration :', Duration)
# final date.
Final_date = Start_date + timedelta(days=Duration)
print(Final_date)
# credit goes to @wwii -----------------------
one_day = datetime.timedelta(1)
start_dates = [Start_date]
end_dates = []
today = Start_date
# Stop before Final_date so that no empty trailing interval is created
# when Final_date is the last day of a month.
while today < Final_date:
    tomorrow = today + one_day
    if tomorrow.month != today.month:
        start_dates.append(tomorrow)
        end_dates.append(today)
    today = tomorrow
end_dates.append(Final_date)
# -----------------------------------------------
print("Distribution : ")
for i in range(len(start_dates)):
    # .days gives the difference as an integer; +1 counts both endpoints
    days = (end_dates[i] - start_dates[i]).days + 1
    print(start_dates[i], ' to ', end_dates[i], ' = ', days)
'''
Distribution :
2021-02-02 to 2021-02-28 = 27
2021-03-01 to 2021-03-31 = 31
2021-04-01 to 2021-04-30 = 30
2021-05-01 to 2021-05-08 = 8
'''
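To get from per-month day counts to the average price the question asks for, one option is a day-weighted average. The sketch below is only an illustration: average_price and the prices dict keyed by (year, month) are made-up names, and it assumes the period covers exactly the given number of days starting on the start date (so 2 Feb 2021 plus 95 days runs through 7 May). That convention appears to be the one behind the roughly 8949 figure in the question.

```python
from collections import Counter
from datetime import date, timedelta

def average_price(start, duration_days, monthly_prices):
    """Day-weighted average of monthly prices over duration_days from start."""
    days_per_month = Counter()
    for offset in range(duration_days):
        day = start + timedelta(days=offset)
        days_per_month[(day.year, day.month)] += 1
    # Each month's price counts once for every day of the period that falls in it.
    total = sum(monthly_prices[month] * n for month, n in days_per_month.items())
    return total / duration_days

# Prices from the question, keyed by (year, month).
prices = {(2021, 2): 7750, (2021, 3): 9200, (2021, 4): 9500, (2021, 5): 10100}
avg = average_price(date(2021, 2, 2), 95, prices)
print(round(avg, 2))  # 8948.95
```

A month that is missing from the dict raises a KeyError, which is probably what you want when a price is not available.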

Finding Month from Day, Week and Year Python

I cannot figure out how to take the year, day and week and return the month. Right now I am just trying to develop a Python script that will do this. The goal after finishing this script is to use it in a Spark SQL query to find the month, since each row of my data gives me a day, year and week.
As of now my Python code looks like this. It only works for the call in the print statement, print(getmonth(2, 30, 2018)), which returns 7. I have tried other dates and the output is only "None". I have also tried variables, but with no success.
import datetime
def month(day, week, year):
    for month in range(1,13):
        try:
            date = datetime.datetime(year, month, day)
        except ValueError:
            iso_year, iso_weeknum, iso_weekday = date.isocalendar()
        if iso_weeknum == week:
            return date.month
print(getmonth(2, 30, 2018))
#iso_(year,weeknum,weekday) are the ISO fields. Year is 1-9999, weeknum is 1-52 or 53, and weekday is 1-7
#isocalendar() returns a tuple (year, week#, weekday)
I don't really understand your question, but I think datetime will work. Source: "Get date from ISO week number in Python":
>>> from datetime import datetime
>>> day = 28
>>> week = 30
>>> year = 2018
>>> t = datetime.strptime('{}_{}_{}{}'.format(day,week,year,-0), '%d_%W_%Y%w')
>>> t.strftime('%W')
'30'
>>> t.strftime('%m')
'07'
>>>
A simpler solution can be created using the pendulum library. As in your code, loop through month numbers, create dates, compare the weeks for these dates against the desired date. If found halt the loop; if the date is not seen then exit the loop with, say, a -1.
>>> import pendulum
>>> for month in range(1,13):
...     date = pendulum.create(2018, month, 28)
...     if date.week_of_year == 30:
...         break
... else:
...     month = -1
...
>>> month
7
>>> date
<Pendulum [2018-07-28T00:00:00+00:00]>
Here is a brute force method that loops through the days of the year (It expects the day as Monday being 0 and Sunday being 6, it also returns the Month 0 indexed, January being 0 and December being 11):
import datetime
def month(day, week, year):
    #Generate list of No of days of the month
    months = [31,28,31,30,31,30,31,31,30,31,30,31]
    if((year % 4 == 0 and year % 100 != 0) or year % 400 == 0): months[1] += 1
    #The day that the chosen year started on
    currentDay = datetime.datetime(year, 1, 1).weekday()
    #ISO wk1 of the yr is the first wk with a thursday, otherwise it's wk53 of the previous yr
    currentWeek = 1 if currentDay < 4 else 0
    #Loop over every day of the year
    for i in range(sum(months)):
        #If the week is correct and day is correct you're done
        if day == currentDay and week == currentWeek:
            return months.index(next(filter(lambda x: x!=0, months)))
        #Otherwise, go to next day of wk/next wk of yr
        currentDay = (currentDay + 1) % 7
        if currentDay == 0:
            currentWeek += 1
        #And decrement counter for current month
        months[months.index(next(filter(lambda x: x!=0, months)))]-=1
print(month(2, 30, 2018)) # 6 i.e. July
months.index(next(filter(lambda x: x!=0, months))) is used to get the first month whose days have not all been used up yet, i.e. the month you're currently in.
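If you are on Python 3.8 or newer, the standard library can do this lookup directly with date.fromisocalendar(). The sketch below assumes the week and day in your data follow the ISO convention (weeks numbered from 1, weekday 1 = Monday to 7 = Sunday), which may not match your data; month_from_iso is a made-up name.

```python
from datetime import date

def month_from_iso(year, week, weekday):
    """Return the calendar month (1-12) of an ISO year/week/weekday.

    weekday follows ISO numbering: 1 = Monday ... 7 = Sunday.
    Raises ValueError if the week or weekday is out of range.
    """
    return date.fromisocalendar(year, week, weekday).month

print(month_from_iso(2018, 30, 3))  # 7 (Wednesday 25 July 2018)
```

Note that the result can fall in a different calendar year than the ISO year, for example the last ISO week of 2020 reaches into January 2021.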

Python Date game, Future/Past determination

Issue:
My program keeps telling me, no matter what, that my date is invalid.
Assignment:
The user will input a year, a month number (1-12), and a day number in that order. The program will
determine if the date is in the future, or in the past. (If the date entered is today’s date, assume the date
is in the past). A future date is a date that has not happened yet. If today is July 31st, August 1 of the
same year is not in the past, just because the day (1) comes before today’s day (31).
For the input, if the user enters an invalid month, display an appropriate error message (like “Invalid
Month”) and end the program. If the user enters an invalid day, display an appropriate error message
(like “Invalid Day”) and end the program. Assume 28 days in February. In other words, if the month is
February and the day entered is 29, display the error message and end the program.
Remember:
Thirty days has September,
April, June, and November
All the rest have 31
Except February, which has 28….
Define a function called inTheFuture() that accepts a given year number, a month number, and a
day number as 3 separate arguments. The function should return a Boolean value (True or False) to
indicate whether the date (year, month, and day) parameters are in the future or not. A True return
occurs if the date is in the future; False if the date is in the past. It should not draw any images or text to
the screen. It also should not ask the user for input. It just determines if a given date is in the future or
not.
Find an image to represent the future, and an image to represent the past. Examples could include
something like “The Jetson’s” for the future, and an old wagon for the past.
If the date is in the future, display your future image in the middle of a canvas. If the date is in the past,
display your past image in the middle of the canvas. At the top of the canvas, display “In the future” or
“In the past”, whichever matches the image.
To find the current date, you may add this import and function to your code:
import datetime
def getTodaysDate():
return datetime.datetime.today()
If you call this function somewhere in your code:
today = getTodaysDate()
Then you can use the year, month, and day member variables to obtain the current year, month, and
day. For example:
print(today.month)
would output the current month.
Here is my program that I thought was finished. What am I missing?
This works for a date input like "24.12.2016". Change it to your needs in the line of strptime().
import datetime
from time import strptime
def date_in_the_future(date):
    datetime_string = strptime(date, "%d.%m.%Y")
    d = datetime.datetime(datetime_string[0],datetime_string[1],datetime_string[2])
    now = datetime.datetime.now()
    delta = d - now
    diff = delta.days + 1
    if diff > 0:
        return True
    else:
        return False
You should learn how to use if,else, and elif. Check the code below:
import datetime
def getTodaysDate():
    return datetime.datetime.today()
today = getTodaysDate()
print(today)
#def inTheFuture():
year = int(input("Enter Year: "))
month = int(input("Enter Month: "))
day = int(input("Enter Day: "))
print("You entered:", day, "/", month, "/", year)
if month < 1 or month > 12:
    print("How many months in a year? Not as many as you think I suppose..")
    raise SystemExit
elif month in [1,3,5,7,8,10,12]:
    if day > 31:
        print("What is a month where you are from?")
        raise SystemExit
    else:
        print("Correct, Your Day is:", day, "/", month, "/", year)
elif month == 2:
    if day > 28:
        print("February only has so many days!")
        raise SystemExit
    else:
        print("Correct, Your Day is:", day, "/", month, "/", year)
elif month in [4,6,9,11]:
    if day > 30:
        print("That day is not possible!")
        raise SystemExit
    else:
        print("Correct, Your Day is:", day, "/", month, "/", year)
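For the inTheFuture() function that the assignment asks for, here is one possible sketch. It relies on Python comparing tuples element by element, so (year, month, day) tuples order the same way as the dates they describe. DAYS_IN_MONTH, validate_date and the optional today parameter are additions for this example (the parameter only exists to make the function easy to test), and February is fixed at 28 days as the assignment states.

```python
import datetime

# Days per month, assuming 28 days in February as the assignment requires.
DAYS_IN_MONTH = {1: 31, 2: 28, 3: 31, 4: 30, 5: 31, 6: 30,
                 7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31}

def validate_date(month, day):
    """Return an error message, or None if the month and day are valid."""
    if month not in DAYS_IN_MONTH:
        return "Invalid Month"
    if not 1 <= day <= DAYS_IN_MONTH[month]:
        return "Invalid Day"
    return None

def inTheFuture(year, month, day, today=None):
    """Return True if the given date is after today, False otherwise."""
    if today is None:
        today = datetime.date.today()
    # Today's own date compares equal, so it counts as the past.
    return (year, month, day) > (today.year, today.month, today.day)
```

The calling code would run validate_date() first, print the message and exit if it returns one, and only then call inTheFuture() to pick the image and caption.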

How to list next 24 months' start dates with python?

Please tell me how I can list next 24 months' start dates with python,
such as:
01May2014
01June2014
.
.
.
01Aug2015
and so on
I tried:
import datetime
this_month_start = datetime.datetime.now().replace(day=1)
for i in range(24):
    print((this_month_start + i*datetime.timedelta(40)).replace(day=1))
But it skips some months.
Just increment the month value; I used datetime.date() types here as that's more than enough:
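import datetime
current = datetime.date.today().replace(day=1)
for i in range(24):
    # Wrap December (12) around to January (1)
    new_month = current.month % 12 + 1
    # current.month // 12 is 1 only when the current month is December
    new_year = current.year + current.month // 12
    current = current.replace(month=new_month, year=new_year)
    print(current)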
The new month calculation picks the next month based on the last calculated month, and the year is incremented every time the previous month reached December.
By manipulating a current object, you simplify the calculations; you can do it with i as an offset as well, but the calculation gets a little more complicated.
It'll work with datetime.datetime() too.
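For reference, the offset-based variant mentioned above could look roughly like this. It is a sketch, not part of the original answer: month_starts is a made-up name, and it counts months from year 0 so that divmod splits the total back into a year and a zero-based month.

```python
from datetime import date

def month_starts(start, count):
    """Return the first day of each of the `count` months after start's month."""
    result = []
    for i in range(1, count + 1):
        # Total months since year 0, with January as month index 0.
        total_months = start.year * 12 + (start.month - 1) + i
        year, month_index = divmod(total_months, 12)
        result.append(date(year, month_index + 1, 1))
    return result

for d in month_starts(date.today(), 24):
    print(d.strftime('%d%b%Y'))
```

Since every value is computed from the start date and the offset i, no running state needs to be carried between iterations.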
To simplify the arithmetic, try/except could be used:
from datetime import date
current = date.today().replace(day=1)
for _ in range(24):
    try:
        current = current.replace(month=current.month + 1)
    except ValueError: # new year
        current = current.replace(month=1, year=current.year + 1)
    print(current.strftime('%d%b%Y'))
