How to group dates based on day? - python

To simplify my problem, I have dates like these:
2019-10-05 # Day 1 => starting point
2019-10-07 # Day 3
2019-10-07 # Day 3
2019-10-09 # Day 5
2019-10-10 # Day 6
2019-10-10 # Day 6
The result should be: {1: ['2019-10-05'], 3: ['2019-10-07', '2019-10-07'], and so on...}
I suspect there is a module in Python (probably collections?) that can solve this, but I don't know what terminology applies to this problem other than grouping dates by day.

With some conversion between strings and dates (strptime does the heavy lifting), you could do this:
from datetime import datetime
from collections import defaultdict
# the date format (for strptime)
fmt = "%Y-%m-%d"
startdate = datetime.strptime('2019-10-05', fmt).date()
# the dates as strings
date_str_list = [
"2019-10-07", # Day 3
"2019-10-07", # Day 3
"2019-10-09", # Day 5
"2019-10-10", # Day 6
"2019-10-10" # Day 6
]
# converting to date objects
date_list = [datetime.strptime(date_str, fmt).date() for date_str in date_str_list]
res = defaultdict(list)
for date in date_list:
    res[(date - startdate).days + 1].append(str(date))
print(res)
# defaultdict(<class 'list'>, {3: ['2019-10-07', '2019-10-07'],
# 5: ['2019-10-09'], 6: ['2019-10-10', '2019-10-10']})
Here I use defaultdict as the container.
The difference of two date objects is a timedelta object; its .days attribute gives the number of days between them.
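Note that the list in the answer above does not contain the start date itself, so the printed result has no key 1. The following is a minimal sketch of the same idea wrapped in a function; `group_by_day_number` is a hypothetical helper name, and it assumes the inputs are ISO-formatted strings (YYYY-MM-DD) and that the earliest date counts as day 1 when no start date is given.

```python
from collections import defaultdict
from datetime import date


def group_by_day_number(date_strings, start=None):
    """Group ISO date strings by their 1-based day offset from `start`.

    Hypothetical helper: if `start` is omitted, the earliest date in the
    input is treated as day 1.
    """
    dates = [date.fromisoformat(s) for s in date_strings]
    if start is None:
        start = min(dates)
    groups = defaultdict(list)
    for d in dates:
        # Subtracting two dates yields a timedelta; .days is the offset.
        groups[(d - start).days + 1].append(d.isoformat())
    return dict(groups)


dates = ["2019-10-05", "2019-10-07", "2019-10-07",
         "2019-10-09", "2019-10-10", "2019-10-10"]
result = group_by_day_number(dates)
print(result)
```

Returning a plain dict rather than the defaultdict keeps the printed form close to the format asked for in the question.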

Related

Counting the number of days in a range by month in Python?

I am trying to count the number of days in a range of dates by month. So let's say a range of dates occurs between 2 months since the beginning and ending dates are in 2 different months. I want the output to show that x amount of days in the range fall in one month and x amount of days fall in the next month.
So far my code only outputs each day in the range from 10 days after Veterans Day (my start date) to 20 days after Veterans Day (my end date):
import datetime
Veterans = datetime.datetime(2019, 11, 12)
print(Veterans)
number_of_days = 10
ten_days = datetime.timedelta(days=10)
vetplus10 = Veterans + ten_days
date_list = [(vetplus10 + datetime.timedelta(days=day)).isoformat() for day in range(number_of_days)]
print(date_list)
['2019-11-22T00:00:00', '2019-11-23T00:00:00', '2019-11-24T00:00:00',
'2019-11-25T00:00:00', '2019-11-26T00:00:00', '2019-11-27T00:00:00',
'2019-11-28T00:00:00', '2019-11-29T00:00:00', '2019-11-30T00:00:00',
'2019-12-01T00:00:00']
The idea here would be for Python to tally up all the days in November (9) and all the days in December (1).
Thank you in advance!
You can try using pandas to create a date range, convert it to monthly periods, and get the value counts.
import pandas as pd
pd.date_range(start='2019-11-22', periods=10, freq='D').to_period('M').value_counts()
2019-11 9
2019-12 1
I was able to get similar output without using an additional library:
import datetime
Veterans = datetime.datetime(2019, 11, 12)
print(Veterans)
number_of_days = 10
ten_days = datetime.timedelta(days=10)
vetplus10 = Veterans + ten_days
date_list = [(vetplus10 + datetime.timedelta(days=day)) for day in range(number_of_days)]
day_counts = {}
for day in date_list:
    day_counts[f"{day.year}-{day.month}"] = day_counts.get(f"{day.year}-{day.month}", 0) + 1
print(day_counts)
2019-11-12 00:00:00
{'2019-11': 9, '2019-12': 1}
Essentially, I iterate over the datetime objects in your original list and build a dictionary that counts the days for each year-month it encounters.
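The same tally can also be written with collections.Counter from the standard library. This is a small sketch, not the answerer's code; `days_per_month` is a hypothetical helper name. It builds the keys with strftime("%Y-%m"), which zero-pads the month (the f-string above would produce '2020-1' for January).

```python
from collections import Counter
from datetime import date, timedelta


def days_per_month(start, num_days):
    """Count how many of `num_days` consecutive days, beginning at `start`, fall in each month."""
    days = (start + timedelta(days=offset) for offset in range(num_days))
    # strftime("%Y-%m") zero-pads the month, so the keys sort chronologically.
    return Counter(d.strftime("%Y-%m") for d in days)


counts = days_per_month(date(2019, 11, 22), 10)
print(dict(counts))
```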

Reverse of datetime.weekday()?

d : Datetime object
Given a date d and a day of the week x in the range of 0–6, return the date of x within the same week as d.
I can think of some ways to do this, but they all seem rather inefficient. Is there a pythonic way?
Example
Input: datetime(2020,2,4,18,0,55,00000), 6
Output: date(2020,2,9)
Input: datetime(2020,2,4,18,0,55,00000), 0
Output: date(2020,2,3)
This approach gets the first day in the week and goes from there to find the date requested by the weekday integer:
import datetime as dt
def weekday_in_week(d, weekday=None):
    # Compare against None explicitly: 0 (Monday) is a valid weekday but is falsy.
    if weekday is None:
        return None
    week_start = d - dt.timedelta(days=d.weekday())
    return week_start + dt.timedelta(days=weekday)
Example usage:
In [27]: weekday_in_week(dt.date.today(),6)
Out[27]: datetime.date(2020, 2, 9)
Remember that the weekdays are numbered as follows: 0 is Monday, 6 is Sunday.
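As a check against the inputs from the question, here is a compact sketch of the same technique. `date_for_weekday` is a hypothetical name; the sketch assumes that a datetime input should be reduced to a plain date and that an out-of-range weekday should raise an error.

```python
import datetime as dt


def date_for_weekday(d, weekday):
    """Return the date in d's Monday-to-Sunday week that falls on `weekday` (0=Monday, 6=Sunday)."""
    if not 0 <= weekday <= 6:
        raise ValueError("weekday must be in the range 0-6")
    # datetime is a subclass of date, so reduce it to a plain date first.
    day = d.date() if isinstance(d, dt.datetime) else d
    # Shift by the difference between the requested and the current weekday.
    return day + dt.timedelta(days=weekday - day.weekday())


stamp = dt.datetime(2020, 2, 4, 18, 0, 55)  # a Tuesday
print(date_for_weekday(stamp, 0))  # 2020-02-03
print(date_for_weekday(stamp, 6))  # 2020-02-09
```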

Pandas date range returns "could not convert string to Timestamp" for yyyy-ww

I have a dataframe with two columns; Sales and Date.
dataset.head(10)
Date Sales
0 2015-01-02 34988.0
1 2015-01-03 32809.0
2 2015-01-05 9802.0
3 2015-01-06 15124.0
4 2015-01-07 13553.0
5 2015-01-08 14574.0
6 2015-01-09 20836.0
7 2015-01-10 28825.0
8 2015-01-12 6938.0
9 2015-01-13 11790.0
I want to convert the Date column from yyyy-mm-dd (e.g. 2015-06-01) to yyyy-ww (e.g. 2015-23), so I run the following piece of code:
dataset["Date"] = pd.to_datetime(dataset["Date"]).dt.strftime('%Y-%V')
Then I group my Sales by week, i.e.
data = dataset.groupby(['Date'])["Sales"].sum().reset_index()
data.head(10)
Date Sales
0 2015-01 67797.0
1 2015-02 102714.0
2 2015-03 107011.0
3 2015-04 121480.0
4 2015-05 148098.0
5 2015-06 132152.0
6 2015-07 133914.0
7 2015-08 136160.0
8 2015-09 185471.0
9 2015-10 190793.0
Now I want to create a date range based on the Date column, since I'm predicting sales based on weeks:
ds = data.Date.values
ds_pred = pd.date_range(start=ds.min(), periods=len(ds) + num_pred_weeks,
                        freq="W")
However, I'm getting the following error: could not convert string to Timestamp, which I'm not sure how to fix. If I use 2015-01-01 as the starting date instead, I get no error, which makes me think I'm using the functions incorrectly, but I'm not sure how.
I would like to basically have a date range that spans weekly from the current week and then 52 weeks into the future.
I think the problem is that you take the minimum of the Date column, which is filled with strings in the format YYYY-VV, whereas date_range needs a string in the format YYYY-MM-DD or a datetime object.
I found this:
Several additional directives not required by the C89 standard are included for convenience. These parameters all correspond to ISO 8601 date values. These may not be available on all platforms when used with the strftime() method. The ISO 8601 year and ISO 8601 week directives are not interchangeable with the year and week number directives above. Calling strptime() with incomplete or ambiguous ISO 8601 directives will raise a ValueError.
%V ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4.
pandas 0.24.2 fails to parse the YYYY-VV format:
dataset = pd.DataFrame({'Date':['2015-06-01','2015-06-02']})
dataset["Date"] = pd.to_datetime(dataset["Date"]).dt.strftime('%Y-%V')
print (dataset)
Date
0 2015-23
1 2015-23
ds = pd.to_datetime(dataset['Date'], format='%Y-%V')
print (ds)
ValueError: 'V' is a bad directive in format '%Y-%V'
A possible solution is to use %U or %W instead:
%U Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0.
%W Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0.
dataset = pd.DataFrame({'Date':['2015-06-01','2015-06-02']})
dataset["Date"] = pd.to_datetime(dataset["Date"]).dt.strftime('%Y-%U')
print (dataset)
Date
0 2015-22
1 2015-22
ds = pd.to_datetime(dataset['Date'] + '-1', format='%Y-%U-%w')
print (ds)
0 2015-06-01
1 2015-06-01
Name: Date, dtype: datetime64[ns]
Or use the datetimes from the original DataFrame:
dataset = pd.DataFrame({'Date':['2015-06-01','2015-06-02'],
                        'Sales':[10,20]})
dataset["Date"] = pd.to_datetime(dataset["Date"])
print (dataset)
Date Sales
0 2015-06-01 10
1 2015-06-02 20
data = dataset.groupby(dataset['Date'].dt.strftime('%Y-%V'))["Sales"].sum().reset_index()
print (data)
Date Sales
0 2015-23 30
num_pred_weeks = 5
ds = data.Date.values
ds_pred = pd.date_range(start=dataset["Date"].min(), periods=len(ds) + num_pred_weeks, freq="W")
print (ds_pred)
DatetimeIndex(['2015-06-07', '2015-06-14', '2015-06-21', '2015-06-28',
               '2015-07-05', '2015-07-12'],
              dtype='datetime64[ns]', freq='W-SUN')
If ds contains dates as strings formatted like '2015-01', which corresponds to '%Y-%W' (or '%G-%V' in the datetime library), you have to add a day number to obtain a date. Here, assuming that you want the Monday, you should do:
ds_pred = pd.date_range(start=pd.to_datetime(ds.min() + '-1', format='%Y-%W-%w'),
                        periods=len(ds) + num_pred_weeks, freq="W")
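For labels that really are ISO weeks, the standard library can parse them directly, provided the ISO year directive is used together with a weekday. The sketch below assumes Python 3.6 or later (where strptime accepts %G, %V and %u) and uses two hypothetical helpers, `iso_week_to_monday` and `weekly_range`. One caveat worth keeping in mind: strftime('%Y-%V') pairs the calendar year with the ISO week, so a date such as 2014-12-29 is labeled '2014-01'; formatting with '%G-%V' avoids that mismatch.

```python
from datetime import datetime, timedelta


def iso_week_to_monday(label):
    """Parse an ISO 'year-week' label such as '2015-23' into the Monday of that week.

    Assumes the label was produced with ISO directives ('%G-%V').
    """
    # Appending '-1' selects ISO weekday 1 (Monday) for the %u directive.
    return datetime.strptime(label + "-1", "%G-%V-%u").date()


def weekly_range(first_label, periods):
    """Return `periods` consecutive Mondays starting at the given ISO week."""
    start = iso_week_to_monday(first_label)
    return [start + timedelta(weeks=i) for i in range(periods)]


print(weekly_range("2015-23", 3))
```

The resulting dates can then be passed as the start value of a date range instead of the week label string.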

Print last week dates only corresponding to weekdays

I have the code below:
from datetime import date
from datetime import timedelta
today = date.today()
for i in range(0, 7):
    print(today - timedelta(days=i))
2018-10-31
2018-10-30
2018-10-29
2018-10-28
2018-10-27
2018-10-26
2018-10-25
What I want is to print only weekdays, excluding weekends. So my desired result would be:
2018-10-31
2018-10-30
2018-10-29
2018-10-26
2018-10-25
2018-10-24
2018-10-23
Where can I modify my code to achieve the desired result?
Use datetime.date.weekday(), which:
Return the day of the week as an integer, where Monday is 0 and Sunday is 6.
from datetime import date
from datetime import timedelta
today = date.today()
for i in range(7):
    d = today - timedelta(days=i)
    if d.weekday() < 5: # Here
        print(d)
Produces:
2018-10-31
2018-10-30
2018-10-29
2018-10-26
2018-10-25
This gives you the weekdays that fall in the last 7 days. Or, if you want the previous 7 weekdays, consider:
from datetime import date
from datetime import timedelta
today = date.today()
num_weekdays = 0
for i in range(11):
    d = today - timedelta(days=i)
    if d.weekday() < 5:
        print(d)
        num_weekdays += 1
    if num_weekdays >= 7:
        break
This version is basically the same, with the range stop changed from 7 to 11 and an added num_weekdays counter. A window of 11 days is needed because it can contain up to four weekend days; a window of 10 days yields only 6 weekdays when today is a Sunday. We increment the counter when we print a date, and once we hit 7, we break out of the loop (otherwise we may print more than 7 dates, depending on the day of the week of today).
Or, another way:
from datetime import date
from datetime import timedelta
today = date.today()
prev_days = [today - timedelta(days=i) for i in range(11)] # Get the 11 most recent days, today included
prev_days = [d for d in prev_days if d.weekday() < 5] # Filter out the weekends
for d in prev_days[:7]: # Select the first 7
    print(d)
Similar idea: we create a list of the 11 most recent dates called prev_days (11 rather than 10, because 10 consecutive days can contain as few as 6 weekdays). We then filter out the weekend dates. Finally, the for loop only iterates over the first 7 elements of the filtered list, so that we print at most 7 dates.
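A fixed look-back window has to be sized by hand. A loop that keeps stepping back until enough weekdays have been collected avoids choosing one. This is a minimal sketch with a hypothetical helper name, `previous_weekdays`; it assumes that today itself should be included when it is a weekday.

```python
from datetime import date, timedelta


def previous_weekdays(today, count):
    """Return the `count` most recent weekdays (Monday-Friday), newest first."""
    days = []
    current = today
    while len(days) < count:
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            days.append(current)
        current -= timedelta(days=1)
    return days


for d in previous_weekdays(date(2018, 10, 31), 7):
    print(d)
```

With the date from the question (2018-10-31, a Wednesday), this yields the seven dates listed as the desired result.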

Python get number of the week by month

Could anyone help me, please? How can I get the number of weeks per month in Python?
from datetime import datetime, date, timedelta
Input:
date1 = "2015-07-09"
date2 = "2016-08-20"
Output:
2015-07 : 4
2015-08 : 5
2015-08 : 4
....
2016-08 : 5
How can I count the number of weeks per month from date1 to date2?
If you wanted to measure the number of full weeks between two dates, you could accomplish this with datetime.strptime and timedelta like so:
from datetime import datetime, date, timedelta
dateformat = "%Y-%m-%d"
date1 = datetime.strptime("2015-07-09", dateformat)
date2 = datetime.strptime("2016-08-20", dateformat)
weeks = int((date2 - date1).days / 7)
print(weeks)
This outputs 58. Dividing the day count by 7 gives the number of weeks, and int keeps only the integer portion, so only whole weeks are counted. If you want partial weeks as well, drop the int call; in Python 3 the / operator already performs true division.
Try this:
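import datetime
date1 = "2015-07-09"
date2 = "2016-08-20"
d1 = datetime.datetime.strptime(date1, '%Y-%m-%d').date()
d2 = datetime.datetime.strptime(date2, '%Y-%m-%d').date()
diff = d2 - d1
# divmod splits the day count into whole weeks and leftover days
weeks, days = divmod(diff.days, 7)
print(weeks, days) # 58 2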
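Neither answer produces a per-month breakdown, and the question does not define how a week that straddles two months should be counted. The sketch below shows one possible interpretation only: for each month, it counts the distinct ISO weeks that contain at least one day of that month within the range. `weeks_per_month` is a hypothetical helper, and under this definition the numbers need not match the sample output in the question.

```python
from collections import defaultdict
from datetime import date, timedelta


def weeks_per_month(start, end):
    """Count, for each month, the distinct ISO weeks touched by days in [start, end]."""
    weeks = defaultdict(set)
    current = start
    while current <= end:
        iso_year, iso_week, _ = current.isocalendar()
        # A week that straddles two months is counted once in each month.
        weeks[current.strftime("%Y-%m")].add((iso_year, iso_week))
        current += timedelta(days=1)
    return {month: len(seen) for month, seen in weeks.items()}


per_month = weeks_per_month(date(2015, 7, 9), date(2015, 9, 30))
for month, n in per_month.items():
    print(f"{month} : {n}")
```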
