Python Date issue when starting a new month - python

I have an issue with a python request. In this request I need to set two dates, today and yesterday. This has functioned without issue throughout my testing until today.
The issue here being that of course we have just started a new month.
I am currently using the following date code; however, as I have now realized, it does not take the month rollover into account.
yesterday = (str(datetime.datetime.today().month) + "/" +
             str(datetime.datetime.today().day - 1) + "/" +
             str(datetime.datetime.today().year))
today = (str(datetime.datetime.today().month) + "/" +
         str(datetime.datetime.today().day) + "/" +
         str(datetime.datetime.today().year))
As long as the computed day is not 0 (that is, on any day other than the first of the month), the application works like a charm.

Use datetime.timedelta
Ex:
import datetime
today = datetime.datetime.now()
print(today.strftime("%m/%d/%Y"))
yesterday = (today - datetime.timedelta(days=1)).strftime("%m/%d/%Y")
print(yesterday)
Output:
10/01/2018
09/30/2018
Use strftime to get the date as a string in the format you need.

You are making things too complicated: there is no need to worry about "wrap arounds" yourself. In your code you subtract 1 from the day number, but on the first of the month (for example October 1st) that gives "October 0th" (sic).
It is better to perform the arithmetic on the date object:
yesterday_date = datetime.date.today() - datetime.timedelta(days=1)
and then convert it to a string with:
yesterday = yesterday_date.strftime('%m/%d/%Y')
At the moment of writing, this generates:
>>> yesterday_date.strftime('%m/%d/%Y')
'09/30/2018'
Performing arithmetic while building the output string gives one piece of code two responsibilities at once, which is typically bad software design: each piece should have a single responsibility.
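As a hedged illustration of the timedelta approach from the answers above, the following sketch wraps the date arithmetic in a small helper. The function name today_and_yesterday and its ref parameter are assumptions made for this example (they are not in the original code); passing a fixed date makes the month rollover easy to check.

```python
import datetime


def today_and_yesterday(ref=None):
    """Return (today, yesterday) as MM/DD/YYYY strings.

    `ref` is a hypothetical parameter added for this sketch so the
    result can be checked against a fixed date; it defaults to today.
    """
    if ref is None:
        ref = datetime.date.today()
    # Do the arithmetic on the date object, then format once at the end.
    previous = ref - datetime.timedelta(days=1)
    fmt = "%m/%d/%Y"
    return ref.strftime(fmt), previous.strftime(fmt)


# First day of a month: yesterday falls in the previous month.
print(today_and_yesterday(datetime.date(2018, 10, 1)))
# First day of a year: the year rolls back as well.
print(today_and_yesterday(datetime.date(2019, 1, 1)))
```

Calling today_and_yesterday() with no argument uses the current date, which is what the request in the question needs.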

Related

Generate randomly formatted date strings for machine learning

For an NLP project in Python I need to generate random dates for model training purposes. In particular, the date format must be random and coherent with a set of language locales. The formats include those with only numbers and those with (partially) written-out day and month names, and various common punctuation.
My best solution so far is the following algorithm:
1) Generate a datetime() object with random values (nice solution here).
2) Randomly select a locale, i.e. pick one of ['en_US','fr_FR','it_IT','de_DE']. In this case the list is well known and short, so this is not a problem.
3) Randomly select a format string for strftime(), i.e. ['%Y-%m-%d','%d %B %Y',...]. In my case the list should reflect the date formats likely to occur in the documents that will be exposed to the NLP model in the future.
4) Generate a string with strftime().
Especially for 3), I do not know a better approach than hardcoding the list of formats I saw manually in the training documents. I have not yet found a function that would turn OCR'd dates into a format string, so that I could extend the list when yet-unseen date formats come by.
Do you have any suggestions on how to come up with better randomly formatted dates, or how to improve this approach?
Use random.randrange() and datetime.timedelta() to generate a random date between two dates
Call datetime.date(year, month, day) to return a datetime.date object representing the given year, month, and day. Call this twice to define the start and end dates. Subtract the start date from the end date to get the time between the two dates as a datetime.timedelta. Read its days attribute to get the number of days. Call random.randrange(days) to get a random integer less than that number of days. Call datetime.timedelta(days=n) to get a datetime.timedelta representing n days, and add this result to the start date.
import datetime
import random
start_date = datetime.date(2020, 1, 1)
end_date = datetime.date(2020, 2, 1)
time_between_dates = end_date - start_date
days_between_dates = time_between_dates.days
random_number_of_days = random.randrange(days_between_dates)
random_date = start_date + datetime.timedelta(days=random_number_of_days)
print(random_date)
Here is my solution. Note that all of the locales must be available on your computer to avoid an error.
import random
from datetime import datetime, timedelta
import locale
LOCALE = ['en_US', 'fr_FR', 'it_IT', 'de_DE']  # all must be available on your computer to avoid an error
DATE_FORMAT = ['%Y-%m-%d', '%d %B %Y']

def gen_datetime(min_year=1900, max_year=datetime.now().year):
    # generate a random datetime between min_year and max_year, then format it
    start = datetime(min_year, 1, 1)
    years = max_year - min_year + 1
    end = start + timedelta(days=365 * years)
    format_date = random.choice(DATE_FORMAT)
    locale_date = random.choice(LOCALE)
    locale.setlocale(locale.LC_ALL, locale_date)  # raises locale.Error if the locale is not available
    return (start + (end - start) * random.random()).strftime(format_date)

date = gen_datetime()
print(date)
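The steps from the question can also be sketched without switching the process-wide locale, which sidesteps the "locale not installed" error at the cost of English-only month names. This is only a minimal sketch under assumptions: the names random_formatted_date, DATE_FORMATS and rng are invented for the example, and the format list is illustrative rather than taken from any real training corpus.

```python
import random
from datetime import date, timedelta

# Illustrative format list; in practice it would mirror the formats seen
# in the training documents.
DATE_FORMATS = ["%Y-%m-%d", "%d %B %Y", "%d/%m/%Y", "%b %d, %Y"]


def random_formatted_date(rng, start=date(1900, 1, 1), end=date(2020, 12, 31)):
    """Return (text, fmt): a random date in [start, end] and the format used."""
    span_days = (end - start).days
    # randrange(n) excludes n, so add 1 to make `end` reachable.
    day = start + timedelta(days=rng.randrange(span_days + 1))
    fmt = rng.choice(DATE_FORMATS)
    return day.strftime(fmt), fmt


rng = random.Random(0)  # seeded only so that runs are repeatable
for _ in range(3):
    text, fmt = random_formatted_date(rng)
    print(fmt, "->", text)
```

Returning the format string together with the text is a design choice for this sketch: it gives each generated sample a label that could be useful when training or evaluating the model.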

Convert 18-digit LDAP/FILETIME timestamps to human readable date

I have exported a list of AD Users out of AD and need to validate their login times.
The output from the PowerShell script gives lastlogin as LDAP/FILETIME.
EXAMPLE 130305048577611542
I am having trouble converting this to a readable time in pandas.
I'm using the following code:
df['date of login'] = pd.to_datetime(df['FileTime'], unit='ns')
The column FileTime contains time formatted like the EXAMPLE above.
I'm getting the following output in my new column date of login:
EXAMPLE 1974-02-17 03:50:48.577611542
I know this is being parsed incorrectly, because when I enter this timestamp into an online converter I get this output:
EXAMPLE:
Epoch/Unix time: 1386031258
GMT: Tuesday, December 3, 2013 12:40:58 AM
Your time zone: Monday, December 2, 2013 4:40:58 PM GMT-08:00
Does anyone have an idea of what is occurring here? Why are all my dates in the 1970s?
I know this answer is very late to the party, but for anyone else looking in the future.
The 18-digit Active Directory (LDAP) timestamps are also known as 'Windows NT time format', 'Win32 FILETIME or SYSTEMTIME' or NTFS file time. They are used in Microsoft Active Directory for pwdLastSet, accountExpires, LastLogon, LastLogonTimestamp and LastPwdSet. The timestamp is the number of 100-nanosecond intervals (1 nanosecond = one billionth of a second) since Jan 1, 1601 UTC.
Therefore, 130305048577611542 does indeed relate to December 3, 2013.
When this value is passed to pd.to_datetime with unit='ns', it is read as a count of nanoseconds since 1.1.1970 rather than as 100-nanosecond intervals since 1601. The timestamp therefore amounts to roughly 130305048 seconds after the Unix epoch, which does result in a 1974 date!
In order to get the correct Unix timestamp you need to do:
(130305048577611542 / 10000000) - 11644473600
Here's a solution I did in Python that worked well for me:
import datetime
import numpy as np

def ad_timestamp(timestamp):
    # 0 stands for a missing value and is mapped to NaN
    if timestamp != 0:
        return datetime.datetime(1601, 1, 1) + datetime.timedelta(seconds=timestamp/10000000)
    return np.nan
So then if you need to convert a Pandas column:
df.lastLogonTimestamp = df.lastLogonTimestamp.fillna(0).apply(ad_timestamp)
Note: I needed to use fillna before using apply. Also, since I filled with 0's, I checked for that in the conversion function above (if timestamp != 0). Hope that makes sense. It's extra stuff, but you may need it to convert the column in question.
I had been stuck on this for a couple of days, but now I am ready to share a working solution in an easier-to-use form:
import datetime
timestamp = 132375402928051110
value = datetime.datetime(1601, 1, 1) + datetime.timedelta(seconds=timestamp/10000000)
print(value.strftime('%Y-%m-%d %H:%M:%S'))
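The arithmetic from the answers above can be checked against the example value from the question. The sketch below uses only the standard library; the function and constant names are assumptions chosen for this example. It uses integer division, so the Unix timestamp is truncated to whole seconds instead of being rounded as the online converter does.

```python
import datetime

# Seconds between 1601-01-01 and 1970-01-01 (the constant used above).
EPOCH_DIFF_SECONDS = 11644473600
FILETIME_EPOCH = datetime.datetime(1601, 1, 1)


def filetime_to_unix(filetime):
    """Convert 100-nanosecond ticks since 1601 to whole Unix seconds."""
    return filetime // 10_000_000 - EPOCH_DIFF_SECONDS


def filetime_to_datetime(filetime):
    """Convert 100-nanosecond ticks since 1601 to a naive UTC datetime."""
    # Integer microseconds avoid the rounding of float division.
    return FILETIME_EPOCH + datetime.timedelta(microseconds=filetime // 10)


example = 130305048577611542
print(filetime_to_unix(example))      # 1386031257
print(filetime_to_datetime(example))  # 2013-12-03 00:40:57.761154
```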

What is an efficient way to trim a date in Python?

Currently I am trying to trim the current date into day, month and year with the following code.
#Code from my local machine
from datetime import datetime
from datetime import timedelta
five_days_ago = datetime.now()-timedelta(days=5)
# result: 2017-07-14 19:52:15.847476
get_date = str(five_days_ago).rpartition(' ')[0]
#result: 2017-07-14
#Extract the day
day = get_date.rpartition('-')[2]
# result: 14
#Extract the year
year = get_date.rpartition('-')[0]
# result: 2017-07
I am not a Python professional, as I only picked up this language a couple of months ago, but I want to understand a few things here:
Why did I receive 2017-07 if str.rpartition() is supposed to split a string once you have declared some sort of separator (-, /, " ")? I was expecting to receive 2017...
Is there an efficient way to separate day, month and year? I do not want to repeat the same mistakes with my insecure code.
I tried my code in the following tech. setups:
local machine with Python 3.5.2 (x64), Python 3.6.1 (x64) and repl.it with Python 3.6.1
Try the code online, copy and paste the line codes
Try the following:
from datetime import date, timedelta
five_days_ago = date.today() - timedelta(days=5)
day = five_days_ago.day
year = five_days_ago.year
If what you want is a date (not a date and time), use date instead of datetime. Then, the day and year are simply properties on the date object.
As to your question regarding rpartition, it works by splitting on the rightmost separator (in your case, the hyphen between the month and the day) - that's what the r in rpartition means. So get_date.rpartition('-') returns the tuple ('2017-07', '-', '14').
If you want to persist with your approach, your year code will work if you replace rpartition with partition, e.g.:
year = get_date.partition('-')[0]
# result: 2017
However, there's also a related (better) approach - use split:
parts = get_date.split('-')
year = parts[0]
month = parts[1]
day = parts[2]
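To make the difference between the approaches above concrete, here is a small sketch that runs them side by side on a fixed date. The variable names reference and text are assumptions introduced for the example; the fixed date is chosen so that five days earlier matches the 2017-07-14 shown in the question.

```python
from datetime import date, timedelta

reference = date(2017, 7, 19)  # fixed date so the output is reproducible
five_days_ago = reference - timedelta(days=5)

# Preferred: read the fields directly from the date object (integers).
print(five_days_ago.year, five_days_ago.month, five_days_ago.day)

# String-based variants, for comparison (results are strings).
text = five_days_ago.isoformat()    # '2017-07-14'
print(text.partition("-"))          # splits at the first hyphen
print(text.rpartition("-"))         # splits at the last hyphen
year, month, day = text.split("-")  # splits at every hyphen
print(year, month, day)
```

Note that the attribute-based version yields integers, while the string-based versions yield strings such as '07', which matters if the values are compared or used in arithmetic later.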

Determine if a day is a business day in Python / Pandas

I currently have a program set up to run two different ways. One way is to run over a specified time frame, and the other way is to run every day. However, when I have it set to run every day, I only want it to continue if it's a business day. From research I've seen that you can iterate through business days using pandas like so:
start = '2016-08-05'
end = datetime.date.today().strftime("%Y-%m-%d")
for day in pd.bdate_range(start, end):
    print(str(day) + " is a business Day")
And this works great when I run my program over the specified period.
But when I want to have the program run every day, I can't quite figure out how to test one specific day for being a business day. Basically I want to do something like this:
start = datetime.date.today().strftime("%Y-%m-%d")
end = datetime.date.today().strftime("%Y-%m-%d")
if start == end:
    if not Bdate(start):
        print("Not a Business day")
I know I could probably set up pd.bdate_range() to do what I'm looking for, but in my opinion that would be sloppy and not intuitive. I feel like there must be a simpler solution to this. Any advice?
Since len of pd.bdate_range() tells us how many business days are in the supplied range of dates, we can cast this to a bool to determine if a range of a single day is a business day:
def is_business_day(date):
    return bool(len(pd.bdate_range(date, date)))
I just found a different solution to this. This might be interesting if you want to find the next business day if your date is not a business day.
from pandas.tseries.offsets import BDay
bdays = BDay()
def is_business_day(date):
    return date == date + 0*bdays
Adding 0*bdays rolls forward to the next business day, including the current one. Unfortunately, subtracting 0*bdays does not roll backwards (at least with the pandas version I was using).
Moreover, due to this behavior, you also need to be careful, since it is possible that
date + 0*bdays + 1*bdays != date + 1*bdays
There is a built-in method to do this in pandas.
For Pandas version <1.0
from pandas.tseries.offsets import Day, BDay
from datetime import datetime
bday=BDay()
is_business_day = bday.onOffset(datetime(2020,8,20))
For Pandas version >=1.1.0 (onOffset is deprecated)
from pandas.tseries.offsets import Day, BDay
from datetime import datetime
bday=BDay()
is_business_day = bday.is_on_offset(datetime(2020,8,20))
Using numpy version 1.7.0 or later, try np.is_busday():
import datetime
import numpy as np
start = datetime.date.today().strftime("%Y-%m-%d")
end = datetime.date.today().strftime("%Y-%m-%d")
if start == end:
    # added code here
    if not np.is_busday(start):
        print("Not a Business day")
For me, I use an old trick from Excel:
from pandas.tseries.offsets import Day, BDay
def is_bday(x):
    return x == x + Day(1) - BDay(1)
Have a look at the bdateutil module.
The code below uses it:
from bdateutil import isbday
from datetime import datetime,date
now = datetime.now()
val = isbday(date(now.year, now.month, now.day))
print(val)
Please let me know if this helps.
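For comparison with the pandas, numpy and bdateutil answers above, a dependency-free check is also possible with the standard library alone. This is a different, simpler technique than the ones in the answers: it assumes a Monday-to-Friday working week and knows nothing about public holidays. The function names is_weekday and next_weekday are assumptions made for this sketch.

```python
from datetime import date, timedelta


def is_weekday(day):
    """True for Monday-Friday. Public holidays are NOT considered."""
    return day.weekday() < 5  # Monday is 0, Sunday is 6


def next_weekday(day):
    """Return `day` itself if it is a weekday, else the following Monday."""
    while not is_weekday(day):
        day += timedelta(days=1)
    return day


print(is_weekday(date(2020, 8, 20)))    # a Thursday
print(is_weekday(date(2020, 8, 22)))    # a Saturday
print(next_weekday(date(2020, 8, 22)))  # rolls forward to Monday
```

For the daily run described in the question, the check would be is_weekday(date.today()); if holidays matter, one of the library-based answers is the better fit.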

Formatting a string in Python is giving me a file not found error

I am new to Python and trying to use pandas to work with several .csv files that have predictable names, Log_(yyyy/mm/dd).
What I'm planning is simple enough, but opening the file is giving me problems.
today = date.today()
m,d,y = today.month, today.day, today.year
file_name = 'Log_{}-{}-{}'.format(y,m,d)
pd.read_csv(file_name)
This gives me an error, but this works:
file_name = 'Log_2015-01-10'
pd.read_csv(file_name)
They print the same thing, and str(file_name) doesn't fix the issue.
You have two problems: your tuple assignment swaps the day and year values, and you need to zero-pad values below 10. You are actually producing the string 'Log_10-1-2015', not 'Log_2015-01-10'.
Formatting a date into a string is easiest done by leaving formatting to the date object rather than extracting the individual components yourself:
today = date.today()
file_name = 'Log_{:%Y-%m-%d}'.format(today)
The % fields are strftime() formatting instructions that use zero-padding by default.
Demo:
>>> from datetime import date
>>> today = date.today()
>>> 'Log_{:%Y-%m-%d}'.format(today)
'Log_2015-01-11'
Yes, it is already the 11th in my timezone. :-)
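The effect of zero-padding can be seen by formatting one fixed date in several ways. This is a small sketch; the variable names log_day, unpadded, padded and manual are assumptions introduced for the example, and the date is the one from the question's working file name.

```python
from datetime import date

log_day = date(2015, 1, 10)  # fixed date so the result is reproducible

# Plain {} placeholders format the integers without padding.
unpadded = "Log_{}-{}-{}".format(log_day.year, log_day.month, log_day.day)
# A strftime spec inside the placeholder zero-pads month and day.
padded = "Log_{:%Y-%m-%d}".format(log_day)
# Explicit width specifiers give the same result from the integers.
manual = "Log_{}-{:02d}-{:02d}".format(log_day.year, log_day.month, log_day.day)

print(unpadded)
print(padded)
print(manual)
```

Only the padded forms match a file named Log_2015-01-10, which is why reading the unpadded name fails with a file-not-found error.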
