What is an efficient way to trim a date in Python? - python

Currently I am trying to trim the current date into day, month and year with the following code.
#Code from my local machine
from datetime import datetime
from datetime import timedelta
five_days_ago = datetime.now()-timedelta(days=5)
# result: 2017-07-14 19:52:15.847476
get_date = str(five_days_ago).rpartition(' ')[0]
#result: 2017-07-14
#Extract the day
day = get_date.rpartition('-')[2]
# result: 14
#Extract the year
year = get_date.rpartition('-')[0]
# result: 2017-07
I am not a Python professional, since I only started learning the language a couple of months ago, but I want to understand a few things here:
Why did I receive 2017-07 if str.rpartition() is supposed to split a string at whatever separator you give it (-, /, " ")? I was expecting to receive 2017...
Is there an efficient way to separate day, month and year? I do not want to repeat the same mistakes with my insecure code.
I tried my code in the following tech. setups:
local machine with Python 3.5.2 (x64), Python 3.6.1 (x64) and repl.it with Python 3.6.1
You can try the code online by copying and pasting the lines above.

Try the following:
from datetime import date, timedelta
five_days_ago = date.today() - timedelta(days=5)
day = five_days_ago.day
year = five_days_ago.year
If what you want is a date (not a date and time), use date instead of datetime. Then, the day and year are simply properties on the date object.
As to your question regarding rpartition, it works by splitting on the rightmost separator (in your case, the hyphen between the month and the day) - that's what the r in rpartition means. So get_date.rpartition('-') returns the tuple ('2017-07', '-', '14').
If you want to persist with your approach, your year code will work if you replace rpartition with partition, which splits on the leftmost separator instead, e.g.:
year = get_date.partition('-')[0]
# result: 2017
However, there's also a related (better) approach - use split:
parts = get_date.split('-')
year = parts[0]
month = parts[1]
day = parts[2]
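To make the difference between the three string methods concrete, here is a small sketch using the date string from the question. The variable names are only illustrative; the date object at the end shows the attribute-based approach recommended above, including the month.

```python
from datetime import date

stamp = "2017-07-14"

# partition splits at the FIRST separator, rpartition at the LAST one
first_split = stamp.partition("-")   # ('2017', '-', '07-14')
last_split = stamp.rpartition("-")   # ('2017-07', '-', '14')

# split breaks the string at every separator
year_s, month_s, day_s = stamp.split("-")

# A date object avoids string handling altogether and yields integers
d = date(2017, 7, 14)
parts = (d.year, d.month, d.day)     # (2017, 7, 14)
```

Note that the string approaches give you strings ('07'), while the date attributes give you integers (7).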

Related

.Between function not working as expected

I am encountering some issues when using the .between method in Python.
I have a simple dataset consisting of ~59000 records.
The dates are in DD/MM/YYYY format, and I would like to filter for the days in April 2014.
psi_df = pd.read_csv('thecsvfile.csv')
psi_west_df = psi_df[['24-hr_psi','west']]
april_records = psi_west_df[psi_west_df['24-hr_psi'].between('1/4/2014','31/4/2014')]
april_records.head(100)
In the output, the dates suddenly jump from 3/4/2014 (3rd April) to 10/4/2014 (10th April). This pattern recurs for every month and every year up to 2020 (the final year of this dataset), which is not what I intended: I only wanted the data for April 2014.
As I am still rather new to Python, I decided to perform some fixes in Excel instead. I separated the date and the time into two columns and reran the code with the column names updated.
psi_df = pd.read_csv('psi_new.csv')
psi_west_df = psi_df[['date','west']]
april_records = psi_west_df[psi_west_df['date'].between('1/4/2014','31/4/2014')]
april_records.head(100)
I still faced the same issue and now, I am totally stumped as to why this is occurring. Am I using the .between method wrongly? Seeking everyone's kind guidance and directions as to why this is occurring. Much appreciated and many thanks everyone.
The csv file that I am using can be obtained from this website:
https://data.gov.sg/dataset/historical-24-hr-psi
The first problem is that your date column isn't a date column but an object (string) column.
Make sure your column really is a date column by using the pandas to_datetime function.
psi_west_df['date'] = pd.to_datetime(psi_west_df['date'], format='%d/%m/%Y')
Once the column really is a date column, give the between function two date objects rather than two strings, like this:
start_day = pd.to_datetime('1/4/2014', format='%d/%m/%Y')
end_day = pd.to_datetime('30/4/2014', format='%d/%m/%Y')
april_records = psi_west_df[psi_west_df['date'].between(start_day, end_day)]
So all together:
import pandas as pd
psi_df = pd.read_csv('psi_new.csv')
# .copy() avoids a SettingWithCopyWarning when the date column is reassigned below
psi_west_df = psi_df[['date','west']].copy()
psi_west_df['date'] = pd.to_datetime(psi_west_df['date'], format='%d/%m/%Y')
start_day = pd.to_datetime('1/4/2014', format='%d/%m/%Y')
end_day = pd.to_datetime('30/4/2014', format='%d/%m/%Y')
april_records = psi_west_df[psi_west_df['date'].between(start_day, end_day)]
april_records.head(100)
Note - this code should work on the data after you change it in Excel, meaning you have separate columns for date and time.
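As for why the string version jumps from 3/4/2014 to 10/4/2014: strings are compared character by character, not as dates. The following is a minimal pure-Python sketch of that effect, using a handful of made-up sample values rather than the real dataset.

```python
from datetime import datetime

# Made-up sample values in the same DD/MM/YYYY style as the dataset
records = ["1/4/2014", "3/4/2014", "4/4/2014", "9/4/2014", "10/4/2014", "1/5/2014"]

# Same bounds as in the question, compared as plain strings
lo, hi = "1/4/2014", "31/4/2014"
kept_as_strings = [r for r in records if lo <= r <= hi]
# '4/4/2014' and '9/4/2014' are dropped because '4' and '9' sort after '3',
# while '1/5/2014' (a May date) slips through.

def parse(text):
    """Parse a DD/MM/YYYY string into a datetime."""
    return datetime.strptime(text, "%d/%m/%Y")

# Comparing real dates gives the intended result (April has 30 days)
start, end = parse("1/4/2014"), parse("30/4/2014")
kept_as_dates = [r for r in records if start <= parse(r) <= end]
```

Here kept_as_strings is ['1/4/2014', '3/4/2014', '10/4/2014', '1/5/2014'], whereas kept_as_dates contains every April date and nothing else.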

Python Date issue when starting a new month

I have an issue with a python request. In this request I need to set two dates, today and yesterday. This has functioned without issue throughout my testing until today.
The issue here being that of course we have just started a new month.
I am currently using the following date code; however, as I have now realized, it does not take the monthly reset into consideration.
yesterday = (str(datetime.datetime.today().month) + "/" +
             str(datetime.datetime.today().day-1) + "/" +
             str(datetime.datetime.today().year))
today = (str(datetime.datetime.today().month) + "/" +
         str(datetime.datetime.today().day) + "/" +
         str(datetime.datetime.today().year))
As long as the computed day is not 0, the application works like a charm.
Use datetime.timedelta
Ex:
import datetime
today = datetime.datetime.now()
print(today.strftime("%m/%d/%Y"))
yesterday = (today - datetime.timedelta(days=1)).strftime("%m/%d/%Y")
print(yesterday)
Output:
10/01/2018
09/30/2018
Use strftime to get the date as a string in your required format.
You are making things too complicated; there is no need to worry about "wrap arounds" yourself. In your code you subtract 1 from the day number, but on the first of the month (for example October 1st), subtracting one gives "October 0th" (sic.).
It is better to perform the arithmetic on the date object:
yesterday_date = datetime.date.today() - datetime.timedelta(days=1)
and then convert it to a string with:
yesterday = yesterday_date.strftime('%m/%d/%Y')
At the moment of writing, this generates:
>>> yesterday_date.strftime('%m/%d/%Y')
'09/30/2018'
Performing arithmetic while formatting gives one piece of code "two responsibilities at once", which is typically bad software design: the idea is one responsibility per step.
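Putting the two steps together, a minimal sketch might look like this. The helper name today_and_yesterday and its optional today parameter are my own, added only so the month and year boundaries can be checked with fixed dates.

```python
import datetime

def today_and_yesterday(today=None, fmt="%m/%d/%Y"):
    """Return (today, yesterday) as strings formatted with fmt."""
    if today is None:
        today = datetime.date.today()
    # Date arithmetic handles month and year boundaries for us
    yesterday = today - datetime.timedelta(days=1)
    return today.strftime(fmt), yesterday.strftime(fmt)

# First day of a month
first_of_month = today_and_yesterday(datetime.date(2018, 10, 1))
# First day of a year
new_year = today_and_yesterday(datetime.date(2019, 1, 1))
```

With these inputs, first_of_month is ('10/01/2018', '09/30/2018') and new_year is ('01/01/2019', '12/31/2018').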

Convert 18-digit LDAP/FILETIME timestamps to human readable date

I have exported a list of AD Users out of AD and need to validate their login times.
The output from the PowerShell script gives lastlogin as an LDAP/FILETIME value.
EXAMPLE 130305048577611542
I am having trouble converting this to a readable time in pandas.
I'm using the following code:
df['date of login'] = pd.to_datetime(df['FileTime'], unit='ns')
The column FileTime contains time formatted like the EXAMPLE above.
I'm getting the following output in my new column date of login
EXAMPLE 1974-02-17 03:50:48.577611542
I know this is being parsed incorrectly, because when I enter this timestamp into an online converter I get this output
EXAMPLE:
Epoch/Unix time: 1386031258
GMT: Tuesday, December 3, 2013 12:40:58 AM
Your time zone: Monday, December 2, 2013 4:40:58 PM GMT-08:00
Does anyone have an idea of what is occurring here? Why are all my dates in the 1970s?
I know this answer is very late to the party, but it may help anyone else looking in the future.
The 18-digit Active Directory (LDAP) timestamps are also known as 'Windows NT time format', 'Win32 FILETIME or SYSTEMTIME' or NTFS file time. They are used in Microsoft Active Directory for pwdLastSet, accountExpires, LastLogon, LastLogonTimestamp and LastPwdSet. The timestamp is the number of 100-nanosecond intervals (1 nanosecond = one billionth of a second) since Jan 1, 1601 UTC.
Therefore, 130305048577611542 does indeed relate to December 3, 2013.
When this value is passed to pd.to_datetime with unit='ns', it is read as a count of nanoseconds since 1.1.1970 rather than as 100-nanosecond intervals since 1601. 130305048577611542 nanoseconds is roughly 130305048 seconds (a little over four years), which does result in a 1974 date!
In order to get the correct Unix timestamp you need to do:
(130305048577611542 / 10000000) - 11644473600
Here's a solution I did in Python that worked well for me:
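import datetime
import numpy as np

def ad_timestamp(timestamp):
    # 0 is the placeholder for missing values (see the fillna call below)
    if timestamp != 0:
        # The value counts 100-nanosecond intervals since 1601-01-01 UTC
        return datetime.datetime(1601, 1, 1) + datetime.timedelta(seconds=timestamp/10000000)
    return np.nan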
So then if you need to convert a Pandas column:
df.lastLogonTimestamp = df.lastLogonTimestamp.fillna(0).apply(ad_timestamp)
Note: I needed to use fillna before using apply. Also, since I filled with 0's, I checked for that in the conversion function above (if timestamp != 0). Hope that makes sense. It's extra stuff, but you may need it to convert the column in question.
I was stuck on this for a couple of days, but now I am ready to share a solution that really works, in an easier-to-use form:
import datetime
timestamp = 132375402928051110
value = datetime.datetime(1601, 1, 1) + datetime.timedelta(seconds=timestamp/10000000)
print(value.strftime('%Y-%m-%d %H:%M:%S'))
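The answers above divide by 10000000 as a float, which can lose the last few digits of an 18-digit value. Below is a sketch of the same conversion using integer arithmetic only. The function and constant names are my own, and the result is a naive datetime that should be read as UTC.

```python
import datetime

# FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC
FILETIME_EPOCH = datetime.datetime(1601, 1, 1)
# Seconds between 1601-01-01 and the Unix epoch 1970-01-01
EPOCH_DIFF_SECONDS = 11644473600

def filetime_to_datetime(ticks):
    """Convert 100-ns ticks since 1601 to a naive (UTC) datetime."""
    # 10 ticks = 1 microsecond; integer division drops the sub-microsecond part
    return FILETIME_EPOCH + datetime.timedelta(microseconds=ticks // 10)

def filetime_to_unix(ticks):
    """Convert 100-ns ticks since 1601 to whole Unix seconds (rounded down)."""
    return ticks // 10**7 - EPOCH_DIFF_SECONDS

login = filetime_to_datetime(130305048577611542)
unix_seconds = filetime_to_unix(130305048577611542)
```

For the example value from the question, login is 2013-12-03 00:40:57.761154 and unix_seconds is 1386031257, one second below the converter's 1386031258 because this sketch rounds down instead of to the nearest second.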

Formatting string into datetime using Pandas - trouble with directives

I have a string that is the full year followed by the ISO week of the year (so some years have 53 weeks, because the week counting starts at the first full week of the year). I want to convert it to a datetime object using pandas.to_datetime(). So I do:
pandas.to_datetime('201145', format='%Y%W')
and it returns:
Timestamp('2011-01-01 00:00:00')
which is not right. Or if I try:
pandas.to_datetime('201145', format='%Y%V')
it tells me that %V is a bad directive.
What am I doing wrong?
I think that the following question would be useful to you: Reversing date.isocalender()
Using the functions provided in that question this is how I would proceed:
import datetime
import pandas as pd

def iso_year_start(iso_year):
    "The gregorian calendar date of the first day of the given ISO year"
    fourth_jan = datetime.date(iso_year, 1, 4)
    delta = datetime.timedelta(fourth_jan.isoweekday()-1)
    return fourth_jan - delta

def iso_to_gregorian(iso_year, iso_week, iso_day):
    "Gregorian calendar date for the given ISO year, week and day"
    year_start = iso_year_start(iso_year)
    return year_start + datetime.timedelta(days=iso_day-1, weeks=iso_week-1)

def time_stamp(yourString):
    "Split a 'YYYYWW' string into (year, week, day), using Monday as the day"
    year = int(yourString[0:4])
    week = int(yourString[-2:])
    day = 1
    return year, week, day

# Unpack (year, week, day) straight into iso_to_gregorian
yourTimeStamp = iso_to_gregorian(*time_stamp('201145'))
print(yourTimeStamp)
Then run that function for your values and append them as date time objects to the dataframe.
The result I got from your specified string was:
2011-11-07
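On Python 3.8 or newer, the standard library can do the same conversion directly with date.fromisocalendar. The sketch below assumes that version and, like the code above, picks Monday as the day of the week; the helper name year_week_to_date is my own.

```python
import datetime

def year_week_to_date(text, iso_day=1):
    """Convert a 'YYYYWW' ISO year/week string to a date (Monday by default)."""
    iso_year = int(text[:4])
    iso_week = int(text[4:])
    # date.fromisocalendar requires Python 3.8+
    return datetime.date.fromisocalendar(iso_year, iso_week, iso_day)

monday = year_week_to_date('201145')
```

For '201145' this also gives 2011-11-07. The result can then be passed to pandas.to_datetime if a Timestamp is needed.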

Django/Python - Check a date is in current week

I would like to do something like this:
entries = Entry.objects.filter(created_at__in = current_week())
How can I do this with good performance? Thanks!
Edit: I still have no idea how to write the current_week() function.
Use __range. You'll need to actually calculate the beginning and end of the week first:
import datetime
date = datetime.date.today()
start_week = date - datetime.timedelta(date.weekday())
end_week = start_week + datetime.timedelta(7)
entries = Entry.objects.filter(created_at__range=[start_week, end_week])
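One thing to be aware of: __range is inclusive at both ends, so an end value of start_week + 7 days also matches the following Monday. A possible variation, sketched here with a hypothetical week_bounds helper, is to compute a half-open interval and filter with the gte and lt lookups instead. The Django call is shown only as a comment because it needs the Entry model from the question.

```python
import datetime

def week_bounds(day):
    """Return (Monday of day's week, Monday of the following week)."""
    start = day - datetime.timedelta(days=day.weekday())
    return start, start + datetime.timedelta(days=7)

start_week, next_week = week_bounds(datetime.date(2018, 10, 3))

# Assumed usage with the Entry model from the question:
# entries = Entry.objects.filter(created_at__gte=start_week,
#                                created_at__lt=next_week)
```

For Wednesday 2018-10-03 this yields 2018-10-01 and 2018-10-08, so the filter covers Monday through Sunday only.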
Since Django 1.11, you can use the week field lookup:
Entry.objects.filter(created_at__week=current_week)
It will give you the week from Monday to Sunday, according to ISO-8601.
To query for the current week:
from datetime import date
current_week = date.today().isocalendar()[1]
isocalendar() will return a tuple with 3 items: (ISO year, ISO week number, ISO weekday).
Yep, this question is from 2 years ago. Today, with more experience, I recommend using arrow, which makes handling dates and times less painful.
Check out: https://github.com/crsmithdev/arrow
