Pandas dataframe to_datetime() is converting date incorrectly - python

I have a date in this format - '17-JUL-53'
When I call pd.to_datetime('17-JUL-53') it returns Timestamp('2053-07-17 00:00:00').
You could argue that this is correct, but the date I need is 1953-07-17. Excel gets this right; how do I do the same with to_datetime()?
[edit] Just to show what happens when we convert from str to time in python:
>>> time.strptime('17-JUL-53', '%d-%b-%y')
time.struct_time(tm_year=2053, tm_mon=7, tm_mday=17, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=198, tm_isdst=-1)

I think you need to insert the century, 19, in front of the two-digit year.
More info about formatting of datetime is here.
import pandas as pd
s = '17-JUL-53'
d = s[:7] + '19' + s[7:]
print(d)
#17-JUL-1953
dt = pd.to_datetime(d, format='%d-%b-%Y')
print(dt)
#1953-07-17 00:00:00
%d-%b-%Y means:
%d - Day of the month as a zero-padded decimal number
%b - Month as locale’s abbreviated name
%Y - Year with century as a decimal number

I would do it this way, provided all your dates are in the 1900s :)
import pandas as pd
from dateutil.relativedelta import relativedelta
raw = '17-jul-53'
output = pd.to_datetime(raw)
output_clean = output - relativedelta(years=100)

You need to state which century the year belongs to. The to_datetime function in pandas cannot do this for you, so you need to fix the string upstream. Here is an approach with a regex:
import re
import pandas as pd
date = '17-JUL-53'
pd.to_datetime(re.sub(r'(\d{2}-\w{3}-)(\d{2})', r'\g<1>19\2', date))
#Timestamp('1953-07-17 00:00:00')
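The answers above all hard-code the 1900s. A slightly more general sketch, using only the standard library, parses with %y and then moves any year that lands after a cutoff back by one century. The function name parse_dd_mon_yy and the latest_year parameter are assumptions made for illustration; pick a cutoff that suits your data.

```python
from datetime import datetime


def parse_dd_mon_yy(text, latest_year=2025):
    """Parse 'DD-MON-YY' and treat years after latest_year as 19xx.

    latest_year is a hypothetical cutoff, not something pandas or
    datetime provides; adjust it to the newest year your data can hold.
    """
    parsed = datetime.strptime(text, "%d-%b-%y")
    if parsed.year > latest_year:
        # strptime mapped the two-digit year into the 2000s; shift it back.
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed


print(parse_dd_mon_yy("17-JUL-53"))
print(parse_dd_mon_yy("17-JUL-15"))
```

For a DataFrame column, something like df['date'].apply(parse_dd_mon_yy) should work, assuming every value follows the same DD-MON-YY layout.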

Related

Convert DataFrame column from string to datetime for format "January 1, 2001 Monday"

I am trying to convert a dataframe column "date" from string to datetime. I have this format: "January 1, 2001 Monday".
I tried to use the following:
from dateutil import parser
for index, v in df['date'].items():
    df['date'][index] = parser.parse(df['date'][index])
But it gives me the following error:
ValueError: Cannot set non-string value '2001-01-01 00:00:00' into a StringArray.
I checked the datatype of the column "date" and it tells me string type.
This is the snippet of the dataframe:
Any help would be most appreciated!
Why not use pandas instead of dateutil? pandas offers a much simpler tool, the pd.to_datetime function:
df['date'] = pd.to_datetime(df['date'], format='%B %d, %Y %A')
You need to specify the format so that the strings are parsed correctly. The documentation describes each directive:
%A is for Weekday as locale’s full name, e.g., Monday
%B is for Month as locale’s full name, e.g., January
%d is for Day of the month as a zero-padded decimal number.
%Y is for Year with century as a decimal number, e.g., 2021.
Combining all of them we have the following function:
from datetime import datetime
def mdy_to_ymd(d):
    return datetime.strptime(d, '%B %d, %Y %A').strftime('%Y-%m-%d')
print(mdy_to_ymd('January 1, 2021 Monday'))
> 2021-01-01
One more thing: in your case, .apply() should be faster than the explicit loop. Pass the function itself rather than a lambda that returns it:
df['date'] = df['date'].apply(mdy_to_ymd)
Feel free to add Hour-Minute-Second if needed.
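One caveat, as far as I can tell from CPython's behaviour: datetime.strptime() accepts the %A weekday name but builds the result from the year, month and day only, so a weekday that contradicts the date goes unnoticed (the 'January 1, 2021 Monday' example above parses even though that day was a Friday). The sketch below is one way to detect such rows; parse_and_check is a hypothetical helper name, and it assumes the weekday is always the last word of the string.

```python
from datetime import datetime

FORMAT = "%B %d, %Y %A"


def parse_and_check(text):
    """Return (date, weekday_matches) for strings like 'January 1, 2001 Monday'."""
    parsed = datetime.strptime(text, FORMAT)
    stated_weekday = text.rsplit(" ", 1)[-1]   # assumed: weekday is the last word
    actual_weekday = parsed.strftime("%A")     # weekday computed from the date itself
    return parsed.date(), stated_weekday == actual_weekday


print(parse_and_check("January 1, 2001 Monday"))
print(parse_and_check("January 1, 2021 Monday"))
```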

How to convert string in to date in Python [duplicate]

I have a string:
dat="012915"
I want to convert it to a date:
01-29-2015
I tried:
import datetime
from datetime import datetime
dat="012915"
dat = datetime.strptime(dat, '%m%d%Y').date()
dat
but failed:
ValueError: time data '012915' does not match format '%m%d%Y'
%Y is for full year
%y is the short version for year
Your code is fine; just change the year directive to lowercase %y.
import datetime
from datetime import datetime
dat="012915"
dat = datetime.strptime(dat, '%m%d%y').date()
dat
I think you are looking for
import datetime
from datetime import datetime
dat="012915"
#lower %y
dat = datetime.strptime(dat, '%m%d%y').date()
print(dat)
this will give you
2015-01-29
Trying this out; changing the format string to '%m%d%y' seems to work. Looking at the python docs:
%y Year without century as a zero-padded decimal number.
%Y Year with century as a decimal number.
So the first one is what you need. Source: https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes
import datetime
from datetime import datetime
dat="012915"
dat = datetime.strptime(dat, '%m%d%y').date()
print(dat)
Change %Y to %y. If you want to use %Y, change dat to '01292015'.
%y is formatted as 15 while %Y is formatted as 2015.
from datetime import datetime
date_str = '012915'
date_obj = datetime.strptime(date_str, '%m%d%y')  # no separators, to match '012915'
print("The type of the date is now", type(date_obj))
print("The date is", date_obj)
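The question asked for the output 01-29-2015. Assuming that means a string in month-day-year order, a minimal sketch is to parse with %y and then format the result with strftime(). The variable names here are illustrative.

```python
from datetime import datetime

raw = "012915"

# Parse: %y reads the two-digit year, so "15" becomes 2015.
parsed = datetime.strptime(raw, "%m%d%y").date()

# Format: %Y writes the four-digit year back out.
formatted = parsed.strftime("%m-%d-%Y")

print(parsed)
print(formatted)
```

Keeping the parsed date object around and formatting only for display means the same value can be printed in other layouts later.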

Retrieve date and time from a datetime string

In my project I have a string like this one:
"2018-03-07 06:46:02.737951"
I would like to get two variables: one in date format that contains the date, and the other for the time.
I tried:
from datetime import datetime
datetime_object = datetime.strptime('2018-03-07 06:46:02.737951', '%b %d %Y %I:%M%p')
but I get an error.
Then I tried:
from dateutil import parser
dt = parser.parse("2018-03-07 06:46:02.737951")
but I don't know what I can do with these results.
How can I extract the values for my vars "date_var" and "time_var"?
You need to match your string exactly. Reference: strftime-and-strptime-behavior
from datetime import datetime
dt = datetime.strptime('2018-03-07 06:46:02.737951', '%Y-%m-%d %H:%M:%S.%f')
print(dt.date())
print(dt.time())
d = dt.date() # extract date
t = dt.time() # extract time
print(type(d)) # printout the types
print(type(t))
Output:
2018-03-07
06:46:02.737951
<class 'datetime.date'>
<class 'datetime.time'>
Your format string, '%b %d %Y %I:%M%p', breaks down as:
%b - Month as locale’s abbreviated name.
%d - Day of the month as a zero-padded decimal number.
%Y - Year with century as a decimal number.
%I - Hour (12-hour clock) as a zero-padded decimal number.
%M - Minute as a zero-padded decimal number.
%p - Locale’s equivalent of either AM or PM.
with some spaces and a colon in between, which does not match the layout of your string.
# Accessing the time as an object:
the_time = dt.time()
#the_time
datetime.time(23, 55)
# Accessing the time as a string:
the_time.strftime("%H:%M:%S")
'23:55:00'
Similar for Date
Refer here
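Because the string in the question is already in ISO 8601 style, datetime.fromisoformat() (available since Python 3.7) is an alternative that needs no format string. This is only a sketch; the names date_var and time_var are taken from the question.

```python
from datetime import datetime

# fromisoformat() accepts 'YYYY-MM-DD HH:MM:SS.ffffff' directly.
dt = datetime.fromisoformat("2018-03-07 06:46:02.737951")

date_var = dt.date()   # datetime.date object
time_var = dt.time()   # datetime.time object, microseconds preserved

print(date_var)
print(time_var)
```

For input in any other layout, strptime() with an explicit format, as shown above, remains the more general tool.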

How to convert Julian date to standard date?

I have a string holding a Julian date, such as "16152", meaning the 152nd day of 2016, or "15234", meaning the 234th day of 2015.
How can I convert these Julian dates to a format like 20/05/2016 using the Python 3 standard library?
I can get the year 2016 like this: date = '20' + julian[0:2], where julian is the string containing the Julian date, but how can I calculate the rest, counting from the 1st of January?
The .strptime() method supports the day of year format:
>>> import datetime
>>>
>>> datetime.datetime.strptime('16234', '%y%j').date()
datetime.date(2016, 8, 21)
And then you can use strftime() to reformat the date
>>> date = datetime.date(2016, 8, 21)
>>> date.strftime('%d/%m/%Y')
'21/08/2016'
Well, first, create a datetime object (from the module datetime)
from datetime import datetime
from datetime import timedelta
julian = ... # Your julian datetime
date = datetime.strptime("1/1/" + julian[:2], "%m/%d/%y")
# Just initializing the start date, which will be January 1st in the year of the Julian date (2 first chars)
Now add the days from the start date:
daysToAdd = int(julian[2:]) # Taking the days and converting to int
date += timedelta(days = daysToAdd - 1)
Now, you can just print it as is:
print(str(date))
Or you can use strftime() function.
print(date.strftime("%d/%m/%y"))
Read more about strftime format string here
Easy way
Convert from regular date to Julian date
import datetime
print(datetime.datetime.now().strftime("%y%j"))
Convert from Julian date to regular date
print(datetime.datetime.strptime('19155', '%y%j').strftime("%d-%m-%Y"))
I used this for changing a Julian date to xml xsd:datetime
import datetime
def julianDate2ISO8601(d, offset='+00:00'):
    """
    return ISO8601 formatted datetime from julian date
    optional offset [+|-]hh:mm
    """
    d = str(d) # make sure it is a string
    # replace leading number with correct century
    centuryArray = ['19','20','21']
    d = centuryArray[int(d[:1])] + d[1:]
    # format string to iso 8601 datetime
    return datetime.datetime.strptime(d, '%Y%j').date().strftime(
        '%Y-%m-%dT00:00:00') + offset
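Pulling the %y%j approach from the answers above into one reusable function might look like the sketch below. The name yyddd_to_text and the default output format are assumptions for illustration; the two sample inputs are the ones from the question.

```python
from datetime import datetime


def yyddd_to_text(julian, out_format="%d/%m/%Y"):
    """Convert a 'YYDDD' string (two-digit year + day of year) to text.

    strptime raises ValueError for malformed input, which is left
    to the caller to handle.
    """
    parsed = datetime.strptime(julian, "%y%j").date()
    return parsed.strftime(out_format)


print(yyddd_to_text("16152"))
print(yyddd_to_text("15234"))
```

Note that %y follows the usual two-digit-year convention (69 to 99 map to the 1900s, 00 to 68 to the 2000s), so dates outside that window need the century handled separately.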

How to print a date in a regular format?

This is my code:
import datetime
today = datetime.date.today()
print(today)
This prints: 2008-11-22 which is exactly what I want.
But, I have a list I'm appending this to and then suddenly everything goes "wonky". Here is the code:
import datetime
mylist = []
today = datetime.date.today()
mylist.append(today)
print(mylist)
This prints the following:
[datetime.date(2008, 11, 22)]
How can I get just a simple date like 2008-11-22?
The WHY: dates are objects
In Python, dates are objects. Therefore, when you manipulate them, you manipulate objects, not strings or timestamps.
Any object in Python has TWO string representations:
The regular representation, used by print, can be obtained with the str() function. It is usually the most common human-readable format and is meant to ease display. So str(datetime.datetime(2008, 11, 22, 19, 53, 42)) gives you '2008-11-22 19:53:42'.
The alternative representation, which shows the nature of the object (as data). It can be obtained with the repr() function and is handy for knowing what kind of data you are manipulating while you are developing or debugging. repr(datetime.datetime(2008, 11, 22, 19, 53, 42)) gives you 'datetime.datetime(2008, 11, 22, 19, 53, 42)'.
What happened is that when you printed the date using print, it used str(), so you saw a nice date string. But when you printed mylist, you printed a list of objects, and Python represented each element using repr().
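The difference between the two representations can be checked directly. The sketch below uses a fixed date so the results are reproducible; the variable names are illustrative.

```python
import datetime

d = datetime.date(2008, 11, 22)
mylist = [d]

print(str(d))     # what print(d) shows
print(repr(d))    # what a container shows for each element
print(mylist)     # the list uses repr() on its elements

# Converting each element explicitly gives a list of plain strings.
as_text = [str(item) for item in mylist]
print(as_text)
```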
The How: what do you want to do with that?
Well, when you manipulate dates, keep using the date objects all along the way. They have many useful methods, and most of the Python API expects dates to be objects.
When you want to display them, just use str(). In Python, the good practice is to explicitly cast everything. So just when it's time to print, get a string representation of your date using str(date).
One last thing. When you tried to print the dates, you printed mylist. If you want to print a date, you must print the date objects, not their container (the list).
E.g., to print every date in a list:
for date in mylist:
    print(str(date))
Note that in that specific case, you can even omit str() because print will use it for you. But it should not become a habit :-)
Practical case, using your code
import datetime
mylist = []
today = datetime.date.today()
mylist.append(today)
print(mylist[0]) # print the date object, not the container ;-)
2008-11-22
# It's better to always use str() because :
print("This is a new day : ", mylist[0]) # will work
>>> This is a new day : 2008-11-22
print("This is a new day : " + mylist[0]) # will crash
>>> TypeError: can only concatenate str (not "datetime.date") to str
print("This is a new day : " + str(mylist[0]))
>>> This is a new day : 2008-11-22
Advanced date formatting
Dates have a default representation, but you may want to print them in a specific format. In that case, you can get a custom string representation using the strftime() method.
strftime() expects a string pattern explaining how you want to format your date.
E.g.:
print(today.strftime('We are the %d, %b %Y'))
>>> 'We are the 22, Nov 2008'
Each letter after a "%" stands for a format:
%d is the day number (2 digits, prefixed with leading zeros if necessary)
%m is the month number (2 digits, prefixed with leading zeros if necessary)
%b is the month abbreviation (3 letters)
%B is the month name in full (letters)
%y is the year number abbreviated (last 2 digits)
%Y is the year number full (4 digits)
etc.
Have a look at the official documentation or McCutchen's quick reference; you can't know them all.
Since PEP 3101, every object can have its own format, used automatically by the format method of any string. In the case of datetime, the format is the same one used in strftime. So you can do the same as above like this:
print("We are the {:%d, %b %Y}".format(today))
>>> 'We are the 22, Nov 2008'
The advantage of this form is that you can also convert other objects at the same time.
With the introduction of Formatted string literals (since Python 3.6, 2016-12-23) this can be written as
import datetime
f"{datetime.datetime.now():%Y-%m-%d}"
>>> '2017-06-15'
Localization
Dates can automatically adapt to the local language and culture if you use them the right way, but it's a bit complicated. Maybe for another question on SO(Stack Overflow) ;-)
import datetime
print(datetime.datetime.now().strftime("%Y-%m-%d %H:%M"))
Edit:
After Cees' suggestion, I have started using time as well:
import time
print(time.strftime("%Y-%m-%d %H:%M"))
The date, datetime, and time objects all support a strftime(format) method,
to create a string representing the time under the control of an explicit format
string.
Here is a list of the format codes with their directive and meaning.
%a Locale’s abbreviated weekday name.
%A Locale’s full weekday name.
%b Locale’s abbreviated month name.
%B Locale’s full month name.
%c Locale’s appropriate date and time representation.
%d Day of the month as a decimal number [01,31].
%f Microsecond as a decimal number [0,999999], zero-padded on the left
%H Hour (24-hour clock) as a decimal number [00,23].
%I Hour (12-hour clock) as a decimal number [01,12].
%j Day of the year as a decimal number [001,366].
%m Month as a decimal number [01,12].
%M Minute as a decimal number [00,59].
%p Locale’s equivalent of either AM or PM.
%S Second as a decimal number [00,61].
%U Week number of the year (Sunday as the first day of the week)
%w Weekday as a decimal number [0(Sunday),6].
%W Week number of the year (Monday as the first day of the week)
%x Locale’s appropriate date representation.
%X Locale’s appropriate time representation.
%y Year without century as a decimal number [00,99].
%Y Year with century as a decimal number.
%z UTC offset in the form +HHMM or -HHMM.
%Z Time zone name (empty string if the object is naive).
%% A literal '%' character.
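To see a few of these directives side by side, the sketch below formats one fixed moment with several codes. The moment and the choice of directives are arbitrary, picked only so the results are reproducible and do not depend on the locale.

```python
from datetime import datetime

# A fixed moment, so every run produces the same strings.
moment = datetime(2012, 10, 3, 15, 35, 46)

directives = ["%Y-%m-%d", "%H:%M:%S", "%j", "%W", "%w", "%y"]

# Map each directive to the text it produces for this moment.
formatted = {code: moment.strftime(code) for code in directives}

for code, text in formatted.items():
    print(code, "->", text)
```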
This is what we can do with the datetime and time modules in Python
import time
import datetime
print("Time in seconds since the epoch: %s" % time.time())
print("Current date and time: ", datetime.datetime.now())
print("Or like this: ", datetime.datetime.now().strftime("%y-%m-%d-%H-%M"))
print("Current year: ", datetime.date.today().strftime("%Y"))
print("Month of year: ", datetime.date.today().strftime("%B"))
print("Week number of the year: ", datetime.date.today().strftime("%W"))
print("Weekday of the week: ", datetime.date.today().strftime("%w"))
print("Day of year: ", datetime.date.today().strftime("%j"))
print("Day of the month : ", datetime.date.today().strftime("%d"))
print("Day of week: ", datetime.date.today().strftime("%A"))
That will print out something like this:
Time in seconds since the epoch: 1349271346.46
Current date and time: 2012-10-03 15:35:46.461491
Or like this: 12-10-03-15-35
Current year: 2012
Month of year: October
Week number of the year: 40
Weekday of the week: 3
Day of year: 277
Day of the month : 03
Day of week: Wednesday
Use date.strftime. The formatting arguments are described in the documentation.
This one is what you wanted:
some_date.strftime('%Y-%m-%d')
This one takes Locale into account. (do this)
some_date.strftime('%c')
This is shorter:
>>> import time
>>> time.strftime("%Y-%m-%d %H:%M")
'2013-11-19 09:38'
# convert date time to regular format.
d_date = datetime.datetime.now()
reg_format_date = d_date.strftime("%Y-%m-%d %I:%M:%S %p")
print(reg_format_date)
# some other date formats.
reg_format_date = d_date.strftime("%d %B %Y %I:%M:%S %p")
print(reg_format_date)
reg_format_date = d_date.strftime("%Y-%m-%d %H:%M:%S")
print(reg_format_date)
OUTPUT
2016-10-06 01:21:34 PM
06 October 2016 01:21:34 PM
2016-10-06 13:21:34
Or even
from datetime import datetime, date
"{:%d.%m.%Y}".format(datetime.now())
Out: '25.12.2013'
or
"{} - {:%d.%m.%Y}".format("Today", datetime.now())
Out: 'Today - 25.12.2013'
"{:%A}".format(date.today())
Out: 'Wednesday'
'{}__{:%Y.%m.%d__%H-%M}.log'.format(__name__, datetime.now())
Out: '__main____2014.06.09__16-56.log'
Simple answer -
datetime.date.today().isoformat()
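A small sketch of what isoformat() gives, using a fixed date rather than today() so the result is reproducible. The reverse call, date.fromisoformat(), is available from Python 3.7; the variable names are illustrative.

```python
import datetime

some_date = datetime.date(2008, 11, 22)

# isoformat() always yields YYYY-MM-DD, independent of locale.
text = some_date.isoformat()
print(text)

# The string can be turned back into a date object when needed.
round_trip = datetime.date.fromisoformat(text)
print(round_trip == some_date)
```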
With type-specific datetime string formatting (see nk9's answer using str.format().) in a Formatted string literal (since Python 3.6, 2016-12-23):
>>> import datetime
>>> f"{datetime.datetime.now():%Y-%m-%d}"
'2017-06-15'
The date/time format directives are not documented as part of the Format String Syntax but rather in date, datetime, and time's strftime() documentation. They are based on the 1989 C Standard, but include some ISO 8601 directives since Python 3.6.
I hate the idea of importing too many modules for convenience. I would rather work with the module already in use, which in this case is datetime, than bring in a second module, time.
>>> a = datetime.datetime(2015, 4, 1, 11, 23, 22)
>>> a.strftime('%Y-%m-%d %H:%M')
'2015-04-01 11:23'
You need to convert the datetime object to a str.
The following code worked for me:
import datetime
collection = []
dateTimeString = str(datetime.date.today())
collection.append(dateTimeString)
print(collection)
Let me know if you need any more help.
In Python you can format a datetime using the strftime() method from the date, time and datetime classes in the datetime module.
In your specific case, you are using the date class from datetime. You can use the following snippet to format the today variable into a string with the format yyyy-MM-dd:
import datetime
today = datetime.date.today()
print("formatted datetime: %s" % today.strftime("%Y-%m-%d"))
A more complete example:
import datetime
today = datetime.datetime.now()  # a datetime, so the time directives below are meaningful
# datetime in d/m/Y H:M:S format
date_time = today.strftime("%d/%m/%Y, %H:%M:%S")
print("datetime: %s" % date_time)
# datetime in Y-m-d H:M:S format
date_time = today.strftime("%Y-%m-%d, %H:%M:%S")
print("datetime: %s" % date_time)
# format date
date = today.strftime("%d/%m/%Y")
print("date: %s" % date)
# format time
time = today.strftime("%H:%M:%S")
print("time: %s" % time)
# day
day = today.strftime("%d")
print("day: %s" % day)
# month
month = today.strftime("%m")
print("month: %s" % month)
# year
year = today.strftime("%Y")
print("year: %s" % year)
More directives:
Sources:
Format DateTime in Python
strftime
You can do:
mylist.append(str(today))
Considering the fact you asked for something simple to do what you wanted, you could just:
import datetime
str(datetime.date.today())
For those wanting locale-based date and not including time, use:
>>> some_date.strftime('%x')
07/11/2019
Since print(today) prints what you want, the today object's __str__ method returns the string you are looking for.
So you can do mylist.append(today.__str__()) as well.
from datetime import date
def today_in_str_format():
    return str(date.today())
print(today_in_str_format())
This will print 2018-06-23 if that's what you want :)
You may want to append it as a string?
import datetime
mylist = []
today = str(datetime.date.today())
mylist.append(today)
print(mylist)
For pandas.Timestamps, strftime() can be used, e.g.:
import pandas as pd
utc_now = pd.Timestamp.now(tz="UTC")
For isoformat:
utc_now.isoformat()
For any format e.g.:
utc_now.strftime("%m/%d/%Y, %H:%M:%S")
You can use easy_date to make it easy:
import date_converter
my_date = date_converter.date_to_string(today, '%Y-%m-%d')
A quick disclaimer for my answer - I've only been learning Python for about 2 weeks, so I am by no means an expert; therefore, my explanation may not be the best and I may use incorrect terminology. Anyway, here it goes.
I noticed in your code that you declared your variable as today = datetime.date.today(). The name today is not a built-in, so the variable name itself is harmless.
When your next line of code, mylist.append(today), ran, it appended the date object itself to your list, and printing a list shows the repr() of each element, which is why you see datetime.date(...).
A simple solution is to print the variable itself rather than the list.
Here's what I tried:
import datetime
mylist = []
present = datetime.date.today()
mylist.append(present)
print(present)
and it prints yyyy-mm-dd.
Here is how to display the date as (year/month/day) :
from datetime import datetime
now = datetime.now()
print('%s/%s/%s' % (now.year, now.month, now.day))
import datetime
import time
months = ["Unknown","January","February","March","April","May","June","July","August","September","October","November","December"]
datetimeWrite = (time.strftime("%d-%m-%Y "))
date = time.strftime("%d")
month= time.strftime("%m")
choices = {'01': 'Jan', '02':'Feb','03':'Mar','04':'Apr','05':'May','06': 'Jun','07':'Jul','08':'Aug','09':'Sep','10':'Oct','11':'Nov','12':'Dec'}
result = choices.get(month, 'default')
year = time.strftime("%Y")
Date = date+"-"+result+"-"+year
print(Date)
In this way you can get Date formatted like this example: 22-Jun-2017
I don't fully understand the question, but you can use pandas to get times in the right format:
>>> import pandas as pd
>>> pd.to_datetime('now')
Timestamp('2018-10-07 06:03:30')
>>> print(pd.to_datetime('now'))
2018-10-07 06:03:47
>>> pd.to_datetime('now').date()
datetime.date(2018, 10, 7)
>>> print(pd.to_datetime('now').date())
2018-10-07
>>>
And:
>>> l=[]
>>> l.append(pd.to_datetime('now').date())
>>> l
[datetime.date(2018, 10, 7)]
>>> map(str,l)
<map object at 0x0000005F67CCDF98>
>>> list(map(str,l))
['2018-10-07']
This stores strings, but they are easy to convert back:
>>> l=list(map(str,l))
>>> list(map(pd.to_datetime,l))
[Timestamp('2018-10-07 00:00:00')]
Maybe the shortest solution, which exactly matches your situation, would be:
mylist.append(str(AnyDate)[:10])
or even shorter, e.g.:
f'{AnyDate}'[:10]
PS: it doesn't need to be today.
