How to remove unconverted data from a Python datetime object - python

I have a database of mostly correct datetimes, but a few are broken like so: Sat Dec 22 12:34:08 PST 20102015
Without the invalid year, this was working for me:
end_date = soup('tr')[4].contents[1].renderContents()
end_date = time.strptime(end_date,"%a %b %d %H:%M:%S %Z %Y")
end_date = datetime.fromtimestamp(time.mktime(end_date))
But once I hit an object with an invalid year I get ValueError: unconverted data remains: 2, which is great, but I'm not sure how best to strip the bad characters out of the year. They range from 2 to 6 unconverted characters.
Any pointers? I would just slice end_date, but I'm hoping there is a datetime-safe strategy.

Unless you want to rewrite strptime (a very bad idea), the only real option you have is to slice end_date and chop off the extra characters at the end, assuming that this will give you the correct result you intend.
For example, you can catch the ValueError, slice, and try again:
import time
def parse_prefix(line, fmt):
    try:
        t = time.strptime(line, fmt)
    except ValueError as v:
        if len(v.args) > 0 and v.args[0].startswith('unconverted data remains: '):
            # 26 == len('unconverted data remains: '); the rest is the leftover text
            line = line[:-(len(v.args[0]) - 26)]
            t = time.strptime(line, fmt)
        else:
            raise
    return t
For example:
parse_prefix(
'2015-10-15 11:33:20.738 45162 INFO core.api.wsgi yadda yadda.',
'%Y-%m-%d %H:%M:%S'
) # -> time.struct_time(tm_year=2015, tm_mon=10, tm_mday=15, tm_hour=11, tm_min=33, ...

Yeah, I'd just chop off the extra numbers. Assuming they are always appended to the datestring, then something like this would work:
end_date = end_date.split(" ")
end_date[-1] = end_date[-1][:4]
end_date = " ".join(end_date)
I was going to try to get the number of excess digits from the exception, but on my installed versions of Python (2.6.6 and 3.1.2) that information isn't actually there; it just says that the data does not match the format. Of course, you could just continue lopping off digits one at a time and re-parsing until you don't get an exception.
You could also write a regex that will match only valid dates, including the right number of digits in the year, but that seems like overkill.
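The "keep lopping off characters and re-parsing" idea can be sketched roughly as follows. The helper name parse_trimming and the max_trim limit are illustrative assumptions, not part of the answer above, and the sketch assumes the junk is always appended at the end of the string.

```python
import time

def parse_trimming(text, fmt, max_trim=6):
    """Hypothetical helper: drop trailing characters one at a time until strptime accepts the string."""
    # Never try to cut more characters than the string actually has.
    for cut in range(min(max_trim, len(text)) + 1):
        candidate = text[:len(text) - cut]
        try:
            return time.strptime(candidate, fmt)
        except ValueError:
            continue  # still unparseable, trim one more character
    raise ValueError("could not parse %r after trimming up to %d characters" % (text, max_trim))

parsed = parse_trimming("22/12/20102015", "%d/%m/%Y")
print(parsed.tm_year)  # 2010
```

This does not depend on the wording of the exception message, at the cost of a few extra parse attempts per bad row.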

Here's an even simpler one-liner I use:
end_date = end_date[:-4]

Improving (I hope) on Adam Rosenfield's code:
import time
for end_date in ('Fri Feb 18 20:41:47 Paris, Madrid 2011',
                 'Fri Feb 18 20:41:47 Paris, Madrid 20112015'):
    print(end_date)
    fmt = "%a %b %d %H:%M:%S %Z %Y"
    try:
        end_date = time.strptime(end_date, fmt)
    except ValueError as v:
        # length of whatever follows 'unconverted data remains: ' (0 for any other message)
        ulr = len(v.args[0].partition('unconverted data remains: ')[2])
        if ulr:
            end_date = time.strptime(end_date[:-ulr], fmt)
        else:
            raise
    print(end_date, '\n')

strptime() really expects to see a correctly formatted date, so you probably need to do some munging on the end_date string before you call it.
This is one way to chop the last item in the end_date to 4 chars:
chop = len(end_date.split()[-1]) - 4
if chop > 0:  # end_date[:-0] would return an empty string
    end_date = end_date[:-chop]

Related

Python - Convert February 29 dates to date time objects

I need to convert dates in string format to date time object but I keep getting a value error for 29th February dates. Here is my code.
from datetime import datetime
def try_parsing_date(text):
    for fmt in ('%Y-%m-%d', '%d.%m.%Y', '%m/%d/%Y', '%d/%m/%Y', '%d-%b-%y', '%d/%m/%y', '%m/%d/%y', '%m/%d/%Y'):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            pass
    raise ValueError(text)
df['Dateofbirth'] = df.apply(lambda row: try_parsing_date(row['Dateofbirth']), axis=1)
The error I get is ValueError: ('2/29/57', 'occurred at index 82445').
What is the best way to resolve this issue?
This is not a Python problem. 1957 wasn't a leap year, so 2/29/57 never existed. If someone claims that as his date of birth, he's lying. So you could just as well put any date into your list, or NaN.
The error message I got was more specific and useful:
from datetime import datetime
datetime.strptime('1957-02-29', '%Y-%m-%d')
ValueError: day is out of range for month
BTW, there are a lot of misleading websites about history that report 'what happened' on 29 Feb '57 :-)
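If impossible dates should simply be marked as missing instead of aborting the whole conversion, the parsing helper can return None for them. This is only a minimal sketch, assuming None is an acceptable placeholder; parse_or_none and the shortened FORMATS tuple are illustrative names that do not appear in the question.

```python
from datetime import datetime

FORMATS = ("%Y-%m-%d", "%m/%d/%y")  # shortened, illustrative list of formats

def parse_or_none(text, formats=FORMATS):
    """Return a datetime for the first format that fits, or None if none does."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            pass  # wrong format or impossible date, try the next format
    return None

print(parse_or_none("2/29/57"))     # None: that year has no 29 February
print(parse_or_none("1956-02-29"))  # 1956-02-29 00:00:00
```

The rows that come back as None can then be inspected or corrected by hand.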

How to format date to 1900's?

I'm preprocessing data and one column represents dates such as '6/1/51'
I'm trying to convert the string to a date object and so far what I have is:
date = row[2].strip()
format = "%m/%d/%y"
datetime_object = datetime.strptime(date, format)
date_object = datetime_object.date()
print(date_object)
print(type(date_object))
The problem I'm facing is changing 2051 to 1951.
I tried writing
format = "%m/%d/19%y"
But it gives me a ValueError.
ValueError: time data '6/1/51' does not match format '%m/%d/19%y'
I couldn't easily find the answer online so I'm asking here. Can anyone please help me with this?
Thanks.
Parse the date without the century using '%m/%d/%y', then:
year_1900 = datetime_object.year - 100
datetime_object = datetime_object.replace(year=year_1900)
You should put a conditional around that so you only do it for dates that actually belong in the 1900s, for example any parsed date that is later than today.
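One way to write that conditional is sketched below. The function name parse_two_digit_year and its today parameter are assumptions made for illustration; the rule "a date in the future must belong to the previous century" only suits data such as birth dates.

```python
from datetime import datetime

def parse_two_digit_year(text, fmt="%m/%d/%y", today=None):
    """Hypothetical helper: parse a two-digit year and move future dates back 100 years."""
    today = today or datetime.now()
    parsed = datetime.strptime(text, fmt)
    if parsed > today:
        # strptime mapped the two-digit year into the 2000s; shift it back one century
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed.date()

reference = datetime(2024, 1, 1)  # fixed "today" so the example is reproducible
print(parse_two_digit_year("6/1/51", today=reference))  # 1951-06-01
print(parse_two_digit_year("6/1/05", today=reference))  # 2005-06-01
```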

Validating date format with Python regex

I want to check if the format of the date input by user matches the below:
Jan 5 2018 6:10 PM
Month: First letter should be caps, followed 2 more in small. (total 3 letters)
<Space>: single space, must exist
Date: For single digit it should not be 05, but 5
<Space>: single space, must exist
Hour: 0-12, for single digit it should not be 06, but 6
Minute: 00-59
AM/PM
I'm using the below regex and trying to match:
import re,sys
usr_date = str(input("Please enter the older date until which you want to scan ? \n[Date Format Example: Jan 5 2018 6:10 PM] : "))
valid_usr_date = re.search("^(\s+)*[A-Z]{1}[a-z]{2}\s{1}[1-31]{1}\s{1}[1-2]{1}[0-9]{1}[0-9]{1}[0-9]{1}\s{1}[0-12]{1}:[0-5]{1}[0-9]{1}\s{1}(A|P)M$",usr_date,re.M)
if not valid_usr_date:
print ("The date format is incorrect. Please follow the exact date format as shown in example. Exiting Program!")
sys.exit()
But, even for the correct format it gives a syntax wrong error. What am I doing wrong.
I would not use regex for that, as you have no way to actually validate the date itself (eg, a regex will happily accept Abc 99 9876 9:99 PM).
Instead, use strptime:
from datetime import datetime
string = 'Jan 5 2018 6:10 PM'
datetime.strptime(string, '%b %d %Y %I:%M %p')
If the string would be in the "wrong" format you'd get a ValueError.
The only apparent "problem" with this approach is that for some reason you require the day and hour not to be zero-padded and strptime doesn't seem to have such directives.
A table with all available directives is here.
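If the no-zero-padding rule really matters, a light regex check for the shape of the string can be combined with strptime for the actual date validation. This is a sketch under assumptions: the name is_valid_user_date and the pattern are illustrative, and the hour is taken to be 1-12 because that is what %I accepts.

```python
import re
from datetime import datetime

# Shape only: 'Mon D YYYY H:MM AM/PM', with no leading zero on the day or the hour.
SHAPE = re.compile(r"[A-Z][a-z]{2} [1-9][0-9]? [0-9]{4} [1-9][0-9]?:[0-5][0-9] [AP]M")

def is_valid_user_date(text):
    """Return True only if the text has the expected shape and is a real date."""
    if not SHAPE.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%b %d %Y %I:%M %p")
    except ValueError:
        return False
    return True

print(is_valid_user_date("Jan 5 2018 6:10 PM"))   # True
print(is_valid_user_date("Jan 05 2018 6:10 PM"))  # False: zero-padded day
print(is_valid_user_date("Abc 99 9876 9:59 PM"))  # False: right shape, not a date
```

The regex only polices the layout; whether the month name, day and hour are meaningful is left to strptime.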
You could use a function that parses the input string and tries to return a datetime object; if it can't, it raises an argparse.ArgumentTypeError:
import argparse
from datetime import datetime
def valid_date(s):
    try:
        return datetime.strptime(s, '%Y-%m-%d %H:%M')
    except ValueError:
        msg = "Not a valid date: '{0}'.".format(s)
        raise argparse.ArgumentTypeError(msg)
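A function like this is typically passed as the type= argument of an argparse option. The sketch below assumes a hypothetical --since option; the argument list is passed explicitly so the example runs without a command line.

```python
import argparse
from datetime import datetime

def valid_date(s):
    """Convert the command-line string to a datetime, or report a clean argparse error."""
    try:
        return datetime.strptime(s, "%Y-%m-%d %H:%M")
    except ValueError:
        raise argparse.ArgumentTypeError("Not a valid date: '{0}'.".format(s))

parser = argparse.ArgumentParser()
# '--since' is a made-up option name used only for this illustration
parser.add_argument("--since", type=valid_date, help="date in YYYY-MM-DD HH:MM format")

args = parser.parse_args(["--since", "2018-01-05 18:10"])
print(args.since)  # 2018-01-05 18:10:00
```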

Difficulty with the replace method

I must have the user enter a date in mm/dd/yy format and then output the string in long-date format like January, ##, ####. I cannot for the life of me get the month to be replaced with the word.
def main():
    get_date=input('Input a date in mm/dd/yy format!\nIf you would like to enter a 1-digit number, enter a zero first, then the number\nDate:')
    month= int(get_date[:2])
    day=int(get_date[3:5])
    year=int(get_date[6:])
    validate(month, day, year)#validates input
    get_month(get_date)
def validate(month,day,year):
    while month>12 or month<1 or day>31 or day<1 or year!=15:
        print("if you would like to enter a one-digit number, enter a zero first, then the number\n theres only 12 months in a year\n only up to 31 days in a month, and\n you must enter 15 as the year")
        get_date=input('Input a date in mm/dd/yy format!:')
        month= int(get_date[:2])
        day=int(get_date[3:5])
        year=int(get_date[6:])
def get_month(get_date):
    if get_date.startswith('01'):
        get_date.replace('01','January')
        print(get_date)
I have tried a plethora of things to fix this but I cannot make January appear instead of 01.
Strings in Python are immutable, they don't change once they're created. That means any function that modifies it must return a new string. You need to capture that new value.
get_date = get_date.replace('01','January')
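A small demonstration of the difference, using a made-up input value. Passing a count of 1 to replace() is an extra precaution not mentioned above: it keeps a day such as '01' from being replaced as well.

```python
get_date = "01/15/15"

get_date.replace("01", "January")  # the new string is created and immediately discarded
print(get_date)                    # 01/15/15

# Capture the result; count=1 limits the replacement to the leading month.
long_date = get_date.replace("01", "January", 1)
print(long_date)                   # January/15/15
```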
You can do this (and simplify the code) using Python's datetime module.
The strptime function will parse a date from a string using format codes. If it can't parse the string correctly, it will raise a ValueError, so there is no need for your custom validation function.
https://docs.python.org/2.7/library/datetime.html#datetime.datetime.strptime
The strftime function will print out that date formatted according to the same codes.
https://docs.python.org/2.7/library/datetime.html#datetime.datetime.strftime
Updated, your code would look something like this:
from datetime import datetime
parsed = None
while not parsed:
    get_date=input('Input a date in mm/dd/yy format!\nIf you would like to enter a 1-digit number, enter a zero first, then the number\nDate:')
    try:
        parsed = datetime.strptime(get_date, '%m/%d/%y')
    except ValueError:
        parsed = None
print(parsed.strftime('%B %d, %Y'))
Why don't you use the datetime module?
year = 2007; month=11; day=3
import datetime
d = datetime.date(year, month, day)
print(d.strftime("%d %B %Y"))
You might be better off using Python's datetime module for this:
from datetime import datetime
entered_date = input('Input a date in mm/dd/yy format!\nIf you would like to enter a 1-digit number, enter a zero first, then the number\nDate:')
d = datetime.strptime(entered_date, '%m/%d/%y')
entered_date = d.strftime('%B, %d, %Y')
e.g.
'February, 29, 2016'
This way you catch invalid dates (such as 02/29/15) as well as badly-formatted ones.

How to print a date in a regular format?

This is my code:
import datetime
today = datetime.date.today()
print(today)
This prints: 2008-11-22 which is exactly what I want.
But, I have a list I'm appending this to and then suddenly everything goes "wonky". Here is the code:
import datetime
mylist = []
today = datetime.date.today()
mylist.append(today)
print(mylist)
This prints the following:
[datetime.date(2008, 11, 22)]
How can I get just a simple date like 2008-11-22?
The WHY: dates are objects
In Python, dates are objects. Therefore, when you manipulate them, you manipulate objects, not strings or timestamps.
Any object in Python has TWO string representations:
The regular representation, used by print, can be obtained with the str() function. It is usually the most common human-readable format and is meant to ease display. So str(datetime.datetime(2008, 11, 22, 19, 53, 42)) gives you '2008-11-22 19:53:42'.
The alternative representation, which shows the nature of the object (as data). It can be obtained with the repr() function and is handy for knowing what kind of data you're manipulating while you are developing or debugging. repr(datetime.datetime(2008, 11, 22, 19, 53, 42)) gives you 'datetime.datetime(2008, 11, 22, 19, 53, 42)'.
What happened is that when you printed the date using print, it used str(), so you saw a nice date string. But when you printed mylist, you printed a list of objects, and Python represented each item in it using repr().
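The two representations can be compared side by side with a fixed date. This is just an illustrative snippet; the variable names are arbitrary.

```python
import datetime

d = datetime.date(2008, 11, 22)
mylist = [d]

print(str(d))    # 2008-11-22
print(repr(d))   # datetime.date(2008, 11, 22)
print(mylist)    # [datetime.date(2008, 11, 22)]  (items are shown with repr)

# To display the list contents as plain dates, convert each item explicitly.
as_text = [str(item) for item in mylist]
print(as_text)   # ['2008-11-22']
```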
The How: what do you want to do with that?
Well, when you manipulate dates, keep using the date objects all along the way. They have many useful methods, and most of the Python API expects dates to be objects.
When you want to display them, just use str(). In Python, the good practice is to explicitly cast everything. So just when it's time to print, get a string representation of your date using str(date).
One last thing. When you tried to print the dates, you printed mylist. If you want to print a date, you must print the date objects, not their container (the list).
E.g., you want to print all the dates in a list:
for date in mylist:
    print(str(date))
Note that in that specific case, you can even omit str() because print will use it for you. But it should not become a habit :-)
Practical case, using your code
import datetime
mylist = []
today = datetime.date.today()
mylist.append(today)
print(mylist[0]) # print the date object, not the container ;-)
2008-11-22
# It's better to always use str() because :
print("This is a new day : ", mylist[0]) # will work
>>> This is a new day :  2008-11-22
print("This is a new day : " + mylist[0]) # will crash
>>> TypeError (a str and a datetime.date cannot be concatenated)
print("This is a new day : " + str(mylist[0]))
>>> This is a new day : 2008-11-22
Advanced date formatting
Dates have a default representation, but you may want to print them in a specific format. In that case, you can get a custom string representation using the strftime() method.
strftime() expects a string pattern explaining how you want to format your date.
E.G :
print(today.strftime('We are the %d, %b %Y'))
>>> 'We are the 22, Nov 2008'
All the letter after a "%" represent a format for something:
%d is the day number (2 digits, prefixed with leading zero's if necessary)
%m is the month number (2 digits, prefixed with leading zero's if necessary)
%b is the month abbreviation (3 letters)
%B is the month name in full (letters)
%y is the year number abbreviated (last 2 digits)
%Y is the year number full (4 digits)
etc.
Have a look at the official documentation or McCutchen's quick reference; you can't know them all.
Since PEP 3101, every object can have its own format, used automatically by the format method of any string. In the case of datetime, the format is the same as the one used in strftime. So you can do the same as above like this:
print("We are the {:%d, %b %Y}".format(today))
>>> 'We are the 22, Nov 2008'
The advantage of this form is that you can also convert other objects at the same time.
With the introduction of Formatted string literals (since Python 3.6, 2016-12-23) this can be written as
import datetime
f"{datetime.datetime.now():%Y-%m-%d}"
>>> '2017-06-15'
Localization
Dates can automatically adapt to the local language and culture if you use them the right way, but it's a bit complicated. Maybe for another question on SO(Stack Overflow) ;-)
import datetime
print(datetime.datetime.now().strftime("%Y-%m-%d %H:%M"))
Edit:
After Cees' suggestion, I have started using time as well:
import time
print(time.strftime("%Y-%m-%d %H:%M"))
The date, datetime, and time objects all support a strftime(format) method,
to create a string representing the time under the control of an explicit format
string.
Here is a list of the format codes with their directive and meaning.
%a Locale’s abbreviated weekday name.
%A Locale’s full weekday name.
%b Locale’s abbreviated month name.
%B Locale’s full month name.
%c Locale’s appropriate date and time representation.
%d Day of the month as a decimal number [01,31].
%f Microsecond as a decimal number [0,999999], zero-padded on the left
%H Hour (24-hour clock) as a decimal number [00,23].
%I Hour (12-hour clock) as a decimal number [01,12].
%j Day of the year as a decimal number [001,366].
%m Month as a decimal number [01,12].
%M Minute as a decimal number [00,59].
%p Locale’s equivalent of either AM or PM.
%S Second as a decimal number [00,61].
%U Week number of the year (Sunday as the first day of the week)
%w Weekday as a decimal number [0(Sunday),6].
%W Week number of the year (Monday as the first day of the week)
%x Locale’s appropriate date representation.
%X Locale’s appropriate time representation.
%y Year without century as a decimal number [00,99].
%Y Year with century as a decimal number.
%z UTC offset in the form +HHMM or -HHMM.
%Z Time zone name (empty string if the object is naive).
%% A literal '%' character.
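To see what a few of these directives produce, they can be applied one by one to a fixed datetime. The sketch below sticks to numeric directives because the name-based ones (%a, %B and so on) depend on the locale; the chosen moment is arbitrary.

```python
from datetime import datetime

moment = datetime(2012, 10, 3, 15, 35, 46)  # a Wednesday

# Numeric directives only, so the output does not depend on the locale.
for directive in ("%Y", "%y", "%m", "%d", "%H", "%I", "%M", "%S", "%j", "%w", "%W"):
    print(directive, "->", moment.strftime(directive))

print(moment.strftime("%Y-%m-%d %H:%M:%S"))  # 2012-10-03 15:35:46
print(moment.strftime("100%%"))              # 100%
```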
This is what we can do with the datetime and time modules in Python
import time
import datetime
print("Time in seconds since the epoch: %s" % time.time())
print("Current date and time: ", datetime.datetime.now())
print("Or like this: ", datetime.datetime.now().strftime("%y-%m-%d-%H-%M"))
print("Current year: ", datetime.date.today().strftime("%Y"))
print("Month of year: ", datetime.date.today().strftime("%B"))
print("Week number of the year: ", datetime.date.today().strftime("%W"))
print("Weekday of the week: ", datetime.date.today().strftime("%w"))
print("Day of year: ", datetime.date.today().strftime("%j"))
print("Day of the month : ", datetime.date.today().strftime("%d"))
print("Day of week: ", datetime.date.today().strftime("%A"))
That will print out something like this:
Time in seconds since the epoch: 1349271346.46
Current date and time: 2012-10-03 15:35:46.461491
Or like this: 12-10-03-15-35
Current year: 2012
Month of year: October
Week number of the year: 40
Weekday of the week: 3
Day of year: 277
Day of the month : 03
Day of week: Wednesday
Use date.strftime. The formatting arguments are described in the documentation.
This one is what you wanted:
some_date.strftime('%Y-%m-%d')
This one takes Locale into account. (do this)
some_date.strftime('%c')
This is shorter:
>>> import time
>>> time.strftime("%Y-%m-%d %H:%M")
'2013-11-19 09:38'
# convert date time to regular format.
d_date = datetime.datetime.now()
reg_format_date = d_date.strftime("%Y-%m-%d %I:%M:%S %p")
print(reg_format_date)
# some other date formats.
reg_format_date = d_date.strftime("%d %B %Y %I:%M:%S %p")
print(reg_format_date)
reg_format_date = d_date.strftime("%Y-%m-%d %H:%M:%S")
print(reg_format_date)
OUTPUT
2016-10-06 01:21:34 PM
06 October 2016 01:21:34 PM
2016-10-06 13:21:34
Or even
from datetime import datetime, date
"{:%d.%m.%Y}".format(datetime.now())
Out: '25.12.2013'
or
"{} - {:%d.%m.%Y}".format("Today", datetime.now())
Out: 'Today - 25.12.2013'
"{:%A}".format(date.today())
Out: 'Wednesday'
'{}__{:%Y.%m.%d__%H-%M}.log'.format(__name__, datetime.now())
Out: '__main____2014.06.09__16-56.log'
Simple answer -
datetime.date.today().isoformat()
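For reference, this is what isoformat() returns for a date and for a datetime, using fixed values so the output is predictable. The sep argument shown for the datetime case is part of the standard datetime.isoformat signature.

```python
import datetime

d = datetime.date(2008, 11, 22)
print(d.isoformat())                  # 2008-11-22

moment = datetime.datetime(2008, 11, 22, 19, 53, 42)
print(moment.isoformat())             # 2008-11-22T19:53:42
print(moment.isoformat(sep=" "))      # 2008-11-22 19:53:42
print(moment.date().isoformat())      # 2008-11-22
```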
With type-specific datetime string formatting (see nk9's answer using str.format().) in a Formatted string literal (since Python 3.6, 2016-12-23):
>>> import datetime
>>> f"{datetime.datetime.now():%Y-%m-%d}"
'2017-06-15'
The date/time format directives are not documented as part of the Format String Syntax but rather in date, datetime, and time's strftime() documentation. They are based on the 1989 C Standard, but include some ISO 8601 directives since Python 3.6.
I hate the idea of importing too many modules for convenience. I would rather work with the module that is already available, which in this case is datetime, than bring in another module such as time.
>>> import datetime
>>> a = datetime.datetime(2015, 4, 1, 11, 23, 22)
>>> a.strftime('%Y-%m-%d %H:%M')
'2015-04-01 11:23'
You need to convert the datetime object to a str.
The following code worked for me:
import datetime
collection = []
dateTimeString = str(datetime.date.today())
collection.append(dateTimeString)
print(collection)
Let me know if you need any more help.
In Python you can format a datetime using the strftime() method from the date, time and datetime classes in the datetime module.
In your specific case, you are using the date class from datetime. You can use the following snippet to format the today variable into a string with the format yyyy-MM-dd:
import datetime
today = datetime.date.today()
print("formatted datetime: %s" % today.strftime("%Y-%m-%d"))
The following is a more complete example:
import datetime
# datetime.now() is used here so that the time directives below are meaningful
today = datetime.datetime.now()
# datetime in d/m/Y H:M:S format
date_time = today.strftime("%d/%m/%Y, %H:%M:%S")
print("datetime: %s" % date_time)
# datetime in Y-m-d H:M:S format
date_time = today.strftime("%Y-%m-%d, %H:%M:%S")
print("datetime: %s" % date_time)
# format date
date = today.strftime("%d/%m/%Y")
print("date: %s" % date)
# format time
time = today.strftime("%H:%M:%S")
print("time: %s" % time)
# day
day = today.strftime("%d")
print("day: %s" % day)
# month
month = today.strftime("%m")
print("month: %s" % month)
# year
year = today.strftime("%Y")
print("year: %s" % year)
Sources:
Format DateTime in Python
strftime
You can do:
mylist.append(str(today))
Considering the fact you asked for something simple to do what you wanted, you could just:
import datetime
str(datetime.date.today())
For those wanting locale-based date and not including time, use:
>>> some_date.strftime('%x')
07/11/2019
Since the print today returns what you want this means that the today object's __str__ function returns the string you are looking for.
So you can do mylist.append(today.__str__()) as well.
from datetime import date
def today_in_str_format():
return str(date.today())
print (today_in_str_format())
This will print 2018-06-23 if that's what you want :)
You may want to append it as a string?
import datetime
mylist = []
today = str(datetime.date.today())
mylist.append(today)
print(mylist)
For pandas.Timestamps, strftime() can be used e.g.:
utc_now = datetime.now()
For isoformat:
utc_now.isoformat()
For any format e.g.:
utc_now.strftime("%m/%d/%Y, %H:%M:%S")
You can use easy_date to make it easy:
import date_converter
my_date = date_converter.date_to_string(today, '%Y-%m-%d')
A quick disclaimer for my answer - I've only been learning Python for about 2 weeks, so I am by no means an expert; therefore, my explanation may not be the best and I may use incorrect terminology. Anyway, here it goes.
I noticed in your code that when you declared your variable today = datetime.date.today() you chose to name your variable with the name of a built-in function.
When your next line of code mylist.append(today) appended your list, it appended the entire string datetime.date.today(), which you had previously set as the value of your today variable, rather than just appending today().
A simple solution, albeit maybe not one most coders would use when working with the datetime module, is to change the name of your variable.
Here's what I tried:
import datetime
mylist = []
present = datetime.date.today()
mylist.append(present)
print(present)
and it prints yyyy-mm-dd.
Here is how to display the date as (year/month/day) :
from datetime import datetime
now = datetime.now()
print('%s/%s/%s' % (now.year, now.month, now.day))
import datetime
import time
months = ["Unknown","January","February","March","April","May","June","July","August","September","October","November","December"]
datetimeWrite = (time.strftime("%d-%m-%Y "))
date = time.strftime("%d")
month= time.strftime("%m")
choices = {'01': 'Jan', '02':'Feb','03':'Mar','04':'Apr','05':'May','06': 'Jun','07':'Jul','08':'Aug','09':'Sep','10':'Oct','11':'Nov','12':'Dec'}
result = choices.get(month, 'default')
year = time.strftime("%Y")
Date = date+"-"+result+"-"+year
print(Date)
In this way you can get Date formatted like this example: 22-Jun-2017
I don't fully understand the question, but you can use pandas to get times in the right format:
>>> import pandas as pd
>>> pd.to_datetime('now')
Timestamp('2018-10-07 06:03:30')
>>> print(pd.to_datetime('now'))
2018-10-07 06:03:47
>>> pd.to_datetime('now').date()
datetime.date(2018, 10, 7)
>>> print(pd.to_datetime('now').date())
2018-10-07
>>>
And:
>>> l=[]
>>> l.append(pd.to_datetime('now').date())
>>> l
[datetime.date(2018, 10, 7)]
>>> map(str,l)
<map object at 0x0000005F67CCDF98>
>>> list(map(str,l))
['2018-10-07']
This stores strings, but they are easy to convert back:
>>> l=list(map(str,l))
>>> list(map(pd.to_datetime,l))
[Timestamp('2018-10-07 00:00:00')]
maybe the shortest solution, which exactly matches your situation, would be:
mylist.append(str(AnyDate)[:10])
or even shorter, e.g.:
f'{AnyDate}'[:10]
PS: it doesn't need to be today.
