converting unicode to datetime in python 2.7 - python

This should be an easy one but I am new to using datetime ...
I want to convert the following unicode to any usable datetime format:
u'Tuesday, March 28, 2017'
So I have:
>>> from datetime import datetime
>>> test = u'Tuesday, March 28, 2017'
>>> date_time = datetime.strptime(test, '????')
I have tried a bunch of combinations for '????' but I keep getting an error saying that the format does not match. I am looking for one working example of '????' to get the unicode date into a datetime object; from there I can adjust the format to get it the way I want.

If you have trouble figuring out what datetime.strptime() specifications work for your date, break down the date into components. It is much easier to puzzle out a single specification per component.
So for your date, start perhaps with the March 28 component (Tuesday on its own is ambiguous and does not identify a date):
>>> from datetime import datetime
>>> datetime.strptime('March 28', '%B %d') # Full month and numeric day
datetime.datetime(1900, 3, 28, 0, 0)
>>> datetime.strptime('March 28, ', '%B %d, ') # add in the comma and space
datetime.datetime(1900, 3, 28, 0, 0)
>>> datetime.strptime('March 28, 2017', '%B %d, %Y') # add in the year
datetime.datetime(2017, 3, 28, 0, 0)
>>> datetime.strptime(', March 28, 2017', ', %B %d, %Y') # another comma and space
datetime.datetime(2017, 3, 28, 0, 0)
>>> datetime.strptime('Tuesday, March 28, 2017', '%A, %B %d, %Y') # Full weekday name
datetime.datetime(2017, 3, 28, 0, 0)
So '%A, %B %d, %Y' matches the string you tried to parse.
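As a minimal sketch, the format worked out above can be wrapped in a small helper. The names parse_long_date and LONG_DATE_FORMAT are hypothetical, not part of any library, and the sketch assumes English weekday and month names.

```python
from datetime import datetime

# Format string worked out above: full weekday, full month, day, 4-digit year.
LONG_DATE_FORMAT = '%A, %B %d, %Y'


def parse_long_date(text):
    """Parse a string such as u'Tuesday, March 28, 2017' into a datetime."""
    return datetime.strptime(text, LONG_DATE_FORMAT)


parsed = parse_long_date(u'Tuesday, March 28, 2017')
# Once parsed, strftime can produce whatever output format is needed.
print(parsed.strftime('%Y-%m-%d'))
```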

Related

Parse date with specific format in python

how would you go about parsing a date like that in python:
Monday, April 1st
I've tried
datetime_object = datetime.strptime(date.replace("st","").replace("rd","").replace("th","").replace("nd","").strip(), '%A, %B %d')
But obviously it would remove the "nd" from "Monday" and cause an exception
thanks
Don't replace. Strip, from the right using str.rstrip. If the unwanted characters don't exist, the string is returned as is:
>>> from datetime import datetime
>>> s = "Monday, April 1st"
>>> datetime.strptime(s.rstrip('strndh'), '%A, %B %d')
datetime.datetime(1900, 4, 1, 0, 0)
Note that the day information here (i.e. Monday) is redundant.
You can use the dateutil module (pip install python-dateutil):
>>> from dateutil import parser
>>> parser.parse("Monday, April 1st")
datetime.datetime(2017, 4, 1, 0, 0)
If not all of your strings end with an ordinal suffix, check the last character first:
a = "Monday, April 1st"
if not a[-1].isdigit():
    a = a[:-2]  # drop the two-letter suffix (st, nd, rd, th)
datetime_object = datetime.strptime(a, '%A, %B %d')
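Another way to avoid clobbering the "nd" in "Monday" is a regular expression that removes a suffix only when it directly follows a digit. This is a sketch rather than the approach of any answer above; parse_ordinal_date and ORDINAL_SUFFIX are hypothetical names.

```python
import re
from datetime import datetime

# Match an ordinal suffix only when it directly follows a digit,
# so the "nd" in "Monday" is left alone.
ORDINAL_SUFFIX = re.compile(r'(?<=\d)(st|nd|rd|th)\b')


def parse_ordinal_date(text):
    """Parse strings like 'Monday, April 1st' (hypothetical helper)."""
    cleaned = ORDINAL_SUFFIX.sub('', text)
    return datetime.strptime(cleaned, '%A, %B %d')


print(parse_ordinal_date('Monday, April 1st'))
print(parse_ordinal_date('Monday, April 22nd'))
```

Because the input has no year, strptime fills in its default year of 1900, as in the examples above.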

How to convert a string into date-format in python?

I have a string like 23 July 1914 and want to convert it to 23/07/1914 date format.
But my code gives error.
from datetime import datetime
datetime_object = datetime.strptime('1 June 2005','%d %m %Y')
print datetime_object
Your error is in the format you are using to parse your string. You use %m as the format specifier for the month, but this expects the month as a number (e.g. 06 for your example). What you want to use is %B, which expects the full month name (e.g. June in your example).
For a full explanation of the datetime format specifiers please consult the documentation, and if you have any other issues please check there first.
Here is what you should be doing:
from datetime import datetime
datetime_object = datetime.strptime('1 June 2005','%d %B %Y')
s = datetime_object.strftime("%d/%m/%Y")  # %Y gives the four-digit year; %y would give 05
print(s)
Output:
01/06/2005
strptime takes two parameters:
strptime(string[, format])
The string is converted to a datetime object according to the format you specify.
There are various format directives:
%a - abbreviated weekday name
%A - full weekday name
%b - abbreviated month name
%B - full month name
%c - preferred date and time representation
%C - century number (the year divided by 100, range 00 to 99)
%d - day of the month (01 to 31)
%D - same as %m/%d/%y
%e - day of the month (1 to 31)
%g - like %G, but without the century
%G - 4-digit year corresponding to the ISO week number (see %V).
%h - same as %b
%H - hour, using a 24-hour clock (00 to 23)
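The above are some examples; see the documentation for the full list. Python only guarantees a subset of these: directives such as %C, %D, %e, %g and %h come from the platform's C library and may not be accepted by strptime.
Take a good look at these two!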
%b - abbreviated month name
%B - full month name
The format must follow the same pattern as the string you provide. If that sounds confusing, take a look at these examples.
>>> datetime.strptime('1 jul 2009','%d %b %Y')
datetime.datetime(2009, 7, 1, 0, 0)
>>> datetime.strptime('1 Jul 2009','%d %b %Y')
datetime.datetime(2009, 7, 1, 0, 0)
>>> datetime.strptime('jul 21 1996','%b %d %Y')
datetime.datetime(1996, 7, 21, 0, 0)
As you can see based on the format the string is turned into a datetime object. Now take a look!
>>> datetime.strptime('1 July 2009','%d %b %Y')
Traceback (most recent call last):
File "<pyshell#12>", line 1, in <module>
datetime.strptime('1 July 2009','%d %b %Y')
File "/usr/lib/python3.5/_strptime.py", line 510, in _strptime_datetime
tt, fraction = _strptime(data_string, format)
File "/usr/lib/python3.5/_strptime.py", line 343, in _strptime
(data_string, format))
ValueError: time data '1 July 2009' does not match format '%d %b %Y'
Why the error? Because %b stands for the short form (jul or Jul). When you supply the full name July, the string no longer matches the format. What to do? Change the format to use %B.
>>> datetime.strptime('1 July 2009','%d %B %Y')
datetime.datetime(2009, 7, 1, 0, 0)
Now converting the datetime object to the desired string is simple enough.
>>> s = datetime.strptime('1 July 2009','%d %B %Y')
>>> s.strftime('%d/%m/%Y')
'01/07/2009'
Again, %m is the directive for the month as a zero-padded number; read more about the directives in the documentation.
The placeholder for "Month as locale’s full name." would be %B not %m:
>>> from datetime import datetime
>>> datetime_object = datetime.strptime('1 June 2005','%d %B %Y')
>>> print(datetime_object)
2005-06-01 00:00:00
>>> print(datetime_object.strftime("%d/%m/%Y"))
01/06/2005
This should work:
from datetime import datetime
print(datetime.strptime('1 June 2005', '%d %B %Y').strftime('%d/%m/%Y'))
print(datetime.strptime('23 July 1914', '%d %B %Y').strftime('%d/%m/%Y'))
For more info you can read about strftime-strptime-behavior
%d means "Day of the month as a zero-padded decimal number."
%m means "Month as a zero-padded decimal number."
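Neither the day nor the month is supplied in the form you tell strptime to expect. What you need is %B for the month (the full month name in the current locale, so English names need an English or C locale). %d can stay as it is: when parsing, it accepts the day with or without a leading zero. %-d is a platform-specific strftime extension and strptime rejects it.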
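The answers above can be pulled together in a short sketch. to_slash_date is a hypothetical helper name; the sketch assumes English month names and shows that %d parses the day with or without a leading zero, while a mismatched directive raises ValueError.

```python
from datetime import datetime


def to_slash_date(text):
    """Convert e.g. '23 July 1914' to '23/07/1914' (hypothetical helper)."""
    return datetime.strptime(text, '%d %B %Y').strftime('%d/%m/%Y')


# %d accepts the day with or without a leading zero when parsing.
print(to_slash_date('1 June 2005'))
print(to_slash_date('01 June 2005'))
print(to_slash_date('23 July 1914'))

# The original mistake: %m expects a month number, so 'June' does not match.
try:
    datetime.strptime('1 June 2005', '%d %m %Y')
except ValueError as error:
    print('could not parse: {}'.format(error))
```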

Python Error Handling concerning datetime and time

I have this variable called pubdate which is derived from rss feeds. Most of the time it's a time tuple which is what I want it to be, so there are no errors.
Sometimes it's a unicode string, that's where it gets annoying.
So far, I have this following code concerning pubdate when it is a unicode string:
if isinstance(pubdate, unicode):
    try:
        pubdate = time.mktime(datetime.strptime(pubdate, '%d/%m/%Y %H:%M:%S').timetuple()) # turn the string into a unix timestamp
    except ValueError:
        pubdate = re.sub(r'\w+,\s*', '', pubdate) # removes day words from string, i.e 'Mon', 'Tue', etc.
        pubdate = time.mktime(datetime.strptime(pubdate, '%d %b %Y %H:%M:%S').timetuple()) # turn the string into a unix timestamp
But my problem is that if the unicode string pubdate is in a format other than the one handled in the except ValueError clause, it will raise another ValueError. What's the Pythonic way to deal with multiple ValueError cases?
Since you are parsing date strings from RSS feeds, you may need a parser that can guess the format. I recommend using dateutil instead of the datetime module.
dateutil.parser offers a generic date/time string parser which is able to parse most known formats to represent a date and/or time.
The prototype of this function is parse(timestr); you don't have to specify the format yourself.
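DEMO (after from dateutil.parser import parse; DEFAULT is a datetime that supplies any missing fields, here one in the year 2003):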
>>> parse("2003-09-25T10:49:41")
datetime.datetime(2003, 9, 25, 10, 49, 41)
>>> parse("2003-09-25T10:49")
datetime.datetime(2003, 9, 25, 10, 49)
>>> parse("2003-09-25T10")
datetime.datetime(2003, 9, 25, 10, 0)
>>> parse("2003-09-25")
datetime.datetime(2003, 9, 25, 0, 0)
>>> parse("Sep 03", default=DEFAULT)
datetime.datetime(2003, 9, 3, 0, 0)
Fuzzy parsing:
>>> s = "Today is 25 of September of 2003, exactly " \
... "at 10:49:41 with timezone -03:00."
>>> parse(s, fuzzy=True)
datetime.datetime(2003, 9, 25, 10, 49, 41,
tzinfo=tzoffset(None, -10800))
You could take the following approach:
from datetime import datetime
import time
pub_dates = ['2/5/2013 12:23:34', 'Monday 2 Jan 2013 12:23:34', 'mon 2 Jan 2013 12:23:34', '10/14/2015 11:11', '10 2015']
formats = ['%d/%m/%Y %H:%M:%S', '%d %b %Y %H:%M:%S', '%a %d %b %Y %H:%M:%S', '%A %d %b %Y %H:%M:%S', '%m/%d/%Y %H:%M']
for pub_date in pub_dates:
    pubdate = 0 # value if all conversion attempts fail
    for fmt in formats:
        try:
            pubdate = time.mktime(datetime.strptime(pub_date, fmt).timetuple()) # turn the string into a unix timestamp
            break
        except ValueError:
            pass # this format did not match, try the next one
    print('{:<12} {}'.format(pubdate, pub_date))
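Giving output as (the timestamps depend on the local time zone, because time.mktime() interprets the time tuple as local time):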
1367493814.0 2/5/2013 12:23:34
1357129414.0 Monday 2 Jan 2013 12:23:34
1357129414.0 mon 2 Jan 2013 12:23:34
1444817460.0 10/14/2015 11:11
0 10 2015
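The same idea can be packaged as a function that also passes through values that are already time tuples. This is only a sketch: pubdate_to_timestamp and PUBDATE_FORMATS are hypothetical names, and returning None instead of 0 on failure is an assumption made here so that a failed parse cannot be mistaken for the epoch.

```python
import time
from datetime import datetime

# Candidate formats, tried in order; extend the tuple as new feed formats appear.
PUBDATE_FORMATS = (
    '%d/%m/%Y %H:%M:%S',
    '%d %b %Y %H:%M:%S',
    '%a %d %b %Y %H:%M:%S',
)


def pubdate_to_timestamp(pubdate, formats=PUBDATE_FORMATS):
    """Return a Unix timestamp (local time), or None if nothing matches."""
    if isinstance(pubdate, tuple):  # time.struct_time is a tuple subclass
        return time.mktime(pubdate)
    for fmt in formats:
        try:
            return time.mktime(datetime.strptime(pubdate, fmt).timetuple())
        except ValueError:
            continue  # this format did not match; try the next one
    return None


print(pubdate_to_timestamp('2/5/2013 12:23:34'))
print(pubdate_to_timestamp('10 2015'))  # no format matches
```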

How to convert date like "Apr 15 2014 16:21:16 UTC" to UTC time using python

I have dates in the following format that are used to name zip files:
Apr 15 2014 16:21:16 UTC
I would like to convert that to UTC numbers using Python. Does python recognize the 3-character month?
Use:
import datetime
datetime.datetime.strptime(yourstring, '%b %d %Y %H:%M:%S UTC')
%b is the abbreviated month name. By default, Python uses the C (English) locale, regardless of environment variables used.
Demo:
>>> import datetime
>>> yourstring = 'Apr 15 2014 16:21:16 UTC'
>>> datetime.datetime.strptime(yourstring, '%b %d %Y %H:%M:%S UTC')
datetime.datetime(2014, 4, 15, 16, 21, 16)
The resulting datetime is naive (it carries no tzinfo), which is fine for UTC timestamps, provided you don't mix in local-time objects (e.g. stick to datetime.datetime.utcnow() and similar methods).
An easier way is to use dateutil:
>>> from dateutil import parser
>>> parser.parse("Apr 15 2014 16:21:16 UTC")
datetime.datetime(2014, 4, 15, 16, 21, 16, tzinfo=tzutc())
Timezone is handled, and it supports other common datetime formats as well.
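If "UTC numbers" means seconds since the epoch, one possible sketch uses only the standard library. utc_string_to_epoch is a hypothetical helper name, and the sketch assumes the input always ends in the literal text UTC.

```python
import calendar
from datetime import datetime


def utc_string_to_epoch(text):
    """Convert e.g. 'Apr 15 2014 16:21:16 UTC' to seconds since the epoch."""
    naive_utc = datetime.strptime(text, '%b %d %Y %H:%M:%S UTC')
    # calendar.timegm() treats the time tuple as UTC, unlike time.mktime(),
    # which would treat it as local time.
    return calendar.timegm(naive_utc.timetuple())


print(utc_string_to_epoch('Apr 15 2014 16:21:16 UTC'))
```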

Converting dates with times like "midnight"

I have the following string which I am trying to convert to a datetime in python
From django template I am getting the following date format:
July 1, 2013, midnight
I am trying to convert the string above into a date time format
date_object = datetime.strptime(x, '%B %d, %Y, %I:%M %p')
It throws a format error
time data 'July 1, 2013, midnight' does not match format '%B %d, %Y, %I:%M %p'
Your best shot is probably the parsedatetime module.
Here's your example:
>>> import parsedatetime
>>> cal = parsedatetime.Calendar()
>>> cal.parse('July 1, 2013, midnight')
((2013, 7, 1, 0, 0, 0, 0, 245, 0), 3)
cal.parse() returns a tuple of two items. The first is the parsed date and time as a time tuple (the same nine fields as time.struct_time), the second is an integer flag, as explained in the docstring of the parse method:
0 = not parsed at all
1 = parsed as a C{date}
2 = parsed as a C{time}
3 = parsed as a C{datetime}
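To get a datetime object out of that result, the first six fields of the time tuple can be passed to the datetime constructor. The sketch below hard-codes the result shown above so it runs without parsedatetime installed; tuple_to_datetime is a hypothetical helper name.

```python
from datetime import datetime

# Result copied from the cal.parse() example above: (time tuple, status flag).
time_tuple, status = ((2013, 7, 1, 0, 0, 0, 0, 245, 0), 3)


def tuple_to_datetime(fields):
    """Build a datetime from the first six fields of a time tuple."""
    return datetime(*fields[:6])


if status:  # a status of 0 would mean nothing was parsed
    print(tuple_to_datetime(time_tuple))
```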
A few words on strptime:
strptime won't be able to understand "midnight", but you can replace it with an actual hour, using something like this:
import datetime
def fix_dt(raw_date):
    """Replace 'midnight', 'noon', etc."""
    return raw_date.replace('midnight', '0').replace('noon', '12')
def parse_dt(raw_date):
    """Parse the fuzzy timestamps."""
    return datetime.datetime.strptime(fix_dt(raw_date), '%B %d, %Y, %H')
Then:
>>> parse_dt('July 1, 2013, midnight')
datetime.datetime(2013, 7, 1, 0, 0)
You can play on strfti.me to see which one will match your format.
You should check out this other question. The answers suggest using parsedatetime and pyparsing to parse fuzzy timestamps like the one in your example. Also check this pyparsing wiki page.
You could also just combine the date with datetime's minimum time (midnight):
from datetime import datetime, date
dt = date.today()
print(datetime.combine(dt, datetime.min.time()))
