from dateutil.parser import parse
parse('2m') - treating it as a datetime
I'm testing a few terms like 2w, 2y, 2m and some other date formats received as strings, to check whether a given string contains a date in any format.
Everything works fine except '2m'.
It is being treated as 2 minutes past; I don't want it treated as a date format.
Can anyone help me resolve this issue?
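One possible workaround, sketched under assumptions: reject strings that look like duration shorthand before handing them to the parser. The pattern `SHORTHAND` and the helper `is_date` are hypothetical names, not part of dateutil, and the parser is passed in as an argument so that any callable raising ValueError on bad input (such as dateutil.parser.parse) can be used.

```python
import re

# Hypothetical pattern: digits followed by one letter, e.g. "2m", "2w", "2y".
SHORTHAND = re.compile(r"^\s*\d+\s*[A-Za-z]\s*$")


def is_date(text, parse):
    """Return True if `parse` accepts `text` and `text` is not duration shorthand.

    `parse` is any parser callable that raises ValueError on bad input,
    for example dateutil.parser.parse (an assumption about how it is used).
    """
    if SHORTHAND.match(text):
        # Looks like "2m": refuse it before the parser can guess a time.
        return False
    try:
        parse(text)
    except (ValueError, OverflowError):
        return False
    return True
```

With dateutil installed, a call would look like is_date('2m', dateutil.parser.parse). The pattern is deliberately narrow; widen it if longer units such as "2min" also need to be excluded.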
I have a column of float values which are tweet creation dates. This is the code I used to convert them from float to datetime:
from datetime import datetime
t = 1508054212.0
datetime.utcfromtimestamp(t).strftime('%Y-%m-%d %H:%M:%S')
All the values returned fall in October 2017. However, the data is supposed to have been collected over multiple months, so the dates should differ in month and not just in hours, minutes and seconds.
These are some values which I need to convert:
1508054212.0
1508038548.0
1506890436.0
Could you suggest an alternative approach to determine the dates? Thank you.
I assumed df['tweet_creation'].loc[1] will return a number like the examples you gave.
Unfortunately, I don't know what f is, but I assumed it was a float.
My answer is inspired by this other answer: Converting unix timestamp string to readable date. You have a UNIX timestamp, so the easiest way is to use it directly as a number rather than converting it to a string.
from datetime import datetime, timedelta
dtobj = datetime.utcfromtimestamp(int(df['tweet_creation'].loc[1])) + timedelta(days=f-int(f))
To have the string representation you can use the function strftime.
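As a check, here is a minimal sketch that converts the three sample values from the question using only the standard library. The helper name `to_utc` is hypothetical. All three samples do fall in October 2017, which suggests the conversion itself is behaving correctly for these inputs.

```python
from datetime import datetime, timezone

# The three sample values quoted in the question.
timestamps = [1508054212.0, 1508038548.0, 1506890436.0]


def to_utc(ts):
    """Interpret `ts` as seconds since the Unix epoch and return an aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


for ts in timestamps:
    print(to_utc(ts).strftime('%Y-%m-%d %H:%M:%S'))
# 2017-10-15 07:56:52
# 2017-10-15 03:35:48
# 2017-10-01 20:40:36
```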
This question is really about the best way to handle this sort of input in Python. Here is an example input date: 2018-12-31 23:59:59.999999. The fractional-seconds (microsecond) part may or may not be present in the input.
I am currently using this code to convert it to a datetime:
input_ts = datetime.datetime.strptime(input_str, '%Y-%m-%d %H:%M:%S.%f')
The problem is that it throws an exception if the input string doesn't contain the fractional part, e.g. 2018-12-31 23:59:59.
In Java, I could have approached this problem in two ways (this is a pseudo-explanation that ignores small boundary checks):
1. (Preferred.) Check the input string length. If it is only 19 characters long, the fractional part is missing, so append .000000 to it.
2. (Not preferred.) Let the main code parse the string; if it throws an exception, parse it again with the shorter format, i.e. %Y-%m-%d %H:%M:%S.
A third approach could be to just strip off the fractional part.
I am not sure whether Python has anything built in to handle this kind of situation. Any suggestions?
You could use the python-dateutil library; it is smart enough to parse most of the basic date formats.
import dateutil.parser
dateutil.parser.parse('2018-12-31 23:59:59.999999')
dateutil.parser.parse('2018-12-31 23:59:59')
If you don't want to install any external libraries, you could iterate over a list of different formats, as proposed in this answer.
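A minimal sketch of the format-list approach using only the standard library. The names `FORMATS` and `parse_timestamp` are hypothetical, and the two formats are assumed from the sample inputs in the question.

```python
from datetime import datetime

# Candidate formats, tried in order (assumed from the question's examples).
FORMATS = ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S")


def parse_timestamp(text):
    """Return a datetime for the first format that matches `text`."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError(f"no known format matches {text!r}")


print(parse_timestamp("2018-12-31 23:59:59.999999"))  # 2018-12-31 23:59:59.999999
print(parse_timestamp("2018-12-31 23:59:59"))         # 2018-12-31 23:59:59
```

Adding another accepted layout only requires appending to the tuple, which keeps the parsing logic itself unchanged.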
from datetime import datetime  # import the datetime class from the datetime module
dt = datetime.now()  # get the current time
dt1 = dt.strftime('%Y-%m-%d %H:%M:%S')  # convert the datetime to a string
dt3 = datetime.strptime('2018/5/20', '%Y/%m/%d')  # parse a string into a datetime
I have data imported from a CSV whose index is reported as int64, with values like "01/11/2018". I am unable to convert it to the "01-11-2018" format. How do I do this? I get the following error message:
'time data 0 does not match format '%Y' (match)'
I got the data from the following website:
https://www.nasdaq.com/symbol/spy/historical
and you can find a 'Download this file in Excel Format' icon at the bottom.
import pandas as pd
spyderdat.index = pd.to_datetime(spyderdat.index, format='%Y')
spyderdat.head()
How do I format this correctly?
Thanks a lot.
Your format string must match exactly:
import pandas as pd
spyderdat.index = pd.to_datetime(spyderdat.index, format='%d/%m/%Y')
spyderdat.head()
Example without the DataFrame:
import datetime
date = "1/11/2018"
print(datetime.datetime.strptime(date,"%d/%m/%Y"))
Output:
2018-11-01 00:00:00
You can then format this datetime with strftime however you like (see the link for the format codes), or simply store the datetime objects.
Assuming your input is a string, simply converting the / to - won't fix the issue.
The real problem is that you've told to_datetime to expect the input string to be only a 4-digit year but you've handed it an entire date, days and months included.
If you meant to use only the year portion you should manually extract the year first with something like split.
If you meant to use the full date as a value, you'll need to change your format to something like %d/%m/%Y. (Although I can't tell if your input is days first or months first due to their values.)
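To illustrate the day-first versus month-first ambiguity, here is a small standard-library sketch. The helper `first_field_exceeds_12` is a hypothetical name; it only shows one way to rule out a month-first reading when the data happens to contain a first field above 12.

```python
from datetime import datetime

value = "01/11/2018"

# The same string is valid under both readings, with different results.
day_first = datetime.strptime(value, "%d/%m/%Y")    # 1 November 2018
month_first = datetime.strptime(value, "%m/%d/%Y")  # 11 January 2018


def first_field_exceeds_12(values):
    """Return True if any value has a first field above 12 (so it cannot be a month)."""
    return any(int(v.split("/")[0]) > 12 for v in values)


print(day_first.date(), month_first.date())  # 2018-11-01 2018-01-11
```

If no value in the whole column has a first field above 12, the data alone cannot settle the question and the source's documentation has to.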
The easy way is to try this:
import datetime
datetime.datetime.strptime("01/11/2018", '%d/%m/%Y').strftime('%d-%m-%Y')
I'm trying to parse the date and time from a bunch of filenames that have one of these formats:
prefix.YYYY-MM-DD.suffix
prefix.YYYY-MM-DD_HH:MM:SS.suffix
prefix.YYYY-MM-DD-SSSSS.suffix
The datetime formats for these three are:
prefix.%Y-%m-%d.suffix
prefix.%Y-%m-%d_%H:%M:%S.suffix
prefix.%Y-%m-%d-%?????.suffix
The first two are easy to parse with the datetime module but I'm having trouble figuring out how to parse the 5-digit seconds which range from 00000 to 82800 (86400 seconds per day).
If at all possible, I'd like to use the standard datetime module as this needs to be extremely portable.
My goal is to have a function that can ingest multiple datetime formats so I need to stay away from a one off parser if possible.
from datetime import datetime

def myparser(filename, datetimeformat):
    # do some stuff - maybe as easy as
    datetimeobject = datetime.strptime(filename, datetimeformat)
    return datetimeobject
Any thoughts on how best to do this would be greatly appreciated.
For all of them, split off the date and parse it as a datetime.datetime, then parse the time part into a datetime.timedelta and add it to the first value. That should get you where you need to be.
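A minimal sketch of that date-plus-timedelta idea, using only the standard library. The function name `parse_filename` is hypothetical, and it assumes filenames have exactly the shape prefix.<stamp>.suffix with no other dots.

```python
from datetime import datetime, timedelta


def parse_filename(filename):
    """Parse the three filename layouts from the question into a datetime."""
    # Assumption: exactly "prefix.<stamp>.suffix", so the stamp is the middle piece.
    stamp = filename.split(".")[1]
    date_part, rest = stamp[:10], stamp[10:]
    result = datetime.strptime(date_part, "%Y-%m-%d")
    if not rest:
        return result  # date only
    if rest.startswith("_"):
        # "_HH:MM:SS": parse the clock time and add it as a timedelta.
        t = datetime.strptime(rest[1:], "%H:%M:%S")
        return result + timedelta(hours=t.hour, minutes=t.minute, seconds=t.second)
    if rest.startswith("-"):
        # "-SSSSS": seconds since midnight.
        return result + timedelta(seconds=int(rest[1:]))
    raise ValueError(f"unrecognised time part in {filename!r}")


print(parse_filename("prefix.2018-12-31-03600.suffix"))  # 2018-12-31 01:00:00
```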
I would like a simple way, in Python, to find text of the format 'DD/MM/YYYY' and reformat it into 'YYYY/MM/DD' to be compatible with MySQL TIMESTAMPs, in a list of text items that may or may not contain a date at all. (I'm thinking RegEx?)
Basically I am looking for a way to inspect a list of items and correct any timestamp formats found.
Great thing about standards is that there are so many to choose from....
You can read the string into a datetime object and then output it back as a string using a different format. For example:
>>> from datetime import datetime
>>> datetime.strptime("31/12/2009", "%d/%m/%Y").strftime("%Y/%m/%d")
'2009/12/31'
Basically I am looking for a way to inspect a list of items and correct any timestamp formats found.
If the input format is inconsistent or can vary, you are better off with dateutil.
>>> from dateutil.parser import parse
>>> parse("31/12/2009").strftime("%Y/%m/%d")
'2009/12/31'
Dateutil can handle a lot of input formats automatically. To operate on a list, you can map a wrapper around the parse function over the list and convert the values appropriately.
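For the fixed 'DD/MM/YYYY' case, a standard-library variant of the wrapper idea might look like the sketch below. It swaps dateutil for a regular expression plus strptime validation; the names `DATE_RE`, `_reformat` and `fix_dates` are hypothetical.

```python
import re
from datetime import datetime

# Matches candidates shaped like DD/MM/YYYY; validity is checked separately.
DATE_RE = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


def _reformat(match):
    """Rewrite one match as YYYY/MM/DD, or leave it alone if it is not a real date."""
    try:
        return datetime.strptime(match.group(0), "%d/%m/%Y").strftime("%Y/%m/%d")
    except ValueError:
        return match.group(0)


def fix_dates(items):
    """Return a new list with every valid DD/MM/YYYY date reformatted."""
    return [DATE_RE.sub(_reformat, item) for item in items]


print(fix_dates(["due 31/12/2009", "no date here", "99/99/2009"]))
# ['due 2009/12/31', 'no date here', '99/99/2009']
```

Items without a date pass through unchanged, and strings that only look like dates (such as 99/99/2009) are left as they are.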
If you're using the MySQLdb (also known as "mysql-python") module, you can provide a datetime object instead of a string for any datetime or timestamp field. This is also the type that is returned, and it is the preferred way to provide the value.
For Python 2.5 and above, you can do:
from datetime import datetime
value = datetime.strptime(somestring, "%d/%m/%Y")
For older versions of Python, it's a bit more verbose, but not really a big issue.
import time
from datetime import datetime
timetuple = time.strptime(somestring, "%d/%m/%Y")
value = datetime(*timetuple[:6])
The various format-strings are taken directly from what's accepted by your C library. Look up man strptime on unix to find other acceptable format values. Not all of the time formats are portable, but most of the basic ones are.
Note datetime values can contain timezones. I do not believe MySQL knows exactly what to do with these, though. The datetimes I make above are usually considered as "naive" datetimes. If timezones are important, consider something like the pytz library.
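To show the difference between naive and timezone-aware values, here is a small sketch. It uses the standard library's datetime.timezone rather than pytz, and it assumes, purely for illustration, that the parsed value is meant to be UTC.

```python
from datetime import datetime, timezone

# strptime without a timezone directive yields a naive datetime.
naive = datetime.strptime("31/12/2009", "%d/%m/%Y")
print(naive.tzinfo)  # None

# Attaching UTC (an assumption about what the value means) makes it aware.
aware = naive.replace(tzinfo=timezone.utc)
print(aware.isoformat())  # 2009-12-31T00:00:00+00:00
```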