Parsing Python datetime from string with day-seconds - python

I'm trying to parse the date and time from a bunch of filenames that have one of these formats:
prefix.YYYY-MM-DD.suffix
prefix.YYYY-MM-DD_HH:MM:SS.suffix
prefix.YYYY-MM-DD-SSSSS.suffix
The datetime formats for these three are:
prefix.%Y-%m-%d.suffix
prefix.%Y-%m-%d_%H:%M:%S.suffix
prefix.%Y-%m-%d-%?????.suffix
The first two are easy to parse with the datetime module but I'm having trouble figuring out how to parse the 5-digit seconds which range from 00000 to 82800 (86400 seconds per day).
If at all possible, I'd like to use the standard datetime module as this needs to be extremely portable.
My goal is to have a function that can ingest multiple datetime formats, so I need to stay away from a one-off parser if possible.
from datetime import datetime
def myparser(filename, datetimeformat):
    # do some stuff - maybe as easy as
    datetimeobject = datetime.strptime(filename, datetimeformat)
    return datetimeobject
Any thoughts on how best to do this would be greatly appreciated.

For all of them, split off the date and parse it as a datetime.datetime, then parse the time portion into a datetime.timedelta and add it to the date. That should get you where you need to be.
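
As a minimal sketch of that idea for the day-seconds format only: the function name parse_day_seconds and the regular expression are illustrative assumptions, not part of the question, and the pattern assumes the date is always followed by a hyphen and exactly five digits.

```python
import re
from datetime import datetime, timedelta

# Hypothetical pattern: YYYY-MM-DD, a hyphen, then five digits of seconds since midnight.
_DAY_SECONDS = re.compile(r'(\d{4}-\d{2}-\d{2})-(\d{5})')

def parse_day_seconds(filename):
    """Return a datetime for names like prefix.YYYY-MM-DD-SSSSS.suffix."""
    match = _DAY_SECONDS.search(filename)
    if match is None:
        raise ValueError('no day-seconds timestamp in %r' % filename)
    date_part, seconds_part = match.groups()
    # Parse the calendar date with strptime, then add the seconds as a timedelta.
    day = datetime.strptime(date_part, '%Y-%m-%d')
    return day + timedelta(seconds=int(seconds_part))
```

For example, parse_day_seconds('prefix.2015-03-04-03600.suffix') gives 01:00:00 on 2015-03-04. The other two formats can still go straight through datetime.strptime.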

Related

Python dateutil.parser treating string ending with m as date

from dateutil.parser import parse
parse('2m')  # treated as a datetime
I'm testing a few terms like 2w, 2y, 2m and some other date formats received in a string, to check whether a given string contains a date in any format.
Everything works fine except '2m'.
It is treated as 2 minutes past the hour, but I don't want it to be treated as a date format.
Can anyone help me resolve this issue?

Is there an easy way to plot and manipulate time duration (hours/minutes/seconds) data in Python? NOT datetime data

I'm working with some video game speedrunning (basically, races where people try to beat a game as fast as they can) data, and I have many different run timings in HH:MM:SS format. I know it's possible to convert to seconds, but I want to keep in this format for the purposes of making the axes on any graphs easy to read.
I have all the data in a data frame already and tried converting the timing data to datetime format, with format = '%H:%M:%S', but it just uses this as the time on 1900-01-01.
import pandas as pd
data = [['Aggy', '01:02:32'], ['Kirby', '01:04:54'], ['Sally', '01:06:04']]
df = pd.DataFrame(data, columns=['Runner', 'Time'])
df['Time'] = pd.to_datetime(df['Time'], format='%H:%M:%S')
I thought specifying the format to be just hours/minutes/seconds would strip away any date, but when I print out the header of my dataframe, it says that the time data is now 1900-01-01 01:02:32, as an example. 1:02:32 AM on January 1st, 1900. I want Python to recognize the 1:02:32 as a duration of time, not a datetime format. What's the best way to go about this?
The format argument defines the format of the input string, not the format of the resulting datetime object.
For your needs, you can either use only the H:M:S part of the datetime or use the pandas to_timedelta function.
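
A minimal sketch of the to_timedelta route, reusing the question's sample data. The Seconds column is an illustrative addition, assuming a plain numeric value is handy for plotting.

```python
import pandas as pd

data = [['Aggy', '01:02:32'], ['Kirby', '01:04:54'], ['Sally', '01:06:04']]
df = pd.DataFrame(data, columns=['Runner', 'Time'])

# Parse the HH:MM:SS strings as durations instead of points in time.
df['Time'] = pd.to_timedelta(df['Time'])

# Hypothetical helper column: the same durations as a number of seconds.
df['Seconds'] = df['Time'].dt.total_seconds()
```

The Time column then holds durations, so no 1900-01-01 date is attached to the values.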

Parsing date which may or may not contain milliseconds

So this question is more about the best way to handle this sort of input in Python. Here is an example input date: 2018-12-31 23:59:59.999999. The milliseconds part may or may not be present in the input.
I am currently using this code to convert it to a datetime:
input_ts = datetime.datetime.strptime(input_str, '%Y-%m-%d %H:%M:%S.%f')
The problem is that it throws an exception if the input string doesn't contain the milliseconds part, e.g. 2018-12-31 23:59:59.
In Java, I could have approached this problem in two ways (this is a pseudo-explanation that ignores small boundary checks):
1. (Preferred.) Check the input string length. If it is 19 characters or fewer, it is missing the milliseconds, so append .000000 to it.
2. (Not preferred.) Let the main code parse the string; if it throws an exception, parse it again with a second format, i.e. %Y-%m-%d %H:%M:%S.
A third approach could be to just strip off the milliseconds.
I am not sure whether Python has anything built in to handle this kind of situation. Any suggestions?
You could use the python-dateutil library; it is smart enough to parse most of the basic date formats.
import dateutil.parser
dateutil.parser.parse('2018-12-31 23:59:59.999999')
dateutil.parser.parse('2018-12-31 23:59:59')
If you don't want to install any external libraries, you could iterate over a list of different formats and try each one with strptime.
from datetime import datetime  # import the datetime class from the datetime module
dt = datetime.now()  # get the current time
dt1 = dt.strftime('%Y-%m-%d %H:%M:%S')  # convert the datetime to a string
dt2 = datetime.strptime('2018/5/20', '%Y/%m/%d')  # convert a string to a datetime
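
Iterating over candidate formats could look roughly like this. It is a sketch using only the standard library; the name parse_timestamp and the FORMATS tuple are illustrative assumptions, and the formats assume a space between date and time as in the example input.

```python
from datetime import datetime

# Candidate formats, tried in order: with and without the fractional part.
FORMATS = ('%Y-%m-%d %H:%M:%S.%f', '%Y-%m-%d %H:%M:%S')

def parse_timestamp(text, formats=FORMATS):
    """Return the first successful strptime result, or raise ValueError."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError('no known format matches %r' % text)
```

Both parse_timestamp('2018-12-31 23:59:59.999999') and parse_timestamp('2018-12-31 23:59:59') then return a datetime. Supporting another layout only means adding an entry to the tuple.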

Save date time in filename with valid chars

To keep track of when my files were backed up, I want to name each backup with the datetime at which it was made. The names will eventually be retrieved and sorted using Python so that I can get the most recent file based on the datetime in the filename.
The problem is that the default datetime format can't be used in a filename like this:
2007-12-31 22:29:59
It can for example be saved like this:
2007-12-31 22-29-59
What is the best way to format the datetime so that I can easily sort by datetime on the name, and for bonus points, what is the python to show the datetime in that way.
You should have a look at the documentation of the Python time module: http://docs.python.org/2/library/time.html#module-time
If you go to the strftime() function, you will see that it accepts a string as input, which describes the format of the string you want to get as the return value.
Example (with hyphens between each date/time token):
>>> import time
>>> s = time.strftime('%Y-%m-%d-%H-%M-%S')
>>> print(s)
2012-12-08-14-55-44
The documentation contains a complete table of directives you can use to get different tokens.
What is the best way to format the datetime so that I can easily sort by datetime?
If you want to sort files by the datetime in their names, note that a representation ordered from the largest unit to the smallest (e.g. YYYYMMDDhhmmss), with zero-padded fixed-width fields, sorts lexicographically in the same order as it does chronologically.
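
A small sketch of that property. The backup_name helper and the 'backup_' prefix and '.tar' suffix are made up for illustration; only strftime, sorted, and max come from Python itself.

```python
from datetime import datetime

def backup_name(moment, prefix='backup_', suffix='.tar'):
    """Build a filename whose timestamp runs from the largest unit to the smallest."""
    return prefix + moment.strftime('%Y-%m-%d_%H-%M-%S') + suffix

names = [
    backup_name(datetime(2012, 12, 8, 14, 55, 44)),
    backup_name(datetime(2007, 12, 31, 22, 29, 59)),
    backup_name(datetime(2012, 1, 3, 9, 0, 0)),
]

# Plain string comparison is enough: no parsing is needed to order the names.
oldest_first = sorted(names)
newest = max(names)
```

Here newest is the 2012-12-08 backup and oldest_first starts with the 2007-12-31 one. This only holds while every name shares the same prefix and field widths.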

Convert DD/MM/YYYY HH:MM:SS into MySQL TIMESTAMP

I would like a simple way, in Python, to find text of the format 'DD/MM/YYYY' and reformat it into 'YYYY/MM/DD' so that it is compatible with MySQL TIMESTAMPs, in a list of text items that may or may not contain a date at all. (I'm thinking RegEx?)
Basically I am looking for a way to inspect a list of items and correct any timestamp formats found.
Great thing about standards is that there are so many to choose from....
You can read the string into a datetime object and then output it back as a string using a different format. For example:
>>> from datetime import datetime
>>> datetime.strptime("31/12/2009", "%d/%m/%Y").strftime("%Y/%m/%d")
'2009/12/31'
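
To apply the same round trip across a list of items that may or may not contain a date, one option is to pair it with a regular expression, as the question hints. This is only a sketch: the names fix_dates and _swap and the pattern are illustrative assumptions, and the pattern expects two-digit days and months.

```python
import re
from datetime import datetime

# Hypothetical pattern for DD/MM/YYYY with two-digit day and month.
_DATE = re.compile(r'\b(\d{2}/\d{2}/\d{4})\b')

def _swap(match):
    """Reformat one match, leaving it untouched if it is not a real date."""
    text = match.group(1)
    try:
        return datetime.strptime(text, '%d/%m/%Y').strftime('%Y/%m/%d')
    except ValueError:
        return text

def fix_dates(items):
    """Return a new list with every DD/MM/YYYY occurrence rewritten as YYYY/MM/DD."""
    return [_DATE.sub(_swap, item) for item in items]
```

For instance, fix_dates(['31/12/2009 10:00:00', 'no date here']) leaves the second item alone and rewrites the date in the first.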
Basically I am looking for a way to inspect a list of items and correct any timestamp formats found.
If the input format is inconsistent or can vary, then you are better off with dateutil.
>>> from dateutil.parser import parse
>>> parse("31/12/2009").strftime("%Y/%m/%d")
'2009/12/31'
dateutil can handle a lot of input formats automatically. To operate on a list, you can map a wrapper around the parse function over the list and convert the values appropriately.
If you're using the MySQLdb (also known as "mysql-python") module, you can provide a datetime object instead of a string for any datetime or timestamp field. This is also the type that is returned, and it is the preferred way to provide the value.
For Python 2.5 and above, you can do:
from datetime import datetime
value = datetime.strptime(somestring, "%d/%m/%Y")
For older versions of Python, it's a bit more verbose, but not really a big issue.
import time
from datetime import datetime
timetuple = time.strptime(somestring, "%d/%m/%Y")
value = datetime(*timetuple[:6])
The various format-strings are taken directly from what's accepted by your C library. Look up man strptime on unix to find other acceptable format values. Not all of the time formats are portable, but most of the basic ones are.
Note datetime values can contain timezones. I do not believe MySQL knows exactly what to do with these, though. The datetimes I make above are usually considered as "naive" datetimes. If timezones are important, consider something like the pytz library.
