I have a Dataframe in python, with the data coming from a csv.
In the column "Date" I have a date (:)) but I don't know the date format. How can I detect it?
e.g.: I can have 05/05/2022. This can be M/D/Y or D/M/Y. I can work it out manually by looking at other entries, but I would like to do it automatically.
Is there a way to do so?
thank you
datetime.strptime requires you to know the format.
Trying try/except blocks isn't good, since there are so many different formats I could receive.
it would be nice to have something that recognizes the format...
Update:
Thank you for the first answers, but the output I would like to have is THE FORMAT of the date that is used in the column.
Knowing also the fact that the format is unique within each column
You can try the dateutil library.
To deal with dates and also with the diversity of timezones people often use external libraries such as pytz or dateutil.
dateutil has a very powerful parser.
from dateutil.parser import parse
parse('05/05/2022') # datetime.datetime(2022, 5, 5, 0, 0)
parse('2022-05-05') # datetime.datetime(2022, 5, 5, 0, 0)
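Note that parse returns the parsed value, not the format string the question asks for. One possible way to get at the format, as a minimal sketch using only the standard library: test a list of candidate formats against every value in the column and keep those that parse all of them. The CANDIDATE_FORMATS list and the matching_formats name are assumptions made for illustration, not part of any library.

```python
from datetime import datetime

# Hypothetical candidate list; extend it with the formats your data may use.
CANDIDATE_FORMATS = ["%m/%d/%Y", "%d/%m/%Y", "%Y-%m-%d"]

def matching_formats(values, candidates=CANDIDATE_FORMATS):
    """Return the candidate formats that parse every value in `values`."""
    matches = []
    for fmt in candidates:
        try:
            for value in values:
                datetime.strptime(value, fmt)
        except ValueError:
            continue  # at least one value does not fit this format
        matches.append(fmt)
    return matches

column = ["05/05/2022", "13/05/2022", "01/06/2022"]
print(matching_formats(column))  # ['%d/%m/%Y']
```

With a DataFrame this could be called as matching_formats(df["Date"].astype(str)). If more than one format comes back (for example when the column only contains 05/05/2022), the data itself is ambiguous and no method can decide without extra knowledge.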
Use the isinstance built-in function to check if a variable is a datetime object in Python, e.g. if isinstance(today, datetime). The isinstance function returns True if the passed in object is an instance or a subclass of the passed in class.
check out here
https://www.folkstalk.com/2022/10/python-check-if-string-is-date-format-with-code-examples.html
I'm learning Pandas and I have a problem trying to change a column's dtype from object to datetime.
When I use 'to_datetime' the date I get in return is in ISO format, and I just want DD/MM/YYYY (13/10/1960). Am I doing something wrong? Thanks a lot!!
At a glance, it doesn't seem like the code uses the right format.
The to_datetime() with the format argument follows strftime standards per the documentation here, and it's a way to tell the method how the original time was represented (so it can properly parse it into a datetime object). An example can be seen from this Stack Overflow question.
Simple example:
import pandas as pd
datetime_object = pd.to_datetime('13/10/1960', format='%d/%m/%Y')
The next problem is how you want that datetime object to be printed out (i.e. DD/MM/YYYY). Just throwing thoughts out there (would comment, but I don't have those privileges yet), if you want to print the string, you can cast that datetime object into the string that you want. Many ways to do this, one of which is to use strftime().
Simple example:
date_as_string = datetime_object.strftime('%d/%m/%Y')
But of course, why would you use a datetime object in that case. So the other option I can think of is to override how the datetime object is printed (redefining __str__ in a new class of datetime).
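A minimal sketch of that second option, assuming a plain subclass of the standard datetime class is acceptable. The class name DayFirstDatetime is made up for illustration, and pandas will not use such a class for its own datetime columns.

```python
from datetime import datetime

class DayFirstDatetime(datetime):
    """Hypothetical datetime subclass that prints as DD/MM/YYYY."""

    def __str__(self):
        return self.strftime("%d/%m/%Y")

born = DayFirstDatetime(1960, 10, 13)
print(born)        # uses the overridden __str__: 13/10/1960
print(repr(born))  # repr is not overridden and still shows every field
```

In most cases calling strftime() at the point of display is simpler than carrying a custom class around.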
I have a function that consumes a datetime string that is returned from a DB query. Right now the query returns a datetime object.
What I am looking for is what would be the preferred way to create my datetime string. I have not done any performance profiling yet, just looking for previous experiences from people.
It depends.
Normally, the database is just a repository for data; it is not a formatting engine. This implies that you should expect to get strings like "2019-06-24 13:47:24" or numbers like 1561409293 and you deal with them from there.
However, it is often more straightforward to simply call DATE_FORMAT() in your SELECT statement. This is especially handy when the SELECT can generate the entire 'report' without further manipulation.
Another way to decide... Which approach requires fewer keystrokes on your part? Or has the least chance of programming errors? Or...
You say "consumes a datetime string that is returned from a DB query" -- but what will it do with it? If it will be manipulating it in more than one way, then a client "object" sounds like the better approach. If you will simply display the datetime, then DATE_FORMAT() may be better.
There is no noticeable performance difference.
If you have a datetime object, you could just keep it around in your code as a datetime object, extracting whatever information you need from it. Then when you really need the actual string, use strftime to format it in the way you want.
>>> from datetime import datetime
>>> t = datetime.now()
>>> t
datetime.datetime(2019, 6, 24, 14, 23, 45, 835379)
>>> print(t.month)
6
>>> print(t.second)
45
>>> as_string = t.strftime("%B %d, %Y")
>>> print(as_string)
June 24, 2019
>>> as_another_string = t.strftime("%Y-%b-%d %H:%M")
>>> print(as_another_string)
2019-Jun-24 14:23
This page shows you the sorts of format codes you can call upon, in order to extract whichever date/time information you want to display in your string:
My website's articles are written using .md files, to get the created and modified times of these files I use the os.path.getctime() and os.path.getmtime() methods.
The output of these methods look like this:
1553541590.723329
1553541590.723329
While HTML requires this format:
2001-09-17T05:59:00+01:00
2013-09-16T19:08:47+01:00
I have two questions regarding this matter:
What are the names of these two time formats?
How do I change the output of those methods to look like the required HTML format?
Thanks.
1) The os.path documentation indicates that both os.path.getctime() and os.path.getmtime() return a float indicating seconds since epoch. That seems consistent with the numbers you are getting.
2) The easiest thing to do would be to convert to an object to represent a date and then provide your desired format. Here, I used datetime with strftime() to output a string of desired format.
>>> import datetime
>>> datetime.datetime.fromtimestamp(1553541590.723329)
datetime.datetime(2019, 3, 25, 12, 19, 50, 723329)
>>> datetime.datetime.fromtimestamp(1553541590.723329).strftime('%Y-%m-%dT%H:%M:%S')
'2019-03-25T12:19:50'
You may find it easiest to just add the time zone string on the end since adding a timezone to a datetime object is a little involved. If you do want to go through with it, you need to create a tzinfo object and use it to update the datetime object using datetime.astimezone(tz). Here's a pretty good resource for adding a timezone to a datetime object.
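If the offset is known up front, the standard library's timezone class may be enough. The following is a sketch under that assumption: the helper name to_html_datetime and the fixed +01:00 offset are illustrative only, and a fixed offset does not follow daylight saving changes.

```python
from datetime import datetime, timedelta, timezone

def to_html_datetime(timestamp, tz=timezone.utc):
    """Format seconds since the epoch as an ISO 8601 string with a UTC offset."""
    aware = datetime.fromtimestamp(timestamp, tz=tz)
    # timespec="seconds" drops the microseconds from the output
    return aware.isoformat(timespec="seconds")

plus_one = timezone(timedelta(hours=1))  # assumed fixed offset of +01:00
print(to_html_datetime(1553541590.723329))            # 2019-03-25T19:19:50+00:00
print(to_html_datetime(1553541590.723329, plus_one))  # 2019-03-25T20:19:50+01:00
```

The same function could be fed os.path.getmtime(path) directly, since that call returns the same kind of float.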
How do I convert these strings to datetime objects in Python or Django?
2010-08-17T19:00:00Z
2010-08-17T18:30:00Z
2010-08-17T17:05:00Z
2010-08-17T14:30:00Z
2010-08-10T22:20:00Z
2010-08-10T21:20:00Z
2010-08-10T20:25:00Z
2010-08-10T19:30:00Z
2010-08-10T19:00:00Z
2010-08-10T18:30:00Z
2010-08-10T17:30:00Z
2010-08-10T17:05:00Z
2010-08-10T17:05:00Z
2010-08-10T15:30:00Z
2010-08-10T14:30:00Z
When I do this: datestr = datetime.strptime(datetime, "%Y-%m-%dT%H:%M:%S")
it tells me that unconverted data remains: Z
You can parse the strings as-is without the need to slice if you don't mind using the handy dateutil module. For example:
>>> from dateutil.parser import parse
>>> s = "2010-08-17T19:00:00Z"
>>> parse(s)
datetime.datetime(2010, 8, 17, 19, 0, tzinfo=tzutc())
>>>
Use slicing to remove "Z" before supplying the string for conversion
datestr=datetime.strptime( datetime[:-1], "%Y-%m-%dT%H:%M:%S" )
>>> test = "2010-08-17T19:00:00Z"
>>> test[:-1]
'2010-08-17T19:00:00'
Those seem to be ISO 8601 dates. If your timezone is always the same, just remove the last letter before parsing it with strptime (e.g. by slicing).
The Z indicates the timezone, so be sure that you are taking that into account when converting it to a datetime of a different timezone. If the timezone can change in your application, you'll have to parse that information also and change the datetime object accordingly.
You could also use the pyiso8601 module to parse these ISO dates; it will most likely also work with slightly different ISO date formats. If your data may contain different timezones, I would suggest using this module.
Change your format string to "%Y-%m-%dT%H:%M:%SZ" so that it includes the trailing Z (which makes it no longer unconverted). Note, however, that this Z is probably there to indicate that the time is in UTC, which might be something you need to account for elsewhere.
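Matching the Z literally gives a naive datetime, so the UTC information is lost unless it is attached afterwards. A minimal sketch of doing both steps with the standard library, assuming every input really ends in Z (the helper name parse_utc is made up for illustration):

```python
from datetime import datetime, timezone

def parse_utc(text):
    """Parse 'YYYY-MM-DDTHH:MM:SSZ' and mark the result as UTC."""
    naive = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return naive.replace(tzinfo=timezone.utc)

stamp = parse_utc("2010-08-17T19:00:00Z")
print(stamp.isoformat())  # 2010-08-17T19:00:00+00:00
```

Strings with any other offset would raise ValueError here, which is where dateutil or an ISO 8601 parser becomes the better choice.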
I would like a simple way to find and reformat text of the format 'DD/MM/YYYY' into 'YYYY/MM/DD' to be compatible with MySQL TIMESTAMPs, in a list of text items that may or may not contain a date at all, in Python. (I'm thinking RegEx?)
Basically I am looking for a way to inspect a list of items and correct any timestamp formats found.
Great thing about standards is that there are so many to choose from....
You can read the string into a datetime object and then output it back as a string using a different format. For example:
>>> from datetime import datetime
>>> datetime.strptime("31/12/2009", "%d/%m/%Y").strftime("%Y/%m/%d")
'2009/12/31'
Basically I am looking for a way to inspect a list of items and correct any timestamp formats found.
If the input format is inconsistent, can vary, then you are better off with dateutil.
>>> from dateutil.parser import parse
>>> parse("31/12/2009").strftime("%Y/%m/%d")
'2009/12/31'
Dateutil can handle a lot of input formats automatically. To operate on a list, you can map a wrapper around the parse function over the list and convert the values appropriately.
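For the list case in the question, where items may contain other text or no date at all, a regex combined with strptime is another option. This is a sketch using only the standard library rather than dateutil, and it assumes dates always appear as two digits, two digits, four digits. The names DATE_PATTERN and fix_dates are made up for illustration.

```python
import re
from datetime import datetime

# Matches candidates shaped like DD/MM/YYYY; strptime then validates them.
DATE_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")

def _reformat(match):
    """Rewrite one DD/MM/YYYY match as YYYY/MM/DD, leaving invalid dates alone."""
    try:
        parsed = datetime.strptime(match.group(0), "%d/%m/%Y")
    except ValueError:
        return match.group(0)  # looks like a date but is not a real one
    return parsed.strftime("%Y/%m/%d")

def fix_dates(items):
    """Return a new list with every valid DD/MM/YYYY date rewritten."""
    return [DATE_PATTERN.sub(_reformat, item) for item in items]

items = ["due 31/12/2009", "no date here", "99/99/2009"]
print(fix_dates(items))  # ['due 2009/12/31', 'no date here', '99/99/2009']
```

Items without a date pass through unchanged, so the function can be applied to the whole list without filtering first.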
If you're using the MySQLdb (also known as "mysql-python") module, for any datetime or timestamp field you can provide a datetime object instead of a string. This is also the type that is returned, and it is the preferred way to provide the value.
For Python 2.5 and above, you can do:
from datetime import datetime
value = datetime.strptime(somestring, "%d/%m/%Y")
For older versions of python, it's a bit more verbose, but not really a big issue.
import time
from datetime import datetime
timetuple = time.strptime(somestring, "%d/%m/%Y")
value = datetime(*timetuple[:6])
The various format-strings are taken directly from what's accepted by your C library. Look up man strptime on unix to find other acceptable format values. Not all of the time formats are portable, but most of the basic ones are.
Note datetime values can contain timezones. I do not believe MySQL knows exactly what to do with these, though. The datetimes I make above are usually considered as "naive" datetimes. If timezones are important, consider something like the pytz library.