I am using pandas to read and process a CSV file. The file has a date/time column that looks like this:
11:59:50:322 02 10 2015 -0400 EDT
11:11:55:051 16 10 2015 -0400 EDT
00:38:37:106 02 11 2015 -0500 EST
04:15:51:600 14 11 2015 -0500 EST
04:15:51:600 14 11 2015 -0500 EST
13:43:28:540 28 11 2015 -0500 EST
09:24:12:723 14 12 2015 -0500 EST
13:28:12:346 28 12 2015 -0500 EST
How can I read this using Python/pandas? So far, what I have is this:
pd.to_datetime(pd.Series(df['senseStartTime']),format='%H:%M:%S:%f %d %m %Y %z %Z')
But this is not working, though previously I was able to use the same code for another format (with a different format specifier). Any suggestions?
The issue you're having is likely that versions of Python before 3.2 (I think?) had a lot of trouble with time zones, so your format string is probably failing on the %z and %Z parts. For example, in Python 2.7:
In [187]: import datetime
In [188]: datetime.datetime.strptime('11:59:50:322 02 10 2015 -0400 EDT', '%H:%M:%S:%f %d %m %Y %z %Z')
ValueError: 'z' is a bad directive in format '%H:%M:%S:%f %d %m %Y %z %Z'
You're using pd.to_datetime instead of datetime.datetime.strptime, but the underlying issue is the same; you can refer to this thread for help. What I would suggest, instead of using pd.to_datetime, is something like:
In [191]: import dateutil.parser
In [192]: dateutil.parser.parse('11:59:50.322 02 10 2015 -0400')
Out[192]: datetime.datetime(2015, 2, 10, 11, 59, 50, 322000, tzinfo=tzoffset(None, -14400))
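It should be pretty simple to chop off the timezone abbreviation at the end (which is redundant since you have the offset) and change the ":" to "." between the seconds and the milliseconds. Note that the call above reads "02 10" as February 10; your data is day-first, so pass dayfirst=True to parse() to get October 2.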
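The cleanup described above can be sketched with the standard library alone. This is a minimal sketch that assumes Python 3 (where strptime understands %z); the helper name parse_sense_time is hypothetical. With strptime the colon before the milliseconds can stay, because the format string names it explicitly, so only the trailing abbreviation is removed.

```python
from datetime import datetime

def parse_sense_time(text):
    """Parse e.g. '11:59:50:322 02 10 2015 -0400 EDT' into an aware datetime."""
    # Drop the trailing zone abbreviation (EDT/EST); the numeric offset
    # already carries the same information.
    without_abbrev = text.rsplit(' ', 1)[0]
    # %f accepts the three-digit millisecond field; %z reads the -0400 offset.
    return datetime.strptime(without_abbrev, '%H:%M:%S:%f %d %m %Y %z')

parsed = parse_sense_time('11:59:50:322 02 10 2015 -0400 EDT')
print(parsed.isoformat())  # 2015-10-02T11:59:50.322000-04:00
```

The format string puts %d before %m, so the day-first order of the data is handled explicitly rather than guessed.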
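Since datetime.timezone became available in Python 3.2, you can use %z with .strptime() (see docs). Be aware that %Z only accepts UTC, GMT, and the zone names of the machine's own local time zone, so EDT/EST parse only where those are the local abbreviations. Starting with: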
from datetime import datetime  # pd.datetime was deprecated in pandas 1.0 and later removed
dateparse = lambda x: datetime.strptime(x, '%H:%M:%S:%f %d %m %Y %z %Z')
df = pd.read_csv(path, parse_dates=['time_col'], date_parser=dateparse)
to get:
time_col
0 2015-10-02 11:59:50.322000-04:00
1 2015-10-16 11:11:55.051000-04:00
2 2015-11-02 00:38:37.106000-05:00
3 2015-11-14 04:15:51.600000-05:00
4 2015-11-14 04:15:51.600000-05:00
5 2015-11-28 13:43:28.540000-05:00
6 2015-12-14 09:24:12.723000-05:00
7 2015-12-28 13:28:12.346000-05:00
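Because %Z in strptime depends on the machine's local zone names, a variant that drops the abbreviation and lets pandas parse the whole column in one call may be more portable. The following is only a sketch under that assumption; it uses two sample values and the column name senseStartTime from the question, and the column name parsed is made up for illustration.

```python
import pandas as pd

# Two sample values from the question.
df = pd.DataFrame({'senseStartTime': [
    '11:59:50:322 02 10 2015 -0400 EDT',
    '00:38:37:106 02 11 2015 -0500 EST',
]})

# Remove the trailing abbreviation, keeping the numeric offset.
trimmed = df['senseStartTime'].str.replace(r'\s+[A-Z]+$', '', regex=True)

# utc=True converts the mixed -0400/-0500 offsets to one UTC-based dtype.
df['parsed'] = pd.to_datetime(trimmed, format='%H:%M:%S:%f %d %m %Y %z', utc=True)
print(df['parsed'])
```

The values come back in UTC; if the original wall-clock times are needed, .dt.tz_convert can shift the column to a named zone afterwards.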
I am using the time module to convert an epoch timestamp into a human-readable date with the code below.
import time
datetime = time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.localtime(1609740000000))
print(datetime)
>>> Thu, 17 Aug 52980 20:00:00 +0000
The output is incorrect when I check it on https://www.epochconverter.com
Correct output should be Wed, 04 Aug 2021 21:49:24 +0000
time.localtime takes a time in seconds. You are presumably passing a time in milliseconds, so divide by 1000 first:
datetime = time.strftime("%a, %d %b %Y %H:%M:%S +0000",
time.localtime(1609740000000 // 1000))
#'Mon, 04 Jan 2021 01:00:00 +0000'
The answer from epochconverter.com is the same. Your "correct output" is incorrect.
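One caveat about the snippet above: time.localtime returns the machine's local time, so the literal "+0000" in the format string is only accurate on a machine running on UTC (the 01:00:00 shown appears to be a local time). A minimal sketch that keeps the label honest by using time.gmtime instead:

```python
import time

millis = 1609740000000      # the value from the question, in milliseconds
seconds = millis // 1000    # time.gmtime expects seconds

# gmtime returns UTC, so the literal "+0000" in the format is accurate.
stamp = time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.gmtime(seconds))
print(stamp)  # Mon, 04 Jan 2021 06:00:00 +0000
```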
I have a DataFrame with a column containing seconds. I would like to convert the column to date and time and save the file with a column containing the date and time. The column looks like this, in seconds:
time
2384798300
1500353475
7006557825
1239779541
1237529231
I was able to do it, but only by typing in the number of seconds that I want to convert, with the following code:
datetime.fromtimestamp(1238479969).strftime("%A, %B %d, %Y %I:%M:%S")
Output: 'Tuesday, March 31, 2009 06:12:49'
What I want is to convert the whole column. I tried this:
datetime.fromtimestamp(df['time']).strftime("%A, %B %d, %Y %I:%M:%S")
but it does not work. Any help with how I can do it would be appreciated.
Use apply on the column:
In [200]: from datetime import datetime
In [203]: df['time'] = df['time'].apply(lambda x: datetime.fromtimestamp(x).strftime("%A, %B %d, %Y %I:%M:%S"))
In [204]: df
Out[204]:
time
0 Friday, July 28, 2045 01:28:20
1 Tuesday, July 18, 2017 10:21:15
2 Wednesday, January 11, 2192 03:33:45
3 Wednesday, April 15, 2009 12:42:21
4 Friday, March 20, 2009 11:37:11
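As an alternative to calling a Python function per row, pandas can convert the whole column at once. This is a sketch, not a drop-in replacement: pd.to_datetime with unit='s' interprets the integers as seconds since the epoch in UTC, whereas datetime.fromtimestamp uses the machine's local time zone, so the hours can differ from the output above. The column name time_text is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'time': [2384798300, 1500353475, 7006557825,
                            1239779541, 1237529231]})

# unit='s' tells pandas the integers are seconds since the epoch (UTC).
converted = pd.to_datetime(df['time'], unit='s')

# .dt.strftime formats the whole column in one call.
df['time_text'] = converted.dt.strftime("%A, %B %d, %Y %I:%M:%S")
print(df['time_text'].iloc[1])  # Tuesday, July 18, 2017 04:51:15
```

The resulting frame can then be written out with df.to_csv as usual.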
I am trying to convert a German datetime string that comes from an MS Project Excel export:
02 Februar 2022 17:00
I read it from the Excel export of MS Project into a pandas DataFrame.
When I convert it with
to_datetime(df["Anfang"], format= '%d %B %Y %H:%M').dt.date
I get the error
ValueError: time data '07 Januar 2019 07:00' does not match format '%d %B %Y %H:%M' (match)
from https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
%B Month as locale’s full name. September
What am I doing wrong here?
Do I have to check some locale settings?
I am using German (Switzerland):
import locale
locale.getdefaultlocale()
('de_CH', 'cp1252')
df in:
0 10 April 2019 08:00
1 07 Januar 2019 07:00
2 07 Januar 2019 07:00
3 07 Januar 2019 07:00
4 09 Oktober 2019 17:00
5 04 Dezember 2020 17:00
Name: Anfang, dtype: object
df out (wanted):
0 10-04-2019
1 07-01-2019
.
.
EDIT:
I changed my locale to ('de_DE', 'cp1252'), but I get the same error.
SOLVED:
By using matJ's answer, I got the error that "Die 15.06.21" was not matching the format, which led me to investigate the data. There I found two different date formats (Thanks, Microsoft!). After cleaning, the above code worked well!!!
So the error message from to_datetime wasn't as precise as the one from datetime.strptime.
Thanks for helping.
Johannes
One possible solution is to use the dateparser module:
import dateparser
df['Anfang'] = df['Anfang'].apply(dateparser.parse)
print (df)
Anfang
0 2019-04-10 08:00:00
1 2019-01-07 07:00:00
2 2019-01-07 07:00:00
3 2019-01-07 07:00:00
4 2019-10-09 17:00:00
5 2020-12-04 17:00:00
To keep only the date part, add .dt.date:
import dateparser
df['Anfang'] = df['Anfang'].apply(dateparser.parse).dt.date
print (df)
Anfang
0 2019-04-10
1 2019-01-07
2 2019-01-07
3 2019-01-07
4 2019-10-09
5 2020-12-04
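I'd set the locale explicitly with locale.setlocale. locale.getdefaultlocale() only reports the system default; it does not change how strptime interprets %B. Then your code should work.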
The following works for me:
import locale
from datetime import datetime
locale.setlocale(locale.LC_ALL, 'de_DE') # changing locale to German
datetime.strptime('07 Januar 2019 07:00', '%d %B %Y %H:%M') # returns a datetime obj which you can format as you like
Let me know if that works for you as well.
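If changing the process-wide locale is not desirable, or the de_DE locale is not installed, a locale-independent sketch is to map the German month names to numbers before parsing. This swaps in a different technique from the answer above; the lookup table GERMAN_MONTHS and the helper parse_german are hypothetical names. Returning None for strings of another shape also makes it easy to list the stray values (such as "Die 15.06.21") that the question ran into.

```python
from datetime import datetime

# Hypothetical lookup table: German month names mapped to month numbers.
GERMAN_MONTHS = {
    'Januar': 1, 'Februar': 2, 'März': 3, 'April': 4, 'Mai': 5, 'Juni': 6,
    'Juli': 7, 'August': 8, 'September': 9, 'Oktober': 10,
    'November': 11, 'Dezember': 12,
}

def parse_german(text):
    """Return a datetime for 'DD Monat YYYY HH:MM', or None for other shapes."""
    parts = text.split()
    if len(parts) != 4 or parts[1] not in GERMAN_MONTHS:
        return None
    numeric = f"{parts[0]} {GERMAN_MONTHS[parts[1]]} {parts[2]} {parts[3]}"
    try:
        return datetime.strptime(numeric, '%d %m %Y %H:%M')
    except ValueError:
        return None

values = ['10 April 2019 08:00', '07 Januar 2019 07:00', 'Die 15.06.21']
parsed = [parse_german(v) for v in values]
unmatched = [v for v, p in zip(values, parsed) if p is None]

print(parsed[1].strftime('%d-%m-%Y'))  # 07-01-2019
print(unmatched)                       # ['Die 15.06.21']
```

On a DataFrame the same helper could be applied with df['Anfang'].apply(parse_german).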
I have UTC timestamps in a file, generated with the bash date command, like the following:
'Sat Mar 15 01:30:01 UTC 2014'
'Sat Mar 15 01:30:16 UTC 2014'
'Sat Mar 15 02:00:01 UTC 2014'
'Sat Mar 15 02:00:12 UTC 2014'
I need to transform the timestamps to the local time of different regions. For example, convert the first entry above to Hong Kong time, the second to Singapore time, and so on. I can generate the offsets which can be added to get the local time. So my offsets look like the following:
2:00
-5:00
.
.
One possible approach may be to parse the date using Python and then add/subtract the offset.
However, I would prefer to do this in Bash with date if possible. I've tried to increment/decrement the date like this:
date -d 'Sat Mar 15 01:30:01 UTC 2014 2 hours'
However, the above converts the specified date to my system's time zone and adds 2 hours, whereas I need to achieve this for a particular target time zone, without having to specify offsets manually.
I would avoid the offset approach and specify your target timezone directly. For example to convert a date to Hong Kong time using GNU date:
$ TZ='Asia/Hong_Kong' date -d 'Sat Mar 15 01:30:01 UTC 2014'
Sat Mar 15 09:30:01 HKT 2014
TZ is the time zone variable. The -d option to date tells it to read the time from the specified string.
If you don't specify TZ, you will get your computer's default time zone:
$ date -d 'Sat Mar 15 01:30:01 UTC 2014'
Fri Mar 14 18:30:01 PDT 2014
A list of timezones by country is here.
This approach does not work with the macOS (BSD) version of date, where -d does something else.
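Where GNU date is not available, the same "name the target zone instead of an offset" idea can be sketched in Python. This assumes Python 3.9+ for the zoneinfo module and a tz database on the system (or the tzdata package); the helper name to_zone is hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

FMT = '%a %b %d %H:%M:%S %Z %Y'

def to_zone(utc_text, zone_name):
    """Convert a `date`-style UTC string to the named zone, same layout."""
    # strptime accepts the literal "UTC" for %Z but returns a naive
    # datetime, so UTC is attached explicitly before converting.
    utc_dt = datetime.strptime(utc_text, FMT).replace(tzinfo=timezone.utc)
    return utc_dt.astimezone(ZoneInfo(zone_name)).strftime(FMT)

print(to_zone('Sat Mar 15 01:30:01 UTC 2014', 'Asia/Hong_Kong'))
# Sat Mar 15 09:30:01 HKT 2014
```

Because the zone is looked up by name, daylight-saving changes are handled by the tz database rather than by hand-maintained offsets.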
For Python
In [1]: import datetime
In [2]: utcstring = 'Sat Mar 15 01:30:01 UTC 2014'
In [3]: offsetstring = '2:00'
Now parsing the two strings
In [4]: utc = datetime.datetime.strptime(utcstring, '%a %b %d %H:%M:%S %Z %Y')
In [5]: offset = datetime.datetime.strptime(offsetstring, '%H:%M')
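The delta is computed with timedelta; we can add it to or subtract it from a datetime. Note that strptime with '%H:%M' cannot parse a negative offset such as '-5:00', so the sign has to be handled separately.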
In [6]: delta = datetime.timedelta(hours=offset.hour, minutes=offset.minute)
Let's check the result.
In [7]: utc + delta
Out[7]: datetime.datetime(2014, 3, 15, 3, 30, 1)
This can be converted back to a string with
In [9]: datetime.datetime.strftime(utc + delta, '%a %b %d %H:%M:%S %Y')
Out[9]: 'Sat Mar 15 03:30:01 2014'
For more detail, see: https://docs.python.org/2.7/library/datetime.html
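The offsets in the question include negative values such as -5:00, which strptime with '%H:%M' rejects. A small sketch of a signed parser that returns a timedelta directly; the helper name parse_offset is hypothetical.

```python
from datetime import datetime, timedelta

def parse_offset(text):
    """Turn '2:00', '+2:00' or '-5:00' into a signed timedelta."""
    sign = -1 if text.startswith('-') else 1
    hours, minutes = text.lstrip('+-').split(':')
    return sign * timedelta(hours=int(hours), minutes=int(minutes))

utc = datetime.strptime('Sat Mar 15 01:30:01 UTC 2014', '%a %b %d %H:%M:%S %Z %Y')

print(utc + parse_offset('2:00'))   # 2014-03-15 03:30:01
print(utc + parse_offset('-5:00'))  # 2014-03-14 20:30:01
```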
I have a lot of date strings like Mon, 16 Aug 2010 24:00:00 and some of them are in 00-23 hour format and some of them in 01-24 hour format. I want to get a list of date objects of them, but when I try to transform the example string into a date object, I have to transform it from Mon, 16 Aug 2010 24:00:00 to Tue, 17 Aug 2010 00:00:00. What is the easiest way?
import email.utils as eutils
import time
import datetime
ntuple=eutils.parsedate('Mon, 16 Aug 2010 24:00:00')
print(ntuple)
# (2010, 8, 16, 24, 0, 0, 0, 1, -1)
timestamp=time.mktime(ntuple)
print(timestamp)
# 1282017600.0
date=datetime.datetime.fromtimestamp(timestamp)
print(date)
# 2010-08-17 00:00:00
print(date.strftime('%a, %d %b %Y %H:%M:%S'))
# Tue, 17 Aug 2010 00:00:00
Since you say you have a lot of these to fix, you should define a function:
def standardize_date(date_str):
    ntuple=eutils.parsedate(date_str)
    timestamp=time.mktime(ntuple)
    date=datetime.datetime.fromtimestamp(timestamp)
    return date.strftime('%a, %d %b %Y %H:%M:%S')
print(standardize_date('Mon, 16 Aug 2010 24:00:00'))
# Tue, 17 Aug 2010 00:00:00
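The approach above goes through time.mktime, which interprets the tuple in the machine's local time zone. As an alternative sketch that stays entirely within datetime arithmetic, an hour of 24 can be rewritten to 00 and one day added afterwards. The helper name normalize_hour_24 is hypothetical, and the sketch assumes every string has exactly the layout shown in the question.

```python
from datetime import datetime, timedelta

FMT = '%a, %d %b %Y %H:%M:%S'

def normalize_hour_24(date_str):
    """Rewrite an hour of 24 as 00 on the following day."""
    # Only the hour field is preceded by a space and followed by a colon.
    rolled_over = ' 24:' in date_str
    if rolled_over:
        date_str = date_str.replace(' 24:', ' 00:', 1)
    # strptime does not cross-check the weekday name against the date,
    # so the original 'Mon' is harmless; strftime recomputes it below.
    parsed = datetime.strptime(date_str, FMT)
    if rolled_over:
        parsed += timedelta(days=1)
    return parsed.strftime(FMT)

print(normalize_hour_24('Mon, 16 Aug 2010 24:00:00'))
# Tue, 17 Aug 2010 00:00:00
```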