Calculate with different datetime formats, datetime and datetime64 - python

I am trying to calculate the days between two dates:
First Date:
date = datetime.datetime.today() # {datetime} 2018-09-17 14:42:06.506541
Second date, extracted from a data frame:
date2 = data_audit.loc[(data_audit.Audit == audit), 'Erledigungsdatum'].values[0]
# {datetime64} 2018-07-23T00:00:00.000000000
The error:
ufunc subtract cannot use operands with types dtype('O') and dtype('M8[ns]')
My next try was:
date = np.datetime64(datetime.datetime.now()) # {datetime64} 2018-09-17T14:48:16.599541
Which resulted in the following error (I pass the date as a parameter in a function):
ufunc 'bitwise_and' not supported for the input types, and the inputs
could not be safely coerced to any supported types according to the
casting rule ''safe''
How should I approach this problem? The second one seems more logical to me, but I don't understand why I can't pass a simple date to a function.

I believe something like this should work for you:
import datetime
import numpy as np
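# earlier is a datetime64 with microsecond precision
earlier = np.datetime64(datetime.datetime.today())
# Convert datetime64 back to datetime. This works for microsecond or coarser
# precision; a datetime64[ns] value would come back as a plain integer instead.
earlier = earlier.astype(datetime.datetime)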
now = datetime.datetime.today()
print(now-earlier)
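Since the value taken from the data frame in the question is a datetime64[ns], another option is to bring both dates to datetime64 and subtract there. This is only a minimal sketch: days_between is a made-up helper name, and the sample values are copied from the question.

```python
import datetime

import numpy as np


def days_between(later, earlier64):
    """Whole days from a numpy datetime64 value up to a datetime.datetime."""
    # Bring both operands to the same type and precision before subtracting.
    later64 = np.datetime64(later, "ns")
    delta = later64 - earlier64.astype("datetime64[ns]")
    # Dividing two timedelta64 values gives a float number of days.
    return int(delta / np.timedelta64(1, "D"))


date = datetime.datetime(2018, 9, 17, 14, 42, 6)
date2 = np.datetime64("2018-07-23T00:00:00.000000000")
print(days_between(date, date2))
```

Converting the other way (datetime64 to datetime) also works, but then the nanosecond value has to be cast to microseconds first.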

Let's try this approach:
import datetime
date = datetime.datetime.today()
date2 = '2018-07-23'  # here could be your date converted to the proper type
date2 = datetime.datetime.strptime(date2, '%Y-%m-%d')
difference = date - date2
difference = difference.days
You can also apply df.some_col_with_difference.astype('timedelta64[D]') to a whole column in a dataframe. Newer pandas versions may reject the 'D' unit in astype, in which case the .dt.days accessor gives the whole days instead.
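For the whole-column case, a small sketch of what I mean, assuming the column already has a datetime64 dtype. The frame, the fixed reference timestamp and the days_open column name are made up for illustration; only Erledigungsdatum comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for the data frame from the question.
frame = pd.DataFrame(
    {"Erledigungsdatum": pd.to_datetime(["2018-07-23", "2018-09-01"])}
)

# A fixed reference keeps the example reproducible; pd.Timestamp.today()
# would be the live equivalent.
reference = pd.Timestamp("2018-09-17 14:42:06")

# Subtracting a datetime column from a Timestamp gives a timedelta column,
# and .dt.days extracts the whole days from it.
frame["days_open"] = (reference - frame["Erledigungsdatum"]).dt.days
print(frame)
```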

Related

Convert datetime object into a string

I need to convert a datetime into a string using numpy.
Is there another way to directly convert only one object to string that doesn't involve using the following function passing an array of 1 element (which returns an array too)?
numpy.datetime_as_string(arr, unit=None, timezone='naive', casting='same_kind')
With this function, I can make the conversion, but I just want to know if there is a more direct/clean way to do it.
Thanks in advance.
As we don't know what is inside arr, I assume it is just datetime.now().
If so try this:
import datetime
datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')
>>> '2022-07-28 10:27:34.986848'
If you need a numpy version (with numpy imported as np):
np.array(datetime.datetime.now(), dtype='datetime64[s]')
>>> array('2022-07-28T10:32:19', dtype='datetime64[s]')
If you just want to convert a single numpy datetime64 object into a string, here is the answer:
import datetime
# Works for microsecond or coarser precision; a datetime64[ns] value comes back as an integer.
yourdt = yourdt.astype(datetime.datetime)
yourdt_str = yourdt.strftime("%Y-%m-%d %H:%M:%S")
That's it.
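If the value might have nanosecond precision, casting it to microseconds first keeps the conversion to datetime working. A rough sketch, where datetime64_to_text is a hypothetical helper name; plain str() is also enough when the ISO format is acceptable.

```python
import datetime

import numpy as np


def datetime64_to_text(value, fmt="%Y-%m-%d %H:%M:%S"):
    """Format a single numpy datetime64 scalar with strftime."""
    # Microsecond precision is the finest that datetime.datetime can hold,
    # so anything finer is truncated here.
    as_microseconds = value.astype("datetime64[us]")
    return as_microseconds.astype(datetime.datetime).strftime(fmt)


stamp = np.datetime64("2022-07-28T10:32:19.123456789")
print(datetime64_to_text(stamp))

# str() on a datetime64 scalar returns its ISO 8601 representation.
iso_text = str(np.datetime64("2022-07-28T10:32:19"))
print(iso_text)
```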
from datetime import datetime
now = datetime.now() # current date and time
year = now.strftime("%Y")
print("year:", year)
month = now.strftime("%m")
print("month:", month)
day = now.strftime("%d")
print("day:", day)
time = now.strftime("%H:%M:%S")
print("time:", time)
date_time = now.strftime("%m/%d/%Y, %H:%M:%S")
print("date and time:",date_time)

Convert year string into datetime object

I have a date column in my dataframe that consists of strings like this...'201512'
I would like to convert it into a datetime object of just year to do some time series analysis.
I tried...
df['Date']= pd.to_datetime(df['Date'])
and something similar to
datetime.strptime(Date, "%Y")
I am not sure how datetime interfaces with pandas dataframes (perhaps somebody will comment if there is special usage), but in general the datetime functions would work like this:
import datetime
date_string = "201512"
date_object = datetime.datetime.strptime(date_string, "%Y%m")
print(date_object)
Getting us:
2015-12-01 00:00:00
Now that the hard part of creating a datetime object is done we simply
print(date_object.year)
Which spits out our desired
2015
More info about the parsing operators (the "%Y%m" bit of my code) is described in the documentation
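For the dataframe itself, the same "%Y%m" format string can be passed to pd.to_datetime. A minimal sketch with made-up sample rows; the Year column name is my own choice, not something from the question.

```python
import pandas as pd

# Hypothetical sample data in the 'YYYYMM' layout from the question.
df = pd.DataFrame({"Date": ["201512", "201601"]})

# An explicit format stops pandas from guessing how to read '201512'.
parsed = pd.to_datetime(df["Date"], format="%Y%m")

# The .dt accessor exposes the year of each parsed timestamp.
df["Year"] = parsed.dt.year
print(df)
```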
I would look at the module arrow
https://arrow.readthedocs.io/en/latest/
import arrow
date = arrow.now()
#example of text formatting
fdate = date.format('YYYY')
#example of converting text into datetime
date = arrow.get('201905', 'YYYYMM').datetime

Pandas: Calculate the difference between two Datetime columns from different timezones

I have two different time series. One is a series of timestamps in ms-format from the CET timezone delivered as strings. The other are unix-timestamps in s-format in the UTC timezone.
Each of them is in a column in a larger dataframe; neither of them is a DatetimeIndex, and neither should be one.
I need to convert the CET time to UTC and then calculate the difference between both columns and I'm lost between the Datetime functionalities of Python and Pandas, and the variety of different datatypes.
Here's an example:
import pandas as pd
import pytz
germany = pytz.timezone('Europe/Berlin')
D1 = ["2016-08-22 00:23:58.254","2016-08-22 00:23:58.254",
"2016-08-22 00:23:58.254","2016-08-22 00:40:33.260",
"2016-08-22 00:40:33.260","2016-08-22 00:40:33.260"]
D2 = [1470031195, 1470031195, 1470031195, 1471772027, 1471765890, 1471765890]
S1 = pd.to_datetime(pd.Series(D1))
S2 = pd.to_datetime(pd.Series(D2),unit='s')
First problem
is with the use of tz_localize. I need the program to understand, that the data in S1 is not in UTC, but in CET. However using tz_localize like this seems to interpret the given datetime as CET assuming it's UTC to begin with:
F1 = S1.apply(lambda x: x.tz_localize(germany)).to_frame()
Trying tz_convert always throws something like:
TypeError: index is not a valid DatetimeIndex or PeriodIndex
Second problem
is that even with both of them having the same format I'm stuck because I can't calculate the difference between the two columns now:
F1 = S1.apply(lambda x: x.tz_localize(germany)).to_frame()
F1.columns = ["CET"]
F2 = S2.apply(lambda x: x.tz_localize('UTC')).to_frame()
F2.columns = ["UTC"]
FF = pd.merge(F1,F2,left_index=True,right_index=True)
FF.CET-FF.UTC
ValueError: Incompatbile tz's on datetime subtraction ops
I need a way to do these calculations with tz-aware datetime objects that are not DatetimeIndex objects.
Alternatively I need a way to make my CET-column to just look like this:
2016-08-21 22:23:58.254
2016-08-21 22:23:58.254
2016-08-21 22:23:58.254
2016-08-21 22:40:33.260
2016-08-21 22:40:33.260
2016-08-21 22:40:33.260
That is, I don't need my datetime to be tz-aware, I just want to convert it automatically by adding/subtracting the necessary amount of time with an awareness for daylight saving times.
If it weren't for DST I could just do a simple subtraction on two integers.
First you need to convert the CET timestamps to datetime and specify the timezone:
S1 = pd.to_datetime(pd.Series(D1))
T1_cet = pd.DatetimeIndex(S1).tz_localize('Europe/Berlin')
Then convert the UTC timestamps to datetime and specify the timezone to avoid confusion:
S2 = pd.to_datetime(pd.Series(D2), unit='s')
T2_utc = pd.DatetimeIndex(S2).tz_localize('UTC')
Now convert the CET timestamps to UTC:
T1_utc = T1_cet.tz_convert('UTC')
And finally calculate the difference between the timestamps:
diff = pd.Series(T1_utc) - pd.Series(T2_utc)
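If you would rather stay with plain Series, the .dt accessor offers the same localize and convert steps without building a DatetimeIndex. A sketch under that assumption, using a shortened version of the sample data; the variable names are mine.

```python
import pandas as pd

D1 = ["2016-08-22 00:23:58.254", "2016-08-22 00:40:33.260"]
D2 = [1470031195, 1471772027]

# Read the naive strings as Berlin wall-clock time, then convert to UTC.
cet_as_utc = (
    pd.to_datetime(pd.Series(D1))
    .dt.tz_localize("Europe/Berlin")
    .dt.tz_convert("UTC")
)

# Unix seconds are UTC by definition; utc=True makes the result tz-aware.
utc = pd.to_datetime(pd.Series(D2), unit="s", utc=True)

# Both columns now share one time zone, so they can be subtracted.
diff = cet_as_utc - utc

# Dropping the time zone afterwards gives the naive UTC column asked for.
naive_utc = cet_as_utc.dt.tz_localize(None)
print(naive_utc)
print(diff)
```

Because the localization uses the Europe/Berlin rules, daylight saving time is taken into account per row.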

pandas raises ValueError on DatetimeIndex Conversion

I am converting all ISO-8601 formatted values into Unix Values. For some inexplicable reason this line
a_col = pd.DatetimeIndex(a_col).astype(np.int64)/10**6
raises the error
ValueError: Unable to convert 0 2001-06-29
... (Abbreviated Output of Column)
Name: DateCol, dtype: datetime64[ns] to datetime dtype
This is very odd because I've guaranteed that each value is in datetime.datetime format as you can see here:
if a_col.dtypes is (np.dtype('object') or np.dtype('O')):
    a_col = a_col.apply(lambda x: x if isinstance(x, datetime.datetime) else epoch)
a_col = pd.DatetimeIndex(a_col).astype(np.int64)/10**6
Epoch is datetime.datetime.
When I check the dtype of the column that gives me the error, it's "object", exactly what I'm checking for. Is there something I'm missing?
Assuming that your time zone is US/Eastern (based on your dataset) and that your DataFrame is named df, please try the following:
import datetime as dt
from time import mktime
import pytz
df['Job Start Date'] = \
df['Job Start Date'].apply(lambda x: mktime(pytz.timezone('US/Eastern').localize(x)
.astimezone(pytz.UTC).timetuple()))
>>> df['Job Start Date'].head()
0 993816000
1 1080824400
2 1052913600
3 1080824400
4 1075467600
Name: Job Start Date, dtype: float64
You first need to make your 'naive' datetime objects timezone aware (US/Eastern) and then convert them to UTC. Finally, pass your new UTC-aware datetime object as a timetuple to the mktime function from the time module.
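One caveat: time.mktime interprets the tuple it receives as local time, so the numbers depend on the time zone setting of the machine that runs the code. Here is a hedged sketch of the same idea using calendar.timegm, which reads the tuple as UTC. It assumes Python 3.9+ and swaps pytz for the standard zoneinfo module; to_epoch_seconds is a hypothetical helper name.

```python
import calendar
import datetime
from zoneinfo import ZoneInfo


def to_epoch_seconds(naive, tz_name="America/New_York"):
    """Treat a naive datetime as local time in tz_name, return Unix seconds."""
    # With zoneinfo, attaching the zone via replace() is sufficient
    # (pytz would need its localize() method instead).
    aware = naive.replace(tzinfo=ZoneInfo(tz_name))
    # utctimetuple() converts to UTC; timegm() reads the tuple as UTC.
    return calendar.timegm(aware.utctimetuple())


print(to_epoch_seconds(datetime.datetime(2001, 6, 29)))
```

On a column this could be used through apply, in the same way as the lambda above.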

How to convert integer into date object python?

I am creating a module in python, in which I am receiving the date in integer format like 20120213, which signifies the 13th of Feb, 2012. Now, I want to convert this integer formatted date into a python date object.
Also, if there is any means by which I can subtract/add the number of days in such integer formatted date to receive the date value in same format? like subtracting 30 days from 20120213 and receive answer as 20120114?
This question is already answered, but for the benefit of others looking at this question I'd like to add the following suggestion: Instead of doing the slicing yourself as suggested in the accepted answer, you might also use strptime() which is (IMHO) easier to read and perhaps the preferred way to do this conversion.
import datetime
s = "20120213"
s_datetime = datetime.datetime.strptime(s, '%Y%m%d')
I would suggest the following simple approach for conversion:
from datetime import datetime, timedelta
s = "20120213"
# you could also import date instead of datetime and use that.
date = datetime(year=int(s[0:4]), month=int(s[4:6]), day=int(s[6:8]))
For adding/subtracting an arbitrary number of days (seconds work too, by the way), you could do the following:
date += timedelta(days=10)
date -= timedelta(days=5)
And convert back using:
s = date.strftime("%Y%m%d")
To convert the integer to a string safely, use:
s = "{0:-08d}".format(i)
This ensures that your string is eight characters long and left-padded with zeroes, even if the year is smaller than 1000 (negative years could become funny though).
Further reference: datetime objects, timedelta objects
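Put together, the round trip from the question (subtract 30 days from 20120213) could look like the sketch below. shift_int_date is a hypothetical helper name, and it assumes four-digit years so that str() yields eight characters.

```python
from datetime import datetime, timedelta


def shift_int_date(value, days):
    """Shift a YYYYMMDD integer by a number of days, returning YYYYMMDD."""
    # Parse the integer through its string form.
    parsed = datetime.strptime(str(value), "%Y%m%d")
    # timedelta handles month and year boundaries.
    shifted = parsed + timedelta(days=days)
    # Format back to the same eight-digit layout and return an integer.
    return int(shifted.strftime("%Y%m%d"))


print(shift_int_date(20120213, -30))
```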
Here is what I believe answers the question (Python 3, with type hints):
from datetime import date
def int2date(argdate: int) -> date:
    """
    If you have date as an integer, use this method to obtain a datetime.date object.
    Parameters
    ----------
    argdate : int
        Date as a regular integer value (example: 20160618)
    Returns
    -------
    datetime.date
        A date object which corresponds to the given value `argdate`.
    """
    # Integer division and modulo split YYYYMMDD into its three parts.
    year = argdate // 10000
    month = (argdate % 10000) // 100
    day = argdate % 100
    return date(year, month, day)
print(int2date(20160618))
The code above produces the expected 2016-06-18.
import datetime
timestamp = datetime.datetime.fromtimestamp(1500000000)
print(timestamp.strftime('%Y-%m-%d %H:%M:%S'))
This will give output like the following (fromtimestamp returns local time, so the exact value depends on your time zone):
2017-07-14 08:10:00
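To get the same result on every machine, the time zone can be passed explicitly. A small sketch using UTC; the variable names are only for illustration.

```python
from datetime import datetime, timezone

# With tz=timezone.utc the result no longer depends on the local time zone.
stamp = datetime.fromtimestamp(1500000000, tz=timezone.utc)
text = stamp.strftime("%Y-%m-%d %H:%M:%S")
print(text)  # 2017-07-14 02:40:00
```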
