Can't call strftime on numpy.datetime64, no definition - python

I have a datetime64 t that I'd like to represent as a string.
When I call strftime like this t.strftime('%Y.%m.%d') I get this error:
AttributeError: 'numpy.datetime64' object has no attribute 'strftime'
What am I missing? I am using Python 3.4.2 and Numpy 1.9.1

Importing a data structures library like pandas to accomplish type conversion feels like overkill to me. You can achieve the same thing with the standard datetime module:
import numpy as np
import datetime
t = np.datetime64('2017-10-26')
t = t.astype(datetime.datetime)
timestring = t.strftime('%Y.%m.%d')

Use this code:
import pandas as pd
t = pd.to_datetime(str(t))  # t is the numpy.datetime64 value from the question
timestring = t.strftime('%Y.%m.%d')

This is the simplest way:
t.item().strftime('%Y.%m.%d')
item() gives you a Python native datetime object, on which all the usual methods are available.

If your goal is only to represent t as a string, the simplest solution is str(t). If you want it in a specific format, you should use one of the solutions above.
One caveat is that np.datetime64 can have different amounts of precision. If t has nanosecond precision, user 12321's solution will still work, but apteryx's and John Zwinck's solutions won't, because t.astype(datetime.datetime) and t.item() return an int:
import datetime
import numpy as np
print('second precision')
t = np.datetime64('2000-01-01 00:00:00')
print(t)
print(t.astype(datetime.datetime))
print(t.item())
print('microsecond precision')
t = np.datetime64('2000-01-01 00:00:00.0000')
print(t)
print(t.astype(datetime.datetime))
print(t.item())
print('nanosecond precision')
t = np.datetime64('2000-01-01 00:00:00.0000000')
print(t)
print(t.astype(datetime.datetime))
print(t.item())
import pandas as pd
print(pd.to_datetime(str(t)))
second precision
2000-01-01T00:00:00
2000-01-01 00:00:00
2000-01-01 00:00:00
microsecond precision
2000-01-01T00:00:00.000000
2000-01-01 00:00:00
2000-01-01 00:00:00
nanosecond precision
2000-01-01T00:00:00.000000000
946684800000000000
946684800000000000
2000-01-01 00:00:00
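One way around the precision caveat, without pandas, is to cast the value to microsecond precision before calling item(). The sketch below assumes the date fits in the range of Python's datetime (years 1 to 9999) and that dropping sub-microsecond digits is acceptable; format_datetime64 is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def format_datetime64(t, fmt='%Y.%m.%d'):
    """Format a numpy.datetime64 of any precision (hypothetical helper)."""
    # Casting to microseconds makes item() return datetime.datetime
    # instead of an int, which is what nanosecond precision would give.
    as_microseconds = t.astype('datetime64[us]')
    return as_microseconds.item().strftime(fmt)

t_ns = np.datetime64('2000-01-01T00:00:00.000000000')  # nanosecond precision
t_day = np.datetime64('2017-10-26')                    # day precision

print(format_datetime64(t_ns))
print(format_datetime64(t_day))
```

The cast truncates anything finer than a microsecond, so this is only suitable when those digits do not matter for the output format.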

For those who might stumble upon this: numpy now has a numpy.datetime_as_string function. The only caveat is that it accepts an array rather than just an individual value. I would argue, however, that this is still a better solution than having to use another library just to do the conversion.
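A minimal sketch of that approach, assuming a one-element array is an acceptable wrapper for a single value. The unit='D' argument truncates the output to the date part; the final replacement of '-' with '.' is plain string handling added here to match the format asked for in the question.

```python
import numpy as np

# Wrap the value in an array, as the answer suggests.
dates = np.array(['2017-10-26T12:34:56'], dtype='datetime64[s]')

# unit='D' keeps only the year-month-day part of each element.
as_days = np.datetime_as_string(dates, unit='D')
print(as_days)

# datetime_as_string always emits ISO 8601, so other separators
# need ordinary string replacement.
dotted = [s.replace('-', '.') for s in as_days]
print(dotted)
```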

It might help to convert the datetime object to string and use splitting as shown below:
import numpy as np
dtObj = np.datetime64('2011-08-01T00:00:00.000000000')
dtString = str(dtObj).split('T')[0]  # keep only the date part before the 'T'
print(dtString)
# 2011-08-01

Related

Create and initialise a time column Python

I need to add a time column to my existing dataframe and initialize it. I tried this line of code df['date']=datetime.time(0, 0, 0) in a small script :
import pandas as pd
import datetime
df = pd.DataFrame({'column1':[34,54,32,23,26]})
df['date']=datetime.time(0, 0, 0)
print(df['date'])
output:
0 00:00:00
1 00:00:00
2 00:00:00
3 00:00:00
4 00:00:00
but when I implemented it in my code, in which I work on large dataframes, I got this error:
dfreez['delta']=datetime.time(0, 0, 0)
TypeError: descriptor 'time' for 'datetime.datetime' objects doesn't apply to 'int' object
this is a piece of my code:
import pandas as pd
dfreez = pd.read_excel('file_name.xlsx',header=0, index= False)
from datetime import datetime
dfreez['delta']=datetime.time(0, 0, 0)
I don't understand what went wrong!
import datetime and from datetime import datetime are not the same.
After the first one, the local datetime variable is a reference to the module. So you can access the datetime class with datetime.datetime and the time class with datetime.time
After the second, the local datetime variable is a reference to the datetime class. So you have no (direct *) way to access the time class.
You should just use:
import datetime
in the second snippet just like what was done in first one.
(*) FYI: it is still possible with the ugly sys.modules['datetime'].time. But never pretend that I advised you to do that!
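The difference between the two import forms can be seen in a short sketch. The aliases datetime_module and datetime_class are introduced here only so that both forms can coexist in one script; they are not part of the original code.

```python
import datetime as datetime_module               # the module
from datetime import datetime as datetime_class  # the class inside the module

# Through the module, the time class is reachable and builds 00:00:00.
midnight = datetime_module.time(0, 0, 0)
print(midnight)

# Through the class, .time is an instance method of datetime objects,
# so calling it with integers raises the TypeError from the question.
try:
    datetime_class.time(0, 0, 0)
    raised = False
except TypeError as exc:
    raised = True
    print(exc)

# If the "from" form is preferred, import both names explicitly.
from datetime import datetime, time
also_midnight = time(0, 0, 0)
```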

converting a string to np.array with datetime64, NOT using Pandas

I'm looking for a way to convert dates given in the format YYYYmmdd to an np.array with dtype='datetime64'. The dates are stored in another np.array but with dtype='float64'.
I am looking for a way to achieve this by avoiding Pandas!
I already tried something similar as suggested in this answer but the author states that "[...] if (the date format) was in ISO 8601 you could parse it directly using numpy, [...]".
As the date format in my case is YYYYmmdd which IS(?) ISO 8601 it should be somehow possible to parse it directly using numpy. But I don't know how as I am a total beginner in python and coding in general.
I really try to avoid Pandas because I don't want to bloat my script when there is a way to get the task done by using the modules I am already using. I also read it would decrease the speed here.
If no one else comes up with something more built-in, here is a pedestrian method:
>>> dates
array([19700101., 19700102., 19700103., 19700104., 19700105., 19700106.,
19700107., 19700108., 19700109., 19700110., 19700111., 19700112.,
19700113., 19700114.])
>>> y, m, d = dates.astype(int) // np.c_[[10000, 100, 1]] % np.c_[[10000, 100, 100]]
>>> y.astype('U4').astype('M8') + (m-1).astype('m8[M]') + (d-1).astype('m8[D]')
array(['1970-01-01', '1970-01-02', '1970-01-03', '1970-01-04',
'1970-01-05', '1970-01-06', '1970-01-07', '1970-01-08',
'1970-01-09', '1970-01-10', '1970-01-11', '1970-01-12',
'1970-01-13', '1970-01-14'], dtype='datetime64[D]')
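The one-liner above packs three steps together: split each number into year, month and day with integer division and modulo, then add the parts as datetime64 and timedelta64 values. A more explicit sketch of the same arithmetic follows. It swaps the string round-trip for the year (astype('U4')) with an integer offset from 1970, and the sample values are made up for illustration.

```python
import numpy as np

# Dates stored as floats in YYYYmmdd form (illustrative sample values).
dates = np.array([19700101., 19841231.])

as_int = dates.astype(np.int64)
years = as_int // 10000        # 19841231 -> 1984
months = as_int // 100 % 100   # 19841231 -> 12
days = as_int % 100            # 19841231 -> 31

# datetime64[Y] counts years since 1970, so subtract the epoch year.
result = (
    (years - 1970).astype('datetime64[Y]')
    + (months - 1).astype('timedelta64[M]')
    + (days - 1).astype('timedelta64[D]')
)
print(result)
```

Note that this does not validate the input: a value such as 19700231 silently rolls over into March instead of raising an error.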
You can go via the python datetime module.
from datetime import datetime
import numpy as np
datestrings = np.array(["18930201", "19840404"])
dtarray = np.array([datetime.strptime(d, "%Y%m%d") for d in datestrings], dtype="datetime64[D]")
print(dtarray)
# out: ['1893-02-01' '1984-04-04'] datetime64[D]
Since the real question seems to be how to get the given strings into the matplotlib datetime format,
from datetime import datetime
import numpy as np
from matplotlib import dates as mdates
datestrings = np.array(["18930201", "19840404"])
mpldates = mdates.datestr2num(datestrings)
print(mpldates)
# out: [691071. 724370.]

Issue with datetime remaining at epoch

I've got a dataframe with one column filled with milliseconds that I've been able to convert somewhat into datetime format. The issue is that for two years worth of data, from 2017-2018, the time output remains at 1-1-1970. The output datetime looks like this:
27 1970-01-01 00:25:04.232399999
28 1970-01-01 00:25:04.232699999
29 1970-01-01 00:25:04.232999999
...
85264 1970-01-01 00:25:29.962799999
85265 1970-01-01 00:25:29.963099999
85266 1970-01-01 00:25:29.963399999
It seems to me that the milliseconds, which begin at 1504224299999 and end at 1529971499999, are getting added to the 10th hour of epoch and are not representing the true range that it should.
This is my code so far...
import pandas as pd
import MySQLdb
import datetime
from pandas import DataFrame
con = MySQLdb.connect(host='localhost',user='root',db='binance',passwd='abcde')
cur = con.cursor()
ms = pd.read_sql('SELECT close_time FROM btcusdt', con=con)
ms['close_time'].apply( lambda x: datetime.datetime.fromtimestamp(x/1000) )
date = pd.to_datetime(ms['close_time'])
print(date)
I'm not quite sure where I'm going wrong, so if anybody can tell me what I'm doing stupidly it'd be greatly appreciated.
If a function doesn't accept a whole Series as its argument, you can apply it element-wise with a lambda function.
You also need to assign the result back to the original pandas Series to overwrite it:
ms['close_time'] = ms['close_time'].apply( lambda x: datetime.datetime.fromtimestamp(x/1000) )
If you want to use pandas.to_datetime directly, pass the unit explicitly. Without it, the integers are interpreted as nanoseconds since the epoch, which is why the output stays in 1970:
pd.to_datetime(ms['close_time'], unit = 'ms')
PS. The two methods can give different datetimes: datetime.fromtimestamp converts to your local time zone, while pd.to_datetime with unit='ms' returns UTC-based values.
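The effect of the unit argument can be checked on the two boundary values quoted in the question. This is a small sketch that uses a hand-built Series in place of the MySQL query result.

```python
import pandas as pd

# First and last close_time values from the question, in milliseconds.
ms = pd.Series([1504224299999, 1529971499999])

# Without a unit the integers are read as nanoseconds since the epoch,
# so everything lands within the first half hour of 1970-01-01.
as_nanoseconds = pd.to_datetime(ms)

# With unit='ms' the same integers map to 2017 and 2018 (UTC-based).
as_milliseconds = pd.to_datetime(ms, unit='ms')

print(as_nanoseconds)
print(as_milliseconds)
```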

Converting to_datetime but keeping original time

I am trying to convert string to Datetime- but the conversion adds 5 hours to the original time. How do I convert but keep the time as is?
>>> import pandas as pd
>>> t = pd.to_datetime("2016-09-21 08:56:29-05:00", format='%Y-%m-%d %H:%M:%S')
>>> t
Timestamp('2016-09-21 13:56:29')
The conversion doesn't add 5 hours to the original time. Pandas just detects that your datetime is timezone-aware and converts it to naive UTC. But it's still the same datetime.
If you want a localized Timestamp instance, use Timestamp.tz_localize() to make t a timezone-aware UTC timestamp, and then use the Timestamp.tz_convert() method to convert to UTC-0500:
>>> import pandas as pd
>>> import pytz
>>> t = pd.to_datetime("2016-09-21 08:56:29-05:00", format='%Y-%m-%d %H:%M:%S')
>>> t
Timestamp('2016-09-21 13:56:29')
>>> t.tz_localize(pytz.utc).tz_convert(pytz.timezone('America/Chicago'))
Timestamp('2016-09-21 08:56:29-0500', tz='America/Chicago')
To achieve what you want, you can remove the "-05:00" from the end of your time string "2016-09-21 08:56:29-05:00" before parsing it.
However, Erik Cederstrand is correct in explaining that pandas is not modifying the time, it's simply displaying it in a different format.
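If the goal is to keep the wall-clock time shown in the string, one option is to parse to UTC explicitly, convert back to the original offset, and then drop the timezone information. The sketch below assumes a fixed -05:00 offset built with the standard library instead of a named zone such as America/Chicago, and relies on the utc=True argument of pd.to_datetime; how the offset is handled without utc=True differs between pandas versions.

```python
from datetime import timedelta, timezone

import pandas as pd

# Parse to a timezone-aware UTC timestamp.
as_utc = pd.to_datetime("2016-09-21 08:56:29-05:00", utc=True)

# Convert to the fixed offset from the string (assumed to be -05:00).
minus_five = timezone(timedelta(hours=-5))
local = as_utc.tz_convert(minus_five)

# Drop the timezone but keep the local wall-clock time.
wall_clock = local.tz_localize(None)

print(as_utc)
print(local)
print(wall_clock)
```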

How to set a variable to be "Today's" date in Python/Pandas

I am trying to set a variable to equal today's date.
I looked this up and found a related article:
Set today date as default value in the model
However, this didn't particularly answer my question.
I used the suggested:
dt.date.today
But after
import datetime as dt
date = dt.date.today
print date
<built-in method today of type object at 0x000000001E2658B0>
Df['Date'] = date
I didn't get what I actually wanted, which was a clean date format of today's date in Month/Day/Year.
How can I create a variable of today's day in order for me to input that variable in a DataFrame?
You mention you are using Pandas (in your title). If so, there is no need to use an external library, you can just use to_datetime
>>> pandas.to_datetime('today').normalize()
Timestamp('2015-10-14 00:00:00')
This will always return today's date at midnight, irrespective of the actual time, and can be directly used in pandas to do comparisons etc. Pandas always includes 00:00:00 in its datetimes.
Replacing today with now would give you the date in UTC instead of local time; note that in neither case is the tzinfo (timezone) added.
In pandas versions prior to 0.23.x, normalize may not have been necessary to remove the non-midnight timestamp.
If you want a string mm/dd/yyyy instead of the datetime object, you can use strftime (string format time):
>>> dt.datetime.today().strftime("%m/%d/%Y")
# ^ note parentheses
'02/12/2014'
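The missing parentheses are the whole problem in the question: dt.date.today without () is the method itself, not a date. A short sketch of the difference, using only the standard library:

```python
import datetime as dt

not_called = dt.date.today   # a reference to the method
called = dt.date.today()     # an actual datetime.date for today

print(callable(not_called))      # the method object can still be called
print(type(called).__name__)     # the result of calling it is a date

# Month/Day/Year string, as asked for in the question.
formatted = called.strftime("%m/%d/%Y")
print(formatted)
```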
Using pandas: pd.Timestamp("today").strftime("%m/%d/%Y")
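pd.Timestamp.now().strftime("%d/%m/%Y")
This gives output such as '11/02/2019'. In older pandas, pd.datetime.now() also worked, but the pd.datetime alias was deprecated in pandas 1.0 and later removed.
You can add the time if you want:
pd.Timestamp.now().strftime("%d/%m/%Y %I:%M:%S")
This gives output such as '11/02/2019 11:08:26'.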
strftime formats
You can also look into pandas.Timestamp, which includes methods like .now and .today.
Unlike pandas.to_datetime('now'), pandas.Timestamp.now() won't default to UTC:
import pandas as pd
pd.Timestamp.now() # will return California time
# Timestamp('2018-12-19 09:17:07.693648')
pd.to_datetime('now') # will return UTC time
# Timestamp('2018-12-19 17:17:08')
I had the same problem and tried many things; this is the solution that finally worked.
import time
print (time.strftime("%d/%m/%Y"))
Simply use pd.Timestamp.now(). For example:
input: pd.Timestamp.now()
output: Timestamp('2022-01-12 14:43:05.521896')
If all you want is Timestamp('2022-01-12') with nothing after the date, you can use replace to zero out the hour, minute, second and microsecond:
input: pd.Timestamp.now().replace(hour=0, minute=0, second=0, microsecond=0)
output: Timestamp('2022-01-12 00:00:00')
That looks complicated, though. A simpler way is normalize:
input: pd.Timestamp.now().normalize()
output: Timestamp('2022-01-12 00:00:00')
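To check that replace and normalize agree, and to see the result used as a DataFrame column, here is a small sketch. It uses a fixed timestamp in place of pd.Timestamp.now() so that the output is reproducible; the DataFrame contents are borrowed from an earlier example and are only illustrative.

```python
import pandas as pd

# Fixed stand-in for pd.Timestamp.now() so the result is deterministic.
now = pd.Timestamp('2022-01-12 14:43:05.521896')

via_replace = now.replace(hour=0, minute=0, second=0, microsecond=0)
via_normalize = now.normalize()
print(via_replace == via_normalize)

# Assigning the scalar broadcasts it to every row of the new column.
df = pd.DataFrame({'column1': [34, 54, 32]})
df['Date'] = via_normalize
print(df)
```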
Easy solution in Python3+:
import time
todaysdate = time.strftime("%d/%m/%Y")
# with '.' instead of '/'
todaysdate = time.strftime("%d.%m.%Y")
import datetime
import pandas as pd
def today_date():
    '''
    utils:
    get today's date as a pandas Timestamp (midnight)
    '''
    date = datetime.datetime.now().date()
    date = pd.to_datetime(date)
    return date
Df['Date'] = today_date()
this could be safely used in pandas dataframes.
There are already quite a few good answers, but to answer the more general question about "any" period:
Use the function for time periods in pandas. For Day, use 'D', for month 'M' etc.:
>pd.Timestamp.now().to_period('D')
Period('2021-03-26', 'D')
>p = pd.Timestamp.now().to_period('D')
>p.to_timestamp().strftime("%Y-%m-%d")
'2021-03-26'
note: If you need to consider UTC, you can use: pd.Timestamp.utcnow().tz_localize(None).to_period('D')...
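Building on the code you already have, call today() so that date holds an actual date rather than the method, then pass it to pd.to_datetime:
import datetime as dt
import pandas as pd
date = dt.date.today()
pd.to_datetime(date)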
