Using DateTimeIndex for set_xlim on matplotlib - python

How can I set x/y limits on matplotlib to certain datetime values?
I have a DatetimeIndex object (called time) and I want the plots to fit between the first and last values of this index.
If I try ax.set_xlim(time[0], time[-1])
it throws this error:
Cannot compare type Timedelta with type float
Any suggestions?

Matplotlib represents dates internally as floating-point day numbers, so the index values need to be converted first. I think this needs to be done with date2num().
matplotlib API Overview: Dates
from matplotlib.dates import date2num, DateFormatter
ax.set_xlim(date2num([series_.index.min(), series_.index.max()]))
ax.xaxis.set_major_formatter(DateFormatter('%H:%M'))
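A minimal, self-contained sketch of the same idea, assuming a DatetimeIndex called time as in the question. The sample data and the names values, x_min and x_max are made up for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so this runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.dates import DateFormatter, date2num

# Hypothetical data: one sample per minute for an hour.
time = pd.date_range("2020-01-01 08:00", periods=60, freq="min")
values = np.arange(60)

fig, ax = plt.subplots()
ax.plot(time.to_pydatetime(), values)

# Convert the first and last timestamps to matplotlib date numbers (floats).
x_min, x_max = date2num([time[0].to_pydatetime(), time[-1].to_pydatetime()])
ax.set_xlim(x_min, x_max)

# Show only hours and minutes on the tick labels.
ax.xaxis.set_major_formatter(DateFormatter("%H:%M"))
```

Converting each Timestamp with to_pydatetime() before calling date2num keeps the sketch independent of how a given matplotlib version handles pandas objects.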

Related

Plotting datetime.time object: "float() argument must be a string or a number, not 'datetime.time'"

I have a pandas data frame of dtype: int64, which I then convert to datetime using pd.to_datetime. This gives a date as well as the time of day, but I only want a distribution plot of the times of day. I have tried many different things and keep running into errors. Here is the code for my latest error:
type(justTime)
This returns 'pandas.core.frame.DataFrame' so I know it is a data frame.
justTime['ACCESS_TIME'].value_counts()
This returns a value_counts list of '2020-08-08 12:44:19.000' type objects, which it calls dtype: int64. Of note is when I do: type(justTime['ACCESS_TIME']) it returns 'pandas.core.series.Series'.
Next, I make it a datetime by doing the following:
justTime['ACCESS_TIME'] = pd.to_datetime(justTime['ACCESS_TIME'])
If I do justTime['ACCESS_TIME'].dt.time, it prints just the times, for example "13:04:41", but shows them as dtype: object.
Therefore, when I try
ax = sns.distplot(justTime['ACCESS_TIME'].dt.time)
I get the error: "TypeError: float() argument must be a string or a number, not 'datetime.time'"
Essentially, I have a data frame of datetime objects and I want a distribution plot of just the times, with no dates. I want to see around what time of day these access times cluster, and I have run into many problems working out how to handle this. Any help is appreciated, thank you.
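One possible way around the TypeError, sketched under the assumption that a numeric "hours since midnight" value is acceptable for the plot: datetime.time objects are not numbers, but the .dt.hour, .dt.minute and .dt.second accessors are. The sample timestamps and the names access_time, hours and per_hour are hypothetical stand-ins for justTime['ACCESS_TIME'].

```python
import pandas as pd

# Hypothetical access times standing in for justTime['ACCESS_TIME'].
access_time = pd.Series(pd.to_datetime([
    "2020-08-08 12:44:19",
    "2020-08-09 13:04:41",
    "2020-08-10 13:30:00",
    "2020-08-11 06:15:00",
]))

# Express each time of day as a float number of hours since midnight.
hours = (
    access_time.dt.hour
    + access_time.dt.minute / 60
    + access_time.dt.second / 3600
)

# Count accesses per whole hour of the day.
per_hour = access_time.dt.hour.value_counts().sort_index()
```

A numeric series like hours can then be handed to a histogram or density plot, for example a histogram with 24 bins, without any datetime.time objects being involved.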

Why doesn't Seaborn relplot print datetime values on the x-axis?

I'm working on a Kaggle competition to deepen my data science knowledge, and I've hit an issue with the seaborn library. I'm trying to plot a feature over time, but the relplot function is not able to print the datetime values. In the output, I see a big black box instead of values.
Here is my plotting code:
rainfall_types = list(auser.loc[:,1:])
grid = sns.relplot(x='Date', y=rainfall_types[0], kind="line", data=auser);
grid.fig.autofmt_xdate()
[Images: seaborn.relplot output and the head of my dataset]
I found the error. When you use pandas.read_csv(dataset) and your dataset contains a datetime column, it is parsed as object dtype, and Python reads the values as 'str' (strings). So when you plot them, matplotlib is not able to show them correctly.
To avoid this behaviour, convert the datetime values into datetime objects by using:
df = pandas.read_csv(dataset, parse_dates=['Column_Date'])
This tells pandas that the column identified by the key 'Column_Date' contains dates and has to be converted into datetime objects.
If you want, you can also use the date column as the index of your dataframe, which makes analysis along time more convenient. To do so, add the argument index_col='Column_Date' to your read_csv call.
I hope you find it helpful.
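A small sketch of the difference, using the read_csv keywords parse_dates and index_col. The CSV content and the 'Rainfall' column are made up for illustration, and an in-memory buffer replaces the real file.

```python
import io

import pandas as pd

# Hypothetical CSV content; 'Column_Date' follows the naming used above.
csv_text = (
    "Column_Date,Rainfall\n"
    "2020-01-01,1.5\n"
    "2020-01-02,0.0\n"
    "2020-01-03,2.3\n"
)

# Without parse_dates, the date column is read as strings.
raw = pd.read_csv(io.StringIO(csv_text))

# With parse_dates the column is converted to datetimes,
# and index_col makes it the index of the frame.
df = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["Column_Date"],
    index_col="Column_Date",
)
```

Comparing raw.dtypes with type(df.index) shows whether the conversion took place before anything is plotted.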

How to convert datetime to a numeric data type?

I have a dataset as
time MachineId
1530677359000000000 01081081
1530677363000000000 01081081
1530681023000000000 01081090
1530681053000000000 01081090
1530681531000000000 01081090
My code is as follows:
import pandas as pd
from datetime import datetime
import time
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdate
df = pd.read_csv('acn.csv')
df['time'] = pd.to_datetime(df['time'], unit='ns')  # convert epoch nanoseconds to datetime
print(df.head())
Output:
time MachineId
0 2018-07-04 04:09:19 1081081.0
1 2018-07-04 04:09:23 1081081.0
2 2018-07-04 05:10:23 1081090.0
3 2018-07-04 05:10:53 1081090.0
4 2018-07-04 05:18:51 1081090.0
Now I want to convert the time column to numeric values so that I can plot time against machine id:
dates = plt.dates.date2num(df['time'])
df.plot(kind='scatter',x='dates',y='MachineId')
plt.show()
which throws an error:
AttributeError: 'module' object has no attribute 'dates'
How can I convert the datetime values to numeric so that the plot can be drawn?
You got the following error:
AttributeError: 'module' object has no attribute 'dates'
Your error message is telling you that matplotlib.pyplot.dates (plt.dates) doesn't exist: the pyplot module has no attribute named 'dates'.
So you need to fix that error before you worry about converting anything. Did you mean to call matplotlib.dates.date2num instead? In your code you have the following:
import matplotlib.dates as mdate
So maybe you meant to call mdate.date2num instead? That should eliminate the AttributeError.
If that doesn't work for you, you could try what is suggested in the link provided by one of the other commenters: pandas' to_pydatetime. I'm not familiar with it, but in the linked example it is accessed as Series.dt.to_pydatetime().
All of this conversion is only necessary because you are trying to use df.plot; maybe you should consider calling matplotlib directly. For example, could you use plt.plot_date instead? Pandas is excellent, but its plotting interface isn't as mature as the rest of it. As an example (I'm not saying this is the exact problem you are having), there is a known bug in pandas regarding plotting dates, and an older Stack Overflow thread where someone stubs out a plt.plot_date method.
You can also plot dates directly. For example, if you want the dates on the x-axis, pass them straight to ax.plot(df.time, ids). I think this might be the closest thing to what you are looking for.
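A sketch of the mdate.date2num route, assuming the three sample rows below stand in for acn.csv. The frame is built inline rather than read from disk, the Agg backend is chosen only so the code runs without a display, and pd.DatetimeIndex(...).to_pydatetime() is one of several ways to obtain plain Python datetimes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so this runs headless
import matplotlib.dates as mdate
import matplotlib.pyplot as plt
import pandas as pd

# A few rows from the question, built inline instead of reading acn.csv.
df = pd.DataFrame({
    "time": [1530677359000000000, 1530677363000000000, 1530681023000000000],
    "MachineId": [1081081, 1081081, 1081090],
})
df["time"] = pd.to_datetime(df["time"], unit="ns")

# date2num lives in matplotlib.dates, not in matplotlib.pyplot.
dates = mdate.date2num(pd.DatetimeIndex(df["time"]).to_pydatetime())

fig, ax = plt.subplots()
ax.scatter(dates, df["MachineId"])
ax.xaxis_date()  # tell the axis that the float x values are dates
```

The resulting dates array holds floats measured in days, so two samples four seconds apart differ by 4/86400.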

How to use matplotlib to plot line charts

I use pandas to read my csv file and turn two columns into arrays, as the independent and dependent variables respectively.
[Image: the data reading, array conversion, and value assignment]
Then, when I try to plot the line chart with matplotlib.pyplot, I get the error: 'numpy.ndarray' object has no attribute 'find'.
import numpy as np
import matplotlib.pyplot as plt
plt.plot(x,y)
The problem is probably with your dtypes. Assuming your data are in df, check df.dtypes. The columns you are trying to plot must be numeric (float, int, bool).
I guess that at least one of the columns you are plotting has object dtype. Try to find out why (maybe missing values were read as some sort of string, or everything is just treated as a string) and convert it to the correct type with astype, i.e.
df['float_col'] = df['float_col'].astype(np.float64)
Edit:
If you are trying to plot dates, make sure that the dtype is actually a date type, i.e. datetime64[ns], and use matplotlib's dedicated method plot_date.
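A short sketch of the dtype check and conversion described above. The frame and the column name 'x' are invented for illustration, and pd.to_numeric with errors="coerce" is shown as one option when some entries cannot be parsed.

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which the numbers were read in as strings.
df = pd.DataFrame({"x": ["1", "2", "3"], "float_col": ["0.5", "1.5", "2.5"]})
print(df.dtypes)  # neither column is numeric at this point

# Convert each column to the numeric type it should have.
df["x"] = df["x"].astype(np.int64)
df["float_col"] = df["float_col"].astype(np.float64)

# If some entries are not valid numbers, astype raises an error;
# pd.to_numeric with errors="coerce" turns them into NaN instead.
cleaned = pd.to_numeric(pd.Series(["1.0", "n/a"]), errors="coerce")
```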

Python matplotlib.dates.date2num: converting numpy array to matplotlib datetimes

I am trying to plot a custom chart with datetime axis. My understanding is that matplotlib requires a float format which is days since epoch. So, I want to convert a numpy array to the float epoch as required by matplotlib.
The datetime values are stored in a numpy array called t:
In [235]: t
Out[235]: array(['2008-12-01T00:00:59.000000000-0800',
'2008-12-01T00:00:59.000000000-0800',
'2008-12-01T00:00:59.000000000-0800',
'2008-12-01T00:09:26.000000000-0800',
'2008-12-01T00:09:41.000000000-0800'], dtype='datetime64[ns]')
Apparently, matplotlib.dates.date2num only accepts a sequence of python datetimes as input (not numpy datetimes arrays):
import matplotlib.dates as dates
plt_dates = dates.date2num(t)
raises AttributeError: 'numpy.datetime64' object has no attribute 'toordinal'
How should I resolve this issue? I hope to have a solution that works for all types of numpy.datetime like object.
My best workaround (which I am not sure is correct) is not to use date2num at all. Instead, I use the following:
z = np.array([0]).astype(t.dtype)
plt_dates = (t - z) / np.timedelta64(1, 'D')
Even if this solution is correct, it would be nicer to use library functions instead of manual ad hoc workarounds.
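For a quick fix, use the following (to_pydatetime() is a pandas method, so a plain numpy array must first be wrapped, e.g. pd.DatetimeIndex(t)):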
import matplotlib.dates as dates
plt_dates = dates.date2num(t.to_pydatetime())
or:
import matplotlib.dates as dates
plt_dates = dates.date2num(list(t))
It seems the latest release (matplotlib.__version__ '2.1.0') does not like numpy arrays... Edit: In my case, after checking the source code, the problem seems to be that the latest matplotlib.cbook cannot create an iterable from the numpy array and thinks the array is a number.
For similar but slightly more complex problems, see http://stackoverflow.com/questions/13703720/converting-between-datetime-timestamp-and-datetime64, possibly "Why do I get "python int too large to convert to C long" errors when I use matplotlib's DateFormatter to format dates on the x axis?", and maybe "matplotlib plot_date AttributeError: 'numpy.datetime64' object has no attribute 'toordinal'" (if someone answers).
Edit: someone answered, and the code using to_pydatetime() seems best. See also "pandas 0.21.0 Timestamp compatibility issue with matplotlib", though that did not work in my case (perhaps because of Python 2?).
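A sketch comparing the two approaches discussed above, assuming t is a plain numpy datetime64[ns] array (the values below are made up and timezone-naive). It is wrapped in pd.DatetimeIndex to obtain Python datetimes for date2num, and the manual day count is computed relative to 1970-01-01. The two results should share the same spacing; their absolute values match only if matplotlib's date epoch is also 1970-01-01, which I believe is the default in recent releases but not in older ones.

```python
import matplotlib.dates as dates
import numpy as np
import pandas as pd

# Hypothetical timezone-naive timestamps as a numpy datetime64 array.
t = np.array(
    ["2008-12-01T08:00:59", "2008-12-01T08:09:26", "2008-12-01T08:09:41"],
    dtype="datetime64[ns]",
)

# Library route: convert to Python datetimes, then to matplotlib date numbers.
py_datetimes = pd.DatetimeIndex(t).to_pydatetime()
plt_dates = dates.date2num(py_datetimes)

# Manual route from the question: fractional days since 1970-01-01.
manual_days = (t - np.datetime64("1970-01-01T00:00:00")) / np.timedelta64(1, "D")
```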
