Plot timedelta in matplotlib - python

I'm reading in some year, time (duration) data and I want to plot a chart of year on the x axis and time (duration) on the y axis. I want the y axis to have a HH:MM:SS format. I can't figure out how to do it. Here's my code (data is synthesized, real data set is much larger).
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, date2num
df = pd.DataFrame({'Year':[2010, 2011, 2012],
'Time': ['2:19:15', '2:11:16', '2:20:17']})
df['Time'] = pd.to_timedelta(df['Time'])
fig, ax = plt.subplots()
myFmt = DateFormatter("%H:%M:%S")
ax.yaxis.set_major_formatter(myFmt)
ax.plot(df['Year'], df['Time'].dt.total_seconds())
plt.gcf().autofmt_xdate()
plt.show()
If I don't convert to total_seconds, I get an error. It seems like the total seconds values are being interpreted as days. I tried dividing total_seconds by 24*60*60, but that gave me a message about a 0 date. I can't persuade date2num to work for me either.
I've checked on previous similar questions, but the code no longer works.
Does anyone know how to plot Pandas timedeltas in matplotlib?

Convert the timedeltas to datetimes. Every value gets a date in 1970, but if all you want is to plot and display the time, that won't matter. Recent pandas versions may refuse to pass a timedelta column to pd.to_datetime, so adding the timedeltas to a fixed timestamp is a more portable way to do the conversion. You then need to get rid of the .dt.total_seconds() too.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
df = pd.DataFrame({'Year': [2010, 2011, 2012],
                   'Time': ['2:19:15', '2:11:16', '2:20:17']})
df['Time'] = pd.to_timedelta(df['Time'])
# Anchor each duration to 1970-01-01 so matplotlib can treat it as a datetime
df['Time'] = pd.Timestamp('1970-01-01') + df['Time']
fig, ax = plt.subplots()
myFmt = DateFormatter("%H:%M:%S")
ax.yaxis.set_major_formatter(myFmt)
ax.plot(df['Year'], df['Time'])
plt.gcf().autofmt_xdate()
plt.show()
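
An alternative that avoids the dummy 1970 date is to keep the durations as seconds and format the tick labels yourself. The sketch below assumes non-negative durations. `format_hms` and `plot_durations` are names made up for this example, while `FuncFormatter` is matplotlib's standard hook for a custom label function.

```python
def format_hms(seconds, pos=None):
    """Format a non-negative number of seconds as H:MM:SS.

    The unused ``pos`` argument is part of the signature FuncFormatter expects.
    """
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def plot_durations():
    # Imported here so that format_hms itself has no third-party dependency.
    import matplotlib.pyplot as plt
    import pandas as pd
    from matplotlib.ticker import FuncFormatter

    df = pd.DataFrame({'Year': [2010, 2011, 2012],
                       'Time': ['2:19:15', '2:11:16', '2:20:17']})
    seconds = pd.to_timedelta(df['Time']).dt.total_seconds()

    fig, ax = plt.subplots()
    ax.plot(df['Year'], seconds)
    ax.yaxis.set_major_formatter(FuncFormatter(format_hms))
    ax.set_xticks(df['Year'])  # one tick per year instead of fractional years
    return fig, ax
```

Call `plot_durations()` and then `plt.show()` (or `fig.savefig(...)`) to see the result. Because the y values stay plain numbers, durations longer than 24 hours are labelled correctly as well, which a date-based formatter would not do.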

Related

how do we plot a scatter plot for time in python

I want to draw a scatter plot of the two columns k and s. k should be on the x axis, showing the time on an hourly basis over 24 hours, and s should be on the y axis. I have already tried some code using sns.relplot but got an AttributeError.
(Image: data columns for which we want the scatter plot)
(Image: code which we tried, with the error)
Try:
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame([['2020-05-26 06:15:07','105'], ['2020-05-26 06:15:07','41'], ['2020-05-26 06:16:51','95']], columns=["k", "s"])
df['k'] = pd.to_datetime(df['k'], format='%Y-%m-%d %H:%M:%S')
df['s'] = pd.to_numeric(df['s'])  # the sample values are strings
ax = sns.scatterplot(x=df['k'], y=df['s'])
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
plt.show()
Or, with the timestamps as the index and one major tick per hour:
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame([['2020-05-26 06:15:07','105'], ['2020-05-26 06:15:07','41'], ['2020-05-26 06:16:51','95']], columns=["k", "s"])
df['k'] = pd.to_datetime(df['k'], format='%Y-%m-%d %H:%M:%S')
df['s'] = pd.to_numeric(df['s'])  # the sample values are strings
df.set_index(['k'], inplace=True)
ax = sns.scatterplot(x=df.index, y=df['s'])
# ax.set(xlabel="time", ylabel="values")
ax.set_xlim(df.index[0], df.index[-1])
# One major tick per full hour; data spanning less than an hour may show no ticks
ax.xaxis.set_major_locator(mdates.HourLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
plt.show()
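
If the goal is an x axis that runs over the 24 hours of a day regardless of the date, one option is to plot the time of day as a fractional hour. This is only a sketch of that idea, using plain matplotlib rather than seaborn. `hours_since_midnight` and `plot_by_hour` are hypothetical helper names, and the sample rows are the ones from the answer above.

```python
from datetime import datetime


def hours_since_midnight(ts):
    """Return the time of day of a datetime as a fractional hour (0 <= h < 24)."""
    return ts.hour + ts.minute / 60 + ts.second / 3600


def plot_by_hour(rows):
    """Scatter-plot (timestamp string, value) pairs against the hour of day."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    hours = [hours_since_midnight(datetime.strptime(k, '%Y-%m-%d %H:%M:%S'))
             for k, _ in rows]
    values = [float(s) for _, s in rows]  # the sample values are strings

    fig, ax = plt.subplots()
    ax.scatter(hours, values)
    ax.set_xlim(0, 24)
    ax.set_xticks(range(0, 25))  # one tick per hour
    ax.set_xlabel('hour of day')
    return fig, ax


rows = [('2020-05-26 06:15:07', '105'),
        ('2020-05-26 06:15:07', '41'),
        ('2020-05-26 06:16:51', '95')]
```

Calling `plot_by_hour(rows)` followed by `plt.show()` draws the three points a little after the 6 o'clock tick. Points from different days fall on the same 0 to 24 scale, which may or may not be what you want.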

matplotlib wrong location of the first tick with timeseries plot [duplicate]

Compare the following code:
test = pd.DataFrame({'date':['20170527','20170526','20170525'],'ratio1':[1,0.98,0.97]})
test['date'] = pd.to_datetime(test['date'])
test = test.set_index('date')
ax = test.plot()
I added DateFormatter in the end:
test = pd.DataFrame({'date':['20170527','20170526','20170525'],'ratio1':[1,0.98,0.97]})
test['date'] = pd.to_datetime(test['date'])
test = test.set_index('date')
ax = test.plot()
ax.xaxis.set_minor_formatter(dates.DateFormatter('%d\n\n%a')) ## Added this line
The issue with the second graph is that it starts on 5-24 instead of 5-25. Also, 5-25 of 2017 is a Thursday, not a Monday. What is causing the issue? Is this timezone related? (I don't understand why the date numbers are stacked on top of each other either.)
In general, the datetime utilities of pandas and matplotlib are incompatible, so trying to use a matplotlib.dates object on a date axis created with pandas will fail in most cases.
One reason can be seen in the matplotlib documentation:
"datetime objects are converted to floating point numbers which represent time in days since 0001-01-01 UTC, plus 1. For example, 0001-01-01, 06:00 is 1.25, not 0.25."
(Matplotlib 3.3 and later use 1970-01-01 as the default epoch instead.)
However, this is not the only difference, so it is advisable not to mix pandas and matplotlib when it comes to datetime objects.
There is, however, the option to tell pandas not to use its own datetime format. In that case the matplotlib.dates tickers can be used. This is controlled with the x_compat argument:
df.plot(x_compat=True)
Since pandas does not provide sophisticated formatting capabilities for dates, one can use matplotlib for plotting and formatting.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates
df = pd.DataFrame({'date':['20170527','20170526','20170525'],'ratio1':[1,0.98,0.97]})
df['date'] = pd.to_datetime(df['date'])
usePandas=True
# Either use pandas
if usePandas:
    df = df.set_index('date')
    df.plot(x_compat=True)
    plt.gca().xaxis.set_major_locator(dates.DayLocator())
    plt.gca().xaxis.set_major_formatter(dates.DateFormatter('%d\n\n%a'))
    plt.gca().invert_xaxis()
    plt.gcf().autofmt_xdate(rotation=0, ha="center")
# or use matplotlib
else:
    plt.plot(df["date"], df["ratio1"])
    plt.gca().xaxis.set_major_locator(dates.DayLocator())
    plt.gca().xaxis.set_major_formatter(dates.DateFormatter('%d\n\n%a'))
    plt.gca().invert_xaxis()
plt.show()
Updated using the matplotlib object-oriented API:
# Reuses the imports above and a freshly built df ('date' still a column)
usePandas=True
# Either use pandas
if usePandas:
    df = df.set_index('date')
    ax = df.plot(x_compat=True, figsize=(6, 4))
    ax.xaxis.set_major_locator(dates.DayLocator())
    ax.xaxis.set_major_formatter(dates.DateFormatter('%d\n\n%a'))
    ax.invert_xaxis()
    ax.get_figure().autofmt_xdate(rotation=0, ha="center")
# or use matplotlib
else:
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot('date', 'ratio1', data=df)
    ax.xaxis.set_major_locator(dates.DayLocator())
    ax.xaxis.set_major_formatter(dates.DateFormatter('%d\n\n%a'))
    ax.invert_xaxis()  # invert_xaxis is an Axes method, not a Figure method
plt.show()
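
The day-count convention in the documentation excerpt quoted in this answer can be reproduced with the standard library alone. The sketch below only redoes the arithmetic from that quote (days since 0001-01-01, plus 1). `legacy_datenum` is a name invented here and is not a matplotlib function.

```python
from datetime import datetime

SECONDS_PER_DAY = 24 * 60 * 60


def legacy_datenum(dt):
    """Days elapsed since 0001-01-01 00:00, plus 1, as a float."""
    delta = dt - datetime(1, 1, 1)
    return delta.total_seconds() / SECONDS_PER_DAY + 1


print(legacy_datenum(datetime(1, 1, 1, 6, 0)))  # 1.25, matching the quoted example
print(legacy_datenum(datetime(2017, 5, 25)))    # day count for the first sample date
```

A tick formatter that assumes a different zero point or unit than the one the axis actually uses will read these numbers differently, which is one way tick labels can end up showing unexpected dates.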

matplotlib - plot wrong datetime

I'd like to plot a figure with matplotlib in which the x axis shows timestamps in yy-mm-dd hh-mm-ss format. My timestamps are datetime64 values (a pandas Series), and to also show the correct minutes and seconds I followed the hint in this link and used date2num. The problem is that it plots nonsensical dates:
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as md
for df in dfs:
    datenums=md.date2num(df.toPandas()["timestamp"])
    plt.xticks(rotation=25)
    xfmt = md.DateFormatter('%Y-%m-%d %H:%M:%S')
    ax.xaxis.set_major_formatter(xfmt)
    plt.plot(datenums,x)
    plt.show()
where df.toPandas()["timestamp"] is:
0 2015-12-15 03:53:13
Name: timestamp, dtype: datetime64[ns]
I tried to convert datetime64 in datetime but the result doesn't change.
If your timestamp values are in seconds, you can build a list of tick labels from them and add the labels to the plot, assuming your data is associated with an array of timestamps (arrayTmstmp below):
import matplotlib.pyplot as plt
import numpy as np
import datetime
OX_ticks_name = [datetime.datetime.fromtimestamp(x).strftime('%Y-%m-%d %H:%M:%S') for x in arrayTmstmp]
OX_ticks_pos = np.arange(0,len(arrayTmstmp))
fig, ax = plt.subplots(figsize=(16, 9), dpi=100)
...
ax.set_xticks(OX_ticks_pos)
ax.set_xticklabels(OX_ticks_name, rotation=40, horizontalalignment='right', fontsize=10)
plt.tight_layout()
plt.show()
Of course, the position of each tick and the name for each can be configured as you want.
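
A self-contained sketch of the same idea, assuming the timestamps are Unix timestamps in seconds, could look like this. The sample data and the helper names `make_tick_labels` and `plot_with_labels` are made up for illustration, and UTC is used explicitly so the labels do not depend on the local time zone.

```python
from datetime import datetime, timezone


def make_tick_labels(timestamps):
    """Turn Unix timestamps (seconds) into 'YYYY-MM-DD HH:MM:SS' strings in UTC."""
    return [datetime.fromtimestamp(t, tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S')
            for t in timestamps]


def plot_with_labels(timestamps, values):
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    positions = list(range(len(timestamps)))  # evenly spaced tick positions
    fig, ax = plt.subplots(figsize=(16, 9), dpi=100)
    ax.plot(positions, values)
    ax.set_xticks(positions)
    ax.set_xticklabels(make_tick_labels(timestamps), rotation=40,
                       horizontalalignment='right', fontsize=10)
    fig.tight_layout()
    return fig, ax


# Hypothetical sample data: three readings one minute apart.
sample_timestamps = [1450151593, 1450151653, 1450151713]
sample_values = [1.0, 2.0, 1.5]
```

Call `plot_with_labels(sample_timestamps, sample_values)` and then `plt.show()` to display it. Note that the points are spaced evenly by position, so irregular gaps between timestamps are not visible on the axis. If that matters, plotting the datetimes themselves on the x axis is the better fit.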
