matplotlib - plot wrong datetime - python

I'd like to plot a figure with matplotlib whose x-axis shows timestamps as yy-mm-dd hh:mm:ss. My timestamps are datetime64 (a pandas Series), and to display the minutes and seconds correctly I followed the hint in this link, which uses date2num. The problem is that it plots nonsensical dates:
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as md
for df in dfs:
    datenums=md.date2num(df.toPandas()["timestamp"])
    plt.xticks(rotation=25)
    xfmt = md.DateFormatter('%Y-%m-%d %H:%M:%S')
    ax.xaxis.set_major_formatter(xfmt)
    plt.plot(datenums,x)
    plt.show()
where df.toPandas()["timestamp"] is:
0 2015-12-15 03:53:13
Name: timestamp, dtype: datetime64[ns]
I tried converting datetime64 to datetime, but the result doesn't change.

If your timestamp values are in seconds (Unix time), you can build the list of tick labels yourself and then attach them to the plot. This assumes your data corresponds to an array of timestamps, arrayTmstmp:
import matplotlib.pyplot as plt
import numpy as np
import datetime
OX_ticks_name = [datetime.datetime.fromtimestamp(x).strftime('%Y-%m-%d %H:%M:%S') for x in arrayTmstmp]
OX_ticks_pos = np.arange(0,len(arrayTmstmp))
fig, ax = plt.subplots(figsize=(16, 9), dpi=100)
...
ax.set_xticks(OX_ticks_pos)
ax.set_xticklabels(OX_ticks_name, rotation=40, horizontalalignment='right', fontsize=10)
plt.tight_layout()
plt.show()
Of course, the position of each tick and the name for each can be configured as you want.
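Alternatively, matplotlib can usually plot datetime64 values directly, so neither date2num nor hand-built tick labels should be needed. The following is a minimal sketch under that assumption. The sample timestamps and the value column are made up for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it to get an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data standing in for df.toPandas()
pdf = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2015-12-15 03:53:13",
        "2015-12-15 03:53:43",
        "2015-12-15 03:54:13",
    ]),
    "value": [1.0, 3.0, 2.0],
})

fig, ax = plt.subplots()
# Pass the datetime64 values straight to plot(); matplotlib converts them itself
ax.plot(pdf["timestamp"], pdf["value"], marker="o")
# Show the full date and time, including minutes and seconds, on each tick
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M:%S"))
ax.tick_params(axis="x", rotation=25)
fig.tight_layout()

# The formatter turns matplotlib's internal date number back into text
first = mdates.date2num(pdf["timestamp"].to_numpy())[0]
label = ax.xaxis.get_major_formatter()(first)
print(label)
```

In an interactive session, finish with plt.show() instead of the final print.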

Related

Why am I getting junk date values on x-axis in matplotlib?

I am new to Python and learning data visualization using matplotlib.
I am trying to plot Date/Time vs Values using matplotlib from this CSV file:
https://drive.google.com/file/d/1ex2sElpsXhxfKXA4ZbFk30aBrmb6-Y3I/view?usp=sharing
Following is the code snippet which I have been playing around with:
import pandas as pd
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
plt.style.use('seaborn')
years = mdates.YearLocator()
months = mdates.MonthLocator()
days = mdates.DayLocator()
hours = mdates.HourLocator()
minutes = mdates.MinuteLocator()
years_fmt = mdates.DateFormatter('%H:%M')
data = pd.read_csv('datafile.csv')
data.sort_values('Date/Time', inplace=True)
fig, ax = plt.subplots()
ax.plot('Date/Time', 'Discharge', data=data)
# format the ticks
ax.xaxis.set_major_locator(minutes)
ax.xaxis.set_major_formatter(years_fmt)
ax.xaxis.set_minor_locator(hours)
datemin = min(data['Date/Time'])
datemax = max(data['Date/Time'])
ax.set_xlim(datemin, datemax)
ax.format_xdata = mdates.DateFormatter('%Y.%m.%d %H:%M')
ax.format_ydata = lambda x: '%1.2f' % x # format the price.
ax.grid(True)
fig.autofmt_xdate()
plt.show()
The code is plotting the graph but it is not labeling the X-Axis and also giving some unknown values (on mouse over) for x on the bottom right corner as shown in the below screenshot:
Screenshot of matplotlib figure window
Can someone please suggest what changes are needed to plot the x-axis dates and also make the correct values appear when I move the cursor over the graph?
Thanks
I didn't use matplotlib directly; I used pandas plotting instead:
import pandas as pd
data = pd.read_csv('datafile.csv')
# convert before sorting, otherwise the rows are sorted as text rather than by time
data["Date/Time"] = pd.to_datetime(data["Date/Time"], format="%d.%m.%Y %H:%M")
data.sort_values('Date/Time', inplace=True)
ax = data.plot.line(x='Date/Time', y='Discharge')
The key step is converting the Date/Time column to the pandas datetime type.
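The main issue is most likely that the Date/Time column is still plain text: your data uses '%d.%m.%Y %H:%M', and read_csv leaves such values as strings, so matplotlib does not treat them as dates. The '%Y.%m.%d %H:%M' you passed to DateFormatter only controls how dates are displayed, not how the data is parsed, and this is why you saw 'rubbish' values in the x tick labels. The code can also be shortened considerably if you convert the Date/Time column to timestamps, i.e.: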
import pandas as pd
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
plt.style.use('seaborn')  # this style is named 'seaborn-v0_8' on matplotlib 3.6 and later
data = pd.read_csv('datafile.csv')
# parse the strings first so that sorting is chronological rather than alphabetical
data["Date/Time"] = pd.to_datetime(data["Date/Time"], format="%d.%m.%Y %H:%M")
data.sort_values('Date/Time', inplace=True)
fig, ax = plt.subplots()
ax.plot('Date/Time', 'Discharge', data=data)
ax.format_xdata = mdates.DateFormatter('%Y.%m.%d %H:%M')
ax.tick_params(axis='x', rotation=45)
ax.grid(True)
fig.autofmt_xdate()
plt.show()
Note that the format of the tick labels depends on the zoom level, so you will need to zoom in on a portion of the graph to see hours and minutes in the tick labels. The cursor readout in the bottom bar of the window should always display the detailed timestamp under the cursor.
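The effect of leaving the column as text can be seen without plotting anything. The sketch below uses a tiny in-memory CSV with made-up rows (the real datafile.csv is not reproduced here) and compares the row order obtained by sorting the raw strings with the order obtained after pd.to_datetime.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for datafile.csv, using the same date format
csv_text = """Date/Time,Discharge
01.02.2020 00:00,3.0
15.01.2020 12:30,1.0
02.01.2020 06:15,2.0
"""

raw = pd.read_csv(io.StringIO(csv_text))
# Without parsing, the column holds text, not datetimes
is_datetime_before = pd.api.types.is_datetime64_any_dtype(raw["Date/Time"])

# Sorting the text compares characters, so the day of month dominates
string_sorted = raw.sort_values("Date/Time")["Discharge"].tolist()

parsed = raw.copy()
parsed["Date/Time"] = pd.to_datetime(parsed["Date/Time"], format="%d.%m.%Y %H:%M")
is_datetime_after = pd.api.types.is_datetime64_any_dtype(parsed["Date/Time"])

# Sorting real datetimes is chronological
time_sorted = parsed.sort_values("Date/Time")["Discharge"].tolist()

print(is_datetime_before, is_datetime_after)
print(string_sorted, time_sorted)
```

The two orders differ, which is why the conversion should come before sort_values.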

how do we plot a scatter plot for time in python

I want to draw a scatter plot of the two columns k and s. k should be on the x-axis, showing time on an hourly basis over 24 hours, and s should be on the y-axis. I have already tried some code using sns.relplot but got an AttributeError.
[Image: data columns for the scatter plot]
[Image: the code we tried, with the error]
Try:
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame([['2020-05-26 06:15:07','105'], ['2020-05-26 06:15:07','41'], ['2020-05-26 06:16:51','95']], columns=["k", "s"])
df.k = pd.to_datetime(df.k, format='%Y-%m-%d %H:%M:%S')
df.s = pd.to_numeric(df.s)  # the values arrive as strings; make them numeric for the y axis
ax = sns.scatterplot(x=df.k, y=df.s)  # recent seaborn versions require x= and y= as keywords
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
plt.show()
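# Variant: use the timestamps as the index and put a major tick on every hour
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame([['2020-05-26 06:15:07','105'], ['2020-05-26 06:15:07','41'], ['2020-05-26 06:16:51','95']], columns=["k", "s"])
df.k = pd.to_datetime(df.k, format='%Y-%m-%d %H:%M:%S')
df.s = pd.to_numeric(df.s)  # the values arrive as strings; make them numeric for the y axis
df.set_index(['k'],inplace=True)
ax = sns.scatterplot(x=df.index, y=df.s)
# ax.set(xlabel="time", ylabel="values")
ax.set_xlim(df.index[0], df.index[-1])
# with data covering 24 hours this ticks every hour; this 3-row sample spans under two minutes, so it shows no ticks
ax.xaxis.set_major_locator(mdates.HourLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
plt.show()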
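For the full 24-hour case described in the question, a sketch along the following lines may help. It swaps seaborn for plain matplotlib to keep the dependencies small, and the 24 hourly readings are synthetic stand-ins for the real k and s columns. The Agg backend is chosen only so that the snippet runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it to get an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data: one made-up reading every hour for a single day
k = pd.date_range("2020-05-26 00:30", periods=24, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"k": k, "s": (np.arange(24) * 5) % 37})

fig, ax = plt.subplots()
ax.scatter(df["k"], df["s"])

# One labelled tick every three hours, aligned to 00:00, 03:00, 06:00, ...
ax.xaxis.set_major_locator(mdates.HourLocator(byhour=range(0, 24, 3)))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
# Unlabelled minor ticks on every hour
ax.xaxis.set_minor_locator(mdates.HourLocator())
# Show exactly one day on the x axis
ax.set_xlim(dt.datetime(2020, 5, 26), dt.datetime(2020, 5, 27))
ax.set_xlabel("time of day")
ax.set_ylabel("s")

# What the formatter prints for 06:00
six_am_label = ax.xaxis.get_major_formatter()(
    mdates.date2num(dt.datetime(2020, 5, 26, 6, 0))
)
print(six_am_label)
```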

Plot timedelta in matplotlib

I'm reading in some year, time (duration) data and I want to plot a chart of year on the x axis and time (duration) on the y axis. I want the y axis to have a HH:MM:SS format. I can't figure out how to do it. Here's my code (data is synthesized, real data set is much larger).
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, date2num
df = pd.DataFrame({'Year':[2010, 2011, 2012],
'Time': ['2:19:15', '2:11:16', '2:20:17']})
df['Time'] = pd.to_timedelta(df['Time'])
fig, ax = plt.subplots()
myFmt = DateFormatter("%H:%M:%S")
ax.yaxis.set_major_formatter(myFmt)
ax.plot(df['Year'], df['Time'].dt.total_seconds())
plt.gcf().autofmt_xdate()
plt.show()
If I don't convert to total_seconds, I get an error. It seems like the total seconds values are being interpreted as days. I tried dividing total_seconds by 24*60*60, but that gave me a message about a 0 date. I can't persuade date2num to work for me either.
I've checked on previous similar questions, but the code no longer works.
Does anyone know how to plot Pandas timedeltas in matplotlib?
Convert the timedeltas to datetimes with pd.to_datetime. It will give everything a date in 1970, but if all you want is to plot and display the values, that won't matter. You also need to remove the .dt.total_seconds() call.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, date2num
df = pd.DataFrame({'Year':[2010, 2011, 2012],
'Time': ['2:19:15', '2:11:16', '2:20:17']})
df['Time'] = pd.to_timedelta(df['Time'])
df['Time'] = pd.to_datetime(df['Time'])
fig, ax = plt.subplots()
myFmt = DateFormatter("%H:%M:%S")
ax.yaxis.set_major_formatter(myFmt)
ax.plot(df['Year'], df['Time'])
plt.gcf().autofmt_xdate()
plt.show()
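An alternative that avoids converting timedeltas to datetimes is to keep plotting total seconds and only change how the y tick labels are rendered. The sketch below does this with matplotlib.ticker.FuncFormatter. The helper format_hms is a name introduced here for illustration, and it assumes the durations are non-negative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it to get an interactive window
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter


def format_hms(seconds, pos=None):
    """Render a non-negative number of seconds as HH:MM:SS (hypothetical helper)."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


df = pd.DataFrame({"Year": [2010, 2011, 2012],
                   "Time": ["2:19:15", "2:11:16", "2:20:17"]})
# Keep the durations as plain seconds for plotting
df["Seconds"] = pd.to_timedelta(df["Time"]).dt.total_seconds()

fig, ax = plt.subplots()
ax.plot(df["Year"], df["Seconds"], marker="o")
# Only the tick labels change; the underlying y values stay in seconds
ax.yaxis.set_major_formatter(FuncFormatter(format_hms))
# Put x ticks on the whole years only
ax.set_xticks(df["Year"].tolist())

print(format_hms(df["Seconds"].iloc[0]))
```

Because the axis values remain numeric, the usual numeric locators keep working, although the automatically chosen tick positions may not fall on round minutes.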

Unable to show Pandas dateindex on a matplotlib x axis

I'm trying to build matplotlib charts whose x-axis is a dateIndex from a pandas dataframe. Trying to mimic some examples from matplotlib, I've been unsuccessful. The xaxis ticks and labels never appear.
I thought maybe matplotlib wasn't properly digesting the pandas index, so I converted it to ordinal with the matplotlib date2num helper function, but that gave the same result.
# https://matplotlib.org/api/dates_api.html
# https://matplotlib.org/examples/api/date_demo.html
import datetime as dt
import matplotlib.dates as mdates
import matplotlib.cbook as cbook
import matplotlib.dates as mpd
years = mdates.YearLocator() # every year
months = mdates.MonthLocator() # every month
yearsFmt = mdates.DateFormatter('%Y')
majorLocator = years
majorFormatter = yearsFmt #FormatStrFormatter('%d')
minorLocator = months
y1 = np.arange(100)*0.14+1
y2 = -(np.arange(100)*0.04)+12
"""neither of these indices works"""
x = pd.date_range(start='4/1/2012', periods=len(y1))
#x = map(mpd.date2num, pd.date_range(start='4/1/2012', periods=len(y1)))
fig, ax = plt.subplots()
ax.plot(x,y1)
ax.plot(x,y2)
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
ax.xaxis.set_minor_locator(months)
datemin = x[0]
datemax = x[-1]
ax.set_xlim(datemin, datemax)
fig.autofmt_xdate()
plt.show()
The problem is the following. pd.date_range(start='4/1/2012', periods=len(y1)) creates dates from the first of April 2012 to the 9th of July 2012.
Now you set the major locator to be a YearLocator. This means that you want a tick for each year on the axis. However, all dates are within the same year, 2012, so there is no major tick to be shown within the plot range.
The suggestion would be to use a MonthLocator instead, so that the first of each month is ticked. It would also make sense to use a formatter that actually shows the months, e.g. '%b %Y'. If you want small tick marks for each day, you may use a DayLocator for the minor ticks.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
ax.xaxis.set_minor_locator(mdates.DayLocator())
Complete example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
y1 = np.arange(100)*0.14+1
y2 = -(np.arange(100)*0.04)+12
x = pd.date_range(start='4/1/2012', periods=len(y1))
fig, ax = plt.subplots()
ax.plot(x,y1)
ax.plot(x,y2)
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
ax.xaxis.set_minor_locator(mdates.DayLocator())
fig.autofmt_xdate()
plt.show()
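The explanation can be checked by counting how many major ticks each locator places inside the visible range. This is a small sketch using the same date range as the example. ticks_in_view is a helper name made up for this illustration, and the Agg backend is used only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it to get an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def ticks_in_view(ax):
    """Return the major x tick positions inside the current x limits (hypothetical helper)."""
    lo, hi = ax.get_xlim()
    return [t for t in ax.xaxis.get_majorticklocs() if lo <= t <= hi]


x = pd.date_range(start="4/1/2012", periods=100)
y = np.arange(100) * 0.14 + 1

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlim(x[0], x[-1])

# Yearly ticks: the only candidate is 1 January 2012, which lies left of the data
ax.xaxis.set_major_locator(mdates.YearLocator())
year_ticks = ticks_in_view(ax)

# Monthly ticks: the first of each month inside April to July 2012
ax.xaxis.set_major_locator(mdates.MonthLocator())
month_ticks = ticks_in_view(ax)

print(len(year_ticks), len(month_ticks))
```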
You could also let pd.DataFrame.plot handle most of that:
df = pd.DataFrame(dict(
y1=y1, y2=y2
), index=x)
df.plot()

Plotting dates on the x-axis

I am trying to plot information against dates. I have a list of dates in the format "01/02/1991".
I converted them by doing the following:
x = parser.parse(date).strftime('%Y%m%d')
which gives 19910102
Then I tried to use num2date
import matplotlib.dates as dates
new_x = dates.num2date(x)
Plotting:
plt.plot_date(new_x, other_data, fmt="bo", tz=None, xdate=True)
But I get an error. It says "ValueError: year is out of range". Any solutions?
You can do this more simply using plot() instead of plot_date().
First, convert your strings to instances of Python datetime.date:
import datetime as dt
dates = ['01/02/1991','01/03/1991','01/04/1991']
x = [dt.datetime.strptime(d,'%m/%d/%Y').date() for d in dates]
y = range(len(x)) # many thanks to Kyss Tao for setting me straight here
Then plot:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
plt.plot(x,y)
plt.gcf().autofmt_xdate()
I don't have enough reputation to comment on @bernie's answer in reply to @user1506145, but I ran into the same issue.
What fixes it is the interval parameter of the locator:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=5))
plt.plot(days,y)
plt.gcf().autofmt_xdate()
plt.show()
As @KyssTao has been saying, help(dates.num2date) says that x has to be a float giving the number of days since 0001-01-01 plus one (matplotlib 3.3 and later count days from 1970-01-01 by default instead, which does not change the conclusion). Hence, 19910102 is not 2/Jan/1991, because if you counted 19910101 days from 0001-01-01 you'd get something in the year 54513 or similar (divide by 365.25, number of days in a year).
Use datestr2num instead (see help(dates.datestr2num)):
new_x = dates.datestr2num(date) # where date is '01/02/1991'
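A short round trip illustrates the difference. This sketch assumes the default dateutil parsing, which reads '01/02/1991' month-first as 2 January 1991. It also shows that a number assembled from the digits YYYYMMDD is rejected as a day count.

```python
import datetime as dt

import matplotlib.dates as mdates

date = "01/02/1991"

# datestr2num parses the string (via dateutil) and returns matplotlib's date number
num = mdates.datestr2num(date)
# num2date is the inverse and returns a timezone-aware datetime
parsed = mdates.num2date(num)
print(parsed.date())

# 19910102 is interpreted as a count of days, which lands far beyond year 9999
try:
    mdates.num2date(19910102)
    out_of_range = False
except (ValueError, OverflowError):
    out_of_range = True
print(out_of_range)
```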
Adapting @Jacek Szałęga's answer for the use of a figure fig and corresponding axes object ax:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(days,y)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator(interval=5))
ax.tick_params(axis='x', labelrotation=45)
plt.show()
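If choosing the interval by hand is inconvenient, matplotlib also provides AutoDateLocator, which picks a tick spacing from the visible range, and ConciseDateFormatter (available since matplotlib 3.1), which shortens the labels. This is a different technique from the fixed interval above, shown here only as a sketch. The start date is fixed and the data is random, purely for illustration.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it to get an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(1)
N = 100
y = np.random.rand(N)
start = dt.datetime(2021, 1, 1)  # fixed start date so the sketch is reproducible
days = [start + dt.timedelta(days=i) for i in range(N)]

fig, ax = plt.subplots()
ax.plot(days, y)

# Let matplotlib choose the tick spacing from the visible date range
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
# Compact labels that avoid repeating the year on every tick
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

n_ticks = len(ax.xaxis.get_majorticklocs())
print(n_ticks)
```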
