Plotting dates on the x-axis - python

I am trying to plot information against dates. I have a list of dates in the format "01/02/1991".
I converted them by doing the following:
x = parser.parse(date).strftime('%Y%m%d')
which gives 19910102
Then I tried to use num2date
import matplotlib.dates as dates
new_x = dates.num2date(x)
Plotting:
plt.plot_date(new_x, other_data, fmt="bo", tz=None, xdate=True)
But I get an error. It says "ValueError: year is out of range". Any solutions?

You can do this more simply using plot() instead of plot_date().
First, convert your strings to instances of Python datetime.date:
import datetime as dt
dates = ['01/02/1991','01/03/1991','01/04/1991']
x = [dt.datetime.strptime(d,'%m/%d/%Y').date() for d in dates]
y = range(len(x)) # many thanks to Kyss Tao for setting me straight here
Then plot:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator())
plt.plot(x,y)
plt.gcf().autofmt_xdate()

I don't have enough reputation to comment on #bernie's answer in reply to #user1506145, but I ran into the same issue.
The fix is the interval parameter of DayLocator, which spaces the major ticks several days apart instead of drawing one per day:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=5))
plt.plot(days,y)
plt.gcf().autofmt_xdate()
plt.show()
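
To make the effect of interval concrete, here is a minimal sketch that counts the major ticks DayLocator produces over 100 days. The helper name count_day_ticks and the fixed start date are assumptions made for illustration, and the Agg backend is selected only so the sketch runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it to view the figure
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def count_day_ticks(interval, n_days=100):
    """Plot n_days of dummy data and return how many major ticks DayLocator yields."""
    start = dt.datetime(2021, 1, 1)  # arbitrary fixed start date (assumption)
    days = [start + dt.timedelta(days=i) for i in range(n_days)]
    fig, ax = plt.subplots()
    ax.plot(days, range(n_days))
    ax.xaxis.set_major_locator(mdates.DayLocator(interval=interval))
    fig.canvas.draw()  # force autoscaling so the locator sees the final limits
    n_ticks = len(ax.xaxis.get_majorticklocs())
    plt.close(fig)
    return n_ticks


every_day = count_day_ticks(interval=1)
every_fifth = count_day_ticks(interval=5)
print(every_day, every_fifth)  # the second count should be roughly a fifth of the first
```

With one tick per day the labels overlap badly on a 100-day plot; a larger interval keeps the same formatter but leaves room for each label.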

As #KyssTao has been saying, help(dates.num2date) says that x has to be a float giving the number of days since 0001-01-01 plus one (Matplotlib 3.3 and later count from 1970-01-01 by default, but the argument is the same). Hence, 19910102 is not 2/Jan/1991, because if you counted 19910101 days from 0001-01-01 you'd get something in the year 54513 or similar (divide by 365.25, the number of days in a year).
Use datestr2num instead (see help(dates.datestr2num)):
new_x = dates.datestr2num(date) # where date is '01/02/1991'
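
A small sketch of the point above, using only the date string from the question: the number built with strftime('%Y%m%d') is far too large to be a day count, while datestr2num gives a value that num2date converts back to the intended date. The variable names are chosen here for illustration.

```python
import datetime as dt
import matplotlib.dates as mdates

raw = 19910102  # what strftime('%Y%m%d') produced, read as a number of days
print(raw / 365.25)  # tens of thousands of years, hence "year is out of range"

num = mdates.datestr2num('01/02/1991')  # parsed by dateutil, month first by default
roundtrip = mdates.num2date(num)        # timezone-aware datetime (UTC)
print(num, roundtrip.date())
```

The exact float printed for num depends on the Matplotlib epoch in use, but the round trip should land on 1991-01-02 either way.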

Adapting #Jacek Szałęga's answer for the use of a figure fig and corresponding axes object ax:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import datetime as dt
np.random.seed(1)
N = 100
y = np.random.rand(N)
now = dt.datetime.now()
then = now + dt.timedelta(days=100)
days = mdates.drange(now,then,dt.timedelta(days=1))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(days,y)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator(interval=5))
ax.tick_params(axis='x', labelrotation=45)
plt.show()

Related

How to plot time on the y axis correctly using python matplotlib?

I have two lists containing the sunset and sunrise times and the corresponding dates.
It looks like:
sunrises = ['06:30', '06:28', '06:27', ...]
dates = ['3.21', '3.22', '3.23', ...]
I want to make a plot of the sunrise times as the Y axis and the dates as the X axis.
Simply using
ax.plot(dates, sunrises)
ax.xaxis.set_major_locator(matplotlib.ticker.MultipleLocator(7))
ax.yaxis.set_major_locator(matplotlib.ticker.MultipleLocator(7))
plt.show()
can plot the dates correctly, but the time is wrong:
And actually, the sunrise time isn't supposed to be a straight line.
How do I solve this problem?
You need to convert the time strings into datetime objects, a format that matplotlib can understand:
from matplotlib import pyplot as plt
import matplotlib as mpl
from datetime import datetime
import matplotlib.dates as mdates
sunrises = ['06:30', '06:28', '06:27',]
sunrises_dt = [datetime.strptime(item,'%H:%M') for item in sunrises]
dates = ['3.21', '3.22', '3.23',]
fig,ax = plt.subplots()
ax.plot(dates, sunrises_dt)
ax.yaxis.set_major_formatter(mdates.DateFormatter('%H:%M',))
ax.xaxis.set_major_locator(mpl.ticker.MultipleLocator(1))
plt.show()
This is because your sunrises are not numerical. I'm assuming you'd want them in a form such that "6:30" means 6.5, which is calculated below:
import matplotlib.pyplot as plt
sunrises = ['06:30', '06:28', '06:27']
# This converts to decimals
sunrises = [float(x[0:2])+(float(x[-2:])/60) for x in sunrises]
dates = ['3.21', '3.22', '3.23']
plt.plot(dates, sunrises) # dates on the x-axis, sunrise hours on the y-axis, as the question asks
plt.xlabel('dates')
plt.ylabel('sunrises')
plt.show()
Note, your dates are being treated as decimals. Is this correct?
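
Both answers leave the dates as plain strings. A minimal sketch that parses both lists into date and time objects follows; it assumes that '3.21' means month.day and uses a hypothetical year (2021), neither of which is stated in the question. The Agg backend line is there only so the sketch runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it to view the figure
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

sunrises = ['06:30', '06:28', '06:27']
dates = ['3.21', '3.22', '3.23']
YEAR = 2021  # assumption: the question does not say which year the dates belong to

# x values: real calendar dates; y values: times (strptime fills in 1900-01-01 as the date part)
x = [dt.datetime.strptime(f"{YEAR}.{d}", "%Y.%m.%d").date() for d in dates]
y = [dt.datetime.strptime(t, "%H:%M") for t in sunrises]

fig, ax = plt.subplots()
ax.plot(x, y, "-o")
ax.xaxis.set_major_locator(mdates.DayLocator())               # one tick per day
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m.%d"))   # show month.day only
ax.yaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))   # show hour:minute only
fig.autofmt_xdate()
```

Because both axes now carry real date and time values, the spacing between points reflects the actual differences rather than the order of the strings.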

how do we plot a scatter plot for time in python

I want to plot a scatter plot of the two columns k and s. k should be on the x-axis, showing time on an hourly basis for 24 hours, and s should be on the y-axis. I have already tried some code using sns.relplot but got an attribute error.
data columns in which we want scatter plot
code which we tried with error
Try:
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame([['2020-05-26 06:15:07','105'], ['2020-05-26 06:15:07','41'], ['2020-05-26 06:16:51','95']], columns=["k", "s"])
df.k = pd.to_datetime(df.k, format='%Y-%m-%d %H:%M:%S')
df.s = pd.to_numeric(df.s) # the values are strings; make them numeric for the y-axis
ax = sns.scatterplot(x=df.k, y=df.s) # recent seaborn versions require keyword arguments here
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
plt.show()
Alternatively, use the timestamps as the index and put a major tick on every hour:
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame([['2020-05-26 06:15:07','105'], ['2020-05-26 06:15:07','41'], ['2020-05-26 06:16:51','95']], columns=["k", "s"])
df.k = pd.to_datetime(df.k, format='%Y-%m-%d %H:%M:%S')
df.s = pd.to_numeric(df.s) # the values are strings; make them numeric for the y-axis
df.set_index(['k'],inplace=True)
ax = sns.scatterplot(x=df.index, y=df.s) # recent seaborn versions require keyword arguments here
# ax.set(xlabel="time", ylabel="values")
ax.set_xlim(df.index[0], df.index[-1])
ax.xaxis.set_major_locator(mdates.HourLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
plt.show()
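
For the "hourly basis for 24 hours" part of the question, here is a minimal sketch with plain Matplotlib that pins the x-axis to one full day and labels each hour. The sample rows are hypothetical (only the column names k and s come from the question), and the Agg backend is selected just so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it to view the figure
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Hypothetical sample in the shape described in the question: timestamps k, values s.
df = pd.DataFrame(
    {"k": ["2020-05-26 06:15:07", "2020-05-26 13:40:00", "2020-05-26 21:05:30"],
     "s": [105, 41, 95]}
)
df["k"] = pd.to_datetime(df["k"], format="%Y-%m-%d %H:%M:%S")

day_start = df["k"].min().normalize()        # midnight of the first day in the data
day_end = day_start + pd.Timedelta(days=1)   # midnight of the following day

fig, ax = plt.subplots()
ax.scatter(df["k"], df["s"])
ax.set_xlim(day_start, day_end)                              # show the full 24 hours
ax.xaxis.set_major_locator(mdates.HourLocator())             # one major tick per hour
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))  # label ticks as hour:minute
ax.tick_params(axis="x", rotation=45)
```

Setting the limits explicitly keeps the axis at 24 hours even when the data covers only a few minutes of the day.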

Plot histogram of epoch list, x axis by month-year in PyPlot

With a list of epoch dates, is there a parameter in pyplot or numpy to get a histogram where the bins match the months in the data list? In this example, the list corresponds to random dates in 2011 and 2012. I would like the histogram to show the bars from, for example, February 2012 to October 2013 if the values in data correspond only to dates from these months.
This code makes a histogram, but the number of bins is set manually with bins=24.
import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import random
data = [int(random.randint(1293836400, 1356994800)) for _ in range(1000)]
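# convert the epoch seconds to matplotlib date format
# (mdates.epoch2num was deprecated in Matplotlib 3.5 and removed in later releases)
mpl_data = mdates.date2num(np.array(data, dtype='datetime64[s]'))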
fig, ax = plt.subplots(1,1)
ax.hist(mpl_data, bins=24, ec='black')
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d.%m.%y'))
fig.autofmt_xdate()
plt.show()
To do this you have to pick out the timestamps at which each month begins. Dates and times are always a lot trickier than regular numbers, so while this code looks a bit cumbersome, it does work.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import random
import datetime as d
data = [int(random.randint(1293836400, 1356994800)) for _ in range(1000)]
# create your bins as datetimes marked at the beginning of each month, using datetime objects to increment
mindate = d.datetime.fromtimestamp(min(data))
maxdate = d.datetime.fromtimestamp(max(data))
bindate = d.datetime(year=mindate.year, month=mindate.month, day=1)
bins = [bindate]
while bindate < maxdate:
    if bindate.month == 12:
        bindate = d.datetime(year=bindate.year + 1, month=1, day=1)
    else:
        bindate = d.datetime(year=bindate.year, month=bindate.month + 1, day=1)
    bins.append(bindate)
# mdates.epoch2num was removed from recent Matplotlib releases, so convert via datetime objects and date2num
bins = mdates.date2num(bins)
mpl_data = mdates.date2num([d.datetime.fromtimestamp(t) for t in data])
fig, ax = plt.subplots(1,1, figsize=(16, 4), facecolor='white')
ax.hist(mpl_data, bins=bins, ec='black')
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d.%m.%y'))
fig.autofmt_xdate()
plt.show()
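
The month-boundary loop can also be pulled into a small helper, sketched here with the standard library only. The name month_edges is hypothetical, and the sketch works in UTC (an assumption) so the result does not depend on the local time zone.

```python
import datetime as dt


def month_edges(min_ts, max_ts):
    """Return UTC datetimes at the start of every month covering [min_ts, max_ts]."""
    utc = dt.timezone.utc
    first = dt.datetime.fromtimestamp(min_ts, tz=utc)
    last = dt.datetime.fromtimestamp(max_ts, tz=utc)
    edge = dt.datetime(first.year, first.month, 1, tzinfo=utc)
    edges = [edge]
    while edge <= last:
        # step to the first day of the next month, rolling the year over after December
        if edge.month == 12:
            edge = dt.datetime(edge.year + 1, 1, 1, tzinfo=utc)
        else:
            edge = dt.datetime(edge.year, edge.month + 1, 1, tzinfo=utc)
        edges.append(edge)
    return edges


# Example: data running from mid-February 2012 to late October 2013
lo = dt.datetime(2012, 2, 15, tzinfo=dt.timezone.utc).timestamp()
hi = dt.datetime(2013, 10, 20, tzinfo=dt.timezone.utc).timestamp()
edges = month_edges(lo, hi)
print(edges[0].date(), edges[-1].date(), len(edges))  # 2012-02-01 2013-11-01 22
```

The returned list can be passed through mdates.date2num and used as the bins argument of ax.hist, giving one bar per calendar month present in the data.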
Another approach is to use pandas to group the data by month and then count the entries. The code is much shorter, and you can make a quick bar plot. Re-creating your plot above would take more work, but this gives you a feel for what you can do with other tools:
import pandas as pd
srs = pd.to_datetime(data, unit='s') # convert epoch seconds to a DatetimeIndex
df = pd.DataFrame({'count': np.ones(shape=len(srs))}, index=srs)
fig, ax = plt.subplots(1, 1, figsize=(16,4), facecolor='white')
# pandas 2.2 and later prefer freq='ME' for month-end grouping
df.groupby(pd.Grouper(freq='M')).count().plot.bar(ax=ax)

Adding formatted dates as xticks in Matplotlib

I am trying to add a list of dates to Matplotlib xticks and when I do that the actual plot disappears keeping only xticks.
For example, I have the following code:
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib.dates import (DateFormatter, rrulewrapper, RRuleLocator, YEARLY)
# Generate random data and dates
data = np.random.randn(10000)
start = dt.datetime.strptime("2019-03-14", "%Y-%m-%d")
end = dt.datetime.strptime("2046-07-30", "%Y-%m-%d")
date = [start + dt.timedelta(days=x) for x in range(0, (end-start).days)]
rule = rrulewrapper(YEARLY, byeaster=1, interval=2)
loc = RRuleLocator(rule)
formatter = DateFormatter('%d/%m/%y')
fig, ax = plt.subplots()
ax.xaxis.set_major_locator(loc)
ax.xaxis.set_major_formatter(formatter)
ax.xaxis.set_tick_params(rotation=30, labelsize=10)
plt.plot(data)
# ax.set_xlim(min(date), max(date))
plt.show()
This code plots the data which looks like this:
Now if I uncomment ax.set_xlim(min(date), max(date)) and rerun the code I get:
You can see that I only get the dates, formatted correctly, but not the plot. I am not sure what the problem is here. Any help would be appreciated.
Update
If I change data = np.random.randn(10000) to data = np.random.randn(1000000), then I am able to see the plot, which is not what I want.
Most likely your data is plotted, but not at the correct location: plt.plot(data) uses the sample index (0 to 9999) as the x value, and a date axis reads those as day numbers far outside the 2019 to 2046 limits. If you go along that example you would need to add something like fig.autofmt_xdate() to your code.
The way to do this is to pass the date array along with the data array to the plot method. With the given example it will be:
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib.dates import (DateFormatter, rrulewrapper, RRuleLocator, YEARLY)
# Generate random data and dates
data = np.random.randn(10000)
start = dt.datetime.strptime("2019-03-14", "%Y-%m-%d")
end = dt.datetime.strptime("2046-07-30", "%Y-%m-%d")
date = [start + dt.timedelta(days=x) for x in range(0, (end-start).days)]
rule = rrulewrapper(YEARLY, byeaster=1, interval=2)
loc = RRuleLocator(rule)
formatter = DateFormatter('%d/%m/%y')
fig, ax = plt.subplots()
ax.xaxis.set_major_locator(loc)
ax.xaxis.set_major_formatter(formatter)
ax.xaxis.set_tick_params(rotation=30, labelsize=10)
plt.plot(date, data)
ax.set_xlim(min(date), max(date))
plt.show()
See matplotlib.pyplot.plot() for more information.
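
To see why the line vanished in the original code, the following sketch compares the implicit x values that plt.plot(data) uses (the sample indices) with the x-limits expressed as Matplotlib date numbers. The helper visible_fraction is a name made up for this illustration, and the exact date numbers depend on the Matplotlib epoch in use.

```python
import datetime as dt
import numpy as np
import matplotlib.dates as mdates

start = dt.datetime(2019, 3, 14)
last = dt.datetime(2046, 7, 30) - dt.timedelta(days=1)  # final entry of the date list
lo, hi = mdates.date2num([start, last])  # the x-limits as Matplotlib date numbers


def visible_fraction(n_points):
    """Share of the implicit x values 0..n_points-1 that fall inside [lo, hi]."""
    idx = np.arange(n_points)
    return float(np.mean((idx >= lo) & (idx <= hi)))


print(lo, hi)
print(visible_fraction(10_000))     # plt.plot(data) with 10,000 points
print(visible_fraction(1_000_000))  # plt.plot(data) with 1,000,000 points
```

With 10,000 points none of the indices reach the date window, so nothing is drawn inside the limits. With 1,000,000 points a slice of the indices happens to overlap the window, which matches the behaviour described in the update to the question.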

matplotlib's cursor info seems to be dependent on tick resolution - how can I change this dependency

With matplotlib, how can I see the exact value of the cursor for date values at the bottom right of the interactive plot?
It seems that this depends on the tick resolution:
The following code example produces two plots with different tick resolutions. As a result, the cursor value at the bottom right of the screenshots is 2020 in one case and 2020-09 in the other.
I would like it to be 2020-09-01, which is the resolution (granularity) of the data.
import datetime
import pandas as pd
import matplotlib
min_date = datetime.datetime.now().date()
max_date = min_date + datetime.timedelta(days=2*365)
date_range = [min_date + datetime.timedelta(days=x)
for x in range(0, (max_date - min_date).days)]
df = pd.DataFrame(date_range)
yRange = range(df.shape[0])
df["y"] = yRange
df.columns = ["x","y"]
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
plt.plot(df["x"], df["y"], "-o", markersize=2)
plt.show()
axes = plt.gca()
import matplotlib.ticker as ticker
tick_spacing = 50
axes.xaxis.set_major_locator(ticker.MultipleLocator(tick_spacing))
plt.rcParams["figure.dpi"] = 200
plt.xticks(rotation=20)
plt.grid()
plt.plot(df["x"], df["y"], "-o", markersize=2)
plt.show()
In order to format the x coordinate for datetime axes differently than the actual format used on the axes, you can set the Axes.fmt_xdata attribute to a callable that takes the position in and outputs the desired string.
In this case,
ax.fmt_xdata = lambda x: matplotlib.dates.num2date(x).strftime("%Y-%m-%d")
Example:
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
plt.plot([datetime.now(), datetime(2020,6,6)], [1,2], "-o")
plt.gca().fmt_xdata = lambda x: mdates.num2date(x).strftime("%Y-%m-%d")
plt.show()
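
The formatter can be checked without moving the mouse, since the cursor readout is built through Axes.format_xdata, which uses fmt_xdata when it is set. This is a minimal sketch; the function name day_resolution and the sample dates are chosen here for illustration, and the Agg backend is selected only so it runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit it to use the interactive window
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def day_resolution(x):
    """Format a Matplotlib date number as YYYY-MM-DD for the cursor readout."""
    return mdates.num2date(x).strftime("%Y-%m-%d")


fig, ax = plt.subplots()
ax.plot([dt.datetime(2020, 6, 6), dt.datetime(2020, 9, 1)], [1, 2], "-o")
ax.fmt_xdata = day_resolution

# Ask the axes to format a known x position the same way the status bar would.
sample = mdates.date2num(dt.datetime(2020, 9, 1))
print(ax.format_xdata(sample))  # 2020-09-01
```

Changing the strftime pattern (for example to "%Y-%m-%d %H:%M") changes the readout without touching the tick labels.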
