Plotting Pandas Datetime Timeseries in AM/PM format - python

I have a pandas series with Timestamp indices that I'd like to plot.
print example.head()
2015-08-11 20:07:00-05:00 26
2015-08-11 20:08:00-05:00 66
2015-08-11 20:09:00-05:00 71
2015-08-11 20:10:00-05:00 63
2015-08-11 20:11:00-05:00 73
I plot it in pandas with:
plt.figure(figsize = (15,8))
cubs1m.plot(kind='area')
I'd like the time values on the x-axis to show up in AM/PM format (8:08PM), not military time (20:08). Is there an easy way to do this?
Also, how would I control the number of ticks and the number of labels when plotting with pandas?
Thanks in advance.

Your question has two elements:
1. How to control the number of ticks/labels on a plot
2. How to change 24-hour time to 12-hour time
Axes methods set_xticks, set_yticks, set_xticklabels, and set_yticklabels control the ticks and the labels:
import matplotlib.pyplot as plt
plt.plot(range(10), range(10))
plt.gca().set_xticks(range(0,10,2))
plt.gca().set_xticklabels(['a{}'.format(ii) for ii in range(0,10,2)])
To change the time format, use strftime with the %I (12-hour clock) and %p (AM/PM) directives (see: How can I convert 24 hour time to 12 hour time?). The pd.datetime alias was deprecated and has been removed from recent pandas releases, so call strftime on the index directly:
import pandas as pd
data = pd.Series(range(12), index=pd.date_range('2016-2-3 9:00','2016-2-3 20:00', freq='H'))
ax = data.plot(xticks=data.index[::2])
ax.set_xticklabels(data.index[::2].strftime('%I %p'))
This question covers an alternate approach to plotting with dates: Pandas timeseries plot setting x-axis major and minor ticks and labels
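As a further sketch, Matplotlib's DateFormatter can produce the AM/PM labels directly, without building the label list by hand. The example assumes timezone-naive timestamps and uses made-up data modelled on the question (the names series and labels are illustrative); note that the %p output depends on the locale.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Made-up data modelled on the question (timezone dropped for simplicity)
idx = pd.date_range("2015-08-11 20:07", periods=5, freq="min")
series = pd.Series([26, 66, 71, 63, 73], index=idx)

fig, ax = plt.subplots(figsize=(15, 8))
ax.fill_between(series.index, series.values)  # area-style plot

# One major tick per minute, labelled on a 12-hour clock with AM/PM
ax.xaxis.set_major_locator(mdates.MinuteLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%I:%M %p"))

fig.canvas.draw()  # forces the tick labels to be computed
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

Changing the locator (for example a different interval) controls how many ticks appear, while the formatter controls only how each tick is written.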

Related

Time series data visualization issue

I have time series data, shown below, keyed by year and week. The data runs from week 1 of 2014 to week 52 of 2015.
Below is a line plot of that data.
As you can see, the x-axis labelling is not what I was trying to achieve: the point after 201453 should be 201501, there should not be a straight line across the gap, and the axis should not run up to 201499. How can I rescale the x-axis to match the Due_date column exactly? Below is the code
rand_products = np.random.choice(Op_2['Sp_number'].unique(), 3)
selected_products = Op_2[Op_2['Sp_number'].isin(rand_products)][['Due_date', 'Sp_number', 'Billing']]
plt.figure(figsize=(20,10))
plt.grid(True)
g = sns.lineplot(data=selected_products, x='Due_date', y='Billing', hue='Sp_number', ci=False, legend='full', palette='Set1');
The gap appears because values such as 201401 are read as integers, so the axis jumps from 201453 straight to 201501 and leaves a numeric gap where no weeks exist. To fix it, convert the numbers to dates before plotting.
As the full data is not available, the example below builds a two-column dataframe in which Due_date holds integers of the form YYYYWW and Billing holds random numbers. Convert the integers to datetimes as shown and the gap disappears.
import numpy as np
import pandas as pd
import random
import matplotlib.pyplot as plt
import seaborn as sns
Due_date = list(np.arange(201401,201454)) #Year 2014
Due_date.extend(np.arange(201501,201553)) #Year 2015
Billing = random.sample(range(500, 1000), 105) #billing numbers
df = pd.DataFrame({'Due_date': Due_date, 'Billing': Billing})
df.Due_date = df.Due_date.astype(str)
df.Due_date = pd.to_datetime(df['Due_date']+ '-1',format="%Y%W-%w") #Convert to date
plt.figure(figsize=(20,10))
plt.grid(True)
ax = sns.lineplot(data=df, x='Due_date', y='Billing', ci=False, legend='full', palette='Set1')
Output graph
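To make the cause of the gap concrete, here is a minimal sketch comparing the step between consecutive weeks when the codes are treated as integers and when they are parsed as dates. The four week codes are hypothetical, chosen only because they straddle the year boundary; the parsing uses the same "%Y%W-%w" format as the answer.

```python
import pandas as pd

# Hypothetical YYYYWW codes spanning the 2014/2015 year boundary
codes = pd.Series([201451, 201452, 201501, 201502])

# Treated as integers, the year boundary shows up as a large jump
numeric_steps = codes.diff().dropna().tolist()

# Parsed as "Monday of week WW in year YYYY", the steps are measured in days
dates = pd.to_datetime(codes.astype(str) + "-1", format="%Y%W-%w")
day_steps = dates.diff().dropna().dt.days.tolist()

print(numeric_steps)
print(day_steps)
```

The integer steps are uneven across the year boundary, while the parsed dates are evenly spaced, which is why plotting against datetimes removes the gap.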

Plotting time series dataframe in python

I am having a really hard time plotting a time series from a data frame in Python. The column dtypes are below.
Time_split datetime64[ns]
Total_S4_Sig1 float64
The time_split column is the X axis and is the time variable. The total s4 is the Y variable and is a float.
0 15:21:00
1 15:22:00
2 15:23:00
3 15:24:00
4 15:25:00
5 19:29:00
6 19:30:00
7 19:31:00
8 19:32:00
9 19:33:00
Please note that the timestamps never have a seconds component (it is always 00) and that the data is continuous at one-minute intervals.
The data will NOT NECESSARILY start on a whole hour; it could start at any time, for example 15:35. I want the x-axis major ticks at full hours (19:00, 21:00, 22:00) and the minor ticks at half hours (21:30, 19:30). I do not want the seconds shown, as they are useless.
In short, I want to show hour and minute in HH:MM format, with major ticks at whole hours and minor ticks at half hours.
keydata["Time_split"] = keydata["Time_split"].dt.time
keydata.plot(x='Time_split', y='Total_S4_Sig1')
plt.show()
This code leads to such a plot.
I do not want the seconds to be shown and I want the marking at full hours and minor markings at half hours.
keydata["Time_split"] = keydata["Time_split"].dt.time
time_form = mdates.DateFormatter("%H:%M")
ax = keydata.plot(x='Time_split', y='Total_S4_Sig1')
ax.xaxis.set_major_formatter(time_form)
plt.show()
This code leads to such a plot.
Please be advised the seconds will always be 00
Try using matplotlib date formatting
import matplotlib.dates as mdates
date_fmt = mdates.DateFormatter('%H:%M:%S')
# plot your data
ax = df.plot.line(x='time', y='values')
# add the date formatter as the x axis tick formatter
ax.xaxis.set_major_formatter(date_fmt)
The following should address the problems you are facing:
import pandas as pd
from datetime import date, datetime, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as md
import numpy as np
#testing data
#keydata = pd.read_csv('test.txt',sep='\t',header=None,names=['Time_split','Total_S4_Sig1'])
x = pd.to_datetime(keydata['Time_split'])
y = keydata['Total_S4_Sig1']
# plot
fig, ax = plt.subplots(1, 1)
ax.plot(x, y,'ok')
# Format x tick labels as hour:minute
xformatter = md.DateFormatter('%H:%M')
# Major ticks every hour
ax.xaxis.set_major_locator(md.HourLocator(interval=1))
# Minor ticks every half hour
ax.xaxis.set_minor_locator(md.MinuteLocator(byminute=[0,30],interval=1))
ax.xaxis.set_major_formatter(xformatter)
plt.show()
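For a version that runs without the original file, here is a minimal sketch using synthetic minute-wise data (the values and the date are invented; only the column names come from the question). It keeps the column as full datetimes rather than converting it with .dt.time, on the assumption that Matplotlib's date locators and formatters need datetime values to work with.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Synthetic minute-wise data that does not start on a whole hour
times = pd.date_range("2021-01-01 15:21", "2021-01-01 19:33", freq="min")
keydata = pd.DataFrame({
    "Time_split": times,
    "Total_S4_Sig1": np.linspace(0.0, 1.0, len(times)),  # made-up values
})

fig, ax = plt.subplots()
ax.plot(keydata["Time_split"], keydata["Total_S4_Sig1"])

# Major ticks on whole hours, minor ticks on the half hours, labels as HH:MM
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(byminute=[30]))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

fig.canvas.draw()  # forces the tick labels to be computed
major_labels = [t.get_text() for t in ax.get_xticklabels()]
print(major_labels)
```

Here the minor locator uses byminute=[30] only, so minor ticks do not coincide with the hourly major ticks.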

How to display only certain x axis values on plot

I am plotting values from a dataframe where time is the x-axis. The time is formatted as 00:00 to 23:45. I only want to display the specific times 00:00, 06:00, 12:00, 18:00 on the x-axis of my plot. How can this be done? I have posted two figures, the first shows the format of my dataframe after setting the index to time. And the second shows my figure. Thank you for your help!
monday.set_index("Time", drop=True, inplace=True)
monday_figure = monday.plot(kind='line', legend = False,
title = 'Monday Average Power consumption')
monday_figure.xaxis.set_major_locator(plt.MaxNLocator(8))
Edit: Adding data as text:
Time,DayOfWeek,kW
00:00:00,Monday,5.8825
00:15:00,Monday,6.0425
00:30:00,Monday,6.0025
00:45:00,Monday,5.7475
01:00:00,Monday,6.11
01:15:00,Monday,5.8025
01:30:00,Monday,5.6375
01:45:00,Monday,5.85
02:00:00,Monday,5.7250000000000005
02:15:00,Monday,5.66
02:30:00,Monday,6.0025
02:45:00,Monday,5.71
03:00:00,Monday,5.7425
03:15:00,Monday,5.6925
03:30:00,Monday,5.9475
03:45:00,Monday,6.380000000000001
04:00:00,Monday,5.65
04:15:00,Monday,5.8725
04:30:00,Monday,5.865
04:45:00,Monday,5.71
05:00:00,Monday,5.6925
05:15:00,Monday,5.9975000000000005
05:30:00,Monday,5.905000000000001
05:45:00,Monday,5.93
06:00:00,Monday,5.6025
06:15:00,Monday,6.685
06:30:00,Monday,7.955
06:45:00,Monday,8.9225
07:00:00,Monday,10.135
07:15:00,Monday,12.9475
07:30:00,Monday,14.327499999999999
07:45:00,Monday,14.407499999999999
08:00:00,Monday,15.355
08:15:00,Monday,16.2175
08:30:00,Monday,18.355
08:45:00,Monday,18.902499999999996
09:00:00,Monday,19.0175
09:15:00,Monday,20.0025
09:30:00,Monday,20.355
09:45:00,Monday,20.3175
10:00:00,Monday,20.8025
10:15:00,Monday,20.765
10:30:00,Monday,21.07
10:45:00,Monday,19.9825
11:00:00,Monday,20.94
11:15:00,Monday,22.1325
11:30:00,Monday,20.6275
11:45:00,Monday,21.4475
12:00:00,Monday,22.092499999999998
The image above is produced using the code from the comment below.
Make sure you have a datetime index using pd.to_datetime when plotting timeseries.
I then used matplotlib.dates (imported as mdates) to place the desired ticks and format them in the plot. I don't know if it can be done from pandas with df.plot.
See matplotlib date tick labels. You can customize the HourLocator or use a different locator to suit your needs. Minor ticks are created the same way with ax.xaxis.set_minor_locator. Hope it helps.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# Using your dataframe
df = pd.read_clipboard(sep=',')
# Make sure you have a datetime index
df['Time'] = pd.to_datetime(df['Time'])
df = df.set_index('Time')
fig, ax = plt.subplots(1,1)
ax.plot(df['kW'])
# Use mdates to detect hours
locator = mdates.HourLocator(byhour=[0,6,12,18])
ax.xaxis.set_major_locator(locator)
# Format x ticks
formatter = mdates.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(formatter)
# rotates and right aligns the x labels, and moves the bottom of the axes up to make room for them
fig.autofmt_xdate()
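On the open question of doing this from pandas: one possibility, sketched here under the assumption that df.plot is called with x_compat=True (which asks pandas to leave the x-axis in Matplotlib's date units), is to apply the same locator and formatter to the axes that pandas returns. The data is synthetic and the date is arbitrary; only the frame name monday and the kW column follow the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Synthetic 15-minute data for one day (values are made up)
idx = pd.date_range("2021-01-04 00:00", "2021-01-04 23:45", freq="15min")
monday = pd.DataFrame({"kW": np.linspace(5.0, 22.0, len(idx))}, index=idx)

# x_compat=True keeps the x-axis in Matplotlib date units
ax = monday.plot(kind="line", legend=False, x_compat=True,
                 title="Monday Average Power consumption")

# Show only 00:00, 06:00, 12:00 and 18:00
ax.xaxis.set_major_locator(mdates.HourLocator(byhour=[0, 6, 12, 18]))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

ax.figure.canvas.draw()  # forces the tick labels to be computed
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```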

Manipulating Dates in x-axis Pandas Matplotlib

I have a pretty simple set of data as displayed below. I am looking for a way to plot this stacked bar chart and format the x-axis (dates) so it starts at 1996-31-12 and ends at 2016-31-12 on increments of 365 days. The code I have written is plotting every single date and therefore the x-axis is very bunched up and not readable.
Dataframe:
Date A B
1996-31-12 10 3
1997-31-03 5 6
1997-31-07 7 5
1997-30-11 3 12
1997-31-12 4 10
1998-31-03 5 8
.
.
.
2016-31-12 3 9
This is a similar question: Pandas timeseries plot setting x-axis major and minor ticks and labels
You can manage this using matplotlib itself instead of pandas.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# if your dates are strings, parse them first; the sample data is in year-day-month order
df.Date = pd.to_datetime(df.Date, format='%Y-%d-%m')
fig,ax = plt.subplots()
ax.plot_date(df.Date,df.A)
ax.plot_date(df.Date,df.B)
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b\n%Y'))
plt.show()
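Since the question asks for a stacked bar chart, here is a minimal sketch of the same yearly-tick idea applied to bars drawn with Matplotlib directly. It uses only the first rows of the question's table; the bar width of 30 days is an arbitrary choice for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# First rows of the question's table; dates are in year-day-month order
df = pd.DataFrame({
    "Date": ["1996-31-12", "1997-31-03", "1997-31-07",
             "1997-30-11", "1997-31-12", "1998-31-03"],
    "A": [10, 5, 7, 3, 4, 5],
    "B": [3, 6, 5, 12, 10, 8],
})
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%d-%m")

fig, ax = plt.subplots()
bar_width = 30  # width in days (arbitrary for this sketch)
ax.bar(df["Date"], df["A"], width=bar_width, label="A")
# Stack B on top of A by starting each B bar at the height of A
ax.bar(df["Date"], df["B"], width=bar_width, bottom=df["A"], label="B")

# One labelled tick per year instead of one per data point
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
ax.legend()
```

Because the bars are positioned on a real date axis, the number of ticks no longer depends on the number of rows.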

matplotlib only business days without weekends on x-axis with plot_date

I have the following persistent problem:
The following code should draw a straight line:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
d = pd.date_range(start="1/1/2012", end="2/1/2012", freq="B")
v = np.linspace(1,10,len(d))
plt.plot_date(d,v,"-")
But all I get is a jagged line because "plot_date" somehow fills in the weekends between the dates in "d".
Is there a way to force matplotlib to take my dates (only business days) as is, without filling in the weekend dates?
>>>d
DatetimeIndex(['2012-01-02', '2012-01-03', '2012-01-04', '2012-01-05',
'2012-01-06', '2012-01-09', '2012-01-10', '2012-01-11',
'2012-01-12', '2012-01-13', '2012-01-16', '2012-01-17',
'2012-01-18', '2012-01-19', '2012-01-20', '2012-01-23',
'2012-01-24', '2012-01-25', '2012-01-26', '2012-01-27',
'2012-01-30', '2012-01-31', '2012-02-01'],
dtype='datetime64[ns]', freq='B')
plot_date converts each date to a floating-point number of days since a fixed epoch (0001-01-01 in older Matplotlib releases, 1970-01-01 by default since Matplotlib 3.3), plots against those numbers, and then converts the tick positions back to dates to draw readable labels. Every calendar day therefore counts as 1, business day or not, so weekends appear as gaps.
You can plot your data against a uniform range of numbers instead, but if you want dates as tick labels you have to build them yourself.
d = pd.date_range(start="1/1/2012", end="2/1/2012", freq="B")
v = np.linspace(1,10,len(d))
plt.plot(range(d.size), v)
xticks = plt.xticks()[0]
# step in business days so each label matches the plotted position
xticklabels = [(d[0] + pd.offsets.BDay(int(x))).strftime('%Y-%m-%d') for x in xticks]
plt.xticks(xticks, xticklabels)
plt.autoscale(True, axis='x', tight=True)
But be aware that the labels can be misleading. The segment between 2012-01-02 and 2012-01-09 represents five days, not seven.
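A variation on the same idea, sketched here as one possible approach rather than the answer's method, is to let a FuncFormatter look up the date for each integer position. Labels then stay correct when the view is zoomed or panned, and positions outside the data get no label. The helper name position_to_date is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MaxNLocator
import numpy as np
import pandas as pd

d = pd.date_range(start="2012-01-01", end="2012-02-01", freq="B")
v = np.linspace(1, 10, len(d))


def position_to_date(pos, _tick_index=None):
    """Map an integer x position to the business day stored at that index."""
    i = int(round(pos))
    if 0 <= i < len(d):
        return d[i].strftime("%Y-%m-%d")
    return ""  # no label for positions outside the data


fig, ax = plt.subplots()
ax.plot(np.arange(len(d)), v, "-")  # evenly spaced positions, so no weekend gaps
ax.xaxis.set_major_locator(MaxNLocator(integer=True))  # ticks on whole positions only
ax.xaxis.set_major_formatter(FuncFormatter(position_to_date))
fig.autofmt_xdate()
```

The same caveat applies: equal spacing on the axis no longer means equal spans of calendar time.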
