I have a bar graph with multiple data series, and I want to format the x-axis tick values to two decimal places (%.2f). I already tried using set_major_formatter on the first graph, but it resets the values to 0, whereas they should look like those in the second graph.
How can I fix this?
My code looks like this:
import matplotlib.pyplot as plt
import pandas as pd
import matplotlib.ticker as mtick
# select the measurement location
MATH = "import/data/place"
SAVE = "save/location"
fig, axes = plt.subplots(figsize=(12,15),nrows=2, ncols=1) # size of the plots and the placing
fig.subplots_adjust(hspace=0.5) # set space between plots
DATA = pd.read_csv(MATH,delimiter=',',usecols = [2,3,4,5,6,7,8,9,10,11,12],names = ['set_t','set_rh',
'type','math','ref','LUFFT','VPL','VPR','VVL','VVR','PRO'], parse_dates=True)
# select the data
temp = DATA.loc[(DATA['type']=='T')&(DATA['math']=='dif')] # dif temperature data
rh = DATA.loc[((DATA['type']=='RH')&(DATA['math']=='dif'))] # dif relative humidity data
# plot temperature
fg = temp.plot.bar(x='set_t',y = ['LUFFT','VPL','VPR','VVL','VVR','PRO'],
color = ['b','firebrick','orange','forestgreen','darkturquoise','indigo'],
ax=axes[0])
fg.grid(True)
fg.set_ylabel('$ΔT$(°C)',fontsize = 12)
fg.set_xlabel('ref $T$ (°C)',fontsize = 12)
fg.set_title('Difference in T from reference at constant relative humidity 50%',fontsize = 15)
fg.yaxis.set_major_formatter(mtick.FormatStrFormatter('%.2f'))
fg.xaxis.set_major_formatter(mtick.FormatStrFormatter('%.2f'))
# plot relative humidity
df = rh.plot.bar(x='set_t',y = ['LUFFT','VPL','VPR','VVL','VVR','PRO'],
color = ['b','firebrick','orange','forestgreen','darkturquoise','indigo'],
ax=axes[1])
df.grid(True)
df.set_ylabel('$ΔU$(%)',fontsize = 12)
df.set_xlabel('ref $T$ (°C)',fontsize = 12)
df.set_title('Difference in U from reference at constant relative humidity 50%',fontsize = 15)
plt.tight_layout()
plt.savefig(SAVE + "_example.jpg")
plt.show()
A sample of my data:
07:40:00,07:50:00,39.85716354999982,51.00504745588235,T,dif,,0.14283645000018197,-0.07502069285698099,-0.15716354999978677,0.0020201234696060055,-0.07111703837193772,-0.0620802166664447,
07:40:00,07:50:00,39.85716354999982,51.00504745588235,RH,dif,,-0.40504745588239643,3.994952544117652,2.994952544117652,4.994952544117652,,6.994952544117652,
08:40:00,08:50:00,34.861160704969016,51.1297401832298,T,dif,,0.22883929503095857,0.2509082605481865,-0.2575243413326831,0.24864321659958222,0.14092262836431502,-0.04441070496899613,
08:40:00,08:50:00,34.861160704969016,51.1297401832298,RH,dif,,-0.32974018322978793,3.8702598167702007,2.8702598167702007,4.870259816770201,,6.870259816770201,
This happens because in a grouped bar plot like this, made by pandas, the x-axis loses its actual 'range', and the value associated with each tick becomes the tick position itself. That's a bit cryptic, but you can see with fg.get_xlim() that the values have lost 'touch' with the original data and are simply increasing integers. You can explore/debug the 'values' and 'positions' Matplotlib uses if you provide a FuncFormatter with a function like this:
def check_pos(val, pos):
    print(val, pos)
    return '%.2f' % val
This shows that a value-based formatter such as FormatStrFormatter is not going to work in your case, because it only ever receives the integer positions.
Luckily the ticklabels are set correctly (as text), so you could parse these to float, and format them as you wish.
Remove your formatter altogether, and set the xticklabels with:
fg.set_xticklabels(['%.2f' % float(x.get_text()) for x in fg.get_xticklabels()])
Note that Matplotlib itself is perfectly capable of preserving the correct tick values in combination with a bar plot, but you would have to do the 'grouping' etc. yourself, so that's not very convenient either.
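To make the relabelling step concrete, here is a minimal, self-contained sketch. The two-row demo frame is built from the sample rows in the question and keeps only two of the series; names such as demo, tick_positions and new_labels are illustrative and not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical two-row frame based on the sample data in the question
demo = pd.DataFrame({
    "set_t": [39.85716354999982, 34.861160704969016],
    "LUFFT": [0.14283645000018197, 0.22883929503095857],
    "VPL": [-0.07502069285698099, 0.2509082605481865],
})

fig, ax = plt.subplots()
demo.plot.bar(x="set_t", y=["LUFFT", "VPL"], ax=ax)

# pandas places the bar groups at integer positions; the data values
# only survive as the text of the tick labels
tick_positions = [float(p) for p in ax.get_xticks()]

# Parse the label text back to float and reformat it to two decimals
new_labels = ["%.2f" % float(t.get_text()) for t in ax.get_xticklabels()]
ax.set_xticklabels(new_labels)
```

Because the labels are replaced as fixed text, this should be done after the bars are drawn and without a FormatStrFormatter on the x-axis.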
I am trying to plot time-series data by hour, as shown in the image, but I keep getting this error: Locator attempting to generate 91897 ticks ([15191.0, ..., 19020.0]), which exceeds Locator.MAXTICKS (1000). I have tried all the solutions I could find on StackOverflow for similar problems but still could not get around it. Please help.
Link to image: https://drive.google.com/file/d/1b1PNCqVp7W65ciVPEWELiV2cTiXgBu2V/view?usp=sharing
Link to CSV:
https://drive.google.com/file/d/113kYjsqbyL5wx1j204yK6Wmop4wLsqMQ/view?usp=sharing
Attempted code:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as md
df = pd.read_csv('price.csv', delimiter=',')
plt.rcParams['font.size'] = 18
fig, (ax) = plt.subplots(ncols=1, nrows=1, figsize=(20,15))
# Plot the data
ax.plot(df.VOLUME, label = 'volume')
# Set the title
ax.set_title("AUDUSD Hourly Variation\n", fontsize=23)
# Set the Y-Axis label
ax.set_ylabel("\nVolume")
# Set the X-Axis label
ax.set_xlabel('\nHour')
ax.legend() # Plot the legend for axes
# apply locator and formatter to the ticks on the X axis
ax.xaxis.set_major_locator(md.HourLocator(interval = 1)) # X axis will be formatted in Hour
ax.xaxis.set_major_formatter(md.DateFormatter('%H')) # set the date format to the hour shortname
# Set the limits (range) of the X-Axis
ax.set_xlim([pd.to_datetime('2011.08.05', format = '%Y.%m.%d'),
pd.to_datetime('2022.01.28', format = '%Y.%m.%d')])
plt.tight_layout()
plt.show()
Thanks for your assistance.
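According to the error message, the x-axis spans the Matplotlib date numbers 15191 to 19020. Date numbers are counted in days, so the range covers about 3829 days, or roughly 91,900 hours. An hourly locator would therefore need about 91,900 ticks, while at most 1000 are allowed. Increasing the interval to 4 or 5 hours is not enough for this range: either restrict the x-limits to a window of at most about 1000 hours, or use a coarser locator (with a matching DateFormatter such as '%Y'), for example one tick per year:
ax.xaxis.set_major_locator(md.YearLocator())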
Perhaps . doesn't work well as a separator in dates (because it is also used as a decimal separator in numbers). Try:
ax.set_xlim([pd.to_datetime('2011-08-05', format = '%Y-%m-%d'),
pd.to_datetime('2022-01-28', format = '%Y-%m-%d')])
If that doesn't cause the error, then change your input data correspondingly.
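The numbers in the error message can be checked against the x-limits from the question. The sketch below counts the hourly ticks that range would need, and then, as one possible alternative rather than the only fix, lets AutoDateLocator choose the tick spacing. The two-point line is a stand-in for the real data, and the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as md
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import Locator

# x-limits taken from the question
start = pd.Timestamp("2011-08-05")
end = pd.Timestamp("2022-01-28")

# Number of days in the range and the ticks an hourly locator would need
n_days = (end - start).days
n_hourly_ticks = n_days * 24 + 1

# Placeholder data: two points spanning the same date range
fig, ax = plt.subplots()
ax.plot([start, end], [0, 1])

# Let Matplotlib pick a tick spacing that suits the visible range
locator = md.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(md.ConciseDateFormatter(locator))
n_auto_ticks = len(ax.get_xticks())
```

Here n_hourly_ticks reproduces the 91897 from the error message, far above Locator.MAXTICKS, while the automatic locator stays well below that limit.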
I need to plot a time (timestamp) vs. space (IntersectionId) chart as a single horizontal bar in matplotlib. The color of the bar should change over time based on another column, currState. The colors will be red, green, and yellow.
I have tried to create a dictionary of colors and values but am unsure how to use it in a loop to change the color based on the value. Below are the code I have written so far, a sample of the data, and the current and desired plots.
category_colors = { 'red' : [2,3] , 'yellow' : [5,6] , 'green' : [7,8]}
date_test = df_sample['timestamp']
y_test = ['123456']
data = np.array(list(df_sample.currState))
fig, ax = plt.subplots(figsize=(10, 1))
ax = plt.barh(y_test,date_test,label="trafficsignal")
data_cum = data.cumsum
plt.xlabel('timestamp')
plt.ylabel('space')
plt.title('TimeSpace')
plt.legend()
plt.show()
timestamp currState IntersectionId
2020-02-26 16:12:13.131484 3 12345
2020-02-26 16:12:14.131484 3 12345
2020-02-26 16:12:15.131484 3 12345
2020-02-26 16:12:16.131484 5 12345
2020-02-26 16:12:17.131484 5 12345
2020-02-26 16:12:18.131484 5 12345
2020-02-26 16:12:19.131484 6 12345
2020-02-26 16:12:20.131484 6 12345
2020-02-26 16:12:21.131484 6 12345
Current plot:
Desired plot:
I am not aware of any plotting package that lets you create this plot in a straightforward way based on how your sample table is structured. One option could be to compute a start and an end variable and then create the plot like in the answers to this question, for example using the Altair Gantt chart like in this answer.
Here, I offer two solutions using matplotlib. By taking a look at the matplotlib gallery, I stumbled on the broken_barh plotting function which provides a way to create a plot like the one you want. There are two main hurdles to overcome when using it:
Deciding what unit to use for the x-axis and computing the xranges argument accordingly;
Creating and formatting the x ticks and tick labels.
Let me first create a sample dataset that resembles yours. Note that you will need to adjust the color_dict to your codes:
import numpy as np # v 1.19.2
import pandas as pd # v 1.1.3
import matplotlib.pyplot as plt # v 3.3.2
import matplotlib.dates as mdates
## Create sample dataset
# Light color codes
gre = 1
yel_to_red = 2
red = 3
yel_to_gre = 4
color_dict = {1: 'green', 2: 'yellow', 3: 'red', 4: 'yellow'}
# Light color duration in seconds
sec_g = 45
sec_yr = 3
sec_r = 90
sec_yg = 1
# Light cycle
light_cycle = [gre, yel_to_red, red, yel_to_gre]
sec_cycle = [sec_g, sec_yr, sec_r, sec_yg]
ncycles = 3
sec_total = ncycles*sum(sec_cycle)
# Create variables and store them in a pandas dataframe with the datetime as index
IntersectionId = 12345
currState = np.repeat(ncycles*light_cycle, repeats=ncycles*sec_cycle)
time_sec = pd.date_range(start='2021-01-04 08:00:00', freq='S', periods=sec_total)
df = pd.DataFrame(dict(IntersectionId = np.repeat(12345, repeats=ncycles*sum(sec_cycle)),
currState = currState),
index = time_sec)
The broken_barh function takes the data in the format of tuples where for each colored rectangle that makes up the horizontal bar you need to provide the xy coordinates of the bottom-left corner as well as the length along each axis, like so:
xranges=[(x1_start, x1_length), (x2_start, x2_length), ... ], yranges=(y_all_start, y_all_width)
Note that yranges applies to all rectangles. The unit that is chosen for the x-axis determines how the data must be processed and how the x ticks and tick labels can be created. Here are two alternatives.
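Before turning to those, the argument format can be illustrated on the nine sample rows from the question, using the row number as the x unit. This is only a sketch: it inverts the asker's category_colors dictionary into a code-to-color lookup, and code_to_color, starts and facecolors are names introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Dictionary from the question, inverted so each state code maps to a color
category_colors = {'red': [2, 3], 'yellow': [5, 6], 'green': [7, 8]}
code_to_color = {code: color
                 for color, codes in category_colors.items()
                 for code in codes}

# currState column of the nine sample rows in the question
states = np.array([3, 3, 3, 5, 5, 5, 6, 6, 6])

# Row numbers where a new run of identical codes starts, and the run lengths
starts, = np.where(np.concatenate(([True], states[:-1] != states[1:])))
lengths = np.diff(starts, append=states.size)

# One (start, length) tuple and one color per run
xranges = list(zip(starts.tolist(), lengths.tolist()))
facecolors = [code_to_color[code] for code in states[starts].tolist()]

fig, ax = plt.subplots(figsize=(10, 1))
ax.broken_barh(xranges, (0.75, 0.5), facecolors=facecolors)
ax.set_yticks([1])
ax.set_yticklabels(['12345'])
```

Codes 5 and 6 both map to yellow, so the second and third rectangles share a color even though they are separate runs.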
Matplotlib broken_barh with matplotlib date number as x-axis scale
In this approach, the timestamps of the rows where the light changes are extracted and then converted to matplotlib date numbers. This makes it possible to use a matplotlib date tick locator and formatter. This approach of using the matplotlib date for the x-axis values to simplify tick formatting was inspired by this answer by ImportanceOfBeingErnest.
For both this solution and the next one, the code for getting the indices of light changes and computing the lengths of the periods is based on this answer by Jaime, thanks to the general idea provided by this Gist by alimanfoo.
## Compute variables needed to define the plotting function arguments
states = np.array(df['currState'])
# Create a list of indices of the rows where the light changes
# (i.e. where a new currState code section starts)
starts_indices = np.where(np.concatenate(([True], states[:-1] != states[1:])))
# Append the last index to be able to compute the duration of the last
# light color period recorded in the dataset
starts_end_indices = np.append(starts_indices, states.size-1)
# Get the timestamps of those rows and convert them to python datetime format
starts_end_pydt = df.index[starts_end_indices].to_pydatetime()
# Convert the python timestamps to matplotlib date number that is used as the
# x-axis unit, this makes it easier to format the tick labels
starts_end_x = mdates.date2num(starts_end_pydt)
# Get the duration of each light color in matplotlib date number units
lengths = np.diff(starts_end_x)
# Add one second (computed in matplotlib date number units) to the duration of
# the last light to make the bar chart left and right inclusive instead
# of just left inclusive
pydt_second = (max(starts_end_x) - min(starts_end_x))/starts_end_indices[-1]
lengths[-1] = lengths[-1] + pydt_second
# Compute the arguments for the broken_barh plotting function
xranges = [(start, length) for start, length in zip(starts_end_x, lengths)]
yranges = (0.75, 0.5)
colors = df['currState'].iloc[starts_end_indices[:-1]].map(color_dict)
## Create horizontal bar with colors by using the broken_barh function
## and format ticks and tick labels
fig, ax = plt.subplots(figsize=(10,2))
ax.broken_barh(xranges, yranges, facecolors=colors, zorder=2)
# Create and format x ticks and tick labels
loc = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(loc)
formatter = mdates.AutoDateFormatter(loc)
formatter.scaled[1/(24.*60.)] = '%H:%M:%S' # adjust this according to time range
ax.xaxis.set_major_formatter(formatter)
# Format y-axis and create y tick and tick label
ax.set_ylim(0, 2)
ax.set_yticks([1])
ax.set_yticklabels([df['IntersectionId'].iloc[0]])
plt.grid(axis='x', alpha=0.5, zorder=1)
plt.show()
Matplotlib broken_barh with seconds as x-axis scale
This approach takes advantage of the fact that the indices of the table can be used to compute the lights' durations in seconds. The downside is that this time the x ticks and tick labels must be created from scratch. The code is written so that labels automatically have a nice format depending on the total duration covered by the dataset. The only thing that needs adjusting is the number of ticks, as this depends on how wide the figure is.
The code used to automatically select an appropriate time step between ticks is based on this answer by kennytm. The datetime string format codes are listed here.
## Compute the variables needed for the plotting function arguments
## using the currState variable
states = np.array(df['currState'])
# Create list of indices indicating the rows where the currState code
# changes: note the comma to unpack the tuple
starts_indices, = np.where(np.concatenate(([True], states[:-1] != states[1:])))
# Compute durations of each light in seconds
lengths = np.diff(starts_indices, append=states.size)
## Compute the arguments for the plotting function
xranges = [(start, length) for start, length in zip(starts_indices, lengths)]
yranges = (0.75, 0.5)
colors = df['currState'].iloc[starts_indices].map(color_dict)
## Create horizontal bar with colors using the broken_barh function
fig, ax = plt.subplots(figsize=(10,2))
ax.broken_barh(xranges, yranges, facecolors=colors, zorder=2)
## Create appropriate x ticks and tick labels
# Define time variable and parameters needed for computations
time = pd.DatetimeIndex(df.index).asi8 // 10**9 # time is in seconds
tmin = min(time)
tmax = max(time)
trange = tmax-tmin
# Choose the approximate number of ticks, the exact number depends on
# the automatically selected time step
approx_nticks = 6 # low number selected because figure width is only 10 inches
round_time_steps = [15, 30, 60, 120, 180, 240, 300, 600, 900, 1800, 3600, 7200, 14400]
time_step = min(round_time_steps, key=lambda x: abs(x - trange//approx_nticks))
# Create list of x ticks including the right boundary of the last time point
# in the dataset regardless of whether or not it is aligned with the time step
timestamps = np.append(np.arange(tmin, tmax, time_step), tmax+1)
xticks = timestamps-tmin
ax.set_xticks(xticks)
# Create x tick labels with format depending on time step
fmt_time = '%H:%M:%S' if time_step <= 60 else '%H:%M'
xticklabels = [pd.to_datetime(ts, unit='s').strftime(fmt_time) for ts in timestamps]
ax.set_xticklabels(xticklabels)
## Format y-axis limits, tick and tick label
ax.set_ylim(0, 2)
ax.set_yticks([1])
ax.set_yticklabels([df['IntersectionId'].iloc[0]])
plt.grid(axis='x', alpha=0.5, zorder=1)
plt.show()
Further documentation: to_datetime, to_pydatetime, strftime
I have the date in one column and the time in another, which I retrieved from a database through pandas read_sql. The dataframe looks like the one below (there are 30-40 rows in my dataframe). I want to plot them in a time-series graph, and I should also be able to convert that to a histogram if needed.
COB CALV14
1 2019-10-04 07:04
2 2019-10-04 05:03
3 2019-10-03 16:03
4 2019-10-03 05:15
First I got different errors, such as there being no numeric field to plot. After a lot of searching, the closest post I could find is: Matplotlib date on y axis
I followed it and got some results. However, the problems are:
I have to follow a number of steps (convert to str, then to a list, and then to the matplotlib datetime format) before I can plot them. (Please refer to the code I am using.) There must be a smarter and more precise way to do this.
This does not show the times beside the axis exactly as they appear in the data frame (e.g. it should show 07:03, 05:04, etc.).
I am new to Python and would appreciate any help with this.
Code
ob_frame['COB'] = ob_frame.COB.astype(str)
ob_frame['CALV14'] = ob_frame.CALV14.astype(str)
date = ob_frame.COB.tolist()
time = ob_frame.CALV14.tolist()
y = mdates.datestr2num(date)
x = mdates.datestr2num(time)
fig, ax = plt.subplots(figsize=(9,9))
ax.plot(x, y)
ax.yaxis_date()
ax.xaxis_date()
fig.autofmt_xdate()
plt.show()
I found the answer. I did not need to convert the data retrieved from the DB to string type. The rest of the problem came from not using the right formatting for the tick labels. Here is the complete code, posted in case it helps anyone.
In this code I have swapped the axes, i.e. I plotted dates on the x-axis and time on the y-axis, as it looked better.
###### Import all the libraries and modules needed ######
import IN_OUT_SQL as IS ## IN_OUT_SQL.py is the file where the SQL is stored
import cx_Oracle as co
import numpy as np
import Credential as cd # Credential.py is the file where you store the DB credentials
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import dates as mdates
%matplotlib inline
###### Connect to DB, make the dataframe and prepare the x and y values to be plotted ######
def extract_data(query):
    '''
    This function takes the given query as input, connects to the database, executes the SQL and
    returns the result in a dataframe.
    '''
    cred = cd.POLN_CONSTR #POLN_CONSTR in the credential file stores the credential in '''USERNAME/PASSWORD#DB_NAME''' format
    conn = co.connect(cred)
    frame = pd.read_sql(query, con = conn)
    return frame
query = IS.OUT_SQL
ob_frame = extract_data(query)
ob_frame.dropna(inplace = True) # Drop the rows that contain a NaN value in any column
x = mdates.datestr2num(ob_frame['COB']) # COB is the date in "01-MAR-2020" format; convert it to a Matplotlib date number
y = mdates.datestr2num(ob_frame['CALV14']) # CALV14 is the time in "21:04" format; convert it to a Matplotlib date number
###### Make the Timeseries plot of delivery time in y axis vs delivery date in x axis ######
fig, ax = plt.subplots(figsize=(15,8))
ax.clear() # Clear the axes
ax.plot(x, y, 'o-', color = 'dodgerblue') # Plot the data (the color is set once, via the keyword)
## The two lines below draw horizontal lines at the 07 AM and 05 AM positions
plt.axhline(y = mdates.date2num (pd.to_datetime('07:00')), color = 'red', linestyle = '--', linewidth = 0.75)
plt.axhline(y = mdates.date2num (pd.to_datetime('05:00')), color = 'green', linestyle = '--', linewidth = 0.75)
plt.xticks(x, rotation = 75) # rotation must be a number, not a string
ax.yaxis_date()
ax.xaxis_date()
# The next six lines set the format and spacing of the x and y ticks and their labels
yfmt = mdates.DateFormatter('%H:%M')
xfmt = mdates.DateFormatter('%d-%b-%y')
ax.yaxis.set_major_formatter(yfmt)
ax.xaxis.set_major_formatter(xfmt)
ax.yaxis.set_major_locator(mdates.HourLocator(interval=1)) # Every 1 Hour
ax.xaxis.set_major_locator(mdates.DayLocator(interval=1)) # Every 1 Day
####### Name the x,y labels, titles and beautify the plot #######
plt.style.use('bmh')
plt.xlabel('\nCOB Dates')
plt.ylabel('Time of Delivery (GMT/BST as applicable)\n')
plt.title(" Data readiness time against COBs (Last 3 months)\n")
plt.rcParams["font.size"] = "12" #Change the font
# plt.rcParams["font.family"] = "Times New Roman" # Set the font type if needed
plt.tick_params(left = False, bottom = False, labelsize = 10) #Remove ticks, make tick labelsize 10
plt.box(False)
plt.show()
Output:
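As a lighter-weight variant of the code above, the conversion can also be left to pandas. The sketch below is an assumption-laden illustration rather than the poster's method: it rebuilds the four sample rows from the question by hand instead of querying the database, anchors every time of day on an arbitrary dummy date (1970-01-01) so that it can sit on a date axis, and labels the y ticks with the times exactly as they appear in the frame. The names frame, cob, tod and time_labels are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hand-built copy of the four sample rows shown in the question
frame = pd.DataFrame({
    "COB": ["2019-10-04", "2019-10-04", "2019-10-03", "2019-10-03"],
    "CALV14": ["07:04", "05:03", "16:03", "05:15"],
})

# Dates for the x-axis
cob = pd.to_datetime(frame["COB"])
# Times of day for the y-axis, anchored on an arbitrary dummy date
tod = pd.to_datetime("1970-01-01 " + frame["CALV14"])
time_labels = tod.dt.strftime("%H:%M").tolist()

fig, ax = plt.subplots(figsize=(9, 6))
ax.plot(cob, tod, "o-")

# Put one y tick on each data point and label it with the original time
ax.set_yticks(mdates.date2num(tod.to_numpy()))
ax.set_yticklabels(time_labels)

# Show the dates on the x-axis in day-month-year form
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%b-%y"))
fig.autofmt_xdate()
```

With 30 to 40 rows, one tick per data point may become crowded; in that case an HourLocator with a '%H:%M' DateFormatter, as in the code above, is probably the better choice.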
This is my first attempt using Matplotlib and I am in need of some guidance. I am trying to generate plot with 4 y-axes, two on the left and two on the right with shared x axis. Here's my dataset on shared dropbox folder
import pandas as pd
%matplotlib inline
url ='http://dropproxy.com/f/D34'
df= pd.read_csv(url, index_col=0, parse_dates=[0])
df.plot()
This is what the simple pandas plot looks like:
I would like to plot this similar to the example below, with TMAX and TMIN on primary y-axis (on same scale).
My attempt:
There's one example I found on the matplotlib listserv. I am trying to adapt it to my data, but something is not working right. Here's the script.
# multiple_yaxes_with_spines.py
# This is a template Python program for creating plots (line graphs) with 2, 3,
# or 4 y-axes. (A template program is one that you can readily modify to meet
# your needs). Almost all user-modifiable code is in Section 2. For most
# purposes, it should not be necessary to modify anything else.
# Dr. Phillip M. Feldman, 27 Oct, 2009
# Acknowledgment: This program is based on code written by Jae-Joon Lee,
# URL= http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/matplotlib/
# examples/pylab_examples/multiple_yaxis_with_spines.py?revision=7908&view=markup
# Section 1: Import modules, define functions, and allocate storage.
import matplotlib.pyplot as plt
from numpy import *
def make_patch_spines_invisible(ax):
    ax.set_frame_on(True)
    ax.patch.set_visible(False)
    for sp in ax.spines.itervalues():
        sp.set_visible(False)
def make_spine_invisible(ax, direction):
    if direction in ["right", "left"]:
        ax.yaxis.set_ticks_position(direction)
        ax.yaxis.set_label_position(direction)
    elif direction in ["top", "bottom"]:
        ax.xaxis.set_ticks_position(direction)
        ax.xaxis.set_label_position(direction)
    else:
        raise ValueError("Unknown Direction : %s" % (direction,))
    ax.spines[direction].set_visible(True)
# Create list to store dependent variable data:
y= [0, 0, 0, 0, 0]
# Section 2: Define names of variables and the data to be plotted.
# `labels` stores the names of the independent and dependent variables). The
# first (zeroth) item in the list is the x-axis label; remaining labels are the
# first y-axis label, second y-axis label, and so on. There must be at least
# two dependent variables and not more than four.
labels= ['Date', 'Maximum Temperature', 'Solar Radiation',
'Rainfall', 'Minimum Temperature']
# Plug in your data here, or code equations to generate the data if you wish to
# plot mathematical functions. x stores values of the independent variable;
# y[1], y[2], ... store values of the dependent variable. (y[0] is not used).
# All of these objects should be NumPy arrays.
# If you are plotting mathematical functions, you will probably want an array of
# uniformly spaced values of x; such an array can be created using the
# `linspace` function. For example, to define x as an array of 51 values
# uniformly spaced between 0 and 2, use the following command:
# x= linspace(0., 2., 51)
# Here is an example of 6 experimentally measured y1-values:
# y[1]= array( [3, 2.5, 7.3e4, 4, 8, 3] )
# Note that the above statement requires both parentheses and square brackets.
# With a bit of work, one could make this program read the data from a text file
# or Excel worksheet.
# Independent variable:
x = df.index
# First dependent variable:
y[1]= df['TMAX']
# Second dependent variable:
y[2]= df['RAD']
y[3]= df['RAIN']
y[4]= df['TMIN']
# Set line colors here; each color can be specified using a single-letter color
# identifier ('b'= blue, 'r'= red, 'g'= green, 'k'= black, 'y'= yellow,
# 'm'= magenta, 'y'= yellow), an RGB tuple, or almost any standard English color
# name written without spaces, e.g., 'darkred'. The first element of this list
# is not used.
colors= [' ', '#C82121', '#E48E3C', '#4F88BE', '#CF5ADC']
# Set the line width here. linewidth=2 is recommended.
linewidth= 2
# Section 3: Generate the plot.
N_dependents= len(labels) - 1
if N_dependents > 4: raise Exception, \
'This code currently handles a maximum of four independent variables.'
# Open a new figure window, setting the size to 10-by-7 inches and the facecolor
# to white:
fig= plt.figure(figsize=(16,9), dpi=120, facecolor=[1,1,1])
host= fig.add_subplot(111)
host.set_xlabel(labels[0])
# Use twinx() to create extra axes for all dependent variables except the first
# (we get the first as part of the host axes). The first element of y_axis is
# not used.
y_axis= (N_dependents+2) * [0]
y_axis[1]= host
for i in range(2,len(labels)+1): y_axis[i]= host.twinx()
if N_dependents >= 3:
    # The following statement positions the third y-axis to the right of the
    # frame, with the space between the frame and the axis controlled by the
    # numerical argument to set_position; this value should be between 1.10 and
    # 1.2.
    y_axis[3].spines["right"].set_position(("axes", 1.15))
    make_patch_spines_invisible(y_axis[3])
    make_spine_invisible(y_axis[3], "right")
    plt.subplots_adjust(left=0.0, right=0.8)
if N_dependents >= 4:
    # The following statement positions the fourth y-axis to the left of the
    # frame, with the space between the frame and the axis controlled by the
    # numerical argument to set_position; this value should be between 1.10 and
    # 1.2.
    y_axis[4].spines["left"].set_position(("axes", -0.15))
    make_patch_spines_invisible(y_axis[4])
    make_spine_invisible(y_axis[4], "left")
    plt.subplots_adjust(left=0.2, right=0.8)
p= (N_dependents+1) * [0]
# Plot the curves:
for i in range(1,N_dependents+1):
    p[i], = y_axis[i].plot(x, y[i], colors[i],
                           linewidth=linewidth, label=labels[i])
# Set axis limits. Use ceil() to force upper y-axis limits to be round numbers.
host.set_xlim(x.min(), x.max())
host.set_xlabel(labels[0], size=16)
for i in range(1,N_dependents+1):
    y_axis[i].set_ylim(0.0, ceil(y[i].max()))
    y_axis[i].set_ylabel(labels[i], size=16)
    y_axis[i].yaxis.label.set_color(colors[i])
    for sp in y_axis[i].spines.itervalues():
        sp.set_color(colors[i])
    for obj in y_axis[i].yaxis.get_ticklines():
        # `obj` is a matplotlib.lines.Line2D instance
        obj.set_color(colors[i])
        obj.set_markeredgewidth(3)
    for obj in y_axis[i].yaxis.get_ticklabels():
        obj.set_color(colors[i])
        obj.set_size(12)
        obj.set_weight(600)
# To enable the legend, uncomment the following two lines:
lines= p[1:]
host.legend(lines, [l.get_label() for l in lines])
plt.draw(); plt.show()
And the output
How can I put the scale on max and min temp on a same scale? Also, how can I get rid of second y-axis with black color, scaled from 0 to 10?
Is there a simpler way to achieve this?
How can I put the scale on max and min temp on a same scale?
Plot them in the same axes.
Also, how can I get rid of second y-axis with black color, scaled from 0 to 10?
Do not create that axes.
You want to plot four variables, two of them can go in the same subplot so you only need three subplots. But you are creating five of them?
Step by step
Keep in mind: different y scales <-> different subplots sharing x-axis.
Two variables with a common scale (left), two variables with independent scales (right).
Create the primary subplot, let's call it ax1. Plot everything you want in it, in this case TMIN and TMAX as stated in your question.
Create a twin subplot sharing the x-axis, e.g. ax2 = ax1.twinx() (equivalent to plt.twinx(ax=ax1)). Plot the third variable, say RAIN.
Create another twin subplot the same way, ax3 = ax1.twinx(). Plot the fourth variable, RAD.
Adjust colors, labels, spine positions... to your heart's content.
Unsolicited advice: do not try to fix code you don't understand.
Variation of the original plot showing how you can plot variables on multiple axes
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
url ='http://dropproxy.com/f/D34'
df= pd.read_csv(url, index_col=0, parse_dates=[0])
fig = plt.figure()
ax = fig.add_subplot(111) # Primary y
ax2 = ax.twinx() # Secondary y
# Plot variables
ax.plot(df.index, df['TMAX'], color='red')
ax.plot(df.index, df['TMIN'], color='green')
ax2.plot(df.index, df['RAIN'], color='orange')
ax2.plot(df.index, df['RAD'], color='yellow')
# Custom ylimit
ax.set_ylim(0,50)
# Custom x axis date formats
import matplotlib.dates as mdates
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
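The snippet above puts RAIN and RAD on the same secondary axis. A sketch that follows the step-by-step list more literally, with one shared temperature axis and two independent axes on the right, could look like the following. The data are synthetic placeholders, since the original CSV is only available through the link in the question, and the spine offset of 1.15 and the colors are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the weather data (column names as in the question)
days = pd.date_range("2001-01-01", periods=365, freq="D")
t = np.arange(365)
demo = pd.DataFrame({
    "TMAX": 20 + 10 * np.sin(2 * np.pi * t / 365),
    "TMIN": 10 + 10 * np.sin(2 * np.pi * t / 365),
    "RAIN": 5 * (1 + np.cos(2 * np.pi * t / 30)),
    "RAD": 15 + 10 * np.sin(2 * np.pi * t / 365),
}, index=days)

fig, ax1 = plt.subplots(figsize=(10, 5))
fig.subplots_adjust(right=0.8)  # leave room for the offset axis

# Step 1: both temperatures share the primary y-axis, hence the same scale
ax1.plot(demo.index, demo["TMAX"], color="red", label="TMAX")
ax1.plot(demo.index, demo["TMIN"], color="green", label="TMIN")
ax1.set_ylabel("Temperature")
ax1.legend(loc="upper left")

# Step 2: first twin axis for rainfall
ax2 = ax1.twinx()
ax2.plot(demo.index, demo["RAIN"], color="tab:blue")
ax2.set_ylabel("RAIN")

# Step 3: second twin axis for radiation, its spine pushed outwards
ax3 = ax1.twinx()
ax3.spines["right"].set_position(("axes", 1.15))
ax3.plot(demo.index, demo["RAD"], color="orange")
ax3.set_ylabel("RAD")
```

Only three axes are created for the four variables, so no leftover black axis with a 0 to 10 scale appears.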
I modified @bishopo's suggestions to generate what I wanted; however, the plot still needs some tweaking of the font sizes for the axis labels.
Here's what I have done so far.
import pandas as pd
%matplotlib inline
url ='http://dropproxy.com/f/D34'
df= pd.read_csv(url, index_col=0, parse_dates=[0])
from mpl_toolkits.axes_grid1 import host_subplot
import mpl_toolkits.axisartist as AA
import matplotlib.pyplot as plt
# Set the figure size, dpi, and background color
fig = plt.figure(1, (16,9),dpi =300, facecolor = 'W',edgecolor ='k')
# Update the tick label size to 12
plt.rcParams.update({'font.size': 12})
host = host_subplot(111, axes_class=AA.Axes)
plt.subplots_adjust(right=0.75)
par1 = host.twinx()
par2 = host.twinx()
par3 = host.twinx()
offset = 60
new_fixed_axis = par2.get_grid_helper().new_fixed_axis
new_fixed_axis1 = host.get_grid_helper().new_fixed_axis
par2.axis["right"] = new_fixed_axis(loc="right",
axes=par2,
offset=(offset, 0))
par3.axis["left"] = new_fixed_axis1(loc="left",
axes=par3,
offset=(-offset, 0))
par2.axis["right"].toggle(all=True)
par3.axis["left"].toggle(all=True)
par3.axis["right"].set_visible(False)
# Set limit on both y-axes
host.set_ylim(-30, 50)
par3.set_ylim(-30,50)
host.set_xlabel("Date")
host.set_ylabel(r"Minimum Temperature ($^\circ$C)")
par1.set_ylabel("Solar Radiation (W$m^{-2}$)")
par2.set_ylabel("Rainfall (mm)")
par3.set_ylabel(r'Maximum Temperature ($^\circ$C)')
p1, = host.plot(df.index,df['TMIN'], 'm,')
p2, = par1.plot(df.index, df.RAD, color ='#EF9600', linestyle ='--')
p3, = par2.plot(df.index, df.RAIN, '#09BEEF')
p4, = par3.plot(df.index, df['TMAX'], '#FF8284')
par1.set_ylim(0, 36)
par2.set_ylim(0, 360)
host.legend()
host.axis["left"].label.set_color(p1.get_color())
par1.axis["right"].label.set_color(p2.get_color())
par2.axis["right"].label.set_color(p3.get_color())
par3.axis["left"].label.set_color(p4.get_color())
tkw = dict(size=5, width=1.5)
host.tick_params(axis='y', colors=p1.get_color(), **tkw)
par1.tick_params(axis='y', colors=p2.get_color(), **tkw)
par2.tick_params(axis='y', colors=p3.get_color(), **tkw)
par3.tick_params(axis='y', colors=p4.get_color(), **tkw)
host.tick_params(axis='x', **tkw)
par1.axis["right"].label.set_fontsize(16)
par2.axis["right"].label.set_fontsize(16)
par3.axis["left"].label.set_fontsize(16)
host.axis["bottom"].label.set_fontsize(16)
host.axis["left"].label.set_fontsize(16)
plt.figtext(.5,.92,'Weather Data', fontsize=22, ha='center')
plt.draw()
plt.show()
fig.savefig("Test1.png")
The output
I have written code which plots the past seven day stock value for a user-determined stock market over time.
The problem I have is that I want to format the x axis in a YYMMDD format.
I also don't understand what 2.014041e7 means at the end of the x axis.
Values for x are:
20140421.0, 20140417.0, 20140416.0, 20140415.0, 20140414.0, 20140411.0, 20140410.0
Values for y are:
531.17, 524.94, 519.01, 517.96, 521.68, 519.61, 523.48
My code is as follows:
mini = min(y)
maxi = max(y)
minimum = mini - 75
maximum = maxi + 75
mini2 = int(min(x))
maxi2 = int(max(x))
plt.close('all')
fig, ax = plt.subplots(1)
pylab.ylim([minimum,maximum])
pylab.xlim([mini2,maxi2])
ax.plot(x, y)
ax.plot(x, y,'ro')
ax.plot(x, m*x + c)
ax.grid()
ax.plot()
When plotting your data using your method you are simply plotting your y data against numbers (floats) in x such as 20140421.0 (which I assume you wish to mean the date 21/04/2014).
You need to convert your data from these floats into an appropriate format for matplotlib to understand. The code below takes your two lists (x, y) and converts them.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt
# Original data
raw_x = [20140421.0, 20140417.0, 20140416.0, 20140415.0, 20140414.0, 20140411.0, 20140410.0]
y = [531.17, 524.94, 519.01, 517.96, 521.68, 519.61, 523.48]
# Convert your x-data into an appropriate format.
# date_fmt is a string giving the correct format for your data. In this case
# we are using 'YYYYMMDD.0' as your dates are actually floats.
date_fmt = '%Y%m%d.0'
# Use a list comprehension to convert your dates into datetime objects.
# In the list comp. strptime is used to convert from a string to a datetime
# object.
dt_x = [dt.datetime.strptime(str(i), date_fmt) for i in raw_x]
# Finally we convert the datetime objects into the format used by matplotlib
# in plotting using matplotlib.dates.date2num
x = [mdates.date2num(i) for i in dt_x]
# Now to actually plot your data.
fig, ax = plt.subplots()
# Plot the date numbers with plot and mark the x-axis as a date axis
# (plot_date did both in one call but is deprecated in recent Matplotlib).
ax.plot(x, y, 'bo-')
ax.xaxis_date()
# Create a DateFormatter object which will format your tick labels properly.
# As given in your question I have chosen "YYMMDD"
date_formatter = mdates.DateFormatter('%y%m%d')
# Set the major tick formatter to use your date formatter.
ax.xaxis.set_major_formatter(date_formatter)
# This simply rotates the x-axis tick labels slightly so they fit nicely.
fig.autofmt_xdate()
plt.show()
The code is commented throughout, so it should be self-explanatory. Details on the various modules can be found below:
datetime
matplotlib.dates
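As for the 2.014041e7 shown at the end of the x-axis in the original plot: that is most likely Matplotlib's offset notation for plain numeric axes, meaning the tick labels are displayed relative to that value. It is another sign that the x values were being treated as ordinary floats rather than dates. If pandas is available, the conversion can be written more compactly; the sketch below assumes the same x and y lists as in the question, and the names dates and yymmdd are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Values from the question
raw_x = [20140421.0, 20140417.0, 20140416.0, 20140415.0,
         20140414.0, 20140411.0, 20140410.0]
y = [531.17, 524.94, 519.01, 517.96, 521.68, 519.61, 523.48]

# Drop the ".0", then parse the YYYYMMDD strings into real dates
dates = pd.to_datetime([str(int(v)) for v in raw_x], format="%Y%m%d")

fig, ax = plt.subplots()
ax.plot(dates, y, "bo-")

# Show the tick labels as YYMMDD
ax.xaxis.set_major_formatter(mdates.DateFormatter("%y%m%d"))
fig.autofmt_xdate()

# The same format applied to the data points themselves, for checking
yymmdd = dates.strftime("%y%m%d").tolist()
```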