Python, matplotlib - legend based on a conditional variable

I have data that I want to show in a scatter plot.
I also have labels for the same data, so I am coloring the points conditionally:
import numpy as np
import matplotlib.pyplot as plt
# X (features) and Y (labels) are my data arrays
fig = plt.figure()
r = fig.add_subplot(121)
r.scatter(np.arange(500), X[:500, 0], c=Y[:500])
# x and y labels set here
g = fig.add_subplot(122)
g.scatter(np.arange(500), X[:500, 1], c=Y[:500])
# x and y labels set here
plt.show()
I also need a legend that shows which label has which color. I tried this:
plt.legend((r, g), ("one", "zero"), scatterpoints = 1, loc = "upper left")
but I get a warning
.../site-packages/matplotlib/legend.py:633: UserWarning: Legend does not support <matplotlib.axes._subplots.AxesSubplot object at 0x7fe37f460668> instances.
A proxy artist may be used instead.
and the legend is not displayed.

I was able to run your code by substituting
r.scatter(np.arange(500), np.arange(500), c= np.arange(500))
g.scatter(np.arange(500), np.arange(500), c= np.arange(500))
I got a similar error that points me to a page on matplotlib.org, see below:
/Users/sdurant/anaconda/lib/python2.7/site-packages/matplotlib/legend.py:611: UserWarning: Legend does not support instances.
A proxy artist may be used instead.
See: http://matplotlib.org/users/legend_guide.html#using-proxy-artist
"#using-proxy-artist".format(orig_handle))
I don't understand exactly what you want your legend to look like, but that page has examples for a few types, hope this helps.
Edit: That makes more sense, is this roughly what you are looking for then?
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np
# proxy artists that stand in for the two classes in the legend
red_patch = mpatches.Patch(color='red', label='one')
blue_patch = mpatches.Patch(color='blue', label='zero')
area = np.pi * 30
fig = plt.figure()
r = fig.add_subplot(121)
# cmap='bwr' with vmin=0, vmax=1 draws 0 in blue and 1 in red, matching the patches
r.scatter(np.arange(10), np.arange(10), c=[np.random.randint(2) for k in range(10)], s=area, cmap='bwr', vmin=0, vmax=1)
# x and y labels set here
plt.legend(handles=[red_patch, blue_patch])
g = fig.add_subplot(122)
g.scatter(np.arange(10), np.arange(10), c=[np.random.randint(2) for k in range(10)], s=area, cmap='bwr', vmin=0, vmax=1)
# x and y labels set here
plt.legend(handles=[red_patch, blue_patch])
plt.show()
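One way to keep the legend and the point colors in sync is to map each label to an explicit color and build the proxy patches from the same mapping. The following is only a sketch under assumptions: the labels are binary 0/1, and X, Y, label_colors, and label_names are placeholder names that do not come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to open a window
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))       # placeholder features (assumption)
Y = rng.integers(0, 2, size=500)    # placeholder 0/1 labels (assumption)

# One explicit label -> color mapping drives both the points and the legend.
label_colors = {0: "blue", 1: "red"}
label_names = {0: "zero", 1: "one"}
point_colors = [label_colors[int(label)] for label in Y]

# Proxy artists: one patch per class, independent of the scatter collections.
handles = [
    mpatches.Patch(color=label_colors[key], label=label_names[key])
    for key in sorted(label_colors)
]

fig, axes = plt.subplots(1, 2)
for column, ax in enumerate(axes):
    ax.scatter(np.arange(len(Y)), X[:, column], c=point_colors, s=10)
    ax.legend(handles=handles, loc="upper left")
```

Because the patches and the points read from the same dictionary, changing a color in one place changes it everywhere.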

Related

Add meaningful minor ticks to a modified axis?

This example is specifically relating to plotting data as a function of log(redshift+1) and having a reference redshift axis but can be easily generalised to any functional modification.
I've written a neat little function (with the help of some question/answers on here) that allows me to easily add a redshift axis to the top of a log(1+redshift) plot. I am really struggling to get meaningful minor ticks (and would rather not share my dismal efforts!).
Here is the code, including example plot:
In this case, I would like redshifts at every 0.1 increment not occupied by a major tick, with the flexibility of changing that 0.1 in the function call.
import matplotlib.pyplot as plt
import numpy as np
def add_zaxis(axis,denomination):
    oldx = axis.get_xlim()
    axis.set_xlim(0., None)
    zspan = [(10**x)-1 for x in axis.get_xlim()]
    denom = denomination
    zmax = int(np.floor(zspan[1]/denom))*denom
    zspan[1] = zmax
    k = len(np.arange(zspan[0],zspan[1],denom))+1
    zs = np.linspace(zspan[0],zspan[1],k)
    z_ticks = [np.log10(1+x) for x in zs]
    axz = axis.twiny()
    axz.set_xticks(z_ticks)
    axz.set_xticklabels(['{:g}'.format(y) for y in zs])
    axz.set_xlim(oldx)
    axis.set_xlim(oldx)
    return axz
data = np.random.randn(500)
data = data[data>0.]
fig, ax = plt.subplots(1)
plt.hist(np.log10(data+1), bins=22)
ax.set_xlabel('log(z+1)')
ax.minorticks_on()
axz = add_zaxis(ax,.3)
axz.set_xlabel('z')
axz.minorticks_on()
The idea would be to use a FixedLocator to position the ticks on the axis. You may then have one FixedLocator for the major ticks and one for the minor ticks.
import matplotlib.pyplot as plt
import matplotlib.ticker
import numpy as np
def add_zaxis(ax,d=0.3, dminor=0.1):
    f = lambda x: np.log10(x+1)
    invf = lambda x: 10.0**x - 1.
    xlim = ax.get_xlim()
    zlim = [invf(x) for x in xlim]
    axz = ax.twiny()
    axz.set_xlim(xlim)
    zs = np.arange(0,zlim[1],d)
    zpos = f(zs)
    axz.xaxis.set_major_locator(matplotlib.ticker.FixedLocator(zpos))
    # FixedFormatter expects strings; '{:g}' avoids labels such as 0.30000000000000004
    axz.xaxis.set_major_formatter(matplotlib.ticker.FixedFormatter(['{:g}'.format(z) for z in zs]))
    zsminor = np.arange(0,zlim[1],dminor)
    zposminor = f(zsminor)
    axz.xaxis.set_minor_locator(matplotlib.ticker.FixedLocator(zposminor))
    axz.tick_params(axis='x',which='minor',bottom=False, top=True)
    axz.set_xlabel('z')
data = np.random.randn(400)
data = data[data>0.]
fig, ax = plt.subplots(1)
plt.hist(np.log10(data+1), bins=22)
ax.set_xlabel('log(z+1)')
add_zaxis(ax)
ax.minorticks_on()
ax.tick_params(axis='x',which='minor',bottom=True, top=False)
plt.show()
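As an alternative to twiny with FixedLocator (a different technique from the answer above), matplotlib 3.1 and later provide Axes.secondary_xaxis, which takes a forward and an inverse function and keeps the second axis aligned by itself. A minimal sketch, assuming the same log(z+1) setup; to_z and to_logz are illustrative names, and the 0.3 and 0.1 spacings are the values from the question:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to open a window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MultipleLocator


def to_z(x):
    """Map a log10(1 + z) coordinate back to z."""
    return 10.0 ** np.asarray(x) - 1.0


def to_logz(z):
    """Map z to the log10(1 + z) coordinate of the main axis."""
    return np.log10(1.0 + np.asarray(z))


rng = np.random.default_rng(0)
data = rng.standard_normal(500)
data = data[data > 0.0]

fig, ax = plt.subplots()
ax.hist(to_logz(data), bins=22)
ax.set_xlabel("log(z+1)")

# functions=(forward, inverse): forward converts main-axis values to z.
secax = ax.secondary_xaxis("top", functions=(to_z, to_logz))
secax.xaxis.set_major_locator(MultipleLocator(0.3))  # major ticks every 0.3 in z
secax.xaxis.set_minor_locator(MultipleLocator(0.1))  # minor ticks every 0.1 in z
secax.set_xlabel("z")
fig.canvas.draw()
```

The tick spacings are expressed directly in z, so changing the minor increment only means changing the MultipleLocator argument.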

Matplotlib: Automatic coloured legend for all subplots using subplot line labels

The code below achieves what I want to do, but does so in a very roundabout way. I have looked around for a succinct way to produce a single legend for a figure that includes multiple subplots that takes into account their labels, to no avail. plt.figlegend() requires you to pass in labels and lines, and plt.legend() requires only handles (slightly better).
My example below illustrates what I want. I have 9 vectors, each with one of 3 categories. I want to plot each vector on a separate sub plot, label it, and plot a legend which indicates (using colour) what the label means; this is the automatic behaviour on a single plot.
Do you know of a better way of achieving the plot below?
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
nr_lines = 9
nr_cats = 3
np.random.seed(1337)
# Data
X = np.random.randn(nr_lines, 100)
labels = ['Category {}'.format(ii) for ii in range(nr_cats)]
y = np.random.choice(labels, nr_lines)
# Ideally wouldn't have to manually pick colours
clrs = matplotlib.rcParams['axes.prop_cycle'].by_key()['color']
clrs = [clrs[ii] for ii in range(nr_cats)]
lab_clr = {k: v for k, v in zip(labels, clrs)}
fig, ax = plt.subplots(3, 3)
ax = ax.flatten()
for ii in range(nr_lines):
    ax[ii].plot(X[ii,:], label=y[ii], color=lab_clr[y[ii]])
lines = [a.lines[0] for a in ax]
l_labels = [l.get_label() for l in lines]
# the hack - get a single occurrence of each label
idx_list = [l_labels.index(lab) for lab in labels]
lines_ = [lines[idx] for idx in idx_list]
#l_labels_ = [l_labels[idx] for idx in idx_list]
plt.legend(handles=lines_, bbox_to_anchor=[2, 2.5])
plt.tight_layout()
plt.savefig('/home/james/Downloads/stack_figlegend_example.png',
            bbox_inches='tight')
You could use a dictionary to collect them using the label as a key. For example:
handles = {}
for ii in range(nr_lines):
    l1, = ax[ii].plot(X[ii,:], label=y[ii], color=lab_clr[y[ii]])
    if y[ii] not in handles:
        handles[y[ii]] = l1
# pass a list so the handles are a plain sequence
plt.legend(handles=list(handles.values()), bbox_to_anchor=[2, 2.5])
You only add a handle to the dictionary if the category isn't already present.
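The dictionary idea can be sketched end to end as follows. This is a self-contained illustration rather than the asker's exact figure: the categories are assigned round-robin so that every category appears, and fig.legend is used here (my assumption, not part of the answer) to attach the legend to the figure instead of the last subplot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to open a window
import matplotlib.pyplot as plt
import numpy as np

nr_lines, nr_cats = 9, 3
rng = np.random.default_rng(1337)
X = rng.standard_normal((nr_lines, 100))
labels = ["Category {}".format(i) for i in range(nr_cats)]
y = [labels[i % nr_cats] for i in range(nr_lines)]  # round-robin categories (assumption)

# Take the first nr_cats colors of the default property cycle.
colors = plt.rcParams["axes.prop_cycle"].by_key()["color"][:nr_cats]
label_color = dict(zip(labels, colors))

fig, axes = plt.subplots(3, 3)
handles = {}
for ax, row, label in zip(axes.flat, X, y):
    (line,) = ax.plot(row, label=label, color=label_color[label])
    handles.setdefault(label, line)  # keep only the first line seen per category

# The legend labels are read from the handles themselves.
legend = fig.legend(handles=list(handles.values()), loc="upper right")
```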

Specify format of floats for tick labels

I am trying to set the format to two decimal numbers in a matplotlib subplot environment. Unfortunately, I do not have any idea how to solve this task.
To prevent using scientific notation on the y-axis I used ScalarFormatter(useOffset=False) as you can see in my snippet below. I think my task should be solved by passing further options/arguments to the used formatter. However, I could not find any hint in matplotlib's documentation.
How can I set two decimal digits or none (both cases are needed)? I am not able to provide sample data, unfortunately.
-- SNIPPET --
f, axarr = plt.subplots(3, sharex=True)
data = conv_air
x = range(0, len(data))
axarr[0].scatter(x, data)
axarr[0].set_ylabel(r'$T_\mathrm{air,2,2}$', size=FONT_SIZE)
axarr[0].yaxis.set_major_locator(MaxNLocator(5))
axarr[0].yaxis.set_major_formatter(ScalarFormatter(useOffset=False))
axarr[0].tick_params(direction='out', labelsize=FONT_SIZE)
axarr[0].grid(which='major', alpha=0.5)
axarr[0].grid(which='minor', alpha=0.2)
data = conv_dryer
x = range(0, len(data))
axarr[1].scatter(x, data)
axarr[1].set_ylabel(r'$T_\mathrm{dryer,2,2}$', size=FONT_SIZE)
axarr[1].yaxis.set_major_locator(MaxNLocator(5))
axarr[1].yaxis.set_major_formatter(ScalarFormatter(useOffset=False))
axarr[1].tick_params(direction='out', labelsize=FONT_SIZE)
axarr[1].grid(which='major', alpha=0.5)
axarr[1].grid(which='minor', alpha=0.2)
data = conv_lambda
x = range(0, len(data))
axarr[2].scatter(x, data)
axarr[2].set_xlabel('Iterationsschritte', size=FONT_SIZE)
axarr[2].xaxis.set_major_locator(MaxNLocator(integer=True))
axarr[2].set_ylabel(r'$\lambda$', size=FONT_SIZE)
axarr[2].yaxis.set_major_formatter(ScalarFormatter(useOffset=False))
axarr[2].yaxis.set_major_locator(MaxNLocator(5))
axarr[2].tick_params(direction='out', labelsize=FONT_SIZE)
axarr[2].grid(which='major', alpha=0.5)
axarr[2].grid(which='minor', alpha=0.2)
See the documentation for matplotlib.ticker in general, and FormatStrFormatter specifically:
from matplotlib.ticker import FormatStrFormatter
fig, ax = plt.subplots()
ax.yaxis.set_major_formatter(FormatStrFormatter('%.2f'))
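To see what a given format string will produce before attaching it to an axis, a formatter can be called directly with a value and a tick position. A small sketch; the variable names are illustrative:

```python
from matplotlib.ticker import FormatStrFormatter

two_decimals = FormatStrFormatter("%.2f")
no_decimals = FormatStrFormatter("%.0f")

# A formatter is callable as formatter(value, position).
print(two_decimals(3.14159, 0))  # 3.14
print(no_decimals(3.14159, 0))   # 3
```

This covers both cases asked about: "%.2f" for two decimal digits and "%.0f" for none.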
If you are directly working with matplotlib's pyplot (plt) and if you are more familiar with the new-style format string, you can try this:
from matplotlib.ticker import StrMethodFormatter
plt.gca().yaxis.set_major_formatter(StrMethodFormatter('{x:,.0f}')) # No decimal places
plt.gca().yaxis.set_major_formatter(StrMethodFormatter('{x:,.2f}')) # 2 decimal places
From the documentation:
class matplotlib.ticker.StrMethodFormatter(fmt)
Use a new-style format string (as used by str.format()) to format the
tick.
The field used for the value must be labeled x and the field used for
the position must be labeled pos.
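The two field names from the quoted documentation can be tried out without a plot. The sketch below uses a made-up format string that shows both x and pos:

```python
from matplotlib.ticker import StrMethodFormatter

# x is the tick value, pos is the tick position.
formatter = StrMethodFormatter("{x:,.2f} (tick {pos})")
label = formatter(1234.5, 3)
print(label)
```

The comma in the format spec adds a thousands separator, which is why the answer's examples use '{x:,.2f}' rather than '{x:.2f}'.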
The answer above is probably the correct way to do it, but didn't work for me.
The hacky way that solved it for me was the following:
ax = <whatever your plot is>
# get the current labels
labels = [item.get_text() for item in ax.get_xticklabels()]
# Beat them into submission and set them back again
ax.set_xticklabels([str(round(float(label), 2)) for label in labels])
# Show the plot, and go home to family
plt.show()
Format labels using a lambda function
The same plot three times with different y-labeling
Minimal example:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.ticker  # makes mpl.ticker available
fig, axs = plt.subplots(1, 3)
xs = np.arange(10)
ys = 1 + xs ** 2 * 1e-3
axs[0].set_title('default y-labeling')
axs[0].scatter(xs, ys)
axs[1].set_title('custom y-labeling')
axs[1].scatter(xs, ys)
axs[2].set_title('x, pos arguments')
axs[2].scatter(xs, ys)
fmt = lambda x, pos: '1+ {:.0f}e-3'.format((x-1)*1e3, pos)
axs[1].yaxis.set_major_formatter(mpl.ticker.FuncFormatter(fmt))
fmt = lambda x, pos: 'x={:f}\npos={:f}'.format(x, pos)
axs[2].yaxis.set_major_formatter(mpl.ticker.FuncFormatter(fmt))
You can also use 'real'-functions instead of lambdas, of course.
https://matplotlib.org/3.1.1/gallery/ticks_and_spines/tick-formatters.html
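For the "real function" variant, a sketch might look like this. The function name offset_label is hypothetical; it reproduces the first lambda above:

```python
from matplotlib.ticker import FuncFormatter


def offset_label(x, pos):
    """Label a tick as an offset from 1, in units of 1e-3."""
    return "1+ {:.0f}e-3".format((x - 1) * 1e3)


formatter = FuncFormatter(offset_label)
print(formatter(1.004, 0))  # 1+ 4e-3
```

It is attached the same way as the lambda version, for example with axs[1].yaxis.set_major_formatter(formatter).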
In matplotlib 3.1, you can also use ticklabel_format. To prevent scientific notation and offsets:
plt.gca().ticklabel_format(axis='both', style='plain', useOffset=False)

Discrete colorbar in matplotlib [duplicate]

How does one set the color of a line in matplotlib with scalar values provided at run time using a colormap (say jet)? I tried a couple of different approaches here and I think I'm stumped. values[] is a sorted array of scalars, curves is a set of 1-d arrays, and labels is an array of text strings. All of the arrays have the same length.
fig = plt.figure()
ax = fig.add_subplot(111)
jet = colors.Colormap('jet')
cNorm = colors.Normalize(vmin=0, vmax=values[-1])
scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=jet)
lines = []
for idx in range(len(curves)):
    line = curves[idx]
    colorVal = scalarMap.to_rgba(values[idx])
    retLine, = ax.plot(line, color=colorVal)
    #retLine.set_color()
    lines.append(retLine)
ax.legend(lines, labels, loc='upper right')
ax.grid()
plt.show()
The error you are receiving is due to how you define jet. You are creating the base class Colormap with the name 'jet', but this is very different from getting the default definition of the 'jet' colormap. This base class should never be created directly, and only the subclasses should be instantiated.
What you've found with your example is a buggy behavior in Matplotlib. There should be a clearer error message generated when this code is run.
This is an updated version of your example:
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import matplotlib.cm as cmx
import numpy as np
# define some random data that emulates your intended code:
NCURVES = 10
np.random.seed(101)
curves = [np.random.random(20) for i in range(NCURVES)]
values = range(NCURVES)
fig = plt.figure()
ax = fig.add_subplot(111)
# replace the next line
#jet = colors.Colormap('jet')
# with
jet = cm = plt.get_cmap('jet')
cNorm = colors.Normalize(vmin=0, vmax=values[-1])
scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=jet)
print(scalarMap.get_clim())
lines = []
for idx in range(len(curves)):
    line = curves[idx]
    colorVal = scalarMap.to_rgba(values[idx])
    colorText = (
        'color: (%4.2f,%4.2f,%4.2f)'%(colorVal[0],colorVal[1],colorVal[2])
    )
    retLine, = ax.plot(line,
                       color=colorVal,
                       label=colorText)
    lines.append(retLine)
#added this to get the legend to work
handles,labels = ax.get_legend_handles_labels()
ax.legend(handles, labels, loc='upper right')
ax.grid()
plt.show()
Resulting in:
Using a ScalarMappable is an improvement over the approach presented in my related answer:
creating over 20 unique legend colors using matplotlib
I thought it would be beneficial to include what I consider a simpler method, using numpy's linspace coupled with matplotlib's cm-type object. It's possible that the above solution is for an older version. I am using Python 3.4.3, matplotlib 1.4.3, and numpy 1.9.3, and my solution is as follows.
import matplotlib.pyplot as plt
from matplotlib import cm
from numpy import linspace
start = 0.0
stop = 1.0
number_of_lines= 1000
cm_subsection = linspace(start, stop, number_of_lines)
colors = [ cm.jet(x) for x in cm_subsection ]
for i, color in enumerate(colors):
    plt.axhline(i, color=color)
plt.ylabel('Line Number')
plt.show()
This results in 1000 uniquely-colored lines that span the entire cm.jet colormap as pictured below. If you run this script you'll find that you can zoom in on the individual lines.
Now say I want my 1000 line colors to just span the greenish portion between lines 400 to 600. I simply change my start and stop values to 0.4 and 0.6 and this results in using only 20% of the cm.jet color map between 0.4 and 0.6.
So in a one line summary you can create a list of rgba colors from a matplotlib.cm colormap accordingly:
colors = [ cm.jet(x) for x in linspace(start, stop, number_of_lines) ]
In this case I use the commonly invoked map named jet but you can find the complete list of colormaps available in your matplotlib version by invoking:
>>> from matplotlib import cm
>>> dir(cm)
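The one-line summary can be wrapped in a small helper. This is a sketch only: sample_colormap is a hypothetical name, and it looks the colormap up by name with plt.get_cmap instead of using the cm.jet attribute.

```python
import matplotlib.pyplot as plt
import numpy as np


def sample_colormap(name, n, start=0.0, stop=1.0):
    """Return n RGBA tuples evenly spaced over [start, stop] of a colormap."""
    cmap = plt.get_cmap(name)
    return [cmap(x) for x in np.linspace(start, stop, n)]


# Five colors taken from the greenish 0.4-0.6 stretch of jet.
colors = sample_colormap("jet", 5, start=0.4, stop=0.6)
```

Each entry is an (r, g, b, a) tuple that can be passed as color= to any plotting call.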
A combination of line styles, markers, and qualitative colors from matplotlib:
import itertools
import matplotlib as mpl
import matplotlib.pyplot as plt
N = 8*4+10
l_styles = ['-','--','-.',':']
m_styles = ['','.','o','^','*']
colormap = mpl.cm.Dark2.colors # Qualitative colormap
for i,(marker,linestyle,color) in zip(range(N),itertools.product(m_styles,l_styles, colormap)):
    plt.plot([0,1,2],[0,2*i,2*i], color=color, linestyle=linestyle,marker=marker,label=i)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.,ncol=4)
UPDATE: Supporting not only ListedColormap, but also LinearSegmentedColormap
import itertools
import matplotlib.pyplot as plt
Ncolors = 8
#colormap = plt.cm.Dark2# ListedColormap
colormap = plt.cm.viridis# LinearSegmentedColormap
Ncolors = min(colormap.N,Ncolors)
mapcolors = [colormap(int(x*colormap.N/Ncolors)) for x in range(Ncolors)]
N = Ncolors*4+10
l_styles = ['-','--','-.',':']
m_styles = ['','.','o','^','*']
fig,ax = plt.subplots(gridspec_kw=dict(right=0.6))
for i,(marker,linestyle,color) in zip(range(N),itertools.product(m_styles,l_styles, mapcolors)):
    ax.plot([0,1,2],[0,2*i,2*i], color=color, linestyle=linestyle,marker=marker,label=i)
ax.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.,ncol=3,prop={'size': 8})
You can do it the way I described from my now-deleted account (it was banned from posting). It is rather simple and looks nice.
I usually use the third of the three versions below; I have not checked versions 1 and 2.
from matplotlib.pyplot import cm
import numpy as np
# n should be the number of curves to plot: n = len(array_of_curves_to_plot)
# version 1:
color=cm.rainbow(np.linspace(0,1,n))
for i,c in zip(range(n),color):
    ax1.plot(x, y,c=c)
# or version 2 (faster and better):
color=iter(cm.rainbow(np.linspace(0,1,n)))
c=next(color)
plt.plot(x,y,c=c)
# or version 3:
color=iter(cm.rainbow(np.linspace(0,1,n)))
for i in range(n):
    c=next(color)
    ax1.plot(x, y,c=c)
Example of version 3 (figure not included): Ship RAO of Roll vs Ikeda damping as a function of Roll amplitude A44

Stop matplotlib repeating labels in legend

Here is a very simplified example:
xvalues = [2,3,4,6]
for x in xvalues:
    plt.axvline(x,color='b',label='xvalues')
plt.legend()
The legend will now show 'xvalues' as a blue line 4 times in the legend.
Is there a more elegant way of fixing this than the following?
for i,x in enumerate(xvalues):
    if not i:
        plt.axvline(x,color='b',label='xvalues')
    else:
        plt.axvline(x,color='b')
plt.legend takes as parameters
A list of axis handles which are Artist objects
A list of labels which are strings
These parameters are both optional defaulting to plt.gca().get_legend_handles_labels().
You can remove duplicate labels by putting them in a dictionary before calling legend. This is because dicts can't have duplicate keys.
For example:
For Python versions < 3.7
from collections import OrderedDict
import matplotlib.pyplot as plt
handles, labels = plt.gca().get_legend_handles_labels()
by_label = OrderedDict(zip(labels, handles))
plt.legend(by_label.values(), by_label.keys())
For Python versions >= 3.7
As of Python 3.7, dictionaries retain insertion order by default. Thus, there is no need for OrderedDict from the collections module.
import matplotlib.pyplot as plt
handles, labels = plt.gca().get_legend_handles_labels()
by_label = dict(zip(labels, handles))
plt.legend(by_label.values(), by_label.keys())
Docs for plt.legend
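Applied to the example from the question, the dictionary approach can be checked like this. A minimal sketch that uses an explicit Axes object instead of plt.gca():

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to open a window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
for x in [2, 3, 4, 6]:
    ax.axvline(x, color="b", label="xvalues")

handles, labels = ax.get_legend_handles_labels()  # four identical labels here
by_label = dict(zip(labels, handles))             # duplicate keys collapse to one entry
legend = ax.legend(list(by_label.values()), list(by_label.keys()))
```

Note that for a repeated label the dictionary keeps the last handle with that label, which is harmless when all duplicates look the same.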
handles, labels = ax.get_legend_handles_labels()
handle_list, label_list = [], []
for handle, label in zip(handles, labels):
    if label not in label_list:
        handle_list.append(handle)
        label_list.append(label)
plt.legend(handle_list, label_list)
I don't know if this can be considered "elegant", but you can have your label a variable that gets set to "_nolegend_" after first usage:
my_label = "xvalues"
xvalues = [2,3,4,6]
for x in xvalues:
    plt.axvline(x, color='b', label=my_label)
    my_label = "_nolegend_"
plt.legend()
This can be generalized using a dictionary of labels if you have to put several labels:
my_labels = {"x1" : "x1values", "x2" : "x2values"}
x1values = [1, 3, 5]
x2values = [2, 4, 6]
for x in x1values:
    plt.axvline(x, color='b', label=my_labels["x1"])
    my_labels["x1"] = "_nolegend_"
for x in x2values:
    plt.axvline(x, color='r', label=my_labels["x2"])
    my_labels["x2"] = "_nolegend_"
plt.legend()
(Answer inspired by https://stackoverflow.com/a/19386045/1878788)
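The same "_nolegend_" idea can be packaged so that the plotting loops stay uncluttered. The helper label_once below is hypothetical, a sketch of one possible shape rather than anything matplotlib provides:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to open a window
import matplotlib.pyplot as plt


def label_once(label, seen):
    """Return label on first use; afterwards return '_nolegend_' so it is skipped."""
    if label in seen:
        return "_nolegend_"
    seen.add(label)
    return label


fig, ax = plt.subplots()
seen = set()
for x in [1, 3, 5]:
    ax.axvline(x, color="b", label=label_once("x1values", seen))
for x in [2, 4, 6]:
    ax.axvline(x, color="r", label=label_once("x2values", seen))

handles, labels = ax.get_legend_handles_labels()  # underscore labels are left out
ax.legend()
```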
Problem - 3D Array
Questions: Nov 2012, Oct 2013
import numpy as np
a = np.random.random((2, 100, 4))
b = np.random.random((2, 100, 4))
c = np.random.random((2, 100, 4))
Solution - dict uniqueness
For my case _nolegend_ (bli and DSM) would not work, nor would label if i==0. ecatmur's answer uses get_legend_handles_labels and reduces the legend down with collections.OrderedDict. Fons demonstrates this is possible without an import.
In line with these answers, I suggest using dict for unique labels.
# Step-by-step
ax = plt.gca() # Get the axes you need
a = ax.get_legend_handles_labels() # a = [(h1 ... h2) (l1 ... l2)] non unique
b = {l:h for h,l in zip(*a)} # b = {l1:h1, l2:h2} unique
c = [*zip(*b.items())] # c = [(l1 l2) (h1 h2)]
d = c[::-1] # d = [(h1 h2) (l1 l2)]
plt.legend(*d)
Or
plt.legend(*[*zip(*{l:h for h,l in zip(*ax.get_legend_handles_labels())}.items())][::-1])
Maybe less legible and memorable than Matthew Bourque's solution. Code golf welcome.
Example
import numpy as np
a = np.random.random((2, 100, 4))
b = np.random.random((2, 100, 4))
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1)
ax.plot(*a, 'C0', label='a')
ax.plot(*b, 'C1', label='b')
ax.legend(*[*zip(*{l:h for h,l in zip(*ax.get_legend_handles_labels())}.items())][::-1])
# ax.legend() # Old, ^ New
plt.show()
Based on answer https://stackoverflow.com/a/13589144/9132798 and https://stackoverflow.com/a/19386045/9132798
plt.gca().get_legend_handles_labels()[1] gives a list of names, it is possible to check if the label is already in the list while in the loop plotting (label= name[i] if name[i] not in plt.gca().get_legend_handles_labels()[1] else '').
For the given example this solution would look like:
import matplotlib.pyplot as plt
xvalues = [2,3,4,6]
for x in xvalues:
    plt.axvline(x, color='b',
                label='xvalues' if 'xvalues'
                not in plt.gca().get_legend_handles_labels()[1] else '')
plt.legend()
This is much shorter than https://stackoverflow.com/a/13589144/9132798 and more flexible than https://stackoverflow.com/a/19386045/9132798, as it can be used with any kind of loop and with each plot function in the loop individually.
However, for many iterations it is probably slower than https://stackoverflow.com/a/13589144/9132798.
These code snippets didn't work for me personally. I was plotting two different groups in two different colors. The legend would show two red markers and two blue markers, when I only wanted to see one per color. I'll paste a simplified version of what did work for me:
Import statements
import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerLine2D
Plot data
points_grp, = plt.plot(x[grp_idx], y[grp_idx], color=c.c[1], marker=m, ms=4, lw=0, label=leglab[1])
points_ctrl, = plt.plot(x[ctrl_idx], y[ctrl_idx], color=c.c[0], marker=m, ms=4, lw=0, label=leglab[0])
Add legend
points_dict = {points_grp: HandlerLine2D(numpoints=1),points_ctrl: HandlerLine2D(numpoints=1)}
leg = ax.legend(fontsize=12, loc='upper left', bbox_to_anchor=(1, 1.03),handler_map=points_dict)
