I have a table that contains three different time characteristics measured for two different parameters. I want to plot those parameters on the x- and y-axes and show bars for the three times on the z-axis. I have created a simple bar plot showing one of the time characteristics:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
columns = ['R','Users','A','B','C']
df=pd.DataFrame({'R':[2,2,2,4,4,4,6,6,6,8,8],
'Users':[80,400,1000,80,400,1000,80,400,1000,80,400],
'A':[ 0.05381,0.071907,0.08767,0.04493,0.051825,0.05295,0.05285,0.0804,0.0967,0.09864,0.1097],
'B':[0.04287,0.83652,5.49683,0.02604,.045599,2.80836,0.02678,0.32621,1.41399,0.19025,0.2111],
'C':[0.02192,0.16217,0.71645, 0.25314,5.12239,38.92758,1.60807,262.4874,8493,11.6025,6288]},
columns=columns)
fig = plt.figure()
ax = plt.axes(projection="3d")
num_bars = 11
x_pos = df["R"]
y_pos = df["Users"]
z_pos = [0] * num_bars
x_size = np.ones(num_bars)/4
y_size = np.ones(num_bars)*50
z_size = df["A"]
ax.bar3d(x_pos, y_pos, z_pos, x_size, y_size, z_size, color='aqua')
plt.show()
This produces a simple 3d barplot:
However, I would like to plot similar bars next to the existing ones for the remaining two columns (B and C) in different colors, and add a legend as well. I could not figure out how to achieve this.
As a side question, is it also possible to show only the values from df on the x- and y-axes? The values are 2-4-6-8 and 80-400-1000; I do not want pyplot to add additional values on those axes.
I managed to find a solution myself. Because the values span several orders of magnitude, I added one to all times (so the logarithm is never negative) and applied np.log to every time column. This puts the values on a scale of roughly 0 to 10 and makes the plot much easier to read. I then loop over the columns and build the values, positions and colors for each one, collecting them in lists. I shift y_pos for each column so the bars are not drawn at the same position.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
columns = ['R','Users','A','B','C']
df=pd.DataFrame({'R':[2,2,2,4,4,4,6,6,6,8,8],
'Users':[80,400,1000,80,400,1000,80,400,1000,80,400],
'A':[ 0.05381,0.071907,0.08767,0.04493,0.051825,0.05295,0.05285,0.0804,0.0967,0.09864,0.1097],
'B':[0.04287,0.83652,5.49683,0.02604,.045599,2.80836,0.02678,0.32621,1.41399,0.19025,0.2111],
'C':[0.02192,0.16217,0.71645, 0.25314,5.12239,38.92758,1.60807,262.4874,8493,11.6025,6288]},
columns=columns)
fig = plt.figure(figsize=(10, 10))
ax = plt.axes(projection="3d")
df["A"] = np.log(df["A"]+1)
df["B"] = np.log(df["B"]+1)
df["C"] = np.log(df["C"]+1)
colors = ['r', 'g', 'b']
num_bars = 11
x_pos = []
y_pos = []
x_size = np.ones(num_bars*3)/4
y_size = np.ones(num_bars*3)*50
c = ['A','B','C']
z_pos = []
z_size = []
z_color = []
for i,col in enumerate(c):
    x_pos.append(df["R"])
    y_pos.append(df["Users"]+i*50)
    z_pos.append([0] * num_bars)
    z_size.append(df[col])
    z_color.append([colors[i]] * num_bars)
x_pos = np.reshape(x_pos,(33,))
y_pos = np.reshape(y_pos,(33,))
z_pos = np.reshape(z_pos,(33,))
z_size = np.reshape(z_size,(33,))
z_color = np.reshape(z_color,(33,))
ax.bar3d(x_pos, y_pos, z_pos, x_size, y_size, z_size, color=z_color)
plt.xlabel('R')
plt.ylabel('Users')
ax.set_zlabel('Time')
from matplotlib.lines import Line2D
legend_elements = [Line2D([0], [0], marker='o', color='w', label='A',markerfacecolor='r', markersize=10),
Line2D([0], [0], marker='o', color='w', label='B',markerfacecolor='g', markersize=10),
Line2D([0], [0], marker='o', color='w', label='C',markerfacecolor='b', markersize=10)
]
# Make legend
ax.legend(handles=legend_elements, loc='best')
# Set view
ax.view_init(elev=35., azim=35)
plt.show()
Final plot:
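A more compact variant of the same idea, offered as a sketch rather than a definitive version. It assumes `np.log1p` is an acceptable replacement for `np.log(x + 1)`, builds the positions with `np.tile` and `np.concatenate` instead of the hard-coded length 33, and uses `set_xticks`/`set_yticks` to restrict the ticks to the values present in `df`, which addresses the side question. The bar depth `dy` and the `Patch` legend handles are my own illustrative choices.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

df = pd.DataFrame({
    'R': [2, 2, 2, 4, 4, 4, 6, 6, 6, 8, 8],
    'Users': [80, 400, 1000, 80, 400, 1000, 80, 400, 1000, 80, 400],
    'A': [0.05381, 0.071907, 0.08767, 0.04493, 0.051825, 0.05295,
          0.05285, 0.0804, 0.0967, 0.09864, 0.1097],
    'B': [0.04287, 0.83652, 5.49683, 0.02604, .045599, 2.80836,
          0.02678, 0.32621, 1.41399, 0.19025, 0.2111],
    'C': [0.02192, 0.16217, 0.71645, 0.25314, 5.12239, 38.92758,
          1.60807, 262.4874, 8493, 11.6025, 6288],
})

time_cols = ['A', 'B', 'C']
colors = ['r', 'g', 'b']
n = len(df)
dy = 50  # bar depth along the Users axis (illustrative choice)

fig = plt.figure(figsize=(10, 10))
ax = plt.axes(projection="3d")

# One block of n bars per time column, shifted along y so they sit side by side
x_pos = np.tile(df['R'].to_numpy(dtype=float), len(time_cols))
y_pos = np.concatenate([df['Users'].to_numpy(dtype=float) + i * dy
                        for i in range(len(time_cols))])
z_size = np.concatenate([np.log1p(df[c].to_numpy()) for c in time_cols])
bar_colors = [c for c in colors for _ in range(n)]  # one color per bar

ax.bar3d(x_pos, y_pos, np.zeros_like(z_size), 0.25, dy, z_size, color=bar_colors)

# Show only the values that occur in df on the x- and y-axes
ax.set_xticks(sorted(df['R'].unique()))
ax.set_yticks(sorted(df['Users'].unique()))

ax.set_xlabel('R')
ax.set_ylabel('Users')
ax.set_zlabel('log(1 + Time)')
ax.legend(handles=[Patch(facecolor=c, label=name)
                   for c, name in zip(colors, time_cols)], loc='best')
```

Call `plt.show()` afterwards to display the figure.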
My question:
When plotting x and y values from a dataframe, the y values may be discrete numbers such as an id_number or a category. A scatter plot then gives linearly spaced y-axis ticks, which can leave large vertical gaps between the plotted values, depending on how far apart the original values are.
What I need is to plot some category values (fixed discrete values) against time events (x-axis) in a scatter plot, but the values in the table are integers, not strings. As I don't know how to do this properly, the following is what I have achieved so far, but only with a modified copy of the table that holds string values. Here is my test data (the original data is large):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtic
import matplotlib.category as mcat
np.random.seed(432987435)
nofpoints = 160
xval = np.arange(nofpoints)
disc = [ 200, 240, 250, 290 ]
yval = np.random.choice( disc , nofpoints)
yval_str = yval.astype(str)
yval , yval_str
cval = np.random.random( nofpoints )
df = pd.DataFrame( { 'xval': xval , 'yval':yval , 'cval': cval })
df_str = pd.DataFrame( { 'xval': xval , 'yval':yval_str , 'cval': cval })
Using the usual plotting method:
fig = plt.figure(dpi=128 , figsize=(12,6))
ax1 = fig.add_subplot(111)
# here we are using the original dataframe(df), without any string field inside.
#ax1.grid(True)
ax1.scatter( 'xval' , 'yval' , data=df , marker='o', facecolor='None' , edgecolor='g')
plt.show()
This is what we get:
Note the large spacing between the values, and that the plotted points do not line up with the tick values. (I don't want to use a legend to show the category via a colormap, since color is reserved for some other purpose.)
With the modified dataframe, which has strings as y-axis values:
fig = plt.figure(dpi=128 , figsize=(12,6))
ax2 = fig.add_subplot(111)
# dataframe used is modified one with a string field inside.
# as we can see the order is shuffled.
ax2.scatter( 'xval' , 'yval' , data=df_str , marker='o', facecolor='None' , edgecolor='k')
plt.show()
To avoid shuffling:
fig = plt.figure(dpi=128 , figsize=(12,6))
ax3 = fig.add_subplot(111)
# to maintain the same order and avoid shuffling we used matplotlib.category
#ax3.grid(True)
disc_str = [ str(x) for x in disc ]
units = mcat.UnitData(sorted(disc_str))
ax3.yaxis.set_units(units)
ax3.yaxis.set_major_locator( mcat.StrCategoryLocator(units._mapping))
ax3.yaxis.set_major_formatter( mcat.StrCategoryFormatter(units._mapping))
ax3.scatter( 'xval' , 'yval' , data=df_str , marker='o', facecolor='None' , edgecolor='y')
plt.show()
Is there any way to achieve this without modifying the original table, i.e. to plot integer category values as y-axis values?
You can do it by replacing ax1.scatter with seaborn.stripplot:
sns.stripplot(ax = ax1, data = df, x = 'xval', y = 'yval_str', marker = 'o', color = 'white', edgecolor = 'green', linewidth = 1)
Before you do that, if you want y axis in a particular order, you should sort your df:
df = pd.DataFrame({'xval': xval, 'yval': yval, 'yval_str': yval_str, 'cval': cval}).sort_values(by = 'yval', ascending = False)
Complete Code
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(432987435)
nofpoints = 160
xval = np.arange(nofpoints)
disc = [200, 240, 250, 290]
yval = np.random.choice(disc, nofpoints)
yval_str = yval.astype(str)
cval = np.random.random(nofpoints)
df = pd.DataFrame({'xval': xval, 'yval': yval, 'yval_str': yval_str, 'cval': cval}).sort_values(by = 'yval', ascending = False)
fig = plt.figure(dpi = 128, figsize = (12, 6))
ax1 = fig.add_subplot(111)
sns.stripplot(ax = ax1, data = df, x = 'xval', y = 'yval_str', marker = 'o', color = 'white', edgecolor = 'green', linewidth = 1)
plt.show()
If you want perfectly horizontally aligned points, you have to pass jitter = False to sns.stripplot:
sns.stripplot(ax = ax1, data = df, x = 'xval', y = 'yval_str', marker = 'o', color = 'white', edgecolor = 'green', linewidth = 1, jitter = False)
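If you would rather stay with plain matplotlib and leave the integer column untouched, one possible sketch is to map each integer to its rank among the unique values and relabel the ticks. This is my own alternative, not part of the answer above; `categories` and `codes` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(432987435)
nofpoints = 160
disc = [200, 240, 250, 290]
df = pd.DataFrame({
    'xval': np.arange(nofpoints),
    'yval': np.random.choice(disc, nofpoints),
    'cval': np.random.random(nofpoints),
})

# categories: sorted unique integers; codes: position of each row's value in categories
categories, codes = np.unique(df['yval'].to_numpy(), return_inverse=True)

fig, ax = plt.subplots(figsize=(12, 6))
# Plot the evenly spaced codes instead of the raw integers
ax.scatter(df['xval'], codes, marker='o', facecolors='none', edgecolors='g')

# Label the evenly spaced positions with the original integer values
ax.set_yticks(np.arange(len(categories)))
ax.set_yticklabels([str(c) for c in categories])
```

The dataframe itself is not modified; only the plotted y positions are the codes. Call `plt.show()` to display the result.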
I am plotting from a CSV file that contains Cartesian coordinates and I want to change it to Polar coordinates, then plot using the Polar coordinates.
Here is the code
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.read_csv('test_for_plotting.csv',index_col = 0)
x_temp = df['x'].values
y_temp = df['y'].values
df['radius'] = np.sqrt( np.power(x_temp,2) + np.power(y_temp,2) )
df['theta'] = np.arctan2(y_temp,x_temp)
df['degrees'] = np.degrees(df['theta'].values)
df['radians'] = np.radians(df['degrees'].values)
ax = plt.axes(polar = True)
ax.set_aspect('equal')
ax.axis("off")
sns.set(rc={'axes.facecolor':'white', 'figure.facecolor':'white','figure.figsize':(10,10)})
# sns.scatterplot(data = df, x = 'x',y = 'y', s= 1,alpha = 0.1, color = 'black',ax = ax)
sns.scatterplot(data = df, x = 'radians',y = 'radius', s= 1,alpha = 0.1, color = 'black',ax = ax)
plt.tight_layout()
plt.show()
Here is the dataset
If you run this code with polar = False and plot with sns.scatterplot(data = df, x = 'x',y = 'y', s= 1,alpha = 0.1, color = 'black',ax = ax), the result is this picture:
After setting polar = True and plotting with sns.scatterplot(data = df, x = 'radians',y = 'radius', s= 1,alpha = 0.1, color = 'black',ax = ax), it is supposed to give this:
But it does not work: if you run the actual code, the shape in the polar plot is the same as in the Cartesian plot, which does not make sense and does not match the polar picture above. (If you are wondering where the second picture came from, I plotted it using R.)
I would appreciate your help and insights and thanks in advance!
For a polar plot, the "x-axis" represents the angle in radians. So, you need to switch x and y, and convert the angles to radians (I also added ax=ax, as the axes was created explicitly):
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
data = {'radius': [0, 0.5, 1, 1.5, 2, 2.5], 'degrees': [0, 25, 75, 155, 245, 335]}
df_temp = pd.DataFrame(data)
ax = plt.axes(polar=True)
sns.scatterplot(x=np.radians(df_temp['degrees']), y=df_temp['radius'].to_numpy(),
s=100, alpha=1, color='black', ax=ax)
for deg, y in zip(df_temp['degrees'], df_temp['radius']):
    x = np.radians(deg)
    ax.axvline(x, color='skyblue', ls=':')
    ax.text(x, y, f' {deg}', color='crimson')
ax.set_rlabel_position(-15) # Move radial labels away from plotted dots
plt.tight_layout()
plt.show()
About your new question: if you have an xy plot, and you convert these xy values to polar coordinates, and then plot these on a polar plot, you'll get again the same plot.
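The reason is that a polar axes places the point (theta, r) at the Cartesian position (r cos theta, r sin theta), which is exactly the inverse of the conversion done in the question. A minimal numerical check, using random stand-in data rather than the CSV from the question:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)  # stand-in for df['x']
y = rng.normal(size=1000)  # stand-in for df['y']

# Cartesian -> polar, as in the question
radius = np.hypot(x, y)
theta = np.arctan2(y, x)

# Where a polar axes would draw (theta, radius)
x_back = radius * np.cos(theta)
y_back = radius * np.sin(theta)

print(np.allclose(x, x_back), np.allclose(y, y_back))
```

Both comparisons come out true, so the polar plot reproduces the original shape.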
After some more testing with the data, I decided to create the plot directly with matplotlib, as seaborn makes some changes that don't have exactly equal effects across seaborn and matplotlib versions.
What seems to be happening in R:
The angles (given by "x") are spread out to fill the range (0, 2 pi). This requires either rescaling x or changing how the x-values are mapped to angles. One way to do this is to subtract the minimum, then divide the result by its new maximum and multiply by 2 pi.
The 0 of the angles is at the top, and the angles go clockwise.
The following code should create the plot with Python. You might want to experiment with alpha and with s in the scatter plot options. (By default the scatter dots get an outline, which often isn't desired when working with very small dots, and can be removed with lw=0.)
ax = plt.axes(polar=True)
ax.set_aspect('equal')
ax.axis('off')
x_temp = df['x'].to_numpy()
y_temp = df['y'].to_numpy()
x_temp -= x_temp.min()
x_temp = x_temp / x_temp.max() * 2 * np.pi
ax.scatter(x=x_temp, y=y_temp, s=0.05, alpha=1, color='black', lw=0)
ax.set_rlim(y_temp.min(), y_temp.max())
ax.set_theta_zero_location("N") # set zero at the north (top)
ax.set_theta_direction(-1) # go clockwise
plt.show()
At the left the resulting image, at the right using the y-values for coloring (ax.scatter(..., c=y_temp, s=0.05, alpha=1, cmap='plasma_r', lw=0)):
I'm totally new at using Python for Power BI (or anything really).
I would like to add the value of the bar/scatter at the end of the line (the data label).
A version with the label inside the scatter bubble would also be cool.
Can anyone help out here?
All help appreciated.
# libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Create a dataframe
df = pd.DataFrame({'group': dataset.Genre , 'values': dataset.Revenue})
val = list(dataset.SelectedGenre)
# Reorder it following the values:
ordered_df = df.sort_values(by='values')
my_range=range(1,len(df.index)+1)
# Create a color if the group is "B"
my_color=np.where(ordered_df ['group']== val, 'orange', 'skyblue')
my_size=np.where(ordered_df ['group']== val , 150, 150)
# The vertical plot is made using the hlines function
# I load the seaborn library only to benefit the nice looking feature
import seaborn as sns
val = ordered_df['values']
plt.hlines(y=my_range, xmin=0, xmax=val, color=my_color, alpha=1 , linewidth=8)
plt.scatter(val, my_range, color=my_color, s=my_size, alpha=1)
# Add title and axis names
plt.yticks(my_range, ordered_df['group'])
plt.title("What about the B group?", loc='left')
plt.xlabel('Value of the variable')
plt.ylabel('Group')
plt.box(False)  # Turn off the black box around the visual
plt.show()
Found it myself
import matplotlib.pyplot as plt
import numpy as np
# Data
x = dataset.Revenue
y = dataset.Genre
labels = dataset.Revenue
val = list(dataset.SelectedGenre)
# Create the figure and axes objects
fig, ax = plt.subplots(1, figsize=(10, 6))
fig.suptitle('Example Of Labelled Scatterpoints')
my_color=np.where(y == val, 'orange', 'skyblue')
my_size=np.where( y == val , 2000, 2000)
# Plot the scatter points
ax.scatter(x, y,
color= my_color, # Color of the dots
s=1000, # Size of the dots
alpha=1, # Alpha of the dots
linewidths=1) # Size of edge around the dots
ax.hlines(y, xmin=0, xmax=x, color= my_color, alpha=1 , linewidth=8)
def human_format(num):
    magnitude = 0
    while abs(num) >= 1000:
        magnitude += 1
        num /= 1000
    # add more suffixes if you need them
    return '%.0f%s' % (round(num), ['', 'K', 'M', 'G', 'T', 'P'][magnitude])
# Add the formatted revenue as a text label for each point
for x_pos, y_pos, label in zip(x, y, labels):
    ax.annotate(
        human_format(label),         # The label for this point
        xy=(x_pos, y_pos),           # Position of the corresponding point
        xytext=(-8, 0),              # Offset text by 8 points to the left
        textcoords='offset points',  # Interpret xytext as an offset in points
        ha='left',                   # Horizontally aligned to the left
        va='center',                 # Vertical alignment is centered
        color='white')
plt.box(False)  # Turn off the black box around the visual
# Show the plot
plt.show()
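One thing to watch in `human_format`: for values of 10^18 or more the loop runs past the end of the suffix list and raises an `IndexError`. A possible guard, as a sketch only (the clamping is my own addition, not part of the solution above), stops dividing once the last suffix is reached:

```python
def human_format(num):
    """Format a number with a K/M/G/T/P suffix, e.g. 45000 -> '45K'."""
    suffixes = ['', 'K', 'M', 'G', 'T', 'P']
    magnitude = 0
    # Stop at the last available suffix instead of indexing past the list
    while abs(num) >= 1000 and magnitude < len(suffixes) - 1:
        magnitude += 1
        num /= 1000
    return '%.0f%s' % (num, suffixes[magnitude])


print(human_format(45000))    # 45K
print(human_format(1234567))  # 1M
print(human_format(10**21))   # falls back to the largest suffix, P
```

The function can be dropped in place of the original one; the `ax.annotate` loop does not need to change.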
I am looping through a list containing 6 column names. I take 3 columns at a time so that I can draw 3 subplots per iteration.
I have 2 dataframes with the same column names, so they look identical except for the histograms of each column.
I want to plot matching columns of both dataframes on the same subplot. Right now, I am plotting their histograms on 2 separate subplots.
Currently, for columns 'A', 'B', 'C' in df_plot:
and for columns 'A', 'B', 'C' in df_plot2:
I only want 3 charts, with matching column names combined into the same chart so that there are blue and yellow bars in each chart.
Adding df_plot2 as below doesn't work. I think I am not defining my second axs properly, but I am not sure how to do that.
col_name_list = ['A','B','C','D','E','F']
chunk_list = [col_name_list[i:i + 3] for i in range(0, len(col_name_list), 3)]
for k,g in enumerate(chunk_list):
    df_plot = df[g]
    df_plot2 = df[g][df[g] != 0]
    fig, axs = plt.subplots(1,len(g),figsize = (50,20))
    axs = axs.ravel()
    for j,x in enumerate(g):
        df_plot[x].value_counts(normalize=True).head().plot(kind='bar',ax=axs[j], position=0, title = x, fontsize = 30)
        # adding this doesnt work.
        df_plot2[x].value_counts(normalize=True).head().plot(kind='bar',ax=axs[j], position=1, fontsize = 30)
        axs[j].title.set_size(40)
    fig.tight_layout()
The solution is to plot on the same ax:
Change axs[j] to axs:
for k,g in enumerate(chunk_list):
    df_plot = df[g]
    df_plot2 = df[g][df[g] != 0]
    fig, axs = plt.subplots(1,len(g),figsize = (50,20))
    axs = axs.ravel()
    for j,x in enumerate(g):
        df_plot[x].value_counts(normalize=True).head().plot(kind='bar',ax=axs, position=0, title = x, fontsize = 30)
        # adding this doesnt work.
        df_plot2[x].value_counts(normalize=True).head().plot(kind='bar',ax=axs, position=1, fontsize = 30)
        axs[j].title.set_size(40)
    fig.tight_layout()
Then just call plt.show().
Example: this will plot x and y on the same subplot:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(0, 10, 1)
y = np.arange(0, 20, 2)
fig = plt.figure()
ax = fig.gca()
ax.plot(x)
ax.plot(y)
plt.show()
EDIT:
There is now a squeeze keyword argument. Passing squeeze=False makes sure the result is always a 2D numpy array.
fig, ax2d = plt.subplots(2, 2, squeeze=False)
If needed, turning that into a 1D array is easy:
axli = ax2d.flatten()
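For the original question (blue and yellow bars for the same column in one chart), a different approach that may be easier to reason about is to combine the two `value_counts` results into one DataFrame and let pandas draw grouped bars on a single axes. This is only a sketch under assumptions: the data below is a hypothetical stand-in for `df`, and the column labels `'all'` and `'nonzero'` are names I introduced.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in data: small integer columns that contain zeros
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 4, size=(200, 3)), columns=['A', 'B', 'C'])

cols = ['A', 'B', 'C']
# squeeze=False always returns a 2D array of axes, so ravel() is safe
fig, axs = plt.subplots(1, len(cols), figsize=(15, 5), squeeze=False)
axs = axs.ravel()

for ax, col in zip(axs, cols):
    all_counts = df[col].value_counts(normalize=True).head()
    nonzero_counts = df.loc[df[col] != 0, col].value_counts(normalize=True).head()
    # Align both series on a shared index; a value missing from one side becomes NaN
    combined = pd.concat([all_counts, nonzero_counts], axis=1, keys=['all', 'nonzero'])
    # One call draws both series side by side on the same axes
    combined.plot(kind='bar', ax=ax, title=col)

fig.tight_layout()
```

Because both series share one index after `pd.concat`, the bars for the same value sit next to each other and no `position` argument is needed. Call `plt.show()` to display the figure.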
What I want is like this:
What I get is this:
So how do I merge the markers into one label?
The same applies to the lines. For the lines you can of course achieve it by not assigning a label to the second line while using the same line style, but for the markers you cannot, since they have different shapes.
Note that in recent versions of matplotlib you can achieve this using class matplotlib.legend_handler.HandlerTuple as illustrated in this answer and also in this guide:
import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerTuple
fig, ax1 = plt.subplots(1, 1)
# First plot: two legend keys for a single entry
p2, = ax1.plot([3, 4], [2, 3], 'o', mfc="white", mec="k")
p1, = ax1.plot([1, 2], [5, 6], 's', mfc="gray", mec="gray")
# `plot` returns a list, but we want the handle - thus the comma on the left
p3, = ax1.plot([1, 5], [4, 4], "-k")
p4, = ax1.plot([2, 6], [3, 2], "-k")
# Assign two of the handles to the same legend entry by putting them in a tuple
# and using a generic handler map (which would be used for any additional
# tuples of handles like (p1, p3)).
l = ax1.legend([(p1, p2), p3], ['data', 'models'],
handler_map={tuple: HandlerTuple(ndivide=None)})
plt.savefig("demo.png")
I think it's best to use a full legend - otherwise, how will your readers know the difference between the two models, or the two datasets? I would do it this way:
But, if you really want to do it your way, you can use a custom legend as shown in this guide. You'll need to create your own class, like they do, that defines the legend_artist method, which then adds squares and circles as appropriate. Here is the plot generated and the code used to generate it:
#!/usr/bin/env python
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np
# ==================================
# Define the form of the function
# ==================================
def model(x, A=190, k=1):
    return A * np.exp(-k*x/50)
# ==================================
# How many data points are generated
# ==================================
num_samples = 15
# ==================================
# Create data for plots
# ==================================
x_model = np.linspace(0, 130, 200)
x_data1 = np.random.rand(num_samples) * 130
x_data1.sort()
x_data2 = np.random.rand(num_samples) * 130
x_data2.sort()
data1 = model(x_data1, k=1) * (1 + np.random.randn(num_samples) * 0.2)
data2 = model(x_data2, k=2) * (1 + np.random.randn(num_samples) * 0.15)
model1 = model(x_model, k=1)
model2 = model(x_model, k=2)
# ==================================
# Plot everything normally
# ==================================
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x_data1, data1, 'ok', markerfacecolor='none', label='Data (k=1)')
ax.plot(x_data2, data2, 'sk', markeredgecolor='0.5', markerfacecolor='0.5', label='Data (k=2)')
ax.plot(x_model, model1, '-k', label='Model (k=1)')
ax.plot(x_model, model2, '--k', label='Model (k=2)')
# ==================================
# Format plot
# ==================================
ax.set_xlabel('Distance from heated face($10^{-2}$ m)')
ax.set_ylabel(r'Temperature ($^\circ$C)')
ax.set_xlim((0, 130))
ax.set_title('Normal way to plot')
ax.legend()
fig.tight_layout()
plt.show()
# ==================================
# ==================================
# Do it again, but with custom
# legend
# ==================================
# ==================================
class AnyObject(object):
    pass
class data_handler(object):
    def legend_artist(self, legend, orig_handle, fontsize, handlebox):
        scale = fontsize / 22
        x0, y0 = handlebox.xdescent, handlebox.ydescent
        width, height = handlebox.width, handlebox.height
        patch_sq = mpatches.Rectangle([x0, y0 + height/2 * (1 - scale) ], height * scale, height * scale, facecolor='0.5',
                                      edgecolor='0.5', transform=handlebox.get_transform())
        patch_circ = mpatches.Circle([x0 + width - height/2, y0 + height/2], height/2 * scale, facecolor='none',
                                     edgecolor='black', transform=handlebox.get_transform())
        handlebox.add_artist(patch_sq)
        handlebox.add_artist(patch_circ)
        return patch_sq
# ==================================
# Plot everything
# ==================================
fig = plt.figure()
ax = fig.add_subplot(111)
d1 = ax.plot(x_data1, data1, 'ok', markerfacecolor='none', label='Data (k=1)')
d2 = ax.plot(x_data2, data2, 'sk', markeredgecolor='0.5', markerfacecolor='0.5', label='Data (k=2)')
m1 = ax.plot(x_model, model1, '-k', label='Model (k=1)')
m2 = ax.plot(x_model, model2, '-k', label='Model (k=2)')
# ax.legend([d1], handler_map={ax.plot: data_handler()})
ax.legend([AnyObject(), m1[0]], ['Data', 'Model'], handler_map={AnyObject: data_handler()})
# ==================================
# Format plot
# ==================================
ax.set_xlabel('Distance from heated face($10^{-2}$ m)')
ax.set_ylabel(r'Temperature ($^\circ$C)')
ax.set_xlim((0, 130))
ax.set_title('Custom legend')
fig.tight_layout()
plt.show()
I also found this link very useful (code below), it's an easier way to handle this issue. It's basically using a list of legend handles to make one of the markers of the first handle invisible and overplot it with the marker of the second handle. This way, you have both markers next to each other with one label.
fig, ax = plt.subplots()
p1 = ax.scatter([0.1],[0.5],c='r',marker='s')
p2 = ax.scatter([0.3],[0.2],c='b',marker='o')
l = ax.legend([(p1,p2)],['points'],scatterpoints=2)
With the above code, a HandlerTuple is used to create legend handles which simply overplot two handles (there are red squares behind the blue circles if you look carefully). What you want to do is make the second marker of the first handle and the first marker of the second handle invisible. Unfortunately, the HandlerTuple is a rather recent addition and you need a special function to get all the handles. Otherwise, you can use the Legend.legendHandles attribute (it only shows the first handle for the HandlerTuple).
def get_handle_lists(l):
    """returns a list of lists of handles.
    """
    tree = l._legend_box.get_children()[1]
    for column in tree.get_children():
        for row in column.get_children():
            yield row.get_children()[0].get_children()
handles_list = list(get_handle_lists(l))
handles = handles_list[0] # handles is a list of two PathCollection.
# The first one is for red squares, and the second
# is for blue circles.
handles[0].set_facecolors(["r", "none"]) # for the first
# PathCollection, make the
# second marker invisible by
# setting its facecolor and
# edgecolor to "none".
handles[0].set_edgecolors(["k", "none"])
handles[1].set_facecolors(["none", "b"])
handles[1].set_edgecolors(["none", "k"])
fig
Here is a new solution that will plot any collection of markers with the same label. I have not figured out how to make it work with markers from a line plot, but you can probably do a scatter plot on top of a line plot if you need to.
from matplotlib import pyplot as plt
import matplotlib.collections as mcol
import matplotlib.transforms as mtransforms
import numpy as np
from matplotlib.legend_handler import HandlerPathCollection
from matplotlib import cm
class HandlerMultiPathCollection(HandlerPathCollection):
    """
    Handler for PathCollections, which are used by scatter
    """
    def create_collection(self, orig_handle, sizes, offsets, transOffset):
        p = type(orig_handle)(orig_handle.get_paths(), sizes=sizes,
                              offsets=offsets,
                              transOffset=transOffset,
                              )
        return p
fig, ax = plt.subplots()
#make some data to plot
x = np.arange(0, 100, 10)
models = [.05 * x, 8 * np.exp(- .1 * x), np.log(x + 1), .01 * x]
tests = [model + np.random.rand(len(model)) - .5 for model in models]
#make colors and markers
colors = cm.brg(np.linspace(0, 1, len(models)))
markers = ['o', 'D', '*', 's']
markersize = 50
plots = []
#plot points and lines
for i in range(len(models)):
    line, = plt.plot(x, models[i], linestyle = 'dashed', color = 'black', label = 'Model')
    plot = plt.scatter(x, tests[i], c = colors[i], s = markersize, marker = markers[i])
    plots.append(plot)
#get attributes
paths = []
sizes = []
facecolors = []
edgecolors = []
for plot in plots:
    paths.append(plot.get_paths()[0])
    sizes.append(plot.get_sizes()[0])
    edgecolors.append(plot.get_edgecolors()[0])
    facecolors.append(plot.get_facecolors()[0])
#make proxy artist out of a collection of markers
PC = mcol.PathCollection(paths, sizes, transOffset = ax.transData, facecolors = colors, edgecolors = edgecolors)
PC.set_transform(mtransforms.IdentityTransform())
plt.legend([PC, line], ['Test', 'Model'], handler_map = {type(PC) : HandlerMultiPathCollection()}, scatterpoints = len(paths), scatteryoffsets = [.5], handlelength = len(paths))
plt.show()
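On matplotlib versions that ship `HandlerTuple` with the `ndivide` parameter, a shorter route may be to put the existing handles in a tuple, as in the `HandlerTuple` answer above. The sketch below assumes that this also covers markers drawn with `plot`, since those handles are ordinary `Line2D` objects; the data mirrors the example above and the variable names are my own.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerTuple

# Same kind of data as in the example above
x = np.arange(0, 100, 10)
models = [.05 * x, 8 * np.exp(-.1 * x), np.log(x + 1), .01 * x]
rng = np.random.default_rng(0)
tests = [m + rng.random(len(m)) - .5 for m in models]
markers = ['o', 'D', '*', 's']

fig, ax = plt.subplots()
points = []
for m, t, marker in zip(models, tests, markers):
    line, = ax.plot(x, m, linestyle='dashed', color='black')
    # linestyle='none' draws markers only; the handle is a Line2D, not a PathCollection
    pts, = ax.plot(x, t, marker=marker, linestyle='none')
    points.append(pts)

# All marker handles go into one tuple, so they share a single legend entry
legend = ax.legend([tuple(points), line], ['Test', 'Model'],
                   handler_map={tuple: HandlerTuple(ndivide=None)},
                   handlelength=len(points) * 1.5)
```

`ndivide=None` splits the legend key into one section per handle in the tuple, so the four markers appear side by side. Call `plt.show()` to display the figure.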
I have a solution for you if you're willing to use all circles for markers and differentiate by color only. You can use a circle collection to represent the markers, and then have a legend label for the collection as a whole.
Example code:
import matplotlib.pyplot as plt
import matplotlib.collections as collections
from matplotlib import cm
import numpy as np
#make some data to plot
x = np.arange(0, 100, 10)
models = [.05 * x, 8 * np.exp(- .1 * x), np.log(x + 1), .01 * x]
tests = [model + np.random.rand(len(model)) - .5 for model in models]
#make colors
colors = cm.brg(np.linspace(0, 1, len(models)))
markersize = 50
#plot points and lines
for i in range(len(models)):
    line, = plt.plot(x, models[i], linestyle = 'dashed', color = 'black', label = 'Model')
    plt.scatter(x, tests[i], c = colors[i], s = markersize)
#create collection of circles corresponding to markers
circles = collections.CircleCollection([markersize] * len(models), facecolor = colors)
#make the legend -- scatterpoints needs to be the same as the number
#of markers so that all the markers show up in the legend
plt.legend([circles, line], ['Test', 'Model'], scatterpoints = len(models), scatteryoffsets = [.5], handlelength = len(models))
plt.show()
You can do this by plotting data without any label and then adding the label separately:
from matplotlib import pyplot as plt
from numpy import random
xs = range(10)
data = random.rand(10, 2)
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
kwargs = {'color': 'r', 'linewidth': 2, 'linestyle': '--'}
ax.plot(xs, data, **kwargs)
ax.plot([], [], label='Model', **kwargs)
ax.legend()
plt.show()
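A closely related option, shown here as a sketch, is to pass a proxy artist to `legend` instead of plotting an empty line. The proxy is a `Line2D` that is never added to the axes; reusing `kwargs` keeps its style in sync with the data lines.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

xs = np.arange(10)
data = np.random.rand(10, 2)
kwargs = {'color': 'r', 'linewidth': 2, 'linestyle': '--'}

fig, ax = plt.subplots()
ax.plot(xs, data, **kwargs)  # two unlabeled lines, one per data column

# The proxy only exists to give the legend something to draw
proxy = Line2D([], [], label='Model', **kwargs)
legend = ax.legend(handles=[proxy])
```

With this variant the axes contains only the two data lines, which can matter if later code iterates over `ax.lines`. Call `plt.show()` to display the figure.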