How to generate two legends for a scatterplot


I want to generate two different legends for these five points according to their size. The corresponding labels are written in my code, but so far I only get one legend, and it is wrong. How can I correct my code?
By the way, is there a better way to draw the connecting lines with the same logic as in my code? I have searched a lot and could only produce the picture this way, so I would appreciate help optimizing the code. Thanks in advance!
Edit:
I made changes with reference to "matplotlib: 2 different legends on same graph", but the legend I get is still incorrect. I want the legends to show only dots instead of lines. I also tried some other methods, but all failed. Can you give me some suggestions?
import numpy as np
import matplotlib.pyplot as plt
X_new = np.random.randint(1,20,(3,2))
x1 = X_new[0,:]
x2 = X_new[1,:]
x3 = X_new[2,:]
p = np.random.randint(1,20,(1,2))
p1 = np.random.randint(1,20,(1,2))
# plt.style.use('ggplot')
color_map = {0: 'blue', 1:'green', 2: 'darkred', 3: 'black', 4:'red'}
legend1_label = {0: 'trn1', 1: 'trn2', 2: 'trn3'}
legend2_label = {0: 'p', 1: 'p1'}
plt.plot()
for idx, cl in enumerate(legend1_label):
    scatter = plt.scatter(x=X_new[idx, 0], y=X_new[idx, 1], c=color_map[cl], marker='.',
                          s=100)
plt.plot([x1[0],p[0,0]], [x1[1],p[0,1]], color='k',linestyle='--',linewidth=1)
plt.plot([x2[0],p[0,0]], [x2[1],p[0,1]], color='k',linestyle='--',linewidth=1)
plt.plot([x3[0],p[0,0]], [x3[1],p[0,1]], color='k',linestyle='--',linewidth=1)
plt.plot([x1[0],p1[0,0]], [x1[1],p1[0,1]], color='r',linestyle='--',linewidth=1)
plt.plot([x2[0],p1[0,0]], [x2[1],p1[0,1]], color='r',linestyle='--',linewidth=1)
plt.plot([x3[0],p1[0,0]], [x3[1],p1[0,1]], color='r',linestyle='--',linewidth=1)
plt.scatter(x=p[0,0], y=p[0,1],c='red',marker='.',s=200)
plt.scatter(x=p1[0,0], y=p1[0,1],c='black',marker='.',s=200)
legend1 = plt.legend(labels=["trn1","trn2","trn3"],loc=4, title="legend1")
plt.legend(labels=["p","p1"], title="legend2")
plt.gca().add_artist(legend1)
plt.show()
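
One possible way to get two dot-only legends is to keep the handle returned by each scatter call and pass those handles to legend explicitly, so the dashed lines never enter a legend. The sketch below is only an illustration under some assumptions: the fixed random seed, the names targets, trn_handles and p_handles, and the legend positions are all made up for this example. The nested loop is one possible shorter form of the six plt.plot calls.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)          # fixed seed: an assumption, for reproducibility
X_new = rng.integers(1, 20, (3, 2))     # the three small "trn" points
targets = rng.integers(1, 20, (2, 2))   # the two large points, p and p1

fig, ax = plt.subplots()

# Small dots: keep each scatter handle so the legend entry is a dot, not a line.
trn_colors = ['blue', 'green', 'darkred']
trn_handles = [
    ax.scatter(x, y, color=color, marker='.', s=100, label=f'trn{i + 1}')
    for i, ((x, y), color) in enumerate(zip(X_new, trn_colors))
]

# Dashed connectors: one nested loop instead of six separate plot calls.
for (px, py), line_color in zip(targets, ['k', 'r']):
    for x, y in X_new:
        ax.plot([x, px], [y, py], color=line_color, linestyle='--', linewidth=1)

# Large dots, again keeping the handles.
p_handles = [
    ax.scatter(px, py, color=color, marker='.', s=200, label=name)
    for (px, py), color, name in zip(targets, ['red', 'black'], ['p', 'p1'])
]

# The first legend must be re-added as an artist, because the second
# ax.legend call replaces the axes' current legend.
legend1 = ax.legend(handles=trn_handles, loc='lower right', title='legend1')
ax.add_artist(legend1)
legend2 = ax.legend(handles=p_handles, loc='upper right', title='legend2')

# plt.show()  # uncomment to display the figure interactively
```

Because each legend only receives scatter handles, the dashed lines are left out of both legends.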

Related

Plotting Multiple Series of Lines on the Same Plot

I am attempting to graph battery cycling data. Each cycle's worth of data points should be one line on the graph. At first, my code treated the whole dataframe as one continuous series. I then added a for loop that should plot one line for cycle 1, move on to cycle 2, and so on, but it currently fails and shows no graph. Debugging suggests the problem starts once the loop moves past cycle 1. The cycles do not all have the same number of data points.
EDIT: I now suspect that the data headers cause a problem when looping. I think making a dictionary would solve this issue.
df2 = pd.read_excel(r'C:\Users\####\- ##### - ####\2-7_7.xlsx',
                    sheet_name='record', usecols="A:N")
df2['Capacity(mAh)'] = df2['Capacity(mAh)'].apply(lambda x: x*1000) #A fix for unit error in the data
df2.set_index('Cycle ID',inplace = True) #Set the index to the Cycle number
for cycle in df2.index:
    chosen_cyclex = df2.loc[cycle, 'Capacity(mAh)']
    chosen_cycley = df2.loc[cycle,'Voltage(V)']
    plt.plot(chosen_cyclex.iloc[1],chosen_cycley.iloc[1])
    #print(chosen_cyclex[1],chosen_cycley[1])
plt.show()

I ended up using this method, which selects the rows whose index equals each cycle.
for cycle in cyclearray:
    plt.plot(df2[df2.index == cycle]['Capacity(mAh)'], df2[df2.index == cycle]['Voltage(V)'], label=cycle)
For other battery testers who show up here: if you need to trim the voltage curves, use
plt.xlim([xmin,xmax])
plt.ylim([ymin+0.1,ymax-0.1])

You need to specify an ax when plotting. Here are some examples:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# reproducible (but unimaginative) setup
n = 100
cycles = 4
df2 = pd.DataFrame({
    'ID': np.repeat(np.arange(cycles), n),
    'Capacity(mAh)': np.tile(np.arange(n), cycles),
    'Voltage(V)': (np.arange(n)**0.8 * np.linspace(5, 3, cycles)[:, None]).ravel(),
})
Example 1: using groupby.plot, then fiddle around to adjust labels
fig, ax = plt.subplots()
df2.groupby('ID').plot(x='Capacity(mAh)', y='Voltage(V)', ax=ax)
# now customize the labels
lines, labels = ax.get_legend_handles_labels()
for ith, line in zip('1st 2nd 3rd 4th'.split(), lines):
    line.set_label(f'{ith} discharge')
ax.legend()
Example 2: groupby used as an iterator
fig, ax = plt.subplots()
ld = {1: 'st', 2: 'nd', 3: 'rd'}
for cycle, g in df2.groupby('ID'):
    label = f'{cycle + 1}{ld.get(cycle + 1, "th")} discharge'
    g.plot(x='Capacity(mAh)', y='Voltage(V)', label=label, ax=ax)
Same plot as above.
Example 3: using ax.plot instead of df.plot or similar
fig, ax = plt.subplots()
ld = {1: 'st', 2: 'nd', 3: 'rd'}
for cycle, g in df2.groupby('ID'):
    label = f'{cycle + 1}{ld.get(cycle + 1, "th")} discharge'
    ax.plot(g['Capacity(mAh)'], g['Voltage(V)'], label=label)
ax.legend()

How does one stop graphs from the same code from overlapping?

The graphs that are output from two distinct nx.draw_networkx commands seem to overlap in the console. How does one properly separate different graphs?
This feels like a silly question, but I have yet to find any solution on the web.
def desenho(dados):
    edges,weights = zip(*nx.get_edge_attributes(dados,'weight').items())
    pos = nx.spring_layout(dados)
    print(nx.draw_networkx(dados, pos, node_color='purple', edgelist=edges, edge_color=weights, width=5.0, edge_cmap=plt.cm.jet), '\n')
graph_1 = desenho(data_1)
graph_2 = desenho(data_2)
I expected each graph to be drawn separately, with the newline I print after it adding some space between them, but that isn't happening. What am I doing wrong here?
I'd also appreciate suggestions on how to make the color_map a bit less extreme in its gradient.

You need to create a subplot for each graph that you want to plot. For example:
import matplotlib.pyplot as plt
import networkx as nx
def desenho(dados):
    edges,weights = zip(*nx.get_edge_attributes(dados,'weight').items())
    pos = nx.spring_layout(dados)
    # draws on the current axes, i.e. the subplot created most recently
    nx.draw_networkx(dados, pos, node_color='purple', edgelist=edges, edge_color=weights, width=5.0, edge_cmap=plt.cm.jet)
fig = plt.figure()
ax = fig.add_subplot(2,1,1) # 2,1,1 means: 2: two rows, 1: one column, 1: first plot
graph_1 = desenho(data_1)
ax2 = fig.add_subplot(2,1,2) # 2,1,2 means: 2: two rows, 1: one column, 2: second plot
graph_2 = desenho(data_2)

Separating different points with different colors on a scatter plot

This is the code I've written:
import pandas as pd
import matplotlib.pyplot as plt
data1 = pd.read_csv(r'F:\HCSE\sample_data1.csv',sep=';')  # raw string so the backslashes are not treated as escapes
colnames = data1.columns
plt.plot(data1.iloc[:,0],data1.iloc[:,2],'bs')
plt.ylabel(colnames[2])
plt.xlabel(colnames[0])
plt.show()
This is the data I have used:
Age;Gender;LOS;WBC;HB;Nothrophil
0.62;1;0.11;9.42;22.44;70.43
0.84;0;0.37;4.4;10.4;88.4
0.78;0;0.23;6.8;15.6;76.5
0.8;0;-0.02;9.3;15.1;87
0.7;1;0.19;5.3;11.3;82
0.25;0;0.27;5.9;10.6;87.59
0.32;0;0.37;3.1;12.5;15.4
0.86;1;0.31;4.1;10.4;77
0.75;0;0.21;12.07;14.1;88
Finally, I have drawn the chart.
My question is: how can I use different colors for the different sexes (for example, male = red and female = blue)?
Thanks in advance

I think you're looking for something like this:
cols = {0: 'red', 1: 'blue'}
plt.scatter(data1.Age, data1.LOS, c=data1.Gender.map(cols))
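
If you also want a legend entry per color, one option is to draw each Gender group with its own scatter call. This is only a sketch: it reads a few rows of the sample data from a string instead of the CSV file, and the legend labels are placeholders, since the question does not say which code (0 or 1) stands for which sex.

```python
import io
import pandas as pd
import matplotlib.pyplot as plt

# A few rows of the sample data, inlined so the example is self-contained.
csv_text = """Age;Gender;LOS
0.62;1;0.11
0.84;0;0.37
0.78;0;0.23
0.7;1;0.19
0.25;0;0.27
"""
data1 = pd.read_csv(io.StringIO(csv_text), sep=';')

cols = {0: 'red', 1: 'blue'}
# Placeholder labels: which code means male or female is an assumption left open.
names = {0: 'Gender = 0', 1: 'Gender = 1'}

fig, ax = plt.subplots()
for gender, group in data1.groupby('Gender'):
    # One scatter call per group gives one legend entry per group.
    ax.scatter(group['Age'], group['LOS'], color=cols[gender], label=names[gender])
ax.set_xlabel('Age')
ax.set_ylabel('LOS')
legend = ax.legend(title='Gender')

# plt.show()  # uncomment to display the figure interactively
```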

With your dataframe as it is, you could use the built-in df.plot.scatter() function and pass Gender to the color keyword:
data1.plot.scatter(
    'Age', 'LOS',
    c='Gender', cmap='RdBu',
    edgecolor='None', s=45)
Note that I've also removed the black borders around each point and slightly increased the size.

Add Legend to Seaborn point plot

I am plotting multiple dataframes as point plots using seaborn, all on the same axis.
How would I add a legend to the plot?
My code takes each dataframe and plots it one after another on the same figure.
Each dataframe has the same columns:
date count
2017-01-01 35
2017-01-02 43
2017-01-03 12
2017-01-04 27
My code:
f, ax = plt.subplots(1, 1, figsize=figsize)
x_col='date'
y_col = 'count'
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df_1,color='blue')
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df_2,color='green')
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df_3,color='red')
This plots 3 lines on the same plot. However, the legend is missing, and according to the documentation pointplot does not accept a label argument.
One workaround that worked was creating a new dataframe and using the hue argument.
df_1['region'] = 'A'
df_2['region'] = 'B'
df_3['region'] = 'C'
df = pd.concat([df_1,df_2,df_3])
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df,hue='region')
But I would like to know if there is a way to create a legend for code that first adds the point plots to the figure one by one and then adds the legend.

I would suggest not using seaborn pointplot for this. It makes things unnecessarily complicated.
Instead, use matplotlib plot_date. This lets you set a label on each plot and have the labels collected into a legend automatically with ax.legend().
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np
date = pd.date_range("2017-03", freq="M", periods=15)
count = np.random.rand(15,4)
df1 = pd.DataFrame({"date":date, "count" : count[:,0]})
df2 = pd.DataFrame({"date":date, "count" : count[:,1]+0.7})
df3 = pd.DataFrame({"date":date, "count" : count[:,2]+2})
f, ax = plt.subplots(1, 1)
x_col='date'
y_col = 'count'
ax.plot_date(df1.date, df1["count"], color="blue", label="A", linestyle="-")
ax.plot_date(df2.date, df2["count"], color="red", label="B", linestyle="-")
ax.plot_date(df3.date, df3["count"], color="green", label="C", linestyle="-")
ax.legend()
plt.gcf().autofmt_xdate()
plt.show()
In case you are still interested in obtaining the legend for pointplots, here is one way to do it:
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df1,color='blue')
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df2,color='green')
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df3,color='red')
ax.legend(handles=ax.lines[::len(df1)+1], labels=["A","B","C"])
ax.set_xticklabels([t.get_text().split("T")[0] for t in ax.get_xticklabels()])
plt.gcf().autofmt_xdate()
plt.show()
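
As a variation on the labelled-line approach, plain ax.plot also accepts datetime x-values, so the same idea can be sketched without plot_date. The data below is random and the frames dictionary is just a made-up container for this illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
date = pd.date_range("2017-01-01", periods=15, freq="D")

# Three dataframes with the same columns, keyed by the legend label to use.
frames = {
    "A": pd.DataFrame({"date": date, "count": rng.random(15)}),
    "B": pd.DataFrame({"date": date, "count": rng.random(15) + 0.7}),
    "C": pd.DataFrame({"date": date, "count": rng.random(15) + 2}),
}

fig, ax = plt.subplots()
for (name, df), color in zip(frames.items(), ["blue", "red", "green"]):
    # label= is what ax.legend() picks up later.
    ax.plot(df["date"], df["count"], color=color, marker="o", label=name)

legend = ax.legend()
fig.autofmt_xdate()  # rotate the date tick labels

# plt.show()  # uncomment to display the figure interactively
```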

Old question, but there's an easier way.
sns.pointplot(x=x_col,y=y_col,data=df_1,color='blue')
sns.pointplot(x=x_col,y=y_col,data=df_2,color='green')
sns.pointplot(x=x_col,y=y_col,data=df_3,color='red')
plt.legend(labels=['legendEntry1', 'legendEntry2', 'legendEntry3'])
This lets you add the plots sequentially, and not have to worry about any of the matplotlib crap besides defining the legend items.

I tried using Adam B's answer, but it didn't work for me. Instead, I found the following workaround for adding legends to pointplots.
import matplotlib.patches as mpatches
red_patch = mpatches.Patch(color='#bb3f3f', label='Label1')
black_patch = mpatches.Patch(color='#000000', label='Label2')
In the pointplots, the color can be specified as mentioned in previous answers. Once these patches corresponding to the different plots are set up,
plt.legend(handles=[red_patch, black_patch])
And the legend ought to appear in the pointplot.
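
Put together, the patch approach might look like the sketch below. To keep it runnable with matplotlib alone, two ordinary line plots with made-up values stand in for the pointplots. The legend is built purely from the patches, so it does not depend on what was drawn.

```python
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

fig, ax = plt.subplots()

# Stand-ins for two sequentially drawn plots; the data values are made up.
ax.plot([1, 2, 3], [2, 4, 3], color='#bb3f3f', marker='o')
ax.plot([1, 2, 3], [1, 3, 5], color='#000000', marker='o')

# Proxy artists: they are never drawn on the axes, only in the legend.
red_patch = mpatches.Patch(color='#bb3f3f', label='Label1')
black_patch = mpatches.Patch(color='#000000', label='Label2')
legend = ax.legend(handles=[red_patch, black_patch])

# plt.show()  # uncomment to display the figure interactively
```

The catch is that the patch colors have to be kept in sync with the plot colors by hand.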

This goes a bit beyond the original question, but it also builds on @PSub's response toward something more general. I do know some of this is easier in Matplotlib directly, but many of Seaborn's default styling options are quite nice, so I wanted to work out how you could have more than one legend for a point plot (or other Seaborn plot) without dropping into Matplotlib right at the start.
Here's one solution:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# We will need to access some of these matplotlib classes directly
from matplotlib.lines import Line2D # For points and lines
from matplotlib.patches import Patch # For KDE and other plots
from matplotlib.legend import Legend
from matplotlib import cm
# Initialise random number generator
rng = np.random.default_rng(seed=42)
# Generate sample of 25 numbers
n = 25
clusters = []
for c in range(0,3):
    # Crude way to get different distributions
    # for each cluster
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Cluster {c+1}"
    })
    clusters.append(df)
# Flatten to a single data frame
clusters = pd.concat(clusters)
# Now do the same for data to feed into
# the second (scatter) plot...
n = 8
points = []
for c in range(0,2):
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Group {c+1}"
    })
    points.append(df)
points = pd.concat(points)
# And create the figure
f, ax = plt.subplots(figsize=(8,8))
# The KDE-plot generates a Legend 'as usual'
k = sns.kdeplot(
    data=clusters,
    x='x', y='y',
    hue='name',
    fill=True,  # 'fill' is the newer name for the deprecated 'shade' argument
    thresh=0.05,
    n_levels=2,
    alpha=0.2,
    ax=ax,
)
# Notice that we access this legend via the
# axis to turn off the frame, set the title,
# and adjust the patch alpha level so that
# it closely matches the alpha of the KDE-plot
ax.get_legend().set_frame_on(False)
ax.get_legend().set_title("Clusters")
for lh in ax.get_legend().get_patches():
    lh.set_alpha(0.2)
# You would probably want to sort your data
# frame or set the hue and style order in order
# to ensure consistency for your own application
# but this works for demonstration purposes
groups = points.name.unique()
markers = ['o', 'v', 's', 'X', 'D', '<', '>']
colors = plt.get_cmap('Dark2').colors  # cm.get_cmap was removed in newer Matplotlib
# Generate the scatterplot: notice that Legend is
# off (otherwise this legend would overwrite the
# first one) and that we're setting the hue, style,
# markers, and palette using the 'name' parameter
# from the data frame and the number of groups in
# the data.
p = sns.scatterplot(
    data=points,
    x="x",
    y="y",
    hue='name',
    style='name',
    markers=markers[:len(groups)],
    palette=colors[:len(groups)],
    legend=False,
    s=30,
    alpha=1.0
)
# Here's the 'magic' -- we use zip to link together
# the group name, the color, and the marker style. You
# *cannot* retrieve the marker style from the scatterplot
# since that information is lost when rendered as a
# PathCollection (as far as I can tell). Anyway, this allows
# us to loop over each group in the second data frame and
# generate a 'fake' Line2D plot (with zero elements and no
# line-width in our case) that we can add to the legend. If
# you were overlaying a line plot or a second plot that uses
# patches you'd have to tweak this accordingly.
patches = []
for x in zip(groups, colors[:len(groups)], markers[:len(groups)]):
    patches.append(Line2D([0],[0], linewidth=0.0, linestyle='',
                          color=x[1], markerfacecolor=x[1],
                          marker=x[2], label=x[0], alpha=1.0))
# And add these patches (with their group labels) to the new
# legend item and place it on the plot.
leg = Legend(ax, patches, labels=groups,
             loc='upper left', frameon=False, title='Groups')
ax.add_artist(leg);
# Done
plt.show();

Python Matplotlib Boxplot Color

I am trying to make two sets of box plots using Matplotlib. I want each set of box plots filled (along with its points and whiskers) in a different color, so there will be two colors on the plot.
My code is below; it would be great if you could help make these plots in color. d0 and d1 are each a list of lists of data. I want the set of box plots made from the data in d0 in one color, and the set made from the data in d1 in another color.
plt.boxplot(d0, widths = 0.1)
plt.boxplot(d1, widths = 0.1)

To colorize the boxplot, you need to first use the patch_artist=True keyword to tell it that the boxes are patches and not just paths. Then you have these options:
1. Set the color via the ...props keyword arguments, e.g. boxprops=dict(facecolor="red"). For all keyword arguments, refer to the documentation.
2. Use the plt.setp(item, properties) functionality to set the properties of the boxes, whiskers, fliers, medians, caps.
3. Obtain the individual items of the boxes from the returned dictionary and use item.set_<property>(...) on them individually. This option is detailed in an answer to the question "python matplotlib filled boxplots", where it allows changing the color of the individual boxes separately.
The complete example, showing options 1 and 2:
import matplotlib.pyplot as plt
import numpy as np
data = np.random.normal(0.1, size=(100,6))
data[76:79,:] = np.ones((3,6))+0.2
plt.figure(figsize=(4,3))
# option 1, specify props dictionaries
c = "red"
plt.boxplot(data[:,:3], positions=[1,2,3], notch=True, patch_artist=True,
            boxprops=dict(facecolor=c, color=c),
            capprops=dict(color=c),
            whiskerprops=dict(color=c),
            flierprops=dict(color=c, markeredgecolor=c),
            medianprops=dict(color=c),
            )
# option 2, set all colors individually
c2 = "purple"
box1 = plt.boxplot(data[:,::-2]+1, positions=[1.5,2.5,3.5], notch=True, patch_artist=True)
for item in ['boxes', 'whiskers', 'fliers', 'medians', 'caps']:
    plt.setp(box1[item], color=c2)
plt.setp(box1["boxes"], facecolor=c2)
plt.setp(box1["fliers"], markeredgecolor=c2)
plt.xlim(0.5,4)
plt.xticks([1,2,3], [1,2,3])
plt.show()

You can change the color of a box plot using setp on the returned value from boxplot(). This example defines a box_plot() function that allows the edge and fill colors to be specified:
import matplotlib.pyplot as plt
def box_plot(data, edge_color, fill_color):
    bp = ax.boxplot(data, patch_artist=True)
    for element in ['boxes', 'whiskers', 'fliers', 'means', 'medians', 'caps']:
        plt.setp(bp[element], color=edge_color)
    for patch in bp['boxes']:
        patch.set(facecolor=fill_color)
    return bp
example_data1 = [[1,2,0.8], [0.5,2,2], [3,2,1]]
example_data2 = [[5,3, 4], [6,4,3,8], [6,4,9]]
fig, ax = plt.subplots()
bp1 = box_plot(example_data1, 'red', 'tan')
bp2 = box_plot(example_data2, 'blue', 'cyan')
ax.legend([bp1["boxes"][0], bp2["boxes"][0]], ['Data 1', 'Data 2'])
ax.set_ylim(0, 10)
plt.show()

This question seems to be similar to another one ("Face pattern for boxes in boxplots").
I hope this code solves your problem:
import matplotlib.pyplot as plt
# fake data
d0 = [[4.5, 5, 6, 4],[4.5, 5, 6, 4]]
d1 = [[1, 2, 3, 3.3],[1, 2, 3, 3.3]]
# basic plot
bp0 = plt.boxplot(d0, patch_artist=True)
bp1 = plt.boxplot(d1, patch_artist=True)
for box in bp0['boxes']:
    # change outline color
    box.set(color='red', linewidth=2)
    # change fill color
    box.set(facecolor = 'green' )
    # change hatch
    box.set(hatch = '/')
for box in bp1['boxes']:
    box.set(color='blue', linewidth=5)
    box.set(facecolor = 'red' )
plt.show()

Change the color of a boxplot
import numpy as np
import matplotlib.pyplot as plt
#generate some random data
data = np.random.randn(200)
d= [data, data]
#plot
box = plt.boxplot(d, showfliers=False)
# change the color of its elements
for _, line_list in box.items():
    for line in line_list:
        line.set_color('r')
plt.show()
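
For the original case of two sets of boxes in two colors, the same idea could be wrapped in a small helper. This is only a sketch: colored_boxplot is a hypothetical name, the random data stands in for d0 and d1, and the positions, tick labels and alpha value are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt

def colored_boxplot(ax, data, color, positions):
    """Draw one set of boxes on ax with every element in the given color."""
    bp = ax.boxplot(data, positions=positions, widths=0.1, patch_artist=True)
    for element in ['boxes', 'whiskers', 'caps', 'medians']:
        plt.setp(bp[element], color=color)
    # Fliers are markers, so their edge color is set separately.
    plt.setp(bp['fliers'], markeredgecolor=color)
    # Fill the boxes with a translucent version of the same color.
    for box in bp['boxes']:
        box.set_facecolor(color)
        box.set_alpha(0.4)
    return bp

rng = np.random.default_rng(0)  # made-up data standing in for d0 and d1
d0 = [rng.normal(0, 1, 50), rng.normal(1, 1, 50)]
d1 = [rng.normal(2, 1, 50), rng.normal(3, 1, 50)]

fig, ax = plt.subplots()
bp0 = colored_boxplot(ax, d0, 'red', positions=[1, 2])
bp1 = colored_boxplot(ax, d1, 'blue', positions=[1.3, 2.3])

# Each boxplot call resets the ticks, so set them once at the end.
ax.set_xticks([1.15, 2.15])
ax.set_xticklabels(['group 1', 'group 2'])
ax.set_xlim(0.5, 2.8)
legend = ax.legend([bp0['boxes'][0], bp1['boxes'][0]], ['d0', 'd1'])

# plt.show()  # uncomment to display the figure interactively
```

Passing ax into the helper, rather than relying on a global, makes it easier to reuse across several subplots.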
