Plotting a legend with matplotlib: error - python

I am trying to add a legend to my graph in matplotlib. Instead of creating a proper legend, it puts the full list of all my labels in the legend.
My graph looks like this:
The legend is cut off and I can't see more than that, I assume due to its size.
This is my code:
features2 = ["Number of Sides"]
features3 = ["Largest Angle"]
header2 = ["Label"]
data_df = pd.DataFrame.from_csv("AllMixedShapes2.csv")
X1 = np.array(data_df[features2].values)
y1 = np.array(data_df[features3].values)
l = np.array(data_df[header2].values)
plt.scatter(X1[:, 0],y1, c=y, cmap=plt.cm.Paired, label=l)
plt.axis([0, 17, 0, 200])
plt.ylabel("Maximum Angle (Degrees)")
plt.xlabel("Number Of Sides")
plt.title('Original 450 Test Shapes')
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.show()
And AllMixedShapes2.csv looks like this:
I'm quite new to Python and machine learning, and I've tried other examples, but I can't get anything to work.

Matplotlib's label argument is meant to be a single string that labels the entire dataset, rather than an array of individual labels for the points within the dataset. If you wish to pass an array of point-by-point labels that will be aggregated into a legend, the best option is probably the Seaborn library. Seaborn provides a wrapper around matplotlib for more convenient statistical visualization.
This should do approximately what you wish to do with your data:
import seaborn
# x and y are passed by keyword; recent seaborn releases no longer accept them positionally
seaborn.lmplot(x='Number of Sides', y='Largest Angle', hue='Label',
               data=data_df, fit_reg=False)
I'd suggest checking out the seaborn example gallery for more ideas.
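If you would rather stay in plain matplotlib, the same "one legend entry per label" result can be sketched by calling scatter once per group, so that each call gets a single string as its label. This is only a minimal sketch: the small DataFrame below is a made-up stand-in for AllMixedShapes2.csv, and only the column names are taken from the question.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for AllMixedShapes2.csv (column names from the question)
data_df = pd.DataFrame({
    "Number of Sides": [3, 3, 4, 4, 5, 5],
    "Largest Angle": [60, 90, 90, 120, 108, 140],
    "Label": ["Triangle", "Triangle", "Square", "Square", "Pentagon", "Pentagon"],
})

fig, ax = plt.subplots()
# One scatter call per label, so each label is a single string
for name, group in data_df.groupby("Label"):
    ax.scatter(group["Number of Sides"], group["Largest Angle"], label=name)

ax.set_xlabel("Number Of Sides")
ax.set_ylabel("Maximum Angle (Degrees)")
legend = ax.legend(bbox_to_anchor=(1.05, 1), loc="upper left", borderaxespad=0.0)

# bbox_inches="tight" expands the saved area so a legend outside the axes is not clipped
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
```

The legend then holds one entry per distinct value of the Label column instead of the whole array.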

Related

Seaborn lineplot without lines between points

How can I use the lineplot plotting function in seaborn to create a plot with no lines connecting the points? I know the function is called lineplot, but it has the useful feature of merging all datapoints with the same x value and plotting a single mean and confidence interval.
tips = sns.load_dataset('tips')
sns.lineplot(x='size', y='total_bill', data=tips, marker='o', err_style='bars')
How do I plot without the line? I'm not sure of a better way to phrase my question. How can I plot points only? Lineless lineplot?
I know that seaborn has a pointplot function, but that is for categorical data. In some cases, my x-values are continuous values, so pointplot would not work.
I realize one could get into the matplotlib figure artists and delete the line, but that gets more complicated as the amount of stuff on the plot increases. I was wondering if there are some sort of arguments that can be passed to the lineplot function.
To get error bars without the connecting lines, you can set the linestyle parameter to '':
import seaborn as sns
tips = sns.load_dataset('tips')
sns.lineplot(x='size', y='total_bill', data=tips, marker='o', linestyle='', err_style='bars')
Other types of linestyle could also be interesting, for example "a loosely dotted line": sns.lineplot(..., linestyle=(0, (1, 10)))
I recommend setting join=False.
For me only join = True works.
sns.pointplot(data=df, x="x_attribute", y="y_attribute", ci=95, join=False)
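For comparison, the "merge all points with the same x, then draw mean and error bar without a line" behaviour can also be sketched without seaborn, using a pandas groupby and matplotlib's errorbar. Note the assumptions: the data is synthetic, and the interval is a normal approximation (1.96 times the standard error), not the bootstrap interval seaborn computes by default.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical data: 25 noisy y observations for each of four continuous x values
df = pd.DataFrame({"x": np.repeat([0.5, 1.0, 2.5, 4.0], 25)})
df["y"] = 3.0 * df["x"] + rng.normal(scale=1.0, size=len(df))

# Mean and standard error of y for every distinct x
stats = df.groupby("x")["y"].agg(["mean", "sem"]).reset_index()

fig, ax = plt.subplots()
# fmt="o" draws markers only, so no line connects the points
container = ax.errorbar(stats["x"], stats["mean"], yerr=1.96 * stats["sem"],
                        fmt="o", capsize=3)
ax.set_xlabel("x")
ax.set_ylabel("mean of y")
```

Because the x positions are plain numbers here, the spacing between points stays proportional to their values, which is the part pointplot does not give you.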

Python implementation of non uniform (non linear) x-axis in matplotlib

I am trying to have a non linear x - axis in Python using matplotlib and haven't found any functions or hack arounds to this problem.
This is how our graph looks at this point of time and I want to convert it to something like this. (Look at the difference in x axes of both graphs)
The code I have as of now is:
plt.axis([0, 100, 0, 1])
plt.plot(onecsma_x, onecsma_y, label='1-CSMA')
plt.plot(slotted_aloha_x,slotted_aloha_y, label ='Slotted Aloha')
plt.plot(pure_aloha_x,pure_aloha_y, label ='Pure Aloha')
plt.plot(npcsma_x, npcsma_y, label ='Non persistent CSMA')
plt.plot(pcsma_x, pcsma_y, label ='P persistent CSMA')
plt.legend(loc='upper right')
plt.show()
For log x-axis use semilogx instead of plot.
You can also limit the x-axis after calling semilogx (but before show). Both limits must be positive on a log axis, because zero cannot be shown on a log scale, for example:
plt.xlim(1, 10**2)
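A minimal, self-contained sketch of the semilogx suggestion follows. The question's arrays are not shown, so the curve here is an assumption: the textbook slotted ALOHA throughput S = G * exp(-G) over an offered load G, used purely as placeholder data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: slotted ALOHA throughput S = G * exp(-G)
load = np.logspace(-2, 2, 200)
throughput = load * np.exp(-load)

fig, ax = plt.subplots()
# semilogx behaves like plot, but puts the x-axis on a log scale
ax.semilogx(load, throughput, label="Slotted Aloha")
ax.set_xlim(1e-2, 1e2)  # both limits must be positive on a log axis
ax.set_ylim(0, 1)
ax.legend(loc="upper right")
```

Calling ax.set_xscale("log") after an ordinary plot call has the same effect, which can be handy when the plotting code already exists.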

Show confidence interval in legend of plot in Python / Seaborn

I am generating some scatter plots with a linear regression and confidence interval using seaborn in Python, with the sns.regplot function. I could find a way to show the regression line in the legend, but I would also like to add the confidence interval to the legend (with the transparent blue as the reference colour).
Here is the code I have and the result I get so far.
Tobin_Nationality_Reg = sns.regplot(x="Nationality_Index_Normalized",
                                    y="Tobins_Q_2017",
                                    data=Scatter_Plot,
                                    line_kws={'label': 'Regression line'})
plt.xlabel("Nationality Index")
plt.ylabel("Tobin's Q")
plt.legend()
plt.savefig('Tobin_Nationality_Reg.png')
Here is the output I currently get:
Does anybody have an idea how I could do that? Thanks in advance.
I believe there is no clean way to do this, because seaborn does not expose keyword arguments for the fill_between call that plots the confidence interval.
However, it can be done by modifying the label attribute of the PolyCollection directly:
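import numpy as np
import seaborn as sns

x, y = np.random.rand(2, 20)
ax = sns.regplot(x=x, y=y, line_kws={'label': 'Regression line'})
# collections[0] is the scatter, collections[1] is the confidence band
ax.collections[1].set_label('Confidence interval')
ax.legend()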
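If indexing into ax.collections feels fragile, another route is a matplotlib proxy artist: a Patch that is never drawn on the axes and only supplies the legend entry. The sketch below uses plain matplotlib so it runs without seaborn. The shaded band is an illustrative stand-in (a fixed spread around a polyfit line), not seaborn's bootstrap confidence interval, and all data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Patch

rng = np.random.default_rng(1)
x = np.sort(rng.random(20))
y = 2.0 * x + rng.normal(scale=0.2, size=20)

# Straight-line fit plus an illustrative band (NOT a real confidence interval)
slope, intercept = np.polyfit(x, y, 1)
fit = slope * x + intercept
spread = 1.96 * np.std(y - fit)

fig, ax = plt.subplots()
ax.scatter(x, y)
(line,) = ax.plot(x, fit, label="Regression line")
ax.fill_between(x, fit - spread, fit + spread, color=line.get_color(), alpha=0.2)

# Proxy artist: exists only to give the unlabeled band a legend entry
band = Patch(facecolor=line.get_color(), alpha=0.2, label="Confidence interval")
handles, _ = ax.get_legend_handles_labels()
legend = ax.legend(handles=handles + [band])
```

The same last three lines should work on the Axes returned by sns.regplot, since the legend only needs handles and does not care how the band was drawn.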

How to use a colored shape as yticks in matplotlib or seaborn?

I am working on a task called knowledge tracing which estimates the student mastery level over time. I would like to plot a similar figure as below using the Matplotlib or Seaborn.
It uses different colors, instead of text, to represent each knowledge concept. However, I have searched and could not find any article that explains how to do this.
I tried the following
# simulate a record of student mastery level
student_mastery = np.random.rand(5, 30)
df = pd.DataFrame(student_mastery)
# plot the heatmap using seaborn
marker = matplotlib.markers.MarkerStyle(marker='o', fillstyle='full')
sns_plot = sns.heatmap(df, cmap="RdYlGn", vmin=0.0, vmax=1.0)
y_limit = 5
y_labels = [marker for i in range(y_limit)]
plt.yticks(range(y_limit), y_labels)
Yet it simply returns the __repr__ of the marker, e.g., <matplotlib.markers.MarkerStyle at 0x1c5bb07860> on the yticks.
Thanks in advance!
While the question "How can I make the xtick labels of a plot be simple drawings using matplotlib?" gives you a general solution for arbitrary shapes, for the shapes shown here it may make sense to use unicode symbols as text and colorize them according to your needs.
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(42)
fig, ax = plt.subplots()
ax.imshow(np.random.rand(3,10), cmap="Greys")
symbolsx = ["⚪", "⚪", "⚫", "⚫", "⚪", "⚫","⚪", "⚫", "⚫","⚪"]
colorsx = np.random.choice(["#3ba1ab", "#b43232", "#8ecc3a", "#893bab"], 10)
ax.set_xticks(range(len(symbolsx)))
ax.set_xticklabels(symbolsx, size=40)
for tick, color in zip(ax.get_xticklabels(), colorsx):
    tick.set_color(color)
symbolsy = ["◾", "◾", "◾"]
ax.set_yticks(range(len(symbolsy)))
ax.set_yticklabels(symbolsy, size=40)
for tick, color in zip(ax.get_yticklabels(), ["crimson", "gold", "indigo"]):
    tick.set_color(color)
plt.show()
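Applied to the shape of the data in the question (5 concepts by 30 time steps), the same idea might look like the sketch below. It uses imshow rather than sns.heatmap to stay dependency-free, and the colour chosen for each concept is a made-up assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
# Simulated mastery levels: 5 knowledge concepts x 30 time steps
student_mastery = rng.random((5, 30))

# Hypothetical colour assigned to each knowledge concept
concept_colors = ["crimson", "gold", "indigo", "teal", "darkorange"]

fig, ax = plt.subplots(figsize=(10, 2.5))
ax.imshow(student_mastery, cmap="RdYlGn", vmin=0.0, vmax=1.0, aspect="auto")

# One filled circle per row as the tick label, coloured per concept
ax.set_yticks(range(len(concept_colors)))
ax.set_yticklabels(["●"] * len(concept_colors), size=20)
for tick, color in zip(ax.get_yticklabels(), concept_colors):
    tick.set_color(color)
```

The tick labels are ordinary Text objects, so anything that can be set on text (size, colour, font) can be set per tick.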

Matplotlib center alignment for pie chart labels

I have produced a very simple pie chart in Python using Matplotlib and I want to change the alignment of my labels. I have used \n within my labels to split the line, as the labels are too long for one line. But as you can see from the picture called 'pie chart image', it's a mix of weird alignments at the moment. I would really like the labels to be center-aligned.
For other chart/graph types in Matplotlib there is an argument called align where you can set it to center, however, plt.pie(...) does not seem to have this attribute.
Here is my code:
import matplotlib.pyplot as plt
k = [7,15]
labels = 'Strongly and Mostly \n Agree', 'Strongly/Mostly Disagree \n and In the Middle'
plt.pie(k, labels= labels)
plt.show()
Any ideas?
You can pass a dictionary of text properties to plt.pie via the textprops argument. For example:
plt.pie(k, labels=labels, textprops={'weight': 'bold'})
However, if you try to specify the horizontalalignment property, you'll get an error saying that you provided that parameter twice. Obviously you didn't, but matplotlib passed both its hard-coded value and your value to some internal function.
But that's probably a good thing. The way I see it, there's not so much a mix of alignments, but a consistent alignment of the text against the pie.
Back to your question
pie returns both the patches and the labels for each wedge. So you can loop through the labels after your initial call to pie to modify their alignment. That looks like this:
import matplotlib.pyplot as plt

k = [7, 15]
labels = 'Strongly and Mostly\nAgree', 'Strongly/Mostly Disagree\nand In the Middle'
fig, ax = plt.subplots()
ax.set_aspect('equal')
# pie returns the wedge patches and the label Text objects
wedges, labels = ax.pie(k, labels=labels, textprops={'weight': 'bold'})
for label in labels:
    label.set_horizontalalignment('center')
As you can see, the labels now overlap with the wedges, diminishing legibility.
The labels also have a set_position method (i.e., label.set_position((x, y))), but recomputing the positions for N labels in a pie chart sounds like a Bad Time to me.
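A cheaper way to reduce the overlap than recomputing positions by hand is the labeldistance argument of pie, which places every label at a given multiple of the pie radius (the default is 1.1). The sketch below combines it with the centering loop. The value 1.35 is a guess that will likely need tuning for other label lengths and figure sizes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt

k = [7, 15]
labels = ["Strongly and Mostly\nAgree", "Strongly/Mostly Disagree\nand In the Middle"]

fig, ax = plt.subplots()
ax.set_aspect("equal")

# labeldistance is measured in pie radii; 1.35 is an assumed value, tune as needed
wedges, texts = ax.pie(k, labels=labels, labeldistance=1.35)

# Center each multi-line label on its anchor point
for text in texts:
    text.set_horizontalalignment("center")
```

Long labels near the left or right edge of the pie may still touch the wedges, in which case a larger labeldistance or a legend is probably the simpler fix.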
