I need to do what has been explained for MATLAB here:
How to show legend for only a specific subset of curves in the plotting?
But using Python instead of MATLAB.
Brief summary of my goal: when plotting for example three curves in the following way
from matplotlib import pyplot as plt
a=[1,2,3]
b=[4,5,6]
c=[7,8,9]
# these are the curves
plt.plot(a)
plt.plot(b)
plt.plot(c)
plt.legend(['a','nothing','c'])
plt.show()
Instead of the word "nothing", I would like not to have anything there.
Using '_' will suppress the legend for a particular entry, as follows (continue reading for how to show an underscore _ as a legend label). This solution is motivated by the recent post of @ImportanceOfBeingEarnest here.
plt.legend(['a','_','c'])
I would also avoid building the legend the way you do now, because you then have to make sure that the plot commands are in the same order as the legend entries. Instead, put the label in the respective plot command to avoid errors.
That being said, the straightforward and easiest solution (in my opinion) is to do the following
plt.plot(a, label='a')
plt.plot(b)
plt.plot(c, label='c')
plt.legend()
As @Lucas pointed out in a comment, you may want to show a literal underscore _ as the label for plot b. You can do that using
plt.legend(['a', r'$\_$', 'c'])
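A related option is to pass only the wanted lines to the legend explicitly. The following is a minimal sketch of that idea, not part of the original answer; the variable names line_a, line_b, line_c and legend_labels are made up for illustration, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt

a = [1, 2, 3]
b = [4, 5, 6]
c = [7, 8, 9]

fig, ax = plt.subplots()
# ax.plot returns a list of lines; unpack the single line of each call
(line_a,) = ax.plot(a, label="a")
(line_b,) = ax.plot(b)  # no label, and not passed to the legend below
(line_c,) = ax.plot(c, label="c")

# Only the handles listed here get a legend entry
legend = ax.legend(handles=[line_a, line_c])
legend_labels = [text.get_text() for text in legend.get_texts()]
```

This keeps the choice of legend entries in one place, independent of the order of the plot commands.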
I would like to create a Seaborn scatter-plot, using the following dataframe:
df = pd.DataFrame({'A':[1,2,3,4],'B':[2,4,6,8],'C':['y','y','n','n'],'D':[1,1,2,2]})
In my graph, A should be the x-variable and B the y-variable. Furthermore, I would like to color the points based on column D. Finally, when C='y' the marker should be open-faced (no facecolor), and when C='n' the marker should be filled (closed). My original idea was to use the hue and style parameters:
sns.scatterplot(x='A', y='B',
data=df, hue='D',style ='C')
However, I did not manage to obtain the graph I am looking for. Could somebody help me with this? Thank you in advance.
One cannot specify entire marker styles (so the 'marker' and 'fillstyle' keys in your case) for matplotlib yet. Have a look at the answer to this post.
So the only thing left for you is to use different markers right away and specify them (as a list or dictionary):
sns.scatterplot(data=df, x='A', y='B', hue='D', style='C', markers=['o', 's'])
plt.show()
Apparently, it is very hard to even create non-filled markers in seaborn, as this post explains. The only option is to do some matplotlib-seaborn-hybrid thing... So if you accept to plot things twice onto the same axis (one for a filled marker and one for the unfilled markers), you still have to dig yourself into the quirks of seaborn...
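If plain matplotlib is acceptable, the "plot twice onto the same axis" idea can be sketched as below. This is an illustration under assumptions, not the seaborn solution: the palette mapping and the names hollow and filled are made up, the dataframe columns are written as plain lists, and the Agg backend is used only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt

# Same values as the dataframe in the question, as plain lists
A = [1, 2, 3, 4]
B = [2, 4, 6, 8]
C = ["y", "y", "n", "n"]
D = [1, 1, 2, 2]

palette = {1: "tab:blue", 2: "tab:orange"}  # hypothetical color per D value

def select(flag):
    """Return x values, y values and colors of the rows where C == flag."""
    rows = [(x, y, palette[d]) for x, y, c, d in zip(A, B, C, D) if c == flag]
    xs, ys, colors = zip(*rows)
    return list(xs), list(ys), list(colors)

fig, ax = plt.subplots()

# C == 'y': open-faced markers, only the edge carries the D color
xs, ys, colors = select("y")
hollow = ax.scatter(xs, ys, facecolors="none", edgecolors=colors)

# C == 'n': filled markers in the D color
xs, ys, colors = select("n")
filled = ax.scatter(xs, ys, facecolors=colors, edgecolors=colors)
```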
I used a code like:
g = sns.pairplot(df.loc[:,['column1','column2','column3','column4','column5']])
g.map_offdiag(plt.hexbin, gridsize=(20,20))
to get a pairplot, and I expected the upper- and lower-triangle plots to be mirrored. The plots look like this:
I thought maybe the problems are the histograms so I tried to tighten the axes using plt.axis('tight') and plt.autoscale(enable=True, axis='y', tight=True) but nothing changed. I also got rid of the diagonal plots (made them invisible), but still the triangle plots are not mirrored. Why? and how to fix it?
Although I still do not understand why pairplot behaves this way here, I found a workaround: I access each plot within the pairplot individually and set its limits manually.
g.axes[I,J].set_ylim(df.column3.min(),df.column3.max())
In this case, I had to repeat this piece of code 5 times, where I = 2 and J = 0,1,2,3,4.
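The repetition can be replaced by a loop over the row of axes. The sketch below uses a plain plt.subplots grid as a stand-in for g.axes, and random data as a stand-in for the five dataframe columns; both are assumptions for illustration, as is the Agg backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 5))  # made-up stand-in for column1..column5

# 5x5 grid standing in for the axes array of the pairplot
fig, axes = plt.subplots(5, 5)
for i in range(5):
    for j in range(5):
        axes[i, j].scatter(data[:, j], data[:, i], s=2)

# Set the y-limits of every plot in row 2 from the data of that row's variable
row = 2
lo, hi = data[:, row].min(), data[:, row].max()
for ax in axes[row, :]:
    ax.set_ylim(lo, hi)
```

With the pairplot from the question, the same loop would run over g.axes[2, :] and use df.column3.min() and df.column3.max() as limits.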
I'm having a problem that (I think) should have a fairly simple solution. I'm still a relative novice in Python, so apologies if I'm doing something obviously wrong. I'm just trying to create a simple plot with multiple lines, where each line is colored by its own specific, user-defined color. When I run the following code as a test for one of the colors it ends up giving me a blank plot. What am I missing here? Thank you very much!
import numpy as np
import matplotlib.pyplot as plt
from colour import Color
dbz53 = Color('#DD3044')
*a bunch of arrays of data, two of which are called x and mpt1*
fig, ax = plt.subplots()
ax.plot(x, mpt1, color='dbz53', label='53 dBz')
ax.set_yscale('log')
ax.set_xlabel('Diameter (mm)')
ax.set_ylabel('$N(D) (m^-4)$')
ax.set_title('N(D) vs. D')
#ax.legend(loc='upper right')
plt.show()
The statement
ax.plot(x, mpt1, color='dbz53', label='53 dBz')
is wrong because 'dbz53' is quoted: Python passes it as a literal string, which does not correspond to any known color value.
You can simply put
color='#DD3044'
and it will work.
Or you can try
color=dbz53.get_hex()
without quotes, if you want to use the colour module you imported.
In the plot command, you can enter hex colours directly. A much simpler way to beautify your plot is to use matplotlib styles. For instance, before any plot function, just write
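The difference between the quoted name and the hex value can be checked with matplotlib's own is_color_like helper. This is a small sketch that leaves out the colour module; the data values for x and mpt1 are made up, since the original arrays are not shown, and the Agg backend is used only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.colors import is_color_like

dbz53 = "#DD3044"  # keep the hex value in a variable

name_is_valid = is_color_like("dbz53")  # the quoted variable name, not a color
hex_is_valid = is_color_like(dbz53)     # the hex string stored in the variable

# Made-up data standing in for the arrays x and mpt1 from the question
x = [1, 2, 3, 4]
mpt1 = [1000.0, 100.0, 10.0, 1.0]

fig, ax = plt.subplots()
(line,) = ax.plot(x, mpt1, color=dbz53, label="53 dBz")  # variable, no quotes
ax.set_yscale("log")
ax.legend(loc="upper right")
```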
plt.style.use('ggplot')
I have a set of data that comes from two different sources, and I have multiple sets graphed together. So essentially 6 scatterplots with error bars (all different colors), and each scatterplot has two sources.
Basically I want the blue scatterplot to have two different markers, 'o' and 's'. I currently do this by plotting each point individually in a loop and checking whether the source is 1 or 2. If it is 1, it plots an 's'; if the source is 2, it plots an 'o'.
However this method does not really allow for having a legend. (Data1, Data2,...Data6)
Is there a better way of doing this?
EDIT:
I want a cleaner method for this, something along the lines of
x=[1,2,3]
y=[4,5,6]
m=['o','s','^']
plt.scatter(x,y,marker=m)
But this raises the error: Unrecognized marker style
A more pythonic way (but still a loop) might be something like
x=[1,2,3]
y=[4,5,6]
l=['data1','data2','data3']
m=['ob','sb','^b']
f,a = plt.subplots(1,1)
[a.plot(*data, label=lab) for data,lab in zip(zip(x,y,m),l)]
plt.legend(loc='lower right')
plt.xlim(0,4)
plt.ylim(3,7);
But I guess this is not the most efficient way if you have lots of datapoints.
If you want to use scatter try something like
m=['o','s','^']
f,a = plt.subplots(1,1)
[a.scatter(*data, marker=m1, label=l1) for data,m1,l1 in zip(zip(x,y),m,l)]
I'm pretty sure there is also a way to apply ** and dicts here.
UPDATE:
Instead of looping over the plot command, you can use the ability of matplotlib's plot function to read an arbitrary number of x,y,fmt groups; see the docs.
x=np.random.random((3,6))
y=np.random.random((3,6))
l=['data1','data2','data3']
m=['ob','sb','^b']
plt.plot(*[i[j] for i in zip(x,y,m) for j in range(3)])
plt.legend(l,loc='lower right')
Calling plot in a loop is fine. You just need to keep the list of lines returned by plot and use fig.legend to create a legend for the whole figure. See http://matplotlib.org/examples/pylab_examples/figlegend_demo.html
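A sketch of that approach for the two-source case: plot each source of a dataset as one call (not one call per point), keep one returned line per dataset, and pass those to fig.legend. The dataset contents, the colors and the names datasets, markers, handles and labels are all made up for illustration; the Agg backend is used only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt

# Hypothetical data: each point is (x, y, source)
datasets = {
    "Data1": [(1, 4, 1), (2, 5, 2), (3, 6, 1)],
    "Data2": [(1, 5, 2), (2, 6, 1), (3, 7, 2)],
}
colors = ["blue", "red"]
markers = {1: "s", 2: "o"}  # source 1 -> square, source 2 -> circle

fig, ax = plt.subplots()
handles, labels = [], []
for (name, points), color in zip(datasets.items(), colors):
    lines = []
    for source, marker in markers.items():
        xs = [x for x, _, s in points if s == source]
        ys = [y for _, y, s in points if s == source]
        # One plot call per source, markers only
        (line,) = ax.plot(xs, ys, linestyle="none", marker=marker, color=color)
        lines.append(line)
    # One legend entry per dataset, represented by its first line
    handles.append(lines[0])
    labels.append(name)

legend = fig.legend(handles, labels, loc="upper right")
legend_labels = [text.get_text() for text in legend.get_texts()]
```

This gives one legend entry per dataset regardless of how many points or sources it contains.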
Seconding @tcaswell's comments: .scatter() returns a collections.PathCollection, which provides a fast way of plotting a large number of identically shaped objects. You can use a loop to plot the data as many scatter plots (and many different datasets), but in my opinion it loses all the speed benefit provided by .scatter().
That being said, it is not true that the dots have to be identical in a scatter plot. You can have different linewidths, edgecolors and many other things, but the dots have to be the same shape. See this example, which assigns different colors (and plots only one dataset):
>>> sc=plt.scatter(x, y, label='test')
>>> sc.set_color(['r','g','b'])
>>> plt.legend()
See details in http://matplotlib.org/api/collections_api.html.
These were all alright, but not really what I was looking for. The problem was how I parsed through my data and how I could add a legend in a way that wouldn't mess that up. Since I used a for-loop and plotted each point individually, based on whether it was measured at observation location 1 or 2, any legend I made had over 50 entries. So I plotted my data as full sets (invisibly and with no change in symbols), then again in color with the varying symbols. This worked better. Thanks though.
I have started my IPython Notebook with
ipython notebook --pylab inline
This is my code in one cell
df['korisnika'].plot()
df['osiguranika'].plot()
This is working fine, it will draw two lines, but on the same chart.
I would like to draw each line on a separate chart.
And it would be great if the charts would be next to each other, not one after the other.
I know that I can put the second line in the next cell, and then I would get two charts. But I would like the charts close to each other, because they represent the same logical unit.
You can also call the show() function after each plot, e.g.
plt.plot(a)
plt.show()
plt.plot(b)
plt.show()
Make the multiple axes first and pass them to the Pandas plot function, like:
fig, axs = plt.subplots(1,2)
df['korisnika'].plot(ax=axs[0])
df['osiguranika'].plot(ax=axs[1])
It still gives you 1 figure, but with two different plots next to each other.
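The same layout can be sketched with plain matplotlib, which also shows how figsize and tight_layout help keep the two charts readable side by side. The values below are made up and the two columns are written as plain lists, so this is an illustration rather than the pandas call itself; the Agg backend is used only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt

# Made-up values standing in for the two dataframe columns
korisnika = [10, 12, 15, 11]
osiguranika = [7, 9, 8, 12]

# One figure, one row, two columns; a wide figsize keeps both charts legible
fig, axs = plt.subplots(1, 2, figsize=(10, 4))
axs[0].plot(korisnika)
axs[0].set_title("korisnika")
axs[1].plot(osiguranika)
axs[1].set_title("osiguranika")
fig.tight_layout()  # avoid overlapping tick labels between the two charts
```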
Something like this:
import matplotlib.pyplot as plt
... code for plot 1 ...
plt.show()
... code for plot 2...
plt.show()
Note that this will also work if you are using the seaborn package for plotting:
import matplotlib.pyplot as plt
import seaborn as sns
sns.barplot(... code for plot 1 ...) # plot 1
plt.show()
sns.barplot(... code for plot 2 ...) # plot 2
plt.show()
Another way, for variety, although it is somewhat less flexible than the others. Unfortunately, the graphs appear one above the other rather than side by side, as you requested in your original question. But it is very concise.
df.plot(subplots=True)
If the dataframe has more than the two series, and you only want to plot those two, you'll need to replace df with df[['korisnika','osiguranika']].
I don't know if this is new functionality, but this will plot on separate figures:
df.plot(y='korisnika')
df.plot(y='osiguranika')
while this will plot on the same figure (just like the code in the OP):
df.plot(y=['korisnika','osiguranika'])
I found this question because I was using the former method and wanted them to plot on the same figure, so your question was actually my answer.