Matplotlib plot legend shows markers twice


The legend in my plot shows each marker icon twice.
The code that produced this plot is given below:
import pandas as pd
import random
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
N = 15
colors = cm.rainbow(np.linspace(0, 1, N))
df = []
for i in range(N):
    s = 'NAME %d' % i
    df.append(dict(x=random.random(), y=random.random(), name=s))
df = pd.DataFrame(df)
c = 0
labels = []
fig, ax = plt.subplots(figsize=(12,12))
for name, group in df.groupby('name'):
    x = group['x'].values[0]
    y = group['y'].values[0]
    color = colors[c]
    c += 1
    ax.plot(x, y, color=color, marker='o', linestyle='', label=name)
    labels.append(name)
handels, _ = ax.get_legend_handles_labels()
ax.legend(handels, labels)
Why is this happening?
My actual df has multiple entries for each name so that's why I do a groupby. Is there something I'm missing here?

You can either pass numpoints=1 directly, as in plt.legend(loc=..., numpoints=1), or create a style sheet that sets legend.numpoints : 1
If you use a Linux system, place your style sheets in ~/.config/matplotlib/stylelib/ and load them with plt.style.use([your_style_sheet]). You can also combine several sheets, e.g. one for the colors and one for the size: plt.style.use([my_colors,half_column_latex])
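As a minimal sketch of the first option, the snippet below plots a few groups as markers only and asks the legend for a single marker per entry. The data and group names are invented for illustration, and the grouping uses a plain NumPy mask rather than the groupby from the question. In Matplotlib 2.0 and later the default for legend.numpoints is already 1, so the explicit argument mainly matters on older versions or with custom style sheets.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt

# Made-up data: 15 points spread over 5 named groups (illustrative only).
rng = np.random.RandomState(0)
names = np.array(["NAME %d" % (i % 5) for i in range(15)])
xs, ys = rng.rand(15), rng.rand(15)

unique_names = sorted(set(names))
colors = plt.cm.rainbow(np.linspace(0, 1, len(unique_names)))

fig, ax = plt.subplots()
for color, name in zip(colors, unique_names):
    mask = names == name
    # One plot call per group, so each group yields exactly one legend entry.
    ax.plot(xs[mask], ys[mask], color=color, marker="o", linestyle="", label=name)

# numpoints=1 draws a single marker in each legend entry.
# The rc equivalent is plt.rcParams["legend.numpoints"] = 1.
legend = ax.legend(numpoints=1)
```

Because every group is drawn with one plot call that carries its own label, ax.legend() can be called without passing handles and labels by hand.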

Related

two DataFrame plots

I have a plot similar to the one answered in the linked question:
two DataFrame plot in a single plot matplotlip
I modified the code block that plots the df2 columns, because I think that is where the change belongs, but I could not produce the output I want.
I have a sample of the plot I want.
This is how I modified it:
f, axes = plt.subplots(nrows=len(signals.columns)+1, sharex=True, )
i = 0
for col in df2.columns:
    fig, axs = plt.subplots()
    sns.regplot(x='', y='', data=df2, ax=axs[0])
    df2[col].plot(ax=axes[i], color='grey')
    axes[i].set_ylabel(col)
    i+=1
I can see that it's wrong.
I then tried the following, which seems like some headway :)
How do I modify this to get what I want?
f, axes = plt.subplots(nrows=len(signals.columns)+1, sharex=True, )
# plots for df2 columns
i = 0
for col in df2.columns:
    lw=1
    df2[col].plot(ax=axes[i], color='grey')
    axes[i].set_ylim(0, 1)
    axes[i].set_ylabel(col)
sns.rugplot(df2["P1"])

You have several options to make this graph. df1 and df2 are as defined in your previous question.
The version with matplotlib.pyplot.scatter is faster to draw, but less faithful to the example. The version with seaborn.rugplot looks identical to the example, but takes longer to draw. I highlighted the important part of the code between the ####### comment lines.
Using matplotlib.pyplot.scatter:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
f, axes = plt.subplots(nrows=len(df2.columns)+1, sharex=True,
                       gridspec_kw={'height_ratios':np.append(np.repeat(1, len(df2.columns)), 3)})
####### variable part below #######
# plots for df2 columns
i = 0
for col in df2.columns:
    axes[i].scatter(x=df2.index, y=np.repeat(0, len(df2)), c=df2[col], marker='|', cmap='Greys')
    axes[i].set_ylim(-0.5, 0.5)
    axes[i].set_yticks([0])
    axes[i].set_yticklabels([col])
    i+=1
###################################
## code to plot annotations
axes[-1].set_xlabel('Genomic position')
axes[-1].set_ylabel('annotations')
axes[-1].set_ylim(-0.5, 1.5)
axes[-1].set_yticks([0, 1])
axes[-1].set_yticklabels(['−', '+'])
for _, r in df1.iterrows():
    marker = '|'
    lw=1
    if r['type'] == 'exon':
        marker=None
        lw=8
    y = 1 if r['strand'] == '+' else 0
    axes[-1].plot((r['start'], r['stop']), (y, y),
                  marker=marker, lw=lw,
                  solid_capstyle='butt',
                  color='#505050')
# remove space between plots
plt.subplots_adjust(hspace=0)
axes[-1].set_xlim(0, len(df2))
f.set_size_inches(6, 2)
Using seaborn.rugplot:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
f, axes = plt.subplots(nrows=len(df2.columns)+1, sharex=True,
                       gridspec_kw={'height_ratios':np.append(np.repeat(1, len(df2.columns)), 3)})
####### variable part below #######
import matplotlib
import matplotlib.cm as cm
norm = matplotlib.colors.Normalize(vmin=0, vmax=1, clip=True)
mapper = cm.ScalarMappable(norm=norm, cmap=cm.Greys)
# plots for df2 columns
i = 0
for col in df2.columns:
    sns.rugplot(x=df2.index, color=list(map(mapper.to_rgba, df2[col])), height=1, ax=axes[i])
    axes[i].set_yticks([0])
    axes[i].set_yticklabels([col])
    i+=1
###################################
## code to plot annotations
axes[-1].set_xlabel('Genomic position')
axes[-1].set_ylabel('annotations')
axes[-1].set_ylim(-0.5, 1.5)
axes[-1].set_yticks([0, 1])
axes[-1].set_yticklabels(['−', '+'])
for _, r in df1.iterrows():
    marker = '|'
    lw=1
    if r['type'] == 'exon':
        marker=None
        lw=8
    y = 1 if r['strand'] == '+' else 0
    axes[-1].plot((r['start'], r['stop']), (y, y),
                  marker=marker, lw=lw,
                  solid_capstyle='butt',
                  color='#505050')
# remove space between plots
plt.subplots_adjust(hspace=0)
axes[-1].set_xlim(0, len(df2))
f.set_size_inches(6, 2)

Separate title for each subplot in a for loop in Python

I am trying to use subplots within a for loop. I can plot all my graphs, but I can't give them individual x and y labels and titles: these are applied only to the last subplot.
import numpy as np
import astropy
import matplotlib.pyplot as plt
import pandas as pd
# Import 18 file names with similar names
from glob import glob
filenames = glob('./*V.asc')
df = [np.genfromtxt(f) for f in filenames]
A = np.stack(df, axis=0)
#Begin subplot
nrows = 3
ncols = 6
fig, ax = plt.subplots(nrows = nrows, ncols = ncols, figsize=(30,15))
#Loop over each filename i, row j and column k
i = 0
for j in range(0, nrows):
    for k in range(0, ncols):
        ax[j,k].plot(A[i,:,0], A[i,:,1])
        plt.title(filenames[i], fontsize = '25')
        i += 1
plt.subplots_adjust(wspace=.5, hspace=.5)
fig.show()
I can plot the data in separate figures, 18 in total, and that works fine:
for i in range(0, len(A)):
    plt.figure(i)
    plt.title(filenames[i], fontsize = '30')
    plt.plot(A[i,:,0], A[i,:,1])
    plt.xlabel('Wavelength [Å]', fontsize = 20)
    plt.ylabel('Flux Density [erg/s/cm^2/Å]', fontsize = 20)
    plt.xticks(fontsize = 20)
    plt.yticks(fontsize = 20)
I update the title each iteration i, same as the subplot, so I don't understand why it doesn't work.
Any input is appreciated!

plt.title() acts on the current Axes, which is generally the last one created, not the Axes you have in mind.
In general, if you have several Axes, you are better off using Matplotlib's object-oriented interface rather than the pyplot interface (see the Matplotlib usage guide).
Replace:
plt.title(filenames[i], fontsize = '25')
with:
ax[j,k].set_title(filenames[i], fontsize = '25')
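To illustrate the object-oriented approach on a full grid, here is a small sketch that sets the title and axis labels on each Axes. The names and the sine curve are placeholders standing in for the file names and spectra from the question, which are not available here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt

nrows, ncols = 3, 6
# Placeholder names standing in for the 18 file names of the question.
names = ["spectrum_%02d" % i for i in range(nrows * ncols)]
x = np.linspace(4000, 7000, 50)

fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(30, 15))
# axes.flat walks the 2D array of Axes row by row, so no manual counter is needed.
for ax, name in zip(axes.flat, names):
    ax.plot(x, np.sin(x / 300.0))    # dummy curve instead of real data
    ax.set_title(name, fontsize=25)  # acts on this Axes only
    ax.set_xlabel("Wavelength [Å]")
    ax.set_ylabel("Flux Density")
fig.subplots_adjust(wspace=0.5, hspace=0.5)
```

Every call goes through the Axes object itself, so nothing depends on which Axes pyplot currently treats as current.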

Bar Chart using Matplotlib

I have two values:
test1 = 0.75565
test2 = 0.77615
I am trying to plot a bar chart (using Matplotlib in a Jupyter notebook) with the two tests on the x-axis and their values on the y-axis, but I keep getting a crazy plot with just one big box.
Here is the code I've tried:
plt.bar(test1, 1, width = 2, label = 'test1')
plt.bar(test2, 1, width = 2, label = 'test2')

You should define the x positions and the y values in two separate arrays, like this:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(2)
y = [0.75565,0.77615]
fig, ax = plt.subplots()
plt.bar(x, y)
# set your labels for the x axis here :
plt.xticks(x, ('test1', 'test2'))
plt.show()
UPDATE
If you want each bar drawn in a different color, one way is to call the bar method once per bar; each call then takes the next color from the default color cycle:
import matplotlib.pyplot as plt
import numpy as np
number_of_points = 2
x = np.arange(number_of_points)
y = [0.75565,0.77615]
fig, ax = plt.subplots()
for i in range(number_of_points):
    plt.bar(x[i], y[i])
# set your labels for the x axis here :
plt.xticks(x, ('test1', 'test2'))
plt.show()
Or, better still, you can choose the colors yourself:
import matplotlib.pyplot as plt
import numpy as np
number_of_points = 2
x = np.arange(number_of_points)
y = [0.75565,0.77615]
# choosing the colors and keeping them in a list
colors = ['g','b']
fig, ax = plt.subplots()
for i in range(number_of_points):
    plt.bar(x[i], y[i],color = colors[i])
# set your labels for the x axis here :
plt.xticks(x, ('test1', 'test2'))
plt.show()
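As an alternative to the loop, bar also accepts a sequence of colors, so the same chart can be drawn with a single call. This is a minimal sketch using the Axes interface; the bar width of 0.6 and the y-axis label are arbitrary choices made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import numpy as np
import matplotlib.pyplot as plt

labels = ["test1", "test2"]
values = [0.75565, 0.77615]
colors = ["g", "b"]         # one color per bar
x = np.arange(len(values))  # evenly spaced positions 0 and 1

fig, ax = plt.subplots()
# A single call draws both bars; the color sequence is matched bar by bar.
bars = ax.bar(x, values, width=0.6, color=colors)
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.set_ylabel("value")
```

The returned container holds one rectangle per bar, which is handy if individual bars need further styling afterwards.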

The main reason your plot shows one big box is that the width you set for the bars (2) is far greater than the distance between the explicit x values you passed (0.77615 - 0.75565 = 0.0205), so the two bars overlap almost completely. Reduce the width to see the individual bars. The only advantage of doing it this way is when you need to set the x values (and y values) explicitly on a bar chart for some reason. Otherwise, the other answer is what you need for a "traditional bar chart".
import matplotlib.pyplot as plt
test1 = 0.75565
test2 = 0.77615
plt.bar(test1, 1, width = 0.01, label = 'test1')
plt.bar(test2, 1, width = 0.01, label = 'test2')

Row Titles within a matplotlib GridSpec

I have a GridSpec-defined layout with two subgrids; one of them is supposed to contain a colorbar:
import pylab as plt
import numpy as np
gs_outer = plt.GridSpec(1, 2, width_ratios=(10, 1))
gs_inner = plt.matplotlib.gridspec.GridSpecFromSubplotSpec(2, 3, gs_outer[0])
ax = []
for i in range(6):
    ax.append(plt.subplot(gs_inner[i]))
    plt.setp(ax[i].get_xticklabels(), visible=False)
    plt.setp(ax[i].get_yticklabels(), visible=False)
ax.append(plt.subplot(gs_outer[1]))
plt.show()
I'd now like to add row-wise labels to the left part, i.e. one title per row of subplots.
I tried to add another GridSpec into the GridSpec, but that did not work out:
import pylab as plt
import numpy as np
fig = plt.figure()
gs_outer = plt.GridSpec(1, 2, width_ratios=(10, 1))
gs_medium = plt.matplotlib.gridspec.GridSpecFromSubplotSpec(3, 1, gs_outer[0])
ax_title0 = plt.subplot(gs_medium[0])
ax_title0.set_title('Test!')
gs_row1 = plt.matplotlib.gridspec.GridSpecFromSubplotSpec(1, 3, gs_medium[0])
ax00 = plt.subplot(gs_row1[0]) # toggle this line to see the effect
plt.show()
Adding the ax00 = plt.subplot... line seems to erase the previously created axis

Following CT Zhu's comment, I came up with the following answer (I don't really like it, but it seems to work):
import pylab as plt
import numpy as np
fig = plt.figure()
rows = 2
cols = 3
row_fraction = 9
row_size = row_fraction / float(rows)
gs_outer = plt.GridSpec(1,2, width_ratios=(9,1))
gs_plots= plt.matplotlib.gridspec.GridSpecFromSubplotSpec(rows * 2, cols, subplot_spec=gs_outer[0], height_ratios = rows * [1, row_size])
# Create title_axes
title_ax = []
for ta in range(rows):
    row_index = (ta) * 2
    title_ax.append(plt.subplot(gs_plots[row_index, :]))
# Create Data axes
ax = []
for row in range(rows):
    row_index = (row + 1) * 2 -1
    for col in range(cols):
        try:
            ax.append(plt.subplot(gs_plots[row_index, col], sharex=ax[0], sharey=ax[0]))
        except IndexError:
            # ax is still empty for the very first data axes, so there is nothing to share with
            if row == 0 and col == 0:
                ax.append(plt.subplot(gs_plots[row_index, col]))
            else:
                raise
# Delete Boxes and Markers from title axes
for ta in title_ax:
    ta.set_frame_on(False)
    ta.xaxis.set_visible(False)
    ta.yaxis.set_visible(False)
# Add labels to title axes:
for ta, label in zip(title_ax, ['Row 1', 'Row 2']):
    plt.sca(ta)
    plt.text(
        0.5, 0.5, label, horizontalalignment='center', verticalalignment='center')
# Add common colorbar
gs_cb = plt.matplotlib.gridspec.GridSpecFromSubplotSpec(
    1, 1, subplot_spec=gs_outer[1])
ax.append(plt.subplot(gs_cb[:, :]))
Of course, the labeling and tick labels could be improved, but how to achieve that is likely already explained elsewhere on SO.

Let's define an example grid pltgrid:
from matplotlib import gridspec
pltgrid = gridspec.GridSpec(ncols=3, nrows=2,
                            width_ratios=[1]*3, wspace=0.3,
                            hspace=0.6, height_ratios=[1]*2)
Before your for loop, you can define a list ax using map:
num=list(range(7))
ax=list(map(lambda x : 'ax'+str(x), num))
You may have a list plotnames containing the names. As an example, I'll plot a normal distribution Q-Q plot for each i in the for loop:
# assumes fig, x, plotnames and scipy's stats module are already defined
for i in range(6):
    ax[i]=fig.add_subplot(pltgrid[i])
    res = stats.probplot(x, dist="norm", plot=ax[i])
    # set title for subplot using existing 'plotnames' list
    ax[i].set_title(plotnames[i])
    # display subplot
    ax[i]
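A different technique from the nested GridSpec answers above is to use subfigures, which lets each row carry its own title. The sketch below assumes Matplotlib 3.4 or later, where Figure.subfigures is available; the random images and the labels "Row 1" and "Row 2" are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 4))
# Left: the plot rows; right: a narrow column reserved for the colorbar.
left, right = fig.subfigures(1, 2, width_ratios=[10, 1])
row_figs = left.subfigures(2, 1)  # one subfigure per row of plots

row_titles = []
images = []
for row_fig, label in zip(row_figs, ["Row 1", "Row 2"]):
    # suptitle on a subfigure acts as a title for that row only
    row_titles.append(row_fig.suptitle(label))
    for ax in row_fig.subplots(1, 3, sharex=True, sharey=True):
        images.append(ax.imshow(np.random.rand(8, 8), vmin=0, vmax=1))
        ax.tick_params(labelbottom=False, labelleft=False)

# One Axes in the right subfigure hosts the shared colorbar.
cax = right.subplots()
fig.colorbar(images[0], cax=cax)
```

All images share the same vmin and vmax, so a colorbar built from the first image describes every panel.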

How should I pass a matplotlib object through a function; as Axis, Axes or Figure?

Sorry in advance if this is a little long-winded, but if I cut it down too much the problem is lost. I am trying to build a module on top of pandas and matplotlib that lets me make profile plots and profile matrices analogous to scatter_matrix. I am pretty sure my problem comes down to which object I need to return from Profile() so that I can handle Axes manipulation in Profile_Matrix(). The follow-up question is what to return from Profile_Matrix() so that I can edit the subplots.
My module (ProfileModule.py) borrows a lot from https://github.com/pydata/pandas/blob/master/pandas/tools/plotting.py and looks like:
import pandas as pd
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
def Profile(x,y,nbins,xmin,xmax):
    df = DataFrame({'x' : x , 'y' : y})
    binedges = xmin + ((xmax-xmin)/nbins) * np.arange(nbins+1)
    df['bin'] = np.digitize(df['x'],binedges)
    bincenters = xmin + ((xmax-xmin)/nbins)*np.arange(nbins) + ((xmax-xmin)/(2*nbins))
    ProfileFrame = DataFrame({'bincenters' : bincenters, 'N' : df['bin'].value_counts(sort=False)},index=range(1,nbins+1))
    bins = ProfileFrame.index.values
    for bin in bins:
        ProfileFrame.ix[bin,'ymean'] = df.ix[df['bin']==bin,'y'].mean()
        ProfileFrame.ix[bin,'yStandDev'] = df.ix[df['bin']==bin,'y'].std()
        ProfileFrame.ix[bin,'yMeanError'] = ProfileFrame.ix[bin,'yStandDev'] / np.sqrt(ProfileFrame.ix[bin,'N'])
    fig = plt.figure();
    ax = ProfilePlot.add_subplot(1, 1, 1)
    plt.errorbar(ProfileFrame['bincenters'], ProfileFrame['ymean'], yerr=ProfileFrame['yMeanError'], xerr=(xmax-xmin)/(2*nbins), fmt=None)
    return ax
    #or should I "return fig"
def Profile_Matrix(frame):
    import pandas.core.common as com
    import pandas.tools.plotting as plots
    from pandas.compat import lrange
    from matplotlib.artist import setp
    range_padding=0.05
    df = frame._get_numeric_data()
    n = df.columns.size
    fig, axes = plots._subplots(nrows=n, ncols=n, squeeze=False)
    # no gaps between subplots
    fig.subplots_adjust(wspace=0, hspace=0)
    mask = com.notnull(df)
    boundaries_list = []
    for a in df.columns:
        values = df[a].values[mask[a].values]
        rmin_, rmax_ = np.min(values), np.max(values)
        rdelta_ext = (rmax_ - rmin_) * range_padding / 2.
        boundaries_list.append((rmin_ - rdelta_ext, rmax_+ rdelta_ext))
    for i, a in zip(lrange(n), df.columns):
        for j, b in zip(lrange(n), df.columns):
            ax = axes[i, j]
            common = (mask[a] & mask[b]).values
            nbins = 100
            (xmin,xmax) = boundaries_list[i]
            ax=Profile(df[b][common],df[a][common],nbins,xmin,xmax)
            #Profile(df[b][common].values,df[a][common].values,nbins,xmin,xmax)
            ax.set_xlabel('')
            ax.set_ylabel('')
            plots._label_axis(ax, kind='x', label=b, position='bottom', rotate=True)
            plots._label_axis(ax, kind='y', label=a, position='left')
            if j!= 0:
                ax.yaxis.set_visible(False)
            if i != n-1:
                ax.xaxis.set_visible(False)
    for ax in axes.flat:
        setp(ax.get_xticklabels(), fontsize=8)
        setp(ax.get_yticklabels(), fontsize=8)
    return axes
This will run with something like:
import pandas as pd
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
import ProfileModule as pm
x = np.random.uniform(0, 100, size=1000)
y = x *x + 50*x*np.random.randn(1000)
z = x *y + 50*y*np.random.randn(1000)
nbins = 25
xmax = 100
xmin = 0
ProfilePlot = pm.Profile(x,y,nbins,xmin,xmax)
plt.title("Look this works!")
#This does not work as expected
frame = DataFrame({'z' : z,'x' : x , 'y' : y})
ProfileMatrix = pm.Profile_Matrix(frame)
plt.show()
This would hopefully produce a simple profile plot and a 3x3 profile matrix but it does not. I have tried various different methods to get this to work but I imagine it is not worth explaining them all.
I should mention I am using Enthought Canopy Express on Windows 7. Sorry for the long post and thanks again for any help with the code. This is my first week using Python.

You should pass around Axes objects and break your functions up so that each one operates on a single Axes at a time. You are close, but just change your functions to follow this pattern:
import numpy as np
import matplotlib.pyplot as plt
def _profile(ax, x, y):
    ln, = ax.plot(x, y)
    # return the Artist created
    return ln
def profile_matrix(n, m):
    fig, ax_array = plt.subplots(n, m, sharex=True, sharey=True)
    for ax in np.ravel(ax_array):
        _profile(ax, np.arange(50), np.random.rand(50))
profile_matrix(3, 3)
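Applied to the profile plot from the question, the same pattern might look like the sketch below. It is not the asker's module: profile and profile_matrix are hypothetical helpers, the binning is done with plain NumPy instead of pandas, and empty or single-entry bins are simply skipped.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt


def profile(ax, x, y, nbins, xmin, xmax):
    """Draw the binned mean of y versus x on the given Axes (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(xmin, xmax, nbins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    half_width = (xmax - xmin) / (2.0 * nbins)
    # np.digitize gives 1..nbins for values in [xmin, xmax); others fall outside
    which = np.digitize(x, edges)
    means = np.full(nbins, np.nan)
    errors = np.full(nbins, np.nan)
    for b in range(1, nbins + 1):
        in_bin = y[which == b]
        if in_bin.size > 1:
            means[b - 1] = in_bin.mean()
            # standard error of the mean, using the sample standard deviation
            errors[b - 1] = in_bin.std(ddof=1) / np.sqrt(in_bin.size)
    ok = ~np.isnan(means)  # skip bins with fewer than two entries
    # Return the Artist container so the caller can restyle it later.
    return ax.errorbar(centers[ok], means[ok], yerr=errors[ok],
                       xerr=half_width, fmt="none")


def profile_matrix(columns, nbins=10):
    """Create an n x n grid and draw one profile per pair of columns."""
    names = list(columns)
    n = len(names)
    fig, axes = plt.subplots(n, n, squeeze=False)
    for i, row_name in enumerate(names):
        for j, col_name in enumerate(names):
            xs = np.asarray(columns[col_name], dtype=float)
            profile(axes[i, j], xs, columns[row_name], nbins, xs.min(), xs.max())
    return fig, axes


# Example data modeled on the question (seeded for repeatability).
rng = np.random.RandomState(0)
x = rng.uniform(0, 100, size=500)
data = {"x": x, "y": x * x + 50 * x * rng.randn(500)}
fig, axes = profile_matrix(data, nbins=5)
```

The drawing function never creates a figure, so the same helper works for a single plot (pass the Axes from plt.subplots()) and for the matrix. Returning both fig and axes from profile_matrix leaves the caller free to edit individual subplots afterwards.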
