How can I draw EMPTY circles surrounding the data points?


I would like to add open circles around each data point in the plot below, with the diameter proportional to the values of a 3rd variable. This is what I have tried so far, but the circles are filled and cover the data points. Using "facecolors='none'" did not help.
z = df.z  # this is the 3rd variable
s = [10*2**n for n in range(len(z))]
ax1 = sns.scatterplot(x='LEF', y='NPQ', hue="Regime", markers=["o", "^"],
                      s=s, facecolors='none', data=df, ax=ax1)

The following approach loops through the generated dots, and sets their edgecolors to their facecolors. Then the facecolors are set to fully transparent.
import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset('tips')
ax = sns.scatterplot(data=tips, x="total_bill", y="tip", hue="day", size="size", sizes=(10, 200))
for dots in ax.collections:
    # remember the fill colors, reuse them for the outlines, then clear the fill
    facecolors = dots.get_facecolors()
    dots.set_edgecolors(facecolors.copy())
    dots.set_facecolors('none')
    dots.set_linewidth(2)
plt.show()
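
If seaborn is not a requirement, the same effect can be sketched with plain matplotlib, where facecolors="none" is passed directly to ax.scatter. The data and the scale factor of 40 below are made up for illustration and are not taken from the question. Note that s is the marker area in points squared, so a diameter proportional to z means s proportional to z squared.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Made-up data: x and y are the plotted points, z is the third variable.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 30)
y = rng.uniform(0, 10, 30)
z = rng.uniform(1, 5, 30)

fig, ax = plt.subplots()
# Small filled dots mark the data points themselves.
ax.scatter(x, y, s=10, color="k")
# `s` is the marker area in points**2, so diameter ~ z means s ~ z**2.
# The factor 40 is an arbitrary choice for this sketch.
sizes = 40 * z**2
rings = ax.scatter(x, y, s=sizes, facecolors="none",
                   edgecolors="C0", linewidths=1.5)
fig.canvas.draw()  # render once; use fig.savefig(...) or plt.show() to view
```

Drawing the rings as a second scatter call keeps the original markers untouched, which may be simpler than restyling the collections afterwards.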

Related

Removing legend from mpl parallel coordinates plot?

I have a parallel coordinates plot with lots of data points so I'm trying to use a continuous colour bar to represent that, which I think I have worked out. However, I haven't been able to remove the default key that is put in when creating the plot, which is very long and hinders readability. Is there a way to remove this table to make the graph much easier to read?
This is the code I'm currently using to generate the parallel coordinates plot:
parallel_coordinates(data[[' male_le', ' female_le', 'diet', 'activity', 'obese_perc', 'median_income']],
                     'median_income', colormap='rainbow', alpha=0.5)
fig, ax = plt.subplots(figsize=(6, 1))
fig.subplots_adjust(bottom=0.5)
cmap = mpl.cm.rainbow
bounds = [0.00,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]
norm = mpl.colors.BoundaryNorm(bounds, cmap.N,)
plt.colorbar(mpl.cm.ScalarMappable(norm = norm, cmap=cmap),cax = ax, orientation = 'horizontal',
label = 'normalised median income', alpha = 0.5)
plt.show()
Current Output:
I want my legend to be represented as a color bar, like this:
Any help would be greatly appreciated. Thanks.

You can use ax.legend_.remove() to remove the legend.
The cax parameter of plt.colorbar indicates the subplot in which to put the colorbar. If you leave it out, matplotlib creates a new subplot, "stealing" space from the current subplot (subplots are often referred to as ax in matplotlib). So here, leaving out cax creates the desired colorbar (adding ax=ax isn't strictly necessary, as ax is the current subplot).
The code below uses seaborn's penguin dataset to create a standalone example.
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
import numpy as np
from pandas.plotting import parallel_coordinates
penguins = sns.load_dataset('penguins')
fig, ax = plt.subplots(figsize=(10, 4))
cmap = plt.get_cmap('rainbow')
bounds = np.arange(penguins['body_mass_g'].min(), penguins['body_mass_g'].max() + 200, 200)
norm = mpl.colors.BoundaryNorm(bounds, 256)
penguins = penguins.dropna(subset=['body_mass_g'])
parallel_coordinates(penguins[['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']],
                     'body_mass_g', colormap=cmap, alpha=0.5, ax=ax)
ax.legend_.remove()
plt.colorbar(mpl.cm.ScalarMappable(norm=norm, cmap=cmap),
             ax=ax, orientation='horizontal', label='body mass', alpha=0.5)
plt.show()
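
The legend removal can also be tried in isolation. The sketch below uses made-up line labels rather than the parallel-coordinates data. It goes through ax.get_legend(), the public accessor for the object stored in ax.legend_, which returns None when the axes has no legend, so the call can be guarded.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Made-up lines standing in for the many entries of a long legend.
for i in range(5):
    ax.plot([0, 1], [i, i + 1], label=f"line {i}")
ax.legend()

legend = ax.get_legend()
had_legend = legend is not None
if legend is not None:  # guard: some plots are created without a legend
    legend.remove()
```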

matplotlib: Why is my multi-colored line plot ignoring boundary values?

I have an anomaly threshold, annotated by an axhline in my plot. I wish to add markers and/or change the color of the line above this threshold. I have followed the following matplotlib tutorial:
https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/multicolored_line.html
I also used this question/answer here on SO:
How to plot multi-color line if x-axis is date time index of pandas
To produce this plot:
That looks pretty good, until you zoom in on a subset of the data:
Unfortunately, this solution doesn't seem to work for my purposes. I'm not sure if this is an error on my part or not, but clearly the lines are red below the threshold. An additional problem in my view is how clunky and long the code is:
import matplotlib.dates as mdates
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap, BoundaryNorm
fig, ax = plt.subplots(figsize=(15,4))
inxval = mdates.date2num(dates.to_pydatetime())
points = np.array([inxval, scores]).T.reshape(-1,1,2)
segments = np.concatenate([points[:-1],points[1:]], axis=1)#[-366:]
cmap = ListedColormap(['b', 'r'])
norm = BoundaryNorm([0, thresh, 40], cmap.N)
lc = LineCollection(segments, cmap=cmap, norm=norm)
lc.set_array(scores)
ax.add_collection(lc)
monthFmt = mdates.DateFormatter("%Y")
ax.xaxis.set_major_formatter(monthFmt)
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.autoscale_view()
# ax.axhline(y=thresh, linestyle='--', c='r')
plt.show()
The generation of dates, scores, and thresh isn't shown here, but they can be reproduced with random numbers to make this code run.
Question:
Why are the red lines in my chart sometimes falling below the threshold value? And is there a way to abbreviate the amount of code required for this purpose?

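One option would be to draw two lines with the same data, then use an invisible axhspan object to clip one of the lines to the region under the threshold:
import numpy as np
import matplotlib.pyplot as plt
f, ax = plt.subplots()
x = np.random.exponential(size=500)
line_over, = ax.plot(x, color="b")  # full line; stays visible above the threshold
line_under, = ax.plot(x, color="r")  # drawn on top, but clipped to the band below
poly = ax.axhspan(0, 1, color="none")  # invisible band from 0 up to the threshold (1 here)
line_under.set_clip_path(poly)
plt.show()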
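
As for why one color appears on the wrong side of the threshold in the LineCollection version: each segment gets a single color, looked up from one value of the array passed to set_array. A segment that starts above the threshold and ends below it is therefore drawn in one color along its whole length. If you prefer to stay with LineCollection, one possible workaround is to insert a point at every threshold crossing so that no segment straddles the threshold. The helper below is a hypothetical sketch in plain Python (split_at_threshold is not a matplotlib function) and uses linear interpolation between neighbouring points.

```python
def split_at_threshold(x, y, thresh):
    """Return new x, y lists with a point inserted at each crossing of thresh."""
    xs, ys = [float(x[0])], [float(y[0])]
    for x0, y0, x1, y1 in zip(x[:-1], y[:-1], x[1:], y[1:]):
        # A strict sign change means the segment crosses the threshold.
        if (y0 - thresh) * (y1 - thresh) < 0:
            t = (thresh - y0) / (y1 - y0)  # fraction of the way along the segment
            xs.append(x0 + t * (x1 - x0))
            ys.append(float(thresh))
        xs.append(float(x1))
        ys.append(float(y1))
    return xs, ys

xs, ys = split_at_threshold([0, 1, 2, 3], [0.0, 2.0, 0.0, 0.5], thresh=1.0)
# Classify each new segment by its midpoint; this is the per-segment value
# that could be handed to LineCollection.set_array.
above = [(a + b) / 2 > 1.0 for a, b in zip(ys[:-1], ys[1:])]
print(xs)     # [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
print(above)  # [False, True, True, False, False]
```

With the crossings inserted, every segment lies entirely on one side of the threshold, so a two-color map no longer bleeds across it.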

Scatter Plot Points overlapping axis

For some reason, when I use a zorder with my scatter plot, the edges of the points overlap the axis. I tried some of the solutions from the question "matplotlib axis tick labels covered by scatterplot (using spines)", but they didn't work for me. Is there a way to prevent this from happening?
I understand I could also add an ax.axvline() at my boundaries but that would be an annoying workaround for lots of plots.
xval = np.array([0,0,0,3,3,3,0,2,3,0])
yval = np.array([0,2,3,5,1,0,1,0,4,5])
zval = yval**2-4
fig = plt.figure(figsize=(6,6))
ax = plt.subplot(111)
ax.scatter(xval,yval,cmap=plt.cm.rainbow,c=zval,s=550,zorder=20)
ax.set_ylim(0,5)
ax.set_xlim(0,3)
#These don't work
ax.tick_params(labelcolor='k', zorder=100)
ax.tick_params(direction='out', length=4, color='k', zorder=100)
#This will work but I don't want to have to do this for the plot edges every time
ax.axvline(0,c='k',zorder=100)
plt.show()

For me the solution you linked to works; that is, setting the z-order of the scatter plot to a negative number. E.g.
xval = np.array([0,0,0,3,3,3,0,2,3,0])
yval = np.array([0,2,3,5,1,0,1,0,4,5])
zval = yval**2-4
fig = plt.figure(figsize=(6,6))
ax = plt.subplot(111)
ax.scatter(xval,yval,cmap=plt.cm.rainbow,c=zval,s=550,zorder=-1)
ax.set_ylim(0,5)
ax.set_xlim(0,3)
plt.show()

You can fix the overlap using the following code with a large number for the zorder. This will work on both the x- and y-axis.
for k, spine in ax.spines.items():
    spine.set_zorder(1000)

This works for me
import numpy as np
import matplotlib.pyplot as plt
xval = np.array([0,0,0,3,3,3,0,2,3,0])
yval = np.array([0,2,3,5,1,0,1,0,4,5])
zval = yval**2-4
fig = plt.figure(figsize=(6,6))
ax = plt.subplot(111)
ax.scatter(xval,yval,cmap=plt.cm.rainbow,c=zval,s=550,zorder=20)
ax.set_ylim(-1,6)
ax.set_xlim(-1,4)
#These don't work
ax.tick_params(labelcolor='k', zorder=100)
ax.tick_params(direction='out', length=4, color='k', zorder=100)
#This will work but I don't want to have to do this for the plot edges every time
ax.axvline(0,c='k',zorder=100)
plt.show()
Your circles are large enough to extend beyond the axes limits, so we simply widen ylim and xlim.
Changed
ax.set_ylim(0,5)
ax.set_xlim(0,3)
to
ax.set_ylim(-1,6)
ax.set_xlim(-1,4)
Also, zorder doesn't play a role in pushing the points to edges.

How to set center color in heatmap

I want to plot a heatmap in seaborn. My code is as follows:
plt.rcParams['font.size'] = 13
plt.rcParams['font.weight'] = 'bold'
my_dpi=96
fig, ax = plt.subplots(figsize=(800/my_dpi, 600/my_dpi), dpi=my_dpi, facecolor='black')
rdgn = sns.diverging_palette(h_neg=130, h_pos=10, s=99, l=55, sep=3)
sns.heatmap(df, cmap=rdgn, center=0.00, annot=True, fmt ='.2%', linewidths=1.3, linecolor='black', cbar=False, ax=ax)
plt.savefig('./image/image.png', dpi=96, facecolor='black')
And the result is as follows:
I want 0 to be white, values > 0 to be red, and values < 0 to be green. But the center setting in heatmap has no effect.
By the way, how can I make the color scale asymmetric? The minimum value in my data is -0.34 and the maximum is 1.31. I want 0 to be white, -0.34 to be the greenest, and 1.31 to be the reddest.

center requires something that can be centered. So instead of a palette, which is a list of colors, you need a colormap. Seaborn provides the as_cmap parameter for this case:
sns.diverging_palette(..., as_cmap=True)
Alternatively, you can of course use any other matplotlib colormap, or specify your custom colormap.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
data = np.linspace(-0.34, 1.31, 100).reshape(10,10)
fig, ax = plt.subplots()
rdgn = sns.diverging_palette(h_neg=130, h_pos=10, s=99, l=55, sep=3, as_cmap=True)
sns.heatmap(data, cmap=rdgn, center=0.00, annot=True, fmt ='.0%',
linewidths=1.3, linecolor='black', cbar=True, ax=ax)
plt.show()
If, instead of centering the colormap, you want to shift its midpoint, you cannot use center. Use a matplotlib.colors.DivergingNorm instead (this class was renamed TwoSlopeNorm in matplotlib 3.2, so newer versions need that name).
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import DivergingNorm
import seaborn as sns
data = np.linspace(-0.34, 1.31, 100).reshape(10,10)
fig, ax = plt.subplots()
rdgn = sns.diverging_palette(h_neg=130, h_pos=10, s=99, l=55, sep=3, as_cmap=True)
divnorm = DivergingNorm(vmin=data.min(), vcenter=0, vmax=data.max())
sns.heatmap(data, cmap=rdgn, norm=divnorm, annot=True, fmt ='.0%',
linewidths=1.3, linecolor='black', cbar=True, ax=ax)
plt.show()
Here, the full colormap will be squeezed in the green part and stretched in the red part.
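
To see what such a norm does to the data, here is a hand-written sketch of the piecewise-linear mapping it applies. The function two_slope is a hypothetical helper written for illustration, not part of matplotlib; the real class works on arrays. The values -0.34, 0 and 1.31 are the ones from the question.

```python
def two_slope(value, vmin, vcenter, vmax):
    """Map value to [0, 1] so that vmin -> 0, vcenter -> 0.5, vmax -> 1."""
    if value <= vcenter:
        # Lower half: [vmin, vcenter] is stretched or squeezed onto [0, 0.5].
        return 0.5 * (value - vmin) / (vcenter - vmin)
    # Upper half: [vcenter, vmax] is mapped onto [0.5, 1].
    return 0.5 + 0.5 * (value - vcenter) / (vmax - vcenter)

for v in (-0.34, -0.17, 0.0, 0.655, 1.31):
    print(v, round(two_slope(v, vmin=-0.34, vcenter=0.0, vmax=1.31), 3))
```

Because the negative range is much shorter than the positive one, the same step in the data moves further through the colormap on the green side than on the red side.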

It looks like the vmin and vmax parameters of seaborn.heatmap might help you:
sns.heatmap(df, cmap=rdgn, annot=True, fmt ='.2%', linewidths=1.3,
linecolor='black', cbar=False, ax=ax,
vmin=-0.34, vmax=1.31)
However there doesn't seem to be a way to also set the center to 0 for non-divergent color maps, so if that is a required feature then you can't use seaborn.heatmap. The best you could do would be to set vmin = -vmax which would at least make the center white.
It looks like you might have diverging data (no hard limit), in which case you could look at using one of the divergent color maps (in which case you need to use center=0 and not vmin/vmax).

Matplotlib scatter plot - Remove white padding

I'm working with matplotlib to plot a variable in latitude/longitude coordinates. The problem is that this image cannot include axes or borders. I have been able to remove the axes, but the white padding around my image has to be completely removed (see example images from the code below here: http://imgur.com/a/W0vy9).
I have tried several methods from Google searches, including these StackOverflow methodologies:
Remove padding from matplotlib plotting
How to remove padding/border in a matplotlib subplot (SOLVED)
Matplotlib plots: removing axis, legends and white spaces
but nothing has worked in removing the white space. If you have any advice (even if it is to ditch matplotlib and to try another plotting library instead) I would appreciate it!
Here is a basic form of the code I'm using that shows this behavior:
import numpy as np
import matplotlib
from mpl_toolkits.basemap import Basemap
from scipy import stats
lat = np.random.randint(-60.5, high=60.5, size=257087)
lon = np.random.randint(-179.95, high=180, size=257087)
maxnsz = np.random.randint(12, 60, size=257087)
percRange = np.arange(100,40,-1)
percStr=percRange.astype(str)
val_percentile=np.percentile(maxnsz, percRange, interpolation='nearest')
#Rank all values
all_percentiles=stats.rankdata(maxnsz)/len(maxnsz)
#Figure setup
fig = matplotlib.pyplot.figure(frameon=False, dpi=600)
#Basemap code can go here
x=lon
y=lat
cmap = matplotlib.cm.get_cmap('cool')
h=np.where(all_percentiles >= 0.999)
hl=np.where((all_percentiles < 0.999) & (all_percentiles > 0.90))
mh=np.where((all_percentiles > 0.75) & (all_percentiles < 0.90))
ml=np.where((all_percentiles >= 0.4) & (all_percentiles < 0.75))
l=np.where(all_percentiles < 0.4)
all_percentiles[h]=0
all_percentiles[hl]=0.25
all_percentiles[mh]=0.5
all_percentiles[ml]=0.75
all_percentiles[l]=1
rgba_low=cmap(1)
rgba_ml=cmap(0.75)
rgba_mh=cmap(0.51)
rgba_hl=cmap(0.25)
rgba_high=cmap(0)
matplotlib.pyplot.axis('off')
matplotlib.pyplot.scatter(x[ml],y[ml], c=rgba_ml, s=3, marker=',',edgecolor='none', alpha=0.4)
matplotlib.pyplot.scatter(x[mh],y[mh], c=rgba_mh, s=3, marker='o', edgecolor='none', alpha=0.5)
matplotlib.pyplot.scatter(x[hl],y[hl], c=rgba_hl, s=4, marker='*',edgecolor='none', alpha=0.6)
matplotlib.pyplot.scatter(x[h],y[h], c=rgba_high, s=5, marker='^', edgecolor='none',alpha=0.75)
fig.savefig('/home/usr/code/python/testfig.jpg', bbox_inches=0, nbins=0, transparent="True", pad_inches=0.0)
fig.canvas.draw()

The problem is that all the solutions given at Matplotlib plots: removing axis, legends and white spaces are actually meant to work with imshow.
So, the following clearly works
import matplotlib.pyplot as plt
fig = plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.set_axis_off()
im = ax.imshow([[2,3,4,1], [2,4,4,2]], origin="lower", extent=[1,4,2,8])
ax.plot([1,2,3,4], [2,3,4,8], lw=5)
ax.set_aspect('auto')
plt.show()
and produces
But here, you are using scatter. Adding a scatter plot
import matplotlib.pyplot as plt
fig = plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.set_axis_off()
im = ax.imshow([[2,3,4,1], [2,4,4,2]], origin="lower", extent=[1,4,2,8])
ax.plot([1,2,3,4], [2,3,4,8], lw=5)
ax.scatter([2,3,4,1], [2,3,4,8], c="r", s=2500)
ax.set_aspect('auto')
plt.show()
produces
Scatter has the particularity that matplotlib tries to make all points visible by default, which means that the axes limits are set such that all scatter points are visible as a whole.
To overcome this, we need to specifically set the axes limits:
import matplotlib.pyplot as plt
fig = plt.figure()
ax=fig.add_axes([0,0,1,1])
ax.set_axis_off()
im = ax.imshow([[2,3,4,1], [2,4,4,2]], origin="lower", extent=[1,4,2,8])
ax.plot([1,2,3,4], [2,3,4,8], lw=5)
ax.scatter([2,3,4,1], [2,3,4,8], c="r", s=2500)
ax.set_xlim([1,4])
ax.set_ylim([2,8])
ax.set_aspect('auto')
plt.show()
such that we will get the desired behaviour.
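
Applied to a scatter-only figure like the one in the question, the same idea might look like the sketch below. The coordinates are made-up random values, and the image is written to an in-memory buffer rather than to the path used in the question.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Made-up coordinates standing in for the question's lon/lat arrays.
rng = np.random.default_rng(0)
lon = rng.uniform(-180, 180, 1000)
lat = rng.uniform(-60, 60, 1000)

fig = plt.figure(frameon=False)
ax = fig.add_axes([0, 0, 1, 1])  # axes fills the whole figure, no margins
ax.set_axis_off()
ax.scatter(lon, lat, s=3, edgecolor="none")
# Pin the limits to the data extent so no room is reserved around the markers.
ax.set_xlim(lon.min(), lon.max())
ax.set_ylim(lat.min(), lat.max())

buf = io.BytesIO()
fig.savefig(buf, format="png")
```

Markers centered exactly on the border are cut in half with these limits; whether that is acceptable depends on the use case.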
