The square attribute in sns.heatmap works in a weird manner. When I plot a heatmap using random numbers and use the square attribute, it works fine.
When I plot the heatmap with my matrix, it creates the heatmap properly.
However, when I use the square attribute, the plot becomes a tiny square.
I can't figure out what is going wrong over here.
Well, square=True means: "show all cells as squares". The only way to fit 7x560 squares into the plot region is to reduce the height by a factor of about 80. In other words, square=False is strongly recommended for data with such a large difference between its horizontal and vertical dimensions. Seaborn isn't doing anything wrong here; it just gives you what you asked for.
If you want the heatmap to be square (instead of the cells), you can use ax = sns.heatmap(data, square=False) and then ax.set_aspect(data.shape[1] / data.shape[0]).
Here is an example:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
data = np.random.randn(7, 560).cumsum(axis=1).cumsum(axis=0)
data -= data.min(axis=1, keepdims=True)
data /= data.max(axis=1, keepdims=True)
ax = sns.heatmap(data, cmap='turbo', cbar=True, xticklabels=50,
yticklabels=['Grumpy', 'Dopey', 'Doc', 'Happy', 'Bashful', 'Sneezy', 'Sleepy'])
ax.set_aspect(data.shape[1] / data.shape[0])
ax.tick_params(labelrotation=0)
plt.tight_layout()
plt.show()
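The aspect arithmetic can also be checked without seaborn. This is only a minimal sketch under two assumptions: plain Matplotlib's pcolormesh stands in for sns.heatmap (just to keep the example dependency-free), and the non-interactive Agg backend is used. The helper name square_axes_aspect is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import numpy as np


def square_axes_aspect(n_rows, n_cols):
    """Aspect (height of one y unit / width of one x unit) that makes
    an n_rows x n_cols grid fill a square axes box. Hypothetical helper."""
    return n_cols / n_rows


data = np.random.rand(7, 560)
aspect = square_axes_aspect(*data.shape)  # 560 / 7

fig, ax = plt.subplots()
ax.pcolormesh(data)       # stand-in for sns.heatmap(data, square=False)
ax.set_aspect(aspect)     # every cell is drawn `aspect` times taller than wide
fig.canvas.draw()         # the aspect is applied when the figure is drawn

box = ax.get_window_extent()  # axes box in display pixels
print(aspect, round(box.width), round(box.height))
```

After drawing, the width and height of the axes box should agree, which is the "square heatmap, non-square cells" result described above.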
How would I make a plot of this style in python with matplotlib? (Cumulative probability plot) I don't need complete code, mostly just need a place to start and a general idea of what I need to do for it.
A cumulative probability plot is really easy to make:
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(1000)
fig,ax = plt.subplots()
ax.plot(np.sort(data),np.linspace(0.0,1.0,len(data)))
plt.xlabel(r'$x$')
plt.ylabel(r'$P(X \leq x)$')
plt.show()
Note that it can have a strong advantage over a probability density plot as it does not require binning of your data. (Should you be looking for the latter you can check this code).
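Why no binning is needed becomes clear when the empirical distribution function is written out by hand. This is a minimal sketch rather than the exact code above: the helper name ecdf is hypothetical, and it uses the i/n convention for the heights instead of the np.linspace(0, 1, n) spacing, which differs slightly for small samples.

```python
import numpy as np


def ecdf(samples):
    """Return the sorted values and the fraction of samples <= each value.

    Hypothetical helper; uses the i/n convention for the step heights.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


x, y = ecdf([3.0, 1.0, 2.0, 4.0])
print(x)  # sorted sample values
print(y)  # cumulative fractions, ending at 1.0
```

Passing x and y to ax.plot (or ax.step with where='post') gives the same kind of curve as the snippet above; every sample contributes its own step, so there is no bin width to choose.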
What I'm trying to achieve: a plot with two axhline horizontal lines, with the area between them shaded.
The best so far:
ax.axhline(y1, color=c)
ax.axhline(y2, color=c)
ax.fill_between(ax.get_xlim(), y1, y2, color=c, alpha=0.5)
The problem is that this leaves a small amount of blank space to the left and right of the shaded area.
I understand that this is likely due to the plot creating a margin around the used/data area of the plot. So, how do I get the fill_between to actually cover the entire plot without matplotlib rescaling the x-axis after drawing? Is there an alternative to get_xlim that would give me appropriate limits of the plot, or an alternative to fill_between?
This is the current result:
Note that this is part of a larger grid layout with several plots, but they all leave a similar margin around these shaded areas.
Not strictly speaking an answer to the question of getting the outer limits, but it does solve the problem. Instead of using fill_between, I should have used:
ax.axhspan(y1, y2, facecolor=c, alpha=0.5)
ax.get_xlim() does return the limits of the axis, not those of the data:
Axes.get_xlim()
Returns the current x-axis limits as the tuple (left, right).
But Matplotlib simply rescales the x-axis after drawing the fill_between:
import matplotlib.pylab as pl
import numpy as np
pl.figure()
ax=pl.subplot(111)
pl.plot(np.random.random(10))
print(ax.get_xlim())
pl.fill_between(ax.get_xlim(), 0.5, 1)
print(ax.get_xlim())
This results in:
(-0.45000000000000001, 9.4499999999999993)
(-0.94499999999999995, 9.9449999999999985)
If you don't want to manually set the x-limits, you could use something like:
import matplotlib.pylab as pl
import numpy as np
pl.figure()
ax=pl.subplot(111)
pl.plot(np.random.random(10))
xlim = ax.get_xlim()
pl.fill_between(xlim, 0.5, 1)
ax.set_xlim(xlim)
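For comparison, here is a minimal sketch of the axhspan route, assuming the Agg backend and made-up values for y1, y2 and c. By default axhspan spans from 0 to 1 in axes coordinates along x, so it should not take part in x autoscaling, and the x-limits can be compared before and after the call.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

y1, y2, c = 0.25, 0.75, "tab:blue"  # hypothetical band and colour

fig, ax = plt.subplots()
ax.plot(np.random.random(10))
xlim_before = ax.get_xlim()

# Two horizontal lines with the band between them shaded.
ax.axhline(y1, color=c)
ax.axhline(y2, color=c)
ax.axhspan(y1, y2, facecolor=c, alpha=0.5)

xlim_after = ax.get_xlim()
print(xlim_before, xlim_after)
```

If the two printed tuples match, the shaded band reaches both edges of the plot without any need to store and restore the limits.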
Using Matplotlib I'd like to remove the grid lines inside the plot, while keeping the frame (i.e. the axes lines). I've tried the code below and other options as well, but I can't get it to work. How do I simply keep the frame while removing the grid lines?
I'm doing this to reproduce a ggplot2 plot in matplotlib. I've created a MWE below. Be aware that you need a relatively new version of matplotlib to use the ggplot2 style.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import pylab as P
import numpy as np
if __name__ == '__main__':
    values = np.random.uniform(size=20)
    plt.style.use('ggplot')
    fig = plt.figure()
    _, ax1 = P.subplots()
    weights = np.ones_like(values)/len(values)
    plt.hist(values, bins=20, weights=weights)
    ax1.set_xlabel('Value')
    ax1.set_ylabel('Probability')
    ax1.grid(False)  # was grid(b=False); the b keyword is gone in recent Matplotlib
    #ax1.yaxis.grid(False)
    #ax1.xaxis.grid(False)
    ax1.set_facecolor('white')  # was set_axis_bgcolor, removed in later Matplotlib releases
    ax1.set_xlim([0,1])
    P.savefig('hist.pdf', bbox_inches='tight')
OK, I think this is what you are asking (but correct me if I misunderstood):
You need to change the colour of the spines. You need to do this for each spine individually, using the set_color method:
for spine in ['left','right','top','bottom']:
    ax1.spines[spine].set_color('k')
You can see this example and this example for more about using spines.
However, if you have removed the grey background and the grid lines, and added the spines, this is not really in the ggplot style any more; is that really the style you want to use?
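Put together, the "frame but no grid" version might look like the following. It is a minimal sketch, assuming the Agg backend and a Matplotlib version that has Axes.set_facecolor; it uses plt.style.context so that the ggplot style only applies to this one figure.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

values = np.random.uniform(size=20)
weights = np.ones_like(values) / len(values)

with plt.style.context("ggplot"):
    fig, ax1 = plt.subplots()
    ax1.hist(values, bins=20, weights=weights)
    ax1.grid(False)             # drop the grid lines
    ax1.set_facecolor("white")  # drop the grey background
    # Colour each spine individually so the frame becomes visible again.
    for spine in ["left", "right", "top", "bottom"]:
        ax1.spines[spine].set_color("k")

edge_colors = {name: ax1.spines[name].get_edgecolor()
               for name in ["left", "right", "top", "bottom"]}
print(edge_colors)
```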
EDIT
To make the edge of the histogram bars touch the frame, you need to either:
Change your binning, so the bin edges go to 0 and 1
n,bins,patches = plt.hist(values, bins=np.linspace(0,1,21), weights=weights)
# Check, by printing bins:
print(bins[0], bins[-1])
# 0.0 1.0
If you really want to keep the bins to go between values.min() and values.max(), you would need to change your plot limits to no longer be 0 and 1:
n,bins,patches = plt.hist(values, bins=20, weights=weights)
ax1.set_xlim(bins[0],bins[-1])
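Both options can be tried side by side. This is a minimal sketch, assuming the Agg backend; the second figure and the names ax2, n2 and bins2 are only there for the comparison and are not part of the question's code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

values = np.random.uniform(size=20)
weights = np.ones_like(values) / len(values)

# Option 1: fix the bin edges so they run exactly from 0 to 1.
fig1, ax1 = plt.subplots()
n, bins, patches = ax1.hist(values, bins=np.linspace(0, 1, 21), weights=weights)
ax1.set_xlim(0, 1)

# Option 2: keep the automatic bins and move the plot limits to them.
fig2, ax2 = plt.subplots()
n2, bins2, patches2 = ax2.hist(values, bins=20, weights=weights)
ax2.set_xlim(bins2[0], bins2[-1])

print(bins[0], bins[-1])
print(bins2[0], bins2[-1], values.min(), values.max())
```

In the first case the outer bars touch the frame at 0 and 1; in the second the frame moves to values.min() and values.max() instead.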
I am trying to create a color wheel in Python, preferably using Matplotlib. The following works OK:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
xval = np.arange(0, 2*np.pi, 0.01)
yval = np.ones_like(xval)
colormap = plt.get_cmap('hsv')
norm = mpl.colors.Normalize(0.0, 2*np.pi)
ax = plt.subplot(1, 1, 1, polar=True)
ax.scatter(xval, yval, c=xval, s=300, cmap=colormap, norm=norm, linewidths=0)
ax.set_yticks([])
However, this attempt has two serious drawbacks.
First, when saving the resulting figure as a vector (figure_1.svg), the color wheel consists (as expected) of 621 different shapes, corresponding to the different (x,y) values being plotted. Although the result looks like a circle, it isn't really. I would greatly prefer to use an actual circle, defined by a few path points and Bezier curves between them, as in e.g. matplotlib.patches.Circle. This seems to me the 'proper' way of doing it, and the result would look nicer (no banding, better gradient, better anti-aliasing).
Second (relatedly), the final plotted markers (the last few before 2*pi) overlap the first few. It's very hard to see in the pixel rendering, but if you zoom in on the vector-based rendering you can clearly see the last disc overlap the first few.
I tried using different markers (. or |), but none of them get around the second issue.
Bottom line: can I draw a circle in Python/Matplotlib which is defined in the proper vector/Bezier curve way, and which has an edge color defined according to a colormap (or, failing that, an arbitrary color gradient)?
One way I have found is to produce a colormap and then project it onto a polar axis. Here is a working example - it includes a nasty hack, though (clearly commented). I'm sure there's a way to either adjust limits or (harder) write your own Transform to get around it, but I haven't quite managed that yet. I thought the bounds on the call to Normalize would do that, but apparently not.
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm
import matplotlib as mpl
fig = plt.figure()
display_axes = fig.add_axes([0.1,0.1,0.8,0.8], projection='polar')
display_axes._direction = 2*np.pi ## This is a nasty hack - using the hidden field to
## multiply the values such that 1 become 2*pi
## this field is supposed to take values 1 or -1 only!!
norm = mpl.colors.Normalize(0.0, 2*np.pi)
# Plot the colorbar onto the polar axis
# note - use orientation horizontal so that the gradient goes around
# the wheel rather than centre out
quant_steps = 2056
cb = mpl.colorbar.ColorbarBase(display_axes, cmap=cm.get_cmap('hsv',quant_steps),
norm=norm,
orientation='horizontal')
# aesthetics - get rid of border and axis labels
cb.outline.set_visible(False)
display_axes.set_axis_off()
plt.show() # Replace with plt.savefig if you want to save a file
This produces
If you want a ring rather than a wheel, use this before plt.show() or plt.savefig
display_axes.set_rlim([-1,1])
This gives
As per @EelkeSpaak in the comments: if you save the graphic as an SVG as per the OP, here is a tip for working with the resulting graphic. The little elements of the resulting SVG image are touching and non-overlapping. This leads to faint grey lines in some renderers (Inkscape, Adobe Reader, probably not in print). A simple solution to this is to apply a small (e.g. 120%) scaling to each of the individual gradient elements, using e.g. Inkscape or Illustrator. Note you'll have to apply the transform to each element separately (the mentioned software provides functionality to do this automatically), rather than to the whole drawing, otherwise it has no effect.
I just needed to make a color wheel and decided to update rsnape's solution to be compatible with matplotlib 2.1. Rather than place a colorbar object on an axis, you can instead plot a polar colored mesh on a polar plot.
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm
import matplotlib as mpl
# If displaying in a Jupyter notebook:
# %matplotlib inline
# Generate a figure with a polar projection
fg = plt.figure(figsize=(8,8))
ax = fg.add_axes([0.1,0.1,0.8,0.8], projection='polar')
# Define colormap normalization for 0 to 2*pi
norm = mpl.colors.Normalize(0, 2*np.pi)
# Plot a color mesh on the polar plot
# with the color set by the angle
n = 200 #the number of secants for the mesh
t = np.linspace(0,2*np.pi,n) #theta values
r = np.linspace(.6,1,2) #radius values change 0.6 to 0 for full circle
rg, tg = np.meshgrid(r,t) #create a r,theta meshgrid
c = tg #define color values as theta value
im = ax.pcolormesh(t, r, c.T,norm=norm) #plot the colormesh on axis with colormap
ax.set_yticklabels([]) #turn off radial tick labels (yticks)
ax.tick_params(pad=15,labelsize=24) #cosmetic changes to tick labels
ax.spines['polar'].set_visible(False) #turn off the axis spine.
It gives this:
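Note that the snippet above does not pass a cmap, so the mesh is drawn with whatever the default image colormap is. If an hsv wheel like the one in the question is wanted, one possible variation is sketched below. It assumes the Agg backend, passes cmap='hsv' explicitly, and gives pcolormesh the cell edges (n + 1 angles, 2 radii) with one colour value per cell; the names theta_edges, r_edges and theta_mid are made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

n = 200                                          # number of angular cells
theta_edges = np.linspace(0, 2 * np.pi, n + 1)   # cell edges in theta
r_edges = np.array([0.6, 1.0])                   # use 0 instead of 0.6 for a full disc
theta_mid = 0.5 * (theta_edges[:-1] + theta_edges[1:])
colors = theta_mid[np.newaxis, :]                # one value per cell, shape (1, n)

fig = plt.figure(figsize=(8, 8))
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8], projection="polar")
im = ax.pcolormesh(theta_edges, r_edges, colors,
                   cmap="hsv", vmin=0, vmax=2 * np.pi, shading="flat")
ax.set_yticklabels([])                 # no radial tick labels
ax.spines["polar"].set_visible(False)  # no outer spine
fig.savefig("color_wheel.png")         # hypothetical output file name
```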
In the following code snippet:
import numpy as np
import pandas as pd
import pandas.rpy.common as com
import matplotlib.pyplot as plt
mtcars = com.load_data("mtcars")
df = mtcars.groupby(["cyl"]).apply(lambda x: pd.Series([x["cyl"].count(), np.mean(x["wt"])], index=["n", "wt"])).reset_index()
plt.plot(df["n"], range(len(df["cyl"])), "o")
plt.yticks(range(len(df["cyl"])), df["cyl"])
plt.show()
This code outputs the dot plot, but the result looks quite awful: neither the xticks nor the yticks have enough space, so it is quite difficult to notice the points for the cyl values 4 and 8 in the graph.
So how can I plot it with enough space in advance, much like you can do it without any hassles in R/ggplot2?
For your information, neither this code nor this code works in my case. Does anyone know the reason? And do I have to bother creating such subplots in the first place? Is it impossible to automatically adjust the ticks in response to the input values?
I can't quite tell what you're asking...
Are you asking why the ticks aren't automatically positioned or are you asking how to add "padding" around the inside edges of the plot?
If it's the former, it's because you've manually set the tick locations with yticks. This overrides the automatic tick locator.
If it's the latter, use ax.margins(some_percentage) (where some_percentage is between 0 and 1, e.g. 0.05 is 5%) to add "padding" to the data limits before they're autoscaled.
As an example of the latter, by default, the data limits can be autoscaled such that a point can lie on the boundaries of the plot. E.g.:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(range(10), 'ro')
plt.show()
If you want to avoid this, use ax.margins (or equivalently, plt.margins) to specify a percentage of padding to be added to the data limits before autoscaling takes place.
E.g.
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(range(10), 'ro')
ax.margins(0.04) # 4% padding, similar to R.
plt.show()
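The effect of the padding can be checked numerically. This is a minimal sketch, assuming the Agg backend and Matplotlib 2.0 or newer with default settings (where a 5% margin is already applied unless you change it): with data running from 0 to 9, a margin of 0.04 should pad each side by 0.04 * 9 = 0.36.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(range(10), "ro")   # data spans 0..9 on both axes
ax.margins(0.04)           # 4% of the data range on every side

xmin, xmax = ax.get_xlim()
pad = 0.04 * (9 - 0)       # expected padding in data units
print(xmin, xmax, pad)
```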