How to set markersize in plt.imshow()

I have the following problem:
I want to plot an adjacency matrix using a colormap. I would like to adjust the marker size, because the matrix is very large
and the individual dots are hard to see in the picture. How can I do this? With spy(), it works like this:
plt.spy(adj, markersize = 1)
I want to have something like this:
plt.imshow(adj, cmap = colormap, markersize= 1)
This, however, doesn't work.
Thanks

You may use a scatter plot, which lets you set the marker size using the s argument.
ax.scatter(X,Y,c=z, s=36, marker="s")
An example comparing a spy, imshow and scatter plot.
import matplotlib.pyplot as plt
import numpy as np
fig, (ax1,ax2,ax3) = plt.subplots(ncols=3, figsize=(8,4))
z = np.random.rand(20, 20)
X,Y = np.meshgrid(np.arange(z.shape[1]),np.arange(z.shape[0]))
z[5] = 0.
z[:, 12] = 0.
ax1.spy(z, markersize=5, precision=0.1, origin="lower")
ax2.imshow(z, origin="lower")
ax3.scatter(X,Y,c=z, s=36, marker="s")
ax3.set_aspect("equal")
ax3.margins(0)
ax1.set_title("spy")
ax2.set_title("imshow")
ax3.set_title("scatter")
plt.show()
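For a large, sparse adjacency matrix it may be enough to draw only the nonzero entries. The following is a minimal sketch under that assumption; the helper name scatter_nonzero and the random test matrix are illustrative and not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt

def scatter_nonzero(adj, ax, cmap="viridis", s=1):
    """Draw only the nonzero entries of adj as square markers of size s."""
    rows, cols = np.nonzero(adj)
    sc = ax.scatter(cols, rows, c=adj[rows, cols], s=s, marker="s", cmap=cmap)
    ax.set_aspect("equal")
    ax.invert_yaxis()  # put row 0 at the top, as spy() and imshow() do by default
    return sc

# Hypothetical sparse weighted matrix: about 2% of the entries are nonzero
rng = np.random.default_rng(0)
adj = rng.random((200, 200)) * (rng.random((200, 200)) < 0.02)

fig, ax = plt.subplots()
sc = scatter_nonzero(adj, ax, s=4)
fig.colorbar(sc, ax=ax)
```

Calling plt.show() afterwards displays the figure. The s argument is given in points squared, so s=4 corresponds to a marker roughly 2 points wide.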

Related

Set size of subplot to other subplot with equal aspect ratio

I would like to create a figure consisting of a scatter plot and two histograms, one to the right of and one below the scatter plot.
I have the following requirements:
1.) In the scatter plot, the aspect ratio is equal so that the circle does not look like an ellipse.
2.) In the figure, the histogram subplots should be exactly as wide or as high as the axes of the scatter plot.
This works to a limited extent. However, I can't make the lower histogram as wide as the x axis of the scatter plot. How do I do that?
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import random
#create some demo data
x = [random.uniform(-2.0, 2.0) for i in range(100)]
y = [random.uniform(-2.0, 2.0) for i in range(100)]
#create figure
fig = plt.figure()
gs = gridspec.GridSpec(2, 2, width_ratios = [3, 1], height_ratios = [3, 1])
ax = plt.subplot(gs[0])
# Axis labels
plt.xlabel('pos error X [mm]')
plt.ylabel('pos error Y [mm]')
ax.grid(True)
ax.axhline(color="#000000")
ax.axvline(color="#000000")
ax.set_aspect('equal')
radius = 1.0
xc = radius*np.cos(np.linspace(0,np.pi*2))
yc = radius*np.sin(np.linspace(0,np.pi*2))
plt.plot(xc, yc, "k")
ax.scatter(x,y)
hist_x = plt.subplot(gs[1],sharey=ax)
hist_y = plt.subplot(gs[2],sharex=ax)
plt.tight_layout() #needed. without no xlabel visible
plt.show()
What I want is:
Many thanks for your help!

The easiest (but not necessarily most elegant) solution is to manually position the lower histogram after applying the tight layout:
ax_pos = ax.get_position()
hist_y_pos = hist_y.get_position()
hist_y.set_position((ax_pos.x0, hist_y_pos.y0, ax_pos.width, hist_y_pos.height))
This output was produced by matplotlib version 3.4.3. For your example output, you're obviously using a different version, as I get a much wider lower histogram than you.
(I retained the histogram names as in your example although I guess the lower one should be hist_x instead of hist_y).
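Put together with a stripped-down version of the question's figure, the repositioning might look like the sketch below. The names hist_right and hist_below are substitutes for the question's hist_x and hist_y, the data is random, and the explicit apply_aspect() call is an assumption made so that get_position() reports the box actually used by the equal-aspect axes regardless of Matplotlib version.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec

rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, 100)
y = rng.uniform(-2.0, 2.0, 100)

fig = plt.figure()
gs = gridspec.GridSpec(2, 2, width_ratios=[3, 1], height_ratios=[3, 1])
ax = fig.add_subplot(gs[0])
ax.set_aspect("equal")
ax.scatter(x, y)
hist_right = fig.add_subplot(gs[1], sharey=ax)  # histogram of y, right of the scatter
hist_below = fig.add_subplot(gs[2], sharex=ax)  # histogram of x, below the scatter
hist_right.hist(y, orientation="horizontal")
hist_below.hist(x)
fig.tight_layout()

# Assumption: force the aspect adjustment now so the position is up to date
ax.apply_aspect()
ax_pos = ax.get_position()
below_pos = hist_below.get_position()
# Keep the lower histogram's vertical placement, copy the scatter's horizontal extent
hist_below.set_position((ax_pos.x0, below_pos.y0, ax_pos.width, below_pos.height))
```

The positions are in figure coordinates, so the alignment holds only as long as the figure size does not change afterwards.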

Change colour of colorbar in Python Matplotlib

I have code that gives me a scatter plot of predicted vs actual values as a function of concentration. The data is read from a CSV file exported from Excel.
This is the code:
import matplotlib.pyplot as plt
from numpy import loadtxt
dataset = loadtxt("ColorPlot.csv", delimiter=',')
x = dataset[:,0]
y = dataset[:,1]
z = dataset[:,2]
scaled_z = (z - z.min()) / z.ptp()
colors = plt.cm.viridis(scaled_z)
sc=plt.scatter(x, y, c=colors)
plt.clim(0, 100)
plt.colorbar()
plt.xlabel("Actual")
plt.ylabel("Predicted")
plt.show()
And with this I get a nice graph:
However if I change the color to something like
colors = plt.cm.plasma(scaled_z)
I get the graph below but the colorbar remains unchanged.
I've tried lots of different things like cmap or edgecolors but I don't know how to change it. And I want to keep the code as simple as it currently is because I want to readily change the third variable of z based on my excel spreadsheet data.
Is there also a way for the scale of the colorbar to pick up what the scale is from the excel spreadsheet without me manually specifying 0-100?

To get the right color bar, use the following code:
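colormap = plt.get_cmap('plasma') # 'plasma' or 'viridis'; plt.cm.get_cmap was removed in Matplotlib 3.9
colors = colormap(scaled_z)
sc = plt.scatter(x, y, c=colors)
sm = plt.cm.ScalarMappable(cmap=colormap)
sm.set_clim(vmin=0, vmax=100)
plt.colorbar(sm, ax=plt.gca()) # newer Matplotlib releases need the target axes here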
plt.xlabel("Actual")
plt.ylabel("Predicted")
plt.show()
For my random generated data I got the following plot:
Now replace 'plasma' with 'viridis' and check the other variant.

You should not scale your data, unless you want the colorbar to be incorrect. Once you have the PathCollection from the scatter call, you can call set_cmap and set_clim on it and the colorbar should track the changes. (You could also explicitly associate the colorbar with the PathCollection to avoid ambiguity.)
import matplotlib.pyplot as plt
import numpy as np
x = np.random.randn(100)
y = np.random.randn(100)
z = np.random.randn(100)
sc=plt.scatter(x, y, c=z, cmap='viridis')
plt.clim(0, 100)
plt.colorbar(sc)
plt.xlabel("Actual")
plt.ylabel("Predicted")
sc.set_cmap('plasma')
sc.set_clim(-1, 1)
plt.show()

Your code returns an error for me: TypeError: You must first set_array for mappable ...
The following is the simplest syntax that works for me:
import matplotlib.pyplot as plt
import numpy as np
a = np.random.random(100)
b = np.random.random(100)
scaled_z = (a + b)/a
plt.figure()
plt.scatter(a, b, c = scaled_z, cmap = 'plasma') ## you can directly change the colormap here
plt.colorbar()
plt.tight_layout()
plt.show()
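On the second part of the question (having the colorbar pick up its scale from the data): when the raw values are passed as c and no limits are set, the color limits default to the minimum and maximum of the data. A small sketch follows; the random arrays are stand-ins for the three CSV columns and are not taken from the question's file.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
x = rng.uniform(0, 100, 50)        # stand-in for the "Actual" column
y = x + rng.normal(0, 5, 50)       # stand-in for the "Predicted" column
z = rng.uniform(12.0, 87.0, 50)    # stand-in for the third (concentration) column

fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=z, cmap="plasma")  # raw z values, no manual scaling
fig.colorbar(sc, ax=ax, label="z")
ax.set_xlabel("Actual")
ax.set_ylabel("Predicted")

vmin, vmax = sc.get_clim()  # limits taken from the data, not typed in by hand
```

Changing the colormap is then a matter of editing the cmap argument, and the colorbar follows because it is tied to the same PathCollection.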

Matplotlib: How can I show only exponents in the y tick labels of a semi-log plot with secondary_yaxis()?

I've been working on matplotlib's secondary-yaxis and I can't figure out how I should set "functions" parameter in order to get the result that I want.
I want to make a semi-log plot and set the y-tick labels in the following two formats:
the ordinary format, such as "10^1, 10^2, 10^3, ..., 10^(exponent), ..."
the exponents only: "1, 2, 3, ..."
I want the former on the left y-axis and the latter on the right.
What I want to do can be done by using twinx() like this:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(1, 3, 41)
y = 10**x
fig, ax1 = plt.subplots()
ax1.set_yscale('log')
ax1.plot(x, y)
ax2 = ax1.twinx()
ymin, ymax = ax1.get_ylim()
ax2.set_ylim(np.log10(ymin), np.log10(ymax))
plt.show()
You would see that i=(1, 2, 3) in the right label is located at the same height as 10^i in the left label.
However, I want to know how to do the same thing by secondary_yaxis. I've tried this but it didn't work.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(1, 3, 41)
y = 10**x
fig, ax = plt.subplots()
ax.set_yscale('log')
ax.plot(x, y)
def forward(x):
    return np.log10(x)
def backward(x):
    return 10**x
secax = ax.secondary_yaxis('right', functions=(forward, backward))
plt.show()
It resulted in this:
You can see right-side tick labels are broken. I suspect that my way of setting the parameter "functions" of secondary_yaxis() might be invalid. I would appreciate it if you tell me how to do it.

I get the broken figure with matplotlib 3.1.0; updating to 3.3.0 solved the problem. The same code as the second code block of the question then produces the expected tick labels.
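Independently of the version issue, the two functions passed to secondary_yaxis are expected to be inverses of each other, which can be checked before plotting. The sketch below repeats the question's setup with that check added; the variable roundtrip_ok and the axis label are additions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def forward(y):
    """Map data values to their base-10 exponent."""
    return np.log10(y)

def backward(e):
    """Inverse of forward: map an exponent back to the data value."""
    return 10.0 ** e

x = np.linspace(1, 3, 41)
y = 10 ** x

# Sanity check: applying forward and then backward should return the input
roundtrip_ok = np.allclose(backward(forward(y)), y)

fig, ax = plt.subplots()
ax.set_yscale("log")
ax.plot(x, y)
secax = ax.secondary_yaxis("right", functions=(forward, backward))
secax.set_ylabel("exponent")
```

If roundtrip_ok were False, the right-hand ticks could not line up with the left-hand ones in any version.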

Change range of colors in plot(imshow)?

The values in my matrix called 'energy' are close to each other: e.g. one value can be 500, another 520. I want to see the color differences in my plot more clearly: the smallest value in my data should get a very dark color and the highest value a very bright color.
I have the following code:
fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(111)
plt.imshow(energy[0:60, 0:5920], cmap='Reds')
ax.axes.set_aspect(aspect=100)
plt.grid(color='yellow')
plt.title('My plot')
plt.xlabel('Length points')
plt.ylabel('Time points(seconds)')
import matplotlib.ticker as plticker
loc = plticker.MultipleLocator(base=500)
ax.xaxis.set_major_locator(loc)
plt.show()
I get the following plot:
In other words, I'd love to make this plot more colorful.
Thanks in advance.

You can set a custom range either through a custom colormap or adjusting the range value to show using the keywords vmin and vmax. For example:
from matplotlib.pyplot import subplots
import numpy as np
fig, ax = subplots()
h = ax.imshow(np.random.rand(10,10) * 10, vmin = 0,
              vmax = 2, cmap = 'Reds')
fig.colorbar(h)
fig.show()
This maps the full colormap onto the value range 0 to 2; values above 2 are shown in the darkest color.
Alternatively, you can rescale your data or adjust your colormap; see the matplotlib docs for more info.
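One way to choose vmin and vmax is to derive them from the data itself. The sketch below assumes that a few outliers are what compresses the color scale and that clipping at the 1st and 99th percentiles is acceptable for display; the energy array here is synthetic, not the asker's data.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
energy = 500 + 20 * rng.random((60, 400))  # hypothetical values between 500 and 520
energy[0, 0] = 5000                        # a single outlier that would flatten the colors

# Assumption: percentile clipping is acceptable; adjust the percentiles as needed
vmin, vmax = np.percentile(energy, [1, 99])

fig, ax = plt.subplots()
im = ax.imshow(energy, cmap="Reds", vmin=vmin, vmax=vmax, aspect="auto")
fig.colorbar(im, ax=ax)
```

With these limits the whole 'Reds' range is spent on the 500 to 520 band, and the outlier is simply drawn in the darkest color.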

Plot two histograms on single chart with matplotlib

I created a histogram plot using data from a file without any problem. Now I want to superpose data from another file in the same histogram, so I do something like this:
n,bins,patchs = ax.hist(mydata1,100)
n,bins,patchs = ax.hist(mydata2,100)
but the problem is that for each interval, only the bar with the highest value appears and the other is hidden. How can I plot both histograms at the same time with different colors?

Here you have a working example:
import random
import numpy
from matplotlib import pyplot
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]
bins = numpy.linspace(-10, 10, 100)
pyplot.hist(x, bins, alpha=0.5, label='x')
pyplot.hist(y, bins, alpha=0.5, label='y')
pyplot.legend(loc='upper right')
pyplot.show()

The accepted answer gives the code for a histogram with overlapping bars, but in case you want the bars to be side by side (as I did), try the variation below:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-v0_8-deep')  # this style was named 'seaborn-deep' before Matplotlib 3.6
x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
bins = np.linspace(-10, 10, 30)
plt.hist([x, y], bins, label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()
Reference: http://matplotlib.org/examples/statistics/histogram_demo_multihist.html
EDIT [2018/03/16]: Updated to allow plotting of arrays of different sizes, as suggested by @stochastic_zeitgeist

If you have different sample sizes, it may be difficult to compare the distributions with a single y-axis. For example:
import numpy as np
import matplotlib.pyplot as plt
#makes the data
y1 = np.random.normal(-2, 2, 1000)
y2 = np.random.normal(2, 2, 5000)
colors = ['b','g']
#plots the histogram
fig, ax1 = plt.subplots()
ax1.hist([y1,y2],color=colors)
ax1.set_xlim(-10,10)
ax1.set_ylabel("Count")
plt.tight_layout()
plt.show()
In this case, you can plot your two data sets on different axes. To do so, you can get your histogram data using matplotlib, clear the axis, and then re-plot it on two separate axes (shifting the bin edges so that they don't overlap):
#sets up the axis and gets histogram data
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
n, bins, patches = ax1.hist([y1, y2], color=colors) #one call is enough to get the counts and bin edges
ax1.cla() #clear the axis
#plots the histogram data
width = (bins[1] - bins[0]) * 0.4
bins_shifted = bins + width
ax1.bar(bins[:-1], n[0], width, align='edge', color=colors[0])
ax2.bar(bins_shifted[:-1], n[1], width, align='edge', color=colors[1])
#finishes the plot
ax1.set_ylabel("Count", color=colors[0])
ax2.set_ylabel("Count", color=colors[1])
ax1.tick_params('y', colors=colors[0])
ax2.tick_params('y', colors=colors[1])
plt.tight_layout()
plt.show()

As a completion to Gustavo Bezerra's answer:
If you want each histogram to be normalized (normed for mpl<=2.1 and density for mpl>=3.1) you cannot just use normed/density=True, you need to set the weights for each value instead:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
x_w = np.empty(x.shape)
x_w.fill(1/x.shape[0])
y_w = np.empty(y.shape)
y_w.fill(1/y.shape[0])
bins = np.linspace(-10, 10, 30)
plt.hist([x, y], bins, weights=[x_w, y_w], label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()
As a comparison, the exact same x and y vectors with default weights and density=True:
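The difference between the two normalizations can also be checked numerically. The sketch below uses np.histogram, which accepts the same weights and density arguments; the bins are chosen here to cover both samples completely, which is an assumption that differs from the fixed -10 to 10 range above.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(1, 2, 5000)
y = rng.normal(-1, 3, 2000)
# 30 bins spanning the full range of both samples
bins = np.linspace(min(x.min(), y.min()), max(x.max(), y.max()), 31)

# Weights of 1/N: bar heights are fractions of each sample and sum to 1
frac_x, _ = np.histogram(x, bins, weights=np.full(x.shape, 1 / x.size))
frac_y, _ = np.histogram(y, bins, weights=np.full(y.shape, 1 / y.size))

# density=True: bar areas (height times bin width) sum to 1 instead
dens_x, _ = np.histogram(x, bins, density=True)
area_x = (dens_x * np.diff(bins)).sum()
```

With weights the y-axis reads as "share of the sample per bin", which stays comparable between samples of different sizes.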

You should use bins from the values returned by hist:
import numpy as np
import matplotlib.pyplot as plt
foo = np.random.normal(loc=1, size=100) # a normal distribution
bar = np.random.normal(loc=-1, size=10000) # a normal distribution
# density=True replaces the normed=True argument, which was removed in Matplotlib 3.1
_, bins, _ = plt.hist(foo, bins=50, range=[-6, 6], density=True)
_ = plt.hist(bar, bins=bins, alpha=0.5, density=True)
plt.show()
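If no range is known in advance, the shared edges can also be computed from both samples before plotting. This is a sketch of that variation, assuming NumPy 1.15 or later for np.histogram_bin_edges; the variable names edges, bins_foo and bins_bar are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
foo = rng.normal(loc=1, size=100)
bar = rng.normal(loc=-1, size=10000)

# Common bin edges computed from the pooled data
edges = np.histogram_bin_edges(np.concatenate([foo, bar]), bins=50)

fig, ax = plt.subplots()
_, bins_foo, _ = ax.hist(foo, bins=edges, density=True)
_, bins_bar, _ = ax.hist(bar, bins=edges, density=True, alpha=0.5)
```

Both calls return the same edges, so the bars of the two histograms line up exactly.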

Here is a simple method to plot two histograms, with their bars side-by-side, on the same plot when the data has different sizes:
import matplotlib.pyplot as plt

def plotHistogram(p, o):
    """
    p and o are iterables with the values you want to
    plot the histogram of
    """
    plt.hist([p, o], color=['g','r'], alpha=0.8, bins=50)
    plt.show()

Plotting two overlapping histograms (or more) can lead to a rather cluttered plot. I find that using step histograms (aka hollow histograms) improves the readability quite a bit. The only downside is that in matplotlib the default legend for a step histogram is not properly formatted, so it can be edited like in the following example:
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.2
from matplotlib.lines import Line2D
rng = np.random.default_rng(seed=123)
# Create two normally distributed random variables of different sizes
# and with different shapes
data1 = rng.normal(loc=30, scale=10, size=500)
data2 = rng.normal(loc=50, scale=10, size=1000)
# Create figure with 'step' type of histogram to improve plot readability
fig, ax = plt.subplots(figsize=(9,5))
ax.hist([data1, data2], bins=15, histtype='step', linewidth=2,
alpha=0.7, label=['data1','data2'])
# Edit legend to get lines as legend keys instead of the default polygons
# and sort the legend entries in alphanumeric order
handles, labels = ax.get_legend_handles_labels()
leg_entries = {}
for h, label in zip(handles, labels):
    leg_entries[label] = Line2D([0], [0], color=h.get_facecolor()[:-1],
                                alpha=h.get_alpha(), lw=h.get_linewidth())
labels_sorted, lines = zip(*sorted(leg_entries.items()))
ax.legend(lines, labels_sorted, frameon=False)
# Remove spines
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
# Add annotations
plt.ylabel('Frequency', labelpad=15)
plt.title('Matplotlib step histogram', fontsize=14, pad=20)
plt.show()
As you can see, the result looks quite clean. This is especially useful when overlapping even more than two histograms. Depending on how the variables are distributed, this can work for up to around 5 overlapping distributions. More than that would require the use of another type of plot, such as one of those presented here.

It sounds like you might want just a bar graph:
http://matplotlib.sourceforge.net/examples/pylab_examples/bar_stacked.html
http://matplotlib.sourceforge.net/examples/pylab_examples/barchart_demo.html
Alternatively, you can use subplots.

There is one caveat when you want to plot the histogram from a 2-d numpy array. You need to swap the 2 axes.
import numpy as np
import matplotlib.pyplot as plt
data = np.random.normal(size=(2, 300))
# swapped_data.shape == (300, 2)
swapped_data = np.swapaxes(data, axis1=0, axis2=1)
plt.hist(swapped_data, bins=30, label=['x', 'y'])
plt.legend()
plt.show()

Another option, quite similar to joaquin's answer:
import random
from matplotlib import pyplot
#random data
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]
#plot both histograms(range from -10 to 10), bins set to 100
pyplot.hist([x,y], bins= 100, range=[-10,10], alpha=0.5, label=['x', 'y'])
#plot legend
pyplot.legend(loc='upper right')
#show it
pyplot.show()
Gives the following output:

Just in case you have pandas (import pandas as pd) or are ok with using it:
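import random
import pandas as pd
import matplotlib.pyplot as plt
# two rows of 400 samples each; the transpose gives one column per data set
test = pd.DataFrame([[random.gauss(3,1) for _ in range(400)],
                     [random.gauss(4,2) for _ in range(400)]])
plt.hist(test.values.T)
plt.show()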

This question has been answered before, but wanted to add another quick/easy workaround that might help other visitors to this question.
import seaborn as sns
sns.kdeplot(mydata1)
sns.kdeplot(mydata2)
Some helpful examples are here for kde vs histogram comparison.

Inspired by Solomon's answer, but to stick with the question, which is related to histogram, a clean solution is:
sns.distplot(bar)
sns.distplot(foo)
plt.show()
Make sure to plot the taller one first, otherwise you would need to set plt.ylim(0,0.45) so that the taller histogram is not chopped off.
