I'm trying to give pyplot.imshow() a list of values that aren't necessarily linear to use as y-axis labels. The truncated list is:
run_numbers = array([815676, 815766, 815767, 815768, 815769, 815770, 815771, 815772,
815773, 815774, 815775, 815776, 815777, 815778, 815779, 815780,
815781, 815783, 815784, 815785, 815786, 815789, 815790, 815792,
815793, 815794, 815795, 815796, 815797, 815798, 815799, 815800,
815801, 815802, 815803, 815804, 815805, 815806, 815807, 815808,
815809, 815811, 815812, 815813, 815814, 815815, 815816, 815817,
815818, 815819, 815820, 815821, 815822, 815823, 815824, 815825,
815826, 815827, 815829, 815830, 815831, 815832, 815833, 815834,
815835, 815836, 815837, 815838, 815839, 815841, 815842, 815843,
815844, 815845, 815846, 815847, 815848, 815849, 815851, 815852,
815853, 815854, 815855, 815856, 815857, 815858, 815859, 815860,
815861, 815863, 815864, 815865, 815866, 815867, 815869, 815870,
815871, 815872, 815873, 815874, 815875, 815876, 815877, 815878])
My image looks like this:
At first glance, this seems fine. But imshow isn't using the values from the list; instead, it's using a linear range from 815676 to the max value of the list. I've tried a few different things:
plt.imshow(np.array(profiles), aspect='auto', vmin=-5, vmax=20, extent=[0,500,max(run_numbers), min(run_numbers)])
The above code gave the image above, which makes sense given what I put in extent.
Is there a way to tell imshow to use the values in the list as the yaxis label? I've tried ax.yticklabels and ax.ytick, but those also give a linear progression of numbers instead of the list values.
Please let me know how I can clarify my question if there's any confusion. I can also provide an example data set if my question isn't clear.
Letting the extent run from 0 to the number of labels minus 1 and then calling plt.yticks(range(N), run_numbers) sets the labels. As there are about 100 labels, this looks very crowded, which can be mitigated by setting a large figure size and a small font. Alternatively, only every n-th row can be labelled, e.g. in steps of 10:
import numpy as np
import matplotlib.pyplot as plt
run_numbers = np.array([815676, 815766, 815767, 815768, 815769, 815770, 815771, 815772, 815773, 815774, 815775, 815776, 815777, 815778, 815779, 815780, 815781, 815783, 815784, 815785, 815786, 815789, 815790, 815792, 815793, 815794, 815795, 815796, 815797, 815798, 815799, 815800, 815801, 815802, 815803, 815804, 815805, 815806, 815807, 815808, 815809, 815811, 815812, 815813, 815814, 815815, 815816, 815817, 815818, 815819, 815820, 815821, 815822, 815823, 815824, 815825, 815826, 815827, 815829, 815830, 815831, 815832, 815833, 815834, 815835, 815836, 815837, 815838, 815839, 815841, 815842, 815843, 815844, 815845, 815846, 815847, 815848, 815849, 815851, 815852, 815853, 815854, 815855, 815856, 815857, 815858, 815859, 815860, 815861, 815863, 815864, 815865, 815866, 815867, 815869, 815870, 815871, 815872, 815873, 815874, 815875, 815876, 815877, 815878])
N = len(run_numbers)
profiles = np.random.randn(N, 501).cumsum(axis=1)
plt.imshow(profiles, aspect='auto', extent=[0, 500, N-1, 0])
plt.yticks(range(0, N, 10), run_numbers[::10])
plt.show()
With
plt.figure(figsize=(10, 16))
plt.imshow(profiles, aspect='auto', extent=[0, 500, N-1, 0])
plt.yticks(range(N), run_numbers, fontsize=8)
It could look like
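If choosing the tick positions by hand is inconvenient, a tick formatter can look up the run number for whichever integer row index matplotlib picks. The following is only a sketch: the helper name index_to_run and the shortened run_numbers array are illustrative assumptions, and the extent is shifted by half a row so that row centres fall on integer indices.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MaxNLocator

# Shortened, illustrative stand-in for the full run_numbers array
run_numbers = np.array([815676, 815766, 815767, 815768,
                        815770, 815773, 815774, 815780])
N = len(run_numbers)
profiles = np.random.randn(N, 501).cumsum(axis=1)


def index_to_run(value, pos=None):
    """Return the run number for an integer row index, '' for anything else."""
    idx = int(round(value))
    if 0 <= idx < N and abs(value - idx) < 1e-9:
        return str(run_numbers[idx])
    return ""


fig, ax = plt.subplots()
# Row i is centred on y = i because the extent spans -0.5 .. N - 0.5
ax.imshow(profiles, aspect="auto", extent=[0, 500, N - 0.5, -0.5])
# Ask for integer tick positions only, then translate them to run numbers
ax.yaxis.set_major_locator(MaxNLocator(integer=True))
ax.yaxis.set_major_formatter(FuncFormatter(index_to_run))
# plt.show()
```

With this approach the number of labels adapts to the figure height, so zooming or resizing should not require recomputing the tick list.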
Much like the title says, I am trying to create a graph that shows 1-6 on the x-axis (the value's position in the row of the array) and its value on the y-axis. A snippet from the array is shown below, with each column representing a coefficient number from 1-6.
[0.99105 0.96213 0.96864 0.96833 0.96698 0.97381]
[0.99957 0.99709 0.9957 0.9927 0.98492 0.98864]
[0.9967 0.98796 0.9887 0.98613 0.98592 0.99125]
[0.9982 0.99347 0.98943 0.96873 0.91424 0.83831]
[0.9985 0.99585 0.99209 0.98399 0.97253 0.97942]
It's already set up as a numpy array. I think it's relatively straightforward, just drawing a complete mental blank.
Any ideas?
Do you want something like this?
import numpy as np
import matplotlib.pyplot as plt
a = np.array([[0.99105, 0.96213, 0.96864, 0.96833, 0.96698, 0.97381],
[0.99957, 0.99709, 0.9957, 0.9927, 0.98492, 0.98864],
[0.9967, 0.98796, 0.9887, 0.98613, 0.98592, 0.99125],
[0.9982, 0.99347, 0.98943, 0.96873, 0.91424, 0.83831],
[0.9985, 0.99585, 0.99209, 0.98399, 0.97253, 0.97942]])
# x repeats 1..6 once per row, matching the row-major order of a
plt.scatter(x=np.tile(np.arange(a.shape[1]), a.shape[0])+1, y=a)
output:
Note that you can emulate the same with groups using:
plt.plot(a.T, marker='o', ls='')
x = np.arange(a.shape[1])  # one tick per column (coefficient)
plt.xticks(x, x+1)
output:
When creating 2d-images with pyplot by using pcolor/pcolormesh I am using custom labels in my project. Unfortunately, some of those labels are too long to fit into their allocated space, and therefore spill over the next label, as shown below for a 2x2-matrix:
and a 4x4-matrix:
I tried to find solutions for that issue, and until now have found two possible approaches: Replacing all spaces with newlines (as proposed by https://stackoverflow.com/a/62521738/2546099), or using textwrap.wrap(), which needs a fixed text width (as proposed by https://stackoverflow.com/a/15740730/2546099 or https://medium.com/dunder-data/automatically-wrap-graph-labels-in-matplotlib-and-seaborn-a48740bc9ce)
I tested both approaches for both test matrices shown above, resulting in (when replacing space with newlines) for the 2x2-matrix:
and for the 4x4-matrix:
Similarly, I tested the second approach on both matrices. For the 2x2-matrix, this results in
and the 4x4-matrix, resulting in
Still, both approaches share the same issues:
When wrapping the label, a part of the label will extend into the image. Setting a fixed pad size will only work if the label size is known beforehand, to adjust the distance properly. Else, some labels might be spaced too far apart from the image, while others are still inside the image.
Wrapping the label with a fixed size might leave too much white space around. The best case might be figure 5 and 6: I used a wrap at 10 characters for both figures, but while the text placement works out nicely (from my point of view) for four columns, there is enough space to increase the wrap limit to 20 characters for two columns. For even more columns 10 characters might even be too much.
Therefore, are there other solutions which might solve my problem dynamically, depending on the size of the figure and the available space?
The code I used for generating the pictures above is:
import textwrap
import numpy as np
import matplotlib.pyplot as plt
imshow_size = 4
x_vec, y_vec = (
np.linspace(0, imshow_size - 1, imshow_size),
np.linspace(0, imshow_size - 1, imshow_size),
)
labels_to_plot = np.linspace(0, imshow_size, imshow_size + 1)
categories = ["This is a very long long label serving its job as place holder"] * (
imshow_size
)
## Idea 1: Replace space with \n, similar to https://stackoverflow.com/a/62521738/2546099
# categories = [category.replace(" ", "\n") for category in categories]
## Idea 2: Wrap labels at certain text length, similar to https://stackoverflow.com/a/15740730/2546099 or
## https://medium.com/dunder-data/automatically-wrap-graph-labels-in-matplotlib-and-seaborn-a48740bc9ce
categories = [textwrap.fill(category, 10) for category in categories]
X, Y = np.meshgrid(x_vec, y_vec)
Z = np.random.rand(imshow_size, imshow_size)
fig, ax = plt.subplots()
ax.pcolormesh(X, Y, Z)
ax.set_xticks(labels_to_plot[:-1])
ax.set_xticklabels(categories, va="center")
ax.set_yticks(labels_to_plot[:-1])
ax.set_yticklabels(categories, rotation=90, va="center")
fig.tight_layout()
plt.savefig("with_wrap_at_10_with_four_labels.png", bbox_inches="tight")
# plt.show()
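One possible direction for a dynamic solution is to derive the wrap width from the axes size instead of hard-coding it. The sketch below is a heuristic, not an established recipe: the helper wrap_width_chars and the average character aspect ratio of 0.6 are assumptions made for illustration, and the estimate should be computed after the layout is final, since tight_layout changes the axes size.

```python
import textwrap
import numpy as np
import matplotlib.pyplot as plt


def wrap_width_chars(axis_pixels, n_labels, fontsize_pt, dpi, char_aspect=0.6):
    """Estimate how many characters fit into one label slot along an axis."""
    pixels_per_label = axis_pixels / n_labels
    # Assumed average glyph width in points, converted to pixels
    char_pixels = char_aspect * fontsize_pt * dpi / 72.0
    return max(1, int(pixels_per_label // char_pixels))


imshow_size = 4
fontsize = 10
categories = ["This is a very long long label serving its job as place holder"] * imshow_size
Z = np.random.rand(imshow_size, imshow_size)

fig, ax = plt.subplots()
ax.pcolormesh(Z)  # without X, Y the cells span 0..imshow_size, centres at i + 0.5

bbox = ax.get_window_extent()  # axes size in pixels
width = wrap_width_chars(bbox.width, imshow_size, fontsize, fig.dpi)
wrapped = [textwrap.fill(c, width) for c in categories]

centres = np.arange(imshow_size) + 0.5
ax.set_xticks(centres)
ax.set_xticklabels(wrapped, fontsize=fontsize)
# plt.show()
```

With fewer columns each slot is wider, so the same call yields a larger wrap width. Leaving the x tick labels at their default vertical alignment, rather than va="center", should also let additional lines grow away from the image instead of into it.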
I am trying to make a matplotlib plot using some image data I have in numpy format, and was wondering if someone would be able to advise me on the best way to approach displaying multiples of these images within the boundaries of one subplot?
For example, using the following code...
n_samples = 10
sample_imgs, min_index = visualise_n_way(n_samples)
print(min_index)
print(sample_imgs.shape)
print(sample_imgs[0].shape)
print(x_train_w)
print(x_train_h)
img_matrix = []
for index in range(1, len(sample_imgs)):
img_matrix.append(np.reshape(sample_imgs[index], (x_train_w, x_train_h)))
img_matrix = np.asarray(img_matrix)
img_matrix = np.vstack(img_matrix)
f, ax = plt.subplots(1, 3, figsize = (10, 12))
f.tight_layout()
ax[0].imshow(np.reshape(sample_imgs[0], (x_train_w, x_train_h)),vmin=0, vmax=1,cmap='Greys')
ax[0].set_title("Test Image")
ax[1].imshow(img_matrix ,vmin=0, vmax=1,cmap='Greys')
ax[1].set_title("Support Set")
ax[2].imshow(np.reshape(sample_imgs[min_index], (x_train_w, x_train_h)),vmin=0, vmax=1,cmap='Greys')
ax[2].set_title("Image most similar to Test Image in Support Set")
I get the following image and output
1
(11, 784)
(784,)
28
28
Matplotlib Output
What I would like to do, however, is to have the second subplot, the one displaying img_matrix, be the same size as the two either side of it, creating a grid of the images. Sort of like this sketch.
I am at a loss as to how to do this, however. I believe I may need to use something such as a gridspec, but I'm finding the documentation hard to follow for what I want to do.
Any help is greatly appreciated!
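One way to get a grid without touching the subplot layout is to tile the support images into a roughly square array before calling imshow, instead of stacking them vertically with np.vstack. This is a minimal sketch: make_montage is a hypothetical helper, and the random array stands in for the reshaped sample_imgs[1:] from the question.

```python
import numpy as np


def make_montage(images, pad_value=0.0):
    """Tile an (n, h, w) stack of images into one near-square 2D array."""
    images = np.asarray(images)
    n, h, w = images.shape
    cols = int(np.ceil(np.sqrt(n)))
    rows = int(np.ceil(n / cols))
    # Fill unused grid slots with pad_value so the reshape below works
    padded = np.full((rows * cols, h, w), pad_value, dtype=images.dtype)
    padded[:n] = images
    # (rows, cols, h, w) -> (rows, h, cols, w) -> (rows * h, cols * w)
    return padded.reshape(rows, cols, h, w).swapaxes(1, 2).reshape(rows * h, cols * w)


# Stand-in for the ten 28x28 support images
support = np.random.rand(10, 28, 28)
img_matrix = make_montage(support)
print(img_matrix.shape)
```

The result can be passed to the existing call, ax[1].imshow(img_matrix, vmin=0, vmax=1, cmap='Greys'). Ten images give a 3 by 4 grid with two padded slots, so the middle panel comes out close to, but not exactly, square.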
I am trying to plot a linear line with associated error.
I calculated values for slope (a) and intercepts (b). In addition, I calculated the error associated with these values. So I drew the line given by the typical formula below.
y=ax+b
However, in addition to the line, I also want to draw the associated error. I came up with the idea to draw the lines associated with these formulas and color the space between the lines gray.
y=(a+a_sd)x+(b+b_sd)
y=(a-a_sd)x+(b-b_sd)
Using the following piece of code, I am able to color part of the surface between the lines, but not the whole span (see included output).
I think this may be due to the fact that "distance" is not sorted, and fill_between is using distance[0] and distance[-1] as begin and end for the span, respectively.
As always, any help would be highly appreciated!
import matplotlib.pyplot as plt
distance=[0.35645334340084989, 0.55406894241607718, 0.10201413273193734, 0.13401365724625941, 0.71918808865838735, 0.14151335417722818]
time=[2.4004984846346171, 2.4909766335028447, 1.9852064018125195, 1.9083156734132103, 2.6380396934372863, 1.9114505780323543]
time_SD=[0.062393810960652669, 0.056945715242838917, 0.073960838867327183, 0.084111239062664475, 0.026912957190265499, 0.08595664694840538]
distance_SD=[0.035160608598240162, 0.032976715460514235, 0.02782911002465227, 0.035465701695038584, 0.043009444687382707, 0.038387585107200854]
a=1.17887019041
b=1.83339229489
a_sd=0.159771527859
b_sd=0.0762509747218
plt.errorbar(distance,time,yerr=time_SD, xerr=distance_SD, linestyle="None")
abline_values = [(a)*i + (b) for i in distance]
abline_values_plus = [(a+a_sd)*i + (b+b_sd) for i in distance]
abline_values_minus = [(a-a_sd)*i + (b-b_sd) for i in distance]
plt.plot(distance, abline_values,"r")
plt.fill_between(distance,abline_values_minus,abline_values_plus,facecolor='lightgrey', interpolate=True, edgecolors="None")
leg = plt.legend(loc="lower right", frameon=False, handlelength=0, handletextpad=0)
for item in leg.legendHandles:
item.set_visible(False)
plt.show()
To use pyplot.fill_between(), the list of horizontal coordinates should be sorted. Using an unsorted list of x values is possible, but can lead to undesired results.
Sorting a list can be done using sorted(list).
import matplotlib.pyplot as plt
distance=[0.35645334340084989, 0.55406894241607718, 0.10201413273193734, 0.13401365724625941, 0.71918808865838735, 0.14151335417722818]
time=[2.4004984846346171, 2.4909766335028447, 1.9852064018125195, 1.9083156734132103, 2.6380396934372863, 1.9114505780323543]
time_SD=[0.062393810960652669, 0.056945715242838917, 0.073960838867327183, 0.084111239062664475, 0.026912957190265499, 0.08595664694840538]
distance_SD=[0.035160608598240162, 0.032976715460514235, 0.02782911002465227, 0.035465701695038584, 0.043009444687382707, 0.038387585107200854]
a=1.17887019041
b=1.83339229489
a_sd=0.159771527859
b_sd=0.0762509747218
distance_sorted = sorted(distance)
plt.errorbar(distance,time,yerr=time_SD, xerr=distance_SD, linestyle="None")
abline_values = [(a)*i + (b) for i in distance_sorted]
abline_values_plus = [(a+a_sd)*i + (b+b_sd) for i in distance_sorted]
abline_values_minus = [(a-a_sd)*i + (b-b_sd) for i in distance_sorted]
plt.plot(distance_sorted, abline_values,"r")
plt.fill_between(distance_sorted,abline_values_minus,abline_values_plus, facecolor='lightgrey', edgecolors="None")
plt.show()
The documentation does not mention the requirement of x values being sorted. The reason is probably that fill_between actually works even with unsorted lists, just not the way one might expect. Maybe the following animation gives a more intuitive understanding on the issue:
You are right, fill_between seems to expect the values to be sorted. The documentation is not clear about this behaviour, though. The following example shows the same effect:
import matplotlib.pyplot as plt
from numpy import random, array
#x = random.randn(20) #does not work
x = array(sorted(random.randn(20))) #works
a = 2
d = .5
y_h = x*(a+d)
y_l = x*(a-d)
plt.fill_between(x,y_h, y_l)
plt.show()
As a workaround, just sort your values with sorted before calculating your error lines.
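Because the band here comes from straight-line formulas rather than from the measurements themselves, another option is to evaluate the lines on an evenly spaced, already sorted grid. The sketch below assumes the fit parameters from the question; the names x_line, y_upper and y_lower are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

distance = np.array([0.35645334340084989, 0.55406894241607718,
                     0.10201413273193734, 0.13401365724625941,
                     0.71918808865838735, 0.14151335417722818])
a, b = 1.17887019041, 1.83339229489
a_sd, b_sd = 0.159771527859, 0.0762509747218

# Sorted by construction, so fill_between sweeps from left to right
x_line = np.linspace(distance.min(), distance.max(), 50)
y_line = a * x_line + b
y_upper = (a + a_sd) * x_line + (b + b_sd)
y_lower = (a - a_sd) * x_line + (b - b_sd)

fig, ax = plt.subplots()
ax.plot(x_line, y_line, "r")
ax.fill_between(x_line, y_lower, y_upper, facecolor="lightgrey", edgecolor="none")
# plt.show()
```

If the upper and lower bounds are instead tied to the individual data points, np.argsort(distance) gives an index order that can be applied to all arrays so they stay aligned.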
Is there a way to extract the data from an array, which corresponds to a line of a contourplot in python? I.e. I have the following code:
n = 100
x, y = np.mgrid[0:1:n*1j, 0:1:n*1j]
plt.contour(x,y,values)
where values is a 2d array with data (I stored the data in a file, but it does not seem possible to upload it here). The picture below shows the corresponding contour plot. My question is whether it is possible to get exactly the data from values that corresponds, e.g., to the left contour line in the plot.
Worth noting here, since this post was the top hit when I had the same question, that this can be done with scikit-image much more simply than with matplotlib. I'd encourage you to check out skimage.measure.find_contours. A snippet of their example:
import numpy as np
from skimage import measure
x, y = np.ogrid[-np.pi:np.pi:100j, -np.pi:np.pi:100j]
r = np.sin(np.exp((np.sin(x)**3 + np.cos(y)**2)))
contours = measure.find_contours(r, 0.8)
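which can then be plotted/manipulated as you need. Note that find_contours returns a list of (n, 2) arrays holding (row, column) coordinates in array index space, not the x and y values of your grid. I like this more because you don't have to get into the deep weeds of matplotlib.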
plt.contour returns a QuadContourSet. From that, we can access the individual lines using:
cs.collections[0].get_paths()
This returns all the individual paths. To access the actual x, y locations, we need to look at the vertices attribute of each path. The first contour drawn should be accessible using:
X, Y = cs.collections[0].get_paths()[0].vertices.T
See the example below to see how to access any of the given lines. In the example I only access the first one:
import matplotlib.pyplot as plt
import numpy as np
n = 100
x, y = np.mgrid[0:1:n*1j, 0:1:n*1j]
values = x**0.5 * y**0.5
fig1, ax1 = plt.subplots(1)
cs = plt.contour(x, y, values)
lines = []
for line in cs.collections[0].get_paths():
lines.append(line.vertices)
fig1.savefig('contours1.png')
fig2, ax2 = plt.subplots(1)
ax2.plot(lines[0][:, 0], lines[0][:, 1])
fig2.savefig('contours2.png')
contours1.png:
contours2.png:
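Recent Matplotlib releases have deprecated and then removed the collections attribute of contour sets, so the snippet above may fail on a current installation. A sketch that should work across versions reads the vertices from the allsegs attribute instead; the explicit levels below are an assumption added to keep the example predictable.

```python
import numpy as np
import matplotlib.pyplot as plt

n = 100
x, y = np.mgrid[0:1:n*1j, 0:1:n*1j]
values = x**0.5 * y**0.5

fig, ax = plt.subplots()
cs = ax.contour(x, y, values, levels=[0.2, 0.5, 0.8])

# cs.allsegs[i] is a list of (N, 2) vertex arrays for the level cs.levels[i]
lines = {}
for level, segments in zip(cs.levels, cs.allsegs):
    lines[float(level)] = [np.asarray(seg) for seg in segments]

# x and y coordinates of the first line drawn at the 0.5 level
X, Y = lines[0.5][0].T
```

Each array holds data coordinates, so X and Y can be plotted directly or used to interpolate other quantities along the contour.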
plt.contour returns a QuadContourSet which holds the data you're after.
See Get coordinates from the contour in matplotlib? (which this question is probably a duplicate of...)