plotting 3d histogram/barplot in python matplotlib - python

I have an Nx3 matrix in scipy/numpy and I'd like to make a 3 dimensional bar graph out of it, where the X and Y axes are determined by the values of first and second columns of the matrix, the height of each bar is the third column in the matrix, and the number of bars is determined by N.
In other words, if "data" is the matrix then:
data[:, 0] # values of X-axis
data[:, 1] # values of Y-axis
data[:, 2] # values of each Z-axis bar
and there should be one bar per row, i.e. len(data) bars in total.
How can I do this in Matplotlib?
Secondly, as a variant of this, how can I do the same thing, but this time histogram the bars into N bins in each X, Y, Z dimension? I.e. instead of a bar for each point, just histogram the data into those bins in every dimension, and plot a bar for each bin.
Thanks very much for your help.

Here is one example of a 3D bar plot.
Here is another.
Numpy has a function called histogram2d to do the rectangular binning you want.
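As a rough sketch of both variants from the question (not taken from the linked examples): the first panel draws one bar per row with Axes3D.bar3d, the second bins the (x, y) positions with np.histogram2d and draws one bar per bin. The random data, the bar footprint of 0.4 and the choice of 5 bins are assumptions made only for the demo.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, registers the 3d projection on older Matplotlib

# Made-up N x 3 data: columns are x, y and bar height (assumption for the demo).
rng = np.random.default_rng(0)
data = np.column_stack([rng.uniform(0, 10, 20),
                        rng.uniform(0, 10, 20),
                        rng.uniform(1, 5, 20)])

fig = plt.figure(figsize=(10, 4))

# Variant 1: one bar per row of `data`, standing on the z = 0 plane.
ax1 = fig.add_subplot(121, projection='3d')
width = depth = 0.4  # bar footprint, chosen arbitrarily
ax1.bar3d(data[:, 0], data[:, 1], np.zeros(len(data)),
          width, depth, data[:, 2])

# Variant 2: bin the (x, y) positions and draw one bar per bin.
n_bins = 5
counts, xedges, yedges = np.histogram2d(data[:, 0], data[:, 1], bins=n_bins)
# Lower-left corner of every bin; indexing='ij' matches the layout of `counts`.
xpos, ypos = np.meshgrid(xedges[:-1], yedges[:-1], indexing='ij')
dx = np.diff(xedges)[0]  # bins are equally wide, so one width is enough
dy = np.diff(yedges)[0]
ax2 = fig.add_subplot(122, projection='3d')
ax2.bar3d(xpos.ravel(), ypos.ravel(), np.zeros(counts.size),
          dx, dy, counts.ravel())

plt.show()
```

Here the bar heights in the second panel are plain counts. Passing weights=data[:, 2] to np.histogram2d would instead sum the third column within each bin.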

I had a bit of a hard time shaping the 'heights' of my data properly from the examples, but finally got it to work with the following code. Here, Z is a 3 dimensional array with all of my data, and x and rval are basically the 2-d indices corresponding to the data points.
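import numpy as np
import matplotlib.pyplot as plt
# biopsy_num, r_sweep and Z come from the poster's own data
xs = np.arange(biopsy_num)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for y in np.arange(r_sweep):
    z = Z[:, y]  # heights of the bars in row y
    ax.bar(xs, z, zs=y, zdir='y', alpha=0.8)  # draw this row of bars at depth y
ax.set_xlabel('biopsies')
ax.set_ylabel('radius of biopsy')
ax.set_zlabel('Shannon Index')
plt.show()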

Related

Make a 2d histogram show if a certain value is above or below average?

I made a 2d histogram of two variables (x and y), both of which are long 1-D arrays. I then calculated the average of x in each bin and want the colorbar to show how much each x is above or below the average in its respective bin.
So far I have tried to make a new array, z, that contains the values for how far above/below average each x is. When I try to use this with pcolormesh, I run into the issue that it is not a 2-D array. I also tried to solve this by following the solution from this problem (Using pcolormesh with 3 one dimensional arrays in python). The arrays x, y and z all have the same length, and there is a corresponding z value for each x value.
My overall goal is for the colorbar not to depend on counts but to show how far each x value is above/below the average x of its bin. I suspect that it may make more sense to just plot x vs. z, but I do not think that would fix my colorbar issue.
As LoneWanderer mentioned, some sample code would be useful; however, let me make an attempt at what you want.
import numpy as np
import matplotlib.pyplot as plt
N = 10000
x = np.random.uniform(0, 1, N)
y = np.random.uniform(0, 1, N)  # generate x and y data (you will already have this)
# Histogram the data
xbins = np.linspace(0, 1, 100)
ybins = np.linspace(0, 1, 100)
hdata, xedges, yedges = np.histogram2d(x, y, bins=(xbins, ybins))
# Compute the mean count over all bins and each bin's relative difference from it
hdataMean = np.mean(hdata)
hdataRelDifference = (hdata - hdataMean) / hdataMean
# Plot the relative difference; histogram2d puts x along the first axis, so transpose for imshow
fig, ax = plt.subplots(1, 1)
cax = ax.imshow(hdataRelDifference.T, origin='lower',
                extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]])
fig.colorbar(cax, ax=ax)
plt.show()
If this is not what you intended, hopefully there are enough pieces here to adapt it for your needs.
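If the goal is instead to colour each bin by the average of a per-point value z (rather than by counts), one possible approach is to run np.histogram2d twice, once plain and once with weights=z, and divide the two. This is only a sketch: the definition of z below (deviation of x from its overall mean) and the 20 bins per axis are assumptions standing in for the asker's own data.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
N = 10000
x = rng.uniform(0, 1, N)
y = rng.uniform(0, 1, N)
z = x - x.mean()  # hypothetical per-point value, one z for every (x, y)

xbins = np.linspace(0, 1, 21)  # 20 bins along each axis
ybins = np.linspace(0, 1, 21)

# Number of points per bin, and sum of z per bin (via the weights argument).
counts, xedges, yedges = np.histogram2d(x, y, bins=(xbins, ybins))
zsum, _, _ = np.histogram2d(x, y, bins=(xbins, ybins), weights=z)

# Mean of z per bin; bins without any points become NaN.
with np.errstate(invalid='ignore', divide='ignore'):
    zmean = zsum / counts

fig, ax = plt.subplots()
# pcolormesh expects the array as (rows = y, columns = x), hence the transpose.
mesh = ax.pcolormesh(xedges, yedges, zmean.T, cmap='coolwarm')
fig.colorbar(mesh, ax=ax, label='mean of z in bin')
plt.show()
```

This turns the three 1-D arrays into the 2-D array that pcolormesh needs, with the colorbar showing the binned mean instead of the count.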

Python: Plot residuals on a fitted model

I want to plot the lines (residuals; cyan lines) between data points and the estimated model. Currently I'm doing so by iterating over all data points in my income pandas.DataFrame and adding vertical lines. x, y are the points' coordinates and predicted are the predictions (here the blue line).
plt.scatter(income["Education"], income["Income"], c='red')
plt.ylim(0,100)
for indx, (x, y, _, _, predicted) in income.iterrows():
    plt.axvline(x, y/100, predicted/100) # /100 because it needs floats [0,1]
Is there a more efficient way? This doesn't seem like a good approach for more than a few rows.
First of all, note that axvline only works here by coincidence. In general, the ymin and ymax values taken by axvline are fractions of the axes height (0 to 1), not data coordinates; dividing by 100 happens to work only because the y-limits are set to (0, 100).
In contrast, vlines uses data coordinates and also has the advantage of accepting arrays of values. It then creates a single LineCollection, which is more efficient than drawing individual lines.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-1.2,1.2,20)
y = np.sin(x)
dy = (np.random.rand(20)-0.5)*0.5
fig, ax = plt.subplots()
ax.plot(x,y)
ax.scatter(x,y+dy)
ax.vlines(x,y,y+dy)
plt.show()
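Applied to a setup closer to the question, the same idea could look like the sketch below. The education and income values are made up, and the straight-line fit via np.polyfit is an assumption standing in for whatever model produced the asker's predictions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
education = np.linspace(10, 22, 30)                   # made-up predictor
income = 5 * education - 40 + rng.normal(0, 8, 30)    # made-up response

# Stand-in model: an ordinary least-squares straight line.
slope, intercept = np.polyfit(education, income, 1)
predicted = slope * education + intercept

fig, ax = plt.subplots()
ax.scatter(education, income, c='red')
ax.plot(education, predicted, c='blue')
# One call draws every residual, from the prediction to the observed value.
residual_lines = ax.vlines(education, predicted, income, colors='cyan')
plt.show()
```

No loop over rows is needed; with a DataFrame, the three columns can be passed to vlines directly in the same way.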

matplotlib 2d histogram heatmap -- how do I use my dataset to make one?

I am new to python.
I have a dataset like
import numpy as np
from matplotlib import pyplot as plt
dats = np.array([[r1,x1,y1],[r2,x2,y2],...])
I would like to plot color intensity associated with r1,r2,... at the position (x1,y1), (x2,y2), et cetera respectively.
How can I get this data set manipulated in a format which matplotlib can use in a 2D histogram?
Any help much appreciated. I'll help others in return once I've gained some skill : o
To make a 2D histogram, your data set has to comprise two data values per sample rather than one data value and two indices. That is, you need two arrays: one containing the r1 values and one containing the r2 values. Your data does not have any r2 values, so you cannot compute a two-dimensional histogram.
Regarding your question, you do not even want a histogram. You just want to visualise your r1 values on a plane. This is easy. Say, your array dats has a length of 100, then:
rs = dats[:, 0]  # retrieve r-values from dats
# assumes the 100 points lie on a 10x10 grid and are stored in row order
plt.imshow(rs.reshape(10, 10), cmap='Greys', interpolation='none')
plt.colorbar()
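You can create interpolated data from a set of points using griddata, assuming x = [x1, x2, etc], y = [y1, y2, etc] and r = [r1, r2, etc] are numpy arrays, then,
from scipy.interpolate import griddata  # matplotlib.mlab.griddata was removed in Matplotlib 3.1
# Set up a grid
xi = np.linspace(x.min(), x.max(), 100)
yi = np.linspace(y.min(), y.max(), 100)
XI, YI = np.meshgrid(xi, yi)
# Interpolate r linearly onto the grid (NaN outside the convex hull of the points)
zi = griddata((x, y), r, (XI, YI), method='linear')
# Plot the colormap
cm = plt.pcolormesh(xi, yi, zi)
plt.colorbar()
plt.show()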
Other options include colouring scatter plots,
plt.scatter(x,y,c=r)
or there is a 2D histogram function in numpy where you could set the weights based on r,
H, xedges, yedges = np.histogram2d(x, y, weights=r)
I haven't used the last one personally.
I think what you are looking for is not a histogram but a contour plot (a histogram would count the number of occurrences of a coordinate (x,y) falling into a bin).
If your data is not on a grid, you can use tricontourf:
plt.tricontourf(dats[:,1],dats[:,2],dats[:,0],cmap='hot')
plt.colorbar()
plt.show()
There are more ways to plot this, such as scatter plots etc.

Plotting a 2D mesh grid with matplotlib

I would like to plot a 2D discretization rectangular mesh with non-regular
x y axes values, e.g. the typical discretization meshes used in CFD.
An example of the code may be:
fig = plt.figure(1,figsize=(12,8))
axes = fig.add_subplot(111)
matplotlib.rcParams.update({'font.size':17})
axes.set_xticks(self.xPoints)
axes.set_yticks(self.yPoints)
plt.grid(color='black', linestyle='-', linewidth=1)
myName = "2D.jpg"
fig.savefig(myName)
where self.xPoints and self.yPoints are 1D non-regular vectors.
This piece of code produces a good discretization mesh. The problem is the
xtick and ytick labels, because they appear for all values of xPoints and yPoints (they overlap).
How can I easily redefine the printed values in the axes?
Let's say I only want to show the minimum and maximum value for x and y and not all values from the discretization mesh.
I can't post an example figure because this is the first time I have asked something here (I can send it by mail if requested).
The problem is that you explicitly told matplotlib to label each point when you wrote:
axes.set_xticks(self.xPoints)
axes.set_yticks(self.yPoints)
Comment out those lines and see what the result looks like.
Of course, if you only want the first and last point labelled, it becomes:
axes.set_xticks([self.xPoints[0], self.xPoints[-1]])
...
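If the full grid should stay in place and only the end values should be labelled, one option is to keep a tick at every mesh line but blank out all labels except the first and last. The sketch below assumes made-up mesh coordinates in place of self.xPoints and self.yPoints, and end_labels is a hypothetical helper, not a Matplotlib function.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical non-uniform mesh coordinates standing in for self.xPoints / self.yPoints.
x_points = np.cumsum(np.linspace(0.1, 1.0, 15))
y_points = np.cumsum(np.linspace(1.0, 0.1, 10))


def end_labels(points):
    """Return one label per point, blank except for the first and last."""
    labels = [''] * len(points)
    labels[0] = f'{points[0]:.2f}'
    labels[-1] = f'{points[-1]:.2f}'
    return labels


x_labels = end_labels(x_points)
y_labels = end_labels(y_points)

fig, ax = plt.subplots(figsize=(12, 8))
# A tick at every mesh line keeps the grid complete ...
ax.set_xticks(x_points)
ax.set_yticks(y_points)
# ... while only the minimum and maximum get a visible label.
ax.set_xticklabels(x_labels)
ax.set_yticklabels(y_labels)
ax.set_xlim(x_points[0], x_points[-1])
ax.set_ylim(y_points[0], y_points[-1])
ax.grid(color='black', linestyle='-', linewidth=1)
plt.show()
```

The grid lines follow the tick positions, so the mesh is unchanged while the overlapping labels disappear.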
If the gridline was specified by axes.set_xticks(), I don't think it would be possible to show ticks without overlap in your case.
I may have a solution for you:
...
ax = plt.gca()
# Arr_y: y-direction data, 1D numpy array (.min()/.max() are used below).
for j in range(len(Arr_y)):
    plt.hlines(y=Arr_y[j], xmin=Arr_x.min(), xmax=Arr_x.max(), colors='black')
# Arr_x: x-direction data, 1D numpy array.
for i in range(len(Arr_x)):
    plt.vlines(x=Arr_x[i], ymin=Arr_y.min(), ymax=Arr_y.max(), colors='black')
# Customize your ticks here, 1D numpy array or list.
ax.set_xticks(Arr_xticks)
ax.set_yticks(Arr_yticks)
plt.xlim(Arr_x.min(), Arr_x.max())
plt.ylim(Arr_y.min(), Arr_y.max())
plt.show()
...
hlines and vlines draw horizontal and vertical lines; you can specify the extent of those lines with the boundary data in both the x and y directions.
I tried it with a 60×182 non-uniform mesh grid, which took 1.2 s. I hope I can post a picture here.

Changing axis values on a plot

How can I change the data on one axis?
I'm making some spectrum analysis on some data and my x-axis is the index of some matrix. I'd like to change it so that the x-axis becomes the data itself.
I'm using the imshow() to plot the data (I have a matrix whose elements are some intensity, the y axes are their detector-source correspondent pair and the x-axis should be their frequency).
The code for it is written down here:
def pltspec(dOD, self):
    idx = 0
    b = plt.psd(dOD[:,idx],Fs=self.fs,NFFT=512)
    B = np.zeros((2*len(self.Chan),len(b[0])))
    for idx in range(2*len(self.Chan)):
        b = plt.psd(dOD[:,idx],Fs=self.fs,NFFT=512)
        B[idx,:] = 20*log10(b[0])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    plt.imshow(B, origin = 'lower')
    plt.colorbar()
    locs, labels = xticks(find(b[1]), b[1])
    plt.axis('tight')
    ax.xaxis.set_major_locator(MaxNLocator(5))
I think if there's a way of interchanging the index of some array with its value, my problem would be solved.
I've managed to use the line locs, labels = xticks(find(b[1]), b[1]). But with it on my graph my axis interval just isn't right... I think it has something to do with the MaxNLocator (which I used to decrease the number of ticks).
And if I use the xlim, I can set the figure to be what I want, but the x axis is still the same (on that xlim I had to use the original data to set it right).
What am I doing wrong?
Yes, you can use the xticks method exemplified in this example.
There are also more sophisticated ways of doing it. See ticker.
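A minimal sketch of the xticks idea, under stated assumptions: the signals are random noise, the sampling rate fs is made up, and a plain FFT periodogram stands in for plt.psd. The point is only that a few column indices are chosen as tick positions and labelled with the frequency each one represents.

```python
import numpy as np
import matplotlib.pyplot as plt

fs = 50.0                       # hypothetical sampling rate in Hz
n_channels, n_samples = 8, 512
rng = np.random.default_rng(3)
signals = rng.normal(size=(n_channels, n_samples))

# Simple periodogram per channel (a stand-in for the PSD in the question).
freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)   # frequency of each column
power = np.abs(np.fft.rfft(signals, axis=1)) ** 2
B = 10 * np.log10(power + 1e-12)                 # small offset avoids log(0)

fig, ax = plt.subplots()
im = ax.imshow(B, origin='lower', aspect='auto')
fig.colorbar(im, ax=ax)

# Pick a few column indices and label them with the matching frequency.
tick_idx = np.linspace(0, len(freqs) - 1, 5).astype(int)
ax.set_xticks(tick_idx)
ax.set_xticklabels([f'{freqs[i]:.2f}' for i in tick_idx])
ax.set_xlabel('frequency (Hz)')
ax.set_ylabel('channel index')
plt.show()
```

Setting the tick positions and labels together avoids the mismatch that can appear when a locator such as MaxNLocator later replaces the positions but not the labels. For evenly spaced frequencies, the extent argument of imshow is another way to put data values on the axis.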
