Plotting 2d histogram of data with very different ranges in Python

I am trying to plot a 2D histogram of data with very different ranges using the code below. Because the ranges differ so much, the x data is squeezed together, as in the following figure. Is there a way to plot the x and y data with the same axis length?
import numpy as np
from matplotlib import pyplot as plt
plt.clf()
x = np.random.randint(low=0, high=10, size=8873)
y = np.random.randint(low=100000,high=600000, size=8873)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=50)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
plt.imshow(heatmap.T, extent=extent, origin='lower')
plt.show()

Note that imshow() uses an aspect of 1 by default, so one x unit is drawn as long as one y unit; changing it to 'auto' should solve your problem. You can also calculate your own aspect from extent to get, for example, a square image.
aspect = (extent[1] - extent[0]) / (extent[3] - extent[2])
plt.imshow(heatmap.T, extent=extent, origin='lower', aspect=aspect)
# plt.imshow(heatmap.T, extent=extent, origin='lower', aspect='auto')
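The aspect formula above can be checked numerically. The following is a minimal sketch, assuming the same random data as in the question; the helper name `square_aspect` is made up for illustration and is not part of matplotlib.

```python
import numpy as np


def square_aspect(extent):
    """Return an imshow aspect value that draws `extent` as a square box.

    A numeric aspect is the displayed size of one y data unit relative to
    one x data unit, so (x range) / (y range) yields equal side lengths.
    """
    x0, x1, y0, y1 = extent
    return abs(x1 - x0) / abs(y1 - y0)


rng = np.random.default_rng(0)
x = rng.integers(0, 10, size=8873)
y = rng.integers(100_000, 600_000, size=8873)

heatmap, xedges, yedges = np.histogram2d(x, y, bins=50)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
aspect = square_aspect(extent)

# The y range scaled by the aspect equals the x range, i.e. a square image.
width = extent[1] - extent[0]
height = (extent[3] - extent[2]) * aspect
```

Passing `aspect=aspect` to `plt.imshow(heatmap.T, extent=extent, origin='lower', ...)` should then give a square image, while `aspect='auto'` simply fills whatever shape the axes have.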

Related

Make a heatmap with 2d points and 2 images

Good day to all,
I am trying to create a heatmap from a set of x, y coordinate pairs extracted from a CSV file. I am using numpy.histogram2d to build it, and the heatmap itself comes out correct, but I need to overlay it on another image. Here is my code.
It starts by reading the CSV:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import cm
from scipy.ndimage import gaussian_filter
csv = pd.read_csv("../test2.csv")
# Get the centroid of each bounding box
# (the original post read these columns from an undefined combined_csv; csv is assumed here)
csv["x"] = (csv["x_min"] + (csv["x_max"] - csv["x_min"]) / 2).astype(int)
csv["y"] = (csv["y_min"] + (csv["y_max"] - csv["y_min"]) / 2).astype(int)
am = csv[(csv["tracking_id"] != 7) & (csv["tracking_id"] != 28)]  # Some filters
# Taken from another question
def myplot(x, y, s, bins=1000):
    # 2D histogram over the image area: x in [0, 960], y in [0, 480]
    heatmap, xedges, yedges = np.histogram2d(x, y, bins=bins, range=[[0, 960], [0, 480]])
    heatmap = gaussian_filter(heatmap, sigma=s)
    extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
    return heatmap.T, extent
img, extent = myplot(am["x"], am["y_max"], 16, 200)
plt.imshow(img, extent=extent, origin='upper', cmap=cm.jet)
The previous code gives me this image:
I set the range to 0-960 for x and 0-480 for y because the target image has that size. With these settings histogram2d returns a 200x200 histogram matrix h, which matplotlib then plots correctly over the range 0-960 for x and 0-480 for y. What I need is to save that image at the pixel size of the target image, preserving as much information as possible, and then blend it with the target image following the tutorial "how to superimpose heatmap on a base image?":
final_img = cv2.addWeighted (heatmap_img, 0.5, image, 0.5, 0)
For creating something like this
But the image and the histogram don't match in size the way I wanted.
This is the target image

OK, I fixed the problem with the following code, which draws the two images on the same plot:
base_image = image
sigma = 8
fig, ax = plt.subplots(1, 1, figsize = (20, 16))
heatmap, xedges, yedges = np.histogram2d(am["x"], am["y"], bins=200, range=[[0,960],[0,480]])
heatmap = gaussian_filter(heatmap, sigma=sigma)
extent = [xedges[0], xedges[-1], yedges[-1], yedges[0]]
img = heatmap.T
ax.imshow(img, extent=extent, cmap=cm.jet)
ax.imshow(base_image, alpha = 0.5)
ax.set_xlim(0, 960)
ax.set_ylim(480,0)
ax.axes.get_yaxis().set_visible(False)
ax.axes.get_xaxis().set_visible(False)
plt.subplots_adjust(left=0, bottom=0, right=1, top=1, wspace=0, hspace=0)
plt.savefig("heatmap_final.png", bbox_inches='tight')
plt.show()
plt.close(fig)
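If the goal is an array that can be blended directly with the base image (as with cv2.addWeighted in the question), one option is to bin at one bin per pixel so the histogram already has the image's shape. The following is a minimal NumPy-only sketch under stated assumptions: the base image and the coordinates are random stand-ins, the heatmap is kept in grayscale, and the Gaussian smoothing step is left out.

```python
import numpy as np

# Hypothetical stand-ins: a 480x960 RGB base image and random centroids.
height, width = 480, 960
rng = np.random.default_rng(0)
base_image = rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)
x = rng.uniform(0, width, size=5000)
y = rng.uniform(0, height, size=5000)

# One bin per pixel: bins=[width, height] together with a matching range.
counts, xedges, yedges = np.histogram2d(
    x, y, bins=[width, height], range=[[0, width], [0, height]]
)
heat = counts.T  # histogram2d indexes [x, y]; images index [row=y, col=x]

# Normalise to 0..255 and replicate to three channels (plain grayscale;
# a colormap could be applied here instead).
peak = max(heat.max(), 1.0)  # avoid division by zero for an empty histogram
heat_u8 = (255 * heat / peak).astype(np.uint8)
heat_rgb = np.repeat(heat_u8[:, :, None], 3, axis=2)

# Equal-weight blend, roughly what cv2.addWeighted(a, 0.5, b, 0.5, 0) computes.
blended = (0.5 * heat_rgb + 0.5 * base_image).astype(np.uint8)
```

Because `heat` has the same height and width as the image, no resizing is needed before blending. Smoothing with gaussian_filter, as in the code above, could be applied to `heat` before the normalisation step.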

Generate a loglog heatmap in MatPlotLib using a scatter data set

I have a 2D power-law like dataset:
import numpy as np
X = 1 / np.random.power(2, size=1000)
Y = 1 / np.random.power(2, size=1000)
I can plot it using a scatter plot in loglog scale
import matplotlib.pyplot as plt
plt.figure()
plt.scatter(X, Y, alpha=0.3)
plt.loglog()
plt.show()
getting:
However, it does not properly show the data near the origin, where the density is high. Hence, I converted the plot into a heatmap like this:
from matplotlib.colors import LogNorm
heatmap, xedges, yedges = np.histogram2d(X, Y, bins=np.logspace(0, 2, 30))
plt.figure()
plt.imshow(heatmap.T, origin='lower', norm=LogNorm())
plt.colorbar()
plt.show()
getting:
The plot looks great, but the axis ticks are wrong. To change the scale I tried adding extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]] to the imshow call, but that only applies an affine transformation: the scale is still linear, not logarithmic. Do you know how I can get the heatmap plot with the ticks of the scatter plot?

You can use pcolormesh, as JohanC advised.
Here is an example based on your code using pcolormesh:
import numpy as np
import matplotlib.pyplot as plt
X = 1 / np.random.power(2, size=1000)
Y = 1 / np.random.power(2, size=1000)
heatmap, xedges, yedges = np.histogram2d(X, Y, bins=np.logspace(0, 2, 30))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.pcolormesh(xedges, yedges, heatmap.T)  # transpose: histogram2d indexes [x, y], pcolormesh expects [y, x]
ax.loglog()
ax.set_xlim(1, 50)
ax.set_ylim(1, 50)
plt.show()
And the output is:
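
One detail that is easy to miss with this approach is orientation: np.histogram2d returns counts indexed as [x, y], whereas pcolormesh expects rows to run along y. The sketch below is an illustration only, not the answerer's code. It uses different bin counts per axis (an assumption made so that a missing transpose shows up as a shape error) and adds a logarithmic colour scale like the one in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LogNorm

rng = np.random.default_rng(0)
X = 1 / rng.power(2, size=1000)
Y = 1 / rng.power(2, size=1000)

# 30 edges give 29 bins along x; 20 edges give 19 bins along y.
xbins = np.logspace(0, 2, 30)
ybins = np.logspace(0, 2, 20)
heatmap, xedges, yedges = np.histogram2d(X, Y, bins=[xbins, ybins])

fig, ax = plt.subplots()
# heatmap has shape (29, 19); pcolormesh needs (19, 29), hence the transpose.
mesh = ax.pcolormesh(xedges, yedges, heatmap.T, norm=LogNorm())
ax.set_xscale("log")
ax.set_yscale("log")
fig.colorbar(mesh, ax=ax, label="counts")
```

With square bins the transpose is not needed to avoid an error, but leaving it out would silently swap the two axes.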

Over-plot an equation curve over a png image

I'm having trouble overplotting a relation between radial velocity and offset (position) on an image. I've looked at various solutions, but none of them seems to work. I've converted the equation into numbers, leaving only one variable. The picture is also not displayed at the required dimensions.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-0.8, 0.8, 1000)
y = 0.5*((1.334e+20/x)**0.5)
img = plt.imread('Pictures/PVdiagram1casaviewer.png')
fig, ax = plt.subplots(figsize=(16, 16), tight_layout=True)
ax.set_xlabel('Offset(arcsec)', fontsize=14)
ax.set_ylabel('Radial Velocity (Km/S)', fontsize=14)
ax.imshow(img, extent=[-0.8, 0.8, -5, 15])
ax.plot(x, y, linewidth=5, color='white')
plt.title('PV Diagram')
plt.show()

If I plot your image, you can see that the axes of the image and of matplotlib don't match, because the image contains space between the plot and the border of the picture (axis titles and so on).
So, first you need to crop the image so that it contains just the plot area.
Next, you can plot the image with the argument aspect='auto' to scale it to your figsize:
ax.imshow(img, extent=[-0.8,0.8,-5,15], aspect='auto')
If you try to plot your y function over the image, you will see that the values of y are much larger than the image's y range, so the curve lies far above the image (the image shows up as a tiny strip at the bottom).
I don't know what the physical background of y is, but if you divide it by 10e9 (that is, 1e10) it fits inside the image range.
Full code:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-0.8 ,0.8 , 1000)
y = 0.5*((1.334e+20/x)**0.5)/10e9 # Scale it here... but how?
img = plt.imread('hNMw82.png')
fig, ax = plt.subplots(figsize=(16, 16), tight_layout=True)
ax.set_xlabel('Offset(arcsec)', fontsize=14)
ax.set_ylabel('Radial Velocity (Km/S)', fontsize=14)
ax.imshow(img, extent=[-0.8,0.8,-5,15], aspect='auto')
ax.plot(x, y, linewidth=5, color='white')
ax.set_ylim([-5,15])
ax.set_xlim([-0.8,0.8])
plt.title('PV Diagram')
plt.show()
Result:
(I also set the axis limits.)
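To see what the scaling does to the numbers, the curve can be evaluated without any plotting. This is only a sketch of the arithmetic: the factor 1e10 is the empirical one from the answer, and restricting the curve to positive offsets is an assumption made here because the square root is not real for negative x.

```python
import numpy as np

x = np.linspace(-0.8, 0.8, 1000)
pos = x > 0                      # the square root is only real for x > 0

# NaN entries are skipped when the array is passed to ax.plot.
y = np.full_like(x, np.nan)
y[pos] = 0.5 * np.sqrt(1.334e20 / x[pos]) / 1e10   # 10e9 == 1e10

# Samples of the curve that fall inside the image's y range (-5 to 15).
inside = y[pos] <= 15
```

At the right edge of the image (x = 0.8) the scaled curve is at about 0.65, and it rises steeply towards x = 0, which is why setting the y limits explicitly helps.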

Using imshow() to create higher quality hist2d

I am playing around with volumetric data and I am trying to project a "cosmic web"-like image.
I create a file path and open the data with a module that reads HDF5 files. The x and y values are taken by indexing the array gas_pos from the file, and the histogram is weighted by different properties, gas_density in this case:
import matplotlib.colors  # binds the name matplotlib, needed for matplotlib.colors.LogNorm below
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LinearSegmentedColormap
from matplotlib.ticker import LogFormatter
cmap = LinearSegmentedColormap.from_list('mycmap', ['black', 'steelblue', 'mediumturquoise', 'darkslateblue'])
fig = plt.figure()
ax = fig.add_subplot(111)
H = ax.hist2d(gas_pos[:,0]/0.7, gas_pos[:,1]/0.7, bins=500, cmap=cmap, norm=matplotlib.colors.LogNorm(), weights=gas_density);
cb = fig.colorbar(H[3], ax=ax, shrink=0.8, pad=0.01, orientation="horizontal", label=r'$ \rho\ [M_{\odot}\ \mathrm{kpc}^{-3}]$')
ax.tick_params(axis=u'both', which=u'both',length=0)
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()
giving me this:
which is nice, but I want to improve the quality and remove the graininess. When I try imshow interpolation:
cmap = LinearSegmentedColormap.from_list('mycmap', ['black', 'steelblue', 'mediumturquoise', 'darkslateblue'])
fig = plt.figure()
ax = fig.add_subplot(111)
H = ax.hist2d(gas_pos[:,0]/0.7, gas_pos[:,1]/0.7, bins=500, cmap=cmap, norm=matplotlib.colors.LogNorm(), weights=gas_density);
ax.tick_params(axis=u'both', which=u'both',length=0)
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
im = ax.imshow(H[0], cmap=cmap, interpolation='sinc', norm=matplotlib.colors.LogNorm())
cb = fig.colorbar(H[3], ax=ax, shrink=0.8, pad=0.01, orientation="horizontal", label=r'$ \rho\ [M_{\odot}\ \mathrm{kpc}^{-3}]$')
plt.show()
Am I using this incorrectly? Or is there something better I can use to reduce the pixelation?
If anyone wants to play with my data, I will upload it later today!

Using interpolation='sinc' is indeed a good way to smooth a plot. Other options would be, e.g., "gaussian", "bicubic" or "spline16".
The problem you observe is that the imshow plot is drawn on top of the hist2d plot and thus takes over its axes limits. Those limits seem to be smaller than the number of points in the imshow plot, and therefore you only see part of the total data.
The solution is either not to plot the hist2d plot at all, or at least to plot it into another subplot or figure.
Pursuing the first idea, you would calculate your histogram without plotting it, using numpy.histogram2d:
H, xedges, yedges = np.histogram2d(gas_pos[:,0]/0.7, gas_pos[:,1]/0.7,
                                   bins=500, weights=gas_density)
im = ax.imshow(H.T, cmap=cmap, interpolation='sinc', norm=matplotlib.colors.LogNorm())
I would also recommend reading the numpy.histogram2d documentation, which includes an example of plotting the histogram output in matplotlib.
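A self-contained version of this idea might look like the sketch below. The arrays `gas_pos` and `gas_density` are synthetic stand-ins (assumptions, not the asker's data), and the `origin` and `extent` arguments are additions that keep the orientation and coordinates the same as in a hist2d plot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LogNorm

# Synthetic stand-ins for the asker's gas_pos and gas_density.
rng = np.random.default_rng(0)
gas_pos = rng.normal(size=(20000, 2))
gas_density = rng.uniform(0.5, 2.0, size=20000)

# Weighted 2D histogram, computed without drawing anything.
H, xedges, yedges = np.histogram2d(
    gas_pos[:, 0], gas_pos[:, 1], bins=100, weights=gas_density
)

fig, ax = plt.subplots()
im = ax.imshow(
    np.ma.masked_equal(H.T, 0),   # transpose to [y, x]; mask empty bins for the log scale
    origin="lower",               # first y bin at the bottom, as in hist2d
    extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]],
    interpolation="gaussian",
    norm=LogNorm(),
)
fig.colorbar(im, ax=ax, orientation="horizontal")
```

Since only one artist is drawn on the axes, the axes limits come from the image itself and the whole histogram is visible.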

You'll probably want to set interpolation='None' in the call to imshow, instead of interpolation='sinc'

Log-log density-colour plot in matplotlib

I am trying to create a density plot from given data using log scales on both the x and y axes, with Matplotlib 2.0.0. I have written the following code; the problem is that the log plot does not show the correct functional behaviour.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
init = 0.0
points = 500
final_value = 100
steep = (final_value-init)/points
list_values_x = np.arange(init,final_value,steep)
list_values_y = np.arange(init,final_value,steep)
# WE CREATE OUR DATA FILE
f1 = open("data.txt", "w")
for i in list_values_x:
    for j in list_values_y:
        f1.write( str(i) +" "+str(j)+" "+str(0.0001*(i**2+j**2)) +"\n")
f1.close()
#NOW WE OPEN THE FILE WITH THE DATA AND MAKE THE PLOT
x,y,temp = np.loadtxt('data.txt').T #Transposed for easier unpacking
nrows, ncols = points, points
grid = temp.reshape((nrows, ncols))
# LINEAR PLOT
fig1 = plt.imshow(grid, extent=(x.min(), x.max(), y.max(), y.min()),
                  interpolation='nearest', cmap=cm.gist_rainbow)
plt.axis([x.min(), x.max(),y.min(), y.max()])
plt.colorbar()
plt.suptitle('Example', fontsize=15)
plt.xlabel('x', fontsize=16)
plt.ylabel('y', fontsize=16)
plt.show()
# LOG-LOG PLOT
fig, (ax1) = plt.subplots(ncols=1, figsize=(8, 4))
ax1.imshow(grid, aspect="auto", extent=(1, 1e2, 1, 1e2), interpolation='nearest')
ax1.set_yscale('log')
ax1.set_xscale('log')
ax1.set_title('Example with log scale')
plt.show()
The data I am using to make the plot is irrelevant; it's just an example. The first plot uses a linear scale. The second plot uses a log-log scale, but it is clearly incorrect: the behaviour of the two plots is completely different even though I am using the same data. Moreover, I don't know how to put a colorbar in the log-log plot.
Any idea why this happens? Thanks for your attention.
P.S.: To build the log-log plot, I used part of the code that appears in "Non-linear scales on image plots" (http://matplotlib.org/devdocs/users/whats_new.html#non-linear-scales-on-image-plots).

Using the extent keyword in the form extent=(xmin, xmax, ymin, ymax) makes more sense when you also pass origin="lower" to imshow. You might also want to set the axis limits explicitly, since the automatic limits do not work too well for log scales.
Here is the complete example:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from mpl_toolkits.axes_grid1 import make_axes_locatable
points = 500
init = 0.0
final_value = 100
steep = (final_value-init)/points
x = np.arange(init,final_value,steep)
y = np.arange(init,final_value,steep)
X,Y = np.meshgrid(x,y)
Z = 0.0001*(X**2+Y**2)
fig, (ax, ax1) = plt.subplots(ncols=2, figsize=(8, 4))
# LINEAR PLOT
im = ax.imshow(Z, extent=(x.min(), x.max(), y.min(), y.max()),
               interpolation='nearest', cmap=cm.gist_rainbow, origin="lower")
ax.set_title('lin scale')
#make colorbar
divider = make_axes_locatable(ax)
ax_cb = divider.new_horizontal(size="5%", pad=0.05)
fig.add_axes(ax_cb)
fig.colorbar(im, cax = ax_cb, ax=ax)
# LOG-LOG PLOT
im1 = ax1.imshow(Z, extent=(1, 1e2, 1, 1e2),
                 interpolation='nearest', cmap=cm.gist_rainbow, origin="lower")
ax1.set_yscale('log')
ax1.set_xscale('log')
ax1.set_xlim([1, x.max()])
ax1.set_ylim([1, y.max()])
ax1.set_title('log scale')
#make colorbar
divider1 = make_axes_locatable(ax1)
ax_cb1 = divider1.new_horizontal(size="5%", pad=0.05)
fig.add_axes(ax_cb1)
fig.colorbar(im1, cax = ax_cb1, ax=ax1)
plt.tight_layout()
plt.show()
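As an alternative to imshow, and not what the answer above uses, pcolormesh places each cell at its actual data coordinates, so the axes can be switched to log scale without stretching a linearly sampled image. The sketch below assumes the same example function, sampled on log-spaced points from 1 to 100 because a log axis cannot show the value 0.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Sample the example function on log-spaced points between 1 and 100.
x = np.logspace(0, 2, 200)
y = np.logspace(0, 2, 200)
X, Y = np.meshgrid(x, y)
Z = 0.0001 * (X**2 + Y**2)

fig, ax = plt.subplots(figsize=(5, 4))
# shading="nearest" centres each cell on its (x, y) sample (Matplotlib 3.3 or newer).
mesh = ax.pcolormesh(X, Y, Z, shading="nearest", cmap="gist_rainbow")
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_title("log scale (pcolormesh)")
fig.colorbar(mesh, ax=ax)
```

Here `fig.colorbar(mesh, ax=ax)` adds the colorbar directly, without the extra divider axes used above.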
