Multiple 2D histogram on same plot


I have a script that extracts data from an image and ROI. Everything works except the final step, where I output the graphs: I'm having trouble with the windowing of the two histograms. No matter how I change the gridsize, mincnt, figure size, or x and y limits, one of the histograms is always slightly stretched. When I plot them individually they aren't stretched. Is there a way to give the hexagons on the same plot a consistent, "non-stretched" shape?
Below is my graph and plotting code (I left out the data extraction code because it is quite specialized).
plt.ion()
plt.figure(figsize=(16,8))
plt.title('2D Histogram of Entorhinal Cortex ROIs')
plt.xlabel(x_inputs)
plt.ylabel(y_inputs)
colors = ['Reds','Blues']
x = []
y= []
#image extraction code
hist1 = plt.hexbin(x[0],y[0], gridsize=100,cmap='Reds',mincnt=10, alpha=0.35)
hist2 = plt.hexbin(x[1],y[1], gridsize=100,cmap='Blues',mincnt=10, alpha=0.35)
plt.colorbar(hist1, orientation="vertical")
plt.colorbar(hist2, orientation="vertical")
plt.ioff()
plt.show()

This issue can be solved by setting limits for the bins with the extent parameter. This can be done automatically by computing the minimum and maximum x and y values across all the data being plotted. In cases where gridsize is small (e.g. 10), this approach may result in some of the bins being partially outside of the plot limits. If so, setting a margin with plt.margins can help display all the bins within the plot.
import numpy as np # v 1.20.2
import matplotlib.pyplot as plt # v 3.3.4
# Create a random dataset
rng = np.random.default_rng(seed=123) # random number generator
size = 10000
x1 = rng.normal(loc=5, scale=10, size=size)
y1 = rng.normal(loc=5, scale=2, size=size)
x2 = rng.normal(loc=-30, scale=5, size=size)
y2 = rng.normal(loc=-20, scale=5, size=size)
# Define hexbin grid extent
xmin = min(x1.min(), x2.min())
xmax = max(x1.max(), x2.max())
ymin = min(y1.min(), y2.min())
ymax = max(y1.max(), y2.max())
ext = (xmin, xmax, ymin, ymax)
# Draw figure with colorbars
plt.figure(figsize=(10, 6))
hist1 = plt.hexbin(x1, y1, gridsize=30, cmap='Reds', mincnt=10, alpha=0.3, extent=ext)
hist2 = plt.hexbin(x2, y2, gridsize=30, cmap='Blues', mincnt=10, alpha=0.3, extent=ext)
plt.colorbar(hist1, orientation='vertical')
plt.colorbar(hist2, orientation='vertical')
# plt.margins(0.1) # Uncomment this if hex bins are partially outside of plot limits
plt.show()
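When more than two datasets share a plot, the same idea can be wrapped in a small helper. The sketch below is only an illustration: `shared_extent` and its `pad` argument are hypothetical names that do not come from matplotlib. It returns one `(xmin, xmax, ymin, ymax)` tuple covering every dataset.

```python
import numpy as np


def shared_extent(*datasets, pad=0.0):
    """Return (xmin, xmax, ymin, ymax) covering every (x, y) pair in datasets.

    pad is a hypothetical convenience: a fraction of the data range that is
    added on each side of the limits.
    """
    xs = np.concatenate([np.asarray(x, dtype=float).ravel() for x, _ in datasets])
    ys = np.concatenate([np.asarray(y, dtype=float).ravel() for _, y in datasets])
    xpad = pad * (xs.max() - xs.min())
    ypad = pad * (ys.max() - ys.min())
    return (xs.min() - xpad, xs.max() + xpad, ys.min() - ypad, ys.max() + ypad)


rng = np.random.default_rng(seed=123)
a = (rng.normal(loc=5, scale=10, size=1000), rng.normal(loc=5, scale=2, size=1000))
b = (rng.normal(loc=-30, scale=5, size=1000), rng.normal(loc=-20, scale=5, size=1000))

ext = shared_extent(a, b)
print(ext)
```

The resulting tuple can then be passed as `extent=ext` to every `plt.hexbin` call, so that all histograms are binned on the same hexagonal grid.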

Related

Python: Filling colors between curves and axes & to regionalize the areas

I have a set of x,y values for two curves on excel sheets.
Using xlrd module, I have been able to plot them as below:
Question:
How do I shade the three areas with different fill colors? Had tried with fill_between but been unsuccessful due to not knowing how to associate with the x and y axes. The end in mind is as diagram below.
Here is my code:
import xlrd
import numpy as np
import matplotlib.pyplot as plt
workbook = xlrd.open_workbook('data.xls')
sheet = workbook.sheet_by_name('p1')
rowcount = sheet.nrows
colcount = sheet.ncols
result_data_p1 =[]
for row in range(1, rowcount):
    row_data = []
    for column in range(0, colcount):
        data = sheet.cell_value(row, column)
        row_data.append(data)
    #print(row_data)
    result_data_p1.append(row_data)
sheet = workbook.sheet_by_name('p2')
rowcount = sheet.nrows
colcount = sheet.ncols
result_data_p2 =[]
for row in range(1, rowcount):
    row_data = []
    for column in range(0, colcount):
        data = sheet.cell_value(row, column)
        row_data.append(data)
    result_data_p2.append(row_data)
x1 = []
y1 = []
for i,k in result_data_p1:
    cx1,cy1 = i,k
    x1.append(cx1)
    y1.append(cy1)
x2 = []
y2 = []
for m,n in result_data_p2:
    cx2,cy2 = m,n
    x2.append(cx2)
    y2.append(cy2)
plt.subplot(1,1,1)
plt.yscale('log')
plt.plot(x1, y1, label = "Warm", color = 'red')
plt.plot(x2, y2, label = "Blue", color = 'blue')
plt.xlabel('Color Temperature (K)')
plt.ylabel('Illuminance (lm)')
plt.title('Kruithof Curve')
plt.legend()
plt.xlim(xmin=2000,xmax=7000)
plt.ylim(ymin=10,ymax=50000)
plt.show()
Please guide or lead to other references, if any.
Thank you.

Here is a way to recreate the curves and the gradients. Drawing the background directly on the log scale turned out to be very complicated, so the background is created in linear space and put on a separate y-axis. The background would not stay behind the rest of the plot when it was drawn on the twin axis, so it is drawn on the main axis and the curves are drawn on the twin axis. Afterwards, the twin y-axis is moved back to the left.
To draw the curves, a spline is interpolated using six points. As the interpolation didn't give acceptable results using the plain coordinates, everything was interpolated in logspace.
The background is created column by column, checking where the two curves are for each x position. The red curve is extended artificially to have a consistent area.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
from scipy import interpolate
xmin, xmax = 2000, 7000
ymin, ymax = 10, 50000
# a grid of 6 x,y coordinates for both curves
x_grid = np.array([2000, 3000, 4000, 5000, 6000, 7000])
y_blue_grid = np.array([15, 100, 200, 300, 400, 500])
y_red_grid = np.array([20, 400, 10000, 500000, 500000, 500000])
# create interpolating curves in logspace
tck_red = interpolate.splrep(x_grid, np.log(y_red_grid), s=0)
tck_blue = interpolate.splrep(x_grid, np.log(y_blue_grid), s=0)
x = np.linspace(xmin, xmax)
yr = np.exp(interpolate.splev(x, tck_red, der=0))
yb = np.exp(interpolate.splev(x, tck_blue, der=0))
# create the background image; it is created fully in logspace
# the background (z) is zero between the curves, negative in the blue zone and positive in the red zone
# the values are close to zero near the curves, gradually increasing when they are further
xbg = np.linspace(xmin, xmax, 50)
ybg = np.linspace(np.log(ymin), np.log(ymax), 50)
z = np.zeros((len(ybg), len(xbg)), dtype=float)
for i, xi in enumerate(xbg):
    yi_r = interpolate.splev(xi, tck_red, der=0)
    yi_b = interpolate.splev(xi, tck_blue, der=0)
    for j, yj in enumerate(ybg):
        if yi_b >= yj:
            z[j][i] = (yj - yi_b)
        elif yi_r <= yj:
            z[j][i] = (yj - yi_r)
fig, ax2 = plt.subplots(figsize=(8, 8))
# draw the background image, set vmax and vmin to get the desired range of colors;
# vmin should be -vmax to get the white at zero
ax2.imshow(z, origin='lower', extent=[xmin, xmax, np.log(ymin), np.log(ymax)], aspect='auto', cmap='bwr', vmin=-12, vmax=12, interpolation='bilinear', zorder=-2)
ax2.set_ylim(ymin=np.log(ymin), ymax=np.log(ymax)) # the image fills the complete background
ax2.set_yticks([]) # remove the y ticks of the background image, they are confusing
ax = ax2.twinx() # draw the main plot using the twin y-axis
ax.set_yscale('log')
ax.plot(x, yr, label="Warm", color='crimson')
ax.plot(x, yb, label="Blue", color='dodgerblue')
ax2.set_xlabel('Color Temperature')
ax.set_ylabel('Illuminance (lm)')
ax.set_title('Kruithof Curve')
ax.legend()
ax.set_xlim(xmin=xmin, xmax=xmax)
ax.set_ylim(ymin=ymin, ymax=ymax)
ax.grid(True, which='major', axis='y')
ax.grid(True, which='minor', axis='y', ls=':')
ax.yaxis.tick_left() # switch the twin axis to the left
ax.yaxis.set_label_position('left')
ax2.grid(True, which='major', axis='x')
ax2.xaxis.set_major_formatter(mticker.StrMethodFormatter('{x:.0f} K')) # show x-axis in Kelvin
ax.text(5000, 2000, 'Pleasing', fontsize=16)
ax.text(5000, 20, 'Appears bluish', fontsize=16)
ax.text(2300, 15000, 'Appears reddish', fontsize=16)
plt.show()
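The column-by-column loop that fills the background can also be written with NumPy broadcasting. The following is a minimal sketch under one loud assumption: it uses `np.interp` in log space as a stand-in for the spline, so it needs no SciPy and its curves are piecewise linear rather than smooth. The grid points and the sign convention for `z` are the ones used above.

```python
import numpy as np

xmin, xmax = 2000, 7000
ymin, ymax = 10, 50000
x_grid = np.array([2000, 3000, 4000, 5000, 6000, 7000])
y_blue_grid = np.array([15, 100, 200, 300, 400, 500])
y_red_grid = np.array([20, 400, 10000, 500000, 500000, 500000])

xbg = np.linspace(xmin, xmax, 50)
ybg = np.linspace(np.log(ymin), np.log(ymax), 50)

# ASSUMPTION: linear interpolation in log space replaces the spline
log_red = np.interp(xbg, x_grid, np.log(y_red_grid))
log_blue = np.interp(xbg, x_grid, np.log(y_blue_grid))

# Rows of z are y positions and columns are x positions, as in the loop version
Y = ybg[:, np.newaxis]
z = np.where(Y <= log_blue, Y - log_blue,           # below the blue curve: negative
             np.where(Y >= log_red, Y - log_red,    # above the red curve: positive
                      0.0))                         # between the curves: zero
print(z.shape, z.min(), z.max())
```

The array `z` has the same layout as the one built by the nested loops, so it can be handed to the same `imshow` call.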

Data and histogram do not collide in matplotlib?

I am creating a histogram for my data. Interestingly, when I plot my raw data and their histogram together on one plot, they are a "y-flipped" version of each other as follows:
I failed to find out the reason and fix it. My code snippet is as follows:
import math as mt
import numpy as np
import matplotlib.cm as cm # needed for cm.binary below
import matplotlib.pyplot as plt
x = np.random.randn(50)
y = np.random.randn(50)
w = np.random.randn(50)
leftBound, rightBound, topBound, bottomBound = min(x), max(x), max(y), min(y)
# parameters for histogram
x_edges = np.linspace(int(mt.floor(leftBound)), int(mt.ceil(rightBound)), int(mt.ceil(rightBound))-int(mt.floor(leftBound))+1)
y_edges = np.linspace(int(mt.floor(bottomBound)), int(mt.ceil(topBound)), int(mt.ceil(topBound))-int(mt.floor(bottomBound))+1)
# construct the histogram
wcounts = np.histogram2d(x, y, bins=(x_edges, y_edges), density=False, weights=w)[0] # 'normed' was replaced by 'density' in newer NumPy
# wcounts is a 2D array, with each element representing the weighted count in a bins
# show histogram
extent = x_edges[0], x_edges[-1], y_edges[0], y_edges[-1]
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # left, bottom, width, height (range 0 to 1)
axes.set_xlabel('x (m)')
axes.set_ylabel('y (m)')
histogram = axes.imshow(np.transpose(wcounts), extent=extent, alpha=1, vmin=0.5, vmax=5, cmap=cm.binary) # alpha controls the transparency
fig.colorbar(histogram)
# show data
axes.plot(x, y, color = '#99ffff')
Since the data here are generated randomly for demonstration, I don't think it helps much, if the problem is with that particular data set. But anyway, if it is something wrong with the code, it still helps.

By default, axes.imshow(z) places array element z[0,0] in the top left corner of the axes (or of the extent, in this case). You probably want to either add the origin="lower" argument to your imshow() call or pass a vertically flipped data array, i.e., z[::-1,:].
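To see where a bin ends up, it helps to histogram a single point and look at its array index. The sketch below is only an illustration with made-up coordinates. It shows that after the transpose the rows of the image run along y, with row 0 holding the lowest y bin.

```python
import numpy as np

# One point in the lowest x bin and the highest y bin
x = np.array([0.5])
y = np.array([2.5])
edges = np.arange(0, 4)  # edges 0, 1, 2, 3 give three unit-wide bins per axis

counts, x_edges, y_edges = np.histogram2d(x, y, bins=(edges, edges))
image = counts.T  # histogram2d indexes as [x_bin, y_bin]; imshow wants [row=y, col=x]

row, col = np.argwhere(image == 1)[0]
print(row, col)  # 2 0: the highest y bin sits in the last row

# With origin="lower", imshow draws row 0 at the bottom, so the last row lands
# at the top where y is largest. With the default origin="upper", the same row
# is drawn at the bottom, which produces the y-flipped picture.
# axes.imshow(image, origin="lower",
#             extent=(x_edges[0], x_edges[-1], y_edges[0], y_edges[-1]))
```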

Set a colormap under a graph

I know this is well documented, but I'm struggling to implement this in my code.
I would like to shade the area under my graph with a colormap. Is it possible to have a colour, i.e. red from any points over 30, and a gradient up until that point?
I am using the method fill_between, but I'm happy to change this if there is a better way to do it.
def plot(sd_values):
    plt.figure()
    sd_values=np.array(sd_values)
    x=np.arange(len(sd_values))
    plt.plot(x,sd_values, linewidth=1)
    plt.fill_between(x,sd_values, cmap=plt.cm.jet)
    plt.show()
This is the result at the moment. I have tried axvspan, but this doesn't have cmap as an option. Why does the graph below not show a colormap?

I'm not sure the cmap argument is meant to be part of the fill_between plotting command. In your case you probably want to use the fill() command, by the way.
These fill commands create polygons or polygon collections. A polygon collection can take a cmap, but with fill there is no way of providing the data on which it should be colored.
What is (as far as I know) certainly not possible is to fill a single polygon with a gradient as you wish.
The next best thing is to fake it: plot a shaded image and clip it to the created polygon.
import numpy as np
import matplotlib.pyplot as plt
# create some sample data
x = np.linspace(0, 1)
y = np.sin(4 * np.pi * x) * np.exp(-5 * x) * 120
fig, ax = plt.subplots()
# plot only the outline of the polygon, and capture the result
poly, = ax.fill(x, y, facecolor='none')
# get the extent of the axes
xmin, xmax = ax.get_xlim()
ymin, ymax = ax.get_ylim()
# create a dummy image
img_data = np.arange(ymin,ymax,(ymax-ymin)/100.)
img_data = img_data.reshape(img_data.size,1)
# plot and clip the image
im = ax.imshow(img_data, aspect='auto', origin='lower', cmap=plt.cm.Reds_r, extent=[xmin,xmax,ymin,ymax], vmin=y.min(), vmax=30.)
im.set_clip_path(poly)
plt.show()
The image is given an extent which stretches it over the entire axes. The clip path then makes it show up only where the fill polygon is drawn.
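The reason everything above 30 ends up in one color is the vmax=30. argument: values are mapped linearly between vmin and vmax before the colormap is applied, and with default settings anything beyond vmax is drawn in the colormap's end color. The sketch below emulates that mapping with plain NumPy; `normalise` is a hypothetical helper written for illustration, not a matplotlib function.

```python
import numpy as np


def normalise(values, vmin, vmax):
    """Map values linearly so that vmin -> 0 and vmax -> 1, clipping the rest."""
    values = np.asarray(values, dtype=float)
    return np.clip((values - vmin) / (vmax - vmin), 0.0, 1.0)


levels = np.array([-10.0, 0.0, 15.0, 30.0, 45.0, 120.0])
fractions = normalise(levels, vmin=0.0, vmax=30.0)
print(fractions)  # [0.  0.  0.5 1.  1.  1. ]
```

Every level at or above 30 maps to 1.0, the top of the colormap, while the levels below it are spread over the gradient.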

I think all you need is to do the plot of the data one at a time, like:
import numpy
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors
# Create fake data
x = numpy.linspace(0,4)
y = numpy.exp(x)
# Now plot one by one
bar_width = x[1] - x[0] # assuming x is linearly spaced
for pointx, pointy in zip(x,y):
    current_color = cm.jet(min(pointy/30, 1.0)) # fraction of 30, capped at 1 so values above 30 share the top color
    plt.bar(pointx, pointy, bar_width, color = current_color)
plt.show()

How to get log axes for a density plot with matplotlib?

I am trying to make a 2D density plot (from some simulation data) with matplotlib. My x and y data are defined as the log10 of some quantities. How can I get logarithmic axes (with log minor ticks)?
Here is an example of my code:
import numpy as np
import matplotlib.cm as cm # needed for cm.Reds below
import matplotlib.pyplot as plt
Data = np.genfromtxt("data") # A 2-column data file
x = np.log10(Data[:,0])
y = np.log10(Data[:,1])
xmin = x.min()
xmax = x.max()
ymin = y.min()
ymax = y.max()
fig = plt.figure()
ax = fig.add_subplot(111)
hist = ax.hexbin(x,y,bins='log', gridsize=(30,30), cmap=cm.Reds)
ax.axis([xmin, xmax, ymin, ymax])
plt.savefig('plot.pdf')

From the matplotlib.pyplot.hist docstring, it looks like there is a 'log' argument to set to 'True' if you want log scale on axis.
hist(x, bins=10, range=None, normed=False, cumulative=False,
bottom=None, histtype='bar', align='mid',
orientation='vertical', rwidth=None, log=False, **kwargs)
log:
If True, the histogram axis will be set to a log scale. If log is True and x is a 1D
array, empty bins will be filtered out and only the non-empty (n, bins, patches) will be
returned.
There is also a pyplot.loglog function to make a plot with log scaling on the x and y axis.

Thank you very much for suggestions.
Below is my own solution. It is hardly "a minimum working example", but I have already stripped my script down quite a lot!
In a nutshell, I used imshow to plot the "image" (a 2D histogram with log bins) and I remove the axes. Then, I draw a second, empty (and transparent), plot, exactly on top of the first plot just to get log axes as imshow doesn't seem to allow it. Quite complicated if you ask me!
My code is probably far from optimal as I am new to python and matplotlib...
By the way, I don't use hexbin for two reasons:
1) It is too slow to run on very big data files like the kind I have.
2) With the version I use, the hexagons are slightly too large, i.e. they overlap, resulting in "pixels" of irregular shapes and sizes.
Also, I want to be able to write the histogram data into a file in text format.
#!/usr/bin/python
# How to get log axis with a 2D colormap (i.e. an "image") ??
#############################################################
#############################################################
import numpy as np
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import math
# Data file containing 2D data in log-log coordinates.
# The format of the file is 3 columns : x y v
# where v is the value to be plotted for coordinate (x,y)
# x and y are already log values
# For instance, this can be a 2D histogram with log bins.
input_file="histo2d.dat"
# Parameters to set space for the plot ("bounding box")
x1_bb, y1_bb, x2_bb, y2_bb = 0.125, 0.12, 0.8, 0.925
# Parameters to set space for colorbar
cb_fraction=0.15
cb_pad=0.05
# Return unique values from a sorted list, will be required later
def uniq(seq, idfun=None):
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result
# Read data from file. The format of the file is 3 columns : x y v
# where v is the value to be plotted for coordinate (x,y)
Data = np.genfromtxt(input_file)
x = Data[:,0]
y = Data[:,1]
v = Data[:,2]
# Determine x and y limits and resolution of data
x_uniq = np.array(uniq(np.sort(x)))
y_uniq = np.array(uniq(np.sort(y)))
x_resolution = x_uniq.size
y_resolution = y_uniq.size
x_interval_length = x_uniq[1]-x_uniq[0]
y_interval_length = y_uniq[1]-y_uniq[0]
xmin = x.min()
xmax = x.max()+0.5*x_interval_length
ymin = y.min()
ymax = y.max()+0.5*y_interval_length
# Reshape 1D data to turn it into a 2D "image"
v = v.reshape([x_resolution, y_resolution])
v = v[:,range(y_resolution-1,-1,-1)].transpose()
# Plot 2D "image"
# ---------------
# I use imshow, which only works with linear axes.
# We will have to change the axes later...
axis_lim=[xmin, xmax, ymin, ymax]
fig = plt.figure()
ax = fig.add_subplot(111)
extent = [xmin, xmax, ymin, ymax]
img = plt.imshow(v, extent=extent, interpolation='nearest', cmap=cm.Reds, aspect='auto')
ax.axis(axis_lim)
# Make space for the colorbar
x2_bb_eff = (x2_bb-(cb_fraction+cb_pad)*x1_bb)/(1.0-(cb_fraction+cb_pad))
ax.set_position([x1_bb, y1_bb, x2_bb_eff-x1_bb, y2_bb-y1_bb])
position = ax.get_position()
# Remove axis ticks so that we can put log ticks on top
ax.set_xticks([])
ax.set_yticks([])
# Add colorbar
cb = fig.colorbar(img,fraction=cb_fraction,pad=cb_pad)
cb.set_label('Value [unit]')
# Add logarithmic axes
# --------------------
# Empty plot on top of previous one. Only used to add log axes.
ax = fig.add_subplot(111,frameon=False)
ax.set_xscale('log')
ax.set_yscale('log')
plt.plot([])
ax.set_position([x1_bb, y1_bb, x2_bb-x1_bb, y2_bb-y1_bb])
axis_lim_log = [10.**lim for lim in axis_lim] # a list, because map() returns an iterator in Python 3
ax.axis(axis_lim_log)
plt.grid(True, which='major', linewidth=1)
plt.ylabel('Some quantity [unit]')
plt.xlabel('Another quantity [unit]')
plt.show()
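A different route to the same picture, not the one used in the script above, is to bin the raw (non-logged) values on logarithmically spaced edges and draw them with pcolormesh, which accepts non-uniform edges and can therefore sit directly on log-scaled axes. The sketch below assumes synthetic log-normal data in place of the data file, and the plotting calls are left as comments so that the block only needs NumPy.

```python
import io

import numpy as np

# ASSUMPTION: synthetic positive data standing in for the real data file
rng = np.random.default_rng(seed=0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
y = rng.lognormal(mean=1.0, sigma=0.5, size=5000)

# Logarithmically spaced bin edges spanning the data
n_bins = 30
x_edges = np.geomspace(x.min(), x.max(), n_bins + 1)
y_edges = np.geomspace(y.min(), y.max(), n_bins + 1)
counts, _, _ = np.histogram2d(x, y, bins=(x_edges, y_edges))

# The histogram can be written out as plain text (here into an in-memory buffer)
buffer = io.StringIO()
np.savetxt(buffer, counts, fmt="%d")

# Plotting, assuming matplotlib.pyplot is imported as plt:
# fig, ax = plt.subplots()
# mesh = ax.pcolormesh(x_edges, y_edges, counts.T, cmap="Reds")
# ax.set_xscale("log")
# ax.set_yscale("log")
# fig.colorbar(mesh)
```

Because the axes themselves are logarithmic, the usual log minor ticks appear without a second, transparent axes on top.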

The answer from @gcalmettes refers to pyplot.hist. The signature for pyplot.hexbin is a bit different:
hexbin(x, y, C = None, gridsize = 100, bins = None,
xscale = 'linear', yscale = 'linear',
cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None,
edgecolors='none', reduce_C_function = np.mean, mincnt=None, marginals=True,
**kwargs)
You are interested in the xscale parameter (yscale works the same way for the vertical axis):
*xscale*: [ 'linear' | 'log' ]
Use a linear or log10 scale on the horizontal axis.
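As a minimal sketch of how this could look, assuming synthetic positive data in place of the simulation file: the raw values are passed to hexbin without taking log10 first, and xscale and yscale request the logarithmic axes. The Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# ASSUMPTION: synthetic positive data standing in for the 2-column data file
rng = np.random.default_rng(seed=1)
x = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
y = rng.lognormal(mean=1.0, sigma=0.5, size=5000)

fig, ax = plt.subplots()
# Pass the raw values; hexbin bins them in log10 space and sets log axes itself
hb = ax.hexbin(x, y, gridsize=30, bins="log", mincnt=1,
               xscale="log", yscale="log", cmap="Reds")
fig.colorbar(hb, ax=ax)
print(ax.get_xscale(), ax.get_yscale())
```

Note that log scales require strictly positive x and y values.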
