I am creating a histogram for my data. Interestingly, when I plot my raw data and their histogram together on one plot, they are a "y-flipped" version of each other as follows:
I failed to find out the reason and fix it. My code snippet is as follows:
import math as mt
import numpy as np
import matplotlib.pylab as plt
x = np.random.randn(50)
y = np.random.randn(50)
w = np.random.randn(50)
leftBound, rightBound, topBound, bottomBound = min(x), max(x), max(y), min(y)
# parameters for histogram
x_edges = np.linspace(int(mt.floor(leftBound)), int(mt.ceil(rightBound)), int(mt.ceil(rightBound))-int(mt.floor(leftBound))+1)
y_edges = np.linspace(int(mt.floor(bottomBound)), int(mt.ceil(topBound)), int(mt.ceil(topBound))-int(mt.floor(bottomBound))+1)
# construct the histogram
wcounts = np.histogram2d(x, y, bins=(x_edges, y_edges), weights=w)[0]
# wcounts is a 2D array, with each element representing the weighted count in a bin
# show histogram
extent = x_edges[0], x_edges[-1], y_edges[0], y_edges[-1]
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # left, bottom, width, height (range 0 to 1)
axes.set_xlabel('x (m)')
axes.set_ylabel('y (m)')
histogram = axes.imshow(np.transpose(wcounts), extent=extent, alpha=1, vmin=0.5, vmax=5, cmap='binary') # alpha controls the transparency
fig.colorbar(histogram)
# show data
axes.plot(x, y, color = '#99ffff')
Since the data here are generated randomly for demonstration, I don't think it helps much, if the problem is with that particular data set. But anyway, if it is something wrong with the code, it still helps.
By default, axes.imshow(z) places array element z[0,0] in the top left corner of the axes (or of the extent, in this case), so the rows are drawn from top to bottom while your plotted data has y increasing upwards. You probably want to either add the origin="lower" argument to your imshow() call or pass a vertically flipped data array, i.e., z[::-1].
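A minimal sketch of the first option, assuming random demonstration data like the question's. The seed, the marker style, and the Agg backend line are my own choices, not part of the original code. It keeps the transpose (histogram2d indexes its result as [x_bin, y_bin], whereas imshow expects [row, column]) and lets origin="lower" put the first row at the bottom.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
x = rng.normal(size=50)
y = rng.normal(size=50)
w = rng.normal(size=50)

# Integer-aligned bin edges that cover the data, as in the question
x_edges = np.arange(np.floor(x.min()), np.ceil(x.max()) + 1)
y_edges = np.arange(np.floor(y.min()), np.ceil(y.max()) + 1)

# Weighted 2D histogram; the result is indexed [x_bin, y_bin]
wcounts, _, _ = np.histogram2d(x, y, bins=(x_edges, y_edges), weights=w)

fig, ax = plt.subplots()
extent = (x_edges[0], x_edges[-1], y_edges[0], y_edges[-1])
# Transpose so rows correspond to y, and draw row 0 at the bottom of the extent
im = ax.imshow(wcounts.T, extent=extent, origin="lower", cmap="binary")
ax.plot(x, y, ".", color="#99ffff")  # raw data on top of the histogram
fig.colorbar(im)
```

With this orientation the points should fall inside the bins that counted them; call plt.show() or fig.savefig(...) to look at the result.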
I have this script to extract data from an image and ROI. Everything works except the end, when I output the graphs: I'm having trouble with the windowing of both histograms. No matter how I change the gridsize, mincnt, figure size, or x and y limits, one of the histograms is always slightly stretched. When I plot them individually they aren't stretched. Is there a way to give the hexagons on the same plot a consistent "non-stretched" shape?
Down below is my graph and plotting methods. (I left out my data extraction methods because it was quite specialized).
plt.ion()
plt.figure(figsize=(16,8))
plt.title('2D Histogram of Entorhinal Cortex ROIs')
plt.xlabel(x_inputs)
plt.ylabel(y_inputs)
colors = ['Reds','Blues']
x = []
y= []
#image extraction code
hist1 = plt.hexbin(x[0],y[0], gridsize=100,cmap='Reds',mincnt=10, alpha=0.35)
hist2 = plt.hexbin(x[1],y[1], gridsize=100,cmap='Blues',mincnt=10, alpha=0.35)
plt.colorbar(hist1, orientation="vertical")
plt.colorbar(hist2, orientation="vertical")
plt.ioff()
plt.show()
This issue can be solved by setting limits for the bins with the extent parameter. This can be done automatically by computing the minimum and maximum x and y values across all the data being plotted. In cases where gridsize is small (e.g. 10), this approach may result in some of the bins being partially outside of the plot limits. If so, setting a margin with plt.margins can help display all the bins within the plot.
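The extent computation described above can be sketched as a small helper. The name common_extent and its call signature are hypothetical, introduced only for illustration; it takes any number of (x, y) pairs and returns the tuple that hexbin's extent parameter expects.

```python
import numpy as np

def common_extent(*datasets):
    """Return (xmin, xmax, ymin, ymax) spanning every (x, y) pair passed in."""
    xs = np.concatenate([np.asarray(x, dtype=float).ravel() for x, _ in datasets])
    ys = np.concatenate([np.asarray(y, dtype=float).ravel() for _, y in datasets])
    # Plain Python floats keep the result easy to print and compare
    return float(xs.min()), float(xs.max()), float(ys.min()), float(ys.max())

# Two tiny made-up datasets, just to exercise the helper
ext = common_extent(([0.0, 2.0], [1.0, 3.0]), ([-5.0, 1.0], [0.5, 10.0]))
print(ext)
```

Passing the same ext to every hexbin call makes all of them bin on one shared grid, which is what keeps the hexagons the same shape.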
import numpy as np # v 1.20.2
import matplotlib.pyplot as plt # v 3.3.4
# Create a random dataset
rng = np.random.default_rng(seed=123) # random number generator
size = 10000
x1 = rng.normal(loc=5, scale=10, size=size)
y1 = rng.normal(loc=5, scale=2, size=size)
x2 = rng.normal(loc=-30, scale=5, size=size)
y2 = rng.normal(loc=-20, scale=5, size=size)
# Define hexbin grid extent
xmin = min(x1.min(), x2.min())
xmax = max(x1.max(), x2.max())
ymin = min(y1.min(), y2.min())
ymax = max(y1.max(), y2.max())
ext = (xmin, xmax, ymin, ymax)
# Draw figure with colorbars
plt.figure(figsize=(10, 6))
hist1 = plt.hexbin(x1, y1, gridsize=30, cmap='Reds', mincnt=10, alpha=0.3, extent=ext)
hist2 = plt.hexbin(x2, y2, gridsize=30, cmap='Blues', mincnt=10, alpha=0.3, extent=ext)
plt.colorbar(hist1, orientation='vertical')
plt.colorbar(hist2, orientation='vertical')
# plt.margins(0.1) # Uncomment this if hex bins are partially outside of plot limits
plt.show()
I make a contourf plot using matplotlib.pyplot. Now I want to have a horizontal line (or something like ax.vspan would work too) with conditional coloring at y = 0. I will show you what I have and what I would like to get. I want to do this with an array, let's say landsurface that represents either land, ocean or ice. This array is filled with 1 (land), 2 (ocean) or 3 (ice) and has the len(locs) (so the x-axis).
This is the plot code:
plt.figure()
ax=plt.axes()
clev=np.arange(0.,50.,.5)
plt.contourf(locs,height-surfaceheight,var,clev,extend='max')
plt.xlabel('Location')
plt.ylabel('Height above ground level [m]')
cbar = plt.colorbar()
cbar.ax.set_ylabel('o3 mixing ratio [ppb]')
plt.show()
This is what I have so far:
This is what I want:
Many thanks in advance!
Intro
I'm going to use a line collection.
Because I don't have your original data, I faked some using a simple sine curve, and I plot on the baseline the color codes corresponding to low, middle and high values of the curve.
Code
Usual boilerplate, we need to explicitly import LineCollection
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection
Just to plot something, a sine curve (x r
x = np.linspace(0, 50, 101)
y = np.sin(0.3*x)
The color coding from the curve values (corresponding to your surface types) to the LineCollection colors, note that LineCollection requires that the colors are specified as RGBA tuples but I have seen examples using color strings, bah!
# 1 when near min, 2 when near 0, 3 when near max
z = np.where(y<-0.5, 1, np.where(y<+0.5, 2, 3))
col_d = {1:(0.4, 0.4, 1.0, 1), # blue, near min
2:(0.4, 1.0, 0.4, 1), # green, near zero
3:(1.0, 0.4, 0.4, 1)} # red, near max
# prepare the list of colors
colors = [col_d[n] for n in z]
In a line collection we need a sequence of segments, here I have decided to place my coded line at y=0 but you can just add a constant to s to move it up and down.
I admit that forming the sequence of segments is a bit tricky...
# build the sequence of segments
s = np.zeros(101)
segments=np.array(list(zip(zip(x,x[1:]),zip(s,s[1:])))).transpose((0,2,1))
# and fill the LineCollection
lc = LineCollection(segments, colors=colors, linewidths=5,
                    antialiaseds=0, # to prevent artifacts between lines
                    zorder=3) # to force drawing over the curve
Finally, we put everything on the canvas
# plot the function and the line collection
fig, ax = plt.subplots()
ax.plot(x,y)
ax.add_collection(lc)
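Since building the sequence of segments is the tricky part, here is a sketch of an equivalent construction that uses only NumPy stacking. The variable names points and reference are mine; the block also rebuilds the zip-based array from above so the two can be compared.

```python
import numpy as np

x = np.linspace(0, 50, 101)
s = np.zeros_like(x)  # baseline at y = 0; add a constant to move the line

# One (x, y) point per sample, then pair each point with its successor
points = np.column_stack([x, s])                        # shape (101, 2)
segments = np.stack([points[:-1], points[1:]], axis=1)  # shape (100, 2, 2)

# The zip-based construction, kept for comparison
reference = np.array(list(zip(zip(x, x[1:]), zip(s, s[1:])))).transpose((0, 2, 1))

print(segments.shape, np.array_equal(segments, reference))
```

Each entry of segments is [[x_i, y_i], [x_i+1, y_i+1]], which is the layout LineCollection expects.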
I would suggest adding an imshow() with proper extent, e.g.:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colorbar as colorbar
import matplotlib.colors as colors
### generate some data
np.random.seed(19680801)
npts = 50
x = np.random.uniform(0, 1, npts)
y = np.random.uniform(0, 1, npts)
X,Y=np.meshgrid(x,y)
z = x * np.exp(-X**2 - Y**2)*100
### create a colormap of three distinct colors for each landmass
landmass_cmap=colors.ListedColormap(["b","r","g"])
x_land=np.linspace(0,1,len(x)) ## this should be scaled to your "location"
## generate some fake landmass types (either 0, 1, or 2) with probabilites
y_land=np.random.choice(3, len(x), p=[0.1, 0.6, 0.3])
print(y_land)
fig=plt.figure()
ax=plt.axes()
clev=np.arange(0.,50.,.5)
## adjust the "height" of the landmass
x0,x1=0,1
y0,y1=0,0.05 ## y1 is the "height" of the landmass
## make sure that you're passing sensible zorder here and in your .contourf()
im = ax.imshow(y_land.reshape((-1,len(x))),cmap=landmass_cmap,zorder=2,extent=(x0,x1,y0,y1))
plt.contourf(x,y,z,clev,extend='max',zorder=1)
ax.set_xlim(0,1)
ax.set_ylim(0,1)
ax.plot()
ax.set_xlabel('Location')
ax.set_ylabel('Height above ground level [m]')
cbar = plt.colorbar()
cbar.ax.set_ylabel('o3 mixing ratio [ppb]')
## add a colorbar for your listed colormap
cax = fig.add_axes([0.2, 0.95, 0.5, 0.02]) # x-position, y-position, x-width, y-height
bounds = [0,1,2,3]
norm = colors.BoundaryNorm(bounds, landmass_cmap.N)
cb2 = colorbar.ColorbarBase(cax, cmap=landmass_cmap,
norm=norm,
boundaries=bounds,
ticks=[0.5,1.5,2.5],
spacing='proportional',
orientation='horizontal')
cb2.ax.set_xticklabels(['sea','land','ice'])
plt.show()
yields:
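The example uses fake codes 0, 1 and 2, whereas the landsurface array in the question holds 1 (land), 2 (ocean) and 3 (ice). A sketch of how a ListedColormap plus a BoundaryNorm could map those codes directly follows. The sample array and the three colors are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.colors as mcolors

# Hypothetical surface codes along the x-axis: 1 = land, 2 = ocean, 3 = ice
landsurface = np.array([1, 1, 2, 2, 3, 1])

# One color per surface type, in code order (colors chosen arbitrarily)
surface_cmap = mcolors.ListedColormap(["green", "blue", "lightgray"])

# Boundaries placed halfway between the integer codes
norm = mcolors.BoundaryNorm([0.5, 1.5, 2.5, 3.5], surface_cmap.N)

indices = norm(landsurface)   # color index for each location
rgba = surface_cmap(indices)  # (N, 4) array of RGBA colors
print(np.asarray(indices))
```

The same cmap and norm can then be passed to imshow, e.g. ax.imshow(landsurface[np.newaxis, :], cmap=surface_cmap, norm=norm, extent=...), in place of the reshape of y_land used above.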
I know this is well documented, but I'm struggling to implement this in my code.
I would like to shade the area under my graph with a colormap. Is it possible to have a colour, i.e. red from any points over 30, and a gradient up until that point?
I am using the method fill_between, but I'm happy to change this if there is a better way to do it.
def plot(sd_values):
    plt.figure()
    sd_values=np.array(sd_values)
    x=np.arange(len(sd_values))
    plt.plot(x,sd_values, linewidth=1)
    plt.fill_between(x,sd_values, cmap=plt.cm.jet)
    plt.show()
This is the result at the moment. I have tried axvspan, but it doesn't have cmap as an option. Why does the graph below not show a colormap?
I'm not sure the cmap argument should be part of the fill_between plotting command. In your case you probably want to use the fill() command, by the way.
These fill commands create polygons or polygon collections. A polygon collection can take a cmap, but with fill there is no way of providing the data by which it should be colored.
What is certainly not possible (as far as I know) is to fill a single polygon with a gradient as you wish.
The next best thing is to fake it. You can plot a shaded image and clip it based on the created polygon.
import numpy as np
import matplotlib.pyplot as plt
# create some sample data
x = np.linspace(0, 1)
y = np.sin(4 * np.pi * x) * np.exp(-5 * x) * 120
fig, ax = plt.subplots()
# plot only the outline of the polygon, and capture the result
poly, = ax.fill(x, y, facecolor='none')
# get the extent of the axes
xmin, xmax = ax.get_xlim()
ymin, ymax = ax.get_ylim()
# create a dummy image
img_data = np.arange(ymin,ymax,(ymax-ymin)/100.)
img_data = img_data.reshape(img_data.size,1)
# plot and clip the image
im = ax.imshow(img_data, aspect='auto', origin='lower', cmap=plt.cm.Reds_r, extent=[xmin,xmax,ymin,ymax], vmin=y.min(), vmax=30.)
im.set_clip_path(poly)
The image is given an extent that stretches it over the entire axes. The clip path then makes it show up only where the fill polygon is drawn.
I think all you need is to do the plot of the data one at a time, like:
import numpy
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors
# Create fake data
x = numpy.linspace(0,4)
y = numpy.exp(x)
# Now plot one by one
bar_width = x[1] - x[0] # assuming x is linearly spaced
for pointx, pointy in zip(x,y):
    current_color = cm.jet(min(pointy/30, 1.0)) # colormap input saturates at 1.0, i.e. at y >= 30
    plt.bar(pointx, pointy, bar_width, color = current_color)
plt.show()
Resulting in:
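The per-bar loop can also be written without a loop. This is a sketch, not the answer's own method: it assumes a Normalize object with clip=True to saturate the color at 30, and a single plt.bar call that takes one color per bar. The Agg backend line is only there so the block runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import numpy as np

# Same fake data as above
x = np.linspace(0, 4)
y = np.exp(x)

cmap = plt.get_cmap("jet")
# Map 0..30 onto 0..1; clip=True sends everything above 30 to 1.0
norm = mcolors.Normalize(vmin=0, vmax=30, clip=True)
bar_colors = cmap(norm(y))  # one RGBA row per bar

bar_width = x[1] - x[0]  # x is linearly spaced
fig, ax = plt.subplots()
bars = ax.bar(x, y, bar_width, color=bar_colors)
```

Every bar with y at or above 30 gets the colormap's top color, and the bars below that follow the gradient.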
I am trying to make a 2D density plot (from some simulation data) with matplotlib. My x and y data are defined as the log10 of some quantities. How can I get logarithmic axes (with log minor ticks)?
Here is an example of my code:
import numpy as np
import matplotlib.pyplot as plt
Data = np.genfromtxt("data") # A 2-column data file
x = np.log10(Data[:,0])
y = np.log10(Data[:,1])
xmin = x.min()
xmax = x.max()
ymin = y.min()
ymax = y.max()
fig = plt.figure()
ax = fig.add_subplot(111)
hist = ax.hexbin(x,y,bins='log', gridsize=(30,30), cmap='Reds')
ax.axis([xmin, xmax, ymin, ymax])
plt.savefig('plot.pdf')
From the matplotlib.pyplot.hist docstring, it looks like there is a 'log' argument to set to 'True' if you want log scale on axis.
hist(x, bins=10, range=None, normed=False, cumulative=False,
bottom=None, histtype='bar', align='mid',
orientation='vertical', rwidth=None, log=False, **kwargs)
log:
If True, the histogram axis will be set to a log scale. If log is True and x is a 1D
array, empty bins will be filtered out and only the non-empty (n, bins, patches) will be
returned.
There is also a pyplot.loglog function to make a plot with log scaling on the x and y axis.
Thank you very much for suggestions.
Below, I attach my own solution. It is hardly "a minimum working example", but I have already stripped my script down quite a lot!
In a nutshell, I used imshow to plot the "image" (a 2D histogram with log bins) and I remove the axes. Then, I draw a second, empty (and transparent), plot, exactly on top of the first plot just to get log axes as imshow doesn't seem to allow it. Quite complicated if you ask me!
My code is probably far from optimal as I am new to python and matplotlib...
By the way, I don't use hexbin for two reasons:
1) It is too slow to run on very big data files like the kind I have.
2) With the version I use, the hexagons are slightly too large, i.e. they overlap, resulting in "pixels" of irregular shapes and sizes.
Also, I want to be able to write the histogram data into a file in text format.
#!/usr/bin/python
# How to get log axis with a 2D colormap (i.e. an "image") ??
#############################################################
#############################################################
import numpy as np
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import math
# Data file containing 2D data in log-log coordinates.
# The format of the file is 3 columns : x y v
# where v is the value to plotted for coordinate (x,y)
# x and y are already log values
# For instance, this can be a 2D histogram with log bins.
input_file="histo2d.dat"
# Parameters to set space for the plot ("bounding box")
x1_bb, y1_bb, x2_bb, y2_bb = 0.125, 0.12, 0.8, 0.925
# Parameters to set space for colorbar
cb_fraction=0.15
cb_pad=0.05
# Return unique values from a sorted list, will be required later
def uniq(seq, idfun=None):
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result
# Read data from file. The format of the file is 3 columns : x y v
# where v is the value to plotted for coordinate (x,y)
Data = np.genfromtxt(input_file)
x = Data[:,0]
y = Data[:,1]
v = Data[:,2]
# Determine x and y limits and resolution of data
x_uniq = np.array(uniq(np.sort(x)))
y_uniq = np.array(uniq(np.sort(y)))
x_resolution = x_uniq.size
y_resolution = y_uniq.size
x_interval_length = x_uniq[1]-x_uniq[0]
y_interval_length = y_uniq[1]-y_uniq[0]
xmin = x.min()
xmax = x.max()+0.5*x_interval_length
ymin = y.min()
ymax = y.max()+0.5*y_interval_length
# Reshape 1D data to turn it into a 2D "image"
v = v.reshape([x_resolution, y_resolution])
v = v[:,range(y_resolution-1,-1,-1)].transpose()
# Plot 2D "image"
# ---------------
# I use imshow which only work with linear axes.
# We will have to change the axes later...
axis_lim=[xmin, xmax, ymin, ymax]
fig = plt.figure()
ax = fig.add_subplot(111)
extent = [xmin, xmax, ymin, ymax]
img = plt.imshow(v, extent=extent, interpolation='nearest', cmap=cm.Reds, aspect='auto')
ax.axis(axis_lim)
# Make space for the colorbar
x2_bb_eff = (x2_bb-(cb_fraction+cb_pad)*x1_bb)/(1.0-(cb_fraction+cb_pad))
ax.set_position([x1_bb, y1_bb, x2_bb_eff-x1_bb, y2_bb-y1_bb])
position = ax.get_position()
# Remove axis ticks so that we can put log ticks on top
ax.set_xticks([])
ax.set_yticks([])
# Add colorbar
cb = fig.colorbar(img,fraction=cb_fraction,pad=cb_pad)
cb.set_label('Value [unit]')
# Add logarithmic axes
# --------------------
# Empty plot on top of previous one. Only used to add log axes.
ax = fig.add_subplot(111,frameon=False)
ax.set_xscale('log')
ax.set_yscale('log')
plt.plot([])
ax.set_position([x1_bb, y1_bb, x2_bb-x1_bb, y2_bb-y1_bb])
axis_lim_log = [10.**lim for lim in axis_lim] # a list, since map() returns an iterator in Python 3
ax.axis(axis_lim_log)
plt.grid(True, which='major', linewidth=1)
plt.ylabel('Some quantity [unit]')
plt.xlabel('Another quantity [unit]')
plt.show()
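For comparison, here is a sketch of a different technique from the overlay above: binning the raw (non-log) values with log-spaced edges and drawing them with pcolormesh, which, unlike imshow, can be combined with log-scaled axes. The lognormal sample data and the bin count of 30 are assumptions for illustration, and the Agg backend line is only there so the block runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
# Made-up strictly positive data standing in for the simulation output
x = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
y = rng.lognormal(mean=1.0, sigma=0.5, size=5000)

# 31 log-spaced edges give 30 bins per axis
x_edges = np.logspace(np.log10(x.min()), np.log10(x.max()), 31)
y_edges = np.logspace(np.log10(y.min()), np.log10(y.max()), 31)
counts, _, _ = np.histogram2d(x, y, bins=(x_edges, y_edges))

fig, ax = plt.subplots()
# histogram2d indexes [x_bin, y_bin]; pcolormesh wants rows along y
mesh = ax.pcolormesh(x_edges, y_edges, counts.T, cmap="Reds")
ax.set_xscale("log")
ax.set_yscale("log")
fig.colorbar(mesh, label="Count")
```

The counts array is an ordinary NumPy array, so it can be written to a text file with np.savetxt if the histogram data is needed outside the plot.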
The answer from @gcalmettes refers to pyplot.hist. The signature for pyplot.hexbin is a bit different:
hexbin(x, y, C = None, gridsize = 100, bins = None,
xscale = 'linear', yscale = 'linear',
cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None,
edgecolors='none', reduce_C_function = np.mean, mincnt=None, marginals=True,
**kwargs)
You are interested in the xscale parameter (and yscale, for the vertical axis):
*xscale*: [ 'linear' | 'log' ]
Use a linear or log10 scale on the horizontal axis.
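A minimal sketch of how these parameters might be used, assuming made-up lognormal data. Note that the raw, strictly positive values are passed rather than their log10, since hexbin takes the logarithm itself when xscale or yscale is 'log'. The Agg backend line is only there so the block runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
# Raw positive values, NOT pre-transformed with log10
x = rng.lognormal(size=2000)
y = rng.lognormal(size=2000)

fig, ax = plt.subplots()
# bins='log' gives a logarithmic color scale; xscale/yscale make the axes logarithmic
hb = ax.hexbin(x, y, gridsize=30, bins="log",
               xscale="log", yscale="log", cmap="Reds")
fig.colorbar(hb)
```

The axes then carry the usual log major and minor ticks, so no manual tick handling should be needed.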