How to get log axes for a density plot with matplotlib?

I am trying to make a 2D density plot (from some simulation data) with matplotlib. My x and y data are defined as the log10 of some quantities. How can I get logarithmic axes (with log minor ticks)?
Here is an example of my code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm # needed for cm.Reds below
Data = np.genfromtxt("data") # A 2-column data file
x = np.log10(Data[:,0])
y = np.log10(Data[:,1])
xmin = x.min()
xmax = x.max()
ymin = y.min()
ymax = y.max()
fig = plt.figure()
ax = fig.add_subplot(111)
hist = ax.hexbin(x,y,bins='log', gridsize=(30,30), cmap=cm.Reds)
ax.axis([xmin, xmax, ymin, ymax])
plt.savefig('plot.pdf')

From the matplotlib.pyplot.hist docstring, it looks like there is a log argument that you can set to True if you want a log scale on the axis.
hist(x, bins=10, range=None, normed=False, cumulative=False,
bottom=None, histtype='bar', align='mid',
orientation='vertical', rwidth=None, log=False, **kwargs)
log:
If True, the histogram axis will be set to a log scale. If log is True and x is a 1D
array, empty bins will be filtered out and only the non-empty (n, bins, patches) will be
returned.
There is also a pyplot.loglog function to make a plot with log scaling on the x and y axis.
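As a quick check of what that log flag does, here is a minimal sketch with made-up lognormal data (the sample and variable names are illustrative, not from the question). For a vertical histogram it should put the count axis on a log scale and leave the data axis linear, so on its own it does not give the log x and y axes asked for in the question.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=1.0, size=1000)  # illustrative data

fig, ax = plt.subplots()
# log=True affects the histogram (count) axis only
counts, edges, patches = ax.hist(sample, bins=20, log=True)

# Inspect which axis ended up logarithmic
print("x scale:", ax.get_xscale())
print("y scale:", ax.get_yscale())
```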

Thank you very much for suggestions.
Below, I include my own solution. It is hardly "a minimal working example", but I have already stripped my script down quite a lot!
In a nutshell, I use imshow to plot the "image" (a 2D histogram with log bins) and remove the axes. Then I draw a second, empty (and transparent) plot exactly on top of the first one, just to get log axes, as imshow doesn't seem to allow them. Quite complicated if you ask me!
My code is probably far from optimal, as I am new to Python and matplotlib...
By the way, I don't use hexbin for two reasons:
1) It is too slow to run on very big data files like the kind I have.
2) With the version I use, the hexagons are slightly too large, i.e. they overlap, resulting in "pixels" of irregular shapes and sizes.
Also, I want to be able to write the histogram data into a file in text format.
#!/usr/bin/python
# How to get log axis with a 2D colormap (i.e. an "image") ??
#############################################################
#############################################################
import numpy as np
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import math
# Data file containing 2D data in log-log coordinates.
# The format of the file is 3 columns : x y v
# where v is the value to be plotted for coordinate (x,y)
# x and y are already log values
# For instance, this can be a 2D histogram with log bins.
input_file="histo2d.dat"
# Parameters to set space for the plot ("bounding box")
x1_bb, y1_bb, x2_bb, y2_bb = 0.125, 0.12, 0.8, 0.925
# Parameters to set space for colorbar
cb_fraction=0.15
cb_pad=0.05
# Return unique values from a sorted list, will be required later
def uniq(seq, idfun=None):
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result
# Read data from file. The format of the file is 3 columns : x y v
# where v is the value to be plotted for coordinate (x,y)
Data = np.genfromtxt(input_file)
x = Data[:,0]
y = Data[:,1]
v = Data[:,2]
# Determine x and y limits and resolution of data
x_uniq = np.array(uniq(np.sort(x)))
y_uniq = np.array(uniq(np.sort(y)))
x_resolution = x_uniq.size
y_resolution = y_uniq.size
x_interval_length = x_uniq[1]-x_uniq[0]
y_interval_length = y_uniq[1]-y_uniq[0]
xmin = x.min()
xmax = x.max()+0.5*x_interval_length
ymin = y.min()
ymax = y.max()+0.5*y_interval_length
# Reshape 1D data to turn it into a 2D "image"
v = v.reshape([x_resolution, y_resolution])
v = v[:,range(y_resolution-1,-1,-1)].transpose()
# Plot 2D "image"
# ---------------
# I use imshow, which only works with linear axes.
# We will have to change the axes later...
axis_lim=[xmin, xmax, ymin, ymax]
fig = plt.figure()
ax = fig.add_subplot(111)
extent = [xmin, xmax, ymin, ymax]
img = plt.imshow(v, extent=extent, interpolation='nearest', cmap=cm.Reds, aspect='auto')
ax.axis(axis_lim)
# Make space for the colorbar
x2_bb_eff = (x2_bb-(cb_fraction+cb_pad)*x1_bb)/(1.0-(cb_fraction+cb_pad))
ax.set_position([x1_bb, y1_bb, x2_bb_eff-x1_bb, y2_bb-y1_bb])
position = ax.get_position()
# Remove axis ticks so that we can put log ticks on top
ax.set_xticks([])
ax.set_yticks([])
# Add colorbar
cb = fig.colorbar(img,fraction=cb_fraction,pad=cb_pad)
cb.set_label('Value [unit]')
# Add logarithmic axes
# --------------------
# Empty plot on top of previous one. Only used to add log axes.
ax = fig.add_subplot(111,frameon=False)
ax.set_xscale('log')
ax.set_yscale('log')
plt.plot([])
ax.set_position([x1_bb, y1_bb, x2_bb-x1_bb, y2_bb-y1_bb])
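# map() returns an iterator in Python 3, so build the list explicitly
axis_lim_log = [10.**lim for lim in axis_lim]
ax.axis(axis_lim_log)
# Pass the on/off flag positionally: the keyword was renamed from b to visible in newer Matplotlib
plt.grid(True, which='major', linewidth=1)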
plt.ylabel('Some quantity [unit]')
plt.xlabel('Another quantity [unit]')
plt.show()
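The two-layer imshow trick above can probably be avoided. This is a different technique from the one in the script: a minimal sketch that bins the raw (non-log) values with np.histogram2d on log-spaced edges and draws them with pcolormesh, which accepts non-uniform cell edges and works on log-scaled axes. The synthetic data, the bin counts and the in-memory text buffer are assumptions made for illustration; the last lines show one way to write the histogram out as text.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative raw data spanning several decades (not log10 values)
rng = np.random.default_rng(1)
n = 5000
x_raw = 10.0 ** rng.uniform(0.0, 3.0, n)
y_raw = 10.0 ** rng.uniform(0.0, 2.0, n)

# Log-spaced bin edges: 30 bins in x, 20 bins in y
x_edges = np.logspace(0.0, 3.0, 31)
y_edges = np.logspace(0.0, 2.0, 21)
counts, _, _ = np.histogram2d(x_raw, y_raw, bins=(x_edges, y_edges))

fig, ax = plt.subplots()
# histogram2d returns counts[x_bin, y_bin]; pcolormesh expects rows along y
mesh = ax.pcolormesh(x_edges, y_edges, counts.T, cmap="Reds")
ax.set_xscale("log")
ax.set_yscale("log")
fig.colorbar(mesh, ax=ax, label="Counts")

# Write the histogram as plain text (a file name works in place of the buffer)
text = io.StringIO()
np.savetxt(text, counts, fmt="%d")
```

Because the axes themselves are logarithmic, the usual log minor ticks come for free and no second transparent axes is needed.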

The answer from @gcalmettes refers to pyplot.hist. The signature for pyplot.hexbin is a bit different:
hexbin(x, y, C = None, gridsize = 100, bins = None,
xscale = 'linear', yscale = 'linear',
cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None,
edgecolors='none', reduce_C_function = np.mean, mincnt=None, marginals=True,
**kwargs)
You are interested in the xscale parameter (and yscale for the vertical axis):
*xscale*: [ 'linear' | 'log' ]
Use a linear or log10 scale on the horizontal axis.
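A minimal sketch of how those parameters might be used, assuming the raw positive values are passed to hexbin rather than their log10 (the lognormal sample here is made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative positive data; with xscale/yscale='log', hexbin takes the raw values
rng = np.random.default_rng(2)
x_raw = rng.lognormal(mean=2.0, sigma=1.0, size=5000)
y_raw = rng.lognormal(mean=1.0, sigma=0.5, size=5000)

fig, ax = plt.subplots()
hb = ax.hexbin(x_raw, y_raw, xscale="log", yscale="log",
               bins="log", gridsize=30, cmap="Reds")
fig.colorbar(hb, ax=ax, label="Counts (log color scale)")

# The axes are now logarithmic, so log minor ticks are drawn automatically
print(ax.get_xscale(), ax.get_yscale())
```

Note that bins='log' only affects the color scale; it is xscale and yscale that control the axes.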

Related

Multiple 2D histogram on same plot

I have this script to extract data from an image and roi. I have everything working perfectly except the end when I output the graphs. Basically I'm having trouble with the windowing of both histograms. It doesn't matter if I change the gridsize, mincount, figure size, or x and y limits one of the histograms will always be slightly stretched. When I plot them individually they aren't stretched. Is there a way to make the hexagons on the same plot a consistent "non-stretched" shape?
Down below is my graph and plotting methods. (I left out my data extraction methods because it was quite specialized).
plt.ion()
plt.figure(figsize=(16,8))
plt.title('2D Histogram of Entorhinal Cortex ROIs')
plt.xlabel(x_inputs)
plt.ylabel(y_inputs)
colors = ['Reds','Blues']
x = []
y= []
#image extraction code
hist1 = plt.hexbin(x[0],y[0], gridsize=100,cmap='Reds',mincnt=10, alpha=0.35)
hist2 = plt.hexbin(x[1],y[1], gridsize=100,cmap='Blues',mincnt=10, alpha=0.35)
plt.colorbar(hist1, orientation="vertical")
plt.colorbar(hist2, orientation="vertical")
plt.ioff()
plt.show()

This issue can be solved by setting limits for the bins with the extent parameter. This can be done automatically by computing the minimum and maximum x and y values across all the data being plotted. In cases where gridsize is small (e.g. 10), this approach may result in some of the bins being partially outside of the plot limits. If so, setting a margin with plt.margins can help display all the bins within the plot.
import numpy as np # v 1.20.2
import matplotlib.pyplot as plt # v 3.3.4
# Create a random dataset
rng = np.random.default_rng(seed=123) # random number generator
size = 10000
x1 = rng.normal(loc=5, scale=10, size=size)
y1 = rng.normal(loc=5, scale=2, size=size)
x2 = rng.normal(loc=-30, scale=5, size=size)
y2 = rng.normal(loc=-20, scale=5, size=size)
# Define hexbin grid extent
xmin = min(*x1, *x2)
xmax = max(*x1, *x2)
ymin = min(*y1, *y2)
ymax = max(*y1, *y2)
ext = (xmin, xmax, ymin, ymax)
# Draw figure with colorbars
plt.figure(figsize=(10, 6))
hist1 = plt.hexbin(x1, y1, gridsize=30, cmap='Reds', mincnt=10, alpha=0.3, extent=ext)
hist2 = plt.hexbin(x2, y2, gridsize=30, cmap='Blues', mincnt=10, alpha=0.3, extent=ext)
plt.colorbar(hist1, orientation='vertical')
plt.colorbar(hist2, orientation='vertical')
# plt.margins(0.1) # Uncomment this if hex bins are partially outside of plot limits
plt.show()
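Unpacking every element into min() and max() works but gets slow for large arrays. A small sketch of the same extent computation done with NumPy follows; shared_extent is a hypothetical helper name introduced here, not part of matplotlib.

```python
import numpy as np

def shared_extent(*datasets):
    """Return (xmin, xmax, ymin, ymax) covering every (x, y) pair passed in."""
    xs = np.concatenate([np.ravel(x) for x, _ in datasets])
    ys = np.concatenate([np.ravel(y) for _, y in datasets])
    # Convert to plain floats so the tuple can be passed around and printed cleanly
    return tuple(float(v) for v in (xs.min(), xs.max(), ys.min(), ys.max()))

# Two tiny illustrative datasets
ext = shared_extent(([0.0, 2.0], [1.0, 5.0]), ([-3.0, 1.0], [0.5, 4.0]))
print(ext)
```

The resulting tuple can be passed as extent=ext to both hexbin calls, as in the answer above.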

Python matplotlib polar coordinate is not plotting as it is supposed to be

I am plotting from a CSV file that contains Cartesian coordinates and I want to change it to Polar coordinates, then plot using the Polar coordinates.
Here is the code
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.read_csv('test_for_plotting.csv',index_col = 0)
x_temp = df['x'].values
y_temp = df['y'].values
df['radius'] = np.sqrt( np.power(x_temp,2) + np.power(y_temp,2) )
df['theta'] = np.arctan2(y_temp,x_temp)
df['degrees'] = np.degrees(df['theta'].values)
df['radians'] = np.radians(df['degrees'].values)
ax = plt.axes(polar = True)
ax.set_aspect('equal')
ax.axis("off")
sns.set(rc={'axes.facecolor':'white', 'figure.facecolor':'white','figure.figsize':(10,10)})
# sns.scatterplot(data = df, x = 'x',y = 'y', s= 1,alpha = 0.1, color = 'black',ax = ax)
sns.scatterplot(data = df, x = 'radians',y = 'radius', s= 1,alpha = 0.1, color = 'black',ax = ax)
plt.tight_layout()
plt.show()
Here is the dataset
If you run this code with polar = False and plot with sns.scatterplot(data = df, x = 'x',y = 'y', s= 1,alpha = 0.1, color = 'black',ax = ax), it results in this picture
After setting polar = True and plotting with sns.scatterplot(data = df, x = 'radians',y = 'radius', s= 1,alpha = 0.1, color = 'black',ax = ax), it is supposed to give this
But it does not work: when you run the actual code, the shape in the polar plot is the same as in the Cartesian one, which does not make sense and does not match the picture I showed for polar. (If you are wondering where I got the second picture from, I plotted it using R.)
I would appreciate your help and insights and thanks in advance!

For a polar plot, the "x-axis" represents the angle in radians. So, you need to switch x and y, and convert the angles to radians (I also added ax=ax, as the axes was created explicitly):
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
data = {'radius': [0, 0.5, 1, 1.5, 2, 2.5], 'degrees': [0, 25, 75, 155, 245, 335]}
df_temp = pd.DataFrame(data)
ax = plt.axes(polar=True)
sns.scatterplot(x=np.radians(df_temp['degrees']), y=df_temp['radius'].to_numpy(),
                s=100, alpha=1, color='black', ax=ax)
for deg, y in zip(df_temp['degrees'], df_temp['radius']):
    x = np.radians(deg)
    ax.axvline(x, color='skyblue', ls=':')
    ax.text(x, y, f' {deg}', color='crimson')
ax.set_rlabel_position(-15) # Move radial labels away from plotted dots
plt.tight_layout()
plt.show()
About your new question: if you have an xy plot, convert those xy values to polar coordinates, and then plot them on a polar plot, you'll get the same plot again.
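The reason is that the conversion is just a change of coordinates for the same points. A minimal numerical check with made-up random points (not the dataset from the question):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100)
y = rng.normal(size=100)

# Cartesian -> polar, as in the question
radius = np.hypot(x, y)
theta = np.arctan2(y, x)

# Polar -> Cartesian: this is effectively where a polar axes draws each point
x_back = radius * np.cos(theta)
y_back = radius * np.sin(theta)

same_x = np.allclose(x, x_back)
same_y = np.allclose(y, y_back)
print(same_x, same_y)
```

Every point lands where it started, so the polar plot shows the same shape as the Cartesian one.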
After some more testing with the data, I decided to create the plot directly with matplotlib, as seaborn makes some changes that don't have exactly equal effects across seaborn and matplotlib versions.
What seems to be happening in R:
The angles (given by "x") are spread out to fill the range (0, 2 pi). This requires either rescaling x or changing how the x-values are mapped to angles. One way to get this is to subtract the minimum, then divide the result by the new maximum and multiply by 2 pi.
The 0 of the angles is at the top, and the angles go clockwise.
The following code should create the plot with Python. You might want to experiment with alpha and with s in the scatter plot options. (By default the scatter dots get an outline, which often isn't desired when working with very small dots; it can be removed with lw=0.)
ax = plt.axes(polar=True)
ax.set_aspect('equal')
ax.axis('off')
x_temp = df['x'].to_numpy()
y_temp = df['y'].to_numpy()
x_temp = x_temp - x_temp.min() # not in place, so the values in df['x'] are left untouched
x_temp = x_temp / x_temp.max() * 2 * np.pi
ax.scatter(x=x_temp, y=y_temp, s=0.05, alpha=1, color='black', lw=0)
ax.set_rlim(y_temp.min(), y_temp.max())
ax.set_theta_zero_location("N") # set zero at the north (top)
ax.set_theta_direction(-1) # go clockwise
plt.show()
At the left the resulting image, at the right using the y-values for coloring (ax.scatter(..., c=y_temp, s=0.05, alpha=1, cmap='plasma_r', lw=0)):

python matplotlib with a line color gradient and colorbar

I've been toying around with this problem and am close to what I want but missing that extra line or two.
Basically, I'd like to plot a single line whose color changes given the value of a third array. Lurking around I have found this works well (albeit pretty slowly) and represents the problem
import numpy as np
import matplotlib.pyplot as plt
c = np.arange(1,100)
x = np.arange(1,100)
y = np.arange(1,100)
cm = plt.get_cmap('hsv')
fig = plt.figure(figsize=(5,5))
ax1 = plt.subplot(111)
no_points = len(c)
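# set_color_cycle was removed in later Matplotlib releases; set_prop_cycle(color=...) replaces it
ax1.set_prop_cycle(color=[cm(1.*i/(no_points-1))
                          for i in range(no_points-1)])
for i in range(no_points-1):
    bar = ax1.plot(x[i:i+2],y[i:i+2])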
plt.show()
Which gives me this:
I'd like to be able to include a colorbar along with this plot. So far I haven't been able to crack it just yet. Potentially there will be other lines included with different x,y's but the same c, so I was thinking that a Normalize object would be the right path.
The bigger picture is that this plot is part of a 2x2 subplot grid. I am already making space for the colorbar axes object with matplotlib.colorbar.make_axes(ax4), where ax4 is the 4th subplot.

Take a look at the multicolored_line example in the Matplotlib gallery and dpsanders' colorline notebook:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.collections as mcoll
def multicolored_lines():
    """
    http://nbviewer.ipython.org/github/dpsanders/matplotlib-examples/blob/master/colorline.ipynb
    http://matplotlib.org/examples/pylab_examples/multicolored_line.html
    """
    x = np.linspace(0, 4. * np.pi, 100)
    y = np.sin(x)
    fig, ax = plt.subplots()
    lc = colorline(x, y, cmap='hsv')
    plt.colorbar(lc)
    plt.xlim(x.min(), x.max())
    plt.ylim(-1.0, 1.0)
    plt.show()
def colorline(
        x, y, z=None, cmap='copper', norm=plt.Normalize(0.0, 1.0),
        linewidth=3, alpha=1.0):
    """
    http://nbviewer.ipython.org/github/dpsanders/matplotlib-examples/blob/master/colorline.ipynb
    http://matplotlib.org/examples/pylab_examples/multicolored_line.html
    Plot a colored line with coordinates x and y
    Optionally specify colors in the array z
    Optionally specify a colormap, a norm function and a line width
    """
    # Default colors equally spaced on [0,1]:
    if z is None:
        z = np.linspace(0.0, 1.0, len(x))
    # Special case if a single number:
    # to check for numerical input -- this is a hack
    if not hasattr(z, "__iter__"):
        z = np.array([z])
    z = np.asarray(z)
    segments = make_segments(x, y)
    lc = mcoll.LineCollection(segments, array=z, cmap=cmap, norm=norm,
                              linewidth=linewidth, alpha=alpha)
    ax = plt.gca()
    ax.add_collection(lc)
    return lc
def make_segments(x, y):
    """
    Create list of line segments from x and y coordinates, in the correct format
    for LineCollection: an array of the form numlines x (points per line) x 2 (x
    and y) array
    """
    points = np.array([x, y]).T.reshape(-1, 1, 2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)
    return segments
multicolored_lines()
Note that calling plt.plot hundreds of times tends to kill performance.
Using a LineCollection to build multi-colored line segments is much much faster.
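The question also mentions several lines with different x, y values but the same c, sharing one colorbar. A minimal sketch of that case, assuming a single Normalize built from c and reused by every LineCollection; the helper segments_from and the sine/cosine data are illustrative, not from the question.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

def segments_from(x, y):
    """Build an (n-1, 2, 2) array of consecutive point pairs for LineCollection."""
    points = np.column_stack([x, y]).reshape(-1, 1, 2)
    return np.concatenate([points[:-1], points[1:]], axis=1)

c = np.linspace(0.0, 10.0, 200)          # the shared "third array"
x = np.linspace(0.0, 4.0 * np.pi, 200)
norm = plt.Normalize(c.min(), c.max())   # one norm shared by all lines

fig, ax = plt.subplots()
for y in (np.sin(x), np.cos(x)):
    lc = LineCollection(segments_from(x, y), cmap="hsv", norm=norm, linewidth=3)
    lc.set_array(c[:-1])                 # one color value per segment
    ax.add_collection(lc)

# add_collection does not autoscale the view, so set the limits by hand
ax.set_xlim(x.min(), x.max())
ax.set_ylim(-1.1, 1.1)
# Any one collection can drive the colorbar because they all share norm and cmap
cbar = fig.colorbar(lc, ax=ax, label="c")
```

In a subplot grid, the ax argument of fig.colorbar (or cax, for an axes created beforehand) controls where the colorbar goes.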

Create with imshow the same plot as pcolormesh [duplicate]

I am creating a histogram for my data. Interestingly, when I plot my raw data and their histogram together on one plot, they are a "y-flipped" version of each other as follows:
I failed to find out the reason and fix it. My code snippet is as follows:
import math as mt
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm # needed for cm.binary below
x = np.random.randn(50)
y = np.random.randn(50)
w = np.random.randn(50)
leftBound, rightBound, topBound, bottomBound = min(x), max(x), max(y), min(y)
# parameters for histogram
x_edges = np.linspace(int(mt.floor(leftBound)), int(mt.ceil(rightBound)), int(mt.ceil(rightBound))-int(mt.floor(leftBound))+1)
y_edges = np.linspace(int(mt.floor(bottomBound)), int(mt.ceil(topBound)), int(mt.ceil(topBound))-int(mt.floor(bottomBound))+1)
# construct the histogram
# the normed argument was removed from recent NumPy versions; the default is already unnormalized
wcounts = np.histogram2d(x, y, bins=(x_edges, y_edges), weights=w)[0]
# wcounts is a 2D array, with each element representing the weighted count in a bin
# show histogram
extent = x_edges[0], x_edges[-1], y_edges[0], y_edges[-1]
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # left, bottom, width, height (range 0 to 1)
axes.set_xlabel('x (m)')
axes.set_ylabel('y (m)')
histogram = axes.imshow(np.transpose(wcounts), extent=extent, alpha=1, vmin=0.5, vmax=5, cmap=cm.binary) # alpha controls the transparency
fig.colorbar(histogram)
# show data
axes.plot(x, y, color = '#99ffff')
Since the data here are generated randomly for demonstration, I don't think it helps much, if the problem is with that particular data set. But anyway, if it is something wrong with the code, it still helps.

By default, axes.imshow(z) places array element z[0,0] in the top left corner of the axes (or the extent in this case). You probably want to either add the origin="lower" argument to your imshow() call (the accepted values are "upper" and "lower") or pass a vertically flipped array, i.e., z[::-1] for the array handed to imshow().
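A minimal sketch of the origin fix, using a single made-up point so the orientation is easy to verify. The point sits in the first x bin and the last y bin, so it should appear in the top left cell of the image, under the plotted marker.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# One illustrative point: low x, high y
x = np.array([0.5])
y = np.array([2.5])
edges = np.array([0.0, 1.0, 2.0, 3.0])
counts, x_edges, y_edges = np.histogram2d(x, y, bins=(edges, edges))

extent = (x_edges[0], x_edges[-1], y_edges[0], y_edges[-1])
fig, ax = plt.subplots()
# Transpose so rows run along y, and put row 0 at the bottom of the axes
im = ax.imshow(counts.T, extent=extent, origin="lower", cmap="binary")
ax.plot(x, y, "o", color="#99ffff")  # raw data marker on top of the filled cell
```

With the default origin the same array would be drawn upside down relative to the marker, which is the "y-flipped" effect described in the question.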
