How to evenly spread annotation imageboxes around a scatterplot?

I would like to annotate a scatterplot with images corresponding to each datapoint. With standard parameters the images end up clashing with each other and with other important features such as the legend, the axes, etc. Thus, I would like the images to form a circle or a rectangle around the main scatter plot.
My code looks like this for now, and I am struggling to modify it to organise the images around the centre point of the plot.
import matplotlib.cbook as cbook
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
import seaborn as sns
# Generate n points around a 2D circle
def generate_circle_points(n, centre_x, center_y, radius=1):
    """Generate n points around a circle.
    Args:
        n (int): Number of points to generate.
        centre_x (float): x-coordinate of circle centre.
        center_y (float): y-coordinate of circle centre.
        radius (float): Radius of circle.
    Returns:
        list: List of points.
    """
    points = []
    for i in range(n):
        angle = 2 * np.pi * i / n
        x = centre_x + radius * np.cos(angle)
        y = center_y + radius * np.sin(angle)
        points.append([x, y])
    return points
fig, ax = plt.subplots(1, 1, figsize=(7.5, 7.5))
data = pd.DataFrame(data={'x': np.random.uniform(0.5, 2.5, 20),
                          'y': np.random.uniform(10000, 50000, 20)})
with cbook.get_sample_data('grace_hopper.jpg') as image_file:
    image = plt.imread(image_file)
# Set logarithmic scale for x and y axis
ax.set(xscale="log", yscale='log')
# Add grid
ax.grid(True, which='major', ls="--", c='gray')
coordinates = generate_circle_points(n=len(data),
                                     centre_x=0, center_y=0, radius=10)
# Plot the scatter plot
scatter = sns.scatterplot(data=data, x='x', y='y', ax=ax)
for index, row in data.iterrows():
    imagebox = OffsetImage(image, zoom=0.05)
    imagebox.image.axes = ax
    xy = np.array([row['x'], row['y']])
    xybox = np.array(coordinates[index])
    ab = AnnotationBbox(imagebox, xy,
                        xycoords='data',
                        boxcoords="offset points",
                        xybox=xybox,
                        pad=0)
    ax.add_artist(ab)
At the moment the output looks like this: [image]
Ideally, I would like the output to look something like this: [image]
Many thanks in advance for your help.

Not an answer but a long comment:
You can control the location of the arrows, but sometimes it is easier to export figures as SVGs and edit them in Adobe Illustrator or Inkscape.
R has a dodge argument which is really nice, but even then is not always perfect. Solutions in Python exist but are laborious.
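As a rough illustration of one such Python solution, the sketch below converts each point to axes-fraction coordinates, sorts the points by angle around the axes centre, and gives them evenly spaced slots on a ring, using boxcoords="axes fraction" rather than "offset points". The helper names ring_positions and annotate_on_ring, the ring radius of 0.62 and the grey placeholder image are assumptions made for illustration. Matching by angle tends to reduce crossing connector lines but does not guarantee there are none.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox


def ring_positions(n, centre=(0.5, 0.5), radius=0.62):
    """Return n evenly spaced (x, y) slots on a circle, in axes-fraction units."""
    angles = 2 * np.pi * np.arange(n) / n
    return np.column_stack([centre[0] + radius * np.cos(angles),
                            centre[1] + radius * np.sin(angles)])


def annotate_on_ring(ax, xs, ys, image, zoom=0.3, radius=0.62):
    """Attach one image box per point, placed on a ring around the axes (hypothetical helper)."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    # Reading the limits makes sure any pending autoscaling has been applied
    ax.get_xlim(), ax.get_ylim()
    # Data -> display -> axes fraction (this also handles log scales)
    display = ax.transData.transform(np.column_stack([xs, ys]))
    frac = ax.transAxes.inverted().transform(display)
    # Sort points by their angle around the axes centre
    point_angles = np.arctan2(frac[:, 1] - 0.5, frac[:, 0] - 0.5) % (2 * np.pi)
    order = np.argsort(point_angles)
    ring = ring_positions(len(xs), radius=radius)
    boxes = []
    for slot, idx in enumerate(order):
        ab = AnnotationBbox(OffsetImage(image, zoom=zoom),
                            (xs[idx], ys[idx]),
                            xybox=tuple(ring[slot]),
                            xycoords="data",
                            boxcoords="axes fraction",
                            pad=0,
                            arrowprops=dict(arrowstyle="-", color="gray"))
        ax.add_artist(ab)
        boxes.append(ab)
    return boxes


rng = np.random.default_rng(0)
x = rng.uniform(0.5, 2.5, 20)
y = rng.uniform(10000, 50000, 20)
image = np.full((64, 64, 3), 0.5)  # placeholder image, stands in for the real pictures

fig, ax = plt.subplots(figsize=(7.5, 7.5))
# Leave wide margins so the ring of boxes fits inside the figure
fig.subplots_adjust(left=0.2, right=0.8, bottom=0.2, top=0.8)
ax.set(xscale="log", yscale="log")
ax.scatter(x, y)
boxes = annotate_on_ring(ax, x, y, image)
```

The limits should be final before annotate_on_ring is called, since the angle ordering is computed from the current view.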
The major issue with manual editing is that it needs to be done last, as later alterations to the plot would make it problematic. A few points need mentioning.
Your figures will have to have a fixed size (57 mm / 121 mm / 184 mm for Science, 83 mm / 171 mm for RSC, 83 mm / 178 mm for ACS, etc.). If you need to scale the figure in Illustrator, keep note of the scaling factor by adding it as a textbox outside of the canvas, as the underlying plot will need to be replaced at least once due to Murphy's law. Exporting the SVG at the right size is ideal. Sounds silly, but it helps. Likewise, make sure the font size does not go under the minimum spec (7-9 points).
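A minimal sketch of exporting at a fixed physical size, assuming an 83 mm square figure and 7-point tick labels (both picked from the ranges above purely as an example). The helper name figure_in_mm is hypothetical; matplotlib itself takes figure sizes in inches.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere
import matplotlib.pyplot as plt

MM_PER_INCH = 25.4


def figure_in_mm(width_mm, height_mm, **kwargs):
    """Create a figure whose canvas has the given size in millimetres (hypothetical helper)."""
    figsize = (width_mm / MM_PER_INCH, height_mm / MM_PER_INCH)
    return plt.subplots(figsize=figsize, **kwargs)


fig, ax = figure_in_mm(83, 83)
ax.plot([0, 1], [0, 1])
ax.tick_params(labelsize=7)  # keep text at or above the journal minimum

# Write the SVG to memory here; pass a filename instead to save to disk
buffer = io.BytesIO()
fig.savefig(buffer, format="svg")
svg_text = buffer.getvalue().decode("utf-8")
```

Avoid bbox_inches="tight" when the exact size matters, since it crops the canvas and changes the exported dimensions.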

Related

Matplotlib using Wedge() in polar plots

TL/DR: How to use Wedge() in polar coordinates?
I'm generating a 2D histogram plot in polar coordinates (r, theta). At various values of r there can be different numbers of theta values (to preserve equal-area bins). To draw the color-coded bins I'm currently using pcolormesh() calls for each radial ring. This works OK, but near the center of the plot, where there may be only 3 bins (each 120 degrees "wide" in theta space), pcolormesh() draws triangles that don't "sweep" out a full arc (it just connects the two outer arc points with a straight line).
I've found a workaround using ax.bar() call, one for each radial ring and passing in arrays of theta values (each bin rendering as an individual bar). But when doing 90 rings with 3 to 360 theta bins in each, it's incredibly slow (minutes).
I tried using Wedge() patches, but can't get them to render correctly in the polar projection. Here is sample code showing both approaches:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Wedge
from matplotlib.collections import PatchCollection
# Theta coordinates in degrees
theta1=45
theta2=80
# Radius coordinates
r1 = 0.4
r2 = 0.5
# Plot using bar()
fig, ax = plt.subplots(figsize=[6,6], subplot_kw={'projection': 'polar'})
theta_mid = np.deg2rad((theta1 + theta2)/2)
theta_width = np.deg2rad(theta2 - theta1)
height = r2 - r1
ax.bar(x=theta_mid, height = height, width=theta_width, bottom=r1)
ax.set_rlim(0, 1)
plt.savefig('bar.png')
# Plot using Wedge()
fig, ax = plt.subplots(figsize=[6,6], subplot_kw={'projection': 'polar'})
patches = []
patches.append( Wedge(center=(0, 0), r = r1, theta1=theta1, theta2=theta2, width = r2-r1, color='blue'))
p = PatchCollection(patches)
ax.add_collection(p)
ax.set_rlim(0, 1)
plt.savefig('wedge.png')
The outputs of each are:
[image: bar() output]
[image: Wedge() output]
I've tried using radians for the wedge (because polar plots usually want their angle values in radians). That didn't help.
Am I missing something in how I'm using the Wedge? If I add thousands of Wedges to my Patch collection should I have any expectation it will be faster than bar()?

Thinking this was an actual bug, I opened this issue https://github.com/matplotlib/matplotlib/issues/22717 on matplotlib where one of the maintainers nicely pointed out that I should be using Rectangle() instead of Wedge().
The solution they provided is
from matplotlib.patches import Rectangle
fig, ax = plt.subplots(figsize=[6,6], subplot_kw={'projection': 'polar'})
p = PatchCollection([Rectangle((np.deg2rad(theta1), r1), theta_width, height, color='blue')])
ax.add_collection(p)
ax.set_rlim(0, 1)
plt.savefig('wedge.png')
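Extending that suggestion to a whole histogram might look like the sketch below: every bin becomes a Rectangle whose lower-left corner is (theta, r), with width in radians and height in radial units, and a single PatchCollection carries the colors through set_array. The ring edges, the bin counts per ring and the random values are made-up illustration data, and ring_rectangles is a hypothetical helper. How smoothly wide bins near the centre render is worth checking on your matplotlib version.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from matplotlib.collections import PatchCollection


def ring_rectangles(r_edges, bins_per_ring):
    """Build one Rectangle per (ring, theta bin) in polar (theta, r) coordinates."""
    patches = []
    for r_in, r_out, n_bins in zip(r_edges[:-1], r_edges[1:], bins_per_ring):
        width = 2 * np.pi / n_bins  # angular width of each bin, in radians
        for k in range(n_bins):
            patches.append(Rectangle((k * width, r_in), width, r_out - r_in))
    return patches


# Illustration data: 4 rings with 3, 6, 12 and 24 theta bins
r_edges = np.linspace(0.0, 1.0, 5)
bins_per_ring = [3, 6, 12, 24]
patches = ring_rectangles(r_edges, bins_per_ring)
values = np.random.default_rng(0).random(len(patches))

fig, ax = plt.subplots(figsize=(6, 6), subplot_kw={"projection": "polar"})
collection = PatchCollection(patches, cmap="viridis")
collection.set_array(values)  # one value per patch drives the colormap
ax.add_collection(collection)
ax.set_rlim(0, 1)
```

All bins go into one collection, so there is a single artist to draw instead of one bar() call per ring.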

Drawing log-linear plot on a square plot area in matplotlib

I would like to draw a plot with a logarithmic y axis and a linear x axis on a square plot area in matplotlib. I can draw linear-linear as well as log-log plots on squares, but the method I use, Axes.set_aspect(...), is not implemented for log-linear plots. Is there a good workaround?
linear-linear plot on a square:
from pylab import *
x = linspace(1,10,1000)
y = sin(x)**2+0.5
plot (x,y)
ax = gca()
data_aspect = ax.get_data_ratio()
ax.set_aspect(1./data_aspect)
show()
log-log plot on a square:
from pylab import *
x = linspace(1,10,1000)
y = sin(x)**2+0.5
plot (x,y)
ax = gca()
ax.set_yscale("log")
ax.set_xscale("log")
xmin,xmax = ax.get_xbound()
ymin,ymax = ax.get_ybound()
data_aspect = (log(ymax)-log(ymin))/(log(xmax)-log(xmin))
ax.set_aspect(1./data_aspect)
show()
But when I try this with a log-linear plot, I do not get the square area, but a warning instead:
from pylab import *
x = linspace(1,10,1000)
y = sin(x)**2+0.5
plot (x,y)
ax = gca()
ax.set_yscale("log")
xmin,xmax = ax.get_xbound()
ymin,ymax = ax.get_ybound()
data_aspect = (log(ymax)-log(ymin))/(xmax-xmin)
ax.set_aspect(1./data_aspect)
show()
yielding the warning:
axes.py:1173: UserWarning: aspect is not supported for Axes with xscale=linear, yscale=log
Is there a good way of achieving square log-linear plots despite the lack of support in Axes.set_aspect?

Well, there is a sort of a workaround. The actual axes area (the area where the plot is, not including external ticks etc.) can be resized to any size you want it to have.
You may use ax.set_position to set the size and position of the plot relative to the figure. In order to use it in your case we need a bit of maths:
from pylab import *
x = linspace(1,10,1000)
y = sin(x)**2+0.5
plot (x,y)
ax = gca()
ax.set_yscale("log")
# now get the figure size in real coordinates:
fig = gcf()
fwidth = fig.get_figwidth()
fheight = fig.get_figheight()
# get the axis size and position in relative coordinates
# this gives a BBox object
bb = ax.get_position()
# calculate them into real world coordinates
axwidth = fwidth * (bb.x1 - bb.x0)
axheight = fheight * (bb.y1 - bb.y0)
# if the axis is wider than tall, then it has to be narrower
if axwidth > axheight:
    # calculate the narrowing relative to the figure
    narrow_by = (axwidth - axheight) / fwidth
    # move bounding box edges inwards the same amount to give the correct width
    bb.x0 += narrow_by / 2
    bb.x1 -= narrow_by / 2
# else if the axis is taller than wide, make it vertically smaller
# works the same as above
elif axheight > axwidth:
    shrink_by = (axheight - axwidth) / fheight
    bb.y0 += shrink_by / 2
    bb.y1 -= shrink_by / 2
ax.set_position(bb)
show()
A slight stylistic comment: from pylab import * is not usually recommended. The lore goes:
import matplotlib.pyplot as plt
pylab is an odd mixture of numpy and matplotlib imports, created to make interactive IPython use easier. (I use it, too.)
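On newer matplotlib releases there may be a shorter route. Axes.set_box_aspect (added in matplotlib 3.3, so this sketch assumes at least that version) fixes the height-to-width ratio of the axes box itself, independent of the data scales. The figure size of 8 x 5 inches is an arbitrary choice for the example.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(1, 10, 1000)
y = np.sin(x) ** 2 + 0.5

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x, y)
ax.set_yscale("log")
# Ask for a square axes box regardless of the linear/log scales
ax.set_box_aspect(1)

# The box is resized at draw time; measure it afterwards (in pixels)
fig.canvas.draw()
bbox = ax.get_window_extent()
```

Unlike the set_position arithmetic, this keeps the box square when the figure is resized later.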

Extent and aspect; square pixels in an image with shared axis in matplotlib

I am stuck in a rather complicated situation. I am plotting some data as an image with imshow(). Unfortunately my script is long and a little messy, so it is difficult to make a working example, but I am showing the key steps. This is how I get the data for my image from a bigger array, written in a file:
data = np.tril(np.loadtxt('IC-heatmap-20K.mtx'), 1)
#
#Here goes lots of other stuff, where I define start and end
#
chrdata = data[start:end, start:end]
chrdata = ndimage.rotate(chrdata, 45, order=0, reshape=True,
prefilter=False, cval=0)
ax1 = host_subplot(111)
#I don't really need host_subplot() in this case, I could use something more common;
#It is just divider.append_axes("bottom", ...) is really convenient.
plt.imshow(chrdata, origin='lower', interpolation='none',
extent=[0, length*resolution, 0, length*resolution]) #resolution=20000
So the values I am interested in all lie in a triangle whose apex is in the middle of the top side of a square. At the same time I plot some data (lots of coloured lines in this case) along with the image, near its bottom.
At first this looks OK, but it actually is not: the pixels in the image are not square, but elongated, with their height greater than their width. This is how they look if I zoom in:
This doesn't happen if I don't set extent when calling imshow(), but I need it so that the coordinates in the image and in the other plots (coloured lines at the bottom in this case) are identical (see Converting coordinates of a picture in matplotlib?).
I tried to fix it using aspect. That fixed the pixels' shape, but I got a really weird picture:
The thing is, later in the code I explicitly set this:
ax1.set_ylim(0*resolution, length*resolution) #resolution=20000
But after setting aspect I get completely different y limits. And the worst thing: ax1 is now wider than the axes of the other plot at the bottom, so their coordinates do not match anymore! I add it in this way:
axPlotx = divider.append_axes("bottom", size=0.1, pad=0, sharex=ax1)
I would really appreciate help with getting this fixed: square pixels, and identical coordinates in two (or more, in other cases) plots. As I see it, the axes of the image need to become wider (as aspect does), the ylims should apply, and the width of the second axes should be identical to the image's.
Thanks for reading this probably unclear explanation; please let me know if I should clarify anything.
UPDATE
As suggested in the comments, I tried to use
ax1.set(adjustable='box-forced')
And it did help with the image itself, but it caused the two axes to be separated by white space. Is there any way to keep them close to each other?

I re-edited my entire answer, as I found the solution to your problem. I solved it using the set_adjustable("box-forced") option, as suggested in the comment by tcaswell.
import numpy
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import host_subplot, make_axes_locatable
#Calculate aspect ratio
def determine_aspect(shape, extent):
    dx = (extent[1] - extent[0]) / float(shape[1])
    dy = (extent[3] - extent[2]) / float(shape[0])
    return dx / dy
data = numpy.random.random((30,60))
shape = data.shape
extent = [-10, 10, -20, 20]
x_size, y_size = 6, 6
fig = plt.figure(figsize = (x_size, y_size))
ax = host_subplot(1, 1, 1)
ax.imshow(data, extent = extent, interpolation = "None", aspect = determine_aspect(shape, extent))
#Determine width and height of the subplot frame
bbox = ax.get_window_extent().transformed(fig.dpi_scale_trans.inverted())
width, height = bbox.width, bbox.height
#Calculate distance, the second plot needs to be elevated by
padding = (y_size - (height - width)) / float(1 / (2. * determine_aspect(shape, extent)))
#Create second image in subplot with shared x-axis
divider = make_axes_locatable(ax)
axPlotx = divider.append_axes("bottom", size = 0.1, pad = -padding, sharex = ax)
#Turn off yticks for axPlotx and xticks for ax
axPlotx.set_yticks([])
plt.setp(ax.get_xticklabels(), visible=False)
#Make the plot obey the frame
ax.set_adjustable("box-forced")
fig.savefig("test.png", dpi=300, bbox_inches = "tight")
plt.show()
This results in the following image where the x-axis is shared:
Hope that helps!
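To see why dx / dy is the right value to pass as aspect, the sketch below reuses determine_aspect on the same 30 x 60 array and extent, draws the image, and measures one image pixel in display units. It leaves out the shared bottom axes and the adjustable setting, so it only checks the square-pixel part of the answer.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere
import numpy as np
import matplotlib.pyplot as plt


def determine_aspect(shape, extent):
    """Aspect that makes image pixels square for a given array shape and extent."""
    dx = (extent[1] - extent[0]) / shape[1]  # data width of one image pixel
    dy = (extent[3] - extent[2]) / shape[0]  # data height of one image pixel
    return dx / dy


data = np.random.default_rng(0).random((30, 60))
extent = [-10, 10, -20, 20]
aspect = determine_aspect(data.shape, extent)

fig, ax = plt.subplots(figsize=(6, 6))
ax.imshow(data, extent=extent, interpolation="none", aspect=aspect)
fig.canvas.draw()  # the aspect is applied at draw time

# Size of one image pixel on screen, measured through the data transform
dx = (extent[1] - extent[0]) / data.shape[1]
dy = (extent[3] - extent[2]) / data.shape[0]
corners = ax.transData.transform([(extent[0], extent[2]),
                                  (extent[0] + dx, extent[2] + dy)])
pixel_width, pixel_height = np.abs(corners[1] - corners[0])
```

After the draw, pixel_width and pixel_height should agree, because one y data unit is displayed aspect times as long as one x data unit.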

How do I force scatter points real pixel values when plotting in pyplot/python?

I've taken an image and extracted some features from it using OpenCv. I'd like to replot those points and their respective areas (which are real pixel values) into a scatter window and then save it. Unfortunately, when I plot the points, they resize to stay more visible. If I zoom in they resize. I'd like to save the whole figure retaining the actual ratio of pixel (x,y) coordinates to size of points plotted.
For instance:
import matplotlib.pyplot as plt
x=[5000,10000,20000]
y=[20000,10000,5000]
area_in_pixels=[100,200,100]
plt.scatter(x,y,s=area_in_pixels)
I would like this to produce tiny dots on the image. They should span like 10 xy units. However, the dots it produces are large, and appear to span 1000 xy units.
I've tried resizing the image with:
plt.figure(figsize=(10,10))
This seems to resize the points relative to their position a little, but I'm not sure what scale I would select to make this accurate. DPI settings on plt.savefig seem to make the saved image larger but don't appear to alter relative spot sizes.
Asked another way, is there another way to relate the s which is in points^2 to a real number or to the units of the x-y axis?

You can use patches to create markers sized relative to the data coordinates.
import matplotlib.pyplot as plt
from matplotlib.patches import Circle
xData=[5000,10000,20000, 15000]
yData=[20000,10000,5000, 15000]
radius_in_pixels=[100,200,100, 1000] # Circle takes radius as an argument. You could convert from area.
fig = plt.figure()
ax = fig.add_subplot(111, aspect='equal')
for x, y, r in zip(xData, yData, radius_in_pixels):
    ax.add_artist(Circle(xy=(x, y), radius = r))
plt.xlim(0, max(xData) + 200)
plt.ylim(0, max(yData) + 200)
plt.show()
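The conversion from area mentioned in the comment above is r = sqrt(area / pi). A small sketch using the areas from the question, with the circles gathered into one PatchCollection; the axis limits of 25000 are an arbitrary choice that covers the sample points, and area_to_radius is a hypothetical helper name.

```python
import math

import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.patches import Circle
from matplotlib.collections import PatchCollection


def area_to_radius(area):
    """Radius of a circle with the given area, in the same length units."""
    return math.sqrt(area / math.pi)


x = [5000, 10000, 20000]
y = [20000, 10000, 5000]
area_in_pixels = [100, 200, 100]
radii = [area_to_radius(a) for a in area_in_pixels]

fig, ax = plt.subplots()
ax.set_aspect("equal")  # equal data units on both axes keep the circles round
circles = PatchCollection([Circle((xi, yi), r) for xi, yi, r in zip(x, y, radii)])
ax.add_collection(circles)
ax.set_xlim(0, 25000)
ax.set_ylim(0, 25000)
```

Because the radii are in data units, the circles stay the same size relative to the axes when you zoom or resize, which also means areas this small will be barely visible at full view.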

matplotlib and aspect ratio of geographical-data plots

I process geographical information and present the results using matplotlib. All input is latitude/longitude [degree]. I convert into x/y [meter] for my calculations, and I present my results in latitude/longitude. The problem is getting the graphs' aspect ratio right: all graphs are too wide. Is there a standard procedure to set the correct aspect ratio, so that I can simply draw my scatter and other diagrams using lat/lon and the result has the correct shape, both on screen and on paper (png)?
[added this part later]
This is a bare-bones, stripped-down version of my problem. I need actual lat/lon values around the axes and an accurate shape (square). Right now it appears wide (2x).
import math
import matplotlib.pyplot as plt
import numpy as np
from pylab import *
w=1/math.cos(math.radians(60.0))
plt_area=[0,w,59.5,60.5] #60deg North, adjacent to the prime meridian
a=np.zeros(shape=(300,300))
matshow(a, extent=plt_area)
plt.grid(False)
plt.axis(plt_area)
fig = plt.gcf()
fig.set_size_inches(8,8)
fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)
plt.show()

It seems I found the solution.
And I found it here: How can I set the aspect ratio in matplotlib?
import math
import matplotlib.pyplot as plt
import numpy as np
w=1/math.cos(math.radians(60.0))
plt_area=[0,w,59.5,60.5] #square area
a=np.zeros(shape=(300,300))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(a)
plt.grid(False)
ax.axis(plt_area)
fig = plt.gcf()
fig.set_size_inches(8,8)
ax.set_aspect(w)
fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)
plt.show()

In matplotlib I usually change the figure size like this:
import matplotlib.pyplot as plt
plt.clf()
fig = plt.figure()
fig_p = plt.gcf()
fig_p.set_size_inches(8, 8) # x, y
However, this sets the outer dimensions of the figure, not of the plot area. You can change the plot area relative to the figure size, given as fractions of the total figure width and height respectively:
fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.9)
As long as the relative margins stay symmetric, the aspect ratio of the plot area should be the same as that of the figure.
Example 1:
plt.clf()
fig = plt.figure()
fig_p = plt.gcf()
fig_p.set_size_inches(5, 5) # x, y for figure canvas
# Relative distance ratio between origin of the figure and max extent of canvas
fig.subplots_adjust(left=0.2, right=0.8, bottom=0.2, top=0.8)
ax1 = fig.add_subplot(111)
import numpy as np  # needed for the random test data
xdata = np.random.rand(100) * 10
ydata = np.random.rand(100) * 1
ax1.plot(xdata, ydata, '.b')
ax1.set_xlabel('Very Large X-Label', size=30)
plt.savefig('squareplot.png', dpi=96)
Example 2:
fig.subplots_adjust(left=0.0, right=1.0, bottom=0.0, top=1.0)
Plot area fills the figure size completely:

Don't try to fix this by fiddling with fig.set_size_inches() or fig.subplots_adjust() or by changing your data; instead use a Mercator projection.
You can get a quick and dirty Mercator projection by using an aspect ratio of the reciprocal of the cosine of the mean latitude of your data. This is "pretty good" for data contained in about 1 degree of latitude, which is about 100 km. (You have to decide if, for your application, this is "good enough". If it isn't, you really have to consider some serious geographical projection libraries...)
Example:
from math import cos, radians
import matplotlib.pyplot as plt
import numpy as np
# Helsinki 60.1708 N, 24.9375 E
# Helsinki (lng, lat)
hels = [24.9375, 60.1708]
# a point 100 km directly north of Helsinki
pt_N = [24.9375, 61.0701]
# a point 100 km east of Helsinki along its parallel
pt_E = [26.7455, 60.1708]
coords = np.array([pt_N, hels, pt_E])
plt.figure()
plt.plot(coords[:,0], coords[:,1])
# either of these will estimate the "central latitude" of your data
# 1) do the plot, then average the limits of the y-axis
# (plt.gca() returns the current axes; plt.axes() would create a new one in recent matplotlib)
central_latitude = sum(plt.gca().get_ylim())/2.
# 2) actually average the latitudes in your data
central_latitude = np.average(coords, 0)[1]
# calculate the aspect ratio that will approximate a
# Mercator projection at this central latitude
mercator_aspect_ratio = 1/cos(radians(central_latitude))
# set the aspect ratio of the axes to that
plt.gca().set_aspect(mercator_aspect_ratio)
plt.show()
I picked Helsinki for the example since at that latitude the aspect ratio is almost 2, because two degrees of longitude cover about the same distance as one degree of latitude.
To really see this work: a) run the above, b) resize the window. Then comment out the call to set_aspect() and do the same. In the first case, the correct aspect ratio is maintained, in the latter you get nonsensical stretching.
The points 100 km north and east of Helsinki were calculated/confirmed with the EXCELLENT page on calculating distances between lat/lng points at Movable Type Scripts.
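The two 100 km points and the aspect ratio can also be checked without any plotting. The sketch below uses the haversine formula on a spherical Earth with a mean radius of 6371 km (an assumption of this sketch, not something stated above); haversine_km and mercator_aspect are hypothetical helper names.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical-Earth assumption


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (lng, lat) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def mercator_aspect(central_latitude_deg):
    """Axes aspect ratio approximating a Mercator projection near this latitude."""
    return 1.0 / cos(radians(central_latitude_deg))


# Points from the example, as (lng, lat)
hels = (24.9375, 60.1708)
pt_N = (24.9375, 61.0701)
pt_E = (26.7455, 60.1708)

north_km = haversine_km(*hels, *pt_N)
east_km = haversine_km(*hels, *pt_E)
aspect = mercator_aspect(hels[1])
```

Both distances should come out close to 100 km, and the aspect close to 2, which is what makes the plotted triangle look right-angled with equal legs once set_aspect() is applied.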
