Increase visibility of large scatter plot


I am plotting a dot plot (see https://en.wikipedia.org/wiki/Dot_plot_(bioinformatics) for a brief explanation) of a large amount of data (the largest set being 66x80,000x80,000).
My current method is to draw a matplotlib scatter plot from the x, y coordinates of the dots.
The problem is that I cannot set the interpolation value of the plot, and the output is not a vector graphic. This makes the graph very hard to see when zoomed out.
Here is the code:
import numpy as np
import matplotlib.pyplot as plt
v_x = np.random.randint(0, 80000, 300000)
v_y = v_x  # the x, y coordinates of the dots
f, axes = plt.subplots(11, 11, figsize=(20, 20))
for row in range(11):
    for col in range(11):
        axes[row, col].set_yticklabels([])
        axes[row, col].set_xticklabels([])
        if row > col:
            # hide the panels below the diagonal
            axes[row, col].axis('off')
        else:
            axes[row, col].set_xlim(0, len(v_x))
            axes[row, col].set_ylim(0, len(v_y))
            # 'none' is the documented way to disable the marker outline
            axes[row, col].scatter(v_x, v_y, c='#000000', s=(72./300)**2*20, marker='s',
                                   edgecolor='none', rasterized=True)
f.savefig('{}'.format('test.pdf'), facecolor='w', bbox_inches='tight', dpi=300)
If I don't rasterize, the PDF file contains too many dots and cannot be opened; if I do rasterize, the plot is hard to see when zoomed out.
Is there any way to solve this problem?
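One possible workaround, sketched here under assumptions rather than taken from the question: bin the dot coordinates into a coarse grid with np.histogram2d and draw the grid with imshow, so that every occupied cell fills at least one output pixel. The function name dotplot_density, the bin count of 400 and the Agg backend are illustrative choices, not part of the original code.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed here for scripted runs
import matplotlib.pyplot as plt


def dotplot_density(x, y, extent, bins=400):
    """Count dots per cell on a bins x bins grid covering [0, extent]^2.

    Hypothetical helper, not part of the question's code.
    """
    counts, _, _ = np.histogram2d(x, y, bins=bins,
                                  range=[[0, extent], [0, extent]])
    return counts


rng = np.random.default_rng(0)
v_x = rng.integers(0, 80000, 300000)
v_y = v_x  # same toy data layout as in the question

counts = dotplot_density(v_x, v_y, extent=80000, bins=400)

fig, ax = plt.subplots(figsize=(4, 4))
# Any occupied cell is drawn black, so a lone dot stays visible when zoomed out.
# histogram2d puts x on the first axis, hence the transpose for imshow.
occupied = (counts > 0).astype(float).T
ax.imshow(occupied, origin="lower", cmap="Greys", interpolation="nearest",
          extent=(0, 80000, 0, 80000))
```

Each panel of the 11x11 grid would get its own binned image, and saving the figure then embeds one small image per panel instead of hundreds of thousands of markers. The bin count trades resolution for visibility, so it is worth tuning against the final panel size.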

Related

How to evenly spread annotation imageboxes around a scatterplot?

I would like to annotate a scatterplot with images corresponding to each datapoint. With standard parameters the images end up clashing with each other and with other important features such as the legend, the axes, etc. Thus, I would like the images to form a circle or a rectangle around the main scatter plot.
My code currently looks like this, and I am struggling to modify it so that the images are organised around the center point of the plot.
import matplotlib.cbook as cbook
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
import seaborn as sns
#Generate n points around a 2d circle
def generate_circle_points(n, centre_x, center_y, radius=1):
    """Generate n points around a circle.
    Args:
        n (int): Number of points to generate.
        centre_x (float): x-coordinate of circle centre.
        center_y (float): y-coordinate of circle centre.
        radius (float): Radius of circle.
    Returns:
        list: List of points.
    """
    points = []
    for i in range(n):
        angle = 2 * np.pi * i / n
        x = centre_x + radius * np.cos(angle)
        y = center_y + radius * np.sin(angle)
        points.append([x, y])
    return points
fig, ax = plt.subplots(1, 1, figsize=(7.5, 7.5))
data = pd.DataFrame(data={'x': np.random.uniform(0.5, 2.5, 20),
                          'y': np.random.uniform(10000, 50000, 20)})
with cbook.get_sample_data('grace_hopper.jpg') as image_file:
    image = plt.imread(image_file)
# Set logarithmic scale for x and y axis
ax.set(xscale="log", yscale='log')
# Add grid
ax.grid(True, which='major', ls="--", c='gray')
coordianates = generate_circle_points(n=len(data),
                                      centre_x=0, center_y=0, radius=10)
# Plot the scatter plot
scatter = sns.scatterplot(data=data, x='x', y='y', ax=ax)
for index, row in data.iterrows():
    imagebox = OffsetImage(image, zoom=0.05)
    imagebox.image.axes = ax
    xy = np.array([row['x'], row['y']])
    xybox = np.array(coordianates[index])
    ab = AnnotationBbox(imagebox, xy,
                        xycoords='data',
                        boxcoords="offset points",
                        xybox=xybox,
                        pad=0)
    ax.add_artist(ab)
At the moment the output looks like this: (image not included)
Ideally, I would like the output to look something like this: (image not included)
Many thanks in advance for your help.

Not an answer but a long comment:
You can control the location of the arrows, but sometimes it is easier to export the figure as an SVG and edit it in Adobe Illustrator or Inkscape.
R has a dodge argument which is really nice, but even that is not always perfect. Solutions in Python exist but are laborious.
The major issue is that this needs to be done last, because later alterations to the plot would make it problematic. A few points need mentioning.
Your figures will have to have a fixed size (57mm / 121mm / 184mm for Science, 83mm / 171mm for RSC, 83mm / 178mm for ACS etc.). If you need to scale the figure in Illustrator, keep note of the scaling factor by adding it as a textbox outside of the canvas, as the underlying plot will need to be replaced at least once due to Murphy's law. Exporting the SVG at the right size is ideal. It sounds silly, but it helps. Likewise, make sure the font size does not go under the minimum spec (7-9 points).
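Staying inside matplotlib, one way to get the ring layout the question asks for is to place the boxes in axes-fraction coordinates instead of "offset points", so that the circle is centred on the axes rather than on each data point. This is only a sketch under assumptions: ring_positions is a hypothetical helper, the random array stands in for the real thumbnails, and the radius of 0.6 is an arbitrary choice that pushes the boxes just outside the axes frame.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed here for scripted runs
import matplotlib.pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox


def ring_positions(n, centre=(0.5, 0.5), radius=0.6):
    """Return n evenly spaced (x, y) positions on a circle, in axes fractions."""
    angles = 2 * np.pi * np.arange(n) / n
    return np.column_stack([centre[0] + radius * np.cos(angles),
                            centre[1] + radius * np.sin(angles)])


rng = np.random.default_rng(0)
x = rng.uniform(0.5, 2.5, 8)
y = rng.uniform(10000, 50000, 8)
image = rng.random((32, 32, 3))  # placeholder thumbnail, not real data

fig, ax = plt.subplots(figsize=(7.5, 7.5))
ax.set(xscale="log", yscale="log")
ax.scatter(x, y)

ring = ring_positions(len(x))
annotations = []
for px, py, (bx, by) in zip(x, y, ring):
    ab = AnnotationBbox(OffsetImage(image, zoom=0.5), (px, py),
                        xybox=(bx, by),
                        xycoords="data",            # arrow tip at the data point
                        boxcoords="axes fraction",  # box placed relative to the axes
                        pad=0,
                        arrowprops=dict(arrowstyle="-"),
                        annotation_clip=False)
    annotations.append(ax.add_artist(ab))
```

The boxes are assigned to ring slots in data order, so connector lines may cross. Sorting the points by their angle around the plot centre before assigning slots would probably reduce that, but it is not done here.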

Image plot in bokeh with tight axes and matching aspect ratio

I'm using bokeh version 1.0.1 inside a Django application and I would like to display microscopic surface images as zoomable image plots with color-encoded height and a colorbar. In principle this works, but I have trouble getting plots with the correct aspect ratio that show only the image, without space around it.
Here is an example of what I want to achieve. The resulting plot should
- show an image of random data having a width of sx=10 and a height of sy=5 in data space (image size),
- have axes limited to (0,sx) and (0,sy), both in the initial view and when zooming,
- show a square in data space as a square on the screen, at least in the initial view.
For the image I just use random data with nx=100 points in x direction and ny=100 points in y direction.
Here is my first approach:
Attempt 1
from bokeh.models.ranges import Range1d
from bokeh.plotting import figure, show
from bokeh.models import LinearColorMapper, ColorBar
import numpy as np
sx = 10
sy = 5
nx = 100
ny = 100
arr = np.random.rand(nx, ny)
x_range = Range1d(start=0, end=sx, bounds=(0,sx))
y_range = Range1d(start=0, end=sy, bounds=(0,sy))
# Attempt 1
plot = figure(x_range=x_range, y_range=y_range, match_aspect=True)
# Attempt 2
# plot = figure(match_aspect=True)
# Attempt 3
# pw = 400
# ph = int(400/sx*sy)
# plot = figure(plot_width=pw, plot_height=ph,
#               x_range=x_range, y_range=y_range, match_aspect=True)
color_mapper = LinearColorMapper(palette="Viridis256",
                                 low=arr.min(), high=arr.max())
colorbar = ColorBar(color_mapper=color_mapper, location=(0,0))
plot.image([arr], x=[0], y=[0], dw=[sx], dh=[sy],
           color_mapper=color_mapper)
plot.rect(x=[0,sx,sx,0,sx/2], y=[0,0,sy,sy,sy/2],
          height=1, width=1, color='blue')
plot.add_layout(colorbar, 'right')
show(plot)
I've also added blue squares to the plot in order to see, when the
aspect ratio requirement fails.
Unfortunately, in the resulting picture, the square is no square any more, it's twice as high as wide. Zooming and panning works as expected.
Attempt 2
When leaving out the ranges by using
plot = figure(match_aspect=True)
I'll get this picture. The square is a square on the screen,
this is fine, but the axis ranges changed, so there is
now space around it. I would like to have only the data area covered by the
image.
Attempt 3
Alternatively, when providing a plot_height and plot_width to the figure,
with a pre-defined aspect ratio e.g. by
pw = 800 # plot width
ph = int(pw/sx*sy)
plot = figure(plot_width=pw, plot_height=ph,
              x_range=x_range, y_range=y_range,
              match_aspect=True)
I'll get this picture. The square is also not a square any more. It can be done almost, but it's difficult, because the plot_width also comprises the colorbar and the toolbar.
I've read this corresponding blog post
and the corresponding bokeh documentation, but I cannot get it working.
Does anybody know how to achieve what I want or whether it is impossible?
Responsive behaviour would also be nice, but we can neglect that for now.
Thanks for any hint.
Update
After a conversation with a Bokeh developer on Gitter (thanks Bryan!) it seems that what I want is nearly impossible.
The reason lies in how match_aspect=True makes a square in data space look like a square in pixel space: given a canvas size, which may result from applying different sizing_mode settings for responsive behaviour, the data range is changed so that the aspect ratios match. So there is no way to make the pixel aspect ratio match the data aspect ratio without adding extra space around the image, i.e. extending the axes beyond the given bounds. Also see the comment on this issue.
Going without responsive behaviour and fixing the canvas size beforehand according to the aspect ratio could be done, but currently not perfectly, because all the other elements around the inner plot frame also take up space. There is a PR which may allow direct control of the inner frame dimensions, but I'm not sure how to use it.
Okay, what if I give up the goal to have tight axes?
This is done in "Attempt 2" above, but there is too much empty space around the image, the same space that the image plot takes.
I've tried to use various range_padding* attributes, e.g.
x_range = DataRange1d(range_padding=10, range_padding_units='percent')
y_range = DataRange1d(range_padding=10, range_padding_units='percent')
but this does not reduce the amount of space around the plot, it only increases it. The padding in percent should be relative to the image dimensions given by dh and dw.
Does anybody know how to use the range_padding parameters to get smaller axis ranges, or another way to get smaller paddings around the image plot in the example above (using match_aspect=True)?
I've opened another question on this.

Can you accept this solution (works with Bokeh v1.0.4) ?
from bokeh.models.ranges import Range1d
from bokeh.plotting import figure, show
from bokeh.layouts import Row
from bokeh.models import LinearColorMapper, ColorBar
import numpy as np
sx = 10
sy = 5
nx = 100
ny = 100
arr = np.random.rand(nx, ny)
x_range = Range1d(start = 0, end = sx, bounds = (0, sx))
y_range = Range1d(start = 0, end = sy, bounds = (0, sy))
pw = 400
ph = pw * sy / sx
plot = figure(plot_width = pw, plot_height = ph,
              x_range = x_range, y_range = y_range, match_aspect = True)
color_mapper = LinearColorMapper(palette = "Viridis256",
                                 low = arr.min(), high = arr.max())
plot.image([arr], x = [0], y = [0], dw = [sx], dh = [sy], color_mapper = color_mapper)
plot.rect(x = [0, sx, sx, 0, sx / 2], y = [0, 0, sy, sy, sy / 2], height = 1, width = 1, color = 'blue')
colorbar_plot = figure(plot_height = ph, plot_width = 69, x_axis_location = None, y_axis_location = None, title = None, tools = '', toolbar_location = None)
colorbar = ColorBar(color_mapper = color_mapper, location = (0, 0))
colorbar_plot.add_layout(colorbar, 'left')
show(Row(plot, colorbar_plot))
Result: (image not included)
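The sizing rule used in this answer, ph = pw * sy / sx, can be written as a small helper that keeps the number of pixels per data unit equal on both axes. The name frame_size is hypothetical, and the helper only does the arithmetic. I believe later Bokeh releases accept frame_width and frame_height arguments on figure, which would size the inner frame directly, but treat that as an assumption to check against your installed version.

```python
def frame_size(sx, sy, frame_width):
    """Return (width, height) in pixels with equal pixels per data unit.

    sx, sy: extent of the image in data space.
    frame_width: desired width of the inner plot frame in pixels.
    """
    pixels_per_unit = frame_width / sx
    return frame_width, round(sy * pixels_per_unit)


fw, fh = frame_size(sx=10, sy=5, frame_width=400)
print(fw, fh)  # 400 200
```

With plot_width and plot_height, as in the answer, these numbers describe the whole canvas, so the axes and toolbar still eat into the frame. That is why the colorbar was moved into a separate figure.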

How do I shift categorical scatter markers to left and right above xticks (multiple data sets per category)?

I have a simple pandas dataframe that I want to plot with matplotlib:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_excel('SAT_data.xlsx', index_col = 'State')
plt.figure()
plt.scatter(df['Year'], df['Reading'], c = 'blue', s = 25)
plt.scatter(df['Year'], df['Math'], c = 'orange', s = 25)
plt.scatter(df['Year'], df['Writing'], c = 'red', s = 25)
Here is what my plot looks like:
I'd like to shift the blue data points a bit to the left, and the red ones a bit to the right, so each year on the x-axis has three mini-columns of scatter data above it instead of all three datasets overlapping. I tried and failed to use the 'verts' argument properly. Is there a better way to do this?

Using an offset transform allows shifting the scatter points by some amount in units of points instead of data units. The advantage is that they then always sit tight against each other, independent of the figure size, zoom level etc.
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(0)
import matplotlib.transforms as transforms
year = np.random.choice(np.arange(2006,2017), size=(300) )
values = np.random.rand(300, 3)
plt.figure()
offset = lambda p: transforms.ScaledTranslation(p/72.,0, plt.gcf().dpi_scale_trans)
trans = plt.gca().transData
sc1 = plt.scatter(year, values[:,0], c = 'blue', s = 25, transform=trans+offset(-5))
plt.scatter(year, values[:,1], c = 'orange', s = 25)
plt.scatter(year, values[:,2], c = 'red', s = 25, transform=trans+offset(5))
plt.show()
(Images not included: the result in a broad figure, in a normal figure, and zoomed in.)
Some explanation:
The problem is that we want to add an offset in points to some data in data coordinates. While data coordinates are automatically transformed to display coordinates using transData (which we normally don't even see on the surface), adding some offset requires us to change the transform.
We do this by adding an offset. While we could just add an offset in pixels (display coordinates), it is more convenient to add the offset in points and thereby use the same unit in which the size of the scatter points is given (their size is actually in points squared).
So how many pixels are p points? We divide p by the ppi (points per inch) to obtain inches, and then multiply by the dpi (dots per inch) to obtain display pixels. This calculation is done in the ScaledTranslation.
While the dots per inch are in principle variable (and taken care of by the dpi_scale_trans transform), the points per inch are fixed. Matplotlib uses 72 ppi, which is a common typesetting standard.
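As a quick check of the arithmetic described above, the following sketch applies a ScaledTranslation of p = 5 points to the origin and compares the result with p / 72 * dpi. The figure dpi of 100 and the Agg backend are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed here for scripted runs
import matplotlib.pyplot as plt
import matplotlib.transforms as transforms

fig = plt.figure(dpi=100)

p = 5  # offset in points
offset = transforms.ScaledTranslation(p / 72., 0, fig.dpi_scale_trans)

# Translating the origin shows the shift in display pixels.
shift_px = offset.transform((0, 0))

# Same value by hand: points -> inches -> pixels.
expected_px = p / 72. * fig.dpi
```

Because the translation is tied to dpi_scale_trans, the pixel shift follows the figure dpi while the visual distance stays at 5 points.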

A quick and dirty way would be to create a small offset dx and subtract it from x values of blue points and add to x values of red points.
dx = 0.1
plt.scatter(df['Year'] - dx, df['Reading'], c = 'blue', s = 25)
plt.scatter(df['Year'], df['Math'], c = 'orange', s = 25)
plt.scatter(df['Year'] + dx, df['Writing'], c = 'red', s = 25)
One more option is the stripplot function from the seaborn library. The original dataframe has to be melted into long form so that each row contains a year, a test and a score. Then make a stripplot with year as x, score as y and test as hue. The dodge keyword argument (named split in older seaborn releases) controls whether the categories are plotted as separate stripes for each x. There is also the jitter argument, which adds some noise to the x values so that the points cover a small area instead of sitting on a single vertical line.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# make up example data
np.random.seed(2017)
df = pd.DataFrame(columns = ['Reading','Math','Writing'],
                  data = np.random.normal(540,30,size=(1000,3)))
df['Year'] = np.random.choice(np.arange(2006,2016),size=1000)
# melt the data into long form
df1 = pd.melt(df, var_name='Test', value_name='Score',id_vars=['Year'])
# make a stripplot
fig, ax = plt.subplots(figsize=(10,7))
sns.stripplot(data = df1, x='Year', y = 'Score', hue = 'Test',
              jitter = True, dodge = True, alpha = 0.7,
              palette = ['blue','orange','red'])
Output: (image not included)

Here is how the given code can be adapted to work with multiple subplots, and also to a situation without a "middle column".
To adapt the given code, ax[n,p].transData is needed instead of plt.gca().transData. plt.gca() refers to the last created subplot, while now you need the transform of each individual subplot.
Another problem is that when plotting only via a transform, matplotlib doesn't automatically set the lower and upper limits of the subplot. The given example plots the points "in the middle" without a specific transform, and the limits are then chosen around these points (orange in the example).
If you don't have points at the center, the limits need to be set in another way. The way I came up with is to plot some dummy points in the middle (which sets the limits) and then remove them again.
Also note that the size of the scatter dots is given as the square of their diameter (measured in points). To have touching dots, you'd need to use the square root for their offset.
import matplotlib.pyplot as plt
from matplotlib import transforms
import numpy as np
# Set up data for reproducible example
year = np.random.choice(np.arange(2006, 2017), size=(100))
data = np.random.rand(4, 100, 3)
data2 = np.random.rand(4, 100, 3)
# Create plot and set up subplot ax loop
fig, axs = plt.subplots(2, 2, figsize=(18, 14))
# Set up offset with transform
offset = lambda p: transforms.ScaledTranslation(p / 72., 0, plt.gcf().dpi_scale_trans)
# Plot data in a loop
for ax, q, r in zip(axs.flat, data, data2):
    temp_points = ax.plot(year, q, ls=' ')
    for pnt in temp_points:
        pnt.remove()
    ax.plot(year, q, marker='.', ls=' ', ms=10, c='b', transform=ax.transData + offset(-np.sqrt(10)))
    ax.plot(year, r, marker='.', ls=' ', ms=10, c='g', transform=ax.transData + offset(+np.sqrt(10)))
plt.show()

Matplotlib: Adjust legend location/position

I'm creating a figure with multiple subplots. One of these subplots is giving me some trouble, as none of the axes corners or centers are free (or can be freed up) for placing the legend. What I'd like to do is to have the legend placed somewhere in between the 'upper left' and 'center left' locations, while keeping the padding between it and the y-axis equal to the legends in the other subplots (that are placed using one of the predefined legend location keywords).
I know I can specify a custom position by using loc=(x,y), but then I can't figure out how to get the padding between the legend and the y-axis to be equal to that used by the other legends. Would it be possible to somehow use the borderaxespad property of the first legend? Though I'm not succeeding at getting that to work.
Any suggestions would be most welcome!
Edit: Here is a (very simplified) illustration of the problem:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 2, sharex=False, sharey=False)
ax[0].axhline(y=1, label='one')
ax[0].axhline(y=2, label='two')
ax[0].set_ylim([0.8,3.2])
ax[0].legend(loc=2)
ax[1].axhline(y=1, label='one')
ax[1].axhline(y=2, label='two')
ax[1].axhline(y=3, label='three')
ax[1].set_ylim([0.8,3.2])
ax[1].legend(loc=2)
plt.show()
What I'd like is that the legend in the right plot is moved down somewhat so it no longer overlaps with the line.
As a last resort I could change the axis limits, but I would very much like to avoid that.

I saw the answer you posted and tried it out. The problem, however, is that it also depends on the figure size.
Here's a new try:
import numpy
import matplotlib.pyplot as plt
x = numpy.linspace(0, 10, 10000)
y = numpy.cos(x) + 2.
x_value = .014 #Offset by eye
y_value = .55
fig, ax = plt.subplots(1, 2, sharex = False, sharey = False)
fig.set_size_inches(50,30)
ax[0].plot(x, y, label = "cos")
ax[0].set_ylim([0.8,3.2])
ax[0].legend(loc=2)
line1 ,= ax[1].plot(x,y)
ax[1].set_ylim([0.8,3.2])
axbox = ax[1].get_position()
fig.legend([line1], ["cos"], loc = (axbox.x0 + x_value, axbox.y0 + y_value))
plt.show()
So what I am now doing is basically getting the coordinates from the subplot. I then create the legend based on the dimensions of the entire figure. Hence, changing the figure size no longer affects the legend positioning.
With the values for x_value and y_value the legend can be positioned in the subplot. x_value has been eyeballed for a good correspondence with the "normal" legend and can be changed as desired. y_value determines the height of the legend.
Good luck!

After spending way too much time on this, I've come up with the following satisfactory solution (the Transformations Tutorial definitely helped):
bapad = plt.rcParams['legend.borderaxespad']
fontsize = plt.rcParams['font.size']
axline = plt.rcParams['axes.linewidth']  # needed, otherwise the result is off by a few pixels
pad_points = bapad*fontsize + axline  # padding is defined relative to the font size
pad_inches = pad_points/72.0  # convert from points to inches
pad_pixels = pad_inches*fig.dpi  # convert from inches to pixels using the figure's dpi
Then, I found that both of the following work and give the same value for the padding:
# Define inverse transform, transforms display coordinates (pixels) to axes coordinates
inv = ax[1].transAxes.inverted()
# Inverse transform two points on the display and find the relative distance
pad_axes = inv.transform((pad_pixels, 0)) - inv.transform((0,0))
pad_xaxis = pad_axes[0]
or
# Find how many pixels there are along the x-axis
x_pixels = ax[1].transAxes.transform((1,0)) - ax[1].transAxes.transform((0,0))
# Compute the ratio between the pixel offset and the total amount of pixels
pad_xaxis = pad_pixels/x_pixels[0]
And then set the legend with:
ax[1].legend(loc=(pad_xaxis,0.6))
Plot: (image not included)
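For reference, here is how the fragments of this answer might fit together with the figure from the question. It is a sketch under assumptions: the default rcParams are in effect, the Agg backend is used, and the canvas is drawn once before the transforms are read so that they reflect the final layout.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed here for scripted runs
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 2, sharex=False, sharey=False)
for a, labels in zip(ax, (['one', 'two'], ['one', 'two', 'three'])):
    for i, label in enumerate(labels, start=1):
        a.axhline(y=i, label=label)
    a.set_ylim([0.8, 3.2])
ax[0].legend(loc=2)

fig.canvas.draw()  # settle the layout before querying transforms

# Padding used by the predefined legend locations, converted to pixels.
bapad = plt.rcParams['legend.borderaxespad']
fontsize = plt.rcParams['font.size']
axline = plt.rcParams['axes.linewidth']
pad_pixels = (bapad * fontsize + axline) / 72.0 * fig.dpi

# Express the pixel padding as a fraction of the axes width.
x_pixels = ax[1].transAxes.transform((1, 0)) - ax[1].transAxes.transform((0, 0))
pad_xaxis = pad_pixels / x_pixels[0]

# Same horizontal padding as loc=2, but a custom vertical position.
legend = ax[1].legend(loc=(pad_xaxis, 0.6))
```

The fraction is computed for the figure size and dpi at that moment, so it would need to be recomputed if either changes afterwards.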

Expand the line with specified width in data unit

My question is a bit similar to this question that draws line with width given in data coordinates. What makes my question a bit more challenging is that unlike the linked question, the segment that I wish to expand is of a random orientation.
Let's say if the line segment goes from (0, 10) to (10, 10), and I wish to expand it to a width of 6. Then it is simply
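# assumes `import numpy as np` and an existing Axes `ax`
x = np.array([0, 10])
y = np.array([10, 10])  # arrays, so that y - 3 and y + 3 are valid
ax.fill_between(x, y - 3, y + 3)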
However, my line segment is of random orientation. That is, it is not necessarily along x-axis or y-axis. It has a certain slope.
A line segment s is defined as a list of its starting and ending points: [(x1, y1), (x2, y2)].
Now I wish to expand the line segment to a certain width w. The solution is expected to work for a line segment in any orientation. How to do this?
plt.plot(x, y, linewidth=6.0) cannot do the trick, because I want my width to be in the same unit as my data.
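A direct geometric reading of the question, sketched here under assumptions and separate from the linewidth-based answers: offset both endpoints by half the width along the unit normal of the segment, which gives the four corners of a rectangle in data coordinates. The function name segment_to_rectangle is hypothetical.

```python
import numpy as np


def segment_to_rectangle(p1, p2, w):
    """Return the 4 corners of a rectangle of width w centred on segment p1-p2."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    d = p2 - p1
    length = np.hypot(d[0], d[1])
    if length == 0:
        raise ValueError("segment has zero length, orientation is undefined")
    # Unit vector perpendicular to the segment.
    normal = np.array([-d[1], d[0]]) / length
    half = 0.5 * w * normal
    return np.array([p1 + half, p2 + half, p2 - half, p1 - half])


# The horizontal example from the question: width 6 around y = 10.
corners = segment_to_rectangle((0, 10), (10, 10), 6)

# A sloped segment works the same way.
sloped = segment_to_rectangle((0, 0), (3, 4), 2)
```

The corner array can be passed to matplotlib.patches.Polygon and added to the plot with ax.add_patch. The rectangle is exact in data space, but it only looks rectangular on screen if the axes use an equal aspect ratio.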

The following code is a generic example of how to make a line plot in matplotlib with the linewidth given in data coordinates. There are two solutions: one using callbacks, one subclassing Line2D.
Using callbacks.
It is implemented as a class data_linewidth_plot that can be called with a signature pretty close to the normal plt.plot command,
l = data_linewidth_plot(x, y, ax=ax, label='some line', linewidth=1, alpha=0.4)
where ax is the axes to plot to. The ax argument can be omitted when only one subplot exists in the figure. The linewidth argument is interpreted in (y-)data units.
Further features:
It is independent of the subplot placements, margins or figure size.
If the aspect ratio is unequal, it uses y data coordinates as the linewidth.
It also takes care that the legend handle is correctly set (we may want to have a huge line in the plot, but certainly not in the legend).
It is compatible with changes to the figure size, zoom or pan events, as it takes care of resizing the linewidth on such events.
Here is the complete code.
import matplotlib.pyplot as plt
class data_linewidth_plot():
    def __init__(self, x, y, **kwargs):
        self.ax = kwargs.pop("ax", plt.gca())
        self.fig = self.ax.get_figure()
        self.lw_data = kwargs.pop("linewidth", 1)
        self.lw = 1
        self.fig.canvas.draw()
        self.ppd = 72./self.fig.dpi
        self.trans = self.ax.transData.transform
        self.linehandle, = self.ax.plot([],[],**kwargs)
        if "label" in kwargs: kwargs.pop("label")
        self.line, = self.ax.plot(x, y, **kwargs)
        self.line.set_color(self.linehandle.get_color())
        self._resize()
        self.cid = self.fig.canvas.mpl_connect('draw_event', self._resize)
    def _resize(self, event=None):
        lw = ((self.trans((1, self.lw_data))-self.trans((0, 0)))*self.ppd)[1]
        if lw != self.lw:
            self.line.set_linewidth(lw)
            self.lw = lw
            self._redraw_later()
    def _redraw_later(self):
        self.timer = self.fig.canvas.new_timer(interval=10)
        self.timer.single_shot = True
        self.timer.add_callback(lambda : self.fig.canvas.draw_idle())
        self.timer.start()
fig1, ax1 = plt.subplots()
#ax.set_aspect('equal') #<-not necessary
ax1.set_ylim(0,3)
x = [0,1,2,3]
y = [1,1,2,2]
# plot a line, with 'linewidth' in (y-)data coordinates.
l = data_linewidth_plot(x, y, ax=ax1, label='some 1 data unit wide line',
                        linewidth=1, alpha=0.4)
plt.legend() # <- legend possible
plt.show()
(I updated the code to use a timer to redraw the canvas, due to this issue)
Subclassing Line2D
The above solution has some drawbacks. It requires a timer and callbacks to update itself on changing axis limits or figure size. The following is a solution without such needs. It will use a dynamic property to always calculate the linewidth in points from the desired linewidth in data coordinates on the fly. It is much shorter than the above.
A drawback here is that a legend needs to be created manually via a proxyartist.
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
class LineDataUnits(Line2D):
    def __init__(self, *args, **kwargs):
        _lw_data = kwargs.pop("linewidth", 1)
        super().__init__(*args, **kwargs)
        self._lw_data = _lw_data
    def _get_lw(self):
        if self.axes is not None:
            ppd = 72./self.axes.figure.dpi
            trans = self.axes.transData.transform
            return ((trans((1, self._lw_data))-trans((0, 0)))*ppd)[1]
        else:
            return 1
    def _set_lw(self, lw):
        self._lw_data = lw
    _linewidth = property(_get_lw, _set_lw)
fig, ax = plt.subplots()
#ax.set_aspect('equal') # <-not necessary, if not given, y data is assumed
ax.set_xlim(0,3)
ax.set_ylim(0,3)
x = [0,1,2,3]
y = [1,1,2,2]
line = LineDataUnits(x, y, linewidth=1, alpha=0.4)
ax.add_line(line)
ax.legend([Line2D([],[], linewidth=3, alpha=0.4)],
          ['some 1 data unit wide line']) # <- legend possible via proxy artist
plt.show()
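Both solutions above rely on the same conversion from data units to points. The sketch below isolates that expression and cross-checks it against the axes height in points divided by the y data range. The figure size, the dpi of 100 and the Agg backend are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed here for scripted runs
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4), dpi=100)
ax.set_xlim(0, 3)
ax.set_ylim(0, 3)
fig.canvas.draw()  # settle the layout before querying transforms

lw_data = 1.0          # desired line width in y data units
ppd = 72. / fig.dpi    # points per display pixel
trans = ax.transData.transform

# Same expression as in the answers: pixel height of lw_data, converted to points.
lw_points = ((trans((1, lw_data)) - trans((0, 0))) * ppd)[1]

# Cross-check: axes height in points, scaled by the fraction of the y range covered.
axes_height_points = ax.get_window_extent().height * ppd
expected = lw_data * axes_height_points / 3.0
```

The two values agree for linear axes. On a log axis a fixed data width has no single size in points, so this check would not carry over.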

Just to add to the previous answer (can't comment yet), here's a function that automates this process without the need for equal axes or the heuristic value of 0.8 for labels. The data limits and size of the axis need to be fixed and not changed after this function is called.
import numpy as np
def linewidth_from_data_units(linewidth, axis, reference='y'):
    """
    Convert a linewidth in data units to linewidth in points.
    Parameters
    ----------
    linewidth: float
        Linewidth in data units of the respective reference-axis
    axis: matplotlib axis
        The axis which is used to extract the relevant transformation
        data (data limits and size must not change afterwards)
    reference: string
        The axis that is taken as a reference for the data width.
        Possible values: 'x' and 'y'. Defaults to 'y'.
    Returns
    -------
    linewidth: float
        Linewidth in points
    """
    fig = axis.get_figure()
    if reference == 'x':
        length = fig.bbox_inches.width * axis.get_position().width
        value_range = np.diff(axis.get_xlim())[0]
    elif reference == 'y':
        length = fig.bbox_inches.height * axis.get_position().height
        value_range = np.diff(axis.get_ylim())[0]
    else:
        raise ValueError("reference must be 'x' or 'y'")
    # Convert length to points
    length *= 72
    # Scale linewidth to value range
    return linewidth * (length / value_range)

Explanation:
Set up the figure with a known height and make the scale of the two axes equal (or else the idea of "data coordinates" does not apply). Make sure the proportions of the figure match the expected proportions of the x and y axes.
Compute the height of the whole figure point_hei (including margins) in units of points by multiplying inches by 72
Manually assign the y-axis range yrange (You could do this by plotting a "dummy" series first and then querying the plot axis to get the lower and upper y limits.)
Provide the width of the line that you would like in data units linewid
Calculate what those units would be in points pointlinewid while adjusting for the margins. In a single-frame plot, the plot is 80% of the full image height.
Plot the lines, using a capstyle that does not pad the ends of the line (has a big effect at these large line sizes)
Voilà? (Note: this should generate the proper image in the saved file, but no guarantees if you resize a plot window.)
import matplotlib.pyplot as plt
rez=600
wid=8.0 # Must be proportional to x and y limits below
hei=6.0
fig = plt.figure(1, figsize=(wid, hei))
sp = fig.add_subplot(111)
# # plt.figure.tight_layout()
# fig.set_autoscaley_on(False)
sp.set_xlim([0,4000])
sp.set_ylim([0,3000])
sp.set_aspect('equal')  # use the existing subplot; plt.axes() would add a new Axes in current matplotlib
# line is in points: 72 points per inch
point_hei=hei*72
xval=[100,1300,2200,3000,3900]
yval=[10,200,2500,1750,1750]
x1,x2,y1,y2 = plt.axis()
yrange = y2 - y1
# print yrange
linewid = 500 # in data units
# For the calculation below, you have to adjust width by 0.8
# because the top and bottom 10% of the figure are labels & axis
pointlinewid = (linewid * (point_hei/yrange)) * 0.8 # corresponding width in pts
plt.plot(xval,yval,linewidth = pointlinewid,color="blue",solid_capstyle="butt")
# just for fun, plot the half-width line on top of it
plt.plot(xval,yval,linewidth = pointlinewid/2,color="red",solid_capstyle="butt")
plt.savefig('mymatplot2.png',dpi=rez)
