I am trying to plot a straight line together with its associated error.
I calculated values for the slope (a) and the intercept (b). In addition, I calculated the error associated with these values. So I drew the line given by the typical formula below.
y=ax+b
However, in addition to the line, I also want to draw the associated error. I came up with the idea to draw the lines associated with these formulas and color the space between the lines gray.
y=(a+a_sd)x+(b+b_sd)
y=(a-a_sd)x+(b-b_sd)
Using the following piece of code, I am able to color part of the surface between the lines, but not the whole span (see included output).
I think this may be due to the fact that "distance" is not sorted, and fill_between is using distance[0] and distance[-1] as begin and end for the span, respectively.
As always, any help would be highly appreciated!
import matplotlib.pyplot as plt
distance=[0.35645334340084989, 0.55406894241607718, 0.10201413273193734, 0.13401365724625941, 0.71918808865838735, 0.14151335417722818]
time=[2.4004984846346171, 2.4909766335028447, 1.9852064018125195, 1.9083156734132103, 2.6380396934372863, 1.9114505780323543]
time_SD=[0.062393810960652669, 0.056945715242838917, 0.073960838867327183, 0.084111239062664475, 0.026912957190265499, 0.08595664694840538]
distance_SD=[0.035160608598240162, 0.032976715460514235, 0.02782911002465227, 0.035465701695038584, 0.043009444687382707, 0.038387585107200854]
a=1.17887019041
b=1.83339229489
a_sd=0.159771527859
b_sd=0.0762509747218
plt.errorbar(distance,time,yerr=time_SD, xerr=distance_SD, linestyle="None")
abline_values = [(a)*i + (b) for i in distance]
abline_values_plus = [(a+a_sd)*i + (b+b_sd) for i in distance]
abline_values_minus = [(a-a_sd)*i + (b-b_sd) for i in distance]
plt.plot(distance, abline_values,"r")
plt.fill_between(distance,abline_values_minus,abline_values_plus,facecolor='lightgrey', interpolate=True, edgecolors="None")
leg = plt.legend(loc="lower right", frameon=False, handlelength=0, handletextpad=0)
for item in leg.legendHandles:
    item.set_visible(False)
plt.show()
In order to use pyplot.fill_between() the list to plot the horizontal coordinate should be sorted. Using an unsorted list of x values is possible, but can lead to undesired results.
Sorting a list can be done using sorted(list).
import matplotlib.pyplot as plt
distance=[0.35645334340084989, 0.55406894241607718, 0.10201413273193734, 0.13401365724625941, 0.71918808865838735, 0.14151335417722818]
time=[2.4004984846346171, 2.4909766335028447, 1.9852064018125195, 1.9083156734132103, 2.6380396934372863, 1.9114505780323543]
time_SD=[0.062393810960652669, 0.056945715242838917, 0.073960838867327183, 0.084111239062664475, 0.026912957190265499, 0.08595664694840538]
distance_SD=[0.035160608598240162, 0.032976715460514235, 0.02782911002465227, 0.035465701695038584, 0.043009444687382707, 0.038387585107200854]
a=1.17887019041
b=1.83339229489
a_sd=0.159771527859
b_sd=0.0762509747218
distance_sorted = sorted(distance)
plt.errorbar(distance,time,yerr=time_SD, xerr=distance_SD, linestyle="None")
abline_values = [(a)*i + (b) for i in distance_sorted]
abline_values_plus = [(a+a_sd)*i + (b+b_sd) for i in distance_sorted]
abline_values_minus = [(a-a_sd)*i + (b-b_sd) for i in distance_sorted]
plt.plot(distance_sorted, abline_values,"r")
plt.fill_between(distance_sorted,abline_values_minus,abline_values_plus, facecolor='lightgrey', edgecolors="None")
plt.show()
The documentation does not mention the requirement of x values being sorted. The reason is probably that fill_between actually works even with unsorted lists, just not the way one might expect. Maybe the following animation gives a more intuitive understanding on the issue:
You are right, fill_between seems to expect the values to be sorted. The documentation is not clear about this behaviour, though. The following example shows the same effect:
import matplotlib.pyplot as plt
from numpy import random, array
#x = random.randn(20) #does not work
x = array(sorted(random.randn(20))) #works
a = 2
d = .5
y_h = x*(a+d)
y_l = x*(a-d)
plt.fill_between(x,y_h, y_l)
plt.show()
As a workaround just sort your values before calculating your errorlines using sorted.
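A related option, sketched here under my own assumptions rather than taken from either answer: the band does not have to be evaluated at the measured points at all. Evaluating the three lines on an evenly spaced, already sorted grid (the name x_grid and the choice of 50 points are mine) avoids the ordering problem entirely. The distance values are rounded copies of the ones in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Fit results from the question
a, b = 1.17887019041, 1.83339229489
a_sd, b_sd = 0.159771527859, 0.0762509747218

# Measured x values (rounded), deliberately left unsorted
distance = np.array([0.356, 0.554, 0.102, 0.134, 0.719, 0.142])

# Evaluate the lines on a sorted grid spanning the data (grid size is arbitrary)
x_grid = np.linspace(distance.min(), distance.max(), 50)
y_fit = a * x_grid + b
y_upper = (a + a_sd) * x_grid + (b + b_sd)
y_lower = (a - a_sd) * x_grid + (b - b_sd)

fig, ax = plt.subplots()
ax.plot(x_grid, y_fit, "r")
band = ax.fill_between(x_grid, y_lower, y_upper, facecolor="lightgrey")
# call plt.show() or fig.savefig(...) to look at the result
```

If you would rather keep the original points, np.argsort(distance) gives an index array that can be used to reorder distance and any array paired with it in the same way.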
I am not sure on how to plot a dotted line from a shapefile in Python. It appears that readshapefile() does not have any linestyle for me to set. Below I have a working code where I take a shapefile and plot it, but it only plots a solid line. Any ideas to set me in the right direction? Thanks!
The shapefile can be found here: http://www.natice.noaa.gov/products/daily_products.html, where the Start Date is Feb 15th, end date is Feb 17th, and the Date Types is Ice Edge. It should be the first link.
#!/awips2/python/bin/python
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
map = Basemap(llcrnrlon=-84.37,llcrnrlat=42.11,urcrnrlon=-20.93,urcrnrlat=66.48,
resolution='i', projection='tmerc', lat_0 = 55., lon_0 = -50.)
map.drawmapboundary(fill_color='aqua')
map.fillcontinents(color='#ddaa66',lake_color='aqua')
map.drawcoastlines(zorder = 3)
map.readshapefile('nic_autoc2018046n_pl_a', 'IceEdge', zorder = 2, color = 'blue')
plt.show()
From the Basemap documentation:
A tuple (num_shapes, type, min, max) containing shape file info is
returned. num_shapes is the number of shapes, type is the type code
(one of the SHPT* constants defined in the shapelib module, see
http://shapelib.maptools.org/shp_api.html) and min and max are
4-element lists with the minimum and maximum values of the vertices.
If drawbounds=True a matplotlib.patches.LineCollection object is
appended to the tuple.
drawbounds is True by default, so all you have to do is collect the return value of readshapefile and alter the linestyle of the returned LineCollection object, which can be done with LineCollection.set_linestyle(). So in principle you can change the linestyle of your plotted shape file with something like this:
result = m.readshapefile('shapefiles/nic_autoc2018046n_pl_a', 'IceEdge', zorder = 10, color = 'blue')#, drawbounds = False)
col = result[-1]
col.set_linestyle('dotted')
plt.show()
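For anyone who wants to try the set_linestyle() part without Basemap or the shapefile, here is a small stand-alone sketch. The three segments are made up for illustration; the only thing it is meant to show is that a LineCollection that has already been added to an axes can be restyled afterwards.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

# Made-up line segments of different lengths, standing in for the shapefile outlines
segments = [
    np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]),
    np.array([[0.5, 2.0], [1.5, 2.5]]),
    np.array([[2.0, 2.0], [3.0, 1.0], [3.5, 1.5], [4.0, 1.0]]),
]

fig, ax = plt.subplots()
col = LineCollection(segments, colors="blue")
ax.add_collection(col)
ax.autoscale_view()  # collections do not rescale the axes on their own

# Restyle the collection after it has been drawn, as with the readshapefile result
col.set_linestyle("dotted")
# call plt.show() or fig.savefig(...) to look at the result
```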
However, your shapefile contains 5429 separate line segments of different length and somehow matplotlib does not seem to be able to deal with this large amount of non-continuous lines. At least on my machine the plotting did not finish within one hour, so I interrupted the process. I played a bit with your file and it seems like many of the lines are broken into segments unnecessarily (I'm guessing this is because the ice sheet outlines are somehow determined on tiles and then pieced together afterwards, but only the providers will really know). Maybe it would help to piece together adjacent pieces, but I'm not sure.
I was also wondering whether the result would even look that great with a dotted line, because there are so many sharp bends. Below I show a picture where I only plot the 100 longest line segments (leaving out drawcoastlines and with thicker lines) using this code:
import numpy as np
from matplotlib.collections import LineCollection
result = m.readshapefile('shapefiles/nic_autoc2018046n_pl_a', 'IceEdge', zorder = 10, color = 'blue')#, drawbounds = False)
col = result[-1]
segments = col.get_segments()
# number of vertices in each segment
seglens = np.array([len(seg) for seg in segments])
# indices that sort the segments from shortest to longest
idx = np.argsort(seglens)
# keep the 100 longest segments (a plain list, since the segments differ in length)
longest = [segments[i] for i in idx[-100:]]
col.remove()
new_col = LineCollection(longest, linewidths = 2, linestyles='dotted', colors='b')
ax = plt.gca()
ax.add_collection(new_col)
plt.show()
And the result looks like this:
Is there a way to extract the data from an array, which corresponds to a line of a contourplot in python? I.e. I have the following code:
n = 100
x, y = np.mgrid[0:1:n*1j, 0:1:n*1j]
plt.contour(x,y,values)
where values is a 2d array with data (I stored the data in a file but it seems not to be possible to upload it here). The picture below shows the corresponding contourplot. My question is, if it is possible to get exactly the data from values, which corresponds e.g. to the left contourline in the plot?
Worth noting here, since this post was the top hit when I had the same question, that this can be done with scikit-image much more simply than with matplotlib. I'd encourage you to check out skimage.measure.find_contours. A snippet of their example:
import numpy as np
from skimage import measure
x, y = np.ogrid[-np.pi:np.pi:100j, -np.pi:np.pi:100j]
r = np.sin(np.exp((np.sin(x)**3 + np.cos(y)**2)))
contours = measure.find_contours(r, 0.8)
which can then be plotted/manipulated as you need. I like this more because you don't have to get into the deep weeds of matplotlib.
plt.contour returns a QuadContourSet. From that, we can access the individual lines using:
cs.collections[0].get_paths()
This returns all the individual paths. To access the actual x, y locations, we need to look at the vertices attribute of each path. The first contour drawn should be accessible using:
X, Y = cs.collections[0].get_paths()[0].vertices.T
See the example below to see how to access any of the given lines. In the example I only access the first one:
import matplotlib.pyplot as plt
import numpy as np
n = 100
x, y = np.mgrid[0:1:n*1j, 0:1:n*1j]
values = x**0.5 * y**0.5
fig1, ax1 = plt.subplots(1)
cs = plt.contour(x, y, values)
lines = []
for line in cs.collections[0].get_paths():
    lines.append(line.vertices)
fig1.savefig('contours1.png')
fig2, ax2 = plt.subplots(1)
ax2.plot(lines[0][:, 0], lines[0][:, 1])
fig2.savefig('contours2.png')
contours1.png:
contours2.png:
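As far as I know, the collections attribute used above has been deprecated in recent Matplotlib releases, so treat the following as a hedged alternative rather than the answer's own method. The contour set also exposes allsegs, a list with one entry per contour level, where each entry is a list of (N, 2) vertex arrays. The sketch reuses the values array from the example above; the variable name contour_data is mine.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

n = 100
x, y = np.mgrid[0:1:n*1j, 0:1:n*1j]
values = x**0.5 * y**0.5

fig, ax = plt.subplots()
cs = ax.contour(x, y, values)

# Collect the vertex arrays per contour level: {level: [array of shape (N, 2), ...]}
contour_data = {}
for level, segs in zip(cs.levels, cs.allsegs):
    contour_data[level] = [np.asarray(seg) for seg in segs]

# Example: x and y coordinates of the first non-empty contour line
for level, segs in contour_data.items():
    if segs:
        line_x, line_y = segs[0][:, 0], segs[0][:, 1]
        break
```

Levels that fall outside the data range simply come back with an empty list, which is why the last loop skips them.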
plt.contour returns a QuadContourSet which holds the data you're after.
See Get coordinates from the contour in matplotlib? (which this question is probably a duplicate of...)
I am plotting just a simple scatterplot with MPL 1.4.0. I want to control the number of dashes on the figures I am plotting because currently even though I set a linestyle, the dashes are too close to each other so it doesn't look like a properly dashed line.
#load cdeax,cdeay,gsix,gsiy,reich all are arrays of shape (380,)
figfit = plt.figure(); axfit = figfit.gca()
axfit.plot(cdeax,np.log(cdeay),'ko', alpha=.5); axfit.plot(gsix,np.log(gsiy), 'kx')
axfit.plot(cdeax,cdeafit,'k-'); axfit.plot(gsix,gsifit,'k:')
longevityregplot[1].plot(gsix,np.log(reich_l),'k-.')
#load cdeax,cdeay,gsix,gsiy,reich all are arrays of shape (380,)
figfit = plt.figure(); axfit = figfit.gca()
axfit.plot(cdeax,np.log(cdeay),'ko', alpha=.5); axfit.plot(gsix,np.log(gsiy), 'kx')
axfit.plot(cdeax,cdeafit,'k-',dashes = [10,10]); axfit.plot(gsix,gsifit,'k:',dashes=[10,10])
longevityregplot[1].plot(gsix,np.log(reich_l),'k-.')
However the above is what I get. Rather than a uniformly-dashed line, the lines get dashed at the ends to varying degrees but no matter what values I use for dashes, the dashing is never uniform.
I'm afraid I really don't know what the problem is here... Any ideas?
I have pasted the arrays I am using here: http://pastebin.com/rJ5Jjfmm
You should be able to just copy/paste them to your IDE for the above code to run.
Cheers!
EDIT:
Just with the single line plotted:
axfit.plot(cdeax,cdeafit,'k-',dashes = [10,10]);
EDIT2: pastebin link changed to include all data
EDIT3: Histogram of point density along the x axis:
I think what @cphlewis said is correct, you may have some x-axis backtracking. If I sort everything it looks ok to me (did my own fitting since I still don't see the fits on pastebin)
# import your data here
import math
import numpy as np
import matplotlib.pyplot as plt
figfit = plt.figure(); axfit = figfit.gca()
cdea = zip(cdeax,cdeay)
cdea = np.array(sorted(cdea, key = lambda x: x[0]))
gsi = zip(gsix,gsiy)
gsi = np.array(sorted(gsi, key = lambda x: x[0]))
cdeafit2 = np.polyfit(cdea[:,0],cdea[:,1],1)
gsifit2 = np.polyfit([x[0] for x in gsi],[math.log(x[1]) for x in gsi],1)
cdeafit = [x*cdeafit2[0] + cdeafit2[1] for x in cdea[:,0]]
gsifit = [math.exp(y) for y in [x*gsifit2[0] + gsifit2[1] for x in gsi[:,0]]]
axfit.plot(cdea[:,0],cdea[:,1],'ko', alpha=.5); axfit.plot(gsi[:,0],gsi[:,1], 'kx')
axfit.plot(cdea[:,0],cdeafit,'k-',dashes = [10,10]); axfit.plot(gsi[:,0],gsifit,'k:',dashes=[10,10])
#longevityregplot[1].plot(gsix,np.log(reich_l),'k-.') # not sure what this is
axfit.set_yscale('log')
plt.show()
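To see the sorting step in isolation, here is a minimal sketch with synthetic data (the random x values and the line 0.5*x + 1 are made up, they are not the pastebin arrays). With unsorted x the line doubles back over itself, so the dash pattern overlaps and looks uneven; reordering both arrays with the same np.argsort index fixes that.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Synthetic, unsorted x values and a straight-line "fit" through them
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y_fit = 0.5 * x + 1.0

# Reorder x and the fitted values together so the line is drawn left to right
order = np.argsort(x)
x_sorted, y_sorted = x[order], y_fit[order]

fig, ax = plt.subplots()
(line,) = ax.plot(x_sorted, y_sorted, "k-", dashes=[10, 10])
# call plt.show() or fig.savefig(...) to look at the result
```

For a straight line you could also plot just the two end points, which sidesteps the ordering question altogether.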
I've been looking into how to make plots against time on the x axis and have it pretty much sorted, with one strange quirk that makes me wonder whether I've run into a bug or (admittedly much more likely) am doing something I don't really understand.
Simply put, below is a simplified version of my program. If I put this in a .py file and execute it from an interpreter (ipython) I get a figure with an x axis with the year only, "2012", repeated a number of times, like this.
However, if I comment out the line (40) that sets the xticks manually, namely 'plt.xticks(tk)' and then run that exact command in the interpreter immediately after executing the script, it works great and my figure looks like this.
Similarly it also works if I just move that line to be after the savefig command in the script, that's to say to put it at the very end of the file. Of course in both cases only the figure drawn on screen will have the desired axis, and not the saved file. Why can't I set my x axis earlier?
Grateful for any insights, thanks in advance!
import matplotlib.pyplot as plt
import datetime
from matplotlib.dates import date2num
# define arrays for x, y and errors
x=[16.7,16.8,17.1,17.4]
y=[15,17,14,16]
e=[0.8,1.2,1.1,0.9]
xtn=[]
# convert x to datetime format
for t in x:
    hours=int(t)
    mins=int((t-int(t))*60)
    secs=int(((t-hours)*60-mins)*60)
    dt=datetime.datetime(2012,1,1,hours,mins,secs)
    xtn.append(date2num(dt))
# set up plot
fig=plt.figure()
ax=fig.add_subplot(1,1,1)
# plot
ax.errorbar(xtn,y,yerr=e,fmt='+',elinewidth=2,capsize=0,color='k',ecolor='k')
# set x axis range
ax.xaxis_date()
t0=date2num(datetime.datetime(2012,1,1,16,35)) # x axis startpoint
t1=date2num(datetime.datetime(2012,1,1,17,35)) # x axis endpoint
plt.xlim(t0,t1)
# manually set xtick values
tk=[]
tk.append(date2num(datetime.datetime(2012,1,1,16,40)))
tk.append(date2num(datetime.datetime(2012,1,1,16,50)))
tk.append(date2num(datetime.datetime(2012,1,1,17,0)))
tk.append(date2num(datetime.datetime(2012,1,1,17,10)))
tk.append(date2num(datetime.datetime(2012,1,1,17,20)))
tk.append(date2num(datetime.datetime(2012,1,1,17,30)))
plt.xticks(tk)
plt.show()
# save to file
plt.savefig('savefile.png')
I don't think you need that call to xaxis_date(), since you are already providing the x-axis data in a format that matplotlib knows how to deal with. I also think there's something slightly wrong with your secs formula.
We can make use of matplotlib's built-in formatters and locators to:
set the major xticks to a regular interval (minutes, hours, days, etc.)
customize the display using a strftime formatting string
It appears that if a formatter is not specified, the default is to display the year, which is what you were seeing.
Try this out:
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, MinuteLocator
x = [16.7,16.8,17.1,17.4]
y = [15,17,14,16]
e = [0.8,1.2,1.1,0.9]
xtn = []
for t in x:
    h = int(t)
    m = int((t-int(t))*60)
    xtn.append(dt.datetime.combine(dt.date(2012,1,1), dt.time(h,m)))
def larger_alim( alim ):
    ''' simple utility function to expand axis limits a bit '''
    amin,amax = alim
    arng = amax-amin
    nmin = amin - 0.1 * arng
    nmax = amax + 0.1 * arng
    return nmin,nmax
plt.errorbar(xtn,y,yerr=e,fmt='+',elinewidth=2,capsize=0,color='k',ecolor='k')
plt.gca().xaxis.set_major_locator( MinuteLocator(byminute=range(0,60,10)) )
plt.gca().xaxis.set_major_formatter( DateFormatter('%H:%M:%S') )
plt.gca().set_xlim( larger_alim( plt.gca().get_xlim() ) )
plt.show()
Result:
FWIW the utility function larger_alim was originally written for this other question: Is there a way to tell matplotlib to loosen the zoom on the plotted data?
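If you want to keep hand-picked tick positions, as in the question, rather than a locator, the following sketch may help. It is my own variation, not part of the answer above: it converts the decimal hours with timedelta (which rounds to the nearest microsecond, so the minutes come out as 42, 48, 6 and 24 without any manual seconds arithmetic) and pairs explicit ticks with a DateFormatter. The names base, tick_times, tick_nums and formatter are mine.

```python
import datetime as dt
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, date2num

x = [16.7, 16.8, 17.1, 17.4]  # time of day in decimal hours
y = [15, 17, 14, 16]
e = [0.8, 1.2, 1.1, 0.9]

# Convert decimal hours to datetimes on the chosen day
base = dt.datetime(2012, 1, 1)
times = [base + dt.timedelta(hours=h) for h in x]

fig, ax = plt.subplots()
ax.errorbar(times, y, yerr=e, fmt="+", elinewidth=2, capsize=0, color="k", ecolor="k")

# Explicit ticks every 10 minutes from 16:40 to 17:30
first_tick = dt.datetime(2012, 1, 1, 16, 40)
tick_times = [first_tick + dt.timedelta(minutes=10 * i) for i in range(6)]
tick_nums = date2num(tick_times)
ax.set_xticks(tick_nums)

# Without an explicit formatter the labels may fall back to a coarse date format
formatter = DateFormatter("%H:%M")
ax.xaxis.set_major_formatter(formatter)
ax.set_xlim(dt.datetime(2012, 1, 1, 16, 35), dt.datetime(2012, 1, 1, 17, 35))
# call plt.show() or fig.savefig(...) to look at the result
```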
In matplotlib, what is a way of converting the text box size into data coordinates?
For example, in this toy script I'm fine-tuning the coordinates of the text box so that it's next to a data point.
#!/usr/bin/python
import matplotlib.pyplot as plt
xx=[1,2,3]
yy=[2,3,4]
dy=[0.1,0.2,0.05]
fig=plt.figure()
ax=fig.add_subplot(111)
ax.errorbar(xx,yy,dy,fmt='ro-',ms=6,elinewidth=4)
# HERE: can one get the text bbox size?
txt=ax.text(xx[1]-0.1,yy[1]-0.4,r'$S=0$',fontsize=16)
ax.set_xlim([0.,3.4])
ax.set_ylim([0.,4.4])
plt.show()
Is there a way of doing something like this pseudocode instead?
x = xx[1] - text_height
y = yy[1] - text_width/2
ax.text(x,y,text)
Generally speaking, you can't get the size of the text until after it's drawn (thus the hacks in @DSM's answer).
For what you're wanting to do, you'd be far better off using annotate.
E.g. ax.annotate('Your text string', xy=(x, y), xytext=(x-0.1, y-0.4))
Note that you can specify the offset in points as well, and thus offset the text by its height (just specify textcoords='offset points').
If you're wanting to adjust vertical alignment, horizontal alignment, etc, just add those as arguments to annotate (e.g. horizontalalignment='right' or equivalently ha='right')
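Putting those pieces together on the toy data from the question, a minimal sketch could look like this. The 12 point downward offset is an arbitrary value I picked to clear the marker, so adjust it to taste.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt

xx = [1, 2, 3]
yy = [2, 3, 4]
dy = [0.1, 0.2, 0.05]

fig, ax = plt.subplots()
ax.errorbar(xx, yy, dy, fmt="ro-", ms=6, elinewidth=4)

# Anchor the label at the data point and shift it 12 points straight down;
# the offset is in points, so it stays the same however the axes are scaled
ann = ax.annotate(
    r"$S=0$",
    xy=(xx[1], yy[1]),
    xytext=(0, -12),
    textcoords="offset points",
    ha="center",
    va="top",
    fontsize=16,
)
ax.set_xlim([0.0, 3.4])
ax.set_ylim([0.0, 4.4])
# call plt.show() or fig.savefig(...) to look at the result
```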
I'm not happy with it at all, but the following works; I was getting frustrated until I found this code for a similar problem, which suggested a way to get at the renderer.
import matplotlib.pyplot as plt
xx=[1,2,3]
yy=[2,3,4]
dy=[0.1,0.2,0.05]
fig=plt.figure()
figname = "out.png"
ax=fig.add_subplot(111)
ax.errorbar(xx,yy,dy,fmt='ro-',ms=6,elinewidth=4)
# start of hack to get renderer
fig.savefig(figname)
renderer = plt.gca().get_renderer_cache()
# end of hack
txt = ax.text(xx[1], yy[1],r'$S=0$',fontsize=16)
tbox = txt.get_window_extent(renderer)
dbox = tbox.transformed(ax.transData.inverted())
text_width = dbox.x1-dbox.x0
text_height = dbox.y1-dbox.y0
x = xx[1] - text_height
y = yy[1] - text_width/2
txt.set_position((x,y))
ax.set_xlim([0.,3.4])
ax.set_ylim([0.,4.4])
fig.savefig(figname)
OTOH, while this might get the text box out of the actual data point, it doesn't necessarily get the box out of the way of the marker, or the error bar. So I don't know how useful it'll be in practice, but I guess it wouldn't be that hard to loop over all the drawn objects and move the text until it's out of the way. I think the linked code tries something similar.
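One caveat on the code above: as far as I can tell, get_renderer_cache() is no longer available in recent Matplotlib releases. A hedged sketch of the same idea, assuming the Agg backend (whose canvas provides get_renderer()), is below. It also sets the axis limits before measuring, because the text size in data units depends on them, and it centres the label under the point, which is my own choice of placement rather than the pseudocode from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: Agg canvas, which provides get_renderer()
import matplotlib.pyplot as plt

xx = [1, 2, 3]
yy = [2, 3, 4]
dy = [0.1, 0.2, 0.05]

fig, ax = plt.subplots()
ax.errorbar(xx, yy, dy, fmt="ro-", ms=6, elinewidth=4)

# Fix the limits first: the size of the text in data units depends on them
ax.set_xlim([0.0, 3.4])
ax.set_ylim([0.0, 4.4])

txt = ax.text(xx[1], yy[1], r"$S=0$", fontsize=16)

# Draw once so the text has a real extent, then measure it in pixels
fig.canvas.draw()
renderer = fig.canvas.get_renderer()
tbox = txt.get_window_extent(renderer)

# Convert the pixel bounding box to data coordinates
dbox = tbox.transformed(ax.transData.inverted())
text_width = dbox.width
text_height = dbox.height

# Centre the label horizontally and place it below the data point
txt.set_position((xx[1] - text_width / 2, yy[1] - 1.5 * text_height))
# call fig.savefig(...) to write the result
```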
Edit: Please note that this was clearly a courtesy accept; I would use Joe Kington's solution if I actually wanted to do this, and so should everyone else. :^)