how to draw an asymptote with a dashed line? - python

I would like the asymptotes of the tg(x) function to be drawn with a dashed line, but I don't know how to change that in this code:
import matplotlib.ticker as tck
import matplotlib.pyplot as plt
import numpy as np
f,ax=plt.subplots(figsize=(8,5))
x=np.linspace(-np.pi, np.pi,100)
y=np.sin(x)/np.cos(x)
plt.ylim([-4, 4])
plt.title("f(x) = tg(x)")
plt.xlabel("x")
plt.ylabel("y")
ax.plot(x/np.pi,y)
ax.xaxis.set_major_formatter(tck.FormatStrFormatter(r'%g $\pi$'))

Interesting question. My approach is to look for the discontinuities by examining the derivative of the function, and to split the original function at the locations of these discontinuities.
So for tan(x), since the derivative is always positive (outside of the asymptotes) we look for points where np.diff(y) < 0. Based on all the locations where the previous condition is true, we split up the original function into segments and plot those individually (with the same plot properties so the lines look the same) and then plot black dashed lines separately. The following code shows this working:
import matplotlib.ticker as tck
import matplotlib.pyplot as plt
import numpy as np
f,ax=plt.subplots(figsize=(8,5))
x=np.linspace(-np.pi, np.pi,100)
y=np.sin(x)/np.cos(x)
plt.ylim([-4, 4])
plt.title("f(x) = tg(x)")
plt.xlabel("x")
plt.ylabel("y")
ax.xaxis.set_major_formatter(tck.FormatStrFormatter(r'%g $\pi$'))
# Search for points with negative slope
dydx = np.diff(y)
negativeSlopeIdx = np.nonzero(dydx < 0)[0]
# Take those points and parse the original function into segments to plot
yasymptote = np.array([-4, 4])
iprev = 0
for i in negativeSlopeIdx:
    ax.plot(x[iprev:i-1]/np.pi, y[iprev:i-1], "b", linewidth=2)
    ax.plot(np.array([x[i], x[i]])/np.pi, yasymptote, "--k")
    iprev = i+1
ax.plot(x[iprev:]/np.pi, y[iprev:], "b", linewidth=2)
plt.show()
The final plot shows the blue tangent branches separated by black dashed vertical lines at the asymptotes.
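A related sketch, not the code from the answer above: instead of slicing the curve into segments, the sample just before each jump can be set to NaN so that Matplotlib interrupts the line there, and each asymptote can be drawn with ax.axvline. The denser grid of 1000 points and the names y_broken and x_asym are assumptions made for illustration, and the asymptote position is only approximated by the midpoint of the two samples around the jump.

```python
import matplotlib.pyplot as plt
import matplotlib.ticker as tck
import numpy as np

x = np.linspace(-np.pi, np.pi, 1000)
y = np.tan(x)

# tan(x) increases on every branch, so a negative difference marks a jump
jump = np.nonzero(np.diff(y) < 0)[0]

# NaN values interrupt the line, so neighbouring branches are not connected
y_broken = y.copy()
y_broken[jump] = np.nan

# approximate each asymptote by the midpoint of the samples around the jump
x_asym = 0.5 * (x[jump] + x[jump + 1])

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x / np.pi, y_broken, "b", linewidth=2)
for xa in x_asym:
    ax.axvline(xa / np.pi, color="k", linestyle="--")
ax.set_ylim(-4, 4)
ax.set_title("f(x) = tg(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.xaxis.set_major_formatter(tck.FormatStrFormatter(r"%g $\pi$"))
```

Call plt.show() afterwards to display the figure. Because the curve stays a single line, it keeps one color and one legend entry.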

Related

retrieve leaf colors from scipy dendrogram

I cannot get the leaf colors from the scipy dendrogram dictionary. As stated in the documentation and in this github issue, the color_list key in the dendrogram dictionary refers to the links, not the leaves. It would be nice to have another key referring to the leaves; sometimes you need this for coloring other types of graphics, such as the scatter plot in the example below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
# DATA EXAMPLE
x = np.array([[ 5, 3],
[10,15],
[15,12],
[24,10],
[30,30],
[85,70],
[71,80]])
# DENDROGRAM
plt.figure()
plt.subplot(121)
z = linkage(x, 'single')
d = dendrogram(z)
# COLORED PLOT
# This is what I would like to achieve. Colors are assigned manually by looking
# at the dendrogram, because I failed to get it from d['color_list'] (it refers
# to links, not observations)
plt.subplot(122)
points = d['leaves']
colors = ['r','r','g','g','g','g','g']
for point, color in zip(points, colors):
    plt.plot(x[point, 0], x[point, 1], 'o', color=color)
Manual color assignment seems easy in this example, but I'm dealing with huge datasets, so until we get this new feature in the dictionary (color leaves), I'm trying to infer it somehow with the current information contained in the dictionary but I'm out of ideas so far. Can anyone help me?
Thanks.
In scipy 1.7.1 this functionality is available: the dictionary returned by the dendrogram function also contains a 'leaves_color_list' entry, which makes this task easy.
Here is a working version of the OP's code (see the last line, marked "NEW CODE"):
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
# DATA EXAMPLE
x = np.array([[ 5, 3],
[10,15],
[15,12],
[24,10],
[30,30],
[85,70],
[71,80]])
# DENDROGRAM
plt.figure()
plt.subplot(121)
z = linkage(x, 'single')
d = dendrogram(z)
# COLORED PLOT
# Colors come from d['leaves_color_list'], which has one entry per leaf in the
# same order as d['leaves']
plt.subplot(122)
#NEW CODE
plt.scatter(x[d['leaves'],0],x[d['leaves'],1], color=d['leaves_color_list'])
The following approach seems to work. The dictionary returned by dendrogram contains 'color_list' with the colors of the linkages, and 'icoord' and 'dcoord' with the x and y plot coordinates of these linkages, respectively. The x-positions are 5, 15, 25, ... where a linkage starts at a point. So testing these x-positions takes us back from the linkage to the corresponding point, and lets us assign the color of the linkage to that point.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
# DATA EXAMPLE
x = np.random.uniform(0, 10, (20, 2))
# DENDROGRAM
plt.figure()
plt.subplot(121)
z = linkage(x, 'single')
d = dendrogram(z)
plt.yticks([])
# COLORED PLOT
plt.subplot(122)
points = d['leaves']
colors = ['none'] * len(points)
for xs, c in zip(d['icoord'], d['color_list']):
    for xi in xs:
        if xi % 10 == 5:
            colors[(int(xi)-5) // 10] = c
for point, color in zip(points, colors):
    plt.plot(x[point, 0], x[point, 1], 'o', color=color)
    plt.text(x[point, 0], x[point, 1], f' {point}')
plt.show()
PS: This post about matching points with their clusters might also be relevant.
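The index arithmetic in that answer can be isolated in a small function and tried on hand-made input. This is only a sketch: leaf_colors_from_links is a hypothetical helper, not part of scipy, and the toy icoord, dcoord and color_list values merely imitate the shape of the dendrogram output. The extra height check is an addition to the answer's logic. It assumes a default, non-truncated dendrogram in which every leaf sits at height 0, and is meant to avoid treating an internal junction whose x-position happens to end in 5 as a leaf, which can in principle occur.

```python
def leaf_colors_from_links(icoord, dcoord, color_list, n_leaves):
    """Hypothetical helper: one color per leaf, in dendrogram (left-to-right) order."""
    colors = ["none"] * n_leaves
    for xs, ys, c in zip(icoord, dcoord, color_list):
        # each link has two lower ends: (xs[0], ys[0]) and (xs[3], ys[3])
        for xi, yi in ((xs[0], ys[0]), (xs[3], ys[3])):
            # assumption: an end resting on a leaf has height 0 and x = 5, 15, 25, ...
            if yi == 0 and xi % 10 == 5:
                colors[(int(xi) - 5) // 10] = c
    return colors


# hand-made stand-in for the dendrogram output with four leaves
icoord = [[5, 5, 15, 15], [25, 25, 35, 35], [10, 10, 30, 30]]
dcoord = [[0, 1, 1, 0], [0, 2, 2, 0], [1, 3, 3, 2]]
color_list = ["C1", "C2", "C0"]
leaf_colors = leaf_colors_from_links(icoord, dcoord, color_list, 4)
```

With real output d = dendrogram(z), the call would be leaf_colors_from_links(d['icoord'], d['dcoord'], d['color_list'], len(d['leaves'])), and the k-th color belongs to the observation d['leaves'][k].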

Change mean indicator in violin plot to a circle

I want to change the look of the mean in violin plots. I am using matplotlib. I could change the color of the means with the following code:
import matplotlib.pyplot as plt
fig,(axes1,axes2,axes3) = plt.subplots(nrows=3,ncols=1,figsize=(10,20))
r=axes2.violinplot(D,showmeans=True,showmedians=True)
r['cmeans'].set_color('red')
But now I want to change the look of the mean (currently a line, like the median) to a 'small circle'.
Can someone help me with this?
One idea is to obtain the coordinates of the mean lines and draw a scatter plot at those coordinates.
The coordinates can be obtained either by looping over the paths of the mean lines,
# loop over the paths of the mean lines
xy = [[l.vertices[:,0].mean(),l.vertices[0,1]] for l in r['cmeans'].get_paths()]
xy = np.array(xy)
or by recalculating the mean from the input data.
#alternatively get the means from the data
y = data.mean(axis=0)
x = np.arange(1,len(y)+1)
xy=np.c_[x,y]
Complete code:
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(1)
data = np.random.normal(size=(50, 2))
fig,ax = plt.subplots()
r=ax.violinplot(data,showmeans=True)
# loop over the paths of the mean lines
xy = [[l.vertices[:,0].mean(),l.vertices[0,1]] for l in r['cmeans'].get_paths()]
xy = np.array(xy)
##alternatively get the means from the data
#y = data.mean(axis=0)
#x = np.arange(1,len(y)+1)
#xy=np.c_[x,y]
ax.scatter(xy[:,0], xy[:,1],s=121, c="crimson", marker="o", zorder=3)
# make lines invisible
r['cmeans'].set_visible(False)
plt.show()
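As a quick check that the two routes agree, the coordinates read from the mean lines can be compared with the means recomputed from the data. This is a sketch under the assumption that the violins sit at their default positions 1, 2, ...; the names xy_paths and xy_data are made up for the comparison.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(50, 2))

fig, ax = plt.subplots()
r = ax.violinplot(data, showmeans=True)

# route 1: read the horizontal mean lines drawn by violinplot
xy_paths = np.array([[p.vertices[:, 0].mean(), p.vertices[0, 1]]
                     for p in r["cmeans"].get_paths()])

# route 2: recompute the means; violins sit at x = 1, 2, ... by default
y = data.mean(axis=0)
xy_data = np.c_[np.arange(1, len(y) + 1), y]

# draw the circles and hide the original mean lines
ax.scatter(xy_data[:, 0], xy_data[:, 1], s=121, c="crimson", marker="o", zorder=3)
r["cmeans"].set_visible(False)
```

If custom positions are passed to violinplot, the second route has to use those positions instead of np.arange, while the first route picks them up from the drawn lines.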

plot log-scale and linear scale functions and histograms on same canvas

I have a probability density function for which I can only evaluate the logarithm without running into numerical issues. I also have a histogram that I would like to plot on the same canvas. However, for the histogram I need the option log=True to have it plotted in log scale, whereas for the function I only have the logarithms of the values directly. How can I plot both on the same canvas?
Please look at this MWE for illustration of the problem:
import matplotlib.pyplot as plt
import random
import math
import numpy as np
sqrt2pi = math.sqrt(2*math.pi)
def gauss(l):
    # standard normal density: exp(-x^2/2) / sqrt(2*pi)
    return [ 1/sqrt2pi * math.exp(-x*x/2) for x in l]
def loggauss(l):
    # logarithm of the density above
    return [ -math.log(sqrt2pi) -x*x/2 for x in l ]
# just fill a histogram
h = [ random.gauss(0,1) for x in range(0,1000) ]
plt.hist(h,bins=21,density=True,log=True)  # density replaces the removed normed argument
# this works nicely
xvals = np.arange(-4,4,0.1)
plt.plot(xvals,gauss(xvals),"-k")
# but I would like to plot this on the same canvas:
# plt.plot(xvals,loggauss(xvals),"-r")
plt.show()
Any suggestions?
If I understand correctly, you want to plot two data sets in the same figure, on the same x-axis, but one on a log y-scale and one on a linear y-scale. You can do this using twinx:
fig, ax = plt.subplots()
ax.hist(h,bins=21,density=True,log=True)  # density replaces the removed normed argument
ax2 = ax.twinx()
ax2.plot(xvals, loggauss(xvals), '-r')
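With twinx alone the two y-axes are scaled independently, so the red curve does not necessarily line up with the histogram. One possible way to make them coincide, sketched here as an assumption rather than as part of the answer above, is to set the limits of the linear axis to the natural logarithm of the limits of the log axis. The limits 1e-4 and 1.0 and the name log_pdf are chosen only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
h = rng.normal(0, 1, 1000)                          # sample to histogram
xvals = np.arange(-4, 4, 0.1)
log_pdf = -0.5 * np.log(2 * np.pi) - xvals**2 / 2   # log of the standard normal density

fig, ax = plt.subplots()
ax.hist(h, bins=21, density=True, log=True)         # log-scaled y-axis

ax2 = ax.twinx()                                    # shares x, has its own linear y-axis
ax2.plot(xvals, log_pdf, "-r")

# a value v on the log axis and ln(v) on the linear axis land at the same height
ymin, ymax = 1e-4, 1.0
ax.set_ylim(ymin, ymax)
ax2.set_ylim(np.log(ymin), np.log(ymax))
```

Call plt.show() afterwards to display the figure. The right-hand axis then reads the log-density directly, while the left-hand axis keeps the usual log-scale ticks.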

pcolormesh adds empty white columns

I have been trying to do a simple heatmap with pcolormesh and I run into this weird effect with some sizes, which add empty white columns. If I create a 10x30, as below, it works perfectly.
from matplotlib import pyplot as plt
import numpy as np
d = []
for x in range(10):
    d.append([])
    for y in range(30):
        d[-1].append(y)
plt.pcolormesh(np.array(d))
plt.show()
But, if I try with a 10x37:
from matplotlib import pyplot as plt
import numpy as np
d = []
for x in range(10):
    d.append([])
    for y in range(37):
        d[-1].append(y)
plt.pcolormesh(np.array(d))
plt.show()
I get those weird white columns at the end. This seems to hold for a couple of values (10x11 fails, but 10x12 works); I wasn't able to discern a pattern.
Is there any way to remove them, maybe forcing the final size of the heatmap?
In terms of axes limits and aspect ratio, pcolormesh acts less like an image, and more like a line plot. If you want to show the elements of an array as pixels, you can use imshow. Alternatively, you can set the x-limits of your pcolormesh plot. Consider the following example:
from matplotlib import pyplot as plt
import numpy as np
d1 = []
d2 = []
for x in range(10):
    d1.append([])
    d2.append([])
    for y in range(30):
        d1[-1].append(y+x)
    for y in range(37):
        d2[-1].append(y+x)
# convert to arrays so that .shape is available below
d1 = np.array(d1)
d2 = np.array(d2)
fig, axes = plt.subplots(ncols=4, figsize=(10,4))
# your first two examples
axes[0].pcolormesh(d1, cmap=plt.cm.coolwarm)
axes[1].pcolormesh(d2, cmap=plt.cm.coolwarm)
# let's reset the x-lims on this
axes[2].pcolormesh(d2, cmap=plt.cm.coolwarm)
axes[2].set_ylim(bottom=0, top=d2.shape[0])
axes[2].set_xlim(left=0, right=d2.shape[1])
# or more concisely (thanks Joe):
axes[2].axis('tight')
# and use imshow here
axes[3].imshow(d2, cmap=plt.cm.coolwarm)
plt.show()
and that gives us:

Plotting a Lognormal Distribution

I am trying to plot a lognormal distribution so I can compare it with a histogram of my sample data using the code below but my plot does not look right. Is there something with my code that I am not doing correctly?
The C array has a length of 17576
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import lognorm
data=np.loadtxt(F)
C=data[:,3]
x = np.ma.log(C)
avg = np.mean(x)
std = np.std(x)
dist=lognorm(std,loc=avg)
plt.plot(C,dist.pdf(C),'r')
plt.show()
It looks like your x data are not in sorted order. Try this:
ind = np.argsort(C)
xx = C[ind]
yy = dist.pdf(C)[ind]
plt.plot(xx, yy, 'r')
plot just connects all the (x, y) pairs with straight lines in the order given, so you need to make sure you trace your function from left to right (or right to left). Alternatively, you can skip the connecting lines and plot only markers:
plt.plot(C, dist.pdf(C), 'ro')
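The sorting step can be tried on synthetic data. The sketch below is not the asker's setup: the sample is generated with NumPy, and the density is written out with the textbook lognormal formula (the hypothetical helper lognormal_pdf) instead of scipy's lognorm, so that it runs with NumPy and Matplotlib only.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
C = rng.lognormal(mean=0.0, sigma=0.5, size=200)   # synthetic, unsorted sample


def lognormal_pdf(x, mu, sigma):
    """Textbook lognormal density for x > 0 (hypothetical helper)."""
    return (np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma ** 2))
            / (x * sigma * np.sqrt(2 * np.pi)))


# estimate the parameters from the logarithm of the sample
logC = np.log(C)
mu, sigma = logC.mean(), logC.std()

# sort x and reorder y the same way, so the line is traced from left to right
ind = np.argsort(C)
xx = C[ind]
yy = lognormal_pdf(C, mu, sigma)[ind]

fig, ax = plt.subplots()
ax.hist(C, bins=30, density=True, alpha=0.4)
ax.plot(xx, yy, "r")
```

Evaluating the density on a regular grid such as np.linspace(C.min(), C.max(), 200) instead of on the sample itself would avoid the sorting step altogether.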
