Plot max of graph using python - python

I'm new to python and have a question. I've figured out how to graph functions, but how do I plot a point which indicates the max and minimum values? Here is my code, and it graphs properly I believe. Thank you.
import numpy as np
import matplotlib.pyplot as plt
def graph(formula, x_range):
x = np.array(x_range)
y = eval(formula)
plt.plot(x, y)
plt.show()
graph('-x**4 + 508 * x + 40', range(-10, 200))

n_max = y.argmax()
plt.plot(x[n_max],y[n_max],'o')
n_min = y.argmin()
plt.plot(x[n_min],y[n_min],'x')
something like this?

Related

Strange output in matplotlib

Can someone explain why I get this strange output when running this code:
import matplotlib.pyplot as plt
import numpy as np
def x_y():
return np.random.randint(9999, size=1000), np.random.randint(9999, size=1000)
plt.plot(x_y())
plt.show()
The output:
Your data is a tuple of two 1000 length arrays.
def x_y():
return np.random.randint(9999, size=1000), np.random.randint(9999, size=1000)
xy = x_y()
print(len(xy))
# > 2
print(xy[0].shape)
# > (1000,)
Let's read pyplot's documentation:
plot(y) # plot y using x as index array 0..N-1
Thus pyplot will plot a line between (0, xy[0][i]) and (1, xy[1][i]), for i in range(1000).
You probably try to do this:
plt.plot(*x_y())
This time, it will plot 1000 points joined by lines: (xy[0][i], xy[1][i]) for i in range 1000.
Yet, the lines don't represent anything here. Therefore you probably want to see individual points:
plt.scatter(*x_y())
Your function x_y is returning a tuple, assigning each element to a variable gives the correct output.
import matplotlib.pyplot as plt
import numpy as np
def x_y():
return np.random.randint(9999, size=1000), np.random.randint(9999, size=1000)
x, y = x_y()
plt.plot(x, y)
plt.show()

Something wrong about matplotlib.pyplot.contour

I used matplotlib.pyplot.contour to draw a line, but the result is strange.
My python code:
import numpy as np
from matplotlib import pyplot as plt
N = 1000
E = np.linspace(-5,0,N)
V = np.linspace(0, 70,N)
E, V = np.meshgrid(E, V)
L = np.sqrt(-E)
R = -np.sqrt(E+V)/np.tan(np.sqrt(E+V))
plt.contour(V, E,(L-R),levels=[0])
plt.show()
The result is:
But when I use Mathematica, the result is different.
Mathematica code is:
ContourPlot[Sqrt[-en] == -Sqrt[en + V]/Tan[Sqrt[en + V]], {V, 0, 70}, {en, -5, 0}]
The result is:
The result that I want is Mathematica's result.
Why does matplotlib.pyplot.contour give the wrong result? I am very confused!
It would be very appreciate if you can give me some idea! Thank you very much!
The result given by matplotlib.pyplot.contour is numerically correct, but mathematically wrong.
Check what happens if you simply plot the tan(x):
import numpy as np
from matplotlib import pyplot as plt
x = np.linspace(0,2*np.pi,1000)
y = np.tan(x)
plt.plot(x,y)
plt.show()
You will get a line at the poles. This is because subsequent points are connected.
You can circumvent this by using np.inf for points larger than a certain number. E.g. adding
y[np.abs(y)> 200] = np.inf
would result in
The same approach can be used for the contour.
import numpy as np
from matplotlib import pyplot as plt
N = 1000
x = np.linspace(0, 70,N)
y = np.linspace(-5,0,N)
X,Y = np.meshgrid(x, y)
F = np.sqrt(-Y) + np.sqrt(Y+X)/np.tan(np.sqrt(Y+X))
F[np.abs(F) > 200] = np.inf
plt.contour(X, Y, F, levels=[0])
plt.show()

Tangent to curve interpolated from discrete data

I was wondering if there's a way to find tangents to curve from discrete data.
For example:
x = np.linespace(-100,100,100001)
y = sin(x)
so here x values are integers, but what if we want to find tangent at something like x = 67.875?
I've been trying to figure out if numpy.interp would work, but so far no luck.
I also found a couple of similar examples, such as this one, but haven't been able to apply the techniques to my case :(
I'm new to Python and don't entirely know how everything works yet, so any help would be appreciated...
this is what I get:
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-100,100,10000)
y = np.sin(x)
tck, u = interpolate.splprep([y])
ti = np.linspace(-100,100,10000)
dydx = interpolate.splev(ti,tck,der=1)
plt.plot(x,y)
plt.plot(ti,dydx[0])
plt.show()
There is a comment in this answer, which tells you that there is a difference between splrep and splprep. For the 1D case you have here, splrep is completely sufficient.
You may also want to limit your curve a but to be able to see the oscilations.
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-15,15,1000)
y = np.sin(x)
tck = interpolate.splrep(x,y)
dydx = interpolate.splev(x,tck,der=1)
plt.plot(x,y)
plt.plot(x,dydx, label="derivative")
plt.legend()
plt.show()
While this is how the code above would be made runnable, it does not provide a tangent. For the tangent you only need the derivative at a single point. However you need to have the equation of a tangent somewhere and actually use it; so this is more a math question.
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-15,15,1000)
y = np.sin(x)
tck = interpolate.splrep(x,y)
x0 = 7.3
y0 = interpolate.splev(x0,tck)
dydx = interpolate.splev(x0,tck,der=1)
tngnt = lambda x: dydx*x + (y0-dydx*x0)
plt.plot(x,y)
plt.plot(x0,y0, "or")
plt.plot(x,tngnt(x), label="tangent")
plt.legend()
plt.show()
It should be noted that you do not need to use splines at all if the points you have are dense enough. In that case obtaining the derivative is just taking the differences between the nearest points.
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-15,15,1000)
y = np.sin(x)
x0 = 7.3
i0 = np.argmin(np.abs(x-x0))
x1 = x[i0:i0+2]
y1 = y[i0:i0+2]
dydx, = np.diff(y1)/np.diff(x1)
tngnt = lambda x: dydx*x + (y1[0]-dydx*x1[0])
plt.plot(x,y)
plt.plot(x1[0],y1[0], "or")
plt.plot(x,tngnt(x), label="tangent")
plt.legend()
plt.show()
The result will be visually identical to the one above.

plotting 2dhistogram with sum value rather than count

Beginner user on the forum. Help please. I have a data set: x, y coordinates, each x, y has a value. I want to plot a 2d histogram displaying the sum of the values in each bin with color scale. matplotlib hexbin is straight forward. I can do this. eg:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LogNorm
xpos = np.random.rand(0,10)
ypos = np.random.rand(0,10)
plt.hexbin(x = xpos, y = ypos, C=mass, cmap= plt.cm.jet, gridsize=100, reduce_C_function=sum, bins="log")
cb = plt.colorbar()
cb.ax.set_ylabel('log (sum value in each bin)')
plt.xlabel('Xpos')
plt.ylabel('Ypos')
plt.show()
However, I'm struggling to make a similar plot with histogram2d or matplotlib hist2d. I think i have to combine binned_statistic_2d and histogram2d somehow. No problem if I replace plt.hexbin line above to this:
plt.hist2d(x = xpos, y = ypos, bins = 50, norm = LogNorm())
Any clue? I have look on the forum but can't seem to find a working code.
You could calculate the values to show in the binned 2D plot prior to plotting and then show as an imshow plot.
If you're happy to use pandas, one option would be to group the mass data accordings to cut (pandas.cut) x and y data. Then apply the sum (.sum()) and unstack to obtain a pivot table.
df.mass.groupby([pd.cut(df.x, bins=xbins, include_lowest=True),
pd.cut(df.y, bins=ybins, include_lowest=True)]) \
.sum().unstack(fill_value=0)
Here is a complete example:
import numpy as np; np.random.seed(1)
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.colors
xpos = np.random.randint(0,10, size=50)
ypos = np.random.randint(0,10, size=50)
mass = np.random.randint(0,75, size=50)
df = pd.DataFrame({"x":xpos, "y":ypos, "mass":mass})
xbins = range(10)
ybins = range(10)
su = df.mass.groupby([pd.cut(df.x, bins=xbins, include_lowest=True),
pd.cut(df.y, bins=ybins, include_lowest=True)]) \
.sum().unstack(fill_value=0)
print su
im = plt.imshow(su.values, norm=matplotlib.colors.LogNorm(1,300))
plt.xticks(range(len(su.index)), su.index, rotation=90)
plt.yticks(range(len(su.columns)), su.columns)
plt.colorbar(im)
plt.show()

Plotting profile hitstograms in python

I am trying to make a profile plot for two columns of a pandas.DataFrame. I would not expect this to be in pandas directly but it seems there is nothing in matplotlib either. I have searched around and cannot find it in any package other than rootpy. Before I take the time to write this myself I thought I would ask if there was a small package that contained profile histograms, perhaps where they are known by a different name.
If you don't know what I mean by "profile histogram" have a look at the ROOT implementation. http://root.cern.ch/root/html/TProfile.html
You can easily do it using scipy.stats.binned_statistic.
import scipy.stats
import numpy
import matplotlib.pyplot as plt
x = numpy.random.rand(10000)
y = x + scipy.stats.norm(0, 0.2).rvs(10000)
means_result = scipy.stats.binned_statistic(x, [y, y**2], bins=50, range=(0,1), statistic='mean')
means, means2 = means_result.statistic
standard_deviations = numpy.sqrt(means2 - means**2)
bin_edges = means_result.bin_edges
bin_centers = (bin_edges[:-1] + bin_edges[1:])/2.
plt.errorbar(x=bin_centers, y=means, yerr=standard_deviations, linestyle='none', marker='.')
Use seaborn. Data as from #MaxNoe
import numpy as np
import seaborn as sns
# just some random numbers to get started
x = np.random.uniform(-2, 2, 10000)
y = np.random.normal(x**2, np.abs(x) + 1)
sns.regplot(x=x, y=y, x_bins=10, fit_reg=None)
You can do much more (error bands are from bootstrap, you can change the estimator on the y-axis, add regression, ...)
While #Keith's answer seems to fit what you mean, it is quite a lot of code. I think this can be done much simpler, so one gets the key concepts and can adjust and build on top of it.
Let me stress one thing: what ROOT is calling a ProfileHistogram is not a special kind of plot. It is an errorbar plot. Which can simply be done in matplotlib.
It is a special kind of computation and that's not the task of a plotting library. This lies in the pandas realm, and pandas is great at stuff like this. It's symptomatical for ROOT as the giant monolithic pile it is to have an extra class for this.
So what you want to do is: discretize in some variable x and for each bin, calculate something in another variable y.
This can easily done using np.digitize together with the pandas groupy and aggregate methods.
Putting it all together:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# just some random numbers to get startet
x = np.random.uniform(-2, 2, 10000)
y = np.random.normal(x**2, np.abs(x) + 1)
df = pd.DataFrame({'x': x, 'y': y})
# calculate in which bin row belongs base on `x`
# bins needs the bin edges, so this will give as 100 equally sized bins
bins = np.linspace(-2, 2, 101)
df['bin'] = np.digitize(x, bins=bins)
bin_centers = 0.5 * (bins[:-1] + bins[1:])
bin_width = bins[1] - bins[0]
# grouby bin, so we can calculate stuff
binned = df.groupby('bin')
# calculate mean and standard error of the mean for y in each bin
result = binned['y'].agg(['mean', 'sem'])
result['x'] = bin_centers
result['xerr'] = bin_width / 2
# plot it
result.plot(
x='x',
y='mean',
xerr='xerr',
yerr='sem',
linestyle='none',
capsize=0,
color='black',
)
plt.savefig('result.png', dpi=300)
Just like ROOT ;)
I made a module myself for this functionality.
import pandas as pd
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
def Profile(x,y,nbins,xmin,xmax,ax):
df = DataFrame({'x' : x , 'y' : y})
binedges = xmin + ((xmax-xmin)/nbins) * np.arange(nbins+1)
df['bin'] = np.digitize(df['x'],binedges)
bincenters = xmin + ((xmax-xmin)/nbins)*np.arange(nbins) + ((xmax-xmin)/(2*nbins))
ProfileFrame = DataFrame({'bincenters' : bincenters, 'N' : df['bin'].value_counts(sort=False)},index=range(1,nbins+1))
bins = ProfileFrame.index.values
for bin in bins:
ProfileFrame.ix[bin,'ymean'] = df.ix[df['bin']==bin,'y'].mean()
ProfileFrame.ix[bin,'yStandDev'] = df.ix[df['bin']==bin,'y'].std()
ProfileFrame.ix[bin,'yMeanError'] = ProfileFrame.ix[bin,'yStandDev'] / np.sqrt(ProfileFrame.ix[bin,'N'])
ax.errorbar(ProfileFrame['bincenters'], ProfileFrame['ymean'], yerr=ProfileFrame['yMeanError'], xerr=(xmax-xmin)/(2*nbins), fmt=None)
return ax
def Profile_Matrix(frame):
#Much of this is stolen from https://github.com/pydata/pandas/blob/master/pandas/tools/plotting.py
import pandas.core.common as com
import pandas.tools.plotting as plots
from pandas.compat import lrange
from matplotlib.artist import setp
range_padding=0.05
df = frame._get_numeric_data()
n = df.columns.size
fig, axes = plots._subplots(nrows=n, ncols=n, squeeze=False)
# no gaps between subplots
fig.subplots_adjust(wspace=0, hspace=0)
mask = com.notnull(df)
boundaries_list = []
for a in df.columns:
values = df[a].values[mask[a].values]
rmin_, rmax_ = np.min(values), np.max(values)
rdelta_ext = (rmax_ - rmin_) * range_padding / 2.
boundaries_list.append((rmin_ - rdelta_ext, rmax_+ rdelta_ext))
for i, a in zip(lrange(n), df.columns):
for j, b in zip(lrange(n), df.columns):
common = (mask[a] & mask[b]).values
nbins = 100
(xmin,xmax) = boundaries_list[i]
ax = axes[i, j]
Profile(df[a][common],df[b][common],nbins,xmin,xmax,ax)
ax.set_xlabel('')
ax.set_ylabel('')
plots._label_axis(ax, kind='x', label=b, position='bottom', rotate=True)
plots._label_axis(ax, kind='y', label=a, position='left')
if j!= 0:
ax.yaxis.set_visible(False)
if i != n-1:
ax.xaxis.set_visible(False)
for ax in axes.flat:
setp(ax.get_xticklabels(), fontsize=8)
setp(ax.get_yticklabels(), fontsize=8)
return axes
To my knowledge matplotlib doesn't still allow to directly produce profile histograms.
You can instead give a look at Hippodraw, a package developed at SLAC, that can be used as a Python extension module.
Here there is a Profile histogram example:
http://www.slac.stanford.edu/grp/ek/hippodraw/datareps_root.html#datareps_profilehist

Categories

Resources