Make quadratic regression lines connect seamlessly with matplotlib - python

I have a basic graph example and I am trying to make all the points lie on some sort of curved line. I have an idea of how to go about this, but I am not sure how to implement it or whether it is even possible. Below I have a picture of the graph that I made with the following code:
import matplotlib.pyplot as plt
import numpy as np
# original data
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 7, 3, 4, 5, 1, 6, 9, 4, 6]
# quadratic regression
for i in range(int((len(x) + len(y)) / 2)):
    sub_x = x[i:i+3]
    sub_y = y[i:i+3]
    model = np.poly1d(np.polyfit(sub_x, sub_y, 2))
    polyline = np.linspace(min(sub_x), max(sub_x), 200)
    plt.plot(polyline, model(polyline), color="#6D34D6", linestyle='dashed')
# plot lines
plt.scatter(x, y, color='#FF3FAF')
plt.plot(x, y, color='#FF3FAF', linestyle='solid')
plt.show()
Here is the graph that is produced:
My question is: how do I make all the dashed lines connect seamlessly? I had an idea about averaging each pair of line segments that share the same points, but I don't know how to go about doing so. Another idea was to make some sort of Bezier curve that connects all the points, but that sounds unnecessarily complicated.
Something like the green line should be the output (sorry for the poorly drawn line):

You can use scipy.interpolate.interp1d with kind='quadratic' to interpolate the data at a denser set of points (say, 300) and then plot the result as a smooth curve.
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import interp1d
# original data
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 7, 3, 4, 5, 1, 6, 9, 4, 6]
# quadratic regression
for i in range(int((len(x) + len(y)) / 2)):
    sub_x = x[i:i+3]
    sub_y = y[i:i+3]
    model = np.poly1d(np.polyfit(sub_x, sub_y, 2))
    polyline = np.linspace(min(sub_x), max(sub_x), 200)
    plt.plot(polyline, model(polyline), color="#6D34D6", linestyle='dashed')
#Interpolate
x_new = np.linspace(min(x), max(x), 300) #<----
f = interp1d(x, y, kind='quadratic') #<----
# plot lines
plt.scatter(x, y, color='#FF3FAF')
plt.plot(x_new, f(x_new), color='#FF3FAF', linestyle='solid') #<----
plt.show()
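As a quick check of the interpolation, here is a minimal sketch (plotting left out) that evaluates the quadratic interpolant at the original x values and compares it with y. It passes through every data point, which is why the plotted curve has no seams. The names y_new and max_error are only illustrative.

```python
import numpy as np
from scipy.interpolate import interp1d

# original data
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([2, 7, 3, 4, 5, 1, 6, 9, 4, 6])

# quadratic interpolant through all points
f = interp1d(x, y, kind='quadratic')

# dense grid for a smooth-looking curve
x_new = np.linspace(x.min(), x.max(), 300)
y_new = f(x_new)

# largest deviation of the interpolant from the data at the original points
max_error = np.max(np.abs(f(x) - y))
print(max_error)
```

Note that this is interpolation rather than regression: the curve is forced through the points instead of being fitted to them.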

Related

How to find the global minima from a matplotlib graph?

import numpy as np
import matplotlib.pyplot as plt
x = [1 ,2, 3, 4, 5, 6, 7, 8, 9]
y = [ 3,5, 1, 9, 3, 2, 10, 7, 8]
plt.plot(x, y)
#for global minima
minpos = y.index(min(y))
plt.plot(x[minpos],min(y), 'go', label="Minima")
plt.show()
I have two arrays x and y. Here I've plotted them using Matplotlib and found the global minima using this simple logic. Here is the output that I'm getting:
After that I smoothed the graph with a BSpline:
from scipy.interpolate import make_interp_spline, BSpline
# 100 is the number of points to generate between min(x) and max(x)
xnew = np.linspace(min(x), max(x), 100)
spl = make_interp_spline(x, y, k=2) # type: BSpline
power_smooth = spl(xnew)
plt.plot(x[minpos],min(y), 'go', label="Minima")
plt.plot(xnew, power_smooth)
plt.show()
Now the position of the global minimum has changed, and that simple logic no longer works. How can I find the global minimum of the graph in this case?
Use numpy.argmin on power_smooth:
minpos = np.argmin(power_smooth)
min_x = xnew[minpos]
min_y = power_smooth[minpos]
plt.plot(min_x, min_y, 'go', label="Minima")
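Put together without the plotting calls, the approach might look like the sketch below. It only finds the smallest value among the sampled points in xnew, so the result is an approximation of the true minimum of the spline. A denser xnew should bring it closer.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [3, 5, 1, 9, 3, 2, 10, 7, 8]

# sample the quadratic spline on 100 points between min(x) and max(x)
xnew = np.linspace(min(x), max(x), 100)
power_smooth = make_interp_spline(x, y, k=2)(xnew)

# index of the smallest sampled value, then its coordinates
minpos = np.argmin(power_smooth)
min_x = xnew[minpos]
min_y = power_smooth[minpos]
print(min_x, min_y)
```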
Output:

2d probability distribution with rbf and scipy

I have a problem similar to this one, or rather to its answer: RBF interpolation: LinAlgError: singular matrix
But I want to compute a probability distribution with RBF.
My code until now:
from scipy.interpolate import Rbf # radial basis functions
import cv2
import matplotlib.pyplot as plt
import numpy as np
x = [1, 1, 2 ,3, 4, 4, 2, 6, 7]
y = [0, 2, 5, 6, 2, 4, 1, 5, 2]
rbf_adj = Rbf(x, y, function='gaussian')
plt.figure()
# Plotting the original points.
plot3 = plt.plot(x, y, 'ko', markersize=12) # the original points.
plt.show()
My problem is that I only have the coordinates of the points, x and y.
What can I use for z and d?
This is my error message:
numpy.linalg.linalg.LinAlgError: Matrix is singular.
First, here is a 1D example to emphasize the difference between radial basis function (RBF) interpolation and kernel density estimation (KDE) of a probability distribution:
import matplotlib.pyplot as plt
import numpy as np
# %matplotlib inline  (IPython/Jupyter magic; uncomment only in a notebook)
from scipy.interpolate import Rbf # radial basis functions
from scipy.stats import gaussian_kde
coords = np.linspace(0, 2, 7)
values = np.ones_like(coords)
x_fine = np.linspace(-1, 3, 101)
rbf_interpolation = Rbf(coords, values, function='gaussian')
interpolated_y = rbf_interpolation(x_fine)
kernel_density_estimation = gaussian_kde(coords)
plt.figure()
plt.plot(coords, values, 'ko', markersize=12)
plt.plot(x_fine, interpolated_y, '-r', label='RBF Gaussian interpolation')
plt.plot(x_fine, kernel_density_estimation(x_fine), '-b', label='kernel density estimation')
plt.legend(); plt.xlabel('x')
plt.show()
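The difference can also be checked numerically. In this small sketch (the names rbf_error and kde_area are illustrative), the RBF interpolant reproduces the given values at the sample points, while the KDE is a density whose area is 1. The integration limits -10 and 10 are an arbitrary choice that is wide enough to cover the tails here.

```python
import numpy as np
from scipy.interpolate import Rbf
from scipy.stats import gaussian_kde

coords = np.linspace(0, 2, 7)
values = np.ones_like(coords)

rbf_interpolation = Rbf(coords, values, function='gaussian')
kernel_density_estimation = gaussian_kde(coords)

# interpolation: the curve goes through the prescribed values
rbf_error = np.max(np.abs(rbf_interpolation(coords) - values))

# density estimation: the curve integrates to (approximately) one
kde_area = kernel_density_estimation.integrate_box_1d(-10, 10)
print(rbf_error, kde_area)
```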
And this is the 2D interpolation using a Gaussian RBF for the provided data, with the values arbitrarily set to z = 1:
from scipy.interpolate import Rbf # radial basis functions
import matplotlib.pyplot as plt
import numpy as np
x = [1, 1, 2 ,3, 4, 4, 2, 6, 7]
y = [0, 2, 5, 6, 2, 4, 1, 5, 2]
z = [1]*len(x)
rbf_adj = Rbf(x, y, z, function='gaussian')
x_fine = np.linspace(0, 8, 81)
y_fine = np.linspace(0, 8, 82)
x_grid, y_grid = np.meshgrid(x_fine, y_fine)
z_grid = rbf_adj(x_grid.ravel(), y_grid.ravel()).reshape(x_grid.shape)
plt.pcolor(x_fine, y_fine, z_grid);
plt.plot(x, y, 'ok');
plt.xlabel('x'); plt.ylabel('y'); plt.colorbar();
plt.title('RBF Gaussian interpolation');
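If a probability distribution is what is wanted, a 2D kernel density estimate may be closer to the goal than an interpolation. A minimal sketch, assuming the default bandwidth of gaussian_kde is acceptable for these nine points (the name density is illustrative):

```python
import numpy as np
from scipy.stats import gaussian_kde

x = [1, 1, 2, 3, 4, 4, 2, 6, 7]
y = [0, 2, 5, 6, 2, 4, 1, 5, 2]

# gaussian_kde expects the data with shape (n_dimensions, n_points)
kde = gaussian_kde(np.vstack([x, y]))

x_fine = np.linspace(0, 8, 81)
y_fine = np.linspace(0, 8, 82)
x_grid, y_grid = np.meshgrid(x_fine, y_fine)

# evaluate the estimated density on the grid
grid_points = np.vstack([x_grid.ravel(), y_grid.ravel()])
density = kde(grid_points).reshape(x_grid.shape)
print(density.shape)
```

The density array can be drawn the same way as z_grid, for example with plt.pcolor(x_fine, y_fine, density). No z values are needed because only the point locations enter the estimate.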

How to plot a scatter plot over a line plot? [duplicate]

Does anybody have a suggestion on what's the best way to present overlapping lines on a plot? I have a lot of them, and I had the idea of having full lines of different colors where they don't overlap, and having dashed lines where they do overlap so that all colors are visible and overlapping colors are seen.
But still, how do I do that?
I have the same issue on a plot with a high degree of discretization.
Here is the starting situation:
import matplotlib.pyplot as plt
grid=[x for x in range(10)]
graphs=[
[1,1,1,4,4,4,3,5,6,0],
[1,1,1,5,5,5,3,5,6,0],
[1,1,1,0,0,3,3,2,4,0],
[1,2,4,4,3,2,3,2,4,0],
[1,2,3,3,4,4,3,2,6,0],
[1,1,3,3,0,3,3,5,4,3],
]
for gg,graph in enumerate(graphs):
    plt.plot(grid,graph,label='g'+str(gg))
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()
No one can say exactly where the green and blue lines run.
And here is my "solution":
import matplotlib.pyplot as plt
grid=[x for x in range(10)]
graphs=[
[1,1,1,4,4,4,3,5,6,0],
[1,1,1,5,5,5,3,5,6,0],
[1,1,1,0,0,3,3,2,4,0],
[1,2,4,4,3,2,3,2,4,0],
[1,2,3,3,4,4,3,2,6,0],
[1,1,3,3,0,3,3,5,4,3],
]
for gg,graph in enumerate(graphs):
    lw=10-8*gg/len(graphs)
    ls=['-','--','-.',':'][gg%4]
    plt.plot(grid,graph,label='g'+str(gg), linestyle=ls, linewidth=lw)
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()
I am grateful for suggestions on improvement!
Just decrease the opacity of the lines so that they are see-through. You can achieve that with the alpha parameter. Example:
plt.plot(x, y, alpha=0.7)
alpha ranges from 0 to 1, with 0 being fully transparent (invisible) and 1 being fully opaque.
Suppose your pandas DataFrame is called respone_times; then you can use alpha to set a different opacity for your graphs. Check the picture before and after using alpha.
plt.figure(figsize=(15, 7))
plt.plot(respone_times,alpha=0.5)
plt.title('a sample title')
plt.grid(True)
plt.show()
Depending on your data and use case, it might be OK to add a bit of random jitter to artificially separate the lines.
from numpy.random import default_rng
import pandas as pd
rng = default_rng()
def jitter_df(df: pd.DataFrame, std_ratio: float) -> pd.DataFrame:
    """
    Add jitter to a DataFrame.

    Adds normally distributed jitter with mean 0 to each of the
    DataFrame's columns. The jitter's std is the column's std times
    `std_ratio`.

    Returns the jittered DataFrame.
    """
    std = df.std().values * std_ratio
    jitter = pd.DataFrame(
        std * rng.standard_normal(df.shape),
        index=df.index,
        columns=df.columns,
    )
    return df + jitter
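A possible way to apply the function to the example data from the question is sketched below. The function is restated in compact form so the snippet runs on its own. The fixed seed, the column names g0 to g5 and the variable jittered are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed (an assumption) so reruns match


def jitter_df(df: pd.DataFrame, std_ratio: float) -> pd.DataFrame:
    """Add zero-mean normal jitter, scaled by each column's std."""
    std = df.std().values * std_ratio
    jitter = pd.DataFrame(
        std * rng.standard_normal(df.shape),
        index=df.index,
        columns=df.columns,
    )
    return df + jitter


graphs = [
    [1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
    [1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
    [1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
    [1, 2, 4, 4, 3, 2, 3, 2, 4, 0],
    [1, 2, 3, 3, 4, 4, 3, 2, 6, 0],
    [1, 1, 3, 3, 0, 3, 3, 5, 4, 3],
]

# one column per line, one row per grid position
df = pd.DataFrame({f'g{i}': g for i, g in enumerate(graphs)})
jittered = jitter_df(df, std_ratio=0.1)
print(jittered.round(2))
```

Calling jittered.plot() would then draw the separated lines. Keep in mind that the plotted values are no longer the exact data.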
Here's a plot of the original data from Markus Dutschke's example:
And here's the jittered version, with std_ratio set to 0.1:
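Replacing solid lines with dotted or dashed ones works too:
import seaborn as sns
# `data` is a long-format DataFrame with the columns config, outputs, lag, correlation and card
g = sns.FacetGrid(data, col='config', row='outputs', sharex=False)
g.map_dataframe(sns.lineplot, x='lag',y='correlation',hue='card', linestyle='dotted')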
Instead of random jitter, the lines can be offset just a little bit, creating a layered appearance:
import matplotlib.pyplot as plt
from matplotlib.transforms import offset_copy
grid = list(range(10))
graphs = [[1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
[1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
[1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
[1, 2, 4, 4, 3, 2, 3, 2, 4, 0],
[1, 2, 3, 3, 4, 4, 3, 2, 6, 0],
[1, 1, 3, 3, 0, 3, 3, 5, 4, 3]]
fig, ax = plt.subplots()
lw = 1
for gg, graph in enumerate(graphs):
    trans_offset = offset_copy(ax.transData, fig=fig, x=lw * gg, y=lw * gg, units='dots')
    ax.plot(grid, graph, lw=lw, transform=trans_offset, label='g' + str(gg))
ax.legend(loc='upper left', bbox_to_anchor=(1.01, 1.01))
# manually set the axes limits, because the transform doesn't set them automatically
ax.set_xlim(grid[0] - .5, grid[-1] + .5)
ax.set_ylim(min([min(g) for g in graphs]) - .5, max([max(g) for g in graphs]) + .5)
plt.tight_layout()
plt.show()

Connect points in a three dimensional plot

I have an algorithm that can be controlled by two parameters so now I want to plot the runtime of the algorithm depending on these parameters.
My Code:
from matplotlib import pyplot
import pylab
from mpl_toolkits.mplot3d import Axes3D
fig = pylab.figure()
ax = fig.add_subplot(projection='3d')  # Axes3D(fig) is not attached to the figure in recent matplotlib
sequence_containing_x_vals = [5,5,5,5,10,10,10,10,15,15,15,15,20,20,20,20]
sequence_containing_y_vals = [1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4]
sequence_containing_z_vals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
ax.scatter(sequence_containing_x_vals, sequence_containing_y_vals, sequence_containing_z_vals)
pyplot.show()
This will plot all the points in the space but I want them connected and have something like this:
(The coloring would be nice but not necessary)
To plot the surface you need to use plot_surface, and have the data as a regular 2D array (that reflects the 2D geometry of the x-y plane). Usually meshgrid is used for this, but since your data already has the x and y values repeated appropriately, you just need to reshape them. I did this with numpy reshape.
from matplotlib import pyplot, cm
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
fig = pyplot.figure()
ax = fig.add_subplot(projection='3d')  # Axes3D(fig) is not attached to the figure in recent matplotlib
sequence_containing_x_vals = np.array([5,5,5,5,10,10,10,10,15,15,15,15,20,20,20,20])
X = sequence_containing_x_vals.reshape((4,4))
sequence_containing_y_vals = np.array([1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4])
Y = sequence_containing_y_vals.reshape((4,4))
sequence_containing_z_vals = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16])
Z = sequence_containing_z_vals.reshape((4,4))
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.hot)
pyplot.show()
Note that Y, X = np.meshgrid([1, 2, 3, 4], [5, 10, 15, 20]) will give the same X and Y as above but more easily (mind the order: the first returned array varies along the columns, which here is y).
Of course, the surface shown here is just a plane since your data is consistent with z = 0.8x + y - 4, but this method will work with generic surfaces, as can be seen in the many matplotlib surface examples.
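The reshaping can be checked with a few lines of NumPy. This sketch rebuilds the grids with np.meshgrid (returned in the order Y, X for this data layout) and tests numerically that the z values lie on the plane z = 0.8x + y - 4. The names X_mesh, Y_mesh, same_grid and is_plane are illustrative.

```python
import numpy as np

x_vals = np.array([5, 5, 5, 5, 10, 10, 10, 10, 15, 15, 15, 15, 20, 20, 20, 20])
y_vals = np.array([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4])
z_vals = np.arange(1, 17)

# reshape the flat sequences into 4x4 grids
X = x_vals.reshape((4, 4))
Y = y_vals.reshape((4, 4))
Z = z_vals.reshape((4, 4))

# meshgrid varies its first argument along the columns, so y comes first here
Y_mesh, X_mesh = np.meshgrid([1, 2, 3, 4], [5, 10, 15, 20])

same_grid = np.array_equal(X, X_mesh) and np.array_equal(Y, Y_mesh)
is_plane = np.allclose(Z, 0.8 * X + Y - 4)
print(same_grid, is_plane)
```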

Contourplot with 2 different step sizes in matplotlib

This question has probably a totally simple solution but I just can't find it. I'd like to plot a contourf plot where the one part of my data varies in steps of order 1 and the other part varies with steps of order 100.
Now I tried to just give contour levels like this:
contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]
However, this leads to the first 11 levels all having the same color, as matplotlib is somehow normalizing the levels to the maximum value. How can I make every level equally important in terms of my color map?
Thanks a lot HYRY, your answer solved my problem. This is what the plots look like before and after the implementation (I adjusted the levels a bit; data from the GOZCARDS team/NASA):
Use the colors argument:
import pylab as pl
import numpy as np
x, y = np.mgrid[-1:1:100j, 0:1:100j]
z = ... # your function
contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]
cmap = pl.cm.BuPu
colors = cmap(np.linspace(0, 1, len(contour_levels)))
pl.contour(x, y, z, levels=contour_levels, colors=colors)
I am a little wary of HYRY's solution, as the mapping between the levels and the colors can become arbitrary. I would suggest using LogNorm instead, which maps your values to colors on a logarithmic scale.
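import matplotlib.colors
import pylab as pl
import numpy as np
x, y = np.mgrid[-1:1:100j, 0:1:100j]
z = ... # your function
# LogNorm cannot map values <= 0, so the 0 level may need to be replaced by a small positive number
contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]
cmap = pl.cm.BuPu
pl.contourf(x, y, z, levels=contour_levels, cmap=cmap, norm=matplotlib.colors.LogNorm())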
If you also use vmin and vmax you can explicitly control the limits of the normalization and ensure that the color scales match between graphs independent of what levels you use.
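Another option, not taken from the answers above, is matplotlib.colors.BoundaryNorm, which assigns each interval between consecutive levels its own evenly spaced slot in the colormap. The sketch below only inspects the mapping. The value ncolors=256 assumes a 256-entry colormap such as BuPu, and the names midpoints and color_index are illustrative.

```python
import numpy as np
from matplotlib.colors import BoundaryNorm

contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]

# one colormap slot per interval between consecutive levels
norm = BoundaryNorm(contour_levels, ncolors=256)

# a representative value inside each interval
levels = np.array(contour_levels, dtype=float)
midpoints = 0.5 * (levels[:-1] + levels[1:])

# colormap index assigned to each interval
color_index = np.asarray(norm(midpoints))
print(color_index)
```

The indices increase in roughly equal steps regardless of how wide each interval is. To use this in the plot, the norm could be passed along with the colormap, for example pl.contourf(x, y, z, levels=contour_levels, cmap=pl.cm.BuPu, norm=norm).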
