How to draw an average line for a scatter plot

My data is the following:
x = [3,4,5,6,7,8,9,9]
y = [6,5,4,3,2,1,1,2]
And I can obtain the following two graphs.
However, what I want is this (an average of all the points along the way):
Is it possible in matplotlib? Or do I have to change the list manually and somehow create:
x = [3,4,5,6,7,8,9]
y = [6,5,4,3,2,1,1.5]
RELEVANT CODE
ax.plot(x, y, 'o-', label='curPerform')
x1,x2,y1,y2 = ax.axis()
x1 = min(x) - 1
x2 = max(x) + 1
ax.axis((x1,x2,(y1-1),(y2+1)))

This can be done by computing a new y_mean from your data and then plotting it on the same axes with an additional call to ax.plot(), where:
x is the same x used in your scatter plot
y is an iterable containing the mean value you calculated, repeated so that its length equals that of x, i.e. y_mean = [np.mean(y) for i in x].
Example:
import matplotlib.pyplot as plt
import random
import numpy as np
# Create some random data
x = np.arange(0,10,1)
y = np.zeros_like(x)
y = [random.random()*5 for i in x]
# Calculate the simple average of the data
y_mean = [np.mean(y)]*len(x)
fig,ax = plt.subplots()
# Plot the data
data_line = ax.plot(x,y, label='Data', marker='o')
# Plot the average line
mean_line = ax.plot(x,y_mean, label='Mean', linestyle='--')
# Make a legend
legend = ax.legend(loc='upper right')
plt.show()
Resulting figure:
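If all that is needed is a single horizontal line at the overall mean, as in the example above, ax.axhline is an alternative that avoids building a repeated list. The sketch below uses the data from the question; the variable names are only illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

# Data from the question
x = [3, 4, 5, 6, 7, 8, 9, 9]
y = [6, 5, 4, 3, 2, 1, 1, 2]

# Overall mean of all y values
y_mean = np.mean(y)

fig, ax = plt.subplots()
ax.plot(x, y, 'o', label='Data')
# axhline draws a horizontal line across the full width of the axes
mean_line = ax.axhline(y_mean, linestyle='--', label='Mean')
ax.legend(loc='upper right')
```

Call plt.show() afterwards to display the figure. Note that this draws the mean of all points, not the per-x average the question asks about.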

Yes, you must do the calculation yourself. plot plots the data you give it. If you want to plot some other data, you need to calculate that data yourself and then plot that instead.
Edit: A quick way to do the calculation:
>>> x, y = zip(*sorted((xVal, np.mean([yVal for a, yVal in zip(x, y) if xVal==a])) for xVal in set(x)))
>>> x
(3, 4, 5, 6, 7, 8, 9)
>>> y
(6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 1.5)
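The one-liner above can be hard to read. A more explicit sketch of the same idea groups the y values by their x value and averages each group. The helper name average_by_x is my own and not part of any library.

```python
from collections import defaultdict


def average_by_x(x, y):
    """Return sorted unique x values and the mean of y for each of them."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    xs = sorted(groups)
    ys = [sum(groups[k]) / len(groups[k]) for k in xs]
    return xs, ys


# Data from the question
x = [3, 4, 5, 6, 7, 8, 9, 9]
y = [6, 5, 4, 3, 2, 1, 1, 2]

xs, ys = average_by_x(x, y)
print(xs)
print(ys)
```

The resulting xs and ys can then be passed to ax.plot(xs, ys, 'o-') in place of the original lists.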

Related

The Matplotlib Result is Different From WolframAlpha

I want to plot an equation in Matplotlib, but the result differs from WolframAlpha's.
This is the equation:
y = 10yt + y^2t + 20
The plot from WolframAlpha is:
But when I plot it in Matplotlib with this code:
# Creating vectors X and Y
x = np.linspace(-2, 2, 100)
# Assuming α is 10
y = ((10*y*x)+((y**2)*x)+20)
# Create the plot
fig = plt.figure(figsize = (10, 5))
plt.plot(x, y)
The result is:
Any suggestions on how to modify the code so that the plot matches WolframAlpha's? Thank you.

As @Him has suggested in the comments, y = ((10*y*x)+((y**2)*x)+20) won't describe a relationship, so much as make an assignment, so the fact that y appears on both sides of the equation makes this difficult.
It's not trivial to express y cleanly in terms of x, but it's relatively easy to express x in terms of y, and then graph that relationship, like so:
import numpy as np
import matplotlib.pyplot as plt
y = np.linspace(-40, 40, 2000)
x = (y-20)*(((10*y)+(y**2))**-1)
fig, ax = plt.subplots()
ax.plot(x, y, linestyle = 'None', marker = '.')
ax.set_xlim(left = -4, right = 4)
ax.grid()
ax.set_xlabel('x')
ax.set_ylabel('y')
Which produces the following result:
If you try to plot this with a line instead of points, you get a big discontinuity as the asymptotic limbs try to join up.
So you have to evaluate the same function over three separate ranges of y, split at y = -10 and y = 0 where the denominator is zero, so that you don't get any crossovers:
import numpy as np
import matplotlib.pyplot as plt
y1 = np.linspace(-40, -10, 2000)
y2 = np.linspace(-10, 0, 2000)
y3 = np.linspace(0, 40, 2000)
x = lambda y: (y-20)*(((10*y)+(y**2))**-1)
y = np.hstack([y1, y2, y3])
fig, ax = plt.subplots()
ax.plot(x(y), y, linestyle = '-', color = 'b')
ax.set_xlim(left = -4, right = 4)
ax.grid()
ax.set_xlabel('x')
ax.set_ylabel('y')
Which produces the result you were after:
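As a quick sanity check of the rearrangement used in this answer, the sketch below substitutes x = (y - 20) / (10y + y^2) back into the original equation on each of the three branches. The helper name x_of_y and the small offset eps, which keeps the samples clear of the poles at y = -10 and y = 0, are my own assumptions.

```python
import numpy as np


def x_of_y(y):
    """x expressed in terms of y, from y = 10*y*x + y**2*x + 20."""
    return (y - 20) / (10 * y + y**2)


eps = 1e-3  # assumed offset to stay away from the poles at y = -10 and y = 0
branches = [
    np.linspace(-40, -10 - eps, 500),
    np.linspace(-10 + eps, -eps, 500),
    np.linspace(eps, 40, 500),
]

for yb in branches:
    xb = x_of_y(yb)
    # Residual of the original equation; it should be close to zero everywhere
    residual = 10 * yb * xb + yb**2 * xb + 20 - yb
    print(np.max(np.abs(residual)))
```

Each branch can also be drawn with its own ax.plot(x_of_y(yb), yb) call, which keeps the limbs separate without relying on non-finite values to break the line.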

draw functions in 3D data

Using the code below, I create three-dimensional data to plot in a pcolormesh plot.
n = 100 # size
_min, _max = -10, 10
# generate 2 2d grids for the x & y bounds
x, y = np.meshgrid(np.linspace(_min, _max, n), np.linspace(_min, _max, n))
# generate z values with random noise
z = np.array([np.zeros(n) for i in range(n)])
for i in range(len(z)):
    z[i] = z[i] + 0.1 * np.random.randint(0,3, size=len(z[i]))
# plotting
fig, ax = plt.subplots()
c = ax.pcolormesh(x, y, z, cmap='RdBu', vmin=-1, vmax=1)
ax.set_title('pcolormesh')
plt.plot([5,5,-2.5], [5,-5,5], color='darkblue', marker='o', markersize=15, linewidth=0) # dots (outer)
plt.plot([5,5,-2.5], [5,-5,5], color='lightblue', marker='o', markersize=10, linewidth=0) # dots (inner)
plt.grid(True) # background grid
# set the limits of the plot to the limits of the data
ax.axis([_min, _max, _min, _max])
fig.colorbar(c, ax=ax)
plt.show()
This gives an image:
However, I would now like to alter z values of specific x/y combinations according to specific functions, e.g. a circle described by (x-5)^2 + (y+5)^2 = 1. I would like to alter the data(!) and then plot it.
The 'goal' would be data producing an image like this:
I can experiment with the functions, it's mostly about the logic of altering the z values according to a mathematical function of the form z = f(x, y) that I cannot figure out.
It would follow the (pseudo code logic):
if the x / y combination of a point is on the function f(x, y): add the value c to the initial z value.
Could someone point me to how I can implement this? I tried multiple times but cannot figure it out... :( Many thanks in advance!!!
NOTE: an earlier version was imprecise. It wrongly explained this as a plotting problem although it seems that the data manipulation is the issue. Apologies for that!

You only need to plot the function on the same axes.
The following lines plot a function on top of your plot:
# Create the independent points of your plot
x = np.arange(0., 5., 0.2)
# Generate your dependent variables
y = np.exp(x)
# Plot your variables
plt.plot(x, y)
You can repeat this for as many functions as you need.
In your full example it looks like this:
import numpy as np
import matplotlib.pyplot as plt
n = 100 # size
_min, _max = -10, 10
# generate 2 2d grids for the x & y bounds
x, y = np.meshgrid(np.linspace(_min, _max, n), np.linspace(_min, _max, n))
# generate z values with random noise
z = np.array([np.zeros(n) for i in range(n)])
for i in range(len(z)):
    z[i] = z[i] + 0.1 * np.random.randint(0, 3, size=len(z[i]))
# plotting
fig, ax = plt.subplots()
c = ax.pcolormesh(x, y, z, cmap='RdBu', vmin=-1, vmax=1)
ax.set_title('pcolormesh')
plt.plot([5, 5, -2.5], [5, -5, 5], color='darkblue', marker='o', markersize=15, linewidth=0) # dots (outer)
plt.plot([5, 5, -2.5], [5, -5, 5], color='lightblue', marker='o', markersize=10, linewidth=0) # dots (inner)
plt.grid(True) # background grid
# set the limits of the plot to the limits of the data
ax.axis([_min, _max, _min, _max])
fig.colorbar(c, ax=ax)
x = np.arange(0., 5., 0.2)
plt.plot(x, np.exp(x))
plt.show()
Of course, you need to replace y = np.exp(x) with whatever function you need.
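If the goal is to alter the data itself rather than draw on top of it, as the question's note says, one possible approach is a boolean mask over the grid. This is only a sketch: the added value c and the tolerance tol are assumed values, and a tolerance is needed because grid points rarely lie exactly on the curve.

```python
import numpy as np

n = 100
_min, _max = -10, 10
x, y = np.meshgrid(np.linspace(_min, _max, n), np.linspace(_min, _max, n))

# Same kind of random noise as in the question, seeded for repeatability
rng = np.random.default_rng(0)
z = 0.1 * rng.integers(0, 3, size=(n, n))

c = 0.5    # assumed value to add on the curve
tol = 0.3  # assumed tolerance around the curve

# f(x, y) = 0 describes the circle (x-5)^2 + (y+5)^2 = 1
f = (x - 5)**2 + (y + 5)**2 - 1
mask = np.abs(f) < tol

# Add c only where the grid point is (nearly) on the circle
z_new = z.copy()
z_new[mask] += c
```

z_new can then be passed to ax.pcolormesh(x, y, z_new, ...) exactly like z. Other curves only require a different expression for f.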

Scatter plot coloring of data under the region of a function in Matplotlib

I need to plot a bunch of points and, on the same graph, the function cos x. The idea is to see which points fall under the curve.
I have the graph of cos x:
x = np.linspace(0, np.pi) #x range between 0 and pi
y = np.cos(x)
plt.plot(x, y)
plt.show()
Now I need to plot x = [2, 0.9, 2.6, 3.1] and y = [0.1, 0.4, 0.5, 0.2]
I can plot them as a scatter plot, but how would I combine both and preferably color-code the points that fall under the curve?

You can separate the scatter points into two lists, one for points under the line and one for points over the line. Then you can plot both lists.
Your data would replace the random numbers in points_x and points_y:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, np.pi) #x range between 0 and pi
y = np.cos(x)
N = 100
points_x = np.random.rand(N)*np.pi
points_y = np.random.rand(N)*2.-1.
points_over = [(xi,yi) for xi,yi in zip(points_x,points_y) if np.cos(xi) < yi]
points_under = [(xi,yi) for xi,yi in zip(points_x,points_y) if np.cos(xi) >= yi]
plt.plot(x, y)
plt.scatter(*zip(*points_over),c='g')
plt.scatter(*zip(*points_under),c='r')
plt.show()
Producing something like:
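The same split can be written without list comprehensions by comparing NumPy arrays directly. The sketch below applies it to the four points from the question; the names px, py, under and colors are my own.

```python
import numpy as np

# Points from the question
px = np.array([2, 0.9, 2.6, 3.1])
py = np.array([0.1, 0.4, 0.5, 0.2])

# True where a point lies on or below the curve y = cos(x)
under = py <= np.cos(px)

# One color name per point: red if under the curve, green otherwise
colors = np.where(under, 'r', 'g')

print(under)
print(colors)
```

A single call such as plt.scatter(px, py, c=colors.tolist()) should then draw all points with their respective colors.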

Changing the linewidth and the color simultaneously in matplotlib

The figure above is a great piece of artwork showing wind speed, wind direction and temperature simultaneously. In detail:
The X axis represents the date
The Y axis shows the wind direction (southern, western, etc.)
The varying width of the line stands for the wind speed over time
The varying color of the line stands for the atmospheric temperature
This simple figure visualizes 3 different attributes without redundancy.
So I would really like to reproduce a similar plot in matplotlib.
My attempt so far:
## Reference 1 http://stackoverflow.com/questions/19390895/matplotlib-plot-with-variable-line-width
## Reference 2 http://stackoverflow.com/questions/17240694/python-how-to-plot-one-line-in-different-colors
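import math
import numpy as np
import matplotlib.pyplot as plt
def plot_colourline(x,y,c):
    # map the values in c to colours from the jet colormap
    c = plt.cm.jet((c-np.min(c))/(np.max(c)-np.min(c)))
    lwidths=1+x[:-1]
    ax = plt.gca()
    # draw each segment separately with its own colour and width
    for i in np.arange(len(x)-1):
        ax.plot([x[i],x[i+1]], [y[i],y[i+1]], c=c[i],linewidth = lwidths[i])
    return
x=np.linspace(0,4*math.pi,100)
y=np.cos(x)
lwidths=1+x[:-1]
fig = plt.figure(1, figsize=(5,5))
ax = fig.add_subplot(111)
plot_colourline(x,y,prop) # prop (the values to colour by) is not defined in the post
ax.set_xlim(0,4*math.pi)
ax.set_ylim(-1.1,1.1)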
Does someone have a more interesting way to achieve this? Any advice would be appreciated!

Taking another question as inspiration, one option would be to use fill_between, though perhaps not in the way it was intended. Instead of using it to create your line, use it to mask everything that is not the line. Under it you can have a pcolormesh or contourf (for example) to map color any way you want.
Look, for instance, at this example:
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import interp1d
def windline(x,y,deviation,color):
    y1 = y-deviation/2
    y2 = y+deviation/2
    tol = (y2.max()-y1.min())*0.05
    X, Y = np.meshgrid(np.linspace(x.min(), x.max(), 100), np.linspace(y1.min()-tol, y2.max()+tol, 100))
    Z = X.copy()
    for i in range(Z.shape[0]):
        Z[i,:] = color # every row of Z holds the colour values along x
    #plt.pcolormesh(X, Y, Z)
    plt.contourf(X, Y, Z, cmap='seismic')
    # mask everything above and below the band in white
    plt.fill_between(x, y2, y2=np.ones(x.shape)*(y2.max()+tol), color='w')
    plt.fill_between(x, np.ones(x.shape) * (y1.min() - tol), y2=y1, color='w')
    plt.xlim(x.min(), x.max())
    plt.ylim(y1.min()-tol, y2.max()+tol)
    plt.show()
x = np.arange(100)
yo = np.random.randint(20, 60, 21)
y = interp1d(np.arange(0, 101, 5), yo, kind='cubic')(x)
dv = np.random.randint(2, 10, 21)
d = interp1d(np.arange(0, 101, 5), dv, kind='cubic')(x)
co = np.random.randint(20, 60, 21)
c = interp1d(np.arange(0, 101, 5), co, kind='cubic')(x)
windline(x, y, d, c)
which results in this:
The function windline accepts as arguments numpy arrays with x, y, a deviation (like a thickness value per x value), and a color array for color mapping. I think it can be greatly improved by tweaking other details, but the principle, although not perfect, should be solid.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
x = np.linspace(0,4*np.pi,10000) # x data
y = np.cos(x) # y data
r = np.piecewise(x, [x < 2*np.pi, x >= 2*np.pi], [lambda x: 1-x/(2*np.pi), 0]) # red
g = np.piecewise(x, [x < 2*np.pi, x >= 2*np.pi], [lambda x: x/(2*np.pi), lambda x: -x/(2*np.pi)+2]) # green
b = np.piecewise(x, [x < 2*np.pi, x >= 2*np.pi], [0, lambda x: x/(2*np.pi)-1]) # blue
a = np.ones(10000) # alpha
w = x # width
fig, ax = plt.subplots(2)
ax[0].plot(x, r, color='r')
ax[0].plot(x, g, color='g')
ax[0].plot(x, b, color='b')
# mysterious parts
points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
# mysterious parts
rgba = list(zip(r,g,b,a))
lc = LineCollection(segments, linewidths=w, colors=rgba)
ax[1].add_collection(lc)
ax[1].set_xlim(0,4*np.pi)
ax[1].set_ylim(-1.1,1.1)
fig.show()
I noticed this is the same problem I had been struggling with.
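The two lines marked "mysterious parts" in the code above turn the x and y arrays into the list of segments that LineCollection expects. A small sketch with four points, using made-up values, shows what the reshape and concatenate steps produce:

```python
import numpy as np

# Four illustrative points (made-up values)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 0.0, -1.0])

# Shape (N, 1, 2): one (x, y) pair per point, with an extra axis to join along
points = np.array([x, y]).T.reshape(-1, 1, 2)

# Pair each point with its successor, giving shape (N - 1, 2, 2):
# segments[i] is [[x[i], y[i]], [x[i+1], y[i+1]]]
segments = np.concatenate([points[:-1], points[1:]], axis=1)

print(points.shape)
print(segments.shape)
print(segments[0])
```

Since N points give N - 1 segments, per-segment properties such as widths or colors would normally have N - 1 entries as well.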

python matplotlib with a line color gradient and colorbar

I've been toying around with this problem and am close to what I want but missing that extra line or two.
Basically, I'd like to plot a single line whose color changes given the value of a third array. Lurking around I have found this works well (albeit pretty slowly) and represents the problem
import numpy as np
import matplotlib.pyplot as plt
c = np.arange(1,100)
x = np.arange(1,100)
y = np.arange(1,100)
cm = plt.get_cmap('hsv')
fig = plt.figure(figsize=(5,5))
ax1 = plt.subplot(111)
no_points = len(c)
# set_color_cycle was removed from Matplotlib; set_prop_cycle is its replacement
ax1.set_prop_cycle(color=[cm(1.*i/(no_points-1))
                          for i in range(no_points-1)])
for i in range(no_points-1):
    bar = ax1.plot(x[i:i+2],y[i:i+2])
plt.show()
Which gives me this:
I'd like to be able to include a colorbar along with this plot. So far I haven't been able to crack it just yet. Potentially there will be other lines included with different x,y's but the same c, so I was thinking that a Normalize object would be the right path.
The bigger picture is that this plot is part of a 2x2 subplot grid. I am already making space for the colorbar axes object with matplotlib.colorbar.make_axes(ax4), where ax4 is the 4th subplot.

Take a look at the multicolored_line example in the Matplotlib gallery and dpsanders' colorline notebook:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.collections as mcoll
def multicolored_lines():
    """
    http://nbviewer.ipython.org/github/dpsanders/matplotlib-examples/blob/master/colorline.ipynb
    http://matplotlib.org/examples/pylab_examples/multicolored_line.html
    """
    x = np.linspace(0, 4. * np.pi, 100)
    y = np.sin(x)
    fig, ax = plt.subplots()
    lc = colorline(x, y, cmap='hsv')
    plt.colorbar(lc)
    plt.xlim(x.min(), x.max())
    plt.ylim(-1.0, 1.0)
    plt.show()
def colorline(
        x, y, z=None, cmap='copper', norm=plt.Normalize(0.0, 1.0),
        linewidth=3, alpha=1.0):
    """
    http://nbviewer.ipython.org/github/dpsanders/matplotlib-examples/blob/master/colorline.ipynb
    http://matplotlib.org/examples/pylab_examples/multicolored_line.html
    Plot a colored line with coordinates x and y
    Optionally specify colors in the array z
    Optionally specify a colormap, a norm function and a line width
    """
    # Default colors equally spaced on [0,1]:
    if z is None:
        z = np.linspace(0.0, 1.0, len(x))
    # Special case if a single number:
    # to check for numerical input -- this is a hack
    if not hasattr(z, "__iter__"):
        z = np.array([z])
    z = np.asarray(z)
    segments = make_segments(x, y)
    lc = mcoll.LineCollection(segments, array=z, cmap=cmap, norm=norm,
                              linewidth=linewidth, alpha=alpha)
    ax = plt.gca()
    ax.add_collection(lc)
    return lc
def make_segments(x, y):
    """
    Create list of line segments from x and y coordinates, in the correct format
    for LineCollection: an array of the form numlines x (points per line) x 2 (x
    and y) array
    """
    points = np.array([x, y]).T.reshape(-1, 1, 2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)
    return segments
multicolored_lines()
Note that calling plt.plot hundreds of times tends to kill performance.
Using a LineCollection to build multi-colored line segments is much much faster.
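Applied to the arrays from the question, a sketch along the same lines could look as follows. It colors the line by the third array c through a Normalize object and attaches a colorbar to the collection. Using each segment's starting value of c as its color is my own assumption, not something stated in the question.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

# Data from the question: color the line by the third array c
c = np.arange(1, 100)
x = np.arange(1, 100)
y = np.arange(1, 100)

# Build the N - 1 segments joining consecutive points
points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

# Normalize maps the data range of c onto [0, 1] for the colormap
norm = plt.Normalize(c.min(), c.max())
lc = LineCollection(segments, cmap='hsv', norm=norm)
lc.set_array(c[:-1])  # assumption: one value per segment, taken at its start
lc.set_linewidth(2)

fig, ax = plt.subplots(figsize=(5, 5))
ax.add_collection(lc)
ax.set_xlim(x.min(), x.max())
ax.set_ylim(y.min(), y.max())
cbar = fig.colorbar(lc, ax=ax)
```

Call plt.show() to display the figure. If further lines share the same c, passing the same norm to their collections should keep their colors consistent with this colorbar.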
