How to curve fit multiple y vals for single x value?


I'm trying to use numpy to curve fit (polyfit) a data set I have - it's multiple y vals for discrete x vals, i.e.:
data = [[2, 3], [3, 4], [5, 4]]
where the index is x, and the arrays are the y vals.
I tried the average/median of each array, but I get the feeling that's ignoring a lot of useful data.
TLDR: I need to fit a curve to the scatter plot of this data.

You could flatten your data out:
x = []
y = []
for i, ydata in enumerate(data):
    x += [i] * len(ydata)  # repeat the x index once per y value
    y += ydata
Now you can fit to x and y and it will account for all points in the set.
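To make the last step concrete, here is a minimal sketch of the fit on the flattened data from the question. The degree of 1 is only an assumption for illustration; choose whatever degree suits your data.

```python
import numpy as np

data = [[2, 3], [3, 4], [5, 4]]

# Flatten: repeat each index once per y value stored at that index.
x = np.repeat(np.arange(len(data)), [len(ys) for ys in data])
y = np.concatenate(data)

# Degree 1 (a straight line) is an assumption made for this sketch.
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)
```

Because every point is passed to polyfit, an x value with more samples naturally carries more weight than one with fewer.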

Related

Creating Density/Heatmap Plot from Coordinates and Magnitude in Python

I have some data, the number of readings at each point on a 5x10 grid, in the following format:
X = [1, 2, 3, 4,..., 5]
Y = [1, 1, 1, 1,...,10]
Z = [9,8,14,0,89,...,0]
I would like to plot this as a heatmap/density map from above, but all of the matplotlib graphs (incl. contourf) that I have found require a 2D array for Z, and I don't understand why.
EDIT:
I have now collected the actual coordinates that I want to plot, which are not as regular as the ones above. They are:
X = [8,7,7,7,8,8,8,9,9.5,9.5,9.5,11,11,11,10.5,
10.5,10.5,10.5,9,9,8, 8,8,8,6.5,6.5,1,2.5,4.5,
4.5,2,2,2,3,3,3,4,4.5,4.5,4.5,4.5,3.5,2.5,2.5,
1,1,1,2,2,2]
Y = [5.5,7.5,8,9,9,8,7.5,6,6.5,8,9,9,8,6.5,5.5,
5,3.5,2,2,1,2,3.5,5,1,1,2,4.5,4.5,4.5,4,3,
2,1,1,2,3,4.5,3.5,2.5,1.5,1,5.5,5.5,6,7,8,9,
9,8,7]
z = [286,257,75,38,785,3074,1878,1212,2501,1518,419,33,
3343,1808,3233,5943,10511,3593,1086,139,565,61,61,
189,155,105,120,225,682,416,30632,2035,165,6777,
7223,465,2510,7128,2296,1659,1358,204,295,854,7838,
122,5206,6516,221,282]
From what I understand you can't use floats in a np.array so I have tried to multiply all values by 10 so that they are all integers, but I am still running into some issues. Am I trying to do something that will not work?

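They expect a 2D array because they use the "row" and "column" to set the position of the value. For example, if array[2, 3] = 5, the heatmap draws the value 5 in the cell at row 2, column 3. Note that imshow puts the first index on the vertical axis, so with the x-first indexing used below x runs down the image; transpose the array if you want x to be horizontal.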
So, let's try transforming your current data into a single array:
>>> array = np.empty((len(set(X)), len(set(Y))))
>>> for x, y, z in zip(X, Y, Z):
...     array[x-1, y-1] = z
If X and Y are integer np.arrays, you could do this instead (the array is sized by the number of distinct coordinates, not by the number of points):
>>> array = np.empty((np.unique(X).size, np.unique(Y).size))
>>> array[X - 1, Y - 1] = Z
And now just plot the array as you prefer:
>>> plt.imshow(array, cmap="hot", interpolation="nearest")
>>> plt.show()
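For the irregular float coordinates in the question's EDIT, subtracting 1 does not give valid indices. One possible approach, sketched below on the first four points from the question, is to map each coordinate to its rank among the sorted distinct values with np.unique(..., return_inverse=True). The names xs, xi, ys, yi and grid are illustrative, and the sketch assumes that dropping the uneven spacing between coordinates is acceptable. Cells with no reading are left as NaN.

```python
import numpy as np

# First four points of the irregular data from the question.
X = np.array([8, 7, 7, 7])
Y = np.array([5.5, 7.5, 8, 9])
Z = np.array([286, 257, 75, 38])

# xs/ys are the sorted distinct coordinates; xi/yi are each point's index into them.
xs, xi = np.unique(X, return_inverse=True)
ys, yi = np.unique(Y, return_inverse=True)

# Rows follow y and columns follow x, which is the layout imshow draws.
grid = np.full((ys.size, xs.size), np.nan)
grid[yi, xi] = Z
print(grid)
```

The resulting grid can be passed to plt.imshow in the same way as the array above.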

Tensorflow multiple X values to one Y value

Is it possible to use a list of inputs as X to only one label Y?
I'm working with ECG values and have a time series of 1 second, and for each second I have what emotion was displayed.
So I have something like an array of 100 values and a binary value for the Y.
What can I do?

Difficult to tell if that's what you're looking for without seeing your code so far. But here's an example.
import numpy as np
import tensorflow as tf  # this example uses the TensorFlow 1.x graph API

tf.reset_default_graph()
x_len = 3  # length of X, in your case 100
xs = tf.placeholder(shape=[None, x_len], dtype=tf.float32)  # feed arbitrary number of X's
ys = tf.placeholder(shape=[None], dtype=tf.float32)  # feed Y's corresponding to the X's
outs = tf.reduce_sum(xs, axis=1) + ys  # do something with X's and Y's
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    x = np.array([[1, 2, 3], [4, 5, 6]])  # 2 X's of x_len == 3 each
    y = [10, 20]  # 2 Y's corresponding to each X
    result = sess.run(outs, feed_dict={xs: x, ys: y})  # run the graph to get the output
    print(result)
This takes several X's of specified length (3 here, in your case 100), a corresponding Y for each X and feeds it through the graph. The outs operation sums up all values in each X and adds the corresponding Y to the sum.
Output:
[16. 35.]
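To make the shapes concrete without TensorFlow, the same per-row computation can be sketched in plain NumPy. This only illustrates the data layout (one row of X values per label); it is not a model.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3): two samples of x_len == 3
y = np.array([10, 20])                # shape (2,): one label per sample

# Sum each row of x, then add the matching y.
outs = x.sum(axis=1) + y
print(outs)
```

With 100 ECG values per second, x would have shape (number_of_seconds, 100) and y shape (number_of_seconds,).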

MatPlotLib: Scatter with multiple y values to one x value, and regression lines

I would like to create a scatter plot in matplotlib to measure the performance of my algorithm.
An example of my data is as follows:
x = [1, 2, 3, 4, 5]
y1 = [1, 2, 3] # corresponding to x = 1
y2 = [4, 5, 6] # corresponding to x = 2
y3 = [7, 8, 9] # corresponding to x = 3
y4 = [10, 11, 12] # corresponding to x = 4
y5 = [13, 14, 15] # corresponding to x = 5
What data type would be best to represent multiple y values with one x value?
In my example the relation is exponential. Is there a way to plot an exponential regression line in matplotlib?

I think this is really a data analysis question. If I understand correctly, you want to compare the time efficiency of each test, with every run made in the same test environment (the same machine, the same input data, etc.). As a suggestion, you could use each test's average run time as the reference value when presenting your results. Here is some code you can use.
import numpy as np
import matplotlib.pyplot as plt

data_dim = 4  # number of tests
data_points = 100  # number of data points per test
data_set = np.random.rand(data_dim, data_points)
time = [list(range(len(i))) for i in data_set]
norm = np.full((data_dim, data_points), 1)
aver = []  # each test's average value, as a flat line
for ndx, i in enumerate(norm):
    aver.append(i * sum(data_set[ndx]) / data_points)  # use this test's own data
fig = plt.figure(figsize=(10, 10))
ndx = 1
for i in range(0, 2):
    for j in range(0, 2):
        ax = fig.add_subplot(2, 2, ndx)
        ax.plot(time[ndx-1], data_set[ndx-1], 'ko')  # raw points
        ax.plot(time[ndx-1], aver[ndx-1], 'r')  # average line
        ax.set_ylim(-1, 2)
        ndx += 1
plt.show()
In the resulting plot, the red solid line is the average of each test's times, which gives a sense of how each test performs.
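On the question's other two points (a data type for several y values per x, and an exponential regression line), one possible sketch is below. It assumes a dict mapping each x to its list of y values, and it fits y = a * exp(b * x) by running np.polyfit on log(y), which only works when every y is positive. The data is made up to follow a = 2, b = 0.5 exactly, and the names samples, a and b are illustrative.

```python
import numpy as np

# Hypothetical data: each x maps to several y values on the curve y = 2 * exp(0.5 * x).
samples = {x: [2.0 * np.exp(0.5 * x)] * 3 for x in range(1, 6)}

# Flatten into parallel arrays, one entry per (x, y) pair.
x = np.array([xv for xv, ys in samples.items() for _ in ys], dtype=float)
y = np.array([yv for ys in samples.values() for yv in ys], dtype=float)

# Fit log(y) = log(a) + b * x with a straight line, then undo the log.
b, log_a = np.polyfit(x, np.log(y), 1)
a = np.exp(log_a)
print(a, b)
```

The flattened x and y can go straight into plt.scatter, and the regression line can be drawn by evaluating a * np.exp(b * x_line) over a range of x_line values.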

interpolation between arrays in python

What is the easiest and fastest way to interpolate between two arrays to get new array.
For example, I have 3 arrays:
x = np.array([0,1,2,3,4,5])
y = np.array([5,4,3,2,1,0])
z = np.array([0,5])
x,y corresponds to data-points and z is an argument. So at z=0 x array is valid, and at z=5 y array valid. But I need to get new array for z=1. So it could be easily solved by:
a = (y-x)/(z[1]-z[0])*1+x
Problem is that data is not linearly dependent and there are more than 2 arrays with data. Maybe it is possible to use somehow spline interpolation?

This is a univariate-to-multivariate regression problem. As far as I know, SciPy supports univariate-to-univariate and multivariate-to-univariate regression, but you can simply iterate over the outputs, so this is not a big problem. Below is an example of how it can be done. I've changed the variable names a bit and added a new point:
import numpy as np
from scipy.interpolate import interp1d

X = np.array([0, 5, 10])
Y = np.array([[0, 1, 2, 3, 4, 5],
              [5, 4, 3, 2, 1, 0],
              [8, 6, 5, 1, -4, -5]])
XX = np.array([0, 1, 5])  # Find YY for these
YY = np.zeros((len(XX), Y.shape[1]))
for i in range(Y.shape[1]):
    f = interp1d(X, Y[:, i])  # one interpolator per output column
    for j in range(len(XX)):
        YY[j, i] = f(XX[j])
So YY holds the results for XX. Hope it helps.
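The inner loop over the query points can be dropped, because an interpolator can evaluate all of them at once. The sketch below is a NumPy-only variant of the same idea using np.interp per column. It is linear interpolation only, so it does not cover the spline case the question asks about.

```python
import numpy as np

X = np.array([0, 5, 10])
Y = np.array([[0, 1, 2, 3, 4, 5],
              [5, 4, 3, 2, 1, 0],
              [8, 6, 5, 1, -4, -5]])
XX = np.array([0, 1, 5])  # query points

# One linear interpolation per output column, evaluated at all query points at once.
YY = np.column_stack([np.interp(XX, X, Y[:, i]) for i in range(Y.shape[1])])
print(YY)
```

SciPy's interp1d also accepts axis and kind arguments, which may remove the remaining loop and allow non-linear kinds; check the documentation for your SciPy version.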

Numpy meshgrid in 3D

Numpy's meshgrid is very useful for converting two vectors to a coordinate grid. What is the easiest way to extend this to three dimensions? So given three vectors x, y, and z, construct 3x3D arrays (instead of 2x2D arrays) which can be used as coordinates.

Numpy (as of 1.8 I think) now supports generating position grids of more than two dimensions with meshgrid. One important addition which really helped me is the ability to choose the indexing order (either xy or ij, for Cartesian or matrix indexing respectively), which I verified with the following example:
import numpy as np
x_ = np.linspace(0., 1., 10)
y_ = np.linspace(1., 2., 20)
z_ = np.linspace(3., 4., 30)
x, y, z = np.meshgrid(x_, y_, z_, indexing='ij')
assert np.all(x[:,0,0] == x_)
assert np.all(y[0,:,0] == y_)
assert np.all(z[0,0,:] == z_)
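As a small sketch of what the indexing choice changes, the example below compares the output shapes for the two modes on short vectors. The variable names are illustrative.

```python
import numpy as np

x_ = np.linspace(0.0, 1.0, 2)
y_ = np.linspace(1.0, 2.0, 3)
z_ = np.linspace(3.0, 4.0, 4)

# Matrix ("ij") indexing: the axis order follows the argument order.
xi, yi, zi = np.meshgrid(x_, y_, z_, indexing="ij")

# Cartesian ("xy") indexing, the default: the first two axes are swapped.
xc, yc, zc = np.meshgrid(x_, y_, z_, indexing="xy")

print(xi.shape, xc.shape)
```

The two results hold the same values; only the order of the first two axes differs.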

Here is the source code of meshgrid:
def meshgrid(x,y):
    """
    Return coordinate matrices from two coordinate vectors.

    Parameters
    ----------
    x, y : ndarray
        Two 1-D arrays representing the x and y coordinates of a grid.

    Returns
    -------
    X, Y : ndarray
        For vectors `x`, `y` with lengths ``Nx=len(x)`` and ``Ny=len(y)``,
        return `X`, `Y` where `X` and `Y` are ``(Ny, Nx)`` shaped arrays
        with the elements of `x` and y repeated to fill the matrix along
        the first dimension for `x`, the second for `y`.

    See Also
    --------
    index_tricks.mgrid : Construct a multi-dimensional "meshgrid"
                         using indexing notation.
    index_tricks.ogrid : Construct an open multi-dimensional "meshgrid"
                         using indexing notation.

    Examples
    --------
    >>> X, Y = np.meshgrid([1,2,3], [4,5,6,7])
    >>> X
    array([[1, 2, 3],
           [1, 2, 3],
           [1, 2, 3],
           [1, 2, 3]])
    >>> Y
    array([[4, 4, 4],
           [5, 5, 5],
           [6, 6, 6],
           [7, 7, 7]])

    `meshgrid` is very useful to evaluate functions on a grid.

    >>> x = np.arange(-5, 5, 0.1)
    >>> y = np.arange(-5, 5, 0.1)
    >>> xx, yy = np.meshgrid(x, y)
    >>> z = np.sin(xx**2+yy**2)/(xx**2+yy**2)

    """
    x = asarray(x)
    y = asarray(y)
    numRows, numCols = len(y), len(x)  # yes, reversed
    x = x.reshape(1,numCols)
    X = x.repeat(numRows, axis=0)

    y = y.reshape(numRows,1)
    Y = y.repeat(numCols, axis=1)
    return X, Y
It is fairly simple to understand. I extended the pattern to an arbitrary number of dimensions. This code is by no means optimized, and not thoroughly error-checked either, but you get what you pay for. Hope it helps:
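from numpy import asarray

def meshgrid2(*arrs):
    arrs = tuple(reversed(arrs))  # reversed so the last argument maps to axis 0
    lens = [len(a) for a in arrs]  # a list, so it can be indexed under Python 3
    dim = len(arrs)
    ans = []
    for i, arr in enumerate(arrs):
        # shape with length 1 on every axis except this array's own
        slc = [1]*dim
        slc[i] = lens[i]
        arr2 = asarray(arr).reshape(slc)
        # repeat along every other axis to fill the full grid
        for j, sz in enumerate(lens):
            if j!=i:
                arr2 = arr2.repeat(sz, axis=j)
        ans.append(arr2)
    return tuple(ans)  # note: outputs come back in reversed argument order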

Can you show us how you are using np.meshgrid? There is a very good chance that you really don't need meshgrid because numpy broadcasting can do the same thing without generating a repetitive array.
For example,
import numpy as np
x=np.arange(2)
y=np.arange(3)
[X,Y] = np.meshgrid(x,y)
S=X+Y
print(S.shape)
# (3, 2)
# Note that meshgrid associates y with the 0-axis, and x with the 1-axis.
print(S)
# [[0 1]
# [1 2]
# [2 3]]
s=np.empty((3,2))
print(s.shape)
# (3, 2)
# x.shape is (2,).
# y.shape is (3,).
# x's shape is broadcasted to (3,2)
# y varies along the 0-axis, so to get its shape broadcasted, we first upgrade it to
# have shape (3,1), using np.newaxis. Arrays of shape (3,1) can be broadcasted to
# arrays of shape (3,2).
s=x+y[:,np.newaxis]
print(s)
# [[0 1]
# [1 2]
# [2 3]]
The point is that S=X+Y can and should be replaced by s=x+y[:,np.newaxis] because the latter does not require (possibly large) repetitive arrays to be formed. It also generalizes to higher dimensions (more axes) easily. You just add np.newaxis where needed to effect broadcasting as necessary.
See http://www.scipy.org/EricsBroadcastingDoc for more on numpy broadcasting.
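Extending the same idea to three vectors, as the question asks, might look like the sketch below. It compares a dense meshgrid sum with the broadcast version; the summed expression is just a stand-in for whatever function you evaluate on the grid.

```python
import numpy as np

x = np.arange(2)
y = np.arange(3)
z = np.arange(4)

# Dense version: meshgrid materializes three full (2, 3, 4) arrays.
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
dense = X + Y + Z

# Broadcast version: each vector gets length-1 axes everywhere except its own.
broadcast = (x[:, np.newaxis, np.newaxis]
             + y[np.newaxis, :, np.newaxis]
             + z[np.newaxis, np.newaxis, :])

print(broadcast.shape)
```

Both give the same (2, 3, 4) result, but the broadcast version never builds the three intermediate coordinate arrays.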

I think what you want is
X, Y, Z = numpy.mgrid[-10:10:100j, -10:10:100j, -10:10:100j]
for example.
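A small sketch of how this relates to meshgrid: with a complex step such as 3j, mgrid treats the number as a point count and includes the end point, so it matches linspace combined with meshgrid in "ij" indexing. The sizes below are arbitrary small values chosen for illustration.

```python
import numpy as np

# A complex step means "this many points, end point included".
X, Y, Z = np.mgrid[0:1:3j, 0:1:4j, 0:1:5j]

# The equivalent construction with linspace and meshgrid.
Xm, Ym, Zm = np.meshgrid(np.linspace(0, 1, 3),
                         np.linspace(0, 1, 4),
                         np.linspace(0, 1, 5),
                         indexing="ij")

print(X.shape)
```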

Here is a multidimensional version of meshgrid that I wrote:
def ndmesh(*args):
    args = [np.asarray(a) for a in args]
    # give the i-th array i trailing length-1 axes, then broadcast them all together
    return np.broadcast_arrays(*[x[(slice(None),)+(None,)*i] for i, x in enumerate(args)])
Note that the returned arrays are views of the original array data, so changing the original arrays will affect the coordinate arrays.

Instead of writing a new function, numpy.ix_ should do what you want.
Here is an example from the documentation:
>>> ixgrid = np.ix_([0,1], [2,4])
>>> ixgrid
(array([[0],
       [1]]), array([[2, 4]]))
>>> ixgrid[0].shape, ixgrid[1].shape
((2, 1), (1, 2))
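For three vectors, np.ix_ returns an open grid: each output has length 1 on every axis except its own, and the pieces broadcast against each other when combined. The sketch below uses integer vectors, since the documentation describes np.ix_ in terms of integer or boolean sequences.

```python
import numpy as np

x = np.arange(2)
y = np.arange(3)
z = np.arange(4)

# Each output keeps its own axis and has length 1 on the others.
gx, gy, gz = np.ix_(x, y, z)
print(gx.shape, gy.shape, gz.shape)

# Combining them broadcasts to the full 3D grid.
total = gx + gy + gz
print(total.shape)
```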

You can achieve that by changing the order:
import numpy as np
xx = np.array([1,2,3,4])
yy = np.array([5,6,7])
zz = np.array([9,10])
y, z, x = np.meshgrid(yy, zz, xx)
