Related
I have the following constraints for the problem,
I want to minimize the sum of squared differences of w_i, uw_i divided by SUM(uw) following these restrictions:
1. w_i is at maximum, ul
2. w_i is, at least, 0.05
3. The sum of all w for a sector code can not be bigger than 0.50
So basically I want to generate all w_i for each row, however, I dont know how to implement the third restriction with scipy.
With scipy.optimize.lsq_linear I can force the first two conditions with bound = (0.05, ul), but I don't know how to force the third one.
import pandas as pd
import scipy
import numpy as np
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
df
I think you are trying to do something like this:
import pandas as pd
import scipy
import numpy as np
'''
Minimize the sum of squared differences of w_i, uw_i divided by SUM(uw) following these restrictions:
Constraints:
1. w_i is at maximum, ul [NOTE: I think this should say 'minimum']
2. w_i is, at least, 0.05 [NOTE: I think this should say 'at most']
3. The sum of all w for a sector code can not be bigger than 0.50
'''
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
print(df)
gb = df.groupby('Sector Code')['ul']
codeCounts = gb.count().to_list()
cumCounts = [0] + [sum(codeCounts[:i + 1]) for i in range(len(codeCounts))]
newIdx = []
for code, dfGp in gb:
newIdx += list(dfGp.index)
df = df.reindex(newIdx)
# For each unique Sector Code, create constraint that 0.50 minus the sum of w for that code must be non-negative:
def foo(i, c):
# return a closure that acts as a constraint for the i'th interval defined by c[i-1]:c[i]
def bar(x):
return 0.50 - sum(x[c[i-1]:c[i]])
return bar
cons = [{'type': 'ineq', 'fun': foo(i, cumCounts)} for i in range(1, len(cumCounts))]
# Value of bounds argument to enforce ul <= w_i <= 0.05
bnds = tuple((ul_i, 0.05) for code, ul_group in gb for ul_i in ul_group)
# Initial guess
n = len(df.index)
w_i = np.ones(n) * (1 / n)
# The objective function to be minimized
uw_sum = df.uw.sum()
def fun(w):
return (pd.Series(w) - df.uw).pow(2).sum() / uw_sum
# Optimize using scipy minimize() function
from scipy.optimize import minimize
res = minimize(fun, w_i, method='SLSQP', bounds=bnds, constraints=cons)
print(res)
df['w'] = res.x
df = df.reindex(range(len(df.index)))
print(df)
Explanation:
Use groupby() to get the row count for each unique Sector Code value and also to construct an index ordered by Sector Code, which we use to re-order the original input df
create a list of constraint dictionaries to be passed to the optimizer, one for each Sector Code, which will use python closures to constrain the sum of the corresponding solution elements to be <= 0.50
create a sequence of bounds to constrain solution elements w_i to be between ul and 0.05
create the objective function to return the sum of squared differences of w_i, uw_i divided by sum(uw)
call minimize() from scipy.optimize with the above constraints, bounds, objective function and an initial guess
add a column to the dataframe with the result and call reindex() to restore the original row order.
Output:
uw ul Sector Code
0 0.006822 0.050000 40
1 0.017949 0.050000 40
2 0.001906 0.031289 40
3 0.000904 0.040318 20
4 0.001147 0.046904 15
... ... ... ...
1226 0.003653 0.033553 10
1227 0.002556 0.031094 10
1228 0.002816 0.041031 10
1229 0.010216 0.050000 40
1230 0.001559 0.033480 55
[1231 rows x 3 columns]
fun: 0.4487707682194904
jac: array([0.02089997, 0.00466947, 0.01358654, ..., 0.02070332, nan,
0.02188896])
message: 'Positive directional derivative for linesearch'
nfev: 919
nit: 5
njev: 1
status: 8
success: False
x: array([0.03730054, 0.0247585 , 0.02171931, ..., 0.03300862, 0.05 ,
0.03348039])
uw ul Sector Code w
0 0.006822 0.050000 40 0.050000
1 0.017949 0.050000 40 0.050000
2 0.001906 0.031289 40 0.031289
3 0.000904 0.040318 20 0.040318
4 0.001147 0.046904 15 0.046904
... ... ... ... ...
1226 0.003653 0.033553 10 0.033553
1227 0.002556 0.031094 10 0.031094
1228 0.002816 0.041031 10 0.041031
1229 0.010216 0.050000 40 0.050000
1230 0.001559 0.033480 55 0.033480
[1231 rows x 4 columns]
Note that success is False, so perhaps some work remains. Hopefully the dataframe related manipulations are helpful in addressing your question.
You already got a working answer from #constantstranger. IMO, there's just one problem: it's quite slow. More precisely, it took more than a minute to solve the problem on my machine.
Therefore, some notes on what could be done in order to speed up the solver in the following:
Since Python has a noticeable overhead when calling functions, it's a good idea to implement all functions as fast as possible. For instance, evaluating one vectorial constraint function is faster than evaluating multiple scalar constraint functions.
At the moment, all derivatives (the objective gradient and the constraint Jacobian) are approximated by finite differences. This is a real bottleneck because each evaluation of the approximated derivative goes in hand with multiple objective/constraint function evaluations. Instead, it's highly recommended to provide the exact derivatives or use algorithmic differentiation.
Last but not least, scipy.optimize.minimize is only suited for small to mid-sized problems at least. If you are willing to use another package, you could use IPOPT, the state-of-the-art NLP solver. The cyipopt package provides a scipy-like interface, so it isn't hard switching from scipy.optimize.minimize.
Besides from that, your problem is a (convex) quadratic optimization problem and can be formulated as follows:
min f(w) s.t. A*w <= 0.5, u_l <= w <= 0.05
with
f(w) = (1/sum(u_w)) * ||w - u_w||^2_2 = (1/sum(u_w)) * (w'Iw - 2u_w'*w + u_w'u_w)
where A[i,j] = 1 if w[j] belongs to sector i and 0 otherwise.
Then, solving the problem with IPOPT (note that we pass the exact derivatives) looks like this:
import numpy as np
import pandas as pd
from cyipopt import minimize_ipopt
# dataframe
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
# sectors
sectors = df["Sector Code"].unique()
# building the matrix A
A = np.zeros((sectors.size, len(df)))
for i, sec in enumerate(sectors):
indices = df[df["Sector Code"] == sec].index.values
A[i, indices] = 1
uw = df['uw'].values
uw_sum = uw.sum()
# objective
def obj(w):
return np.sum((w - uw)**2) / uw_sum
# objective gradient
def grad(w):
return (2*w - 2*uw) / uw_sum
# Linear Constraint A # w <= 0.5 <=> 0.5 - A # w >= 0
cons = [{'type': 'ineq', 'fun': lambda w: 0.5 - A # w, 'jac': lambda w: -A}]
# variable bounds
bounds = [(u_i, 0.05) for u_i in df.ul.values]
# feasible initial guess
w0 = np.ones(len(df)) / len(df)
# solve the problem
res = minimize_ipopt(obj, x0=w0, jac=grad, bounds=bounds, constraints=cons)
print(res)
On my machine, this terminates in less than 2 seconds and yields
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
fun: 0.4306218505716169
info: {'x': array([0.05 , 0.05 , 0.03128946, ..., 0.04103131, 0.05 ,
0.03348038]), 'g': array([-3.51687688, -9.45217602, -7.88799127, -1.78825803, -1.86650095,
-5.09092925, -2.11181422, -1.35485327, -1.15847276, 0.35 ]), 'obj_val': 0.4306218505716169, 'mult_g': array([-1.000000e+03, -1.000000e+03, -1.000000e+03, -1.000000e+03,
-1.000000e+03, -1.000000e+03, -1.000000e+03, -1.000000e+03,
-1.000000e+03, -2.857166e-09]), 'mult_x_L': array([1000.02960821, 1000.02197802, 1000.00000005, ..., 1000.00000011,
1000.02728049, 1000.00000006]), 'mult_x_U': array([0.00000000e+00, 0.00000000e+00, 5.34457820e-08, ...,
1.11498931e-07, 0.00000000e+00, 6.05340266e-08]), 'status': 2, 'status_msg': b'Algorithm converged to a point of local infeasibility. Problem may be infeasible.'}
message: b'Algorithm converged to a point of local infeasibility. Problem may be infeasible.'
nfev: 13
nit: 9
njev: 7
status: 2
success: False
x: array([0.05 , 0.05 , 0.03128946, ..., 0.04103131, 0.05 ,
0.03348038])
[Finished in 1.9s]
When running the ODR algorithm on some experiment data, I've been asked to run it with the following model:
It is clear that this fitting function is containing a redundant degree of freedom.
When I run the fitting on my experiment data I get enormous relative errors of beta, starting from 8000% relative error.
When I try to run the fitting again but with a fitting function that doesn't have a redundant degree of freedom, such as:
I don't get this kind of problem.
Why is this happening? Why the ODR algorithm is so sensitive for redundant degrees of freedom? I wasn't able to answer these questions to my supervisors. An answer will be much appreciated.
Reproducing code example:
from scipy.odr import RealData, Model, ODR
def func1(a, x):
return a[0] * (x + a[1]) / (a[3] * (x + a[1]) + a[1] * x) + a[2]
def func2(a, x):
return a[0] / (x + a[1]) + a[2]
# fmt: off
zx = [
1911.125, 2216.95, 2707.71, 3010.225, 3410.612, 3906.015, 4575.105, 5517.548,
6918.481,
]
dx = [
0.291112577, 0.321695254, 0.370771197, 0.401026507, 0.441068641, 0.490601621,
0.557573268, 0.651755155, 0.79184836,
]
zy = [
0.000998056, 0.000905647, 0.000800098, 0.000751041, 0.000699982, 0.000650532,
0.000600444, 0.000550005, 0.000500201,
]
dy = [
5.49029e-07, 5.02824e-07, 4.5005e-07, 4.25532e-07, 3.99991e-07, 3.75266e-07,
3.50222e-07, 3.25003e-07, 3.00101e-07,
]
# fmt: on
data = RealData(x=zx, y=zy, sx=dx, sy=dy)
print("Func 1")
print("======")
beta01 = [
1.46,
4775.4,
0.01,
1000,
]
model1 = Model(func1)
odr1 = ODR(data, model1, beta0=beta01)
result1 = odr1.run()
print("beta", result1.beta)
print("sd beta", result1.sd_beta)
print("relative", result1.sd_beta / result1.beta * 100)
print()
print()
print("Func 2")
print("======")
beta02 = [
1,
1,
1,
]
model2 = Model(func2)
odr2 = ODR(data, model2, beta0=beta02)
result2 = odr2.run()
print("beta", result2.beta)
print("sd beta", result2.sd_beta)
print("relative", result2.sd_beta / result2.beta * 100)
This prints out:
Func 1
======
beta [ 1.30884537e+00 -2.82585952e+03 7.79755196e-04 9.47943376e+01]
sd beta [1.16144608e+02 3.73765816e+06 6.12613738e-01 4.20775596e+03]
relative [ 8873.82193523 -132266.24068473 78564.88054498 4438.82627453]
Func 2
======
beta [1.40128121e+00 9.80844274e+01 3.00511669e-04]
sd beta [2.73990552e-03 3.22344713e+00 3.74538794e-07]
relative [0.1955286 3.28640051 0.12463369]
Scipy/Numpy/Python version information:
Versions are:
Scipy - 1.4.1
Numpy - 1.18.2
Python - 3.7.2
The problem is not with the degrees of freedom.
The degrees of freedom is the difference between the number of data points and the number of fitting parameters.
The problem has the same number of degrees of freedom for the two formulae, as they have the same number of parameters.
It also looks like that you do not have free degrees of freedom, which is good news, it means that it can potentially be fitted.
However, you are right that first expression has some problem: the parameters you are trying to fit are not independent.
This is probably better understood with some simpler example.
Consider the following expression:
y = x + b + c
which you try to fit, given n data for x and y with n >> 2.
The question is: what are the optimal value for b and c? This cannot be answered. All you can say from x and y data is about the combination. Therefore, if b + c is 0, the fit cannot tell us if b = 1000, c = -1000 or b = 1, c= -1, but at least we can say that given b we can determine c.
What is the error on a given b? Potentially infinite. That is the reason for the fitting to give you that large relative error.
I am trying to fit a linear line to my graph:
the x values of the red line(raw data) are:
array([ 0.03591733, 0.16728212, 0.49537727, 0.96912459,
1. , 1. , 1.11894521, 1.93042113,
2.94284656, 10.98699942])
and the y values are
array([ 0.0016241 , 0.00151784, 0.00155586, 0.00174498, 0.00194872,
0.00189413, 0.00208325, 0.00218074, 0.0021281 , 0.00243127])
my code for the line of best fit is:
LineFit = np.polyfit(x, y, 1)
p = np.poly1d(LineFit)
plt.plot(x,y,'r-')
plt.plot(x,p(y),'--')
plt.show()
However, my LineFit returns me
array([ 7.03475069e-05, 1.76565292e-03])
which supposed to be interception and gradient according to the definition of polyfit (lower to higher order coefficient)
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.polynomial.polynomial.polyfit.html
but seems like its the opposite (gradient and interception) from the plot.
Could someone explain this to me?
You are looking at a different doc. See https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.polyfit.html#numpy.polyfit:
...Fit a polynomial p(x) = p[0] * x**deg + ... + p[deg] of degree deg to points (x, y).
So in your example, it is p(x) = p[0] * x + p[1], which is exactly gradient and interception...
I wanted a very simple spring system written in numpy. The system would be defined as a simple network of knots, linked by links. I'm not interested in evaluating the system over time, but instead I want to go from an initial state, change a variable (usually move a knot to a new position) and solve the system until it reaches a stable state (last applied force is below a given threshold). The knots have no mass, there's no gravity, the forces are all derived from each link's current lengths/init lengths. And the only "special" variable is that each knot can bet set as "anchored" (doesn't move).
So I wrote this simple solver below, and included a simple example. Jump to the very end for my question.
import numpy as np
from numpy.core.umath_tests import inner1d
np.set_printoptions(precision=4)
np.set_printoptions(suppress=True)
np.set_printoptions(linewidth =150)
np.set_printoptions(threshold=10)
def solver(kPos, kAnchor, link0, link1, w0, cycles=1000, precision=0.001, dampening=0.1, debug=False):
"""
kPos : vector array - knot position
kAnchor : float array - knot's anchor state, 0 = moves freely, 1 = anchored (not moving)
link0 : int array - array of links connecting each knot. each index corresponds to a knot
link1 : int array - array of links connecting each knot. each index corresponds to a knot
w0 : float array - initial link length
cycles : int - eval stops when n cycles reached
precision : float - eval stops when highest applied force is below this value
dampening : float - keeps system stable during each iteration
"""
kPos = np.asarray(kPos)
pos = np.array(kPos) # copy of kPos
kAnchor = 1-np.clip(np.asarray(kAnchor).astype(float),0,1)[:,None]
link0 = np.asarray(link0).astype(int)
link1 = np.asarray(link1).astype(int)
w0 = np.asarray(w0).astype(float)
F = np.zeros(pos.shape)
i = 0
for i in xrange(cycles):
# Init force applied per knot
F = np.zeros(pos.shape)
# Calculate forces
AB = pos[link1] - pos[link0] # get link vectors between knots
w1 = np.sqrt(inner1d(AB,AB)) # get link lengths
AB/=w1[:,None] # normalize link vectors
f = (w1 - w0) # calculate force vectors
f = f[:,None] * AB
# Apply force vectors on each knot
np.add.at(F, link0, f)
np.subtract.at(F, link1, f)
# Update point positions
pos += F * dampening * kAnchor
# If the maximum force applied is below our precision criteria, exit
if np.amax(F) < precision:
break
# Debug info
if debug:
print 'Iterations: %s'%i
print 'Max Force: %s'%np.amax(F)
return pos
Here's some test data to show how it works. In this case i'm using a grid, but in reality this can be any type of network, like a string with many knots, or a mess of polygons...:
import cProfile
# Create a 5x5 3D knot grid
z = np.linspace(-0.5, 0.5, 5)
x = np.linspace(-0.5, 0.5, 5)[::-1]
x,z = np.meshgrid(x,z)
kPos = np.array([np.array(thing) for thing in zip(x.flatten(), z.flatten())])
kPos = np.insert(kPos, 1, 0, axis=1)
'''
array([[-0.5 , 0. , 0.5 ],
[-0.25, 0. , 0.5 ],
[ 0. , 0. , 0.5 ],
...,
[ 0. , 0. , -0.5 ],
[ 0.25, 0. , -0.5 ],
[ 0.5 , 0. , -0.5 ]])
'''
# Define the links connecting each knots
link0 = [0,1,2,3,5,6,7,8,10,11,12,13,15,16,17,18,20,21,22,23,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]
link1 = [1,2,3,4,6,7,8,9,11,12,13,14,16,17,18,19,21,22,23,24,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24]
AB = kPos[link0]-kPos[link1]
w0 = np.sqrt(inner1d(AB,AB)) # this is a square grid, each link's initial length will be 0.25
# Set the anchor states
kAnchor = np.zeros(len(kPos)) # All knots will be free floating
kAnchor[12] = 1 # Middle knot will be anchored
This is what the grid looks like:
If we run my code using this data, nothing will happen since the links aren't pushing or stretching:
print np.allclose(kPos,solver(kPos, kAnchor, link0, link1, w0, debug=True))
# Returns True
# Iterations: 0
# Max Force: 0.0
Now lets move that middle anchored knot up a bit and solve the system:
# Move the center knot up a little
kPos[12] = np.array([0,0.3,0])
# eval the system
new = solver(kPos, kAnchor, link0, link1, w0, debug=True) # positions will have moved
#Iterations: 102
#Max Force: 0.000976603249133
# Rerun with cProfile to see how fast it runs
cProfile.run('solver(kPos, kAnchor, link0, link1, w0)')
# 520 function calls in 0.008 seconds
And here's what the grid looks like after being pulled by that single anchored knot:
Question:
My actual use cases are a little more complex than this example and solve a little too slow for my taste: (100-200 knots with a network anywhere between 200-300 links, solves in a few seconds).
How can i make my solver function run faster? I'd consider Cython but i have zero experience with C. Any help would be greatly appreciated.
Your method, at a cursory glance, appears to be an explicit under-relaxation type of method. Calculate the residual force at each knot, apply a factor of that force as a displacement, repeat until convergence. It's the repeating until convergence that takes the time. The more points you have, the longer each iteration takes, but you also need more iterations for the constraints at one end of the mesh to propagate to the other.
Have you considered an implicit method? Write the equation for the residual force at each non-constrained node, assemble them into a large matrix, and solve in one step. Information now propagates across the entire problem in a single step. As an additional benefit, the matrix you construct should be sparse, which scipy has a module for.
Wikipedia: explicit and implicit methods
EDIT Example of an implicit method matching (roughly) your problem. This solution is linear, so it doesn't take into account the effect of the calculated displacement on the force. You would need to iterate (or use non-linear techniques) to calculate this. Hope it helps.
#!/usr/bin/python3
import matplotlib.pyplot as pp
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import scipy as sp
import scipy.sparse
import scipy.sparse.linalg
#------------------------------------------------------------------------------#
# Generate a grid of knots
nX = 10
nY = 10
x = np.linspace(-0.5, 0.5, nX)
y = np.linspace(-0.5, 0.5, nY)
x, y = np.meshgrid(x, y)
knots = list(zip(x.flatten(), y.flatten()))
# Create links between the knots
links = []
# Horizontal links
for i in range(0, nY):
for j in range(0, nX - 1):
links.append((i*nX + j, i*nX + j + 1))
# Vertical links
for i in range(0, nY - 1):
for j in range(0, nX):
links.append((i*nX + j, (i + 1)*nX + j))
# Create constraints. This dict takes a knot index as a key and returns the
# fixed z-displacement associated with that knot.
constraints = {
0 : 0.0,
nX - 1 : 0.0,
nX*(nY - 1): 0.0,
nX*nY - 1 : 1.0,
2*nX + 4 : 1.0,
}
#------------------------------------------------------------------------------#
# Matrix i-coordinate, j-coordinate and value
Ai = []
Aj = []
Ax = []
# Right hand side array
B = np.zeros(len(knots))
# Loop over the links
for link in links:
# Link geometry
displacement = np.array([ knots[1][i] - knots[0][i] for i in range(2) ])
distance = np.sqrt(displacement.dot(displacement))
# For each node
for i in range(2):
# If it is not a constraint, add the force associated with the link to
# the equation of the knot
if link[i] not in constraints:
Ai.append(link[i])
Aj.append(link[i])
Ax.append(-1/distance)
Ai.append(link[i])
Aj.append(link[not i])
Ax.append(+1/distance)
# If it is a constraint add a diagonal and a value
else:
Ai.append(link[i])
Aj.append(link[i])
Ax.append(+1.0)
B[link[i]] += constraints[link[i]]
# Create the matrix and solve
A = sp.sparse.coo_matrix((Ax, (Ai, Aj))).tocsr()
X = sp.sparse.linalg.lsqr(A, B)[0]
#------------------------------------------------------------------------------#
# Plot the links
fg = pp.figure()
ax = fg.add_subplot(111, projection='3d')
for link in links:
x = [ knots[i][0] for i in link ]
y = [ knots[i][1] for i in link ]
z = [ X[i] for i in link ]
ax.plot(x, y, z)
pp.show()
I have DATA on x and y axes and the output is on z
for example
y = 10
x = [1,2,3,4,5,6]
z = [2.3,3.4,5.6,7.8,9.6,11.2]
y = 20
x = [1,2,3,4,5,6]
z = [4.3,5.4,7.6,9.8,11.6,13.2]
y = 30
x = [1,2,3,4,5,6]
z = [6.3,7.4,8.6,10.8,13.6,15.2]
how can i find the value of z when y = 15 x = 3.5
I was trying to use scipy but i am very new at it
Thanks a lot for the help
vibhor
scipy.interpolate.bisplrep
Reference:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.bisplrep.html
import scipy
import math
import numpy
from scipy import interpolate
x= [1,2,3,4,5,6]
y= [10,20,30]
Y = numpy.array([[i]*len(x) for i in y])
X = numpy.array([x for i in y])
Z = numpy.array([[2.3,3.4,5.6,7.8,9.6,11.2],
[4.3,5.4,7.6,9.8,11.6,13.2],
[6.3,7.4,8.6,10.8,13.6,15.2]])
tck = interpolate.bisplrep(X,Y,Z)
print interpolate.bisplev(3.5,15,tck)
7.84921875
EDIT:
Upper solution does not give you perfect fit.
check
print interpolate.bisplev(x,y,tck)
[[ 2.2531746 4.2531746 6.39603175]
[ 3.54126984 5.54126984 7.11269841]
[ 5.5031746 7.5031746 8.78888889]
[ 7.71111111 9.71111111 10.9968254 ]
[ 9.73730159 11.73730159 13.30873016]
[ 11.15396825 13.15396825 15.2968254 ]]
to overcome this interpolate whit polyinomials of 5rd degree in x and 2nd degree in y direction
tck = interpolate.bisplrep(X,Y,Z,kx=5,ky=2)
print interpolate.bisplev(x,y,tck)
[[ 2.3 4.3 6.3]
[ 3.4 5.4 7.4]
[ 5.6 7.6 8.6]
[ 7.8 9.8 10.8]
[ 9.6 11.6 13.6]
[ 11.2 13.2 15.2]]
This yield
print interpolate.bisplev(3.5,15,tck)
7.88671875
Plotting:
reference http://matplotlib.sourceforge.net/examples/mplot3d/surface3d_demo.html
fig = plt.figure()
ax = Axes3D(fig)
ax.plot_surface(X, Y, Z,rstride=1, cstride=1, cmap=cm.jet)
plt.show()
Given (not as Python code, since the second assignment would obliterate the first in each case, of course;-):
y = 10
x = [1,2,3,4,5,6]
z = [2.3,3.4,5.6,7.8,9.6,11.2]
y = 20
x = [1,2,3,4,5,6]
z = [4.3,5.4,7.6,9.8,11.6,13.2]
you ask: "how can i find the value of z when y = 15 x = 3.5"?
Since you're looking at a point exactly equidistant in both x and y from the given "grid", you just take the midpoint between the grid values (if you had values not equidistant, you'd take a proportional midpoint, see later). So for y=10, the z values for x 3 and 4 are 5.6 and 7.8, so for x 3.5 you estimate their midpoint, 6.7; and similarly for y=20 you estimate the midpoint between 7.6 and 9.8, i.e., 8.7. Finally, since you have y=15, the midpoint between 6.7 and 8.7 is your final interpolated value for z: 7.7.
Say you had y=13 and x=3.8 instead. Then for x you'd take the values 80% of the way, i.e.:
for y=10, 0.2*5.6+0.8*7.8 -> 7.36
for y=20, 0.2*7.6+0.8*9.8 -> 9.46
Now you want the z 30% of the way between these, 0.3*7.36 + 0.7*9.46 -> 8.83, that's z.
This is linear interpolation, and it's really very simple. Do you want to compute it by hand, or find routines that do it for you (given e.g. numpy arrays as "the grids")? Even in the latter case, I hope this "manual" explanation (showing what you're doing in the most elementary of arithmetical terms) can help you understand what you're doing...;-).
There are more advanced forms of interpolation, of course -- do you need those, or does linear interpolation suffice for your use case?
I would say just take the average of the values around it. So if you need X=3.5 and Y=15 (3.5,15), you average (3,10), (3,20), (4,10) and (4,20). Since I have no idea what the data is you are dealing with, I am not sure if the exact proximity would matter - in which case you can just stick w/the average - or if you need to do some sort of inverse distance weighting.