How do I write code to show the addition operation between two arrays (row-wise)? I don't want the result of the addition, but want to illustrate the operation itself. Here is what I have; however, my code is not giving me the right output:
import numpy as np
Grid = np.random.randint(-50, 50, size=(5, 4))
iList = np.array([[1, -1, 2, -2]])
result = (Grid.astype(str), iList.astype(str))
print(result)
The output needs to be something to this effect:
([3+1 4-1 4+2 5-2]
[6+1 9-1 7+2 8-2]
etc.
thank you.
You basically want to apply a function to two numpy arrays of different sizes, making use of numpy's broadcasting capability.
This works:
import numpy as np
grid = np.random.randint(-50, 50, size=(5, 4))
i_list = np.array([[1, -1, 2, -2]])
def sum_text(x: int, y: int):
    return f'{x}+{y}'
# create a ufunc, telling numpy that it takes 2 arguments and returns 1 value
np_sum_text = np.frompyfunc(sum_text, 2, 1)
result = np_sum_text(grid, i_list)
print(result)
Result:
[['46+1' '-27+-1' '35+2' '-3+-2']
['-5+1' '6+-1' '2+2' '22+-2']
['6+1' '-45+-1' '-21+2' '31+-2']
['25+1' '-4+-1' '-24+2' '3+-2']
['-32+1' '-10+-1' '-19+2' '28+-2']]
Or maybe you don't need to reuse that function and like one-liners:
print(np.frompyfunc(lambda x, y: f'{x}+{y}', 2, 1)(grid, i_list))
Getting rid of the + before a negative integer is trivial:
def sum_text(x: int, y: int):
    return f'{x}+{y}' if y >= 0 else f'{x}{y}'
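Alternatively, Python's format spec can emit the sign for you; a small variation on the same function:
def sum_text(x: int, y: int):
    # the '+' format spec always prints a sign: '+1' for 1, '-1' for -1
    return f'{x}{y:+}'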
I'm trying to make a custom gradient descent estimator; however, I am running into an issue with storing the parameter values at every step of the gradient descent algorithm. Here is the code skeleton:
from numpy import *
import pandas as pd
from joblib import Parallel, delayed
from multiprocessing import cpu_count
ftemp = zeros((2, ))
stemp = empty([1, ], dtype='<U10')
la = 10
vals = pd.DataFrame(index=range(la), columns=['a', 'b', 'string'])
def sfun(k1, k2, k3, string):
    a = k1*k2
    b = k2*k3
    s = string
    nums = [a, b]
    strs = [s]
    return (nums, strs)
def store(inp):
    r = sfun(inp[0], inp[1], inp[2], inp[3])
    ftemp = append(ftemp, asarray(r[0]), axis=0)
    stemp = append(stemp, asarray(r[1]), axis=0)
    return (ftemp, stemp)
for l in range(la):
    inputs = [(2, 3, 4, 'he'),
              (4, 6, 2, 'je'),
              (2, 7, 5, 'ke')]
    Parallel(n_jobs=cpu_count())(delayed(store)(i) for i in inputs)
    vals.iloc[l, 0:2] = ftemp[0, 0], ftemp[0, 1]
    vals.iloc[l, 2] = stemp[0]
    d = ftemp[2, 0] - ftemp[0, 0]
Note: most of the gradient descent stuff is removed because I do not have any issues with it; the main issue I have is storing the values at each step.
sfun() is the loss function (I know that it doesn't look like one here) and store() is just an attempt to store the parameter values at each step.
The important aspect here is that I want to parallelize the process, as sfun() is computationally expensive, and the difficulty is that I want to save the values from all parallel runs.
I tried solving this in many different ways, but I always get a different error.
There is no need to make a temporary storage array; it is possible to store the results of the Parallel() call directly:
a = Parallel(n_jobs=cpu_count())(delayed(store)(i) for i in inputs)
Most importantly, a is populated in the order the inputs are given.
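For example, here is a minimal sketch of how the collected results could replace the temporary arrays entirely (calling sfun() from the question directly, since store() only wrapped it):
from joblib import Parallel, delayed
from multiprocessing import cpu_count

def sfun(k1, k2, k3, string):
    # stand-in for the expensive loss function from the question
    return ([k1 * k2, k2 * k3], [string])

inputs = [(2, 3, 4, 'he'), (4, 6, 2, 'je'), (2, 7, 5, 'ke')]
a = Parallel(n_jobs=cpu_count())(delayed(sfun)(*i) for i in inputs)
# a is a list of (nums, strs) tuples, one per input, in input order
for nums, strs in a:
    print(nums, strs)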
I am attempting to write a program which constructs a matrix and performs a singular value decomposition on it. I am evaluating the function ax^2 + bx + 1 on a grid. I then make a uniform meshgrid of a and b. The rows of the matrix correspond to different quadratic coefficients, while each column corresponds to a grid point at which the function is evaluated.
The MATLAB code is here:
% Collect data
x = linspace(-1,1,100);
[a,b] = meshgrid(0:0.1:1,0:0.1:1);
D=zeros(numel(x),numel(a));
sz = size(D)
% Build "Dose" matrix
for i=1:numel(a)
D(:,i) = a(i)*x.^2+b(i)*x+1;
end
% Do the SVD:
[U,S,V]=svd(D,'econ');
D_reconstructed = U*S*V';
plot(diag(S))
scatter3(a(:),b(:),V(:,1))
This is my attempt at a solution:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-1, 1, 100)
def f(x, a, b):
    return a*x*x + b*x + 1
a, b = np.mgrid[0:1:0.1,0:1:0.1]
#a = b = np.arange(0,1,0.01)
D = np.zeros((x.size, a.size))
for i in range(a.size):
D[i] = a[i]*x*x +b[i]*x +1
U, S, V = np.linalg.svd(D)
plt.plot(np.diag(S))
fig = plt.figure()
ax = plt.axes(projection="3d")
ax.scatter(a, b, V[0])
but I always get broadcasting errors which I am not sure how to fix.
Firstly, in MATLAB you're assigning to D(:,i), but in Python you're assigning to D[i]. The latter is equivalent to D[i, ...], which is in your case D[i, :]. Instead you seem to need D[:, i].
Secondly, in MATLAB a linear index into a 2d array (namely a and b) addresses the array as if it were flattened in column-major order. If you do that with numpy you get rows of the array instead, just as I mentioned with D[i].
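To make the difference concrete, here is a tiny illustration (values arbitrary):
import numpy as np

a = np.arange(6).reshape(2, 3)
print(a[1])          # numpy: row 1 of the array -> [3 4 5]
print(a.ravel()[1])  # a single flattened element -> 1 (like MATLAB's a(2), modulo 1-based, column-major indexing)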
You can do away with the loop with broadcasting and getting your desired 2d array by .ravelling (or reshaping) your a and b arrays:
x = np.linspace(-1, 1, 100)[:, None] # inject trailing singleton for broadcasting
a, b = np.mgrid[0:1:0.1, 0:1:0.1]
D = a.ravel() * x**2 + b.ravel() * x + 1
The way this works is that x has shape (100, 1) after we inject a trailing singleton (in MATLAB trailing singletons are implied; in numpy, leading ones are). Both a.ravel() and b.ravel() have shape (10*10,), which is compatible with (1, 10*10), so broadcasting produces shape (100, 10*10). You could also replace the calls to ravel with
a, b = np.mgrid[...].reshape(2, -1)
which is a trick I sometimes use, but this is harder to read if you're unfamiliar with the pattern.
Side note: it's better to use example data where dimensions end up being of different size so that you notice if something ends up being transposed.
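For completeness, a minimal sketch of the rest of the translation. Note that np.linalg.svd returns V transposed (commonly called Vh) where MATLAB returns V, and full_matrices=False corresponds to MATLAB's 'econ' option:
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-1, 1, 100)[:, None]
a, b = np.mgrid[0:1:0.1, 0:1:0.1]
D = a.ravel() * x**2 + b.ravel() * x + 1

U, S, Vh = np.linalg.svd(D, full_matrices=False)
plt.plot(S)  # S is already a 1d array of singular values, no np.diag needed

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(a.ravel(), b.ravel(), Vh[0])  # Vh[0] is the first right singular vector, like V(:,1)
plt.show()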
I need to use this loss function for a CNN. The list_distances and list_residual arguments are output tensors from hidden layers which are needed to compute the loss, but when I execute the code it gives me back this error:
TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.
Is there another way to iterate over tensors without using the construct x in X, converting them to numpy arrays, or using the Keras backend functions?
import numpy as np
import scipy.fftpack as fp  # assumed source of fp.fft2; numpy.fft would also work
import scipy.stats as scistats
from keras import backend as K

def DBL(y_true, y_pred, list_distances, list_residual, l=0.65):
    prob_dist = []
    Li = []
    # mean of the images' power spectra
    S = np.sum([np.power(np.abs(fp.fft2(residual)), 2)
                for residual in list_residual], axis=0) / K.shape(list_residual)[0]
    # log-ratio between the geometric and arithmetic mean of S
    R = np.log10(scistats.gmean(S) / np.mean(S))
    for c_i, dis_i in enumerate(list_distances):
        prob_dist.append([
            np.exp(-dis_i) / sum([np.exp(-dis_j) if c_j != c_i else 0
                                  for c_j, dis_j in enumerate(list_distances)])
        ])
    for count, _ in enumerate(prob_dist):
        Li.append(
            -1 * np.log10(sum([p_j for c_j, p_j in enumerate(prob_dist[count])
                               if y_pred[count] == 1 and count != c_j])))
    L0 = np.sum(Li)
    return L0 - l * R
You need to define a custom function to feed into tf.map_fn() - see the TensorFlow docs.
Mapper functions map (funnily enough) the existing object (tensor) into a new one using a function you define.
They apply the custom function to every element in the object, without all the mucking about with for loops.
For instance (untested code, may not run - on my phone atm):
import numpy as np
import tensorflow as tf

def custom(a):
    b = a + 1
    return b

original = np.array([2, 2, 2])
mapped = tf.map_fn(custom, original)
# mapped == [3, 3, 3] ... hopefully
The TensorFlow examples all use lambda functions, so you might need to define your function like that if the above doesn't work. TensorFlow example:
elems = np.array([1, 2, 3, 4, 5, 6])
squares = tf.map_fn(lambda x: x * x, elems)
# squares == [1, 4, 9, 16, 25, 36]
Edit:
As an aside, map functions are much easier to parallelise than for loops - each element of an object is assumed to be processed independently of the others - so you can see a performance uplift by using them.
Edit 2:
For the "reduce sum, but not on this index" part, I would heavily recommend you start looking back at matrix operations. As mentioned, map functions work element-wise - they are not aware of other elements. A reduce function is what you want, but even those are finicky when you try to do "not this index" sums. Also, TensorFlow is built around matrix ops, not the MapReduce paradigm.
Something along these lines might help:
sess = tf.Session()
var = np.ones([3, 3, 3]) * 5
zero_identity = tf.linalg.set_diag(
var, tf.zeros(var.shape[0:-1], dtype=tf.float64)
)
exp_one = tf.exp(var)
exp_two = tf.exp(zero_identity)
summed = tf.reduce_sum(exp_two, axis = [0,1])
final = exp_one / summed
print("input matrix: \n", var, "\n")
print("Identities of the matrix to Zero: \n", zero_identity.eval(session=sess), "\n")
print("Exponential Values numerator: \n", exp_one.eval(session=sess), "\n")
print("Exponential Values to Sum: \n", exp_two.eval(session=sess), "\n")
print("Summed values for zero identity matrix\n ... along axis [0,1]: \n", summed.eval(session=sess), "\n")
print("Output:\n", final.eval(session=sess), "\n")
I'm trying to use scipy.optimize to identify the optimal values for 3 parameters (variables). I am starting with a very simple optimization function that sums the analyzed parameters together with some predefined (past) values. The parameters are bounded using some fixed values. I set the value of the sign parameter to -1 as I am dealing with a maximization problem. However, scipy returns [0, 0, 0] as the optimal values (same as setting sign=1), while the correct solution is [2, 2, 2]. Am I setting something wrong? What am I missing?
import scipy.optimize as optimize
import numpy as np
old = [1,1,1]
def f(params, sign=-1.0):
    first, second, third = params
    return sum(old + [first, second, third])
initial_guess = [2,2,2]
in1 = 1
in2 = 2
in3 = 1
bnds = ((0, in1+2), (0, in2+2), (0, in3+2))
result = optimize.minimize(f, initial_guess, bounds=bnds)
print(result.x)
In general, when performing nonlinear optimization, libraries expect your function to take only a single parameter vector. If you want to maximize a function, the usual trick is to minimize its negation; for an objective that stays strictly positive, minimizing its reciprocal also works, which is what the code below does. If you simply want to maximize the value of x1+x2+x3, I would write things out this way:
from scipy.optimize import minimize
def f(x):
    return 1/sum(x)
guess = [2,2,2]
x1bnds = (0, 3)
x2bnds = (0, 4)
x3bnds = (0, 5)
bnds = (x1bnds, x2bnds, x3bnds)
result = minimize(f, guess, bounds=bnds)
print(result.x)
This will give you [3, 4, 5] because the optimizer hit the bounds.
If you want to operate on the distance between your input parameters and some other values, I would modify the setup as so:
from functools import partial
from scipy.optimize import minimize
import numpy as np
other_values = np.asarray([3,4,5])
def f(x, other_pts):
    x_lcl = np.asarray(x)
    difference = x_lcl - other_pts
    return 1/difference.sum()
guess = [2,2,2]
x1bnds = (0, 3)
x2bnds = (0, 4)
x3bnds = (0, 5)
bnds = (x1bnds, x2bnds, x3bnds)
f_opt = partial(f, other_values)
result = minimize(f_opt, guess, bounds=bnds)
print(result.x)
This will give you [0, 0, 0] because the optimizer hit the bounds.
It is a good idea to make the function you optimize not depend on external data (globals) -- using a partial makes everything a little cleaner. If you don't want to use numpy, you could use a list comprehension to do the elementwise subtraction of x and the other parameter vector, but the numpy version is easier to read.
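For reference, the more common way to handle the original maximization problem is to minimize the negated objective; a minimal sketch using the bounds from the question:
import scipy.optimize as optimize

old = [1, 1, 1]

def f(params):
    # negate the sum so that minimizing f maximizes the original objective
    return -(sum(old) + sum(params))

bnds = ((0, 3), (0, 4), (0, 3))
result = optimize.minimize(f, [2, 2, 2], bounds=bnds)
print(result.x)  # the sum is maximized at the upper bounds: [3. 4. 3.]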
I want to preallocate memory for the output of an array operation, and I need to know what dtype to make it. Below I have a function that does what I want it to do, but it is terribly ugly.
import numpy as np
def array_operation(arr1, arr2):
    out_shape = arr1.shape
    # Get the dtype of the output; these lines are the ones I want to replace.
    index1 = ([0],) * arr1.ndim
    index2 = ([0],) * arr2.ndim
    tmp_arr = arr1[index1] * arr2[index2]
    out_dtype = tmp_arr.dtype
    # All so I can do the following.
    out_arr = np.empty(out_shape, out_dtype)
The above is pretty ugly. Does numpy have a function that does this?
You are looking for numpy.result_type.
(As an aside, do you realize that you can access all multi-dimensional arrays as 1d arrays? You don't need to access x[0, 0, 0, 0, 0] -- you can access x.flat[0].)
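A minimal sketch of the original function rewritten with it:
import numpy as np

def array_operation(arr1, arr2):
    # result_type applies numpy's type-promotion rules without evaluating anything
    out_dtype = np.result_type(arr1, arr2)
    return np.empty(arr1.shape, out_dtype)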
For those using numpy version < 1.6, you could use:
def result_type(arr1, arr2):
    x1 = arr1.flat[0]
    x2 = arr2.flat[0]
    return (x1 * x2).dtype

def array_operation(arr1, arr2):
    return np.empty(arr1.shape, result_type(arr1, arr2))
This isn't very different from the code you posted, though I think arr1.flat[0] is a slight improvement over index1 = ([0],) * arr1.ndim; arr1[index1].
For numpy version >= 1.6, use Mike Graham's answer, np.result_type.