Store the results in a separate NumPy array - python

I was told to use a Python program to compute ê for every row of the array and store the results in a separate NumPy array.
Which of examples 1 and 2 (image) below is correctly stored as a separate NumPy array?

Consider a 3*2 image. "Every row" probably means applying the operation e across the columns, that is, along axis=1. Here is an example using np.sum():
>>> img=np.array([[1,1],[2,1],[4,1]])
>>> e=np.sum(img,axis=1)
>>> e
array([2, 3, 5])
>>> e.shape
(3,)
>>> img.shape
(3, 2)
>>> img
array([[1, 1],
[2, 1],
[4, 1]])
>>>
However, it really depends on what your ê is, which you haven't posted.
It could be that ê is calculated for each element ('element-wise'), in which case you would do what Rinshan describes in the second part of his answer.
Refer to these diagrams to work out which axis you need to apply e-hat along. Sorry, this is as much as I can help.
EDIT: If e-hat is exp, you could sum across the columns and then apply np.exp().

Compute exponent row wise
import numpy as np
ar = np.array(([4.0,0.2,1.16,0.5],[6.0,0.1,0.06,-0.75]))
e = np.exp(np.sum(ar,axis=1)) # O/P array([350.72414402, 223.63158768])
# Take the exponent of each element, then sum along axis 1
e = np.sum(np.exp(ar),axis=1)
Computing exponent element wise
e = np.exp(ar)
# To convert to single row
e =e.reshape(-1) # or single line e = np.exp(ar).reshape(-1)
print(e)
array([ 54.59815003, 1.22140276, 3.18993328, 1.64872127,
403.42879349, 1.10517092, 1.06183655, 0.47236655])
Compute by raising a chosen base to the row sums
import numpy as np
ar = np.array(([4.0,0.2,1.16,0.5],[6.0,0.1,0.06,-0.75]))
s = np.sum(ar,axis=1)
e = np.e # assign whatever value e should have
e_calculated = e ** s
# Equivalent with np.power: np.power(e,s)

Related

Perform matrix multiplication with single column * single row

How do I stop python from assuming that I want the scalar result of a dot product?
I have columns taken from two matrices, V=[v1,v2,v3,...] and D=[d1,d2,...] of lengths M and N respectively.
I need the following matrix, which can be generated by matrix multiplication of one column with one row.
v1*d1, v1*d2, v1*d3, ...
v2*d1, v2*d2,
v3*d1,
.
.
.
This calculation will be done at least hundreds of thousands of times so I don't want to use a for-loop.
When I try to do this with numpy it assumes I want the more common dot product (1xM, Nx1) to result in a scalar(if M=N) or error, rather than the (Mx1, 1xN) for the MxN matrix I want. I've tried np.dot and np.matmul, and in each case it seems to ignore np.transpose.
In the following I've tried to specify that these objects should be considered to have two dimensions, and it gives the same error with or without the presence of transpose.
import numpy as np
v = np.arange(4)
d = np.arange(3)
np.reshape(v,(1,4))
np.reshape(d,(3,1))
e = np.matmul(np.transpose(v),d)
print(e)
Traceback (most recent call last):
File "/home/voidbender/research/NNs/test2.py", line 8, in <module>
e = np.matmul(np.transpose(v),d)
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 4)
np.reshape returns a new array rather than modifying its argument, and you forgot to assign the reshaped versions back to v and d. This does the job:
import numpy as np
v = np.arange(4)
d = np.arange(3)
v =np.reshape(v,(1,4))
d =np.reshape(d,(3,1))
e=np.matmul(d,v)
print(e)
results:
[[0 0 0 0]
[0 1 2 3]
[0 2 4 6]]
Just to make it simpler, and avoid reshape, you can create column and row vectors by:
row_vector=np.array([np.arange(4)])
col_vector=np.array([np.arange(3)]).T
e=np.matmul(col_vector,row_vector)
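If the goal is the M x N matrix of products v_i * d_j described in the question, a minimal sketch using np.outer or broadcasting may be simpler still, since both work directly on 1-D arrays. The variable names outer and outer_bc are only illustrative.

```python
import numpy as np

v = np.arange(4)  # length M = 4
d = np.arange(3)  # length N = 3

# np.outer treats v as a column and d as a row: outer[i, j] == v[i] * d[j]
outer = np.outer(v, d)

# The same result via broadcasting: (4, 1) * (1, 3) -> (4, 3)
outer_bc = v[:, np.newaxis] * d[np.newaxis, :]

print(outer.shape)
print(outer)
```

Note that this gives the 4 x 3 matrix (v as the column); swap the arguments to get the 3 x 4 result printed above.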

Find values closest to zero among ndarrays

I have two numpy ndarrays named A and B. Each ndarray has dimension 2 by 3. For each grid point, I have to find which of the two arrays has the element closest to zero and assign a flag accordingly. The flag takes value 1 for array A and value 2 for array B. That is, if the element in (0,0) (i.e., row 0 and column 0) of array A is closer to zero than the (0,0) element of array B, then the output assigns a value 1 in position row 0 and column 0. The output array will have the same dimension as the inputs, 2 by 3.
I give an example below
A= np.array([[0.1,2,0.3],[0.4,3,2]])
B= np.array([[1,0.2,0.5],[4,0.03,0.02]])
The output should be
[[1,2,1],[1,2,2]]
Is there an efficient way of doing it without writing for loop? Many thanks.
Here's what I would do:
import numpy as np
a = np.array([[0.1,2,0.3],[0.4,3,2]])
b = np.array([[1,0.2,0.5],[4,0.03,0.02]])
c = np.abs(np.stack([a, b])).argmin(0)+1
Output:
array([[1, 2, 1],
[1, 2, 2]])
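With only two arrays, the same flags can probably be produced without stacking, by comparing absolute values directly. This is a sketch under the assumption that ties should go to A (flag 1), which matches what argmin does in the version above; the name flags is illustrative.

```python
import numpy as np

a = np.array([[0.1, 2, 0.3], [0.4, 3, 2]])
b = np.array([[1, 0.2, 0.5], [4, 0.03, 0.02]])

# 1 where A's element is at least as close to zero as B's, otherwise 2
flags = np.where(np.abs(a) <= np.abs(b), 1, 2)
print(flags)
```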

Matlab to Python Matrix Code

I am trying to translate some code from MATLAB to Python. I have been stumped on this part of the MATLAB code:
[L,N] = size(Y);
if (L<p)
error('Insufficient number of columns in y');
end
I understand that [L,N] = size(Y) returns the number of rows and columns when Y is a matrix. However, I have limited experience with Python and cannot work out how to do the same there. This is also part of the reason I do not understand how the MATLAB logic inside the if statement can be reproduced in Python.
Thank you in advance!
Also, in case the rest of the code is also needed. Here it is.
function [M,Up,my,sing_values] = mvsa(Y,p,varargin)
if (nargin-length(varargin)) ~= 2
error('Wrong number of required parameters');
end
% data set size
[L,N] = size(Y)
if (L<p)
error('Insufficient number of columns in y');
end
I am still unclear as to what p is from your post; however, the excerpt below effectively performs the same task as your MATLAB code in Python. Using NumPy, you can represent the matrix as a 2-D array and read its .shape attribute, which gives the number of rows and columns, in that order.
import numpy as np
p = 2
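Y = np.array([[1, 1, 1, 1],[2, 2, 2, 2],[3, 3, 3, 3]])
L, N = Y.shape  # (number of rows, number of columns)
if L < p:
    # raise instead of print so execution stops, as MATLAB's error() does
    raise ValueError('Insufficient number of columns in y')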
Non-numpy
data = ([[1, 2], [3, 4], [5, 6]])
L, N = len(data), len(data[0])
p = 2
if L < p:
    raise ValueError("Insufficient number of columns in y")
For a nested list Y, the same counts can also be read with len():
number_of_rows = len(Y)
number_of_cols = len(Y[0])
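Putting the pieces together, the size check at the top of the MATLAB function could be sketched as a small Python function. The name check_data_size is hypothetical and not part of the original mvsa code, and the extra ndim check is an added assumption that Y must be a 2-D matrix. Python already raises a TypeError when a required argument is missing, so the nargin test needs no direct equivalent.

```python
import numpy as np

def check_data_size(Y, p):
    """Return (L, N) for a 2-D input, mirroring [L,N] = size(Y) plus the L < p check."""
    Y = np.asarray(Y)  # accepts nested lists as well as arrays
    if Y.ndim != 2:
        raise ValueError("Y must be a 2-D array")
    L, N = Y.shape  # rows, columns
    if L < p:
        # message copied from the MATLAB source
        raise ValueError("Insufficient number of columns in y")
    return L, N

print(check_data_size([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]], 2))
```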

Numpy find number of occurrences in a 2D array

Is there a numpy function to count the number of occurrences of a certain value in a 2D numpy array? E.g.
np.random.random((3,3))
array([[ 0.68878371, 0.2511641 , 0.05677177],
[ 0.97784099, 0.96051717, 0.83723156],
[ 0.49460617, 0.24623311, 0.86396798]])
How do I find the number of times 0.83723156 occurs in this array?
import numpy as np
arr = np.random.random((3,3))
# build a boolean mask of the elements exactly equal to the target value
condition = arr == 0.83723156
# count the elements
np.count_nonzero(condition)
The value of condition is a boolean array with the same shape as arr, recording whether each element satisfied the condition. np.count_nonzero counts how many nonzero elements are in the array. In the case of booleans it counts the number of elements with a True value.
To be able to deal with floating point accuracy, you could do something like this instead:
condition = np.fabs(arr - 0.83723156) < 0.001
For floating point arrays, np.isclose is a much better option than either comparing for exact equality or defining a custom range.
>>> a = np.array([[ 0.68878371, 0.2511641 , 0.05677177],
[ 0.97784099, 0.96051717, 0.83723156],
[ 0.49460617, 0.24623311, 0.86396798]])
>>> np.isclose(a, 0.83723156).sum()
1
Note that real numbers are not represented exactly in a computer, that is why np.isclose will work while == doesn't:
>>> (0.1 + 0.2) == 0.3
False
Instead:
>>> np.isclose(0.1 + 0.2, 0.3)
True
To count the number of times x appears in any array, you can simply sum the boolean array that results from a == x:
>>> col = numpy.arange(3)
>>> cols = numpy.tile(col, 3)
>>> (cols == 1).sum()
3
It should go without saying, but I'll say it anyway: this is not very useful with floating point numbers unless you specify a range, like so:
>>> a = numpy.random.random((3, 3))
>>> ((a > 0.5) & (a < 0.75)).sum()
2
This general principle works for all sorts of tests. For example, if you want to count the number of floating point values that are integral:
>>> a = numpy.random.random((3, 3)) * 10
>>> a
array([[ 7.33955747, 0.89195947, 4.70725211],
[ 6.63686955, 5.98693505, 4.47567936],
[ 1.36965745, 5.01869306, 5.89245242]])
>>> a.astype(int)
array([[7, 0, 4],
[6, 5, 4],
[1, 5, 5]])
>>> (a == a.astype(int)).sum()
0
>>> a[1, 1] = 8
>>> (a == a.astype(int)).sum()
1
You can also use np.isclose() as described by Imanol Luengo, depending on what your goal is. But often, it's more useful to know whether values are in a range than to know whether they are arbitrarily close to some arbitrary value.
The problem with isclose is that its default tolerance values (rtol and atol) are arbitrary, and the results it generates are not always obvious or easy to predict. To deal with complex floating point arithmetic, it does even more floating point arithmetic! A simple range is much easier to reason about precisely. (This is an expression of a more general principle: first, do the simplest thing that could possibly work.)
Still, isclose and its cousin allclose have their uses. I usually use them to see if a whole array is very similar to another whole array, which doesn't seem to be your question.
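If you do use isclose, one way to make its behaviour easier to reason about is to pass the tolerances explicitly instead of relying on the defaults (rtol=1e-05, atol=1e-08). The sketch below sets rtol to zero so that only an absolute tolerance applies; the value 1e-8 is an assumption chosen for illustration, not a recommendation.

```python
import numpy as np

a = np.array([[0.68878371, 0.2511641, 0.05677177],
              [0.97784099, 0.96051717, 0.83723156],
              [0.49460617, 0.24623311, 0.86396798]])
target = 0.83723156

# With rtol=0 this is equivalent to |a - target| <= atol
count = int(np.isclose(a, target, rtol=0.0, atol=1e-8).sum())
print(count)
```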
In case it is of use to anyone: for very large 2D arrays, if you want to count how many times every element appears in the entire array, you could flatten the array into a list and then count how many times each element appears:
from itertools import chain
from collections import Counter
# the large array is called arr
flatten_arr = list(chain.from_iterable(arr))
dico_nodeid_appearence = Counter(flatten_arr)
# how many times x appeared in arr
dico_nodeid_appearence[x]
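The same tally can likely be obtained without leaving NumPy by using np.unique with return_counts=True, which flattens the input itself. This is a sketch with a small made-up integer array; the names values, counts and occurrences are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 2],
                [3, 2, 1],
                [1, 1, 4]])

# Sorted unique values and how often each one occurs in the whole array
values, counts = np.unique(arr, return_counts=True)
occurrences = dict(zip(values.tolist(), counts.tolist()))
print(occurrences)
```

As with ==, this counts exact matches, so for floating point data the earlier caveats about rounding still apply.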

Min-Max difference in continuous part of certain length within a np.array

I have a numpy array of values like this:
a = np.array((1, 3, 4, 5, 10))
In this case the array has length 5. Now I want to know the difference between the lowest and highest value in the array, but only within a certain continuous part of the array, for example with length 3.
So in this case it would be the difference between 4 and 10, so 6. It would also be nice to have the index of the starting point of the continuous part (in the above example that would be 2). So something like this:
def f(a, length_of_part):
    ...
    return (max_difference, starting_index)
I know I could iterate over sliced parts of the array, but for my actual purpose I have ~150k arrays of length 1500, so that would take too long.
What would be an easy and quick way of doing this?
Thanks in advance!
This is a bit tricky to do in a vectorised way in NumPy. One option is to use numpy.lib.stride_tricks.as_strided, which requires care because a wrong shape or stride lets it read arbitrary memory. Here's an example for a window size of k = 3:
>>> import numpy
>>> k = 3
>>> shape = (len(a) - k + 1, k)
>>> b = numpy.lib.stride_tricks.as_strided(
...     a, shape=shape, strides=(a.strides[0], a.strides[0]))
>>> moving_ptp = numpy.ptp(b, axis=1)  # the b.ptp() method was removed in NumPy 2.0
>>> start_index = moving_ptp.argmax()
>>> moving_ptp[start_index]
6
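On NumPy 1.20 or later, numpy.lib.stride_tricks.sliding_window_view builds the same windows without hand-written strides, which avoids the out-of-bounds risk. The following is a sketch of the function asked for in the question, assuming a 1-D input; the name max_window_range is illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def max_window_range(a, length_of_part):
    """Return (largest max-min difference over any window, start index of that window)."""
    # windows has shape (len(a) - length_of_part + 1, length_of_part); it is a view, not a copy
    windows = sliding_window_view(a, length_of_part)
    ranges = windows.max(axis=-1) - windows.min(axis=-1)
    start = int(ranges.argmax())
    return ranges[start], start

a = np.array((1, 3, 4, 5, 10))
print(max_window_range(a, 3))
```

For many arrays of equal length stacked into one 2-D array, passing axis=1 to sliding_window_view should allow all rows to be processed in a single call, at the cost of more temporary memory for the per-window max and min.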
