I have some code where I want to test if the product of a matrix and vector is the zero vector. An example of my attempt is:
import itertools
import numpy as np
from scipy.linalg import toeplitz

n = 2
zerovector = np.asarray([0]*n)
for column in itertools.product([0,1], repeat = n):
    for row in itertools.product([0,1], repeat = n-1):
        M = toeplitz(column, [column[0]]+list(row))
        for v in itertools.product([-1,0,1], repeat = n):
            vector = np.asarray(v)
            if (np.dot(M,v) == zerovector):
                print(M, "No good!")
                break
But the line if (np.dot(M,v) == zerovector): gives the error ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). What is the right way to do this?
The problem is that == between two arrays is an element-wise comparison—you get back an array of boolean values. An array of boolean values isn't a boolean value itself, so you can't use it in an if. This is what the error is trying to tell you.
You could solve this by using the all method, to check whether all of the elements in the boolean array are true. But you're making this way more complicated than you need to. Nonzero values are truthy, zero values are falsey, so you can just use any without a comparison:
if not np.dot(M, v).any():
If you want to make the comparison to zero explicit, just compare to a scalar, don't build a zero vector; it'll get broadcast the same way. And, if you ever do want to build a zero vector, just use the zeros function; don't build a list of zeros in a complicated way and pass it to asarray.
You could also use the count_nonzero function here as a different alternative. If it returns anything truthy (that is, any non-zero number), the array had at least one non-zero.
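As a rough illustration of the options above, here is a minimal sketch that runs the same zero test in several ways. The matrix M and vector v are made-up values chosen so that the product is exactly zero; they are not taken from the question.

```python
import numpy as np

# Hypothetical integer inputs whose product is the zero vector.
M = np.array([[1, 1], [1, 1]])
v = np.array([1, -1])
product = np.dot(M, v)

# 1. No comparison at all: any() is False only if every element is zero.
is_zero_any = not product.any()

# 2. Explicit comparison against a scalar, which is broadcast.
is_zero_all = (product == 0).all()

# 3. Counting the non-zero entries.
is_zero_count = np.count_nonzero(product) == 0

# 4. If a zero vector is really needed, np.zeros builds it directly.
is_zero_equal = np.array_equal(product, np.zeros(2, dtype=int))

print(is_zero_any, is_zero_all, is_zero_count, is_zero_equal)
```

All four checks agree for integer data; which one to use is mostly a matter of readability.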
In general, you're making almost everything harder than necessary, and working through a brief NumPy tutorial and then scanning the main docs pages for useful functions would really help you.
Also, if your values aren't integers, you probably don't actually want to compare == 0 in the first place. Floating-point numbers accumulate rounding errors. To handle that, use the allclose function instead.
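To make the rounding issue concrete, here is a small sketch with a made-up residual that is mathematically zero but not exactly zero in floating point. The tolerance 1e-12 is an arbitrary choice for illustration.

```python
import numpy as np

# 0.1 + 0.2 - 0.3 is zero on paper, but leaves a tiny rounding residue.
residual = np.array([0.1 + 0.2 - 0.3, 0.0])

exact_zero = (residual == 0).all()      # fails because of the residue
close_to_zero = np.allclose(residual, 0)  # uses the default absolute tolerance

# The tolerance can be tightened or loosened explicitly.
close_tight = np.allclose(residual, 0, atol=1e-12)

print(exact_zero, close_to_zero, close_tight)
```

When comparing against zero, only the absolute tolerance (atol) matters, because the relative tolerance is scaled by the reference value.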
As the error says, you need to use all:
if all(np.dot(M,v) == zerovector):
or np.all. The expression np.dot(M,v) == zerovector gives you a boolean vector holding the element-wise comparison of the two vectors.
Related
I've been given the challenge of implementing np.argmin without NumPy.
I've been thinking hard about it for about a day.
I have no idea whether I should use a for statement,
an if statement, a while statement, or some other function.
First question!
First, I thought about using an inequality inside an if statement to distinguish between the cases:
a[0,0] - a[0,1] > 0
a[0,0] - a[0,1] < 0
I tried to write the code by splitting it into these two cases,
but there were too many cases, so I stopped.
Can't it be done with an if statement?
Second question!
As I understand it, argmin returns the position (index) of the smallest value.
The screen capture shows a two-dimensional list that I typed in arbitrarily and converted to an ndarray.
Because the task only has to accept a two-dimensional list as input,
I thought that the directions of axis=0 and axis=1 are fixed.
So axis=0 fixes the column and compares across rows.
Is it okay to think that axis=1 fixes the row and compares across columns?
Third question!
After an arbitrary two-dimensional list is converted to an ndarray,
I thought it would be a matrix of shape i x j.
Then a.shape gives (i, j).
How can I extract i and j from it?
It's really hard to think about all day long.
Any hints would be appreciated.
def argmin(a):
    return min(range(len(a)), key=lambda x : a[x])

def argmax(a):
    return max(range(len(a)), key=lambda x : a[x])
This code handles a 1D list.
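One possible way to extend this to a two-dimensional list, sketched under the assumption that the input is a non-empty list of equally long rows. The names argmin_1d and argmin_2d are hypothetical helpers, and the axis handling is meant to mirror np.argmin only for this simple case. The row and column counts (the i and j of a.shape) come from len(rows) and len(rows[0]).

```python
def argmin_1d(values):
    # Index of the smallest element; the first one wins on ties.
    return min(range(len(values)), key=lambda k: values[k])


def argmin_2d(rows, axis=None):
    # Shape of the nested list: i rows, j columns.
    n_rows, n_cols = len(rows), len(rows[0])

    if axis is None:
        # Flatten row by row and return a single index.
        flat = [value for row in rows for value in row]
        return argmin_1d(flat)
    if axis == 0:
        # For each column, compare the values down the rows.
        return [argmin_1d([rows[r][c] for r in range(n_rows)])
                for c in range(n_cols)]
    if axis == 1:
        # For each row, compare the values across the columns.
        return [argmin_1d(row) for row in rows]
    raise ValueError("axis must be None, 0, or 1")


a = [[3, 1, 2],
     [0, 5, 4]]
print(argmin_2d(a))          # index into the flattened list
print(argmin_2d(a, axis=0))  # one row index per column
print(argmin_2d(a, axis=1))  # one column index per row
```

So axis=0 produces one result per column and axis=1 produces one result per row, which matches the intuition in the question.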
I'm trying to use the isin() function from Numpy library to find elements that are common in two arrays.
Seems pretty basic, but one of those arrays is created using linspace() and the other I just put hard values in.
But it seems like isin() is using == for its comparisons, and so the result returned by the method is missing one of the numbers.
Is there a way I can work around this, either by defining my arrays differently or by using a method other than isin() ?
thetas = np.array(np.linspace(.25, .50, 51))
known_thetas = [.3, .35, .39, .41, .45]
unknown_thetas = thetas[np.isin(thetas, known_thetas, assume_unique = True, invert = True)]
Printing the three arrays, I find that .41 is still in the third array, because when printing them one by one, my value in the first array is actually 0.41000000000000003, which means == comparison returns False. What is the best way of working around this ?
We could use np.isclose after extending one of the arrays to 2D, so that every element of thetas is compared against every element of known_thetas, and then reduce with any along the second axis. That gives a 1D boolean array that can be used to mask the input array -
thetas[~np.isclose(thetas[:,None],known_thetas).any(1)]
To customize the level of tolerance for matches, we could feed in custom relative and absolute tolerance values to np.isclose.
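For instance, a minimal sketch with explicit tolerances might look like this. The values rtol=0 and atol=1e-9 are arbitrary choices for illustration, not recommendations.

```python
import numpy as np

thetas = np.linspace(.25, .50, 51)
known_thetas = [.3, .35, .39, .41, .45]

# Compare every theta with every known theta (51 x 5 boolean matrix),
# using only an absolute tolerance.
matches = np.isclose(thetas[:, None], known_thetas, rtol=0, atol=1e-9)

# A theta is "known" if it matches any of the known values.
is_known = matches.any(axis=1)
unknown_thetas = thetas[~is_known]

print(is_known.sum(), unknown_thetas.size)
```

The intermediate matrix has len(thetas) * len(known_thetas) entries, which is fine here but can become expensive for large inputs.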
If you need performance on large arrays, a NumPy implementation of np.isin with a tolerance argument for floating-point numbers, built on np.searchsorted, uses less memory and is therefore faster -
thetas[~isin_tolerance(thetas,known_thetas,tol=0.001)]
Feed in your tolerance value in tol arg.
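The definition of isin_tolerance is not included in this answer. The following is only a guess at what such a helper could look like, assuming a non-empty array of reference values and an absolute tolerance; it is not the original implementation.

```python
import numpy as np

def isin_tolerance(a, b, tol):
    # Hypothetical sketch: True where an element of `a` lies within
    # `tol` of some element of `b`. Assumes `b` is non-empty.
    a = np.asarray(a, dtype=float)
    b = np.sort(np.asarray(b, dtype=float))

    # Position where each element of `a` would be inserted into sorted `b`.
    idx = np.searchsorted(b, a)

    # The nearest value in `b` is either just left or just right of that spot.
    right = np.clip(idx, 0, len(b) - 1)
    left = np.clip(idx - 1, 0, len(b) - 1)
    nearest = np.minimum(np.abs(a - b[left]), np.abs(a - b[right]))
    return nearest <= tol

thetas = np.linspace(.25, .50, 51)
known_thetas = [.3, .35, .39, .41, .45]
unknown_thetas = thetas[~isin_tolerance(thetas, known_thetas, tol=0.001)]
print(unknown_thetas.size)
```

Because it only sorts and does a binary search, this avoids building the full pairwise comparison matrix.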
If you have a fixed absolute tolerance, you can use np.around to round the values before comparing:
unknown_thetas = thetas[np.isin(np.around(thetas, 5), known_thetas, assume_unique = True, invert = True)]
This rounds thetas to 5 decimal digits, but it's up to you to decide how close the numbers need to be for you to consider them equal.
I'm learning Python right now and I'm stuck on a line of code I found on the internet. I can't understand what this line actually does.
Suppose I have this array:
import numpy as np
x = np.array([[1,5],[8,1],[10,0.5]])
y = x[np.sqrt(x[:,0]**2+x[:,1]**2) < 1]
print (y)
The result is an empty array. What I want to know is what the expression assigned to y actually does. I've never encountered this kind of code before. It seems like the square brackets act like an if-conditional statement. If I write this code instead:
import numpy as np
x = np.array([[1,5],[8,1],[10,0.5]])
y = x[0 < 1]
print (y)
It will return exactly what x is (because zero IS less than one).
Assuming that it is a way to write if-conditional statement, I find it really absurd because I'm comparing an array with an integer.
Thank you for your answer!
In NumPy:
np.array([1,1,2,3,4]) < 2
is (very roughly) equivalent to something like:
[x<2 for x in [1,1,2,3,4]]
for vanilla Python lists. In both cases, the result holds the values:
[True, True, False, False, False]
The same holds true for some other functions, like addition, multiplication and so on. Broadcasting is actually a major selling point for Numpy.
Now, another thing you can do in Numpy is boolean indexing, which is providing an array of bools that are interpreted as 'Keep this value Y/N?'. So:
arr = np.array([1,1,2,3,4])
res = arr[arr<2]
# evaluates to:
# array([1, 1])
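Applied to the array from the question, the same idea can be spelled out step by step. The threshold of 6 in the second selection is a made-up value, used only to show a non-empty result.

```python
import numpy as np

x = np.array([[1, 5], [8, 1], [10, 0.5]])

# Euclidean length of each row, one value per row.
norms = np.sqrt(x[:, 0]**2 + x[:, 1]**2)

# One boolean per row: is this row's length below 1?
mask = norms < 1

# Keep only the rows where the mask is True. None qualify here.
y = x[mask]
print(mask, y.shape)

# With a larger (hypothetical) threshold, the first row is kept.
print(x[norms < 6])
```

So the brackets are not an if statement: the comparison yields one boolean per row, and indexing with that array keeps the rows marked True.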
numpy works differently when you slice an array using a boolean or an int.
From the docs:
This advanced indexing occurs when obj is an array object of Boolean type, such as may be returned from comparison operators. A single
boolean index array is practically identical to x[obj.nonzero()]
where, as described above, obj.nonzero() returns a tuple (of length
obj.ndim) of integer index arrays showing the True elements of obj.
However, it is faster when obj.shape == x.shape.
If obj.ndim == x.ndim, x[obj] returns a 1-dimensional array filled
with the elements of x corresponding to the True values of obj. The
search order will be row-major, C-style. If obj has True values at
entries that are outside of the bounds of x, then an index error will
be raised. If obj is smaller than x it is identical to filling it with
False.
When you index an array using booleans, you are telling NumPy to select the data corresponding to True, so array[True] is not the same as array[1]. In the first case, NumPy interprets True as a zero-dimensional boolean array, which, based on how masks work, selects all the data (with an extra leading axis of length 1 in the result).
Therefore:
x[True]
will return the full array, just as
x[False]
will return an empty array.
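A quick check of the shapes, using the array from the question, shows the difference between a boolean index and an integer index:

```python
import numpy as np

x = np.array([[1, 5], [8, 1], [10, 0.5]])

all_rows = x[True]    # scalar boolean mask: keeps everything, adds an axis
no_rows = x[False]    # scalar boolean mask: keeps nothing
second_row = x[1]     # integer index: picks the row at position 1

print(all_rows.shape)    # (1, 3, 2)
print(no_rows.shape)     # (0, 3, 2)
print(second_row.shape)  # (2,)
```

Note that x[True] contains all of x but is not identical to it: the result gains a leading axis of length 1.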
Is there a neat way to check whether a NumPy array is surrounded by zeros?
Example:
[[0,0,0,0],
[0,1,2,0],
[0,0,0,0]]
I know I can iterate it element wise to find out but I wonder is there any nice trick we can use here. The numpy array is of floats, n x m of arbitrary size.
Any ideas are welcome.
You can use numpy.any() to test if there is any non-zero element in numpy array.
Now, to test if a 2D array is surrounded by zeroes, you can get first and last columns as well as first and last rows and test if any of those contains a non-zero number.
def zero_surrounded(array):
    return not (array[0,:].any() or array[-1,:].any() or array[:,0].any() or array[:,-1].any())
We can check this by constructing two submatrices:
A[[0,-1]], the first and the last row, including the first and last column; and
A[1:-1,[0,-1]], the first and last column, excluding the first and last row.
All the values of these matrices should be equal to zero, so we can use:
if np.all(A[[0,-1]] == 0) and np.all(A[1:-1,[0,-1]] == 0):
    # ...
    pass
This works for an arbitrary 2D array, but not for arrays with more dimensions. We can however use a trick for that as well.
For an array with an arbitrary number of dimensions, we can use:
def surrounded_zero_dim(a):
    n = a.ndim
    sel = ([0,-1],)
    sli = (slice(1,-1),)
    return all(np.all(a[sli*i+sel] == 0) for i in range(n))
Using the slice is strictly speaking not necessary, but it prevents checking certain values twice.
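As a rough usage sketch, the function can be tried on zero-padded test arrays. The inputs below are made up with np.pad purely for illustration.

```python
import numpy as np

def surrounded_zero_dim(a):
    n = a.ndim
    sel = ([0, -1],)
    sli = (slice(1, -1),)
    # For axis i: skip the borders of the earlier axes (already checked)
    # and test the first and last slice along axis i.
    return all(np.all(a[sli * i + sel] == 0) for i in range(n))

# A 2D block of non-zero floats with a border of zeros around it.
inner = np.arange(1, 7, dtype=float).reshape(2, 3)
padded_2d = np.pad(inner, 1, mode="constant")

# The same idea in 3D, plus a copy with one non-zero value on the border.
padded_3d = np.pad(np.ones((2, 2, 2)), 1, mode="constant")
broken_3d = padded_3d.copy()
broken_3d[0, 1, 1] = 5.0

print(surrounded_zero_dim(padded_2d))
print(surrounded_zero_dim(padded_3d))
print(surrounded_zero_dim(broken_3d))
```

The first two calls report a zero border and the last one does not, since one face of the 3D array has been modified.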
Not the fastest, but perhaps the shortest (and hence a "neat") way of doing it:
surrounded = np.sum(a[1:-1, 1:-1]**2) == np.sum(a**2)
print(surrounded) # True
Here, a is the array.
This compares the sum of all squared elements to the sum of all squared elements except for those on the boundary. If we left out the squaring, cases where positive and negative boundary values add up to zero would produce the wrong answer.
epsData is a two-dimensional array consisting of Dates and StockID.
I took out some of the code in order to make it simple.
The code calls the functions Generate and neweps, epsData is passed by the engine. I am not sure why it gives an error when I try to pass the array epsss to the SUE() function.
I tried to remove the extra bracket in array (if any) by using flatten function but that does not help.
SUE() is supposed to loop through the array and find the 4th last different value and then store these in an array.
I get this error:
TypeError: return arrays must be of ArrayType
with the three lines marked below:
def lastdifferentvalue(vals,datas,i):
    sizes=len(datas)
    j=sizes-1
    values=0
    while (i>0) and (j>=0):
        if logical_and((vals-datas[j]!=0),(datas[j]!=0),(datas[j-1]!=0)): # !! HERE !!
            i=i-1
            values=datas[j-1]
        j=j-1
    return j, values

def SUE(datas):
    sizes=len(datas)
    j=sizes-1
    values=0
    sues=zeros(8)
    eps1=datas[j]
    i=7
    while (j>0) and (i>=0) :
        counts, eps2=lastdifferentvalue(eps1,array(datas[0:j]),4)
        if eps2!=0:
            sues[i]=eps1-eps2
            i=i-1
        j,eps1=lastdifferentvalue(eps1,datas[0:j],1) # !! HERE !!
    stddev=std(SUE)
    sue7=SUE[7]
    return stddev,sue7

def Generate(di,alpha):
    #the code below loops through the data. neweps is a two dimensional array of floats [dates, stockid]
    for ii in range(0,len(alpha)):
        if (epss[2,ii]-epss[1,ii]!=0) and (epss[2,ii]!=0) and (epss[1,ii]!=0):
            predata=0
            epsss= neweps[di-delay-250:di-delay+1,ii]
            stddevs,suedata= SUE(array(epsss.flatten())) # !! HERE !!
Presumably, you're using numpy.logical_and, in the form of
np.logical_and(a, b, c)
with the meaning that you'd like to take the logical and of the three. If you check the documentation, though, that's not what it does. It's interpreting c as the array where you intend to store the results.
You probably mean here something like
np.logical_and(a, np.logical_and(b, c))
or
from functools import reduce
reduce(np.logical_and, [a, b, c])
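A small sketch with made-up boolean arrays shows that these forms agree. It also includes np.logical_and.reduce, the ufunc's own reduction, as a third option that is not mentioned above.

```python
from functools import reduce
import numpy as np

# Hypothetical inputs: three boolean arrays of the same length.
a = np.array([True, True, False, True])
b = np.array([True, False, True, True])
c = np.array([True, True, True, False])

nested = np.logical_and(a, np.logical_and(b, c))
reduced = reduce(np.logical_and, [a, b, c])

# The ufunc reduction combines the stacked arrays along the first axis.
ufunc_reduced = np.logical_and.reduce([a, b, c])

print(nested)
print(np.array_equal(nested, reduced), np.array_equal(nested, ufunc_reduced))
```

Only the first position is True in all three inputs, so only the first element of the result is True.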
The line:
if logical_and((vals-datas[j]!=0),(datas[j]!=0),(datas[j-1]!=0))
has two errors:
Presumably you are wanting to perform a logical_and over (vals-datas[j] != 0) and (datas[j] != 0) and (datas[j-1] != 0). However numpy.logical_and only takes two input parameters, the third if passed is assumed to be an output array. Thus if you are wishing to have numpy.logical_and operate over three arrays it should be expressed as:
logical_and(logical_and((vals-datas[j] != 0), (datas[j] != 0)), (datas[j-1] != 0))
In any case, using a logical_and in an if statement makes no sense. It returns an array and an array does not have a truth value. That is, the result of a logical_and is an array of booleans, some of which are true and some false. Are you wishing to check if they are all true? Or if at least some are true?
If the former, then you should test it as:
if numpy.all(logical_and(...)):
    ...
And if the latter then test it as:
if numpy.any(logical_and(...)):
    ...