How to do elementwise comparison sum without for loop

The for loop makes my program very slow. I would've used np.sum(target==output) but I need the argmax value for each row in the output. How can I speed this up?
The output is a tensor data type.
for i, x in enumerate(target):
    if target[i] == torch.argmax(output[i]):
        correct_class += 1

You could vectorize the above using the axis argument of np.argmax, which returns the index of the maximum value in each row:
(target==np.argmax(output, axis=1)).sum()
For instance:
output = np.random.choice([0,1],(4,2))
print(output)
[[1 1]
 [0 1]
 [0 1]
 [0 1]]
target = np.array([0,1,0,1])
(target==np.argmax(output, axis=1)).sum()
# 3
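Since the question mentions that the output is a tensor, the same idea can probably be kept entirely in PyTorch. The following is a minimal sketch, assuming output has shape (n_samples, n_classes) and target holds integer class labels; the example values are made up for illustration.

```python
import torch

# Hypothetical example data: 4 samples, 2 classes (values are illustrative only).
output = torch.tensor([[0.9, 0.1],
                       [0.2, 0.8],
                       [0.3, 0.7],
                       [0.4, 0.6]])
target = torch.tensor([0, 1, 0, 1])

# argmax over dim=1 gives the predicted class for each row.
predicted = torch.argmax(output, dim=1)

# Elementwise comparison, then count the matches; .item() converts to a Python int.
correct_class = (predicted == target).sum().item()
print(correct_class)
```

This avoids the Python-level loop and also avoids converting the tensor to a NumPy array first.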

Related

Find the index of first non-zero element to the right of given elements in python

I have a 2D numpy.ndarray. Given a list of positions, I want to find the positions of first non-zero elements to the right of the given elements in the same row. Is it possible to vectorize this? I have a huge array and looping is taking too much time.
Eg:
matrix = numpy.array([
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1]
])
query = numpy.array([[0,2], [2,1], [1,3], [0,1]])
Expected Result:
>> [[0,3], [2,4], [1,4], [0,3]]
Currently I'm doing this using for loops as follows
for query_point in query:
    y, x = query_point
    result_point = numpy.min(numpy.argwhere(self.matrix[y, x + 1:] == 1)) + x + 1
    print(f'{y}, {result_point}')
PS: I also want to find the first non-zero element to the left. I guess the solution for finding the right point can easily be tweaked to find the left point.

If your query array is sufficiently dense, you can reverse the computation: build an array of the same size as matrix that gives, for each location, the index of the next nonzero element in the same row. Then your problem becomes just one of applying query to this index array, which numpy supports directly.
It is actually much easier to find the left index, so let's start with that. We can transform matrix into an array of indices like this:
r, c = np.nonzero(matrix)
left_ind = np.zeros(matrix.shape, dtype=int)
left_ind[r, c] = c
Now you can find the indices of the preceding nonzero element by using np.maximum similarly to how it is done in this answer: https://stackoverflow.com/a/48252024/2988730:
np.maximum.accumulate(left_ind, axis=1, out=left_ind)
Now you can index directly into left_ind to get the previous nonzero column index:
left_ind[query[:, 0], query[:, 1]]
or
left_ind[tuple(query.T)]
Now to do the same thing with the right index, you need to reverse the array. But then your indices are no longer ascending, and you risk overwriting any zeros you have in the first column. To solve that, in addition to just reversing the array, you need to reverse the order of the indices:
right_ind = np.zeros(matrix.shape, dtype=int)
right_ind[r, c] = matrix.shape[1] - c
You can use any number larger than matrix.shape[1] as your constant as well, provided you subtract the same constant afterwards. The important thing is that the reversed indices all come out greater than zero so np.maximum.accumulate overwrites the zeros. Now you can use np.maximum.accumulate in the same way on the reversed array:
right_ind = matrix.shape[1] - np.maximum.accumulate(right_ind[:, ::-1], axis=1)[:, ::-1]
In this case, I would recommend against using out=right_ind, since right_ind[:, ::-1] is a view into the same buffer. The operation is buffered, but if your line size is big enough, you may overwrite data unintentionally.
Now you can index the array in the same way as before:
right_ind[(*query.T,)]
In both cases, you need to stack with the first column of query, since that's the row key:
>>> row, col = query.T
>>> np.stack((row, left_ind[row, col]), -1)
array([[0, 0],
       [2, 0],
       [1, 1],
       [0, 0]])
>>> np.stack((row, right_ind[row, col]), -1)
array([[0, 3],
       [2, 4],
       [1, 4],
       [0, 3]])
>>> np.stack((row, left_ind[row, col], right_ind[row, col]), -1)
array([[0, 0, 3],
       [2, 0, 4],
       [1, 1, 4],
       [0, 0, 3]])
If you plan on sampling most of the rows in the array, either at once, or throughout your program, this will help you speed things up. If, on the other hand, you only need to access a small subset, you can apply this technique only to the rows you need.
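Putting the steps above together, a self-contained sketch might look like the following. The helper name nearest_nonzero_columns is hypothetical, and the sketch avoids out= in both directions for simplicity.

```python
import numpy as np

matrix = np.array([
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
])
query = np.array([[0, 2], [2, 1], [1, 3], [0, 1]])


def nearest_nonzero_columns(matrix):
    """Hypothetical helper: for every cell, the column of the nearest
    nonzero element at-or-left and at-or-right in the same row."""
    n_cols = matrix.shape[1]
    r, c = np.nonzero(matrix)

    # Left: store each nonzero's own column, then carry it rightwards.
    left_ind = np.zeros(matrix.shape, dtype=int)
    left_ind[r, c] = c
    left_ind = np.maximum.accumulate(left_ind, axis=1)

    # Right: store reversed (positive) columns, carry them leftwards
    # by accumulating over the flipped rows, then undo the reversal.
    right_ind = np.zeros(matrix.shape, dtype=int)
    right_ind[r, c] = n_cols - c
    right_ind = n_cols - np.maximum.accumulate(right_ind[:, ::-1], axis=1)[:, ::-1]
    return left_ind, right_ind


left_ind, right_ind = nearest_nonzero_columns(matrix)
row, col = query.T
result = np.stack((row, left_ind[row, col], right_ind[row, col]), axis=-1)
print(result)
```

Two edge cases are worth keeping in mind with this sketch: a queried cell that is itself nonzero returns its own column in both directions, and a cell with no nonzero element to its right yields matrix.shape[1] (an out-of-range column), while a cell with no nonzero element to its left yields 0, which cannot be told apart from a real hit in column 0.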

I came up with a solution to get both your wanted indices,
i.e. to the left and to the right from the indicated position.
First define the following function, to get the row number and both indices:
def inds(r, c, arr):
    ind = np.nonzero(arr[r])[0]
    indSlice = ind[ind < c]
    iLeft = indSlice[-1] if indSlice.size > 0 else None
    indSlice = ind[ind > c]
    iRight = indSlice[0] if indSlice.size > 0 else None
    return r, iLeft, iRight
Parameters:
r and c are row number (in the source array) and the "starting"
index in this row,
arr is the array to look in (matrix will be passed here).
Then define the vectorized version of this function:
indsVec = np.vectorize(inds, excluded=['arr'])
And to get the result, run:
result = np.vstack(indsVec(query[:, 0], query[:, 1], arr=matrix)).T
The result is:
array([[0, 0, 3],
[2, 0, 4],
[1, 1, 4],
[0, 0, 3]], dtype=int64)
Your expected result is in the left and right columns (the row number
and the index of the first non-zero element after the "starting" position).
The middle column is the index of the last non-zero element before the "starting" position.
This solution is meant to handle the "non-existing" case (when there is no
non-zero element "before" or "after"). In such a case the function returns
None for the respective index. Note that np.vectorize infers its output dtype
from the first call, so mixing None with integers may require passing
otypes=[object, object, object].
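To illustrate the None case, here is a small sketch of the same function with object output types. The variable names and the two-row example matrix are made up for illustration, and using object dtype is an assumption about what is acceptable downstream.

```python
import numpy as np


def inds(r, c, arr):
    """Row number plus the nearest nonzero column strictly left/right of c."""
    ind = np.nonzero(arr[r])[0]
    before = ind[ind < c]
    after = ind[ind > c]
    i_left = before[-1] if before.size > 0 else None
    i_right = after[0] if after.size > 0 else None
    return r, i_left, i_right


# Assumption: object outputs are fine here; this lets None sit next to integers.
inds_vec = np.vectorize(inds, excluded=['arr'], otypes=[object, object, object])

# Hypothetical data where some queries have no neighbour on one side.
matrix = np.array([[1, 0, 0, 1, 1],
                   [0, 0, 1, 0, 0]])
query = np.array([[0, 2], [1, 0], [1, 4]])

result = np.vstack(inds_vec(query[:, 0], query[:, 1], arr=matrix)).T
print(result)
```

The result is an object array, so any later arithmetic on it needs to handle the None entries explicitly.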

Sum of outer product of corresponding lists in two arrays - NumPy

I am trying to find the numpy matrix operations to get the same result as in the following for loop code. I believe it will be much faster but I am missing some python skills to do it.
It works line by line, each value from a line of x is multiplied by each value of the same line in e and then summed.
The first item of result would be (2*0+2*1+2*4+2*2+2*3)+(0*0+...)+...+(1*0+1*1+1*4+1*2+1*3)=30
Any idea would be much appreciated :).
e = np.array([[0,1,4,2,3],[2,0,2,3,0,1]], dtype=object)
x = np.array([[2,0,0,0,1],[0,3,0,0,4,0]], dtype=object)
result = np.zeros(len(x))
for key, j in enumerate(x):
    for jj in j:
        for i in e[key]:
            result[key] += jj*i
>>> result
Out[1]: array([ 30., 56.])

Those are ragged arrays, as they contain lists of different lengths. So a fully vectorized approach, even if possible, won't be straightforward. Here's one using np.einsum in a list comprehension -
[np.einsum('i,j->',x[n],e[n]) for n in range(len(x))]
Sample run -
In [381]: x
Out[381]: array([[2, 0, 0, 0, 1], [0, 3, 0, 0, 4, 0]], dtype=object)
In [382]: e
Out[382]: array([[0, 1, 4, 2, 3], [2, 0, 2, 3, 0, 1]], dtype=object)
In [383]: [np.einsum('i,j->',x[n],e[n]) for n in range(len(x))]
Out[383]: [30, 56]
If you still want a fully vectorized approach, you could make a regular array with the shorter lists padded with zeros. For that, here's a post that lists a NumPy based approach to do the filling.
Once we have the regular shaped arrays as x and e, the final result would be simply -
np.einsum('ik,il->i',x,e)
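As a side note, the sum over all pairwise products of two lists factors into the product of the two sums, so the per-row result can also be computed without einsum. The sketch below shows that identity next to the zero-padding route; the pad helper is hypothetical and simply writes each list into a zero array.

```python
import numpy as np

e = [[0, 1, 4, 2, 3], [2, 0, 2, 3, 0, 1]]
x = [[2, 0, 0, 0, 1], [0, 3, 0, 0, 4, 0]]

# sum_i sum_j x[i] * e[j] == (sum_i x[i]) * (sum_j e[j])
by_sums = [sum(x_row) * sum(e_row) for x_row, e_row in zip(x, e)]


def pad(rows, width):
    """Hypothetical helper: right-pad ragged rows with zeros to a regular array."""
    out = np.zeros((len(rows), width), dtype=int)
    for k, row in enumerate(rows):
        out[k, :len(row)] = row
    return out


width = max(len(row) for row in x + e)
by_einsum = np.einsum('ik,il->i', pad(x, width), pad(e, width))

print(by_sums)
print(by_einsum)
```

Zero padding does not change the result because the padded entries contribute nothing to either sum.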

Is this close to what you are looking for?
https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html
It seems like you are trying to get the dot product of matrices.

ufunc.at for cases where target indices are unique (buffered call possible then)

I use ufunc.at in a way similar to sparse matrix multiplication or, better, as a flow in a graph. c[:, 0] holds the target index into which each element at the source index c[:, 1] is summed:
c = np.array([[0, 1], [0, 2], [1, 1]]) # sum up 1 and 2 into 0, and 1 into 1
src = ... # source vector
targ = ... # target vector, not necessarily 0 in the beginning
np.add.at(targ, c[:, 0], src[c[:, 1]]) # sum up into bins
One could similarly write:
targ[c[:, 0]] += src[c[:, 1]]
That approach only works if all target indices c[:, 0] are unique; otherwise there is something like a race condition. I also expect it to be a bit faster, because it does not need to handle accumulation internally and can do a 'one-shot' addition, which is far more efficient to vectorize. NumPy calls this buffered/unbuffered operation.
Is there a similar syntax for the buffered version with unique target indices? (Basically just for convenience and more consistent-looking code.)
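The difference described above can be seen in a small sketch. The src values are made up for illustration; with duplicate targets only np.add.at accumulates, while with unique targets both forms agree.

```python
import numpy as np

c = np.array([[0, 1], [0, 2], [1, 1]])  # target 0 appears twice
src = np.array([5.0, 10.0, 20.0])       # hypothetical source vector

# Unbuffered: every (target, source) pair is accumulated.
targ_at = np.zeros(2)
np.add.at(targ_at, c[:, 0], src[c[:, 1]])

# Buffered fancy-index +=: for a repeated target only one write survives,
# and which one is not something to rely on.
targ_fancy = np.zeros(2)
targ_fancy[c[:, 0]] += src[c[:, 1]]

# With unique targets the two forms give the same result.
c_unique = np.array([[0, 1], [1, 2]])
unique_at = np.zeros(2)
np.add.at(unique_at, c_unique[:, 0], src[c_unique[:, 1]])
unique_fancy = np.zeros(2)
unique_fancy[c_unique[:, 0]] += src[c_unique[:, 1]]

print(targ_at, targ_fancy)
print(unique_at, unique_fancy)
```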

Generating binary lists that sum to a given number

I am attempting Project Euler #15, which essentially reduces to computing the number of binary lists of length 2*size such that their entries sum to size, for the particular case size = 20. For example, if size = 2 there are 6 such lists: [1,1,0,0], [1,0,1,0], [1,0,0,1], [0,1,1,0], [0,1,0,1], [0,0,1,1]. Of course the number of such sequences is trivial to compute for any value of size and is equal to a binomial coefficient, but I am interested in explicitly generating the correct sequences in Python. I have tried the following:
import itertools
size = 20
binary_lists = itertools.product(range(2), repeat = 2*size)
lattice_paths = {lists for lists in binary_lists if sum(lists) == size}
but the last line makes me run into memory errors. What would be a neat way to accomplish this?

There are far too many for the case of size=20 to iterate over (even if we don't materialize them, 137846528820 is not a number we can loop over in a reasonable time), so it's not particularly useful.
But you can still do it using built-in tools by thinking of the positions of the 1s:
from itertools import combinations
def bsum(size):
    for locs in combinations(range(2*size), size):
        vec = [0]*(2*size)
        for loc in locs:
            vec[loc] = 1
        yield vec
which gives
>>> list(bsum(1))
[[1, 0], [0, 1]]
>>> list(bsum(2))
[[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 0, 1], [0, 0, 1, 1]]
>>> sum(1 for x in bsum(12))
2704156
>>> from math import factorial
>>> factorial(24)//factorial(12)**2
2704156
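If only the count is needed for size = 20, it can be computed directly, and the generator can still be sampled lazily. This is a sketch assuming Python 3.8+ for math.comb; bsum is repeated from above so that the block runs on its own.

```python
from itertools import combinations, islice
from math import comb


def bsum(size):
    # Choose which positions hold the 1s; everything else stays 0.
    for locs in combinations(range(2 * size), size):
        vec = [0] * (2 * size)
        for loc in locs:
            vec[loc] = 1
        yield vec


size = 20

# Number of binary lists of length 2*size that sum to size.
n_paths = comb(2 * size, size)

# Take only the first few lists instead of materializing all of them.
first_three = list(islice(bsum(size), 3))

print(n_paths)
print(first_three[0])
```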

I'm not 100% sure of the math on this problem, but your last line is taking a generator and dumping it into a set, and based on your example and your size of 20, that is a massive set. If you want to sum it, just iterate, but I don't think you can get a nice view of every combo.

numpy: ravel_multi_index increment different results from iterating over indices loop

I have an array of indices (possibly with duplicates), and I increment each of these indices in another 2D matrix by 1. There have been several suggestions, and this answer proposes using np.ravel_multi_index.
So, I've tried it out but they don't seem to give me the same set of answers. Any idea why?
raveled = np.ravel_multi_index(legit_indices.T, acc.shape)
counts = np.bincount(raveled)
acc = np.resize(counts, acc.shape)
acc2 = np.zeros(acc.shape)
for i in legit_indices:
    acc2[i[0], i[1]] += 1
(Pdb) np.array_equal(acc, acc2)
False
(Pdb) acc[493][5]
135
(Pdb) acc2[493][5]
0.0

There are a few problems with your current approach. Firstly, np.bincount(x)
will give you the counts for every non-negative integer value starting at 0
and ending at max(x):
print(np.bincount([1, 1, 3, 3, 3, 4]))
# [0 2 0 3 1]
# i.e. [count for 0, count for 1, count for 2, count for 3, count for 4]
Therefore, if not every location in acc.flat gets indexed, the length of
np.bincount(raveled) will be greater than the number of unique indices. What
you actually want is the counts only for those locations in acc.flat that are
indexed at least once.
Secondly, what you want to do is assign the bin counts to the corresponding
indices into acc.flat. What your call to np.resize does is to repeat parts
of your array of bincounts in order to make it the same size as acc.flat,
then reshape it to the same shape as acc. This will not result in the bin
counts being assigned to the correct locations in acc!
The way I would solve this problem would be to use np.unique instead of
np.bincount, and use it to return both the unique indices and their corresponding
counts. These can then be used to assign the correct counts to the correct unique locations within acc:
import numpy as np
# some example data
acc = np.zeros((4, 3))
legit_indices = np.array([[0, 1],
                          [0, 1],
                          [1, 2],
                          [1, 0],
                          [1, 0],
                          [1, 0]])
# convert the index array into a set of indices into acc.flat
flat_idx = np.ravel_multi_index(legit_indices.T, acc.shape)
# get the set of unique indices and their corresponding counts
uidx, ucounts = np.unique(flat_idx, return_counts=True)
# assign the count value to each unique index in acc.flat
acc.flat[uidx] = ucounts
# confirm that this matches the result of your for loop
acc2 = np.zeros_like(acc)
for ii, jj in legit_indices:
    acc2[ii, jj] += 1
assert np.array_equal(acc, acc2)
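For comparison, two other routes may also work here, shown as a sketch on the same example data: np.bincount with minlength set to the total number of cells (so that a plain reshape lines up with acc), and np.add.at, which accumulates duplicates in place. The variable names are illustrative.

```python
import numpy as np

acc_shape = (4, 3)
legit_indices = np.array([[0, 1],
                          [0, 1],
                          [1, 2],
                          [1, 0],
                          [1, 0],
                          [1, 0]])

# Route 1: pad the bin counts out to one bin per cell, then reshape.
flat_idx = np.ravel_multi_index(legit_indices.T, acc_shape)
n_cells = acc_shape[0] * acc_shape[1]
acc_bincount = np.bincount(flat_idx, minlength=n_cells).reshape(acc_shape)

# Route 2: unbuffered in-place addition, which handles repeated indices.
acc_add_at = np.zeros(acc_shape, dtype=int)
np.add.at(acc_add_at, (legit_indices[:, 0], legit_indices[:, 1]), 1)

# Reference: the explicit loop from the question.
acc_loop = np.zeros(acc_shape, dtype=int)
for ii, jj in legit_indices:
    acc_loop[ii, jj] += 1

print(acc_bincount)
```

Note that the bincount route starts from zero counts, so it would need to be added to acc rather than assigned if acc already holds values.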
