Logical indexing - numpy.where in C++ - python

I have created a simple numpy array with shape (4, 2) called A.
import numpy as np
A = np.array([[1, 2],
[2, 2],
[3, 2],
[4, 2]])
I wanted to get the indices of the rows where the first column is 2 or 3, so I did:
indices = np.where((A[:, 0] == 2) | (A[:, 0] == 3))[0]
Doing this I got an array with two items (1 and 2), which is what I wanted.
Now I would like to do this in C++ efficiently. Is there any way to do this using Eigen? I would like to avoid for loops.
Thanks.

Avoiding for loops in NumPy is admirable. But in fact all you're doing there is pushing the loops down into lower-level code implemented in C or Fortran.
There is simply no need to avoid loops in C++. On the contrary, loops are the clear and obvious way to solve this problem in C++. So use loops. They're blazing fast.
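To make that concrete, here is a rough sketch of the kind of loop that np.where runs for you in compiled code. It is written in Python only to stay consistent with the question; the helper name where_first_col_in is made up for illustration, and the same single pass over the rows should translate almost line for line into a C++ for loop.

```python
import numpy as np

def where_first_col_in(A, wanted):
    """Hypothetical helper: indices of rows whose first column is in `wanted`."""
    indices = []
    for i in range(A.shape[0]):   # one pass over the rows
        if A[i, 0] in wanted:     # same test as (col == 2) | (col == 3)
            indices.append(i)
    return indices

A = np.array([[1, 2],
              [2, 2],
              [3, 2],
              [4, 2]])

loop_result = where_first_col_in(A, (2, 3))
numpy_result = np.where((A[:, 0] == 2) | (A[:, 0] == 3))[0]
print(loop_result, numpy_result)
```

Both approaches visit every row once; the vectorized version just hides the loop inside the library.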

NumCpp is a NumPy-style library for C++, and its where function is itself implemented with a for loop:
https://github.com/dpilger26/NumCpp/blob/master/include/NumCpp/Functions/where.hpp

Related

Adding 2 different NumPy arrays with different values inside (Boolean, int)

I am taking the Data Science course on DataCamp. One of the examples lacked an explanation of the numpy addition rules. I am including the picture of the example and the question below. What I did not understand is how two arrays with different kinds of values can be added together and give a result like that.
DataCamp Numpy example
Code Python
In [1]:
np.array([True, 1, 2]) + np.array([3, 4, False])
Out[1]:
array([4, 5, 2])
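You can think of a numpy 1d array as a list in python, except that every element shares a single type, so the booleans are converted to integers (True becomes 1, False becomes 0) when the array is created.
You can see this if you cast to a list like this: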
# cast to a list
a = np.array([True, 1, 2]).tolist()
b = np.array([3, 4, False]).tolist()
# print them out
print(a) # [1,1,2]
print(b) # [3,4,0]
returns this:
[1, 1, 2]
[3, 4, 0]
You are then just adding each element of the lists.
a[0]+b[0] , a[1]+b[1], a[2]+b[2]
So the (numpy) result is this:
[4,5,2]
Because you are using numpy (which is a module in python), the plus (+) operation returns the result as a numpy array (which is the element-wise sum of both arrays).
Note: numpy arrays are similar, but not identical to python lists.
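As a small check of the explanation above, the sketch below inspects the dtype that numpy picks for the mixed inputs. The variable names are only for illustration, and the exact integer width depends on the platform, so the code looks at the dtype kind rather than a specific size.

```python
import numpy as np

a = np.array([True, 1, 2])
b = np.array([3, 4, False])

# Mixing bool and int gives an integer array: True -> 1, False -> 0.
print(a.dtype.kind, b.dtype.kind)  # 'i' means signed integer

result = a + b                     # element-wise addition
print(result)
```

The conversion happens when each array is built, before the addition runs.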

Nested array computations in Python using numpy

I am trying to use numpy in Python in solving my project.
I have a random binary array rndm = [1, 0, 1, 1] and a resource_arr = [[2, 3], 4, 2, [1, 2]]. What I am trying to do is to multiply the arrays element-wise, then sum each nested entry. The expected output for the sample above is
output = 5 0 2 3. I find it hard to solve this problem because of the nested array/list.
So far my code looks like this:
def fitness_score():
    output = numpy.add(rndm * resource_arr)
    return output
fitness_score()
I keep getting
ValueError: invalid number of arguments.
I think this is because of the addition that I am trying to do. Any help would be appreciated. Thank you!
Numpy arrays are rectangular, like matrices: every row must have the same length. resource_arr is jagged, so it is not a valid matrix. In your case a python list is more suitable:
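import numpy

def sum_nested(l):
    tmp = []
    for element in l:
        if isinstance(element, list):
            # nested list: replace it by the sum of its elements
            tmp.append(numpy.sum(element))
        else:
            # plain number: keep it unchanged
            tmp.append(element)
    return tmp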
In this function we check for each element inside l if it is a list. If so, we sum its elements. On the other hand, if the encountered element is just a number, we leave it untouched. Please note that this only works for one level of nesting.
Now, if we run sum_nested([[2, 3], 4, 2, [1, 2]]) we will get [5, 4, 2, 3]. All that's left is multiplying this result by the elements of rndm, which can be achieved easily using numpy:
def fitness_score(a, b):
    return numpy.multiply(a, sum_nested(b))
Numpy is all about non-jagged arrays. You can do things with jagged arrays, but doing so efficiently and elegantly isn't trivial.
Almost always, finding a way to map your data structure to a non-nested one, for instance by encoding the information as below, will be more flexible and more performant.
resource_arr = (
    [0, 0, 1, 2, 3, 3],
    [2, 3, 4, 2, 1, 2],
)
That is, an integer denoting the 'row' each value belongs to, paired with an array of equal size of the values themselves.
This may 'feel' wasteful when coming from a C-style way of doing arrays (omg more memory consumption), but staying away from nested data structures is almost certainly your best bet in terms of performance, and in terms of how much of the numpy/scipy ecosystem will actually be compatible with your data representation. Whether it really uses more memory is rather questionable; every new python object uses a ton of bytes, so if you have only a few elements per nesting, it is the more memory-efficient solution too.
In this case, that would give you the following efficient solution to your problem:
output = np.bincount(*resource_arr) * rndm
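For readers who want to try the flat encoding, here is a minimal self-contained sketch. The names row_index and values are chosen for illustration; np.bincount with the weights argument sums the values that share the same row index, and it returns floats.

```python
import numpy as np

rndm = np.array([1, 0, 1, 1])

# Flat encoding of [[2, 3], 4, 2, [1, 2]]:
row_index = np.array([0, 0, 1, 2, 3, 3])  # which 'row' each value belongs to
values = np.array([2, 3, 4, 2, 1, 2])     # the values themselves

row_sums = np.bincount(row_index, weights=values)  # per-row sums
output = row_sums * rndm                           # apply the binary mask
print(output)
```

Passing the two arrays by name does the same thing as unpacking the tuple with the star operator.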
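I have not worked much with pandas/numpy so I'm not sure if this is the most efficient way, but it works (at least for the example you have shown). Recent numpy versions no longer build an array from a ragged list implicitly, hence the explicit dtype=object:
import numpy as np
rndm = [1, 0, 1, 1]
resource_arr = [[2, 3], 4, 2, [1, 2]]
# dtype=object keeps the ragged list as a 1-D array of Python objects
multiplied_output = np.multiply(rndm, np.array(resource_arr, dtype=object))
print(multiplied_output)
output = []
for elem in multiplied_output:
    # sum the nested lists, keep plain numbers as they are
    output.append(sum(elem) if isinstance(elem, list) else elem)
final_output = np.array(output)
print(final_output)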

Generate pairs from a numpy array in a specific pattern without loops

Let's consider an array [1,2,3] say, what I would like to generate is the list containing the pairs [[1,2], [1,3], [2,3]]. This can be done using itertools. However, I would like to produce them using pure numpy operations, and no loops or branching is allowed.
A close solution is provided here, but it generates all the possible pairs rather than only the pairs in the specific pattern I need.
Can you please suggest a way to do that? The array will always be 1D.
Also, this is my first question on SE. If it requires any edit, please let me know.
Here is a one-liner using np.triu_indices:
>>> a = np.array([1, 2, 3])
>>> a[np.transpose(np.triu_indices(len(a), 1))]
array([[1, 2],
[1, 3],
[2, 3]])
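To see why the one-liner works, it may help to split it into steps. The sketch below uses a slightly longer array and compares the result against itertools.combinations; the variable names are only illustrative.

```python
import itertools
import numpy as np

a = np.array([1, 2, 3, 4])

# Indices of the strict upper triangle (k=1 skips the diagonal),
# i.e. every (i, j) with i < j, in row-major order.
rows, cols = np.triu_indices(len(a), k=1)

# Pick the first and second member of each pair, then stack as columns.
pairs = np.stack([a[rows], a[cols]], axis=1)
print(pairs)

# Reference result built with itertools, for comparison only.
expected = list(itertools.combinations(a.tolist(), 2))
```

The order matches itertools.combinations because the upper-triangle indices are returned row by row.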

Dynamic matrix in Python

I'm new to Python and I need a dynamic matrix that I can manipulate adding more columns and rows to it. I read about numpy.matrix, but I can't find a method in there that does what I mentioned above. It occurred to me to use lists but I want to know if there is a simpler way to do it or a better implementation.
Example of what I look for:
matrix.addrow ()
matrix.addcolumn ()
matrix.changeValue (0, 0, "$200")
Am I asking for too much? If so, any ideas of how to implement something like that? Thanks!
You can do all of that in numpy (np.concatenate, for example) or native python (my_list.append()). Which one is more efficient depends on what else your program does: numpy will probably be less efficient if all you are doing is adding or changing values one at a time, or adding and removing columns a lot. However, if you do matrix or column operations, the overhead of adding new columns to a numpy array may be offset by the vectorized computation speed offered by numpy. So pick whichever you prefer, and if speed is an issue, experiment with both approaches yourself.
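One common way to combine the two, sketched here with made-up data, is to grow a plain list of rows while the size is still changing and convert to a numpy array once, when the vectorized operations are needed.

```python
import numpy as np

rows = []
for i in range(3):
    # Appending to a Python list is cheap; no array is copied here.
    rows.append([i, i * 10, i * 100])

matrix = np.array(rows)          # single conversion at the end
col_sums = matrix.sum(axis=0)    # vectorized column operation
print(matrix.shape, col_sums)
```

Whether this is faster than growing the array directly depends on the workload, so it is worth timing both.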
There are several ways to represent matrices in Python. You can use a list of lists or numpy arrays. For example, if you were to use numpy arrays:
>>> import numpy as np
>>> a = np.array([[1,2,3], [2,3,4]])
>>> a
array([[1, 2, 3],
[2, 3, 4]])
To add a row
>>> np.vstack([a, [7,8,9]])
array([[1, 2, 3],
[2, 3, 4],
[7, 8, 9]])
To add a column
>>> np.hstack((a, [[7],[8]]))
array([[1, 2, 3, 7],
[2, 3, 4, 8]])
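Note that np.vstack and np.hstack return new arrays and leave the original untouched, so the result has to be assigned back. The sketch below chains the three operations from the question; it uses the number 200 instead of the string "$200", since a numeric array cannot hold that string.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [2, 3, 4]])

a = np.vstack([a, [7, 8, 9]])        # add a row    -> shape (3, 3)
a = np.hstack([a, [[7], [8], [9]]])  # add a column -> shape (3, 4)
a[0, 0] = 200                        # change a single value in place
print(a)
```

Each stacking call copies the data, which is the overhead to keep in mind when rows or columns are added often.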

How to simplify a for loop in python

I have this piece of Python code that fills up a 2d matrix in a for loop
img = zeros((len(bins_x), len(bins_y)))
for i in arange(0, len(ix)):
    img[ix[i]][iy[i]] = dummy[i]
Is it possible to use a vectorial operation for the last two lines of code? Is there also something that might speed up the calculation?
If ix, iy are index sequences:
img[ix, iy] = dummy
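A small check with made-up data (the shapes and values below are assumptions, not from the question) shows that the indexed assignment fills the same cells as the loop:

```python
import numpy as np

ix = np.array([0, 2, 1])             # row indices
iy = np.array([1, 0, 3])             # column indices
dummy = np.array([10.0, 20.0, 30.0]) # values to store

# Original approach: one assignment per loop iteration.
img_loop = np.zeros((3, 4))
for i in range(len(ix)):
    img_loop[ix[i], iy[i]] = dummy[i]

# Vectorized approach: a single indexed assignment.
img_vec = np.zeros((3, 4))
img_vec[ix, iy] = dummy

print(np.array_equal(img_loop, img_vec))
```

If the same (ix, iy) pair occurs more than once, it is safer not to rely on which value ends up in that cell with the vectorized form.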
It might be useful to use numpy. In particular, the reshape method might be useful. Here is an example (adapted from the second link):
>>> import numpy as np
>>> a = np.array([1,2,3,4,5,6])
>>> np.reshape(a, (3,2))
array([[1, 2],
[3, 4],
[5, 6]])
