Retrieve column slices from a NumPy array using variables as the indexer - python

Say I have an array and I want a function that selects some of its columns based on a pre-defined argument a:
extracted_columns = array[:, a]
If I have e.g. a = np.arange(10), I get the first ten columns.
What if I want to define a so that all the columns are selected, without knowing the size of the array?
I'd like to set a = : so that the function does
extracted_columns = array[:, :]
but it seems : can't be passed as an argument. I also tried a = None, but this gives me a 3-dimensional array whose second dimension has size 1.
Is there a nice way of doing it?
Thanks,

Pass a slice object to your function.
MCVE:
import numpy as np
x = np.arange(9).reshape(3, 3)
print(x)
[[0 1 2]
 [3 4 5]
 [6 7 8]]
a = slice(None)
print(x[:, a])
[[0 1 2]
 [3 4 5]
 [6 7 8]]
For your case, you'd define a function along these lines:
def foo(array, a):
    return array[:, a]
And call it like this:
arr_slice = foo(array, slice(None))
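As a minimal sketch of why a slice object works here: the same function accepts any column indexer NumPy understands, so the caller decides between "all columns", a range, or specific columns. The name select_columns and the sample array are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def select_columns(array, a):
    # `a` can be anything NumPy accepts as a column indexer:
    # a slice object, an integer array, or a list of integers.
    return array[:, a]

x = np.arange(12).reshape(3, 4)

all_cols = select_columns(x, slice(None))      # same as x[:, :]
first_two = select_columns(x, slice(0, 2))     # same as x[:, 0:2]
picked = select_columns(x, np.array([0, 3]))   # same as x[:, [0, 3]]
```

slice(None) is what the bare : stands for inside square brackets, which is why it can be stored in a variable while : itself cannot.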

Related

tf.lookup.StaticHashTable with lists (of arbitrary sizes) as values

I want to associate to each person's name a list of numbers.
keys = ["Fritz", "Franz", "Fred"]
values = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
If I run the following:
import tensorflow as tf
table = tf.lookup.StaticHashTable(tf.lookup.KeyValueTensorInitializer(keys, values), default_value=0)
I get a ValueError: Can't convert non-rectangular Python sequence to Tensor, since the lists are not the same size and hence cannot be converted to a tf.Tensor.
Is there another way to associate the values of a tensor to lists of arbitrary shape?
Thank you for your help :)
StaticHashTable - as of TF 2.3 - cannot return multi-dimensional values, let alone ragged ones. So despite padding the values, creating a hash table like this:
keys = ["Fritz", "Franz", "Fred"]
values = [[1, 2, 3, -1], [4, 5, -1, -1], [6, 7, 8, 9]]
table_init = tf.lookup.KeyValueTensorInitializer(keys, values)
table = tf.lookup.StaticHashTable(table_init, -1)
will throw the following error:
InvalidArgumentError: Expected shape [3] for value, got [3,4] [Op:LookupTableImportV2]
To circumvent this, you can use a dense hash table, although it is still experimental. Neither dense nor static hash tables support ragged keys or values, so your best bet is to pad your values and create a dense hash table. During lookup, you can convert the padded results back to ragged form. The overall code looks like this:
keys = ["Fritz", "Franz", "Fred"]
values = [[1, 2, 3, -1], [4, 5, -1, -1], [6, 7, 8, 9]]
table = tf.lookup.experimental.DenseHashTable(key_dtype=tf.string, value_dtype=tf.int64, empty_key="<EMPTY_SENTINEL>", deleted_key="<DELETE_SENTINEL>", default_value=[-1, -1, -1, -1])
table.insert(keys, values)
And during lookup:
>>> tf.RaggedTensor.from_tensor(table.lookup(['Franz', 'Emil']), padding=-1)
<tf.RaggedTensor [[4, 5], []]>
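The padding step itself does not need TensorFlow. Below is a rough pure-Python sketch of padding the ragged lists to a rectangle and stripping the padding again. The helper names pad_ragged and strip_padding are hypothetical, and the sketch assumes the pad value (-1 here) never occurs as a real value.

```python
def pad_ragged(rows, pad_value=-1):
    # Pad every row on the right up to the length of the longest row.
    width = max(len(row) for row in rows)
    return [list(row) + [pad_value] * (width - len(row)) for row in rows]

def strip_padding(row, pad_value=-1):
    # Drop every occurrence of the pad value (assumes it is never real data).
    return [v for v in row if v != pad_value]

values = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
padded = pad_ragged(values)
restored = [strip_padding(row) for row in padded]
```

The padded list has the same rectangular shape as the values inserted into the dense hash table above.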

Check if any row in a numpy array is part of another array [duplicate]

This question already has an answer here:
Check if two 3D numpy arrays contain overlapping 2D arrays
(1 answer)
Closed 2 years ago.
I'm using numpy for the first time. I am trying to achieve the following:
There are 2 arrays:
a = np.array([[1, 3], [2, 5], [1, 2], [2, 1], [1,6]])
b = np.array([[3, 5], [1, 2]])
I need to check if ANY pair (or a row in other words) in array b is present in array a, in the same order (as in, [1, 2] is not to be considered same as [2, 1])
The above example should return True since both a and b contain [1, 2]
I've tried:
for [x, y] in b
if [x, y] in a
and:
if (a == b).all(1).any() # --> This throws "AttributeError: 'bool' object has no attribute 'all'"
but failed.
Thanks in advance
Let's do it the NumPy way (loops are not advised with NumPy). Add a dimension using None so that NumPy broadcasts the comparison correctly, then use all and any along the correct axes:
(a==b[:,None]).all(-1).any()
Output for sample input in question:
True
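To make the broadcasting easier to follow, here is the same expression split into steps, with the intermediate shapes noted in comments. The variable names elementwise, row_match and found are only illustrative.

```python
import numpy as np

a = np.array([[1, 3], [2, 5], [1, 2], [2, 1], [1, 6]])
b = np.array([[3, 5], [1, 2]])

# b[:, None] has shape (2, 1, 2); comparing it with a (5, 2) broadcasts to (2, 5, 2)
elementwise = a == b[:, None]

# Collapse the last axis: True where a whole row of a equals a whole row of b
row_match = elementwise.all(-1)      # shape (2, 5)

# True if at least one (row of b, row of a) pair matched
found = bool(row_match.any())
```

Here row_match[1, 2] is the only True entry: row 1 of b equals row 2 of a.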
This solution uses np.ravel_multi_index to avoid broadcasting. That helps when the arrays are big, since it doesn't build the large intermediate comparison array that broadcasting creates.
d = np.maximum(a.max(0), b.max(0))+1
np.in1d(np.ravel_multi_index(a.T,d), np.ravel_multi_index(b.T,d)).any()
Out[71]: True
This solution can also give the position of the matching row in a:
np.nonzero(np.in1d(np.ravel_multi_index(a.T,d), np.ravel_multi_index(b.T,d)))[0]
Out[72]: array([2], dtype=int64)
Note: I learned this trick a long time ago from @Divakar, so credit should go to him.
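A sketch of what the encoding does, assuming all entries are non-negative integers (np.ravel_multi_index requires that): each row is mapped to a single integer code, so comparing rows becomes comparing scalars. np.isin is used here in place of np.in1d because the NumPy documentation recommends it for new code; the names codes_a, codes_b and match_positions are illustrative.

```python
import numpy as np

a = np.array([[1, 3], [2, 5], [1, 2], [2, 1], [1, 6]])
b = np.array([[3, 5], [1, 2]])

# Grid size large enough to hold every value in either array
d = np.maximum(a.max(0), b.max(0)) + 1       # here: [4, 7]

# Each row (r, c) becomes the single integer r * d[1] + c
codes_a = np.ravel_multi_index(a.T, d)
codes_b = np.ravel_multi_index(b.T, d)

mask = np.isin(codes_a, codes_b)             # one boolean per row of a
match_positions = np.nonzero(mask)[0]        # indices of the rows of a found in b
```

Because the codes are unique per (row, column) pair within the grid d, two rows share a code only if they are equal.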
Try:
a = np.array([[1, 3], [2, 5], [1, 2], [2, 1], [1,6]])
b = np.array([[3, 5], [1, 2]])
check = any(map(lambda x: x in b, a))
Explanation:
lambda is a keyword that creates a function. In this case:
lambda x: x in b
it represents a function that takes an x and returns whether x is in array b.
map is a built-in function that takes a function as its first argument and an iterable as its second argument.
It applies the function to every item in the iterable and returns an iterable of the results.
In this case:
map(lambda x: x in b, a)
it returns an iterable of True and False values, one for each element the function is applied to.
Finally, any is another built-in function; it takes an iterable of True and False values and returns True if any item in the iterable is True.
EDIT:
You can also do it with a generator expression (as someone pointed out in the comments):
a = np.array([[1, 3], [2, 5], [1, 2], [2, 1], [1,6]])
b = np.array([[3, 5], [1, 2]])
check = any(x in b for x in a)
It is equivalent and arguably more readable.
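One caveat worth checking before relying on x in b: for NumPy arrays this test behaves like (b == x).any(), so a single shared element in the same column position is enough to make it True, even when no complete row matches. The sketch below uses a made-up array a_no_match to show the difference and one possible stricter test.

```python
import numpy as np

b = np.array([[3, 5], [1, 2]])
a_no_match = np.array([[1, 3], [2, 9]])   # no row of this array is a row of b

# `x in b` behaves like (b == x).any(), so one matching element is enough
loose = any(x in b for x in a_no_match)

# Requiring every element of a row to match gives the intended row-wise test
strict = any((b == x).all(axis=1).any() for x in a_no_match)
```

With these inputs loose is True, because the 1 in [1, 3] lines up with the 1 in [1, 2], while strict is False.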

Adding elements to numpy array

Using NumPy:
X = np.zeros(shape=[1, 4], dtype=int)
How can I add a list, such as [1,2,3,4]? I tried numpy.add(X,[1,2,3,4]) and np.hstack((1,2,3,4)) but none of them work!
I know how to use that in standard Python list using append method but I want to use numpy for performance.
NumPy arrays have a fixed size once they are created. So after calling zeros((1, 4), ...), you already have a 1x4 matrix full of zeros. To set its elements to other values, use assignment:
X[0] = [1, 2, 3, 4] # does what you are trying to achieve in your question
X[0, :] = [1, 2, 3, 4] # equivalent to the above
X[:] = [1, 2, 3, 4] # same
X[0, 1] = 2 # set the individual element at [0, 1] to 2
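For contrast, a short sketch of the two operations: assignment fills the existing array in place, while actually adding a row means building a new array. np.vstack is one way to do that; Y is an illustrative name.

```python
import numpy as np

X = np.zeros((1, 4), dtype=int)
X[0] = [1, 2, 3, 4]                  # overwrite the existing row in place

# Growing the array builds a new one; X itself keeps its shape
Y = np.vstack([X, [5, 6, 7, 8]])
```

Since every np.vstack call copies the data, repeatedly appending rows in a loop tends to be slower than preallocating the full array and assigning into it.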

In-place permutation of a numpy array

I have a fairly large one-dimensional numpy array. I would like to sort a slice of it in place and also retrieve the permutation vector for other processing.
However, the ndarray.sort() method (which operates in place) does not return this vector. I can use ndarray.argsort() to get the permutation vector and use it to permute the slice, but I can't figure out how to do that in place.
Vslice = V[istart:istop] # This is a view of the slice
iperm = Vslice.argsort()
V[istart:istop] = Vslice[iperm] # Not an inplace operation...
Subsidiary question: why doesn't the following code modify V, given that we are working on a view of V?
Vslice = Vslice[iperm]
Best wishes !
François
To answer your question of why assignment to view does not modify the original:
You need to change Vslice = Vslice[iperm] to Vslice[:] = Vslice[iperm] otherwise you are assigning a new value to Vslice rather than changing the values inside Vslice:
>>> a = np.arange(10, 0, -1)
>>> a
array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])
>>> b = a[2:-2]
>>> b
array([8, 7, 6, 5, 4, 3])
>>> i = b.argsort()
>>> b[:] = b[i] # change the values inside the view
>>> a # note `a` has been sorted in [2:-2] slice
array([10, 9, 3, 4, 5, 6, 7, 8, 2, 1])
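Applied to the names used in the question, the same idea looks roughly like this. The sample data and the values of istart and istop are made up for illustration. Note that Vslice[iperm] still creates a temporary copy of the slice before it is written back, so only the write is in place.

```python
import numpy as np

V = np.array([9, 4, 7, 1, 8, 3, 5])
istart, istop = 1, 5

Vslice = V[istart:istop]          # a view into V, no data copied
iperm = Vslice.argsort()          # permutation that sorts the slice
Vslice[:] = Vslice[iperm]         # writes through the view into V
```

After this, V holds the sorted slice and iperm is still available for permuting other arrays the same way.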

How can I use the unique(a, 'rows') from MATLAB in Python?

I'm translating some stuff from MATLAB to the Python language.
There's this command, unique(a), in NumPy. But since the MATLAB program also uses the 'rows' option, it gives something a little different.
Is there a similar command in Python or should I make some algorithm that does the same thing?
Assuming your 2D array is stored in the usual C order (that is, each row is counted as an array or list within the main array; in other words, row-major order), or that you transpose the array beforehand otherwise, you could do something like...
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [2, 3, 4], [1, 2, 3], [3, 4, 5]])
>>> a
array([[1, 2, 3],
       [2, 3, 4],
       [1, 2, 3],
       [3, 4, 5]])
>>> np.array([np.array(x) for x in set(tuple(x) for x in a)]) # or "list(x) for x in set[...]"
array([[3, 4, 5],
       [2, 3, 4],
       [1, 2, 3]])
Of course, this doesn't really work if you need the unique rows in their original order.
By the way, to emulate something like unique(a, 'columns'), you'd just transpose the original array, do the step shown above, and then transpose back.
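Newer NumPy versions (1.13 and later) also let np.unique work on whole rows or columns through its axis argument, which avoids the round trip through tuples and sets. A minimal sketch with the same sample array:

```python
import numpy as np

a = np.array([[1, 2, 3], [2, 3, 4], [1, 2, 3], [3, 4, 5]])

# Unique rows, returned in sorted (lexicographic) order
unique_rows = np.unique(a, axis=0)

# Unique columns work the same way along axis 1
unique_cols = np.unique(a, axis=1)
```

As with the set-based approach, the result is not in the original row order; here it is sorted.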
You can try:
import numpy

wrk_arr = your_arr  # your_arr is the 2D input array
idx = numpy.arange(len(wrk_arr))  # original indices of the rows still kept
ii = 0
while ii < len(wrk_arr):
    i_list = numpy.arange(len(wrk_arr))
    # 1 where a row equals row ii, 0 elsewhere
    i_dup = numpy.all(wrk_arr == wrk_arr[ii, :], axis=1).astype(int)
    i_dup[ii] = 0  # keep the first occurrence itself
    # Duplicates collapse to index 0, which unique() then merges away
    i_list = numpy.unique(i_list * (1 - i_dup))
    idx = numpy.unique(idx * (1 - i_dup))
    wrk_arr = wrk_arr[i_list, :]
    ii += 1
The result is wrk_arr, which holds the unique rows of your_arr. The relation is:
your_arr[idx,:] = wrk_arr
It works like MATLAB in the sense that the returned array (wrk_arr) keeps the order of the original array (your_arr). The idx array differs from MATLAB since it contains the indices of first appearance whereas MATLAB returns the LAST appearance.
From my experience it worked as fast as MATLAB on a 10000 X 4 matrix.
And a transpose will do the trick for the column case.
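A shorter way to get the same kind of result, sketched here as an alternative rather than as the method above: np.unique with return_index=True reports the index of the first occurrence of each unique row, and sorting those indices restores the original order. The sample your_arr is made up.

```python
import numpy as np

your_arr = np.array([[3, 4, 5], [1, 2, 3], [3, 4, 5], [2, 3, 4]])

# Index of the first occurrence of each unique row (in sorted-row order)
_, first_idx = np.unique(your_arr, axis=0, return_index=True)

idx = np.sort(first_idx)      # back to order of first appearance
wrk_arr = your_arr[idx]       # unique rows, original order preserved
```

Here idx and wrk_arr play the same roles as in the loop above, so your_arr[idx, :] equals wrk_arr.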
