I have two arrays. "A" has arbitrary length (assume 1000 entries to start with), and each entry is an n-dimensional vector of scalars. "B" has n entries, each of which is a 3-dimensional vector. How can I do a scalar multiplication so that the result is one array "C", where each entry holds the n scalars of the corresponding entry of A, each multiplied by the respective one of the n 3-dimensional vectors in B?
As an example in 4-D:
a=[[1,2,3,4],[5,6,7,8],....]
b=[[1,0,0],[0,1,0],[0,0,1],[1,1,1]]
and a result
c=[[1*[1,0,0],2*[0,1,0],3*[0,0,1],4*[1,1,1]] , [5*[1,0,0],...],...]
The implementation should be in NumPy without large for loops, because we expect far more than 1000 entries. n is expected to be 7 in our case.
If you start with:
a = np.array([[1,2,3,4],[5,6,7,8]])
b = np.array([[1,0,0],[0,1,0],[0,0,1],[1,1,1]])
Then we can add an extra axis to a and repeat the array along it, which gives us:
>>> a[:,:,None].repeat(3, axis=2)
array([[[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]],
[[5, 5, 5],
[6, 6, 6],
[7, 7, 7],
[8, 8, 8]]])
Now, as @Jaime says, there is no need to call repeat before multiplying, because NumPy's broadcasting takes care of it:
>>> a[:,:,None] * b
array([[[1, 0, 0],
[0, 2, 0],
[0, 0, 3],
[4, 4, 4]],
[[5, 0, 0],
[0, 6, 0],
[0, 0, 7],
[8, 8, 8]]])
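To check that the broadcasting approach scales to the sizes mentioned in the question, here is a minimal sketch with 1000 entries and n = 7. The random inputs, the seed, and the helper name scale_vectors are made up for illustration, and np.einsum is shown only as an equivalent spelling of the same product, not as something the answer above relies on.

```python
import numpy as np

def scale_vectors(a, b):
    """Multiply each scalar a[i, j] by the 3-vector b[j] using broadcasting."""
    # a has shape (m, n); the new trailing axis makes it (m, n, 1),
    # which broadcasts against b of shape (n, 3) to give (m, n, 3).
    return a[:, :, None] * b

rng = np.random.default_rng(0)   # arbitrary seed, for illustration only
a = rng.random((1000, 7))        # 1000 entries, n = 7 scalars each
b = rng.random((7, 3))           # n = 7 three-dimensional vectors

c = scale_vectors(a, b)

# Equivalent formulation: no index is summed over, so this is the same
# elementwise product written with explicit index labels.
c_einsum = np.einsum('ij,jk->ijk', a, b)

print(c.shape)  # (1000, 7, 3)
```

No Python-level loop runs over the 1000 entries, so the cost is dominated by allocating the (1000, 7, 3) result.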
I have the following NumPy matrix:
m = np.array([[1, 2, 3, 4],
[10, 5, 3, 4],
[12, 8, 1, 2],
[7, 0, 2, 4]])
Now I need the indices of the N (say, N=2) lowest values in each row of this matrix. With the example above, I expect the following output:
[[0, 1],
[2, 3],
[2, 3],
[1, 2]]
where each row of the output corresponds to the same row of the original matrix, and its elements are the indices of the N lowest values in that row (preferably in ascending order of the values in the original matrix). How can I do this in NumPy?
You could either use a simple loop (not recommended) or use np.argpartition:
In [13]: np.argpartition(m, 2)[:, :2]
Out[13]:
array([[0, 1],
[2, 3],
[2, 3],
[1, 2]])
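One caveat: np.argpartition only guarantees that the N smallest values land in the first N positions, not that those positions are ordered among themselves. If the ascending order asked for in the question matters, a possible sketch is to sort just the N selected columns afterwards. The helper name n_smallest_indices is made up for illustration.

```python
import numpy as np

def n_smallest_indices(m, n):
    """Indices of the n smallest values per row, ordered by ascending value."""
    # Step 1: partition so the n smallest values of each row come first
    # (their relative order is not guaranteed at this point).
    part = np.argpartition(m, n - 1, axis=1)[:, :n]
    # Step 2: look up the selected values and sort only those n columns.
    vals = np.take_along_axis(m, part, axis=1)
    order = np.argsort(vals, axis=1)
    return np.take_along_axis(part, order, axis=1)

m = np.array([[1, 2, 3, 4],
              [10, 5, 3, 4],
              [12, 8, 1, 2],
              [7, 0, 2, 4]])

result = n_smallest_indices(m, 2)
print(result)
```

For wide rows and small N this should avoid fully sorting every row, which is the main reason to prefer it over a plain argsort.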
You could use np.argsort on your array and then slice off the first N columns to get the indices of the N lowest values (or the last N columns for the highest).
np.argsort(m, axis=1)[:, :2]
array([[0, 1],
[2, 3],
[2, 3],
[1, 2]], dtype=int64)
Try this:
import numpy as np
m = np.array([[1, 2, 3, 4],
[10, 5, 3, 4],
[12, 8, 1, 2],
[7, 0, 2, 4]])
for arr in m:
    print(arr.argsort()[:2])  # indices of the 2 lowest values in this row
I looked into other posts about indexing a NumPy array with another NumPy array, but still could not work out how to accomplish the following:
a = [[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]]
b = [[[1,0],[0,1]],[[1,1],[0,1]]]
a[b] = [[[7,8,9],[4,5,6]],[[10,11,12],[4,5,6]]]
a is an image represented by a 3D NumPy array with dimensions 2 x 2 x 3, with the RGB values along the last dimension. b contains the indices that map into the image. For instance, pixel index (0,0) should map to index (1,0) of the original image, which gives the pixel values [7,8,9]. Is there a way to achieve this? Thanks!
Here's one way:
In [54]: a = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
In [55]: b = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 1]]])
In [56]: a[b[:, :, 0], b[:, :, 1]]
Out[56]:
array([[[ 7, 8, 9],
[ 4, 5, 6]],
[[10, 11, 12],
[ 4, 5, 6]]])
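This works because NumPy's integer-array indexing pairs up the two index arrays element by element: the first array selects along axis 0, the second along axis 1, and the untouched RGB axis is carried along. As a hedged variation, the index array can also be split with np.moveaxis, which avoids writing out one slice per index column:

```python
import numpy as np

a = np.array([[[1, 2, 3], [4, 5, 6]],
              [[7, 8, 9], [10, 11, 12]]])
b = np.array([[[1, 0], [0, 1]],
              [[1, 1], [0, 1]]])

# Move the last axis of b (the row/column pair) to the front, so that
# unpacking yields one (2, 2) index array per image axis.
rows, cols = np.moveaxis(b, -1, 0)

# result[i, j] == a[rows[i, j], cols[i, j]], with the RGB axis kept intact.
result = a[rows, cols]
print(result)
```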
I need to remove the last arrays from a 3D numpy cube. I have:
a = np.array(
[[[1,2,3],
[4,5,6],
[7,8,9]],
[[9,8,7],
[6,5,4],
[3,2,1]],
[[0,0,0],
[0,0,0],
[0,0,0]],
[[0,0,0],
[0,0,0],
[0,0,0]]])
How do I remove the all-zero sub-arrays, like those at the bottom of the cube, using np.delete?
(I cannot simply remove all zero values, because there will be zeros in the data on the top side.)
For a 3D cube, you can check for all-zero slices across the last two axes:
a = np.asarray(a)
a[~(a==0).all((2,1))]
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Here's one way to remove only the trailing all-zero slices, since the question says that all-zero slices on the top side of the data should be kept -
a[:-(a==0).all((1,2))[::-1].argmin()]
Sample run -
In [80]: a
Out[80]:
array([[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]],
[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]]])
In [81]: a[:-(a==0).all((1,2))[::-1].argmin()]
Out[81]:
array([[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
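One edge case to be aware of with this one-liner: if the cube has no trailing all-zero slices, argmin() returns 0 and a[:-0] is the same as a[:0], which is empty. A minimal sketch of a variant that handles that case, using a made-up helper name trim_trailing_zero_slices:

```python
import numpy as np

def trim_trailing_zero_slices(a):
    """Drop all-zero 2D slices from the end of a 3D array, keep the rest."""
    # True for every slice along axis 0 that holds at least one nonzero value.
    nonzero = a.any(axis=(1, 2))
    if not nonzero.any():
        # Every slice is zero, so nothing is left to keep.
        return a[:0]
    # Position of the last slice with data; everything after it is trimmed.
    last = np.flatnonzero(nonzero)[-1]
    return a[:last + 1]

a = np.array([[[0, 0, 0], [0, 0, 0], [0, 0, 0]],
              [[9, 8, 7], [6, 5, 4], [3, 2, 1]],
              [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
              [[0, 0, 0], [0, 0, 0], [0, 0, 0]]])

trimmed = trim_trailing_zero_slices(a)
print(trimmed.shape)  # (2, 3, 3)
```

Calling the function on an array that already ends in a nonzero slice returns that array unchanged rather than an empty one.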
If you know where they are already, the easiest thing to do is slice them off:
a[:-2]
Results in:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Hope this helps,
a_new = []  # create an empty list
for item in a:
    if np.count_nonzero(item) != 0:  # keep only slices that contain a nonzero value
        a_new.append(item)  # append the slice to the list
a_new = np.array(a_new)  # build a NumPy array without the all-zero slices
Output:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Use any to build a boolean mask and select with it :)
a=np.array([[[1,2,3],
[4,5,6],
[7,8,9]],
[[9,8,7],
[6,5,4],
[3,2,1]],
[[0,0,0],
[0,0,0],
[0,0,0]],
[[0,0,0],
[0,0,0],
[0,0,0]]])
a[a.any(axis=2).any(axis=1)]
Given two arrays, an input array and a repeat array, I would like to get an array in which each input value is repeated along a new dimension the number of times specified for its row, with the rest of the row padded with zeros.
to_repeat = np.array([1, 2, 3, 4, 5, 6])
repeats = np.array([1, 2, 2, 3, 3, 1])
# I want final array to look like the following:
#[[1, 0, 0],
# [2, 2, 0],
# [3, 3, 0],
# [4, 4, 4],
# [5, 5, 5],
# [6, 0, 0]]
The issue is that I'm operating with large datasets (10M or so) so a list comprehension is too slow - what is a fast way to achieve this?
Here's one with masking based on this idea -
m = repeats[:,None] > np.arange(repeats.max())
out = np.zeros(m.shape,dtype=to_repeat.dtype)
out[m] = np.repeat(to_repeat,repeats)
Sample output -
In [44]: out
Out[44]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
Or with broadcasted-multiplication -
In [67]: m*to_repeat[:,None]
Out[67]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
For large datasets, we can leverage multiple cores and be more memory-efficient by using the numexpr module for that broadcasting -
In [64]: import numexpr as ne
# Re-using mask `m` from previous method
In [65]: ne.evaluate('m*R',{'m':m,'R':to_repeat[:,None]})
Out[65]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
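The multiplication variants rely on the padding value being zero. If a different padding value were needed, one possible sketch combines the same mask with np.where. The helper name repeat_and_pad and its fill parameter are made up for illustration and are not part of the answer above.

```python
import numpy as np

def repeat_and_pad(to_repeat, repeats, fill=0):
    """Repeat each value repeats[i] times along a new axis, pad with fill."""
    # mask[i, j] is True while column j is still within row i's repeat count.
    mask = repeats[:, None] > np.arange(repeats.max())
    # Take the row's value where the mask is set, the fill value elsewhere.
    return np.where(mask, to_repeat[:, None], fill)

to_repeat = np.array([1, 2, 3, 4, 5, 6])
repeats = np.array([1, 2, 2, 3, 3, 1])

out = repeat_and_pad(to_repeat, repeats)
out_neg = repeat_and_pad(to_repeat, repeats, fill=-1)
print(out)
```

With the default fill this reproduces the zero-padded output shown above; out_neg uses -1 in the padded positions instead.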
I am facing a problem when I try to change the shape of a tf.SparseTensor inside a tf.while_loop. Let's say I have this sparse tensor:
indices = np.array([[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5],
[1, 0], [1, 1], [1, 3], [1, 4], [1, 5],
[2, 1], [2, 2], [2, 3], [2, 4],
[3, 0], [3, 1], [3, 2], [3, 3], [3, 4], [3, 5],
[4, 0], [4, 2], [4, 3], [4, 4], [4, 5]], dtype=np.int64)
values = np.array([7, 6, 7, 4, 5, 4,
6, 7, 4, 3, 4,
3, 3, 1, 1,
1, 2, 2, 3, 3, 4,
1, 1, 2, 3, 3], dtype=np.float64)
dense_shape = np.array([5, 6], dtype=np.int64)
tRatings = tf.SparseTensor(indices, values, dense_shape)
So, I want to take a slice of the first 3 rows. I know I could use tf.sparse_slice for that, but this is just an example. In my real code, I gather multiple rows from the sparse tensor that are not contiguous. The code I wrote is this:
subTensor = tf.sparse_slice(tRatings, [0, 0], [1, 6])
i = tf.constant(1)
def condition(i, sub):
    return tf.less(i, 3)
def body(i, sub):
    tempUser = tf.sparse_slice(tRatings, [i, 0], [1, 6])
    sub = tf.sparse_concat(axis = 0, sp_inputs = [sub, tempUser])
    return [tf.add(i, 1), sub]
subTensor = tf.while_loop(condition, body, [i, subTensor], shape_invariants=[i.get_shape(), tf.TensorShape([2])])[1]
which doesn't work for some reason when I run it. I get this:
ValueError: Dimensions 1 and 2 are not compatible
The documentation at https://www.tensorflow.org/api_docs/python/tf/while_loop says:
The shape_invariants argument allows the caller to specify a less specific shape invariant for each loop variable, which is needed if the shape varies between iterations. The tf.Tensor.set_shape function may also be used in the body function to indicate that the output loop variable has a particular shape. The shape invariant for SparseTensor and IndexedSlices are treated specially as follows:
a) If a loop variable is a SparseTensor, the shape invariant must be TensorShape([r]) where r is the rank of the dense tensor represented by the sparse tensor. It means the shapes of the three tensors of the SparseTensor are ([None], [None, r], [r]). NOTE: The shape invariant here is the shape of the SparseTensor.dense_shape property. It must be the shape of a vector.
What am I missing here?
There are two problems.
The first is a problem in the TensorFlow code itself. Change this line to:
var.indices.set_shape(tensor_shape.TensorShape([None, shape[0]]))
The other is a small problem in your code: you have to use the int64 type for the index variable:
i = tf.constant(1, dtype=tf.int64)