Using np.pad() on structured array

Another post I had does exactly what I wanted, but I cannot seem to implement it on a structured array.
Say I have an array like so:
>>> arr = np.empty(2, dtype=np.dtype([('xy', np.float32, (2, 2))]))
>>> arr['xy']
array([[[1., 1.],
        [2., 2.]],
       [[3., 3.],
        [4., 4.]]], dtype=float32)
I need to pad it so that the last row in each subarray is repeated a specific number of times:
arr['xy'] = np.pad(arr['xy'], [(0, 0), (0, 2), (0, 0)], mode='edge')
However I'm getting a ValueError:
ValueError: could not broadcast input array from shape (2, 4, 2) into shape (2, 2, 2)
So without a structured array, I tried the following:
>>> arr = np.array([[[1, 1], [2, 2]], [[3, 3], [4, 4]]])
>>> arr
array([[[1, 1],
        [2, 2]],
       [[3, 3],
        [4, 4]]])
>>> arr = np.pad(arr, [(0, 0), (0, 2), (0, 0)], mode='edge')
>>> arr
array([[[1, 1],
        [2, 2],
        [2, 2],
        [2, 2]],
       [[3, 3],
        [4, 4],
        [4, 4],
        [4, 4]]])
How come I cannot repeat with a structured array?

Your padding works; it's the assignment to arr["xy"] that fails. The shape of a field is fixed by the structured dtype, so you can't change it by assigning a differently shaped array.
>>> arr = np.empty(2, dtype=np.dtype([('xy', np.float32, (2, 2))]))
>>> ar2 = np.pad(arr['xy'], [(0, 0), (0, 2), (0, 0)], mode='edge')
>>> ar2.shape
(2, 4, 2)
>>> arr["xy"] = ar2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (2,4,2) into shape (2,2,2)
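One possible workaround, sketched here under the assumption that a new array is acceptable, is to build a new structured dtype whose 'xy' field already has the padded shape and copy the padded data into it. The names padded, new_dtype and new_arr are illustrative and not from the original post.

```python
import numpy as np

# Original structured array: each record holds a (2, 2) float32 block.
arr = np.zeros(2, dtype=np.dtype([('xy', np.float32, (2, 2))]))
arr['xy'] = [[[1, 1], [2, 2]], [[3, 3], [4, 4]]]

# Pad a copy of the field; arr itself is left untouched.
padded = np.pad(arr['xy'], [(0, 0), (0, 2), (0, 0)], mode='edge')

# The field shape is part of the dtype, so create a new dtype with the
# padded sub-array shape and a new array to hold the result.
new_dtype = np.dtype([('xy', np.float32, padded.shape[1:])])
new_arr = np.empty(arr.shape, dtype=new_dtype)
new_arr['xy'] = padded

print(new_arr['xy'].shape)  # (2, 4, 2)
```

The original arr keeps its (2, 2) field; only new_arr carries the padded rows.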

This may be a late reply, but you can use the following instead.
Assume that you have a structured ndarray input "sample",
a dict pad_value_dict of the form {name: pad_value} with the value to pad each field with, and pad_len, the number of records to append.
keys = sample.dtype.names  # field names, in dtype order
pad_arr = np.array(
    [tuple(pad_value_dict[k] for k in keys)],
    dtype=sample.dtype
)
pad_arr = np.tile(pad_arr, pad_len)
result = np.append(sample, pad_arr)
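As a rough, self-contained illustration of the snippet above, the following uses a made-up two-field dtype. The field names 'x' and 'y', the pad values, and pad_len = 3 are all assumptions chosen for the example.

```python
import numpy as np

# Hypothetical structured input with two scalar fields.
sample = np.array([(1.0, 10), (2.0, 20)], dtype=[('x', 'f4'), ('y', 'i4')])
pad_value_dict = {'x': 0.0, 'y': -1}  # assumed pad value per field
pad_len = 3                           # assumed number of records to append

# One padding record, built in the field order of the dtype.
keys = sample.dtype.names
pad_arr = np.array([tuple(pad_value_dict[k] for k in keys)], dtype=sample.dtype)

# Repeat the padding record and append it to the input.
result = np.append(sample, np.tile(pad_arr, pad_len))
print(result.shape)  # (5,)
```

Note that this appends whole records; it does not change the shape of a sub-array field inside each record.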

Related

How to reshape matrices using index instead of shape inputs?

Given an array of shape (8, 3, 4, 4), reshape them into an arbitrary new shape (8, 4, 4, 3) by inputting the new indices compared to the old positions (0, 2, 3, 1).
Bonus: perform numpy.dot of one of said array's non-last index and a 1-D second, i.e. numpy.dot(<array with shape (8, 3, 4, 4)>, [1, 2, 3]) # will return shape mismatch as it is

Numpy's transpose "reverses or permutes":
ni = (0, 2, 3, 1)
arr = arr.transpose(ni)
Old solution:
ni = (0, 2, 3, 1)
s = arr.shape
arr = arr.reshape(s[ni[0]], s[ni[1]]...)
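A small sketch of the transpose approach on the shapes from the question, which also covers the bonus part. The test data from np.arange and the names moved, v and out are assumptions for illustration.

```python
import numpy as np

arr = np.arange(8 * 3 * 4 * 4).reshape(8, 3, 4, 4)
ni = (0, 2, 3, 1)

# Permute the axes: new axis k is old axis ni[k].
moved = arr.transpose(ni)
print(moved.shape)  # (8, 4, 4, 3)

# Bonus: the length-3 axis is now last, so np.dot with a 1-D vector works.
v = np.array([1, 2, 3])
out = np.dot(moved, v)
print(out.shape)  # (8, 4, 4)

# Equivalent without transposing: contract axis 1 of arr with v directly.
same = np.tensordot(arr, v, axes=([1], [0]))
```

np.tensordot avoids the explicit transpose when the only goal is the contraction.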

Maybe this is what you are looking for:
arr = np.array([[[1, 2], [3, 4], [5, 6]]])
s = arr.shape
new_indexes = (1, 0, 2) # permutation
new_arr = arr.reshape(*[s[index] for index in new_indexes])
print(arr.shape) # (1, 3, 2)
print(new_arr.shape) # (3, 1, 2)
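One caveat worth keeping in mind: reshape only regroups the elements in their existing flattened order, so it generally does not move data the way transpose does, even when the resulting shapes agree. A minimal sketch with an assumed 2x3 example:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

reshaped = arr.reshape(3, 2)      # same flat order, regrouped
transposed = arr.transpose(1, 0)  # axes actually swapped

print(reshaped.tolist())    # [[0, 1], [2, 3], [4, 5]]
print(transposed.tolist())  # [[0, 3], [1, 4], [2, 5]]
```

The two coincide only in special cases, for example when the axes being moved have length 1, as in the (1, 3, 2) example above.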

Creating numpy array from calculations across arrays

I currently have the task of creating a 4x4 array with operations performed on the cells
Below you will see a function, the_matrix, that takes in array and returns adj_array.
It has a for loop that is supposed to loop through array, look at each cell in ref_array and, upon finding the matching first two numbers in array (like 6, 3), put the corresponding function (lambda N: 30) into its respective cell in adj_array, and do the same for all cells in the 4x4 matrix.
Essentially the function should return an array like this
array([[inf, <function <lambda> at 0x00000291139AF790>,
<function <lambda> at 0x00000291139AF820>, inf],
[inf, inf, inf, <function <lambda> at 0x00000291139AF8B0>],
[inf, inf, inf, <function <lambda> at 0x00000291139AF940>],
[inf, inf, inf, inf]], dtype=object)
My work so far below
def the_matrix(array):
    ref_array = np.zeros((4,4), dtype = object)
    ref_array[0,0] = (5,0)
    ref_array[0,1] = (5,1)
    ref_array[0,2] = (5,2)
    ref_array[0,3] = (5,3)
    ref_array[1,0] = (6,0)
    ref_array[1,1] = (6,1)
    ref_array[1,2] = (6,2)
    ref_array[1,3] = (6,3)
    ref_array[2,0] = (7,0)
    ref_array[2,1] = (7,1)
    ref_array[2,2] = (7,2)
    ref_array[2,3] = (7,3)
    ref_array[3,0] = (8,0)
    ref_array[3,1] = (8,1)
    ref_array[3,2] = (8,2)
    ref_array[3,3] = (8,3)
    for i in ref_array:
        for a in i: #Expecting to get (5,1) here, but it's showing me array
            if a == array[0, 0:2]: #This specific slice was a test
                pass # put the function in that cell for adj_array
    return adj_array
array = np.array([[5, 1, lambda N: 120],
                  [5, 2, lambda N: 30],
                  [6, 3, lambda N: 30],
                  [7, 3, lambda N: N/30]])
Have tried variations of this for loop, and it's throwing errors. For one, the a in the for loop is displaying the input argument array, which is weird because it hasn't been called in the loop at that stage. My intention here is to refer to the exact cell in ref_array.
Not sure where I'm going wrong here and how I'm improperly looping through. Any help appreciated

Your ref_array is object dtype, (4,4) containing tuples:
In [26]: ref_array
Out[26]:
array([[(5, 0), (5, 1), (5, 2), (5, 3)],
[(6, 0), (6, 1), (6, 2), (6, 3)],
[(7, 0), (7, 1), (7, 2), (7, 3)],
[(8, 0), (8, 1), (8, 2), (8, 3)]], dtype=object)
Here is your iteration, showing just the iteration variables (I'm using repr):
In [28]: for i in ref_array:
    ...:     print(repr(i))
    ...:     for a in i:
    ...:         print(repr(a))
    ...:
array([(5, 0), (5, 1), (5, 2), (5, 3)], dtype=object)
(5, 0)
(5, 1)
(5, 2)
(5, 3)
...
So i is a "row" of the array, itself a 1d object dtype array.
a is one of those objects, a tuple.
Your description of the alternatives is vague. But assume one tries to start with a numeric dtype array:
In [30]: arr = np.array(ref_array.tolist())
In [31]: arr
Out[31]:
array([[[5, 0],
[5, 1],
[5, 2],
[5, 3]],
...
[8, 2],
[8, 3]]])
In [32]: arr.shape
Out[32]: (4, 4, 2)
now the looping:
In [33]: for i in arr:
    ...:     print(repr(i))
    ...:     for a in i:
    ...:         print(repr(a))
    ...:
array([[5, 0], # i is a (4,2) array
[5, 1],
[5, 2],
[5, 3]])
array([5, 0]) # a is (2,) array....
array([5, 1])
array([5, 2])
array([5, 3])
If "the a in the for loop is displaying the input argument array", it's most likely because a IS an array.
Keep in mind that object dtype arrays are processed at list speeds. You might as well think of them as bastardized lists. While they have some array enhancements (multidimensional indexing etc.), the elements are still references, and are processed as in lists.
I haven't paid attention as to why you are putting lambdas in the array. It looks ugly, and I don't see what it gains you. They can't be "evaluated" at array speeds. You'd have to do some sort of iteration or list comprehension.
edit
A more direct way of generating the arr, derived from ref_array:
In [39]: I,J = np.meshgrid(np.arange(5,9), np.arange(0,4), indexing='ij')
In [40]: I
Out[40]:
array([[5, 5, 5, 5],
[6, 6, 6, 6],
[7, 7, 7, 7],
[8, 8, 8, 8]])
In [41]: J
Out[41]:
array([[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3]])
In [42]: arr = np.stack((I,J), axis=2) # shape (4,4,2)
If the function was something like
In [46]: def foo(I,J):
    ...:     return I*10 + J
    ...:
You could easily generate a value for each pair of the values in ref_array.
In [47]: foo(I,J)
Out[47]:
array([[50, 51, 52, 53],
[60, 61, 62, 63],
[70, 71, 72, 73],
[80, 81, 82, 83]])
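For the matrix the question asks for, one possible sketch skips ref_array entirely and computes each target cell from the first two numbers of each row. It assumes the first number maps to a row via an offset of 5 and the second number is the column, as the expected output in the question suggests. The names build_adj_array, edges and row_offset are hypothetical.

```python
import numpy as np

def build_adj_array(edges, row_offset=5, size=4):
    """Return a size x size object array of inf, with functions placed per edge."""
    adj = np.full((size, size), np.inf, dtype=object)
    for first, second, func in edges:
        # Assumed mapping: row = first - row_offset, column = second.
        adj[first - row_offset, second] = func
    return adj

# A plain list of tuples stands in for the object array in the question.
edges = [
    (5, 1, lambda N: 120),
    (5, 2, lambda N: 30),
    (6, 3, lambda N: 30),
    (7, 3, lambda N: N / 30),
]

adj_array = build_adj_array(edges)
print(adj_array[2, 3](60))  # 2.0
```

Evaluating the stored functions still needs a Python-level loop or comprehension, in line with the remark above about object arrays running at list speeds.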

How to make a 2d array of tuples in python?

I want to make a 2D array of 2-tuples of fixed dimension (say 10x10).
e.g
[[(1,2), (1,2), (1,2)],
[(1,2), (1,2), (1,2)],
[(1,2), (1,2), (1,2)]]
There are two ways that I'd like to generate this array:
1. An array like the example above where every element is the same tuple
2. An array which I populate iteratively with specific tuples (possibly starting with an empty array of fixed size and then using assignment)
How would I go about doing this? For #1 I tried using numpy.tile:
>>> np.tile(np.array([1,2]), (3, 3))
array([[1, 2, 1, 2, 1, 2],
[1, 2, 1, 2, 1, 2],
[1, 2, 1, 2, 1, 2]])
But I can't seem to copy it across columns; the columns are just concatenated.
What I want instead is:
[[[1,2], [1,2], [1,2]],
[[1,2], [1,2], [1,2]],
[[1,2], [1,2], [1,2]]]

you can use numpy.full:
numpy.full((3, 3, 2), (1, 2))
output:
array([[[1, 2],
[1, 2],
[1, 2]],
[[1, 2],
[1, 2],
[1, 2]],
[[1, 2],
[1, 2],
[1, 2]]])
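If the elements need to be actual Python tuples rather than a trailing axis of length 2, an object-dtype array is one option. This is only a sketch (the name grid is an assumption), and such an array gives up most of NumPy's numeric speed.

```python
import numpy as np

# A 3x3 array whose cells hold arbitrary Python objects.
grid = np.empty((3, 3), dtype=object)

# Case 1: fill every cell with the same tuple.
for idx in np.ndindex(grid.shape):
    grid[idx] = (1, 2)

# Case 2: assign specific tuples cell by cell.
grid[0, 1] = (7, 8)

print(grid[0, 0], grid[0, 1])  # (1, 2) (7, 8)
```

Assigning to one fully indexed cell stores the tuple as a single object, which avoids NumPy trying to broadcast it across the array.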

For #1 you can generate it like this:
[[(1,2)] * 3]*3
# get [[(1, 2), (1, 2), (1, 2)], [(1, 2), (1, 2), (1, 2)], [(1, 2), (1, 2), (1, 2)]]
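A caveat with the nested multiplication: the outer `* 3` repeats references to the same inner list, so assigning to a cell later (case #2) changes every row. A list comprehension creates independent rows. The names shared and independent are illustrative.

```python
# Outer multiplication: three references to one inner list.
shared = [[(1, 2)] * 3] * 3
shared[0][0] = (9, 9)
print(shared[1][0])  # (9, 9), because every row is the same list object

# List comprehension: each row is a separate list.
independent = [[(1, 2) for _ in range(3)] for _ in range(3)]
independent[0][0] = (9, 9)
print(independent[1][0])  # (1, 2)
```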

numpy.zeros((3,3,2))
would work, I guess (but the elements are not tuples; each is a length-2 array).

Taking dot products of high dimensional numpy arrays

I am trying to take the dot product between three numpy arrays. However, I am struggling with wrapping my head around this.
The problem is as follows:
I have two (4,) shaped numpy arrays, a and c, as well as a numpy array b with shape (4, 4, 3).
import numpy as np
a = np.array([0, 1, 2, 3])
b = np.array([[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]],
[[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]],
[[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]],
[[4, 4, 4], [4, 4, 4], [4, 4, 4], [4, 4, 4]]])
c = np.array([4, 5, 6, 7])
I want to compute the dot product in such a way that my result is a 3-tuple. That is, first dot a with b and then dot the result with c, taking transposes if needed. In other words, I want to compute the dot product between a, b and c as if b was of shape (4, 4), but I want a 3-tuple as result.
I have tried:
Reshaping a and c, and then computing the dot product:
a = np.reshape(a, (4, 1))
c = np.reshape(c, (4, 1))
tmp = np.dot(a.T, b) # now has shape (1, 4, 3)
result = np.dot(tmp, c)
Ideally, I should now have:
print(result.shape)
>> (1, 1, 3)
but I get the error
ValueError: shapes (1,4,3) and (4,1) not aligned: 3 (dim 2) != 4 (dim 0)
I have also tried using the tensordot function from numpy, but without luck.

The basic dot(A,B) rule is: last axis of A with the 2nd to the last of B
In [965]: a.shape
Out[965]: (4,)
In [966]: b.shape
Out[966]: (4, 4, 3)
a (and c) is 1d. Its (4,) can dot with the 2nd (4) of b:
In [967]: np.dot(a,b).shape
Out[967]: (4, 3)
Using c in the same way on the output produces a (3,) array:
In [968]: np.dot(c, np.dot(a,b))
Out[968]: array([360, 360, 360])
This combination may be clearer with the equivalent einsum:
In [971]: np.einsum('i,jik,j->k',a,b,c)
Out[971]: array([360, 360, 360])
But what if we want a to act on the 1st axis of b? With einsum that's easy to do:
In [972]: np.einsum('i,ijk,j->k',a,b,c)
Out[972]: array([440, 440, 440])
To do the same with the dot, we could just switch a and c:
In [973]: np.dot(a, np.dot(c,b))
Out[973]: array([440, 440, 440])
Or transpose axes of b:
In [974]: np.dot(c, np.dot(a,b.transpose(1,0,2)))
Out[974]: array([440, 440, 440])
This transposition question would be clearer if a and c had different lengths. e.g. A (2,) and (4,) with a (2,4,3) or (4,2,3).
In
tmp = np.dot(a.T, b) # now has shape (1, 4, 3)
you have a (1,4a) dotted with (4,4a,3). The result is (1,4,3). I added the a to identify when axes were combined.
To apply the (4,1) c, we have to do the same transpose:
In [977]: np.dot(c[:,None].T, np.dot(a[:,None].T, b))
Out[977]: array([[[360, 360, 360]]])
In [978]: _.shape
Out[978]: (1, 1, 3)
np.dot(c[None,:], np.dot(a[None,:], b)) would do the same without the transposes.
You wrote: "I was hoping numpy would automagically distribute over the last axis. That is, that the dot product would run over the two first axes, if that makes sense."
Given the dot rule that I cited at the start this does not make sense. But if we transpose b so the (3) axis is first, it can 'carry that along', using the last and 2nd to the last.
In [986]: b.transpose(2,0,1).shape
Out[986]: (3, 4, 4)
In [987]: np.dot(a, b.transpose(2,0,1)).shape
Out[987]: (3, 4)
In [988]: np.dot(np.dot(a, b.transpose(2,0,1)),c)
Out[988]: array([440, 440, 440])
(4a).(3, 4a, 4c) -> (3, 4c)
(3, 4c). (4c) -> 3
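The two pairings discussed above can be checked in one short script. This sketch rebuilds b from the question so that b[i, j, k] == i + 1; the names a_on_axis1, a_on_axis0 and via_transpose are illustrative.

```python
import numpy as np

a = np.array([0, 1, 2, 3])
c = np.array([4, 5, 6, 7])
# Same values as b in the question: b[i, j, k] == i + 1, shape (4, 4, 3).
b = np.arange(1, 5)[:, None, None] * np.ones((4, 4, 3), dtype=int)

# a contracts with axis 1 of b, c with axis 0 (what plain nested dot does).
a_on_axis1 = np.dot(c, np.dot(a, b))

# a contracts with axis 0 of b, c with axis 1.
a_on_axis0 = np.einsum('i,ijk,j->k', a, b, c)

# Same pairing via dot, after moving the length-3 axis to the front.
via_transpose = np.dot(np.dot(a, b.transpose(2, 0, 1)), c)

print(a_on_axis1)  # [360 360 360]
print(a_on_axis0)  # [440 440 440]
```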

Not automagical but does the job:
np.einsum('i,ijk,j->k',a,b,c)
# array([440, 440, 440])
This computes d of shape (3,) such that d_k = sum_{ij} a_i b_{ijk} c_j.

You are multiplying a (1,4,3) array by a (4,1) matrix, which is impossible because tmp holds 3 pages of (1,4) matrices. If you want to multiply each page by c, just multiply each page separately.
import numpy as np
a = np.array([0, 1, 2, 3])
b = np.array([[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]],
              [[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]],
              [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]],
              [[4, 4, 4], [4, 4, 4], [4, 4, 4], [4, 4, 4]]])
c = np.array([4, 5, 6, 7])
a = np.reshape(a, (4, 1))
c = np.reshape(c, (4, 1))
tmp = np.dot(a.T, b) # now has shape (1, 4, 3)
result = np.dot(tmp[:,:,0], c)
for i in range(1,3):
    result = np.dstack((result, np.dot(tmp[:,:,i], c)))
print(np.shape(result))
So the result has shape (1, 1, 3).

Return the subset of NumPy array according to the first element of each row

I am trying to get the subset x of the given NumPy array alist such that the first element of each row must be in the list r.
>>> import numpy
>>> alist = numpy.array([(0, 2), (0, 4), (1, 3), (1, 4), (2, 1), (3, 1), (3, 2), (4, 1), (4, 3), (4, 2)])
>>> alist
array([[0, 2],
[0, 4],
[1, 3],
[1, 4],
[2, 1],
[3, 1],
[3, 2],
[4, 1],
[4, 3],
[4, 2]])
>>> r = [1,3]
>>> x = alist[where first element of each row is in r] #this i need to figure out.
>>> x
array([[1, 3],
[1, 4],
[3, 1],
[3, 2]])
Any easy way (without looping as I've a large dataset) to do this in Python?

Slice the first column off input array (basically selecting first elem from each row), then use np.in1d with r as the second input to create a mask of such valid rows and finally index into the rows of the array with the mask to select the valid ones.
Thus, the implementation would be like so -
alist[np.in1d(alist[:,0],r)]
Sample run -
In [258]: alist # Input array
Out[258]:
array([[0, 2],
[0, 4],
[1, 3],
[1, 4],
[2, 1],
[3, 1],
[3, 2],
[4, 1],
[4, 3],
[4, 2]])
In [259]: r # Input list to be searched for
Out[259]: [1, 3]
In [260]: np.in1d(alist[:,0],r) # Mask of valid rows
Out[260]: array([False, False, True, True, False, True, True,
False, False, False], dtype=bool)
In [261]: alist[np.in1d(alist[:,0],r)] # Index and select for final o/p
Out[261]:
array([[1, 3],
[1, 4],
[3, 1],
[3, 2]])
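In recent NumPy releases, np.isin is the recommended spelling of this membership test (np.in1d is deprecated as of NumPy 2.0). A minimal sketch of the same selection with np.isin, reusing alist and r from the question:

```python
import numpy as np

alist = np.array([(0, 2), (0, 4), (1, 3), (1, 4), (2, 1),
                  (3, 1), (3, 2), (4, 1), (4, 3), (4, 2)])
r = [1, 3]

# Boolean mask: True where the first element of the row is in r.
mask = np.isin(alist[:, 0], r)
x = alist[mask]

print(x.tolist())  # [[1, 3], [1, 4], [3, 1], [3, 2]]
```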

You can construct the index array for the valid rows using some indexing tricks: we can add an additional dimension and check equality with each element of your first column:
import numpy as np
alist = np.array([(0, 2), (0, 4), (1, 3), (1, 4), (2, 1),
                  (3, 1), (3, 2), (4, 1), (4, 3), (4, 2)])
r = [1, 3]
inds = (alist[:,0][:,None] == r).any(axis=-1)
x = alist[inds,:] # the valid rows
The trick is that we take the first column of alist, make it an (N,1)-shaped array, make use of array broadcasting in the comparison to end up with an (N,2)-shape boolean array, and if any of the values in a given row is True, we keep that index. The resulting index array is the exact same as the np.in1d one in Divakar's answer.
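To make the intermediate shapes of the broadcasting trick visible, here is a short sketch that splits the one-liner into steps. The names col and matches are introduced only for illustration.

```python
import numpy as np

alist = np.array([(0, 2), (0, 4), (1, 3), (1, 4), (2, 1),
                  (3, 1), (3, 2), (4, 1), (4, 3), (4, 2)])
r = [1, 3]

col = alist[:, 0][:, None]     # first column as an (N, 1) array
matches = col == r             # broadcast against r: (N, len(r)) booleans
inds = matches.any(axis=-1)    # (N,) mask: True if the row matched any value in r

print(col.shape, matches.shape, inds.shape)  # (10, 1) (10, 2) (10,)
```

The intermediate boolean array has N * len(r) entries, so for a long r a membership test such as np.isin may use less memory.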
