Choosing and iterating specific sub-arrays in multidimensional arrays in Python

This question follows on from the post "Iterating and selecting a specific array from a multidimensional array in Python".
In that post, user @Cleb solved my original problem: how to sum over the columns of each matrix in a 3D array:
import numpy as np
arra = np.arange(16).reshape(2, 2, 4)
which gives
array([[[0, 1, 2, 3],
[4, 5, 6, 7]],
[[8, 9, 10, 11],
[12, 13, 14, 15]]])
and the problem was how to sum the columns within each matrix, i.e., 0 + 4, 1 + 5, ..., 8 + 12, ..., 11 + 15. It was solved by @Cleb.
Then I wondered how to do it for sums such as 0 + 8, 1 + 9, ..., 4 + 12, ..., 7 + 15 (odd and even columns), which was also solved by @Cleb.
But then I wondered whether there is a general approach (one that can be adapted to each specific case). Imagine you want to add the first and last rows, and the two center rows, column by column and separately, i.e., 0 + 12, 1 + 13, ..., 3 + 15 and 4 + 8, 5 + 9, ..., 7 + 11.
Is there a general way? Thank you.

Depending on how exactly arra is defined, you can shift your values appropriately using np.roll (called without an axis argument, it shifts the flattened array and then restores the original shape):
arra_mod = np.roll(arra, arra.shape[2])
arra_mod then looks as follows:
array([[[12, 13, 14, 15],
[ 0, 1, 2, 3]],
[[ 4, 5, 6, 7],
[ 8, 9, 10, 11]]])
Now you can simply use the command from your previous question to get your desired output (in Python 3, map returns an iterator, so wrap it in list to see the values):
list(map(sum, arra_mod))
which gives you the desired output:
[array([12, 14, 16, 18]), array([12, 14, 16, 18])]
You can also use a list comprehension
[sum(ai) for ai in arra_mod]
which gives you the same output.
If you prefer a one-liner, you can simply do:
list(map(sum, np.roll(arra, arra.shape[2])))
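As a possible generalization of the idea above, here is a minimal sketch that picks arbitrary pairs of rows by index instead of relying on a shift. The names rows, pairs and pair_sums are illustrative and not part of the original answer, and the sketch assumes every row has the same length.

```python
import numpy as np

arra = np.arange(16).reshape(2, 2, 4)

# Collapse the leading axes so every row of every matrix becomes one row here.
rows = arra.reshape(-1, arra.shape[-1])   # shape (4, 4)

# Hypothetical pairing: first row with last row, and the two center rows.
pairs = np.array([[0, 3], [1, 2]])

# rows[pairs] has shape (2, 2, 4); summing over axis 1 adds the rows of each pair.
pair_sums = rows[pairs].sum(axis=1)
print(pair_sums)

# For this particular pairing, the np.roll approach gives the same numbers.
rolled_sums = np.roll(arra, arra.shape[2]).sum(axis=1)
```

Changing the entries of pairs is all that is needed to express a different combination of rows, which is what makes this variant easier to adapt than a fixed shift.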

Related

Adding up formatted indexes with Numpy arrays Python

How can I write code that adds up all the numbers in between the indices? The arrays numbers and indices are correlated. The first two index values are 0-3 so the numbers between indices 0-3 are being added up 1 + 5 + 6 = 12. The expected value is what I am trying to find. I am trying to get the results without using a for loop.
numbers = np.array([1, 5, 6, 7, 4, 3, 6, 7, 11, 3, 4, 6, 2, 20])
indices = np.array([0, 3 , 7, 11])
Expected output:
[12, 41, 22]
I'm not sure how you got the expected output - from my calculation, the sum between the indices should be [12, 20, 25]. The following code calculates this:
numbers = np.array([1, 5, 6, 7, 4, 3, 6, 7, 11, 3, 4, 6, 2, 20])
indexes = np.array([0, 3, 7, 11])
tmp = np.zeros(len(numbers) + 1)
np.cumsum(numbers, out=tmp[1:])
result = np.diff(tmp[indexes])
The output of this is [12, 20, 25]
How does this work? It creates an array that is one element longer than the numbers array (so that the first element is zero). Then it calculates the cumulative sum of the elements, writing it into tmp starting at index 1. Finally, it takes the diff of tmp at the indices provided. As an example, it takes the difference of the cumulative sum from index 3 (value = 12) to index 7 (value = 32): 32 - 12 = 20.
You are likely looking for np.add.reduceat:
>>> np.add.reduceat(numbers, indices)
array([12, 20, 25, 28], dtype=int32)
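Note that reduceat returns one more value than the cumulative-sum approach: its last segment runs from the final index to the end of the array (6 + 2 + 20 = 28). A small sketch, assuming you only want the sums between consecutive indices, drops that last entry. The names segment_sums, bounded and via_cumsum are illustrative.

```python
import numpy as np

numbers = np.array([1, 5, 6, 7, 4, 3, 6, 7, 11, 3, 4, 6, 2, 20])
indices = np.array([0, 3, 7, 11])

# One sum per start index; the last one covers numbers[11:].
segment_sums = np.add.reduceat(numbers, indices)

# Keep only the sums that are bounded on both sides by an index.
bounded = segment_sums[:-1]

# Cross-check against the cumulative-sum approach.
tmp = np.zeros(len(numbers) + 1)
np.cumsum(numbers, out=tmp[1:])
via_cumsum = np.diff(tmp[indices])

print(segment_sums, bounded, via_cumsum)
```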

Iterating Over Rows in Python Array to Extract Column Data

I am using Python and looking to iterate through each row of an Nx9 array and extract certain values from the row to form another matrix with them. The N value can change depending on the file I am reading but I have used N=3 in my example. I only require the 0th, 1st, 3rd and 4th values of each row to form into an array which I need to store. E.g:
result = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9],
[11, 12, 13, 14, 15, 16, 17, 18, 19],
[21, 22, 23, 24, 25, 26, 27, 28, 29]])
#Output matrix of first row should be: ([[1,2],[4,5]])
#Output matrix of second row should be: ([[11,12],[14,15]])
#Output matrix of third row should be: ([[21,22],[24,25]])
I should then end up with N number of matrices formed with the extracted values - a 2D matrix for each row. However, the matrices formed appear 3D so when transposed and subtracted I receive the error ValueError: operands could not be broadcast together with shapes (2,2,3) (3,2,2). I am aware that a (3,2,2) matrix cannot be subtracted from a (2,2,3) so how do I obtain a 2D matrix N number of times? Would a loop be better suited? Any suggestions?
import numpy as np
result = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9],
[11, 12, 13, 14, 15, 16, 17, 18, 19],
[21, 22, 23, 24, 25, 26, 27, 28, 29]])
a = result[:, 0]
b = result[:, 1]
c = result[:, 2]
d = result[:, 3]
e = result[:, 4]
f = result[:, 5]
g = result[:, 6]
h = result[:, 7]
i = result[:, 8]
output = [[a, b], [d, e]]
output = np.array(output)
output_transpose = output.transpose()
result = 0.5 * (output - output_transpose)
In [276]: result = np.array(
...: [
...: [1, 2, 3, 4, 5, 6, 7, 8, 9],
...: [11, 12, 13, 14, 15, 16, 17, 18, 19],
...: [21, 22, 23, 24, 25, 26, 27, 28, 29],
...: ]
...: )
...:
...: a = result[:, 0]
...
...: i = result[:, 8]
...: output = [[a, b], [d, e]]
In [277]: output
Out[277]:
[[array([ 1, 11, 21]), array([ 2, 12, 22])],
[array([ 4, 14, 24]), array([ 5, 15, 25])]]
In [278]: arr = np.array(output)
In [279]: arr
Out[279]:
array([[[ 1, 11, 21],
[ 2, 12, 22]],
[[ 4, 14, 24],
[ 5, 15, 25]]])
In [280]: arr.shape
Out[280]: (2, 2, 3)
In [281]: arr.T.shape
Out[281]: (3, 2, 2)
With no arguments, transpose reverses the order of the axes, so for a 3D array it exchanges the 1st and last dimensions.
A cleaner way to make a (N,2,2) array from selected columns is:
In [282]: arr = result[:,[0,1,3,4]].reshape(3,2,2)
In [283]: arr.shape
Out[283]: (3, 2, 2)
In [284]: arr
Out[284]:
array([[[ 1, 2],
[ 4, 5]],
[[11, 12],
[14, 15]],
[[21, 22],
[24, 25]]])
Since the last 2 dimensions are 2, you could transpose them, and take the difference:
In [285]: arr-arr.transpose(0,2,1)
Out[285]:
array([[[ 0, -2],
[ 2, 0]],
[[ 0, -2],
[ 2, 0]],
[[ 0, -2],
[ 2, 0]]])
Another way to get the (N,2,2) array is with a matrix index:
In [286]: result[:,[[0,1],[3,4]]]
Out[286]:
array([[[ 1, 2],
[ 4, 5]],
[[11, 12],
[14, 15]],
[[21, 22],
[24, 25]]])
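Putting these pieces together, here is a minimal sketch of the whole computation for any N, under the assumption that the goal is 0.5 * (M - M^T) for each per-row 2x2 matrix M. Using -1 in reshape lets numpy infer N, and swapaxes(1, 2) is equivalent to transpose(0, 2, 1) here. The names blocks and antisym are illustrative.

```python
import numpy as np

result = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9],
                   [11, 12, 13, 14, 15, 16, 17, 18, 19],
                   [21, 22, 23, 24, 25, 26, 27, 28, 29]])

# One 2x2 matrix per input row, built from columns 0, 1, 3 and 4.
blocks = result[:, [0, 1, 3, 4]].reshape(-1, 2, 2)   # shape (N, 2, 2)

# Transpose each 2x2 matrix separately by swapping only the last two axes.
antisym = 0.5 * (blocks - blocks.swapaxes(1, 2))

print(antisym.shape)
print(antisym[0])
```

Because only the last two axes are swapped, both operands keep the shape (N, 2, 2) and the broadcasting error from the question does not occur.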
Ok, this is not a coding problem, but a math problem. I wrote some code for you, since it's pretty obvious you're a beginner, so there will be some unfamiliar syntax in there that you should look into so you can avoid problems like this in the future. You might not use them all that often, but it's good to know how to do it, because it expands your understanding of python syntax in general.
First up, the complete code for easy copy and pasting:
import numpy as np
result=np.array([[1,2,3,4,5,6,7,8,9],
[11,12,13,14,15,16,17,18,19],
[21,22,23,24,25,26,27,28,29]])
output = np.array(tuple(result[:,i] for i in (0,1,3)))
def Matrix_Operation(Matrix,Coefficient):
    if (Matrix.shape == Matrix.shape[::-1]
            and isinstance(Matrix,np.ndarray)
            and isinstance(Coefficient,(float,int))):
        return Coefficient*(Matrix-Matrix.transpose())
    else:
        print('The shape of your Matrix is not palindromic')
        print('You cannot subtract matrices of unequal shape')
        print('Your shape: %s'%str(Matrix.shape))
print(Matrix_Operation(output,0.5))
Now let's talk about a step by step explanation of what's happening here:
import numpy as np
result=np.array([[1,2,3,4,5,6,7,8,9],
[11,12,13,14,15,16,17,18,19],
[21,22,23,24,25,26,27,28,29]])
Python uses indentation (alignment of whitespace) as an integral part of its syntax. However, inside brackets you usually don't need aligned indentation for the interpreter to understand your code. If you provide a large array of values manually, it is usually advisable to start new lines at the commas (here, the commas separating the sublists). It's just more readable, and that way your data isn't off screen in your editor.
output = np.array(tuple(result[:,i] for i in (0,1,3)))
Comprehensions are a big deal in Python and really handy for compact one-liners. That's one of the reasons why Python is so great. Here I used a generator expression (the same syntax as a list comprehension, just without the square brackets) that yields result[:,i] for every i in (0,1,3). The tuple() call collects those columns into a tuple. Finally I wrapped it in the np.array function, since this is the type required for our mathematical operations later on.
def Matrix_Operation(Matrix,Coefficient):
    if (Matrix.shape == Matrix.shape[::-1]
            and isinstance(Matrix,np.ndarray)
            and isinstance(Coefficient,(float,int))):
        return Coefficient*(Matrix-Matrix.transpose())
    else:
        print('The shape of your Matrix is not palindromic')
        print('You cannot subtract matrices of unequal shape')
        print('Your shape: %s'%str(Matrix.shape))
print(Matrix_Operation(output,0.5))
If you're gonna create a complex formula in python code, why not wrap it inside an abstractable function? You can incorporate a lot of "quality control" into a function as well, to check if it is given the correct input for the task it is supposed to do.
Your code failed because you were trying to subtract a (3,2,2) shaped matrix from a (2,2,3) shaped one. So we need a check for whether the provided matrix has a palindromic shape. You can reverse the order of items in a sequence with Container[::-1], so we ask whether Matrix.shape == Matrix.shape[::-1]. Further, our Matrix must be an np.ndarray and our coefficient must be a number. That's what I'm checking with the isinstance() function. You can check for multiple types at once, which is why isinstance(Coefficient,(float,int)) contains a tuple with both float and int in it.
Now that we have ensured that our input makes sense, we can perform our Matrix_Operation.
So in conclusion: Check if your math is solid before asking SO for help, because people here can get pretty annoyed at that sort of thing. You probably noticed by now that someone has already downvoted your question. Personally, I believe it's necessary to let newbies ask a couple stupid questions before they get into the groove, but that's what the voting button is for, I guess.

Numpy reshape seems to output value error

I tried to use reshape
import numpy as np
d = np.arange(30).reshape(1,3)
It does not work; I get the error: cannot reshape array of size 30 into shape (1,3)
but it works when I use
d = np.arange(30).reshape(-1,3) # This works
Why do we have to use -1?
It's really confusing and I can't seem to figure out how reshape works. I would really appreciate it if someone could help me figure this out. I tried the docs and other posts on SO, but they weren't much help.
I am new to ML and Python.
A reshape rearranges the elements of an array into different dimensions. The total number of elements must stay the same, which is why 30 elements cannot be reshaped to (1, 3): that shape holds only 3 elements. For example arange(27) will produce a vector containing 27 elements. But with .reshape(9, 3) you specify that you want to transform it into a two dimensional array, where the first dimension contains 9 elements, and the second three elements. So the result will look like:
>>> np.arange(27).reshape(9, 3)
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26]])
But we can also make it a 3×3×3 array:
>>> np.arange(27).reshape(3, 3, 3)
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
-1 is a placeholder that tells numpy to derive that dimension itself.
So if you have an array containing 30 elements, and you reshape it to m×3, then m is 10. -1 is thus not the real value; it is used for programmer convenience if, for example, you do not know the number of elements, but know that it is divisible by three.
The following two are equivalent (under the assumption that m contains 30 elements):
m.reshape(10, 3)
m.reshape(-1, 3)
Note that you can specify at most one -1, since otherwise there are multiple possibilities, and it becomes also harder to find a valid configuration.
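The rules above can be checked with a short sketch. The helper try_reshape is hypothetical and only exists to show which shapes numpy accepts and which raise a ValueError.

```python
import numpy as np

m = np.arange(30)

# -1 asks numpy to infer the missing dimension: 30 / 3 = 10.
inferred = m.reshape(-1, 3)
explicit = m.reshape(10, 3)

def try_reshape(arr, *shape):
    """Return the resulting shape, or the error message if the reshape is invalid."""
    try:
        return arr.reshape(*shape).shape
    except ValueError as exc:
        return str(exc)

print(try_reshape(m, 1, 3))    # invalid: 1 * 3 != 30
print(try_reshape(m, -1, -1))  # invalid: more than one unknown dimension
print(try_reshape(m, -1, 4))   # invalid: 30 is not divisible by 4
print(try_reshape(m, 5, -1))   # valid: numpy infers 6
```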

Why np.concatenate changes dimension

In the following program, I am trying to understand how the np.concatenate command works. After accessing each row of the array a in a for loop, when I concatenate along the row axis I expect a 2-dimensional array with shape (5,5), but the shape changes.
I want to have the same dimension (5,5) after concatenation. How can I do that?
I tried to repeat the above method for the 2-dimensional array by storing them in a list [(2,5),(2,5),(2,5)]. At the end when I concatenate it gives me the shape of (6,5) as expected but in the following case, it is different.
a = np.arange(25).reshape(5,5)
ind =[0,1,2,3,4]
list=[]
for i in ind:
    list.append(a[i])
new= np.concatenate(list, axis=0)
print(list)
print(len(list))
print(new)
print(new.shape)
This gives the following results for new:
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]
and for new.shape:
(25,)
To preface this you really should not be using concatenate here.
Setup
a = np.arange(25).reshape(5,5)
L = [i for i in a]
Your question asks:
Why is np.concatenate changing dimension?
It's not changing dimension, it is doing exactly what it is supposed to based on the input you are giving it. From the documentation:
Join a sequence of arrays along an existing axis
When you pass your list to concatenate, don't think of it as passing a (5, 5) list, think of it as passing 5 (5,) shape arrays, which are getting joined along axis 0, which will intuitively produce a (25,) shape output.
Now this behavior also gives insight on how to work around this. If passing 5 (5,) shape arrays produces a (25,) shape output, we just need to pass (1, 5) shape arrays to produce a (5, 5) shape output. We can accomplish this by simply adding a dimension to each element of L:
np.concatenate([[i] for i in L])
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
However, the much better way to approach this is to simply use stack, vstack, etc.
>>> np.stack(L)
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> np.vstack(L)
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
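To make the shape differences explicit, here is a small side-by-side sketch of the options discussed above. The variable names are illustrative, and the last line (integer-array indexing) is an additional option that is not part of the original answer; it avoids the loop entirely when the rows come from a single array.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
ind = [0, 1, 2, 3, 4]
rows = [a[i] for i in ind]           # five arrays of shape (5,)

# Joining 1-D arrays along their only axis gives one long 1-D array.
flat = np.concatenate(rows, axis=0)

# stack creates a new leading axis, so the rows stay separate.
stacked = np.stack(rows, axis=0)

# concatenate also works once every row has shape (1, 5).
expanded = np.concatenate([r[np.newaxis, :] for r in rows], axis=0)

# Selecting the rows directly with a list of indices needs no loop.
indexed = a[ind]

print(flat.shape, stacked.shape, expanded.shape, indexed.shape)
```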

Numpy: how to add / join slice objects?

For a 2*N x 2*N array x, I'd like to swap rows [0:N] with rows [N:2*N] in a particular way, namely, the question I have is, if there is a 'built-in' way of 'adding / joining' slice objects to achieve this? I.e. something like:
x[N:2*N + 0:N,:]
although, the preceding does something different.
Certainly I could do things like vstack((x[N:2*N,:],x[0:N,:])), which is not really what I'm looking for, or x[[i for i in range(N)]+[i for i in range(N,2*N)],:], which probably is slow.
I think you're looking for numpy.r_, which "translates slice objects to concatenation along the first axis". It allows you to perform more complex slices along the first axis - you can concatenate multiple slices with commas: np.r_[5:10, 100:200:10, 15, 20, 0:5].
For example:
>>> import numpy as np
>>> N = 2
>>> x = np.arange(16).reshape(4, 4)
>>> x[np.r_[N:2*N, 0:N]]
array([[ 8, 9, 10, 11],
[12, 13, 14, 15],
[ 0, 1, 2, 3],
[ 4, 5, 6, 7]])
And in this specific case, you could also just np.roll it:
>>> np.roll(x, N, axis=0)
array([[ 8, 9, 10, 11],
[12, 13, 14, 15],
[ 0, 1, 2, 3],
[ 4, 5, 6, 7]])
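As a quick check of the two approaches, the sketch below shows the index array that np.r_ builds and confirms that both give the same result. One point worth keeping in mind is that indexing with an integer array returns a copy, so modifying the swapped array leaves x unchanged. The names idx, swapped, rolled and same are illustrative.

```python
import numpy as np

N = 2
x = np.arange(16).reshape(2 * N, 2 * N)

# np.r_ expands the slices into one integer index array.
idx = np.r_[N:2*N, 0:N]
print(idx)

swapped = x[idx]                 # integer-array indexing: returns a copy
rolled = np.roll(x, N, axis=0)   # also returns a new array
same = np.array_equal(swapped, rolled)

# Writing to the copy does not affect the original array.
swapped[0, 0] = -1
print(same, x[N, 0])
```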
