NumPy array indexing - python

I want to extract the second column and the third through fifth columns of a NumPy array. How would I go about it?
A = array([[0, 1, 2, 3, 4, 5, 6], [4, 5, 6, 7, 4, 5, 6]])
A[:, [1, 4:6]]
This obviously doesn't work.

Assuming I've understood you -- it's usually a good idea to explicitly specify the output you want, because it's not obvious -- you could use numpy.r_:
In [27]: A
Out[27]:
array([[0, 1, 2, 3, 4, 5, 6],
[4, 5, 6, 7, 4, 5, 6]])
In [28]: A[:, [1,3,4,5]]
Out[28]:
array([[1, 3, 4, 5],
[5, 7, 4, 5]])
In [29]: A[:, r_[1, 3:6]]
Out[29]:
array([[1, 3, 4, 5],
[5, 7, 4, 5]])
In [37]: A[1:, r_[1, 3:6]]
Out[37]: array([[5, 7, 4, 5]])
which you can then flatten or reshape as you like. r_ is basically a convenience function to generate the right indices, e.g.
In [30]: r_[1, 3:6]
Out[30]: array([1, 3, 4, 5])
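Outside an interactive session where array and r_ are already imported, the same selection can be written as a standalone script. This is a minimal sketch that assumes the zero-based columns 1, 3, 4 and 5 are the ones wanted, as in the session above:

```python
import numpy as np

A = np.array([[0, 1, 2, 3, 4, 5, 6],
              [4, 5, 6, 7, 4, 5, 6]])

# np.r_ expands slice notation into a flat index array.
cols = np.r_[1, 3:6]

# Indexing the column axis with that array picks columns 1, 3, 4 and 5.
selected = A[:, cols]

print(cols)      # [1 3 4 5]
print(selected)  # rows [1 3 4 5] and [5 7 4 5]
```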

Perhaps you are looking for this?
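In [10]: A[1:, [1] + list(range(3, 6))]
Out[10]: array([[5, 7, 4, 5]])
Note this gives you the second, fourth, fifth and sixth columns of all rows but the first. In Python 3, range has to be wrapped in list() before it can be concatenated with another list.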

The second column is A[:,1]. Columns 3-5 (I'm assuming you want inclusive) are A[:,2:5]. You won't be able to extract both with a single slice. To get them as one array, you could do
import numpy as np
A = np.array([[0, 1, 2, 3, 4, 5, 6], [4, 5, 6, 7, 4, 5, 6]])
my_cols = np.hstack((A[:,1][...,np.newaxis], A[:,2:5]))
The np.newaxis stuff is just to make A[:,1] a 2D array, consistent with A[:,2:5].
Hope this helps.
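As a quick check, the hstack construction above and a single indexing call with an index array give the same result. The sketch below assumes the one-based, inclusive reading of the question (zero-based indices 1 and 2:5); A[:, [1]] is shown as another way to keep the column two-dimensional:

```python
import numpy as np

A = np.array([[0, 1, 2, 3, 4, 5, 6],
              [4, 5, 6, 7, 4, 5, 6]])

# Approach from the answer: make column 1 two-dimensional, then stack.
stacked = np.hstack((A[:, 1][..., np.newaxis], A[:, 2:5]))

# Indexing with a one-element list also keeps the column axis.
stacked_alt = np.hstack((A[:, [1]], A[:, 2:5]))

# A single indexing call with an explicit index array.
fancy = A[:, np.r_[1, 2:5]]

print(stacked)  # rows [1 2 3 4] and [5 6 7 4]
```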

Related

Creating shifted Hankel matrix

Say I have some time-series data in the form of a simple array.
X1 = np.array([1, 2, 3, 4])
The Hankel matrix can be obtained by using scipy.linalg.hankel, which would look something like this:
hankel(X1)
array([[1, 2, 3, 4],
[2, 3, 4, 0],
[3, 4, 0, 0],
[4, 0, 0, 0]])
Now assume I had a larger array in the form of
X2 = np.array([1, 2, 3, 4, 5, 6, 7])
What I want to do is fill in the zeros in this matrix with the numbers that are next in the index (specific to each row). Taking the same Hankel matrix earlier by using the first four values in the array X2, I'd like to see the following output:
hankel(X2[:4])
array([[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 6],
[4, 5, 6, 7]])
How would I do this? I'd ideally like to use this for larger data.
Appreciate any tips or pointers given. Thanks!
If you have a matrix with the appropriate index values into your dataset, you can use integer array indexing directly into your dataset.
To create the index matrix, you can simply use the upper-left quadrant of a double-sized Hankel array. There are likely simpler ways to create the index matrix, but this does the trick.
>>> X = np.array([9, 8, 7, 6, 5, 4, 3])
>>> N = 4 # the size of the "window"
>>> indices = scipy.linalg.hankel(np.arange(N*2))[:N, :N]
>>> indices
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 6]])
>>> X[indices]
array([[9, 8, 7, 6],
[8, 7, 6, 5],
[7, 6, 5, 4],
[6, 5, 4, 3]])
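One possibly simpler way to build the same index matrix, sketched here without SciPy, is to add a column of row offsets to a row of column offsets via broadcasting. NumPy 1.20 and later also provide sliding_window_view, which yields the windows directly as a read-only view. Both are offered as alternatives, not as the method used above:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

X = np.array([9, 8, 7, 6, 5, 4, 3])
N = 4  # the size of the "window"

# indices[i, j] = i + j, the same pattern as the Hankel index matrix.
indices = np.arange(N)[:, None] + np.arange(N)
shifted = X[indices]

# Each row is one length-N window of X; this is a view, not a copy.
windows = sliding_window_view(X, N)

print(shifted)
```

Note that the broadcasting version needs X to hold at least 2*N - 1 values, otherwise the largest index is out of bounds.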

Using reshape or view in a certain fashion

import torch
import numpy as np
a = torch.tensor([[1, 4], [2, 5],[3, 6]])
bb=a.detach().numpy()
b = a.view(6).detach().numpy()
Element b is like:
[1 4 2 5 3 6]
How do I reshape back to the following:
[1 2 3 4 5 6]
This is just an example; I'd like a general answer that also works in 3D.
In Pytorch you can use reshape and permute as in this example:
import torch
a = torch.randn((3,3,2))
b = a.permute(2,0,1).reshape(-1)
a
tensor([[[ 0.2372, 0.5550],
[ 0.7700, -0.3693],
[-0.4151, 0.6247]],
[[ 1.2179, 0.6992],
[ 0.5033, 1.6290],
[-1.2165, -0.4180]],
[[ 0.3189, 0.3208],
[ 0.3894, 2.5544],
[-1.3069, -0.6905]]])
b
tensor([ 0.2372, 0.7700, -0.4151, 1.2179, 0.5033, -1.2165, 0.3189, 0.3894,
-1.3069, 0.5550, -0.3693, 0.6247, 0.6992, 1.6290, -0.4180, 0.3208,
2.5544, -0.6905])
I think this solves the problem.
If you want to remain in PyTorch, you can view b in a's shape, then apply a transpose and flatten:
>>> b.view(-1,2).T.flatten()
tensor([1, 2, 3, 4, 5, 6])
In the 3D case, you can perform similar manipulations using torch.transpose, which lets you swap two axes. You get the desired result by combining it with Tensor.view:
First case (extra dimension last):
>>> b = a.view(-1, 1).expand(-1,3).flatten()
tensor([1, 1, 1, 4, 4, 4, 2, 2, 2, 5, 5, 5, 3, 3, 3, 6, 6, 6])
>>> b.view(-1,2,3).transpose(0,1).flatten()
tensor([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6])
Second case (extra dimension first):
>>> b = a.view(1,-1).expand(3,-1).flatten()
tensor([1, 4, 2, 5, 3, 6, 1, 4, 2, 5, 3, 6, 1, 4, 2, 5, 3, 6])
>>> b.view(3,-1).T.view(-1,2,3).transpose(0,1).flatten()
tensor([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6])
I can't help with the torch step, but starting with a numpy array:
In [70]: a=np.array([[1, 4], [2, 5],[3, 6]])
In [71]: a
Out[71]:
array([[1, 4],
[2, 5],
[3, 6]])
In [72]: a.ravel() # can also use reshape
Out[72]: array([1, 4, 2, 5, 3, 6])
To get a column major copy:
In [73]: a.ravel(order='F')
Out[73]: array([1, 2, 3, 4, 5, 6])
In [74]: a.T.ravel()
Out[74]: array([1, 2, 3, 4, 5, 6])
the transpose:
In [79]: a.T
Out[79]:
array([[1, 2, 3],
[4, 5, 6]])
For 3d arrays, you can use transpose with an axes argument to choose the axis order.
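To illustrate the 3D case, here is a small sketch with a 2x3x4 array (the shape is made up for the example). Reversing all axes before flattening matches order='F', while a partial permutation such as (2, 0, 1) moves only the last axis to the front:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# Row-major (C order) flattening: the last axis varies fastest.
c_flat = a.ravel()

# Reversing every axis and flattening equals column-major flattening.
f_flat = a.transpose(2, 1, 0).ravel()

# Move only the last axis to the front; the other two keep their order.
last_first = a.transpose(2, 0, 1).ravel()

print(f_flat[:6])      # [ 0 12  4 16  8 20]
print(last_first[:6])  # [ 0  4  8 12 16 20]
```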

Concatenation of 2 lists using numpy

a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=int)
b = np.array([[8], [9]], dtype=int)
result wanted:
alist = [[0, 1, 2, 3, 8], [4, 5, 6, 7, 9]] # as np.array
I tried:
np.concatenate(alist,blist)
np.concatenate((alist,blist))
np.concatenate(alist, blist[0])
for a,b in zip(alist,blist): np.concatenate(a,b)
alist = [*map(np.concatenate, alist, blist)])
Each attempt produced an error message, which I tried to fix in the next attempt. Nothing has worked so far.
You are just missing the axis=1 keyword argument.
np.concatenate((a, b), axis=1)
Normally np.concatenate works on axis 0 (going down the array). But in this case you want to concatenate along axis 1 (going across the array). See the glossary for more information.
You can also achieve this with np.hstack, which concatenates the two arrays along the second axis.
a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=int)
b = np.array([[8], [9]], dtype=int)
>>> np.hstack((a,b))
array([[0, 1, 2, 3, 8],
[4, 5, 6, 7, 9]])
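Putting the two answers side by side, the sketch below shows that np.concatenate with axis=1 and np.hstack agree for these inputs, and that the default axis=0 fails because the arrays have different numbers of columns:

```python
import numpy as np

a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=int)
b = np.array([[8], [9]], dtype=int)

# Join along axis 1 (across the columns).
joined = np.concatenate((a, b), axis=1)
stacked = np.hstack((a, b))

# The default axis=0 requires matching column counts, so it raises.
axis0_failed = False
try:
    np.concatenate((a, b))
except ValueError:
    axis0_failed = True

print(joined)
print(axis0_failed)  # True
```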

Splitting an N dimensional numpy array into multiple 1D arrays

I have a simulation model that integrates a set of variables whose states are represented by numpy arrays of an arbitrary number of dimensions. After the simulation, I now have a list of arrays whose elements represent the variable state at a particular point in time.
In order to output the simulation results I want to split these arrays into multiple 1D arrays where the elements correspond to the same component of the state variable through time. Here is an example of a 2D state variable over a number of time steps.
import numpy as np
# Arbitrary state that is constant
arr = np.arange(9).reshape((3, 3))
# State variable through 3 time steps
state = [arr.copy() for _ in range(3)]
# Stack the arrays up to 3d. Axis could be rolled here if it makes it easier.
stacked = np.stack(state)
The output I need to get is:
[np.array([0, 0, 0]), np.array([1, 1, 1]), np.array([2, 2, 2]), ...]
I've tried doing np.split(stacked, sum(stacked.shape[:-1]), axis=...) (tried everything for axis=) but get the following error: ValueError: array split does not result in an equal division. Is there a way to do this using np.split or maybe np.nditer that will work for the general case?
I guess this would be equivalent to doing:
I, J, K = stacked.shape
result = []
for i in range(I):
    for j in range(J):
        result.append(stacked[i, j, :])
Which is also the ordering I'm hoping to get. Easy enough, however I'm hoping there is something in numpy that I can take advantage of for this that will be more general.
If I reshape it to a 9x3 array, then a simple list() will turn it into a list of 3 element arrays:
In [190]: stacked.reshape(-1,3)
Out[190]:
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4],
[5, 5, 5],
[6, 6, 6],
[7, 7, 7],
[8, 8, 8]])
In [191]: list(stacked.reshape(-1,3))
Out[191]:
[array([0, 0, 0]),
array([1, 1, 1]),
array([2, 2, 2]),
array([3, 3, 3]),
array([4, 4, 4]),
array([5, 5, 5]),
array([6, 6, 6]),
array([7, 7, 7]),
array([8, 8, 8])]
np.split(stacked.reshape(-1,3),9) produces a list of 1x3 arrays.
np.split only works on one axis, but you want to split on the first two, hence the need for a reshape or ravel.
And forget about nditer. That's a stepping stone to reworking code in cython. It does not help with ordinary iteration - except that when used in ndindex it can streamline your i,j double loop:
In [196]: [stacked[idx] for idx in np.ndindex(stacked.shape[:2])]
Out[196]:
[array([0, 0, 0]),
array([1, 1, 1]),
array([2, 2, 2]),
array([3, 3, 3]),
array([4, 4, 4]),
array([5, 5, 5]),
array([6, 6, 6]),
array([7, 7, 7]),
array([8, 8, 8])]
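The same ndindex idea extends to any number of leading dimensions by iterating over every axis except the last. A minimal sketch, assuming the time axis has already been placed last (the 2x3x4x5 shape is arbitrary):

```python
import numpy as np

stacked = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# One 1D array per index into the leading axes.
by_index = [stacked[idx] for idx in np.ndindex(stacked.shape[:-1])]

# Collapsing the leading axes with reshape yields the same rows in the same order.
by_reshape = list(stacked.reshape(-1, stacked.shape[-1]))

print(len(by_index))  # 24
print(by_index[1])    # [5 6 7 8 9]
```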
======================
With the different state, just stack on a different axis
In [302]: state
Out[302]:
[array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]]), array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]]), array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])]
In [303]: np.stack(state,axis=2).reshape(-1,3)
Out[303]:
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4],
[5, 5, 5],
[6, 6, 6],
[7, 7, 7],
[8, 8, 8]])
stack is rather like np.array, except it gives more control over where the dimension is added. But do look at its code.
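For the general case, stacking on the last axis and collapsing all leading axes works regardless of the state's dimensionality. Below is a sketch; split_through_time is a hypothetical helper name, not a NumPy function:

```python
import numpy as np

def split_through_time(states):
    """Return one 1D array per state component, in C-order of the components."""
    # The new last axis indexes time; every other axis indexes a component.
    stacked = np.stack(states, axis=-1)
    return list(stacked.reshape(-1, len(states)))

# Arbitrary state that is constant over 3 time steps, as in the question.
arr = np.arange(9).reshape((3, 3))
state = [arr.copy() for _ in range(3)]
series = split_through_time(state)

print(series[0], series[1], series[8])  # [0 0 0] [1 1 1] [8 8 8]
```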
You could use np.split on a flattened version and cut it into the appropriate number of 1D arrays, like so -
np.split(stacked.ravel(),np.prod(stacked.shape[:2]))
Sample run -
In [406]: stacked
Out[406]:
array([[[0, 0, 0],
[1, 1, 1]],
[[2, 2, 2],
[3, 3, 3]],
[[4, 4, 4],
[5, 5, 5]],
[[6, 6, 6],
[7, 7, 7]]])
In [407]: np.split(stacked.ravel(),np.prod(stacked.shape[:2]))
Out[407]:
[array([0, 0, 0]),
array([1, 1, 1]),
array([2, 2, 2]),
array([3, 3, 3]),
array([4, 4, 4]),
array([5, 5, 5]),
array([6, 6, 6]),
array([7, 7, 7])]

Numpy: equivalent of numpy.roll but only for data visualisation

Is there a way to roll an array without copying the data, so that the result is just a different view of the same data?
An example might clarify: given b a rolled version of a...
>>> a = np.random.randint(0, 10, (3, 3))
>>> a
array([[6, 7, 4],
[5, 4, 8],
[1, 3, 4]])
>>> b = np.roll(a, 1, axis=0)
>>> b
array([[1, 3, 4],
[6, 7, 4],
[5, 4, 8]])
...if I perform an assignment on array b...
>>> b[2,2] = 99
>>> b
array([[ 1, 3, 4],
[ 6, 7, 4],
[ 5, 4, 99]])
...the content of a won't change...
>>> a
array([[6, 7, 4],
[5, 4, 8],
[1, 3, 4]])
...contrarily, I would like to have:
>>> a
array([[6, 7, 4],
[5, 4, 99], # observe as `8` has been changed here too!
[1, 3, 4]])
Thanks in advance for your time and expertise!
This is not possible, sorry. The rolled array cannot be described by a different set of strides, which would be necessary for a NumPy view to work.
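To see the copy in practice, np.shares_memory confirms that the rolled array is separate from its input. If the goal is only to write through to the original, one workaround (an assumption here, not something NumPy offers as a rolled view) is to translate the rolled row index back to the original one with modular arithmetic; rolled_row_to_original is a hypothetical helper:

```python
import numpy as np

a = np.array([[6, 7, 4],
              [5, 4, 8],
              [1, 3, 4]])
shift = 1
b = np.roll(a, shift, axis=0)

# np.roll copies: b does not share memory with a.
shares = np.shares_memory(a, b)

def rolled_row_to_original(row, shift, n_rows):
    """Map a row index of the rolled array back to the original array."""
    return (row - shift) % n_rows

# Perform the equivalent of "b[2, 2] = 99" directly on a.
a[rolled_row_to_original(2, shift, a.shape[0]), 2] = 99

print(shares)  # False
print(a)
```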
