Reference to ndarray rows in ndarray - python

Is it possible to store references to specific rows of a NumPy array in another NumPy array?
I have an array of 2D nodes, e.g.
nodes = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
Now I want to select only a few of them and store a reference in another numpy array:
nn = np.array([nodes[0], nodes[3]])
If I modify an entry in nn, the array nodes remains unchanged. Is there a way to store a reference to nodes in the ndarray nn?

If the reference can be created with basic indexing/slicing, then you get a view (an array that does not own its data, but refers to another array’s data instead) of the initial array where changes propagate:
>>> nn = nodes[0:4:3] # reference array for rows 0 and 3
>>> nn[0][0] = 0
>>> nodes
array([[0, 2],
       [2, 3],
       [3, 4],
       [4, 5],
       [5, 6]])
Otherwise, you get a copy from the original array as in your code, and updates do not propagate to the initial array.
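If you are unsure whether an indexing expression gave you a view or a copy, you can check it directly. A minimal sketch, assuming the nodes array from the question; np.shares_memory and the .base attribute are standard NumPy, the variable names are only for illustration:

```python
import numpy as np

nodes = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])

view = nodes[0:4:3]                      # basic slicing: rows 0 and 3 as a view
copy = np.array([nodes[0], nodes[3]])    # builds a new array: a copy

print(np.shares_memory(view, nodes))     # True
print(np.shares_memory(copy, nodes))     # False
print(view.base is nodes)                # True, the view refers to nodes' data

view[1, 0] = 40                          # writes through to nodes[3, 0]
copy[0, 0] = -1                          # nodes is unaffected
print(nodes[3, 0], nodes[0, 0])          # 40 1
```

Note that this only works because rows 0 and 3 can be written as a regular slice; an arbitrary set of rows generally cannot.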

You can store an index to the rows you want in a numpy array:
ref = np.array([0, 3])
You can use the reference in an indexing expression to access the nodes you want:
>>> nn = nodes[ref]
>>> nn
array([[1, 2],
       [4, 5]])
nn will be a deep copy with no connection to the original in this case. While nn[foo] = bar won't affect the original array, you can use ref directly:
>>> nodes[ref, 1] = [17, 18]
>>> nodes
array([[ 1, 17],
       [ 2,  3],
       [ 3,  4],
       [ 4, 18],
       [ 5,  6]])
Alternatively, you can use a mask for ref:
>>> ref2 = np.zeros(nodes.shape[0], dtype=bool)
>>> ref2[ref] = True
>>> ref2
array([ True, False, False,  True, False])
You can do almost all the same operations:
>>> nn2 = nodes[ref2]
>>> nn2
array([[1, 2],
       [4, 5]])
Modifications work too:
>>> nodes[ref2, 1] = [19, 23]
>>> nodes
array([[ 1, 19],
       [ 2,  3],
       [ 3,  4],
       [ 4, 23],
       [ 5,  6]])
The only thing that is more convenient with an array of indices is selecting a particular node from within the selection:
>>> nodes[ref[0], 0]
1
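If you need to switch between the two representations, here is a short sketch. np.flatnonzero is a standard NumPy function; the names mask_from_ref and ref_from_mask are made up for this example:

```python
import numpy as np

nodes = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
ref = np.array([0, 3])

# index array -> boolean mask
mask_from_ref = np.zeros(nodes.shape[0], dtype=bool)
mask_from_ref[ref] = True

# boolean mask -> index array
ref_from_mask = np.flatnonzero(mask_from_ref)

print(ref_from_mask)                                       # [0 3]
print(np.array_equal(nodes[ref], nodes[mask_from_ref]))    # True
```

A mask always selects rows in ascending order and cannot select a row twice, so an index array such as [3, 0] or [0, 0] would not survive the round trip.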

In NumPy, you can assign to a selection of rows through an index array, which modifies the original array in place. In your example, you can do this:
import numpy as np
nodes = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
node_idx = np.array([0, 3])
nodes[node_idx] = np.array([[1, 5], [2, 5]])
nodes
Output:
array([[1, 5],
       [2, 3],
       [3, 4],
       [2, 5],
       [5, 6]])
You can also use a boolean array instead of the index array:
import numpy as np
nodes = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
node_mask = np.array([True, False, False, True, False])
nodes[node_mask] = np.array([[1, 5], [2, 5]])
nodes
Which produces the same result. Of course, this means you can do magic like this:
import numpy as np
nodes = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
nodes[nodes[:, 0] == 3] = [1, 5]
nodes
Which replaces all rows with the first element equal to 3 with [1, 5]. Output:
array([[1, 2],
       [2, 3],
       [1, 5],
       [4, 5],
       [5, 6]])
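Keep in mind that only the assignment form writes into nodes; the same indexing expression on the right-hand side gives you a copy. If you want to work on the selected rows separately and then push the result back, a sketch of that pattern could look like this (the variable name selected is just illustrative):

```python
import numpy as np

nodes = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
node_idx = np.array([0, 3])

selected = nodes[node_idx]      # index array on the right-hand side: a copy
selected *= 10                  # changes only the copy
print(nodes[0])                 # [1 2]

nodes[node_idx] = selected      # assignment writes the rows back into nodes
print(nodes[0], nodes[3])       # [10 20] [40 50]

nodes[node_idx] += 1            # shorthand for the same read-modify-write
print(nodes[0], nodes[3])       # [11 21] [41 51]
```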

Method 1
First, initialize a NumPy array of None with dtype=object. (It doesn't have to be None. My guess is that you cannot put the references in at initialization because NumPy then creates a deep copy of them.)
Then, put the reference into the array.
nodes = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
# nn = np.array([nodes[0], nodes[1]],dtype=object) would not work
nn = np.array([None, None], dtype=object)
nn[0] = nodes[0]
nn[1] = nodes[3]
# Now do some modification.
nn[0][1] = 100
Output of nodes:
array([[  1, 100],
       [  2,   3],
       [  3,   4],
       [  4,   5],
       [  5,   6]])
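# make it a function
def make_ref(old_array, indices):
    # object array of None placeholders, one slot per requested row
    ret = np.array([None for _ in range(len(indices))], dtype=object)
    for i in range(len(indices)):
        ret[i] = old_array[indices[i]]  # basic indexing, so each slot holds a view
    return ret
nn = make_ref(nodes, [0, 3])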
Method 2
If you don't need to put it in Numpy arrays, just use a list to host the references.
nn = [nodes[0], nodes[1]]
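To see what each method actually holds, you can check whether the stored rows share memory with nodes. A small sketch with the same setup; np.shares_memory is standard NumPy, and np.empty is used here only as another way to get an object array of placeholders:

```python
import numpy as np

nodes = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])

# Method 1: object array filled slot by slot with row views
nn_obj = np.empty(2, dtype=object)
nn_obj[0] = nodes[0]
nn_obj[1] = nodes[3]

# Method 2: plain list of row views
nn_list = [nodes[0], nodes[3]]

print(all(np.shares_memory(row, nodes) for row in nn_obj))    # True
print(all(np.shares_memory(row, nodes) for row in nn_list))   # True

nn_list[1][0] = 99       # writes through to nodes[3, 0]
print(nodes[3])          # [99  5]
```

In both cases the container only holds separate one-row views, so vectorized operations over the whole of nn are not available the way they are with a regular 2D array.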

Related

Using Numpy to write a function that returns all rows of A that have completely distinct entries

For example:
For
A = np.array([
    [1, 2, 3],
    [4, 4, 4],
    [5, 6, 6]])
I want to get the output
array([[1, 2, 3]]).
For A = np.arange(9).reshape(3, 3),
I want to get
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
WITHOUT using loops
You can use:
B = np.sort(A, axis=1)
out = A[(B[:,:-1] != B[:, 1:]).all(1)]
Or using pandas:
import pandas as pd
out = A[pd.DataFrame(A).nunique(axis=1).eq(A.shape[1])]
Output:
array([[1, 2, 3]])
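Wrapped as a function and run on both inputs from the question, the sort-based version might look like this (the function name rows_with_distinct_entries is made up here):

```python
import numpy as np

def rows_with_distinct_entries(A):
    """Return the rows of a 2-D array whose entries are all different."""
    B = np.sort(A, axis=1)                        # equal values become neighbours
    keep = (B[:, :-1] != B[:, 1:]).all(axis=1)    # no two neighbours are equal
    return A[keep]

A = np.array([[1, 2, 3],
              [4, 4, 4],
              [5, 6, 6]])
print(rows_with_distinct_entries(A))                           # [[1 2 3]]
print(rows_with_distinct_entries(np.arange(9).reshape(3, 3)))  # all three rows
```

Sorting each row first is what makes the neighbour comparison sufficient: any repeated value ends up next to its duplicate.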

Append an element to Array of Arrays

Suppose I have an array of arrays.
import numpy as np
x = np.array([ [1, 2], [3, 4], [5, 6]])
I want to add 10 as the first element of each of those arrays without running a for loop. The result should look like
array([[10,  1,  2],
       [10,  3,  4],
       [10,  5,  6]])
Plain append does not work.
np.append(10, x)
array([10,  1,  2,  3,  4,  5,  6])
My original problem has 100K arrays. So I need to find an efficient way to do this.
You are looking for np.insert.
https://numpy.org/doc/stable/reference/generated/numpy.insert.html
np.insert(x, 0, [10,10,10], axis=1)
np.insert is your choice
>>> import numpy as np
>>> x = np.array([[1, 2], [3, 4], [5, 6]])
>>> x
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.insert(x, 0, 10, axis=1)
array([[10,  1,  2],
       [10,  3,  4],
       [10,  5,  6]])
You can also insert different values:
>>> np.insert(x, 0, [10, 11, 12], axis=1)
array([[10,  1,  2],
       [11,  3,  4],
       [12,  5,  6]])
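np.insert builds a new array on every call, so for a large input it can be worth comparing it against allocating the column yourself. A sketch of one alternative, assuming the same x; np.full and np.hstack are standard NumPy, but I have not timed this against np.insert:

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])

col = np.full((x.shape[0], 1), 10, dtype=x.dtype)   # column of 10s, one per row
out = np.hstack([col, x])                           # glue it on the left

print(np.array_equal(out, np.insert(x, 0, 10, axis=1)))   # True
```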

Concatenate indices of each Numpy Array in a Matrix

So I have a NumPy array with a bunch of NumPy arrays inside of it. I want to group their elements based on the position in their individual array.
For Example:
If Matrix is:
[[1, 2], [2, 3], [4, 5], [6, 7]]
Then the code should return:
[[1, 2, 4, 6], [2, 3, 5, 7]]
This is because 1, 2, 4, 6 are all the first elements of their individual arrays, and 2, 3, 5, 7 are the second elements of their individual arrays.
Does anyone know a function that could do this? Thanks.
Answer in Python.
Using numpy transpose should do the trick:
a = np.array([[1, 2], [2, 3], [4, 5], [6, 7]])
a_t = a.T
print(a_t)
[[1 2 4 6]
 [2 3 5 7]]
Your data as a list:
In [101]: alist = [[1, 2], [2, 3], [4, 5], [6, 7]]
In [102]: alist
Out[102]: [[1, 2], [2, 3], [4, 5], [6, 7]]
and as a numpy array:
In [103]: arr = np.array(alist)
In [104]: arr
Out[104]:
array([[1, 2],
       [2, 3],
       [4, 5],
       [6, 7]])
A standard idiom for 'transposing' lists is:
In [105]: list(zip(*alist))
Out[105]: [(1, 2, 4, 6), (2, 3, 5, 7)]
With arrays, there's a transpose method:
In [106]: arr.transpose()
Out[106]:
array([[1, 2, 4, 6],
       [2, 3, 5, 7]])
The first array is (4,2) shape; its transpose is (2,4).
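One thing worth knowing is that the transpose does not copy the data; it is a view on the original array, so writes through it show up in arr. A quick check (np.shares_memory is standard NumPy; the name independent is just for this example):

```python
import numpy as np

arr = np.array([[1, 2], [2, 3], [4, 5], [6, 7]])
arr_t = arr.T

print(arr.shape, arr_t.shape)              # (4, 2) (2, 4)
print(np.shares_memory(arr, arr_t))        # True

arr_t[0, 0] = 100                          # also changes arr[0, 0]
print(arr[0, 0])                           # 100

independent = arr.T.copy()                 # use .copy() if you need separate data
print(np.shares_memory(arr, independent))  # False
```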

How can I add a column to a numpy array

How can I add a column containing only "1" to the beginning of a second numpy array?
X = np.array([[1, 2], [3, 4], [5, 6]])
I want to have X become
[[1,1,2], [1,3,4],[1,5,6]]
You can use np.insert:
x = np.array([[1, 2], [3, 4], [5, 6]])
new_x = np.insert(x, 0, 1, axis=1)
Or you can use np.append to attach your array to the right of a column of 1 values:
ones = np.array([[1]] * len(x))
new_x = np.append(ones, x, axis=1)
Both will give you the expected result:
[[1 1 2]
 [1 3 4]
 [1 5 6]]
Try this:
>>> X = np.array([[1, 2], [3, 4], [5, 6]])
>>> X
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.insert(X, 0, 1, axis=1)
array([[1, 1, 2],
       [1, 3, 4],
       [1, 5, 6]])
Since a new array is going to be created in any event, it is sometimes easier to build it yourself from the beginning. Since you want a column of 1's at the beginning, you can use built-in functions and the input array's existing structure and dtype.
a = np.arange(6).reshape(3,2) # input array
z = np.ones((a.shape[0], 3), dtype=a.dtype) # use the row shape and your desired columns
z[:, 1:] = a # place the old array into the new array
z
array([[1, 0, 1],
       [1, 2, 3],
       [1, 4, 5]])
numpy.insert() will do the trick.
X = np.array([[1, 2], [3, 4], [5, 6]])
np.insert(X,0,[1,2,3],axis=1)
The Output will be:
array([[1, 1, 2],
       [2, 3, 4],
       [3, 5, 6]])
Note that the second argument is the index before which you want to insert, and axis=1 indicates that you want to insert a column instead of inserting into the flattened array.
For reference:
numpy.insert()
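Another way to get the same column, if you prefer to think of it as padding the array on the left with a constant, is np.pad. A sketch, assuming the X from the question; np.pad is a standard NumPy function, and the pad_width below is ((before, after) for rows, (before, after) for columns):

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6]])

# no padding on the rows; one column before and none after, filled with 1
padded = np.pad(X, ((0, 0), (1, 0)), mode="constant", constant_values=1)
print(padded)
```

This should print the same three rows as the np.insert call with a scalar 1, but it only covers the case where every new entry has the same value.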

Replicating elements in numpy array

I have a numpy array say
a = array([[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]])
I have an array 'replication' of the same size, where replication[i,j] (>= 0) denotes how many times a[i][j] should be repeated along the row. Obviously, the replication array follows the invariant that np.sum(replication[i]) has the same value for all i.
For example, if
replication = array([[1, 2, 1],
                     [1, 1, 2],
                     [2, 1, 1]])
then the final array after replicating is:
new_a = array([[1, 2, 2, 3],
               [4, 5, 6, 6],
               [7, 7, 8, 9]])
Presently, I am doing this to create new_a:
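# allocate new_a (every row of replication has the same sum)
h = a.shape[0]
w = a.shape[1]
new_a = np.empty((h, replication[0].sum()), dtype=a.dtype)
for row in range(h):
    ll = [[a[row][j]] * replication[row][j] for j in range(w)]
    new_a[row] = np.array([item for sublist in ll for item in sublist])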
However, this seems to be too slow as it involves using lists. Can I do the intended entirely in numpy, without the use of python lists?
You can flatten out your replication array, then use the .repeat() method of a:
import numpy as np
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
replication = np.array([[1, 2, 1],
                        [1, 1, 2],
                        [2, 1, 1]])
new_a = a.repeat(replication.ravel()).reshape(a.shape[0], -1)
print(repr(new_a))
# array([[1, 2, 2, 3],
#        [4, 5, 6, 6],
#        [7, 7, 8, 9]])
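The reshape only works because every row of replication sums to the same value. A sketch that checks this first and then compares the result against a plain Python loop equivalent to the one in the question (the function name replicate_rows is made up):

```python
import numpy as np

def replicate_rows(a, replication):
    """Repeat a[i, j] replication[i, j] times along row i."""
    row_lengths = replication.sum(axis=1)
    if not (row_lengths == row_lengths[0]).all():
        raise ValueError("every row of replication must sum to the same value")
    return a.repeat(replication.ravel()).reshape(a.shape[0], -1)

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
replication = np.array([[1, 2, 1], [1, 1, 2], [2, 1, 1]])
new_a = replicate_rows(a, replication)

# plain Python loops, for comparison only
expected = np.array([[v for v, r in zip(row, reps) for _ in range(r)]
                     for row, reps in zip(a, replication)])
print(np.array_equal(new_a, expected))    # True
```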
