Python numpy array integer indexed flat slice assignment - python

I was experimenting with numpy and found this strange behavior.
This code works OK:
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> a[:, 1].flat[:] = np.array([-1, -1])
>>> a
array([[ 1, -1, 3],
[ 4, -1, 6]])
But why doesn't this code change the elements of columns 0 and 2 to -1?
>>> a[:, [0, 2]].flat[:] = np.array([-1, -1])
>>> a
array([[ 1, -1, 3],
[ 4, -1, 6]])
And how should the code be written so that it changes the elements of columns 0 and 2 to -1, like the first example does?
UPD: using flat (or something similar) is necessary in my example
UPD2: I based the example in this question on the following code:
img = imread(img_name)
xor_mask = np.zeros_like(img, dtype=bool)
# msg_bits looks like array([ True, False, False, ..., False, False, True], dtype=bool)
xor_mask[:, :, channel].flat[:len(msg_bits)] = np.ones_like(msg_bits, dtype=bool)
After the assignment to xor_mask with channel == 0, 1, or 2 the code works OK, but if channel == [1, 2] or something similar, the assignment does not happen.

In the first example the slice a[:, 1] uses basic slicing, so NumPy returns a view rather than a new object; assigning through the flattened slice therefore writes into the original array. But a[:, [0, 2]] uses integer-array (fancy) indexing, which always makes a copy, so the assignment only changes that temporary copy and a is left untouched.
Also, you don't need to flatten your slice to operate on it:
In [5]: a[:, [0, 2]] += 100
In [6]: a
Out[6]:
array([[101, 2, 103],
[104, 5, 106]])
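If flat is not strictly required, a direct fancy-indexed assignment also answers the original question, because it is a single __setitem__ call on a rather than a read into a temporary copy; a minimal sketch:
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
# A single fancy-indexed assignment writes into `a` directly.
a[:, [0, 2]] = -1
print(a)
# [[-1  2 -1]
#  [-1  5 -1]]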

As others have pointed out, .flat on a fancy-indexed slice works on a copy of the original data, so any updates to it are lost. But flattening a 1-D basic slice is fine, so you can use a for loop to update multiple columns.
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
a[:, 1].flat = np.array([-1, -1])
print(a)

# Use a for loop over single columns to avoid copies
for idx in [0, 2]:
    a[:, idx].flat = np.array([-1, -1])
print(a)
Note that you don't need to use flat[:]: just flat is enough (and probably more efficient).

You could just remove the flat[:] from a[:, [0, 2]].flat[:] += 100:
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> a[:, 1].flat[:] += 100
>>> a
array([[ 1, 102, 3],
[ 4, 105, 6]])
>>> a[:, [0, 2]] += 100
>>> a
array([[101, 102, 103],
[104, 105, 106]])
But you say it is necessary... Can't you just reshape whatever you are trying to add to the initial array instead of using flat?
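If flat really is required, as in the xor_mask code from the question, one possible workaround (a sketch using the names from the question, with made-up sizes) is to modify the fancy-indexed copy and then assign it back in one fancy-indexed assignment:
import numpy as np

# Names (xor_mask, msg_bits, channel) follow the question; the sizes here are invented.
xor_mask = np.zeros((4, 4, 3), dtype=bool)
msg_bits = np.array([True, False, True])
channel = [1, 2]

sub = xor_mask[:, :, channel]            # fancy indexing returns a copy
sub.flat[:len(msg_bits)] = True          # modify the copy through .flat
xor_mask[:, :, channel] = sub            # write the modified copy back into the original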
The second indexing call makes a copy of the array while the first returns a view of it:
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> b = a[:,1].flat
>>> b[0] += 100
>>> a
array([[ 1, 102, 3],
[ 4, 5, 6]])
>>> b = a[:, [0, 2]].flat
>>> b[0]
1
>>> b[0] += 100
>>> a
array([[ 1, 102, 3],
[ 4, 5, 6]])
>>> b[:]
array([101, 3, 4, 6])
It appears that when the elements you wish to iterate over in a flat manner are selected with an integer array (fancy indexing) rather than a plain slice, numpy builds the flat iterator over a copy of the data.
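A quick way to check whether a given indexing expression gives a view or a copy is np.may_share_memory; a minimal sketch:
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
print(np.may_share_memory(a, a[:, 1]))       # True: basic slicing returns a view
print(np.may_share_memory(a, a[:, [0, 2]]))  # False: integer-array indexing copies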

Related

Multiply Numpy arrays of different sizes

I have a list of coefficients and a list of times
a = np.array([0,1,2,3])
t = np.array([1,2,3])
I would like to perform some multiplicative operation on the two where each coefficient is multiplied by each of the times to result in an array like:
array([[0, 0, 0],
[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
I can do this with a for loop like:
np.array([i * t for i in a])
However I was wondering whether there was a more efficient numpythonic way of performing this operation without the for loop as in reality I have much bigger arrays and multiple sets of coefficients?
Try this (uses broadcasting).
>>> import numpy as np
>>> a = np.array([0, 1, 2, 3])
>>> t = np.array([1, 2, 3])
>>> res1 = t * a[:, None]
>>> res1
array([[0, 0, 0],
[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
My preferred way is:
a[:, None] * t
But there is also a special method for that:
np.outer(a, t)
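Broadcasting also covers the "multiple sets of coefficients" case mentioned in the question; a minimal sketch with made-up sizes:
import numpy as np

coeffs = np.array([[0, 1, 2, 3],
                   [1, 1, 1, 1]])     # two coefficient sets, shape (2, 4)
t = np.array([1, 2, 3])               # shape (3,)

res = coeffs[:, :, None] * t          # shapes (2, 4, 1) * (3,) -> (2, 4, 3)
print(res.shape)                      # (2, 4, 3)
print(res[0])                         # same values as the single-set result above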

Numpy sort two arrays together with one array as the keys in axis 1 [duplicate]

I'm trying to get the indices to sort a multidimensional array by the last axis, e.g.
>>> a = np.array([[3,1,2],[8,9,2]])
And I'd like indices i such that,
>>> a[i]
array([[1, 2, 3],
[2, 8, 9]])
Based on the documentation of numpy.argsort I thought it should do this, but I'm getting the error:
>>> a[np.argsort(a)]
IndexError: index 2 is out of bounds for axis 0 with size 2
Edit: I need to rearrange other arrays of the same shape (e.g. an array b such that a.shape == b.shape) in the same way... so that
>>> b = np.array([[0,5,4],[3,9,1]])
>>> b[i]
array([[5,4,0],
[1,3,9]])
Solution:
>>> a[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
array([[1, 2, 3],
[2, 8, 9]])
You got it right, though I wouldn't describe it as cheating the indexing.
Maybe this will help make it clearer:
In [544]: i=np.argsort(a,axis=1)
In [545]: i
Out[545]:
array([[1, 2, 0],
[2, 0, 1]])
i is the order that we want, for each row. That is:
In [546]: a[0, i[0,:]]
Out[546]: array([1, 2, 3])
In [547]: a[1, i[1,:]]
Out[547]: array([2, 8, 9])
To do both indexing steps at once, we have to use a 'column' index for the 1st dimension.
In [548]: a[[[0],[1]],i]
Out[548]:
array([[1, 2, 3],
[2, 8, 9]])
Another array that could be paired with i is:
In [560]: j=np.array([[0,0,0],[1,1,1]])
In [561]: j
Out[561]:
array([[0, 0, 0],
[1, 1, 1]])
In [562]: a[j,i]
Out[562]:
array([[1, 2, 3],
[2, 8, 9]])
If i identifies the column for each element, then j specifies the row for each element. The [[0],[1]] column array works just as well because it can be broadcast against i.
I think of
np.array([[0],
[1]])
as 'short hand' for j. Together they define the source row and column of each element of the new array. They work together, not sequentially.
The full mapping from a to the new array is:
[[a[0,1], a[0,2], a[0,0]],
 [a[1,2], a[1,0], a[1,1]]]
def foo(a):
    i = np.argsort(a, axis=1)
    return (np.arange(a.shape[0])[:, None], i)
In [61]: foo(a)
Out[61]:
(array([[0],
[1]]), array([[1, 2, 0],
[2, 0, 1]], dtype=int32))
In [62]: a[foo(a)]
Out[62]:
array([[1, 2, 3],
[2, 8, 9]])
The above answers are now a bit outdated, since new functionality was added in numpy 1.15 to make it simpler; take_along_axis (https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.take_along_axis.html) allows you to do:
>>> a = np.array([[3,1,2],[8,9,2]])
>>> np.take_along_axis(a, a.argsort(axis=-1), axis=-1)
array([[1, 2, 3],
       [2, 8, 9]])
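Since the question's edit also needs a second array b reordered by the same keys, the index array from argsort can simply be reused with take_along_axis; a minimal sketch:
import numpy as np

a = np.array([[3, 1, 2], [8, 9, 2]])
b = np.array([[0, 5, 4], [3, 9, 1]])

order = a.argsort(axis=-1)                     # indices that sort each row of a
print(np.take_along_axis(a, order, axis=-1))   # [[1 2 3] [2 8 9]]
print(np.take_along_axis(b, order, axis=-1))   # [[5 4 0] [1 3 9]]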
I found the answer here, with someone having the same problem. The key is just cheating the indexing to work properly...
>>> a[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
array([[1, 2, 3],
[2, 8, 9]])
You can also use linear indexing, which might be better with performance, like so -
M,N = a.shape
out = b.ravel()[a.argsort(1)+(np.arange(M)[:,None]*N)]
So, a.argsort(1)+(np.arange(M)[:,None]*N) basically are the linear indices that are used to map b to get the desired sorted output for b. The same linear indices could also be used on a for getting the sorted output for a.
Sample run -
In [23]: a = np.array([[3,1,2],[8,9,2]])
In [24]: b = np.array([[0,5,4],[3,9,1]])
In [25]: M,N = a.shape
In [26]: b.ravel()[a.argsort(1)+(np.arange(M)[:,None]*N)]
Out[26]:
array([[5, 4, 0],
[1, 3, 9]])
Runtime tests -
In [27]: a = np.random.rand(1000,1000)
In [28]: b = np.random.rand(1000,1000)
In [29]: M,N = a.shape
In [30]: %timeit b[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
10 loops, best of 3: 133 ms per loop
In [31]: %timeit b.ravel()[a.argsort(1)+(np.arange(M)[:,None]*N)]
10 loops, best of 3: 96.7 ms per loop

Create numpy array within numpy array

I want to create a numpy array within a numpy array. If I do it with plain Python lists, it's something like:
a = [[1, 2], [3, 4]]
a[0][1] = [1, 1, 1]
print(a)
The result is [[1, [1, 1, 1]], [3, 4]]
How can I achieve the same using numpy arrays? The code I have is:
a = np.array([(1, 2, 3),(4, 5, 6)])
b = np.array([1,1,1])
a[0][1] = b
a as created is dtype int. Each element can only be another integer:
In [758]: a = np.array([(1, 2, 3),(4, 5, 6)])
...: b = np.array([1,1,1])
...:
In [759]: a
Out[759]:
array([[1, 2, 3],
[4, 5, 6]])
In [760]: b
Out[760]: array([1, 1, 1])
In [761]: a[0,1]=b
...
ValueError: setting an array element with a sequence.
You can make another dtype of array, one that holds pointers to objects, much as list does:
In [762]: aO = a.astype(object)
In [763]: aO
Out[763]:
array([[1, 2, 3],
[4, 5, 6]], dtype=object)
Now it is possible to replace one of those element pointers with a pointer to b array:
In [765]: aO[0,1]=b
In [766]: aO
Out[766]:
array([[1, array([1, 1, 1]), 3],
[4, 5, 6]], dtype=object)
But as asked in the comments - why do you want/need to do this? What are you going to do with such an array? It is possible to do some numpy math on such an array, but as shown in some recent SO questions, it is hit-or-miss. It is also slower.
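If an object array is really what is wanted, it can also be allocated directly instead of converting an int array; a minimal sketch:
import numpy as np

# Allocate an object array up front, so any cell can hold another array.
aO = np.empty((2, 3), dtype=object)
aO[:] = [[1, 2, 3], [4, 5, 6]]       # fill element-wise with scalars
aO[0, 1] = np.array([1, 1, 1])       # a cell can now point to an array
print(aO)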
As far as I know, you cannot do this. Numpy arrays cannot have entries of varying shape. Your request to make an array like [[1, [1, 1, 1]], [3, 4]] is impossible. However, you could make a numpy array of shape (2, 3, 3) to get
[
  [[1, 0, 0],
   [1, 1, 1],
   [0, 0, 0]],

  [[3, 0, 0],
   [4, 0, 0],
   [0, 0, 0]]
]
Your only option is to pad empty elements with some number (I used 0s above) or use another data structure.

How to add items into a numpy array

I need to accomplish the following task:
from:
a = array([[1,3,4],[1,2,3]...[1,2,1]])
(add one element to each row) to:
a = array([[1,3,4,x],[1,2,3,x]...[1,2,1,x]])
I have tried doing stuff like a[n] = array([1,3,4,x])
but numpy complained of shape mismatch. I tried iterating through a and appending element x to each item, but the changes are not reflected.
Any ideas on how I can accomplish this?
Appending data to an existing array is a natural thing to want to do for anyone with python experience. However, if you find yourself regularly appending to large arrays, you'll quickly discover that NumPy doesn't easily or efficiently do this the way a python list will. You'll find that every "append" action requires re-allocation of the array memory and short-term doubling of memory requirements. So, the more general solution to the problem is to try to allocate arrays to be as large as the final output of your algorithm. Then perform all your operations on sub-sets (slices) of that array. Array creation and destruction should ideally be minimized.
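A minimal sketch of that preallocation pattern, with made-up sizes:
import numpy as np

n_rows, n_cols = 1000, 4
out = np.empty((n_rows, n_cols), dtype=int)   # allocate the final size once
for i in range(n_rows):
    out[i, :3] = [1, 2, 3]                    # write into slices in place
    out[i, 3] = i                             # no per-row re-allocation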
That said, it's often unavoidable, and the functions that do this are:
for 2-D arrays:
np.hstack
np.vstack
np.column_stack
np.row_stack
for 3-D arrays (the above plus):
np.dstack
for N-D arrays:
np.concatenate
import numpy as np
a = np.array([[1,3,4],[1,2,3],[1,2,1]])
b = np.array([10,20,30])
c = np.hstack((a, np.atleast_2d(b).T))
returns c:
array([[ 1, 3, 4, 10],
[ 1, 2, 3, 20],
[ 1, 2, 1, 30]])
One way to do it (may not be the best) is to create another array with the new elements and do column_stack. i.e.
>>> a = np.array([[1,3,4],[1,2,3],[1,2,1]])
>>> a
array([[1, 3, 4],
       [1, 2, 3],
       [1, 2, 1]])
>>> b = np.array([1,2,3])
>>> np.column_stack((a, b))
array([[1, 3, 4, 1],
       [1, 2, 3, 2],
       [1, 2, 1, 3]])
Appending a single scalar can be done a bit more easily than already shown (and without converting to float) by expanding the scalar into a Python list:
import numpy as np
a = np.array([[1,3,4],[1,2,3],[1,2,1]])
x = 10
b = np.hstack((a, [[x]] * len(a)))
returns b as:
array([[ 1, 3, 4, 10],
[ 1, 2, 3, 10],
[ 1, 2, 1, 10]])
Appending a row could be done by:
c = np.vstack((a, [x] * len(a[0])))
returns c as:
array([[ 1, 3, 4],
[ 1, 2, 3],
[ 1, 2, 1],
[10, 10, 10]])
np.insert can also be used for the purpose
import numpy as np
a = np.array([[1, 3, 4],
[1, 2, 3],
[1, 2, 1]])
x = 5
index = 3 # the position for x to be inserted before
np.insert(a, index, x, axis=1)
array([[1, 3, 4, 5],
[1, 2, 3, 5],
[1, 2, 1, 5]])
index can also be a list/tuple
>>> index = [1, 1, 3] # equivalently (1, 1, 3)
>>> np.insert(a, index, x, axis=1)
array([[1, 5, 5, 3, 4, 5],
[1, 5, 5, 2, 3, 5],
[1, 5, 5, 2, 1, 5]])
or a slice
>>> index = slice(0, 3)
>>> np.insert(a, index, x, axis=1)
array([[5, 1, 5, 3, 5, 4],
[5, 1, 5, 2, 5, 3],
[5, 1, 5, 2, 5, 1]])
If x is just a single scalar value, you could try something like this to ensure the correct shape of the array that is being appended/concatenated to the rightmost column of a:
import numpy as np
a = np.array([[1,3,4],[1,2,3],[1,2,1]])
x = 10
b = np.hstack((a,x*np.ones((a.shape[0],1))))
returns b as:
array([[ 1., 3., 4., 10.],
[ 1., 2., 3., 10.],
[ 1., 2., 1., 10.]])
target = []
for line in a.tolist():
    line.append(x)        # list.append modifies the list in place and returns None
    target.append(line)
a = np.array(target)

Numpy - add row to array

How does one add rows to a numpy array?
I have an array A:
A = array([[0, 1, 2], [0, 2, 0]])
I wish to add rows to this array from another array X if the first element of each row in X meets a specific condition.
Numpy arrays do not have a method 'append' like that of lists, or so it seems.
If A and X were lists I would merely do:
for i in X:
    if i[0] < 3:
        A.append(i)
Is there a numpythonic way to do the equivalent?
Thanks,
S ;-)
You can do this:
newrow = [1, 2, 3]
A = numpy.vstack([A, newrow])
What is X? If it is a 2D-array, how can you then compare its row to a number: i < 3?
EDIT after OP's comment:
A = array([[0, 1, 2], [0, 2, 0]])
X = array([[0, 1, 2], [1, 2, 0], [2, 1, 2], [3, 2, 0]])
add to A all rows from X where the first element < 3:
import numpy as np
A = np.vstack((A, X[X[:,0] < 3]))
# returns:
array([[0, 1, 2],
[0, 2, 0],
[0, 1, 2],
[1, 2, 0],
[2, 1, 2]])
This question was asked seven years ago; with the versions I am using (numpy 1.13, Python 3), I do the same thing when adding a row to a matrix. Remember to put double brackets around the second argument, otherwise np.append will raise a dimension error.
Here I am adding to the matrix A
1 2 3
4 5 6
the row
7 8 9
The same usage works with np.r_ (shown as the second variant below):
A = [[1, 2, 3], [4, 5, 6]]
np.append(A, [[7, 8, 9]], axis=0)
>> array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
#or
np.r_[A,[[7,8,9]]]
In case someone is interested: if you would like to add a column,
new_array = np.c_[A, np.zeros(len(A))]  # len(A) is A's row count
following what we did before on matrix A, adding a column to it
np.c_[A, [2,8]]
>> array([[1, 2, 3, 2],
[4, 5, 6, 8]])
If you want to prepend, you can just flip the order of the arguments (note that np.r_ uses square brackets), i.e.:
np.r_[[[7, 8, 9]], A]
>> array([[7, 8, 9],
[1, 2, 3],
[4, 5, 6]])
If no calculations are necessary after every row, it's much quicker to add rows in python, then convert to numpy. Here are timing tests using python 3.6 vs. numpy 1.14, adding 100 rows, one at a time:
import numpy as np
from time import perf_counter, sleep
def time_it():
    # Compare performance of two methods for adding rows to a numpy array
    py_array = [[0, 1, 2], [0, 2, 0]]
    py_row = [4, 5, 6]
    numpy_array = np.array(py_array)
    numpy_row = np.array([4, 5, 6])
    n_loops = 100

    start_clock = perf_counter()
    for count in range(0, n_loops):
        numpy_array = np.vstack([numpy_array, numpy_row])  # 5.8 micros
    duration = perf_counter() - start_clock
    print('numpy 1.14 takes {:.3f} micros per row'.format(duration * 1e6 / n_loops))

    start_clock = perf_counter()
    for count in range(0, n_loops):
        py_array.append(py_row)               # .15 micros
    numpy_array = np.array(py_array)          # 43.9 micros, converted once at the end
    duration = perf_counter() - start_clock
    print('python 3.6 takes {:.3f} micros per row'.format(duration * 1e6 / n_loops))
    sleep(15)
#time_it() prints:
numpy 1.14 takes 5.971 micros per row
python 3.6 takes 0.694 micros per row
So, the simple solution to the original question, from seven years ago, is to use vstack() to add a new row after converting the row to a numpy array. But a more realistic solution should consider vstack's poor performance under those circumstances. If you don't need to run data analysis on the array after every addition, it is better to buffer the new rows to a python list of rows (a list of lists, really), and add them as a group to the numpy array using vstack() before doing any data analysis.
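A minimal sketch of that buffering approach:
import numpy as np

A = np.array([[0, 1, 2], [0, 2, 0]])
buffered_rows = []                        # collect new rows in a plain Python list
for i in range(100):                      # rows arriving one at a time
    buffered_rows.append([i, i + 1, i + 2])
A = np.vstack([A] + buffered_rows)        # a single vstack call before analysis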
You can also do this (the extra brackets make newrow 2-D so the dimensions match):
newrow = [1, 2, 3]
A = numpy.concatenate((A, [newrow]))
import numpy as np
array_ = np.array([[1,2,3]])
add_row = np.array([[4,5,6]])
array_ = np.concatenate((array_, add_row), axis=0)
I use np.vstack, which I find faster. Example:
import numpy as np
input_array=np.array([1,2,3])
new_row= np.array([4,5,6])
new_array=np.vstack([input_array, new_row])
I use numpy.insert(arr, i, the_object_to_be_added, axis) in order to insert object_to_be_added at the i'th row(axis=0) or column(axis=1)
import numpy as np
a = np.array([[1, 2, 3], [5, 4, 6]])
# array([[1, 2, 3],
# [5, 4, 6]])
np.insert(a, 1, [55, 66], axis=1)
# array([[ 1, 55, 2, 3],
# [ 5, 66, 4, 6]])
np.insert(a, 2, [50, 60, 70], axis=0)
# array([[ 1, 2, 3],
# [ 5, 4, 6],
# [50, 60, 70]])
Too old discussion, but I hope it helps someone.
If you can do the construction in a single operation, then something like the vstack-with-fancy-indexing answer is a fine approach. But if your condition is more complicated or your rows come in on the fly, you may want to grow the array. In fact the numpythonic way to do something like this - dynamically grow an array - is to dynamically grow a list:
A = np.array([[1, 2, 3], [4, 5, 6]])
Alist = [r for r in A]
for i in range(100):
    newrow = np.arange(3) + i
    if i % 5:
        Alist.append(newrow)
A = np.array(Alist)
del Alist
Lists are highly optimized for this kind of access pattern; you don't have convenient numpy multidimensional indexing while in list form, but for as long as you're appending it's hard to do better than a list of row arrays.
You can use numpy.append() to append a row to a numpy array and reshape it to a matrix later on.
import numpy as np
a = np.array([1, 2])
a = np.append(a, [3, 4])
print(a)
# [1 2 3 4]
# in your example
A = [1, 2]
for row in X:
    A = np.append(A, row)
