Can't append numpy arrays after for loop? - python

After a for loop, I cannot collect the array from each iteration into a single array:
In:
for a in l:
    arr = np.asarray(a_lis)
    print(arr)
How can I append the three arrays produced by the loop and return them as a single array, like this?
[[ 0.55133 0.58122 0.66129032 0.67562724 0.69354839 0.70609319
0.6702509 0.63799283 0.61827957 0.6155914 0.60842294 0.60215054
0.59946237 0.625448 0.60215054 0.60304659 0.59856631 0.59677419
0.59408602 0.61021505]
[ 0.58691756 0.6784946 0.64964158 0.66397849 0.67114695 0.66935484
0.67293907 0.66845878 0.65143369 0.640681 0.63530466 0.6344086
0.6281362 0.6281362 0.62634409 0.6281362 0.62903226 0.63799283
0.63709677 0.6978495]
[ 0.505018 0.53405018 0.59408602 0.65143369 0.66577061 0.66487455
0.65412186 0.64964158 0.64157706 0.63082437 0.62634409 0.6218638
0.62007168 0.6648746 0.62096774 0.62007168 0.62096774 0.62007168
0.62275986 0.81362 ]]
I tried to append as a list, using numpy's append, merge, and hstack. None of them worked. Any idea of how to get the previous output?

Use numpy.concatenate to join the arrays:
import numpy as np
a = np.array([[1, 2, 3, 4]])
b = np.array([[5, 6, 7, 8]])
arr = np.concatenate((a, b), axis=0)
print(arr)
# [[1 2 3 4]
# [5 6 7 8]]
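The question mentions that hstack did not give the expected result. A small sketch of why, assuming each loop iteration yields a 1-D array (the arrays `a` and `b` here are made-up examples): joining 1-D arrays with concatenate or hstack produces one long vector, while vstack or stack turns each input into a row.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# 1-D inputs: concatenate and hstack both give one long vector.
joined = np.concatenate((a, b))
side_by_side = np.hstack((a, b))

# vstack (or np.stack with axis=0) makes each input one row.
as_rows = np.vstack((a, b))
stacked = np.stack((a, b), axis=0)

print(joined.shape, side_by_side.shape)
print(as_rows.shape, stacked.shape)
```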
Edit1: To do it inside the loop (as mentioned in the comment) you can use numpy.vstack:
import numpy as np
for i in range(0, 3):
    a = np.random.randint(0, 10, size=4)
    if i == 0:
        arr = a
    else:
        arr = np.vstack((arr, a))
print(arr)
# [[1 1 8 7]
# [2 4 9 1]
# [8 4 7 5]]
Edit2: Citing Iguananaut from the comments:
That said, using concatenate repeatedly can be costly. If you know the
size of the output in advance it's better to pre-allocate an array and
fill it as you go.
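The advice in that comment can be sketched in two ways: collect the rows in a list and stack once, or pre-allocate and fill. This is only an illustration under assumptions: `make_row` is a hypothetical stand-in for whatever produces each 1-D array in the loop, and the number of rows and columns is assumed to be known in advance.

```python
import numpy as np

def make_row(i, n_cols=4):
    # Hypothetical stand-in for the per-iteration computation.
    return np.arange(n_cols) + 10 * i

n_rows, n_cols = 3, 4

# Option 1: collect rows in a Python list and stack once after the loop.
rows = []
for i in range(n_rows):
    rows.append(make_row(i, n_cols))
stacked = np.vstack(rows)

# Option 2: pre-allocate the output and fill it row by row.
prealloc = np.empty((n_rows, n_cols), dtype=int)
for i in range(n_rows):
    prealloc[i] = make_row(i, n_cols)

print(stacked)
print(np.array_equal(stacked, prealloc))
```

Both options copy each row only once, whereas calling vstack inside the loop copies the growing array on every iteration.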

Related

In python numpy, how to replace some rows in array A with array B if we know the index

In python numpy, how to replace some rows in array A with array B if we know the index.
For example
we have
a = np.array([[1,2],[3,4],[5,6]])
b = np.array([[10,10],[1000, 1000]])
index = [0,2]
I want to change a to
a = np.array([[10,10],[3,4],[1000,1000]])
I have considered the function np.where, but it needs a boolean condition, which is not very convenient.
I would do it the following way:
import numpy as np
a = np.array([[1,2],[3,4],[5,6]])
b = np.array([[10,10],[1000, 1000]])
index = [0,2]
a[index] = b
print(a)
gives output
[[ 10 10]
[ 3 4]
[1000 1000]]
You can use :
a[index] = b
For example :
import numpy as np
a = np.array([[1,2],[3,4],[5,6]])
b = np.array([[10,10],[1000, 1000]])
index = [0,2]
a[index] = b
print(a)
Result :
[[ 10 10]
[ 3 4]
[1000 1000]]
In Python's NumPy library, you can replace some rows in array A with array B by assigning to the row indices. Note that numpy.put() is not suitable for this: it treats the indices as positions in the flattened array, so it overwrites single elements rather than whole rows. Here's an example:
import numpy as np
# Initialize array A
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Initialize array B
B = np.array([[10, 20, 30], [40, 50, 60]])
# Indices of the rows to be replaced in array A
indices = [0, 1]
# Replace rows in array A with rows in array B
A[indices] = B
print(A)
In this example, the first two rows in array A are replaced with the two rows of array B, so the output will be
[[10 20 30]
 [40 50 60]
 [ 7  8  9]]
Simply a[indices] = b. Note that np.put(a, indices, b) is not equivalent: it indexes the flattened array, so it replaces individual elements rather than rows.
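A quick way to see how the two calls mentioned in this thread behave is to run both on copies of the question's arrays. The sketch below is only an illustration; the names `rows_replaced` and `flat_put` are introduced here and are not from the original posts.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[10, 10], [1000, 1000]])
index = [0, 2]

# Row replacement with index assignment.
rows_replaced = a.copy()
rows_replaced[index] = b

# np.put treats its indices as positions in the flattened array,
# so the same call overwrites single elements, not rows.
flat_put = a.copy()
np.put(flat_put, index, b)

print(rows_replaced)
print(flat_put)
```

Only the index assignment produces the array the question asks for; np.put changes the elements at flat positions 0 and 2.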

Numpy python - calculating sum of columns from irregular dimension

I have a multi-dimensional array of scores, and I need the sum of each column at the third level in Python. I am using NumPy to achieve this.
import numpy as np
Data is something like:
score_list = [
[[1,1,3], [1,2,5]],
[[2,7,5], [4,1,3]]
]
This should return:
[[3 8 8] [5 3 8]]
Which is happening correctly using this:
np_array = np.array(score_list)
sum_array = np_array.sum(axis=0)
print(sum_array)
However, if I have irregular shape like this:
score_list = [
[[1,1], [1,2,5]],
[[2,7], [4,1,3]]
]
I expect it to return:
[[3 8] [5 3 8]]
However, it comes up with a warning and the return value is:
[list([1, 1, 2, 7]) list([1, 2, 5, 4, 1, 3])]
How can I get expected result?
NumPy will try to cast the list into an n-dimensional array, which fails for ragged input. Instead, consider passing each group of sublists individually using zip.
score_list = [
[[1,1], [1,2,5]],
[[2,7], [4,1,3]]
]
import numpy as np
res = [np.sum(x,axis=0) for x in zip(*score_list)]
print(res)
[array([3, 8]), array([5, 3, 8])]
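To make the zip step more explicit, here is a small sketch of what `zip(*score_list)` produces before the sums are taken. The variable names `groups` and `sums` are introduced here for illustration only.

```python
import numpy as np

score_list = [
    [[1, 1], [1, 2, 5]],
    [[2, 7], [4, 1, 3]],
]

# zip(*score_list) groups the sublists that share a position:
# ([1, 1], [2, 7]) and ([1, 2, 5], [4, 1, 3]).
groups = list(zip(*score_list))

# Each group is rectangular, so it can become a regular 2-D array.
sums = [np.asarray(group).sum(axis=0) for group in groups]

print(groups)
print([s.tolist() for s in sums])
```

This works because the raggedness is only between positions; within one position, every sublist has the same length.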
Here is one solution for doing this, keep in mind that it doesn't use numpy and will be very inefficient for larger matrices (but for smaller matrices runs just fine).
# Create matrix
score_list = [
[[1,1,3], [1,2,5]],
[[2,7,5], [4,1,3]]
]
# Get each row
for i in range(1, len(score_list)):
    # Get each list within the row
    for j in range(len(score_list[i])):
        # Get each value in each list
        for k in range(len(score_list[i][j])):
            # Add current value to the same index
            # on the first row
            score_list[0][j][k] += score_list[i][j][k]
print(score_list[0])
There is bound to be a better solution but this is a temporary fix for you :)
Edit. Made more efficient
A possible solution:
a = np.vstack([np.array(score_list[x], dtype='object')
               for x in range(len(score_list))])
[np.add(*[x for x in a[:, i]]) for i in range(a.shape[1])]
Another possible solution:
a = sum(score_list, [])
b = [a[x] for x in range(0,len(a),2)]
c = [a[x] for x in range(1,len(a),2)]
[np.add(x[0], x[1]) for x in [b, c]]
Output:
[array([3, 8]), array([5, 3, 8])]

numpy broadcast multiply on condition?

I have two arrays, one of shape arr1.shape = (1000,2) and the other of shape arr2.shape = (100,).
I'd like to somehow multiply arr1[:,1]*arr2 where arr1[:,0] == arr2.index so that I get a final shape of arr_out.shape = (1000,). The first column of arr1 is essentially an id where the following condition holds true: set(arr1[:,0]) == set(i for i in range(0,100)), i.e. there is always at least one value index of arr2 found in arr1[:,0].
I can't quite see how to do this in the numpy library but feel there should be a way using numpy multiply, if there was a way to configure the where condition correctly?
I considered perhaps a dummy index dimension for arr2 might help?
A toy example can be produced with the following code snippet
arr2_length = 100
arr1_length = 1000
arr1 = np.column_stack(
    (np.random.randint(0,arr2_length,(arr1_length)),
     np.random.rand(arr1_length))
)
arr2 = np.random.rand(arr2_length)
# Doesn't work
arr2_b = np.column_stack((
    np.arange(arr2_length),
    np.random.rand(arr2_length)
))
# np.multiply(arr1[:,1],arr2_b[:,1], where=(arr1[:,0]==arr2_b[:,0]))
One sort of solution I had was to leverage a left join in Pandas to broadcast the smaller array to a same-lengthed array and then multiply, as follows:
df = pd.DataFrame(arr1).set_index(0).join(pd.DataFrame(arr2))
arr_out = (df[0]*df[1]).values
But I'd really like to understand if there's a native numpy way of doing this since I feel using dataframe joins for a multiplication isn't a very readable solution and possibly suffers from excess memory overhead.
Thanks for your help.
I believe this does exactly what you want:
indices, values = arr1[:,0].astype(int), arr1[:,1]
arr_out = values * arr2[indices]
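The reason this works is that indexing `arr2` with an integer array returns one entry per index, so the result already has the length of `arr1`. A minimal sketch that checks this against an explicit loop, using small made-up sizes (5 ids, 12 rows) and a seeded generator that are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
arr2 = rng.random(5)
arr1 = np.column_stack((rng.integers(0, 5, 12), rng.random(12)))

# The id column is stored as float after column_stack, so cast it back.
indices = arr1[:, 0].astype(int)
values = arr1[:, 1]

# Integer-array indexing picks one arr2 entry per row of arr1.
arr_out = values * arr2[indices]

# Reference result from an explicit loop.
expected = np.array([v * arr2[int(i)] for i, v in arr1])

print(arr_out.shape)
print(np.allclose(arr_out, expected))
```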
Is this what you're looking for?
import numpy as np
arr1 = np.random.randint(1, 5, (10, 2))
arr2 = np.random.randint(1, 5, (5,))
arr2_sampled = arr2[arr1[:, 0]]
result = arr1[:, 1]*arr2_sampled
Output:
arr1 =
[[4 2]
[3 3]
[2 3]
[3 1]
[2 1]
[2 4]
[1 1]
[4 2]
[4 1]
[3 4]]
arr2 =
[4 1 2 1 2]
arr2_sampled =
[2 1 2 1 2 2 1 2 2 1]
result =
[4 3 6 1 2 8 1 4 2 4]

Python access all entries from one dimension in a multidimensional numpy array

I would like to manipulate the data of all entries of a complicated 3D numpy array. I want the entry at a given X-Y position from every subarray. I know MATLAB can do something like that (using : to mean everything); I indicated that below with DARK[:][1][1], which basically means I want the second entry from the second column in all subarrays. Is there a way to do this in Python?
import numpy
# Creating a dummy variable of the type I deal with (If this looks crappy sorry, the variable actually comes from the output of d = pyfits.getdata()):
a = []
for i in range(3):
    d = numpy.array([[i, 2*i], [3*i, 4*i]])
    a.append(d)
print(a)
# Pseudo code:
print('Second row, second column: ', a[:][1][1])
I expect a result like this:
[array([[ 0, 0],[ 0, 0]]),
array([[ 1, 2],[ 3, 4]]),
array([[ 2, 4],[ 6, 8]])]
Second row, second column: [0, 4, 8]
You can do this using slightly different syntax.
import numpy as np
a = np.arange(27).reshape(3,3,3) # Create a 3x3x3 3d array
print("3d Array:")
print(a)
print("Second Row, Second Column: ", a[:,1,1])
Output:
>>> 3d Array:
[[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]]
[[ 9 10 11]
[12 13 14]
[15 16 17]]
[[18 19 20]
[21 22 23]
[24 25 26]]]
>>> Second Row, Second Column: [ 4 13 22]
Found the solution, thanks Divakar and eeScott:
import numpy as np
# Creating a dummy variable of the type I deal with (If this looks crappy sorry, the variable actually comes from the output of d = pyfits.getdata()):
a = []
for i in range(3):
    d = np.array([[i, 2*i], [3*i, 4*i]])
    a.append(d)
# print variable
print(np.array(a))
print('Second row, second column: ', np.array(a)[:, 1, 1])
# Alternative solution:
a = np.asarray(a)
print(a)
print('Second row, second column: ', a[:,1,1])
Result:
[[[0 0][0 0]]
[[1 2][3 4]]
[[2 4][6 8]]]
Second row, second column: [0 4 8]
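The difference between the pseudocode `a[:][1][1]` and the working `a[:, 1, 1]` can be seen in a short sketch. It rebuilds the dummy array from the question; the names `chained` and `per_subarray` are introduced here for illustration.

```python
import numpy as np

a = np.array([[[i, 2 * i], [3 * i, 4 * i]] for i in range(3)])

# a[:] is just a view of the whole array, so chained indexing
# is the same as a[1][1]: the second row of the second sub-array.
chained = a[:][1][1]

# A single index tuple selects row 1, column 1 from every sub-array.
per_subarray = a[:, 1, 1]

print(chained)
print(per_subarray)
```

Each pair of brackets in the chained form is applied to the result of the previous one, whereas the comma-separated form indexes all three axes at once.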

Operations on 'N' dimensional numpy arrays

I am attempting to generalize some Python code to operate on arrays of arbitrary dimension. The operations are applied to each vector in the array. So for a 1D array, there is simply one operation, for a 2-D array it would be both row and column-wise (linearly, so order does not matter). For example, a 1D array (a) is simple:
b = operation(a)
where 'operation' is expecting a 1D array. For a 2D array, the operation might proceed as
for ii in range(0,a.shape[0]):
    b[ii,:] = operation(a[ii,:])
for jj in range(0,b.shape[1]):
    c[:,jj] = operation(b[:,jj])
I would like to make this general where I do not need to know the dimension of the array beforehand, and not have a large set of if/elif statements for each possible dimension.
Solutions that are general for 1 or 2 dimensions are ok, though a completely general solution would be preferred. In reality, I do not imagine needing this for any dimension higher than 2, but if I can see a general example I will learn something!
Extra information:
I have a matlab code that uses cells to do something similar, but I do not fully understand how it works. In this example, each vector is rearranged (basically the same function as fftshift in numpy.fft). Not sure if this helps, but it operates on an array of arbitrary dimension.
function aout=foldfft(ain)
nd = ndims(ain);
for k = 1:nd
    nx = size(ain,k);
    kx = floor(nx/2);
    idx{k} = [kx:nx 1:kx-1];
end
aout = ain(idx{:});
In Octave, your MATLAB code does:
octave:19> size(ain)
ans =
2 3 4
octave:20> idx
idx =
{
[1,1] =
1 2
[1,2] =
1 2 3
[1,3] =
2 3 4 1
}
and then it uses the idx cell array to index ain. With these dimensions it 'rolls' the size 4 dimension.
For 5 and 6 the index lists would be:
2 3 4 5 1
3 4 5 6 1 2
The equivalent in numpy is:
In [161]: ain=np.arange(2*3*4).reshape(2,3,4)
In [162]: idx=np.ix_([0,1],[0,1,2],[1,2,3,0])
In [163]: idx
Out[163]:
(array([[[0]],
[[1]]]), array([[[0],
[1],
[2]]]), array([[[1, 2, 3, 0]]]))
In [164]: ain[idx]
Out[164]:
array([[[ 1, 2, 3, 0],
[ 5, 6, 7, 4],
[ 9, 10, 11, 8]],
[[13, 14, 15, 12],
[17, 18, 19, 16],
[21, 22, 23, 20]]])
Besides the 0 based indexing, I used np.ix_ to reshape the indexes. MATLAB and numpy use different syntax to index blocks of values.
The next step is to construct [0,1],[0,1,2],[1,2,3,0] with code, which is a straightforward translation.
I can use np.r_ as a short cut for turning 2 slices into an index array:
In [201]: idx=[]
In [202]: for nx in ain.shape:
   .....:     kx = int(np.floor(nx/2.))
   .....:     kx = kx-1
   .....:     idx.append(np.r_[kx:nx, 0:kx])
   .....:
In [203]: idx
Out[203]: [array([0, 1]), array([0, 1, 2]), array([1, 2, 3, 0])]
and pass this through np.ix_ to make the appropriate index tuple:
In [204]: ain[np.ix_(*idx)]
Out[204]:
array([[[ 1, 2, 3, 0],
[ 5, 6, 7, 4],
[ 9, 10, 11, 8]],
[[13, 14, 15, 12],
[17, 18, 19, 16],
[21, 22, 23, 20]]])
In this case, where 2 dimensions don't roll anything, slice(None) could replace those:
In [210]: idx=(slice(None),slice(None),[1,2,3,0])
In [211]: ain[idx]
======================
np.roll does:
indexes = concatenate((arange(n - shift, n), arange(n - shift)))
res = a.take(indexes, axis)
np.apply_along_axis is another function that constructs an index array (and turns it into a tuple for indexing).
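Since np.roll accepts a tuple of shifts together with a tuple of axes, the index-list construction from this answer can be cross-checked against a single np.roll call. This is a minimal sketch, assuming every axis has a length of at least 2 (the start index floor(n/2) - 1 would be negative otherwise); the names `by_index`, `by_roll`, and `shifts` are introduced here.

```python
import numpy as np

ain = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Index lists as built in the answer: start each axis at floor(n/2) - 1.
idx = []
for nx in ain.shape:
    kx = nx // 2 - 1
    idx.append(np.r_[kx:nx, 0:kx])
by_index = ain[np.ix_(*idx)]

# The same rotation expressed as one np.roll call over all axes.
shifts = tuple(-(nx // 2 - 1) for nx in ain.shape)
by_roll = np.roll(ain, shifts, axis=tuple(range(ain.ndim)))

print(shifts)
print(np.array_equal(by_index, by_roll))
```

For the standard FFT centering convention, NumPy also provides np.fft.fftshift and np.fft.ifftshift, which use a different shift amount than the MATLAB function in the question.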
If you are looking for a programmatic way to index the k-th dimension of an n-dimensional array, then numpy.take might help you.
An implementation of foldfft is given below as an example:
In[1]:
import numpy as np
def foldfft(ain):
    result = ain
    nd = len(ain.shape)
    for k in range(nd):
        nx = ain.shape[k]
        kx = (nx+1)//2
        shifted_index = list(range(kx,nx)) + list(range(kx))
        result = np.take(result, shifted_index, k)
    return result
a = np.indices([3,3])
print("Shape of a = ", a.shape)
print("\nStarting array:\n\n", a)
print("\nFolded array:\n\n", foldfft(a))
Out[1]:
Shape of a = (2, 3, 3)
Starting array:
[[[0 0 0]
[1 1 1]
[2 2 2]]
[[0 1 2]
[0 1 2]
[0 1 2]]]
Folded array:
[[[2 0 1]
[2 0 1]
[2 0 1]]
[[2 2 2]
[0 0 0]
[1 1 1]]]
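You could use numpy.ndarray.flat, which allows you to iterate linearly over an n-dimensional numpy array. Note that this applies the operation to single elements rather than to whole vectors. Your code should then look something like this:
b = np.array(x)  # copy, so that x itself is not modified
for i in range(len(x.flat)):
    b.flat[i] = operation(x.flat[i])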
The folks above provided multiple appropriate solutions. For completeness, here is my final solution. In this toy example for the case of 3 dimensions, the function 'ops' replaces the first and last element of a vector with 1.
import numpy as np
def ops(s):
    s[0]=1
    s[-1]=1
    return s
a = np.random.rand(4,4,3)
print('------')
print('Array a')
print(a)
print('------')
for ii in np.arange(a.ndim):
    a = np.apply_along_axis(ops,ii,a)
    print('------')
    print(' Axis',str(ii))
    print(a)
    print('------')
    print(' ')
The resulting 3D array has a 1 in every element on the 'border' with the numbers in the middle of the array unchanged. This is of course a toy example; however ops could be any arbitrary function that operates on a 1D vector.
Flattening the vector will also work; I chose not to pursue that simply because the book-keeping is more difficult and apply_along_axis is the simplest approach.
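The loop over axes can be wrapped in a helper so that the same call works for 1-D, 2-D, or higher-dimensional input. This is a sketch under assumptions, not the poster's exact code: `apply_along_all_axes` and `set_ends_to_one` are hypothetical names, and the 1-D function is expected to return a vector of the same length as its input.

```python
import numpy as np

def apply_along_all_axes(func, arr):
    # Apply a 1-D function along each axis in turn; works for any ndim.
    out = np.asarray(arr)
    for axis in range(out.ndim):
        out = np.apply_along_axis(func, axis, out)
    return out

def set_ends_to_one(vec):
    # Hypothetical 1-D operation: copy first so the input slice is untouched.
    vec = vec.copy()
    vec[0] = 1
    vec[-1] = 1
    return vec

a = np.zeros((4, 4, 3))
result = apply_along_all_axes(set_ends_to_one, a)

print(result[1:-1, 1:-1, 1:-1].ravel())  # interior values stay 0
print(result[0].min(), result[-1].min())  # border faces are all 1
```

Starting from zeros instead of random values makes it easy to see that only the border elements are changed.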
apply_along_axis reference page
