Numpy flatten a nested array using concatenate - python

I have a numpy array with subarrays of different shapes. I was trying to use an iterator to flatten them into a 1D array. Below is the code:
import numpy as np
a=np.array([np.random.rand(1,2),np.random.rand(2,2),np.random.rand(1,4)],dtype=object)
b=np.concatenate(x.ravel for x in a)
This returns an error:
TypeError: The first input argument needs to be a sequence
I am not quite sure what I am doing incorrectly. It works fine when I create a for loop with the same logic and keep concatenating my array recursively. Any help appreciated.
The goal is to flatten the array into a 1D array. (Note that hstack doesn't work because the arrays are of different shapes. flatten doesn't work because it is already a 1D array (of arrays).)

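np.concatenate needs a sequence (a list or tuple) of arrays rather than a generator expression, and ravel has to be called: x.ravel is the method object, while x.ravel() is the flattened array.
b=np.concatenate([x.ravel() for x in a])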
print(b)
array([0.0928126 , 0.26396728, 0.37416516, 0.86079876, 0.3070049 ,
0.86714361, 0.67955231, 0.11715076, 0.34659847, 0.17392114])
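A self-contained version of the fix above, as a minimal sketch. The seeded generator and the names rng, shapes, and parts are assumptions added for reproducibility and are not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility (assumption)
shapes = [(1, 2), (2, 2), (1, 4)]

# Object array holding sub-arrays of different shapes, as in the question.
a = np.empty(len(shapes), dtype=object)
for i, shape in enumerate(shapes):
    a[i] = rng.random(shape)

# Flatten each sub-array, collect the results in a list, then join them.
parts = [x.ravel() for x in a]
b = np.concatenate(parts)

print(b.shape)  # (10,)
print(b.ndim)   # 1
```

Filling a preallocated object array avoids relying on how np.array interprets a list of differently shaped arrays.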

Related

Numpy array dimension conversion

I have a 2-dimensional array whose sub-arrays have different sizes. I expected it to behave as a 2-dimensional array, but ndim reports 1. Is there anything wrong?
import numpy as np
sample_list = [['Section 1','Section 2','Section 3'],['Section 4','Section 5'],['Section 6']]
nd_array = np.array(sample_list, dtype=object)
print(nd_array.ndim)
The output is 1.
However, when I change it to
import numpy as np
sample_list = [['Section 1','Section 2','Section 3'],['Section 4','Section 5','Section 6'],['Section 7','Section 7','Section 7']]
nd_array = np.array(sample_list, dtype=object)
print(nd_array.ndim)
the output is 2, as expected.
There's nothing wrong, except that your first array is not a 2-dimensional array. It's a one-dimensional array with 3 entries, each of which happens to be a different-sized list.
NumPy 2D arrays are always rectangular: every row must have the same length. You'll have to pad the lists in your first example if you want to make it a 2D array.
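The padding suggested above could look like the following sketch. The fill value None and the names width and padded are arbitrary choices for illustration, not something the original answer specifies.

```python
import numpy as np

sample_list = [['Section 1', 'Section 2', 'Section 3'],
               ['Section 4', 'Section 5'],
               ['Section 6']]

# Pad every row to the length of the longest row.
# None is an arbitrary fill value chosen for this sketch.
width = max(len(row) for row in sample_list)
padded = [row + [None] * (width - len(row)) for row in sample_list]

nd_array = np.array(padded, dtype=object)
print(nd_array.ndim)   # 2
print(nd_array.shape)  # (3, 3)
```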

Is there a way to properly format a large numpy array

I have a large 200x100 2d tuple (tuple of tuples) of floats that I'm trying to convert into a numpy array.
I use the usual call:
arr = np.array(tuple_of_floats_2d)
But when I do that, instead of a 2d array I get a 1d array of tuples (arr.shape = (200,)). So then I specify:
arr = np.array(tuple_of_floats_2d, ndmin=2)
And instead of changing to shape (200,100) it changes to the shape (1,200), implying that I simply have an array of 1 tuple of 200 tuples.
So then I try
arr = np.array(tuple_of_floats_2d, dtype=np.float, ndmin=2)
And it still remains as shape (1,200); I also tried making it as a matrix and converting the matrix to an array, but that did not work either.
How can I get it to the proper shape of (200,100)?
Edit:
For reference the tuple is formatted as:
((0,1/6.0,...),(0.0,1/9.0,...),...,(100/3.0+101/3.0,...))
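One common cause of this symptom is that the rows do not all have the same length, in which case NumPy cannot build a rectangular (200, 100) array. Whether that applies here is an assumption, since the full tuple is not shown. A minimal sketch for checking, using a small hypothetical tuple named data in place of the real one:

```python
import numpy as np

# Hypothetical stand-in for the real tuple of tuples;
# the last row is deliberately one element short.
data = ((0.0, 1 / 6.0, 2.0), (0.0, 1 / 9.0, 3.0), (1.0, 2.0))

# Collect the distinct row lengths; a rectangular input has exactly one.
lengths = {len(row) for row in data}

if len(lengths) == 1:
    arr = np.array(data, dtype=float)
    print(arr.shape)
else:
    # Report rows that are shorter than the longest row so they can be fixed or padded.
    expected = max(lengths)
    bad_rows = [i for i, row in enumerate(data) if len(row) != expected]
    print(bad_rows)  # [2]
```

If every row does have the same length, np.array(data, dtype=float) returns the 2D shape directly.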

how to argsort with an array of arrays

I am trying to use argsort on an array of float arrays, but ran into a problem.
Here is what I tried:
import numpy as np
z = np.array([np.array([30.9,29.0,5.87],dtype=float),np.array([20.3,1.3,8.8,4.4],dtype=float)]) # in practice z comes from a tree via root2array; the corresponding branch is a vector<vector<float>>
Index_list = np.argsort(z)
Then I received:
ValueError: operands could not be broadcast together with shapes (4,) (3,)
How should I modify z, or change the way I call argsort, to make this work?
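Because the sub-arrays have different lengths, there is no rectangular array for a single argsort call to operate on. One possible approach, sketched here with the hypothetical names rows and index_list, is to sort the indices of each sub-array independently:

```python
import numpy as np

rows = [np.array([30.9, 29.0, 5.87]), np.array([20.3, 1.3, 8.8, 4.4])]

# Sort the indices of each row on its own; the result is a list of
# index arrays, one per row, each with that row's length.
index_list = [np.argsort(row) for row in rows]

print(index_list[0])  # [2 1 0]
print(index_list[1])  # [1 3 2 0]
```

The same comprehension works if the sub-arrays are held in a 1D object array instead of a list.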

Python List of np arrays to array

I'm trying to turn a list of 2d numpy arrays into a 2d numpy array. For example,
import numpy as np
dat_list = []
for i in range(10):
    dat_list.append(np.zeros([5, 10]))
What I would like to get out of this list is an array of shape (50, 10). However, when I try the following, I get a (10, 5, 10) array.
output = np.array(dat_list)
Thoughts?
You want to stack them:
np.vstack(dat_list)
The accepted answer above is correct for 2D arrays, as you requested. For 3D input arrays, though, vstack() may give a surprising result: it joins the arrays along their existing first axis instead of adding a new one. For those, use np.stack(<list of 3D arrays>, 0).
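The difference between the two functions is easiest to see from the resulting shapes. In this sketch, cubes is a hypothetical list of 3D arrays added for illustration:

```python
import numpy as np

# The 2D case from the question: ten (5, 10) arrays.
dat_list = [np.zeros([5, 10]) for _ in range(10)]
stacked_2d = np.vstack(dat_list)
print(stacked_2d.shape)  # (50, 10)

# Hypothetical 3D inputs: three (2, 5, 10) arrays.
cubes = [np.zeros([2, 5, 10]) for _ in range(3)]
joined = np.vstack(cubes)          # joins along the existing first axis
print(joined.shape)                # (6, 5, 10)
layered = np.stack(cubes, axis=0)  # adds a new leading axis
print(layered.shape)               # (3, 2, 5, 10)
```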
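See https://docs.scipy.org/doc/numpy/reference/generated/numpy.append.html
for details. You can use np.append, but you will want to specify the axis along which to append. Note that this is the NumPy function, not list.append, which takes no axis argument. For an existing 2D array named output with 10 columns:
output = np.append(output, np.zeros([5, 10]), axis=0)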

Convert a list of 2D numpy arrays to one 3D numpy array?

I have a list of several hundred 10x10 arrays that I want to stack together into a single Nx10x10 array. At first I tried a simple
newarray = np.array(mylist)
But that returned with "ValueError: setting an array element with a sequence."
Then I found the online documentation for dstack(), which looked perfect: "...This is a simple way to stack 2D arrays (images) into a single 3D array for processing." Which is exactly what I'm trying to do. However,
newarray = np.dstack(mylist)
tells me "ValueError: array dimensions must agree except for d_0", which is odd because all my arrays are 10x10. I thought maybe the problem was that dstack() expects a tuple instead of a list, but
newarray = np.dstack(tuple(mylist))
produced the same result.
At this point I've spent about two hours searching here and elsewhere to find out what I'm doing wrong and/or how to go about this correctly. I've even tried converting my list of arrays into a list of lists of lists and then back into a 3D array, but that didn't work either (I ended up with lists of lists of arrays, followed by the "setting array element as sequence" error again).
Any help would be appreciated.
newarray = np.dstack(mylist)
should work. For example:
import numpy as np
# Here is a list of five 10x10 arrays:
x = [np.random.random((10,10)) for _ in range(5)]
y = np.dstack(x)
print(y.shape)
# (10, 10, 5)
# To get the shape to be Nx10x10, you could use rollaxis:
y = np.rollaxis(y,-1)
print(y.shape)
# (5, 10, 10)
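As an alternative sketch, np.stack can build the Nx10x10 layout in one step, and np.moveaxis can replace rollaxis when starting from the dstack result. Both are offered here as options rather than as part of the original answer:

```python
import numpy as np

# A list of five 10x10 arrays, as in the example above.
x = [np.random.random((10, 10)) for _ in range(5)]

# Stack along a new leading axis to get N x 10 x 10 directly.
y = np.stack(x, axis=0)
print(y.shape)  # (5, 10, 10)

# Equivalent route via dstack, moving the last axis to the front.
z = np.moveaxis(np.dstack(x), -1, 0)
print(z.shape)  # (5, 10, 10)
print(np.array_equal(y, z))  # True
```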
np.dstack returns a new array. Thus, using np.dstack requires as much additional memory as the input arrays. If you are tight on memory, an alternative to np.dstack that requires less memory is to allocate space for the final array first, and then pour the input arrays into it one at a time.
For example, if you had 58 arrays of shape (159459, 2380), then you could use
y = np.empty((159459, 2380, 58))
for i in range(58):
    # instantiate the input arrays one at a time
    x = np.random.random((159459, 2380))
    # copy x into y
    y[..., i] = x
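The same preallocation idea can target the Nx10x10 layout from the question. This is a minimal sketch with small sizes; load_array is a hypothetical stand-in for however each input array is actually produced.

```python
import numpy as np

n, rows, cols = 5, 10, 10  # small illustrative sizes


def load_array(i):
    """Hypothetical loader; returns a (rows, cols) array filled with i."""
    return np.full((rows, cols), float(i))


# Allocate the final array once, then fill one slice per iteration,
# so only a single input array exists alongside the output at any time.
out = np.empty((n, rows, cols))
for i in range(n):
    out[i] = load_array(i)

print(out.shape)     # (5, 10, 10)
print(out[3, 0, 0])  # 3.0
```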
