Why am I unable to concatenate two arrays in Python?

I have two arrays
>>> array1.shape
(97, 195)
>>> array2.shape
(195,)
>>> array1 = numpy.concatenate((array1, array2), axis=0)
When I perform the concatenate operation, it raises an error:
ValueError: all the input arrays must have same number of dimensions
Is the shape of the second array, (195,), causing the problem?

Just make both have the same dimensions and the same size except along the axis to be concatenated:
np.concatenate((array1, array2[np.newaxis,...]), axis=0)

In order for this to work, you need array2 to actually be 2d.
array1 = numpy.concatenate((array1, array2.reshape((1,195))))
should work

Another easy way to achieve the array concatenation that you’re looking for is to use Numpy’s vstack function as follows:
array1 = np.vstack([array1, array2])
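The three suggestions above can be compared side by side. This is a minimal sketch, assuming placeholder arrays of zeros and ones that only reproduce the shapes from the question (the values are made up):

```python
import numpy as np

# Placeholder data with the shapes from the question (values are made up).
array1 = np.zeros((97, 195))
array2 = np.ones(195)

# Each option turns array2 into a (1, 195) row before stacking along axis 0.
via_newaxis = np.concatenate((array1, array2[np.newaxis, ...]), axis=0)
via_reshape = np.concatenate((array1, array2.reshape((1, 195))), axis=0)
via_vstack = np.vstack([array1, array2])

print(via_newaxis.shape, via_reshape.shape, via_vstack.shape)
```

All three should give the same (98, 195) result, with array2 as the last row.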

Related

bootstrap numpy 2D array

I am trying to sample with replacement a base 2D numpy array with shape of (4,2) by rows, say 10 times. The final output should be a 3D numpy array.
I have tried the code below, and it works. But is there a way to do it without the for loop?
base=np.array([[20,30],[50,60],[70,80],[10,30]])
print(np.shape(base))
nsample=10
tmp=np.zeros((np.shape(base)[0],np.shape(base)[1],10))
for i in range(nsample):
    id_pick = np.random.choice(np.shape(base)[0], size=(np.shape(base)[0]))
    print(id_pick)
    boot1=base[id_pick,:]
    tmp[:,:,i]=boot1
print(tmp)
Here's one vectorized approach -
m,n = base.shape
idx = np.random.randint(0,m,(m,nsample))
out = base[idx].swapaxes(1,2)
The basic idea is that we generate all the random row indices at once with np.random.randint as idx. That would be an array of shape (m,nsample). We use this array to index into the input array along the first axis, so it selects random rows of base and gives an array of shape (m,nsample,n). To get the final output with a shape of (m,n,nsample), we need to swap the last two axes.
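The intermediate shapes in that explanation can be checked directly. A small sketch, assuming a seeded np.random.default_rng generator in place of np.random.randint (that substitution is only for repeatability and is not part of the original answer):

```python
import numpy as np

base = np.array([[20, 30], [50, 60], [70, 80], [10, 30]])
m, n = base.shape
nsample = 10

# Seeded Generator: an assumption made here so the run is repeatable.
rng = np.random.default_rng(0)
idx = rng.integers(0, m, size=(m, nsample))  # one column of row indices per sample

picked = base[idx]            # shape (m, nsample, n)
out = picked.swapaxes(1, 2)   # shape (m, n, nsample)

# Sample k consists of the rows of base selected by column k of idx.
k = 3
same = np.array_equal(out[:, :, k], base[idx[:, k]])
print(picked.shape, out.shape, same)
```

Each slice out[:, :, k] plays the role of boot1 in the loop version from the question.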
You can use the stack function from numpy. Your code would then look like:
base=np.array([[20,30],[50,60],[70,80],[10,30]])
print(np.shape(base))
nsample=10
tmp = []
for i in range(nsample):
    id_pick = np.random.choice(np.shape(base)[0], size=(np.shape(base)[0]))
    print(id_pick)
    boot1=base[id_pick,:]
    tmp.append(boot1)
tmp = np.stack(tmp, axis=-1)
print(tmp)
Based on @Divakar's answer, if you already know the shape of this 2D array, you can treat it as an (8,) 1D array while bootstrapping, and then reshape it (numReps is the number of bootstrap replicates):
# Note: this resamples individual elements, not whole rows.
m, n = base.shape
flatbase = np.reshape(base, (m*n,))
idxs = np.random.choice(m*n, (numReps, m*n))
bootflats = flatbase[idxs]
boots = np.reshape(bootflats, (numReps, m, n))

how to argsort with an array of arrays

I am trying to use argsort on an array of float arrays, but ran into a problem.
Here is what I am trying to do:
import numpy as np
z = np.array([np.array([30.9,29.0,5.87],dtype=float),np.array([20.3,1.3,8.8,4.4],dtype=float)]) # actually z is read from a tree using root2array, and the corresponding branch is a vector<vector<float>>
Index_list = np.argsort(z)
Then I received:
ValueError: operands could not be broadcast together with shapes (4,) (3,)
So what should I do to modify z or change the way of argsort to make it work?
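No answer is recorded here. One possible workaround, offered as a sketch rather than a confirmed solution: because the sub-arrays have different lengths, z is not a regular 2D array, but np.argsort can still be applied to each sub-array separately. The name index_list is only illustrative.

```python
import numpy as np

# Ragged data from the question, kept as a plain list of 1D arrays.
z = [np.array([30.9, 29.0, 5.87]), np.array([20.3, 1.3, 8.8, 4.4])]

# Sort each sub-array independently; the index arrays have different lengths.
index_list = [np.argsort(row) for row in z]
print(index_list)
```

The result is a list of index arrays, one per sub-array, rather than a single 2D index array.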

Subset a 3d numpy array

I have checked the numpy documentation but some of the indexing still eludes me. I have a numpy array whose shape is (40000, 432) and it looks something like:
arr = [[1,2,3......431,432],
[1,2,3......431,432],
[1,2,3......431,432],
....................
[1,2,3......431,432],
[1,2,3......431,432]]
I want to subset each row over a range (i.e. 20-50) so that the shape will be (40000, 30) and it will look like:
subarr = [[20,21,22...48,49,50],
[20,21,22...48,49,50],
[20,21,22...48,49,50],
.....................
[20,21,22...48,49,50]]
Everything I try either returns me an error or gives me the shape (30, 432) which is not what I need. How do I subset a 2d array along the axis I want to?
You want to use numpy slicing:
arr = np.zeros((40000, 432))
subarr = arr[:, 20:50]
print(subarr.shape)
Output
(40000L, 30L)
The L in the shape output indicates that the integer is of Python 2's long type; under Python 3 the output is simply (40000, 30).
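To see exactly which columns a slice keeps, it helps to use an array whose entries equal their own column index. A sketch with a made-up 3-row array standing in for the 40000-row one; note that the stop index of a slice is exclusive, so 20:50 keeps columns 20 to 49:

```python
import numpy as np

# Small stand-in: 3 rows, each row holds the column indices 0..431.
arr = np.tile(np.arange(432), (3, 1))

subarr = arr[:, 20:50]      # all rows, columns 20..49 (stop is exclusive)
inclusive = arr[:, 20:51]   # all rows, columns 20..50

print(subarr.shape, subarr[0, 0], subarr[0, -1])
print(inclusive.shape, inclusive[0, -1])
```

If column 50 itself is wanted, as in the subarr picture in the question, the slice needs to be 20:51 and the result has 31 columns.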

Array stacking/ concatenation error in python

I am trying to concatenate two arrays: a and b, where
a.shape
(1460,10)
b.shape
(1460,)
I tried using hstack and concatenate as:
np.hstack((a,b))
c=np.concatenate(a,b,0)
I am stuck with the error
ValueError: all the input arrays must have same number of dimensions
Please guide me for concatenation and generating array c with dimensions 1460 x 11.
Try
b = np.expand_dims( b,axis=1 )
then
np.hstack((a,b))
or
np.concatenate( (a,b) , axis=1)
will work properly.
np.c_[a, b] concatenates along the last axis.
Per the docs,
... arrays will be stacked along their last axis after
being upgraded to at least 2-D with 1's post-pended to the shape
Since b has shape (1460,) its shape gets upgraded to (1460, 1) before concatenation along the last axis.
In [26]: c = np.c_[a,b]
In [27]: c.shape
Out[27]: (1460, 11)
The most basic operation that works is:
np.concatenate((a,b[:,None]),axis=1)
The [:,None] bit turns b into a (1460,1) array. Now the 1st dimensions of both arrays match, and you can easily concatenate on the 2nd.
There are many ways of adding the 2nd dimension to b, such as reshape and expand_dims. hstack uses atleast_1d, which does not help in this case. atleast_2d adds the None on the wrong side. I strongly advocate learning the [:,None] syntax.
Once the arrays are both 2d and match on the correct dimensions, concatenation is easy.
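The difference between the two ways of adding a dimension is easy to see from the shapes. A minimal sketch, assuming placeholder arrays with the shapes from the question (the values are made up):

```python
import numpy as np

# Placeholder data with the shapes from the question.
a = np.zeros((1460, 10))
b = np.ones(1460)

col = b[:, None]          # (1460, 1): new axis added on the right
row = np.atleast_2d(b)    # (1, 1460): new axis added on the left

# Only the column form lines up with a along the first dimension.
c = np.concatenate((a, col), axis=1)
print(col.shape, row.shape, c.shape)
```

The row form from atleast_2d cannot be concatenated with a along axis 1, because its first dimension is 1 rather than 1460.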

Convert a list of 2D numpy arrays to one 3D numpy array?

I have a list of several hundred 10x10 arrays that I want to stack together into a single Nx10x10 array. At first I tried a simple
newarray = np.array(mylist)
But that returned with "ValueError: setting an array element with a sequence."
Then I found the online documentation for dstack(), which looked perfect: "...This is a simple way to stack 2D arrays (images) into a single 3D array for processing." Which is exactly what I'm trying to do. However,
newarray = np.dstack(mylist)
tells me "ValueError: array dimensions must agree except for d_0", which is odd because all my arrays are 10x10. I thought maybe the problem was that dstack() expects a tuple instead of a list, but
newarray = np.dstack(tuple(mylist))
produced the same result.
At this point I've spent about two hours searching here and elsewhere to find out what I'm doing wrong and/or how to go about this correctly. I've even tried converting my list of arrays into a list of lists of lists and then back into a 3D array, but that didn't work either (I ended up with lists of lists of arrays, followed by the "setting array element as sequence" error again).
Any help would be appreciated.
newarray = np.dstack(mylist)
should work. For example:
import numpy as np
# Here is a list of five 10x10 arrays:
x = [np.random.random((10,10)) for _ in range(5)]
y = np.dstack(x)
print(y.shape)
# (10, 10, 5)
# To get the shape to be Nx10x10, you could use rollaxis:
y = np.rollaxis(y,-1)
print(y.shape)
# (5, 10, 10)
np.dstack returns a new array. Thus, using np.dstack requires as much additional memory as the input arrays. If you are tight on memory, an alternative to np.dstack which requires less memory is to allocate space for the final array first, and then pour the input arrays into it one at a time.
For example, if you had 58 arrays of shape (159459, 2380), then you could use
y = np.empty((159459, 2380, 58))
for i in range(58):
    # instantiate the input arrays one at a time
    x = np.random.random((159459, 2380))
    # copy x into y
    y[..., i] = x
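As a different technique from the dstack answer above, np.stack with axis=0 puts the new axis first and so gives the Nx10x10 layout in one step. A short sketch, assuming a made-up list of five random 10x10 arrays; np.moveaxis is used here instead of rollaxis for the comparison:

```python
import numpy as np

# Made-up input: a list of five 10x10 arrays.
mylist = [np.random.random((10, 10)) for _ in range(5)]

stacked = np.stack(mylist, axis=0)   # new leading axis -> (5, 10, 10)

# Same result as dstack followed by moving the last axis to the front.
via_dstack = np.moveaxis(np.dstack(mylist), -1, 0)
print(stacked.shape, np.array_equal(stacked, via_dstack))
```

Like dstack, np.stack requires every array in the list to have the same shape, so a shape error from either function suggests that at least one element of the list differs from the others.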
