2D numpy array showing as 1D - python

I have a numpy ndarray train_data of length 200, where every row is another ndarray of length 10304.
However when I print np.shape(train_data), I get (200, 1), and when I print np.shape(train_data[0]) I get (1, ), and when I print np.shape(train_data[0][0]) I get (10304, ).
I am quite confused with this behavior as I supposed the first np.shape(train_data) should return (200, 10304).
Can someone explain to me why this is happening, and how I could get the array to have shape (200, 10304)?

This is because the arrays are constructed to be arrays of objects. Basically each element in the array is pointing to another array of size (1, ) which points to another array of size (10304, ). This is not equivalent to a normal ndarray in numpy so the shape is not recognized correctly. You can check this by looking at the dtypes.
To replicate what you see:
import numpy as np
arr = np.empty(200, dtype='object')
for i in range(200):
    temp_arr = np.empty(1, dtype='object')
    temp_arr[0] = np.zeros(10304)
    arr[i] = temp_arr
print(arr.shape)
print(arr[0].shape)
print(arr[0][0].shape)
(200,)
(1,)
(10304,)
To get the (200, 10304) array back you need to "unpack" them:
new_arr = np.array([x[0] for x in arr])
#(200, 10304)
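The question reports an outer shape of (200, 1) rather than (200,). Here is a minimal sketch of that variant, assuming train_data is a (200, 1) object array whose single column holds 1-D float arrays. The construction below is a hypothetical stand-in, not the asker's actual loading code:

```python
import numpy as np

# Hypothetical stand-in for the asker's data: a (200, 1) object array
# whose only column holds float arrays of length 10304.
train_data = np.empty((200, 1), dtype=object)
for i in range(200):
    train_data[i, 0] = np.zeros(10304)

print(train_data.shape, train_data.dtype)  # outer array has dtype object
print(train_data[0].shape)                 # one row of the object array
print(train_data[0][0].shape)              # the actual float data

# Unpack: take the single element of every row and stack them.
fixed = np.stack([row[0] for row in train_data])
print(fixed.shape, fixed.dtype)
```

The same unpacking idea applies as in the answer above; np.stack is used here only because it fails loudly if the inner arrays do not all have the same shape.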

I'm not sure why that's happening, try reshaping the array:
B = np.reshape(A, (-1, 2))
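Note that reshaping alone may not help if the data is an object array of arrays: np.reshape only rearranges the outer elements and leaves the dtype as object. A small sketch, assuming A is a hypothetical (200,) object array of short float arrays:

```python
import numpy as np

# Hypothetical stand-in: 200 outer elements, each a float array of length 4.
A = np.empty(200, dtype=object)
for i in range(200):
    A[i] = np.zeros(4)

B = np.reshape(A, (-1, 2))
print(B.shape)     # (100, 2): only the outer layout changed
print(B.dtype)     # object: the inner arrays are still separate objects
print(B[0, 0].shape)
```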

Related

Trouble using np.append with 2d array

So I'm trying to append two np arrays together, but it gives me this error: ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s). I know that this means the shapes of the arrays are not the same, but I don't understand why or how to fix it.
arr1 = np.array([
    [10.24217065, 5.63381577],
    [ 2.71521988, -3.33068004],
    [-3.43022486, 16.40921457],
    [ 1.4461307, 12.59851726],
    [12.34829023, 29.67531647],
    [16.65382971, 9.8915765 ]])
arr2 = np.array([4.62643996, 5.14587112])
arr3 = np.append(arr1,arr2,axis=0)
Simply make them the same dimension:
arr3 = np.append(arr1, [arr2], axis=0)
arr2 has only a single dimension, since its shape is (2,). arr1 on the other hand has two dimensions, since its shape is (6, 2). These aren't compatible for np.append, as it says.
You can make arr2 have the required number of dimensions in many ways. One of them is reshaping:
arr3 = np.append(arr1, arr2.reshape(1, 2), axis=0)
At this point, the arrays have shape (6, 2) and (1, 2), which np.append knows how to deal with. The output will have shape (7, 2).
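The dimension mismatch and the reshape fix can be checked directly. This is a small sketch that uses placeholder zeros for arr1 (an assumption; only the shapes matter here):

```python
import numpy as np

arr1 = np.zeros((6, 2))                    # placeholder values, shape (6, 2)
arr2 = np.array([4.62643996, 5.14587112])  # shape (2,)

print(arr1.ndim, arr2.ndim)  # 2 and 1: the mismatch np.append complains about

row = arr2.reshape(1, 2)     # same data, now shape (1, 2)
arr3 = np.append(arr1, row, axis=0)
print(arr3.shape)            # (7, 2)
```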
The error message tells you exactly what the problem is. The first array has two dimensions and the second array has one dimension. Another pair of [ ] around the second array will do the job.
arr2 = np.array([[4.62643996, 5.14587112]])
arr3 = np.vstack((arr1, arr2))
or, if you really want to use append, my favorite (with the original 1-D arr2) is
arr3 = np.append(arr1, arr2[np.newaxis, :], axis=0)
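One caveat with np.append: when axis is omitted, both inputs are flattened before being joined, so the result is 1-D. The sketch below compares the variants, again with placeholder zeros for arr1 and a 1-D arr2:

```python
import numpy as np

arr1 = np.zeros((6, 2))
arr2 = np.array([4.62643996, 5.14587112])  # 1-D, shape (2,)

flat = np.append(arr1, arr2[np.newaxis, :])          # no axis: inputs are flattened
rows = np.append(arr1, arr2[np.newaxis, :], axis=0)  # axis=0: appended as a new row
stacked = np.vstack((arr1, arr2))                    # vstack promotes the 1-D arr2 itself

print(flat.shape)     # (14,)
print(rows.shape)     # (7, 2)
print(stacked.shape)  # (7, 2)
```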

How To ReShape a Numpy Array in Python

I have a numpy array of images with the shape of (5879,). Inside every index of the numpy array, I have the Pixels of the image with a shape of (640,640,3).
I want to reshape the complete array in such a way that the shape of the numpy array becomes (5879,640,640,3).
Please check whether the code below works for you:
import numpy as np
b = np.array([5879])
b.shape
output (1,)
a = np.array([[640],[640],[3]])
a = a.reshape((a.shape[0], 1))
a.shape
output (3, 1)
c = np.concatenate((a,b[:,None]),axis=0)
c.shape
Output:
(4, 1)
np.concatenate((a,b[:,None]),axis=0)
output
array([[ 640],
[ 640],
[ 3],
[5879]])
You want to stack your images along the first axis, into a 4D array. However, your images are all 3D.
So, first you need to add a leading singleton dimension to all images, and then to concatenate them along this axis:
imgs = [i_[None, ...] for i_ in orig_images] # add singleton dim to all images
x = np.concatenate(imgs, axis=0) # stack along the first axis
Edit:
Based on Mad Physicist's comment, np.stack seems more appropriate here, since it takes care of adding the leading singleton dimension for you:
x = np.stack(orig_images, axis=0)
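A scaled-down sketch of both approaches, assuming orig_images is a 1-D object array of equally shaped images. Here 5 tiny (4, 4, 3) images stand in for the 5879 images of shape (640, 640, 3):

```python
import numpy as np

# Hypothetical stand-in for the asker's data: an object array of 5 small images.
orig_images = np.empty(5, dtype=object)
for i in range(5):
    orig_images[i] = np.full((4, 4, 3), i, dtype=np.uint8)

print(orig_images.shape)  # (5,)

# Option 1: add a leading axis to each image, then concatenate.
x_cat = np.concatenate([img[None, ...] for img in orig_images], axis=0)

# Option 2: let np.stack create the new leading axis.
x_stack = np.stack(list(orig_images), axis=0)

print(x_cat.shape, x_stack.shape)  # (5, 4, 4, 3) (5, 4, 4, 3)
```

Both calls require every image to have the same shape; if one image differs, they raise an error rather than silently producing an object array.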

About NumPy: a = np.array([1,2,3,4]), print a.shape[0]. Why does it output 4?

import numpy as np
a = np.array([1,2,3,4])
print(a.shape[0])
Why does it output 4?
The array [1,2,3,4] should have 1 row, I think, so can someone explain the reason to me?
Because
print(a.shape) # -> (4,)
What you expect (or want?) to have is
a = np.array([[1],[2],[3],[4]])
print(a.shape) # -> (4, 1)
or rather (?)
a = np.array([[1, 2 , 3 , 4]])
print(a.shape) # -> (1, 4)
If you print a.ndim you'll get 1. That means that a is a one-dimensional array (it has rank 1 in numpy terminology) with a single axis of length 4. That is different from a 2D matrix with a single row or column (rank 2).
More on ranks
Related questions:
numpy: 1D array with various shape
Python: Differentiating between row and column vectors
The shape attribute for numpy arrays returns the dimensions of the array. If a has n rows and m columns, then a.shape is (n,m). So a.shape[0] is n and a.shape[1] is m.
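A short sketch of this, which also shows what happens for the 1-D array from the question, where there is no second axis to ask for:

```python
import numpy as np

m = np.zeros((3, 5))           # 3 rows, 5 columns
print(m.shape[0], m.shape[1])  # 3 5

a = np.array([1, 2, 3, 4])     # 1-D: shape is (4,), so only shape[0] exists
print(a.shape[0])              # 4
try:
    a.shape[1]
except IndexError as exc:
    print("IndexError:", exc)  # the shape tuple has just one entry
```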
The shape attribute of a numpy array returns the dimensions of the array. So, when you create an array using
a = np.array([1,2,3,4])
you get a one-dimensional array with 4 elements. You can check this by printing the shape:
print(a.shape) #(4,)
So, what you get is NOT a 1x4 matrix. If you want that, do
a = np.array([1,2,3,4]).reshape((1,4))
print(a.shape)
Or even better,
a = np.array([[1,2,3,4]])
a = np.array([1, 2, 3, 4])
By doing this, you get a as an ndarray, and it is a one-dimensional array. Here, the shape (4,) means the array is indexed by a single index which runs from 0 to 3. This is different from multi-dimensional arrays.
For more help, see: Difference between numpy.array shape (R, 1) and (R,).
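To make the difference between (4,), (1, 4) and (4, 1) concrete, here is a small sketch that builds the row and column versions from the same 1-D array with np.newaxis:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
row = a[np.newaxis, :]  # shape (1, 4): one row, four columns
col = a[:, np.newaxis]  # shape (4, 1): four rows, one column

print(a.ndim, a.shape)      # 1 (4,)
print(row.ndim, row.shape)  # 2 (1, 4)
print(col.ndim, col.shape)  # 2 (4, 1)

# The same element needs one index in 1-D and two indices in 2-D.
print(a[2], row[0, 2], col[2, 0])  # 3 3 3
```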

Subset a 3d numpy array

I have checked the numpy documentation but some of the indexing still eludes me. I have a numpy array such that its shape is (40000, 432) and it looks something like:
arr = [[1,2,3......431,432],
[1,2,3......431,432],
[1,2,3......431,432],
....................
[1,2,3......431,432],
[1,2,3......431,432]]
I want to subset each row over a range (i.e. 20-50) so that the shape will be (40000, 30) and it will look like:
subarr = [[20,21,22...48,49,50],
[20,21,22...48,49,50],
[20,21,22...48,49,50],
.....................
[20,21,22...48,49,50]]
Everything I try either returns me an error or gives me the shape (30, 432) which is not what I need. How do I subset a 2d array along the axis I want to?
You want to use numpy slicing:
arr = np.zeros((40000, 432))
subarr = arr[:, 20:50]
print(subarr.shape)
Output
(40000L, 30L)
The L in the shape output indicates that the integer is of the Python 2 type long. On Python 3 the same code prints (40000, 30).
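The (30, 432) shape mentioned in the question is what you get when the slice lands on the first axis instead of the second. A scaled-down sketch, assuming 100 rows instead of 40000 and rows holding the values 1 to 432 as in the question:

```python
import numpy as np

# Small stand-in: 100 rows, each holding the values 1..432.
arr = np.tile(np.arange(1, 433), (100, 1))

cols = arr[:, 20:50]  # all rows, column indices 20..49
rows = arr[20:50]     # rows 20..49, all columns
print(cols.shape)     # (100, 30)
print(rows.shape)     # (30, 432)

# Indices are 0-based and the stop index is exclusive, so with this data
# columns 20..49 hold the values 21..50.
print(cols[0, 0], cols[0, -1])  # 21 50

# To get the values 20..50 inclusive, start one index earlier (31 columns).
inclusive = arr[:, 19:50]
print(inclusive.shape)          # (100, 31)
```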

Collapse nested array of arrays

I want to take an array with shape (N,) and dtype=object, whose elements are arrays that all have the same shape (call it shape), and create an array with shape == (N,) + shape. I was wondering if anyone knows the best way to do this. Here's an example.
import numpy as np
array = np.empty(4, dtype=object)
array[:] = [np.ones([3, 2])]
array = np.array(array.tolist())
print(array.dtype)
# float64
print(array.shape)
# (4, 3, 2)
If you already know the shape of your inner arrays (here, (3,2)), you could simplify the whole process as
subshape = (3,2)
a = np.empty(tuple([N,]+list(subshape)), dtype=object)
a[:] = np.ones(subshape)
That will let you avoid unnecessary conversions to/from lists.
Now, assuming you have a (N,) object array a where each element is a subshape float array, you could do:
a = np.vstack(a)
a.shape = [N,] + list(subshape)
or more simply:
a = np.array(a.tolist(), dtype=float)
The .tolist conversion might not be very efficient, though.
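As a rough comparison, the sketch below collapses the same object array three ways. np.stack is not mentioned in the answer above; it is included here as an assumed alternative that skips the list-of-lists round trip. The values N = 4 and subshape = (3, 2) are taken from the question's example:

```python
import numpy as np

N, subshape = 4, (3, 2)

# Build an (N,) object array whose elements are float arrays of shape subshape.
a = np.empty(N, dtype=object)
for i in range(N):
    a[i] = np.full(subshape, float(i))

via_stack = np.stack(list(a))                             # new leading axis
via_tolist = np.array(a.tolist(), dtype=float)            # through Python lists
via_vstack = np.vstack(list(a)).reshape((N,) + subshape)  # (12, 2), then reshape

print(via_stack.shape, via_stack.dtype)  # (4, 3, 2) float64
print(np.array_equal(via_stack, via_tolist))
print(np.array_equal(via_stack, via_vstack))
```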
