Convert a 1D array to a 2D array in numpy - python

I want to convert a 1-dimensional array into a 2-dimensional array by specifying the number of columns in the 2D array. Something that would work like this:
> import numpy as np
> A = np.array([1,2,3,4,5,6])
> B = vec2matrix(A,ncol=2)
> B
array([[1, 2],
[3, 4],
[5, 6]])
Does numpy have a function that works like my made-up function "vec2matrix"? (I understand that you can index a 1D array like a 2D array, but that isn't an option in the code I have - I need to make this conversion.)

You want to reshape the array.
B = np.reshape(A, (-1, 2))
where -1 infers the size of the new dimension from the size of the input array.

You have two options:
If you no longer want the original shape, the easiest is just to assign a new shape to the array
a.shape = (a.size//ncols, ncols)
You can switch the a.size//ncols by -1 to compute the proper shape automatically. Make sure that a.shape[0]*a.shape[1]=a.size, else you'll run into some problem.
You can get a new array with the np.reshape function, that works mostly like the version presented above
new = np.reshape(a, (-1, ncols))
When it's possible, new will be just a view of the initial array a, meaning that the data are shared. In some cases, though, new array will be acopy instead. Note that np.reshape also accepts an optional keyword order that lets you switch from row-major C order to column-major Fortran order. np.reshape is the function version of the a.reshape method.
If you can't respect the requirement a.shape[0]*a.shape[1]=a.size, you're stuck with having to create a new array. You can use the np.resize function and mixing it with np.reshape, such as
>>> a =np.arange(9)
>>> np.resize(a, 10).reshape(5,2)

Try something like:
B = np.reshape(A,(-1,ncols))
You'll need to make sure that you can divide the number of elements in your array by ncols though. You can also play with the order in which the numbers are pulled into B using the order keyword.

If your sole purpose is to convert a 1d array X to a 2d array just do:
X = np.reshape(X,(1, X.size))

convert a 1-dimensional array into a 2-dimensional array by adding new axis.
a=np.array([10,20,30,40,50,60])
b=a[:,np.newaxis]--it will convert it to two dimension.

There is a simple way as well, we can use the reshape function in a different way:
A_reshape = A.reshape(No_of_rows, No_of_columns)

You can useflatten() from the numpy package.
import numpy as np
a = np.array([[1, 2],
[3, 4],
[5, 6]])
a_flat = a.flatten()
print(f"original array: {a} \nflattened array = {a_flat}")
Output:
original array: [[1 2]
[3 4]
[5 6]]
flattened array = [1 2 3 4 5 6]

some_array.shape = (1,)+some_array.shape
or get a new one
another_array = numpy.reshape(some_array, (1,)+some_array.shape)
This will make dimensions +1, equals to adding a bracket on the outermost

Change 1D array into 2D array without using Numpy.
l = [i for i in range(1,21)]
part = 3
new = []
start, end = 0, part
while end <= len(l):
temp = []
for i in range(start, end):
temp.append(l[i])
new.append(temp)
start += part
end += part
print("new values: ", new)
# for uneven cases
temp = []
while start < len(l):
temp.append(l[start])
start += 1
new.append(temp)
print("new values for uneven cases: ", new)

import numpy as np
array = np.arange(8)
print("Original array : \n", array)
array = np.arange(8).reshape(2, 4)
print("New array : \n", array)

Related

create new array from other arrays

I want to create a new array from three other arrays so that I want to create a 20x3 array (center_lab) from three 20x1 arrays (TT2r, TT2g, TT2b) so I would get the following:
center_lab = np.zeros([20,3])
center_lab[:,0] = TT2r
center_lab[:,1] = TT2g
center_lab[:,2] = TT2b
And I get the next error: could not broadcast input array from shape (21,1) into shape (21,)
Anyone knows how to fix this?
If im not mistaken your TT2r arrays have a shape of (21,1)
The part of center_lab you write into is only of shape (21,) since you index the last dimension (as opposed to slicing). If you also remove the last dimensions of your TTs it should fit.
center_lab = np.zeros([20,3])
center_lab[:,0] = TT2r[:,0]
center_lab[:,1] = TT2g[:,0]
center_lab[:,2] = TT2b[:,0]
create a 20x3 array (center_lab) from three 20x1 arrays
This sound like task for numpy.hstack, consider following simple example:
import numpy as np
arr1 = np.array([[1],[2],[3]])
arr2 = np.array([[4],[5],[6]])
arr3 = np.array([[7],[8],[9]])
arr = np.hstack([arr1,arr2,arr3])
print(arr)
output
[[1 4 7]
[2 5 8]
[3 6 9]]
Note: I used 3x1 array for sake of clarity. If you would need to stack vertically rather than horizontally as above see numpy.vstack.

Alternative to loop for for boolean / nonzero indexing of numpy array

I need to select only the non-zero 3d portions of a 3d binary array (or alternatively the true values of a boolean array). Currently I am able to do so with a series of 'for' loops that use np.any, but this does work but seems awkward and slow, so currently investigating a more direct way to accomplish the task.
I am rather new to numpy, so the approaches that I have tried include a) using
np.nonzero, which returns indices that I am at a loss to understand what to do with for my purposes, b) boolean array indexing, and c) boolean masks. I can generally understand each of those approaches for simple 2d arrays, but am struggling to understand the differences between the approaches, and cannot get them to return the right values for a 3d array.
Here is my current function that returns a 3D array with nonzero values:
def real_size(arr3):
true_0 = []
true_1 = []
true_2 = []
print(f'The input array shape is: {arr3.shape}')
for zero_ in range (0, arr3.shape[0]):
if arr3[zero_].any()==True:
true_0.append(zero_)
for one_ in range (0, arr3.shape[1]):
if arr3[:,one_,:].any()==True:
true_1.append(one_)
for two_ in range (0, arr3.shape[2]):
if arr3[:,:,two_].any()==True:
true_2.append(two_)
arr4 = arr3[min(true_0):max(true_0) + 1, min(true_1):max(true_1) + 1, min(true_2):max(true_2) + 1]
print(f'The nonzero area is: {arr4.shape}')
return arr4
# Then use it on a small test array:
test_array = np.zeros([2, 3, 4], dtype = int)
test_array[0:2, 0:2, 0:2] = 1
#The function call works and prints out as expected:
non_zero = real_size(test_array)
>> The input array shape is: (2, 3, 4)
>> The nonzero area is: (2, 2, 2)
# So, the array is correct, but likely not the best way to get there:
non_zero
>> array([[[1, 1],
[1, 1]],
[[1, 1],
[1, 1]]])
The code works appropriately, but I am using this on much larger and more complex arrays, and don't think this is an appropriate approach. Any thoughts on a more direct method to make this work would be greatly appreciated. I am also concerned about errors and the results if the input array has for example two separate non-zero 3d areas within the original array.
To clarify the problem, I need to return one or more 3D portions as one or more 3d arrays beginning with an original larger array. The returned arrays should not include extraneous zeros (or false values) in any given exterior plane in three dimensional space. Just getting the indices of the nonzero values (or vice versa) doesn't by itself solve the problem.
Assuming you want to eliminate all rows, columns, etc. that contain only zeros, you could do the following:
nz = (test_array != 0)
non_zero = test_array[nz.any(axis=(1, 2))][:, nz.any(axis=(0, 2))][:, :, nz.any(axis=(0, 1))]
An alternative solution using np.nonzero:
i = [np.unique(_) for _ in np.nonzero(test_array)]
non_zero = test_array[i[0]][:, i[1]][:, :, i[2]]
This can also be generalized to arbitrary dimensions, but requires a bit more work (only showing the first approach here):
def real_size(arr):
nz = (arr != 0)
result = arr
axes = np.arange(arr.ndim)
for axis in range(arr.ndim):
zeros = nz.any(axis=tuple(np.delete(axes, axis)))
result = result[(slice(None),)*axis + (zeros,)]
return result
non_zero = real_size(test_array)

numpy padding matrix of different row size

I have a numpy array of different row size
a = np.array([[1,2,3,4,5],[1,2,3],[1]])
and I would like to become this one into a dense (fixed n x m size, no variable rows) matrix. Until now I tried with something like this
size = (len(a),5)
result = np.zeros(size)
result[[0],[len(a[0])]]=a[0]
But I receive an error telling me
shape mismatch: value array of shape (5,) could not be broadcast to
indexing result of shape (1,)
I also tried to do padding wit np.pad, but according to the documentation of numpy.pad it seems I need to specify in the pad_width, the previous size of the rows (which is variable and produced me errors trying with -1,0, and biggest row size).
I know I can do it padding padding lists per row as it's shown here, but I need to do that with a much bigger array of data.
If someone can help me with the answer to this question, I would be glad to know of it.
There's really no way to pad a jagged array such that it would loose its jaggedness, without having to iterate over the rows of the array. You'll have to iterate over the array twice even: once to find out the maximum length you need to pad to, another to actually do the padding.
The code proposal you've linked to will get the job done, but it's not very efficient, because it adds zeroes in a python for-loop that iterates over the elements of the rows, whereas that appending could have been precalculated, thereby pushing more of that code to C.
The code below precomputes an array of the required minimal dimensions, filled with zeroes and then simply adds the row from the jagged array M in place, which is far more efficient.
import random
import numpy as np
M = [[random.random() for n in range(random.randint(0,m))] for m in range(10000)] # play-data
def pad_to_dense(M):
"""Appends the minimal required amount of zeroes at the end of each
array in the jagged array `M`, such that `M` looses its jagedness."""
maxlen = max(len(r) for r in M)
Z = np.zeros((len(M), maxlen))
for enu, row in enumerate(M):
Z[enu, :len(row)] += row
return Z
To give you some idea for speed:
from timeit import timeit
n = [10, 100, 1000, 10000]
s = [timeit(stmt='Z = pad_to_dense(M)', setup='from __main__ import pad_to_dense; import numpy as np; from random import random, randint; M = [[random() for n in range(randint(0,m))] for m in range({})]'.format(ni), number=1) for ni in n]
print('\n'.join(map(str,s)))
# 7.838103920221329e-05
# 0.0005027339793741703
# 0.01208890089765191
# 0.8269036808051169
If you want to prepend zeroes to the arrays, rather than append, that's a simple enough change to the code, which I'll leave to you.
You can do something like this with numpy.pad
import numpy as np
a = np.array([[1,2,3,4,5],[1,2,3],[1]])
l = np.array([len(a[i]) for i in range(len(a))])
width = l.max()
b=[]
for i in range(len(a)):
if len(a[i]) != width:
x = np.pad(a[i], (0,width-len(a[i])), 'constant',constant_values = 0)
else:
x = a[i]
b.append(x)
b = np.array(b)
print(b)
Above piece of code outputs something like this.
b = [[1, 2, 3, 4, 5],
[1, 2, 3, 0, 0],
[1, 0, 0, 0, 0]]
You can read back your input version of data by doing something as follows
a = []
for i in range(len(b)):
a.append(b[i][0:l[i]])
a = np.array(a)
print(a)
where you get the following output
a = array([array([1, 2, 3, 4, 5]), array([1, 2, 3]), array([1])], dtype=object)
Hopefully this helps someone who struggled like me to solve the issue.
Thank you.

ctypes - numpy array with no shape?

I am using a python wrapper to call functions of a c++ dll library. A ctype is returned by the dll library, which I convert to numpy array
score = np.ctypeslib.as_array(score,1)
however, the array has no shape?
score
>>> array(-0.019486344729027664)
score.shape
>>> ()
score[0]
>>> IndexError: too many indices for array
How can I extract a double from the score array?
Thank you.
You can access the data inside a 0-dimensional array via indexing [()].
For example, score[()] will retrieve the underlying data in your array.
The idiom is in fact consistent:
# x, y, z are 0-dim, 1-dim, 2-dim respectively
x = np.array(1)
y = np.array([1, 2, 3])
z = np.array([[1, 2, 3], [4, 5, 6]])
# use 0-dim, 1-dim, 2-dim tuple indexers respectively
res_x = x[()] # 1
res_y = y[(1,)] # 2
res_z = z[(1, 2)] # 6
Tuples seem unnatural because you don't need to use them explicitly for the 1d and 2d cases, i.e. y[1] and z[1, 2] suffice. That option isn't available for the 0-dim case, so use the zero-length tuple.

Index the middle of a numpy array?

To index the middle points of a numpy array, you can do this:
x = np.arange(10)
middle = x[len(x)/4:len(x)*3/4]
Is there a shorthand for indexing the middle of the array? e.g., the n or 2n elements closes to len(x)/2? Is there a nice n-dimensional version of this?
as cge said, the simplest way is by turning it into a lambda function, like so:
x = np.arange(10)
middle = lambda x: x[len(x)/4:len(x)*3/4]
or the n-dimensional way is:
middle = lambda x: x[[slice(np.floor(d/4.),np.ceil(3*d/4.)) for d in x.shape]]
Late, but for everyone else running into this issue:
A much smoother way is to use numpy's take or put.
To address the middle of an array you can use put to index an n-dimensional array with a single index. Same for getting values from an array with take
Assuming your array has an odd number of elements, the middle of the array will be at half of it's size. By using an integer division (// instead of /) you won't get any problems here.
import numpy as np
arr = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
# put a value to the center
np.put(arr, arr.size // 2, 999)
print(arr)
# take a value from the center
center = np.take(arr, arr.size // 2)
print(center)

Categories

Resources