Create a matrix out of an array - python

I am trying to construct a matrix object out of an array. The array has a length of 25, and what I'm trying to do is construct a 5x5 matrix out of it. I have used both numpy.asmatrix() and the matrix constructor but both result in a matrix that has a length of 1. So, what's basically happening is all the elements of the array are considered a tuple and inserted into the newly-created matrix. Is there any way around this so I can accomplish what I want?
EDIT: When I wrote "array", I naively meant a vanilla python list and not an actual numpy.array which would make things a lot simpler. A mistake on my part.

Think you probably just want .reshape():
In [2]: a = np.arange(25)
In [3]: a
Out[3]:
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24])
In [4]: a.reshape(5,5)
Out[4]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
You can also convert it into an np.matrix after if you need things from that:
In [5]: np.matrix(a.reshape(5,5))
Out[5]:
matrix([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
EDIT: If you've got a list to start, it's still not too bad:
In [16]: l = range(25)
In [17]: np.matrix(np.reshape(l, (5,5)))
Out[17]:
matrix([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
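To see why the matrix built from a flat list has length 1, here is a minimal sketch (the names values, flat and grid are only illustrative). A flat list is read as a single row, so the 5x5 shape has to be requested explicitly:

```python
import numpy as np

values = list(range(25))                # a plain Python list of length 25

flat = np.matrix(values)                # a flat list is treated as one row
print(flat.shape)                       # (1, 25)
print(len(flat))                        # 1

grid = np.array(values).reshape(5, 5)   # ask for the 5x5 shape explicitly
print(grid.shape)                       # (5, 5)
```

If an np.matrix is really needed, np.matrix(grid) should keep the 5x5 shape.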

You can simply simulate a Matrix by using a 2-dimensional array with 5 spaces in each direction:
>>>Matrix = [[0 for x in range(5)] for x in range(5)]
And access the elements via:
>>>Matrix[0][0]=1
To test the output, print it:
>>>Matrix
[[1, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
If you need a specific implementation like numpy, please specify your question.
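If the starting point is an existing flat list of 25 values rather than zeros, the same nested-list layout can be built by slicing. This is only a sketch in plain Python; values, width and matrix are illustrative names:

```python
values = list(range(25))    # the flat list of length 25
width = 5                   # number of columns per row

# Cut the flat list into consecutive chunks of `width` items, one per row.
matrix = [values[i:i + width] for i in range(0, len(values), width)]

print(matrix[1])            # [5, 6, 7, 8, 9]
print(matrix[4][4])         # 24
```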

import numpy as np

row = int(input("Enter the number of rows: \n"))
col = int(input("Enter the number of columns: \n"))
num1 = row * col  # total number of elements to read
print("You will be asked for", num1, "elements.")
num_array1 = []  # collects the values typed by the user
print("Enter your elements in array: ")
for i in range(num1):
    n = int(input("Element " + str(i + 1) + " : "))
    num_array1.append(n)
arr = np.array(num_array1)
newarr = arr.reshape(row, col)  # row x col array
print(newarr)
print(type(newarr))
This should help to create matrix-type arrays from custom user input.

If you have an array of length 25, you can turn it into a 5x5 array using reshape().
A = np.arange(25) # length 25
B = A.reshape(5, 5) # 5x5 array
You will however have to make sure that the elements in your array end up in the correct place in the newly formed 5x5 array.
Although there is a numpy.matrix class, I would suggest you forget about it and only use numpy.ndarray. The only difference is that you have to use np.dot (or @ in newer Python/NumPy) for matrix multiplication instead of *.
The matrix class has a tendency to introduce mistakes in your code unless you are very careful.
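As a small illustration of that difference on plain ndarrays (the arrays A and B below are made up for the example, and the @ operator needs Python 3.5+ with NumPy 1.10+):

```python
import numpy as np

A = np.arange(4).reshape(2, 2)      # [[0, 1], [2, 3]]
B = np.array([[1, 0], [0, 2]])

elementwise = A * B                 # multiplies matching entries
matprod = A @ B                     # matrix product, same as np.dot(A, B)

print(elementwise)                  # [[0 0], [0 6]]
print(matprod)                      # [[0 2], [2 6]]
```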

Related

Is there a vectorized way of appending/indexing values from nd_array 2 based off component r/c values in nd_array 1?

I'm looking to vectorize some code that I have, but am unsure how to approach it or if it's even possible. Here's what I have so far:
arr_row = [0, 1, 2, ..., n] # Array of size n with random integers
arr_col = [0, 1, 2, ..., n] # Array of size n with random integers
for i in range(n):
    r = arr_row[i]
    c = arr_col[i]
    result = n_by_n_table[r,c]
A brief explanation of the above: I have two arrays of n elements each. I loop i from 0 to n - 1 and, on each iteration, look up the item at the row/column location given by my row and column arrays. The lookup is made in a table that is guaranteed to be n x n. I am hoping I can vectorize this for a performance improvement. Here's what I had in mind:
vectorized = nd_array((n, 3))
vectorized[:, 0] = [0, 1, 2, ..., n] # Likely using np.rand func or something here.
vectorized[:, 1] = [0, 1, 2, ..., n] # Same as above
vectorized[:, 2] = n_by_n_table[vectorized[:,0], vectorized[:,1]]
Is this possible, and are there any concerns with doing the above?
example data
arr_row = [5,1,0,3,2,4]
arr_col = [1,4,2,0,5,3]
n_by_n_table = np.array(range(36)).reshape(6,6)
n_by_n_table :
[[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]]
solution
n_by_n_table[arr_row,arr_col]
result:
array([31, 10, 2, 18, 17, 27])
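Putting the pieces together with the example data, a sketch of the three-column layout from the question might look like this (looped is only there to check the result against the original loop; np.empty stands in for the question's nd_array placeholder):

```python
import numpy as np

arr_row = np.array([5, 1, 0, 3, 2, 4])
arr_col = np.array([1, 4, 2, 0, 5, 3])
n = len(arr_row)
n_by_n_table = np.arange(n * n).reshape(n, n)

# Reference result from the explicit loop in the question.
looped = np.array([n_by_n_table[r, c] for r, c in zip(arr_row, arr_col)])

# Vectorized version: columns hold row index, column index, looked-up value.
vectorized = np.empty((n, 3), dtype=n_by_n_table.dtype)
vectorized[:, 0] = arr_row
vectorized[:, 1] = arr_col
vectorized[:, 2] = n_by_n_table[arr_row, arr_col]

print(vectorized[:, 2])     # [31 10  2 18 17 27]
```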

Numpy Delete for 2-dimensional array

I have an ndarray of shape (10, 3) and an index list of length 10:
import numpy as np
arr = np.arange(10* 3).reshape((10, 3))
idxs = np.array([0, 1, 1, 1, 2, 0, 2, 2, 1 , 0])
I want to use numpy delete (or a numpy function that is suited better for the task) to delete the values in arr as indicated by idxs for each row. So in the zeroth row of arr I want to delete the 0th entry, in the first the first, in the second the first, and so on.
I tried something like
np.delete(arr, idxs, axis=1)
but it won't work. Then I tried building an index list like this:
idlist = [np.arange(len(idxs)), idxs]
np.delete(arr, idlist)
but this doesn't give me the results I want either.
@Quang's answer is good, but may benefit from some explanation.
np.delete works with whole rows or columns, not selected elements from each.
In [30]: arr = np.arange(10* 3).reshape((10, 3))
...: idxs = np.array([0, 1, 1, 1, 2, 0, 2, 2, 1 , 0])
Selecting items from the array is easy:
In [31]: arr[np.arange(10), idxs]
Out[31]: array([ 0, 4, 7, 10, 14, 15, 20, 23, 25, 27])
Selecting everything but these, takes a bit more work. np.delete is complex general code that does different things depending on the delete specification. But one thing it can do is create a True mask, and set the delete items to False.
For your 2d case we can:
In [33]: mask = np.ones(arr.shape, bool)
In [34]: mask[np.arange(10), idxs] = False
In [35]: arr[mask]
Out[35]:
array([ 1, 2, 3, 5, 6, 8, 9, 11, 12, 13, 16, 17, 18, 19, 21, 22, 24,
26, 28, 29])
boolean indexing produces a flat array, so we need to reshape to get 2d:
In [36]: arr[mask].reshape(10,2)
Out[36]:
array([[ 1, 2],
[ 3, 5],
[ 6, 8],
[ 9, 11],
[12, 13],
[16, 17],
[18, 19],
[21, 22],
[24, 26],
[28, 29]])
Quang's answer creates the mask in another way:
In [37]: arr[np.arange(arr.shape[1]) != idxs[:,None]]
Out[37]:
array([ 1, 2, 3, 5, 6, 8, 9, 11, 12, 13, 16, 17, 18, 19, 21, 22, 24,
26, 28, 29])
Let's try extracting the other items by masking, then reshape:
arr[np.arange(arr.shape[1]) != idxs[:,None]].reshape(len(arr),-1)
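The mask approach can be wrapped in a small helper. This is a sketch only; delete_per_row is a hypothetical name, not a NumPy function, and it assumes exactly one element is removed from every row:

```python
import numpy as np

def delete_per_row(arr, idxs):
    """Drop one element per row: column idxs[i] is removed from row i."""
    mask = np.ones(arr.shape, dtype=bool)
    mask[np.arange(arr.shape[0]), idxs] = False   # mark entries to delete
    # Boolean indexing flattens, so restore the 2d shape (one column fewer).
    return arr[mask].reshape(arr.shape[0], arr.shape[1] - 1)

arr = np.arange(10 * 3).reshape(10, 3)
idxs = np.array([0, 1, 1, 1, 2, 0, 2, 2, 1, 0])
trimmed = delete_per_row(arr, idxs)
print(trimmed.shape)        # (10, 2)
```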
Thanks for your question and the answers from Quang and hpaulj.
I just want to add a second scenario, where one wants to do the deletion along the other axis.
The index now has only 3 elements because there are only 3 columns in arr, for example:
idxs2 = np.array([1,2,3])
To delete the elements of each column according to the index in idxs2, one can do this
arr.T[np.array(np.arange(arr.shape[0]) != idxs2[:,None])].reshape(len(idxs2),-1).T
And the result becomes:
array([[ 0, 1, 2],
[ 6, 4, 5],
[ 9, 10, 8],
[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26],
[27, 28, 29]])

Extract values from a numpy array based on another array of 0/1 indices

Given an index array idx that contains only 0s and 1s, where the 1s mark the samples of interest, and a sample array A (with A.shape[0] == idx.shape[0]), the objective is to extract the subset of samples selected by the index vector.
In MATLAB, it is trivial to do:
B = A(idx,:) %assuming A is 2D matrix and idx is a logical vector
How can I achieve this in Python in a simple manner?
If your mask array idx has the same shape as your array A, then you should be able to extract elements specified by the mask if you convert idx to a boolean array, using astype.
Demo -
>>> A
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> idx
array([[1, 0, 0, 1, 1],
[0, 0, 0, 1, 0],
[1, 0, 0, 1, 1],
[1, 0, 0, 1, 1],
[0, 1, 1, 1, 1]])
>>> A[idx.astype(bool)]
array([ 0, 3, 4, 8, 10, 13, 14, 15, 18, 19, 21, 22, 23, 24])
Converting idx to bool with astype plays the same role as the logical index in MATLAB:
B = A[idx.astype(bool)]
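For the case in the question, where idx is a vector with one 0/1 flag per row of A, a sketch of the row-selecting counterpart to A(idx,:) could be as follows (the data here is made up for illustration):

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
idx = np.array([1, 0, 0, 1, 1])     # one 0/1 flag per row of A

# A boolean vector in the first position selects whole rows.
B = A[idx.astype(bool), :]          # keeps rows 0, 3 and 4

print(B.shape)                      # (3, 5)
```

The conversion matters: indexing with the raw integers 0 and 1 would pick rows 0 and 1 repeatedly instead of acting as a mask.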

compress numpy array(matrix) by removing columns using another numpy array as mask

I have a 2D numpy array (i.e. a matrix) A which contains useful data interspersed with garbage in the form of column vectors, as well as a 'selection' array B which contains 1 for those columns that are important and 0 for those that are not. Is there a way to select only those columns from A that correspond to ones in B? I.e. I have a matrix
A = array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
and a vector
B = array([ 0, 1, 0, 1, 0])
and I want
array([[1, 3],
[6, 8],
[11, 13],
[16, 18],
[21, 23]])
Is there an elegant way to do so? Right now I just have a for loop that iterates through B.
NOTE: the matrices that I'm dealing with are large, so I don't want to use numpy masked arrays, as I simply don't want the masked data.
>>> A
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> B = NP.array([ 0, 1, 0, 1, 0])
>>> # convert the indexing array to a boolean array
>>> B = NP.array(B, dtype=bool)
>>> # index A against B--indexing array is placed after the ',' because
>>> # you are selecting columns
>>> res = A[:,B]
>>> res
array([[ 1, 3],
[ 6, 8],
[11, 13],
[16, 18],
[21, 23]])
The syntax for index-based slicing in NumPy is elegant and simple. A couple of rules cover a majority of use cases:
the form is [rows, columns]
specify all rows or all columns using a colon ":", e.g., [:, 4] (extracts the entire 5th column)
Not sure if it's the most efficient way (because of the transposition), but it should be better than a for loop:
A.T[B == 1].T
I was interested to do the same but to slice row & column using the boolean values of vector B, the solution was simple:
res = A[:,B][B,:]
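As a possible alternative to chaining two indexing steps, np.ix_ accepts boolean vectors and builds the row/column selection in one go. A sketch with the arrays from the question (chained and one_step are illustrative names):

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
B = np.array([0, 1, 0, 1, 0], dtype=bool)

chained = A[:, B][B, :]         # select columns, then rows
one_step = A[np.ix_(B, B)]      # select rows and columns together

print(one_step)                 # [[ 6  8], [16 18]]
```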

Numpy append: Automatically cast an array of the wrong dimension

Is there a way to do the following without an if clause?
I'm reading a set of netcdf files with pupynere and want to build an array with numpy append. Sometimes the input data is multi-dimensional (see variable "a" below), sometimes one dimensional ("b"), but the number of elements in the first dimension is always the same ("9" in the example below).
> import numpy as np
> a = np.arange(27).reshape(3,9)
> b = np.arange(9)
> a.shape
(3, 9)
> b.shape
(9,)
this works as expected:
> np.append(a,a, axis=0)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26]])
but, appending b does not work so elegantly:
> np.append(a,b, axis=0)
ValueError: arrays must have same number of dimensions
The problem with append is (from the numpy manual)
"When axis is specified, values must have the correct shape."
I'd have to cast first in order to get the right result.
> np.append(a,b.reshape(1,9), axis=0)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8]])
So, in my file reading loop, I'm currently using an if clause like this:
for i in [a, b]:
    if np.size(i.shape) == 2:
        result = np.append(result, i, axis=0)
    else:
        result = np.append(result, i.reshape(1,9), axis=0)
Is there a way to append "a" and "b" without the if statement?
EDIT: While @Sven answered the original question perfectly (using np.atleast_2d()), he (and others) pointed out that the code is inefficient. In an answer below, I combined their suggestions and replaced my original code. It should be much more efficient now. Thanks.
You can use numpy.atleast_2d():
result = np.append(result, np.atleast_2d(i), axis=0)
That said, note that the repeated use of numpy.append() is a very inefficient way to build a NumPy array -- it has to be reallocated in every step. If at all possible, preallocate the array with the desired final size and populate it afterwards using slicing.
You can just add all of the arrays to a list, then use np.vstack() to concatenate them all together at the end. This avoids constantly reallocating the growing array with every append.
|1> a = np.arange(27).reshape(3,9)
|2> b = np.arange(9)
|3> np.vstack([a,b])
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8]])
I'm going to improve my code with the help of @Sven, @Henry and @Robert. @Sven answered the question, so he earns the reputation for this question, but, as highlighted by him and others, there is a more efficient way of doing what I want.
This involves using a Python list, where each append costs amortized O(1), whereas each numpy.append() call copies the whole array, so building the result from N pieces costs O(N**2) overall. Afterwards, the list is converted to a numpy array:
Suppose i is either of type a or b:
> a = np.arange(27).reshape(3,9)
> b = np.arange(9)
> a.shape
(3, 9)
> b.shape
(9,)
Initialise list and append all read data, e.g. if data appear in order 'aaba'.
> mList = []
> for i in [a,a,b,a]:
      mList.append(i)
Your mList will look like this:
> mList
[array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26]]),
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26]]),
array([0, 1, 2, 3, 4, 5, 6, 7, 8]),
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26]])]
finally, vstack the list to form a numpy array:
> result = np.vstack(mList[:])
> result.shape
(10, 9)
Thanks again for valuable help.
As pointed out, np.append has to reallocate the whole array on every call. An alternative solution that allocates once would be something like this:
# first pass: work out how many elements there are in total
total_size = 0
for i in [a,b]:
    total_size += i.size
result = np.empty(total_size, dtype=a.dtype)
offset = 0
for i in [a,b]:
    # copy in the array
    result[offset:offset+i.size] = i.ravel()
    offset += i.size
# if you know it's always divisible by 9:
result = result.reshape(result.size//9, 9)
If you can't precompute the array size, then perhaps you can put an upper bound on the size and then just preallocate a block that will always be big enough. Then you can just make the result a view into that block:
result = result[0:known_final_size]
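A sketch of that upper-bound idea, assuming at most max_rows rows of 9 values are ever read (max_rows, block and used are hypothetical names introduced for this example):

```python
import numpy as np

a = np.arange(27).reshape(3, 9)
b = np.arange(9)

max_rows = 100                          # assumed upper bound on the row count
block = np.empty((max_rows, 9), dtype=a.dtype)

used = 0                                # number of rows filled so far
for chunk in [a, a, b, a]:
    rows = np.atleast_2d(chunk)         # makes the 1-d input a (1, 9) array
    block[used:used + rows.shape[0]] = rows
    used += rows.shape[0]

result = block[:used]                   # a view into the block, not a copy
print(result.shape)                     # (10, 9)
```

Because result is a view, the unused part of the block stays allocated for as long as result is alive; calling result.copy() would release it at the cost of one extra copy.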
