numpy asmatrix sometimes produces an array of lists - python

I am using numpy of python to convert multi dimensional data into matrix by using numpy.asmatrix().
data_array = [[] for k in range(K)]
while Append data to data_array:
data_array[k].append(data)
mymatrix = numpy.asmatrix(data_array)
The data_array ends up with K*E. While I try to use mymatrix for further manipulation, it turns out some of them works but others still remains as list. When I print it out, it looks like
[[ list([..., ...]), list(..., ...) ]]
Does anyone knows the potential reason for that?

My guess is that you are doing the equivalent of:
In [5]: np.matrix([[1,2,3],[2,3],[3]])
Out[5]: matrix([[list([1, 2, 3]), list([2, 3]), list([3])]], dtype=object)

You should change your approach to create the entire 2D array first, then fill it in. Something like this:
mymatrix = np.empty((K, N))
for k in range(K):
mymatrix[k] = data

Related

Change first colum of numpy array

How to convert this numpy array:
array = np.array([[np.array([1]),2],[np.array([1]),2],[np.array([1]),2]])
print (array)
[[array([1]) 2]
[array([1]) 2]
[array([1]) 2]]
to this numpy array:
print(array)
[[1 2]
[1 2]
[1 2]]
How can I achieve this without a for loop?
This is what I tried but it doesn't work:
first_col = array[:,0]
first_col = np.array([i[0] for i in first_col])
I don't even know if answering this is a good idea, since there must be a fundemental flaw in the design to even come up with a situation like this and the correct solution would be to fix that, rather than trying to mitigate the issue by converting the output.
Never-the-less, given the data, interestingly enough, it is possible to 'unpack' the structure using the numpy array method .astype():
import numpy as np
array = np.array([[np.array([1]),2],[np.array([1]),2],[np.array([1]),2]])
array = array.astype(int) # alt array = array.astype(float)
But, as stated above, this is treating the symptom of the problem, rather than the problem itself.
Try using map to convert it to a list:
map(lambda x: [list(x[0]), x[1]], array)
This is one way.
import numpy as np
arr = np.array([[np.array([1]),2],
[np.array([1]),2],
[np.array([1]),2]])
np.array([[i[0][0], i[1]] for i in arr])
# array([[1, 2],
# [1, 2],
# [1, 2]])

Python equivalent to MATLAB's dynamic array initialization [duplicate]

I want to create an empty array and append items to it, one at a time.
xs = []
for item in data:
xs.append(item)
Can I use this list-style notation with NumPy arrays?
That is the wrong mental model for using NumPy efficiently. NumPy arrays are stored in contiguous blocks of memory. To append rows or columns to an existing array, the entire array needs to be copied to a new block of memory, creating gaps for the new elements to be stored. This is very inefficient if done repeatedly.
Instead of appending rows, allocate a suitably sized array, and then assign to it row-by-row:
>>> import numpy as np
>>> a = np.zeros(shape=(3, 2))
>>> a
array([[ 0., 0.],
[ 0., 0.],
[ 0., 0.]])
>>> a[0] = [1, 2]
>>> a[1] = [3, 4]
>>> a[2] = [5, 6]
>>> a
array([[ 1., 2.],
[ 3., 4.],
[ 5., 6.]])
A NumPy array is a very different data structure from a list and is designed to be used in different ways. Your use of hstack is potentially very inefficient... every time you call it, all the data in the existing array is copied into a new one. (The append function will have the same issue.) If you want to build up your matrix one column at a time, you might be best off to keep it in a list until it is finished, and only then convert it into an array.
e.g.
mylist = []
for item in data:
mylist.append(item)
mat = numpy.array(mylist)
item can be a list, an array or any iterable, as long
as each item has the same number of elements.
In this particular case (data is some iterable holding the matrix columns) you can simply use
mat = numpy.array(data)
(Also note that using list as a variable name is probably not good practice since it masks the built-in type by that name, which can lead to bugs.)
EDIT:
If for some reason you really do want to create an empty array, you can just use numpy.array([]), but this is rarely useful!
To create an empty multidimensional array in NumPy (e.g. a 2D array m*n to store your matrix), in case you don't know m how many rows you will append and don't care about the computational cost Stephen Simmons mentioned (namely re-buildinging the array at each append), you can squeeze to 0 the dimension to which you want to append to: X = np.empty(shape=[0, n]).
This way you can use for example (here m = 5 which we assume we didn't know when creating the empty matrix, and n = 2):
import numpy as np
n = 2
X = np.empty(shape=[0, n])
for i in range(5):
for j in range(2):
X = np.append(X, [[i, j]], axis=0)
print X
which will give you:
[[ 0. 0.]
[ 0. 1.]
[ 1. 0.]
[ 1. 1.]
[ 2. 0.]
[ 2. 1.]
[ 3. 0.]
[ 3. 1.]
[ 4. 0.]
[ 4. 1.]]
I looked into this a lot because I needed to use a numpy.array as a set in one of my school projects and I needed to be initialized empty... I didn't found any relevant answer here on Stack Overflow, so I started doodling something.
# Initialize your variable as an empty list first
In [32]: x=[]
# and now cast it as a numpy ndarray
In [33]: x=np.array(x)
The result will be:
In [34]: x
Out[34]: array([], dtype=float64)
Therefore you can directly initialize an np array as follows:
In [36]: x= np.array([], dtype=np.float64)
I hope this helps.
For creating an empty NumPy array without defining its shape you can do the following:
arr = np.array([])
The first one is preferred because you know you will be using this as a NumPy array. NumPy converts this to np.ndarray type afterward, without extra [] 'dimension'.
for adding new element to the array us can do:
arr = np.append(arr, 'new element')
Note that in the background for python there's no such thing as an array without
defining its shape. as #hpaulj mentioned this also makes a one-rank
array.
You can use the append function. For rows:
>>> from numpy import *
>>> a = array([10,20,30])
>>> append(a, [[1,2,3]], axis=0)
array([[10, 20, 30],
[1, 2, 3]])
For columns:
>>> append(a, [[15],[15]], axis=1)
array([[10, 20, 30, 15],
[1, 2, 3, 15]])
EDIT
Of course, as mentioned in other answers, unless you're doing some processing (ex. inversion) on the matrix/array EVERY time you append something to it, I would just create a list, append to it then convert it to an array.
Here is some workaround to make numpys look more like Lists
np_arr = np.array([])
np_arr = np.append(np_arr , 2)
np_arr = np.append(np_arr , 24)
print(np_arr)
OUTPUT: array([ 2., 24.])
If you absolutely don't know the final size of the array, you can increment the size of the array like this:
my_arr = numpy.zeros((0,5))
for i in range(3):
my_arr=numpy.concatenate( ( my_arr, numpy.ones((1,5)) ) )
print(my_arr)
[[ 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1.]]
Notice the 0 in the first line.
numpy.append is another option. It calls numpy.concatenate.
You can apply it to build any kind of array, like zeros:
a = range(5)
a = [i*0 for i in a]
print a
[0, 0, 0, 0, 0]
Depending on what you are using this for, you may need to specify the data type (see 'dtype').
For example, to create a 2D array of 8-bit values (suitable for use as a monochrome image):
myarray = numpy.empty(shape=(H,W),dtype='u1')
For an RGB image, include the number of color channels in the shape: shape=(H,W,3)
You may also want to consider zero-initializing with numpy.zeros instead of using numpy.empty. See the note here.
Another simple way to create an empty array that can take array is:
import numpy as np
np.empty((2,3), dtype=object)
I think you want to handle most of the work with lists then use the result as a matrix. Maybe this is a way ;
ur_list = []
for col in columns:
ur_list.append(list(col))
mat = np.matrix(ur_list)
I think you can create empty numpy array like:
>>> import numpy as np
>>> empty_array= np.zeros(0)
>>> empty_array
array([], dtype=float64)
>>> empty_array.shape
(0,)
This format is useful when you want to append numpy array in the loop.
Perhaps what you are looking for is something like this:
x=np.array(0)
In this way you can create an array without any element. It similar than:
x=[]
This way you will be able to append new elements to your array in advance.
The simplest way
Input:
import numpy as np
data = np.zeros((0, 0), dtype=float) # (rows,cols)
data.shape
Output:
(0, 0)
Input:
for i in range(n_files):
data = np.append(data, new_data, axis = 0)

best way to create a numpy array from a list and additional individual values

I want to create an array from list entries and some additional individual values.
I am using the following approach which seems clumsy:
x=[1,2,3]
y=some_variable1
z=some_variable2
x.append(y)
x.append(z)
arr = np.array(x)
#print arr --> [1 2 3 some_variable1 some_variable2]
is there a better solution to the problem?
You can use list addition to add the variables all placed in a list to the larger one, like so:
arr = np.array(x + [y, z])
Appending or concatenating lists is fine, and probably fastest.
Concatenating at the array level works as well
In [456]: np.hstack([x,y,z])
Out[456]: array([1, 2, 3, 4, 5])
This is compact, but under the covers it does
np.concatenate([np.array(x),np.array([y]),np.array([z])])

Pythonic way to get the first AND the last element of the sequence

What is the easiest and cleanest way to get the first AND the last elements of a sequence? E.g., I have a sequence [1, 2, 3, 4, 5], and I'd like to get [1, 5] via some kind of slicing magic. What I have come up with so far is:
l = len(s)
result = s[0:l:l-1]
I actually need this for a bit more complex task. I have a 3D numpy array, which is cubic (i.e. is of size NxNxN, where N may vary). I'd like an easy and fast way to get a 2x2x2 array containing the values from the vertices of the source array. The example above is an oversimplified, 1D version of my task.
Use this:
result = [s[0], s[-1]]
Since you're using a numpy array, you may want to use fancy indexing:
a = np.arange(27)
indices = [0, -1]
b = a[indices] # array([0, 26])
For the 3d case:
vertices = [(0,0,0),(0,0,-1),(0,-1,0),(0,-1,-1),(-1,-1,-1),(-1,-1,0),(-1,0,0),(-1,0,-1)]
indices = list(zip(*vertices)) #Can store this for later use.
a = np.arange(27).reshape((3,3,3)) #dummy array for testing. Can be any shape size :)
vertex_values = a[indices].reshape((2,2,2))
I first write down all the vertices (although I am willing to bet there is a clever way to do it using itertools which would let you scale this up to N dimensions ...). The order you specify the vertices is the order they will be in the output array. Then I "transpose" the list of vertices (using zip) so that all the x indices are together and all the y indices are together, etc. (that's how numpy likes it). At this point, you can save that index array and use it to index your array whenever you want the corners of your box. You can easily reshape the result into a 2x2x2 array (although the order I have it is probably not the order you want).
This would give you a list of the first and last element in your sequence:
result = [s[0], s[-1]]
Alternatively, this would give you a tuple
result = s[0], s[-1]
With the particular case of a (N,N,N) ndarray X that you mention, would the following work for you?
s = slice(0,N,N-1)
X[s,s,s]
Example
>>> N = 3
>>> X = np.arange(N*N*N).reshape(N,N,N)
>>> s = slice(0,N,N-1)
>>> print X[s,s,s]
[[[ 0 2]
[ 6 8]]
[[18 20]
[24 26]]]
>>> from operator import itemgetter
>>> first_and_last = itemgetter(0, -1)
>>> first_and_last([1, 2, 3, 4, 5])
(1, 5)
Why do you want to use a slice? Getting each element with
result = [s[0], s[-1]]
is better and more readable.
If you really need to use the slice, then your solution is the simplest working one that I can think of.
This also works for the 3D case you've mentioned.

How do I create an empty array and then append to it in NumPy?

I want to create an empty array and append items to it, one at a time.
xs = []
for item in data:
xs.append(item)
Can I use this list-style notation with NumPy arrays?
That is the wrong mental model for using NumPy efficiently. NumPy arrays are stored in contiguous blocks of memory. To append rows or columns to an existing array, the entire array needs to be copied to a new block of memory, creating gaps for the new elements to be stored. This is very inefficient if done repeatedly.
Instead of appending rows, allocate a suitably sized array, and then assign to it row-by-row:
>>> import numpy as np
>>> a = np.zeros(shape=(3, 2))
>>> a
array([[ 0., 0.],
[ 0., 0.],
[ 0., 0.]])
>>> a[0] = [1, 2]
>>> a[1] = [3, 4]
>>> a[2] = [5, 6]
>>> a
array([[ 1., 2.],
[ 3., 4.],
[ 5., 6.]])
A NumPy array is a very different data structure from a list and is designed to be used in different ways. Your use of hstack is potentially very inefficient... every time you call it, all the data in the existing array is copied into a new one. (The append function will have the same issue.) If you want to build up your matrix one column at a time, you might be best off to keep it in a list until it is finished, and only then convert it into an array.
e.g.
mylist = []
for item in data:
mylist.append(item)
mat = numpy.array(mylist)
item can be a list, an array or any iterable, as long
as each item has the same number of elements.
In this particular case (data is some iterable holding the matrix columns) you can simply use
mat = numpy.array(data)
(Also note that using list as a variable name is probably not good practice since it masks the built-in type by that name, which can lead to bugs.)
EDIT:
If for some reason you really do want to create an empty array, you can just use numpy.array([]), but this is rarely useful!
To create an empty multidimensional array in NumPy (e.g. a 2D array m*n to store your matrix), in case you don't know m how many rows you will append and don't care about the computational cost Stephen Simmons mentioned (namely re-buildinging the array at each append), you can squeeze to 0 the dimension to which you want to append to: X = np.empty(shape=[0, n]).
This way you can use for example (here m = 5 which we assume we didn't know when creating the empty matrix, and n = 2):
import numpy as np
n = 2
X = np.empty(shape=[0, n])
for i in range(5):
for j in range(2):
X = np.append(X, [[i, j]], axis=0)
print X
which will give you:
[[ 0. 0.]
[ 0. 1.]
[ 1. 0.]
[ 1. 1.]
[ 2. 0.]
[ 2. 1.]
[ 3. 0.]
[ 3. 1.]
[ 4. 0.]
[ 4. 1.]]
I looked into this a lot because I needed to use a numpy.array as a set in one of my school projects and I needed to be initialized empty... I didn't found any relevant answer here on Stack Overflow, so I started doodling something.
# Initialize your variable as an empty list first
In [32]: x=[]
# and now cast it as a numpy ndarray
In [33]: x=np.array(x)
The result will be:
In [34]: x
Out[34]: array([], dtype=float64)
Therefore you can directly initialize an np array as follows:
In [36]: x= np.array([], dtype=np.float64)
I hope this helps.
For creating an empty NumPy array without defining its shape you can do the following:
arr = np.array([])
The first one is preferred because you know you will be using this as a NumPy array. NumPy converts this to np.ndarray type afterward, without extra [] 'dimension'.
for adding new element to the array us can do:
arr = np.append(arr, 'new element')
Note that in the background for python there's no such thing as an array without
defining its shape. as #hpaulj mentioned this also makes a one-rank
array.
You can use the append function. For rows:
>>> from numpy import *
>>> a = array([10,20,30])
>>> append(a, [[1,2,3]], axis=0)
array([[10, 20, 30],
[1, 2, 3]])
For columns:
>>> append(a, [[15],[15]], axis=1)
array([[10, 20, 30, 15],
[1, 2, 3, 15]])
EDIT
Of course, as mentioned in other answers, unless you're doing some processing (ex. inversion) on the matrix/array EVERY time you append something to it, I would just create a list, append to it then convert it to an array.
Here is some workaround to make numpys look more like Lists
np_arr = np.array([])
np_arr = np.append(np_arr , 2)
np_arr = np.append(np_arr , 24)
print(np_arr)
OUTPUT: array([ 2., 24.])
If you absolutely don't know the final size of the array, you can increment the size of the array like this:
my_arr = numpy.zeros((0,5))
for i in range(3):
my_arr=numpy.concatenate( ( my_arr, numpy.ones((1,5)) ) )
print(my_arr)
[[ 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1.]]
Notice the 0 in the first line.
numpy.append is another option. It calls numpy.concatenate.
You can apply it to build any kind of array, like zeros:
a = range(5)
a = [i*0 for i in a]
print a
[0, 0, 0, 0, 0]
Depending on what you are using this for, you may need to specify the data type (see 'dtype').
For example, to create a 2D array of 8-bit values (suitable for use as a monochrome image):
myarray = numpy.empty(shape=(H,W),dtype='u1')
For an RGB image, include the number of color channels in the shape: shape=(H,W,3)
You may also want to consider zero-initializing with numpy.zeros instead of using numpy.empty. See the note here.
Another simple way to create an empty array that can take array is:
import numpy as np
np.empty((2,3), dtype=object)
I think you want to handle most of the work with lists then use the result as a matrix. Maybe this is a way ;
ur_list = []
for col in columns:
ur_list.append(list(col))
mat = np.matrix(ur_list)
I think you can create empty numpy array like:
>>> import numpy as np
>>> empty_array= np.zeros(0)
>>> empty_array
array([], dtype=float64)
>>> empty_array.shape
(0,)
This format is useful when you want to append numpy array in the loop.
Perhaps what you are looking for is something like this:
x=np.array(0)
In this way you can create an array without any element. It similar than:
x=[]
This way you will be able to append new elements to your array in advance.
The simplest way
Input:
import numpy as np
data = np.zeros((0, 0), dtype=float) # (rows,cols)
data.shape
Output:
(0, 0)
Input:
for i in range(n_files):
data = np.append(data, new_data, axis = 0)

Categories

Resources