Create numpy array of arrays [duplicate] - python

How to create an numpy array with shape [2, 2, 3], where the elements at axis 2 is another array, for example [1, 2, 3]?
So I would like to do something like this invalid code:
a = np.arange(1, 4)
b = np.full((3, 3), a)
Resulting in an array like:
[[[ 1. 2. 3.]
[ 1. 2. 3.]]
[[ 1. 2. 3.]
[ 1. 2. 3.]]]
Could of course make the loop for filling like, but thought there may be a shortcut:
for y in range(b.shape[0]):
for x in range(b.shape[1]):
b[y, x, :] = a

There are multiple ways to achieve this. One is to use np.full in np.full((2,2,3), a) as pointed out by Divakar in the comments. Alternatively, you can use np.tile for this, which allows you to construct an array by repeating an input array a given number of times. To construct your example you could do:
import numpy as np
np.tile(np.arange(1, 4), [2, 2, 1])

If your numpy version is >= 1.10 you can use broadcast_to
a = np.arange(1,4)
a.shape = (1,1,3)
b = np.broadcast_to(a,(2,2,3))
This produces a view rather than copying so will be quicker for large arrays.
EDIT this looks to be the result you're asking for with your demo.

Based on Divakar comment, an answer can also be:
import numpy as np
np.full([2, 2, 3], np.arange(1, 4))
Yet another possibility is:
import numpy as np
b = np.empty([2, 2, 3])
b[:] = np.arange(1, 4)

Also using np.concatenate or it's wrapper np.vstack
In [26]: a = np.arange(1,4)
In [27]: np.vstack([a[np.newaxis, :]]*4).reshape(2,2, 3)
Out[27]:
array([[[1, 2, 3],
[1, 2, 3]],
[[1, 2, 3],
[1, 2, 3]]])
In [28]: np.concatenate([a[np.newaxis, :]]*4, axis=0).reshape(2,2, 3)
Out[28]:
array([[[1, 2, 3],
[1, 2, 3]],
[[1, 2, 3],
[1, 2, 3]]])

Related

Repeat specific row or column of Python numpy 2D array [duplicate]

I'd like to copy a numpy 2D array into a third dimension. For example, given the 2D numpy array:
import numpy as np
arr = np.array([[1, 2], [1, 2]])
# arr.shape = (2, 2)
convert it into a 3D matrix with N such copies in a new dimension. Acting on arr with N=3, the output should be:
new_arr = np.array([[[1, 2], [1,2]],
[[1, 2], [1, 2]],
[[1, 2], [1, 2]]])
# new_arr.shape = (3, 2, 2)
Probably the cleanest way is to use np.repeat:
a = np.array([[1, 2], [1, 2]])
print(a.shape)
# (2, 2)
# indexing with np.newaxis inserts a new 3rd dimension, which we then repeat the
# array along, (you can achieve the same effect by indexing with None, see below)
b = np.repeat(a[:, :, np.newaxis], 3, axis=2)
print(b.shape)
# (2, 2, 3)
print(b[:, :, 0])
# [[1 2]
# [1 2]]
print(b[:, :, 1])
# [[1 2]
# [1 2]]
print(b[:, :, 2])
# [[1 2]
# [1 2]]
Having said that, you can often avoid repeating your arrays altogether by using broadcasting. For example, let's say I wanted to add a (3,) vector:
c = np.array([1, 2, 3])
to a. I could copy the contents of a 3 times in the third dimension, then copy the contents of c twice in both the first and second dimensions, so that both of my arrays were (2, 2, 3), then compute their sum. However, it's much simpler and quicker to do this:
d = a[..., None] + c[None, None, :]
Here, a[..., None] has shape (2, 2, 1) and c[None, None, :] has shape (1, 1, 3)*. When I compute the sum, the result gets 'broadcast' out along the dimensions of size 1, giving me a result of shape (2, 2, 3):
print(d.shape)
# (2, 2, 3)
print(d[..., 0]) # a + c[0]
# [[2 3]
# [2 3]]
print(d[..., 1]) # a + c[1]
# [[3 4]
# [3 4]]
print(d[..., 2]) # a + c[2]
# [[4 5]
# [4 5]]
Broadcasting is a very powerful technique because it avoids the additional overhead involved in creating repeated copies of your input arrays in memory.
* Although I included them for clarity, the None indices into c aren't actually necessary - you could also do a[..., None] + c, i.e. broadcast a (2, 2, 1) array against a (3,) array. This is because if one of the arrays has fewer dimensions than the other then only the trailing dimensions of the two arrays need to be compatible. To give a more complicated example:
a = np.ones((6, 1, 4, 3, 1)) # 6 x 1 x 4 x 3 x 1
b = np.ones((5, 1, 3, 2)) # 5 x 1 x 3 x 2
result = a + b # 6 x 5 x 4 x 3 x 2
Another way is to use numpy.dstack. Supposing that you want to repeat the matrix a num_repeats times:
import numpy as np
b = np.dstack([a]*num_repeats)
The trick is to wrap the matrix a into a list of a single element, then using the * operator to duplicate the elements in this list num_repeats times.
For example, if:
a = np.array([[1, 2], [1, 2]])
num_repeats = 5
This repeats the array of [1 2; 1 2] 5 times in the third dimension. To verify (in IPython):
In [110]: import numpy as np
In [111]: num_repeats = 5
In [112]: a = np.array([[1, 2], [1, 2]])
In [113]: b = np.dstack([a]*num_repeats)
In [114]: b[:,:,0]
Out[114]:
array([[1, 2],
[1, 2]])
In [115]: b[:,:,1]
Out[115]:
array([[1, 2],
[1, 2]])
In [116]: b[:,:,2]
Out[116]:
array([[1, 2],
[1, 2]])
In [117]: b[:,:,3]
Out[117]:
array([[1, 2],
[1, 2]])
In [118]: b[:,:,4]
Out[118]:
array([[1, 2],
[1, 2]])
In [119]: b.shape
Out[119]: (2, 2, 5)
At the end we can see that the shape of the matrix is 2 x 2, with 5 slices in the third dimension.
Use a view and get free runtime! Extend generic n-dim arrays to n+1-dim
Introduced in NumPy 1.10.0, we can leverage numpy.broadcast_to to simply generate a 3D view into the 2D input array. The benefit would be no extra memory overhead and virtually free runtime. This would be essential in cases where the arrays are big and we are okay to work with views. Also, this would work with generic n-dim cases.
I would use the word stack in place of copy, as readers might confuse it with the copying of arrays that creates memory copies.
Stack along first axis
If we want to stack input arr along the first axis, the solution with np.broadcast_to to create 3D view would be -
np.broadcast_to(arr,(3,)+arr.shape) # N = 3 here
Stack along third/last axis
To stack input arr along the third axis, the solution to create 3D view would be -
np.broadcast_to(arr[...,None],arr.shape+(3,))
If we actually need a memory copy, we can always append .copy() there. Hence, the solutions would be -
np.broadcast_to(arr,(3,)+arr.shape).copy()
np.broadcast_to(arr[...,None],arr.shape+(3,)).copy()
Here's how the stacking works for the two cases, shown with their shape information for a sample case -
# Create a sample input array of shape (4,5)
In [55]: arr = np.random.rand(4,5)
# Stack along first axis
In [56]: np.broadcast_to(arr,(3,)+arr.shape).shape
Out[56]: (3, 4, 5)
# Stack along third axis
In [57]: np.broadcast_to(arr[...,None],arr.shape+(3,)).shape
Out[57]: (4, 5, 3)
Same solution(s) would work to extend a n-dim input to n+1-dim view output along the first and last axes. Let's explore some higher dim cases -
3D input case :
In [58]: arr = np.random.rand(4,5,6)
# Stack along first axis
In [59]: np.broadcast_to(arr,(3,)+arr.shape).shape
Out[59]: (3, 4, 5, 6)
# Stack along last axis
In [60]: np.broadcast_to(arr[...,None],arr.shape+(3,)).shape
Out[60]: (4, 5, 6, 3)
4D input case :
In [61]: arr = np.random.rand(4,5,6,7)
# Stack along first axis
In [62]: np.broadcast_to(arr,(3,)+arr.shape).shape
Out[62]: (3, 4, 5, 6, 7)
# Stack along last axis
In [63]: np.broadcast_to(arr[...,None],arr.shape+(3,)).shape
Out[63]: (4, 5, 6, 7, 3)
and so on.
Timings
Let's use a large sample 2D case and get the timings and verify output being a view.
# Sample input array
In [19]: arr = np.random.rand(1000,1000)
Let's prove that the proposed solution is a view indeed. We will use stacking along first axis (results would be very similar for stacking along the third axis) -
In [22]: np.shares_memory(arr, np.broadcast_to(arr,(3,)+arr.shape))
Out[22]: True
Let's get the timings to show that it's virtually free -
In [20]: %timeit np.broadcast_to(arr,(3,)+arr.shape)
100000 loops, best of 3: 3.56 µs per loop
In [21]: %timeit np.broadcast_to(arr,(3000,)+arr.shape)
100000 loops, best of 3: 3.51 µs per loop
Being a view, increasing N from 3 to 3000 changed nothing on timings and both are negligible on timing units. Hence, efficient both on memory and performance!
This can now also be achived using np.tile as follows:
import numpy as np
a = np.array([[1,2],[1,2]])
b = np.tile(a,(3, 1,1))
b.shape
(3,2,2)
b
array([[[1, 2],
[1, 2]],
[[1, 2],
[1, 2]],
[[1, 2],
[1, 2]]])
A=np.array([[1,2],[3,4]])
B=np.asarray([A]*N)
Edit #Mr.F, to preserve dimension order:
B=B.T
Here's a broadcasting example that does exactly what was requested.
a = np.array([[1, 2], [1, 2]])
a=a[:,:,None]
b=np.array([1]*5)[None,None,:]
Then b*a is the desired result and (b*a)[:,:,0] produces array([[1, 2],[1, 2]]), which is the original a, as does (b*a)[:,:,1], etc.
Summarizing the solutions above:
a = np.arange(9).reshape(3,-1)
b = np.repeat(a[:, :, np.newaxis], 5, axis=2)
c = np.dstack([a]*5)
d = np.tile(a, [5,1,1])
e = np.array([a]*5)
f = np.repeat(a[np.newaxis, :, :], 5, axis=0) # np.repeat again
print('b='+ str(b.shape), b[:,:,-1].tolist())
print('c='+ str(c.shape),c[:,:,-1].tolist())
print('d='+ str(d.shape),d[-1,:,:].tolist())
print('e='+ str(e.shape),e[-1,:,:].tolist())
print('f='+ str(f.shape),f[-1,:,:].tolist())
b=(3, 3, 5) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
c=(3, 3, 5) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
d=(5, 3, 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
e=(5, 3, 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
f=(5, 3, 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
Good luck

Create numpy array within numpy array

I want to create a numpy array within a numpy array. If i do it with normal python its something like
a = [[1,2], [3,4]]
a[0][1] = [1,1,1]
print a
The result is [[1, [1, 1, 1]], [3, 4]]
How can I achieve the same using numpy arrays? The code I have is:
a = np.array([(1, 2, 3),(4, 5, 6)])
b = np.array([1,1,1])
a[0][1] = b
a as created is dtype int. Each element can only be another integer:
In [758]: a = np.array([(1, 2, 3),(4, 5, 6)])
...: b = np.array([1,1,1])
...:
In [759]: a
Out[759]:
array([[1, 2, 3],
[4, 5, 6]])
In [760]: b
Out[760]: array([1, 1, 1])
In [761]: a[0,1]=b
...
ValueError: setting an array element with a sequence.
You can make another dtype of array, one that holds pointers to objects, much as list does:
In [762]: aO = a.astype(object)
In [763]: aO
Out[763]:
array([[1, 2, 3],
[4, 5, 6]], dtype=object)
Now it is possible to replace one of those element pointers with a pointer to b array:
In [765]: aO[0,1]=b
In [766]: aO
Out[766]:
array([[1, array([1, 1, 1]), 3],
[4, 5, 6]], dtype=object)
But as asked in the comments - why do you want/need to do this? What are you going to do with such an array? It is possible to do some numpy math on such an array, but as shown in some recent SO questions, it is hit-or-miss. It is also slower.
As far as I know, you cannot do this. Numpy arrays cannot have entries of varying shape. Your request to make an array like [[1, [1, 1, 1]], [3, 4]] is impossible. However, you could make a numpy matrix of dimensions (3x2x3) to get
[
[
[1,0,0],
[1,1,1],
[0,0,0],
]
[
[3,0,0],
[4,0,0],
[0,0,0]
]
]
Your only option is to pad empty elements with some number (I used 0s above) or use another data structure.

How to create or fill an numpy array with another array?

How to create an numpy array with shape [2, 2, 3], where the elements at axis 2 is another array, for example [1, 2, 3]?
So I would like to do something like this invalid code:
a = np.arange(1, 4)
b = np.full((3, 3), a)
Resulting in an array like:
[[[ 1. 2. 3.]
[ 1. 2. 3.]]
[[ 1. 2. 3.]
[ 1. 2. 3.]]]
Could of course make the loop for filling like, but thought there may be a shortcut:
for y in range(b.shape[0]):
for x in range(b.shape[1]):
b[y, x, :] = a
There are multiple ways to achieve this. One is to use np.full in np.full((2,2,3), a) as pointed out by Divakar in the comments. Alternatively, you can use np.tile for this, which allows you to construct an array by repeating an input array a given number of times. To construct your example you could do:
import numpy as np
np.tile(np.arange(1, 4), [2, 2, 1])
If your numpy version is >= 1.10 you can use broadcast_to
a = np.arange(1,4)
a.shape = (1,1,3)
b = np.broadcast_to(a,(2,2,3))
This produces a view rather than copying so will be quicker for large arrays.
EDIT this looks to be the result you're asking for with your demo.
Based on Divakar comment, an answer can also be:
import numpy as np
np.full([2, 2, 3], np.arange(1, 4))
Yet another possibility is:
import numpy as np
b = np.empty([2, 2, 3])
b[:] = np.arange(1, 4)
Also using np.concatenate or it's wrapper np.vstack
In [26]: a = np.arange(1,4)
In [27]: np.vstack([a[np.newaxis, :]]*4).reshape(2,2, 3)
Out[27]:
array([[[1, 2, 3],
[1, 2, 3]],
[[1, 2, 3],
[1, 2, 3]]])
In [28]: np.concatenate([a[np.newaxis, :]]*4, axis=0).reshape(2,2, 3)
Out[28]:
array([[[1, 2, 3],
[1, 2, 3]],
[[1, 2, 3],
[1, 2, 3]]])

Numpy arange over numpy arrays

I have a function that basically returns generalized harmonic number.
def harmonic(limit, z):
return numpy.sum(1.0/numpy.arange(1, limit+1)**z)
Here is two examples for the current function definition:
>>> harmonic(1, 1)
1.0
>>> harmonic(2, 1)
1.5
As you might guess this works fine when limit is scalar, but how can I make this function work with 1D and 2D arrays as well?
The following demonstrates an example output of the function I want to achieve
>>> limit = np.array([[1, 2], [3, 4]])
>>> harmonic(limit, 1)
array([[1.0, 1.5], [1.833, 2.083]])
If you're only interested in vectorizing over limit and not z, as in the example you showed, then I think you can use np.vectorize:
>>> h = np.vectorize(harmonic)
>>> h(1, 1)
array(1.0)
>>> h(2, 1)
array(1.5)
>>> h([[1,2], [3,4]], 1)
array([[ 1. , 1.5 ],
[ 1.83333333, 2.08333333]])
>>> h([[1,2], [3,4]], 2)
array([[ 1. , 1.25 ],
[ 1.36111111, 1.42361111]])
Note that this will return 0-dimensional arrays for the scalar case.
Actually, on second thought, it should work for the z case too:
>>> h([[2,2], [2,2]], [[1,2],[3,4]])
array([[ 1.5 , 1.25 ],
[ 1.125 , 1.0625]])
arange generates evenly spaced 1D ndarray in range [1,limit+1] in your example.
Now say you want an multi-dim ndarray of evenly spaced arrays. Then you may use arange to generate each component of your 2D ndarray. You convert result of arange to a python list with list(), to make it the right format to be an argument of ndarray constructor.
It all depends on your purpose. As you deal with math. analysis, what you look for may be a grid:
>>> np.mgrid[0:5,0:5]
array([[[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2],
[3, 3, 3, 3, 3],
[4, 4, 4, 4, 4]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]])
More here.
EDIT:
After you posted the code :
as DSM mentions, np.vectorize is a good way to do. From doc,
class numpy.vectorize(pyfunc, otypes='', doc=None, excluded=None,
cache=False)
Generalized function class.
Define a vectorized function which takes a nested sequence of objects
or numpy arrays as inputs and returns a numpy array as output. The
vectorized function evaluates pyfunc over successive tuples of the
input arrays like the python map function, except it uses the
broadcasting rules of numpy.

Numpy - add row to array

How does one add rows to a numpy array?
I have an array A:
A = array([[0, 1, 2], [0, 2, 0]])
I wish to add rows to this array from another array X if the first element of each row in X meets a specific condition.
Numpy arrays do not have a method 'append' like that of lists, or so it seems.
If A and X were lists I would merely do:
for i in X:
if i[0] < 3:
A.append(i)
Is there a numpythonic way to do the equivalent?
Thanks,
S ;-)
You can do this:
newrow = [1, 2, 3]
A = numpy.vstack([A, newrow])
What is X? If it is a 2D-array, how can you then compare its row to a number: i < 3?
EDIT after OP's comment:
A = array([[0, 1, 2], [0, 2, 0]])
X = array([[0, 1, 2], [1, 2, 0], [2, 1, 2], [3, 2, 0]])
add to A all rows from X where the first element < 3:
import numpy as np
A = np.vstack((A, X[X[:,0] < 3]))
# returns:
array([[0, 1, 2],
[0, 2, 0],
[0, 1, 2],
[1, 2, 0],
[2, 1, 2]])
As this question is been 7 years before, in the latest version which I am using is numpy version 1.13, and python3, I am doing the same thing with adding a row to a matrix, remember to put a double bracket to the second argument, otherwise, it will raise dimension error.
In here I am adding on matrix A
1 2 3
4 5 6
with a row
7 8 9
same usage in np.r_
A = [[1, 2, 3], [4, 5, 6]]
np.append(A, [[7, 8, 9]], axis=0)
>> array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
#or
np.r_[A,[[7,8,9]]]
Just to someone's intersted, if you would like to add a column,
array = np.c_[A,np.zeros(#A's row size)]
following what we did before on matrix A, adding a column to it
np.c_[A, [2,8]]
>> array([[1, 2, 3, 2],
[4, 5, 6, 8]])
If you want to prepend, you can just flip the order of the arguments, i.e.:
np.r_([[7, 8, 9]], A)
>> array([[7, 8, 9],
[1, 2, 3],
[4, 5, 6]])
If no calculations are necessary after every row, it's much quicker to add rows in python, then convert to numpy. Here are timing tests using python 3.6 vs. numpy 1.14, adding 100 rows, one at a time:
import numpy as np
from time import perf_counter, sleep
def time_it():
# Compare performance of two methods for adding rows to numpy array
py_array = [[0, 1, 2], [0, 2, 0]]
py_row = [4, 5, 6]
numpy_array = np.array(py_array)
numpy_row = np.array([4,5,6])
n_loops = 100
start_clock = perf_counter()
for count in range(0, n_loops):
numpy_array = np.vstack([numpy_array, numpy_row]) # 5.8 micros
duration = perf_counter() - start_clock
print('numpy 1.14 takes {:.3f} micros per row'.format(duration * 1e6 / n_loops))
start_clock = perf_counter()
for count in range(0, n_loops):
py_array.append(py_row) # .15 micros
numpy_array = np.array(py_array) # 43.9 micros
duration = perf_counter() - start_clock
print('python 3.6 takes {:.3f} micros per row'.format(duration * 1e6 / n_loops))
sleep(15)
#time_it() prints:
numpy 1.14 takes 5.971 micros per row
python 3.6 takes 0.694 micros per row
So, the simple solution to the original question, from seven years ago, is to use vstack() to add a new row after converting the row to a numpy array. But a more realistic solution should consider vstack's poor performance under those circumstances. If you don't need to run data analysis on the array after every addition, it is better to buffer the new rows to a python list of rows (a list of lists, really), and add them as a group to the numpy array using vstack() before doing any data analysis.
You can also do this:
newrow = [1,2,3]
A = numpy.concatenate((A,newrow))
import numpy as np
array_ = np.array([[1,2,3]])
add_row = np.array([[4,5,6]])
array_ = np.concatenate((array_, add_row), axis=0)
I use 'np.vstack' which is faster, EX:
import numpy as np
input_array=np.array([1,2,3])
new_row= np.array([4,5,6])
new_array=np.vstack([input_array, new_row])
I use numpy.insert(arr, i, the_object_to_be_added, axis) in order to insert object_to_be_added at the i'th row(axis=0) or column(axis=1)
import numpy as np
a = np.array([[1, 2, 3], [5, 4, 6]])
# array([[1, 2, 3],
# [5, 4, 6]])
np.insert(a, 1, [55, 66], axis=1)
# array([[ 1, 55, 2, 3],
# [ 5, 66, 4, 6]])
np.insert(a, 2, [50, 60, 70], axis=0)
# array([[ 1, 2, 3],
# [ 5, 4, 6],
# [50, 60, 70]])
Too old discussion, but I hope it helps someone.
If you can do the construction in a single operation, then something like the vstack-with-fancy-indexing answer is a fine approach. But if your condition is more complicated or your rows come in on the fly, you may want to grow the array. In fact the numpythonic way to do something like this - dynamically grow an array - is to dynamically grow a list:
A = np.array([[1,2,3],[4,5,6]])
Alist = [r for r in A]
for i in range(100):
newrow = np.arange(3)+i
if i%5:
Alist.append(newrow)
A = np.array(Alist)
del Alist
Lists are highly optimized for this kind of access pattern; you don't have convenient numpy multidimensional indexing while in list form, but for as long as you're appending it's hard to do better than a list of row arrays.
You can use numpy.append() to append a row to numpty array and reshape to a matrix later on.
import numpy as np
a = np.array([1,2])
a = np.append(a, [3,4])
print a
# [1,2,3,4]
# in your example
A = [1,2]
for row in X:
A = np.append(A, row)

Categories

Resources