I need help storing the combinations of two column vectors' values in a NumPy array.
My problem consists of two column vectors of size nx1 and mx1, with n=m, and finding n combinations.
I then stacked these column vectors into a matrix of size nx2.
I found the combinations with Python's itertools.combinations function, but I struggle to store them in a NumPy array, since itertools yields the rows as tuples.
The main example I found online is reported below:
import itertools
val = [1, 2, 3, 4]
com_set = itertools.combinations(val, 2)
for i in com_set:
    print(i)
Output:
(1, 2)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
(3, 4)
Now, in my case, I have two vectors, val and val1, different from each other.
I would also need the output in a NumPy array, possibly a matrix, so that I can apply the maximum likelihood estimation method to these values.
You are looking for itertools.product instead of itertools.combinations.
import itertools
import numpy as np
x = [1, 2, 3]
y = [4, 5, 6]
z = list(itertools.product(x, y))
# z = [(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]
You can turn the result into a (n * n, 2) shaped array by simply passing the result to np.array:
result = np.array(z)
# array([[1, 4],
# [1, 5],
# [1, 6],
# [2, 4],
# [2, 5],
# [2, 6],
# [3, 4],
# [3, 5],
# [3, 6]])
Finally, you can also do this with numpy directly, albeit in a different order:
result = np.stack(np.meshgrid(x, y)).reshape(2, -1).T
# array([[1, 4],
# [2, 4],
# [3, 4],
# [1, 5],
# [2, 5],
# [3, 5],
# [1, 6],
# [2, 6],
# [3, 6]])
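Since the question starts from column vectors of shape (n, 1), a minimal sketch of the same idea on such inputs might look like the following. The names val and val1 come from the question; the sample values are made up for illustration. It also shows a pure-NumPy variant (np.repeat plus np.tile) that should keep the same row order as itertools.product.

```python
import itertools
import numpy as np

# Sample column vectors of shape (n, 1); the values are illustrative only.
val = np.array([[1], [2], [3]])
val1 = np.array([[4], [5], [6]])

# Flatten to 1-D first, otherwise product would pair up length-1 rows.
pairs = np.array(list(itertools.product(val.ravel(), val1.ravel())))

# Pure-NumPy alternative with the same row order as itertools.product:
# repeat each element of val, and cycle through val1.
pairs_np = np.column_stack((np.repeat(val.ravel(), val1.size),
                            np.tile(val1.ravel(), val.size)))

print(pairs.shape)                      # (9, 2)
print(np.array_equal(pairs, pairs_np))  # True
```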
I'd like to copy a numpy 2D array into a third dimension. For example, given the 2D numpy array:
import numpy as np
arr = np.array([[1, 2], [1, 2]])
# arr.shape = (2, 2)
convert it into a 3D matrix with N such copies in a new dimension. Acting on arr with N=3, the output should be:
new_arr = np.array([[[1, 2], [1,2]],
[[1, 2], [1, 2]],
[[1, 2], [1, 2]]])
# new_arr.shape = (3, 2, 2)
Probably the cleanest way is to use np.repeat:
a = np.array([[1, 2], [1, 2]])
print(a.shape)
# (2, 2)
# indexing with np.newaxis inserts a new 3rd dimension, which we then repeat the
# array along, (you can achieve the same effect by indexing with None, see below)
b = np.repeat(a[:, :, np.newaxis], 3, axis=2)
print(b.shape)
# (2, 2, 3)
print(b[:, :, 0])
# [[1 2]
# [1 2]]
print(b[:, :, 1])
# [[1 2]
# [1 2]]
print(b[:, :, 2])
# [[1 2]
# [1 2]]
Having said that, you can often avoid repeating your arrays altogether by using broadcasting. For example, let's say I wanted to add a (3,) vector:
c = np.array([1, 2, 3])
to a. I could copy the contents of a 3 times in the third dimension, then copy the contents of c twice in both the first and second dimensions, so that both of my arrays were (2, 2, 3), then compute their sum. However, it's much simpler and quicker to do this:
d = a[..., None] + c[None, None, :]
Here, a[..., None] has shape (2, 2, 1) and c[None, None, :] has shape (1, 1, 3)*. When I compute the sum, the result gets 'broadcast' out along the dimensions of size 1, giving me a result of shape (2, 2, 3):
print(d.shape)
# (2, 2, 3)
print(d[..., 0]) # a + c[0]
# [[2 3]
# [2 3]]
print(d[..., 1]) # a + c[1]
# [[3 4]
# [3 4]]
print(d[..., 2]) # a + c[2]
# [[4 5]
# [4 5]]
Broadcasting is a very powerful technique because it avoids the additional overhead involved in creating repeated copies of your input arrays in memory.
* Although I included them for clarity, the None indices into c aren't actually necessary - you could also do a[..., None] + c, i.e. broadcast a (2, 2, 1) array against a (3,) array. This is because if one of the arrays has fewer dimensions than the other then only the trailing dimensions of the two arrays need to be compatible. To give a more complicated example:
a = np.ones((6, 1, 4, 3, 1)) # 6 x 1 x 4 x 3 x 1
b = np.ones((5, 1, 3, 2)) # 5 x 1 x 3 x 2
result = a + b # 6 x 5 x 4 x 3 x 2
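To double-check the claims above, here is a small sketch that compares the broadcast sum with the explicit np.repeat route, and uses np.broadcast to report a result shape without computing the sum. The variable names d_repeat, big_a and big_b are only for illustration.

```python
import numpy as np

a = np.array([[1, 2], [1, 2]])
c = np.array([1, 2, 3])

# Broadcasting route: (2, 2, 1) + (3,) -> (2, 2, 3)
d = a[..., None] + c

# Explicit route: materialise three copies of a, then add c along the last axis.
d_repeat = np.repeat(a[:, :, np.newaxis], 3, axis=2) + c

print(np.array_equal(d, d_repeat))  # True

# np.broadcast only works out the resulting shape; no sum is computed here.
big_a = np.ones((6, 1, 4, 3, 1))
big_b = np.ones((5, 1, 3, 2))
print(np.broadcast(big_a, big_b).shape)  # (6, 5, 4, 3, 2)
```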
Another way is to use numpy.dstack. Supposing that you want to repeat the matrix a num_repeats times:
import numpy as np
b = np.dstack([a]*num_repeats)
The trick is to wrap the matrix a in a single-element list, then use the * operator to duplicate that element num_repeats times.
For example, if:
a = np.array([[1, 2], [1, 2]])
num_repeats = 5
This repeats the array of [1 2; 1 2] 5 times in the third dimension. To verify (in IPython):
In [110]: import numpy as np
In [111]: num_repeats = 5
In [112]: a = np.array([[1, 2], [1, 2]])
In [113]: b = np.dstack([a]*num_repeats)
In [114]: b[:,:,0]
Out[114]:
array([[1, 2],
[1, 2]])
In [115]: b[:,:,1]
Out[115]:
array([[1, 2],
[1, 2]])
In [116]: b[:,:,2]
Out[116]:
array([[1, 2],
[1, 2]])
In [117]: b[:,:,3]
Out[117]:
array([[1, 2],
[1, 2]])
In [118]: b[:,:,4]
Out[118]:
array([[1, 2],
[1, 2]])
In [119]: b.shape
Out[119]: (2, 2, 5)
At the end we can see that the shape of the matrix is 2 x 2, with 5 slices in the third dimension.
Use a view and get free runtime! Extend generic n-dim arrays to n+1-dim
numpy.broadcast_to, introduced in NumPy 1.10.0, lets us simply generate a 3D view into the 2D input array. The benefit is no extra memory overhead and virtually free runtime. This is essential when the arrays are big and we are okay with working on (read-only) views. It also works for generic n-dim cases.
I would use the word stack in place of copy, as readers might confuse it with the copying of arrays that creates memory copies.
Stack along first axis
If we want to stack input arr along the first axis, the solution with np.broadcast_to to create 3D view would be -
np.broadcast_to(arr,(3,)+arr.shape) # N = 3 here
Stack along third/last axis
To stack input arr along the third axis, the solution to create 3D view would be -
np.broadcast_to(arr[...,None],arr.shape+(3,))
If we actually need a memory copy, we can always append .copy() there. Hence, the solutions would be -
np.broadcast_to(arr,(3,)+arr.shape).copy()
np.broadcast_to(arr[...,None],arr.shape+(3,)).copy()
Here's how the stacking works for the two cases, shown with their shape information for a sample case -
# Create a sample input array of shape (4,5)
In [55]: arr = np.random.rand(4,5)
# Stack along first axis
In [56]: np.broadcast_to(arr,(3,)+arr.shape).shape
Out[56]: (3, 4, 5)
# Stack along third axis
In [57]: np.broadcast_to(arr[...,None],arr.shape+(3,)).shape
Out[57]: (4, 5, 3)
The same solutions work to extend an n-dim input to an n+1-dim view output along the first and last axes. Let's explore some higher-dim cases -
3D input case :
In [58]: arr = np.random.rand(4,5,6)
# Stack along first axis
In [59]: np.broadcast_to(arr,(3,)+arr.shape).shape
Out[59]: (3, 4, 5, 6)
# Stack along last axis
In [60]: np.broadcast_to(arr[...,None],arr.shape+(3,)).shape
Out[60]: (4, 5, 6, 3)
4D input case :
In [61]: arr = np.random.rand(4,5,6,7)
# Stack along first axis
In [62]: np.broadcast_to(arr,(3,)+arr.shape).shape
Out[62]: (3, 4, 5, 6, 7)
# Stack along last axis
In [63]: np.broadcast_to(arr[...,None],arr.shape+(3,)).shape
Out[63]: (4, 5, 6, 7, 3)
and so on.
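The two cases above can be folded into one function that takes the position of the new axis. The sketch below is only one possible way to do it: stack_view is a hypothetical helper name, not part of NumPy, and it combines np.expand_dims with np.broadcast_to.

```python
import numpy as np

def stack_view(arr, n, axis=0):
    """Hypothetical helper: view of `arr` repeated `n` times along a new axis."""
    expanded = np.expand_dims(arr, axis)  # insert a length-1 axis at `axis`
    shape = list(expanded.shape)
    shape[axis] = n                       # stretch that axis to length n
    return np.broadcast_to(expanded, tuple(shape))

arr = np.random.rand(4, 5)
print(stack_view(arr, 3, axis=0).shape)   # (3, 4, 5)
print(stack_view(arr, 3, axis=1).shape)   # (4, 3, 5)
print(stack_view(arr, 3, axis=-1).shape)  # (4, 5, 3)
```

Like np.broadcast_to itself, the helper returns a view, so append .copy() if an independent array is needed.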
Timings
Let's use a large sample 2D case and get the timings and verify output being a view.
# Sample input array
In [19]: arr = np.random.rand(1000,1000)
Let's prove that the proposed solution is a view indeed. We will use stacking along first axis (results would be very similar for stacking along the third axis) -
In [22]: np.shares_memory(arr, np.broadcast_to(arr,(3,)+arr.shape))
Out[22]: True
Let's get the timings to show that it's virtually free -
In [20]: %timeit np.broadcast_to(arr,(3,)+arr.shape)
100000 loops, best of 3: 3.56 µs per loop
In [21]: %timeit np.broadcast_to(arr,(3000,)+arr.shape)
100000 loops, best of 3: 3.51 µs per loop
Because the result is a view, increasing N from 3 to 3000 did not change the timings, and both are negligible (a few microseconds). Hence, it is efficient in both memory and runtime!
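One caveat worth knowing: the view returned by np.broadcast_to is read-only, so in-place writes fail unless a copy is made first. A short sketch with an illustrative small array:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)
view = np.broadcast_to(arr, (3,) + arr.shape)

print(view.flags.writeable)  # False

try:
    view[0, 0, 0] = 99       # writing through the view is rejected
except ValueError as exc:
    print("write failed:", exc)

# A real copy is writable and independent of arr.
copied = view.copy()
copied[0, 0, 0] = 99
print(arr[0, 0])             # 0
```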
This can now also be achieved using np.tile as follows:
import numpy as np
a = np.array([[1,2],[1,2]])
b = np.tile(a,(3, 1,1))
b.shape
(3,2,2)
b
array([[[1, 2],
[1, 2]],
[[1, 2],
[1, 2]],
[[1, 2],
[1, 2]]])
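import numpy as np
N = 3
A = np.array([[1, 2], [3, 4]])
B = np.asarray([A] * N)  # shape (N, 2, 2)
Edit @Mr.F, to put the copies along the last axis instead (note that B.T would reverse all axes and therefore also transpose each copy of A):
B = np.moveaxis(B, 0, -1)  # shape (2, 2, N)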
Here's a broadcasting example that does exactly what was requested.
a = np.array([[1, 2], [1, 2]])
a=a[:,:,None]
b=np.array([1]*5)[None,None,:]
Then b*a is the desired result and (b*a)[:,:,0] produces array([[1, 2],[1, 2]]), which is the original a, as does (b*a)[:,:,1], etc.
Summarizing the solutions above:
a = np.arange(9).reshape(3,-1)
b = np.repeat(a[:, :, np.newaxis], 5, axis=2)
c = np.dstack([a]*5)
d = np.tile(a, [5,1,1])
e = np.array([a]*5)
f = np.repeat(a[np.newaxis, :, :], 5, axis=0) # np.repeat again
print('b='+ str(b.shape), b[:,:,-1].tolist())
print('c='+ str(c.shape),c[:,:,-1].tolist())
print('d='+ str(d.shape),d[-1,:,:].tolist())
print('e='+ str(e.shape),e[-1,:,:].tolist())
print('f='+ str(f.shape),f[-1,:,:].tolist())
b=(3, 3, 5) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
c=(3, 3, 5) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
d=(5, 3, 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
e=(5, 3, 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
f=(5, 3, 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
Good luck
The code that I have in place goes something as follows:
import numpy as np
z = np.array([
[1, 2],
[3]
])
x = np.array([
[4, 5]
])
print(np.multiply(x,z))
The output of this program creates a list of lists. This is different than the regular broadcasting rules that apply on arrays with equal dimensions. Is there a name for this property? Also why does it explicitly mention the word list in the output?
[[list([1, 2, 1, 2, 1, 2, 1, 2]) list([3, 3, 3, 3, 3])]]
[Finished in 0.244s]
This is just normal element-by-element multiplication. Because the rows of your z array have different lengths, it is not a rectangular matrix, so NumPy interprets it as a 1-D array of two objects:
>>> z
array([[1, 2], [3]], dtype=object)
>>> z.shape
(2,)
From here you multiply normally: the first object is multiplied by 4, the second by 5:
>>> [1, 2]*4
[1, 2, 1, 2, 1, 2, 1, 2]
>>> [3]*5
[3, 3, 3, 3, 3]
This is just normal Python list multiplication, and it is the result you get. Note that your result is not a "list of lists". It is an array of shape (1, 2) with dtype=object, i.e. a row of two objects (which happen to be lists):
>>> np.multiply(x,z)
array([[[1, 2, 1, 2, 1, 2, 1, 2], [3, 3, 3, 3, 3]]], dtype=object)
>>> np.multiply(x,z).shape
(1, 2)
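Recent NumPy releases (1.24 and later) refuse to build such a ragged array implicitly and raise a ValueError unless dtype=object is requested. The sketch below uses the explicit form and then shows one possible way to get ordinary numeric broadcasting by padding the short row. The padding value 0 and the name z_padded are assumptions made purely for illustration.

```python
import numpy as np

# Explicit object array: two Python lists stored in a 1-D array.
z = np.array([[1, 2], [3]], dtype=object)
x = np.array([[4, 5]])

print(z.shape)       # (2,)
result = np.multiply(x, z)
print(result.shape)  # (1, 2)
print(result[0, 0])  # [1, 2, 1, 2, 1, 2, 1, 2]  (list repetition, not arithmetic)

# Padding the short row gives a rectangular numeric array,
# and the usual broadcasting rules apply: (1, 2) * (2, 2) -> (2, 2).
z_padded = np.array([[1, 2], [3, 0]])
print(x * z_padded)
# [[ 4 10]
#  [12  0]]
```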
I have 2 multidimensional arrays. I want to multiply those arrays.
Both arrays have the same shape:
shape : (3, 100)
I want to convert this MATLAB code:
sum(q1.*q2)
to NumPy, but
np.dot(q1, q2)
gives me this error:
ValueError: objects are not aligned
Use the element-wise product * instead of the dot product.
Here is a sample run with a reduced dimension
Implementation
A = np.random.randint(5,size=(3,4))
B = np.random.randint(5,size=(3,4))
result = A * B
Demo
>>> A
array([[4, 1, 3, 0],
[2, 0, 2, 2],
[0, 1, 1, 1]])
>>> B
array([[1, 3, 0, 2],
[3, 4, 1, 2],
[3, 0, 4, 3]])
>>> A * B
array([[4, 3, 0, 0],
[6, 0, 2, 4],
[0, 0, 4, 3]])
My installation of Octave, when asked to do
sum(a .* b)
with a and b having shape (3, 100), returns an array of shape (1, 100). The exact equivalent in numpy would be:
np.sum(a * b, axis=0)
which returns an array of shape (100,), or if you want to keep the dimensions of size 1:
np.sum(a * b, axis=0, keepdims=True)
You can get the same result, possibly faster, using np.einsum:
np.einsum('ij,ij->j', a, b)
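A quick sketch to confirm that these variants agree, using random inputs with the shape from the question (the seed and the variable names are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 100))
b = rng.random((3, 100))

col_sums = np.sum(a * b, axis=0)                    # like sum(a .* b), but 1-D
col_sums_2d = np.sum(a * b, axis=0, keepdims=True)  # keeps the length-1 axis
col_sums_einsum = np.einsum('ij,ij->j', a, b)

print(col_sums.shape)     # (100,)
print(col_sums_2d.shape)  # (1, 100)
print(np.allclose(col_sums, col_sums_einsum))  # True
```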
What is the difference between a NumPy array (let's say X) that has a shape of (N,1) and one that has a shape of (N,)? Aren't both of them Nx1 matrices? The reason I ask is that computations sometimes return one and sometimes the other.
This is a 1D array:
>>> np.array([1, 2, 3]).shape
(3,)
This array is 2D, but there is only one element in the first dimension:
>>> np.array([[1, 2, 3]]).shape
(1, 3)
Transposing gives the shape you are asking for:
>>> np.array([[1, 2, 3]]).T.shape
(3, 1)
Now look at the array itself. This 2D array has three rows and a single column:
>>> np.array([[1, 2, 3]]).T
array([[1],
[2],
[3]])
Given these two arrays:
>>> a = np.array([[1, 2, 3]])
>>> b = np.array([[1, 2, 3]]).T
>>> a
array([[1, 2, 3]])
>>> b
array([[1],
[2],
[3]])
You can take advantage of broadcasting:
>>> a * b
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
The missing values are filled in by repeating each array along its length-1 dimension. Think of the rows and columns in a table or spreadsheet.
>>> a + b
array([[2, 3, 4],
[3, 4, 5],
[4, 5, 6]])
Doing this with higher dimensions gets tougher on your imagination.
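To tie this back to the question, here is a small sketch of converting between the (N,) and (N, 1) shapes and of how the two behave differently under broadcasting. The names v, col and flat are illustrative.

```python
import numpy as np

v = np.array([1, 2, 3])   # shape (3,)
col = v[:, np.newaxis]    # shape (3, 1); v.reshape(-1, 1) does the same
flat = col.ravel()        # back to shape (3,)

print(v.shape, col.shape, flat.shape)  # (3,) (3, 1) (3,)

# The same values behave differently under broadcasting:
print((v + v).shape)      # (3,)    element-wise
print((v + col).shape)    # (3, 3)  every pairwise sum

# Transposing a 1-D array does nothing, since there is only one axis.
print(v.T.shape)          # (3,)
```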