I know it is possible to use meshgrid to get all combinations between two arrays using numpy.
But in my case I have an array with two columns and n rows, plus another 1D array, and I would like to get all combinations between the rows of the first and the elements of the second.
For example:
a = [[1,1],
[2,2],
[3,3]]
b = [5,6]
# The expected result would be:
final_array = [[1,1,5],
[1,1,6],
[2,2,5],
[2,2,6],
[3,3,5],
[3,3,6]]
What is the fastest way to get this result using only numpy?
Proposed solution
OK, I got the result, but I would like to know whether this is a reliable and fast solution for the task; any advice would be appreciated.
import numpy as np
a_t = np.tile(a, len(b)).reshape(-1, 2)   # each row of a repeated len(b) times
b_t = np.tile(b, len(a)).reshape(1, -1)   # b cycled len(a) times
final_array = np.hstack((a_t, b_t.T))
array([[1, 1, 5],
[1, 1, 6],
[2, 2, 5],
[2, 2, 6],
[3, 3, 5],
[3, 3, 6]])
Kind of ugly, but here's one way (assuming a and b are numpy arrays):
xx = np.repeat(a, len(b), axis=0)      # each row of a repeated len(b) times
yy = np.tile(b, a.shape[0])[:, None]   # b cycled a.shape[0] times, as a column
np.concatenate((xx, yy), axis=1)
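If this comes up often, the same repeat/tile pattern can be wrapped in a small helper. This is my own sketch, not part of the answers above, and the name cartesian_rows is made up:
import numpy as np

def cartesian_rows(a, b):
    """Pair every row of 2-D array `a` with every element of 1-D array `b`."""
    a = np.asarray(a)
    b = np.asarray(b)
    left = np.repeat(a, len(b), axis=0)    # each row of a repeated len(b) times
    right = np.tile(b, len(a))[:, None]    # b cycled len(a) times, as a column
    return np.hstack((left, right))

cartesian_rows([[1, 1], [2, 2], [3, 3]], [5, 6])
# array([[1, 1, 5],
#        [1, 1, 6],
#        [2, 2, 5],
#        [2, 2, 6],
#        [3, 3, 5],
#        [3, 3, 6]])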
Related
Is there an efficient way to find the product of every row in a matrix using numpy?
I mean, for example, if
A = [[1, 2], [3, 4]]
then I would want something like np.sum(A, axis=1), just for multiplication:
np.mul(A, axis=1) = [2, 12]
np.prod is what you're looking for.
import numpy as np
a = np.array([[1, 2], [3, 4]])
print(np.prod(a, axis=1))  # prints [ 2 12]
Use numpy.prod, exactly as you describe, i.e.
import numpy as np
A = [[1, 2], [3, 4]]
np.prod(A, axis=1) # Gives [ 2 12]
The multiply function is a universal function (ufunc), so you could do:
import numpy as np
A = np.array([[1, 2], [3, 4]])
result = np.multiply.reduce(A, axis=1)
print(result)
Output
[ 2 12]
Read the numpy documentation on ufunc.reduce for more details.
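As a quick illustration of how general ufunc.reduce is (my own example, not from the answer above), the same pattern with np.add reproduces np.sum:
import numpy as np
A = np.array([[1, 2], [3, 4]])
np.add.reduce(A, axis=1)       # array([3, 7]), same as np.sum(A, axis=1)
np.multiply.reduce(A, axis=1)  # array([ 2, 12]), same as np.prod(A, axis=1)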
I am trying to build a matrix like:
M = [[1, 1, ..., 1],
     [2, 2, ..., 2],
     ...
     [40000, 40000, ..., 40000]]
Here's what I tried:
data = np.mat((40000,8))
print(data.shape)
for i in range(data.shape[0]):
data[i,:] = i
print(data[:5])
The above code prints:
(1, 2)
[[0 0]]
I know how to fill a matrix with constant values, but I couldn't find a similar question for this case.
Use a simple array and don't forget that Python starts indexing at 0:
data = np.zeros((40000,8))
for i in range(data.shape[0]):
data[i,:] = i+1
Here's a way using numpy:
rows = 10
cols = 3
l = np.arange(1,rows)
np.tile(l,cols).reshape(cols,rows-1).T
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4],
[5, 5, 5],
[6, 6, 6],
[7, 7, 7],
[8, 8, 8],
[9, 9, 9]])
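Applied to the sizes in the question, the same pattern would look like this (my own sketch following the answer above):
rows = 40001
cols = 8
l = np.arange(1, rows)
M = np.tile(l, cols).reshape(cols, rows - 1).T   # shape (40000, 8); row i holds the value i+1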
Matthieu Brucher's answer will do perfectly for your case. If you are looking at row counts much higher than 40000 and if time is an issue, you might want to get rid of the for-loop and create a list of lists with a list comprehension before turning it into a numpy array:
a = [[i]*8 for i in range(1, 40001)]
m = np.asarray(a)
In my case, this solution was ~7 times faster.
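For reference, one way to reproduce that comparison yourself (a rough sketch; the exact ratio will depend on your machine and numpy version):
import numpy as np
from timeit import timeit

def loop_fill():
    # original loop-based approach
    data = np.zeros((40000, 8))
    for i in range(data.shape[0]):
        data[i, :] = i + 1
    return data

def list_comp():
    # list-comprehension approach
    return np.asarray([[i] * 8 for i in range(1, 40001)])

print(timeit(loop_fill, number=10))
print(timeit(list_comp, number=10))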
To use numpy broadcasting instead of iteration, you can do:
import numpy as np
M = np.ones((40000, 8), dtype=int).T * np.arange(1, 40001)
M = M.T
print(M)
This should be faster than the iterative approaches above, if that's what you are looking for.
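A variant of the same broadcasting idea that avoids allocating the ones array (my own sketch, not part of the answer above): broadcast a column of row numbers across the 8 columns and copy to get a writable array.
import numpy as np
M = np.broadcast_to(np.arange(1, 40001)[:, None], (40000, 8)).copy()
print(M[:3])
# [[1 1 1 1 1 1 1 1]
#  [2 2 2 2 2 2 2 2]
#  [3 3 3 3 3 3 3 3]]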
Very simple:
data = np.arange(1, 40001).repeat(8).reshape(-1,8)
Though this is pure numpy as well, it is considerably slower than @yatu's solution.
In Python I need to combine two 2-dimensional numpy arrays, so that the resulting rows are combinations of the rows from the input arrays concatenated together. I need the fastest solution, as it will be used on very large arrays.
For example:
I got:
import numpy as np
array1 = np.array([[1,2],[3,4]])
array2 = np.array([[5,6],[7,8]])
I want the code to return:
[[1,2,5,6]
[1,2,7,8]
[3,4,5,6]
[3,4,7,8]]
Solution using numpy's repeat, tile and hstack
The snippet
result = np.hstack([
np.repeat(array1, array2.shape[0], axis=0),
np.tile(array2, (array1.shape[0], 1))
])
Step by step explanation
We start with the two arrays, array1 and array2:
import numpy as np
array1 = np.array([[1,2],[3,4]])
array2 = np.array([[5,6],[7,8]])
First, we duplicate the content of array1 using repeat:
a = np.repeat(array1, array2.shape[0], axis=0)
The content of a is:
array([[1, 2],
[1, 2],
[3, 4],
[3, 4]])
Then we repeat the second array, array2, using tile. In particular, the shape argument (array1.shape[0], 1) replicates array2 array1.shape[0] times along the first axis and just once along the second.
b = np.tile(array2, (array1.shape[0],1))
The result is:
array([[5, 6],
[7, 8],
[5, 6],
[7, 8]])
Now we can just proceed to stack the two results, using hstack:
result = np.hstack([a,b])
Achieving the desired output:
array([[1, 2, 5, 6],
[1, 2, 7, 8],
[3, 4, 5, 6],
[3, 4, 7, 8]])
For this small example, itertools.product is actually faster; I don't know how it scales.
import itertools
alist = list(itertools.product(array1.tolist(), array2.tolist()))
np.array(alist).reshape(-1, 4)
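One way to check how the two approaches scale (a rough sketch with arbitrary sizes; timings will vary):
import itertools
import numpy as np
from timeit import timeit

array1 = np.random.randint(0, 10, (200, 2))
array2 = np.random.randint(0, 10, (200, 2))

def with_numpy():
    return np.hstack([
        np.repeat(array1, array2.shape[0], axis=0),
        np.tile(array2, (array1.shape[0], 1)),
    ])

def with_product():
    alist = list(itertools.product(array1.tolist(), array2.tolist()))
    return np.array(alist).reshape(-1, 4)

print(timeit(with_numpy, number=100))    # repeat/tile/hstack version
print(timeit(with_product, number=100))  # itertools.product version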
I have a long 1D array. I'd like to create a 2D array where each row is the result of np.arange() starting at the corresponding value, with length equal to some constant. E.g. if the constant is 3 and my array looks like
[1,2,3,4,5]
I'd like to get
[[1,2,3]
[2,3,4]
[3,4,5]
[4,5,6]
[5,6,7]]
np.arange() only accepts scalars as arguments. I played around with np.vectorize() a bit, without success. Clearly I could do this with a loop, or with lists and then convert to an array, but I was wondering if there's a good numpy-only solution.
You could use addition and broadcasting:
>>> x = np.array([1,2,3,4,5])
>>> constant = 3
>>> x[:,None] + np.arange(constant)
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6],
[5, 6, 7]])
This could also be written as np.add.outer(x, np.arange(constant)).
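For completeness, that form produces the same array (shown here with the same x and constant as above):
>>> np.add.outer(x, np.arange(constant))
array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6],
       [5, 6, 7]])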
I tried to use numpy.apply_along_axis, but this seems to work only when the applied function collapses the dimension and not when it expands it.
Example:
def dup(x):
return np.array([x, x])
a = np.array([1,2,3])
np.apply_along_axis(dup, axis=0, arr=a) # This doesn't work
I was expecting the matrix below (notice how its dimension has expanded from the input matrix a):
np.array([[1, 1], [2, 2], [3, 3]])
In R, this would be accomplished by the **ply set of functions from the plyr package. How to do it with numpy?
If you just want to repeat the elements you can use np.repeat:
>>> np.repeat(a,2).reshape(3,2)
array([[1, 1],
[2, 2],
[3, 3]])
And to apply a function, use np.frompyfunc; to convert the result to a proper integer array, use np.vstack:
>>> def dup(x):
... return np.array([x, x])
>>> oct_array = np.frompyfunc(dup, 1, 1)
>>> oct_array(a)
array([array([1, 1]), array([2, 2]), array([3, 3])], dtype=object)
>>> np.vstack(oct_array(a))
array([[1, 1],
[2, 2],
[3, 3]])
For someone used to general Python code, a list comprehension may be the simplest approach:
In [20]: np.array([dup(x) for x in a])
Out[20]:
array([[1, 1],
[2, 2],
[3, 3]])
The comprehension (a loop or mapping that applies dup to each element of a) returns [array([1, 1]), array([2, 2]), array([3, 3])], which is easily turned into a 2d array with np.array().
At least for this small a, it is also faster than the np.frompyfunc approach. The np.frompyfunc function will give full access to broadcasting, but evidently it doesn't apply any fast iteration tricks.
apply_along_axis can help keep indices straight when dealing with many dimensions, but it is still just an iteration method. It's written in Python, so you can study its code yourself. It is much more complicated than needed for this simple case.
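To illustrate the broadcasting point: because frompyfunc returns a real ufunc, the oct_array from the earlier answer also accepts higher-dimensional input elementwise (a small sketch reusing that name):
>>> A = np.array([[1, 2], [3, 4]])
>>> oct_array(A)
array([[array([1, 1]), array([2, 2])],
       [array([3, 3]), array([4, 4])]], dtype=object)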
In order for your example to work as expected, a should be 2-dimensional:
def dup(x):
# x is now an array of size 1
return np.array([ x[0], x[0] ])
a = np.array([[1,2,3]]) # 2dim
np.apply_along_axis(dup, axis=0, arr=a)
=>
array([[1, 2, 3],
[1, 2, 3]])
Of course, you probably want to transpose the result.
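i.e., with the arrays defined above:
np.apply_along_axis(dup, axis=0, arr=a).T
=>
array([[1, 1],
       [2, 2],
       [3, 3]])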