I am writing a Python function that takes a matrix A and returns a matrix B with the rows and columns swapped. For example,
if I enter this matrix:
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
Should return
1 5 9 13
2 6 10 14
3 7 11 15
4 8 12 16
but what I get is:
array([[ 1, 5, 9, 13],
[ 5, 6, 10, 14],
[ 9, 10, 11, 15],
[13, 14, 15, 16]])
I don't understand why. Could someone help me understand this error and how to solve it?
My code:
def transpose(matrix):
    for i in range(matrix.shape[0]):
        for j in range(matrix.shape[1]):
            matrix[i][j] = matrix[j][i]
    return matrix
(I can't use built-in functions like transpose; I have to write the code myself.)
This line
matrix[i][j] = matrix[j][i]
is your issue.
For example, when i = 1 and j = 2, you set matrix[1][2] to 10 because matrix[2][1] is 10. When the loop later reaches i = 2 and j = 1, you set matrix[2][1] to matrix[1][2], which is now 10 even though it was originally 7. The loop overwrites values in place, so the original value is already lost by the time it is needed.
How you write the function depends on whether you want it to mutate the original matrix or to return a new matrix with the changed values and leave the original untouched.
To mutate the original
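import numpy

def transpose(matrix):
    # Read from a snapshot so earlier writes cannot corrupt later reads.
    # Works for square matrices only.
    matrix2 = numpy.copy(matrix)
    for i in range(matrix.shape[0]):
        for j in range(matrix.shape[1]):
            matrix[i][j] = matrix2[j][i]
    return matrix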
To return a new array
import numpy

def transpose(matrix):
    # Write into a copy and leave the input untouched.
    # Works for square matrices only.
    matrix2 = numpy.copy(matrix)
    for i in range(matrix.shape[0]):
        for j in range(matrix.shape[1]):
            matrix2[i][j] = matrix[j][i]
    return matrix2
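If the matrix is square, the in-place version does not need a full copy: swapping each pair of mirrored entries exactly once gives the same result. The following is a minimal sketch of that idea; the name transpose_in_place is made up for illustration, and it is written against plain nested lists (the same [i][j] indexing should also work on a square NumPy array).

```python
def transpose_in_place(matrix):
    """Transpose a square matrix in place by swapping mirrored entries."""
    n = len(matrix)
    for i in range(n):
        # Start at i + 1 so each off-diagonal pair is swapped exactly once.
        for j in range(i + 1, n):
            matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]
    return matrix


grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
transpose_in_place(grid)
for row in grid:
    print(row)
```

The tuple assignment evaluates both right-hand values before writing either one, which is what the original loop was missing.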
zip can do this for you. Unpack the outer list so that each sublist is passed to zip as a separate argument:
lst = [
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
]
transposed = list(zip(*lst))
for i in transposed:
    print(i)
output:
(1, 5, 9, 13)
(2, 6, 10, 14)
(3, 7, 11, 15)
(4, 8, 12, 16)
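Note that zip yields tuples, as the output above shows. If you need a list of lists back, a small wrapper can convert each column; this is only a sketch, and the helper name transpose_lists is hypothetical.

```python
def transpose_lists(rows):
    """Return the transpose of a list of equal-length lists as a list of lists."""
    # zip(*rows) groups the k-th element of every row into one tuple.
    return [list(column) for column in zip(*rows)]


lst = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
result = transpose_lists(lst)
for row in result:
    print(row)
```

This also works for non-square input, as long as every row has the same length (zip silently truncates to the shortest row otherwise).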
You can use numpy.transpose to transpose a matrix.
Your code does not work because it performs the following assignments in the loop:
matrix[0][2] = matrix[2][0] # writes 9
...
matrix[2][0] = matrix[0][2] # writes 9 instead of 3 because matrix[0][2] has previously been updated
So to fix this you can use an intermediate variable like output_matrix in this example:
import numpy as np

def transpose(matrix):
    # Separate output buffer with the same shape and dtype (square input assumed).
    output_matrix = np.zeros_like(matrix)
    for i in range(matrix.shape[0]):
        for j in range(matrix.shape[1]):
            output_matrix[i][j] = matrix[j][i]
    return output_matrix
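The versions above index the output with the input's shape, so they assume a square matrix. A sketch that also handles rectangular input allocates the output with the two dimensions reversed; the name transpose_any is an assumption for illustration.

```python
import numpy as np


def transpose_any(matrix):
    """Transpose a 2-D array of any shape without calling numpy.transpose."""
    rows, cols = matrix.shape
    # The result has the dimensions swapped: (cols, rows).
    out = np.empty((cols, rows), dtype=matrix.dtype)
    for i in range(rows):
        for j in range(cols):
            out[j, i] = matrix[i, j]
    return out


m = np.arange(6).reshape(2, 3)
t = transpose_any(m)
print(t)
```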
How do I make a multiplication chart with nested lists and for loops? I need to multiply every number from the first list by every number from the second list.
chart = [
    [],
    [],
]
for i in range(1,len(chart)+1):
    for j in range(i,i*len(chart)+1):
        print(f'{i} * {j} = {i*j}')
In Python, list indices start at 0.
The chart list contains 2 lists:
chart[0] = [1, 2, 3, 4, 5]
chart[1] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
What you want to do is access the elements of the first list and multiply by elements of the second list.
for i in range(len(chart[0])): # range(5) => 0, 1, 2, 3, 4
    for j in range(len(chart[1])): # range(10) => 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
        print(f'{chart[0][i]} * {chart[1][j]} = {chart[0][i] * chart[1][j]}')
itertools.product will produce the desired output, creating 2-tuples consisting of one element from the first list and one element from the second list:
import itertools
chart = [
[1,2,3,4,5],
[1,2,3,4,5,6,7,8,9,10],
]
for i, j in itertools.product(*chart):
    print(f'{i} * {j} = {i*j}')
Use a cartesian product:
>>> chart = [[1,2,3,4,5],[1,2,3,4,5,6,7,8,9,10]]
>>> print('\n'.join([f'{i} * {j} = {i*j}' for i in chart[0] for j in chart[1]]))
1 * 1 = 1
1 * 2 = 2
1 * 3 = 3
1 * 4 = 4
...
4 * 10 = 40
5 * 1 = 5
5 * 2 = 10
5 * 3 = 15
5 * 4 = 20
5 * 5 = 25
5 * 6 = 30
5 * 7 = 35
5 * 8 = 40
5 * 9 = 45
5 * 10 = 50
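The answers above print each product. If the goal is to keep the chart as a nested list instead, one possible sketch is a nested list comprehension; the function name multiplication_table is hypothetical.

```python
def multiplication_table(first, second):
    """Return a nested list where table[i][j] == first[i] * second[j]."""
    return [[x * y for y in second] for x in first]


chart = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
]
table = multiplication_table(chart[0], chart[1])
for row in table:
    print(row)
```

Each inner list is one row of the chart, so table[4][9] holds 5 * 10.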
Normally: a * b == np.multiply(a, b)
but in this case :
a=np.matrix(([1,2,3],[1,2,3],[1,2,3]))
b = np.array(([1,2,3]))
print( a.dot(b))
print(np.multiply(a,b))
print(a * b)
I have a problem:
[[14 14 14]]
[[1 4 9]
[1 4 9]
[1 4 9]]
Traceback (most recent call last):
File "C:\Users\FAROUQ\.spyder-py3\untitled0.py", line 28, in <module>
print(a * b)
ValueError: shapes (3,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
Could you please explain why?
np.dot() computes the dot product (scalar product) of two vectors, and the matrix product when given matrices. If you don't know what the dot product is, I recommend learning about it (Linear Algebra / Calculus).
Algebraically, the dot product is the sum of the products of the corresponding entries of the two sequences of numbers.
For example:
a = [3, 2, 4]
b = [5, 7, 6]
np.dot(a, b) = 3 * 5 + 2 * 7 + 4 * 6 = 15 + 14 + 24 = 53
* on NumPy arrays (np.array) is the element-wise product.
For example:
a = np.array([3, 2, 4])
b = np.array([5, 7, 6])
a * b = [3 * 5, 2 * 7, 4 * 6] = [15, 14, 24]
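The two definitions can be checked without NumPy. This small sketch uses plain lists with the same numbers as the examples above; the variable names are only illustrative.

```python
a = [3, 2, 4]
b = [5, 7, 6]

# Element-wise product: multiply entries at the same position.
elementwise = [x * y for x, y in zip(a, b)]

# Dot product: the sum of the element-wise products.
dot = sum(elementwise)

print(elementwise)
print(dot)
```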
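You should replace np.matrix with np.array so that a * b gives the same result as np.multiply(a, b). With np.matrix, * means matrix multiplication: b is treated as a 1x3 matrix, and a 3x3 matrix cannot be multiplied by a 1x3 matrix, which is the shape error in the traceback.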
a = np.array(([1,2,3],[1,2,3],[1,2,3]))
b = np.array(([1,2,3]))
print( a.dot(b))
print(np.multiply(a,b))
print(a * b)
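To see where the shape error comes from without relying on np.matrix, the sketch below reproduces the same situation with plain arrays: the @ operator is matrix multiplication, and reshaping b to 1x3 mimics how a matrix-style product would treat it. The names b_row and failed are assumptions for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
b = np.array([1, 2, 3])

elementwise = a * b   # b is broadcast across every row of a
dot = a @ b           # matrix-vector product, same values as a.dot(b)

# A (3, 3) by (1, 3) matrix product is not defined, so this raises.
b_row = b.reshape(1, 3)
try:
    a @ b_row
    failed = False
except ValueError:
    failed = True

# Using the column vector (3, 1) makes the shapes line up again.
column_result = a @ b_row.T

print(elementwise)
print(dot)
print(failed)
```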
Additional Information
1. Element-wise product: a * b and np.multiply(a, b) are the same for np.array inputs.
2. Dot product: np.dot(a, b) or a.dot(b).
Examples:
A very simple example just for understanding.
I have the following pandas dataframe:
import pandas as pd
df = pd.DataFrame({'A':pd.Series([1, 2, 13, 14, 25, 26, 37, 38])})
df
A
0 1
1 2
2 13
3 14
4 25
5 26
6 37
7 38
Set n = 3
First example
How can I efficiently build a new dataframe df1 like the following?
D1 D2 D3 T
0 1 2 13 14
1 2 13 14 25
2 13 14 25 26
3 14 25 26 37
4 25 26 37 38
Hint: think of the first n columns as the data (Dx) and the last column as the target (T). In the first example, the target (e.g. 25) depends on the preceding n elements (2, 13, 14).
Second example
What if the target is some element ahead (e.g.+3)?
D1 D2 D3 T
0 1 2 13 26
1 2 13 14 37
2 13 14 25 38
Thank you for your help,
Gilberto
P.S. If you think that the title can be improved, please suggest me how to modify it.
Update
Thanks to @Divakar and this post the rolling function can be defined as:
import numpy as np
def rolling(a, window):
    shape = (a.size - window + 1, window)
    strides = (a.itemsize, a.itemsize)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
a = np.arange(1000000000)
b = rolling(a, 4)
In less than 1 second!
Let's see how we can solve it with NumPy tools. Suppose the column data is a NumPy array called a. For sliding-window operations like this, NumPy offers a very efficient tool in strides: the windows are views into the input array, so no copies are made.
Let's directly use the methods with the sample data and start with case #1 -
In [29]: a # Input data
Out[29]: array([ 1, 2, 13, 14, 25, 26, 37, 38])
In [30]: m = a.strides[0] # Get strides
In [31]: n = 3 # parameter
In [32]: nrows = a.size - n # Get number of rows in o/p
In [33]: a2D = np.lib.stride_tricks.as_strided(a,shape=(nrows,n+1),strides=(m,m))
In [34]: a2D
Out[34]:
array([[ 1, 2, 13, 14],
[ 2, 13, 14, 25],
[13, 14, 25, 26],
[14, 25, 26, 37],
[25, 26, 37, 38]])
In [35]: np.may_share_memory(a,a2D)
Out[35]: True # a2D is a view into a
Case #2 would be similar with an additional parameter for the Target column -
In [36]: n2 = 3 # Additional param
In [37]: nrows = a.size - n - n2 + 1
In [38]: part1 = np.lib.stride_tricks.as_strided(a,shape=(nrows,n),strides=(m,m))
In [39]: part1 # These are D1, D2, D3, etc.
Out[39]:
array([[ 1, 2, 13],
[ 2, 13, 14],
[13, 14, 25]])
In [43]: part2 = a[n+n2-1:] # This is target col
In [44]: part2
Out[44]: array([26, 37, 38])
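Both cases can be folded into one helper. The sketch below is not the answer's code: it swaps the manual as_strided call for numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later), and the function name windows_with_target and its ahead parameter are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def windows_with_target(a, n, ahead=1):
    """Return (data, target): windows of n values and the value `ahead` steps later."""
    a = np.asarray(a)
    # Every run of n consecutive values, as a read-only view (no copy).
    data = sliding_window_view(a, n)
    # The window starting at index i ends at i + n - 1; its target sits `ahead` further.
    target = a[n - 1 + ahead:]
    # Drop trailing windows that have no target.
    return data[:len(target)], target


a = np.array([1, 2, 13, 14, 25, 26, 37, 38])
data1, target1 = windows_with_target(a, n=3, ahead=1)
data2, target2 = windows_with_target(a, n=3, ahead=3)
print(np.column_stack([data1, target1]))
print(np.column_stack([data2, target2]))
```

With ahead=1 this reproduces the first example, and with ahead=3 the second. The stacked array can then be handed to a DataFrame constructor with the column names D1, D2, D3, T.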
I found another method: view_as_windows
import numpy as np
from skimage.util.shape import view_as_windows
window_shape = (4, )
aa = np.arange(1000000000) # 1 billion!
bb = view_as_windows(aa, window_shape)
bb
array([[ 0, 1, 2, 3],
[ 1, 2, 3, 4],
[ 2, 3, 4, 5],
...,
[999999994, 999999995, 999999996, 999999997],
[999999995, 999999996, 999999997, 999999998],
[999999996, 999999997, 999999998, 999999999]])
Around 1 second.
What do you think?
I have the following array:
import numpy as np
a = np.array([[ 1, 2, 3],
[ 1, 2, 3],
[ 1, 2, 3]])
I understand that np.random.shuffle(a.T) will shuffle the array along the row, but what I need is for it to shuffle each row independently. How can this be done in NumPy? Speed is critical as there will be several million rows.
For this specific problem, each row will contain the same starting population.
import numpy as np
np.random.seed(2018)
def scramble(a, axis=-1):
    """
    Return an array with the values of `a` shuffled along the given axis.
    The same random permutation is applied to every slice.
    """
    b = a.swapaxes(axis, -1)
    n = a.shape[axis]
    idx = np.random.choice(n, n, replace=False)
    b = b[..., idx]
    return b.swapaxes(axis, -1)
a = np.arange(4*9).reshape(4, 9)
# array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
# [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
# [18, 19, 20, 21, 22, 23, 24, 25, 26],
# [27, 28, 29, 30, 31, 32, 33, 34, 35]])
print(scramble(a, axis=1))
yields
[[ 3 8 7 0 4 5 1 2 6]
[12 17 16 9 13 14 10 11 15]
[21 26 25 18 22 23 19 20 24]
[30 35 34 27 31 32 28 29 33]]
while scrambling along the 0-axis:
print(scramble(a, axis=0))
yields
[[18 19 20 21 22 23 24 25 26]
[ 0 1 2 3 4 5 6 7 8]
[27 28 29 30 31 32 33 34 35]
[ 9 10 11 12 13 14 15 16 17]]
This works by first swapping the target axis with the last axis:
b = a.swapaxes(axis, -1)
This is a common trick used to standardize code which deals with one axis.
It reduces the general case to the specific case of dealing with the last axis.
Since in NumPy version 1.10 or higher swapaxes returns a view, there is no copying involved and so calling swapaxes is very quick.
Now we can generate a new index order for the last axis:
n = a.shape[axis]
idx = np.random.choice(n, n, replace=False)
Now we can reorder b along the last axis (the same permutation idx is applied to every row):
b = b[..., idx]
and then reverse the swapaxes to return an a-shaped result:
return b.swapaxes(axis, -1)
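As the printed output shows, scramble moves every row with the same permutation. If each row needs its own permutation, one common vectorized approach is to draw a random key per element, argsort the keys along the axis, and gather with np.take_along_axis. This is a sketch of that different technique, not the answer's method; the function name shuffle_rows_independently is hypothetical.

```python
import numpy as np


def shuffle_rows_independently(a, rng=None):
    """Return a copy of `a` with each row permuted on its own (last axis)."""
    rng = np.random.default_rng() if rng is None else rng
    # One random key per element; sorting the keys of a row yields
    # a random permutation of that row's column indices.
    keys = rng.random(a.shape)
    idx = np.argsort(keys, axis=-1)
    return np.take_along_axis(a, idx, axis=-1)


a = np.arange(4 * 9).reshape(4, 9)
shuffled = shuffle_rows_independently(a, np.random.default_rng(0))
print(shuffled)
```

The cost is a sort per row, which is usually acceptable for short rows even when there are millions of them.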
If you don't want a return value and want to operate on the array directly, you can specify the indices to shuffle.
>>> import numpy as np
>>>
>>>
>>> a = np.array([[1,2,3], [1,2,3], [1,2,3]])
>>>
>>> # Shuffle row `2` independently
>>> np.random.shuffle(a[2])
>>> a
array([[1, 2, 3],
[1, 2, 3],
[3, 2, 1]])
>>>
>>> # Shuffle column `0` independently
>>> np.random.shuffle(a[:,0])
>>> a
array([[3, 2, 3],
[1, 2, 3],
[1, 2, 1]])
If you want a return value as well, you can use numpy.random.permutation, in which case replace np.random.shuffle(a[n]) with a[n] = np.random.permutation(a[n]).
Warning: do not write a[n] = np.random.shuffle(a[n]). shuffle works in place and returns None, so the assignment writes None into the row or column you meant to shuffle: a float array ends up filled with nan, and an integer array raises a TypeError.
Good answer above. But I will throw in a quick and dirty way:
a = np.array([[1,2,3], [4,5,6], [7,8,9]])
ignore_list_output = [np.random.shuffle(x) for x in a]
Then, a can be something like this
array([[2, 1, 3],
[4, 6, 5],
[9, 7, 8]])
Not very elegant but you can get this job done with just one short line.
Building on my comment to @Hun's answer, here's the fastest way to do this:
def shuffle_along(X):
    """Minimal in place independent-row shuffler."""
    [np.random.shuffle(x) for x in X]
This works in-place and can only shuffle rows. If you need more options:
def shuffle_along(X, axis=0, inline=False):
    """More elaborate version of the above."""
    if not inline:
        X = X.copy()
    if axis == 0:
        [np.random.shuffle(x) for x in X]
    if axis == 1:
        [np.random.shuffle(x) for x in X.T]
    if not inline:
        return X
This, however, has the limitation of only working on 2d-arrays. For higher dimensional tensors, I would use:
def shuffle_along(X, axis=0, inline=True):
    """Shuffle along any axis of a tensor."""
    if not inline:
        X = X.copy()
    np.apply_along_axis(np.random.shuffle, axis, X) # <-- I just changed this
    if not inline:
        return X
You can do it with NumPy without any loop or extra function, and much faster. For example, we have an array of shape (6, 2) and we want a (2, 2) sub-array with an independent random index for each column.
import numpy as np
test = np.array([[1, 1],
[2, 2],
[0.5, 0.5],
[0.3, 0.3],
[4, 4],
[7, 7]])
id_rnd = np.random.randint(6, size=(2, 2)) # random row indices, drawn with replacement; use choice and range if you don't want replacement
new = np.take_along_axis(test, id_rnd, axis=0)
Out:
array([[2. , 2. ],
[0.5, 2. ]])
It works for any number of dimensions.
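Because randint draws with replacement, a column can pick the same row twice. One way to sample without replacement per column, sketched here under the assumption that sorting random keys is acceptable, is to argsort a random array along axis 0 and keep the first k index rows. The names k and order are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
test = np.array([[1, 1],
                 [2, 2],
                 [0.5, 0.5],
                 [0.3, 0.3],
                 [4, 4],
                 [7, 7]])

k = 2  # number of rows to sample per column
# Each column of `order` is an independent random permutation of 0..5.
order = np.argsort(rng.random(test.shape), axis=0)
id_rnd = order[:k]  # first k indices of each permutation: no repeats within a column
new = np.take_along_axis(test, id_rnd, axis=0)
print(new)
```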
As of NumPy 1.20.0 released in January 2021 we have a permuted() method on the new Generator type (introduced with the new random API in NumPy 1.17.0, released in July 2019). This does exactly what you need:
import numpy as np
rng = np.random.default_rng()
a = np.array([
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
])
shuffled = rng.permuted(a, axis=1)
This gives you something like
>>> print(shuffled)
[[2 3 1]
[1 3 2]
[2 1 3]]
As you can see, the rows are permuted independently. This is in sharp contrast with both rng.permutation() and rng.shuffle().
If you want an in-place update you can pass the original array as the out keyword argument. And you can use the axis keyword argument to choose the direction along which to shuffle your array.
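A short sketch of both keyword arguments mentioned above, using the same array; the variable name by_column is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng()
a = np.array([
    [1, 2, 3],
    [1, 2, 3],
    [1, 2, 3],
])

# In-place update: each row is permuted independently and written back into a.
rng.permuted(a, axis=1, out=a)
print(a)

# axis=0 permutes within each column instead, returning a new array.
by_column = rng.permuted(a, axis=0)
print(by_column)
```

Since every row starts as [1, 2, 3], each row of a still contains exactly those three values after the call, just in a random order.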