arr is an n-dimensional numpy array.
How can I change the sign of every element of arr that has an odd sum of indices?
For example, arr[0, 1, 2] needs a sign change because its sum of indices is 0 + 1 + 2 = 3, which is odd.
When I convert arr to a flat list, I notice that every second element in the list is one that needs a sign change.
Another example:
Original array:
[[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]]
[[ 9 10 11]
[12 13 14]
[15 16 17]]
[[18 19 20]
[21 22 23]
[24 25 26]]]
Array with signs changed:
[[[ 0 -1 2]
[ -3 4 -5]
[ 6 -7 8]]
[[ -9 10 -11]
[12 -13 14]
[-15 16 -17]]
[[18 -19 20]
[-21 22 -23]
[24 -25 26]]]
np.negative is slightly faster than multiplying (as it is a ufunc).
N = 5
arr = np.arange(N ** 3).reshape(N, N, N)
%timeit arr.ravel()[1::2] *= -1
%timeit np.negative(arr.ravel()[1::2], out = arr.ravel()[1::2])
The slowest run took 8.74 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.39 µs per loop
The slowest run took 5.57 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.12 µs per loop
N = 25
arr = np.arange(N ** 3).reshape(N, N, N)
%timeit arr.ravel()[1::2] *= -1
%timeit np.negative(arr.ravel()[1::2], out = arr.ravel()[1::2])
The slowest run took 7.03 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 10.8 µs per loop
The slowest run took 5.27 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.63 µs per loop
N = 101
arr = np.arange(N ** 3).reshape(N, N, N)
%timeit arr.ravel()[1::2] *= -1
%timeit np.negative(arr.ravel()[1::2], out = arr.ravel()[1::2])
1000 loops, best of 3: 663 µs per loop
1000 loops, best of 3: 512 µs per loop
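For reference, a minimal sketch applying the np.negative variant in place to the question's 3x3x3 example (this assumes arr is C-contiguous, so that ravel() returns a view rather than a copy):
import numpy as np

arr = np.arange(27).reshape(3, 3, 3)
flat = arr.ravel()                       # a view, since arr is C-contiguous
np.negative(flat[1::2], out=flat[1::2])  # negate every second element in place
print(arr)                               # matches the "Array with signs changed" output above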
Greatly simplified by hpaulj's suggestion.
Program:
import numpy as np
def change_sign(arr):
    """Switch the sign of every second element of arr in place.

    Note
    ----
    Modifies the input array (arr).
    """
    # arr.reshape(-1) makes a 1D view of arr
    #
    # [1::2] selects every other element of arr,
    # starting from the 1st element.
    #
    # *= -1 changes the sign of the selected elements.
    arr.reshape(-1)[1::2] *= -1
    return arr

def main():
    N = 3
    arr = np.arange(N ** 3).reshape(N, N, N)
    print("original array:")
    print(arr)
    print("change signs")
    print(change_sign(arr))

main()
Result:
original array:
[[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]]
[[ 9 10 11]
[12 13 14]
[15 16 17]]
[[18 19 20]
[21 22 23]
[24 25 26]]]
change signs
[[[ 0 -1 2]
[ -3 4 -5]
[ 6 -7 8]]
[[ -9 10 -11]
[ 12 -13 14]
[-15 16 -17]]
[[ 18 -19 20]
[-21 22 -23]
[ 24 -25 26]]]
Related
I need to create a matrix
matrix = np.random.randint(1, 100, size = (3, 3), dtype='l')
and it looks like
10 45 74
59 20 15
86 41 76
and I need to swap the rows that contain the max and min numbers, like this:
86 41 76
59 20 15
10 45 74
How can I do it?
Here is one (boring) solution:
import numpy as np
mat = np.array([
[10, 45, 74],
[59, 20, 15],
[86, 41, 76],
])
max_row_id = mat.max(1).argmax() # row index with maximum element
min_row_id = mat.min(1).argmin() # row index with minimum element
row_idx = np.arange(mat.shape[0]) # index of row ids
row_idx[max_row_id] = min_row_id # swap row ids for rows with min and max elements
row_idx[min_row_id] = max_row_id
result = mat[row_idx,:]
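On the example matrix from the question, this should print the expected swap of the first and last rows:
print(result)
# [[86 41 76]
#  [59 20 15]
#  [10 45 74]]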
I think np.unravel_index, combined with the following advanced indexing, is one of the fastest ways to find and swap the rows:
row_min = np.unravel_index(np.argmin(mat), mat.shape)[0]
row_max = np.unravel_index(np.argmax(mat), mat.shape)[0]
mat[[row_min, row_max]] = mat[[row_max, row_min]]
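Note that, unlike the previous answer, this swaps the rows of mat in place. On the same example matrix it should leave mat as:
print(mat)
# [[86 41 76]
#  [59 20 15]
#  [10 45 74]]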
Benchmarks (Colab):
# matrix (3*3) FASTEST
1000 loops, best of 5: 8.7 µs per loop ------> hilberts_drinking_problem method
1000 loops, best of 5: 14.3 µs per loop
# matrix (1000*3)
100 loops, best of 5: 65 µs per loop
100 loops, best of 5: 21.9 µs per loop ------> This method
# matrix (1000*1000)
100 loops, best of 5: 3.44 ms per loop
100 loops, best of 5: 2.64 ms per loop ------> This method
# matrix (10000*10000)
10 loops, best of 5: 388 ms per loop
10 loops, best of 5: 282 ms per loop ------> This method
The problem is simple: the input is a list of non-container objects (int, str, etc.), all of which appear in one column of a DataFrame. The task is, for each element of the list, to find the corresponding value (only the value, not the array) in another column of the same row.
The problem will be better demonstrated in code:
from pandas import DataFrame
digits = '0123456789abcdef'
df = DataFrame([(a,b) for a, b in zip(digits, range(16))], columns=['hex', 'dec'])
df
df.loc[df.dec == 12, 'hex']
df.loc[df.dec == 12, 'hex'].values[0]
import random
eight = random.sample(range(16), 8)
eight
fun = lambda x: df.loc[df.dec == x, 'hex'].values[0]
''.join(fun(i) for i in eight)
''.join(map(fun, eight))
As you can see, I can already do this, but I am using a for loop and the performance isn't very impressive. Since pandas and numpy are all about vectorization, I wonder: is there a built-in way to do this?
In [1]: from pandas import DataFrame
In [2]: digits = '0123456789abcdef'
In [3]: df = DataFrame([(a,b) for a, b in zip(digits, range(16))], columns=['hex', 'dec'])
In [4]: df
Out[4]:
hex dec
0 0 0
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
10 a 10
11 b 11
12 c 12
13 d 13
14 e 14
15 f 15
In [5]: df.loc[df.dec == 12, 'hex']
Out[5]:
12 c
Name: hex, dtype: object
In [6]: df.loc[df.dec == 12, 'hex'].values[0]
Out[6]: 'c'
In [7]: import random
In [8]: eight = random.sample(range(16), 8)
In [9]: eight
Out[9]: [9, 7, 1, 6, 11, 12, 14, 10]
In [10]: fun = lambda x: df.loc[df.dec == x, 'hex'].values[0]
In [11]: ''.join(fun(i) for i in eight)
Out[11]: '9716bcea'
In [12]: ''.join(map(fun, eight))
Out[12]: '9716bcea'
In [13]: %timeit ''.join(fun(i) for i in eight)
2.34 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [14]: %timeit ''.join(map(fun, eight))
2.34 ms ± 134 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
So what is a vectorized way to achieve the same result as the method demonstrated in the code?
A vectorized way would be to construct a Series:
series = df.set_index('dec')['hex']
''.join(series[eight])
Output: '9716bcea'
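Equivalently, you can make the label-based lookup explicit with .loc (a small variation on the above, assuming the dec values are unique):
series = df.set_index('dec')['hex']
''.join(series.loc[eight])  # one label-based lookup for the whole list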
I have a bunch of dataframes with the structure like below
df = pd.DataFrame(
[[1, 'A', 10], [2, 'A', 20], [3, 'A', 30],
[1, 'B', 20], [2, 'B', 20], [3, 'B', 10],
[1, 'M', 20], [2, 'M', 30], [3, 'M', 30]],
columns=['foo', 'bar', 'buzz']
)
the dataframe is initially sorted by columns bar and foo as one can get from
df.sort_values(['bar', 'foo'])
I need to get the df sorted by foo and bar instead. The obvious solution would be
df.sort_values(['foo', 'bar'])
which gives me
foo bar buzz
0 1 A 10
3 1 B 20
6 1 M 20
1 2 A 20
4 2 B 20
7 2 M 30
2 3 A 30
5 3 B 10
8 3 M 30
but the real-world dataframe contains about 500,000 rows and I have about 3,000 individual dataframes to be processed.
I was wondering if there is a better, more efficient solution which would take into account the fact that the dataframe is already pre-sorted?
You can take advantage of stable sorting here, since bar is already sorted, which means that you only need to re-sort foo.
This should have a consistent impact on runtime at all sizes of DataFrame (I am seeing about a 2x speedup across the board).
Here is an example solution using numpy's argsort, specifying a stable sort.
df.iloc[np.argsort(df['foo'], kind="stable")]
foo bar buzz
0 1 A 10
3 1 B 20
6 1 M 20
1 2 A 20
4 2 B 20
7 2 M 30
2 3 A 30
5 3 B 10
8 3 M 30
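If you prefer to stay in pandas, sort_values also accepts a kind argument, so a stable sort on foo alone should be equivalent (a sketch, not benchmarked here):
df.sort_values("foo", kind="mergesort")  # mergesort is stable, so the existing bar order within ties is preserved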
Performance and Validation
df = pd.DataFrame(
{
"foo": np.random.randint(0, 100, 100_000),
"bar": np.random.choice(list("ABCDEFGHIJKLMNOP"), 100_000),
"buzz": np.random.randint(0, 100, 100_000),
}
).sort_values(["bar", "foo"])
In [42]: %timeit df.iloc[np.argsort(df['foo'], kind="stable")]
3.41 ms ± 22.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [43]: %timeit df.sort_values(["foo", "bar"])
6.95 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [48]: a = df.iloc[np.argsort(df['foo'], kind="stable")]
In [49]: b = df.sort_values(["foo", "bar"])
In [50]: np.all(a == b)
Out[50]: True
I have a large set of N d-dimensional vectors (stored as the rows of a matrix) that I am lifting by taking the self outer product (i.e. each vector times itself). For each vector this produces a symmetric matrix with (d+1) choose 2 unique entries. For the entire data set, this is an N x d x d tensor. I would like to compute only the (d+1) choose 2 unique entries from the lower triangle of each tensor slice and store them in a vector. I want to do this with as small a memory footprint and as fast as possible in Python, including using C bindings.
If you do this using standard numpy methods, it allocates the entirety of each matrix. This is about double the memory complexity of what is actually required.
For a sense of scale here, consider the case where N = 20k and d = 20k. Then N * d^2 * ~8bytes per element = (2*10^4)^3 * 8 bytes = 64 terabytes.
If we only compute the vectors that encode the unique entries, we have (20001 choose 2) * 20k * 8 = 200010000 * 20000 * 8 bytes = 32 terabytes.
Is there a quick way to do this without resorting to slow methods (such as coding my own outer product in python)?
Edit: I'll note that a similar question was asked in Create array of outer products in numpy.
I already know how to compute this using einsum (as in the above question). However, no answer was reached there about doing this without the extra (d choose 2) computations and allocations.
Edit 2:
This thread How to exploit symmetry in outer product in Numpy (or other Python solutions)? asks a related question but does not address the memory complexity. The top answer will still allocate a d x d array for each outer product.
This thread Numpy Performance - Outer Product of a vector with its transpose also addressed computational considerations of a self outer product, but does not reach a memory efficient solution.
Edit 3:
If one wants to allocate the whole array and then extract the elements, np.tril_indices or scipy.spatial.distance.squareform will do the trick.
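For reference, a minimal sketch of that allocate-then-extract route (memory-hungry, so only viable for small N and d; the names and sizes here are just for illustration):
import numpy as np

N, d = 100, 8
X = np.random.rand(N, d)

full = np.einsum('ij,ik->ijk', X, X)  # N x d x d: every self outer product, fully allocated
r, c = np.tril_indices(d)             # lower-triangle indices (including the diagonal)
unique = full[:, r, c]                # N x (d * (d + 1) // 2) unique entries per vector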
Not sure exactly how you want your output, but there's always the option of using Numba:
import numpy as np
import numba as nb
# Computes the unique pairwise products for each row
@nb.njit(parallel=True)
def self_outer_unique(a):
    n, d = a.shape
    out = np.empty((n, (d * d + d) // 2), dtype=a.dtype)
    for i in nb.prange(n):
        for j1 in range(d):
            for j2 in range(j1, d):
                # position of (j1, j2) within the flattened upper triangle of row i
                idx = j1 * (2 * d - j1 + 1) // 2 + j2 - j1
                out[i, idx] = a[i, j1] * a[i, j2]
    return out
This will give you an array with all the unique products on each row (i.e. the flattened upper triangle of the full output).
import numpy as np
a = np.arange(12).reshape(4, 3)
print(a)
# [[ 0 1 2]
# [ 3 4 5]
# [ 6 7 8]
# [ 9 10 11]]
print(self_outer_unique(a))
# [[ 0 0 0 1 2 4]
# [ 9 12 15 16 20 25]
# [ 36 42 48 49 56 64]
# [ 81 90 99 100 110 121]]
Performance-wise, it is faster than computing the full outer product with NumPy, although recreating the full array from this takes longer.
import numpy as np
def np_self_outer(a):
    return a[:, :, np.newaxis] * a[:, np.newaxis, :]

def my_self_outer(a):
    b = self_outer_unique(a)
    n, d = a.shape
    b_full = np.zeros((n, d, d), dtype=a.dtype)
    idx0 = np.arange(n)[:, np.newaxis]
    idx1, idx2 = np.triu_indices(d)
    b_full[idx0, idx1, idx2] = b
    b_full += np.triu(b_full, 1).transpose(0, 2, 1)
    return b_full
n, d = 1000, 100
a = np.arange(n * d).reshape(n, d)
print(np.all(np_self_outer(a) == my_self_outer(a)))
# True
%timeit np_self_outer(a)
# 24.6 ms ± 248 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit self_outer_unique(a)
# 6.32 ms ± 69.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit my_self_outer(a)
# 124 ms ± 770 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
I have 2 numpy arrays of shape (5,1) say:
a=[1,2,3,4,5]
b=[2,4,2,3,6]
How can I make a matrix multiplying each i-th element with each j-th element? Like this:
    a = [ 1,  2,  3,  4,  5]
b
2        2,  4,  6,  8, 10
4        4,  8, 12, 16, 20
2        2,  4,  6,  8, 10
3        3,  6,  9, 12, 15
6        6, 12, 18, 24, 30
Without using for loops? Is there any combination of reshape, reductions, or multiplications that I can use?
Right now I create an a*b tiling of each array along rows and along columns and then multiply element-wise, but it seems to me there must be an easier way.
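Roughly, what I do now is something along these lines (a sketch of the tiling idea, not my exact code):
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([2, 4, 2, 3, 6])

A = np.tile(a, (b.size, 1))           # each row is a copy of a
B = np.tile(b[:, None], (1, a.size))  # each column is a copy of b
c = A * B                             # element-wise product: c[i, j] = b[i] * a[j]
print(c)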
With numpy.outer() and numpy.transpose() routines:
import numpy as np
a = [1,2,3,4,5]
b = [2,4,2,3,6]
c = np.outer(a,b).transpose()
print(c)
Or just with swapped array order:
c = np.outer(b, a)
The output:
[[ 2 4 6 8 10]
[ 4 8 12 16 20]
[ 2 4 6 8 10]
[ 3 6 9 12 15]
[ 6 12 18 24 30]]
For some reason np.multiply.outer seems to be faster than np.outer for small inputs. And broadcasting is faster still - but for bigger arrays they are all pretty much equal.
%timeit np.outer(a,b)
%timeit np.multiply.outer(a,b)
%timeit a[:, None]*b
100000 loops, best of 3: 5.97 µs per loop
100000 loops, best of 3: 3.27 µs per loop
1000000 loops, best of 3: 1.38 µs per loop
a = np.random.randint(0,10,100)
b = np.random.randint(0,10,100)
%timeit np.outer(a,b)
%timeit np.multiply.outer(a,b)
%timeit a[:, None]*b
100000 loops, best of 3: 15.5 µs per loop
100000 loops, best of 3: 14 µs per loop
100000 loops, best of 3: 13.5 µs per loop
a = np.random.randint(0,10,10000)
b = np.random.randint(0,10,10000)
%timeit np.outer(a,b)
%timeit np.multiply.outer(a,b)
%timeit a[:, None]*b
10 loops, best of 3: 154 ms per loop
10 loops, best of 3: 154 ms per loop
10 loops, best of 3: 152 ms per loop