Given two arrays, an input array and a repeat array, I would like to get an array in which each input element is repeated along a new dimension the specified number of times, with each row padded with zeros up to the length of the longest row.
to_repeat = np.array([1, 2, 3, 4, 5, 6])
repeats = np.array([1, 2, 2, 3, 3, 1])
# I want final array to look like the following:
#[[1, 0, 0],
# [2, 2, 0],
# [3, 3, 0],
# [4, 4, 4],
# [5, 5, 5],
# [6, 0, 0]]
The issue is that I'm operating with large datasets (10M or so) so a list comprehension is too slow - what is a fast way to achieve this?
Here's one with masking based on this idea -
m = repeats[:,None] > np.arange(repeats.max())
out = np.zeros(m.shape,dtype=to_repeat.dtype)
out[m] = np.repeat(to_repeat,repeats)
Sample output -
In [44]: out
Out[44]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
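The masking steps above can be wrapped in a small helper. This is only a sketch: the name `repeat_and_pad` and the `fill` parameter are made up for illustration, and it assumes `counts` is non-empty, since `counts.max()` fails on an empty array.

```python
import numpy as np

def repeat_and_pad(values, counts, fill=0):
    """Repeat values[i] counts[i] times along a new axis, padding with `fill`."""
    values = np.asarray(values)
    counts = np.asarray(counts)
    # mask[i, j] is True where column j is still within row i's repeat count
    mask = counts[:, None] > np.arange(counts.max())
    out = np.full(mask.shape, fill, dtype=values.dtype)
    # Boolean assignment fills the True cells in row-major order
    out[mask] = np.repeat(values, counts)
    return out

padded = repeat_and_pad([1, 2, 3, 4, 5, 6], [1, 2, 2, 3, 3, 1])
```

Using `np.full` instead of `np.zeros` makes it possible to pad with a value other than zero.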
Or with broadcasted-multiplication -
In [67]: m*to_repeat[:,None]
Out[67]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
For large datasets/sizes, we can leverage multiple cores and be more memory-efficient by running that broadcasted multiplication through the numexpr module -
In [64]: import numexpr as ne
# Re-using mask `m` from previous method
In [65]: ne.evaluate('m*R',{'m':m,'R':to_repeat[:,None]})
Out[65]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
Is there an easy way to merge, let's say, n spectra (i.e. arrays of shape (y_n, 2)) with varying lengths y_n into an array (or list) of shape (y_n_max, 2*x) by filling up y_n with zeros if it is
Basically I want to have all spectra next to each other.
For example
a = [[1,2],[2,3],[4,5]]
b = [[6,7],[8,9]]
into
c = [[1,2,6,7],[2,3,8,9],[4,5,0,0]]
Either Array or List would be fine. I guess it comes down to filling up arrays with zeros?
If you're dealing with native Python lists, then you can do:
from itertools import zip_longest
# x and y are the paired rows; a missing row is replaced by [0, 0]
c = [x + y for x, y in zip_longest(a, b, fillvalue=[0, 0])]
You could also do this with extend and zip, without itertools, provided a is always at least as long as b. If b could be longer than a, then you could add a bit of logic to handle that as well.
a = [[1,2],[2,3],[4,5]]
b = [[6,7],[8,9]]
b.extend([[0,0]]*(len(a)-len(b)))
[x + y for x, y in zip(a, b)]
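The "bit of logic" for the case where either list may be the longer one could look like the sketch below. The helper name `merge_padded` and its `fill` parameter are hypothetical, and it builds padded copies rather than extending the inputs in place.

```python
def merge_padded(a, b, fill=(0, 0)):
    """Concatenate rows of a and b pairwise, padding whichever list is shorter."""
    n = max(len(a), len(b))
    # a + [...] creates new lists, so the caller's a and b are left untouched
    a_pad = a + [list(fill)] * (n - len(a))
    b_pad = b + [list(fill)] * (n - len(b))
    return [x + y for x, y in zip(a_pad, b_pad)]

a = [[1, 2], [2, 3], [4, 5]]
b = [[6, 7], [8, 9]]
merged = merge_padded(a, b)
swapped = merge_padded(b, a)
```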
Trying to generalize the other solutions to multiple lists:
In [114]: a
Out[114]: [[1, 2], [2, 3], [4, 5]]
In [115]: b
Out[115]: [[6, 7], [8, 9]]
In [116]: c
Out[116]: [[3, 4]]
In [117]: d
Out[117]: [[1, 2], [2, 3], [4, 5], [6, 7], [8, 9]]
In [118]: ll=[a,d,c,b]
zip_longest pads
In [120]: [l for l in itertools.zip_longest(*ll,fillvalue=[0,0])]
Out[120]:
[([1, 2], [1, 2], [3, 4], [6, 7]),
([2, 3], [2, 3], [0, 0], [8, 9]),
([4, 5], [4, 5], [0, 0], [0, 0]),
([0, 0], [6, 7], [0, 0], [0, 0]),
([0, 0], [8, 9], [0, 0], [0, 0])]
itertools.chain flattens the inner lists (or .from_iterable(l))
In [121]: [list(itertools.chain(*l)) for l in _]
Out[121]:
[[1, 2, 1, 2, 3, 4, 6, 7],
[2, 3, 2, 3, 0, 0, 8, 9],
[4, 5, 4, 5, 0, 0, 0, 0],
[0, 0, 6, 7, 0, 0, 0, 0],
[0, 0, 8, 9, 0, 0, 0, 0]]
More ideas at Convert Python sequence to NumPy array, filling missing values
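Adapting @Divakar's solution to this case:
def divakars_pad(ll):
    # Number of rows in each spectrum
    lens = np.array([len(item) for item in ll])
    # mask[i, j] is True where spectrum i has a row j
    mask = lens[:,None] > np.arange(lens.max())
    out = np.zeros((mask.shape+(2,)), int)
    out[mask,:] = np.concatenate(ll)
    # Put the row index first, then lay the spectra side by side
    out = out.transpose(1,0,2).reshape(lens.max(),-1)
    return out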
In [142]: divakars_pad(ll)
Out[142]:
array([[1, 2, 1, 2, 3, 4, 6, 7],
[2, 3, 2, 3, 0, 0, 8, 9],
[4, 5, 4, 5, 0, 0, 0, 0],
[0, 0, 6, 7, 0, 0, 0, 0],
[0, 0, 8, 9, 0, 0, 0, 0]])
For this small size the itertools solution is faster, even with an added conversion to array.
With an array as target we don't need the chain flattener; reshape takes care of that:
In [157]: np.array(list(itertools.zip_longest(*ll,fillvalue=[0,0]))).reshape(-1, len(ll)*2)
Out[157]:
array([[1, 2, 1, 2, 3, 4, 6, 7],
[2, 3, 2, 3, 0, 0, 8, 9],
[4, 5, 4, 5, 0, 0, 0, 0],
[0, 0, 6, 7, 0, 0, 0, 0],
[0, 0, 8, 9, 0, 0, 0, 0]])
Use the zip built-in function and the chain.from_iterable function from itertools. This has the benefit of being more type agnostic than the other posted solution -- it only requires that your spectra are iterables.
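from itertools import chain
a = [[1,2],[2,3],[4,5]]
b = [[6,7],[8,9]]
# Note: zip stops at the shortest input; itertools.zip_longest(a, b, fillvalue=[0, 0]) pads instead
c = [list(chain.from_iterable(zs)) for zs in zip(a,b)]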
If you want more than 2 spectra, you can change the zip call to zip(a,b,...)
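For any number of spectra with zero padding, the same chain.from_iterable idea can be combined with zip_longest. A minimal sketch; the function name `merge_spectra` and the `fill` parameter are made up for illustration:

```python
from itertools import chain, zip_longest

def merge_spectra(*spectra, fill=(0, 0)):
    """Place spectra side by side, padding the shorter ones with `fill` rows."""
    # zip_longest yields one tuple of rows per output line, using `fill` for gaps
    rows = zip_longest(*spectra, fillvalue=fill)
    # chain.from_iterable flattens each tuple of (x, y) pairs into one flat row
    return [list(chain.from_iterable(row)) for row in rows]

merged = merge_spectra([[1, 2], [2, 3], [4, 5]], [[6, 7], [8, 9]], [[3, 4]])
```

Because it only iterates over its inputs, this should work for lists, tuples, or 2D arrays alike.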
So let's say I have two arrays (numpy arrays that is):
array1 =
[[[1, 0, 0], [0, 6, 0], [3, 0, 0]],
[[0, 2, 4], [0, 4, 0], [0, 4, 0]],
[[0, 0, 2], [1, 3, 2], [3, 4, 0]]]
and
array2 =
[[[2, 4, 0], [0, 4, 0], [3, 0, 0]],
[[0, 0, 3], [1, 4, 3], [2, 4, 3]],
[[0, 0, 1], [0, 2, 1], [1, 0, 2]]]
I then make a function like:
def array_calc(x,y,z):
    return x*y+z
What I would like to do now is have the x-values come from array1 and y-values from array2, and z-values just a constant I choose (let's say z = 0), and then do the calculation on each entry of the arrays, and ultimately end up with a new array, where the calculation has been done, and I get something like:
array_result =
[[[2, 0, 0], [0, 24, 0], [9, 0, 0]],
[[0, 0, 12], [0, 16, 0], [0, 16, 0]],
[[0, 0, 2], [0, 6, 2], [3, 0, 0]]]
But, I'm not quite sure how that is done.
If your arrays are numpy arrays, it is as simple as:
import numpy as np
x = np.array([[1,0],[0,1]])
y = np.array([[4,1],[0,2]])
z = 1
result = x*y + z
# result = array([[5, 1], [1, 3]])
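Applied to the 3D arrays from the question with z = 0, the same element-wise expression could be used like this (a sketch reusing the question's names `array_calc`, `array1`, and `array2`):

```python
import numpy as np

def array_calc(x, y, z):
    """Element-wise x * y + z; NumPy broadcasts the scalar z over every entry."""
    return x * y + z

array1 = np.array([[[1, 0, 0], [0, 6, 0], [3, 0, 0]],
                   [[0, 2, 4], [0, 4, 0], [0, 4, 0]],
                   [[0, 0, 2], [1, 3, 2], [3, 4, 0]]])
array2 = np.array([[[2, 4, 0], [0, 4, 0], [3, 0, 0]],
                   [[0, 0, 3], [1, 4, 3], [2, 4, 3]],
                   [[0, 0, 1], [0, 2, 1], [1, 0, 2]]])

array_result = array_calc(array1, array2, 0)
```

No explicit loops are needed because the arithmetic operators already act on each pair of corresponding entries.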
Using simple for loops:
import numpy as np
def array_calc(x, y, z):
    """Returns x * y + z with x and y 3D Numpy arrays and z a number"""
    new_arr = x.copy()
    for i in np.arange(x.shape[0]):
        for k in np.arange(x.shape[1]):
            for j in np.arange(x.shape[2]):
                new_arr[i, k, j] = x[i, k, j] * y[i, k, j] + z
    return new_arr
With:
array1 = np.array([[[1, 0, 0], [0, 6, 0], [3, 0, 0]],
[[0, 2, 4], [0, 4, 0], [0, 4, 0]],
[[0, 0, 2], [1, 3, 2], [3, 4, 0]]])
array2 = np.array([[[2, 4, 0], [0, 4, 0], [3, 0, 0]],
[[0, 0, 3], [1, 4, 3], [2, 4, 3]],
[[0, 0, 1], [0, 2, 1], [1, 0, 2]]])
Calling array_calc(array1, array2, 1) returns:
array([[[ 3, 1, 1],
[ 1, 25, 1],
[10, 1, 1]],
[[ 1, 1, 13],
[ 1, 17, 1],
[ 1, 17, 1]],
[[ 1, 1, 3],
[ 1, 7, 3],
[ 4, 1, 1]]])
A way I can think of is to iterate through them and perform your calculations.
This can be done with 3-dimensional arrays too, but I found it easier to flatten them to 2-dimensional lists first. I am sure there are other ways to reduce the complexity further, because nested for loops are not the best solution, but it gets the work done.
The code is here:
from functools import reduce  # reduce is no longer a builtin in Python 3

array1 = [[[1, 0, 0], [0, 6, 0], [3, 0, 0]],[[0, 2, 4], [0, 4, 0], [0, 4, 0]],[[0, 0, 2], [1, 3, 2], [3, 4, 0]]]
array2 = [[[2, 4, 0], [0, 4, 0], [3, 0, 0]], [[0, 0, 3], [1, 4, 3], [2, 4, 3]], [[0, 0, 1], [0, 2, 1], [1, 0, 2]]]
z=0
# Flatten each 3x3x3 nested list into a list of 9 rows
array_1 = reduce(list.__add__, array1)
array_2 = reduce(list.__add__, array2)
array_3 = [[0,0,0] for _ in range(9)]
len_array=9
for i in range(len_array):
    for l in range(3):
        array_3[i][l] = array_1[i][l]*array_2[i][l]+z
print(array_3)
I generated a lower triangular matrix, and I want to complete the matrix using the values in the lower triangular matrix to form a square matrix, symmetrical around the diagonal zeros.
lower_triangle = numpy.array([
[0,0,0,0],
[1,0,0,0],
[2,3,0,0],
[4,5,6,0]])
I want to generate the following complete matrix, maintaining the zero diagonal:
complete_matrix = numpy.array([
[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
Thanks.
You can simply add it to its transpose:
>>> m
array([[0, 0, 0, 0],
[1, 0, 0, 0],
[2, 3, 0, 0],
[4, 5, 6, 0]])
>>> m + m.T
array([[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
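Note that m + m.T only gives the intended result because the diagonal is zero; otherwise the diagonal would be counted twice. A sketch of a variant that also handles a non-zero diagonal (the helper name `symmetrize_lower` is made up for illustration):

```python
import numpy as np

def symmetrize_lower(m):
    """Mirror the lower triangle of m into the upper triangle."""
    lower = np.tril(np.asarray(m))  # ignore anything already above the diagonal
    # lower + lower.T counts the diagonal twice, so subtract it once
    return lower + lower.T - np.diag(np.diag(lower))

m = np.array([[0, 0, 0, 0],
              [1, 0, 0, 0],
              [2, 3, 0, 0],
              [4, 5, 6, 0]])
full = symmetrize_lower(m)
with_diag = symmetrize_lower([[7, 0], [1, 9]])
```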
You can use the numpy.triu_indices or numpy.tril_indices:
>>> a=np.array([[0, 0, 0, 0],
... [1, 0, 0, 0],
... [2, 3, 0, 0],
... [4, 5, 6, 0]])
>>> irows,icols = np.triu_indices(len(a),1)
>>> a[irows,icols]=a[icols,irows]
>>> a
array([[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
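The same copy can be written with numpy.tril_indices, reading from the strictly lower triangle and writing to the mirrored positions. A minimal sketch that works on a copy (the name `sym` is just illustrative) so the original array stays unchanged:

```python
import numpy as np

a = np.array([[0, 0, 0, 0],
              [1, 0, 0, 0],
              [2, 3, 0, 0],
              [4, 5, 6, 0]])

# k=-1 selects the entries strictly below the diagonal
rows, cols = np.tril_indices(len(a), -1)
sym = a.copy()
# Swapping the index order addresses the mirrored upper-triangle positions
sym[cols, rows] = a[rows, cols]
```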
I generated a lower triangular matrix, and I want to complete the matrix using the values in the lower triangular matrix to form a square matrix.
lower_triangle = numpy.array([
[0,0,0,0],
[1,0,0,0],
[2,3,0,0],
[4,5,6,0]])
I want to generate the following complete matrix, maintaining the zero diagonal:
complete_matrix = numpy.array([
[0, 6, 5, 4],
[1, 0, 3, 2],
[2, 3, 0, 1],
[4, 5, 6, 0]])
Thanks.
How about:
>>> m
array([[0, 0, 0, 0],
[1, 0, 0, 0],
[2, 3, 0, 0],
[4, 5, 6, 0]])
>>> np.rot90(m,2)
array([[0, 6, 5, 4],
[0, 0, 3, 2],
[0, 0, 0, 1],
[0, 0, 0, 0]])
>>> m + np.rot90(m, 2)
array([[0, 6, 5, 4],
[1, 0, 3, 2],
[2, 3, 0, 1],
[4, 5, 6, 0]])
See also fliplr(m)[::-1], etc.
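For reference, a 180-degree rotation can be spelled in several equivalent ways. The sketch below compares three of them on the matrix from the question (the names `r1`, `r2`, `r3` are only for illustration):

```python
import numpy as np

m = np.array([[0, 0, 0, 0],
              [1, 0, 0, 0],
              [2, 3, 0, 0],
              [4, 5, 6, 0]])

r1 = np.rot90(m, 2)        # rotate by 90 degrees twice
r2 = np.fliplr(m)[::-1]    # reverse the columns, then reverse the rows
r3 = m[::-1, ::-1]         # reverse both axes with plain slicing

completed = m + r3
```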
Without any addition:
>>> a=np.array([[0, 0, 0, 0],
... [1, 0, 0, 0],
... [2, 3, 0, 0],
... [4, 5, 6, 0]])
>>> irows,icols = np.triu_indices(len(a),1)
>>> a[irows,icols]=a[icols,irows]
>>> a
array([[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])