How to repeat a numpy array along a new dimension with padding? - python

Given two arrays, an input array and a repeat array, I would like to get an array in which each element is repeated along a new dimension the specified number of times, with each row padded with zeros up to the length of the longest row.
to_repeat = np.array([1, 2, 3, 4, 5, 6])
repeats = np.array([1, 2, 2, 3, 3, 1])
# I want final array to look like the following:
#[[1, 0, 0],
# [2, 2, 0],
# [3, 3, 0],
# [4, 4, 4],
# [5, 5, 5],
# [6, 0, 0]]
The issue is that I'm operating with large datasets (10M or so) so a list comprehension is too slow - what is a fast way to achieve this?

Here's one with masking based on this idea -
m = repeats[:,None] > np.arange(repeats.max())
out = np.zeros(m.shape,dtype=to_repeat.dtype)
out[m] = np.repeat(to_repeat,repeats)
Sample output -
In [44]: out
Out[44]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
Or with broadcasted-multiplication -
In [67]: m*to_repeat[:,None]
Out[67]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
For large datasets, we can use multiple cores and be more memory-efficient by evaluating that broadcasted multiplication with the numexpr module -
In [64]: import numexpr as ne
# Re-using mask `m` from previous method
In [65]: ne.evaluate('m*R',{'m':m,'R':to_repeat[:,None]})
Out[65]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
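For reference, the masking approach above can be wrapped into a small helper. This is only a minimal sketch: the function name `repeat_with_padding` and the `fill` parameter are illustrative and not part of the original answer, and it assumes `repeats` is non-empty.

```python
import numpy as np

def repeat_with_padding(values, repeats, fill=0):
    """Repeat values[i] repeats[i] times along a new axis, padding with fill."""
    values = np.asarray(values)
    repeats = np.asarray(repeats)
    # Row i of the mask is True in its first repeats[i] columns.
    mask = repeats[:, None] > np.arange(repeats.max())
    out = np.full(mask.shape, fill, dtype=values.dtype)
    # Boolean assignment fills the True positions in row-major order,
    # which matches the order produced by np.repeat.
    out[mask] = np.repeat(values, repeats)
    return out

to_repeat = np.array([1, 2, 3, 4, 5, 6])
repeats = np.array([1, 2, 2, 3, 3, 1])
result = repeat_with_padding(to_repeat, repeats)
print(result)
```

Unlike the broadcasted multiplication, the mask assignment also works with a nonzero `fill` value.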

Related

Remove 2d slice from 3d numpy array

I need to remove the last arrays from a 3D numpy cube. I have:
a = np.array(
[[[1,2,3],
[4,5,6],
[7,8,9]],
[[9,8,7],
[6,5,4],
[3,2,1]],
[[0,0,0],
[0,0,0],
[0,0,0]],
[[0,0,0],
[0,0,0],
[0,0,0]]])
How do I remove the arrays with zero sub-arrays like at the bottom side of the cube, using np.delete?
(I cannot simply remove all zero values, because there will be zeros in the data on the top side)
For a 3D cube, you might check all against the last two axes
a = np.asarray(a)
a[~(a==0).all((2,1))]
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
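The question says that all-zero slices at the top of the data must be kept, so here's one way to remove only the trailing all-zero slices. Note that it assumes there is at least one trailing all-zero slice: otherwise argmin() returns 0 and a[:-0] is empty -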
a[:-(a==0).all((1,2))[::-1].argmin()]
Sample run -
In [80]: a
Out[80]:
array([[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]],
[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]]])
In [81]: a[:-(a==0).all((1,2))[::-1].argmin()]
Out[81]:
array([[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
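The one-liner above relies on the array ending in at least one all-zero slice. A hedged sketch of a variant that also covers the cases with no trailing zero slices, or with only zero slices, could look like this (the helper name `trim_trailing_zero_slices` is made up for illustration):

```python
import numpy as np

def trim_trailing_zero_slices(a):
    """Drop all-zero 2D slices from the end of a 3D array only."""
    a = np.asarray(a)
    # True for every slice that holds at least one nonzero value.
    nonzero = a.any(axis=(1, 2))
    if not nonzero.any():
        return a[:0]  # every slice is zero, so nothing is kept
    # Position of the last slice with data, counted from the front.
    last = len(a) - 1 - nonzero[::-1].argmax()
    return a[:last + 1]
```

All-zero slices that come before the last slice with data are left untouched, which is what the question asks for.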
If you know where they are already, the easiest thing to do is slice them off:
a[:-2]
Results in:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Hope this helps,
a_new = []  # create an empty list
for item in a:
    if np.count_nonzero(item) != 0:  # keep only slices that contain a nonzero value
        a_new.append(item)  # append the kept slice to the list
a_new = np.array(a_new)  # build a numpy array from the kept slices
Output:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Use any and select :)
a=np.array([[[1,2,3],
[4,5,6],
[7,8,9]],
[[9,8,7],
[6,5,4],
[3,2,1]],
[[0,0,0],
[0,0,0],
[0,0,0]],
[[0,0,0],
[0,0,0],
[0,0,0]]])
a[a.any(axis=2).any(axis=1)]

convert numpy open mesh to coordinates

I'd like to turn an open mesh returned by the numpy ix_ routine to a list of coordinates
eg, for:
In[1]: m = np.ix_([0, 2, 4], [1, 3])
In[2]: m
Out[2]:
(array([[0],
[2],
[4]]), array([[1, 3]]))
What I would like is:
([0, 1], [0, 3], [2, 1], [2, 3], [4, 1], [4, 3])
I'm pretty sure I could hack it together with some iterating, unpacking and zipping, but I'm sure there must be a smart numpy way of achieving this...
Approach #1 Use np.meshgrid and then stack -
r,c = np.meshgrid(*m)
out = np.column_stack((r.ravel('F'), c.ravel('F') ))
Approach #2 Alternatively, with np.array() and then transposing, reshaping -
np.array(np.meshgrid(*m)).T.reshape(-1,len(m))
For the generic case, with any number of arrays used within np.ix_, here are the modifications needed -
p = np.r_[2:0:-1,3:len(m)+1,0]
out = np.array(np.meshgrid(*m)).transpose(p).reshape(-1,len(m))
Sample runs -
Two arrays case :
In [376]: m = np.ix_([0, 2, 4], [1, 3])
In [377]: p = np.r_[2:0:-1,3:len(m)+1,0]
In [378]: np.array(np.meshgrid(*m)).transpose(p).reshape(-1,len(m))
Out[378]:
array([[0, 1],
[0, 3],
[2, 1],
[2, 3],
[4, 1],
[4, 3]])
Three arrays case :
In [379]: m = np.ix_([0, 2, 4], [1, 3],[6,5,9])
In [380]: p = np.r_[2:0:-1,3:len(m)+1,0]
In [381]: np.array(np.meshgrid(*m)).transpose(p).reshape(-1,len(m))
Out[381]:
array([[0, 1, 6],
[0, 1, 5],
[0, 1, 9],
[0, 3, 6],
[0, 3, 5],
[0, 3, 9],
[2, 1, 6],
[2, 1, 5],
[2, 1, 9],
[2, 3, 6],
[2, 3, 5],
[2, 3, 9],
[4, 1, 6],
[4, 1, 5],
[4, 1, 9],
[4, 3, 6],
[4, 3, 5],
[4, 3, 9]])
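As an alternative to the meshgrid and transpose approach above, the open mesh can also be broadcast to full grids and stacked along a new last axis. This is a different technique from the one in the answer, shown here only as a sketch; the helper name `mesh_to_coords` is hypothetical.

```python
import numpy as np

def mesh_to_coords(mesh):
    """Turn the open mesh returned by np.ix_ into an (N, len(mesh)) coordinate array."""
    # Expand each open-mesh array to the full grid shape.
    grids = np.broadcast_arrays(*mesh)
    # Put the coordinate components on a new last axis, then flatten the grid axes.
    return np.stack(grids, axis=-1).reshape(-1, len(mesh))

m = np.ix_([0, 2, 4], [1, 3])
coords = mesh_to_coords(m)
print(coords)
```

Because the reshape is done in C order, the last input array varies fastest, which gives the same row order as the sample runs above.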

Merging arrays of varying size in Python

Is there an easy way to merge, let's say, n spectra (i.e. arrays of shape (y_n, 2)) with varying lengths y_n into an array (or list) of shape (y_n_max, 2*n) by filling up y_n with zeros if it is
Basically I want to have all spectra next to each other.
For example
a = [[1,2],[2,3],[4,5]]
b = [[6,7],[8,9]]
into
c = [[1,2,6,7],[2,3,8,9],[4,5,0,0]]
Either Array or List would be fine. I guess it comes down to filling up arrays with zeros?
If you're dealing with native Python lists, then you can do:
from itertools import zip_longest
c = [a + b for a, b in zip_longest(a, b, fillvalue=[0, 0])]
You could also do this with extend and zip, without itertools, provided a will always be longer than b. If b could be longer than a, then you could add a bit of logic as well.
a = [[1,2],[2,3],[4,5]]
b = [[6,7],[8,9]]
b.extend([[0,0]]*(len(a)-len(b)))
[x + y for x, y in zip(a, b)]
Trying to generalize the other solutions to multiple lists:
In [114]: a
Out[114]: [[1, 2], [2, 3], [4, 5]]
In [115]: b
Out[115]: [[6, 7], [8, 9]]
In [116]: c
Out[116]: [[3, 4]]
In [117]: d
Out[117]: [[1, 2], [2, 3], [4, 5], [6, 7], [8, 9]]
In [118]: ll=[a,d,c,b]
zip_longest pads
In [120]: [l for l in itertools.zip_longest(*ll,fillvalue=[0,0])]
Out[120]:
[([1, 2], [1, 2], [3, 4], [6, 7]),
([2, 3], [2, 3], [0, 0], [8, 9]),
([4, 5], [4, 5], [0, 0], [0, 0]),
([0, 0], [6, 7], [0, 0], [0, 0]),
([0, 0], [8, 9], [0, 0], [0, 0])]
itertools.chain flattens the inner lists (or use itertools.chain.from_iterable(l))
In [121]: [list(itertools.chain(*l)) for l in _]
Out[121]:
[[1, 2, 1, 2, 3, 4, 6, 7],
[2, 3, 2, 3, 0, 0, 8, 9],
[4, 5, 4, 5, 0, 0, 0, 0],
[0, 0, 6, 7, 0, 0, 0, 0],
[0, 0, 8, 9, 0, 0, 0, 0]]
More ideas at Convert Python sequence to NumPy array, filling missing values
Adapting @Divakar's solution to this case:
def divakars_pad(ll):
    lens = np.array([len(item) for item in ll])  # length of each spectrum
    mask = lens[:,None] > np.arange(lens.max())  # True where a spectrum has data
    out = np.zeros((mask.shape+(2,)), int)
    out[mask,:] = np.concatenate(ll)
    # put the row axis first, then merge the spectra side by side
    out = out.transpose(1,0,2).reshape(lens.max(),-1)
    return out
In [142]: divakars_pad(ll)
Out[142]:
array([[1, 2, 1, 2, 3, 4, 6, 7],
[2, 3, 2, 3, 0, 0, 8, 9],
[4, 5, 4, 5, 0, 0, 0, 0],
[0, 0, 6, 7, 0, 0, 0, 0],
[0, 0, 8, 9, 0, 0, 0, 0]])
For this small size the itertools solution is faster, even with an added conversion to array.
With an array as target we don't need the chain flattener; reshape takes care of that:
In [157]: np.array(list(itertools.zip_longest(*ll,fillvalue=[0,0]))).reshape(-1, len(ll)*2)
Out[157]:
array([[1, 2, 1, 2, 3, 4, 6, 7],
[2, 3, 2, 3, 0, 0, 8, 9],
[4, 5, 4, 5, 0, 0, 0, 0],
[0, 0, 6, 7, 0, 0, 0, 0],
[0, 0, 8, 9, 0, 0, 0, 0]])
Use the zip built-in function and the chain.from_iterable function from itertools. This has the benefit of being more type agnostic than the other posted solution -- it only requires that your spectra are iterables. Note that zip stops at the shortest input, so it does not pad; for spectra of different lengths, use itertools.zip_longest with a fillvalue instead.
from itertools import chain
a = [[1,2],[2,3],[4,5]]
b = [[6,7],[8,9]]
c = list(list(chain.from_iterable(zs)) for zs in zip(a,b))
If you want more than 2 spectra, you can change the zip call to zip(a,b,...)
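Combining the ideas from the answers above, padding with zip_longest and flattening with chain.from_iterable handles any number of spectra of any lengths. The following is a sketch only; the function name `merge_spectra` and the `fill` parameter are illustrative names.

```python
from itertools import chain, zip_longest

def merge_spectra(*spectra, fill=(0, 0)):
    """Place spectra side by side, padding the shorter ones with fill rows."""
    # zip_longest yields one tuple of rows per output row, padded with fill.
    rows = zip_longest(*spectra, fillvalue=fill)
    # chain.from_iterable flattens each tuple of rows into one flat row.
    return [list(chain.from_iterable(row)) for row in rows]

a = [[1, 2], [2, 3], [4, 5]]
b = [[6, 7], [8, 9]]
merged = merge_spectra(a, b)
print(merged)
```

The input lists are not modified, and the order of the arguments decides the order of the column pairs.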

argmax on 2 axis for 3-d numpy array

I'd like to obtain a 1D array of indexes from a 3D matrix.
For instance given x = np.random.randint(10, size=(10,3,3)), I'd like to do something like np.argmax(x, axis=(1,2)) just like you can do with np.max, that is, obtain a 1D array of length 10 containing the indexes (0 to 8) of the maximums of each submatrix of size (3,3).
I have not found anything helpful so far and I want to avoid looping on the first dimension (and use np.argmax(x)) as it is quite big.
Cheers!
Reshape to merge those last two axes and then use np.argmax -
idx = x.reshape(x.shape[0],-1).argmax(-1)
out = np.unravel_index(idx, x.shape[-2:])
Sample run -
In [263]: x = np.random.randint(10, size=(4,3,3))
In [264]: x
Out[264]:
array([[[0, 9, 2],
[7, 7, 8],
[2, 5, 9]],
[[1, 7, 2],
[8, 9, 0],
[2, 8, 3]],
[[7, 5, 0],
[7, 1, 6],
[5, 1, 1]],
[[0, 7, 3],
[5, 4, 1],
[9, 8, 9]]])
In [265]: idx = x.reshape(x.shape[0],-1).argmax(-1)
In [266]: np.unravel_index(idx, x.shape[-2:])
Out[266]: (array([0, 1, 0, 2]), array([1, 1, 0, 0]))
If you meant getting the merged index, then it's simpler -
x.reshape(x.shape[0],-1).argmax(1)
Sample run -
In [283]: x
Out[283]:
array([[[2, 3, 7],
[8, 1, 0],
[3, 6, 9]],
[[8, 0, 5],
[2, 2, 9],
[9, 0, 9]],
[[1, 9, 2],
[5, 0, 3],
[7, 2, 1]],
[[1, 6, 5],
[2, 3, 7],
[7, 4, 6]]])
In [284]: x.reshape(x.shape[0],-1).argmax(1)
Out[284]: array([8, 5, 1, 5])
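Both variants above can be returned from a single helper and checked against np.max over the same two axes. This is a minimal sketch using the first sample array; the name `argmax_last_two_axes` is hypothetical.

```python
import numpy as np

def argmax_last_two_axes(x):
    """Return the flat index and the (row, col) index of the max of each 2D slice."""
    # Merge the last two axes so argmax can work along a single axis.
    flat_idx = x.reshape(x.shape[0], -1).argmax(axis=1)
    # Convert the flat indices back into row and column indices.
    rows, cols = np.unravel_index(flat_idx, x.shape[-2:])
    return flat_idx, rows, cols

x = np.array([[[0, 9, 2], [7, 7, 8], [2, 5, 9]],
              [[1, 7, 2], [8, 9, 0], [2, 8, 3]],
              [[7, 5, 0], [7, 1, 6], [5, 1, 1]],
              [[0, 7, 3], [5, 4, 1], [9, 8, 9]]])
flat_idx, rows, cols = argmax_last_two_axes(x)
# Indexing with the recovered positions gives back the per-slice maxima.
maxima = x[np.arange(x.shape[0]), rows, cols]
```

When a slice contains its maximum more than once, argmax reports the first occurrence in row-major order.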

Python — How can I find the square matrix of a lower triangular numpy matrix? (with a symmetrical upper triangle)

I generated a lower triangular matrix, and I want to complete the matrix using the values in the lower triangular matrix to form a square matrix, symmetrical around the diagonal zeros.
lower_triangle = numpy.array([
[0,0,0,0],
[1,0,0,0],
[2,3,0,0],
[4,5,6,0]])
I want to generate the following complete matrix, maintaining the zero diagonal:
complete_matrix = numpy.array([
[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
Thanks.
You can simply add it to its transpose:
>>> m
array([[0, 0, 0, 0],
[1, 0, 0, 0],
[2, 3, 0, 0],
[4, 5, 6, 0]])
>>> m + m.T
array([[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
You can use the numpy.triu_indices or numpy.tril_indices:
>>> a=np.array([[0, 0, 0, 0],
... [1, 0, 0, 0],
... [2, 3, 0, 0],
... [4, 5, 6, 0]])
>>> irows,icols = np.triu_indices(len(a),1)
>>> a[irows,icols]=a[icols,irows]
>>> a
array([[0, 1, 2, 4],
[1, 0, 3, 5],
[2, 3, 0, 6],
[4, 5, 6, 0]])
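Adding the transpose works here because the diagonal is zero; with a nonzero diagonal, m + m.T would double it. A hedged sketch that mirrors only the strictly lower triangle and keeps the diagonal as it is (the function name `symmetrize_lower` is made up for illustration):

```python
import numpy as np

def symmetrize_lower(m):
    """Mirror the strictly lower triangle of m into the upper triangle."""
    m = np.asarray(m)
    # k=-1 excludes the diagonal, so it is not counted twice.
    strict_lower = np.tril(m, k=-1)
    # Add the diagonal back exactly once.
    return strict_lower + strict_lower.T + np.diag(np.diag(m))

lower_triangle = np.array([[0, 0, 0, 0],
                           [1, 0, 0, 0],
                           [2, 3, 0, 0],
                           [4, 5, 6, 0]])
complete_matrix = symmetrize_lower(lower_triangle)
print(complete_matrix)
```

This returns a new array and ignores whatever the input holds above the diagonal.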
