Python: Insert columns into a numpy array based on mask - python

Suppose I have the following data:
mask = np.array([[0, 1, 1, 0, 1]])  # 2D mask
ip_array = np.array([[4, 5, 2],
                     [3, 2, 1],
                     [1, 8, 6]])  # 2D array
I want to insert columns of 0s into ip_array wherever there is a 0 in the mask. The output should be:
[[0, 4, 5, 0, 2],
 [0, 3, 2, 0, 1],
 [0, 1, 8, 0, 6]]
I am new to numpy functions and I am looking for an efficient way to do this. Any help is appreciated!

Here's one way to do it in two steps:
(i) Create an array of zeros of the correct shape (the first dimension of ip_array by the second dimension of mask).
(ii) Use the mask (as a boolean mask) along the second dimension and assign the values of ip_array into the array of zeros.
out = np.zeros((ip_array.shape[0], mask.shape[1]), dtype=int)
out[..., mask[0].astype(bool)] = ip_array
print(out)
Output:
[[0 4 5 0 2]
[0 3 2 0 1]
[0 1 8 0 6]]
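The two steps can be wrapped in a small helper. This is only a minimal sketch: the name insert_zero_columns is hypothetical (it does not come from the answer), and it assumes that the number of 1s in the mask equals the number of columns in ip_array.

```python
import numpy as np

def insert_zero_columns(ip_array, mask):
    """Hypothetical helper: spread the columns of ip_array over the 1-positions of mask."""
    ip_array = np.asarray(ip_array)
    keep = np.asarray(mask, dtype=bool).ravel()  # flatten the mask to 1D booleans
    if keep.sum() != ip_array.shape[1]:
        raise ValueError("number of 1s in mask must equal number of columns in ip_array")
    out = np.zeros((ip_array.shape[0], keep.size), dtype=ip_array.dtype)
    out[:, keep] = ip_array  # columns where the mask is 0 stay zero
    return out

result = insert_zero_columns([[4, 5, 2], [3, 2, 1], [1, 8, 6]], [[0, 1, 1, 0, 1]])
print(result)
```

Taking the dtype from ip_array keeps integer input as integers and float input as floats.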

Here is another approach: index with a cumulative-sum mask into the input padded with an extra column of zeros. The cumsum mask holds the 1-based column index of ip_array wherever the mask is 1, and 0 wherever a column of zeros should be added. Because the concatenated array has an extra leading column of zeros, indexing with 0 yields a column of zeros.
m = (mask.cumsum()*mask)[0]
# array([0, 1, 2, 0, 3])
np.c_[np.zeros(ip_array.shape[0]), ip_array][:,m].astype(int)
# array([[0, 4, 5, 0, 2],
# [0, 3, 2, 0, 1],
# [0, 1, 8, 0, 6]])
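To make the intermediate values visible, the same idea can be spelled out step by step. This is a sketch with hypothetical variable names (col_index, padded, via_cumsum, via_mask), cross-checked against a plain boolean-mask assignment:

```python
import numpy as np

mask = np.array([[0, 1, 1, 0, 1]])
ip_array = np.array([[4, 5, 2], [3, 2, 1], [1, 8, 6]])

# 1-based source column wherever the mask is 1, 0 wherever a zero column is wanted
col_index = (mask.cumsum() * mask)[0]

# Prepend one column of zeros so that index 0 selects zeros
zero_col = np.zeros((ip_array.shape[0], 1), dtype=ip_array.dtype)
padded = np.hstack([zero_col, ip_array])

via_cumsum = padded[:, col_index]

# Cross-check against assigning through a boolean mask
via_mask = np.zeros((ip_array.shape[0], mask.shape[1]), dtype=ip_array.dtype)
via_mask[:, mask[0].astype(bool)] = ip_array
print(np.array_equal(via_cumsum, via_mask))
```

Building the zero column with the dtype of ip_array avoids the float round trip and the final astype(int).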

A parameterized solution that takes a different route from the accepted answer, which may make it easier to follow.
Only the last line matters for the operation itself.
import numpy
import random
n1 = 5
n2 = 5
r = 0.7
random.seed(1)
a = numpy.array([[0 if random.random() > r else 1 for _ in range(n1)]])
n3 = numpy.count_nonzero(a)
b = numpy.array([[random.randint(1,9) for _ in range(n3)] for _ in range(n2)])
c = numpy.zeros((n2, n1))
c[:, numpy.where(a)[1]] = b[:]
Result:
a = array([[1, 0, 0, 1, 1]])
b = array([[8, 8, 7],
[4, 2, 8],
[1, 7, 7],
[1, 8, 5],
[4, 2, 6]])
c = array([[8., 0., 0., 8., 7.],
[4., 0., 0., 2., 8.],
[1., 0., 0., 7., 7.],
[1., 0., 0., 8., 5.],
[4., 0., 0., 2., 6.]])
Here is how the processing time depends on the n values, measured with this code:
import numpy
import random
import time
import matplotlib.pyplot as plt
n1 = 5
n2 = 5
r = 0.7
def main(n1, n2):
    print()
    print(f"{n1 = }")
    print(f"{n2 = }")
    random.seed(1)
    a = numpy.array([[0 if random.random() > r else 1 for _ in range(n1)]])
    n3 = numpy.count_nonzero(a)
    b = numpy.array([[random.randint(1,9) for _ in range(n3)] for _ in range(n2)])
    # time only the allocation and the masked assignment
    t0 = time.time()
    c = numpy.zeros((n2, n1))
    c[:, numpy.where(a)[1]] = b[:]
    t = time.time() - t0
    print(f"{t = }")
    return t
t1 = [main(10**i, 10) for i in range(1, 8)]
t2 = [main(10, 10**i) for i in range(1, 8)]
plt.plot(t1, label="n1 time process evolution")
plt.plot(t2, label="n2 time process evolution")
plt.xlabel("n-values (log)")
plt.ylabel("Time processing (s)")
plt.title("Insert columns into a numpy array based on mask")
plt.legend()
plt.show()
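A single time.time() difference can be noisy for short operations. As an alternative sketch (not the measurement used above), the standard-library timeit module can repeat the call and report the best run; fill_by_mask and the array sizes here are hypothetical choices for illustration.

```python
import timeit
import numpy as np

def fill_by_mask(a, b):
    """Scatter the columns of b to the positions where the (1, n) mask a is nonzero."""
    c = np.zeros((b.shape[0], a.shape[1]))
    c[:, np.where(a)[1]] = b
    return c

rng = np.random.default_rng(1)
a = (rng.random((1, 1000)) <= 0.7).astype(int)            # mask with roughly 70% ones
b = rng.integers(1, 10, size=(10, np.count_nonzero(a)))   # one column per 1 in the mask

# best of 5 repeats, each running the operation 100 times
best = min(timeit.repeat(lambda: fill_by_mask(a, b), number=100, repeat=5)) / 100
print(f"{best:.2e} s per call")
```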

import numpy as np
mask = np.array([0, 1, 1, 0, 1])
# extract the indices of the zeros
mask_pos = list(np.where(mask == 0)[0])
ip_array = np.array([[4, 5, 2],
                     [3, 2, 1],
                     [1, 8, 6]])
# insert a column of 0s at each respective mask position
# (ascending order, so each insert shifts the later columns into place)
for i in mask_pos:
    ip_array = np.insert(ip_array, i, 0, axis=1)
print(ip_array)
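The loop can also be collapsed into one np.insert call. This is a sketch, with hypothetical names zero_pos and orig_pos: when np.insert receives several indices it interprets them relative to the original array, so each output position has to be shifted back by the number of zero columns that precede it.

```python
import numpy as np

mask = np.array([0, 1, 1, 0, 1])
ip_array = np.array([[4, 5, 2], [3, 2, 1], [1, 8, 6]])

zero_pos = np.where(mask == 0)[0]               # positions in the output: [0, 3]
# shift each position back by the number of zero columns before it
orig_pos = zero_pos - np.arange(zero_pos.size)  # positions in ip_array: [0, 2]
out = np.insert(ip_array, orig_pos, 0, axis=1)
print(out)
```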

Related

Split self-intersecting linestring into non-self-intersecting linestrings

I have a list of coordinates defining a line string that might intersect with itself:
coordinates = [
[0, 3],
[0, 5],
[4, 5],
[4, 0],
[0, 0],
[0, 5],
[2, 5]
]
How can I split the linestring into smaller linestrings so that none of them intersects itself? Requirements:
the smallest possible number of linestrings
the linestrings should have as equal a number of coordinates as possible
The desired outcome in this case would be:
line0 = [
[0, 3],
[0, 5],
[4, 5],
[4, 0]
]
line1 = [
[4, 0],
[0, 0],
[0, 5],
[2, 5]
]
My attempt
In my attempt so far I construct an intersection matrix using Shapely Linestrings to find the intersections:
from shapely.geometry import LineString
from itertools import combinations
import numpy as np
def get_intersection_matrix(coordinates):
    # one LineString per pair of consecutive coordinates
    linestrings = [
        (ix, LineString([c0, c1]))
        for ix, (c0, c1) in enumerate(zip(coordinates[:-1], coordinates[1:]))
    ]
    M = np.zeros((len(linestrings), len(linestrings)))
    for (ix0, ls0), (ix1, ls1) in combinations(linestrings, 2):
        if abs(ix0 - ix1) == 1:  # ignore connecting segments
            continue
        if ls0.intersects(ls1):
            M[ix0, ix1], M[ix1, ix0] = 1, 1
    return M
which outputs what I call the "intersection matrix":
>> get_intersection_matrix(coordinates)
array([[0, 0, 0, 0, 1, 1],
[0, 0, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0]])
You can read it as follows (segments numbered from 1):
segment 1 intersects segments 5 and 6
segment 2 intersects segments 5 and 6
segment 5 intersects segments 1 and 2
segment 6 intersects segments 1 and 2
I also think that the number of "intersection clusters" indicates the number of linestrings: no_clusters + 1.
This is how I solve it now. I changed my intersection matrix so that the value is 1 where there is no intersection and 0 where there is an intersection.
def get_intersection_matrix(coordinates):
    linestrings = [
        (ix, LineString([c0, c1]))
        for ix, (c0, c1) in enumerate(zip(coordinates[:-1], coordinates[1:]))
    ]
    M = np.ones((len(linestrings), len(linestrings)))
    for (ix0, ls0), (ix1, ls1) in combinations(linestrings, 2):
        if abs(ix0 - ix1) == 1:  # ignore connecting segments
            continue
        if ls0.intersects(ls1):
            M[ix0, ix1], M[ix1, ix0] = 0, 0
    return M
>> M = get_intersection_matrix(coordinates)
>> M
array([[1., 1., 1., 1., 0., 0.],
[1., 1., 1., 1., 0., 0.],
[1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1.],
[0., 0., 1., 1., 1., 1.],
[0., 0., 1., 1., 1., 1.]])
Any combination of split indexes is given by itertools.combinations(range(1, len(M)), nr_split_ixs), which also guarantees ix1 < ix2 < ... < ixn.
With one split index you get two square blocks on the diagonal that must not contain any 0s, and the split can be optimized by minimizing the sum of the blocks.
For example, split_ix = 4 is a legal (but not the best) split: the sum of the two blocks is 16 + 4 = 20.
split_ix = 3 is a better legal (no zeros) split: the sum of the two blocks is 9 + 9 = 18.
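The block sums quoted here can be reproduced directly from the matrix. This is a small sketch with the matrix hard-coded; score_split is a hypothetical helper that handles a single split index only.

```python
import numpy as np

# Inverted intersection matrix (1 = no intersection, 0 = intersection)
M = np.array([
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
])

def score_split(M, split_ix):
    """Hypothetical helper: return (is_legal, score) for one split index."""
    blocks = [M[:split_ix, :split_ix], M[split_ix:, split_ix:]]
    is_legal = all((block > 0).all() for block in blocks)  # no zeros in any block
    return is_legal, sum(int(block.sum()) for block in blocks)

for split_ix in range(1, len(M)):
    print(split_ix, score_split(M, split_ix))
```

Equal-sized blocks give the smallest sum, which is why minimizing it favors linestrings of similar length.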
The method to calculate the scored split indexes:
def get_scored_split_ixs_combination(M, nr_split_ixs):
    ixs_scores = []
    for ixs in combinations(range(1, len(M)), nr_split_ixs):
        splitted_matrix = [
            M[i0:i1, i0:i1] for i0, i1 in zip((0, *ixs), (*ixs, len(M)))
        ]
        # check that no matrix contains zeros
        if not all([(m > 0).all() for m in splitted_matrix]):
            # illegal ixs combination
            continue
        ixs_scores.append((ixs, sum([m.sum() for m in splitted_matrix])))
    return ixs_scores
If the returned list is empty, there are no legal options and you should increase the number of splits.
Now return the best split option by incrementing the number of splits:
def get_best_split_ixs_combination(M):
    nr_split_ixs = 0
    while True:
        ixs_scores = get_scored_split_ixs_combination(M, nr_split_ixs)
        if ixs_scores:
            # lowest score wins
            return min(ixs_scores, key=lambda x: x[1])[0]
        nr_split_ixs += 1
>> get_best_split_ixs_combination(M)
(3,)
And finally wrap it all together:
def get_non_intersecting_linestrings(coordinates):
    M = get_intersection_matrix(coordinates)
    # combinations() yields tuples, so convert to a list before concatenating
    split_indexes = list(get_best_split_ixs_combination(M))
    return [
        coordinates[i1:i2]
        for i1, i2 in zip([0] + split_indexes, split_indexes + [len(coordinates)])
    ]
>> get_non_intersecting_linestrings(coordinates)
[[[0, 3], [0, 5], [4, 5]], [[4, 0], [0, 0], [0, 5], [2, 5]]]
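Note that the split indexes refer to segments while the final slicing is applied to coordinates, so the output above drops the vertex shared by the two pieces. If the pieces should share their boundary vertex, as in the desired outcome of the question, the slicing could be sketched as follows (split_coordinates is a hypothetical name, not part of the answer):

```python
def split_coordinates(coordinates, split_indexes):
    """Hypothetical helper: cut a coordinate list at segment indexes, repeating the shared vertex."""
    bounds = [0, *split_indexes, len(coordinates) - 1]
    # segment k joins coordinates[k] and coordinates[k + 1], so segments i1..i2-1
    # span coordinates i1..i2 inclusive
    return [coordinates[i1:i2 + 1] for i1, i2 in zip(bounds[:-1], bounds[1:])]

coordinates = [[0, 3], [0, 5], [4, 5], [4, 0], [0, 0], [0, 5], [2, 5]]
line0, line1 = split_coordinates(coordinates, (3,))
print(line0)  # [[0, 3], [0, 5], [4, 5], [4, 0]]
print(line1)  # [[4, 0], [0, 0], [0, 5], [2, 5]]
```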

Expand a 2D array into a 3D array with specific length

I have a 100x100 numpy array to which I want to add a third dimension of length 3 holding [1, 0, 1].
I'm trying to do this without a for loop if possible.
I tried all sorts of things like np.newaxis, but it only adds a dimension of length 1, which then can't be populated with an array of length 3.
Thank you
Depending on what you want, you have a few options:
import numpy as np
arr = np.random.random((100, 100))
some_numbers = [1, 0, 1]
# A
new_arr = np.empty(arr.shape + (3,))
new_arr[..., :] = some_numbers
# array([[[1., 0., 1.],
# [1., 0., 1.],
# [1., 0., 1.],
# ...,
# A2
new_arr = np.empty(arr.shape + (len(some_numbers) + 1,))
new_arr[..., 0] = arr
new_arr[..., 1:] = some_numbers
# array([[[0.2853, 1., 0., 1.],
# [0.7324, 1., 0., 1.],
# [0.0706, 1., 0., 1.],
# ...,
# B
new_arr = np.empty(arr.shape + (3,))
new_arr[..., :] = arr[..., np.newaxis]
# C
new_arr = np.repeat(arr[..., np.newaxis], 3, axis=-1)
# array([[[0.2853, 0.2853, 0.2853],
# [0.7324, 0.7324, 0.7324],
# [0.0706, 0.0706, 0.0706],
# ...,
In case A every position of new_arr holds [1, 0, 1]; the values of arr are not used, only its shape.
In case A2 you keep the original array at new_arr[:, :, 0] and fill the remaining planes new_arr[:, :, 1:] with the values of some_numbers, one per plane.
In case B and case C you repeat the 100x100 array 3 times along the new third dimension.
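For case A, NumPy's broadcasting can also build the result without allocating an empty array first. This is a sketch rather than part of the answer above; np.broadcast_to returns a read-only view, so a copy is needed if the result will be modified.

```python
import numpy as np

arr = np.random.random((100, 100))
some_numbers = np.array([1, 0, 1])

# Read-only view of shape (100, 100, 3); no data is copied
view = np.broadcast_to(some_numbers, arr.shape + some_numbers.shape)

# Writable array with the same contents
new_arr = view.copy()
print(new_arr.shape)
```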
As I understood, you want to generate a 3-D array:
the first "layer" filled with ones,
the second "layer" filled with zeroes,
the third "layer" filled again with ones,
all "layers" with dimension 100 * 100.
For readability, I changed your assumptions:
the third "layer" filled with 2,
all "layers" with dimension 5 * 5.
Step 1: Create each 2-D array (layer in the target array):
arr1 = np.ones((5,5), dtype=int)
arr2 = np.zeros((5,5), dtype=int)
arr3 = np.full((5,5), 2)
Step 2: Create the target array:
res = np.stack((arr1, arr2, arr3), axis=2)
When you print res.shape, you will get:
(5, 5, 3)
(5 rows, 5 columns, 3 layers)
To see each "layer" separately, run res[:, :, n] where n is either
0, 1 or 2. E.g. for n == 2 (the last layer) I got:
array([[2, 2, 2, 2, 2],
[2, 2, 2, 2, 2],
[2, 2, 2, 2, 2],
[2, 2, 2, 2, 2],
[2, 2, 2, 2, 2]])
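The same stacking works for the original 100 * 100 case with the values [1, 0, 1]. A minimal sketch, with hypothetical names values and shape, that builds the layers in a list comprehension instead of one variable per layer:

```python
import numpy as np

values = [1, 0, 1]   # one fill value per layer, as in the question
shape = (100, 100)

# one constant 2-D layer per value, stacked along a new last axis
res = np.stack([np.full(shape, v) for v in values], axis=2)
print(res.shape)     # (100, 100, 3)
print(res[0, 0])     # [1 0 1]
```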

Numpy replacing elements based on logic and value in an identically shaped array [duplicate]

I have 2 numpy arrays. One is filled with boolean values and the other with numerical values.
How would I perform logic on the numerical array that also depends on the current value in the boolean array?
e.g. if true and > 5 then make the value false
matrix1
matrix2
newMatrix = matrix1 > 5 where matrix2 value is false
Please note that these arrays have the same shape, e.g.
[[0, 1, 1],
[1, 0, 0]]
and
[[3, 1, 0],
[6, 2, 6]]
The result I would like is a new boolean matrix that is true where the value in the boolean array is true and the corresponding value in the numerical array is greater than 5, e.g.
[[0, 0, 0],
[1, 0, 0]]
newMatrix = np.logical_and(matrix2 == 0, matrix1 > 5 )
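This works element-wise: it applies 'and' to each pair of booleans from matrix2 == 0 and matrix1 > 5. Note that an expression such as matrix1 > 5 generates a matrix of boolean values. The condition follows the pseudocode in the question (matrix2 value is false); to reproduce the example result, where the boolean value must be true, use matrix2 == 1 instead.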
If you want 0,1 instead of False,True, you can add +0 to the result:
newMatrix = np.logical_and(matrix2 == 0, matrix1 > 5 ) + 0
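The question's pseudocode and its example output point at two different conditions, so a sketch showing both side by side may help. The names values and flags are hypothetical stand-ins for the numerical and the boolean array; the & and ~ operators are the element-wise equivalents of np.logical_and and np.logical_not for boolean arrays.

```python
import numpy as np

values = np.array([[3, 1, 0],
                   [6, 2, 6]])              # numerical array (hypothetical name)
flags = np.array([[0, 1, 1],
                  [1, 0, 0]], dtype=bool)   # boolean array (hypothetical name)

both = flags & (values > 5)          # true where the flag is set AND the value is > 5
flag_unset = ~flags & (values > 5)   # true where the flag is NOT set and the value is > 5

print(both.astype(int))
print(flag_unset.astype(int))
```

Here both reproduces the example result from the question, while flag_unset corresponds to the "matrix2 value is false" reading.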
The clearest way:
import numpy as np
matrix1 = np.array([[3, 1, 0],
[6, 2, 6]])
matrix2 = np.array([[0, 1, 1],
[1, 0, 0]])
r, c = matrix1.shape
res = np.zeros((r, c))
# visit every cell and set 1 where both conditions hold
for i in range(r):
    for j in range(c):
        if matrix1[i, j] > 5 and matrix2[i, j] == 1:
            res[i, j] = 1
Result:
array([[0., 0., 0.],
[1., 0., 0.]])
A fancier way, using numpy.where():
import numpy as np
matrix1 = np.array([[3, 1, 0],
[6, 2, 6]])
matrix2 = np.array([[0, 1, 1],
[1, 0, 0]])
r,c = matrix1.shape
res = np.zeros((r,c))
res[np.where((matrix1>5) & (matrix2==1))]=1
Result:
array([[0., 0., 0.],
[1., 0., 0.]])

Duplicate array dimension with numpy (without np.repeat)

I'd like to duplicate a dimension of a numpy array, but in such a way that the sum of the original array and the sum of the array with the duplicated dimension stay the same. For instance, consider an n x m array (a) which I'd like to convert to an n x n x m array (b), so that a[i,j] == b[i,i,j]. Unfortunately np.repeat and np.resize are not suitable for this job. Is there another numpy function I could use, or is this possible with some creative indexing?
>>> import numpy as np
>>> a = np.asarray([1, 2, 3])
>>> a
array([1, 2, 3])
>>> a.shape
(3,)
# This is not what I want...
>>> np.resize(a, (3, 3))
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
In the above example, I would like to get this result:
array([[1, 0, 0],
[0, 2, 0],
[0, 0, 3]])
To go from a 1-D to a 2-D array, you can use np.diagflat, which creates a two-dimensional array with the flattened input as its diagonal:
import numpy as np
a = np.asarray([1, 2, 3])
np.diagflat(a)
#array([[1, 0, 0],
# [0, 2, 0],
# [0, 0, 3]])
More generally, you can create a zeros array and assign values in place with advanced indexing:
a = np.asarray([[1, 2, 3], [4, 5, 6]])
result = np.zeros((a.shape[0],) + a.shape)
idx = np.arange(a.shape[0])
result[idx, idx, :] = a
result
#array([[[ 1., 2., 3.],
# [ 0., 0., 0.]],
# [[ 0., 0., 0.],
# [ 4., 5., 6.]]])
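As an alternative sketch (not from the answer above), the same result can be obtained without index arrays by multiplying with an identity matrix through broadcasting. It keeps the dtype of a and lets the two properties asked for in the question be checked directly.

```python
import numpy as np

a = np.asarray([[1, 2, 3], [4, 5, 6]])
n = a.shape[0]

# eye[i, k] is 1 only when i == k, so b[i, k, j] = a[i, j] if i == k else 0
b = np.eye(n, dtype=a.dtype)[:, :, np.newaxis] * a[:, np.newaxis, :]

idx = np.arange(n)
print(b.shape)                          # (2, 2, 3)
print(np.array_equal(b[idx, idx], a))   # True
print(b.sum() == a.sum())               # True
```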

Initializing a N x M matrix in python

I'm trying to learn Python. As an exercise, I'm trying to dynamically generate an N x M matrix where each cell contains the index value of that cell.
The matrix would look like:
[0,1,2,3,4
0,1,2,3,4
...]
I know that in java it would go something like:
a={}{}
for (i=0;i<N;i++)
for (j=0;j<M:j++)
a[i][j] = i
Where N is the width of the matrix and M is the height of the matrix
Except in python it seems like I can't iterate on a matrix on the basis of the cell placement, rather I need to iterate on the basis of the elements in the cell. From my experience something like
a = [][]
a = np.zeroes((N, M))
[ 0, 0, 0
0, 0, 0]
in the case where N = 3, and M = 2
and then the same style of a loop:
j = 0
for i in len(a):
a[i][j] = i
if i == len(a):
j = j+1
doesn't work because python can't iterate on the basis of the places of the elements. Perhaps I am wrong. Would this work? Is there a better way to make such a matrix and fill it with the indexed values?
Since you're already using NumPy, you could use numpy.arange and numpy.tile:
In [26]: N = 5
In [27]: M = 4
In [28]: np.tile(np.arange(N), (M, 1))
Out[28]:
array([[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]])
Another option is to create a row using np.arange(5) and assign it to every row of a zeros matrix.
In [22]: m = np.zeros((4,5))
In [23]: m[:,] = np.arange(5)
In [24]: m
Out[24]:
array([[ 0., 1., 2., 3., 4.],
[ 0., 1., 2., 3., 4.],
[ 0., 1., 2., 3., 4.],
[ 0., 1., 2., 3., 4.]])
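If both the row and the column index of each cell are of interest, np.indices returns them together. A short sketch, using the same N and M as the examples above (the names rows, cols and same are hypothetical):

```python
import numpy as np

N, M = 5, 4   # N columns, M rows

rows, cols = np.indices((M, N))   # rows[i, j] == i, cols[i, j] == j
print(cols)

# The column-index matrix is also a single row broadcast to M rows
same = np.broadcast_to(np.arange(N), (M, N))
print(np.array_equal(cols, same))
```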
An example similar to your Java one, but with Python syntactic sugar.
>>> N=M=5
>>> for z in [[n for n in range(N)] for m in range(M)]:
...     print(z)
...
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
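Here is code in which the matrix contains the index value of each cell:
n, m = map(int, input().split())
a = [[0] * m for _ in range(n)]  # independent rows; n*[m*[0]] would repeat one shared row
# fill every cell with its column index
for i in range(n):
    for j in range(m):
        a[i][j] = j
# print the matrix row by row
for i in range(n):
    for j in range(m):
        print(a[i][j], end=" ")
    print()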
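A pitfall to keep in mind when building such a matrix from plain lists: multiplying a nested list repeats references to one inner list instead of copying it. It happens not to matter when every row receives the same values, as here, but it does as soon as the rows differ. A small illustration (the names shared and independent are hypothetical):

```python
n, m = 2, 3

shared = n * [m * [0]]                      # n references to the SAME inner list
independent = [[0] * m for _ in range(n)]   # n separate inner lists

shared[0][0] = 9
independent[0][0] = 9

print(shared)       # [[9, 0, 0], [9, 0, 0]]
print(independent)  # [[9, 0, 0], [0, 0, 0]]
```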
