How to compare diagonally opposed elements of a multidimensional array with numpy - python

I have a non-symmetrical matrix and I would like to compare diagonally opposed elements as follows:
if the diagonally opposed elements are equal but opposite in sign, keep the absolute value of one element and zero the diagonally opposed value;
if that is not the case, then one of the two elements is 0 (but we don't know which one), so take the absolute value of both.
Once this is done, transpose the lower triangle of the matrix and add it to the upper triangle.
I came up with the following Python loop:
for i in range(0, number_files):
    for j in range(0, len(Identifier)):
        for k in range(0, len(Identifier)):
            if Matrix[i][j][k] == -Matrix[i][k][j]:
                Matrix[i][j][k] = abs(Matrix[i][j][k])
                Matrix[i][k][j] = 0
            else:
                Matrix[i][j][k] = abs(Matrix[i][j][k])  # one of these two
                Matrix[i][k][j] = abs(Matrix[i][k][j])  # values is 0
    Matrix[i] = np.tril(Matrix[i], 0).transpose() + np.triu(Matrix[i], 0)
However, this is very slow and I was wondering how I could improve it with numpy.
I know I can generate a test, for example, with:
test=np.isclose(Matrix.transpose(),-Matrix)
which will return a boolean matrix, but I do not know how to proceed with that.
Many thanks in advance for your help

Let's start by creating a sample matrix:
>>> a = np.random.randint(-3, 3, 100).reshape(10,10)
Getting its upper and lower triangles:
>>> triu = np.triu(a)
>>> tril = np.tril(a)
Note that triu and tril are the same size as a, but filled with zeros outside the triangle.
Define the triangle you want to modify, and transpose the other. E.g. modify upper triangle:
>>> tril = tril.T
As you suggested, do one of the following to create a mask where your condition applies:
# For integer data
>>> mask = (triu == -tril) & (triu != 0)
# For real data
>>> mask = np.isclose(triu, -tril) & ~np.isclose(triu, 0)
Note the extra condition (!= 0), added to avoid comparisons where triu and tril are filled with zeros outside their triangles. mask will contain True where an element of the upper triangle triu matches its diagonally opposed element in the lower triangle tril.
Implement your conditions:
# abs() covers the second condition and the abs part of the first
>>> a = np.abs(a)
# Zero the upper-triangle elements that matched their lower-triangle opposites
>>> a[mask] = 0
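Putting the steps together into one runnable sketch (this mirrors the question's final tril/triu fold; note that, as in the question's loop, the diagonal is included in both triangles and therefore doubled):

import numpy as np

a = np.random.randint(-3, 3, 100).reshape(10, 10)

triu = np.triu(a)
tril = np.tril(a).T  # transpose so it aligns with the upper triangle

mask = (triu == -tril) & (triu != 0)  # integer data

a = np.abs(a)
a[mask] = 0  # zero the upper element of each matched pair

# Fold the lower triangle into the upper one, as in the original loop
result = np.tril(a, 0).T + np.triu(a, 0)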

Related

Python/Maths: Convert a matrix into a lower triangular matrix

I defined the matrix a. Then I wrote an algorithm which turns the matrix a into a lower triangular matrix (the program is provided below).
The algorithm does work, because if I do print(" ") at the end of the algorithm I do receive a lower triangular matrix.
But when I do print(a) directly afterwards (so after the algorithm for the lower triangular matrix), I still receive the previous matrix a, which is not a lower triangular matrix. So I would like the matrix a to "permanently" be a lower triangular matrix, and not just when I use this algorithm (with the print(" ") at the end).
How can I "change" the matrix a so that it stays a lower triangular matrix? I hope I expressed myself well.
Here is my python code:
import numpy as np
s = 5
#s is the number of columns and rows respectively#
a = np.array([[None]*s]*s)
print(a)
#I receive a matrix with 5 columns/rows respectively#
#Calculates number of rows and columns present in given matrix#
rows = len(a)
cols = len(a[0])
if(rows != cols):
    print("Matrix should be a square matrix")
else:
    #Performs required operation to convert given matrix into lower triangular matrix
    print("Lower triangular matrix: ")
    for i in range(0, rows):
        for j in range(0, cols):
            if(j > i):
                print("0"),
            else:
                print(a[i][j]),
        print(" ")
#so far it works perfectly fine. If I use this algorithm I do receive a lower triangular matrix#
print(a)
#my wish was that print(a) would now show the matrix a as a lower triangular matrix because of the previous algorithm, but I get the same matrix as when I did print(a) the first time#
Keep in mind that printing and assigning values to variables are different actions:
print shows a message on the screen but does not change the value of any variable.
Assignment, such as x = 2 or a[i][j] = 0 changes the value that is stored in that place in memory.
If you want to change the matrix, then you need to use the assignment operator to change the values. In practice you just need to add a[i][j] = 0 in the correct place.
if(j > i):
    a[i][j] = 0  # <-- add this line
    print("0"),
Now the elements in the upper right part will be set to zero.
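For reference, here is a minimal runnable version with the fix applied. It uses a numeric example matrix instead of the question's None-filled one, just so the values are visible; note that numpy can also produce a lower triangular copy directly with np.tril:

import numpy as np

a = np.arange(25).reshape(5, 5)  # numeric stand-in for the matrix

rows, cols = a.shape
for i in range(rows):
    for j in range(cols):
        if j > i:
            a[i][j] = 0  # the assignment changes the stored value

print(a)  # a is now "permanently" lower triangular

print(np.tril(np.arange(25).reshape(5, 5)))  # same result in one call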

Determine the index where the elements in the input array satisfy conditions of 2 reference arrays without using a loop in python

I have an input array a:
a = np.array([120,350,410,354,247])
And two reference arrays which form the upper and lower limits for a:
lower = np.array([100,150,200,250,300,350,400,450])
upper = np.array([150,200,250,300,350,400,450,500])
The input and reference arrays can be of different lengths.
My goal is to find out the index of the lower and upper arrays that satisfies lower <= a < upper without using a loop in python. So I am looking for a way to obtain the following output for the above example without looping:
output = [0,5,6,5,2]
Using a.reshape(-1, 1) lets us create a column vector out of a. Comparing it against a 1D array (lower) broadcasts, returning the comparison of each element of a against the whole 1D array.
np.argmax((lower <= a.reshape(-1, 1)) & (a.reshape(-1, 1) < upper), axis=1)
returns
[0 5 6 5 2]
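A complete, runnable version of this approach. One caveat worth knowing: np.argmax returns 0 for a row that is all False, so a value of a falling outside every [lower, upper) bin is silently mapped to index 0; check with .any(axis=1) if that can happen in your data.

import numpy as np

a = np.array([120, 350, 410, 354, 247])
lower = np.array([100, 150, 200, 250, 300, 350, 400, 450])
upper = np.array([150, 200, 250, 300, 350, 400, 450, 500])

in_bin = (lower <= a.reshape(-1, 1)) & (a.reshape(-1, 1) < upper)
output = np.argmax(in_bin, axis=1)
print(output)              # [0 5 6 5 2]
print(in_bin.any(axis=1))  # all True here, so every value found a bin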

how to init an array with each element holding a value different from its neighbours

I have a matrix or multidimensional array in Python, where each element is an integer ranging from 0 to 7. How would I randomly initialize this matrix or multidimensional array so that each element holds a value different from the values of its 4 neighbours (left, right, top, bottom)? Can it be implemented in numpy?
You can write your own matrix initializer.
Go through array[i][j] and, for each (i, j), pick a random number between 0 and 7.
If the number equals either the left element array[i][j-1] or the upper one array[i-1][j], regenerate it until it differs.
With 8 possible values and at most 2 forbidden ones, each draw collides with probability at most 2/8 = 1/4, twice in a row with at most 1/16, three times with at most 1/64, etc.; the probability drops off geometrically.
The average-case complexity for n elements in the matrix is therefore O(n).
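A minimal sketch of this rejection idea for the 2-D case (the function name init_grid is ours). Checking only the left and top neighbours suffices: each cell's right and bottom constraints are enforced later, when those neighbouring cells are filled.

import numpy as np

def init_grid(rows, cols, upper=8):
    """Fill a grid with values in [0, upper) so that no element equals
    its left or top neighbour."""
    grid = np.empty((rows, cols), dtype=int)
    rng = np.random.default_rng()
    for i in range(rows):
        for j in range(cols):
            forbidden = set()
            if j > 0:
                forbidden.add(grid[i, j - 1])
            if i > 0:
                forbidden.add(grid[i - 1, j])
            value = rng.integers(upper)
            while value in forbidden:  # redraw on collision
                value = rng.integers(upper)
            grid[i, j] = value
    return grid

print(init_grid(5, 5))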
A simpler problem that might get you started is to do the same for a 1d array. A pure-python solution would look like:
import random

def sample_1d(n, upper):
    x = [random.randrange(upper)]
    for i in range(1, n):
        xi = random.randrange(upper - 1)
        if xi >= x[-1]:  # skip over the previous value
            xi += 1
        x.append(xi)
    return x
You can vectorize this as:
def sample_1d_v(n, upper):
    x = np.empty(n, dtype=int)
    x[0] = 0
    x[1:] = np.cumsum(np.random.randint(1, upper, size=n-1)) % upper
    x += np.random.randint(upper)  # random starting offset
    return x % upper  # bring all values back into [0, upper)
The trick here is noting that if adjacent values must be different, then the difference between adjacent values can be drawn uniformly from [1, upper), and a cumulative sum modulo upper turns those differences back into valid values.
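A quick sanity check of the vectorized version (assuming the definitions above): np.diff of the result should contain no zeros.

x = sample_1d_v(20, 8)
print(x)
assert np.all(np.diff(x) != 0)      # no two adjacent values are equal
assert x.min() >= 0 and x.max() < 8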

I am using a numpy array of randomly generated ordered pairs, and I need to determine if the ordered pairs form different types of triangles

I just started using numpy this week, and am very confused by it; it seems very different from normal Python functions.
With an array of shape 1000x6, is there a way to go row by row in the array and check, for example, for an equilateral triangle? I have 6 columns so that there are triples in each row, 2 integers for each point.
import numpy as np
pnts = np.random.randint(0,50,(1000, 6))
I also thought it may be better to create 3 arrays like this:
import numpy as np
A = np.random.random((10,2))
B = np.random.random((10,2))
C = np.random.random((10,2))
to create the ordered pairs and then use an algorithm to find a triangle.
Is there a better way to create an array that represents 1000 triples of ordered pairs, and how can I find triangles, like an equilateral triangle for example, in that array?
I have made some changes now. I made two arrays for x coordinates and y coordinates.
x = np.random.randint(0,10,(3,1000))
y = np.random.randint(0,10,(3,1000))
############# Adding to question #############
I have algorithms that take each matching x and y coordinate and find the side lengths and angles for each triangle. I would post them, but it's too much code. I also now have functions that use angles and side lengths to find Scalene, Equilateral, Right Isosceles, and Non-right Isosceles triangles.
My question is now more index related. I will use equilateral triangle again as an example because that is what we have been working with.
E = np.column_stack((ACXY,ABXY,CBXY))
ES = np.logical_and(E[:,0] == E[:,1], E[:,1] == E[:,2])
I have this to find equilateral triangles:
- ACXY = the distance from point A to C
- ABXY = the distance from point A to B
- CBXY = the distance from point C to B
I want to be able to take all the coordinate triples that are equilateral triangles, index them, and put them into a new array called E_Tri. I don't think I need the function creating boolean values. I've thought that maybe if/else statements may be a better way to do it.
This may also help: I will display E = np.column_stack((ACXY,ABXY,CBXY)) to help you understand the array E.
[[ 4. 4.47213595 7.21110255]
[ 3.60555128 2.23606798 5.83095189]
[ 2.23606798 9.05538514 8.54400375]
...,
[ 3.60555128 9.05538514 6.08276253]
[ 8.94427191 8.54400375 1. ]
[ 10.63014581 1. 10. ]]
E will look like that. Hopefully this will make sense, if not please let me know.
Something like this perhaps (even though this will not work, I am just adding it to the question):
E = np.column_stack((ACXY, ABXY, CBXY))
equilateral = []

def E_Tri(E):
    if E[:,0] == E[:,1] and E[:,1] == E[:,2]:
        equilateral.append(E_Tri)
    else:
        return E
You've described well how you are storing the data, but not what the algorithm is. For example, if we want to answer the question "Is this set of three (x,y) points P1..P3 an equilateral triangle," we can formulate it this way:
dist(P1,P2) == dist(P2,P3) == dist(P3,P1)
Where dist(P1,P2) uses the Pythagorean Theorem:
sqrt((P1.x - P2.x)**2 + (P1.y - P2.y)**2)
But note the sqrt() is unnecessary because all we care about is if all three legs are equal length (and if they are, their squares will be equal as well).
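As a tiny scalar illustration of that test (the helper names here are ours, not from the question):

def dist2(p, q):
    # squared distance between two (x, y) points; no sqrt needed
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def is_equilateral(p1, p2, p3):
    return dist2(p1, p2) == dist2(p2, p3) == dist2(p3, p1)

print(is_equilateral((0, 0), (4, 0), (2, 3)))  # False: 16, 13, 13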
In NumPy we want to do everything in a parallelizable way. So if you have a 1000x6 array representing 1000 triangles, you need to do all the operations on 1000 elements at a time. If the array is called A and its columns are:
P1.x, P1.y, P2.x, P2.y, P3.x, P3.y
Then the first operations are:
A[0] - A[2] # P1.x - P2.x
A[1] - A[3] # P1.y - P2.y
A[2] - A[4]
A[3] - A[5]
A[4] - A[0]
A[5] - A[1]
Which can be more succinctly written:
R = A - np.roll(A, -2, axis=0) # 1000x6 array of all differences
That being done, you can square all those 1000x6 differences at once, giving us a 1000x6 array R of squared differences, from which we add the x and y pairs to get the squares of distances:
R[0] + R[1] # (P1.x - P2.x)**2 + (P1.y - P2.y)**2
R[2] + R[3]
R[4] + R[5]
Which is to say:
S = R[0::2] + R[1::2] # three column-wise additions at once
This gives us the 1000x3 squares-of-distances array S. Now we simply check for each row if its columns are all equal:
np.logical_and(S[0] == S[1], S[1] == S[2])
This gives us the 1000x1 boolean vector which tells us if each row is an equilateral triangle.
Note that we never went row-by-row in an iterative fashion. That's because doing so in NumPy is much slower than doing column-wise operations.
Note that I have written the above assuming the shape of the arrays is actually (6,1000) when I say 1000x6. This is for convenience of notation (A[0] instead of A[:,0]) and also because it is more efficient when we are operating on columns, since NumPy by default uses row-major order. You can np.transpose() your input data if needed.
So in the end it's just:
A = pnts.T
R = np.square(A - np.roll(A, -2, axis=0))
S = R[0::2] + R[1::2] # 1000x3 squares of distances
np.logical_and(S[0] == S[1], S[1] == S[2]) # 1000 True/False results
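Tying this back to the question's pnts array, a runnable end-to-end sketch (with integer coordinates exact equality of squared distances is safe; for float coordinates you would compare with np.isclose instead). E_Tri here follows the naming the question asked for:

import numpy as np

pnts = np.random.randint(0, 50, (1000, 6))  # P1.x, P1.y, P2.x, P2.y, P3.x, P3.y

A = pnts.T
R = np.square(A - np.roll(A, -2, axis=0))
S = R[0::2] + R[1::2]
is_equilateral = np.logical_and(S[0] == S[1], S[1] == S[2])

E_Tri = pnts[is_equilateral]  # only the rows that form equilateral triangles
print(E_Tri.shape)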

Find large number of consecutive values fulfilling condition in a numpy array

I have some audio data loaded in a numpy array and I wish to segment the data by finding silent parts, i.e. parts where the audio amplitude is below a certain threshold over a period in time.
An extremely simple way to do this is something like this:
values = ''.join(("1" if (abs(x) < SILENCE_THRESHOLD) else "0" for x in samples))
pattern = re.compile('1{%d,}'%int(MIN_SILENCE))
for match in pattern.finditer(values):
    # code goes here
The code above finds parts where there are at least MIN_SILENCE consecutive elements smaller than SILENCE_THRESHOLD.
Now, obviously, the above code is horribly inefficient and a terrible abuse of regular expressions. Is there some other method that is more efficient, but still results in equally simple and short code?
Here's a numpy-based solution.
I think (?) it should be faster than the other options. Hopefully it's fairly clear.
However, it does require twice as much memory as the various generator-based solutions. As long as you can hold a single temporary copy of your data in memory (for the diff), plus a boolean array of the same length as your data (one byte per element in numpy), it should be pretty efficient...
import numpy as np

def main():
    # Generate some random data
    x = np.cumsum(np.random.random(1000) - 0.5)
    condition = np.abs(x) < 1
    # Print the start and stop indices of each region where the absolute
    # values of x are below 1, and the min and max of each of these regions
    for start, stop in contiguous_regions(condition):
        segment = x[start:stop]
        print(start, stop)
        print(segment.min(), segment.max())

def contiguous_regions(condition):
    """Finds contiguous True regions of the boolean array "condition". Returns
    a 2D array where the first column is the start index of the region and the
    second column is the end index."""
    # Find the indices of changes in "condition"
    d = np.diff(condition)
    idx, = d.nonzero()
    # We need to start things after the change in "condition". Therefore,
    # we'll shift the index by 1 to the right.
    idx += 1
    if condition[0]:
        # If the start of condition is True prepend a 0
        idx = np.r_[0, idx]
    if condition[-1]:
        # If the end of condition is True, append the length of the array
        idx = np.r_[idx, condition.size]
    # Reshape the result into two columns
    idx.shape = (-1, 2)
    return idx

main()
There is a very convenient solution to this using scipy.ndimage. For an array:
a = np.array([1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0])
which can be the result of a condition applied to another array. Finding the contiguous regions is as simple as:
regions = scipy.ndimage.find_objects(scipy.ndimage.label(a)[0])
Then, applying any function to those regions can be done e.g. like:
[np.sum(a[r]) for r in regions]
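A self-contained version, also showing how to recover (start, stop) index pairs from the slices that find_objects returns (for a 1-D array each entry is a 1-tuple holding a single slice):

import numpy as np
import scipy.ndimage

a = np.array([1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0])

labeled, num_regions = scipy.ndimage.label(a)
regions = scipy.ndimage.find_objects(labeled)

print([(r[0].start, r[0].stop) for r in regions])  # [(0, 4), (7, 10)]
print([np.sum(a[r]) for r in regions])             # [4, 3]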
Slightly sloppy, but simple and fast-ish, if you don't mind using scipy:
from scipy.ndimage import gaussian_filter
sigma = 3
threshold = 1
above_threshold = gaussian_filter(data, sigma=sigma) > threshold
The idea is that quiet portions of the data will smooth down to low amplitude, and loud regions won't. Tune 'sigma' to affect how long a 'quiet' region must be; tune 'threshold' to affect how quiet it must be. This slows down for large sigma, at which point using FFT-based smoothing might be faster.
This has the added benefit that single 'hot pixels' won't disrupt your silence-finding, so you're a little less sensitive to certain types of noise.
I haven't tested this, but it should be close to what you are looking for. It is slightly more lines of code, but it should be more efficient and readable, and it doesn't abuse regular expressions :-)
def find_silent(samples):
    num_silent = 0
    start = 0
    for index in range(len(samples)):
        if abs(samples[index]) < SILENCE_THRESHOLD:
            if num_silent == 0:
                start = index
            num_silent += 1
        else:
            if num_silent > MIN_SILENCE:
                yield samples[start:index]
            num_silent = 0
    if num_silent > MIN_SILENCE:
        yield samples[start:]

for match in find_silent(samples):
    # code goes here
This should return a list of (start,length) pairs:
def silent_segs(samples, threshold, min_dur):
    start = -1
    silent_segments = []
    for idx, x in enumerate(samples):
        if start < 0 and abs(x) < threshold:
            start = idx
        elif start >= 0 and abs(x) >= threshold:
            dur = idx - start
            if dur >= min_dur:
                silent_segments.append((start, dur))
            start = -1
    # capture a silent segment that runs to the end of samples
    if start >= 0 and len(samples) - start >= min_dur:
        silent_segments.append((start, len(samples) - start))
    return silent_segments
And a simple test:
>>> s = [-1,0,0,0,-1,10,-10,1,2,1,0,0,0,-1,-10]
>>> silent_segs(s,2,2)
[(0, 5), (9, 5)]
Another way to do this quickly and concisely:
import pylab as pl

v = [0,0,1,1,0,0,1,1,1,1,1,0,1,0,1,1,0,0,0,0,0,1,0,0]
vd = pl.diff(v)
# vd[i]==1 for a 0->1 crossing; vd[i]==-1 for a 1->0 crossing
# need to add +1 to the indexes as pl.diff shifts to the left by 1
i1 = pl.array([i for i in range(len(vd)) if vd[i] == 1]) + 1
i2 = pl.array([i for i in range(len(vd)) if vd[i] == -1]) + 1
# corner cases for the first and the last element
if v[0] == 1:
    i1 = pl.hstack((0, i1))
if v[-1] == 1:
    i2 = pl.hstack((i2, len(v)))
Now i1 contains the beginning indexes and i2 the end indexes of the runs of 1s.
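Pairing them up is then straightforward:

for start, stop in zip(i1, i2):
    print(start, stop, v[start:stop])  # each run of consecutive 1s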
@joe-kington I've got about a 20%-25% speed improvement over the np.diff / np.nonzero solution by using argmax instead (see the code below; condition is boolean):
def contiguous_regions(condition):
    idx = []
    i = 0
    while i < len(condition):
        x1 = i + condition[i:].argmax()  # first True at or after i
        try:
            x2 = x1 + condition[x1:].argmin()  # first False at or after x1
        except ValueError:  # empty slice
            x2 = x1 + 1
        if x1 == x2:
            if condition[x1] == True:
                x2 = len(condition)  # True all the way to the end
            else:
                break  # no more True values
        idx.append([x1, x2])
        i = x2
    return idx
Of course, your mileage may vary depending on your data.
Besides, I'm not entirely sure, but I guess numpy may optimize argmin/argmax over boolean arrays to stop searching on the first True/False occurrence. That might explain it.
I know I'm late to the party, but another way to do this is with 1d convolutions:
np.convolve(sig > threshold, np.ones((cons_samples)), 'same') == cons_samples
where cons_samples is the number of consecutive samples you require above threshold.
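A quick, self-contained illustration (the signal and parameters here are made up). Note that with mode 'same' the True entries mark window centers, so a run of exactly cons_samples above-threshold values lights up at its midpoint:

import numpy as np

sig = np.array([0.2, 3.0, 4.0, 5.0, 0.1, 6.0, 7.0, 0.3, 8.0, 9.0, 9.5, 9.9])
threshold = 1.0
cons_samples = 3

hits = np.convolve(sig > threshold, np.ones(cons_samples), 'same') == cons_samples
print(hits.nonzero()[0])  # [2 9 10]: centers of runs of >= 3 above-threshold samples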
