I have a matrix that looks like:
x=[[a,b,c,d,e,f],
[g,h,i,j,k,l],
[m,n,o,p,q,r]]
where a, b, c, ... are numbers. However, I am only interested in the numbers in the lower-left half and would like to create a lower triangular matrix that advances in steps of two positions, so that it looks like this:
x2=[[a,b,0,0,0,0],
[g,h,i,j,0,0],
[m,n,o,p,q,r]]
I could of course multiply x with:
x3=[[1,1,0,0,0,0],
[1,1,1,1,0,0],
[1,1,1,1,1,1]]
But is there a way to do this without manually creating x3?
And would it be possible to write a script where the step is larger than two positions at a time?
Starting from a matrix of ones with the same shape as the x3 in your example, you could do something like this:
x3=[[1,1,1,1,1,1],
[1,1,1,1,1,1],
[1,1,1,1,1,1]]
for i in range(len(x3)):
    step = (i+1) * 2
    for j in range(step, len(x3[i])):
        x3[i][j] = 0
for i in x3:
    print(i)
Output:
[1, 1, 0, 0, 0, 0]
[1, 1, 1, 1, 0, 0]
[1, 1, 1, 1, 1, 1]
Or, in case you prefer a one-liner:
x3 = [[0 if j >= (i+1)*2 else x3[i][j] for j in range(len(x3[i]))] for i in range(len(x3))]
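If NumPy is an option, the mask can also be built without explicit loops by comparing column indices against a per-row cutoff, which also covers steps larger than 2. This is only a sketch; `stepped_lower_mask` and its `step` parameter are hypothetical names chosen for illustration, and the integer matrix stands in for the values a..r.

```python
import numpy as np

def stepped_lower_mask(n_rows, n_cols, step=2):
    """Return a 0/1 mask whose row i keeps the first (i + 1) * step columns."""
    cols = np.arange(n_cols)                          # shape (n_cols,)
    cutoff = (np.arange(n_rows)[:, None] + 1) * step  # shape (n_rows, 1)
    return (cols < cutoff).astype(int)                # broadcasts to (n_rows, n_cols)

x = np.arange(1, 19).reshape(3, 6)   # stand-in for the matrix of a..r
mask = stepped_lower_mask(*x.shape, step=2)
x2 = x * mask
print(mask)
print(x2)
```

Calling `stepped_lower_mask(3, 6, step=3)` would keep three more columns per row instead of two.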
I want to create the following sequence of zeros and ones in Python:
{0, 1,1,1,1, 0,0, 1,1,1, 0,0,0, 1,1, 0,0,0,0, 1}
So there is first 1 zero and 4 ones, then 2 zeros and 3 ones, then 3 zeros and 2 ones, and finally 4 zeros and 1 one. The final array is supposed to have dimension 20x1, but my code gives me dimension 4x2. Does anyone know how I can fix this?
Here's my code:
import numpy as np
seq = [ (np.ones(n), np.zeros(5-n) ) for n in range(1,5)]
Many thanks in advance!
For each iteration you create a tuple of two things, hence the 4x2 result. You can bring it to the form you want by concatenating the array elements all together, but there is a pattern to your sequence; you can take advantage that it looks like a triangular matrix of 1s and 0s, which you can then flatten.
import numpy as np
n = 5
ones = np.ones((n, n), dtype=int)
seq = np.triu(ones)[1:].flatten()
Output:
array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1])
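For comparison, the concatenation route mentioned above could look roughly like the sketch below. Note that the desired sequence starts each group with zeros, so the zeros come first in each piece here, unlike in the tuple from the question.

```python
import numpy as np

# For n = 1..4: n zeros followed by 5 - n ones, each piece a flat array.
pieces = [np.concatenate((np.zeros(n, dtype=int), np.ones(5 - n, dtype=int)))
          for n in range(1, 5)]

# Join the four pieces end to end into one array of 20 elements.
seq = np.concatenate(pieces)
print(seq.shape)
print(seq)
```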
You can use flatten:
import numpy as np
l = np.array([[0] * n + [1] * (5 - n) for n in range(1, 5)]).flatten()
print(l)
# >>> [0 1 1 1 1 0 0 1 1 1 0 0 0 1 1 0 0 0 0 1]
I would like to replace the N smallest elements in each row with 0, such that the resulting array keeps the same order and shape as the original array.
Specifically, if the original numpy array is:
import numpy as np
x = np.array([[0,50,20],[2,0,10],[1,1,0]])
And N = 2, I would like for the result to be the following:
x = np.array([[0,50,0],[0,0,10],[0,1,0]])
I tried the following, but in the last row it replaces 3 elements instead of 2 (because it replaces both 1s and not only one)
import numpy as np
N = 2
x = np.array([[0,50,20],[2,0,10],[1,1,0]])
x_sorted = np.sort(x , axis = 1)
x_sorted[:,N:] = 0
replace = x_sorted.copy()
final = np.where(np.isin(x,replace),0,x)
Note that this is a small example; I would like the solution to work for a much bigger matrix.
Thanks for your time!
One way using numpy.argsort:
N = 2
x[x.argsort().argsort() < N] = 0
Output:
array([[ 0, 50, 0],
[ 0, 0, 10],
[ 0, 1, 0]])
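The double argsort can look opaque at first. The first call gives the indices that would sort each row, and applying argsort to that result gives the rank of every element within its row. A small illustration on a single row (the values are made up for this example):

```python
import numpy as np

row = np.array([50, 0, 20])
order = row.argsort()     # indices that would sort the row
ranks = order.argsort()   # rank of each element (0 = smallest)
print(order)
print(ranks)
print(ranks < 2)          # True for the two smallest elements
```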
Use numpy.argpartition to find the indices of the N smallest elements in each row, and then use those indices to replace the values:
N = 2
idy = np.argpartition(x, N, axis=1)[:, :N]
x[np.arange(len(x))[:,None], idy] = 0
x
array([[ 0, 50, 0],
[ 0, 0, 10],
[ 1, 0, 0]])
Note that if there are ties, which of the tied values get replaced is not guaranteed; it depends on the algorithm used. Here, for example, the last row comes out as [1, 0, 0] rather than [0, 1, 0].
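If a predictable choice among ties matters, one option is a stable sort, which keeps equal values in their original left-to-right order. This is a sketch of that alternative, not the approach above; it does a full sort per row, so it gives up the speed advantage that argpartition is meant to provide.

```python
import numpy as np

N = 2
x = np.array([[0, 50, 20], [2, 0, 10], [1, 1, 0]])

# A stable sort keeps tied values in their original order, so among ties
# the leftmost elements are selected first.
idx = np.argsort(x, axis=1, kind="stable")[:, :N]

result = x.copy()
np.put_along_axis(result, idx, 0, axis=1)   # zero out the selected positions
print(result)
```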
I would like to loop over the following check_matrix so that the code recognizes whether the first and second elements are 1 and 1, or 1 and 2, and so on. Then, for each distinct pair (e.g. 1,1 or 1,2 or 2,2), the code should store in a new matrix the sum of the last element (index 8) times exp(-i*q(check_matrix[k][2:5]-check_matrix[k][5:8])), where i is the imaginary unit, k is the running index over check_matrix, and q is a vector defined as given below. There are 20 q vectors.
import numpy as np
q= []
for i in np.linspace(0, 10, 20):
q.append(np.array((0, 0, i)))
q = np.array(q)
check_matrix = np.array([[1, 1, 0, 0, 0, 0, 0, -0.7977, -0.243293],
[1, 1, 0, 0, 0, 0, 0, 1.5954, 0.004567],
[1, 2, 0, 0, 0, -1, 0, 0, 1.126557],
[2, 1, 0, 0, 0, 0.5, 0.86603, 1.5954, 0.038934],
[2, 1, 0, 0, 0, 2, 0, -0.7977, -0.015192],
[2, 2, 0, 0, 0, -0.5, 0.86603, 1.5954, 0.21394]])
In principle, this means I should end up with 20 matrices of shape 2x2, one for each q vector.
At the moment my code gives only one matrix, which appears to be the last one, even though I am appending to Matrices. My code looks like this:
for i in range(2):
    i = i+1
    for j in range(2):
        j = j+1
        j_list = []
        Matrices = []
        for k in range(len(check_matrix)):
            if check_matrix[k][0] == i and check_matrix[k][1] == j:
                j_list.append(check_matrix[k][8]*np.exp(-1J*np.dot(q,(np.subtract(check_matrix[k][2:5],check_matrix[k][5:8])))))
        j_11 = np.sum(j_list)
        I_matrix[i-1][j-1] = j_11
        Matrices.append(I_matrix)
I_matrix is defined as below:
I_matrix= np.zeros((2,2),dtype=np.complex_)
At the moment I get the following output:
Matrices = [array([[-0.66071446-0.77603624j, -0.29038112+2.34855023j], [-0.31387562-0.08116629j, 4.2788 +0.j ]])]
However, I want one matrix per q value, so there should be 20 matrices in total in this case. Each element of a 2x2 matrix should contain the sum over the rows belonging to the corresponding pair (1,1; 1,2; 2,1; 2,2), laid out as follows:
array([[11., 12.],
[21., 22.]])
I would highly appreciate suggestions on how to correct it. Thanks in advance!
I am pretty sure this problem can be solved in an easier way, and I am not 100% sure that I understood you correctly, but here is some code that does what I think you want. If you have a way to check that the results are valid, I suggest you do so.
import numpy as np
n = 20
q = np.zeros((n, 3))
q[:, -1] = np.linspace(0, 10, n)
check_matrix = np.array([[1, 1, 0, 0, 0, 0, 0, -0.7977, -0.243293],
                         [1, 1, 0, 0, 0, 0, 0, 1.5954, 0.004567],
                         [1, 2, 0, 0, 0, -1, 0, 0, 1.126557],
                         [2, 1, 0, 0, 0, 0.5, 0.86603, 1.5954, 0.038934],
                         [2, 1, 0, 0, 0, 2, 0, -0.7977, -0.015192],
                         [2, 2, 0, 0, 0, -0.5, 0.86603, 1.5954, 0.21394]])
check_matrix[:, :2] -= 1 # python indexing is zero based
# np.complex128 instead of np.complex_, which was removed in NumPy 2.0
matrices = np.zeros((n, 2, 2), dtype=np.complex128)
for i in range(2):
    for j in range(2):
        k_list = []
        for k in range(len(check_matrix)):
            if check_matrix[k][0] == i and check_matrix[k][1] == j:
                k_list.append(check_matrix[k][8] *
                              np.exp(-1J * np.dot(q, check_matrix[k][2:5]
                                                  - check_matrix[k][5:8])))
        # each entry of k_list has one value per q vector, so sum over rows only
        matrices[:, i, j] = np.sum(k_list, axis=0)
NOTE: I changed your indices to have consistent zero-based indexing.
Here is another approach where I replaced the k-loop with a vectorized version:
for i in range(2):
    for j in range(2):
        k = np.logical_and(check_matrix[:, 0] == i, check_matrix[:, 1] == j)
        temp = np.dot(check_matrix[k, 2:5] - check_matrix[k, 5:8], q[:, :, np.newaxis])[..., 0]
        temp = check_matrix[k, 8:] * np.exp(-1J * temp)
        matrices[:, i, j] = np.sum(temp, axis=0)
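Going one step further, the phase factors for every row and every q vector can be computed in a single matrix product and then accumulated into the cell given by the first two columns. The following is a self-contained sketch under the same reading of the question; it works on the original one-based check_matrix, and the names `rows`, `cols`, `delta` and `terms` are introduced here for illustration only.

```python
import numpy as np

n = 20
q = np.zeros((n, 3))
q[:, -1] = np.linspace(0, 10, n)

check_matrix = np.array([[1, 1, 0, 0, 0, 0, 0, -0.7977, -0.243293],
                         [1, 1, 0, 0, 0, 0, 0, 1.5954, 0.004567],
                         [1, 2, 0, 0, 0, -1, 0, 0, 1.126557],
                         [2, 1, 0, 0, 0, 0.5, 0.86603, 1.5954, 0.038934],
                         [2, 1, 0, 0, 0, 2, 0, -0.7977, -0.015192],
                         [2, 2, 0, 0, 0, -0.5, 0.86603, 1.5954, 0.21394]])

rows = check_matrix[:, 0].astype(int) - 1   # zero-based first index
cols = check_matrix[:, 1].astype(int) - 1   # zero-based second index
delta = check_matrix[:, 2:5] - check_matrix[:, 5:8]          # (n_rows, 3)
terms = check_matrix[:, 8] * np.exp(-1j * (q @ delta.T))     # (n, n_rows)

matrices = np.zeros((n, 2, 2), dtype=np.complex128)
for r in range(len(check_matrix)):
    # add the contribution of row r, for all q at once, to its (i, j) cell
    matrices[:, rows[r], cols[r]] += terms[:, r]
print(matrices.shape)
```

The loop that remains runs once per row of check_matrix rather than once per pair, so it needs no boolean masks and makes no assumption about row order.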
3 line solution
You asked for an efficient solution in your original title, so how about this three-liner, which avoids nested loops and if statements and is thus hopefully faster?
fac=2*(check_matrix[:,0]-1)+(check_matrix[:,1]-1)
grp=np.split(check_matrix[:,8], np.cumsum(np.unique(fac,return_counts=True)[1])[:-1])
[np.sum(x) for x in grp]
output:
[-0.23872600000000002, 1.126557, 0.023742000000000003, 0.21394]
How does it work?
I combine the first two columns into a single index, treating each as "bits" (i.e. base 2)
fac=2*(check_matrix[:,0]-1)+(check_matrix[:,1]-1)
( If you have indexes that exceed 2, you can still use this technique but you will need to use a different base to combine the columns. i.e. if your indices go from 1 to 18, you would need to multiply column 0 by a number equal to or larger than 18 instead of 2. )
So the result of the first line is
array([0., 0., 1., 2., 2., 3.])
Note as well that it assumes the data is ordered, i.e. that the rows are sorted so that the combined index is ascending. If this is not the case, you will need an extra step to sort the index and the original check matrix. In your example the data is ordered.
The next step groups the data according to the index, and uses the solution posted here.
np.split(check_matrix[:,8], np.cumsum(np.unique(fac,return_counts=True)[1])[:-1])
[array([-0.243293, 0.004567]), array([1.126557]), array([ 0.038934, -0.015192]), array([0.21394])]
i.e. it outputs column 8 (zero-based) of check_matrix, split according to the grouping of fac.
The last line then simply sums those groups. Knowing how the first two columns were combined to give the single index allows you to map the result back. Or you could simply add the index to check matrix as an extra column if you wanted.
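If the rows cannot be assumed to be ordered, a grouped sum that does not depend on row order is np.bincount with weights. This is a sketch of that alternative rather than part of the solution above; it only covers the plain sums of column 8, as in the three-liner.

```python
import numpy as np

check_matrix = np.array([[1, 1, 0, 0, 0, 0, 0, -0.7977, -0.243293],
                         [1, 1, 0, 0, 0, 0, 0, 1.5954, 0.004567],
                         [1, 2, 0, 0, 0, -1, 0, 0, 1.126557],
                         [2, 1, 0, 0, 0, 0.5, 0.86603, 1.5954, 0.038934],
                         [2, 1, 0, 0, 0, 2, 0, -0.7977, -0.015192],
                         [2, 2, 0, 0, 0, -0.5, 0.86603, 1.5954, 0.21394]])

def pair_sums(m):
    """Sum column 8 per (first, second) pair; rows may come in any order."""
    fac = (2 * (m[:, 0] - 1) + (m[:, 1] - 1)).astype(int)   # combined index 0..3
    return np.bincount(fac, weights=m[:, 8], minlength=4)

sums = pair_sums(check_matrix)
sums_reversed = pair_sums(check_matrix[::-1])   # same rows, reversed order
print(sums.reshape(2, 2))
```

Reshaping to (2, 2) maps the combined index back to the pair layout, with the first index as the row.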
I am given a 2D Tensor with stochastic rows. After applying tf.math.greater() and tf.cast(tf.int32) I am left with a Tensor of 0s and 1s. I now want to apply a reduce sum to that matrix, but with a condition: if at least one 1 has been summed and a 0 follows, I want to ignore all following 1s as well, meaning 1 0 1 should result in 1 instead of 2.
I have tried to solve the problem with tf.scan(), but I was not able to come up with a function that handles leading 0s, because a row might look like: 0 0 0 1 0 1
One idea was to set the lower part of the matrix to one (because I know everything left of the diagonal will always be 0) and then have a function like tf.scan() run to filter out the spots (see code and error message below).
Let z be the matrix after tf.cast.
helper = tf.matrix_band_part(tf.ones_like(z), -1, 0)
z = tf.math.logical_or(tf.cast(z, tf.bool), tf.cast(helper,tf.bool))
z = tf.cast(z, tf.int32)
z = tf.scan(lambda a, x: x if a == 1 else 0 ,z)
Resulting in:
ValueError: Incompatible shape for value ([]), expected ([5])
IIUC, this is one way to do what you want without scanning or looping. It may be a bit convoluted, and is actually iterating the columns twice (one cumsum and one cumprod), but being vectorized operations I think it is probably faster. Code is TF 2.x but runs the same in TF 1.x (except for the last line obviously).
import tensorflow as tf
# Example data
a = tf.constant([[0, 0, 0, 0],
[1, 0, 0, 0],
[0, 1, 1, 0],
[0, 1, 0, 1],
[1, 1, 1, 0],
[1, 1, 0, 1],
[0, 1, 1, 1],
[1, 1, 1, 1]])
# Cumsum columns
c = tf.math.cumsum(a, axis=1)
# Find point where we should not sum anymore (cumsum is not zero and current value is zero)
cutoff = tf.equal(a, 0) & tf.not_equal(c, 0)
# Make mask
mask = tf.math.cumprod(tf.dtypes.cast(~cutoff, tf.uint8), axis=1)
# Compute result
result = tf.reduce_max(c * tf.dtypes.cast(mask, c.dtype), axis=1)
print(result.numpy())
# [0 1 2 1 3 2 3 4]
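The same cumsum/cumprod masking idea can be tried out in plain NumPy, which may be handy for checking the logic on small inputs without a TensorFlow session. This is a sketch that mirrors the steps above; `first_run_sum` is a hypothetical helper name.

```python
import numpy as np

def first_run_sum(a):
    """Per row, count the 1s up to the first 0 that follows a 1."""
    c = np.cumsum(a, axis=1)              # running count of ones per row
    cutoff = (a == 0) & (c != 0)          # a zero that comes after at least one 1
    mask = np.cumprod(~cutoff, axis=1)    # 1 up to the first cutoff, 0 afterwards
    return (c * mask).max(axis=1)

a = np.array([[0, 0, 0, 0],
              [1, 0, 0, 0],
              [0, 1, 1, 0],
              [0, 1, 0, 1],
              [1, 1, 1, 0],
              [1, 1, 0, 1],
              [0, 1, 1, 1],
              [1, 1, 1, 1]])
result = first_run_sum(a)
print(result)
```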
I have this code:
gs = open("graph.txt", "r")
gp = gs.readline()
gp_splitIndex = gp.find(" ")
gp_nodeCount = int(gp[0:gp_splitIndex])
gp_edgeCount = int(gp[gp_splitIndex+1:-1])
matrix = [] # predeclare the array
for i in range(0, gp_nodeCount):
    matrix.append([])
    for y in range(0, gp_nodeCount):
        matrix[i].append(0)
for i in range(0, gp_edgeCount-1):
    gp = gs.readline()
    gp_splitIndex = gp.find(" ") # get the index of space, dividing the 2 numbers on a row
    gp_from = int(gp[0:gp_splitIndex])
    gp_to = int(gp[gp_splitIndex+1:-1])
    matrix[gp_from][gp_to] = 1
print matrix
The file graph.txt contains this:
5 10
0 1
1 2
2 3
3 4
4 0
0 3
3 1
1 4
4 2
2 0
The first two numbers tell me that the graph has 5 nodes and 10 edges. The following number pairs describe the edges between nodes. For example, "1 4" means an edge from node 1 to node 4.
Problem is, the output should be this:
[[0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [1, 0, 0, 1, 0], [0, 1, 0, 0, 1], [1, 0, 1, 0, 0]]
But instead of that, I get this:
[[0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [0, 0, 0, 1, 0], [0, 1, 0, 0, 1], [1, 0, 1, 0, 0]]
Only one number is different and I can't understand why this is happening. The edge "3 1" is not present. Can someone explain where the problem is?
Change for i in range(0, gp_edgeCount-1): to
for i in range(0, gp_edgeCount):
The range() function already excludes its end value, so the extra "-1" drops the last iteration: range(0, 3) yields 0, 1, 2.
And it is not the "3 1" edge that is missing; it is the "2 0" edge, which is the last edge in the file. The matrix indices start counting at 0.
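A quick way to convince yourself of the off-by-one is to count the iterations directly. The snippet below is a small illustration in Python 3 syntax, using the edge count from the file header:

```python
edge_count = 10  # from the "5 10" header line

# range() stops before its end value, so this already covers all 10 edges.
full = list(range(0, edge_count))

# Subtracting 1 leaves only 9 iterations, so the last edge line is never read.
short = list(range(0, edge_count - 1))

print(list(range(0, 3)))
print(len(full), len(short))
```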
Matthias has it; you don't need edgeCount - 1 since the range function doesn't include the end value in the iteration.
There are several other things you can do to clean up your code:
The with statement is preferred for opening files, since it closes them automatically for you.
You don't need to call find and slice manually; split already does what you want.
You can convert and assign directly to a pair of numbers using a generator expression and iterable unpacking.
You can call range with just an end value; the 0 start is implicit.
The multiplication operator is handy for initializing lists.
With all of those changes:
with open('graph.txt', 'r') as graph:
    node_count, edge_count = (int(n) for n in graph.readline().split())
    matrix = [[0]*node_count for _ in range(node_count)]
    for i in range(edge_count):
        src, dst = (int(n) for n in graph.readline().split())
        matrix[src][dst] = 1
print matrix
# [[0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [1, 0, 0, 1, 0], [0, 1, 0, 0, 1], [1, 0, 1, 0, 0]]
This keeps your code and style, although of course it could be made much more readable:
gs = open("graph.txt", "r")
gp = gs.readline()
gp_splitIndex = gp.split(" ")
gp_nodeCount = int(gp_splitIndex[0])
gp_edgeCount = int(gp_splitIndex[1])
matrix = [] # predeclare the array
for i in range(0, gp_nodeCount):
    matrix.append([])
    for y in range(0, gp_nodeCount):
        matrix[i].append(0)
for i in range(0, gp_edgeCount):
    gp = gs.readline()
    gp_Index = gp.split(" ") # split the row at the space into its 2 numbers
    gp_from = int(gp_Index[0])
    gp_to = int(gp_Index[1])
    matrix[gp_from][gp_to] = 1
print matrix
Exactly: the last line, the 2 0 from your file, is never read. Hence the missing 1. Have a nice day!
The other answers are correct, another version similar to the one of tzaman:
with open('graph.txt', mode='r') as txt_file:
    lines = [l.strip() for l in txt_file.readlines()]
number_pairs = [[int(n) for n in line.split(' ')] for line in lines]
header = number_pairs[0]
edge_pairs = number_pairs[1:]
num_nodes, num_edges = header
edges = [[0] * num_nodes for _ in xrange(num_nodes)]
for edge_start, edge_end in edge_pairs:
    edges[edge_start][edge_end] = 1
print edges
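The snippets in this thread use Python 2 syntax (print statement, xrange). For Python 3, a version along the same lines might look like the sketch below. `read_adjacency` is a hypothetical helper name; it takes any iterable of lines, so the sample data is passed as a string here instead of reading graph.txt.

```python
def read_adjacency(lines):
    """Build an adjacency matrix from an iterable of text lines.

    The first line holds "<node_count> <edge_count>"; each following line
    holds one directed edge as "<from> <to>".
    """
    rows = iter(lines)
    node_count, edge_count = (int(n) for n in next(rows).split())
    matrix = [[0] * node_count for _ in range(node_count)]
    for _ in range(edge_count):          # range already stops at edge_count - 1
        src, dst = (int(n) for n in next(rows).split())
        matrix[src][dst] = 1
    return matrix

# Same content as graph.txt in the question.
sample = """5 10
0 1
1 2
2 3
3 4
4 0
0 3
3 1
1 4
4 2
2 0
"""
matrix = read_adjacency(sample.splitlines())
print(matrix)
```

Since an open file is itself an iterable of lines, the same function should also work when called as read_adjacency(f) inside a with open('graph.txt') as f: block.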