compare each row with each column in matrix using only pure python - python

I have a function that I wrote, and I want to run it on each column and each row of a matrix to check whether any row and column produce the same output.
For example:
matrix = [[1,2,3],
          [7,8,9]]
I want to run the function, let's call it myfun, on each column [1,7], [2,8] and [3,9] separately, and also on each row [1,2,3] and [7,8,9]. Whenever a row and a column produce the same result, the counter ct goes up by 1. All of this happens inside another function, count_good, which counts the row/column pairs that produce the same result.
Here is the code so far:
def count_good(mat):
    ct = 0
    for i in mat:
        for j in mat:
            if myfun(i) == myfun(j):
                ct += 1
    return ct
However, when I use print to check my code I get this:
mat = [[1,2,3],[7,8,9]]

for i in mat:
    for j in mat:
        print(i,j)

[1, 2, 3] [1, 2, 3]
[1, 2, 3] [7, 8, 9]
[7, 8, 9] [1, 2, 3]
[7, 8, 9] [7, 8, 9]
I see that the code does not produce what I need (it only pairs rows with rows), which means that the count_good function won't work. How can I run a function on each row and each column? I need to do it without any outside libraries and without map, zip or anything like that, only pure Python.

Let's start by using itertools and collections for this, then translate it back to "pure" python.
from itertools import product, starmap, chain # combinations?
from collections import Counter
To iterate in a nested loop efficiently, you can use itertools.product. You can use starmap to expand the arguments of a function as well. Here is a generator of the values of myfun over every pair of rows (this assumes myfun takes the two items to compare as its arguments):
starmap(myfun, product(matrix, repeat=2))
To transpose the matrix and iterate over the columns, use the zip(*matrix) idiom:
starmap(myfun, product(zip(*matrix), repeat=2))
You can use collections.Counter to map all the repeats for each possible return value:
Counter(starmap(myfun, chain(product(matrix, repeat=2), product(zip(*matrix), repeat=2))))
If you want to avoid running myfun on the same elements, replace product(..., repeat=2) with combinations(..., 2).
Now that you have the layout of how to do this, replace all the external library stuff with equivalent builtins:
counter = {}
for i in range(len(matrix)):
    for j in range(len(matrix)):
        result = myfun(matrix[i], matrix[j])
        counter[result] = counter.get(result, 0) + 1
for i in range(len(matrix[0])):
    for j in range(len(matrix[0])):
        c1 = [matrix[row][i] for row in range(len(matrix))]
        c2 = [matrix[row][j] for row in range(len(matrix))]
        result = myfun(c1, c2)
        counter[result] = counter.get(result, 0) + 1
If you want combinations instead, replace the loop pairs with
for i in range(len(...) - 1):
    for j in range(i + 1, len(...)):

Using native python:
def count_good(mat):
    ct = 0
    columns = [[row[col_idx] for row in mat] for col_idx in range(len(mat[0]))]
    for row in mat:
        for column in columns:
            if myfun(row) == myfun(column):
                ct += 1
    return ct
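However, this is inefficient: counting the loop that builds the columns it is a triple nested for-loop, and myfun is re-evaluated for every row/column pair. I would suggest using numpy instead, where the transpose is available directly as mat.T, e.g.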
import numpy as np

def count_good(mat):
    ct = 0
    mat = np.array(mat)
    for row in mat:
        # mat.T is the transpose, so iterating over it yields the columns
        for column in mat.T:
            if myfun(row) == myfun(column):
                ct += 1
    return ct
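If numpy is not an option, the repeated calls can also be avoided in plain Python by evaluating myfun once per row and once per column before comparing. The following is only a sketch: the body of myfun is a placeholder (a plain sum is an assumption, the question never shows it), and the example matrices are made up for illustration.

```python
def myfun(values):
    # Placeholder for the asker's function; a plain sum is an assumption.
    total = 0
    for value in values:
        total += value
    return total


def count_good(mat):
    # Evaluate myfun once per row and once per column.
    row_results = [myfun(row) for row in mat]
    col_results = [myfun([row[c] for row in mat]) for c in range(len(mat[0]))]

    # Count every (row, column) pair whose results match.
    ct = 0
    for r in row_results:
        for c in col_results:
            if r == c:
                ct += 1
    return ct


print(count_good([[1, 2, 3], [7, 8, 9]]))  # 0
print(count_good([[1, 2], [2, 1]]))        # 4
```

With this layout myfun runs rows + columns times instead of rows × columns × 2 times.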

TL;DR
To get a column from a 2D list of N lists of M elements, first flatten it to a 1D list of N×M elements; then taking elements from the 1D list with a stride equal to M, the number of columns, gives you one column of the original 2D list.
First, I create a matrix of random integers, as a list of lists of equal
length. Here I take some liberty with the objective of "pure" Python; the OP
will probably enter some assigned matrix by hand.
from random import randrange, seed
seed(20220914)
dim = 5
matrix = [[randrange(dim) for column in range(dim)] for row in range(dim)]
print(*matrix, sep='\n')
We need a function to be applied to each row and each column of the matrix;
I assume that each row or column is passed to it as a list. Here I choose a
simple summation of the elements.
def myfun(l_st):
    the_sum = 0
    for value in l_st: the_sum = the_sum+value
    return the_sum
To proceed, we are going to do something unexpected: we unwrap the
matrix. Starting from an empty list, we loop over the rows and "sum" the
current row to unwrapped. Note that summing two lists gives you a single
list containing all the elements of the two lists.
unwrapped = []
for row in matrix: unwrapped = unwrapped+row
In the following we will need the number of columns in the matrix. This number
can be computed by counting the elements in the last row of the matrix (after
the loop above, row still refers to the last row).
ncols = 0
for value in row: ncols = ncols+1
Now we can compute the values produced by applying myfun to each column,
counting how many times we get the same value.
We use an auxiliary variable, start, that is initialized to zero and
incremented in every iteration of the following loop. The loop scans, using a
dummy variable, all the elements of the current row, hence start takes the
values 0, 1, ..., ncols-1, so that unwrapped[start::ncols] is a list
containing exactly one of the columns of the matrix.
count_of_column_values = {}
start = 0
for dummy in row:
    column_value = myfun(unwrapped[start::ncols])
    if column_value not in count_of_column_values:
        count_of_column_values[column_value] = 1
    else:
        count_of_column_values[column_value] = count_of_column_values[column_value] + 1
    start = start+1
At this point, we are ready to apply myfun to the rows
count = 0
for row in matrix:
    row_value = myfun(row)
    if row_value in count_of_column_values: count = count+count_of_column_values[row_value]
print(count)
Executing the code above prints
[1, 4, 4, 1, 0]
[1, 2, 4, 1, 4]
[1, 4, 4, 0, 1]
[4, 0, 3, 1, 2]
[0, 0, 4, 2, 2]
3
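The flatten-and-stride idea from the TL;DR can also be packaged as a small helper. This is a sketch only: the name columns_by_stride is made up for illustration, and it uses len, which the walkthrough above deliberately avoided.

```python
def columns_by_stride(matrix):
    # Flatten the rows into one list (row-major order).
    unwrapped = []
    for row in matrix:
        unwrapped = unwrapped + row
    ncols = len(matrix[0])
    # Column c is made of elements c, c + ncols, c + 2*ncols, ...
    return [unwrapped[start::ncols] for start in range(ncols)]


print(columns_by_stride([[1, 2, 3], [7, 8, 9]]))  # [[1, 7], [2, 8], [3, 9]]
```

All rows are assumed to have the same length; with ragged rows the stride no longer lines up with the columns.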

Related

compute matrix from list and two numbers that powers the elements

I'm trying to define a function. The function should compute a matrix from a list of numbers and two additional numbers, entered in the command line, which give the range of powers that each element in the list is raised to.
For example, if I enter powers([2,3,4],0,2) in the command line, the output should be a 3x3 matrix with the first row [2^0,2^1,2^2], the second row [3^0,3^1,3^2] and the third row [4^0,4^1,4^2].
It should look something like:
input: powers([2,3,4],0,2)
output: [[1, 2, 4],[1,3,9],[1,4,16]]
Does anyone know how to do something like that by not importing any additional package to python?
So far I have
def powers(C,a,b):
    for c in C:
        matrix=[]
        for i in range(a,b):
            c = c**i
            matrix.append(c)
    print(matrix)
But that only gives me one row of ones.
In your outer loop, you're emptying the matrix in each iteration. In your inner loop you're appending the powers directly to the matrix, when you should instead create a sub-list and append the numbers to it, then, append the sub-list to the matrix. All you need for this is a simple list comprehension:
def powers(C, a, b):
    matrix = [[c ** i for i in range(a, b + 1)] for c in C]
    return matrix
Test:
>>> powers([2, 3, 4], 0, 2)
[[1, 2, 4], [1, 3, 9], [1, 4, 16]]
The range is range(a, b + 1) because Python's range stops one step before the end (it doesn't include the end), so to include b use b + 1.
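For comparison, the sub-list approach described above can be written with explicit loops, which stays close to the original attempt. This is a sketch; the name powers_loop is made up to keep it apart from the comprehension version.

```python
def powers_loop(C, a, b):
    matrix = []                     # created once, outside both loops
    for c in C:
        row = []                    # a fresh sub-list for each base
        for i in range(a, b + 1):   # b + 1 so that b itself is included
            row.append(c ** i)      # do not overwrite c here
        matrix.append(row)
    return matrix


print(powers_loop([2, 3, 4], 0, 2))  # [[1, 2, 4], [1, 3, 9], [1, 4, 16]]
```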

Function Failing at Large List Sizes

I have a question: Starting with a 1-indexed array of zeros and a list of operations, for each operation add a value to each array element between two given indices, inclusive. Once all operations have been performed, return the maximum value in the array.
Example: n = 10, Queries = [[1,5,3],[4,8,7],[6,9,1]]
The following will be the resultant output after iterating through the array, Index 1-5 will have 3 added to it etc...:
[0,0,0, 0, 0,0,0,0,0, 0]
[3,3,3, 3, 3,0,0,0,0, 0]
[3,3,3,10,10,7,7,7,0, 0]
[3,3,3,10,10,8,8,8,1, 0]
Finally you output the max value in the final list:
[3,3,3,10,10,8,8,8,1, 0]
My current solution:
def Operations(size, Array):
    ResultArray = [0]*size
    Values = [[i.pop(2)] for i in Array]
    for index, i in enumerate(Array):
        #Current Values in Results Array = Sum between the current values in the Results Array
        #AND the added operation of equal length
        ResultArray[i[0]-1:i[1]] = list(map(sum, zip(ResultArray[i[0]-1:i[1]], Values[index]*len(ResultArray[i[0]-1:i[1]]))))
    Result = max(ResultArray)
    return Result
def main():
    nm = input().split()
    n = int(nm[0])
    m = int(nm[1])
    queries = []
    for _ in range(m):
        queries.append(list(map(int, input().rstrip().split())))
    result = Operations(n, queries)

if __name__ == "__main__":
    main()
Example input: The first line contains two space-separated integers n and m, the size of the array and the number of operations.
Each of the next m lines contains three space-separated integers a,b and k, the left index, right index and summand.
5 3
1 2 100
2 5 100
3 4 100
Compiler Error at Large Sizes:
Runtime Error
Currently this solution works for smaller final lists of length 4000; however, in other test cases where length = 10,000,000 it fails. I do not know why this is the case and I cannot provide the example input since it is so massive. Is there any clear reason why it would fail in larger cases?
I think the problem is that you make too many intermediate throwaway lists here:
ResultArray[i[0]-1:i[1]] = list(map(sum, zip(ResultArray[i[0]-1:i[1]], Values[index]*len(ResultArray[i[0]-1:i[1]]))))
ResultArray[i[0]-1:i[1]] produces a new list, and you do it twice; one of those is only used to get its size, which is a complete waste of resources. Then you make another list with Values[index]*len(...), and finally list(map(...)) builds yet another list that is also thrown away once it has been assigned into the original. That makes 4 throwaway lists. For example, if the slice size is 5,000,000, you are allocating 4 of those, or 20,000,000 extra elements, 15,000,000 of which you don't really need; and if your original list has 10,000,000 elements, well, just do the math...
You can get the same result as your list(map(...)) with a list comprehension like
[v+Values[index][0] for v in ResultArray[i[0]-1:i[1]] ]
Now we use two fewer lists, and we can drop one more by making it a generator expression, given that slice assignment does not require a list specifically, just something iterable:
(v+Values[index][0] for v in ResultArray[i[0]-1:i[1]] )
I don't know whether slice assignment turns it into a list internally, but hopefully it doesn't, and with that we are down to just one extra list.
Here is an example:
>>> a=[0]*10
>>> a
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> a[1:5] = (3+v for v in a[1:5])
>>> a
[0, 3, 3, 3, 3, 0, 0, 0, 0, 0]
>>>
We can reduce it to zero extra lists (assuming that internally it doesn't make one) by using itertools.islice:
>>> import itertools
>>> a[3:7] = (1+v for v in itertools.islice(a,3,7))
>>> a
[0, 3, 3, 4, 4, 1, 1, 0, 0, 0]
>>>
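Putting the islice idea back into the original function could look like the sketch below. The name operations and the unpacking of each query into a, b, k are choices made here for illustration; the approach relies on the slice-assignment behaviour demonstrated in the session above.

```python
from itertools import islice


def operations(size, queries):
    # queries holds [a, b, k] triples with 1-based, inclusive indices.
    result = [0] * size
    for a, b, k in queries:
        # islice reads the range lazily instead of copying it with a slice.
        result[a - 1:b] = (v + k for v in islice(result, a - 1, b))
    return max(result)


print(operations(10, [[1, 5, 3], [4, 8, 7], [6, 9, 1]]))  # 10
```

Unlike the original, this version does not pop values out of the query lists, so the caller's input is left unchanged.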

Rearrange array element based on the number sequence and represent by array id

Suppose I have multiple arrays inside one array, which together contain the numbers from 0 to n in some order.
For example,
x = [[0,2,3,5],[1,4]]
Here we have two arrays in x. There could be more than two.
I want to rearrange all the array elements based on their number sequence, but represent each one by its array ID. The result should look like this:
y = [0,1,0,0,1,0]
That means 0,2,3,5 are in array id 0, so they show that id at their respective positions. The same goes for 1 and 4. Can anyone help me solve this? [N.B. There could be more than two arrays, so it would be highly appreciated if the code worked for any number of arrays.]
You can do this by using a dictionary:
x = [[0,2,3,5],[1,4]]
lst = {}
for i in range(len(x)):
    for j in range(len(x[i])):
        lst[x[i][j]] = i
print(lst)
You can also do this with a list. list.insert(idx, value) inserts value into the list at index idx. Here, we traverse all the values of x, and the value x[i][j] belongs to the i-th array.
x = [[0,2,3,5],[1,4]]
lst = []
for i in range(len(x)):
    for j in range(len(x[i])):
        lst.insert(x[i][j], i)
print(lst)
Output: [0, 1, 0, 0, 1, 0]
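One caveat with the insert-based version: list.insert appends when the index is past the current end of the list, so the result depends on the order in which values arrive. For example, x = [[0,5],[1,4],[2,3]] yields [0, 1, 2, 2, 0, 1] instead of [0, 1, 2, 2, 1, 0]. A sketch that avoids this by pre-sizing the list and assigning by position follows; the name array_ids is made up, and it assumes the values are exactly 0 to n with no gaps or repeats.

```python
def array_ids(x):
    # Total number of values; they are assumed to be exactly 0 .. total-1.
    total = 0
    for sub in x:
        total += len(sub)

    y = [None] * total
    for i in range(len(x)):
        for value in x[i]:
            y[value] = i        # position = the number, content = array id
    return y


print(array_ids([[0, 2, 3, 5], [1, 4]]))      # [0, 1, 0, 0, 1, 0]
print(array_ids([[0, 5], [1, 4], [2, 3]]))    # [0, 1, 2, 2, 1, 0]
```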
You might also consider using np.argsort for rearranging your array values and create the index-array with list comprehension:
import numpy as np

x = [[0,2,3,5],[1,4]]
order = np.concatenate(x).argsort()
np.concatenate([ [i]*len(e) for i,e in enumerate(x) ])[order]
# array([0, 1, 0, 0, 1, 0])

Loop over clump_masked indices

I have an array y_filtered that contains some masked values. I want to replace these values by some value I calculate based on their neighbouring values. I can get the indices of the masked values by using masked_slices = ma.clump_masked(y_filtered). This returns a list of slices, e.g. [slice(194, 196, None)].
I can easily get the values from my masked array by using y_filtered[masked_slices], and even loop over them. However, I need to access the index of each value as well, so I can calculate its new value based on its neighbours. Enumerate (logically) returns 0, 1, etc. instead of the indices I need.
Here's the solution I came up with.
# get indices of masked data
masked_slices = ma.clump_masked(y_filtered)
y_enum = [(i, y_i) for i, y_i in zip(range(len(y_filtered)), y_filtered)]
for sl in masked_slices:
    for i, y_i in y_enum[sl]:
        # simplified example calculation
        y_filtered[i] = np.average(y_filtered[i-2:i+2])
It is a very ugly method in my opinion, and I think there has to be a better way to do this. Any suggestions?
Thanks!
EDIT:
I figured out a better way to achieve what I think you want to do. This code takes every window of 5 elements and computes its (masked) average, then uses those values to fill the gaps in the original array. If an index does not have any unmasked value close enough, it is simply left masked:
import numpy as np
from numpy.lib.stride_tricks import as_strided
SMOOTH_MARGIN = 2
x = np.ma.array(data=[1, 2, 3, 4, 5, 6, 8, 9, 10],
                mask=[0, 1, 0, 0, 1, 1, 1, 1, 0])
print(x)
# [1 -- 3 4 -- -- -- -- 10]
pad_data = np.pad(x.data, (SMOOTH_MARGIN, SMOOTH_MARGIN), mode='constant')
pad_mask = np.pad(x.mask, (SMOOTH_MARGIN, SMOOTH_MARGIN), mode='constant',
                  constant_values=True)
k = 2 * SMOOTH_MARGIN + 1
isize = x.dtype.itemsize
msize = x.mask.dtype.itemsize
x_pad = np.ma.array(
    data=as_strided(pad_data, (len(x), k), (isize, isize), writeable=False),
    mask=as_strided(pad_mask, (len(x), k), (msize, msize), writeable=False))
x_avg = np.ma.average(x_pad, axis=1).astype(x_pad.dtype)
fill_mask = ~x_avg.mask & x.mask
result = x.copy()
result[fill_mask] = x_avg[fill_mask]
print(result)
# [1 2 3 4 3 4 10 10 10]
(note all the values are integers here because x was originally of integer type)
The originally posted code has a few errors. First, it both reads and writes values of y_filtered in the loop, so the results at later indices are affected by previous iterations; this could be fixed by reading from a copy of the original y_filtered. Second, [i-2:i+2] should probably be [max(i-2, 0):i+3], so that the window is symmetric and never starts before index zero.
You could do this:
from itertools import chain
import numpy as np
import numpy.ma as ma

# get indices of masked data
masked_slices = ma.clump_masked(y_filtered)
# expand each slice into the individual indices it covers
for idx in chain.from_iterable(range(s.start, s.stop) for s in masked_slices):
    y_filtered[idx] = np.average(y_filtered[max(idx - 2, 0):idx + 3])

Python - Select elements from matrix within range

I have a question regarding python and selecting elements within a range.
If I have an n x m matrix with n rows and m columns, I have a defined range for each column (so I have m min and max values).
Now I want to select those rows, where all values are within the range.
Looking at the following example:
input = matrix([[1, 2], [3, 4],[5,6],[1,8]])
boundaries = matrix([[2,1],[8,5]])
#Note:
#col1min = 2
#col1max = 8
#col2min = 1
#col2max = 5
print(input)
desired_result = matrix([[3, 4]])
print(desired_result)
Here, 3 rows were discarded, because they contained values beyond the boundaries.
While I was able to get values within one range for a given array, I did not manage to solve this problem efficiently.
Thank you for your help.
I believe there is a more elegant solution, but I came up with this:
def foo(data, boundaries):
    zipped_bounds = list(zip(*boundaries))
    output = []
    for item in data:
        for index, bound in enumerate(zipped_bounds):
            if not (bound[0] <= item[index] <= bound[1]):
                break
        else:
            output.append(item)
    return output
data = [[1, 2], [3, 4], [5, 6], [1, 8]]
boundaries = [[2, 1], [8, 5]]
foo(data, boundaries)
Output:
[[3, 4]]
I know that this does not check the sizes of the arrays or raise exceptions when they don't match. I leave it to the OP to implement this.
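The for/else in foo appends a row only when the inner loop finishes without hitting break. The same filter can be sketched with all(), which some readers find easier to follow. The name rows_in_range is made up for illustration, and the sketch assumes boundaries has one minimum and one maximum per column.

```python
def rows_in_range(data, boundaries):
    mins, maxs = boundaries[0], boundaries[1]
    # Keep a row only if every value lies within its column's [min, max].
    return [
        row for row in data
        if all(mins[c] <= row[c] <= maxs[c] for c in range(len(row)))
    ]


data = [[1, 2], [3, 4], [5, 6], [1, 8]]
boundaries = [[2, 1], [8, 5]]
print(rows_in_range(data, boundaries))  # [[3, 4]]
```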
Your example data uses matrix([[...], ...]), which is not valid unless a matrix type is defined somewhere, so it needs to be restructured like this:
matrix = [[1, 2], [3, 4],[5,6],[1,8]]
bounds = [[2,1],[8,5]]
I'm not sure exactly what you mean by "efficient", but this solution is readable, computationally efficient, and modular:
# Test columns in row against column bounds or first bounds
def row_in_bounds(row, bounds):
    for ci, colVal in enumerate(row):
        bi = ci if len(bounds[0]) >= ci + 1 else 0
        if not bounds[1][bi] >= colVal >= bounds[0][bi]:
            return False
    return True

# Use a list comprehension to apply test to n rows
print([r for r in matrix if row_in_bounds(r, bounds)])
# Output: [[3, 4]]
First we create a reusable test function for rows that accepts a list of bounds lists; tuples are probably more appropriate, but I stuck with lists as per your specification.
Then we apply the test to your matrix of n rows with a list comprehension. If a row has more columns than there are bounds, the extra columns are tested against the first column's bounds.
Keeping the row iterator out of the row parser function allows you to do things like get min/max from the filtered elements as required. This way you will not need to define a new function for every manipulation of the data required.
