I am working through some algorithm challenges to get more Python practice. I am stuck on one that requires changing values inside a Python matrix (a list of lists).
# Challenge
# After they became famous, the CodeBots all decided to move to a new building and live together. The building is represented by a
# rectangular matrix of rooms. Each cell in the matrix contains an integer that represents the price of the room. Some rooms are
# free (their cost is 0), but that's probably because they are haunted, so all the bots are afraid of them. That is why any room
# that is free or is located anywhere below a free room in the same column is not considered suitable for the bots to live in.
# ex: matrix = [[0, 1, 1, 2],       [[x, 1, 1, 2],
#               [0, 5, 0, 0],  -->   [x, 5, x, x],  -->  5 + 1 + 1 + 2 = 9
#               [2, 0, 3, 3]]        [x, x, x, x]]
My approach is two-fold: 1) first find all zeros in the matrix and replace each with 'x'; 2) then loop through all the lists, find the index of each existing 'x', and use that index to look in the other lists, replacing the numeric value with 'x' if that number is 'below' the existing 'x'. Hopefully that makes sense. I have the first part down, and I have attempted the second part a number of different ways, but I am now running into an error. I feel like I'm very close. I also feel like my code is pretty inefficient (I am new to Python), so if there is a far more efficient way to do it, please let me know.
I understand what the error means, but I am having a hard time fixing it while still getting the correct answer. The error is that the index is out of range.
My code:
def matrixElementsSum(matrix):
    numList = len(matrix) # essentially the number of 'rows' -> number of lists
    numCol = len(matrix[0]) # number of values in each list
    # replace 0's in each list with 'x'
    for x in matrix:
        if x.count(0) > 0:
            for index, i in enumerate(x):
                if i == 0:
                    x[index] = 'x'
    for x in matrix:
        for y in matrix[x]:
            if(matrix[x][y] == 'x'):
                x_ind = y
                for z in matrix:
                    if(z < x):
                        matrix[z][x_ind] = 'x'
    print(matrix)
Test scenario:
matrixElementsSum([[0, 1, 1, 2],
[0, 5, 0, 0],
[2, 0, 3, 3]])
You are still going to require nested for loops in some fashion or another since you are iterating over a list of lists, but you could simplify the logic a bit using list comprehensions.
def solver(matrix):
    mx = [[v if v else 'x' for v in row] for row in matrix]
    mxx = [[v1 if v2 else 'x' for v1, v2 in zip(row1, row2)] for row1, row2 in zip(mx[1:], matrix)]
    return mx[:1] + mxx
I first iterate over the matrix, and replace '0's with "x"s in the new matrix mx.
mx = [[v if v else 'x' for v in row] for row in matrix]
This is just a nested list comprehension, where we operate on each element per row, for every row per matrix. The ... if ... else ... is just your classic ternary operator. If v holds (is not zero in our case), then it evaluates to the value before the "if", otherwise it evaluates to the value after the "else" - in this case 'x'.
I then repeat the process, but offsetting the rows by one so that I can check if the above element is now an "x".
mxx = [[v1 if v2 else 'x' for v1, v2 in zip(row1, row2)] for row1, row2 in zip(mx[1:], matrix)]
There's a bit to break down here. Let's start from the "outside" and work our way in.
... for row1, row2 in zip(mx[1:], matrix)
This zips the new matrix, offset by one (using [1:] slice notation), with the original matrix. So it returns an iterable functionally equivalent to the following list:
[(mx_row1, matrix_row0), (mx_row2, matrix_row1), (mx_row3, matrix_row2), ...]
This allows us to extract a given row and the row above it simultaneously, as row1 and row2. Then the other half-
[v1 if v2 else 'x' for v1, v2 in zip(row1, row2)]
-repeats a similar process on each element per row, rather than row per matrix. We do not offset the elements in either row like we offset the rows of the mx matrix, but otherwise the logic is identical. We then again compare with our ternary operator to see if the above element is a 0, and if so evaluate to 'x'. We could easily have changed this to compare each element of each row of mx to 'x' instead of matrix to 0, but I decided to mirror the first list comprehension.
Once I have this new matrix mxx, I simply prepend the first row of mx, because we effectively skip that row when we offset our comparison. The result is a matrix with all 0s and elements below replaced with "x"s.
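A quick way to check this version is to run it on the example from the question. The sketch below repeats solver from above unchanged; the names building, marked, total and tall are only for illustration. It also shows a limitation: because each row is compared only with the row directly above it, a room two or more floors below a free room is not marked.

```python
def solver(matrix):
    # Step 1: replace every 0 with 'x'.
    mx = [[v if v else 'x' for v in row] for row in matrix]
    # Step 2: mark a cell when the cell directly above it in the original is 0.
    mxx = [[v1 if v2 else 'x' for v1, v2 in zip(row1, row2)]
           for row1, row2 in zip(mx[1:], matrix)]
    return mx[:1] + mxx


building = [[0, 1, 1, 2],
            [0, 5, 0, 0],
            [2, 0, 3, 3]]
marked = solver(building)
# Sum whatever is still a number.
total = sum(v for row in marked for v in row if v != 'x')
print(marked)
print(total)

# Only the row directly above is inspected, so the bottom room stays unmarked here.
tall = solver([[0], [1], [1]])
print(tall)
```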
As per the clarification in the comments, if you wish to mark an "x" when any of the elements above is 0, not just the one directly above, you can take a slice of that column of the matrix (up to and including the current row) and use the all() builtin to check that none of them is 0. Revised code below:
def solver(matrix):
    return [[v if all(col) else 'x' for v, col in zip(row, zip(*matrix[:idx]))] for idx, row in enumerate(matrix, 1)]
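If only the final sum from the challenge is needed, the matrix does not have to be marked at all. A minimal sketch, assuming every row has the same length (the name matrix_elements_sum is mine, not from the challenge): walk each column from the top and stop at the first free room.

```python
def matrix_elements_sum(matrix):
    total = 0
    # zip(*matrix) yields the columns of the matrix, top to bottom.
    for column in zip(*matrix):
        for price in column:
            if price == 0:
                # Everything below a free room is unsuitable, so skip the rest.
                break
            total += price
    return total


example = [[0, 1, 1, 2],
           [0, 5, 0, 0],
           [2, 0, 3, 3]]
print(matrix_elements_sum(example))
```

This avoids mixing 'x' strings and integers in the same list, which keeps the summation step simple.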
I have a certain function that I made and I want to run it on each column and each row of a matrix, to check if there are rows and columns that produce the same output.
for example:
matrix = [[1,2,3],
[7,8,9]]
I want to run the function, let's call it myfun, on each column [1,7], [2,8] and [3,9] separately, and also run it on each row [1,2,3] and [7,8,9]. If a row and a column produce the same result, the counter ct goes up by 1. All of this happens in another function, called count_good, which counts the rows and columns that produce the same result.
here is the code so far:
def count_good(mat):
    ct = 0
    for i in mat:
        for j in mat:
            if myfun(i) == myfun(j):
                ct += 1
    return ct
However, when I use print to check my code I get this:
mat = [[1,2,3],[7,8,9]]

for i in mat:
    for j in mat:
        print(i,j)

[1, 2, 3] [1, 2, 3]
[1, 2, 3] [7, 8, 9]
[7, 8, 9] [1, 2, 3]
[7, 8, 9] [7, 8, 9]
I see that the code does not return what I need, which means that the count_good function won't work. How can I run a function on each row and each column? I need to do it without the help of any outside libraries, and without map, zip or anything like that; only pure Python.
Let's start by using itertools and collections for this, then translate it back to "pure" python.
from itertools import product, starmap, chain # combinations?
from collections import Counter
To iterate in a nested loop efficiently, you can use itertools.product. You can use starmap to expand the arguments of a function as well. Here is a generator of the values of myfun over the rows:
starmap(myfun, product(matrix, repeat=2))
To transpose the matrix and iterate over the columns, use the zip(*matrix) idiom:
starmap(myfun, product(zip(*matrix), repeat=2))
You can use collections.Counter to map all the repeats for each possible return value:
Counter(starmap(myfun, chain(product(matrix, repeat=2), product(zip(*matrix), repeat=2))))
If you want to avoid running myfun on the same elements, replace product(..., repeat=2) with combinations(..., 2).
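For the goal stated in the question, comparing each row with each column, the same building blocks can pair rows with columns directly. A small sketch, assuming myfun takes a single list-like argument as in the question; the builtin sum stands in for it here purely as a placeholder.

```python
from itertools import product

matrix = [[1, 2, 3],
          [7, 8, 9]]
myfun = sum  # placeholder for the asker's function (assumption)

columns = list(zip(*matrix))            # transpose: one tuple per column
pairs = list(product(matrix, columns))  # every (row, column) combination

# Count the pairs whose row and column give the same result.
ct = 0
for row, column in pairs:
    if myfun(row) == myfun(column):
        ct += 1

print(columns)
print(len(pairs), ct)
```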
Now that you have the layout of how to do this, replace all the external library stuff with equivalent builtins:
counter = {}
for i in range(len(matrix)):
    for j in range(len(matrix)):
        result = myfun(matrix[i], matrix[j])
        counter[result] = counter.get(result, 0) + 1
for i in range(len(matrix[0])):
    for j in range(len(matrix[0])):
        c1 = [matrix[row][i] for row in range(len(matrix))]
        c2 = [matrix[row][j] for row in range(len(matrix))]
        result = myfun(c1, c2)
        counter[result] = counter.get(result, 0) + 1
If you want combinations instead, replace the loop pairs with
for i in range(len(...) - 1):
    for j in range(i + 1, len(...)):
Using native python:
def count_good(mat):
    ct = 0
    columns = [[row[col_idx] for row in mat] for col_idx in range(len(mat[0]))]
    for row in mat:
        for column in columns:
            if myfun(row) == myfun(column):
                ct += 1
    return ct
However, this is very inefficient as it is a triple nested for-loop. I would suggest using numpy instead.
e.g.
import numpy as np

def count_good(mat):
    ct = 0
    mat = np.array(mat)
    for row in mat:
        # mat.T is the transpose, so iterating over it yields the columns
        for column in mat.T:
            if myfun(row) == myfun(column):
                ct += 1
    return ct
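Independently of numpy, some repeated work can be avoided by evaluating the function once per row and once per column before comparing the results. A sketch in plain Python; passing the function in as a parameter (func) is my own choice to keep the example self-contained, and sum is only a stand-in for myfun.

```python
def count_good(mat, func):
    # Build the columns once.
    columns = [[row[c] for row in mat] for c in range(len(mat[0]))]
    # Evaluate func once per row and once per column.
    row_results = [func(row) for row in mat]
    col_results = [func(col) for col in columns]
    ct = 0
    for r in row_results:
        for c in col_results:
            if r == c:
                ct += 1
    return ct


print(count_good([[1, 2, 3], [7, 8, 9]], sum))
print(count_good([[1, 2], [2, 1]], sum))
```

With n rows and m columns this calls func n + m times instead of 2 × n × m times; the comparison loop itself is unchanged.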
TL;DR
To get a column from a 2D list of N lists of M elements, first flatten it into a 1D list of N×M elements. Taking elements from that 1D list with a stride equal to M, the number of columns, then gives you one column of the original 2D list.
First, I create a matrix of random integers, as a list of lists of equal length. Here I take some liberty with the objective of "pure" Python; the OP will probably enter some assigned matrix by hand.
from random import randrange, seed
seed(20220914)
dim = 5
matrix = [[randrange(dim) for column in range(dim)] for row in range(dim)]
print(*matrix, sep='\n')
We need a function to be applied to each row and each column of the matrix; I assume each of them is passed to it as a list. Here I choose a simple summation of the elements.
def myfun(l_st):
    the_sum = 0
    for value in l_st: the_sum = the_sum+value
    return the_sum
To proceed, we are going to do something unexpected: we unwrap the matrix. Starting from an empty list, we loop over the rows and "sum" the current row to unwrapped. Note that summing two lists gives you a single list containing all the elements of the two lists.
unwrapped = []
for row in matrix: unwrapped = unwrapped+row
In the following we will need the number of columns in the matrix; this number can be computed by counting the elements in the last row of the matrix (row still refers to it after the loop above).
ncols = 0
for value in row: ncols = ncols+1
Now we can compute the values produced by applying myfun to each column, counting how many times each value occurs.
We use an auxiliary variable, start, that is initialized to zero and incremented in every iteration of the following loop. The loop scans, using a dummy variable, all the elements of the current row, hence start takes the values 0, 1, ..., ncols-1, so that unwrapped[start::ncols] is a list containing exactly one of the columns of the matrix.
count_of_column_values = {}
start = 0
for dummy in row:
    column_value = myfun(unwrapped[start::ncols])
    if column_value not in count_of_column_values:
        count_of_column_values[column_value] = 1
    else:
        count_of_column_values[column_value] = count_of_column_values[column_value] + 1
    start = start+1
At this point, we are ready to apply myfun to the rows
count = 0
for row in matrix:
    row_value = myfun(row)
    if row_value in count_of_column_values: count = count+count_of_column_values[row_value]
print(count)
Executing the code above prints
[1, 4, 4, 1, 0]
[1, 2, 4, 1, 4]
[1, 4, 4, 0, 1]
[4, 0, 3, 1, 2]
[0, 0, 4, 2, 2]
3
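The stride trick from the TL;DR is easier to see in isolation on a small matrix. This is only an illustrative sketch: the 2×3 matrix is the one from the question, and the names flat and columns are mine.

```python
matrix = [[1, 2, 3],
          [7, 8, 9]]
ncols = len(matrix[0])

# Flatten the list of lists into one long list: [1, 2, 3, 7, 8, 9]
flat = []
for row in matrix:
    flat = flat + row

# Every ncols-th element, starting at offset `start`, belongs to the same column.
columns = [flat[start::ncols] for start in range(ncols)]
print(columns)
```

Here flat[0::3] picks positions 0 and 3, flat[1::3] picks 1 and 4, and so on, which is exactly one column per starting offset.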
Consider the variable "A" and an array containing several nested arrays. Both "A" and the nested arrays contain the same number of elements, in this case 5. The nested arrays are themselves nested together in groups of 3.
A=[10,20,30,40,50]
array=[[[1,5,8,3,4],[18,4,-8,4,21],[-8,12,42,16,-9]], ...]
I was wondering how I can replace the elements in each nested array with the corresponding elements in A if the value of the element in the nested array exceeds a certain threshold. Otherwise, if the element fails to exceed the threshold, replace with zero.
For example, if the threshold is 10 in this example, the result would be:
array=[[[0,0,0,0,0],[10,0,0,0,50],[0,20,30,40,0]], ...]
I know this might be a simple problem, but I'm having trouble comprehending multidimensional arrays, especially if they are greater than 2-dimensions. The bigger question is, how would I do this if those arrays are nested under many arrays without using several for loops? My incorrect attempt:
for a in array:
    for x in a:
        for i in x:
            if a[i]>10:
                a[i]=A[i]
            else:
                a[i]=0
Your attempt is not working because first of all you are using the list value i as an index. In the line for i in x: the variable i will take each value of the list x, and if you need the index of it as well, you can use for id, i in enumerate(x) which gives you each value of the list as i and its index as id.
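The difference between iterating over values and iterating over (index, value) pairs can be seen on one of the inner lists. This is just an illustration; the variable names are not taken from the question.

```python
x = [18, 4, -8, 4, 21]

# Iterating directly yields the values themselves, not their positions.
for i in x:
    print(i)

# enumerate yields (index, value) pairs, so both are available at once.
for idx, value in enumerate(x):
    print(idx, value)

indexed = list(enumerate(x))
```

Using a value such as 18 as an index into a 5-element list is what makes the original attempt fail.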
Moreover, to update the array, it is not enough to update x inside the loop, you need to update the array directly. And of course you can use list comprehension for simplicity in the last loop. So a solution to your problem could look like this:
for i1, val1 in enumerate(array):
    for i2, val2 in enumerate(val1):
        array[i1][i2] = [y if x>10 else 0 for (x, y) in zip(val2, A)]
As for your bigger question, the general solution when you have multiple nested lists and you don't want to use for loops is to implement recursive functions.
Here, one recursive solution to your problem would be:
def my_recursive_fun(input):
    if isinstance(input, list) and isinstance(input[0], list):
        return [my_recursive_fun(item) for item in input]
    else:
        return [y if x>10 else 0 for (x, y) in zip(input, [10,20,30,40,50])]
array=[[[1,5,8,3,4],[18,4,-8,4,21],[-8,12,42,16,-9]]]
new_array = my_recursive_fun(array)
The good thing about a recursive solution is that it works with any number of nested lists (of course there are limits) without changing the code.
If the nesting of your array is arbitrarily deep, then go for a recursive function:
def apply_threshold(substitutes, limit, source):
    if isinstance(source[0], list): # recursive case
        return [apply_threshold(substitutes, limit, elem) for elem in source]
    else: # base case
        # strictly greater than, since the question asks for values that exceed the threshold
        return [subst if value > limit else 0
                for subst, value in zip(substitutes, source)]
Here is how to use it:
A = [10,20,30,40,50]
array = [[[1,5,8,3,4],[18,4,-8,4,21],[-8,12,42,16,-9]]]
result = apply_threshold(A, 10, array)
print(result) # [[[0, 0, 0, 0, 0], [10, 0, 0, 0, 50], [0, 20, 30, 40, 0]]]
Suppose I have multiple arrays inside one array, which together contain the numbers from 0 to n in some order.
For example,
x = [[0,2,3,5],[1,4]]
Here we have two arrays in x. There could be more than two.
I want to rearrange all the array elements according to their numeric order, but represent each of them by its array ID. The result should look like this:
y = [0,1,0,0,1,0]
That means 0, 2, 3 and 5 are in array ID 0, so they show that ID at their respective positions. The same goes for 1 and 4. Can anyone help me solve this? [N.B. There could be more than two arrays, so it would be highly appreciated if the code works for any number of arrays.]
You can do this by using a dictionary
x = [[0,2,3,5],[1,4]]
lst = {}
for i in range(len(x)):
    for j in range(len(x[i])):
        lst[x[i][j]] = i
print(lst)
You can also do this using a list. list.insert(idx, value) inserts value into the list at index idx. Here, we traverse all the values of x, and the value x[i][j] belongs to the i-th array.
x = [[0,2,3,5],[1,4]]
lst = []
for i in range(len(x)):
    for j in range(len(x[i])):
        lst.insert(x[i][j], i)
print(lst)
Output: [0, 1, 0, 0, 1, 0]
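One caveat with the insert-based version: the result depends on the order in which values are inserted. For example, with x = [[0,3],[2],[1]] it produces [0, 2, 0, 1] rather than the expected [0, 2, 1, 0]. A sketch that does not depend on insertion order, assuming the values are exactly the integers 0 to n with no gaps or repeats (the function name membership_by_value is made up for this example):

```python
def membership_by_value(groups):
    # One slot per value; assumes the values are exactly 0..n, each appearing once.
    size = sum(len(group) for group in groups)
    result = [None] * size
    for group_id, group in enumerate(groups):
        for value in group:
            # The value itself is the position in the result.
            result[value] = group_id
    return result


print(membership_by_value([[0, 2, 3, 5], [1, 4]]))
print(membership_by_value([[0, 3], [2], [1]]))
```

Because every value is written straight to its own slot, the sub-arrays do not need to be sorted or processed in any particular order.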
You might also consider using np.argsort to rearrange your array values, creating the index array with a list comprehension:
import numpy as np
x = [[0,2,3,5],[1,4]]
order = np.concatenate(x).argsort()
np.concatenate([ [i]*len(e) for i,e in enumerate(x) ])[order]
Output: array([0, 1, 0, 0, 1, 0])
I'm trying to iterate through a two-dimensional array in Python and compare items in the array to ints; however, I am faced with a variety of errors whenever I attempt to do so. I'm using numpy and pandas.
My dataset is created as follows:
filename = "C:/Users/User/My Documents/JoeTest.csv"
datas = pandas.read_csv(filename)
dataset = datas.values
Then, I attempt to go through the data, grabbing certain elements of it.
def model_building(data):
    global blackKings
    flag = 0;
    blackKings.append(data[0][1])
    for i in data:
        if data[i][39] == 1:
            if data[i][40] == 1:
                values.append(1)
            else:
                values.append(-1)
        else:
            if data[i][40] == 1:
                values.append(-1)
            else:
                values.append(1)
        for j in blackKings:
            if blackKings[j] != data[i][1]:
                flag = 1
        if flag == 1:
            blackKings.append(data[i][1])
            flag = 0;
However, doing so leaves me with "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()". I don't want to use either of these, as I'm looking to compare the actual value of that one specific instance. Is there another way around this problem?
You need to tell us something about this: dataset = datas.values
It's probably a 2d array, since it derives from a load of a csv. But what shape and dtype? Maybe even a sample of the array.
Is that the data argument in the function?
What are blackKings and values? You treat them like lists (with append).
for i in data:
    if data[i][39] == 1:
This doesn't make sense. With for i in data, if data is 2d, i is the first row, then the second row, etc. If you want i to be an index, use something like
for i in range(data.shape[0]):
2d array indexing is normally done with data[i,39].
But in your case data[i][39] is probably an array.
Any time you use an array in an if statement, you'll get this ValueError, because there are multiple values.
If i were a proper index, then data[i,39] would be a single value.
To illustrate:
In [41]: data=np.random.randint(0,4,(4,4))
In [42]: data
Out[42]:
array([[0, 3, 3, 2],
[2, 1, 0, 2],
[3, 2, 3, 1],
[1, 3, 3, 3]])
In [43]: for i in data:
    ...:     print('i',i)
    ...:     print('data[i]',data[i].shape)
    ...:
i [0 3 3 2] # 1st row
data[i] (4, 4)
i [2 1 0 2] # 2nd row, again a 4 element array
data[i] (4, 4)
...
Here i is a 4 element array; using that to index data[i] actually produces a (4, 4) array; it isn't selecting one value, but rather many values.
Instead you need to iterate in one of these ways:
In [46]: for row in data:
    ...:     if row[3]==1:
    ...:         print(row)
[3 2 3 1]
In [47]: for i in range(data.shape[0]):
    ...:     if data[i,3]==1:
    ...:         print(data[i])
[3 2 3 1]
To debug a problem like this you need to look at intermediate values, and especially their shapes. Don't just assume. Check!
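Putting the row-wise iteration together, here is a plain-Python sketch of what the loop in the question appears to be aiming for. It rests on assumptions: that the flag logic is meant to collect each distinct value of column 1 once, and that the column positions can be passed in as parameters (col_a, col_b and king_col are names I made up; the question uses 39, 40 and 1). It iterates over rows, so it works on a list of lists and should work the same way on the rows of a 2d array.

```python
def model_building(rows, col_a=39, col_b=40, king_col=1):
    values = []
    black_kings = []
    for row in rows:
        # 1 when both flags agree, -1 when exactly one of them equals 1
        agree = (row[col_a] == 1) == (row[col_b] == 1)
        values.append(1 if agree else -1)
        # Collect each distinct value of the "king" column once (assumed intent).
        if row[king_col] not in black_kings:
            black_kings.append(row[king_col])
    return values, black_kings


# Small stand-in data set: columns 2 and 3 play the role of columns 39 and 40.
sample = [
    [0, 7, 1, 1],
    [0, 7, 1, 0],
    [0, 9, 0, 1],
    [0, 9, 0, 0],
]
values, kings = model_building(sample, col_a=2, col_b=3, king_col=1)
print(values, kings)
```

Returning the two lists instead of appending to globals makes the function easier to test on a small sample before running it on the full data set.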
I'm going to attempt to rewrite your function
def model_building(data):
    global blackKings
    # np.append returns a new array, so blackKings is an ndarray from here on
    blackKings = np.append(blackKings, data[0, 1])
    # Your nested if statements were performing an xor
    # This is vectorized version of the same thing
    values = np.logical_xor(*(data.T[[39, 40]] == 1)) * -2 + 1
    # not sure where `values` is defined. If you really wanted to
    # append to it, you can do
    # values = np.append(values, np.logical_xor(*(data.T[[39, 40]] == 1)) * -2 + 1)
    # Your blackKings / flag logic can be reduced
    # Rows of the comparison are kings, columns are data values; all(0) keeps
    # the data values that differ from every king
    mask = (blackKings[:, None] != data[:, 1]).all(0)
    blackKings = np.append(blackKings, data[:, 1][mask])
This may not be perfect because it is difficult to parse your logic considering you are missing some pieces. But hopefully you can adopt some of what I've included here and improve your code.
I have a question regarding python and selecting elements within a range.
If I have an n x m matrix with n rows and m columns, I have a defined range for each column (so I have m min and max values).
Now I want to select those rows, where all values are within the range.
Looking at the following example:
input = matrix([[1, 2], [3, 4],[5,6],[1,8]])
boundaries = matrix([[2,1],[8,5]])
#Note:
#col1min = 2
#col1max = 8
#col2min = 1
#col2max = 5
print(input)
desired_result = matrix([[3, 4]])
print(desired_result)
Here, 3 rows were discarded because they contained values outside the boundaries.
While I was able to get values within one range for a given array, I did not manage to solve this problem efficiently.
Thank you for your help.
I believe that there is a more elegant solution, but I came up with this:
def foo(data, boundaries):
    zipped_bounds = list(zip(*boundaries))
    output = []
    for item in data:
        for index, bound in enumerate(zipped_bounds):
            if not (bound[0] <= item[index] <= bound[1]):
                break
        else:
            # for/else: runs only when the inner loop finished without a break
            output.append(item)
    return output
data = [[1, 2], [3, 4], [5, 6], [1, 8]]
boundaries = [[2, 1], [8, 5]]
foo(data, boundaries)
Output:
[[3, 4]]
I know that this does not check the sizes of the arrays or raise exceptions if they don't match. I leave it to the OP to implement this.
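The same filter can also be written more compactly with all() and zip. This is only a sketch: it takes the lower and upper bounds as two separate lists, and the function name rows_within_bounds is hypothetical. Like the version above, it does not validate that the sizes match.

```python
def rows_within_bounds(rows, lower, upper):
    # Keep a row only if every value lies within its column's [lower, upper] range.
    return [row for row in rows
            if all(lo <= value <= hi
                   for value, lo, hi in zip(row, lower, upper))]


data = [[1, 2], [3, 4], [5, 6], [1, 8]]
boundaries = [[2, 1], [8, 5]]
selected = rows_within_bounds(data, boundaries[0], boundaries[1])
print(selected)
```

all() stops at the first value that is out of range, so it plays the same role as the break in the loop version.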
Your example data syntax, matrix([[],..]), is not correct, so it needs to be restructured like this:
matrix = [[1, 2], [3, 4],[5,6],[1,8]]
bounds = [[2,1],[8,5]]
I'm not sure exactly what you mean by "efficient", but this solution is readable, computationally efficient, and modular:
# Test columns in row against column bounds or first bounds
def row_in_bounds(row, bounds):
    for ci, colVal in enumerate(row):
        bi = ci if len(bounds[0]) >= ci + 1 else 0
        if not bounds[1][bi] >= colVal >= bounds[0][bi]:
            return False
    return True
# Use a list comprehension to apply test to n rows
print([r for r in matrix if row_in_bounds(r,bounds)])
>>>[[3, 4]]
First we create a reusable test function for rows that accepts a list of bounds lists. Tuples are probably more appropriate, but I stuck with lists as per your specification.
Then we apply the test to your matrix of n rows with a list comprehension. If a row has more columns than there are bounds, the first set of bounds provided is used for the extra columns.
Keeping the row iterator out of the row test function allows you to do things like get the min/max of the filtered elements as required. This way you will not need to define a new function for every manipulation of the data.