I've been given the challenge of coding np.argmin without numpy.
I've been thinking hard about it for about a day.
I have no idea whether I should use a for statement,
an if statement, a while statement, or another function.
First question!
First, I thought about using inequalities in an if statement to distinguish between cases:
a[0,0] - a[0,1] > 0
a[0,0] - a[0,1] < 0
I tried to write the code by splitting it into these two cases.
There were too many cases, so I stopped.
Can't it be done with an if statement?
Second question!
As I understand it, argmin returns the position (index) of the smallest value rather than the value itself.
The screen capture shows what I arbitrarily entered as a two-dimensional list (an ndarray).
Because the task is limited to receiving a two-dimensional list as input,
I thought that the directions of axis=0 and axis=1 are fixed.
Is it okay to think that axis=0 fixes the column and compares across rows,
and that axis=1 fixes the row and compares across columns?
Third question!
After receiving an arbitrary two-dimensional list, I thought the ndarray
would be a matrix of shape i x j.
Then a.shape returns (i, j).
How can I extract i and j from it?
I've been struggling with this all day.
Any hints would be appreciated.
def argmin(a):
    return min(range(len(a)), key=lambda x: a[x])

def argmax(a):
    return max(range(len(a)), key=lambda x: a[x])
This code is for a 1D list.
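For the two-dimensional case, here is a minimal sketch that extends the same idea, assuming the input is a non-empty rectangular list of lists. The names argmin_1d and argmin_2d are my own, and the axis handling is only meant to mirror how I understand np.argmin to behave: a flat index when axis is None, one row index per column for axis=0, and one column index per row for axis=1. For a plain list, the counterparts of i and j from a.shape are len(a) and len(a[0]); with an ndarray you could unpack them directly as i, j = a.shape.

```python
def argmin_1d(values):
    # Index of the first smallest element in a flat list.
    return min(range(len(values)), key=lambda i: values[i])


def argmin_2d(a, axis=None):
    # Assumes `a` is a non-empty rectangular list of lists.
    n_rows, n_cols = len(a), len(a[0])  # plays the role of a.shape
    if axis is None:
        # Flatten row by row and return the index into the flattened list.
        return argmin_1d([value for row in a for value in row])
    if axis == 0:
        # For each column, the row index of its smallest value.
        return [argmin_1d([a[r][c] for r in range(n_rows)]) for c in range(n_cols)]
    if axis == 1:
        # For each row, the column index of its smallest value.
        return [argmin_1d(row) for row in a]
    raise ValueError("axis must be None, 0, or 1")


a = [[3, 1, 2],
     [0, 5, 4]]
print(argmin_2d(a))          # flat index
print(argmin_2d(a, axis=0))  # one entry per column
print(argmin_2d(a, axis=1))  # one entry per row
```

No chain of if statements over every pair is needed: min with a key function does the comparisons, and the axis only decides which values are grouped together.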
Edit:
After all these discussions with juanpa & fusion here in the comments and with Kevin on Python chat, I have come to the conclusion that iterating through a generator takes the same time as iterating through any other object, because the generator produces the combinations on the fly. Moreover, fusion's approach worked great for len(arr) up to 1000 (maybe up to 5k, but then it terminates due to a timeout on the online judge. Please note that this is not because of finding min_variance_sub; I also have to compute the sum of absolute differences of all possible pairs in min_variance_sub). I am going to accept fusion's approach as the answer to this question, because it answers the question.
But I will also create a new question for that problem statement (more like a Q&A, where I will also answer the question for future visitors. I got the answer from submissions by other candidates, an editorial by the problem setter, and code by the problem setter himself, though I do not understand the approach they used). I will link to the other question once I create it :)
It's HERE
The original question starts below
I'm using itertools.combinations on an array so first up I tried something like
aList = [list(x) for x in list(cmb(arr, k))]
where cmb = itertools.combinations, arr is the list, and k is an int.
This works fine for len(arr) < 20 or so, but it raised a MemoryError when len(arr) became 50 or more.
On a suggestion by Kevin on Python chat, I used a generator instead, and creating it was amazingly fast:
aGen = (list(x) for x in cmb(arr, k))
But it's very slow to iterate through this generator object.
I tried something like
for p in aGen:
    continue
and even this code seems to take forever.
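To get a feel for why even an empty loop takes so long, the number of combinations can be counted without generating any of them. This is only a small sketch, assuming Python 3.8+ for math.comb; the helper name combination_count and the sizes below are illustrative, not taken from the actual problem.

```python
import math


def combination_count(n, k):
    # Number of k-element combinations of n items, without generating them.
    return math.comb(n, k)


for n, k in [(20, 10), (50, 10), (50, 25)]:
    print(f"n={n}, k={k}: {combination_count(n, k):,} combinations")
```

A generator avoids holding all of these in memory at once, but the loop still has to visit every single one.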
Kevin also suggested an answer about the kth combination, which was nice, but in my case I actually want to test all the possible combinations and select the one with minimum variance.
So what would be a memory-efficient way of checking all the possible combinations of an array (a list) to find the one with minimum variance? (To be precise, I only need to consider sub-arrays having exactly k elements.)
Thank you for any help.
You can sort the list of n elements first,
then slide a window of length k along the sorted list
and find the minimum variance among the n-k+1 windows.
That minimum should be the minimum over all combinations, because a minimum-variance subset of k elements can always be found among elements that are adjacent in sorted order.
def myvar(arr):
    # Population variance of a list of numbers.
    l = len(arr)
    m = sum(arr)/l
    return sum((i-m)**2 for i in arr)/l

input_list = [.......]
sorted_list = sorted(input_list)
variance = None
min_variance_sub = None
# k (the required number of elements) is assumed to be defined already.
for i in range(len(sorted_list) - k + 1):
    sub = sorted_list[i:i+k]
    var = myvar(sub)
    if variance is None or var<variance:
        variance = var
        min_variance_sub=sub
print(min_variance_sub)
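As a sanity check on the sliding-window idea, the sketch below compares it against a brute-force search over itertools.combinations on a small list. The data, the value of k, and the function names are made up for illustration; the brute-force version is only practical for tiny inputs.

```python
from itertools import combinations


def variance(values):
    # Population variance, same definition as myvar above.
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def min_variance_window(values, k):
    # Sort once, then look only at the n-k+1 contiguous windows.
    ordered = sorted(values)
    windows = (ordered[i:i + k] for i in range(len(ordered) - k + 1))
    return min(windows, key=variance)


def min_variance_brute_force(values, k):
    # Checks every k-element combination; feasible only for small lists.
    return min((list(c) for c in combinations(values, k)), key=variance)


data = [9, 1, 14, 3, 10, 22, 4]  # illustrative data
k = 3
best_window = min_variance_window(data, k)
best_brute = min_variance_brute_force(data, k)
print(best_window, sorted(best_brute))
```

Both searches should agree on the minimum variance, while the window version only evaluates n-k+1 candidates after an O(n log n) sort.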
I'm trying to code something like this:
where x and y are two different numpy arrays and j is an index into the arrays. I don't know the length of the arrays because they will be entered by the user, and I cannot use loops to code this.
My main problem is finding a way to move between indexes, since I would need to go from
x[2]-x[1] ... x[3]-x[2]
and so on.
I'm stumped but I would appreciate any clues.
A numpy-ic solution would be:
np.square(np.diff(x)).sum() + np.square(np.diff(y)).sum()
A list comprehension approach would be:
sum([(x[k]-x[k-1])**2+(y[k]-y[k-1])**2 for k in range(1,len(x))])
This gives the result you want, even if your data is a plain list.
x[2]-x[1] ... x[3]-x[2] can be generalized to:
x[[1,2,3,...]] - x[[0,1,2,...]]
x[1:]-x[:-1] # ie. (1 to the end)-(0 to almost the end)
numpy can take the difference between two arrays of the same shape.
In list terms this would be
[i-j for i,j in zip(x[1:], x[:-1])]
np.diff does essentially this, a[slice1]-a[slice2], where the slices are as above.
The full answer squares, sums, and takes the square root.
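Here is a small sketch that puts the slicing idea together, assuming numpy is installed. The sample values for x and y are made up, and since the exact target formula is not shown here, the sketch stops at the sum of squared differences; apply np.sqrt where the formula calls for it.

```python
import numpy as np

# Illustrative data; in the question the arrays come from user input.
x = np.array([1.0, 4.0, 6.0, 7.0])
y = np.array([2.0, 2.0, 5.0, 9.0])

# Shifted slices: (element 1 to the end) minus (element 0 to almost the end).
dx = x[1:] - x[:-1]
dy = y[1:] - y[:-1]

# Sum of squared consecutive differences, with no explicit loop.
total = np.sum(dx ** 2 + dy ** 2)

# The same quantity from plain lists, for comparison.
x_list, y_list = x.tolist(), y.tolist()
total_from_lists = sum(
    (x_list[k] - x_list[k - 1]) ** 2 + (y_list[k] - y_list[k - 1]) ** 2
    for k in range(1, len(x_list))
)
print(total, total_from_lists)
```

The slices work for any array length of at least two, so the length never has to be known in advance.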
I'm working on Project Euler, problem 11, which involves finding the greatest product of all possible combinations of four adjacent numbers in a grid. I've split the numbers into a nested list and used a list comprehension to slice the relevant numbers, like this:
if x+4 <= len(matrix[x]): #check right
    my_slice = [int(matrix[x][n]) for n in range(y,y+4)]
...and so on for the other cardinal directions. So far, so good. But when I get to the diagonals things get problematic. I tried to use two ranges like this:
if x+4 <= len(matrix[x]) and y-4 >=0:# check up, right
    my_slice = [int(matrix[m][n]) for m,n in ((range(x,x+4)),range(y,y+4))]
But this yields the following error:
<ipython-input-53-e7c3ebf29401> in <listcomp>(.0)
48 if x+4 <= len(matrix[x]) and y-4 >=0:# check up, right
---> 49 my_slice = [int(matrix[m][n]) for m,n in ((range(x,x+4)),range(y,y+4))]
ValueError: too many values to unpack (expected 2)
My desired indices for x,y values of [0,0] would be ['0,0','1,1','2,2','3,3']. This does not seem all that different for using the enumerate function to iterate over a list, but clearly I'm missing something.
P.S. My apologies for my terrible variable nomenclature, I'm a work in progress.
You do not need to use two ranges; simply use one and derive both indices from its variable:
my_slice = [int(matrix[m][m-x+y]) for m in range(x,x+4)]
Since your n is supposed to be attached to range(y,y+4) we know that there will always be a difference of y-x between m and n. So instead of using two variables, we can counter the difference ourselves.
Or, in case you still wish to use two range(..) constructs, you can use zip(..), which takes several iterables, consumes them in step, and emits tuples:
my_slice = [int(matrix[m][n]) for m,n in zip(range(x,x+4),range(y,y+4))]
But I think this will not improve performance because of the tuple packing and unpacking overhead.
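To see why the original comprehension fails while the zip version works, here is a short sketch with made-up starting indices x = 0 and y = 2:

```python
x, y = 0, 2  # illustrative starting indices

# A tuple of two ranges has exactly two items, and each item is a whole range
# of four numbers. "for m, n in pair" therefore tries to unpack four values
# into two names, which raises the ValueError from the question.
pair = (range(x, x + 4), range(y, y + 4))
first_item = list(pair[0])

# zip walks both ranges in step and yields one (m, n) tuple per step.
index_pairs = list(zip(range(x, x + 4), range(y, y + 4)))
print(len(pair), first_item, index_pairs)
```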
[int(matrix[x+d][y+d]) for d in range(4)] for one diagonal.
[int(matrix[x+d][y-d]) for d in range(4)] for the other.
Btw, better use standard matrix index names, i.e., row i and column j. Not x and y. It's confusing. I think you even confused yourself, as for example your if x+4 <= len(matrix[x]) tests x against the second dimension length but uses it in the first dimension. Huh?
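A minimal sketch of both diagonals with row index i and column index j, including bounds checks against the correct dimension. The 5x5 grid and the helper names are made up for illustration; returning None when the run does not fit is just one possible convention.

```python
grid = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25],
]


def diagonal_down_right(grid, i, j, length=4):
    # Rows are checked against len(grid), columns against len(grid[i]).
    if i + length > len(grid) or j + length > len(grid[i]):
        return None
    return [grid[i + d][j + d] for d in range(length)]


def diagonal_down_left(grid, i, j, length=4):
    # The column index decreases, so it must not drop below zero.
    if i + length > len(grid) or j - (length - 1) < 0:
        return None
    return [grid[i + d][j - d] for d in range(length)]


print(diagonal_down_right(grid, 0, 0))
print(diagonal_down_left(grid, 0, 4))
print(diagonal_down_right(grid, 2, 0))  # does not fit, so None
```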
This may be more of an 'approach' or conceptual question.
Basically, I have a Python multi-dimensional list like so:
my_list = [[0,1,1,1,0,1], [1,1,1,0,0,1], [1,1,0,0,0,1], [1,1,1,1,1,1]]
What I have to do is iterate through the array and compare each element with those directly surrounding it, as though the list were laid out as a matrix.
For instance, given the first element of the first row, my_list[0][0], I need to know the value of my_list[0][1], my_list[1][0] and my_list[1][1]. The value of the 'surrounding' elements will determine how the current element should be operated on. Of course, for an element in the heart of the array, 8 comparisons will be necessary.
Now I know I could simply iterate through the array and compare with the indexed values, as above. I was curious as to whether there was a more efficient way which limited the amount of iteration required. Should I iterate through the array as is, or iterate and compare only values to either side and then transpose the array and run it again? This, however, would ignore the values on the diagonals. And should I store the results of the element lookups, so I don't keep determining the value of the same element multiple times?
I suspect this may have a fundamental approach in Computer Science, and I am eager to get feedback on the best approach using Python as opposed to looking for a specific answer to my problem.
You may get faster, and possibly even simpler, code by using numpy, or other alternatives (see below for details). But from a theoretical point of view, in terms of algorithmic complexity, the best you can get is O(N*M), and you can do that with your design (if I understand it correctly). For example:
def neighbors(matrix, row, col):
    # Yield the values of the cells surrounding (row, col), skipping
    # positions that fall outside the matrix and the cell itself.
    for i in row-1, row, row+1:
        if i < 0 or i == len(matrix): continue
        for j in col-1, col, col+1:
            if j < 0 or j == len(matrix[i]): continue
            if i == row and j == col: continue
            yield matrix[i][j]

matrix = [[0,1,1,1,0,1], [1,1,1,0,0,1], [1,1,0,0,0,1], [1,1,1,1,1,1]]
for i, row in enumerate(matrix):
    for j, cell in enumerate(row):
        for neighbor in neighbors(matrix, i, j):
            do_stuff(cell, neighbor)  # placeholder for your own operation
This takes N * M * 8 steps (actually, a bit less than that, because many cells will have fewer than 8 neighbors). And algorithmically, there's no way you can do better than O(N * M). So, you're done.
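To make the step count concrete, here is a self-contained sketch that replaces the placeholder operation with something runnable: it sums the neighbors of every cell and counts how many neighbor visits happen in total. Summing is only an assumed stand-in for whatever your real operation is.

```python
def neighbors(matrix, row, col):
    """Yield the values of the up-to-eight cells surrounding (row, col)."""
    for i in (row - 1, row, row + 1):
        if i < 0 or i >= len(matrix):
            continue
        for j in (col - 1, col, col + 1):
            if j < 0 or j >= len(matrix[i]):
                continue
            if i == row and j == col:
                continue
            yield matrix[i][j]


matrix = [[0, 1, 1, 1, 0, 1],
          [1, 1, 1, 0, 0, 1],
          [1, 1, 0, 0, 0, 1],
          [1, 1, 1, 1, 1, 1]]

# Stand-in operation: the sum of the surrounding values for each cell.
neighbor_sums = [[sum(neighbors(matrix, i, j)) for j in range(len(row))]
                 for i, row in enumerate(matrix)]

# Total number of neighbor visits, to compare with the N * M * 8 upper bound.
total_visits = sum(len(list(neighbors(matrix, i, j)))
                   for i, row in enumerate(matrix)
                   for j in range(len(row)))
upper_bound = len(matrix) * len(matrix[0]) * 8
print(neighbor_sums[0][0], neighbor_sums[1][1], total_visits, upper_bound)
```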
(In some cases, you can make things simpler, with no significant change either way in performance, by thinking in terms of iterator transformations. For example, you can easily create a grouper over adjacent triplets from a list a by properly zipping a, a[1:], and a[2:], and you can extend this to adjacent 2-dimensional nonets. But I think in this case, it would just make your code more complicated than writing an explicit neighbors iterator and explicit for loops over the matrix.)
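For reference, the one-dimensional triplet grouper mentioned above can be sketched in a couple of lines; the function name adjacent_triplets is my own, and the row is taken from the sample matrix.

```python
def adjacent_triplets(a):
    # zip stops at the shortest input, so this yields len(a) - 2 triplets.
    return list(zip(a, a[1:], a[2:]))


row = [0, 1, 1, 1, 0, 1]
print(adjacent_triplets(row))
```

Note that the first and last elements never appear in the middle of a triplet, so edge cells would still need separate handling.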
However, practically, you can get a whole lot faster, in various ways. For example:
Using numpy, you may get an order of magnitude or so faster. When you're iterating a tight loop and doing simple arithmetic, that's one of the things that Python is particularly slow at, and numpy can do it in C (or Fortran) instead.
Using your favorite GPGPU library, you can explicitly vectorize your operations.
Using multiprocessing, you can break the matrix up into pieces and perform multiple pieces in parallel on separate cores (or even separate machines).
Of course for a single 4x6 matrix, none of these are worth doing… except possibly for numpy, which may make your code simpler as well as faster, as long as you can express your operations naturally in matrix/broadcast terms.
In fact, even if you can't easily express things that way, just using numpy to store the matrix may make things a little simpler (and save some memory, if that matters). For example, numpy can let you access a single column from a matrix naturally, while in pure Python, you need to write something like [row[col] for row in matrix].
So, how would you tackle this with numpy?
First, you should read over numpy.matrix and ufunc (or, better, some higher-level tutorial, but I don't have one to recommend) before going too much further.
Anyway, it depends on what you're doing with each set of neighbors, but there are three basic ideas.
First, if you can convert your operation into simple matrix math, that's always easiest.
If not, you can create 8 "neighbor matrices" just by shifting the matrix in each direction, then perform simple operations against each neighbor. For some cases, it may be easier to start with an N+2 x N+2 matrix with suitable "empty" values (usually 0 or nan) in the outer rim. Alternatively, you can shift the matrix over and fill in empty values. Or, for some operations, you don't need an identical-sized matrix, so you can just crop the matrix to create a neighbor. It really depends on what operations you want to do.
For example, taking your input as a fixed 4x6 board for the Game of Life (padded here with a rim of zeros, which makes the matrix 6x8):
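import numpy as np

def neighbors(matrix):
    # Yield eight copies of the matrix, each shifted by one cell in a
    # different direction (np.roll wraps around at the edges).
    for i in -1, 0, 1:
        for j in -1, 0, 1:
            if i == 0 and j == 0: continue
            yield np.roll(np.roll(matrix, i, 0), j, 1)

matrix = np.matrix([[0,0,0,0,0,0,0,0],
                    [0,0,1,1,1,0,1,0],
                    [0,1,1,1,0,0,1,0],
                    [0,1,1,0,0,0,1,0],
                    [0,1,1,1,1,1,1,0],
                    [0,0,0,0,0,0,0,0]])
while True:
    # Number of live neighbors per cell, then the usual Life rule.
    livecount = sum(neighbors(matrix))
    matrix = (matrix & (livecount==2)) | (livecount==3)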
(Note that this isn't the best way to solve this problem, but I think it's relatively easy to understand, and likely to illuminate whatever your actual problem is.)
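The "rim of empty values" variant described earlier can also be sketched with np.pad and slicing instead of np.roll, so nothing wraps around at the edges. This is only an illustrative alternative, assuming a plain np.array rather than np.matrix; the function name live_neighbor_counts is my own, and it computes a single generation rather than looping forever.

```python
import numpy as np


def live_neighbor_counts(board):
    # Pad with a one-cell rim of zeros, then add up the eight shifted views.
    rows, cols = board.shape
    padded = np.pad(board, 1, mode="constant", constant_values=0)
    counts = np.zeros_like(board)
    for di in (0, 1, 2):
        for dj in (0, 1, 2):
            if di == 1 and dj == 1:
                continue  # skip the cell itself
            counts += padded[di:di + rows, dj:dj + cols]
    return counts


board = np.array([[0, 1, 1, 1, 0, 1],
                  [1, 1, 1, 0, 0, 1],
                  [1, 1, 0, 0, 0, 1],
                  [1, 1, 1, 1, 1, 1]])

counts = live_neighbor_counts(board)
# One Game of Life step: survive with 2 or 3 neighbors, be born with 3.
next_board = (((board == 1) & (counts == 2)) | (counts == 3)).astype(int)
print(counts)
print(next_board)
```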