Does anyone know why my index matrix[k][m]=sum/9 is out of range? I'm pretty sure that my solution is correct. I tried debugging it, but I still cannot figure out why it's not working.
def downsample_by_3(image):
    matrix_image = copy.deepcopy(image)
    matrix=[ [], [], [] ]
    k=0
    m=0
    for i in range(0,len(matrix_image),3):
        for j in range(0,len(matrix_image[i]),3):
            sum=0
            for r in range(i,i+3):
                for c in range(j,j+3):
                    sum+=matrix_image[r][c]
            m+=1
            matrix[k][m]=sum/9
        m=0
        k+=1
    return matrix
The image is presented as a matrix (list of lists).
Let's say I took this list,
print(downsample_by_3([[2,2,2],[2,2,2],[2,2,2]]))
it should return a list with 18.
You have a list of empty lists. So while matrix[0] is fine, matrix[0][0] is out of bounds.
You have to either preallocate the lengths of the sublists beforehand if you wish to access them by index, or you append the averages as shown in Gary02127's answer. If you choose the former, you must also increment m after matrix[k][m]=sum/9.
However, your code has another out-of-bounds bug waiting to be triggered. Suppose you have a 3x4 image, i.e., one of the dimensions isn't divisible by 3. You average the first 3x3 block just fine, and then you try to access the pixels at column indices 3-5.
If your dimensions aren't divisible by the downsample factor, you have 3 choices:
raise an error at the beginning of the function call,
add extra rows/columns to make your matrix divisible by your factor (you can use the value of the last element of the row/column to extend it further), or
use only as many elements as you have left.
Here's an implementation with approach #3 and preallocating your matrix.
def downsample(image, factor):
    height = len(image)
    # this assumes that all sublists have the same length!
    width = len(image[0])

    matrix = [[None] * ((width + factor - 1) // factor)
              for _ in range((height + factor - 1) // factor)]

    k = 0
    m = 0
    for i in range(0, height, factor):
        for j in range(0, width, factor):
            # don't overshadow the `sum` builtin
            total = 0
            count = 0
            for r in range(i, min(i+factor, height)):
                for c in range(j, min(j+factor, width)):
                    total += image[r][c]
                    count += 1
            matrix[k][m] = total / count
            m += 1
        m = 0
        k += 1
    return matrix
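As a hypothetical quick check (not part of the original answer), the example from the question gives:

print(downsample([[2, 2, 2], [2, 2, 2], [2, 2, 2]], 3))  # [[2.0]]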
You start with matrix hard-coded to a sequence of empty lists matrix=[ [], [], [] ], and then in the
matrix[k][m]=sum/9
line, you try to force a value into one of those slots, but there are no slots. The k will be okay for the first few (from 0 through 2), but the m is going to error right away. Seems like you need to dynamically create matrix on the fly. Maybe
matrix[k].append(sum/9)
Also, a return matrix at the end of your function wouldn't hurt. :)
The above should help you move forward. At least it gives running code.
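If it helps, here is a minimal corrected sketch along the lines of the append suggestion; it builds each row as it goes instead of preallocating, and (like the original) assumes the image dimensions are divisible by 3:

import copy

def downsample_by_3(image):
    matrix_image = copy.deepcopy(image)
    matrix = []
    for i in range(0, len(matrix_image), 3):
        row = []
        for j in range(0, len(matrix_image[i]), 3):
            total = 0  # avoid shadowing the sum builtin
            for r in range(i, i + 3):
                for c in range(j, j + 3):
                    total += matrix_image[r][c]
            row.append(total / 9)  # average of the 3x3 block
        matrix.append(row)
    return matrix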
Related
I have the following Matlab code (adapted from Programming and Numerical Methods in MATLAB by Otto & Denier, page 75):
clear all
p = input('Enter the power you require: ');
points = p+2;
n = 1:points;
for N = n
    sums(N) = 0;
    for j = 1:N
        sums(N) = sums(N)+j^p;
    end
end
The output for p = 3 is the following list:
>> sums
sums =
1 9 36 100 225
I have written the following Python code (maybe not the most 'Pythonic' way), trying to follow the Matlab instructions as closely as possible.
p = int(input('Enter the power you require: '))
points = p+2
n = range(points)
for N in range(1, len(n)+1):
    sums = [0]*N
    for index, item in list(enumerate(sums)):
        sums[index] = item+index**p
Nevertheless, the output is not the same list. I have tried to replace the inner loop with
for j in range(1,N+1):
    sums[N] = sums[N]+j**p
but this results in an IndexError. Thanks in advance for any suggestions.
This might be due to the index difference: in Python, indexing starts from 0 while it starts from 1 in Matlab. Also, sums = [0]*N initializes a list of length N; this has to be moved outside of the loop.
points = p+2
sums = [0]*points
for N in range(0, points):
    for index in range(0, N+1):
        sums[N] = sums[N] + (index+1)**p
sums(N) = 0; does not create an array of all zeros, it sets element N of the existing array to 0, and creates additional elements in the array if it is not already at least of length N.
Because N grows by one each iteration, you could initialize as an empty array before the loop, and append(0) inside the loop:
sums = []
for N in range(1, len(n)+1):
    sums.append(0)
I don't particularly like the use of enumerate here either, I would:
for index in range(N):
    sums[index] += (index + 1)**p
(Notice the +1 on the index that was missing in the code in the OP!)
Finally, n is just confusing here. I would:
for N in range(1, points + 1):
…
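For completeness, here is one way to put those pieces together into a full script (my own combination of the suggestions above, assuming the goal is to reproduce the Matlab output exactly):

p = int(input('Enter the power you require: '))
points = p + 2
sums = []
for N in range(1, points + 1):
    sums.append(0)              # grow the list, mirroring Matlab's sums(N) = 0
    for j in range(1, N + 1):   # note the 1-based j, as in the Matlab loop
        sums[N - 1] += j**p
print(sums)  # for p = 3 this prints [1, 9, 36, 100, 225]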
I'm trying to write the fastest algorithm possible to return the number of "magic triples" (i.e. x, y, z where z is a multiple of y and y is a multiple of x) in a list of 3-2000 integers.
(Note: I believe the list was expected to be sorted and to contain unique values, but one of the test examples given was [1,1,1] with an expected result of 1. That is a mistake in the challenge itself, though, because the definition of a magic triple was explicitly stated as x < y < z, which [1,1,1] doesn't satisfy. In any case, I was trying to optimise an algorithm for sorted lists of unique integers.)
I haven't been able to work out a solution that doesn't include having three consecutive loops and therefore being O(n^3). I've seen one online that is O(n^2) but I can't get my head around what it's doing, so it doesn't feel right to submit it.
My code is:
def solution(l):
    if len(l) < 3:
        return 0
    elif l == [1,1,1]:
        return 1
    else:
        halfway = int(l[-1]/2)
        quarterway = int(halfway/2)
        quarterIndex = 0
        halfIndex = 0
        for i in range(len(l)):
            if l[i] >= quarterway:
                quarterIndex = i
                break
        for i in range(len(l)):
            if l[i] >= halfway:
                halfIndex = i
                break
        triples = 0
        for i in l[:quarterIndex+1]:
            for j in l[:halfIndex+1]:
                if j != i and j % i == 0:
                    multiple = 2
                    while (j * multiple) <= l[-1]:
                        if j * multiple in l:
                            triples += 1
                        multiple += 1
        return triples
I've spent quite a lot of time going through examples manually and removing loops through unnecessary sections of the lists, but this still completes a list of 2,000 integers in about a second, whereas the O(n^2) solution I found completes the same list in 0.6 seconds. It seems like a small difference, but it means mine takes about 60% longer.
Am I missing a really obvious way of removing one of the loops?
Also, I saw mention of making a directed graph and I see the promise in that. I can make the list of first nodes from the original list with a built-in function, so in principle I presume that means I can make the overall graph with two for loops and then return the length of the third node list, but I hit a wall with that too. I just can't seem to make progress without that third loop!!
from array import array

def num_triples(l):
    n = len(l)
    pairs = set()
    lower_counts = array("I", (0 for _ in range(n)))
    upper_counts = lower_counts[:]
    for i in range(n - 1):
        lower = l[i]
        for j in range(i + 1, n):
            upper = l[j]
            if upper % lower == 0:
                lower_counts[i] += 1
                upper_counts[j] += 1
    return sum(nx * nz for nz, nx in zip(lower_counts, upper_counts))
Here, lower_counts[i] is the number of pairs of which the ith number is the y, and z is the other number in the pair (i.e. the number of different z values for this y).
Similarly, upper_counts[i] is the number of pairs of which the ith number is the y, and x is the other number in the pair (i.e. the number of different x values for this y).
So the number of triples in which the ith number is the y value is just the product of those two numbers.
The use of an array here for storing the counts is for scalability of access time. Tests show that up to n=2000 it makes negligible difference in practice, and even up to n=20000 it only made about a 1% difference to the run time (compared to using a list), but it could in principle be the fastest growing term for very large n.
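As a hypothetical sanity check (mine, not part of the answer), the list [1, 2, 3, 4, 5, 6] contains exactly the triples (1, 2, 4), (1, 2, 6) and (1, 3, 6):

print(num_triples([1, 2, 3, 4, 5, 6]))  # 3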
How about using itertools.combinations instead of nested for loops? Combined with list comprehension, it's cleaner and much faster. Let's say l = [your list of integers] and let's assume it's already sorted.
from itertools import combinations

def div(i, j, k):  # this function has the logic
    return l[k] % l[j] == l[j] % l[i] == 0

r = sum([div(i, j, k) for i, j, k in combinations(range(len(l)), 3) if i < j < k])
#alaniwi provided a very smart iterative solution.
Here is a recursive solution.
def find_magicals(lst, nplet):
    """Find the number of magical n-plets in a given lst"""
    res = 0
    for i, base in enumerate(lst):
        # find all the multiples of current base
        multiples = [num for num in lst[i + 1:] if not num % base]
        res += len(multiples) if nplet <= 2 else find_magicals(multiples, nplet - 1)
    return res

def solution(lst):
    return find_magicals(lst, 3)
The problem can be divided into selecting any number in the original list as the base (i.e., x) and counting how many duplets we can find among the numbers bigger than the base. Since the method for finding all duplets is the same as the one for finding triplets, we can solve the problem recursively.
From my testing, this recursive solution is comparable to, if not more performant than, the iterative solution.
This answer was the first suggestion by #alaniwi and is the one I've found to be the fastest (at 0.59 seconds for a 2,000 integer list).
def solution(l):
    n = len(l)
    lower_counts = dict((val, 0) for val in l)
    upper_counts = lower_counts.copy()
    for i in range(n - 1):
        lower = l[i]
        for j in range(i + 1, n):
            upper = l[j]
            if upper % lower == 0:
                lower_counts[lower] += 1
                upper_counts[upper] += 1
    return sum((lower_counts[y] * upper_counts[y] for y in l))
I think I've managed to get my head around it. What it is essentially doing is comparing each number in the list with every other number to see if the larger is divisible by the smaller, and building two dictionaries:
One with the number of times each number divides a larger number in the list,
One with the number of times each number has a smaller number that divides it.
You then multiply the two values for each key, because for a given y every x that divides it can be paired with every z that it divides; a 0 in either dictionary means that number can never be the middle (second) number of a triple.
Example:
l = [1,2,3,4,5,6]
lower_counts = {1:5, 2:2, 3:1, 4:0, 5:0, 6:0}
upper_counts = {1:0, 2:1, 3:1, 4:2, 5:1, 6:3}
triple_tuple = ([1,2,4], [1,2,6], [1,3,6])
I have a matrix or a multidimensional array written in Python; each element in the array is an integer ranging from 0 to 7. How would I randomly initialize this matrix or multidimensional array so that each element holds a value different from the values of its 4 neighbours (left, right, top, bottom)? Can it be implemented in numpy?
You can write your own matrix initializer.
Go through array[i][j]: for each i, j pick a random number between 0 and 7.
If the number equals either the left element array[i][j-1] or the upper one array[i-1][j], regenerate it once again.
You have a 2/7 probability of encountering such a bad case, 4/49 of hitting it twice in a row, 8/343 for 3 in a row, etc.; the probability drops off very quickly.
The average case complexity for n elements in a matrix would be O(n).
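Here is a minimal pure-Python sketch of that regeneration idea (the function name and signature are mine, not from the question); only the left and upper neighbours need checking, because the right and lower neighbours are checked later when those cells are filled:

import random

def init_matrix(rows, cols, upper=8):
    a = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            val = random.randrange(upper)
            # redraw while the value collides with the already-fixed neighbours
            while (j > 0 and val == a[i][j - 1]) or (i > 0 and val == a[i - 1][j]):
                val = random.randrange(upper)
            a[i][j] = val
    return a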
A simpler problem that might get you started is to do the same for a 1d array. A pure-python solution would look like:
import random

def sample_1d(n, upper):
    x = [random.randrange(upper)]
    for i in range(1, n):
        xi = random.randrange(upper - 1)
        if xi >= x[-1]:   # shift past the previous value so it can never repeat
            xi += 1
        x.append(xi)
    return x
You can vectorize this as:
import numpy as np

def sample_1d_v(n, upper):
    x = np.empty(n, dtype=int)
    x[0] = 0
    x[1:] = np.cumsum(np.random.randint(1, upper, size=n-1)) % upper
    # add a random offset so the first element isn't always 0; taking the result
    # mod upper keeps the values in range and preserves the pairwise differences
    x = (x + np.random.randint(upper)) % upper
    return x
The trick here is noting that if adjacent values must be different, then the difference between them (mod upper) can be drawn uniformly from [1, upper).
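A quick (hypothetical) sanity check of the vectorized version:

xs = sample_1d_v(10, 8)
assert all(int(a) != int(b) for a, b in zip(xs, xs[1:]))  # no two adjacent entries are equal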
I have a 2d array with a different species in each one. I pick a random element on the array and I want to count up how many of each species are in the eight squares immediately adjacent to that element.
But I want the array to wrap at the edges, so if I pick an element on the top row, the bottom row will be counted as "adjacent". How can I do this while iterating through j in range(x-1, x+1), and the same for k and y?
Also, is there a more elegant way of omitting the element I originally picked while looking through the adjacent squares than the if (j!=x or k!=y) line?
numspec = [0] * len(allspec)
for i in range(0, len(allspec)):
    #count up how many of species i there is in the immediate area
    for j in range(x-1, x+1):
        for k in range(y-1, y+1):
            if (j != x or k != y):
                numspec[hab[i][j]] = numspec[hab[i][j]] + 1
You can wrap using j % 8, which gives you a number from 0 to 7.
As for wrapping, I would recommend using relative indexing from -1 to +1 and then computing the real index using the modulo operator (%).
As for making sure you don't count the original element (x, y), you are doing just fine (I would probably use the reversed condition and continue, but it doesn't matter).
I don't quite understand your usage of the i, j, k indexes, so I'll just assume that i is the index of the species and j, k are indexes into the 2D map called hab, which I changed to x_rel, y_rel and x_idx, y_idx to make it more readable. If I'm mistaken, change the code or let me know.
I also took the liberty of doing some minor fixes:
introduced N constant representing number of species
changed range to xrange (xrange is faster, uses less memory, etc)
no need to specify 0 in range (or xrange)
instead of X = X + 1 for increasing value, I used += increment operator like this: X += 1
Here is resulting code:
N = len(allspec)
numspec = [0] * N
for i in xrange(N):
    for x_rel in xrange(-1, 2):
        for y_rel in xrange(-1, 2):
            x_idx = (x + x_rel) % N
            y_idx = (y + y_rel) % N
            if x_idx != x or y_idx != y:
                numspec[hab[x_idx][y_idx]] += 1
You could construct a list of the adjacent elements and go from there. For example if your 2d list is called my_array and you wanted to examine the blocks immediately surrounding my_array[x][y] then you can do something like this:
xmax = len(my_array)
ymax = len(my_array[0]) #assuming it's a square...
x_vals = [i%xmax for i in [x-1,x,x+1]]
y_vals = [blah]
surrounding_blocks = [
    my_array[x_vals[0]][y_vals[0]],
    my_array[x_vals[0]][y_vals[1]],
    my_array[x_vals[0]][y_vals[2]],
    my_array[x_vals[2]][y_vals[0]],
    my_array[x_vals[2]][y_vals[1]],
    my_array[x_vals[2]][y_vals[2]],
    my_array[x_vals[1]][y_vals[0]],
    my_array[x_vals[1]][y_vals[2]],
]
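Assuming y_vals has been filled in analogously to x_vals, a possible follow-up (my addition, not part of the answer) is to tally the species in those neighbouring cells with collections.Counter:

from collections import Counter
species_counts = Counter(surrounding_blocks)  # maps each species to how often it appears around (x, y)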
Recently I became interested in the subset-sum problem, which is finding a zero-sum subset in a superset. I found some solutions on SO; in addition, I came across a particular solution which uses the dynamic programming approach. I translated that solution into Python based on its qualitative description. I'm trying to optimize this for larger lists, which eats up a lot of my memory. Can someone recommend optimizations or other techniques to solve this particular problem? Here's my attempt in Python:
import random
from time import time
from itertools import product

time0 = time()

# create a zero matrix of size a (row), b (col)
def create_zero_matrix(a,b):
    return [[0]*b for x in xrange(a)]

# generate a list of size num with random integers with an upper and lower bound
def random_ints(num, lower=-1000, upper=1000):
    return [random.randrange(lower,upper+1) for i in range(num)]

# split a list up into N and P where N be the sum of the negative values and P the sum of the positive values.
# 0 does not count because of additive identity
def split_sum(A):
    N_list = []
    P_list = []
    for x in A:
        if x < 0:
            N_list.append(x)
        elif x > 0:
            P_list.append(x)
    return [sum(N_list), sum(P_list)]

# since the column indexes are in the range from 0 to P - N
# we would like to retrieve them based on the index in the range N to P
# n := row, m := col
def get_element(table, n, m, N):
    if n < 0:
        return 0
    try:
        return table[n][m - N]
    except:
        return 0

# same definition as above
def set_element(table, n, m, N, value):
    table[n][m - N] = value

# input array
#A = [1, -3, 2, 4]
A = random_ints(200)

[N, P] = split_sum(A)

# create a zero matrix of size m (row) by n (col)
#
# m := the number of elements in A
# n := P - N + 1 (by definition N <= s <= P)
#
# each element in the matrix will be a value of either 0 (false) or 1 (true)
m = len(A)
n = P - N + 1
table = create_zero_matrix(m, n)

# set first element in index (0, A[0]) to be true
# Definition: Q(1,s) := (x1 == s). Note that index starts at 0 instead of 1.
set_element(table, 0, A[0], N, 1)

# iterate through each table element
#for i in xrange(1, m): #row
#    for s in xrange(N, P + 1): #col
for i, s in product(xrange(1, m), xrange(N, P + 1)):
    if get_element(table, i - 1, s, N) or A[i] == s or get_element(table, i - 1, s - A[i], N):
        #set_element(table, i, s, N, 1)
        table[i][s - N] = 1

# find zero-sum subset solution
s = 0
solution = []
for i in reversed(xrange(0, m)):
    if get_element(table, i - 1, s, N) == 0 and get_element(table, i, s, N) == 1:
        s = s - A[i]
        solution.append(A[i])

print "Solution: ", solution

time1 = time()
print "Time execution: ", time1 - time0
I'm not quite sure if your solution is exact or a PTA (poly-time approximation).
But, as someone pointed out, this problem is indeed NP-Complete.
Meaning, every known (exact) algorithm has an exponential time behavior on the size of the input.
Meaning, if you can process 1 operation in .01 nanoseconds, then for a list of 59 elements it'll take:
2^59 ops --> 2^59 / 10,000,000,000 seconds --> 2^26 / (3600 x 24 x 365) years --> about 2 years
You can find heuristics, which give you just a CHANCE of finding an exact solution in polynomial time.
On the other hand, if you restrict the problem (to another one) by using bounds for the values of the numbers in the set, then the problem complexity reduces to polynomial time. But even then, the memory space consumed will be a polynomial of VERY high order.
The memory consumed will be much larger than the few gigabytes you have in memory,
and even much larger than the few terabytes on your hard drive.
(That's for small values of the bound on the values of the elements in the set.)
Maybe this is the case with your dynamic programming algorithm.
It seemed to me that you were using a bound of 1000 when building your initialization matrix.
You can try a smaller bound. That is, if your input consistently consists of small values.
Good Luck!
Someone on Hacker News came up with the following solution to the problem, which I quite liked. It just happens to be in python :):
def subset_summing_to_zero(activities):
    subsets = {0: []}
    for (activity, cost) in activities.iteritems():
        old_subsets = subsets
        subsets = {}
        for (prev_sum, subset) in old_subsets.iteritems():
            subsets[prev_sum] = subset
            new_sum = prev_sum + cost
            new_subset = subset + [activity]
            if 0 == new_sum:
                new_subset.sort()
                return new_subset
            else:
                subsets[new_sum] = new_subset
    return []
I spent a few minutes with it and it worked very well.
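For reference, a hypothetical way to call it (Python 2, since the function relies on iteritems; the activity names and costs below are made up): activities maps a label to its signed cost.

activities = {'a': 5, 'b': -2, 'c': -3, 'd': 10}
print subset_summing_to_zero(activities)   # ['a', 'b', 'c'], since 5 - 2 - 3 == 0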
An interesting article on optimizing Python code is available here. Basically, the main result is that you should inline your frequent loops: in your case this would mean that instead of calling get_element twice per loop iteration, you should put the actual code of that function inside the loop in order to avoid the function call overhead.
Hope that helps! Cheers
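To make the inlining suggestion concrete, here is a rough sketch of what the main loop from the question could look like with get_element expanded in place (this reuses the variables m, n, N, P, A and table from the question's script; the explicit bounds checks replace the try/except):

for i, s in product(xrange(1, m), xrange(N, P + 1)):
    col = s - N                       # always a valid column, since N <= s <= P
    shifted = s - A[i] - N            # may fall outside the table
    prev = table[i - 1][col]
    prev_shifted = table[i - 1][shifted] if 0 <= shifted < n else 0
    if prev or A[i] == s or prev_shifted:
        table[i][col] = 1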
First thing that catches the eye:
def split_sum(A):
    N_list = 0
    P_list = 0
    for x in A:
        if x < 0:
            N_list += x
        elif x > 0:
            P_list += x
    return [N_list, P_list]
Some advice:
Try to use a 1D list, and use bitarray to reduce the memory footprint to a minimum (http://pypi.python.org/pypi/bitarray), so you would just change the get/set functions. This should reduce your memory footprint by at least a factor of 64 (an integer in a list is a pointer to a typed integer object, so it can be a factor of 3*32).
Avoid using try/except; figure out the proper ranges at the beginning instead. You might find that you gain a huge speedup.
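A rough sketch of the bitarray idea (my own, assuming the bitarray package is installed; the helper names are made up): the 2D table is flattened into one bit per cell, and the get/set helpers do explicit bounds checking instead of try/except.

from bitarray import bitarray

def make_table(rows, cols):
    table = bitarray(rows * cols)
    table.setall(False)
    return table

def get_bit(table, cols, row, s, N):
    if row < 0:
        return False
    col = s - N
    if 0 <= col < cols:
        return table[row * cols + col]
    return False

def set_bit(table, cols, row, s, N):
    table[row * cols + (s - N)] = True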
The following code works for Python 3.3+. I have used the itertools module in Python, which has some great methods to use.
from itertools import chain, combinations

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

nums = input("Enter the Elements").strip().split()
inputSum = int(input("Enter the Sum You want"))

for i, combo in enumerate(powerset(nums), 1):
    sum = 0
    for num in combo:
        sum += int(num)
    if sum == inputSum:
        print(combo)
The input/output is as follows:
Enter the Elements 1 2 3 4
Enter the Sum You want 5
('1', '4')
('2', '3')
Just change the values in your set w and correspondingly make an array x as big as the length of w, then pass the last value in the subsetsum function as the sum for which you want subsets, and you will be done (if you want to check with your own values).
def subsetsum(cs, k, r, x, w, d):
    x[k] = 1
    if cs + w[k] == d:
        for i in range(0, k+1):
            if x[i] == 1:
                print(w[i], end=" ")
        print()
    elif cs + w[k] + w[k+1] <= d:
        subsetsum(cs + w[k], k+1, r - w[k], x, w, d)
    if (cs + r - w[k] >= d) and (cs + w[k] <= d):
        x[k] = 0
        subsetsum(cs, k+1, r - w[k], x, w, d)

#driver for the above code
w = [2, 3, 4, 5, 0]
x = [0, 0, 0, 0, 0]
subsetsum(0, 0, sum(w), x, w, 7)