Why isn't my implementation O(NlogN)? - python

I was implementing and testing answers to this SO question -
Given an array of integers find the number of all ordered pairs of elements in the array whose sum lies in a given range [a,b]
The answer with the most upvotes (currently) only provides a text description of an algorithm that should be O(NlogN):
Sort the array... .
For each element x in the array:
Consider the array slice after the element.
Do a binary search on this array slice for [a - x], call it y0. If no exact match is found, consider the closest match bigger than [a - x] as y0.
Output all elements (x, y) from y0 forwards as long as x + y <= b. ... If you only need to count the number of pairs, you can do it in O(nlogn). Modify the above algorithm so [b - x] (or the next smaller element) is also searched for.
My implementation:
import bisect

def ani(arr, a, b):
    # Sort the array (say in increasing order).
    arr.sort()
    count = 0
    for ndx, x in enumerate(arr):
        # Consider the array slice after the element
        after = arr[ndx+1:]
        # Do a binary search on this array slice for [a - x], call it y0
        lower = a - x
        y0 = bisect.bisect_left(after, lower)
        # If you only need to count the number of pairs,
        # modify the ... algorithm so [b - x] ... is also searched for
        upper = b - x
        y1 = bisect.bisect_right(after, upper)
        count += y1 - y0
    return count
When I plot time versus N, or some function of N, I see an exponential or N^2 response.
# generate timings
import random
import time
from timeit import Timer

T = list()                  # run-times
N = range(100, 10001, 100)  # problem sizes
arr = [random.randint(-10, 10) for _ in xrange(1000000)]
print 'start'
start = time.time()
for n in N:
    arr1 = arr[:n]
    t = Timer('ani(arr1, 5, 16)', 'from __main__ import arr1, ani')
    timing_loops = 100
    T.append(t.timeit(timing_loops) / timing_loops)
Is my implementation incorrect or is the author's claim incorrect?
Here are some plots of the data.
T vs N
T / NlogN vs N - one commenter thought this should NOT produce a linear plot - but it does.
T vs NlogN - I thought this should be linear if the complexity is NlogN but it is not.

If nothing else, this is your error:
for ndx, x in enumerate(arr):
    # Consider the array slice after the element
    after = arr[ndx+1:]
arr[ndx+1:] creates a copy of the list of length len(arr) - ndx - 1, so your loop is O(n^2).
Instead, use the lo and hi arguments to bisect.bisect.
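For example, a sketch of the same counting loop rewritten to bisect into arr in place, so no copies are made:
import bisect

def ani(arr, a, b):
    arr.sort()
    count = 0
    n = len(arr)
    for ndx, x in enumerate(arr):
        # restrict the search to the slice after ndx via lo/hi
        y0 = bisect.bisect_left(arr, a - x, ndx + 1, n)
        y1 = bisect.bisect_right(arr, b - x, ndx + 1, n)
        count += y1 - y0
    return count
With the copies gone, each iteration costs O(log n) and the whole loop is O(n log n).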

Related

Maximum absolute difference of value and index sums of Four arrays

You are given four arrays A, B, C, D each of size N.
Find the maximum value M of the expression below:
M = max(|A[i] - A[j]| + |B[i] - B[j]| + |C[i] - C[j]| + |D[i] - D[j]| + |i - j|)
where 1 <= i < j <= N, and |x| refers to the absolute value of x.
Constraints
2 <= N <= 10^5
1 <= A[i], B[i], C[i], D[i] <= 10^9
Input: N,A,B,C,D
Output: M
Example:
Input:
5
5,7,6,3,9
7,9,2,7,5
1,9,9,3,3
8,4,1,10,5
Output:
24
I have tried it this way:
def max_value(arr1, arr2, arr3, arr4, n):
    res = 0
    # iterate with two nested loops, one for i and another for j
    for i in range(n):
        for j in range(n):
            temp = (abs(arr1[i] - arr1[j]) + abs(arr2[i] - arr2[j])
                    + abs(arr3[i] - arr3[j]) + abs(arr4[i] - arr4[j])
                    + abs(i - j))
            if temp > res:
                res = temp
    return res
This is O(n^2).
But I want a solution with better time complexity; this will not work for larger values of N.
Here is a solution for a single array.
One can generalize the solution for a single array that you showed. Given a number K of arrays, including the array of indices, one can form the 2^K possible sign combinations of the arrays to get rid of the absolute values. It is then easy to take the max and min of each of these combinations separately and compare them. This is O(Kn*2^K), much better than the original O(Kn^2) for the values you report.
Here is code that works on an arbitrary number of input arrays:
import numpy as np

def run(n, *args):
    aux = np.arange(n)
    K = len(args) + 1
    rows = 2 ** K
    x = np.zeros((rows, n))
    for i in range(rows):
        temp = 0
        for m, a in enumerate(args):
            # the m-th bit of i chooses the sign of the m-th array
            temp += np.array(a) * ((-1) ** int(f"{i:0{K}b}"[-(1 + m)]))
        # the remaining bit chooses the sign of the index array
        temp += aux * ((-1) ** int(f"{i:0{K}b}"[-K]))
        x[i] = temp
    x_max = np.max(x, axis=-1)
    x_min = np.min(x, axis=-1)
    res = np.max(x_max - x_min)
    return res
The for loop maybe deserves more explanation: in order to form all possible combinations of absolute values, I assign each combination to an integer and rely on the binary representation of that integer to choose which of the K vectors must be taken as negative.
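As a quick check (assuming run as defined above), the sample input from the question gives the expected answer:
A = [5, 7, 6, 3, 9]
B = [7, 9, 2, 7, 5]
C = [1, 9, 9, 3, 3]
D = [8, 4, 1, 10, 5]
print(run(5, A, B, C, D))  # prints 24.0, matching the expected output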
Idea for faster solution
If you are only interested in the maximum of M, you could search for the minimum and maximum values of A, B, C, D and i-j. Let's say i_Amax is the index i for the maximum of A.
Now you look up B[i_Amax], C[i_Amax], and so on, do the same for i_Amin, and calculate M from the differences of the max and min values.
You repeat the previous step with the index for the maximum value of B (i_Bmax) and calculate M, and continue until you have gone through A, B, C, D and i-j.
You should then have five terms, and one of them should be the maximum.
If you don't have a clear minimum or maximum, you have to calculate the indices for all the possible minimums and maximums.
I think it should find any maximum and is faster than n^2, especially for big n, but I have not implemented it myself, so you have to think it through to check whether I made a logical error and whether every maximum can be found with this idea.
I hope that helps!
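A minimal sketch of this heuristic, with the same caveat that it is not guaranteed to be correct (heuristic_max and m_value are made-up names):
def heuristic_max(A, B, C, D):
    n = len(A)

    def m_value(i, j):
        return (abs(A[i] - A[j]) + abs(B[i] - B[j]) + abs(C[i] - C[j])
                + abs(D[i] - D[j]) + abs(i - j))

    best = 0
    # evaluate M only at the (argmin, argmax) index pair of each array;
    # the index array range(n) contributes the pair (0, n - 1)
    for arr in (A, B, C, D, list(range(n))):
        i_max = max(range(n), key=lambda i: arr[i])
        i_min = min(range(n), key=lambda i: arr[i])
        best = max(best, m_value(i_min, i_max))
    return best
This is O(Kn) for K arrays, but as noted above it can miss the true maximum when no single array dominates.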

Delete certain elements of a numpy array

I have two numpy arrays a and b. I have a definition that constructs an array c whose elements are all the possible sums of different elements of a.
import numpy as np

def Sumarray(a):
    n = len(a)
    sumarray = np.array([0])  # add a default zero element
    for k in range(2, n + 1):
        full = np.mgrid[k * (slice(n),)]
        nd_triu_idx = full[:, (np.diff(full, axis=0) > 0).all(axis=0)]
        sumarray = np.append(sumarray, a[nd_triu_idx].sum(axis=0))
    return sumarray

a = np.array([1, 2, 6, 8])
c = Sumarray(a)
print(c)
I then perform a subset sum between an element of c and b: isSubsetSum returns the elements of b that, when summed, give c[0]. Let's say that I get
c[0] = b[2] + b[3]
Then I want to remove:
the elements b[2], b[3] (easy bit), and
the elements of a that when summed gave c[0]
As you can see from the definition of Sumarray, the order of the sums of different elements of a is preserved, so I need to realise some mapping.
The function isSubsetSum is given by
def _isSubsetSum(numbers, n, x, indices):
    if x == 0:
        return True
    if n == 0 and x != 0:
        return False
    # If the last element is greater than x, then ignore it
    if numbers[n - 1] > x:
        return _isSubsetSum(numbers, n - 1, x, indices)
    # else, check if x can be obtained without the last element...
    found = _isSubsetSum(numbers, n - 1, x, indices)
    if found:
        return True
    # ...or with it
    indices.insert(0, n - 1)
    found = _isSubsetSum(numbers, n - 1, x - numbers[n - 1], indices)
    if not found:
        indices.pop(0)
    return found

def isSubsetSum(numbers, x):
    indices = []
    found = _isSubsetSum(numbers, len(numbers), x, indices)
    return indices if found else None
As you are iterating over all possible numbers of terms, you could just as well directly generate all possible subsets.
These can be conveniently encoded as the numbers 0, 1, 2, ... by means of their binary representations: 0 means no terms at all, 1 means only the first term, 2 means only the second, 3 means the first and the second, and so on.
Using this scheme it becomes very easy to recover the terms from the sum index, because all we need to do is obtain the binary representation.
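For instance, a tiny pure-Python illustration of this encoding (just for exposition, not part of the answer's code):
a = [10, 20, 40]
for k in range(1 << len(a)):
    # bit i of k decides whether a[i] is part of subset number k
    terms = [a[i] for i in range(len(a)) if (k >> i) & 1]
    print(k, format(k, '03b'), terms, sum(terms))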
UPDATE: we can suppress 1-term-sums with a small amount of extra code:
import numpy as np

def find_all_subsums(a, drop_singletons=False):
    n = len(a)
    assert n <= 32  # this gives 4G subsets, and we have to cut somewhere
    # compute the smallest integer type with enough bits
    dt = f"<u{1 << ((n - 1) >> 3).bit_length()}"
    # the numbers 0 to 2^n - 1 encode all possible subsets of an
    # n-element set by means of their binary representation:
    # each bit corresponds to one element; number k represents the
    # subset consisting of all elements whose bit is set in k
    rng = np.arange(1 << n, dtype=dt)
    if drop_singletons:
        # one-element subsets correspond to powers of two
        rng = np.delete(rng, 1 << np.arange(n))
    # np.unpackbits transforms bytes to their binary representation;
    # given a bit vector b we can compute the corresponding subsum
    # as b dot a; to do it in bulk we multiply the matrix of
    # binary rows with a
    return np.unpackbits(rng[..., None].view('u1'),
                         axis=1, count=n, bitorder='little') @ a

def show_terms(a, idx, drop_singletons=False):
    n = len(a)
    if drop_singletons:
        # we must undo the dropping of powers of two to get an index
        # that is easy to translate; one can check that the following
        # formula does the trick
        idx += (idx + idx.bit_length()).bit_length()
    # now we can simply use the binary representation
    return a[np.unpackbits(np.asarray(idx, dtype='<u8')[None].view('u1'),
                           count=n, bitorder='little').view('?')]

example = np.logspace(1, 7, 7, base=3)
ss = find_all_subsums(example, True)
# check every single sum
for i, s in enumerate(ss):
    assert show_terms(example, i, True).sum() == s
# print one example
idx = 77
print(ss[idx], "=", " + ".join(show_terms(example.astype('U'), idx, True)))
Sample run:
2457.0 = 27.0 + 243.0 + 2187.0

How to init an array with each element holding a value different from its neighbours

I have a matrix or a multidimensional array written in Python; each element in the array is an integer ranging from 0 to 7. How would I randomly initialize this matrix or multidimensional array so that each element holds a value different from the values of its 4 neighbours (left, right, top, bottom)? Can it be implemented in numpy?
You can write your own matrix initializer.
Go through the array[i][j] for each i, j and pick a random number between 0 and 7.
If the number equals either the left element array[i][j-1] or the upper one array[i-1][j], regenerate it.
You have at most a 2/8 = 1/4 probability of encountering such a bad case, (1/4)^2 of hitting it twice in a row, (1/4)^3 for three in a row, etc., so the probability drops off very quickly.
The average case complexity for n elements in a matrix would be O(n).
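A minimal sketch of that approach (assuming a rows x cols grid and values 0..7):
import random

def init_grid(rows, cols, upper=8):
    grid = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            v = random.randrange(upper)
            # redraw while the value collides with the left or upper neighbour;
            # cells to the right and below are not filled in yet
            while (j > 0 and v == grid[i][j - 1]) or (i > 0 and v == grid[i - 1][j]):
                v = random.randrange(upper)
            grid[i][j] = v
    return grid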
A simpler problem that might get you started is to do the same for a 1d array. A pure-python solution would look like:
import random

def sample_1d(n, upper):
    x = [random.randrange(upper)]
    for i in range(1, n):
        # draw from one fewer value and skip over the previous element
        xi = random.randrange(upper - 1)
        if xi >= x[i - 1]:
            xi += 1
        x.append(xi)
    return x
You can vectorize this as:
import numpy as np

def sample_1d_v(n, upper):
    x = np.empty(n, dtype=int)
    x[0] = 0
    # consecutive entries differ by a nonzero step modulo upper
    x[1:] = np.cumsum(np.random.randint(1, upper, size=n - 1)) % upper
    # shift everything by a random offset so the start is uniform too
    x = (x + np.random.randint(upper)) % upper
    return x
The trick here is noting that if adjacent values must be different, then the difference between adjacent values (taken modulo upper) is uniformly distributed in [1, upper).
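A quick sanity check of the vectorized version (assuming the sample_1d_v above):
x = sample_1d_v(1000, 8)
assert (x[1:] != x[:-1]).all()  # no two adjacent values are equal
assert x.min() >= 0 and x.max() < 8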

How to wrap around to the start/end of a list?

I have a 2d array with a different species in each cell. I pick a random element of the array and I want to count up how many of each species are in the eight squares immediately adjacent to that element.
But I want the array to wrap at the edges, so if I pick an element in the top row, the bottom row will be counted as "adjacent". How can I do this while iterating through j in range(x-1, x+1) and the same for k and y?
Also, is there a more elegant way of omitting the element I originally picked while looking through the adjacent squares than the if (j != x or k != y) line?
numspec = [0] * len(allspec)
for i in range(0, len(allspec)):
    # count up how many of species i there is in the immediate area
    for j in range(x - 1, x + 1):
        for k in range(y - 1, y + 1):
            if (j != x or k != y):
                numspec[hab[i][j]] = numspec[hab[i][j]] + 1
You can wrap using j % 8, which gives you a number from 0 to 7.
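For example, the modulo maps out-of-range offsets back into the 0..7 range:
print([j % 8 for j in [-1, 0, 7, 8]])  # [7, 0, 7, 0]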
As for wrapping, I would recommend using relative indexing from -1 to +1 and then computing the real index using the modulo operator (%).
As for making sure you don't count the original element (x, y), you are doing just fine (I would probably use the reversed condition and continue, but it doesn't matter).
I don't quite understand your usage of the i, j, k indexes, so I'll just assume that i is the index of the species and j, k are indexes into the 2d map called hab, which I changed to x_rel, y_rel and x_idx, y_idx to make it more readable. If I'm mistaken, change the code or let me know.
I also took the liberty of making some minor fixes:
introduced an N constant representing the number of species
changed range to xrange (xrange is faster, uses less memory, etc.)
no need to specify 0 as the start of range (or xrange)
instead of X = X + 1, I used the += increment operator: X += 1
Here is the resulting code:
N = len(allspec)
numspec = [0] * N
for i in xrange(N):
    for x_rel in xrange(-1, 2):
        for y_rel in xrange(-1, 2):
            # wrap at the edges with modulo (this assumes hab is N x N)
            x_idx = (x + x_rel) % N
            y_idx = (y + y_rel) % N
            if x_idx != x or y_idx != y:
                numspec[hab[x_idx][y_idx]] += 1
You could construct a list of the adjacent elements and go from there. For example, if your 2d list is called my_array and you want to examine the blocks immediately surrounding my_array[x][y], then you can do something like this:
xmax = len(my_array)
ymax = len(my_array[0])  # assuming it's a square...
x_vals = [i % xmax for i in [x - 1, x, x + 1]]
y_vals = [i % ymax for i in [y - 1, y, y + 1]]  # same pattern as x_vals
surrounding_blocks = [
    my_array[x_vals[0]][y_vals[0]],
    my_array[x_vals[0]][y_vals[1]],
    my_array[x_vals[0]][y_vals[2]],
    my_array[x_vals[2]][y_vals[0]],
    my_array[x_vals[2]][y_vals[1]],
    my_array[x_vals[2]][y_vals[2]],
    my_array[x_vals[1]][y_vals[0]],
    my_array[x_vals[1]][y_vals[2]],
]

Subset sum Problem

Recently I became interested in the subset-sum problem, which is finding a zero-sum subset in a superset. I found some solutions on SO; in addition, I came across a particular solution which uses the dynamic programming approach. I translated his solution into Python based on his qualitative descriptions. I'm trying to optimize this for larger lists, which eats up a lot of my memory. Can someone recommend optimizations or other techniques to solve this particular problem? Here's my attempt in Python:
import random
from time import time
from itertools import product

time0 = time()

# create a zero matrix of size a (rows) by b (cols)
def create_zero_matrix(a, b):
    return [[0]*b for x in xrange(a)]

# generate a list of size num with random integers within lower and upper bounds
def random_ints(num, lower=-1000, upper=1000):
    return [random.randrange(lower, upper+1) for i in range(num)]

# split a list up into N and P, where N is the sum of the negative values
# and P the sum of the positive values.
# 0 does not count because of the additive identity
def split_sum(A):
    N_list = []
    P_list = []
    for x in A:
        if x < 0:
            N_list.append(x)
        elif x > 0:
            P_list.append(x)
    return [sum(N_list), sum(P_list)]

# since the column indexes are in the range from 0 to P - N,
# we would like to retrieve them based on the index in the range N to P
# n := row, m := col
def get_element(table, n, m, N):
    if n < 0:
        return 0
    try:
        return table[n][m - N]
    except:
        return 0

# same indexing convention as above
def set_element(table, n, m, N, value):
    table[n][m - N] = value

# input array
#A = [1, -3, 2, 4]
A = random_ints(200)
[N, P] = split_sum(A)

# create a zero matrix of size m (rows) by n (cols)
#
# m := the number of elements in A
# n := P - N + 1 (by definition N <= s <= P)
#
# each element in the matrix will be a value of either 0 (false) or 1 (true)
m = len(A)
n = P - N + 1
table = create_zero_matrix(m, n)

# set the first element at index (0, A[0]) to be true
# Definition: Q(1,s) := (x1 == s). Note that the index starts at 0 instead of 1.
set_element(table, 0, A[0], N, 1)

# iterate through each table element
#for i in xrange(1, m):          # row
#    for s in xrange(N, P + 1):  # col
for i, s in product(xrange(1, m), xrange(N, P + 1)):
    if get_element(table, i - 1, s, N) or A[i] == s or get_element(table, i - 1, s - A[i], N):
        #set_element(table, i, s, N, 1)
        table[i][s - N] = 1

# find a zero-sum subset solution
s = 0
solution = []
for i in reversed(xrange(0, m)):
    if get_element(table, i - 1, s, N) == 0 and get_element(table, i, s, N) == 1:
        s = s - A[i]
        solution.append(A[i])

print "Solution: ", solution

time1 = time()
print "Time execution: ", time1 - time0
I'm not quite sure if your solution is exact or a PTA (poly-time approximation).
But, as someone pointed out, this problem is indeed NP-complete.
Meaning, every known exact algorithm has exponential time behavior in the size of the input.
Meaning, if you can process 1 operation in 0.1 nanoseconds (10,000,000,000 ops per second), then for a list of 59 elements it'll take:
2^59 ops --> 2^59 / 10,000,000,000 seconds (about 2^26 seconds) --> 2^26 / (3600 x 24 x 365) years (about 2 years)
You can find heuristics, which give you just a CHANCE of finding an exact solution in polynomial time.
On the other hand, if you restrict the problem (to another one) using bounds for the values of the numbers in the set, then the problem complexity reduces to polynomial time. But even then the memory space consumed will be a polynomial of VERY high order.
The memory consumed will be much larger than the few gigabytes you have in memory,
and even much larger than the few terabytes on your hard drive
(that's for small values of the bound on the value of the elements in the set).
Maybe this is the case for your dynamic programming algorithm.
It seemed to me that you were using a bound of 1000 when building your initialization matrix.
You can try a smaller bound, that is, if your input consistently consists of small values.
Good luck!
Someone on Hacker News came up with the following solution to the problem, which I quite liked. It just happens to be in python :):
def subset_summing_to_zero(activities):
    subsets = {0: []}
    for (activity, cost) in activities.iteritems():
        old_subsets = subsets
        subsets = {}
        for (prev_sum, subset) in old_subsets.iteritems():
            subsets[prev_sum] = subset
            new_sum = prev_sum + cost
            new_subset = subset + [activity]
            if 0 == new_sum:
                new_subset.sort()
                return new_subset
            else:
                subsets[new_sum] = new_subset
    return []
I spent a few minutes with it and it worked very well.
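For example (a hypothetical input; the function expects a dict mapping each activity name to its cost):
activities = {'a': 5, 'b': -3, 'c': -2, 'd': 7}
print subset_summing_to_zero(activities)  # ['a', 'b', 'c'], since 5 - 3 - 2 == 0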
An interesting article on optimizing Python code is available here. Basically, the main result is that you should inline your frequent loops. In your case this would mean that instead of calling get_element twice per loop, you put the actual code of that function inside the loop, in order to avoid the function call overhead.
Hope that helps! Cheers
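A sketch of what that inlining could look like for the hot loop of the question's script (m, n, N, P, A and table are the names defined there; explicit bounds checks replace the try/except):
from itertools import product

for i, s in product(xrange(1, m), xrange(N, P + 1)):
    col = s - N                  # always in range, since N <= s <= P
    shifted = s - A[i] - N
    prev = table[i - 1][col]
    prev_shifted = table[i - 1][shifted] if 0 <= shifted < n else 0
    if prev or A[i] == s or prev_shifted:
        table[i][col] = 1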
First thing that catches the eye: split_sum builds two lists only to sum them, so it can accumulate the sums directly:
def split_sum(A):
    N_sum = 0
    P_sum = 0
    for x in A:
        if x < 0:
            N_sum += x
        elif x > 0:
            P_sum += x
    return [N_sum, P_sum]
Some advice:
Try to use a 1D list and use bitarray to reduce the memory footprint to a minimum (http://pypi.python.org/pypi/bitarray); you would just change the get/set functions. This should reduce your memory footprint by a factor of at least 64 (an integer in a list is a pointer to a typed integer object, so the factor can be 3*32).
Avoid using try/except; figure out the proper ranges at the beginning instead. You may find that you gain a lot of speed.
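A sketch of that flat bitarray layout (assuming the third-party bitarray package; the function names here are made up):
from bitarray import bitarray

def create_table(rows, cols):
    table = bitarray(rows * cols)
    table.setall(0)
    return table

def get_bit(table, cols, row, col):
    # out-of-range lookups count as 0, replacing the try/except
    if row < 0 or not (0 <= col < cols):
        return 0
    return table[row * cols + col]

def set_bit(table, cols, row, col):
    table[row * cols + col] = 1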
The following code works for Python 3.3+. I have used the itertools module, which has some great functions:
from itertools import chain, combinations

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

nums = input("Enter the Elements").strip().split()
inputSum = int(input("Enter the Sum You want"))

for i, combo in enumerate(powerset(nums), 1):
    total = 0  # avoid shadowing the built-in sum
    for num in combo:
        total += int(num)
    if total == inputSum:
        print(combo)
The input/output is as follows:
Enter the Elements 1 2 3 4
Enter the Sum You want 5
('1', '4')
('2', '3')
Just change the values in your set w, make an array x as big as the length of w, then pass the target sum as the last argument to the subsetsum function, and you will be done (if you want to check it with your own values).
def subsetsum(cs, k, r, x, w, d):
    x[k] = 1
    if cs + w[k] == d:
        for i in range(0, k + 1):
            if x[i] == 1:
                print(w[i], end=" ")
        print()
    elif cs + w[k] + w[k + 1] <= d:
        subsetsum(cs + w[k], k + 1, r - w[k], x, w, d)
    if (cs + r - w[k] >= d) and (cs + w[k] <= d):
        x[k] = 0
        subsetsum(cs, k + 1, r - w[k], x, w, d)

# driver for the above code
w = [2, 3, 4, 5, 0]
x = [0, 0, 0, 0, 0]
subsetsum(0, 0, sum(w), x, w, 7)
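Running the driver as given prints the two subsets of w that sum to 7:
2 5
3 4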
