def function_1(arr):
    return [j for i in range(len(arr)) for j in range(len(arr))
            if np.array(arr)[i] == np.sort(arr)[::-1][j]]
An array arr is given. For each position i, it is required to find the index of the element arr[i] in arr sorted in descending order. All values of arr are distinct.
I have to write the function in one line. It works, but very slowly, and I have to run this:
np.random.seed(42)
arr = function_1(np.random.uniform(size=1000000))
print(arr[7] + arr[42] + arr[445677] + arr[53422])
Please help to optimize the code.
You are repeatedly sorting and reversing the array, but the result of that operation is independent of the current value of i or j. The simple thing to do is to pre-compute that, then use its value in the list comprehension.
For that matter, range(len(arr)) can also be computed once.
Finally, arr is already an array; you don't need to make a copy each time through the i loop.
def function_1(arr):
    arr_sr = np.sort(arr)[::-1]
    r = range(len(arr))
    return [j for i in r for j in r if arr[i] == arr_sr[j]]
Fitting this into a single line becomes trickier. Aside from extremely artificial outside constraints, there is no reason to do so, but once Python 3.8 is released, assignment expressions will make it simpler. Note that the walrus assignment has to sit in the iterable of the outermost for clause, which is evaluated only once; putting it in the if condition would re-run the sort on every iteration. I think the following would be equivalent:
def function_1(arr):
    return [j for i in (r := range(len((arr_sr := np.sort(arr)[::-1])))) for j in r if arr[i] == arr_sr[j]]
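For what it's worth, the whole problem can also be vectorized away: since all values are distinct, the desired output is just each element's rank in descending order, which numpy can compute with a double argsort in O(N log N). A sketch (function name is mine, not from the question):

```python
import numpy as np

def function_1_fast(arr):
    # argsort(-arr) gives the indices that would sort arr descending;
    # argsort of that gives, for each position, its rank in that order.
    return np.argsort(np.argsort(-np.asarray(arr)))
```

This should make the million-element input from the question tractable, since it replaces the N^2 comparisons (each of which sorts) with two sorts.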
Have a think about the steps that are going on in here:
[j
for i in range(len(arr))
for j in range(len(arr))
if np.array(arr)[i] == np.sort(arr)[::-1][j]
]
Suppose your array contains N elements.
You pick an i, N different times.
You pick a j, N different times.
Then for each (i, j) pair you evaluate the final line.
That is, you evaluate the final line N^2 times.
But that final line sorts an array of N elements, which is an O(N log N) operation. So the complexity of your code is O(N^3 log N).
Try making a sorted copy of the array before your [... for i ... for j ...] runs. That'll reduce the time complexity to O(N^2 + N log N), i.e. O(N^2).
I think...
Related
Given an array of integers, find all unique quartets summing up to a specified integer.
I will provide two different solutions below, I was just wondering which one was more efficient with respect to time complexity?
Solution 1:
def four_sum(arr, s):
    n = len(arr)
    output = set()
    for i in range(n-2):
        for j in range(i+1, n-1):
            seen = set()
            for k in range(j+1, n):
                target = s - arr[i] - arr[j] - arr[k]
                if target in seen:
                    output.add((arr[i], arr[j], arr[k], target))
                else:
                    seen.add(arr[k])
    return print('\n'.join(map(str, list(output))))
I know that this has time complexity of O(n^3).
Solution 2:
def four_sum2(arr, s):
    n = len(arr)
    seen = {}
    for i in range(n-1):
        for j in range(i+1, n):
            if arr[i] + arr[j] in seen:
                seen[arr[i] + arr[j]].add((i, j))
            else:
                seen[arr[i] + arr[j]] = {(i, j)}
    output = set()
    for key in seen:
        if s - key in seen:
            for (i, j) in seen[key]:
                for (p, q) in seen[s - key]:
                    sorted_index = tuple(sorted((arr[i], arr[j], arr[p], arr[q])))
                    if i not in (p, q) and j not in (p, q):
                        output.add(sorted_index)
    return output
Now, the first block has a time complexity of O(n^2), but I'm not sure what the time complexity is on the second block?
TLDR: the complexity of this algorithm is O(n^4).
In the first part, a tuple is added to seen for every pair (i, j) with j > i.
Thus the number of tuples in seen is about (n-1)*n/2 = O(n^2), as you guessed.
The second part is a bit more complex. If we ignore the first condition of the nested loops (the critical case), the first two loops can iterate over all tuples in seen, so the complexity is at least O(n^2). The third loop is trickier: it is hard to know its complexity without making any assumption about the input data. However, there is a theoretical critical case where seen[s - key] contains O(n^2) tuples. In such a case, the overall algorithm runs in O(n^4)!
Is this theoretical critical case practical?
Well, sadly yes. Take the input arr = [5, 5, ..., 5, 5] with s = 20, for example. The seen map will contain a single key (10) associated with a set of (n-1)*n/2 = O(n^2) pairs. In this case the first two loops of the second part run in O(n^2) and the third nested loop in O(n^2) too, so the overall algorithm runs in O(n^4).
However, note that in practice such a case should be quite rare, and the algorithm should run much faster on random inputs with many different numbers. The complexity can probably be improved to O(n^3) or even O(n^2) if this critical case is handled separately.
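To make the pathological case concrete, here is a small sketch (plain Python, mirroring the first part of four_sum2) showing that an all-equal input really does pile O(n^2) pairs under a single key of seen:

```python
n = 20
arr = [5] * n

seen = {}
for i in range(n - 1):
    for j in range(i + 1, n):
        # every pair sums to 10, so all (i, j) pairs land under one key
        seen.setdefault(arr[i] + arr[j], set()).add((i, j))

print(len(seen))      # 1 key
print(len(seen[10]))  # n*(n-1)/2 = 190 pairs
```

With both of the first two loops and the third nested loop iterating over this one set, the second phase does quadratic work per pair, hence O(n^4) overall.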
In this snippet of Python code, fun iterates through the array arr and, for every pair of sections, counts the number of identical integers in the two sections. (It simulates a matrix.) This makes n*(n-1)/2*m comparisons in total; with m fixed at 100, that is a time complexity of O(n^2).
Are there programming solutions or ways of reframing this problem that would yield equivalent results but have reduced time complexity?
# n > 500000, 0 < i < n, m = 100
# dim(arr) = n*m, 0 < arr[x] < 4294967311
arr = mp.RawArray(ctypes.c_uint, n*m)

def fun(i):
    for j in range(i-1, 0, -1):
        count = 0
        for k in range(0, m):
            count += (arr[i*m+k] == arr[j*m+k])
        if count/m > 0.7:
            return (i, j)
    return ()
arr is a shared memory array, therefore it's best kept read-only for simplicity and performance reasons.
arr is implemented as a 1D RawArray from multiprocessing. The reason for this is that it has by far the fastest performance according to my tests. Using a numpy 2D array, for example, like this:
arr = np.ctypeslib.as_array(mp.RawArray(ctypes.c_uint, n*m)).reshape(n,m)
would provide vectorization capabilities, but increases the total runtime by an order of magnitude - 250s vs. 30s for n = 1500, which amounts to 733%.
Since you can't change the array characteristics at all, I think you're stuck with O(n^2). numpy would gain some vectorization, but would change the access for others sharing the array. Start with the innermost operation:
for k in range(0, m):
    count += (arr[i][k] == arr[j][k])
Change this to a one-line assignment:
count = sum(arr[i][k] == arr[j][k] for k in range(m))
Now, if this is truly an array, rather than a list of lists, use the array package's vectorization to simplify the loops, one at a time:
count = sum(arr[i] == arr[j])  # arr[i] == arr[j] is a boolean vector; sum counts the matches
You can now return the j indices where count[j] / m > 0.7. Note that there's no real need to return i for each one: it's constant within the function, and the calling program already has the value. Your array package likely has a pair of vectorized indexing operations that can return those indices. If you're using numpy, those are easy enough to look up on this site.
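As a sketch of that idea (assuming the data has been copied into a 2-D numpy array arr2d, which is my assumption rather than the shared RawArray from the question):

```python
import numpy as np

n, m = 1000, 100
rng = np.random.default_rng(42)
arr2d = rng.integers(0, 5, size=(n, m), dtype=np.uint32)

def fun_vec(i):
    # One boolean (i x m) matrix comparing row i to all earlier rows at once,
    # then a per-row count of matches along the columns.
    counts = (arr2d[:i] == arr2d[i]).sum(axis=1)
    js = np.nonzero(counts > 0.7 * m)[0]
    # The original scans from j = i-1 downward, so return the largest match.
    return (i, int(js[-1])) if js.size else ()
```

This still does O(n^2) pairwise work overall, but each row comparison is a single vectorized operation instead of a Python-level loop over m.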
So after fiddling around some more, I was able to cut down the running time greatly with help from NumPy's vectorization and Numba's JIT compiler. Going back to the original code:
arr = mp.RawArray(ctypes.c_uint, n*m)

def fun(i):
    for j in range(i-1, 0, -1):
        count = 0
        for k in range(0, m):
            count += (arr[i*m+k] == arr[j*m+k])
        if count/m > 0.7:
            return (i, j)
    return ()
We can leave out the bottom return statement as well as dismiss the idea of using count entirely, leaving us with:
def fun(i):
    for j in range(i-1, 0, -1):
        if sum(arr[i*m+k] == arr[j*m+k] for k in range(m)) > 0.7*m:
            return (i, j)
Then, we change the array arr to a NumPy format:
np_arr = np.frombuffer(arr, dtype=np.uint32).reshape(n, m)
The important thing to note here is that we do not use a NumPy array as a shared memory array to be written from multiple processes, avoiding the overhead pitfall.
Finally, we apply Numba's decorator and rewrite the sum function in vector form so that it works with the new array:
import numba as nb

@nb.njit(fastmath=True, parallel=True)
def fun(i):
    for j in range(i-1, 0, -1):
        if np.sum(np_arr[i] == np_arr[j]) > 0.7*m:
            return (i, j)
This reduced the running time to 7.9s, which is definitely a victory for me.
Consider the following code:
for i in range(size-1):
    for j in range(i+1, size):
        print((i, j))
I need to go through this for-loop in a random fashion. I attempt to write a generator to do such a thing
def Neighborhood(size):
    for i in shuffle(range(size-1)):
        for j in shuffle(range(i+1, size)):
            yield i, j

for i, j in Neighborhood(size):
    print((i, j))
However, shuffle cannot be applied to a range object. I do not know how to remedy the situation, and any help is much appreciated. I would prefer a solution that avoids converting range to a list, since I need speed. For example, size could be on the order of 30,000 and I will perform this loop around 30,000 times.
I also plan to escape the for loop early, so I want to avoid solutions that incorporate shuffle(list(range(size))).
You can use random.sample.
The advantage of using random.sample over random.shuffle is that it accepts a sequence such as range directly and returns a new shuffled list, so:
In Python 3.x you don't need to convert range() to a list.
In Python 2.x you can use xrange.
The same code works in Python 2.x and 3.x.
Sample code :
from random import sample

n = 10
l1 = range(n)
for i in sample(l1, len(l1)):
    l2 = range(i, n)
    for j in sample(l2, len(l2)):
        print(i, j)
Edit :
As to why I put in this edit, go through the comments.
def Neighborhood(size):
    range1 = range(size - 1)
    for i in sample(range1, len(range1)):
        range2 = range(i + 1, size)
        for j in sample(range2, len(range2)):
            yield i, j
A simple way to go really random, not row-by-row:
def Neighborhood(size):
    yielded = set()
    while True:
        i = random.randrange(size)
        j = random.randrange(size)
        if i < j and (i, j) not in yielded:
            yield i, j
            yielded.add((i, j))
Demo:
for i, j in Neighborhood(30000):
    print(i, j)
Prints something like:
2045 5990
224 5588
1577 16076
11498 15640
15219 28006
8066 10142
7856 8248
17830 26616
...
Note: I assume you're indeed going to "escape the for loop early". Then this won't have problems with slowing down due to pairs being produced repeatedly.
I don't think you can randomly traverse an iterator, but you can predefine the shuffled lists:
random iteration in Python
L1 = list(range(size-1))
random.shuffle(L1)
for i in L1:
    L2 = list(range(i+1, size))
    random.shuffle(L2)
    for j in L2:
        print((i, j))
Of course, not optimal for large lists
Given a list of numbers, find the maximum sum of non-adjacent elements with time complexity O(n) and space complexity O(1). I could use this:
sum1 = list[0]
sum2 = 0
for i in range(1, len(list)):
    num = sum1
    sum1 = sum2 + list[i]
    sum2 = max(num, sum2)
print(max(sum2, sum1))
This code works only when k = 1 (at least one element between the summed numbers). How could I improve it for a general k using dynamic programming, where k is the number of elements required between the summed numbers?
for example:
list = [5,6,4,1,2] k=1
answer = 11 # 5+4+2
list = [5,6,4,1,2] k=2
answer = 8 # 6+2
list = [5,3,4,10,2] k=1
answer = 15 # 5+10
It's possible to solve this with space O(k) and time O(nk). if k is a constant, this fits the requirements in your question.
The algorithm loops from position k + 1 to n. (If the array is shorter than that, it can obviously be solved in O(k)). At each step, it maintains an array best of length k + 1, such that the jth entry of best is the best solution found so far, such that the last element it used is at least j to the left of the current position.
Initializing best is done by setting, for its entry j, the largest non-negative entry in the array in positions 1, ..., k + 1 - j. So, for example, best[1] is the largest non-negative entry in positions 1, ..., k, and best[k + 1] is 0.
When at position i of the array, element i is used or not. If it is used, the relevant best until now is best[1], so write u = max(best[1] + a[i], best[1]). If element i is not used, then each "at least" part shifts one, so for j = 2, ..., k + 1, best[j] = max(best[j], best[j - 1]). Finally, set best[1] = u.
At the termination of the algorithm, the solution is the largest item in best.
EDIT:
I had misunderstood the question. If you need at least k elements in between, then the following is an O(n^2) solution.
If the numbers are non-negative, then the DP recurrence relation is:
DP[i] = max(DP[j] + A[i])  for all j s.t. 0 <= j < i - k
      = A[i]               otherwise.
If there are negative numbers in the array as well, then we can use the idea from Kadane's algorithm:
DP[i] = max(DP[j] + A[i])  for all j s.t. 0 <= j < i - k and DP[j] + A[i] > 0
      = max(0, A[i])       otherwise.
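A direct transcription of this recurrence for the non-negative case might look like the following (function name is mine, not from the question):

```python
def max_sum_k_apart(A, k):
    # DP[i] = best sum of a subsequence ending at i, with at least k
    # elements between consecutive picks (i.e. earlier index j < i - k).
    n = len(A)
    DP = [0] * n
    for i in range(n):
        best_prev = 0
        for j in range(i - k):  # all valid earlier endpoints
            best_prev = max(best_prev, DP[j])
        DP[i] = best_prev + A[i]
    return max(DP, default=0)

print(max_sum_k_apart([5, 6, 4, 1, 2], 1))   # 11
print(max_sum_k_apart([5, 6, 4, 1, 2], 2))   # 8
print(max_sum_k_apart([5, 3, 4, 10, 2], 1))  # 15
```

This matches the three examples from the question; the inner scan over j is what makes it O(n^2).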
Here's a quick implementation of the algorithm described by Ami Tavory (as far as I understand it). It should work for any sequence, though if your list is all negative, the maximum sum will be 0 (the sum of an empty subsequence).
import collections

def max_sum_separated_by_k(iterable, k):
    best = collections.deque([0]*(k+1), k+1)
    for item in iterable:
        best.appendleft(max(item + best[-1], best[0]))
    return best[0]
This uses O(k) space and O(N) time. All of the deque operations, including appending a value to one end (and implicitly removing one from the other end so the length limit is maintained) and reading from the ends, are O(1).
If you want the algorithm to return the maximum subsequence (rather than only its sum), you can change the initialization of the deque to start with empty lists rather than 0, and then append max([item] + best[-1], best[0], key=sum) in the body of the loop. That will be quite a bit less efficient though, since it adds O(N) operations all over the place.
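As a quick sanity check, the function can be run against the examples from the question (the definition is repeated here so the snippet is self-contained):

```python
import collections

def max_sum_separated_by_k(iterable, k):
    # Rolling window of the k+1 best sums; best[-1] is the best sum whose
    # last used element is at least k positions behind the current one.
    best = collections.deque([0] * (k + 1), k + 1)
    for item in iterable:
        best.appendleft(max(item + best[-1], best[0]))
    return best[0]

print(max_sum_separated_by_k([5, 6, 4, 1, 2], 1))   # 11 (5+4+2)
print(max_sum_separated_by_k([5, 6, 4, 1, 2], 2))   # 8  (6+2)
print(max_sum_separated_by_k([5, 3, 4, 10, 2], 1))  # 15 (5+10)
```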
Not sure about the complexity, but coding efficiency landed me with
max([sum(l[i::j]) for j in range(k+1, len(l)+1) for i in range(len(l))])
(I've replaced the list variable by l so as not to shadow a built-in.) Note that the stride has to start at k+1 to leave k elements between picks, and that this only considers evenly spaced selections, so it can miss optima that use varying gaps.
My partner in a summative for HS gave me this algorithm; I was hoping somebody could tell me if there is a more elegant way of coding this.
CB is the current board position (global); it's a list of lists.
for a in xrange(0, 3):
    for b in xrange(0, 3):
        for j in xrange(1, 4):
            for k in xrange(1, 4):
                boxsum += CB[3a + j][3b + k]
                if not(boxsum == 45):
                    return False
                boxsum = 0
First, the following code is not indented correctly:
if not(boxsum == 45):
    return False
boxsum = 0
(with the current indentation it will always fail on the first time this code is executed)
Second, in the following line:
boxsum += CB[3a + j][3b + k]
you probably meant to do:
boxsum += CB[3*a + j][3*b + k]
And last, in order to check a 3x3 part of sudoku game it is not enough to check the sum - you should also check that every number between 1-9 is present (or in other words, that all the numbers are in the range 1-9 and there is no number that appears more than once).
There are dozens of "cleaner" ways to do so.
First of all, why not use numpy for matrices, since you are obviously working with a matrix? I am assuming 0-based indexing here (your numbering is a bit odd; why do you start numbering from 1?):
import numpy as np

CB = np.array(CB)

def constraint3x3check(CB):
    return all(CB[3*a:3*a+3, 3*b:3*b+3].sum() == 45 for a in range(3) for b in range(3))
A box sum of 45 does not mean that all of the numbers 1-9 are present.
You could, for example, add the numbers to a set and check that the length of the set is 9.
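A minimal sketch of that set-based check (function name is mine, not from the answer):

```python
def valid_box_set(box):
    # Flatten the 3x3 box and compare against the full set {1, ..., 9};
    # this catches both out-of-range values and duplicates.
    return {n for row in box for n in row} == set(range(1, 10))

print(valid_box_set([[4, 2, 3], [1, 5, 9], [8, 7, 6]]))  # True
print(valid_box_set([[5, 5, 5], [5, 5, 5], [5, 5, 5]]))  # False, despite summing to 45
```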
Since the sum 45 does not mean the answer is correct, necessarily, a different way is needed. Personally, I would join the rows into a single list and compare them to the list (1,2,...9), e.g.
# assuming this is your format...
box = [[4,2,3],[1,5,9],[8,7,6]]

def valid_box(box):
    check_list = []
    for row in box:
        check_list += row
    return list(range(1,10)) == sorted(check_list)
Although the code creating the list could also be done with list comprehension (I have no idea which one is more efficient, processor-wise)
def valid_box2(box):
    return list(range(1,10)) == sorted([item for row in box for item in row])
Merge list code taken from Making a flat list out of list of lists in Python