How can I optimize the following Python code to prevent a time limit exceeded error? - python

Hi everybody. I wrote the following code. Please help me optimize it: when I submit it, the judge reports time-limit-exceeded on some test cases (2.069 s / 13.33 MB).
import math
N = int(input())
arr = [None]*N; new_list = []
stepen = 0; res = .0;
arr = input().split(" ")
arr = [float(h) for h in arr]
Q = int(input())
for j in range(Q):
    x, y = input().split()
    new_list.extend([int(x), int(y)])
for i, j in zip(new_list[0::2], new_list[1::2]):
    stepen = (j - i)+ 1
    res = math.prod(arr[i:j+1])
    print(pow(res, 1./stepen))

The slowest thing in your algorithm is the math.prod(arr[i:j+1]). If all the x and y inputs denote the entire range, you will surely TLE, as the calls to prod must loop over the entire range.
In order to avoid this, you must do a prefix product on your array. The idea is this: Keep a second array pref, with the property that pref[i] = arr[i] * pref[i-1]. As a result, pref[i] will be the product of everything at the ith position and before in arr.
Then to find the product of positions i through j (inclusive), you want pref[j] / pref[i-1], or just pref[j] when i is 0. See if you can figure out why this gives the correct answer.
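The prefix-product idea can be sketched as follows. This is a minimal illustration, not a drop-in submission: the function names are made up for the example, the extra leading 1.0 in pref is a convenience of mine that avoids the special case for i = 0 (so the product of arr[i..j] becomes pref[j+1] / pref[i]), and it assumes the values are nonzero and that the running product neither overflows nor underflows a float.

```python
def build_prefix_products(arr):
    # pref[k] holds the product of the first k elements, so pref[0] == 1.0
    pref = [1.0]
    for value in arr:
        pref.append(pref[-1] * value)
    return pref

def geometric_mean(pref, i, j):
    # product of arr[i..j] (inclusive) in O(1), then the (j - i + 1)-th root
    count = j - i + 1
    return (pref[j + 1] / pref[i]) ** (1.0 / count)

arr = [1.0, 2.0, 4.0, 8.0]
pref = build_prefix_products(arr)
print(geometric_mean(pref, 1, 3))  # geometric mean of 2, 4 and 8
```

Building pref costs one pass over the array, and every query afterwards is constant time. If the products can get very large or very small, summing math.log of the values in the same prefix fashion and exponentiating the averaged difference is a common way to stay within float range.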

Related

Implementing Merge Sort algorithm

def merge(arr,l,m,h):
    lis = []
    l1 = arr[l:m]
    l2 = arr[m+1:h]
    while((len(l1) and len(l2)) is not 0):
        if l1[0]<=l2[0]:
            x = l1.pop(0)
        else:
            x = l2.pop(0)
        lis.append(x)
    return lis
def merge_sort(arr,l,h):
    if l<h:
        mid = (l+h)//2
        merge_sort(arr,l,mid)
        merge_sort(arr,mid+1,h)
        arr = merge(arr,l,mid,h)
    return arr
arr = [9,3,7,5,6,4,8,2]
print(merge_sort(arr,0,7))
Can anyone please point out where my approach is going wrong?
I get only [6,4,8] as the answer. I'm trying to understand the algorithm and implement the logic my own way. Please help.
Several issues:
As you consider h to be the last index of the sublist, then realise that when slicing a list, the second index is the one following the intended range. So change this:
Wrong               Right
l1 = arr[l:m]       l1 = arr[l:m+1]
l2 = arr[m+1:h]     l2 = arr[m+1:h+1]
As merge returns the result for a sub list, you should not assign it to arr. arr is supposed to be the total list, so you should only replace a part of it:
arr[l:h+1] = merge(arr,l,mid,h)
As the while loop requires that both lists are not empty, you should still consider the case where after the loop one of the lists is still not empty: its elements should be added to the merged result. So replace the return statement to this:
return lis + l1 + l2
It is not advised to compare integers with is or is not, which you do in the while condition. In fact that condition can be simplified to this:
while l1 and l2:
With these changes (and correct indentation) it will work.
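For reference, here is a sketch of the question's code with only these four changes applied. It keeps the original convention that h is the last index of the range, and it still uses pop(0).

```python
def merge(arr, l, m, h):
    lis = []
    l1 = arr[l:m + 1]        # slice end is exclusive, so go one past m
    l2 = arr[m + 1:h + 1]    # same for h
    while l1 and l2:         # loop while both lists are non-empty
        if l1[0] <= l2[0]:
            x = l1.pop(0)
        else:
            x = l2.pop(0)
        lis.append(x)
    return lis + l1 + l2     # append whatever is left in either list

def merge_sort(arr, l, h):
    if l < h:
        mid = (l + h) // 2
        merge_sort(arr, l, mid)
        merge_sort(arr, mid + 1, h)
        arr[l:h + 1] = merge(arr, l, mid, h)  # replace only the merged part
    return arr

arr = [9, 3, 7, 5, 6, 4, 8, 2]
print(merge_sort(arr, 0, len(arr) - 1))  # [2, 3, 4, 5, 6, 7, 8, 9]
```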
Further remarks:
This implementation is not efficient. pop(0) has O(n) time complexity. Use indexes that you update during the loop, instead of really extracting the values out of the lists.
It is more pythonic to let h and m be the indices after the range that they close, instead of them being the indices of the last elements within the range they close. So if you go that way, then some of the above points will be resolved differently.
Corrected implementation
Here is your code adapted using all of the above remarks:
def merge(arr, l, m, h):
    lis = []
    i = l
    j = m
    while i < m and j < h:
        if arr[i] <= arr[j]:
            x = arr[i]
            i += 1
        else:
            x = arr[j]
            j += 1
        lis.append(x)
    return lis + arr[i:m] + arr[j:h]
def merge_sort(arr, l, h):
    if l < h - 1:
        mid = (l + h) // 2
        merge_sort(arr, l, mid)
        merge_sort(arr, mid, h)
        arr[l:h] = merge(arr, l, mid, h)
    return arr
arr = [9, 3, 7, 5, 6, 4, 8, 2]
print(merge_sort(arr,0,len(arr)))

(python) Can you please tell me what is the problem in the code below

I just started learning Python and I have a problem:
arr = [1,3,3,3,0,1,1]
def solution(arr):
    a=[]
    for r in range(len(arr)-1):
        if arr[r] == arr[r+1]:
            a.append(r+1)
    print(a)
    for i in range(len(a)):
        k = int(a[i])
        arr[k] = -1
        arr.remove(-1)
    return arr
I get this error message:
IndexError: list index out of range (for the line arr[k] = -1)
Can you please tell me the reason for the Error and correct it?
Of course, it results in a runtime exception. The list a stores indices into the original arr. For each index v in a, you set arr[v] to -1 and then remove that element, which shrinks arr by one every time. So, in a later iteration, v can be greater than or equal to the current length of arr, and that raises IndexError: list index out of range.
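A short trace makes the shrinking visible. This is only an illustration: the name stale_indices is mine, and its values are the indices that the first loop of the question computes for this input.

```python
arr = [1, 3, 3, 3, 0, 1, 1]
stale_indices = [2, 3, 6]  # computed against the original 7-element list

for k in stale_indices:
    if k >= len(arr):
        # this is the point where the original code raises IndexError
        print(f"index {k} is out of range for a list of length {len(arr)}")
        break
    arr[k] = -1
    arr.remove(-1)  # removes the marker, so the list gets one element shorter
    print(k, arr)
```

After two removals the list has only five elements, so the third index, 6, no longer exists.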
Your code, corrected:
arr = [1,3,3,3,0,1,1]
def solution(arr):
    a=[]
    for r in range(len(arr)-1):
        if arr[r] == arr[r+1]:
            a.append(r+1)
    print(a)
    c = 0
    for i in range(len(a)):
        k = int(a[i])
        arr[k - c] = -1
        arr.remove(-1)
        c += 1
    return arr
print(solution(arr))
It looks like you are trying to remove consecutive duplicates from the list. This can be easily solved using the following code.
def remove_duplicates(arr):
    stack = [arr[0]]
    for i in range(1, len(arr)):
        if stack[-1] != arr[i]:
            stack.append(arr[i])
    return stack
print(remove_duplicates([1,3,3,3,0,1,1]))
In short, you cannot change the length of the list while you are still indexing into it with indices that were computed from the unmodified list.
Here is something that you might be looking for:
def solution(arr):
    a = []
    for r in range(len(arr) - 1):
        if arr[r] == arr[r + 1]:
            a.append(r + 1)
    print(a)
    for i in range(len(a)):
        k = int(a[i])
        arr[k] = -1
        # In the following line, you cannot modify the array length
        # when you have already computed the indices based on the unmodified array
        # arr.remove(-1)
    arr = [x for x in arr if x != -1] # This is a better way to deal with it
    return arr
print(solution(arr=[1, 3, 3, 3, 0, 1, 1]))
You don’t want to mess with the original list while you are indexing into it. Otherwise you’ll run into index errors. An index error means the position you were looking for in the list no longer exists. Most likely this line was the culprit: arr.remove(-1).
arr = [1,3,3,3,0,1,1]
solution = []
for i, v in enumerate(arr):
    if i == 0 or v != arr[i -1]:
        solution.append(v)
print(solution)
This should get you what you are after. enumerate tells you what index you are at when looping through the list. More information can be found here: https://realpython.com/python-enumerate/
Well, you probably already know what went wrong here: the element is removed inside the loop:
for i in range(len(a)):
    k = int(a[i])
    arr[k] = -1
    arr.remove(-1)
You can fix the whole thing by replacing that line with a filter + lambda expression, placed not inside the loop but after the loop has finished, as follows:
for i in range(len(a)):
    k = int(a[i])
    arr[k] = -1
arr = list(filter(lambda x: x != -1, arr))
And you'll get what you want from your own solution!

Creating multiple matrices with nested loop using numpy

import numpy as np
import random
x = np.linspace(-3.0,3.0,25)
b = 3; a = -3; n = 24
N = np.matrix(np.zeros(shape=(4,24)));
G = [] #want to save the 2 4by 24 matrices in G
s = []; g = []
Y = []
random.seed(4)
for j in range(0,2):
    Y.append(np.random.randint(-6.0,6.0,size=(4,1)))
    h = (b-a)/float(n)
    s.append(0.5*h*((1+(np.cos((np.pi*(a-Y[j]))/3.0)))))
    g.append(0.5*h*(1+(np.cos((np.pi*(b-Y[j]))/3.0))))
    for k in range(0,Y[j].shape[0]):
        for l in range(1, x.shape[0]-1):
            N[k,l] = h*(1 + (np.cos((np.pi*(x[l]-Y[j][k]))/3.0)))
        N[k,0] = s[j][k]
    N = np.concatenate((N,g[j]),axis=1)
print(N)
Please, I need help. When I run this code, it produces just a single 4-by-25 matrix, but it is supposed to produce two 4-by-25 matrices, and I don't know why. My goal is to have the two 4-by-25 matrices stored in the variable G, so that G[0] gives the first 4-by-25 matrix and G[1] gives the second. Here Y holds two 4-by-1 column vectors.
How is your code supposed to append 2 matrices to G? You are totally missing that part.
I don't really get what values you're looking for, so I can't tell you whether the values are computed correctly. In any case, you should add this line:
G.append(N)
(I'm just assuming you want to append N because it is the only 4-by-24 matrix.)
Add it before the end of the first (outer) loop; the result should be something like:
for j in range(0,2):
    Y.append(np.random.randint(-6.0,6.0,size=(4,1)))
    h = (b-a)/float(n)
    s.append(0.5*h*((1+(np.cos((np.pi*(a-Y[j]))/3.0)))))
    g.append(0.5*h*(1+(np.cos((np.pi*(b-Y[j]))/3.0))))
    for k in range(0,Y[j].shape[0]):
        for l in range(1, x.shape[0]-1):
            N[k,l] = h*(1 + (np.cos((np.pi*(x[l]-Y[j][k]))/3.0)))
        N[k,0] = s[j][k]
    N = np.concatenate((N,g[j]),axis=1)
    G.append(N)
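One thing to watch, if I read the loop correctly: N is created once, outside the loop, and np.concatenate makes it one column wider on every pass, so the matrices collected in G would not be two independent 4-by-25 matrices. A minimal sketch that sidesteps this, assuming the goal really is two independent 4-by-25 matrices, builds a fresh array on each pass with broadcasting. The use of np.random.default_rng and of plain arrays instead of np.matrix are my assumptions, not part of the original code. Note also that random.seed(4) in the question seeds Python's random module, not NumPy's generator.

```python
import numpy as np

a, b, n = -3.0, 3.0, 24
x = np.linspace(a, b, n + 1)   # 25 grid points from a to b
h = (b - a) / n

rng = np.random.default_rng(4)  # assumption: seed NumPy's own generator
G = []
for _ in range(2):
    Y = rng.integers(-6, 6, size=(4, 1))         # 4x1 column of integer shifts
    N = h * (1 + np.cos(np.pi * (x - Y) / 3.0))  # broadcasts to shape (4, 25)
    N[:, 0] *= 0.5    # first column carries the 0.5 factor (the s term)
    N[:, -1] *= 0.5   # last column carries the 0.5 factor (the g term)
    G.append(N)       # N is a new array on every pass

print(G[0].shape, G[1].shape)
```

Because x has shape (25,) and Y has shape (4, 1), the expression x - Y yields a 4-by-25 array directly, which replaces the two inner loops.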

List Comprehensions to create pairwise dissimilarity

I'm not familiar with list comprehensions, but I would like to compute the Bray-Curtis dissimilarity using list comprehensions. The dissimilarity is given by
def bray(x):
    bray_diss = np.zeros((x.shape[0], x.shape[0]))
    for i in range(0, bray_diss.shape[0]):
        bray_diss[i,i] = 0
        for j in range(i+1, bray_diss.shape[0]):
            l1_diff = abs(x[i,:] - x[j,:])
            l1_sum = x[i,:] + x[j,:] + 1
            bray_diss[i,j] = l1_diff.sum() / l1_sum.sum()
            bray_diss[j,i] = bray_diss[i,j]
    return bray_diss
I tried something like:
def bray(x):
    [[((abs(x[i,:] - x[j,:])).sum() / (x[i,:] + x[j,:] + 1).sum()) for j in range(0, x.shape[0])] for i in range(0, x.shape[0])]
without success, and I can't figure out what is wrong! Moreover, in the first implementation, the inner loop only runs over the rows after i to save computation time. How can the same be done with a list comprehension?
Thanks!
You won't gain anything with a list comprehension... except a better comprehension of list comprehensions!
What you have to understand is that list comprehension is a functional concept. I will not go into the details of functional programming, but you have to keep in mind that functional programming forbids side effects. An example:
my_matrix = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        my_matrix[i,j] = value_of_cell(i,j)
The last line is a side effect: you modify the state of my_matrix. In contrast, a side-effect-free version would be:
np.array([[value_of_cell(i,j) for j in range(n)] for i in range(n)])
You don't have the "create-then-assign" sequence: you create the matrix by declaring the values at each position. More precisely, to create a matrix:
you have to declare a value for every cell;
when you are given the pair (i,j), you can't use it to declare the value of another cell (e.g. (j,i)).
(If you need to transform the matrix later, you have to recreate it. That's why this method may be expensive -- in time and space.)
Now, take a look at your code. When you write a list comprehension, a good rule of thumb is to use auxiliary functions, as they help to clean up the code (we are not trying to create a one-liner here):
def bray(x):
    n = x.shape[0] # cleaner than to repeat x.shape[0] everywhere
    def diss(i,j): # I hope it's correct
        l1_diff = abs(x[i,:] - x[j,:])
        l1_sum = x[i,:] + x[j,:] + 1
        return l1_diff.sum() / l1_sum.sum()
    bray_diss = np.zeros((n, n))
    for i in range(n): # range(n) = range(0,n)
        # bray_diss[i,i] = 0 <-- you don't need to set it to zero here
        for j in range(i+1, n):
            bray_diss[i,j] = diss(i,j)
            bray_diss[j,i] = bray_diss[i,j]
    return bray_diss
That's cleaner. What is the next step? In the code above, you choose to iterate over j that are greater than i and to set two values at once. But in a list comprehension, you don't choose the cells: the list comprehension gives you, for each cell, the coordinates and you have to declare the values.
First, let's try to set only one value per iteration, that is to use two loops:
def bray(x):
    ...
    bray_diss = np.zeros((n, n))
    for i in range(n):
        for j in range(i+1, n):
            bray_diss[i,j] = diss(i,j)
    for i in range(n):
        for j in range(i):
            bray_diss[i,j] = bray_diss[j,i]
    return bray_diss
That's better. Second, we need to assign a value to every cell of the matrix, not just prefill it with zeroes and choose the cells we want to update:
def bray(x):
    ...
    bray_diss = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if j>i: # j in range(i+1, n)
                bray_diss[i,j] = diss(i,j) # top right corner
            else: # j in range(i+1)
                bray_diss[i,j] = 0. # zeroes in the bottom left corner + diagonal
    for i in range(n):
        for j in range(n):
            if j<i: # j in range(i)
                bray_diss[i,j] = bray_diss[j,i] # fill the bottom left corner now
            else: # j in range(i, n)
                bray_diss[i,j] = bray_diss[i,j] # top right corner + diagonal is already ok
    return bray_diss
A short version would be, using the "fake ternary conditional operator" of Python:
def bray(x):
    ...
    bray_diss = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            bray_diss[i,j] = diss(i,j) if j>i else 0.
    for i in range(n):
        for j in range(n):
            bray_diss[i,j] = bray_diss[j,i] if j<i else bray_diss[i,j]
    return bray_diss
Now we can turn this into list comprehensions:
def bray(x):
    ...
    bray_diss_top_right = np.array([[diss(i,j) if j>i else 0. for j in range(n)] for i in range(n)])
    bray_diss = np.array([[bray_diss_top_right[j,i] if j<i else bray_diss_top_right[i,j] for j in range(n)] for i in range(n)])
    return bray_diss
And, if I'm not wrong, it is even simpler like this (final version):
def bray(x):
    n = x.shape[0]
    def diss(i,j):
        l1_diff = abs(x[i,:] - x[j,:])
        l1_sum = x[i,:] + x[j,:] + 1
        return l1_diff.sum() / l1_sum.sum()
    bray_diss_top_right = np.array([[diss(i,j) if j>i else 0. for j in range(n)] for i in range(n)])
    return bray_diss_top_right + bray_diss_top_right.transpose()
Note that this version is probably (I didn't measure) slower than yours, but the way the matrix is built is, in my opinion, easier to grasp.
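A quick way to gain confidence in the comprehension-based version is to compare it against the loop-based one on a small input. The sketch below restates both (the names bray_loops and bray_comprehension and the sample matrix are mine, chosen for the example) and checks that they agree.

```python
import numpy as np

def bray_loops(x):
    # loop version: fill the upper triangle and mirror it
    n = x.shape[0]
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            out[i, j] = np.abs(x[i] - x[j]).sum() / (x[i] + x[j] + 1).sum()
            out[j, i] = out[i, j]
    return out

def bray_comprehension(x):
    # comprehension version: declare every cell, then add the transpose
    n = x.shape[0]
    def diss(i, j):
        return np.abs(x[i] - x[j]).sum() / (x[i] + x[j] + 1).sum()
    top_right = np.array([[diss(i, j) if j > i else 0.0 for j in range(n)]
                          for i in range(n)])
    return top_right + top_right.T

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 0.0, 1.0],
              [2.0, 2.0, 2.0]])
print(np.allclose(bray_loops(x), bray_comprehension(x)))  # True
```

Both versions call the dissimilarity only for pairs with j > i, so the comprehension keeps the saving from the original inner loop even though it visits every cell.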

Query long lists

I would like to query the value of an exponentially weighted moving average at particular points. An inefficient way to do this is as follows. l is the list of times of events and queries has the times at which I want the value of this average.
a=0.01
l = [3,7,10,20,200]
y = [0]*1000
for item in l:
    y[int(item)]=1
s = [0]*1000
for i in xrange(1,1000):
    s[i] = a*y[i-1]+(1-a)*s[i-1]
queries = [23,68,103]
for q in queries:
    print s[q]
Outputs:
0.0355271185019
0.0226018371526
0.0158992102478
In practice l will be very large and the range of values in l will also be huge. How can you find the values at the times in queries more efficiently, and especially without computing the potentially huge lists y and s explicitly? I need it to be in pure Python so I can use PyPy.
Is it possible to solve the problem in time proportional to len(l) and not max(l) (assuming len(queries) < len(l))?
Here is my code for doing this:
def ewma(l, queries, a=0.01):
    def decay(t0, x, t1, a):
        from math import pow
        return pow((1-a), (t1-t0))*x
    assert l == sorted(l)
    assert queries == sorted(queries)
    samples = []
    try:
        t0, x0 = (0.0, 0.0)
        it = iter(queries)
        q = it.next()-1.0
        for t1 in l:
            # new value is decayed previous value, plus a
            x1 = decay(t0, x0, t1, a) + a
            # take care of all queries between t0 and t1
            while q < t1:
                samples.append(decay(t0, x0, q, a))
                q = it.next()-1.0
            # take care of all queries equal to t1
            while q == t1:
                samples.append(x1)
                q = it.next()-1.0
            # update t0, x0
            t0, x0 = t1, x1
        # take care of any remaining queries
        while True:
            samples.append(decay(t0, x0, q, a))
            q = it.next()-1.0
    except StopIteration:
        return samples
I've also uploaded a fuller version of this code with unit tests and some comments to pastebin: http://pastebin.com/shhaz710
EDIT: Note that this does the same thing as what Chris Pak suggests in his answer, which he must have posted as I was typing this. I haven't gone through the details of his code, but I think mine is a bit more general. This code supports non-integer values in l and queries. It also works for any kind of iterables, not just lists since I don't do any indexing.
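The code above is Python 2 (it relies on it.next()). For readers on Python 3, here is a rough port of the same idea, restructured to loop over the queries instead of using StopIteration for control flow. The name ewma_at is mine, and the sketch assumes that both inputs are already sorted.

```python
def ewma_at(events, queries, a=0.01):
    """Return the EWMA at each query time; both inputs must be sorted."""
    samples = []
    t0, x0 = 0.0, 0.0                   # time and value at the last event seen
    event_iter = iter(events)
    next_event = next(event_iter, None)
    for query in queries:
        q = query - 1.0                 # same one-step shift as in the code above
        # fold in every event that happened at or before q
        while next_event is not None and next_event <= q:
            x0 = x0 * (1 - a) ** (next_event - t0) + a
            t0 = next_event
            next_event = next(event_iter, None)
        # decay the last known value forward to the query time
        samples.append(x0 * (1 - a) ** (q - t0))
    return samples

print(ewma_at([3, 7, 10, 20, 200], [23, 68, 103]))
```

For the data in the question this prints values of about 0.0355, 0.0226 and 0.0159, in line with the output of the brute-force loop. The work is proportional to len(events) + len(queries), independent of how large the time values are.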
I think you could answer each query in O(log(len(l))) time, if l is sorted. The basic idea is that the non-recursive form of the EMA is a*s_i + (1-a)^1 * s_(i-1) + (1-a)^2 * s_(i-2) + ...
This means that for query k, you find the greatest number in l less than k and, up to an estimation limit, sum terms of the following form, where v is the index in l and l[v] is the value:
(1-a)^(k-v) * l[v] + ...
Then you spend O(log(len(l))) time in the search, plus a constant multiple for the depth of your estimation. I'll provide a code sample in a little bit (after work) if you want it; I just wanted to get my idea out there while I was thinking about it.
Here's the code.
v is the dictionary of values at a given time; replace it with 1 if it's just a 1 every time...
import math
from bisect import bisect_right
a = .01
limit = 1000
l = [1,5,14,29...]
def find_nearest_lt(l, time):
    # index of the rightmost element of l that is <= time
    i = bisect_right(l, time)
    if i:
        return i-1
    raise ValueError
def find_ema(l, time):
    i = find_nearest_lt(l, time)
    if l[i] == time:
        result = a * v[l[i]]
        i -= 1
    else:
        result = 0
    # stop at the start of the list so that i never wraps around to negative indices
    while i >= 0 and (time-l[i]) < limit:
        result += math.pow(1-a, time-l[i]) * v[l[i]]
        i -= 1
    return result
If I'm thinking correctly, find_nearest_lt is O(log n), and the while loop runs at most 1000 iterations (the value of limit) when the times are distinct integers, so it's technically a constant (though a fairly large one). find_nearest_lt was adapted from the documentation page for bisect: http://docs.python.org/2/library/bisect.html
It appears that y is a binary value -- either 0 or 1 -- depending on the values of l. Why not use y = set(int(item) for item in l)? That's the most efficient way to store and look up a list of numbers.
Your code has a bug on the first pass through this loop:
s = [0]*1000
for i in xrange(1000):
    s[i] = a*y[i-1]+(1-a)*s[i-1]
because i-1 is -1 when i=0 (first pass of the loop), and both y[-1] and s[-1] refer to the last element of the list, not the previous one. Maybe you want xrange(1,1000)?
How about this code:
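a=0.01
l = [3.0,7.0,10.0,20.0,200.0]
y = set(int(item) for item in l)
queries = [23,68,103]
ewma = []
# x starts as s[1], which is a if there is an event at time 0
x = a if (0 in y) else 0
for i in xrange(1, queries[-1]):
    # after this update x equals s[i+1] from the question
    x = (1-a)*x
    if i in y:
        x += a
    # so the value for query q is available when i == q - 1
    if i == queries[0] - 1:
        ewma.append(x)
        queries.pop(0)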
When it's done, ewma should have the moving averages for each query point.
Edited to include SchighSchagh's improvements.
