How to optimize the following python function to execute faster?

The following code does semi-fixed coding using mid-short. Encoding works fine (it takes about 2 seconds to execute), but decoding takes about 16 seconds. Only the decoding process is shown here, and the code inside the 'Main' block is just an example. Is there a way to make the code below faster?
from math import ceil, floor, log2
def semi_fixed(stream, parent):
    code_len = ceil(log2(parent + 1))
    boundary = 2 ** code_len - parent - 1 # short_code_num
    # print('Code_len: ', code_len, 'Boundary: ', boundary)
    low = floor(parent / 2) - floor(boundary / 2)
    high = floor(parent / 2) + floor(boundary / 2) + 1
    if parent % 2 == 0:
        low -= 1
    bits = stream[-code_len+1::] # First read short code from last
    data = int(str(bits), 2)
    if low >= data or data >= high:
        bits = stream[-code_len::]
        data = int(str(bits), 2)
    else:
        code_len -= 1 # To balance the length in recursive call
    return data, code_len
if __name__ == '__main__':
    encoded_data = '011010101011011110001010'
    decoded_data = [15]
    count = 0
    while len(decoded_data) <23:
        if decoded_data[count] == 0:
            decoded_data.append(0)
            decoded_data.append(0)
            count += 1
            continue
        else:
            node, bit_len = semi_fixed(encoded_data, decoded_data[count])
            decoded_data.append(node)
            decoded_data.append(decoded_data[count] - node)
            encoded_data = encoded_data[:-bit_len]
            print(encoded_data)
            count +=1
            print(decoded_data)
The semi-fixed method reads the encoded data from the right and decides how many bits to decode at each step. The process continues until the decoded list reaches a given length. Here the length and the first decoded value are hard-coded. The output of the code above is shown below (this is just a small example that takes less than a second):
01101010101101111000
[15, 10, 5]
0110101010110111
[15, 10, 5, 8, 2]
01101010101101
[15, 10, 5, 8, 2, 3, 2]
01101010101
[15, 10, 5, 8, 2, 3, 2, 5, 3]
0110101010
[15, 10, 5, 8, 2, 3, 2, 5, 3, 1, 1]
01101010
[15, 10, 5, 8, 2, 3, 2, 5, 3, 1, 1, 2, 1]
011010
[15, 10, 5, 8, 2, 3, 2, 5, 3, 1, 1, 2, 1, 2, 0]
0110
[15, 10, 5, 8, 2, 3, 2, 5, 3, 1, 1, 2, 1, 2, 0, 2, 3]
01
[15, 10, 5, 8, 2, 3, 2, 5, 3, 1, 1, 2, 1, 2, 0, 2, 3, 2, 1]
0
[15, 10, 5, 8, 2, 3, 2, 5, 3, 1, 1, 2, 1, 2, 0, 2, 3, 2, 1, 1, 0]
[15, 10, 5, 8, 2, 3, 2, 5, 3, 1, 1, 2, 1, 2, 0, 2, 3, 2, 1, 1, 0, 0, 1]

I can get a 30% speedup by using integer and bitwise operations for the setup at the top of semi_fixed:
code_len = parent.bit_length()
boundary = ((1 << code_len) - parent - 1) // 2 # short_code_num
odd = parent & 1
parent //= 2
low = parent - boundary + odd - 1
high = parent + boundary + 1
Not much yet, but something.
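Dropped into the full function, the integer version might look like the sketch below. The name semi_fixed_fast is made up for illustration, and the rest of the body is simply the question's code with the redundant str() calls removed. It has only been checked against the first few steps of the example in the question.

```python
def semi_fixed_fast(stream, parent):
    """Integer-only variant of semi_fixed (hypothetical name, same return shape)."""
    code_len = parent.bit_length()                      # equals ceil(log2(parent + 1))
    half_boundary = ((1 << code_len) - parent - 1) // 2
    odd = parent & 1
    half = parent // 2
    low = half - half_boundary + odd - 1
    high = half + half_boundary + 1
    # Try the short code first, reading from the right end of the stream.
    data = int(stream[-code_len + 1:], 2)
    if low >= data or data >= high:
        # Out of the short-code range: read the full-length code instead.
        data = int(stream[-code_len:], 2)
    else:
        code_len -= 1
    return data, code_len


print(semi_fixed_fast('011010101011011110001010', 15))  # (10, 4)
```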

In the main function, instead of reassigning a shortened encoded_data on every iteration, I used indexing. The index marks how much of encoded_data has already been consumed and therefore which part to pass to the decoder. So, instead of using this:
node, bit_len = semi_fixed(encoded_data, decoded_data[count])
decoded_data.append(node)
decoded_data.append(decoded_data[count] - node)
encoded_data = encoded_data[:-bit_len]
print(encoded_data)
count +=1
I used the following:
node, bit_len = semi_fixed_decoder_QA(encoded_data[:-prev_bit_len], decoded_data[count])
decoded_data.append(node)
decoded_data.append(decoded_data[count] - node)
prev_bit_len += bit_len
Here, prev_bit_len is initialized to 1 and encoded_data is padded with one extra bit on the right, so that the slice [:-prev_bit_len] is valid on the first call (a slice ending at -0 would be empty).
This way decoding took almost the same time as encoding.
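A variant of the same idea is to pass the end position itself, so each call only slices out the few bits it needs. This is a sketch under assumptions: semi_fixed_at and decode are hypothetical names, the stream is assumed to be well-formed (enough bits remain for every code), and a parent of 1 is assumed to always use its single long-code bit, which is how the question's function behaves on non-empty input.

```python
def semi_fixed_at(stream, end, parent):
    """Decode the value whose code ends just before index `end` (hypothetical helper).

    Returns (value, bits_consumed).
    """
    code_len = parent.bit_length()
    half_boundary = ((1 << code_len) - parent - 1) // 2
    half = parent // 2
    low = half - half_boundary + (parent & 1) - 1
    high = half + half_boundary + 1
    short_len = code_len - 1
    if short_len:
        data = int(stream[end - short_len:end], 2)
        if low < data < high:
            return data, short_len
    # Short code not applicable: read the full-length code.
    return int(stream[end - code_len:end], 2), code_len


def decode(stream, root, total):
    """Rebuild the list of `total` values, starting from the hard-coded root value."""
    decoded = [root]
    end = len(stream)          # everything from `end` onwards is already consumed
    count = 0
    while len(decoded) < total:
        parent = decoded[count]
        if parent == 0:
            decoded += [0, 0]
        else:
            node, used = semi_fixed_at(stream, end, parent)
            decoded += [node, parent - node]
            end -= used
        count += 1
    return decoded


print(decode('011010101011011110001010', 15, 23))
```

The stream is never copied or shortened here; only the integer end moves left.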

Related

Renumber a sequence to remove gaps, but keep identical numbers

Assume an unordered list of numbers, with duplicates being allowed. I want to patch all gaps or sudden jumps in it. Some examples:
def renum(arr):
    # magic happens here
    pass
renum(np.array([1, 1, 1, 2, 2, 2])) # already in correct shape
> [1, 1, 1, 2, 2, 2]
renum(np.array([1, 1, 2, 2, 4, 4, 5, 5, 5])) # A jump between 2 and 4
> [1,1, 2, 2, 3, 3, 4, 4, 4]
renum(np.array([1, 1, 2, 2, 5, 2, 2])) # A forward and backward jump
> [1,1, 2, 2, 3, 4, 4]
Finding gaps is easy, but I have a hard time renumbering when a gap is followed by the same number several times and the sequence is processed elementwise. For example, the attempt below fails because numbers can occur many times:
def renum(arr):
    new_arr = np.zeros(len(arr))
    prev_num = new_arr[0]
    for idx, num in enumerate(arr):
        diff = num - prev_num
        if diff == 0 or diff == 1:
            new_arr[idx] = num
        else:
            new_arr[idx] = prev_num + 1
        prev_num = new_arr[idx]
    return new_arr
renum(np.array([1, 1, 2, 2, 4, 4, 5, 5, 5]))
> [1, 1, 2, 2, 3, 4, 5, 5, 5] # should actually be [1, 1, 2, 2, 3, 3, 4, 4, 4]
I also think this implementation is not very efficient.
Any ideas?

This seems to do the trick:
def renum(input_array):
    diff = np.diff(input_array)
    diff[diff != 0] = 1
    return np.hstack((input_array[0], diff)).cumsum()
If I understood correctly, you want the difference between consecutive values to stay 0 wherever it is 0 in the original array, and to become 1 wherever it is non-zero. This happens in the first two lines. Then the first original element is prepended to the new differences, and the cumulative sum of that gives the renumbered array.
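For readers who want to check the logic without NumPy, the same idea (turn every non-zero step into a step of 1, then take a running sum) can be sketched with itertools.accumulate. The name renum_plain is illustrative, and the function takes a plain list rather than an array.

```python
from itertools import accumulate


def renum_plain(values):
    """Pure-Python sketch of the diff/cumsum idea (hypothetical helper)."""
    if not values:
        return []
    # 1 wherever the value changes from one element to the next, 0 otherwise.
    steps = [int(b != a) for a, b in zip(values, values[1:])]
    # Running sum, starting from the first original value.
    return list(accumulate([values[0]] + steps))


print(renum_plain([1, 1, 2, 2, 4, 4, 5, 5, 5]))  # [1, 1, 2, 2, 3, 3, 4, 4, 4]
print(renum_plain([1, 1, 2, 2, 5, 2, 2]))        # [1, 1, 2, 2, 3, 4, 4]
```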

Finding the number of substrings which sum is equal to m

I'm trying out some Python and got stuck on this:
I have a string S='3,0,4,0,3,1,0,1,0,1,0,0,5,0,4,2' and m=9.
I want to know how many substrings there are whose sum equals m.
So with S and m above I would get 7 as a result:
'3,0,4,0,3,1,0,1,0,1,0,0,5,0,4,2'
_'0,4,0,3,1,0,1,0'_____________
_'0,4,0,3,1,0,1'_______________
___'4,0,3,1,0,1,0'_____________
____'4,0,3,1,0,1'_______________
____________________'0,0,5,0,4'_
______________________'0,5,0,4'_
_______________________'5,0,4'_
Now, the code I came up with does something like this:
def es1(S,m):
    c = 0
    M = 0
    ls = StringToInt(S)
    for x in ls:
        i= ls.index(x)
        for y in ls[i+1:]:
            M = x + y
            if M == m:
                c += 1
                M = 0
                break
            if M > m:
                M = 0
                break
            else:
                continue
    return c
def StringToInt(ls):
    s = [int(x) for x in ls.split(',')]
    return s
Where StringToInt obviously gives me a list of ints to work with.
What I don't get is where my approach goes wrong, since es1 returns 3.

You could use zip to progressively add numbers to a list of sums and count how many 9s you have at each pass:
S = '3,0,4,0,3,1,0,1,0,1,0,0,5,0,4,2'
m = 9
numbers = list(map(int,S.split(",")))
result = 0
sums = numbers
for i in range(len(numbers)):
    result += sums.count(m)
    sums = [a+b for a,b in zip(sums,numbers[i+1:]) ]
print(result)
For a more "functional programming" approach, you can use accumulate from itertools:
from itertools import accumulate
numbers = list(map(int,S.split(",")))
ranges = (numbers[i:] for i in range(len(numbers)))
sums = (accumulate(r) for r in ranges)
result = sum( list(s).count(m) for s in sums )
print(result)
To explain how this works, let's first look at the content of ranges, which are substrings from each position up to the end of the list:
[3, 0, 4, 0, 3, 1, 0, 1, 0, 1, 0, 0, 5, 0, 4, 2]
[0, 4, 0, 3, 1, 0, 1, 0, 1, 0, 0, 5, 0, 4, 2]
[4, 0, 3, 1, 0, 1, 0, 1, 0, 0, 5, 0, 4, 2]
[0, 3, 1, 0, 1, 0, 1, 0, 0, 5, 0, 4, 2]
[3, 1, 0, 1, 0, 1, 0, 0, 5, 0, 4, 2]
[1, 0, 1, 0, 1, 0, 0, 5, 0, 4, 2]
[0, 1, 0, 1, 0, 0, 5, 0, 4, 2]
[1, 0, 1, 0, 0, 5, 0, 4, 2]
[0, 1, 0, 0, 5, 0, 4, 2]
[1, 0, 0, 5, 0, 4, 2]
[0, 0, 5, 0, 4, 2]
[0, 5, 0, 4, 2]
[5, 0, 4, 2]
[0, 4, 2]
[4, 2]
[2]
When we take the cumulative sum of each row (sums), we obtain the total of the values starting at the position given by the row number, for a length given by the column number. For example, row 5, column 3 is the sum of 3 values starting at the fifth position (3 + 1 + 0 = 4):
[3, 3, 7, 7, 10, 11, 11, 12, 12, 13, 13, 13, 18, 18, 22, 24]
[0, 4, 4, 7, 8, 8, 9, 9, 10, 10, 10, 15, 15, 19, 21]
[4, 4, 7, 8, 8, 9, 9, 10, 10, 10, 15, 15, 19, 21]
[0, 3, 4, 4, 5, 5, 6, 6, 6, 11, 11, 15, 17]
[3, 4, 4, 5, 5, 6, 6, 6, 11, 11, 15, 17]
[1, 1, 2, 2, 3, 3, 3, 8, 8, 12, 14]
[0, 1, 1, 2, 2, 2, 7, 7, 11, 13]
[1, 1, 2, 2, 2, 7, 7, 11, 13]
[0, 1, 1, 1, 6, 6, 10, 12]
[1, 1, 1, 6, 6, 10, 12]
[0, 0, 5, 5, 9, 11]
[0, 5, 5, 9, 11]
[5, 5, 9, 11]
[0, 4, 6]
[4, 6]
[2]
In this triangular matrix each position corresponds to the sum of one of the possible substrings. We simply need to count the number of 9s in there to get the result.
The above solutions run in O(N^2) time but, if you are concerned with performance, there is a way to obtain the result in O(N) time using a dictionary. Rather than building all the sub-arrays as above, keep a count of how many prefixes add up to each running total. Then, at each position, look up how many earlier prefixes have a total that is exactly m less than the current one; that is the number of substrings ending at that position.
from itertools import accumulate
from collections import Counter
numbers = map(int,S.split(","))
result = 0
sums = Counter([0])
for s in accumulate(numbers):
    result += sums[s-m]
    sums[s] += 1
print(result)
Note that all these solutions support negative numbers in the list as well as a negative or zero target.
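Wrapped in a function, the dictionary approach can be compared against a brute-force count, including a case with negative numbers. The function names below are illustrative, not from the answer.

```python
from collections import Counter
from itertools import accumulate


def count_sublists_with_sum(numbers, target):
    """Count contiguous sublists summing to target using prefix-sum counts."""
    seen = Counter([0])            # the empty prefix has sum 0
    total = 0
    for prefix in accumulate(numbers):
        # Every earlier prefix equal to (prefix - target) closes one sublist here.
        total += seen[prefix - target]
        seen[prefix] += 1
    return total


def count_brute_force(numbers, target):
    """Reference count that sums every non-empty slice; quadratic number of slices."""
    n = len(numbers)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n + 1)
        if sum(numbers[i:j]) == target
    )


numbers = [int(x) for x in '3,0,4,0,3,1,0,1,0,1,0,0,5,0,4,2'.split(',')]
print(count_sublists_with_sum(numbers, 9))            # 7
print(count_sublists_with_sum([1, -1, 1, -1], 0))     # 4
```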

As mentioned by others, your code only looks at sums of pairs of elements from the list. You need to look at sublists.
Here is a sliding-window solution. It scans through the list once, with a short inner loop that handles trailing zeros, so it is close to O(n) when runs of zeros are short. It assumes the numbers are non-negative and m is positive:
def es2(s, m):
    s = string_to_int(s)
    c = 0
    # index of left of sub-list
    left = 0
    # index of right of sub-list
    right = 0
    # running total of sublist sum
    current_sum = 0
    while True:
        # if the sub-list has the correct sum
        if current_sum == m:
            # add as many zeros on the end as works
            temp_current_sum = current_sum
            for temp_right in range(right, len(s) + 1):
                if temp_current_sum == m:
                    c += 1
                    if temp_right<len(s):
                        temp_current_sum += s[temp_right]
                else:
                    break
        if current_sum >= m:
            # move the left end along and update running total
            current_sum -= s[left]
            left += 1
        else:
            # move the right end along and update running total
            if right == len(s):
                # if the end of the list is reached, exit
                return c
            current_sum += s[right]
            right += 1
def string_to_int(ls):
    s = [int(x) for x in ls.split(',')]
    return s
if __name__ == '__main__':
    print(es2('3,0,4,0,3,1,0,1,0,1,0,0,5,0,4,2', 9))

This is the code you are looking for. I felt that iterating by position was a better fit for this problem, so I did that and it worked.
def es1(S,m):
    c = 0
    M = 0
    ls = StringToInt(S)
    for i in range(0, len(ls)):
        M = 0
        for x in range(i, len(ls)):
            M += ls[x]
            if M == m:
                c += 1
            elif M >= m:
                # stop early: with non-negative numbers the sum cannot come back down to m
                break
    return c
def StringToInt(ls):
    s = [int(x) for x in ls.split(',')]
    return s
print(es1("3,0,4,0,3,1,0,1,0,1,0,0,5,0,4,2", 9))
OUTPUT:
7

Your code counts how many pairs of numbers in the string S add up to m, while you actually want to test all possible substrings.
You could do something like:
numbers = [3,0,4,0,3,1,0,1,0,1,0,0,5,0,4,2]
m = 9
c = 0
for i in range(len(numbers)):
    # j is the sublist length: start at 1 and include the full remaining length
    for j in range(1, len(numbers)-i+1):
        total = 0
        for k in numbers[i:i+j]:
            total += k
        if total == m:
            c += 1
print(c)
Output:
7

EDIT: this code actually counts all possible subsets, not sublists. I am going to leave it here in case it is helpful to anyone who visits this question.
This code gets every solution. Inside es1(), the result variable is a huge list of tuples holding all the possible solutions.
import itertools
def es1(S,m):
    numbers = StringToInt(S)
    result = [seq for i in range(len(numbers), 0, -1) for seq in itertools.combinations(numbers, i) if sum(seq) == m]
    return len(result)
def StringToInt(ls):
    s = [int(x) for x in ls.split(',')]
    return s
print(es1("3,0,4,0,3,1,0,1,0,1,0,0,5,0,4,2", 9))
OUTPUT:
4608
There are 4608 possible sets that add to the value 9.

s = "3,0,4,0,3,1,0,1,0,1,0,0,5,0,4,2"
m = 9
sn = s.replace(",","+") # replace "," with "+"
d = {} # create a dictionary
# create list of strings:
# s2 = ["3+0","0+4","4+0".............]
# s3 = ["3+0+4","0+4+0","4+0+3".............]
# .
# .
# .
for n in range(2,len(s.split(","))):
    d["s%s"%n] = [sn[i:i+(2*n-1)] for i in range(0,len(sn),2)][:-n+1]
# evaluate each expression string, compare the result with m, and count the matches
l = sum([eval(x)==m for y in d.values() for x in y] )
# s1 (the single numbers) has not been added yet, so check whether each individual number equals m
l = l+sum([int(x)==m for x in s.split(",")])
print(l)
7

Finding ordered array in an ordered array

c = [-1, 0, 1, 2, 3, 4]
d = [-1,0,2,3,4,5,6]
a = [-1, 1, 6, 8, 9, 12]
main = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
desired output:
fc = [-1,0,1,2,3],[0,1,2,3,4]
fd = [2,3,4,5,6]
fa = []
I want to find how many times an ordered run from the smaller list appears in the larger list, for a given run length. In my case I chose 5, since this is for my poker game. Sets won't work since the elements need to stay in order, so I don't know what to use.
In my program, I tried using for loops, but I'm not getting it.
ns = len(c)-5
nt = range(0,ns)
if ns >= 0:
    for n in nt:
        templist = c[n:n+5]
I just need a function to compare both lists.

Compare the small lists to slices of main.
c = [-1, 0, 1, 2, 3, 4]
d = [-1,0,2,3,4,5,6]
a = [-1, 1, 6, 8, 9, 12]
main = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
for sublist in [c, d, a]:
    l = len(sublist)
    i = 0
    while i + l <= len(main):
        if sublist == main[i:i+l]:
            print('sublist %s matches' % sublist)
        i = i + 1

Neither pretty nor optimal, but it does what seems to be asked:
c = [-1, 0, 1, 2, 3, 4]
d = [-1, 0, 2, 3, 4, 5, 6]
a = [-1, 1, 6, 8, 9, 12]
main = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
def find_in_order(to_find, to_search, num_to_find):
    solutions = []
    for bucket in to_find:
        bucket_solutions = []
        solutions.append(bucket_solutions)
        for thing in [bucket[x:x + num_to_find] for x in range(len(bucket) - num_to_find + 1)]:
            for section in [to_search[y:y + num_to_find] for y in range(len(to_search) - num_to_find + 1)]:
                if thing == section:
                    bucket_solutions.append(thing)
    return solutions
fc, fd, fa = find_in_order([c, d, a], main, 5)
# fc == [[-1, 0, 1, 2, 3], [0, 1, 2, 3, 4]]
# fd == [[2, 3, 4, 5, 6]]
# fa == []
There's no bounds checking in this, so it might be brittle. I also don't like that the magic number 1 has to be added to get things to align. If you care about speed, string-search algorithms do things like keeping a rolling checksum and only doing a full comparison when the checksums match. This is left as an exercise. Also, I'm on:
sys.version
'3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) \n[GCC 7.3.0]'
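The rolling-checksum exercise aside, a simpler way to avoid the nested window comparison is to hash every window of the larger list once. The sketch below assumes the elements are hashable; windows and find_runs are hypothetical names. Unlike find_in_order, it reports each window of the smaller list at most once, even if that window occurs several times in the larger list.

```python
def windows(seq, size):
    """All contiguous windows of `size` elements, as tuples (empty if seq is too short)."""
    return [tuple(seq[i:i + size]) for i in range(len(seq) - size + 1)]


def find_runs(candidates, reference, size=5):
    """For each candidate list, keep the windows that also occur in `reference`."""
    known = set(windows(reference, size))      # built once, O(1) membership tests
    return [[list(w) for w in windows(c, size) if w in known] for c in candidates]


c = [-1, 0, 1, 2, 3, 4]
d = [-1, 0, 2, 3, 4, 5, 6]
a = [-1, 1, 6, 8, 9, 12]
main = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
fc, fd, fa = find_runs([c, d, a], main)
print(fc)  # [[-1, 0, 1, 2, 3], [0, 1, 2, 3, 4]]
print(fd)  # [[2, 3, 4, 5, 6]]
print(fa)  # []
```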

Here is a function that I made that might help you. You can pass your list as an argument and it will compare the lists.
main_set = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
c = [-1, 0, 1, 2, 3, 4]
def compare(cmp_array):
    new_arrays = []
    temp = []
    for pos, i in enumerate(cmp_array):
        for i2 in range(pos, pos+5):
            temp.append(cmp_array[i2])
        new_arrays.append(temp)
        temp = []
        if pos >= len(cmp_array)-5:
            break
    return_arrays = []
    for array in new_arrays:
        for pos, i in enumerate(main_set):
            match = True
            if i == array[0]:
                for pos2 in range(pos, pos+5):
                    if array[pos2-pos] != main_set[pos2]:
                        match = False
                        break
                if match:
                    return_arrays.append(array)
    return return_arrays
fc = compare(c)
print(fc)

Modifying list with numbers in python

I am trying to modify a list. I have a list of random numbers and would like to reorder it so that the number of increases between adjacent numbers is as large as possible. For example, if the list is [2,3,1,2,1], I would rearrange it into [1,2,3,1,2], since 1->2, 2->3 and 1->2 are increases, which gives a total of 3. Any suggestions?

I would approach your problem with this recursive algorithm. What I am doing is sorting the list, moving all duplicates to the end, and then repeating the same on the remainder, i.e. everything after the sorted, duplicate-free prefix.
def sortAndAppendDuplicates(l):
    l.sort()
    ll = list(dict.fromkeys(l)) # this is 'l' without duplicates
    i = 0
    while i < (len(ll)-1):
        # move a duplicate of l[i] to the end of the list
        if l[i] == l[i+1]:
            a = l.pop(i)
            l.append(a)
            i = i - 1
        i = i + 1
    if hasNoDuplicates(l):
        return l
    return ll + sortAndAppendDuplicates(l[len(ll):])
def hasNoDuplicates(l):
    return( len(l) == len( list(dict.fromkeys(l)) ) )
print(sortAndAppendDuplicates([2,3,6,3,4,5,5,8,7,3,2,1,3,4,5,6,7,7,0,1,2,3,4,4,5,5,6,5,4,3,3,5,1,2,1]))
# this would print [0, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 3, 4, 5, 3, 5, 3, 5]
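The same layout (one ascending run of distinct values after another) can also be built iteratively with collections.Counter. This is an alternative sketch, not the answer's algorithm, and ascending_runs is a made-up name.

```python
from collections import Counter


def ascending_runs(values):
    """Repeatedly emit the sorted distinct values that are still left over."""
    remaining = Counter(values)
    result = []
    while remaining:
        run = sorted(remaining)        # each distinct remaining value once, ascending
        result.extend(run)
        remaining.subtract(run)        # use up one copy of each value in the run
        remaining = +remaining         # unary plus drops entries whose count reached 0
    return result


print(ascending_runs([2, 3, 1, 2, 1]))  # [1, 2, 3, 1, 2]
```

The input list is left unmodified, whereas sortAndAppendDuplicates sorts its argument in place.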

Heap sort Python implementation

def heap_sort(nos):
    global size
    size = len(nos)
    print "the size of the List is : %d " %size
    Build_heap(size,nos)
    for i in range(size-1,0,-1):
        nums[0],nums[i] = nums[i],nums[0]
        size = size-1
        print "\n", nums
        heapify(nos,i,size)
    print "heap sort array:" ,nums
def left_child(i):
    return 2*i+1
def right_child(i):
    return 2*i+2
def heapify(nums,i,size):
    l = left_child(i)
    r = right_child(i)
    if l <= size and r <= size:
        if r != size:
            if nums[l] >= nums[r]:
                max = nums[l]
                max_index = l
            elif nums[l] <= nums[r]:
                max = nums[r]
                max_index = r
            if nums[i] >= max:
                print nums
                return
            elif nums[i] <= max:
                nums[i],nums[max_index] = max,nums[i]
                heapify(nums,max_index,size)
        else:
            nums[i],nums[l] = nums[l],nums[i]
            print nums
# build a heap A from an unsorted array
def Build_heap(size,elements):
    iterate = size//2-1
    for i in range(iterate,-1,-1):
        print "In %d iteration" %i
        heapify(elements,i,size)
    print "heapified array is : " ,elements
if __name__ == '__main__':
    #get input from user
    nums = [6,9,3,2,4,1,7,5,10]
    #sort the list
    heap_sort(nums)
Output which I get is something like this:
the size of the List is : 9
In 3 iteration
[6, 9, 3, 10, 4, 1, 7, 5, 2]
In 2 iteration
[6, 9, 7, 10, 4, 1, 3, 5, 2]
In 1 iteration
[6, 10, 7, 9, 4, 1, 3, 5, 2]
[6, 10, 7, 9, 4, 1, 3, 5, 2]
In 0 iteration
[10, 6, 7, 9, 4, 1, 3, 5, 2]
[10, 9, 7, 6, 4, 1, 3, 5, 2]
[10, 9, 7, 6, 4, 1, 3, 5, 2]
heapified array is : [10, 9, 7, 6, 4, 1, 3, 5, 2]
heap sort array:
[9, 7, 6, 4, 1, 3, 5, 2, 10]
I tried implementing a heap sort algorithm in python. The final output is not sorted. There is something wrong in the heapify operation which I tried to figure out, but I couldn't find it.
Can someone point out what's wrong in my code and propose a solution for it?

The first item (index 0) was swapped with the last item. To keep the max-heap invariant, you should call heapify with 0.
def heap_sort(nos):
    size = len(nos)
    Build_heap(size,nos)
    for i in range(size-1,0,-1):
        nos[0],nos[i] = nos[i],nos[0]
        size -= 1
        heapify(nos, 0, size) # <--- i -> 0
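For reference, a compact Python 3 sketch of the whole algorithm with that fix applied is shown below. It is not the asker's code: sift_down is a hypothetical replacement for heapify that checks each child index against the current heap size before comparing, and the sort works in place on the list it is given.

```python
def sift_down(nums, i, size):
    """Restore the max-heap property for the subtree rooted at i, within nums[:size]."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < size and nums[left] > nums[largest]:
            largest = left
        if right < size and nums[right] > nums[largest]:
            largest = right
        if largest == i:
            return
        nums[i], nums[largest] = nums[largest], nums[i]
        i = largest


def heap_sort(nums):
    """Sort nums in place in ascending order and return it."""
    size = len(nums)
    # Build a max-heap, starting from the last node that has a child.
    for i in range(size // 2 - 1, -1, -1):
        sift_down(nums, i, size)
    # Move the current maximum to the end, then repair the heap from the root (index 0).
    for end in range(size - 1, 0, -1):
        nums[0], nums[end] = nums[end], nums[0]
        sift_down(nums, 0, end)
    return nums


print(heap_sort([6, 9, 3, 2, 4, 1, 7, 5, 10]))  # [1, 2, 3, 4, 5, 6, 7, 9, 10]
```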

The following is my Python implementation. If the program is saved as "heapsort.py", you can run it as "python heapsort.py 10" to sort 10 randomly generated numbers.
The validation code is near the end of the program; it checks heapsort() against the built-in sorted().
#!/bin/python
#
# TH #stackoverflow, 2016-01-20, HeapSort
#
import sys, random
def pushdown( A, root, size_of_A ):
    # root and M are 1-based positions; A[M - 1] is the 0-based element
    M = root * 2
    if(M <= size_of_A):
        if(size_of_A > M):
            if(A[M - 1] < A[M]):
                M += 1
        if(A[root - 1] < A[M - 1]):
            A[root - 1], A[M - 1] = A[M - 1], A[root - 1]
            pushdown(A, M, size_of_A)
def heapsort( H ):
    for i in range(len(H) // 2, 0, -1):
        pushdown(H, i, len(H))
    for i in range(len(H) - 1, 0, -1):
        H[i], H[0] = H[0], H[i]
        pushdown(H, 1, i)
    return H
number_to_numbers = int(sys.argv[1])
X = [ random.randint(0, number_to_numbers) for i in range(number_to_numbers) ]
Y = list(X)  # copy, so Y keeps the original order while X is sorted in place
print(Y)
print(heapsort(X))
print(sorted(Y))
