Can this merge sort be more efficient?

I made a merge sort in python just to get a better understanding of how it works. The code works fine, but I'm wondering if anyone can critique what I've written. Can this be more efficient?
I anticipate that there is some room for improvement in my recursion - the function continues running after it calls and runs itself. I've prevented it from continuing unnecessarily by simply giving it a return immediately after a recursive call, but I am not sure if there is a better method.
Also, I have to specify in the output that I want index 0 of the returned value, as the merging section of the function generates a multidimensional array.
"""
2019-10-17
This project simply re-creates common sorting algorithms
for the purpose of becoming more intimately familiar with
them.
"""
import random
import math
#Generate random array
array = [random.randint(0, 100) for i in range(random.randint(5, 100))]
print(array)
#Merge Sort
def merge(array, split = True):
    split_array = []
    #When the function is recursively called by the merging section, the split will be skipped
    continue_splitting = False
    if split == True:
        #Split the array in half
        for each in array:
            if len(each) > 1:
                split_array.append(each[:len(each)//2])
                split_array.append(each[len(each)//2:])
                continue_splitting = True
            else:
                split_array.append(each)
    if continue_splitting == True:
        sorted_array = merge(split_array)
        #A return is set here to prevent the recursion from proceeding once the array is properly sorted
        return sorted_array
    else:
        sorted_array = []
        if len(array) != 1:
            #Merge the array
            for i in range(0, len(array), 2):
                #Pointers are used to check each element of the two merging arrays
                pointer_a = 0
                pointer_b = 0
                #The temp array is used to prevent the sorted array from being corrupted by the sorting loop
                temp = []
                if i < len(array) - 1:
                    #Loop to merge the array
                    while pointer_a < len(array[i]) or pointer_b < len(array[i+1]):
                        if pointer_a < len(array[i]) and pointer_b < len(array[i+1]):
                            if array[i][pointer_a] <= array[i + 1][pointer_b]:
                                temp.append(array[i][pointer_a])
                                pointer_a += 1
                            elif array[i + 1][pointer_b] < array[i][pointer_a]:
                                temp.append(array[i + 1][pointer_b])
                                pointer_b += 1
                        #If the pointer is equal to the length of the sub array, the other sub array will just be added fully
                        elif pointer_a < len(array[i]):
                            for x in range(pointer_a, len(array[i])):
                                temp.append(array[i][x])
                                pointer_a += 1
                        elif pointer_b < len(array[i + 1]):
                            for x in range(pointer_b, len(array[i + 1])):
                                temp.append(array[i + 1][x])
                                pointer_b += 1
                else:
                    for each in array[i]:
                        temp.append(each)
                sorted_array.append(temp)
            if len(sorted_array) != 1:
                #Recursively sort the sub arrays
                sorted_array = merge(sorted_array, False)
        return sorted_array
sorted_array = merge([array])[0]
print()
print(sorted_array)

To make it more efficient, you can simply use sum() and sorted() to merge and sort the lists:
list1 = [[2, 7, 10], [0, 4, 6], [3, 11]]
list2 = [[2, 7, 10], [0, 4, 6], [3, 11, 1]]
list3 = [[10, 567, 34], [-10, -30, -109], [98, 10001, 5]]
def merge_sort(lists, ascending=True):
    if ascending == True: # If in an ascending order
        return sorted(sum(lists,[])) # `sum()` merges and `sorted()` sorts it
    elif ascending == False: # If in a descending order
        return sorted(sum(lists,[]), reverse=True) # `reverse=True` makes it descending
>>> print(merge_sort(list1))
print(merge_sort(list2))
print(merge_sort(list3))
[0, 2, 3, 4, 6, 7, 10, 11]
[0, 1, 2, 3, 4, 6, 7, 10, 11]
[-109, -30, -10, 5, 10, 34, 98, 567, 10001]
>>> print(merge_sort(list1, ascending=False))
print(merge_sort(list2, ascending=False))
print(merge_sort(list3, ascending=False))
[11, 10, 7, 6, 4, 3, 2, 0]
[11, 10, 7, 6, 4, 3, 2, 1, 0]
[10001, 567, 98, 34, 10, 5, -10, -30, -109]
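For comparison with the hand-written version in the question, a conventional top-down merge sort can be sketched as follows. This is only a minimal sketch, not the asker's code: the names merge_sorted and merge_sort_recursive are made up for illustration, and the function returns a new list instead of sorting in place.

```python
def merge_sorted(left, right):
    """Merge two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    # Take the smaller head element until one side runs out.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort_recursive(items):
    """Return a sorted copy of items using top-down merge sort."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge_sorted(merge_sort_recursive(items[:mid]),
                        merge_sort_recursive(items[mid:]))


print(merge_sort_recursive([5, 2, 9, 1, 5, 6]))
```

Each call returns directly, so there is no need for a flag that tells the function whether it is splitting or merging, and the result is a flat list rather than a nested one.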

Related

way to find a sequence in a list with the same number of repetitions of two integers?

I have this function that takes as arguments a list of integers and two integers. I have to find the longest sequence where the two integers repeat the same number of times.
For example if the list is
[9, 5, 7, 33, 9, 5, 5, 5, 8, 5, 33, 33, 6, 15, 8, 5, 6]
and i1 = 33 and i2 = 5, the function must return 9, because the longest sequence is 8, 5, 33, 33, 6, 15, 8, 5, 6 (in fact 33 and 5 both repeat twice).
I thought about creating a count from 0 and using a for loop on the elements of the list. Then, if the current element equals i1 or i2, the count goes up by 1.
I now need to control the number of repetitions, but I'm stuck.

Here's a recursive solution (not efficient but short and simple) that checks subranges from larger to smaller recursing until it finds one that has the same count of each number in it:
def getMaxSeq(L,a,b):
    if L.count(a) == L.count(b): return L
    return max(getMaxSeq(L[1:],a,b),getMaxSeq(L[:-1],a,b),key=len)
output:
L = [9, 5, 7, 33, 9, 5, 5, 5, 8, 5, 33, 33, 6, 15, 8, 5, 6]
print(getMaxSeq(L,5,33)) # [8, 5, 33, 33, 6, 15, 8, 5, 6]
print(len(getMaxSeq(L,5,33))) # 9
Using a queue to implement a breadth first traversal of subranges in decreasing order of length would be a bit more efficient:
from collections import deque
def getMaxSeq(L,a,b):
    ranges = deque([L])
    while L.count(a) != L.count(b):
        ranges.extend((L[1:],L[:-1]))
        L = ranges.popleft()
    return L

Neither the most elegant nor the most performant code but it works fine for this example.
def longest_sequence(lst,x,y):
    def helper(lst,x):
        n = len(lst)
        seqs = []
        for i in range(0,n):
            if lst[i]==x:
                j=0
                sq=[]
                while i+j<n:
                    if lst[i+j]==x:
                        sq.append(i+j)
                        j=j+1
                        seqs.append(sq)
                    else:
                        break
        return seqs
    x_seqs = helper(lst,x)
    y_seqs = helper(lst,y)
    x_max_seq = max(x_seqs, key = len)
    y_max_seq = max(y_seqs, key = len)
    max_len = min(len(x_max_seq),len(y_max_seq))
    return x_max_seq[:max_len],y_max_seq[:max_len]
my_list=[9, 5, 7, 33, 9, 5, 5, 5, 8, 5, 33, 33, 6, 15, 8, 5, 6]
x = 33
y = 5
xsq,ysq=longest_sequence(my_list,x,y)
print(xsq,ysq)
and it returns the indices for the longest sequence for both x and y.

I would approach this by converting the list to a balance list and see when the numbers balance each other out. I would then loop through all the starting indices and check each sequence that comes after that, and update my record when the balance is 0 and the sequence length is longer than the previous record.
l = [9, 5, 7, 33, 9, 5, 5, 5, 8, 5, 33, 33, 6, 15, 8, 5, 6]
num1 = 5
num2 = 33
# create list where num1 = 1, num2 = -1, else 0
balance_list = [(1 if x == num1 else -1) if x in [num1, num2] else 0 for x in l]
record = (0, 0)
print(balance_list)
for seq_start in range(len(balance_list)):
    # if length of seq_start to end is less than record length, no need to search further
    if (len(balance_list)-1 - seq_start) <= (record[1] - record[0]):
        break
    balance = 0
    for seq_end, val in enumerate(balance_list[seq_start:], seq_start):
        balance += val
        # num1 occurs as frequently as num2, and the sequence length is longer
        if balance == 0 and (seq_end - seq_start) > (record[1] - record[0]):
            record = (seq_start, seq_end)
        # if more numbers are required to get balance to 0, we've hit a dead end
        elif abs(balance) > len(balance_list[seq_start:])-1:
            break
print(record)
Output:
(8, 16)
This is by no means a perfect solution and a dynamic programming approach would be much more efficient, but I've tried to add optimizations to speed up the search (e.g. quitting the search when no new record can be found).
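One way to get a single-pass solution out of the same balance idea is to remember the first position at which each running balance appears: whenever a balance value repeats, the elements in between contain both numbers equally often. The sketch below assumes the two numbers are different; the function name longest_balanced_run is hypothetical.

```python
def longest_balanced_run(values, a, b):
    """Return the longest contiguous slice in which a and b occur equally often.

    Sketch only: assumes a != b.
    """
    first_seen = {0: -1}          # running balance -> index where it first appeared
    balance = 0
    best_start, best_end = 0, 0   # half-open slice bounds of the best run so far
    for idx, value in enumerate(values):
        if value == a:
            balance += 1
        elif value == b:
            balance -= 1
        if balance in first_seen:
            # Same balance as before: the elements in between cancel out.
            start = first_seen[balance] + 1
            if idx + 1 - start > best_end - best_start:
                best_start, best_end = start, idx + 1
        else:
            first_seen[balance] = idx
    return values[best_start:best_end]


data = [9, 5, 7, 33, 9, 5, 5, 5, 8, 5, 33, 33, 6, 15, 8, 5, 6]
run = longest_balanced_run(data, 5, 33)
print(run, len(run))
```

Each element is visited once and the dictionary holds at most one entry per distinct balance, so the running time grows linearly with the length of the list.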

Flip half of a list and append it to itself

I'm trying to flip half of the elements inside of a list and then append it onto itself. For example,
if I pass the list [1, 8, 7, 2, 9, 18, 12, 0] into it, I should get [1, 0, 7, 18, 9, 2, 12, 8] out. Here is my current code:
def flip_half(list):
    for i in range(0,len(list)):
        if i % 2 != 0:
            listflip = list[i]
            print(listflip)
    return list

You can directly assign slices in Python. So if you can determine the slice you want to assign from and to, you can directly change the elements. This assigns every other element in reverse, l[::-2], to every other element starting with the second, l[1::2]:
l = [1, 8, 7, 2, 9, 18, 12, 0]
l[1::2] = l[::-2]
print(l)
#[1, 0, 7, 18, 9, 2, 12, 8]

If you wanted to do it in a loop:
def flip_half(list):
    out = []
    for i in range(0,len(list)):
        if i % 2 != 0:
            out.append(list[-i])
        else:
            out.append(list[i])
    return out
print(flip_half([1, 8, 7, 2, 9, 18, 12, 0]))

If you need the elements at odd indexes in reverse order, use one of the following solutions (l is your list):
Slices
l[1::2] = l[1::2][::-1]
List comprehension
l = [l[i] if i % 2 == 0 else l[-(i + len(l)%2)] for i in range(len(l))]
Function
def flip_half(lst):
    res = []
    for i in range(len(lst)):
        if i % 2 != 0:
            res.append(lst[-(i + len(lst) % 2)])
        else:
            res.append(lst[i])
    return res
l = flip_half(l)
Generator function
def flip_half(lst):
    for i in range(len(lst)):
        if i % 2 != 0:
            yield lst[-(i + len(lst) % 2)]
        else:
            yield lst[i]
l = list(flip_half(l))
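The slice form can also be wrapped in a small function that leaves the input untouched. This is only a sketch; the name flip_odd_positions is made up for illustration. Because it reverses the odd-index slice onto itself, both sides of the assignment always have the same length, so it should behave the same for even and odd list lengths.

```python
def flip_odd_positions(items):
    """Return a copy of items with the odd-index elements in reverse order."""
    result = list(items)               # work on a copy, keep the caller's list intact
    result[1::2] = result[1::2][::-1]  # reverse only the elements at indexes 1, 3, 5, ...
    return result


print(flip_odd_positions([1, 8, 7, 2, 9, 18, 12, 0]))
print(flip_odd_positions([1, 2, 3, 4, 5]))
```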

Finding all consecutive indices that sum X number on list

I'm currently practicing some Python and came across this problem. Let's say we have a list of integers and we want to find out all the indices of its elements that sum a certain number (in particular, the first index and the last index). Here's an example:
arr = [6, 7, 5, 4, 3, 1, 2, 3, 5, 6, 7, 9, 0, 0, 1, 2, 4, 1, 2, 3, 5, 1, 2]
sum_to_find = 13
So we have the array, and we want to find all elements that sum 13, and save the indexes (first and last of the interval) each time. The answer for this problem would be:
answer = [[0, 1], [2, 5], [3, 7], [9, 10], [12, 19], [13, 19], [14, 19], [18, 22]]
Below is the code I've tried:
def find_sum_range(array):
    summ = 0
    lst_sum = []
    lst_ix = []
    i = 0
    j = -1
    while i < len(array):
        if summ < 13:
            val = array[i]
            summ += val
            lst_ix.append(i)
        elif summ == 13:
            lst_sum.append(lst_ix)
            j += 1
            i = lst_sum[j][1]
            summ = 0
            lst_ix = []
        i += 1
    return lst_sum
But it's only returning the first two answers, mostly because I can't seem to properly backtrack the i iterator to start again from the first index of the last sum it correctly identified.

This approach is unnecessarily complicated. Utilizing list slicing produces much simpler code. Try this:
def find_sum_range(array):
    result = []
    for begin in range(len(array)):
        for end in range(begin, len(array)):
            if sum(array[begin:end+1]) == 13:
                result.append([begin,end])
    return result
or, with list comprehension:
def find_sum_range(array):
    return [ [begin,end]
             for begin in range(len(array))
             for end in range(begin, len(array))
             if sum(array[begin:end+1]) == 13 ]

A simple approach to this problem would be to use nested loops. So, go over each element in the list and for each of them iterate over the elements in the list that come after that element.
As soon as the running sum reaches or exceeds the required sum, we can break out of the nested for loop and go back to the main loop. If it turns out to be equal, then just append a list with the correct indices to the answer.
arr = [6, 7, 5, 4, 3, 1, 2, 3, 5, 6, 7, 9, 0, 0, 1, 2, 4, 1, 2, 3, 5, 1, 2]
req_sum = 13
answer = []
for i in range(len(arr)):
    curr_s = arr[i]
    for k in range(i+1, len(arr)):
        curr_s += arr[k]
        if curr_s >= req_sum:
            if curr_s == req_sum:
                answer.append([i, k])
            break
print(answer)

Note that you use i = lst_sum[j][1] to try to backtrack. This is the second element in the list you just saved. You should use i = lst_sum[j][0] instead.
Also, you need to treat the case where you go above 13.
You can reduce the number of operations needed by moving a start index instead of keeping all the potential list indexes and deleting everything each time you arrive at 13 or above:
def find_sum_range(array):
    summ = 0
    lst_sum = []
    start = 0
    end = -1
    for element in array:
        summ += element
        end += 1
        while summ >= 13:
            if summ == 13:
                lst_sum.append([start, end])
            summ -= array[start]
            start += 1
    return lst_sum
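The moving-start approach relies on the values being non-negative. If the list could also contain negative numbers, one alternative is to record prefix sums in a dictionary and look up the matching start positions for each end position. The sketch below is an illustration under that assumption; the name find_sum_ranges and the target parameter are not from the question.

```python
from collections import defaultdict


def find_sum_ranges(values, target):
    """Return [start, end] (inclusive) for every contiguous run summing to target."""
    # Maps a prefix sum to every start index whose preceding elements add up to it.
    starts_by_prefix = defaultdict(list)
    starts_by_prefix[0].append(0)
    ranges = []
    running = 0
    for end, value in enumerate(values):
        running += value
        # A run start..end sums to target when the prefix before start is running - target.
        for start in starts_by_prefix.get(running - target, []):
            ranges.append([start, end])
        starts_by_prefix[running].append(end + 1)
    ranges.sort()
    return ranges


arr = [6, 7, 5, 4, 3, 1, 2, 3, 5, 6, 7, 9, 0, 0, 1, 2, 4, 1, 2, 3, 5, 1, 2]
print(find_sum_ranges(arr, 13))
```

The list is scanned once, and each matching range costs one extra step, so no slice is summed repeatedly.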

How to replace consecutive values in a list using another list as a reference?

I have a list like this:
list_target = [4, 5, 6, 7, 12, 13, 14]
list_primer = [3, 11]
So list_target consists of blocks of consecutive values, between which are jumps in values (like from 7 to 12). list_primer consists of values at the beginning of those blocks. Elements in list_primer are generated in another process.
My question is: for each element of list_primer, how can I identify the block in list_target and replace their values with what I want? For example, if I choose to replace the values in the first block with 1 and the second with 0, the outcome looks like:
list_target_result = [1, 1, 1, 1, 0, 0, 0]

Here's a simple algorithm which solves your task by looping through both lists beginning to end:
list_target = [4, 5, 6, 7, 12, 13, 14]
list_primer = [3, 11]
block_values = [1, 0]
result = []
for i, primer in enumerate(list_primer):
    for j, target in enumerate(list_target):
        if target == primer+1:
            primer += 1
            result.append(block_values[i])
        else:
            continue
print(result)
[1, 1, 1, 1, 0, 0, 0]
Note that you might run into trouble if not all blocks have a respective primer, depending on your use case.

Modifying method to find groups of strictly increasing numbers in a list
from itertools import cycle, groupby
def group_seq(l, list_primer):
    " Find groups which are strictly increasing or equals next list_primer value "
    temp_list = cycle(l)
    temp_primer = cycle(list_primer)
    next(temp_list)
    groups = groupby(l, key = lambda j: (j + 1 == next(temp_list)) or (j == next(temp_primer)))
    for k, v in groups:
        if k:
            yield tuple(v) + (next((next(groups)[1])), )
Use group_seq to find strictly increasing blocks in list_target
list_target = [4, 5, 6, 7, 12, 13, 14]
list_primer = [3, 11]
block_values = [1, 0]
result = []
for k, v in zip(block_values, group_seq(list_target, list_primer)):
    result.extend([k]*len(v)) # k is value from block_values
                              # v is a block of strictly increasing numbers
                              # ie. group_seq(list_target, list_primer) creates sublists
                              # [(4, 5, 6, 7), (12, 13, 14)]
print(result)
Out: [1, 1, 1, 1, 0, 0, 0]

Here's a solution using numpy.
import numpy as np
list_target = np.array([4, 5, 6, 7, 12, 13, 14])
list_primer = np.array([3, 11])
values = [1, 0]
ix = np.searchsorted(list_target, list_primer)
# [0,4]
blocks = np.split(list_target, ix)[1:]
# [array([4, 5, 6, 7]), array([12, 13, 14])]
res = np.concatenate([np.full(s.size, values[i]) for i,s in enumerate(blocks)])
# array([1, 1, 1, 1, 0, 0, 0])

Here is a solution that works in O(n), where n=len(list_target). It assumes that your list_target list is consecutive in the way you described (increments by exactly one within block, increments of more than one between blocks).
It returns a dictionary with the beginning of each block as key (potential primers) and lower and upper indices of that block within list_target as values. Access to that dict is then O(1).
list_target = [4, 5, 6, 7, 12, 13, 14]
list_primer = [3, 11]
block_dict = dict()
lower_idx = 0
upper_idx = 0
for i, val in enumerate(list_target): # runs in O(n)
    upper_idx = i + 1
    if i == len(list_target) - 1: # for last block in list
        block_dict[list_target[lower_idx] - 1] = (lower_idx, upper_idx)
        break
    if list_target[i + 1] - list_target[i] != 1: #if increment more than one, save current block to dict, reset lower index
        block_dict[list_target[lower_idx] - 1] = (lower_idx, upper_idx)
        lower_idx = i + 1
Here are the results:
print(block_dict) # quick checks
>>>> {3: (0,4), 11: (4,7)}
for p in list_primer: # printing the corresponding blocks.
    lower, upper = block_dict[p] # dict access in O(1)
    print(list_target[lower:upper])
>>>> [4, 5, 6, 7]
[12, 13, 14]
# getting the indices for first primer marked as in your original question:
list_target_result = [0] * len(list_target)
lower_ex, upper_ex = block_dict[3]
list_target_result[lower_ex: upper_ex] = [1]*(upper_ex-lower_ex)
print(list_target_result)
>>>> [1, 1, 1, 1, 0, 0, 0]
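The same block-detection idea can produce the replaced list directly in one pass. The sketch below assumes, as the question describes, that each primer is exactly one less than the first value of its block; the function name label_blocks and the block_values argument are illustrative only. A block without a matching primer raises KeyError here, which may or may not be what you want.

```python
def label_blocks(list_target, list_primer, block_values):
    """Replace every value in list_target with the value assigned to its block."""
    value_for_primer = dict(zip(list_primer, block_values))
    result = []
    current = None
    previous = None
    for number in list_target:
        if previous is None or number != previous + 1:
            # A new block starts here; its primer is the value just before it.
            current = value_for_primer[number - 1]
        result.append(current)
        previous = number
    return result


print(label_blocks([4, 5, 6, 7, 12, 13, 14], [3, 11], [1, 0]))
```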

Multiples of 3 and 5

I am trying to write a program that takes an input number T(number of test cases), and then asks for the numbers N.
This is my code:
T = int(raw_input())
L = [int(raw_input()) for i in range(T)]
L1 = []
for i in range(0,L[i]):
    if (i%3 == 0 or i%5 ==0):
        L1.append(i)
print L1
Input: 2 10 20
Output: [0, 3, 5, 6, 9, 10, 12, 15, 18]
I would like the output to be of the following format:
[[0, 3, 5, 6, 9], [0, 3, 5, 6, 9, 10, 12, 15, 18]]
Here [0, 3, 5, 6, 9] is the list that has elements with both multiples of 3 and 5 for number 10
[0, 3, 5, 6, 9, 10, 12, 15, 18] is the list that has elements with both multiples of 3 and 5 for number 20
I am new to Python. Kindly let me know how I should proceed on this.

The following will produce a list of lists containing all the multiples of 3 and 5 that are less than the given number.
L = [10,20]
L1 = []
for i in L:
    L2 = [] # initialize a new list
    for j in range(i):
        if not (j%3 and j%5): # use falsy values and DeMorgan's Law
            L2.append(j) # append to this list
    if L2: # use this if you don't want to keep empty lists
        L1.append(L2)
>>> L1
[[0, 3, 5, 6, 9], [0, 3, 5, 6, 9, 10, 12, 15, 18]]

I think what you want is splitting a list by input values. Hope it helps
num = int(raw_input())
upperBounds= [int(raw_input()) for i in range(num)]
res= []
for upperBound in upperBounds:
    res.append([i for i in range(0,upperBound) if not (i % 3 and i % 5)])
output:
2
10
20
[[0, 3, 5, 6, 9], [0, 3, 5, 6, 9, 10, 12, 15, 18]]

This can be easily done by applying appropriate logic:
if the element is at index 0, we have to iterate from 0 to that element;
otherwise we have to iterate from L[index-1] to L[index].
T = int(raw_input())
L = [int(raw_input()) for i in range(T)]
L1 = []
for j in xrange(len(L)):
    temp = []
    get = 0 if not j else L[j-1]
    # if j==0:
    #     get = 0
    # else:
    #     get = L[j-1]
    for i in range(get, L[j]):
        if (i%3 == 0 or i%5 ==0):
            temp.append(i)
    L1.append(temp)
print L1
>>> [[0, 3, 5, 6, 9], [10, 12, 15, 18]]
Or a more Pythonic and compact version may look like:
T = int(raw_input())
L = [int(raw_input()) for i in range(T)]
L1 = []
for j in xrange(len(L)):
    get = 0 if not j else L[j-1]
    L1.append([i for i in range(get, L[j]) if (i%3 == 0 or i%5 ==0)])
print L1

You can simply generate a list of multiples with range(l,u,s), with l the lower bound, u the upper bound and s the step.
Now if we want to generate the multiples of a factor in a given range, we can use the following function:
def multiples(factor, lower, upper) :
    return set(range(lower+(factor-lower)%factor,upper,factor))
We thus shift the lower bound to lower+(factor-lower)%factor in order to find - in constant time - the first multiple that is greater than or equal to lower.
Next we need the multiples of 3 or 5:
def multiples35(lower, upper):
    return sorted(list(multiples(3,lower,upper)|multiples(5,lower,upper)))
Now we only need to iterate over the list of values and generate the list of multiples for each upper bound:
def func(B):
    return [multiples35(0,upper) for upper in B]
Or as full code:
def multiples(factor, lower, upper) :
    return set(range(lower+(factor-lower)%factor,upper,factor))
def multiples35(lower, upper):
    return sorted(list(multiples(3,lower,upper)|multiples(5,lower,upper)))
def func(B):
    return [multiples35(0,upper) for upper in B]
The main function then reads:
T = int(raw_input())
B = [int(raw_input()) for i in range(T)]
print func(B)
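The snippets in this thread are written for Python 2 (raw_input, xrange and the print statement). A rough Python 3 equivalent of the list-of-lists result might look like the sketch below; the function names are made up for illustration, and the upper bounds are passed in as a list instead of being read from standard input.

```python
def multiples_of_3_or_5(upper):
    """Return every n in range(upper) that is divisible by 3 or by 5."""
    return [n for n in range(upper) if n % 3 == 0 or n % 5 == 0]


def multiples_for_each(bounds):
    """Build one list of multiples per upper bound."""
    return [multiples_of_3_or_5(upper) for upper in bounds]


print(multiples_for_each([10, 20]))
```

To read the bounds interactively in Python 3, input() takes the place of raw_input().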
