min sum of consecutive values with Divide and Conquer - python

Given an array of random integers
N = [1,...,n]
I need to find min sum of two consecutive values using divide and conquer.
What is not working here but my IQ?
def minSum(array):
    if len(array) < 2:
        return array[0]+array[1]
    if (len(a)%2) != 0:
        mid = int(len(array)/2)
        leftArray = array[:mid]
        rightArray = array[mid+1:]
        return min(minSum(leftArray),minSum(rightArray),crossSum(array,mid))
    else:
        mid = int(len(array)/2)
        leftArray = array[:mid]
        rightArray = array[mid:]
        return min(minSum(leftArray), minSum(rightArray), array[mid]+array[mid+1])
def crossSum(array,mid):
    return min(array[mid-1]+array[mid],array[mid]+array[mid+1])

The main problem is that the first condition is wrong: if len(array) < 2, the following line is bound to raise an IndexError. Also, a is not defined. I assume that is the name of the array in the outer scope, so this does not raise an exception but silently uses the wrong array. Apart from that, the function seems to more or less work (I did not test it thoroughly, though).
However, you do not really need to check whether the array has odd or even length; you can use the same code for both cases, which makes the crossSum function unnecessary. Also, it is kind of confusing that the function for returning the min sum is called maxSum. If you really want a divide-and-conquer approach, try this:
def minSum(array):
    if len(array) < 2:
        return 10**100
    elif len(array) == 2:
        return array[0]+array[1]
    else:
        # len >= 3 -> both halves guaranteed non-empty
        mid = len(array) // 2
        leftArray = array[:mid]
        rightArray = array[mid:]
        return min(minSum(leftArray),
                   minSum(rightArray),
                   leftArray[-1] + rightArray[0])
import random
lst = [random.randint(1, 10) for _ in range(20)]
r = minSum(lst)
print(lst)
print(r)
Random example output:
[1, 5, 6, 4, 1, 2, 2, 10, 7, 10, 8, 4, 9, 5, 7, 6, 5, 1, 4, 9]
3
However, a simple loop would be much better suited for the problem:
def minSum(array):
    return min(array[i-1] + array[i] for i in range(1, len(array)))
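Slicing copies part of the list at every level of the recursion. As a minimal sketch, the same divide-and-conquer idea can pass index bounds instead. The names min_adjacent_sum, lo and hi are illustrative and not from the question, and math.inf stands in for the 10**100 sentinel.

```python
import math

def min_adjacent_sum(values, lo=0, hi=None):
    """Smallest sum of two neighbouring items in values[lo:hi]."""
    if hi is None:
        hi = len(values)
    if hi - lo < 2:
        return math.inf              # no adjacent pair in this range
    if hi - lo == 2:
        return values[lo] + values[lo + 1]
    mid = (lo + hi) // 2
    return min(
        min_adjacent_sum(values, lo, mid),   # pairs entirely in the left part
        min_adjacent_sum(values, mid, hi),   # pairs entirely in the right part
        values[mid - 1] + values[mid],       # the pair that straddles the split
    )

def min_adjacent_sum_loop(values):
    """Reference version: the simple loop."""
    return min(values[i - 1] + values[i] for i in range(1, len(values)))
```

Both functions should agree on any list with at least two items, which makes the loop version a convenient check for the recursive one.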

Related

Find the target difference in a pair with recursion

Given a list of unsorted integers and a target integer, find out if any pair's difference in the list is equal to the target integer with recursion.
>>> aList = [5, 4, 8, -3, 6]
>>> target = 9
return True
>>> aList = [-1, 5, 4]
>>> target = 3
return False
For and while loops are not allowed.
No imports allowed.
.sort() is not allowed.
I tried this and it didn't work.
def calculate(aList, target):
    if len(aList) == 0 and diff != 0:
        return False
    startIndex = 0
    endIndex = len(aList) - 1
    return resursive_sum(aList, target, startIndex, endIndex)
def resursive_sum(aList, targ, start, end):
    print(f'Start: {start}')
    print(f'End: {end}')
    if start == end:
        return False
    elif aList[end] - aList[start] == targ:
        return True
    elif aList[end] - aList[start] < targ:
        return resursive_sum(values, targ, start, end - 1)
    return resursive_sum(aList, targ, start + 1, end)
I'm unsure of how this problem could be solved if we aren't able to use loops to sort the list. Even if we could use recursion to sort the list, how should the recursion look so that it can scan every pair's difference?
So I actually implemented it, but for educational purposes I'm not gonna post it until a bit later (I'll update it in a few hours) as I assume this is for a class or some other setting where you should figure it out on your own.
Assume you are trying to hit a difference target t = 5 and you are evaluating an arbitrary element 8. There are only two values that would allow 8 to have a complement in the set: 8 + 5 = 13 and 8 - 5 = 3.
If 3 or 13 had been in any previous elements, you would know that the set has a pair of complements. Otherwise, you'd want to record the fact that 8 had been seen. Thereby, if 3 was found later, 8 would be queried as 3 + 5 = 8 would be considered.
In other words, I am proposing a method where you recursively traverse the list and either
(base case) Are at the end of the list
Have a current element a such that a + t or a - t has been seen
Record that the current element has been seen and go to the next element
Ideally, this should have O(n) time complexity and O(n) space complexity in the worst case (assuming efficient implementation with pass-by-reference or similar, and also amortized constant-time set query). It can also be implemented using a basic array, but I'm not going to say that's better (in python).
I'll post my solution in a few hours. Good luck!
EDIT 1: Hopefully, you had enough time to get it to work. The method I described can be done as follows:
def hasDiffRecur(L, t, i, C):
    """
    Recursive version to see if list has difference
    :param L: List to be considered
    :param t: Target difference
    :param i: Current index to consider
    :param C: Cache set
    """
    # We've reached the end. Give up
    if i >= len(L):
        return False
    print(f" > L[{i}] = {L[i]:2}; is {L[i]-t:3} or {L[i]+t:2} in {C}")
    # Has the complement been cached?
    if L[i] - t in C:
        print(f"! Difference between {L[i]} and {L[i]-t} is {t}")
        return True
    if L[i] + t in C:
        print(f"! Difference between {L[i]} and {L[i]+t} is {t}")
        return True
    # Complement not seen yet. Cache element and go to next element
    C.add(L[i])
    return hasDiffRecur(L, t, i+1, C)
###################################################################
def hasDiff(L, t):
    """
    Initialized call for hasDiffRecur. Also prints intro message.
    See hasDiffRecur for param info
    """
    print(f"\nIs a difference of {t} present in {L}?")
    return hasDiffRecur(L, t, 0, set())
###################################################################
hasDiff([5, 4, 8, -3, 6], 9)
hasDiff([-1, 5, 4], 3)
hasDiff([-1, 5, 4, -1, 7], 0) # If concerned about set non-duplicity
OUTPUT:
Is a difference of 9 present in [5, 4, 8, -3, 6]?
> L[0] = 5; is -4 or 14 in set()
> L[1] = 4; is -5 or 13 in {5}
> L[2] = 8; is -1 or 17 in {4, 5}
> L[3] = -3; is -12 or 6 in {8, 4, 5}
> L[4] = 6; is -3 or 15 in {8, -3, 4, 5}
! Difference between 6 and -3 is 9
Is a difference of 3 present in [-1, 5, 4]?
> L[0] = -1; is -4 or 2 in set()
> L[1] = 5; is 2 or 8 in {-1}
> L[2] = 4; is 1 or 7 in {5, -1}
Is a difference of 0 present in [-1, 5, 4, -1, 7]?
> L[0] = -1; is -1 or -1 in set()
> L[1] = 5; is 5 or 5 in {-1}
> L[2] = 4; is 4 or 4 in {5, -1}
> L[3] = -1; is -1 or -1 in {4, 5, -1}
! Difference between -1 and -1 is 0
EDIT 2:
This is a pretty clever and efficient solution. I do realize that the intention may be to not allow any traversal at all (i.e. no existence query on a set). If that is the case, the above approach can be done with a fixed-size list that is pre-allocated to the size of the range of the values in the list.
If pre-allocating to the size of the range still counts as too much iteration, there is the exhaustive approach implemented recursively. There is likely a more efficient approach, but you can boil the problem down to a double-for-loop-like problem (O(n^2) time complexity). This is a trivial algorithm that should be understandable without documentation, so I'll just include it for completeness:
def hasDiffRecur(L, t, i = 0, j = 1):
    if i >= len(L): return False
    if j >= len(L): return hasDiffRecur(L, t, i+1, i+2)
    if abs(L[i] - L[j]) == t: return True
    return hasDiffRecur(L, t, i, j+1)
###################################################################
print(hasDiffRecur([5, 4, 8, -3, 6], 9)) # True
print(hasDiffRecur([-1, 5, 4], 3)) # False
print(hasDiffRecur([-1, 5, 4, -1, 7], 0)) # True
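The pre-allocated list mentioned in EDIT 2 could be sketched roughly as follows. This is only an illustration, under the assumption that the built-ins min and max are permitted by the exercise; has_diff_table, seen and low are hypothetical names. Memory grows with the spread of the values rather than with the length of the list.

```python
def has_diff_table(values, target, i=0, seen=None, low=None):
    """Recursive pair-difference check using a flag list instead of a set."""
    if not values:
        return False
    if seen is None:
        # One flag per integer between the smallest and largest value
        low = min(values)
        seen = [False] * (max(values) - low + 1)
    if i >= len(values):
        return False
    pos = values[i] - low                    # slot of the current value
    below, above = pos - target, pos + target
    if 0 <= below < len(seen) and seen[below]:
        return True
    if 0 <= above < len(seen) and seen[above]:
        return True
    seen[pos] = True                         # record the value, then move on
    return has_diff_table(values, target, i + 1, seen, low)
```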
choose
I'll start with a generic function that takes a list, t, and a number of elements to choose, n -
def choose(t, n):
    if n == 0:
        return [[]]
    elif not t:
        return []
    else:
        return append \
            ( map \
                ( choose(rest(t), n - 1)
                , lambda c: append([first(t)], c)
                )
            , choose(rest(t), n)
            )
print(choose(["a", "b", "c", "d"], 2))
[['a', 'b'], ['a', 'c'], ['a', 'd'], ['b', 'c'], ['b', 'd'], ['c', 'd']]
helpers
Your question imposes quite a few restrictions and Python is a multi-paradigm language and so we're going to use a number of helpers to make things readable
def first(t):
    return t[0]
def rest(t):
    return t[1:]
def append(t0, t1):
    return t0 + t1
I don't know if map counts as an import, but we will define our own just in case -
def map(t, f):
    if not t:
        return []
    else:
        return append \
            ( [f(first(t))]
            , map(rest(t), f)
            )
solve
Great, now that we've finished implementing choose, let's see how we can apply it to our problem
print(choose([5, 4, 8, -3, 6], 2))
[[5, 4], [5, 8], [5, -3], [5, 6], [4, 8], [4, -3], [4, 6], [8, -3], [8, 6], [-3, 6]]
As you can see, we've found all combinations of 2 elements. We just need to loop through these and check if a pair can be subtracted to reach our target, q -
def solve(t, q):
    def check(p):
        (x, y) = p
        return x - y == q or y - x == q
    def loop(c):
        if not c:
            return False
        else:
            return check(first(c)) or loop(rest(c))
    return loop(choose(t, 2))
print(solve([5, 4, 8, -3, 6], 9))
print(solve([-1, 5, 4], 3))
True
False
allowing for
This is a great exercise to build your recursion skills. Disallowing for is the most challenging restriction to overcome. Here's what it could look like if we could use it -
def choose(t, n):
    if n == 0:
        yield []
    elif not t:
        return
    else:
        for c in choose(t[1:], n - 1):
            yield [t[0]] + c
        yield from choose(t[1:], n)
def solve(t, q):
    for (x,y) in choose(t, 2):
        if x - y == q or y - x == q:
            return True
    return False
print(solve([5, 4, 8, -3, 6], 9))
print(solve([-1, 5, 4], 3))
True
False
This variant has an added advantage that it will stop computing combinations as soon as a solution is found. The first variant must compute all combinations first and then begin iterating through them.
allowing other built-ins
Python built-in functions include map and any and offer us another way to get around the for restriction, but I'm unsure if those are allowed -
def choose(t, n):
    if n == 0:
        yield []
    elif not t:
        return
    else:
        yield from map \
            ( lambda c: [t[0]] + c
            , choose(t[1:], n - 1)
            )
        yield from choose(t[1:], n)
def solve(t, q):
    def check(p):
        (x,y) = p
        return x - y == q or y - x == q
    return any(map(check, choose(t, 2)))
print(solve([5, 4, 8, -3, 6], 9))
print(solve([-1, 5, 4], 3))
True
False
Problem:
given an array of int aList and int target,
check whether the difference between any two elements in aList equals target
use recursion
do not use .sort()
do not use while and for
do not use import
Example:
>>> aList = [5, 4, 8, -3, 6]
>>> target = 9
return True
>>> aList = [-1, 5, 4]
>>> target = 3
return False
Comparing the differences:
       5    4    8   -3    6
    -------------------------
  5 |  X
  4 |  1    X
  8 |  3    4    X
 -3 |  8    7   11    X
  6 |  1    2    2    9    X
where X means that there's no difference (same number)
since we find 9 there, it should return True (target is 9)
Traditional for loops
To solve a recursion problem, first try to solve it with traditional for loops:
def compareAll(lst, tgt):
    for x in lst: # let's call this x loop
        for y in lst: # let's call this y loop
            if abs(x-y) == tgt:
                return True
    return False
print( compareAll([5,4,8,-3,6],9) )
print( compareAll([-1,5,4],3) )
This returns True then False
Recursion
Now we can try a recursive loop. Since we already have the for loop, we can convert it like this:
def compareAll(lst, tgt, x=0, y=0):
    if(len(lst)-1 == x and len(lst) == y):
        return False
    if(len(lst) == x or len(lst) == y):
        return compareAll(lst, tgt, x+1, 0)
    if(abs(lst[x] - lst[y])==tgt):
        return True
    return compareAll(lst, tgt, x, y+1)
print( compareAll([5,4,8,-3,6],9) )
print( compareAll([-1,5,4],3) )
How I converted the for loop into this:
Python's for loop is what most other languages call a foreach loop,
so a plain counter-based loop in Python looks like this:
def compareAll(lst, tgt):
    x = 0
    while x < len(lst): # let's call this x loop
        y = 0
        while y < len(lst): # let's call this y loop
            if abs(lst[x]-lst[y]) == tgt:
                return True
            y = y+1
        x = x+1
    return False
print( compareAll([5,4,8,-3,6],9) )
print( compareAll([-1,5,4],3) )
Notice the stopping condition of the x loop: it ends when every array element has been visited,
so we add the stopping condition: if(len(lst)-1 == x and len(lst) == y): return False
Notice the stopping condition of the y loop: it also ends when every array element has been visited,
so we add the stopping condition: if(len(lst) == x or len(lst) == y): return compareAll(lst, tgt, x+1, 0)
This stops the current y loop and continues with the x loop.
Then we add the actual body of the loop: if(abs(lst[x] - lst[y])==tgt): return True
Last, we have to continue the loop: return compareAll(lst, tgt, x, y+1)
The key to converting a for loop into a recursive loop is to identify when the loop should end and when it should continue.
This should work and is quite concise:
def q(target, aList, memo=None):
    # Use a fresh set per top-level call; a mutable default would be shared between calls
    if memo is None:
        memo = set()
    if len(aList) == 0:
        return False
    num = aList[-1]
    # A pair exists if a value exactly target above or below num was seen earlier
    if num + target in memo or num - target in memo:
        return True
    memo.add(num)
    return q(target, aList[:-1], memo)
q(target=9, aList=[5, 4, 8, -3, 6]) # True
q(target=3, aList=[-1,5,4]) # False
The key insight for me is that, given a target t and a number n, the values that would complete a pair are known. Dicts, sets and other hash maps are fast at detecting membership, regardless of how many items have been added. So just walk through the list of values and put them into a hash set for later comparison.
The problem can be solved by checking every possible pair of numbers in the list for a solution. If you are allowed to use Python's standard library, then the solution is pretty straightforward.
from itertools import product
def check(xs, target):
    return any(map(lambda x: x[0]-x[1] == target, product(xs, xs)))
Breakdown
product(xs, xs) gives the cross product of xs with itself
any(iterable) returns True if any element of iterable is truthy
map(function, iterable) lazily applies function to every element of iterable
lambda arg_tuple: expression is an anonymous function that takes arg_tuple and returns the result of expression
The return statement in check uses lazy structures, so it only does as much work as is needed and is space efficient.
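One thing to be aware of: product(xs, xs) also pairs every element with itself, so check returns True for target 0 on any non-empty list. If a pair has to come from two different positions, a sketch with itertools.combinations could look like this (check_distinct is an illustrative name, not part of the answer above):

```python
from itertools import combinations

def check_distinct(xs, target):
    # combinations yields each unordered pair of positions once,
    # so both orders of subtraction are tested explicitly
    return any(a - b == target or b - a == target
               for a, b in combinations(xs, 2))
```

It stays lazy in the same way: any stops at the first matching pair.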
Assuming that this is just an exercise in recursion, it probably doesn't preclude a "brute force" approach. You can use the recursion to pair up every value with the remaining ones until you find a matching difference.
For example:
def hasDiff(L,diff,base=None):
    if not L: return False # empty list: no match
    if base is None: # search with & without first as base
        return hasDiff(L[1:],diff,L[0]) or hasDiff(L[1:],diff)
    return abs(base-L[0]) == diff or hasDiff(L[1:],diff,base) # match or recurse
print(hasDiff([5, 4, 8, -3, 6],9)) # True
print(hasDiff([-1, 5, 4],3)) # False
When the function recurses with a base value, it merely checks the first item in the remainder of the list and recurses for the other values. When the function recurses without a base value, it tries to find new pairs that don't involve the first item (i.e. in the remainder of the list)

Python Two Sum - Brute Force Approach

I'm new to Python and have just started to try out LeetCode to build my chops. On this classic question my code misses a test case.
The problem is as follows:
Given an array of integers, return indices of the two numbers such that they add up to a specific target.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
Example:
Given nums = [2, 7, 11, 15], target = 9,
Because nums[0] + nums[1] = 2 + 7 = 9,
return [0, 1].
I miss on test case [3,2,4] with the target number of 6, which should return the indices of [1,2], but hit on test case [1,5,7] with the target number of 6 (which of course returns indices [0,1]), so it appears that something is wrong in my while loop, but I'm not quite sure what.
class Solution:
def twoSum(self, nums, target):
x = 0
y = len(nums) - 1
while x < y:
if nums[x] + nums[y] == target:
return (x, y)
if nums[x] + nums[y] < target:
x += 1
else:
y -= 1
self.x = x
self.y = y
self.array = array
return None
test_case = Solution()
array = [1, 5, 7]
print(test_case.twoSum(array, 6))
The output is None (null) for test case [3,2,4] with target 6, so indices 1 and 2 aren't even being summed. Could I be assigning y wrong?
A brute-force solution is to nest two loops over the list, where the inner loop only looks at indices greater than the one the outer loop is currently on.
class Solution:
    def twoSum(self, nums, target):
        for i, a in enumerate(nums, start=0):
            for j, b in enumerate(nums[i+1:], start=0):
                if a+b==target:
                    return [i, j+i+1]
test_case = Solution()
array = [3, 2, 4]
print(test_case.twoSum(array, 6))
array = [1, 5, 7]
print(test_case.twoSum(array, 6))
array = [2, 7, 11, 15]
print(test_case.twoSum(array, 9))
Output:
[1, 2]
[0, 1]
[0, 1]
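The while loop in the question moves two pointers inward, which only finds the pair when nums is sorted in ascending order. [3, 2, 4] is not sorted, which is why that test case fails while [1, 5, 7] passes. A sketch that keeps the two-pointer idea by walking the indices in order of their values (two_sum_two_pointers and order are names made up for this example):

```python
def two_sum_two_pointers(nums, target):
    # Indices of nums arranged so that their values are ascending;
    # this keeps the original positions available for the answer
    order = sorted(range(len(nums)), key=lambda i: nums[i])
    lo, hi = 0, len(order) - 1
    while lo < hi:
        total = nums[order[lo]] + nums[order[hi]]
        if total == target:
            return sorted((order[lo], order[hi]))
        if total < target:
            lo += 1      # need a larger sum
        else:
            hi -= 1      # need a smaller sum
    return None
```

The sort makes this O(N log N), which sits between the nested loops and a dictionary lookup.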
A slightly different approach: we build a dictionary as we go, keyed by the values we are still looking for. Each key maps to the index of the number that needs it. As soon as the current number is one of those keys, we have found the pair and are done. The running time is O(N).
class Solution:
    def twoSum(self, nums, target):
        look_for = {}
        for n,x in enumerate(nums):
            try:
                return look_for[x], n
            except KeyError:
                look_for.setdefault(target - x,n)
test_case = Solution()
array = [1, 5, 7]
array2 = [3,2,4]
given_nums=[2,7,11,15]
print(test_case.twoSum(array, 6))
print(test_case.twoSum(array2, 6))
print(test_case.twoSum(given_nums,9))
output:
(0, 1)
(1, 2)
(0, 1)
class Solution:
    def twoSum(self, nums, target):
        """
        :type nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        ls=[]
        l2=[]
        for i in nums:
            ls.append(target-i)
        for i in range(len(ls)):
            if ls[i] in nums :
                if i!= nums.index(ls[i]):
                    l2.append([i,nums.index(ls[i])])
        return l2[0]
x= Solution()
x.twoSum([-1,-2,-3,-4,-5],-8)
output
[2, 4]
import itertools
class Solution:
    def twoSum(self, nums, target):
        subsets = []
        for L in range(0, len(nums)+1):
            for subset in itertools.combinations(nums, L):
                if len(subset)!=0:
                    subsets.append(subset)
        print(subsets) #returns all the posible combinations as tuples, note not permutations!
        #sums all the tuples
        sums = [sum(tup) for tup in subsets]
        indexes = []
        #Checks sum of all the posible combinations
        if target in sums:
            i = sums.index(target)
            matching_combination = subsets[i] #gets the option
            for number in matching_combination:
                indexes.append(nums.index(number))
            return indexes
        else:
            return None
test_case = Solution()
array = [1,2,3]
print(test_case.twoSum(array, 4))
I tried your example for my own learning, and I am happy with what I found. I used itertools to generate every combination of the numbers, then a list comprehension to sum each combination, and then checked in one step whether the target is among the sums. If it is not, the function returns None; otherwise it returns the indexes. Please note that this approach considers combinations of any size, so it will also return three (or more) indexes if those numbers add up to the target. Sorry it took so long :)
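If only pairs are wanted, a sketch that combines indices rather than values avoids both the larger subsets and the nums.index lookup, which returns the first match when a value occurs twice. two_sum_pairs is an illustrative name, not part of the answer above.

```python
from itertools import combinations

def two_sum_pairs(nums, target):
    # Every unordered pair of positions (i < j), generated lazily
    for i, j in combinations(range(len(nums)), 2):
        if nums[i] + nums[j] == target:
            return [i, j]
    return None
```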
Here is a shorter version that checks every pair directly:
nums = [6, 7, 11, 15, 3, 6, 5, 3,99,5,4,7,2]
target = 27
for i in range(len(nums)):
    # only look at positions after i, so no element is paired with itself
    for n in range(i + 1, len(nums)):
        if nums[i]+nums[n] == target:
            # to find the target position
            print([i, n])
            # to get the actual numbers to add print([nums[i],nums[n]])

How to shuffle an array of numbers without two consecutive elements repeating?

I'm currently trying to get an array of numbers like this one randomly shuffled:
label_array = np.repeat(np.arange(6), 12)
The only constraint is that no two consecutive elements of the shuffle may be the same number. For that I'm currently using this code:
# Check if there are any occurrences of two consecutive
# elements being of the same category (same number)
num_occurrences = np.sum(np.diff(label_array) == 0)
# While there are any occurrences of this...
while num_occurrences != 0:
    # ...shuffle the array...
    np.random.shuffle(label_array)
    # ...create a flag for occurrences...
    flag = np.hstack(([False], np.diff(label_array) == 0))
    flag_array = label_array[flag]
    # ...and shuffle them.
    np.random.shuffle(flag_array)
    # Then re-assign them to the original array...
    label_array[flag] = flag_array
    # ...and check the number of occurrences again.
    num_occurrences = np.sum(np.diff(label_array) == 0)
Although this works for an array of this size, I don't know if it would work for much bigger arrays. And even so, it may take a lot of time.
So, is there a better way of doing this?
May not be technically the best answer, hopefully it suffices for your requirements.
import numpy as np
def generate_random_array(block_length, block_count):
    randoms_array = []
    for _ in range(block_count):
        nums = np.arange(block_length)
        np.random.shuffle(nums)
        # If this block starts with the value the previous block ended on, swap its ends
        if randoms_array and nums[0] == randoms_array[-1]:
            nums[0], nums[-1] = nums[-1], nums[0]
        randoms_array.extend(nums)
    return randoms_array
generate_random_array(block_length=1000, block_count=1000)
Here is a way to do it, for Python >= 3.6, using random.choices, which allows choosing from a population with weights.
The idea is to generate the numbers one by one. Each time we generate a new number, we exclude the previous one by temporarily setting its weight to zero. Then, we decrement the weight of the chosen one.
As @roganjosh duly noted, we have a problem at the end when we are left with more than one instance of the last value, and that can be really frequent, especially with a small number of values and a large number of repeats.
The solution I used is to insert these values back into the list where they don't create a conflict, with the short send_back function.
import random
def send_back(value, number, lst):
    idx = len(lst)-2
    for _ in range(number):
        while lst[idx] == value or lst[idx-1] == value:
            idx -= 1
        lst.insert(idx, value)
def shuffle_without_doubles(nb_values, repeats):
    population = list(range(nb_values))
    weights = [repeats] * nb_values
    out = []
    prev = None
    for i in range(nb_values * repeats):
        if prev is not None:
            # remove prev from the list of possible choices
            # by turning its weight temporarily to zero
            old_weight = weights[prev]
            weights[prev] = 0
        try:
            chosen = random.choices(population, weights)[0]
        except (IndexError, ValueError):
            # We are here because all of our weights are 0
            # (IndexError on Python 3.6-3.8, ValueError from 3.9 on),
            # which means that all is left to choose from
            # is old_weight times the previous value
            send_back(prev, old_weight, out)
            break
        out.append(chosen)
        weights[chosen] -= 1
        if prev is not None:
            # restore weight
            weights[prev] = old_weight
        prev = chosen
    return out
print(shuffle_without_doubles(6, 12))
[5, 1, 3, 4, 3, 2, 1, 5, 3, 5, 2, 0, 5, 4, 3, 4, 5,
3, 4, 0, 4, 1, 0, 1, 5, 3, 0, 2, 3, 4, 1, 2, 4, 1,
0, 2, 0, 2, 5, 0, 2, 1, 0, 5, 2, 0, 5, 0, 3, 2, 1,
2, 1, 5, 1, 3, 5, 4, 2, 4, 0, 4, 2, 4, 0, 1, 3, 4,
5, 3, 1, 3]
Some crude timing: it takes about 30 seconds to generate (shuffle_without_doubles(600, 1200)), so 720000 values.
I came here from "Creating a list without back-to-back repetitions from multiple repeating elements" (referred to as "problem A" below) while organising my notes, and there was no correct answer under problem A nor under the current question. The two problems also seem different, because problem A requires the same elements.
Basically, what you asked is the same as an algorithm problem (link) where randomness is not required. But when almost half of all the numbers are the same, the result can only look like "ABACADAEA...", where "ABCDE" are numbers. In the most-voted answer to that problem, a priority queue is used, so the time complexity is O(n log m), where n is the length of the output and m is the number of options.
For problem A, an easier way is to use itertools.permutations and randomly pick permutations such that each one starts with a different value than the previous one ended with, so the result looks "random".
I wrote some draft code here, and it works.
from itertools import permutations
from random import choice
def no_dup_shuffle(ele_count: int, repeat: int):
    """
    Return a shuffle of `ele_count` elements repeating `repeat` times.
    Needs ele_count >= 2 when repeat > 1; builds all ele_count! permutations.
    """
    # permutations() is a one-shot iterator, so materialise it once
    perms = list(permutations(range(ele_count)))
    res = []
    last = None
    for _ in range(repeat):
        curr = choice(perms)
        # redraw until this block does not start with the value the previous block ended on
        while last is not None and curr[0] == last[-1]:
            curr = choice(perms)
        res.extend(curr)
        last = curr
    return res
def test_no_dup_shuffle(count, rep):
    r = no_dup_shuffle(count, rep)
    assert len(r) == count * rep # check result length
    assert len(set(r)) == count # check all elements are used and in `range(count)`
    for i in range(1, len(r)): # check no duplicate
        assert r[i] != r[i - 1]
    print(r)
if __name__ == "__main__":
    test_no_dup_shuffle(5, 3)
    test_no_dup_shuffle(3, 17)
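Whichever approach is used, the result can be checked independently. The sketch below (can_avoid_repeats and is_valid_shuffle are illustrative names) tests two things: whether a repeat-free ordering can exist at all, which requires that the most frequent value fills at most every other slot, and whether a given output is a rearrangement of the input with no equal neighbours.

```python
from collections import Counter

def can_avoid_repeats(values):
    """True if some ordering of values has no two equal neighbours."""
    if len(values) == 0:
        return True
    most_common_count = max(Counter(values).values())
    # The most frequent value can occupy at most every other position
    return most_common_count <= (len(values) + 1) // 2

def is_valid_shuffle(original, shuffled):
    """True if shuffled rearranges original without adjacent duplicates."""
    same_items = Counter(original) == Counter(shuffled)
    no_repeats = all(a != b for a, b in zip(shuffled, shuffled[1:]))
    return same_items and no_repeats
```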

Finding median of list in Python

How do you find the median of a list in Python? The list can be of any size and the numbers are not guaranteed to be in any particular order.
If the list contains an even number of elements, the function should return the average of the middle two.
Here are some examples (sorted for display purposes):
median([1]) == 1
median([1, 1]) == 1
median([1, 1, 2, 4]) == 1.5
median([0, 2, 5, 6, 8, 9, 9]) == 6
median([0, 0, 0, 0, 4, 4, 6, 8]) == 2
Python 3.4 has statistics.median:
Return the median (middle value) of numeric data.
When the number of data points is odd, return the middle data point.
When the number of data points is even, the median is interpolated by taking the average of the two middle values:
>>> median([1, 3, 5])
3
>>> median([1, 3, 5, 7])
4.0
Usage:
import statistics
items = [6, 1, 8, 2, 3]
statistics.median(items)
#>>> 3
It's pretty careful with types, too:
statistics.median(map(float, items))
#>>> 3.0
from decimal import Decimal
statistics.median(map(Decimal, items))
#>>> Decimal('3')
(Works with python-2.x):
def median(lst):
    n = len(lst)
    s = sorted(lst)
    return (s[n//2-1]/2.0+s[n//2]/2.0, s[n//2])[n % 2] if n else None
>>> median([-5, -5, -3, -4, 0, -1])
-3.5
numpy.median():
>>> from numpy import median
>>> median([1, -4, -1, -1, 1, -3])
-1.0
For python-3.x, use statistics.median:
>>> from statistics import median
>>> median([5, 2, 3, 8, 9, -2])
4.0
The sorted() function is very helpful for this. Use the sorted function
to order the list, then simply return the middle value (or average the two middle
values if the list contains an even amount of elements).
def median(lst):
    sortedLst = sorted(lst)
    lstLen = len(lst)
    index = (lstLen - 1) // 2
    if (lstLen % 2):
        return sortedLst[index]
    else:
        return (sortedLst[index] + sortedLst[index + 1])/2.0
Of course you can use built-in functions, but if you would like to write your own, you can do something like this. The trick here is the ~ operator, which maps a non-negative number n to -(n + 1). For instance, ~2 -> -3, and a negative index in a Python list counts items from the end. So if mid == 2, data[mid] is the third element from the beginning and data[~mid] is the third element from the end.
def median(data):
    data.sort()
    mid = len(data) // 2
    return (data[mid] + data[~mid]) / 2
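To see why the mirrored index works for both parities, here is a small sketch that sorts a copy instead of the caller's list (median_by_mirrored_index is an illustrative name). Under Python 3 it always returns a float, and it raises IndexError on an empty list.

```python
def median_by_mirrored_index(data):
    ordered = sorted(data)      # a copy, so the caller's list is left untouched
    mid = len(ordered) // 2
    # For odd lengths ordered[mid] and ordered[~mid] are the same item;
    # for even lengths they are the two middle items
    return (ordered[mid] + ordered[~mid]) / 2
```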
Here's a cleaner solution:
def median(lst):
    quotient, remainder = divmod(len(lst), 2)
    if remainder:
        return sorted(lst)[quotient]
    return sum(sorted(lst)[quotient - 1:quotient + 1]) / 2.
Note: Answer changed to incorporate suggestion in comments.
You can try the quickselect algorithm if faster average-case running times are needed. Quickselect has average (and best) case performance O(n), although it can end up O(n²) on a bad day.
Here's an implementation with a randomly chosen pivot:
import random
def select_nth(n, items):
    pivot = random.choice(items)
    lesser = [item for item in items if item < pivot]
    if len(lesser) > n:
        return select_nth(n, lesser)
    n -= len(lesser)
    numequal = items.count(pivot)
    if numequal > n:
        return pivot
    n -= numequal
    greater = [item for item in items if item > pivot]
    return select_nth(n, greater)
You can trivially turn this into a method to find medians:
def median(items):
    if len(items) % 2:
        return select_nth(len(items)//2, items)
    else:
        left = select_nth((len(items)-1) // 2, items)
        right = select_nth((len(items)+1) // 2, items)
        return (left + right) / 2
This is very unoptimised, but it's not likely that even an optimised version will outperform Tim Sort (CPython's built-in sort) because that's really fast. I've tried before and I lost.
You can use the list.sort to avoid creating new lists with sorted and sort the lists in place.
Also you should not use list as a variable name as it shadows python's own list.
def median(l):
    half = len(l) // 2
    l.sort()
    if not len(l) % 2:
        return (l[half - 1] + l[half]) / 2.0
    return l[half]
def median(x):
    x = sorted(x)
    listlength = len(x)
    num = listlength//2
    if listlength%2==0:
        middlenum = (x[num]+x[num-1])/2
    else:
        middlenum = x[num]
    return middlenum
def median(array):
    """Calculate median of the given list.
    """
    # TODO: use statistics.median in Python 3
    array = sorted(array)
    half, odd = divmod(len(array), 2)
    if odd:
        return array[half]
    return (array[half - 1] + array[half]) / 2.0
A simple function to return the median of the given list:
def median(lst):
    lst = sorted(lst) # Sort the list first
    if len(lst) % 2 == 0: # Checking if the length is even
        # Applying formula which is sum of middle two divided by 2
        return (lst[len(lst) // 2] + lst[(len(lst) - 1) // 2]) / 2
    else:
        # If length is odd then get middle value
        return lst[len(lst) // 2]
Some examples with the median function:
>>> median([9, 12, 20, 21, 34, 80]) # Even
20.5
>>> median([9, 12, 80, 21, 34]) # Odd
21
If you want to use a library, you can simply do:
>>> import statistics
>>> statistics.median([9, 12, 20, 21, 34, 80]) # Even
20.5
>>> statistics.median([9, 12, 80, 21, 34]) # Odd
21
I posted my solution at Python implementation of "median of medians" algorithm , which is a little bit faster than using sort(). My solution uses 15 numbers per column, for a speed ~5N which is faster than the speed ~10N of using 5 numbers per column. The optimal speed is ~4N, but I could be wrong about it.
Per Tom's request in his comment, I added my code here, for reference. I believe the critical part for speed is using 15 numbers per column, instead of 5.
#!/bin/pypy
#
# TH #stackoverflow, 2016-01-20, linear time "median of medians" algorithm
#
import sys, random
items_per_column = 15
def find_i_th_smallest( A, i ):
    t = len(A)
    if(t <= items_per_column):
        # if A is a small list with at most items_per_column items, then:
        #
        # 1. do sort on A
        # 2. find i-th smallest item of A
        #
        return sorted(A)[i]
    else:
        # 1. partition A into columns of k items each. k is odd, say 5.
        # 2. find the median of every column
        # 3. put all medians in a new list, say, B
        #
        B = [ find_i_th_smallest(k, (len(k) - 1)//2) for k in [A[j:(j + items_per_column)] for j in range(0,len(A),items_per_column)]]
        # 4. find M, the median of B
        #
        M = find_i_th_smallest(B, (len(B) - 1)//2)
        # 5. split A into 3 parts by M, { < M }, { == M }, and { > M }
        # 6. find which above set has A's i-th smallest, recursively.
        #
        P1 = [ j for j in A if j < M ]
        if(i < len(P1)):
            return find_i_th_smallest( P1, i)
        P3 = [ j for j in A if j > M ]
        L3 = len(P3)
        if(i < (t - L3)):
            return M
        return find_i_th_smallest( P3, i - (t - L3))
# How many numbers should be randomly generated for testing?
#
number_of_numbers = int(sys.argv[1])
# create a list of random positive integers
#
L = [ random.randint(0, number_of_numbers) for i in range(0, number_of_numbers) ]
# Show the original list
#
# print(L)
# This is for validation
#
# print(sorted(L)[(len(L) - 1)//2])
# This is the result of the "median of medians" function.
# Its result should be the same as the above.
#
print(find_i_th_smallest( L, (len(L) - 1) // 2))
In case you need additional information on the distribution of your list, the percentile method will probably be useful. And a median value corresponds to the 50th percentile of a list:
import numpy as np
a = np.array([1,2,3,4,5,6,7,8,9])
median_value = np.percentile(a, 50) # return 50th percentile
print(median_value)
Here's what I came up with during this exercise in Codecademy:
def median(data):
    new_list = sorted(data)
    if len(new_list)%2 > 0:
        return new_list[len(new_list)//2]
    elif len(new_list)%2 == 0:
        return (new_list[(len(new_list)//2)] + new_list[(len(new_list)//2)-1]) /2.0
print(median([1,2,3,4,5,9]))
Just two lines are enough.
def get_median(arr):
    '''
    Calculate the median of a sequence.
    :param arr: list
    :return: int or float
    '''
    arr = sorted(arr)
    return arr[len(arr)//2] if len(arr) % 2 else (arr[len(arr)//2] + arr[len(arr)//2-1])/2
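Median function:
def median(midlist):
    midlist.sort()  # note: sorts the caller's list in place
    lens = len(midlist)
    if lens % 2 != 0:
        # odd length: take the middle element
        midl = lens // 2
        res = midlist[midl]
    else:
        # even length: average the two middle elements
        odd = lens // 2 - 1
        ev = lens // 2
        res = float(midlist[odd] + midlist[ev]) / float(2)
    return res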
I had some problems with lists of float values. I ended up using a code snippet from the Python 3 statistics.median source, and it works perfectly with float values without any imports:
def calculateMedian(values):
    data = sorted(values)
    n = len(data)
    if n == 0:
        return None
    if n % 2 == 1:
        return data[n // 2]
    else:
        i = n // 2
        return (data[i - 1] + data[i]) / 2
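If imports are acceptable, the standard-library `statistics` module that the snippet above was adapted from can be called directly. A small sketch, with made-up sample data; note that `statistics.median` raises `StatisticsError` on an empty list, whereas the snippet above returns `None`:

```python
import statistics

data = [2.5, 1.0, 4.0, 3.5]  # sample values chosen only for illustration

middle = statistics.median(data)      # average of the two middle values
low = statistics.median_low(data)     # lower of the two middle values
high = statistics.median_high(data)   # upper of the two middle values
print(middle, low, high)  # 3.0 2.5 3.5

try:
    statistics.median([])
except statistics.StatisticsError:
    print("empty input is an error here, not None")
```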
def midme(list1):
    list1.sort()
    if len(list1) % 2 > 0:
        x = list1[len(list1) // 2]
    else:
        x = (list1[len(list1) // 2 - 1] + list1[len(list1) // 2]) / 2
    return x
midme([4,5,1,7,2])
def median(array):
    if len(array) < 1:
        return None
    array = sorted(array)  # the middle element is only the median once the data is sorted
    if len(array) % 2 == 0:
        middle = array[len(array)//2-1: len(array)//2+1]
        return sum(middle) / len(middle)
    else:
        return array[len(array)//2]
I defined a median function for a list of numbers as
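def median(numbers):
    s = sorted(numbers)
    n = len(s)
    # (n - 1) // 2 and n // 2 are the same index for odd n and the two middle
    # indices for even n; this avoids round(), which rounds halves to the
    # nearest even number in Python 3
    return (s[(n - 1) // 2] + s[n // 2]) / 2.0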
import numpy as np
def get_median(xs):
    mid = len(xs) // 2  # index of the middle element
    if len(xs) % 2 == 1:  # odd length
        return sorted(xs)[mid]  # the middle element of the sorted list is the median
    else:
        # even length: average the two middle elements
        return 0.5 * np.sum(sorted(xs)[mid - 1:mid + 1])
print(get_median([7, 7, 3, 1, 4, 5]))
print(get_median([1,2,3, 4,5]))
A more generalized approach for median (and percentiles) would be:
def get_percentile(data, percentile):
    # Get the number of observations
    cnt = len(data)
    # Sort the list
    data = sorted(data)
    # Determine the (possibly fractional) split point
    i = (cnt - 1) * percentile
    # Fractional part of the split point, used as the interpolation weight
    diff = i - int(i)
    # Index of the value above the split point, clamped so percentile=1 stays in range
    upper = min(int(i) + 1, cnt - 1)
    # Return the weighted average of the values below and above the split point
    return data[int(i)] * (1 - diff) + data[upper] * diff
# Data
data=[1,2,3,4,5]
# For the median
print(get_percentile(data=data, percentile=.50))
# > 3.0
print(get_percentile(data=data, percentile=.75))
# > 4.0
# Note the weighted average when the split point is not a whole number
print(get_percentile(data=data, percentile=.51))
# > 3.04
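Try this:
import math
def find_median(arr):
    arr = sorted(arr)
    if len(arr) % 2 == 1:
        # odd length: the middle element
        med = math.ceil(len(arr) / 2) - 1
        return arr[med]
    else:
        # even length: average the two middle elements
        upper = len(arr) // 2
        return (arr[upper - 1] + arr[upper]) / 2
print(find_median([1,2,3,4,5,6,7,8]))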
Implement it:
def median(numbers):
    """
    Calculate the median of a list of numbers.
    :param numbers: the numbers to be calculated.
    :return: median value of numbers.
    >>> median([1, 3, 3, 6, 7, 8, 9])
    6
    >>> median([1, 2, 3, 4, 5, 6, 8, 9])
    4.5
    >>> import statistics
    >>> import random
    >>> numbers = random.sample(range(-50, 50), k=100)
    >>> statistics.median(numbers) == median(numbers)
    True
    """
    numbers = sorted(numbers)
    mid_index = len(numbers) // 2
    # The parity of the length, not of mid_index, decides whether to average
    return (
        (numbers[mid_index] + numbers[mid_index - 1]) / 2 if len(numbers) % 2 == 0
        else numbers[mid_index]
    )
if __name__ == "__main__":
    from doctest import testmod
    testmod()
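Function median:
import numpy as np
def median(d):
    d = np.sort(d)
    n2 = len(d) // 2
    if len(d) % 2 == 1:
        # odd length: the middle element
        med = d[n2]
    else:
        # even length: average the two middle elements
        med = (d[n2 - 1] + d[n2]) / 2
    return med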
Simply create a median function that takes a list of numbers as its argument, and call it.
def median(l):
    l = sorted(l)
    lent = len(l)
    if (lent % 2) == 0:
        # even length: average the two middle elements
        m = lent // 2
        result = (l[m - 1] + l[m]) / 2.0
    else:
        # odd length: the middle element
        m = lent // 2
        result = l[m]
    return result
What I did was this:
def median(a):
    a = sorted(a)
    if len(a) % 2 != 0:
        return a[len(a) // 2]
    else:
        return (a[len(a) // 2] + a[len(a) // 2 - 1]) / 2.0
Explanation: If the number of items in the list is odd, return the middle number. Otherwise, halving an even length gives the index of the upper of the two middle numbers, so the lower one sits at the index just before it (since the list is sorted). Add the two and divide by 2 to get the median.
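To make the index arithmetic in that explanation concrete, here is a tiny worked example for an even-length list. The variable names are only for illustration; in Python 3 the index has to be computed with `//`, because `/` returns a float that cannot be used as a list index.

```python
values = sorted([7, 1, 5, 3])  # -> [1, 3, 5, 7]
n = len(values)                # 4, an even length

upper = n // 2                 # index 2: the upper of the two middle items
lower = upper - 1              # index 1: the item just before it

median_value = (values[lower] + values[upper]) / 2
print(median_value)            # (3 + 5) / 2 = 4.0
```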
Here's the tedious way to find the median without using a median function:
def median(*arg):
    ordered = order(arg)  # use the sorted copy; the tuple itself is never reordered
    numArg = len(ordered)
    half = numArg // 2
    if numArg % 2 == 0:
        print((ordered[half-1] + ordered[half]) / 2.0)
    else:
        print(ordered[half])
def order(tup):
    ordered = list(tup)
    # repeat bubble-sort passes until a pass makes no swaps
    while test(ordered):
        pass
    return ordered
def test(ordered):
    swapped = False
    for i in range(len(ordered)-1):
        if ordered[i] > ordered[i+1]:
            ordered[i], ordered[i+1] = ordered[i+1], ordered[i]
            swapped = True  # run the loop again if you had to switch values
    return swapped
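It is very simple:
def median(alist):
    # to find the median you have to sort the list first
    sList = sorted(alist)
    first = 0
    last = len(sList) - 1
    lower = (first + last) // 2      # lower middle index
    upper = (first + last + 1) // 2  # upper middle index (same as lower for odd lengths)
    return (sList[lower] + sList[upper]) / 2.0
And you can use the return value like this: result = median(anyList)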

Why does my merge sort in Python exceed the maximum recursion depth?

I am following MIT's open course 6.046 "Introduction to Algorithms" on YouTube, and I was trying to implement merge sort in Python.
My code is
def merge(seq_list, start, middle, end):
    left_list = seq_list[start:middle]
    left_list.append(float("inf"))
    right_list = seq_list[middle:end]
    right_list.append(float("inf"))
    i = 0
    j = 0
    for k in range(start, end):
        if left_list[i] < right_list[j]:
            seq_list[k] = left_list[i]
            i = i + 1
        else:
            seq_list[k] = right_list[j]
            j = j + 1
def merge_sort(seq_list, start, end):
    if start < end:
        mid = len(seq_list)/2
        merge_sort(seq_list[0:mid], start, mid)
        merge_sort(seq_list[mid:], mid, end)
        merge(seq_list, start, mid, end)
And the unittest code is
import unittest
from sorting import *
class SortingTest(unittest.TestCase):
    def testMergeSort(self):
        test_list = [3, 4, 8, 0, 6, 7, 4, 2, 1, 9, 4, 5]
        merge_sort(test_list, 0, 9)
        self.assertEqual(test_list, [0, 1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 9])
    def testMerge(self):
        test_list = [13,17,18,9,2,4,5,7,1,2,3,6,0,38,12]
        merge(test_list, 4, 8, 12)
        self.assertEqual(test_list, [13,17,18,9,1,2,2,3,4,5,6,7,0,38,12])
if __name__ == "__main__":
    unittest.main()
The function merge() seems to work perfectly, but the merge_sort() function is wrong, and I don't know what's going on. The terminal shows me:
RuntimeError: maximum recursion depth exceeded
You need to add a base case for when the list is empty or of size 1; otherwise you keep "shrinking" an empty list (and actually stay with the same list).
EDIT:
Also, I think the error actually derives from a different bug: you sometimes use len(seq_list) and sometimes start and end. You should stick to one of them.
mid = len(seq_list)/2
merge_sort(seq_list[0:mid], start, mid)
merge_sort(seq_list[mid:], mid, end)
Have a look at the test case [0,1,2,3]:
start = 0, end = 3 -> mid = 2
Now you recurse with
merge_sort([2,3],2,3) #2 == mid, 3 == end
And later you will set:
mid = len([2,3])/2 == 1
and try recursing again with
merge_sort([3],1,3)
You will never reach the "stop condition" of start >= end, because end never changes, and is out of the current list's bounds!
Another bug:
merge_sort(seq_list[0:mid], start, mid)
does not do anything to seq_list. The slice creates a new list object, and only that copy is changed by the recursive call, so merge() then runs on halves of seq_list that were never sorted.
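Putting the three points together (a base case, indices only, no slicing in the recursive calls), one possible corrected version looks like the sketch below. It assumes that end is exclusive, so the call for a whole list is merge_sort(lst, 0, len(lst)), not 0 and 9 as in the test above, and that the data never contains float("inf"), since merge() uses it as a sentinel.

```python
def merge(seq_list, start, middle, end):
    """Merge the sorted runs seq_list[start:middle] and seq_list[middle:end] in place."""
    # Sentinels let the loop run without bounds checks (assumes no inf in the data).
    left_list = seq_list[start:middle] + [float("inf")]
    right_list = seq_list[middle:end] + [float("inf")]
    i = j = 0
    for k in range(start, end):
        if left_list[i] <= right_list[j]:
            seq_list[k] = left_list[i]
            i += 1
        else:
            seq_list[k] = right_list[j]
            j += 1


def merge_sort(seq_list, start, end):
    """Sort seq_list[start:end] in place; end is exclusive."""
    if end - start < 2:
        return  # base case: zero or one element is already sorted
    mid = (start + end) // 2  # midpoint of the current range, not of the whole list
    merge_sort(seq_list, start, mid)  # pass the same list object, never a slice
    merge_sort(seq_list, mid, end)
    merge(seq_list, start, mid, end)


test_list = [3, 4, 8, 0, 6, 7, 4, 2, 1, 9, 4, 5]
merge_sort(test_list, 0, len(test_list))
print(test_list)  # [0, 1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 9]
```

Each call now halves the range between start and end, so the recursion depth is about log2(n) and the recursion limit is never reached.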
