I am taking a course in discrete mathematics, and our book covers various sorting algorithms. To understand them better, I tried to translate one of these algorithms into Python. However, the function returns unexpected output and I cannot see where my error is. Please have a look below. Any help is much appreciated.
### Find max ###
# A = Array of values to find max from.
# n = Length of array A. Can also be described as the number of loops the array will perform
A = [100, 3, 7, 15, 17, 19, 25, 31, 32, 8, 21, 5, 51, 64, 63]
n = len(A) #len: python command to retrieve length of an array.
def find_max(A, n):
    max = 0
    for i in range(0, n):
        if A[i] > max:
            max = i
    return max
### Input A and N in the algorithm and print the output ###
print find_max(A, n)
Here the expected output should be 0, since the first entry in the array has the highest value. However, the script returns 14, which is the highest index in the array.
I would like the Python script to resemble the pseudo-code as much as possible, simply so that it is easier for us new students to compare them to each other. This is the pseudo-code from our book:
find_max(A, n)
    max = 0
    for i = 0 to n-1
        if (A[i] > A[max]) max = i
    return max
Why it doesn't work: your attempt mixes indices and values. It compares the value A[i] with max, but then stores the index i in max.
To make it look like the pseudo-code (with an added check so that an empty array raises an error instead of returning 0):
def find_max(A, n):
    if not A:
        raise Exception("empty array")
    max = 0
    for i in range(1, n):  # no need to start at 0, already covered
        if A[i] > A[max]:
            max = i
    return max
In conclusion, the most efficient and Pythonic way is probably to use enumerate to pair indices with values, and the built-in max with a lambda as key so that max compares the values:
max(enumerate(A),key=lambda x:x[1])[0]
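As a quick check of the one-liner, here is a minimal sketch that wraps it in a function and runs it on the list from the question. The name find_max_index and the empty-input check are only illustrative additions, not part of the original answer.

```python
def find_max_index(values):
    """Return the index of the first maximum in values (illustrative helper)."""
    if not values:
        raise ValueError("empty sequence")
    # enumerate yields (index, value) pairs; the key makes max compare the values
    index, _ = max(enumerate(values), key=lambda pair: pair[1])
    return index


A = [100, 3, 7, 15, 17, 19, 25, 31, 32, 8, 21, 5, 51, 64, 63]
print(find_max_index(A))  # 0
```

When several entries share the maximum value, max keeps the first pair it sees, so the lowest index is returned, just like the loop with a strict > comparison.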
First, you shouldn't use max as a variable name, because it shadows Python's built-in max function. Second, your variable max (let's call it mx) holds the index of the maximum value, not the value itself, so you have to compare A[i] with A[mx]. Here is a solution to your problem:
A = [17, 19, 25, 31, 32, 3, 7, 15, 8, 21, 100, 5, 51, 64, 63]
n = len(A)
def find_max(A, n):
    mx = 0  # we call it mx
    for i in range(1, n):  # since mx = 0, no need to start at 0
        if A[i] > A[mx]:  # we compare the number in A[i] with the number in A[mx], not with mx
            mx = i
    return mx  # index
print(find_max(A, n))  # => 10
This would get the job done if you want the maximum value rather than its index:
def find_max(A, n):
    max = 0  # assumes A contains no negative-only data
    for i in range(0, n):
        if A[i] > max:
            max = A[i]
    return max
Or you can use the built-in max function:
result = max(A)
How can we change this recursive function into a "for" iteration?
Note: Greedy algorithms should not be used, because they are less accurate.
Reason for the change: to make the function more efficient by switching to a dynamic programming algorithm.
What I've tried:
# this code is trash
for i in range(1,m+1):
n1=m-coins[i]
for i in range(1,m+1):
n2=m-coins[i]
return min(n1,n2)
while n1==0:
i=i+1
n1=m-coins[i]
return n1
Besides this, I tried using a dictionary and combinations, but I abandoned that because I thought it wouldn't work.
How the function behaves: it takes a list of coins and a value m, and counts the coins needed to bring m down to 0.
n1 means moving on without using coins[0].
n2 means definitely using coins[0] and moving on to the next level.
If I had the list [5,4,2,1] and m = 10, the first call would recurse with n1 = coin_count([4,2,1], 10) and n2 = coin_count([5,4,2,1], 5) + 1.
coins = [50,40,20,10,5,4,2,1]
m = 80
def coin_count(coins, m):
    if m > 0:
        while len(coins) > 0 and m < coins[0]:
            coins = coins[1:]
        #Below is the part I want to change.
        if coins[0] > 1:
            n1=coin_count(coins[1:],m)
            n2=coin_count(coins,m-coins[0])+1
            return min(n1,n2)
        else: # coins[0] == 1
            return m
        #Above is the part I want to change
    else: # m == 0
        return 0
Here is my solution to your function using a while loop. It returns a list of coins that sums to m, which I suppose is what you were looking for. Note that it is greedy, so the list is not guaranteed to be the smallest possible one: for m = 80 with the coins above it returns [50, 20, 10], although [40, 40] is shorter.
How it works: in each iteration of the while loop, it takes the current largest coin, adds as many copies of it as fit into the remaining m, and sets m to the remainder.
def coin_count(coins, m):
    coins = sorted(coins, reverse=True)
    out = []
    while m > 0:
        cur_coin = coins.pop(0)
        out.extend([cur_coin] * (m // cur_coin))
        m %= cur_coin
    return out
NB: If the input coins are always sorted in descending order, you can remove the first line.
Example:
coins = [50, 40, 20, 10, 5, 4, 2, 1]
m = 128
print(coin_count(coins, m))
[50, 50, 20, 5, 2, 1]
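Since the question asks for an iterative dynamic-programming version rather than a greedy one, here is a minimal bottom-up sketch. It returns the minimum number of coins (like the recursive function in the question) instead of a list. The name min_coin_count is made up for illustration, and the sketch assumes every coin can be reused without limit.

```python
def min_coin_count(coins, m):
    """Minimum number of coins summing to m, or float('inf') if impossible."""
    INF = float("inf")
    # best[amount] = fewest coins needed to make `amount`
    best = [0] + [INF] * m
    for amount in range(1, m + 1):
        for coin in coins:
            if coin <= amount and best[amount - coin] + 1 < best[amount]:
                best[amount] = best[amount - coin] + 1
    return best[m]


coins = [50, 40, 20, 10, 5, 4, 2, 1]
print(min_coin_count(coins, 80))   # 2  (40 + 40)
print(min_coin_count(coins, 128))  # 5  (for example 50 + 50 + 20 + 4 + 4)
```

The table is filled from small amounts to large ones, so each entry only depends on entries that are already final. The running time is proportional to m times the number of coins.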
I'm trying to find out whether a target sum can be formed from an infinite stream of numbers in Python. The numbers are positive (> 0) and unique, and the stream has no fixed length. I believe the answer is to use dynamic programming or a heap, but I can't quite figure out the logic.
Any help on a possible data structure or flow of logic to try would be appreciated.
Thank you so much.
e.g
nums = [ 99,85,1,3,6,72,7,9,22,....]
targetSum = 27
output: True
Explanation: 1+6+22 = 27(targetSum)
You can use a set to keep track of all the sums that can be formed from the numbers seen so far. For each new number, add it to every existing sum in the set and store the results in the set, then add the number itself to the set. Sums larger than the target can be discarded because all numbers are positive. Return True as soon as the target sum is in the set:
def has_sum(nums, targetSum):
    sums = set()
    for i in nums:
        sums.update([s + i for s in sums if s + i <= targetSum])
        if i <= targetSum:
            sums.add(i)
        if targetSum in sums:
            return True
    return False
so that:
has_sum([99, 85, 1, 3, 6, 72, 7, 9, 22], 29)
returns True (because 1 + 6 + 22 = 29), and that:
has_sum([99, 85, 1, 3, 6, 72, 7, 9, 22], 27)
returns False (because the expected output in your question is incorrect).
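Because the function only iterates over nums once, it also works on a generator, which is closer to the "infinite stream" in the question. The sketch below repeats the function so that it is self-contained; endless_stream is a made-up stand-in for the real stream. Keep in mind that on a truly endless stream the function only returns when the answer is True; if the target is never reachable, it keeps consuming numbers forever.

```python
import itertools


def has_sum(nums, target_sum):
    sums = set()  # every subset sum <= target_sum seen so far
    for number in nums:
        sums.update([s + number for s in sums if s + number <= target_sum])
        if number <= target_sum:
            sums.add(number)
        if target_sum in sums:
            return True
    return False


def endless_stream():
    """Hypothetical stream: the sample numbers, then 100, 101, 102, ... forever."""
    yield from [99, 85, 1, 3, 6, 72, 7, 9, 22]
    yield from itertools.count(100)


print(has_sum(endless_stream(), 29))  # True, stops right after reading 22
```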
@Pavis,
One idea that I can think of is to use a tight graph. You can do the following:
add each number that is less than the target sum to the graph as a node
connect each new node to every other node already present in the graph
after each addition to the graph, compute the sum using breadth-first search from the lowest element
I think this process is slow, though.
You can recursively try to satisfy the target sum with or without the first number in the given sequence, until there is no more number in the given sequence, or until the given target sum is no longer positive (since you mentioned in the comment that all given numbers are positive):
def has_sum(nums, targetSum):
    if not nums or targetSum <= 0:
        return False
    first, *rest = nums
    return first == targetSum or has_sum(rest, targetSum - first) or has_sum(rest, targetSum)
so that:
has_sum([99, 85, 1, 3, 6, 72, 7, 9, 22], 29)
returns True (because 1 + 6 + 22 = 29), and that:
has_sum([99, 85, 1, 3, 6, 72, 7, 9, 22], 27)
returns False (because the expected output in your question is incorrect).
EDIT: To avoid performance impact from copying the input sequence minus the first item to rest in every call, the code above can be improved with an index instead:
def has_sum(nums, targetSum, index=0):
    if index == len(nums) or targetSum <= 0:
        return False
    num = nums[index]
    return num == targetSum or has_sum(nums, targetSum - num, index + 1) or has_sum(nums, targetSum, index + 1)
I am a total Python newbie, so pardon me for this stupid question.
The composition() function returns a certain value.
However, I only want values that are <= 10, and I want 100 of them.
The code below counts how many calls to composition() it takes to simulate 100 values that are <= 10.
def find():
    i=1
    aaa=composition()
    while(aaa>10):
        aaa=composition()
        i=i+1
    return i
Customers_num=[find() for i in range(100)]
Customers_num=np.array(Customers_num)
print(np.sum(Customers_num))
However, suppose that the code above returns 150.
I also want to know all the values that were simulated in those 150 calls to composition().
What kind of code should I start with?
I am thinking about combining it with an if/else statement and appending the values to an empty list, but so far my code has been a total disaster.
def find():
i=1
aaa=composition()
bbb=[]
if aaa<=10:
bbb.appendd([aaa])
else:
bbb.append([aaa])
aaa=composition()
bbb.appendd([aaa])
while(aaa>10):
i=i+1
if aaa>10:
bbb.append([aaa])
else:
bbb.append([aaa])
return i,bbb
find()
Thanks in advance!
You could make a generator that yields values from composition() and stops once n of them are below the threshold. The resulting list includes all the numbers, so you have all the intermediate values, and its length is how "long" it took to generate them. For example (with a fake random composition() function, and using < 10 as the condition; change it to <= 10 to match the question):
from random import randint
def composition():
    return randint(0, 20)
def getN(n):
    while n > 0:
        c = composition()
        if c < 10:
            n -= 1
        yield c
values = list(getN(10))  # get 10 values less than 10
# [2, 0, 11, 15, 12, 8, 16, 16, 2, 8, 10, 3, 14, 2, 9, 18, 6, 11, 1]
time = len(values)
# 19
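For the exact numbers in the question (100 accepted values, threshold <= 10), the same idea can also be written as a plain loop that returns both the call count and every simulated value. This is only a sketch: composition here is a hypothetical random stand-in for the asker's function, and simulate is a made-up name.

```python
import random


def composition(rng):
    """Hypothetical stand-in for the asker's composition() function."""
    return rng.randint(0, 20)


def simulate(wanted, limit, rng):
    """Call composition() until `wanted` results are <= limit.

    Returns (number_of_calls, all_values_in_call_order).
    """
    values = []
    accepted = 0
    while accepted < wanted:
        value = composition(rng)
        values.append(value)  # keep rejected values too
        if value <= limit:
            accepted += 1
    return len(values), values


rng = random.Random(0)  # fixed seed so the run is repeatable
calls, values = simulate(wanted=100, limit=10, rng=rng)
accepted_values = [v for v in values if v <= 10]
print(calls, len(accepted_values))
```

The first number printed is the total number of calls (the "150" in the question), and values holds everything that was simulated along the way.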
I'm looking for an efficient way to extract only the significant values from an array in Python, for instance only those that are 10 times bigger than the rest. The logic (no code), using a very simple case, is something like this:
array = [5000, 400, 40, 10, 1, 35] # here the significant value will be 5000.
from i=0 to len.array # to run the procedure in all the array components
delta = array[i] / array [i+1] # to confirm that array[i] is significant or not.
if delta >= 10 : # assuming a rule of 10X significance i.e significance = 10 times bigger than the rest of elements in the array.
new_array = array[i] # Insert to new_array the significant value
elif delta <= 0.1 : # in this case the second element is the significant.
new_array = array[i+1] # Insert to new_array the significant value
At the end, new_array will consist of the significant values, in this case new_array = [5000], but the procedure must apply to any kind of array.
Thanks for your help!
UPDATE!!!
Thanks to all for your answers!!! in particular to Copperfield who gave me a good idea about how to do it. Here is the code that's working for the purpose!
array_o = [5000,4500,400, 4, 1, 30, 2000]
array = sorted(array_o)
new_array = []
max_array = max(array)
new_array.append(max_array)
array.remove(max_array)
for i in range(0,len(array)):
    delta = max_array / array[i]
    if delta <= 10:
        new_array.append(array[i])
Does this answer your question?
maxNum = max(array)
array.remove(maxNum)
SecMaxNum = max(array)
if maxNum / SecMaxNum >= 10:
    pass  # take action accordingly
else:
    pass  # take action accordingly
Your pseudo-code can be translated to this function:
def function(array):
    new_array = []
    for i in range(1, len(array)):
        delta = array[i-1] / array[i]
        if delta >= 10:
            new_array.append(array[i-1])
        elif delta <= 0.1:
            new_array.append(array[i])
    return new_array
This gives the following result:
>>> function([5000, 400, 40, 10, 1, 35])
[5000, 400, 10, 35]
>>>
Now, what you describe can be done like this in Python 3, using extended unpacking:
*rest, secondMax, maxNum = sorted(array)
if maxNum / secondMax >= 10:
    pass  # take action accordingly
else:
    pass  # take action accordingly
or, in earlier versions:
sortedArray = sorted(array)
if sortedArray[-1] / sortedArray[-2] >= 10:
    pass  # take action accordingly
else:
    pass  # take action accordingly
(A negative index accesses the elements from last to first, so -1 is the last one, -2 the second to last, etc.)
I would not adopt the approach of only comparing each value to the one next to it. If the array is unsorted then obviously that's a disaster, but even if it is sorted:
a = [531441, 59049, 6561, 729, 81, 9, 9, 8, 6, 6, 5, 4, 4, 4, 3, 3, 1, 1, 1, 1]
In that example, the "rest" (i.e. majority) of the values are <10, but I've managed to get up into the 6-digit range very quickly with each number only being 9 times the one next to it (so, your rule would not be triggered).
One approach to outlier detection is to subtract the median from your distribution and divide by a non-parametric statistic that reflects the spread of the distribution (below, I've chosen a denominator that would be equivalent to the standard deviation if the numbers were normally distributed). That gives you an "atypicality" score on a standardized scale. Find the large values, and you have found your outliers (any score larger than, say, 3—but you may need to play around a bit to find the cutoff that works nicely for your problem).
import numpy
npstd = numpy.diff(numpy.percentile(a, [16, 84]))/2.0 # non-parametric "standard deviation" equivalent
score = (a - numpy.median(a)) / npstd
outlier_locations, = numpy.where(score > 3) # 3, 4 or 5 might work well as cut-offs
I have a list, for example:
l = [10, 20, 30, 40, 50, 60]
I need to increment the first n elements of the list given a condition. The condition is independent of the list. For example if n = 3, the list l should become :
l = [11, 21, 31, 40, 50, 60]
I understand that I can do it with a for loop over each element of the list, but I need to perform this operation around 150 million times, so I am looking for a faster method. Any help is highly appreciated. Thanks in advance.
Here's an operation-aggregating implementation in NumPy:
import numpy
initial_array = numpy.array(l)  # whatever your l is, but as a NumPy array
increments = numpy.zeros_like(initial_array)
...
# every time you want to increment the first n elements
if n:
    increments[n-1] += 1
...
# to apply the increments
initial_array += increments[::-1].cumsum()[::-1]
This is O(ops + len(initial_array)), where ops is the number of increment operations. Unless you're only doing a small number of increments over a very small portion of the list, this should be much faster. Unlike the naive implementation, it doesn't let you retrieve element values until the increments are applied; if you need to do that, you might need a solution based on a BST or BST-like structure to track increments.
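If values do have to be read between increments, one BST-like structure that fits is a Fenwick tree (binary indexed tree) built over a difference array. That choice is my own substitution for illustration; the answer does not name a specific structure. The class and method names below are hypothetical. Both incrementing the first n elements and reading one element take O(log len) time.

```python
class PrefixIncrementer:
    """Supports 'add to the first n elements' and 'read element i' in O(log n)."""

    def __init__(self, values):
        self.values = list(values)
        self.size = len(self.values)
        # 1-based Fenwick tree storing a difference array of pending increments
        self.tree = [0] * (self.size + 1)

    def _add(self, index, delta):
        # Add delta at 0-based position `index` of the difference array.
        i = index + 1
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i

    def increment_first(self, n, amount=1):
        """Add `amount` to elements 0 .. n-1."""
        n = min(n, self.size)
        if n <= 0:
            return
        self._add(0, amount)        # increment starts at position 0 ...
        if n < self.size:
            self._add(n, -amount)   # ... and is cancelled from position n on

    def get(self, index):
        """Current value of the element at `index`."""
        total = 0
        i = index + 1
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return self.values[index] + total


p = PrefixIncrementer([10, 20, 30, 40, 50, 60])
p.increment_first(3)
print([p.get(i) for i in range(6)])  # [11, 21, 31, 40, 50, 60]
```

The prefix sum of the difference array up to index i equals the total increment applied to element i, which is why a point read is a Fenwick prefix query.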
An O(n + m) algorithm idea, where m is the number of queries and n is the length of the list to increment:
Since you only ever increment from the start up to some k-th element, every increment is a prefix range. Let an increment be the pair (up to position, increment by). Example:
(1, 2) - increment positions 0 and 1 by 2
To calculate the value at position k, we add to the current value at position k every increment whose position is greater than or equal to k. How can we quickly calculate the sum of those increments? We can process the list from the back and keep a running sum of the increments seen so far.
Proof of concept:
# list to increment
a = [1, 2, 5, 1, 6]
# (up to and including k-th index, increment by value)
queries = [(1, 2), (0, 10), (3, 11), (4, 3)]
# described algorithm implementation
increments = [0]*len(a)
for position, inc in queries:
    increments[position] += inc
got = list(a)
increments_sum = 0
for i in range(len(increments) - 1, -1, -1):
    increments_sum += increments[i]
    got[i] += increments_sum
# verify that solution is correct using slow but correct algorithm
expected = list(a)
for position, inc in queries:
    for i in range(position + 1):
        expected[i] += inc
print('Expected:', expected)
print('Got:', got)
output:
Expected: [27, 18, 19, 15, 9]
Got: [27, 18, 19, 15, 9]
You can create a simple data structure on top of your list which stores the start and end range of each increment operation. The start would be 0 in your case so you can just store the end.
This way you don't have to actually traverse the list to increment the elements, but you only retain that you performed increments on ranges for example {0 to 2} and {0 to 3}. Furthermore, you can also collate some operations, so that if multiple operations increment until the same index, you only need to store one entry.
The worst case complexity of this solution is O(q + g x qlogq + n) where g is the number of get operations, q is the number of updates and n is the length of the list. Since we can have at most n distinct endings for the intervals this reduces to O(q + nlogn + n) = O(q + nlogn). A naive solution using an update for each query would be O(q * l) where l (the length of a query) could be up to the size of n giving O(q * n). So we can expect this solution to be better when q > log n.
Working python example below:
import collections
class RangeStructure(object):
    def __init__(self, l):
        self.ranges = collections.defaultdict(int)
        self.l = l
    def incToPosition(self, k):
        # record one increment of the first k elements
        self.ranges[k] += 1
    def get(self):
        res = list(self.l)  # copy, so repeated calls do not re-apply increments
        sorted_keys = sorted(self.ranges)
        last = len(sorted_keys) - 1
        to_add = 0
        while last >= 0:
            start = 0 if last < 1 else sorted_keys[last - 1]
            end = sorted_keys[last]
            to_add += self.ranges[end]
            for i in range(start, end):
                res[i] += to_add
            last -= 1
        return res
rs = RangeStructure([10, 20, 30, 40, 50, 60])
rs.incToPosition(2)
rs.incToPosition(2)
rs.incToPosition(3)
rs.incToPosition(4)
print(rs.get())
And an explanation:
after the inc operations, ranges will contain (start, end, inc) tuples of the form (0, 2, 2), (0, 3, 1), (0, 4, 1); these will be represented in the dict as {2: 2, 3: 1, 4: 1} since the start is always 0 and can be omitted
during the get operation, we ensure that we only operate on any list element once; we sort the ranges in increasing order of their end point, and traverse them in reverse order updating the contained list elements and the sum (to_add) to be added to subsequent ranges
This prints, as expected:
[14, 24, 32, 41, 50, 60]
You can use a list comprehension for the first n elements and append the rest of the list (here a is the list):
[x + 1 for x in a[:n]]+a[n:]