I'm currently looking for an algorithm to be able to go through a list such as the following one: [1,1,1,1,2,3,4,5,5,5,3,2]
In this example, I want to select the first "1", since there's a duplicate next to it, then keep going through the list until I find the next number that has a duplicate next to it, and select the last occurrence of that number (i.e. "5" in this example).
Finally, compute the difference between these two numbers (i.e. 5 - 1).
I have this code at the moment:
i = 0
for i in range(len(X)):
    if (X[i] == X[i+1]):
        first_number = X[i]
    elif (X[i] != X[i+1]):
        i += 1
I'd like to add a further condition to my question. Suppose you have the following list: lst=[1,1,1,1,2,3,4,5,5,5,3,3,3,3,2,2,2,4,3]. In this case, your code gives the differences [4,-2,-1] and then stops. However, I'd like "4-2" to be appended to the list as well, because "4" is followed by a number less than "4" (so the direction changes at "4": the values go up from "2" to "4" and then down again). I hope this is clear enough. Many thanks
You can use enumerate with a starting index of 1, so that l[i] is the element after the current value v. A value counts as a duplicate if it is equal to the value at the next index:
l = [1,1,1,1,2,3,4,5,5,5,3,2]
r = [v for i, v in enumerate(l, 1) if i < len(l) and v == l[i]]
result = r[-1] - r[0]
# 4
The list r is a list of all duplicates. r[-1] is the last item and r[0] is the first.
More trials:
>>> l= [1,1,5,5,5,2,2]
>>> r = [v for i, v in enumerate(l, 1) if i < len(l) and v == l[i]]
>>> r[-1] - r[0]
1
Solution:
def subDupeLimits( aList ):
    dupList = []
    prevX = None
    for x in aList:
        if x == prevX:
            dupList.append(x) # track duplicates
        prevX = x # update previous x
    # return last duplicate minus first
    return dupList[-1] - dupList[0]
# call it
y = subDupeLimits( [1,1,1,1,2,3,4,5,5,5,3,2] )
# y = 4
You can use itertools.groupby to find groups of repeating numbers, then find the difference between the first two of those:
>>> import itertools
>>> lst = [1,1,1,1,2,3,4,5,5,5,3,2]
>>> duplicates = [k for k, g in itertools.groupby(lst) if len(list(g)) > 1]
>>> duplicates[1] - duplicates[0]
4
Or use duplicates[-1] - duplicates[0] if you want the difference between the first and the last repeated number.
In the more general case, if you want the difference between all pairs of consecutive repeated numbers, you could combine that with zip:
>>> lst = [1,1,1,1,2,3,4,5,5,5,3,3,3,3,2,2,2]
>>> duplicates = [k for k, g in itertools.groupby(lst) if len(list(g)) > 1]
>>> duplicates
[1, 5, 3, 2]
>>> [x - y for x,y in zip(duplicates, duplicates[1:])]
[-4, 2, 1]
I think now I got what you want: You want the difference between any consecutive "plateaus" in the list, where a plateau is either a repeated value, or a local minimum or maximum. This is a bit more complicated and will take several steps:
>>> lst=[1,1,1,1,2,3,4,5,5,5,3,3,3,3,2,2,2,4,3]
>>> plateaus = [lst[i] for i in range(1, len(lst)-1) if lst[i] == lst[i-1]
... or lst[i-1] <= lst[i] >= lst[i+1]
... or lst[i-1] >= lst[i] <= lst[i+1]]
>>> condensed = [k for k, g in itertools.groupby(plateaus)]
>>> [y-x for x, y in zip(condensed, condensed[1:])]
[4, -2, -1, 2]
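The steps above can be wrapped in a single function. This is only a minimal sketch: the name plateau_differences is my own (it does not come from the question), and it simply repeats the three steps shown in the session above.

```python
from itertools import groupby

def plateau_differences(values):
    """Differences between consecutive plateaus (repeats or local extrema)."""
    # Step 1: keep interior values that repeat their predecessor
    # or that are a local maximum or minimum.
    plateaus = [
        values[i]
        for i in range(1, len(values) - 1)
        if values[i] == values[i - 1]
        or values[i - 1] <= values[i] >= values[i + 1]
        or values[i - 1] >= values[i] <= values[i + 1]
    ]
    # Step 2: collapse runs of equal values into a single entry.
    condensed = [key for key, _ in groupby(plateaus)]
    # Step 3: difference between each plateau and the one before it.
    return [b - a for a, b in zip(condensed, condensed[1:])]

print(plateau_differences([1,1,1,1,2,3,4,5,5,5,3,3,3,3,2,2,2,4,3]))  # [4, -2, -1, 2]
```

For the shorter list from the original question, [1,1,1,1,2,3,4,5,5,5,3,2], the same function returns [4].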
I'm trying to solve this problem on the easy section of coderbyte and the prompt is:
Have the function ArrayAdditionI(arr) take the array of numbers stored in arr and return the string true if any combination of numbers in the array can be added up to equal the largest number in the array, otherwise return the string false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output should return true because 4 + 6 + 10 + 3 = 23. The array will not be empty, will not contain all the same elements, and may contain negative numbers.
Here's my solution.
def ArrayAddition(arr):
    arr = sorted(arr, reverse=True)
    large = arr.pop(0)
    storage = 0
    placeholder = 0
    for r in range(len(arr)):
        for n in arr:
            if n + storage == large: return True
            elif n + storage < large: storage += n
            else: continue
        storage = 0
        if placeholder == 0: placeholder = arr.pop(0)
        else: arr.append(placeholder); placeholder = arr.pop(0)
    return False
print ArrayAddition([2,95,96,97,98,99,100])
I'm not even sure if this is correct, but it seems to cover all the numbers I plug in. I'm wondering if there is a better way to solve this through algorithm which I know nothing of. I'm thinking a for within a for within a for, etc loop would do the trick, but I don't know how to do that.
What I have in mind is accomplishing this with A+B, A+C, A+D ... A+B+C ... A+B+C+D+E
e.g)
for i in range(len(arr)):
    print "III: III{}III".format(i)
    storage = []
    for j in range(len(arr)):
        print "JJ: II({}),JJ({})".format(i,j)
        for k in range(len(arr)):
            print "K: I{}, J{}, K{}".format(i,j,k)
I've searched all over and found suggestions to use itertools, but I'm wondering if there is a way to write this code in a more low-level way, without it.
Thanks.
A recursive solution:
def GetSum(n, arr):
    if len(arr) == 0 and n != 0:
        return False
    return (n == 0 or
            GetSum(n, arr[1:]) or
            GetSum(n-arr[0], arr[1:]))
def ArrayAddition(arr):
    arrs = sorted(arr)
    return GetSum(arrs[-1], arrs[:-1])
print ArrayAddition([2,95,96,97,98,99,100])
The GetSum function returns False when the required sum is non-zero and there are no items in the array. Then it checks for 3 cases:
If the required sum, n, is zero then the goal is achieved.
If we can get the sum with the remaining items after the first item is removed, then the goal is achieved.
If we can get the required sum minus the first element of the list on the rest of the list the goal is achieved.
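The three cases can also be written with an index instead of list slices, which avoids copying the list on every call. This is a sketch in Python 3, not the answer's own code; the names get_sum and array_addition and the parameter i are my own. As in the original, a required sum of zero counts as already achieved.

```python
def get_sum(n, arr, i=0):
    # Case 1: the required sum is zero, so the goal is achieved.
    if n == 0:
        return True
    # No items left and the required sum is still non-zero.
    if i == len(arr):
        return False
    # Case 2: skip arr[i]; case 3: use arr[i] and look for the remainder.
    return get_sum(n, arr, i + 1) or get_sum(n - arr[i], arr, i + 1)

def array_addition(arr):
    ordered = sorted(arr)
    # The largest value is the target; the rest are the candidates.
    return get_sum(ordered[-1], ordered[:-1])

print(array_addition([2, 95, 96, 97, 98, 99, 100]))  # True (2 + 98)
```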
Your solution doesn't work.
>>> ArrayAddition([10, 11, 20, 21, 30, 31, 60])
False
The simple solution is to use itertools to iterate over all subsets of the input (that don't contain the largest number):
import itertools
def subsetsum(l):
    l = list(l)
    target = max(l)
    l.remove(target)  # drop the largest number itself
    for subset_size in xrange(1+len(l)):
        for subset in itertools.combinations(l, subset_size):
            if sum(subset) == target:
                return True
    return False
If you want to avoid itertools, you'll need to generate subsets directly. That can be accomplished by counting in binary and using the set bits to determine which elements to pick:
def subsetsum(l):
    l = list(l)
    target = max(l)
    l.remove(target)  # drop the largest number itself
    for subset_index in xrange(2**len(l)):
        subtotal = 0
        for i, num in enumerate(l):
            # If bit i is set in subset_index
            if subset_index & (1 << i):
                subtotal += num
        if subtotal == target:
            return True
    return False
Update: I forgot that you want to check all possible combinations. Use this instead:
def ArrayAddition(l):
    for length in range(2, len(l)):
        for lst in itertools.combinations(l, length):
            if sum(lst) in l:
                print(lst, sum(lst))
                return True
    return False
One-liner solution:
>>> any(any(sum(lst) in l for lst in itertools.combinations(l, length)) for length in range(2, len(l)))
Hope this helps!
Generate all the sums of the powerset and test them against the max
def ArrayAddition(L):
    return any(sum(k for j,k in enumerate(L) if 1<<j&i)==max(L) for i in range(1<<len(L)))
As written, the subset containing only the maximum always matches, so you need to do some preprocessing first: find the max and remove it from L.
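A minimal sketch of that preprocessing step, in Python 3. The names array_addition, rest and mask are my own, and skipping the empty subset (starting the mask at 1) is my assumption about the intended behaviour.

```python
def array_addition(numbers):
    rest = list(numbers)
    target = max(rest)
    rest.remove(target)  # removes a single occurrence of the maximum
    # Each mask from 1 to 2**len(rest) - 1 selects one non-empty subset of rest:
    # bit number `bit` of mask decides whether rest[bit] is included.
    return any(
        sum(value for bit, value in enumerate(rest) if mask >> bit & 1) == target
        for mask in range(1, 1 << len(rest))
    )

print(array_addition([4, 6, 23, 10, 1, 3]))  # True (4 + 6 + 10 + 3)
print(array_addition([4, 4, 5]))             # False
```

Removing the maximum first also halves the number of subsets that have to be tried.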
One more way to do it...
Code:
import itertools
def func(l):
    m = max(l)
    rem = [itertools.combinations([x for x in l if not x == m],i) for i in range(2,len(l)-1)]
    print [item for i in rem for item in i if sum(item)==m ]
if __name__=='__main__':
    func([1,2,3,4,5])
Output:
[(1, 4), (2, 3)]
Hope this helps.. :)
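If I understood the question correctly, this gives a quick check, but only as a necessary condition for non-negative inputs; it is not sufficient on its own (for example, [4, 4, 5] passes it although no subset of [4, 4] sums to 5):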
2*max(a)<=sum(a)
Say I have a list of numbers such as:
my_list = [1, 17, 2]
And I wanted to add those together. I know I can use print(sum(my_list)). However I wanted to see if there was another way of doing so, so I tried the following:
b = len(my_list)
for m in range(my_list[0], my_list[b-1]):
    m += m
    print(m)
I am sure something like this should work, but I am obviously doing it wrong. The output of this is 2. After I tried:
result = 0
b = len(my_list)
for m in range(my_list[0], my_list[b-1]):
    result = result + m
    print(result)
This outputs 1.
Please explain what I am doing wrong and how I can correct it.
You are calling range(my_list[0], my_list[b-1]), which is range(1, 2). The only value it generates for m is 1, hence the result is 1.
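To make this concrete, here is a short sketch (the variable names values_visited and total are my own) that shows what the loop in the question actually visits, next to a loop over the index positions:

```python
my_list = [1, 17, 2]

# range() here is built from the first and last *values* of the list,
# not from its index positions, so it only produces the number 1.
values_visited = list(range(my_list[0], my_list[len(my_list) - 1]))
print(values_visited)  # [1]

# Looping over the index positions reaches every element.
total = 0
for index in range(len(my_list)):
    total += my_list[index]
print(total)  # 20
```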
In Python, you can iterate over the elements of a sequence directly:
m = [1, 17, 2]
res = 0
for i in m:
    res += i
print(res)
First, you should use a correct range: 0..2 in your case (since your list's indexes start from 0 and it has 3 items):
for i in range(0, b):
    result = result + my_list[i]
Or, if you prefer "for each" style, you should iterate over the list you are summing:
for m in my_list:
    result = result + m
Finally, if you want to print only the final sum, you should correct the indentation of print:
for m in my_list:
    result = result + m
print(result) # <- mind indent
Wrapping up:
my_list = [1, 17, 2]
result = 0
for m in my_list:
    result = result + m
print(result)
from functools import reduce  # needed in Python 3; reduce is a builtin in Python 2
from operator import add
my_list = [1, 17, 2]
result = reduce(add, my_list)
import functools
print(functools.reduce(lambda x,y: x+y, my_list))
Try this (in Python 3, reduce has to be imported from functools):
from functools import reduce
my_list = [1, 17, 2]
reduce(lambda x, y: x+y, my_list)
To get the values from my_list you can use this syntax:
for m in my_list:
    print(m)
If you use range the way you did, it gives you a range from 1 (the first value of your list) up to, but not including, 2 (the last value of your list).
To add the values of your list you can try this code:
out = 0
for m in my_list:
    out = out + m
print(out)
Suppose we have two items missing in a sequence of consecutive integers, and the missing elements lie between the first and last elements. I wrote code that accomplishes the task. However, I'd like to make it more efficient, using fewer loops if possible. Any help will be appreciated. Also, what about the case where we have to find more missing items (say, close to n/4) instead of 2? I think my code should be efficient then, because I am breaking out of the loop early, right?
def missing_elements(L,start,end,missing_num):
    complete_list = range(start,end+1)
    count = 0
    input_index = 0
    for item in complete_list:
        if item != L[input_index]:
            print item
            count += 1
        else :
            input_index += 1
        if count > missing_num:
            break
def main():
    L = [10,11,13,14,15,16,17,18,20]
    start = 10
    end = 20
    missing_elements(L,start,end,2)
if __name__ == "__main__":
    main()
If the input sequence is sorted, you could use sets here. Take the start and end values from the input list:
def missing_elements(L):
    start, end = L[0], L[-1]
    return sorted(set(range(start, end + 1)).difference(L))
This assumes Python 3; for Python 2, use xrange() to avoid building a list first.
The sorted() call is optional; without it a set() is returned of the missing values, with it you get a sorted list.
Demo:
>>> L = [10,11,13,14,15,16,17,18,20]
>>> missing_elements(L)
[12, 19]
Another approach is by detecting gaps between subsequent numbers; using an older itertools library sliding window recipe:
from itertools import islice, chain
def window(seq, n=2):
    "Returns a sliding window (of width n) over data from the iterable"
    "   s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...                   "
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result
def missing_elements(L):
    missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)
    return list(missing)
This is a pure O(n) operation, and if you know the number of missing items, you can make sure it only produces those and then stops:
def missing_elements(L, count):
    missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)
    return list(islice(missing, 0, count))
This will handle larger gaps too; if you are missing 2 items at 11 and 12, it'll still work:
>>> missing_elements([10, 13, 14, 15], 2)
[11, 12]
and the above sample only had to iterate over [10, 13] to figure this out.
Assuming that L is a list of integers with no duplicates, you can infer that the part of the list between start and index is completely consecutive if and only if L[index] == L[start] + (index - start) and similarly with index and end is completely consecutive if and only if L[index] == L[end] - (end - index). This combined with splitting the list into two recursively gives a sublinear solution.
# python 3.3 and up, in older versions, replace "yield from" with yield loop
def missing_elements(L, start, end):
    if end - start <= 1:
        if L[end] - L[start] > 1:
            yield from range(L[start] + 1, L[end])
        return
    index = start + (end - start) // 2
    # is the lower half consecutive?
    consecutive_low = L[index] == L[start] + (index - start)
    if not consecutive_low:
        yield from missing_elements(L, start, index)
    # is the upper part consecutive?
    consecutive_high = L[index] == L[end] - (end - index)
    if not consecutive_high:
        yield from missing_elements(L, index, end)
def main():
    L = [10,11,13,14,15,16,17,18,20]
    print(list(missing_elements(L,0,len(L)-1)))
    L = range(10, 21)
    print(list(missing_elements(L,0,len(L)-1)))
main()
missingItems = [x for x in complete_list if not x in L]
a=[1,2,3,7,5,11,20]
b=[]
def miss(a,b):
    for x in range (a[0],a[-1]):
        if x not in a:
            b.append(x)
    return b
print (miss(a,b))
ANS:[4, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19]
Works for sorted and unsorted input, with duplicates too, provided the first and last elements are the smallest and largest values.
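If the smallest and largest values are not guaranteed to sit at the two ends of the list, the bounds can be taken from min() and max() instead. The following is a sketch under that assumption, not the answer's own code; the name missing_between is hypothetical, and the set is only there to make each membership test cheap.

```python
def missing_between(values):
    """Integers strictly between min(values) and max(values) that are absent."""
    present = set(values)  # duplicates collapse; lookups are O(1) on average
    return [x for x in range(min(values), max(values)) if x not in present]

print(missing_between([1, 2, 3, 7, 5, 11, 20]))
# [4, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19]
```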
Using collections.Counter:
from collections import Counter
dic = Counter([10, 11, 13, 14, 15, 16, 17, 18, 20])
print([i for i in range(10, 20) if dic[i] == 0])
Output:
[12, 19]
arr = [1, 2, 5, 6, 10, 12]
diff = []
"""zip will return array of tuples (1, 2) (2, 5) (5, 6) (6, 10) (10, 12) """
for a, b in zip(arr , arr[1:]):
    if a + 1 != b:
        diff.extend(range(a+1, b))
print(diff)
[3, 4, 7, 8, 9, 11]
If the list is sorted, we can look for gaps between neighbours. For each gap, generate a range object from the current value + 1 up to the next value (not inclusive) and extend the list of differences with it.
Using scipy lib:
import math
from scipy.optimize import fsolve
def mullist(a):
    mul = 1
    for i in a:
        mul = mul*i
    return mul
a = [1,2,3,4,5,6,9,10]
s = sum(a)
so = sum(range(1,11))
mulo = mullist(range(1,11))
mul = mullist(a)
over = mulo/mul
delta = so -s
# y = so - s -x
# xy = mulo/mul
def func(x):
    return (so -s -x)*x-over
print int(round(fsolve(func, 0))), int(round(delta - fsolve(func, 0)))
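Timing it (note that -s passes the script as setup code only, so timeit measures its default empty statement here, not the solver):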
$ python -mtimeit -s "$(cat with_scipy.py)"
7 8
100000000 loops, best of 3: 0.0181 usec per loop
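The same two-equation idea can be solved exactly with integers if the product is swapped for the sum of squares, which avoids both the huge factorial-sized products and the numerical solver. This is a different technique from the fsolve code above, sketched under the assumption that exactly two distinct values are missing from low..high; the name two_missing is my own, and math.isqrt needs Python 3.8 or later.

```python
import math

def two_missing(numbers, low, high):
    """Return the two integers missing from low..high (inclusive), smallest first."""
    full = range(low, high + 1)
    s = sum(full) - sum(numbers)                                  # x + y
    q = sum(v * v for v in full) - sum(v * v for v in numbers)    # x^2 + y^2
    # (y - x)^2 = 2 * (x^2 + y^2) - (x + y)^2
    gap = math.isqrt(2 * q - s * s)
    return (s - gap) // 2, (s + gap) // 2

print(two_missing([1, 2, 3, 4, 5, 6, 9, 10], 1, 10))  # (7, 8)
```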
Other option is:
>>> from sets import Set
>>> a = Set(range(1,11))
>>> b = Set([1,2,3,4,5,6,9,10])
>>> a-b
Set([8, 7])
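And the timing (run the same way, with the script passed only as -s setup, so it does not measure the set difference itself):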
Set([8, 7])
100000000 loops, best of 3: 0.0178 usec per loop
My take was to use no loops and set operations:
def find_missing(in_list):
    complete_set = set(range(in_list[0], in_list[-1] + 1))
    return complete_set - set(in_list)
def main():
    sample = [10, 11, 13, 14, 15, 16, 17, 18, 20]
    print find_missing(sample)
if __name__ == "__main__":
    main()
# => set([19, 12])
Simply walk the list and look for non-consecutive numbers:
prev = L[0]
for this in L[1:]:
    if this > prev+1:
        for item in range(prev+1, this): # this handles gaps of 1 or more
            print item
    prev = this
Here's a one-liner:
In [10]: l = [10,11,13,14,15,16,17,18,20]
In [11]: [i for i, (n1, n2) in enumerate(zip(l[:-1], l[1:])) if n1 + 1 != n2]
Out[11]: [1, 7]
I zip the list with a copy of itself offset by one (via slicing), and use enumerate to get the index of the element just before each gap.
For long lists, this isn't great because it's not O(log(n)), but I think it should be pretty efficient versus using a set for small inputs. izip from itertools would probably make it quicker still.
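Since the one-liner returns index positions rather than the missing numbers themselves, a second step is needed to turn them into values. A small sketch, assuming a sorted list without duplicates (the names gap_indices and missing are my own):

```python
l = [10, 11, 13, 14, 15, 16, 17, 18, 20]

# Index of the element that sits just before each gap.
gap_indices = [i for i, (n1, n2) in enumerate(zip(l[:-1], l[1:])) if n1 + 1 != n2]

# Expand every gap into the values between its two neighbours.
missing = [value for i in gap_indices for value in range(l[i] + 1, l[i + 1])]

print(gap_indices)  # [1, 7]
print(missing)      # [12, 19]
```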
>>> l = [10,11,13,14,15,16,17,18,20]
>>> [l[i]+1 for i, j in enumerate(l) if (l+[0])[i+1] - l[i] > 1]
[12, 19]
We found a missing value if the difference between two consecutive numbers is greater than 1:
>>> L = [10,11,13,14,15,16,17,18,20]
>>> [x + 1 for x, y in zip(L[:-1], L[1:]) if y - x > 1]
[12, 19]
Note: Python 3. In Python 2 use itertools.izip.
Improved version for more than one value missing in a row:
>>> import itertools as it
>>> L = [10,11,14,15,16,17,18,20] # 12, 13 and 19 missing
>>> [x + diff for x, y in zip(it.islice(L, None, len(L) - 1),
it.islice(L, 1, None))
for diff in range(1, y - x) if diff]
[12, 13, 19]
def missing_elements(inlist):
    if len(inlist) <= 1:
        return []
    else:
        if inlist[1]-inlist[0] > 1:
            return [inlist[0]+1] + missing_elements([inlist[0]+1] + inlist[1:])
        else:
            return missing_elements(inlist[1:])
First we sort the list, and then we check for each element, except the last one, whether the next value is in the list. Be careful not to have duplicates in the list!
l.sort()
[l[i]+1 for i in range(len(l)-1) if l[i]+1 not in l]
I stumbled on this looking for a different kind of efficiency -- given a list of unique serial numbers, possibly very sparse, yield the next available serial number, without creating the entire set in memory. (Think of an inventory where items come and go frequently, but some are long-lived.)
def get_serial(string_ids, longtail=False):
    int_list = sorted(map(int, string_ids))  # sorted() also works on Python 3's map iterator
    n = len(int_list)
    for i in range(0, n-1):
        nextserial = int_list[i]+1
        while nextserial < int_list[i+1]:
            yield nextserial
            nextserial+=1
    while longtail:
        nextserial+=1
        yield nextserial
[...]
def main():
    [...]
    serialgenerator = get_serial(list1, longtail=True)
    while somecondition:
        newserial = next(serialgenerator)
(Input is a list of string representations of integers, yield is an integer, so not completely generic code. longtail provides extrapolation if we run out of range.)
There's also an answer to a similar question which suggests using a bitarray for efficiently handling a large sequence of integers.
Some versions of my code used functions from itertools but I ended up abandoning that approach.
A bit of mathematics gives a simple solution when exactly one value is missing. The solution below works for integers from m to n.
Works for both sorted and unsorted positive and negative numbers.
#numbers = [-1,-2,0,1,2,3,5]
numbers = [-2,0,1,2,5,-1,3]
sum_of_nums = 0
largest = numbers[0]
smallest = numbers[0]
for i in numbers:
    if i > largest:
        largest = i
    if i < smallest:
        smallest = i
    sum_of_nums += i
# Total : sum of numbers from m to n
total = ((largest - smallest + 1) * (largest + smallest)) // 2
# Subtract the sum of the numbers from the total, which gives the missing value
print(total - sum_of_nums)
With this code you can find any missing values in a sequence, except a missing last number. You only need to put your data into an Excel file with a column named "numbers".
import pandas as pd
import numpy as np
data = pd.read_excel("numbers.xlsx")
data_sort=data.sort_values('numbers',ascending=True)
index=list(range(len(data_sort)))
data_sort['index']=index
data_sort['index']=data_sort['index']+1
missing=[]
for i in range (len(data_sort)-1):
    if data_sort['numbers'].iloc[i+1]-data_sort['numbers'].iloc[i]>1:
        gap=data_sort['numbers'].iloc[i+1]-data_sort['numbers'].iloc[i]
        numerator=1
        for j in range (1,gap):
            mis_value=data_sort['numbers'].iloc[i+1]-numerator
            missing.append(mis_value)
            numerator=numerator+1
print(np.sort(missing))