remove numbers from a list without changing total sum - python

I have a list of numbers (example: [-1, 1, -4, 5]) and I have to remove numbers from the list without changing the total sum of the list. I want to remove the numbers with biggest absolute value possible, without changing the total, in the example removing [-1, -4, 5] will leave [1] so the sum doesn't change.
I wrote the naive approach, which is finding all possible combinations that don't change the total and seeing which one removes the biggest absolute value. But that is really slow, since the actual list will be a lot bigger than that.
Here's my combinations code:
from itertools import chain, combinations
def remove(items):
    all_comb = chain.from_iterable(combinations(items, n+1)
                                   for n in xrange(len(items)))
    biggest = None
    biggest_sum = 0
    for comb in all_comb:
        if sum(comb) != 0:
            continue # this comb would change total, skip
        abs_sum = sum(abs(item) for item in comb)
        if abs_sum > biggest_sum:
            biggest = comb
            biggest_sum = abs_sum
    return biggest
print remove([-1, 1, -4, 5])
It correctly prints (-1, -4, 5). However, I am looking for a cleverer, more efficient solution than looping over all possible item combinations.
Any ideas?

If you redefine the problem as finding a subset whose sum equals the sum of the complete set, you will realize that this is an NP-hard problem (subset sum), so there is no known polynomial-time solution for it.
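Even without a polynomial-time algorithm, a pseudo-polynomial dynamic program is often practical when the values are integers of modest size. The sketch below is one possible way to do it, not the method of any answer here: for every reachable sum it keeps the subset with the largest total absolute value, and the entry for sum 0 is the set to remove. The function name `best_zero_sum_removal` is made up for illustration, and the code is written for Python 3.

```python
def best_zero_sum_removal(items):
    """Return the zero-sum sub-list with the largest total absolute value."""
    # best[s] = (total absolute value, chosen items) for the best subset summing to s
    best = {0: (0, ())}
    for x in items:
        updated = dict(best)
        # Extend only subsets built from earlier items, so x is used at most once.
        for s, (abs_total, chosen) in best.items():
            key = s + x
            candidate = (abs_total + abs(x), chosen + (x,))
            if key not in updated or candidate[0] > updated[key][0]:
                updated[key] = candidate
        best = updated
    return list(best[0][1])


print(best_zero_sum_removal([-1, 1, -4, 5]))
```

The running time is proportional to the number of items times the number of distinct reachable sums, so it only helps when that range is small compared with 2^n.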

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright © 2009 Clóvis Fabrício Costa
# Licensed under GPL version 3.0 or higher
from itertools import chain, combinations
def posneg_calcsums(subset):
    # map every achievable sum to one group of items that produces it
    sums = {}
    for group in chain.from_iterable(combinations(subset, n+1)
                                     for n in xrange(len(subset))):
        sums[sum(group)] = group
    return sums
def posneg(items):
    positive = posneg_calcsums([item for item in items if item > 0])
    negative = posneg_calcsums([item for item in items if item < 0])
    # try the biggest positive sum first; it needs a matching negative sum
    for n in sorted(positive, reverse=True):
        if -n in negative:
            return positive[n] + negative[-n]
    else:
        return None
print posneg([-1, 1, -4, 5])
print posneg([6, 44, 1, -7, -6, 19])
It works fine, and is a lot faster than my first approach. Thanks to Alon for the wikipedia link and ivazquez|laptop on #python irc channel for a good hint that led me into the solution.
I think it can be further optimized - I want a way to stop calculating the expensive part once the solution was found. I will keep trying.

Your requirements don't say if the function is allowed to change the list order or not. Here's a possibility:
def remove(items):
    items.sort()
    running = original = sum(items)
    try:
        items.index(original) # we just want the exception
        return [original]
    except ValueError:
        pass
    if abs(items[0]) > items[-1]:
        running -= items.pop(0)
    else:
        running -= items.pop()
    while running != original:
        try:
            running -= items.pop(items.index(original - running))
        except ValueError:
            if running > original:
                running -= items.pop()
            elif running < original:
                running -= items.pop(0)
    return items
This sorts the list (big items will be at the end, smaller ones will be at the beginning) and calculates the sum, and removes an item from the list. It then continues removing items until the new total equals the original total. An alternative version that preserves order can be written as a wrapper:
from copy import copy
def remove_preserve_order(items):
    a = remove(copy(items))
    return [x for x in items if x in a]
Though you should probably rewrite this with collections.deque if you really want to preserve order. If you can guarantee uniqueness in your list, you can get a big win by using a set instead.
We could probably write a better version that traverses the list to find the two numbers closest to the running total each time and remove the closer of the two, but then we'd probably end up with O(N^2) performance. I believe this code's performance will be O(N*log(N)) as it just has to sort the list (I hope Python's list sorting isn't O(N^2)) and then get the sum.

I do not program in Python so my apologies for not offering code. But I think I can help with the algorithm:
Find the sum
Add numbers with the lowest value until you get to the same sum
Everything else can be deleted
I hope this helps

This can be solved using integer programming. You can define a binary variable s_i for each of your list elements x_i and minimize \sum_i s_i, limited by the constraint that \sum_i (x_i*s_i) is equal to the original sum of your list.
Here's an implementation using the lpSolve package in R:
library(lpSolve)
get.subset <- function(lst) {
  res <- lp("min", rep(1, length(lst)), matrix(lst, nrow=1), "=", sum(lst),
            binary.vec=seq_along(lst))
  lst[res$solution > 0.999]
}
Now, we can test it with a few examples:
get.subset(c(1, -1, -4, 5))
# [1] 1
get.subset(c(6, 44, 1, -7, -6, 19))
# [1] 44 -6 19
get.subset(c(1, 2, 3, 4))
# [1] 1 2 3 4

Related

How can I get a sum from some elements of a list? [duplicate]

I have a list of numbers. I also have a certain sum. The sum is made from a few numbers from my list (I may/may not know how many numbers it's made from). Is there a fast algorithm to get a list of possible numbers? Written in Python would be great, but pseudo-code's good too. (I can't yet read anything other than Python :P )
Example
list = [1,2,3,10]
sum = 12
result = [2,10]
NOTE: I do know of Algorithm to find which numbers from a list of size n sum to another number (but I cannot read C# and I'm unable to check if it works for my needs. I'm on Linux and I tried using Mono but I get errors and I can't figure out how to work C# :( )
AND I do know of algorithm to sum up a list of numbers for all combinations (but it seems to be fairly inefficient. I don't need all combinations.)
This problem reduces to the 0-1 Knapsack Problem, where you are trying to find a set with an exact sum. The solution depends on the constraints; in the general case this problem is NP-complete.
However, if the maximum search sum (let's call it S) is not too high, then you can solve the problem using dynamic programming. I will explain it using a recursive function and memoization, which is easier to understand than a bottom-up approach.
Let's code a function f(v, i, S) that returns the number of subsets of v[i:] that sum exactly to S. To solve it recursively, first we have to analyze the base case (i.e.: v[i:] is empty):
S == 0: The only subset of [] has sum 0, so it is a valid subset. Because of this, the function should return 1.
S != 0: As the only subset of [] has sum 0, there is not a valid subset. Because of this, the function should return 0.
Then, let's analyze the recursive case (i.e.: v[i:] is not empty). There are two choices: include the number v[i] in the current subset, or not include it. If we include v[i], then we are looking for subsets that have sum S - v[i]; otherwise, we are still looking for subsets with sum S. The function f might be implemented in the following way:
def f(v, i, S):
    if i >= len(v): return 1 if S == 0 else 0
    count = f(v, i + 1, S)
    count += f(v, i + 1, S - v[i])
    return count
v = [1, 2, 3, 10]
sum = 12
print(f(v, 0, sum))
By checking f(v, 0, S) > 0, you can know if there is a solution to your problem. However, this code is too slow, each recursive call spawns two new calls, which leads to an O(2^n) algorithm. Now, we can apply memoization to make it run in time O(n*S), which is faster if S is not too big:
def f(v, i, S, memo):
    if i >= len(v): return 1 if S == 0 else 0
    if (i, S) not in memo: # <-- Check if value has not been calculated.
        count = f(v, i + 1, S, memo)
        count += f(v, i + 1, S - v[i], memo)
        memo[(i, S)] = count # <-- Memoize calculated result.
    return memo[(i, S)] # <-- Return memoized value.
v = [1, 2, 3, 10]
sum = 12
memo = dict()
print(f(v, 0, sum, memo))
Now, it is possible to code a function g that returns one subset that sums to S. To do this, it is enough to add elements only if there is at least one solution including them:
def f(v, i, S, memo):
    # ... same as before ...
def g(v, S, memo):
    subset = []
    for i, x in enumerate(v):
        # Check if there is still a solution if we include v[i]
        if f(v, i + 1, S - x, memo) > 0:
            subset.append(x)
            S -= x
    return subset
v = [1, 2, 3, 10]
sum = 12
memo = dict()
if f(v, 0, sum, memo) == 0: print("There are no valid subsets.")
else: print(g(v, sum, memo))
Disclaimer: This solution says there are two subsets of [10, 10] that sum to 10. This is because it treats the first ten as different from the second ten. The algorithm can be fixed to treat both tens as equal (and thus answer one), but that is a bit more complicated.
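For comparison with the memoized recursion, here is a rough bottom-up sketch of the same idea. It is an illustrative variant rather than the answer's own code: the name `subset_with_sum` and the `parent` dictionary are assumptions. Instead of counting subsets, it records for each reachable sum which value was added last, then walks those links back from the target.

```python
def subset_with_sum(values, target):
    """Return one sub-list of values that sums to target, or None."""
    # parent[t] = (previous sum, value added to reach t); 0 is reachable with nothing.
    parent = {0: None}
    for x in values:
        # Snapshot the keys so that x extends only sums built from earlier values.
        for s in list(parent):
            t = s + x
            if t not in parent:
                parent[t] = (s, x)
    if target not in parent:
        return None
    subset = []
    s = target
    while parent[s] is not None:
        s, x = parent[s]
        subset.append(x)
    return subset[::-1]


print(subset_with_sum([1, 2, 3, 10], 12))
```

The dictionary holds one entry per distinct reachable sum, so memory grows with the range of sums rather than with 2^n. A target of 0 yields the empty list.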
I know I'm giving an answer 10 years after you asked this, but I really needed to know how to do this and the way jbernadas did it was too hard for me, so I googled it for an hour and found the Python library itertools, which gets the job done!
I hope this helps future newbie programmers.
You just have to import the library and use the combinations() function. It is that simple: it returns all the subsets of a given length, ignoring order. I mean:
For the set [1, 2, 3, 4] and a subset with length 3, it will not return [1, 2, 3], [1, 3, 2] and [2, 3, 1]; it will return just [1, 2, 3].
As you want ALL the subsets of a set, you can iterate over every length:
import itertools
sequence = [1, 2, 3, 4]
# lengths 0 .. len(sequence), so the full set is included as well
for i in range(len(sequence) + 1):
    for j in itertools.combinations(sequence, i):
        print(j)
The output will be
()
(1,)
(2,)
(3,)
(4,)
(1, 2)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
(3, 4)
(1, 2, 3)
(1, 2, 4)
(1, 3, 4)
(2, 3, 4)
(1, 2, 3, 4)
Hope this helps!
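To connect this back to the question, the same loop can stop at the first combination whose sum matches the target. This is only a small sketch: `find_subset` is a name invented here, and it is still a brute-force search that may try every subset.

```python
from itertools import combinations


def find_subset(numbers, target):
    """Return the first combination of numbers that sums to target, or None."""
    for size in range(1, len(numbers) + 1):
        for combo in combinations(numbers, size):
            if sum(combo) == target:
                return list(combo)
    return None


print(find_subset([1, 2, 3, 10], 12))
```

Because smaller sizes are tried first, the result is a subset with as few elements as possible.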
So, the logic is to sort the numbers in descending order. Suppose the list of numbers is l and the sum to be formed is s.
    for i in b:
        if(a(round(n-i,2),b[b.index(i)+1:])):
            r.append(i)
            return True
    return False
We go through this loop and select each number from l in order; let's call it i.
There are 2 possible cases: either i is part of the sum or it is not.
First we assume that i is part of the solution. The problem then reduces to l being l[l.index(i)+1:] and s being s-i, so if our function is a(s,l), we call a(s-i, l[l.index(i)+1:]). If i is not part of s, then we have to form s from the list l[l.index(i)+1:].
So both cases are similar; the only change is that if i is part of s, then s becomes s-i, and otherwise s stays the same.
To reduce the problem, we remove the numbers in l that are greater than s (the list is in descending order, so they are at the front). If l becomes empty, the numbers selected so far are not part of a solution and we return False.
    if(len(b)==0):
        return False
    while(b[0]>n):
        b.remove(b[0])
        if(len(b)==0):
            return False
If the first remaining element equals s, it completes the sum and we return True. Otherwise, if l has only 1 element left, it cannot be part of s, so we return False and the loop in the caller goes on to the next number.
    if(b[0]==n):
        r.append(b[0])
        return True
    if(len(b)==1):
        return False
Note that in the code I have used b and n, but b is just our list l and n is the sum s. I have rounded wherever possible so that we do not get a wrong answer due to floating-point arithmetic in Python.
r=[]
list_of_numbers=[61.12,13.11,100.12,12.32,200,60.00,145.34,14.22,100.21,14.77,214.35,200.32,65.43,0.49,132.13,143.21,156.34,11.32,12.34,15.67,17.89,21.23,14.21,12,122,134]
list_of_numbers=sorted(list_of_numbers)
list_of_numbers.reverse()
sum_to_be_formed=401.54
def a(n,b):
    global r
    if(len(b)==0):
        return False
    while(b[0]>n):
        b.remove(b[0])
        if(len(b)==0):
            return False
    if(b[0]==n):
        r.append(b[0])
        return True
    if(len(b)==1):
        return False
    for i in b:
        if(a(round(n-i,2),b[b.index(i)+1:])):
            r.append(i)
            return True
    return False
if(a(sum_to_be_formed,list_of_numbers)):
    print(r)
This solution is fast, faster than the one explained above.
However, it works for positive numbers only.
Also, it works well only if a solution exists; otherwise it takes too much time to get out of the loops.
An example run looks like this. Let's say
l=[1,6,7,8,10]
and s=22, i.e. s=1+6+7+8.
Each numbered line below shows the list and the remaining sum passed to one call of the function:
1.) [10, 8, 7, 6, 1] 22
This is the initial call. 10 is tried first as part of 22, so s=22-10=12 and the list becomes [8, 7, 6, 1].
2.) [8, 7, 6, 1] 12
8 is tried as part of 12, so s=12-8=4 and the list becomes [7, 6, 1].
3.) [7, 6, 1] 4
7 and 6 are greater than 4 and are removed, and 1!=4, so this call, in which 8 was selected, returns False.
4.) [6, 1] 5
7 is tried as part of 12, so s=12-7=5 and the list becomes [6, 1].
6 is removed and 1!=5, so this call, in which 7 was selected, returns False.
5.) [1] 6
6 is tried as part of 12, so s=12-6=6 and the list becomes [1].
1!=6, so this call, in which 6 was selected, returns False.
6.) [] 11
1 is tried as part of 12, so s=12-1=11 and the list becomes empty.
l is empty, so every case in which 10 was part of s has failed. 10 is therefore not part of s; we now start with 8 and the same cases follow.
7.) [7, 6, 1] 14
8.) [6, 1] 7
9.) [1] 1
In the last call the remaining element equals the remaining sum, so the function returns True and the solution is 8, 7, 6, 1.
Just to give a comparison, which I ran on my computer (which is not so good),
using
l=[61.12,13.11,100.12,12.32,200,60.00,145.34,14.22,100.21,14.77,214.35,145.21,123.56,11.90,200.32,65.43,0.49,132.13,143.21,156.34,11.32,12.34,15.67,17.89,21.23,14.21,12,122,134]
and
s=2000
my loop ran 1018 times and took 31 ms,
and the previous code's loop ran 3415587 times and took somewhere near 16 seconds.
However, when a solution does not exist, my code ran for more than a few minutes, so I stopped it, whereas the previous code took only around 17 ms. The previous code also works with negative numbers.
So I think some improvements can be made.
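One possible improvement, offered here only as a sketch and not as part of the answer above, is to convert the amounts to integer cents (so no rounding is needed during the search) and to abandon a branch as soon as the remaining numbers cannot add up to the remaining sum. The name `find_subset_cents` and the `suffix` table are assumptions made for this example. Like the code above, it handles positive numbers only and is still exponential in the worst case.

```python
def find_subset_cents(amounts, target):
    """Return positive amounts that sum to target (two decimals), or None."""
    cents = sorted((round(a * 100) for a in amounts), reverse=True)
    goal = round(target * 100)
    # suffix[i] = total of cents[i:], used to prune hopeless branches early.
    suffix = [0] * (len(cents) + 1)
    for i in range(len(cents) - 1, -1, -1):
        suffix[i] = suffix[i + 1] + cents[i]
    chosen = []

    def search(start, remaining):
        if remaining == 0:
            return True
        for i in range(start, len(cents)):
            if cents[i] > remaining:
                continue  # too big to be part of the remaining sum
            if remaining > suffix[i]:
                break     # even taking everything that is left falls short
            chosen.append(cents[i])
            if search(i + 1, remaining - cents[i]):
                return True
            chosen.pop()
        return False

    return [c / 100 for c in chosen] if search(0, goal) else None


print(find_subset_cents([1, 6, 7, 8, 10], 22))
```

Working in integers makes the equality test exact, and the `suffix` check lets many "no solution" inputs fail quickly instead of exploring every branch.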
#!/usr/bin/python2
ylist = [1, 2, 3, 4, 5, 6, 7, 9, 2, 5, 3, -1]
print ylist
target = int(raw_input("enter the target number"))
for i in xrange(len(ylist)):
    sno = target-ylist[i]
    for j in xrange(i+1, len(ylist)):
        if ylist[j] == sno:
            print ylist[i], ylist[j]
This Python code does what you asked: it prints every pair of numbers (by position, so repeated values produce repeated pairs) whose sum is equal to the target variable.
if target number is 8, it will print:
1 7
2 6
3 5
3 5
5 3
6 2
9 -1
5 3
I have found an answer which has run-time complexity O(n) and space complexity O(n) (the dictionary holds at most about 2n keys), where n is the length of the list.
The answer satisfies the following constraints:
List can contain duplicates, e.g. [1,1,1,2,3] and you want to find pairs sum to 2
List can contain both positive and negative integers
The code is as below, and followed by the explanation:
def countPairs(k, a):
    # List a, sum is k
    temp = dict()
    count = 0
    for iter1 in a:
        temp[iter1] = 0
        temp[k-iter1] = 0
    for iter2 in a:
        temp[iter2] += 1
    for iter3 in list(temp.keys()):
        if iter3 == k / 2 and temp[iter3] > 1:
            count += temp[iter3] * (temp[k-iter3] - 1) / 2
        elif iter3 == k / 2 and temp[iter3] <= 1:
            continue
        else:
            count += temp[iter3] * temp[k-iter3] / 2
    return int(count)
1. Create an empty dictionary, iterate through the list and put all the possible keys in the dict with initial value 0.
Note that the key (k-iter1) must be added as well. E.g. if the list contains 1 but does not contain 4, and the sum is 5, then when we look at 1 we would like to find how many 4s we have; if 4 were not in the dict, the lookup would raise an error.
2. Iterate through the list again, count how many times each integer occurs and store the results in the dict.
3. Iterate through the dict, this time to find how many pairs we have. We need to consider 3 conditions:
3.1 The key is exactly half of the sum and this key occurs more than once in the list, e.g. the list is [1,1,1] and the sum is 2. The code handles this special case by counting temp[key] * (temp[key] - 1) / 2 pairs.
3.2 The key is exactly half of the sum and this key occurs only once in the list; we skip this condition.
3.3 For the other cases, where the key is not half of the sum, just multiply its value by the value of the other key such that the two keys sum to the given value. E.g. if the sum is 6, we multiply temp[1] and temp[5], temp[2] and temp[4], etc. (I didn't list cases where numbers are negative, but the idea is the same.) Each pair of keys is visited twice, which is why the code divides by 2.
The most complex step is step 3, which involves searching the dictionary, but a dictionary lookup is usually fast, with nearly constant complexity. (The worst case is O(n), but that should not happen for integer keys.) Thus, assuming each lookup takes constant time, the total complexity is O(n), as we only iterate over the list and the dictionary a constant number of times.
Advice for a better solution is welcomed :)
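One way to shorten the same counting idea, shown here as a sketch rather than a definitive improvement, is `collections.Counter`, which returns 0 for missing keys and so removes the need to pre-fill the dictionary. The name `count_pairs` is made up for this example; each pair of distinct values is counted once by only looking at the smaller value of the two.

```python
from collections import Counter


def count_pairs(k, a):
    """Count index pairs (i < j) in list a whose values sum to k."""
    counts = Counter(a)
    total = 0
    for value, n in counts.items():
        other = k - value
        if other == value:
            # n equal values form n * (n - 1) / 2 pairs among themselves
            total += n * (n - 1) // 2
        elif value < other:
            # count each pair of distinct values once, from the smaller value
            total += n * counts.get(other, 0)
    return total


print(count_pairs(2, [1, 1, 1, 2, 3]))
```

Integer division keeps the result an `int` throughout, so no final conversion is needed.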

foobar please-pass-the-coded-messages hidden test case not passing

I have been attempting google foobar and in the second level i got the task named please-pass-the-coded-messages. below is the task
==============================
You need to pass a message to the bunny workers, but to avoid detection, the code you agreed to use is... obscure, to say the least. The bunnies are given food on standard-issue plates that are stamped with the numbers 0-9 for easier sorting, and you need to combine sets of plates to create the numbers in the code. The signal that a number is part of the code is that it is divisible by 3. You can do smaller numbers like 15 and 45 easily, but bigger numbers like 144 and 414 are a little trickier. Write a program to help yourself quickly create large numbers for use in the code, given a limited number of plates to work with.
You have L, a list containing some digits (0 to 9). Write a function solution(L) which finds the largest number that can be made from some or all of these digits and is divisible by 3. If it is not possible to make such a number, return 0 as the solution. L will contain anywhere from 1 to 9 digits. The same digit may appear multiple times in the list, but each element in the list may only be used once.
Languages
=========
To provide a Java solution, edit Solution.java
To provide a Python solution, edit solution.py
Test cases
==========
Your code should pass the following test cases.
Note that it may also be run against hidden test cases not shown here.
-- Java cases --
Input:
Solution.solution({3, 1, 4, 1})
Output:
4311
Input:
Solution.solution({3, 1, 4, 1, 5, 9})
Output:
94311
-- Python cases --
Input:
solution.solution([3, 1, 4, 1])
Output:
4311
Input:
solution.solution([3, 1, 4, 1, 5, 9])
Output:
94311
Use verify [file] to test your solution and see how it does. When you are finished editing your code, use submit [file] to submit your answer. If your solution passes the test cases, it will be removed from your home folder.
I have tried a solution which works correctly in my IDE (note: I wanted a solution without any library):
def solution(l):
    # Your code here
    if (len(l) == 1 and l[0] % 3 != 0) or (len(l) == 0):
        return 0
    number = formGreatestNumber(l)
    remainder = number % 3
    if remainder == 0:
        result = formGreatestNumber(l)
        return result
    result = removeUnwanted(l, remainder)
    return result
def formGreatestNumber(li):
    li.sort(reverse=True) # descending order
    li = [str(d) for d in li] # each digit in string
    number = 0
    if len(li) > 0:
        number = int("".join(li)) # result
    return number
def removeUnwanted(l, remainder):
    possibleRemovals = [i for i in l if i % 3 == remainder]
    if len(possibleRemovals) > 0:
        l.remove(min(possibleRemovals))
        result = formGreatestNumber(l)
        return result
    pairs = checkForTwo(l, remainder)
    if len(pairs) > 0:
        for ind in pairs:
            l.remove(ind)
        result = formGreatestNumber(l)
        return result
    else:
        divisibleDigits = [d for d in l if d % 3 == 0]
        if len(divisibleDigits) > 0:
            result = formGreatestNumber(divisibleDigits)
            return result
        else:
            return 0
def checkForTwo(l, remainder): # check of (sum of any two pairs - remainder) is divisible by 3
    result = []
    for i in range(len(l)):
        for j in range(i+1, len(l)):
            if ((l[i]+l[j])-remainder) % 3 == 0:
                result.append(l[i])
                result.append(l[j])
                return result
    return []
print(solution([]))
print(solution([1]))
print(solution([9]))
print(solution([3, 1, 4, 1, 9, 2, 5, 7]))
However, on verifying it shows:
Verifying solution...
Test 1 passed!
Test 2 passed!
Test 3 failed [Hidden]
Test 4 passed! [Hidden]
Test 5 passed! [Hidden]
So where is the error that I am not noticing, and is there any other way to do this without a library like itertools?
I won't give away the code and spoil the fun for you, I'll perhaps try to explain the intuition.
About your code, I think the (2nd part of) the function removeUnwanted() is problematic here.
Let's see.
So first off, you'd arrange the input digits into a single number, in order from largest to smallest, which you've already done.
Then if the number formed isn't divisible by 3, try removing the smallest digit.
If that doesn't work, reinsert the smallest digit and remove the 2nd smallest digit, and so on.
Once you're done with removing all possible digits one at a time, try removing digits two at a time, starting with the two smallest.
If any of these result in a number that is divisible by 3, you're done.
Observe that you'll never need to remove more than 2 digits for this problem. The only way it's impossible to form the required number is if there are two or fewer digits and all of them are in the set {1,4,7}, or all of them are in the set {2,5,8}.
Edit: More about your code -
The initial part of your removeUnwanted() looks okay where you check if there's a single digit in the number which can be removed, removing the minimum from the choice of single digits and getting the answer.
I reckon the problem lies in your function checkForTwo(), which you call subsequently in removeUnwanted.
When you're passing the list to checkForTwo(), observe that the list is actually sorted in the decreasing order. This is because li.sort(reverse=True) in your function formGreatestNumber() sorted the list in place, which means the content of list l was sorted in descending order too.
And then in checkForTwo(), you try to find a pair that satisfies the required condition, but you're looping from the biggest pair that can possibly be removed. i starts from 0 and j starts from i+1, which is 1, and since your list is in descending order, you're trying to remove the 2 biggest elements possible.
A quick fix would be to iterate through the list in reverse order. Since the list is already sorted in descending order, reverse iteration visits the digits in ascending order and saves us from re-sorting, which would normally cost an additional O(NlogN) time.
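For readers who do want to see the idea in code, here is one possible sketch of the "remove the smallest digits first" approach. It is not the asker's code and not an official solution; it simply sorts ascending so that the smallest removable digits are found first, and it uses no libraries.

```python
def solution(digits):
    """Largest number divisible by 3 that uses some or all of the digits, else 0."""
    digits = sorted(digits)  # ascending, so the smallest candidates come first
    remainder = sum(digits) % 3
    if remainder:
        same = [d for d in digits if d % 3 == remainder]
        other = [d for d in digits if d % 3 == 3 - remainder]
        if same:
            # removing one digit with the same remainder fixes the sum
            digits.remove(same[0])
        elif len(other) >= 2:
            # otherwise remove the two smallest digits with the other remainder
            digits.remove(other[0])
            digits.remove(other[1])
        else:
            return 0
    if not digits:
        return 0
    return int("".join(str(d) for d in reversed(digits)))


print(solution([3, 1, 4, 1, 5, 9]))
```

An input such as [2, 5, 5, 8, 8] shows why the order matters: the two smallest digits (2 and 5) must go, leaving 885, whereas removing the two largest would leave 552.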

Analyzing the complexity of this sort algorithm

I know merge sort is the best way to sort a list of arbitrary length, but I am wondering how to optimize my current method.
def sortList(l):
    '''
    Recursively sorts an arbitrary list, l, to increasing order.
    '''
    #base case.
    if len(l) == 0 or len(l) == 1:
        return l
    oldNum = l[0]
    newL = sortList(l[1:]) #recursive call.
    #if oldNum is the smallest number, add it to the beginning.
    if oldNum <= newL[0]:
        return [oldNum] + newL
    #find where oldNum goes.
    for n in xrange(len(newL)):
        if oldNum >= newL[n]:
            try:
                if oldNum <= newL[n+1]:
                    return newL[:n+1] + [oldNum] + newL[n+1:]
            #if index n+1 is non-existant, oldNum must be the largest number.
            except IndexError:
                return newL + [oldNum]
What is the complexity of this function? I was thinking O(n^2) but I wasn't sure. Also, is there any way to further optimize this procedure? (besides ditching it and going for merge sort!).
There's a few places I'd optimize your code.
You do a lot of list copies: each time you slice, you create a new copy of the list. That can be avoided by adding an index to the function declaration that indicates where in the array to start sorting from.
You should follow PEP 8 for naming: sort_list rather than sortList.
The code that does the insertion is a bit weird; intentionally raising an out-of-bounds index exception isn't normal programming practice. Instead, just percolate the value up the array until it's in the right place.
Applying these changes gives this code:
def sort_list(l, i=0):
    if i == len(l): return
    sort_list(l, i+1)
    for j in xrange(i+1, len(l)):
        if l[j-1] <= l[j]: return
        l[j-1], l[j] = l[j], l[j-1]
This now sorts the array in-place, so there's no return value.
Here's some simple tests:
cases = [
    [1, 2, 0, 3, 4, 5],
    [0, 1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1, 1]
]
for c in cases:
    got = c[:]
    sort_list(got)
    if sorted(c) != got:
        print "sort_list(%s) = %s, want %s" % (c, got, sorted(c))
The time complexity is, as you suggest, O(n^2) where n is the length of the list. My version uses O(n) additional memory, whereas yours, because of the way the list gets copied at each stage, uses O(n^2).
One more step, which further improves the memory usage is to eliminate the recursion. Here's a version that does that:
def sort_list(l):
    for i in xrange(len(l)-2, -1, -1):
        for j in xrange(i+1, len(l)):
            if l[j-1] <= l[j]: break
            l[j-1], l[j] = l[j], l[j-1]
This works just the same as the recursive version, but does it iteratively; first sorting the last two elements in the array, then the last three, then the last four, and so on until the whole array is sorted.
This still has runtime complexity O(n^2), but now uses O(1) additional memory. Also, avoiding recursion means you can sort longer lists without hitting the notoriously low recursion limit in Python. And another benefit is that this code is now O(n) in the best case (when the array is already sorted).
A young Gauss came up with a formula that seems appropriate here. The story goes that in grade school his teacher was very tired and, to keep the class busy for a while, told them to add up all the numbers from zero to one hundred. Young Gauss came back with this: \sum_{k=0}^{n} k = n(n+1)/2, which gives 5050 for n = 100.
This is applicable here because your run-time is going to be proportional to the sum of all the numbers up to the length of your list: in the worst case your function will be sorting a list that is in descending order, and it will go through the entire length of newL each time to find the position of the next element, which belongs at the end of the list.
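The quadratic growth can be checked empirically. The sketch below is a Python 3 instrumented copy of the iterative insertion sort shown earlier in this thread; the helper name `count_comparisons` is an assumption. It counts comparisons for a descending input and for an already sorted one.

```python
def count_comparisons(values):
    """Sort a copy of values by insertion; return (comparisons, sorted list)."""
    items = list(values)
    comparisons = 0
    for i in range(len(items) - 2, -1, -1):
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[j - 1] <= items[j]:
                break
            items[j - 1], items[j] = items[j], items[j - 1]
    return comparisons, items


worst, _ = count_comparisons(range(10, 0, -1))  # descending input
best, _ = count_comparisons(range(10))          # already sorted input
print(worst, best)
```

For n = 10 the descending input needs 9 + 8 + ... + 1 = n(n-1)/2 = 45 comparisons, while the sorted input needs only n - 1 = 9.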

Insert value in Incremental List not working

Context:
This code is really simple; I'm just new to Python. I have an incremental list of numbers, and all I need to do is check if there is any missing value and, if there is, insert -1 in that position. Example:
If I have a list with values [1,2,4,5], I want it to become [1,2,-1,4,5].
If I have a list with values [1,4,5], I want it to become [1,-1,-1,4,5].
Simple, yet I can't do it properly in Python.
My code:
id: The list I want to modify.
i, j, and z: Counters.
MyRange: I can't show the real name of the variable (I don't own the code), but the range is correct.
z=0
for i in MyRange:
value = id[i]
value2 = id[i+1]
j=z
# This while is here because I try not to compare a value with -1
# (I think this is the problem)
while value == -1:
j=j-1
value = id[j]
if(int(value)+1 == int(value2)):
if(value2 != -1):
id.insert(i,-1)
z=z+1
This code identifies any missing value, but then fills the rest of the list (From the missing value to the last value with -1).
Any help would be appreciated. Thank you, and sorry for any English mistakes.
One somewhat easy way to do this is to make a set of the numbers. Then you can count from the lowest to the biggest and look for the number in the set. If it's there, you're all good. If it's not there, then yield a -1.
def fill_range(initial_range, fill_value):
    smallest = initial_range[0]
    biggest = initial_range[-1]
    items = set(initial_range)
    for i in range(smallest, biggest+1): # use xrange on python2.x
        if i in items:
            yield i
        else:
            yield fill_value
You might use this generator function like this:
print(list(fill_range([1,2,4,5], -1)))
If you haven't seen a generator function before, they're worth learning about but the answer above might be slightly confusing. Here's a version which accumulates a list and then returns it at the end:
def fill_range(initial_range, fill_value):
    result = []
    smallest = initial_range[0]
    biggest = initial_range[-1]
    items = set(initial_range)
    for i in range(smallest, biggest+1):
        if i in items:
            result.append(i)
        else:
            result.append(fill_value)
    return result
You might also notice that the if else suite could be replaced here pretty easily by a conditional expression...
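As a sketch of that last remark, the if/else suite can be folded into a conditional expression inside a list comprehension. This assumes, as the versions above do, that the input list is non-empty and sorted in increasing order.

```python
def fill_range(initial_range, fill_value):
    """Return the full integer range, with fill_value where a number is missing."""
    items = set(initial_range)
    return [i if i in items else fill_value
            for i in range(initial_range[0], initial_range[-1] + 1)]


print(fill_range([1, 2, 4, 5], -1))
print(fill_range([1, 4, 5], -1))
```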
You need only one additional variable to keep track of missing elements of the sequence.
def insert_minus_ones(lst):
    new_lst = []
    last = lst[0] - 1
    for e in lst:
        while (last + 1) != e:
            new_lst.append(-1)
            last += 1
        new_lst.append(e)
        last += 1
    return new_lst
The code above works for any sequences of numbers:
>>> insert_minus_ones([1,2,4,6,10])
[1, 2, -1, 4, -1, 6, -1, -1, -1, 10]
>>> insert_minus_ones([-5,-4,-3,2])
[-5, -4, -3, -1, -1, -1, -1, 2]

Python: speed up removal of every n-th element from list

I'm trying to solve this programming riddle and although the solution (see code below) works correctly, it is too slow for successful submission.
Any pointers as how to make this run
faster (removal of every n-th element from a list)?
Or suggestions for a better algorithm to calculate the same; seems I can't think of anything
else than brute-force for now...
Basically, the task at hand is:
GIVEN:
L = [2,3,4,5,6,7,8,9,10,11,........]
1. Take the first remaining item in list L (in the general case 'n'). Move it to
the 'lucky number list'. Then drop every 'n-th' item from the list.
2. Repeat 1
TASK:
Calculate the n-th number from the 'lucky number list' ( 1 <= n <= 3000)
My original code (it calculated the 3000 first lucky numbers in about a second on my machine - unfortunately too slow):
"""
SPOJ Problem Set (classical) 1798. Assistance Required
URL: http://www.spoj.pl/problems/ASSIST/
"""
sieve = range(3, 33900, 2)
luckynumbers = [2]
while True:
    wanted_n = input()
    if wanted_n == 0:
        break
    while len(luckynumbers) < wanted_n:
        item = sieve[0]
        luckynumbers.append(item)
        items_to_delete = set(sieve[::item])
        sieve = filter(lambda x: x not in items_to_delete, sieve)
    print luckynumbers[wanted_n-1]
EDIT: thanks to the terrific contributions of Mark Dickinson, Steve Jessop and gnibbler, I arrived at the following, which is a whole lot faster than my original code (and was successfully submitted at http://www.spoj.pl with 0.58 seconds!)...
sieve = range(3, 33810, 2)
luckynumbers = [2]
while len(luckynumbers) < 3000:
    if len(sieve) < sieve[0]:
        luckynumbers.extend(sieve)
        break
    luckynumbers.append(sieve[0])
    del sieve[::sieve[0]]
while True:
    wanted_n = input()
    if wanted_n == 0:
        break
    else:
        print luckynumbers[wanted_n-1]
This series is called ludic numbers
__delslice__ should be faster than __setslice__+filter
>>> L=[2,3,4,5,6,7,8,9,10,11,12]
>>> lucky=[]
>>> lucky.append(L[0])
>>> del L[::L[0]]
>>> L
[3, 5, 7, 9, 11]
>>> lucky.append(L[0])
>>> del L[::L[0]]
>>> L
[5, 7, 11]
So the loop becomes.
while len(luckynumbers) < 3000:
    item = sieve[0]
    luckynumbers.append(item)
    del sieve[::item]
Which runs in less than 0.1 second
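The snippets in this thread are Python 2, where range() returns a list. A Python 3 sketch of the same slice-deletion loop needs an explicit list(); the function name `first_lucky_numbers` and its `limit` parameter are made up for this example, and `limit` must be large enough to contain the requested numbers.

```python
def first_lucky_numbers(count, limit):
    """Return up to `count` lucky numbers drawn from the candidates 2..limit-1."""
    sieve = list(range(2, limit))  # range() is lazy in Python 3, so copy it
    lucky = []
    while sieve and len(lucky) < count:
        item = sieve[0]
        lucky.append(item)
        del sieve[::item]  # drops positions 0, item, 2*item, ... in one step
    return lucky


print(first_lucky_numbers(10, 100))
```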
Try using these two lines for the deletion and filtering, instead of what you have; filter(None, ...) runs considerably faster than the filter(lambda ...).
sieve[::item] = [0]*-(-len(sieve)//item)
sieve = filter(None, sieve)
Edit: much better to simply use del sieve[::item]; see gnibbler's solution.
You might also be able to find a better termination condition for the while loop: for example, if the first remaining item in the sieve is i then the first i elements of the sieve will become the next i lucky numbers; so if len(luckynumbers) + sieve[0] >= wanted_n you should already have computed the number you need---you just need to figure out where in sieve it is so that you can extract it.
On my machine, the following version of your inner loop runs around 15 times faster than your original for finding the 3000th lucky number:
while len(luckynumbers) + sieve[0] < wanted_n:
    item = sieve[0]
    luckynumbers.append(item)
    sieve[::item] = [0]*-(-len(sieve)//item)
    sieve = filter(None, sieve)
print (luckynumbers + sieve)[wanted_n-1]
An explanation on how to solve this problem can be found here. (The problem I linked to asks for more, but the main step in that problem is the same as the one you're trying to solve.) The site I linked to also contains a sample solution in C++.
The set of numbers can be represented in a binary tree, which supports the following operations:
Return the nth element
Erase the nth element
These operations can be implemented to run in O(log n) time, where n is the number of nodes in the tree.
To build the tree, you can either make a custom routine that builds the tree from a given array of elements, or implement an insert operation (make sure to keep the tree balanced).
Each node in the tree need the following information:
Pointers to the left and right children
How many items there are in the left and right subtrees
With such a structure in place, solving the rest of the problem should be fairly straightforward.
I also recommend calculating the answers for all possible input values before reading any input, instead of calculating the answer for each input line.
A Java implementation of the above algorithm gets accepted in 0.68 seconds at the website you linked.
(Sorry for not providing any Python-specific help, but hopefully the algorithm outlined above will be fast enough.)
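A rough Python 3 sketch of the "n-th element / erase n-th element" structure follows. Note the substitution: instead of the pointer-based balanced binary tree described above, it uses a Fenwick tree (binary indexed tree) over a fixed array, which supports the same two operations in O(log n). The names `IndexedSet` and `lucky_numbers` are invented for this example.

```python
class IndexedSet:
    """Fixed sequence of values with k-th lookup and erase in O(log n)."""

    def __init__(self, values):
        self.values = list(values)
        self.n = len(self.values)
        self.size = self.n
        # tree[i] counts the live slots in the Fenwick range ending at slot i (1-based).
        self.tree = [0] * (self.n + 1)
        for i in range(1, self.n + 1):
            self.tree[i] += 1
            parent = i + (i & -i)
            if parent <= self.n:
                self.tree[parent] += self.tree[i]

    def _slot(self, k):
        """Return the 1-based slot that holds the k-th (0-based) live value."""
        pos, remaining = 0, k + 1
        for shift in range(self.n.bit_length(), -1, -1):
            nxt = pos + (1 << shift)
            if nxt <= self.n and self.tree[nxt] < remaining:
                pos = nxt
                remaining -= self.tree[nxt]
        return pos + 1

    def kth(self, k):
        return self.values[self._slot(k) - 1]

    def erase_kth(self, k):
        i = self._slot(k)
        while i <= self.n:
            self.tree[i] -= 1
            i += i & -i
        self.size -= 1


def lucky_numbers(count, limit):
    """Return up to `count` lucky numbers drawn from the candidates 2..limit-1."""
    candidates = IndexedSet(range(2, limit))
    lucky = []
    while candidates.size and len(lucky) < count:
        item = candidates.kth(0)
        lucky.append(item)
        # Erase positions 0, item, 2*item, ... starting from the back, so the
        # positions still waiting to be erased are not shifted.
        for k in reversed(range(0, candidates.size, item)):
            candidates.erase_kth(k)
    return lucky


print(lucky_numbers(10, 100))
```

Each round costs O((m / item) log n) for m remaining candidates, instead of the O(m) list compaction done by slice deletion. Whether that is actually faster in Python for these sizes would need measuring.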
You're better off using an array and zeroing out every Nth item using that strategy; after you do this a few times in a row, the updates start getting tricky so you'd want to re-form the array. This should improve the speed by at least a factor of 10. Do you need vastly better than that?
Why not just create a new list?
L = [x for (i, x) in enumerate(L) if i % n]
