So I was attacking a Project Euler problem that seemed pretty simple on a small scale, but as soon as I bump it up to the number I'm supposed to use, the code takes forever to run. This is the question:
The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
Find the sum of all the primes below two million.
I did it in Python. I could wait a few hours for the code to run, but I'd rather find a more efficient way. Here's my code:
x = 1
total = 0
while x <= 2000000:
    y = 1
    z = 0
    while x >= y:
        if x % y == 0:
            z += 1
        y += 1
    if z == 2:
        total += x
    x += 1
print total
As mentioned in the comments, implementing the Sieve of Eratosthenes would be a far better choice. It takes O(n) extra space, an array of length ~2 million in this case, and it runs in O(n log log n), which is astronomically faster than your implementation's O(n²).
I originally wrote this in JavaScript, so bear with my Python:
max = 2000000  # we only need to check the first 2 million numbers
numbers = []
sum = 0

for i in range(2, max):  # 0 and 1 are not primes
    numbers.append(i)    # fill our blank list

for p in range(2, max):
    if numbers[p - 2] != -1:  # if p (our array starts at 2, not 0) is not -1
        # it is prime, so add it to our sum
        sum += numbers[p - 2]
        # now, we need to mark every multiple of p as composite, starting at 2p
        c = 2 * p
        while c < max:
            # we'll mark composite numbers as -1
            numbers[c - 2] = -1
            # increment the count to 3p, 4p, 5p, ... np
            c += p

print(sum)
The only confusing part here might be why I used numbers[p - 2]. That's because I skipped 0 and 1, meaning 2 is at index 0. In other words, everything is shifted down by 2 indices.
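For reference, the same sieve can be written with a boolean list indexed directly by the number itself, which avoids the index shift entirely. A minimal sketch of that variant (mine, not part of the original answer):

LIMIT = 2000000

# is_prime[n] is True while n is still a candidate prime
is_prime = [True] * LIMIT
is_prime[0] = is_prime[1] = False  # 0 and 1 are not prime

total = 0
for p in range(2, LIMIT):
    if is_prime[p]:
        total += p
        # cross off every multiple of p, starting at 2p
        for c in range(2 * p, LIMIT, p):
            is_prime[c] = False

print(total)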
Clearly the long pole in this tent is computing the list of primes in the first place. For an artificial situation like this you could get someone else's list (say, this one), parse it and add up the numbers in seconds.
But that's unsporting, in my view. In that case, try the Sieve of Atkin, as noted in this SO answer.
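If you did go the precomputed-list route, the summing step itself is trivial. A minimal sketch, assuming the list is a plain text file (hypothetically named primes.txt here) with one prime per line:

# primes.txt is a hypothetical file containing one prime per line
total = 0
with open("primes.txt") as f:
    for line in f:
        p = int(line)
        if p < 2000000:
            total += p
print(total)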
Related
I'm working on solving the below problem from Project Euler, which in short deals with iterating over n dice and updating their values.
A Long Row of Dice - Project Euler problem #641
Consider a row of n dice all showing 1.
First turn every second die, (2,4,6,…), so that the number showing is increased by 1. Then turn every third die. The sixth die will now show a 3. Then turn every fourth die and so on until every nth die (only the last die) is turned. If the die to be turned is showing a 6 then it is changed to show a 1.
Let f(n) be the number of dice that are showing a 1 when the process finishes. You are given f(100)=2 and f(10^8)=69.
Find f(10^36).
I've written the below code in Python using NumPy, but I can't figure out what I'm doing wrong, since my function's output doesn't match the expected values above. Right now f(100) returns 1 (it should return 2); even f(1000) returns 1.
import numpy as np

def f(n):
    # establish dice and the value sets for the dice
    dice = np.arange(1, n + 1)
    dice_values = np.ones(len(dice))
    turns = range(2, len(dice) + 1)
    print("{a} dice, {b} values, {c} runs to process".format(a=len(dice), b=len(dice_values), c=len(turns)))
    # iterate and update the values of each die
    # in our array of dice
    for turn in turns:
        # if die to be processed is 6, update to 1
        dice_values[(dice_values == 6) & (dice % turn == 0)] = 1
        # update dice_values if the die's index has no remainder
        # from the turn we're processing.
        dice_values += dice % turn == 0
        # output status
        print('Processed every {0} dice'.format(turn))
        print('{0}\n\n'.format(dice_values))
    return "f({n}) = {x}".format(n=n, x=len(np.where(dice_values == 1)))
UPDATE 11/12/18
@Prune's guidance has been extremely helpful. My methodology is now as follows:
Find all the squares from 1 to n.
Find all squares whose number of factors leaves a remainder of 1 when divided by 6.
import numpy as np

# brute force to find number of factors for each n
def factors(n):
    result = []
    i = 1
    # This will loop from 1 to int(sqrt(n))
    while i * i <= n:
        # Check if i divides n without leaving a remainder
        if n % i == 0:
            result.append(i)
            if n // i != i:
                result.append(n // i)
        i += 1
    # Return the number of factors of n
    return len(result)

vect_factors = np.vectorize(factors)

# avoid brute forcing all numbers
def f(n):
    # create an array of 1 to n + 1 and
    # find all perfect squares in that range
    dice = np.arange(1, n + 1)[(np.mod(np.sqrt(np.arange(1, n + 1)), 1) == 0)]
    # find all squares whose number of factors,
    # when divided by 6, leaves a remainder of 1.
    dice = dice[np.mod(vect_factors(dice), 6) == 1]
    return len(dice)
Worth noting - on my machine, I'm unable to run larger than 10^10. While solving this would be ideal, I feel that what I've learned (and determined how to apply) in the process is enough for me.
UPDATE 11/13/2018
I'm continuing to spend a small bit of time trying to optimize this to get it processing more quickly. Here's the updated code base. This evaluates f(10**10) in 1 min and 17 seconds.
import time
from datetime import timedelta
import numpy as np
import math
from itertools import chain, cycle, accumulate

def find_squares(n):
    # squares of 1 .. floor(sqrt(n))
    return np.array([i ** 2 for i in np.arange(1, int(math.sqrt(n)) + 1)])

# count the factors of n via its prime-power decomposition
def find_factors(n):
    def prime_powers(n):
        # c goes through 2, 3, 5, then the infinite (6n+1, 6n+5) series
        for c in accumulate(chain([2, 1, 2], cycle([2, 4]))):
            if c * c > n:
                break
            if n % c:
                continue
            d, p = (), c
            while not n % c:
                n, p, d = n // c, p * c, d + (p,)
            yield (d)
        if n > 1:
            yield ((n,))
    r = [1]
    for e in prime_powers(n):
        r += [a * b for a in r for b in e]
    return len(r)

vect_factors = np.vectorize(find_factors)

# avoid brute forcing all numbers
def f(n):
    # find all perfect squares in the range 1 to n
    start = time.time()
    dice = find_squares(n)
    # find all squares whose number of factors,
    # when divided by 6, leaves a remainder of 1.
    dice = dice[np.mod(vect_factors(dice), 6) == 1]
    diff = str(timedelta(seconds=int(time.time() - start)))
    print("{n} has {remain} dice with a value of 1. Computed in {diff}.".format(n=n, remain=len(dice), diff=diff))
I'm raising an x/y issue. Fixing your 6 => 1 flip will correct your code, but it will not solve the presented problem in reasonable time. To find f(10^36), you're processing 10^36 dice 10^36 times each, even if it's merely a divisibility check in the filter. That's a total of 10^72 checks. I don't know what hardware you have, but even my multi-core monster doesn't loop 10^72 times soon enough for comfort.
Instead, you need to figure out the underlying problem and try to generate a count for integers that fit the description.
The dice are merely a device to count something mod 6. We're counting divisors of a number, including 1 and the number itself. This is the (in)famous divisor function.
The problem at hand doesn't ask us to find σ0(n) for all numbers; it wants us to count how many integers have σ0(n) ≡ 1 (mod 6). These are numbers with 1, 7, 13, 19, ... divisors.
First of all, note that any such count is an odd number. The only integers with an odd number of divisors are perfect squares. Look at the divisor function: how can we tell whether the square of a number will have the desired quantity of factors, 1 (mod 6)?
Does that get you moving?
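To make the hint concrete, here is a small brute-force check (my own sketch, not part of the original hint): count the squares m² <= n whose divisor count is ≡ 1 (mod 6), and confirm it reproduces the given f(100) = 2.

def divisor_count(n):
    # count divisors by trial division up to sqrt(n)
    count, i = 0, 1
    while i * i <= n:
        if n % i == 0:
            count += 1 if i * i == n else 2
        i += 1
    return count

def f(n):
    # only perfect squares have an odd number of divisors,
    # so only they can have a divisor count of 1, 7, 13, ...
    total, m = 0, 1
    while m * m <= n:
        if divisor_count(m * m) % 6 == 1:
            total += 1
        m += 1
    return total

print(f(100))  # 2, matching the given value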
WEEKEND UPDATE
My code to step through 10^18 candidates is still too slow to finish in this calendar year. It did well up to about 10^7 and then bogged down in the O(N log N) checking steps.
However, there are many more restrictions I've noted in my tracing output.
The main one is in characterizing what combinations of prime powers result in a solution. If we reduce each power mod 3, we have the following:
0 values do not affect validity of the result.
1 values make the number invalid.
2 values must be paired.
Also, these conditions are both necessary and sufficient to declare a given number as a solution. Therefore, it's possible to generate the desired solutions without bothering to step through the squares of all integers <= 10^18.
Among other things, we will need only primes up to 10^9: a solution's square root will need at least 2 of any prime factor.
I hope that's enough hints for now ... you'll need to construct an algorithm to generate certain restricted composite combinations with a given upper limit for the product.
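As an illustration of those mod-3 rules (my own sketch, using plain trial-division factorization rather than the restricted generator the hint is pointing toward): take the square root m of a candidate square, reduce each prime exponent of m mod 3, and apply the three conditions.

from collections import Counter

def prime_exponents(m):
    # exponents in the prime factorization of m, via trial division
    exps, d = [], 2
    while d * d <= m:
        e = 0
        while m % d == 0:
            m //= d
            e += 1
        if e:
            exps.append(e)
        d += 1
    if m > 1:
        exps.append(1)
    return exps

def square_is_solution(m):
    # d(m^2) is the product of (2e + 1) over the prime exponents e of m.
    # Mod 6: e % 3 == 0 contributes 1 (no effect), e % 3 == 1 contributes 3
    # (invalid), and e % 3 == 2 contributes 5, which is valid only in pairs
    # since 5 * 5 == 25 == 1 (mod 6).
    residues = Counter(e % 3 for e in prime_exponents(m))
    return residues[1] == 0 and residues[2] % 2 == 0

# cross-check against f(100) = 2: squares m^2 <= 100 mean m <= 10
print(sum(square_is_solution(m) for m in range(1, 11)))  # 2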
As mentioned by Thierry in the comments, you are looping back to 2 when you flip a die showing 6. I'd suggest changing dice_values[(dice_values == 6) & (dice % turn == 0)] = 1 to assign 0 instead, so that the increment which follows takes those dice to 1.
You also have an issue with return "f({n}) = {x}".format(n=n, x=len(np.where(dice_values == 1))): np.where returns a tuple of index arrays, so its len is the number of dimensions (always 1 here), not the count. I'd fix that by replacing x=len(np.where(dice_values == 1)) with x=np.count_nonzero(dice_values == 1).
Doing both these changes gave me an output of f(100)=2
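Putting both fixes together, the corrected brute force might look like this (a sketch; it fixes the output, but as discussed above it is still far too slow for 10^36):

import numpy as np

def f(n):
    dice = np.arange(1, n + 1)
    dice_values = np.ones(n)
    for turn in range(2, n + 1):
        # a 6 that is about to be turned becomes 0 here,
        # so the increment below takes it to 1
        dice_values[(dice_values == 6) & (dice % turn == 0)] = 0
        dice_values += dice % turn == 0
    return np.count_nonzero(dice_values == 1)

print(f(100))  # 2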
My question is two-fold:
1. Is there a way to both efficiently loop over and manipulate an array (using enumerate, for example) and manipulate the loop at the same time?
2. Are there any memory-optimized versions of arrays in Python (like NumPy creating smaller arrays with a specified type)?
I have made an algorithm that finds prime numbers in the range (2, rng) with the Sieve of Eratosthenes.
Note: The problem is nonexistent when searching for primes in 2 - 1,000,000 (under 1 sec total runtime, too). It starts to hurt in the tens and hundreds of millions. So far, after changing the table from including all natural numbers to just the odd ones, the rough maximum range I was able to search was 400 million (200 million odd numbers).
Using while loops instead of for loops decreases performance, at least with the current algorithm.
While NumPy is able to create smaller arrays with type conversion, it actually takes roughly double the time to process with the same code, except with
oddTable = np.int8(np.zeros(size))
in place of
oddTable = [0] * size
and using integers to assign the values "prime" and "not prime", to keep the array's type.
Using pseudo-code, the algorithm would look like this:
oddTable = [0] * size  # Array representing odd numbers excluding 1 up to rng
for item in oddTable:
    if item == 0:  # Prime, since not a product of any previous prime
        set item to "prime"
        set every multiple of item in oddTable to "not prime"
Python is a neat language, particularly when looping over every item in a list, but since the index in, say,
for i in range(1000)
can't be manipulated while in the loop, I had to convert the range a few times to produce an iterable to work with. In the code: "P" marks prime numbers, "_" marks non-primes, and 0 marks not yet checked.
num = 1  # Primes found (2 is prime)
size = int(rng / 2) - 1  # Size of table required to represent odd numbers
oddTable = [0] * size  # Array with odd numbers \ 1: [3, 5, 7, 9...]

new_rng = int((size - 1) / 3)  # To go through every 3rd item
for i in range(new_rng):  # Eliminate no % 3's
    oddTable[i * 3] = "_"
oddTable[0] = "P"  # Set 3 to prime
num += 1

def act(x):  # The actual integer that index x in the table refers to
    x = (x + 1) * 2 + 1
    return x

# Multiples of 2 and 3 eliminated, so all primes are 6k + 1 or 6k + 5
# In the oddTable: remaining primes are either 3*i + 1 or 3*i + 2
# new_rng to loop exactly 1/3 of the table length -> touch every item once
for i in range(new_rng):
    j = 3 * i + 1  # 3*i + 1
    if oddTable[j] == 0:
        num += 1
        oddTable[j] = "P"
        k = act(j)
        multiple = j + k  # The odd multiple indexes of act(j)
        while multiple < size:
            oddTable[multiple] = "_"
            multiple += k
    j += 1  # 3*i + 2
    if oddTable[j] == 0:
        num += 1
        oddTable[j] = "P"
        k = act(j)
        multiple = j + k
        while multiple < size:
            oddTable[multiple] = "_"
            multiple += k
To make your code more pythonic, split your algorithm into smaller chunks (functions), so that each chunk can be grasped easily.
My second comment might astound you: Python comes with "batteries included". In order to program your Eratosthenes' sieve, why do you need to manipulate arrays explicitly and pollute your code with it? Why not create a function (e.g. is_prime) and use the standard memoize decorator that was provided for that purpose? (If you insist on using 2.7, see also the memoization library for Python 2.7.)
The result of the two pieces of advice above might not be the "most efficient", but it will (as I experienced with that exact problem) work well enough, while allowing you to quickly create sleek code that will save your programmer's time (both for creation and maintenance).
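A minimal sketch of that suggestion (my own illustration; it uses functools.lru_cache as the memoize decorator, Python 3):

from functools import lru_cache

@lru_cache(maxsize=None)
def is_prime(n):
    # plain trial division; repeated queries are served from the cache
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

print(sum(1 for n in range(2, 100) if is_prime(n)))  # 25 primes below 100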
I'm stumped on how to speed up my algorithm, which sums multiples in a given range. This is for a problem on codewars.com; here is a link to the problem
codewars link
Here's the code, and I'll explain what's going on at the bottom.
import itertools

def solution(number):
    return multiples(3, number) + multiples(5, number) - multiples(15, number)

def multiples(m, count):
    l = 0
    for i in itertools.count(m, m):
        if i < count:
            l += i
        else:
            break
    return l

print solution(50000000)  # takes 41.8 seconds

# one of the testers takes 50000000000000000000000000000000000000000 as input

# def multiples(m, count):
#     l = 0
#     for i in xrange(m, count, m):
#         l += i
#     return l
So basically the problem asks you to return the sum of all the multiples of 3 and 5 below a given number. Here are the testers.
test.assert_equals(solution(10), 23)
test.assert_equals(solution(20), 78)
test.assert_equals(solution(100), 2318)
test.assert_equals(solution(200), 9168)
test.assert_equals(solution(1000), 233168)
test.assert_equals(solution(10000), 23331668)
My program has no problem getting the right answer. The problem arises when the input is large: when I pass in a number like 50000000, it takes over 40 seconds to return the answer. One of the inputs I'm asked to take is 50000000000000000000000000000000000000000, which is a huge number. That's also the reason I'm using itertools.count(): I tried xrange in my first attempt, but xrange can't handle numbers larger than a C long. I know the slowest part of the program is the multiples method, yet it's still faster than my first attempt using a list comprehension and checking whether i % 3 == 0 or i % 5 == 0. Any ideas, guys?
This solution should be faster for large numbers.
def solution(number):
    number -= 1
    a, b, c = number // 3, number // 5, number // 15
    asum, bsum, csum = a*(a+1) // 2, b*(b+1) // 2, c*(c+1) // 2
    return 3*asum + 5*bsum - 15*csum
Explanation:
Take any sequence from 1 to n:
1, 2, 3, 4, ..., n
And its sum will always be given by the formula n(n+1)/2. This can be proven easily if you consider that the expression (1 + n)/2 is just a shortcut for computing the average, or arithmetic mean, of this particular sequence of numbers. Because average(S) = sum(S) / length(S), if you take the average of any sequence of numbers and multiply it by the length of the sequence, you get the sum of the sequence.
If we're given a number n, and we want the sum of the multiples of some given k up to n, including n, we want to find the summation:
k + 2k + 3k + 4k + ... + xk
where xk is the highest multiple of k that is less than or equal to n. Now notice that this summation can be factored into:
k(1 + 2 + 3 + 4 + ... + x)
We are given k already, so now all we need to find is x. If x is defined to be the highest number you can multiply k by to get a natural number less than or equal to n, then we can get the number x by using Python's integer division:
n // k == x
Once we find x, we can find the sum of the multiples of any given k up to a given n using previous formulas:
k(x(x+1)/2)
Our three given k's are 3, 5, and 15.
We find our x's in this line:
a, b, c = number // 3, number // 5, number // 15
Compute the summations of their multiples up to n in this line:
asum, bsum, csum = a*(a+1) // 2, b*(b+1) // 2, c*(c+1) // 2
And finally, multiply their summations by k in this line:
return 3*asum + 5*bsum - 15*csum
And we have our answer!
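As a quick sanity check of the closed form against the question's testers (my own verification sketch):

def solution(number):
    number -= 1
    a, b, c = number // 3, number // 5, number // 15
    asum, bsum, csum = a*(a+1) // 2, b*(b+1) // 2, c*(c+1) // 2
    return 3*asum + 5*bsum - 15*csum

assert solution(10) == 23
assert solution(1000) == 233168
assert solution(10000) == 23331668
# constant time, so even the huge tester input is instant
print(solution(50000000000000000000000000000000000000000))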
Given 2 lists of positive integers, find how many ways you can select a number from each of the lists such that their sum is a prime number.
My code is too slow, as both list1 and list2 contain 50,000 numbers each. Is there any way to make it faster, so it solves the problem in minutes instead of days? :)
def isprime(n):
    # 2 is the only even prime number
    if n == 2: return True
    # all other even numbers are not primes
    if not n & 1: return False
    # range starts with 3 and only needs to go
    # up to the square root of n, over the odd numbers
    for x in range(3, int(n**0.5) + 1, 2):
        if n % x == 0: return False
    return True

n = 0  # number of ways found so far
for i2 in l2:
    for i1 in l1:
        if isprime(i1 + i2):
            n = n + 1  # increasing number of ways
            s = "{0:02d}: {1:d}".format(n, i1 + i2)
            print(s)  # printing out
Sketch:
Following @Steve's advice, first figure out all the primes <= max(l1) + max(l2). Let's call that list primes. Note: primes doesn't really need to be a list; you could instead generate the primes up to the max one at a time.
Swap your lists (if necessary) so that l2 is the longer list. Then turn that into a set: l2 = set(l2).
Sort l1 (l1.sort()).
Then:
for p in primes:
    for i in l1:
        diff = p - i
        if diff < 0:
            # assuming there are no negative numbers in l2;
            # since l1 is sorted, all diffs at and beyond this
            # point will be negative
            break
        if diff in l2:
            # print whatever you like
            # at this point, p is a prime, and is the
            # sum of diff (from l2) and i (from l1)
Alas, if l2 is, for example:
l2 = [2, 3, 100000000000000000000000000000000000000000000000000]
this is impractical. It relies on the fact that, as in your example, max(max(l1), max(l2)) is "reasonably small".
Fleshed out
Hmm! You said in a comment that the numbers in the lists are up to 5 digits long, so they're less than 100,000. And you said at the start that the lists have 50,000 elements each. So each contains about half of all possible integers under 100,000, and you're going to have a very large number of sums that are primes. That's all important if you want to micro-optimize ;-)
Anyway, since the maximum possible sum is less than 200,000, any way of sieving will be fast enough - it will be a trivial part of the runtime. Here's the rest of the code:
def primesum(xs, ys):
    if len(xs) > len(ys):
        xs, ys = ys, xs
    # Now xs is the shorter list.
    xs = sorted(xs)  # don't mutate the input list
    sum_limit = xs[-1] + max(ys)  # largest possible sum
    ys = set(ys)  # make lookups fast
    count = 0
    for p in gen_primes_through(sum_limit):
        for x in xs:
            diff = p - x
            if diff < 0:
                # Since xs is sorted, all diffs at and
                # beyond this point are negative too.
                # Since ys contains no negative integers,
                # no point continuing with this p.
                break
            if diff in ys:
                # print("%s + %s = prime %s" % (x, diff, p))
                count += 1
    return count
I'm not going to supply my gen_primes_through(), because it's irrelevant. Pick one from the other answers, or write your own.
Here's a convenient way to supply test cases:
from random import sample
xs = sample(range(100000), 50000)
ys = sample(range(100000), 50000)
print(primesum(xs, ys))
Note: I'm using Python 3. If you're using Python 2, use xrange() instead of range().
Across two runs, they each took about 3.5 minutes. That's what you asked for at the start ("minutes instead of days"). Python 2 would probably be faster. The counts returned were:
219,334,097
and
219,457,533
The total number of possible sums is, of course, 50000**2 == 2,500,000,000.
About timing
All the methods discussed here, including your original one, take time proportional to the product of the two lists' lengths. All the fiddling is to reduce the constant factor. Here's a huge improvement over your original:
def primesum2(xs, ys):
    sum_limit = max(xs) + max(ys)  # largest possible sum
    count = 0
    primes = set(gen_primes_through(sum_limit))
    for i in xs:
        for j in ys:
            if i + j in primes:
                # print("%s + %s = prime %s" % (i, j, i+j))
                count += 1
    return count
Perhaps you'll understand that one better. Why is it a huge improvement? Because it replaces your expensive isprime(n) function with a blazing fast set lookup. It still takes time proportional to len(xs) * len(ys), but the "constant of proportionality" is slashed by replacing a very expensive inner-loop operation with a very cheap operation.
And, in fact, primesum2() is faster than my primesum() in many cases too. What makes primesum() faster in your specific case is that there are only around 18,000 primes less than 200,000. So iterating over the primes (as primesum() does) goes a lot faster than iterating over a list with 50,000 elements.
A "fast" general-purpose function for this problem would need to pick different methods depending on the inputs.
You should use the Sieve of Eratosthenes to calculate prime numbers.
You are also calculating the prime numbers for each possible combination of sums. Instead, consider finding the maximum value you can achieve with the sum from the lists. Generate a list of all the prime numbers up to that maximum value.
While you are adding up the numbers, you can check whether each sum appears in your prime number list.
I would find the highest number in each list; the primes you need to sieve go up to the sum of those two maxima.
Here is code to sieve out primes:
def eras(n):
    last = n + 1
    sieve = [0, 0] + list(range(2, last))
    sqn = int(round(n ** 0.5))
    it = (i for i in xrange(2, sqn + 1) if sieve[i])
    for i in it:
        sieve[i * i:last:i] = [0] * (n // i - i + 1)
    return filter(None, sieve)
It takes around 3 seconds to find the primes up to 10,000,000. Then I would use the same n² algorithm you are using for generating the sums. I think there is an n log n algorithm, but I can't come up with it.
It would look something like this:
from collections import defaultdict

possible = defaultdict(int)
for x in range1:
    for y in range2:
        possible[x + y] += 1

def eras(n):
    last = n + 1
    sieve = [0, 0] + list(range(2, last))
    sqn = int(round(n ** 0.5))
    it = (i for i in xrange(2, sqn + 1) if sieve[i])
    for i in it:
        sieve[i * i:last:i] = [0] * (n // i - i + 1)
    return filter(None, sieve)

n = max(possible.keys())
primes = eras(n)
possible_primes = set(possible.keys()).intersection(set(primes))

for p in possible_primes:
    print "{0}: {1} possible ways".format(p, possible[p])
Recently I became interested in the subset-sum problem, which is finding a zero-sum subset of a superset. I found some solutions on SO; in addition, I came across a particular solution using the dynamic programming approach, and I translated it into Python based on its qualitative description. I'm trying to optimize this for larger lists, which eats up a lot of my memory. Can someone recommend optimizations or other techniques to solve this particular problem? Here's my attempt in Python:
import random
from time import time
from itertools import product

time0 = time()

# create a zero matrix of size a (rows) by b (cols)
def create_zero_matrix(a, b):
    return [[0] * b for x in xrange(a)]

# generate a list of size num with random integers within the given bounds
def random_ints(num, lower=-1000, upper=1000):
    return [random.randrange(lower, upper + 1) for i in range(num)]

# split a list into N and P, where N is the sum of the negative values
# and P the sum of the positive values.
# 0 does not count because of the additive identity
def split_sum(A):
    N_list = []
    P_list = []
    for x in A:
        if x < 0:
            N_list.append(x)
        elif x > 0:
            P_list.append(x)
    return [sum(N_list), sum(P_list)]

# since the column indexes are in the range from 0 to P - N
# we would like to retrieve them based on the index in the range N to P
# n := row, m := col
def get_element(table, n, m, N):
    if n < 0:
        return 0
    try:
        return table[n][m - N]
    except:
        return 0

# same indexing convention as above
def set_element(table, n, m, N, value):
    table[n][m - N] = value

# input array
# A = [1, -3, 2, 4]
A = random_ints(200)
[N, P] = split_sum(A)

# create a zero matrix of size m (rows) by n (cols)
#
# m := the number of elements in A
# n := P - N + 1 (by definition N <= s <= P)
#
# each element in the matrix will be a value of either 0 (false) or 1 (true)
m = len(A)
n = P - N + 1
table = create_zero_matrix(m, n)

# set first element at index (0, A[0]) to be true
# Definition: Q(1,s) := (x1 == s). Note that indexing starts at 0 instead of 1.
set_element(table, 0, A[0], N, 1)

# iterate through each table element
# for i in xrange(1, m):          # row
#     for s in xrange(N, P + 1):  # col
for i, s in product(xrange(1, m), xrange(N, P + 1)):
    if get_element(table, i - 1, s, N) or A[i] == s or get_element(table, i - 1, s - A[i], N):
        # set_element(table, i, s, N, 1)
        table[i][s - N] = 1

# find a zero-sum subset solution
s = 0
solution = []
for i in reversed(xrange(0, m)):
    if get_element(table, i - 1, s, N) == 0 and get_element(table, i, s, N) == 1:
        s = s - A[i]
        solution.append(A[i])

print "Solution: ", solution

time1 = time()
print "Time execution: ", time1 - time0
I'm not quite sure whether your solution is exact or a PTA (poly-time approximation).
But, as someone pointed out, this problem is indeed NP-complete.
Meaning, every known (exact) algorithm has time behavior exponential in the size of the input.
Meaning, if you can process 1 operation in 0.01 nanoseconds (that's 10,000,000,000 operations per second), then for a list of 59 elements it'll take:
2^59 ops / 10,000,000,000 ops per second ≈ 2^26 seconds, and 2^26 seconds / (3600 x 24 x 365 seconds per year) ≈ 2 years
You can find heuristics, which give you just a CHANCE of finding an exact solution in polynomial time.
On the other hand, if you restrict the problem (to another one) using bounds on the values of the numbers in the set, then the problem's complexity reduces to polynomial time. But even then the memory space consumed will be a polynomial of VERY high order: much larger than the few gigabytes you have in memory, and even much larger than the few terabytes on your hard drive. (That's for small values of the bound on the elements of the set.)
Maybe this is the case with your dynamic programming algorithm. It seemed to me that you were using a bound of 1000 when building your initialization matrix.
You can try a smaller bound. That is, if your input consistently consists of small values.
Good Luck!
Someone on Hacker News came up with the following solution to the problem, which I quite liked. It just happens to be in Python :):
def subset_summing_to_zero(activities):
    subsets = {0: []}
    for (activity, cost) in activities.iteritems():
        old_subsets = subsets
        subsets = {}
        for (prev_sum, subset) in old_subsets.iteritems():
            subsets[prev_sum] = subset
            new_sum = prev_sum + cost
            new_subset = subset + [activity]
            if 0 == new_sum:
                new_subset.sort()
                return new_subset
            else:
                subsets[new_sum] = new_subset
    return []
I spent a few minutes with it and it worked very well.
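For example (my own usage sketch; the function expects a dict mapping item names to signed values, and the code is Python 2, hence iteritems):

# hypothetical input: activity names mapped to signed costs
activities = {'a': 1, 'b': -3, 'c': 2, 'd': 4}

# 1 + (-3) + 2 == 0, so these three sum to zero
print subset_summing_to_zero(activities)  # ['a', 'b', 'c']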
An interesting article on optimizing Python code is available here. Basically, the main result is that you should inline your frequent loops; in your case this would mean putting the actual code of get_element inside the loop, instead of calling it twice per iteration, to avoid the function-call overhead.
Hope that helps! Cheers
First thing that catches the eye:
def split_sum(A):
    N_list = 0
    P_list = 0
    for x in A:
        if x < 0:
            N_list += x
        elif x > 0:
            P_list += x
    return [N_list, P_list]
Some advice:
Try to use a 1D list, and use bitarray to reduce the memory footprint to a minimum (http://pypi.python.org/pypi/bitarray); you will just change the get/set functions. This should reduce your memory footprint by a factor of at least 64 (an integer in a list is a pointer to a boxed integer with type information, so the factor can be 3 x 32).
Avoid using try/except; instead, figure out the proper ranges at the beginning. You may find that you gain significant speed.
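A minimal sketch of the first suggestion (my own illustration; it assumes the third-party bitarray package is installed): store the whole table as one flat bit array and do the bounds checks explicitly.

from bitarray import bitarray

# one flat bit table of m rows by n_cols columns instead of a list of lists;
# each cell costs one bit rather than a pointer to a boxed integer
def create_bit_table(m, n_cols):
    table = bitarray(m * n_cols)
    table.setall(False)
    return table

def get_element(table, n_cols, row, col, N):
    # explicit bounds check in place of try/except
    if row < 0 or not (0 <= col - N < n_cols):
        return False
    return table[row * n_cols + (col - N)]

def set_element(table, n_cols, row, col, N, value):
    table[row * n_cols + (col - N)] = value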
The following code works for Python 3.3+. I have used the itertools module, which has some great functions to use.
from itertools import chain, combinations

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

nums = input("Enter the Elements").strip().split()
inputSum = int(input("Enter the Sum You want"))

for i, combo in enumerate(powerset(nums), 1):
    sum = 0
    for num in combo:
        sum += int(num)
    if sum == inputSum:
        print(combo)
The input/output is as follows:
Enter the Elements 1 2 3 4
Enter the Sum You want 5
('1', '4')
('2', '3')
Just change the values in your set w and, correspondingly, make an array x as long as w; then pass the desired sum as the last argument to the subsetsum function, and you will be done (if you want to check it with your own values).
def subsetsum(cs, k, r, x, w, d):
    x[k] = 1
    if cs + w[k] == d:
        for i in range(0, k + 1):
            if x[i] == 1:
                print(w[i], end=" ")
        print()
    elif cs + w[k] + w[k + 1] <= d:
        subsetsum(cs + w[k], k + 1, r - w[k], x, w, d)
    if (cs + r - w[k] >= d) and (cs + w[k] <= d):
        x[k] = 0
        subsetsum(cs, k + 1, r - w[k], x, w, d)

# driver for the above code
w = [2, 3, 4, 5, 0]
x = [0, 0, 0, 0, 0]
subsetsum(0, 0, sum(w), x, w, 7)