I'm practicing how to numerically solve difference equations, but I often run into problems like the one below.
Can anyone help me sort this out?
import numpy as np
N = 10
#alternative 1
#x = np.zeros(N+1, int) # Produces error IndexError: index 11 is out of bounds for axis 0 with size 11
#alternative 2
x = (N+1)*[0] # Produces error: IndexError: list assignment index out of range
x[0] = 1000
r = 1.02
for n in range(1, N+1):
    x[n+1] = r**(n+1)*x[0]
    print(f"x[{n}] = {x[n+1]}")
Fixing the indices
The range of your indices is inconsistent with the way you use them in the loop. You can use either of the following two possible loops, but don't mix them:
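# Option 1: n runs from 1 to N and indexes x[n]
for n in range(1, N+1):
    x[n] = r**n * x[0]
# Option 2: n runs from 0 to N-1 and indexes x[n+1]
for n in range(0, N):
    x[n+1] = r**(n+1) * x[0]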
Optimization: multiplications instead of exponentiations
Note that exponentiation (**) is generally more costly than a single multiplication (*); you can slightly optimize your code by using the recurrence x[n] = r * x[n-1] instead of the closed form:
# Option 1
for n in range(1, N+1):
    x[n] = r * x[n-1]
# Option 2
for n in range(0, N):
    x[n+1] = r * x[n]
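Putting the pieces together, a complete corrected version of the program from the question could look like the sketch below. It assumes the same N, r and starting value as the question and uses the recurrence with the first loop form; the last digits may differ slightly from the closed form because of floating-point rounding.

```python
N = 10
r = 1.02

x = (N + 1) * [0]   # valid indices are 0..N
x[0] = 1000

for n in range(1, N + 1):
    x[n] = r * x[n - 1]          # one multiplication per step
    print(f"x[{n}] = {x[n]}")
```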
Using library functions: itertools, numpy or pandas
What you are asking for is called a geometric progression. Python provides several ways of computing geometric progressions without writing the loop yourself.
Documentation: numpy.geomspace
Documentation: itertools.accumulate
Question: Geometric progression using Python / Pandas / Numpy
Question: python geometric sequence
Question: Generate a geometric progression using list comprehension
Question: Making a list of a geometric progression when the ratio and range are given
Question: Writing python code to calculate a Geometric progression
For instance:
import itertools # accumulate, repeat
import operator # mul
def geometric_progression(x0, r, N):
    return list(itertools.accumulate(itertools.repeat(r,N), operator.mul, initial=x0))
print(geometric_progression(1000, 1.2, 10))
# [1000, 1200.0, 1440.0, 1728.0, 2073.6, 2488.3199999999997, 2985.9839999999995, 3583.180799999999, 4299.816959999999, 5159.780351999999, 6191.736422399998]
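The numpy.geomspace route listed above can be sketched in a similar way. The function name geometric_progression_np is made up for this illustration; geomspace takes the first and last term rather than the ratio, so the last term x0 * r**N is computed explicitly, and x0 and r are assumed to be positive.

```python
import numpy as np

def geometric_progression_np(x0, r, N):
    # N + 1 terms from x0 to x0 * r**N, evenly spaced on a log scale,
    # which is exactly a geometric progression with ratio r
    return np.geomspace(x0, x0 * r**N, num=N + 1)

terms = geometric_progression_np(1000, 1.2, 10)
print(terms)
```

Unlike the itertools version, this returns a NumPy array of floats, which is convenient if the values are used in further array arithmetic.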
I think the problem is that list indices start at zero, so the index of the last element is N - 1, where N is the number of elements in the list.
So you should make this change in your for loop:
for n in range(0, N):
Also, your print call should reflect the data in your list, so change its argument to the following:
print(f"x[{n+1}] = {x[n+1]}")
After making these changes, you will get this result:
x[1] = 1020.0
x[2] = 1040.4
x[3] = 1061.208
x[4] = 1082.43216
x[5] = 1104.0808032
x[6] = 1126.1624192640002
x[7] = 1148.68566764928
x[8] = 1171.6593810022657
x[9] = 1195.092568622311
x[10] = 1218.9944199947574
Please note that your list has N + 1 elements, not N, because of this line of your code:
x = (N+1)*[0]
Hope this helps.
The length of your array is 11, which means the last element is accessed by x[10]. But in the loop, the value being called when n is 10 is x[11] which makes it go out of range.
I'm not sure about the constraints of your problem, but if you want to access x[11], change the total size of the array to x = (N+2)*[0].
Output
x[1] = 1040.4
x[2] = 1061.208
x[3] = 1082.43216
x[4] = 1104.0808032
x[5] = 1126.1624192640002
x[6] = 1148.68566764928
x[7] = 1171.6593810022657
x[8] = 1195.092568622311
x[9] = 1218.9944199947574
x[10] = 1243.3743083946524
Related
I'm trying to perform the following nested summation in python:
so I tried the following code:
import numpy as np
gamma = 17
R = 0.5
H = np.array([0.1,0.2])
alpha = np.array([0.1,0.2])
n = 2
F = 0
for i in range(n):
    for j in range(i+1):
        F = F + 3*gamma*H[i]*(R+H[j]*np.tan(alpha[j]))**2
But of course, this isn't giving me the right answer since it is summing all the terms again in the j loop. My question is how I can solve it? Bear in mind that this is just a small piece of a big expression with several summations for j like the one above inside a summation for i, so it must be something a little optimized. Thank you in advance!
I see at least the following options (ordered by increasing efficiency):
Just the readable Pythonic Way
One nice thing about python is that you can write these expressions in very close analogy to the way it is written in math. In your case, you want to sum over an iterable of numbers:
f = sum(
    3 * gamma * H[i] * (
        R + (
            sum(
                H[j] * np.tan(alpha[j])
                for j in range(i+1)
            )
        )
    )**2
    for i in range(n)
)
Caching the Inner Sum
In your case, the inner sum
sum(
    H[j] * np.tan(alpha[j])
    for j in range(i+1)
)
is calculated multiple times, while it just increments in every iteration. Let's just call this term inner_sum(index). Then inner_sum(index-1) has already been calculated in the previous iteration, so we lose time when we recalculate it in every iteration. One approach could be to make inner_sum a function and cache its previous results. We could use functools.cache for that purpose:
from functools import cache
@cache
def inner_sum(index: int) -> float:
    if not index:
        return H[0] * np.tan(alpha[0])
    return inner_sum(index - 1) + H[index] * np.tan(alpha[index])
Now, we can just write:
f = sum(
    3 * gamma * H[i] * (
        R + inner_sum(i)
    )**2
    for i in range(n)
)
Using a Generator for the Partial Sum
This is still not memory-efficient, because the cache keeps every inner_sum(i) for i < index in memory, while we actually need only the last one. There are different ways to implement an object which only stores the last value. You could just store it in a variable inner_sum_previous, for example. Or you could make inner_sum a proper generator yielding the partial sums one after another:
from typing import Generator
def partial_sum() -> Generator[float, None, None]:
    partsum = 0
    index = 0
    while True:
        try:
            partsum += H[index] * np.tan(alpha[index])
        except IndexError:
            # End of the data: finish the generator. Raising StopIteration
            # inside a generator is turned into a RuntimeError since Python 3.7.
            return
        yield partsum
        index += 1
With this, we would write:
partial_sum_generator = partial_sum()
f = sum(
    3 * gamma * H[i] * (
        R + next(partial_sum_generator)
    )**2
    for i in range(n)
)
A for loop is, in this case, very similar to ∑: everything that is outside a ∑ in your formula should be outside the corresponding for loop, i.e.:
...
F = 0
for i in range(n):
    inner_sum = 0
    for j in range(i+1):
        inner_sum += H[j] * np.tan(alpha[j])
    F += 3 * gamma * H[i] * (R + inner_sum) ** 2
Calculate the part inside the parenthesis first in the j loop, store it in a variable, then multiply it by the rest of the expression afterwards.
Just for compactness and readability, I would go for something like this:
F = 0
for i in range(n):
    F += 3*gamma*H[i]*(R + np.sum([H[j]*np.tan(alpha[j]) for j in range(i+1)]))**2
Obviously, you can also convert the first for loop into a sum over a list as I did with the second summation to make the expression more compact, but I think it is more readable this way.
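Since the inner sum is a running total, the whole expression can also be written without explicit loops. The sketch below assumes the formula implied by the answers above, F = Σ_i 3·gamma·H[i]·(R + Σ_{j≤i} H[j]·tan(alpha[j]))², and uses np.cumsum for the partial sums; the sample values are the ones from the question.

```python
import numpy as np

gamma = 17
R = 0.5
H = np.array([0.1, 0.2])
alpha = np.array([0.1, 0.2])

# inner[i] = sum of H[j] * tan(alpha[j]) for j = 0..i
inner = np.cumsum(H * np.tan(alpha))

# outer sum over i, evaluated for all i at once
F = np.sum(3 * gamma * H * (R + inner) ** 2)
print(F)
```

This does one pass over the data, so it should scale well when the same pattern appears many times in a larger expression.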
I'm trying to solve this question:
Given a positive integral number n, return a strictly increasing
sequence (list/array/string depending on the language) of numbers, so
that the sum of the squares is equal to n².
If there are multiple solutions (and there will be), return the result
with the largest possible value:
Basically, a squared number deconstructed into smaller squares. However, my code only works efficiently for small numbers (roughly 20 at most for the first piece of code, about 30 for the second) and gets exponentially slower beyond that. How can I improve the code?
I don't know enough about efficiency in the language to apply it to my code. In the first piece of code below, I tried to reduce memory usage by having comb discard any irrelevant data.
For the second piece of code I tried using recursion to solve my problem, but it still gets exponentially slower. Perhaps I need a new method? Much appreciated.
import itertools as it
from math import sqrt
def decompose(n):
    squares = [i ** 2 for i in range(1, n) if (i ** 2)/2 < n ** 2]
    comb = [list(i) for i in (reduce(lambda acc, x: acc + list(it.combinations(squares, x)),
                                     range(1, len(squares) + 1), [])) if sum(i) == n ** 2]
    print [int(sqrt(i)) for i in max(comb)]
decompose(20)
This was the first attempt; it was too slow, so I tried this:
from math import sqrt
stuff = []
def decompose(a):
    def subset_sum(numbers, target, partial=[]):
        s = sum(partial)
        if s == target:
            stuff.append(partial)
        if s >= target:
            return
        for i in range(len(numbers)):
            n = numbers[i]
            remaining = numbers[i+1:]
            subset_sum(remaining, target, partial + [n])
    compare = 1
    large = None
    subset_sum([x**2 for x in range(1, a)], a**2)
    for y in stuff:
        if compare < y[-1]:
            compare = y[-1]
            large = y
    print [int(sqrt(o)) for o in large]
decompose(30)
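Both attempts enumerate every subset before picking the best one. One possible alternative, not taken from this thread, is a backtracking search that tries the largest candidate first and stops at the first complete decomposition. The sketch below is written for Python 3.8+ (it uses math.isqrt), and the helper name search is made up for the illustration.

```python
from math import isqrt

def decompose(n):
    """Return a strictly increasing list whose squares sum to n*n, or None."""
    def search(remaining, upper):
        # Look for distinct integers below `upper` whose squares add up to
        # `remaining`, always trying the largest candidate first.
        if remaining == 0:
            return []
        for i in range(min(upper - 1, isqrt(remaining)), 0, -1):
            rest = search(remaining - i * i, i)
            if rest is not None:
                return rest + [i]   # rest only holds values smaller than i
        return None
    return search(n * n, n)

print(decompose(11))
print(decompose(50))
```

Because the largest values are tried first, the first solution found is also the one with the largest possible values, so nothing has to be collected and compared afterwards.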
Say I have 4 numpy arrays A,B,C,D , each the size of (256,256,1792).
I want to go through each element of those arrays and do something to it, but I need to do it in chunks of 256x256x256-cubes.
My code looks like this:
for l in range(7):
    x, y, z, t = 0,0,0,0
    for m in range(a.shape[0]):
        for n in range(a.shape[1]):
            for o in range(256*l,256*(l+1)):
                t += D[m,n,o] * constant
                x += A[m,n,o] * D[m,n,o] * constant
                y += B[m,n,o] * D[m,n,o] * constant
                z += C[m,n,o] * D[m,n,o] * constant
    final = (x+y+z)/t
    doOutput(final)
The code works and outputs exactly what I want, but it's awfully slow. I've read online that this kind of nested for loop should be avoided in Python. What is the cleanest solution? (Right now I'm trying to do this part of my code in C and somehow import it via Cython or other tools, but I'd love a pure Python solution.)
Thanks
Add on
Willem Van Onsem's Solution to the first part seems to work just fine and I think I comprehend it. But now I want to modify my values before summing them. It looks like
(within the outer l loop)
for m in range(a.shape[0]):
    for n in range(a.shape[1]):
        for o in range(256*l,256*(l+1)):
            R += (D[m,n,o] * constant * (A[m,n,o]**2
                  + B[m,n,o]**2 + C[m,n,o]**2)/t - final**2)
doOutput(R)
I obviously can't just square the sum x = (A[:a.shape[0],:a.shape[1],256*l:256*(l+1)]*Dsub).sum()**2*constant since (A²+B²) != (A+B)²
How can I rewrite these last for loops?
Since you update t with every element of m in range(a.shape[0]), n in range(a.shape[1]) and o in range(256*l,256*(l+1)), you can substitute:
for m in range(a.shape[0]):
    for n in range(a.shape[1]):
        for o in range(256*l,256*(l+1)):
            t += D[m,n,o]
With:
t += D[:a.shape[0],:a.shape[1],256*l:256*(l+1)].sum()
The same for the other assignments. So you can rewrite your code to:
for l in range(7):
    Dsub = D[:a.shape[0],:a.shape[1],256*l:256*(l+1)]
    x = (A[:a.shape[0],:a.shape[1],256*l:256*(l+1)]*Dsub).sum()*constant
    y = (B[:a.shape[0],:a.shape[1],256*l:256*(l+1)]*Dsub).sum()*constant
    z = (C[:a.shape[0],:a.shape[1],256*l:256*(l+1)]*Dsub).sum()*constant
    t = Dsub.sum()*constant
    final = (x+y+z)/t
    doOutput(final)
Note that * in numpy is element-wise multiplication, not the matrix product. You could multiply by the constant before summing, but since the sum of products with a constant equals the constant times the sum, I think it is more efficient to do this outside the sum.
If a.shape[0] is equal to D.shape[0], etc., you can use : instead of :a.shape[0]. Based on your question, that seems to be the case, so:
# only when `a.shape[0] == D.shape[0], a.shape[1] == D.shape[1] (and so for A, B and C)`
for l in range(7):
    Dsub = D[:,:,256*l:256*(l+1)]
    x = (A[:,:,256*l:256*(l+1)]*Dsub).sum()*constant
    y = (B[:,:,256*l:256*(l+1)]*Dsub).sum()*constant
    z = (C[:,:,256*l:256*(l+1)]*Dsub).sum()*constant
    t = Dsub.sum()*constant
    final = (x+y+z)/t
    doOutput(final)
Processing the .sum() on the numpy level will boost performance since you do not convert values back and forth and with .sum(), you use a tight loop.
EDIT:
Your updated question does not change much. You can simply use:
m, n, *_ = a.shape
lo, hi = 256*l, 256*(l+1)
R = (D[:m,:n,lo:hi]*constant*(A[:m,:n,lo:hi]**2+B[:m,:n,lo:hi]**2+C[:m,:n,lo:hi]**2)/t-final**2).sum()
doOutput(R)
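To see both vectorized steps working together, here is a small self-contained sketch. The function name chunk_stats, the tiny array shape and the chunk size of 4 are assumptions chosen so the example runs quickly; the arithmetic follows the loops in the question.

```python
import numpy as np

def chunk_stats(A, B, C, D, constant, chunk):
    """Return a list of (final, R) pairs, one per chunk along the last axis."""
    results = []
    for l in range(D.shape[2] // chunk):
        lo, hi = chunk * l, chunk * (l + 1)
        Asub, Bsub, Csub, Dsub = (arr[:, :, lo:hi] for arr in (A, B, C, D))
        t = Dsub.sum() * constant
        x = (Asub * Dsub).sum() * constant
        y = (Bsub * Dsub).sum() * constant
        z = (Csub * Dsub).sum() * constant
        final = (x + y + z) / t
        # square element-wise first, then sum: sum(A**2) is not sum(A)**2
        R = (Dsub * constant * (Asub**2 + Bsub**2 + Csub**2) / t - final**2).sum()
        results.append((final, R))
    return results

rng = np.random.default_rng(0)
shape = (4, 4, 12)
A, B, C, D = (rng.random(shape) for _ in range(4))
stats = chunk_stats(A, B, C, D, constant=2.0, chunk=4)
print(stats)
```

With the real data the call would use chunk=256, giving seven pairs for arrays of shape (256, 256, 1792).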
I was implementing and testing answers to this SO question -
Given an array of integers find the number of all ordered pairs of elements in the array whose sum lies in a given range [a,b]
The answer with the most upvotes (currently) only provides a text description of an algorithm that should be O(NlogN):
Sort the array... .
For each element x in the array:
Consider the array slice after the element.
Do a binary search on this array slice for [a - x], call it y0. If no exact match is found, consider the closest match bigger than [a - x] as y0.
Output all elements (x, y) from y0 forwards as long as x + y <= b. ... If you only need to count the number of pairs, you can do it in O(nlogn). Modify the above algorithm so [b - x] (or the next smaller element) is also searched for.
My implementation:
import bisect
def ani(arr, a, b):
    # Sort the array (say in increasing order).
    arr.sort()
    count = 0
    for ndx, x in enumerate(arr):
        # Consider the array slice after the element
        after = arr[ndx+1:]
        # Do a binary search on this array slice for [a - x], call it y0
        lower = a - x
        y0 = bisect.bisect_left(after, lower)
        # If you only need to count the number of pairs
        # Modify the ... algorithm so [b - x] ... is also searched for
        upper = b - x
        y1 = bisect.bisect_right(after, upper)
        count += y1 - y0
    return count
When I plot Time versus N or some function of N I am seeing an exponential or N^2 response.
# generate timings
import random
import time
from timeit import Timer
T = list()  # run-times
N = range(100, 10001, 100)  # N
arr = [random.randint(-10, 10) for _ in xrange(1000000)]
print 'start'
start = time.time()
for n in N:
    arr1 = arr[:n]
    t = Timer('ani(arr1, 5, 16)', 'from __main__ import arr1, ani')
    timing_loops = 100
    T.append(t.timeit(timing_loops) / timing_loops)
Is my implementation incorrect or is the author's claim incorrect?
Here are some plots of the data.
T vs N
T / NlogN vs N - one commenter thought this should NOT produce a linear plot - but it does.
T vs NlogN - I thought this should be linear if the complexity is NlogN but it is not.
If nothing else, this is your error:
for ndx, x in enumerate(arr):
    # Consider the array slice after the element
    after = arr[ndx+1:]
arr[ndx+1:] creates a copy of the list of length len(arr) - ndx - 1, so each iteration costs O(n) and the whole loop is O(n^2).
Instead, use the lo and hi arguments to bisect.bisect.
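A sketch of that change: the search is restricted to the part of the sorted list after position ndx by passing lo and hi to bisect_left and bisect_right, so no slice is copied. The function name count_pairs_in_range is made up here, and the input is sorted into a new list rather than in place.

```python
import bisect

def count_pairs_in_range(arr, a, b):
    """Count pairs (i < j) of the sorted array with a <= arr[i] + arr[j] <= b."""
    arr = sorted(arr)
    n = len(arr)
    count = 0
    for ndx, x in enumerate(arr):
        lo = ndx + 1                                     # only look after x
        y0 = bisect.bisect_left(arr, a - x, lo, n)       # first y with x + y >= a
        y1 = bisect.bisect_right(arr, b - x, lo, n)      # first y with x + y > b
        count += y1 - y0
    return count

print(count_pairs_in_range([1, 2, 3, 4], 3, 5))
```

Each iteration now does two binary searches and nothing else, so the loop is O(n log n) on top of the sort.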
Recently I became interested in the subset-sum problem, which is finding a zero-sum subset in a superset. I found some solutions on SO and also came across a particular solution that uses the dynamic programming approach. I translated that solution into Python based on its author's qualitative description. I'm trying to optimize it for larger lists, which eat up a lot of my memory. Can someone recommend optimizations or other techniques to solve this particular problem? Here's my attempt in Python:
import random
from time import time
from itertools import product
time0 = time()
# create a zero matrix of size a (row), b(col)
def create_zero_matrix(a,b):
    return [[0]*b for x in xrange(a)]
# generate a list of size num with random integers with an upper and lower bound
def random_ints(num, lower=-1000, upper=1000):
    return [random.randrange(lower,upper+1) for i in range(num)]
# split a list up into N and P where N be the sum of the negative values and P the sum of the positive values.
# 0 does not count because of additive identity
def split_sum(A):
    N_list = []
    P_list = []
    for x in A:
        if x < 0:
            N_list.append(x)
        elif x > 0:
            P_list.append(x)
    return [sum(N_list), sum(P_list)]
# since the column indexes are in the range from 0 to P - N
# we would like to retrieve them based on the index in the range N to P
# n := row, m := col
def get_element(table, n, m, N):
    if n < 0:
        return 0
    try:
        return table[n][m - N]
    except:
        return 0
# same definition as above
def set_element(table, n, m, N, value):
    table[n][m - N] = value
# input array
#A = [1, -3, 2, 4]
A = random_ints(200)
[N, P] = split_sum(A)
# create a zero matrix of size m (row) by n (col)
#
# m := the number of elements in A
# n := P - N + 1 (by definition N <= s <= P)
#
# each element in the matrix will be a value of either 0 (false) or 1 (true)
m = len(A)
n = P - N + 1;
table = create_zero_matrix(m, n)
# set first element in index (0, A[0]) to be true
# Definition: Q(1,s) := (x1 == s). Note that index starts at 0 instead of 1.
set_element(table, 0, A[0], N, 1)
# iterate through each table element
#for i in xrange(1, m): #row
# for s in xrange(N, P + 1): #col
for i, s in product(xrange(1, m), xrange(N, P + 1)):
    if get_element(table, i - 1, s, N) or A[i] == s or get_element(table, i - 1, s - A[i], N):
        #set_element(table, i, s, N, 1)
        table[i][s - N] = 1
# find zero-sum subset solution
s = 0
solution = []
for i in reversed(xrange(0, m)):
    if get_element(table, i - 1, s, N) == 0 and get_element(table, i, s, N) == 1:
        s = s - A[i]
        solution.append(A[i])
print "Solution: ",solution
time1 = time()
print "Time execution: ", time1 - time0
I'm not quite sure if your solution is exact or a PTA (poly-time approximation).
But, as someone pointed out, this problem is indeed NP-Complete.
Meaning, every known (exact) algorithm has an exponential time behavior on the size of the input.
Meaning, if you can process 1 operation in 0.1 nanosecond (10^10 operations per second), then for a list of 59 elements it will take:
2^59 ops --> 2^59 / 10,000,000,000 seconds ≈ 2^26 seconds --> 2^26 / (3600 x 24 x 365) years --> roughly 2 years
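The estimate above can be checked with a few lines of arithmetic. The sketch assumes the rate of 10^10 operations per second that appears as the denominator, and a 365-day year.

```python
ops = 2 ** 59                      # subsets to examine for 59 elements
ops_per_second = 10 ** 10          # assumed processing rate
seconds = ops / ops_per_second
years = seconds / (3600 * 24 * 365)
print(f"{seconds:.3g} seconds, about {years:.1f} years")
```

Each additional element doubles the figure, which is why brute force stops being practical so quickly.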
You can find heuristics, which give you just a CHANCE of finding an exact solution in polynomial time.
On the other side, if you restrict the problem (to another) using bounds for the values of the numbers in the set, then the problem complexity reduces to polynomial time. But even then the memory space consumed will be a polynomial of VERY High Order.
The memory consumed will be much larger than the few gigabytes you have in memory.
And even much larger than the few tera-bytes on your hard drive.
( That's for small values of the bound for the value of the elements in the set )
Maybe this is the case with your dynamic programming algorithm.
It seemed to me that you were using a bound of 1000 when building your initialization matrix.
You can try a smaller bound. That is, if your input consistently consists of small values.
Good Luck!
Someone on Hacker News came up with the following solution to the problem, which I quite liked. It just happens to be in python :):
def subset_summing_to_zero (activities):
    subsets = {0: []}
    for (activity, cost) in activities.iteritems():
        old_subsets = subsets
        subsets = {}
        for (prev_sum, subset) in old_subsets.iteritems():
            subsets[prev_sum] = subset
            new_sum = prev_sum + cost
            new_subset = subset + [activity]
            if 0 == new_sum:
                new_subset.sort()
                return new_subset
            else:
                subsets[new_sum] = new_subset
    return []
I spent a few minutes with it and it worked very well.
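The snippet above is Python 2 (iteritems). A possible Python 3 rendering of the same idea is sketched below; it keeps one example subset per reachable sum in a dictionary. The parameter name costs and the sample data are made up for the illustration, and, as an assumption of this sketch, an already stored subset for a sum is never overwritten.

```python
def subset_summing_to_zero(costs):
    """costs maps a name to an integer; return names whose costs sum to 0."""
    subsets = {0: []}                      # reachable sum -> one subset giving it
    for name, cost in costs.items():
        new_entries = {}
        for prev_sum, subset in subsets.items():
            new_sum = prev_sum + cost
            new_subset = subset + [name]
            if new_sum == 0:
                return sorted(new_subset)
            new_entries.setdefault(new_sum, new_subset)
        # merge after the inner loop so the dict is not modified while iterating
        for total, subset in new_entries.items():
            subsets.setdefault(total, subset)
    return []

print(subset_summing_to_zero({"a": 1, "b": -3, "c": 2, "d": 4}))
```

Memory use is proportional to the number of distinct reachable sums, which is often far smaller than the full table of the dynamic programming version.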
An interesting article on optimizing python code is available here. Basically the main result is that you should inline your frequent loops, so in your case this would mean instead of calling get_element twice per loop, put the actual code of that function inside the loop in order to avoid the function call overhead.
Hope that helps! Cheers
The first thing that caught my eye:
def split_sum(A):
    N_list = 0
    P_list = 0
    for x in A:
        if x < 0:
            N_list+=x
        elif x > 0:
            P_list+=x
    return [N_list, P_list]
Some advice:
Try to use a 1D list and use bitarray to reduce the memory footprint to a minimum (http://pypi.python.org/pypi/bitarray), so you will only have to change the get / set functions. This should reduce your memory footprint by a factor of at least 64 (an integer in a list is a pointer to an integer object with a type, so the factor can be 3*32).
Avoid using try / except; instead, figure out the proper ranges at the beginning. You might find that you gain a lot of speed.
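As a rough illustration of the one-bit-per-sum idea, the sketch below uses a plain Python int as the bit set instead of the bitarray package, to stay within the standard library. It only answers whether a non-empty zero-sum subset exists and does not recover the subset; the function name has_zero_sum_subset is made up for this example.

```python
def has_zero_sum_subset(A):
    """Return True if some non-empty subset of A sums to zero."""
    # Shift all sums by the total of the negative values so that
    # bit (s + offset) of `reachable` stands for the sum s.
    offset = -sum(x for x in A if x < 0)
    reachable = 0                          # no non-empty subset seen yet
    for x in A:
        # sums obtained by adding x to an already reachable sum
        shifted = reachable << x if x >= 0 else reachable >> -x
        # plus the subset consisting of x alone
        reachable |= shifted | (1 << (x + offset))
    return bool((reachable >> offset) & 1)

print(has_zero_sum_subset([1, -3, 2, 4]))
```

Only one row of the table is kept, and each reachable sum costs a single bit, so the ranges are fixed up front and no try / except is needed.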
The following code works for Python 3.3+ , I have used the itertools module in Python that has some great methods to use.
from itertools import chain, combinations
def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
nums = input("Enter the Elements").strip().split()
inputSum = int(input("Enter the Sum You want"))
for combo in powerset(nums):
    total = 0  # named total to avoid shadowing the built-in sum
    for num in combo:
        total += int(num)
    if total == inputSum:
        print(combo)
The Input Output is as Follows:
Enter the Elements 1 2 3 4
Enter the Sum You want 5
('1', '4')
('2', '3')
Just change the values in your set w, make an array x as long as w, and pass the target sum for which you want subsets as the last argument of subsetsum, and you will be done (if you want to check with your own values).
def subsetsum(cs,k,r,x,w,d):
    x[k]=1
    if(cs+w[k]==d):
        for i in range(0,k+1):
            if x[i]==1:
                print (w[i],end=" ")
        print()
    elif cs+w[k]+w[k+1]<=d :
        subsetsum(cs+w[k],k+1,r-w[k],x,w,d)
    if((cs +r-w[k]>=d) and (cs+w[k]<=d)) :
        x[k]=0
        subsetsum(cs,k+1,r-w[k],x,w,d)
#driver for the above code
w=[2,3,4,5,0]
x=[0,0,0,0,0]
subsetsum(0,0,sum(w),x,w,7)