I am trying to modify the function below so that it gives the expected output below. The first calculation goes 100 + 100 * -87/100 = 13, using the equation NonC_Amount + Int_amount * np.cumprod(PnL / 100), since -87 is the first element of PnL. The second calculation should then be 13 + 100 * -4/100 = 9, because the value of NonC_Amount has been updated to the previous result.
PnL = np.array([-87., -4., -34.1, 8.5])
Int_amount = 100
NonC_Amount = 100
PnL.prod(initial=Int_amount)
NonCompounding =NonC_Amount + Int_amount * np.cumprod(PnL / 100)
Current Output:
[ 13, 103.48 , 98.81332, 99.8991322]
Expected Output:
[ 13, 9, -25.1, -16.6]
You are doing the wrong calculation. From your description, it seems you want to do
NonC_Amount = 100
NonCompounding = np.zeros_like(PnL)
for i in range(PnL.shape[0]):
    NonCompounding[i] = NonC_Amount + Int_amount * PnL[i] / 100
    NonC_Amount = NonCompounding[i]
Edit: If you want this operation vectorised, you can do
NonCompounding = NonC_Amount + np.cumsum(Int_amount * PnL / 100)
np.cumprod() gives you the cumulative product up to the ith element. For np.cumprod(PnL) that'd be
[-87, -87*(-4), -87*(-4)*(-34.1), etc...]
and it's only by pure chance it gives you the correct result for the first element.
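To make the difference concrete, here is a minimal pure-Python sketch of what a cumulative sum and a cumulative product do to the values from the question. It uses itertools.accumulate as a stand-in for np.cumsum and np.cumprod (an assumption made only to keep the sketch free of dependencies); the names steps, running_sum and running_prod are illustrative.

```python
from itertools import accumulate
import operator

PnL = [-87.0, -4.0, -34.1, 8.5]
Int_amount = 100
NonC_Amount = 100

# Each step adds Int_amount * PnL[i] / 100 to the running amount.
steps = [Int_amount * p / 100 for p in PnL]

# Cumulative sum: the non-compounding update described in the question.
running_sum = [NonC_Amount + s for s in accumulate(steps)]

# Cumulative product: what the original expression actually computes.
running_prod = [NonC_Amount + Int_amount * c
                for c in accumulate((p / 100 for p in PnL), operator.mul)]

print([round(v, 4) for v in running_sum])
print([round(v, 4) for v in running_prod])
```

Up to floating-point rounding, the first list should match the expected output and the second the current output from the question.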
Creating evenly spaced numbers on a log scale (a geometric progression) can easily be done for a given base and number of elements if the starting and final values of the sequence are known, e.g., with numpy.logspace and numpy.geomspace. Now assume I want to define the geometric progression the other way around, i.e., based on the properties of the resulting geometric series. If I know the sum of the series as well as the first and last element of the progression, can I compute the quotient and number of elements?
For instance, assume the first and last elements of the progression are 1 and 15 and the sum of the series should be equal to 50. I know from trial and error that it works out for n=9 and r≈1.404, but how could these values be computed?
You have enough information to solve it:
Sum of series = a + a*r + a*(r^2) ... + a*(r^(n-1))
= a*((r^n)-1)/(r-1)
= ((last element * r) - a)/(r-1), since last element = a*(r^(n-1))
Given the sum of series, a, and the last element, you can use the above equation to find the value of r.
Plugging in values for the given example:
50 = 1 * ((15*r)-1) / (r-1)
50r - 50 = 15r - 1
35r = 49
r = 1.4
Then, using sum of series = a*((r^n)-1)/(r-1):
50 = 1*((1.4^n)-1)/(1.4-1)
21 = 1.4^n
n = log(21)/log(1.4) ≈ 9.05
You can approximate n and recalculate r if n isn't an integer.
We have to reconstruct the geometric progression, i.e. obtain a, q, m (here ^ means raising to a power):
a, a * q, a * q^2, ..., a * q^(m - 1)
if we know first, last, total:
first = a # first item
last = a * q^(m - 1) # last item
total = a * (q^m - 1) / (q - 1) # sum
Solving these equation we can find
a = first
q = (total - first) / (total - last)
m = log(last / a) / log(q) + 1
Here m is the number of items (it is called n in the code below).
Code:
import math
...
def Solve(first, last, total):
    a = first
    q = (total - first) / (total - last)
    n = math.log(last / a) / math.log(q) + 1
    return (a, q, n)
If you put your data (1, 15, 50) you'll get the solution
a = 1
q = 1.4
n = 9.04836151801382 # not integer
Since n is not an integer, you probably want to adjust it. Let last == 15 stay exact and let total vary. In this case q = (last / first) ^ (1 / (n - 1)) and total = first * (q ^ n - 1) / (q - 1):
a = 1
q = 1.402850552006674
n = 9
total = 49.752 # now n is integer, but total <> 50
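The adjustment can be sketched as follows: solve for q and a possibly fractional n, round n, then recompute q and total while keeping first and last exact. The function name solve_and_adjust and its return layout are hypothetical, chosen only for this illustration, and the sketch assumes first != last and total != last.

```python
import math

def solve_and_adjust(first, last, total):
    """Return (q, n, total) with n rounded to an integer and last kept exact."""
    # Exact solution of the two equations; n is generally not an integer.
    q = (total - first) / (total - last)
    n_exact = math.log(last / first) / math.log(q) + 1

    # Round n, then recompute q so that first * q**(n - 1) == last.
    n = round(n_exact)
    q_adj = (last / first) ** (1 / (n - 1))

    # The sum of the adjusted progression no longer equals the requested total.
    total_adj = first * (q_adj ** n - 1) / (q_adj - 1)
    return q_adj, n, total_adj

q, n, total = solve_and_adjust(1, 15, 50)
print(q, n, total)
```

For the data (1, 15, 50) this should reproduce the adjusted values listed above, up to floating-point rounding.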
You have to solve the following two equations for r and n:
a:= An / Ao = r^(n - 1)
and
s:= Sn / Ao = (r^n - 1) / (r - 1)
You can eliminate n by
s = (r a - 1) / (r - 1)
and solve for r. Then n follows by log(a) / log(r) + 1.
In your case, from s = 50 and a = 15, we obtain r = 7/5 = 1.4 and n = 9.048...
It makes sense to round n to 9, but then r^8 = 15 (r ~ 1.40285) and r = 1.4 are not quite compatible.
I'm trying to convert the for loop in this formula (WMA moving average) from Pine Script to Python,
but the for i to x syntax does not exist in Python. I tried for i in range(x), but it does not seem to return the same result.
What exactly does to mean? The Pine Script documentation says it means from i to x, but I can't find the equivalent in Python.
pine_wma(x, y) =>
    norm = 0.0
    sum = 0.0
    for i = 0 to y - 1
        weight = (y - i) * y
        norm := norm + weight
        sum := sum + x[i] * weight
    sum / norm
plot(pine_wma(close, 15))
Python Code:
import pandas as pd
dataframe = pd.read_csv('dataframe.csv')
def formula_wma(x, y):
    list = []
    norm = 0.0
    sum = 0.0
    i = 0
    for i in range(y - 1):
        weight = (y - i) * y
        norm = norm + weight
        sum = sum + x[i] * weight
        _wma = sum / norm
        list.append(_wma)
        i += 1
    return list
wma_slow = formula_wma(dataframe['close'],45)
dataframe['wma_slow'] = pd.Series(wma_slow, index=dataframe.index[:len(wma_slow)])
print(dataframe['wma_slow'].to_string())
Output:
0 317.328133
[Skipping lines]
39 317.589010
40 317.449259
41 317.421662
42 317.378052
43 317.328133
44 NaN
45 NaN
[Skipping Lines]
2999 NaN
3000 NaN
First of all, don't reassign built-in names!
sum is a built-in function that calculates the sum of a sequence of numbers. list is a built-in name too; it is a class constructor.
For example:
sum(range(10)) returns 45.
The above is equivalent to:
numbers = (0,1,2,3,4,5,6,7,8,9)
s = 0
for i in numbers: s += i
Second, don't increment the loop variable inside the loop unless you have a good reason for it.
That i += 1 at the end of the loop has no effect whatsoever: the for loop automatically rebinds the name to the next item in the sequence, which here is one greater than the previous one, so i is incremented automatically.
Further, anything that uses i after that line within the same iteration will see the incremented value and may break.
Lastly, the reason you are not getting the same result is that Python uses zero-based indexing and range excludes the stop value.
I don't know Pine Script, but from what you have written, a loop from x to y must include y.
For example 0 to 10 in pine script will give you 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
But using range(10):
print(list(range(10)))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Why? Because there are exactly ten numbers in the range you specified.
In the first example, there are actually eleven numbers. If you know your math, the number of terms in an arithmetic sequence is the difference between the maximum term and the minimum term, divided by the increment, plus one.
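A small sketch of the difference, assuming (as described above) that Pine Script's to includes the upper bound. The helper pine_to is hypothetical and only mimics the ascending case.

```python
def pine_to(start, stop):
    """Hypothetical helper: indices visited by `for i = start to stop`,
    assuming the upper bound is included and start <= stop."""
    return list(range(start, stop + 1))

y = 5
inclusive = pine_to(0, y - 1)     # Pine-style: 0 to y - 1, both ends included
same_in_python = list(range(y))   # range(y) stops before y, so it ends at y - 1
one_short = list(range(y - 1))    # range(y - 1) ends at y - 2

print(inclusive)
print(same_in_python)
print(one_short)
```

Under that assumption, range(y) visits the same indices as 0 to y - 1, while range(y - 1) drops the last one.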
So how to solve your problem?
Remove - 1 after y in range!
Fixed code:
import pandas as pd
dataframe = pd.read_csv('dataframe.csv')
def formula_wma(x, y):
    lst = []
    norm = 0.0
    sum_ = 0.0
    i = 0
    for i in range(y):
        weight = (y - i) * y
        norm = norm + weight
        sum_ = sum_ + x[i] * weight
        _wma = sum_ / norm
        lst.append(_wma)
    return lst
wma_slow = formula_wma(dataframe['close'],45)
dataframe['wma_slow'] = pd.Series(wma_slow, index=dataframe.index[:len(wma_slow)])
print(dataframe['wma_slow'].to_string())
I'm using sympy to solve an equation in a for loop in which, at each iteration, a variable (kp) multiplies the function. But at each iteration the length of the output increases. I have an array k, and kp is selected from k.
k = [2,4,5,7,9]
for kp in k:
    didt = beta * kp * teta
    dteta = integrate(1/((kp-1) * pk * didt / avgk),teta)
    dt = integrate(1,(t,0 ,1))
    teta2 = solve(dteta - dt ,teta)
    #print(solve(dteta - dt ,teta))
    didt2 = beta * solve(dteta - dt ,teta) *kp
    print(didt2)
Also, the output for didt2 for 1st iteration is
[1.49182469764127, 1.49182469764127]
for the second one is
[11.0231763806416, 11.0231763806416, 11.0231763806416, 11.0231763806416]
for the third one is
[54.5981500331442, 54.5981500331442, 54.5981500331442, 54.5981500331442, 54.5981500331442]
I'm just wondering, why does the length of didt2 increase at each iteration?
It looks like solve() returns a list. This means that beta * solve(dteta - dt ,teta) *kp doesn't do what you think. Rather than multiplying the result, you are repeating the elements of the returned list. For a simple example, look at the output of:
[0] * 10
In your case, kp takes on the values 2, 4, and 5 on successive iterations of the loop, so the output you see is the result of doing
[1.49182469764127] * 2
[11.0231763806416] * 4
[54.5981500331442] * 5
These all result in lists with the exact length of the value of kp. It does not do numeric multiplication.
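A minimal sketch of the difference between list repetition and numeric multiplication. The list solutions stands in for the return value of solve(), and beta = 1 is an assumed integer value, since the question does not show how beta is defined.

```python
solutions = [1.49182469764127]   # stand-in for the list returned by solve()
beta = 1                          # assumed value, not given in the question
kp = 2

# list * int repeats the list; no arithmetic happens on the elements.
repeated = beta * solutions * kp

# Multiplying each solution numerically requires going through the elements.
scaled = [beta * s * kp for s in solutions]

print(repeated)
print(scaled)
```

Here repeated has kp elements, each equal to the original solution, while scaled keeps one element per solution and holds the actual product.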
I have two DFs which I would like to use to calculate the following:
w(ti,ti)*a(ti)^2 + w(tj,tj)*b(tj)^2 + 2*w(ti,tj)*a(ti)*b(tj)
The above uses two terms (a,b).
w is the weight df where i and j are index and column spaces pertaining to the Tn index of a and b.
Set Up - Edit dynamic W
import pandas as pd
import numpy as np
I = ['i'+ str(i) for i in range(4)]
Q = ['q' + str(i) for i in range(5)]
T = ['t' + str(i) for i in range(3)]
n = 100
df1 = pd.DataFrame({'I': [I[np.random.randint(len(I))] for i in range(n)],
'Q': [Q[np.random.randint(len(Q))] for i in range(n)],
'Tn': [T[np.random.randint(len(T))] for i in range(n)],
'V': np.random.rand(n)}).groupby(['I','Q','Tn']).sum()
df1.head(5)
                 V
I  Q  Tn
i0 q0 t0  1.626799
      t2  1.725374
   q1 t0  2.155340
      t1  0.479741
      t2  1.039178
w = np.random.randn(len(T),len(T))
w = (w*w.T)/2
np.fill_diagonal(w,1)
W = pd.DataFrame(w, columns = T, index = T)
W
t0 t1 t2
t0 1.000000 0.029174 -0.045754
t1 0.029174 1.000000 0.233330
t2 -0.045754 0.233330 1.000000
Effectively I would like to use the index Tn in df1 to use the above equation for every I and Q.
The end result for df1.loc['i0','q0'] in the example above should be:
W(t0,t0) * V(t0)^2
+ W(t2,t2) * V(t2)^2
+ 2 * W(t0,t2) * V(t0) * V(t2)
=
1.0 * 1.626799**2
+ 1.0 * 1.725374**2
+ 2 * (-0.045754) * 1.626799 * 1.725374
The end result for df1.loc['i0','q1'] in the example above should be:
W(t0,t0) * V(t0)^2
+ W(t1,t1) * V(t1)^2
+ W(t2,t2) * V(t2)^2
+ 2 * W(t0,t1) * V(t0) * V(t1)
+ 2 * W(t0,t2) * V(t0) * V(t2)
+ 2 * W(t2,t1) * V(t1) * V(t2)
=
1.0 * 2.155340**2
+ 1.0 * 0.479741**2
+ 1.0 * 1.039178**2
+ 2 * 0.029174 * 2.155340 * 0.479741
+ 2 * (-0.045754) * 2.155340 * 1.039178
+ 2 * 0.233330 * 0.479741 * 1.039178
This pattern will repeat depending on the number of tn terms in each Q hence it should be robust enough to handle as many Tn terms as needed (in the example I use 3, but it could be as much as 100 or more).
Each result should then be saved in a new DF with Index = [I, Q]
The solution should also not be slower than excel when n increases in value.
Thanks in advance
One way is to first reindex your dataframe df1 with all the possible combinations of the lists I, Q and T using pd.MultiIndex.from_product, filling the missing values in the column 'V' with 0. The column then has len(I)*len(Q)*len(T) elements. You can then reshape the values so that each row corresponds to one combination of I and Q:
ar = (df1.reindex(pd.MultiIndex.from_product([I,Q,T], names=['I','Q','Tn']),fill_value=0)
.values.reshape(-1,len(T)))
To see the relation between my input df1 and ar, here are some related rows
print (df1.head(6))
                 V
I  Q  Tn
i0 q0 t1  1.123666
   q1 t0  0.538610
      t1  2.943206
   q2 t0  0.570990
      t1  0.617524
      t2  1.413926
print (ar[:3])
[[0. 1.1236656 0. ]
[0.53861027 2.94320574 0. ]
[0.57099049 0.61752408 1.4139263 ]]
Now, to perform the multiplication with the element of W, one way is to create the outer product of ar with itself but row-wise to get, for each row a len(T)*len(T) matrix. For example, for the second row:
[0.53861027 2.94320574 0. ]
becomes
[[0.29010102, 1.58524083, 0. ], #0.29010102 = 0.53861027**2, 1.58524083 = 0.53861027*2.94320574 ...
[1.58524083, 8.66246003, 0. ],
[0. , 0. , 0. ]]
Several methods are possible, such as ar[:,:,None]*ar[:,None,:] or np.einsum with the right subscripts: np.einsum('ij,ik->ijk',ar,ar). Both give the same result.
The next step can be done with np.tensordot, specifying the right axes. So with ar and W as inputs, you do:
print (np.tensordot(np.einsum('ij,ik->ijk',ar,ar),W.values,axes=([1,2],[0,1])))
array([ 1.26262437, 15.29352438, 15.94605435, ...
To check for the second value here, 1*0.29010102 + 1*8.66246003 + 2.*2*1.58524083 == 15.29352438 (where 1 is W(t0,t0) and W(t1,t1), 2 is W(t0,t1))
Finally, to create the dataframe as expected, use again pd.MultiIndex.from_product:
new_df = pd.DataFrame({'col1': np.tensordot(np.einsum('ij,ik->ijk',ar,ar),
W.values,axes=([1,2],[0,1]))},
index=pd.MultiIndex.from_product([I,Q], names=['I','Q']))
print (new_df.head(3))
            col1
I  Q
i0 q0   1.262624
   q1  15.293524
   q2  15.946054
...
Note: if you are SURE that each element of T appears at least once in the last level of df1, ar can be obtained using unstack, e.g. ar = df1.unstack(fill_value=0).values. But I would suggest using the reindex method above to prevent any error.
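For a single (I, Q) row, the outer product followed by the contraction with W amounts to a double sum over all pairs of T entries. The pure-Python sketch below, using the W and the ('i0', 'q1') values from the question, checks that this double sum equals the written-out expansion with each off-diagonal pair counted twice. It assumes W is symmetric; the function name quadratic_form is illustrative.

```python
# W from the question, rows and columns ordered as t0, t1, t2.
W = [[1.000000, 0.029174, -0.045754],
     [0.029174, 1.000000, 0.233330],
     [-0.045754, 0.233330, 1.000000]]

def quadratic_form(v, w):
    """Sum of w[j][k] * v[j] * v[k] over all index pairs (j, k)."""
    size = len(v)
    return sum(w[j][k] * v[j] * v[k]
               for j in range(size) for k in range(size))

# V values for ('i0', 'q1') from the question, ordered as t0, t1, t2.
# A missing Tn entry would simply be a 0 in this vector.
v = [2.155340, 0.479741, 1.039178]

by_double_sum = quadratic_form(v, W)

# The same value written out: squares on the diagonal,
# and each off-diagonal pair counted twice because W is symmetric.
by_expansion = (W[0][0] * v[0] ** 2 + W[1][1] * v[1] ** 2 + W[2][2] * v[2] ** 2
                + 2 * W[0][1] * v[0] * v[1]
                + 2 * W[0][2] * v[0] * v[2]
                + 2 * W[1][2] * v[1] * v[2])

print(by_double_sum, by_expansion)
```

With NumPy arrays, the same per-row value can be written as v @ W @ v.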
I want to find out how many ways there are to make 500 using only 1, 2, 5, 10, 20, 50, 100, and 200.
I understand that there exist greedy algorithms etc that can solve this type of question, but I want to be able to do it the following way:
The number of integer partitions of a given number, n, using only numbers from some set T, can be obtained from the coefficient of the x^n term in the product of all (1 - x^t)^(-1), where t is in T.
To do this, note that the Taylor expansion of (1 - x^t)^(-1) equals (1 + x^t + x^(2t) + ...).
Here is the code I've written so far:
#################################
# First, define a function that returns the product of two
# polynomials. A list here
# represents a polynomial with the entry in a list corresponding to
# the coefficient of the power of x associated to that position, e.g.
# [1,2,3] = 1 + 2x + 3x^2.
#################################
def p(a,v):
    """(list,list) -> list"""
    prodav = [0]*(len(a)+len(v)-1)
    for n in range(len(a)):
        for m in range(len(v)):
            prodav[n+m] += v[m]*a[n]
    return prodav
#################################
# Now, let a,b,c,...,h represent the first 501 terms in the Taylor
# expansion of 1/(1-x^n), where n gives the coin value, i.e
# 1,2,5,10,20,50,100,200 in pence. See the generating function
# section in https://en.wikipedia.org/wiki/Partition_(number_theory).
# Our function, p, multiplies all these polynomials together
# (when applied iteratively). As per the Wiki entry, the coefficient
# of x^t is equal to the number of possible ways t can be made,
# using only the denominations of change available, a so called
# 'restricted integer partition'. Note that it isn't necessary to
# consider the entire Taylor expansion since we only need the
# 500th power.
#################################
a = ( [1] ) * 501 # 1
b = ( [1] + [0] ) * 250 + [1] # 2
c = ( [1] + [0]*4 ) * 100 + [1] # 5
d = ( [1] + [0]*9 ) * 50 + [1] # 10
e = ( [1] + [0]*19 ) * 25 + [1] # 20
f = ( [1] + [0]*49 ) * 10 + [1] # 50
g = ( [1] + [0]*99 ) * 5 + [1] # 100
h = ( [1] + [0]*199 ) * 2 + [0]*101 # 200
x = p(h,p(g,p(f,p(e,p(d,p(c,p(b,a)))))))
print(x[500]) # = 6290871
My problem is that I'm not confident the answer this gives is correct. I've compared it to two other greedy algorithms whose outputs agree with each other, but not mine. Can anyone see where I might have gone wrong?
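One thing that may be worth checking is the list h: the series for 1/(1-x^200) has a 1 at x^400, but ( [1] + [0]*199 ) * 2 + [0]*101 puts a 0 there. The sketch below builds each truncated series programmatically instead of by hand and cross-checks the result against a standard dynamic-programming count (a different technique, used here only as an independent check). All function names are illustrative, not taken from the question.

```python
def truncated_series(coin, limit):
    """Coefficients of 1/(1 - x^coin) up to and including x^limit."""
    return [1 if k % coin == 0 else 0 for k in range(limit + 1)]

def multiply_truncated(a, v, limit):
    """Product of two coefficient lists, dropping powers above x^limit."""
    prod = [0] * (limit + 1)
    for n, an in enumerate(a):
        if an == 0:
            continue
        for m, vm in enumerate(v):
            if n + m > limit:
                break
            prod[n + m] += an * vm
    return prod

def count_by_series(target, coins):
    """Coefficient of x^target in the product of all 1/(1 - x^coin)."""
    result = [1] + [0] * target
    for coin in coins:
        result = multiply_truncated(result, truncated_series(coin, target), target)
    return result[target]

def count_by_dp(target, coins):
    """Independent check: classic coin-change counting."""
    ways = [1] + [0] * target
    for coin in coins:
        for amount in range(coin, target + 1):
            ways[amount] += ways[amount - coin]
    return ways[target]

COINS = [1, 2, 5, 10, 20, 50, 100, 200]

# The hand-built list from the question versus the generated one, at x^400.
h_original = ([1] + [0] * 199) * 2 + [0] * 101
print(h_original[400], truncated_series(200, 500)[400])

print(count_by_series(500, COINS), count_by_dp(500, COINS))
```

If the two counts agree with each other but not with 6290871, the missing x^400 term in h is a likely cause of the discrepancy.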