Algorithm about sum of powers - python

I'm working on the question shown below, part 2. However when I implement it in python, it fails with "RecursionError: maximum recursion depth exceeded".
Here's my algorithm:
import math
def sumofpowers2(x):
    count = 1
    if math.isclose(x ** count,0,rel_tol=0.001):
        return 0
    count += 1
    return 1 + x * sumofpowers2(x)
print(sumofpowers2(0.8))

In a nutshell, sumofpowers2(x) calls itself with the same argument, resulting in infinite recursion (unless the if condition is true right from the start, it will never be true).
Every time sumofpowers2() calls itself, a new variable called count gets created and set to 1. To make this code work, you need to figure out a way to carry the value of count across calls.
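A side note on the if condition itself: as far as I understand math.isclose, a relative tolerance is scaled by the larger of the two magnitudes, so a comparison against exactly 0 can only succeed when the other value is exactly 0 as well. For "close to zero" checks an absolute tolerance is the usual choice. A minimal check (the name tiny is just for illustration):

```python
import math

tiny = 1e-9  # hypothetical small value, clearly "close to zero"

# rel_tol is multiplied by max(|a|, |b|), so the allowed gap shrinks with tiny itself
rel_match = math.isclose(tiny, 0, rel_tol=0.001)

# abs_tol is a fixed window around the target value
abs_match = math.isclose(tiny, 0, abs_tol=0.001)

print(rel_match, abs_match)
```

So even once count is carried across calls, the test probably needs abs_tol rather than rel_tol to ever stop the recursion.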

First, please learn basic debugging: add a simple print to track your values just before you depend on them:
def sumofpowers2(x):
    count = 1
    print(x, count, x**count)
    if math.isclose(x ** count,0,rel_tol=0.001):
        ...
Output:
(0.8, 1, 0.8)
(0.8, 1, 0.8)
(0.8, 1, 0.8)
...
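This points up the critical problem: you reset count to 1 every time you enter the routine. The simple fix is to hoist the initialization out of the function and update a global counter:
count = 1
def sumofpowers2(x):
    global count
    print(x, count, x**count)
    # abs_tol rather than rel_tol: a relative tolerance never matches against 0
    if math.isclose(x ** count, 0, abs_tol=0.001):
        return 0
    count += 1
    return 1 + x * sumofpowers2(x)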
Output:
0.8 1 0.8
0.8 2 0.6400000000000001
0.8 3 0.5120000000000001
0.8 4 0.4096000000000001
0.8 5 0.3276800000000001
0.8 6 0.2621440000000001
0.8 7 0.20971520000000007
0.8 8 0.1677721600000001
0.8 9 0.13421772800000006
0.8 10 0.10737418240000006
0.8 11 0.08589934592000005
0.8 12 0.06871947673600004
0.8 13 0.054975581388800036
0.8 14 0.043980465111040035
0.8 15 0.03518437208883203
0.8 16 0.028147497671065624
0.8 17 0.022517998136852502
0.8 18 0.018014398509482003
0.8 19 0.014411518807585602
0.8 20 0.011529215046068483
0.8 21 0.009223372036854787
0.8 22 0.00737869762948383
0.8 23 0.005902958103587064
0.8 24 0.004722366482869652
0.8 25 0.0037778931862957215
0.8 26 0.0030223145490365774
0.8 27 0.002417851639229262
0.8 28 0.0019342813113834097
0.8 29 0.0015474250491067279
0.8 30 0.0012379400392853823
0.8 31 0.0009903520314283058
4.993810299803575
Better yet, make count an added parameter to your function:
def sumofpowers2(x, count):
    print(x, count, x**count)
    # abs_tol rather than rel_tol: a relative tolerance never matches against 0
    if math.isclose(x ** count, 0, abs_tol=0.001):
        return 0
    return 1 + x * sumofpowers2(x, count+1)
Note that your cascaded arithmetic may not give the value you expect.
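For reference, here is a minimal sketch of the parameter-passing idea without the debug print. The name sum_of_powers, the default count=1 and the tol parameter are my own choices, not part of the original question, and I'm assuming the goal is the series 1 + x + x**2 + ... for |x| < 1:

```python
import math

def sum_of_powers(x, count=1, tol=0.001):
    """Recursively add up 1 + x + x**2 + ... until x**count is within tol of 0."""
    # abs_tol is used because a relative tolerance cannot match against 0
    if math.isclose(x ** count, 0, abs_tol=tol):
        return 0
    # count travels down the call chain as an argument instead of a global
    return 1 + x * sum_of_powers(x, count + 1, tol)

result = sum_of_powers(0.8)
print(result)  # a little under the limit 1 / (1 - 0.8) = 5
```

For |x| >= 1 the powers never approach zero, so this sketch would still hit the recursion limit.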

Related

Bin values into groups

The relevant data in my dataframe looks as follows:
Datapoint  Values
1          0.2
2          0.8
3          0.4
4          0.1
5          1.0
6          0.6
7          0.7
8          0.2
9          0.5
10         0.1
I am hoping to group the numbers in the Values column into three categories: less than 0.25 as 'low', between 0.25 and 0.75 as 'middle', and greater than 0.75 as 'high'.
I want to create a new column which returns 'low', 'middle' or 'high' for each row based off the data in the value column.
What I have tried:
def categorize_values("Values"):
    if "Values" > 0.75:
        return 'high'
    elif 'Values' < 0.25:
        return 'low'
    else:
        return 'middle'
However this is returning an error for me.
If you're using a DataFrame, pandas has a built-in function for this, pd.cut():
import pandas as pd
from io import StringIO
df = pd.read_csv(StringIO('''Datapoint Values
1 0.2
2 0.8
3 0.4
4 0.1
5 1.0
6 0.6
7 0.7
8 0.2
9 0.5
10 0.1'''), sep=r'\s+')  # the sample data is whitespace-separated
# bins are right-closed: (0, 0.25] -> low, (0.25, 0.75] -> middle, (0.75, max] -> high
df['category'] = pd.cut(df['Values'], [0, 0.25, 0.75, df['Values'].max()], labels=['low', 'middle', 'high'])
#output
>>> df
   Datapoint  Values category
0          1     0.2      low
1          2     0.8     high
2          3     0.4   middle
3          4     0.1      low
4          5     1.0     high
5          6     0.6   middle
6          7     0.7   middle
7          8     0.2      low
8          9     0.5   middle
9         10     0.1      low
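One thing to watch with explicit bin edges: pd.cut intervals are right-closed by default, so with a lowest edge of 0 a value of exactly 0.0 (or anything negative) would fall outside every bin and come back as NaN. A sketch that sidesteps this with open-ended edges; the values below are made-up edge cases, not the asker's data:

```python
import pandas as pd

values = pd.Series([0.0, 0.25, 0.5, 0.75, 1.5])  # made-up edge cases

# open-ended outer edges: (-inf, 0.25], (0.25, 0.75], (0.75, inf)
bins = [float('-inf'), 0.25, 0.75, float('inf')]
category = pd.cut(values, bins, labels=['low', 'middle', 'high'])

print(list(category))  # ['low', 'low', 'middle', 'middle', 'high']
```

Note that exactly 0.25 lands in 'low' here, whereas a hand-written `< 0.25` test would call it 'middle', so the boundary behaviour is worth deciding deliberately.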
First of all, a function parameter has to be a name, not a string literal such as "Values".
You need to fix your function first, like this:
def categorize_values(Values):
    if Values > 0.75:
        return 'high'
    elif Values < 0.25:
        return 'low'
    else:
        return 'middle'
and then you can apply that function to your 'Values' column as below.
df['Category'] = df['Values'].apply(categorize_values)
df.head()
With my own sample values, it generates this DataFrame:
           Values Category
DataPoint
1            0.22      low
2            0.32   middle
3            0.55   middle
4            0.75   middle
5            0.12      low
You should remove the quotes around Values.
That would look like this:
def categorize_values(Values):
    if Values > 0.75:
        return 'high'
    elif Values < 0.25:
        return 'low'
    else:
        return 'middle'

How to create modified dataframe based on list values?

Consider a dataframe df of the following structure:-
Name  Slide  Height  Weight  Status  General
A     X      3       0.1     0.5     0.2
B     Y      10      0.2     0.7     0.8
...
I would like to create duplicates for each row in this dataframe (specific to the Name and Slide) for the following combinations of Height and Weight shown by this list:-
list_combinations = [[3,0.1],[10,0.2],[5,1.3]]
The desired output:-
Name  Slide  Height  Weight  Status  General
A     X      3       0.1     0.5     0.2      #original
A     X      10      0.2     0.5     0.2      # modified duplicate
A     X      5       1.3     0.5     0.2      # modified duplicate
B     Y      10      0.2     0.7     0.8      #original
B     Y      3       0.1     0.7     0.8      # modified duplicate
B     Y      5       1.3     0.7     0.8      # modified duplicate
etc. ...
Any suggestions and help would be much appreciated.
We can do a cross merge:
out = pd.DataFrame(list_combinations,columns = ['Height','Weight']).\
    merge(df,how='cross',suffixes = ('','_')).\
    reindex(columns=df.columns).sort_values('Name')
  Name Slide  Height  Weight  Status  General
0    A     X       3     0.1     0.5      0.2
2    A     X      10     0.2     0.5      0.2
4    A     X       5     1.3     0.5      0.2
1    B     Y       3     0.1     0.7      0.8
3    B     Y      10     0.2     0.7      0.8
5    B     Y       5     1.3     0.7      0.8
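A self-contained variant of the same idea, in case it helps to experiment. This is a sketch rather than the answer's exact code: it drops the original Height/Weight columns before merging so no suffixes are needed, and it assumes pandas 1.2 or newer, where how='cross' is available. The two-row df is rebuilt from the question's example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B'], 'Slide': ['X', 'Y'],
    'Height': [3, 10], 'Weight': [0.1, 0.2],
    'Status': [0.5, 0.7], 'General': [0.2, 0.8],
})
list_combinations = [[3, 0.1], [10, 0.2], [5, 1.3]]

combos = pd.DataFrame(list_combinations, columns=['Height', 'Weight'])

# Pair every remaining row with every (Height, Weight) combination,
# then restore the original column order.
out = (
    df.drop(columns=['Height', 'Weight'])
      .merge(combos, how='cross')
      .reindex(columns=df.columns)
)
print(out)
```

Because df is on the left of the merge, the rows come out grouped by Name without sorting. Every row receives all combinations in the list, so a row whose original Height/Weight pair is not in list_combinations would lose that pair.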

How to use increments in Python

def number():
    b = 0.1
    while True:
        yield b
        b = b + 0.1
b = number()
for i in range(10):
    print(next(b))
Outputs
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0.9999999999999999
Then, I just want
c=b*2
print("c="=)
My expected outputs are
c=0.2
0.4
0.6
0.8
1
1.2
And so on.
Could you tell me what I have to do to get my expected outputs?
Floating point numbers are not precise. The more you handle them, the more error they can accumulate. To have numbers you want, the best way is to keep things integral for as long as possible:
def number():
    b = 1
    while True:
        yield b / 10.0
        b += 1
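Applied to the doubled values from the question, the same integer-first idea might look like the sketch below. The function name tenths and its scale parameter are hypothetical; the point is only that the multiplication happens on integers and there is a single division at the end.

```python
from itertools import count, islice

def tenths(scale=1):
    """Yield scale * 0.1, scale * 0.2, ... computed from integer counters."""
    for n in count(1):
        # integer multiply first, one float division last: no accumulated error
        yield scale * n / 10

doubled = list(islice(tenths(scale=2), 6))
for c in doubled:
    print("c=" + str(c))
```

This prints c=0.2 through c=1.2 in steps of 0.2; the fifth value shows as 1.0 rather than 1 because it is a float.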
You can pass the number as an argument:
def number(start=0.1,num=0.1):
    b = start
    while True:
        yield round(b,1)
        b += num
b = number(0,0.2)
It yields:
0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
1.8
Like this?
for i in range(10):
    AnotherB=next(b)
    c=AnotherB*2
    print(AnotherB)
    print("c="+str(c))
Or do you mean how to reset a generator?
Just create it again.
def number():
    b = 0.1
    while True:
        yield round(b,1)
        b = b + 0.1
b = number()
for i in range(10):
    print(next(b))
b=number()
for i in range(10):
    print("c="+str(next(b)*2))

Pandas Timeseries Data - Calculating product over intervals of varying length

I have some timeseries data that basically contains information on price change period by period. For example, let's say:
df = pd.DataFrame(columns = ['TimeStamp','PercPriceChange'])
df.loc[:,'TimeStamp']=[1457280,1457281,1457282,1457283,1457284,1457285,1457286]
df.loc[:,'PercPriceChange']=[0.1,0.2,-0.1,0.1,0.2,0.1,-0.1]
so that df looks like
TimeStamp PercPriceChange
0 1457280 0.1
1 1457281 0.2
2 1457282 -0.1
3 1457283 0.1
4 1457284 0.2
5 1457285 0.1
6 1457286 -0.1
What I want to achieve is to calculate the overall price change before an increase/decrease streak ends, and store the value in the row where the streak started. That is, what I want is a column 'TotalPriceChange':
   TimeStamp  PercPriceChange  TotalPriceChange
0    1457280              0.1  1.1 * 1.2 - 1 = 0.32
1    1457281              0.2  0
2    1457282             -0.1  -0.1
3    1457283              0.1  1.1 * 1.2 * 1.1 - 1 = 0.452
4    1457284              0.2  0
5    1457285              0.1  0
6    1457286             -0.1  -0.1
I can identify the starting points using something like:
df['turn'] = 0
df['PriceChange_L1'] = df['PercPriceChange'].shift(periods=1, freq=None, axis=0)
df.loc[ df['PercPriceChange'] * df['PriceChange_L1'] < 0, 'turn' ] = 1
to get
TimeStamp PercPriceChange turn
0 1457280 0.1 NaN or 1?
1 1457281 0.2 0
2 1457282 -0.1 1
3 1457283 0.1 1
4 1457284 0.2 0
5 1457285 0.1 0
6 1457286 -0.1 1
Given this column "turn", I need help proceeding with my quest (or perhaps we don't need this "turn" at all). I am pretty sure I can write a nested for-loop going through the entire DataFrame row by row, calculating what I need and populating the column 'TotalPriceChange', but given that I plan on doing this on a fairly large data set (think minute or hour data for couple of years), I imagine nested for-loops will be really slow.
Therefore, I just wanted to check with you experts to see if there is any efficient solution to my problem that I am not aware of. Any help would be much appreciated!
Thanks!
The calculation you are looking for looks like a groupby/product operation.
To set up the groupby operation, we need to assign a group value to each row. Taking the cumulative sum of the turn column gives the desired result:
df['group'] = df['turn'].cumsum()
# 0 0
# 1 0
# 2 1
# 3 2
# 4 2
# 5 2
# 6 3
# Name: group, dtype: int64
Now we can define the TotalPriceChange column (modulo a little cleanup work) as
df['PercPriceChange_plus_one'] = df['PercPriceChange']+1
df['TotalPriceChange'] = df.groupby('group')['PercPriceChange_plus_one'].transform('prod') - 1
The complete script, including the cleanup step that zeroes out every row except the first one of each streak:
import pandas as pd
df = pd.DataFrame({'PercPriceChange': [0.1, 0.2, -0.1, 0.1, 0.2, 0.1, -0.1],
                   'TimeStamp': [1457280, 1457281, 1457282, 1457283, 1457284, 1457285, 1457286]})
# mark rows where the sign of the change flips relative to the previous row
df['turn'] = 0
df['PriceChange_L1'] = df['PercPriceChange'].shift(periods=1, freq=None, axis=0)
df.loc[ df['PercPriceChange'] * df['PriceChange_L1'] < 0, 'turn' ] = 1
# consecutive rows with the same sign share a group number
df['group'] = df['turn'].cumsum()
df['PercPriceChange_plus_one'] = df['PercPriceChange']+1
df['TotalPriceChange'] = df.groupby('group')['PercPriceChange_plus_one'].transform('prod') - 1
# keep the total only on the first row of each group
mask = (df['group'].diff() != 0)
df.loc[~mask, 'TotalPriceChange'] = 0
df = df[['TimeStamp', 'PercPriceChange', 'TotalPriceChange']]
print(df)
yields
TimeStamp PercPriceChange TotalPriceChange
0 1457280 0.1 0.320
1 1457281 0.2 0.000
2 1457282 -0.1 -0.100
3 1457283 0.1 0.452
4 1457284 0.2 0.000
5 1457285 0.1 0.000
6 1457286 -0.1 -0.100
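A more compact variant of the same groupby/product idea, offered as a sketch: it detects streaks by comparing the sign of each change with the previous row's sign, which avoids the helper columns. The names sign, starts, streak_id and growth are my own, and I'm assuming a change of exactly 0 should count as its own streak, which the question does not specify.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'TimeStamp': [1457280, 1457281, 1457282, 1457283, 1457284, 1457285, 1457286],
    'PercPriceChange': [0.1, 0.2, -0.1, 0.1, 0.2, 0.1, -0.1],
})

sign = np.sign(df['PercPriceChange'])
starts = sign != sign.shift()      # True on the first row of every streak
streak_id = starts.cumsum()        # same id for all rows within one streak

# compound the changes within each streak, then keep the total on the first row only
growth = (1 + df['PercPriceChange']).groupby(streak_id).transform('prod') - 1
df['TotalPriceChange'] = growth.where(starts, 0.0)
print(df)
```

On the sample data this should reproduce the 0.32 / -0.1 / 0.452 / -0.1 totals shown above, up to floating-point rounding.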

Algorithm to detect left or right turn from x,y co-ordinates

I have a data set of x,y co-ordinates, starting from the origin, recorded each second. I can compute distance, speed, acceleration and modulus of displacement. Is there any algorithm to detect whether a turn is to the left or to the right?
I am currently calculating distance and modulus of displacement for every 10 seconds; if the displacement is approximately equal to the distance, then the vehicle is on a straight path, but if the values differ then there is a turn involved.
Is there an algorithm to decide whether the turn was left or right? My data looks like this:
Time x y
0 0 0
1 -0.2 -0.1
2 -0.7 0.9
3 -0.8 0.9
4 -1 0.8
5 -1.1 0.8
6 -1.2 0.7
7 -1.4 0.7
8 -1.9 1.7
9 -2 1.7
10 -2.2 1.6
11 -2.3 1.6
12 -2.5 1.5
13 -2.6 1.5
14 -2.7 1.5
15 -2.9 1.4
16 -3.6 1.2
17 -4.1 -0.1
18 -4.7 -1.5
19 -4.7 -2.6
20 -4.3 -3.7
21 -4.3 -3.7
22 -4.7 -3.8
23 -6.2 -3.1
24 -9.9 -1.9
25 -13.7 -1.9
26 -17.9 -2
27 -21.8 -0.8
28 -25.1 -0.6
29 -28.6 1.8
Looking at 3 points p0, p1 and p2, you can look at the relative orientation of the two vectors p1 - p0 and p2 - p1. An easy way to do this is to calculate the cross product between the two vectors. The x- and y-components of the cross product are 0 because both vectors are in the xy-plane. So only the z-component of the cross product needs to be calculated.
If the z-component of the cross product is positive, you know that the second vector points left relative to the first one, because the first vector, second vector, and a vector in the positive z-direction are right handed. If the cross product is negative, the second vector points to the right relative to the first one.
I used my mad Python skills (I use Python about once a year...) to put this into the code below. There's a little logic so that the Left/Right designation can be printed at the middle point, even though it can only be calculated after the next point was read. To enable that, a couple of previous lines are saved away, with their printing delayed. The actual calculation is in the calcDir() function.
import sys
fileName = sys.argv[1]
dataFile = open(fileName, 'r')
def calcDir(p0, p1, p2):
    v1x = float(p1[0]) - float(p0[0])
    v1y = float(p1[1]) - float(p0[1])
    v2x = float(p2[0]) - float(p1[0])
    v2y = float(p2[1]) - float(p1[1])
    # z-component of the cross product (p1 - p0) x (p2 - p1)
    if v1x * v2y - v1y * v2x > 0.0:
        return 'Left'
    else:
        return 'Right'
lineIdx = 0
for line in dataFile:
    line = line.rstrip()
    lineIdx += 1
    if lineIdx == 1:
        print(line)    # header row
    elif lineIdx == 2:
        line0 = line
        print(line0)   # first point: no direction can be computed yet
    elif lineIdx == 3:
        line1 = line
    else:
        line2 = line
        direction = calcDir(line0.split()[1:], line1.split()[1:], line2.split()[1:])
        print(line1 + ' ' + direction)
        line0 = line1
        line1 = line2
print(line2)           # last point: no direction can be computed either
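The cross-product test described above can also be tried without a data file. The following is a small sketch on in-memory points; the function name turn_direction, the eps threshold and the extra 'straight' case are my additions, and the sample path is made up.

```python
def turn_direction(p0, p1, p2, eps=1e-9):
    """Classify the turn at p1 when travelling p0 -> p1 -> p2."""
    v1x, v1y = p1[0] - p0[0], p1[1] - p0[1]
    v2x, v2y = p2[0] - p1[0], p2[1] - p1[1]
    cross_z = v1x * v2y - v1y * v2x   # z-component of v1 x v2
    if cross_z > eps:
        return 'left'
    if cross_z < -eps:
        return 'right'
    return 'straight'                 # collinear within eps, or no movement

path = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1)]  # made-up sample path
for p0, p1, p2 in zip(path, path[1:], path[2:]):
    print(p1, turn_direction(p0, p1, p2))
```

With noisy per-second GPS-style data, a larger eps or some smoothing over several points would probably be needed before the labels are meaningful.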
Yes: you want to calculate the dot product of the previous direction and the new direction.
If you start by normalising the two vectors (giving each one a length of 1) then the dot product will be the cosine of the angle between the two vectors, and this tells you by how much you've turned. The cosine is the same for a left turn and a right turn of equal size, though, so to tell the two apart you also need the sign of the cross product.
You might also find the further explanation here to be handy.
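One way to get both the size and the side of the turn in a single number is to feed the cross product and the dot product into atan2. This is a sketch of that combination, not something taken from either answer; signed_turn_angle is a hypothetical name, and the convention assumed is that positive angles mean a left (counter-clockwise) turn.

```python
import math

def signed_turn_angle(v1, v2):
    """Angle in radians from direction v1 to direction v2; positive means left."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross_z = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 only depends on the ratio, so the vectors need not be normalised
    return math.atan2(cross_z, dot)

left_quarter = signed_turn_angle((1, 0), (0, 1))     # heading east, then north
right_eighth = signed_turn_angle((1, 0), (1, -1))    # heading east, then south-east
print(math.degrees(left_quarter), math.degrees(right_eighth))
```

The first value comes out at about +90 degrees and the second at about -45 degrees.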
