Calculating the average grade in Python? - python

I am trying to get the average of two passing grades. The output should be:
0.0 if neither of the grades is a passing grade (both < 50)
The passing grade, if only one of the grades is a passing grade (if one is > 50)
The average of the two grades, if both are passing grades (if both are greater than 50)
Here is my code so far:
def passing_grade(grade1,grade2):
    '''(number, number)--> number
    This function definition prints the average of all passing grade(s)
    '''
    # Function 1 - If both numbers are outside the grading range (0-100)
    if 0.0 < grade1 > 100.0 and 0 < grade2 > 100.0:
        print ('Not available grading')
    elif 0.0 >= grade1 <= 50.0 and 0.0 >= grade2 <= 50.0:
        print (0.0)
    #Function 2 - If one of the grades is passing then, print passing grade
    elif 0.0 >= grade1 <= 50.0 and 0.0 >= grade2 >= 50.0:
        print (grade2)
    elif 0.0 >= grade1 >= 50.0 and 0.0 >= grade2 <= 50.0:
        print (grade1)
    #Function 3 - If both grades are passing >50 then print the average
    elif 50.0 > grade1 <= 100.0 and 50.0> grade2 <= 100.0:
        print ((grade1+grade2)/2)

I'm just guessing about your problem here, since you haven't specified, but it looks like you have bad logic in the second part of "Function #2":
elif 0.0 >= grade1 <= 50.0 and 0.0 >= grade2 >= 50.0:
    print (grade2)
elif 0.0 >= grade1 >= 50.0 and 0.0 >= grade2 <= 50.0:
    print (grade1)
Should be:
elif grade1 <= 50.0 and grade2 >= 50.0:
    print (grade2)
elif grade1 >= 50.0 and grade2 <= 50.0:
    print (grade1)
If you look at your original conditions, you keep checking 0.0 >= gradeN, which is only true if the grade is zero or negative. There are similar problems in some of your other sections.

Your comparisons are screwed up. They don't say what you mean, and many always evaluate to False. There can be no grade1 such that 0.0 >= grade1 >= 50.0, as there are no nonpositive numbers greater than or equal to 50. I suggest you write out your multiple comparisons "the long way" until you're clear about what you mean to say, rather than using this keystroke-saving feature of Python. In Python, a < b < c means a < b and b < c, not a < b or b < c, which is the form of what you want to say in your first 'if' statement.
Finally, when writing multiple comparisons in one expression, don't mix directions of the comparisons; it's needlessly confusing (for you, to start with).
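To make the point about chained comparisons concrete, here is a small sketch. The helper names is_valid and is_passing and the sample grades are made up for illustration, and 50 is assumed to count as passing.

```python
def is_valid(grade):
    # Chained form: equivalent to (0 <= grade) and (grade <= 100)
    return 0 <= grade <= 100

def is_passing(grade):
    # The same kind of test written "the long way"
    return grade >= 50 and grade <= 100

samples = (-10, 0, 49, 50, 75, 100, 120)

# 0.0 >= g >= 50.0 means (0.0 >= g) and (g >= 50.0): no number satisfies both
never_true = [g for g in samples if 0.0 >= g >= 50.0]

# 0.0 < g > 100.0 means (0.0 < g) and (g > 100.0): it only catches g > 100
caught = [g for g in samples if 0.0 < g > 100.0]

print(never_true)  # []
print(caught)      # [120]
```

Note that the second test misses negative grades entirely, which is why an out-of-range check needs an or between the two sides.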
A more concise way of writing (the calculation part of) your function:
def avg_passing_grade(grade1, grade2):
    passing_grades = [g for g in (grade1, grade2) if 50 <= g <= 100]
    return sum(passing_grades)/max(1, len(passing_grades))
This makes a list passing_grades containing only the grades supplied to the function that are passing. The function returns their average, taking care not to divide by 0 in case no grades are passing.
Although the following may be overkill, it's within such easy reach that I have to mention it: the function above generalizes easily to one that takes an arbitrary number of grades:
def average_passing_grade(*grades):
    '''Return the average of the passing grades among grades.'''
    passing_grades = [g for g in grades if 50 <= g <= 100]
    return sum(passing_grades)/max(1, len(passing_grades))
which you can use like this:
>>> average_passing_grade()
0.0
>>> average_passing_grade(35.3)
0.0
>>> average_passing_grade(75.5)
75.5
>>> average_passing_grade(35.3, 88)
88.0
>>> average_passing_grade(88, 20)
88.0
>>> average_passing_grade(50, 100)
75.0
>>> average_passing_grade(40, 50, 60, 70, 80, 90)
70.0

Besides the bad logic pointed out in the other answers, you could use max and min to do a single logical check for certain cases.
if grade1 >= 50.0 and grade2 >= 50.0:
can be
if min(grade1, grade2) >= 50.0: # both are >= 50.0
similarly
if max(grade1, grade2) < 50.0: # both are less than 50.0
Once those two have been shown false, the else means that one is on each side of the limit.
Similarly, to test for invalid values you can use
if max(grade1, grade2) > 100.0 or min(grade1, grade2) < 0:
which means that at least one grade is invalid, while
if min(grade1, grade2) > 100 or max(grade1, grade2) < 0:
means that both grades are invalid in the same way.
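Putting the min/max checks together, a complete function might look like the sketch below. The name average_passing and the passing parameter are invented here, a grade of exactly 50 is assumed to pass, and raising ValueError for out-of-range input is one possible choice rather than something the question asked for.

```python
def average_passing(grade1, grade2, passing=50.0):
    """Return the average of the passing grades, or 0.0 if neither passes."""
    low, high = min(grade1, grade2), max(grade1, grade2)
    if low < 0 or high > 100:
        # At least one grade is outside the 0-100 range
        raise ValueError('grades must be between 0 and 100')
    if low >= passing:
        # Both grades pass
        return (grade1 + grade2) / 2
    if high < passing:
        # Neither grade passes
        return 0.0
    # Exactly one grade passes, and it has to be the larger one
    return float(high)

print(average_passing(50, 100))   # 75.0
print(average_passing(35.3, 88))  # 88.0
print(average_passing(20, 30))    # 0.0
```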

I am no expert in Python. However, I'm pretty sure your error is because of your if conditions.
if 0.0 < grade1 > 100.0 and 0 < grade2 > 100.0:
Should be something like
if (grade1 < 0 or grade1 > 100) and (grade2 < 0 or grade2 > 100):

Related

Python - Adding count to output

I'm writing a simple program that takes a number and continually doubles it until it has reached an upper limit. The code I've written does this with a for loop and while loop, but I want to add a count to the output to see how many iterations through the while loop it took to get to the upper limit.
The code looks like this:
def double_function():
    print('Enter an upper range to target')
    upper_range = int(input())
    for number in range(0, upper_range):
        print('Enter a number to double')
        number = float(input())
        while number < upper_range:
            number * 2
            number += number
            print(number)
        else:
            break
    output_list = list(str(number))
    iterations = enumerate(output_list)
    print('It took ' + str(iterations) + ' iterations to reach ' + str(number))
double_function()
If upper_range = 1000 and number = 1, output is:
2.0
4.0
8.0
16.0
32.0
64.0
128.0
256.0
512.0
1024.0
It took <enumerate object at 0x7ffe02f74e40> iterations to reach 1024.0
I tried using enumerate because that's the only suggestion I've seen, but every other example used it with lists. I tried converting my output to a list, but I'm still not getting the output I want. I want it to look something like this:
1: 2.0
2: 4.0
3: 8.0
4: 16.0
5: 32.0
6: 64.0
7: 128.0
8: 256.0
9: 512.0
10: 1024.0
It took 10 iterations to reach 1000
Thanks for the help
enumerate returns an enumerate object, not a count; see help(enumerate).
To get what you're looking for, just use len(output_list)
Just realized your "output_list" isn't actually a list containing the intermediate results. So to actually get the number of iterations, you can just initialize a counter before starting the while loop, then increment it by one in the while-loop block, and once it breaks, that variable will store the number of iterations performed.
RESOLVED. Initialized a counter before starting while loop and incremented by one inside the loop. Code looks like this:
def double_function():
    print('Enter an upper range to target')
    upper_range = int(input())
    for number in range(0, upper_range):
        print('Enter a number to double')
        number = float(input())
        iterations = 0
        while number < upper_range:
            number * 2
            number += number
            iterations += 1
            print(iterations, number)
        else:
            break
    print('It took ' + str(iterations) + ' iterations to reach ' + str(number))
double_function()
Output if upper_range = 1000, number = 1:
1 2.0
2 4.0
3 8.0
4 16.0
5 32.0
6 64.0
7 128.0
8 256.0
9 512.0
10 1024.0
It took 10 iterations to reach 1024.0
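For comparison, here is a trimmed-down sketch of the same counter idea. It takes the values as arguments instead of reading them with input(), and it drops the outer for loop and the no-op number * 2 line. The name count_doublings and the guard against non-positive numbers are additions for illustration.

```python
def count_doublings(number, upper_range):
    """Double number until it reaches upper_range; return (iterations, final value)."""
    if number <= 0:
        # Doubling zero or a negative number never reaches a positive limit
        raise ValueError('number must be positive')
    iterations = 0
    while number < upper_range:
        number *= 2
        iterations += 1
        print(f'{iterations}: {number}')
    return iterations, number

iterations, final = count_doublings(1.0, 1000)
print(f'It took {iterations} iterations to reach {final}')
```

With 1.0 and 1000 this prints the ten numbered lines from the desired output, followed by "It took 10 iterations to reach 1024.0".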

Fill NaN values based on operators from another column

I have a database (pd.DataFrame) like this:
condition odometer
0 new NaN
1 bad 1100
2 excellent 110
3 NaN 200
4 NaN 2000
5 new 20
6 bad NaN
And I want to fill the NaN of "condition" based on the values of "odometer":
new: odometer >0 and <= 100
excellent: odometer >100 and <= 1000
bad: odometer >1000
I tried to do this but it is not working:
for i in range(len(database)):
    if math.isnan(database['condition'][i]) == True:
        odometer = database['odometer'][i]
        if odometer > 0 & odometer <= 100: value = 'new'
        elif odometer > 100 & odometer <= 1000: value = 'excellent'
        elif odometer > 1000: value = 'bad'
        database['condition'][i] = value
Tried also making the first "if" condition:
database['condition'][i] == np.nan
But it doesn't work as well.
You can use DataFrame.apply() to generate a new condition column with your function, and replace the old one afterwards. Not sure what your columns actually contain; df['condition'].dtype will tell you the dtype, but both cases below show up as object. What matters is how the missing values are stored, since that could create a bug in your logic. If they are the literal string 'NaN', you'll need a direct comparison == 'NaN'. If they are real missing values (np.nan), you can test for them with pd.isna(). I included a sample database for each case below. You also might want to test the type of your odometer column.
import numpy as np
import pandas as pd
# condition column as string
df = pd.DataFrame({'condition':['new','bad','excellent','NaN','NaN','new','bad'], 'odometer':np.array([np.nan, 1100, 110, 200, 2000, 20, np.nan], dtype=object)})
# condition column as object
# df = pd.DataFrame({'condition':np.array(['new','bad','excellent',np.nan,np.nan,'new','bad'], dtype=object), 'odometer':np.array([np.nan, 1100, 110, 200, 2000, 20, np.nan], dtype=object)})
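def f(database):
    # Only fill rows whose condition is missing (stored here as the string 'NaN')
    if database['condition'] == 'NaN':
    #if pd.isna(database['condition']):  # use this test for the object-array frame
        odometer = database['odometer']
        # Chained comparisons; `&` binds tighter than `>` and `<=`, so
        # `odometer > 0 & odometer <= 100` does not test the range
        if 0 < odometer <= 100: return 'new'
        elif 100 < odometer <= 1000: return 'excellent'
        elif odometer > 1000: return 'bad'
    # Keep the existing value (also when the odometer itself is missing)
    return database['condition']
df['condition'] = df.apply(f, axis=1)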
I have a nice one-liner solution for you.
Let's create a sample dataframe:
import pandas as pd
df = pd.DataFrame({'condition':['new','bad',None,None,None], 'odometer':[None,1100,50,500,2000]})
df
Out:
condition odometer
0 new NaN
1 bad 1100.0
2 None 50.0
3 None 500.0
4 None 2000.0
Solution:
df.condition = df.condition.fillna(df.odometer.apply(lambda number: 'new' if 0 < number <= 100 else 'excellent' if 100 < number <= 1000 else 'bad' if number > 1000 else None))
df
Out:
condition odometer
0 new NaN
1 bad 1100.0
2 new 50.0
3 excellent 500.0
4 bad 2000.0
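As an alternative that neither answer uses, pd.cut can do the binning in one step. This is a sketch under the assumption that the thresholds from the question are right-inclusive (100 is still "new", 1000 is still "excellent"); the sample frame is made up and adds one row at exactly 1000 to show the boundary.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'condition': ['new', 'bad', None, None, None, None],
    'odometer': [np.nan, 1100, 50, 500, 2000, 1000],
})

# Right-closed bins: (0, 100], (100, 1000], (1000, inf)
labels = pd.cut(
    df['odometer'],
    bins=[0, 100, 1000, np.inf],
    labels=['new', 'excellent', 'bad'],
)

# Fill only the missing conditions; existing values are left alone.
# astype(object) turns the categorical result into plain strings / NaN.
df['condition'] = df['condition'].fillna(labels.astype(object))
print(df)
```

Rows whose odometer is missing or not positive get no label from pd.cut, so their condition stays as it was.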

How can i replace values in a column using pandas?

It's my first time using Python and pandas (plz help this old man). I have a column with float and negative numbers and I want to replace them based on conditions.
I.e. if the number is between -2 and -1.6, replace it with -2, etc.
How can I create the condition (using if else or other) to modify my column? Thanks a lot
mean=[]
for row in df.values["mean"]:
    if row <= -1.5:
        mean.append(-2)
    elif row <= -0.5 and =-1.4:
        mean.append(-1)
    elif row <= 0.5 and =-0.4:
        mean.append(0)
    else:
        mean.append(1)
df = df.assign(mean=mean)
Doesn't work
Create a function defining your conditions and then apply it to your column (I fixed some of your conditionals based on what I thought they should be):
import pandas as pd
df = pd.read_table('fun.txt')
# create function to apply for value ranges
def labels(x):
    if x <= -1.5:
        return -2
    elif -1.5 < x <= -0.5:
        return -1
    elif -0.5 < x < 0.5:
        return 0
    else:
        return 1
df['mean'] = df['mean'].apply(labels) # apply your function to your table
print(df)
Another way to apply your function that returns the same result:
df['mean'] = df['mean'].map(labels)
fun.txt:
mean
0
-1.5
-1
-0.5
0.1
1.1
output from above:
mean
0 0
1 -2
2 -1
3 -1
4 0
5 1

Python Pandas: calculate rolling mean (moving average) over variable number of rows

Say I have the following dataframe
import pandas as pd
df = pd.DataFrame({ 'distance':[2.0, 3.0, 1.0, 4.0],
'velocity':[10.0, 20.0, 5.0, 40.0] })
gives the dataframe
distance velocity
0 2.0 10.0
1 3.0 20.0
2 1.0 5.0
3 4.0 40.0
How can I calculate the average of the velocity column over the rolling sum of the distance column? With the example above, create a rolling sum over the last N rows in order to get a minimum cumulative distance of 5, and then calculate the average velocity over those rows.
My target output would then be like this:
distance velocity rv
0 2.0 10.0 NaN
1 3.0 20.0 15.0
2 1.0 5.0 11.7
3 4.0 40.0 22.5
where
15.0 = (10+20)/2 (2 because 3 + 2 >= 5)
11.7 = (10 + 20 + 5)/3 (3 because 1 + 3 + 2 >= 5)
22.5 = (5 + 40)/2 (2 because 4 + 1 >= 5)
Update: in Pandas-speak, my code should find the index of the reverse cumulative distance sum back from my current record (such that it is 5 or greater), and then use that index to calculate the start of the moving average.
Not a particularly pandasy solution, but it sounds like you want to do something like
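import numpy as np
df['rv'] = np.nan
for i in range(len(df)):
    j = i
    s = 0
    # Walk backwards from row i until the cumulative distance reaches 5
    while j >= 0 and s < 5:
        s += df['distance'].loc[j]
        j -= 1
    if s >= 5:
        # Assign through df.loc to avoid chained assignment
        df.loc[i, 'rv'] = df['velocity'].iloc[j+1:i+1].mean()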
Update: Since this answer, the OP stated that they want a "valid Pandas solution (e.g. without loops)". If we take this to mean that they want something more performant than the above, then, perhaps ironically given the comment, the first optimization that comes to mind is to avoid the data frame unless needed:
l = len(df)
a = np.full(l, np.nan) # rows that never reach a distance of 5 stay NaN
d = df['distance'].values
v = df['velocity'].values
for i in range(l):
    j = i
    s = 0
    while j >= 0 and s < 5:
        s += d[j]
        j -= 1
    if s >= 5:
        a[i] = v[j+1:i+1].mean()
df['rv'] = a
Moreover, as suggested by @JohnE, numba quickly comes in handy for further optimization. While it won't do much on the first solution above, the second solution can be decorated with @numba.jit out of the box with immediate benefits. Benchmarking all three solutions on
pd.DataFrame({'velocity': 50*np.random.random(10000), 'distance': 5*np.random.rand(10000)})
I get the following results:
Method Benchmark
-----------------------------------------------
Original data frame based 4.65 s ± 325 ms
Pure numpy array based 80.8 ms ± 9.95 ms
Jitted numpy array based 766 µs ± 52 µs
Even the innocent-looking mean is enough to throw off numba; if we get rid of that and go instead with
@numba.jit
def numba_example():
    l = len(df)
    a = np.full(l, np.nan) # rows that never reach a distance of 5 stay NaN
    d = df['distance'].values
    v = df['velocity'].values
    for i in range(l):
        j = i
        s = 0
        while j >= 0 and s < 5:
            s += d[j]
            j -= 1
        if s >= 5:
            # Sum the window by hand instead of calling mean()
            total = 0.0
            for k in range(j+1, i+1):
                total += v[k]
            a[i] = total / (i-j)
    df['rv'] = a
then the benchmark reduces to 158 µs ± 8.41 µs.
Now, if you happen to know more about the structure of df['distance'], the while loop can probably be optimized further. (For example, if the values happen to always be much lower than 5, it will be faster to cut the cumulative sum from its tail, rather than recalculating everything.)
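One way to cut the sum from its tail is a sliding window with two pointers, sketched below in plain Python. It assumes all distances are non-negative, so the start of the window never has to move backwards; the function name min_distance_rolling_mean and its parameters are invented for this example.

```python
import math

def min_distance_rolling_mean(distance, velocity, target=5):
    """For each row, average velocity over the shortest trailing window
    whose summed distance is at least target (NaN if there is none).
    Assumes every distance is >= 0."""
    # prefix[k] holds the sum of velocity[0:k], so window sums are O(1)
    prefix = [0.0]
    for v in velocity:
        prefix.append(prefix[-1] + v)

    result = []
    start = 0   # first row of the current window
    s = 0.0     # summed distance over rows start..i
    for i, d in enumerate(distance):
        s += d
        # Drop rows from the tail while the window still covers the target
        while s - distance[start] >= target:
            s -= distance[start]
            start += 1
        if s >= target:
            result.append((prefix[i + 1] - prefix[start]) / (i - start + 1))
        else:
            result.append(math.nan)
    return result

print(min_distance_rolling_mean([2.0, 3.0, 1.0, 4.0], [10.0, 20.0, 5.0, 40.0]))
```

On the example frame this gives NaN, 15.0, about 11.67 and 22.5, matching the target output in the question. Each row enters and leaves the window at most once, so the loop is linear in the number of rows.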
How about
df.rolling(window=3, min_periods=2).mean()
distance velocity
0 NaN NaN
1 2.500000 15.000000
2 2.000000 11.666667
3 2.666667 21.666667
To combine them
df['rv'] = df.velocity.rolling(window=3, min_periods=2).mean()
It looks like something's a little off with the window shape.

Why do I get a syntax error?

sales = 1000
#def commissionRate():
if (sales < 10000):
    print("da")
else:
    if (sales <= 10000 and >= 15000):
        print("ea")
Syntax error on the if (sales <= 10000 and >= 15000): line. Particularly on the equal signs.
You need to compare sales against the second condition also:
In [326]:
sales = 1000
#def commissionRate():
if (sales < 10000):
    print("da")
else:
    if (sales <= 10000 and sales >= 15000):
        print("ea")
da
you need this:
if (sales <= 10000 and sales >= 15000):
                       ^^^^^ sales here
Additionally you don't need parentheses () around the if conditions:
if sales <= 10000 and sales >= 15000:
works fine
You could rewrite it to the more compact:
In [328]:
sales = 1000
if sales < 10000:
    print("da")
else:
    if 10000 <= sales <= 15000:
        print("ea")
da
So if 10000 <= sales <= 15000: works also, thanks @Donkey Kong.
Additionally (thanks @pjz), and nothing to do with the syntax, logically sales cannot be both less than or equal to 10000 and greater than or equal to 15000.
So even without the syntax error that condition will never be True.
You wanted if sales >= 10000 and sales <= 15000: or if 10000 <= sales <= 15000:, which may be clearer for you.
Just to expand on the if 10000 <= sales <= 15000: syntax (thanks @will for the suggestion): in Python one can write math-style comparisons such as lower_limit < x < upper_limit, which read more naturally than the usual if x > lower_limit and x < upper_limit:.
This allows comparisons to be chained, from the docs:
Formally, if a, b, c, ..., y, z are expressions and op1, op2, ..., opN
are comparison operators, then a op1 b op2 c ... y opN z is equivalent
to a op1 b and b op2 c and ... y opN z, except that each expression is
evaluated at most once.
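The "evaluated at most once" part of that quote can be observed directly. In this sketch, current_sales is a made-up function that records each call, standing in for any expression that is expensive or has side effects.

```python
calls = []

def current_sales():
    # Record every call so the number of evaluations can be counted
    calls.append(1)
    return 12000

# Chained comparison: the middle expression is evaluated once
chained = 10000 <= current_sales() <= 15000
chained_calls = len(calls)

calls.clear()

# Spelled out with `and`: the expression is written, and evaluated, twice
spelled_out = 10000 <= current_sales() and current_sales() <= 15000
spelled_out_calls = len(calls)

print(chained, chained_calls)          # True 1
print(spelled_out, spelled_out_calls)  # True 2
```

For a plain variable like sales the two forms behave the same; the difference only shows when the middle operand is a call or another expression with side effects.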
About syntax:
if (sales <= 10000 and >= 15000): should be if (sales <= 10000 and sales >= 15000):
About logic:
sales can never be smaller than or equal to 10,000 and bigger than or equal to 15,000 at the same time. You probably want:
if (10000 <= sales <= 15000):
