Would Python interpret this:
if hour < 7 and hour > 0 or hour > 20 and hour < 23:
the same as
if 7 > hour > 0 or 23 > hour > 20 (this one is just the usual mathematical inequality)
If not, what should I write to express this inequality in Python?
You can use comparison chaining.
if (0 < hour < 7) or (20 < hour < 23):
    pass  # do stuff
(Parentheses for emphasis.)
You can always use parentheses to be sure:
if (hour < 7 and hour > 0) or (hour > 20 and hour < 23):
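To check that the three spellings really agree, here is a small sketch that compares them for every hour of the day. The function names are made up for illustration; the only language rule relied on is that `and` binds more tightly than `or`.

```python
def original(hour):
    # 'and' binds tighter than 'or', so this groups as (A and B) or (C and D).
    return hour < 7 and hour > 0 or hour > 20 and hour < 23

def chained(hour):
    return 0 < hour < 7 or 20 < hour < 23

def parenthesized(hour):
    return (hour < 7 and hour > 0) or (hour > 20 and hour < 23)

# Collect every hour for which the three forms disagree.
mismatches = [h for h in range(24)
              if not (original(h) == chained(h) == parenthesized(h))]
print(mismatches)  # []
```

An empty list means all three forms give the same answer for hours 0 through 23.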
I'm trying to create a new column in a pandas dataframe to then assign an integer value depending on conditional formatting. An example would be:
if ((a > 1) & (a < 5)) give value 10, if ((a >= 5) & (a < 10)) give value 24, if ((a > 10) & (a < 5)) give value 57
where 'a' is another column in the dataframe.
Is there any way to do it with pandas/numpy without creating a function? I tried a few different options, but none worked.
Using pd.cut
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [2, 3, 5, 7, 8, 10, 100]})
pd.cut(df.a, bins=[1, 5, 10, np.inf], labels=[10, 24, 57])
Out[282]:
0 10
1 10
2 10
3 24
4 24
5 24
6 57
Name: a, dtype: category
Categories (3, int64): [10 < 24 < 57]
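Note that in the output above the value 5 is labelled 10 and the value 10 is labelled 24, because pd.cut bins are right-closed by default (its `right` parameter defaults to True), whereas the question asks for a >= 5 to give 24. The following is a pure-Python sketch of that edge behaviour only, not of pandas itself; `label_for`, `EDGES` and `LABELS` are hypothetical names introduced for illustration.

```python
from bisect import bisect_left, bisect_right

EDGES = [1, 5, 10]        # bin edges; the last bin is open-ended
LABELS = [10, 24, 57]

def label_for(x, right_closed=True):
    """Hypothetical helper: map x to a label, or None if x falls below every bin."""
    if right_closed:
        # Bins (1, 5], (5, 10], (10, inf): an edge value goes to the bin on its left.
        idx = bisect_left(EDGES, x) - 1
    else:
        # Bins [1, 5), [5, 10), [10, inf): an edge value goes to the bin on its right.
        idx = bisect_right(EDGES, x) - 1
    return LABELS[idx] if idx >= 0 else None

values = [2, 3, 5, 7, 8, 10, 100]
print([label_for(v) for v in values])                      # [10, 10, 10, 24, 24, 24, 57]
print([label_for(v, right_closed=False) for v in values])  # [10, 10, 24, 24, 24, 57, 57]
```

The first line reproduces the labels shown above; the second shows what left-closed bins would give, which is presumably what passing right=False to pd.cut is for.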
I think any way of doing this without creating a function would be pretty roundabout, though it's actually not too bad with a function. Additionally, your conditions don't really mesh with each other, but I assume that's a typo. If your conditions are relatively simple, you can define your function on the fly to keep your code compact:
df['new column'] = df['a'].apply(lambda x: 10 if x < 5 else 24 if x < 10 else 57)
That can get a little hairy if your conditions are more complicated; it's easier to manage if you define the function more explicitly:
def f(x):
    if x > 1 and x < 5: return 10
    elif x >= 5 and x < 10: return 24
    else: return 57
df['new column'] = df['a'].apply(f)
If you really want to avoid functions, the best I can think of is creating a new list for your new column, populating it by iterating through your data, and then adding it to your dataframe:
newcol = []
for x in df['a'].values:
    if x > 1 and x < 5: newcol.append(10)
    elif x >= 5 and x < 10: newcol.append(24)
    else: newcol.append(57)
df['newcol'] = newcol
How do I determine whether a given integer is between two other integers (e.g. greater than/equal to 10000 and less than/equal to 30000)?
What I've attempted so far is not working:
if number >= 10000 and number >= 30000:
    print ("you have to pay 5% taxes")
if 10000 <= number <= 30000:
    pass
For details, see the docs.
>>> r = range(1, 4)
>>> 1 in r
True
>>> 2 in r
True
>>> 3 in r
True
>>> 4 in r
False
>>> 5 in r
False
>>> 0 in r
False
Your operator is incorrect. It should be if number >= 10000 and number <= 30000:. Additionally, Python has a shorthand for this sort of thing, if 10000 <= number <= 30000:.
Your code snippet,
if number >= 10000 and number >= 30000:
    print ("you have to pay 5% taxes")
actually checks if number is larger than both 10000 and 30000.
Assuming you want to check that the number is in the range 10000 - 30000, you could use the Python interval comparison:
if 10000 <= number <= 30000:
    print ("you have to pay 5% taxes")
This Python feature is further described in the Python documentation.
There are two ways to compare three integers and check whether b is between a and c:
if a < b < c:
    pass
and
if a < b and b < c:
    pass
The first one looks more readable, but the second one runs slightly faster.
Let's compare using dis.dis:
>>> dis.dis('a < b and b < c')
1 0 LOAD_NAME 0 (a)
2 LOAD_NAME 1 (b)
4 COMPARE_OP 0 (<)
6 JUMP_IF_FALSE_OR_POP 14
8 LOAD_NAME 1 (b)
10 LOAD_NAME 2 (c)
12 COMPARE_OP 0 (<)
>> 14 RETURN_VALUE
>>> dis.dis('a < b < c')
1 0 LOAD_NAME 0 (a)
2 LOAD_NAME 1 (b)
4 DUP_TOP
6 ROT_THREE
8 COMPARE_OP 0 (<)
10 JUMP_IF_FALSE_OR_POP 18
12 LOAD_NAME 2 (c)
14 COMPARE_OP 0 (<)
16 RETURN_VALUE
>> 18 ROT_TWO
20 POP_TOP
22 RETURN_VALUE
>>>
and using timeit:
~$ python3 -m timeit "1 < 2 and 2 < 3"
10000000 loops, best of 3: 0.0366 usec per loop
~$ python3 -m timeit "1 < 2 < 3"
10000000 loops, best of 3: 0.0396 usec per loop
Also, you may use range, as suggested before; however, it is much slower.
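Timings like these depend on the machine and the Python version, so it may be worth re-measuring. A minimal sketch using the standard timeit module from within a script (the variable names and the loop count are arbitrary choices, not part of the answer above):

```python
import timeit

setup = "a, b, c = 1, 2, 3"
runs = 200_000
results = {}

for stmt in ("a < b and b < c", "a < b < c", "b in range(a, c + 1)"):
    # repeat() returns the total seconds for `runs` evaluations, once per repeat;
    # the minimum is usually the least noisy figure.
    best = min(timeit.repeat(stmt, setup=setup, number=runs, repeat=3))
    results[stmt] = best
    print(f"{stmt:<22} {best / runs * 1e9:7.1f} ns per evaluation")
```

The absolute numbers matter less than the ratio between the three lines on your own interpreter.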
if number >= 10000 and number <= 30000:
    print ("you have to pay 5% taxes")
Define the range between the numbers:
r = range(1,10)
Then use it:
if num in r:
    print("All right!")
The trouble with comparisons is that they can be difficult to debug when you put a >= where there should be a <=:
#                             v---------- should be <=
if number >= 10000 and number >= 30000:
    print ("you have to pay 5% taxes")
Python lets you just write what you mean in words:
if number in xrange(10000, 30001): # ok you have to remember 30000 + 1 here :)
    print ("you have to pay 5% taxes")
In Python 3, you need to use range instead of xrange.
Edit: People seem to be more concerned with micro-benchmarks and how cool chained operations are. My answer is about defensive programming (less attack surface for bugs).
As a result of a claim in the comments, I've added the micro benchmark here for Python3.5.2
$ python3.5 -m timeit "5 in range(10000, 30000)"
1000000 loops, best of 3: 0.266 usec per loop
$ python3.5 -m timeit "10000 <= 5 < 30000"
10000000 loops, best of 3: 0.0327 usec per loop
If you are worried about performance, you could compute the range once
$ python3.5 -m timeit -s "R=range(10000, 30000)" "5 in R"
10000000 loops, best of 3: 0.0551 usec per loop
Below are a few possible ways, ordered from best to worst performance (i.e. the first one will perform best):
# Old school check
if number >= 10000 and number <= 30000:
    print ("you have to pay 5% taxes")
# Python range check
if 10000 <= number <= 30000:
    print ("you have to pay 5% taxes")
# As suggested by others but only works for integers and is slow
if number in range(10000,30001):
    print ("you have to pay 5% taxes")
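The "only works for integers" caveat is easy to demonstrate. A short sketch, using an arbitrary non-integer value chosen for illustration:

```python
number = 15000.5

print(10000 <= number <= 30000)        # True
print(number in range(10000, 30001))   # False: a range only contains integers

# For integers the two checks agree, including at both endpoints.
for n in (9999, 10000, 30000, 30001):
    assert (10000 <= n <= 30000) == (n in range(10000, 30001))
```

So the range form silently rejects values such as 15000.5 that the comparison form accepts.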
While 10 <= number <= 20 works in Python, I find this notation using range() more readable:
if number in range(10, 21):
    print("number is between 10 (inclusive) and 21 (exclusive)")
else:
    print("outside of range!")
Keep in mind that the 2nd, upper bound parameter is not included in the range set as can be verified with:
>>> list(range(10, 21))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
However, prefer the range() approach only if it is not running on a performance-critical path. A single call is still fast enough for most requirements, but when run 10,000,000 times, it is nearly 3 times slower than a <= x < b:
> { time python3 -c "for i in range(10000000): x = 50 in range(1, 100)"; } 2>&1 | sed -n 's/^.*cpu \(.*\) total$/\1/p'
1.848
> { time python3 -c "for i in range(10000000): x = 1 <= 50 < 100"; } 2>&1 | sed -n 's/^.*cpu \(.*\) total$/\1/p'
0.630
Suppose there are 3 non-negative integers: a, b, and c. Mathematically speaking, if we want to determine if c is between a and b, inclusively, one can use this formula:
(c - a) * (b - c) >= 0
or in Python:
> print((c - a) * (b - c) >= 0)
True
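As a sanity check on the formula, the sketch below compares it with a min/max test over a small grid of integers. The function names are invented for illustration. As far as this check goes, the formula also holds for negative values and regardless of whether a or b is the larger bound.

```python
def between_product(a, b, c):
    # Non-negative exactly when (c - a) and (b - c) do not have opposite signs.
    return (c - a) * (b - c) >= 0

def between_minmax(a, b, c):
    return min(a, b) <= c <= max(a, b)

# Every combination of small bounds and candidate values, in both orders.
checks = [(a, b, c)
          for a in range(-3, 4)
          for b in range(-3, 4)
          for c in range(-5, 6)]
print(all(between_product(*t) == between_minmax(*t) for t in checks))  # True
```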
You want the output to print the given statement if and only if the number falls between 10,000 and 30,000.
The code should be:
if number >= 10000 and number <= 30000:
    print("you have to pay 5% taxes")
You used >= 30000, so if number is 45000 the condition is true and the message is printed, but we need the number to be at least 10000 and at most 30000. Changing it to <= 30000 will do it!
I'm adding a solution that nobody has mentioned yet, using the Interval class from the sympy library:
from sympy import Interval
lower_value, higher_value = 10000, 30000
number = 20000
# to decide whether your interval should be open or closed use left_open and right_open
interval = Interval(lower_value, higher_value, left_open=False, right_open=False)
if interval.contains(number):
    print("you have to pay 5% taxes")
Try this simple function; it checks if A is between B and C (B and C may not be in the right order):
def isBetween(A, B, C):
    Mi = min(B, C)
    Ma = max(B, C)
    return Mi <= A <= Ma
so isBetween(2, 10, -1) is the same as isBetween(2, -1, 10).
The condition should be:
if number >= 10000 and number <= 30000:
    print("5% tax payable")
The reason for using number <= 30000 is that if number's value is 50000 and we use number >= 30000, the condition will pass, which is not what you want.
I am trying to get the average of two passing grades. The output should be:
0.0 if neither of the grades is a passing grade (both < 50)
The passing grade, if only one of the grades is a passing grade (if one is > 50)
The average of the two grades, if both are passing grades (if both are greater than 50)
Here is my code so far:
def passing_grade(grade1,grade2):
    '''(number, number)--> number
    This function definition prints the average of all passing grade(s)
    '''
    # Function 1 - If both numbers are outside the grading range (0-100)
    if 0.0 < grade1 > 100.0 and 0 < grade2 > 100.0:
        print ('Not available grading')
    elif 0.0 >= grade1 <= 50.0 and 0.0 >= grade2 <= 50.0:
        print (0.0)
    #Function 2 - If one of the grades is passing then, print passing grade
    elif 0.0 >= grade1 <= 50.0 and 0.0 >= grade2 >= 50.0:
        print (grade2)
    elif 0.0 >= grade1 >= 50.0 and 0.0 >= grade2 <= 50.0:
        print (grade1)
    #Function 3 - If both grades are passing >50 then print the average
    elif 50.0 > grade1 <= 100.0 and 50.0> grade2 <= 100.0:
        print ((grade1+grade2)/2)
I'm just guessing about your problem here, since you haven't specified, but it looks like you have bad logic in the second part of "Function #2":
elif 0.0 >= grade1 <= 50.0 and 0.0 >= grade2 >= 50.0:
    print (grade2)
elif 0.0 >= grade1 >= 50.0 and 0.0 >= grade2 <= 50.0:
    print (grade1)
Should be:
elif grade1 <= 50.0 and grade2 >= 50.0:
    print (grade2)
elif grade1 >= 50.0 and grade2 <= 50.0:
    print (grade1)
If you look at your original conditions, you keep checking 0.0 >= gradeN, which is only true if the grade is zero or negative. There are similar problems in some of your other sections.
Your comparisons are screwed up. They don't say what you mean, and many evaluate to False always. There can be no grade1 such that 0.0 >= grade1 >= 50.0, as there are no nonpositive numbers greater-equal 50. I suggest you write out your multiple comparisons "the long way" until you're clear about what you mean to say, rather than using this keystroke-saving feature of Python. a < b < c in Python means a < b and b < c not a < b or b < c which is the form of what you want to say in your first 'if' statement.
Finally, when writing multiple comparisons in one expression, don't mix directions of the comparisons, it's needlessly confusing (for you, to start with).
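To make the point about mixed directions concrete, here is a small sketch showing what two of the chains from the question actually evaluate to. The sample grades and function names are arbitrary and only for illustration.

```python
def chained(g):
    return 0.0 >= g >= 50.0

def spelled_out(g):
    # What the chain above expands to: both comparisons must hold at once.
    return 0.0 >= g and g >= 50.0

samples = [-10.0, 0.0, 25.0, 50.0, 75.0, 100.0, 150.0]
print([chained(g) for g in samples])                        # every entry is False
print(all(chained(g) == spelled_out(g) for g in samples))   # True

# The first 'if' in the question: 0.0 < g > 100.0 is just g > 100.0,
# so negative grades are never reported as invalid.
print([g for g in samples if 0.0 < g > 100.0])              # [150.0]
```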
A more concise way of writing (the calculation part of) your function:
def avg_passing_grade(grade1, grade2):
    passing_grades = [g for g in (grade1, grade2) if 50 <= g <= 100]
    return sum(passing_grades)/max(1, len(passing_grades))
This makes a list passing_grades containing only the grades supplied to the function that are passing. The function returns their average, taking care not to divide by 0 in case no grades are passing.
Although the following may be overkill, it's within such easy reach that I have to mention it: the function above generalizes easily to one that takes an arbitrary number of grades:
def average_passing_grade(*grades):
    '''Return the average of the passing grades among grades.'''
    passing_grades = [g for g in grades if 50 <= g <= 100]
    return sum(passing_grades)/max(1, len(passing_grades))
which you can use like this:
>>> average_passing_grade()
0.0
>>> average_passing_grade(35.3)
0.0
>>> average_passing_grade(75.5)
75.5
>>> average_passing_grade(35.3, 88)
88.0
>>> average_passing_grade(88, 20)
88.0
>>> average_passing_grade(50, 100)
75.0
>>> average_passing_grade(40, 50, 60, 70, 80, 90)
70.0
Besides the bad logic pointed out in the other answers, you could use max and min to do a single logical check for certain cases.
if grade1 >= 50.0 and grade2 >= 50.0:
can be
if min(grade1, grade2) >= 50.0: # Both are >= 50.0
similarly
if max(grade1, grade2) < 50.0: # both are less than 50.0
Once those two have been shown false, the else means that one is on each side of the limit.
Similarly, to test for the invalid values you can use
if max(grade1, grade2) > 100.0 or min(grade1, grade2) < 0:
means that at least one grade is invalid
if min(grade1, grade2) > 100 or max(grade1, grade2) <0:
means that both grades are invalid in the same way.
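Putting the min/max checks together, a minimal sketch could look like the following. The `classify` helper and its return strings are hypothetical, and it treats 50 as passing and 0 to 100 as the valid range, which is an assumption based on the thresholds used in this answer.

```python
def classify(grade1, grade2):
    """Hypothetical helper: describe a pair of grades using min/max checks."""
    low, high = min(grade1, grade2), max(grade1, grade2)
    if low < 0 or high > 100:
        return "invalid"          # at least one grade is outside 0-100
    if low >= 50:
        return "both passing"
    if high < 50:
        return "neither passing"
    return "one passing"          # the grades sit on opposite sides of 50

print(classify(35.3, 88))   # one passing
print(classify(60, 75))     # both passing
print(classify(10, 49.9))   # neither passing
print(classify(-1, 70))     # invalid
```

Checking validity first keeps the later branches simple, since by then both grades are known to lie within 0 to 100.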
I am no expert in Python. However, I'm pretty sure your error is because of your if conditions.
if 0.0 < grade1 > 100.0 and 0 < grade2 > 100.0:
Should be something like
if (grade1 < 0 or grade1 > 100) and (grade2 < 0 or grade2 > 100):