Strings and the and operator: best practice, differences with + - python

For one of my sites, I need to check if several class attributes are defined and not empty. So far, I've happily used if self.attr:, which in my mind is the shorthand for if self.attr is not None and self.attr is not '':, or whatever the undefined value of the attribute is.
This works fine, but yields to surprising behavior when checking multiple string attributes. '' and '' is not False (as I expected), but ''.
This begs the question: are there other types for which the and operator does not force a typecast to bool?
I can't come up with an example where this difference in behavior would cause an actual different outcome for the if-clause (after all, '' still evaluates to False), but am left with the gut feeling that there's edge cases that could be a trap.
Lastly, I'd be keen to know if anybody knows why it was implemented that way? I thought the Zen of Python encourages one way and one way only, and the + operator already seems to bethe intuitive way for string concatenation.

and never typecasts to bool. Rather, if calls bool() on the result of expressions.
An expression using and (and or, for that matter), short-circuits when it can determine that the expression will not evaluate to True or False based on the first operand, and returns the last evaluated value:
>>> 0 and 'string'
0
>>> 1 and 'string'
'string'
>>> 'string' or 10
'string'
>>> '' or 10
10
This 'side-effect' is often used in python code. Note that not does return a boolean value. See the python documentation on boolean operators for the details.
The documentation also explains what constitutes a True or False equivalent for various types, such as None, 0, '' and empty containers being False, while most everything else is equivalent to True.
For custom classes, you need to define the .__nonzero__() method (returns True or False) or a .__len__() method (returns an int, where 0 is False and everything else is True), to influence their boolean equivalent, otherwise they'll always default to True.

Related

I'm not able to understand this "or" conditions when i am writing 0 or 3 it is returning 3 and also when i am writing 3 and 0 same condition printed [duplicate]

First, the code:
>>> False or 'hello'
'hello'
This surprising behavior lets you check if x is not None and check the value of x in one line:
>>> x = 10 if randint(0,2) == 1 else None
>>> (x or 0) > 0
# depend on x value...
Explanation: or functions like this:
if x is false, then y, else x
No language that I know lets you do this. So, why does Python?
It sounds like you're combining two issues into one.
First, there's the issue of short-circuiting. Marcin's answer addresses this issue perfectly, so I won't try to do any better.
Second, there's or and and returning the last-evaluated value, rather than converting it to bool. There are arguments to be made both ways, and you can find many languages on either side of the divide.
Returning the last-evaluated value allows the functionCall(x) or defaultValue shortcut, avoids a possibly wasteful conversion (why convert an int 2 into a bool 1 if the only thing you're going to do with it is check whether it's non-zero?), and is generally easier to explain. So, for various combinations of these reasons, languages like C, Lisp, Javascript, Lua, Perl, Ruby, and VB all do things this way, and so does Python.
Always returning a boolean value from an operator helps to catch some errors (especially in languages where the logical operators and the bitwise operators are easy to confuse), and it allows you to design a language where boolean checks are strictly-typed checks for true instead of just checks for nonzero, it makes the type of the operator easier to write out, and it avoids having to deal with conversion for cases where the two operands are different types (see the ?: operator in C-family languages). So, for various combinations of these reasons, languages like C++, Fortran, Smalltalk, and Haskell all do things this way.
In your question (if I understand it correctly), you're using this feature to be able to write something like:
if (x or 0) < 1:
When x could easily be None. This particular use case isn't very useful, both because the more-explicit x if x else 0 (in Python 2.5 and later) is just as easy to write and probably easier to understand (at least Guido thinks so), but also because None < 1 is the same as 0 < 1 anyway (at least in Python 2.x, so you've always got at least one of the two options)… But there are similar examples where it is useful. Compare these two:
return launchMissiles() or -1
return launchMissiles() if launchMissiles() else -1
The second one will waste a lot of missiles blowing up your enemies in Antarctica twice instead of once.
If you're curious why Python does it this way:
Back in the 1.x days, there was no bool type. You've got falsy values like None, 0, [], (), "", etc., and everything else is true, so who needs explicit False and True? Returning 1 from or would have been silly, because 1 is no more true than [1, 2, 3] or "dsfsdf". By the time bool was added (gradually over two 2.x versions, IIRC), the current logic was already solidly embedded in the language, and changing would have broken a lot of code.
So, why didn't they change it in 3.0? Many Python users, including BDFL Guido, would suggest that you shouldn't use or in this case (at the very least because it's a violation of "TOOWTDI"); you should instead store the result of the expression in a variable, e.g.:
missiles = launchMissiles()
return missiles if missiles else -1
And in fact, Guido has stated that he'd like to ban launchMissiles() or -1, and that's part of the reason he eventually accepted the ternary if-else expression that he'd rejected many times before. But many others disagree, and Guido is a benevolent DFL. Also, making or work the way you'd expect everywhere else, while refusing to do what you want (but Guido doesn't want you to want) here, would actually be pretty complicated.
So, Python will probably always be on the same side as C, Perl, and Lisp here, instead of the same side as Java, Smalltalk, and Haskell.
No language that i know lets you do this. So, why Python do?
Then you don't know many languages. I can't think of one language that I do know that does not exhibit this "shortcircuiting" behaviour.
It does it because it is useful to say:
a = b or K
such that a either becomes b, if b is not None (or otherwise falsy), and if not it gets the default value K.
Actually a number of languages do. See Wikipedia about Short-Circuit Evaluation
For the reason why short-circuit evaluation exists, wikipedia writes:
If both expressions used as conditions are simple boolean variables,
it can be actually faster to evaluate both conditions used in boolean
operation at once, as it always requires a single calculation cycle,
as opposed to one or two cycles used in short-circuit evaluation
(depending on the value of the first).
This behavior is not surprising, and it's quite straightforward if you consider Python has the following features regarding or, and and not logical operators:
Short-circuit evaluation: it only evaluates operands up to where it needs to.
Non-coercing result: the result is one of the operands, not coerced to bool.
And, additionally:
The Truth Value of an object is False only for None, False, 0, "", [], {}. Everything else has a truth value of True (this is a simplification; the correct definition is in the official docs)
Combine those features, and it leads to:
or : if the first operand evaluates as True, short-circuit there and return it. Or return the 2nd operand.
and: if the first operand evaluates as False, short-circuit there and return it. Or return the 2nd operand.
It's easier to understand if you generalize to a chain of operations:
>>> a or b or c or d
>>> a and b and c and d
Here is the "rule of thumb" I've memorized to help me easily predict the result:
or : returns the first "truthy" operand it finds, or the last one.
and: returns the first "falsy" operand it finds, or the last one.
As for your question, on why python behaves like that, well... I think because it has some very neat uses, and it's quite intuitive to understand. A common use is a series of fallback choices, the first "found" (ie, non-falsy) is used. Think about this silly example:
drink = getColdBeer() or pickNiceWine() or random.anySoda or "meh, water :/"
Or this real-world scenario:
username = cmdlineargs.username or configFile['username'] or DEFAULT_USERNAME
Which is much more concise and elegant than the alternative.
As many other answers have pointed out, Python is not alone and many other languages have the same behavior, for both short-circuit (I believe most current languanges are) and non-coercion.
"No language that i know lets you do this. So, why Python do?" You seem to assume that all languages should be the same. Wouldn't you expect innovation in programming languages to produce unique features that people value?
You've just pointed out why it's useful, so why wouldn't Python do it? Perhaps you should ask why other languages don't.
You can take advantage of the special features of the Python or operator out of Boolean contexts. The rule of thumb is still that the result of your Boolean expressions is the first true operand or the last in the line.
Notice that the logical operators (or included) are evaluated before the assignment operator =, so you can assign the result of a Boolean expression to a variable in the same way you do with a common expression:
>>> a = 1
>>> b = 2
>>> var1 = a or b
>>> var1
1
>>> a = None
>>> b = 2
>>> var2 = a or b
>>> var2
2
>>> a = []
>>> b = {}
>>> var3 = a or b
>>> var3
{}
Here, the or operator works as expected, returning the first true operand or the last operand if both are evaluated to false.

What does something = other_thing or antother_thing? [duplicate]

Why do we see Python assignments with or?
For example:
def my_function(arg_1=None, arg_2=0):
determination = arg_1 or arg_2 or 'no arguments given!'
print(determination)
return determination
When called with no arguments, the above function would print and return 'no arguments given!'
Why does Python do this, and how can one best make best use of this functionality?
What the "or" expression does on assignment:
We sometimes see examples of this in Python as a substitute for conditional expression with ternary assignments, (in fact, it helped inspire the language to add conditional statements).
x = a or b
If bool(a) returns False, then x is assigned the value of b
Identical Use-case of Conditional Expressions (i.e. Ternary Assignments)
Here's an example of such a conditional expression that accomplishes the same thing, but with perhaps a bit less mystery.
def my_function(arg_1=None, arg_2=0):
determination = arg_1 if arg_1 else arg_2 if arg_2 else 'no arguments given!'
print(determination)
return determination
Repeating this syntax too much is considered to be bad style, otherwise it's OK for one-liners. The downside is that it is a bit repetitive.
or Expressions
The base case, x or y returns x if bool(x) evaluates True, else it evaluates y, (see the docs for reference). Therefore, a series of or expressions has the effect of returning the first item that evaluates True, or the last item.
For example
'' or [] or 'apple' or () or set(['banana'])
returns 'apple', the first item that evaluates as True, and
'' or [] or ()
returns (), even though it evaluates as False.
Extended and usage
For contrast, x and y returns x if bool(x) evaluates as False, else it returns y.
It makes sense that and would work this way when you consider that all of the conditions in a conditional and series needs to evaluate as True for the control flow to proceed down that path, and that it makes no sense to continue evaluating those items when you come across one that is False.
The utility of using and for assignment is not immediately as apparent as using or, but it was historically used for ternary assignment. That is, before this more clear and straightforward construction was available:
a = x if condition else y
the equivalent formed with boolean operators was:
a = condition and x or z # don't do this!
which while the meaning is derivable based on a full understanding of Python and and or evaluation, is not nearly as readable as the ternary conditional, and is best avoided altogether.
Conclusion
Using Boolean expressions for assignment must be done carefully. Definitely never use and for assignment, which is confusing enough to be quite error-prone. Style mavens will find use of or for assignments less preferable (than the more verbose ternary, if condition else), but I have found that it is so common in the professional Python community that it could be considered idiomatic.
If you choose to use it, do so cautiously with the understanding that the final element, if reached, will always be returned regardless of its evaluation, so that final element should probably be a literal, so that you know you have a good default fallback for your variable.

Why is "if(x)" different from "if(x==True)" in Python?

Sorry if this hasn't been asked before because it's obvious to anyone with any form of training in Python. I spent most of my days programming in Java and I was wondering about this thing I take to be an idiosyncrasy in Python. I was messing around with
this=bool
if(this==True):this="big"
print(this)
Not surprisingly, I received a generic type declaration as output because that would indeed be big if True.
<class 'bool'>
Then I used my limited Java comprehension to determine that I could simplify the expression more, it's a conditional, right? Those are made for Boolean types.
this=bool
if(this):this="big"
print(this)
Then I almost laughed out loud but it was mirth tinged with terror
big
It makes sense that "this" is "less falsy" than "this==True", which is False with a capital F, but I still didn't think it should evaluate as True. Besides, truthy and falsy seem to be good enough in other cases to produce the expected result. So what makes an empty Boolean truthy?
My first thought was it must be just checking to see if "this" exists. So I removed the declaration in the first line, expecting to find "if([something that doesn't exist]):" would be skipped. Instead it threw an error. If the simple fact of its existence evaluates as True, shouldn't its nonexistence simply evaluate to False? What's the point of a functionality in which an initialized (empty) value evaluates to true if an uninitialized value doesn't return false?
I thought I understood after reading this answer and indeed my first thought was, "Oh, it must just be checking to see if 'this' exists at all." Then I typed
if(this is not None):
as well as
if(this!=None):
and got "big" as the output again, suggesting that "this" is None.
Then, almost in a fit of panic, I typed
this=bool
if(this==None):this="big"
if(this==True):this="big"
if(this==False):this="big"
print(this)
and, as you may have guessed, failed
<class 'bool'>
Surely a Boolean must be one of these "three"? Please don't tell me I've discovered a "fourth" Boolean value, I couldn't handle that kind of notoriety.
The problem (I think) I have with all this is that removing the initialization of 'this' in the first line causes an error, and not a false case in the if statement. If "if(x):" is executed when x exists, then shouldn't a non-existent "x" pass and just skip over that statement? Why not leave the Boolean if functionality from Java and just force us to check if x is not null in that case?
bool isn't a boolean, it's a class (just like int, str etc.). Classes are always truthy.
I think you meant to write this = bool() (which is the same as this = False - there's no such thing as an "empty" boolean).
I don't understand what you're doing here. You're taking the class bool and checking if it is exactly equal to True. Well, no of course it isn't, True is an instance of bool, a class isn't exactly equal to its instances. It's not equal to False or None either. But that doesn't stop it from being true in a Boolean context, because it's not empty.
Simply put, in general, you cannot say that if x has the same meaning as if x == True in Python. Their result depends on how x's methods __bool__ and __eq__ are defined, respectively.
For classes themselves, like bool or int (i.e. the type, not an object of that class), __bool__ is defined to always return True; while the comparison against True returns False.
Therefore:
if int == True: print("NOT printed")
if int: print("printed")
if bool(int): print("printed") # equivalent to the previous one
bool is a builtin class. You are assigning this to the bool class, making it never equal True. When you type if(this), this is being automatically converted to a boolean so it can be evaluated in the if statement. Classes are always converted to True rather than False, which is why the if statement executes.

Behavior of NotImpemented in comparison

I recently found out that python has a special value NotImpemented to be used with respect to binary special methods to indicate that some operation has not been implemented.
The peculiar about this is that when checked in a binary situation it is always equivalent to True.
For example using io.BytesIO (which is a case where __eq__ in not implemented for example) for two objects in comparison will virtually return True. As in this example (encoded_jpg_io1 and encoded_jpg_io2 are objects of the io.BytesIO class):
if encoded_jpg_io1.__ne__(encoded_jpg_io2):
print('Equal')
else:
print('Unequal')
Equal
if encoded_jpg_io1.__eq__(encoded_jpg_io2) == True:
print('Equal')
else:
print('Unequal')
Unequal
Since the second style is a bit too verbose and normally not prefered (even my pyCharm suggests to remove the explicit comparison with True) isn't a bit tricky behavior? I wouldn't have noticed it if I haven't explicitly print the result of the Boolean operation (which is not Boolean in this case at all).
I guess suggesting to be considered False would cause the same problem with __ne__ so we arew back to step one.
So, the only way to check out for these cases is by doing an exact comparison with True or False in the opposite case.
I know that NotImpemented is preferred over NotImplementedError for various reasons so I am not asking for any explanation over why this matter.
Per convention, objects that do not define a __bool__ method are considered truthy. From the docs:
By default, an object is considered true unless its class defines either a __bool__() method that returns False or a __len__() method that returns zero
This means that most classes, functions, and other builtin singletons are considered true, since they don't go out of their way to specify different behavior. (An exception is None, which is one of the few built-in singletons that does specifically signal it should be considered false):
>>> bool(int) # the class, not an integer object
True
>>> bool(min)
True
>>> bool(object())
True
>>> bool(...) # that's the Ellipsis object
True
>>> bool(NotImplemented)
True
There is no real reason for the NotImplemented object to break this convention. The problem with your code isn't that NotImplemented is considered truthy; the real problem is that x.__eq__(y) is not equivalent to x == y.
If you want to compare two objects for equality, doing it with x.__eq__(y) is incorrect. Using x.__eq__(y) == True instead is still incorrect.
The correct solution is to do comparison with the == operator. If, for whatever reason, you can't use the == operator directly, you should use the operator.eq function instead.

Should these expressions evaluate differently?

I was somewhat confused until I found the bug in my code. I had to change
a.matched_images.count #True when variable is 0
to
a.matched_images.count > 0 #False when variable is 0
Since I quickly wanted to know whether an object had any images, the first code will appear like the photo has images since the expression evaluates to True when the meaning really is false ("no images" / 0 images)
Did I understand this correctly and can you please answer or comment if these expressions should evaluate to different values.
What is the nature of count? If it's a basic Python number, then if count is the same as if count != 0. On the other hand, if count is a custom class then it needs to implement either __nonzero__ or __len__ for Python 2.x, or __bool__ or __len__ for Python 3.x. If those methods are not defined, then every instance of that class is considered True.
Without knowing what count is, it's hard to answer, but this excerpt may be of use to you:.
The following values are considered false:
None
False
zero of any numeric type, for example, 0, 0L, 0.0, 0j.
any empty sequence, for example, '', (), [].
any empty mapping, for example, {}.
instances of user-defined classes, if the class defines a
__nonzero__() or __len__() method, when that method returns the
integer zero or bool value False. [1]
All other values are considered true — so objects of many types are
always true.
>>> bool(0)
False
So.. no, if it were an int that wouldn't matter. Please do some tracing, print out what count actually is.

Categories

Resources