is there ever a reason to use "is"? - python

Instead of "=="? I know what "is" is, it is comparing the identity of the variable. But when would you ever want to do that? All it has ever done for me is cause problems. After using it for a while (because I felt it made my code more readable), I am not declaring war on "is".
Does anyone use it for something that "==" wouldn't do? I don't understand why they didn't make 'is' the same thing as '==', like they made 'and' the same as '&&' etc. If someone wanted the pointer, they just have to say "id(x) == id(y)" and we wouldn't have this confusion.
It's one of the "gotcha's" in python that I don't understand, and trip a lot of newbies up. The reason I think it trips people up is that they don't get why it would do an identity comparison. What is the point(er)?
Edit:
Thanks for the great answers. I think what new people should take away is "always use '==' unless you know what you are doing"!

Yes, there is a reason.
When you want to compare object-identity ("same object") and not object-equality ("same value"). In almost all cases == (object-equality) is the correct operator to use. (As pointed out in the comments, the trivial case I skipped entirely is the x is None idiom -- None is the sole inhabitant of NoneType.)
One case for object-identity is an identity-cache/dictionary which is a common pattern in some proxy or "inside-out" situations. In this case I don't want the value for "a similar object" back, I want value for "the same object" back.
Happy coding.
Thinking about pointers in Python is wrong. Values in Python are objects and each value represents itself. That is, x is y is only true when x and y evaluate to the same object. When objects are passed in python this same concept holds -- the object itself is passed. No need to think about references.
Python is Call-by-Sharing/Call-by-Object-Sharing although the term "Call-By-Reference" -- which exists in the "official documentation places" -- is often incorrectly used to describe the behavior (perhaps to "ease" the transition from other languages where the term is also often incorrectly used).

There is a huge difference the two. is tests for object identity and == tests for equality.
Consider this simple example:
print [] == [] # True
print [] is [] # False

Related

Is this the proper way to test if an enum is a specific value? [duplicate]

PEP 8 Programming Recommendations says:
Comparisons to singletons like None should always be done with is or is not, never the equality operators.
According to the docs, enum members are singletons. Does that mean they should also be compared by identity?
class Color(Enum):
RED = 1
GREEN = 2
BLUE = 3
# like this?
if color is Color.RED:
...
# or like this
if color == Color.RED:
...
When using equality operators, I haven't noticed any issues with this to warrant such strong wording as PEP 8. What's the drawback of using equality, if any? Doesn't it just fall back to an identity-based comparison anyway? Is this just a micro-optimisation?
From https://docs.python.org/3/library/enum.html#module-enum:
Within an enumeration, the members can be compared by identity
In particular, from https://docs.python.org/3/howto/enum.html#comparisons :
Enumeration members are compared by identity
First, we can definitely rule out x.value is y.value, because those aren't singletons, those are perfectly ordinary values that you've stored in attributes.
But what about x is y?
First, I believe that, by "singletons like None", PEP 8 is specifically referring to the small, fixed set of built-in singletons that are like None in some important way. What important way? Why do you want to compare None with is?
Readability: if foo is None: reads like what it means. On the rare occasions when you want to distinguish True from other truthy values, if spam is True: reads better than if spam == True:, as well as making it more obvious that this isn't just a frivolous == True used by someone improperly following a C++ coding standard in Python. That might apply in foo is Potato.spud, but not so much in x is y.
Use as a sentinel: None is used to mean "value missing" or "search failed" or similar cases. It shouldn't be used in cases where None itself can be a value, of course. And if someone creates a class whose instances compare equal to None, it's possible to run into that problem without realizing it. is None protects against that. This is even more of a problem with True and False (again, on those rare occasions when you want to distinguish them), since 1 == True and 0 == False. This reason doesn't seem to apply here—if 1 == Potato.spud, that's only because you intentionally chose to use an IntEnum instead of an Enum, in which case that's exactly what you want…
(Quasi-)keyword status: None and friends have gradually migrated from perfectly normal builtin to keyword over the years. Not only is the default value of the symbol None always going to be the singleton, the only possible value is that singleton. This means that an optimizer, static linter, etc. can make an assumption about what None means in your code, in a way that it can't for anything defined at runtime. Again, this reason doesn't seem to apply.
Performance: This really is not a consideration at all. It might be faster, on some implementations, to compare with is than with ==, but this is incredibly unlikely to ever make a difference in real code (or, if it does, that real code probably needs a higher-level optimization, like turning a list into a set…).
So, what's the conclusion?
Well, it's hard to get away from an opinion here, but I think it's reasonable to say that:
if devo is Potato.spud: is reasonable if it makes things more readable, but as long as you're consistent within a code base, I don't think anyone will complain either way.
if x is y:, even when they're both known to be Potato objects, is not reasonable.
if x.value is Potato.spud.value is not reasonable.
PEP 8 says:
Comparisons to singletons like None should always be done with is or is not, never the equality operators.
I disagree with abarnert: the reason for this isn't because these are built-in or special in any way. It's because in these cases you care about having the object, not something that looks like it.
When using is None, for example, you care about whether it's the None that you put there, not a different None that had been passed in. This might be difficult in practice (there is only one None, after all) but it does sometimes matter.
Take, for example:
no_argument = object()
def foo(x=no_argument):
if x OP no_argument:
...
...
If OP is is, this is perfectly idiomatic code. If it's ==, it's not.
For the same reason, you should make the decision as so:
If you want equality to duck type, such as with IntEnum or an Enum that you might want to subclass and overwrite (such as when you have complex enum types with methods and other extras) it makes sense to use ==.
When you are using enums as dumb sentinels, use is.

why some types of data refer to the same memory location

Asked such a question. Why only the type only str and boolean with the same variables refer to one memory location:
a = 'something'
b = 'something'
if a is b: print('True') # True
but we did not write anywhere a = b. hence the interpreter saw that the strings are equal to each other and made a reference to one memory cell.
Of course, if we assign a new value to either of these two variables, there will be no conflict, so now the variable will refer to another memory location
b = 'something more'
if a is b: print('True') # False
with type boolean going on all the same
a = True
b = True
if a is b: print('True') # True
I first thought that this happens with all mutable types. But no. There remained one unchangeable type - tuple. But it has a different behavior, that is, when we assign the same values ​​to variables, we already refer to different memory cells. Why does this happen only with tuple of immutable types
a = (1,9,8)
b = (1,9,8)
if a is b: print('True') # False
In Python, == checks for value equality, while is checks if basically its the same object like so: id(object) == id(object)
Python has some builtin singletons which it starts off with (I'm guessing lower integers and some commonly used strings)
So, if you dig deeper into your statement
a = 'something'
b = 'something'
id(a)
# 139702804094704
id(b)
# 139702804094704
a is b
# True
But if you change it a bit:
a = 'something else'
b = 'something else'
id(a)
# 139702804150640
id(b)
# 139702804159152
a is b
# False
We're getting False because Python uses different memory location for a and b this time, unlike before.
My guess is with tuples (and someone correct me if I'm mistaken) Python allocates different memory every time you create one.
Why do some types cache values? Because you shouldn't be able to notice the difference!
is is a very specialized operator. Nearly always you should use == instead, which will do exactly what you want.
The cases where you want to use is instead of == basically are when you're dealing with objects that have overloaded the behavior of == to not mean what you want it to mean, or where you're worried that you might be dealing with such objects.
If you're not sure whether you're dealing with such objects or not, you're probably not, which means that == is always right and you don't have to ever use is.
It can be a matter of "style points" to use is with known singleton objects, like None, but there's nothing wrong with using == there (again, in the absence of a pathological implementation of ==).
If you're dealing with potentially untrustworthy objects, then you should never do anything that may invoke a method that they control.... and that's a good place to use is. But almost nobody is doing that, and those who do should be aware of the zillion other ways a malicious object could cause problems.
If an object implements == incorrectly then you can get all kinds of weird problems. In the course of debugging those problems, of course you can and should use is! But that shouldn't be your normal way of comparing objects in code you write.
The one other case where you might want to use is rather than == is as a performance optimization, if the object you're dealing with implements == in a particularly expensive way. This is not going to be the case very often at all, and most of the time there are better ways to reduce the number of times you have to compare two objects (e.g. by comparing hash codes instead) which will ultimately have a much better effect on performance without bringing correctness into question.
If you use == wherever you semantically want an equality comparison, then you will never even notice when some types sneakily reuse instances on you.

Why isn't there an ignore special variable in python?

Let's say I want to partition a string. It returns a tuple of 3 items. I do not need the second item.
I have read that _ is used when a variable is to be ignored.
bar = 'asd cds'
start,_,end = bar.partition(' ')
If I understand it correctly, the _ is still filled with the value. I haven't heard of a way to ignore some part of the output.
Wouldn't that save cycles?
A bigger example would be
def foo():
return list(range(1000))
start,*_,end = foo()
It wouldn't really save any cycles to ignore the return argument, no, except for those which are trivially saved, by noting that there is no point to binding a name to a returned object that isn't used.
Python doesn't do any function inlining or cross-function optimization. It isn't targeting that niche in the slightest. Nor should it, as that would compromise many of the things that python is good at. A lot of core python functionality depends on the simplicity of its design.
Same for your list unpacking example. Its easy to think of syntactic sugar to have python pick the last and first item of the list, given that syntax. But there is no way, staying within the defining constraints of python, to actually not construct the whole list first. Who is to say that the construction of the list does not have relevant side-effects? Python, as a language, will certainly not guarantee you any such thing. As a dynamic language, python does not even have the slightest clue, or tries to concern itself, with the fact that foo might return a list, until the moment that it actually does so.
And will it return a list? What if you rebound the list identifier?
As per the docs, a valid variable name can be of this form
identifier ::= (letter|"_") (letter | digit | "_")*
It means that, first character of a variable name can be a letter or an underscore and rest of the name can have a letter or a digit or _. So, _ is a valid variable name in Python but that is less commonly used. So people normally use that like a use and throw variable.
And the syntax you have shown is not valid. It should have been
start,*_,end = foo()
Anyway, this kind of unpacking will work only in Python 3.x
In your example, you have used
list(range(1000))
So, the entire list is already constructed. When you return it, you are actually returning a reference to the list, the values are not copied actually. So, there is no specific way to ignore the values as such.
There certainly is a way to extract just a few elements. To wit:
l = foo()
start, end = foo[0], foo[-1]
The question you're asking is, then, "Why doesn't there exist a one-line shorthand for this?" There are two answers to that:
It's not common enough to need shorthand for. The two line solution is adequate for this uncommon scenario.
Features don't need a good reason to not exist. It's not like Guido van Rossum compiled a list of all possible ideas and then struck out yours. If you have an idea for improved syntax you could propose it to the Python community and see if you could get them to implement it.

Why are there no ++ and --​ operators in Python?

Why are there no ++ and -- operators in Python?
It's not because it doesn't make sense; it makes perfect sense to define "x++" as "x += 1, evaluating to the previous binding of x".
If you want to know the original reason, you'll have to either wade through old Python mailing lists or ask somebody who was there (eg. Guido), but it's easy enough to justify after the fact:
Simple increment and decrement aren't needed as much as in other languages. You don't write things like for(int i = 0; i < 10; ++i) in Python very often; instead you do things like for i in range(0, 10).
Since it's not needed nearly as often, there's much less reason to give it its own special syntax; when you do need to increment, += is usually just fine.
It's not a decision of whether it makes sense, or whether it can be done--it does, and it can. It's a question of whether the benefit is worth adding to the core syntax of the language. Remember, this is four operators--postinc, postdec, preinc, predec, and each of these would need to have its own class overloads; they all need to be specified, and tested; it would add opcodes to the language (implying a larger, and therefore slower, VM engine); every class that supports a logical increment would need to implement them (on top of += and -=).
This is all redundant with += and -=, so it would become a net loss.
This original answer I wrote is a myth from the folklore of computing: debunked by Dennis Ritchie as "historically impossible" as noted in the letters to the editors of Communications of the ACM July 2012 doi:10.1145/2209249.2209251
The C increment/decrement operators were invented at a time when the C compiler wasn't very smart and the authors wanted to be able to specify the direct intent that a machine language operator should be used which saved a handful of cycles for a compiler which might do a
load memory
load 1
add
store memory
instead of
inc memory
and the PDP-11 even supported "autoincrement" and "autoincrement deferred" instructions corresponding to *++p and *p++, respectively. See section 5.3 of the manual if horribly curious.
As compilers are smart enough to handle the high-level optimization tricks built into the syntax of C, they are just a syntactic convenience now.
Python doesn't have tricks to convey intentions to the assembler because it doesn't use one.
I always assumed it had to do with this line of the zen of python:
There should be one — and preferably only one — obvious way to do it.
x++ and x+=1 do the exact same thing, so there is no reason to have both.
Of course, we could say "Guido just decided that way", but I think the question is really about the reasons for that decision. I think there are several reasons:
It mixes together statements and expressions, which is not good practice. See http://norvig.com/python-iaq.html
It generally encourages people to write less readable code
Extra complexity in the language implementation, which is unnecessary in Python, as already mentioned
Because, in Python, integers are immutable (int's += actually returns a different object).
Also, with ++/-- you need to worry about pre- versus post- increment/decrement, and it takes only one more keystroke to write x+=1. In other words, it avoids potential confusion at the expense of very little gain.
Clarity!
Python is a lot about clarity and no programmer is likely to correctly guess the meaning of --a unless s/he's learned a language having that construct.
Python is also a lot about avoiding constructs that invite mistakes and the ++ operators are known to be rich sources of defects.
These two reasons are enough not to have those operators in Python.
The decision that Python uses indentation to mark blocks rather
than syntactical means such as some form of begin/end bracketing
or mandatory end marking is based largely on the same considerations.
For illustration, have a look at the discussion around introducing a conditional operator (in C: cond ? resultif : resultelse) into Python in 2005.
Read at least the first message and the decision message of that discussion (which had several precursors on the same topic previously).
Trivia:
The PEP frequently mentioned therein is the "Python Enhancement Proposal" PEP 308. LC means list comprehension, GE means generator expression (and don't worry if those confuse you, they are none of the few complicated spots of Python).
My understanding of why python does not have ++ operator is following: When you write this in python a=b=c=1 you will get three variables (labels) pointing at same object (which value is 1). You can verify this by using id function which will return an object memory address:
In [19]: id(a)
Out[19]: 34019256
In [20]: id(b)
Out[20]: 34019256
In [21]: id(c)
Out[21]: 34019256
All three variables (labels) point to the same object. Now increment one of variable and see how it affects memory addresses:
In [22] a = a + 1
In [23]: id(a)
Out[23]: 34019232
In [24]: id(b)
Out[24]: 34019256
In [25]: id(c)
Out[25]: 34019256
You can see that variable a now points to another object as variables b and c. Because you've used a = a + 1 it is explicitly clear. In other words you assign completely another object to label a. Imagine that you can write a++ it would suggest that you did not assign to variable a new object but ratter increment the old one. All this stuff is IMHO for minimization of confusion. For better understanding see how python variables works:
In Python, why can a function modify some arguments as perceived by the caller, but not others?
Is Python call-by-value or call-by-reference? Neither.
Does Python pass by value, or by reference?
Is Python pass-by-reference or pass-by-value?
Python: How do I pass a variable by reference?
Understanding Python variables and Memory Management
Emulating pass-by-value behaviour in python
Python functions call by reference
Code Like a Pythonista: Idiomatic Python
It was just designed that way. Increment and decrement operators are just shortcuts for x = x + 1. Python has typically adopted a design strategy which reduces the number of alternative means of performing an operation. Augmented assignment is the closest thing to increment/decrement operators in Python, and they weren't even added until Python 2.0.
I'm very new to python but I suspect the reason is because of the emphasis between mutable and immutable objects within the language. Now, I know that x++ can easily be interpreted as x = x + 1, but it LOOKS like you're incrementing in-place an object which could be immutable.
Just my guess/feeling/hunch.
To complete already good answers on that page:
Let's suppose we decide to do this, prefix (++i) that would break the unary + and - operators.
Today, prefixing by ++ or -- does nothing, because it enables unary plus operator twice (does nothing) or unary minus twice (twice: cancels itself)
>>> i=12
>>> ++i
12
>>> --i
12
So that would potentially break that logic.
now if one needs it for list comprehensions or lambdas, from python 3.8 it's possible with the new := assignment operator (PEP572)
pre-incrementing a and assign it to b:
>>> a = 1
>>> b = (a:=a+1)
>>> b
2
>>> a
2
post-incrementing just needs to make up the premature add by subtracting 1:
>>> a = 1
>>> b = (a:=a+1)-1
>>> b
1
>>> a
2
I believe it stems from the Python creed that "explicit is better than implicit".
First, Python is only indirectly influenced by C; it is heavily influenced by ABC, which apparently does not have these operators, so it should not be any great surprise not to find them in Python either.
Secondly, as others have said, increment and decrement are supported by += and -= already.
Third, full support for a ++ and -- operator set usually includes supporting both the prefix and postfix versions of them. In C and C++, this can lead to all kinds of "lovely" constructs that seem (to me) to be against the spirit of simplicity and straight-forwardness that Python embraces.
For example, while the C statement while(*t++ = *s++); may seem simple and elegant to an experienced programmer, to someone learning it, it is anything but simple. Throw in a mixture of prefix and postfix increments and decrements, and even many pros will have to stop and think a bit.
The ++ class of operators are expressions with side effects. This is something generally not found in Python.
For the same reason an assignment is not an expression in Python, thus preventing the common if (a = f(...)) { /* using a here */ } idiom.
Lastly I suspect that there operator are not very consistent with Pythons reference semantics. Remember, Python does not have variables (or pointers) with the semantics known from C/C++.
as i understood it so you won't think the value in memory is changed.
in c when you do x++ the value of x in memory changes.
but in python all numbers are immutable hence the address that x pointed as still has x not x+1. when you write x++ you would think that x change what really happens is that x refrence is changed to a location in memory where x+1 is stored or recreate this location if doe's not exists.
Other answers have described why it's not needed for iterators, but sometimes it is useful when assigning to increase a variable in-line, you can achieve the same effect using tuples and multiple assignment:
b = ++a becomes:
a,b = (a+1,)*2
and b = a++ becomes:
a,b = a+1, a
Python 3.8 introduces the assignment := operator, allowing us to achievefoo(++a) with
foo(a:=a+1)
foo(a++) is still elusive though.
Maybe a better question would be to ask why do these operators exist in C. K&R calls increment and decrement operators 'unusual' (Section 2.8page 46). The Introduction calls them 'more concise and often more efficient'. I suspect that the fact that these operations always come up in pointer manipulation also has played a part in their introduction.
In Python it has been probably decided that it made no sense to try to optimise increments (in fact I just did a test in C, and it seems that the gcc-generated assembly uses addl instead of incl in both cases) and there is no pointer arithmetic; so it would have been just One More Way to Do It and we know Python loathes that.
This may be because #GlennMaynard is looking at the matter as in comparison with other languages, but in Python, you do things the python way. It's not a 'why' question. It's there and you can do things to the same effect with x+=. In The Zen of Python, it is given: "there should only be one way to solve a problem." Multiple choices are great in art (freedom of expression) but lousy in engineering.
I think this relates to the concepts of mutability and immutability of objects. 2,3,4,5 are immutable in python. Refer to the image below. 2 has fixed id until this python process.
x++ would essentially mean an in-place increment like C. In C, x++ performs in-place increments. So, x=3, and x++ would increment 3 in the memory to 4, unlike python where 3 would still exist in memory.
Thus in python, you don't need to recreate a value in memory. This may lead to performance optimizations.
This is a hunch based answer.
I know this is an old thread, but the most common use case for ++i is not covered, that being manually indexing sets when there are no provided indices. This situation is why python provides enumerate()
Example : In any given language, when you use a construct like foreach to iterate over a set - for the sake of the example we'll even say it's an unordered set and you need a unique index for everything to tell them apart, say
i = 0
stuff = {'a': 'b', 'c': 'd', 'e': 'f'}
uniquestuff = {}
for key, val in stuff.items() :
uniquestuff[key] = '{0}{1}'.format(val, i)
i += 1
In cases like this, python provides an enumerate method, e.g.
for i, (key, val) in enumerate(stuff.items()) :
In addition to the other excellent answers here, ++ and -- are also notorious for undefined behavior. For example, what happens in this code?
foo[bar] = bar++;
It's so innocent-looking, but it's wrong C (and C++), because you don't know whether the first bar will have been incremented or not. One compiler might do it one way, another might do it another way, and a third might make demons fly out of your nose. All would be perfectly conformant with the C and C++ standards.
(EDIT: C++17 has changed the behavior of the given code so that it is defined; it will be equivalent to foo[bar+1] = bar; ++bar; — which nonetheless might not be what the programmer is expecting.)
Undefined behavior is seen as a necessary evil in C and C++, but in Python, it's just evil, and avoided as much as possible.

Comparing if datetime.datetime exists or None

I'm running a small app on Google App Engine with Python.
In the model I have a property of type DateTimeProperty, which is datetime.datetime. When it's created there is no value (i.e. "None").
I want compare if that datetime.datetime is None, but I can't.
if object.updated_date is None or object.updated_date >= past:
object.updated_date = now
Both updated_date and past is datetime.datetime.
I get the following error.
TypeError: can't compare datetime.datetime to NoneType
What is the correct way to do this?
Given that the previous discussion seems to have established that either of the variables could be None, one approach would be (assuming you want to set object.updated_date when either of the variables is None):
if None in (past, object.updated_date) or object.updated_date >= past:
object.updated_date = now
point being that the check None in (past, object.updated_date) may be handier than the semantically equivalent alternative (past is None or object.update_date is None) (arguably an epsilon more readable thanks to its better compactness, but it is, of course, an arguable matter of style).
As an aside, and a less-arguable matter of style;-), I strongly recommend against using built-ins' names as names for your own variables (and functions, etc) -- object is such a built-in name which in this context is clearly being used for your own purposes. Using obj instead is more concise, still readable (arguably more so;-), and has no downside. You're unlikely to be "bitten" in any given case by the iffy practice of "shadowing" built-ins' names with your own, but eventually it will happen (as you happen to need the normal meaning of the shadowed name during some later ordinary maintenance operation) and you may be in for a confusing debugging situation then; meanwhile, you risk confusing other readers / maintainers... and are getting absolutely no advantage in return for these disadvantages.
I realize that many of Python's built-ins' names are an "attractive nuisance" in this sense... file, object, list, dict, set, min, max... all apparently attractive name for "a file", "an object", "a list`, etc. But, it's worthwhile to learn to resist this particular temptation!-)
Perhaps you mean:
if object.updated_date and object.updated_date >= past:
If it's truthy (which implies not null), we check that it's also >= past. This uses short-circuit evaluation, which means the second condition isn't checked if the first is falsy.
You want and, not or.
You may also want to use is None.
EDIT:
Since you've determined that object.updated_date isn't None, this only other possibility is that past is None.

Categories

Resources