Can't compare strings with symbols? [duplicate] - python

Two string variables are set to the same value. s1 == s2 always returns True, but s1 is s2 sometimes returns False.
If I open my Python interpreter and do the same is comparison, it succeeds:
>>> s1 = 'text'
>>> s2 = 'text'
>>> s1 is s2
True
Why is this?

is is identity testing, and == is equality testing. What happens in your code would be emulated in the interpreter like this:
>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False
So, no wonder they're not the same, right?
In other words: a is b is the equivalent of id(a) == id(b)

Other answers here are correct: is is used for identity comparison, while == is used for equality comparison. Since what you care about is equality (the two strings should contain the same characters), in this case the is operator is simply wrong and you should be using == instead.
The reason is works interactively is that (most) string literals are interned by default. From Wikipedia:
Interned strings speed up string
comparisons, which are sometimes a
performance bottleneck in applications
(such as compilers and dynamic
programming language runtimes) that
rely heavily on hash tables with
string keys. Without interning,
checking that two different strings
are equal involves examining every
character of both strings. This is
slow for several reasons: it is
inherently O(n) in the length of the
strings; it typically requires reads
from several regions of memory, which
take time; and the reads fills up the
processor cache, meaning there is less
cache available for other needs. With
interned strings, a simple object
identity test suffices after the
original intern operation; this is
typically implemented as a pointer
equality test, normally just a single
machine instruction with no memory
reference at all.
So, when you have two string literals (words that are literally typed into your program source code, surrounded by quotation marks) in your program that have the same value, the Python compiler will automatically intern the strings, making them both stored at the same memory location. (Note that this doesn't always happen, and the rules for when this happens are quite convoluted, so please don't rely on this behavior in production code!)
Since in your interactive session both strings are actually stored in the same memory location, they have the same identity, so the is operator works as expected. But if you construct a string by some other method (even if that string contains exactly the same characters), then the string may be equal, but it is not the same string -- that is, it has a different identity, because it is stored in a different place in memory.

The is keyword is a test for object identity while == is a value comparison.
If you use is, the result will be true if and only if the object is the same object. However, == will be true any time the values of the object are the same.

One last thing to note is you may use the sys.intern function to ensure that you're getting a reference to the same string:
>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True
As pointed out in previous answers, you should not be using is to determine equality of strings. But this may be helpful to know if you have some kind of weird requirement to use is.
Note that the intern function used to be a built-in on Python 2, but it was moved to the sys module in Python 3.

is is identity testing and == is equality testing. This means is is a way to check whether two things are the same things, or just equivalent.
Say you've got a simple person object. If it is named 'Jack' and is '23' years old, it's equivalent to another 23-year-old Jack, but it's not the same person.
class Person(object):
def __init__(self, name, age):
self.name = name
self.age = age
def __eq__(self, other):
return self.name == other.name and self.age == other.age
jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)
jack1 == jack2 # True
jack1 is jack2 # False
They're the same age, but they're not the same instance of person. A string might be equivalent to another, but it's not the same object.

This is a side note, but in idiomatic Python, you will often see things like:
if x is None:
# Some clauses
This is safe, because there is guaranteed to be one instance of the Null Object (i.e., None).

If you're not sure what you're doing, use the '=='.
If you have a little more knowledge about it you can use 'is' for known objects like 'None'.
Otherwise, you'll end up wondering why things doesn't work and why this happens:
>>> a = 1
>>> b = 1
>>> b is a
True
>>> a = 6000
>>> b = 6000
>>> b is a
False
I'm not even sure if some things are guaranteed to stay the same between different Python versions/implementations.

From my limited experience with Python, is is used to compare two objects to see if they are the same object as opposed to two different objects with the same value. == is used to determine if the values are identical.
Here is a good example:
>>> s1 = u'public'
>>> s2 = 'public'
>>> s1 is s2
False
>>> s1 == s2
True
s1 is a Unicode string, and s2 is a normal string. They are not the same type, but they are the same value.

I think it has to do with the fact that, when the 'is' comparison evaluates to false, two distinct objects are used. If it evaluates to true, that means internally it's using the same exact object and not creating a new one, possibly because you created them within a fraction of 2 or so seconds and because there isn't a large time gap in between it's optimized and uses the same object.
This is why you should be using the equality operator ==, not is, to compare the value of a string object.
>>> s = 'one'
>>> s2 = 'two'
>>> s is s2
False
>>> s2 = s2.replace('two', 'one')
>>> s2
'one'
>>> s2 is s
False
>>>
In this example, I made s2, which was a different string object previously equal to 'one' but it is not the same object as s, because the interpreter did not use the same object as I did not initially assign it to 'one', if I had it would have made them the same object.

The == operator tests value equivalence. The is operator tests object identity, and Python tests whether the two are really the same object (i.e., live at the same address in memory).
>>> a = 'banana'
>>> b = 'banana'
>>> a is b
True
In this example, Python only created one string object, and both a and b refers to it. The reason is that Python internally caches and reuses some strings as an optimization. There really is just a string 'banana' in memory, shared by a and b. To trigger the normal behavior, you need to use longer strings:
>>> a = 'a longer banana'
>>> b = 'a longer banana'
>>> a == b, a is b
(True, False)
When you create two lists, you get two objects:
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False
In this case we would say that the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object. If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
If a refers to an object and you assign b = a, then both variables refer to the same object:
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
Reference: Think Python 2e by Allen B. Downey

I believe that this is known as "interned" strings. Python does this, so does Java, and so do C and C++ when compiling in optimized modes.
If you use two identical strings, instead of wasting memory by creating two string objects, all interned strings with the same contents point to the same memory.
This results in the Python "is" operator returning True because two strings with the same contents are pointing at the same string object. This will also happen in Java and in C.
This is only useful for memory savings though. You cannot rely on it to test for string equality, because the various interpreters and compilers and JIT engines cannot always do it.

Actually, the is operator checks for identity and == operator checks for equality.
From the language reference:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
So from the above statement we can infer that the strings, which are immutable types, may fail when checked with "is" and may succeed when checked with "is".
The same applies for int and tuple which are also immutable types.

is will compare the memory location. It is used for object-level comparison.
== will compare the variables in the program. It is used for checking at a value level.
is checks for address level equivalence
== checks for value level equivalence

is is identity testing and == is equality testing (see the Python documentation).
In most cases, if a is b, then a == b. But there are exceptions, for example:
>>> nan = float('nan')
>>> nan is nan
True
>>> nan == nan
False
So, you can only use is for identity tests, never equality tests.

The basic concept, we have to be clear, while approaching this question, is to understand the difference between is and ==.
"is" is will compare the memory location. if id(a)==id(b), then a is b returns true else it returns false.
So, we can say that is is used for comparing memory locations. Whereas,
== is used for equality testing which means that it just compares only the resultant values. The below shown code may acts as an example to the above given theory.
Code
In the case of string literals (strings without getting assigned to variables), the memory address will be same as shown in the picture. so, id(a)==id(b). The remaining of this is self-explanatory.

Related

why same strings in python doesn't have same id? [duplicate]

Two string variables are set to the same value. s1 == s2 always returns True, but s1 is s2 sometimes returns False.
If I open my Python interpreter and do the same is comparison, it succeeds:
>>> s1 = 'text'
>>> s2 = 'text'
>>> s1 is s2
True
Why is this?
is is identity testing, and == is equality testing. What happens in your code would be emulated in the interpreter like this:
>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False
So, no wonder they're not the same, right?
In other words: a is b is the equivalent of id(a) == id(b)
Other answers here are correct: is is used for identity comparison, while == is used for equality comparison. Since what you care about is equality (the two strings should contain the same characters), in this case the is operator is simply wrong and you should be using == instead.
The reason is works interactively is that (most) string literals are interned by default. From Wikipedia:
Interned strings speed up string
comparisons, which are sometimes a
performance bottleneck in applications
(such as compilers and dynamic
programming language runtimes) that
rely heavily on hash tables with
string keys. Without interning,
checking that two different strings
are equal involves examining every
character of both strings. This is
slow for several reasons: it is
inherently O(n) in the length of the
strings; it typically requires reads
from several regions of memory, which
take time; and the reads fills up the
processor cache, meaning there is less
cache available for other needs. With
interned strings, a simple object
identity test suffices after the
original intern operation; this is
typically implemented as a pointer
equality test, normally just a single
machine instruction with no memory
reference at all.
So, when you have two string literals (words that are literally typed into your program source code, surrounded by quotation marks) in your program that have the same value, the Python compiler will automatically intern the strings, making them both stored at the same memory location. (Note that this doesn't always happen, and the rules for when this happens are quite convoluted, so please don't rely on this behavior in production code!)
Since in your interactive session both strings are actually stored in the same memory location, they have the same identity, so the is operator works as expected. But if you construct a string by some other method (even if that string contains exactly the same characters), then the string may be equal, but it is not the same string -- that is, it has a different identity, because it is stored in a different place in memory.
The is keyword is a test for object identity while == is a value comparison.
If you use is, the result will be true if and only if the object is the same object. However, == will be true any time the values of the object are the same.
One last thing to note is you may use the sys.intern function to ensure that you're getting a reference to the same string:
>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True
As pointed out in previous answers, you should not be using is to determine equality of strings. But this may be helpful to know if you have some kind of weird requirement to use is.
Note that the intern function used to be a built-in on Python 2, but it was moved to the sys module in Python 3.
is is identity testing and == is equality testing. This means is is a way to check whether two things are the same things, or just equivalent.
Say you've got a simple person object. If it is named 'Jack' and is '23' years old, it's equivalent to another 23-year-old Jack, but it's not the same person.
class Person(object):
def __init__(self, name, age):
self.name = name
self.age = age
def __eq__(self, other):
return self.name == other.name and self.age == other.age
jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)
jack1 == jack2 # True
jack1 is jack2 # False
They're the same age, but they're not the same instance of person. A string might be equivalent to another, but it's not the same object.
This is a side note, but in idiomatic Python, you will often see things like:
if x is None:
# Some clauses
This is safe, because there is guaranteed to be one instance of the Null Object (i.e., None).
If you're not sure what you're doing, use the '=='.
If you have a little more knowledge about it you can use 'is' for known objects like 'None'.
Otherwise, you'll end up wondering why things doesn't work and why this happens:
>>> a = 1
>>> b = 1
>>> b is a
True
>>> a = 6000
>>> b = 6000
>>> b is a
False
I'm not even sure if some things are guaranteed to stay the same between different Python versions/implementations.
From my limited experience with Python, is is used to compare two objects to see if they are the same object as opposed to two different objects with the same value. == is used to determine if the values are identical.
Here is a good example:
>>> s1 = u'public'
>>> s2 = 'public'
>>> s1 is s2
False
>>> s1 == s2
True
s1 is a Unicode string, and s2 is a normal string. They are not the same type, but they are the same value.
I think it has to do with the fact that, when the 'is' comparison evaluates to false, two distinct objects are used. If it evaluates to true, that means internally it's using the same exact object and not creating a new one, possibly because you created them within a fraction of 2 or so seconds and because there isn't a large time gap in between it's optimized and uses the same object.
This is why you should be using the equality operator ==, not is, to compare the value of a string object.
>>> s = 'one'
>>> s2 = 'two'
>>> s is s2
False
>>> s2 = s2.replace('two', 'one')
>>> s2
'one'
>>> s2 is s
False
>>>
In this example, I made s2, which was a different string object previously equal to 'one' but it is not the same object as s, because the interpreter did not use the same object as I did not initially assign it to 'one', if I had it would have made them the same object.
The == operator tests value equivalence. The is operator tests object identity, and Python tests whether the two are really the same object (i.e., live at the same address in memory).
>>> a = 'banana'
>>> b = 'banana'
>>> a is b
True
In this example, Python only created one string object, and both a and b refers to it. The reason is that Python internally caches and reuses some strings as an optimization. There really is just a string 'banana' in memory, shared by a and b. To trigger the normal behavior, you need to use longer strings:
>>> a = 'a longer banana'
>>> b = 'a longer banana'
>>> a == b, a is b
(True, False)
When you create two lists, you get two objects:
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False
In this case we would say that the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object. If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
If a refers to an object and you assign b = a, then both variables refer to the same object:
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
Reference: Think Python 2e by Allen B. Downey
I believe that this is known as "interned" strings. Python does this, so does Java, and so do C and C++ when compiling in optimized modes.
If you use two identical strings, instead of wasting memory by creating two string objects, all interned strings with the same contents point to the same memory.
This results in the Python "is" operator returning True because two strings with the same contents are pointing at the same string object. This will also happen in Java and in C.
This is only useful for memory savings though. You cannot rely on it to test for string equality, because the various interpreters and compilers and JIT engines cannot always do it.
Actually, the is operator checks for identity and == operator checks for equality.
From the language reference:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
So from the above statement we can infer that the strings, which are immutable types, may fail when checked with "is" and may succeed when checked with "is".
The same applies for int and tuple which are also immutable types.
is will compare the memory location. It is used for object-level comparison.
== will compare the variables in the program. It is used for checking at a value level.
is checks for address level equivalence
== checks for value level equivalence
is is identity testing and == is equality testing (see the Python documentation).
In most cases, if a is b, then a == b. But there are exceptions, for example:
>>> nan = float('nan')
>>> nan is nan
True
>>> nan == nan
False
So, you can only use is for identity tests, never equality tests.
The basic concept, we have to be clear, while approaching this question, is to understand the difference between is and ==.
"is" is will compare the memory location. if id(a)==id(b), then a is b returns true else it returns false.
So, we can say that is is used for comparing memory locations. Whereas,
== is used for equality testing which means that it just compares only the resultant values. The below shown code may acts as an example to the above given theory.
Code
In the case of string literals (strings without getting assigned to variables), the memory address will be same as shown in the picture. so, id(a)==id(b). The remaining of this is self-explanatory.

I am getting SyntaxWarning: "is" with a literal. Did you mean "=="? [duplicate]

Two string variables are set to the same value. s1 == s2 always returns True, but s1 is s2 sometimes returns False.
If I open my Python interpreter and do the same is comparison, it succeeds:
>>> s1 = 'text'
>>> s2 = 'text'
>>> s1 is s2
True
Why is this?
is is identity testing, and == is equality testing. What happens in your code would be emulated in the interpreter like this:
>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False
So, no wonder they're not the same, right?
In other words: a is b is the equivalent of id(a) == id(b)
Other answers here are correct: is is used for identity comparison, while == is used for equality comparison. Since what you care about is equality (the two strings should contain the same characters), in this case the is operator is simply wrong and you should be using == instead.
The reason is works interactively is that (most) string literals are interned by default. From Wikipedia:
Interned strings speed up string
comparisons, which are sometimes a
performance bottleneck in applications
(such as compilers and dynamic
programming language runtimes) that
rely heavily on hash tables with
string keys. Without interning,
checking that two different strings
are equal involves examining every
character of both strings. This is
slow for several reasons: it is
inherently O(n) in the length of the
strings; it typically requires reads
from several regions of memory, which
take time; and the reads fills up the
processor cache, meaning there is less
cache available for other needs. With
interned strings, a simple object
identity test suffices after the
original intern operation; this is
typically implemented as a pointer
equality test, normally just a single
machine instruction with no memory
reference at all.
So, when you have two string literals (words that are literally typed into your program source code, surrounded by quotation marks) in your program that have the same value, the Python compiler will automatically intern the strings, making them both stored at the same memory location. (Note that this doesn't always happen, and the rules for when this happens are quite convoluted, so please don't rely on this behavior in production code!)
Since in your interactive session both strings are actually stored in the same memory location, they have the same identity, so the is operator works as expected. But if you construct a string by some other method (even if that string contains exactly the same characters), then the string may be equal, but it is not the same string -- that is, it has a different identity, because it is stored in a different place in memory.
The is keyword is a test for object identity while == is a value comparison.
If you use is, the result will be true if and only if the object is the same object. However, == will be true any time the values of the object are the same.
One last thing to note is you may use the sys.intern function to ensure that you're getting a reference to the same string:
>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True
As pointed out in previous answers, you should not be using is to determine equality of strings. But this may be helpful to know if you have some kind of weird requirement to use is.
Note that the intern function used to be a built-in on Python 2, but it was moved to the sys module in Python 3.
is is identity testing and == is equality testing. This means is is a way to check whether two things are the same things, or just equivalent.
Say you've got a simple person object. If it is named 'Jack' and is '23' years old, it's equivalent to another 23-year-old Jack, but it's not the same person.
class Person(object):
def __init__(self, name, age):
self.name = name
self.age = age
def __eq__(self, other):
return self.name == other.name and self.age == other.age
jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)
jack1 == jack2 # True
jack1 is jack2 # False
They're the same age, but they're not the same instance of person. A string might be equivalent to another, but it's not the same object.
This is a side note, but in idiomatic Python, you will often see things like:
if x is None:
# Some clauses
This is safe, because there is guaranteed to be one instance of the Null Object (i.e., None).
If you're not sure what you're doing, use the '=='.
If you have a little more knowledge about it you can use 'is' for known objects like 'None'.
Otherwise, you'll end up wondering why things doesn't work and why this happens:
>>> a = 1
>>> b = 1
>>> b is a
True
>>> a = 6000
>>> b = 6000
>>> b is a
False
I'm not even sure if some things are guaranteed to stay the same between different Python versions/implementations.
From my limited experience with Python, is is used to compare two objects to see if they are the same object as opposed to two different objects with the same value. == is used to determine if the values are identical.
Here is a good example:
>>> s1 = u'public'
>>> s2 = 'public'
>>> s1 is s2
False
>>> s1 == s2
True
s1 is a Unicode string, and s2 is a normal string. They are not the same type, but they are the same value.
I think it has to do with the fact that, when the 'is' comparison evaluates to false, two distinct objects are used. If it evaluates to true, that means internally it's using the same exact object and not creating a new one, possibly because you created them within a fraction of 2 or so seconds and because there isn't a large time gap in between it's optimized and uses the same object.
This is why you should be using the equality operator ==, not is, to compare the value of a string object.
>>> s = 'one'
>>> s2 = 'two'
>>> s is s2
False
>>> s2 = s2.replace('two', 'one')
>>> s2
'one'
>>> s2 is s
False
>>>
In this example, I made s2, which was a different string object previously equal to 'one' but it is not the same object as s, because the interpreter did not use the same object as I did not initially assign it to 'one', if I had it would have made them the same object.
The == operator tests value equivalence. The is operator tests object identity, and Python tests whether the two are really the same object (i.e., live at the same address in memory).
>>> a = 'banana'
>>> b = 'banana'
>>> a is b
True
In this example, Python only created one string object, and both a and b refers to it. The reason is that Python internally caches and reuses some strings as an optimization. There really is just a string 'banana' in memory, shared by a and b. To trigger the normal behavior, you need to use longer strings:
>>> a = 'a longer banana'
>>> b = 'a longer banana'
>>> a == b, a is b
(True, False)
When you create two lists, you get two objects:
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False
In this case we would say that the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object. If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
If a refers to an object and you assign b = a, then both variables refer to the same object:
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
Reference: Think Python 2e by Allen B. Downey
I believe that this is known as "interned" strings. Python does this, so does Java, and so do C and C++ when compiling in optimized modes.
If you use two identical strings, instead of wasting memory by creating two string objects, all interned strings with the same contents point to the same memory.
This results in the Python "is" operator returning True because two strings with the same contents are pointing at the same string object. This will also happen in Java and in C.
This is only useful for memory savings though. You cannot rely on it to test for string equality, because the various interpreters and compilers and JIT engines cannot always do it.
Actually, the is operator checks for identity and == operator checks for equality.
From the language reference:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
So from the above statement we can infer that the strings, which are immutable types, may fail when checked with "is" and may succeed when checked with "is".
The same applies for int and tuple which are also immutable types.
is will compare the memory location. It is used for object-level comparison.
== will compare the variables in the program. It is used for checking at a value level.
is checks for address level equivalence
== checks for value level equivalence
is is identity testing and == is equality testing (see the Python documentation).
In most cases, if a is b, then a == b. But there are exceptions, for example:
>>> nan = float('nan')
>>> nan is nan
True
>>> nan == nan
False
So, you can only use is for identity tests, never equality tests.
The basic concept, we have to be clear, while approaching this question, is to understand the difference between is and ==.
"is" is will compare the memory location. if id(a)==id(b), then a is b returns true else it returns false.
So, we can say that is is used for comparing memory locations. Whereas,
== is used for equality testing which means that it just compares only the resultant values. The below shown code may acts as an example to the above given theory.
Code
In the case of string literals (strings without getting assigned to variables), the memory address will be same as shown in the picture. so, id(a)==id(b). The remaining of this is self-explanatory.

python: compare sliced string to string with "is" equals False [duplicate]

Two string variables are set to the same value. s1 == s2 always returns True, but s1 is s2 sometimes returns False.
If I open my Python interpreter and do the same is comparison, it succeeds:
>>> s1 = 'text'
>>> s2 = 'text'
>>> s1 is s2
True
Why is this?
is is identity testing, and == is equality testing. What happens in your code would be emulated in the interpreter like this:
>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False
So, no wonder they're not the same, right?
In other words: a is b is the equivalent of id(a) == id(b)
Other answers here are correct: is is used for identity comparison, while == is used for equality comparison. Since what you care about is equality (the two strings should contain the same characters), in this case the is operator is simply wrong and you should be using == instead.
The reason is works interactively is that (most) string literals are interned by default. From Wikipedia:
Interned strings speed up string
comparisons, which are sometimes a
performance bottleneck in applications
(such as compilers and dynamic
programming language runtimes) that
rely heavily on hash tables with
string keys. Without interning,
checking that two different strings
are equal involves examining every
character of both strings. This is
slow for several reasons: it is
inherently O(n) in the length of the
strings; it typically requires reads
from several regions of memory, which
take time; and the reads fills up the
processor cache, meaning there is less
cache available for other needs. With
interned strings, a simple object
identity test suffices after the
original intern operation; this is
typically implemented as a pointer
equality test, normally just a single
machine instruction with no memory
reference at all.
So, when you have two string literals (words that are literally typed into your program source code, surrounded by quotation marks) in your program that have the same value, the Python compiler will automatically intern the strings, making them both stored at the same memory location. (Note that this doesn't always happen, and the rules for when this happens are quite convoluted, so please don't rely on this behavior in production code!)
Since in your interactive session both strings are actually stored in the same memory location, they have the same identity, so the is operator works as expected. But if you construct a string by some other method (even if that string contains exactly the same characters), then the string may be equal, but it is not the same string -- that is, it has a different identity, because it is stored in a different place in memory.
The is keyword is a test for object identity while == is a value comparison.
If you use is, the result will be true if and only if the object is the same object. However, == will be true any time the values of the object are the same.
One last thing to note is you may use the sys.intern function to ensure that you're getting a reference to the same string:
>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True
As pointed out in previous answers, you should not be using is to determine equality of strings. But this may be helpful to know if you have some kind of weird requirement to use is.
Note that the intern function used to be a built-in on Python 2, but it was moved to the sys module in Python 3.
is is identity testing and == is equality testing. This means is is a way to check whether two things are the same things, or just equivalent.
Say you've got a simple person object. If it is named 'Jack' and is '23' years old, it's equivalent to another 23-year-old Jack, but it's not the same person.
class Person(object):
def __init__(self, name, age):
self.name = name
self.age = age
def __eq__(self, other):
return self.name == other.name and self.age == other.age
jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)
jack1 == jack2 # True
jack1 is jack2 # False
They're the same age, but they're not the same instance of person. A string might be equivalent to another, but it's not the same object.
This is a side note, but in idiomatic Python, you will often see things like:
if x is None:
# Some clauses
This is safe, because there is guaranteed to be one instance of the Null Object (i.e., None).
If you're not sure what you're doing, use the '=='.
If you have a little more knowledge about it you can use 'is' for known objects like 'None'.
Otherwise, you'll end up wondering why things doesn't work and why this happens:
>>> a = 1
>>> b = 1
>>> b is a
True
>>> a = 6000
>>> b = 6000
>>> b is a
False
I'm not even sure if some things are guaranteed to stay the same between different Python versions/implementations.
From my limited experience with Python, is is used to compare two objects to see if they are the same object as opposed to two different objects with the same value. == is used to determine if the values are identical.
Here is a good example:
>>> s1 = u'public'
>>> s2 = 'public'
>>> s1 is s2
False
>>> s1 == s2
True
s1 is a Unicode string, and s2 is a normal string. They are not the same type, but they are the same value.
I think it has to do with the fact that, when the 'is' comparison evaluates to false, two distinct objects are used. If it evaluates to true, that means internally it's using the same exact object and not creating a new one, possibly because you created them within a fraction of 2 or so seconds and because there isn't a large time gap in between it's optimized and uses the same object.
This is why you should be using the equality operator ==, not is, to compare the value of a string object.
>>> s = 'one'
>>> s2 = 'two'
>>> s is s2
False
>>> s2 = s2.replace('two', 'one')
>>> s2
'one'
>>> s2 is s
False
>>>
In this example, I made s2, which was a different string object previously equal to 'one' but it is not the same object as s, because the interpreter did not use the same object as I did not initially assign it to 'one', if I had it would have made them the same object.
The == operator tests value equivalence. The is operator tests object identity, and Python tests whether the two are really the same object (i.e., live at the same address in memory).
>>> a = 'banana'
>>> b = 'banana'
>>> a is b
True
In this example, Python only created one string object, and both a and b refers to it. The reason is that Python internally caches and reuses some strings as an optimization. There really is just a string 'banana' in memory, shared by a and b. To trigger the normal behavior, you need to use longer strings:
>>> a = 'a longer banana'
>>> b = 'a longer banana'
>>> a == b, a is b
(True, False)
When you create two lists, you get two objects:
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False
In this case we would say that the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object. If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
If a refers to an object and you assign b = a, then both variables refer to the same object:
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
Reference: Think Python 2e by Allen B. Downey
I believe that this is known as "interned" strings. Python does this, so does Java, and so do C and C++ when compiling in optimized modes.
If you use two identical strings, instead of wasting memory by creating two string objects, all interned strings with the same contents point to the same memory.
This results in the Python "is" operator returning True because two strings with the same contents are pointing at the same string object. This will also happen in Java and in C.
This is only useful for memory savings though. You cannot rely on it to test for string equality, because the various interpreters and compilers and JIT engines cannot always do it.
Actually, the is operator checks for identity and == operator checks for equality.
From the language reference:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
So from the above statement we can infer that the strings, which are immutable types, may fail when checked with "is" and may succeed when checked with "is".
The same applies for int and tuple which are also immutable types.
is will compare the memory location. It is used for object-level comparison.
== will compare the variables in the program. It is used for checking at a value level.
is checks for address level equivalence
== checks for value level equivalence
is is identity testing and == is equality testing (see the Python documentation).
In most cases, if a is b, then a == b. But there are exceptions, for example:
>>> nan = float('nan')
>>> nan is nan
True
>>> nan == nan
False
So, you can only use is for identity tests, never equality tests.
The basic concept, we have to be clear, while approaching this question, is to understand the difference between is and ==.
"is" is will compare the memory location. if id(a)==id(b), then a is b returns true else it returns false.
So, we can say that is is used for comparing memory locations. Whereas,
== is used for equality testing which means that it just compares only the resultant values. The below shown code may acts as an example to the above given theory.
Code
In the case of string literals (strings without getting assigned to variables), the memory address will be same as shown in the picture. so, id(a)==id(b). The remaining of this is self-explanatory.

== and is in python [duplicate]

This question already has answers here:
Is there a difference between "==" and "is"?
(13 answers)
Closed 2 years ago.
Its been a couple of days since I started learning python, at which point I stumbled across the == and is. Coming from a java background I assumed == does a comparison by object id and is by value, however doing
>>> a = (1,2)
>>> b = (1,2)
>>> a is b
False
>>> a == b
True
Seems like is is equivalent of java's == and python's == is equivalent to java's equals(). Is this the right way to think about the difference between is and ==? Or is there a caveat?
'==' checks for equality,
'is' checks for identity
See also
Why does comparing strings in Python using either '==' or 'is' sometimes produce a different result?
is checks that both operands are the same object. == calls __eq__() on the left operand, passing the right. Normally this method implements equality comparison, but it is possible to write a class that uses it for other purposes (but it never should).
Note that is and == will give the same results for certain objects (string literals, integers between -1 and 256 inclusive) on some implementations, but that does not mean that the operators should be considered substitutable in those situations.
To follow up on #CRUSADER's answer:
== checks the equality of the objects, using the eq method.
is checks the actual memory location of the objects. If they are the same memory location, they test as True
As was mentioned above, the first 2**8 integers are stored in memory locations for speed, so to see whats going on use some other object or integers above 256. For instance:
In [8]: a = 1001
In [9]: b = a # this sets a pointer to a for the variable b
In [10]: a == b
Out[10]: True # of course they are equal
In [11]: a is b
Out[11]: True # and they point to the same memory location
In [12]: id(a)
Out[12]: 14125728
In [13]: id(b)
Out[13]: 14125728
In [14]: b = 1001 #this instantiates a new object in memory
In [15]: a == b
Out[15]: True
In [16]: a is b
Out[16]: False #now the memory locations are different
In [17]: id(a)
Out[17]: 14125728
In [18]: id(b)
Out[18]: 14125824
This is one of those situations where seemingly synonymous concepts might confuse newer programmers, such as I was when I first wrote this answer. You were close with your assumption based on Java, but backwards. The difference between these operators boils down to the matter of object equivalency vs. object identity, but contrary to what you assumed, == compares by value and is compares by object id. From cpython's built-in documentation (as obtained from typing help("is") at my interpreter's prompt, but also available online here):
Identity comparisons
====================
The operators "is" and "is not" test for object identity: "x is y" is
true if and only if x and y are the same object. Object identity
is determined using the "id()" function. "x is not y" yields the
inverse truth value.
To break this down a bit for less experienced programmers (or really anyone that needs a refresher), a rough definition of each concept is given as follows:
object equivalency: two references are equivalent if they have the
same effective value.
object identity: two references are identical if they refer to the same exact object, e.g. same memory location
object equivalency occurs in most of the situations that you might expect, such as if you compare 2 == 2 or [0, None, "Hello world!"] == [0, None, "Hello world!"]. For built-in types, this is usually determined based on the value of the object, but user-defined types can define their own behavior by defining the __eq__ method (though it is still advised to do so in a way that reflects the complete value of the object).
Object identity is something that can lead to equivalence, but is, on the whole, a separate matter entirely. Object identity depends strictly on whether 2 objects (or rather, 2 references) refer to the exact same object in memory, as determined by id(). Some useful notes about identical references: because they refer to the same entity in memory, they will ALWAYS (at least in cpython) have the same value and, unless __eq__ was defined unconventionally, will therefore be equivalent. This even holds if you attempt to change one of the references through an in-place operation, such as list.append() or my_object[0]=6, and care should be taken to test identity and make copies of objects that should be separate (this is one of the main purposes of is: detecting and dealing with aliases). For example:
>>> first_object = [1, 2, 3]
>>> aliased_object = first_object
>>> first_object is aliased_object
True
>>> aliased_object[0]= "this affects first_object"
>>> first_object
['this affects first_object', 2, 3]
>>> copied_object= first_object.copy() #there are other ways to do this, such as slice notation or the copy module, but this is the most simple and direct
>>> first_object is copied_object
False
>>> copied_object[2] = "this DOES NOT affect first_object"
>>> first_object
['this affects first_object', 2, 3]
>>> copied_object
['this affects first_object', 2, "this DOES NOT affect first_object"]
There are a lot of situations that can result in 2 references being aliased, but outside the assignment operator (which always creates a reference to the assigned object, as above), many of them depend on the exact implementation, e.g. not every implementation of Python will intern strings under the same circumstances or preemptively cache (I'm not sure what the proper term is in this case) the same range of integers. My installation of cpython, for instance, seems to have cached -8 on startup when this article seems to imply that's out of the normal range. Thus, even if is seems to work in your dev environment, it's better to be on the same side, avoid inconsistent behavior altogether, and use ==. is should be reserved for situations where you actually want to compare identity.

Python: Why operator "is" and "==" are sometimes interchangeable for strings? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
String comparison in Python: is vs. ==
Python string interning
Why does comparing strings in Python using either '==' or 'is' sometimes produce a different result?
I used accidentally is and == for strings interchangeably, but I discovered is not always the same.
>>> Folder = "locales/"
>>> Folder2 = "locales/"
>>> Folder is Folder2
False
>>> Folder == Folder2
True
>>> File = "file"
>>> File2 = "file"
>>> File is File2
True
>>> File == File2
True
>>>
Why in one case operators are interchangeable and in the other not ?
Short strings are interned for efficiency, so will refer to the same object therefore is will be true.
This is an implementation detail in CPython, and is absolutely not to be relied on.
This question sheds more light on it: String comparison in Python: is vs. ==
The short answer is: == tests for equal value where is tests for equal identity (via object reference).
The fact that 2 strings with equal value have equal identity suggests the python interpreter is optimizing, as Daniel Roseman confirms :)
The == operator calls the internal __cmp__() method of the first object which compares it to the second one. This is true to all Python objects including strings. The is operator compares the objects identity:
Every object has an identity, a type and a value. An object’s identity never changes once it has been created; you may think of it as the object’s address in memory. The ‘is‘ operator compares the identity of two objects; the id() function returns an integer representing its identity (currently implemented as its address).
E.g.:
s1 = 'abc'
id(s1)
>>> 140058541968080
s2 = 'abc'
id(s2)
>>> 140058541968080
# The values the same due to SPECIFIC CPython behvaiour which detects
# that the value exists in interpreter's memory thus there is no need
# to store it twice.
s1 is s2
>>> True
s2 = 'cde'
id(s2)
>>> 140058541968040
s1 is s2
>>> False

Categories

Resources