I'm aware that in python every identifier or variable name is a reference to the actual object.
a = "hello"
b = "hello"
When I compare the two strings
a == b
the output is
True
If I write an equivalent code in Java,the output would be false because the comparison is between references(which are different) but not the actual objects.
So what i see here is that the references(variable names) are replaced by actual objects by the interpreter at run time.
So,is is safe for me to assume that "Every time the interpreter sees an already assigned variable name,it replaces it with the object it is referring to" ? I googled it but couldn't find any appropriate answer I was looking for.
If you actually ran that in Java, I think you'd find it probably prints out true because of string interning, but that's somewhat irrelevant.
I'm not sure what you mean by "replaces it with the object it is referring to". What actually happens is that when you write a == b, Python calls a.__eq__(b), which is just like any other method call on a with b as an argument.
If you want an equivalent to Java-like ==, use the is operator: a is b. That compares whether the name a refers to the same object as b, regardless of whether they compare as equal.
Python interning:
>>> a = "hello"
>>> b = "hello"
>>> c = "world"
>>> id(a)
4299882336
>>> id(b)
4299882336
>>> id(c)
4299882384
Short strings tend to get interned automatically, explaining why a is b == True. See here for more.
To show that equal strings don't always have the same id
>>> a = "hello"+" world"
>>> b = "hello world"
>>> c = a
>>> a == b
True
>>> a is b
False
>>> b is c
False
>>> a is c
True
also:
>>> str([]) == str("[]")
True
>>> str([]) is str("[]")
False
Related
How variables work in python?
I tried understand it by assigning a value to a variable(a) and checking memory address of it.But,when I changed the value of that variable(a),I got another memory address.what is the reason for that? and that memory address is in the stack area of the memory? and the scope of it?,when I call del a,only variable identifier('a') was deleted.but,It is still on the memory.After,I call id(3),then,that memory address in the code section of the memory?and how python variables stored in memory?,anyone can explain more?
Code:
#!/usr/bin/python3
import _ctypes
a=45
s=id(a)
print(s)
a=a+2
d=id(a)
print(d)
print(_ctypes.PyObj_FromPtr(s))
del a
print(_ctypes.PyObj_FromPtr(d))
print(id(3))
Output:
10915904
10915968
45
47
10914560
What you're seeing is an optimization detail of CPython (the most common Python implementation, and the one you get if you download the language from python.org).
Since small integers are used so frequently, CPython always stores the numbers -5 through 256 in memory, and uses those stored integer objects whenever those numbers come up. This means that all instances of, say, 5 will have the same memory address.
>>> a = 5
>>> b = 5
>>> id(a) == id(b)
True
>>> c = 4
>>> id(a) == id(c)
False
>>> c += 1
>>> id(a) == id(c)
True
This won't be true for other integers or non-integer values, which are only created when needed:
>>> a = 300
>>> b = 300
>>> id(a) == id(b)
False
If I write the following in python, I get a syntax error, why so?
a = 1
b = (a+=1)
I am using python version 2.7
what I get when I run it, the following:
>>> a = 1
>>> b = (a +=1)
File "<stdin>", line 1
b = (a +=1)
^
SyntaxError: invalid syntax
>>>
Unlike in some other languages, assignment (including augmented assignment, like +=) in Python is not an expression. This also affects things like this:
(a=1) > 2
which is legal in C, and several other languages.
The reason generally given for this is because it helps to prevent a class of bugs like this:
if a = 1: # instead of ==
pass
else:
pass
since assignment isn't an expression, this is a SyntaxError in Python. In the equivalent C code, it is a subtle bug where the variable will be modified rather than checked, the check will always be true (in C, like in Python, a non-zero integer is always truthy), and the else block can never fire.
You can still do chained assignment in Python, so this works:
>>> a = 1
>>> a = b = a+1
>>> a
2
>>> b
2
a +=1 is a statement in Python and you can't assign a statement to a variable. Though it is a valid syntax in languages like C, PHP, etc but not Python.
b = (a+=1)
An equivalent version will be:
>>> a = 1
>>> a += 1
>>> b = a
As #Ashwini stated, a+=1 is an assigment, not a value. You can't assign it to b, or any variable. What you probably want is:
b = a+1
All the answers provided here are good, I just want to add that you can achieve what you want in a one-line expression, but written in a different manner:
b, a = a+1, a+1
Here you're doing almost the same thing: incrementing a by 1, and assigning the value of a+1 to b - I'm telling 'almost' because here we have two summations instead of one.
Is there a way to assign references in python?
For example, in php i can do this:
$a = 10;
$b = &$a;
$a = 20;
echo $a." ".$b; // 20, 20
how can i do same thing in python?
In python, if you're doing this with non-primitive types, it acts exactly like you want: assigning is done using references. That's why, when you run the following:
>>> a = {'key' : 'value'}
>>> b = a
>>> b['key'] = 'new-value'
>>> print a['key']
you get 'new-value'.
Strictly saying, if you do the following:
>>> a = 5
>>> b = a
>>> print id(a) == id(b)
you'll get True.
But! Because of primitive types are immutable, you cant change the value of variable b itself. You are just able create a new variable with a new value, based on b. For example, if you do the following:
>>> print id(b)
>>> b = b + 1
>>> print id(b)
you'll get two different values.
This means that Python created a new variable, computed its value basing on b's value and then gave this new variable the name b. This concerns all of the immutable types. Connecting two previous examples together:
>>> a = 5
>>> b = a
>>> print id(a)==id(b)
True
>>> b += 1
>>> print id(b)==id(a)
False
So, when you assign in Python, you always assign reference. But some types cannot be changed, so when you do some changes, you actually create a new variable with another reference.
In Python, everything is by default a reference. So when you do something like:
x=[1,2,3]
y=x
x[1]=-1
print y
It prints [1,-1,3].
The reason this does not work when you do
x=1
y=x
x=-1
print y
is that ints are immutable. They cannot be changed. Think about it, does a number really ever change? When you assign a new value to x, you are assigning a new value - not changing the old one. So y still points to the old one. Other immutable types (e.g. strings and tuples) behave in the same way.
... the is keyword that can be used for equality in strings.
>>> s = 'str'
>>> s is 'str'
True
>>> s is 'st'
False
I tried both __is__() and __eq__() but they didn't work.
>>> class MyString:
... def __init__(self):
... self.s = 'string'
... def __is__(self, s):
... return self.s == s
...
>>>
>>>
>>> m = MyString()
>>> m is 'ss'
False
>>> m is 'string' # <--- Expected to work
False
>>>
>>> class MyString:
... def __init__(self):
... self.s = 'string'
... def __eq__(self, s):
... return self.s == s
...
>>>
>>> m = MyString()
>>> m is 'ss'
False
>>> m is 'string' # <--- Expected to work, but again failed
False
>>>
Testing strings with is only works when the strings are interned. Unless you really know what you're doing and explicitly interned the strings you should never use is on strings.
is tests for identity, not equality. That means Python simply compares the memory address a object resides in. is basically answers the question "Do I have two names for the same object?" - overloading that would make no sense.
For example, ("a" * 100) is ("a" * 100) is False. Usually Python writes each string into a different memory location, interning mostly happens for string literals.
The is operator is equivalent to comparing id(x) values. For example:
>>> s1 = 'str'
>>> s2 = 'str'
>>> s1 is s2
True
>>> id(s1)
4564468760
>>> id(s2)
4564468760
>>> id(s1) == id(s2) # equivalent to `s1 is s2`
True
id is currently implemented to use pointers as the comparison. So you can't overload is itself, and AFAIK you can't overload id either.
So, you can't. Unusual in python, but there it is.
The Python is keyword tests object identity. You should NOT use it to test for string equality. It may seem to work frequently because Python implementations, like those of many very high level languages, performs "interning" of strings. That is to say that string literals and values are internally kept in a hashed list and those which are identical are rendered as references to the same object. (This is possible because Python strings are immutable).
However, as with any implementation detail, you should not rely on this. If you want to test for equality use the == operator. If you truly want to test for object identity then use is --- and I'd be hard-pressed to come up with a case where you should care about string object identity. Unfortunately you can't count on whether two strings are somehow "intentionally" identical object references because of the aforementioned interning.
The is keyword compares objects (or, rather, compares if two references are to the same object).
Which is, I think, why there's no mechanism to provide your own implementation.
It happens to work sometimes on strings because Python stores strings 'cleverly', such that when you create two identical strings they are stored in one object.
>>> a = "string"
>>> b = "string"
>>> a is b
True
>>> c = "str"+"ing"
>>> a is c
True
You can hopefully see the reference vs data comparison in a simple 'copy' example:
>>> a = {"a":1}
>>> b = a
>>> c = a.copy()
>>> a is b
True
>>> a is c
False
If you are not afraid of messing up with bytecode, you can intercept and patch COMPARE_OP with 8 ("is") argument to call your hook function on objects being compared. Look at dis module documentation for start-in.
And don't forget to intercept __builtin__.id() too if someone will do id(a) == id(b) instead of a is b.
'is' compares object identity whereas == compares values.
Example:
a=[1,2]
b=[1,2]
#a==b returns True
#a is b returns False
p=q=[1,2]
#p==q returns True
#p is q returns True
is fails to compare a string variable to string value and two string variables when the string starts with '-'. My Python version is 2.6.6
>>> s = '-hi'
>>> s is '-hi'
False
>>> s = '-hi'
>>> k = '-hi'
>>> s is k
False
>>> '-hi' is '-hi'
True
You can't overload the is operator. What you want to overload is the == operator. This can be done by defining a __eq__ method in the class.
You are using identity comparison. == is probably what you want. The exception to this is when you want to be checking if one item and another are the EXACT same object and in the same memory position. In your examples, the item's aren't the same, since one is of a different type (my_string) than the other (string). Also, there's no such thing as someclass.__is__ in python (unless, of course, you put it there yourself). If there was, comparing objects with is wouldn't be reliable to simply compare the memory locations.
When I first encountered the is keyword, it confused me as well. I would have thought that is and == were no different. They produced the same output from the interpreter on many objects. This type of assumption is actually EXACTLY what is... is for. It's the python equivalent "Hey, don't mistake these two objects. they're different.", which is essentially what [whoever it was that straightened me out] said. Worded much differently, but one point == the other point.
the
for some helpful examples and some text to help with the sometimes confusing differences
visit a document from python.org's mail host written by "Danny Yoo"
or, if that's offline, use the unlisted pastebin I made of it's body.
in case they, in some 20 or so blue moons (blue moons are a real event), are both down, I'll quote the code examples
###
>>> my_name = "danny"
>>> your_name = "ian"
>>> my_name == your_name
0 #or False
###
###
>>> my_name[1:3] == your_name[1:3]
1 #or True
###
###
>>> my_name[1:3] is your_name[1:3]
0
###
Assertion Errors can easily arise with is keyword while comparing objects. For example, objects a and b might hold same value and share same memory address. Therefore, doing an
>>> a == b
is going to evaluate to
True
But if
>>> a is b
evaluates to
False
you should probably check
>>> type(a)
and
>>> type(b)
These might be different and a reason for failure.
Because string interning, this could look strange:
a = 'hello'
'hello' is a #True
b= 'hel-lo'
'hel-lo' is b #False
How does the is operator determine if two objects are the same? How does it work? I can't find it documented.
From the documentation:
Every object has an identity, a type
and a value. An object’s identity
never changes once it has been
created; you may think of it as the
object’s address in memory. The ‘is‘
operator compares the identity of two
objects; the id() function returns an
integer representing its identity
(currently implemented as its
address).
This would seem to indicate that it compares the memory addresses of the arguments, though the fact that it says "you may think of it as the object's address in memory" might indicate that the particular implementation is not guranteed; only the semantics are.
Comparison Operators
Is works by comparing the object referenced to see if the operands point to the same object.
>>> a = [1, 2]
>>> b = a
>>> a is b
True
>>> c = [1, 2]
>>> a is c
False
c is not the same list as a therefore the is relation is false.
To add to the other answers, you can think of a is b working as if it was is_(a, b):
def is_(a, b):
return id(a) == id(b)
Note that you cannot directly replace a is b with id(a) == id(b), but the above function avoids that through parameters.