Python copy-on-write behavior - python

I'm working on a problem where I'm instantiating many instances of an object. Most of the time the instantiated objects are identical. To reduce memory overhead, I'd like to have all the identical objects point to the same address. When I modify the object, though, I'd like a new instance to be created--essentially copy-on-write behavior. What is the best way to achieve this in Python?
The Flyweight Pattern comes close. An example (from http://codesnipers.com/?q=python-flyweights):
import weakref
class Card(object):
    _CardPool = weakref.WeakValueDictionary()
    def __new__(cls, value, suit):
        obj = Card._CardPool.get(value + suit, None)
        if not obj:
            obj = object.__new__(cls)
            Card._CardPool[value + suit] = obj
            obj.value, obj.suit = value, suit
        return obj
This behaves as follows:
>>> c1 = Card('10', 'd')
>>> c2 = Card('10', 'd')
>>> id(c1) == id(c2)
True
>>> c2.suit = 's'
>>> c1.suit
's'
>>> id(c1) == id(c2)
True
The desired behavior would be:
>>> c1 = Card('10', 'd')
>>> c2 = Card('10', 'd')
>>> id(c1) == id(c2)
True
>>> c2.suit = 's'
>>> c1.suit
'd'
>>> id(c1) == id(c2)
False
Update: I came across the Flyweight Pattern and it seemed to almost fit the bill. However, I'm open to other approaches.

Do you need id(c1)==id(c2) to be identical, or is that just a demonstration, where the real objective is avoiding creating duplicated objects?
One approach would be to have each object be distinct, but hold an internal reference to the 'real' object like you have above. Then, on any __setattr__ call, change the internal reference.
I've never done __setattr__ stuff before, but I think it would look like this:
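class MyObj(object):
    def __init__(self, value, suit):
        # Bypass our own __setattr__ when storing the shared backing Card.
        object.__setattr__(self, '_internal', Card(value, suit))
    def __setattr__(self, name, new_value):
        # Never mutate the shared Card; switch to the pooled Card for the new state.
        current = self._internal
        if name == 'suit':
            backing = Card(current.value, new_value)
        elif name == 'value':
            backing = Card(new_value, current.suit)
        else:
            raise AttributeError(name)
        object.__setattr__(self, '_internal', backing)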
And similarly, expose the attributes for reading through __getattr__.
You'd still have lots of duplicated objects, but only one copy of the 'real' backing object behind them. So this would help if each object is massive, and wouldn't help if they are lightweight, but you have millions of them.
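Here is a rough, self-contained sketch of that wrapper idea, in case it helps. The name CardProxy and keying the pool by a (value, suit) tuple are my own assumptions, not something from the question; treat it as one possible shape rather than the definitive implementation.

```python
import weakref


class Card(object):
    """Pooled backing object, modelled on the flyweight example in the question."""
    _pool = weakref.WeakValueDictionary()

    def __new__(cls, value, suit):
        obj = cls._pool.get((value, suit))
        if obj is None:
            obj = object.__new__(cls)
            obj.value, obj.suit = value, suit
            cls._pool[(value, suit)] = obj
        return obj


class CardProxy(object):
    """Hypothetical wrapper: one small proxy per holder, sharing a pooled Card."""
    __slots__ = ('_internal',)

    def __init__(self, value, suit):
        object.__setattr__(self, '_internal', Card(value, suit))

    def __getattr__(self, name):
        # Only called when normal lookup fails, so reads fall through to the Card.
        return getattr(object.__getattribute__(self, '_internal'), name)

    def __setattr__(self, name, new_value):
        current = self._internal
        state = {'value': current.value, 'suit': current.suit}
        if name not in state:
            raise AttributeError(name)
        state[name] = new_value
        # Rebind to the pooled Card for the new state; the old Card is untouched.
        object.__setattr__(self, '_internal', Card(**state))


c1 = CardProxy('10', 'd')
c2 = CardProxy('10', 'd')
shared_before = c1._internal is c2._internal  # both proxies share one Card
c2.suit = 's'                                 # only c2 is redirected
```

After the assignment, c1.suit is still 'd' while c2.suit is 's'. Note that id(c1) == id(c2) is never true here: the proxies are always distinct, only the backing Card is shared.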

Impossible.
id(c1) == id(c2)
says that c1 and c2 are references to the exact same object. So
c2.suit = 's' is exactly the same as saying c1.suit = 's'.
Python has no way of distinguishing the two (unless you allow introspection of prior call frames, which leads to a dirty hack.)
Since the two assignments are identical, there is no way for Python to know that c2.suit = 's' should cause the name c2 to reference a different object.
To give you an idea of what the dirty hack would look like,
import traceback
import re
import sys
import weakref
class Card(object):
    _CardPool = weakref.WeakValueDictionary()
    def __new__(cls, value, suit):
        obj = Card._CardPool.get(value + suit, None)
        if not obj:
            obj = object.__new__(cls)
            Card._CardPool[value + suit] = obj
            obj._value, obj._suit = value, suit
        return obj
    @property
    def suit(self):
        return self._suit
    @suit.setter
    def suit(self, suit):
        filename,line_number,function_name,text=traceback.extract_stack()[-2]
        name = text[:text.find('.suit')]
        setattr(sys.modules['__main__'], name, Card(self._value, suit))
c1 = Card('10', 'd')
c2 = Card('10', 'd')
assert id(c1) == id(c2)
c2.suit = 's'
print(c1.suit)
# 'd'
assert id(c1) != id(c2)
This use of traceback only works with implementations of Python that use frames, such as CPython, but not Jython or IronPython.
Another problem is that
name = text[:text.find('.suit')]
is extremely fragile, and would screw up, for example, if the assignment were to look like
if True: c2.suit = 's'
or
c2.suit = (
's')
or
setattr(c2, 'suit', 's')
Yet another problem is that it assumes the name c2 is global. It could just as easily be a local variable (say, inside a function), or an attribute (obj.c2.suit = 's').
I do not know a way to address all the ways the assignment could be made.
In any of these cases, the dirty hack would fail.
Conclusion: Don't use it. :)

This is impossible in your current form. A name (c1 and c2 in your example) is a reference, and you can not simply change the reference by using __setattr__, not to mention all other references to the same object.
The only way this would be possible is something like this:
c1 = c1.changesuit("s")
Where c1.changesuit returns a reference to the (newly created) object. But this only works if each object is referenced by only one name. Alternatively you might be able to do some magic with locals() and stuff like that, but please - don't.
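For what it's worth, here is a minimal sketch of that c1 = c1.changesuit("s") style. It assumes the pooled Card from the question, made read-only through properties and keyed by a (value, suit) tuple; those details are my assumptions, not part of the original code.

```python
import weakref


class Card(object):
    """Pooled, read-only card; 'changing' it hands back another pooled card."""
    _pool = weakref.WeakValueDictionary()

    def __new__(cls, value, suit):
        obj = cls._pool.get((value, suit))
        if obj is None:
            obj = object.__new__(cls)
            obj._value, obj._suit = value, suit
            cls._pool[(value, suit)] = obj
        return obj

    @property
    def value(self):
        return self._value

    @property
    def suit(self):
        return self._suit

    def changesuit(self, suit):
        # Return the pooled Card for the new state instead of mutating self.
        return Card(self._value, suit)


c1 = Card('10', 'd')
c2 = Card('10', 'd')
same_before = c1 is c2
c2 = c2.changesuit('s')  # rebinding the name is what does the "copy"
```

Because the properties have no setters, c1.suit = 's' raises AttributeError, so nobody can accidentally modify the shared object.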

Related

Difference between pickling an object piecewise vs all at once?

Given an arbitrary pythonic object like this:
class ExampleObj(object):
    def __init__(self):
        self.a = 'a'
        self.b = 'b'
        self.c = 'c'
obj = ExampleObj()
Is there any functional difference between these two serialization approaches?
Piecewise Pickling
import pickle
base = type(obj)
name = obj.__class__.__name__
pickled_data = {}
for key,val in obj.__dict__.items():
    pickled_data[key] = pickle.dumps(val)
vars = {k : pickle.loads(v) for k,v in pickled_data.items()}
restored = type(name, (base,), vars)
Standard Pickling
restored = pickle.loads( pickle.dumps(obj) )
I can't envision any, but I'm worried there may be some edge case I'm not considering.
(In my application, some objects may not have serializable variables. We were hoping to implement piecewise pickling so we better identify what variables are preventing us from pickling the object)
In the first case, you're creating an instance of type, whereas in the second case, you're creating an instance of the type ExampleObj. To see how the two results are functionally different, I'll name restored_1 the result of your first example, and restored_2 the second.
type(restored_1) # type
type(restored_2) # __main__.ExampleObj
Thus, restored_1 and restored_2 will not be functionally equivalent in the sense that you mention you're looking for.
As a simple illustration, add a method or property to ExampleObj and try to use the restored object from either procedure in various ways.
class ExampleObj(object):
    def __init__(self):
        self.a = 'a'
        self.b = 'b'
        self.c = 'c'
    def foo(self):
        print('bar')
    @property
    def baz(self):
        print(self.a + self.b)
obj = ExampleObj()
After executing your first code, which returns an instance of type:
restored_1.foo() # exception raised because restored_1 is not an ExampleObj instance
restored_1.baz # returns the property object itself, e.g. <property at 0x107863138>
restored_1.__dict__ # returns a mappingproxy object
After executing your second code, which returns an instance of ExampleObj:
restored_2.foo() # bar
restored_2.baz # prints ab
restored_2.__dict__ # {'a': 'a', 'b': 'b', 'c': 'c'}
If you're looking for a discussion on approaches to see for which instance attrs pickling failed, see this question: How to tell for which object attribute pickle fails?
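If the goal is only to find out which attributes block pickling, a small diagnostic along these lines may be enough, without rebuilding the object piecewise. This is a sketch: find_unpicklable_attrs is a name I made up, and the threading.Lock attribute is just a convenient example of something pickle rejects.

```python
import pickle
import threading


def find_unpicklable_attrs(obj):
    """Return {attribute name: exception} for instance attributes that fail to pickle."""
    failures = {}
    for key, val in vars(obj).items():
        try:
            pickle.dumps(val)
        except Exception as exc:  # pickle can raise several exception types
            failures[key] = exc
    return failures


class ExampleObj(object):
    def __init__(self):
        self.a = 'a'
        self.lock = threading.Lock()  # locks cannot be pickled


report = find_unpicklable_attrs(ExampleObj())
print(sorted(report))
```

Once the offending attributes are known, the object itself can still be serialized with the standard pickle.dumps(obj) route, for instance after excluding them in __getstate__.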

Python equivalent of C++ member pointer

What would be the equivalent of a C++ member pointer in Python? Basically, I would like to be able to replicate similar behavior in Python:
// Pointer to a member of MyClass
int (MyClass::*ptMember)(int) = &MyClass::member;
// Call member on some instance, e.g. inside a function to
// which the member pointer was passed
(instance.*ptMember)(3);
Follow up question, what if the member is a property instead of a method? Is it possible to store/pass a "pointer" to a property without specifying the instance?
One way would obviously be to pass a string and use eval. But is there a cleaner way?
EDIT: There are now several really good answers, each having something useful to offer depending on the context. I ended up using what is described in my answer, but I think that other answers will be very helpful for whoever comes here based on the topic of the question. So, I am not accepting any single one for now.
Assuming a Python class:
class MyClass:
    def __init__(self):
        self.x = 42
    def fn(self):
        return self.x
The equivalent of a C++ pointer-to-member-function is then this:
fn = MyClass.fn
You can take a method from a class (MyClass.fn above) and it becomes a plain function! The only difference between function and method is that the first parameter is customarily called self! So you can call this using an instance like in C++:
o = MyClass()
print(fn(o)) # prints 42
However, an often more interesting thing is the fact that you can also take the "address" of a bound member function, which doesn't work in C++:
o = MyClass()
bfn = o.fn
print(bfn()) # prints 42, too
Concerning the follow-up about properties, there are plenty of answers here already that address this issue, provided it still is one.
The closest fit would probably be operator.attrgetter:
from operator import attrgetter
foo_member = attrgetter('foo')
bar_member = attrgetter('bar')
baz_member = attrgetter('baz')
class Example(object):
    def __init__(self):
        self.foo = 1
    @property
    def bar(self):
        return 2
    def baz(self):
        return 3
example_object = Example()
print(foo_member(example_object)) # prints 1
print(bar_member(example_object)) # prints 2
print(baz_member(example_object)()) # prints 3
attrgetter goes through the exact same mechanism normal dotted access goes through, so it works for anything at all you'd access with a dot. Instance fields, methods, module members, dynamically computed attributes, whatever. It doesn't matter what the type of the object is, either; for example, attrgetter('count') can retrieve the count attribute of a list, tuple, string, or anything else with a count attribute.
For certain types of attribute, there may be more specific member-pointer-like things. For example, for instance methods, you can retrieve the unbound method:
unbound_baz_method = Example.baz
print(unbound_baz_method(example_object)) # prints 3
This is either the specific function that implements the method, or a very thin wrapper around the function, depending on your Python version. It's type-specific; list.count won't work for tuples, and tuple.count won't work for lists.
For properties, you can retrieve the property object's fget, fset, and fdel, which are the functions that implement getting, setting, and deleting the attribute the property manages:
example_bar_member = Example.bar.fget
print(example_bar_member(example_object)) # prints 2
We didn't implement a setter or deleter for this property, so the fset and fdel are None. These are also type-specific; for example, if example_bar_member handled lists correctly, example_bar_member([]) would raise an AttributeError rather than returning 2, since lists don't have a bar attribute.
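To illustrate the property case with a setter as well, here is a small sketch. The _bar backing field and the setter are assumptions I added for the demonstration; the Example class above has no setter.

```python
class Example(object):
    def __init__(self):
        self._bar = 2

    @property
    def bar(self):
        return self._bar

    @bar.setter
    def bar(self, value):
        self._bar = value


bar_prop = Example.bar   # the property object itself; no instance involved yet
obj = Example()

before = bar_prop.fget(obj)              # same as obj.bar
bar_prop.fset(obj, 10)                   # same as obj.bar = 10
after = bar_prop.__get__(obj, Example)   # the descriptor protocol, spelled out
```

So Example.bar can be stored and passed around much like a member pointer, and applied to whichever instance shows up later.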
I was not satisfied with the string approach and did some testing. This seems to work pretty well and avoids passing strings around:
import types
# Our test class
class Class:
    def __init__(self, val):
        self._val = val
    def method(self):
        return self._val
    @property
    def prop(self):
        return self._val
# Get the member pointer equivalents
m = Class.method
p = Class.prop
# Create an instance
c1 = Class(1)
# Bind the method and property getter to the instance
m1 = types.MethodType(m, c1)
p1 = types.MethodType(p.fget, c1)
# Use
m1() # Returns 1
p1() # Returns 1
# Alternatively, the instance can be passed to the function as self
m(c1) # Returns 1
p.fget(c1) # Returns 1
I'm not a C++ programmer, so maybe I'm missing some detail of method pointers here, but it sounds like you just want a reference to a function that's defined inside a class. (These were of type instancemethod in Python 2, but are just type function in Python 3.)
The syntax will be slightly different --- instead of calling it like a method with object.reference(args), you'll call it like a function: reference(object, args). object will be the argument to the self parameter --- pretty much what the compiler would have done for you.
Despite the more C-like syntax, I think it still does what you wanted... at least when applied to a callable member like in your example. It won't help with a non-callable instance field, though: they don't exist until after __init__ runs.
Here's a demonstration:
#!/usr/bin/env python3
import math
class Vector(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        return
    def __str__(self):
        return '(' + str(self.x) + ', ' + str(self.y) + ')'
    def __repr__(self):
        return self.__class__.__name__ + str(self)
    def magnitude(self):
        return math.sqrt(self.x ** 2 + self.y ** 2)
def print_dict_getter_demo():
    print('Demo of member references on a Python dict:')
    dict_getter = dict.get
    d = {'a': 1, 'b': 2, 'c': 3, 'z': 26}
    print('Dictionary d : ' + str(d))
    print("d.get('a') : " + str(d.get('a')))
    print("Ref to get 'a' : " + str(dict_getter(d, 'a')))
    print("Ref to get 'BOGUS': " + str(dict_getter(d, 'BOGUS')))
    print('Ref to get default: ' + str(dict_getter(d, 'BOGUS', 'not None')))
    return
def print_vector_magnitude_demo():
    print('Demo of member references on a user-defined Vector:')
    vector_magnitude = Vector.magnitude
    v = Vector(3, 4)
    print('Vector v : ' + str(v))
    print('v.magnitude() : ' + str(v.magnitude()))
    print('Ref to magnitude: ' + str(vector_magnitude(v)))
    return
def print_vector_sorting_demo():
    print('Demo of sorting Vectors using a member reference:')
    vector_magnitude = Vector.magnitude
    v0 = Vector(0, 0)
    v1 = Vector(1, 1)
    v5 = Vector(-3, -4)
    v20 = Vector(-12, 16)
    vector_list = [v20, v0, v5, v1]
    print('Unsorted: ' + str(vector_list))
    sorted_vector_list = sorted(vector_list, key=vector_magnitude)
    print('Sorted: ' + str(sorted_vector_list))
    return
def main():
    print_dict_getter_demo()
    print()
    print_vector_magnitude_demo()
    print()
    print_vector_sorting_demo()
    return
if '__main__' == __name__:
    main()
When run with Python 3, this produces:
Demo of member references on a Python dict:
Dictionary d : {'a': 1, 'c': 3, 'b': 2, 'z': 26}
d.get('a') : 1
Ref to get 'a' : 1
Ref to get 'BOGUS': None
Ref to get default: not None
Demo of member references on a user-defined Vector:
Vector v : (3, 4)
v.magnitude() : 5.0
Ref to magnitude: 5.0
Demo of sorting Vectors using a member reference:
Unsorted: [Vector(-12, 16), Vector(0, 0), Vector(-3, -4), Vector(1, 1)]
Sorted: [Vector(0, 0), Vector(1, 1), Vector(-3, -4), Vector(-12, 16)]
As you can see, it works with both builtins and user-defined classes.
Edit:
The huge demo above was based on an assumption: that you had a reference to the class, and that your goal was to "hold on to" one of the class's methods for use on whatever instances of that class showed up sometime later.
If you already have a reference to the instance, it's much simpler:
d = {'a': 1, 'b': 2, 'c': 3, 'z': 26}
d_getter = d.get
d_getter('z') # returns 26
This is basically the same thing as above, only after the transformation from a function into a method has "locked in" the argument to self, so you don't need to supply it.
The way I would approach this in Python is to use __getattribute__. If you have the name of an attribute, which would be the analog of the C++ pointer-to-member, you could call a.__getattribute__(x) to get the attribute whose name is stored in x. It's strings and dicts instead of offsets and pointers, but that's Python.
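In practice the built-in getattr() is the usual way to spell that lookup, rather than calling __getattribute__ directly. A short sketch, where MyClass, fn and the 'nope' name are just placeholders for illustration:

```python
class MyClass(object):
    def __init__(self):
        self.x = 42

    def fn(self, n):
        return self.x + n


member_name = 'fn'       # the string plays the role of the member pointer
instance = MyClass()

member = getattr(instance, member_name)    # bound method, same as instance.fn
result = member(3)
field = getattr(instance, 'x')             # works for plain fields too
missing = getattr(instance, 'nope', None)  # optional default instead of AttributeError
```

Unlike a C++ member pointer, nothing checks the name until the lookup actually runs, so a typo only shows up at call time.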

Method inside a method in Python

I have seen source code where more than one method is called on an object, e.g. x.y().z(). Can someone please explain this to me? Does this mean that z() is inside y(), or what?
This calls the method y() on object x, then the method z() is called on the result of y() and that entire line is the result of method z().
For example
friendsFavePizzaTopping = person.getBestFriend().getFavoritePizzaTopping()
As a result, friendsFavePizzaTopping would be the person's best friend's favorite pizza topping.
Important to note: getBestFriend() must return an object that has the method getFavoritePizzaTopping(). If it does not, an AttributeError will be raised.
Each method is evaluated in turn, left to right. Consider:
>>> s='HELLO'
>>> s.lower()
'hello'
>>> s='HELLO '
>>> s.lower()
'hello '
>>> s.lower().strip()
'hello'
>>> s.lower().strip().upper()
'HELLO'
>>> s.lower().strip().upper().replace('H', 'h')
'hELLO'
The requirement is that the object on the left of each dot must provide the method on the right. Often that means the objects are of similar types, or at least share compatible methods or an understood cast.
As an example, consider this class:
class Foo:
    def __init__(self, name):
        self.name=name
    def m1(self):
        return Foo(self.name+'=>m1')
    def m2(self):
        return Foo(self.name+'=>m2')
    def __repr__(self):
        return '{}: {}'.format(id(self), self.name)
    def m3(self):
        return .25 # return is no longer a Foo
Notice that Foo behaves like an immutable type: each method returns a new object (a new Foo for m1 and m2, or a float for m3) instead of modifying self. Now try those methods:
>>> foo = Foo('init')
>>> foo
4463545376: init
>>> foo.m1()
4463545304: init=>m1
^^^^ different object id
>>> foo
4463545376: init
^^^^ foo still the same because you need to assign it to change
Now assign:
>>> foo=foo.m1().m2()
>>> foo
4464102576: init=>m1=>m2
Now use m3() and it will be a float; not a Foo anymore:
>>> foo=foo.m1().m2().m3()
>>> foo
0.25
Now a float -- can't use foo methods anymore:
>>> foo.m1()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'float' object has no attribute 'm1'
But you can use float methods:
>>> foo.as_integer_ratio()
(1, 4)
In the case of:
x.y().z()
You're almost always looking at immutable objects. Mutable objects don't return anything that would HAVE a function like that (for the most part, but I'm simplifying). For instance...
class x:
    def __init__(self):
        self.y_done = False
        self.z_done = False
    def y(self):
        new_x = x()
        new_x.y_done = True
        return new_x
    def z(self):
        new_x = x()
        new_x.z_done = True
        return new_x
You can see that each of x.y and x.z returns an x object. That object is used to make the consecutive call, e.g. in x.y().z(), x.z is not called on x, but on x.y().
x.y().z() =>
tmp = x.y()
result = tmp.z()
In @dawg's excellent example, he's using strings (which are immutable in Python) whose methods return strings.
string = 'hello'
string.upper() # returns a NEW string with value "HELLO"
string.upper().replace("E","O") # returns a NEW string that's based off "HELLO"
string.upper().replace("E","O") + "W"
# "HOLLOW"
The . "operator" is Python syntax for attribute access. x.y is (nearly) identical to
getattr(x, 'y')
so x.y() is (nearly) identical to
getattr(x, 'y')()
(I say "nearly identical" because it's possible to customize attribute access for a user-defined class. From here on out, I'll assume no such customization is done, and you can assume that x.y is in fact identical to getattr(x, 'y').)
If the thing that x.y() returns has an attribute z such that
foo = getattr(x, 'y')
bar = getattr(foo(), 'z')
is legal, then you can chain the calls together without needing the name foo in the middle:
bar = getattr(getattr(x, 'y')(), 'z')
Converting back to dot notation gives you
bar = getattr(x.y(), 'z')
or simply
bar = x.y().z()
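To make the equivalence concrete, here is a tiny check using a string, so that no custom attribute access is involved. The variable names are only for illustration.

```python
s = 'HELLO '

# Ordinary chained call.
chained = s.lower().strip()

# The same thing spelled out step by step with getattr.
lower_method = getattr(s, 'lower')
intermediate = lower_method()
stepwise = getattr(intermediate, 'strip')()

print(chained == stepwise)
```

Both routes produce the same string; the dotted form simply skips naming the intermediate result.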
x.y().z() means that the x object has the method y() and the object returned by x.y() has the method z(). If you want to call y() on x and then call z() on the result, you write x.y().z(). This is like:
val = x.y()
result = val.z()
Example:
my_dict = {'key':'value'}
my_dict is a dict object. my_dict.get('key') returns 'value', which is a str object, so any str method can be applied to it, like this:
my_dict.get('key').upper()
This will return 'VALUE'.
That is (sometimes a sign of) bad code.
It violates the Law of Demeter. Here is a quote from Wikipedia explaining what is meant:
Each unit should have only limited knowledge about other units: only units "closely" related to the current unit.
Each unit should only talk to its friends; don't talk to strangers.
Only talk to your immediate friends.
Suppose you have a car, which itself has an engine:
class Car:
    def __init__(self):
        self._engine=None
    @property
    def engine(self):
        return self._engine
    @engine.setter
    def engine(self, value):
        self._engine = value
class Porsche_engine:
    def start(self):
        print("starting")
So if you make a new car and set the engine to Porsche you could do the following:
>>> from car import *
>>> c=Car()
>>> e=Porsche_engine()
>>> c.engine=e
>>> c.engine.start()
starting
If you are making this call from an object, that object needs knowledge not only of Car but also of the engine, which is bad design.
Additionally, if you do not know whether a Car has an engine, calling start directly
>>> c=Car()
>>> c.engine.start()
may result in an error:
AttributeError: 'NoneType' object has no attribute 'start'
Edit:
To avoid (further) misunderstandings and misreadings of what I am saying:
There are two usages:
1) As I pointed out, an object calling methods on other objects returned from a third object is a violation of the LoD. This is one way to read the question.
2) An exception to that is method chaining, which is not bad design.
A better design would be for the Car itself to have a start() method that delegates to the engine.
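A minimal sketch of that delegation might look like this. The class names, the constructor argument and the RuntimeError are my own choices for the example, not part of the code above.

```python
class PorscheEngine(object):
    def start(self):
        return 'starting'


class Car(object):
    def __init__(self, engine=None):
        self._engine = engine

    def start(self):
        # Callers talk only to the Car; the Car talks to its own engine.
        if self._engine is None:
            raise RuntimeError('this car has no engine')
        return self._engine.start()


car = Car(PorscheEngine())
status = car.start()
```

The caller no longer needs to know that an engine exists, and the missing-engine case is handled in one place instead of surfacing as an AttributeError on None.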

Inspect python class attributes

I need a way to inspect a class so I can safely identify which attributes are user-defined class attributes. The problem is that functions like dir(), inspect.getmembers() and friends return all class attributes, including the pre-defined ones like __class__, __doc__, __dict__, __hash__. This is of course understandable, and one could argue that I could just make a list of named members to ignore, but unfortunately these pre-defined attributes are bound to change with different versions of Python, making my project vulnerable to changes in the Python project, and I don't like that.
example:
>>> class A:
... a=10
... b=20
... def __init__(self):
... self.c=30
>>> dir(A)
['__doc__', '__init__', '__module__', 'a', 'b']
>>> get_user_attributes(A)
['a','b']
In the example above I want a safe way to retrieve only the user-defined class attributes ['a','b'] not 'c' as it is an instance attribute. So my question is... Can anyone help me with the above fictive function get_user_attributes(cls)?
I have spent some time trying to solve the problem by parsing the class in AST level which would be very easy. But I can't find a way to convert already parsed objects to an AST node tree. I guess all AST info is discarded once a class has been compiled into bytecode.
Below is the hard way. Here's the easy way. Don't know why it didn't occur to me sooner.
import inspect
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    return [item
            for item in inspect.getmembers(cls)
            if item[0] not in boring]
Here's a start
import inspect
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    attrs = {}
    bases = reversed(inspect.getmro(cls))
    for base in bases:
        if hasattr(base, '__dict__'):
            attrs.update(base.__dict__)
        elif hasattr(base, '__slots__'):
            if hasattr(base, base.__slots__[0]):
                # We're dealing with a non-string sequence or one char string
                for item in base.__slots__:
                    attrs[item] = getattr(base, item)
            else:
                # We're dealing with a single identifier as a string
                attrs[base.__slots__] = getattr(base, base.__slots__)
    for key in boring:
        attrs.pop(key, None) # guarded: a class using __slots__ has no '__dict__' or '__weakref__' entry
    return attrs
This should be fairly robust. Essentially, it works by getting the attributes that are on a default subclass of object to ignore. It then gets the mro of the class that's passed to it and traverses it in reverse order so that subclass keys can overwrite superclass keys. It returns a dictionary of key-value pairs. If you want a list of key, value tuples like in inspect.getmembers then just return either attrs.items() or list(attrs.items()) in Python 3.
If you don't actually want to traverse the mro and just want attributes defined directly on the subclass then it's easier:
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    attrs = {}
    if hasattr(cls, '__dict__'):
        attrs = dict(cls.__dict__)
    elif hasattr(cls, '__slots__'):
        if hasattr(cls, cls.__slots__[0]):
            # We're dealing with a non-string sequence or one char string
            for item in cls.__slots__:
                attrs[item] = getattr(cls, item)
        else:
            # We're dealing with a single identifier as a string
            attrs[cls.__slots__] = getattr(cls, cls.__slots__)
    for key in boring:
        attrs.pop(key, None) # guarded: inherited names are mostly absent from cls.__dict__
    return attrs
Double underscores on both ends of 'special attributes' have been a part of python before 2.0. It would be very unlikely that they would change that any time in the near future.
class Foo(object):
    a = 1
    b = 2
def get_attrs(klass):
    return [k for k in klass.__dict__.keys()
            if not k.startswith('__')
            and not k.endswith('__')]
print(get_attrs(Foo))
['a', 'b']
Thanks aaronasterling, you gave me the expression I needed :-)
My final class attribute inspector function looks like this:
def get_user_attributes(cls,exclude_methods=True):
    base_attrs = dir(type('dummy', (object,), {}))
    this_cls_attrs = dir(cls)
    res = []
    for attr in this_cls_attrs:
        if base_attrs.count(attr) or (callable(getattr(cls,attr)) and exclude_methods):
            continue
        res += [attr]
    return res
It either returns the class attribute variables only (exclude_methods=True) or also retrieves the methods.
My initial tests of the above function show that it supports both old and new-style Python classes.
/ Jakob
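For comparison, here is a sketch that looks only at names defined directly in the class body, using vars(cls) and the double-underscore convention instead of a dummy class. It is one possible variant under those assumptions and, unlike the dir()-based function above, it ignores inherited attributes.

```python
def get_user_attributes(cls, exclude_methods=True):
    """List names defined directly in the class body, skipping dunder names."""
    names = []
    for name, value in vars(cls).items():
        if name.startswith('__') and name.endswith('__'):
            continue
        if exclude_methods and callable(value):
            continue
        names.append(name)
    return sorted(names)


class A(object):
    a = 10
    b = 20

    def __init__(self):
        self.c = 30

    def method(self):
        return self.a


print(get_user_attributes(A))
print(get_user_attributes(A, exclude_methods=False))
```

Instance attributes such as c never show up, because they live on instances rather than in the class namespace.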
If you use new style classes, could you simply subtract the attributes of the parent class?
class A(object):
    a = 10
    b = 20
    #...
def get_attrs(Foo):
    return [k for k in dir(Foo) if k not in dir(super(Foo))]
Edit: Not quite. __dict__,__module__ and __weakref__ appear when inheriting from object, but aren't there in object itself. You could special case these--I doubt they'd change very often.
Sorry for necro-bumping the thread. I'm surprised that there's still no simple function (or a library) to handle such common usage as of 2019.
I'd like to thank aaronasterling for the idea. Actually, set container provides a more straightforward way to express it:
class dummy: pass
def abridged_set_of_user_attributes(obj):
    return set(dir(obj))-set(dir(dummy))
def abridged_list_of_user_attributes(obj):
    return list(abridged_set_of_user_attributes(obj))
The original solution's list comprehension is really two levels of looping, because it combines two in keywords: the for ... in iteration and the not in membership test, which itself scans a list. Having only one for keyword makes it look like less work than it is.
This worked for me to include user-defined attributes with __ that might be found in cls.__dict__
import inspect
class A:
    __a = True
    def __init__(self, _a, b, c):
        self._a = _a
        self.b = b
        self.c = c
    def test(self):
        return False
cls = A(1, 2, 3)
members = inspect.getmembers(cls, predicate=lambda x: not inspect.ismethod(x))
attrs = set(dict(members).keys()).intersection(set(cls.__dict__.keys()))
__attrs = {m[0] for m in members if m[0].startswith(f'_{cls.__class__.__name__}')}
attrs.update(__attrs)
This will correctly yield: {'_A__a', '_a', 'b', 'c'}
You can update to clean the cls.__class__.__name__ if you wish

How do I get the string representation of a variable in python?

I have a variable x in Python. How can I find the string 'x' from the variable? Here is my attempt:
def var(v,c):
    for key in c.keys():
        if c[key] == v:
            return key
def f():
    x = '321'
    print('Local var %s = %s'%(var(x,locals()),x))
x = '123'
print('Global var %s = %s'%(var(x,locals()),x))
f()
The results are:
Global var x = 123
Local var x = 321
The above recipe seems a bit un-pythonesque. Is there a better/shorter way to achieve the same result?
Q: I have a variable x in python. How can i find the string 'x' from the variable.
A: If I am understanding your question properly, you want to go from the value of a variable to its name. This is not really possible in Python.
In Python, there really isn't any such thing as a "variable". What Python really has are "names" which can have objects bound to them. It makes no difference to the object what names, if any, it might be bound to. It might be bound to dozens of different names, or none.
Consider this example:
foo = 1
bar = foo
baz = foo
Now, suppose you have the integer object with value 1, and you want to work backwards and find its name. What would you print? Three different names have that object bound to them, and all are equally valid.
print(bar is foo) # prints True
print(baz is foo) # prints True
In Python, a name is a way to access an object, so there is no way to work with names directly. You might be able to search through locals() to find the value and recover a name, but that is at best a parlor trick. And in my above example, which of foo, bar, and baz is the "correct" answer? They all refer to exactly the same object.
P.S. The above is a somewhat edited version of an answer I wrote before. I think I did a better job of wording things this time.
I believe the general form of what you want is repr() or the __repr__() method of an object.
With regards to __repr__():
Called by the repr() built-in function and by string conversions (reverse quotes) to compute the “official” string representation of an object.
See the docs here: object.__repr__(self)
stevenha has a great answer to this question. But, if you actually do want to poke around in the namespace dictionaries anyway, you can get all the names for a given value in a particular scope / namespace like this:
def foo1():
    x = 5
    y = 4
    z = x
    print(names_of1(x, locals()))
def names_of1(var, callers_namespace):
    return [name for (name, value) in callers_namespace.items() if var is value]
foo1() # prints ['x', 'z']
If you're working with a Python that has stack frame support (most do, CPython does), it isn't required that you pass the locals dict into the names_of function; the function can retrieve that dictionary from its caller's frame itself:
def foo2():
    xx = object()
    yy = object()
    zz = xx
    print(names_of2(xx))
def names_of2(var):
    import inspect
    callers_namespace = inspect.currentframe().f_back.f_locals
    return [name for (name, value) in callers_namespace.items() if var is value]
foo2() # ['xx', 'zz']
If you're working with a value type that you can assign a name attribute to, you can give it a name, and then use that:
class SomeClass(object):
    pass
obj = SomeClass()
obj.name = 'obj'
class NamedInt(int):
    pass # no __slots__: Python 3 rejects non-empty __slots__ on int subclasses
x = NamedInt(321)
x.name = 'x'
Finally, if you're working with class attributes and you want them to know their names (descriptors are the obvious use case), you can do cool tricks with metaclass programming like they do in the Django ORM and SQLAlchemy declarative-style table definitions:
class AutonamingType(type):
    def __init__(cls, name, bases, attrs):
        for (attrname, attrvalue) in attrs.items():
            if getattr(attrvalue, '__autoname__', False):
                attrvalue.name = attrname
        super(AutonamingType,cls).__init__(name, bases, attrs)
class NamedDescriptor(object):
    __autoname__ = True
    name = None
    def __get__(self, instance, instance_type):
        return self.name
class Foo(object, metaclass=AutonamingType): # Python 3 spelling of __metaclass__
    bar = NamedDescriptor()
    baaz = NamedDescriptor()
lilfoo = Foo()
print(lilfoo.bar) # prints 'bar'
print(lilfoo.baaz) # prints 'baaz'
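On Python 3.6 and later, the same effect is available without a metaclass through the __set_name__ hook of the descriptor protocol. This is an alternative I am adding for comparison, not the approach the answer describes:

```python
class NamedDescriptor(object):
    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created.
        self.name = name

    def __get__(self, instance, owner=None):
        return self.name


class Foo(object):
    bar = NamedDescriptor()
    baaz = NamedDescriptor()


lilfoo = Foo()
print(lilfoo.bar)
print(lilfoo.baaz)
```

Each descriptor learns the attribute name it was assigned to, which is the only reliable "variable name" Python hands to an object.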
There are three ways to get "the" string representation of an object in python:
1: str()
>>> foo={"a":"z","b":"y"}
>>> str(foo)
"{'a': 'z', 'b': 'y'}"
2: repr()
>>> foo={"a":"z","b":"y"}
>>> repr(foo)
"{'a': 'z', 'b': 'y'}"
3: string interpolation:
>>> foo={"a":"z","b":"y"}
>>> "%s" % (foo,)
"{'a': 'z', 'b': 'y'}"
In this case all three methods generated the same output. The difference is that str() calls dict.__str__(), while repr() calls dict.__repr__(). str() is used on string interpolation, while repr() is used by Python internally on each object in a list or dict when you print the list or dict.
As Tendayi Mawushe mentions above, the string produced by repr isn't necessarily human-readable.
Also, the default implementation of .__str__() is to call .__repr__(), so if the class does not have its own override of .__str__(), the value returned from .__repr__() is used.
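A small sketch of those rules, using two made-up classes (OnlyRepr and Both are illustrative names only):

```python
class OnlyRepr(object):
    def __repr__(self):
        return 'OnlyRepr()'


class Both(object):
    def __repr__(self):
        return 'Both()'

    def __str__(self):
        return 'a Both instance'


only, both = OnlyRepr(), Both()
results = {
    'str(only)': str(only),      # no __str__, so it falls back to __repr__
    'str(both)': str(both),
    'repr(both)': repr(both),
    'in a list': str([both]),    # containers use repr() for their items
    '%s': '%s' % (both,),        # %s interpolation uses str()
}
for label, text in results.items():
    print(label, '->', text)
```

The list case is the one that usually surprises people: even str() of a container shows the repr() of each element.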
