Can't set attributes on instance of "object" class - python

So, I was playing around with Python while answering this question, and I discovered that this is not valid:
o = object()
o.attr = 'hello'
due to an AttributeError: 'object' object has no attribute 'attr'. However, with any class inherited from object, it is valid:
class Sub(object):
pass
s = Sub()
s.attr = 'hello'
Printing s.attr displays 'hello' as expected. Why is this the case? What in the Python language specification specifies that you can't assign attributes to vanilla objects?
For other workarounds, see How can I create an object and add attributes to it?.

To support arbitrary attribute assignment, an object needs a __dict__: a dict associated with the object, where arbitrary attributes can be stored. Otherwise, there's nowhere to put new attributes.
An instance of object does not carry around a __dict__ -- if it did, before the horrible circular dependence problem (since dict, like most everything else, inherits from object;-), this would saddle every object in Python with a dict, which would mean an overhead of many bytes per object that currently doesn't have or need a dict (essentially, all objects that don't have arbitrarily assignable attributes don't have or need a dict).
For example, using the excellent pympler project (you can get it via svn from here), we can do some measurements...:
>>> from pympler import asizeof
>>> asizeof.asizeof({})
144
>>> asizeof.asizeof(23)
16
You wouldn't want every int to take up 144 bytes instead of just 16, right?-)
Now, when you make a class (inheriting from whatever), things change...:
>>> class dint(int): pass
...
>>> asizeof.asizeof(dint(23))
184
...the __dict__ is now added (plus, a little more overhead) -- so a dint instance can have arbitrary attributes, but you pay quite a space cost for that flexibility.
So what if you wanted ints with just one extra attribute foobar...? It's a rare need, but Python does offer a special mechanism for the purpose...
>>> class fint(int):
... __slots__ = 'foobar',
... def __init__(self, x): self.foobar=x+100
...
>>> asizeof.asizeof(fint(23))
80
...not quite as tiny as an int, mind you! (or even the two ints, one the self and one the self.foobar -- the second one can be reassigned), but surely much better than a dint.
When the class has the __slots__ special attribute (a sequence of strings), then the class statement (more precisely, the default metaclass, type) does not equip every instance of that class with a __dict__ (and therefore the ability to have arbitrary attributes), just a finite, rigid set of "slots" (basically places which can each hold one reference to some object) with the given names.
In exchange for the lost flexibility, you gain a lot of bytes per instance (probably meaningful only if you have zillions of instances gallivanting around, but, there are use cases for that).

As other answerers have said, an object does not have a __dict__. object is the base class of all types, including int or str. Thus whatever is provided by object will be a burden to them as well. Even something as simple as an optional __dict__ would need an extra pointer for each value; this would waste additional 4-8 bytes of memory for each object in the system, for a very limited utility.
Instead of doing an instance of a dummy class, in Python 3.3+, you can (and should) use types.SimpleNamespace for this.

It is simply due to optimization.
Dicts are relatively large.
>>> import sys
>>> sys.getsizeof((lambda:1).__dict__)
140
Most (maybe all) classes that are defined in C do not have a dict for optimization.
If you look at the source code you will see that there are many checks to see if the object has a dict or not.

So, investigating my own question, I discovered this about the Python language: you can inherit from things like int, and you see the same behaviour:
>>> class MyInt(int):
pass
>>> x = MyInt()
>>> print x
0
>>> x.hello = 4
>>> print x.hello
4
>>> x = x + 1
>>> print x
1
>>> print x.hello
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: 'int' object has no attribute 'hello'
I assume the error at the end is because the add function returns an int, so I'd have to override functions like __add__ and such in order to retain my custom attributes. But this all now makes sense to me (I think), when I think of "object" like "int".

https://docs.python.org/3/library/functions.html#object :
Note: object does not have a __dict__, so you can’t assign arbitrary attributes to an instance of the object class.

It's because object is a "type", not a class. In general, all classes that are defined in C extensions (like all the built in datatypes, and stuff like numpy arrays) do not allow addition of arbitrary attributes.

This is (IMO) one of the fundamental limitations with Python - you can't re-open classes. I believe the actual problem, though, is caused by the fact that classes implemented in C can't be modified at runtime... subclasses can, but not the base classes.

Related

In Ruby, why does `Array.length` give NoMethodError?

In Python, I can get the length of a list by calling its __len__ method. __len__ is not defined on each individual list object, but on the list class. The method can be accessed directly through the class:
>>> [].__len__()
0
>>> type([])
<class 'list'>
>>> list.__len__([])
0
However, the equivalent code does not work in Ruby:
> [].length
=> 0
> [].class
=> Array
> Array.length([])
NoMethodError: undefined method `length' for Array:Class
Why can the length method not be found when using Array.length? Does method lookup in Ruby work differently from Python?
Yes, method lookup in Ruby works quite a bit differently than method lookup in Python.
The TL;DR answer is, if you really want to pull the length method from the Array Class object, you'd need to write Array.instance_method(:length). This would get you an UnboundMethod object, which you could then bind to particular object and call:
unbound_method = Array.instance_method(:length)
ary = [1,2,3,4]
bound_method = unbound_method.bind(ary)
bound_method.call #=> 4
Unsurprisingly, this is not commonly done in Ruby.
In Ruby, Classes themselves are objects, instances of the class Class; these classes have their own methods, above and beyond the methods they provide for their instances, methods such as new, name, superclass, etc. Of course, it would be possible to define a length method on the Array class object so you could write Array.length, but this would not be inherently related to the instance method of the same name. There are a few places in core classes and the standard library where this is actually done, but for the most part it is not idiomatic in Ruby.
The distinction between the the methods a Class object responds to directly, and those it provides for its instances, can be seen pretty clearly in the output of the methods and instance_methods methods:
irb(main):001:0> Array.methods
=> [:[], :try_convert, :new, :allocate, :superclass, ...
irb(main):002:0> Array.instance_methods
=> [:each_index, :join, :rotate, :rotate!, :sort!, ...
One reason that method lookup in Ruby doesn't work the way it does in Python is that there may be some method names that both a class and an instance need to respond to, in different ways. For example, consider a hypothetical Person class:
Person.ancestors # => [Person, Animal, Organism, Object, Kernel, BasicObject]
john = Person.new(...)
john.ancestors # => [John Sr., Sally, Jospeh, Mary, Charles, Jessica]
In Python, of course, the first call would be an error: you can't call Person.ancestors without passing an instance of Person to be __self__, that wouldn't make any sense. But then how do you get the ancestors of the Person class? You'd pass Person itself to a function: inspect.getmro(Person). That's idiomatic in Python, but would be considered extremely poor style in Ruby.
Essentially, as different languages, even though in many ways they are similar, they do still have different ways of solving certain problems, and some things that are considered good practice or style in one are quite bad in the other (and vice versa).

arbitrary __getattr__ fails on classes that inherit object but not on object [duplicate]

So, I was playing around with Python while answering this question, and I discovered that this is not valid:
o = object()
o.attr = 'hello'
due to an AttributeError: 'object' object has no attribute 'attr'. However, with any class inherited from object, it is valid:
class Sub(object):
pass
s = Sub()
s.attr = 'hello'
Printing s.attr displays 'hello' as expected. Why is this the case? What in the Python language specification specifies that you can't assign attributes to vanilla objects?
For other workarounds, see How can I create an object and add attributes to it?.
To support arbitrary attribute assignment, an object needs a __dict__: a dict associated with the object, where arbitrary attributes can be stored. Otherwise, there's nowhere to put new attributes.
An instance of object does not carry around a __dict__ -- if it did, before the horrible circular dependence problem (since dict, like most everything else, inherits from object;-), this would saddle every object in Python with a dict, which would mean an overhead of many bytes per object that currently doesn't have or need a dict (essentially, all objects that don't have arbitrarily assignable attributes don't have or need a dict).
For example, using the excellent pympler project (you can get it via svn from here), we can do some measurements...:
>>> from pympler import asizeof
>>> asizeof.asizeof({})
144
>>> asizeof.asizeof(23)
16
You wouldn't want every int to take up 144 bytes instead of just 16, right?-)
Now, when you make a class (inheriting from whatever), things change...:
>>> class dint(int): pass
...
>>> asizeof.asizeof(dint(23))
184
...the __dict__ is now added (plus, a little more overhead) -- so a dint instance can have arbitrary attributes, but you pay quite a space cost for that flexibility.
So what if you wanted ints with just one extra attribute foobar...? It's a rare need, but Python does offer a special mechanism for the purpose...
>>> class fint(int):
... __slots__ = 'foobar',
... def __init__(self, x): self.foobar=x+100
...
>>> asizeof.asizeof(fint(23))
80
...not quite as tiny as an int, mind you! (or even the two ints, one the self and one the self.foobar -- the second one can be reassigned), but surely much better than a dint.
When the class has the __slots__ special attribute (a sequence of strings), then the class statement (more precisely, the default metaclass, type) does not equip every instance of that class with a __dict__ (and therefore the ability to have arbitrary attributes), just a finite, rigid set of "slots" (basically places which can each hold one reference to some object) with the given names.
In exchange for the lost flexibility, you gain a lot of bytes per instance (probably meaningful only if you have zillions of instances gallivanting around, but, there are use cases for that).
As other answerers have said, an object does not have a __dict__. object is the base class of all types, including int or str. Thus whatever is provided by object will be a burden to them as well. Even something as simple as an optional __dict__ would need an extra pointer for each value; this would waste additional 4-8 bytes of memory for each object in the system, for a very limited utility.
Instead of doing an instance of a dummy class, in Python 3.3+, you can (and should) use types.SimpleNamespace for this.
It is simply due to optimization.
Dicts are relatively large.
>>> import sys
>>> sys.getsizeof((lambda:1).__dict__)
140
Most (maybe all) classes that are defined in C do not have a dict for optimization.
If you look at the source code you will see that there are many checks to see if the object has a dict or not.
So, investigating my own question, I discovered this about the Python language: you can inherit from things like int, and you see the same behaviour:
>>> class MyInt(int):
pass
>>> x = MyInt()
>>> print x
0
>>> x.hello = 4
>>> print x.hello
4
>>> x = x + 1
>>> print x
1
>>> print x.hello
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: 'int' object has no attribute 'hello'
I assume the error at the end is because the add function returns an int, so I'd have to override functions like __add__ and such in order to retain my custom attributes. But this all now makes sense to me (I think), when I think of "object" like "int".
https://docs.python.org/3/library/functions.html#object :
Note: object does not have a __dict__, so you can’t assign arbitrary attributes to an instance of the object class.
It's because object is a "type", not a class. In general, all classes that are defined in C extensions (like all the built in datatypes, and stuff like numpy arrays) do not allow addition of arbitrary attributes.
This is (IMO) one of the fundamental limitations with Python - you can't re-open classes. I believe the actual problem, though, is caused by the fact that classes implemented in C can't be modified at runtime... subclasses can, but not the base classes.

classes with same name in different modules share exact same name string?

Slightly surprising behaviour. Two classes with the same name in different modules share the same name. (Not equality, which is expected, but string object identity! )
I don't support it really matters much, but does anyone know why, and whether there's any potential here for further surprises?
Demo (with a/a.py and b/b.py and empty __init__.py in a/ and b/)
>>> from a import a
>>> from b import b
>>> ta = a.Test()
>>> tb = b.Test()
>>> ta.__class__.__name__
'Test'
>>> tb.__class__.__name__
'Test'
>>> tb.__class__.__name__ is ta.__class__.__name__ # unexpected
True
>>> ta.__class__
<class 'a.a.Test'>
>>> tb.__class__
<class 'b.b.Test'>
This is an implementation detail of the CPython interpreter.
It interns identifier strings; only one copy is created for source code strings, and re-used everywhere the exact same string value appears. This makes dictionary containment tests faster (by trying pointer comparisons first).
You can do this for your own strings with the sys.intern() function, which helpfully notes:
Normally, the names used in Python programs are automatically interned, and the dictionaries used to hold module, class or instance attributes have interned keys.
Also see About the changing id of a Python immutable string
As youve noted - you can't rely in sting objects being the same - nor in they being different objects. (Unless you explicitly makes references to the same strings, of course).
If you want to compare for the name, just use the == object. If yu want to know if the classes are the same, them use the is operator with the classes themselves, not with their names.
In Python everything is object/class. Whatever you write either a class or function, it's type is type.
In question you have 2 diff classes a and b. Here these 2 lines create 2 instances of both classes.
>>> ta = a.Test()
>>> tb = b.Test()
Function Test on each object return class Test. A class to __class__ returns object type or __repr__ of object. And name returns name of class in form of string. If you want to confirm type this and see type(tb.class.name). So
>>> tb.__class__.__name__ is ta.__class__.__name__
this is returning true as one str is type of other.

What does "Everything" mean when someone says "Everything in Python is an object."?

I constantly see people state that "Everything in Python is an object.", but I haven't seen "thing" actually defined. This saying would lead me to believe that all tokens of any kind are also considered to be objects, including operators, punctuators, whitespace, etc. Is that actually the case? Is there a more concise way of stating what a Python object actually is?
Thanks
Anything that can be assigned to a variable is an object.
That includes functions, classes, and modules, and of course int's, str's, float's, list's, and everything else. It does not include whitespace, punctuation, or operators.
Just to mention it, there is the operator module in the standard library which includes functions that implement operators; those functions are objects. That doesn't mean + or * are objects.
I could go on and on, but this is simple and pretty complete.
Some values are obviously objects; they are instances of a class, have attributes, etc.
>>> i = 3
>>> type(i)
<type 'int'>
>>> i.denominator
1
Other values are less obviously objects. Types are objects:
>>> type(int)
<type 'type'>
>>> int.__mul__(3, 5)
15
Even type is an object (of type type, oddly enough):
>>> type(type)
<type 'type'>
Modules are objects:
>>> import sys
>>> type(sys)
<type 'module'>
Built-in functions are objects:
>>> type(sum)
<type 'builtin_function_or_method'>
In short, if you can reference it by name, it's an object.
What is generally meant is that most things, for example functions and methods are objects. Modules too. Classes (not just their instances) themselves are objects. and int/float/strings are objects. So, yes, things generally tend to be objects in Python. Cyphase is correct, I just wanted to give some examples of things that might not be immediately obvious as objects.
Being objects then a number of properties are observable on things that you would consider special case, baked-in stuff in other languages. Though __dict__, which allows arbitrary attribute assignment in Python, is often missing on things intended for large volume instantiations like int.
Therefore, at least on pure-Python objects, a lot of magic can happen, from introspection to things like creating a new class on the fly.
Kinda like turtles all the way down.
You're not going to find a rigorous definition like C++11's, because Python does not have a formal specification like C++11, it has a reference manual like pre-ISO C++. The Data model chapter is as rigorous as it gets:
Objects are Python’s abstraction for data. All data in a Python program is represented by objects or by relations between objects. (In a sense, and in conformance to Von Neumann’s model of a “stored program computer,” code is also represented by objects.)
Every object has an identity, a type and a value. An object’s identity never changes once it has been created; you may think of it as the object’s address in memory. …
The glossary also has a shorter definition:
Any data with state (attributes or value) and defined behavior (methods).
And it's true that everything in Python has methods and (other) attributes. Even if there are no public methods, there's a set of special methods and values inherited from the object base class, like the __str__ method.
This wasn't true in versions of Python before 2.2, which is part of the reason we have multiple words for nearly the same thing—object, data, value; type, class… But from then on, the following kinds of things are identical:
Objects.
Things that can be returned or yielded by a function.
Things that can be stored in a variable (including a parameter).
Things that are instances of type object (usually indirectly, through a subclass or two).
Things that can be the value resulting from an expression.
Things represented by pointers to PyObject structs in CPython.
… and so on.
That's what "everything is an object" means.
It also means that Python doesn't have "native types" and "class types" like Java, or "value types" and "reference types" like C#; there's only one kind of thing, objects.
This saying would lead me to believe that all tokens of any kind are also considered to be objects, including operators, punctuators, whitespace, etc. Is that actually the case?
No. Those things don't have values, so they're not objects.1
Also, variables are not objects. Unlike C-style variables, Python variables are not memory locations with a type containing a value, they're just names bound to a value in some namespace.2 And that's why you can't pass around references to variables; there is no "thing" to reference.3
Assignment targets are also not objects. They sometimes look a lot like values, and even the core devs sometimes refer to things like the a, b in a, b = 1, 2 loosely as a tuple object—but there is no tuple there.4
There's also a bit of apparent vagueness with things like elements of a numpy.array (or an array.array or ctypes.Structure). When you write a[0] = 3, the 3 object doesn't get stored in the array the way it would with a list. Instead, numpy stores some bytes that Python doesn't even understand, but that it can use to do "the same thing a 3 would do" in array-wide operations, or to make a new copy of the 3 object if you later ask for a[0] = 3.
But if you go back to the definition, it's pretty clear that this "virtual 3" is not an object—while it has a type and value, it does not have an identity.
1. At the meta level, you can write an import hook that can act on imported code as a byte string, a decoded Unicode string, a list of token tuples, an AST node, a code object, or a module, and all of those are objects… But at the "normal" level, from within the code being imported, tokens, etc. are not objects.
2. Under the covers, there's almost always a string object to represent that name, stored in a dict or tuple that represents the namespace, as you can see by calling globals() or dir(self). But that's not what the variable is.
3. A closure cell is sort of a way of representing a reference to a variable, but really, it's the cell itself that's an object, and the variables at different scopes are just a slightly special kind of name for that cell.
4. However, in a[0] = 3, although a[0] isn't a value, a and 0 are, because that assignment is equivalent to the expression a.__setitem__(0, 3), except that it's not an expression.

How do I create a unique object?

I want
class MyClass(object):
_my_unique = ???? # if this were lisp, I would have written (cons nil nil) here
def myFunc (self, arg):
assert arg != _my_unique # this must never fail
...
What do use instead of ??? to ensure that the assert never fails?
(With Lisp, I could create _my_unique with (cons nil nil) and used eq in assert).
PS. Use case: I will put _my_unique in a dict, so I want it to be equal to itself, but I do not want it to be equal (in the dict collision sense) to anything passed in from the outside.
You can use object(), but this won't make the assert "never fail". It will still fail if you call myFunc and pass MyClass.my_unique as the object. Also, if you want to test whether it's the same object, use arg is not my_unique rather than !=.
You can use object().
Return a new featureless object. object is a base for all new style
classes. It has the methods that are common to all instances of new
style classes.
If what you're asking is "how do I make _my_unique unique for each instance of MyClass", you could simply create a new, empty object in the constructor.
For example:
>>> class MyClass(object):
... def __init__(self):
... self._my_unique = object()
...
>>> foo=MyClass()
>>> bar=MyClass()
>>> foo._my_unique
<object object at 0x100ce70c0>
>>> bar._my_unique
<object object at 0x100ce7090>
If you want to hide _my_unique, give it two underscores. This way, nobody can accidentally get at the value. It's not impossible, but they would need to work a little harder to get at the value. This is called name mangling, and you can read more about it here: http://docs.python.org/2/tutorial/classes.html#private-variables-and-class-local-references
>>> class MyClass(object):
... def __init__(self):
... self.__my_unique = object()
...
>>> foo=MyClass()
>>> foo.__my_unique
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'MyClass' object has no attribute '__my_unique'
This is a mildly confusing question, as it seems to mix a bunch of concepts without purpose. For one thing, most any object in Python is unique, but there may be multiple ways to reach them. The operator to check identities is not != (inequality) but is not. Unlike Java, Python does not require you to put things in a class, and does not implicitly look in self (which they know as implicit this). cons from Lisp is used to construct pairs which frequently form singly linked lists, a structure that's unusual in Python because we use dynamic reference arrays known as lists. Lists are mutable, and thus constructing one using list() or [] will produce a unique object.
So, while it is rather pointless, one way to write a function producing functions that perform the useless assert might be:
def createfunc():
mylist=[]
def subfunc(x):
assert x is not mylist
return subfunc
Each call to createfunc() returns a new function with its own mylist. However, the object being unique doesn't make it impossible to reach:
f=createfunc()
f(f.func_closure[0].cell_contents) # raises AssertionError
Regarding the PS, to be relevant to dict collissions, your object must also be hashable, which makes object() a better choice than list() for the unique object.

Categories

Resources