I want to clarify how variables are declared in Python.
I have seen variables declared as
class writer:
    path = ""
Sometimes there is no explicit declaration, just initialization in __init__:
def __init__(self, name):
    self.name = name
I understand the purpose of __init__, but is it advisable to declare variables in any other functions?
How can I create a variable to hold a custom type?
class writer:
    path = "" # string value
    customObj = ??
Okay, first things first.
There is no such thing as "variable declaration" or "variable initialization" in Python.
There is simply what we call "assignment", but should probably just call "naming".
Assignment means "this name on the left-hand side now refers to the result of evaluating the right-hand side, regardless of what it referred to before (if anything)".
foo = 'bar' # the name 'foo' is now a name for the string 'bar'
foo = 2 * 3 # the name 'foo' stops being a name for the string 'bar',
# and starts being a name for the integer 6, resulting from the multiplication
As such, Python's names (a better term than "variables", arguably) don't have associated types; the values do. You can re-apply the same name to anything regardless of its type, but the thing still has behaviour that's dependent upon its type. The name is simply a way to refer to the value (object). This answers your second question: You don't create variables to hold a custom type. You don't create variables to hold any particular type. You don't "create" variables at all. You give names to objects.
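As a small sketch of that point (the name `thing` is purely illustrative), the same name can be bound to objects of different types, while each object keeps its own type:

```python
# The name 'thing' is illustrative; any name behaves the same way.
thing = 'bar'
first_type = type(thing)      # the type belongs to the string object

thing = 2 * 3                 # rebind the same name to an integer
second_type = type(thing)     # the name now refers to an int object

print(first_type.__name__, second_type.__name__)
```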
Second point: Python follows a very simple rule when it comes to classes, that is actually much more consistent than what languages like Java, C++ and C# do: everything declared inside the class block is part of the class. So, functions (def) written here are methods, i.e. part of the class object (not stored on a per-instance basis), just like in Java, C++ and C#; but other names here are also part of the class. Again, the names are just names, and they don't have associated types, and functions are objects too in Python. Thus:
class Example:
    data = 42
    def method(self): pass
Classes are objects too, in Python.
So now we have created an object named Example, which represents the class of all things that are Examples. This object has two user-supplied attributes (In C++, "members"; in C#, "fields or properties or methods"; in Java, "fields or methods"). One of them is named data, and it stores the integer value 42. The other is named method, and it stores a function object. (There are several more attributes that Python adds automatically.)
These attributes still aren't really part of the object, though. Fundamentally, an object is just a bundle of more names (the attribute names), until you get down to things that can't be divided up any more. Thus, values can be shared between different instances of a class, or even between objects of different classes, if you deliberately set that up.
Let's create an instance:
x = Example()
Now we have a separate object named x, which is an instance of Example. The data and method are not actually part of the object, but we can still look them up via x because of some magic that Python does behind the scenes. When we look up method, in particular, we will instead get a "bound method" (when we call it, x gets passed automatically as the self parameter, which cannot happen if we look up Example.method directly).
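The difference between the two lookups can be sketched roughly like this, assuming Python 3 (where `Example.method` is a plain function); `method` returns `self` here only so that the binding is visible:

```python
class Example:
    data = 42

    def method(self):
        return self

x = Example()

bound = x.method            # looked up via the instance: a bound method
plain = Example.method      # looked up via the class: a plain function

same_instance = bound() is x             # x was passed as self automatically
explicit = plain(x) is x                 # here we must pass the instance ourselves
in_instance_dict = 'method' in vars(x)   # the method is not stored on x itself
```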
What happens when we try to use x.data?
When we examine it, it's looked up in the object first. If it's not found in the object, Python looks in the class.
However, when we assign to x.data, Python will create an attribute on the object. It will not replace the class' attribute.
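A minimal sketch of both rules, using a second instance `y` (an addition for illustration) to show that the class attribute is left alone:

```python
class Example:
    data = 42

x = Example()
y = Example()

before = x.data          # not found on x, so Python finds it on the class
x.data = 'changed'       # creates an attribute on x only

after_x = x.data         # the instance attribute is found first
after_y = y.data         # y still falls back to the class attribute
class_value = Example.data
```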
This allows us to do object initialization. Python will automatically call the class' __init__ method on new instances when they are created, if present. In this method, we can simply assign to attributes to set initial values for that attribute on each object:
class Example:
    name = "Ignored"
    def __init__(self, name):
        self.name = name
    # rest as before
Now we must specify a name when we create an Example, and each instance has its own name. Python will ignore the class attribute Example.name whenever we look up the .name of an instance, because the instance's attribute will be found first.
One last caveat: modification (mutation) and assignment are different things!
In Python, strings are immutable. They cannot be modified. When you do:
a = 'hi '
b = a
a += 'mom'
You do not change the original 'hi ' string. That is impossible in Python. Instead, you create a new string 'hi mom', and cause a to stop being a name for 'hi ', and start being a name for 'hi mom' instead. We made b a name for 'hi ' as well, and after re-applying the a name, b is still a name for 'hi ', because 'hi ' still exists and has not been changed.
But lists can be changed:
a = [1, 2, 3]
b = a
a += [4]
Now b is [1, 2, 3, 4] as well, because we made b a name for the same thing that a named, and then we changed that thing. We did not create a new list for a to name, because Python simply treats += differently for lists.
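One way to check this for yourself is to compare object identity with `is`. The sketch below (variable names are illustrative) also contrasts `+=` with plain `+`, which builds a new list:

```python
s = 'hi '
s_alias = s
s += 'mom'                     # builds a new string and rebinds s
str_same = s is s_alias        # False: s now names a different object

lst = [1, 2, 3]
lst_alias = lst
lst += [4]                     # extends the existing list in place
list_same = lst is lst_alias   # True: still the same list object

rebound = [1, 2, 3]
rebound_alias = rebound
rebound = rebound + [4]        # plain + creates a new list; the alias is untouched
```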
This matters for objects because if you had a list as a class attribute, and used an instance to modify the list, then the change would be "seen" in all other instances. This is because (a) the data is actually part of the class object, and not any instance object; (b) because you were modifying the list and not doing a simple assignment, you did not create a new instance attribute hiding the class attribute.
This might be 6 years late, but in Python 3.6 and above, you can give a hint about a variable's type like this:
variable_name: type_name
or, in a form that also works in Python 3.5 and earlier, with a type comment on an assignment:
variable_name = value # type: shinyType
This hint has no effect in the core Python interpreter, but many tools will use it to aid the programmer in writing correct code.
So in your case (if you have a CustomObject class defined), you can do:
customObj: CustomObject
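To illustrate that such hints are not enforced, here is a small sketch assuming Python 3.6 or later; the `Writer` class and its attribute names are made up for the example:

```python
class CustomObject:
    pass

class Writer:
    path: str = ""
    custom_obj: CustomObject      # annotation only; no attribute is created

hints = Writer.__annotations__    # the hints are recorded here for tools to read
has_custom_obj = hasattr(Writer, 'custom_obj')

w = Writer()
w.path = 123                      # no error at runtime, despite the str hint
```

A static checker such as mypy would be expected to flag the last assignment, but the interpreter itself accepts it.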
See this or that for more info.
There's no need to declare new variables in Python. If we're talking about variables in functions or modules, no declaration is needed. Just assign a value to a name where you need it: mymagic = "Magic". Variables in Python can hold values of any type, and you can't restrict that.
Your question specifically asks about classes, objects and instance variables though. The idiomatic way to create instance variables is in the __init__ method and nowhere else — while you could create new instance variables in other methods, or even in unrelated code, it's just a bad idea. It'll make your code hard to reason about or to maintain.
So for example:
class Thing(object):
    def __init__(self, magic):
        self.magic = magic
Easy. Now instances of this class have a magic attribute:
thingo = Thing("More magic")
# thingo.magic is now "More magic"
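As a rough illustration of why creating attributes outside __init__ is hard to reason about, compare two hypothetical classes (`Scattered` and `Tidy` are invented names): with the first, whether `magic` exists depends on which methods have already been called.

```python
class Scattered:
    """Hypothetical class that creates an attribute outside __init__."""
    def load(self):
        self.magic = "loaded"

class Tidy:
    """Same behaviour, but every attribute is introduced in __init__."""
    def __init__(self):
        self.magic = None

    def load(self):
        self.magic = "loaded"

s = Scattered()
try:
    s.magic                     # load() has not been called yet
    scattered_ok = True
except AttributeError:
    scattered_ok = False

t = Tidy()
tidy_value = t.magic            # always exists, even before load()
```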
Creating variables in the namespace of the class itself leads to different behaviour altogether. It is functionally different, and you should only do it if you have a specific reason to. For example:
class Thing(object):
    magic = "Magic"
    def __init__(self):
        pass
Now try:
thingo = Thing()
Thing.magic = 1
# thingo.magic is now 1
Or:
class Thing(object):
    magic = ["More", "magic"]
    def __init__(self):
        pass
thing1 = Thing()
thing2 = Thing()
thing1.magic.append("here")
# thing1.magic AND thing2.magic are now ["More", "magic", "here"]
This is because the namespace of the class itself is different to the namespace of the objects created from it. I'll leave it to you to research that a bit more.
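One way to look at the two namespaces is the built-in `vars()`, which returns an object's attribute dictionary. A minimal sketch:

```python
class Thing:
    magic = ["More", "magic"]

thing1 = Thing()
thing2 = Thing()

thing1.magic.append("here")             # mutates the list stored on the class
on_class = 'magic' in vars(Thing)       # True: the name lives in the class namespace
on_instance = 'magic' in vars(thing1)   # False: nothing was stored on the instance

thing1.magic = ["own"]                  # assignment creates an instance attribute
```

After the final assignment, `thing1` has its own `magic`, while `thing2` still sees the list stored on the class.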
The take-home message is that idiomatic Python is to (a) initialise object attributes in your __init__ method, and (b) document the behaviour of your class as needed. You don't need to go to the trouble of full-blown Sphinx-level documentation for everything you ever write, but at least some comments about whatever details you or someone else might need to pick it up.
For scoping purposes, I use:
custom_object = None
Variables have scope, so yes, it is appropriate to have variables that are specific to your function. You don't always have to be explicit about their definition; usually you can just assign to them where you need them. Only if you want to do something specific to the type of the variable, like append for a list, do you need to define them before you start using them. A typical example of this:
items = []
for i in stuff:
    items.append(i)
By the way, this is not really a good way to set up the list. It would be better to say:
items = [i for i in stuff] # list comprehension
...but I digress.
Your other question.
The custom object should be a class itself.
class CustomObject(): # always capitalize the class name...this is not syntax, just style.
    pass
customObj = CustomObject()
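As of Python 3.6, you can annotate variables with a type (function parameters could already be annotated in Python 3.0). These annotations are hints for tools and readers; the interpreter does not enforce them.
For instance, to annotate an integer one can do it as follows: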
x: int = 3
or:
def f(x: int):
    return x
See this question for more detailed info about it:
Explicitly declaring a variable type in Python
I ran the following very trivial Python code and was very surprised that it actually ran. Could someone explain to me why I can assign values to "nd" and "hel" without defining them in the class definition? Is this because attributes can be added at the instance level?
class tempClass(object):
    pass
a = tempClass()
a.nd = 1
a.hel = 'wem3'
Python has no notion of variable declaration, only assignments. The same applies to attributes: you simply assign an initial value to bring it into existence.
There is nothing special about the __init__ method in this regard. For example,
class TempClass(object):
    def __init__(self):
        self.nd = 1
a = TempClass()
a.hel = 'wem3'
Both attributes are created in the same way: by assigning a value to them. __init__ is called when a is first created, but otherwise is not special. self inside __init__ is a reference to the object referenced by a, so self.nd = 1 is identical to a.nd = 1. After the object is created, a.hel is created and initialized with 'wem3' by the same process.
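A quick way to confirm that both attributes end up in the same place is to inspect the instance's attribute dictionary with `vars()`. In this sketch, the second instance `b` is an addition that shows `hel` was added to `a` only:

```python
class TempClass:
    def __init__(self):
        self.nd = 1          # created by assignment inside __init__

a = TempClass()
a.hel = 'wem3'               # created by assignment after construction

attrs = vars(a)              # both live in the same per-instance dictionary

b = TempClass()              # a fresh instance only has what __init__ assigned
```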
I see that the Python syntax for a namedtuple is:
Point = namedtuple('Point', ['x', 'y'])
Why isn't it simpler like so:
Point = namedtuple(['x','y'])
It's less verbose.
In general, objects don't know what variables they are assigned to:
# Create three variables referring to an OrderedPair class
tmp = namedtuple('OrderedPair', ['x','y']) # create a new class with metadata
Point = tmp # assign the class to a variable
Coordinate = tmp # assign the class to another var
That's a problem for named tuples. We have to pass the class name to the namedtuple() factory function so that the class can be given a useful name, docstring, and __repr__, all of which contain the class name.
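The effect is easy to observe: in the sketch below, the class reports the name it was given as a string, whichever variable is used to reach it.

```python
from collections import namedtuple

tmp = namedtuple('OrderedPair', ['x', 'y'])
Point = tmp
Coordinate = tmp

p = Point(1, 2)
text = repr(p)               # uses the class name passed to the factory
print(text)                  # OrderedPair(x=1, y=2)
```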
The reason it seems strange to you is that normal function and class definitions are handled differently. Python has special syntax for def and class that not only creates functions and classes, but also assigns their metadata (name and docstring) and assigns the result to a variable.
Consider what def does:
def square(x):
    'Return a value times itself'
    return x * x
The keyword def takes care of several things for you (notice that the word "square" will be used twice):
tmp = lambda x: x*x # create a function object
tmp.__name__ = 'square' # assign its metadata
tmp.__doc__ = 'Return a value times itself'
square = tmp # assign the function to a variable
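To see what those manual steps are compensating for, here is a small sketch comparing the metadata of a `def` function with that of a lambda before and after the fix-up:

```python
def square(x):
    'Return a value times itself'
    return x * x

tmp = lambda x: x * x
default_name = tmp.__name__      # '<lambda>': a lambda gets no useful name

tmp.__name__ = 'square'          # what def did for us automatically
tmp.__doc__ = 'Return a value times itself'
```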
The same is also true for classes. The class keyword takes care of multiple actions that would otherwise repeat the class name:
class Dog(object):
    def bark(self):
        return 'Woof!'
The underlying steps repeat the class name (notice that the word "Dog" is used twice):
Dog = type('Dog', (object,), {'bark': lambda self: 'Woof!'})
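As a sketch of that equivalence, the class below is built twice: once with the class statement and once with a direct call to type(). The variable name `Hound` is made up to show that the variable and the class name are independent.

```python
class Dog(object):
    def bark(self):
        return 'Woof!'

# Same class name string, but bound to a differently named variable.
Hound = type('Dog', (object,), {'bark': lambda self: 'Woof!'})

same_name = Hound.__name__ == Dog.__name__         # both report 'Dog'
same_behaviour = Hound().bark() == Dog().bark()    # both bark the same way
```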
Named tuples don't have the advantage of a special keyword like def or class, so the namedtuple() factory has to do the first two steps itself. The final step of assigning to a variable belongs to you. If you think about it, the named tuple way is the norm in Python while def and class are the exception:
survey_results = open('survey_results') # is this really a duplication?
company_db = sqlite3.connect('company.db') # is this really a duplication?
www_python_org = urllib.request.urlopen('http://www.python.org')
radius = property(radius)
You are not the first to notice this. PEP 359 suggested adding a new keyword, make, that would allow any callable to gain the auto-assignment capabilities of def, class, and import.
make <callable> <name> <tuple>:
    <block>
would be translated into the assignment:
<name> = <callable>("<name>", <tuple>, <namespace>)
In the end, Guido didn't like the "make" proposal because it caused more problems than it solved (after all, it only saves you from making a single variable assignment).
Hope that helps you see why the class name is written twice. It isn't really duplication. The string form of the class name is used to assign metadata when the object is created, and the separate variable assignment just gives you a way to refer to that object. While they are usually the same name, they don't have to be :-)
namedtuple is a factory that returns a class. Consider only the expression:
namedtuple(['x','y'])
What would be the name of class returned by this expression?
The class should have a name and know it. And it doesn't see the variable you assign it to, so it can't use that. Plus you could call it something else or even nothing at all:
c = namedtuple('Point', ['x', 'y'])
do_something_with_this(namedtuple('Point', ['x', 'y']))
Speaking of simpler syntax, you can also write it like this:
namedtuple('Point', 'x y')
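The field names can be given as a list or as a single string separated by whitespace and/or commas. A short sketch (the variable names `A`, `B` and `C` are illustrative):

```python
from collections import namedtuple

A = namedtuple('Point', ['x', 'y'])   # list of field names
B = namedtuple('Point', 'x y')        # whitespace-separated string
C = namedtuple('Point', 'x, y')       # comma-separated string

all_equal = A._fields == B._fields == C._fields
```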
Because namedtuple is a function that returns a class. To do that, the CPython implementation renders a string template and executes it (with exec in older versions; newer versions build only part of the class this way). To build the string, it needs all the arguments beforehand.
You need to include the relevant context as arguments to namedtuple for that to happen. If you don't provide the class name argument, it would need to guess. Programming languages don't like to guess.
With the rules of the Python language, the namedtuple function within this expression..
>>> Point = namedtuple(['x','y'])
..doesn't have access to variable name (Point) that the result is stored in once the expression has been executed. It only has access to the elements of the list provided as its argument (and variables that have been defined earlier).
I am reading this Genshi Tutorial and see there the following example:
from formencode import Schema, validators
class LinkForm(Schema):
    username = validators.UnicodeString(not_empty=True)
    url = validators.URL(not_empty=True, add_http=True, check_exists=False)
    title = validators.UnicodeString(not_empty=True)
As far as I understand this example, we create a new class that inherits from the Schema class, and this class contains three methods: username, url, title. However, I am not sure about the last part, because before I only saw methods created with def.
Anyway, my question is not about that. I would like to know if it is possible to make the definition of the class dynamic. For example, sometimes I do not want url or title to be in the class. It seems to be doable (I just use if and assign a value to url only if the if-statement is satisfied).
But what if I do not know in advance what fields I would like to have in the form? For example, now I have username, url and title. But what if later I would like to have city or age. Can I do something like that:
from formencode import Schema, validators
class LinkForm(Schema):
    def __init__(self, fields):
        for field in fields:
            condition = fields[field]
            field = validators.UnicodeString(condition)
I think it will not work. Is there a work around in this case?
Yes, you can add methods to an instance dynamically. No, you can't do what you want.
You can bind methods to the instance in the initializer. Unfortunately what you have there are descriptors and those must be bound to the class.
I would go the other way round—first define all form fields that might be used, and delete unneeded ones later.
Provided that you have:
from formencode import Schema, validators
class LinkForm(Schema):
    username = validators.UnicodeString(not_empty=True)
    url = validators.URL(not_empty=True, add_http=True, check_exists=False)
    title = validators.UnicodeString(not_empty=True)
you could do either this:
def xy():
    my_form = LinkForm()
    del my_form.url
    …
… or this:
def xy():
    class CustomLinkForm(LinkForm):
        pass
    if …:
        del CustomLinkForm.url
    …
Disclaimer: I am not familiar with FormEncode, so it might depend on its inner workings which of these two versions actually works.
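Another general Python technique for fields that are not known in advance is to build the class itself at runtime with the three-argument form of type(). The sketch below uses plain stand-in classes (`Field`, `BaseForm` and `make_form` are hypothetical names, not part of FormEncode), so whether FormEncode's Schema cooperates with this approach is an assumption that would need checking:

```python
class Field:
    """Hypothetical stand-in for a validator; not a FormEncode class."""
    def __init__(self, not_empty=False):
        self.not_empty = not_empty

class BaseForm:
    """Hypothetical stand-in for a schema base class."""

def make_form(name, fields):
    # Build the class attributes from a {field_name: condition} mapping,
    # then create a new subclass of BaseForm carrying those attributes.
    attrs = {field: Field(not_empty=cond) for field, cond in fields.items()}
    return type(name, (BaseForm,), attrs)

LinkForm = make_form('LinkForm', {'username': True, 'city': False})
```

Each call to `make_form` produces a separate class, so different requests can use different sets of fields without affecting each other.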
Of course, you can have a constructor that takes arguments after self, and use those arguments as the values of some members of your class. For instance:
def __init__(self, fields):
    self.fields = []
    for field in fields:
        self.fields.append(field)
See this in Dive into Python:
class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        UserDict.__init__(self)
        self["name"] = filename
Classes can (and should) have doc strings too, just like modules and
functions.
__init__ is called immediately after an instance of the
class is created. It would be tempting but incorrect to call this the
constructor of the class. It's tempting, because it looks like a
constructor (by convention, __init__ is the first method defined for
the class), acts like one (it's the first piece of code executed in a
newly created instance of the class), and even sounds like one (“init”
certainly suggests a constructor-ish nature). Incorrect, because the
object has already been constructed by the time __init__ is called,
and you already have a valid reference to the new instance of the
class. But __init__ is the closest thing you're going to get to a
constructor in Python, and it fills much the same role.
The first argument of every class method, including __init__, is always a
reference to the current instance of the class. By convention, this
argument is always named self. In the __init__ method, self refers to
the newly created object; in other class methods, it refers to the
instance whose method was called. Although you need to specify self
explicitly when defining the method, you do not specify it when
calling the method; Python will add it for you automatically.
__init__ methods can take any number of arguments, and just like
functions, the arguments can be defined with default values, making
them optional to the caller. In this case, filename has a default
value of None, which is the Python null value.
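The claim that the object already exists when __init__ runs can be observed by also defining __new__, which is the method that actually creates the instance. This is only a sketch: it subclasses the built-in dict instead of UserDict to stay self-contained, and the `calls` list and the file name are made up for illustration.

```python
calls = []   # records the order in which the two methods run

class FileInfo(dict):
    "store file metadata"
    def __new__(cls, *args, **kwargs):
        instance = super().__new__(cls)   # the object is created here
        calls.append('new')
        return instance

    def __init__(self, filename=None):
        super().__init__()
        calls.append('init')              # runs on the already created object
        self["name"] = filename

f = FileInfo()                 # filename falls back to its default, None
g = FileInfo("example.txt")    # hypothetical file name
```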
Note that a later example there shows how to deal with an inherited class, calling __init__() for this inherited class.
To answer your not-a-question about class or instance variables, see this
Variables defined in the class definition are class variables; they
are shared by all instances. To create instance variables, they can be
set in a method with self.name = value. Both class and instance
variables are accessible through the notation “self.name”, and an
instance variable hides a class variable with the same name when
accessed in this way. Class variables can be used as defaults for
instance variables, but using mutable values there can lead to
unexpected results. For new-style classes, descriptors can be used to
create instance variables with different implementation details.
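A brief sketch of the usual way to follow that advice: keep immutable defaults on the class and create mutable values per instance in __init__. The `Basket` class and its attribute names are invented for the example.

```python
class Basket:
    capacity = 10                # immutable default, shared through the class

    def __init__(self):
        self.items = []          # a fresh list for every instance

b1, b2 = Basket(), Basket()
b1.items.append('apple')         # affects b1 only
b1.capacity = 5                  # instance variable now hides the class variable
```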