Custom constructors for models in Google App Engine (python) - python

I'm getting back to programming for Google App Engine and I've found, in old, unused code, instances in which I wrote constructors for models. It seems like a good idea, but there's no mention of it online and I can't test to see if it works. Here's a contrived example, with no error-checking, etc.:
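from google.appengine.ext import db

class Dog(db.Model):
    name = db.StringProperty(required=True)
    breeds = db.StringListProperty()
    age = db.IntegerProperty(default=0)

    def __init__(self, name, breed_list, **kwargs):
        # Pass self explicitly, and hand the parsed values to the base
        # constructor so that required properties such as name are set there.
        db.Model.__init__(self, name=name, breeds=breed_list.split(), **kwargs)

rufus = Dog('Rufus', 'spaniel terrier labrador')
rufus.put()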
The **kwargs are passed on to the Model constructor in case the model is constructed with a specified parent or key_name, or in case other properties (like age) are specified. This constructor differs from the default in that it requires that a name and breed_list be specified (although it can't ensure that they're strings), and it parses breed_list in a way that the default constructor could not.
Is this a legitimate form of instantiation, or should I just use functions or static/class methods? And if it works, why aren't custom constructors used more often?

In your example, why not use the default syntax instead of a custom constructor:
rufus = Dog( name='Rufus', breeds=['spaniel','terrier','labrador'] )
Your version makes it less clear semantically IMHO.
As for overriding Model constructors, Google recommends against it (see for example: http://groups.google.com/group/google-appengine/browse_thread/thread/9a651f6f58875bfe/111b975da1b4b4db?lnk=gst&q=python+constructors#111b975da1b4b4db) and that's why we don't see it in Google's code.
I think it's unfortunate because constructor overriding can be useful in some cases, like creating a temporary property.
One problem I know of is with Expando: anything you define in the constructor gets auto-serialized in the protocol buffer.
But for base Models I am not sure what the risks are, and I too would be happy to learn more.

There's usually no need to do something like that; the default constructor will assign name, and when working with a list it almost always makes more sense to pass an actual list instead of a space-separated string (just imagine the fun if you passed "cocker spaniel" instead of just "spaniel" there, for one thing...).
That said, if you really need to do computation when instantiating a Model subclass instance, there's probably nothing inherently wrong with it. I think most people probably prefer to get the data into the right form and then create the entity, which is why you're not seeing a lot of examples like that.
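A minimal sketch of the "get the data into the right form first" approach. The helper parse_breeds and its comma separator are hypothetical and not part of App Engine; the Dog call in the trailing comment is the default constructor syntax from the question.

```python
def parse_breeds(text, sep=","):
    """Split a delimited string into a clean list of breed names."""
    return [part.strip() for part in text.split(sep) if part.strip()]

breeds = parse_breeds("cocker spaniel, terrier, labrador")
print(breeds)  # ['cocker spaniel', 'terrier', 'labrador']

# The list can then go straight to the default constructor:
# rufus = Dog(name='Rufus', breeds=breeds)
```

Keeping the parsing outside the model means the model's constructor signature stays the one the framework expects.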

Related

Python class attributes decorator

not sure about the proper wording for that, but essentially what I would like to ask is if it is possible in python do something like this:
class Customer:
    self.name = 'Freddy'
    @reportMarketing
    self.surname = 'Krueger'
    self.eyes = 'Blue'
    @reportMarketing
    self.address = 'Elm Street'
    @reportAccounting
    self.tax_id = '8ab9a66cf'
    ...
where @reportMarketing and @reportAccounting would be decorators (of sorts). The idea is that an instance like the one above could be then passed to some 'reporting class' that - based on whether the instance's attribute is decorated with @reportMarketing or a @reportAccounting, would then consider it for a specific 'reporting'. For instance, imagine you have customers database with instances like the one above, and you want data/attributes decorated with reportMarketing to be included in a report that goes to your marketing department, while the data/attributes decorated with reportAccounting to be included in a report that goes to your accounting department.
I guess conceptually, my question is can I even have decorators for attributes? I understand they are 'typically' (always?) meant for functions.
Also, to give more background, the classes I deal with have plenty of attributes (each), so going through all of them manually is quite error-prone, and so I thought dealing with it by simply decorating the couple of ones I'm after, is the cleanest and most extensible in the future.
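Python's decorator syntax applies to function and class definitions, not to assignments, so the snippet above cannot work as written. One way to approximate the idea, assuming dataclasses are acceptable, is to attach the tag as field metadata. The helper names reported and report_for are hypothetical, and which attributes get which tag is chosen here purely for illustration.

```python
from dataclasses import dataclass, field, fields

def reported(*reports, default):
    """Hypothetical helper: a dataclass field tagged with report names."""
    return field(default=default, metadata={"reports": frozenset(reports)})

@dataclass
class Customer:
    name: str = reported("marketing", default="Freddy")
    surname: str = "Krueger"                      # untagged attribute
    eyes: str = reported("marketing", default="Blue")
    address: str = reported("accounting", default="Elm Street")
    tax_id: str = "8ab9a66cf"                     # untagged attribute

def report_for(obj, report):
    """Collect the attributes of obj that are tagged for the given report."""
    return {
        f.name: getattr(obj, f.name)
        for f in fields(obj)
        if report in f.metadata.get("reports", ())
    }

print(report_for(Customer(), "marketing"))  # {'name': 'Freddy', 'eyes': 'Blue'}
```

The tag sits next to the attribute definition, so adding an attribute to a report is a one-line change and the reporting code never lists attribute names by hand.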

Classes vs Function: Do I need to use 'self' keyword if using a class in Python?

I have a data engineering program that is grabbing some data off of Federal government websites and transforming that data. I'm a bit confused on whether I need to use the 'self' keyword or if it's a better practice to not use a class at all. This is how it's currently organized:
class GetGovtData():
    def get_data_1(arg1=0, arg2=1):
        df = conduct_some_operations
        return df
    def get_data_2(arg1=4, arg2=5):
        df = conduct_some_operations_two
        return df
I'm mostly using a class here for organization purposes. For instance, there might be a dozen different methods from one class that I need to use. I find it more aesthetically pleasing / easier to type out this:
from data.get_govt_data import GetGovtData
df1 = GetGovtData.get_data_1()
df2 = GetGovtData.get_data_2()
Rather than:
from data import get_govt_data
df1 = get_govt_data.get_data_1()
df2 = get_govt_data.get_data_2()
Which just has a boatload of underscores. So I'm just curious if this would be considered bad code to use a class like this, without bothering with 'self'? Or should I just eliminate the classes and use a bunch of functions in my files instead?
If you define functions within a Python class, there are two ways of defining them: with self as the first parameter, and without self.
So, what is the difference between the two?
Function with self
The first one is a method, which can access content within the created object. This allows you to access the internal state of an individual object, e.g., a counter of some sort. These are the methods you usually use in object-oriented programming. A short intro can be found here [External Link]. These methods require you to create an instance of the given class.
Function without self
Functions without self can be called without creating an instance of the class. This is why you can call them directly on the imported class.
Alternative solution
This is based on the comment of Tom K. Instead of using self, you can also use the decorator @staticmethod to indicate the role of the method within your class. Some more info can be found here [External link].
Final thought
To answer your initial question: you do not need to use self. In your case you do not need self, because you do not share the internal state of an object. Nevertheless, if you are using classes you should think about an object-oriented design.
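To make the distinction concrete, here is a small sketch. The Counter class is a made-up example, not taken from the question.

```python
class Counter:
    def __init__(self):
        self.count = 0          # state that belongs to one instance

    def increment(self):        # instance method: needs an object
        self.count += 1
        return self.count

    @staticmethod
    def describe():             # static method: no self, no instance needed
        return "counts things"

c = Counter()
c.increment()
print(c.count)             # 1
print(Counter.describe())  # counts things
```

Marking describe with @staticmethod also lets it be called on an instance (c.describe()) without the instance being passed in as the first argument.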
I suppose you have a file called data/get_govt_data.py that contains your first code block. You can just rename that file to data/GetGovtData.py, remove the class line and not bother with classes at all, if you like. Then you can do
from data import GetGovtData
df1 = GetGovtData.get_data_1()
Depending on your setup you may need to create an empty file data/__init__.py for Python to see data as a module.
EDIT: Regarding the file naming, Python does not impose any too tight restrictions here. Note however that many projects conventionally use camelCase or CapitalCase to distinguish function, class and module names. Using CapitalCase for a module may confuse others for a second to assume it's a class. You may choose not to follow this convention if you do not want to use classes in your project.
To answer the question in the title first: the exact name 'self' is a convention (one that I can see no valid reason to ignore, BTW), but the first argument of an instance method is always going to be a reference to the instance.
Whether you should use a class or flat functions depends on if the functions have shared state. From your scenario it sounds like they may have a common base URL, authentication data, database names, etc. Maybe you even need to establish a connection first? All those would be best held in the class and then used in the functions.
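As a rough illustration of shared state, assuming the functions really do have a common base URL and credentials: the class name GovtDataClient, the method url_for, and the example.invalid address are all hypothetical.

```python
class GovtDataClient:
    """Hypothetical client whose methods share a base URL and an API key."""

    def __init__(self, base_url, api_key):
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key

    def url_for(self, dataset):
        # Every method builds on the same stored configuration.
        return f"{self.base_url}/{dataset}?key={self.api_key}"

client = GovtDataClient("https://data.example.invalid/api/", "secret")
print(client.url_for("census"))
# https://data.example.invalid/api/census?key=secret
```

If there is no such shared configuration, plain module-level functions remain the simpler choice.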

Defining a class property within __init__ as opposed to within another class method -- python

EDIT
Note, it was brought to my attention that Instance attribute attribute_name defined outside __init__ is a possible duplicate, which I mostly agree with (I didn't come upon this because I didn't know to search for pylint). However, I would like to keep this question open because of the fact that I want to be able to reinitialize my class using the same method. The general consensus in the previous question was to return each parameter from the loadData script and then parse it into the self object. This is fine, however, I would still have to do that again within another method to be able to reinitialize my instance of class, which still seems like extra work for only a little bit more readability. Perhaps the issue is my example. In real life there are about 30 parameters that are read in by the loadData routine, which is why I am hesitant to have to parse them in two different locations.
If the general consensus here is that returning the parameters is the way to go, then we can go ahead and close this question as a duplicate; in the meantime, however, I would like to wait and see if anyone else has any ideas or a good explanation for why.
Original
This is something of a "best practices" question. I have been learning python recently (partially to learn something new and partially to move away from MATLAB). While working in python I created a class that was structured as follows:
class exampleClass:
    """
    This is an example class to demonstrate my question to stack exchange
    """
    def __init__( self, fileName ):
        exampleClass.loadData( self, fileName )
    def loadData( self, fileName ):
        """
        This function reads the data specified in the fileName into the
        current instance of exampleClass.
        :param fileName: The file that the data is to be loaded from
        """
        with open(fileName,'r') as sumFile:
            self.name = sumFile.readline().strip(' \n\r\t')
Now this makes sense to me. I have an __init__ method that populates the current instance of the class by calling a population function. I also have the population function itself, which would allow me to reinitialize a given instance of this class if for some reason I need to (for instance, if the class takes up a lot of memory and, instead of creating separate instances of the class, I just want to have one instance that I overwrite).
However, when I put this code into my IDE (pycharm) it throws a warning that an instance attribute was defined outside of __init__. Now obviously this doesn't affect the operation of the code, everything works fine, but I am wondering if there is any reason to pay attention to the warning in this case. I could do something where I initialize all the properties to some default value in the init method before calling the loadData method but this just seems like unnecessary work to me and like it would slow down the execution (albeit only a very small amount). I could also have essentially two copies of the loadData method, one in the __init__ method and one as an actual method but again this just seems like unnecessary extra work.
Overall, my question is what the best practice would be in this situation. Is there any reason that I should restructure the code in one of the ways I mentioned in the previous paragraph, or is this just an instance of an IDE with too broad a code-inspection warning? I can obviously see some instances where this warning is something to consider, but with my current experience it doesn't look like a problem in this case.
I think it's a best practice to define all of your attributes up front, even if you're going to redefine them later. When I read your code, I want to be able to see your data structures. If there's some attribute hidden in a method that only becomes defined under certain circumstances, it makes it harder to understand the code.
If it is inconvenient or impossible to give an attribute its final value, I recommend at least initializing it to None. This signals to the reader that the object includes that attribute, even if it gets redefined later.
class exampleClass:
    """
    This is an example class to demonstrate my question to stack exchange
    """
    def __init__( self, fileName ):
        # Note: this will be modified when a file is loaded
        self.name = None
        exampleClass.loadData( self, fileName )
Another choice would be for loadData to return the value rather than setting it, so your init might look like:
    def __init__(self, fileName):
        self.name = self.loadData(fileName)
I tend to think this second method is better, but either method is fine. The point is, make your classes and objects as easy to understand as possible.
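The two suggestions can be combined so that the parsing lives in one place even with many parameters, which also keeps reinitialization cheap to write. This is only a sketch: the names ExampleClass, _read and load_data are illustrative, and the one-line file format is an assumption.

```python
import os
import tempfile

class ExampleClass:
    def __init__(self, file_name):
        # Declare every attribute up front so readers (and linters) see it.
        self.name = None
        self.load_data(file_name)

    @staticmethod
    def _read(file_name):
        """Parse the file and return the values without touching any instance."""
        with open(file_name, "r") as handle:
            return {"name": handle.readline().strip()}

    def load_data(self, file_name):
        """(Re)populate this instance; safe to call again later."""
        for attr, value in self._read(file_name).items():
            setattr(self, attr, value)

# Usage with a throwaway file
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\n")
obj = ExampleClass(tmp.name)
print(obj.name)  # first
os.remove(tmp.name)
```

With around 30 parameters, only _read and the declarations in __init__ grow; load_data stays the same and can be called again to reinitialize the object.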

Python/Django OOP modify the following code to show get/set and constructor

Case. I want to modify and add the following behavior to the code below (it's a context processor):
After checking if a user is authenticated check the last time the balance was updated (cookie maybe) if it was updated in the last 5 mins do nothing, else get the new balance as normal.
def get_balance(request):
    if request.user.is_authenticated():
        balance = Account.objects.get(user=request.user).balance
    else:
        balance = 0
    return {'account_balance': balance}
HOWEVER:
I want to learn a little more about OOP in Django/Python. Can someone modify the example to achieve my goal, including the use of:
Property: I come from Java, I want to set and get, it makes more sense to me. get balance if does not exist else create new one.
Constructor method: In Python I think I have to change this to a class and use init right?
UPDATE:
To use a constructor I think I first need to create a class. I'm assuming it is OK to use one as a context processor in Django, something like this:
class BalanceProcessor(request):
    _balance = Account.objects.get(user=request.user).balance
    @property
    def get_balance(self):
        return {'account_balance': _balance}
    @setter???
Python is not Java. In Python you don't create classes for no reason. Classes are for when you have data you want to encapsulate with code. In this case, there is no such thing: you simply get some data and return it. A class would be of no benefit here whatsoever.
In any case, even if you do create a class, once again Python is not Java, and you don't create getters and setters on properties unless you actually need to do some processing when you get and set. If you just want to access an instance attribute, then you simply access it.
Finally, your proposed code will not work for two reasons. Firstly, you are trying to inherit from request. That makes no sense: you should inherit from object unless you are subclassing something. Secondly, how are you expecting your class to be instantiated? Context processors are usually functions, and that means Django is expecting a callable. If you give the class as the context processor, then calling it will instantiate it: but then there's nothing that will call the get_balance method. And your code will fail because Django will pass the request into the instantiation (as it is expecting to do with a function) and your __init__ doesn't expect that parameter.
It's fine to experiment with classes in Python, but a context processor is not the place for it.
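For the five-minute refresh itself, a plain function is enough. The sketch below is framework-free so that it runs on its own: lookup_balance is a hypothetical stand-in for Account.objects.get(user=...).balance, the session key "balance_cache" is invented, is_authenticated is treated as an attribute (as in newer Django versions, whereas the question calls it as a method), and the cached balance is assumed to be something the session can store.

```python
import time
from types import SimpleNamespace

CACHE_SECONDS = 5 * 60  # refresh the balance at most every five minutes

def make_balance_processor(lookup_balance, clock=time.time):
    """Build a get_balance(request) function with a five-minute cache."""
    def get_balance(request):
        if not request.user.is_authenticated:
            return {"account_balance": 0}
        cached = request.session.get("balance_cache")
        now = clock()
        if cached is None or now - cached["at"] >= CACHE_SECONDS:
            # Stale or missing: fetch a fresh value and remember when.
            cached = {"value": lookup_balance(request.user), "at": now}
            request.session["balance_cache"] = cached
        return {"account_balance": cached["value"]}
    return get_balance

# Usage with stand-in objects instead of a real request
calls = []
def fake_lookup(user):
    calls.append(user)
    return 42

request = SimpleNamespace(user=SimpleNamespace(is_authenticated=True), session={})
processor = make_balance_processor(fake_lookup)
print(processor(request))  # {'account_balance': 42}
processor(request)         # second call is served from the session
print(len(calls))          # 1
```

The returned get_balance has the same shape as the function in the question: it takes a request and returns a dict.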

Using static methods in python - best practice

When and how are static methods supposed to be used in Python? We have already established that using a class method as a factory method to create an instance of an object should be avoided when possible. In other words, it is not best practice to use class methods as an alternate constructor (see Factory method for python object - best practice).
Let's say I have a class used to represent some entity data in a database. Imagine the data is a dict object containing field names and field values, and one of the fields is an ID number that makes the data unique.
class Entity(object):
    def __init__(self, data, db_connection):
        self._data = data
        self._db_connection = db_connection
Here my __init__ method takes the entity data dict object. Let's say I only have an ID number and I want to create an Entity instance. First I will need to find the rest of the data, then create an instance of my Entity object. From my previous question, we established that using a class method as a factory method should probably be avoided when possible.
class Entity(object):
    @classmethod
    def from_id(cls, id_number, db_connection):
        filters = [['id', 'is', id_number]]
        data = db_connection.find(filters)
        return cls(data, db_connection)
    def __init__(self, data, db_connection):
        self._data = data
        self._db_connection = db_connection
# Create entity
entity = Entity.from_id(id_number, db_connection)
Above is an example of what not to do, or at least what not to do if there is an alternative. Now I am wondering if editing my class method so that it is more of a utility method and less of a factory method is a valid solution. In other words, does the following example comply with the best practice for using static methods?
class Entity(object):
    @staticmethod
    def data_from_id(id_number, db_connection):
        filters = [['id', 'is', id_number]]
        data = db_connection.find(filters)
        return data
# Create entity
data = Entity.data_from_id(id_number, db_connection)
entity = Entity(data, db_connection)
Or does it make more sense to use a standalone function to find the entity data from an ID number?
def find_data_from_id(id_number, db_connection):
    filters = [['id', 'is', id_number]]
    data = db_connection.find(filters)
    return data
# Create entity.
data = find_data_from_id(id_number, db_connection)
entity = Entity(data, db_connection)
Note: I do not want to change my __init__ method. Previously people have suggested making my __init__ method to look something like this __init__(self, data=None, id_number=None) but there could be 101 different ways to find the entity data so I would prefer to keep that logic separate to some extent. Make sense?
When and how are static methods supposed to be used in Python?
The glib answer is: Not very often.
The even glibber but not quite as useless answer is: When they make your code more readable.
First, let's take a detour to the docs:
Static methods in Python are similar to those found in Java or C++. Also see classmethod() for a variant that is useful for creating alternate class constructors.
So, when you need a static method in C++, you need a static method in Python, right?
Well, no.
In Java, there are no functions, just methods, so you end up creating pseudo-classes that are just bundles of static methods. The way to do the same thing in Python is to just use free functions.
That's pretty obvious. However, it's good Java style to look as hard as possible for an appropriate class to wedge a function into, so you can avoid writing those pseudo-classes, while doing the same thing is bad Python style—again, use free functions—and this is much less obvious.
C++ doesn't have the same limitation as Java, but many C++ styles are pretty similar anyway. (On the other hand, if you're a "Modern C++" programmer who's internalized the "free functions are part of a class's interface" idiom, your instincts for "where are static methods useful" are probably pretty decent for Python.)
But if you're coming at this from first principles, rather than from another language, there's a simpler way to look at things:
A @staticmethod is basically just a global function. If you have a function foo_module.bar() that would be more readable for some reason if it were spelled as foo_module.BazClass.bar(), make it a @staticmethod. If not, don't. That's really all there is to it. The only problem is building up your instincts for what's more readable to an idiomatic Python programmer.
And of course use a @classmethod when you need access to the class, but not the instance; alternate constructors are the paradigm case for that, as the docs imply. Although you often can simulate a @classmethod with a @staticmethod just by explicitly referencing the class (especially when you don't have much subclassing), you shouldn't.
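A short sketch of why hard-coding the class in a static method falls down once subclasses appear. The classes Base and Child are invented for the illustration.

```python
class Base:
    @classmethod
    def make(cls):
        return cls()        # builds whatever class it was called on

    @staticmethod
    def make_static():
        return Base()       # hard-coded: always builds a Base

class Child(Base):
    pass

print(type(Child.make()).__name__)         # Child
print(type(Child.make_static()).__name__)  # Base
```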
Finally, getting to your specific question:
If the only reason clients ever need to look up data by ID is to construct an Entity, that sounds like an implementation detail you shouldn't be exposing, and it also makes client code more complex. Just use a constructor. If you don't want to modify your __init__ (and you're right that there are good reasons you might not want to), use a @classmethod as an alternate constructor: Entity.from_id(id_number, db_connection).
On the other hand, if that lookup is something that's inherently useful to clients in other cases that have nothing to do with Entity construction, it seems like this has nothing to do with the Entity class (or at least no more than anything else in the same module). So, just make it a free function.
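Both options can be sketched side by side. FakeConnection is a hypothetical stand-in for the question's db_connection, written only so the example runs; the real connection's find method may behave differently.

```python
class FakeConnection:
    """Hypothetical stand-in for the question's db_connection."""
    def __init__(self, rows):
        self._rows = rows

    def find(self, filters):
        # Only understands the single ['id', 'is', value] filter used here.
        _field, _op, wanted = filters[0]
        return self._rows[wanted]

def find_data_from_id(id_number, db_connection):
    """Free function: useful when callers want the raw data on its own."""
    return db_connection.find([['id', 'is', id_number]])

class Entity(object):
    def __init__(self, data, db_connection):
        self._data = data
        self._db_connection = db_connection

    @classmethod
    def from_id(cls, id_number, db_connection):
        """Alternate constructor: look the data up, then build the entity."""
        return cls(find_data_from_id(id_number, db_connection), db_connection)

conn = FakeConnection({7: {'id': 7, 'name': 'widget'}})
entity = Entity.from_id(7, conn)
print(entity._data['name'])  # widget
```

Client code that only wants an Entity calls Entity.from_id and never sees the intermediate data; client code that wants the data alone calls the free function.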
The answer to the linked question specifically says this:
A @classmethod is the idiomatic way to do an "alternate constructor" (there are examples all over the stdlib: itertools.chain.from_iterable, datetime.datetime.fromordinal, etc.).
So I don't know how you got the idea that using a classmethod is inherently bad. I actually like the idea of using a classmethod in your specific situation, as it makes following the code and using the api easy.
The alternative would be to use default constructor arguments like so:
class Entity(object):
    def __init__(self, id, db_connection, data=None):
        self.id = id
        self.db_connection = db_connection
        if data is None:
            self.data = self.from_id(id, db_connection)
        else:
            self.data = data
    @classmethod
    def from_id(cls, id_number, db_connection):
        filters = [['id', 'is', id_number]]
        return db_connection.find(filters)
I prefer the classmethod version that you wrote originally however. Especially since data is fairly ambiguous.
Your first example makes the most sense to me: Entity.from_id is pretty succinct and clear.
It avoids the use of data in the next two examples, which does not describe what's being returned; the data is used to construct an Entity. If you wanted to be specific about the data being used to construct the Entity, then you could name your method something like Entity.with_data_for_id or the equivalent function entity_with_data_for_id.
Using a verb such as find can also be pretty confusing, as it doesn't give any indication of the return value — what is the function supposed to do when it's found the data? (Yes, I realize str has a find method; wouldn't it be better named index_of? But then there's also index...) It reminds me of the classic:
I always try to think what a name would indicate to someone with (a) no knowledge of the system, and (b) knowledge of other parts of the system — not to say I'm always successful!
Here is a decent use case for @staticmethod.
I have been working on a game as a side project. Part of that game includes rolling dice based on stats, and the possibility of picking up items and effects that impact your character's stats (for better or worse).
When I roll the dice in my game, I need to basically say... take the base character stats and then add any inventory and effect stats into this grand netted figure.
You can't take these abstract objects and add them without instructing the program how. I'm not doing anything at the class level or instance level either. I didn't want to define the function in some global module. The last best option was to go with a static method for adding up stats together. It just makes the most sense this way.
class Stats:
    attribs = ['strength', 'speed', 'intellect', 'tenacity']

    def __init__(self,
                 strength=0,
                 speed=0,
                 intellect=0,
                 tenacity=0
                 ):
        self.strength = int(strength)
        self.speed = int(speed)
        self.intellect = int(intellect)
        self.tenacity = int(tenacity)

    # combine adds stats objects together and returns a single stats object
    @staticmethod
    def combine(*args: 'Stats'):
        assert all(isinstance(arg, Stats) for arg in args)
        return_stats = Stats()
        for stat in Stats.attribs:
            for arg in args:
                setattr(return_stats, stat,
                        getattr(return_stats, stat) + getattr(arg, stat))
        return return_stats
Which would make the stat combination calls work like this
a = Stats(strength=3, intellect=3)
b = Stats(strength=1, intellect=-1)
c = Stats(tenacity=5)
print(Stats.combine(a, b, c).__dict__)
{'strength': 4, 'speed': 0, 'intellect': 2, 'tenacity': 5}
