So, I've read most of the docs and I've been looking around on SO a bit, but I can't quite find the answer to my question. I'll start with the code.
# Manager
class ActiveManager(models.Manager):
    def get_query_set(self):
        return super(ActiveManager, self).get_query_set().filter(is_active=True)

# Model
class ModelA(models.Model):
    # ...
    is_active = models.BooleanField()
    objects = ActiveManager()
    all_objects = models.Manager()
So, while I was playing around I noticed that if I wrote it this way and used get_object_or_404(), then it would use the ActiveManager to first search for all active records and then return the one related to my query. However, if I switched the order of the managers:
class ModelA(models.Model):
    # ...
    all_objects = models.Manager()
    objects = ActiveManager()
Then it uses the default manager, in this case all_objects, to do the query. I'm wondering what other functions this change impacts.
EDIT: I understand that the first manager found in the class becomes the default manager, but I'm wondering which specific functions use this default manager (like get_object_or_404)
Here's the relevant bit from the docs: "If you use custom Manager objects, take note that the first Manager Django encounters (in the order in which they're defined in the model) has a special status. Django interprets the first Manager defined in a class as the "default" Manager, and several parts of Django (including dumpdata) will use that Manager exclusively for that model. As a result, it's a good idea to be careful in your choice of default manager in order to avoid a situation where overriding get_query_set() results in an inability to retrieve objects you'd like to work with".
If you look at the way get_object_or_404 is implemented, they use the _default_manager attribute of the model, which is how Django refers to the first manager encountered. (As far as I know, all Django internals work this way -- they never use Model.objects etc. because you shouldn't assume the default manager happens to be called objects).
It affects many things. The default name for the manager, objects, is just that, a default, and it's not required. If you didn't include objects in your model definition and just defined a manager as all_objects, ModelA.objects wouldn't exist. Django only adds a manager named objects if the model defines no managers of its own.
Because of this, Django takes the first manager defined in a model and calls that the "default", and later uses the "default" manager any time it needs to reference the model's manager (because, again, it can't simply use objects, since objects might not be defined).
The rule of thumb is that the standard manager that Django should use (in a sense, the manager that should most normally be used), should be the first one defined, whether it be assigned to objects or something else entirely. Every other additional manager should come after that.
My IDE keeps suggesting I convert my instance methods to static methods. I guess because I haven't referenced any self within these methods.
An example:
class NotificationViewSet(NSViewSet):
    def pre_create_processing(self, request, obj):
        log.debug(" creating messages ")
        # Ensure data is consistent and belongs to the sending bot.
        obj['user_id'] = request.auth.owner.id
        obj['bot_id'] = request.auth.id
So my question would be: do I lose anything by just ignoring the IDE suggestions, or is there more to it?
This is a matter of workflow, intentions with your design, and also a somewhat subjective decision.
First of all, you are right, your IDE suggests converting the method to a static method because the method does not use the instance. It is most likely a good idea to follow this suggestion, but you might have a few reasons to ignore it.
Possible reasons to ignore it:
The code is soon to be changed to use the instance (on the other hand, the idea of soon is subjective, so be careful)
The code is legacy and not entirely understood/known
The interface is used in a polymorphic/duck typed way (e.g. you have a collection of objects with this method and you want to call them in a uniform way, but the implementation in this class happens to not need to use the instance - which is a bit of a code smell)
The interface is specified externally and cannot be changed (this is analogous to the previous reason)
The AST of the code is read/manipulated either by itself or something that uses it and expects this method to be an instance method (this again is an external dependency on the interface)
I'm sure there can be more, but failing these types of reasons I would follow the suggestion. However, if the method does not belong to the class (e.g. factory method or something similar), I would refactor it to not be part of the class.
I think that you might be mixing up some terminology - the example is not a class method. Class methods receive the class as the first argument, they do not receive the instance. In this case you have a normal instance method that is not using its instance.
If the method does not belong in the class, you can move it out of the class and make it a standard function. Otherwise, if it should be bundled as part of the class, e.g. it's a factory function, then you should probably make it a static method, as this (at a minimum) serves as useful documentation to users of your class that the method is coupled to the class but not dependent on its state.
Making the method static also has the advantage that it can be overridden in subclasses of the class. If the method were moved out of the class as a regular function, subclasses could not override it.
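To illustrate the point about overriding, here is a minimal sketch. The class and method names (Notifier, LoudNotifier, format_message) are made up for this example and are not from the question's code. The key detail is that the static method is called through self, so the subclass version is picked up.

```python
class Notifier:
    @staticmethod
    def format_message(text):
        # Does not need the instance, so it is declared static.
        return "[info] " + text

    def send(self, text):
        # Looked up through self, so a subclass override is honoured.
        return self.format_message(text)


class LoudNotifier(Notifier):
    @staticmethod
    def format_message(text):
        # Overrides the static method inherited from Notifier.
        return "[INFO] " + text.upper()


plain = Notifier().send("saved")
loud = LoudNotifier().send("saved")
```

If send() called a module-level function instead, LoudNotifier would have no way to swap in its own formatting short of overriding send() itself.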
I have a class from which all my entity definitions inherit:
class Model(db.Model):
    """Superclass for all others; contains generic properties and methods."""
    created = db.DateTimeProperty(auto_now_add=True)
    modified = db.DateTimeProperty(auto_now=True)
For various reasons I want to be able to occasionally modify an entity without changing its modified property. I found this example:
Model.__dict__["modified"].__dict__["auto_now"] = False
db.put(my_entity)
Model.__dict__["modified"].__dict__["auto_now"] = True
When I test this locally, it works great. My question is this: could this have wider ramifications for any other code that happens to be saving entities during the small period of time Model is altered? I could see that leading to incredibly confusing bugs. But maybe this little change only affects the current process/thread or whatever?
Any other request that comes in to the same instance and is handled while the put is in progress will also see auto_now=False. It is unlikely, but it is possible.
Something else to consider:
You don't have a try block around this code. If you get a timeout or an error during the put(), your code will leave the model in the modified state with auto_now=False.
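As a rough sketch of the missing try block, the flag can be restored in a finally clause, here wrapped in a context manager. FakeProperty and failing_put are hypothetical stand-ins so the example runs without the datastore; they are not part of any real API. Note that this only guarantees the flag is restored after an error. It does nothing about other requests seeing auto_now=False in the meantime.

```python
from contextlib import contextmanager


class FakeProperty:
    """Hypothetical stand-in for a property object carrying an auto_now flag."""

    def __init__(self, auto_now):
        self.auto_now = auto_now


@contextmanager
def auto_now_disabled(prop):
    # Remember the current value, switch the flag off, and always restore it.
    previous = prop.auto_now
    prop.auto_now = False
    try:
        yield prop
    finally:
        prop.auto_now = previous


def failing_put():
    # Stand-in for a put() that fails part-way through.
    raise RuntimeError("simulated datastore timeout")


modified = FakeProperty(auto_now=True)
try:
    with auto_now_disabled(modified):
        failing_put()
except RuntimeError:
    pass  # the error still propagates; the flag has been restored regardless
```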
Personally, I think it's a bad idea and will definitely be a source of errors.
There are a number of ways of achieving this without manipulating models:
Consider setting the default behaviour to auto_now=False and having two methods for updating, e.g. save() and save_without_modified(). The primary method sets the modified time to datetime.now() just before it does the put().
A better method would be to override put() in your class so that it sets modified and then calls the super put(). Have put() accept a new argument, like modified=False, so that you can skip setting the modified date before calling super.
Lastly, you could use the _pre_put hook to run code before the put() call, but you need to annotate the instance in some way so the _pre_put method can determine whether modified needs to be set or not.
I think each of these strategies is a lot safer than hacking the model.
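The put() override could look roughly like the sketch below. FakeDbModel is a hypothetical stand-in for the real datastore base class so that the example runs on its own, and the keyword name update_modified is my own choice for illustration. It assumes modified is declared without auto_now.

```python
import datetime


class FakeDbModel:
    """Hypothetical stand-in for the datastore base class; put() only records the call."""

    def put(self):
        self.saved = True


class Model(FakeDbModel):
    def __init__(self):
        self.modified = None

    def put(self, update_modified=True):
        # Stamp the entity unless the caller explicitly opts out.
        if update_modified:
            self.modified = datetime.datetime.now()
        return super().put()


entity = Model()
entity.put(update_modified=False)   # saved, timestamp left untouched
untouched = entity.modified
entity.put()                        # saved, timestamp refreshed
```

Because the decision is made per call, no shared class state changes, so concurrent requests cannot affect each other.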
Is there a way to change the default object manager for all Models? (which would include the object managers on third party apps)
The default manager is attached in the function ensure_default_manager in django.db.models.manager. It attaches by default a manager of class Manager. You could monkeypatch this function to attach a different (subclass of) Manager.
But you have to consider whether this is the most ideal solution to the problem you're trying to solve.
If you really need to do that, modify the Django code itself. Monkey patching is also an option; there are a lot of techniques for that out there.
After reading up on Django Managers, I'm still unsure how much benefit I will get by using it. It seems that the best use is to add custom queries (read-only) methods like XYZ.objects.findBy*(). But I can easily do that with static methods off of the Model classes themselves.
I prefer the latter always because:
code locality in terms of readability and easier maintenance
slightly less verbose as I don't need the objects property in my calls
Manager classes have weird rules regarding model inheritance, might as well stay clear of that.
Is there any good reason not to use static methods and instead use Manager classes?
Adding custom queries to managers is the Django convention. From the Django docs on custom managers:
Adding extra Manager methods is the preferred way to add "table-level" functionality to your models.
If it's your own private app, the convention word doesn't matter so much - indeed my company's internal codebase has a few classmethods that perhaps belong in a custom manager.
However, if you're writing an app that you're going to share with other Django users, then they'll expect to see findBy on a custom manager.
I don't think the inheritance issues you mention are too bad. If you read the custom managers and model inheritance docs, I don't think you'll get caught out. The verbosity of writing .objects is bearable, just as it is when we do queries using XYZ.objects.get() and XYZ.objects.all()
Here are a few advantages of using manager methods, in my opinion:
Consistency of API. Your method findBy belongs with get, filter, aggregate and the rest. Want to know what lookups you can do on the XYZ.objects manager? It's simple when you can introspect with dir(XYZ.objects).
Static methods "clutter" the instance namespace. XYZ.findBy() is fine but if you define a static method, you can also do xyz.findBy(). Running the findBy lookup on a particular instance doesn't really make sense.
DRYness. Sometimes you can use the same manager on more than one model.
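The namespace point can be shown in plain Python. This is only a sketch: ManagerDescriptor here is a simplified stand-in, loosely modelled on the way Django refuses manager access from instances, and XYZManager and find_by_name are made-up names.

```python
class ManagerDescriptor:
    """Simplified stand-in: exposes a manager on the class but not on instances."""

    def __init__(self, manager):
        self.manager = manager

    def __get__(self, instance, owner):
        if instance is not None:
            raise AttributeError(
                "Manager isn't accessible via %s instances" % owner.__name__
            )
        return self.manager


class XYZManager:
    def find_by_name(self, name):
        return "manager lookup for " + name


class XYZ:
    objects = ManagerDescriptor(XYZManager())

    @staticmethod
    def find_by_name(name):
        return "static lookup for " + name


xyz = XYZ()
via_class = XYZ.find_by_name("a")
via_instance = xyz.find_by_name("a")          # works, although it makes little sense
via_manager = XYZ.objects.find_by_name("a")   # only reachable through the class
instance_has_manager = hasattr(xyz, "objects")
```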
Having said all that, it's up to you. I'm not aware of a killer reason why you should not use a static method. You're an adult, it's your code, and if you don't want to write findBy as a manager method, the sky isn't going to fall in ;)
For further reading, I recommend the blog post Managers versus class methods by James Bennett, the Django release manager.
I need to do some sort of processing on objects just before they get pickled. More precisely, for instances of subclasses of a certain base class, I would like something totally different to be pickled instead and then recreated on loading.
I'm aware of __getstate__ & __setstate__ however this is a very invasive approach. My understanding is that these are private methods (begin with double underscore: __), and as such are subject to name mangling. Therefore this effectively would force me to redefine those two methods for every single class that I want to be subject to this non standard behavior. In addition I don't really have a full control over the hierarchy of all classes.
I was wondering if there is some sort of brief way of hooking into pickling process and applying this sort of control that __getstate__ and __setstate__ give but without having to modify the pickled classes as such.
A side note for the curious ones. This is a use case taken from a project using Django and Celery. Django models are either unpicklable or very impractical and cumbersome to pickle. Therefore it's much more advisable to pickle pairs of values, ID + model class, instead. However, sometimes it's not the model directly that is pickled but rather a dictionary of models, a list of models, a list of lists of models, you name it. This forces me to write a lot of copy-paste code that I really dislike. The need for pickling models comes from the Django-celery setup itself, where functions along with their call arguments are scheduled for later execution. Unfortunately, among those arguments there are usually a lot of models mixed up in some nontrivial hierarchy.
EDIT
I do have a possibility of specifying a custom serializer to be used by Celery, so it's really a question of being able to build a slightly modified serializer on top of pickle without much effort.
The only additional hooks that are related are __reduce__() and __reduce_ex__():
http://docs.python.org/library/pickle.html
What is the difference between __reduce__ and __reduce_ex__?
Python: Ensuring my class gets pickled only with the latest protocol
Not sure if they really provide what you need in particular.
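For what it's worth, here is a minimal sketch of the __reduce__ route, assuming you are able to add one method to the common base class. Names that both start and end with two underscores are not name-mangled, so a single definition on the base class is inherited by every subclass. The _STORE dict, _load function, and Base/Article classes are hypothetical stand-ins for a database lookup and model classes.

```python
import pickle

# Hypothetical in-memory stand-in for a database, keyed by (class name, id).
_STORE = {}


def _load(cls, obj_id):
    """Recreate an object from its class and ID by looking it up in the store."""
    return _STORE[(cls.__name__, obj_id)]


class Base:
    def __init__(self, obj_id, payload):
        self.id = obj_id
        self.payload = payload
        _STORE[(type(self).__name__, obj_id)] = self

    def __reduce__(self):
        # Pickle only the (class, id) pair; unpickling calls _load(cls, id).
        return (_load, (type(self), self.id))


class Article(Base):
    pass  # inherits __reduce__ unchanged


article = Article(1, "large payload that never gets pickled")
data = pickle.dumps({"items": [[article]]})
restored = pickle.loads(data)
```

Because pickle calls __reduce__ on every object it meets, this works however deeply the instances are nested in dicts and lists, with no per-container code.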