I'm writing a Django view that sometimes gets data from the database, and sometimes from an external API.
When it comes from the database, it is a Django model instance. Attributes must be accessed with dot notation.
Coming from the API, the data is a dictionary and is accessed through subscript notation.
In either case, some processing is done on the data.
I'd like to avoid
if from_DB:
    item.image_url = 'http://example.com/{0}'.format(item.image_id)
else:
    item['image_url'] = 'http://example.com/{0}'.format(item['image_id'])
I'm trying to find a more elegant, DRY way to do this.
Is there a way to get/set by key that works on either dictionaries or objects?
You could use a Bunch class, which transforms the dictionary into something that accepts dot notation.
In JavaScript the two notations are equivalent (often useful; I mention it in case you didn't know, as you're doing web development), but in Python they're different: subscript access like item['key'] versus attribute access like item.key.
It's easy to write something which allows access through attributes, using __getattr__:
class AttrDict(dict):
    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails; note a
        # missing key raises KeyError rather than AttributeError.
        return self[attr]

    def __setattr__(self, attr, value):
        # Route attribute assignment into the underlying dict.
        self[attr] = value
Then just use it as you'd use a dict (it'll accept a dict as a constructor argument, since it extends dict), but you can also do things like item.image_url, which maps to item['image_url'] for both getting and setting.
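For instance, a minimal usage sketch (the api_data dict below is an invented stand-in for the external API's payload):

api_data = {'image_id': 42}  # invented example payload from the API
item = AttrDict(api_data)
item.image_url = 'http://example.com/{0}'.format(item.image_id)
print(item['image_url'])  # http://example.com/42 - both notations now work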
I don't know what the implications will be, but I would add a method to the django model which reads the dictionary into itself, so you can access the data through the model.
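A rough sketch of that idea (the Item model and load_dict name are invented for illustration, not from the original code):

from django.db import models

class Item(models.Model):
    image_id = models.IntegerField()

    def load_dict(self, data):
        # Copy each API field onto this (unsaved) instance so the
        # rest of the code can use dot notation in both cases.
        for key, value in data.items():
            setattr(self, key, value)

item = Item()
item.load_dict({'image_id': 42})
item.image_url = 'http://example.com/{0}'.format(item.image_id)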
Related
In a nutshell, I receive JSON events via an API, and recently I've been learning a lot more about classes. One of the recommended ways to use classes is to implement getters, setters, etc. However, my classes aren't too sophisticated; all they're doing is parsing data from a JSON object and passing better-formatted data on to further ETL processes.
Below is a simple example of what I've encountered.
data = {'status': 'ready'}

class StatusHandler:
    def __init__(self, data):
        self.status = data.get('status', None)

class StatusHandler2:
    def __init__(self, data):
        self._status = data.get('status', None)

    @property
    def status(self):
        return self._status

without_getter = StatusHandler(data)
print(without_getter.status)

with_getter = StatusHandler2(data)
print(with_getter.status)
Is there anything wrong with me using the class StatusHandler and referencing a status instance variable and using that to pass information forward to other bits of code? I'm just wondering if further down the line as my project gets more complicated that this would be an issue as it doesn't seem to be standard although I could be wrong...
The point of getters/setters is to allow you to replace plain attribute access with computed access without breaking client code if and when you have to change your implementation. This only makes sense for languages that have no support for computed attributes.
Python has quite strong support for computed attributes through the descriptor protocol, including the generic built-in property type, so you don't need explicit getters/setters: if you have to change your implementation, just replace the affected public attributes with computed ones.
Just make sure not to abuse computed attributes: they should not do any heavy computation, external resource access, or the like. No one expects what looks like a plain attribute to have a high cost or to raise IOError ;-)
EDIT
With regard to your example: computed attributes are a way to control attribute access, and making an attribute read-only (not providing a setter for your property) IS a perfectly valid use case - IF you have a reason to make it read-only of course.
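To make that concrete, here is a minimal sketch (not from the original post; StatusHandler3 is an invented name) of swapping a plain attribute for a computed, read-only one without touching client code:

class StatusHandler3:
    def __init__(self, data):
        # This started life as a plain attribute: self.status = data.get('status')
        self._status = data.get('status')

    @property
    def status(self):
        # Clients still read handler.status exactly as before.
        return self._status

handler = StatusHandler3({'status': 'ready'})
print(handler.status)       # ready
# handler.status = 'done'   # would raise AttributeError: no setter defined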
I've created a subclass of dict as per this question. What I'd like to do is be able to create a new dictionary of my subclass by using bracket notation. Namely, when I have {"key":"value"}, I'd like it to call mydict(("key","value")). Is this possible?
No. And for good reasons: it would violate the expectations of people reading the code. It would also likely break any third-party libraries that you import which would (reasonably) expect dict literals to be standard Python dictionaries.
A better method to fix this without lots of extra typing is to simply add a classmethod on your custom dictionary class that consumes the literal and returns an instance of your custom dictionary. (Note that from is a reserved word in Python, so the method needs a different name, such as from_dict.)
MyDict.from_dict({
    "key": "value"
})
Where an implementation of from_dict might look something like:
@classmethod
def from_dict(cls, dictionary):
    new_inst = cls()
    for key, value in dictionary.items():
        new_inst[key] = value
    return new_inst
Edit based on comment:
user2357112 correctly points out that you could just use the constructor, as long as your custom class keeps the same constructor signature as dict:
some_instance = MyDict({"key": "value"})
If you've messed with it though, you'll have to go the custom route a la from_dict.
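For example, a contrived sketch (not from the original answer) of a subclass whose constructor has been repurposed, making a from_dict-style classmethod necessary:

class MyDict(dict):
    def __init__(self, owner):
        # Constructor no longer accepts a mapping, so MyDict({...}) breaks.
        super(MyDict, self).__init__()
        self.owner = owner

    @classmethod
    def from_dict(cls, dictionary, owner=None):
        new_inst = cls(owner)
        for key, value in dictionary.items():
            new_inst[key] = value
        return new_inst

d = MyDict.from_dict({"key": "value"}, owner="me")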
In my Python app, I need to create a simple RequestContext object that contains the actual request to send, as well as some other metadata about the request, e.g. index, id, source, etc. Also, it's likely that more metadata will be added.
I can think of two ways to do this:
Option 1: Attributes of the RequestContext:
class RequestContext(object):
    def __init__(self, request, index=0, source=None):
        self.request = request
        self.index = index
        self.source = source
        ...
Option 2: Dictionary in the RequestContext:
class RequestContext(object):
    def __init__(self, request):
        self.request = request
        self.context_info = {}
Then users can add whatever context info they want.
I personally like accessing values through attributes, because you have a predefined set of attributes that you know are there.
On the other hand, a dictionary lets the client code (owned by me) add more metadata without having to update the RequestContext object definition.
So which one would be better in terms of ease of use and ease of adding more metadata? Are there other pitfalls or considerations that I should think about?
First, you seem to be operating under a misapprehension:
On the other hand, a dictionary lets the client code (owned by me) add more metadata without having to update the RequestContext object definition.
You don't have to update the class definition to add new attributes. (You can force the class to have a fixed set of attributes in various ways—e.g., by using __slots__. But if you don't do any of that, you can add new attributes on the fly.) For example:
>>> class C(object):
...     def __init__(self, x):
...         self.x = x
...
>>> c = C(10)
>>> c.y = 20
>>> print(c.x, c.y)
10 20
In fact, if you look under the covers, attributes are (by default) stored in a perfectly normal dictionary, named __dict__:
>>> print(c.__dict__)
{'x': 10, 'y': 20}
So, what is the difference between using attributes, vs. just adding a dictionary attribute and using members of that dictionary (or, alternatively, inheriting from or delegating to a dict)?
Really, it's the same as the difference between using separate variables vs. a single dict at the top level.
One way to look at it is whether the name-value pairs are data, or whether the names are part of your program and only the values are data. In the former case, you want a dict; in the latter case, you want separate variables.
Alternatively, you can ask how dynamic the names are. If there's an open-ended set of values whose names are only known at runtime, you want a dict. If there's a fixed set of values whose names are hardcoded into your source, you want attributes.
Finally, just ask yourself how often you'd have to use getattr and setattr if you went with attributes. If the answer is "frequently", you want a dict; if it's "never", you want attributes.
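As a rough illustration (all names invented here), frequent getattr/setattr is the telltale sign:

class Ctx(object):
    pass

def compute(name):
    return len(name)  # stand-in for real work

field_names = ['index', 'source']  # imagine these only arrive at runtime
ctx = Ctx()

# With attributes, runtime-determined names force getattr/setattr everywhere:
for name in field_names:
    setattr(ctx, name, compute(name))
print(getattr(ctx, 'index'))

# With a dict, the same access is direct:
info = {name: compute(name) for name in field_names}
print(info['index'])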
In many real-life apps, it's not entirely clear, because you're somewhere between the two. Sometimes, rethinking your design can make it clearly one or the other, but sometimes things are just inherently "sort of dynamic". In that case, you have to make a judgment call: decide which of the two cases you're closest to, and code as if that were the real case.
It may be worth looking at some real-life open source apps that are similar to yours. For example, you're dealing with metadata about some kind of requests. Maybe look at how requests and pycurl deal with HTTP information that's kind of like metadata, like headers and status lines. Or maybe look at how QuodLibet and MusicBrainz Picard deal with metadata in a different domain, music files. And so on.
I am working on a project where I have a number of custom classes to interface with a varied collection of data on a user's system. These classes only have properties as user-facing attributes. Some of these properties are decently resource intensive, so I want to only run the generation code once, and store the returned value on disk (cache it, that is) for faster retrieval on subsequent runs. As it stands, this is how I am accomplishing this:
import functools

def stored_property(func):
    """This ``decorator`` adds on-disk functionality to the `property`
    decorator. This decorator is also a Method Decorator.

    Each key property of a class is stored in a settings JSON file with
    a dictionary of property names and values (e.g. :class:`MyClass`
    stores its properties in `my_class.json`).
    """
    @property
    @functools.wraps(func)
    def func_wrapper(self):
        print('running decorator...')
        try:
            var = self.properties[func.__name__]
            if var:
                # property already written to disk
                return var
            else:
                # property written to disk as `null`
                return func(self)
        except AttributeError:
            # `self.properties` does not yet exist
            return func(self)
        except KeyError:
            # `self.properties` exists, but property is not a key
            return func(self)
    return func_wrapper
class MyClass(object):
    def __init__(self, wf):
        self.wf = wf
        self.properties = self._properties()

    def _properties(self):
        # get name of class in underscore format
        class_name = convert(self.__class__.__name__)
        # this is a library used (in Alfred workflows) for interacting with data stored on disk
        properties = self.wf.stored_data(class_name)
        # if no file on disk, or one of the properties has a null value
        if properties is None or None in properties.values():
            # get names of all properties of this class
            propnames = [k for (k, v) in self.__class__.__dict__.items()
                         if isinstance(v, property)]
            properties = dict()
            for prop in propnames:
                # generate dictionary of property names and values
                properties[prop] = getattr(self, prop)
            # use the external library to save that dictionary to disk in JSON format
            self.wf.store_data(class_name, properties,
                               serializer='json')
        # return either the data read from file, or data generated in situ
        return properties

    # this decorator ensures that the generating code is only run if necessary
    @stored_property
    def only_property(self):
        # some code to get data
        return 'this is my property'
This code works precisely as I need it, but it still forces me to manually add the _properties(self) method to each class wherein I need this functionality (currently, I have 3). What I want is a way to "insert" this functionality into any class I please. I think that a Class Decorator could get this job done, but try as I might, I can't quite figure out how to wrangle it.
For the sake of clarity (and in case a decorator is not the best way to get what I want), I will try to explain the overall functionality I am after. I want to write a class that contains some properties. The values of these properties are generated via various degrees of complex code (in one instance, I'm searching for a certain app's pref file, then searching for 3 different preferences (any of which may or may not exist) and determining the best single result from those preferences). I want the body of the properties' code only to contain the algorithm for finding the data. But, I don't want to run that algorithmic code each time I access that property. Once I generate the value once, I want to write it to disk and then simply read that on all subsequent calls. However, I don't want each value written to its own file; I want a dictionary of all the values of all the properties of a single class to be written to one file (so, in the example above, my_class.json would contain a JSON dictionary with one key, value pair).
When accessing the property directly, it should first check to see if it already exists in the dictionary on disk. If it does, simply read and return that value. If it exists, but has a null value, then try to run the generation code (i.e. the code actually written in the property method) and see if you can find it now (if not, the method will return None and that will once again be written to file). If the dictionary exists and that property is not a key (my current code doesn't really make this possible, but better safe than sorry), run the generation code and add the key, value pair. If the dictionary doesn't exist (i.e. on the first instantiation of the class), run all generation code for all properties and create the JSON file.
Ideally, the code would be able to update one property in the JSON file without rerunning all of the generation code (i.e. running _properties() again).
I know this is a bit peculiar, but I need the speed, human-readable content, and elegant code all together. I would really rather not compromise on any of those goals. Hopefully, the description of what I want is clear enough. If not, let me know in a comment what doesn't make sense and I will try to clarify. But I do think that a Class Decorator could probably get me there (essentially by inserting the _properties() method into any class, running it on instantiation, and mapping its value to the properties attribute of the class).
Maybe I'm missing something, but it doesn't seem that your _properties method is specific to the properties that a given class has. I'd put it in a base class and have each of your classes with @stored_property methods subclass that. Then you don't need to duplicate the _properties method.
class PropertyBase(object):
    def __init__(self, wf):
        self.wf = wf
        self.properties = self._properties()

    def _properties(self):
        # As before...

class MyClass(PropertyBase):
    @stored_property
    def expensive_to_calculate(self):
        # Calculate it here
If for some reason you can't subclass PropertyBase directly (maybe you already need to have a different base class), you can probably use a mixin. Failing that, make _properties accept an instance/class and a workflow object and call it explicitly in __init__ for each class.
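A rough sketch of the mixin variant (StoredPropertiesMixin, init_properties, and SomeOtherBase are invented names for illustration):

class StoredPropertiesMixin(object):
    # No __init__ of its own, so it composes with an existing base class.
    # _properties would be defined here exactly as in PropertyBase above.
    def init_properties(self, wf):
        self.wf = wf
        self.properties = self._properties()

class MyClass(SomeOtherBase, StoredPropertiesMixin):
    def __init__(self, wf):
        SomeOtherBase.__init__(self)
        self.init_properties(wf)

    @stored_property
    def expensive_to_calculate(self):
        return 'expensive value'  # calculate it here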
I have several TextField columns on my UserProfile object which contain JSON objects. I've also defined a setter/getter property for each column which encapsulates the logic for serializing and deserializing the JSON into python datastructures.
The nature of this data ensures that it will be accessed many times by view and template logic within a single Request. To save on deserialization costs, I would like to memoize the python datastructures on read, invalidating on direct write to the property or save signal from the model object.
Where/How do I store the memo? I'm nervous about using instance variables, as I don't understand the magic behind how any particular UserProfile is instantiated by a query. Is __init__ safe to use, or do I need to check the existence of the memo attribute via hasattr() at each read?
Here's an example of my current implementation:
class UserProfile(Model):
    text_json = models.TextField(default=text_defaults)

    @property
    def text(self):
        if not hasattr(self, "text_memo"):
            self.text_memo = None
        self.text_memo = self.text_memo or simplejson.loads(self.text_json)
        return self.text_memo

    @text.setter
    def text(self, value=None):
        self.text_memo = None
        self.text_json = simplejson.dumps(value)
You may be interested in a built-in Django helper, django.utils.functional.memoize.
Django uses this to cache expensive operation like url resolving.
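A sketch of how it was used; note that memoize wraps a function rather than being applied with @ syntax, and it was removed from Django in later releases, so treat this as historical (the function and cache names below are invented):

from django.utils.functional import memoize

_text_cache = {}

def load_text(json_string):
    return simplejson.loads(json_string)

# Third argument is the number of arguments used as the cache key.
load_text = memoize(load_text, _text_cache, 1)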
Generally, I use a pattern like this:
def get_expensive_operation(self):
    if not hasattr(self, '_expensive_operation'):
        self._expensive_operation = self.expensive_operation()
    return self._expensive_operation
Then you use the get_expensive_operation method to access the data.
However, in your particular case, I think you are approaching this in slightly the wrong way. You need to do the deserialization when the model is first loaded from the database, and serialize on save only. Then you can simply access the attributes as a standard Python dictionary each time. You can do this by defining a custom JSONField type, subclassing models.TextField, which overrides to_python and get_db_prep_save.
In fact someone's already done it: see here.
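A minimal sketch of such a field under the old-style field API the answer refers to (hook signatures changed across Django versions, so check your release's docs before relying on this):

from django.db import models
from django.utils import simplejson

class JSONField(models.TextField):
    def to_python(self, value):
        # Called with either a JSON string or an already-deserialized value.
        if isinstance(value, basestring):
            return simplejson.loads(value)
        return value

    def get_db_prep_save(self, value):
        # Serialize back to JSON just before hitting the database.
        return simplejson.dumps(value)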
For methods on your objects, you should use django.utils.functional.cached_property.
Since the first argument of a method is self, memoize would maintain a reference to the object and to the function's results even after you've thrown the object away. This can cause memory leaks by preventing the garbage collector from cleaning up the stale object. cached_property turns Daniel's suggestion into a decorator.
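Applied to the question's model, a minimal sketch (cached_property has no setter support, so invalidate by deleting the attribute):

from django.utils.functional import cached_property

class UserProfile(Model):
    text_json = models.TextField(default=text_defaults)

    @cached_property
    def text(self):
        # Runs once per instance; the result is stored in the instance's
        # __dict__ and returned directly on subsequent accesses.
        return simplejson.loads(self.text_json)

# After changing text_json, clear the cached value:
# del profile.text  # next access recomputes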