How to do DB memcaching in Django with derived data? - python

NOTE:
This is a detailed question asking how best to implement and manage Database caching in my web-application with memcached. This question uses Python/Django to illustrate the data-models and usage, but the language is not really that relevant. I'm really more interested in learning what the best strategy to maintain cache-coherency is. Python/Django just happens to be the language I'm using to illustrate this question.
RULES OF MY APPLICATION:
I have a 3 x 3 grid of cells of integers
The size of this grid may increase or decrease in the future. Our solution must scale.
There is a cumulative score for each row that is calculated by summing (value * Y-Coord) for each cell in that row.
There is a cumulative score for each column that is calculated by summing (value * X-Coord) for each cell in that column.
The values in the cells change infrequently. But those values and the scores are read frequently.
I want to use memcached to minimize my database accesses.
I want to minimize/eliminate storing duplicate or derived information in my database
The image below shows an example of the state of my grid.
MY CODE:
import memcache
from django.db import models

mc = memcache.Client(['127.0.0.1:11211'], debug=0)

class Cell(models.Model):
    x = models.IntegerField(editable=False)
    y = models.IntegerField(editable=False)
    # Whenever this value is updated, the keys for the row and column need to be
    # invalidated. But not sure exactly how I should manage that.
    value = models.IntegerField()

class Row(models.Model):
    y = models.IntegerField()

    @property
    def cummulative_score(self):
        # I need to do some memcaching here.
        # But not sure the smartest way to do it.
        return sum(map(lambda p: p.x * p.value, Cell.objects.filter(y=self.y)))

class Column(models.Model):
    x = models.IntegerField()

    @property
    def cummulative_score(self):
        # I need to do some memcaching here.
        # But not sure the smartest way to do it.
        return sum(map(lambda p: p.y * p.value, Cell.objects.filter(x=self.x)))
SO HERE IS MY QUESTION:
You can see that I have setup a memcached instance. Of course I know how to insert/delete/update keys and values in memcached. But given my code above how should I name the keys appropriately? It won't work if the key names are fixed since there must exist individual keys for each row and column. And critically how can I ensure that the appropriate keys (and only the appropriate keys) are invalidated when the values in the cells are updated?
How do I manage the cache invalidations whenever anyone updates Cell.values so that the database accesses are minimized? Isn't there some django middleware that can handle this book-keeping for me? The documents that I have seen don't do that.

# Assign your client, be it memcache or redis, to the `client` variable.
# I think both of them use set without a TTL for permanent values.
class Cell(models.Model):
    x = models.IntegerField(editable=False)
    y = models.IntegerField(editable=False)
    value = models.IntegerField()

    def save(self, *args, **kwargs):
        # Save first, so the recomputed scores include the new value.
        super(Cell, self).save(*args, **kwargs)
        Cell.cache("row", self.y)
        Cell.cache("column", self.x)

    @staticmethod
    def score(dimension, number):
        val = client.get(dimension + str(number))
        # Compare against None so that a cached score of 0 still counts as a hit.
        return val if val is not None else Cell.cache(dimension, number)

    @staticmethod
    def cache(dimension, number):
        if dimension == "row":
            val = sum(c.y * c.value for c in Cell.objects.filter(y=number))
        elif dimension == "column":
            val = sum(c.x * c.value for c in Cell.objects.filter(x=number))
        else:
            raise Exception("No such dimension:" + str(dimension))
        client.set(dimension + str(number), val)
        return val

If you want to cache individual row/column combinations, you should append the object id to the key name.
Given x and y variables:
key = 'x={}_y={}'.format(x, y)
I would use the table name and just append the id. The row id could just be the table PK and the column id could just be the column name, like this:
key = '{}_{}_{}'.format(table_name, obj.id, column_name)
In any case, I suggest considering caching the whole row instead of individual cells.
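To make the naming idea concrete, here is a minimal sketch of key helpers for the grid in the question. The `row_`/`column_` prefixes and the function names are illustrative assumptions, not part of any memcached or Django API. The point is only that a cell at (x, y) contributes to exactly two derived values, so exactly two keys need invalidating.

```python
def row_key(y):
    """Cache key for the cumulative score of row y (hypothetical naming scheme)."""
    return "row_{}".format(y)


def column_key(x):
    """Cache key for the cumulative score of column x (hypothetical naming scheme)."""
    return "column_{}".format(x)


def keys_to_invalidate(x, y):
    """Keys whose cached scores depend on the cell at (x, y)."""
    return [row_key(y), column_key(x)]


keys = keys_to_invalidate(2, 0)
print(keys)
```

Because the keys are derived from the coordinates alone, the scheme keeps working if the grid grows or shrinks.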

The Cell object can invalidate cached values for its Row and Column when the model object is saved.
(Row and Column are plain objects here, not Django models, but of course you can change that if you need to store them in the database for some reason.)
import memcache
from django.db import models

mc = memcache.Client(['127.0.0.1:11211'], debug=0)

class Cell(models.Model):
    x = models.IntegerField(editable=False)
    y = models.IntegerField(editable=False)
    # Whenever this value is saved, the cached scores for its row and column
    # are invalidated (see save() below).
    value = models.IntegerField()

    def invalidate_cache(self):
        Row(self.y).invalidate_cache()
        Column(self.x).invalidate_cache()

    def save(self, *args, **kwargs):
        super(Cell, self).save(*args, **kwargs)
        self.invalidate_cache()

class Row(object):
    def __init__(self, y):
        self.y = y

    @property
    def cache_key(self):
        return "row_{}".format(self.y)

    @property
    def cumulative_score(self):
        score = mc.get(self.cache_key)
        if score is None:  # a cached score of 0 is still a valid hit
            score = sum(map(lambda p: p.x * p.value, Cell.objects.filter(y=self.y)))
            mc.set(self.cache_key, score)
        return score

    def invalidate_cache(self):
        mc.delete(self.cache_key)

class Column(object):
    def __init__(self, x):
        self.x = x

    @property
    def cache_key(self):
        return "column_{}".format(self.x)

    @property
    def cumulative_score(self):
        score = mc.get(self.cache_key)
        if score is None:  # a cached score of 0 is still a valid hit
            score = sum(map(lambda p: p.y * p.value, Cell.objects.filter(x=self.x)))
            mc.set(self.cache_key, score)
        return score

    def invalidate_cache(self):
        mc.delete(self.cache_key)
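The invalidate-on-save pattern above can be tried out without Django or a running memcached. The sketch below is a simulation under stated assumptions: `FakeCache` is a dict-backed stand-in for a memcached client, `cells` stands in for the `Cell` table, and `db_reads` counts simulated queries. None of these names come from the answer.

```python
class FakeCache(object):
    """Dict-backed stand-in for a memcached client (get/set/delete only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)  # None on a miss

    def set(self, key, value):
        self._store[key] = value

    def delete(self, key):
        self._store.pop(key, None)


cache = FakeCache()
cells = {}       # (x, y) -> value; stands in for the Cell table
db_reads = [0]   # counts simulated database queries


def save_cell(x, y, value):
    cells[(x, y)] = value
    # Invalidate only the two derived values this cell contributes to.
    cache.delete("row_{}".format(y))
    cache.delete("column_{}".format(x))


def row_score(y):
    key = "row_{}".format(y)
    score = cache.get(key)
    if score is None:  # a cached score of 0 is still a hit
        db_reads[0] += 1
        score = sum(x * v for (x, cy), v in cells.items() if cy == y)
        cache.set(key, score)
    return score


save_cell(1, 1, 5)
save_cell(2, 1, 3)
first = row_score(1)    # miss: computed from the "database"
second = row_score(1)   # hit: served from the cache
save_cell(2, 1, 4)      # invalidates row_1 and column_2
third = row_score(1)    # miss again, recomputed
print(first, second, third, db_reads[0])
```

Only two of the three reads touch the simulated database, and the read after the update returns the fresh score rather than a stale one.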

Related

Python Class Objects/Attributes

I am learning Python with an online course and have been doing fine, but I have stumbled across an issue. Normally when having an issue I just google to find the answer or look up guides, but the problem here is I'm not even sure what I'm looking for!
I currently have the below code. I have a task to make each workout intensity have a specific value, such as Low = 3, Medium = 6 and High = 12. Then I can find the calories burned via duration * the value, dependent on the intensity and duration passed into the class.
What I don't know is how do I assign a value to a class method? I tried lists and dictionaries and both are throwing errors. I tried writing an if statement, something like if self.intensity(getattr) = "Low": then x = 3, etc.
I really am not sure where to even start to find an answer, hence asking you guys and girls.
The code currently is as follows (I am aware pieces are missing; I'm currently only focusing on assigning values to the intensity):
class ExerciseSession:
    def __init__(self, exercise, intensity, duration):
        self.exercise = exercise
        self.intensity = intensity
        self.duration = duration

    def get_exercise(self):
        return self.exercise

    def get_intensity(self):
        return self.intensity

    def get_duration(self):
        return self.duration + " minutes"

    def set_exercise(self, excersice):
        self.set_exercise = exercise

    def set_intensity(self, intensity):
        self.set_intensity = intensity

    def set_duration(self, duration):
        self.set_duration = duration

new_exercise = ExerciseSession("Running", "Low", 60)
print(new_exercise.get_exercise())
print(new_exercise.get_exercise())
print(new_exercise.get_intensity())
print(new_exercise.get_duration())
new_exercise.set_exercise("Swimming")
new_exercise.set_intensity("High")
new_exercise.set_duration(30)
print(new_exercise.get_exercise())
print(new_exercise.get_intensity())
print(new_exercise.get_duration())
Am I just doing lists/dictionaries wrong within a class, or am I missing something incredibly easy here? I understand classes and methods, but it seems some things work slightly differently inside a class.
Firstly - are your setters implemented as they are in your question? If so, you appear to be trying to override your methods with a value. There is also a small typo in set_exercise(). Compare the below with your question:
    def set_exercise(self, exercise):
        self.exercise = exercise

    def set_intensity(self, intensity):
        self.intensity = intensity

    def set_duration(self, duration):
        self.duration = duration
You can then write your test statements outside of the class to figure out what value to use when calculating the number of calories consumed based on the intensity.
Alternatively, Python provides a neat way of encapsulating data without resorting to getter/setter methods through the @property decorator. The property can then be set and retrieved in a simple fashion. This is particularly useful in more complex situations where an attribute is derived from other attributes, meaning you will always access the most up-to-date attributes. This is covered in more detail here: "public" or "private" attribute in Python ? What is the best way?
class ExerciseSession:
    def __init__(self, exercise, intensity, duration):
        self._exercise = exercise
        self._intensity = intensity
        self._duration = duration

    @property
    def exercise(self):
        return self._exercise

    @property
    def intensity(self):
        if self._intensity == "Low":
            out = 3
        elif self._intensity == "Medium":
            out = 6
        elif self._intensity == "High":
            out = 12
        else:
            out = None
        return out

    @property
    def duration(self):
        return self._duration
Note: the "_" before each instance attribute is used to indicate the internal attribute is conventionally private.
These properties can then be accessed as follows (note we do not have to call any methods, e.g. new_exercise.exercise()):
new_exercise = ExerciseSession("Running", "Low", 60)
print(new_exercise.exercise)
print(new_exercise.intensity)
print("{} minutes.".format(new_exercise.duration))
If you need to be able to update the type of exercise/duration/intensity on the object, rather than just creating a new one, you can add setter methods using the @<property name>.setter decorator, e.g.:
    @exercise.setter
    def exercise(self, value):
        self._exercise = value
and these properties can be updated as:
new_exercise.exercise = "Swimming"
doing the same :-)
    def calories_burned(self):
        if self.intensity == "Low":
            return 4 * self.duration
        elif self.intensity == "Medium":
            return 8 * self.duration
        else:
            return 12 * self.duration
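Since the question mentions that dictionaries were throwing errors, here is a minimal sketch of the same calculation driven by a dictionary lookup instead of an if/elif chain. It uses the per-intensity values from the question (Low = 3, Medium = 6, High = 12). The `CALORIES_PER_MINUTE` name is an assumption made for this example, and plain attributes are used instead of getters and setters to keep it short.

```python
class ExerciseSession:
    # Multiplier for each intensity name, using the values from the question.
    CALORIES_PER_MINUTE = {"Low": 3, "Medium": 6, "High": 12}

    def __init__(self, exercise, intensity, duration):
        self.exercise = exercise
        self.intensity = intensity
        self.duration = duration  # minutes, kept as a number

    def calories_burned(self):
        # A KeyError here means the intensity is not one of the known names.
        return self.CALORIES_PER_MINUTE[self.intensity] * self.duration


session = ExerciseSession("Running", "Low", 60)
low_result = session.calories_burned()
session.intensity = "High"
session.duration = 30
high_result = session.calories_burned()
print(low_result, high_result)
```

Keeping `duration` numeric and adding the " minutes" text only when printing avoids mixing strings into the arithmetic.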

Call a method on a new object from a variable

I have a class containing a list and some boolean methods.
class Cls:
    data = []  # populated in __init__()

    def flag1(self): ...
    def flag2(self): ...
    def flag3(self): ...  # these all return booleans, based on the data
I want to create a higher-level function that takes one of the flags as a parameter, manipulates the data in a number of ways, applies the flag to the new data, and counts the number of results.
Something like:
def hof(self, fn):
    count = 0
    for i in range(1, 10):
        new_obj = Cls(self.data+i)
        if new_obj.fn():
            count +=1
Is there any way to accomplish this without turning all the flags into static methods ?
===
Edit: Made it work, in a very hackish way:
class Cls:
    data = []

    def __init__(self, value):
        self.data = value

    def flag1(self):
        return True

    def flag2(self):
        return False

    # The hackish part
    flag_dict = {
        1: flag1,
        2: flag2,
    }

    def hof(self, flag):
        count = 0
        for i in range(1, 10):
            new_obj = Cls(self.data + [i])
            if self.flag_dict[flag](new_obj):
                count += 1
        return count
But it seems like a hack, and it's not quite understandable. Could someone point to a better way ?
Thanks.
You should be able to just pass the methods into the function like instance.hof(Cls.flag1), and internally, write it as if fn(new_obj):, with no need to make it a staticmethod.
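A minimal sketch of that suggestion follows. The two flags (`has_even`, `is_long`) are invented for illustration, since the question does not show real ones. Accessing a method through the class, as in `Cls.has_even`, gives a callable that takes the instance as its first argument, so `fn(new_obj)` works on any freshly created object.

```python
class Cls:
    def __init__(self, data):
        self.data = list(data)

    # Two hypothetical flags, invented for this sketch.
    def has_even(self):
        return any(n % 2 == 0 for n in self.data)

    def is_long(self):
        return len(self.data) > 3

    def hof(self, fn):
        count = 0
        for i in range(1, 10):
            new_obj = Cls(self.data + [i])
            if fn(new_obj):  # fn takes the instance explicitly
                count += 1
        return count


obj = Cls([1, 3])
even_count = obj.hof(Cls.has_even)
long_count = obj.hof(Cls.is_long)
print(even_count, long_count)
```

No lookup dictionary or static methods are needed; the caller simply names the method on the class.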

Building variable dependencies

I am trying to build some variable dependencies in Python. For example, if a = x, b = y and c = a + b, then if a or b changes the value of c should be automatically updated. I am aware the Python variables and values work on the basis of tags and have been trying to work around this using __setattr__. I seem to be having some trouble doing this, due to the cyclic dependency in __setattr__.
Consider this small code snippet:
class DelayComponents(object):
    '''
    Delay Components Class
    '''
    def __init__(self, **delays):
        '''
        Constructor
        '''
        self.prop_delay = round(float(delays['prop_delay']), 2)
        self.trans_delay = round(float(delays['trans_delay']), 2)
        self.proc_delay = round(float(delays['proc_delay']), 2)
        self.queue_delay = round(float(delays['queue_delay']), 2)
        self.delay = (self.prop_delay + self.proc_delay +
                      self.trans_delay + self.queue_delay)

    def __setattr__(self, key, value):
        self.__dict__[key] = value
        if (key in ("prop_delay", "trans_delay",
                    "proc_delay", "queue_delay")):
            self.delay = (self.prop_delay + self.proc_delay +
                          self.trans_delay + self.queue_delay)
This seems to serve the purpose well, but when I create an object of DelayComponents for the first time, since __setattr__ has been overridden and is called for each of the values being created, the if check inside __setattr__ throws an error saying the remaining three variables have not been found (which is true, since they have not yet been created).
How do I resolve this dependency?
Also, is there some way to accomplish the same with a dict? More specifically, if the three variables were actually key-value pairs in a dict, where the third key's value was the sum of the values of the first two keys, would it be possible to update the third value automatically when either of the first two changes?
Assuming that you want zero default values for the unset _delays (in both __init__ and __setattr__) you could do something like:
class DelayComponents(object):
    '''
    Delay Components Class
    '''
    ATTRS = ['prop_delay', 'trans_delay', 'proc_delay', 'queue_delay']

    def __init__(self, **delays):
        '''
        Constructor
        '''
        for attr in self.ATTRS:
            setattr(self, attr, round(float(delays.get(attr, 0)), 2))
        # No point in setting delay here - it's already done!

    def __setattr__(self, key, value):
        super(DelayComponents, self).__setattr__(key, value)
        # This avoids directly interacting with the __dict__
        if key in self.ATTRS:
            self.delay = sum(getattr(self, attr, 0) for attr in self.ATTRS)
In use:
>>> d = DelayComponents(prop_delay=1, trans_delay=2, proc_delay=3, queue_delay=4)
>>> d.delay
10.0
Should you want different defaults for different attributes, DelayComponents.ATTRS could be a dictionary {'attribute_name': default_value, ...}.
A much simpler alternative is to make delay a @property that is calculated only as required:
class DelayComponents(object):
    '''
    Delay Components Class
    '''
    ATTRS = ['prop_delay', 'trans_delay', 'proc_delay', 'queue_delay']

    def __init__(self, **delays):
        '''
        Constructor
        '''
        for attr in self.ATTRS:
            setattr(self, attr, round(float(delays.get(attr, 0)), 2))

    @property
    def delay(self):
        return sum(getattr(self, attr, 0) for attr in self.ATTRS)
To answer your sub-question: no, there's no way to do this with a vanilla dict; the values for keys aren't reevaluated based on changes to the values from which they're calculated.
Also, in all seriousness, there is no point to your current docstrings; you might as well leave them out entirely. They provide no information, and aren't compliant with PEP-257 either.
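On the dict sub-question: a vanilla dict cannot do this, but a dict subclass can compute one key on lookup. The sketch below is one possible approach rather than something the answer proposes. `DelayDict` and `PARTS` are hypothetical names, and missing components are assumed to count as 0.

```python
class DelayDict(dict):
    """dict subclass whose 'delay' key is computed from the other keys on lookup."""

    PARTS = ("prop_delay", "trans_delay", "proc_delay", "queue_delay")

    def __getitem__(self, key):
        if key == "delay":
            # Missing components count as 0 (an assumption of this sketch).
            return sum(self.get(part, 0) for part in self.PARTS)
        return dict.__getitem__(self, key)


d = DelayDict(prop_delay=1.0, trans_delay=2.0)
before = d["delay"]
d["proc_delay"] = 3.5
after = d["delay"]
print(before, after)
```

Only `d["delay"]` is derived here. `d.get("delay")`, `"delay" in d`, and iteration still behave like a plain dict, so a class with a property is usually the more predictable choice.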

Using class to define multiple variables in python

I'm still new to Python and this is probably going to be one of those (stupid) boring questions. However, any help will be much appreciated. I'm programming something that involves many variables and I've decided to use a class to encapsulate all of them (hopefully making it easier for me to "read" as time passes), but it's not working as I thought it would. So, without further ado, here is a part of the class that captures the gist.
import numpy as np

class variable:
    def __init__(self, length):
        self.length = length  # time length

    def state_dynamic(self):
        length = self.length
        return np.zeros((2, np.size(length)))

    def state_static(self):
        length = self.length
        return np.zeros((2, np.size(length)))

    def control_dynamic(self):
        length = self.length
        return np.zeros((2, np.size(length)))

    def control_static(self):
        length = self.length
        return np.zeros((2, np.size(length)))

    def scheduling(self):
        length = self.length
        return np.zeros(np.size(length))

    def disturbance(self):
        length = self.length
        dummy = np.random.normal(0., 0.1, np.size(length))
        for i in range(20):
            dummy[i+40] = np.random.normal(0., 0.01) + 1.
        dummy[80:100] = 0.
        return dummy
I've also tried this one:
import numpy as np

class variable:
    def __init__(self, type_1, type_2, length):
        self.type_1 = type_1  # belongs to set {state, control, scheduling, disturbance}
        self.type_2 = type_2  # belongs to set {static, dynamic, none}
        self.length = length  # time length

    def type_v(self):
        type_1 = self.type_1
        type_2 = self.type_2
        length = self.length
        if type_1 == 'state' and type_2 == 'dynamic':
            return np.zeros((2, np.size(length)))
        elif type_1 == 'state' and type_2 == 'static':
            return np.zeros((2, np.size(length)))
        elif type_1 == 'control' and type_2 == 'dynamic':
            return np.zeros((2, np.size(length)))
        elif type_1 == 'control' and type_2 == 'static':
            return np.zeros((2, np.size(length)))
        elif type_1 == 'scheduling' and type_2 == 'none':
            return np.zeros(np.size(length))
        elif type_1 == 'disturbance' and type_2 == 'none':
            dummy = np.random.normal(0., 0.1, np.size(length))
            for i in range(20):
                dummy[i+40] = np.random.normal(0., 0.01) + 1.
            dummy[80:100] = 0.
            return dummy
Now, using the first one (the outcome is the same for the second class as well), when I write the following, say:
In [2]: time = np.linspace(0,10,100)
In [5]: v = variable(time)
In [6]: v1 = v.state_dynamic
In [7]: v1.size
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/home/<ipython-input-7-e6a5d17aeb75> in <module>()
----> 1 v1.size
AttributeError: 'function' object has no attribute 'size'
In [8]: v2 = variable(np.size(time)).state_dynamic
In [9]: v2
Out[9]: <bound method variable.state_dynamic of <__main__.variable instance at 0x3ad0a28>>
In [10]: v1[0,0]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/home/<ipython-input-10-092bc2b9f982> in <module>()
----> 1 v1[0,0]
TypeError: 'instancemethod' object has no attribute '__getitem__'
I was hoping that by writing
variable(length).state_dynamic
I'll access
np.zeros((2, np.size(length)))
Anyway, if I made something utterly stupid please let me know :) and feel free to give any kind of advice. Thank you in advance for your time and kind attention. Best regards.
EDIT #1:
@wheaties:
Thank you for a quick reply and help :)
What I'm currently trying to do is the following. I have to plot several "variables", e.g., state, control, dropouts, scheduling and disturbances. All the variables depend on three parameters, namely, dynamic, static and horizon. Further, state and control are np.zeros((2, np.size(length))), dropouts and scheduling are np.zeros(np.size(length)) and disturbance has specific form (see above). Initially, I declared them in the script and the list is very long and looks ugly. I use these variables to store responses of dynamical systems considered and to plot them. I don't know if this is a good way of doing this and if you have any suggestion please share.
Thanks again for your help.
Do you mean you want named access to a bunch of state information? The ordinary python idiom for class variables would look like this:
class Variable(object):
    def __init__(self, state_dynamic, state_static, control_static, control_dynamic, scheduling):
        self.state_dynamic = state_dynamic
        self.state_static = state_static
        self.control_static = control_static
        self.control_dynamic = control_dynamic
        self.scheduling = scheduling
Which essentially creates a bucket with named fields that hold values you put in via the constructor. You can also create lightweight data classes using the namedtuple factory class, which avoids some of the boilerplate.
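As a rough illustration of the namedtuple route, the sketch below builds the same kind of bucket with `collections.namedtuple`. The `Variables` type and the `make_variables` helper are hypothetical names, and plain lists of zeros stand in for the NumPy arrays so the example has no third-party dependency.

```python
from collections import namedtuple

# Field names follow the question; the type name itself is made up.
Variables = namedtuple(
    "Variables",
    ["state_dynamic", "state_static", "control_dynamic", "control_static", "scheduling"],
)


def zeros_2xn(n):
    """Two rows of n zeros, standing in for np.zeros((2, n))."""
    return [[0.0] * n for _ in range(2)]


def make_variables(n):
    """Build one bucket of zero-filled containers for a horizon of n samples."""
    return Variables(
        state_dynamic=zeros_2xn(n),
        state_static=zeros_2xn(n),
        control_dynamic=zeros_2xn(n),
        control_static=zeros_2xn(n),
        scheduling=[0.0] * n,
    )


v = make_variables(100)
shape = (len(v.state_dynamic), len(v.state_dynamic[0]), len(v.scheduling))
print(shape)
```

The fields are read by name (`v.state_dynamic`), and the tuple itself is immutable, although the lists it holds can still be filled in place.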
The other python idiom that might apply is to use the @property decorator as in @wheaties' answer. This basically disguises a function call to make it look like a field. If what you're doing can be reduced to a functional basis this would make sense. This is an example of the idea (not based on your problem set, since I'm not sure I grok what you're doing in detail with all those identical variables) -- in this case I'm making a convenience wrapper for pulling individual flags out that are stored in a python number but really make a bit field:
class Bits(object):
    def __init__(self, integer):
        self.Integer = integer  # pretend this is an integer between 0 and 15 representing 4 flags

    @property
    def locked(self):
        # low bit = locked
        return self.Integer & 1 == 1

    @property
    def available(self):
        return self.Integer & 2 == 2

    @property
    def running_out_of_made_up_names(self):
        return self.Integer & 4 == 4

    @property
    def really_desperate_now(self):
        return self.Integer & 8 == 8

example = Bits(7)
print(example.locked)
# True
print(example.really_desperate_now)
# False
A method in Python is a function. If you want to get a value from a member function you have to end it with (). That said, some refactoring may help eliminate boilerplate and reduce the problem set size in your head. I'd suggest using a @property for some of these things, combined with a slight refactor:
import numpy as np

class variable:
    def __init__(self, length):
        self.length = length  # time length

    @property
    def state_dynamic(self):
        return self.np_length

    @property
    def state_static(self):
        return self.np_length

    @property
    def control_dynamic(self):
        return self.np_length

    @property
    def control_static(self):
        return self.np_length

    @property
    def scheduling(self):
        return self.np_length

    @property
    def np_length(self):
        # np.zeros takes the shape as a single tuple argument
        return np.zeros((2, np.size(self.length)))
That way you can use those functions as you would a member variable like you tried before:
var = variable(length).state_dynamic
What I can't tell from all this is what the difference is between all these variables? I don't see a single one. Are you assuming that you have to access them in order? If so, that's bad design and a problem. Never make that assumption.
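The difference between referring to a method, calling it, and reading a property can be seen in a few lines. This is a dependency-free sketch: the `Variable` class below is a cut-down stand-in for the one in the question, with plain lists in place of NumPy arrays.

```python
class Variable(object):
    def __init__(self, length):
        self.length = length

    def state_dynamic(self):   # ordinary method: must be called
        return [[0.0] * self.length for _ in range(2)]

    @property
    def scheduling(self):      # property: read like an attribute
        return [0.0] * self.length


v = Variable(5)
is_method = callable(v.state_dynamic)   # the bound method itself, not its result
columns = len(v.state_dynamic()[0])     # calling it yields the data
samples = len(v.scheduling)             # no parentheses needed for a property
print(is_method, columns, samples)
```

In the question's session, `v.state_dynamic` without parentheses is the bound method, which is why `.size` and indexing fail on it.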

Implementing sub-table (view into a table): designing class relationship

I'm using Python 3, but the question isn't really tied to the specific language.
I have class Table that implements a table with a primary key. An instance of that class contains the actual data (which is very large).
I want to allow users to create a sub-table by providing a filter for the rows of the Table. I don't want to copy the table, so I was planning to keep in the sub-table just the subset of the primary keys from the parent table.
Obviously, the sub-table is just a view into the parent table; it will change if the parent table changes, will become invalid if the parent table is destroyed, and will lose some of its rows if they are deleted from the parent table. [EDIT: to clarify, if parent table is changed, I don't care what happens to the sub-table; any behavior is fine.]
How should I connect the two classes? I was thinking of:
class Subtable(Table):
    def __init__(self, table, filter_function):
        # ...
My assumption was that Subtable keeps the interface of Table, except slightly overrides the inherited methods just to check if the row is in. Is this a good implementation?
The problem is, I'm not sure how to initialize the Subtable instance given that I don't want to copy the table object passed to it. Is it even possible?
Also I was thinking to give class Table an instance method that returns Subtable instance; but that creates a dependency of Table on Subtable, and I guess it's better to avoid?
I'm going to use the following (I omitted many methods such as sort, which work quite well in this arrangement; also omitted error handling):
class Table:
    def __init__(self, *columns, pkey = None):
        self.pkey = pkey
        self.__columns = columns
        self.__data = {}

    def __contains__(self, key):
        return key in self.__data

    def __iter__(self):
        for key in self.__order:
            yield key

    def __len__(self):
        return len(self.__data)

    def items(self):
        for key in self.__order:
            yield key, self.__data[key]

    def insert(self, *unnamed, **named):
        if len(unnamed) > 0:
            row_dict = {}
            for column_id, column in enumerate(self.__columns):
                row_dict[column] = unnamed[column_id]
        else:
            row_dict = named
        key = row_dict[self.pkey]
        self.__data[key] = row_dict

class Subtable(Table):
    def __init__(self, table, row_filter):
        self.__order = []
        self.__data = {}
        for key, row in table.items():
            if row_filter(row):
                self.__data[key] = row
Essentially, I'm copying the primary keys only and creating references to the data tied to them. If a row in the parent table is destroyed, it will still exist in the sub-table. If a row is modified in the parent table, it is also modified in the sub-table. This is fine, since my requirement was "anything goes when the parent table is modified".
If you see any issues with this design, let me know please.
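One issue worth checking in a design like this is Python's name mangling. Inside a class body, an attribute written as `self.__data` is stored under a name prefixed with that class's name. A subclass that assigns `self.__data` therefore creates a different attribute from the one the inherited methods read. The stripped-down classes below are a sketch of the effect, not the full `Table` from the post.

```python
class Table(object):
    def __init__(self):
        self.__data = {}          # stored on the instance as _Table__data

    def __len__(self):
        return len(self.__data)   # always reads _Table__data


class Subtable(Table):
    def __init__(self, rows):
        self.__data = dict(rows)  # stored as _Subtable__data instead


sub = Subtable({1: "a", 2: "b"})
attribute_names = sorted(vars(sub))
try:
    len(sub)
    inherited_len_works = True
except AttributeError:
    inherited_len_works = False
print(attribute_names, inherited_len_works)
```

Using single-underscore names such as `self._data`, which are not mangled, lets the inherited methods see what the subclass stored.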
