I am setting up a model where two players are involved in a competition. I'm leaning towards this model:
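class Match(models.Model):
    # Two foreign keys to the same model need distinct related_name values
    player = ForeignKey(Player, related_name="matches")
    opponent = ForeignKey(Player, related_name="matches_as_opponent")
    score = PositiveSmallIntegerField()
    games_won = PositiveSmallIntegerField()
    games_lost = PositiveSmallIntegerField()
    won_match = BooleanField()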
There are statistics involved, however, and it would require another pull to find the matching record for the opponent if I want to describe the match in full.
Alternatively I could set up the model to include full stats:
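class Match(models.Model):
    # Two foreign keys to the same model need distinct related_name values
    home_player = ForeignKey(Player, related_name="home_matches")
    away_player = ForeignKey(Player, related_name="away_matches")
    home_player_score = PositiveSmallIntegerField()
    away_player_score = PositiveSmallIntegerField()
    ...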
But that seems equally bad, as I would have to do two logic sets for one player (to find his scores when he's home_player and his scores when he's away_player).
The final option is to do two inserts per match, both with full stats, and keep redundant data in the table.
It seems like there should be a better way, so I'm asking SO.
I'd go with the first model and use select_related() to avoid the extra db calls.
If you're looking to reduce redundancy and maintain consistency of logic...
Match:
- id
- name
Match_Player: (2 records per match)
- match_id
- player_id
- is_home
Match_Player_Score:
- match_id
- player_id
- score
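A minimal sketch of how the normalized layout above could be queried, using the standard-library sqlite3 module rather than Django. The table names are pluralized here and the sample rows are made up for illustration; the point is that one query returns a player's own score and the opponent's score without branching on home or away.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE matches (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE match_players (
    match_id INTEGER REFERENCES matches(id),
    player_id INTEGER,
    is_home INTEGER,
    PRIMARY KEY (match_id, player_id)
);
CREATE TABLE match_player_scores (
    match_id INTEGER REFERENCES matches(id),
    player_id INTEGER,
    score INTEGER,
    PRIMARY KEY (match_id, player_id)
);
-- Made-up sample data: one match, two players.
INSERT INTO matches VALUES (1, 'Final');
INSERT INTO match_players VALUES (1, 10, 1);
INSERT INTO match_players VALUES (1, 20, 0);
INSERT INTO match_player_scores VALUES (1, 10, 3);
INSERT INTO match_player_scores VALUES (1, 20, 1);
""")

def matches_for(player_id):
    """Return (match name, is_home, own score, opponent score) per match."""
    return conn.execute(
        """
        SELECT m.name, me.is_home, mine.score, theirs.score
        FROM match_players AS me
        JOIN matches AS m ON m.id = me.match_id
        JOIN match_player_scores AS mine
          ON mine.match_id = me.match_id AND mine.player_id = me.player_id
        JOIN match_player_scores AS theirs
          ON theirs.match_id = me.match_id AND theirs.player_id <> me.player_id
        WHERE me.player_id = ?
        ORDER BY m.id
        """,
        (player_id,),
    ).fetchall()
```

The same function serves both players: the self-join on the scores table picks up the opponent's row, so no per-side logic is needed.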
I'd avoid having redundant data in the database. This leaves open the possibility of the database data getting internally inconsistent and messing up everything.
Use a single entry per match, as in your second example. If you plan ahead, you can accomplish the two sets of logic pretty easily. Take a look at proxy models. There might be an elegant way to do this -- have all of your logic refer to the data fields through accessors like get_my_score and get_opponent_score. Then build a Proxy Model class which swaps home and away.
class match(models.Model):
    # Base class: the home player's point of view
    def get_my_score(self):
        return self.home_player_score
    def get_opponent_score(self):
        return self.away_player_score
    def did_i_win(self):
        return self.get_my_score() > self.get_opponent_score()

class away_player_match(match):
    # Proxy: same table, with home and away swapped
    class Meta:
        proxy = True
    def get_my_score(self):
        return self.away_player_score
    def get_opponent_score(self):
        return self.home_player_score
Or maybe you want two Proxy models, and have the names in the base model class be neutral. The problem with this approach is that I don't know how to convert an instance from one proxy model to another without reloading it from the database. You want a "rebless" as in Perl. You could do this by using containment rather than inheritance. Or maybe just a flag in the wrapper class (not stored in the database) saying whether or not to swap fields. But I'd recommend some solution like that: solve the tricky stuff in code and don't let the database get inconsistent.
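A rough sketch of the containment idea with an in-memory swap flag. It uses plain Python classes so it runs without Django. MatchRecord and MatchView are hypothetical names: MatchRecord stands in for the single stored row, and the wrapper decides at construction time whether to swap home and away.

```python
from dataclasses import dataclass

@dataclass
class MatchRecord:
    """Stand-in for the single stored row per match (not a Django model)."""
    home_player: str
    away_player: str
    home_player_score: int
    away_player_score: int

class MatchView:
    """Wraps a record and exposes it from one player's point of view."""

    def __init__(self, record, player):
        if player not in (record.home_player, record.away_player):
            raise ValueError("player did not take part in this match")
        self.record = record
        # In-memory flag only; nothing extra is stored in the database.
        self.swap = player == record.away_player

    def get_my_score(self):
        r = self.record
        return r.away_player_score if self.swap else r.home_player_score

    def get_opponent_score(self):
        r = self.record
        return r.home_player_score if self.swap else r.away_player_score

    def did_i_win(self):
        return self.get_my_score() > self.get_opponent_score()

record = MatchRecord("alice", "bob", 3, 1)
home_view = MatchView(record, "alice")
away_view = MatchView(record, "bob")
```

Because the wrapper only holds a reference to the record, switching perspective means building another MatchView around the same object, with no reload from the database.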
I'm trying to implement load balancing in Django using the round-robin method. First, I created a model that stores all instances and a sequence number for each instance.
My Model:
class Load_Balancing(models.Model):
    id = models.AutoField(primary_key=True)
    instance = models.CharField(max_length=100)
    sequence = models.IntegerField()
Don't try to implement load balancing at the application level; it makes no sense.
Your database would be the bottleneck in your solution.
Use a proper HTTP server/reverse proxy. Most of them, for example nginx and Apache, have well-established load-balancing support.
I don't know what you intend to do with the instance once you have it in your view, but below is a simple PoC to achieve this.
However, I strongly recommend that you go with iklinac's solution and reconsider your architecture design.
You can create a model which serves as a counter for you.
Note that this can also be done using an in-memory persistent solution like pickle, but I prefer doing it this way.
Create a table which acts as a counter:
class InstanceSq(models.Model):
    sequence_id = models.IntegerField()
Table for this model will always only contain 1 row.
Get it in your views.py as below:
from django.db import transaction

try:
    sequence_id = InstanceSq.objects.get(id=1).sequence_id
except InstanceSq.DoesNotExist:
    # This is when it runs for the first time: create the single counter row
    InstanceSq.objects.create(id=1, sequence_id=1)
    sequence_id = 1
# ..
# Here you get the current instance (a queryset; call .first() for one row):
instance_ip = Load_Balancing.objects.filter(sequence=sequence_id)
# Use your instance here and do whatever you want to do
# .. and then
# Rotation logic (the 4 assumes four instances)
new_id = sequence_id % 4 + 1
with transaction.atomic():  # select_for_update() must run inside a transaction
    current = InstanceSq.objects.select_for_update().get(id=1)  # row lock to avoid race conditions
    current.sequence_id = new_id
    current.save()
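The rotation arithmetic on its own can be sketched in plain Python. This is only an illustration of the wrap-around step: the next_sequence helper and the instance addresses are hypothetical, and the count comes from the collection instead of a hard-coded 4.

```python
def next_sequence(current, instance_count):
    """Advance a 1-based sequence number, wrapping after the last instance."""
    if instance_count < 1:
        raise ValueError("need at least one instance")
    return current % instance_count + 1

# Hypothetical instances keyed by their 1-based sequence number.
instances = {1: "10.0.0.1", 2: "10.0.0.2", 3: "10.0.0.3", 4: "10.0.0.4"}

sequence_id = 1
picked = []
for _ in range(6):
    picked.append(instances[sequence_id])
    sequence_id = next_sequence(sequence_id, len(instances))
```

With four instances, six requests visit instances 1 to 4 and then wrap around to 1 and 2 again.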
Background:
I scrape data from 2 sources for upcoming properties for sale; let's call one SaleAnnouncement and the other SellerMaintainedData. They share many of the same field names (although some data can only be found in one and not the other). If an item is coming up for sale, there is guaranteed to be a SaleAnnouncement, but not necessarily SellerMaintainedData. In fact, only about 10% of the "sellers" maintain their own site with relevant data. However, those that do always have more information, and that data is more up to date than the data in the announcement. Also, the "announcement" is free-form text which needs to go through several processing steps before the relevant data is extracted, and as such the model has some fields to store data from intermediate processing steps (part of the reason I opted for 2 models as opposed to combining them into 1), while the "seller" data is scraped in a neat tabular format.
Problem:
I would ultimately like to combine them into one SaleItem and have implemented a model which is related to the previous 2 models and relies heavily on properties to prioritize which model the data comes from. Something like:
@property
def sale_datetime(self):
    if self.sellermaintaineddata and self.sellermaintaineddata.sale_datetime:
        return self.sellermaintaineddata.sale_datetime
    else:
        return self.latest_announcement and self.latest_announcement.sale_datetime
However, I obviously won't be able to query those fields, which would be my end goal when listing upcoming sales. One solution suggested to me involves creating a custom manager which overrides the filter/exclude methods. It sounds promising, but I would have to duplicate all the property field logic in the model manager.
Summary (for clarity)
I have:
class SourceA(Model):
    sale_datetime = ...
    address = ...
    parcel_number = ...
    # other attrs...

class SourceB(Model):
    sale_datetime = ...
    address = ...
    # no parcel number here
    # other attrs...
I want:
class Combined(Model):
    sale_datetime = # from sourceB if sourceB else from sourceA
    ...
I want a unified model where common fields between SourceA and SourceB are prioritized so that if SourceB exists it derives the value of that field from SourceB or else it comes from SourceA. I would also like to query those fields so maybe using properties is not the best way...
Question
Is there a better way, should I consider restructuring my models (possibly combining those 2), or is the custom manager solution the way to go?
I would suggest another solution. What about using inheritance? You could create an abstract base class (https://docs.djangoproject.com/en/1.9/topics/db/models/#abstract-base-classes). You can put all the common fields there and then create separate models for SaleAnnouncement and SellerMaintainedData. Since both of them will inherit from your base model, you'll only have to define the fields specific to each model.
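Separately from the inheritance suggestion, the "use B if present, else A" rule from the question can be expressed in the query itself with SQL COALESCE over a left join, which keeps the prioritized value filterable. The sketch below uses the standard-library sqlite3 module with hypothetical tables source_a and source_b and made-up rows. Django exposes the same SQL function as Coalesce in django.db.models.functions, but how it would map onto the actual models here is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_a (id INTEGER PRIMARY KEY, sale_datetime TEXT, address TEXT);
CREATE TABLE source_b (
    id INTEGER PRIMARY KEY,
    source_a_id INTEGER REFERENCES source_a(id),
    sale_datetime TEXT,
    address TEXT
);
-- Made-up data: two announcements, only the first has seller data.
INSERT INTO source_a VALUES (1, '2016-05-01 10:00', '1 Main St');
INSERT INTO source_a VALUES (2, '2016-06-01 10:00', '2 Oak Ave');
INSERT INTO source_b VALUES (1, 1, '2016-05-15 09:00', NULL);
""")

# Prefer source_b's value, fall back to source_a, and filter on the result.
rows = conn.execute("""
    SELECT a.id,
           COALESCE(b.sale_datetime, a.sale_datetime) AS sale_datetime,
           COALESCE(b.address, a.address) AS address
    FROM source_a AS a
    LEFT JOIN source_b AS b ON b.source_a_id = a.id
    WHERE COALESCE(b.sale_datetime, a.sale_datetime) >= '2016-05-10'
    ORDER BY a.id
""").fetchall()
```

Item 1 passes the date filter only because the seller-maintained date overrides the announcement date, while its address falls back to the announcement because the seller row has none.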
I'd like to create a directed graph in Django, but each node could be a separate model, with separate fields, etc.
Here's what I've got so far:
from bannergraph.apps.banners.models import *

class Node(models.Model):
    uuid = UUIDField(db_index=True, auto=True)
    class Meta:
        abstract = True

class FirstNode(Node):
    field_name = models.CharField(max_length=100)
    next_node = UUIDField()

class SecondNode(Node):
    is_something = models.BooleanField(default=False)
    first_choice = UUIDField()
    second_choice = UUIDField()
(obviously FirstNode and SecondNode are placeholders for the more domain-specific models, but hopefully you get the point.)
So what I'd like to do is query all the subclasses at once, returning all of the ones that match. I'm not quite sure how to do this efficiently.
Things I've tried:
1. Iterating over the subclasses with queries - I don't like this, as it could get quite heavy with the number of queries.
2. Making Node concrete. Apparently I still have to check for each subclass, which goes back to #1.
Things I've considered:
Making Node the class, and sticking a JSON blob in it. I don't like this.
Storing pointers in an external table or system. This would mean 2 queries per UUID, where I'd ideally want to have 1, but it would probably do OK in a pinch.
So, am I approaching this wrong, or forgetting about some neat feature of Django? I'd rather not use a schemaless DB if I don't have to (the Django admin is almost essential for this project). Any ideas?
The InheritanceManager from django-model-utils is what you are looking for.
With Node as a concrete model that sets objects = InheritanceManager(), you can iterate over all your nodes with:
nodes = Node.objects.filter(foo="bar").select_subclasses()
for node in nodes:
    pass  # logic
I'm stuck using Elixir and I currently have a really messy way of searching through a database that I'd like to improve. The documentation provides insight on how to do a basic generative search, but I need to step through many different classes and I'd prefer to use Elixir rather than scanning through the list myself.
Here's an example:
class Student(Entity):
    hobby = Field(String)
    additional_info = OneToOne('AdditionalInformation', inverse='student')
    user_profile = OneToOne('UserProfile', inverse='student')

class AdditionalInformation(Entity):
    state = Field(String)
    city = Field(String)
    student = OneToOne('Student', inverse='additional_info')

class UserProfile(Entity):
    username = Field(String)
    date_signed_up = Field(DateTime)
    student = OneToOne('Student', inverse='user_profile')
In this example, I'd like to find all students that:
Signed up after 2008
Are from California
Have "video games" as their hobby
I'm thinking there should be a way for me to go:
result = UserProfile.query.filter_by(date_signed_up>2008)
result.query.filter_by(UserProfile.student.hobby='blabla')
result.query....
Currently I'm putting them into a list and looking for a set.
I haven't used Elixir, but I have used SQLAlchemy. I don't think you can do what you want given that current setup. As far as I know, there is no way to filter by relationships directly.
It's unclear whether you're creating new tables or dealing with existing ones, so I'm just going to throw some info at you and hope some of it is helpful.
You can join tables together in SQLAlchemy (assuming there's a foreign key called student_id on UserProfile). This would give you all students who signed up since 2008.
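from datetime import datetime
# Compare the DateTime column against a datetime, not a bare year
result = Student.query.join(UserProfile).filter(Student.id==UserProfile.student_id).filter(UserProfile.date_signed_up>=datetime(2008, 1, 1)).all()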
You can chain .filter() together like I did above, or you can pass multiple args to them. I find this especially useful for dealing with unknown numbers of filters, like you'd get from a search form.
from datetime import datetime
from sqlalchemy import and_

conditions = [UserProfile.date_signed_up>=datetime(2008, 1, 1)]
if something_is_true:
    conditions.append(UserProfile.username=="foo")
result = Student.query.join(UserProfile).filter(Student.id==UserProfile.student_id).filter(and_(*conditions)).all()
There's also more complex stuff you can do with hybrid properties, but that doesn't seem appropriate here.
After building a few applications on the GAE platform, I find that I use relationships between different models in the datastore in basically every application. I often need to find which records share the same parent (for example, matching all entries with the same parent).
From the beginning I used the db.ReferenceProperty to get my relations going, like:
class Foo(db.Model):
    name = db.StringProperty()

class Bar(db.Model):
    name = db.StringProperty()
    parentFoo = db.ReferenceProperty(Foo)

fooKey = someFooKeyFromSomePlace
bars = Bar.all()
for bar in bars:
    if bar.parentFoo.key() == fooKey:
        pass  # do stuff
But lately I've abandoned this approach, since bar.parentFoo.key() makes a subquery to fetch Foo each time. The approach I now use is to store each Foo key as a string on Bar.parentFoo; this way I can string-compare it with someFooKeyFromSomePlace and get rid of all the subquery overhead.
Now I've started to look at entity groups and am wondering whether they are an even better way to go. I can't really figure out how to use them.
As for the two approaches above, I'm wondering: are there any downsides to using them? Could using a stored key string come back and bite me in the * * *? And last but not least, is there a faster way to do this?
Tip:
replace...
bar.parentFoo.key() == fooKey
with...
Bar.parentFoo.get_value_for_datastore(bar) == fooKey
To avoid the extra lookup and just fetch the key from the ReferenceProperty
See Property Class
I think you should consider this as well. This will help you fetch all the child entities of a single parent.
bmw = Car(brand="BMW")
bmw.put()
lf = Wheel(parent=bmw,position="left_front")
lf.put()
lb = Wheel(parent=bmw,position="left_back")
lb.put()
bmwWheels = Wheel.all().ancestor(bmw)
For more on modeling, you can refer to this: Appengine Data modeling
I'm not sure what you're trying to do with that example block of code, but I get the feeling it could be accomplished with:
bars = Bar.all().filter("parentFoo =", SomeFoo)
As for entity groups, they are mainly used if you want to alter multiple things in transactions, since appengine restricts that to entities within the same group only; in addition, appengine allows ancestor filters ( http://code.google.com/appengine/docs/python/datastore/queryclass.html#Query_ancestor ), which could be useful depending on what it is you need to do. With the code above, you could very easily also use an ancestor query if you set the parent of Bar to be a Foo.
If your purposes still require a lot of "subquerying" as you put it, there is a neat prefetch pattern that Nick Johnson outlines here: http://blog.notdot.net/2010/01/ReferenceProperty-prefetching-in-App-Engine which basically fetches all the properties you need in your entity set as one giant get instead of a bunch of tiny ones, which gets rid of a lot of the overhead. However do note his warnings, especially regarding altering the properties of entities while using this prefetch method.
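The idea behind that prefetch pattern can be illustrated without App Engine. This is a plain-Python stand-in and not the code from the linked post: FOO_STORE plays the datastore, batch_get is a hypothetical helper that counts round trips, and entities are ordinary dicts.

```python
# Stand-in for the datastore: key -> entity.
FOO_STORE = {"foo1": {"name": "first"}, "foo2": {"name": "second"}}
fetch_count = 0  # number of simulated round trips

def batch_get(keys):
    """Hypothetical batch fetch: one round trip for any number of keys."""
    global fetch_count
    fetch_count += 1
    return {key: FOO_STORE[key] for key in keys}

def prefetch_parents(bars):
    """Resolve every bar's parent with a single batch fetch."""
    parent_keys = {bar["parent_key"] for bar in bars}  # deduplicate keys
    parents = batch_get(parent_keys)
    for bar in bars:
        bar["parent"] = parents[bar["parent_key"]]
    return bars

bars = [
    {"name": "a", "parent_key": "foo1"},
    {"name": "b", "parent_key": "foo2"},
    {"name": "c", "parent_key": "foo1"},
]
prefetch_parents(bars)
```

Three children are resolved with one fetch instead of three, and entities that share a parent key end up pointing at the same fetched object, which is why mutating prefetched parents needs care.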
Not very specific, but that's all the info I can give you until you are more specific about exactly what you're trying to do here.
When you design your models, you also need to consider whether you want to be able to save them within a transaction. However, only do this if you need to use transactions.
An alternative approach is to assign the parent like so:
from google.appengine.ext import db

class Foo(db.Model):
    name = db.StringProperty()

class Bar(db.Model):
    name = db.StringProperty()

def _save_entities(foo_name, bar_name):
    """Save the model data; Bar is stored as a child of Foo."""
    foo_item = Foo(name=foo_name)
    foo_item.put()
    bar_item = Bar(parent=foo_item, name=bar_name)
    bar_item.put()
    return foo_item

def main():
    # Run the save in a transaction; if any put fails, this should all roll back
    foo_item = db.run_in_transaction(_save_entities, "foo name", "bar name")
    # Query the model data using the ancestor relationship
    for item in Bar.gql("WHERE ANCESTOR IS :ancestor", ancestor=foo_item.key()).fetch(1000):
        pass  # do stuff