Can I create Dynamic columns (and models) in django? - python

I want to create a database of dislike items, but depending on the category of item, it has different columns I'd like to show when all you're looking at is cars. In fact, I'd like the columns to be dynamic based on the category, so we can easily add an additional property to cars in the future and have that column show up too.
For example:
But when you filter on car or person, additional rows show up for filtering.
All the examples that I can find about using django models aren't giving me a very clear picture on how I might accomplish this behavior in a clean, simple web interface.

I would probably go for a model describing a "dislike criterion":
class DislikeElement(models.Model):
    # Item is the model corresponding to your first table
    # (on_delete is required since Django 2.0; the max_length values are arbitrary)
    item = models.ForeignKey(Item, on_delete=models.CASCADE)
    field_name = models.CharField(max_length=100)  # e.g. "Model", "Year born"...
    value = models.CharField(max_length=255)  # e.g. "Mustang", "1960"...
You would have quite a lot of flexibility in what data you can retrieve. For example, to get all the dislike elements for a given item, you would just do something like item.dislikeelement_set.all().
The only problem with this solution is that you would have to store numbers, strings, dates... in value under the same data type. But maybe that's not an issue for you.
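If the mixed types do turn out to be an issue, one possible workaround is to keep a small type tag next to the stored string and convert on read. The following is only a sketch in plain Python: the encode_value and decode_value helpers and the tag names are hypothetical, not part of Django, and the tag would need its own column on the model.

```python
from datetime import date

# Hypothetical converters: map a type tag to a function that parses the stored text.
DECODERS = {
    "int": int,
    "str": str,
    "date": date.fromisoformat,
}


def encode_value(value):
    """Return a (type_tag, text) pair that fits into two character columns."""
    if isinstance(value, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, int):
        return "int", str(value)
    if isinstance(value, date):
        return "date", value.isoformat()
    return "str", str(value)


def decode_value(type_tag, text):
    """Turn the stored text back into a Python value using its type tag."""
    return DECODERS[type_tag](text)
```

The cost of this design is that range queries on the value still compare strings in the database, so it mainly helps when reading values back.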

Related

Django Model and Many-to-Many Relationships -- finding most similar objects

I'm running into an issue that I can't find an explanation for.
Given one object (in this case, an "Article"), I want to use another type of object (in this case, a "Category") to determine which other articles are most similar to article X, as measured by the number of categories they have in common. The relationship between Article and Category is Many-to-Many. The use case is to get a quick list of related Objects to present as links.
I know exactly how I would write the SQL by hand:
select
    ac.article_id
from
    Article_Category ac
where
    ac.category_id in
    (
        select
            category_id
        from
            Article_Category
        where
            article_id = 1 -- get all categories for article in question
    )
    and ac.article_id <> 1
group by
    ac.article_id
order by
    count(ac.category_id) desc, random() limit 5
What I'm struggling with is how to use the Django Model aggregation to match this logic and only run one query. I'd obv. prefer to do it within the framework if possible. Does anybody have pointers on this?
Adding this in now that I've found a way within the model framework to do this.
related_article_list = Article.objects.filter(category=self.category.all())\
    .exclude(id=self.id)
related_article_ids = related_article_list.values('id')\
    .annotate(count=models.Count('id'))\
    .order_by('-count','?')
In the related_article_list part, other Article objects that match on 2 or more Categories will be included separate times. Thus, when using annotation to count them the number will be > 1 and they can be ordered that way.
I think the correct answer, if you really want to filter articles on all categories, should look like this:
related_article_list = Article.objects.filter(category__in=self.category.all())\
    .exclude(id=self.id)
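To make the counting idea concrete outside the ORM, here is a rough plain-Python sketch of the same ranking. The rank_related function and the dictionary standing in for the Article_Category join table are hypothetical, and ties are not shuffled randomly as they are in the SQL version.

```python
from collections import Counter


def rank_related(article_categories, article_id, limit=5):
    """Rank other articles by how many categories they share with article_id.

    article_categories maps an article id to a set of category ids
    (an in-memory stand-in for the many-to-many join table).
    """
    wanted = article_categories[article_id]
    shared = Counter()
    for other_id, categories in article_categories.items():
        if other_id == article_id:
            continue  # never recommend the article itself
        overlap = len(wanted & categories)
        if overlap:
            shared[other_id] = overlap
    # most_common sorts by the shared-category count, highest first
    return [other_id for other_id, _ in shared.most_common(limit)]
```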

Variable interpolation in python/django, django query filters [duplicate]

Given a class:
from django.db import models
class Person(models.Model):
    name = models.CharField(max_length=20)
Is it possible, and if so how, to have a QuerySet that filters based on dynamic arguments? For example:
# Instead of:
Person.objects.filter(name__startswith='B')
# ... and:
Person.objects.filter(name__endswith='B')
# ... is there some way, given:
filter_by = '{0}__{1}'.format('name', 'startswith')
filter_value = 'B'
# ... that you can run the equivalent of this?
Person.objects.filter(filter_by=filter_value)
# ... which will throw an exception, since `filter_by` is not
# an attribute of `Person`.
Python's argument expansion may be used to solve this problem:
kwargs = {
    '{0}__{1}'.format('name', 'startswith'): 'A',
    '{0}__{1}'.format('name', 'endswith'): 'Z'
}
Person.objects.filter(**kwargs)
This is a very common and useful Python idiom.
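The idiom itself has nothing Django-specific in it. As a small sketch that runs without a database, the hypothetical fake_filter function below stands in for Person.objects.filter and simply returns the keyword arguments it receives:

```python
def build_lookup(field, operation, value):
    """Return a dict such as {'name__startswith': 'B'}."""
    return {"{0}__{1}".format(field, operation): value}


def fake_filter(**kwargs):
    # Stand-in for Person.objects.filter; it only echoes its keyword arguments.
    return kwargs


lookup = build_lookup("name", "startswith", "B")
# ** unpacks the dict so that each key becomes a keyword argument name
received = fake_filter(**lookup)
```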
A simplified example:
In a Django survey app, I wanted an HTML select list showing registered users. But because we have 5000 registered users, I needed a way to filter that list based on query criteria (such as just people who completed a certain workshop). In order for the survey element to be re-usable, I needed for the person creating the survey question to be able to attach those criteria to that question (don't want to hard-code the query into the app).
The solution I came up with isn't 100% user friendly (requires help from a tech person to create the query) but it does solve the problem. When creating the question, the editor can enter a dictionary into a custom field, e.g.:
{'is_staff':True,'last_name__startswith':'A',}
That string is stored in the database. In the view code, it comes back in as self.question.custom_query. The value of that is a string that looks like a dictionary. We turn it back into a real dictionary with eval() and then stuff it into the queryset with **kwargs:
kwargs = eval(self.question.custom_query)
user_list = User.objects.filter(**kwargs).order_by("last_name")
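A word of caution on eval(): it executes whatever the string contains. If the stored value is always a plain dictionary literal, the standard library's ast.literal_eval may be a safer drop-in. This is a sketch under that assumption, and parse_custom_query is a hypothetical helper name:

```python
import ast


def parse_custom_query(text):
    """Parse a stored dict literal without executing arbitrary code."""
    value = ast.literal_eval(text)  # accepts literals only, raises on anything else
    if not isinstance(value, dict):
        raise ValueError("custom query must be a dict literal")
    return value


kwargs = parse_custom_query("{'is_staff':True,'last_name__startswith':'A',}")
```

The resulting kwargs can then be passed to the queryset with **kwargs exactly as before.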
To extend the previous answers, which prompted some requests for more code, here is some working code that I use with Q. Let's say that a request may or may not filter on fields like:
publisher_id
date_from
date_until
Those fields can appear in the query, but they may also be absent.
This is how I am building filters based on those fields on an aggregated query that cannot be further filtered after the initial queryset execution:
from django.db.models import Q

# prepare filters to apply to queryset
filters = {}
if publisher_id:
    filters['publisher_id'] = publisher_id
if date_from:
    filters['metric_date__gte'] = date_from
if date_until:
    filters['metric_date__lte'] = date_until
filter_q = Q(**filters)
queryset = Something.objects.filter(filter_q)...
Hope this helps since I've spent quite some time to dig this up.
Edit:
As an additional benefit, you can use lists too. For the previous example, if instead of publisher_id you have a list called publisher_ids, then you could use this piece of code:
if publisher_ids:
    filters['publisher_id__in'] = publisher_ids
django.db.models.Q is exactly what you want in a Django way.
This looks much more understandable to me:
kwargs = {
    'name__startswith': 'A',
    'name__endswith': 'Z',
    # (add more filters here)
}
Person.objects.filter(**kwargs)
A really complex search form usually indicates that a simpler model is trying to dig its way out.
How, exactly, do you expect to get the values for the column name and operation?
Where do you get the values of 'name' and 'startswith'?
filter_by = '%s__%s' % ('name', 'startswith')
A "search" form? You're going to -- what? -- pick the name from a list of names? Pick the operation from a list of operations? While open-ended, most people find this confusing and hard-to-use.
How many columns have such filters? 6? 12? 18?
A few? A complex pick-list doesn't make sense. A few fields and a few if-statements make sense.
A large number? Your model doesn't sound right. It sounds like the "field" is actually a key to a row in another table, not a column.
Specific filter buttons. Wait... That's the way the Django admin works. Specific filters are turned into buttons. And the same analysis as above applies. A few filters make sense. A large number of filters usually means a kind of first normal form violation.
A lot of similar fields often means there should have been more rows and fewer fields.

A good way to store this browser versions - Django/Postgresql

I have this data:
Firefox 3.6
There are 3 items
name
max version
min version
I am storing it this way:
class MyModel(models.Model):
    browser_name = models.CharField(...)
    browser_max_version = models.IntegerField(...)
    browser_min_version = models.IntegerField(...)
or alternatively
class Browser(models.Model):
    name = models.CharField(...)
    max_version = models.IntegerField(...)
    min_version = models.IntegerField(...)
class MyModel(models.Model):
    browser = models.ForeignKey(Browser)
Is there any clever way to store the value in 1 field and make it parsable at the same time?
I know this might sound weird, but I wonder if there are any alternatives to building 1 million models to represent data.
Any ideas? :)
You could make it parseable, but probably not indexable. For example, you could concatenate the values together separated by semicolons (or some other character), then simply split the string to get the values back. "Firefox 3.6" would become "Firefox;3;6". While this is somewhat easier to parse, it doesn't provide much of an advantage over the original formatting.
The big caveat with this approach is that the column wouldn't be indexable in a very granular way. For example, you couldn't ask for all versions of Firefox. PostgreSQL allows for some very advanced indexing which, I believe, would allow you to create the required indexes, but I don't know of any way you could access the indexes via Django's ORM.
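If the data arrives as a single string anyway, one option is to parse it once on the way in and still store the pieces in separate columns, which keeps them filterable. A rough sketch follows; the regular expression and the parse_browser name are assumptions about the input format, not something taken from the question:

```python
import re

# Assumed format: a name, whitespace, then "major" or "major.minor" at the end.
BROWSER_RE = re.compile(r"^(?P<name>.+?)\s+(?P<major>\d+)(?:\.(?P<minor>\d+))?$")


def parse_browser(text):
    """Split a string like 'Firefox 3.6' into (name, major, minor)."""
    match = BROWSER_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognised browser string: %r" % text)
    minor = match.group("minor")
    # A missing minor version is treated as 0 in this sketch.
    return match.group("name"), int(match.group("major")), int(minor) if minor else 0
```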
What is the purpose of MyModel in the second example? The one table Browser is all you need. Why on earth would you need 'millions' of models? Or are you talking about rows in a table?
class Browser(models.Model):
    name = models.CharField(...)
    max_version = models.IntegerField(...)
    min_version = models.IntegerField(...)
is fine.

Designing a scalable product database on Google App Engine

I've built a product database that is divided into 3 parts, and each part has a "sub" part containing labels. But the more I work with it, the more unstable it feels, and each addition I make takes more and more code to get working.
A product is built of parts, and each part is of a type. Each product, part and type has a label. And there's a label for each language.
A product contains parts in 2 lists: one list for default parts (one of each type) and one for optional parts.
Now I want to add currency in the mix and have come to the decision to re-model the entire way I handle this.
The result I want to get is a list of all product objects that contains the name, description, price, all parts and all types that match the parts. And for these the correct language labels.
Like so:
product
- name
- description (by language)
- price (by currency)
- parts
- part (type name and part name by language)
- partPrice (by currency)
The problem with my current setup is that it is a wild mix of db.ReferenceProperty and db.ListProperty(db.Key).
And getting all the data is a bit of a hassle that requires multiple for-loops, matching dicts and datastore calls. Well, it's a bit of a mess.
The re-model (untested) looks like this:
from google.appengine.ext import db

class Products(db.Model):
    name = db.StringProperty()
    imageUrl = db.StringProperty()
    optionalParts = db.ListProperty(db.Key)
    defaultParts = db.ListProperty(db.Key)
    active = db.BooleanProperty(default=True)

    @property
    def itemId(self):
        return self.key().id()

class ProductPartTypes(db.Model):
    name = db.StringProperty()

    @property
    def itemId(self):
        return self.key().id()

class ProductParts(db.Model):
    name = db.StringProperty()
    type = db.ReferenceProperty(ProductPartTypes)
    imageUrl = db.StringProperty()
    parts = db.ListProperty(db.Key)

    @property
    def itemId(self):
        return self.key().id()

class Labels(db.Model):
    key = db.StringProperty()  # want to store a key here
    language = db.StringProperty()
    label = db.StringProperty()

class Price(db.Model):
    key = db.StringProperty()  # want to store a key here
    language = db.StringProperty()
    price = db.IntegerProperty()
The major thing here is that I've split the Labels and Price out. So these can contain labels and prices for any products, parts or types.
So what I am curious about: is this a solid solution from an architectural point of view? Will this hold even if there are thousands of entries in each model?
Also, any tips for retrieving data in a good manner are welcome. My current solution of getting all the data first, for-looping over it and sticking it in dicts works, but feels like it could fail any minute.
..fredrik
You need to keep in mind that App Engine's datastore requires you to rethink your usual way of designing databases. It goes against intuition at first but you must denormalize your data as much as possible if you want your application to be scalable. The datastore has been designed this way.
The approach I usually take is to consider first what kind of queries will need to be done in different use cases, e.g. what data do I need to retrieve at the same time? In what order? What properties should be indexed?
If I understand correctly, your main goal is to fetch a list of products with complete details. BTW, if you have other query scenarios, i.e. filtering on price, type, etc., you should take them into account too.
In order to fetch all the data you need from only one query, I suggest you create one model which could look like this:
class ProductPart(db.Model):
    product_name = db.StringProperty()
    product_image_url = db.StringProperty()
    product_active = db.BooleanProperty(default=True)
    product_description = db.StringListProperty(indexed=False) # Contains product description in all languages
    part_name = db.StringProperty()
    part_image_url = db.StringProperty()
    part_type = db.StringListProperty(indexed=False) # Contains part type in all languages
    part_label = db.StringListProperty(indexed=False) # Contains part label in all languages
    part_price = db.ListProperty(float, indexed=False) # Contains part price in all currencies
    part_default = db.BooleanProperty()
    part_optional = db.BooleanProperty()
About this solution:
- ListProperties are set to indexed=False in order to avoid exploding indexes if you don't need to filter on them.
- In order to get the right description, label or type, you will have to set list values always in the same order. For example: part_label[0] is English, part_label[1] is Spanish, etc. Same idea for prices and currencies.
- After fetching entities from this model you will have to do some in-memory manipulations in order to get the data nicely structured the way you want, maybe in a new dictionary.
Obviously, there will be a lot of redundancy in the datastore with such a design - but that's okay, since it allows you to query the datastore in a scalable fashion.
Besides, this is not meant as a replacement for the architecture that you had in mind, but rather an additional Model designed specifically for the user-facing kind of queries that you need to do, ie. retrieving lists of complete product/parts information.
These ProductPart entities could be populated by background tasks, replicating data located in your other normalized entities which would be the authoritative data source. Since you have plenty of data storage on App Engine, this should not be a problem.
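The in-memory regrouping mentioned above could look roughly like the sketch below. Everything in it is an assumption made for illustration: plain dicts stand in for fetched ProductPart entities, and the LANGUAGES and CURRENCIES lists fix the order in which list values were stored.

```python
from collections import defaultdict

LANGUAGES = ["en", "es"]      # assumed storage order for labels and descriptions
CURRENCIES = ["USD", "EUR"]   # assumed storage order for prices


def group_products(rows, language, currency):
    """Group flat product/part rows into one nested dict per product."""
    lang = LANGUAGES.index(language)
    cur = CURRENCIES.index(currency)
    products = defaultdict(lambda: {"parts": []})
    for row in rows:
        product = products[row["product_name"]]
        # The description is repeated on every row of a product, so this is idempotent.
        product["description"] = row["product_description"][lang]
        product["parts"].append({
            "type": row["part_type"][lang],
            "label": row["part_label"][lang],
            "price": row["part_price"][cur],
            "default": row["part_default"],
        })
    return dict(products)
```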
IMO your design mostly makes sense. I came up with almost the same design after reading your problem statement, with a few differences:
I had prices on Product and ProductPart, not in a separate table.
The other difference was part_types. If there are not many part types, you can simply keep them in a Python list/tuple:
part_types = ('wheel', 'break', 'mirror')
It also depends on the kind of queries you are anticipating. If there are many price-calculation queries (independent of the rest of the product and part info), then it might make sense to design it the way you have done.
You mentioned that you get all the data first. Isn't querying possible? If you load the whole data set into your app and then sort/filter in Python, it will be slow. Which database are you considering? To me, MongoDB looks like a good option here.
Finally, why are you suspicious about even 1000 records? You can run a few tests on your db beforehand.
Bests

Filter and sort music info on Google App Engine

I've enjoyed building out a couple simple applications on the GAE, but now I'm stumped about how to architect a music collection organizer on the app engine. In brief, I can't figure out how to filter on multiple properties while sorting on another.
Let's assume the core model is an Album that contains several properties, including:
Title
Artist
Label
Publication Year
Genre
Length
List of track names
List of moods
Datetime of insertion into database
Let's also assume that I would like to filter the entire collection using those properties, and then sort the results by one of:
Publication year
Length of album
Artist name
When the info was added into the database
I don't know how to do this without running into the exploding index conundrum. Specifically, I'd love to do something like:
Albums.all().filter('publication_year <', 1980).order('artist_name')
I know that's not possible, but what's the workaround?
This seems like a fairly general type of application. The music albums could be restaurants, bottles of wine, or hotels. I have a collection of items with descriptive properties that I'd like to filter and sort.
Is there a best practice data model design that I'm overlooking? Any advice?
There's a couple of options here: You can filter as best as possible, then sort the results in memory, as Alex suggests, or you can rework your data structures for equality filters instead of inequality filters.
For example, assuming you only want to filter by decade, you can add a field encoding the decade in which the song was recorded. To find everything before or after a decade, do an IN query for the decades you want to span. This will require one underlying query per decade included, but if the number of records is large, this can still be cheaper than fetching all the results and sorting them in memory.
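As a sketch of the decade idea: the helper names below and the earliest cut-off year are assumptions. The first function computes the value you would store on each record, and the second builds the list you would pass to the IN query.

```python
def decade_of(year):
    """Return the first year of the decade, e.g. 1979 -> 1970."""
    return year // 10 * 10


def decades_before(year, earliest=1900):
    """List every decade that ends before the decade containing `year`."""
    return list(range(decade_of(earliest), decade_of(year), 10))
```

Note that this only reproduces a filter such as publication_year < 1980 exactly when the cut-off falls on a decade boundary.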
Since storage is cheap, you could create your own ListProperty based indexfiles with key_names that reflect the sort criteria.
import time

class album_pubyear_List(db.Model):
    words = db.StringListProperty()

class album_length_List(db.Model):
    words = db.StringListProperty()

class album_artist_List(db.Model):
    words = db.StringListProperty()

class Album(db.Model):
    blah...

    def save(self):
        super(Album, self).save()
        # you could do this at save time or batch it and do
        # it with a cronjob or taskqueue
        words = []
        for field in ["title", "artist", "label", "genre", ...]:
            words.append("%s:%s" % (field, getattr(self, field)))
        word_records = []
        now = repr(time.time())
        # words=words is an argument of the model constructor, not of append()
        word_records.append(album_pubyear_List(parent=self, key_name="%s_%s" % (self.pubyear, now), words=words))
        word_records.append(album_length_List(parent=self, key_name="%s_%s" % (self.album_length, now), words=words))
        word_records.append(album_artist_List(parent=self, key_name="%s_%s" % (self.artist_name, now), words=words))
        db.put(word_records)
Now when it's time to search you create an appropriate WHERE clause and call the appropriate model
# field_a, value_a, field_b, value_b are placeholders for your own filter terms
where = "WHERE words = '%s:%s' AND words = '%s:%s'" % (field_a, value_a, field_b, value_b)  # etc.
aModel = "album_pubyear_List"  # or any one of the other key_name sorted wordlist models
indexes = db.GqlQuery("""SELECT __key__ from %s %s""" % (aModel, where))
keys = [k.parent() for k in indexes[offset:offset + numresults + 1]]  # +1 for pagination
object_list = db.get(keys)  # returns a sorted by key_name list of Albums
As you say, you can't have an inequality condition on one field and an order by another (or inequalities on two fields, etc, etc). The workaround is simply to use the "best" inequality condition to get data in memory (where "best" means the one that's expected to yield the least data) and then further refine it and order it by Python code in your application.
Python's list comprehensions (and other forms of loops, etc.), the list sort method and the sorted built-in function, the itertools module in the standard library, and so on, all help a lot to make these kinds of tasks quite simple to perform in Python itself.
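For instance, the refine-and-order step might be sketched like this. Plain dicts stand in for datastore entities, and the refine function, its parameters and the field names are hypothetical:

```python
from operator import itemgetter


def refine(albums, genre, sort_key="artist_name"):
    """Apply the remaining filter and the ordering in memory.

    `albums` is assumed to be the list already fetched with the single
    inequality filter (for example publication_year < 1980).
    """
    matching = [album for album in albums if album["genre"] == genre]
    return sorted(matching, key=itemgetter(sort_key))
```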
