Can this be achieved through Aggregation?


I have achieved what I want but I'm not convinced it is the best approach. This is my model:
class foo(models.Model):
    user = models.ForeignKey(User)
    rate = models.PositiveSmallIntegerField(default=3)
    rate1 = models.PositiveSmallIntegerField(default=3)
    rate2 = models.PositiveSmallIntegerField(default=3)
This is my view:
sum = {}
bad_rating = None
rate = foo.objects.values('user').distinct().annotate(r=Avg("rate"), r1=Avg("rate1"), r2=Avg("rate2"))
for r in rate:
    sum[r['user']] = r['r'] + r['r1'] + r['r2']
bad_rating = sorted(sum.items(), key=operator.itemgetter(1))
Basically, I'm getting the average of rate, rate1 and rate2 grouped by distinct user. Once I have the averages of these three fields for each user, I want to add them together while keeping the user association.
In my case, I start with a queryset, then store it in a dictionary, then store that in a list of tuples (because that lets me order it and get the lowest or highest rating).
Is it possible to achieve the same results through Django Queryset and aggregation?
For example:
My Queryset result:
<QuerySet [{'r': 1.0, 'r2': 3.0, 'user': 16, 'r3': 5.0}, {'r': 4.333333333333333, 'r2': 2.1666666666666665, 'user': 17, 'r3': 5.0}, {'r': 2.0, 'r1': 2.0, 'user': 18, 'r2': 2.0}]>
Instead I would like a queryset that would produce the following results:
<QuerySet [{16: 5.4}, {17: 3.5}, {18: 4.0}]>
16 being the user id, and 5.4 being avg(rate)+avg(rate1)+avg(rate2).
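To make the target computation concrete, here is a minimal plain-Python sketch of the per-user calculation described above: average each rating column per user, add the three averages, and sort. It does not use Django; the sample rows and the helper name summed_averages are invented for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows standing in for the foo table:
# (user_id, rate, rate1, rate2)
rows = [
    (16, 1, 3, 5),
    (17, 4, 2, 5),
    (17, 5, 3, 5),
    (18, 2, 2, 2),
]


def summed_averages(rows):
    """Return [(user_id, avg(rate) + avg(rate1) + avg(rate2)), ...], lowest first."""
    per_user = defaultdict(list)
    for user_id, *rates in rows:
        per_user[user_id].append(rates)
    totals = {
        # zip(*rate_rows) turns the rows of one user into three columns
        user_id: sum(mean(column) for column in zip(*rate_rows))
        for user_id, rate_rows in per_user.items()
    }
    return sorted(totals.items(), key=lambda item: item[1])


bad_rating = summed_averages(rows)
print(bad_rating)
```

The result has the same shape as the sorted list of tuples built in the view, which is what a single annotated and ordered queryset would need to reproduce.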

Based on #9000's comment, I managed to solve the problem:
rate = (
    foo.objects.values('user')
    .distinct()
    .annotate(r=Avg("rate"), r1=Avg("rate1"), r2=Avg("rate2"))
    .annotate(a=F('r') + F('r1') + F('r2'))
    .order_by('a')[:5]
)
The only issue I have now is that it returns the user id, which is useful, but it would be much more convenient if it returned the user object instead, so I could use it directly in the template.
Thanks,

Related

Django annotate field value from external dictionary

Let's say I have the following dict:
schools_dict = {
    '1': {'points': 10},
    '2': {'points': 14},
    '3': {'points': 5},
}
How can I put these values into my queryset using annotate?
I would like to do something like this, but it's not working:
schools = SchoolsExam.objects.all()
queryset = schools.annotate(
    total_point=schools_dict[F('school__school_id')]['points']
)
Models:
class SchoolsExam(Model):
    school = ForeignKey('School', on_delete=models.CASCADE)
class School(Model):
    school_id = CharField()
This code gives me an error: KeyError: F(school__school_id)

You cannot use an F object as a dictionary key, since a dictionary does not "understand" F objects: the lookup runs in Python before any query is built.
You can translate this to a conditional expression:
from django.db.models import Case, Value, When
schools = SchoolsExam.objects.annotate(
    total_point=Case(
        *[
            When(school__school_id=school_id, then=Value(v['points']))
            for school_id, v in schools_dict.items()
        ]
    )
)
This will thus "unwind" the dictionary into CASE WHEN school_id=1 THEN 10 WHEN school_id=2 THEN 14 WHEN school_id=3 THEN 5 END.
However using data in a dictionary often does not make much sense: usually you store this in a table and perform a JOIN.
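As a rough illustration of what the unwound expression looks like at the SQL level, the sketch below builds a CASE expression from the same dictionary and runs it against an in-memory SQLite table. It uses only the standard library; the table layout and the helper build_case are assumptions made for the demo, and this is not the exact SQL Django emits.

```python
import sqlite3

schools_dict = {
    '1': {'points': 10},
    '2': {'points': 14},
    '3': {'points': 5},
}


def build_case(column, mapping):
    """Build a parameterized CASE expression plus its parameter list.

    `column` is interpolated into the SQL text, so it must be a trusted,
    hard-coded name; only the keys and point values are bound as parameters.
    """
    whens, params = [], []
    for key, value in mapping.items():
        whens.append("WHEN ? THEN ?")
        params.extend([key, value['points']])
    return f"CASE {column} {' '.join(whens)} END", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE school (school_id TEXT)")
conn.executemany("INSERT INTO school VALUES (?)", [('1',), ('2',), ('3',), ('4',)])

case_sql, params = build_case("school_id", schools_dict)
rows = conn.execute(
    f"SELECT school_id, {case_sql} AS total_point FROM school ORDER BY school_id",
    params,
).fetchall()
print(rows)
```

School '4' has no entry in the dictionary and there is no ELSE branch, so its total_point comes back as NULL (None in Python).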

Django group by Choice Field and COUNT Zeros

Consider the following django model:
class Image(models.Model):
    image_filename = models.CharField(max_length=50)
class Rating(models.Model):
    DIMENSIONS = [
        ('happy', 'happiness'),
        ('competence', 'competence'),
        ('warm_sincere', 'warm/sincere'),
    ]
    rating_value = models.IntegerField()
    rating_dimension = models.CharField(max_length=50, choices=DIMENSIONS)
    image = models.ForeignKey(Image, on_delete=models.CASCADE)
Now, I'd like to group all Ratings by the number of ratings per category like this:
Rating.objects.values("rating_dimension").annotate(num_ratings=Count("rating_value"))
which returns a QuerySet like this:
[{'rating_dimension': 'happy', 'num_ratings': 2},
{'rating_dimension': 'competence', 'num_ratings': 5}]
Is there a way to include all not-rated dimensions? To achieve an output like:
[{'rating_dimension': 'happy', 'num_ratings': 2},
{'rating_dimension': 'competence', 'num_ratings': 5},
{'rating_dimension': 'warm_sincere', 'num_ratings': 0}] # ← zero occurrences should be included.

First we will create a dictionary with counts for all dimensions initialised to 0.
results = {dimension[0]: 0 for dimension in Rating.DIMENSIONS}
Next we will query the database:
queryset = Rating.objects.values("rating_dimension").annotate(num_ratings=Count("rating_value"))
Next we will update our results dictionary:
for entry in queryset:
    results[entry['rating_dimension']] = entry['num_ratings']
In the template we can iterate over this dictionary by {% for key, value in results.items %}. Or the dictionary can be converted to any suitable structure as per need in the views.
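If the list-of-dicts shape from the question is needed rather than a plain dictionary, the same idea can be wrapped in a small helper. This is only a sketch: the function name with_zero_counts is made up, and the grouped list stands in for the rows the annotated queryset would return.

```python
DIMENSIONS = [
    ('happy', 'happiness'),
    ('competence', 'competence'),
    ('warm_sincere', 'warm/sincere'),
]


def with_zero_counts(choices, grouped_rows):
    """Return one row per choice, using 0 for choices missing from grouped_rows."""
    counts = {row['rating_dimension']: row['num_ratings'] for row in grouped_rows}
    return [
        {'rating_dimension': key, 'num_ratings': counts.get(key, 0)}
        for key, _label in choices
    ]


# Stand-in for list(queryset) from the grouped query above.
grouped = [
    {'rating_dimension': 'happy', 'num_ratings': 2},
    {'rating_dimension': 'competence', 'num_ratings': 5},
]

filled = with_zero_counts(DIMENSIONS, grouped)
print(filled)
```

The output follows the order of DIMENSIONS, so unrated dimensions appear in a predictable position.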

Django ORM annotate with subquery

I've got the following models:
class Store(models.Model):
    name = models.CharField()
class Goods(models.Model):
    store = models.ForeignKey(Store, on_delete=models.CASCADE)
    total_cost = models.DecimalField()
    # different values ...
So, I filtered all the goods according to the parameters, and now my goal is to get one item from each store: the one with the lowest price among that store's goods.
stores = Store.objects.all()  # all stores
goods = Goods.objects.filter(..)  # filter goods
goods.annotate(min_price=Subquery(Min(stores.values('goods__total_cost'))))
I tried something like this, but I got an error:
AttributeError: 'Min' object has no attribute 'query'

I think that in your context you need a GROUP BY rather than a plain Django annotation.
From this SO answer:
>>> q = Book.objects.annotate(num_authors=Count('authors'))
>>> q[0].num_authors
2
>>> q[1].num_authors
1
q is the queryset of books, but each book has been annotated with the number of authors.
That is, if you annotate your goods queryset, it won't give you back a sorted or filtered set of objects; it only adds the new field min_price to each object. So I would suggest doing a GROUP BY operation as follows:
from django.db.models import Min
result = Goods.objects.values('store').annotate(min_val=Min('total_cost'))
Example
In [2]: from django.db.models import Min
In [3]: Goods.objects.values('store').annotate(min_val=Min('total_cost'))
Out[3]: <QuerySet [{'store': 1, 'min_val': 1}, {'store': 2, 'min_val': 2}]>
In [6]: Goods.objects.annotate(min_val=Min('total_cost'))
Out[6]: <QuerySet [<Goods: Goods object>, <Goods: Goods object>, <Goods: Goods object>, <Goods: Goods object>, <Goods: Goods object>]>
In [7]: Goods.objects.annotate(min_val=Min('total_cost'))[0].__dict__
Out[7]:
{'_state': <django.db.models.base.ModelState at 0x7f5b60168ef0>,
'id': 1,
'min_val': 1,
'store_id': 1,
'total_cost': 1}
In [8]: Goods.objects.annotate(min_val=Min('total_cost'))[1].__dict__
Out[8]:
{'_state': <django.db.models.base.ModelState at 0x7f5b6016af98>,
'id': 2,
'min_val': 123,
'store_id': 1,
'total_cost': 123}
UPDATE-1
I don't think this is a good idea, since it may cause performance issues, but you can try it if you want:
from django.db.models import Min
store_list = Store.objects.values_list('id', flat=True)  # ids of all Store instances
result_queryset = Goods.objects.none()  # start from an empty queryset so that | works
for store_id in store_list:
    # aggregate() returns a dict, so pull the value out of it
    min_value = Goods.objects.filter(store_id=store_id).aggregate(min_value=Min('total_cost'))['min_value']
    result_queryset = result_queryset | Goods.objects.filter(store_id=store_id, total_cost=min_value)
UPDATE-2
I think my Update-1 section has serious performance issues, so I found one possible answer to your question:
goods_queryset = Goods.objects.filter(**your_possible_filters)
result = goods_queryset.filter(
    store_id__in=[
        good['store']
        for good in Goods.objects.values('store').annotate(min_val=Min('total_cost'))
    ]
)
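For comparison, the "cheapest item per store" selection can be written as a correlated subquery. In Django this kind of query is usually expressed with Subquery and OuterRef; the sketch below instead uses the standard-library sqlite3 module so it runs on its own. The table layout and sample rows are assumptions that mirror the Goods model above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE goods (id INTEGER PRIMARY KEY, store_id INTEGER, total_cost REAL)"
)
# Hypothetical sample data: (id, store_id, total_cost)
conn.executemany(
    "INSERT INTO goods (id, store_id, total_cost) VALUES (?, ?, ?)",
    [(1, 1, 1.0), (2, 1, 123.0), (3, 2, 2.0), (4, 2, 7.5), (5, 3, 40.0)],
)

# For every row, the inner query computes the minimum cost within the same
# store; the outer WHERE keeps only rows whose cost equals that minimum.
cheapest = conn.execute(
    """
    SELECT g.id, g.store_id, g.total_cost
    FROM goods AS g
    WHERE g.total_cost = (
        SELECT MIN(inner_g.total_cost)
        FROM goods AS inner_g
        WHERE inner_g.store_id = g.store_id
    )
    ORDER BY g.store_id
    """
).fetchall()
print(cheapest)
```

Note that if two items in one store share the lowest price, both are returned; picking exactly one would need an extra tie-breaker such as the lowest id.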

Django complex query based on dicts

Tldr of Problem
Frontend is a form that requires a complex lookup with ranges and stuff across several models, given in a dict. Best way to do it?
Explanation
From the view, I receive a dict of the following form (After being processed by something else):
{'h_index': {"min": 10,"max":20},
'rank' : "supreme_overlord",
'total_citations': {"min": 10,"max":400},
'year_began': {"min": 2000},
'year_end': {"max": 3000},
}
The keys are column names from different models (Right now, 2 separate models, Researcher and ResearchMetrics), and the values are the range / exact value that I want to query.
Example (Above)
Belonging to model Researcher :
rank
year_began
year_end
Belonging to model ResearchMetrics
total_citations
h_index
Researcher has a One to Many relationship with ResearchMetrics
Researcher has a Many to Many relationship with Journals (not mentioned in question)
Ideally: I want to show the researchers who fulfill all the criteria above in a list of list format.
Researcher ID, name, rank, year_began, year_end, total_citations, h_index
[[123, "Thomas", "professor", 2000, 2012, 15, 20],
[ 343 ... ]]
What's the best way to go about solving this problem? (Including changes to form, etc?) I'm not very familiar with the whole form query model thing.
Thank you for your help!

To dynamically perform a query you pass a dict with items 'fieldname__lookuptype': value as **kwargs to Model.objects.filter.
So to filter on rank, year_began and year_end in your example above, you first need to transform the incoming dictionary into that form.
How exactly you do the transformation depends on how variable this incoming dictionary is. An example could be something like this:
filter_in = {
    'h_index': {"min": 10, "max": 20},
    'rank': "supreme_overlord",
    'total_citations': {"min": 10, "max": 400},
    'year_began': {"min": 2000},
    'year_end': {"max": 3000},
}
# Only these Researcher fields may be filtered on.
RESEARCHER_FIELDS = ['rank', 'year_began', 'year_end']
LOOKUP_MAPPING = {
    'min': 'gt',
    'max': 'lt'
}
filter_kwargs = {}
for field in RESEARCHER_FIELDS:
    if field not in filter_in:
        continue
    condition = filter_in[field]
    if isinstance(condition, dict):
        # A range: turn each bound into a 'field__lookup' key.
        for filter_type, value in condition.items():
            lookup_type = LOOKUP_MAPPING[filter_type]
            lookup = '%s__%s' % (field, lookup_type)
            filter_kwargs[lookup] = value
    else:
        # A plain value: exact match.
        filter_kwargs[field] = condition
This results in a dictionary like this:
{
'rank': 'supreme_overlord',
'year_began__gt': 2000,
'year_end__lt': 3000
}
Use it like this:
qs = Researcher.objects.filter(**filter_kwargs)
Regarding the fields total_citations and h_index from ResearchMetrics, I assume you want to aggregate the values. So in your example above you want either a sum or an average.
The principle is the same:
from django.db.models import Sum
METRICS_FIELDS = ['total_citations', 'h_index']
annotate_kwargs = {}
for field in METRICS_FIELDS:
    if field not in filter_in:
        continue
    annotated_field = '%s_sum' % field
    annotate_kwargs[annotated_field] = Sum('researchmetric__%s' % field)
    condition = filter_in[field]
    if isinstance(condition, dict):
        for filter_type, value in condition.items():
            lookup_type = LOOKUP_MAPPING[filter_type]
            lookup = '%s__%s' % (annotated_field, lookup_type)
            filter_kwargs[lookup] = value
    else:
        # Exact match on the aggregated value.
        filter_kwargs[annotated_field] = condition
Now your filter_kwargs look like this:
{
'h_index_sum__gt': 10,
'h_index_sum__lt': 20,
'rank': 'supreme_overlord',
'total_citations_sum__gt': 10,
'total_citations_sum__lt': 400,
'year_began__gt': 2000,
'year_end__lt': 3000
}
And your annotate_kwargs look like this:
{
    'h_index_sum': Sum('researchmetric__h_index'),
    'total_citations_sum': Sum('researchmetric__total_citations')
}
So your final call looks like this:
Researcher.objects.annotate(**annotate_kwargs).filter(**filter_kwargs)
There are some assumptions in my answer, but I hope you get the general idea.
There is one important point: validate the input properly, so that users can only filter on the fields you intend them to filter on. In my approach, this is ensured by hard-coding the field names in RESEARCHER_FIELDS and METRICS_FIELDS.
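The validation point can be made stricter than silently skipping unknown keys. The sketch below is one possible variant, not part of the approach above: it rejects any field or bound that is not explicitly allowed. The names ALLOWED_FIELDS and build_filter_kwargs are hypothetical, and the function only builds the dictionary; passing it to Researcher.objects.filter(**filter_kwargs) is left to the caller.

```python
# Hypothetical whitelist of filterable Researcher fields.
ALLOWED_FIELDS = {'rank', 'year_began', 'year_end'}
LOOKUP_MAPPING = {'min': 'gt', 'max': 'lt'}


def build_filter_kwargs(filter_in, allowed_fields=ALLOWED_FIELDS):
    """Translate {'field': value-or-range} into Django-style lookup kwargs.

    Raises ValueError for any field or range bound that is not whitelisted,
    instead of silently ignoring it.
    """
    filter_kwargs = {}
    for field, condition in filter_in.items():
        if field not in allowed_fields:
            raise ValueError(f"filtering on {field!r} is not allowed")
        if isinstance(condition, dict):
            for bound, value in condition.items():
                if bound not in LOOKUP_MAPPING:
                    raise ValueError(f"unknown bound {bound!r} for {field!r}")
                filter_kwargs[f"{field}__{LOOKUP_MAPPING[bound]}"] = value
        else:
            filter_kwargs[field] = condition
    return filter_kwargs


filter_kwargs = build_filter_kwargs({
    'rank': 'supreme_overlord',
    'year_began': {'min': 2000},
    'year_end': {'max': 3000},
})
print(filter_kwargs)
```

Failing loudly makes it easier to spot a form that sends an unexpected key, at the cost of having to handle the exception in the view.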

Python: How to store multiple values for one dictionary key

I want to store a list of ingredients in a dictionary, using the ingredient name as the key and storing both the ingredient quantity and measurement as values. For example, I would like to be able to do something like this:
ingredientList = {'flour' 500 grams, 'tomato' 10, 'mozzarella' 250 grams}
With the 'tomato' key, tomatoes do not have a measurement, only a quantity. Would I be able to achieve this in Python? Is there an alternate or more efficient way of going about this?

If you want lists, just use lists:
ingredientList = {'flour': [500, "grams"], 'tomato': [10], 'mozzarella': [250, "grams"]}
To get the items:
weight, meas = ingredientList['flour']
print(weight, meas)
500 grams
If you want to update, just use ingredientList[key].append(item).

You could use another dict.
ingredientList = {
'flour': {'quantity': 500, 'measurement': 'grams'},
'tomato': {'quantity': 10},
'mozzarella': {'quantity': 250, 'measurement': 'grams'}
}
Then you could access them like this:
print(ingredientList['mozzarella']['quantity'])
250
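A third option, sketched here as a suggestion rather than something from the answers above, is a small NamedTuple with an optional measurement. It keeps the named access of the nested dict while making the missing unit for tomatoes explicit. The names Amount and describe are made up for this example.

```python
from typing import NamedTuple, Optional


class Amount(NamedTuple):
    quantity: float
    measurement: Optional[str] = None  # None means "counted, no unit"


ingredients = {
    'flour': Amount(500, 'grams'),
    'tomato': Amount(10),
    'mozzarella': Amount(250, 'grams'),
}


def describe(name):
    """Format one ingredient, leaving the unit out when there is none."""
    amount = ingredients[name]
    unit = f" {amount.measurement}" if amount.measurement else ""
    return f"{amount.quantity}{unit} {name}"


print(describe('flour'))
print(describe('tomato'))
```

Because Amount is still a tuple, unpacking such as quantity, measurement = ingredients['flour'] keeps working.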
