How to do general maths in sql query in django? - python

I'd love to run the following query in Django, ideally without iterating in Python; I just want the database call to return the result of the query below. Unfortunately, according to the docs this doesn't seem to be possible: only the general functions like Avg, Max and Min are available. I'm currently on Django 1.4, but I'm happy to rewrite things for Django 1.8 (hence the docs page; I've heard that 1.8 does a lot of these things much better than 1.4).
select sum(c.attr1 * fs.attr2)/ sum(c.attr1) from fancyStatistics as fs
left join superData s on fs.super_id=s.id
left join crazyData c on s.crazy_id=c.id;
Note:
The main reason for doing this in django directly is that if we ever want to change our database from MySQL to something more appropriate for django, it would be good not to have to rewrite all the queries.

You should be able to get aggregates with F expressions to do most of what you want without dropping into SQL.
https://docs.djangoproject.com/en/1.8/topics/db/aggregation/#joins-and-aggregates
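from django.db.models import F, FloatField, Sum

aggregate_dict = FancyStatistics.objects.aggregate(
    # Weighted sum: attr1 from the joined table times attr2 on this model
    sum1=Sum(
        F('superdata__crazydata__attr1') * F('attr2'),
        output_field=FloatField(),
    ),
    # Sum of the weights
    sum2=Sum('superdata__crazydata__attr1'),
)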
result = aggregate_dict['sum1'] / aggregate_dict['sum2']
You need to specify the output fields if the data types used are different.
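To sanity-check the arithmetic, here is a minimal sketch that runs the original SQL against an in-memory SQLite database and compares it with dividing two separately computed sums, which is what the aggregate() answer does. Only the table and column names come from the question; the column types and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; only the names match the question.
conn.executescript("""
    CREATE TABLE crazyData (id INTEGER PRIMARY KEY, attr1 REAL);
    CREATE TABLE superData (id INTEGER PRIMARY KEY, crazy_id INTEGER);
    CREATE TABLE fancyStatistics (id INTEGER PRIMARY KEY, super_id INTEGER, attr2 REAL);
    INSERT INTO crazyData VALUES (1, 2.0), (2, 3.0);
    INSERT INTO superData VALUES (1, 1), (2, 2);
    INSERT INTO fancyStatistics VALUES (1, 1, 10.0), (2, 2, 20.0), (3, 2, 40.0);
""")

JOINS = """
    FROM fancyStatistics AS fs
    LEFT JOIN superData s ON fs.super_id = s.id
    LEFT JOIN crazyData c ON s.crazy_id = c.id
"""

# One query, as in the question: attr2 averaged with attr1 as the weight.
(single_query,) = conn.execute(
    "SELECT SUM(c.attr1 * fs.attr2) / SUM(c.attr1)" + JOINS
).fetchone()

# Two aggregates fetched together and divided in Python, as in the answer.
sum1, sum2 = conn.execute(
    "SELECT SUM(c.attr1 * fs.attr2), SUM(c.attr1)" + JOINS
).fetchone()
two_step = sum1 / sum2

print(single_query, two_step)
```

Both approaches still need only one round trip to the database; the division simply moves from SQL into Python.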

You can do that query in Django directly using your SQL expression. Check the docs concerning performing raw SQL queries.

Related

Best way to save Raw SQL queries in Django

What is the best way to make raw SQL queries in django?
I have to search a table for the mode of another table. I could not find a way to solve this in django's ORM so I turned to raw SQL queries.
Yet creating all these very long queries in python is very unreadable and does not feel like a proper way to do this. Is there a way to save these queries in a neat format perhaps in the database.
I have to join three separate tables and compute the mode of a few columns on the last table. The length of the queries is getting very big and the code to make these queries becomes very unreadable. An example query would be
SELECT * FROM "core_assembly" INNER JOIN (SELECT * FROM "core_taxonomy" INNER JOIN(SELECT "core_phenotypic"."taxonomy_id" , \
array_agg("core_phenotypic"."isolation_host_NCBI_tax_id") FILTER (WHERE "core_phenotypic"."isolation_host_NCBI_tax_id" IS NOT NULL) \
AS super_set_isolation_host_NCBI_tax_ids FROM core_phenotypic GROUP BY "core_phenotypic"."taxonomy_id") "mode_table" ON \
"core_taxonomy"."id"="mode_table"."taxonomy_id") "tax_mode" ON "core_assembly"."taxonomy_id"="tax_mode"."id" WHERE ( 404=ANY(super_set_isolation_host_NCBI_tax_ids));
Where I would have a very big parse function to make all the WHERE clauses based on user input.
You can try this:
from django.db import connection
cursor = connection.cursor()
raw_query = "write your query here"
cursor.execute(raw_query)
You can also run raw queries for models, e.g. MyModel.objects.raw('my query').
See "Performing raw SQL queries" in the Django documentation for more.
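Since the question mentions building WHERE clauses from user input, one possible way to keep that manageable is to generate the clause and its parameters separately and pass the parameters to cursor.execute() instead of formatting them into the string. The sketch below is only an illustration: build_where, ALLOWED_COLUMNS and the "name" column are hypothetical, and "%s" is used because that is the placeholder style Django's cursor expects.

```python
# Hypothetical whitelist of columns that user input may filter on.
ALLOWED_COLUMNS = {"taxonomy_id", "name"}


def build_where(filters):
    """Return (where_sql, params) for a dict mapping column -> value."""
    clauses, params = [], []
    for column, value in sorted(filters.items()):
        # Column names cannot be passed as parameters, so check them explicitly.
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unexpected column: {column}")
        clauses.append(f'"{column}" = %s')
        params.append(value)
    where_sql = " AND ".join(clauses) if clauses else "TRUE"
    return where_sql, params


where_sql, params = build_where({"taxonomy_id": 404, "name": "E. coli"})
query = f'SELECT * FROM "core_assembly" WHERE {where_sql}'
# With Django this would be run as: cursor.execute(query, params)
print(query, params)
```

Long static parts of a query can also live in a module-level constant or a separate .sql file, so that only the variable WHERE part is assembled in code.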

How do I write my Django query when the WHERE clause is meant to have a function?

I'm using Django and Python 3.7 along with PostgreSQL 9.5. I have a column of type text in my PostgreSQL table, which records URLs for articles. I want to run a query that compares everything before the query string, e.g.
SELECT * FROM article where regexp_replace(url, '\?.*$', '') = :url_wo_query_info
but I'm not sure how to pull this off in Django. Normally, if I want to query straight up on just a URL, I could write
Article.objects.filter(url=url)
But I'm unsure how to do the above in Django's lingo because there is a more complicated function involved.
You can use Func with F expressions to use database functions on model fields. Your query would look like this in Django ORM:
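from django.db.models import F, Func, Value

Article.objects.annotate(
    processed_url=Func(
        F('url'),
        # Raw string so the backslash reaches the database unchanged
        Value(r'\?.*$'), Value(''),
        function='regexp_replace',
    )
).filter(processed_url=url_wo_query_info)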
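To see what the pattern does, and to compute url_wo_query_info on the Python side, the same regular expression can be tried with the standard re module. This is a rough stand-in for PostgreSQL's regexp_replace, not the database function itself; strip_query_string and the sample URLs are made up for illustration.

```python
import re


def strip_query_string(url):
    # Same pattern as in the regexp_replace call: remove the first "?"
    # and everything after it, leaving the URL untouched if there is none.
    return re.sub(r"\?.*$", "", url)


url_wo_query_info = strip_query_string(
    "https://example.com/article/1?utm_source=feed&x=1"
)
unchanged = strip_query_string("https://example.com/article/2")
print(url_wo_query_info, unchanged)
```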

query.group_by in Django 1.9

I am moving code from Django 1.6 to 1.9.
In 1.6 I had this code
models.py
class MyReport(models.Model):
    group_id = models.PositiveIntegerField(blank=False, null=False)
views.py
query = MyReport.objects.filter(owner=request.user).query
query.group_by = ['group_id']
entries = QuerySet(query=query, model=MyReport)
The query would return one object for each 'group_id'; due to the way I use it, any table row with the group_id would do as a representative.
With 1.9 this code is broken. The query after the second line above is:
SELECT "reports_myreport"."group_id", ... etc FROM "reports_myreport" WHERE "reports_myreport"."owner_id" = 1 GROUP BY "reports_myreport"."group_id", "reports_report"."otherfield", ...
Basically it lists all the table fields in the group by clause, making the query return the whole table.
Even though in the debugger I see
query.group_by = ['group_by']
query.group_by doesn't appear to be a method in 1.9, and the changelogs for 1.7-1.9 don't suggest that anything changed.
Is there a better way - not depending on internal Django stuff - I can use for my query?
Any way to fix my current query?
You can use order_by() to get the results ordered; in the same query you can also order by a second criterion.
If you want the groups, you will need to iterate over the collection to retrieve those values.
If you consume all of the results returned by the query, you can consider:
a) itertools.groupby which makes an in-memory group by instead, but you should not use it for large data sets.
b) Another option is to use Manager.raw() but you will need to write SQL inside Django, like this:
for report in MyReport.objects.raw('SELECT * FROM reporting_report GROUP by group_id'):
    print(report)
This will work for large data sets, but you could lose compatibility with some database engines.
Bonus: I recommend understanding exactly what the old code did before rewriting it.
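Option (a) could look roughly like the sketch below, which keeps the first row of each group as its representative. Plain dictionaries stand in for MyReport instances here, and the sample rows are invented; with a real queryset the sort would more likely be done by the database via order_by('group_id').

```python
from itertools import groupby
from operator import itemgetter

# Stand-ins for rows fetched from the database (hypothetical data).
rows = [
    {"id": 1, "group_id": 7},
    {"id": 2, "group_id": 3},
    {"id": 3, "group_id": 7},
    {"id": 4, "group_id": 3},
]

# groupby only merges adjacent items, so the rows must be sorted by the key first.
rows_sorted = sorted(rows, key=itemgetter("group_id"))

# Take the first row of each group as its representative.
representatives = [
    next(group) for _, group in groupby(rows_sorted, key=itemgetter("group_id"))
]
print(representatives)
```

Everything is loaded into memory before grouping, which is why this only suits small result sets.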

Django OR query using Extra and Filter

I am trying to use Django's ORM to generate a query using both extra and filter methods. Something like this:
Model.objects.filter(clauseA).extra(clauseB).all()
This generates a query, but the issue is that everything in the filter clause is AND'd with everything in the extra clause, so the sql looks like:
SELECT * FROM model WHERE clauseA AND clauseB.
My question is, is there a way to change the default combination operator for a query in Django such that the query generated will be:
SELECT * FROM model WHERE clauseA OR clauseB.
Try Q objects:
Model.objects.filter(Q(clauseA) | Q(clauseB))
EDIT
try this
Model.objects.filter(clauseA) | Model.objects.extra(clauseB)
It might be easier if you just get rid of the filter clause, and include that filter directly into extra OR'd with your Postgres specific function. I think it is already a limitation of the Django ORM.
You can attempt to create your own Func expression though. Once you have created one for your Postgres specific function, you might be able to use a combination of Func(), F(), and Q() objects to get rid of that nasty .extra() function and chain them nicely.

Django charfield queryset filter without escaping the MYSQL wildcard

I'm looking for a way to do this:
qs = MyModel.objects.filter(mystring__like="____10____")
#Which would create a sql clause
... LIKE '____10____'
instead of behave like this:
qs = MyModel.objects.filter(mystring__icontains="____10____")
#Which creates a sql clause
... LIKE %\_\_\_\_10\_\_\_\_%
I know I can use a regex filter, but that's substantially slower and more error prone than just using the built-in mysql wildcard feature (I've tested it directly in mysql, the query strings are long enough that the difference is substantial).
EDIT:
figured out how to do this with the .extra() method with madisvain's help.
qs = MyModel.objects.extra(where=["`mystring` LIKE '____10____'"])
In terms of performance difference, 2000 random queries with the regex approach took 20.5 seconds, with this approach 2000 random queries take 6 seconds.
I'm not aware of a built-in __like lookup in Django. The docs explain how percent signs and underscores are escaped in LIKE statements:
https://docs.djangoproject.com/en/dev/topics/db/queries/#escaping-percent-signs-and-underscores-in-like-statements
You should be writing the query like this.
qs = MyModel.objects.filter(mystring__contains="____10____")
Or this.
qs = MyModel.objects.extra(where=["mystring LIKE '____10____'"])
Read more about the extra() method here:
https://docs.djangoproject.com/en/dev/ref/models/querysets/#extra
You can use the raw() manager method to perform raw SQL queries, so the line would become:
qs = MyModel.objects.raw("SELECT * from MyApp_MyModel where mystring like %s", [variable])
I would suggest the regex filter or its case-insensitive iregex version. Example:
qs = MyModel.objects.filter(mystring__regex="....10....")
Disadvantage: check regex support in the backend you use and the performance impact. A backend change could also cause problems, because the regular expression syntax is that of the database backend in use.
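One thing to watch when swapping LIKE for a regex: LIKE '____10____' has to match the whole string, while an unanchored regex typically matches anywhere inside it. The sketch below illustrates the difference with SQLite's LIKE and Python's re module as stand-ins; the table t and the sample values are made up, and the database's own regex engine may differ in details.

```python
import re
import sqlite3

values = ["abcd10efgh", "xx10yy", "zzabcd10efghzz"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (mystring TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])

# Each underscore matches exactly one character, and the pattern covers the whole value.
like_matches = [
    row[0]
    for row in conn.execute(
        "SELECT mystring FROM t WHERE mystring LIKE '____10____' ORDER BY rowid"
    )
]

# Unanchored: also matches longer strings that merely contain the pattern.
unanchored = [v for v in values if re.search(r"....10....", v)]

# Anchored with ^ and $: the same result as the LIKE query for these values.
anchored = [v for v in values if re.search(r"^....10....$", v)]

print(like_matches, unanchored, anchored)
```

If the regex route is taken, a pattern such as "^....10....$" is probably closer to what the LIKE version does.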
