Django generic query builder

I have a lot of models in my Django project, and many of them need their own queries for specific pages. While working on separation of concerns, I feel there should be a way to write one (or very few) methods that are generic enough, driven by their inputs, to serve as queries for ALL models.
Say I have models A, B, and C, and the following sets of queries:
A.objects.filter(id=object_id)
A.objects.filter(id=object_id).values("id", "name")
A.objects.filter(name="test")
B.objects.filter(relationA=A)
B.objects.filter(id__in=list_of_ids)
C.objects.all()
C.objects.all().exclude('last_name')
Is there any way to create a query by a given:
Model Name (A, B, C)
filter type (filter, all; the least important part, if this isn't doable)
comparison field (id, name, relation)
comparison type (x=x, x__in=list, x__lte=10)
comparison value (object_id, "test", list_of_ids)
For example, with X marking the dynamic parts, the function would look something like this (pseudocode):
def query(model, filter_type, comparison_field, comparison_type, comparison_value):
    # X.objects.X(X=X)
    return model.objects.filter_type(comparison_field + comparison_type=comparison_value)
When I tried it briefly, I immediately ran into the problem that comparison_field does not do what I need it to. I see Q objects mentioned a lot on SO; are they something I should be applying here, and if so, how?
Edit:
As suggested by Klaus D., I've implemented this with a dynamically built kwargs dictionary.
query_manager.py
def make_query_kwargs(kwargs_dict, field, value):
    kwargs_dict[field] = value
    return kwargs_dict

def make_object_query(model, kwargs, columns=()):
    # columns defaults to an immutable tuple instead of a shared mutable list
    return model.objects.filter(**kwargs).values(*columns)
Which I use by doing the following:
kwargs = {}
id_list = [1, 2, 3, 4]
kwargs = query_manager.make_query_kwargs(kwargs, "id__in", id_list)
query_result = query_manager.make_object_query(A, kwargs)
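A minimal sketch of how the five inputs listed in the question could map onto Django's keyword lookups. It assumes only that model.objects exposes the usual manager methods (filter, exclude, all); the names build_lookup and run_query are hypothetical and not part of Django.

```python
def build_lookup(field, comparison, value):
    """Return filter kwargs such as {"id__in": [1, 2]} or {"name": "test"}.

    An empty comparison (or "exact") means a plain equality lookup.
    """
    key = field if comparison in ("", "exact") else f"{field}__{comparison}"
    return {key: value}


def run_query(model, method="filter", lookups=None, columns=()):
    """Call model.objects.<method>(**lookups), optionally restricted to columns.

    `method` is the name of a manager method ("filter", "exclude", "all").
    "all" takes no arguments, so lookups are ignored for it.
    """
    manager_method = getattr(model.objects, method)
    if method == "all":
        queryset = manager_method()
    else:
        queryset = manager_method(**(lookups or {}))
    # Only call .values() when specific columns were requested
    return queryset.values(*columns) if columns else queryset
```

With these helpers, run_query(A, "filter", build_lookup("id", "in", [1, 2, 3, 4]), columns=("id", "name")) would correspond to A.objects.filter(id__in=[1, 2, 3, 4]).values("id", "name"). Unlike make_object_query, this sketch returns model instances rather than dictionaries when no columns are given.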

Related

Check if each value within list is present in the given Django Model Table in a SINGLE query

So let's say I want to implement this generic function:
def do_exist(key: str, values: list[Any], model: type[django.db.models.Model]) -> bool
Which checks if all the given values exist within a given model's column named key in a SINGLE query.
I've implemented something like this:
from django.db.models import Exists

def do_exist(key, values, model):
    # one EXISTS subquery per value; the lookup name is passed as a keyword via ** unpacking
    chained_exists = (Exists(model.objects.filter(**{key: value})) for value in values)
    qs = model.objects.filter(*chained_exists).values("pk")[:1]
    # Limit and values used for the sake of less payload
    return len(qs) > 0
It generates a pretty valid SQL query, but what scares me is that if I append an evaluating method call to qs, such as .first() instead of [:1], or .exists(), the MySQL connection drops.
Does anyone have a more elegant way of solving this?

If you know you're passing in N pks, then a count() query filtered by those pks should have exactly N results.
def do_exist(model, pks):
    return model.objects.filter(pk__in=pks).count() == len(pks)
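The count comparison above assumes the list contains no duplicates and that the column is the primary key. Here is a hedged sketch of a variant for an arbitrary column named key, as in the question. The name all_values_exist is hypothetical, and the sketch relies only on the standard queryset methods filter, values, distinct, and count.

```python
def all_values_exist(model, key, values):
    """Return True if every value occurs in column `key`, using one COUNT query."""
    # Duplicates in the input must not inflate the target count
    # (values are assumed to be hashable).
    unique_values = set(values)
    if not unique_values:
        return True  # vacuously true, and avoids a needless query
    found = (
        model.objects
        .filter(**{f"{key}__in": unique_values})
        .values(key)   # select only the compared column
        .distinct()    # the column may hold the same value in several rows
        .count()
    )
    return found == len(unique_values)
```

For example, all_values_exist(MyModel, "name", ["a", "b", "a"]) checks for two distinct names, not three.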

qs = MyModel.objects.filter(id__in=pks)
This gives you a queryset that you can apply .all() etc to

res = MyModel.objects.filter(id__in=pks)

In Django, you can use the "__in" lookup (for example, id__in) to filter a queryset down to a list of values.
results = Model.objects.filter(id__in=pks)

Dynamic sqlalchemy query

I have 2 sqlalchemy queries...
dummy = DBSession().query(Dummy).filter(Dummy.c_name == "foo")
dummy2 = DBSession().query(Dummy2).filter(Dummy2.c_name == "foo").filter(Dummy2.f_name == "bar")
And I have made up a sort of generic function that combines the two...
def generic(Object, c_name, f_name):
    dummy = DBSession().query(Object).filter(Object.c_name == c_name).filter(Object.f_name == f_name)
What's the best way to generically handle this, say if f_name doesn't exist or is not queryable in the Dummy2 table?
To summarise my question:
How do I create a general sqlalchemy query which can query any given table, where in some cases the attributes I am querying vary based on the given object?
I think I need some sort of reflection...maybe? or *args / **kwargs ... I dunno... help?

You should probably pack these into separate helper functions for reusability, but here's how I approached this:
check_fields = {'c_name': 'exists', 'f_name': 'doesnt_exist'}
existing_fields = {}
# check if the fields exist in the table
for field in check_fields:
    if field in given_table.c:
        existing_fields.update({field: check_fields.get(field)})
# construct the query based on existing fields
query = given_table.select()
if existing_fields:
    for k, v in existing_fields.items():
        query = query.where(getattr(given_table.c, k) == v)
I'm then using session.execute(query) to get the results. Here's a similar answer that doesn't use execute: How to get query in sqlalchemy ORM
Note: All the attributes will be chained with AND
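For ORM-mapped classes rather than Table objects, the same idea can be sketched with getattr. This assumes that mapped attributes such as Dummy.c_name support == to produce a filter expression, which is how SQLAlchemy columns behave. The helper name build_conditions is hypothetical.

```python
def build_conditions(mapped_class, **criteria):
    """Build equality conditions only for attributes the class actually has.

    Criteria naming a missing attribute are skipped rather than raising,
    which matches the "f_name may not exist" case in the question.
    """
    conditions = []
    for name, value in criteria.items():
        column = getattr(mapped_class, name, None)
        if column is not None:
            # For a SQLAlchemy column this builds an expression, not a bool
            conditions.append(column == value)
    return conditions
```

The result could then be passed as DBSession().query(Dummy).filter(*build_conditions(Dummy, c_name="foo", f_name="bar")); filter() combines multiple criteria with AND. Silently dropping a criterion widens the result set, so raising an error may be the safer choice in some applications. The sketch also does not check that the attribute is a column rather than, say, a method.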

Selecting column by string in Google App Engine NDB

I have an entity in Google App Engine as below:
class HesapKalemi(ndb.Model):
    hk = ndb.IntegerProperty(indexed=True)
    ha = ndb.StringProperty(indexed=True)
    A = ndb.FloatProperty(default=0.00)
    B = ndb.FloatProperty(default=0.00)
    C = ndb.FloatProperty(default=0.00)
    F = ndb.FloatProperty(default=0.00)
    G = ndb.FloatProperty(default=0.00)
    H = ndb.FloatProperty(default=0.00)
    I = ndb.FloatProperty(default=0.00)
    J = ndb.FloatProperty(default=0.00)
    DG = ndb.FloatProperty(default=0.00)
As is well known, a normal query looks like this:
sektorkodu = self.request.get('sektorkodu')
qall = HesapKalemi.query().order(HesapKalemi.hk)
for hesap in qall:
    hesap.ho = hesap.A
Is there any way to fetch column A by writing something like this:
hesap.GETTHECOLUMN('A') or
hesap.GETTHECOLUMN(sektorkodu)
I have a very wide (horizontal) table and want to query it without an if-else structure, using a .GETTHECOLUMN('string') method.
Does such a method exist?

In the NDB world, this is called projection, or a projection query. The docs on projection queries say the following:
Projection queries are similar to SQL queries of the form:
SELECT name, email, phone FROM CUSTOMER
So the .GETTHECOLUMN('A') method you're after would look like either of these:
qall_option_one = HesapKalemi.query().order(HesapKalemi.hk).fetch(projection=['A'])
qall_option_two = HesapKalemi.query().order(HesapKalemi.hk).fetch(projection=[HesapKalemi.A])
# to access the values
for hesap in qall_option_one:
    print(hesap)
# output:
# HesapKalemi(key=Key('HesapKalemi', 1234567890), A=0.00, _projection=('A',))
# HesapKalemi(key=Key('HesapKalemi', 1234567891), A=0.00, _projection=('A',))
# ...
This is a bit faster than getting the full entities with all of their properties, but you do still have to iterate through them afterwards, even if you want to just generate a list of the 'A' values. Another option you should look at is "Calling a Function For Each Entity (Mapping)", where you define a callback function to be called on each entity as the query runs. So let's say you just want a list of the 'A' values. You could form that list like this:
def callback(hesap):
    return hesap.A
a_values = HesapKalemi.query().map(callback)
# a_values = [0.00, 0.00, ...]
If you're really after performance, look into asynchronous gets.
Note: instead of projection, you could use GQL, but that would look messier/more confusing than using projection with the regular ndb Query syntax IMO.
Edit: To answer your question in your comment, you can use either projection or mapping to select data from multiple properties.
Projection of multiple properties:
qall_option_one = HesapKalemi.query().order(HesapKalemi.hk).fetch(projection=['A', 'B', 'C'])
qall_option_two = HesapKalemi.query().order(HesapKalemi.hk).fetch(projection=[HesapKalemi.A, HesapKalemi.B, HesapKalemi.C])
# to access the values
for hesap in qall_option_one:
    print(hesap)
# output:
# HesapKalemi(key=Key('HesapKalemi', 1234567890), A=0.00, B=0.00, C=0.00, _projection=('A', 'B', 'C',))
# HesapKalemi(key=Key('HesapKalemi', 1234567891), A=0.00, B=0.00, C=0.00, _projection=('A', 'B', 'C',))
# ...
Mapping to return multiple properties:
def callback(hesap):
    # this returns a tuple of A,B,C values
    return hesap.A, hesap.B, hesap.C
values = HesapKalemi.query().map(callback)
# values is a list of tuples
# values = [(0.00, 0.00, 0.00), (0.00, 0.00, 0.00), ...]
Edit #2: After rereading the question and comments, I think your question, or at least part of it, may be how to get the property from the model itself using a string, and not how to pull one column out of the datastore. To answer that question, use getattr(hesap, "property_name"), or, and this may be more suited to your needs, turn hesap into a dict with hesap_dict = hesap.to_dict(). Then you could do this:
property_name = 'some_string'
hesap = HesapKalemi.query().fetch(1)[0]
hesap_dict = hesap.to_dict()
property_value = hesap_dict.get(property_name, None)
You could pass hesap_dict to your Jinja2 template, and then I think you could accomplish what you asked about in your comments.
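The getattr approach from Edit #2 can be sketched as a small helper. Because the column name comes from a request parameter, this version checks it against a whitelist first. The names ALLOWED_COLUMNS and get_column are hypothetical; the property names are taken from the HesapKalemi model in the question.

```python
# Float properties of the model that callers may select by name
ALLOWED_COLUMNS = {"A", "B", "C", "F", "G", "H", "I", "J", "DG"}


def get_column(entity, column_name, default=None):
    """Read a property from an entity by its name given as a string."""
    if column_name not in ALLOWED_COLUMNS:
        # Reject names like "key" or "put" that are attributes but not data columns
        raise ValueError("unknown column: %r" % (column_name,))
    return getattr(entity, column_name, default)
```

Inside the loop from the question, value = get_column(hesap, sektorkodu) would then stand in for the hesap.GETTHECOLUMN(sektorkodu) call.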

Elegant Disjunctive Normal Form in Django

Let's say I've defined this model:
class Identifier(models.Model):
    user = models.ForeignKey(User)
    key = models.CharField(max_length=64)
    value = models.CharField(max_length=255)
Each user will have multiple identifiers, each with a key and a value. I am 100% sure I want to keep the design like this, there are external reasons why I'm doing it that I won't go through here, so I'm not interested in changing this.
I'd like to develop a function of this sort:
def get_users_by_identifiers(**kwargs):
    # something goes here
    return users
The function will return all users that have one of the key=value pairs specified in **kwargs. Here's an example usage:
get_users_by_identifiers(a=1, b=2)
This should return all users for whom a=1 or b=2. I've noticed that the way I've set this up, this amounts to a disjunctive normal form...the SQL query would be something like:
SELECT DISTINCT(user_id) FROM app_identifier
WHERE (key = "a" AND value = "1") OR (key = "b" AND value = "2") ...
I feel like there's got to be some elegant way to take the **kwargs input and do a Django filter on it, in just 1-2 lines, to produce this result. I'm new to Django though, so I'm just not sure how to do it. Here's my function now, and I'm completely sure it's not the best way to do it :)
def get_users_by_identifiers(**identifiers):
    users = []
    for key, value in identifiers.items():
        for identifier in Identifier.objects.filter(key=key, value=value):
            if identifier.user not in users:
                users.append(identifier.user)
    return users
Any ideas? :)
Thanks!

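import operator
from functools import reduce
from django.db.models import Q

def get_users_by_identifiers(**kwargs):
    q = reduce(operator.or_, (Q(identifier__key=k, identifier__value=v)
                              for (k, v) in kwargs.items()))
    return User.objects.filter(q).distinct()  # distinct: the join can repeat a user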

Best practices for manipulating database result sets in Python?

I am writing a simple Python web application that consists of several pages of business data formatted for the iPhone. I'm comfortable programming Python, but I'm not very familiar with Python "idiom," especially regarding classes and objects. Python's object oriented design differs somewhat from other languages I've worked with. So, even though my application is working, I'm curious whether there is a better way to accomplish my goals.
Specifics: How does one typically implement the request-transform-render database workflow in Python? Currently, I am using pyodbc to fetch data, copying the results into attributes on an object, performing some calculations and merges using a list of these objects, then rendering the output from the list of objects. (Sample code below, SQL queries redacted.) Is this sane? Is there a better way? Are there any specific "gotchas" I've stumbled into in my relative ignorance of Python? I'm particularly concerned about how I've implemented the list of rows using the empty "Record" class.
class Record(object):
    pass

def calculate_pnl(records, node_prices):
    for record in records:
        try:
            # fill RT and DA prices from the hash retrieved above
            if hasattr(record, 'sink') and record.sink:
                record.da = node_prices[record.sink][0] - node_prices[record.id][0]
                record.rt = node_prices[record.sink][1] - node_prices[record.id][1]
            else:
                record.da = node_prices[record.id][0]
                record.rt = node_prices[record.id][1]
            # calculate dependent values: RT-DA and PNL
            record.rtda = record.rt - record.da
            record.pnl = record.rtda * record.mw
        except:
            print sys.exc_info()

def map_rows(cursor, mappings, callback=None):
    records = []
    for row in cursor:
        record = Record()
        for field, attr in mappings.iteritems():
            setattr(record, attr, getattr(row, field, None))
        if not callback or callback(record):
            records.append(record)
    return records

def get_positions(cursor):
    # get the latest position time
    cursor.execute("SELECT latest data time")
    time = cursor.fetchone().time
    hour = eelib.util.get_hour_ending(time)
    # fetch the current positions
    cursor.execute("SELECT stuff FROM atable", (hour))
    # read the rows
    nodes = {}
    def record_callback(record):
        if abs(record.mw) > 0:
            if record.id: nodes[record.id] = None
            return True
        else:
            return False
    records = util.map_rows(cursor, {
        'id': 'id',
        'name': 'name',
        'mw': 'mw'
    }, record_callback)
    # query prices
    for node_id in nodes:
        # RT price
        row = cursor.execute("SELECT price WHERE ? ? ?", (node_id, time, time)).fetchone()
        rt5 = row.lmp if row else None
        # DA price
        row = cursor.execute("SELECT price WHERE ? ? ?", (node_id, hour, hour)).fetchone()
        da = row.da_lmp if row else None
        # update the hash value
        nodes[node_id] = (da, rt5)
    # calculate the position pricing
    calculate_pnl(records, nodes)
    # sort
    records.sort(key=lambda r: r.name)
    # return the records
    return records

The empty Record class and the free-floating function that (generally) applies to an individual Record are a hint that you haven't designed your class properly.
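class Record(object):
    """Assuming rtda and pnl must exist."""

    def __init__(self):
        self.da = 0
        self.rt = 0
        self.rtda = 0  # or whatever
        self.pnl = None
        self.sink = None  # Not clear what this is

    def setPnl(self, node_prices):
        # fill RT and DA prices from the hash retrieved above
        # (self.id and self.mw are set when the row is mapped, as in the question)
        self.da, self.rt = node_prices[self.id]
        if self.sink:
            sink_da, sink_rt = node_prices[self.sink]
            self.da, self.rt = sink_da - self.da, sink_rt - self.rt
        # calculate dependent values: RT-DA and PNL
        self.rtda = self.rt - self.da
        self.pnl = self.rtda * self.mw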
Now, your calculate_pnl( records, node_prices ) is simpler and uses the object properly.
def calculate_pnl(records, node_prices):
    for record in records:
        record.setPnl(node_prices)
The point isn't to trivially refactor the code in small ways.
The point is this: A Class Encapsulates Responsibility.
Yes, an empty-looking class is usually a problem. It means the responsibilities are scattered somewhere else.
A similar analysis holds for the collection of records. This is more than a simple list, since the collection -- as a whole -- has operations it performs.
The "Request-Transform-Render" isn't quite right. You have a Model (the Record class). Instances of the Model get built (possibly because of a Request.) The Model objects are responsible for their own state transformations and updates. Perhaps they get displayed (or rendered) by some object that examines their state.
It's that "Transform" step that often violates good design by scattering responsibility all over the place. "Transform" is a hold-over from non-object design, where responsibility was a nebulous concept.
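The remark about the collection of records can be sketched the same way. This is only an illustration under assumptions: the class name Positions and its methods are hypothetical, and each record is assumed to provide setPnl(node_prices) and a name attribute, as in the Record class discussed above.

```python
class Positions(object):
    """Collection of records plus the operations that apply to all of them."""

    def __init__(self, records):
        self.records = list(records)

    def price(self, node_prices):
        # Each record updates its own state; the collection only coordinates
        for record in self.records:
            record.setPnl(node_prices)
        return self

    def sorted_by_name(self):
        return sorted(self.records, key=lambda record: record.name)
```

With this in place, the pricing and sorting steps at the end of get_positions could become Positions(records).price(nodes).sorted_by_name().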

Have you considered using an ORM? SQLAlchemy is pretty good, and Elixir makes it beautiful. It can really reduce the amount of boilerplate code needed to deal with databases. Also, a lot of the gotchas mentioned have already shown up, and the SQLAlchemy developers have dealt with them.

Depending on how much you want to do with the data, you may not need to populate an intermediate object. The cursor's header data structure will let you get the column names; a bit of introspection will let you make a dictionary with col-name:value pairs for the row.
You can pass the dictionary to the % operator. The docs for the odbc module explain how to get at the column metadata.
This snippet shows the % operator applied in this manner:
>>> a={'col1': 'foo', 'col2': 'bar', 'col3': 'wibble'}
>>> 'Col1=%(col1)s, Col2=%(col2)s, Col3=%(col3)s' % a
'Col1=foo, Col2=bar, Col3=wibble'
>>>
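A minimal sketch of the introspection step, assuming the driver follows DB-API 2.0, where cursor.description holds one entry per column with the column name as its first item. The standard library's sqlite3 stands in for the real connection here so the example is runnable; the helper name rows_as_dicts and the positions table are hypothetical.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Yield each remaining row as a {column_name: value} dict."""
    column_names = [column[0] for column in cursor.description]
    for row in cursor:
        yield dict(zip(column_names, row))


# sqlite3 is used only as a stand-in for the actual database connection
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE positions (id TEXT, name TEXT, mw REAL)")
connection.execute("INSERT INTO positions VALUES ('n1', 'alpha', 2.5)")

cursor = connection.execute("SELECT id, name, mw FROM positions")
rows = list(rows_as_dicts(cursor))

# Each dict can be passed straight to the % operator
line = "%(name)s: %(mw).1f MW" % rows[0]
```

This avoids the per-query mapping dictionary from the question, at the cost of tying attribute names to the column names in the SQL.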

Using an ORM for an iPhone app might be a bad idea because of performance issues; you want your code to be as fast as possible, so you can't avoid boilerplate code. If you are considering an ORM, besides SQLAlchemy I'd recommend Storm.
