Conditionally change field value for lookup in Django ORM - python

I'm trying to conditionally change a field value during a lookup. I have a specific order in mind, and I don't want to overwrite the field value, just sort it my way. Let's say I have a class Product, and every object has a product_code field. I want to filter with less-than-or-equal, but it's not trivial: most of the time product_code looks like A01, B02 and so on, and Django's lte lookup would work. But now I have values such as 0001C01, which I would like to be the biggest. So during the lookup I would like to add 0000 at the beginning of every string that does not have such a prefix, so the values would compare as 0001C01, 0000B02, 0000A01.

You can conditionally annotate your queryset to get a new field that holds the desired value, and then use this field in your filter or order_by clause. For example, you could do the following:
from django.db.models import CharField, Value as V, F, Q, Case, When
from django.db.models.functions import Concat

Product.objects.annotate(
    new_product_code=Case(
        When(
            product_code__iregex=r'^[A-Z]+.*',  # If it starts with letters
            then=Concat(V('0000'), 'product_code', output_field=CharField()),  # Then prepend four 0's
        ),
        default=F('product_code'),  # Else, the original value
    )
).filter(new_product_code__lte='whatever you like')  # Now filter by using your new value
Relevant parts of the documentation are conditional expressions, database functions and QuerySet API reference
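To see what the annotation is meant to achieve, here is a minimal plain-Python sketch of the same padding rule, applied to the three codes from the question. The helper name padded_code is hypothetical and this is not Django code; it only mirrors the Case/When logic so the resulting order can be checked by hand.

```python
import re


def padded_code(product_code, prefix="0000"):
    """Hypothetical helper: pad codes that start with a letter, as the When() branch does."""
    if re.match(r"^[A-Za-z]", product_code):
        return prefix + product_code
    # Codes that already start with digits are left as they are (the default branch).
    return product_code


codes = ["0001C01", "B02", "A01"]

# Sorting by the padded value puts 0001C01 last, as the question wants.
ordered = sorted(codes, key=padded_code)

# A "less than or equal to B02" filter on the padded values.
at_most_b02 = [code for code in codes if padded_code(code) <= padded_code("B02")]
```

Note that the comparison value is padded too; in the queryset version the string passed to new_product_code__lte would need the same treatment.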

This sounds fairly straightforward. Fetch the desired Product objects, and for each one, prepend 0000 to product_code if it doesn't start with that string.
products = Product.objects.filter(some_query_expression)
for product in products:
    if not product.product_code.startswith('0000'):
        product.product_code = '0000' + product.product_code
It's not clear if you want to save this value back to the database, or just use it for temporary comparisons. If you do want to save it, call product.save().

Related

Variable filter for SQLAlchemy Query

I'm adding a search feature to my application (created using PyQt5) that will allow the user to search an archive table in the database. I've provided applicable fields for the user to choose to match rows with. I'm having some trouble making the query filter use only what was provided by the user, given that the other fields will be empty strings.
Here's what I have so far:
def search_for_order(pierre):
    fields = {'archive.pat.firstname': pierre.search_firstname.text(),
              'archive.pat.lastname': pierre.search_lastname.text(),
              'archive.pat.address': pierre.search_address.text(),
              'archive.pat.phone': pierre.search_phone.text(),
              'archive.compound.compname': pierre.search_compname.text(),
              'archive.compound.compstrength': pierre.search_compstrength.text(),
              'archive.compound.compform': pierre.search_compform.currentText(),
              'archive.doc.lastname': pierre.search_doctor.text(),
              'archive.clinic.clinicname': pierre.search_clinic.text()
              }
    filters = {}
    for field, value in fields.items():
        if value is not '':
            filters[field] = value
    query = session.query(Archive).join(Patient, Prescribers, Clinic, Compound)\
        .filter(and_(field == value for field, value in filters.items())).all()
The fields dictionary collects the values of all the fields in the search form. Some of them will be blank, resulting in empty strings. filters is intended to be a dictionary of the object names and the value to match that.
The problem lies in how you define the expressions inside your and_ conjunction. As written, you're comparing each field name, which is a plain string, with the corresponding value, and that comparison simply evaluates to False each time.
To properly populate the and_ conjunction you have to create a list of what sqlalchemy calls BinaryExpression objects.
In order to do so I'd change your code like this:
1) First use actual references to your table classes in your definition of fields:
fields = {
    (Patient, 'firstname'): pierre.search_firstname.text(),
    (Patient, 'lastname'): pierre.search_lastname.text(),
    (Patient, 'address'): pierre.search_address.text(),
    (Patient, 'phone'): pierre.search_phone.text(),
    (Compound, 'compname'): pierre.search_compname.text(),
    (Compound, 'compstrength'): pierre.search_compstrength.text(),
    (Compound, 'compform'): pierre.search_compform.currentText(),
    (Prescribers, 'lastname'): pierre.search_doctor.text(),
    (Clinic, 'clinicname'): pierre.search_clinic.text()
}
2) Define filters as a list instead of a dictionary:
filters = list()
3) To populate the filters list explode the tuple of table and fieldname used as key in the fields dictionary and add the value to again create tuples but now with three elements. Append each of the newly created tuples to the list of filters:
for table_field, value in fields.items():
    table, field = table_field
    if value:
        filters.append((table, field, value))
4) Now transform the created list of filter definitions to a list of BinaryExpression objects usable by sqlalchemy:
binary_expressions = [getattr(table, attribute) == value for table, attribute, value in filters]
5) Finally apply the binary expressions to your query, make sure it's presented to the and_ conjunction in a consumable form:
query = session.query(Archive).join(Patient, Prescribers, Clinic, Compound)\
    .filter(and_(*binary_expressions)).all()
I'm not able to test that solution within your configuration, but a similar test using my environment was successful.
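The difference between comparing strings and comparing column attributes can be sketched without a database. In the snippet below, Column and Patient are hypothetical stand-ins written only for illustration; they are not SQLAlchemy classes. The stand-in overloads == to return a description of the comparison, loosely imitating how a column comparison yields an expression object rather than a boolean.

```python
class Column:
    """Hypothetical stand-in for a mapped column (not SQLAlchemy's own class)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Return a description of the comparison instead of True/False.
        return ("eq", self.name, other)


class Patient:
    """Hypothetical stand-in for a mapped table class."""
    firstname = Column("patient.firstname")
    lastname = Column("patient.lastname")


fields = {
    (Patient, "firstname"): "Ann",
    (Patient, "lastname"): "",  # left blank in the search form
}

# Comparing a plain string key with the value is an ordinary bool.
string_comparison = ("archive.pat.firstname" == "Ann")

# Looking the attribute up on the class gives an object whose == builds an expression.
expressions = [
    getattr(table, attribute) == value
    for (table, attribute), value in fields.items()
    if value  # skip fields the user left empty
]
```

Only the non-empty field survives, and what ends up in the list is an expression-like object rather than False.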
Once you have a query object bound to a table in SQLAlchemy (that is, what session.query(Archive) returns in the code above), calling certain methods on it returns a new, modified query with that filter already applied.
So, my preferred way of combining several and filters is to start from the bare query, iterate over the filters to be used, and for each, add a new .filter call and reassign the query:
query = session.query(Archive).join(Patient, Prescribers, Clinic, Compound)
for field, value in filters.items():
    query = query.filter(field == value)
results = query.all()
Using and_ or or_ as you intend can also work. In your example, the only thing missing was an *. Without an * preceding the generator expression, it is passed as the first (and sole) parameter to and_. With a prefixed *, all elements of the iterator are unpacked in place, each one passed as a separate argument:
...
.filter(and_(*(field == value for field, value in filters.items()))).all()
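The effect of the * can be checked in plain Python. The function count_clauses below is a hypothetical stand-in for and_ that only reports how many positional arguments it received; the filter values are made up for the example.

```python
def count_clauses(*clauses):
    """Hypothetical stand-in for and_: report how many arguments arrived."""
    return len(clauses)


filters = {"firstname": "Ann", "phone": "555"}

# Without *, the whole generator is one single argument.
without_star = count_clauses(f"{key}={value}" for key, value in filters.items())

# With *, each generated item becomes its own argument.
with_star = count_clauses(*(f"{key}={value}" for key, value in filters.items()))
```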

Filter Django queryset for a dict value

I have a queryset of Products with JSONField attributes containing dict
class Product(models.Model):
    attributes = JSONField(
        pgettext_lazy('Product field', 'attributes'),
        encoder=DjangoJSONEncoder, default={})
I want to filter Products where attributes['12'] == '31'
Following one works:
qs.filter(attributes__contains={'12': '31'})
Following one does not:
qs.filter(attributes__12='31')
Is this something I can achieve with PostgreSQL or should I move it to ES?
EDIT:
Unfortunately I cannot use first solution, as this dict may contain more keys.
First solution works well. Given we have:
product.attributes = {'333': ['6', '1']}
We can filter it out by:
Product.objects.filter(attributes__contains={'333': ['6']})
etc. Totally overlooked it.
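Why contains still matches when the dict holds more keys can be illustrated with a simplified containment check in plain Python. This is only a rough model of how I understand JSON containment to behave (dicts match key by key, lists match if every wanted element is found somewhere); the function name json_contains is hypothetical and database edge cases may differ.

```python
def json_contains(container, contained):
    """Simplified, hypothetical model of a JSON containment test."""
    if isinstance(contained, dict):
        # Every wanted key must exist and its value must be contained in turn.
        return isinstance(container, dict) and all(
            key in container and json_contains(container[key], value)
            for key, value in contained.items()
        )
    if isinstance(contained, list):
        # Every wanted element must be contained in some element of the list.
        return isinstance(container, list) and all(
            any(json_contains(item, wanted) for item in container)
            for wanted in contained
        )
    return container == contained


attributes = {"333": ["6", "1"], "12": "31"}

matches_single_key = json_contains(attributes, {"12": "31"})
matches_partial_list = json_contains(attributes, {"333": ["6"]})
matches_wrong_value = json_contains(attributes, {"12": "32"})
```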
You should be able to use the second format, i.e. qs.filter(attributes__key='value').
Your issue in this case, as explained in the docs, is that when using an integer as a key in a JSON query, that key will be used as the index of an array, thus it's interpreted as attributes[12] instead of attributes['12'].
As long as you stick to string keys, you should be fine.
An example:
class MyModel(models.Model):
    json = JSONField(default=dict)

p = MyModel.objects.create(json={'0': 'something', 'a': 'something else'})
MyModel.objects.filter(json__0='something')  # returns empty queryset
MyModel.objects.filter(json__a='something else')  # returns the object created above
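The behaviour described above can be modelled in a few lines of plain Python. The function lookup is a hypothetical illustration, not Django's implementation: it assumes that a key made only of digits is treated as a list position, which is why it finds nothing in a dict whose key is the string '0'.

```python
def lookup(data, key):
    """Hypothetical model: digit-only keys are treated as list indexes."""
    if key.isdigit():
        position = int(key)
        if isinstance(data, list) and position < len(data):
            return data[position]
        return None  # a dict has no element "at position 0"
    return data.get(key) if isinstance(data, dict) else None


document = {"0": "something", "a": "something else"}

by_numeric_key = lookup(document, "0")
by_string_key = lookup(document, "a")
by_list_index = lookup(["x", "y"], "1")
```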

Querying objects using attribute of member of many-to-many

I have the following models:
class Member(models.Model):
    ref = models.CharField(max_length=200)
    # some other stuff

    def __str__(self):
        return self.ref

class Feature(models.Model):
    feature_id = models.BigIntegerField(default=0)
    members = models.ManyToManyField(Member)
    # some other stuff
A Member is basically just a pointer to a Feature. So let's say I have Features:
feature_id = 2, members = 1, 2
feature_id = 4
feature_id = 3
Then the members would be:
id = 1, ref = 4
id = 2, ref = 3
I want to find all of the Features which contain one or more Members from a list of "ok members." Currently my query looks like this:
# ndtmp is a query set of member-less Features which Members can point to
sids = [str(i) for i in list(ndtmp.values('feature_id'))]
# now make a query set that contains all rels and ways with at least one member with an id in sids
okmems = Member.objects.filter(ref__in=sids)
relsways = Feature.geoobjects.filter(members__in=okmems)
# now combine with nodes
op = relsways | ndtmp
This is enormously slow, and I'm not even sure if it's working. I've tried using print statements to debug, just to make sure anything is actually being parsed, and I get the following:
print(ndtmp.count())
>>> 12747
print(len(sids))
>>> 12747
print(okmems.count())
... and then the code just hangs for minutes, and eventually I quit it. I think that I just overcomplicated the query, but I'm not sure how best to simplify it. Should I:
Migrate Feature to use a CharField instead of a BigIntegerField? There is no real reason for me to use a BigIntegerField, I just did so because I was following a tutorial when I began this project. I tried a simple migration by just changing it in models.py and I got a "numeric" value in the column in PostgreSQL with format 'Decimal:( the id )', but there's probably some way around that that would force it to just shove the id into a string.
Use some feature of many-to-many fields which I don't know about to check for matches more efficiently?
Calculate the bounding box of each Feature and store it in another column so that I don't have to do this calculation every time I query the database (so just the single fixed cost of calculation upon Migration + the cost of calculating whenever I add a new Feature or modify an existing one)?
Or something else? In case it helps, this is for a server-side script for an ongoing OpenStreetMap related project of mine, and you can see the work in progress here.
EDIT - I think a much faster way to get ndids is like this:
ndids = ndtmp.values_list('feature_id', flat=True)
This works, producing a non-empty set of ids.
Unfortunately, I am still at a loss as to how to get okmems. I tried:
okmems = Member.objects.filter(ref__in=str(ndids))
But it returns an empty query set. And I can confirm that the ref points are correct, via the following test:
Member.objects.values('ref')[:1]
>>> [{'ref': '2286047272'}]
Feature.objects.filter(feature_id='2286047272').values('feature_id')[:1]
>>> [{'feature_id': '2286047272'}]
You should take a look at annotate:
okmems = Member.objects.annotate(
    feat_count=models.Count('feature')).filter(feat_count__gte=1)
relsways = Feature.geoobjects.filter(members__in=okmems)
Ultimately, I was wrong to set up the database using a numeric id in one table and a text-type id in the other. I am not very familiar with migrations yet, but at some point I'll have to take a deep dive into that world and figure out how to migrate my database to use numerics on both. For now, this works:
# ndtmp is a query set of member-less Features which Members can point to
# get the unique ids from ndtmp as strings
strids = ndtmp.extra({'feature_id_str':"CAST( \
feature_id AS VARCHAR)"}).order_by( \
'-feature_id_str').values_list('feature_id_str',flat=True).distinct()
# find all members whose ref values can be found in strids
okmems = Member.objects.filter(ref__in=strids)
# find all features containing one or more members in the accepted members list
relsways = Feature.geoobjects.filter(members__in=okmems)
# combine that with my existing list of allowed member-less features
op = relsways | ndtmp
# prove that this set is not empty
op.count()
# takes about 10 seconds
>>> 8997148 # looks like it worked!
Basically, I am making a query set of feature_ids (numerics) and casting it to be a query set of text-type (varchar) field values. I am then using values_list to make it only contain these string id values, and then I am finding all of the members whose ref ids are in that list of allowed Features. Now I know which members are allowed, so I can filter out all the Features which contain one or more members in that allowed list. Finally, I combine this query set of allowed Features which contain members with ndtmp, my original query set of allowed Features which do not contain members.
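The same chain of steps can be followed on the small example from the question using plain Python sets. This is only a sketch of the logic, not ORM code; the dictionaries features and members are hypothetical in-memory stand-ins for the two tables.

```python
# feature_id -> ids of its members (features 4 and 3 have no members)
features = {2: {1, 2}, 4: set(), 3: set()}
# member id -> ref, stored as text like the CharField in the model
members = {1: "4", 2: "3"}

# Member-less features that members are allowed to point to.
ndtmp = {fid for fid, member_ids in features.items() if not member_ids}

# Cast the numeric ids to strings so they can be compared with ref.
strids = {str(fid) for fid in ndtmp}

# Members whose ref is one of the allowed ids.
okmems = {mid for mid, ref in members.items() if ref in strids}

# Features containing at least one allowed member.
relsways = {fid for fid, member_ids in features.items() if member_ids & okmems}

# Combine with the member-less features.
op = relsways | ndtmp
```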

django field comparing values to record

I have a field that records a time, but I need to overwrite this value. My app works like a stopwatch (chronometer), and I need to keep the best time. How can I do that? For example, I have 5 laps with these results:
1:0:0
2:0:0
1:5:0
3:0:0
0:5:0
The database should end up holding the last one, 0:5:0, since it is the best. I have only 1 field to do that.
I would convert the time strings into seconds and store that value in the DB.
Alternatively, you can use min() with a custom key function:
times = ["1:0:0",
         "2:0:0",
         "1:5:0",
         "3:0:0",
         "0:5:0"]
# Compare the parts as numbers; int() already accepts leading zeros
print(min(times, key=lambda x: [int(i) for i in x.split(':')]))
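The first suggestion, storing seconds, could look roughly like the sketch below. It assumes the strings are hours:minutes:seconds, which the question does not state, and the names to_seconds and updated_best are hypothetical. The single stored value is replaced only when a new lap is faster.

```python
def to_seconds(lap):
    """Convert 'h:m:s' (assumed format) to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in lap.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def updated_best(stored_best, new_lap):
    """Return the value the single field should hold after a new lap."""
    new_value = to_seconds(new_lap)
    if stored_best is None or new_value < stored_best:
        return new_value
    return stored_best


laps = ["1:0:0", "2:0:0", "1:5:0", "3:0:0", "0:5:0"]

best = None  # nothing stored yet
for lap in laps:
    best = updated_best(best, lap)
```

In a Django model the same comparison would run before saving, with best read from and written back to the one field.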

Custom Sorting on Custom Field in Django

In my app, I have defined a custom field to represent a physical quantity using the quantities package.
class AmountField(models.CharField):
    def __init__(self, *args, **kwargs):
        ...

    def to_python(self, value):
        create_quantities_value(value)
Essentially the way it works is it extends CharField to store the value as a string in the database
"12 min"
and represents it as a quantities object when using the field in a model
array(12) * min
Then in a model it is used as such:
class MyModel(models.Model):
    group = models.CharField()
    amount = AmountField()

    class Meta:
        ordering = ['group', 'amount']
My issue is that these fields do not seem to sort by the quantity, but instead by the string.
So if I have some objects that contain something like
{"group":"A", "amount":"12 min"}
{"group":"A", "amount":"20 min"}
{"group":"A", "amount":"2 min"}
{"group":"B", "amount":"20 min"}
{"group":"B", "amount":"1 hr"}
they end up sorted something like this:
>>> MyModel.objects.all()
[{A, 12 min}, {A, 2 min}, {A, 20 min}, {B, 1 hr}, {B, 20 min}]
essentially alphabetical order.
Can I give my custom AmountField a comparison function so that it will compare by the python value instead of the DB value?
I don't think there is an efficient way to sort these strings as numbers. I believe Django can sort by a computed value, but then every key has to be computed before sorting. So try to find a way to store numbers as numbers in the database. Maybe store the quantity as an integer and add a method or property that converts it to a quantity object? Something like this:
class MyModel(models.Model):
    group = models.CharField()
    amount = models.IntegerField()  # or PositiveIntegerField

    class Meta:
        ordering = ['group', 'amount']

    def amountAsQuantityObject(self):
        return create_quantities_value(self.amount)

    # or as a property
    quantity_object = property(amountAsQuantityObject)
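To show what storing a plain number buys, here is a small plain-Python sketch that converts the amounts from the question to integer minutes and sorts on that. The unit table UNIT_MINUTES and the helper to_minutes are assumptions made for the example; a real project would probably let the quantities package do the conversion.

```python
# Hypothetical conversion table: how many minutes each unit is worth.
UNIT_MINUTES = {"min": 1, "hr": 60}


def to_minutes(amount):
    """Turn a string like '12 min' or '1 hr' into an integer number of minutes."""
    number, unit = amount.split()
    return int(number) * UNIT_MINUTES[unit]


rows = [
    {"group": "A", "amount": "12 min"},
    {"group": "A", "amount": "20 min"},
    {"group": "A", "amount": "2 min"},
    {"group": "B", "amount": "20 min"},
    {"group": "B", "amount": "1 hr"},
]

# Sort by group first, then by the numeric amount instead of the string.
ordered = sorted(rows, key=lambda row: (row["group"], to_minutes(row["amount"])))
ordered_pairs = [(row["group"], row["amount"]) for row in ordered]
```

If the integer is what gets stored in the column, the database can apply the same ordering itself.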
Could you store it with leading zeros in the database?
E.g. "02 min"
That way it would sort correctly and when you parse it out again the leading zeros should not matter.
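A quick plain-Python check of the padding idea, using amounts from the question. The helper pad is hypothetical. The sketch also shows a limitation: padding fixes the order within one unit, but not across different units.

```python
def pad(amount, width=2):
    """Hypothetical helper: left-pad the numeric part with zeros."""
    number, unit = amount.split()
    return f"{number.zfill(width)} {unit}"


amounts = ["12 min", "2 min", "20 min"]

# Plain string sorting reproduces the alphabetical order from the question.
plain = sorted(amounts)

# With padding, string order matches numeric order for a single unit.
padded = sorted(pad(amount) for amount in amounts)

# Mixed units still sort by the digits alone, so 1 hr lands before 20 min.
mixed = sorted(pad(amount) for amount in ["20 min", "1 hr"])
```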
I'm going to go with a totally different approach.
Instead of having a custom field to store what is really time units, why not just store the time units?
For example, 02 minutes should just be stored as 2. You can store it as an integer or, I think, use a duration-type field: as far as I know Django's DurationField is the one meant for lengths of time, whereas TimeField stores a time of day.
You obviously want it to display so a human being can correctly understand that 02 really means 2 mins, so just write a custom filter in your form to deal with the mins or hrs or whatever you might want to add to the end.
This would have other benefits. Let's say you wanted to add all those time units as you had it previously. To do so would require some string processing, and to change that portion of the string to something that has add or sub or the like methods.
