Custom Sorting on Custom Field in Django - python

In my app, I have defined a custom field to represent a physical quantity using the quantities package.
class AmountField(models.CharField):
    def __init__(self, *args, **kwargs):
        ...
    def to_python(self, value):
        # convert the stored string into a quantities object
        return create_quantities_value(value)
Essentially the way it works is it extends CharField to store the value as a string in the database
"12 min"
and represents it as a quantities object when using the field in a model
array(12) * min
Then in a model it is used as such:
class MyModel(models.Model):
    group = models.CharField()
    amount = AmountField()
    class Meta:
        ordering = ['group', 'amount']
My issue is that these fields do not seem to sort by the quantity, but instead by the string.
So if I have some objects that contain something like
{"group":"A", "amount":"12 min"}
{"group":"A", "amount":"20 min"}
{"group":"A", "amount":"2 min"}
{"group":"B", "amount":"20 min"}
{"group":"B", "amount":"1 hr"}
they end up sorted something like this:
>>> MyModel.objects.all()
[{A, 12 min}, {A, 2 min}, {A, 20 min}, {B, 1 hr}, {B, 20 min}]
essentially alphabetical order.
Can I give my custom AmountField a comparison function so that it will compare by the python value instead of the DB value?

I don't think there is an efficient way to sort those strings as numbers. Django can order by a computed value, but then every key has to be computed first. So try to store numbers as numbers in the database: for example, store the quantity as an integer and add a method or property that converts it to a quantity object. Something like this:
class MyModel(models.Model):
    group = models.CharField()
    amount = models.IntegerField()  # or PositiveIntegerField
    class Meta:
        ordering = ['group', 'amount']
    def amountAsQuantityObject(self):
        return create_quantities_value(self.amount)
    # or as a property
    quantity_object = property(amountAsQuantityObject)
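To see why the column type matters, here is a small sketch using the standard-library sqlite3 module instead of Django. The table name, the column names, and the choice of minutes as the stored unit are assumptions made for the illustration; the rows are the ones from the question.

```python
import sqlite3

# In-memory table holding each amount twice: as display text and as an
# integer number of minutes (the unit is an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mymodel (grp TEXT, amount_text TEXT, amount_minutes INTEGER)"
)
conn.executemany(
    "INSERT INTO mymodel VALUES (?, ?, ?)",
    [
        ("A", "12 min", 12),
        ("A", "20 min", 20),
        ("A", "2 min", 2),
        ("B", "20 min", 20),
        ("B", "1 hr", 60),
    ],
)


def ordered_by(column):
    """Return the display strings ordered by group, then by the given column."""
    # The column name comes from a fixed whitelist, never from user input.
    if column not in ("amount_text", "amount_minutes"):
        raise ValueError(column)
    rows = conn.execute(f"SELECT amount_text FROM mymodel ORDER BY grp, {column}")
    return [row[0] for row in rows]


by_text = ordered_by("amount_text")       # database compares strings
by_number = ordered_by("amount_minutes")  # database compares integers

print(by_text)
print(by_number)
```

Ordering by the text column reproduces the alphabetical order from the question, while ordering by the integer column gives 2, 12, 20 minutes within group A and puts 1 hr after 20 min in group B.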

Could you store it with leading zeros in the database?
E.g. "02 min"
That way it would sort correctly and when you parse it out again the leading zeros should not matter.
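A quick plain-Python check of the leading-zeros idea. The helper name pad_amount and the width of 4 are made up for this sketch. It also shows a limitation: padding only helps while every value uses the same unit.

```python
def pad_amount(text, width=4):
    """Zero-pad the numeric part of a string such as '2 min' to a fixed width."""
    number, unit = text.split(" ", 1)
    return f"{int(number):0{width}d} {unit}"


# Same unit: string order now matches numeric order.
padded = sorted(pad_amount(a) for a in ["12 min", "20 min", "2 min"])

# Mixed units: '0001 hr' still sorts before '0020 min', although 1 hr is longer.
mixed = sorted(pad_amount(a) for a in ["20 min", "1 hr"])

print(padded)
print(mixed)
```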

I'm going to go with a totally different approach.
Instead of having a custom field to store what is really time units, why not just store the time units?
For example, 02 minutes should just be stored as 2. You can store it as an integer or, I think, use one of Django's time-related fields (TimeField stores a time of day; DurationField stores a length of time).
You obviously want it displayed so that a human being understands that 2 really means 2 mins, so just write a custom filter for your form that appends mins, hrs, or whatever unit you need.
This has other benefits. Say you wanted to add up all those time values as you had them stored previously: that would require string processing to turn part of each string into something that supports addition, subtraction, and the like.
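As a rough illustration of that point, assume the amounts are stored as integer minutes. The helper format_minutes below is hypothetical and only handles display; the arithmetic needs no string parsing at all.

```python
from datetime import timedelta


def format_minutes(minutes):
    """Render an integer number of minutes for display, e.g. 2 -> '2 min'."""
    hours, rest = divmod(minutes, 60)
    if hours and rest:
        return f"{hours} hr {rest} min"
    if hours:
        return f"{hours} hr"
    return f"{rest} min"


stored = [12, 20, 2, 60]  # values as they would sit in an integer column
total = sum(stored)       # plain arithmetic on the stored numbers
as_timedelta = timedelta(minutes=total)

print(format_minutes(total))
print(sorted(stored))
```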

Related

Filter Django queryset by related field (with code-value)

Simplifying my model a lot, I have the following:
class Player(models.Model):
    name = models.CharField(max_length=50)
    number = models.IntegerField()
class Statistic(models.Model):
    '''
    Known codes are:
    - goals
    - assists
    - red_cards
    '''
    # Implicit ID
    player = models.ForeignKey(
        'Player', on_delete=models.CASCADE, related_name='statistics')
    code = models.CharField(max_length=50)
    value = models.CharField(max_length=50, null=True)
I'm using a code-value strategy to add different statistics in the future, without the need of adding new fields to the model.
Now, I want to filter the players based on some statistics, for example, players who scored between 10 and 15 goals.
I'm trying something like this:
.filter('statistics__code'='goals').filter('statistics__value__range'=[10,15])
but I'm getting duplicated players, I'm guessing because that value__range could refer to any Statistic.
How could I properly filter the queryset or avoid those duplicates?
And how could I filter by more than one statistic, for example, players who scored between 10 and 15 goals and have more than 5 assists?
By the way, note that the value field (in Statistic) is a string, and it will need to be treated as an integer in some scenarios (when using __range, for example).
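You don't need to chain the filters. When a filter spans a multi-valued relationship, each chained filter() call may match a different related Statistic row, whereas conditions passed to a single filter() call must all hold for the same row. So call filter() once and add distinct():
.filter(statistics__code='goals', statistics__value__range=[10,15]).distinct()
NOTE: the code in the question puts quotes around statistics__code and statistics__value__range. Keyword argument names must not be quoted.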
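The difference between one filter() call and chained calls comes down to how many times the statistics table is joined. The sketch below uses sqlite3 with hand-written SQL that only approximates what an ORM would generate. Table names, player names, and values are invented. Because value is stored as text, the comparison casts it to an integer first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE statistic (
    id INTEGER PRIMARY KEY, player_id INTEGER, code TEXT, value TEXT
);
INSERT INTO player VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO statistic (player_id, code, value) VALUES
    (1, 'goals', '12'), (1, 'assists', '7'),
    (2, 'goals', '3'),  (2, 'assists', '11');
""")


def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


# One join: both conditions must hold for the same statistic row.
same_row = names("""
    SELECT DISTINCT p.name FROM player p
    JOIN statistic s ON s.player_id = p.id
    WHERE s.code = 'goals' AND CAST(s.value AS INTEGER) BETWEEN 10 AND 15
    ORDER BY p.name
""")

# Two joins: the code and the value may come from different rows, so Bob
# matches through his 'goals' row plus his assists value of 11.
separate_rows = names("""
    SELECT DISTINCT p.name FROM player p
    JOIN statistic s1 ON s1.player_id = p.id
    JOIN statistic s2 ON s2.player_id = p.id
    WHERE s1.code = 'goals' AND CAST(s2.value AS INTEGER) BETWEEN 10 AND 15
    ORDER BY p.name
""")

# Two statistics at once: one join per statistic, each with its own code.
two_stats = names("""
    SELECT DISTINCT p.name FROM player p
    JOIN statistic g ON g.player_id = p.id
    JOIN statistic a ON a.player_id = p.id
    WHERE g.code = 'goals' AND CAST(g.value AS INTEGER) BETWEEN 10 AND 15
      AND a.code = 'assists' AND CAST(a.value AS INTEGER) > 5
    ORDER BY p.name
""")

print(same_row, separate_rows, two_stats)
```

The last query is the shape needed for "10 to 15 goals and more than 5 assists": each statistic gets its own join, and the code and value conditions for that statistic stay together.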

Django How to filter and sort queryset by related model

I have this model relationship:
class Account:
    < ... fields ... >
class Balance(models.Model):
    name = models.CharField(...)
    count = models.FloatField(...)
    account = models.ForeignKey(Account, related_name='balance')
Let's say we have some number of accounts. I need to filter these accounts by balance__name and sort by balance__count. I need sorted accounts, not a list of balances.
How do I do that? I can't even think of a solution that uses iteration.
You can implement a queryset like:
Account.objects.filter(
    balance__name='my_balance_name'
).order_by('balance__count')
Note that here an account can occur multiple times if there are multiple Balances that have the given name.
If you want to sort in descending order (so from larger counts to smaller counts), then you should add a minus (-):
Account.objects.filter(
    balance__name='my_balance_name'
).order_by('-balance__count')

Conditionally change field value for lookup in Django ORM

I'm trying to conditionally change a field value during a lookup. I have a specific order in mind and I don't want to overwrite the field value, just sort it my way. Say I have a class Product, and every object has a product_code field. I want a less-than-or-equal lookup, but it's not trivial: most of the time product_code looks like A01, B02 and so on, where Django's lte lookup would work. But now I also have codes like 0001C01, which I would like to be the biggest value. So during the lookup I would like to add 0000 at the beginning of every string that does not have this prefix, so the values would look like 0001C01, 0000B02, 0000A01.
You can conditionally annotate your queryset to get a new field that holds the desired value, and then use this field in your filter or order_by clause. For example, you could do the following:
from django.db.models import CharField, Value as V, F, Q, Case, When
from django.db.models.functions import Concat
Product.objects.annotate(
    new_product_code=Case(
        When(
            product_code__iregex=r'^[A-Z]+.*',  # If it starts with letters
            then=Concat(V('0000'), 'product_code', output_field=CharField()),  # Then prepend four 0's
        ),
        default=F('product_code'),  # Else, the original value
    )
).filter(new_product_code__lte='whatever you like')  # Now filter by using your new value
Relevant parts of the documentation are conditional expressions, database functions and QuerySet API reference
This sounds fairly straightforward. Fetch the desired Product objects, and for each one, prepend 0000 to product_code if it doesn't start with that string.
products = Product.objects.filter(some_query_expression)
for product in products:
    if not product.product_code.startswith('0000'):
        product.product_code = '0000' + product.product_code
It's not clear if you want to save this value back to the database, or just use it for temporary comparisons. If you do want to save it, call product.save().
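If the value is only needed for temporary comparisons, a key function avoids touching product_code at all. This is a plain-Python sketch, not ORM code. The helper name comparison_code is hypothetical, and it follows the rule from the question: codes that start with a letter get 0000 prepended.

```python
def comparison_code(product_code):
    """Return the code used for comparisons; the stored value is untouched."""
    # Assumption from the question: codes starting with a letter get the
    # '0000' prefix, codes that already start with digits are kept as-is.
    if product_code[:1].isalpha():
        return "0000" + product_code
    return product_code


codes = ["B02", "0001C01", "A01"]

# Sort by the padded form, so 0001C01 ends up as the biggest value.
ordered = sorted(codes, key=comparison_code)

# "Less than or equal to B02", evaluated on the padded form.
at_most_b02 = [c for c in codes if comparison_code(c) <= comparison_code("B02")]

print(ordered)
print(at_most_b02)
```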

Filter on calculated representation of some fields

I have three numeric columns in my model that together create a string, which is presented to the user:
class Location(models.Model):
    aisle = models.PositiveIntegerField()
    rack = models.PositiveIntegerField()
    plank = models.PositiveIntegerField()
    def __unicode__(self):
        return "{a:02}{r:02}{p:02}".format(a=self.aisle, r=self.rack, p=self.plank)
Now I want to filter on (part of) that string, so say I have three locations, 010101, 010102, 010201, and I filter on 0101, I want to select only the first two.
How would I do that? I looked into Q objects and the available database functions, but I couldn't find a solution.
After a lot of experimenting, I managed to do it using a Func:
from django.db.models import CharField, Func
class LocationLabel(Func):
    # builds CONCAT(RIGHT(CONCAT('00', aisle), 2), RIGHT(CONCAT('00', rack), 2), ...)
    function = 'CONCAT'
    template = '%(function)s(RIGHT(CONCAT(\'00\',%(expressions)s),2))'
    arg_joiner = '),2), RIGHT(CONCAT(\'00\','
models.Location.objects.annotate(
    locationlabel=LocationLabel('aisle', 'rack', 'plank', output_field=CharField())
).filter(locationlabel__icontains=query)
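To check what the annotated label should look like, the same formatting can be reproduced in plain Python using the format string from the question. The helper name location_label is made up. The sketch also shows that a "contains" match and a "starts with" match can differ for other labels, which matters if the query is meant as a prefix.

```python
def location_label(aisle, rack, plank):
    """Build the six-digit label the same way the model's string method does."""
    return "{a:02}{r:02}{p:02}".format(a=aisle, r=rack, p=plank)


locations = [(1, 1, 1), (1, 1, 2), (1, 2, 1)]
labels = [location_label(*loc) for loc in locations]

query = "0101"
contains = [label for label in labels if query in label]
starts = [label for label in labels if label.startswith(query)]

# A label can contain the query without starting with it.
other = location_label(10, 1, 1)

print(labels, contains, starts, other)
```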
You cannot perform a filter on a property, it has to be on fields.
In this case I think this will do what you require, because the __unicode__ string is just a formatted version of the actual integer values in the fields:
Location.objects.filter(aisle=1, rack=1)

Querying objects using attribute of member of many-to-many

I have the following models:
class Member(models.Model):
    ref = models.CharField(max_length=200)
    # some other stuff
    def __str__(self):
        return self.ref
class Feature(models.Model):
    feature_id = models.BigIntegerField(default=0)
    members = models.ManyToManyField(Member)
    # some other stuff
A Member is basically just a pointer to a Feature. So let's say I have Features:
feature_id = 2, members = 1, 2
feature_id = 4
feature_id = 3
Then the members would be:
id = 1, ref = 4
id = 2, ref = 3
I want to find all of the Features which contain one or more Members from a list of "ok members." Currently my query looks like this:
# ndtmp is a query set of member-less Features which Members can point to
sids = [str(i) for i in list(ndtmp.values('feature_id'))]
# now make a query set that contains all rels and ways with at least one member with an id in sids
okmems = Member.objects.filter(ref__in=sids)
relsways = Feature.geoobjects.filter(members__in=okmems)
# now combine with nodes
op = relsways | ndtmp
This is enormously slow, and I'm not even sure if it's working. I've tried using print statements to debug, just to make sure anything is actually being parsed, and I get the following:
print(ndtmp.count())
>>> 12747
print(len(sids))
>>> 12747
print(okmems.count())
... and then the code just hangs for minutes, and eventually I quit it. I think that I just overcomplicated the query, but I'm not sure how best to simplify it. Should I:
Migrate Feature to use a CharField instead of a BigIntegerField? There is no real reason for me to use a BigIntegerField, I just did so because I was following a tutorial when I began this project. I tried a simple migration by just changing it in models.py and I got a "numeric" value in the column in PostgreSQL with format 'Decimal:( the id )', but there's probably some way around that that would force it to just shove the id into a string.
Use some feature of Many-To-Many Fields which I don't know about to more efficiently check for matches
Calculate the bounding box of each Feature and store it in another column so that I don't have to do this calculation every time I query the database (so just the single fixed cost of calculation upon Migration + the cost of calculating whenever I add a new Feature or modify an existing one)?
Or something else? In case it helps, this is for a server-side script for an ongoing OpenStreetMap related project of mine, and you can see the work in progress here.
EDIT - I think a much faster way to get ndids is like this:
ndids = ndtmp.values_list('feature_id', flat=True)
This works, producing a non-empty set of ids.
Unfortunately, I am still at a loss as to how to get okmems. I tried:
okmems = Member.objects.filter(ref__in=str(ndids))
But it returns an empty query set. And I can confirm that the ref points are correct, via the following test:
Member.objects.values('ref')[:1]
>>> [{'ref': '2286047272'}]
Feature.objects.filter(feature_id='2286047272').values('feature_id')[:1]
>>> [{'feature_id': '2286047272'}]
You should take a look at annotate:
okmems = Member.objects.annotate(
    feat_count=models.Count('feature')).filter(feat_count__gte=1)
relsways = Feature.geoobjects.filter(members__in=okmems)
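On the two string conversions tried in the question, str(i) over values('feature_id') and str(ndids), a plain-Python look at what str() produces may explain the slow and empty results. The lists below are stand-ins for the querysets, and the second id is invented. A real queryset's str() is a representation string as well, so the same reasoning should apply.

```python
# Stand-ins for the querysets in the question (second id is invented).
feature_ids = [2286047272, 2286047273]            # like values_list(..., flat=True)
rows = [{"feature_id": i} for i in feature_ids]   # like values('feature_id')

# str() of each row gives the text of a whole dict, not the id.
stringified_rows = [str(r) for r in rows]

# str() of the whole collection is ONE string; iterating it yields characters.
as_one_string = str(feature_ids)
chars = sorted(set(as_one_string))

# Converting item by item gives the list of id strings that ref should match.
per_item = [str(i) for i in feature_ids]

print(stringified_rows[0])
print(as_one_string)
print(per_item)
```

Only the last form contains values that could equal a Member's ref.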
Ultimately, I was wrong to set up the database using a numeric id in one table and a text-type id in the other. I am not very familiar with migrations yet, but at some point I'll have to take a deep dive into that world and figure out how to migrate my database to use numerics in both. For now, this works:
# ndtmp is a query set of member-less Features which Members can point to
# get the unique ids from ndtmp as strings
strids = ndtmp.extra({'feature_id_str': "CAST(feature_id AS VARCHAR)"}).order_by(
    '-feature_id_str').values_list('feature_id_str', flat=True).distinct()
# find all members whose ref values can be found in strids
okmems = Member.objects.filter(ref__in=strids)
# find all features containing one or more members in the accepted members list
relsways = Feature.geoobjects.filter(members__in=okmems)
# combine that with my existing list of allowed member-less features
op = relsways | ndtmp
# prove that this set is not empty
op.count()
# takes about 10 seconds
>>> 8997148 # looks like it worked!
Basically, I am making a query set of feature_ids (numerics) and casting it to be a query set of text-type (varchar) field values. I am then using values_list to make it only contain these string id values, and then I am finding all of the members whose ref ids are in that list of allowed Features. Now I know which members are allowed, so I can filter out all the Features which contain one or more members in that allowed list. Finally, I combine this query set of allowed Features which contain members with ndtmp, my original query set of allowed Features which do not contain members.
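The same steps can be sketched without the ORM, using sqlite3 and hand-written SQL. All table names, column names, and rows below are invented for the illustration, and SQLite spells the cast as CAST(... AS TEXT) where the code above uses VARCHAR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feature (id INTEGER PRIMARY KEY, feature_id INTEGER);
CREATE TABLE member (id INTEGER PRIMARY KEY, ref TEXT);
CREATE TABLE feature_members (feature_pk INTEGER, member_pk INTEGER);
INSERT INTO feature (id, feature_id) VALUES (1, 2), (2, 4), (3, 3), (4, 9);
INSERT INTO member (id, ref) VALUES (1, '4'), (2, '3'), (3, '99');
-- feature 2 has members pointing at features 4 and 3;
-- feature 9 has one member pointing at a feature that does not exist.
INSERT INTO feature_members VALUES (1, 1), (1, 2), (4, 3);
""")

# Member-less features play the role of ndtmp.
memberless = [row[0] for row in conn.execute("""
    SELECT feature_id FROM feature
    WHERE id NOT IN (SELECT feature_pk FROM feature_members)
    ORDER BY feature_id
""")]

# Features with at least one member whose text ref equals the id of a
# member-less feature, after casting that numeric id to text.
with_ok_members = [row[0] for row in conn.execute("""
    SELECT DISTINCT f.feature_id
    FROM feature f
    JOIN feature_members fm ON fm.feature_pk = f.id
    JOIN member m ON m.id = fm.member_pk
    WHERE m.ref IN (
        SELECT CAST(n.feature_id AS TEXT) FROM feature n
        WHERE n.id NOT IN (SELECT feature_pk FROM feature_members)
    )
    ORDER BY f.feature_id
""")]

# Union of both sets, like `relsways | ndtmp`.
combined = sorted(set(memberless) | set(with_ok_members))

print(memberless, with_ok_members, combined)
```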
