How to keep a property value in django.values()? - python

I have my model as:
class Subs(models.Model):
    ...
    created_at = models.DateTimeField(auto_now_add=True, db_column="order_date", null=True, blank=True)
    @property
    def created_date(self):
        return self.created_at.strftime('%B %d %Y')
I want to get created_date in my views.py:
data = Subs.objects.values('somevalues','created_date')
It throws an error. How can I access created_date so that I can use it here?

Although your approach works, it's not best practice performance-wise. Iterating over the whole of Model.objects.all() is generally a bad idea because it loads every row into memory.
In such cases you have several options:
If you just need some simple Python logic on your data (like the formatting here), do it in the presentation layer (e.g. template filters).
If you need to apply heavy business logic, it's better to run it at create/update time (e.g. by overriding .save()), or in a cron job at off-peak time, and store the result in an extra column in the DB.
If your manipulation needs a DB-level query and depends on several columns or tables, use .annotate() to add it to your queryset.
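As a minimal sketch of the first option (an illustration, not something from the original posts): .values() can only return database columns, so you can ask for created_at and do the formatting in plain Python afterwards. The helper name with_created_date is made up here, and the rows are assumed to be dicts shaped like the ones .values('somevalues', 'created_at') yields.

```python
from datetime import datetime


def with_created_date(rows, fmt='%B %d %Y'):
    """Return copies of `rows` with a formatted 'created_date' key added.

    `rows` is assumed to be an iterable of dicts such as those produced by
    Subs.objects.values('somevalues', 'created_at').
    """
    result = []
    for row in rows:
        created_at = row.get('created_at')
        new_row = dict(row)  # copy, so the input rows are left untouched
        # created_at is nullable in the model, so guard against None
        new_row['created_date'] = created_at.strftime(fmt) if created_at else None
        result.append(new_row)
    return result


# In a view this would be something like (assumption, not run here):
#   data = with_created_date(Subs.objects.values('somevalues', 'created_at'))
sample = [{'somevalues': 1, 'created_at': datetime(2020, 1, 5, 10, 30)}]
formatted = with_created_date(sample)
```

This keeps the query restricted to the columns you need instead of loading full model instances.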

As values() did not work, I used a for loop.
Instead of doing this:
data = Subs.objects.values('somevalues','created_date')
I did this:
newarr = [{'created_date': i.created_date} for i in Subs.objects.all()]

Related

use __gte for string date django

The date is stored as a string in the database:
class A(models.Model):
    date = models.CharField(max_length=50, blank=True, null=True)
Yes, I know it should be a date type,
but we already have many records, and we will change it to a date type soon.
For now, I want to get the objects whose date is greater than a particular date.
So how can I use date__gte?
Example:
objs = A.objects.filter(date__gte=datetime.now())
Is there any way to achieve this without converting date to a datetime field?
I am not sure if that works. What you can try is a custom manager: create a manager for your model, add a method like date_gte, and convert the string to a datetime inside it. Then you can use that method much like the usual operators. That's a quick fix for now, but the best solution is to use a DateTimeField, which you want to do anyway as far as I understand.
Example Manager:
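from datetime import datetime
from django.db import models
class MyManager(models.Manager):
    def date_gte(self, date=None, fmt='%Y-%m-%d'):
        # fmt is an assumption: set it to the format your strings are stored in
        if date is None:
            date = datetime.now()  # evaluated per call, not once at import time
        items = []
        for obj in self.all():
            # skip empty strings/NULLs, parse the rest and keep dates >= date
            if obj.date and datetime.strptime(obj.date, fmt) >= date:
                items.append(obj)
        return items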
Then you could call it like MyModel.objects.date_gte(date=datetime.now()).
Note: This is expensive, because every row is loaded and checked in Python, and it returns a plain list, so you may need to convert it into a QuerySet object. I haven't tested it, so this example should only help you get started.
There is no way to do this without a conversion (either in Django or the database during the query) to a proper DateTime type. You're trying to compare a datetime.datetime to a str. That won't work in normal Python and it won't work here.
What kind of string? If it is formatted as %Y%m%d, you can use the .extra() method to do the query, passing today's date in the same format:
A.objects.extra(where=['date >= %s'], params=[datetime.now().strftime('%Y%m%d')])
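To see why the format matters, here is a small sketch (the sample values are invented): with zero-padded %Y%m%d strings, plain string comparison orders values the same way as date comparison, which is what lets the database compare the column without a conversion. A day-first format such as %d/%m/%Y does not have this property.

```python
from datetime import datetime

stamps = ['20240105', '20231231', '20240220']  # made-up sample values

# Sorting the raw strings gives the same order as sorting the parsed dates
by_string = sorted(stamps)
by_date = sorted(stamps, key=lambda s: datetime.strptime(s, '%Y%m%d'))

# ">= cutoff" on the strings therefore matches ">= cutoff" on the dates
cutoff = '20240101'
on_or_after = [s for s in stamps if s >= cutoff]

# Counter-example: day-first strings compare by day, not chronologically
day_first_is_misleading = '31/12/2023' > '05/01/2024'
```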

Update multiple objects at once in Django?

I am using Django 1.9. I have a Django table that represents the value of a particular measure, by organisation by month, with raw values and percentiles:
class MeasureValue(models.Model):
    org = models.ForeignKey(Org, null=True, blank=True)
    month = models.DateField()
    calc_value = models.FloatField(null=True, blank=True)
    percentile = models.FloatField(null=True, blank=True)
There are typically 10,000 or so per month. My question is about whether I can speed up the process of setting values on the models.
Currently, I calculate percentiles by retrieving all the measurevalues for a month using a Django filter query, converting it to a pandas dataframe, and then using scipy's rankdata to set ranks and percentiles. I do this because pandas and rankdata are efficient, able to ignore null values, and able to handle repeated values in the way that I want, so I'm happy with this method:
records = MeasureValue.objects.filter(month=month).values()
df = pd.DataFrame.from_records(records)
# use calc_value to set percentile on each row, using scipy's rankdata
However, I then need to retrieve each percentile value from the dataframe, and set it back onto the model instances. Right now I do this by iterating over the dataframe's rows, and updating each instance:
for i, row in df.iterrows():
    mv = MeasureValue.objects.get(org=row.org, month=month)
    if (row.percentile is None) or np.isnan(row.percentile):
        row.percentile = None
    mv.percentile = row.percentile
    mv.save()
This is unsurprisingly quite slow. Is there any efficient Django way to speed it up, by making a single database write rather than tens of thousands? I have checked the documentation, but can't see one.
Atomic transactions can reduce the time spent in the loop:
from django.db import transaction
with transaction.atomic():
    for i, row in df.iterrows():
        mv = MeasureValue.objects.get(org=row.org, month=month)
        if (row.percentile is None) or np.isnan(row.percentile):
            # if it's already None, why set it to None?
            row.percentile = None
        mv.percentile = row.percentile
        mv.save()
Django's default behavior is to run in autocommit mode. Each query is immediately committed to the database, unless a transaction is active.
By using with transaction.atomic(), all the saves are grouped into a single transaction. The time needed to commit the transaction is amortized over all the enclosed statements, so the time per statement is greatly reduced.
As of Django 2.2, you can use the bulk_update() queryset method to efficiently update the given fields on the provided model instances, generally with one query:
objs = [
    Entry.objects.create(headline='Entry 1'),
    Entry.objects.create(headline='Entry 2'),
]
objs[0].headline = 'This is entry 1'
objs[1].headline = 'This is entry 2'
Entry.objects.bulk_update(objs, ['headline'])
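Applied to the question's percentile case, a rough sketch could look like the following. It is untested against a real database, and the helper name apply_percentiles and the org_id -> percentile mapping are assumptions made for illustration. The helper only changes in-memory objects (NaN becomes None), so that a single bulk_update() call can write them afterwards.

```python
import math
from types import SimpleNamespace


def apply_percentiles(instances, percentile_by_org):
    """Set .percentile on each instance from an org_id -> percentile mapping.

    NaN and missing values become None so they are stored as NULL.
    Returns the list of instances, ready to hand to bulk_update().
    """
    updated = []
    for obj in instances:
        value = percentile_by_org.get(obj.org_id)
        if value is not None and math.isnan(value):
            value = None
        obj.percentile = value
        updated.append(obj)
    return updated


# With Django 2.2+ the write would then be roughly (not run here):
#   objs = apply_percentiles(MeasureValue.objects.filter(month=month), mapping)
#   MeasureValue.objects.bulk_update(objs, ['percentile'], batch_size=1000)

# Stand-in objects so the helper can be exercised without a database
rows = [SimpleNamespace(org_id=1, percentile=None),
        SimpleNamespace(org_id=2, percentile=None)]
result = apply_percentiles(rows, {1: 87.5, 2: float('nan')})
```

Fetching all rows for the month once and updating them in memory also removes the per-row .get() query from the original loop.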
In older versions of Django you could use update() with Case/When, e.g.:
from django.db.models import Case, Value, When
Entry.objects.filter(
    pk__in=headlines  # `headlines` is a pk -> headline mapping
).update(
    # Value() marks each headline as a literal; a bare string in then= is read as a field name
    headline=Case(*[When(pk=entry_pk, then=Value(headline))
                    for entry_pk, headline in headlines.items()]))
Actually, attempting @Eugene Yarmash's answer, I got this error:
FieldError: Joined field references are not permitted in this query
But I believe iterating over update() calls is still quicker than multiple saves, and I expect that wrapping them in a transaction speeds things up further.
So, for versions of Django that don't offer bulk_update, assuming the same data used in Eugene's answer, where headlines is a pk -> headline mapping:
from django.db import transaction
with transaction.atomic():
    for entry_pk, headline in headlines.items():
        Entry.objects.filter(pk=entry_pk).update(headline=headline)

Django numeric comparison of hstore or json data?

Is it possible to filter a queryset by casting an hstore value to int or float?
I've run into an issue where we need to add more robust queries to an existing data model. The data model uses the HStoreField to store the majority of the building data, and we need to be able to query/filter against them, and some of the values need to be treated as numeric values.
However, since the values are treated as strings, they're compared character by character, which gives incorrect query results. For example, '700' > '1000' evaluates as true.
So if I want to query for all items with a sqft value between 700 and 1000, I get back zero results, even though I can plainly see there are hundreds of items with values within that range. If I just query for items with sqft value >= 700, I only get results where the sqft value starts with 7, 8 or 9.
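The mismatch is easy to reproduce in plain Python (a small illustration; the sample values are invented). It also shows why the range query comes back empty: as strings, no value can be both >= '700' and <= '1000'.

```python
sqft_values = ['958', '1000', '700', '85', '7200']  # hstore values are strings

# String comparison goes character by character, so '7...' beats '1...'
string_result = '700' > '1000'
numeric_result = 700 > 1000

# ">= 700" on strings keeps anything starting with 7, 8 or 9
as_strings = [v for v in sqft_values if v >= '700']
# Casting first gives the intended answer
as_numbers = [v for v in sqft_values if int(v) >= 700]

# A 700..1000 range on strings is empty because '700' > '1000'
string_range = [v for v in sqft_values if '700' <= v <= '1000']
numeric_range = [v for v in sqft_values if 700 <= int(v) <= 1000]
```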
I also tried testing this using a JsonField from django-pgjson (since we're not yet on Django 1.9), but it appears to have the same issue.
Setup
Django==1.8.9
django-pgjson==0.3.1 (for jsonfield functionality)
Postgres==9.4.7
models.py
from django.contrib.postgres.fields import HStoreField
from django.db import models
class Building(models.Model):
    address1 = models.CharField(max_length=50)
    address2 = models.CharField(max_length=20, default='', blank=True)
    city = models.CharField(max_length=50)
    state = models.CharField(max_length=2)
    zipcode = models.CharField(max_length=10)
    data = HStoreField(blank=True, null=True)
Example Data
This is an example of what some of the data on the hstore field looks like.
address1: ...
address2: ...
city: ...
state: ...
zipcode: ...
data: {
    'year_built': '1995',
    'building_type': 'residential',
    'building_subtype': 'single-family',
    'bedrooms': '2',
    'bathrooms': '1',
    'total_sqft': '958',
}
Example Query which returns incorrect results
queryset = Building.objects.filter(data__total_sqft__gte=700)
I've tried playing around with the annotate feature to see if I can coerce it to cast to a numeric value but I have not had any luck getting that to work. I always get an error saying the field I'm querying against does not exist. This is an example I found elsewhere which doesn't seem to work.
queryset = Building.objects.all().annotate(
    sqft=RawSQL("((data->>total_sqft)::numeric)")
).filter(sqft__gte=700)
Which results in this error:
FieldError: Cannot resolve keyword 'sqft' into field. Choices are: address1, address2, city, state, zipcode, data
One thing that complicates this setup a little further is that we're building the queries dynamically and using Q() objects to and/or them together.
So, trying to do something sort of like this, given a key, value and operator type (gte, lte, iexact):
queryset = queryset.annotate(**{key: RawSQL("((data->>%s)::numeric)", (key,))})
queries.append(Q(**{'{}__{}'.format(key, op): value}))  # op is the operator type, e.g. 'gte'
queryset = queryset.filter(reduce(operator.and_, queries))
However, I'd be happy even just getting the first query working without dynamically building them out.
I've thought about the possibility of having to create a separate model for the building data with the fields explicitly defined, however there are over 600 key value pairs in the data hstore. It seems like changing that into a concrete data model would be a nightmare to setup and potentially maintain.
I had a very similar problem and ended up using the Cast function (Django >= 1.10) with KeyTextTransform. Here queryset stands for whatever queryset you start from:
my_query = queryset.annotate(
    as_numeric=Cast(
        KeyTextTransform('my_json_fieldname', 'metadata'),
        output_field=DecimalField(max_digits=6, decimal_places=2),
    )
).filter(as_numeric=2)

django save data in database only when certain conditions are met

I have a Python function that scrapes some data from a few different websites, and I want to save that data into my database only if a certain condition is met. Namely, the scraped data should only be saved if the combination of the location and date fields is unique.
So in my view I have a new location variable and a date variable, and essentially I just need to test this combination of values against what's already in the database. If the combination is unique, save it. If it's not, do nothing.
class Speech(models.Model):
    location = models.ForeignKey(Location)
    speaker = models.CharField(max_length=100)
    date = models.DateField()
I'm pretty new to Django, so I'm just not sure how to go about executing this sort of database query.
You want a combination of two things. First, you want an inner Meta class to enforce the uniqueness in the database:
class Speech(models.Model):
    location = models.ForeignKey(Location)
    speaker = models.CharField(max_length=100)
    date = models.DateField()
    class Meta:
        unique_together = ('location', 'date')
Then, when you're doing your data manipulation in your view, you want the get_or_create method of the default model manager:
speech, new = Speech.objects.get_or_create(
    location=my_location_string,
    date=my_datetime_variable,
)
if new:
    speech.speaker = my_speaker_string
    speech.save()
I hope that gets you started. As always, you know your needs better than I do, so don't blindly copy this example, but adapt it to your needs.
Documentation:
unique_together
get_or_create
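One more thing that may help, sketched in plain Python with no database involved: a single scraping run can itself contain the same location/date pair twice, so it can be worth collapsing the batch before calling get_or_create. The function name and the dict keys below are assumptions for illustration only.

```python
from datetime import date


def unique_by_location_and_date(scraped):
    """Keep the first item seen for each (location, date) pair."""
    seen = set()
    unique = []
    for item in scraped:
        key = (item['location'], item['date'])
        if key in seen:
            continue  # duplicate within this batch, do nothing
        seen.add(key)
        unique.append(item)
    return unique


# Invented sample data standing in for one scraping run
scraped = [
    {'location': 'Boston', 'date': date(2016, 3, 1), 'speaker': 'A'},
    {'location': 'Boston', 'date': date(2016, 3, 1), 'speaker': 'B'},
    {'location': 'Austin', 'date': date(2016, 3, 1), 'speaker': 'C'},
]
to_save = unique_by_location_and_date(scraped)
```

Each remaining item can then go through get_or_create. Passing the speaker through its defaults argument (defaults={'speaker': ...}) sets the field at creation time without a second save.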

A good way to store this browser versions - Django/Postgresql

I have this data:
Firefox 3.6
There are 3 items
name
max version
min version
I am storing it this way:
class MyModel(models.Model):
    browser_name = models.CharField(...)
    browser_max_version = models.IntegerField(...)
    browser_min_version = models.IntegerField(...)
or alternative
class Browser(models.Model):
    name = models.CharField(...)
    max_version = models.IntegerField(...)
    min_version = models.IntegerField(...)
class MyModel(models.Model):
    browser = models.ForeignKey(Browser)
Is there any clever way to store the value in one field while keeping it parsable?
I know this might sound weird, but I wonder if there is any alternative to building 1 million models to represent the data.
Any ideas? :)
You could make it parseable, but probably not indexable. For example, you could concatenate the values together separated by semicolons (or some other character), then simply split the string to get the values back. "Firefox 3.6" would become "Firefox;3;6". While this is somewhat easier to parse, it doesn't provide much of an advantage over the original formatting.
The big caveat with this approach is that the column wouldn't be indexable in a very granular way. For example, you couldn't ask for all versions of Firefox. PostgreSQL allows for some very advanced indexing which, I believe, would allow you to create the required indexes, but I don't know of any way you could access the indexes via Django's ORM.
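A quick sketch of that round trip, assuming the "name;major;minor" layout from the example above (the function names are made up for illustration):

```python
def pack_browser(name, major, minor):
    """Join the parts into one storable string, e.g. 'Firefox;3;6'."""
    return ';'.join([name, str(major), str(minor)])


def unpack_browser(packed):
    """Split the stored string back into (name, major, minor)."""
    name, major, minor = packed.split(';')
    return name, int(major), int(minor)


packed = pack_browser('Firefox', 3, 6)
name, major, minor = unpack_browser(packed)
```

Finding every Firefox row would then need a prefix match on the packed string rather than a lookup on a dedicated name column, which is the indexing caveat in practice.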
What is the purpose of MyModel in the second example? The one table Browser is all you need. Why on earth would you need 'millions' of models? Or are you talking about rows in a table?
class Browser(models.Model):
    name = models.CharField(...)
    max_version = models.IntegerField(...)
    min_version = models.IntegerField(...)
is fine
