I have this long queryset statement in a view:
contributions = user_profile.contributions_chosen.all()\
    .filter(payed=False).filter(belongs_to=concert)\
    .filter(contribution_def__left__gt=0)\
    .filter(contribution_def__type_of='ticket')
which I use in my template:
context['contributions'] = contributions
Later in that view I make changes (add or remove a record) to the table contributions_chosen, and if I want my context['contributions'] updated I need to requery the database with the same lengthy query:
contributions = user_profile.contributions_chosen.all()\
    .filter(payed=False).filter(belongs_to=concert)\
    .filter(contribution_def__left__gt=0)\
    .filter(contribution_def__type_of='ticket')
And then update my context again:
context['contributions'] = contributions
So I was wondering if there's any way I can avoid repeating myself and re-evaluate contributions so that it actually reflects the real data in the database.
Ideally I would modify the queryset contributions and its values would be updated, and at the same time the database would reflect these changes, but I don't know how to do this.
UPDATE:
This is what I do between the two occurrences of
context['contributions'] = contributions
I add a new Contribution object to contributions_chosen (this is an m2m relation):
contribution = Contribution.objects.create(kwarg=something, kwarg2=somethingelse)
user_profile.contributions_chosen.add(contribution)
contribution.save()
user_profile.save()
And in some cases I delete a contribution object:
contribution = user_profile.contributions_chosen.get(id=1)
contribution.delete()
As you can see, I'm modifying the table contributions_chosen, so I have to reissue the query and update the context.
What am I doing wrong?
UPDATE
After seeing your comments about evaluation, I realize I do evaluate the queryset: I call len(contributions) between the two context['contributions'] = contributions lines, and that seems to be the problem.
I'll just move it after the database operations and that's it. Thanks, guys.
update
It seems you have not evaluated the queryset contributions, so there is no need to worry about updating it: it has not fetched any data from the DB yet.
Can you post the code between the two context['contributions'] = contributions lines? Until you evaluate the queryset contributions (for example by iterating over it or calling its __len__()), it does not hold anything read from the DB, hence you don't have to update its content.
To re-evaluate a queryset, you could:
# make a clone (note that _clone() returns a new, unevaluated queryset)
contributions = contributions._clone()
# or use any operation that returns a clone, for example
contributions = contributions.filter()
# or clear the cached results so the next evaluation hits the DB again
contributions._result_cache = None
# you could even add new items to contributions._result_cache directly,
# but that could cause unexpected behavior if you aren't careful
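Applied to the view from the question, that last option might look roughly like this (a sketch reusing the names from the question):
# build the queryset once; it is lazy, so nothing hits the DB yet
contributions = user_profile.contributions_chosen\
    .filter(payed=False).filter(belongs_to=concert)\
    .filter(contribution_def__left__gt=0)\
    .filter(contribution_def__type_of='ticket')

# ... add/remove rows in contributions_chosen here ...

contributions._result_cache = None        # drop any cached rows
context['contributions'] = contributions  # the next evaluation re-runs the SQL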
I don't know how you can avoid re-evaluating the query, but one way to save some repetition in your code would be to put all those filters in a dict and pass them as keyword arguments:
query_args = dict(
    payed=False,
    belongs_to=concert,
    contribution_def__left__gt=0,
    contribution_def__type_of='ticket',
)
and then
contributions = user_profile.contributions_chosen.filter(**query_args)
This just removes some repeated code, but does not solve the repeated query. If you need to change the args, just handle query_args as a normal Python dict; it is one, after all :)
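Another way to cut the repetition, sketched under the same assumptions (nothing here beyond the fields already shown in the question), is to wrap the query in a small helper and call it whenever you need fresh data:
def open_ticket_contributions(user_profile, concert):
    # Rebuilds the queryset on every call, so each use re-queries the DB.
    return user_profile.contributions_chosen.filter(
        payed=False,
        belongs_to=concert,
        contribution_def__left__gt=0,
        contribution_def__type_of='ticket',
    )

# in the view, before and after the add/remove operations
context['contributions'] = open_ticket_contributions(user_profile, concert)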
Related
I have a couple of models that I want to update at the same time. First I get their data from the DB with a simple:
s = Store.get(Store.id == store_id)
new_book = Book.get(Book.id == data['book_id'])
old_book = Book.get(Book.id == s.books.id)
The actual schema is irrelevant here. Then I do some updates to these models and at the end I save all three of them with:
s.save()
new_book.save()
old_book.save()
The function that handles these operations uses the @db.atomic() decorator, so the writes are bunched into a single transaction. The problem is: what if, between the point where I get() the data from the DB and the point where I save the modified data, another process has already changed something in these models in the DB? Is there a way to execute those writes (.save() operations) only if the underlying DB rows have not been changed? I could read their last_changed value, but again, is there a way to do this and update at the same time? And if the data has been changed, simply throw an exception?
It turns out there is a solution for this in the official docs, called Optimistic Locking.
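The idea is a version column that every writer checks and bumps inside the same UPDATE; if the update touches zero rows, someone else got there first. Here is a minimal sketch of that pattern (the fields, the StaleDataError exception and the save_optimistic() helper are illustrative, not the docs' exact code):
from peewee import Model, CharField, IntegerField, SqliteDatabase

db = SqliteDatabase(':memory:')  # stand-in database for illustration

class StaleDataError(Exception):
    """Raised when the row changed underneath us."""

class Book(Model):
    title = CharField()
    version = IntegerField(default=1)

    class Meta:
        database = db

    def save_optimistic(self):
        # Only update the row if its version is still the one we read.
        rows = (Book
                .update(title=self.title, version=self.version + 1)
                .where((Book.id == self.id) &
                       (Book.version == self.version))
                .execute())
        if rows == 0:
            raise StaleDataError('Book %s was modified concurrently' % self.id)
        self.version += 1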
I'm trying to understand the Django documentation for the queryset exists() method:
Additionally, if a some_queryset has not yet been evaluated, but you
know that it will be at some point, then using some_queryset.exists()
will do more overall work (one query for the existence check plus an
extra one to later retrieve the results) than simply using
bool(some_queryset), which retrieves the results and then checks if
any were returned.
What I'm doing:
if queryset.exists():
    do_something()
for element in queryset:
    do_something_else(element)
So I'm doing more overall work than just using bool(some_queryset).
Does this code make only one query?
if bool(queryset):
    do_something()
for element in queryset:
    do_something_else(element)
If yes, where does Python put the results? In the queryset variable?
Thank you
From the .exists() documentation itself:
Additionally, if a some_queryset has not yet been evaluated, but you
know that it will be at some point, then using
some_queryset.exists() will do more overall work (one query for the
existence check plus an extra one to later retrieve the results) than
simply using bool(some_queryset), which retrieves the results and
then checks if any were returned.
The results of an already evaluated queryset are cached by Django. So, whenever the data is required from the queryset the cached results are used.
Related docs: Caching and QuerySets
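To make the caching concrete, here is a small sketch (the Entry model and its filter are made up); the fetched rows live on the queryset object itself, in its internal _result_cache attribute:
qs = Entry.objects.filter(published=True)  # hypothetical model and filter

# qs._result_cache is None here: nothing has been fetched yet
if bool(qs):            # first evaluation: one SELECT, rows cached on qs
    do_something()

for entry in qs:        # iterates the cached rows, no second query
    do_something_else(entry)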
It is quite easy to check the number of queries with assertNumQueries:
https://docs.djangoproject.com/en/1.3/topics/testing/#django.test.TestCase.assertNumQueries
In your case:
with self.assertNumQueries(1):
    if bool(queryset):
        do_something()
    for element in queryset:
        do_something_else(element)
If I want to check for the existence of an object and, if possible, retrieve it, which of the following methods is faster? Which is more idiomatic? And why? If neither of the two examples I list, how else would one go about doing this?
if Object.objects.filter(**kwargs).exists():
    my_object = Object.objects.get(**kwargs)
my_object = Object.objects.filter(**kwargs)
if my_object:
    my_object = my_object[0]
If relevant, I care about MySQL and Postgres for this.
Why not do this in a try/except block, to avoid the multiple queries or the query-then-if pattern?
try:
    obj = Object.objects.get(**kwargs)
except Object.DoesNotExist:
    pass
Just add your else logic under the except.
Django provides a pretty good overview of exists().
Using your first example, it will run the query two times; according to the documentation:
if some_queryset has not yet been evaluated, but you
know that it will be at some point, then using some_queryset.exists()
will do more overall work (one query for the existence check plus an
extra one to later retrieve the results) than simply using
bool(some_queryset), which retrieves the results and then checks if
any were returned.
So if you're going to be using the object after checking for existence, the docs suggest just using it and forcing evaluation once with:
if my_object:
    pass
Imagine you have the following situation:
for i in xrange(100000):
    account = Account()
    account.foo = i
    account.save()
Obviously, the 100,000 INSERT statements executed by Django are going to take some time. It would be nicer to be able to combine all those INSERTs into one big INSERT. Here's the kind of thing I'm hoping I can do:
inserts = []
for i in xrange(100000):
    account = Account()
    account.foo = i
    inserts.append(account.insert_sql)
sql = 'INSERT INTO whatever... ' + ', '.join(inserts)
Is there a way to do this using QuerySet, without manually generating all those INSERT statements?
As shown in this related question, one can use @transaction.commit_manually to combine all the .save() operations into a single commit and greatly improve performance.
@transaction.commit_manually
def your_view(request):
    try:
        for i in xrange(100000):
            account = Account()
            account.foo = i
            account.save()
    except:
        transaction.rollback()
    else:
        transaction.commit()
Alternatively, if you're feeling adventurous, have a look at this snippet which implements a manager for bulk inserting. Note that it works only with MySQL, and hasn't been updated in a while so it's hard to tell if it will play nice with newer versions of Django.
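For completeness, Django 1.4 and later ship a built-in bulk_create() that does roughly what the question asks for; a minimal sketch, assuming the same Account model:
# Build the instances in memory, then let Django issue multi-row INSERTs
# instead of one INSERT per object.
accounts = [Account(foo=i) for i in xrange(100000)]
Account.objects.bulk_create(accounts)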
You could use raw SQL.
Either via Account.objects.raw() or by using the django.db.connection object.
This might not be an option if you want to maintain database agnosticism.
http://docs.djangoproject.com/en/dev/topics/db/sql/
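As a rough illustration of the connection route (the table and column names below are guesses for an app called myapp, and a multi-row INSERT like this is inherently schema- and backend-specific):
from django.db import connection

cursor = connection.cursor()
# executemany() lets the DB driver batch the parameter sets;
# on older Django versions you may also need transaction.commit_unless_managed()
cursor.executemany(
    'INSERT INTO myapp_account (foo) VALUES (%s)',
    [(i,) for i in xrange(100000)]
)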
If what you're doing is a one time setup, perhaps using a fixture would be better.
I have a question regarding SQLAlchemy. I have a database which contains Items; every Item has multiple Records assigned to it (1:n). Each Record is partially stored in the database, but it also has a file (1:1) assigned to it on the filesystem.
What I want to do is to delete the assigned file when the Record is removed from the database. So I wrote the following MapperExtension:
class _StoredRecordEraser(MapperExtension):
    def before_delete(self, mapper, connection, instance):
        instance.erase()
The following code creates an experimental setup (full code is here: test.py):
session = Session()
i1 = Item(id='item1')
r11 = Record(id='record11', attr='1')
i1.records.append(r11)
r12 = Record(id='record12', attr='2')
i1.records.append(r12)
session.add(i1)
session.commit()
And finally, my problem... The following code works OK and the old.erase() method is called:
session = Session()
i1 = session.query(Item).get('item1')
old = i1.records[0]
new = Record(id='record13', attr='3')
i1.records.remove(old)
i1.records.append(new)
session.commit()
But when I change the id of the new Record to record11, which is already in the database (but it is not the same record, since attr='3'), old.erase() is not called. Does anybody know why?
Thanks
A delete + insert of two records that ultimately have the same primary key within a single flush are converted into a single update right now. This is not the best behavior - it really should delete then insert, so that the various events assigned to those activities are triggered as expected (not just mapper extension methods, but database-level defaults too). But the flush() process is hardwired to perform inserts/updates first, then deletes. As a workaround, you can issue a flush() after the remove/delete operation, then a second one for the add/insert.
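Concretely, that workaround applied to the example above might look like this (a sketch reusing the names from the question's setup):
session = Session()
i1 = session.query(Item).get('item1')
old = i1.records[0]

i1.records.remove(old)
session.flush()    # emit the DELETE now, so before_delete() fires for `old`

new = Record(id='record11', attr='3')
i1.records.append(new)
session.commit()   # the INSERT happens in a later flush, after the DELETE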
As far as the flush's current behavior goes, I've looked into trying to break this out, but it gets very complicated - inserts which depend on deletes would have to execute after the deletes, but updates which depend on inserts would have to execute beforehand. Ultimately, the unitofwork module would have to be rewritten (big time) to consider all inserts/updates/deletes in a single stream of dependent actions that would be topologically sorted against each other. This would simplify the methods used to execute statements in the correct order, although all-new systems for synchronizing data between rows based on server-level defaults would have to be devised, and it's possible that complexity would be re-introduced if it turned out the "simpler" method spent too much time naively sorting insert statements that are known at the ORM level not to require any sorting against each other. The topological sort works at a more coarse-grained level than that right now.