django-mptt: dealing with concurrent inserts - python

I have a threaded comment system which works fine 99.9% of the time, but very occasionally the tree breaks down and left/right values get duplicated.
I have discovered that this happens when two posts happen at the same time (within a second of each other), and presumably what is happening is that the second post is updating the left/right values of the tree before the first has completed doing so.
My comment insert code from views.py is the following:
@login_required
@transaction.autocommit
def comment(request, post_id):
    parent = get_object_or_404(Post, pk=post_id)
    if request.method == 'POST':
        form = PostForm(request.POST)
        form.parent = post_id
        if form.is_valid():
            new_post = newPost(request.user, form.cleaned_data['subject'], form.cleaned_data['body'])
            new_post.insert_at(parent, 'last-child', save=True)
            return HttpResponseRedirect('/posts/')
    else:
        form = PostForm()
    return render_to_response('posts/reply.html', {'requestPost': request.POST, 'form': form, 'parent': parent}, context_instance=RequestContext(request))
What is the correct approach to dealing with this? Is there a django way to ensure that the second view does not get called until the first database transaction is complete? Or should I rebuild the tree after each insert to ensure integrity? Or is there a better insert method to be using?
Thanks!
edit: I'm using MySQL.

transaction.autocommit() is Django's default behavior, so your decorator does nothing unless the global transaction behavior has been redefined.
You should use the commit_on_success() decorator instead, so that all database operations in the view run in a single transaction.
You can read more at https://docs.djangoproject.com/en/1.5/topics/db/transactions/
PS: Transaction management changes in Django 1.6 (transaction.atomic() replaces commit_on_success()), so take care when upgrading.
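To see why the read of the parent's right value and the shift of the other nodes have to run as one serialized unit, here is a minimal sketch in plain Python. It is not django-mptt code: the nodes dict, insert_last_child and tree_lock are hypothetical names, and the threading lock only stands in for whatever database-level serialization is used in practice. Depending on the isolation level, a transaction may not by itself stop two requests from reading the same right value, so treat this as an illustration of the race, not a fix for the view.

```python
import threading

# A toy nested-set tree: each node is a dict with 'lft' and 'rght' values.
# Plain Python only; the lock stands in for database-level serialization.
tree_lock = threading.Lock()
nodes = {"root": {"lft": 1, "rght": 2}}


def insert_last_child(name, parent_name):
    """Insert a node as the last child of parent_name, shifting the tree."""
    with tree_lock:  # without this, two threads can read the same parent 'rght'
        boundary = nodes[parent_name]["rght"]
        # Make room: shift every value at or beyond the parent's right edge by 2.
        for node in nodes.values():
            if node["lft"] >= boundary:
                node["lft"] += 2
            if node["rght"] >= boundary:
                node["rght"] += 2
        nodes[name] = {"lft": boundary, "rght": boundary + 1}


# Twenty "simultaneous" replies to the same parent.
threads = [
    threading.Thread(target=insert_last_child, args=(f"reply{i}", "root"))
    for i in range(20)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every left/right value in a healthy tree is unique.
values = [v for node in nodes.values() for v in (node["lft"], node["rght"])]
```

With the lock in place, all left/right values stay unique; removing it lets two inserts compute the same boundary, which is the duplication described in the question.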


Why is my django core.serializers so slow

I use a serializer from core.serializers in my Django view. It works, but it sometimes takes over a minute to show my results table. Any ideas how to make it faster?
# views.py
from django.core import serializers
def search_institution(request):
    form = SearchInstitutionsForm()
    qs = Institution.objects.all()
    if request.method == "POST":
        form = SearchInstitutionsForm(request.POST)
        if form.is_valid():
            cd = form.cleaned_data
            if cd['name']:
                qs = Institution.objects.filter(name__contains=cd['name'])
            print(f"Before requesting from db: {datetime.now()}")
            print(f"After requesting from db, before serializing: {datetime.now()}")
            context = {
                "result_data": SafeString(serializers.serialize("json", qs)),
                'form': form
            }
            print(f"After serializing, before rendering: {datetime.now()}")
            return render(request, "landing/result_table.html", context)
    else:
        context = {
            "form": SearchInstitutionsForm
        }
        return render(request, "stakeholders/institution_form.html", context)
Your print statements don't help much because QuerySets are lazy: they are not evaluated when you filter (qs = Institution.objects.filter(name__contains=cd['name'])) but only when the results are actually used, which here is inside serializers.serialize().
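The effect of lazy evaluation on timing can be sketched without a database. In the snippet below a generator function, slow_rows, is a hypothetical stand-in for a QuerySet: creating it costs nothing, and the simulated query only runs when the rows are consumed.

```python
import time


def slow_rows():
    """Stand-in for a lazy QuerySet: the work happens only on iteration."""
    time.sleep(0.2)  # pretend this is the database round trip
    yield from ({"id": i} for i in range(3))


start = time.perf_counter()
rows = slow_rows()  # like .filter(): returns immediately, no work done yet
built = time.perf_counter() - start

start = time.perf_counter()
data = list(rows)  # like list(qs) or serialize(): the "query" runs here
evaluated = time.perf_counter() - start

print(f"building: {built:.3f}s, evaluating: {evaluated:.3f}s")
```

In the view, forcing evaluation with list(qs) between the print calls would separate the time spent on the query from the time spent on serialization.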
I won't go into serialization much; there are good ways of doing that, and Django on its own already provides good tools.
Look into QuerySet and database performance first; that is the most essential topic, and it is where I suspect your performance issue comes from. Also read up on select_related and prefetch_related, which can cut the number of queries dramatically (for example from 200+ down to 2).
You can use Django Debug Toolbar in your local environment to monitor your application's queries easily. There are other profiling tools and tips too.

Django Row Level Locking For Model Forms

I am using Python 3.5, Django 1.8 and PostgreSql 9.4.
I have an edit view: on a GET request it renders the form, and when the user submits the form the record is updated. Here's the code:
def edit_case(request, case_id):
    case = get_object_or_404(Case, pk=case_id)
    if request.method == 'POST':
        data = request.POST.copy()
        form = CaseEditForm(data, instance=case)
        if form.is_valid():
            res = form.save()
            return HttpResponseRedirect(reverse("case_list"))
    else:
        form = CaseEditForm(instance=case)
    variables = RequestContext(request, {
        "form": form,
    })
    return render_to_response(
        'sample/edit_case.html',
        variables,
    )
Now I want to add row-level locking, so that while one user is updating a case, no other user can update it until the first transaction has finished. I am also open to better suggestions than pessimistic locking.
I know about select_for_update, but I don't know how to apply it when using form.save().
Any help would be really appreciated.
After much research I have figured out a solution. When fetching the object by id, I now use select_for_update, like this:
with transaction.atomic():
    case = get_object_or_404(Case.objects.select_for_update(), pk=case_id)
    form = CaseEditForm(data, instance=case)
    if form.is_valid():
        res = form.save()
        return HttpResponseRedirect(reverse("case_list"))
As you can see, I now fetch the object inside the transaction, so if any exception occurs during the transaction it is rolled back automatically. Because I am using select_for_update, the row stays locked until the transaction either commits or rolls back.
If anyone has a better suggestion, kindly share it.
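One commonly used alternative to pessimistic locking is optimistic locking with a version column: the row is never locked, and a save only succeeds if the version is still the one that was read. The sketch below uses the standard-library sqlite3 module rather than the Django ORM, and the cases table, version column and save_case helper are all hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cases (id INTEGER PRIMARY KEY, title TEXT, version INTEGER)"
)
conn.execute("INSERT INTO cases VALUES (1, 'original', 1)")
conn.commit()


def save_case(case_id, new_title, version_seen):
    """Update only if nobody else saved since version_seen was read."""
    cur = conn.execute(
        "UPDATE cases SET title = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_title, case_id, version_seen),
    )
    conn.commit()
    return cur.rowcount == 1  # False means a concurrent edit got there first


# Two users load the form at the same time and both see version 1.
first_ok = save_case(1, "edited by user A", version_seen=1)
second_ok = save_case(1, "edited by user B", version_seen=1)
final_title = conn.execute("SELECT title FROM cases WHERE id = 1").fetchone()[0]
```

The second save is rejected rather than blocked, so the view would need to tell that user to reload and retry. Whether that is preferable to select_for_update depends on how often edits actually collide.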

Avoid recreating the GET logic in the POST part of a view

How can I avoid duplicating the logic code from the GET block in the view below?
The logic is view-specific enough that I don't feel it makes sense to put a helper function in a separate utils.py.
'''
show_stuff view shows different stuff depending on how many other
POST requests have been submitted to view and saved to the DB.
All the users access the same URL randomly, so I don't believe it's possible to
split things up like "view_for_template1", "view_for_template2" in urls.py
'''
def show_stuff(request, url_dispatch_var1, url_dispatch_var2=None):
    if request.method == "GET":
        #30 lines of logic determining which Template and Context to return
    if request.method =="POST":
        #10 lines of logic determining which Form type to save
        #then, the same 30 lines of logic as the GET block to determine
        #which Template and Context to return
You can usually do something like the following:
def show_stuff(request, url_dispatch_var1, url_dispatch_var2=None):
    if request.method == "POST":
        # 10 lines of logic determining which Form type to save
        # redirect if form is valid
    else:
        # this is a GET request
        form = MyForm()  # unbound form for get request
    # 30 lines of logic to determine which Template and Context to return
    return render(request, template, context)
Note that after a successful post request, the usual approach is to redirect to prevent duplicate submissions.
This might be a case where class based views are useful. You could subclass FormView, then override get_context_data, get_template_names and so on.
Alternatively, instead of returning a body for the POST request, you could redirect the user to your GET view.
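The redirect-after-POST suggestion can be sketched in plain Python so that the shared logic lives in one place. Everything here is a hypothetical stand-in: FakeRequest replaces Django's request object, the saved list replaces the database, and the returned tuples replace real redirect and render responses.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    """Hypothetical stand-in for django.http.HttpRequest."""
    method: str = "GET"
    POST: dict = field(default_factory=dict)


saved = []  # stands in for rows saved to the database


def choose_template_and_context():
    """The shared '30 lines': decide what to show from what has been saved."""
    template = "first.html" if not saved else "later.html"
    return template, {"count": len(saved)}


def show_stuff(request):
    if request.method == "POST":
        saved.append(request.POST)  # the POST-only logic
        return ("redirect", "/show-stuff/")  # the browser then issues a GET
    return ("render",) + choose_template_and_context()
```

Because the POST branch never renders, the template-selection logic runs only on GET and does not need to be duplicated.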

How to Prevent a Redirected Django Form from Executing Twice?

My form2 is executing twice due to HttpResponseRedirect and from 'POST'. How do I prevent that from happening? Is it even possible?
What I've tried:
Processing and rendering "getInfo" in form1 and displaying the result in form2. This may work, but I would still end up calling "getInfo" again in form2 to be able to use the returned variable.
Putting "getInfo" inside the if request.method block. This causes an error, because getInfo has to run first to obtain the returned errors variable.
Any suggestions are welcome.
Update
I've raised a similar question regarding "Why My Django Form Executed Twice?" and it was answered. I didn't want to create a bigger confusion by adding more questions on top of it. I created this as a follow-up question on how to actually solve it.
views.py
def form1 (request):
    NameFormSet = formset_factory (NameForm, formset = BaseNodeFormSet, extra = 2, max_num = 5)
    if request.method == 'POST':
        name_formset = NameFormSet (request.POST, prefix = 'nameform')
        if name_formset.is_valid ():
            data = name_formset.cleaned_data
            request.session ['data'] = data
            return HttpResponseRedirect ('form2')
        else:
            name_formset = NameFormSet (prefix = 'nameform')
    context = {'name_formset': name_formset}
    return render (request, 'nameform/form1.html', context)
def form2 (request):
    data = request.session ['data']
    n, data, errors = getInfo (data) # <==== This statement executed twice in the console
    CheckBoxFormSet = formset_factory (CheckBox, extra = 2, max_num = 5)
    if request.method == 'POST':
        checkbox_formset = CheckBoxFormSet (request.POST, prefix = 'checkbox')
        if checkbox_formset.is_valid ():
            for i, form in enumerate (checkbox_formset.cleaned_data):
                data [i].update (form) # Join cleaned data with original data
            n.processData (data, errors) # <=== n object will be used from getInfo
            del request.session ['data']
            context = {'data': data}
            return render (request, 'nameform/success.html', context)
        else:
            checkbox_formset = CheckBoxFormSet (prefix = 'checkbox')
    context = {
        'checkbox_formset': checkbox_formset,
        'data': data,
        'errors': errors, # <==== getInfo needed to execute first to display errors messages
    }
    return render (request, 'nameform/form2.html', context)
def getInfo (data):
    # Do something with data
    return (n, data, errors)
Should the else: statement be less indented, to align with if request.method == 'POST':? Also, where does getInfo come from? In the version of Django I'm using (1.7), errors are an attribute on the formset after calling is_valid().
edit: further information
OK, so your getInfo function runs twice because HttpResponseRedirect actually returns a 302 response to the browser with the new address, which the browser then GETs. When the user then submits the form, the browser POSTs the data to the same view. Since getInfo runs at the start of your view, before any GET/POST check, it runs both times. If you just want to delegate returning a response to another view function, call that function directly instead of returning a redirect.
Without knowing any more about your program this is as much as anyone can tell you.
Some more points:
getInfo sounds like it should be a 'safe' function that doesn't mutate its input or have any side effects, so running it twice shouldn't be a problem. If it does either of those things, you ought to rename it at least.
If the results of getInfo aren't expected to change between the GET and the POST request then you can move it into the form1 view function and store the results in session['data']. If it is expected to change, or you need to know if it does change, then you have no option but to run it twice anyway, unless there is some conditional you can check without running it to know if it will change.
Finally, form validation shouldn't be in the view if possible; keep it in your form class. There are hooks in Django's form classes for whatever kind of validation you could want to do on submitted data. As a general rule, try to work within the framework, in the way it was designed to be used. This approach, as opposed to constructing a Rube Goldberg machine out of the scavenged parts of many libraries taped together, will save you a lot of effort in the long run, as the library author and you will be working in the same direction.
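The second point above, computing the result once in form1 and storing it in the session, can be sketched in plain Python. A dict stands in for request.session, and get_info, form1 and form2 are simplified hypothetical versions of the functions in the question. This assumes the result can be serialized into the session, which may not hold for the n object in the original code.

```python
calls = []  # records every call, to show how often the expensive step runs


def get_info(data):
    """Hypothetical stand-in for getInfo; flags empty entries as errors."""
    calls.append(data)
    return {"data": data, "errors": [d for d in data if not d]}


def form1(session, posted):
    # Run the expensive step once, at submit time, and keep the result.
    session["info"] = get_info(posted)
    return "redirect:form2"


def form2(session, method):
    info = session["info"]  # reused on both the GET and the later POST
    return info["errors"] if method == "GET" else info["data"]


session = {}
form1(session, ["alice", ""])
form2(session, "GET")   # first visit after the redirect
form2(session, "POST")  # the user submits the second form
```

Here get_info runs once even though form2 is entered twice, which is the behavior the question is after when the result is not expected to change between the two requests.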

How to re-use a view to update objects in Django?

I am trying to get my head around Class based views (I'm new to Django). I currently have a project that uses function based views. My 'create' view renders a form and successfully submits to the database. However, I need an edit/update function so the obvious option is to re-use the 'create' function I made but I'm struggling to work it out and adhere to the DRY principle.
Is using Class based views the right way to go?
Do they handle the creation of all the 'CRUD' views?
I'm currently working my way through the GoDjango tutorials on class-based views, but it's still not sinking in.
Any help/pointers would be, as usual, much appreciated.
As you can see in the source code, a CreateView and an UpdateView are very similar. The only difference is that a CreateView sets self.object to None, forcing the creation of a new object, while UpdateView sets it to the updated object.
Creating an UpdateOrCreateView would be as simple as subclassing UpdateView and overriding the get_object method to return None when a new object should be created.
class UpdateOrCreateView(UpdateView):
    def get_object(self, queryset=None):
        # or any other condition
        if not self.kwargs.get('pk', None):
            return None
        return super(UpdateOrCreateView, self).get_object(queryset)
The GoDjango tutorials don't seem to be out of date (CBVs have barely changed since their introduction), but they do seem to be missing some of the essential views in their tutorials.
In my opinion, CBVs are never the solution. A DRY FBV looks like this (assuming you have created and imported a form RecordForm and a model Record, and imported get_object_or_404 and redirect):
@render_to('sometemplate.html')
def update(request, pk=None):
    if pk:
        record = get_object_or_404(Record, pk=pk)
    else:
        record = None
    if request.POST:
        # Bind to the existing record so an edit updates it instead of creating a new row.
        form = RecordForm(request.POST, instance=record)
        if form.is_valid():
            form.save()
            return redirect('somepage')
        else:
            pass  # ....
    elif record:
        form = RecordForm(instance=record)
    else:
        form = RecordForm()
    return { 'form': form, 'record': record }
I also integrate the messages framework, for example to add an error message when form.is_valid() is False.
I use a render_to decorator, but that's not necessary (without it you have to return the view's result differently).
