Using infinite scroll with Django Rest Framework?

I am creating a REST API using Django Rest Framework. The API will serve a large amount of data, and I want to use infinite scrolling on the page. I would like to use Angular for the frontend. I am not sure how to serve the data so that it is not all sent at once, but only as the user scrolls down.
I am using this serializer class:
class CompanySerializer(serializers.ModelSerializer):
    class Meta:
        model = Company
        fields = ('company_name', 'location', 'founded_year')
I am not sure how to implement this. Should I use Django Endless Pagination, or can it be done with the pagination provided by django-rest-framework? I am also not sure how the frontend side of this would work. Web development newbie here, please help.

Create a ListAPIView subclass:
from rest_framework import pagination, generics

class CompanyPagination(pagination.PageNumberPagination):
    page_size = 20  # the no. of company objects you want to send in one go

# Assume url for this view is /api/v1/companies/
class CompanyListView(generics.ListAPIView):
    queryset = Company.objects.all()
    serializer_class = CompanySerializer
    pagination_class = CompanyPagination
Once this is done, you can get the first 20 companies by calling http://your_domain.com/api/v1/companies/?page=1; ?page=2 will return the 21st to 40th companies, and so on. (Omitting ?page= is the same as ?page=1.)
On the AngularJS side, keep a variable that holds the page number to fetch next. You can then bind the API request to a click event on a "Load More" button, or detect when the user has scrolled to the bottom and request the next set of Company objects at that point.
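The client-side bookkeeping described above can be sketched in plain Python. This is only an illustration under assumptions: fake_fetch is a hypothetical stand-in for the real HTTP call, the data is made up, and the response dictionary imitates the count/next/previous/results keys of a PageNumberPagination response (real responses carry absolute URLs). An Angular implementation would keep the same state in a component or service.

```python
from typing import Callable, Dict, List, Optional

PAGE_SIZE = 20  # mirrors page_size in CompanyPagination


def fake_fetch(page: int, total: int = 45) -> Dict:
    """Hypothetical stand-in for GET /api/v1/companies/?page=N."""
    start = (page - 1) * PAGE_SIZE
    stop = min(start + PAGE_SIZE, total)
    has_next = stop < total
    return {
        "count": total,
        "next": f"/api/v1/companies/?page={page + 1}" if has_next else None,
        "previous": f"/api/v1/companies/?page={page - 1}" if page > 1 else None,
        "results": [{"company_name": f"Company {i + 1}"} for i in range(start, stop)],
    }


class CompanyFeed:
    """Tracks which page to request next and accumulates loaded rows."""

    def __init__(self, fetch: Callable[[int], Dict]) -> None:
        self.fetch = fetch
        self.next_page: Optional[int] = 1
        self.items: List[Dict] = []

    def load_more(self) -> bool:
        """Fetch one more page; return False once everything is loaded."""
        if self.next_page is None:
            return False
        payload = self.fetch(self.next_page)
        self.items.extend(payload["results"])
        # A null "next" link means the last page has been reached.
        self.next_page = self.next_page + 1 if payload["next"] else None
        return True


feed = CompanyFeed(fake_fetch)
while feed.load_more():  # in a browser this would run on scroll or click
    pass
print(len(feed.items))
```

Stopping on a null "next" link rather than on a computed page count keeps the client independent of the server's page size.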
Note:
PageNumberPagination is not compulsory; you can also use LimitOffsetPagination or CursorPagination to achieve the same result. The DRF pagination documentation describes the available styles.

You should give CursorPagination a try.
I don't know whether it counts as infinite scrolling, but it is definitely meant for huge data sets.
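To show why a cursor suits an endlessly growing list, here is a minimal sketch of the underlying idea in plain Python. It is not DRF's implementation (DRF hands out an opaque encoded cursor and needs a consistent ordering on the queryset); cursor_page and the sample ids are hypothetical. The point is that the client remembers the last row it saw instead of a page number, so rows added in the meantime do not shift the next page.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def cursor_page(
    sorted_ids: List[int], cursor: Optional[int], page_size: int = 3
) -> Tuple[List[int], Optional[int]]:
    """Return the rows after `cursor` and the cursor for the next request."""
    start = 0 if cursor is None else bisect_right(sorted_ids, cursor)
    page = sorted_ids[start:start + page_size]
    more = start + page_size < len(sorted_ids)
    next_cursor = page[-1] if page and more else None
    return page, next_cursor


ids = [1, 2, 3, 4, 5, 6, 7]
page, cursor = cursor_page(ids, None)      # first request, no cursor yet

ids.insert(0, 0)                           # a new row appears before the next request

page2, cursor2 = cursor_page(ids, cursor)  # resumes after the last row seen
offset_page2 = ids[3:6]                    # what an offset of 3 would return instead
```

In this sketch the offset-based slice repeats a row the client already has, while the cursor-based request continues exactly where the previous one stopped.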

Related

Passing user-entered data to Queryset.filter() - is it safe?

I have a page which takes GET parameters from its url, and passes these directly to a REST API. So the page URL looks like:
foo.com/pizzas/?toppings=mushrooms&toppings=sausage
As the page loads, it will take the GET params and pass them to a REST API like:
foo.com/totally/unrelated/url/?toppings=mushrooms&toppings=sausage
On the backend, I want to extract these out and filter based on them. This is basically what I have right now:
# inside a django rest framework ModelViewSet
# it's not really relevant other than that self.request exists
def get_queryset(self):
    queryset = self.model.objects.all()
    for param in self.request.query_params:
        # param = "toppings"
        if not self.is_real_model_field(param):  # assume this works
            continue
        param_value_list = self.request.query_params.getlist(param)
        # param_value_list = ['mushrooms', 'sausage']
        # a dynamically built lookup name has to be passed as unpacked kwargs
        queryset = queryset.filter(
            **{f"{param}__in": param_value_list}
        )
    return queryset
I've said that the fact it's Django Rest Framework is irrelevant, but I'm not 100% sure about that. In the above example, request.query_params is added by Django Rest Framework, but based on DRF's documentation I believe it is simply an alias for Django's built-in request.GET.
So, is this safe to do in Django? A malicious actor could directly manipulate the URL. I assume that Django's QuerySet.filter(field__in=values) will automatically do some cleaning for you, and/or the limited character set of a URL will help stop anything nasty from coming through, but I haven't been able to find any resources discussing the issue.

Have a look at the django-filter package (imported as django_filters). It does what you want, so you won't need to reinvent the wheel.
With this package you can list all filterable fields explicitly. It also adds validation of query parameter values.
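For readers who want to see what an explicit list of filterable fields buys you, here is a minimal sketch in plain Python. It is not how django-filter works internally; build_filter_kwargs and ALLOWED_FIELDS are hypothetical names. The Django ORM passes filter values to the database as query parameters, so the part worth controlling is which lookup names a client is allowed to choose.

```python
from typing import Dict, Iterable, List
from urllib.parse import parse_qs

ALLOWED_FIELDS = {"toppings", "size"}  # hypothetical whitelist of filterable fields


def build_filter_kwargs(
    query_string: str, allowed: Iterable[str] = ALLOWED_FIELDS
) -> Dict[str, List[str]]:
    """Turn a query string into `field__in` lookups, skipping unknown keys."""
    allowed = set(allowed)
    params = parse_qs(query_string)  # e.g. {"toppings": ["mushrooms", "sausage"]}
    return {
        f"{name}__in": values
        for name, values in params.items()
        if name in allowed
    }


kwargs = build_filter_kwargs(
    "toppings=mushrooms&toppings=sausage&owner__password__startswith=a"
)
# kwargs could then be applied as queryset.filter(**kwargs)
print(kwargs)
```

The second parameter in the example is dropped because it is not on the whitelist; without that check, a client could try to filter across relations into fields that were never meant to be queryable.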

Django REST Framework that doesn't alter the model data

I am trying to create a middleware web app that will allow users to control some services on our servers. To that end, I have several models created in Django that are used to track things like the current state of the server, or a list of which inputs are valid for any given service.
The API needs to be able to:
List all instances of a model
Show detailed information from one instance of a model
Accept JSON to be converted into instructions for the software (i.e. "This list of outputs should source from this input")
I don't need to have any further access to the data - Any changes to the details of the models will be done by a superuser through the Django admin interface, as it will only change if the software configuration changes.
So far all the DRF documentation I've found assumes that the API will be used to create and update model data - How can I use DRF for just GET calls and custom actions? Or should I forego DRF and just use plain Django, returning JSON instead of HTML?
Edit: I've realised where my confusion was coming from; I was misunderstanding the purpose/function of serializers vs viewsets. Serializers will always have create + update methods because they turn incoming data into a model object. Viewsets determine what can be done with that object, so that's where you enable different access methods.

If you are using ModelViewSet, you can use the http_method_names class variable.
class MyModelViewSet(viewsets.ModelViewSet):
    queryset = MyModel.objects.all()
    serializer_class = MyModelSerializer
    http_method_names = ['get']
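The effect of http_method_names can be pictured with a small stand-in written in plain Python. This is a simplified sketch of the dispatch check, not Django's or DRF's actual code; MiniView and ReadOnlyView are hypothetical classes, and the status tuples are made up for illustration.

```python
from typing import Tuple


class MiniView:
    """Simplified stand-in for the dispatch step of a Django/DRF view."""

    http_method_names = ["get", "post", "put", "patch", "delete"]

    def dispatch(self, method: str) -> Tuple[int, str]:
        name = method.lower()
        handler = getattr(self, name, None)
        # Reject the request if the verb is not listed or has no handler.
        if name not in self.http_method_names or handler is None:
            return 405, "Method Not Allowed"
        return handler()

    def get(self) -> Tuple[int, str]:
        return 200, "list of objects"

    def post(self) -> Tuple[int, str]:
        return 201, "object created"


class ReadOnlyView(MiniView):
    http_method_names = ["get"]  # same idea as in MyModelViewSet above


view = ReadOnlyView()
print(view.dispatch("GET"))
print(view.dispatch("POST"))
```

Note that in Django the default list also contains head and options, so restricting it to ['get'] turns those off as well; add them back if clients rely on them.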

You can try ReadOnlyModelViewSet. An example from the docs:
class AccountViewSet(viewsets.ReadOnlyModelViewSet):
    """
    A simple ViewSet for viewing accounts.
    """
    queryset = Account.objects.all()
    serializer_class = AccountSerializer

How to think about Django's normal class based views vs. using a REST API

I've been writing a webapp with Django to replace a clumsy, spreadsheet based sports picking game that I play with some friends. I've learned a lot, and had a great time getting to know Django and how to build something like this from scratch.
I recently realized that I wanted to use something more powerful on the frontend (Ember, Angular, etc) with the end goal being a single page app. To that end, I installed Django REST Framework (DRF) and started reading the docs and following the tutorial. It's super interesting, and I'm finally starting to see why a client-server model with an API is really the only way to achieve the smooth interactivity that's all over now.
I'm trying to implement one of my class based views as an API endpoint, and I've been having a lot of trouble conceptualizing it. I thought I'd start with a simple, GET-only endpoint- here's the simple CBV I'm trying to replicate in API form:
class MatchupDetail(DetailView):
    template_name = 'app/matchups.html'
    context_object_name = 'pick_sheet'

    def get_object(self):
        # logic to find and return object
        ...

    def get_opponent(self, username, schedule, week, **kwargs):
        # logic to find and return the opponent in the matchup
        ...

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        # logic to pull the opponent's details and set them in the context
        return context
I feel like I have a handle on this flow- a user clicks a link, and this view retrieves the object at the heart of the requested page, supplements it with content in the context, then renders it.
As I began thinking about turning this into an API endpoint, it didn't make a whole lot of sense. Should I be putting all the user-related data into a single JSON response? Or should the frontend basically handle the flow of this logic and the API simply be composed of a collection of endpoints- for example, one to retrieve the object, and one or more to retrieve what's now being passed in the context?
What prompted me to make this post was some trouble with my (super basic) API implementation of the above view:
class MatchupDetailApi(generics.ListAPIView):
    queryset = Sheet.objects.all()
    serializer_class = SheetSerializer
With serializer:
class SheetSerializer(serializers.ModelSerializer):
    user = serializers.ReadOnlyField()

    class Meta:
        model = Sheet
I added the user field when I noticed that without it, the returned serialized Sheet objects are literally just the row in the database- an integer ID, integer foreign key to the User object, and so on. With a 'traditional' CBV, the entire objects are returned to the template- so it's very intuitive to access related fields, and with Django it's also easy to traverse object relationships.
Does a REST implementation offer the same sort of thing? From what I've read, it seems like I'll need an extension to DRF (django-rest-multiple-models) to return more than one model in a single response, which leads me to think I should be creating endpoints for every model, and leaving presentation logic to when I take care of the frontend. Is that typical? Or is it feasible to have an API endpoint that does return something like an object and several related objects?
Note: the basic endpoint above stopped working when I added the user to the SheetSerializer. I realized I should have a UserSerializer as well, which is:
class UserSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = User
However, when I try to browse the API, I get a TypeError saying that the first user isn't serializable. Specifically: <User: dkhaupt> is not JSON serializable. Isn't this what the UserSerializer is for?

Is it feasible to have an API endpoint that does return something like an object and several related objects?
Yes!
And it sounds like you are off to a great start. I would structure it something like this:
class UserSerializer(serializers.ModelSerializer):
    """serializes a user"""
    class Meta:
        model = User
        fields = ('id', 'first_name', 'last_name',)

class SheetSerializer(serializers.ModelSerializer):
    """serializes a sheet, and nests user relationship"""
    user = UserSerializer(read_only=True)

    class Meta:
        model = Sheet
        fields = ('id', 'sheet_name', 'user',)
I don't think you need django-rest-multiple-models for what you are trying to achieve. In my sketch (where I'm guessing fieldnames) you will serialize the sheet, and also the associated user object.
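To make the shape of the nested response concrete, here is a plain-Python sketch of what such nesting produces. It does not use DRF; the dataclasses, the serialize_* helpers, and the sample values are hypothetical and reuse the guessed field names from the sketch above. It also shows why the raw User object in the question failed: only plain dicts, lists, strings, and numbers can be dumped to JSON.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class User:
    id: int
    first_name: str
    last_name: str


@dataclass
class Sheet:
    id: int
    sheet_name: str
    user: User


def serialize_user(user: User) -> Dict[str, Any]:
    """Reduce a user to JSON-friendly primitives."""
    return {"id": user.id, "first_name": user.first_name, "last_name": user.last_name}


def serialize_sheet(sheet: Sheet) -> Dict[str, Any]:
    """Embed the related user as a nested object instead of a bare key."""
    return {
        "id": sheet.id,
        "sheet_name": sheet.sheet_name,
        "user": serialize_user(sheet.user),
    }


sheet = Sheet(id=1, sheet_name="Week 1", user=User(id=7, first_name="Dana", last_name="K"))
payload = serialize_sheet(sheet)
print(json.dumps(payload, indent=2))
```

A frontend receiving this payload can read the user's name directly from the sheet, much like traversing the relationship in a template.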

You can add fields from a related model using the source argument.
For example:
class SheetSerializer(serializers.ModelSerializer):
    # 'user.id' assumes the default primary key name on the User model
    user_id = serializers.ReadOnlyField(source='user.id')
    username = serializers.ReadOnlyField(source='user.username')

    class Meta:
        model = Sheet
Here the serializer returns information from the User model that is related to the Sheet model.

Django: Filter request results to only contain data related to the requesting user

I'm a Django beginner (though I do have experience in web development using Sails.js + Angular) so bear with me.
I have an existing application that uses a REST API to communicate between a Sails.js backend and an AngularJS frontend. Now, we've found the backend to be unsuited for our purposes, and we're going to switch to Django in the near future. Sails.js automatically creates the REST methods for the controllers while Django doesn't, so I suppose I'm going to use something like Django Rest Framework to create the API.
So yeah, I've found corresponding features for most things. The one thing I haven't found yet is a replacement for a Sails.js feature called "policies". They are functions that can be executed on queries to certain controller actions, and can be defined as model-specific, model-controller action-specific, and request type specific. For example, you can have an "authAccess" policy that checks that the user of a request is authenticated, and the policy gets executed before the actual requested controller method gets executed. You can also use these to modify request objects before passing them to the controller. Now to my actual problem:
Let's say I have a User model that has a many-to-one relation with another model, let's call it Book, meaning a user can own many books, but a book can only have one owner. Goody good. Now, we have a logged-in user that is making a query to find all of his books. He makes a GET request to /book. I want to ensure that the returned Book objects are filtered so that ONLY HIS BOOKS are returned.
So basically in Sails I was able to write a policy that altered the request parameters to be like {user: loggedInUser} so the resulting Book results were automatically filtered. This was handy, since I was able to use the same policy to filter other models, too, like DVD or Friend or whatnot. My question is - what would be the best way to implement the same functionality in Django?

Have a look at the documentation:
http://www.django-rest-framework.org/api-guide/filtering/#filtering-against-the-current-user
Most likely you are better off overriding the get_queryset method in a model viewset. You can make this a generic approach by creating a base class for your views, something like:
from rest_framework import viewsets

class OwnerModelViewSet(viewsets.ModelViewSet):
    def get_queryset(self):
        """
        This view should return a list of all the records
        for the currently authenticated user.
        """
        return self.model.objects.filter(user=self.request.user)
All your model viewset classes can inherit from that class. It would require the foreign key field to always be named "user", though. If that is not the case, here is a slightly hacky way to find a foreign key field pointing to the User table. Use with care.
from django.db.models.fields.related import ForeignKey
from accounts.models import User

def _get_related_user(self, obj):
    '''
    Search for FK to user model and return field name or False if no FK.
    This can lead to wrong results when the model has more than one user FK.
    '''
    for f in self.model._meta.fields:
        # remote_field.model replaces rel.to, which was removed in Django 2.0
        if isinstance(f, ForeignKey) and f.remote_field.model == User:
            return f.name
    return False
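A less magical alternative to searching for the foreign key is to let each view name its owner field explicitly. The sketch below shows that pattern in plain Python under loud assumptions: owner_field is a hypothetical attribute, FakeQuerySet is a tiny in-memory stand-in for a Django QuerySet, and current_user plays the role of self.request.user. In a real project the mixin would sit in front of a DRF viewset instead of BaseView.

```python
from typing import Any, Dict, List


class FakeQuerySet:
    """Tiny in-memory stand-in for a Django QuerySet (illustration only)."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self.rows = rows

    def filter(self, **lookups: Any) -> "FakeQuerySet":
        matches = [
            row for row in self.rows
            if all(row.get(key) == value for key, value in lookups.items())
        ]
        return FakeQuerySet(matches)


class BaseView:
    """Hypothetical base view that returns every row."""

    def __init__(self, rows: List[Dict[str, Any]], current_user: str) -> None:
        self.rows = rows
        self.current_user = current_user  # plays the role of self.request.user

    def get_queryset(self) -> FakeQuerySet:
        return FakeQuerySet(self.rows)


class OwnerFilterMixin:
    """Restrict results to rows owned by the requesting user."""

    owner_field = "user"  # hypothetical attribute; override per view

    def get_queryset(self) -> FakeQuerySet:
        return super().get_queryset().filter(**{self.owner_field: self.current_user})


class BookView(OwnerFilterMixin, BaseView):
    owner_field = "owner"  # this model calls its foreign key "owner"


books = [{"title": "A", "owner": "alice"}, {"title": "B", "owner": "bob"}]
mine = BookView(books, "alice").get_queryset().rows
print(mine)
```

The same mixin could then be reused for other models (DVD, Friend, and so on) by setting owner_field on each view, which avoids guessing when a model has more than one foreign key to the user table.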

Django admin hangs (until timeout error) for a specific model when trying to edit/create

This one is driving me nuts right now. It was not happening before (even got screenshots I had to do for the user-manual since the customer required it).
I first noticed it on production server and then I checked and also happens in the dev server that comes with Django. The model appears on the main-page of the django admin, I can click it and it will display the list of point of sales. The problem comes whenever I want to edit an existing instance or create a new one.
I just click on the link (or put it on the bar) and it just hangs.
class PointOfSaleAdmin(admin.ModelAdmin):
    list_display = ('id', 'business', 'user', 'zipcode', 'address', 'date_registered')
    list_filter = ('business',)
    filter_horizontal = ('services',)

admin.site.register(models.PointOfSale, PointOfSaleAdmin)
That's the registration of the model. All models are registered in the admin application and the user to test this is a super user. The model is:
class PointOfSale(models.Model):
    user = models.ForeignKey(User)
    zipcode = models.ForeignKey(Zipcode)
    business = models.ForeignKey(Business)
    services = models.ManyToManyField(Service,
                                      verbose_name='available services')
    date_registered = models.DateField(auto_now_add=True)
    address = models.CharField(max_length=300)
Plus a few methods that shouldn't really matter much. Plus, last time before this that I tested the admin was right after I created all those methods, so it shouldn't matter on this.
The administrator very very rarely has to access this page. Usually it's just listing the PoS, but it still bothers me. Any idea of why it could be hanging? All other models are working just fine.
This is happening on both Django 1.2.5 and 1.3
EDIT:
I modified the timeout limits. It IS working, but it takes several minutes for the page to load, so something in the background is taking ages. I don't understand why it happens only for this model, and why it happens in different environments (and with small datasets).
I almost feel like slapping myself. My fault for not sleeping for so long.
The problem is that the zipcode list is pretty big (tens of thousands of entries) and the foreign key field is rendered as an HTML select tag, which means it loads every single entry. It is simply an issue with how much data there is.
Now I wonder how to control the way the foreign key is displayed in the admin. Could anyone help with that?

In your admin.py file, under the appropriate admin class, set
raw_id_fields = ('zipcode',)
This will display the zipcode's PK instead of a dropdown.
Is there a reason that you are setting up zipcode as its own model instead of using a CharField or an actual zipcode model field?

I just wanted to add that another option here is a readonly_fields list. It helps when there is a relationship to a model with a large number of choices (in my case, a relation table cataloging flags between a large number of users and discussion threads) but you don't need to edit the field. Adding the field to readonly_fields makes the admin print just the value rather than render all the choices.
class FlaggedCommentsAdmin(ModelAdmin):
    list_display = ('user', 'discussion', 'flagged_on')
    readonly_fields = ('user', 'discussion')

For people still landing on this page: As Mamsaac points out in his original post, the timeout happens because django tries to load all instances of a ForeignKey into an html-select. Django 2 lets you add an auto-complete field which asynchronously lets you search for the ForeignKey to deal with this. In your admin.py do something like this:
from django.contrib import admin
from .models import Parent, Child

@admin.register(Parent)
class ParentAdmin(admin.ModelAdmin):
    # tell admin to autocomplete-select the "Parent"-field 'children'
    autocomplete_fields = ['children']

@admin.register(Child)
class ChildAdmin(admin.ModelAdmin):
    # when using an autocomplete to find a child, search in the field 'name'
    search_fields = ['name']

Have you tried checking the apache logs (if you're using apache obviously) or any other HTTP server related logs? That might give you an idea of where to start.
That's the only model that is affected? You mentioned methods on the model. Try commenting out those methods and trying again (including the __unicode__ method), just to see if they somehow affect it. Reduce everything down to the bare minimum (as much as possible obviously), to try and deduce where the regression started.
Try to monitor server resources when you request this page. Does CPU spike dramatically? What about network I/O? Could be a database issue (somehow?).
Sorry this doesn't really answer your question, but those are the first debugging techniques that I'd attempt trying to diagnose the problem.
