Django: Streaming dynamically generated XML output through an HttpResponse

Recently I wanted to return a dynamically generated XML tree from a Django view. The module I use for XML manipulation is the usual cElementTree.
I think I achieved what I wanted by doing the following:
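from datetime import datetime
import xml.etree.cElementTree as cET
from xml.etree.cElementTree import Element, SubElement
from django.http import HttpResponse
def view1(request):
    # pass the generator itself so the response body is produced lazily
    resp = HttpResponse(g())
    return resp
def g():
    root = Element("ist")
    list_stamp = SubElement(root, "list_timestamp")
    list_creation = str(datetime.now())
    for i in range(1,1000000):
        root.text = str(i)
        yield cET.tostring(root)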
Is something like this a good idea? Am I missing something?

About middlewares "breaking" streaming:
CommonMiddleware will try to consume the whole iterator if you set USE_ETAGS = True in settings. But in modern Django (1.1) there's a better way to do conditional GET than CommonMiddleware + ConditionalGetMiddleware: the condition decorator. Use that and your streaming will stream okay :-)
Another thing that will try to consume the iterator is GZipMiddleware. If you want to use it, you can avoid gzipping your streaming responses by turning it into a decorator and applying it to individual views instead of globally.
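To see why an ETag step defeats streaming: a hash of the body cannot be known until every chunk has been read. The following is a minimal sketch using only the standard library, not Django's own code; body and etag_for are made-up names for illustration.

```python
import hashlib


def body():
    """Hypothetical streamed response body: three small chunks."""
    for i in range(3):
        yield b"chunk %d\n" % i


def etag_for(chunks):
    """Hash the complete body; every chunk must be read before the hash exists."""
    digest = hashlib.md5()
    buffered = []
    for chunk in chunks:
        digest.update(chunk)
        buffered.append(chunk)  # the body now has to be held in memory
    return digest.hexdigest(), buffered


stream = body()
etag, buffered = etag_for(stream)
leftover = list(stream)  # nothing is left to stream: the generator is exhausted
```

Anything that needs the whole body up front has this effect, which is why applying such processing per view rather than globally keeps the streaming views intact.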

Does it work? If it doesn't work, what error does it throw?
If you're building a full-blown API for a Django site, take a look at django-piston. It takes care of a lot of the busywork related to that.
http://bitbucket.org/jespern/django-piston/wiki/Home

Yes, it's perfectly legitimate to return an iterator in an HttpResponse. As you've discovered, that allows you to stream content to the client.

Yes. That's THE WAY you do it on Django.
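One detail worth checking is that the concatenated chunks form a single well-formed document. The sketch below is one possible way to do that with the standard library alone; iter_xml and the element names are assumptions for illustration, not the asker's code. The generator would be handed to HttpResponse as in the question (in Django 1.5 and later, StreamingHttpResponse is the class meant for iterators).

```python
from xml.etree.ElementTree import Element, fromstring, tostring


def iter_xml(count):
    """Yield a well-formed XML document one fragment at a time."""
    yield b"<list>"
    for i in range(1, count + 1):
        item = Element("item")
        item.text = str(i)
        # Each fragment is one small serialized element, so memory use
        # does not grow with count.
        yield tostring(item)
    yield b"</list>"


# Joining the fragments gives one parseable document.
document = b"".join(iter_xml(3))
root = fromstring(document)
```

Writing the opening and closing tags by hand keeps the tree from ever being built in full; only one item element exists at a time.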

Related

Eve framework: modify query_string

I'm writing an API using the Python Eve framework.
In my on_post_GET hook I want to extend request.query_string with an additional condition.
request.query_string is a raw URL-encoded string, which makes it awkward to add a new condition to the existing ones.
My string looks like:
embedded=%7B%22some_key%22%3A1%2C%22another_key%22%3A1%2C%22one_more_key%22%3A1%2C%22and_more_key%22%3A1%2C%22and_more%22%3A1%2C%22some_specific_key%22%3A1%2C%22the_last_key%22%3A1%7D&where=%7B%22some_statement%22%3A%22in%28%5B%5C%22value1%5C%22%2C%5C%22value2%5C%22%5D%29%22%7D&max_results=10&page=1&sort=%5B%28%22date%22%2C0%29%5D
So, I want to add one additional condition to the where clause. I could parse the string somehow, but there are a few things:
1) There may be other conditions already, and hardcoding around them looks terrible to me.
2) I hope there is a better way to extend it.
Thoughts?

You should be able to make your filter by handling the lookup inside a pre_GET event hook, as in this example from pyeve's documentation:
def pre_GET(resource, request, lookup):
    # only return documents that have a 'username' field.
    lookup["username"] = {'$exists': True}
app = Eve()
app.on_pre_GET += pre_GET
app.run()
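If the raw string does have to be changed, it can at least be decoded rather than edited by hand. This is a minimal sketch of the string handling only, using the standard library; add_where_condition is a hypothetical helper, it assumes the where parameter holds JSON, and whether Eve re-reads a modified query string is not something it demonstrates.

```python
import json
from urllib.parse import parse_qs, urlencode


def add_where_condition(query_string, field, value):
    """Return query_string with one extra key merged into its 'where' JSON."""
    params = parse_qs(query_string)  # percent-decoded; every value is a list
    where = json.loads(params.get("where", ["{}"])[0])
    where[field] = value  # existing conditions are left untouched
    params["where"] = [json.dumps(where)]
    return urlencode(params, doseq=True)


raw = "where=%7B%22status%22%3A%22open%22%7D&max_results=10&page=1"
extended = add_where_condition(raw, "owner", "alice")
decoded = parse_qs(extended)
```

Because the existing conditions are parsed and merged, nothing about them needs to be hardcoded.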

Is it OK to send the whole POST as a JSON object?

I am using GAE with python, and I am using many forms. Usually, my code looks something like this:
class Handler(BaseHandler):
    #...
    def post(self):
        name = self.request.get("name")
        last_name = self.request.get("last_name")
        # More variables...
        n = self.request.get("n")
        # Do something with the variables, validations, etc.
        # Add them to a dictionary
        data = dict(name=name, last_name=last_name, n=n)
        info = testdb.Test(**data)
        info.put()
I have noticed lately that it gets too long when there are many inputs in the form (variables), so I thought maybe I could send a stringified JSON object (which can be treated as a python dictionary using json.loads). Right now it looks like this:
class Handler(BaseHandler):
    #...
    def post(self):
        data = validate_dict(json.loads(self.request.body))
        # Use a variable like this: data['last_name']
        test = testdb.Test(**data)
        test.put()
Which is a lot shorter. I am inclined to do things this way (and stop using self.request.get("something")), but I am worried I may be missing some disadvantage of doing this apart from the client needing javascript for it to even work. Is it OK to do this or is there something I should consider before rearranging my code?

There is absolutely nothing wrong with your short JSON-focused code variant (few web apps today bother supporting clients w/o Javascript anyway:-).
You'll just need to adapt the client-side code preparing that POST, from being just a traditional HTML form, to a JS-richer approach, of course. But, I'm pretty sure you're aware of that -- just spelling it out!-)
BTW, there is nothing here that's App Engine - specific: the same considerations would apply no matter how you chose to deploy your server.
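One thing the short variant leans on is validate_dict, since Test(**data) passes every key the client sent straight to the model. The question does not show that function, so the following is only a sketch of what it might do; ALLOWED_FIELDS and its field types are assumptions based on the names used in the question.

```python
import json

# Hypothetical schema: the only keys a client may send, with expected types.
ALLOWED_FIELDS = {"name": str, "last_name": str, "n": int}


def validate_dict(data):
    """Reject anything that is not exactly the expected set of fields."""
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    unknown = set(data) - set(ALLOWED_FIELDS)
    if unknown:
        raise ValueError("unexpected fields: %s" % ", ".join(sorted(unknown)))
    for field, expected_type in ALLOWED_FIELDS.items():
        if field not in data:
            raise ValueError("missing field: %s" % field)
        if not isinstance(data[field], expected_type):
            raise ValueError("wrong type for field: %s" % field)
    return data


body = '{"name": "Ada", "last_name": "Lovelace", "n": 3}'
data = validate_dict(json.loads(body))
```

A whitelist like this keeps the handler short while still deciding on the server which properties can be set.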

How does the get_list tastypie function work?

I'm trying to use the get_list tastypie function but I can't make it work. I've looked for documentation about it but can't find any.
Anyway, I have a list of item IDs and an ItemResource. I'm trying to return a list of serialized objects.
So I just want to do something like this:
item_resource = ItemResource()
item_ids = my_item_id_list
return item_resource.get_list(request, id=item_ids)
But of course it doesn't work.
What would be the correct syntax for that?
Thanks!

Unless your ItemResource accepts filters (more here), you have to copy-paste all the stuff from here, lines #1306 - #1313.
The point is that get_list results get filtered only by obj_get_list (initial filters), and apply_filters (request-specific filters) so you have to skip directly to the serialization part (you can include the pagination part, if needed).
This is one of the cases where django-restframework appears to be better than django-tastypie: it factors serialization out into a separate class, avoiding the code duplication.

Why would Django get request with long url lock python?

I have a strange error using the built in webserver in Django (haven't tested against Apache as I'm in active development). I have a url pattern that works for short url parameters (e.g. Chalk%20Hill), but locks up python on this one
http://localhost:8000/chargeback/checkDuplicateProject/Bexar%20Street%20Phase%20IV%20Brigham%20Ln%20to%20Myrtle%20St
The get request just says pending, and never returns, and I have to force quit python to get the server to function again. What am I doing wrong?
EDIT:
On further testing, strangely, if I just enter the URL directly, it returns the correct JSON response and then locks Python. From within the website, though, the request never returns, and Python locks up.
urls:
url(r'^chargeback/checkDuplicateProject/(?P<aProjectName>(\w+)((\s)?(-)?(\w+)?)*)/$', 'chargeback.views.isProjectDuplicate'),
views:
def isProjectDuplicate(request, aProjectName):
    # count the number of matching project names
    p = Project.objects.filter(projectName__exact = aProjectName).count()
    # if > 0, the project is a duplicate
    if p > 0:
        return HttpResponse('{"results":["Duplicate"]}', mimetype='application/json')
    else:
        return HttpResponse('{"results":["Not Duplicate"]}', mimetype='application/json')
Model:
class Project(models.Model):
    projectName = models.TextField('project name')
    department = models.ForeignKey('Department')
    def __unicode__(self):
        return self.projectName

The accepted answer is spot on about the regex, but since we're discussing optimization, I thought I should note that the code for checking whether a project exists could be modified to generate a much quicker query, especially in other contexts where you could be counting millions of rows needlessly. Call this 'best practices' advice, if you will.
p = Project.objects.filter(projectName__exact = aProjectName).count()
if p > 0:
could instead be
if Project.objects.filter(projectName__exact=aProjectName).exists():
for two reasons.
First, you're not using p for anything else, so there's no need to store it in a variable; dropping it improves readability, p is an obscure variable name, and the best code is no code at all.
Secondly, exists() lets the database stop at the first matching row instead of counting every match. Please see the official QuerySet API docs, a related question on Stack Overflow and the discussion about the latter on the django-developers group.
Additionally, it is customary in Python (and Django, naturally) to name your fields lower_case_separated_by_underscores. Please see more about this in the Python Style Guide (PEP 8).

Since you are going to check whether aProjectName already exists in the database, there's no need for you to make the regex so complicated.
I suggest you simplify the regex to
url(r'^chargeback/checkDuplicateProject/(?P<aProjectName>[\w+\s-]*)/$', 'chargeback.views.isProjectDuplicate'),
For a further explanation, see the question url regex keeps django busy/crashing on the django-users group.
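The likely cause of the lock-up is the nested optional groups inside a repeated group in the original pattern: when the overall match fails (for example, when the URL has no trailing slash), the regex engine tries a huge number of ways to split the name. A flat character class has only one way to match each character. A small check with the standard re module is sketched below; it leaves out the literal + from the class above, which is an assumption about the intent.

```python
import re
from urllib.parse import unquote

# Flat character class: word characters, whitespace and hyphens only.
PROJECT_URL = re.compile(
    r"^chargeback/checkDuplicateProject/(?P<aProjectName>[\w\s-]*)/$"
)

# URL patterns are matched against the percent-decoded path.
path = unquote(
    "chargeback/checkDuplicateProject/"
    "Bexar%20Street%20Phase%20IV%20Brigham%20Ln%20to%20Myrtle%20St/"
)

match = PROJECT_URL.match(path)
project_name = match.group("aProjectName")

# Without the trailing slash the pattern simply fails, and does so quickly.
no_slash = PROJECT_URL.match(path.rstrip("/"))
```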

How do I use beaker caching in Pyramid?

I have the following in my ini file:
cache.regions = default_term, second, short_term, long_term
cache.type = memory
cache.second.expire = 1
cache.short_term.expire = 60
cache.default_term.expire = 300
cache.long_term.expire = 3600
And this in my __init__.py:
from pyramid_beaker import set_cache_regions_from_settings
set_cache_regions_from_settings(settings)
However, I'm not sure how to perform the actual caching in my views/handlers. Is there a decorator available? I figured there would be something in the response API, but only cache_control is available, which instructs the client to cache the data rather than caching it server-side.
Any ideas?

My mistake was to apply the decorator @cache_region to a view callable. I got no errors, but there was no actual caching. In my views.py I was trying something like:
@cache_region('long_term')
def photos_view(request):
    # just an example of a costly call from Google Picasa
    gd_client = gdata.photos.service.PhotosService()
    photos = gd_client.GetFeed('...')
    return {
        'photos': photos.entry
    }
No errors and no caching. Also your view-callable will start to require another parameter! But this works:
# make a separate function and cache it
@cache_region('long_term')
def get_photos():
    gd_client = gdata.photos.service.PhotosService()
    photos = gd_client.GetFeed('...')
    return photos.entry
And then in view-callable just:
def photos_view(request):
    return {
        'photos': get_photos()
    }
It works the same way for @cache.cache etc.
Summary: do not try to cache view callables.
PS. I still have a slight suspicion that view callables can be cached :)
UPD.: As hlv later explains, when you cache a view callable, the cache is never actually hit, because @cache_region uses the callable's request parameter as the cache id, and request is unique for every request.

By the way, the reason it didn't work for you when calling view_callable(request) is that the function parameters get pickled into a cache key for later lookup in the cache.
Since "self" and "request" change for every request, the return values ARE indeed cached, but can never be looked up again. Instead, your cache gets bloated with lots of useless keys.
I cache parts of my view functions by defining a new function inside the view callable, like this:
def view_callable(self, context, request):
    @cache_region('long_term', 'some-unique-key-for-this-call_%s' % (request.params['some-specific-id']))
    def func_to_cache():
        # do something expensive with request.db for example
        return something
    return func_to_cache()
It SEEMS to work nicely so far.
cheers
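The effect described above can be reproduced with a toy decorator that keys its cache on the call arguments. This is only an illustration of the principle, not Beaker's implementation; cache_by_args, Request and the two decorated functions are all made up for the example.

```python
def cache_by_args(func):
    """Toy stand-in for a caching decorator: results are keyed on the arguments."""
    cache = {}

    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache
    return wrapper


class Request:
    """Hypothetical request object; every instance is a distinct key."""


calls = []


@cache_by_args
def view(request):
    calls.append("view")
    return "rendered"


@cache_by_args
def get_photos(album_id):
    calls.append("get_photos")
    return ["photo-%d" % album_id]


# Two requests, as a server would see them: both miss the cache.
first, second = Request(), Request()
view(first)
view(second)

# The same scalar argument twice: the second call is served from the cache.
get_photos(7)
get_photos(7)
```

The view ends up with two cache entries that will never be read again, while the helper keyed on a plain value is computed once.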

You should use cache region:
from beaker.cache import cache_region
@cache_region('default_term')
def your_func():
    ...

A hint for those using @cache_region on functions but not having their results cached: make sure the parameters of the function are scalar.
Example A (doesn't cache):
@cache_region('hour')
def get_addresses(person):
    return Session.query(Address).filter(Address.person_id == person.id).all()
get_addresses(Session.query(Person).first())
Example B (does cache):
@cache_region('hour')
def get_addresses(person):
    return Session.query(Address).filter(Address.person_id == person).all()
get_addresses(Session.query(Person).first().id)
The reason is that the function parameters are used as the cache key - something like get_addresses_123. If an object is passed this key can't be made.

Same problem here. You can perform caching with default parameters using
from beaker.cache import CacheManager
and then decorators like
@cache.cache('get_my_profile', expire=60)
as in http://beaker.groovie.org/caching.html, but I can't find out how to make it work with the Pyramid .ini configuration.
