An "image-serving web framework" in Python? - python

I'm planning an iOS app that requires a server backend capable of efficiently serving image files and performing some dynamic operations based on the requests it gets (like reading and writing into a data store, such as Redis). I'm most comfortable with, and would thus prefer to write the backend in Python.
I've looked at a lot of Python web framework/server options, Flask, Bottle, static and Tornado among them. The common thread seems to be that either they support serving static files as a development-time convenience only, discouraging it in production, or are efficient static file servers but not really geared towards the dynamic framework-like side of things. This is not to say they couldn't function as the backend, but at a quick glance they all seem a bit awkward at it.
In short, I need a web framework that specializes in serving JPEGs instead of generating HTML. I'm pretty certain no such thing exists, but right now I'm hoping that someone could suggest a solution that works without bending the used Python applications in ways they are not meant for.
Specifications and practical requirements
The images I'd be serving to the clients live in the file system in a shallow directory hierarchy. The actual file names would be invisible to the clients. The server would essentially read the directory hierarchy at startup, assigning a numeric ID for each file, and would route the requests to controller methods that then actually serve the image files. Here are a few examples of ways the client would want to access the images in different circumstances:
Randomly (example URL path: /image/random)
Randomly, each file only once (/image/random_unique), produces some suitable non-200 HTTP status code when the files are exhausted
Sequentially in either direction (/image/0, /image/1, /image/2 etc.)
and so on. In addition, there would be URL endpoints for things like ratings, image info and other metadata, some client-specific information as well (the client would "register" with the server, so that needs some logic, too). This data would live in a Redis datastore, most likely.
All in all, the backend needs to be good at serving image/jpeg and application/json (which it would also generate). The scalability and concurrency requirements are modest, at least to start with (this is not an App Store app, going for ad-hoc or enterprise distribution).
I don't want the app to rely on redirects. That is, I don't want a model where a request to a URL would return a redirect to another URL that is backed by, say, nginx as a separate static file server, leaving only the image selection logic for the Python backend. Instead, a request to a URL from the client should always return image/jpeg, with metadata in custom HTTP headers where necessary. I specify this because it is a way of avoiding serving static files from Python that I thought of, and someone else might think of too ;-)
Given this information, what sort of solution would you consider a good choice, and why? Or is this something for which I need to code non-trivial extensions to existing projects?
EDIT: I've been thinking about this a bit more. I don't want redirects due to the delay inherent in the multiple requests they entail, plus I'd like to abstract out the file names from the client, but I was wondering if something like this would be possible:
It's pretty self-explanatory, but the idea is that the Python program is given the request info by nginx (or whatever serves the role), mulls it over and then tells nginx to respond to the client's request with a specific file from the file system. It does so. The client is none the wiser about how the request was fulfilled, it just receives a response with the correct content type.
This would be pretty optimal in my view, but is it possible? If not with nginx, perhaps something else?

I've been using Django for well over a year now, and it is the hammer I use for all my nails. You could probably do this with a bit of database-image storage and django's builtin orm and url routing (with regex). If you store the images in the database, you will automatically get the unique-id's set. According to this stackoverflow answer, you can use redis with django.

I don't want a model where a request to a URL would return a redirect to another URL that is backed by, say, nginx as a separate static file server, leaving only the image selection logic for the Python backend.
I think Nginx for serving static and python for figuring out the image url is the better solution.
Still if you do not want to do that I would suggest you use any Python web framework (like Django) and write your models and convert them into REST resources (Eg. Using django-tastypie) and/or return a base64 encoded image which you can then decode in your iOS client.
Refs:
Decoding a Base64 image
TastyPie returns the path as default, you might have to do extra work to either store the image blob in the table or write more code to return a base64 encoded image string

You might want to look at one of the async servers like Tornado or Twisted.

Related

Running complex calculations (using python/pandas) in a Django server

I have developed a RESTful API using the Django-rest-framework in python. I developed the required models, serialised them, set up token authentication and all the other due diligence that goes along with it.
I also built a front-end using Angular, hosted on a different domain. I setup CORS modifications so I can access the API as required. Everything seems to be working fine.
Here is the problem. The web app I am building is a financial application that should allow the user to run some complex calculations on the server and send the results to the front-end app so they can be rendered into charts and other formats. I do not know how or where to put these calculations.
I chose Django for the back-end as I expected that python would help me run such calculations wherever required. Basically, when I call a particular api link on the server, I want to be able to retrieve data from my database, from multiple tables if required, and use the data to run some calculations using python or a library of python (pandas or numpy) and serve the results of the calculations as response to the API call.
If this is a daunting task, I at least want to be able to use the API to retrieve data from the tables to the front-end, process the data a little using JS, and send it to a python function located on the server with this processed data, and this function would run the necessary complex calculations and respond with results which would be rendered into charts / other formats.
Can anyone point me to a direction to move from here? I looked for resources online but I think I am unable to find the correct keywords to search for them. I just want a shell code kind of a thing to integrate into my current backed using which I can call some python scripts that I write to run these calculations.
Thanks in advance.
I assume your question is about "how do I do these calculations in the restful framework for django?", but I think in this case you need to move away from that idea.
You did everything correctly but RESTful APIs serve resources -- basically your model.
A computation however is nothing like that. As I see it, you have two ways of achieving what you want:
1) Write a model that represents the results of a computation and is served using the RESTful framework, thus your computation being a resource (can work nicely if you store the results in your database as a way of caching)
2) Add a route/endpoint to your api, that is meant to serve results of that computation.
Path 1: Computation as Resource
Create a model, that handles the computation upon instantiation.
You could even set up an inheritance structure for computations and implement an interface for your computation models.
This way, when the resource is requested and the restful framework wants to serve this resource, the computational result will be served.
Path 2: Custom Endpoint
Add a route for your computation endpoints like /myapi/v1/taxes/compute.
In the underlying controller of this endpoint, you will load up the models you need for your computation, perform the computation, and serve the result however you like it (probably a json response).
You can still implement computations with the above mentioned inheritance structure. That way, you can instantiate the Computation object based on a parameter (in the above case taxes).
Does this give you an idea?

nodejs or python proxy lib with relative url support

I am looking for a lib that lets me roughly:
connect to localhost:port, but see http://somesite.com
rewrite all static assets to point to localhost:port instead of somesite.com
support cookies / authentication
i know that http://betterinternet.co/ does this already, but they wont give me their source code for some reason.
I assume this doesnt exist as free code, so if i were to write one, are there any nuances to it? If i replace all occurrences of somesite.com in html and headers, will that be enough?
So...you want an http proxy that does link rewriting? Sounds like Apache and mod_proxy_html. It's not written in node or Python, but I think it will do what you're asking.
I don't see any straight forward solution to your problem. If I've understood correctly, you want a caching HTTP proxy which serves static contents locally, with URL rewriting rules defined in Python (or nodejs). That's quite a task.
A caching HTTP proxy implementation is not trivial. So I'd use an existing implementation, such as Squid (or Apache if it does caching too).
You could then place a (relatively) simple HTTP server written in Python in front of that (e.g. based on BaseHTTPServer and urllib2) which performs the URL rewriting as you want them and forwards the requests to the proxy (or direct to internet).
The idea would be to rely on the proxy setup to perform all the processing you don't want to modify (including basic rewrite rules, authentication, caching and cache management) and limit your front-end implementation to performing only the custom rewriting you are interested in.

Django - Understanding X-Sendfile

I've been doing some research regarding file downloads with access control, using Django. My goal is to completely block access to a file, except when accessed by a specific user. I've read that when using Django, X-Sendfile is one of the methods of choice for achieving this (based on other SO questions, etc). My rudimentary understanding of using X-Sendfile with Django is:
User requests URI to get a protected file
Django app decides which file to return based on URL, and checks user permission, etc.
Django app returns an HTTP Response with the 'X-Sendfile' header set to the server's file path
The web server finds the file and returns it to the requester (I assume the webs server also strips out the 'X-Sendfile' header along the way)
Compared with chucking the file directly from Django, X-Sendfile seems likely to be a more efficient method of achieving protected downloads (since I can rely on Nginx to serve files, vs Django), but leaves 2 questions for me:
Is my explanation of X-Sendfile at least abstractly correct?
Is it really secure, assuming I don't provide normal, front-end HTTP access (e.g. http://www.example.com/downloads/secret-file.jpg) to the directory that the file is stored (ie, don't keep it in my public_html directory)? Or, could a tech-savvy user examine headers, etc. and reverse engineer a way to access a file (to then distribute)?
Is it really a big difference in performance. Am I going to bog my application server down by providing 8b chunked downloads of 150Mb files directly from Django, or is this sort-of a non-issue? The reason I ask is because if both versions are near equal, the Django version would be preferable due to my ability to do things in Python, like log the number of completed downloads, tally bandwidth of downloads etc.
Thanks in advance.
Yes, that's just how it works.
The exact implementation depends on the webserver but in the case of nginx, it's recommended to mark the location as internal to prevent external access.
Nginx can asynchronously serve files while with Django you need one thread per request which can get problematic for higher numbers of parallel requests.
Remember to send a X-Accel-Redirect header for nginx instead of X-Sendfile.
See http://wiki.nginx.org/XSendfile for more information.

How to do cloud computing with Python and Java? Final Year project

For my final year project I plan to code a cloud in Python. The client will be written in Java by the other member of my team. The client will have a tabbed interface and it will provide a text editor, a media player, a couple of small Java based games and a maybe a few more services.
The server will work like this:
1) Validate the user.
2) Send a file, called "dump" to the user. Dump will contain all the file names and file types that the user created by himself or the files which the user can read/write. This info will be fetched from the database.
3) The tabs in the client will display the file types associated with the tab application. e.g the media tab will only select and show the media files from the dump readable by user. The text editor tab will show only the txt files from the dump readable by the user.
4) A request to open the file will send the file back to client, which the associated application will open.
5) All the changes made to the files and all the actions (overwriting, saving, deleting etc.) will be sent back to the server along with the new object. Something similar will be done to the newly created objects.
My Questions are:
What are the best approaches for the communication between the client and the server. For the dump I plan to use some sort of encrypted XML file. For the other way round, I don't have a clue :/.
For easy integration with the database, I was planning to use Django (which I started few days back). How can I send my requests from the client to the server (without Django I'd use SQL queries) and the files from the server to the client? Maybe GET and POST will work for the former problem? Any other suggestions?
Q1: how should I transfer data between client/server securely
A: HTTPS to support encryption & JSON to serialise objects between languages (Python/Java) seems to be the most natural. You could experiment with XML-RPC over SSL or TSL if you want to be creative.
Q2: How do I send queries to the server's db?
A: My first response is to say talk to the person coding the server, and see what's easiest on that end. However, I think that your client should stick to HTTP. The server developer would ensure the server supports RESTful URIs. Then your client only access a URI and have the results processed by the server.
At its most raw, this could be implemented like this:
https://www.example.com/db?q="SELECT * FROM docs"
There are smarter ways to do it, but you get the idea.
If you're going to use a web framework on the server, it makes sense to use an HTTP-based protocol. The downside is that only the client can initiate a connection (e.g., the client needs to first ask for the "dump" file), but a simple GET request will suffice (remember, the server can send anything in the HTTP response, including your XML file).
Regarding encryption, it's best to use an existing protocol like HTTPS. There are well-vetted libraries that will correctly establish a secure connection between your client and the server.
Overall, I'm advocating the highest-level protocols that are appropriate for your application. HTTP(S) goes hand-in-hand with your web-based architecture, so make use of it.
Stick to Django. It's really productive. I would use JSON instead of XML. More convenient. import json. This should help you in communicating between client-server.
Also cloud computing is just a recent word that's just thrown around for (client+server+some services). Oh by the way all that you want to do can be completely done in Django itself. No need to go to JAVA.
Django is Cool :)

AppEngine/Python, query database and send multiple images to the client as a response to a single get request

I am working on a social-network type of application on App Engine, and would like to send multiple images to the client based on a single get request. In particular, when a client loads a page, they should see all images that are associated with their account.
I am using python on the server side, and would like to use Javascript/JQuery on the client side to decode/display the received images.
The difficulty is that I would like to only perform a single query on the server side (ie. query for all images associated with a single user) and send all of the images resulting from the query to the client as a single unit, which will then be broken up into the individual images. Ideally, I would like to use something similar to JSON, but while JSON appears to allow multiple "objects" to be sent as a JSON response, it does not appear to have the ability to allow multiple images (or binary files) to be sent as a JSON response.
Is there another way that I should be looking at this problem, or perhaps a different technology that I should be considering that might allow me to send multiple images to the client, in response to a single get request?
Thank you and Kind Regards
Alexander
The App Engine part isn't much of a problem (as long as the number of images and total size doesn't exceed GAE's limits), but the user's browser is unlikely to know what to do in order to receive multiple payloads per GET request -- that's just not how the web works. I guess you could concatenate all the blobs/bytestreams (together with metadata needed for the client to reconstruct them) and send that (it will still have to be a separate payload from the HTML / CSS / Javascript that you're also sending), as long as you can cajole Javascript into separating the megablob into the needed images again (but for that part you should open a separate question and tag it Javascript, as Python has little to do with it, and GAE nothing at all).
I would instead suggest just accepting the fact that the browser (presumably via ajax, as you mention in tags) will be sending multiple requests, just as it does to every other webpage on the WWW, and focus on optimizing the serving side -- the requests will be very close in time, so you should just use memcache to keep the yet-unsent images to avoid multiple fetch-from-storage requests in your GAE app.
As an improvement to Alex's answer, there's no need to use memcache: Simply do a keys-only query to get a list of keys of images you want to send to the client, then use db.get() to fetch the image corresponding to the required key for each image request. This requires roughly the same amount of effort as a single regular query.
Trying to send all of the images in one request means that you will be fighting very hard against some of the fundamental assumptions of the web and browser technology. If you don't have a really, really compelling reason to do this, you should consider delivering one image per request. That already works now, no sweat, no effort, no wheels reinvented.
I can't think of a sensible way to do what you ask, but I can tell you that you are asking for pain in trying to implement the solution that you are describing.
Send the client URLs for all the images in one hit, and deal with it on the client. That fits with the design of the protocol, and still lets you only make one query. The client might, if you're lucky, be able to stream those back in its next request, but the neat thing is that it'll work (eventually) even if it can't reuse the connection for some reason (usually a busted proxy in the way).

Categories

Resources