web2py site doesn't load all images/videos (especially larger ones)

First of all, I must say I have seen something similar to this in the web2py discussion group, but I couldn't understand it very well.
I've set up a database-driven website using web2py in which the entries are just HTML text. Most of them will contain img and/or video tags that point to relative URLs; these files are stored in folders with the address pattern static/content/article/<article-name> and the document's base href is set via the controller to make these links work. So, the images are stored and referenced directly, without all the upload/download machinery.
I'm testing it locally with the Rocket server because I'm not allowed to install Apache on this PC.
The problem:
Everything works fine except, it seems, when several "large" files are requested at once. By "large" I mean that 4 MB files were enough to trigger it, which isn't really a lot (and I think slightly smaller files would produce the same result). I'm pretty sure the links aren't broken, since 1) pasting their URLs into the browser shows the files normally, 2) the images/videos appear intact or broken at random when I refresh the page, and 3) sometimes a video loads up to a certain point and then stops, and the browser inspector shows a 'fail' signal. When I replaced these files with smaller ones (about a dozen kB each), all of them loaded. Another thing to consider is that the page sometimes takes a really long time to finish loading (anywhere from 2 seconds to several minutes).
The questions:
Is this the simplest/optimal way of getting the job done? I'm aware that web2py has some neat features like upload fields, but I don't know how I could make these files effortlessly referenceable in the document, considering there will be some special features in such pages involving the static files. So the solution I've come up with so far is to create a directory whose name matches the entry's and store the files there, as I said before. Is that overkill considering what web2py has to offer?
If the answer to the first question is something like "yes", then (obvious question) what may be causing the problem and how do I fix it? Does it have something to do with the fact that web2py sends static files in chunks of 1 MB? Might it be the Rocket server? Or is it because I'm testing it locally?
Thanks in advance!

It's hard to give you an answer without knowing some details...
Where is your web2py application hosted?
Do you use apache? nginx?
Did you deploy using a one step-deploy script? (http://web2py.googlecode.com/hg/scripts/setup-web2py-ubuntu.sh)
But in any case, you can (and should):
Configure Apache/Nginx to serve your static files directly (files in /YourApp/static/.). See the "setup-web2py-" scripts in the "scripts" folder for more information.
Use scripts/zip_static_files.py to create gzipped versions of your static files. You can create a cron job to run "python web2py.py -S myapp -R scripts/zip_static_files.py"
More details about efficiency in the book: http://web2py.com/books/default/chapter/29/13/deployment-recipes?search=static+files#Efficiency-and-scalability
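To give a rough idea of what pre-compressing static files involves, here is a minimal sketch. It is not the actual scripts/zip_static_files.py from web2py; the function name gzip_static_files and the COMPRESSIBLE extension list are assumptions made for illustration.

```python
import gzip
import shutil
from pathlib import Path

# Assumed list of text-like extensions worth compressing (illustrative only).
COMPRESSIBLE = {".css", ".js", ".html", ".svg", ".txt"}


def gzip_static_files(static_dir):
    """Write a .gz copy next to each compressible file; return the paths created."""
    created = []
    for path in sorted(Path(static_dir).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in COMPRESSIBLE:
            continue
        gz_path = path.with_name(path.name + ".gz")
        # Skip files whose compressed copy is already up to date.
        if gz_path.exists() and gz_path.stat().st_mtime >= path.stat().st_mtime:
            continue
        with path.open("rb") as src, gzip.open(gz_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        created.append(gz_path)
    return created
```

Already-compressed formats such as JPEG or MP4 gain very little from gzip, so a step like this mainly helps CSS, JavaScript and HTML rather than the large images and videos described in the question.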

Related

Django taking very long time to serve generated static files

In my view I generate 5 images (they are generated by pyplot).
Everything seems to work fine. The files are generated correctly in the right directory.
But the browser only shows one of the five images, and the requests for the other 4 usually time out. Looking at the Django server output, the GET requests will often take five minutes to finish.
These images are ~100kb, and are present and correct on the drive immediately after being generated.
Am I missing a call for Django to update the new static files? Something else? Please help!

Deal with them as a media file, save the files to your media server (CDN or local), then send the URL of the images to the browser to load them.

I also faced a similar problem, but with audio files. Pagination solved it for me, so I think you should try that: show one image on the first page, the second image on the second page, and so on.

Filling out paragraph text via urllib?

Say I have a paragraph in a site I am managing and I want a python program to change the contents of that paragraph. Is this plausible with urllib?

Quite possibly; it depends on how the site is designed.
If the site is just a collection of static pages (i.e. .html files) then you would have to get a copy of the page, modify it, and upload the new version - most likely using sftp or WebDAV.
If the site is running a content management system (like Wordpress, Drupal, Joomla, etc) then it gets quite a bit simpler - your script can simply post new page content.
If it is static content maintained through a template system (e.g. Dreamweaver) then life gets quite a bit nastier again, because any changes you make will not be reflected in the template files and will likely get overwritten and disappear the next time you update the site.

If you have access to any server-side scripting language, it's easy.

ensuring dynamic image urls in a web-app: use a blob store?

I want to serve images in a web-app using sessions such that the links to the images expire once the session has expired.
If I show the actual links to the filesystem store of the images, say http://www.mywebapp.com/images/foo1.jpg, it becomes difficult to stop future requests for an image once the user has signed out of the session. That is why I was considering placing the images in a SQLite db and serving them from there.
It seems that using the db for image storage is considered bad practice (though the GAE blob store apparently provides this functionality), so I was trying to figure out what the alternatives would be.
1) Perhaps I do some sort of URL rewriting, like so:
http://www.mywebapp.com/images/[session_id]/foo1.jpg
I'm thinking of using nginx, but it seems (at first look) that this will require some hacking to accomplish?
2) Copy the files to a physical directory on the filesystem and delete them when the session expires. This seems quite messy, though?
Are there any standard methods of accomplishing this dynamic image url thing?
I'm using web.py - if that helps.
Many thanks!

lighty's mod_secdownload has worked well for me to solve this issue. You can read more about it at http://redmine.lighttpd.net/wiki/1/Docs:ModSecDownload
The lighttpd wiki also has a generic article about your problem: http://redmine.lighttpd.net/wiki/1/HowToFightDeepLinking

Why so complicated?
Serve the image under the name which the user supplied (i.e. http://www.mywebapp.com/images/foo1.jpg)
Save the images in a directory using a UUID as name.
Create a map of file names to UUIDs in the session.
In the handler for /images/ look up the real file name in the map. Return 404 if no such entry exists. Otherwise serve the image.
When the session is closed, delete all files from the map.
In a cron job, delete all images that are older than one day.
This way, several users can upload the same image (same name), and images get deleted as soon as possible, or by the cron job if the server crashes or something like that.
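The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the session is modelled as a plain dict rather than a real web.py session object, and the class name SessionImageStore and its method names are hypothetical.

```python
import time
import uuid
from pathlib import Path


class SessionImageStore:
    """Store images under UUID names and map user-visible names per session."""

    def __init__(self, storage_dir):
        self.storage_dir = Path(storage_dir)
        self.storage_dir.mkdir(parents=True, exist_ok=True)

    def save(self, session, filename, data):
        # The file on disk gets a random name; only the session knows the mapping.
        stored_name = uuid.uuid4().hex
        (self.storage_dir / stored_name).write_bytes(data)
        session.setdefault("images", {})[filename] = stored_name

    def load(self, session, filename):
        # Returns None when the name is unknown; the handler should answer 404.
        stored_name = session.get("images", {}).get(filename)
        if stored_name is None:
            return None
        path = self.storage_dir / stored_name
        return path.read_bytes() if path.exists() else None

    def close_session(self, session):
        # Delete every file this session owns and forget the mapping.
        for stored_name in session.pop("images", {}).values():
            (self.storage_dir / stored_name).unlink(missing_ok=True)

    def purge_older_than(self, max_age_seconds, now=None):
        # Intended for a cron job: remove leftovers from crashed sessions.
        now = time.time() if now is None else now
        for path in self.storage_dir.iterdir():
            if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
                path.unlink()
```

The handler for /images/ would call load() with the current session and the requested file name, so a link stops working as soon as the session that owns the mapping is gone.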

A combination of your two ideas (copy to a dir, expire when session expires) could be generalized to creating a new dir (could be as simple as a symlink) every 15 minutes. When generating the new symlink, also remove the one that's an hour old by now. Always link to the newest name in your code.

Read static content from within the code of an application

Is there a way to read the contents of a static data directory or interact with that data in any way from within the code of an application?
Edit: Please excuse me if it wasn't clear at first; I mean getting a list of the files in that directory, not reading the data in them.

No. Files marked as static in app.yaml are not available to your application; they're served from separate servers.
If you just need to list them, you could build a list as part of your deploy process. If you need to actually read them, you'll need to include a second copy in your application directory (although the "copy" can be just a symlink; appcfg.py will follow symlinks and upload them.)
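A minimal sketch of the "build a list as part of your deploy process" idea, assuming a small script run before each upload. The function names and the choice of a JSON manifest are illustrative assumptions, not part of App Engine or appcfg.py; the manifest has to live in the application directory, outside the static one, so the application can read it.

```python
import json
from pathlib import Path


def write_static_manifest(static_dir, manifest_path):
    """Record the relative path of every file under static_dir in a JSON file."""
    static_root = Path(static_dir)
    files = sorted(
        p.relative_to(static_root).as_posix()
        for p in static_root.rglob("*")
        if p.is_file()
    )
    Path(manifest_path).write_text(json.dumps(files, indent=2), encoding="utf-8")
    return files


def read_static_manifest(manifest_path):
    """Load the list written at deploy time (called from application code)."""
    return json.loads(Path(manifest_path).read_text(encoding="utf-8"))
```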

You can just open them (read-only).

Getting information about static files in Python App Engine; workarounds

I'm working on an App Engine project that will have customizable themes. I'd like to be able to use jQuery UI themes. The problem is figuring out what the CSS file is going to be named. (Typically, "jquery-ui-1.7.2.custom.css". Version numbers will change, and people tend to rename things, but there should only be one CSS file, and I'm OK with it being an error condition if there's two or more for some reason.) Because it's a static file (static files are uploaded to App Engine separately from the rest of the application's resources), I can't just glob the directory for a CSS file. I can't just assume that it's hard-coded, and I really don't want to make it a configuration setting, because that's a bad user experience.
Guido told me to symlink it so that App Engine sees two copies and can treat one as static and the other as an application resource, but symlinks don't work on Windows, and since this will ultimately be open source, I can't control which SDK the user uses. Another suggestion was to use a deploy-time script, but Mac users have this nice "Deploy" button in their version of the SDK and I'd rather not have to tell them, "Oh hey, sorry for the inconvenience, but you can't use that for this project."
I clearly need an out-of-the-box solution to this one, but I'm at a loss. Anyone have any good suggestions for how to get a custom jQuery UI theme out of the ThemeRoller and into an App Engine app? Some post-processing is already needed, because the only files in the zip file that ThemeRoller gives you are in the "css" directory. Maybe I can write something that takes a raw theme as input and spits out something useful on the other side (the deploy-time script trick, but somehow less user-unfriendly). The trick here is presentation — I want the user to spend as little time on the command line as possible. An ideal solution assumes the person performing this task is non-technical for the most part. No part of the solution can be much harder than installing something like WordPress or Drupal, and in a perfect world, it should be way, way easier.

To accomplish what you are asking, I would use the datastore for serving the CSS files, since this would allow easy listing, sorting, and even modification and uploading.
Other than that, your next best options would be to store the CSS data inside a script (a dictionary where the filename is the key and the CSS code is the value) or, as you suggested, to run a script before deploying to App Engine.
Personally, I would go for storing it in the datastore, since that allows a great deal more user customization (such as each user being able to provide their own CSS file). Just be sure to use memcache to avoid hitting the datastore whenever possible (which should be very often), and use HTTP headers to tell the browser to cache the CSS file locally.
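For the second option (CSS stored inside a script as a dictionary), a generator along these lines could be run before deploying. It is a sketch under assumptions: the function names, the generated variable name THEMES, and the rule of treating anything other than exactly one CSS file as an error (taken from the question) are illustrative choices.

```python
from pathlib import Path


def find_theme_css(css_dir):
    """Return the single .css file in css_dir, or raise if there is not exactly one."""
    candidates = sorted(Path(css_dir).glob("*.css"))
    if len(candidates) != 1:
        raise ValueError(
            "expected exactly one CSS file in %s, found %d" % (css_dir, len(candidates))
        )
    return candidates[0]


def write_theme_module(css_dir, module_path):
    """Generate a Python module holding {filename: css_source} as THEMES."""
    css_file = find_theme_css(css_dir)
    themes = {css_file.name: css_file.read_text(encoding="utf-8")}
    # repr() of a dict of strings is valid Python source for that dict.
    source = "# Generated file; do not edit by hand.\nTHEMES = " + repr(themes) + "\n"
    Path(module_path).write_text(source, encoding="utf-8")
    return css_file.name
```

The application can then import the generated module and look up the file name without globbing the static directory, at the cost of the extra pre-deploy step the question hoped to avoid.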
