Strip unnecessary whitespace and html comments from Tornado-rendered Pages

Strip unnecessary whitespace and html comments from Tornado-rendered Pages - python

I am using tornado and the tornado templating engine. Even when debug is set to False, tornado-rendered pages still have HTML comments and unnecessary whitespace in them. Is there a setting to automatically strip this out when rendering pages (essentially minifying the rendered pages)?

Tornado doesn't know anything about html comments, so any html comments will be passed through as-is. (You can use {# #} to add comments to your templates). There is limited support for stripping whitespace, which is enabled by default based on file extension (.html and .js). There's also a half-implemented compress_whitespace setting, although there is no clean way to set it unless you implement your own template loader.

You can set the compress_whitespace flag for a Template (or to render/render_string) to automatically remove extraneous whitespace. I don't believe there's an option to strip out comments built-in.

Related

How do you include flask/jinja2 code inside a markdown file?

I am using a markdown editor which is converted by
post_body = markdown(text_from_markdown_editor)
but when i render the html, the actual jinja2 code is displayed
This is a post by {{ post.author }}
instead of the actual value.

I've been seeing this issue come up a lot lately in various different places, both in relation to Jinja and Django Templates. There seems to be a fundamental misunderstanding (among some users) about how templates systems work and how that relates to Markdown text which is rendered to HTML and inserted into a template. I’ll try to explain this clearly. Note that while the answer below applies to most templating systems (including Jinja and Django), the examples use Jinja for illustrative purposes (after all, the original question specifically asks about Jinja). Simply adapt the code to match the API of your templating system of choice, and it should work just as well.
First of all, Markdown has no knowledge of template syntax. In fact, Markdown has been around longer than Jinja, Django or various other popular templating systems. Additionally, the Markdown Syntax Rules make no mention of template syntax. Therefore, your template syntax will not be processed simply by passing some Markdown text which contains template syntax through a Markdown parser. The template syntax needs to be processed separately by the template engine. For example:
from jinja2 import Environment
# Set up a new template environment
env = Environment()
# Create template with the markdown source text
template = env.from_string(text_from_markdown_editor)
# Render that template. Be sure to pass in the context (post in this instance).
template_processed_markdown = template.render(post=post)
# Now pass the Markdown text through the Markdown engine:
post_body = markdown(template_processed_markdown)
Note that the above first processes the template syntax, then parses the Markdown. In other words, the output of the template processing is still Markdown text with the tags replaced by the appropriate values. Only in the last line is the Markdown text converted to HTML by the Markdown parser. If you want the order of processing to be reversed, you will need to switch the code around to run the Markdown parser first and then pass the output of that through the template processor.
I assume that some of the confusion comes from people passing the Markdown text through a templating system. Shouldn’t that cause the template syntax to get processed? In short, No.
At its core, a templating system takes a template and a context. It then finds the various tags in the template and replaces those tags with the matching data provided in the context. However, the template has no knowledge about the data in the context and does no processing of that data. For example, this template:
Hello, {{ name }}!
And this context:
output = template(name='John')
Would result in the following output:
Hello, John!
However, if the context was this instead:
output = template(name='{(some_template_syntax)}')
then the output would be:
Hello, {{some_template_syntax}}!
Note that while the data in the context contained template syntax, the template did not process that data. It simply considered it a value and inserted it as-is into the template in the appropriate location. This is normal and correct behavior.
Sometimes however, you may have a legitimate need for a template to do some additional processing on some data passed to the template. For that reason, the template system offers filters. When given a variable in the context, the filter will process the data contained in that variable and then insert that processed data in the template. For example, to ensure that the name in our previous example is capitalized, the template would look like the following:
Hello, {{ name|capatalize }}!
Passing in the context output = template(name='john') (note that the name is lowercase), we then get the following output”
Hello, John!
Note, that the data in the name variable was processed by having the first letter capitalized, which is the function of Jinja’s built-in filter capitalize. However, that filter does not process template syntax, and therefore passing template syntax to that filter will not cause the template syntax to be processed.
The same concept applies to any markdown filter. Such a filter only parses the provided data as Markdown text and returns HTML text which is then placed into the template. No processing of template syntax would happen in such a scenario. In fact, doing so could result in a possible security issue, especially if the Markdown text is being provided by untrusted users. Therefore, any Markdown text which contains template syntax must have the template syntax processed separately.
However, there is a note of caution. If you are writing documentation which includes examples of Template syntax in them as code blocks (like the Markdown source for this answer), the templating system is not going to know the difference and will process those tags just like any template syntax not in a code block. If the Markdown processing was done first, so that the resulting HTML was passed to the templating system, that HTML would still contain unaltered template syntax within the code blocks which would still be processed by the templating system. This is most likely not what is desired in either case. As a workaround, one could conceivably create some sort of Markdown Extension which would add syntax processing to the Markdown processor itself. However, the mechanism for doing so would differ depending on which Markdown processor one is using and is beyond the scope of this question/answer.

How to correctly indent Django template code automatically?

I am taking care of an old Django project and discovered that most of the templates are not indented correctly (random spaces in each line, sometimes tabs instead of spaces, etc). I feel comfortable with two-space indentation.
I never had this kind of problem and would like a solution (a script, library, doesn't matter) which would take care of this. Note that I do not want something that formats the HTML before sending it to the client but I want to format the local files that are used for templating.
I tried to use BeautifulSoup and its prettify function - however, it does not let you change the indentation level (although you can always modify it) and the indentation made for some template tags ({% example %}) was incorrect.

Flask Safe text rendering user generated block

Currently coding a bbs-style app, I'm wondering what are the best practices to display text that user entered safely. Which means I don't want them to type javascript or things like that. For now I'm rendering in a pre, and forbid the "<" and ">" chars. Though I guess it is not the right way to do it. Plus the lines are cut while it shouldn't.
Could you please give me some hints on how to that ?
More informations : I'm storing those posts in a sqlite db using flask-sqlalchemy. So I don't want them to contain SQL-Injection (which I highly doubt it is possible to do with sqlalchemy)

If you are using the Jinja2 templates as set up by Flask, all content is auto-escaped by default for HTML characters; see Jinja2 setup in the templating documentation:
Unless customized, Jinja2 is configured by Flask as follows:
autoescaping is enabled for all templates ending in .html, .htm, .xml as well as .xhtml
a template has the ability to opt in/out autoescaping with the {% autoescape %} tag.
SQLAlchemy, used properly, indeed protects you from SQL injection attacks.
If you want newlines in user input to be translated to <br/> tags in the HTML output, you can use this Flask Snippet that adds a Jinja2 tag filter that does just that; translate newlines in an input variable into <br/> tags in the rendered output, while still escaping everything else. Multiple newlines are translated to <p> paragraph tags.
Add that snippet as a module to your project, make sure it is imported, then use the nl2br filter in your templates:
User text as paragraphs with line breaks:<br/>
{{ user_text | nl2br }}

Jinja's autoescape is turned on by default in Flask. Anything not explicitly marked safe will be escaped correctly.
http://flask.pocoo.org/docs/templating/#controlling-autoescaping

I use Bleach - http://bleach.readthedocs.org/ to strip HTML and only let through the tags I want to allow. It's really easy to use.

simple markup for jinja2 or html textarea

I am writing a google app engine web app using python and jinja2. Is there a simple lightweight textarea markup that I can use either for jinja2 or just generally for an HTML text area. I just want line breaks to come across and italics and bold and maybe one or two other things but I don't just want to mark is safe (and not autoescape the rest of it).
I am surprised that no one has asked a similar question and they probably have so maybe I'm just using the wrong keywords.

You could use TinyMCE to add markup features to a textarea. Visitors of your site (or administrators, if that's where you happen to use it) can then add HTML markup in a user-friendly way.
Alternatively, you could add a Markdown parser (or a similar language) that allows users to add markup using basic symbols. Your website would parse and process the content prior to displaying it.

Find untranslated strings in HTML templates

Is there way to find untranslated strings in the HTML templates of my Django application i.e. blocks of text that are not wrapped in trans and blocktrans tags.
Since we have many templates, it would be a very time-consuming process to go through them manually and check but if there isn't an option, I guess it has to be done the long and tedious way.
Thanks

Found this recently, but have not tried it yet.
Doc: http://www.technomancy.org/python/django-template-i18n-lint/
Code: https://github.com/rory/django-template-i18n-lint
Looks like it hasn't been updated in a year but it might provide a good starting spot.

You can use the builtin template parser to parse your templates, and recurse into all tags that are not instances of BlockTransTag

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.