Converting \n to <br> in mako files - python

I'm using Python with Pylons.
I want to display the saved data from a textarea in a Mako file, with new lines formatted correctly for display.
Is this the best way of doing it?
> ${c.info['about_me'].replace("\n", "<br />") | n}

The problem with your solution is that you bypass the string escaping, which can lead to security issues. Here is my solution:
<%! import markupsafe %>
${markupsafe.escape(text).replace('\n', markupsafe.Markup('<br />'))}
or, if you want to use it more than once:
<%!
import markupsafe
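def br(text):
    # Escape first so the result is Markup; only the inserted <br /> stays unescaped.
    return markupsafe.escape(text).replace('\n', markupsafe.Markup('<br />'))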
%>
${text | br }
This solution uses markupsafe, which is used by mako to mark safe strings and know which to escape. We only mark <br/> as being safe, not the rest of the string, so it will be escaped if needed.
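Outside a template, the escape-first idea behind this answer can be sketched with the standard library alone. This is only an illustration in Python 3, not the markupsafe API: `html.escape` stands in for markupsafe's escaping, and `nl2br` is a hypothetical helper name.

```python
import html


def nl2br(text):
    """Escape user-supplied text, then mark line breaks with <br /> tags."""
    # Escape first so that any markup typed by the user is neutralised.
    escaped = html.escape(text)
    # Only the <br /> tags inserted here reach the page as real markup.
    return escaped.replace("\n", "<br />")


print(nl2br("1 < 2\n<script>alert(1)</script>"))
# prints: 1 &lt; 2<br />&lt;script&gt;alert(1)&lt;/script&gt;
```

The returned string already contains HTML, so a template with automatic escaping would have to be told not to escape it a second time.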

It seems to me that is perfectly suitable.
Be aware that replace() returns a copy of the original string and does not modify it in place. So since this replacement is only for display purposes it should work just fine.
Here is a little visual example:
>>> s = """This is my paragraph.
...
... I like paragraphs.
... """
>>> print s.replace('\n', '<br />')
This is my paragraph.<br /><br />I like paragraphs.<br />
>>> print s
This is my paragraph.
I like paragraphs.
The original string remains unchanged. So... Is this the best way of doing it?
Ask yourself: Does it work? Did it get the job done quickly without resorting to horrible hacks? Then yes, it is the best way.

Beware: line breaks in <textarea>s should be submitted as \r\n, according to http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.13.4
To be safe, replace \r\n first and then \n, chaining the calls because replace() returns a new string: s.replace('\r\n', '<br />').replace('\n', '<br />').
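A minimal Python 3 sketch of that ordering follows. The function name `newlines_to_br` is made up for illustration, and the function deliberately does no escaping, so it only demonstrates the line-ending handling.

```python
def newlines_to_br(s):
    """Replace Windows and Unix line endings with <br /> tags."""
    # "\r\n" must go first: replacing "\n" first would split each "\r\n"
    # pair and leave a stray "\r" in the output.
    return s.replace("\r\n", "<br />").replace("\n", "<br />")


submitted = "first line\r\nsecond line\nthird line"
print(newlines_to_br(submitted))
# prints: first line<br />second line<br />third line
```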

This seems dangerous to me because it prints the whole string without escaping, which would allow arbitrary tags to be rendered. Make sure you cleanse the user's input with lxml or similar before printing. Be aware that lxml wraps its output in an HTML tag, because it cannot handle input that lacks one, so be ready to remove that tag manually or adjust your CSS to accommodate it.

Try this; it will work:
${c.info['about_me'] | n}

There is also a simple helper function that can be called which will format and sanitize text correctly, replacing \n with <br /> tags (see http://sluggo.scrapping.cc/python/WebHelpers/modules/html/converters.html).
In helpers.py add the following:
from webhelpers.html.converters import textilize
Then in your mako file simply say
h.textilize(c.info['about_me'], sanitize=True)
The sanitize=True just means that it will make sure any other nasty codes are escaped so users can't hack your site, as the default is False. Alternatively, I would write a simple wrapper function in helpers so that sanitize always defaults to True, i.e.
from webhelpers.html.converters import textilize as unsafe_textilize
def textilize(value, sanitize=True):
    # Same helper, but sanitizing unless the caller explicitly opts out.
    return unsafe_textilize(value, sanitize)
This way you can just call h.textilize( c.info['about_me'] ) from your mako file, which if you work with lots of designers stops them from going crazy.

Related

Django - Issues passing String correctly to template after replacing characters in view

I'm having a bizarre issue with passing a particular string from a view to a template.
The string originates from a form, and contains text that I want to simplify into a split-able string later. So, I substitute potential separator characters with a comma like so:
# views.py
mystring = myform.cleaned_data['mystring']
mystring = str(mystring) # convert from unicode
mystring = mystring.replace("\n", ",").replace("\r\n", ",").replace(" ", ",").replace(";", ",")
# Then I pass it to the template:
return render(request, 'html/mytemplate.html', {'mystring': mystring})
Now, take this form data for example:
%15
%16
If I print out mystring to a file just before rendering the template, it looks like this:
%15,%16
All good so far. The problem, though, comes from trying to render this string into the template. If I try to render the string like this:
{{ mystring }}
The result is this (leading spaces included):
%15
,%16
It preserves the comma, but adds some other funky stuff, which I don't want because it makes some of my JS get pretty darn confused. I've tried to prevent escaping with the safe filter, but it doesn't seem to change anything in this case. Another thing to note is that if the original form data is already correctly formatted, i.e. "%15,%16", it works just fine and passes the string as intended.
Any ideas? I've done quite a bit of logging inside my views, but it seems to be fine up until I render it to the template.
Well, after looking at my own example code here, I realized what the issue was.
I needed to swap the order of replace("\n", ",") with replace("\r\n", ","), ensuring that the latter occurs first. The issue was caused by the \n escapes being replaced, and then not being able to find any occurrences of \r\n, therefore leaving all of the \rs in the text.
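One way to sidestep the ordering problem altogether is to treat every separator in a single pass with a regular expression. The sketch below is in Python 3 and is only an assumption about what the view needs: `normalise_separators` is a hypothetical name, and unlike the chained replace() calls it collapses runs of separators into one comma.

```python
import re


def normalise_separators(raw):
    """Collapse runs of whitespace, semicolons and commas into single commas."""
    # \s covers "\r", "\n", spaces and tabs, so "\r\n" needs no special case.
    parts = re.split(r"[\s;,]+", raw.strip())
    # Drop empty pieces caused by leading or trailing separators.
    return ",".join(part for part in parts if part)


print(normalise_separators("%15\r\n%16"))
# prints: %15,%16
```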

How to pass html directly to template

I want to pass HTML directly as a parameter in template().
I know I could do something like:
%for i in array:
<a>{{i}}</a>
%end
but I need to pass it directly when I call template().
I tried replacing &lt; and &gt; with < and > using JavaScript, but that did not work.
I want to do this:
{{results_of_i_in_array}}
so that the loop occurs in my main code rather than in the template.
I never found anyone asking the same question.
Note: this question is NOT a duplicate of this question.
I am using Bottle's default templating system. Thanks in advance.
Bottle doc:
You can start the expression with an exclamation mark to disable escaping for that expression:
>>> template('Hello {{name}}!', name='<b>World</b>')
u'Hello &lt;b&gt;World&lt;/b&gt;!'
>>> template('Hello {{!name}}!', name='<b>World</b>')
u'Hello <b>World</b>!'
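Because `{{!name}}` switches escaping off, the markup built in the main code has to escape its own dynamic parts. Here is a small Python 3 sketch of that step using only the standard library; `render_links` is a hypothetical helper, not part of Bottle.

```python
import html


def render_links(items):
    """Build the anchor markup in Python, escaping each item's text."""
    return "".join("<a>{}</a>".format(html.escape(str(item))) for item in items)


results_of_i_in_array = render_links(["first", "a < b"])
print(results_of_i_in_array)
# prints: <a>first</a><a>a &lt; b</a>
```

The resulting string can then be passed to template() as a keyword argument and rendered with `{{!results_of_i_in_array}}`.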

Python - Remove 'style'-attribute from HTML

I have a String in Python, which has some HTML in it. Basically it looks like this.
>>> print someString # I get someString from the backend
"<img style='height:50px;' src='somepath'/>"
I try to display this HTML in a PDF. Because my PDF generator can't handle the styles-attribute (and no, I can't take another one), I have to remove it from the string. So basically, it should be like that:
>>> print someString # I get someString from the backend
"<img style='height:50px;' src='somepath'/>"
>>> parsedString = someFunction(someString)
>>> print parsedString
"<img src='somepath'/>"
I guess the best way to do this is with RegEx, but I'm not very keen on it. Can someone help me out?
I wouldn't use RegEx for this because:
Regex is not really suited for HTML parsing. Even though this case is simple, there can be many variations and edge cases to consider, and the resulting regex could turn out to be a nightmare.
Regex sucks. Regexes can be really useful, but honestly, they are the epitome of user-unfriendly.
Alright, so how would I go about it? I would use trusty BeautifulSoup! Install it with pip by using the following command:
pip install beautifulsoup4
Then you can do the following to remove the style:
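from bs4 import BeautifulSoup as Soup
soup = Soup(someString, 'html.parser')  # parse the HTML string
del soup.find('img')['style']  # remove the style attribute from the img tag
parsedString = str(soup)  # turn the modified tree back into a string
This first parses your string, then finds the img tag, deletes its style attribute, and finally converts the modified tree back into a string.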
It should also work with arbitrary strings but I can't promise that. Maybe you will come up with an edge case.
Remember, using RegEx to parse an HTML string is not the best of ideas. The internet and Stack Overflow are full of answers explaining why this is not possible.
Edit: Just for kicks you might want to check out this answer. You know stuff is serious when it is said that even Jon Skeet can't do it.
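If installing BeautifulSoup is not an option, the same parser-based idea can be sketched with the standard library's html.parser in Python 3. This is a swapped-in technique, not what the answer above uses: `StyleStripper` and `strip_style` are hypothetical names, the sketch removes style from every tag, and it silently drops comments and doctype declarations.

```python
from html import escape
from html.parser import HTMLParser


class StyleStripper(HTMLParser):
    """Re-emit the parsed markup, leaving out every style attribute."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _emit_tag(self, tag, attrs, closing):
        kept = "".join(
            " {}".format(name) if value is None
            else ' {}="{}"'.format(name, escape(value, quote=True))
            for name, value in attrs
            if name != "style"
        )
        self.parts.append("<{}{}{}>".format(tag, kept, closing))

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs, "")

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, "/")

    def handle_endtag(self, tag):
        self.parts.append("</{}>".format(tag))

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        # Keep named entities such as &amp; exactly as written.
        self.parts.append("&{};".format(name))

    def handle_charref(self, name):
        # Keep numeric character references such as &#39; as written.
        self.parts.append("&#{};".format(name))


def strip_style(markup):
    parser = StyleStripper()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


print(strip_style("<img style='height:50px;' src='somepath'/>"))
```

Attribute values are re-quoted with double quotes on output, so the result is equivalent markup rather than a byte-for-byte copy of the input.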
Using RegEx to work with HTML is a very bad idea but if you really want to use it, try this:
/style=["']?((?:.(?!["']?\s+(?:\S+)=|[>"']))+.)["']?/ig

In python the Jinja2 template returns a backslash in front of a double quote, I need to remove that

One of the lines in my jinja2 template needs to return
STACKNAME=\"",{"Ref":"AWS::StackName"},"\"
Putting the above into the template returns
STACKNAME=\\\"\",{\"Ref\":\"AWS::StackName\"},\"\\\"
I tried creating a variable
DQ = '"'
and setting the template as
STACKNAME="{{DQ}},{{{DQ}}Ref{{DQ}}:{{DQ}}AWS::StackName{{DQ}}},{{DQ}}"
but the result still puts a backslash in front of the {{DQ}} variable
I also tried putting in a unique string %%%DQ%%% and then getting the results and then doing a string replace but it still gives me the backslash.
How do I get the results I want?
UPDATE:
My apologies. It turns out that it is not the jinja2 template that is returning the escaped quotes. I am making a later call in the script to:
lc.UserData=Base64(Join("", [commandList]))
And it is this call to the Troposphere Module for Base64 and/or Join that is causing the problem and inserting the escapes.
Further testing shows that it is specifically Base64 that does the escaping.
This feels like a hack and I hope someone has a better solution but I solved the problem by doing the following.
In the template, I made the line look like this:
STACKNAME="QQ,{QQRefQQ:QQAWS::StackNameQQ},QQ"
Then, in the last line of the program where I currently have:
print t.to_json()
I changed it to
print t.to_json().replace("QQ", '"')
Which produces exactly what I'm looking for.
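Why the placeholder trick works can be seen with the standard json module in Python 3. In this sketch `json.dumps` is only a stand-in for the serialization that Troposphere performs, which is an assumption made for illustration: quotes present before serialization get a backslash, while placeholders swapped in afterwards do not.

```python
import json

# The template line from above, with QQ marking quotes that must stay bare.
template_line = 'STACKNAME="QQ,{QQRefQQ:QQAWS::StackNameQQ},QQ"'

# Serializing escapes the real double quotes as \" but leaves QQ untouched.
serialized = json.dumps(template_line)

# Substituting after serialization inserts quotes that are never escaped.
result = serialized.replace("QQ", '"')
print(result)
# prints: "STACKNAME=\"",{"Ref":"AWS::StackName"},"\""
```

The obvious caveat is that any other occurrence of QQ in the output would be replaced as well, so the placeholder has to be a string that cannot appear anywhere else.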

Sensible python source line wrapping for printout

I am working on a latex document that will require typesetting significant amounts of python source code. I'm using pygments (the python module, not the online demo) to encapsulate this python in latex, which works well except in the case of long individual lines - which simply continue off the page. I could manually wrap these lines except that this just doesn't seem that elegant a solution to me, and I prefer spending time puzzling about crazy automated solutions than on repetitive tasks.
What I would like is some way of processing the python source code to wrap the lines to a certain maximum character length, while preserving functionality. I've had a play around with some python and the closest I've come is inserting \\\n in the last whitespace before the maximum line length - but of course, if this ends up in strings and comments, things go wrong. Quite frankly, I'm not sure how to approach this problem.
So, is anyone aware of a module or tool that can process source code so that no lines exceed a certain length - or at least a good way to start to go about coding something like that?
You might want to extend your current approach a bit by using the tokenize module from the standard library to determine where to put your line breaks. That way you can see the actual tokens (COMMENT, STRING, etc.) of your source code rather than just the whitespace-separated words.
Here is a short example of what tokenize can do:
>>> from cStringIO import StringIO
>>> from tokenize import tokenize
>>>
>>> python_code = '''
... def foo(): # This is a comment
...     print 'foo'
... '''
>>>
>>> fp = StringIO(python_code)
>>>
>>> tokenize(fp.readline)
1,0-1,1: NL '\n'
2,0-2,3: NAME 'def'
2,4-2,7: NAME 'foo'
2,7-2,8: OP '('
2,8-2,9: OP ')'
2,9-2,10: OP ':'
2,11-2,30: COMMENT '# This is a comment'
2,30-2,31: NEWLINE '\n'
3,0-3,4: INDENT '    '
3,4-3,9: NAME 'print'
3,10-3,15: STRING "'foo'"
3,15-3,16: NEWLINE '\n'
4,0-4,0: DEDENT ''
4,0-4,0: ENDMARKER ''
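Building on the token stream shown above, here is a rough Python 3 sketch of how candidate break points could be located. It is only a starting point under stated assumptions: `safe_break_columns` is a hypothetical name, it considers only commas inside brackets (where Python allows a line break without a backslash), and it reports positions rather than rewriting the source.

```python
import io
import tokenize


def safe_break_columns(source, max_width=79):
    """Map line numbers of over-long lines to columns where a break is safe."""
    lines = source.splitlines()
    depth = 0
    breaks = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.OP:
            # Strings and comments arrive as single tokens, so commas
            # inside them are never considered.
            continue
        if tok.string in ("(", "[", "{"):
            depth += 1
        elif tok.string in (")", "]", "}"):
            depth -= 1
        elif tok.string == "," and depth > 0:
            row, col = tok.end
            # Keep only breaks that fit within the width on a long line.
            if len(lines[row - 1]) > max_width and col <= max_width:
                breaks.setdefault(row, []).append(col)
    return breaks


code = "x = foo(aaaa, bbbb, cccc)\ny = 1\n"
print(safe_break_columns(code, max_width=15))
# prints: {1: [13]}
```

A wrapper built on this would insert a newline plus indentation at the last reported column of each long line and repeat until every line fits.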
I use the listings package in LaTeX to insert source code; it handles syntax highlighting, line breaking, and more.
Put the following in your preamble:
\usepackage{listings}
%\lstloadlanguages{Python} # Load only these languages
\newcommand{\MyHookSign}{\hbox{\ensuremath\hookleftarrow}}
\lstset{
% Language
language=Python,
% Basic setup
%basicstyle=\footnotesize,
basicstyle=\scriptsize,
keywordstyle=\bfseries,
commentstyle=,
% Looks
frame=single,
% Linebreaks
breaklines,
prebreak={\space\MyHookSign},
% Line numbering
tabsize=4,
stepnumber=5,
numbers=left,
firstnumber=1,
%numberstyle=\scriptsize,
numberstyle=\tiny,
% Above and beyond ASCII!
extendedchars=true
}
The package has hooks for inline code, including entire files, showing code as figures, ...
I'd check a reformat tool in an editor like NetBeans.
When you reformat Java, it properly fixes the lengths of lines both inside and outside of comments; if the same algorithm were applied to Python, it would work.
For Java it allows you to set any wrapping width and a bunch of other parameters. I'd be pretty surprised if that didn't exist for Python, either natively or as a plugin.
Can't tell for sure just from the description, but it's worth a try:
http://www.netbeans.org/features/python/
