How to escape slash in url path in python? [duplicate] - python

I have set up my coldfusion application to have dynamic urls on the page, such as
www.musicExplained/index.cfm/artist/:VariableName
However my variable names will sometimes contain slashes, such as
www.musicExplained/index.cfm/artist/GZA/Genius
This is causing a problem, because my application presumes that the slash in the variable name represents a different section of the website, the artists albums. So the URL will fail.
I am wondering if there is any way to prevent this from happening. Do I need to use a function that replaces slashes in the variable names with another character?

You need to escape the slashes as %2F.

You could easily replace the forward slashes / with something like an underscore _, as Wikipedia does for spaces. Replacing special characters with underscores is common practice.

You need to escape those, but don't just replace them with %2F manually. You can use Java's URLEncoder for this.
E.g. URLEncoder.encode(url, "UTF-8")
Then you can say
yourUrl = "www.musicExplained/index.cfm/artist/" + URLEncoder.encode(VariableName, "UTF-8")

Check out this w3schools page about "HTML URL Encoding Reference":
https://www.w3schools.com/tags/ref_urlencode.asp
To escape /, use %2F.
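
Since the question is tagged python, here is a minimal sketch of the same percent-encoding with the standard library's urllib.parse.quote (Python 3). The base URL is taken from the question; the variable name artist_name and its value are assumptions for illustration. Passing safe="" matters, because quote leaves / unescaped by default.

```python
from urllib.parse import quote, unquote

BASE_URL = "www.musicExplained/index.cfm/artist/"  # taken from the question

artist_name = "GZA/Genius"  # hypothetical value containing a slash

# safe="" tells quote() to escape "/" as well; by default "/" is left alone.
encoded_name = quote(artist_name, safe="")
artist_url = BASE_URL + encoded_name

print(artist_url)
print(unquote(encoded_name))  # decoding restores the original name
```

Whether the server hands the encoded slash to the application untouched depends on its configuration, so this is worth testing against the actual setup.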

Related

I have a file location stored in a reach file. Like \reach. It thinks it is a carriage return

My file location is detecting the \r in \reach as a carriage return.
There is nothing I could find online about the topic. I need it to list the file location as only a string.
Declare your string as a raw string by prefixing it with r. In a raw string, backslashes are kept as literal characters instead of starting an escape sequence.
location = r'\reach'
Alternatively you could use double backslashes like so
location = '\\reach'
A third way would be to just use forward slashes instead
location = '/reach'
You need to escape the backslash with another backslash in the string: \\reach
It looks like you might be using Windows as your OS, which uses '\' as the path separator.
You could try defining your file path as a raw string, e.g.
f = open(r'dir1\dir2\reach')
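
A small sketch comparing the three spellings discussed above. The variable names are illustrative only; the point is that the plain literal silently loses a character, while the raw and doubled forms are identical.

```python
plain = '\reach'      # "\r" is parsed as one carriage-return character
raw = r'\reach'       # raw string: the backslash and the "r" stay separate
doubled = '\\reach'   # escaped backslash: same value as the raw string

print(repr(plain))             # repr() makes the hidden control character visible
print(repr(raw))
print(raw == doubled)          # the two safe spellings are the same string
print(len(plain), len(raw))    # the plain literal is one character shorter
```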

Using os.chdir to access a file in which a folder starts with '\f'

I know that \f is a form feed. I want to access my folder the following way:
os.chdir("C:\Python27\BGT_Python\skills\fuzzymatching")
The folder 'fuzzymatching' starts with the \f symbol which breaks the string.
What's the easiest way to get around these types of symbols?
Add an r character in front of the string:
os.chdir(r"C:\Python27\BGT_Python\skills\fuzzymatching")
See the Python docs.
In triple-quoted strings, unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the string. (A "quote" is the character used to open the string, i.e. either ' or ".)
and
Unless an 'r' or 'R' prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C.
For completeness, I'll add:
os.chdir("C:/Python27/BGT_Python/skills/fuzzymatching")
About the only part of Windows that actually requires backslashes is the command line.
This should work:
os.chdir("C:\Python27\BGT_Python\skills\\fuzzymatching")
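I just added a \ to escape \f. Note that the other backslashes survive only because \P, \B and \s are not escape sequences, so doubling every backslash (or using a raw string) is the safer habit.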
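
As a sketch of the forward-slash approach, the path can also be written with / and converted with pathlib (Python 3). PureWindowsPath is used here only so the example behaves the same on any operating system; the names fragment and folder are illustrative. On Windows, str(folder) could then be passed to os.chdir.

```python
from pathlib import PureWindowsPath

# "\f" inside a normal string literal is a single form-feed character.
fragment = "skills\fuzzymatching"
print("\x0c" in fragment)

# Forward slashes contain no escape sequences; PureWindowsPath renders
# the path with backslashes when it is converted to a string.
folder = PureWindowsPath("C:/Python27/BGT_Python/skills/fuzzymatching")
print(str(folder) == r"C:\Python27\BGT_Python\skills\fuzzymatching")
print(folder.name)
```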

variables with space in url (django)

I am having the same issue as "How to pass variables with spaces through URL in Django". I have tried the solutions mentioned there, but every request returns "The resource you are looking for has been removed, had its name changed, or is temporarily unavailable."
I am trying to pass a file name, for example: new 3
in urls.py:
url(r'^file_up/delete_file/(?P<oname>[0-9A-Za-z\ ]+)/$', 'app.views.delete_file' , name='delete_file'),
in views.py:
def delete_file(request, fname):
    return render_to_response(
        'app/submission_error.html',
        {'fname': fname},
        context_instance=RequestContext(request)
    )
url : demo.net/file_up/delete_file/new%25203/
Thanks for the help
Thinking this over: are you stuck with having to use spaces? If not, I think you may find your patterns (and variables) easier to work with without them. A dash or underscore, or even a forward slash, will look cleaner and be more predictable.
I also found this: https://stackoverflow.com/a/497972/352452 which cites:
The space character is unsafe because significant spaces may disappear and insignificant spaces may be introduced when URLs are transcribed or typeset or subjected to the treatment of word-processing programs.
You may also be able to capture your space with a literal %20. Not sure. Just leaving some thoughts here that come to mind.
demo.net/file_up/delete_file/new%25203/
This URL is double-encoded. The space is first encoded to %20, then the % character is encoded to %25. Django only decodes the URL once, so the decoded url is /file_up/delete_file/new%203/. Your pattern does not match the literal %20.
If you want to stick to spaces instead of a different delimiter, you should find the source of that URL and make sure it is only encoded once: demo.net/file_up/delete_file/new%203/.
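
The double encoding can be reproduced with the standard library (Python 3). This is only a sketch: file_name and the other variable names are illustrative, and the regular expression is the character class from the question's URL pattern, tested on its own rather than through Django.

```python
import re
from urllib.parse import quote, unquote

file_name = "new 3"
encoded_once = quote(file_name)       # the space becomes %20
encoded_twice = quote(encoded_once)   # the % itself becomes %25

# Character class taken from the question's URL pattern.
name_pattern = re.compile(r'^[0-9A-Za-z\ ]+$')

# A single decoding pass over the double-encoded value leaves a literal %20.
decoded_once = unquote(encoded_twice)
print(encoded_twice, decoded_once, bool(name_pattern.match(decoded_once)))

# A value that was encoded only once decodes back to the real file name.
decoded_name = unquote(encoded_once)
print(encoded_once, decoded_name, bool(name_pattern.match(decoded_name)))
```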

Django URLs - trailing slash gets added to variable value

I have a Django application hosted with Apache. I'm using the Django REST framework to create an API, but I am having issues with URLs. As an example, I have a URL like this:
url(r'path/to/endpoint/(?P<db_id>.+)/$', views.PathDetail.as_view())
If I try to access this url and don't include the trailing slash, it will not match. If I add a question mark on at the end like this:
url(r'path/to/endpoint/(?P<db_id>.+)/?', views.PathDetail.as_view())
This matches with and without a trailing slash. The only issue is that if a trailing slash is used, it now gets included in the db_id variable in my view. So when it searches the database, the id doesn't match. I don't want to have to go through all my views and remove trailing slashes from my url variables using string handling.
So my question is, what is the best way to make a url match both with and without a trailing slash without including that trailing slash in a parameter that gets sent to the view?
Your pattern for the parameter is .+, which means 1 or more of any character, including /. No wonder the slash is included in it, why wouldn't it?
If you want the pattern to include anything but /, use [^/]+ instead. If you want the pattern to include anything except slashes at the end, use .*[^/] for the pattern.
The .+ part of your regex will match one or more characters. This match is "greedy", meaning it will match as many characters as it can.
Check out: http://www.regular-expressions.info/repeat.html.
In the first case, the / has to be there for the full pattern to match.
In the second case, when the slash is missing, the pattern will match anyway because the slash is optional.
If the slash is present, the greedy db_id field will expand to the end (including the slash) and the slash will not match anything, but the overall pattern will still match because the slash is optional.
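Some easy solutions would be to make db_id non-greedy with the ? modifier and anchor the pattern at the end, (?P<db_id>.+?)/?$, or to make the field not match any slashes: (?P<db_id>[^/]+)/?. The $ anchor matters for the non-greedy version: without it, .+? is already satisfied by a single character.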
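
The behaviour of these variants can be checked outside Django with the re module. This is a sketch that assumes the pattern is applied with re.search; the helper captured_id and the sample id 42 are made up for illustration.

```python
import re

PREFIX = r'path/to/endpoint/'
patterns = {
    'greedy': PREFIX + r'(?P<db_id>.+)/?',
    'lazy, no anchor': PREFIX + r'(?P<db_id>.+?)/?',
    'lazy, anchored': PREFIX + r'(?P<db_id>.+?)/?$',
    'no slashes': PREFIX + r'(?P<db_id>[^/]+)/?$',
}

def captured_id(pattern, path):
    """Return what the db_id group captures, or None if there is no match."""
    match = re.search(pattern, path)
    return match.group('db_id') if match else None

for label, pattern in patterns.items():
    for path in ('path/to/endpoint/42', 'path/to/endpoint/42/'):
        print(label, '|', path, '->', captured_id(pattern, path))
```

The greedy pattern swallows the trailing slash, the unanchored lazy pattern stops after one character, and the two anchored variants capture the same id with or without the slash.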

Building proper link with spaces

I have the following code in Python:
linkHTML = '<a href="%s">click here</a>' % strLink
The problem is that when strLink has spaces in it the link shows up as
click here
I can use strLink.replace(" ","+")
But I am sure there are other characters which can cause errors. I tried using
urllib.quote(strLink)
But it doesn't seem to help.
Thanks!
Joel
Make sure you use urllib.quote_plus(string[, safe]) to replace spaces with plus signs.
urllib.quote_plus(string[, safe])
Like quote(), but also replaces spaces by plus signs, as required for quoting HTML form values when building up a query string to go into a URL. Plus signs in the original string are escaped unless they are included in safe. It also does not have safe default to '/'.
from http://docs.python.org/library/urllib.html#urllib.quote_plus
Ideally you'd be using the urllib.urlencode function and passing it a sequence of key/value pairs like [("q", "with space"), ("s", "with space & other")] etc.
As well as quote_plus(*), you also need to HTML-encode any text you output to HTML. Otherwise < and & symbols will be treated as markup, with potential security consequences. (OK, you're not going to get < in a URL, but you definitely are going to get &, so it takes just one parameter name that matches an HTML entity name and your string is messed up.)
html = '<a href="?q=%s">click here</a>' % cgi.escape(urllib.quote_plus(q))
*: actually plain old quote is fine too; I don't know what wasn't working for you, but it is a perfectly good way of URL-encoding strings. It converts spaces to %20 which is also valid, and valid in path parts too. quote_plus is optimal for generating query strings, but otherwise, when in doubt, quote is safest.
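
The answers above use the Python 2 names. In Python 3 the same functions live in urllib.parse, and html.escape replaces cgi.escape (which was removed in Python 3.8). A minimal sketch, with made-up parameter names and values:

```python
import html
from urllib.parse import quote, quote_plus, urlencode

# quote() keeps "/" and turns spaces into %20 (fine for path segments).
print(quote("with space/and slash"))
# quote_plus() turns spaces into "+" and escapes "/" (meant for query strings).
print(quote_plus("with space/and slash"))

# urlencode() builds a whole query string from key/value pairs.
params = [("q", "with space"), ("s", "with space & other")]
query = urlencode(params)

# HTML-escape the finished URL before placing it inside an attribute,
# so the "&" separating the parameters becomes "&amp;".
link_html = '<a href="?%s">click here</a>' % html.escape(query)
print(link_html)
```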
