Django request.GET.get() truncating url string - python

I am sending a message from a Chrome extension to a Django app running locally, using chrome.runtime.sendMessage. I can see the message in the URL, but somehow the whole GET parameter is not being captured. For example,
"GET /sensitiveApi/?text=%20%20%20%20The%20Idiots%20-%20Rainbow%20Six%20Siege%20Funny%20Moments%20&%20Epic%20Stuff%20%20We%27re%20back%20with%20some%20Rainbow%20Six%20Siege%20funny%20moments!%20All%20clips%20were%20streamed%20live%20on%20my%20Twitch:%20https://www.twitch.tv/teosgameMore%20Siege%20funny%20moments:%20https://www.youtube.com/playlist?list...Discord:%20https://discord.gg/teoTwitter:%20https://twitter.com/LAGxPeanutPwnerInstagram:%20https://www.instagram.com/photeographPeople%20in%20video:Alex:%20https://twitter.com/AlexandraRose_GKatie:%20https://www.twitch.tv/katielouise_jKatja:%20https://www.twitch.tv/katjawastakenPaddy:%20https://twitter.com/Patward96Smii7y:%20https://www.youtube.com/user/SMii7YSnedger:%20https://www.twitch.tv/snedgerStefan:%20https://twitter.com/lagxsourTortilla:%20https://twitter.com/Tortilla_NZColderMilk:%20https://www.youtube.com/user/ColderMilkColderMilk%20Twitch:%20https://www.twitch.tv/colder_milkColderMilk:%20Twitter:%20https://twitter.com/colder_milkMusic%20used:Outro:%20Come%20Back%20from%20San%20Francisco%20(Instrumental)%20by%20Rameses%20B%20https://www.youtube.com/watch?v=fBWac...%20Go%20check%20out%20his%20music!%20:)%20https://www.youtube.com/RamesesB2 HTTP/1.1" 200 2
This is one request that I want to capture. I am calling request.GET.get('text', ''), but all it returns is this:
The Idiots - Rainbow Six Siege Funny Moments
How do I capture the whole GET parameter?
This is how I use chrome.runtime.sendMessage:
chrome.runtime.sendMessage({
    method: 'GET',
    action: 'xhttp',
    url: "http://127.0.0.1:8000/sensitiveApi/?text=",
    data: text
});

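The text contains an unescaped ampersand (&). In a query string, & separates parameters, so everything after it is no longer part of text. It needs to be percent-encoded as %26 (the session below is Python 2; in Python 3 the function is urllib.parse.quote):
>>> import urllib
>>> print(urllib.quote('&'.encode('utf-8')))
%26
For example, a URL such as http://www.example.com?fields=name&age, where the & belongs to the value, would become:
url = http://www.example.com?fields=name%26age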
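To see the effect end to end, here is a minimal Python 3 sketch using only the standard library. The sample text is a shortened stand-in for the description in the question, and parse_qs only approximates how Django builds request.GET, though both split the query string on &.

```python
from urllib.parse import parse_qs, quote

# Shortened stand-in for the description text from the question (an assumption).
text = "Funny Moments & Epic Stuff"

# Unencoded: the "&" ends the "text" parameter, and the remainder is
# dropped because it has no "=" and therefore no value.
raw_query = "text=" + text
raw_params = parse_qs(raw_query)

# Encoded: quote() with safe="" turns "&" into "%26", so the whole
# string survives as a single value.
encoded_query = "text=" + quote(text, safe="")
encoded_params = parse_qs(encoded_query)

print(raw_params["text"][0])
print(encoded_params["text"][0])
```

On the extension side, the JavaScript counterpart is encodeURIComponent(text), applied before the text is appended to the URL.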

Related

Python GET request to redirecting URL does not actually redirect me

I have an URL that redirects me to an other page, for example:
https://www.redirector.com/1
that redirects me to https://www.redirected.com/1
I am trying to fetch the second URL using python requests, I tried doing so using the following code:
import requests
rq = requests.get('https://www.redirector.com/1')
for re in rq.history:
    print(re.url)
But that doesn't output anything...
Then I tried printing rq.history, and it turned out to be an empty list. Is there a way to get the https://www.redirected.com/1 URL besides using the history attribute?
You could inspect the response headers: if the status code is 3xx and there is a Location header (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Location), that header holds the redirect target. This would be the "low-level" approach.
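A minimal sketch of that low-level check might look like the following. The helper names redirect_target and first_hop are hypothetical. It assumes the server answers with a real HTTP redirect; if the page redirects via JavaScript or a meta refresh, there will be no Location header to read.

```python
def redirect_target(status_code, headers):
    """Return the Location header of a 3xx response, or None otherwise."""
    if 300 <= status_code < 400:
        return headers.get("Location")
    return None


def first_hop(url):
    """Fetch url without following redirects and report where it points."""
    # Third-party library; imported lazily so redirect_target() stays stdlib-only.
    import requests

    # allow_redirects=False makes requests return the 3xx response itself.
    response = requests.get(url, allow_redirects=False)
    return redirect_target(response.status_code, response.headers)
```

Calling first_hop('https://www.redirector.com/1') would then return the Location value, or None if the server did not send a redirect.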

How to make API accept URLs as parameters with GET or POST requests using Bottle-python

I am making a simple Python API using Bottle. Everything works fine until I provide a parameter like http://sahildua.com/projects/. Even if I send the URL as an encoded string, it still shows the same error, i.e. 404 Not Found.
@route('/expand-url/<url>', method='GET')
def expand(url = ""):
    if url == "":
        return {"success" : False}
What do I need to change in the @route instruction to make it work? Or is there any other way of sending the URL as a parameter?
Your issue is caused by the forward slashes '/' in the value you are sending to your route. Bottle is parsing the API call correctly, just not the way you're expecting.
With your current route setup, Bottle interprets your API call as sending the value projects/ to the route /expand-url/http://sahildua.com/ (which doesn't exist, hence the 404 error), instead of sending the url value http://sahildua.com/projects/ to the route /expand-url/<url>. The forward slashes are mucking things up, so you need a different approach.
I suggest passing the url as a GET parameter instead of accepting it via the route url.
So your API call would look like curl -XGET http://APIURL/expand-url?url=http://sahildua.com/projects/.
Then you can retrieve the url in Bottle using url = request.query.get('url', ''), i.e.
from bottle import route, request
@route('/expand-url', method='GET')
def expand():
    url = request.query.get('url', '')
    if url == "":
        return {"success" : False}
This code is not tested but just to give you an idea.
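On the client side, the value is safest when it is percent-encoded before being placed in the query string. A small standard-library sketch, where the base address is a hypothetical stand-in for APIURL:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical base address standing in for APIURL in the answer.
api_base = "http://localhost:8080/expand-url"
target = "http://sahildua.com/projects/"

# urlencode() percent-encodes ":" and "/" so the target cannot be
# mistaken for part of the route path.
request_url = api_base + "?" + urlencode({"url": target})

# Decoding the query string recovers the original value unchanged.
decoded = parse_qs(urlsplit(request_url).query)["url"][0]

print(request_url)
print(decoded)
```

The server-side handler does not change: request.query.get('url', '') receives the decoded value.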

Django URL dispatcher not matching named group

I'm trying to make a Django site, but the group matching in the URL dispatcher is giving me "p" no matter what I enter into the URL. Here are the pertinent parts of my code:
From user's urls.py (it does get included in the main urls.py)
url(r'^lookup?(?P<match_str>\w+)/$', views.lookup, name='user_lookup')
From views.py
def lookup(request, match_str):
    users = User.objects.filter(name__contains=match_str)
    json = serializers.serialize("json", users)
    return json
And a couple log entries:
[01/Jul/2014 22:43:17] "GET /user/lookup/?z HTTP/1.1" 500 11363
[01/Jul/2014 22:43:18] "GET /user/lookup/?za HTTP/1.1" 500 11363
On closer inspection, it looks like my AJAX is actually sending two calls, and the second call is actually what's being matched. The logs for the second calls of the above log lines are:
[01/Jul/2014 22:43:17] "GET /merchant/lookup?z HTTP/1.1" 301 0
[01/Jul/2014 22:43:18] "GET /merchant/lookup?za HTTP/1.1" 301 0
I put a "debug" line in the view to print match_str, and no matter what I enter, I get 'p'. What is going on here?
Per karthikr's request, here's the result of print request.GET, match_str
<QueryDict: {u'za': [u'']}> p
Your regex doesn't match the URL from the log. The GET goes to /user/lookup, and the string user is not contained in Django's url pattern. With the regex changed to ^lookup/\?(?P<match_str>\w+)$, the request lookup/?someuser creates a named group match_str with the value someuser.
I recommend using one of the many online regex testers to play with the URL regex.
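The two patterns can also be tried directly with Python's re module, as in the sketch below. This is plain regex matching, not Django's resolver. Django matches URL patterns against the path only and puts the query string in request.GET, which fits the QueryDict output shown in the question.

```python
import re

original = re.compile(r'^lookup?(?P<match_str>\w+)/$')
suggested = re.compile(r'^lookup/\?(?P<match_str>\w+)$')

# "p?" makes the final "p" of "lookup" optional, so for the path
# "lookup/" the engine backtracks and hands that "p" to the group.
captured = original.match('lookup/').group('match_str')

# Against a plain string, the suggested pattern captures the text after "/?".
fixed = suggested.match('lookup/?someuser').group('match_str')

print(captured)
print(fixed)
```

The first result shows where the constant 'p' in the question comes from: the unescaped ? applies to the preceding p rather than matching a literal question mark.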

In Python why does urllib.urlopen make Google give an http status "302 Moved"?

Using Python 2.6.6 on CentOS 6.4
import urllib
#url = 'http://www.google.com.hk' #ok
#url = 'http://clients1.google.com.hk' #ok
#url = 'http://clients1.google.com.hk/complete/search' #ok (blank)
url = 'http://clients1.google.com.hk/complete/search?output=toolbar&hl=zh-CN&q=abc' #fails
print url
page = urllib.urlopen(url).read()
print page
Using the first 3 URLs, the code works. But with the 4th URL, Python gives the following 302:
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
here.
</BODY></HTML>
The URL in my code is the same as the URL it tells me to use:
My URL: http://clients1.google.com.hk/complete/search?output=toolbar&hl=zh-CN&q=abc
Its URL: http://clients1.google.com.hk/complete/search?output=toolbar&hl=zh-CN&q=abc
Google says URL moved, but the URLs are the same. Any ideas why?
Update: The URLs all work fine in a browser. But in Python command line the 4th URL is giving a 302.
urllib is ignoring the cookies and sending the new request without cookies, so it causes a redirect loop at that URL. To handle this you can use urllib2 (which is more up-to-date) and add a cookie handler:
import urllib2
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
response = opener.open('http://clients1.google.com.hk/complete/search?output=toolbar&hl=zh-CN&q=abc')
print response.read()
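The snippet above is Python 2. In Python 3, urllib2 was merged into urllib.request, and a roughly equivalent sketch could look like this. The helper name build_cookie_opener is hypothetical, and the request itself is left commented out because it needs network access.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return an opener that stores cookies and sends them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


opener, jar = build_cookie_opener()

# Not executed here because it needs network access:
# url = 'http://clients1.google.com.hk/complete/search?output=toolbar&hl=zh-CN&q=abc'
# page = opener.open(url).read()
```

Keeping a reference to the jar makes it possible to inspect which cookies the server set during the redirect chain.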
It most likely has to do with the headers and perhaps cookies. I did a quick test on the command-line using curl. It also gives me the 302 moved. The Location header it provides is different, as is the one in the document. If I follow the body URL I get a 204 response (weird). If I follow the Location header I end up getting a circular response like you indicate.
Perhaps important is the Set-Cookie header. It may be redirecting until it gets an appropriate cookie set. It may also be scanning the User-Agent and doing something based on that. Those are the big aspects that differentiate a browser from a tool like requests or urllib. The browser creates sessions, stores cookies, and sends different headers.
I don't know why urllib fails (I get the same response); however, the requests library works perfectly:
import requests
url = 'http://clients1.google.com.hk/complete/search?output=toolbar&hl=zh-CN&q=abc'  # fails with urllib, works with requests
print (requests.get(url).text)
If you use your favorite web debugger (Fiddler for me) and open up that URL in your browser, you'll see that you also get that initial 302 response. Your browser is just smart enough to redirect you automatically. So your code is returning the correct response. If you want your code to redirect to the new URL automatically, then you have to make your code smart enough to do so.

Python URLLib / URLLib2 POST

I'm trying to create a super-simplistic Virtual In / Out Board using wx/Python. I've got the following code in place for one of my requests to the server where I'll be storing the data:
import urllib
import urllib2
data = urllib.urlencode({'q': 'Status'})
u = urllib2.urlopen('http://myserver/inout-tracker', data)
for line in u.readlines():
    print line
Nothing special going on there. The problem I'm having is that, based on how I read the docs, this should perform a Post Request because I've provided the data parameter and that's not happening. I have this code in the index for that url:
if (!isset($_POST['q'])) { die ('No action specified'); }
echo $_POST['q'];
And every time I run my Python App I get the 'No action specified' text printed to my console. I'm going to try to implement it using the Request Objects as I've seen a few demos that include those, but I'm wondering if anyone can help me explain why I don't get a Post Request with this code. Thanks!
-- EDITED --
This code does work and Posts to my web page properly:
import httplib
import urllib
data = urllib.urlencode({'q': 'Status'})
h = httplib.HTTPConnection('myserver:8080')
headers = {"Content-type": "application/x-www-form-urlencoded",
           "Accept": "text/plain"}
h.request('POST', '/inout-tracker/index.php', data, headers)
r = h.getresponse()
print r.read()
I am still unsure why the urllib2 library doesn't Post when I provide the data parameter - to me the docs indicate that it should.
Compare the URLs in your two calls:
u = urllib2.urlopen('http://myserver/inout-tracker', data)
h.request('POST', '/inout-tracker/index.php', data, headers)
Using the path /inout-tracker without a trailing / doesn't fetch index.php. Instead, the server will issue a 302 redirect to the version with the trailing /.
A 302 redirect will typically cause clients to convert a POST to a GET request.
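For reference, a Python 3 sketch of the same idea follows. It uses urllib.request in place of Python 2's urllib2 and only builds the request objects, so nothing is sent. The host name is taken from the question, and the trailing slash is the assumed fix.

```python
import urllib.parse
import urllib.request

# In Python 3 the POST body must be bytes, not str.
data = urllib.parse.urlencode({'q': 'Status'}).encode('ascii')

# Trailing slash, so the server has no reason to answer with a redirect.
url = 'http://myserver/inout-tracker/'

# Supplying data makes the request a POST; without it the method is GET.
post_request = urllib.request.Request(url, data=data)
get_request = urllib.request.Request(url)

print(post_request.get_method())
print(get_request.get_method())
```

Passing post_request to urllib.request.urlopen would then send the form data in the body. If the server still redirects, the method may again be switched to GET.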
