Python Requests Invalid URL Label error - python

I'm trying to access Shopify's API, which uses a URL format of
https://apikey:password@hostname/admin/resource.xml
e.g. http://7ea7a2ff231f9f7:95c5e8091839609c864@iliketurtles.myshopify.com/admin/orders.xml
Running $ curl api_url downloads the correct XML, but when I do
import requests
api_url = 'http://7ea7a2ff231f9f7d:95c5e8091839609c864@iliketurtles.myshopify.com/admin/orders.xml'
r = requests.get(api_url) # Invalid url label error
Any idea why I'm getting this? curl / opening the link directly in the browser works fine. Is it because the URL is too long?
Thanks!

The error ('URL has an invalid label.') is probably a bug in the requests library: it applies IDNA encoding (meant for internationalized domain names) to the hostname with the userinfo still attached, source:
netloc = netloc.encode('idna').decode('utf-8')
which can raise a 'label empty or too long' error for a long username:password pair. You could report it on the requests issue tracker.
The user:password@host form is deprecated anyway.
requests.get('https://a:b@example.com') should be equivalent to requests.get('https://example.com', auth=('a', 'b')) if all characters in username:password are from the [-A-Za-z0-9._~!$&'()*+,;=] set.
curl and requests also differ when there are percent-encoded characters in the userinfo, e.g. https://a:%C3%80@example.com leads curl to generate the following HTTP header:
Authorization: Basic YTrDgA==
but requests produces:
Authorization: Basic YTolQzMlODA=
i.e.:
>>> import base64
>>> base64.b64decode('YTrDgA==')
'a:\xc3\x80'
>>> print _
a:À
>>> base64.b64decode('YTolQzMlODA=')
'a:%C3%80'
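Since the userinfo is the part that trips the IDNA encoding, one workaround is to split it out of the URL yourself and pass it via auth instead. A minimal Python 3 sketch (the URL is a placeholder in the question's format):

```python
from urllib.parse import urlsplit, urlunsplit

def split_userinfo(url):
    """Return (url_without_userinfo, (username, password)) for a
    URL of the form scheme://user:password@host/path."""
    parts = urlsplit(url)
    auth = (parts.username, parts.password)
    # Rebuild the netloc without the user:password@ prefix.
    netloc = parts.hostname
    if parts.port:
        netloc = '%s:%d' % (netloc, parts.port)
    clean = urlunsplit((parts.scheme, netloc, parts.path,
                        parts.query, parts.fragment))
    return clean, auth

url, auth = split_userinfo(
    'https://apikey:password@example.myshopify.com/admin/orders.xml')
# requests.get(url, auth=auth) no longer IDNA-encodes the userinfo
```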

It's not the length of the URL. If I do:
import requests
test_url = 'http://www.google.com/?somereallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallylongurl=true'
r = requests.get(test_url)
returns <Response [200]>
Have you tried making the request with the requests authentication parameters, as detailed in the documentation?
>>> requests.get('http://iliketurtles.myshopify.com/admin/orders.xml', auth=('ea7a2ff231f9f7', '95c5e8091839609c864'))
<Response [403]>

Related

Getting 403 Forbidden error when requesting JSON output with apikey

I am trying to request information from a server with Python. The API key is correct, yet I still get a 403 error. It works with curl, but not with Python.
Here is the curl code that outputs JSON:
curl -H "apiKey: xxx" https://kretaglobalmobileapi.ekreta.hu/api/v1/Institute/3928
And here is my code that outputs Forbidden error:
from urllib.request import Request, urlopen
import json
ker = Request('https://kretaglobalmobileapi.ekreta.hu/api/v1/Institute/3928')
ker.add_header('apiKey', 'xxxx')
content = json.loads(urlopen(ker))
print(content)
What is the problem?
urlopen returns an HTTPResponse object, so use its read() method to get the contents. Otherwise your code looks fine:
req = Request('https://kretaglobalmobileapi.ekreta.hu/api/v1/Institute/3928')
req.add_header('apikey', 'xxx')
content = urlopen(req).read()
print(content)
You can also use another library, for instance requests, if the above method doesn't work:
r = requests.get('<MY_URI>', headers={'apikey': 'xxx'})
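A fuller sketch of the stdlib flow from the answer above, with read() and JSON decoding wired together (the opener parameter is my addition so the flow can be exercised without a network; whether the server wants the header spelled apiKey or apikey is worth testing both ways):

```python
import json
from urllib.request import Request, urlopen

API_URL = 'https://kretaglobalmobileapi.ekreta.hu/api/v1/Institute/3928'

def fetch_institute(api_key, opener=urlopen):
    """GET the institute record and decode the JSON body.
    `opener` is injectable so the call can be tested offline."""
    req = Request(API_URL, headers={'apiKey': api_key})
    with opener(req) as resp:
        # read() gives bytes; decode before handing to json.loads
        return json.loads(resp.read().decode('utf-8'))
```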

401 response with Python requests

I'm using the requests library to get an image from a URL ('http://any_login:any_password@10.10.9.2/ISAPI/Streaming/channels/101/picture?snapShotImageType=JPEG'), but I get a 401 error in response. That's the URL for my RTSP camera.
I tried 'HTTPBasicAuth', 'HTTPDigestAuth' and 'HTTPProxyAuth', but none of them work.
import requests
from requests.auth import HTTPBasicAuth
url = "http://any_login:any_password@10.10.9.2/ISAPI/Streaming/channels/101/picture?snapShotImageType=JPEG"
response = requests.get(url, auth=requests.auth.HTTPBasicAuth("any_login", "any_password"))
if response.status_code == 200:
    with open("sample.jpg", 'wb') as f:
        f.write(response.content)
I expected an image file from the RTSP flow, but I got a 401 error.
Given your username, I suspect your password may contain non-ASCII characters. I had a similar issue with a password containing diacritics.
This worked:
curl -u user:pwd --basic https://example.org
This (and variations) threw 401 Unauthorized:
import requests
requests.get("https://example.org", auth=requests.auth.HTTPBasicAuth("user","pwd"))
Changing the password to ASCII-only characters solved the issue.
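If changing the password is not an option, another workaround is to build the Authorization header yourself from the UTF-8 bytes of the credentials, which matches what curl sends; a sketch (whether the camera actually expects UTF-8 is an assumption to verify):

```python
import base64

def basic_auth_header(user, password, encoding='utf-8'):
    """Build the Authorization header curl sends: 'Basic ' plus the
    base64 of the UTF-8 bytes of 'user:password'."""
    token = base64.b64encode(
        ('%s:%s' % (user, password)).encode(encoding)).decode('ascii')
    return {'Authorization': 'Basic ' + token}

# e.g. requests.get(url, headers=basic_auth_header('any_login', 'pwdÀ'))
```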

How can I add a cookie to the headers?

I want to automate a testing tool using an API.
First, I log in to the site and get a cookie.
My code is Python 3:
import urllib
import urllib3
from bs4 import BeautifulSoup
url ='http://ip:port/api/login'
login_req = urllib.parse.urlencode(login_form)
http = urllib3.PoolManager()
r= http.request('POST',url,fields={'userName':'id','password':'passoword'})
soup = BeautifulSoup(r.data.decode('utf-8'),'lxml')
cookie = r.getheaders().get('Set-Cookie')
str1 = r.getheaders().get('Set-Cookie')
str2 = 'JSESSIONID' +str1.split('JSESSIONID')[1]
str2 = str2[0:-2]
print(str2)
-- JSESSIONID=df0010cf-1273-4add-9158-70d817a182f7; Path=/; HttpOnly
Then I add the cookie to the headers of a request to another API endpoint, but it is not working:
url2 = 'http://ip:port/api/notebook/job/paragraph'
r2 = http.request('POST',url2)
r2.headers['Set-Cookie']=str2
r2.headers['Cookie']=str2
http.request('POST',url2, headers=r2.headers)
Why is it not working? It shows another cookie.
If you understand this situation, please explain it to me.
The error contents are:
HTTP ERROR 500
Problem accessing /api/login;JSESSIONID=b8f6d236-494b-4646-8723-ccd0d7ef832f.
Reason: Server Error
Caused by:</h3><pre>javax.servlet.ServletException: Filtered request failed.
ProtocolError: ('Connection aborted.', BadStatusLine('<html>\n',))
Thanks a lot!
Use the requests module in Python 3.x. You have to create a session, which you are not doing now; that's why you are facing problems.
import requests

s = requests.Session()
url = 'http://ip:port/api/login'
r = s.get(url)
dct = s.cookies.get_dict()  # returns any cookies and saves them in a dict
Take whichever cookie the server wants, along with all the requested headers, and pass them in the header:
jid = dct["JSESSIONID"]
head = {'Cookie': 'JSESSIONID=' + jid, ...}
payload = {'userName': 'id', 'password': 'password'}
r = s.post(url, data=payload, headers=head)
r = s.get('whatever url after login')
To find out which specific headers you have to pass and all the parameters required for the POST:
Open the link in Google Chrome.
Open the developer console (Fn + F12).
Search there for the login request (if you cannot find it, submit wrong details on purpose).
You will get info about the request headers and POST parameters.
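The question's string slicing of the Set-Cookie header can also be replaced with the stdlib cookie parser; a small sketch:

```python
from http.cookies import SimpleCookie

def session_cookie(set_cookie_header):
    """Parse a Set-Cookie header and return a 'JSESSIONID=<value>'
    string ready to send back in a Cookie request header."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    return 'JSESSIONID=' + jar['JSESSIONID'].value
```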

Python - Get redirected url of links from Google Alerts feeds

If you create a Google Alert as an RSS feed (not automatically sent to your e-mail address), it contains links like this one: https://www.google.com/url?rct=j&sa=t&url=http://www.statesmanjournal.com/story/opinion/readers/2014/10/13/gmo-labels-encourage-people-make-choices/17171289/&ct=ga&cd=CAIyGjkyZjE1NGUzMGIwZjRkNGQ6Y29tOmVuOlVT&usg=AFQjCNHrCLmbml7baTXaqySagcuKHp-KHA.
This link is obviously a redirection (just try it and you'll end up here: http://www.statesmanjournal.com/story/opinion/readers/2014/10/13/gmo-labels-encourage-people-make-choices/17171289/), but I cannot get this final URL with Python (other than by stripping the beginning of the URL, which is quite ugly).
I've tried so far with packages urllib2, httplib2 and requests:
urllib2.urlopen and geturl() from the return value
httplib2 request with follow_all_redirects=True and 'content-location' from the return value
requests.get and history from the return value
Has someone already been confronted with this issue?
Thanks!
Google does not give you an HTTP redirect; a 200 OK response is returned, not a 30x redirect:
>>> import requests
>>> url = 'https://www.google.com/url?rct=j&sa=t&url=http://www.statesmanjournal.com/story/opinion/readers/2014/10/13/gmo-labels-encourage-people-make-choices/17171289/&ct=ga&cd=CAIyGjkyZjE1NGUzMGIwZjRkNGQ6Y29tOmVuOlVT&usg=AFQjCNHrCLmbml7baTXaqySagcuKHp-KHA'
>>> response = requests.get(url)
>>> response.url
u'https://www.google.com/url?rct=j&sa=t&url=http://www.statesmanjournal.com/story/opinion/readers/2014/10/13/gmo-labels-encourage-people-make-choices/17171289/&ct=ga&cd=CAIyGjkyZjE1NGUzMGIwZjRkNGQ6Y29tOmVuOlVT&usg=AFQjCNHrCLmbml7baTXaqySagcuKHp-KHA'
>>> response.text
u'<script>window.googleJavaScriptRedirect=1</script><script>var m={navigateTo:function(b,a,d){if(b!=a&&b.google){if(b.google.r){b.google.r=0;b.location.href=d;a.location.replace("about:blank");}}else{a.location.replace(d);}}};m.navigateTo(window.parent,window,"http://www.statesmanjournal.com/story/opinion/readers/2014/10/13/gmo-labels-encourage-people-make-choices/17171289/");\n</script><noscript><META http-equiv="refresh" content="0;URL=\'http://www.statesmanjournal.com/story/opinion/readers/2014/10/13/gmo-labels-encourage-people-make-choices/17171289/\'"></noscript>'
The response is a piece of HTML and JavaScript that your browser will interpret as loading a new URL. You'll have to parse that response to extract the target.
String splitting could achieve that:
>>> response.text.partition("URL='")[-1].rpartition("'\"")[0]
u'http://www.statesmanjournal.com/story/opinion/readers/2014/10/13/gmo-labels-encourage-people-make-choices/17171289/'
If we assume that the URL parameter in the body is just a direct reflection of the url parameter in the query string, then you can just extract it from there too, and we don't even have to ask Google to execute the redirect:
try:
    from urllib.parse import parse_qs, urlsplit
except ImportError:
    # Python 2
    from urlparse import parse_qs, urlsplit

target = parse_qs(urlsplit(url).query)['url'][0]
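Wrapped up as a function, the query-string approach looks like this (Python 3 imports shown; the same parse_qs/urlsplit fallback as above covers Python 2):

```python
from urllib.parse import parse_qs, urlsplit

def google_alert_target(alert_url):
    """Extract the 'url' query parameter, which mirrors the
    redirect target embedded in the response body."""
    return parse_qs(urlsplit(alert_url).query)['url'][0]
```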

How to send GET request including headers using python

I'm trying to build a website using web.py, which is able to search the mobile.de database (mobile.de is a German car sales website). For this I need to use the mobile.de API and make a GET request to it doing the following (this is an example from the API docs):
GET /1.0.0/ad/search?exteriorColor=BLACK&modificationTime.min=2012-05-04T18:13:51.0Z HTTP/1.0
Host: services.mobile.de
Authorization: QWxhZGluOnNlc2FtIG9wZW4=
Accept: application/xml
(The authorization needs to be my username and password joined with a colon and then Base64-encoded.)
So I use urllib2 to do the request as follows:
>>> import base64
>>> import urllib2
>>> headers = {'Authorization': base64.b64encode('myusername:mypassw'), 'Accept': 'application/xml'}
>>> req = urllib2.Request('http://services.mobile.de/1.0.0/ad/search?exteriorColor=BLACK', headers=headers)
And from here I am unsure how to proceed. req appears to be an instance with some methods to get the information in it. But did it actually send the request? And if so, where can I get the response?
All tips are welcome!
Constructing the Request object does not send anything. You need to call urllib2.urlopen(req) to actually make the request, and read() on the response it returns to get the body.
But you'd be better off using the requests library, which is much easier to use.
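For completeness, a Python 3 sketch of the whole round trip with urllib.request (the opener parameter is my addition for testability; whether the API wants the bare token shown in the docs excerpt or the usual 'Basic ' prefix should be checked against the mobile.de docs):

```python
import base64
from urllib.request import Request, urlopen

def search_ads(username, password, opener=urlopen):
    """Sketch of the GET from the docs excerpt above."""
    token = base64.b64encode(
        ('%s:%s' % (username, password)).encode('utf-8')).decode('ascii')
    req = Request(
        'http://services.mobile.de/1.0.0/ad/search?exteriorColor=BLACK',
        headers={'Authorization': token, 'Accept': 'application/xml'})
    with opener(req) as resp:   # urlopen is what actually sends the request
        return resp.read()      # the XML response body
```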
