I am testing the Python library requests to see if it is suitable for my work. Here is my sample code for reference:
import requests
url = "http://www.genenetwork.org/webqtl/main.py?cmd=sch&gene=Grin2b&tissue=hip&format=text"
print url
print requests.get(url)
My Output:
http://www.genenetwork.org/webqtl/main.py?cmd=sch&gene=Grin2b&tissue=hip&format=text
<Response [200]>
Output that I get from my browser & my expected result:
What causes the difference? How can I get my expected result? I want to process the data inside the webpage.
Your code is currently printing the Response object itself, which displays only the status code. You can access the requested content via the text attribute of the Response object returned by the get method.
import requests
r = requests.get("http://www.genenetwork.org/webqtl/main.py?cmd=sch&gene=Grin2b&tissue=hip&format=text")
print r.text
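Since the goal is to process the data in the page, here is a minimal sketch of one way to start, assuming the endpoint returns plain text (the URL asks for format=text):

import requests

r = requests.get("http://www.genenetwork.org/webqtl/main.py?cmd=sch&gene=Grin2b&tissue=hip&format=text")
# Process the response body line by line
for line in r.text.splitlines():
    print(line)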
I have a URL that redirects me to another page, for example:
https://www.redirector.com/1
that redirects me to https://www.redirected.com/1
I am trying to fetch the second URL using python requests, I tried doing so using the following code:
import requests
rq = requests.get('https://www.redirector.com/1')
for re in rq.history:
    print(re.url)
But that doesn't output anything...
Then I tried printing rq.history, and it turned out to be an empty list. Is there a way to get the https://www.redirected.com/1 URL besides using the history attribute?
You could look at the headers of the response and check whether there is a Location header (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Location) and whether the response code is 3xx. This would be the "low-level" approach.
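For example, a minimal sketch reusing the URL from the question: disable automatic redirect handling and read the Location header yourself.

import requests

# Ask requests not to follow redirects, so the 3xx response comes back as-is
rq = requests.get('https://www.redirector.com/1', allow_redirects=False)
if 300 <= rq.status_code < 400:
    print(rq.headers['Location'])  # the redirect target

If the history is empty and there is no Location header, the redirect is most likely happening client-side (JavaScript or a meta refresh), which requests cannot follow.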
This works fine, I can get data returned:
r = urllib2.Request("http://myServer.com:12345/myAction")
data = json.dumps(q) #q is a python dict
r.add_data(data)
r=urllib2.urlopen(r)
But doing the same with requests package fails:
r = requests.get("http://myServer.com:12345/myAction", data=q)
r.text  # This will return a message that says method is not allowed.
It works if I make it a post request: r=requests.post("http://myServer.com:12345/myAction", data=json.dumps(q))
But why?
According to the urllib2.urlopen documentation:
the HTTP request will be a POST instead of a GET when the data parameter is provided.
So r = urllib2.urlopen(r) is also making a POST request, because data was attached to it with add_data. That is why your requests.get does not work, but requests.post does.
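In other words, the two snippets below make the same kind of request. A minimal sketch, with q reduced to a placeholder dict:

import json
import urllib2
import requests

q = {'key': 'value'}  # placeholder for the dict from the question

# urllib2: attaching data silently turns the request into a POST
r = urllib2.Request("http://myServer.com:12345/myAction")
r.add_data(json.dumps(q))
urllib2.urlopen(r)

# requests: the same request, but the POST is explicit
requests.post("http://myServer.com:12345/myAction", data=json.dumps(q))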
Set up a session
import json
import requests

session = requests.Session()
# The endpoint expects a POST (see above), so send the payload that way
r = session.post("http://myServer.com:12345/myAction", data=json.dumps(q))
print r.content  # or r.text / r.raw
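A Session also reuses the underlying TCP connection and persists cookies and default headers across requests. A minimal sketch, with a hypothetical header:

import requests

session = requests.Session()
session.headers.update({'Accept': 'application/json'})  # hypothetical header, sent with every request
r1 = session.post("http://myServer.com:12345/myAction", data='{}')
r2 = session.post("http://myServer.com:12345/myAction", data='{}')  # reuses the same connection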
I am using the Requests API with Python 2.7.
I am trying to download certain webpages through proxy servers. I have a list of available proxy servers, but not all of them work as desired: some require authentication, others redirect to advertisement pages, and so on. In order to detect incorrect responses, I have included two checks in my URL request code. It looks similar to this:
import requests
proxy = '37.228.111.137:80'
url = 'http://www.google.ca/'
response = requests.get(url, proxies = {'http' : 'http://%s' % proxy})
if response.url != url or response.status_code != 200:
    print 'incorrect response'
else:
    print 'response correct'
    print response.text
There are some proxy servers for which the requests.get call succeeds and both checks pass, yet response.text still contains invalid HTML. However, if I use the same proxy in my Firefox browser and open the same webpage, I am shown an invalid page, while my Python script reports the response as valid.
Can someone point out what other checks I am missing to weed out incorrect HTML results?
or
How can I verify that the webpage I received is the one I intended to get?
Regards.
What is an "invalid webpage" when displayed by your browser? The server can return an HTTP status code of 200 while the content is an error message. You understand it to be an error message because you can comprehend it; a browser or code cannot.
If you have any knowledge about the content of the target page, you could check whether the returned HTML contains that content and accept it on that basis.
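For example, a minimal sketch reusing the proxy and URL from the question, where 'expected marker' is a placeholder for text you know the genuine page contains:

import requests

proxy = '37.228.111.137:80'
url = 'http://www.google.ca/'
response = requests.get(url, proxies={'http': 'http://%s' % proxy})

# 'expected marker' stands in for content known to appear on the real page
if 'expected marker' in response.text:
    print('response looks correct')
else:
    print('incorrect response')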
I need to get the output printed on the screen when accessing a URL with a username and password. When I access the URL through my browser, I get a popup where I enter the credentials and then see the output in the browser. How do I do this with a Python script? I tried the following, but it only returns <Response [200]>, which means the request was successful. The output I want is a simple text message.
import requests
response = requests.get(url, auth=(username, password))
print response
I have tried requests.post also, with same results.
print response tries to print out a Response object. If you want the text of the response, use print response.text.
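For example, keeping url, username, and password from the question:

import requests

response = requests.get(url, auth=(username, password))
print(response.text)  # prints the body of the response instead of its repr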
You may want to read the Quickstart documentation for the python-requests library here: http://docs.python-requests.org/en/latest/user/quickstart/.
I am working on a tool that queries a number of APIs, one of which is a RESTful API. All of the other functions (API calls) in my program work fine with requests.get(); however, with the REST API I do not seem to be able to access the actual content of the response, only the status code. That is, when I simply print the response (not response.status_code), I get <Response [200]> on the screen. Any ideas?
Snippet of code:
# The URL is correct in my program, for sure.
url = 'http://APIurl/%s' % entry
try:
    response = requests.get(url)
    # prints <Response [200]>
    print response
    # Fails, expecting JSON that isn't there
    results.append(response.json())
except ValueError:
    # response.json() raises ValueError when the body is not valid JSON
    pass
Print the response object's attributes to see what is available:
print response.__dict__
response.text is your friend if the content is not valid JSON.
You need to print response.content or response.text. Your data is probably there.
Sometimes when your request is wrong, the API returns a whole error page (in HTML). So if you're getting a bunch of HTML, make sure your request parameters are OK.
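One way to guard against that, sketched below with the url and results from the snippet above, is to check the Content-Type header before trying to decode JSON:

import requests

response = requests.get(url)
if 'application/json' in response.headers.get('Content-Type', ''):
    results.append(response.json())
else:
    # Probably an HTML error page; inspect it rather than decoding it
    print(response.text)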