Python Requests unable to get redirected URL - python

I have defined the following function to get the redirected URLs using Requests library. However i get the error KeyError: 'location'
def get_redirected_url(r_url):
r = requests.get(r_url, allow_redirects=False)
url = r.headers['Location']
return url
Calling the function
get_redirected_url('http://omgili.com/ri/.wHSUbtEfZQujfav8g98PjRMi_ogV.5EwBTfg476RyS2Gqya3tDAwNIv8Yi8wQ9AK4.U2mxeyq2_xbUjqsOx8NYY8r0qgxD.4Bm2SrouZKnrg1jqRxEfVmGbtTaKTaaDJtOjtS46fYr6A5UJoh9BYxVtDGJIsbSfgshRXR3FVr4-')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in get_redirected_url
File "/home/user/PycharmProjects/untitled/venv/lib/python3.6/site-packages/requests/structures.py", line 54, in __getitem__
return self._store[key.lower()][1]
KeyError: 'location'
Is it failing because the redirection waits for 5 seconds? If so, how do we incorporate that as well?
I have tried the other answers like this and this. But unable to crack it.

It is simple: r.headers doesn't have 'Location' key. You may have use the wrong key.
Edit: the site you want to browse with requests is protected.

Related

Python - Get Instagram API Access Token

I am trying to call the Instagram Basic API to get the user info. However, I need to fetch the access token first for the authentication.
I have implemented the code for getting the same in python but I am only able to fetch the re-direct uri. Now, as a manual step we execute that uri on browser and get the code from the url.
But in case of programming how can I get the access token?
In below code: I am getting error
from instagram_basic_display.InstagramBasicDisplay import InstagramBasicDisplay
from flask import request
instagram_basic_display = InstagramBasicDisplay(app_id='5163403347035424', app_secret='80473de6ad506c837969e03b8ccdb4cb', redirect_url='https://mycogito.me/')
print(instagram_basic_display.get_login_url()) # Returns login URL you need to follow
code = request.args.get('code')
print(code)
Error
Traceback (most recent call last):
File "D:\Work Backup\Development\Workspaces\python\SocialMedia\src\com\social\media\HelloWorld.py", line 9, in <module>
https://api.instagram.com/oauth/authorize?client_id=5163403347035424&redirect_uri=https%3A%2F%2Fmycogito.me%2F&scope=user_profile%2Cuser_media&response_type=code
code = request.args.get('code')
File "C:\Users\HP\AppData\Local\Programs\Python\Python36\lib\site-packages\werkzeug\local.py", line 347, in __getattr__
return getattr(self._get_current_object(), name)
File "C:\Users\HP\AppData\Local\Programs\Python\Python36\lib\site-packages\werkzeug\local.py", line 306, in _get_current_object
return self.__local()
File "C:\Users\HP\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\globals.py", line 38, in _lookup_req_object
raise RuntimeError(_request_ctx_err_msg)
RuntimeError: Working outside of request context.
There is a package when you want to get data from instagram and written in python click.

Error when trying to open a webpage with mechanize

I'm trying to learn mechanize to create a chat logging bot later, so I tested out some basic code
import mechanize as mek
import re
br = mek.Browser()
br.open("google.com")
However, whenever I run it, I get this error.
Traceback (most recent call last):
File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/mechanize/_mechanize.py", line 262, in _mech_open
url.get_full_url
AttributeError: 'str' object has no attribute 'get_full_url'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 5, in <module>
br.open("google.com")
File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/mechanize/_mechanize.py", line 253, in open
return self._mech_open(url_or_request, data, timeout=timeout)
File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/mechanize/_mechanize.py", line 269, in _mech_open
raise BrowserStateError("can't fetch relative reference: "
mechanize._mechanize.BrowserStateError: can't fetch relative reference: not viewing any document
I double checked with the documentation on the mechanize page and it seems consistent. What am I doing wrong?
You have to use a schema, otherwise mechanize thinks you are trying to open a local/relative path (as the error suggests).
br.open("google.com") should be br.open("http://google.com").
Then you will see an error mechanize._response.httperror_seek_wrapper: HTTP Error 403: b'request disallowed by robots.txt', because google.com does not allow crawlers. This can be remedied with br.set_handle_robots(False) before open.

How to get active tab url from browser?

How do I get URL from current tab in browser using python? Using os.environ['REQUEST_URI'] gives an error.
The following is my code :
os.environ['REQUEST_URI']
and the error :
Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
os.environ["REQUEST_URI"]
File "C:\Python27\lib\os.py", line 423, in __getitem__
return self.data[key.upper()]
KeyError: 'REQUEST_URI'
Any other alternatives are also welcome.
Alternate to your asked way by "os.environ", you can do it:
by copy and past by hand
on other ways than you asked way by python
by Brotab
by Selenium
by xdotool
by javascript from serversite or by code injection from client side

Getting InvalidURLError: ApplicationError: 1 in URLFetch

I am getting the following error:
InvalidURLError: ApplicationError: 1
Checked my code, and logged some various things and the url's causing this error to appear look pretty normal. They are being quoted through urllib.quote and visiting them through a browser results in a normal result.
The error is happening with many URL's, not one. The URL points to an API service and is constructed within the app.
Btw,here's a link to the google.appengine.api.urlfetch source code: http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/api/urlfetch.py?r=56.
The docstrings say that the error should happen when: "InvalidURLError if the url was invalid." and "If the URL is an empty string or obviously invalid, we throw an urlfetch.InvalidURLError"
Just to make it simple for those who would like to test this:
url = 'http://api.embed.ly/1/oembed?key=REMOVEDKEY&maxwidth=400&urls=http%3A//V.interesting.As,http%3A//abcn.ws/z26G9a,http%3A//apne.ws/z37VyP,http%3A//bambuser.com/channel/baba-omer/broadcast/2348417,http%3A//bambuser.com/channel/baba-omer/broadcast/2348417,http%3A//bambuser.com/channel/baba-omer/broadcast/2348417,http%3A//bbc.in/xFx3rc,http%3A//bbc.in/zkkLJq,http%3A//billingsgazette.com/news/local/former-president-bush-to-speak-at-billings-fundraiser-in-may/article_f7ef425a-349c-56a9-a399-606b48033f35.html,http%3A//billingsgazette.com/news/local/former-president-bush-to-speak-at-billings-fundraiser-in-may/article_f7ef425a-349c-56a9-a399-606b48033f35.html,http%3A//billingsgazette.com/news/local/friday-forecast-calls-for-cloudy-windy-day-nighttime-snow-possible/article_d3eb3159-68b0-5559-8255-03fce56eaedd.html,http%3A//billingsgazette.com/news/local/gallery-toy-run/collection_f5042a31-bfd4-5f63-a901-2a8c3e8fb26a.html%230,http%3A//billingsgazette.com/news/local/gas-prices-continue-to-drop-in-billings/article_4e8fd07e-0e1e-5c0e-b551-4162b60c4b60.html,http%3A//billingsgazette.com/news/local/gas-prices-continue-to-drop-in-billings/article_713a0c32-32c9-59f1-9aeb-67b8462bbe88.html,http%3A//billingsgazette.com/news/local/gas-prices-continue-to-fall-in-billings-area/article_2bdebf4b-242c-569e-b414-f388a48f4a14.html,http%3A//billingsgazette.com/news/local/gas-prices-dip-below-a-gallon-at-some-billings-stations/article_c7f4d373-dc2b-55c0-b457-10346c0274a6.html,http%3A//billingsgazette.com/news/local/gas-prices-keep-dropping-in-billings-area/article_3666cf9c-4552-5108-9d5c-de2bba12fa3f.html,http%3A//billingsgazette.com/news/local/government-and-politics/city-picks-st-vincent-as-care-provider-for-health-insurance/article_a899f885-15e1-5b98-b899-75acc01e8feb.html,http%3A//billingsgazette.com/news/local/government-and-politics/linder-settles-in-after-first-year-as-sheriff/article_55a9836e-2196-546d-80f0-48bdef717fa3.html,http%3A//billingsgazette.com/news/local/government-and-politics/new-council-members-city-judge-sworn-in/article_bb7ac948-1d45-579c-a057-1323fb2e643d.html'
from google.appengine.api import urlfetch
result = urlfetch.fetch(url=url)
Here's the traceback:
Traceback (most recent call last):
File "", line 1, in
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/urlfetch.py", line 263, in fetch return rpc.get_result()
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/apiproxy_stub_map.py", line 592, in get_result
return self.__get_result_hook(self)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/urlfetch.py", line 359, in _get_fetch_result
raise InvalidURLError(str(err))
InvalidURLError: ApplicationError: 1
I wonder if it's something very simple that I'm missing from all of this. Would appreciate your comments and ideas. Thanks!
Your URL is too long, there is a limit on the length of URLs.

Getting and trapping HTTP response using Mechanize in Python

I am trying to get the response codes from Mechanize in python. While I am able to get a 200 status code anything else isn't returned (404 throws and exception and 30x is ignored). Is there a way to get the original status code?
Thanks
Errors will throw an exception, so just use try:...except:... to handle them.
Your Mechanize browser object has a method set_handle_redirect() that you can use to turn 30x redirection on or off. Turn it off and you get an error for redirects that you handle just like you handle any other error:
>>> from mechanize import Browser
>>> browser = Browser()
>>> resp = browser.open('http://www.oxfam.com') # this generates a redirect
>>> resp.geturl()
'http://www.oxfam.org/'
>>> browser.set_handle_redirect(False)
>>> resp = browser.open('http://www.oxfam.com')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "build\bdist.win32\egg\mechanize\_mechanize.py", line 209, in open
File "build\bdist.win32\egg\mechanize\_mechanize.py", line 261, in _mech_open
mechanize._response.httperror_seek_wrapper: HTTP Error 301: Moved Permanently
>>>
>>> from urllib2 import HTTPError
>>> try:
... resp = browser.open('http://www.oxfam.com')
... except HTTPError, e:
... print "Got error code", e.code
...
Got error code 301
In twill, do get_browser().get_code()
twill is an outstanding automation and test layer built on top of mechanize, to make it easier to use. It is seriously handy.

Categories

Resources