Getting InvalidURLError: ApplicationError: 1 in URLFetch

Getting InvalidURLError: ApplicationError: 1 in URLFetch - python

I am getting the following error:
InvalidURLError: ApplicationError: 1
Checked my code, and logged some various things and the url's causing this error to appear look pretty normal. They are being quoted through urllib.quote and visiting them through a browser results in a normal result.
The error is happening with many URL's, not one. The URL points to an API service and is constructed within the app.
Btw,here's a link to the google.appengine.api.urlfetch source code: http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/api/urlfetch.py?r=56.
The docstrings say that the error should happen when: "InvalidURLError if the url was invalid." and "If the URL is an empty string or obviously invalid, we throw an urlfetch.InvalidURLError"
Just to make it simple for those who would like to test this:
url = 'http://api.embed.ly/1/oembed?key=REMOVEDKEY&maxwidth=400&urls=http%3A//V.interesting.As,http%3A//abcn.ws/z26G9a,http%3A//apne.ws/z37VyP,http%3A//bambuser.com/channel/baba-omer/broadcast/2348417,http%3A//bambuser.com/channel/baba-omer/broadcast/2348417,http%3A//bambuser.com/channel/baba-omer/broadcast/2348417,http%3A//bbc.in/xFx3rc,http%3A//bbc.in/zkkLJq,http%3A//billingsgazette.com/news/local/former-president-bush-to-speak-at-billings-fundraiser-in-may/article_f7ef425a-349c-56a9-a399-606b48033f35.html,http%3A//billingsgazette.com/news/local/former-president-bush-to-speak-at-billings-fundraiser-in-may/article_f7ef425a-349c-56a9-a399-606b48033f35.html,http%3A//billingsgazette.com/news/local/friday-forecast-calls-for-cloudy-windy-day-nighttime-snow-possible/article_d3eb3159-68b0-5559-8255-03fce56eaedd.html,http%3A//billingsgazette.com/news/local/gallery-toy-run/collection_f5042a31-bfd4-5f63-a901-2a8c3e8fb26a.html%230,http%3A//billingsgazette.com/news/local/gas-prices-continue-to-drop-in-billings/article_4e8fd07e-0e1e-5c0e-b551-4162b60c4b60.html,http%3A//billingsgazette.com/news/local/gas-prices-continue-to-drop-in-billings/article_713a0c32-32c9-59f1-9aeb-67b8462bbe88.html,http%3A//billingsgazette.com/news/local/gas-prices-continue-to-fall-in-billings-area/article_2bdebf4b-242c-569e-b414-f388a48f4a14.html,http%3A//billingsgazette.com/news/local/gas-prices-dip-below-a-gallon-at-some-billings-stations/article_c7f4d373-dc2b-55c0-b457-10346c0274a6.html,http%3A//billingsgazette.com/news/local/gas-prices-keep-dropping-in-billings-area/article_3666cf9c-4552-5108-9d5c-de2bba12fa3f.html,http%3A//billingsgazette.com/news/local/government-and-politics/city-picks-st-vincent-as-care-provider-for-health-insurance/article_a899f885-15e1-5b98-b899-75acc01e8feb.html,http%3A//billingsgazette.com/news/local/government-and-politics/linder-settles-in-after-first-year-as-sheriff/article_55a9836e-2196-546d-80f0-48bdef717fa3.html,http%3A//billingsgazette.com/news/local/government-and-politics/new-council-members-city-judge-sworn-in/article_bb7ac948-1d45-579c-a057-1323fb2e643d.html'
from google.appengine.api import urlfetch
result = urlfetch.fetch(url=url)
Here's the traceback:
Traceback (most recent call last):
File "", line 1, in
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/urlfetch.py", line 263, in fetch return rpc.get_result()
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/apiproxy_stub_map.py", line 592, in get_result
return self.__get_result_hook(self)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/urlfetch.py", line 359, in _get_fetch_result
raise InvalidURLError(str(err))
InvalidURLError: ApplicationError: 1
I wonder if it's something very simple that I'm missing from all of this. Would appreciate your comments and ideas. Thanks!

Your URL is too long, there is a limit on the length of URLs.

Related

Python - Get Instagram API Access Token

I am trying to call the Instagram Basic API to get the user info. However, I need to fetch the access token first for the authentication.
I have implemented the code for getting the same in python but I am only able to fetch the re-direct uri. Now, as a manual step we execute that uri on browser and get the code from the url.
But in case of programming how can I get the access token?
In below code: I am getting error
from instagram_basic_display.InstagramBasicDisplay import InstagramBasicDisplay
from flask import request
instagram_basic_display = InstagramBasicDisplay(app_id='5163403347035424', app_secret='80473de6ad506c837969e03b8ccdb4cb', redirect_url='https://mycogito.me/')
print(instagram_basic_display.get_login_url()) # Returns login URL you need to follow
code = request.args.get('code')
print(code)
Error
Traceback (most recent call last):
File "D:\Work Backup\Development\Workspaces\python\SocialMedia\src\com\social\media\HelloWorld.py", line 9, in <module>
https://api.instagram.com/oauth/authorize?client_id=5163403347035424&redirect_uri=https%3A%2F%2Fmycogito.me%2F&scope=user_profile%2Cuser_media&response_type=code
code = request.args.get('code')
File "C:\Users\HP\AppData\Local\Programs\Python\Python36\lib\site-packages\werkzeug\local.py", line 347, in __getattr__
return getattr(self._get_current_object(), name)
File "C:\Users\HP\AppData\Local\Programs\Python\Python36\lib\site-packages\werkzeug\local.py", line 306, in _get_current_object
return self.__local()
File "C:\Users\HP\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\globals.py", line 38, in _lookup_req_object
raise RuntimeError(_request_ctx_err_msg)
RuntimeError: Working outside of request context.

There is a package when you want to get data from instagram and written in python click.

Python Requests unable to get redirected URL

I have defined the following function to get the redirected URLs using Requests library. However i get the error KeyError: 'location'
def get_redirected_url(r_url):
r = requests.get(r_url, allow_redirects=False)
url = r.headers['Location']
return url
Calling the function
get_redirected_url('http://omgili.com/ri/.wHSUbtEfZQujfav8g98PjRMi_ogV.5EwBTfg476RyS2Gqya3tDAwNIv8Yi8wQ9AK4.U2mxeyq2_xbUjqsOx8NYY8r0qgxD.4Bm2SrouZKnrg1jqRxEfVmGbtTaKTaaDJtOjtS46fYr6A5UJoh9BYxVtDGJIsbSfgshRXR3FVr4-')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in get_redirected_url
File "/home/user/PycharmProjects/untitled/venv/lib/python3.6/site-packages/requests/structures.py", line 54, in __getitem__
return self._store[key.lower()][1]
KeyError: 'location'
Is it failing because the redirection waits for 5 seconds? If so, how do we incorporate that as well?
I have tried the other answers like this and this. But unable to crack it.

It is simple: r.headers doesn't have 'Location' key. You may have use the wrong key.
Edit: the site you want to browse with requests is protected.

How to detect/trap error codes from convertapi so that my python app doesn't fail?

First, my apologies as a Python newbie that I'm asking this question. It probably has nothing at all to do with convertapi and more to do with my basic lack of understanding as to how to interact with APIs.
I'm reading a Google sheet to find embedded hyperlinks containing references to files (PDF, html, whatever) and then using convertapi to get a txt version so that I can do content analysis based on existence, count and proximity of various terms.
My question has to do with the convertapi.convert failing because (in this case) it turns out convertapi thinks the PDF is invalid (because I have tested the file # convertapi.com and it returned a 5002 error). I don't dispute the file may be bad - all I want to do is detect that convertapi.convert can't convert the file so that I can ignore it and move on.
My python code has a small function:
def convert_PDF_to_text(inputfilename):
result = convertapi.convert('txt', { 'File': inputfilename }, from_format = 'pdf')
result.save_files('converted_pdf_files')
...and while it works fine for some inputs there is a particular URL PDF that results in this output (including my own messages from program):
about to call convertapi.convert with filename (https://www.epa.gov/sites/production/files/2016-06/documents/2016_policy_order_revision_6-10-16.pdf)
yes this is the specific file causing the problem: https://www.epa.gov/sites/production/files/2016-06/documents/2016_policy_order_revision_6-10-16.pdf
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/client.py", line 46, in handle_response
r.raise_for_status()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://v2.convertapi.com/convert/pdf/to/txt?Secret=PIuLcqNVL8w4rc9Y
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./p1.py", line 244, in <module>
convert_PDF_to_text(source_URL)
File "./p1.py", line 63, in convert_PDF_to_text
result = convertapi.convert('txt', { 'File': inputfilename }, from_format = 'pdf')
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/api.py", line 7, in convert
return task.run()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/task.py", line 26, in run
response = convertapi.client.post(path, params, timeout = timeout)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/client.py", line 16, in post
return self.handle_response(r)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/convertapi/client.py", line 49, in handle_response
raise ApiError(r.json())
convertapi.exceptions.ApiError: <exception str() failed>
I know it should be obvious just from the errors what I should check...but I'm too much of a newbie to Python and APIs to know how to decipher.
How do I test for errors so that my Python code doesn't abort?
Thanks in advance and again sorry for the basic question - yes I did search for answers and don't find anyone addressing my question, it's likely too simple...

All - disregard. I used try: & except: to manage this.

hubspot3 client and "too many retries" error

I'm trying to pull the details for a contact from hubspot using the recipient's email. I'm using the python3 client "hubspot3" (https://github.com/jpetrucciani/hubspot3).
Here's the code I'm submitting:
import requests
from hubspot3.contacts import ContactsClient
API_KEY=[myapikey]
client=ContactsClient(api_key=API_KEY,debug=True)
email='mytest#gmail.com'
client.get_contact_by_email(email)
The response:
WARNING:root:Too many retries for /contacts/v1/contact/email/nwnippy27+cb1#gmail.com/profile?hapikey=[myapikey]
Traceback (most recent call last):
File "hubspot_api_test.py", line 11, in <module>
client.get_contact_by_email(email)
File "/opt/virtual_env/hubspot-test/lib/python3.7/site-packages/hubspot3/contacts.py", line 38, in get_contact_by_email
"contact/email/{email}/profile".format(email=email), method="GET", **options
File "/opt/virtual_env/hubspot-test/lib/python3.7/site-packages/hubspot3/base.py", line 339, in _call
**options
File "/opt/virtual_env/hubspot-test/lib/python3.7/site-packages/hubspot3/base.py", line 245, in _call_raw
result = self._execute_request_raw(connection, request_info)
File "/opt/virtual_env/hubspot-test/lib/python3.7/site-packages/hubspot3/base.py", line 162, in _execute_request_raw
raise HubspotNotFound(result, request)
hubspot3.error.HubspotNotFound:
Hubspot Error
I'm reading this error as saying that the email address can't be found. Is that correct? If not, I appreciate any intel on the cause and solution.

OK ... so not super useful, but it turned out that this is just the error message you get when the email doesn't exist. After a few tries it gives up, which is why you get the "too many retries" error.

OAuthAccessTokenException-The access_token provided is invalid

I am trying to follow along with the Python-Instagram simple example on their readme:
api = InstagramAPI(client_id=client_id, client_secret=client_secret)
popular_media = api.media_popular(count=20)
for media in popular_media:
print media.images['standard_resolution'].url
However, every time I try with my own client_id and client_secret I get the following error and have been unable to figure out how to solve it:
Traceback (most recent call last):
File "/Users/PycharmProjects/Instagram API/InstaApiTest.py", line 10, in <module>
popular_media = api.media_popular(count=20)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/instagram/bind.py", line 197, in _call
return method.execute()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/instagram/bind.py", line 189, in execute
content, next = self._do_api_request(url, method, body, headers)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/instagram/bind.py", line 163, in _do_api_request
raise InstagramAPIError(status_code, content_obj['meta']['error_type'], content_obj['meta']['error_message'])
instagram.bind.InstagramAPIError: (400) OAuthAccessTokenException-The access_token provided is invalid.
Can anyone help me solve this issue?

This is likely due to the tutorial being out of date. Note that the README file in the tutorial repo (which hasn't been updated in years) warns:
This project is not actively maintained. Proceed at your own risk!
Instagram doesn't even list media/popular/ on their media endpoints official documentation anymore. Even when it was a thing, there's some indication that you needed an actual access token, even a couple of years back, contrary to what the tutorial you're following states.
TL;DR - Look for a different tutorial, this endpoint is either changed or requires full authentication.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Getting InvalidURLError: ApplicationError: 1 in URLFetch - python

Your URL is too long, there is a limit on the length of URLs.

Related

Python - Get Instagram API Access Token

Python Requests unable to get redirected URL

How to detect/trap error codes from convertapi so that my python app doesn't fail?

hubspot3 client and "too many retries" error

OAuthAccessTokenException-The access_token provided is invalid

Categories

Resources