I am trying to pull data from SimilarWeb and am having some difficulty accessing the API (I have never done it before).
I am using requests, and this is my attempt:
api_key = '3dac084d01e55d6e9a18c0d33eiu6341'
url = 'https://api.similarweb.com/v1/website/bbc.com/total-traffic-and-engagement/average-visit-duration?api_key={{similarweb_api_key}}&start_date=2017-11&end_date=2018-01&country=gb&granularity=monthly&main_domain_only=false&format=json'
headers = {'Ocp-Apim-Subscription-Key': '{key}'.format(key=api_key)}
Visits = requests.get(url, headers=headers).json()
I get the following error message. Does it look like a network error?
ConnectionError: HTTPSConnectionPool(host='api.similarweb.com', port=443): Max retries exceeded with url: /v1/website/bbc.com/total-traffic-and-engagement/average-visit-duration?api_key=%7B%7Bsimilarweb_api_key%7D%7D&start_date=2017-11&end_date=2018-01&country=gb&granularity=monthly&main_domain_only=false&format=json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000001A3B68476A0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
I am using Spyder, if that helps.
Has anyone had the same issue?
How can I solve it?
You can try adding the parameter verify=False, like this:
requests.get(url, headers=headers, verify=False)
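It may be worth checking first whether the hostname resolves at all, since "[Errno 11001] getaddrinfo failed" is raised during the DNS lookup, before any TLS handshake where verify would matter. A minimal sketch using only the standard library (the helper name can_resolve is made up for illustration):

```python
import socket


def can_resolve(host, port=443):
    """Return True if the OS resolver can turn `host` into an address."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # The same failure that requests reports as "getaddrinfo failed"
        return False
```

If can_resolve("api.similarweb.com") returns False on your machine, the problem is probably name resolution (no connection, DNS, or a proxy requirement) rather than anything in the request itself.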
I am trying to practice a simple example from Keith Galli's tutorials by scraping some headers from a target web page. Keith scrapes the page successfully in his tutorial, but I can't, and I don't know why. I am working at a company right now; if I replace this URL with my company's website, it works. I don't know whether it is a proxy issue or something else.
The code is like this:
import requests
from bs4 import BeautifulSoup as bs

# Load the webpage content
r = requests.get("https://keithgalli.github.io/web-scraping/example.html")
# Convert to a beautiful soup object
soup = bs(r.content, "html.parser")
# Print out our html
print(soup.prettify())
And the output is:
ConnectionError: HTTPSConnectionPool(host='keithgalli.github.io', port=443): Max retries exceeded with url: /web-scraping/example.html (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x0000026C42D80190>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
The link to example.html is fine. Either you don't have an internet connection, or some network security policy is blocking your connection to example.html.
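One way to narrow this down is to look at which proxy settings Python can see on the machine; requests honors the HTTP_PROXY and HTTPS_PROXY environment variables by default. A small sketch using only the standard library (describe_proxies is a hypothetical helper name, not part of any library):

```python
import urllib.request


def describe_proxies(proxies=None):
    """Summarize proxy settings; defaults to what the OS/environment reports."""
    if proxies is None:
        # Reads variables such as HTTP_PROXY / HTTPS_PROXY (and, on some
        # platforms, the system proxy configuration).
        proxies = urllib.request.getproxies()
    if not proxies:
        return "no proxy configured"
    return ", ".join(f"{scheme} -> {url}" for scheme, url in sorted(proxies.items()))
```

If this reports no proxy while your browser needs one to reach external sites, that would be consistent with internal URLs working and external ones failing.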
My code:
import requests
url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
r = requests.get(url)
print('Status code:', r.status_code)
and error:
requests.exceptions.SSLError: HTTPSConnectionPool(host='hacker-news.firebaseio.com', port=443): Max retries exceeded with url: /v0/topstories.json (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1125)')))
What's the problem?
There might be a larger network problem at play here causing this SSL error, but as a quick workaround you can do the following:
requests.get(url, verify=False)
This bypasses SSL certificate verification for now.
I have a server running Ubuntu, and I am using Python as a way to learn.
I installed and started an app. When I open the app in a browser, the page is shown as not secure.
I am getting some data from this page using Python:
from urllib3.exceptions import InsecureRequestWarning
import requests
req = requests.get(url='', verify=False)
This shows an error:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='', port=7001): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fc2ae5f94f0>: Failed to establish a new connection: [Errno 111] Connection refused'))
When I then print the variable req to check the status, Python says the variable is not defined.
How can I solve this? Thanks.
HTTPSConnectionPool(host='', port=443)
443 is the default port for HTTPS.
You're probably (we can't see how you've defined url, but that's my best guess) connecting to the default HTTPS port rather than the port your app actually listens on; fix that in your code.
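To see which port a given URL actually targets, something like the following sketch can help. It only parses the URL and makes no network call; effective_port and the example.com URLs are placeholders for illustration:

```python
from urllib.parse import urlsplit

# Default ports assumed when the URL does not name one explicitly
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a client would connect to for `url`."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)
```

For instance, effective_port("https://example.com/") gives 443, while effective_port("https://example.com:7001/") gives 7001.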
I've been trying to download a file directly from a URL. When I do it at home it works perfectly fine, but when I run it on my company's server it doesn't work.
My code is:
import requests
url = "https://servicos.ibama.gov.br/ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php"
response = requests.get(url, verify=False)
# Write the downloaded bytes to disk in binary mode
with open("teste.zip", "wb") as f:
    f.write(response.content)
and the error message that I get is:
HTTPSConnectionPool(host='servicos.ibama.gov.br', port=443): Max retries exceeded with url: /ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x000001F2F792C0F0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
I know that we use a proxy here. I've been looking for solutions to this problem and could see that the proxy is relevant, but I couldn't figure out what to do with this information.
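For what it's worth, requests accepts a proxies mapping on each call. Here is a minimal sketch of building one; the helper name make_proxies and any host, port, or credentials you pass in are assumptions, since the real values depend on your company's network:

```python
from urllib.parse import quote


def make_proxies(host, port, user=None, password=None):
    """Build a proxies mapping in the shape requests accepts."""
    auth = ""
    if user is not None:
        # Percent-encode credentials so characters like "@" stay unambiguous
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    proxy_url = f"http://{auth}{host}:{port}"
    # The same proxy is used for both plain and TLS traffic in this sketch
    return {"http": proxy_url, "https": proxy_url}
```

It would then be passed as requests.get(url, proxies=make_proxies("proxy.mycompany.example", 8080)), where the host and port are placeholders to replace with whatever your network administrator gives you.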
I have the following code:
res = requests.get(url)
When I call it from multiple threads, I get the following error:
ConnectionError: HTTPConnectionPool(host='bjtest.com', port=80): Max retries exceeded with url: /rest/data?method=check&test=123 (Caused by : [Errno 104] Connection reset by peer)
I have tried the following, but I still get the error:
s = requests.session()
s.keep_alive = False
res = requests.get(url, headers={'Connection': 'close'})
So, what should I do?
BTW, the URL is fine; it can only be reached from the internal network. Thanks!
Are you running your script on a Mac? I ran into a similar problem. You can run ulimit -n to check how many files you can have open at a time.
You can use the call below to raise the limit:
resource.setrlimit(resource.RLIMIT_NOFILE, (new_limit, resource.RLIM_INFINITY))  # new_limit: the limit you want
Hope this helps.
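A slightly more defensive sketch of the same idea, in case setting the hard limit to RLIM_INFINITY is rejected for an unprivileged process. The function name raise_open_file_limit is made up for illustration, and the resource module is only available on Unix-like systems:

```python
import resource


def raise_open_file_limit(desired):
    """Try to raise the soft RLIMIT_NOFILE towards `desired`; return (soft, hard)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit may not exceed the hard limit for an unprivileged process
    target = desired if hard == resource.RLIM_INFINITY else min(desired, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            # Some systems cap the value below the reported hard limit
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Calling raise_open_file_limit(4096) leaves the hard limit alone and returns the limits actually in effect afterwards, so you can see whether the change took.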
I had a similar case; hopefully this saves you some time:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8001): Max retries exceeded with url: /enroll/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x10f96ecc0>: Failed to establish a new connection: [Errno 61] Connection refused'))
The problem was actually silly... the server on localhost was down, so nothing was listening on port 8001! Restarting the server solved it.
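A quick way to check for this before blaming the client code is to test whether anything accepts TCP connections on that host and port. A minimal standard-library sketch (port_is_open is a hypothetical helper name):

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unresolvable hosts
        return False
```

In my case port_is_open("localhost", 8001) would have returned False until the server was restarted.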
The error message (which is admittedly a little confusing) actually means that requests failed to connect to your requested URL at all.
In this case that's because your url is http://bjtest.com/rest/data?method=check&test=123, which isn't a real website.
It has nothing to do with the format you made the request in. Fix your url and it should (presumably) work for you.
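To tell these cases apart, the text after "Caused by" is usually the most informative part of the message. A rough heuristic sketch that maps it to a likely cause; the function name and the category wording are my own, and the matching is plain substring search, so treat the result as a hint only:

```python
def classify_connection_error(message):
    """Map the text of a requests ConnectionError to a likely cause (heuristic)."""
    text = message.lower()
    if "getaddrinfo failed" in text or "name or service not known" in text:
        return "dns: the hostname could not be resolved"
    if "connection refused" in text:
        return "refused: probably nothing is listening on that host and port"
    if "connection reset by peer" in text:
        return "reset: the remote side dropped the connection"
    if "sslerror" in text or "ssleoferror" in text:
        return "tls: the TLS handshake failed"
    return "unknown"
```

You would call it as classify_connection_error(str(exc)) inside an except requests.exceptions.ConnectionError as exc: block.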