Python Requests HTTPConnectionPool and Max retries exceeded with url

On a Linux cluster, I get this error with Requests:
ConnectionError: HTTPConnectionPool(host='andes-1-47', port=8181): Max
retries exceeded with url: /jammy/api/v1 (Caused by : '')
What does this error mean? Is it a Requests problem or is it on the host, and what is the solution?
By the way, the program works successfully on both Windows and Linux standalone machines with localhost.

The Max retries exceeded with url: ... part can be vastly confusing. In all likelihood (since you mention that this works using localhost), this is an application that you're deploying somewhere. That would also explain why the host name is andes-1-47 and not something most would expect (e.g., example.com). My best guess is that either you need to use the IP address for andes-1-47 (e.g., 192.168.0.255), or your Linux cluster doesn't know how to resolve andes-1-47 and you should add it to your /etc/hosts file (i.e., add the line 192.168.0.255 andes-1-47).
If you want to see whether your Linux cluster can resolve the name, you can always use this script:
import socket

# Raises an exception if the name cannot be resolved, or times out
# after 2 seconds if the host cannot be reached
socket.create_connection(('andes-1-47', 8181), timeout=2)
This will time out in 2 seconds if you cannot resolve the hostname. (You can remove the timeout, but it may take a lot longer to determine whether the hostname is reachable that way.)

In the urlopen call, try setting retries=False or retries=1 to see the difference. The default is 3, which sounds quite reasonable.
http://urllib3.readthedocs.org/en/latest/pools.html#urllib3.connectionpool.HTTPConnectionPool.urlopen
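For example, a minimal sketch against the OP's host and path, using urllib3 directly so the retry behaviour is visible per call:
import urllib3

# Open a pool for the problem host and disable retries so the underlying
# connection error surfaces immediately instead of "Max retries exceeded"
pool = urllib3.HTTPConnectionPool('andes-1-47', port=8181)
response = pool.urlopen('GET', '/jammy/api/v1', retries=False)
print(response.status)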

Related

What can cause a “Resource temporarily unavailable” on sock connect() command

I am debugging a Python flask application. The application runs atop uWSGI configured with 6 threads and 1 process. I am using Flask-Executor to offload some slower tasks. These tasks create a connection with the Flask application, i.e., the same process, and perform some HTTP GET requests. The executor is configured to use 2 threads max. This application runs on Ubuntu 16.04.3 LTS.
Every once in a while the threads in the executor completely stop working. The code uses the Python requests library to do the requests. The underlying error message is:
Action failed. HTTPSConnectionPool(host='somehost.com', port=443): Max retries exceeded with url: /api/get/value (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f8d75bb5860>: Failed to establish a new connection: [Errno 11] Resource temporarily unavailable',))
The code that is running within the executor looks like this:
import requests

# Retry failed connections up to 3 times at the transport level
adapter = requests.adapters.HTTPAdapter(max_retries=3)
session = requests.Session()
session.mount('http://somehost.com:80', adapter)
session.headers.update({'Content-Type': 'application/json'})
...
# Give up if the server does not respond within 3 seconds
session.get(uri, params=params, headers=headers, timeout=3)
I've spent a good amount of time trying to peel back the Python requests stack down to the C sockets that it uses. I've also tried reproducing this error using small C and Python programs. At first I thought it could be that sockets were not getting closed and so we were running out of allowable sockets as a resource, but that gives me a message more along the lines of "too many files are open".
Setting aside the Python stack, what could cause a [Errno 11] Resource temporarily unavailable on a socket connect() command? Also, if you've run into this using requests, are there arguments that I could pass in to prevent this?
I've seen the What can cause a “Resource temporarily unavailable” on sock send() command StackOverflow post, but that's about a send() command and not the initial connect(), which is where I suspect the code is getting hung up.
The error message Resource temporarily unavailable corresponds to the error code EAGAIN.
The connect() manpage states that the error EAGAIN occurs in the following situation:
No more free local ports or insufficient entries in the routing cache. For AF_INET see the description of /proc/sys/net/ipv4/ip_local_port_range in ip(7) for information on how to increase the number of local ports.
This can happen when very many connections to the same IP/port combination are in use and no local port is available for automatic binding. You can check which connections are causing this with
netstat -tulpen
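If you suspect ephemeral port exhaustion, a quick sketch (Linux-specific) to see how many local ports the kernel has available for automatic binding:
# Read the ephemeral port range Linux uses when auto-binding sockets;
# exhausting this range makes connect() fail with EAGAIN
with open('/proc/sys/net/ipv4/ip_local_port_range') as f:
    low, high = map(int, f.read().split())
print('%d local ports available (%d-%d)' % (high - low + 1, low, high))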

Python Post Requests using Requests library giving broken pipe error

I am running an API developed in Go which accepts POST requests over the LAN. My client uses Python to send some data (about 350 KB) to the server. The Python code is multithreaded and may perform simultaneous POST requests, one per thread. The expected average rate between client and server is around 3 requests per second. The requests are not timing out, as the error is raised much earlier.
I cannot find the source of the error. The network should be robust, as both server and client are on a single 1 Gbps switch. Please help.
HTTPConnectionPool(host='192.168.1.105', port=8080): Max retries exceeded with url: /match (Caused by <class 'socket.error'>: [Errno 32] Broken pipe)

Python Requests: Allow more retries

I have a website that has two servers - one is dedicated to client-facing web services, and the other is a beefier data processing server.
I currently have a process in which the web server contacts the data server for multiple requests that typically look like this:
payload = {'req_type':'data_processing', 'sub_type':'data_crunch', 'id_num':12345}
r = requests.get('https://data.mywebsite.com/_api_route', params = payload)
...which has been running like clockwork for the better part of the past year. However, after creating a pandas-heavy function on the data server, I've been getting the following error (which I can't imagine has anything to do with pandas, but thought I'd throw it out there anyway):
HTTPSConnectionPool(host='data.mywebsite.com', port=443):
Max retries exceeded with url: /_api_route?......
(Caused by <class 'httplib.BadStatusLine'>: '')
Both servers are running ubuntu, with python, and the Requests library to handle communication between the servers.
There is a similar question here:
Max retries exceeded with URL, but the OP is asking about contacting a server over which he has no control - I can code both sides, so I'm hoping I can change something on my data server, but am not sure what it would be.
Changing the number of retries will not solve your problem. Caused by <class 'httplib.BadStatusLine'>: '' is what you should fix: the server returned an empty HTTP status line instead of something like "200" or "500".
The solution would be, if you're not already, to use an application server such as uWSGI or Gunicorn to handle concurrency, and, again if you're not already, to put Nginx or Apache in front of it. I've used uWSGI a fair amount and its configuration is very straightforward: to create more processes to handle requests, you just set processes = 2 in your .ini file, as in the sketch below. You can also use Nginx or Apache to spawn processes, but uWSGI is built specifically for Python and works wonderfully with Flask. I would advise you to implement this if you haven't already, and then observe memory and processor usage as you increase the number of processes until you find a number your server can handle.
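A minimal uWSGI .ini sketch (the module name and listen address are hypothetical; only processes comes from the advice above):
[uwsgi]
# hypothetical Flask module and application callable
module = myapp:app
# number of worker processes handling requests concurrently
processes = 2
# illustrative listen address
http = :8080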
EDIT: Just as a P.S., I run a Flask app behind Nginx using uWSGI on fairly bare-bones hardware (just a 2.5 GHz dual core), and with 16 processes I average about 40% CPU usage.
I've scoured the known internet for solutions to this problem, and I don't think I'm going to find one in the near future.
Instead, I built an internal retry loop (in Python) with a 1-second delay that looks something like this:
import time
import requests

counter = 0
max_tries = 10
while counter < max_tries:
    try:
        r = requests.get('https://data.mywebsite.com/_api_route', params=payload)
        counter = max_tries  # success: make sure the loop exits
        code = r.json()['r']['code']
        res = r.json()['r']['response']
        return code, res
    except requests.exceptions.ConnectionError:
        counter += 1
        time.sleep(1)  # brief pause before retrying
It's definitely not a solution as much as it is a workaround, but for now, it does just that, it works... assuming it doesn't have to retry more than 10 times.
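Note that requests can also do this kind of retrying for you via urllib3's Retry class mounted on an HTTPAdapter. A minimal sketch, with illustrative retry values:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
# Retry up to 10 times with exponential backoff, also retrying on
# common transient 5xx responses (values here are illustrative)
retry = Retry(total=10, backoff_factor=1, status_forcelist=[500, 502, 503])
session.mount('https://', HTTPAdapter(max_retries=retry))
r = session.get('https://data.mywebsite.com/_api_route', params=payload)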

TransportError happened when calling softlayer-api

It is weird to get an exception at about 7:30 am (UTC+8) every day when calling the SoftLayer API:
TransportError: TransportError(0): HTTPSConnectionPool(host='api.softlayer.com', port=443): Max retries exceeded with url: /xmlrpc/v3.1/SoftLayer_Product_Package (Caused by ProxyError('Cannot connect to proxy.', error('Tunnel connection failed: 503 Service Unavailable',)))
I use a proxy to forward HTTPS requests to SoftLayer's server. At first I thought it was caused by the proxy, but when I looked into the log, it showed that every request had been forwarded successfully. So maybe it is caused by the server. Does the server do something so demanding at that moment every day that it fails to serve?
We don't have any reports about this kind of issue, nor about the server being busy on SoftLayer's side. Regarding your issue, it looks network-related: it seems that something is happening with your proxy connection.
First we need to rule out the proxy as the cause. It would be very useful if you can verify whether the issue is reproducible without using a proxy on your side; let me know if you can test that.
If you still see the issue without the proxy, I recommend submitting a ticket for further investigation.
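For a quick check, one way to bypass a proxy configured through environment variables is a session with trust_env disabled (a sketch; a plain GET against the XML-RPC endpoint only tests reachability, not a valid API call):
import requests

session = requests.Session()
session.trust_env = False  # ignore HTTP_PROXY/HTTPS_PROXY environment variables
r = session.get('https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Product_Package', timeout=10)
print(r.status_code)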

I wish to avoid overrunning the http connections pool

I am creating a tool that will run many simultaneous calls to a RESTful API. I am using the python "Requests" module and the "threading" module. Once I stack too many simultaneous gets on the system I am getting exceptions like this:
ConnectionError: HTTPConnectionPool(host='xxx.net', port=80): Max retries exceeded with url: /thing/subthing/ (Caused by : [Errno 10055] An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full)
What can I do to either increase the buffer and queue space, or ask the Requests module to wait for an available slot?
(I know I could stuff it in a "try" loop, but that seems clumsy)
Use a session. If you use the requests.request family of module-level functions (get, post, ...), each request creates its own session with its own connection pool, therefore making no use of connection pooling.
If you need to fine-tune the number of connections used within a session, you can do this by changing its HTTPAdapter.
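A minimal sketch of what that can look like (the pool sizes are illustrative; pool_block=True makes threads wait for a free connection instead of failing):
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
# Cap the number of simultaneous connections per host, and block
# threads until a pooled connection is free rather than opening more
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=10, pool_block=True)
session.mount('http://', adapter)
r = session.get('http://xxx.net/thing/subthing/')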
