So I have an Azure Functions app written in Python, and quite often the code throws an error like this:
HTTPSConnectionPool(host='www.***.com', port=443): Max retries exceeded with url: /x/y/z (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7faba31d0438>: Failed to establish a new connection: [Errno 110] Connection timed out',))
This happens in a few different functions that make HTTPS connections.
I contacted support and they told me that this was caused by SNAT port exhaustion and advised me to: "Modify the application to reuse connections instead of creating a connection per request, use connection pooling, use service endpoints if you are connecting to resources in Azure." They sent me this link https://4lowtherabbit.github.io/blogs/2019/10/SNAT/ and also this https://learn.microsoft.com/en-us/azure/azure-functions/manage-connections
The problem is that I am unsure how to practically reuse and/or pool connections in Python, and I am also unsure what the primary cause of the exhaustion is, as this data is not publicly available.
So I am looking for help with applying their advice to all our http(s) and database connections.
I made the assumption that pymongo and pyodbc (the database clients we use) would handle pooling and reuse despite me creating a new client each time a function runs. Is this incorrect, and if so, how do I reuse these database clients in Python to prevent this?
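For pymongo, what I imagine the fix looks like is something like this: a client created once at module scope, assuming Azure Functions keeps the worker process (and therefore the module) alive between invocations. The connection string and database/collection names are placeholders:

import azure.functions as func
import pymongo

# created once per worker process; pymongo maintains its own connection
# pool, so reusing this one client is what actually enables the pooling
mongo_client = pymongo.MongoClient('mongodb://example-host:27017/')  # placeholder URI

def main(req: func.HttpRequest) -> func.HttpResponse:
    doc = mongo_client['mydb']['mycollection'].find_one({})  # placeholder names
    return func.HttpResponse(str(doc))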
The problem has so far only occurred when using requests (or the zeep SOAP library, which internally defaults to using requests) to hit an HTTPS endpoint. Is there any way I could improve how I use requests, like reusing sessions or closing connections explicitly? I am aware that requests creates a session in the background when calling requests.get, but my knowledge of the library is insufficient to figure out whether this is the problem and how I could solve it. I am thinking I might be able to create and reuse a single session instance for each specific HTTP(S) call in each function, but I am unsure if this is correct, and I have no idea how to actually do it.
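Concretely, what I am imagining for requests is a module-level Session that every invocation reuses (the URL here is just a stand-in):

import azure.functions as func
import requests

# one Session per worker process; its urllib3 pool keeps the underlying
# TCP connections alive between invocations instead of opening new ones
session = requests.Session()

def main(req: func.HttpRequest) -> func.HttpResponse:
    resp = session.get('https://www.example.com/x/y/z', timeout=10)  # stand-in URL
    return func.HttpResponse(resp.text)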
In a few places I also use aiohttp, and if possible I would like to achieve the same thing there.
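For aiohttp I assume the equivalent would be a single ClientSession created lazily inside the running event loop and then reused, along these lines (URL again a stand-in):

import aiohttp
import azure.functions as func

_session = None

async def _get_session() -> aiohttp.ClientSession:
    # created on first use, inside the event loop, then reused for all later calls
    global _session
    if _session is None:
        _session = aiohttp.ClientSession()
    return _session

async def main(req: func.HttpRequest) -> func.HttpResponse:
    session = await _get_session()
    async with session.get('https://www.example.com/x/y/z') as resp:  # stand-in URL
        body = await resp.text()
    return func.HttpResponse(body)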
I haven't looked into service endpoints yet but I am about to.
So, in short: what can I do in practice to ensure connection reuse/pooling with requests, pyodbc, pymongo and aiohttp?
Related
I am creating service A that communicates with service B over HTTP requests. Since my system needs to support a high load, I wanted to create a persistent connection.
I am working with the Python requests library. I thought of creating a requests.Session object and just keep using the same object for all HTTP requests.
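Concretely, I mean something like this (service B's host is made up here, and the Retry configuration is just my guess at a way to recover from dropped connections):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
# if a pooled TCP connection has died, urllib3 replaces it with a new one;
# retries cover the case where the failure only surfaces mid-request
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])
session.mount('https://', HTTPAdapter(max_retries=retry))
resp = session.get('https://service-b.example.com/api/endpoint', timeout=5)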
What happens if, for some reason, the underlying TCP connection is lost? How can I check that the session object is still alive and healthy? I tried googling but haven't found anything...
Thank you :)
I am debugging a Python flask application. The application runs atop uWSGI configured with 6 threads and 1 process. I am using Flask-Executor to offload some slower tasks. These tasks create a connection with the Flask application, i.e., the same process, and perform some HTTP GET requests. The executor is configured to use 2 threads max. This application runs on Ubuntu 16.04.3 LTS.
Every once in a while the threads in the executor completely stop working. The code uses the Python requests library to do the requests. The underlying error message is:
Action failed. HTTPSConnectionPool(host='somehost.com', port=443): Max retries exceeded with url: /api/get/value (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f8d75bb5860>: Failed to establish a new connection: [Errno 11] Resource temporarily unavailable',))
The code that is running within the executor looks like this:
import requests

# retry adapter; mounted for both schemes (without a port in the prefix) so it
# also covers the https://somehost.com:443 requests that show up in the error above
adapter = requests.adapters.HTTPAdapter(max_retries=3)
session = requests.Session()
session.mount('http://somehost.com', adapter)
session.mount('https://somehost.com', adapter)
session.headers.update({'Content-Type': 'application/json'})
...
session.get(uri, params=params, headers=headers, timeout=3)
I've spent a good amount of time trying to peel back the Python requests stack down to the C sockets that it uses. I've also tried reproducing this error using small C and Python programs. At first I thought it could be that sockets were not getting closed and so we were running out of allowable sockets as a resource, but that gives me a message more along the lines of "too many files are open".
Setting aside the Python stack, what could cause a [Errno 11] Resource temporarily unavailable on a socket connect() command? Also, if you've run into this using requests, are there arguments that I could pass in to prevent this?
I've seen the "What can cause a 'Resource temporarily unavailable' on sock send() command" Stack Overflow post, but that's on a send() command and not on the initial connect(), which is where I suspect the code is getting hung up.
The error message Resource temporarily unavailable corresponds to the error code EAGAIN.
The connect() manpage states that the error EAGAIN occurs in the following situations:
No more free local ports or insufficient entries in the routing cache. For AF_INET see the description of /proc/sys/net/ipv4/ip_local_port_range in ip(7) for information on how to increase the number of local ports.
This can happen when very many connections to the same IP/port combination are in use and no local port for automatic binding can be found. You can check exactly which connections are causing this with
netstat -tulpen
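If you want to inspect the range in question from Python, something like this (reading the proc file the manpage mentions) should work on Linux:

# /proc/sys/net/ipv4/ip_local_port_range holds two numbers: the lowest and
# highest port used for automatic (ephemeral) binding
with open('/proc/sys/net/ipv4/ip_local_port_range') as f:
    low, high = map(int, f.read().split())
print(f'{high - low + 1} ephemeral ports available ({low}-{high})')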
So I need to deny a WebSocket connection with a specific code so the client can handle the rejection properly. Currently, when you reject a connection with message.reply_channel({'accept': False}), the client simply gets a 403 error. I can close a connection with a code by using message.reply_channel({'close': 3000}), but that requires the connection to have been accepted in the first place. If that's the only way to do it then so be it, but I feel like there should be a way to reject with a code that I simply can't find.
I'm using Django Channels 1.1.8 so the 2.x release changes don't benefit me unfortunately.
I want to not answer a request handled by Flask. I don't want to return any error code, data, or an answer at all.
What I am trying to accomplish by doing this: there is an endpoint that takes sensor data and does not return any information. The clients POST the data to this endpoint, but they do not wait for an answer and shut down (I have no control over the clients). So I'm seeing the following error: "[Errno 10053] An established connection was aborted by the software in your host machine". So I asked myself, why do I even respond to these requests?
I can think of two reasons to do something like this:
You have a "friend" that you want to prevent from accessing your site, or
You have the misguided notion that this will help prevent (D)DoS attacks.
When you say "ignore a request totally", you actually kind of can't do that, generally speaking. The exception is if you know the IP address the traffic is coming from, in which case you can instruct your OS, network card, router, switch, load balancer, or maybe even your ISP to filter out the traffic from that IP.
Otherwise, you're kind of out of luck because of how the Internet works.
HTTP works over TCP*. Specifically the client process looks something like this:
Resolve the DNS name (e.g. google.com) to an IP address (e.g. 216.58.218.174)
open up a TCP connection to 216.58.218.174:80 (using Google for the example)
send the HTTP header over to Google:
GET / HTTP/1.1
read the response
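Spelled out with a raw socket (plain HTTP on port 80, since TLS would add another layer on top), those steps look roughly like this:

import socket

ip = socket.gethostbyname('google.com')    # step 1: resolve DNS to an IP
sock = socket.create_connection((ip, 80))  # step 2: open the TCP connection
# step 3: send the HTTP request line and headers
sock.sendall(b'GET / HTTP/1.1\r\nHost: google.com\r\nConnection: close\r\n\r\n')
print(sock.recv(4096))                     # step 4: read (the start of) the response
sock.close()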
Once that TCP/IP connection has been created to your server, at the very least you're going to have to terminate the connection.
There's really no good way to do this from within Python itself, and certainly not within Flask.
As you've updated your question, it turns out you really don't have to change anything; Flask is already handling the error behind the scenes. It may be routing the message to a specific logger that you could silence if you really don't want to see the messages, but it's not really important.
The only thing you may want to do, if your response processing is expensive (like tying up the database with a several-second-long query), is to look into streaming your response instead, which will fail much more cheaply.
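A minimal sketch of what that might look like, with a stub standing in for the expensive query:

from flask import Flask, Response

app = Flask(__name__)

def expensive_query():
    # stub standing in for a several-second database query
    for i in range(1000):
        yield f'row {i}\n'

@app.route('/data')
def data():
    # the response body is generated lazily; if the client hangs up early,
    # the server stops iterating instead of paying for the whole query
    return Response(expensive_query(), mimetype='text/plain')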
*Mostly. Sure, you can do it over UDP, but you probably aren't.
I am creating a tool that will run many simultaneous calls to a RESTful API. I am using the Python "requests" module and the "threading" module. Once I stack too many simultaneous GETs on the system, I am getting exceptions like this:
ConnectionError: HTTPConnectionPool(host='xxx.net', port=80): Max retries exceeded with url: /thing/subthing/ (Caused by : [Errno 10055] An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full)
What can I do to either increase the buffer and queue space, or ask the Requests module to wait for an available slot?
(I know I could stuff it in a "try" loop, but that seems clumsy)
Use a session. If you use the requests.request family of methods (get, post, ...), each request will use its own session with its own connection pool, therefore not making any use of connection pooling.
If you need to fine-tune the number of connections used within a session, you can do so by changing its HTTPAdapter.
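For example (the pool sizes here are arbitrary; pool_block=True is what makes threads wait for a free connection instead of opening extra ones):

import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
# pool_connections: how many per-host pools to cache
# pool_maxsize: how many connections each pool keeps open
# pool_block=True: block until a connection is free rather than creating more
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=50, pool_block=True)
session.mount('http://', adapter)
session.mount('https://', adapter)
response = session.get('http://xxx.net/thing/subthing/', timeout=5)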