import requests
requests.get(path_url, timeout=100)
In the above usage of the Python requests library, does the connection close automatically once requests.get is done running? If not, how can I make certain that the connection is closed?
Yes, the connection is closed for you: there is a call to session.close behind the get code. If you use an IDE such as PyCharm, you can step into get to see what is happening. Inside get there is a call to request:
return request('get', url, params=params, **kwargs)
Within the definition of that request function, the session is created in a with block, so session.close is called when the block exits. If you follow the link to the requests repo, you can see how the session is handled:
# By using the 'with' statement we are sure the session is closed, thus we
# avoid leaving sockets open which can trigger a ResourceWarning in some
# cases, and look like a memory leak in others.
with sessions.Session() as session:
    return session.request(method=method, url=url, **kwargs)
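If you want to be explicit about it yourself, here is a minimal sketch (reusing the path_url placeholder from the question) that manages the Session, or closes the Response, directly:

import requests

# Option 1: manage the session yourself; leaving the with block closes the pool.
with requests.Session() as session:
    response = session.get(path_url, timeout=100)
    data = response.content

# Option 2: keep using requests.get and close the response explicitly,
# which releases the underlying connection.
response = requests.get(path_url, timeout=100)
try:
    data = response.content
finally:
    response.close()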
I'm pretty new to this and it's taken me days to get this far. The script I have now pushes a JSON feed into a Google Sheet; it works for my test link, but times out when used with the URL I actually need to pull from.
I can confirm the real URL works and that I have access: I'm able to print the response to the terminal no problem.
It contains sensitive info, so I'm unable to share it. I've looked into proxies and URIs, but haven't really been able to figure out how to apply any of it to my code.
# import urllib library
import json
from urllib.request import urlopen, Request
import gspread
import requests
gc = gspread.service_account(filename='creds.json')
sh = gc.open_by_key('1-1aiGMn2yUWRlh_jnIebcMNs-6phzUNxkktAFH7uY9o')
worksheet = sh.sheet1
url = 'URL LINK GOES HERE'
# store the response of URL
response = urlopen(Request(url, headers={"User-Agent": ""}))
r = requests.get("URL LINK GOES HERE",
                 proxies={"http": "http://61.233.25.166:80"})
# storing the JSON response
# from url in data
data_json = json.loads(response.read())
# print the json response
# print(data_json)
result = []
for key in data_json:
    result.append([key, data_json[key] if not isinstance(
        data_json[key], list) else ",".join(map(str, data_json[key]))])
worksheet.update('a1', result)
# proxies///uris///url 100% works
Does anyone have advice on how I could avoid the timeout? Full error is below:
Traceback (most recent call last):
File "c:\Users\AMadle\NBA-JSON-Fetch\2PrintToSheetTimeoutTesting.py", line 17, in <module>
response = urlopen(Request(url, headers={"User-Agent": ""}))
File "C:\Python\python3.10.5\lib\urllib\request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "C:\Python\python3.10.5\lib\urllib\request.py", line 519, in open
response = self._open(req, data)
File "C:\Python\python3.10.5\lib\urllib\request.py", line 536, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "C:\Python\python3.10.5\lib\urllib\request.py", line 496, in _call_chain
result = func(*args)
File "C:\Python\python3.10.5\lib\urllib\request.py", line 1391, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "C:\Python\python3.10.5\lib\urllib\request.py", line 1351, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
Here are a couple of things:
Is this the full code? I know you said it has sensitive information, but could you censor it instead of deleting it entirely?
The provided logs show a URLError. How do you know it's a timeout? If it were a timeout, wouldn't it raise a timeout exception?
Are you trying to read or write data? Using pushing and pulling interchangeably makes this unclear.
Without being able to test your code I can only make a guess. It looks like there is no connection to the server. Since you say the URL works (how do you test it, by the way?), it could be that the way urlopen connects is specifically what's failing. As I understand it, the urllib docs say this (https://docs.python.org/2/library/urllib2.html#urllib2.ProxyHandler):
In addition, if proxy settings are detected (for example, when a
*_proxy environment variable like http_proxy is set), ProxyHandler is default installed and makes sure the requests are handled through the
proxy.
If no proxy environment variables are set, then in a Windows
environment proxy settings are obtained from the registry’s Internet
Settings section, and in a Mac OS X environment proxy information is
retrieved from the OS X System Configuration Framework.
To disable autodetected proxy pass an empty dictionary.
What is your OS btw?
You could check your proxy settings with urllib.request.getproxies()
And you could try changing the proxy settings. I'm unfortunately not sure of the exact syntax with the urllib library (better to use requests, by the way), but another guess is something like this:
req = Request(url, headers={"User-Agent": ""})
req.set_proxy("", "http") # empty proxy for http
response = urlopen(req)
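Alternatively, following the "pass an empty dictionary" note from the docs quoted above, here is a minimal sketch that builds an opener with no proxy at all (url is the same placeholder as in your code, and the timeout value is arbitrary):

from urllib.request import Request, build_opener, ProxyHandler

# An empty ProxyHandler disables any autodetected proxy settings.
opener = build_opener(ProxyHandler({}))
req = Request(url, headers={"User-Agent": ""})
response = opener.open(req, timeout=30)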
Since the request itself fails, gspread does not seem to be causing the issue. First off, I'd suggest using the response's built-in json() method instead of json.loads(response.read()). Furthermore, I'd coalesce the one-item-or-a-list values to always be a list and always use a str join, for clarity.
If you only sometimes get timeouts, try figuring out which records are affected. Perhaps you're requesting a really large data set via a proxy.
Or perhaps the proxy is flaky, so a retry is worth a try?
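A minimal sketch of these suggestions, reusing the url placeholder from the question (the retry count, timeout, and use of HTTPAdapter are my own assumptions, just one way to retry):

import requests
from requests.adapters import HTTPAdapter

# Mount an adapter with a few retries in case the proxy is flaky.
session = requests.Session()
session.mount("http://", HTTPAdapter(max_retries=3))
session.mount("https://", HTTPAdapter(max_retries=3))

r = session.get(url, timeout=30)
r.raise_for_status()
data_json = r.json()

# Coalesce every value to a list, then join it into a single string.
result = []
for key, value in data_json.items():
    if not isinstance(value, list):
        value = [value]
    result.append([key, ",".join(map(str, value))])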
I tried mitmproxy in the last couple of days as a test tool and it works excellently. However, while I'm able to write add-ons that intercept requests (even changing their URL, as in my example below), I couldn't prevent the request from actually being dispatched over the network.
One way or another, the request is always performed over the network.
So, how can I modify my add-on so that, given a request, it returns a fixed response without making any network request?
from mitmproxy import http

class Interceptor:
    def request(self, flow: http.HTTPFlow):
        if flow.request.method == "GET":
            flow.request.url = "http://google.com"

    def response(self, flow: http.HTTPFlow):
        return http.HTTPResponse.make(status_code=200, content=b"Rambo 5")
The request hook is executed when mitmproxy has received the request; the response hook is executed once we have fetched the response from the server. Long story short, everything in the response hook is too late.
Instead, you need to assign flow.response in the request hook.
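For example, a minimal sketch following that advice, keeping the class and response text from the question (note that newer mitmproxy releases rename http.HTTPResponse to http.Response):

from mitmproxy import http

class Interceptor:
    def request(self, flow: http.HTTPFlow):
        if flow.request.method == "GET":
            # Setting flow.response here short-circuits the flow:
            # mitmproxy never contacts the upstream server.
            flow.response = http.HTTPResponse.make(
                200,                             # status code
                b"Rambo 5",                      # response body
                {"Content-Type": "text/plain"},  # headers
            )

addons = [Interceptor()]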
I was wondering if this code is safe:
import requests
with requests.Session() as s:
response = s.get("http://google.com", stream=True)
content = response.content
For a simple example like this, it does not fail (note I don't write "it works" :p), since the pool does not close the connection instantly anyway (that's the point of a session/pool, right?).
Using stream=True, the response object is supposed to have a raw attribute that contains the connection, but I'm unsure whether the connection is owned by the session or not, and therefore whether, if I don't read the content right away but later, it might already have been closed.
My current 2 cents is that it's unsafe, but I'm not 100% sure.
Thanks!
[Edit after reading requests code]
After reading the requests code in more detail, it seems that this is what requests.get is doing itself:
https://github.com/requests/requests/blob/master/requests/api.py
def request(method, url, **kwargs):
    with sessions.Session() as session:
        return session.request(method=method, url=url, **kwargs)
Note that kwargs might contain stream=True.
So I guess the answer is "it's safe".
OK, I got my answer by digging into requests and urllib3.
The Response object owns the connection until the close() method is explicitly called:
https://github.com/requests/requests/blob/24092b11d74af0a766d9cc616622f38adb0044b9/requests/models.py#L937-L948
def close(self):
    """Releases the connection back to the pool. Once this method has been
    called the underlying ``raw`` object must not be accessed again.

    *Note: Should not normally need to be called explicitly.*
    """
    if not self._content_consumed:
        self.raw.close()

    release_conn = getattr(self.raw, 'release_conn', None)
    if release_conn is not None:
        release_conn()
release_conn() is a urllib3 method; it releases the connection and puts it back in the pool:
def release_conn(self):
    if not self._pool or not self._connection:
        return

    self._pool._put_conn(self._connection)
    self._connection = None
If the Session was destroyed (e.g. by the with block exiting), the pool was destroyed too, so the connection cannot be put back and is simply closed.
TL;DR: this is safe.
Note that this also means that in stream mode, the connection is not returned to the pool until the response is closed explicitly or the content is read entirely.
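As a practical takeaway, here is a minimal sketch of the safe pattern: close (or fully read) the streamed response before the session's pool goes away. In recent requests versions the Response is itself a context manager, so a nested with handles it:

import requests

with requests.Session() as s:
    with s.get("http://google.com", stream=True) as response:
        content = response.content  # reading the body releases the connection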
I was trying to develop a web application using Flask; below is my code:
import json

from flask import Flask, Response
from sample import APIAccessor

app = Flask(__name__)

# API
@app.route('/test/triggerSecCall', methods=['GET'])
def triggerMain():
    resp = APIAccessor().trigger2()
    return Response(json.dumps(resp), mimetype='application/json')

@app.route('/test/seccall', methods=['GET'])
def triggerSub():
    resp = {'data': 'called second method'}
    return Response(json.dumps(resp), mimetype='application/json')
And my trigger method contains the following code,
def trigger2(self):
    url = 'http://127.0.0.1:5000/test/seccall'
    response = requests.get(url)
    response.raise_for_status()
    responseJson = response.json()
    if self.track:
        print 'Response:::%s' % str(responseJson)
    return responseJson
When I hit http://127.0.0.1:5000/test/seccall directly, I get the expected output. When I hit /test/triggerSecCall, the server stops responding and the request waits forever.
At this stage I am not able to access any APIs from any other REST client. When I force-stop the server (Ctrl+C), I get a response in the second REST client.
Why is Flask not able to serve the internal service call?
I guess you are using the single-threaded development server and not a WSGI setup for production.
Since the server has only one thread, it can handle one request at a time. The first request is being executed and issues requests.get(...), which opens a second request that cannot be handled until the first one completes: a deadlock.
The best solution would be to just call triggerSub() to get the result instead of making an HTTP request.
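A minimal sketch of that suggestion (alternatively, running the development server with app.run(threaded=True) would also avoid the single-thread deadlock, but the direct call is simpler):

@app.route('/test/triggerSecCall', methods=['GET'])
def triggerMain():
    # Reuse the other view's logic in-process instead of calling back
    # into the same single-threaded server over HTTP.
    return triggerSub()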
I am using urllib.request.urlopen() to GET from a web service I'm trying to test.
This returns an HTTPResponse object, which I then read() to get the response body.
But I always see a ResourceWarning about an unclosed socket from socket.py
Here's the relevant function:
import json
from urllib.request import Request, urlopen

def get_from_webservice(url):
    """ GET from the webservice """
    req = Request(url, method="GET", headers=HEADERS)
    with urlopen(req) as rsp:
        body = rsp.read().decode('utf-8')
        return json.loads(body)
Here's the warning as it appears in the program's output:
$ ./test/test_webservices.py
/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/socket.py:359: ResourceWarning: unclosed <socket.socket object, fd=5, family=30, type=1, proto=6>
self._sock = None
.s
----------------------------------------------------------------------
Ran 2 tests in 0.010s
OK (skipped=1)
If there's anything I can do to the HTTPResponse (or the Request?) to make it close its socket cleanly,
I would really like to know, because this code is for my unit tests; I don't like
ignoring warnings anywhere, but especially not there.
I don't know if this is the answer, but it is part of the way to an answer.
If I add the header "connection: close" to the response from my web services, the HTTPResponse object seems to clean itself up properly without a warning.
And in fact, the HTTP Spec (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html) says:
HTTP/1.1 applications that do not support persistent connections MUST include the "close" connection option in every message.
So the problem was on the server end (i.e. my fault!). In the event that you don't have control over the headers coming from the server, I don't know what you can do.
I had the same problem with urllib3, and I just added a context manager to close the connection automatically:
import urllib3

def get(addr, headers):
    """ This function will close the connection after an HTTP request. """
    with urllib3.PoolManager() as conn:
        res = conn.request('GET', addr, headers=headers)
        if res.status == 200:
            return res.data
        else:
            raise ConnectionError(res.reason)
Note that urllib3 is designed to keep a pool of connections and to keep connections alive for you. This can significantly speed up your application if it needs to make a series of requests, e.g. a few calls to a backend API.
Please read the urllib3 documentation on connection pools here: https://urllib3.readthedocs.io/en/1.5/pools.html
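For example, a small sketch of reusing one PoolManager across several calls so keep-alive connections are pooled (the host and paths here are purely illustrative):

import urllib3

# One PoolManager for the whole application; connections are kept alive
# and reused across requests to the same host.
http = urllib3.PoolManager()
for path in ("/a", "/b", "/c"):
    res = http.request("GET", "https://example.com" + path)
    print(res.status)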
P.S. You could also use the requests lib, which is not part of the Python standard library (as of 2019) but is very powerful and simple to use: http://docs.python-requests.org/en/master/
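For comparison, a rough requests-based sketch of the same helper as above (keeping the hypothetical get(addr, headers) signature):

import requests

def get(addr, headers):
    # Leaving the with block closes the session and its connection pool.
    with requests.Session() as s:
        res = s.get(addr, headers=headers)
        res.raise_for_status()
        return res.content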