aiohttp requests timeout for no reason - python

I am trying to test connectivity to a website over http and https with aiohttp, but the script hangs and eventually times out for some domains. Here is the script, with an example of a domain that hangs:
import asyncio, aiohttp

domain = "liffol-le-grand.fr"

async def ping(client, domain):
    results = tuple()
    ping = tuple()
    try:
        async with client.get(f"http://{domain}/") as r:
            ...
            # some code using r.url, r.history and r.status
            ...
    except Exception as e:
        pass
    try:
        async with client.get(f"https://{domain}/") as r:
            ...
            # some code using r.url, r.history and r.status
            ...
    except Exception as e:
        pass
    return results, ping

async def main():
    async with aiohttp.ClientSession() as client:
        http, ping_result = await ping(client, domain)
        print(http, ping_result)

asyncio.run(main())
Interestingly, running the ping function with only one of the two try...except blocks works like a charm. I have not found anything unusual about this domain, certificate-wise or otherwise. Neither the issues page on the aiohttp GitHub nor other posts on Stack Overflow have helped me much.
I am using aiohttp v3.8.4 and Python v3.11.0.
Thanks for your help.
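Edit: while narrowing this down, I found it helps to cap the wait explicitly so the hang fails fast instead of blocking for minutes. This is only a debugging sketch with an assumed 10-second budget, not a fix:
import asyncio
import aiohttp

domain = "liffol-le-grand.fr"

async def main():
    # Bound the whole request (connect + read) so a hang surfaces
    # quickly as asyncio.TimeoutError instead of stalling the script.
    timeout = aiohttp.ClientTimeout(total=10)
    async with aiohttp.ClientSession(timeout=timeout) as client:
        for scheme in ("http", "https"):
            try:
                async with client.get(f"{scheme}://{domain}/") as r:
                    print(scheme, r.status)
            except Exception as e:
                print(scheme, "failed:", repr(e))

asyncio.run(main())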

Related

How to use asyncio.wait_for to run_until_complete to synchronously call async method in Python

To allow timeouts when receiving data via a Python websocket, the FAQ entry "How do I set a timeout on recv()?" recommends receiving data asynchronously:
await asyncio.wait_for(websocket.recv(), timeout=10)
Since the function in which I receive data is not async, I've adapted this answer to run the asyncio loop until data is received or the timeout occurs:
loop.run_until_complete(asyncio.wait_for(ws.recv(), timeout=10))
Unfortunately this statement seems to be not valid, since the following exception occurs:
An asyncio.Future, a coroutine or an awaitable is required
To me, it looks like asyncio.wait_for is not a valid argument for run_until_complete, although the documentation clearly shows an example that awaits it.
What am I missing here - and what would be the correct way to use asyncio.wait_for in a synchronous method?
You need to feed loop.run_until_complete with a coroutine. To do so, you can wrap your receiving code into an async function:
async def receive_message():
    return await asyncio.wait_for(ws.recv(), timeout=10)

loop.run_until_complete(receive_message())
Here's a fully working example:
import asyncio
import websockets

URI = "ws://0.0.0.0:8765"
TIMEOUT = 2

async def create_ws():
    return await websockets.connect(URI)

async def receive_message():
    ws = await create_ws()
    print("Connected")
    message = await asyncio.wait_for(ws.recv(), timeout=TIMEOUT)
    print(f"Received message in less than {TIMEOUT} seconds: {message}")

if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(receive_message())
There are two common websocket modules available in Python.
While websockets is imported with import websockets, there is also websocket-client, which is imported with import websocket.
Both modules offer nearly the same API, where ws.recv() receives data.
The fact that the two modules are so similar may cause confusion, which is at least what led to my exception.
While websockets is capable of async operations, websocket is not. This means my statement
loop.run_until_complete(asyncio.wait_for(ws.recv(), timeout=10))
will only work with websockets; if used with websocket, the exception
An asyncio.Future, a coroutine or an awaitable is required
will occur.
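A side note for anyone on the synchronous websocket-client package: it supports timeouts natively through the underlying socket, so asyncio is not needed at all. A minimal sketch, assuming the same local server as above:
from websocket import create_connection, WebSocketTimeoutException

# websocket-client is synchronous; the timeout applies to the underlying socket.
ws = create_connection("ws://0.0.0.0:8765", timeout=10)
try:
    message = ws.recv()
    print(message)
except WebSocketTimeoutException:
    print("No message received within 10 seconds")
finally:
    ws.close()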

How to read lines of a streaming api with aiohttp in Python?

I am trying to convert a Python HTTP client that uses requests to aiohttp. The logic is to send a GET request to a REST endpoint that streams data occasionally, and to print the lines it returns.
I have a code using requests with stream=True option and iter_lines, it works pretty fine:
import json
import requests

def main():
    with requests.get('https://my-streaming-url.com', stream=True) as r:
        if r.encoding is None:
            r.encoding = 'utf-8'
        for line in r.iter_lines(decode_unicode=True):
            if line:
                # Print each line emitted by the streaming api
                print(json.loads(line))

if __name__ == '__main__':
    main()
Now, I want to convert this logic to the aiohttp streaming API, and tried:
import asyncio
import aiohttp
import json
loop = asyncio.get_event_loop()

async def connect_and_listen():
    r = aiohttp.request('get', 'https://my-streaming-url.com')
    async for line in r.content:
        print(json.loads(line))

if __name__ == '__main__':
    loop.run_until_complete(connect_and_listen())
    loop.close()
I get an error like:
... in connect_and_listen
async for line in r.content:
AttributeError: '_SessionRequestContextManager' object has no attribute 'content'
sys:1: RuntimeWarning: coroutine 'ClientSession._request' was never awaited
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fac6ec24310>
I tried a few things, like removing loop.close() from main and removing async from the for loop, but none of them helped.
What am I missing here? How can I print a streaming API's lines with aiohttp?
P.S.: My Python version is 3.7.5
Since use of the ClientSession class is encouraged throughout the documentation, I encapsulated this code in a session as follows, and it worked:
async def main():
    async with aiohttp.ClientSession(raise_for_status=True) as session:
        async with session.get(cashcog_stream_url) as r:
            async for line in r.content:
                print(json.loads(line))
Another point: loop.close() apparently does not affect the way the app works and can be removed.
You're missing the await keyword.
aiohttp.request is an async context manager. You should use it with an async with statement:
async def main():
    async with aiohttp.request('get', 'http://localhost:5000') as r:
        async for line in r.content:
            print(json.loads(line))
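Putting the two answers together, a self-contained sketch of the session-based streaming client might look like this (the URL is the question's placeholder, and asyncio.run requires Python 3.7+):
import asyncio
import json
import aiohttp

async def main():
    async with aiohttp.ClientSession(raise_for_status=True) as session:
        async with session.get('https://my-streaming-url.com') as r:
            # r.content is a StreamReader; iterating it yields one line at a time.
            async for line in r.content:
                if line.strip():
                    print(json.loads(line))

asyncio.run(main())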

aiohttp Client Connection Error, however I can find the website online just fine?

I am attempting to return the HTTP status code from a list of urls asynchronously using this code I found online, however after a few are printed I receive the error ClientConnectorError: Cannot connect to host website_i_removed:80 ssl:None [getaddrinfo failed]. Since the website is valid, I am confused as to why it says I cannot connect. If I am doing this wrong at any point, please point me in the right direction.
For the past few hours I have been looking through the documentation and online resources for aiohttp, but there is no example of HTTP requests with a list of urls, and the getting-started page in the docs is quite hard to follow since I am brand new to async programming. Below is the code I am using; assume urls is a list of strings.
import asyncio
from aiohttp import ClientSession

async def fetch(url, session):
    async with session.get(url) as response:
        code_status = response.history[0].status if response.history else response.status
        print('%s -> Status Code: %s' % (url, code_status))
        return await response.read()

async def bound_fetch(semaphore, url, session):
    # Getter function with semaphore.
    async with semaphore:
        await fetch(url, session)

async def run(urls):
    tasks = []
    # create instance of Semaphore
    semaphore = asyncio.Semaphore(1000)
    async with ClientSession() as session:
        for url in urls:
            # pass Semaphore and session to every GET request
            task = asyncio.ensure_future(bound_fetch(semaphore, url, session))
            tasks.append(task)
        responses = asyncio.gather(*tasks)
        await responses

loop = asyncio.get_event_loop()
future = asyncio.ensure_future(run(urls))
loop.run_until_complete(future)
I expected each website to print its status code so I can determine whether it is reachable, but it says I can't connect to some of them despite my being able to look them up in my browser.
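To see exactly which URLs fail, and why, the per-URL exceptions can be collected instead of letting the first one propagate. A debugging sketch under the question's assumptions (urls is a list of strings); note that getaddrinfo failed points at DNS resolution, which a very high concurrency cap can overwhelm:
import asyncio
import aiohttp

async def check(url, session, semaphore):
    async with semaphore:
        async with session.get(url) as response:
            return response.status

async def run(urls):
    # A smaller cap than 1000 eases pressure on the local DNS resolver.
    semaphore = asyncio.Semaphore(100)
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(
            *(check(url, session, semaphore) for url in urls),
            return_exceptions=True,  # collect errors instead of raising the first one
        )

results = asyncio.get_event_loop().run_until_complete(run(urls))
for url, result in zip(urls, results):
    print(url, '->', result)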

TypeError: A Future or coroutine is required with AWS lambda

I am setting up a lambda function which makes asynchronous requests using asyncio and aiohttp. Even though the code works fine when running locally, once I upload it to Lambda it returns:
"errorMessage": "A Future or coroutine is required"
I've looked at already-opened issues such as Boto3: A Future or coroutine is required and TypeError: A Future or coroutine is required, but couldn't make it work.
I am not sure why it returns this error message when I have a defined coroutine:
import asyncio
import json

import aiohttp
import boto3

base_url = "https://www.reed.co.uk/api/1.0/jobs/"
url_list = [
    "38012438",
    "38012437",
    "38012436"]

def lambda_handler(event, context):
    client = boto3.client('s3',
                          aws_access_key_id="aws_access_key_id",
                          aws_secret_access_key="aws_secret_access_key")

    async def fetch(session, url):
        auth = aiohttp.BasicAuth(login='login', password='')
        async with aiohttp.ClientSession(auth=auth) as session:
            async with session.get(url) as response:
                return await response.json()

    async def fetch_all(urls, loop):
        async with aiohttp.ClientSession(loop=loop) as session:
            results = await asyncio.gather(*[fetch(session, base_url + url) for url in url_list], return_exceptions=True)
            return results

    loop = asyncio.get_event_loop()
    htmls = loop.run_until_complete(fetch_all(url_list, loop))
    # I believe this is where the coroutine is
    with open("/tmp/JOBS.json", "w") as f:
        json.dump(htmls, f)
I just want the combined content of my requests to be uploaded to a json file.
I apologise for my limited coding skills, as I am new to Python, Lambda, etc.
Check your requirements.txt file. I got the same error when I added asyncio to requirements.txt. It seems AWS Lambda runs its own Python environment, and asyncio is already part of the Python standard library; if you install it separately from PyPI, it doesn't work in AWS Lambda.
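In other words, requirements.txt should list only genuine third-party packages. A sketch of what it might contain here (the version pins are illustrative, not taken from the question):
# requirements.txt
# Do NOT list asyncio: it is part of the Python standard library, and
# installing the PyPI package of the same name breaks on AWS Lambda.
aiohttp==3.6.2
boto3==1.9.201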

limit number of concurrent requests aiohttp

I'm downloading images using aiohttp, and was wondering if there is a way to limit the number of open requests that haven't finished. This is the code I currently have:
import asyncio
import aiohttp

async def get_images(url, session):
    chunk_size = 100
    # Print statement to show when a request is being made.
    print(f'Making request to {url}')
    async with session.get(url=url) as r:
        with open('path/name.png', 'wb') as file:
            while True:
                chunk = await r.content.read(chunk_size)
                if not chunk:
                    break
                file.write(chunk)

# List of urls to get images from
urls = [...]
conn = aiohttp.TCPConnector(limit=3)
loop = asyncio.get_event_loop()
session = aiohttp.ClientSession(connector=conn, loop=loop)
loop.run_until_complete(asyncio.gather(*(get_images(url, session=session) for url in urls)))
The problem is, I threw a print statement in to show me when each request is being made, and it made almost 21 requests at once instead of the 3 I want to limit it to (i.e., once an image is done downloading, it should move on to the next url in the list). I'm just wondering what I am doing wrong here.
Your limit setting works correctly; you made a mistake while debugging.
As Mikhail Gerasimov pointed out in the comment, you put your print() call in the wrong place: it must be inside the session.get() context.
To be confident that the limit is respected, I tested your code against a simple logging server, and the test shows that the server receives exactly the number of connections you set in TCPConnector. Here is the test:
import asyncio
import aiohttp

loop = asyncio.get_event_loop()

class SilentServer(asyncio.Protocol):
    def connection_made(self, transport):
        # We will know when the connection is actually made:
        print('SERVER |', transport.get_extra_info('peername'))

async def get_images(url, session):
    chunk_size = 100
    # This log doesn't guarantee that we will connect,
    # session.get() will freeze if you reach TCPConnector limit
    print(f'CLIENT | Making request to {url}')
    async with session.get(url=url) as r:
        while True:
            chunk = await r.content.read(chunk_size)
            if not chunk:
                break

urls = [f'http://127.0.0.1:1337/{x}' for x in range(20)]
conn = aiohttp.TCPConnector(limit=3)
session = aiohttp.ClientSession(connector=conn, loop=loop)

async def test():
    await loop.create_server(SilentServer, '127.0.0.1', 1337)
    await asyncio.gather(*(get_images(url, session=session) for url in urls))

loop.run_until_complete(test())
asyncio.Semaphore solves exactly this issue.
In your case it'll be something like this:
semaphore = asyncio.Semaphore(3)

async def get_images(url, session):
    async with semaphore:
        print(f'Making request to {url}')
        # ...
You may also be interested in taking a look at this ready-to-run code example that demonstrates how a semaphore works.
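Combining the semaphore with the original download loop, a self-contained version might look like the sketch below (the urls list and output filenames are placeholders):
import asyncio
import aiohttp

async def get_image(url, session, semaphore, index):
    async with semaphore:
        print(f'Making request to {url}')
        async with session.get(url) as r:
            with open(f'image_{index}.png', 'wb') as file:
                while True:
                    chunk = await r.content.read(100)
                    if not chunk:
                        break
                    file.write(chunk)

async def main(urls):
    # Create the semaphore inside the running loop so it binds correctly.
    semaphore = asyncio.Semaphore(3)  # at most 3 downloads in flight at once
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *(get_image(url, session, semaphore, i) for i, url in enumerate(urls)))

# Example: asyncio.run(main(urls)) with urls defined as before.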
