Flask hangs when run as a subprocess alongside pytest - python

I've spent the last hour and a half trying and failing to debug this test and I am utterly stumped. To simplify the process of testing the Flask server I am building, I have made a relatively simple script which starts the server, then runs pytest, kills the server, writes the outputs to files, and exits with Pytest's exit code. This code was working perfectly until today, and I haven't modified it since (aside from debugging this issue).
Here's the problem: when it gets to a certain point in the tests, it hangs. The weird thing is that this does not happen if I run my tests in any other way.
Debugging my server in VS Code, and running tests in the terminal: works
Running my server using the same code used in the test script and running pytest manually: works
Running pytest using the test script and running the server through the start server script (which uses the same code for running the server as the test script does) in a second terminal: works
Here's the other interesting thing: the tests always hang in the same place, part way through the setup fixture. It sends the clear command, and an echo request to the server (which prints the name of the current test). The database clears successfully, and the server echoes the correct information, but the echo route never exits - my tests never get a response. This echo route behaves perfectly for the 50 or so tests that happen before this point. If I comment out the test that is causing it to fail, it fails on the next test. If I comment out the call to the echo then it hangs on a later test on a completely different request to a different route. When it hangs, the server cannot be killed using a SIGTERM, but instead requires a SIGKILL.
Here is my echo route:
@debug.get('/echo')
def echo() -> IEcho:
    """
    Echo an input. This returns the given value, but also prints it to stdout
    on the server. Useful for debugging tests.

    ## Params:
    * `value` (`str`): value to echo
    """
    try:
        value = request.args['value']
    except KeyError:
        raise http_errors.BadRequest('echo route requires a `value` argument')
    to_print = f'{Fore.MAGENTA}[ECHO]\t\t{value}{Fore.RESET}'
    # Print it to both stdout and stderr to ensure it is seen across all logs
    # Otherwise it could be more difficult to figure out what's up with server
    # output
    print(to_print)
    print(to_print, file=sys.stderr)
    return {'value': value}
And here is my code that sends the requests:
def get(token: JWT | None, url: str, params: dict) -> dict:
    """
    Returns the response to a GET web request

    This also parses the response to help with error checking

    ### Args:
    * `url` (`str`): URL to request to
    * `params` (`dict`): parameters to send

    ### Returns:
    * `dict`: response data
    """
    return handle_response(requests.get(
        url,
        params=params,
        headers=encode_headers(token),
        timeout=3
    ))


def echo(value: str) -> IEcho:
    """
    Echo an input. This returns the given value, but also prints it to stdout
    on the server. Useful for debugging tests.

    ## Params:
    * `value` (`str`): value to echo
    """
    return cast(IEcho, get(None, f"{URL}/echo", {"value": value}))
@pytest.fixture(autouse=True)
def before_each(request: pytest.FixtureRequest):
    """Clear the database between tests"""
    clear()
    echo(f"{request.module.__name__}.{request.function.__name__}")
    print("After echo")  # This never prints
Here is my code for running pytest in my test script:
def pytest():
    pytest = subprocess.Popen(
        [sys.executable, '-u', '-m', 'pytest', '-v', '-s'],
    )
    # Wait for tests to finish
    print("🔨 Running tests...")
    try:
        ret = pytest.wait()
    except KeyboardInterrupt:
        print("❗ Testing cancelled")
        pytest.terminate()
        # write_outputs(pytest, None)
        # write_outputs(pytest, "pytest")
        raise
    # write_outputs(pytest, "pytest")
    if ret == 0:
        print("✅ It works!")
    else:
        print("❌ Tests failed")
    return bool(ret)
And here is my code for running my server in my test script:
def backend(debug=False, live_output=False):
    env = os.environ.copy()
    if debug:
        env.update({"ENSEMBLE_DEBUG": "TRUE"})
        debug_flag = ["--debug"]
    else:
        debug_flag = []

    if live_output is False:
        outputs = subprocess.PIPE
    else:
        outputs = None

    flask = subprocess.Popen(
        [sys.executable, '-u', '-m', 'flask'] + debug_flag + ['run'],
        env=env,
        stderr=outputs,
        stdout=outputs,
    )
    if outputs is not None and (flask.stderr is None or flask.stdout is None):
        print("❗ Can't read flask output", file=sys.stderr)
        flask.kill()
        sys.exit(1)

    # Request until we get a success, but crash if we failed to start in 10
    # seconds
    start_time = time.time()
    started = False
    while time.time() - start_time < 10:
        try:
            requests.get(
                f'http://localhost:{os.getenv("FLASK_RUN_PORT")}/debug/echo',
                params={'value': 'Test script startup...'},
            )
        except requests.ConnectionError:
            continue
        started = True
        break

    if not started:
        print("❗ Server failed to start in time")
        flask.kill()
        if outputs is not None:
            write_outputs(flask, None)
        sys.exit(1)
    else:
        if flask.poll() is not None:
            print("❗ Server crashed during startup")
            if outputs is not None:
                write_outputs(flask, None)
            sys.exit(1)

    print("✅ Server started")
    return flask
So in summary, does anyone have any idea what on earth is happening? The fact that it freezes on such a simple route makes me very concerned; I think I may have found some crazy bug in Flask or in the requests library.
Even if you don't know what's happening with this, it'd be really helpful to have any ideas as to how I can debug this further, as I have absolutely no idea what is going on.

It turns out that my server's output was filling up all the buffer space in the pipe, so the server blocked, waiting for the buffer to empty. Meanwhile, my test script was waiting for the tests to exit, and the tests could not progress unless the server was responsive. As such, the code reached a three-way deadlock. I fixed it by redirecting my output to a file (where limited buffer size isn't a problem).
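A minimal sketch of that fix, adapted from the backend() function above (the log file names here are illustrative, not from my project):
import subprocess
import sys

# Open real files for the server's output. Unlike an OS pipe, a file has no
# fixed-size buffer for the writer to fill, so Flask can never block on a
# print() while the test script is blocked on pytest.wait().
stdout_file = open('flask_stdout.log', 'wb')
stderr_file = open('flask_stderr.log', 'wb')

flask = subprocess.Popen(
    [sys.executable, '-u', '-m', 'flask', 'run'],
    stdout=stdout_file,
    stderr=stderr_file,
)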

Related

Why isn't subprocess.Popen.kill() doing anything?

I am following the Google Cloud Functions python testing example here:
https://cloud.google.com/functions/docs/testing/test-http
import os
import subprocess
import uuid

import requests
from requests.packages.urllib3.util.retry import Retry


def test_args():
    name = str(uuid.uuid4())
    port = os.getenv('PORT', 8080)  # Each functions framework instance needs a unique port

    process = subprocess.Popen(
        [
            'functions-framework',
            '--target', 'hello_http',
            '--port', str(port)
        ],
        cwd=os.path.dirname(__file__),
        stdout=subprocess.PIPE
    )

    # Send HTTP request simulating Pub/Sub message
    # (GCF translates Pub/Sub messages to HTTP requests internally)
    BASE_URL = f'http://localhost:{port}'
    retry_policy = Retry(total=6, backoff_factor=1)
    retry_adapter = requests.adapters.HTTPAdapter(
        max_retries=retry_policy)

    session = requests.Session()
    session.mount(BASE_URL, retry_adapter)

    name = str(uuid.uuid4())
    res = session.post(
        BASE_URL,
        json={'name': name}
    )
    assert res.text == 'Hello {}!'.format(name)

    # Stop the functions framework process
    process.kill()
    process.wait()
After running pytest against this test file, the test passes and the code exits nearly immediately, but it doesn't appear to have killed the process:
± ps -ef | grep functions
thoraxe 11661 1985 49 20:30 pts/0 00:00:02 /home/thoraxe/.pyenv/versions/3.9.13/envs/avogadro-trainer-3-9-13/bin/python3.9 /home/thoraxe/.pyenv/versions/3.9.13/envs/avogadro-trainer-3-9-13/bin/functions-framework --target train --port 8080
thoraxe 11684 11661 0 20:30 pts/0 00:00:00 /home/thoraxe/.pyenv/versions/3.9.13/envs/avogadro-trainer-3-9-13/bin/python3.9 /home/thoraxe/.pyenv/versions/3.9.13/envs/avogadro-trainer-3-9-13/bin/functions-framework --target train --port 8080
I'm using Python 3.9.13 in a virtual environment on Fedora.
As this is sample code from Google, I'd expect it to work, but something is definitely not working here. Can someone suggest what I might be doing wrong?
When a Python assertion fails, an AssertionError is raised and the rest of the test function never runs, so the kill/wait calls are only executed when the test succeeds. This is a major bummer, because the functions framework will continue to run in the background, and new code changes apparently aren't picked up on subsequent pytest runs.
Using a different wrapper framework like https://github.com/okken/pytest-check solves the problem, because all steps will be performed even if there are failures.
However, note that legitimate Python failures/errors/explosions will still result in the functions framework not exiting properly.
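One way to guarantee the cleanup always runs is to move the kill/wait into a finally block. A sketch based on the Google example above (not part of the original sample; the retry session is omitted for brevity):
import os
import subprocess
import uuid

import requests


def test_args():
    port = os.getenv('PORT', 8080)
    process = subprocess.Popen(
        ['functions-framework', '--target', 'hello_http', '--port', str(port)],
        cwd=os.path.dirname(__file__),
        stdout=subprocess.PIPE,
    )
    try:
        name = str(uuid.uuid4())
        res = requests.post(f'http://localhost:{port}', json={'name': name})
        assert res.text == 'Hello {}!'.format(name)
    finally:
        # Runs whether the assert passes or raises, so the
        # functions-framework process is always torn down.
        process.kill()
        process.wait()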

how to omit "connect: network is unreachable" message

I created a script which runs on boot, checks whether my Raspberry Pi has an internet connection, and at the same time updates the time (courtesy of ntp) via os.system().
import datetime, os, socket, subprocess, time
from time import sleep

dir_path = os.path.dirname(os.path.abspath(__file__))

def internet(host="8.8.8.8"):
    result = subprocess.call("ping -c 1 " + host, stdout=open(os.devnull, 'w'), shell=True)
    if result == 0:
        return True
    else:
        return False

timestr = time.strftime("%Y-%m-%d--%H:%M:%S")

netstatus = internet()
while netstatus == False:
    sleep(30)
    netstatus = internet()

if netstatus == True:
    print "successfully connected! updating time . . . "
    os.system("sudo bash " + dir_path + "/updatetime.sh")
    print "time updated! time check %s" % datetime.datetime.now()
where updatetime.sh contains the following:
service ntp stop
ntpd -q -g
service ntp start
This script runs at boot/reboot, and it runs 24/7 at our workplace. Outputs from scripts like these are saved to a log file. It's working fine, but is there a way NOT to output connect: Network is unreachable when there's no internet connection? Thanks.
Edit:
I run this script via a shell script named launch.sh, which runs check_net.py (this script's name) and other preliminary scripts, and I placed launch.sh in my crontab to run on boot/reboot:
@reboot sh /home/pi/launch.sh > /home/pi/logs/cronlog 2>&1
From what I've read in the thread what does '>/dev/null 2>&1' mean, 2 handles stderr whereas 1 handles stdout.
I am new to this. I wish to see my stdout, but not the stderr (in this case, only the connect: Network is unreachable messages).
As per @shellter's link suggestion in the comments, I restructured my cron entry to:
@reboot sh /home/pi/launch.sh 2>&1 > /home/pi/logs/cronlog | grep "connect: Network is unreachable"
Alternatively, I came up with another solution, which checks the internet connection in a different way, using urllib2.urlopen():
def internet_init():
    try:
        urllib2.urlopen('https://www.google.com', timeout=1)
        return True
    except urllib2.URLError as err:
        return False
Either of the two methods above omits the connect: Network is unreachable error output from my logs.
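A third option, if you prefer to keep the ping-based check: silence ping's stderr at the subprocess level, the same way the original internet() already silences stdout (a sketch; behaviour is otherwise unchanged):
import os
import subprocess

def internet(host="8.8.8.8"):
    # Discard both stdout and stderr so "connect: Network is unreachable"
    # never reaches the cron log in the first place.
    with open(os.devnull, 'w') as devnull:
        result = subprocess.call(
            "ping -c 1 " + host,
            stdout=devnull,
            stderr=devnull,
            shell=True,
        )
    return result == 0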
Thanks!

`manage.py runserver` and Ctrl+C (Django)

When I quit Django manage.py runserver with Ctrl+C, do threads running HTTP requests finish properly or are they interrupted in the middle?
TL;DR: running HTTP requests are stopped when Ctrl+C is hit on the Django dev server.
I thought your question was really interesting, so I investigated:
I made a view that takes 10 seconds to execute and then sends a response.
To test for your behaviour, I stopped the development server manage.py runserver using Ctrl+C and checked the results.
My base test:
class TestView(generic.View):
    def get(self, request):
        import time
        time.sleep(10)
        response = HttpResponse('Done.')
        return response
Normal execution (10s runtime): displays the message Done.
Interrupted execution (Ctrl+C while the request is running): browser error, the host cannot be reached.
So far everything is as expected. But I played around a little bit, because Ctrl+C in Python is not a full stop, but is actually handled rather conveniently: as soon as Ctrl+C is hit, a KeyboardInterrupt (i.e. an exception) is raised (equivalent to this):
raise KeyboardInterrupt()
So in your command-line based program you can put the following:
try:
    some_action_that_takes_a_while()
except KeyboardInterrupt:
    print('The user stopped the program.')
Ported to Django, the new view looks like this:
def get(self, request):
    import time
    slept_for = 0
    try:
        for i in range(100):
            slept_for += 0.1
            time.sleep(0.1)
    except KeyboardInterrupt:
        pass
    response = HttpResponse('Slept for: ' + str(slept_for) + 's')
    return response
Normal execution (10s runtime): displays the message Slept for: 10s.
Interrupted execution (Ctrl+C while the request is running): browser error, the host cannot be reached.
So no change in behaviour here. Out of interest I changed one line, but the result didn't change; I used
slept_for = 1000*1000
instead of
time.sleep(0.1)
So, to finally answer your question: on Ctrl+C the dev server shuts down immediately and running HTTP requests are not finished.

How can I start another process required by my Selenium tests

I'm using django-selenium to add Selenium testing functionality to existing unittests.
My Selenium tests rely on a web server running on my machine, which would be started by running our django app like so: main.py -a
So the first thing I want to do in my Selenium test is start this server, which I set up like so:
def start_server():
    path = os.path.join(os.getcwd(), 'main.py -a')
    server_running = is_server_running()
    if server_running is False:
        server = subprocess.Popen('cmd.exe', stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        stdout, stderr = server.communicate(input='%s\n' % path)
        print 'Server error:\n{0}\n'.format(stderr)
        server_running = is_server_running()
    return server_running
However, when I do this the web server takes over the execution of the django test process in the command line. I assume the way I should be doing this is to launch the command prompt in a separate process and then trigger the main.py -a command in that process.
Is this the right idea, and if so, how can I modify that function to spawn a new process and launch my command? I was trying to run 'cmd.exe' using Process(target=path) but I couldn't get it to work. Thanks :)
The way I have gone with this is a much simpler launch method:
startServer.py
def run():
    path = os.path.join(os.getcwd(), 'main.py')
    server_running = is_server_running()
    if server_running is False:
        subprocess.Popen(['python', path, '-a'])

if __name__ == '__main__':
    run()
Which I can then start and stop in my tests' setup & teardown like so:
def setUp(self):
    self.server = Process(target=startServer.run)
    self.server.start()

def test(self):
    # run test process
    pass

def tearDown(self):
    utils.closeBrowser(self.ff)
There may well be a better way of doing things & something here may not be 'as it should be' but it works (with a socket forcibly closed error) :)
My only outstanding issue is test starting before the database tables have been created :(
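For that last issue, one common pattern is to poll the server from setUp() until it answers before any test runs. A sketch (the URL and timeout here are assumptions, not from the project above):
import time

import requests


def wait_for_server(url, timeout=10):
    # Poll until the server responds or the deadline passes, so tests
    # don't start before the database tables have been created.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            requests.get(url, timeout=1)
            return True
        except requests.ConnectionError:
            time.sleep(0.2)
    return False
Calling wait_for_server('http://localhost:8000/') right after self.server.start() in setUp() would block until the app is actually serving.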

Detect whether Celery is Available/Running

I'm using Celery to manage asynchronous tasks. Occasionally, however, the celery process goes down which causes none of the tasks to get executed. I would like to be able to check the status of celery and make sure everything is working fine, and if I detect any problems display an error message to the user. From the Celery Worker documentation it looks like I might be able to use ping or inspect for this, but ping feels hacky and it's not clear exactly how inspect is meant to be used (if inspect().registered() is empty?).
Any guidance on this would be appreciated. Basically what I'm looking for is a method like so:
def celery_is_alive():
    from celery.task.control import inspect
    return bool(inspect().registered())  # is this right??
EDIT: It doesn't even look like registered() is available on celery 2.3.3 (even though the 2.1 docs list it). Maybe ping is the right answer.
EDIT: Ping also doesn't appear to do what I thought it would do, so still not sure the answer here.
Here's the code I've been using. celery.task.control.Inspect.stats() returns a dict containing lots of details about the currently available workers, None if there are no workers running, or raises an IOError if it can't connect to the message broker. I'm using RabbitMQ - it's possible that other messaging systems might behave slightly differently. This worked in Celery 2.3.x and 2.4.x; I'm not sure how far back it goes.
def get_celery_worker_status():
    ERROR_KEY = "ERROR"
    try:
        from celery.task.control import inspect
        insp = inspect()
        d = insp.stats()
        if not d:
            d = {ERROR_KEY: 'No running Celery workers were found.'}
    except IOError as e:
        from errno import errorcode
        msg = "Error connecting to the backend: " + str(e)
        if len(e.args) > 0 and errorcode.get(e.args[0]) == 'ECONNREFUSED':
            msg += ' Check that the RabbitMQ server is running.'
        d = {ERROR_KEY: msg}
    except ImportError as e:
        d = {ERROR_KEY: str(e)}
    return d
From the documentation of celery 4.2:
from your_celery_app import app

def get_celery_worker_status():
    i = app.control.inspect()
    availability = i.ping()
    stats = i.stats()
    registered_tasks = i.registered()
    active_tasks = i.active()
    scheduled_tasks = i.scheduled()
    result = {
        'availability': availability,
        'stats': stats,
        'registered_tasks': registered_tasks,
        'active_tasks': active_tasks,
        'scheduled_tasks': scheduled_tasks
    }
    return result
Of course, you could/should improve the code with error handling...
To check the same from the command line when Celery is running as a daemon:
Activate the virtualenv and go to the directory where the 'app' is.
Now run: celery -A [app_name] status
It will show whether Celery is up or not, plus the number of nodes online.
Source:
http://michal.karzynski.pl/blog/2014/05/18/setting-up-an-asynchronous-task-queue-for-django-using-celery-redis/
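If you want the same check from Python rather than the shell, you can wrap that CLI call (a sketch; the exact exit-code behaviour may vary between Celery versions):
import subprocess

def celery_is_up(app_name):
    # `celery status` exits non-zero when no nodes reply in time.
    result = subprocess.run(
        ['celery', '-A', app_name, 'status'],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0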
The following worked for me:
import socket

from kombu import Connection

celery_broker_url = "amqp://localhost"

try:
    conn = Connection(celery_broker_url)
    conn.ensure_connection(max_retries=3)
except socket.error:
    raise RuntimeError("Failed to connect to RabbitMQ instance at {}".format(celery_broker_url))
One method to test if any worker is responding is to send out a 'ping' broadcast and return with a successful result on the first response.
from .celery import app  # the celery 'app' created in your project

def is_celery_working():
    result = app.control.broadcast('ping', reply=True, limit=1)
    return bool(result)  # True if at least one result
This broadcasts a 'ping' and will wait up to one second for responses. As soon as the first response comes in, it will return a result. If you want a False result faster, you can add a timeout argument to reduce how long it waits before giving up.
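For example (a sketch; timeout is the documented keyword on Control.broadcast):
# Same ping, but wait at most half a second for a reply instead of
# the one-second default.
result = app.control.broadcast('ping', reply=True, limit=1, timeout=0.5)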
I found an elegant solution:
from .celery import app

try:
    app.broker_connection().ensure_connection(max_retries=3)
except Exception as ex:
    raise RuntimeError("Failed to connect to celery broker, {}".format(str(ex)))
You can use the ping method to check whether any worker (or a specific worker) is alive or not: https://docs.celeryproject.org/en/latest/_modules/celery/app/control.html#Control.ping
celery_app.control.ping()
You can test in your terminal by running the following command:
celery -A proj_name worker -l INFO
This lets you watch what Celery is doing every time it runs.
The script below worked for me.
import logging

# Import the celery app from the project
from application_package import app as celery_app

logger = logging.getLogger(__name__)

def get_celery_worker_status():
    insp = celery_app.control.inspect()
    nodes = insp.stats()
    if not nodes:
        raise Exception("celery is not running.")
    logger.info("celery workers are: {}".format(nodes))
    return nodes
Run celery status to get the status.
When Celery is running:
(venv) ubuntu@server1:~/project-dir$ celery status
-> celery@server1: OK
1 node online.
When no Celery worker is running, the following is displayed in the terminal:
(venv) ubuntu@server1:~/project-dir$ celery status
Error: No nodes replied within time constraint
