(Note: The background for this problem is pretty verbose, but there's an SSCCE at the bottom that can be skipped to)
Background
I'm trying to develop a Python-based CLI to interact with a web service. In my codebase I have a CommunicationService class that handles all direct communication with the web service. It exposes a received_response property that returns an Observable (from RxPY) that other objects can subscribe to in order to be notified when responses are received back from the web service.
I've based my CLI logic on the click library, where one of my subcommands is implemented as below:
async def enabled(self, request: str, response_handler: Callable[[str], Tuple[bool, str]]) -> None:
    self._generate_request(request)
    if response_handler is None:
        return None
    while True:
        response = await self.on_response
        success, value = response_handler(response)
        print(success, value)
        if success:
            return value
What's happening here (in the case that response_handler is not None) is that the subcommand is behaving as a coroutine that awaits responses from the web service (self.on_response == CommunicationService.received_response) and returns some processed value from the first response it can handle.
I'm trying to test the behaviour of my CLI by creating test cases in which CommunicationService is completely mocked; a fake Subject is created (which can act as an Observable) and CommunicationService.received_response is mocked to return it. As part of the test, the subject's on_next method is invoked to pass mock web service responses back to the production code:
@when('the communications service receives a response from TestCube Web Service')
def step_impl(context):
    context.mock_received_response_subject.on_next(context.text)
I use a click 'result callback' function that gets invoked at the end of the CLI invocation and blocks until the coroutine (the subcommand) is done:
@cli.resultcallback()
def _handle_command_task(task: Coroutine, **_) -> None:
    if task:
        loop = asyncio.get_event_loop()
        result = loop.run_until_complete(task)
        loop.close()
        print('RESULT:', result)
Problem
At the start of the test, I run CliRunner.invoke to fire off the whole shebang. The problem is that this is a blocking call and will block the thread until the CLI has finished and returned a result, which isn't helpful if I need my test thread to carry on so it can produce mock web service responses concurrently with it.
What I guess I need to do is run CliRunner.invoke on a new thread using ThreadPoolExecutor. This allows the test logic to continue on the original thread and execute the @when step posted above. However, notifications published with mock_received_response_subject.on_next do not seem to resume execution within the subcommand.
I believe the solution would involve making use of RxPY's AsyncIOScheduler, but I'm finding the documentation on this a little sparse and unhelpful.
SSCCE
The snippet below captures what I hope is the essence of the problem. If it can be modified to work, I should be able to apply the same solution to my actual code to get it to behave as I want.
import asyncio
import logging
import sys
import time

import click
from click.testing import CliRunner
from rx.subjects import Subject

web_response_subject = Subject()
web_response_observable = web_response_subject.as_observable()
thread_loop = asyncio.new_event_loop()


@click.group()
def cli():
    asyncio.set_event_loop(thread_loop)


@cli.resultcallback()
def result_handler(task, **_):
    loop = asyncio.get_event_loop()
    result = loop.run_until_complete(task)  # Should block until subject publishes value
    loop.close()
    print(result)


@cli.command()
async def get_web_response():
    return await web_response_observable


def test():
    runner = CliRunner()
    future = thread_loop.run_in_executor(None, runner.invoke, cli, ['get_web_response'])
    time.sleep(1)
    web_response_subject.on_next('foo')  # Simulate reception of web response.
    time.sleep(1)
    result = future.result()
    print(result.output)


logging.basicConfig(
    level=logging.DEBUG,
    format='%(threadName)10s %(name)18s: %(message)s',
    stream=sys.stderr,
)
test()
Current Behaviour
The program hangs when run, blocking at result = loop.run_until_complete(task).
Acceptance Criteria
The program terminates and prints foo on stdout.
Update 1
Based on Vincent's help I've made some changes to my code.
Relay.enabled (the subcommand that awaits responses from the web service in order to process them) is now implemented like this:
async def enabled(self, request: str, response_handler: Callable[[str], Tuple[bool, str]]) -> None:
    self._generate_request(request)
    if response_handler is None:
        return None
    return await self.on_response \
        .select(response_handler) \
        .where(lambda result, i: result[0]) \
        .select(lambda result, index: result[1]) \
        .first()
I wasn't quite sure how await would behave with RxPY observables: would it return execution to the caller on each element generated, or only when the observable has completed (or errored)? I now know it's the latter, which honestly feels like the more natural choice and has allowed me to make the implementation of this function feel a lot more elegant and reactive.
I've also modified the test step that generates mock web service responses:
@when('the communications service receives a response from TestCube Web Service')
def step_impl(context):
    loop = asyncio.get_event_loop()
    loop.call_soon_threadsafe(context.mock_received_response_subject.on_next, context.text)
Unfortunately, this will not work as it stands, since the CLI is being invoked in its own thread...
@when('the CLI is run with "{arguments}"')
def step_impl(context, arguments):
    loop = asyncio.get_event_loop()
    if 'async.cli' in context.tags:
        context.async_result = loop.run_in_executor(None, context.cli_runner.invoke, testcube.cli, arguments.split())
    else:
        ...
And the CLI creates its own thread-private event loop when invoked...
def cli(context, hostname, port):
    _initialize_logging(context.meta['click_log.core.logger']['level'])
    # Create a new event loop for processing commands asynchronously on.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    ...
What I think I need is a way to allow my test steps to invoke the CLI on a new thread and then fetch the event loop it's using:
@when('the communications service receives a response from TestCube Web Service')
def step_impl(context):
    loop = _get_cli_event_loop()  # Needs to be implemented.
    loop.call_soon_threadsafe(context.mock_received_response_subject.on_next, context.text)
Update 2
There doesn't seem to be an easy way to get the event loop that a particular thread creates and uses for itself, so instead I took Victor's advice and mocked asyncio.new_event_loop to return an event loop that my test code creates and stores:
def _apply_mock_event_loop_patch(context):
    # Close any already-existing exit stacks.
    if hasattr(context, 'mock_event_loop_exit_stack'):
        context.mock_event_loop_exit_stack.close()
    context.test_loop = asyncio.new_event_loop()
    print(context.test_loop)
    context.mock_event_loop_exit_stack = ExitStack()
    context.mock_event_loop_exit_stack.enter_context(
        patch.object(asyncio, 'new_event_loop', spec=True, return_value=context.test_loop))
I change my 'mock web response received' test step to do the following:
@when('the communications service receives a response from TestCube Web Service')
def step_impl(context):
    loop = context.test_loop
    loop.call_soon_threadsafe(context.mock_received_response_subject.on_next, context.text)
The great news is that I'm actually getting the Relay.enabled coroutine to trigger when this step gets executed!
The only problem now is the final test step, in which I await the future I got from executing the CLI in its own thread and validate that the CLI prints the expected output on stdout:
@then('the CLI should print "{output}"')
def step_impl(context, output):
    if 'async.cli' in context.tags:
        loop = asyncio.get_event_loop()  # main loop, not test loop
        result = loop.run_until_complete(context.async_result)
    else:
        result = context.result
    assert_that(result.output, equal_to(output))
I've tried playing around with this but I can't seem to get context.async_result (which stores the future from loop.run_in_executor) to transition nicely to done and return the result. With the current implementation, I get an error for the first test (1.1) and indefinite hanging for the second (1.2):
@mock.comms @async.cli @wip
Scenario Outline: Querying relay enable state -- #1.1 # testcube/tests/features/relay.feature:45
When the user queries the enable state of relay 0 # testcube/tests/features/steps/relay.py:17 0.003s
Then the CLI should query the web service about the enable state of relay 0 # testcube/tests/features/steps/relay.py:48 0.000s
When the communications service receives a response from TestCube Web Service # testcube/tests/features/steps/core.py:58 0.000s
"""
{'module':'relays','path':'relays[0].enabled','data':[True]}'
"""
Then the CLI should print "True" # testcube/tests/features/steps/core.py:94 0.003s
Traceback (most recent call last):
File "/Users/davidfallah/testcube_env/lib/python3.5/site-packages/behave/model.py", line 1456, in run
match.run(runner.context)
File "/Users/davidfallah/testcube_env/lib/python3.5/site-packages/behave/model.py", line 1903, in run
self.func(context, *args, **kwargs)
File "testcube/tests/features/steps/core.py", line 99, in step_impl
result = loop.run_until_complete(context.async_result)
File "/usr/local/Cellar/python3/3.5.2_1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
return future.result()
File "/usr/local/Cellar/python3/3.5.2_1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/asyncio/futures.py", line 274, in result
raise self._exception
File "/usr/local/Cellar/python3/3.5.2_1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/concurrent/futures/thread.py", line 55, in run
result = self.fn(*self.args, **self.kwargs)
File "/Users/davidfallah/testcube_env/lib/python3.5/site-packages/click/testing.py", line 299, in invoke
output = out.getvalue()
ValueError: I/O operation on closed file.
Captured stdout:
RECEIVED WEB RESPONSE: {'module':'relays','path':'relays[0].enabled','data':[True]}'
<Future pending cb=[_chain_future.<locals>._call_check_cancel() at /usr/local/Cellar/python3/3.5.2_1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/asyncio/futures.py:431]>
@mock.comms @async.cli @wip
Scenario Outline: Querying relay enable state -- #1.2 # testcube/tests/features/relay.feature:46
When the user queries the enable state of relay 1 # testcube/tests/features/steps/relay.py:17 0.005s
Then the CLI should query the web service about the enable state of relay 1 # testcube/tests/features/steps/relay.py:48 0.001s
When the communications service receives a response from TestCube Web Service # testcube/tests/features/steps/core.py:58 0.000s
"""
{'module':'relays','path':'relays[1].enabled','data':[False]}'
"""
RECEIVED WEB RESPONSE: {'module':'relays','path':'relays[1].enabled','data':[False]}'
Then the CLI should print "False" # testcube/tests/features/steps/core.py:94
Chapter 3: Finale
Screw all this asynchronous multi-threaded stuff, I'm too dumb for it.
First off, instead of describing the scenario like this...
When the user queries the enable state of relay <relay_id>
Then the CLI should query the web service about the enable state of relay <relay_id>
When the communications service receives a response from TestCube Web Service:
"""
{"module":"relays","path":"relays[<relay_id>].enabled","data":[<relay_enabled>]}
"""
Then the CLI should print "<relay_enabled>"
We describe it like this:
Given the communications service will respond to requests:
"""
{"module":"relays","path":"relays[<relay_id>].enabled","data":[<relay_enabled>]}
"""
When the user queries the enable state of relay <relay_id>
Then the CLI should query the web service about the enable state of relay <relay_id>
And the CLI should print "<relay_enabled>"
Implement the new given step:
@given('the communications service will respond to requests')
def step_impl(context):
    response = context.text

    def publish_mock_response(_):
        loop = context.test_loop
        loop.call_soon_threadsafe(context.mock_received_response_subject.on_next, response)

    # Configure the mock comms service to publish a mock response when a request is made.
    instance = context.mock_comms.return_value
    instance.send_request.on_next.side_effect = publish_mock_response
BOOM
2 features passed, 0 failed, 0 skipped
22 scenarios passed, 0 failed, 0 skipped
58 steps passed, 0 failed, 0 skipped, 0 undefined
Took 0m0.111s
I can see two problems with your code:
asyncio is not thread-safe, unless you use call_soon_threadsafe or run_coroutine_threadsafe. RxPY doesn't use either of those in Observable.to_future, so you have to access RxPY objects in the same thread that runs the asyncio event loop.
RxPY sets the result of the future when on_completed is called, so that awaiting an observable returns the last object emitted. This means you have to call both on_next and on_completed to get await to return.
Here is a working example:
import click
import asyncio
from rx.subjects import Subject
from click.testing import CliRunner

web_response_subject = Subject()
web_response_observable = web_response_subject.as_observable()
main_loop = asyncio.get_event_loop()


@click.group()
def cli():
    pass


@cli.resultcallback()
def result_handler(task, **_):
    future = asyncio.run_coroutine_threadsafe(task, main_loop)
    print(future.result())


@cli.command()
async def get_web_response():
    return await web_response_observable


def test():
    runner = CliRunner()
    future = main_loop.run_in_executor(
        None, runner.invoke, cli, ['get_web_response'])
    main_loop.call_later(1, web_response_subject.on_next, 'foo')
    main_loop.call_later(2, web_response_subject.on_completed)
    result = main_loop.run_until_complete(future)
    print(result.output, end='')


if __name__ == '__main__':
    test()
Related
I have a task that is IO bound running in a loop. This task does a lot of work and is oftentimes hogging the loop (is that the right word for it?). My plan is to run it in a separate process or thread using run_in_executor with ProcessPoolExecutor or ThreadPoolExecutor, so that the main loop can get on with its work. Currently I use asyncio.PriorityQueue() and asyncio.Event() for communication between tasks and would like to reuse these, or something with the same interface, if possible.
Current code:
# Getter for events and queues so communication can happen
send, receive, send_event, receive_event = await process_obj.get_queues()
# Creates task based off the process object
future = asyncio.create_task(process_obj.main())
Current process code:
async def main():
    while True:
        # does things that hog the loop
        ...
What I want to do:
# Getter for events and queues so communication can happen
send, receive, send_event, receive_event = await process_obj.get_queues()
# I assume I could use Thread or Process executors
pool = concurrent.futures.ThreadPoolExecutor()
result = await loop.run_in_executor(pool, process_obj.run())
New process code:
def run():
    asyncio.create_task(main())


async def main():
    while True:
        # does things that hog the loop
        ...
How do I communicate between this new thread and the original loop like I could originally?
There isn't enough of your code for me to reproduce the problem, so please consider this code from a YouTube downloader as an example. I hope it helps you understand how to get the result from a function running in a thread:
Example code:
def on_download(self, is_mp3: bool, is_mp4: bool, url: str) -> None:
    if is_mp3 == False and is_mp4 == False:
        self.ids.info_lbl.text = 'Please select a type of file to download.'
    else:
        self.ids.info_lbl.text = 'Downloading...'
        self.is_mp3 = is_mp3
        self.is_mp4 = is_mp4
        self.url = url
        Clock.schedule_once(self.schedule_download, 2)
        Clock.schedule_interval(self.start_progress_bar, 0.1)


def schedule_download(self, dt: float) -> None:
    '''
    Callback method for the download.
    '''
    pool = ThreadPool(processes=1)
    _downloader = Downloader(self.d_path)
    self.async_result = pool.apply_async(_downloader.download,
                                         (self.is_mp3, self.is_mp4, self.url))
    Clock.schedule_interval(self.check_process, 0.1)


def check_process(self, dt: float) -> None:
    '''
    Check if download is complete.
    '''
    if self.async_result.ready():
        resp = self.async_result.get()
        if resp[0] == 'Error. Download failed.':
            self.ids.info_lbl.text = resp[0]
            # progress bar gray if error
            self.stop_progress_bar(value=0)
        else:
            # progress bar blue if success
            self.stop_progress_bar(value=100)
            self.ids.file_name.text = resp[0]
            self.ids.info_lbl.text = 'Finished downloading.'
            self.ids.url_input.text = ''
        Clock.unschedule(self.check_process)
Personally I prefer from multiprocessing.pool import ThreadPool. Right now it looks like your code 'hogs' the loop because you are awaiting the result, so the program waits until there is a result (and that may take a long time). If you look at my example code:
on_download schedules the event schedule_download, which in turn schedules another event, check_process. I can't tell whether your app is a GUI app or a terminal app, as there is pretty much no code in your question, but what you have to do is schedule a check_process event in your loop.
Look at my check_process: the branch under if self.async_result.ready(): only runs once my result is ready.
Right now you are waiting for the result. Here, everything happens in the background, and every now and then the main loop checks for the result (it won't be hogged: if there is no result yet, the main loop carries on with what it has to do rather than waiting).
So basically you have to schedule some events (especially the one that checks for the result) in your loop rather than going line by line and waiting for each one. Does that make sense, and is my example code helpful? Sorry, I am really bad at explaining what is in my head ;)
-> mainloop
-> new Thread if there is any
-> check for result if there is any Threads
-> if there is a result
-> do something
-> mainloop keeps running
-> back to top
When you execute the while True in your main coroutine, it doesn't just hog the loop: it blocks the loop, so the other tasks never get a chance to do their jobs. Running a separate process in an event-based application is not the best solution, as processes are not very friendly when it comes to sharing data.
It is possible to do everything concurrently without using parallelism. All you need is to execute await asyncio.sleep(0) at the end of each while True iteration. It yields control back to the loop and allows the other tasks to run, without exiting the coroutine.
In the following example, I have a listener that uses while True and handles the data added by emitter to the queue.
import asyncio
from queue import Empty
from queue import Queue
from random import choice

queue = Queue()


async def listener():
    while True:
        try:
            # data polling from the queue
            data = queue.get_nowait()
            print(data)  # {"type": "event", "data": {...}}
        except (Empty, Exception):
            pass
        finally:
            # the magic action
            await asyncio.sleep(0)


async def emitter():
    # add a data to the queue
    queue.put({"type": "event", "data": {...}})


async def main():
    # first create a task for listener
    running_loop = asyncio.get_running_loop()
    running_loop.create_task(listener())
    for _ in range(5):
        # create tasks for emitter with random intervals to
        # demonstrate that the listener is still running in
        # the loop and handling the data put into the queue
        running_loop.create_task(emitter())
        await asyncio.sleep(choice(range(2)))


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
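If the work really does have to run in a separate thread, one way to keep an asyncio.Queue-style interface on the loop side is to have the thread hand every item to the loop with call_soon_threadsafe. The following is only a minimal sketch under that assumption; blocking_worker, consume, and the None sentinel are hypothetical names and conventions introduced for illustration, not part of the question's code.

```python
import asyncio


def blocking_worker(loop: asyncio.AbstractEventLoop, queue: asyncio.Queue, n: int) -> None:
    """Runs in a worker thread; stands in for the loop-hogging work."""
    for i in range(n):
        # asyncio.Queue is not thread-safe, so schedule each put on the loop's own thread.
        loop.call_soon_threadsafe(queue.put_nowait, i)
    # Sentinel (an assumption of this sketch): tells the consumer there is nothing more.
    loop.call_soon_threadsafe(queue.put_nowait, None)


async def consume(n: int = 3) -> list:
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    # Run the blocking function in the default thread pool.
    worker = loop.run_in_executor(None, blocking_worker, loop, queue, n)
    received = []
    while True:
        item = await queue.get()  # the loop stays free while waiting
        if item is None:
            break
        received.append(item)
    await worker  # surface any exception raised in the thread
    return received
```

Calling asyncio.run(consume()) drives the sketch. Note that run_in_executor takes the callable and its arguments separately; passing the result of calling the function would run it on the loop's thread instead.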
I'm using an HTTP trigger to trigger an Orchestrator Function that runs multiple activity functions. The HTTP trigger is called every few minutes to retrieve the status of the orchestration. The activity functions require an external token for certain operations. This token is sent with the HTTP trigger to the Orchestrator. To avoid using an expired token when the Orchestrator runs longer, I included an external event in the HTTP trigger that sends a new token on each call of the trigger, since I didn't find another way to send new data to a running Orchestrator. Now, if I add the wait_for_external_event call after my activity functions in the Orchestrator, everything runs properly. But if I put it before the activities are called, it causes the Orchestrator to stop working. The client.get_status function of the HTTP trigger returns a failed status after the first run.
I am not sure why this is; from my understanding it should not make a difference when I wait for the external event. Is there any reason why this is happening? In the monitoring, the Orchestrator is still shown as "running".
This is my http trigger:
import logging
import azure.functions as func
import azure.durable_functions as df
import json
import uuid
from azure.durable_functions.models.OrchestrationRuntimeStatus import OrchestrationRuntimeStatus


async def main(req: func.HttpRequest, starter: str) -> func.HttpResponse:
    try:
        client = df.DurableOrchestrationClient(starter)
        params = {}
        for key, value in req.params.items():
            params[key] = value
        for key, value in _get_json(req).items():
            params[key] = value
        access_token = params["accessToken"]
        azure_call_id = params.get("azureCallId")
        if not azure_call_id:
            azure_call_id = await client.start_new('TestOrchestrator',
                                                   instance_id=None,
                                                   client_input=params)
        status = await client.get_status(azure_call_id)
        if status.runtime_status == OrchestrationRuntimeStatus.Pending \
                or status.runtime_status == OrchestrationRuntimeStatus.Running \
                or status.runtime_status == OrchestrationRuntimeStatus.ContinuedAsNew:
            await client.raise_event(azure_call_id, 'RefreshToken', {'accessToken': access_token})
        return response
    except Exception as e:
        logging.error(f"Exception caught: {str(e)}")
I am trying to get an Az Function to return an HTTP response and continue a background thread after. In the code below I try to use a thread but the response is still waiting for the function to finish before returning. Is there something I am missing?
import logging
import json
import threading
import azure.functions as func
from .commands import start_process


def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    command = req.params.get('command')
    vm = req.params.get('vm')
    if not command:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            command = req_body.get('command')
            vm = req_body.get('vm')
    if command == 'restart':
        thread = threading.Thread(target=start_process(vm, command))
        thread.start()
    if command:
        return func.HttpResponse(f"Hello {command}!")
    else:
        return func.HttpResponse(
            "Please pass a name on the query string or in the request body",
            status_code=400
        )
If a program is running Threads (that are not daemon), then the program will wait for those threads to complete before it terminates. Please follow this post for more information.
So, the AZ function (which is a stateless method) that you have created, will wait for the thread to complete as you have attempted to exit from the function by using a return statement. Hence the behavior that you see.
I think you need to look at fire and forget kind of thread execution, and this SO question and answer should help in that.
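One detail in the question's code is worth checking regardless of the above: threading.Thread(target=start_process(vm, command)) calls start_process right there on the request thread and hands its return value to Thread, so the work has already finished before the thread is even created. A minimal sketch of passing the callable and its arguments separately follows; start_process here is a hypothetical stand-in for the real one, handle and finished are names made up for illustration, and the sketch says nothing about whether the Functions host keeps a background thread alive after the response is sent.

```python
import threading
import time

# Set by the stand-in worker so a caller can observe when it has finished.
finished = threading.Event()


def start_process(vm: str, command: str) -> None:
    """Hypothetical stand-in for the question's long-running start_process."""
    time.sleep(0.2)
    finished.set()


def handle(vm: str, command: str) -> str:
    # Pass the function object plus args; target=start_process(vm, command)
    # would run the whole job here, before Thread() is constructed.
    # daemon=True means a still-running thread does not keep the interpreter
    # alive at exit; drop it if the work must always run to completion.
    thread = threading.Thread(target=start_process, args=(vm, command), daemon=True)
    thread.start()
    return f"Hello {command}!"
```

With this shape, handle returns immediately while the stand-in worker is still sleeping in the background.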
In the following code, an API gives a task to a task broker, who puts it in a queue, where it is picked up by a worker. The worker will then execute the task and notify the task broker (using a redis message channel) that he is done, after which the task broker will remove it from its queue. This works.
What I'd like is that the task broker is then able to return the result of the task to the API. But I'm unsure on how to do so since it is asynchronous code and I'm having difficulty figuring it out. Can you help?
Simplified the code is roughly as follows, but incomplete.
The API code:
@router.post('', response_model=BaseDocument)
async def post_document(document: BaseDocument):
    """Create the document with a specific type and an optional name given in the payload"""
    task = DocumentTask({ <SNIP>
    })
    task_broker.give_task(task)
    result = await task_broker.get_task_result(task)
    return result
The task broker code: the first part gives the task, the second part removes the task, and the final part is what I assume should be a blocking call on the status of the removed task.
def give_task(self, task_obj):
    self.add_task_to_queue(task_obj)
    <SNIP>
    self.message_channel.publish(task_obj)

# ...
def remove_task_from_queue(self, task):
    id_task_to_remove = task.id
    for i in range(len(task_queue)):
        if task_queue[i]["id"] == id_task_to_remove:
            removed_task = task_queue.pop(i)
            logger.debug(
                f"[TaskBroker] Task with id '{id_task_to_remove}' succesfully removed !"
            )
            removed_task["status"] = "DONE"
            return

# ...
async def get_task_result(self, task):
    return task.result
My intuition is to implement get_task_result so that it blocks on task.result until it is modified, and to modify it in remove_task_from_queue when the task is removed from the queue (and is thus done).
Any idea how to do this asynchronously?
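One possible shape for the "wait until the result is set" idea is to keep an asyncio.Future per task: get_task_result awaits it, and the code that removes the task from the queue sets its result. This is only a sketch under assumptions: the TaskBroker below is a hypothetical, stripped-down class keyed by a plain task ID, complete_task is a made-up name for the hook you would call from remove_task_from_queue, and everything is assumed to run on the same event loop.

```python
import asyncio
from typing import Any, Dict


class TaskBroker:
    """Hypothetical, stripped-down broker showing only the result-waiting part."""

    def __init__(self) -> None:
        self._pending: Dict[str, asyncio.Future] = {}

    def give_task(self, task_id: str) -> None:
        # Must be called from code running inside the event loop.
        self._pending[task_id] = asyncio.get_running_loop().create_future()

    def complete_task(self, task_id: str, result: Any) -> None:
        # Call this at the point where the task is removed from the queue.
        future = self._pending[task_id]
        if not future.done():
            future.set_result(result)

    async def get_task_result(self, task_id: str) -> Any:
        try:
            # Suspends this coroutine (not the loop) until complete_task runs.
            return await self._pending[task_id]
        finally:
            self._pending.pop(task_id, None)


async def demo() -> str:
    broker = TaskBroker()
    broker.give_task("42")
    # Simulate the worker's "done" notification arriving a little later.
    asyncio.get_running_loop().call_later(0.05, broker.complete_task, "42", "done")
    return await broker.get_task_result("42")
```

If the "done" notification arrives on a different thread than the event loop, the set_result call would need to go through loop.call_soon_threadsafe instead of being made directly.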
I have a Google Cloud Function triggered by PubSub. The docs state that messages are acknowledged when the function ends successfully.
But randomly, the function retries (same execution ID) exactly 10 minutes after execution, which is the maximum PubSub ack deadline.
I also tried to get the message ID and acknowledge it programmatically in the function code, but the PubSub API responds that there is no message to ack with that ID.
In StackDriver monitoring, I see some messages not being acknowledged.
Here is my code : main.py
import base64
import logging
import traceback
from google.api_core import exceptions
from google.cloud import bigquery, error_reporting, firestore, pubsub
from sql_runner.runner import orchestrator

logging.getLogger().setLevel(logging.INFO)


def main(event, context):
    bigquery_client = bigquery.Client()
    firestore_client = firestore.Client()
    publisher_client = pubsub.PublisherClient()
    subscriber_client = pubsub.SubscriberClient()
    logging.info(
        'event=%s',
        event
    )
    logging.info(
        'context=%s',
        context
    )
    try:
        query_id = base64.b64decode(event.get('data',b'')).decode('utf-8')
        logging.info(
            'query_id=%s',
            query_id
        )
        # inject dependencies
        orchestrator(
            query_id,
            bigquery_client,
            firestore_client,
            publisher_client
        )
        sub_path = (context.resource['name']
                    .replace('topics', 'subscriptions')
                    .replace('function-sql-runner', 'gcf-sql-runner-europe-west1-function-sql-runner')
                    )
        # explicitly ack message to avoid duplicates invocations
        try:
            subscriber_client.acknowledge(
                sub_path,
                [context.event_id]  # message_id to ack
            )
            logging.warning(
                'message_id %s acknowledged (FORCED)',
                context.event_id
            )
        except exceptions.InvalidArgument as err:
            # google.api_core.exceptions.InvalidArgument: 400 You have passed an invalid ack ID to the service (ack_id=982967258971474).
            logging.info(
                'message_id %s already acknowledged',
                context.event_id
            )
            logging.debug(err)
    except Exception as err:
        # catch all exceptions and log to prevent cold boot
        # report with error_reporting
        error_reporting.Client().report_exception()
        logging.critical(
            'Internal error : %s -> %s',
            str(err),
            traceback.format_exc()
        )


if __name__ == '__main__':  # for testing
    from collections import namedtuple  # use namedtuple to avoid Class creation
    Context = namedtuple('Context', 'event_id resource')
    context = Context('666', {'name': 'projects/my-dev/topics/function-sql-runner'})
    script_to_start = b' '  # launch the 1st script
    script_to_start = b'060-cartes.sql'
    main(
        event={"data": base64.b64encode(script_to_start)},
        context=context
    )
Here is my code : runner.py
import logging
import os
from retry import retry

PROJECT_ID = os.getenv('GCLOUD_PROJECT') or 'my-dev'


def orchestrator(query_id, bigquery_client, firestore_client, publisher_client):
    """
    if query_id empty, start the first sql script
    else, call the given query_id.
    Anyway, call the next script.
    If the sql script is the last, no call
    retrieve SQL queries from FireStore
    run queries on BigQuery
    """
    docs_refs = [
        doc_ref.get() for doc_ref in
        firestore_client.collection(u'sql_scripts').list_documents()
    ]
    sorted_queries = sorted(docs_refs, key=lambda x: x.id)
    if not bool(query_id.strip()) :  # first execution
        current_index = 0
    else:
        # find the query to run
        query_ids = [ query_doc.id for query_doc in sorted_queries]
        current_index = query_ids.index(query_id)
    query_doc = sorted_queries[current_index]
    bigquery_client.query(
        query_doc.to_dict()['request'],  # sql query
    ).result()
    logging.info(
        'Query %s executed',
        query_doc.id
    )
    # exit if the current query is the last
    if len(sorted_queries) == current_index + 1:
        logging.info('All scripts were executed.')
        return
    next_query_id = sorted_queries[current_index+1].id.encode('utf-8')
    publish(publisher_client, next_query_id)


@retry(tries=5)
def publish(publisher_client, next_query_id):
    """
    send a message in pubsub to call the next query
    this mechanism allow to run one sql script per Function instance
    so as to not exceed the 9min deadline limit
    """
    logging.info('Calling next query %s', next_query_id)
    future = publisher_client.publish(
        topic='projects/{}/topics/function-sql-runner'.format(PROJECT_ID),
        data=next_query_id
    )
    # ensure publish is successfull
    message_id = future.result()
    logging.info('Published message_id = %s', message_id)
It looks like the PubSub message is not acked on success.
I do not think I have any background activity in my code.
My question: why is my function randomly retrying even on success?
Cloud Functions does not guarantee that your functions will run exactly once. According to the documentation, background functions, including pubsub functions, are given an at-least-once guarantee:
Background functions are invoked at least once. This is because of the
asynchronous nature of handling events, in which there is no caller
that waits for the response. The system might, in rare circumstances,
invoke a background function more than once in order to ensure
delivery of the event. If a background function invocation fails with
an error, it will not be invoked again unless retries on failure are
enabled for that function.
Your code will need to expect that it could possibly receive an event more than once. As such, your code should be idempotent:
To make sure that your function behaves correctly on retried execution
attempts, you should make it idempotent by implementing it so that an
event results in the desired results (and side effects) even if it is
delivered multiple times. In the case of HTTP functions, this also
means returning the desired value even if the caller retries calls to
the HTTP function endpoint. See Retrying Background Functions for more
information on how to make your function idempotent.
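As a rough illustration of what "idempotent" can mean for a function like the one in the question, the handler can remember which event IDs it has already processed and skip redeliveries. This is a sketch only: make_idempotent_handler and its parameters are hypothetical names, and the in-memory set is a stand-in, since separate function instances do not share memory and a real deployment would need a shared store keyed by event ID (an assumption, not something the quoted documentation prescribes).

```python
import base64
from typing import Callable, Set


def make_idempotent_handler(run_query: Callable[[str], None], seen: Set[str]):
    """Wrap run_query so that a redelivered event is skipped.

    `seen` is an in-memory stand-in for a shared store of processed event IDs.
    """
    def handler(event: dict, event_id: str) -> bool:
        if event_id in seen:
            return False  # duplicate delivery: the work was already done
        query_id = base64.b64decode(event.get("data", b"")).decode("utf-8")
        run_query(query_id)
        # Record the ID only after success, so a failed run can still be retried.
        seen.add(event_id)
        return True

    return handler
```

In the question's code, run_query would correspond to the orchestrator call and event_id to context.event_id; the second delivery of the same event then becomes a no-op instead of re-running the SQL script and publishing the next message again.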