I am trying to write an API that takes advantage of the Python wrapper for NI-DAQmx, and I need a global list of tasks that can be edited across the module.
Here is what I have tried so far:
1) Created an importable dictionary of tasks which is updated whenever a call is made to ni-daqmx. The function endpoint processes data from an HTTPS request, I promise it's not just a pointless wrapper around the ni-daqmx library itself.
e.g., on startup, the following is created:
#./daq/__init__.py
import nidaqmx
# ... other stuff ...#
TASKS = {}
then, the user can create a task by calling this endpoint
#./daq/task/task.py
from daq import TASKS
# ...
def api_create_task_endpoint(task_id):
    try:
        task = nidaqmx.Task(new_task_name=task_id)
        TASKS[task_id] = task
    except Exception:
        # handle it
        pass
Everything up to here works as it should. I can get the task list, and the task stays open. I also tried explicitly calling task.control(nidaqmx.constants.TaskMode.TASK_RESERVE), but the following code gives me the same issue no matter what.
When I try to add channels to the task, it closes at the end of the function call no matter how I set the state.
#./daq/task/channels.py
from daq import TASKS
def api_add_channel_task_endpoint(task_id, channel_type, function):
    # channel_type corresponds to ni-daqmx channel modules (e.g. ai_channels).
    # function corresponds to callable functions (e.g. add_ai_voltage_chan)
    # do some preliminary checks (e.g. task exists, channel type valid)
    channels = get_chans_from_json_post()
    with TASKS[task_id] as task:
        getattr(getattr(task, channel_type), function)(channels)
        # e.g. task.ai_channels.add_ai_voltage_chan("Dev1/ai0")
This is apparently closing the task. When I call api_create_task_endpoint(task_id) again, I receive the DaqResourceWarning that the task has been closed, and no longer exists.
I similarly tried setting the TaskMode using task.control here, to no avail.
I would like to be able to make edits to the task by storing it in the module-wide TASKS dict, but cannot keep the Task open long enough to do so.
2) I also tried implementing this using the NI-MAX save feature. The issue with this is that tasks cannot be saved unless they already contain channels, which I don't necessarily want to do immediately after creating the task.
I attempted to work around this by adding to the api_create_task_endpoint() some default behavior which just adds a random channel that is removed on the first channel added by the user.
The problem is, I can't find any documentation on how to remove channels from a task after adding them without a GUI (this is running on CentOS, so a GUI is a non-starter).
Thank you so much for any help!
I haven't used the Python bindings for NI-DAQmx, but
with TASKS[task_id] as task:
looks like it would stop and clear the task immediately after updating it because the program flow leaves the with block and Task.__exit__() executes.
Because you expect these tasks to live while the Python module is in use, my recommendation is to only use task.control() when you need to change a task's state.
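To make the effect of the with block concrete, here is a minimal sketch that runs without any NI hardware or drivers. FakeTask is a hypothetical stand-in for nidaqmx.Task (it only mimics the close-on-exit behaviour described above), and the helper names are made up for illustration.

```python
class FakeTask:
    """Hypothetical stand-in for nidaqmx.Task: it closes itself on __exit__."""

    def __init__(self, name):
        self.name = name
        self.channels = []
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

    def close(self):
        self.closed = True


TASKS = {}


def create_task(task_id):
    TASKS[task_id] = FakeTask(task_id)


def add_channel_with_block(task_id, channel):
    # Leaving the with block runs __exit__, which closes the task.
    with TASKS[task_id] as task:
        task.channels.append(channel)


def add_channel_plain(task_id, channel):
    # No with block: the task stays open until close_task() is called.
    TASKS[task_id].channels.append(channel)


def close_task(task_id):
    # Explicit teardown for when the task is really no longer needed.
    TASKS.pop(task_id).close()
```

With this stand-in, add_channel_plain leaves the stored task open, while add_channel_with_block leaves a closed task sitting in TASKS. The same reasoning suggests looking the task up in the dict directly in the real endpoint and closing it only in a dedicated teardown endpoint.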
I'm working with a Django application hosted on Heroku with the redistogo add-on (nano pack). I'm using rq to execute tasks in the background; the tasks are initiated by online users. I can't increase the number of connections, because my resources are limited.
I currently have a single worker running over 'n' queues. Each queue uses a connection from the connection pool to handle one of 'n' different types of task. For instance, if 4 users initiate the same type of task, I would like my main worker to create child processes dynamically to handle them. Is there a way to achieve the required multiprocessing and concurrency?
I tried the multiprocessing module, initially without introducing Lock(), but that overwrites the data the user passed to the initiating function with the previous request's data. After applying locks, the second user can no longer initiate a request: the server returns a 500 error.
github link #1: Looks like the team is working on the PR; not yet released though!
github link #2: This post helps to explain creating more workers at runtime.
This solution however also overrides the data. The new request is again processed with the previous requests data.
Let me know if you need to see some code. I'll try to post a minimal reproducible snippet.
Any thoughts/suggestions/guidelines?
Did you get a chance to try AutoWorker?
Spawn RQ Workers automatically.
from autoworker import AutoWorker
aw = AutoWorker(queue='high', max_procs=6)
aw.work()
It makes use of multiprocessing with StrictRedis from the redis module and the following imports from rq:
from rq.contrib.legacy import cleanup_ghosts
from rq.queue import Queue
from rq.worker import Worker, WorkerStatus
After looking under the hood, I realised the Worker class already implements multiprocessing.
The work function internally calls execute_job(job, queue), which in turn, as quoted in the module:
Spawns a work horse to perform the actual work and passes it a job.
The worker will wait for the work horse and make sure it executes within the given timeout bounds,
or will end the work horse with SIGALRM.
The execute_job() function calls fork_work_horse(job, queue), which spawns a work horse to perform the actual work and passes it a job, as per the following logic:
def fork_work_horse(self, job, queue):
    child_pid = os.fork()
    os.environ['RQ_WORKER_ID'] = self.name
    os.environ['RQ_JOB_ID'] = job.id
    if child_pid == 0:
        self.main_work_horse(job, queue)
    else:
        self._horse_pid = child_pid
        self.procline('Forked {0} at {1}'.format(child_pid, time.time()))
The main_work_horse makes an internal call to perform_job(job, queue) which makes a few other calls to actually perform the job.
All the steps of The Worker Lifecycle described on rq's official documentation page are taken care of within these calls.
It's not the multiprocessing I was expecting, but I guess they have their way of doing things. However, this still doesn't answer my original post, and I'm still not sure about concurrency.
The documentation there still needs to be worked upon, since it hardly covers the true essence of this library!
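For anyone unfamiliar with the fork pattern quoted above, here is a rough stdlib-only sketch of the same idea: the parent forks one child per job, the child does the work, and the parent waits for it. This is not rq's code; run_in_work_horse is a hypothetical name, the pipe is just one way to get a result back, and it assumes a POSIX system where os.fork() is available.

```python
import os


def run_in_work_horse(job, *args):
    """Run job(*args) in a forked child; return (exit_code, output_text)."""
    read_fd, write_fd = os.pipe()
    child_pid = os.fork()
    if child_pid == 0:
        # Child process ("work horse"): do the work, report, and exit.
        os.close(read_fd)
        try:
            os.write(write_fd, repr(job(*args)).encode())
            exit_code = 0
        except Exception:
            exit_code = 1
        # os._exit avoids running the parent's cleanup handlers in the child.
        os._exit(exit_code)
    # Parent process ("worker"): read the child's output, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        output = pipe.read().decode()
    _, wait_status = os.waitpid(child_pid, 0)
    return os.WEXITSTATUS(wait_status), output
```

Because the parent blocks in waitpid() until the child exits, a single worker written this way handles one job at a time. That matches the observation that the fork isolates each job rather than running jobs concurrently.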
I don't know if it's a dumb question, but I'm really struggling with solving this problem.
I'm coding with the obd library.
My problem is keeping my variables continuously updated.
For instance, one variable holds the current speed of the car.
This variable has to be updated every 1 or 2 seconds. To do this update I have to run these 2 lines of code:
cmd = obd.commands.RPM
rpm = connection.query(cmd)
but I also have to check the rpm variable in some while loops and if statements (in real time).
Is there any way to get this done (another class, a thread, or something)? It would really help me take a leap forward in my programming project.
Thanks :)
Use the Async interface instead of the standard OBD one:
Since the standard query() function is blocking, it can be a hazard for UI event loops. To deal with this, python-OBD has an Async connection object that can be used in place of the standard OBD object. Async is a subclass of OBD, and therefore inherits all of the standard methods. However, Async adds a few in order to control a threaded update loop. This loop will keep the values of your commands up to date with the vehicle. This way, when the user queries the car, the latest response is returned immediately.
The update loop is controlled by calling start() and stop(). To subscribe a command for updating, call watch() with your requested OBDCommand. Because the update loop is threaded, commands can only be watched while the loop is stopped.
import obd
connection = obd.Async() # same constructor as 'obd.OBD()'
connection.watch(obd.commands.RPM) # keep track of the RPM
connection.start() # start the async update loop
print(connection.query(obd.commands.RPM)) # non-blocking, returns immediately
http://python-obd.readthedocs.io/en/latest/Async%20Connections/
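If it helps to see the idea behind such an update loop, here is a small sketch using only the standard library. It is not python-OBD's implementation: LatestValuePoller and its methods are hypothetical names, and query_fn stands in for whatever blocking call reads the value from the car.

```python
import threading


class LatestValuePoller:
    """Polls query_fn on a background thread and remembers the last result."""

    def __init__(self, query_fn, interval=1.0):
        self._query_fn = query_fn
        self._interval = interval
        self._lock = threading.Lock()
        self._latest = None
        self._stop = threading.Event()
        self._first_update = threading.Event()
        self._thread = None

    def start(self):
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
            self._thread = None

    def _run(self):
        while not self._stop.is_set():
            value = self._query_fn()  # the slow, blocking read
            with self._lock:
                self._latest = value
            self._first_update.set()
            # Sleep for the interval, but wake up early if stop() is called.
            self._stop.wait(self._interval)

    def wait_for_first(self, timeout=None):
        """Block until at least one value has been read."""
        return self._first_update.wait(timeout)

    def latest(self):
        """Return the most recent value without blocking on the device."""
        with self._lock:
            return self._latest
```

The while loops and if statements in the main program would then call latest() whenever they need the value, instead of running a blocking query each time.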
Is there any way in Celery to automatically put a task into another queue if its execution fails?
For example, if the task is running in a queue x, on exception enqueue it to another queue named error_x.
Edit:
Currently I am using celery==3.0.13 along with django 1.4, Rabbitmq as broker.
Sometimes the task fails. Is there a way in Celery to add messages to an error queue and process them later?
The problem when a Celery task fails is that I don't have access to the message queue name, so I can't use self.retry to put it on a different error queue.
Well, you cannot use the retry mechanism if you want to route the task to another queue. From the docs:
retry() can be used to re-execute the task, for example in the event
of recoverable errors.
When you call retry it will send a new message, using the same
task-id, and it will take care to make sure the message is delivered
to the same queue as the originating task.
You'll have to relaunch the task yourself and route it manually to the queue you want whenever an exception is raised. This seems like a good job for error callbacks.
The main issue is that we need to get the task name in the error callback to be able to launch it. Also we may not want to add the callback each time we launch a task. Thus a decorator would be a good way to automatically add the right callback.
from functools import partial, wraps
import celery
@celery.shared_task
def error_callback(task_id, task_name, retry_queue, retry_routing_key):
    # We must retrieve the task object itself.
    # `tasks` is a dict of 'task_name': celery_task_object
    task = celery.current_app.tasks[task_name]
    # Relaunch the task in the specified queue.
    task.apply_async(queue=retry_queue, routing_key=retry_routing_key)
def retrying_task(retry_queue, retry_routing_key):
    """Decorates function to automatically add error callbacks."""
    def retrying_decorator(func):
        @celery.shared_task
        @wraps(func)  # just to keep the original task name
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        # Monkey patch the apply_async method to add the callback.
        wrapper.apply_async = partial(
            wrapper.apply_async,
            link_error=error_callback.s(wrapper.name, retry_queue, retry_routing_key)
        )
        return wrapper
    return retrying_decorator
# Usage:
@retrying_task(retry_queue='another_queue', retry_routing_key='another_routing_key')
def failing_task():
    print('Hi, I will fail!')
    raise Exception("I'm failing!")
failing_task.apply_async()
You can adjust the decorator to pass whatever parameters you need.
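To show the routing idea on its own, without a broker, here is a plain-Python sketch that runs with the standard library only. It is not Celery code: ERROR_QUEUES, route_failures_to and drain are hypothetical names, and queue.Queue merely stands in for a real error queue such as error_x. Unlike the callback above, it also records the call's arguments so the failed call could be replayed.

```python
import queue
from functools import wraps

# Hypothetical in-process stand-ins for broker queues, keyed by queue name.
ERROR_QUEUES = {}


def route_failures_to(error_queue_name):
    """Decorate a function so a failed call is recorded on an error queue."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                error_queue = ERROR_QUEUES.setdefault(
                    error_queue_name, queue.Queue()
                )
                # Keep the arguments so the call can be replayed later.
                error_queue.put((func.__name__, args, kwargs, repr(exc)))
                return None
        return wrapper
    return decorator


@route_failures_to("error_x")
def failing_task(value):
    raise ValueError("I'm failing!")


def drain(error_queue_name):
    """Return every pending failure record from the named error queue."""
    error_queue = ERROR_QUEUES.get(error_queue_name)
    records = []
    while error_queue is not None and not error_queue.empty():
        records.append(error_queue.get_nowait())
    return records
```

Calling failing_task(42) returns None and leaves one record on "error_x"; a separate consumer can call drain("error_x") later and decide what to do with each failed call.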
I had a similar problem and solved it, maybe not in the most efficient way, as follows:
I created a Django model that keeps all my Celery task IDs and can check each task's state.
Then I created another Celery task that runs in an infinite loop, checks the actual state of all tasks marked 'RUNNING', and reruns any whose state is 'FAILED'. I'm not actually changing the queue for the tasks I rerun, but I think you could implement some custom logic to decide where to put each task you rerun this way.
I'm sorry if this question has in fact been asked before. I've searched around quite a bit and found pieces of information here and there but nothing that completely helps me.
I am building an app on Google App engine in python, that lets a user upload a file, which is then being processed by a piece of python code, and then resulting processed file gets sent back to the user in an email.
At first I used a deferred task for this, which worked great. Over time I've come to realize that, since the processing can take more than the 10 minutes I have before I hit the DeadlineExceededError, I need to be more clever.
I therefore started to look into task queues, wanting to make a queue that processes the file in chunks and then pieces everything together at the end.
My present code for making the single deferred task looks like this:
_=deferred.defer(transform_function,filename,from,to,email)
so that the transform_function code gets the values of filename, from, to and email and sets off to do the processing.
Could someone please enlighten me as to how I turn this into a linear chain of tasks that get acted on one after the other? I have read all documentation on Google app engine that I can think about, but they are unfortunately not written in enough detail in terms of actual pieces of code.
I see references to things like:
taskqueue.add(url='/worker', params={'key': key})
but since I don't have a url for my task, but rather a transform_function() implemented elsewhere, I don't see how this applies to me…
Many thanks!
You can just keep calling deferred to run your task when you get to the end of each phase.
Other queues just allow you to control the scheduling and rate, but work the same.
I track the elapsed time in the task, and when I get near the end of the processing window the code stops what it is doing and calls defer, either for the next task in the chain or to continue where it left off, depending on whether the work is a discrete set of steps or a continuous chunk of work. This was all written back when tasks could only run for 60 seconds.
However, the problem you will face (whether it's a normal task queue or deferred) is that each stage could fail for some reason and then be re-run, so each phase must be idempotent.
For long-running chained tasks, I construct an entity in the datastore that holds the description of the work to be done and tracks the processing state of the job. You can then keep rerunning the same task until completion, at which point it marks the job as complete.
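As a rough illustration of that approach, here is a sketch that runs outside App Engine. The state dict stands in for the datastore entity, and the reschedule callable stands in for a deferred.defer call; run_job_step and both of those parameters are hypothetical names, not App Engine APIs.

```python
import time


def run_job_step(state, chunks, handle_chunk, budget_seconds, reschedule):
    """Process chunks until done or until the time budget runs out.

    `state` plays the role of the stored job record: it remembers which
    chunk comes next, so re-running the step after a failure or a
    reschedule resumes instead of starting over.
    Returns True if the job finished, False if it was rescheduled.
    """
    deadline = time.monotonic() + budget_seconds
    while state["next_chunk"] < len(chunks):
        if time.monotonic() >= deadline:
            # Out of time: hand the remaining work to a follow-up task.
            reschedule(state)
            return False
        handle_chunk(chunks[state["next_chunk"]])
        # Record progress only after the chunk has been handled.
        state["next_chunk"] += 1
    state["done"] = True
    return True
```

Note that a chunk can still be handled twice if the task dies after handle_chunk but before the progress is recorded, which is why each phase needs to be idempotent.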
To avoid the 10-minute timeout you can direct the request to a backend or a B-type module using the "_target" param.
BTW, is there any reason you need to process the chunks sequentially? If all you need is some notification upon completion of all chunks (so you can "piece everything together at the end"), you can implement it in various ways. For example, each deferred task for a chunk can decrement a shared datastore counter that was initialized with the number of chunks (read the state, decrement, and update, all in the same transaction). If the datastore update was successful and the counter has reached zero, you can proceed with combining all the pieces together. An alternative to using deferred that would simplify the suggested workflow is pipelines (https://code.google.com/p/appengine-pipeline/wiki/GettingStarted).
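The counter idea can be sketched in a few lines of plain Python. This is only an analogy: ChunkCounter is a hypothetical class, and the threading.Lock stands in for the datastore transaction that makes the read, decrement and update atomic.

```python
import threading


class ChunkCounter:
    """Counts down outstanding chunks; exactly one caller sees zero."""

    def __init__(self, total_chunks):
        self._remaining = total_chunks
        self._lock = threading.Lock()  # stand-in for a datastore transaction

    def chunk_done(self):
        """Record one finished chunk; return True if it was the last one."""
        with self._lock:
            self._remaining -= 1
            return self._remaining == 0
```

Each chunk task calls chunk_done() when it finishes, and only the task that gets True back goes on to combine the pieces, so the chunks themselves can run in any order or in parallel.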
I am trying to understand a simple python proxy example using Twisted located here. The proxy instantiates a Server Class, which in turn instantiates a client class. defer.DeferredQueue() is used to pass data from client class to server class.
I am now trying to understand how defer.DeferredQueue() works in this example. For example what is the significance of this statement:
self.srv_queue.get().addCallback(self.clientDataReceived)
and its analogous
self.cli_queue.get().addCallback(self.serverDataReceived)
statement.
What happens when self.cli_queue.put(False) or self.cli_queue = None is executed?
I'm just trying to get to grips with Twisted now, so things seem pretty daunting. A small explanation of how things are connected would make it far easier to get to grips with this.
According to the documentation, DeferredQueue has a normal put method to add an object to the queue and a deferred get method.
The get method returns a Deferred object. You add a callback (e.g. serverDataReceived) to that object. As soon as an object is available in the queue, the Deferred invokes the callback, passing the object as its argument. If the queue is empty, your program does not wait: it carries on with the next statements, and the callback is called later, once a new object is put in the queue.
In other words, it is an asynchronous flow, in contrast to a synchronous model in which you might have a BlockingQueue, i.e. your program would wait until the next object is available in the queue before continuing.
In your example program, self.cli_queue.put(False) adds a False object to the queue. It is a sort of flag telling the ProxyClient that no more data will be added to the queue, so it should disconnect the remote connection. You can refer to this portion of the code:
def serverDataReceived(self, chunk):
    if chunk is False:
        self.cli_queue = None
        log.msg("Client: disconnecting from peer")
        self.factory.continueTrying = False
        self.transport.loseConnection()
Setting cli_queue = None just discards the queue after the connection is closed.
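To see the get()/addCallback() behaviour in isolation, here is a toy model that runs without Twisted installed. ToyDeferred and ToyDeferredQueue are hypothetical, heavily simplified classes written only for this illustration; they are not Twisted's implementation and leave out errbacks, size limits and everything else.

```python
from collections import deque


class ToyDeferred:
    """A result that may arrive later; callbacks run once it does."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def addCallback(self, fn):
        if self._fired:
            # The result is already here, so run the callback right away.
            self._result = fn(self._result)
        else:
            self._callbacks.append(fn)
        return self

    def callback(self, result):
        self._fired = True
        self._result = result
        while self._callbacks:
            self._result = self._callbacks.pop(0)(self._result)


class ToyDeferredQueue:
    """put() stores an item or hands it straight to a waiting get()."""

    def __init__(self):
        self._items = deque()
        self._waiting = deque()

    def put(self, item):
        if self._waiting:
            self._waiting.popleft().callback(item)
        else:
            self._items.append(item)

    def get(self):
        deferred = ToyDeferred()
        if self._items:
            deferred.callback(self._items.popleft())
        else:
            self._waiting.append(deferred)
        return deferred
```

In this model, queue.get().addCallback(handler) returns immediately when the queue is empty, and handler runs only when a later put() supplies an item. Each get() delivers a single item, so a handler that wants the next item has to call get() again.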