Is it possible to have a server-side program that queues and manages processes executed at the command line?
The project I am working on takes an image from the user, modifies it, and then applies it as a texture to a 3D shape. The 3D scene is generated by Blender/Cinema 4D at the command line, which outputs it as an image. It is this process that needs to be queued or somehow managed by the server-side program. The end result sent back to the user is a video containing an animated 3D shape with their image applied to it as a texture.
These renders may take a while (or they may not), but how can I ensure that they are executed at the right times and in a queued manner?
This will preferably be done in Python.
Without more details about how/why you're queuing (can you only run so many at a time? do things need to be done in a particular order?), it's hard to suggest a specific solution. However, the basic answer for any situation is to use the subprocess module to fire off the processes; you can then watch them (using the tools that module provides) to wait until they're complete and then execute the next one in the queue, as in the sketch below.
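For example, a single worker thread pulling render jobs off a queue keeps them strictly sequential. This is only a minimal sketch: the Blender command line, file names and output path are placeholders for whatever your actual render invocation looks like.

import queue
import subprocess
import threading

render_jobs = queue.Queue()

def render_worker():
    while True:
        scene_file, output_path = render_jobs.get()
        try:
            # Blocks until this render finishes before picking up the next job.
            # Placeholder Blender invocation; swap in your real scene/arguments.
            subprocess.run(
                ["blender", "--background", scene_file,
                 "--render-output", output_path, "--render-anim"],
                check=True)
        except subprocess.CalledProcessError as err:
            print("render failed:", err)
        finally:
            render_jobs.task_done()

threading.Thread(target=render_worker, daemon=True).start()

# The server-side code just enqueues work and returns immediately:
render_jobs.put(("scene_with_user_texture.blend", "/tmp/render_####"))

If you later need several renders running in parallel, you can simply start more than one worker thread reading from the same queue.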
I work with Python in Houdini, but bear with me, as my question might not be restricted only to the Houdini environment.
Basically, Houdini comes with a "Hython" process that is supposed to be a Python shell (see the end of this doc: https://www.sidefx.com/docs/houdini/hom/commandline.html).
At some point, Houdini can start multiple of these "hython.exe" processes without the shell interface; they are automatically managed in the background by a server that dispatches instructions to them.
There is no explicit way to directly communicate with them using the API, but I want to be able to do exactly this.
Is there a way to send commands to a running process (that is supposed to support them)?
Very long-shot part of my question: I'm trying to communicate with the hython processes because the server never tells them to flush their cache when a full set of computations has finished, leading to inflated memory consumption after a while and the need to stop and restart new hythons. SideFX has already acknowledged that there is nothing in the API to request a cache flush (see here: https://www.sidefx.com/forum/topic/81712/).
I tried to embed some naive "clear output" calls in the node processes I send to the hythons, to no avail.
"Unload" tag on nodes does nothing.
Node deletion doesn't understand context and will delete my project nodes before they are sent to a hython.
I am currently working on code that will continuously plot data retrieved via serial communication, while also allowing for user input (in the form of raw_input) to control the job, such as starting/stopping/clearing the plot and setting the save file names for the data. Currently, I'm trying to do this with an extra thread that just reads user input and relays it to the program while it continuously plots and saves the data.
Unfortunately, I have run into errors where commands entered during the plotting loop freeze the program for two minutes or so, which I believe has to do with matplotlib not being thread-safe; a command entered while the loop is not working with the plotting libraries gets a response in 1-2 seconds.
I have attempted switching from threading to the multiprocessing library to try to alleviate the problem, to no avail: the program will not show a plot, leading me to believe the plotting process never starts (the plotting command is the first thing it does). I can post the code, or the relevant parts of either program, if necessary.
I wanted to know if there was any way around these issues, or if I should start rethinking how I want to program this. Any suggestions on different ways of incorporating user input are welcome too.
Thanks
If matplotlib is not thread-safe, the right thing to do is to serialize all the inputs to matplotlib through a single event queue. The matplotlib thread can retrieve items off the queue until the queue is empty, and then process all the new parameters.
Your serial communication code and your raw_input should simply put data on this queue, and not attempt direct communication with matplotlib.
Your matplotlib thread will be doing one of three things:
(1) waiting for new data;
(2) retrieving new data and processing it (e.g. appending it to the arrays to be plotted or changing output file names), staying in this state as long as the queue is not empty, and moving on to state (3) when it is; or
(3) invoking matplotlib to do the plotting, then looping back to state (1).
If you are implementing multiple action commands from your raw_input, you can add some auxiliary state variables. For example, if 'stop' is read from the queue, then you would set a variable that would cause state (3) to skip the plotting and go straight to state (1), and if 'start' is read from the queue, you would reset this variable, and would resume plotting when data is received.
You might think you want to do something fancy like: "if I see data, wait to make sure more is not coming before I start to plot." This is usually a mistake. You would have to tune your wait time very carefully, and then would still find times when your plotting never happened because of the timing of the input data. If you have received data (you are in state 2), and the queue is empty, just plot! In the time taken to do that, if 4 more data points come in, then you'll plot 4 more next time...
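Putting the three states and a start/stop flag together, a rough sketch of the plotting thread's loop might look like this. The message formats, (x, y) tuples plus the strings "start" and "stop", are just an example of what your serial and input threads could put on the queue.

import queue
import matplotlib.pyplot as plt

events = queue.Queue()   # the serial thread and the input thread both put() here
xs, ys = [], []
plotting_enabled = True

plt.ion()
fig, ax = plt.subplots()
line, = ax.plot(xs, ys)

while True:
    try:
        item = events.get(timeout=0.1)        # state (1): wait for new data
    except queue.Empty:
        fig.canvas.flush_events()             # keep the GUI responsive while idle
        continue
    while True:                               # state (2): drain everything queued so far
        if item == "stop":
            plotting_enabled = False
        elif item == "start":
            plotting_enabled = True
        else:
            x, y = item                       # example data format: (x, y) tuples
            xs.append(x)
            ys.append(y)
        try:
            item = events.get_nowait()
        except queue.Empty:
            break
    if plotting_enabled:                      # state (3): plot, then back to (1)
        line.set_data(xs, ys)
        ax.relim()
        ax.autoscale_view()
        fig.canvas.draw_idle()
        fig.canvas.flush_events()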
I have an inefficient simulation running (it has been running for ~24 hours).
It can be split into 3 independent parts, so I would like to cancel the simulation, and start a more efficient one, but still recover the data that has already been calculated for the first part.
When an error happens in a program, for example, you can still access the data that the script was working with, and examine it to see where things went wrong.
Is there a way to kill the process manually without losing the data?
You could start a debugger such as winpdb (or any of several IDE debuggers) in a separate session and attach it to the running process (this halts it). Set a breakpoint in a section of the code that has access to your data, resume until you reach the breakpoint, and then save your data to a file. Your new process can then load that data as a starting point.
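Once you are paused at the breakpoint with access to the objects you care about, the quickest way to get them out is usually to pickle them to disk. A minimal sketch, where the variable and file names are hypothetical:

import pickle

# Run at the breakpoint, in the paused process:
with open("part1_results.pkl", "wb") as f:
    pickle.dump(part1_results, f)

# Then, in the new and more efficient simulation:
with open("part1_results.pkl", "rb") as f:
    part1_results = pickle.load(f)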
I'm using gstreamer to stream audio over the network. My goal is seemingly simple: Prebuffer the incoming stream up to a certain time/byte threshold and then start playing it.
I might be overlooking a really simple feature of gstreamer, but so far, I haven't been able to find a way to do that.
My (simplified) pipeline looks like this: udpsrc -> alsasink. So far, all my attempts at achieving my goal have revolved around a queue element in between:
1. Add a queue element between udpsrc and alsasink.
2. Use the queue's min-threshold-time property. This actually works, but the problem is that it makes all incoming data spend the specified minimum amount of time in the queue, rather than just the beginning, which is not what I want.
3. To work around the previous problem, I tried to have the queue notify my code when data enters the audio sink for the first time, thinking that this is the time to unset the min-threshold-time property I set earlier, thus achieving the "prebuffering" behavior.
Here is a rough equivalent of the code I tried:
def remove_thresh(pad, info, queue):
    pad.remove_data_probe(probe_id)
    queue.set_property("min-threshold-time", 0)

queue.set_property("min-threshold-time", delay)
queue.set_property("max-size-time", delay * 2)
probe_id = audiosink.get_pad("sink").add_data_probe(remove_thresh, queue)
This doesn't work, for two reasons:
1. My callback gets called way earlier than the delay variable I provided would suggest.
2. After it gets called, all of the data that was stored in the queue is lost; playback starts as if the queue weren't there at all.
I think I have a fundamental misunderstanding of how this thing works. Does anyone know what I'm doing wrong, or alternatively, can provide a (possibly) better way to do this?
I'm using python here, but any solution in any language is welcome.
Thanks.
Buffering has already been implemented in GStreamer. Some elements, like the queue, are capable of building this buffer and posting bus messages regarding the buffer level (the state of the queue).
An application wanting to have more network resilience, then, should listen to these messages and pause playback if the buffer level is not high enough (usually, whenever it is below 100%).
So, all you have to do is set the pipeline to the PAUSED state while the queue is buffering. In your case, you only want to buffer once, so use any logic for this (maybe set flag variables to pause the pipeline only the first time).
Set the "max-size-bytes" property of the queue to the value you want.
Either listen to the "overrun" signal to notify you when the buffer becomes full, or use gst_message_parse_buffering() to find the buffering level.
Once your buffer is full, set the pipeline to PLAYING state and then ignore all further buffering messages.
Finally, for a complete streaming example, you can refer to this tutorial: https://gstreamer.freedesktop.org/documentation/tutorials/basic/streaming.html
The code is in C, but the walkthrough should help you with what you want.
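For reference, here is a rough Python sketch of that flow, assuming the GStreamer 1.0 bindings. Note that it swaps in a queue2 element with use-buffering enabled (queue2 is what posts the buffering messages); the port, buffer size and missing udpsrc caps are placeholders, as in your simplified pipeline.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)
# Placeholder pipeline: udpsrc caps omitted for brevity, values are arbitrary.
pipeline = Gst.parse_launch(
    "udpsrc port=5000 ! queue2 use-buffering=true max-size-bytes=65536 ! alsasink")

prebuffered = False  # only prebuffer the first time

def on_message(bus, message):
    global prebuffered
    if message.type == Gst.MessageType.BUFFERING and not prebuffered:
        percent = message.parse_buffering()
        if percent < 100:
            pipeline.set_state(Gst.State.PAUSED)    # keep filling the buffer
        else:
            prebuffered = True
            pipeline.set_state(Gst.State.PLAYING)   # prebuffer done, start playing

bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect("message", on_message)

pipeline.set_state(Gst.State.PAUSED)
GLib.MainLoop().run()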
I was having the exact same problems as you with a different pipeline (appsrc), and after spending days trying to find an elegant solution (and ending up with code remarkably similar to what you posted)... all I did was switch the is-live flag to False and the buffering worked automagically (no need for min-threshold-time or anything else).
Hope this helps.
I'm trying to do some machinery automation with python, but I've run into a problem.
I have code that does the actual control, code that logs, code that provides a GUI, and some other modules, all being called from a single script.
The issue is that an error in one module halts all the others. So, for instance, a bug in the GUI will kill the control systems.
I want to be able to have the modules run independently, so one can crash, be restarted, be patched, etc without halting the others.
The only way I can find to make that work is to store the variables in an SQL database, or in files, or something similar.
Is there a way for one Python script to sort of... debug another, so that one script can read or change the variables in the other? I can't find a way to do that which also allows the scripts to be started and stopped independently.
Does anyone have any ideas or advice?
A fairly effective way to do this is to use message passing. Each of your modules is independent, but they can exchange messages with each other. A very good reference on the many ways to achieve this in Python is the Python wiki page for parallel processing.
A generic strategy
Split your program into pieces where there are servers and clients. You could then use middleware such as 0MQ, Apache ActiveMQ or RabbitMQ to send data between different parts of the system.
In this case, your GUI could send a message to the log parser server telling it to begin work. Once it's done, the log parser will send a broadcast message to anyone interested, announcing a reference to the results. The GUI could be a subscriber to the channel that the log parser publishes to. Once it receives the message, it will open up the results file and display whatever the user is interested in.
Serialization and deserialization speed is important also. You want to minimise the overhead for communicating. Google Protocol Buffers and Apache Thrift are effective tools here.
You will also need some form of supervision strategy to prevent a failure in one of the servers from blocking everything. supervisord will restart things for you and is quite easy to configure. Again, it is only one of many options in this space.
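As a concrete sketch of the publish/subscribe piece of this strategy, here is what the log parser and GUI sides might look like with pyzmq. The port, topic name and payload are invented, and each function would live in its own process.

import zmq

def log_parser_broadcast(results_path):
    # Runs inside the log parser process: announce where the results live.
    ctx = zmq.Context.instance()
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://127.0.0.1:5556")
    # Note: with PUB/SUB, subscribers only see messages sent after they connect,
    # so in practice the GUI should already be subscribed before this runs.
    pub.send_multipart([b"results", results_path.encode()])

def gui_wait_for_results():
    # Runs inside the GUI process: block until a results announcement arrives.
    ctx = zmq.Context.instance()
    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://127.0.0.1:5556")
    sub.setsockopt(zmq.SUBSCRIBE, b"results")
    topic, path = sub.recv_multipart()
    return path.decode()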
Overkill much?
It sounds like you have created a simple utility. The multiprocessing module is an excellent way to have different bits of the program running fairly independently. You still apply the same strategy (message passing, no shared state, supervision), but with different tactics, as sketched below.
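A minimal sketch of the multiprocessing flavour, keeping the same message-passing discipline; the message contents here are made up.

import multiprocessing as mp

def control_loop(inbox):
    # The control module: reads commands off its queue and acts on them.
    while True:
        msg = inbox.get()
        if msg == "shutdown":
            break
        print("control received:", msg)

if __name__ == "__main__":
    inbox = mp.Queue()
    control = mp.Process(target=control_loop, args=(inbox,))
    control.start()

    # The GUI (or any other module) only ever puts messages on the queue;
    # nothing shares state with the control process directly.
    inbox.put({"command": "set_speed", "value": 42})
    inbox.put("shutdown")
    control.join()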
You want multiple independent processes, and you want them to talk to each other. Hence: read up on what methods of inter-process communication are available on your OS. I recommend sockets (generic, they will work over a network and between different OSes). You can easily invent a simple (maybe HTTP-like) protocol on top of TCP, perhaps with JSON for the messages. There is a bunch of classes shipped with the Python distribution to make this easy (SocketServer.ThreadingMixIn, SocketServer.TCPServer, etc.).
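For example, a threaded TCP server exchanging one JSON message per line might look like this rough sketch (using the Python 3 spelling, socketserver; the port and message fields are arbitrary).

import json
import socket
import socketserver
import threading

class MessageHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One JSON object per line; reply with a simple acknowledgement.
        for raw in self.rfile:
            msg = json.loads(raw)
            print("received:", msg)
            self.wfile.write(b'{"status": "ok"}\n')

if __name__ == "__main__":
    server = socketserver.ThreadingTCPServer(("localhost", 9000), MessageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # A client (normally a separate script/process) sends a command and reads the reply.
    with socket.create_connection(("localhost", 9000)) as sock:
        sock.sendall(json.dumps({"command": "start_log"}).encode() + b"\n")
        print(sock.makefile().readline())

    server.shutdown()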