There are lots of different modules for threading/parallelizing Python code; Dispy and pp/Parallel Python seem especially popular. These all appear to be designed for a single machine (e.g. a desktop) with many cores/processors. Is there a module that works on massively parallel architectures managed by queueing systems (specifically: SLURM)?
The most widely used parallel framework on large compute clusters for scientific/technical applications is MPI. The corresponding Python bindings are provided by the mpi4py package.
MPI offers a high-level API for building parallel software around message passing over the network: remote process creation, data scatter/gather, reductions, and so on. Implementations can take advantage of fast, low-latency interconnects when present, and MPI integrates with the common cluster resource managers, including Slurm.
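For a sense of how this looks in practice, here is a minimal mpi4py sketch of a scatter/reduce pattern; the script name and the toy workload are purely illustrative:

```python
# scatter_sum.py: minimal mpi4py sketch (script name and workload are toy examples)
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # The root rank builds the full workload and splits it into one chunk per rank.
    data = np.arange(size * 4, dtype="float64")
    chunks = np.array_split(data, size)
else:
    chunks = None

# Each rank receives its own chunk and works on it locally...
chunk = comm.scatter(chunks, root=0)
partial = chunk.sum()

# ...and the partial results are reduced back onto the root rank.
total = comm.reduce(partial, op=MPI.SUM, root=0)
if rank == 0:
    print("total:", total)
```

Under Slurm, such a script is typically launched with something like `srun -n 16 python scatter_sum.py` inside an sbatch allocation (or with `mpirun`/`mpiexec`, depending on how MPI is set up on the cluster).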
Via the ParallelPython main page:
"PP is a python module which provides mechanism for parallel execution of python code on SMP (systems with multiple processors or cores) and clusters (computers connected via network)."
I'm trying to get a feel for how MLRun executes my Python code. What different runtimes are supported and why would I use one vs the other?
MLRun has several different ways to run a piece of code. At this time, the following runtimes are supported:
Batch runtimes
local - execute a Python or shell program in your local environment (e.g. Jupyter, an IDE, etc.)
job - run the code in a Kubernetes Pod
dask - run the code as a Dask Distributed job (over Kubernetes)
mpijob - run distributed jobs and Horovod over the MPI job operator, used mainly for deep learning jobs
spark - run the job as a Spark job (using Spark Kubernetes Operator)
remote-spark - run the job on a remote Spark service/cluster (e.g. Iguazio Spark service)
Real-time runtimes
nuclio - real-time serverless functions over Nuclio
serving - higher level real-time Graph (DAG) over one or more Nuclio functions
If you are interested in learning more about each runtime, see the documentation.
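As a rough sketch of how a runtime is selected in code, the `kind` argument of `mlrun.code_to_function` picks one of the runtimes above; the file name, handler and parameters below are illustrative, and exact arguments may vary between MLRun versions:

```python
# Rough sketch: the runtime is chosen via the `kind` argument of
# mlrun.code_to_function; exact arguments may vary between MLRun versions.
import mlrun

# "handler.py" and its `handler` function are illustrative names.
fn = mlrun.code_to_function(
    name="my-trainer",
    filename="handler.py",
    kind="job",              # or "dask", "mpijob", "spark", "nuclio", ...
    image="mlrun/mlrun",
    handler="handler",
)

# Run the same code locally for debugging, or as a Kubernetes job
# when local=False (the default).
fn.run(local=True, params={"alpha": 0.1})
```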
I am currently planning a sizeable ROS project which will contain upwards of 15 nodes, developed in either Python 2.x or C++, talking to each other. We will try to isolate different tasks as individual nodes to guarantee unit-testability and modularity and to improve reusability for future projects.
The question is whether creating a Docker container for each individual node is worth the effort, and whether there are any downsides. The big upside would be that we could have vastly different systems, from developer PCs to build machines to the devices the nodes are deployed to; Docker keeps dependencies and environment configuration in one manageable place and removes many hurdles.
But are there any significant drawbacks to that idea? Are there significant performance hits or other obstacles that Docker introduces in such a scenario?
Creating isolated ROS nodes in Docker containers is not particularly complicated and has been done before:
How can I run two ROS2 nodes each in a separate docker container?
Running 2 nodes in 2 separate docker containers
Since the effort is modest, I think the improvements in testing, reusability, integration, delivery, and so on clearly justify the approach if those points are in scope for your project.
Since CPU performance is essentially unaffected by Docker, you should take a closer look at communication between ROS nodes, i.e. passing messages between your containerized nodes. Here are two drawbacks you should know about:
Socket communication:
As described in Networking with Docker: Don't settle for the defaults, network access in Docker goes through additional virtual interfaces, which affects the TCP transmission of ROS messages. However, the performance analysis in Docker network performance shows only a small impact.
Nodelet usage:
Because nodelets are loaded into an existing host ROS node, communication between them can be improved immensely (messages are passed in-process). With each node in its own Docker container, nodelets cannot be used across containers.
All in all, you have to consider which messages, and how many, are transmitted between your ROS nodes. If many images or other large messages are transmitted, further analysis is absolutely advisable. Apart from slightly more complicated handling of your nodes, I can't find any notable points that speak against running ROS nodes in separate Docker containers.
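If it helps to visualize the setup, here is a hedged sketch using the Docker SDK for Python (docker-py) to start two ROS containers on a shared user-defined network; the image tag, node commands, names and ROS_MASTER_URI are assumptions, not a tested recipe:

```python
# Illustrative sketch using the Docker SDK for Python (docker-py).
# The image tag, commands, names and ROS_MASTER_URI below are assumptions.
import docker

client = docker.from_env()

# A user-defined bridge network lets containers resolve each other by name.
client.networks.create("ros_net", driver="bridge")

# One container for the ROS master...
client.containers.run(
    "ros:noetic", "roscore",
    name="ros_master", network="ros_net", detach=True,
)

# ...and one per node, pointed at the master via ROS_MASTER_URI.
client.containers.run(
    "ros:noetic", "rosrun rospy_tutorials talker",
    name="talker", network="ros_net", detach=True,
    environment={"ROS_MASTER_URI": "http://ros_master:11311"},
)
```

In practice a docker-compose file expresses the same thing declaratively, which tends to be easier to maintain once there are 15+ nodes.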
I have a distributed system to test (e.g. Hadoop), so my test cluster has 10 to 20 machines. We have developed a test automation suite which we trigger from outside the test cluster. Because the system is remote, we cannot leverage Python modules other than 'paramiko' for remote calls; as a result, we always issue Linux commands as part of test execution.
What should I do to leverage Python modules on the remote machines?
Is Python not meant for distributed systems?
paramiko by itself won't help you use the Python modules present on the remote machines.
Python is definitely suited to distributed systems. One solution I can think of is to use a Remote Procedure Call (RPC) mechanism.
There are several Python modules with which you can achieve RPC:
XML-RPC (built into the standard library)
RPyC
Pyro
Stack Overflow link with multiple Python RPC solutions.
Some links comparing the three:
Pyro and RPyC
XML-RPC vs Pyro
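Of the three, XML-RPC ships with the standard library, so a small agent can be started on each cluster machine with no extra dependencies. A minimal sketch (the host name, port and example function are placeholders); the server side runs on a remote machine:

```python
# Runs on the remote machine: expose a Python function over XML-RPC.
from xmlrpc.server import SimpleXMLRPCServer
import os

def disk_usage(path):
    """Example of a call that uses Python modules on the remote machine."""
    stat = os.statvfs(path)
    return stat.f_frsize * stat.f_bavail

server = SimpleXMLRPCServer(("0.0.0.0", 8000), allow_none=True)
server.register_function(disk_usage)
server.serve_forever()
```

The test controller then calls it as if it were a local function:

```python
# Runs on the machine driving the tests.
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://remote-host:8000/")
print(proxy.disk_usage("/"))
```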
I need to run some NumPy computations on 5000 files in parallel using Python. I already have the sequential single-machine version implemented. What would be the easiest way to run the code in parallel (say, using an EC2 cluster)? Should I write my own task scheduler and job-distribution code?
You can have a look at the pscheduler Python module. It allows you to queue up your jobs and work through them, with the number of concurrent processes depending on the available CPU cores. The program can also scale up and submit your jobs to remote machines, but that requires all your remote machines to share an NFS mount.
I'll be happy to help you further.
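If a single multi-core machine gets you far enough before reaching for a cluster, the standard library already covers the queueing part. A minimal `multiprocessing` sketch (the glob pattern and the body of `process_file` are placeholders for your own computation):

```python
# Minimal single-machine alternative using only the standard library;
# process_file and the glob pattern are placeholders for your own code.
import glob
from multiprocessing import Pool

import numpy as np

def process_file(path):
    data = np.load(path)              # placeholder: load one input file
    return path, float(data.sum())    # placeholder: the actual computation

if __name__ == "__main__":
    files = glob.glob("data/*.npy")
    with Pool() as pool:              # defaults to one worker per CPU core
        for path, result in pool.imap_unordered(process_file, files):
            print(path, result)
```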
I have a Python program that performs several independent and time-consuming processes. The Python code is essentially an automator that calls into several batch files via popen.
The program currently takes several hours, so I'd like to split the work across multiple machines. How can I split tasks and process them in parallel with Python over an intranet?
There are many Python parallelisation frameworks out there. Just two of the options:
The parallel computing facilities of IPython
The parallelisation framework jug
For the remote execution you could use execnet. Do you have to distribute the data too?
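A minimal execnet sketch for running one of those batch files on a remote machine over SSH (the host, Python path and command are placeholders):

```python
# Minimal execnet sketch: run a shell command on a remote host over SSH.
# "user@node1" and the command string are placeholders.
import execnet

gw = execnet.makegateway("ssh=user@node1//python=python3")

channel = gw.remote_exec("""
    import subprocess
    # Run one batch job remotely and send back its exit code.
    rc = subprocess.call(channel.receive(), shell=True)
    channel.send(rc)
""")

channel.send("run_my_batch.sh --part 1")
print("remote exit code:", channel.receive())
gw.exit()
```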
I might suggest STAF. It's advertised as a software testing framework, yet it allows for distribution of activities across multiple PCs (and multiple platforms). You can run scripts, copy data, and easily communicate between your multiple sessions. Best of all, it's fairly easy to integrate with already existing scripts.