What is the simplest way to use all the cores of a computer for a Python program? In particular, I would like to parallelize a NumPy function (which already exists). Is there something like OpenMP (as used with Fortran) for Python?
Check out the multiprocessing library. It even allows you to spread work across multiple computers.
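As a minimal sketch (not from the original answer, and assuming Python 3), here is how multiprocessing.Pool can spread an existing NumPy-based function over chunks of an array; my_numpy_function is just a placeholder for whatever routine you already have:

import numpy as np
from multiprocessing import Pool

def my_numpy_function(chunk):
    # placeholder for the existing NumPy routine you want to parallelize
    return np.sqrt(chunk) + 1.0

if __name__ == "__main__":
    data = np.random.rand(1000000)
    chunks = np.array_split(data, 8)      # one chunk per worker process
    with Pool(processes=8) as pool:       # spreads the chunks over 8 cores
        partial = pool.map(my_numpy_function, chunks)
    result = np.concatenate(partial)
    print(result.shape)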
It depends on what you want to do and how numpy is compiled on your machine (in some cases, some multicore use will be automatic). See this page for details.
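As a quick check (a small sketch, not part of the original answer), you can ask NumPy how it was built; if it is linked against a threaded BLAS/LAPACK such as OpenBLAS, MKL or ATLAS, operations like numpy.dot may already use several cores with no code changes:

import numpy as np

# Prints the BLAS/LAPACK libraries NumPy was compiled against; a threaded
# BLAS means some linear-algebra calls are already multicore.
np.show_config()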
It may or may not fit the specific problem you want to solve, but I personally find the IPython shell's parallel infrastructure quite attractive. It is relatively easy to set up an ipcluster on localhost (see the manual).
You can wrap the function you wish to evaluate in a @parallel decorator, for example, and its evaluation will be distributed among many cores (see the Quick and easy parallelism section of the manual).
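A rough sketch of that decorator in use, assuming an ipcluster is already running on localhost and using the old IPython.parallel API this answer refers to:

from IPython.parallel import Client

rc = Client()                  # connect to the running ipcluster
dview = rc[:]                  # a view over all available engines

@dview.parallel(block=True)    # the decorator mentioned above
def f(x):
    return x ** 2

print(f.map(range(32)))        # evaluated across the engines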
I'm interested in running a Python program using a computer cluster. I have in the past been using Python MPI interfaces, but due to difficulties in compiling/installing these, I would prefer solutions which use built-in modules, such as Python's multiprocessing module.
What I would really like to do is just set up a multiprocessing.Pool instance that would span across the whole computer cluster, and run a Pool.map(...). Is this something that is possible/easy to do?
If this is impossible, I'd like to at least be able to start Process instances on any of the nodes from a central script with different parameters for each node.
If by cluster computing you mean distributed memory systems (multiple nodes rather than SMP), then Python's multiprocessing may not be a suitable choice. It can spawn multiple processes, but they will still be bound within a single node.
What you will need is a framework that handles the spawning of processes across multiple nodes and provides a mechanism for communication between the processes (pretty much what MPI does).
See the page on Parallel Processing on the Python wiki for a list of frameworks which will help with cluster computing.
From the list, pp, jug, pyro and celery look like sensible options although I can't personally vouch for any since I have no experience with any of them (I use mainly MPI).
If ease of installation/use is important, I would start by exploring jug. It's easy to install, supports common batch cluster systems, and looks well documented.
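A hedged sketch of what jug code tends to look like (the task file is run with "jug execute", possibly from several nodes or processes sharing a filesystem; is_prime here is just an illustrative function):

# primes.py -- run with:  jug execute primes.py
# Start it from several processes/nodes that share the working directory
# to spread the tasks out.
from jug import TaskGenerator

@TaskGenerator
def is_prime(n):
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return n >= 2

primes = [is_prime(n) for n in range(2, 1000)]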
In the past I've used Pyro to do this quite successfully. If you turn on mobile code it will automatically send over the wire required modules the nodes don't have already. Pretty nifty.
I have had luck using SCOOP as an alternative to multiprocessing for single- or multi-computer use, and it gains the benefit of job submission for clusters as well as many other features such as nested maps and minimal code changes to get working with map().
The source is available on Github. A quick example shows just how simple implementation can be!
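Roughly, a SCOOP script looks like the sketch below (run with "python -m scoop script.py"; square is just a placeholder workload):

# script.py -- run with:  python -m scoop script.py
from scoop import futures

def square(x):
    return x * x

if __name__ == "__main__":
    # futures.map is a drop-in replacement for map(), distributed by SCOOP
    results = list(futures.map(square, range(100)))
    print(results[:10])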
If you are willing to pip install an open source package, you should consider Ray, which out of the Python cluster frameworks is probably the option that comes closest to the single threaded Python experience. It allows you to parallelize both functions (as tasks) and also stateful classes (as actors) and does all of the data shipping and serialization as well as exception message propagation automatically. It also allows similar flexibility to normal Python (actors can be passed around, tasks can call other tasks, there can be arbitrary data dependencies, etc.). More about that in the documentation.
As an example, this is how you would do your multiprocessing map example in Ray:
import ray
ray.init()
@ray.remote
def mapping_function(input):
    return input + 1

results = ray.get([mapping_function.remote(i) for i in range(100)])
The API is a little different from Python's multiprocessing API, but it should be easier to use. There is a walk-through tutorial that describes how to handle data dependencies, actors, etc.
You can install Ray with "pip install ray" and then execute the above code on a single node. It's also easy to set up a cluster; see Cloud support and Cluster support.
Disclaimer: I'm one of the Ray developers.
I have some code which depends on Greenlets, and need to remove this dependency.
Can anyone explain to me exactly what I'll need to do?
They would preferably be replaced with threads or (better yet) processes from the multiprocessing module, but anything that relies solely on the Python standard library would be sufficient for my needs.
Functionality can be sacrificed, as I don't need asynchronous code, nor does the code that I am converting (for my uses, not the original implementation).
UPDATE:
Specifically, I need to know of alternatives to Greenlet.spawn()
It really depends on the structure of your code and the high-level architecture of your system. If whatever you are using greenlets for can be done with the multiprocessing module in the Python standard library, then you can do that. If you post specific instances, you can get specific ways to handle them using multiprocessing. But beware: these are two different approaches to the general problem of concurrency.
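As a rough, hedged sketch of what replacing Greenlet.spawn() might look like with the standard library (worker is just a placeholder; note that threads and processes are preemptive, so any cooperative-scheduling assumptions in the original code will not carry over):

import threading
import multiprocessing

def worker(arg):
    print("working on", arg)

# Roughly what Greenlet.spawn(worker, arg) plus a join becomes with threads:
t = threading.Thread(target=worker, args=("some task",))
t.start()
t.join()

# Or, for CPU-bound work, a separate process instead of a thread:
if __name__ == "__main__":
    p = multiprocessing.Process(target=worker, args=("other task",))
    p.start()
    p.join()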
I'm looking at implementing a fuzzy logic controller based on either PyFuzzy (Python) or FFLL (C++) libraries.
I'd prefer to work with Python but am unsure whether the performance will be acceptable in the embedded environment it will run in (either an ARM or an embedded x86 processor, both with ~64 MB of RAM).
The main concern is that response times are as fast as possible (an update rate of 5 Hz+ would be ideal; >2 Hz is required). The system would read from multiple (probably 5) sensors over an RS232 port and provide 2/3 outputs based on the results of the fuzzy evaluation.
Should I be concerned that Python will be too slow for this task?
In general, you shouldn't obsess over performance until you've actually seen it become a problem. Since we don't know the details of your app, we can't say how it'd perform if implemented in Python. And since you haven't implemented it yet, neither can you.
Implement the version you're most comfortable with, and can implement fastest, first. Then benchmark it (a minimal profiling sketch follows this list). If it is too slow, you have three options, which should be tried in order:
First, optimize your Python code
If that's not enough, write the most performance-critical functions in C/C++, and call that from your Python code
And finally, if you really need top performance, you might have to rewrite the whole thing in C++. But then at least you'll have a working prototype in Python, and you'll have a much clearer idea of how it should be implemented. You'll know what pitfalls to avoid, and you'll have an already correct implementation to test against and compare results to.
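For the benchmarking and "optimize your Python code" steps, a minimal standard-library sketch (evaluate_controller is a stand-in for your fuzzy-evaluation loop):

import cProfile
import pstats

def evaluate_controller():
    # stand-in workload for one fuzzy-evaluation cycle
    return sum(i * i for i in range(100000))

cProfile.run("evaluate_controller()", "profile.out")
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(10)   # show the top 10 hotspots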
Python is very slow at handling large amounts of non-string data. For some operations, you may see that it is 1000 times slower than C/C++, so yes, you should investigate this and do the necessary benchmarks before committing to time-critical algorithms in Python.
However, you can extend Python with modules written in C/C++, so that the time-critical parts are fast while you still use Python for the main code.
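For illustration, a hedged sketch of calling such a C routine from Python with the standard-library ctypes module; the shared library and its weighted_sum function are hypothetical and assumed to have been compiled separately:

# Assumes a (hypothetical) shared library libfastops.so exporting:
#     double weighted_sum(const double *values, const double *weights, int n);
import ctypes
import numpy as np

lib = ctypes.CDLL("./libfastops.so")
lib.weighted_sum.restype = ctypes.c_double
lib.weighted_sum.argtypes = [ctypes.POINTER(ctypes.c_double),
                             ctypes.POINTER(ctypes.c_double),
                             ctypes.c_int]

values = np.random.rand(1000)
weights = np.random.rand(1000)
result = lib.weighted_sum(
    values.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
    weights.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
    len(values))
print(result)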
Make it work, then make it work fast.
If most of your runtime is spent in C libraries, the language you use to call these libraries isn't important. What language are your time-eating libraries written in?
From your description, speed should not be much of a concern (and you can use C, Cython, whatever you want to make it faster), but memory would be. For environments with 64 MB max (where the OS and everything else should fit as well, right?), I think there is a good chance that Python may not be the right tool for target deployment.
If you have non-trivial logic to handle, I would still prototype in Python, though.
I never really measured the performance of pyfuzzy's examples, but the new version 0.1.0 can read FCL files just as FFLL does. Just describe your fuzzy system in this format, write some wrappers, and check the performance of both variants.
For reading FCL with pyfuzzy you need the ANTLR Python runtime, but after parsing you should be able to pickle the resulting object, so you don't need the ANTLR overhead on the target.
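A hedged sketch of that "parse once, pickle for the target" idea; load_fcl_system is only a placeholder for whatever pyfuzzy call actually parses the FCL file, while the pickling itself is plain standard library:

import pickle

# On the development machine, where the ANTLR runtime is installed:
#   system = load_fcl_system("controller.fcl")  # placeholder for the pyfuzzy/ANTLR parse
#   with open("controller.pickle", "wb") as f:
#       pickle.dump(system, f)

# On the embedded target, no ANTLR needed -- just unpickle the ready object:
with open("controller.pickle", "rb") as f:
    system = pickle.load(f)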
Would it be possible to make a python cluster, by writing a telnet server, then telnet-ing the commands and output back-and-forth? Has anyone got a better idea for a python compute cluster?
PS. Preferably for python 3.x, if anyone knows how.
The Python wiki hosts a very comprehensive list of Python cluster computing libraries and tools. You might be especially interested in Parallel Python.
Edit: There is a new library that is IMHO especially good at clustering: execnet. It is small and simple. And it appears to have fewer bugs than, say, the standard multiprocessing module.
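For a feel of execnet, a hedged sketch ("user@node1" is a placeholder host; using "popen" instead would run the code in a local subprocess):

import execnet

group = execnet.Group()
gw = group.makegateway("ssh=user@node1")   # or "popen" for a local worker
channel = gw.remote_exec("""
    import platform
    channel.send((platform.node(), sum(x * x for x in range(1000))))
""")
print(channel.receive())    # (remote hostname, remote result)
group.terminate()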
You can see most of the third-party packages available for Python 3 listed here; relevant to cluster computation is mpi4py. Most other distributed computing tools, such as Pyro, are still Python 2 only, but MPI is a leading standard for cluster distributed computation and is well worth looking into (I have no direct experience using mpi4py with Python 3 yet, but by hearsay I believe it's a good implementation).
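A minimal mpi4py sketch (run with something like "mpiexec -n 4 python script.py"; the per-rank work is just a placeholder):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each rank computes a partial result; rank 0 gathers and combines them.
partial = sum(range(rank * 1000, (rank + 1) * 1000))
results = comm.gather(partial, root=0)
if rank == 0:
    print("ranks:", size, "total:", sum(results))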
The main alternative is Python's own built-in multiprocessing, which also scales up pretty well if you have no interest in interfacing existing nodes that respect the MPI standards but may not be coded in Python.
There is no real added value in rolling your own (as Atwood says, don't reinvent the wheel, unless your purpose is just to better understand wheels!-) -- use one of the solid, widespread solutions, already tested, debugged and optimized on your behalf!-)
Look into these
http://www.parallelpython.com/
http://pyro.sourceforge.net/
I have used both, and both are excellent for distributed computing.
For a more detailed list of options, see
http://wiki.python.org/moin/ParallelProcessing
And if you want to auto-execute something on a remote machine, a better alternative to telnet is SSH, as in http://pydsh.sourceforge.net/
What kind of stuff do you want to do? You might want to check out Hadoop. The backend heavy lifting is done in Java, but it has a Python interface, so you can write Python scripts to create and send the input, as well as to process the results.
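With Hadoop Streaming, the Python side can be as small as a mapper that reads stdin and writes tab-separated key/value pairs to stdout; a minimal word-count-style sketch:

#!/usr/bin/env python
# Minimal Hadoop Streaming mapper: Hadoop feeds input lines on stdin and
# collects "key<TAB>value" lines from stdout; a matching reducer sums counts.
import sys

for line in sys.stdin:
    for word in line.split():
        print("%s\t%d" % (word, 1))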
If you need to write administrative scripts, take a look at the ClusterShell Python library too, and/or its parallel shell clush. It's also useful when dealing with node sets (man nodeset).
I think IPython.parallel is the way to go. I've been using it extensively for the last year and a half. It allows you to work interactively with as many worker nodes as you want. If you are on AWS, StarCluster is a great way to get IPython.parallel up and running quickly and easily with as many EC2 nodes as you can afford. (It can also automatically install Hadoop, and a variety of other useful tools, if needed.) There are some tricks to using it. (For example, you don't want to send large amounts of data through the IPython.parallel interface itself. Better to distribute a script that will pull down chunks of data on each engine individually.) But overall, I've found it to be a remarkably easy way to do distributed processing (WAY better than Hadoop!)
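A hedged sketch of the IPython.parallel workflow described above, assuming an ipcluster (or StarCluster-launched cluster) is already running and reachable from the client:

from IPython.parallel import Client

rc = Client()                          # connect to the running cluster
lview = rc.load_balanced_view()        # dynamic load balancing over engines

def process(item):
    # placeholder for per-chunk work; in practice this might pull its own
    # chunk of data down on the engine, as suggested above
    return item ** 2

results = lview.map_sync(process, range(100))
print(results[:10])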
"Would it be possible to make a python cluster"
Yes.
I love yes/no questions. Anything else you want to know?
(Note that Python 3 has few third-party libraries yet, so you may wanna stay with Python 2 at the moment.)