I have a more general beginner's question about multiprocessing in Python (please forgive me if I'm utterly wrong in the following). Let's assume I launch two or more IPython consoles in parallel and run some independent functions/scripts via those consoles. Does that mean these tasks are performed on multiple cores (one core per task)? If yes, would it be better to collect the tasks in a "main module" and use the multiprocessing library?
There's no difference between starting processes in two terminals or using multiprocessing:
when you open two Python consoles, you get two processes, each with its own PID;
when you run two multiprocessing processes, they are forked (on Linux) or started as separate Python instances (on Windows), and thus also run as independent processes.
What the OS does with these processes is beyond your control. If both processes use a lot of CPU resources and few other processes are competing for time, they will be spread across the cores.
Related
I have a Python script that currently uses multiprocessing to perform tasks in parallel. My advisor recommended using GNU Parallel to speed it up, since Python programs always execute on a single core. Should I keep the multiprocessing script as it is and use GNU Parallel on top of it? Or should I remove the multiprocessing part and then use GNU Parallel? Does it make a difference?
Does it make a difference?
There is a really simple answer: Try it and measure.
The performance of parallelization these days depends on many factors; above all, it depends on your application (and sometimes on your hardware).
So I know that even a multithreaded Python process cannot use multiple cores at the same time.
But, by default, does that mean a Python process is "pinned" to one CPU? By pinned, I mean: will the Python process always use the same CPU, or can the same process use different CPUs of my machine over time?
By default, a Python process is not pinned to a particular CPU core. In fact, despite the GIL, a single Python process can spawn multiple threads, each of which can be scheduled simultaneously by the OS on different CPU cores. Although the GIL makes it difficult for more than one thread to actually make progress at any given time (since they must all contend for the lock), even this can happen: native code can release the GIL unless / until it needs to access Python data structures.
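A small demonstration of the GIL being released: `time.sleep` drops the GIL while waiting, so two threads sleeping 0.5 s each finish in about 0.5 s total, not 1.0 s:

```python
import threading
import time

def wait(seconds):
    # time.sleep releases the GIL, so both threads can wait concurrently
    time.sleep(seconds)

start = time.perf_counter()
threads = [threading.Thread(target=wait, args=(0.5,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# If the threads ran one after the other this would take ~1.0s;
# because the sleeps overlap it takes ~0.5s
print(f"{elapsed:.2f}s")
```

The same overlap happens with blocking I/O and with C extensions that release the GIL around heavy computation.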
You can, of course, use your operating system utilities to pin any process (including Python) to a specific CPU core.
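For example, on Linux the standard library exposes the affinity API directly (this is Linux-only; on other systems use your OS's own tools):

```python
import os

# Linux-only API; pid 0 means "the current process"
allowed = os.sched_getaffinity(0)
print("allowed cores:", sorted(allowed))

# Pin this process to a single core (the lowest-numbered allowed one)
os.sched_setaffinity(0, {min(allowed)})
print("now pinned to:", sorted(os.sched_getaffinity(0)))
```

From the shell, `taskset -c 0 python script.py` achieves the same pinning without touching the code.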
I heard that setting OMP_NUM_THREADS=1 before calling a Python script that uses multiprocessing makes the script faster.
Is it true or not? If yes, why so?
Since you said in a comment that your Python program is calling a C module that uses OpenMP:
OpenMP does multi-threading within a process, and the default number of threads is typically the number that the CPU can actually run simultaneously. (This is generally the number of CPU cores, or a multiple of that number if the CPU has an SMT feature such as Intel's Hyper-Threading.) So if you have, for example, a quad-core non-hyperthreaded CPU, OpenMP will want to run 4 threads by default.
When you use Python's multiprocessing module, your program starts multiple Python processes which can run simultaneously. You can control the number of processes, but often you'll want it to be the number of CPU cores/threads, e.g. returned by multiprocessing.cpu_count().
So, what happens on that quad-core CPU if you run a multiprocessing program that starts 4 Python processes, and each calls an OpenMP function that runs 4 threads? You end up running 16 threads on 4 cores. That'll work, but not at peak efficiency, since each core will have to spend some time switching between tasks.
Setting OMP_NUM_THREADS=1 basically turns off the OpenMP multi-threading, so each of your Python processes remains single-threaded.
Make sure you're starting enough Python processes if you do this, though! If you have 4 CPU cores and you only run 2 single-threaded Python processes, you'll have 2 cores utilized and the other 2 sitting idle. (In this case you might want to set OMP_NUM_THREADS=2.)
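One way to set the variable is in the child environment when launching the script, so the OpenMP runtime sees it before it initializes. A sketch (the inline `-c` command here is a stand-in for your real script):

```python
import os
import subprocess
import sys

# Build a child environment with OpenMP limited to one thread per process;
# it must be set before the OpenMP runtime in the C module initializes
env = dict(os.environ, OMP_NUM_THREADS="1")

# Stand-in for launching your real script with the modified environment
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"],
    env=env, capture_output=True, text=True,
)
print(result.stdout.strip())
```

From the shell, `OMP_NUM_THREADS=1 python script.py` does the same thing.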
Resolved in comments:
OMP_NUM_THREADS is an option for OpenMP, a C/C++/Fortran API for doing multi-threading within a process.
It's unclear how that's even related to Python multiprocessing.
Is your Python program calling modules written in C that use OpenMP internally? – Wyzard
I am learning multi-threaded Python (CPython). I'm aware of the GIL and how it limits threading to a single core (in most circumstances).
I know that I/O functionality can run on multiple cores; however, I have been unable to find a list of which parts of the standard library can run across multiple cores. I believe that urllib can run multi-core, allowing downloading in a thread on a separate core (but I have been unable to find confirmation of this in the docs).
What I am trying to find out is, which parts of the standard library will run multi-core, as this doesn't seem to be specified in the documentation.
Taken from the docs:
However, some extension modules, either standard or third-party, are designed so as to release the GIL when doing computationally-intensive tasks such as compression or hashing. Also, the GIL is always released when doing I/O.
With the multiprocessing package you can write truly parallel programs where separate processes run on different cores. There is no limitation to which libraries (standard or not) each sub-process can use.
The tricky part about multi-process programming is when the processes need to exchange information (e.g., pass each other values, wait for each other to finish with a certain task). The multiprocessing package contains several tools for that.
I currently have a data-intensive process running on Ubuntu 11.04 that needs to use multiple CPUs.
Given that I have 4 cores, I ran the command:
taskset -c 0,1,2,3 python sample.py
I am only achieving 100% on one CPU, and the others are idle <2%.
Any tips how to ramp all 4 CPUs up to 100% to make the task faster?
Cheers!
The application needs to be prepared to use more than one core: its tasks need to be divided into separate threads or processes. Otherwise, there is little to no use of more than one CPU.
The standard Python interpreter (CPython) has the GIL, which prevents more than one thread from executing Python bytecode at a time. Consider using the multiprocessing module, or alternative implementations such as PyPy.