Why is it that you can run Jython and IronPython without the need for a GIL but Python (CPython) requires a GIL?
Parts of the interpreter aren't thread-safe, mostly because making them all thread-safe through heavy lock usage would slow single-threaded code down considerably (source). This seems to be related to the CPython garbage collector using reference counting (the JVM and CLR don't, and therefore don't need to lock and release a reference count every time an object is referenced). But even if someone thought of an acceptable solution and implemented it, third-party libraries would still have the same problems.
Note that extensions written in C can in fact release the GIL: http://docs.python.org/c-api/init.html#thread-state-and-the-global-interpreter-lock
My guess is that the C libraries that CPython is built upon aren't thread-safe, whereas Jython and IronPython are built on the JVM and the .NET CLR respectively.
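To make the reference-counting point concrete, here is a minimal sketch, assuming CPython (sys.getrefcount is CPython-specific, and the absolute numbers it returns vary between versions). It only shows that binding and unbinding a name changes an object's count; every such update would need its own synchronization if there were no GIL.

```python
import sys

# A fresh list, so nothing else in the program holds a reference to it.
data = []
before = sys.getrefcount(data)

# Binding a second name to the same object increments its reference count.
alias = data
after_alias = sys.getrefcount(data)

# Dropping that name decrements the count again.
del alias
after_del = sys.getrefcount(data)

print(after_alias - before, after_del - before)
```

The variable names are illustrative only. The relative change (one up, one down) is the part that matters, not the absolute values.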
Related
As the GIL is a lock that surrounds the interpreter, does it affect compiled Python? I'm wondering whether it is possible to get past the inherent multi-threading limitations of CPython by simply compiling my Python code before executing it.
Hopefully that makes sense and I'm not missing something obvious or misinterpreting how the GIL works/affects execution.
Thanks
As Daniel said in the comments, it depends on how you "compile" the code.
For example, running the code using Jython does indeed get around the limitations imposed by the GIL.
On the other hand, using something like py2exe makes no difference, since this effectively just packages CPython alongside your code.
Jython does not have a GIL.
IronPython does not have a GIL.
You can compile your Python code with Cython, and then whether it uses the GIL or not depends on how you write it. If you convert all your Python variables into Cython types, you can run that code in a "with nogil" block and it will run without the GIL, because you are expressly releasing it. If you are not running in a nogil block, you will be affected by CPython's GIL. More in the Cython docs: http://docs.cython.org/src/userguide/external_C_code.html#acquiring-and-releasing-the-gil
For more on Python and the GIL, read up here: http://www.jeffknupp.com/blog/2013/06/30/pythons-hardest-problem-revisited/
I often see people saying that the GIL is per Python interpreter (even here on Stack Overflow).
But from what I see in the source code, the GIL seems to be a global variable, and therefore there is one GIL for all interpreters in each Python process. I know they did this because there is no interpreter object passed around the way Lua or Tcl does it; it just was not designed well in the beginning. And thread-local storage seems not to be portable enough for the Python developers to use.
Is this correct? I had a short look at the 2.4 version I'm using in a project here.
Has this changed in later versions, especially in 3.0?
The GIL is indeed per-process, not per-interpreter. This is unchanged in 3.x.
Perhaps the confusion comes about because most people assume Python has one interpreter per process. I recall reading that the support for multiple interpreters via the C API was largely untested and hardly ever used. (And when I gave it a go, it didn't work properly.)
I believe it is true (at least as of Python 2.6) that each process may have at most one CPython interpreter embedded (other runtimes may have different constraints). I'm not sure if this is an issue with the GIL per se, but it is likely due to global state, or to protect from conflicting global state in third-party C modules. From the CPython API Docs:
[Py_Initialize()] is a no-op when called for a second time (without calling Py_Finalize() first). There is no return value; it is a fatal error if the initialization fails.
You might be interested in the Unladen Swallow project, which aims eventually to remove the GIL entirely from CPython. Some other Python runtimes, certainly Jython, don't have the GIL at all. (Stackless Python is a variant of CPython and, as far as I know, keeps the GIL.)
Also note that the GIL is still present in CPython 3.x.
How does IronPython stack up to the default Windows implementation of Python from python.org? If I am learning Python, will I be learning a subtly different language with IronPython, and what libraries would I be doing without?
Are there, alternatively, any pros to IronPython (not including .NET IL compiled classes) that would make it more attractive an option?
There are a number of important differences:
Interoperability with other .NET languages. You can use other .NET libraries from an IronPython application, or use IronPython from a C# application, for example. This interoperability is increasing, with a movement toward greater support for dynamic types in .NET 4.0. For a lot of detail on this, see these two presentations at PDC 2008.
Better concurrency/multi-core support, due to lack of a GIL. (Note that the GIL doesn't inhibit threading on a single-core machine---it only limits performance on multi-core machines.)
Limited ability to consume Python C extensions. The Ironclad project is making significant strides toward improving this---they've nearly gotten Numpy working!
Less cross-platform support; basically, you've got the CLR and Mono. Mono is impressive, though, and runs on many platforms---and they've got an implementation of Silverlight, called Moonlight.
Reports of improved performance, although I have not looked into this carefully.
Feature lag: since CPython is the reference Python implementation, it has the "latest and greatest" Python features, whereas IronPython necessarily lags behind. Many people do not find this to be a problem.
There are some subtle differences in how you write your code, but the biggest difference is in the libraries you have available.
With IronPython, you have all the .NET libraries available, but at the expense of some of the "normal" Python libraries that haven't been ported to the .NET VM, I think.
Basically, you should expect the syntax and the idioms to be the same, but a script written for IronPython won't run if you try giving it to the "regular" Python interpreter. The other way around is probably more likely to work, but there too you will find differences, I think.
Well, it's generally faster.
It can't use C extension modules, and it only has a subset of the standard library.
Here's a list of differences.
See the blog post IronPython is a one-way gate. It summarizes some things I've learned about IronPython from asking questions on StackOverflow.
Python is Python; the only difference is that IronPython was designed to run on the CLR (.NET Framework), and as such can interoperate with and consume .NET assemblies written in other .NET languages. So if your platform is Windows and you or your company also use .NET, then you should consider IronPython.
One of the pros of IronPython is that, unlike CPython, IronPython doesn't use the Global Interpreter Lock, thus making threading more effective.
In the standard Python implementation, a thread must hold the GIL whenever it executes Python bytecode. This limits parallel execution, which matters especially if you expect to fully utilize multiple CPUs.
Pro: You can run IronPython in a browser if Silverlight is installed.
It also depends on whether you want your code to work on Linux. Dunno if IronPython will work on anything besides Windows platforms.
I am relatively new to Python, and I have always used the standard CPython (v2.5) implementation.
I've been wondering about the other implementations though, particularly Jython and IronPython. What makes them better? What makes them worse? What other implementations are there?
I guess what I'm looking for is a summary and list of pros and cons for each implementation.
Jython and IronPython are useful if you have an overriding need to interface with existing libraries written in a different platform, like if you have 100,000 lines of Java and you just want to write a 20-line Python script. Not particularly useful for anything else, in my opinion, because they are perpetually a few versions behind CPython due to community inertia.
Stackless is interesting because it has support for green threads, continuations, etc. Sort of an Erlang-lite.
PyPy is an experimental interpreter/compiler that may one day supplant CPython, but for now is more of a testbed for new ideas.
An additional benefit for Jython, at least for some, is it lacks the GIL (the Global Interpreter Lock) and uses Java's native threads. This means that you can run pure Python code in parallel, something not possible with the GIL.
All of the implementations are listed here:
https://wiki.python.org/moin/PythonImplementations
CPython is the "reference implementation" and developed by Guido and the core developers.
Pros: Access to the libraries available for JVM or CLR.
Cons: Both naturally lag behind CPython in terms of features.
IronPython and Jython use the runtime environment for .NET or Java and with that comes Just In Time compilation and a garbage collector different from the original CPython. They might be also faster than CPython thanks to the JIT, but I don't know that for sure.
A downside of using Jython or IronPython is that you cannot use native C modules; they can only be used in CPython.
PyPy is a Python implementation written in RPython, which is a Python subset.
RPython can be translated to run on a VM or, unlike standard Python, RPython can be statically compiled.
A reliable coder friend told me that Python's current multi-threading implementation is seriously buggy - enough to avoid using altogether. What can be said about this rumor?
Python threads are good for concurrent I/O programming. Threads are swapped out of the CPU as soon as they block waiting for input from file, network, etc. This allows other Python threads to use the CPU while others wait. This would allow you to write a multi-threaded web server or web crawler, for example.
However, Python threads are serialized by the GIL when they enter the interpreter core. This means that if two threads are crunching numbers, only one can run at any given moment. It also means that you can't take advantage of multi-core or multi-processor architectures.
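The I/O-bound case described above can be sketched as follows. This is only an illustration: fake_download and run_concurrently are hypothetical names, and time.sleep stands in for a blocking read from a file or socket (like real blocking I/O, it releases the GIL while waiting).

```python
import threading
import time


def fake_download(delay, results, index):
    """Stand-in for blocking I/O: the thread waits without holding the GIL."""
    time.sleep(delay)
    results[index] = delay


def run_concurrently(delays):
    """Start one thread per delay and return (results, elapsed seconds)."""
    results = [None] * len(delays)
    threads = [
        threading.Thread(target=fake_download, args=(delay, results, i))
        for i, delay in enumerate(delays)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_concurrently([0.2, 0.2, 0.2, 0.2])
    # The waits overlap, so the total is close to one delay, not the sum of four.
    print(results, round(elapsed, 2))
```

If fake_download did pure-Python number crunching instead of waiting, the threads would take turns holding the GIL and the total time would be roughly the sum of the individual run times.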
There are solutions like running multiple Python interpreters concurrently, using a C based threading library. This is not for the faint of heart and the benefits might not be worth the trouble. Let's hope for an all Python solution in a future release.
The standard implementation of Python (generally known as CPython as it is written in C) uses OS threads, but since there is the Global Interpreter Lock, only one thread at a time is allowed to run Python code. But within those limitations, the threading libraries are robust and widely used.
If you want to be able to use multiple CPU cores, there are a few options. One is to use multiple python interpreters concurrently, as mentioned by others. Another option is to use a different implementation of Python that does not use a GIL. The two main options are Jython and IronPython.
Jython is written in Java, and is now fairly mature, though some incompatibilities remain. For example, the web framework Django does not run perfectly yet, but is getting closer all the time. Jython is great for thread safety, comes out better in benchmarks and has a cheeky message for those wanting the GIL.
IronPython uses the .NET framework and is written in C#. Compatibility is reaching the stage where Django can run on IronPython (at least as a demo) and there are guides to using threads in IronPython.
The GIL (Global Interpreter Lock) might be a problem, but the API is quite OK. Try out the excellent processing module, which implements the Threading API for separate processes. I am using that right now (albeit on OS X, have yet to do some testing on Windows) and am really impressed. The Queue class is really saving my bacon in terms of managing complexity!
EDIT: it seems the processing module is being included in the standard library as of version 2.6 (import multiprocessing). Joy!
As far as I know there are no real bugs, but the performance of threading in CPython is really bad compared to most other threading implementations (though usually good enough if most of the threads do little besides block) due to the GIL (Global Interpreter Lock), so really it is implementation specific rather than language specific. Jython, for example, does not suffer from this because it uses the Java thread model.
See this post on why it is not really feasible to remove the GIL from the CPython implementation, and this for some practical elaboration and workarounds.
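For readers who have not used it, here is a minimal sketch of the process-plus-Queue pattern with the standard multiprocessing module. The function names (square_worker, parallel_squares) and the squaring task are made up for illustration; the original poster's code is not shown in this thread.

```python
import multiprocessing as mp


def square_worker(tasks, results):
    """Pull numbers off `tasks` until a None sentinel arrives."""
    for n in iter(tasks.get, None):
        results.put((n, n * n))


def parallel_squares(numbers, n_workers=2):
    """Square each number in a separate process; returns {n: n * n}."""
    numbers = list(numbers)
    tasks, results = mp.Queue(), mp.Queue()
    workers = [
        mp.Process(target=square_worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(None)  # one sentinel per worker so each loop terminates
    # Collect every result before joining, so no worker is left blocked
    # trying to flush data into the result queue.
    squares = dict(results.get() for _ in numbers)
    for w in workers:
        w.join()
    return squares
```

Each worker is a separate process with its own interpreter and its own GIL, so CPU-bound work can run on several cores. On platforms that start workers by spawning a fresh interpreter (Windows, and macOS by default), call parallel_squares from under an if __name__ == "__main__": guard.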
Do a quick google for "Python GIL" for more information.
If you want to code in Python and get great threading support, you might want to check out IronPython or Jython. Since Python code in IronPython and Jython runs on the .NET CLR and Java VM respectively, it enjoys the great threading support built into those platforms. In addition to that, IronPython doesn't have the GIL, an issue that prevents CPython threads from taking full advantage of multi-core architectures.
I've used it in several applications and have never had nor heard of threading being anything other than 100% reliable, as long as you know its limits. You can't spawn 1000 threads at the same time and expect your program to run properly on Windows. However, you can easily write a worker pool and just feed it 1000 operations, and keep everything nice and under control.
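A worker pool along those lines might look like the sketch below. It assumes a modern Python 3 with concurrent.futures (which postdates the Python versions discussed in this thread), and handle is a hypothetical placeholder for whatever each operation really does.

```python
from concurrent.futures import ThreadPoolExecutor


def handle(job):
    """Placeholder operation; a real one would do I/O or other work."""
    return job * 2


def run_pool(jobs, max_workers=8):
    """Feed all jobs through a fixed number of threads, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle, jobs))


if __name__ == "__main__":
    # 1000 operations, but never more than 8 threads alive at once.
    print(len(run_pool(range(1000))))
```

The pool size of 8 is an arbitrary choice for the sketch; the point is that the number of threads stays fixed no matter how many operations are queued.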