Effect of Python 3.11 on Cython-compiled modules

Effect of Python 3.11 on Cython-compiled modules - python

I have a pure-Python module that compiles with Cython into .so, for performance.
The project currently uses Python 3.8 and everything works fine.
I've read CPython 3.11 comes with potentially great performance benefits, so I'm wondering the cost/benefit effort of upgrading.
Do any of the CPython 3.11 optimizations translate to Cython-compiled code? Which ones? How much?
What is the expected overall effect of Cython on 3.11 relative to 3.8 code?

Basically none of it.
Start-up applies to starting the interpreter (before you load any Cython modules)
"Cheaper, lazy Python frames" - Cython doesn't create frames (apart from when an exception is raised, and even then I doubt it applies)
"Adaptive interpreter" is for interpreted code. It specializes bytecode based on past results - so if an attribute lookup bytecode instruction in a certain function turns out to be looking up in a class, it assumes the same will apply next time. Since Cython doesn't use bytecode, these optimizations just don't apply. (However, if you've applied static types correctly then Cython may be doing similar optimizations already)
As ever, if you're really interested then you should measure it. But I wouldn't expect many benefits to Cython code

Related

AoT Compiler for Python

I want to get my Python script working on a bare metal device like microcontroller WITHOUT the need for an interpreter. I know there are already JIT compilers for Python like PyPy, and interpreters like CPython.
However, existing interpreters I've seen (such as CPython) take up large memory (in MB range).
Is there an AOT compiler for Python (i.e. compiling directly to native hardware through intermediary like LLVM)?
I assume such a compiler would enable Python to run much faster compared to existing implementations AND with lower memory footprint. If there is, I wonder why that solution hasn't been popularized.

As you already mentioned Cython is an option (However, it is true that the result is big due since the C runtime need to implement the Python functionality together with your program).
With regards to LLVM there was a project by Google named unladen swallow. However, that project is mostly abandoned. You can find some information about it here
Basically it was an attempt to bring LLVM optimizations into the runtime of Cython. E.g JITTING Python code.
Another old alternative was shed skin which compiles Python to C++. Some information about it can be found here.
Yet another option similar to shed skin is to restrict yourself to a subset of the Python language and use micropython.

An alternative would be to use GraalVM with Truffle AOT with Python.
It's basically Python running on an optimized AOT for jvm.
The project looks promising. You can chek this link here:
https://www.graalvm.org/22.2/graalvm-as-a-platform/language-implementation-framework/AOT/

Recently came across codon,
Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead. Typical speedups over Python are on the order of 10-100x or more, on a single thread. Codon's performance is typically on par with (and sometimes better than) that of C/C++.

Is Python byte-code, interpreter independent?

This is an obvious question, that I haven't been able to find a concrete answer to.
Is the Python Byte-Code and Python Code itself interpreter independent,
Meaning by this, that If I take a CPython, PyPy, Jython, IronPython, Skulpt, etc, Interpreter, and I attempt to run, the same piece of code in python or bytecode, will it run correctly? (provided that they implement the same language version, and use modules strictly written in Python or standard modules)
If so, is there is a benchmark, or place where I can compare performance comparison from many interpreters?
I've been playing for a while with CPython, and now I want to explore new interpreters.
And also a side question, What are the uses for the others implementations of python?
Skulpt I get it, browsers, but the rest? Is there a specific industry or application that requires a different interpreter (which)?

From https://docs.python.org/3/library/dis.html#module-dis
Bytecode is an implementation detail of the CPython interpreter. No
guarantees are made that bytecode will not be added, removed, or
changed between versions of Python. Use of this module should not be
considered to work across Python VMs or Python releases.
On the other hand, Jython "consists of a compiler to compile Python source code down to Java bytecodes which can run directly on a JVM" and IronPython compiles to CIL to run on the .NET VM.
The purpose is to better integrate into your programming environment. CPython allows you to write C extensions, but this is not necessarily true of other implementations. Jython allows you to interact with Java code. I'm sure similar is true of IronPython.

If so, is there is a benchmark, or place where I can compare
performance comparison from many interpreters?
speed.pypy.org compares pypy to cpython

Calling Python from Ruby - PyPy compatibility

I'm looking to call Python code from Ruby. There are a few existing tools to do this and a few questions on this site recommending http://rubypython.rubyforge.org/, which works by embedding the Python interpreter in Ruby. I'm working on an app that uses libraries unique to Python (namely graph-tool, which I have reasons for using over, say RGL), but the final project is in Rails so having Ruby code do the controlling work would be ideal. I want it to be speedy so I'm using PyPy. Is there a way to get the PyPy interpreter embedded in Ruby code, or to make the Python interpreter in rubypython run PyPy?

No. Well, not without a lot of work.
First, RubyPython doesn't really include an embedded Python interpreter; it just wraps the interpreter at runtime. As shown in the docs, you can run it with any Python you want, e.g.:
>> RubyPython.start(:python_exe => "python2.6")
So, what happens when you try?
>> RubyPython.start(:python_exe => "/usr/local/bin/pypy")
RubyPython::InvalidInterpreter: An invalid interpreter was specified.
from /Library/Ruby/Gems/1.8/gems/rubypython-0.6.3/lib/rubypython.rb:67:in `start'
from /Library/Ruby/Gems/1.8/gems/rubypython-0.6.3/lib/rubypython/python.rb:10:in `synchronize'
from /Library/Ruby/Gems/1.8/gems/rubypython-0.6.3/lib/rubypython/python.rb:10:in `synchronize'
from /Library/Ruby/Gems/1.8/gems/rubypython-0.6.3/lib/rubypython.rb:54:in `start'
from (irb):4
Unfortunately, it requires CPython 2.4-2.7. It doesn't work with CPython 3.x, PyPy, Jython, etc. Again, from the docs:
RubyPython has been tested with the C-based Python interpreter (cpython), versions 2.4 through 2.7. Work is planned to enable Python 3 support, but has not yet been started. If you’re interested in helping us enable Python 3 support, please let us know.
Without looking at the code, I'm guessing rubypython is using rubyffi to either:
* Wrap the CPython embedding APIs, or
* Directly call CPython VM internals via its dll/so/dylib exports.
If it's the former, the project might be doable, but still a lot of work. PyPy doesn't support CPython's embedding APIs. If it had its own embedded APIs, you could potentially rewrite rubypython's lower level to wrap those instead, and leave the higher-level code alone. But embedding PyPy at all is still a work in progress, (See http://mail.python.org/pipermail/pypy-dev/2012-March/009661.html for the state of affairs 6 months ago.) So, you'd need to first help get PyPy embedding ready for prime time and stable, and then port the lower level of rubypython to use the different APIs.
If it's the latter, you're pretty much SOL. PyPy will never support the CPython internals, and much of what's internal for CPython is actually written in RPython or Python and then compiled for PyPy, so it's not even possible in principle. You'd have to drastically rewrite all of rubypython to find some way to make it work, instead of just porting the lower level.
One alternative is to port Ruby to RPython and use PyPy to build a Ruby interpreter and a Python interpreter that can talk to each other at a higher level; then, writing something like rubypython for PyRuby and PyPy would be trivial. But that first step is a doozy.

Speeding up Parts of Existing Python App with PyPy or Shedskin

I am looking to bring speed improvements to an existing application and I'm looking for advice on my possible options. The application is written in Python, uses wxPython, and is packaged with py2exe (I only target windows platforms). Parts of the application are computationally intensive and run too slowly in interpreted Python. I am not familiar with C, so porting parts of the code over is not really an option for me.
So my question is basically do I have a clear picture of my options as I outline below, or am I approaching this from the wrong direction?
Running with pypy: Today I started experimenting with Pypy - the results are exciting, in that I can run large parts of the code from the pypy interpreter and I'm seeing 5x+ speed improvements with no code changes. However, if I understand correctly, (a) Pypy with wxpython support is still a work in progress, and (b) I cannot compile it down to an exe for distribution anyway. So unless I'm mistaken, this seems like a no-go for me? There's no way to package things up so parts of it are executed with pypy?
Converting code to RPython, translating with pypy So the next option seems to be actually rewriting parts of the code to the pypy restricted language, which seems like a pretty large job. But if I do that, parts of the code can then be compiled to an executable (?) and then I can access the code through ctypes (?).
Other restricted options Shedskin seems to be a popular alternative here, does this fit my requirements better? Other options seem to be Cpython, Psyco, and Unladen, but they are all superseded or no longer maintained.

Using PyPy indeed rules out py2exe and similar tools, at least until one is ported (AFAIK there is no active work on that). Still, as PyPy binaries do not need to be installed, you might get away with a more complicated distribution that includes both your Python source code and a PyPy binary+stdlib and uses a small wrapper (batch file, executable) to ease launching. I can't comment on whether WxPython on PyPy is mature enough to be used, but perhaps someone on pypy-dev, wxpython-dev or either one's IRC channel can give a recommendation if you describe your situation.
Translating your code into RPython does not seem viable to me. The translation toolchain is not really a tool for general purpose development, and producing a C dll for embedding/ctypes seems nontrivial. Also, RPython code really is low-level, making your Python code restricted enough may amount to rewriting half of it.
As for other restricted options: You seem to mix up CPython (the original Python interpreter written in C) with Cython (a compiler for a Python-like language that emits C code suitable for CPython extension modules). Both projects are active. I'm not very familiar with Shedskin, but it seems to be a tool for developing whole programs, with little or no interaction with non-restricted Python code. Cython seems a much better fit: Although it requires manual type annotations and lower-level code to achieve really good performance, it's trivial to use from Python: The very purpose of the project is producing extension modules.

I would definitely look into Cython, I've been playing with it some and have seen speedups of ~100x over pure python. Use the profile module to find the bottlenecks first. Usually the loops are the biggest chances to increase speed when going to Cython. You should also look into seeing if you can use array/vector operations in Numpy instead of loops, if so that can also give extreme performance boosts. For instance:
a = range(1000000)
for i in range(len(a)):
a[i] += 5
is slow, real slow. On the other hand:
a = numpy.arange(10000000)
a = a +5
is fast, real fast.

Correction: shedskin can be used to generare extention modules, as well as whole programs.

Is the Python GIL really per interpreter?

I often see people talking that the GIL is per Python Interpreter (even here on stackoverflow).
But what I see in the source code it seems to be that the GIL is a global variable and therefore there is one GIL for all Interpreters in each python process. I know they did this because there is no interpreter object passed around like lua or TCL does, it was just not designed well in the beginning. And thread local storage seems to be not portable for the python guys to use.
Is this correct? I had a short look at the 2.4 version I'm using in a project here.
Had this changed in later versions, especially in 3.0?

The GIL is indeed per-process, not per-interpreter. This is unchanged in 3.x.

Perhaps the confusion comes about because most people assume Python has one interpreter per process. I recall reading that the support for multiple interpreters via the C API was largely untested and hardly ever used. (And when I gave it a go, didn't work properly.)

I believe it is true (at least as of Python 2.6) that each process may have at most one CPython interpreter embedded (other runtimes may have different constraints). I'm not sure if this is an issue with the GIL per se, but it is likely due to global state, or to protect from conflicting global state in third-party C modules. From the CPython API Docs:
[Py___Initialize()] is a no-op when called for a second time (without calling Py_Finalize() first). There is no return value; it is a fatal error if the initialization fails.
You might be interested in the Unladen Swallow project, which aims eventually to remove the GIL entirely from CPython. Other Python runtimes don't have the GIL at all, like (I believe) Stackless Python, and certainly Jython.
Also note that the GIL is still present in CPython 3.x.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.