Is Python bytecode interpreter independent? - python

This seems like an obvious question, but I haven't been able to find a concrete answer to it.
Are Python bytecode and Python source code themselves interpreter independent?
By this I mean: if I take CPython, PyPy, Jython, IronPython, Skulpt, etc., and attempt to run the same piece of Python code or bytecode on each, will it run correctly? (Provided that they implement the same language version and use only modules written strictly in Python or standard modules.)
If so, is there a benchmark or a place where I can compare the performance of many interpreters?
I've been playing for a while with CPython, and now I want to explore new interpreters.
And also a side question: what are the uses of the other implementations of Python?
Skulpt I get (browsers), but the rest? Is there a specific industry or application that requires a different interpreter (and which one)?

From https://docs.python.org/3/library/dis.html#module-dis
Bytecode is an implementation detail of the CPython interpreter. No
guarantees are made that bytecode will not be added, removed, or
changed between versions of Python. Use of this module should not be
considered to work across Python VMs or Python releases.
On the other hand, Jython "consists of a compiler to compile Python source code down to Java bytecodes which can run directly on a JVM" and IronPython compiles to CIL to run on the .NET VM.
The purpose is to better integrate into your programming environment. CPython allows you to write C extensions, but this is not necessarily true of other implementations. Jython allows you to interact with Java code. I'm sure similar is true of IronPython.
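One way to see how closely cached bytecode is tied to a particular interpreter is to ask the running interpreter to identify itself. The sketch below uses only the standard sys and importlib.util modules; the helper name bytecode_identity is made up for this illustration, and the values it reports will differ between interpreters and versions.

```python
import importlib.util
import sys


def bytecode_identity():
    """Collect the values an interpreter uses to tag its cached bytecode."""
    return {
        # For example "cpython" or "pypy".
        "implementation": sys.implementation.name,
        # Used in .pyc file names; may be None if bytecode caching is unsupported.
        "cache_tag": sys.implementation.cache_tag,
        # Leading bytes of every .pyc file written by this interpreter version.
        "magic_number": importlib.util.MAGIC_NUMBER,
    }


if __name__ == "__main__":
    for key, value in bytecode_identity().items():
        print(f"{key}: {value!r}")
```

Running this under two different interpreters, or two releases of the same one, should show different values, which is the practical reason a .pyc file from one is not expected to load in another.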

If so, is there a benchmark or a place where I can compare the
performance of many interpreters?
speed.pypy.org compares PyPy to CPython.

Related

Does WebAssembly run faster if written in C as opposed to Python?

There's a long list of languages that can be compiled into Wasm. Is there any performance gain from writing in something like C or Rust over Python? Or is it all the same since it is being compiled to Wasm?
Short answer: Yes, because it is not Python the language that is compiled to Wasm, but its interpreter.
Saying that Python supports Wasm does not always mean the same thing. First, Python is not conventionally a compiled language; it is a scripting language. Don't expect a scripting language to be compiled to native code (or Wasm), because it is not meant to work that way.
How, then, does Python support Wasm? Python interpreters/runtimes such as CPython, which is written in C, are compiled to Wasm. There are two popular Python runtimes that support Wasm: Pyodide and the Wasm port of MicroPython (there are many other efforts to run Python in a browser besides these two). Both are interpreters that translate Python to their own bytecode and then execute that bytecode in Wasm. Of course there are large performance penalties, just as with CPython in a native environment.
Compiling to WebAssembly is basically just simulating a special form of assembly targeting virtual hardware. When you read "can compile language X" into Wasm, it doesn't always mean the language literally compiles directly to Wasm. In the case of Python, to my knowledge, it means "they compiled Python interpreters to Wasm" (e.g. CPython, PyPy), so the whole Python interpreter is Wasm, but it still interprets Python source code files normally, it doesn't convert them to special Wasm modules or anything. Which means all the overhead of the Python interpreter is there, on top of the overhead of the Wasm engine, etc.
So yes, C and Rust (which can target Wasm directly by swapping out the compiler backend) will still run faster than Python code targeting CPython compiled to Wasm, for the same reasons. Tools that speed up Python when run natively (e.g. Cython, raw CPython C extensions, etc.) may also work in Wasm to get the same speed ups, but it's not a free "Compile slow interpreted language to Wasm and become fast compiled language"; computers aren't that smart yet.

When using the Python Interpreter, is the compiler used at all?

In Google's Python Class it reads
Python is a dynamic, interpreted (bytecode-compiled) language
I know what an interpreter is and know what bytecode is but the two together seem not to fit. After doing some reading it became a bit clearer that basically Python source code is automatically compiled before it is interpreted; but some new questions emerged.
When using the Python Interpreter does no compilation happen? If it does, when? For example if you're just typing code at the command line and it gets run each time you hit enter, when does a compiler have the opportunity to do its work?
Also, in the question linked above, @delnan gives a pretty broad definition of a compiler:
A compiler is, more generally, a program that converts a program in
one programming language into a program in another programming
language...JIT compilers compile to native machine code at runtime
I guess my question is: what's the difference between an interpreter and automatic compiler? To refine the question a bit, if Python is compiled, why not compile all the way to machine code (or assembly, since I know writing compilers that can produce pure machine code is difficult)?
Perhaps it is best to forget semantics and just try to learn what CPython is actually doing. When you invoke the CPython binary, it does a number of things. Generally speaking, you can expect it to translate the code you've written into a sequence of bytecode instructions. This is the "compiling" stage that people sometimes refer to for Python code. Bytecode is a more compact and efficient way to tell the interpreter what to do than your hand-written source. Frequently, CPython caches the bytecode in .pyc files (regenerating it only if the associated .py file is newer). You can think of Python bytecode as the set of instructions that the Python virtual machine can run. In a lot of ways, it's not all that different from what you get with Java. When people speak of compiled languages (e.g. C), the compiler's job is to translate your code into a set of instructions that run directly on your computer's hardware. Implementations like CPython and Java have an extra level of indirection (the virtual machine). The virtual machine runs directly on the computer's hardware and is responsible for interpreting the bytecode.
Compared to standard "compiled" languages (e.g. C, Fortran), this stage is really lightweight, and Python doesn't do a lot of the checking that "traditional" compilers do (e.g. type checking). It pretty much only checks the syntax and does a few very simple optimizations using the peephole optimizer.
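The two stages described above can be made visible with the built-in compile() and exec() functions. This is only a sketch: the names SOURCE, compile_and_run, total and broken are invented for the example, and it mimics the stages by hand and does not show CPython's internal code path.

```python
# Source that is syntactically valid but contains a call that must fail:
# len() of an integer is a type error that is only detected at run time.
SOURCE = """
def total(values):
    return sum(values)

def broken():
    return len(5)
"""


def compile_and_run(source):
    """Compile source text to a code object, then execute it in a fresh namespace."""
    code_obj = compile(source, "<example>", "exec")  # the "compile" stage
    namespace = {}
    exec(code_obj, namespace)  # the "interpret" stage
    return namespace


if __name__ == "__main__":
    ns = compile_and_run(SOURCE)
    print(ns["total"]([1, 2, 3]))
    try:
        ns["broken"]()
    except TypeError as exc:
        print("only caught at run time:", exc)
    try:
        compile("def oops(:\n    pass\n", "<bad>", "exec")
    except SyntaxError as exc:
        print("caught at compile time:", exc.msg)
```

The type error in broken() survives compilation and only surfaces when the function is called, whereas the malformed def is rejected before anything runs.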

Python vs Cpython

What's all this fuss about Python and CPython (Jython,IronPython), I don't get it:
python.org mentions that CPython is:
The "traditional" implementation of Python (nicknamed CPython)
yet another Stack Overflow question mentions that:
CPython is the default byte-code interpreter of Python, which is written in C.
Honestly, I don't get what either of those explanations means in practice. What I thought was that if I use CPython, then when I run some sample Python code, it compiles it to C and then executes it as if it were C code.
So what exactly is CPython, how does it differ from Python, and should I use CPython over Python? If so, what are its advantages?
So what is CPython?
CPython is the original Python implementation. It is the implementation you download from Python.org. People call it CPython to distinguish it from other, later, Python implementations, and to distinguish the implementation of the language engine from the Python programming language itself.
The latter part is where your confusion comes from; you need to keep Python-the-language separate from whatever runs the Python code.
CPython happens to be implemented in C. That is just an implementation detail, really. CPython compiles your Python code into bytecode (transparently) and interprets that bytecode in an evaluation loop.
CPython is also the first to implement new features; Python-the-language development uses CPython as the base; other implementations follow.
What about Jython, etc.?
Jython, IronPython and PyPy are the current "other" implementations of the Python programming language; these are implemented in Java, C# and RPython (a subset of Python), respectively. Jython compiles your Python code to Java bytecode, so your Python code can run on the JVM. IronPython lets you run Python on the Microsoft CLR. And PyPy, being implemented in (a subset of) Python, lets you run Python code faster than CPython, which rightly should blow your mind. :-)
Actually compiling to C
So CPython does not translate your Python code to C by itself. Instead, it runs an interpreter loop. There is a project that does translate Python-ish code to C, and that is called Cython. Cython adds a few extensions to the Python language, and lets you compile your code to C extensions, code that plugs into the CPython interpreter.
You need to distinguish between a language and an implementation.
According to Wikipedia, "A programming language is a notation for writing programs, which are specifications of a computation or algorithm". This means that it's simply the rules and syntax for writing code. Separately we have a programming language implementation which in most cases, is the actual interpreter or compiler.
Python is a language.
CPython is the implementation of Python in C. Jython is the implementation in Java, and so on.
To sum up: You are already using CPython (if you downloaded from here).
I had the same problem understanding how CPython, Jython, IronPython and PyPy differ from each other.
So, I want to clarify three things before I begin to explain:
Python: It is a language, it only states/describes how to convey/express yourself to the interpreter (the program which accepts your python code).
Implementation: It is all about how the interpreter was written, specifically, in what language and what it ends up doing.
Bytecode: It is the code that is processed by a program, usually referred to as a virtual machine, rather than by the "real" computer machine, the hardware processor.
CPython is the implementation, which was
written in C language. It ends up producing bytecode (stack-machine
based instruction set) which is Python specific and then executes it.
The reason to convert Python code to a bytecode is because it's easier to
implement an interpreter if it looks like machine instructions. But,
it isn't necessary to produce some bytecode prior to execution of the
Python code (but CPython does produce).
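To illustrate why an instruction list is easy to interpret, here is a toy stack machine. It is a sketch only: the opcode names PUSH_CONST, ADD, MULTIPLY and RETURN are hypothetical and are not CPython's real instruction set, and CPython's actual evaluation loop is written in C and is far more involved.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a toy stack machine."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH_CONST":
            stack.append(arg)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MULTIPLY":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == "RETURN":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    raise ValueError("program ended without RETURN")


# Instructions for the expression (2 + 3) * 4.
PROGRAM = [
    ("PUSH_CONST", 2),
    ("PUSH_CONST", 3),
    ("ADD", None),
    ("PUSH_CONST", 4),
    ("MULTIPLY", None),
    ("RETURN", None),
]

if __name__ == "__main__":
    print(run(PROGRAM))
```

Each instruction does one small thing to the stack, so the interpreter is little more than a loop with a branch per opcode.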
If you want to look at CPython's bytecode then you can. Here's how you can:
>>> def f(x, y):                # line 1
...     print("Hello")          # line 2
...     if x:                   # line 3
...         y += x              # line 4
...     print(x, y)             # line 5
...     return x+y              # line 6
...                             # line 7
>>> import dis                  # line 8
>>> dis.dis(f)                  # line 9
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Hello')
              4 CALL_FUNCTION            1
              6 POP_TOP
  3           8 LOAD_FAST                0 (x)
             10 POP_JUMP_IF_FALSE       20
  4          12 LOAD_FAST                1 (y)
             14 LOAD_FAST                0 (x)
             16 INPLACE_ADD
             18 STORE_FAST               1 (y)
  5     >>   20 LOAD_GLOBAL              0 (print)
             22 LOAD_FAST                0 (x)
             24 LOAD_FAST                1 (y)
             26 CALL_FUNCTION            2
             28 POP_TOP
  6          30 LOAD_FAST                0 (x)
             32 LOAD_FAST                1 (y)
             34 BINARY_ADD
             36 RETURN_VALUE
Now, let's have a look at the above code. Lines 1 to 6 are a function definition. In line 8, we import the 'dis' module which can be used to view the intermediate Python bytecode (or you can say, disassembler for Python bytecode) that is generated by CPython (interpreter).
NOTE: I got the link to this code from #python IRC channel: https://gist.github.com/nedbat/e89fa710db0edfb9057dc8d18d979f9c
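The listing above reflects one particular CPython version; newer CPython releases no longer emit BINARY_ADD, for example. A small sketch for checking what your own interpreter produces is shown below. It uses dis.get_instructions from the standard library; the helper name opcode_names is made up for this example, and the output is deliberately not shown because it varies by version.

```python
import dis
import sys


def f(x, y):
    print("Hello")
    if x:
        y += x
    print(x, y)
    return x + y


def opcode_names(func):
    """Return the opcode names of a function, in order, for this interpreter."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    print(sys.implementation.name, sys.version_info[:2])
    print(opcode_names(f))
```

Comparing the printed lists from two Python versions is a quick way to confirm that bytecode is not a stable interface.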
And then there is Jython, which is written in Java and ends up producing Java bytecode. Java bytecode runs on the Java Runtime Environment (JRE), which includes an implementation of the Java Virtual Machine (JVM). If this is confusing then I suspect that you have no clue how Java works. In layman's terms, Java (the language, not the compiler) code is taken by the Java compiler, which outputs a file of Java bytecode that can be run only by a JRE. This is done so that, once the Java code is compiled, it can be moved to other machines in Java bytecode format and run there by a JRE. If this is still confusing then you may want to have a look at this web page.
Here you may ask whether CPython's bytecode is portable in the way Jython's is. I suspect not. The bytecode produced by the CPython implementation is specific to that interpreter and exists to make further execution of the code easier. (I also suspect that producing such intermediate bytecode, just for ease of processing, is done in many other interpreters.)
So, in Jython, when you compile your Python code, you end up with Java byte code, which can be run on a JVM.
Similarly, IronPython (written in C#) compiles your Python code down to Common Intermediate Language (CIL), which runs on the Common Language Runtime (CLR), a technology developed by Microsoft that is similar to the JVM.
This article thoroughly explains the difference between different implementations of Python. Like the article puts it:
The first thing to realize is that ‘Python’ is an interface. There’s a
specification of what Python should do and how it should behave (as
with any interface). And there are multiple implementations (as with
any interface).
The second thing to realize is that ‘interpreted’ and ‘compiled’ are
properties of an implementation, not an interface.
Python is a language: a set of rules that can be used to write programs. There are several implementations of this language.
No matter what implementation you take, they do pretty much the same thing: take the text of your program and execute its instructions. None of them translates your code into C source, although some compile it to a bytecode first (CPython to its own bytecode, Jython to Java bytecode).
CPython is the original implementation, written in C. (The "C" part in "CPython" refers to the language that was used to write Python interpreter itself.)
Jython is the same language (Python), but implemented using Java.
IronPython interpreter was written in C#.
There's also PyPy - a Python interpreter written in Python. Make your pick :)
Because Python is open source, anyone can customize it to their requirements and name the customized version as they wish. That's why multiple flavours of Python are available. Each flavour is a customized version of Python built to fulfil a special requirement. This is similar to the way there are multiple flavours of Unix-like systems, such as Ubuntu and Red Hat Linux. Below are some of the flavours of Python:
CPython
The default implementation of the Python programming language, which we download from python.org, provided by the Python Software Foundation. It is written in C and Python. The programs we run with it contain only Python code, not C code. CPython can be called both an interpreter and a compiler, because our Python code is first compiled to Python bytecode and then the bytecode is interpreted into platform-specific operations by the PVM (Python Virtual Machine). Keep in mind that an interpreter has the behaviour of each instruction predefined, which is why the bytecode does not need to be translated into low-level machine code. The interpreter simply executes the bytecode on the fly at run time, which results in platform-specific operations.
Old versions of JavaScript, Ruby and PHP were fully interpreted: their interpreters translated each line of source code directly into platform-specific operations, with no bytecode involved. Bytecode exists in Java, Python, C++ (.NET) and C# to decouple the language from the execution environment, i.e. for portability: write once, run anywhere. Since 2008, Google Chrome's V8 JavaScript engine has included a just-in-time compiler for JavaScript. It executes JavaScript code line by line like an interpreter to reduce start-up time, but if it encounters a hot section of repeatedly executed code, it optimizes that code using a baseline or optimizing compiler.
Cython
Cython is a programming language that is a superset of Python with optional C-style type declarations. It is written in C and Python. It is designed to give C-like performance with Python syntax and optional C syntax. Cython is a compiled language: it generates C code, which is then compiled by a C compiler. We can write much the same code in Cython as in default Python (CPython); the differences are:
Cython allows us to write optional additional C code, and
In Cython, our Python code is translated into C code internally so that it can be compiled by a C compiler. Although Cython results in considerably faster execution, it falls short of the speed of hand-written C. This is because Cython still has to make calls into the CPython interpreter and the CPython standard libraries to execute the parts of the code that remain ordinary Python.
JPython / Jython
Java implementation of the Python programming language. It is written in Java and Python. Here our Python code is first compiled to Java bytecode, and then that bytecode is executed by the JVM (Java Virtual Machine). This is similar to how Java code is executed: Java code is first compiled to intermediate bytecode, and then that bytecode is turned into platform-specific operations by the JVM.
PyPy
RPython implementation of the Python programming language. It is written in a restricted subset of Python called Restricted Python (RPython). PyPy runs faster than CPython on many workloads because it adds a just-in-time (JIT) compiler to its bytecode interpreter, whereas CPython has only an interpreter. PyPy starts by interpreting bytecode like CPython does, which keeps start-up time low, but when it encounters a hot section of repeatedly executed code, it compiles that section to machine code.
JIT Compiler in a Nutshell: The compiler in Python translates our high-level source code into bytecode, and to execute that bytecode some implementations have a plain interpreter while others have a just-in-time compiler. Consider a loop that runs, say, a million times, i.e. very hot code. Initially the interpreter runs it for a while and the JIT's monitor watches that code. When the code has been repeated a number of times, i.e. it becomes warm, the JIT sends it to a baseline compiler, which makes assumptions about variable types etc. based on the data gathered by the monitor. In later iterations, if the assumptions turn out to be valid, there is no need to process the bytecode again, i.e. steps can be skipped for faster execution. If the code repeats many more times, i.e. it becomes very hot, the JIT sends it to an optimizing compiler, which makes more assumptions and skips more steps for very fast execution.
JIT Compiler Drawbacks: Execution is slower at first while the code is being analysed, and if the assumptions turn out to be false, the optimized compiled code is thrown away, i.e. deoptimization (bailing out) happens, which can make execution slower. JIT compilers do limit the optimization/deoptimization cycle: after a certain number of deoptimizations, the JIT simply stops trying to optimize that code. A plain interpreter, by contrast, decodes and dispatches the same bytecode again on every iteration, and so takes more time to complete a loop that runs, say, a million times.
IronPython
C# implementation of python, targeting the .NET framework
Ruby Python
works with Ruby platform
Anaconda Python
Distribution of the Python and R programming languages for scientific computing: data science, machine learning, artificial intelligence, deep learning, handling large volumes of data, etc. Numerous libraries, such as scikit-learn, TensorFlow, PyTorch, Numba, pandas, Jupyter, NumPy and Matplotlib, are available with this package.
Stackless
Python for Concurrency
To test the speed of each implementation, we write a program that calls integrate_f 500 times with an N value of 50,000, and record the execution time over several runs. The table below shows the benchmark results:
Implementation      Execution Time (seconds)   Speed Up
CPython             9.25                       -
CPython + Cython    0.21                       44x
PyPy                0.57                       16x
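The source of integrate_f is not shown here. The sketch below assumes it is the numerical integration function from the Cython tutorial (a left-endpoint rectangle sum of f(x) = x**2 - x); that definition and the time_calls helper are assumptions made for illustration, and timings from this sketch are not expected to reproduce the figures in the table.

```python
import time


def f(x):
    return x ** 2 - x


def integrate_f(a, b, N):
    """Approximate the integral of f over [a, b] with N left-endpoint rectangles."""
    s = 0.0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx


def time_calls(calls, N):
    """Return the elapsed seconds for `calls` calls of integrate_f(0, 1, N)."""
    start = time.perf_counter()
    for _ in range(calls):
        integrate_f(0.0, 1.0, N)
    return time.perf_counter() - start


if __name__ == "__main__":
    # The text describes 500 calls with N = 50,000; smaller values finish sooner.
    print(f"{time_calls(500, 50_000):.2f} s")
```

The same file can be run unchanged under CPython and PyPy, which is what makes a pure-Python loop like this a convenient comparison.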
Implementation refers to the language used to implement Python, not to how Python code itself is written. The advantage of using CPython is the availability of the C runtime as well as easy integration with C/C++.
So CPython was originally implemented in C. Other implementations were created later that enable Python to leverage the Java runtime (Jython) or the .NET runtime (IronPython).
Based on which Implementation you use, library availability might vary, for example Ctypes is not available in Jython, so any library which uses ctypes would not work in Jython. Similarly, if you want to use a Java Class, you cannot directly do so from CPython. You either need a glue (JEPP) or need to use Jython (The Java Implementation of Python)
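Code that has to run on more than one implementation can check what is available and skip assuming it. The following is a minimal sketch using the standard platform module and a guarded import; the helper names load_ctypes and describe_runtime are invented for this example.

```python
import platform


def load_ctypes():
    """Return the ctypes module if this interpreter provides it, else None."""
    try:
        import ctypes
    except ImportError:
        return None
    return ctypes


def describe_runtime():
    """Report which implementation is running and whether ctypes is usable."""
    return {
        "implementation": platform.python_implementation(),
        "has_ctypes": load_ctypes() is not None,
    }


if __name__ == "__main__":
    print(describe_runtime())
```

A library can use a check like this to fall back to a pure-Python code path when ctypes is missing.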
You should know that CPython doesn't support multithreading well (it does support it, but not optimally) because of the Global Interpreter Lock. It also has no optimisation mechanism for recursion, and it has other limitations that other implementations and libraries try to address.
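The recursion point can be demonstrated directly. In this sketch the function names depth and exceeds_limit are made up; it relies only on sys.getrecursionlimit() and the built-in RecursionError, and the exact limit depends on the interpreter and its configuration.

```python
import sys


def depth(n):
    """Recurse n levels deep; nothing flattens these calls into a loop."""
    if n == 0:
        return 0
    return 1 + depth(n - 1)


def exceeds_limit(n):
    """Return True if recursing n levels raises RecursionError."""
    try:
        depth(n)
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    limit = sys.getrecursionlimit()
    print("recursion limit:", limit)
    print("twice the limit fails:", exceeds_limit(limit * 2))
```

Deeply recursive algorithms therefore usually need to be rewritten as loops, or run with a raised limit at the programmer's own risk.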
You should take a look at this page on the python wiki.
Look at the code snippets on this page, it'll give you a good idea of what an interpreter is.
The original, and standard, implementation of Python is usually called CPython when
you want to contrast it with the other options (and just plain “Python” otherwise). This
name comes from the fact that it is coded in portable ANSI C language code. This is
the Python that you fetch from http://www.python.org, get with the ActivePython and
Enthought distributions, and have automatically on most Linux and Mac OS X machines.
If you’ve found a preinstalled version of Python on your machine, it’s probably
CPython, unless your company or organization is using Python in more specialized
ways.
Unless you want to script Java or .NET applications with Python or find the benefits
of Stackless or PyPy compelling, you probably want to use the standard CPython system.
Because it is the reference implementation of the language, it tends to run the
fastest, be the most complete, and be more up-to-date and robust than the alternative
systems.
A programming language implementation is a system for executing computer programs.
There are two general approaches to programming language implementation:
Interpretation: An interpreter takes as input a program in some language, and performs the actions written in that language on some machine.
Compilation: A compiler takes as input a program in some language, and translates that program into some other language, which may serve as input to another interpreter or another compiler.
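The two approaches can be contrasted on a tiny expression language. This is a sketch under invented conventions: expressions are nested tuples such as ("add", left, right), and the function names interpret and compile_to_python are hypothetical. The interpreter performs the computation itself, while the compiler only produces a program in another language (here, Python source text) for something else to run.

```python
# A tiny expression language: ("num", 3), ("add", left, right), ("mul", left, right).


def interpret(node):
    """Interpreter: perform the computation the expression describes."""
    kind = node[0]
    if kind == "num":
        return node[1]
    left, right = interpret(node[1]), interpret(node[2])
    if kind == "add":
        return left + right
    if kind == "mul":
        return left * right
    raise ValueError(f"unknown node kind: {kind}")


def compile_to_python(node):
    """Compiler: translate the expression into Python source text."""
    kind = node[0]
    if kind == "num":
        return repr(node[1])
    symbol = {"add": "+", "mul": "*"}[kind]
    return f"({compile_to_python(node[1])} {symbol} {compile_to_python(node[2])})"


EXPRESSION = ("add", ("num", 1), ("mul", ("num", 2), ("num", 3)))

if __name__ == "__main__":
    print(interpret(EXPRESSION))
    print(compile_to_python(EXPRESSION))
```

The compiler's output still needs an executor (here, Python's own eval), which mirrors how bytecode produced by a compiler is handed to a virtual machine.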
Python is an interpreted, high-level programming language created by Guido van Rossum and first released in 1991.
CPython is the reference implementation of the Python programming language; it is written in C and was also created by Guido van Rossum.
Other list of Python Implementations
Source
CPython is the default implementation of Python and the one we get on our system when we download Python from its official website.
CPython compiles a Python source file with the .py extension into intermediate bytecode, which is usually stored with the .pyc extension and is executed by the CPython virtual machine. This implementation of Python provides maximum compatibility with Python packages and C extension modules.
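The .py to .pyc step can be reproduced by hand with the standard py_compile and marshal modules. This sketch assumes CPython 3.7 or newer, where a .pyc file is a 16-byte header followed by a marshalled code object; that layout is an implementation detail and the helper name run_from_pyc is invented for the example.

```python
import marshal
import os
import py_compile
import tempfile


def run_from_pyc(source_text):
    """Compile source to a .pyc file, then load and execute the bytecode from it."""
    with tempfile.TemporaryDirectory() as tmp:
        py_path = os.path.join(tmp, "example.py")
        pyc_path = os.path.join(tmp, "example.pyc")
        with open(py_path, "w", encoding="utf-8") as handle:
            handle.write(source_text)
        py_compile.compile(py_path, cfile=pyc_path, doraise=True)
        with open(pyc_path, "rb") as handle:
            data = handle.read()
    # Assumption: CPython 3.7+ layout, where a 16-byte header precedes
    # the marshalled code object.
    code_obj = marshal.loads(data[16:])
    namespace = {}
    exec(code_obj, namespace)
    return namespace


if __name__ == "__main__":
    print(run_from_pyc("answer = 6 * 7\n")["answer"])
```

Normally the import system does all of this automatically; the sketch only makes the steps visible.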
There are many other Python implementations, such as IronPython, Jython, PyPy, Stackless Python and more.

Calling Python from Ruby - PyPy compatibility

I'm looking to call Python code from Ruby. There are a few existing tools to do this and a few questions on this site recommending http://rubypython.rubyforge.org/, which works by embedding the Python interpreter in Ruby. I'm working on an app that uses libraries unique to Python (namely graph-tool, which I have reasons for using over, say RGL), but the final project is in Rails so having Ruby code do the controlling work would be ideal. I want it to be speedy so I'm using PyPy. Is there a way to get the PyPy interpreter embedded in Ruby code, or to make the Python interpreter in rubypython run PyPy?
No. Well, not without a lot of work.
First, RubyPython doesn't really include an embedded Python interpreter; it just wraps the interpreter at runtime. As shown in the docs, you can run it with any Python you want, e.g.:
>> RubyPython.start(:python_exe => "python2.6")
So, what happens when you try?
>> RubyPython.start(:python_exe => "/usr/local/bin/pypy")
RubyPython::InvalidInterpreter: An invalid interpreter was specified.
from /Library/Ruby/Gems/1.8/gems/rubypython-0.6.3/lib/rubypython.rb:67:in `start'
from /Library/Ruby/Gems/1.8/gems/rubypython-0.6.3/lib/rubypython/python.rb:10:in `synchronize'
from /Library/Ruby/Gems/1.8/gems/rubypython-0.6.3/lib/rubypython/python.rb:10:in `synchronize'
from /Library/Ruby/Gems/1.8/gems/rubypython-0.6.3/lib/rubypython.rb:54:in `start'
from (irb):4
Unfortunately, it requires CPython 2.4-2.7. It doesn't work with CPython 3.x, PyPy, Jython, etc. Again, from the docs:
RubyPython has been tested with the C-based Python interpreter (cpython), versions 2.4 through 2.7. Work is planned to enable Python 3 support, but has not yet been started. If you’re interested in helping us enable Python 3 support, please let us know.
Without looking at the code, I'm guessing rubypython is using rubyffi to either:
* Wrap the CPython embedding APIs, or
* Directly call CPython VM internals via its dll/so/dylib exports.
If it's the former, the project might be doable, but still a lot of work. PyPy doesn't support CPython's embedding APIs. If it had its own embedding APIs, you could potentially rewrite rubypython's lower level to wrap those instead, and leave the higher-level code alone. But embedding PyPy at all is still a work in progress. (See http://mail.python.org/pipermail/pypy-dev/2012-March/009661.html for the state of affairs 6 months ago.) So, you'd need to first help get PyPy embedding stable and ready for prime time, and then port the lower level of rubypython to use the different APIs.
If it's the latter, you're pretty much SOL. PyPy will never support the CPython internals, and much of what's internal for CPython is actually written in RPython or Python and then compiled for PyPy, so it's not even possible in principle. You'd have to drastically rewrite all of rubypython to find some way to make it work, instead of just porting the lower level.
One alternative is to port Ruby to RPython and use PyPy to build a Ruby interpreter and a Python interpreter that can talk to each other at a higher level; then, writing something like rubypython for PyRuby and PyPy would be trivial. But that first step is a doozy.

What are the pros and cons of the various Python implementations?

I am relatively new to Python, and I have always used the standard cpython (v2.5) implementation.
I've been wondering about the other implementations though, particularly Jython and IronPython. What makes them better? What makes them worse? What other implementations are there?
I guess what I'm looking for is a summary and list of pros and cons for each implementation.
Jython and IronPython are useful if you have an overriding need to interface with existing libraries written in a different platform, like if you have 100,000 lines of Java and you just want to write a 20-line Python script. Not particularly useful for anything else, in my opinion, because they are perpetually a few versions behind CPython due to community inertia.
Stackless is interesting because it has support for green threads, continuations, etc. Sort of an Erlang-lite.
PyPy is an experimental interpreter/compiler that may one day supplant CPython, but for now is more of a testbed for new ideas.
An additional benefit for Jython, at least for some, is it lacks the GIL (the Global Interpreter Lock) and uses Java's native threads. This means that you can run pure Python code in parallel, something not possible with the GIL.
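The kind of code this affects is CPU-bound pure Python run in several threads. The sketch below uses the standard concurrent.futures module; the function names count_primes and count_in_threads are made up, and no timing is claimed. The results are the same everywhere, but on an interpreter with a GIL the threads take turns, while on one without it they may run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound pure-Python work: count primes below limit by trial division."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


def count_in_threads(limits, workers=4):
    """Run count_primes for each limit on a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    print(count_in_threads([10_000, 20_000, 30_000, 40_000]))
```

Timing this script under different implementations is one way to see whether threads actually overlap on a given interpreter.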
All of the implementations are listed here:
https://wiki.python.org/moin/PythonImplementations
CPython is the "reference implementation" and developed by Guido and the core developers.
Pros of Jython and IronPython: access to the libraries available for the JVM or the CLR.
Cons: both naturally lag behind CPython in terms of features.
IronPython and Jython use the runtime environment for .NET or Java and with that comes Just In Time compilation and a garbage collector different from the original CPython. They might be also faster than CPython thanks to the JIT, but I don't know that for sure.
A downside in using Jython or IronPython is that you cannot use native C modules, they can be only used in CPython.
PyPy is a Python implementation written in RPython, which is a subset of Python.
RPython can be translated to run on a VM or, unlike standard Python, it can be statically compiled.
