LLVM, Parrot, JVM, PyPy + python

What are the obstacles to implementing a language such as Python on top of LLVM or Parrot in order to benefit from their optimization techniques?
PyPy, LLVM and Parrot are the main technologies for building a common language platform.
I see them like this:
PyPy - a framework for building VMs, with a built-in optimized VM for Python.
So it is quite a general solution. The process goes as follows:
1. dynamic_language_code ->
2. PyPy frontend ->
3. PyPy internal code (bytecode) ->
4. PyPy optimization ->
5. leaving PyPy code and:
   a. PyPy backend for some VM (like the JVM)
   b. some kit to make your own VM
   c. processing/running PyPy internal code
Am I right about this process? Is there an optimized VM for Python? In particular, is there by default a built-in VM for the optimized PyPy code (step 5.c), which is meant for Python, so that the processing of any language can stop there and be run by it?
Parrot - much like PyPy, but without 5.a and 5.b? It has some internal improvements for dynamic processing (Parrot Magic Cookies).
Both Parrot and PyPy are designed to provide a common runtime for dynamic languages, but PyPy wants more: it also wants to create more VMs.
What is the sense of PyPy's approach? Why do we need to create more VMs? Wouldn't it be better to focus on one VM (as Parrot does), since there is one common code level anyway: either PyPy's internal bytecode or Parrot's?
I think we gain nothing by translating PyPy bytecode to VMs newly created with PyPy.
LLVM - I see this as very similar to PyPy but without the VM generator.
It is a mature, well-designed environment with goals similar to PyPy's (but without the VM generator), working on low-level structures and with great optimization/JIT techniques implemented.
I see it like this: LLVM is general purpose, but Parrot and PyPy are designed for dynamic languages. In PyPy / Parrot it is easier to introduce some complicated techniques because they are more high level than LLVM, like a sophisticated compiler which can better understand high-level code and produce better assembler code (which humans can't write in reasonable time) than the LLVM one?
Questions:
Am I right? Is there any reason why porting a dynamic language to LLVM would be better than porting it to, for example, Parrot?
I haven't seen any activity on developing Python on Parrot. Is it because Python C extensions don't work on Parrot? PyPy has the same problem.
Why don't other VMs want to move to LLVM / Parrot, e.g. Ruby -> Parrot, CLR / JVM -> LLVM? Wouldn't it be better for them to move to a more sophisticated solution? LLVM is under heavy development and has big companies investing in it.
I know the problem might be in recompile are resources, if there is need to change bytecode - but it is not obligatory - as we can try to port old bytecode to new one, and new compilers produce new bytecode (never less java still need to interpreted own bytecode - so the frontend can check it and translate it to new bytecode)?
What are the problems with linking, for example, JVM libraries inside LLVM (if we somehow port Java/JVM/Scala to LLVM)?
Can you correct me if I'm wrong somewhere?
Some additions:
How does Parrot compare to other virtual machines
What's the benefit of Parrot VM for end-users
What are the differences between LLVM and java/jvm
=============
CLARIFICATION
I want to figure out how all this software fits together, and what the problems are in porting one to another.

That's not something anybody can possibly answer in a Stack Overflow question, but I'll give it a minimal shot.
First, what problems do the three projects solve?
PyPy allows you to implement an interpreter in a high-level language, and you get a generated JIT for free. The good thing about this is that you don't have a dependency mismatch between the language and the platform. That's the reason why pypy-clr is faster than IronPython.
More info here: http://codespeak.net/pypy/dist/pypy/doc/extradoc.html --> High performance implementation of Python for CLI/.NET with JIT compiler generation for dynamic)
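To make the "implement an interpreter in a high-level language" idea a bit more concrete, here is a minimal sketch of the kind of dispatch loop one would write. It is plain Python rather than RPython, and the opcode names (PUSH, ADD, MUL) and the run function are made up for illustration. In PyPy the loop would additionally carry JIT hints, which are left out here.

```python
def run(program):
    """Interpret a list of (opcode, argument) pairs on a small stack machine."""
    stack = []
    pc = 0  # index of the next instruction to execute
    while pc < len(program):
        # In an RPython interpreter, a JIT hint marking the top of the
        # dispatch loop would sit here; this sketch leaves it out.
        opcode, argument = program[pc]
        if opcode == "PUSH":
            stack.append(argument)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))
        pc += 1
    return stack.pop()


# (2 + 3) * 4 expressed in the made-up bytecode.
example_program = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
]
example_result = run(example_program)
```

The point is only that the language semantics live in ordinary high-level code; the translation toolchain, not the interpreter author, is responsible for turning such a description into a fast VM.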
LLVM is a low-level infrastructure for compilers. The general idea is to have one "high-level assembly". All the optimizations work on that language. Then there is a ton of infrastructure around it to help you build compilers (JIT or AOT). Implementing a dynamic language on LLVM is possible but needs more work than implementing it on PyPy or Parrot. You can't, for example, get a GC for free (there are GCs you can use together with LLVM; see http://llvm.org/devmtg/2009-10/ --> the VMKit video). There are attempts to build a platform better suited to dynamic languages on top of LLVM: http://www.ffconsultancy.com/ocaml/hlvm/
I don't know that much about Parrot, but as I understand it they want to build one standard VM specialized for dynamic languages (Perl, PHP, Python, ...). The problem here is the same as with compiling to the JVM/CLR: there is a dependency mismatch, just a much smaller one. The VM still does not know the semantics of your language. As I understand it, Parrot is still pretty slow for user code (http://confreaks.net/videos/118-elcamp2010-parrot).
The answers to your questions:
Am I right? Is there any reason why porting a dynamic language to LLVM would be better than porting it to, for example, Parrot?
That's a question of effort. Building everything yourself, specialized for your language, will eventually be faster, but it's a LOT more effort.
I haven't seen any activity on developing Python on Parrot. Is it because Python C extensions don't work on Parrot? PyPy has the same problem.
Targeting Parrot would (at this point) not likely have a benefit over PyPy. Why nobody else does it, I don't know.
Why don't other VMs want to move to LLVM / Parrot, e.g. Ruby -> Parrot, CLR / JVM -> LLVM? Wouldn't it be better for them to move to a more sophisticated solution? LLVM is under heavy development and has big companies investing in it.
OK, there is a lot of stuff in that question.
Like I said, LLVM is hard to move to and Parrot is not that fast (correct me if I'm wrong).
Ruby has Rubinius, which tries to do a lot in Ruby and JITs to LLVM (http://llvm.org/devmtg/2009-10/ --> Accelerating Ruby with LLVM).
There are implementations of the CLR/JVM on LLVM, but both already have very mature implementations with big investments behind them.
LLVM is not high level.
I know the problem might be in recompile are resources, if there is need to change bytecode - but it is not obligatory - as we can try to port old bytecode to new one, and new compilers produce new bytecode (never less java still need to interpreted own bytecode - so the frontend can check it and translate it to new bytecode)?
I have no idea what the question is.
What are the problems with linking, for example, JVM libraries inside LLVM (if we somehow port Java/JVM/Scala to LLVM)?
Watch the VMKit video I linked above; it shows how far they got, what the problems are, and how they solved them.
Can you correct me if I'm wrong somewhere?
Lots of what you wrote is wrong, or I just don't understand what you mean, but the material I linked should make a lot of it clearer.
Some examples:
Clojure
The creator didn't want all the work of implementing his own VM and all the libraries. So where to go? Since Clojure is a new language, it could be designed to work well on a platform like the JVM by restricting a lot of the dynamic features a language like Python or Ruby has.
Python
The language can't (practically) be changed to work well on the JVM/CLR, so implementing Python on those won't bring massive speedups. A static compiler won't work very well either, because there are not many static guarantees. Writing a JIT in C will be fast but very hard to change (see the Psyco project). Using the LLVM JIT could work and is being explored by the Unladen Swallow project (again http://llvm.org/devmtg/2009-10/ --> Unladen Swallow: Python on LLVM). Some people wanted to have Python in Python, so they started PyPy, and their idea seems to work really well (see above). Parrot could work as well, but I have not seen anybody try (feel free).
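A small illustration of why there are so few static guarantees (the add function and the Money class are made up for this sketch): the same call site can see any type, and the meaning of "+" can be rebound while the program is running.

```python
def add(a, b):
    # Nothing here tells a compiler what types a and b will be.
    return a + b


class Money:
    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        return Money(self.amount + other.amount)


# One call site, three unrelated meanings of "+".
mixed_results = [add(1, 2), add("a", "b"), add([1], [2])]

before_patch = add(Money(5), Money(3)).amount

# The operator itself can be replaced at runtime, so even code specialized
# for Money would need a guard checking that the class has not changed.
Money.__add__ = lambda self, other: Money(self.amount - other.amount)

after_patch = add(Money(5), Money(3)).amount
```

A static compiler has to assume all of this can happen, which is roughly why approaches that observe the program at runtime tend to do better for Python.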
On everything:
I think you're confused, and I can understand that. Take your time and read, listen to and watch everything you can get. Don't stress yourself. There are a lot of parts to this, and eventually you will see how they fit together and what makes sense; even when you know a lot, there is still plenty left to discuss. The question of where to implement a new language, or how to speed up an old language, has many answers, and if you ask three people you're likely to get three different answers.

What are you trying to implement? Your question is very confusingly worded (I realize English is likely not your first language).
LLVM and PyPy are both mature, useful projects, but really don't overlap much at this point. (At one point, PyPy could generate LLVM bytecode—which was statically compiled to an interpreter—as opposed to C code, but it didn't provide much of a performance benefit and is no longer supported.)
PyPy lets you write an interpreter in RPython and use that as a description to generate a native code interpreter or JIT; LLVM is a C++ framework for building a compiler toolchain which can also be used to implement a JIT. LLVM's optimizers, code generation and platform support are significantly more advanced than those of PyPy, but it isn't as well suited to building a dynamic language runtime (see the Unladen Swallow retrospective for some examples of why). In particular, it is not as effective at collecting/using runtime feedback (which is absolutely essential for making dynamic languages perform well) as PyPy's trace-based JIT. Also, LLVM's garbage collection support is still somewhat primitive, and it lacks PyPy's unique ability to automatically generate a JIT.
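To give a rough feel for what "collecting/using runtime feedback" means, here is a toy sketch. It is not how PyPy's tracing JIT or LLVM actually work; the FeedbackAdd class and its WARMUP threshold are invented for illustration. A call site counts the argument types it observes and, once int + int has been seen often enough, installs a specialized path protected by a type guard.

```python
class FeedbackAdd:
    """Toy call site that specializes after seeing the same types repeatedly."""

    WARMUP = 3  # arbitrary threshold chosen for this sketch

    def __init__(self):
        self.seen = {}          # (type, type) -> number of observations
        self.fast_types = None  # types the fast path was specialized for
        self._fast = None       # the specialized implementation, once installed

    def __call__(self, a, b):
        key = (type(a), type(b))
        if key == self.fast_types:
            # Guard passed: the arguments match the specialized types.
            return self._fast(a, b)
        # Generic path: record the observed types, then do the normal lookup.
        self.seen[key] = self.seen.get(key, 0) + 1
        if key == (int, int) and self.seen[key] >= self.WARMUP:
            self.fast_types = key
            self._fast = int.__add__
        return a + b


add = FeedbackAdd()
warmup_results = [add(i, 1) for i in range(5)]
```

After the warm-up calls, int arguments take the guarded fast path, while something like add("a", "b") fails the guard and falls back to the generic path. A real JIT does the same kind of thing on machine code and across whole loops.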
Incidentally, two Java implementations are built on LLVM: J3/VMKit and Shark.
You might consider watching the PyPy talk from Stanford last week; it provides a pretty decent overview of how PyPy works. Carl Friedrich Bolz's presentation also provides a good overview of the state of VM implementation.

The main reason? Because VM design is not a settled technology, and having a variety of VMs with different goals and objectives allows a variety of mechanisms to be tried in parallel rather than all having to be tried in series.
The JVM, CLR, PyPy, Parrot, LLVM and the rest all target different kinds of problems in different ways. It's similar to the reasons why Chrome, Firefox, Safari and IE all use their own Javascript engines.
Unladen Swallow attempted to apply LLVM to CPython, and spent more of their time fixing issues in LLVM than they did in doing anything Python specific.
Python-on-Parrot suffered from semantic differences between Perl 6 and Python causing problems with the front-end compilation process, so future efforts in this area are likely to use the PyPy front-end to target the Parrot VM.
Different VM developers certainly keep an eye on what the others are doing, but even when they lift good ideas they will put their own spin on them before incorporating them.

Related

AoT Compiler for Python

I want to get my Python script working on a bare metal device like microcontroller WITHOUT the need for an interpreter. I know there are already JIT compilers for Python like PyPy, and interpreters like CPython.
However, existing interpreters I've seen (such as CPython) take up large memory (in MB range).
Is there an AOT compiler for Python (i.e. compiling directly to native hardware through intermediary like LLVM)?
I assume such a compiler would enable Python to run much faster compared to existing implementations AND with lower memory footprint. If there is, I wonder why that solution hasn't been popularized.

As you already mentioned, Cython is an option (however, it is true that the result is big, since the C runtime needs to implement the Python functionality together with your program).
With regard to LLVM, there was a project by Google named Unladen Swallow. However, that project is mostly abandoned. You can find some information about it here
Basically it was an attempt to bring LLVM optimizations into the runtime of CPython, e.g. JIT-compiling Python code.
Another older alternative was Shed Skin, which compiles Python to C++. Some information about it can be found here.
Yet another option, similar to Shed Skin, is to restrict yourself to a subset of the Python language and use MicroPython.

An alternative would be to use GraalVM with Truffle AOT with Python.
It's basically Python running on an optimized AOT compiler for the JVM.
The project looks promising. You can check this link here:
https://www.graalvm.org/22.2/graalvm-as-a-platform/language-implementation-framework/AOT/

Recently came across codon,
Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead. Typical speedups over Python are on the order of 10-100x or more, on a single thread. Codon's performance is typically on par with (and sometimes better than) that of C/C++.

Speeding up Parts of Existing Python App with PyPy or Shedskin

I am looking to bring speed improvements to an existing application and I'm looking for advice on my possible options. The application is written in Python, uses wxPython, and is packaged with py2exe (I only target windows platforms). Parts of the application are computationally intensive and run too slowly in interpreted Python. I am not familiar with C, so porting parts of the code over is not really an option for me.
So my question is basically do I have a clear picture of my options as I outline below, or am I approaching this from the wrong direction?
Running with pypy: Today I started experimenting with Pypy - the results are exciting, in that I can run large parts of the code from the pypy interpreter and I'm seeing 5x+ speed improvements with no code changes. However, if I understand correctly, (a) Pypy with wxpython support is still a work in progress, and (b) I cannot compile it down to an exe for distribution anyway. So unless I'm mistaken, this seems like a no-go for me? There's no way to package things up so parts of it are executed with pypy?
Converting code to RPython, translating with pypy: So the next option seems to be actually rewriting parts of the code to the pypy restricted language, which seems like a pretty large job. But if I do that, parts of the code can then be compiled to an executable (?) and then I can access the code through ctypes (?).
Other restricted options: Shedskin seems to be a popular alternative here, does this fit my requirements better? Other options seem to be Cpython, Psyco, and Unladen, but they are all superseded or no longer maintained.

Using PyPy indeed rules out py2exe and similar tools, at least until one is ported (AFAIK there is no active work on that). Still, as PyPy binaries do not need to be installed, you might get away with a more complicated distribution that includes both your Python source code and a PyPy binary+stdlib and uses a small wrapper (batch file, executable) to ease launching. I can't comment on whether WxPython on PyPy is mature enough to be used, but perhaps someone on pypy-dev, wxpython-dev or either one's IRC channel can give a recommendation if you describe your situation.
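One variation on the "ship a PyPy binary next to your code" idea, sketched under some assumptions: the GUI keeps running under CPython, and only the heavy computation is handed to a bundled PyPy process, with JSON over stdin/stdout as the interface. PYPY_BINARY, WORKER_SOURCE and run_in_worker are names made up for this sketch, and sys.executable stands in for the PyPy binary so the example runs without PyPy installed.

```python
import json
import subprocess
import sys

# Hypothetical location of a PyPy binary shipped with the application.
# sys.executable is used as a stand-in so the sketch runs anywhere.
PYPY_BINARY = sys.executable

# Worker code kept inline for brevity; in a real layout this would be a
# separate .py file distributed alongside the PyPy binary.
WORKER_SOURCE = """
import json, sys
numbers = json.load(sys.stdin)
json.dump({"sum_of_squares": sum(n * n for n in numbers)}, sys.stdout)
"""


def run_in_worker(numbers, interpreter=PYPY_BINARY):
    """Send numbers to a worker process as JSON and return its JSON reply."""
    completed = subprocess.run(
        [interpreter, "-c", WORKER_SOURCE],
        input=json.dumps(numbers),
        capture_output=True,
        text=True,
        check=True,  # raise if the worker exits with an error
    )
    return json.loads(completed.stdout)


worker_reply = run_in_worker([1, 2, 3])
```

Whether the process start-up and serialization cost is acceptable depends on how long each computation runs, so this only makes sense for coarse-grained work.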
Translating your code into RPython does not seem viable to me. The translation toolchain is not really a tool for general purpose development, and producing a C dll for embedding/ctypes seems nontrivial. Also, RPython code really is low-level; making your Python code restricted enough may amount to rewriting half of it.
As for other restricted options: You seem to mix up CPython (the original Python interpreter written in C) with Cython (a compiler for a Python-like language that emits C code suitable for CPython extension modules). Both projects are active. I'm not very familiar with Shedskin, but it seems to be a tool for developing whole programs, with little or no interaction with non-restricted Python code. Cython seems a much better fit: Although it requires manual type annotations and lower-level code to achieve really good performance, it's trivial to use from Python: The very purpose of the project is producing extension modules.

I would definitely look into Cython; I've been playing with it and have seen speedups of ~100x over pure Python. Use the profile module to find the bottlenecks first. Loops usually offer the biggest gains when moving to Cython. You should also check whether you can use array/vector operations in NumPy instead of loops, since that can also give extreme performance boosts. For instance:
a = list(range(1000000))
for i in range(len(a)):
    a[i] += 5
is slow, real slow. On the other hand:
import numpy
a = numpy.arange(1000000)
a = a + 5
is fast, real fast.
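For the "find the bottlenecks first" step, a minimal sketch using the standard library's cProfile and pstats modules could look like this. The slow_increment function and the profile_call helper are made up for the example.

```python
import cProfile
import io
import pstats


def slow_increment(values):
    # Deliberately loop-heavy stand-in for a computational hot spot.
    for i in range(len(values)):
        values[i] += 5
    return values


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_call(slow_increment, list(range(100000)))
```

Printing report shows where the time went; only the functions that dominate it are worth porting to Cython or NumPy.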

Correction: Shedskin can be used to generate extension modules, as well as whole programs.

Python - IronPython dilemma

I'm starting to study Python, and so far I like it very much. But there are a few questions which have been troubling me and to which I can't find any definite answers:
What is the relationship between Python's C implementation (the main version from python.org) and IronPython in terms of language compatibility? Is it the same language, so that by learning one I will be able to move smoothly to the other, or is it Java to JavaScript?
What is the current status of IronPython's libraries? How much do they lag behind CPython's libraries? I'm mostly interested in numpy/scipy and f2py. Are they available to IronPython?
What would be the best way to access VB from Python and the other way around (connecting some Python libraries to Excel's VBA, to be exact)?

1) IronPython and CPython share nearly identical language syntax. There is very little difference between them. Transitioning should be trivial.
2) The libraries in IronPython are very different than CPython. The Python libraries are a fair bit behind - quite a few of the CPython-accessible libraries will not work (currently) under IronPython. However, IronPython has clean, direct access to the entire .NET Framework, which means that it has one of the most extensive libraries natively accessible to it, so in many ways, it's far ahead of CPython. Some of the numpy/scipy libraries do not work in IronPython, but due to the .NET implementation, some of the functionality is not necessary, since the perf. characteristics are different.
3) Accessing Excel VBA is going to be easier using IronPython, if you're doing it from VBA. If you're trying to automate Excel, IronPython is still easier, since you have access to the Excel Primary Interop Assemblies, and can directly automate it using the same libraries as C# and VB.NET.

What is the relationship between Python's C implementation (main version from python.org) and IronPython, in terms of language compatibility? Is it the same language, and do I by learning one, will be able to smoothly cross to another, or is it Java to JavaScript?
Same language (at 2.5 level for now -- IronPython's not 2.6 yet AFAIK).
What is the current status to IronPython's libraries? How much does it lags behind CPython libraries? I'm mostly interested in numpy/scipy and f2py. Are they available to IronPython?
Standard libraries are in a great state in today's IronPython; huge third-party extensions like the ones you mention are far from it. numpy's starting to get feasible thanks to ironclad, but it is not yet as production-level on IronPython as it is on CPython (as witnessed by the 0.5 version number for ironclad, &c). scipy is huge and sprawling and chock full of C-coded and Fortran-coded extensions: I have no first-hand experience but I suspect less than half will even run, much less run flawlessly, under any implementation except CPython.
What would be the best way to access VB from Python and the other way back (connecting some python libraries to Excel's VBA, to be exact)?
IronPython should make it easier via .NET approaches, but CPython is not that far via COM implementation in win32all.
Last but not least, by all means check out the book IronPython in Action -- as I say every time I recommend it, I'm biased (by having been a tech reviewer for it AND by friendship with one author) but I think it's objectively the best intro to Python for .NET developers AND at the same time the best intro to .NET for Pythonistas.
If you need all of scipy (WOW, but that's some "Renaissance Man" computational scientist!-), CPython is really the only real option today. I'm sure other large extensions (PyQt, say, or Mayavi) are in a similar state. For deep integration to today's Windows, however, I think IronPython may have an edge. For general-purpose uses, CPython may be better (esp. thanks to the many new features in 2.6), unless you're really keen to use many cores to the hilt within a single process, in which case IronPython (which is GIL-less) may prove advantageous again.
One way or another (or even on the JVM via Jython, or in peculiar environments via PyPy) Python is surely an awesome language, whatever implementation(s) you pick for a given application!-) Note that you don't need to stick with ONE implementation (though you should probably pick one VERSION -- 2.5 for maximal compatibility with IronPython, Jython, Google App Engine, etc; 2.6 if you don't care about any deployment options except "CPython on a machine under my own sole or virtual control";-).

IronPython version 2.0.2, the current release, supports Python 2.5 syntax. With the next release, 2.6, which is expected sometime over the next month or so (though I'm not sure the team have set a hard release date; here's the beta), they will support Python 2.6 syntax. So, less Java to JavaScript and more Java to J# :-)
All of the libraries that are themselves written in Python work pretty much perfectly. The ones that are written in C are more problematic; there is an open source project called Ironclad (full disclosure: developed and supported by my company), currently at version 0.8.5, which supports numpy pretty well but doesn't cover all of scipy. I don't think we've tested it with f2py. (The version of Ironclad mentioned below by Alex Martelli is over a year old, please avoid that one!)
Calling regular VB.NET from IronPython is pretty easy -- you can instantiate .NET classes and call methods (static or instance) really easily. Calling existing VBA code might be trickier; there are open source projects called Pyinex and XLW that you might want to take a look at. Alternatively, if you want a spreadsheet you can code in Python, then there's always Resolver One (another one from my company... :-)

1) The languages implemented by CPython and IronPython are the same, or at most a version or two apart. This is nothing like the situation with Java and JavaScript, which are two completely different languages given similar names by some bone-headed marketing decision.
2) 3rd-party libraries implemented in C (such as numpy) will have to be evaluated carefully. IronPython has a facility to execute C extensions (I forget the name), but there are many pitfalls, so you need to check with each library's maintainer.
3) I have no idea.

CPython is implemented in C for each supported platform, such as Windows, Linux or Unix; IronPython is implemented in C# on the .NET Framework, so it runs on Windows with the .NET Framework (and, via Mono, on other platforms).
The grammar of both is the same, but we cannot say they are exactly the same language. IronPython can use the .NET Framework easily when you develop on the Windows platform.
As of July 21, 2009, IronPython 2.0.2 is the latest stable release of IronPython. You can refer to http://www.codeplex.com/Wiki/View.aspx?ProjectName=IronPython.
You can access VB through .NET Framework functions from IronPython. So, if you want to master IronPython, you'd better learn more about the .NET Framework.

Using Jython with Django?

I am planning to use Jython with Django. I want to know how stable the Jython project is, how easy to use it is, and how large its developer community is.

Django is proven to work with Jython:
Special focus in Jython 2.5 was to make it compatible with modern web frameworks like Django
There is also a special project, django-jython, that focuses on making database backends and extensions available for Jython development.
There is explicit documentation on how to run Django on Jython
In theory, Jython is 100% compatible with CPython. In practice, some extensions or libraries may contain badly written code that makes them dependent on a specific Python implementation such as CPython. The django-jython project explicitly provides a tested solution to overcome this problem. Of course, you can still run across some libraries that explicitly require CPython (hence mostly safe).
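As a sketch of how library code can avoid depending on one implementation: report which interpreter is running, and fall back to a pure-Python module when an optional accelerated one is not importable. The module name hypothetical_fast_json is invented for the example (it is assumed not to exist); platform.python_implementation() is a real standard-library call.

```python
import platform

# Typically "CPython", "Jython", "IronPython" or "PyPy".
IMPLEMENTATION = platform.python_implementation()


def load_json_backend():
    """Prefer an optional accelerated module, else use the standard library."""
    try:
        # Made-up name standing in for a C extension that may be missing
        # on implementations other than CPython.
        import hypothetical_fast_json as backend
    except ImportError:
        import json as backend
    return backend


backend = load_json_backend()
decoded = backend.loads('{"framework": "Django"}')
```

Code written this way keeps working on Jython even when the accelerated module is unavailable; it is only slower.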

I have not used Django with Jython, so I can't speak to that specific issue, but I've used Jython for other things and I've found it quite stable of late, and just as easy as plain Python. I believe the "core committers" in Jython are substantially fewer than in C-Python (maybe 1/3 the number or less), if that's what you mean by "developer community", but I'm not quite sure what's the point in asking about this -- are you considering joining either developer community (Jython or Core Python) and wondering where you could have the best impact?
If that's the case, I think the key issue isn't really how many others are already helping out, but, "what do you bring to the party" -- if you're a JVM wizard, or an expert at any important Java framework, you could be a real boon to the Jython community while that same skill would help much less in the C-Python community; vice versa, if you're a wizard, say, with autoconfigure and C-coded system calls, that would be precious for the C-Python community, but not as useful for the Jython community.

I use Jython in testing and rapid-development.
From my point of view it is stable.

Pros and cons of IronPython and IronPython Studio

Our company is ready to move everything from C# to Python. We are a consulting company and usually write small projects in C#; we don't do huge projects, and our work is based more on complex mathematical models than on complex software structures. So we believe IronPython is a good platform for us, because it provides standard GUI functionality on Windows and access to all of the .NET libraries.
I know IronPython Studio is not complete, and in fact I had a hard time adding my references, but I was wondering if someone could list some of the pros and cons of this migration for us. Consider that Python code is easier for our clients to read, and that we usually deliver a proof-of-concept prototype instead of fully functional code; our clients usually go ahead and implement the application themselves.

My company, Resolver Systems, develops what is probably the biggest application written in IronPython yet. (It's called Resolver One, and it's a Pythonic spreadsheet). We are also hosting the Ironclad project (to run CPython extensions under IronPython) and that is going well (we plan to release a beta of Resolver One & numpy soon).
The reason we chose IronPython was the .NET integration - our clients want 100% integration on Windows and the easiest way to do that right now is .NET.
We design our GUI (without behaviour) in Visual Studio, compile it into a DLL and subclass it from IronPython to add behaviour.
We have found that IronPython is faster in some cases and slower in others. However, the IronPython team is very responsive: whenever we report a regression they fix it and usually backport the fix to the bugfix release. If you worry about performance, you can always implement a critical part in C# (we haven't had to do that yet).
If you have experience with C#, then IronPython will be natural for you, and easier than C#, especially for prototypes.
Regarding IronPython studio, we don't use it. Each of us has his editor of choice (TextPad, Emacs, Vim & Wing), and everything works fine.

There are a lot of reasons why you might want to switch from C# to Python; I did this myself recently. After a lot of investigating, here are the reasons why I stick to CPython:
Performance: There are some articles out there stating that there are always cases where IronPython is slower, so performance may be an issue.
Take the original: many people argue that new features etc. are always integrated in CPython first and you have to wait until they are implemented in IronPython.
Licensing: Some people argue this is a time bomb: nobody knows how the licensing of IronPython/Mono might change in the near future.
Extensions: one of the strengths of Python is the thousands of extensions, which are all usable by CPython. As you mentioned mathematical problems: numpy might be a suitable fast package for you, and it might not run as expected under IronPython (even with Ironclad).
Especially under Windows you have a native GUI toolkit with wxPython, which also looks great on several other platforms, and there are PyQt and a lot of other toolkits. They have nice designers like wxGlade, but here the Visual Studio C# designer is easier to use.
Platform independence (if this is an issue): CPython has been ported to a great many platforms, whereas IronPython can only be used on the major platforms (I recently read that a developer was sad that he couldn't get Mono to run under his AIX).
IronPython is great work, and if I had a special .NET library I had to use, IronPython might be the choice; but for general-purpose problems, people seem to suggest using the original CPython, unless Guido changes his mind.

The way you describe things, it sounds like your company is switching to Python simply for the sake of Python. Is there some specific reason you want to use Python? Is a more dynamic language necessary? Is the functional programming going to help you at all? If you've got a perfectly good working set of tools in C#, why bother switching?
If you're set on switching, you may want to consider starting with standard Python unless you're specifically tied to the .NET libraries. You can write cross platform GUIs using a number of different frameworks like wxPython, pyQt, etc. That said, Visual Studio has a far superior GUI designer to just about any of the tools out there for creating Python windowed layouts.
