Where can I find documentation on assembler?

Where can I find documentation on assembler? - python

I wrote a very short program that parses a "program" using python and converts it to assembler, allowing me to compile my little proramming language to an executable.
You can read my blog for more information here http://spiceycurry.blogspot.com/2010/05/simple-compilable-programming-language.html
my question is... Where can I find more kernel commands so that I can further expand on my script in the above blog?

I'd rather recomment using LLVM:
It allows you not to bother with low-level details like register allocations (you provide only SSA form)
It does optimizations for you. It can be faster then the hand-written and well-optimized compiler as the LLVM pipeline in GHC is showing (at the beginning - before much optimalization it had equal or better performance than mature native code generator).
It is cross-platform - you don't tight yourself to for example x86
I'n not quite sure what you mean by 'kernel' commands. If you mean opcodes:
There are Intel manuals
There is Wikipedia page containing all of the documented memnonics (however not the opcodes and not always description)
There is NASM manual
However the ARM or PowerPC have totally different opcodes.
If you mean the operating systen syscalls (system calls) then:
You can just use C library. It is in every operating system and is cross platform.
You can use directly syscalls. However they are harder to use, may be slower (libc may use additional buffering) and are not cross platform (Linux syscalls on x86 - may be not up-to-date).

Related

AoT Compiler for Python

I want to get my Python script working on a bare metal device like microcontroller WITHOUT the need for an interpreter. I know there are already JIT compilers for Python like PyPy, and interpreters like CPython.
However, existing interpreters I've seen (such as CPython) take up large memory (in MB range).
Is there an AOT compiler for Python (i.e. compiling directly to native hardware through intermediary like LLVM)?
I assume such a compiler would enable Python to run much faster compared to existing implementations AND with lower memory footprint. If there is, I wonder why that solution hasn't been popularized.

As you already mentioned Cython is an option (However, it is true that the result is big due since the C runtime need to implement the Python functionality together with your program).
With regards to LLVM there was a project by Google named unladen swallow. However, that project is mostly abandoned. You can find some information about it here
Basically it was an attempt to bring LLVM optimizations into the runtime of Cython. E.g JITTING Python code.
Another old alternative was shed skin which compiles Python to C++. Some information about it can be found here.
Yet another option similar to shed skin is to restrict yourself to a subset of the Python language and use micropython.

An alternative would be to use GraalVM with Truffle AOT with Python.
It's basically Python running on an optimized AOT for jvm.
The project looks promising. You can chek this link here:
https://www.graalvm.org/22.2/graalvm-as-a-platform/language-implementation-framework/AOT/

Recently came across codon,
Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead. Typical speedups over Python are on the order of 10-100x or more, on a single thread. Codon's performance is typically on par with (and sometimes better than) that of C/C++.

Performance differences between python from package and python compiled from source

I would like to know if there are any documented performance differences between a Python interpreter that I can install from an rpm (or using yum) and a Python interpreter compiled from sources (with a priori well set flags for compilations).
I am using a Redhat 6.3 machine as Django/Apache/Mod_WSGI production server. I have already properly compiled everything in different setups and in different orders. However, I usually keep the build-dev dependencies on such machine. For some various ego-related (and more or less practical) reasons, I would like to use Python-2.7.3. By default, Redhat comes with Python-2.6.6. I think I could go with it but it would hurt me somehow (I would have to drop and find a replacement for a few libraries and my ego).
However, besides my ego and dependencies, I would like to know what would be the impact in terms of performance for a Django server.

If you compile with the exact same flags that were used to compile the RPM version, you will get a binary that's exactly as fast. And you can get those flags by looking at the RPM's spec file.
However, you can sometimes do better than the pre-built version. For example, you can let the compiler optimize for your specific CPU, instead of for "general 386 compatible" (or whatever the RPM was optimized for). Of course if you don't know what you're doing (or are doing it on purpose), it's always possible to build something slower than the pre-built version, too.
Meanwhile, 2.7.3 is faster in a few areas than 2.6.6. Most of them usually won't affect you, but if they do, they'll probably be a big win.
Finally, for the vast majority of Python code, the speed of the Python interpreter itself isn't relevant to your overall performance or scalability. (And when it is, you probably want to try PyPy, Jython, or IronPython to replace CPython.) This is especially true for a WSGI service. If you're not doing anything slow, Apache will probably be the bottleneck. If you are doing anything slow, it's probably something I/O bound and well outside of Python's control (like reading files).
Ultimately, the only way you can know how much gain you get is by trying it both ways and performance testing. But if you just want a rule of thumb, I'd say expect a 0% gain, and be pleasantly surprised if you get lucky.

Use Cython as Python to C Converter

I have huge Python modules(+8000 lines) .They basically have tons of functions for interacting with a hardware platform via serial port by reading and writing to hardware registers.
They are not numerical algorithms. So application is just reading/writing to hardware registers/memory. I use these libraries to write custom scripts. Eventually, I need to move all these stuff to be run in an embedded processor on my hardware to have finer control, then I just kick off the event from PC and the rest is in hardware.
So I need to convert them to C.If I can have my scripts be converted to C by an automatic tool, that would save me a huge time. This is why I got attracted to Cython. Efficiency is not important my codes are not number crunchers. But generated code should be relatively small to fit in my limited memory (few hundreds of Kilobytes).
Can I use Cython as converter for my custom Python scripts to C? My guess is yes, in which case can I use these .c files to be run in my hardware? My guess is not since I have to have Cython in my hardware as well for them to run. But if just creates some .c files, I can go through and make it standalone since code is not using much of features of Python it just use it as a fast to implement script.

Yes, at its core this is what Cython does. But ...
You don't need Cython, however, you do need libpython. You may feel like it doesn't use that many Python features, but I think if you try this you'll find it's not true -- you won't be able to separate your program from its dependence on libpython while still using the Python language.
Another option is PyPy, specifically it's translation toolchain, NOT the PyPy Python interpreter. It lets you translate RPython, a subset of the Python language, into C. If you really aren't using many Python language features or libraries, this may work.
PyPy is mostly known as an alternative Python implementation, but it is also a set of tools for compiling dynamic languages into various forms. This is what allows the PyPy implementation of Python, written in (R)Python, to be compiled to machine code.
If C++ is available, Nuitka is a Python to C++ compiler that works for regular Python, not just RPython (which is what shedskin and PyPy use).

If C++ is available for that embedded platform, there is shed skin, it converts python into c++.

LLVM, Parrot, JVM, PyPy + python

What is the problem in developing some languages, for example python for some optimized techniques with some of LLVM / Parrot.
PyPy, LLVM, Parrot are the main technologies for common platform development.
I see this like:
PyPy - framework to build VM with build in optimized VM for python
So it quite general solution. The process goes as listed down:
dynamic_language_code ->
PyPy frontend ->
PyPy internal code - bytecode ->
PyPy optimization ->
leaving PyPy code and:
a. PyPy backend for some VM (like jvm)
b. som Kit to make own VM
c. processing/running PyPy internal code
Am I right About this process? For python there is optimized VM? Particularly by default there is build in VM for optimized PyPy code (step 5.c) - which is for python and every language processing can stop there and be running by it?
Parrot - much like PyPy, but without 5.a and 5.b ? Some internal improvements for dynamic processing (Parrot Magic Cookies).
Both Parrot and PyPy are designed to create a platform which create a common dynamic languages runtime, but PyPy wants more - also to create more VM.
Where is the sens of PyPy? For what we need to create more VM? Shouldn't be better to focus on one VM (like in parrot) - because there is common one code level - either PyPy internal bytecode or Parrot ones.
I think we can't gain nothing better to translate to PyPy bytecode to newly created with PyPy VMs.
LLVM - i see this very similar to PyPy but without VM generator.
It is mature, well designed environment with similar targets as PyPy (but without VM generator) but working on low level structure and great optimization/JIT techniques implemeted
Is see this as: LLVM is general use, but Parrot and **PyPy* designed for dynamic languages. In PyPy / Parrot is more easy to introduce some complicated techniques - because it is more high level then LLVM - like sophisticate compiler which can better understand high level code and produce better assembler code (which humans can't write in reasonable time), then the LLVM one?
Questions:
Am I right? Is there any reason that porting some dynamic language would be better to llvm then to for example Parrot?
I haven't see the activity on development python on Parrot. Is it because using python C extensions doesn't work on parrot? The same problem is in PyPy
Why other VMs don't want to move to LLVM / parrot. Eg ruby -> parrot, CLR/ JVM -> LLVM. Wouldn't be better for them to move to more sophisticated solution? LLVM is in high development process and has big companies investing in.
I know the problem might be in recompile are resources, if there is need to change bytecode - but it is not obligatory - as we can try to port old bytecode to new one, and new compilers produce new bytecode (never less java still need to interpreted own bytecode - so the frontend can check it and translate it to new bytecode)?
What are the problems with linking for example jvm libraries inside llvm (if we port somehow java/jvm/scala to llvm)?
Can you correct me if i'm wrong somewhere
Some addings:
How does Parrot compare to other virtual machines
What's the benefit of Parrot VM for end-users
What are the differences between LLVM and java/jvm
=============
CLARIFICATION
I want to figure how all this software consist - and what is the problem to porting one to other.

That not stuff anybody can possible answer in a stackoverflow questions but i give it a minmal shot.
First what problems do the 3 projects solve?
pypy allows you to implement an interpreter in a high level language and you get a generated jit for free. The good thing about this is that you don't have a dependence mismatch between the langauge and the platform. Thats the reason why pypy-clr is faster then IronPython.
More info here: http://codespeak.net/pypy/dist/pypy/doc/extradoc.html --> High performance implementation of Python for CLI/.NET with JIT compiler generation for dynamic)
llvm is a low level infrastructure for compilers. The general idea is to have one "high level assembly". All the optomizations work on that language. Then there is tons of infrastructure around to help you build compilers (JIT or AOT). Implementing a dynamic language on llvm is possible but needs more work then implementing it on pypy or parrot. You, for example, can't get a GC for free (there are GC you can use together with LLVM see http://llvm.org/devmtg/2009-10/ --> the vmkit video ) There are attempts to build a platform better for dynamic langauges based on llvm: http://www.ffconsultancy.com/ocaml/hlvm/
I don't know that much about parrot but as I understand they want to build one standard VM specialized for dynamic langauges (perl, php, python ....). The problem here is the same as with compiling to JVM/CLR there is a dependency missmatch, just a much smaller one. The VM still does not know the semantics of your langauge. As I unterstand parrot is still pretty slow for user code. (http://confreaks.net/videos/118-elcamp2010-parrot)
The answer to your question:
Am I right? Is there any reason that porting some dynamic language would be better to llvm then to for example Parrot?
Thats a question of effort. Building everthing your self and specialized for you will eventually be faster but it's a LOT more effort.
I haven't see the activity on development python on Parrot. Is it because using python C extensions doesn't work on parrot? The same problem is in PyPy.
Targeting parrot would (at this point) not likely have a benefit over pypy. Why nobody else does it I don't know.
Why other VMs don't want to move to LLVM / parrot. Eg ruby -> parrot, CLR/ JVM -> LLVM. Wouldn't be better for them to move to more sophisticated solution? LLVM is in high
development process and has big companies investing in.
Ok there is a lot of stuff in that question.
Like I said LLVM is hard to move to and parrot is not that fast (correct me if im wrong).
Ruby has Rubinius witch tries to do a lot in ruby and jits to llvm (http://llvm.org/devmtg/2009-10/ --> Accelerating Ruby with LLVM).
There is a implementation of CLR/JVM on LLVM but they both already have very mature implemantations that have big investments.
LLVM is not high level.
I know the problem might be in recompile are resources, if there is need to change bytecode - but it is not obligatory - as we can try to port old bytecode to new one, and new compilers produce new bytecode (never less java still need to interpreted own bytecode - so the frontend can check it and translate it to new bytecode)?
I have no idea what the question is.
What are the problems with linking for example jvm libraries inside llvm (if we port somehow java/jvm/scala to llvm)?
Watch the video of VMKit I linked above that show how far they got and what the problem is (and how they solved it).
Can you correct me if i'm wrong somewhere
Lots of stuff you wrote is wrong or I just don't anderstand what you mean, but the stuff I linked should make a lot of stuff clearer.
Some examples:
Clojure
The creater didn't want all the work of implementing his own vm and all the libraries. So where to go to? Since Clojure is a new langauge you can build it in a way that works well on a platform like the JVM by restricting a lot of dynamic stuff a language like python or ruby would have.
Python
The language can't (practically) be changed to work well on JVM/CLR. So implementing python on those wont bring massive speedups. Static compiler won't work very well either because there are not many static guarantees. Writing a JIT in C will be fast but very hard to change (see the psyco project). Using the llvm jit could work and is explored by the Unladen Swallow project (again http://llvm.org/devmtg/2009-10/ --> Unladen Swallow: Python on LLVM). Some people wanted to have python in python so they started pypy and their idea seams to work really well (see above). Parrot could work as well but I have not seen anybody have try (feel free).
On everything:
I think you're confused and I can understand that. Take your time and read, listen, watch everything you can get. Don't stress yourself. There are a lot of parts to this and eventually you see how what fits together and what makes sense and even when you know a lot there is still a lot of discussing one may do. The question is where to implement a new language or how to speed up a old language have many answers and if you ask 3 people you're likely to get three different answers.

What are you trying to implement? Your question is very confusingly worded (I realize English is likely not your first language).
LLVM and PyPy are both mature, useful projects, but really don't overlap much at this point. (At one point, PyPy could generate LLVM bytecode—which was statically compiled to an interpreter—as opposed to C code, but it didn't provide much of a performance benefit and is no longer supported.)
PyPy lets you write an interpreter in RPython and use that as a description to generate a native code interpreter or JIT; LLVM is a C++ framework for building a compiler toolchain which can also be used to implement a JIT. LLVM's optimizers, code generation and platform support are significantly more advanced than those of PyPy, but it isn't as well suited to building a dynamic language runtime (see the Unladen Swallow retrospective for some examples of why). In particular, it is not as effective at collecting/using runtime feedback (which is absolutely essential for making dynamic languages perform well) as PyPy's trace-based JIT. Also, LLVM's garbage collection support is still somewhat primitive, and it lacks PyPy's unique ability to automatically generate a JIT.
Incidentally two Java implementations are built on LLVM—J3/VMKit and Shark.
You might consider watching the PyPy talk from Stanford last week; it provides a pretty decent overview of how PyPy works. Carl Friedrich Bolz's presentation also provides a good overview of the state of VM implementation.

The main reason? Because VM design is not a settled technology, and having a variety of VMs with different goals and objectives allows a variety of mechnisms to be tried in parallel rather than all having to be tried in series.
The JVM, CLR, PyPy, Parrot, LLVM and the rest all target different kinds of problems in different ways. It's similar to the reasons why Chrome, Firefox, Safari and IE all use their own Javascript engines.
Unladen Swallow attempted to apply LLVM to CPython, and spent more of their time fixing issues in LLVM than they did in doing anything Python specific.
Python-on-Parrot suffered from semantic differences between Perl 6 and Python causing problems with the front-end compilation process, so future efforts in this area are likely to use the PyPy front-end to target the Parrot VM.
Different VM developers certainly keep an eye on what the others are doing, but even when they lift good ideas they will put their own spin on them before incorporating them.

Python - IronPython dilemma

I'm starting to study Python, and for now I like it very much. But, if you could just answer a few questions for me, which have been troubling me, and I can't find any definite answers to them:
What is the relationship between Python's C implementation (main version from python.org) and IronPython, in terms of language compatibility ? Is it the same language, and do I by learning one, will be able to smoothly cross to another, or is it Java to JavaScript ?
What is the current status to IronPython's libraries ? How much does it lags behind CPython libraries ? I'm mostly interested in numpy/scipy and f2py. Are they available to IronPython ?
What would be the best way to access VB from Python and the other way back (connecting some python libraries to Excel's VBA, to be exact) ?

1) IronPython and CPython share nearly identical language syntax. There is very little difference between them. Transitioning should be trivial.
2) The libraries in IronPython are very different than CPython. The Python libraries are a fair bit behind - quite a few of the CPython-accessible libraries will not work (currently) under IronPython. However, IronPython has clean, direct access to the entire .NET Framework, which means that it has one of the most extensive libraries natively accessible to it, so in many ways, it's far ahead of CPython. Some of the numpy/scipy libraries do not work in IronPython, but due to the .NET implementation, some of the functionality is not necessary, since the perf. characteristics are different.
3) Accessing Excel VBA is going to be easier using IronPython, if you're doing it from VBA. If you're trying to automate excel, IronPython is still easier, since you have access to the Execl Primary Interop Assemblies, and can directly automate it using the same libraries as C# and VB.NET.

What is the relationship between
Python's C implementation (main
version from python.org) and
IronPython, in terms of language
compatibility ? Is it the same
language, and do I by learning one,
will be able to smoothly cross to
another, or is it Java to JavaScript ?
Same language (at 2.5 level for now -- IronPython's not 2.6 yet AFAIK).
What is the current status to
IronPython's libraries ? How much does
it lags behind CPython libraries ? I'm
mostly interested in numpy/scipy and
f2py. Are they available to IronPython
?
Standard libraries are in a great state in today's IronPython, huge third-party extensions like the ones you mention far from it. numpy's starting to get feasible thanks to ironclad, but not production-level as numpy is from IronPython (as witnessed by the 0.5 version number for ironclad, &c). scipy is huge and sprawling and chock full of C-coded and Fortran-coded extensions: I have no first-hand experience but I suspect less than half will even run, much less run flawlessly, under any implementation except CPython.
What would be the best way to access
VB from Python and the other way back
(connecting some python libraries to
Excel's VBA, to be exact) ?
IronPython should make it easier via .NET approaches, but CPython is not that far via COM implementation in win32all.
Last but not least, by all means check out the book IronPython in Action -- as I say every time I recommend it, I'm biased (by having been a tech reviewer for it AND by friendship with one author) but I think it's objectively the best intro to Python for .NET developers AND at the same time the best intro to .NET for Pythonistas.
If you need all of scipy (WOW, but that's some "Renaissance Man" computational scientist!-), CPython is really the only real option today. I'm sure other large extensions (PyQt, say, or Mayavi) are in a similar state. For deep integration to today's Windows, however, I think IronPython may have an edge. For general-purpose uses, CPython may be better (esp. thanks to the many new features in 2.6), unless you're really keen to use many cores to the hilt within a single process, in which case IronPython (which is GIL-less) may prove advantageous again.
One way or another (or even on the JVM via Jython, or in peculiar environments via PyPy) Python is surely an awesome language, whatever implementation(s) you pick for a given application!-) Note that you don't need to stick with ONE implementation (though you should probably pick one VERSION -- 2.5 for maximal compatibility with IronPython, Jython, Google App Engine, etc; 2.6 if you don't care about any deployment options except "CPython on a machine under my own sole or virtual control";-).

IronPython version 2.0.2, the current release, supports Python 2.5 syntax. With the next release, 2.6, which is expected sometime over the next month or so (though I'm not sure the team have set a hard release date; here's the beta), they will support Python 2.6 syntax. So, less Java to JavaScript and more Java to J# :-)
All of the libraries that are themselves written in Python work pretty much perfectly. The ones that are written in C are more problematic; there is an open source project called Ironclad (full disclosure: developed and supported by my company), currently at version 0.8.5, which supports numpy pretty well but doesn't cover all of scipy. I don't think we've tested it with f2py. (The version of Ironclad mentioned below by Alex Martelli is over a year old, please avoid that one!)
Calling regular VB.NET from IronPython is pretty easy -- you can instantiate .NET classes and call methods (static or instance) really easily. Calling existing VBA code might be trickier; there are open source projects called Pyinex and XLW that you might want to take a look at. Alternatively, if you want a spreadsheet you can code in Python, then there's always Resolver One (another one from my company... :-)

1) The language implemented by CPython and IronPython are the same, or at most a version or two apart. This is nothing like the situation with Java and Javascript, which are two completely different languages given similar names by some bone-headed marketing decision.
2) 3rd-party libraries implemented in C (such as numpy) will have to be evaluated carefully. IronPython has a facility to execute C extensions (I forget the name), but there are many pitfalls, so you need to check with each library's maintainer
3) I have no idea.

CPython is implemented by C for corresponding platform, such as Windows, Linux or Unix; IronPython is implemented by C# and Windows .Net Framework, so it can only run on Windows Platform with .Net Framework.
For gramma, both are same. But we cannot say they are one same language. IronPython can use .Net Framework essily when you develop on windows platform.
By far, July 21, 2009 - IronPython 2.0.2, our latest stable release of IronPython, was released. you can refer to http://www.codeplex.com/Wiki/View.aspx?ProjectName=IronPython.
You can access VB with .Net Framework function by IronPython. So, if you want to master IronPython, you'd better learn more .Net Framework.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.