I want to run a Python program without any underlying OS.
I have read articles on running Python on small microcontrollers, but I want it on a bigger processor (Intel, ARM).
My criteria are:
It could be directly run as a binary.
The Python interpreter could be loaded, onto which I can run my program.
At worst, tell me an extremely small, basic OS I can run it on.
Note: I want to use my program like a minimalistic operating system. I should be able to load it like any other OS, and it should be able to access memory and have basic I/O.
Note 2: Will there be limitations in terms of Python's functions?
Note: this answer describes x86 exclusively, which, along with ARM, is what the OP asked about.
It could be directly run as a binary.
Binary? Python is not compiled to a native binary, so no binary is produced. I think you simply mean "run a Python program directly" here.
You could implement an additional compilation step, of course, so that Python source files are compiled to bytecode prior to being executed.
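A minimal sketch of that pre-compilation step, using the standard library on a normal system (the file names are placeholders):

    # Pre-compile a script to bytecode so the interpreter can skip the
    # parse/compile step at load time. "my_program.py" is a placeholder.
    import py_compile

    py_compile.compile("my_program.py", cfile="my_program.pyc")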
The Python interpreter could be loaded, onto which I can run my program.
"loaded" is a problem here. You need software to load the interpreter, displaying a chicken-egg problem. Intel x86 solves the problem by using a so-called BIOS (Basic I/O System), which starts further, user-defined programs. This "user-defined" program would be your Python interpreter then.
On more modern machines, UEFI is used instead of the legacy BIOS.
I want to use my program like a minimalistic operating system. I
should be able to load it like any other OS, and it should be able to
access memory and have basic I/O.
The aforementioned BIOS provides, as the acronym says, basic I/O functionality, such as reading from and writing to disks and the screen. You can either use these basic routines and build your abstractions on top of them, or bypass them and rewrite everything from scratch. That includes a graphics driver (a basic VGA driver will suffice), disk drivers (for loading Python files from disk), and a filesystem (a simple FAT-16 is sufficient).
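To give a feel for how simple FAT-16 is, here is a hedged sketch that parses a few BIOS Parameter Block fields from a disk image in Python (offsets follow the published FAT specification; "disk.img" is a placeholder):

    # Read basic FAT-16 geometry from the boot sector of a disk image.
    import struct

    with open("disk.img", "rb") as f:
        boot = f.read(512)

    bytes_per_sector, = struct.unpack_from("<H", boot, 11)    # offset 11, 2 bytes
    sectors_per_cluster, = struct.unpack_from("<B", boot, 13) # offset 13, 1 byte
    reserved_sectors, = struct.unpack_from("<H", boot, 14)    # offset 14, 2 bytes
    print(bytes_per_sector, sectors_per_cluster, reserved_sectors)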
After all, you would need to write not only a Python interpreter but a whole development environment from scratch.
Will there be limitations in terms of Python's functions?
It depends on what you implement. For networking you need the appropriate drivers; for file handling, a filesystem plus a secondary-storage driver. You are the ultimate master of the system you create, so it is up to you how limited (or unlimited) your Python environment will be.
Related
My research requires processing memory traces of applications. For C/C++ programs this is easy using Intel's PIN library. However, as suggested in Use Intel Pin to instrument Python scripts, I may need to instrument the Python runtime itself, which I'm not sure will represent the true memory behavior of a given Python script, due to interpreter overheads (if this is not the case, please comment). Some of the existing Python memory profilers only talk about runtime memory "usage" in terms of heap-space consumption, etc.
I ended up making an executable from my Python script using PyInstaller and running my Pintool over it. However, I'm not sure if this is the right approach.
Is there any way (any library, or any hack into the Python runtime) that may help in getting the memory traces accessed by Python scripts?
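For reference, the heap-"usage" view I mean is what the standard library's tracemalloc gives; a minimal sketch (it reports allocation sizes per source line, not the address-level accesses a Pintool records):

    # Heap-usage profiling with tracemalloc: sizes per source line,
    # not the individual memory accesses a PIN-based trace would show.
    import tracemalloc

    tracemalloc.start()
    data = [bytes(1024) for _ in range(100)]  # arbitrary workload
    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.statistics("lineno")[:3]:
        print(stat)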
I need to integrate a body of Python code into an existing OSGi (Apache Felix) deployment.
I assume, or at least hope, that packages exist to help with this effort.
If it helps, the Python code is still relatively new and small, so it can probably be re-architected to meet whatever constraints are needed. However, it must remain in Python because of dependencies on third-party libraries.
What are suggested best practices?
The trick is to make this an extender, see 1 and 2. You want your Python code to be separate from the code that handles the interaction with the interpreter. So what you do is wrap the Python code and any native libraries in a bundle. This is trivial, since a bundle is just a ZIP file.
You then develop a bundle that listens for starting bundles (see the BundleTracker) that contain Python code. A manifest header is often used to mark them, but you can also look for a directory in the JAR. If you detect such code, you extract any native libraries and run the code in the interpreter of your choice.
If you can use Jython, that would be highly recommended. You can then carry the interpreter as an OSGi bundle that runs on the VM. If you need to use a native interpreter, your life is less rosy. You can rely on the environment to provide you with an interpreter, but then why use OSGi in the first place? You basically lose the write-once-run-anywhere advantage. You could go the full monty by creating bundles that contain Python installers for all the platforms you support. It can be done, and is not even that hard, but it is a maintenance nightmare. Believe me, native code sucks; it just does the same thing a bit faster than Java.
I have a C++ application that uses machine learning from Python, and my current approach is making a single-file executable with PyInstaller and then just running it from C++. This has obvious drawbacks, notably inter-application communication. At the moment I'm using an intermediate JSON file for the two to talk to each other, but this is massively suboptimal for my future requirements. What's beautiful about this approach is that it works on all major platforms without too much hassle.
Section 1.6 of Python's manual is titled "Compiling and Linking under Unix-like systems".
Does this mean that the Python interpreter will be inside my application binary, and that the target system doesn't need to have Python installed, since the program will always use the embedded interpreter? If so, what about Python libraries? Can I embed a whole conda environment?
Also, what about the:
"(...) under Unix-like systems"
Does this mean this approach is not cross-platform?
Thanks in advance.
Embedding the Python interpreter is possible on all platforms. However, it will only be the interpreter. Embedding any libraries will be a lot harder, or even impossible.
But since you seem to deploy the Python libs already, you can use them just fine from the embedded interpreter. You could then bridge C++ and Python without IPC, since both run in the same process.
pybind11 is very nice for embedding and generating C++ <-> Python interfaces.
A possible alternative, depending on the libraries in use, may be to export the model and use a C++ library to load and run it (for instance TensorFlow -> ONNX -> ONNX Runtime).
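A rough sketch of that export path, assuming a Keras model and the tf2onnx package (model names and paths are placeholders):

    # Export a trained Keras model to ONNX so a C++ ONNX Runtime session
    # can load and run it without embedding Python.
    import tensorflow as tf
    import tf2onnx

    model = tf.keras.models.load_model("my_model")  # placeholder path
    onnx_model, _ = tf2onnx.convert.from_keras(model, output_path="model.onnx")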
It means that CPython (the Python interpreter) will be inside your application. You will be able to run Python code and to observe and manipulate the virtual machine state directly from C++ code (a good entry point is the C API reference here). Your application might have some additional dynamic-library dependencies (which ones depends on the compilation options of the embedded Python). Also, the interpreter isn't completely self-contained; it depends on some external .py modules normally shipped with a Python distribution (the standard library). If you plan to import external modules that expect the standard library, you will have to ship it with your application. There are ways to build modules into the binary too (freezing), but you might run into issues, especially with modules that rely on the filesystem.
As far as I have tried, this procedure works on Unix-like systems and on Windows (where the easiest way is to link against the Python DLL, which you then ship with your application). On Windows you also need to make sure that you compile with the same compiler that was used to build the DLL (or you compile the Python DLL from source). Here is additional information about embedding on Windows: https://docs.python.org/3/faq/windows.html#how-can-i-embed-python-into-a-windows-application
Just note that embedding Python and shipping third-party modules with your application might have licensing consequences.
I'm trying to find a way to rapidly develop (or rather, eventually reach a point where I can rapidly develop) very nice-looking, cross-platform GUI desktop apps that have a very small footprint on disk and in memory, launch very fast (much faster, for example, than even a bare-bones wxPython window; for a good example, look at how fast TextEdit launches under OS X, which is the kind of launch speed I want for my GUI apps), deploy easily, and interact very well with the operating system (Gimp, Gedit, and various other open-source cross-platform apps exhibit behaviors I really hate, depending on the platform, but especially on OS X), all without spending any money. (Hey, stop laughing! =P)
I'm dissatisfied with wxWidgets, Qt, SDL, and everything else I've tried so far, so I'm down to writing native GUI code (especially the part that interacts with the OS's windowing system) on each platform using native tools (Xcode/Objective-C/Cocoa/OpenGL, MSVC/Win32/DirectX, gcc/GTK/OpenGL), and then trying to come up with some way of writing as much as possible of the rest of the program in Python.
I've thought about writing a set of shared libraries/DLLs to handle GUI matters and then wrapping them with a set of Python C extensions, but there are technical challenges involved in packaging that way (menus, the app icon, certain OS-specific application manifests, etc.), and I'm not sure the launch speed and general performance would be acceptable, depending on the particular program I'm writing.
So I've thought about creating a sort of "shell" program on each platform and embedding Python, somewhat similar to the way Sublime Text 2 does it.
I don't like the startup slowness that occurs when launching any Python program for the first time. I was hoping this was a result of compiling to bytecode, and that I could just include precompiled versions of Python modules with my apps, but from experimenting it seems this is not the case: the first time anything Python runs (since the last system reboot), a shared library/DLL is loaded, or something along those lines. That's one reason I'm considering embedding Python; I wonder if there are options available when embedding/calling Python that could help reduce that launch delay. Or, if worst comes to worst, in the embedded case I can launch without Python, then initialize Python if and when I need it, asynchronously (not on the main thread), after the app has already launched.
Is there a way to reduce the first-time launch delay for deployed Python programs (i.e., programs whose packages include a version of the interpreter; maybe the interpreter can be compiled with switches I haven't tried)?
Is there any way to reduce the interpreter load/initialization delay when embedding Python?
Is it completely unrealistic to expect any Python GUI program to launch as fast, or have as small a footprint, as TextEdit?
Pyglet
It doesn't get any better: you get full support for all Python releases, you're in charge of your GUI code, and the speed of the thing is phenomenal. You can render large chunks of data without noticeable lag!
Running on a 333 MHz CPU with less than 128 MB of RAM and some random PCI graphics card, I managed to pull this out of a hat:
http://www.youtube.com/watch?v=D7zFLQZxzcY
(Roughly a few hundred thousand stars, scalable, and a few hundred planets also to scale, with a ship that can be navigated in space. This was an early development video for something I didn't have time to finish.)
After the first run (or if you compile your .py into .pyc), you'll get great speed out of your GUI using pyglet, but you'll have to create all your inputs and buttons yourself, since you're writing graphics code rather than interface code.
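A minimal pyglet sketch of that starting point (a window, a GL context, and an event loop; any widgets are yours to draw):

    # Bare-bones pyglet app: one window, one label, no widget toolkit.
    import pyglet

    window = pyglet.window.Window(640, 480, caption="Demo")
    label = pyglet.text.Label("Hello", x=320, y=240,
                              anchor_x="center", anchor_y="center")

    @window.event
    def on_draw():
        window.clear()
        label.draw()

    pyglet.app.run()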
You can take the OpenOffice approach on the Windows platform and write a launcher that simply tries to access the files needed by your software, so that they end up in the memory cache, speeding up startup time (but putting useless things in the cache in case the user doesn't actually open your program).
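A rough sketch of that launcher idea in Python (the directory name is a placeholder; merely reading the files pulls them into the OS file cache):

    # Touch every file the application will need so the OS caches them
    # before the real launch. "app_dir" is a placeholder.
    import glob

    for path in glob.glob("app_dir/**/*", recursive=True):
        try:
            with open(path, "rb") as f:
                f.read()
        except OSError:
            pass  # skip directories and unreadable entries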
I have huge Python modules (8000+ lines). They basically contain tons of functions for interacting with a hardware platform via a serial port, by reading and writing to hardware registers.
They are not numerical algorithms; the application just reads from and writes to hardware registers/memory. I use these libraries to write custom scripts. Eventually, I need to move all of this to run on an embedded processor on my hardware to get finer control; then I just kick off the event from the PC and the rest happens in hardware.
So I need to convert them to C. If a tool could convert my scripts to C automatically, that would save me a huge amount of time. This is why I got attracted to Cython. Efficiency is not important; my code is not a number cruncher. But the generated code should be relatively small to fit in my limited memory (a few hundred kilobytes).
Can I use Cython as a converter from my custom Python scripts to C? My guess is yes, in which case, can I run these .c files on my hardware? My guess is no, since I would have to have Cython on my hardware as well for them to run. But if it just creates .c files, maybe I can go through them and make them standalone, since my code does not use many Python features; it uses Python only as a quick-to-write scripting language.
Yes, at its core this is what Cython does. But ...
You don't need Cython on the target; however, you do need libpython. You may feel like your code doesn't use that many Python features, but I think if you try this you'll find it's not true: you won't be able to separate your program from its dependence on libpython while still using the Python language. (Cython's --embed option, for instance, generates a C file with a main() function, but the result still initializes and links against the Python runtime.)
Another option is PyPy, specifically its translation toolchain, NOT the PyPy Python interpreter. It lets you translate RPython, a subset of the Python language, into C. If you really aren't using many Python language features or libraries, this may work.
PyPy is mostly known as an alternative Python implementation, but it is also a set of tools for compiling dynamic languages into various forms. This is what allows the PyPy implementation of Python, written in (R)Python, to be compiled to machine code.
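For a flavor of what the toolchain consumes, here is a hedged sketch following the pattern in the RPython documentation (treat names and details as approximate):

    # RPython target module: the translation toolchain looks for target(),
    # which returns the program's entry point, and compiles it to C.
    def entry_point(argv):
        print("hello from translated RPython")
        return 0

    def target(driver, args):
        return entry_point, None

Running the rpython translator over such a file then produces a standalone C program.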
If C++ is available, Nuitka is a Python-to-C++ compiler that works on regular Python, not just RPython (which is what Shed Skin and PyPy use).
If C++ is available for that embedded platform, there is Shed Skin, which converts Python into C++.