Compiling Python for embedded linux_rt

I am targeting an embedded platform running linux_rt, and would like to compile CPython for it. I am not asking whether Python is appropriate for real-time work, or about its latency. I AM asking about compiling under the platform's constraints.
I would like an interpreter embedded in a C shared library, but will also accept an executable binary if need be.
Any C compiling I've done has been for mainstream OS deployment, and I usually just hit make install. I'm not afraid to get a little dirty, but I am afraid of long-term maintenance and repeatability.
To avoid as much memory overhead as possible, are there any compiler configurations that can be changed from defaults? Can I easily strip sections of the standard library I know will not be needed?
The target platform has a 600 MHz Celeron and 256 MB of RAM. The required firmware is built for a 2.6 kernel (might be 2.4). The default OS image uses BusyBox, and most standard system libraries are only minimally available. The root filesystem is around 100 MB (flash), although I will have an external memory card mounted and can extend the root onto it.
Python should have 70% of the CPU and 128 MB of RAM available most of the time, although I could imagine sloppy execution of the interpreter at times, and on RT Linux that could start to add up. I am just trying to take precautions before I dive in.
I am looking for simple dos and don'ts. References to similar projects would be great, but I really want to stick with CPython where possible.
I do not have the target platform in the shop yet, so I cannot post any tests. I will have the unit in two weeks and will update this post at that time if needed.

Make a VM with the target configuration to help you get started: VirtualBox or QEMU. If you don't have a root FS, one place to start is Tiny Core, which is very small and configurable, but can also run on your laptop -- http://www.linuxjournal.com/article/11023
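As for deciding which parts of the standard library you can strip, one rough approach (a sketch, not a guarantee) is to exercise the application inside that VM and record which modules actually get imported; myapp and run_typical_workload below are hypothetical placeholders for your own entry points:

    import sys

    baseline = set(sys.modules)            # modules pulled in by interpreter startup itself
    import myapp                           # hypothetical: your application's top-level package
    myapp.run_typical_workload()           # hypothetical: drive the code paths you actually use
    used = sorted(set(sys.modules) - baseline)
    print("\n".join(used))                 # keep these; anything never listed is a stripping candidate

Be careful with modules that are only imported conditionally or lazily; run every realistic workload before trusting the list.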

Related

Instrument memory access of python scripts

My research requires processing memory traces of applications. For C/C++ programs, this is easy using Intel's Pin library. However, as suggested here (Use Intel Pin to instrument Python scripts), I may need to instrument the Python runtime itself, which I'm not sure will represent the true memory behavior of a given Python script, due to interpreter overheads (if this is not the case, please comment). Some of the existing Python memory profilers only talk about runtime memory "usage" in terms of heap space, etc.
I ended up making an executable from my Python script using PyInstaller and running my Pintool over it. However, I'm not sure if this is the right approach.
Is there any way (any library, or any hack into the Python runtime) that could help capture the memory accesses made by Python scripts?
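To make that limitation concrete: Python-level tools such as tracemalloc (available since 3.4) report allocation sites and sizes, not the individual loads and stores a Pin trace captures. A minimal sketch of the kind of data they expose, with a placeholder workload:

    import tracemalloc

    tracemalloc.start()
    data = [str(i) * 10 for i in range(10000)]    # placeholder workload
    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.statistics("lineno")[:5]:
        print(stat)    # allocation sites and byte counts, not memory accesses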

Python VS code taking too much memory and taking too long to auto complete

I am a beginner learning to program in Python using VS Code, so my knowledge of both VS Code and the Python extension is limited. I am facing two very annoying problems.
Firstly, when the Python extension starts, the memory usage of VS Code jumps from ~300 MB to 1-1.5 GB. If I have anything else open, everything gets extremely sluggish. This seems a bit abnormal to me. I have tried disabling all other extensions, but the memory consumption remains the same. Is there a way (or some settings I can change) to reduce the memory consumption?
Secondly, the IntelliSense autocomplete takes quite a bit of time (sometimes 5-10 minutes) before it starts to kick in. It also sometimes stops working completely. Any pointers on what could be causing that?
PS: I am using VS Code version 1.50 (September update) and Python via Anaconda 4.8.3.
As a code editor, VS Code needs to download the corresponding language services and language extensions on top of the memory VS Code itself occupies, so some extra memory use is expected.
For memory, it is recommended that you uninstall unnecessary third-party extensions and duplicate language services. In addition, it is a good habit to use virtual environments in VS Code: the virtual environment folder lives inside the project, and installed packages are stored in the project rather than taking up system-wide resources.
For automatic completion, this function is provided by the corresponding language service and extension. Please try reloading VS Code and waiting for the language service to load before editing code.
You can also try the "Pylance" extension, which provides outstanding language-service features as well as automatic completion.
At least for the IntelliSense, you could try setting
"python.jediEnabled": false
in your settings.json file. This will let you use a newer version of the IntelliSense engine, but it might need to download first.
But beyond that, I'd suggest using PyCharm instead. It's quite snappy, and it has a free version.

Performance differences between python from package and python compiled from source

I would like to know if there are any documented performance differences between a Python interpreter installed from an RPM (or via yum) and a Python interpreter compiled from source (with well-chosen compilation flags).
I am using a Red Hat 6.3 machine as a Django/Apache/mod_wsgi production server. I have already properly compiled everything in different setups and in different orders. However, I usually keep the build and devel dependencies on such machines. For various ego-related (and more or less practical) reasons, I would like to use Python 2.7.3. By default, Red Hat comes with Python 2.6.6. I think I could go with it, but it would hurt me somehow (I would have to drop and find replacements for a few libraries, and my ego).
However, besides my ego and dependencies, I would like to know what the impact would be on performance for a Django server.
If you compile with the exact same flags that were used to compile the RPM version, you will get a binary that's exactly as fast. And you can get those flags by looking at the RPM's spec file.
However, you can sometimes do better than the pre-built version. For example, you can let the compiler optimize for your specific CPU, instead of for "general 386 compatible" (or whatever the RPM was optimized for). Of course if you don't know what you're doing (or are doing it on purpose), it's always possible to build something slower than the pre-built version, too.
Meanwhile, 2.7.3 is faster in a few areas than 2.6.6. Most of them usually won't affect you, but if they do, they'll probably be a big win.
Finally, for the vast majority of Python code, the speed of the Python interpreter itself isn't relevant to your overall performance or scalability. (And when it is, you probably want to try PyPy, Jython, or IronPython to replace CPython.) This is especially true for a WSGI service. If you're not doing anything slow, Apache will probably be the bottleneck. If you are doing anything slow, it's probably something I/O bound and well outside of Python's control (like reading files).
Ultimately, the only way you can know how much gain you get is by trying it both ways and performance testing. But if you just want a rule of thumb, I'd say expect a 0% gain, and be pleasantly surprised if you get lucky.
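If you do want to measure it, one straightforward way is to run the same micro-benchmark under both interpreters and compare. A minimal sketch; the interpreter paths and the workload string are placeholders and should be replaced with something representative of your Django code:

    # Save as bench.py and run it under each interpreter, e.g.:
    #   /usr/bin/python2.6 bench.py
    #   /usr/local/bin/python2.7 bench.py
    import timeit

    elapsed = timeit.timeit(
        "''.join(str(i) for i in range(1000))",   # placeholder workload
        number=10000,
    )
    print(elapsed)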

Should I embed or extend python to create high quality, high speed GUI programs?

I'm trying to find a way to rapidly develop (or rather, eventually reach a point where I can rapidly develop) very nice looking, cross-platform GUI desktop apps. They should have a very small footprint on disk and in memory, launch very fast (much faster, for example, than even a bare-bones wxPython window; for a good example, look at how fast TextEdit launches under OS X -- that's the kind of launch speed I want for my GUI apps), deploy easily, and interact very well with the operating system (GIMP and Gedit and various other open-source, cross-platform apps exhibit various behaviors that I really hate, depending on the platform, but especially on OS X), all without spending any money. (Hey, stop laughing! =P)
I'm dissatisfied with wxWidgets, Qt, SDL, and everything else I've tried so far, so I'm down to writing native GUI code (especially the part that interacts with the OS's windowing system) on each platform, using native tools (XCode/ObjC/Cocoa/OpenGL, MSVC/Win32/DirectX, gcc/GTK/OpenGL), and then trying to come up with some way of writing as much as possible of the rest of the program in Python.
I've thought about maybe writing a set of shared libraries / dll's to deal with matters GUI, and then wrapping them with a set of Python C extensions, but there are some technical challenges involved in doing that when it comes to packaging (menus, the app icon, certain OS-specific application manifests, etc), and I'm not sure that launch speed and performance in general will be acceptable, depending on the particular program I'm writing.
So I've thought about maybe creating a sort of "shell" program on each platform and embedding Python, similar to the way Sublime Text 2 does.
I don't like the startup slowness that occurs when launching any Python program for the first time. I was hoping this was a result of compiling to bytecode, and that I could just include precompiled versions of Python modules with my apps, but from experimenting, it seems this is not the case... it seems that the first time anything Python runs (since the last system reboot), a shared library / DLL is loaded or something. So that's one reason I think of maybe embedding Python: I wonder if there are some options available when embedding/calling Python that could help reduce that launch delay. Or, if worst comes to worst, in the embedded case I can launch without Python, then load Python if/when I need it, asynchronously (not in the main thread), after the app has already launched.
Is there a way to reduce the first-time launch delay for deployed Python programs (i.e., programs whose packages include a version of the interpreter; maybe the interpreter can be compiled with switches I haven't tried)?
Is there any way to reduce the interpreter load/initialization delay when embedding Python?
Is it completely unrealistic to expect any Python GUI program to launch as fast or have as small a footprint as TextEdit?
Pyglet
It doesn't get any better: you get full support for all Python releases, you're in charge of your GUI code, and the speed of the thing is phenomenal. You can render chunks of data without noticeable lag!
Running on a 333 MHz CPU with less than 128 MB of RAM and some random PCI graphics card, I've managed to pull this out of a hat:
http://www.youtube.com/watch?v=D7zFLQZxzcY
(Roughly a few hundred thousand stars, scalable, plus a few hundred planets also to scale, with a ship that can be navigated in space... this was an early development video for something I didn't have time to finish.)
After the first run (or if you compile your .py into .pyc), you'll get great speed out of your GUI using pyglet, but you'll have to create all your inputs and buttons yourself, since you're writing graphical code rather than interface code.
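On the .pyc point: you can precompile the whole application tree ahead of time with compileall, so the first run does not pay the bytecode-compilation cost (the path below is a placeholder). Note that this only removes the compile step; it won't help with the one-time shared-library loading mentioned in the question:

    # Precompile every .py under the application directory to .pyc
    # (equivalently: python -m compileall path/to/your/app)
    import compileall

    compileall.compile_dir("path/to/your/app", quiet=True)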
You can take the OpenOffice approach on Windows and write a launcher that simply touches the files needed by your software, so that they end up in the memory cache, speeding up startup time (but putting useless things in the cache in case the user doesn't want to open your program).

Python Performance on Windows

Is Python generally slower on Windows than on a *nix machine? Python seems to blaze on my Mac OS X machine, whereas it seems to run slower on my Windows Vista machine. The machines are similar in processing power, and the Vista machine has 1 GB more memory.
I particularly notice this in Mercurial, but I figure this may simply be how Mercurial is packaged on Windows.
I wanted to follow up on this, and I found something that I believe is 'my answer'. It appears that Windows (Vista, which is where I notice this) is not as fast at handling files. This was mentioned by tony-p-lee.
I found this comparison of Ubuntu vs Vista vs Win7. Their results are interesting and, as they say, you need to take them with a grain of salt. But I think the results lead me to the cause: Python, which I feel was indirectly tested, is about equivalent if not a tad bit faster on Windows. See the section "Richards benchmark".
Their graph for file transfers is in the linked article (source: tuxradar.com).
I think this specifically helps address the question, because Hg is really just a series of file reads, copies, and general file handling. It's likely this is what causes the delay.
http://www.tuxradar.com/content/benchmarked-ubuntu-vs-vista-vs-windows-7
No real numbers here, but it certainly feels like the startup time is slower on Windows platforms. I regularly switch between Ubuntu at home and Windows 7 at work, and it's an order of magnitude faster starting up on Ubuntu, despite my work machine being at least 4x the speed.
As for runtime performance, it feels about the same for "quiet" applications. Any GUI operations using Tk on Windows are definitely slower. Console applications on Windows are slower too, but this is most likely due to the Windows cmd rendering being slow rather than Python itself running slowly.
Maybe Python depends more heavily on opening a lot of files (importing different modules), and Windows doesn't handle file opens as efficiently as Linux.
Or maybe Linux has more utilities that depend on Python, so Python scripts/modules are more likely to already be buffered in the system cache.
I run Python locally on Windows XP and 7 as well as OS X on my MacBook. I've seen no noticeable performance differences in the command-line interpreter, wx widget apps run the same, and Django apps also perform virtually identically.
One thing I noticed at work was that the Kaspersky virus scanner tended to slow the Python interpreter WAY down. It would take 3-5 seconds for the Python prompt to appear properly and 7-10 seconds for Django's test server to fully load. Properly disabling its active scanning brought the startup times back to 0 seconds.
With the OS and network libraries, I can confirm slower performance on Windows, at least for versions <= 2.6.
I wrote a CLI podcast-fetcher script which ran great on Ubuntu, but then wouldn't download anything faster than about 80 kB/s (where ~1.6 MB/s is my usual max) on either XP or 7.
I could partially correct this by tweaking the buffer size for download streams, but there was definitely a major bottleneck on Windows, either over the network or IO, that simply wasn't a problem on Linux.
Based on this, it seems that system and OS-interfacing tasks are better optimized for *nixes than they are for Windows.
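For what it's worth, "tweaking the buffer size" here would typically mean reading the response in larger chunks rather than relying on the defaults. A rough Python 2 sketch of that idea (the URL, filename, and chunk size are placeholders, and this is not necessarily exactly what was done):

    import urllib2

    CHUNK = 256 * 1024    # read in larger blocks than the default
    response = urllib2.urlopen("http://example.com/episode.mp3")
    with open("episode.mp3", "wb") as out:
        while True:
            block = response.read(CHUNK)
            if not block:
                break
            out.write(block)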
Interestingly, I ran a direct comparison of a popular Python app on a Windows 10 x64 machine (admittedly low-powered) and an Ubuntu 14.04 VM running on the same machine.
I have not tested load speeds, etc., but am just looking at processor usage between the two. To make the test fair, both were fresh installs; I duplicated a part of my media library and applied the same config in both scenarios. Each test was run independently.
On Windows, Python was using 20% of my processor power, and it triggered System Compressed Memory to run up to 40% (this is an old machine with 6 GB of RAM).
With the VM on Ubuntu (linked to my Windows file system), the processor usage is about 5%, with compressed memory down to about 20%.
This is a huge difference. My trigger for running this test was that the app using Python was running my CPU up to 100% and failing to operate. I have now been running it in the VM for two weeks, and my processor usage is down to 65-70% on average. So on both a long- and short-term test, and taking into account the overhead of running a VM and a second operating system, this Python app is significantly faster on Linux. I can also confirm that the Python app responds better, as does everything else on my machine.
Now this could be very application-specific, but it is at a minimum interesting.
The PC is an old AMD II X2 X265 processor, 6 GB of RAM, and an SSD (which Python ran from, but the VM used a regular 5200 rpm HD that gets used for a ton of other stuff, including recording from 2 CCTV cameras).
