I have a Python application running on an embedded Linux system. I have noticed that the interpreter is not saving the compiled .pyc files for imported modules to the filesystem, even though that should happen by default.
How can I enable the interpreter to save them? The file system permissions are correct.
There are a number of places where this enabled-by-default behavior could be turned off.
PYTHONDONTWRITEBYTECODE could be set in the environment
sys.dont_write_bytecode could be set through an out-of-band mechanism (i.e. site-local initialization files, or a patched interpreter build).
File permissions could fail to permit it. This need not be obvious! Anything from filesystem mount flags to SELinux labels could have this result. I'd suggest using strace or a similar tool (as available for your platform) to determine whether any attempts to create these files are being made.
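For example, an strace invocation along these lines (the script path is a placeholder for your application) will show whether the interpreter even attempts to create the bytecode files:
strace -f -e trace=open,openat,mkdir python /path/to/your/app.py 2>&1 | grep -E '\.pyc|__pycache__'
If no matching open or mkdir calls show up, the interpreter never tried to write them; if they show up and fail, the error code tells you why.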
On an embedded system, it makes much more sense to make this an explicit build step rather than runtime behavior: this ensures that performance is consistent (rather than having some runs take longer than others). Use py_compile or compileall to compile ahead of time.
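As a sketch, assuming your application lives under /path/to/your/app (adjust the path), a one-off ahead-of-time compile looks like:
python -m compileall /path/to/your/app
or, for a single module:
python -m py_compile /path/to/your/app/somemodule.py
Run this as part of your image build or deployment step so the .pyc files are already in place at runtime.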
Related
I've built a project in Python in which a module of functions can be changed by the user. More specifically, functions can be added to or deleted from this module by other processes in the application. I have now converted the whole project into an executable using auto-py-to-exe so that it runs in a console window instead of, say, through VS Code. The problem is that I can't change the module if it was not added as an additional file in auto-py-to-exe, nor can the application use this module if I do add it as an additional file.
My question is: how can I turn this project into an executable with the possibility of changing this module by the program itself?
This may be tricky, but it is possible with tools like pyinstaller when you bundle your app as a directory, rather than a single file executable. The source files (albeit compiled) will be present in the directory.
In principle, you could edit files in a single-file executable, but it's probably more trouble than it's worth and may be blocked by permissions and other issues.
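For example (the script name is just a placeholder), building the directory layout rather than a single file looks like:
pyinstaller --onedir yourapp.py
The resulting dist/yourapp directory is what you ship, and files placed alongside it remain ordinary files on disk that can be edited later.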
You could also design your software to read a file from a predefined location (or the bundle directory -- anywhere accessible by the user) and simply exec the string of the code to 'load' it. It could look more or less like an extension/scripting system for your program. One example comes to mind: the iterm2 software Python API.
It should go without saying that you must trust your users/inputs if you give them the ability to arbitrarily change the software code.
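A minimal sketch of that approach, assuming a user-editable plugins.py kept next to the executable (the file name, location, and function name are made up for illustration):

import os
import sys

def load_user_module(path):
    # Read the user-editable source and exec it into a fresh namespace;
    # whatever functions the user defined end up as entries in that dict.
    namespace = {}
    with open(path, "r", encoding="utf-8") as f:
        source = f.read()
    exec(compile(source, path, "exec"), namespace)
    return namespace

# When frozen by pyinstaller/auto-py-to-exe, sys.frozen is set and
# sys.executable points at the bundled executable; otherwise fall back
# to the directory of this script.
if getattr(sys, "frozen", False):
    base_dir = os.path.dirname(sys.executable)
else:
    base_dir = os.path.dirname(os.path.abspath(__file__))

plugins = load_user_module(os.path.join(base_dir, "plugins.py"))
plugins["some_user_function"]()  # hypothetical function defined in plugins.py

Since the file lives outside the bundle, other processes (or the program itself) can rewrite it freely, and the next load_user_module call picks up the changes.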
I am looking into compiling quite a big set of Python modules and packages to .pyo files. I know this is possible by either setting the PYTHONOPTIMIZE environment variable or by specifying -O on launch. I'd like to enforce .pyo instead of .pyc to yield the smallest footprint possible. To do that in my deploy module I have to create a wrapper script that launches the actual script with the -O option, because the environment variable needs to be set before the interpreter starts.
Is there any way around this to enforce .pyo creation programmatically?
Kind regards,
Thorsten
To compile all modules beforehand, run the following command:
python -O -m compileall /path/to/your/files
The Python compileall module takes care of the compilation; the -O switch makes it output .pyo files.
However, you cannot force Python to use these unless the -O switch is given for every run or the PYTHONOPTIMIZE environment var is set.
Note that all the -O flag does is disable the assert statement and set the __debug__ flag to False (and Python will optimise out the tests). Specify -OO and docstrings are dropped as well. These do not make for much of a speed difference or space saving, unless you use very large docstrings or very slow debug code.
See: What does Python optimization (-O or PYTHONOPTIMIZE) do?
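A tiny illustration of that, with the script name check_opt.py chosen arbitrarily:

# check_opt.py
print("__debug__ is", __debug__)
assert 1 == 2, "this assert only fires without -O"
print("reached the end")

Running python check_opt.py prints __debug__ is True and stops on the AssertionError, while python -O check_opt.py prints __debug__ is False, skips the assert entirely, and reaches the end.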
Sadly, as of version 3.5 the compileall legacy mode only produces .pyc files, not .pyo (optimized bytecode now goes into .opt-1.pyc / .opt-2.pyc files under __pycache__ instead):
Changed in version 3.5: The legacy parameter only writes out .pyc files, not .pyo files no matter what the value of optimize is.
From https://docs.python.org/3.6/library/compileall.html
Are there any benefits, performance or otherwise, to avoiding .pyc files, other than the convenience of not having a bunch of these files in the source folder?
I don't think there really is. .pyc files are cached bytecode files, and you save startup time as Python does not have to recompile your python files every time you start the interpreter.
At most, switching off bytecode compilation lets you measure how much time the interpreter spends on this step. If you want to compare how much time is saved, remove all the .pyc files in your project, and time Python by using the -B switch to the interpreter:
$ time python -B yourproject
then run again without the -B switch:
$ time python yourproject
It could be that the user under which you want to run your program does not have write access to the source code directories; for example a web server where you do not want remote users to have any chance of altering your source code through a vulnerability. In such cases I'd use the included compileall module to byte-compile everything as a privileged user rather than forgo writing .pyc files.
One reason I can think of: during development, if you remove or rename a .py file, the .pyc file will stay around in your local copy with the old name and the old bytecode. Since you don't normally commit .pyc files to version control, this can lead to stray ImportErrors that don't happen on your machine, but do on others.
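A habit that helps here (assuming a Unix-like shell, run from the project root) is to clear out stale bytecode before chasing such a phantom import:

find . -name '*.pyc' -delete

That guarantees every .pyc you see afterwards corresponds to a .py file that still exists.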
It is possible that .pyc files could encourage people to think they need not maintain or ship the original source. .pyc files might not be portable between operating systems and versions of Python. When moving Python modules it is safer to leave the .pyc files behind, just copy or ship the source, and let the host Python generate new .pyc files.
By the way, since Python 3.2 the .pyc files no longer go directly into the source folder, but into a __pycache__ subdirectory of it.
I am writing a python script for automating the backing up of the PostgreSQL data directory to enable "Continuous Archiving and Point-In-Time Recovery".
I would like to use Python's tarfile library to create an archive.
The complication is that the data files can change during the backup. To quote the PostgreSQL manual:
Some file system backup tools emit warnings or errors if the files they are trying to copy change while the copy proceeds. When taking a base backup of an active database, this situation is normal and not an error. However, you need to ensure that you can distinguish complaints of this sort from real errors.
... some versions of GNU tar return an error code indistinguishable from a fatal error if a file was truncated while tar was copying it. Fortunately, GNU tar versions 1.16 and later exit with 1 if a file was changed during the backup, and 2 for other errors.
When copying the files from the data directory, what exceptions should I be expecting in Python? And how can I determine whether an error is critical?
If you are using the tarfile module, the implementation of this is essentially up to you. GNU tar checks the file size and ctime before and after it processes a file and complains if they differ. It's not an actual error, it's just something that tar feels like it should mention. But in your use case, you don't care about that, so you can just forget about it. The tarfile module certainly won't complain about this by itself.
As far as other exceptions, the possibilities are endless. Pretty much anything that Python throws is probably fatal in this case.
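If you want to approximate what GNU tar does, a rough sketch (the directory and archive paths are placeholders) is to stat each file before and after adding it, and record rather than abort on files that changed mid-copy:

import os
import tarfile

def backup_data_dir(data_dir, archive_path):
    changed = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for root, dirs, files in os.walk(data_dir):
            for name in files:
                path = os.path.join(root, name)
                try:
                    before = os.stat(path)
                    tar.add(path, recursive=False)
                    after = os.stat(path)
                except OSError as exc:
                    # Files vanishing or shrinking under an active cluster are
                    # expected; log them and move on. Genuine problems such as
                    # permission errors also surface here, so inspect the message.
                    print("warning: could not archive %s: %s" % (path, exc))
                    continue
                if (before.st_size, before.st_ctime) != (after.st_size, after.st_ctime):
                    # Changed while being copied -- normal for a base backup.
                    changed.append(path)
    return changed

changed = backup_data_dir("/var/lib/postgresql/data", "/backups/base.tar.gz")
print("%d files changed during the backup" % len(changed))

The size/ctime comparison mirrors the check described above for GNU tar; tarfile itself raises OSError (or a subclass) for the cases you might actually care about.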
Continuous Archiving is explained in the PostgreSQL documentation. It is already implemented, so there is no need to create an additional script, but you will end up with many files.
24.3.2. Making a Base Backup: follow the instructions and you can create the tar archive; the data files will not change.
24.3.1. Setting up WAL archiving: here you have to archive the new WAL files; they will not change either, but they are separate files.
I think it is not safe to add WAL files to the base backup. If the server crashes while you are adding a file to the base backup, it may become corrupt.
I have written a small DB access module that is extensively reused in many programs.
My code is stored in a single directory tree /projects for backup and versioning reasons, and so the module should be placed within this directory tree, say at /projects/my_py_lib/dbconn.py.
I want to easily configure Python to automatically search for modules at the /projects/my_py_lib directory structure (of course, __init__.py should be placed within any subdirectory).
What's the best way to do this under Ubuntu?
Thanks,
Adam
You can add a PYTHONPATH environment variable to your .bashrc file, e.g.
export PYTHONPATH=/projects/my_py_lib
On Linux, this directory will be added to your sys.path automatically for pythonN.M:
~/.local/lib/pythonN.M/site-packages/
So you can put your packages in there for each version of python you are using.
You need a copy for each version of Python; otherwise the .pyc files will be recompiled every time you import the module with a different Python version.
This also allows fine-grained control if the module only works with some of the versions of Python you have installed.
If you create this file
~/.local/lib/pythonN.M/site-packages/usercustomize.py
it will be imported each time you start the Python interpreter.
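For example, a usercustomize.py along these lines (using the /projects/my_py_lib path from the question) would make the directory importable on every interpreter start:

# ~/.local/lib/pythonN.M/site-packages/usercustomize.py
import sys

PROJECTS_DIR = "/projects/my_py_lib"
if PROJECTS_DIR not in sys.path:
    sys.path.insert(0, PROJECTS_DIR)

This is roughly equivalent to setting PYTHONPATH, but it is scoped to that one Python version's user site-packages.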
Another option is to create a soft link in /usr/lib*/python*/site-packages/:
ln -s /projects/my_py_lib /usr/lib*/python*/site-packages/
That will make the project visible to all Python programs, and any changes will be visible immediately, too.
The main drawback is that you will eventually have *.pyc files owned by root or another user, unless you make sure you compile the files yourself before you start Python as another user.