Saving Python package setup configuration for later use - python

I have a fairly involved setup.py Cython compilation process that takes several things into account, such as OpenMP support and the presence or absence of C headers. Specifically, FFTW is a library that computes the FFT and is faster than numpy's FFT, so if fftw3.h is available, I compile my module against it; otherwise I fall back on numpy.
I would like to be able to remember how the package was compiled, i.e. whether the compiler supported OpenMP and which FFT library was used. All this information is available when running setup.py but not later on, and it can be useful: e.g. if the user asks to run a function on multiple cores but OpenMP was not used during compilation, everything will run on one core. Remembering this information would allow me to show a nice error.
I am unsure what the best way to do this would be. There are plenty of options, such as writing a file with the data and then reading it when necessary, but is there any standard way to do this? Basically, I'm trying to emulate numpy's show_config.

I have not attempted this, but my suggestion would be to mimic the config.h behavior one sees with autotools-based builds: your setup.py generates a set of definitions that you either pass on the command line or provide via a generated header file, and you can then use these to have e.g. a compiled extension function return an appropriate data structure. But whatever you do: I have not come across a standardized way of doing this.
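The "write a file, read it later" option from the question can be sketched in pure Python as a counterpart to the generated-header idea. This is only a minimal sketch, not an established convention: the file name `_build_config.py`, the keys `openmp` and `fft_backend`, and all three function names are assumptions made for illustration.

```python
import importlib.util
import pprint
from pathlib import Path


def write_build_config(package_dir, config):
    """Write the build options into a small importable module.

    Meant to be called from setup.py once the compiler checks have run.
    The file name _build_config.py is a hypothetical choice.
    """
    target = Path(package_dir) / "_build_config.py"
    target.write_text(
        "# Generated by setup.py; do not edit.\n"
        "CONFIG = " + pprint.pformat(config) + "\n"
    )
    return target


def load_build_config(path):
    """Import the generated module from its path and return CONFIG."""
    spec = importlib.util.spec_from_file_location("_build_config", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.CONFIG


def require_openmp(config, n_jobs):
    """Raise a readable error if several cores are requested without OpenMP."""
    if n_jobs > 1 and not config["openmp"]:
        raise RuntimeError(
            "n_jobs=%d requested, but the package was built without OpenMP"
            % n_jobs
        )
```

In setup.py one would call something like `write_build_config("mypackage", {"openmp": False, "fft_backend": "numpy"})` (the package name is hypothetical) before `setup()`, so the generated module is installed with the rest of the package; inside the installed package a plain relative import of the generated module would replace `load_build_config`.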

Related

Is it possible to compile NumPy into a single library?

I have a use case where I want to deploy an application on end-user machines that don't necessarily have Python installed. Most of the application is written in Python, but the entry point is a legacy C++ application. This is why using a tool like PyInstaller is not an option (although I'm certainly not an expert on this, so I could be wrong).
I am using Cython to embed the Python interpreter into the application. Everything is going more or less fine, but my Python code uses NumPy, which means there is still an external dependency. Ideally, I'd like to have a single binary that includes NumPy as well as the Python interpreter and my application code. Is it possible, using Cython or some other tool, to compile NumPy (or the portions of it that I am using) to a library that I can then link into my final executable binary? Or is it possible to somehow link to a wheel (or egg) containing the NumPy binary?

Optimizing heavy computations in Python code for package to be distributed (with Numba or Cython)

I have a Python package that I'm distributing, and I need to include in it a function that does some heavy computation that I can't find implemented in NumPy or SciPy (namely, a function to compute a variogram with two variables, also called a cross-variogram).
Since these have to be calculated for arrays of over 20000 elements, I need to optimize the code. I have successfully optimized the code (very easily) using Numba and I'm also trying to optimize it using Cython. From what I've read, there is little difference on the final run-time with both, just the steps change.
The problem is: optimizing this code on my computer is relatively easy, but I don't know how to include the code and its optimized (compiled) version in my github package for other users.
I'm thinking I'm going to have to ship only the Python/Cython source code and tweak setup.py so it re-compiles for every user who installs the package. If that is the case, I'm not sure whether I should use Numba or Cython, since Numba seems so much easier to use (at least in my experience) but is such a hassle to install (I don't want to force my users to install Anaconda!).
To sum up, two questions:
1 Should this particular piece of code indeed be re-compiled on every user's computer?
2 If so, is it more portable to use Numba or Cython? If not, should I just provide the .so I compiled on my computer?

Running python script without installed libraries

I have a working Python script that uses scipy and numpy functions, and I need to run it on a computer that has Python installed but not the scipy and numpy modules. How should I do that? Is .pyc the answer, or should I do something more complex?
Notes:
I don't want to use py2exe. I am aware of it, but it doesn't fit the problem.
I have read these questions (What is the difference between .py and .pyc files?, Python pyc files (main file not compiled?)), which are obviously connected to this problem, but since I am a physicist, not a programmer, I got totally lost.
It is not possible.
A pyc-file is nothing more than a python file compiled into byte-code. It does not contain any modules that this file imports!
Additionally, the numpy module is an extension written in C (and some Python). A substantial part of it consists of shared libraries that are loaded into Python at runtime. You need those for numpy to work!
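The point about bytecode not containing the imported modules can be checked directly. The following is a small sketch using only the standard library; the source string and the file name `script.py` are made up for illustration.

```python
import marshal

# A hypothetical two-line script that depends on numpy.
source = "import numpy\nresult = numpy.zeros(3)\n"

# compile() builds the same kind of code object that is stored in a .pyc file.
# It succeeds even if numpy is not installed, because nothing is imported yet.
code = compile(source, "script.py", "exec")

# The dependency is recorded only as a name to be looked up at run time.
mentions_numpy = "numpy" in code.co_names

# The serialized bytecode is tiny; none of numpy's own code is inside it.
bytecode_size = len(marshal.dumps(code))
```

Running the compiled code on a machine without numpy would still fail at the `import numpy` step, which is why shipping the .pyc file alone does not help.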
Python first "compiles" a program into bytecode, and then throws this bytecode through an interpreter.
So if your code is all Python code, you could generate the bytecode once and then have the Python runtime use it. In fact I've seen projects such as this, where the developer has just looked through the bytecode spec and implemented a bytecode parsing engine. It's very lightweight, so it's useful for e.g. "Python on a chip" etc.
The problem comes with external libraries that are not entirely written in Python (e.g. numpy, scipy).
Python provides a C-API, allowing you to create (using C/C++ code) objects that appear to it as Python objects. This is useful for speeding things up, interacting with hardware, making use of C/C++ libs.
Take a look at Nuitka. If you are able to compile your code with it (not necessarily a possible or easy task), you'll get what you want.

C helper program for Python package

I am writing a Python package that, in normal operation, needs to run a helper program that's written in C. The helper program ships as part of the package, and it doesn't make sense to try to run it independently.
How do I persuade Distutils to compile, and install to an appropriate location, an independent C program rather than a C extension module?
How should the Python part of the code locate and start the helper program?
N.B. Porting the actual code (especially the C helper) to Windows would necessitate a >90% rewrite, so I only care about making installation work on Unix.
This is pretty interesting. I've never done this, but I think you can use the distutils compiler directly.
I checked out github for some possible examples that might give you inspiration. Check out this one
The filter I used was distutils ccompiler language:python filename:setup.py in case you want to extend it/trim it down
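The second part of the question, how the Python code should locate and start the helper, is not covered above. One possible approach, sketched here under the assumption that the compiled helper is installed next to the package's Python files, is to resolve the path from the package directory and run it with `subprocess`. The function names and the default file name `helper` are hypothetical.

```python
import os
import subprocess


def find_helper(package_dir, name="helper"):
    """Return the path of the helper inside package_dir.

    Raises FileNotFoundError if it is missing or not executable.
    """
    path = os.path.join(package_dir, name)
    if not (os.path.isfile(path) and os.access(path, os.X_OK)):
        raise FileNotFoundError(
            "helper program not found or not executable: " + path
        )
    return path


def run_helper(path, *args):
    """Run the helper with the given arguments and return its stdout as text."""
    completed = subprocess.run(
        [path, *args],
        check=True,               # raise CalledProcessError on non-zero exit
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the output to str
    )
    return completed.stdout
```

Inside the package this would typically be called as `find_helper(os.path.dirname(os.path.abspath(__file__)))`, so the lookup does not depend on the current working directory or on `PATH`.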

Can an arbitrary Python program have its dependencies inlined?

In the JavaScript ecosystem, "compilers" exist which will take a program with a significant dependency chain (of other JavaScript libraries) and emit a standalone JavaScript program (often, with optimizations applied).
Does any equivalent tool exist for Python, able to generate a script with all non-standard-library dependencies inlined? Are there other tools/practices available for bundling in dependencies in Python?
Since your goal is to be cross-architecture, any Python program which relies on native C modules will not be possible with this approach.
In general, using virtualenv to create a target environment will mean that even users who don't have permission to install new system-level software can install dependencies under their own home directory; thus, what you ask about is not often needed in practice.
However, if you wanted to do things that are considered evil / bad practice, pure-Python modules can in fact be bundled into a script; thus, a tool of this sort would be possible for modules with only native-Python dependencies!
If I were writing such a tool, I might start the following way:
Use pickle to serialize the content of modules on the "sending" side.
In the loader code, use types.ModuleType() (or the older imp.new_module()) to create new module objects, and assign the unpickled objects to them.
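A minimal sketch of these two steps follows. One deviation is an assumption on my part: pickle stores functions and classes by reference rather than by value, so the sketch pickles each module's source text and executes it in the loader, instead of pickling the module's objects. The function names and the module name in the usage note are hypothetical.

```python
import pickle
import sys
import types


def bundle_sources(sources):
    """Sending side: pickle a mapping of module name -> source text."""
    return pickle.dumps(dict(sources))


def load_bundle(payload):
    """Loader side: rebuild each module and register it in sys.modules.

    Modules are executed in the order they were bundled, so a module must
    come after the bundled modules it imports at top level.
    """
    for name, source in pickle.loads(payload).items():
        module = types.ModuleType(name)
        # Register before executing so later imports of this name find it.
        sys.modules[name] = module
        exec(compile(source, "<bundled %s>" % name, "exec"), module.__dict__)
```

After `load_bundle(bundle_sources({"bundled_greeter": "def greet(name):\n    return 'hi ' + name\n"}))`, a plain `import bundled_greeter` resolves to the in-memory module. Since `pickle.loads` can execute arbitrary code, the payload should only ever come from the script itself.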
