How to edit and use a python package containing an .so file


I have installed a Python package as usual using pip install package_name. Its main, most relevant file has an .so extension. I want to modify it and use it for my work. Is that even possible? Is there underlying source code for the .so file (in Python or another language) that comes along with the package, or is it a standalone program?

An .so file is a compiled shared library, so it cannot usefully be edited in place; you need the source it was built from. Go to the site whence pip fetches things (PyPI, by default), find your package, and follow the link to its source distribution. Building that yourself often requires more tools and expertise than using pip, which is the cost of customization. (The GPL, despite being more restrictive (in this peculiar sense) than most Free licenses, certainly allows merely providing Internet access to the sources for binaries so distributed.)
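To check whether the file you want to change really is a compiled extension, here is a minimal sketch using only the standard library. The helper name describe_module is made up for this illustration, and "json" is just a stand-in for your package name.

```python
import importlib.machinery
import importlib.util


def describe_module(name):
    """Return (origin, is_extension) for an importable top-level module name."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError(f"module {name!r} not found")
    origin = spec.origin
    # EXTENSION_SUFFIXES lists the suffixes this interpreter accepts for
    # compiled modules (for example ".so" on Linux or ".pyd" on Windows).
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    is_extension = bool(origin) and origin.endswith(suffixes)
    return origin, is_extension


# "json" is a pure-Python package, so it is reported as not an extension.
origin, is_extension = describe_module("json")
print(origin, is_extension)
```

If is_extension comes back True for your package, the file at origin is build output, and the editable code lives in the source distribution.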

Related

How to include files downloadable from a server into a pip installation?

I want to create a pip-installable (this is important, we already have a mostly-working easy-install version, but we want to switch to PIP) python package, which is essentially a wrapper for some C functions. As I understand it, I cannot count on users having compilers installed (e.g. Windows), and so preferably I would precompile these files and upload them onto a server. What I would like, is PIP to download a suitable file (I would prefer if it wasn't necessary for all these files to come shipped with the package) during the installation. I've tried reading the docs, but failed to find any solutions for my problem there. Is PIP able to download a compiled C file from a server during the installation? If so, what is the course of action? Should I perhaps try to include a python script, to be run at installation, which would determine the OS and the architecture, and then access a specific link?

You are correct in most of your assumptions. You can offer a source distribution, or sdist, which requires build tools on the target machine in order to be installed. It is often uploaded as a fallback when the platform wheel that you need doesn't exist, or if you want your users to be able to build it themselves.
Speaking of wheels, that is the name of the current standard for binary Python distributions, or bdists. If your package contains code that needs to be compiled, wheels will end up being platform specific: depending on the build system used, that can be Linux, macOS, or Windows. See for example the sklearn entry on PyPI, which features one wheel per OS for all supported Python versions (plus 32/64 bit support, but that's another story).
If you specify an index (or just a directory that has the wheels in it) where pip should install from, it will automatically make sure that it downloads/installs the correct wheel, which avoids writing platform-specific code into your source code. The hard part is building the wheels.
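The platform information pip matches against is encoded in the wheel's file name (name, version, optional build tag, Python tag, ABI tag, platform tag, as laid out in PEP 427). A rough sketch of reading those fields; parse_wheel_filename and the example_pkg file names are hypothetical, and real tools do more validation than this:

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
        # "any" as the platform tag marks a wheel with no compiled code
        "platform_specific": platform_tag != "any",
    }


compiled = parse_wheel_filename("example_pkg-1.0.0-cp311-cp311-win_amd64.whl")
pure = parse_wheel_filename("example_pkg-1.0.0-py3-none-any.whl")
print(compiled["platform"], compiled["platform_specific"])
print(pure["platform"], pure["platform_specific"])
```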
Related question:
Pip install and platform specific wheels
How to avoid building C library with my python package? (ends up building platform specific wheels)

What is Building and Installing?

This is probably a question that has a very easy and straightforward answer, however, despite having a few years programming experience, for some reason I still don't quite get the exact concepts of what it means to "build" and then to "install". I know how to use them and have used them a lot, but have no idea about the exact processes which happen in the background...
I have looked across the web, wikipedia, etc... but there is no one simple answer to it, neither can I find one here.
A good example, which I tried to understand, is adding new modules to python:
http://docs.python.org/2/install/index.html#how-installation-works
It says that "the build command is responsible for putting the files to install into a build directory"
And then for the install command: "After the build command runs (whether you run it explicitly, or the install command does it for you), the work of the install command is relatively simple: all it has to do is copy everything under build/lib (or build/lib.plat) to your chosen installation directory."
So essentially what this is saying is:
1. Copy everything to the build directory and then...
2. Copy everything to the installation directory
There must be a process missing somewhere in the explanation...compilation?
Would appreciate some straightforward not too techy answer but in as much detail as possible :)
Hopefully I am not the only one who doesn't know the detailed answer to this...
Thanks!
Aivoric

Building means compiling the source code to binary in a sandbox location where it won't affect your system if something goes wrong, like a build subdirectory inside the source code directory.
Install means copying the built binaries from the build subdirectory to a place in your system path, where they become easily accessible. This is rarely done by a straight copy command, and it's often done by some package manager that can track the files created and easily uninstall them later.
Usually, a build command does all the compiling and linking needed, but Python is an interpreted language, so if there are only pure Python files in the library, there's no compiling step in the build. Indeed, everything is copied to a build directory, and then copied again to a final location. You'll have a compiling step only if the library depends on code written in other languages that needs to be compiled.
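For the pure-Python case, the two copies can be imitated with the standard library alone. This is a toy sketch, not what distutils actually runs; build_and_install and toy_hello_mod are names invented for the example, and dirs_exist_ok needs Python 3.8 or later.

```python
import importlib
import shutil
import sys
import tempfile
from pathlib import Path


def build_and_install(source_dir, build_dir, install_dir):
    """Toy 'build' then 'install' for pure-Python files."""
    build_lib = Path(build_dir) / "lib"
    # "build": stage the files to install; nothing is compiled here
    shutil.copytree(source_dir, build_lib)
    # "install": copy the staged files to a directory on the import path
    shutil.copytree(build_lib, install_dir, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    src = tmp / "src"
    src.mkdir()
    (src / "toy_hello_mod.py").write_text("def greet():\n    return 'hello'\n")
    site = tmp / "site"  # stands in for site-packages
    build_and_install(src, tmp / "build", site)

    sys.path.insert(0, str(site))
    try:
        importlib.invalidate_caches()
        result = importlib.import_module("toy_hello_mod").greet()
    finally:
        sys.path.remove(str(site))

print(result)
```

With a C extension, the "build" step would instead run a compiler and place the resulting binary in the build directory before the same kind of copy.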

You want a new chair for your living-room and you want to make it yourself. You browse through a catalog and order a pile of parts. When they arrive at your door, you can't immediately use them. You have to build the chair at your workshop. After a bit of elbow-grease, you can sit down in it. Afterwards, you install the chair in your living-room, in a convenient place to sit down.
The chair is a program you want to use. It arrives at your house as source code. You build it by compiling it into a runnable program. You install it by making it easier to use.

The build and install commands you are referring to come from the setup.py file, right?
Setup.py (http://docs.python.org/2/distutils/setupscript.html)
This file is created by 3rd party applications / extensions of Python. They are not part of:
Python source code (bunch of c files, etc)
Python libraries that come bundled with Python
When a developer makes a library for Python that he wants to share with the world, he creates a setup.py file so the library can be installed on any computer that has Python. Maybe this is the MISSING STEP
Setup.py sdist
This creates a source archive (the tar.gz file). It copies all the files used by the Python library into a folder, includes the setup.py file, and archives everything so the library can be built somewhere else.
Setup.py build
This builds the python module back into a library (SPECIFICALLY FOR THIS OS).
As you may know, the computer that the Python library originally came from will be different from the computer that you are installing on.
It might have a different version of python
It might have a different operating system
It might have a different processor / motherboard / etc
For all the reasons listed above, a library built on one computer may not work on another. So setup.py sdist creates an archive with only the source files needed to rebuild the library on another computer.
What setup.py does exactly is similar to what a makefile would do. It compiles sources / creates libraries all that stuff.
Now we have a copy of all the files we need in the library and they will work on our computer / operating system.
Setup.py install
Great we have all the files needed. But they won't work. Why? Well they have to be added to Python that's why. This is where install comes in. Now that we have a local copy of the library we need to install it into python so you can use it like so:
import mycustomlibrary
In order to do this we need to do several things including:
Copy files to their library folders in our version of python.
Make sure library can be imported using import command
Run any special install instructions for this library. (setting up paths, etc)
This is the most complicated part of the task. What if our library uses BeautifulSoup? This is not a part of Python Library. We'd have to install it in a way such that our library and any others can use BeautifulSoup without interfering with each other.
Also what if python was installed someplace else? What if it was installed on a server with many users?
Install handles all these problems transparently. What it does is make the library that we just built able to run. All you have to do is use the import command; install handles the rest.
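If you are curious where "their library folders in our version of python" actually are, the standard sysconfig module can tell you for whichever interpreter runs the script. A small sketch:

```python
import sysconfig

paths = sysconfig.get_paths()

# "purelib" is where pure-Python libraries are installed;
# "platlib" is where libraries with compiled (platform-specific) parts go.
purelib = paths["purelib"]
platlib = paths["platlib"]

print("pure-Python libraries:", purelib)
print("compiled libraries:   ", platlib)
```

The answers differ between a system-wide Python, a per-user install, and a virtual environment, which is part of what install sorts out for you.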

Python compile all modules into a single python file

I am writing a program in Python to be sent to other people, who are running the same Python version; however, there are some 3rd party modules that need to be installed to use it.
Is there a way to compile into a .pyc (I only say pyc because it's a Python compiled file) that has all the dependent modules inside it as well?
So they can run the programme without needing to install the modules separately?
Edit:
Sorry if it wasn't clear, but I am aware of things such as cx_freeze etc. What I'm trying to get is just a single Python file.
So they can just type "python myapp.py" and then it will run. No installation of anything. As if all the module codes are in my .py file.

If you are on python 2.3 or later and your dependencies are pure python:
If you don't want to go the setuptools or distutils routes, you can provide a zip file with the pycs for your code and all of its dependencies. You will have to do a little work to make any complex pathing inside the zip file available (if the dependencies are just lying around at the root of the zip this is not necessary). Then just add the zip location to your path and it should work just as if the dependencies' files had been installed.
If your dependencies include .pyds or other binary dependencies you'll probably have to fall back on distutils.
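A minimal sketch of the zip approach, using Python's built-in zip import support. The names deps.zip and toy_zipped_dep are invented for the example, and it stores a .py source file at the root of the zip rather than a .pyc to keep things short.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "deps.zip"

    # Put the dependency at the root of the zip, so no extra pathing is needed.
    with zipfile.ZipFile(bundle, "w") as zf:
        zf.writestr("toy_zipped_dep.py", "VALUE = 42\n")

    # Adding the zip file itself to sys.path makes its contents importable.
    sys.path.insert(0, str(bundle))
    try:
        importlib.invalidate_caches()
        value = importlib.import_module("toy_zipped_dep").VALUE
    finally:
        sys.path.remove(str(bundle))

print(value)
```

In a real program you would ship the zip next to your script and insert its path at the top of the script before importing the dependencies.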

You can simply include .pyc files for the libraries required, but no - .pyc cannot work as a container for multiple files (unless you will collect all the source into one .py file and then compile it).

It sounds like what you're after is the ability for your end users to run one command, e.g. install my_custom_package_and_all_required_dependencies, and have it assemble everything it needs.
This is a perfect use case for distutils, with which you can make manifests for your own code that link out to external dependencies. If your 3rd party modules are available publicly in a standard format (they should be, and if they're not, it's pretty easy to package them yourself), then this approach has the benefit of allowing you to very easily change what versions of 3rd party libraries your code runs against (see this section of the above linked doc). If you're dead set on packaging others' code with your own, you can always include the required files in the .egg you create with distutils.

Two options:
build a package that will install the dependencies for them (I don't recommend this if the only dependencies are python packages that are installed with pip)
Use virtual environments. You use an existing python on their system but python modules are installed into the virtualenv.
or I suppose you could just punt, and create a shell script that installs them, and tell them to run it once before they run your stuff.
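For the virtual environment option, here is a sketch using the standard library's venv module (swapped in here for the third-party virtualenv tool, which works along the same lines). create_env is a name made up for the example, and with_pip=False just keeps the sketch quick; you would normally want pip inside the environment.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a bare virtual environment and return the path to its config file."""
    builder = venv.EnvBuilder(with_pip=False)
    builder.create(str(env_dir))
    return Path(env_dir) / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(Path(tmp) / "env")
    created = cfg.exists()

print(created)
```

Modules installed with the environment's own pip land inside that directory and leave the system Python untouched.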

Python Libraries

I'm in desperate need of a cross platform framework as I have vast numbers of .NET products that I'm trying to port to Linux. I have started to work with Python/pyQt and the standard library and all was going well until I try to import non-standard libraries. I'm hearing about pip and easy_install and I'm completely confused about this.
My products need to ship with everything required to execute them, so in the .NET world I simply package my DLLs (or licensed DLLs) with my product.
As a test bed I'm trying to import this library called requests: https://github.com/kennethreitz/requests
I've got an __init__.py file and the library source in my program directory but it isn't working. Please tell me that there is a simple way to include libraries without needing any kind of extra package installer.

I would suggest you start by familiarizing yourself with Python packages (see the distutils docs). Pip is simply a manager that installs packages directly from the internet repository, so that you don't need to manually go and download them. So, for example, as stated under "Installing" on the requests homepage, you simply run pip install requests in a terminal, without manually downloading anything.
Packaging your product is a different story, and the way you do it depends on the target system. On windows, the easiest might be to create an installer using NSIS which will install all dependencies. You might also want to use cx-freeze to pull all the dependencies (including the python interpreter) into a single package.
On linux, many of the dependencies will already be included in most distributions, so you should just list them as requirements when creating your package (e.g. deb for ubuntu). Other dependencies might not be included in the distro's repo, but you can still list them as requirements in setup.py.
I can't really comment on Mac, since I've never used python on one, but I think that it would be similar to the linux approach.

Deploy Python programs on Windows and fetch big library dependencies

I have some small Python programs which depend on several big libraries, such as:
NumPy & SciPy
matplotlib
PyQt
OpenCV
PIL
I'd like to make it easier to install these programs for Windows users. Currently I have two options:
either create huge executable bundles with PyInstaller, py2exe or similar tool,
or write step-by-step manual installation instructions.
Executable bundles are way too big. I always feel like there is some magic happening, which may or may not work the next time I use a different library or a new library version. I dislike wasted space too. Manual installation is too easy to do wrong, there are too many steps: download this particular interpreter version, download numpy, scipy, pyqt, pil binaries, make sure they all are built for the same python version and the same platform, install one after another, download and unpack OpenCV, copy its .pyd file deep inside Python installation, setup environment variables and file associations... You see, few users will have the patience and self-confidence to do all this.
What I'd like to do: distribute only a small Python source and, probably, an installation script, which fetches and installs all the missing dependencies (correct versions, correct platform, installs them in the right order). That's a trivial task with any Linux package manager, but I just don't know which tools can accomplish it on Windows.
Are there simple tools which can generate Windows installers from a list of URLs of dependencies [1]?
[1] As you may have noticed, most of the libraries I listed are not installable with pip/easy_install, but require running their own installers and modifying some files and environment variables.

npackd exists: http://code.google.com/p/windows-package-manager/ It could be done through there, or you could use distribute (python 3.x) or setuptools (python 2.x) with easy_install, possibly pip (I don't know its Windows compatibility). But I would choose npackd because of PyQt and its unusual setup, which doesn't play nicely with pip/easy_install (it uses a configure.py instead of setup.py). You would have to create your own repo for npackd to use for some of them, though. I forget what has been contributed in total for Python libs with it.

AFAIK there is no tool (and I'd assume you googled), so you must make one yourself.
Fetching the proper library versions seems simple enough -- using python's ftplib you can fetch the proper installers for every library. How would you know which version is compatible with the user's python? You can store different lists of download URLs, each for a different python version (this method came off the top of my head and there is probably a better way; not that it matters much if it's simple and it works).
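One way the per-version URL lists could look, as a sketch. The URLs are placeholders on example.invalid, not real download locations, and DOWNLOADS and urls_for are names invented here.

```python
import sys

# Placeholder installer URLs, keyed by (major, minor) Python version.
DOWNLOADS = {
    (2, 7): ["https://example.invalid/py27/numpy-installer.exe"],
    (3, 11): ["https://example.invalid/py311/numpy-installer.exe"],
}


def urls_for(version=None, table=DOWNLOADS):
    """Return the installer URLs for a Python version (default: the running one)."""
    key = tuple((version or sys.version_info)[:2])
    try:
        return table[key]
    except KeyError:
        raise LookupError(f"no installers listed for Python {key[0]}.{key[1]}") from None


print(urls_for((3, 11)))
```

Keeping the table in a separate config file, rather than in the script, would make it easier to update when new library versions come out.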
After you figure out how to make each installer run, you can py2exe your installer script, and even use it to fetch the program itself.
EDIT
Some Considerations
There are a couple of things that popped into my mind just as I posted:
First, some pseudocode (how I would approach it, anyway)
# first, we check modules
missing = []
try:
    import numpy
except ImportError:
    missing.append("numpy")  # flag numpy for installation
# lather, rinse, repeat for all dependencies
# next we check version compatibility -- note that if a library version you need
# is not backwards-compatible, you're in DLL hell, and there is little we can do.
# <insert version-checking code here>
# once you have your unavailable dependencies, you install them
import ftplib
# <all your file-downloading here>
# now you install. sorry I can't help you here.
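The module check from the pseudocode can also be written as a loop that does not import anything, which is a bit cheaper for big libraries. A sketch, assuming top-level module names only; find_missing and the second module name are made up for the example.

```python
import importlib.util


def find_missing(modules):
    """Return the names from `modules` that cannot be found on the import path."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately nonexistent.
missing = find_missing(["json", "surely_not_installed_module_xyz"])
print(missing)
```

The resulting list is what you would then feed to the downloading and installing steps.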
There are a few things you can do to make your utility reusable --
put all URL lists, minimum version numbers, required library names etc in config files
Write a script which helps you set up an installer
Py2exe the installer-maker-script
Sell it
Even better, release it under GPL so we can all feast upon fruits of your labours.

I have a similar need to yours, but in addition I need the packaged application to work on several platforms. I'm exploring the currently available solutions; here are a few interesting ones:
Use SnakeBasket, which wraps around Pip and adds recursive dependency resolution plus a heuristic to choose the right version when there are conflicts.
Package all dependencies as an egg, but not your sourcecode which will still be editable: https://stackoverflow.com/a/528064/1121352
Package all dependencies in a zip file and directly import the modules on the fly: Cross-platform alternative to py2exe or http://davidf.sjsoft.com/mirrors/mcmillan-inc/install1.html
Using buildout: http://www.buildout.org/en/latest/install.html
Using virtualenv with virtualenv-tools (instead of "relocate")
If your main problem when freezing your code using PyInstaller or similar is that you end up with a big single file, you can customize the process so that you get several files, one for each dependency, instead of one big executable.
I will update here if I find something that fills my bill.
