Is there an ldd like command for python

Is there an ldd like command for python - python

This is the problem: You try to run a python script that you didn't write yourself, and it is missing a module. Then you solve that problem and try again - now another module is missing. And so on.
Is there anything, a command or something, that can go through the python sources and check that all the necessary modules are available - perhaps even going as far as looking up the dependencies of missing modules online (although that may be rather ambitious)? I think of it as something like 'ldd', but of course this is much more like yum or apt-get in its scope.
Please note, BTW, I'm not talking about the package dependencies like in pip (I think it is called, never used it), but about the logical dependencies in the source code.

There are several packages that analyze code dependencies:
https://docs.python.org/2/library/modulefinder.html
Modulefinder seems like what you want, and reports what modules can't be loaded. It looks like it works transitively from the example, but I am not sure.
https://pypi.org/project/findimports/
This also analyzes transitive imports, I am not sure however what the output is if a module is missing.
... And some more you can find with your favorite search engine

To answer the original question more directly, I think...
lddcollect is available via pip and looks very good.
Emphases mine:
Typical use case: you have a locally compiled application or library with large number of dependencies, and you want to share this binary. This tool will list all shared libraries needed to run it. You can then create a minimal rootfs with just the needed libraries. Alternatively you might want to know what packages need to be installed to run this application (Debian based systems only for now).
There are two modes of operation.
List all shared library files needed to execute supplied inputs
List all packages you need to apt-get install to execute supplied inputs as well as any shared libraries that are needed but are not under package management.
In the first mode it is similar to ldd, except referenced symbolic links to libraries are also listed. In the second mode shared library dependencies that are under package management are not listed, instead the name of the package providing the dependency is listed.
lddcollect --help
Usage: lddcollect [OPTIONS] [LIBS_OR_DIR]...
Find all other libraries and optionally Debian dependencies listed
applications/libraries require to run.
Two ways to run:
1. Supply single directory on input
- Will locate all dynamic libs under that path
- Will print external libs only (will not print any input libs that were found)
2. Supply paths to individual ELF files on a command line
- Will print input libs and any external libs referenced
Prints libraries (including symlinks) that are referenced by input files,
one file per line.
When --dpkg option is supplied, print:
1. Non-dpkg managed files, one per line
2. Separator line: ...
3. Package names, one per line
Options:
--dpkg / --no-dpkg Lookup dpkg libs or not, default: no
--json Output in json format
--verbose Print some info to stderr
--ignore-pkg TEXT Packages to ignore (list package files instead)
--help Show this message and exit.
I can't test it against my current use case for ldd right now, but I quickly ran it against a binary I've built, and it seems to report the same kind of info, in fact almost twice as many lines!

Related

How to properly use 3rd party dependencies with Sublime Text plugins?

I'm trying to write a plugin for Sublime Text 3.
I have to use several third party packages in my code. I have managed to get the code working by manually copying the packages into /home/user/.config/sublime-text-3/Packages/User/, then I used relative imports to get to the needed code. How would I distribute the plugin to the end users? Telling them to copy the needed dependencies to the appropriate location is certainly not the way to go. How are 3rd party modules supposed to be used properly with Sublime Text plugins? I can't find any documentation online; all I see is the recommendation to put the modules in the folder.

Sublime uses it's own embedded Python interpreter (currently Python 3.3.6 although the next version will also support Python 3.8 as well) and as such it will completely ignore any version of Python that you may or may not have installed on your system, as well as any libraries that are installed for that version.
For that reason, if you want to use external modules (hereafter dependencies) you need to do extra work. There are a variety of ways to accomplish this, each with their own set of pros and cons.
The following lists the various ways that you can achieve this; all of them require a bit of an understanding about how modules work in Python in order to understand what's going on. By and large except for the paths involved there's nothing too "Sublime Text" about the mechanisms at play.
NOTE: The below is accurate as of the time of this answer. However there are plans for Package Control to change how it works with dependencies that are forthcoming that may change some aspect of this.
This is related to the upcoming version of Sublime supporting multiple versions of Python (and the manner in which it supports them) which the current Package Control mechanism does not support.
It's unclear at the moment if the change will bring a new way to specify dependencies or if only the inner workings of how the dependencies are installed will change. The existing mechanism may remain in place regardless just for backwards compatibility, however.
All roads to accessing a Python dependency from a Sublime plugin involve putting the code for it in a place where the Python interpreter is going to look for it. This is similar to how standard Python would do things, except that locations that are checked are contained within the area that Sublime uses to store your configuration (referred to as the Data directory) and instead of a standalone Python interpreter, Python is running in the plugin host.
Populate the library into the Lib folder
Since version 3.0 (build 3143), Sublime will create a folder named Lib in the data directory and inside of it a directory based on the name of the Python version. If you use Preferences > Browse Packages and go up one folder level, you'll see Lib, and inside of it a folder named e.g. python3.3 (or if you're using a newer build, python33 and python38).
Those directories are directly on the Python sys.path by default, so anything placed inside of them will be immediately available to any plugin just as a normal Python library (or any of those built in) would be. You could consider these folders to be something akin to the site-packages folder in standard Python.
So, any method by which you could install a standard Python library can be used so long as the result is files ending up in this folder. You could for example install a library via pip and then manually copy the files to that location from site-packages, manually install from sources, etc.
Lib/python3.3/
|-- librarya
| `-- file1.py
|-- libraryb
| `-- file2.py
`-- singlefile.py
Version restrictions apply here; the dependency that you want to use must support the version of Python that Sublime is using, or it won't work. This is particularly important for Python libraries with a native component (e.g. a .dll, .so or .dylib), which may require hand compiling the code.
This method is not automatic; you would need to do it to use your package locally, and anyone that wants to use your package would also need to do it as well. Since Sublime is currently using an older version of Python, it can be problematic to obtain a correct version of libraries as well.
In the future, Package Control will install dependencies in this location (Will added the folder specifically for this purpose during the run up to version 3.0), but as of the time I'm writing this answer that is not currently the case.
Vendor your dependencies directly inside of your own package
The Packages folder is on the sys.path by default as well; this is how Sublime finds and loads packages. This is true of both the physical Packages folder, as well as the "virtual" packages folder that contains the contents of sublime-package files.
For example, one can access the class that provides the exec command via:
from Default.exec import ExecCommand
This will work even though the exec.py file is actually stored in Default.sublime-package in the Sublime text install folder and not physically present in the Packages folder.
As a result of this, you can vendor any dependencies that you require directly inside of your own package. Here this could be the User package or any other package that you're creating.
It's important to note that Sublime will treat any Python file in the top level of a package as a plugin and try to load it as one. Hence it's important that if you go this route you create a sub-folder in your package and put the library in there.
MyPackage/
|-- alibrary
| `-- code.py
`-- my_plugin.py
With this structure, you can access the module directly:
import MyPackage.alibrary
from MyPackage.alibrary import someSymbol
Not all Python modules lend themselves to this method directly without modification; some code changes in the dependency may be required in order to allow different parts of the library to see other parts of itself, for example if it doesn't use relative import to get at sibling files. License restrictions may also get in the way of this as well, depending on the library that you're using.
On the other hand, this directly locks the version of the library that you're using to exactly the version that you tested with, which ensures that you won't be in for any undue surprises further on down the line.
Using this method, anything you do to distribute your package will automatically also distribute the vendored library that's contained inside. So if you're distributing by Package Control, you don't need to do anything special and it will Just Work™.
Modify the sys.path to point to a custom location
The Python that's embedded into Sublime is still standard Python, so if desired you can manually manipulate the sys.path that describes what folders to look for packages in so that it will look in a place of your choosing in addition to the standard locations that Sublime sets up automatically.
This is generally not a good idea since if done incorrectly things can go pear shaped quickly. It also still requires you to manually install libraries somewhere yourself first, and in that case you're better off using the Lib folder as outlined above, which is already on the sys.path.
I would consider this method an advanced solution and one you might use for testing purposes during development but otherwise not something that would be user facing. If you plan to distribute your package via Package Control, the review of your package would likely kick back a manipulation of the sys.path with a request to use another method.
Use Package Control's Dependency system (and the dependency exists)
Package control contains a dependency mechanism that uses a combination of the two prior methods to provide a way to install a dependency automatically. There is a list of available dependencies as well, though the list may not be complete.
If the dependency that you're interested in using is already available, you're good to go. There are two different ways to go about declaring that you need one or more dependencies on your package.
NOTE: Package Control doesn't currently support dependencies of dependencies; if a dependency requires that another library also be installed, you need to explicitly mention them both yourself.
The first involves adding a dependencies key to the entry for your package in the package control channel file. This is a step that you'd take at the point where you're adding your package to Package Control, which is something that's outside the scope of this answer.
While you're developing your package (or if you decide that you don't want to distribute your package via Package Control when you're done), then you can instead add a dependencies.json file into the root of your package (an example dependencies.json file is available to illustrate this).
Once you do that, you can choose Package Control: Satisfy Dependencies from the command Palette to have Package Control download and install the dependency for you (if needed).
This step is automatic if your package is being distributed and installed by Package Control; otherwise you need to tell your users to take this step once they install the package.
Use Package Control's Dependency system (but the dependency does not exist)
The method that Package Control uses to install dependencies is, as outlined at the top of the question subject to change at some point in the (possibly near) future. This may affect the instructions here. The overall mechanism may remain the same as far as setup is concerned, with only the locations of the installation changing, but that remains to be seen currently.
Package Control installs dependencies via a special combination of vendoring and also manipulation of the sys.path to allow things to be found. In order to do so, it requires that you lay out your dependency in a particular way and provide some extra metadata as well.
The layout for the package that contains the dependency when you're building it would have a structure similar to the following:
Packages/my_dependency/
├── .sublime-dependency
└── prefix
└── my_dependency
└── file.py
Package Control installs a dependency as a Package, and since Sublime treats every Python file in the root of a package as a plugin, the code for the dependency is not kept in the top level of the package. As seen above, the actual content of the dependency is stored inside of the folder labeled as prefix above (more on that in a second).
When the dependency is installed, Package Control adds an entry to it's special 0_package_control_loader package that causes the prefix folder to be added to the sys.path, which makes everything inside of it available to import statements as normal. This is why there's an inherent duplication of the name of the library (my_dependency in this example).
Regarding the prefix folder, this is not actually named that and instead has a special name that determines what combination of Sublime Text version, platform and architecture the dependency is available on (important for libraries that contain binaries, for example).
The name of the prefix folder actually follows the form {st_version}_{os}_{arch}, {st_version}_{os}, {st_version} or all. {st_version} can be st2 or st3, {os} can be windows, linux or osx and {arch} can be x32 or x64.
Thus you could say that your dependency supports only st3, st3_linux, st3_windows_x64 or any combination thereof. For something with native code you may specify several different versions by having multiple folders, though commonly all is used when the dependency contains pure Python code that will work regardless of the Sublime version, OS or architecture.
In this example, if we assume that the prefix folder is named all because my_dependency is pure Python, then the result of installing this dependency would be that Packages/my_dependency/all would be added to the sys.path, meaning that if you import my_dependency you're getting the code from inside of that folder.
During development (or if you don't want to distribute your dependency via Package Control), you create a .sublime-dependency file in the root of the package as shown above. This should be a text file with a single line that contains a 2 digit number (e.g. 01 or 50). This controls in what order each installed dependency will get added to the sys.path. You'd typically pick a lower number if your dependency has no other dependencies and a higher value if it does (so that it gets injected after those).
Once you have the initial dependency laid out in the correct format in the Packages folder, you would use the command Package Control: Install Local Dependency from the Command Palette, and then select the name of your dependency.
This causes Package Control to "install" the dependency (i.e. update the 0_package_control_loader package) to make the dependency active. This step would normally be taken by Package Control automatically when it installs a dependency for the first time, so if you are also manually distributing your dependency you need to provide instructions to take this step.

setuptools "eager_resources" to executable directory

I maintain a Python utility that allows bpy to be installable as a Python module. Due to the hugeness of the spurce code, and the length of time it takes to download the libraries, I have chosen to provide this module as a wheel.
Unfortunately, platform differences and Blender runtime expectations makes support for this tricky at times.
Currently, one of my big goals is to get the Blender addon scripts directory to install into the correct location. The directory (simply named after the version of Blender API) has to exist in the same directory as the Python executable.
Unfortunately the way that setuptools works (or at least the way that I have it configured) the 2.79 directory is not always placed as a sibling to the Python executable. It fails on Windows platforms outside of virtual environments.
However, I noticed in setuptools documentation that you can specify eager_resources that supposedly guarantees the location of extracted files.
https://setuptools.readthedocs.io/en/latest/setuptools.html#automatic-resource-extraction
https://setuptools.readthedocs.io/en/latest/pkg_resources.html#resource-extraction
There was a lot of hand waving and jargon in the documentation, and 0 examples. I'm really confused as to how to structure my setup.py file in order to guarantee the resource extraction. Currently, I just label the whole 2.79 directory as "scripts" in my setuptools Extension and ship it.
Is there a way to write my setup.py and package my module so as to guarantee the 2.79 directory's location is the same as the currently running python executable when someone runs
py -3.6.8-32 -m pip install bpy
Besides simply "hacking it in"? I was considering writing a install_requires module that would simply move it if possible but that is mangling with the user's file system and kind of hacky. However it's the route I am going to go if this proves impossible.
Here is the original issue for anyone interested.
https://github.com/TylerGubala/blenderpy/issues/13
My build process is identical to the process descsribed in my answer here
https://stackoverflow.com/a/51575996/6767685

Maybe try the data_files option of distutils/setuptools.
You could start by adding data_files=[('mydata', ['setup.py'],)], to your setuptools.setup function call. Build a wheel, then install it and see if you can find mydata/setup.py somewhere in your sys.prefix.
In your case the difficult part will be to compute the actual target directory (mydata in this example). It will depend on the platform (Linux, Windows, etc.), if it's in a virtual environment or not, if it's a global or local install (not actually feasible with wheels currently, see update below) and so on.
Finally of course, check that everything gets removed cleanly on uninstall. It's a bit unnecessary when working with virtual environments, but very important in case of a global installation.
Update
Looks like your use case requires a custom step at install time of your package (since the location of the binary for the Python interpreter relative to sys.prefix can not be known in advance). This can not be done currently with wheels. You have seen it yourself in this discussion.
Knowing this, my recommendation would be to follow the advice from Jan Vlcinsky in his comment for his answer to this question:
Post install script after installing a wheel.
Add an extra setuptools console entry point to your package (let's call it bpyconfigure).
Instruct the users of your package to run it immediately after installing your package (pip install bpy && bpyconfigure).
The purpose of bpyconfigure should be clearly stated (in the documentation and maybe also as a notice shown in the console right after starting bpyconfigure) since it would write into locations of the file system where pip install does not usually write.
bpyconfigure should figure out where is the Python interpreter, and where to write the extra data.
The extra data to write should be packaged as package_data, so that it can be found with pkg_resources.
Of course bpyconfigure --uninstall should be available as well!

What is Building and Installing?

This is probably a question that has a very easy and straightforward answer, however, despite having a few years programming experience, for some reason I still don't quite get the exact concepts of what it means to "build" and then to "install". I know how to use them and have used them a lot, but have no idea about the exact processes which happen in the background...
I have looked across the web, wikipedia, etc... but there is no one simple answer to it, neither can I find one here.
A good example, which I tried to understand, is adding new modules to python:
http://docs.python.org/2/install/index.html#how-installation-works
It says that "the build command is responsible for putting the files to install into a build directory"
And then for the install command: "After the build command runs (whether you run it explicitly, or the install command does it for you), the work of the install command is relatively simple: all it has to do is copy everything under build/lib (or build/lib.plat) to your chosen installation directory."
So essentially what this is saying is:
1. Copy everything to the build directory and then...
2. Copy everything to the installation directory
There must be a process missing somewhere in the explanation...complilation?
Would appreciate some straightforward not too techy answer but in as much detail as possible :)
Hopefully I am not the only one who doesn't know the detailed answer to this...
Thanks!
Aivoric

Building means compiling the source code to binary in a sandbox location where it won't affect your system if something goes wrong, like a build subdirectory inside the source code directory.
Install means copying the built binaries from the build subdirectory to a place in your system path, where they become easily accessible. This is rarely done by a straight copy command, and it's often done by some package manager that can track the files created and easily uninstall them later.
Usually, a build command does all the compiling and linking needed, but Python is an interpreted language, so if there are only pure Python files in the library, there's no compiling step in the build. Indeed, everything is copied to a build directory, and then copied again to a final location. Only if the library depends on code written in other languages that needs to be compiled you'll have a compiling step.

You want a new chair for your living-room and you want to make it yourself. You browse through a catalog and order a pile of parts. When they arrives at your door, you can't immediately use them. You have to build the chair at your workshop. After a bit of elbow-grease, you can sit down in it. Afterwards, you install the chair in your living-room, in a convenient place to sit down.
The chair is a program you want to use. It arrives at your house as source code. You build it by compiling it into a runnable program. You install it by making it easier to use.

The build and install commands you are refering to come from setup.py file right?
Setup.py (http://docs.python.org/2/distutils/setupscript.html)
This file is created by 3rd party applications / extensions of Python. They are not part of:
Python source code (bunch of c files, etc)
Python libraries that come bundled with Python
When a developer makes a library for python that he wants to share to the world he creates a setup.py file so the library can be installed on any computer that has python. Maybe this is the MISSING STEP
Setup.py sdist
This creates a python module (the tar.gz files). What this does is copy all the files used by the python library into a folder. Creates a setup.py file for the module and archives everything so the library can be built somewhere else.
Setup.py build
This builds the python module back into a library (SPECIFICALLY FOR THIS OS).
As you may know, the computer that the python library originally came from will be different from the library that you are installing on.
It might have a different version of python
It might have a different operating system
It might have a different processor / motherboard / etc
For all the reasons listed above the code will not work on another computer. So setup.py sdist creates a module with only the source files needed to rebuild the library on another computer.
What setup.py does exactly is similar to what a makefile would do. It compiles sources / creates libraries all that stuff.
Now we have a copy of all the files we need in the library and they will work on our computer / operating system.
Setup.py install
Great we have all the files needed. But they won't work. Why? Well they have to be added to Python that's why. This is where install comes in. Now that we have a local copy of the library we need to install it into python so you can use it like so:
import mycustomlibrary
In order to do this we need to do several things including:
Copy files to their library folders in our version of python.
Make sure library can be imported using import command
Run any special install instructions for this library. (seting up paths, etc)
This is the most complicated part of the task. What if our library uses BeautifulSoup? This is not a part of Python Library. We'd have to install it in a way such that our library and any others can use BeautifulSoup without interfering with each other.
Also what if python was installed someplace else? What if it was installed on a server with many users?
Install handles all these problems transparently. What is does is make the library that we just built able to run. All you have to do is use the import command, install handles the rest.

Python compile all modules into a single python file

I am writing a program in python to be sent to other people, who are running the same python version, however these some 3rd party modules that need to be installed to use it.
Is there a way to compile into a .pyc (I only say pyc because its a python compiled file) that has the all the dependant modules inside it as well?
So they can run the programme without needing to install the modules separately?
Edit:
Sorry if it wasnt clear, but I am aware of things such as cx_freeze etc but what im trying to is just a single python file.
So they can just type "python myapp.py" and then it will run. No installation of anything. As if all the module codes are in my .py file.

If you are on python 2.3 or later and your dependencies are pure python:
If you don't want to go the setuptools or distutiles routes, you can provide a zip file with the pycs for your code and all of its dependencies. You will have to do a little work to make any complex pathing inside the zip file available (if the dependencies are just lying around at the root of the zip this is not necessary. Then just add the zip location to your path and it should work just as if the dependencies files has been installed.
If your dependencies include .pyds or other binary dependencies you'll probably have to fall back on distutils.

You can simply include .pyc files for the libraries required, but no - .pyc cannot work as a container for multiple files (unless you will collect all the source into one .py file and then compile it).

It sounds like what you're after is the ability for your end users to run one command, e.g. install my_custom_package_and_all_required_dependencies, and have it assemble everything it needs.
This is a perfect use case for distutils, with which you can make manifests for your own code that link out to external dependencies. If your 3rd party modules are available publicly in a standard format (they should be, and if they're not, it's pretty easy to package them yourself), then this approach has the benefit of allowing you to very easily change what versions of 3rd party libraries your code runs against (see this section of the above linked doc). If you're dead set on packaging others' code with your own, you can always include the required files in the .egg you create with distutils.

Two options:
build a package that will install the dependencies for them (I don't recommend this if the only dependencies are python packages that are installed with pip)
Use virtual environments. You use an existing python on their system but python modules are installed into the virtualenv.
or I suppose you could just punt, and create a shell script that installs them, and tell them to run it once before they run your stuff.

alternatives to DYLD_LIBRARY_PATH/LD_LIBRARY_PATH

I'm developing python C++ extensions for use in both OSX and linux. Currently, I can run my code with a wrapper script wrapper.sh:
#!/bin/bash
trunk=`dirname $0`
trunk=`cd $trunk; pwd`
export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:$trunk/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$trunk/lib/:$trunk/src/hdf5/lib/:$trunk/src/python/lib
$trunk/src/python/bin/python "$#"
which is able to set up my run like this: wrapper.sh app.py
What I would like to do is to eliminate the need for wrapper.sh, so I need alternatives for DYLD_LIBRARY_PATH and LD_LIBRARY_PATH. I can not put my libraries in some standard location like /usr/local/lib because on my machine, I maintain several independent instances of my libraries. That is, my libraries need to be kept somewhere relative to my installation path. I can't put these environment variables in my login script for the same reason. Currently, I need to call one of my wrapper.sh scripts to use the associated libraries. My goal is to be able to run merely app.py, which if it lives in my installation path, should be able to find its associated python and libraries. The purpose is to simplify execution for users, and to simplify usage of external tools like nosetests.
One alternative seems to be using rpath when I build my version of python:
./configure --enable-shared --prefix=$(CURDIR)/$(PYTHON_DIR) LDFLAGS="-Wl,-rpath,$(CURDIR)/lib/ -Wl,-rpath,$(CURDIR)/src/hdf5/lib -Wl,-rpath,$(CURDIR)/src/python/lib"
This trick seems to work fine on linux, even though one of my libraries ended up needing to be copied directly into trunk/src/python/lib/python2.6/lib-dynload for some reason unclear to me. However, this trick is not working on OSX; it looks like I need to run install_name_tool on all my dylibs libraries.
The other alternative I came up with was to do something like this:
ln -s wrapper.sh python
so that my scripts could all use #! ../python, but I'm getting Unmatched ". errors. Same thing if I use #! ../wrapper.sh. I'm not really an expert in bash...
However, these all seem so unnecessarily complicated, and surely this is something that other people have solved?? Thanks for any advice!

For python extensions, consider using PYTHONPATH: the Python interpreter will search the PYTHONPATH for .py/.pyc/.pyo/.so modules, as well as packages. See docs for Python 2.x as well as docs for Python 3.x; specifically the section named "The Module Search Path" on both pages. This also references information that seems to indicate that it is possible to update the module search path at runtime, which, if true, means that you could add all that logic to your program and it can hunt for its libraries on its own (say if it installs a copy in /usr/libexec/pkgname/... somewhere or something).
For all but the most complex of cases, though, setting PYTHONPATH and using a shell-script or native-compiled binary wrapper to start the core program is an okay approach, and one that is also used in other language environments including Mono and Java.

Not sure if this would be an acceptable (partial) solution in your circumstances, but another way to get libraries noticed by ld on linux is to add the path to the libraries to /etc/ld.so.conf and then runldconfig
For the Mac I don't remember the details, but I think Apple provide some resources for distributing apps packaged as a .app which includes some default locations (relative to the root of the .app) for libraries, or "frameworks" as they call them. Would require some googling from there - sorry can't help further on that but hope you get some progress :-)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.