Custom archive format in Python pip - python

I want to download the source code for Python packages using something like
pip download --no-binary=:all: $package==$version
Almost always this results in a tar.gz file (at least on Linux), which is what I want. For NumPy version 1.13.1, however, I get a zip file instead. This inconsistency makes it somewhat harder to write automatic install scripts, so I would like to ask whether there is any way to choose the format of the retrieved archive.

pip downloads what it finds at PyPI. For numpy it finds .zip and not .tar.*. So you have to know in advance what formats package publishers provide.
You can ask the NumPy team to publish .tar.* in addition to .zip (or, better yet, provide a patch that does so).
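If the install script only needs to unpack whatever pip downloaded, one way to sidestep the format question is to let the standard library detect it. The following is a minimal sketch, not something the answer above proposes; the function name unpack_sdist is made up for illustration.

```python
import shutil
from pathlib import Path


def unpack_sdist(archive, dest):
    """Unpack a downloaded source archive, whatever its format.

    shutil.unpack_archive chooses the unpacker from the file extension,
    so .tar.gz, .tar.bz2 and .zip all go through the same call.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive), str(dest))
    # Source archives normally contain a single top-level directory.
    return sorted(p.name for p in dest.iterdir())
```

With this, the script can call unpack_sdist on the one file pip placed in the download directory without caring whether the publisher uploaded a zip or a tarball.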

Related

How to edit and use a python package containing an .so file

I have installed a Python package as usual using pip install package_name. Its main, most relevant file is a .so file. I want to MODIFY it and use it for my work. Is that even possible? Does the package come with underlying source code for the .so file, in Python or another language, or is it a standalone program?
Go to the site whence pip fetches things, find your package, and follow the link to its source distribution. Building that yourself often requires more tools and expertise than using pip, which is the cost of customization. (The GPL, despite being more restrictive (in this peculiar sense) than most Free licenses, certainly allows merely providing Internet access to the sources for binaries so distributed.)

How to install a Python package with documentation?

I'm trying to find a way to install a Python package with its docs.
I have to use this on machines that have no connection to the internet, so online help is not a solution for me. Similar questions already posted here say that this is not possible. Do you see any way to make this easier than what I'm currently doing:
downloading the source archive
extracting the docs folder
running sphinx
launching the index file from a browser (firefox et al.)
Any ideas?
P.S. I'm very new to Python, so maybe I'm missing something... And I'm using Windows (virtual) machines...
Edit:
I'm talking about two possible ways to install a package:
installing the package via easy_install (or any other to me unknown way) on a machine while I'm online, and then copying the changes to my installation to the target machine
downloading the source package (containing sphinx compatible docs) and installing the package on the target machine off-line
But in either case I do not know a way to install the package so that the supplied documentation is installed together with the module!
You might know that there exists a folder for the docs: <python-folder>/Doc which will contain only python278.chm after installation of Python 2.7.8 on Windows. So, I expect that this folder will also contain the docs for a newly installed package. This will avoid looking at docs for a different package version on the internet as well as my specific machine setup problems.
Most packages I'm currently using are supplied with documentation generated with sphinx, and their source package contains all the files necessary to generate the docs offline.
So what I'm looking for is some cli argument for a package installer like it's common for unix/linux based package managers. I did expect something like:
easy_install a_package --with-html-docs.
Here are some scenarios:
packages have documentation included within the zip/tar
packages have a -docs to download/install separately
packages that have buildable documentation
packages that only have online documentation
packages with no documentation other than internal.
packages with no documentation anywhere.
The sneaky trick that you can use for scenarios 1 and 3 is to download the package as a tar or zip file and then run easy_install archive_name on the target machine. This will install the package from the archive, including (I believe) any documentation. You will find that some packages have unmet dependencies; easy_install should report an error mentioning what is missing, and you will need to get those packages and use the same trick.
A couple of things are very handy: virtualenv will let you run a library-free version of Python so you can work out the requirements, and pip install -d <dir> (pip download -d <dir> in current pip versions) will download packages into dir without installing them.
You should be able to use the same trick for option 2.
With packages that only have on-line documentation you could look to see if there is a downloadable version or could scrape the web pages and use a tool like pandoc to convert to something useful.
In scenario 5 I would suggest raising a ticket on the package stating that the lack of accessible documentation makes it virtually unusable, and running sphinx on it.
In scenario 6 I suggest raising the same ticket but leaving out the word "virtually", and avoiding the use of that package on the basis that if it has no documentation it probably has a lot of other problems as well. If you are a package author feeling slandered reading this, then you should be feeling ashamed instead.
Mirror/Cache PyPi
Another possibility is to have a Linux box or VM, initially outside of your firewall, running a caching or mirroring service such as pypiserver. Install the required packages through it to populate the cache, then move it (or copy its cache to another pip server) inside the firewall. You can then use pip with the documented settings to do all your installs inside the firewall. See also the answer here.

How to import libraries from source code in python?

I am trying to write a python script that I could easily export to friends without dependency problems, but am not sure how to do so. Specifically, my script relies on code from BeautifulSoup, but rather than forcing friends to have to install BeautifulSoup, I would rather just package the src for BeautifulSoup into a Libraries/ folder in my project files and call to functions from there. However, I can no longer simply "import bs4." What is the correct way of going about this?
Thanks!
A common approach is to ship a requirements file with a project, specifying which version of which library is required. This file is (by convention) often named requirements.txt and looks something like this:
MyApp
BeautifulSoup==3.2.1
SomeOtherLib==0.9.4
YetAnother>=0.2
(The fictional file above says: I need exactly BeautifulSoup 3.2.1, SomeOtherLib 0.9.4 and any version of YetAnother greater than or equal to 0.2).
Then the user of this project can simply take your library, (create a virtualenv) and run
$ pip install -r requirements.txt
which will then fetch all libraries and make them available either system-wide or project-wide (if virtualenv is used). Here's a random Python project off GitHub that has a requirements file:
https://github.com/laoqiu/pypress
https://github.com/laoqiu/pypress/blob/master/requirements.txt
The nice thing about this approach is that you'll get your transitive dependencies resolved automatically. Also, if you use virtualenv, you'll get a clean separation of your projects and avoid library version collisions.
You must add Libraries/ (converted to an absolute path first) to sys.path before attempting to import anything under it.
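The sys.path approach from the last answer can be sketched as follows. This is only an illustration: the helper name add_library_dir is invented here, and the folder name Libraries is taken from the question.

```python
import sys
from pathlib import Path


def add_library_dir(script_file, folder="Libraries"):
    """Put <directory of script_file>/<folder> at the front of sys.path.

    The path is made absolute first, so imports keep working no matter
    which directory the script is started from.
    """
    lib_dir = str(Path(script_file).resolve().parent / folder)
    if lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
    return lib_dir
```

At the top of the script you would call add_library_dir(__file__) and only afterwards write import bs4, assuming the bs4 package directory was copied into Libraries/.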

Deploy Python programs on Windows and fetch big library dependencies

I have some small Python programs which depend on several big libraries, such as:
NumPy & SciPy
matplotlib
PyQt
OpenCV
PIL
I'd like to make it easier to install these programs for Windows users. Currently I have two options:
either create huge executable bundles with PyInstaller, py2exe or similar tool,
or write step-by-step manual installation instructions.
Executable bundles are way too big. I always feel like there is some magic happening, which may or may not work the next time I use a different library or a new library version. I dislike wasted space too. Manual installation is too easy to get wrong; there are too many steps: download this particular interpreter version, download numpy, scipy, pyqt, pil binaries, make sure they all are built for the same python version and the same platform, install one after another, download and unpack OpenCV, copy its .pyd file deep inside the Python installation, set up environment variables and file associations... You see, few users will have the patience and self-confidence to do all this.
What I'd like to do: distribute only a small Python source and, probably, an installation script, which fetches and installs all the missing dependencies (correct versions, correct platform, installs them in the right order). That's a trivial task with any Linux package manager, but I just don't know which tools can accomplish it on Windows.
Are there simple tools which can generate Windows installers from a list of URLs of dependencies [1]?
[1] As you may have noticed, most of the libraries I listed are not installable with pip/easy_install, but require running their own installers and modifying some files and environment variables.
Npackd exists: http://code.google.com/p/windows-package-manager/ It could be done through that, or you could use distribute (Python 3.x) or setuptools (Python 2.x) with easy_install, or possibly pip (I don't know its Windows compatibility). But I would choose Npackd because of PyQt and its unusual setup, which doesn't play nicely with pip/easy_install (it uses a configure.py instead of setup.py). You would have to create your own repo for Npackd to use for some of them, though. I forget which Python libraries are already available through it.
AFAIK there is no tool (and I'd assume you googled), so you must make one yourself.
Fetching the proper library versions seems simple enough -- using python's ftplib you can fetch the proper installers for every library. How would you know which version is compatible with the user's python? You can store different lists of download URLs, each for a different python version (this method came off the top of my head and there is probably a better way; not that it matters much if it's simple and it works).
After you figure out how to make each installer run, you can py2exe your installer script, and even use it to fetch the program itself.
EDIT
Some Considerations
There are a couple of things that popped into my mind just as I posted:
First, some pseudocode (how I would approach it, anyway)
# First, we check which modules are missing.
import importlib

missing = []
for name in ["numpy"]:  # lather, rinse, repeat for all dependencies
    try:
        importlib.import_module(name)
    except ImportError:
        missing.append(name)  # flag this module for installation

# Next we check version compatibility -- note that if a library version you need
# is not backwards-compatible, you're in DLL hell, and there is little we can do.
# <insert version-checking code here>

# Once you have your unavailable dependencies, you download them.
import ftplib
# <all your file-downloading here>

# Now you install. Sorry, I can't help you here.
There are a few things you can do to make your utility reusable --
put all URL lists, minimum version numbers, required library names etc in config files
Write a script which helps you set up an installer
Py2exe the installer-maker-script
Sell it
Even better, release it under GPL so we can all feast upon fruits of your labours.
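As a rough illustration of the "URL lists per Python version" idea, the configuration could be a mapping like the one below. Everything in it is an assumption made for the example: the structure, the minimum version, and the example.invalid URLs are placeholders, not real download locations.

```python
import sys

# Hypothetical configuration: minimum versions plus one installer URL
# per supported interpreter version. In practice this would live in a
# config file rather than in the script itself.
DEPENDENCIES = {
    "numpy": {
        "min_version": "1.6",
        "urls": {
            (2, 7): "ftp://example.invalid/numpy-py27.exe",
            (3, 3): "ftp://example.invalid/numpy-py33.exe",
        },
    },
}


def installer_url(name, version_info=sys.version_info):
    """Return the installer URL matching the running interpreter, or None."""
    key = (version_info[0], version_info[1])
    return DEPENDENCIES[name]["urls"].get(key)
```

A None result means there is no known installer for the user's Python, which the installer script should report instead of guessing.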
I have a similar need, but in addition I need the packaged application to work on several platforms. I'm exploring the currently available solutions; here are a few interesting ones:
Use SnakeBasket, which wraps around pip and adds recursive dependency resolution plus a heuristic to choose the right version when there are conflicts.
Package all dependencies as an egg, but not your sourcecode which will still be editable: https://stackoverflow.com/a/528064/1121352
Package all dependencies in a zip file and directly import the modules on the fly: Cross-platform alternative to py2exe or http://davidf.sjsoft.com/mirrors/mcmillan-inc/install1.html
Using buildout: http://www.buildout.org/en/latest/install.html
Using virtualenv with virtualenv-tools (instead of "relocate")
If your main problem when freezing your code using PyInstaller or similar is that you end up with a big single file, you can customize the process so that you get several files, one for each dependency, instead of one big executable.
I will update here if I find something that fills my bill.
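The "zip file imported on the fly" option relies on Python's built-in support for zip archives on sys.path. Here is a small self-contained sketch; the module name bundled_hello and its contents are made up for the demonstration.

```python
import sys
import tempfile
import zipfile
from pathlib import Path

# Build a tiny dependency bundle so the example is self-contained.
bundle = Path(tempfile.mkdtemp()) / "deps.zip"
with zipfile.ZipFile(bundle, "w") as zf:
    zf.writestr(
        "bundled_hello.py",
        "def greet():\n    return 'hello from the zip'\n",
    )

# A zip archive on sys.path is searched like a directory.
sys.path.insert(0, str(bundle))
import bundled_hello

print(bundled_hello.greet())
```

Note that this only works for pure-Python modules: compiled extension modules (.pyd or .so files) cannot be imported directly from a zip archive, which matters for libraries such as NumPy.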

How can I move file into Recycle Bin / trash on different platforms using PyQt4?

I would like to add the following feature to my cross-platform PyQt4 application: when the user selects a file and chooses the "remove" action, the file is moved to the Recycle Bin instead of being permanently removed. I think I can find a Windows-specific solution using the Win32 API or something similar, but I'd like to know whether a similar operation can be performed on Ubuntu/Linux and Mac OS X as well via PyQt4 methods.
It's a good thing you're using Python, I created a library to do just that a while ago:
http://www.hardcoded.net/articles/send-files-to-trash-on-all-platforms.htm
On PyPI: Send2Trash
Installation
Using conda:
conda install Send2Trash
Using pip:
pip install Send2Trash
Usage
Delete files or folders
from send2trash import send2trash
send2trash("directory")
I guess there really is no cross-platform solution provided by Qt and it's not a totally trivial task to implement the trash concept in Linux since it's slightly different based on which file manager is in use.
Here's a site discussing the trash concept in Nautilus and another one for KDE.
Under Windows you can use the Win32 API like you said. Python solution available here.
Mac OS X puts the trashed files in ~/.Trash similar to other *NIX OSes, but I couldn't quickly Google any documentation for it. It seems that the OS X trash info file is some kind of binary format and not plain text like in Linux.
Symbian doesn't have a desktop concept and thus no trashcan concept either. It might be similar for other mobile platforms.
EDIT: Super User has some discussion revealing that .DS_Store does indeed store information about trashed files, but no specifics about the format.
The best OSX solution I know uses Applescript. I did not, however, invent it, so I shall simply link to it here.
It would be nice, I feel, to have a module that packaged up the Win32/KDE/OSX solutions into one and imported the correct one on demand. Is that how you solved your problem in the end?
