I have created a python library using sklearn and some other dependencies. I want other developers to be able to use it in their programs, in a non-public environment(e.g.within a organization) They will use this library for to write their own applications.
Some questions that I have are -
What is a best way to make it available to other developers?
Let's say , the developers have their own python installation, and they use a version 1.x of a package(e.g. sklearn etc) but my
package uses 2.x, will there be a problem? If yes, how can i ensure they
can use my library.
I want to make my library available for both Python 2.7 and 3.x users. Do I need two different deployments? Currently my library
works(no version specific calls for 2.7/3.x) in both 2.7 and 3.x, if
the correct dependencies are pre-installed by the user.
The best way is to publish at PyPI. That way your user have ti just run pip install $LIB and got it with all dependencies (if you properly configured dependencies). See Python Packaging User Guide.
Just recommend your users to use virtualenv. Virtual environments are the way to separate and install different versions of Python libraries and programs to coexist at one system.
Very much depends on the nature of your library. There are libraries that can be installed to both Python 2 and 3 from one source and there are libraries that require different package for every python version.
Related
I recently install Openrefine and it's great, especially enjoying the python execution option.
Within the python execution, one can import additional packages, this can be seen in this example where the random package is imported.
Example of Openrefine python execution which returns random word out of the first 50 words
Now, I want to use a special package within the Openrefine tool, which is installed on one of my Conda environments. can I activate a particular Conda env that will be executed in Openrefine tool?
TL;DR: just wrap your Python package with FastAPI and communicate via HTTP requests.
OpenRefine and Jython
OpenRefine is using Jython, a Java implementation of Python.
Therefore, you can not "just" activate a conda environment, but you have to provide a Jython compatible package.
There is a tutorial in the OpenRefine wiki describing how to extend Jython with PyPi modules.
Please note that currently 2.7 is the newest Jython implementation. Jython 3 is still in it's planing and development phase. See Jython 3 Roadmap for details. This makes it difficult to use external libraries, as Python 2 had its end of life on 01.01.2020 and accordingly (most) libraries stopped supporting Python 2.
Also, some Python packages rely on C libraries that are not compatible with Jython. Check the Appendix A of the Definitive Guide to Jython for more details on using external tools and libraries.
Alternative solution using FastAPI
Personally I find it easier to just wrap the Python packages I want to use with FastAPI and communicate from OpenRefine via HTTP requests. Depending on your data you can then add new columns by fetching URLs or use GET/POST requests in Jython.
I recently created a GitHub Gist showing how to wrap the NER component of a spaCy model with FastAPI to be then used via OpenRefine.
I have a project written in python2. Suppose if I want to deploy my project after the end of life of python2 support, am I able to pull specific versions of specific packages?
For example, I am using boto3==1.4.7 version. So, am I able to pull the same version of the package after python2 support ends?
That should be possible, but it is better to port your Project to Python 3 and use the newest versions of the packages.
You may also consider storing these packages on some local server in case the support goes down. For example, boto3==1.4.7 is available as a tar.gz.
I have reinstalled my operating system (moved from windows XP to Windows 7).
I have reinstalled Python 2.7.
But i had a lot of packages installed in my old environment.
(Django, sciPy, jinja2, matplotlib, numpy, networkx, to name just a view)
I still have my old Python installation lying around on a data partition, so i wondered if i can just copy-paste the old Python library folders onto the new installation?
Or do i need to reinstall every package?
Do the packages keep any information in registry, system variables or similar?
Does it depend on the package?
That's the point where you must be able to layout your project, thus having special tools for that.
Normally, Python packages do not do such wierd things as dealing with registry (unless they are packaged via MSI installer). The problems may start with packages that contain C extensions, so moving to another version of OS or from 32 to 64-bit architecture will require recompiling/rebuilding of those. So, it would be much better to reinstall all packages to new system as written below.
Your demands may vary, but you definitely must choose the way of building your environment. If you don't have and plan to have a large variety of projects you may consider the first approach as follows below, the second approach is more likely for setting up development environment for different projects or for different versions of the same project.
Global environment (your Python installation in your system along with installed packages).
Here you can consider using pip. In this case your project can have requirements file containing all needed packages for your project. Basically, requirements file is a text file containing package names (on PyPI and their versions).
Isolated environment. It can be achieved using special tools or specially organized path.
Here where pip can be gracefully combined with virtualenv. This way is highly recommended by a lot of developers (I must remind that Python 3.3 that will soon be released contains virtualenv as a part of standard library). This approach assumes creating virtual shell with its own instance of Python interpreter and installed packages.
Another popular tool for achieving isolated environment is called buildout. It lays out your project source and dependencies in one path so you achieve the same effect as virtualenv creates. The great advantage of buildout that it's built upon an idea of pluggable recipes (pieces of code implementing different common project deployment tasks) and there are hundreds of stable and reliable recipes over Internet.
Both virtualenv and buildout help you to remove head-ache when installing dependencies and solve the problem of different versions of the same package kept on a single machine.
Choose your destiny...
The short answer to this question is "no", since packages can execute arbitrary code on installation and do whatever the heck they want wherever they want on your system.
Just reinstall all of them.
I am not a regular Linux user so this might be completely trivial question. I am running 6.2 PUIAS version i386_64 on one of my GPU based "super" computers due to the unavailability of NVidia drivers for NetBSD. The installed version of Python is 2.6.6. I need 2.7.2 Python and newer version of scipy, numpy, matlibplot and friends. I have PUIAS and EPEL repositories enabled. However they do not have newer versions of Python. What is the "recommended" way to install newer version of Python without braking the system which depends on it. I am not interested in Python 3.2 due to the lack of libraries for scientific computing.
When the install-Python-from-source routine tells you to use make install, type make altinstall instead. This will leave the normal python executable untouched and instead create python2.7 for you to use. Install the other packages from source using this new executable. Don't forget to change the shebang line in your scripts accordingly.
I am going to answer my own question. For people who are using Python for scientific computing on RedHat clones (PUIAS for example) the easiest way to get all they need is to use rpm package manager and Enthought Python Distribution (EPD for short). EPD installs everything in a sandbox so system tools which are based on an obsolete version of Python are not massed up. However, paths have to be adjusted for system or even easier on the user base so that the using shell invokes non-system tools. One should never compile Python from source unless you are interesting in Python itself or in porting it to your favorite operating system rather than in your own research!
I have a linux VPS that uses an older version of python (2.4.3). This version doesn't include the UUID module, but I need it for a project. My options are to upgrade to python2.6 or find a way to make uuid work with the older version. I am a complete linux newbie. I don't know how to upgrade python safely or how I could get the UUID modules working with the already installed version. What is a better option and how would I go about doing it?
The safest way to upgrading Python is to install it to a different location (away from the default system path).
To do this, download the source of python and do a
./configure --prefix=/opt
(Assuming you want to install it to /opt which is where most install non system dependant stuff to)
The reason why I say this is because some other system libraries may depend on the current version of python.
Another reason is that as you are doing your own custom development, it is much better to have control over what version of the libraries (or interpreters) you are using rather than have a operating system patch break something that was working before. A controlled upgrade is better than having the application break on you all of a sudden.
The UUID module exists as a separate package for Python 2.3 and up:
http://pypi.python.org/pypi/uuid/1.30
So you can either install that in your Python2.4, or install Python2.6. If your distro doesn't have it, then Python is quite simple to compile from source. Look through the requirements to make sure all the libraries you need/want are installed before compiling Python. That's it.
The best solution will be installing python2.6 in the choosen directory - It will you give you access to many great features and better memory handling (infamous python=2.4 memory leak problem).
I have got several pythons installed onto my two computers, I found that the best solution for are two directories:
$HOME/usr-32
$HOME/usr-64
respectively to using operating system (I share $HOME between 32 and 64 bit versions of Linux).
In each I have one directory for every application/program, for example:
ls ~/usr-64/python-2.6.2/
bin include lib share
It leads completetely to avoiding conflicts between version and gives great portability (you can use usb pendrives etc).
Python 2.6.2 in previously example has been installed with option:
./configure --prefix=$HOME/usr-64/python-2.6.2