Patching Linux systems with python modules installed via pip

There probably isn't one "right answer" to this question. I'm interested in thoughts and opinions.
We have a couple hundred RHEL7/Centos7/Rocky8 nodes. Many of them have python modules installed via pip/pip3.
I've been searching for best practices on routine/monthly patching of these modules, but so far I haven't found any. Obviously things installed with rpm/yum/dnf are pretty easy to deal with.
From the pip man page:
pip install --upgrade SomePackage
Great!
But how do you update all of them?
Sure, it is possible to run "pip list" or "pip freeze" and pipe the output to awk, etc.
Surely, there's a better way. Ideally, one that captures things like "boto3 V1.2 replaced with boto3 V1.3"
Right now it feels like I'm the only one thinking about this. Maybe I am and it is stupid. I'm ok with that response as well (but please tell me why).

A common solution is to deploy the application code inside a Docker container - the container image contains its own version of Python and all the dependency modules, so you don't have to update each module on all the host machines individually. It also means that the combination of OS, Python and modules that you deploy can be tested and then "frozen" into an immutable image which is then deployed the same everywhere.
"Right now it feels like I'm the only one thinking about this."
I realise the above answer is probably not helpful in your situation as you already have a fairly large system deployed... but it might help to explain why not many people are developing solutions to your problem!
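If you do stay with pip on the hosts for now, the report you describe (old version replaced by new version) can be approximated without awk. This is only a minimal sketch, assuming Python 3.6+ and a pip recent enough to support "pip list --outdated --format=json"; the function names are made up for illustration, and pip needs to reach its package index for the query to work.

```python
import json
import subprocess
import sys


def parse_outdated(pip_json):
    """Turn the JSON printed by `pip list --outdated --format=json`
    into sorted report lines of the form 'name old -> new'."""
    lines = []
    for pkg in json.loads(pip_json):
        lines.append(f"{pkg['name']} {pkg['version']} -> {pkg['latest_version']}")
    return sorted(lines)


def outdated_report():
    """Ask the pip that belongs to this interpreter which packages are outdated."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_outdated(result.stdout)
```

Saving the output of outdated_report() on each host before a patch run gives a simple per-host record of which versions were replaced.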

Related

How to package/ distribute python applications

I've spent countless hours trying to understand this and unfortunately I haven't gotten to an answer yet. Or at least I don't think I have.
First up I should say that I am a Java Developer. I've only recently started working with Python and the build-process is a bit...odd for me.
In my mind I write an application, I compile it to run and I package it into a .jar for other people to use. Either as a library or for end-users to execute and have fun with it. (ignoring stuff like maven or gradle...)
I wrote a little CLT in Python that consists of ~6 files and I wanted to distribute it. From what I could find I was supposed to write a setup.py and I found some guides on how to do that but ... to be honest I'm not even sure what that did. I could get my source code bundled into a tar.gz with some other metadata or it would create some weird files that I don't know what to do with.
Then I found PyInstaller and it worked great to package everything into a binary. However I've run into some problems trying to create a Debian package and it has made me re-assess and question the fact that there doesn't seem to be something in Python (without having to use an external tool) that lets me package/ bundle/ whatever my application into a single file to be run.
And that's something I can't get my head around. I mean...before there were tools like PyInstaller and py2exe and what not, how did people distribute their applications? Am I expected to write a C application, somehow include the python code in there and compile that? Sorry if this seems like a stupid question but I'm really asking. I've googled around so much and spent so much time on it and haven't found a satisfactory answer so I hope someone here can help me with this! Thanks.

If you package your Python code for pip, you can include some executable scripts that start your program. I don't know how the situation was 5 years ago when this question got asked, but nowadays pip is pretty much integrated with Python, to the point that there's a standard library module to bootstrap pip in case it's missing:
https://docs.python.org/3/library/ensurepip.html
The situation is different if you want to package an application for some other package manager, like Anaconda or the package managers of various Linux distributions, or as a Windows installer. Obviously, you'll have to create a separate package for each package manager or installation technique you want to support.
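To make the "executable scripts" part more concrete: with setuptools, a package can declare a console_scripts entry point, and pip then generates a small launcher for it at install time. The sketch below is only an illustration; the package name "mytool", the module "mytool/cli.py" and the "--name" option are all hypothetical.

```python
# mytool/cli.py (hypothetical module inside a hypothetical package "mytool")
import argparse


def main(argv=None):
    """Entry point; the launcher generated by pip would call this function."""
    parser = argparse.ArgumentParser(prog="mytool")
    parser.add_argument("--name", default="world")
    args = parser.parse_args(argv)  # argv=None means "use sys.argv[1:]"
    print(f"Hello, {args.name}!")
    return 0


# The matching declaration in setup.py would look roughly like this
# (all names are placeholders):
#
#   from setuptools import setup, find_packages
#   setup(
#       name="mytool",
#       version="0.1.0",
#       packages=find_packages(),
#       entry_points={"console_scripts": ["mytool = mytool.cli:main"]},
#   )
```

After "pip install ." in the project directory, a "mytool" command should be available on the PATH of that Python environment.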

Mac OS X + Python + Django + MySQL

I have worked many hours over several days trying to get MySQL working with Mac OS X, Python (I've tried both 2.7 and 3.3), and Django 1.6.
This topic is addressed on many webpages, both in SO and elsewhere, and over a period of many years (one solution specifically uses MySQLdb 1.2.2, which was last modified in March 2007). Some of the posts seem to say they have it working, but when I try their solution, it doesn't work for me. On the other hand, one post from a few months ago flatly says it can't be done.
The heart of the problem seems to be installing a driver (whether MySQLdb or mysql-connector), and symptoms vary depending on which instructions you follow. Typical show-stoppers from the various attempted solutions have been "No module named 'MySQLdb'" and "Symbol not found: _mysql_affected_rows" when you finally try "python manage.py syncdb".
One wonders whether the very act of trying so many solutions has itself messed up my dev environment so that what would have worked with a clean slate won't work now. Yes, I've tried this both with and without virtualenv. I don't know whether virtualenv has gotten me closer or not, because I don't know how to recognize getting closer.
I happen to have OS X 10.7.5 (Lion) and MySQL 5.0 on my machine. Those are not the latest versions of either, but I don't know whether that matters and I'm reluctant to keep changing things. They work fine for other MySQL applications on my machine. I'll gladly upgrade either or both if a solution is available for later versions.
Does anyone actually have the configuration listed as the title of this post working, with either Python 2.7 or 3.3? If so, I'd be most grateful if you'd direct me to the solution.
UPDATE
I just wanted to let readers know that I eventually did get my app running with Python 3.3, Django 1.6, and sort-of MySQL. My app has been running smoothly for months.
I'm sorry, I don't have the time to recreate the many hours of steps and mis-steps I followed to get this working. I'll just outline the key points:
I started using Macs more than a decade ago, starting with PowerBooks, so my Mac has a lot of old stuff on it. The first thing I finally decided I had to do was to get rid of every copy of Python and Django; installers such as MacPorts and Fink; and any of the directories they like to put their installations into. Google was of course invaluable to me in learning how to do this, and all the other steps mentioned below.
I then started fresh (as much as I could give my Mac a fresh start) using Homebrew as my only command-line installer.
I also used virtualenv. I don't actually understand virtualenv very well, and again don't have the time to research it, but I've got it working and it does seem to be a good idea.
Well, within virtualenv, I also used pip, which I guess is also a command-line installer, but it seems to be part of the Homebrew/virtualenv methodology. Sorry I can't provide any expertise on this.
As I mentioned, the app sort-of worked with MySQL, but when I used Homebrew to uninstall MySQL, and install MariaDB instead, it started to work really well. As far as I can tell, Django, Sequel Pro, PyCharm's DB features, and other programs that think they're talking to MySQL can't tell the difference between MySQL and MariaDB. I also really like the MariaDB online documentation. Admittedly, "MariaDB" isn't a great name, but neither is "MySQL".
Bottom line: If someone tells you it's impossible to get Python3.3 and Django1.6 running with MySQL (or at least MariaDB) on a Mac, don't believe them. It can be done, it's just hard to do if your system has a lot of legacy files and apps that can get into conflict with what you're trying to do.
One more thing: When I started work on this project, I suspended my work on a GAE app I'd been making great progress on for over a year. Since I'll be going back to that project soon, I wanted to keep my GAE install up-to-date on my machine, but sadly, I can no longer run the GAE installer for updates. I get some error about not being able to find python2.5. Sigh. That's what I'll have to look forward to solving when I get back to working on that project.

You could try using the pure-python pymysql:
sudo easy_install pymysql
(Use pip if you have it installed.) Then, add this before execute_from_command_line in manage.py:
try:
    import pymysql
    pymysql.install_as_MySQLdb()
except ImportError:
    pass

I feel your struggle. I went through the same thing and found the setup process very frustrating. I don't know which instructions from which website you were following when you hit those exceptions, but I find that most of these instructions are missing one or two small prerequisites. For example, Xcode and its command-line tools need to be installed before doing any pip install. For the connectors, if you are using mysql-python, you probably need to install python-devel. I used this instruction. You are probably right that after so many installation attempts your system is in a broken state, and you might need to re-install OS X and start clean again. It's painful, but that's what I had to do to make it work. I hope your next attempt works.

Difference between installing and importing modules

New to Python, so excuse my lack of specific technical jargon. Pretty simple question really, but I can't seem to grasp or understand the concept.
It seems that a lot of modules require using pip or easy_install and running setup.py to "install" into your python installation or your virtualenv. What is the difference between installing a module and simply taking it and importing it into another script? It seems that you access the modules the same way.
Thanks!

It's like the difference between:
Uploading a photo to the internet
Linking the photo URL inside an HTML page
Installing puts the code somewhere python expects those kinds of things to be, and the import statement says "go look there for something named X now, and make the data available to me for use".
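The "somewhere python expects" part is, roughly, the list of directories in sys.path. Here is a small sketch of that idea; the module name "demo_mod_xyz" and the helper names are made up, and a temporary directory stands in for the install location.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_dir(directory, module_name):
    """Import module_name after temporarily putting directory on sys.path."""
    sys.path.insert(0, str(directory))
    try:
        importlib.invalidate_caches()  # make the import system notice the new file
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(str(directory))


def demo():
    """Return (importable before the path change?, value read after it)."""
    name = "demo_mod_xyz"  # made-up module name for illustration
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, name + ".py").write_text("VALUE = 42\n")
        try:
            importlib.import_module(name)
            found_before = True
        except ImportError:
            found_before = False
        value = import_from_dir(tmp, name).VALUE
        sys.modules.pop(name, None)  # forget the module so the demo is repeatable
        return found_before, value
```

The file exists the whole time, but the import only succeeds once its directory is one of the places Python searches. Installing a module is mostly a way of putting it in such a place for good.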

For a single module, it usually doesn't make any difference. For complicated webs of modules, though, an installation program may do many things that wouldn't be immediately obvious. For example, it may also copy data files to where the new modules can find them, put executables (binary libraries, or DLLs on Windows, for example) where the new modules can find them, do different things depending on which version of Python you have, and so on.
If deploying a web of modules were always easy, nobody would have written setup programs to begin with ;-)

Best Practices in Handling Modules Prerequisites

Recently I started working on a personal project on my notebook that, all going OK, will be placed on a server elsewhere. The problem is that I make use of modules. Some were installed with apt-get, others with easy_install, and one or two were placed directly under a subdirectory since I changed them a bit. My question is: is there a way to move all those things together? Moreover, I don't want any of those modules to be updated, since that may break something. How do I handle that?
Finally, I'm pretty sure that I've done things the wrong way since the beginning. How do you guys work to avoid those problems?

Have a look at virtualenv. Virtualenv is a tool to create isolated Python environments.
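As a rough sketch of what "isolated" means in practice, the snippet below creates such an environment with the standard library's venv module. Note that this is a close relative of virtualenv, not virtualenv itself, used here only so the example runs without installing anything. The function name make_env is made up.

```python
import venv
from pathlib import Path


def make_env(target, with_pip=True):
    """Create an isolated environment in target and return the path of the
    pyvenv.cfg file that marks the directory as an environment."""
    venv.create(target, with_pip=with_pip)
    return Path(target) / "pyvenv.cfg"
```

Inside such an environment, "pip freeze > requirements.txt" records the exact versions in use, and "pip install -r requirements.txt" on the server reproduces them, so nothing gets updated unless you change that file.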

Script to install and compile Python, Django, Virtualenv, Mercurial, Git, LessCSS, etc... on Dreamhost

The Story
After cleaning up my Dreamhost shared server's home folder from all the cruft accumulated over time, I decided to start afresh and compile/reinstall Python.
All tutorials and snippets I found seemed overly simplistic, assuming (or ignoring) a bunch of dependencies needed by Python to compile all modules correctly. So, starting from http://andrew.io/weblog/2010/02/installing-python-2-6-virtualenv-and-virtualenvwrapper-on-dreamhost/ (so far the best guide I found), I decided to write a set-and-forget Bash script to automate this painful process, including along the way a bunch of other things I am planning to use.
The Script
I am hosting the script on http://bitbucket.org/tmslnz/python-dreamhost-batch/src/
The TODOs
So far it runs fine, and does all it needs to do in about 900 seconds, giving me at the end of the process a fully functional Python / Mercurial / etc... setup without even needing to log out and back in.
I thought this might be of use to others too, but there are a few things that I think it's missing, and I am not quite sure how to go about them, what the best way to do them is, or whether they make any sense at all.
Check for errors and break
Check for minor version bumps of the packages and give warnings
Check for known dependencies
Use arguments to install only some of the packages instead of commenting out lines
Organise the code in a manner that's easy to update
Optionally make the installers and compiling silent, with error logging to file
Failproof .bashrc modification to prevent breaking SSH logins and having to log back in via FTP to fix it
EDIT: The implied question is: can anyone, more bashful than me, offer general advice on the worthiness of the above points or highlight any problems they see with this approach? (see my answer to Ry4an's comment below)
The Gist
I am no UNIX or Bash or compiler expert, and this has been built iteratively, by trial and error. It is somehow going towards apt-get (well, 1% of it...), but since Dreamhost and others obviously cannot give root access on shared servers, this looks to me like a potentially very useful workaround; particularly so with some community work involved.

One way to streamline this would be to make it work with one of: capistrano/fabric, puppet/chef, jhbuild, or buildout+minitage (and a lot of cmmi tasks). There are some opportunities for factoring in common code, especially with something more high-level than bash. You will run into bootstrapping issues, however, so maybe leave good enough alone.
If you want to look into userland package managers, there is autopackage (bootstraps well), nix (quickstart), and stow (simple but helps with isolation).
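To illustrate the "more high-level than bash" point: a few of the TODO items (stop on errors, pick packages via arguments, log output to a file) become fairly short in plain Python. This is just a sketch under assumptions, not a port of the actual script; the function name run_steps and the (name, command) step format are invented for the example.

```python
import subprocess


def run_steps(steps, selected=None, log_path=None):
    """Run (name, command) steps in order and stop at the first failure.

    selected: optional set of step names to run; None means run all of them.
    log_path: optional file that receives the combined output of each command.
    Returns the names of the steps that completed successfully.
    """
    done = []
    for name, command in steps:
        if selected is not None and name not in selected:
            continue
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        if log_path is not None:
            with open(log_path, "a") as log:
                log.write(f"== {name} (exit {result.returncode}) ==\n{result.stdout}")
        if result.returncode != 0:
            raise RuntimeError(f"step {name!r} failed with exit code {result.returncode}")
        done.append(name)
    return done
```

Each command is a list such as ["make", "install"], and the selected set could be filled from the command-line arguments instead of commenting lines in and out.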

Honestly, I would just build packages with a name prefix for all of the pieces and have them install under /opt so that they're out of the way. That way it only takes the download time and a bit of install time to do.
