How to manage Conda, Python, and dependencies within GitHub? - python

I am seeking some assistance in software project management. I want to take my project and store it remotely in GitHub, so I can clone the environment and work on it across multiple offices, each with a desktop PC, without moving a laptop back and forth. My knowledge and experience in software project management are a bit old, so as much detail as possible is always welcomed.
I have python code (several classes in corresponding class files), with the main python file executed from a Linux terminal, and the code calls a number of external programs:
Gromacs
Modeller
Several Python packages (new or with specific versions)
I've installed all the external programs/dependencies within a Conda environment.
I am the only person working on the project, but as mentioned above, I will need access to the working and development environment across several machines (not all at once). Is there any way I can "wrap" everything up, conda environment, code, etc., add it to a GitHub repository and clone as and when to each machine?
I would appreciate people's thoughts on how to manage this software project.
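One common way to approach this, sketched under assumptions rather than offered as a definitive answer: commit an `environment.yml` file next to the code, regenerate it whenever packages change, and recreate the environment from it on each machine after cloning. The environment name `myproject` and the helper function names below are hypothetical; `conda env export`, `conda env create`, and `conda env update` are standard conda commands. The export only covers what conda and pip installed into the environment, so any program installed outside it would still need separate handling.

```python
"""Sketch: keep a conda environment reproducible through a committed environment.yml."""
import subprocess
from pathlib import Path
from typing import List

ENV_NAME = "myproject"              # hypothetical environment name
ENV_FILE = Path("environment.yml")  # assumed to live in the Git repository root


def export_command(env_name: str = ENV_NAME) -> List[str]:
    """Command that prints the environment spec without platform-specific build strings."""
    return ["conda", "env", "export", "--name", env_name, "--no-builds"]


def create_command(env_file: Path = ENV_FILE) -> List[str]:
    """Command that builds the environment from the committed file on a fresh machine."""
    return ["conda", "env", "create", "--file", str(env_file)]


def update_command(env_file: Path = ENV_FILE) -> List[str]:
    """Command that brings an existing environment in line with the file after a pull."""
    return ["conda", "env", "update", "--file", str(env_file), "--prune"]


def export_environment(env_name: str = ENV_NAME, env_file: Path = ENV_FILE) -> None:
    """Write the current environment spec to env_file so it can be committed."""
    result = subprocess.run(
        export_command(env_name), check=True, capture_output=True, text=True
    )
    env_file.write_text(result.stdout)
```

On the machine where a package was added, `export_environment()` would be called and the resulting file committed; on the other machines, the command from `create_command()` is run once after cloning and the one from `update_command()` after later pulls.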

Related

PyCharm: Cannot keep interpreters the same on two different computers when accessing same project

I regularly switch between computers to work on the same project. It has become a headache to have separate interpreters updated with modules installed on my two machines even though both computers can access the net drive where python.exe is placed.
When I try to set the interpreter from my second computer to the same used by my primary computer, I keep getting an Invalid Python SDK error:
Currently the system python version is 3.9.1. Maybe I should uninstall this and install python3.6?
I am using Win10 64bit on both computers. My main computer has an Intel i5, and my secondary computer (the one with the problem) has two Intel Xeons. I have seen posts that suggest editing PYTHONPATH, but I do not know where to access it in Win10, so please be specific to the OS.
Do you really need to share an interpreter between the two machines? This seems like a very unusual idea.
"I regularly switch between computers to work on the same project." - It is very common for multiple people to work on the same project, and instead of sharing an interpreter, they use dedicated tools to manage modules and dependencies. I think you can do the same.
The best way is to use something more sophisticated than the plain requirements.txt - you can check out pipenv which is supported by PyCharm: https://realpython.com/pipenv-guide/
Then you can store the pipenv artifacts (the Pipfile and Pipfile.lock) in your git repo (you have one, right?) and update packages with a few commands, which are described in the link above.

advice needed: jupyter dashboard, backwards compatibility and deployment

I (a scientist rather than a software developer) have developed an application using jupyter dashboards and would like to make sure colleagues (with no programming skills) can use it in the future. However, jupyter dashboards are incompatible with the newest jupyter versions. We run Windows on all of our desktop computers and cannot install software at will, so we have to use portable apps like anaconda python. For example, the anaconda navigator cannot modify the start entry after installation because that requires admin rights. Furthermore, the firewall blocks conda update.
I thought of two solutions:
1) the least complicated (for me)
Provide a .yaml file for the anaconda environment and a tutorial on how to install anaconda and activate the required environment. Problem: the company firewall does not allow anaconda to install packages. I can install it, log into my private WLAN and circumvent that, but that is not an option for everyone. I would have to somehow deploy the specific anaconda environment offline. I prefer this solution because it seems simpler and less error-prone.
2) using docker
There are docker images available. We do have a local PC on which I could install docker and set up everything. Problem: if a new PC is installed, someone else would have to do all that, and honestly I doubt anyone would. We have an IT department, but that is way outside the norm and would require special attention and human resources as well as a lot of emails and calls to the IT service line.
I would appreciate any advice or ideas how to make sure in the simplest way possible that my work can be used by other scientists with minimal effort.
Out of what you mentioned, I prefer the Docker approach. It allows you to define a well-controlled environment with a relatively easy setup for new users. Note that Docker has some quirks when running on Windows and can sometimes cause weird issues (containers running out of space out of the blue, pathing issues [if running on Docker Toolbox], and such).
It is slightly more complicated to set up (than the yaml approach), but as a tradeoff you are much less dependent on every single machine's or network's specifications.
If your workplace has an IT department and your team is supposed to share the work, I'd suggest asking them to create a cloud (intranet) jupyter server so your team has centralized access to the jupyter infrastructure.
In my company we have an even more complex approach, an intranet copy of Google Colab. That would be the best approach if you can push your IT department that far.
Good luck!!

How to distribute python software to users of low technical ability?

I have a python application (3.5) that I’m trying to distribute. It:
Uses no GUI libraries (it runs in the browser)
Uses several external packages (Flask, SocketIO, httplib2)
maintains saved config and data files inside the main source directory
The target users:
Use Mac or Windows
Do not understand the concept of the terminal/command line (testing has shown that it can take hours to teach users how to cd into the source directory to run a .py file).
Generally have little difficulty installing the python interpreter from python.org (but have great trouble starting and exiting the python console).
Are generally of very low technical ability.
Preferably, the app should:
be “click and play”, as I have found that typically the cd navigation is the biggest hurdle preventing users from running my application.
not require manually modifying any system settings
I am developing from Ubuntu Linux. I have access to a Windows VM, but not a Mac computer. How do I distribute my application?
There are a couple of applications that can help you distribute a Python application. In this case you want to take a look at Python freezing tools like py2exe (Windows only) or py2app (macOS).
These two will help you distribute your code without the hassle of making the user install the dependencies and run anything from the command line.
However, if your application runs in the browser, you probably want to just put it on a server (take a look at OpenShift, it's free); that will make your life a lot easier.

Cloning development environment across machines

So I'm doing a bit of Python development work right now, and I was wondering if it was possible to "clone" my entire development environment, specifically the Python interpreter and all the libraries I have installed, to my laptop. I currently use GitHub to store and sync my files across machines, and I use Sublime Text as my main code editor so I can just install it on both machines by hand, but I don't want to have to hunt down and re-install every library and their dependencies on the new machine because I don't remember everything I might've installed and doing it by hand might not get me everything I need.
My first guess would be to just copy/paste the Python folder from my main PC to my laptop, but I have no idea how to synchronize it so that updates and changes made to one side can be brought over to the other without hassle.
How do more experienced programmers/developers handle working on large projects across multiple machines?
What I'd do is keep a virtualenv for each project on each machine and check a requirements.txt file into your Git repository, and do
source /path/to/virtualenv/bin/activate
pip install -r /path/to/project/requirements.txt
each time you add or change a library.
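The same workflow can be scripted so that each machine runs one function after pulling. This is a minimal sketch, assuming the standard-library `venv` module stands in for virtualenv; the `.venv` directory name and the function names are hypothetical choices, not part of the answer above.

```python
"""Sketch: create a per-project virtual environment and sync it with requirements.txt."""
import subprocess
import sys
import venv
from pathlib import Path
from typing import List


def venv_python(env_dir: Path) -> Path:
    """Interpreter inside the environment; the layout differs on Windows."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(env_dir: Path, requirements: Path) -> List[str]:
    """pip invocation that installs the pinned requirements into the environment."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", "-r", str(requirements)]


def sync(project_dir: Path) -> None:
    """Create the environment on first use, then install whatever requirements.txt lists."""
    env_dir = project_dir / ".venv"  # hypothetical location, kept out of version control
    if not env_dir.exists():
        venv.create(env_dir, with_pip=True)
    subprocess.run(
        install_command(env_dir, project_dir / "requirements.txt"), check=True
    )
```

Calling the environment's own interpreter with `-m pip` avoids having to activate the environment inside the script.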

Are there any other good alternatives to zc.buildout and/or virtualenv for installing non-python dependencies?

I am a member of a team that is about to launch a beta of a python (Django specifically) based web site and accompanying suite of backend tools. The team itself has doubled in size from 2 to 4 over the past few weeks and we expect continued growth for the next couple of months at least. One issue that has started to plague us is getting everyone up to speed in terms of getting their development environment configured and having all the right eggs installed, etc.
I'm looking for ways to simplify this process and make it less error prone. Both zc.buildout and virtualenv look like they would be good tools for addressing this problem but both seem to concentrate primarily on the python-specific issues. We have a couple of small subprojects in other languages (Java and Ruby specifically) as well as numerous python extensions that have to be compiled natively (lxml, MySQL drivers, etc). In fact, one of the biggest thorns in our side has been getting some of these extensions compiled against appropriate versions of the shared libraries so as to avoid segfaults, malloc errors and all sorts of similar issues. It doesn't help that out of 4 people we have 4 different development environments -- 1 leopard on ppc, 1 leopard on intel, 1 ubuntu and 1 windows.
Ultimately what would be ideal would be something that works roughly like this, from the dos/unix prompt:
$ git clone [repository url]
...
$ python setup-env.py
...
that then does what zc.buildout/virtualenv does (copy/symlink the python interpreter, provide a clean space to install eggs) then installs all required eggs, including installing any native shared library dependencies, installs the ruby project, the java project, etc.
Obviously this would be useful for both getting development environments up as well as deploying on staging/production servers.
Ideally I would like for the tool that accomplishes this to be written in/extensible via python, since that is (and always will be) the lingua franca of our team, but I am open to solutions in other languages.
So, my question then is: does anyone have any suggestions for better alternatives or any experiences they can share using one of these solutions to handle larger/broader install bases?
Setuptools may be capable of more of what you're looking for than you realize -- if you need a custom version of lxml to work correctly on MacOS X, for instance, you can put a URL to an appropriate egg inside your setup.py and have setuptools download and install that inside your developers' environments as necessary; it also can be told to download and install a specific version of a dependency from revision control.
That said, I'd lean towards using a scriptably generated virtual environment. It's pretty straightforward to build a kickstart file which installs whichever packages you depend on and then boot virtual machines (or production hardware!) against it, with puppet or similar software doing other administration (adding users, setting up services [where's your database come from?], etc). This comes in particularly handy when your production environment includes multiple machines -- just script the generation of multiple VMs within their handy little sandboxed subnet (I use libvirt+kvm for this; while kvm isn't available on all the platforms you have developers working on, qemu certainly is, or you can do as I do and have a small number of beefy VM hosts shared by multiple developers).
This gets you out of the headaches of supporting N platforms -- you only have a single virtual platform to support -- and means that your deployment process, as defined by the kickstart file and puppet code used for setup, is source-controlled and run through your QA and review processes just like everything else.
I always create a develop.py file at the top level of the project, and have also a packages directory with all of the .tar.gz files from PyPI that I want to install, and also included an unpacked copy of virtualenv that is ready to run right from that file. All of this goes into version control. Every developer can simply check out the trunk, run develop.py, and a few moments later will have a virtual environment ready to use that includes all of our dependencies at exactly the versions the other developers are using. And it works even if PyPI is down, which is very helpful at this point in that service's history.
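The offline part of such a `develop.py` could look roughly like the sketch below. It is an assumption about how the idea might be written with current tooling, not the answerer's actual script: pip's `--no-index` and `--find-links` options make it resolve packages only from a local directory, and the `packages` and `requirements.txt` names are hypothetical.

```python
"""Sketch: install dependencies from archives stored in the repository, without PyPI."""
import subprocess
import sys
from pathlib import Path
from typing import List


def offline_install_command(
    python: str, packages_dir: Path, requirements: Path
) -> List[str]:
    """pip invocation that never contacts an index and only reads packages_dir."""
    return [
        python, "-m", "pip", "install",
        "--no-index",                        # do not query PyPI
        "--find-links", str(packages_dir),   # look for archives here instead
        "-r", str(requirements),
    ]


def install_offline(project_dir: Path, python: str = sys.executable) -> None:
    """Install everything in requirements.txt from the vendored packages directory."""
    subprocess.run(
        offline_install_command(
            python, project_dir / "packages", project_dir / "requirements.txt"
        ),
        check=True,
    )
```

The `packages` directory can be filled on a machine with network access using `pip download -r requirements.txt -d packages`. Downloaded wheels may be platform-specific, so archives fetched on one operating system will not necessarily install on another.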
Basically, you're looking for a cross-platform software/package installer (along the lines of apt-get/yum/etc.). I'm not sure something like that exists.
An alternative might be specifying the list of packages that need to be installed via the OS-specific package management system such as Fink or DarwinPorts for Mac OS X and having a script that sets up the build environment for the in-house code?
I have continued to research this issue since I posted the question. It looks like there are some attempts to address some of the needs I outlined, e.g. Minitage and Puppet which take different approaches but both may accomplish what I want -- although Minitage does not explicitly state that it supports Windows. Lacking any better options I will try to make either one of these or just extensive customized use of zc.buildout work for our needs, but I still feel like there must be better options out there.
You might consider creating virtual machine appliances with whatever production OS you are running, and all of the software dependencies pre-built. Code can be edited either remotely, or with a shared folder. It worked pretty well for me in a past life that had a fairly complicated development environment.
Puppet doesn't (easily) support the Win32 world either. If you're looking for a deployment mechanism and not just a "dev setup" tool, you might consider looking into ControlTier (http://open.controltier.com/), which has an open-source cross-platform solution.
Beyond that you're looking at "enterprise" software such as BladeLogic or OpsWare and typically an outrageous pricetag for the functionality offered (my opinion, obviously).
A lot of folks have been aggressively using a combination of Puppet and Capistrano (even non-rails developers) for deployment automation tools to pretty good effect. Downside, again, is that it's expecting a somewhat homogeneous environment.
