Is there a Rake equivalent in Python?

Rake is a software build tool written in Ruby (like Ant or Make), and so all its files are written in this language. Does something like this exist in Python?

Invoke — Fabric without the SSH dependencies.
The Fabric roadmap discusses that Fabric 1.x will be split into three portions:
Invoke — The non-SSH task execution.
Fabric 2.x — The remote execution and deployment library that utilizes Invoke.
Patchwork — The "common deployment/sysadmin operations, built on Fabric."
Below are a few descriptive statements from Invoke's website:
Invoke is a Python (2.6+ and 3.3+) task execution tool & library, drawing inspiration from various sources to arrive at a powerful & clean feature set.
Like Ruby’s Rake tool and Invoke’s own predecessor Fabric 1.x, it provides a clean, high level API for running shell commands and defining/organizing task functions from a tasks.py file.

Paver has a similar set of goals, though I don't really know how it compares.
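For reference, a Paver pavement.py plays roughly the same role as a Rakefile; a minimal sketch (the task names here are just examples, not anything prescribed by Paver):

# pavement.py
from paver.easy import task, needs, sh

@task
def test():
    sh("nosetests")

@task
@needs("test")          # build depends on the test task
def sdist():
    sh("python setup.py sdist")

Tasks are then run from the command line with something like paver sdist.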

Shovel seems promising:
Shovel — Rake for Python
https://github.com/seomoz/shovel
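A rough sketch of what a Shovel task file looks like, based on the project README (the module and task below are made up):

# shovel/words.py  (tasks live in a shovel.py file or a shovel/ directory)
from shovel import task

@task
def hello(name='world'):
    """Print a greeting."""
    print('Hello, %s!' % name)

Tasks are then invoked from the command line with something like shovel words.hello.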

Waf is a Python-based framework for configuring, compiling and installing applications. It derives from the concepts of other build tools such as Scons, Autotools, CMake or Ant.

There is also doit - I came across it while looking for these things a while ago, though I didn't get very far with evaluating it.
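For the curious, doit tasks are plain functions in a dodo.py file that return dictionaries describing actions and dependencies; the file names below are invented for illustration:

# dodo.py
def task_minify():
    """Minify the JavaScript bundle when the source changes."""
    return {
        'actions': ['uglifyjs src/app.js -o build/app.min.js'],
        'file_dep': ['src/app.js'],
        'targets': ['build/app.min.js'],
    }

Running doit executes any task whose file_dep have changed since the last run (or whose targets are missing).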

Although it is more commonly used for deployment, Fabric might be interesting for this use case.
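Even ignoring the remote-execution side, a Fabric 1.x fabfile works as a local task runner; a minimal sketch (the task names are just examples):

# fabfile.py
from fabric.api import local

def test():
    local("nosetests")

def deploy():
    local("python setup.py sdist upload")

Tasks are then run with fab test or fab deploy.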

Also check out buildout, which isn't so much a make system for software, as a make system for a deployment.
So it's not a direct rake equivalent, but may be a better match for what you want to do, or a really lousy one.

There is Phantom in Boo (which isn't Python, but nearly).

I would check out distutils:
The distutils package provides support for building and installing additional modules into a Python installation. The new modules may be either 100%-pure Python, or may be extension modules written in C, or may be collections of Python packages which include modules coded in both Python and C.

Related

How can I wrap Python processes in an OSGi deployment

I need to integrate a body of Python code into an existing OSGi (Apache Felix) deployment.
I assume, or at least hope, that packages exist to help with this effort.
If it helps, the Python code is still relatively new and small, so it can probably be re-architected to meet whatever constraints are needed. However, it must remain in Python, because of dependencies on third-party libraries.
What are suggested best practices?
The trick is to make this an extender, see 1 and 2. You want your Python code to be separate from the code that handles the interaction with the interpreter. So what you do is wrap the Python code and any native libraries in a bundle. This is trivial, since it is just a zip file.
You then develop a bundle that listens for starting bundles (see the BundleTracker) that contain Python code. A manifest is often used, but you can also look in a directory in the JAR. If you detect such code, you extract any native libraries and run the code in the interpreter of your choice.
If you can use Jython, that would be highly recommended. You can then carry the interpreter as an OSGi bundle that runs on the VM. If you need to use a native compiler, your life is less rosy. You can rely on the environment to provide you with an interpreter, but then why use OSGi in the first place? You basically lose the write-once-run-anywhere advantage. You could go the full monty by creating bundles that contain Python installers for all the platforms you support. It can be done, and it is not even that hard, but it is a maintenance nightmare. Believe me, native code sucks; it just does the same thing a bit faster than Java.

Differences between subprocess module, envoy, sarge and pexpect?

I am thinking about making a program that will need to send input to and take output from the various aircrack-ng suite tools. I know of a few Python modules, like subprocess, envoy, sarge and pexpect, that would provide the necessary functionality. Can anyone advise on what I should or shouldn't be using, especially as I'm new to Python?
Thanks
As the maintainer of sarge, I can tell you that its goals are broadly similar to envoy (in terms of ease of use over subprocess) and there is (IMO) more functionality in sarge with respect to:
Cross-platform support for bash-like syntax (e.g. use of &&, ||, & in command lines; see the example below)
Better support for capturing subprocess output streams and working with them asynchronously
More documentation, especially about the internals and peripheral issues like threading+forking in the context of using subprocess
Support for prevention of shell injection attacks
Of course YMMV, but you can check out the docs, they're reasonably comprehensive.
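As a small illustration of the bash-like syntax and output capture mentioned above, a sketch based on the sarge docs (not an exhaustive example):

from sarge import run, Capture

# sarge itself parses the '&&' rather than handing the line to a shell,
# which is what makes the syntax usable cross-platform
p = run('echo first && echo second', stdout=Capture())
print(p.stdout.text)    # captured output of both commands
print(p.returncode)     # exit status of the last command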
pexpect
As of 2015, pexpect does not work on Windows. It is rumored that "experimental" support will be added in the next version, but this has been a rumor for a long time (I'm not holding my breath).
Having written many applications using pexpect (and loving it), I am now sorry, because one of the things I love about Python (that it is cross-platform) is not true for my applications.
If you plan to ever add Windows support, avoid pexpect for the moment.
envoy
Not much activity in the last year, and few commits (12 total) since 2012. Not very promising for its future.
Internally it uses shlex in a way that is not compatible with Windows paths (commands must use '/' rather than '\' as the directory separator). A workaround (when using pathlib) is to call as_posix() on path objects before passing them as commands; see this answer and the example below.
Getting access to the internal streams (e.g. if I want to parse the output to drive an updating scrollbar) seems possible but is not documented.
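For example, the as_posix() workaround might look like this (the script path is hypothetical):

import envoy                          # third-party: pip install envoy
from pathlib import PureWindowsPath

script = PureWindowsPath(r"C:\tools\scan.py")
# envoy splits the command with shlex, which mangles backslashes,
# so pass forward slashes instead
r = envoy.run("python {0}".format(script.as_posix()))
print(r.status_code, r.std_out)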
sarge
Works on Windows out of the box and has an expect() method that should provide functionality similar to pexpect (allowing me to update a scrollbar). There has been recent activity, but it is hosted on both GitLab and Bitbucket (very confusing).
Personal Conclusion
I'm moving from pexpect to sarge for future development. It seems to provide a similar feature set to pexpect and supports Windows.
subprocess is a standard library module, so it is available with every Python installation. But it has a reputation for being hard to use, since its API is non-intuitive (see the example below).
envoy is a third-party module that wraps subprocess. It was written to be an easy-to-use alternative to subprocess. The author of envoy, Kenneth Reitz, is famous for his "Python for Humans" philosophy.
I'm not familiar with the other two.
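For completeness, plain subprocess is quite workable once you know about subprocess.run (Python 3.5+); the aircrack-ng invocation below is only an illustration:

import subprocess

result = subprocess.run(
    ["airodump-ng", "--help"],       # hypothetical aircrack-ng command
    capture_output=True,             # capture_output/text need Python 3.7+
    text=True,
)
print(result.returncode)
print(result.stdout)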

Why are there no Makefiles for automation in Python projects?

As a long-time Python programmer, I wonder whether a central aspect of Python culture has eluded me all this time: what do we do instead of Makefiles?
Most Ruby projects I've seen (not just Rails) use Rake; shortly after Node.js became popular, there was Cake. Many other (compiled and non-compiled) languages use classic Makefiles.
But in Python, no one seems to need such infrastructure. I randomly picked Python projects on GitHub, and they had no automation beyond the installation provided by setup.py.
What's the reason behind this?
Is there nothing to automate? Do most programmers prefer to run style checks, tests, etc. manually?
Some examples:
dependencies sets up a virtualenv and installs the dependencies
check calls the pep8 and pylint command-line tools.
the test task depends on dependencies, enables the virtualenv, starts selenium-server for the integration tests, and calls nosetest
the coffeescript task compiles all CoffeeScript files to minified JavaScript
the runserver task depends on dependencies and coffeescript
the deploy task depends on check and test and deploys the project.
the docs task calls sphinx with the appropriate arguments
Some of them are just one- or two-liners, but IMHO they add up. Thanks to the Makefile, I don't have to remember them.
To clarify: I'm not looking for a Python equivalent for Rake. I'm happy with Paver. I'm looking for the reasons.
Actually, automation is useful to Python developers too!
Invoke is probably the closest tool to what you have in mind, for automation of common repetitive Python tasks: https://github.com/pyinvoke/invoke
With invoke, you can create a tasks.py like this one (borrowed from the invoke docs):

from invoke import run, task

@task
def clean(docs=False, bytecode=False, extra=''):
    patterns = ['build']
    if docs:
        patterns.append('docs/_build')
    if bytecode:
        patterns.append('**/*.pyc')
    if extra:
        patterns.append(extra)
    for pattern in patterns:
        run("rm -rf %s" % pattern)

@task
def build(docs=False):
    run("python setup.py build")
    if docs:
        run("sphinx-build docs docs/_build")
You can then run the tasks at the command line, for example:
$ invoke clean
$ invoke build --docs
Another option is to simply use a Makefile. For example, a Python project's Makefile could look like this:
docs:
	$(MAKE) -C docs clean
	$(MAKE) -C docs html
	open docs/_build/html/index.html

release: clean
	python setup.py sdist upload

sdist: clean
	python setup.py sdist
	ls -l dist
Setuptools can automate a lot of things, and for things that aren't built-in, it's easily extensible.
To run unittests, you can use the setup.py test command after having added a test_suite argument to the setup() call. (documentation)
Dependencies (even if not available on PyPI) can be handled by adding a install_requires/extras_require/dependency_links argument to the setup() call. (documentation)
To create a .deb package, you can use the stdeb module.
For everything else, you can add custom setup.py commands.
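A custom command is just a setuptools.Command subclass registered via cmdclass; a minimal sketch (the command name and behaviour are invented for illustration):

from setuptools import setup, Command

class CleanCommand(Command):
    """Remove build artifacts: python setup.py clean_all"""
    description = "remove build artifacts"
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        import glob
        import shutil
        for path in glob.glob("build") + glob.glob("*.egg-info"):
            shutil.rmtree(path, ignore_errors=True)

setup(
    name="example",
    version="0.1",
    cmdclass={"clean_all": CleanCommand},
)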
But I agree with S.Lott: most of the tasks you'd wish to automate (except maybe dependency handling, which is the only one I find really useful) are tasks you don't run every day, so there wouldn't be any real productivity improvement from automating them.
There is a number of options for automation in Python. I don't think there is a culture against automation, there is just not one dominant way of doing it. The common denominator is distutils.
The one which is closest to your description is buildout. This is mostly used in the Zope/Plone world.
I myself use a combination of the following: Distribute, pip and Fabric. I mostly develop using Django, which has manage.py for automation commands.
This area is also being actively worked on in Python 3.3.
Any decent test tool has a way of running the entire suite in a single command, and nothing is stopping you from using rake, make, or anything else, really.
There is little reason to invent a new way of doing things when existing methods work perfectly well - why re-invent something just because YOU didn't invent it? (NIH).
The make utility is an optimization tool which reduces the time spent building a software image. The reduction in time is obtained when all of the intermediate materials from a previous build are still available, and only a small change has been made to the inputs (such as source code). In this situation, make is able to perform an "incremental build": rebuild only a subset of the intermediate pieces that are impacted by the change to the inputs.
When a complete build takes place, all that make effectively does is to execute a set of scripting steps. These same steps could just be deposited into a flat script. The -n option of make will in fact print these steps, which makes this possible.
A Makefile isn't "automation"; it's "automation with a view toward optimized incremental rebuilds." Anything scripted with any scripting tool is automation.
So, why would Python projects eschew tools like make? Probably because Python projects don't struggle with long build times that they are eager to optimize. Also, the compilation of a .py to a .pyc file does not have the same web of dependencies as a .c to a .o.
A C source file can #include hundreds of dependent files; a one-character change in any one of these files can mean that the source file must be recompiled. A properly written Makefile will detect when that is or is not the case.
A big C or C++ project without an incremental build system would mean that a developer has to wait hours for an executable image to pop out for testing. Fast, incremental builds are essential.
In the case of Python, probably all you have to worry about is when a .py file is newer than its corresponding .pyc, which can be handled by simple scripting: loop over all the files, and recompile anything newer than its byte code. Moreover, compilation is optional in the first place!
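As it happens, the standard library already ships that loop: compileall only recompiles sources whose cached bytecode is missing or stale (the directory name below is made up):

import compileall

# force=False (the default) skips any .py whose bytecode is already up to date
compileall.compile_dir("myproject", quiet=1)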
So the reason Python projects tend not to use make is that their need to perform incremental rebuild optimization is low, and they use other tools for automation; tools that are more familiar to Python programmers, like Python itself.
The original PEP where this was raised can be found here. Distutils has become the standard method for distributing and installing Python modules.
Why? It just happens that Python is a wonderful language to perform the installation of Python modules with.
Here are a few examples of Makefile usage with Python:
https://blog.horejsek.com/makefile-with-python/
https://krzysztofzuraw.com/blog/2016/makefiles-in-python-projects.html
I think most people are not aware of the "Makefile for Python" option. It could be useful, but the "sexiness ratio" is too small for it to spread rapidly (just my personal point of view).
Is there nothing to automate?
Not really. All but two of the examples are one-line commands.
tl;dr Very little of this is really interesting or complex. Very little of this seems to benefit from "automation".
Due to documentation, I don't have to remember the commands to do this.
Do most programmers prefer to run stylechecks, tests, etc. manually?
Yes.
generating documentation
the docs task calls sphinx with the appropriate arguments
It's one line of code. Automation doesn't help much.
sphinx-build -b html source build/html. That's a script. Written in Python.
We do this rarely. A few times a week. After "significant" changes.
running stylechecks (Pylint, Pyflakes and the pep8-cmdtool).
check calls the pep8 and pylint command-line tools
We don't do this. We use unit testing instead of pylint.
You could automate that three-step process.
But I can see how SCons or make might help someone here.
tests
There might be space for "automation" here. It's two lines: the non-Django unit tests (python test/main.py) and the Django tests (manage.py test). Automation could be applied to run both; a sketch follows below.
We do this dozens of times each day. We never knew we needed "automation".
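For what it's worth, such a runner is tiny; a sketch using the two commands quoted above:

import subprocess
import sys

for cmd in (
    [sys.executable, "test/main.py"],        # non-Django unit tests
    [sys.executable, "manage.py", "test"],   # Django tests
):
    subprocess.check_call(cmd)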
dependencies sets up a virtualenv and installs the dependencies
Done so rarely that a simple list of steps is all that we've ever needed. We track our dependencies very, very carefully, so there are never any surprises.
We don't do this.
the test task depends on dependencies, enables the virtualenv, starts selenium-server for the integration tests, and calls nosetest
The start server & run nosetest as a two-step "automation" makes some sense. It saves you from entering the two shell commands to run both steps.
the coffeescript task compiles all CoffeeScript files to minified JavaScript
This is something that's very rare for us. I suppose it's a good example of something to be automated. Automating the one-line script could be helpful.
I can see how SCons or make might help someone here.
the runserver task depends on dependencies and coffeescript
Except that the dependencies change so rarely that this seems like overkill. I suppose it can be a good idea if you're not tracking dependencies well in the first place.
the deploy task depends on check and test and deploys the project.
It's an svn co and python setup.py install on the server, followed by a bunch of customer-specific copies from the subversion area to the customer /www area. That's a script. Written in Python.
It's not a general make or SCons kind of thing. It has only one actor (a sysadmin) and one use case. We wouldn't ever mingle deployment with other development, QA or test tasks.

Parallel Tasking Concurrency with Dependencies on Python like GNU Make

I'm looking for a method, or possibly a philosophical approach, for how to do something like GNU Make within Python. Currently, we use Makefiles to execute processing because Makefiles are extremely good at parallel runs, controlled by changing a single option: -j x. In addition, GNU Make already has the dependency graph built into it, so adding a second processor, or the ability to process more threads, just means updating that single option. I want that same power and flexibility in Python, but I don't see it.
As an example:
all: dependency_a dependency_b dependency_c

dependency_a: dependency_d
	stuff

dependency_b: dependency_d
	stuff

dependency_c: dependency_e
	stuff

dependency_d: dependency_f
	stuff

dependency_e:
	stuff

dependency_f:
	stuff
If we do a standard single thread operation (-j 1), the order of operation might be:
dependency_f -> dependency_d -> dependency_a -> dependency_b -> dependency_e -> dependency_c
For two threads (-j 2), we might see:
1: dependency_f -> dependency_d -> dependency_a -> dependency_b
2: dependency_e -> dependency_c
Does anyone have any suggestions on either a package already built or an approach? I'm totally open, provided it's a pythonic solution/approach.
Please and Thanks in advance!
You might want to have a look at jug. It's a task-based parallelisation framework that includes dependency tracking.
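A rough sketch of jug's model, based on its tutorial (the functions are placeholders): calls to decorated functions become tasks, and passing one task's result into another records the dependency.

# tasks.py
from jug import TaskGenerator

@TaskGenerator
def preprocess(path):
    return path.upper()            # stand-in for real work

@TaskGenerator
def combine(parts):
    return "\n".join(parts)

parts = [preprocess(p) for p in ("a.txt", "b.txt", "c.txt")]
report = combine(parts)

Parallelism then comes from running several jug execute processes against the same script; each one picks up whichever tasks are ready.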
Also have a look at Waf; it's less complicated than SCons.
Waf is a Python-based framework for configuring, compiling and installing applications. Here are perhaps the most important features of Waf:
Automatic build order: the build order is computed from input and output files, among others
Automatic dependencies: tasks to execute are detected by hashing files and commands
Performance: tasks are executed in parallel automatically
Flexibility: new commands can be added very easily through subclassing
Features: support for lots of programming languages and compilers is included by default
Documentation: the application is based on a robust model documented in The Waf book and in the API docs
Python support: from Python 2.4 to 3.2 (Jython 2.5 and PyPy are also supported)
(from the website)
You should use SCons, as it already does the computations you want, and you can subvert it to do pretty much anything (like Make).
Have a look at SCons. It is a replacement for GNU Make written in Python.
SCons is a software construction tool—that is, a superior alternative to the classic "Make" build tool that we all know and love.
SCons is implemented as a Python script and set of modules, and SCons "configuration files" are actually executed as Python scripts. This gives SCons many powerful capabilities not found in other software build tools.
redo -j
"Smaller, easier, more powerful, and more reliable than make. An implementation of djb's redo" in Python.

Are there any other good alternatives to zc.buildout and/or virtualenv for installing non-python dependencies?

I am a member of a team that is about to launch a beta of a python (Django specifically) based web site and accompanying suite of backend tools. The team itself has doubled in size from 2 to 4 over the past few weeks and we expect continued growth for the next couple of months at least. One issue that has started to plague us is getting everyone up to speed in terms of getting their development environment configured and having all the right eggs installed, etc.
I'm looking for ways to simplify this process and make it less error prone. Both zc.buildout and virtualenv look like they would be good tools for addressing this problem but both seem to concentrate primarily on the python-specific issues. We have a couple of small subprojects in other languages (Java and Ruby specifically) as well as numerous python extensions that have to be compiled natively (lxml, MySQL drivers, etc). In fact, one of the biggest thorns in our side has been getting some of these extensions compiled against appropriate versions of the shared libraries so as to avoid segfaults, malloc errors and all sorts of similar issues. It doesn't help that out of 4 people we have 4 different development environments -- 1 leopard on ppc, 1 leopard on intel, 1 ubuntu and 1 windows.
Ultimately what would be ideal would be something that works roughly like this, from the dos/unix prompt:
$ git clone [repository url]
...
$ python setup-env.py
...
that then does what zc.buildout/virtualenv does (copy/symlink the Python interpreter, provide a clean space to install eggs), then installs all required eggs, including any native shared-library dependencies, installs the Ruby project, the Java project, etc.
Obviously this would be useful for both getting development environments up as well as deploying on staging/production servers.
Ideally I would like for the tool that accomplishes this to be written in/extensible via python, since that is (and always will be) the lingua franca of our team, but I am open to solutions in other languages.
So, my question then is: does anyone have any suggestions for better alternatives or any experiences they can share using one of these solutions to handle larger/broader install bases?
Setuptools may be capable of more of what you're looking for than you realize -- if you need a custom version of lxml to work correctly on MacOS X, for instance, you can put a URL to an appropriate egg inside your setup.py and have setuptools download and install that inside your developers' environments as necessary; it also can be told to download and install a specific version of a dependency from revision control.
That said, I'd lean towards using a scriptably generated virtual environment. It's pretty straightforward to build a kickstart file which installs whichever packages you depend on and then boot virtual machines (or production hardware!) against it, with puppet or similar software doing other administration (adding users, setting up services [where's your database come from?], etc). This comes in particularly handy when your production environment includes multiple machines -- just script the generation of multiple VMs within their handy little sandboxed subnet (I use libvirt+kvm for this; while kvm isn't available on all the platforms you have developers working on, qemu certainly is, or you can do as I do and have a small number of beefy VM hosts shared by multiple developers).
This gets you out of the headaches of supporting N platforms -- you only have a single virtual platform to support -- and means that your deployment process, as defined by the kickstart file and puppet code used for setup, is source-controlled and run through your QA and review processes just like everything else.
I always create a develop.py file at the top level of the project, along with a packages directory holding all of the .tar.gz files from PyPI that I want to install, and an unpacked copy of virtualenv that is ready to run right from that file. All of this goes into version control. Every developer can simply check out the trunk, run develop.py, and a few moments later have a virtual environment ready to use that includes all of our dependencies at exactly the versions the other developers are using. And it works even if PyPI is down, which is very helpful at this point in that service's history.
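A bootstrap script along those lines might look roughly like this (the layout and paths are assumptions, not the poster's actual file):

# develop.py
import glob
import os
import subprocess
import sys

ENV = "env"

if not os.path.isdir(ENV):
    # assumes an unpacked copy of virtualenv is checked in next to this script
    subprocess.check_call([sys.executable, os.path.join("virtualenv", "virtualenv.py"), ENV])

pip = os.path.join(ENV, "Scripts" if os.name == "nt" else "bin", "pip")
for sdist in sorted(glob.glob(os.path.join("packages", "*.tar.gz"))):
    # --no-index keeps the install working even when PyPI is down
    subprocess.check_call([pip, "install", "--no-index", sdist])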
Basically, you're looking for a cross-platform software/package installer (along the lines of apt-get/yum/etc.). I'm not sure something like that exists.
An alternative might be specifying the list of packages that need to be installed via the OS-specific package management system such as Fink or DarwinPorts for Mac OS X and having a script that sets up the build environment for the in-house code?
I have continued to research this issue since I posted the question. It looks like there are some attempts to address some of the needs I outlined, e.g. Minitage and Puppet, which take different approaches but may both accomplish what I want -- although Minitage does not explicitly state that it supports Windows. Lacking any better options, I will try to make one of these (or just extensive customized use of zc.buildout) work for our needs, but I still feel like there must be better options out there.
You might consider creating virtual machine appliances with whatever production OS you are running, and all of the software dependencies pre-built. Code can be edited either remotely, or with a shared folder. It worked pretty well for me in a past life that had a fairly complicated development environment.
Puppet doesn't (easily) support the Win32 world either. If you're looking for a deployment mechanism and not just a "dev setup" tool, you might consider looking into ControlTier (http://open.controltier.com/), which has an open-source cross-platform solution.
Beyond that you're looking at "enterprise" software such as BladeLogic or OpsWare, and typically an outrageous price tag for the functionality offered (my opinion, obviously).
A lot of folks have been aggressively using a combination of Puppet and Capistrano (even non-rails developers) for deployment automation tools to pretty good effect. Downside, again, is that it's expecting a somewhat homogeneous environment.
