Mercurial scripting with python

I am trying to get the mercurial revision number/id (it's a hash not a number) programmatically in python.
The reason is that I want to add it to the css/js files on our website like so:
<link rel="stylesheet" href="example.css?{% mercurial_revision "example.css" %}" />
So that whenever a change is made to the stylesheet, it will get a new url and no longer use the old cached version.
OR if you know where to find good documentation for the mercurial python module, that would also be helpful. I can't seem to find it anywhere.
My Solution
I ended up using subprocess to just run a command that gets the hg node. I chose this solution because the API is not guaranteed to stay the same, but the command-line interface probably will:
import subprocess

def get_hg_rev(file_path):
    pipe = subprocess.Popen(
        ["hg", "log", "-l", "1", "--template", "{node}", file_path],
        stdout=subprocess.PIPE
    )
    return pipe.stdout.read()
example use:
> path_to_file = "/home/jim/workspace/lgr/pinax/projects/lgr/site_media/base.css"
> get_hg_rev(path_to_file)
'0ed525cf38a7b7f4f1321763d964a39327db97c4'
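A minimal sketch of how the lookup above could feed the cache-busting URL from the question without running hg on every render. The names hg_node_for, cached_hg_node and versioned_url are hypothetical helpers, not part of Mercurial or Django, and the memoization assumes the server process is restarted on each deploy:

```python
import subprocess
from functools import lru_cache


def hg_node_for(file_path, run=subprocess.check_output):
    """Return the hex node of the last changeset that touched file_path."""
    out = run(["hg", "log", "-l", "1", "--template", "{node}", file_path])
    # check_output returns bytes on Python 3; decode so it can go into a URL.
    return out.decode("ascii").strip()


@lru_cache(maxsize=None)
def cached_hg_node(file_path):
    # Hypothetical helper: at most one hg call per path per process lifetime.
    return hg_node_for(file_path)


def versioned_url(url, file_path, lookup=cached_hg_node):
    """Append the revision as a query string, e.g. example.css?<node>."""
    return "{}?{}".format(url, lookup(file_path))
```

The run and lookup parameters exist only so the functions can be exercised without a repository; in a template tag you would call versioned_url("example.css", path_to_file) directly.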

It's true there's no official API, but you can get an idea about best practices by reading other extensions, particularly those bundled with hg. For this particular problem, I would do something like this:
from mercurial import ui, hg
from mercurial.node import hex
repo = hg.repository('/path/to/repo/root', ui.ui())
fctx = repo.filectx('/path/to/file', 'tip')
hexnode = hex(fctx.node())
Update: At some point the parameter order changed; now it's like this:
repo = hg.repository(ui.ui(), '/path/to/repo/root' )

Do you mean this documentation?
Note that, as stated in that page, there is no official API, because they still reserve the right to change it at any time. But you can see the list of changes in the last few versions, it is not very extensive.

An updated, cleaner subprocess version (uses .check_output(), added in Python 2.7/3.1) that I use in my Django settings file for a crude end-to-end deployment check (I dump it into an HTML comment):
import subprocess
HG_REV = subprocess.check_output(['hg', 'id', '--id']).strip()
You could wrap it in a try if you don't want some strange hiccup to prevent startup:
try:
    HG_REV = subprocess.check_output(['hg', 'id', '--id']).strip()
except OSError:
    HG_REV = "? (Couldn't find hg)"
except subprocess.CalledProcessError as e:
    HG_REV = "? (Error {})".format(e.returncode)
except Exception:  # don't use bare 'except', mis-catches Ctrl-C and more
    # should never have to deal with a hangup
    HG_REV = "???"

If you are using Python 2, you want to use hglib.
I don't know what to use if you're using Python 3, sorry. Probably hgapi.
Contents of this answer
Mercurial's APIs
How to use hglib
Why hglib is the best choice for Python 2 users
If you're writing a hook, that discouraged internal interface is awfully convenient
Mercurial's APIs
Mercurial has two official APIs.
The Mercurial command server. You can talk to it from Python 2 using the hglib (wiki, PyPI) package, which is maintained by the Mercurial team.
Mercurial's command-line interface. You can talk to it via subprocess, or hgapi, or somesuch.
How to use hglib
Installation:
pip install python-hglib
Usage:
import hglib
client = hglib.open("/path/to/repo")
commit = client.log("tip")[0]  # log() returns a list of revisions
print commit.author
More usage information on the hglib wiki page.
Why hglib is the best choice for Python 2 users
Because it is maintained by the Mercurial team, and it is what the Mercurial team recommend for interfacing with Mercurial.
From Mercurial's wiki, the following statement on interfacing with Mercurial:
For the vast majority of third party code, the best approach is to use Mercurial's published, documented, and stable API: the command line interface. Alternately, use the CommandServer or the libraries which are based on it to get a fast, stable, language-neutral interface.
From the command server page:
[The command server allows] third-party applications and libraries to communicate with Mercurial over a pipe that eliminates the per-command start-up overhead. Libraries can then encapsulate the command generation and parsing to present a language-appropriate API to these commands.
The Python interface to the Mercurial command-server, as said, is hglib.
The per-command overhead of the command line interface is no joke, by the way. I once built a very small test suite (only about 5 tests) that used hg via subprocess to create, commit by commit, a handful of repos with e.g. merge situations. Throughout the project, the runtime of the suite stayed between 5 and 30 seconds, with nearly all of the time spent in the hg calls.
If you're writing a hook, that discouraged internal interface is awfully convenient
The signature of a Python hook function is like so:
# In the hgrc:
# [hooks]
# preupdate.my_hook = python:/path/to/file.py:my_hook
def my_hook(
        ui, repo, hooktype,
        ... hook-specific args, find them in `hg help config` ...,
        **kwargs)
ui and repo are part of the aforementioned discouraged unofficial internal API. The fact that they are right there in your function args makes them terribly convenient to use, such as in this example of a preupdate hook that disallows merges between certain branches.
def check_if_merge_is_allowed(ui, repo, hooktype, parent1, parent2, **kwargs):
    from_ = repo[parent2].branch()
    to_ = repo[parent1].branch()
    ...
    # return True if the hook fails and the merge should not proceed.
If your hook code is not so important, and you're not publishing it, you might choose to use the discouraged unofficial internal API. If your hook is part of an extension that you're publishing, better use hglib.
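For illustration only, here is one way the body of such a preupdate hook might be filled in. FORBIDDEN_MERGES and the branch names in it are hypothetical, and the sketch assumes that parent2 is empty for a plain (non-merge) update and that branch names arrive as text strings, as in Python 2-era Mercurial (newer builds may hand you bytes instead):

```python
# Hypothetical policy: (source branch, target branch) pairs that may not be merged.
FORBIDDEN_MERGES = {("experimental", "stable")}


def check_if_merge_is_allowed(ui, repo, hooktype, parent1=None, parent2=None, **kwargs):
    if not parent2:
        # Assumed: no second parent means a plain update, so nothing to check.
        return False
    from_ = repo[parent2].branch()
    to_ = repo[parent1].branch()
    if (from_, to_) in FORBIDDEN_MERGES:
        ui.warn("merging %s into %s is not allowed\n" % (from_, to_))
        return True  # a true return value makes the hook fail
    return False
```

Because it only touches repo[...] and ui.warn, the function can be tried out with small stand-in objects before wiring it into an hgrc.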

Give the keyword extension a try.

FWIW to avoid fetching that value on every page/view render, I just have my deploy put it into the settings.py file. Then I can reference settings.REVISION without all the overhead of accessing mercurial and/or another process. Do you ever have this value change w/o reloading your server?

I wanted to do the same thing the OP wanted to do, get hg id -i from a script (get tip revision of the whole REPOSITORY, not of a single FILE in that repo) but I did not want to use popen, and the code from brendan got me started, but wasn't what I wanted.
So I wrote this... Comments/criticism welcome. This gets the tip rev in hex as a string.
from mercurial import ui, hg, revlog
# from mercurial.node import hex # should I have used this?
def getrepohex(reporoot):
    repo = hg.repository(ui.ui(), reporoot)
    revs = repo.revs('tip')
    if len(revs) == 1:
        return str(repo.changectx(revs[0]))
    else:
        raise Exception("Internal failure in getrepohex")

Related

Perforce API: Get Latest Revision of a Subdirectory

I have downloaded and installed the Perforce API for Python.
I'm able to run the examples on this page:
http://www.perforce.com/perforce/doc.current/manuals/p4script/03_python.html#1127434
But unfortunately the documentation seems incomplete. For example, the P4 class has a method called run_sync, but it's not documented anywhere (in fact, it doesn't even show up if you run dir(p4) in the Python interactive interpreter, despite the fact that you can use the method just fine in the interactive interpreter.)
So I'm struggling with figuring out how to use the API for anything beyond the trivial examples on the page I linked to above.
I would like to write a script which simply downloads the latest revision of a subdirectory to the filesystem of the computer running it and does nothing else. I don't want the server to change in any way. I don't want there to be any indication that the files came from Perforce (as opposed to if you get the files via the Perforce application, it'll mark the files in your file system as read only until you check them out or whatever. That's silly - I just need to pull down a snapshot of what the subdirectory looked like at the moment the script was run.)

The Python API follows the same basic structure as the command line client (both are very thin wrappers over the same underlying API), so you'll want to look at the command line client documentation; for example, look at "p4 sync" to understand how "run_sync" in P4Python works:
http://www.perforce.com/perforce/r14.2/manuals/cmdref/p4_sync.html
For the task you're describing I would do the following (I'll describe it in terms of Perforce commands since my Python is a little rusty; once you know what commands you're running it should be pretty simple to translate into Python, since the P4Python doc has examples of things like creating and modifying a client spec, which is the hardest part):
1) Create a client that maps the desired depot directory to the desired local filesystem location, e.g. if you want the directory "//depot/foo/..." downloaded to "/usr/team/foo" you'd make a client that looks like:
Client: mytempclient123847
Root: /usr/team/foo
View:
    //depot/foo/... //mytempclient123847/...
You should set the "allwrite" option on the client since you said you don't want the synced files to be read-only:
Options: allwrite noclobber nocompress unlocked nomodtime rmdir
2) Sync, using the "-p" option to minimize server impact (the server will not record that you "have" the files).
3) Delete the client.
(I'm omitting some details like making sure that you're authenticated correctly -- that's a whole other potential challenge depending on your server's security and whether it's using external authentication, but it sounds like that's not the part you're having trouble with.)
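The three steps above can be sketched in Python as plain data: the spec text for step 1 and the p4 command lines for steps 1 to 3. build_client_spec and snapshot_commands are hypothetical helper names; nothing here talks to a server, and authentication is left out just as in the answer:

```python
def build_client_spec(name, root, depot_dir):
    """Return client spec text mapping depot_dir/... to the client root."""
    depot_dir = depot_dir.rstrip("/")
    lines = [
        "Client: {}".format(name),
        "Root: {}".format(root),
        # allwrite keeps the synced files writable.
        "Options: allwrite noclobber nocompress unlocked nomodtime rmdir",
        "View:",
        "\t{}/... //{}/...".format(depot_dir, name),
    ]
    return "\n".join(lines) + "\n"


def snapshot_commands(name, depot_dir):
    """Command lines for steps 1-3; the spec text goes to the first one on stdin."""
    depot_dir = depot_dir.rstrip("/")
    return [
        ["p4", "client", "-i"],                                     # 1) create client
        ["p4", "-c", name, "sync", "-p", depot_dir + "/..."],       # 2) sync, no "have" record
        ["p4", "client", "-d", name],                               # 3) delete client
    ]
```

Each command list could be handed to subprocess, or translated into the matching P4Python calls once you know which commands you need.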

Using GitPython, how do I do git submodule update --init

My code so far is working doing the following. I'd like to get rid of the subprocess.call() stuff
import os
import git
from subprocess import call
repo = git.Repo(repo_path)
repo.remotes.origin.fetch(prune=True)
repo.head.reset(commit='origin/master', index=True, working_tree=True)
# I don't know how to do this using GitPython yet.
os.chdir(repo_path)
call(['git', 'submodule', 'update', '--init'])

My short answer: it's convenient and simple.
Full answer follows. Suppose you have your repo variable:
repo = git.Repo(repo_path)
Then, simply do:
for submodule in repo.submodules:
    submodule.update(init=True)
And you can do all the things with your submodule that you do with your ordinary repo via submodule.module() (which is of type git.Repo) like this:
sub_repo = submodule.module()
sub_repo.git.checkout('devel')
sub_repo.remote('maybeorigin').fetch()
I use such things in my own porcelain over git porcelain that I use to manage some projects.
Also, to do it more directly, you can, instead of using call() or subprocess, just do this:
repo = git.Repo(repo_path)
output = repo.git.submodule('update', '--init')
print(output)
You can print it because the method returns the output that you usually get by running git submodule update --init (obviously the print() part depends on the Python version).

Short answer: You can’t.
Full answer: You can’t, and there is also no point. GitPython is not a complete implementation of the whole of Git. It just provides a high-level interface to some common things. While a few operations are implemented directly in Python, a lot of calls actually use the Git command line interface to do the work.
Your fetch line, for example, does this. Under the hood, a trick is used to make some calls look like Python although they call the Git executable to produce the result, using subprocess as well.
So you could try to figure out how the git cmd interface that GitPython offers works to support those calls (you can access the instance of that cmd handler using repo.git), or you can just continue using the “boring” subprocess calls directly.

Can I use msilib or other Python libraries to extract one file from an .msi file?

What I really want to do is determine whether a particular file in the MSI exists and contains a particular string.
My current idea is to run:
import msilib
db = msilib.OpenDatabase(r'c:\Temp\myfile.msi', 1)
query = "select * from File"
view = db.OpenView(query)
view.Execute(None)
cur_record = view.Fetch() # do this until I get the record I want
print cur_record.GetString(3) # do stuff with this value
And then if it's there, extract all the files using
msiexec /a c:\Temp\myfile.msi /qn TARGETDIR=c:\foo
and use whatever parser to see whether my string is there. But I'm hoping a less clunky way exists.

Note that, as the docs for msilib say, "Support for reading .cab files is currently not implemented". And, more generally, the library is designed for building .msi files, not reading them. And there is nothing else in the stdlib that will do what you want.
So, there are a few possibilities:
Find and install another library, like pycabinet. I know nothing about this particular library; it's just the first search hit I got; you probably want to search on your own. But it claims to provide a zipfile-like API for CAB files, which sounds like exactly the part you're missing.
Use win32com (if you've got pywin32) or ctypes (if you're a masochist) to talk to the underlying COM interfaces and/or the classic Cabinet API (which I think is now deprecated, but still works).
Use IronPython instead of CPython, so you can use the simpler .NET interfaces.
Since I don't have a Windows box here, I can't test this, but here's a sketch of Christopher Painter's .NET solution written in IronPython instead of C#:
import clr
clr.AddReference('Microsoft.Deployment.WindowsInstaller')
clr.AddReference('Microsoft.Deployment.WindowsInstaller.Package')
from Microsoft.Deployment.WindowsInstaller import *
from Microsoft.Deployment.WindowsInstaller.Package import *

def FindAndExtractFiles(packagePath, longFileName):
    with InstallPackage(packagePath, DatabaseOpenMode.ReadOnly) as installPackage:
        if installPackage.FindFiles(longFileName).Count() > 0:
            installPackage.ExtractFiles()

Realize that in using Python you have to deal with the Windows Installer (COM) Automation interface. This means you have to do all the database connections, querying and processing yourself.
If you could move to C# ( or say PowerShell ) you could leverage some higher level classes that exist in Windows Installer XML (WiX) Deployment Tools Foundation (DTF).
using Microsoft.Deployment.WindowsInstaller;
using Microsoft.Deployment.WindowsInstaller.Package;

static void FindAndExtractFiles(string packagePath, string longFileName)
{
    using (var installPackage = new InstallPackage(packagePath, DatabaseOpenMode.ReadOnly))
    {
        if (installPackage.FindFiles(longFileName).Count() > 0)
            installPackage.ExtractFiles();
    }
}
You could also write this as ComVisible(True) and call it from Python.

The MSI APIs are inherently clunky, so it's only a matter of where the abstraction lies. Bear in mind that if you just need this a couple times, it may be easier to browse the cab file(s) manually in Explorer. (Files are stored by file key instead of file name).

Python. Redirect stderr to log file

I have Django web-site working on tornado and nginx.
I took this tornado launcher script (tornading.py)
Then I'm using python openid that outputs some information to sys.stderr.
As a result I get IOError.
How can I redirect it using logging package?
I thought about
f = open("myfile.log", "w")
sys.stderr = f
or
python tornado.py > /dev/null 2>&1
But what is the best way to solve it?

The best way would be if the openid library didn't print to stderr, but used some kind of logging API instead (e.g. the logging module). I agree with thkala that modifying third-party code is not good in the long term, so you should fix it, and then provide the fix to the openid authors.
For the objective of advancing the open source community, that's the best way to solve it.

Using shell redirections is more of a work-around than a solution and it may not be always possible, depending on how the script is launched.
It has the distinct advantage, however, of you not having to modify third-party code. Local modifications - even minor ones - can become a major issue when you decide to e.g. update said code to its latest version from upstream.
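If you do want to route stderr through the logging package from inside the process, as the question asks, a minimal sketch could look like the following. StreamToLogger and redirect_stderr_to_log are hypothetical names, and the sketch assumes that no logging handler itself writes to sys.stderr, since that would loop back into the logger:

```python
import logging
import sys


class StreamToLogger(object):
    """Minimal file-like object that forwards writes to a logger."""

    def __init__(self, logger, level=logging.ERROR):
        self.logger = logger
        self.level = level

    def write(self, message):
        # Log each non-empty line separately; skip the bare newlines print emits.
        for line in message.splitlines():
            if line.strip():
                self.logger.log(self.level, line)
        return len(message)

    def flush(self):
        # Nothing is buffered here; the logging handlers flush themselves.
        pass


def redirect_stderr_to_log(path="myfile.log"):
    # Assumption: a file handler only, so no handler writes back to sys.stderr.
    logging.basicConfig(filename=path, level=logging.DEBUG)
    sys.stderr = StreamToLogger(logging.getLogger("stderr"))
```

Calling redirect_stderr_to_log() early in the launcher script would send anything a library writes to sys.stderr into the log file; output written directly to file descriptor 2 by C code is not affected and still needs the shell redirection.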

Python client library for WebDAV

I'd like to implement a piece of functionality in my application that uploads and manipulates files on a WebDAV server. I'm looking for a mature Python library that would give an interface similar to the os.* modules for working with the remote files. Googling has turned up a smattering of options for WebDAV in Python, but I'd like to know which are in wider use these days.

It's sad that for this question ("What Python webdav library to use?"), which surely interests more than one person, an unrelated answer was accepted ("don't use Python webdav library"). Well, that's a common problem on Stack Exchange.
For people who are looking for real answers, and given the requirements in the original question (a simple API similar to the "os" module), I suggest easywebdav, which has a very easy API and a nice, simple implementation, offering upload/download and a few file/dir management methods. Because of the simple implementation, it doesn't support directory listing so far, but a bug for that was filed, and the author intends to add it.

I just had a similar need and ended up testing a few Python WebDAV clients for my needs (uploading and downloading files from a WebDAV server). Here's a summary of my experience:
1) The one that worked for me is python-webdav-lib.
Not much documentation, but a quick look at the code (in particular the example) was enough to figure out how to make it work for me.
2) PyDAV 0.21 (the latest release I found) doesn't work with Python 2.6 because it uses strings as exceptions. I didn't try to fix this, expecting further incompatibilities later on.
3) davclient 0.2.0. I looked at it but didn't explore any further because the documentation didn't mention the level of API I was looking for (file upload and download).
4) Python_WebDAV_Library-0.3.0. Doesn't seem to have any upload functionality.

Apparently you're looking for a WebDAV client library.
I'm not sure how the gazillion hits came up; the following two look relevant:
PyDAV:
http://users.sfo.com/~jdavis/Software/PyDAV/readme.html#client
Zope - and look for client.py

import easywebdav
webdav = easywebdav.connect(
    host='dav.dumptruck.goldenfrog.com',
    username='_snip_',
    port=443,
    protocol="https",
    password='_snip_')
_file = "test.py"
print webdav.cd("/dav/")
# print webdav._get_url("")
# print webdav.ls()
# print webdav.exists("/dav/test.py")
# print webdav.exists("ECS.zip")
# print webdav.download(_file, "./"+_file)
print webdav.upload("./test.py", "test.py")

I have no experience with any of these libraries, but the Python Package Index ("PyPI") lists quite a few WebDAV modules.

Install:
$ sudo apt-get install libxml2-dev libxslt-dev python-dev
$ sudo apt-get install libcurl4-openssl-dev python-pycurl
$ sudo easy_install webdavclient
Examples:
import webdav.client as wc
options = {
    'webdav_hostname': "https://webdav.server.ru",
    'webdav_login': "login",
    'webdav_password': "password"
}
client = wc.Client(options)
client.check("dir1/file1")
client.info("dir1/file1")
files = client.list()
free_size = client.free()
client.mkdir("dir1/dir2")
client.clean("dir1/dir2")
client.copy(remote_path_from="dir1/file1", remote_path_to="dir2/file1")
client.move(remote_path_from="dir1/file1", remote_path_to="dir2/file1")
client.download_sync(remote_path="dir1/file1", local_path="~/Downloads/file1")
client.upload_sync(remote_path="dir1/file1", local_path="~/Documents/file1")
client.download_async(remote_path="dir1/file1", local_path="~/Downloads/file1", callback=callback)
client.upload_async(remote_path="dir1/file1", local_path="~/Documents/file1", callback=callback)
link = client.publish("dir1/file1")
client.unpublish("dir1/file1")
Links:
Source code here
Package here

I don't know of any specifically but, depending on your platform, it may be simpler to mount and access the WebDAV-served files through the file system. There's davfs2 out there and some OS's, like Mac OS X, have WebDAV file system support built in.
