I'm searching for a library that can extract (at least) the following information from an SVN repository (not a working copy!):
Revision numbers and their author & commit message
Changes in each revision (added, deleted, modified files)
Is there a Python library that can do this?
For the authors and commit messages, I could parse "db/revprops/0/..." (simple format), but looking for changed files does not seem so easy, so I'd rather stick with a library that supports SVN repos.
There are Python bindings to libsvn: http://pysvn.tigris.org/docs/pysvn.html. They facilitate doing pretty much everything the svn command line client can do.
In particular, the Client.log() method does what you are looking for.
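A minimal sketch of what that could look like, assuming pysvn is installed and the repository is reachable through a URL such as file:///path/to/repo (a placeholder). The helper names fetch_log and summarize are made up for this example; Client.log() and its discover_changed_paths flag come from pysvn. I'm also assuming the returned log entries behave like dicts.

```python
def fetch_log(repo_url):
    """Ask pysvn for every revision of repo_url, including changed paths."""
    import pysvn  # imported lazily so summarize() works without pysvn

    client = pysvn.Client()
    return client.log(repo_url, discover_changed_paths=True)


def summarize(entries):
    """Reduce log entries to plain dicts: revision, author, message, changes."""
    result = []
    for entry in entries:
        # Each changed path carries an action letter (e.g. 'A', 'D', 'M').
        changes = [(cp["action"], cp["path"]) for cp in entry["changed_paths"]]
        result.append({
            "revision": entry["revision"].number,
            # Assumption: author/message can be missing on some revisions.
            "author": entry.get("author"),
            "message": entry.get("message"),
            "changes": changes,
        })
    return result
```

Usage would be something like summarize(fetch_log('file:///path/to/repo')), which covers both the author/message part and the added/deleted/modified part of the question.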
I think you want something like py-svn.
For this Git issue, I saw that the Git repo updated a file for TensorFlow. Now I want to check whether the changes can be found in my installation.
I am using conda and installed the specific TensorFlow version in an environment. The file should be here: tensorflow/lite/interpreter.h
However, going down the site-packages route ~/anaconda3/envs/AI2.6/lib/python3.6/site-packages/tensorflow/lite/, I cannot find the file.
find | grep interpreter in this folder tree gives me
./python/interpreter.py
./python/interpreter_wrapper
./python/interpreter_wrapper/__init__.py
./python/interpreter_wrapper/__pycache__
./python/interpreter_wrapper/__pycache__/__init__.cpython-36.pyc
./python/interpreter_wrapper/_pywrap_tensorflow_interpreter_wrapper.so
./python/__pycache__/interpreter.cpython-36.pyc
Could you give me a hint where to find the file, or how to check if a specific commit made it into the stable version of TensorFlow?
Thanks
Edit: While typing this, I got the answer that the change is in the nightly version. However, it would still be interesting to learn how to find out whether a commit made it into a stable release, and why I cannot find the file that should be there.
From the git side, the answer to the question is easy, provided:
that you know the commit's hash ID; and
that the question you want answered is "is this specific commit in a repository?"
The reason for this is that Git commit hash IDs are universally unique. If some repository has some commit, it has that hash ID, in that repository and in every other repository. So you just inspect the repository to see if it has that commit, with that hash ID, and you're done.
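As a rough sketch of that inspection, assuming you have a clone of the repository and git on your PATH: git cat-file -e tests whether the object exists, and git tag --contains lists the tags whose history includes it. The function names are made up here, and the "-rc" filter is only a guess at how pre-release tags are named.

```python
import subprocess


def has_commit(commit, repo_dir="."):
    """True if the clone in repo_dir has a commit with exactly this hash ID."""
    result = subprocess.run(
        ["git", "cat-file", "-e", commit + "^{commit}"],
        cwd=repo_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def tags_containing(commit, repo_dir="."):
    """Names of all tags whose history includes the commit."""
    result = subprocess.run(
        ["git", "tag", "--contains", commit],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


def stable_tags(tags):
    """Assumption: pre-release tags contain '-rc', e.g. 'v2.6.0-rc1'."""
    return [tag for tag in tags if "-rc" not in tag]
```

If stable_tags(tags_containing(commit)) comes back non-empty, that exact commit is part of at least one tagged release under the naming assumption above.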
In practice (since you've scattered this across a wide range of tags; I plucked off the linux one since we're not talking about Linux programming APIs here) this answer isn't useful, not even in the Git arena, because commits get copied and modified, and the new-and-improved (or older and worsened, or whatever) version of some commit will have a different hash ID. You often care whether you have some version of some commit, rather than some specific commit.
For this other purpose ("do I have some version of this commit?"), you can sometimes use what Git calls a patch ID. To find the patch ID of some commit, run the commit through the git patch-id program (read its documentation for details). Then, run potentially matching commits through git patch-id as well. If they produce the same patch ID, they are equivalent commits, even if they are technically different and therefore have different hash IDs.
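Here is one way to script the patch ID comparison, as a sketch that assumes git is on your PATH and you run it against a local clone. It pipes the output of git show into git patch-id --stable; the function names are mine.

```python
import subprocess


def parse_patch_id(output):
    """git patch-id prints '<patch-id> <commit-id>'; return the patch ID.

    Returns None for empty output (e.g. a commit with no plain diff).
    """
    fields = output.split()
    return fields[0] if fields else None


def patch_id(commit, repo_dir="."):
    """Compute the patch ID of one commit in the clone at repo_dir."""
    diff = subprocess.run(
        ["git", "show", commit],
        cwd=repo_dir, capture_output=True, check=True,
    ).stdout
    result = subprocess.run(
        ["git", "patch-id", "--stable"],
        cwd=repo_dir, input=diff, capture_output=True, check=True,
    )
    return parse_patch_id(result.stdout.decode())
```

Two commits then count as "the same change" when patch_id(a) == patch_id(b) and neither is None.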
A more general, more useful, and more portable way to find out if you have some particular feature requires effort on the part of the maintainers: changelogs, feature tests, and documentation. If something brings new behavior, or new files, or whatever, it should be documented, and in some cases you might want to have, in your programming language, a way to test for the existence of this feature. In Python in particular, the core documentation has, for instance, things like this:
subprocess.run(args, *, stdin=None, ...
...
New in version 3.5.
Changed in version 3.6: Added encoding and errors parameters
...
You can also use Python constructs like:
try:
    import what.ever
except ImportError:
    ... do whatever you need here ...
and similar tricks, and import sys and inspect sys.version and so on.
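Spelled out, those tricks might look like the following sketch. The helper names has_module and has_attr are made up; the subprocess.run check just mirrors the "New in version 3.5" note quoted above.

```python
import importlib
import sys


def has_module(name):
    """Return True if the named module can be imported in this environment."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def has_attr(module_name, attr):
    """Return True if the module imports and exposes the given attribute."""
    if not has_module(module_name):
        return False
    return hasattr(sys.modules[module_name], attr)


# Version gate plus feature test: subprocess.run is documented as new in 3.5.
RUN_AVAILABLE = sys.version_info >= (3, 5) and has_attr("subprocess", "run")
```

Testing for the feature itself (has_attr) is usually more robust than comparing version numbers, since backports and patched builds do not always follow the version scheme.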
The file should be here: tensorflow/lite/interpreter.h
How you test for the existence of a file at a path depends on the OS, but when using GitHub, you can construct the URL from the file's name, given the systematic scheme that the GitHub folks use. For instance, https://github.com/git/git/blob/seen/Makefile is the URL to view the version of Makefile at the tip commit of branch seen in the Git repository mirror for Git itself on GitHub.
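The URL scheme from that example is easy to wrap in a helper. This is only a sketch: github_blob_url is a made-up name, and the pattern is inferred from the Makefile URL above rather than from any official specification.

```python
from urllib.parse import quote


def github_blob_url(owner, repo, ref, path):
    """Build the GitHub page URL for a file at a branch, tag, or commit."""
    return "https://github.com/%s/%s/blob/%s/%s" % (
        owner,
        repo,
        quote(ref, safe="/"),   # ref may be a branch, a tag, or a hash ID
        quote(path, safe="/"),  # keep the directory separators readable
    )
```

For the file in the question you would pass the path tensorflow/lite/interpreter.h together with whatever tag or branch you want to inspect; which ref corresponds to your installed version is something you have to look up yourself.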
I work on a Python web application with 20+ dependency packages. I'd love to somehow get feeds of updates for all of these packages, so that I could look at their changelogs and update quickly if the package fixes important bugs or potential security vulnerabilities. Is there a way for me to do this without hunting down the project homepage RSS feed (if one exists) for all 20 packages individually?
Ideally I'd like to be able to slurp our requirements.txt file and construct a feed automatically, but I'd settle for just manually curating a list of RSS feeds or manually subscribing some email address to a bunch of email lists.
Maybe https://requires.io/ is interesting to you?
It shows the status of all your dependencies and alerts you for security/outdated ones as well.
See here for example https://requires.io/github/edx/edx-platform/requirements/?branch=master
GitHub seems to have launched this feature (for other languages too!) since I posted this. See here.
If the Python package is on PyPI, you can use libraries.io to select packages you want release emails for. This answer to a related question has additional details.
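For the "slurp requirements.txt" idea from the question, here is a small sketch that turns requirement lines into per-project release feed URLs on PyPI. It is deliberately simplistic: it skips option lines (such as -r or -e) and anything that contains a URL, and the function names are invented for this example. It relies on PyPI's per-project release feeds at /rss/project/NAME/releases.xml.

```python
import re

# A requirement line starts with the project name.
_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_names(text):
    """Extract project names from the contents of a requirements file."""
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        # Simplification: ignore pip options and URL-based requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = _NAME_RE.match(line)
        if match:
            names.append(match.group(0))
    return names


def release_feed_url(name):
    """RSS feed of new releases for one project on PyPI."""
    normalized = re.sub(r"[-_.]+", "-", name).lower()
    return "https://pypi.org/rss/project/%s/releases.xml" % normalized
```

Feeding the resulting URLs into any feed reader gives one combined stream of releases; it does not tell you which releases are security fixes, so you would still need to read the changelogs.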
I want to write a module in Python (this is a learning project) to enhance my Git experience. Is there a Python module for various git commands? At least the basic ones (commit/diff/log/add)?
I saw GitPython, but I couldn't find support for (new) commits; it's more of a repo browsing framework than a complete Git interface. (Or did I miss something?)
Also, if there is a Python module for all this, is using it preferable to executing the shell commands from Python code?
In GitPython you create a commit from an index object.
In libgit2 you create a commit from a repository object.
You might also want to look at this question:
Python Git Module experiences?
I think some python source could help beginners like me not to waste precious time on digging docs.
All commits will go to the default branch (typically master) of the freshly created repository.
Here it is:
from git import Repo
import os

path = '/your/path/here'
os.makedirs(path, exist_ok=True)

# Create the repository (or reopen it if it already exists)
repo = Repo.init(path)

for x in range(1, 10):
    fname = 'filename' + str(x)
    # Create an empty file inside the repository and stage it
    with open(os.path.join(path, fname), 'wb'):
        pass
    repo.index.add([fname])

# Commit everything that was staged above
repo.index.commit("initial commit")
Git is designed to consist of "plumbing" and "porcelain". Plumbing components form the foundation, low-level system: Managing objects, repositories, remotes, and so on. Porcelain, on the other hand, means more user-friendly high-level tools that use the plumbing.
Historically, only the most basic/performance-critical parts (mostly plumbing) were implemented in C, the rest used shell/perl scripts. To be more portable, more and more code was rewritten in C.
With this background, I would recommend simply calling the git executable from your Python wrapper. Consider your code as part of Git's porcelain. Compared to using a specialized library:
PRO
No need to learn an API -- use the git commands you are familiar with
Complete set of tools -- you can use porcelain and are not restricted to low-level functionality
CONTRA
Need to parse command line output from git calls.
Might be slower
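To give an idea of the parsing cost, here is a sketch of calling the git executable with subprocess and parsing git log. It assumes git is on your PATH; the function names and the tab-separated format string are my own choices, not anything Git prescribes.

```python
import subprocess

# One commit per line: hash, author name, subject, separated by tabs.
LOG_FORMAT = "--pretty=format:%H%x09%an%x09%s"


def git(args, repo_dir="."):
    """Run a git command in repo_dir and return its standard output."""
    result = subprocess.run(
        ["git"] + list(args),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_log(output):
    """Turn LOG_FORMAT output into a list of dicts."""
    commits = []
    for line in output.splitlines():
        if not line:
            continue
        # Split at most twice so tabs inside the subject survive.
        commit_hash, author, subject = line.split("\t", 2)
        commits.append(
            {"hash": commit_hash, "author": author, "subject": subject}
        )
    return commits


def log(repo_dir=".", limit=10):
    """The most recent commits of the repository at repo_dir."""
    return parse_log(git(["log", "-n", str(limit), LOG_FORMAT], repo_dir))
```

Choosing an explicit --pretty format keeps the parser independent of the user's Git configuration, which is the main thing to watch out for when scraping porcelain output.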
This can be done with GitPython
Install it with:
pip install GitPython
And use it like this:
from git.repo import Repo
repo = Repo('/path/to/repository')
repo.index.add(['some_file'])
repo.index.commit('commit from python')
origin = repo.remotes[0]
origin.push()
Learn more in the documentation.
I'm looking for a Python library that can do basic manipulation of repositories, but is independent of the backend version control system.
By basic manipulation, I'm referring to: initialize a repo, add files, commit, pull, push, get current revision number.
Users of the library could do something like this:
import dvcs_wrapper as dvcs
dvcs.set_backend('hg') # could choose 'git', 'bzr'
repo = dvcs.init('/home/me/my_repo')
repo.add('/home/me/my_repo/*.py')
repo.commit('Initial commit')
repo.push('http://bitbucket.org/....')
print('At revision %d' % repo.revision_num)
Any pointers to something like the above? My Google searches turn up nothing...
Update: for what it's worth, I've started working on something like this: code is here, with unit tests for Hg repositories. I might get around to Git and Bazaar; contributions welcome.
There's also the VCS module, which advertises:
vcs is abstraction layer over various version control systems. It is
designed as feature-rich Python library with clean API.
I think you are out of luck.
There are Python wrappers for git but according to this the quality is still less than optimal. Hg and bzr are Python projects but their infrastructure is quite different, so API level integration is not easy. Also different SCMs have different design philosophies, which makes a unified wrapper less plausible.
That being said, if you do need a simple wrapper, you can use the subprocess module and wrap the command lines to get the result you want.
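A bare-bones version of such a wrapper could look like this. It is a sketch only: the COMMANDS table and the Repo class are invented here, it covers just init, add, and commit, and it assumes the git, hg, and bzr executables are installed for whichever backend you pick.

```python
import subprocess

# Command-line spellings of a few basic operations per backend.
COMMANDS = {
    "git": {"init": ["git", "init"], "add": ["git", "add"],
            "commit": ["git", "commit", "-m"]},
    "hg": {"init": ["hg", "init"], "add": ["hg", "add"],
           "commit": ["hg", "commit", "-m"]},
    "bzr": {"init": ["bzr", "init"], "add": ["bzr", "add"],
            "commit": ["bzr", "commit", "-m"]},
}


class Repo:
    """Thin, backend-agnostic front end over the VCS command lines."""

    def __init__(self, backend, path):
        if backend not in COMMANDS:
            raise ValueError("unsupported backend: %r" % backend)
        self.backend = backend
        self.path = path  # must be an existing directory

    def command(self, operation, *args):
        """Build the argument list for one operation without running it."""
        return COMMANDS[self.backend][operation] + list(args)

    def run(self, operation, *args):
        """Run the operation inside the repository directory."""
        result = subprocess.run(
            self.command(operation, *args),
            cwd=self.path, capture_output=True, text=True, check=True,
        )
        return result.stdout
```

For example, Repo('hg', '/home/me/my_repo').run('commit', 'Initial commit'). The three operations shown happen to line up across backends; things like push, pull, and "current revision number" differ more, which is where the design differences mentioned above start to hurt.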
I need to manipulate a subversion client from python. I need to:
check the most recent revision to change something under a given path.
update a client to a given (head or non head) revision
get logs for a given path (revisions that changed it and when).
A quick search didn't turn up what I'm looking for and I'd rather not have to write my own wrapper around the svn command line tool. (BTW: running under Linux and python 2.6)
Check out the pysvn library. Or skim the pysvn Programmer's Guide to see if it meets most of your use cases.
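To map the three requirements onto pysvn, here is a rough sketch. It assumes a pysvn.Client() instance is passed in as client, that log entries behave like dicts with "revision" and "date" keys (date being seconds since the epoch), and that log() returns the newest entry first. The function names are mine, and the code is written for current Python 3 rather than 2.6.

```python
from datetime import datetime, timezone


def last_changed_revision(client, path):
    """Newest revision that changed something under path, or None."""
    entries = client.log(path, limit=1)
    return entries[0]["revision"].number if entries else None


def history(client, path):
    """(revision number, UTC datetime) for each revision that changed path."""
    return [
        (e["revision"].number,
         datetime.fromtimestamp(e["date"], tz=timezone.utc))
        for e in client.log(path)
    ]


def update_to(client, path, number=None):
    """Update a working copy to HEAD, or to the given revision number."""
    import pysvn  # lazy import: only needed when actually talking to svn

    if number is None:
        revision = pysvn.Revision(pysvn.opt_revision_kind.head)
    else:
        revision = pysvn.Revision(pysvn.opt_revision_kind.number, number)
    return client.update(path, revision=revision)
```

Passing the client in as a parameter keeps the helpers easy to test with a stand-in object, and lets you configure authentication callbacks on the real client in one place.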