How to clone a specific branch from Git using GitPython - python

I am trying to clone a repository using the GitPython library in a Python function. My code snippet is as follows:
from git import Repo
Repo.clone_from('http://user:password@github.com/user/project.git',
                '/home/antro/Project/')
This clones the master branch. How do I clone a different branch using GitPython, or is there another library that can clone individual branches? Please let me know.
I am aware of clone by mentioning branch in commandline using
git clone -b branch http://github.com/user/project.git

Just pass the branch name as a parameter, e.g.:
repo = Repo.clone_from(
    'http://user:password@github.com/user/project.git',
    '/home/antro/Project/',
    branch='master'
)
see here for more info

From toanant's answer.
This works for me with the --single-branch option
repo = Repo.clone_from(
    'http://user:password@github.com/user/project.git --single-branch',
    '/home/antro/Project/',
    branch='master'
)

GitPython uses a keyword args transformation under the hood:
# cmd.py
def transform_kwarg(self, name: str, value: Any, split_single_char_options: bool) -> List[str]:
    if len(name) == 1:
        if value is True:
            return ["-%s" % name]
        elif value not in (False, None):
            if split_single_char_options:
                return ["-%s" % name, "%s" % value]
            else:
                return ["-%s%s" % (name, value)]
    else:
        if value is True:
            return ["--%s" % dashify(name)]
        elif value is not False and value is not None:
            return ["--%s=%s" % (dashify(name), value)]
    return []
A resulting list of command parts is fed into subprocess.Popen, so you do not want to add --single-branch to the repo URL. If you do, a strange list will be passed to Popen. For example:
['-v', '--branch=my-branch', 'https://github.com/me/my-project.git --single-branch', '/tmp/clone/here']
However, armed with this new information, you can pass any git CLI flags you like just by using the kwargs. You may then ask yourself, "How do I pass in a dash to a keyword like single-branch?" That's a no-go in Python. You will see a dashify function in the above code which transforms any flag from, say, single_branch=True to single-branch, and then to --single-branch.
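To make the transformation concrete, here is a minimal standalone sketch of the same idea. It is not GitPython's own code: the helper name kwargs_to_cli_args is hypothetical, dashify is re-implemented here as a simple underscore-to-dash replacement, and the ordering of flags in the real library may differ.

```python
from typing import Any, List


def dashify(name: str) -> str:
    """Turn a Python identifier such as single_branch into single-branch."""
    return name.replace("_", "-")


def kwargs_to_cli_args(**kwargs: Any) -> List[str]:
    """Approximate how keyword arguments become git command-line flags.

    Hypothetical helper for illustration only, not part of GitPython.
    """
    args: List[str] = []
    for name, value in kwargs.items():
        if value is None or value is False:
            continue  # unset flags are dropped entirely
        prefix = "-" if len(name) == 1 else "--"
        flag = prefix + dashify(name)
        if value is True:
            args.append(flag)  # boolean switch, e.g. --single-branch
        elif len(name) == 1:
            args.extend([flag, str(value)])  # short option, e.g. -b wip-branch
        else:
            args.append(f"{flag}={value}")  # long option, e.g. --depth=1
    return args


print(kwargs_to_cli_args(single_branch=True, depth=1, branch="wip-branch"))
```

Each flag ends up as its own list element, which is why appending --single-branch to the URL string does not work: the URL and the flag would arrive as one argument.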
Full Example:
Here is a useful example for cloning a single, shallow branch using a personal access token from GitHub:
import base64
import os
from git import Repo
# Assumption: the token is read from an environment variable named GHE_TOKEN
GHE_TOKEN = os.environ["GHE_TOKEN"]
repo_url = "https://github.com/me/private-project.git"
branch = "wip-branch"
# Notice the trailing : below
credentials = base64.b64encode(f"{GHE_TOKEN}:".encode("latin-1")).decode("latin-1")
Repo.clone_from(
    url=repo_url,
    c=f"http.{repo_url}/.extraheader=AUTHORIZATION: basic {credentials}",
    single_branch=True,
    depth=1,
    to_path="/clone/to/here",
    branch=branch,
)
The command list sent to Popen then looks like this:
['git', 'clone', '-v', '-c', 'http.https://github.com/me/private-project.git/.extraheader=AUTHORIZATION: basic XTE...UwNTo=', '--single-branch', '--depth=1', '--bare', '--branch=wip-branch', 'https://github.com/me/private-project.git', '/clone/to/here']
(PSA: Please do not actually send your personal tokens as part of the URL before the @.)
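The header construction from the full example can be pulled into a small helper so the token never touches the URL. This is only a sketch: the function name extraheader_config is hypothetical, and the token "abc" is a dummy value used to show the encoding.

```python
import base64


def extraheader_config(repo_url: str, token: str) -> str:
    """Build the value passed to git's -c option to add a basic-auth header.

    Hypothetical helper for illustration; it mirrors the f-strings in the
    example above.
    """
    # The trailing colon means "token as user name, empty password".
    credentials = base64.b64encode(f"{token}:".encode("latin-1")).decode("latin-1")
    return f"http.{repo_url}/.extraheader=AUTHORIZATION: basic {credentials}"


# Dummy token, only to show the shape of the resulting setting.
print(extraheader_config("https://github.com/me/private-project.git", "abc"))
```

The returned string would be passed as the c= keyword argument of Repo.clone_from, as in the example above.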

For the --single-branch option, you can just pass a single_branch argument to the Repo.clone_from() method:
Repo.clone_from(repo, path, single_branch=True, b='branch')

Related

GitLab runner, error on push whereas same thing works on git bash or python from a console

Context
We're trying to set up a GitLab runner job that, on a certain tag, modifies a version header file and adds a release branch/tag to this changeset.
The GitLab runner server is on my machine, launched as a service by my user (that is properly registered to our GitLab server).
The GitLab runner job basically launches a python script that uses gitpython to do the job. There are just a few changes in the runner yml file (a before_script part added to get upload permission, taken from https://stackoverflow.com/a/55344804/11159476). Here is the full .gitlab-ci.yml file:
variables:
  GIT_SUBMODULE_STRATEGY: recursive
stages: [ build, publish, release ]
release_tag:
  stage: build
  before_script:
    - git config --global user.name ${GITLAB_USER_NAME}
    - git config --global user.email ${GITLAB_USER_EMAIL}
  script:
    - python .\scripts\release_gitlab_runner.py
  only:
    # Trigger on specific regex...
    - /^Src_V[0-9]+\.[0-9]+\.[0-9]+$/
  except:
    # .. only for tags then except branches, see doc (https://docs.gitlab.com/ee/ci/yaml/#regular-expressions): "Only the tag or branch name can be matched by a regular expression."
    - branches
I also added a trick to the URL used when pushing from Python: push with user:personal_access_token@repo_URL instead of the default runner URL (taken from the same answer as above). The token was generated from the company GitLab under user "Settings" => "Access Tokens" => "Add a personal access token", with all rights and no expiry. Below is not the actual scripts\release_gitlab_runner.py script but a simplified one that keeps the git flow as standard as possible for what we want (fetch all, create a local branch with a random name so that it does not already exist, modify a file, stage, commit and finally push):
# -*-coding:utf-8 -*
import uuid
import git
import re
import sys
import os
# Since we are in <git root path>/scripts folder, git root path is this file's path parent path
GIT_ROOT_PATH = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
try:
    # Get user login and URL from GITLAB_USER_LOGIN and CI_REPOSITORY_URL gitlab environment variables
    gitlabUserLogin = os.environ["GITLAB_USER_LOGIN"]
    gitlabFullURL = os.environ["CI_REPOSITORY_URL"]
    # Push at "https://${GITLAB_USER_NAME}:${PERSONAL_ACCESS_TOKEN}@gitlab.companyname.net/demo/demo.git")
    # generatedPersonalAccessToken has been generated with full rights from https://gitlab.companyname.net/profile/personal_access_tokens and set in a variable not seen here
    gitlabPushURL = "https://{}:{}@{}".format(gitlabUserLogin, generatedPersonalAccessToken, gitlabFullURL.split("@")[-1])
    print("gitlabFullURL is [{}]".format(gitlabFullURL))
    print("gitlabPushURL is [{}]".format(gitlabPushURL))
    branchName = str(uuid.uuid1())
    print("Build git.Repo object with [{}] root path".format(GIT_ROOT_PATH))
    repo = git.Repo(GIT_ROOT_PATH)
    print("Fetch all")
    repo.git.fetch("-a")
    print("Create new local branch [{}]".format(branchName))
    repo.git.checkout("-b", branchName)
    print("Modify file")
    versionFile = os.path.join(GIT_ROOT_PATH, "public", "include", "Version.h")
    patchedVersionFileContent = ""
    with open(versionFile, 'r') as versionFileContent:
        patchedVersionFileContent = versionFileContent.read()
    patchedVersionFileContent = re.sub("#define VERSION_MAJOR 0", "#define VERSION_MAJOR {}".format(75145), patchedVersionFileContent)
    with open(versionFile, 'w') as versionFileContent:
        versionFileContent.write(patchedVersionFileContent)
    print("Stage file")
    repo.git.add("-u")
    print("Commit file")
    repo.git.commit("-m", "New version file in new branch {}".format(branchName))
    print("Push new branch [{}] remotely".format(branchName))
    # The error is at below line:
    repo.git.push(gitlabPushURL, "origin", branchName)
    sys.exit(0)
except Exception as e:
    print("Exception: {}".format(e))
    sys.exit(-1)
Problem
Even with the trick to get push rights, the following error is raised when we try to push from the GitLab runner:
Cmd('git') failed due to: exit code(1)
cmdline: git push https://user:token@gitlab.companyname.net/demo/repo.git origin 85a3fa6e-690a-11ea-a07d-e454e8696d31
stderr: 'error: src refspec origin does not match any
error: failed to push some refs to 'https://user:token@gitlab.companyname.net/demo/repo.git''
What works
If I open a Git Bash, I successfully run manual commands:
git fetch -a
git checkout -b newBranch
vim public/include/Version.h
=> At this point file has been modified
git add -u
git commit -m "New version file in new branch"
git push origin newBranch
Here if we fetch all from elsewhere we can see newBranch with version file modifications
And same if we run script content (without URL modification) from a python command line (assuming all imports as in script have been performed):
GIT_ROOT_PATH = "E:\\path\\to\\workspace\\repo"
branchName = str(uuid.uuid1())
repo = git.Repo(GIT_ROOT_PATH)
repo.git.fetch("-a")
repo.git.checkout("-b", branchName)
versionFile = os.path.join(GIT_ROOT_PATH, "public", "include" , "Version.h")
patchedVersionFileContent = ""
with open(versionFile, 'r') as versionFileContent:
    patchedVersionFileContent = versionFileContent.read()
patchedVersionFileContent = re.sub("#define VERSION_MAJOR 0", "#define VERSION_MAJOR {}".format(75145), patchedVersionFileContent)
with open(versionFile, 'w') as versionFileContent:
    versionFileContent.write(patchedVersionFileContent)
repo.git.add("-u")
repo.git.commit("-m", "New version file in new branch {}".format(branchName))
repo.git.push("origin", branchName)
Conclusion
I can't find what I do wrong when running from GitLab runner, is there something I'm missing ?
The only thing that I can see different when running from GitLab runner is that after fetch I can see I'm on a detached head (listing repo.git.branch('-a').split('\n') gives for example ['* (HEAD detached at 560976b)', 'branchName', 'remotes/origin/otherExistingBranch', ...]), but this should not be a problem since I create a new branch where to push, right ?
Git is telling you that the refspec is wrong: in git push <repository> <refspec>, the URL takes the place of the repository, so origin is read as a refspec. To push to another remote, first create it with gitlab = repo.create_remote("gitlab", gitlabPushURL), then push to it with repo.git.push("gitlab", branchName).
Edit from @gluttony to avoid failing with "remote already exists" on the next git run:
remote_name = "gitlab"
if remote_name not in repo.remotes:
    repo.create_remote(remote_name, gitlabPushURL)
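The "src refspec origin does not match any" error follows from how git push reads its positional arguments: git push [<repository> [<refspec>...]]. The sketch below models only that positional rule (options are ignored). The helper name split_push_args and the URL are placeholders, not part of git or GitPython.

```python
from typing import List, Optional, Tuple


def split_push_args(args: List[str]) -> Tuple[Optional[str], List[str]]:
    """Model `git push [<repository> [<refspec>...]]` for positional arguments.

    Hypothetical helper: the first argument is the repository (a remote name
    or a URL), everything after it is a refspec.
    """
    if not args:
        return None, []
    return args[0], args[1:]


url = "https://gitlab.example.invalid/demo/repo.git"  # placeholder URL

# What the failing script passed: the URL fills the <repository> slot,
# so "origin" is read as a refspec, i.e. a branch name to push.
print(split_push_args([url, "origin", "my-branch"]))

# What was intended: one remote (a name or a URL), then the branch.
print(split_push_args(["gitlab", "my-branch"]))
```

Under this reading, passing either the URL or a remote name followed by the branch is consistent with the synopsis; passing both is not.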

Pushing local branch to remote branch

I created a new repository in my GitHub account.
Using the gitpython library I'm able to clone this repository. Then I create a new branch, add a new file, commit and try to push to the new branch.
Please check my code below:
import git
import random
import os
repo_name = 'test'
branch_name = 'feature4'
remote_repo_addr_git = 'git@repo:DevOps/z_sandbox1.git'
no = random.randint(0,1000)
repo = git.Repo.clone_from(remote_repo_addr_git, repo_name)
new_branch = repo.create_head(branch_name)
repo.head.set_reference(new_branch)
os.chdir(repo_name)
open("parasol" + str(no), "w+").write(str(no)) # this is added
print(repo.active_branch)
repo.git.add(A=True)
repo.git.commit(m='okej')
repo.git.push(u='origin feature4')
Everything works fine until the last push call, where I get this error:
stderr: 'fatal: 'origin feature4' does not appear to be a git repository
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.'
I'm able to run this command from the command line and it works fine:
git push -u origin feature4
But it doesn't work in Python.
This worked for me:
repo.git.push("origin", "feature4")
Useful documentation for fetch/pull/push operations with gitpython:
https://gitpython.readthedocs.io/en/stable/reference.html?highlight=index.fetch#git.remote.Remote.fetch
from git import GitCommandError, Repo
repo_name = 'test'
branch_name = 'feature4'
remote_repo_addr_git = 'git@repo:DevOps/z_sandbox1.git'
# clone repo
repo = Repo.clone_from(remote_repo_addr_git, repo_name)
# a refspec maps a source ref to a destination ref (<src>:<dst>)
refspec = f'refs/heads/{branch_name}:refs/heads/{branch_name}'
# get branch
try:
    # if exists pull the branch
    # the refspec here means: grab the {branch_name} branch head
    # from the remote repo and store it as my {branch_name} branch head
    repo.remotes.origin.pull(refspec)
except GitCommandError:
    # if not exists create it
    repo.create_head(branch_name)
# checkout branch
branch = repo.heads[branch_name]
branch.checkout()
# modify files
with open(f'{repo_name}/hello.txt', 'w') as file:
    file.write('hello')
# stage & commit & push
repo.index.add('**')
repo.index.commit('added good manners')
# refspec here means: publish my {branch_name} branch head
# as {branch_name} remote branch
repo.remotes.origin.push(refspec)
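A refspec has the general shape [+]<src>[:<dst>]. For a push, <src> is the local ref and <dst> the remote one; for a fetch or pull it is the other way round. The following sketch splits a refspec into those parts. The Refspec type and parse_refspec function are hypothetical names, and treating a missing <dst> as equal to <src> is a simplification that matches the common push case only.

```python
from typing import NamedTuple


class Refspec(NamedTuple):
    force: bool  # True when the refspec starts with "+"
    src: str     # source ref (local side for push, remote side for fetch)
    dst: str     # destination ref


def parse_refspec(spec: str) -> Refspec:
    """Split a refspec of the form [+]<src>[:<dst>] into its parts."""
    force = spec.startswith("+")
    if force:
        spec = spec[1:]
    src, sep, dst = spec.partition(":")
    # Simplification: without a colon, reuse the source name as destination.
    return Refspec(force, src, dst if sep else src)


branch_name = "feature4"
print(parse_refspec(f"refs/heads/{branch_name}:refs/heads/{branch_name}"))
print(parse_refspec("feature4"))
```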

Use bare repo with git-python

When I'm trying to add files to bare repo:
import git
r = git.Repo("./bare-repo")
r.working_dir("/tmp/f")
print(r.bare) # True
r.index.add(["/tmp/f/foo"]) # Exception, can't use bare repo <...>
I only understood that I can add files only by Repo.index.add.
Is using bare repo with git-python module even possible? Or I need to use subprocess.call with git --work-tree=... --git-dir=... add ?
You can not add files into bare repositories. They are for sharing, not for working. You should clone bare repository to work with it. There is a nice post about it: www.saintsjd.com/2011/01/what-is-a-bare-git-repository/
UPDATE (16.06.2016)
Code sample as requested:
import git
import os, shutil
test_folder = "temp_folder"
# This is your bare repository
bare_repo_folder = os.path.join(test_folder, "bare-repo")
repo = git.Repo.init(bare_repo_folder, bare=True)
assert repo.bare
del repo
# This is non-bare repository where you can make your commits
non_bare_repo_folder = os.path.join(test_folder, "non-bare-repo")
# Clone bare repo into non-bare
cloned_repo = git.Repo.clone_from(bare_repo_folder, non_bare_repo_folder)
assert not cloned_repo.bare
# Make changes (e.g. create .gitignore file)
tmp_file = os.path.join(non_bare_repo_folder, ".gitignore")
with open(tmp_file, 'w') as f:
    f.write("*.pyc")
# Run git regular operations (I use cmd commands, but you could use wrappers from git module)
cmd = cloned_repo.git
cmd.add(all=True)
cmd.commit(m=".gitignore was added")
# Push changes to bare repo
cmd.push("origin", "master", u=True)
del cloned_repo # Close Repo object and cmd associated with it
# Remove non-bare cloned repo
shutil.rmtree(non_bare_repo_folder)

How to get the current checked out Git branch name through pygit2?

This question should be related to:
How to get the current branch name in Git?
Get git current branch/tag name
How to get the name of the current git branch into a variable in a shell script?
How to programmatically determine the current checked out Git branch
But I am wondering how to do that through pygit2?
To get the conventional "shorthand" name:
from pygit2 import Repository
Repository('.').head.shorthand # 'master'
In case you don't want to or can't use pygit2
May need to alter path - this assumes you are in the parent directory of .git
from pathlib import Path
def get_active_branch_name():
    head_dir = Path(".") / ".git" / "HEAD"
    with head_dir.open("r") as f:
        content = f.read().splitlines()
    for line in content:
        if line[0:4] == "ref:":
            return line.partition("refs/heads/")[2]
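When HEAD is detached, the HEAD file holds a commit hash instead of a ref: line, so the function above falls through and returns None implicitly. A variant that makes this explicit might look like the sketch below. The function name read_branch_from_head is hypothetical, and it assumes .git is a regular directory (not a worktree or submodule, where .git is a file). The demo writes a fake HEAD file into a temporary directory rather than touching a real repository.

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_branch_from_head(git_dir: Path) -> Optional[str]:
    """Return the checked-out branch name, or None when HEAD is detached."""
    content = (git_dir / "HEAD").read_text().strip()
    prefix = "ref: refs/heads/"
    if content.startswith(prefix):
        return content[len(prefix):]
    return None  # HEAD holds a commit hash rather than a branch reference


# Demo against a fake HEAD file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    fake_git_dir = Path(tmp)
    (fake_git_dir / "HEAD").write_text("ref: refs/heads/feature/login\n")
    print(read_branch_from_head(fake_git_dir))
```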
From the PyGit documentation, either of these should work:
#!/usr/bin/python
from pygit2 import Repository
repo = Repository('/path/to/your/git/repo')
# option 1
head = repo.head
print("Head is " + head.name)
# option 2
head = repo.lookup_reference('HEAD').resolve()
print("Head is " + head.name)
You'll get the full name, including the refs/heads/ prefix. If you don't want that, strip it out or use shorthand instead of name.
./pygit_test.py
Head is refs/heads/master
Head is refs/heads/master
You can use GitPython:
from git import Repo
local_repo = Repo(path=settings.BASE_DIR)
local_branch = local_repo.active_branch.name

How to checkout a tag with GitPython

In a python script, I try to checkout a tag after cloning a git repository.
I use GitPython 0.3.2.
#!/usr/bin/env python
import git
g = git.Git()
g.clone("user@host:repos")
g = git.Git(repos)
g.execute(["git", "checkout", "tag_name"])
With this code I have an error:
g.execute(["git", "checkout", "tag_name"])
File "/usr/lib/python2.6/site-packages/git/cmd.py", line 377, in execute
raise GitCommandError(command, status, stderr_value)
GitCommandError: 'git checkout tag_name' returned exit status 1: error: pathspec 'tag_name' did not match any file(s) known to git.
If I replace the tag name with a branch name, I have no problem.
I didn't find any information in the GitPython documentation.
And if I try to checkout the same tag in a shell, I have no problem.
Do you know how I can checkout a git tag in python?
Assuming you cloned the repository in 'path/to/repo', just try this:
from git import Git
g = Git('path/to/repo')
g.checkout('tag_name')
from git import Git
g = Git(repo_path)
g.init()
g.checkout(version_tag)
Like cmd.py Class Git comments say
"""
The Git class manages communication with the Git binary.
It provides a convenient interface to calling the Git binary, such as in::
g = Git( git_dir )
g.init() # calls 'git init' program
rval = g.ls_files() # calls 'git ls-files' program
``Debugging``
Set the GIT_PYTHON_TRACE environment variable print each invocation
of the command to stdout.
Set its value to 'full' to see details about the returned values.
"""
git.Repo().git.checkout('tag')
This worked for me, and I think it's closer to the intended API usage:
from git import Repo
repo = Repo.clone_from("https://url_here", "local_path")
repo.git.checkout('tag-name')  # repo.heads only lists branches, so a tag is checked out through repo.git
