How to get commit author name and email with gitpython? - python

When I run git log, I get this line for each commit: "Author: name < email >". How do I get the exact same format for a commit in Python for a local repo? When I run the code below, I get just the author name.
from git import Repo
repo_path = 'mockito'
repo = Repo(repo_path)
commits_list = list(repo.iter_commits())
for i in range(5):
    commit = commits_list[i]
    print(commit.hexsha)
    print(commit.author)
    print(commit.committer)

It seems that gitpython's Commit objects do not have a top-level attribute for the author email, and printing commit.author shows only the name.
You can also use gitpython to call git commands directly. You can use the git show command, passing in the commit HASH (from commit.hexsha) and then a --format option that gives you just the author's name and email (you can of course pass other format options you need).
Using plain git:
$ git show -s --format='%an <%ae>' 4e13ccfbde2872c23aec4f105f334c3ae0cb4bf8
me <me@somewhere.com>
Using gitpython to use git directly:
from git import Repo
repo_path = 'myrepo'
repo = Repo(repo_path)
commits_list = list(repo.iter_commits())
for i in range(5):
    commit = commits_list[i]
    author = repo.git.show("-s", "--format=Author: %an <%ae>", commit.hexsha)
    print(author)

According to the gitpython API documentation, a commit object—an instance of class git.objects.commit.Commit—has author and committer attributes that are instances of class git.util.Actor, which in turn has fields conf_email, conf_name, email, and name.
Hence (untested):
print(commit.author.name, commit.author.email)
will likely get you the two fields you want, though you may wish to format them in some way.
Edit: I'll defer to Gino Mempin's answer since I don't have gitpython installed to test this.
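As a minimal sketch of that formatting step, assuming (per the API documentation cited above) that commit.author exposes name and email fields: the helper below builds the same "Author: name <email>" line that git log prints. The name format_author is hypothetical, and a SimpleNamespace stands in for the Actor object so the snippet runs without a repository.

```python
from types import SimpleNamespace


def format_author(actor):
    """Return 'Author: name <email>' for any object with .name and .email."""
    return f"Author: {actor.name} <{actor.email}>"


# Stand-in for commit.author (hypothetical values, no repository needed).
fake_author = SimpleNamespace(name="me", email="me@somewhere.com")
print(format_author(fake_author))  # Author: me <me@somewhere.com>
```

With a real repository, the call inside the loop would presumably be format_author(commit.author).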

Related

Creating a repository and committing a file with PyGithub

I've seen the topic of committing with PyGithub in many other questions here, but none of them helped me. I didn't understand the solutions; I guess I'm too much of a newbie.
I simply want to commit a file from my computer to a test GitHub repository that I created. So far I'm testing with a Google Colab notebook.
This is my code, questions and problems are in the comments:
from github import Github
user = '***'
password = '***'
g = Github(user, password)
user = g.get_user()
# created a test repository
repo = user.create_repo('test')
# problem here: it asks for a 'sha' argument, what is this?
tree = repo.get_git_tree(???)
file = 'content/echo.py'
# since I didn't get the tree, this also goes wrong
repo.create_git_commit('test', tree, file)
The sha is a 40-character hexadecimal checksum that uniquely identifies the commit you want to fetch (a sha identifies every other Git object as well).
From the docs:
Each object is uniquely identified by a binary SHA1 hash, being 20 bytes in size, or 40 bytes in hexadecimal notation.
Git only knows 4 distinct object types being Blobs, Trees, Commits and Tags.
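To make the "20 bytes, or 40 bytes in hexadecimal notation" statement concrete, here is a small sketch that computes a blob's object ID with only the standard library. It assumes the default SHA-1 object format, where git hashes the header "blob <size>\0" followed by the content. The function name git_blob_sha is hypothetical; trees, commits and tags use the same scheme with a different header.

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """Return the hex SHA-1 that git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


sha = git_blob_sha(b'print("echo")')
print(len(sha))                 # 40 hexadecimal characters
print(len(bytes.fromhex(sha)))  # 20 raw bytes
```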
With GitPython (working on a local clone), the head commit sha is accessible via:
headcommit = repo.head.commit
headcommit_sha = headcommit.hexsha
With PyGithub, the commit at the tip of the master branch is accessible via:
branch = repo.get_branch("master")
master_commit = branch.commit
master_sha = master_commit.sha
You can see all your existing branches via:
for branch in repo.get_branches():
    print(branch.name)
You can also view the sha of the branch you'd like in the repository you want to fetch.
The get_git_tree takes the given sha identifier and returns a github.GitTree.GitTree, from the docs:
Git tree object creates the hierarchy between files in a Git repository
You'll find a lot more interesting information in the docs tutorial.
Code to create a repository and commit a new file to it on Google Colab:
!pip install pygithub
from github import Github
user = '****'
password = '****'
g = Github(user, password)
user = g.get_user()
repo_name = 'test'
# Check whether the repo already exists
if repo_name not in [r.name for r in user.get_repos()]:
    # Create repository
    user.create_repo(repo_name)
# Get repository
repo = user.get_repo(repo_name)
# File details
file_name = 'echo.py'
file_content = 'print("echo")'
# Create file
repo.create_file(file_name, 'commit', file_content)

Use Github API to get the last commit of all repositories

I want to use the Github API for Python to be able to get every repository and check the last change to the repository.
import git
from git import Repo
from github import Github
repos = []
g = Github('Dextron12', 'password')
for repo in g.get_user().get_repos():
    repos.append(str(repo))
    # check for last commit to repository HERE
This gets all repositories on my account, but I also want to get the time of the last change to each one of them, with a result like this:
13:46:45
I don't mind if it is 12 hour time either.
According to the documentation, the max info you can get is the SHA of the commit and the commit date:
https://pygithub.readthedocs.io/en/latest/examples/Commit.html#
with your example:
g = Github("usar", "pass")
for repo in g.get_user().get_repos():
    master = repo.get_branch("master")
    sha_com = master.commit
    commit = repo.get_commit(sha=sha_com)
    print(commit.commit.author.date)
from github import Github
from datetime import datetime
repos = {}
g = Github('username', 'password')
for repo in g.get_user().get_repos():
    master = repo.get_branch('master')
    sha_com = master.commit
    sha_com = str(sha_com).split('Commit(sha="')
    sha_com = sha_com[1].split('")')
    sha_com = sha_com[0]
    commit = repo.get_commit(sha_com)
    # get repository name
    repo = str(repo).split('Repository(full_name="Dextron12/')
    repo = repo[1].split('")')
    # convert datetime object to string
    timeObj = commit.commit.author.date
    timeStamp = timeObj.strftime("%d-%b-%Y (%H:%M:%S)")
    # add repository name and timestamp to repos dictionary
    repos[repo[0]] = timeStamp
print(repos)
I got the timestamp by using the method Damian Lattenero suggested. Upon testing his code I got an AssertionError, because sha_com was a Commit object that prints as Commit(sha="...") rather than the plain "sha" string. So I stripped the Commit(...) wrapper from sha_com to be left with the sha by itself; after that the error went away and it worked. I then use strftime to convert the datetime to a string and save it to a dictionary.
@Dextron Just add .sha, because it's a property. No need to do the split and form a dictionary.
g = Github("user", "pass")
for repo in g.get_user().get_repos():
    master = repo.get_branch("master")
    sha_com = master.commit
    commit = repo.get_commit(sha=sha_com.sha)
    print(commit.commit.author.date)
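The question asked for output like 13:46:45, optionally in 12-hour time. Since commit.commit.author.date is treated as a datetime in the answers above, a short sketch of that last formatting step could look like this. The helper name format_commit_time and the example datetime are hypothetical stand-ins, so the snippet runs without network access.

```python
from datetime import datetime


def format_commit_time(moment, twelve_hour=False):
    """Format a datetime as HH:MM:SS, optionally on a 12-hour clock."""
    pattern = "%I:%M:%S %p" if twelve_hour else "%H:%M:%S"
    return moment.strftime(pattern)


# Hypothetical commit time standing in for commit.commit.author.date.
example = datetime(2020, 1, 2, 13, 46, 45)
print(format_commit_time(example))  # 13:46:45
print(format_commit_time(example, twelve_hour=True))
```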

Use Python along with Github API to collect data from a repo

I'm researching a project that uses Python and the GitHub API to collect the number of stars, contributors, PRs and issues from a repo (https://github.com/) and store them in a CSV file.
I've been trying BeautifulSoup4, but the API method is a more stable way to go.
Below is my small snippet. I'm not sure how to use the GitHub API (PyGithub) to get the number of issues raised by contributors from a certain company versus non-company contributors (to identify the external contributors).
from github import Github
# using username and password
# or using an access token
g = Github("***************************")
for repo in g.get_user().get_repos():
    print(repo.name)
print("**********Get Current Repos**********")
user = g.get_user()
user.login
print(user.login)
repo = g.get_repo("<any-repo>/<any-repo>")
repo.name
print(repo.name)
print("********Get the Repo Topics**************")
repo = g.get_repo("<any-repo>/<any-repo>")
repo.get_topics()
print(repo.get_topics())
print("*****Get the Star Count*************")
repo = g.get_repo("<any-repo>/<any-repo>")
repo.stargazers_count
print(repo.stargazers_count)
print("********Get the Open Issues*********")
repo = g.get_repo("<any-repo>/<any-repo>")
open_issues = repo.get_issues(state='open')
for issue in open_issues:
    print(issue)
print("******Get the Branch Count*******")
repo = g.get_repo("<any-repo>/<any-repo>")
print(list(repo.get_branches()))
PS: I'm still a Python newbie.
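For the "store it in a CSV file" part, one possible sketch uses the standard csv module. The column names, the helper write_stats_csv and the placeholder row are all assumptions for illustration. With PyGithub the values would presumably come from the calls already used in the snippet above (repo.stargazers_count, repo.get_issues(state='open'), repo.get_branches()).

```python
import csv
import io

# Hypothetical column layout for the collected statistics.
FIELDS = ["repo", "stars", "open_issues", "branches"]


def write_stats_csv(rows, handle):
    """Write one CSV line per repository dict to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder row, not real data.
rows = [{"repo": "owner/name", "stars": 0, "open_issues": 0, "branches": 0}]

buffer = io.StringIO()  # swap for open("stats.csv", "w", newline="") to write a file
write_stats_csv(rows, buffer)
print(buffer.getvalue())
```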

How do I push to remote with pygit2?

I want to clone a repository, change a file, and push the changed file back to the origin branch.
I can clone the repo with
repo = pygit2.clone_repository(repo_url, local_dir, checkout_branch="test_it")
But what do I need to do now to push the changes to the remote? I want to commit the changes for only one specific file, even if more files are changed.
Hope someone can help me. TIA
First stage only file_path:
# stage 'file_path'
index = repository.index
index.add(file_path)
index.write()
Then do a commit:
# commit data
reference = 'HEAD'  # update the branch that HEAD currently points to
message = '...some commit message...'
tree = index.write_tree()
author = pygit2.Signature(user_name, user_mail)
committer = pygit2.Signature(user_name, user_mail)
# the current head commit becomes the parent of the new commit
oid = repository.create_commit(reference, author, committer, message, tree, [repository.head.target])
Finally, push the repo as described in "Unable to ssh push in pygit2".
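The push step itself is not spelled out here, so the following is only a sketch under stated assumptions: the remote is named origin, the branch is pushed to a remote branch of the same name, and the repository argument is a pygit2.Repository. The helper name push_branch is hypothetical. It relies on the repository's remotes collection and on Remote.push, which takes a list of refspecs.

```python
def push_branch(repository, branch_name, remote_name="origin", callbacks=None):
    """Push one local branch to the named remote and return the refspec used."""
    remote = repository.remotes[remote_name]
    refspec = f"refs/heads/{branch_name}"
    # push() takes a list of refspecs; callbacks carries the credentials, if any.
    remote.push([refspec], callbacks=callbacks)
    return refspec
```

For the clone in the question this would be called as push_branch(repo, "test_it"), passing a pygit2.RemoteCallbacks object built from pygit2.UserPass or pygit2.Keypair credentials when the remote requires authentication.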

How do you checkout a branch with pygit2?

I want to use pygit2 to checkout a branch-name.
For example, if I have two branches: master and new and HEAD is at master, I would expect to be able to do:
import pygit2
repository = pygit2.Repository('.git')
repository.checkout('new')
or even
import pygit2
repository = pygit2.Repository('.git')
repository.lookup_branch('new').checkout()
but neither works and the pygit2 docs don't mention how to checkout a branch.
It seems you can do:
import pygit2
repo = pygit2.Repository('.git')
branch = repo.lookup_branch('new')
ref = repo.lookup_reference(branch.name)
repo.checkout(ref)
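The same three calls can be wrapped in a small helper that fails loudly when the branch is missing. This is a sketch only: the name checkout_local_branch is hypothetical, and it assumes lookup_branch returns None for an unknown branch, which is worth confirming against the installed pygit2 version.

```python
def checkout_local_branch(repo, branch_name):
    """Check out a local branch by its short name and return the reference."""
    branch = repo.lookup_branch(branch_name)
    if branch is None:
        raise ValueError(f"no local branch named {branch_name!r}")
    # branch.name is the full reference name, e.g. refs/heads/new
    ref = repo.lookup_reference(branch.name)
    repo.checkout(ref)
    return ref
```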
I had a lot of trouble with this, and this is one of the only relevant StackOverflow posts on the topic, so I thought I'd leave a full working example of how to clone a repo from GitHub and check out the specified branch.
import pygit2

def clone_repo(clone_url, clone_path, branch, auth_token):
    # Use pygit2 to clone the repo to disk
    # if using github app pem key token, use x-access-token like below
    # if you were using a personal access token, use auth_method = 'x-oauth-basic' AND reverse the auth_method and token parameters
    auth_method = 'x-access-token'
    callbacks = pygit2.RemoteCallbacks(pygit2.UserPass(auth_method, auth_token))
    pygit2_repo = pygit2.clone_repository(clone_url, clone_path, callbacks=callbacks)
    # look up the remote-tracking branch and check it out
    pygit2_branch = pygit2_repo.branches['origin/' + branch]
    pygit2_ref = pygit2_repo.lookup_reference(pygit2_branch.name)
    pygit2_repo.checkout(pygit2_ref)
