How to check if a Git Repo has uncommitted changes using Python - python

I'm pretty new to git, but I'm trying to use python to check if a git repository has any uncommitted changes. I seem to get the same error no matter what command I try to run using python. Here's my code:
from git import *
repo = Repo("path\to\my\repo")
lastCommit = repo.head.commit.committed_date
uncommitted = repo.is_dirty()
Everything works as expected until I run the last line which is when I get the error:
Traceback (most recent call last):
.
.
.
raise GitCommandNotFound: [Error 2] The system cannot find the file specified
I've tried this with other commands too and I get the same error. For example, repo.index.diff(repo.head.commit). I've also tried running repo.index.diff(None) and repo.index.diff('HEAD') which give the same error. What I really want is to essentially run $ git status for the repository I've named repo. I'm using Python 2.7.9 and gitpython 1.0.1 on Windows 7. Any help would be appreciated!

In your particular example (which is only for illustration purposes), "path\to\my\repo" would be understood as 'path\to\\my\repo', because \t is read as a tab and \r as a carriage return. Use a double backslash between the components of your path ("path\\to\\my\\repo"). Alternatively, you can put an r in front of the string to make it a raw string: r"path\to\my\repo".
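A quick way to see the difference is to compare the spellings in an interpreter. This is a minimal sketch; the paths are placeholders in the style of the question, not real locations, and the first one omits the "my" component only to avoid an unrelated invalid-escape warning on recent Python versions.

```python
# "\t" and "\r" inside an ordinary string literal are escape sequences.
plain = "path\to\repo"          # contains a tab and a carriage return
raw = r"path\to\my\repo"        # raw string: backslashes are kept literally
doubled = "path\\to\\my\\repo"  # each "\\" is one literal backslash

print("\t" in plain, "\r" in plain)  # True True: the separators were swallowed
print(raw == doubled)                # True: both hold literal backslashes
```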

It looks like GitPython can't find git.exe.
Try setting the environment variable GIT_PYTHON_GIT_EXECUTABLE.
It should most likely be C:\Program Files (x86)\Git\bin\git.exe if you are using Git for Windows with the default settings.
At the command line (cmd.exe), without quotes, since cmd would store them as part of the value:
set GIT_PYTHON_GIT_EXECUTABLE=C:\Program Files (x86)\Git\bin\git.exe
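The same variable can also be set from Python before GitPython is imported. The sketch below is only an illustration, not GitPython's own lookup logic: `configure_git_executable` is a hypothetical helper, and the candidate path is simply the default location mentioned above.

```python
import os
import shutil

def configure_git_executable(candidates=()):
    """Point GIT_PYTHON_GIT_EXECUTABLE at the first git binary found.

    `candidates` is a hypothetical list of install locations to try
    before falling back to a PATH lookup.
    """
    for candidate in candidates:
        if os.path.isfile(candidate):
            os.environ["GIT_PYTHON_GIT_EXECUTABLE"] = candidate
            return candidate
    found = shutil.which("git")  # None when git is not on PATH
    if found:
        os.environ["GIT_PYTHON_GIT_EXECUTABLE"] = found
    return found

git_path = configure_git_executable(
    [r"C:\Program Files (x86)\Git\bin\git.exe"]
)
print(git_path or "git was not found; install it or extend the candidate list")
```

Call the helper before `import git`, because the variable is read when the package is imported.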

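Thanks for your suggestions, but implementing them did not actually solve my problem. I did develop a workaround with the following code:
import os

def statusChecker(repo, lastCommit):
    # repo is the path to the working tree; lastCommit is a Unix timestamp
    uncommittedFiles = []
    for file in os.listdir(repo):
        if file == ".git":
            continue  # skip Git's own metadata directory
        # a file modified after the last commit is treated as uncommitted
        if os.path.getmtime(os.path.join(repo, file)) > lastCommit:
            uncommittedFiles.append(file)
    return uncommittedFiles
As long as you pass the repository path for repo and something like lastCommit = repo.head.commit.committed_date for the lastCommit argument, this should work well. Note that it only compares the modification times of top-level entries, so it is an approximation of git status.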
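Since the goal is essentially `git status`, another option is to call the git executable directly and read its machine-readable output. This swaps GitPython for the standard `subprocess` module and is only a sketch: it assumes Python 3.7 or later and a `git` binary on PATH, and `parse_porcelain` and `uncommitted_paths` are names made up for this example.

```python
import subprocess

def parse_porcelain(output):
    """Extract paths from `git status --porcelain` output.

    Each line is a two-character status code, a space, then the path.
    """
    return [line[3:] for line in output.splitlines() if len(line) > 3]

def uncommitted_paths(repo_path):
    """Return paths git reports as modified, staged, or untracked."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git itself fails
    )
    return parse_porcelain(result.stdout)

sample = " M setup.py\n?? notes.txt\n"
print(parse_porcelain(sample))  # ['setup.py', 'notes.txt']
```

An empty list from `uncommitted_paths` means the working tree is clean.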

Related

Pytesseract image_to_pdf_or_hocr function not working in AWS Lambda

I am using this repository to deploy tesseract as a lambda layer: https://github.com/bweigel/aws-lambda-tesseract-layer
The deployment works well, and other pytesseract functions such as image_to_string and image_to_data also work without any hiccups.
But, when I try to use image_to_pdf_or_hocr like this:
pdf = pytesseract.image_to_pdf_or_hocr(f'/tmp/{file_name}/{page.number}.png', extension='pdf')
it does not work and throws an error like:
Traceback (most recent call last):
File "/var/task/helpers/ocr_helper.py", line 36, in save_searchable_pdf
f'/tmp/{file_name}/{page.number}.png', extension='pdf')
File "/var/task/pytesseract/pytesseract.py", line 432, in image_to_pdf_or_hocr
return run_and_get_output(*args)
File "/var/task/pytesseract/pytesseract.py", line 289, in run_and_get_output
with open(filename, 'rb') as output_file:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tess_6_hu78b0.pdf'
It says that the file tess_6_hu78b0.pdf does not exist. What does this mean? I have no file with tess_6_hu78b0 name to begin with.
The path that I am passing to the image_to_pdf_or_hocr function is 100% correct and an image is present there. I have confirmed this, and the same code works on my local machine.
I have tried:
I found somewhere that I needed to install libtesseract-dev too. Hence, I modified my dockerfile as:
FROM lambci/lambda:build-python3.6
RUN sudo apt install tesseract-ocr
RUN sudo apt install libtesseract-dev
but unfortunately this too did not work.
After 18 hours of hard work, I was finally able to figure it out.
It turns out that https://github.com/bweigel/aws-lambda-tesseract-layer is not bundled with all the necessary files for pytesseract.image_to_pdf_or_hocr() to run.
So what I did was build leptonica and tesseract from source and generate:
configs folder
tessconfigs folder and
pdf.ttf file
These required files are available here:
https://github.com/prameshbajra/tessdata
Inside https://github.com/bweigel/aws-lambda-tesseract-layer, under ready-to-use folder there is a directory named amazonlinux-1, and inside it, there is a folder named tesseract/share/tessdata. All you need to do is paste in the above listed files under this directory.
Just download this repo and replace the tessdata folder.
Note: This tessdata was built with tesseract 4.1.1
I hope this helps future readers.
Happy coding.
Thanks to Benjamin Genz (@bweigel) for publishing this repo. You made our lives easier.
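Before redeploying the layer, it can help to confirm that the tessdata folder really contains the extra entries. The following is a small sketch, not part of either repository: `missing_tessdata_entries` is a hypothetical helper, the relative path is the folder named in the answer above, and the font file Tesseract uses for PDF output is named `pdf.ttf`.

```python
import os

# Entries the answer describes as missing from the layer's tessdata folder.
# pdf.ttf is the font file Tesseract needs to render PDF output.
REQUIRED_ENTRIES = ("configs", "tessconfigs", "pdf.ttf")

def missing_tessdata_entries(tessdata_dir, required=REQUIRED_ENTRIES):
    """Return the required entries that are absent from tessdata_dir."""
    return [
        name for name in required
        if not os.path.exists(os.path.join(tessdata_dir, name))
    ]

# Path taken from the answer; adjust it to where the layer is unpacked.
missing = missing_tessdata_entries("tesseract/share/tessdata")
print("missing:", missing if missing else "nothing")
```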
Adding this config argument fixed it for me, inspired by this solution :)
pytesseract.image_to_pdf_or_hocr("Image.png", extension="pdf", config = " -c tessedit_create_pdf=1")

how to fix read-only file system error python

I am trying to make a python script that runs the command line for turning a file into a .zip using python3 on my Mac.
However, whenever I run: os.system('zip -er file.zip /Users/mymac/Desktop/file.py') in python3, I get the error:
zip I/O error: Read-only file system
zip error: Could not create output file (file.zip)
I have tried disabling SIP on my Mac, as well as trying to use subprocess but I get the same message every time. I am really unsure why this happens... Is anyone able to help out?
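I would suggest three steps.
First, run:
fsck -n -f
Then reboot.
Finally, make sure to run the Python file as root:
import os
# os.system returns the command's exit status; it does not raise on failure
status = os.system('zip mag.zip mag.ppk')
if status == 0:
    print('success')
else:
    print('problem')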
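A relative output name such as file.zip is created in the process's current working directory, so one thing worth ruling out is that this directory is not writable. That is an assumption; the thread does not confirm the cause. The sketch below builds an absolute destination in a directory that is known to be writable. `writable_output_path` and its parameters are made up for this example.

```python
import os
import tempfile

def writable_output_path(filename, directory=None):
    """Return an absolute path for `filename` inside a writable directory.

    Falls back to the system temp directory when `directory` is missing
    or not writable.
    """
    if directory is None:
        directory = os.getcwd()
    if not (os.path.isdir(directory) and os.access(directory, os.W_OK)):
        directory = tempfile.gettempdir()
    return os.path.join(os.path.abspath(directory), filename)

target = writable_output_path("file.zip", os.path.expanduser("~/Desktop"))
print(target)
# The archive could then be created with an explicit destination, e.g.
# os.system('zip -er "%s" /Users/mymac/Desktop/file.py' % target)
```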

Windows 10: PIL.Image.open(abc).load(xyz) triggers OSError: Unable to locate Ghostscript on paths

I'm using Python on Windows 10 with PyCharm. My script contains this line:
img = PIL.Image.open(io.BytesIO(ps.encode('utf-8')))
It triggers this error:
Traceback (most recent call last):
File "C:/Users/x/Desktop/ytg2/main.py", line 504, in <module>
generate_terrain(driver)
File "C:/Users/x/Desktop/ytg2/main.py", line 129, in generate_terrain
img = open_eps(ps, dpi=95.5)
File "C:/Users/x/Desktop/ytg2/main.py", line 32, in open_eps
img.load(scale=math.ceil(scale))
File "C:\Users\x\AppData\Local\Programs\Python\Python37\lib\site-packages\PIL\EpsImagePlugin.py", line 332, in load
self.im = Ghostscript(self.tile, self.size, self.fp, scale)
File "C:\Users\x\AppData\Local\Programs\Python\Python37\lib\site-packages\PIL\EpsImagePlugin.py", line 134, in Ghostscript
raise OSError("Unable to locate Ghostscript on paths")
OSError: Unable to locate Ghostscript on paths
Process finished with exit code 1
So what I understand is that the load method of the object returned by PIL.Image.open relies on Ghostscript, which the interpreter cannot find.
So here is, in order, what I've tried:
In PyCharm's package manager, I've installed the following packages: python3-ghostscript and ghostscript.
In the Windows 10 Environment Variables, I have added this variable: (name="Ghostscript" ; value="C:\Program Files\gs\gs9.52\bin\gswin64.exe"). Before that, I had of course manually installed Ghostscript (https://www.ghostscript.com/download/gsdnld.html). I've tried this value too: %ProgramFiles%\gs%\gs9.52%\bin%\gswin64.exe.
However, the problem is still there. What could I do?
Loading the image opened with PIL.Image.open(io.BytesIO(ps.encode('utf-8'))) uses shutil.which('gswin64c') to find gswin64c (I found that out by clicking a file link in the error log in PyCharm's console: if I remember correctly, it was one of the last two lines beginning with the word File in the traceback shown in the question).
shutil.which('gswin64c') was returning None (I printed it to check), so I printed os.environ["PATH"] and, indeed, the Ghostscript directory was not in the output. To be sure, I typed echo %path% in the Windows 10 command line and observed the same thing.
My conclusion: I thought I had correctly added the path of gswin64c in the way I described in the question (via the admin panel), but in fact I had not.
(Maybe this step is optional.) First, since it didn't work, I deleted the gswin64c variable that I had added via the admin panel (see the question). I did this deletion via the admin panel too.
Then, to correctly add the path of gswin64c, I typed in the Windows command line: setx path "%path%;c:\Program Files\..........\" (this directory must contain gswin64c). Then I restarted Windows 10 (if I remember correctly, it was required).
Then I printed the result of shutil.which('gswin64c') again, and gswin64c is now found. os.environ["PATH"] and echo %path% also correctly include the directory of gswin64c.
I hope this answer helps someone. In fact it was not very difficult: one just has to know how to correctly add a path on Windows 10.... Lol.
I was "inspired" by: https://www.windows-commandline.com/set-path-command-line/ ;-) .
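The check described above can be scripted so it is easy to repeat after each PATH change. This is a minimal sketch: `find_ghostscript` is a hypothetical helper, and the binary names are the ones Ghostscript commonly installs; the exact names Pillow looks for may differ between versions.

```python
import os
import shutil

def find_ghostscript(names=("gswin64c", "gswin32c", "gs")):
    """Return the first Ghostscript executable found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None

gs_path = find_ghostscript()
if gs_path is None:
    # Show what the interpreter actually sees, one directory per line.
    print("Ghostscript is not on PATH. Directories searched:")
    for entry in os.environ.get("PATH", "").split(os.pathsep):
        print("  ", entry)
else:
    print("Ghostscript found at", gs_path)
```

Run it from the same interpreter and run configuration as the failing script, since PyCharm may have been started with an older copy of the environment.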

pylint with jenkins - complains that it can't find xml file

I am trying to run pylint with jenkins with following command:
pylint -f parseable -d I0011,R0801 "mypath\highLevel" | tee.exe pylint.out
The process seems to run fine and pylint.out is created with a lot of information inside, but during pylint report creation I get the following error:
13:38:27 ERROR: Publisher hudson.plugins.violations.ViolationsPublisher aborted due to exception
13:38:27 java.io.FileNotFoundException: C:\Users\DMD\.jenkins\jobs\Diamond - Run Coverage\builds\2015-07-26_13-34-30\violations\file\A:\highLevel\Monitor\InitialBootAdapter.py.xml (The filename, directory name, or volume label syntax is incorrect)
It creates a very strange path:
C:\Users\DMD\.jenkins\jobs\Diamond - Run Coverage\builds\2015-07-26_13-34-30\violations\file\A:\highLevel\Monitor\InitialBootAdapter.py.xml
I don't really understand what happens.
Why is pylint interested in the file InitialBootAdapter.py? Why does it look for the file InitialBootAdapter.py.xml? Who should create it, and why? I searched the whole environment for this file and didn't find it. I also didn't find any XML files for my other .py files.
Maybe you have experience with pylint and can help?
Thank you.
I have experience with pylint in jenkins. And here is how I use it, hope it will help someone.
Step 1
Add an "Execute Shell" step and execute the pylint command to generate pylint.out. Note the command:
/usr/local/bin/pylint -f parseable -d I0011,R0801 my-python-project-folder | tee pylint.out
Step 2
Make sure you have the Violation Report Plugin installed. After that, click Add post-build action --> Report Violation and put pylint.out in the corresponding field.
And after the successful run, the pylint report looks like this:
I fixed the problem. It took time and DevOps help, but it worked, and it is described in great detail on my own blog (it's more my online notebook than a blog).
The most important point in that post is a small utility:
import fileinput, sys

if __name__ == "__main__":
    # with inplace=True, anything printed is written back into the file
    for line in fileinput.FileInput(sys.argv[1], inplace=True):
        if ".cs" in line:
            # turn Windows backslashes into forward slashes
            line = line.replace("\\", "/")
        print(line, end="")
Here sys.argv[1] should be the path to your violations.xml file.
You have to pass the path as a command-line argument because the path to your violations.xml file is dynamic and depends on the build id.

Updating PYTHONPATH variable pointing to a dropbox directory containing a space

I'm trying to import a module for python that I have written that is contained in a Dropbox folder whose path contains a space. Following the comments here, I don't want to do a sys.path.append(path_to_repository) every time I use python, I'd rather just update my bash profile to point to the correct Dropbox folder once.
I've tried adapting the code from the previous page by appending the following lines to my ~/.bash_profile:
PYTHONPATH ="/Users/myusername/Dropbox (projectname)/REPOSITORY_NAME"
export ${PYTHONPATH}
When I close the terminal window and reopen, I get the following error message:
-bash: PYTHONPATH: command not found
-bash: export: `/Users/myusername/Dropbox': not a valid identifier
-bash: export: `(projectname)/REPOSITORY_NAME': not a valid identifier
and (not surprisingly) when I then try to import from the repository in Python, I get a module-not-found error:
>>> from REPOSITORY_NAME import myfile
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named REPOSITORY_NAME
Does anyone have any solutions?
Some questions/possibilities I'm thinking of -
1) Does it have anything to do with my Anaconda configuration? (Anaconda runs in a virtual env)
2) Does it have anything to do with the folder containing the Python code being located in Dropbox?
3) Could it be that spaces in pythonpath are not interpreted correctly?
4) Is there any problem with this being the same directory that syncs to github and bitbucket?
Thanks in advance for your help.
*Edit:
The solution seems to be to 1) erase the extra space in the first line and 2) replace the ${PYTHONPATH} with PYTHONPATH in the second line, i.e. to adjust ~/.bash_profile to have the following lines:
PYTHONPATH="/Users/myusername/Dropbox (projectname)/REPOSITORY_NAME"
export PYTHONPATH
I had a similar problem a while ago. The error message shows the path being split in two at the space. That happens because export ${PYTHONPATH} expands the variable without quotes, so bash splits its value on whitespace and treats each piece as a separate name to export.
Here is one solution that worked for me:
export PYTHONPATH="/Users/myusername/Dropbox (projectname)/REPOSITORY_NAME"
It is similar to what you have, but export and PYTHONPATH are on the same line. I don't think this would interfere with Dropbox, Github, Bitbucket, Anaconda (or any other virtualenv like Enthought) etc., as long as you have an __init__.py file in each directory where you have your .py files.
Hope this helps
Yeah, you put a space before the = sign when assigning PYTHONPATH:
PYTHONPATH ="/Users/myusername/Dropbox (projectname)/REPOSITORY_NAME"
Should be:
PYTHONPATH="/Users/myusername/Dropbox (projectname)/REPOSITORY_NAME"
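Once the shell exports the variable intact, the space itself is not a problem on the Python side, because PYTHONPATH is split on the platform's path separator rather than on whitespace. The sketch below shows how to inspect the entries; `pythonpath_entries` is a hypothetical helper and the second directory is made up for the example.

```python
import os

def pythonpath_entries(value=None):
    """Split a PYTHONPATH-style string into its directory entries."""
    if value is None:
        value = os.environ.get("PYTHONPATH", "")
    return [entry for entry in value.split(os.pathsep) if entry]

example = os.pathsep.join([
    "/Users/myusername/Dropbox (projectname)/REPOSITORY_NAME",
    "/tmp/other",  # hypothetical second entry
])
for entry in pythonpath_entries(example):
    # The Dropbox path stays in one piece despite the space.
    print(repr(entry), "exists" if os.path.isdir(entry) else "missing")
```

Calling `pythonpath_entries()` with no argument in a new terminal shows what the interpreter actually received from ~/.bash_profile.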
