I can't get git check-ignore to work from subprocess.run. What have I missed in the following non-working example?
tools-for-dev > python
Python 3.8.11 (default, Aug 6 2021, 08:56:27)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from subprocess import run
>>> run(['git', 'a'])
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: src2prod/README.md
modified: src2prod/src/project.py
modified: src2prod/tools/debug/lof.py
CompletedProcess(args=['git', 'a'], returncode=0)
>>> run(['git', 'check-ignore', '**/*'])
CompletedProcess(args=['git', 'check-ignore', '**/*'], returncode=1)
>>> quit()
tools-for-dev > git check-ignore **/*
multimd/changes/x-todo-x.txt
spkpb/changes/x-todo-x.txt
spkpb/dist
spkpb/dist/spkpb-0.0.10b0-py3-none-any.whl
spkpb/dist/spkpb-0.0.10b0.tar.gz
...
I am not sure, but I think that run takes the first parameter given and encloses all the others in single quotes. As @torek pointed out in the comments, run does not enclose arguments in quotes. Instead, subprocess.run is called with shell=False unless specified otherwise. If no shell is invoked in the first place, shell expansion cannot happen.
Again, as @torek observed, the best solution is probably glob.glob, with recursive=True in your case, since you have the ** pattern.
check-ignore does not work with glob patterns, but with plain paths. If in the repository you were able to create a file called **/*, then it would probably find it (without shell expansion), but since you cannot do it, the only possible result is
Exit Code 1: None of the provided paths are ignored.
Here is a very ugly patch... See the 1st comment below for a better approach.
from os import popen

depth = 10  # <-- Just for the example, but in real use
            # this value must be found automatically.

for i in range(-1, depth + 1):
    if i == -1:
        pathsearch = '**'
    else:
        pathsearch = '**' + '/**'*i + '/*'

    filesfound = popen(f'git check-ignore {pathsearch}').read()

    if filesfound:
        print(filesfound)
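A sketch of the glob.glob approach suggested above: expand the pattern in Python, then hand the resulting plain paths to git check-ignore (assumes the script is run from the repository root and that git is on the PATH; the helper name is made up for the example):

```python
import subprocess
from glob import glob

def git_ignored_paths(pattern='**/*'):
    """Return the paths matching *pattern* that git would ignore."""
    # Expand the glob in Python instead of relying on shell expansion.
    paths = glob(pattern, recursive=True)
    if not paths:
        return []
    # Exit code 0: at least one path is ignored; 1: none are ignored.
    result = subprocess.run(
        ['git', 'check-ignore', *paths],
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

This avoids both the shell and the depth-guessing loop, since recursive=True makes ** descend into every subdirectory.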
In Julia, calling a function with the @edit macro from the REPL will open the editor and put the cursor at the line where the method is defined. So, doing this:
julia> @edit 1 + 1
jumps to julia/base/int.jl and puts the cursor on the line:
(+)(x::T, y::T) where {T<:BitInteger} = add_int(x, y)
As does the function form: edit(+, (Int, Int))
Is there an equivalent decorator/function in Python that does the same from the Python REPL?
Disclaimer: In the Python ecosystem, this is not the job of the core language/runtime but rather tools such as IDEs. For example, the ipython shell has the ?? special syntax to get improved help including source code.
Python 3.8.5 (default, Jul 21 2020, 10:42:08)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.18.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import random
In [2]: random.uniform??
Signature: random.uniform(a, b)
Source:
def uniform(self, a, b):
    "Get a random number in the range [a, b) or [a, b] depending on rounding."
    return a + (b-a) * self.random()
File: /usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/random.py
Type: method
The Python runtime itself allows viewing source code of objects via inspect.getsource. This uses a heuristic to search the source code as available; the objects themselves do not carry their source code.
Python 3.8.5 (default, Jul 21 2020, 10:42:08)
[Clang 11.0.0 (clang-1100.0.33.17)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import inspect
>>> print(inspect.getsource(inspect.getsource))
def getsource(object):
    """Return the text of the source code for an object.

    The argument may be a module, class, method, function, traceback, frame,
    or code object.  The source code is returned as a single string.  An
    OSError is raised if the source code cannot be retrieved."""
    lines, lnum = getsourcelines(object)
    return ''.join(lines)
It is not possible to resolve arbitrary expressions or statements to their source; since all names in Python are resolved dynamically, the vast majority of expressions do not have a well-defined implementation unless executed. A debugger, e.g. as provided by pdb.set_trace(), allows inspecting the expression as it is executed.
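A minimal @edit-style helper can be pieced together from inspect plus the EDITOR environment variable. This is a sketch, not a standard function: the edit name is made up, the +line convention is an assumption about the editor, and it only works for pure-Python objects:

```python
import inspect
import os
import subprocess

def edit(obj):
    """Open $EDITOR at the line where *obj* is defined (pure-Python only)."""
    source_file = inspect.getsourcefile(obj)
    _, line_number = inspect.getsourcelines(obj)
    editor = os.environ.get('EDITOR', 'vi')
    # Most editors (vi, nano, emacs) accept +<line> to jump to a given line.
    subprocess.call([editor, f'+{line_number}', source_file])
```

For C-implemented objects, inspect.getsourcefile raises TypeError, mirroring the "pure Python only" limitation mentioned below.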
In most IDEs like PyCharm or VSCode you can Ctrl+ click on a function / class to get its definition, even if it is in the core language or a 3rd party library (in VSCode, this also works in Julia btw.).
A limitation is that this only works for "pure Python" code; C library code etc. is not shown.
My question is more theoretical than practical: I've found many answers that explain how, but not why, we should use a list in a subprocess.Popen call.
For example as is known:
Python 2.7.10 (default, Oct 14 2015, 16:09:02)
[GCC 5.2.1 20151010] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> cmd = subprocess.Popen(["python", "-V"], stdout=subprocess.PIPE)
Python 2.7.10
Then I was messing around in UNIX and found something interesting:
mvarge#ubuntu:~$ strace -f python -V 2>&1
execve("/usr/bin/python", ["python", "-V"], [/* 29 vars */]) = 0
Probably both execve and the list model that subprocess uses are somehow related, but can anyone give a good explanation for this?
Thanks in advance.
The underlying C-level representation is a char *[] array. Representing this as a list in Python is just a very natural and transparent mapping.
You can use a string instead of a list with shell=True; the shell is then responsible for parsing the command line into a char *[] array. However, the shell adds a number of pesky complexities; see the many questions about why you want to avoid shell=True for a detailed explanation.
The command-line arguments argv and the environment envp are just two of many OS-level structures which are essentially null-terminated arrays of strings.
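The difference is easy to see in a sketch (POSIX assumed, since echo is used): with a list, each element maps to exactly one argv entry; with shell=True, the shell does the splitting itself.

```python
import subprocess

# With a list (the default shell=False), each element becomes exactly one
# argv entry, whitespace included -- no shell is involved:
result = subprocess.run(['echo', 'hello world'],
                        capture_output=True, text=True)
print(result.stdout)  # hello world

# With shell=True, a single string is handed to /bin/sh, which performs
# word splitting and glob expansion before building argv:
result2 = subprocess.run('echo hello world', shell=True,
                         capture_output=True, text=True)
```

In the first call, 'hello world' travels as one argument; in the second, the shell splits it into two before echo ever sees it.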
A process is an OS-level abstraction — to create a process, you have to use the OS API, which dictates what you should use. It is not necessary to use a list; e.g., a string (lpCommandLine) is the native interface on Windows (CreateProcess()). POSIX uses execv() and therefore the native interface is a sequence of arguments (argv). Naturally, the Python subprocess module uses these interfaces to run external commands (create new processes).
The technical (uninteresting) answer is that in "why we must", the "must" part is not correct, as Windows demonstrates.
To understand "why it is", you could ask the creators of CreateProcess(), execv() functions.
To understand "why we should" use a list, look at the table of contents for Unix (list) and Windows (string): How Command Line Parameters Are Parsed — the task that should be simple is complicated on Windows.
The main difference is that on POSIX the caller is responsible for splitting a command line into separate parameters, while on Windows the command itself parses its parameters. Different programs may and do use different algorithms to parse the parameters. The subprocess module uses MS C runtime rules (subprocess.list2cmdline()) to combine the args list into a command line. It is much harder for a programmer to understand how the parameters might be parsed on Windows.
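The joining step mentioned above can be observed directly, since list2cmdline is callable on any platform:

```python
import subprocess

# subprocess joins an args list into a single Windows-style command line
# using MS C runtime quoting rules; arguments containing spaces are
# wrapped in double quotes:
cmdline = subprocess.list2cmdline(['prog', 'arg with spaces', 'plain'])
print(cmdline)  # prog "arg with spaces" plain
```

Note that list2cmdline is technically an internal helper, so its rules (not its availability) are the documented contract.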
Let's say my shell script returns a value of '19' when it is run. I'd like to store that value (without any return value of 0 or empty lines) into a variable in my python code for use later.
There are many questions here similar to mine, but I have yet to find a solution where the shell script returns me '19' without an extra return value of 0 or a newline.
Using subprocess.call('bash TestingCode', shell=True) in the Python code returns exactly what I want; however, when I store this command's result in a variable and then print the variable, it prints with an extra 0.
answer = subprocess.call('bash TestingCode', shell=True)
print answer
>>19
>>0
I then tried an example from this question: How to return a value from a shell script in a python script
However it returns me an extra empty line instead.
answer = subprocess.check_output('bash TestingCode', shell=True)
print answer
>>19
>>
I really appreciate the help!
UPDATE: TestingCode script
#!/bin/bash
num=19
echo $num
Just call it like this:
import subprocess
answer = subprocess.check_output('bash TestingCode', shell=True)
answer = answer.rstrip()
The reason is that your shell script is printing 19 followed by a new line. The return value from subprocess.check_output() will therefore include the new line produced by the shell script. Calling str.rstrip() will remove any trailing whitespace, leaving just '19' in this case.
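On Python 3, check_output returns bytes rather than str, so either decode the result or pass text=True (3.7+) before stripping. A sketch using echo in place of the TestingCode script:

```python
import subprocess

# text=True makes check_output return str instead of bytes (Python 3.7+);
# strip() then removes the trailing newline that echo appends.
answer = subprocess.check_output('echo 19', shell=True, text=True).strip()
print(answer)  # 19
```

Without text=True, the same call would yield b'19\n' and the strip would have to happen on bytes.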
Try using subprocess.Popen instead; it returns without the 0.
This works for me. I suspect you have a problem in your shell script that is causing the output.
$ cat test.sh
#!/bin/sh
exit 19
(0) austin#Austins-Mac-8:~
$ python2.7
Python 2.7.10 (default, Aug 22 2015, 20:33:39)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> subprocess.call('bash test.sh', shell=True)
19
>>>
Inspired by another question here, I would like to retrieve the Python interpreter's full command line in a portable way. That is, I want to get the original argv of the interpreter, not the sys.argv which excludes options to the interpreter itself (like -m, -O, etc.).
sys.flags tells us which boolean options were set, but it doesn't tell us about -m arguments, and the set of flags is bound to change over time, creating a maintenance burden.
On Linux you can use procfs to retrieve the original command line, but this is not portable (and it's sort of gross):
open('/proc/{}/cmdline'.format(os.getpid())).read().split('\0')
You can use ctypes
~$ python2 -B -R -u
Python 2.7.9 (default, Dec 11 2014, 04:42:00)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Persistent session history and tab completion are enabled.
>>> import ctypes
>>> argv = ctypes.POINTER(ctypes.c_char_p)()
>>> argc = ctypes.c_int()
>>> ctypes.pythonapi.Py_GetArgcArgv(ctypes.byref(argc), ctypes.byref(argv))
1227013240
>>> argc.value
4
>>> argv[0]
'python2'
>>> argv[1]
'-B'
>>> argv[2]
'-R'
>>> argv[3]
'-u'
I'm going to add another answer to this. @bav had the right answer for Python 2.7, but it breaks in Python 3, as @szmoore points out (not just 3.7). The code below, however, works in both Python 2 and Python 3 (the key is c_wchar_p in Python 3 instead of c_char_p in Python 2) and properly converts argv into a Python list so that it's safe to use in other Python code without segfaulting:
import ctypes
import sys

def get_python_interpreter_arguments():
    argc = ctypes.c_int()
    argv = ctypes.POINTER(ctypes.c_wchar_p if sys.version_info >= (3,)
                          else ctypes.c_char_p)()
    ctypes.pythonapi.Py_GetArgcArgv(ctypes.byref(argc), ctypes.byref(argv))

    # Ctypes pointers are weird: they can't be used in list comprehensions,
    # with `in`, or in a for-each loop, so use an old-school index loop.
    arguments = list()
    for i in range(argc.value - len(sys.argv) + 1):
        arguments.append(argv[i])
    return arguments
You'll notice that it also returns only the interpreter arguments and excludes the arguments found in sys.argv. You can eliminate this behavior by removing - len(sys.argv) + 1.
How can I check whether two file paths point to the same file in Python?
$ touch foo
$ ln -s foo bar
$ python
Python 2.5.1 (r251:54863, Feb 6 2009, 19:02:12)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> help(os.path.samefile)
Help on function samefile in module posixpath:
samefile(f1, f2)
Test whether two pathnames reference the same actual file
>>> os.path.samefile("foo", "bar")
True
You want to use os.path.abspath(path) to normalize each path for comparison.
os.path.abspath(foo) == os.path.abspath(bar)
A simple string compare should work:
import os
print os.path.abspath(first) == os.path.abspath(second)
Credit to Andrew, who corrected my initial post, which included a call to os.path.normpath: this is unneeded because the implementation of os.path.abspath does it for you.
On Windows systems, there is no samefile function and you also have to worry about case. The normcase function from os.path can be combined with abspath to handle this case.
from os.path import abspath, normcase
def are_paths_equivalent(path1, path2):
return normcase(abspath(path1)) == normcase(abspath(path2))
This will consider "C:\SPAM\Eggs.txt" to be equivalent to "c:\spam\eggs.txt" on Windows.
Note that unlike samefile, all methods based on normalizing and comparing paths will not be aware of cases where completely different paths refer to the same file. On Windows, this means that if you use SUBST, MKLINK or mounted network shares to create multiple distinct paths to the same file, none of these solutions will be able to say "that's the same file". Hopefully that's not too much of a problem most of the time.
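For the hard-link and multiple-mount cases, comparing device and inode numbers via os.stat is essentially what samefile does under the hood on POSIX; a sketch (the same_file name is made up, and st_ino is not meaningful on all Windows filesystems):

```python
import os

def same_file(path1, path2):
    """True if both paths refer to the same underlying file (POSIX)."""
    s1, s2 = os.stat(path1), os.stat(path2)
    # Two paths name the same file iff device and inode numbers match;
    # this catches hard links and symlinks, which os.stat follows.
    return (s1.st_dev, s1.st_ino) == (s2.st_dev, s2.st_ino)
```

Unlike string comparison of normalized paths, this sees through links, but it requires both files to actually exist.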
Maybe one can use os.path.relpath(path1, path2) as a workaround for os.path.samefile(path1, path2) on Windows?
If os.path.relpath(path1, path2) returns '.', then path1 and path2 point to the same place.