Inspired by another question here, I would like to retrieve the Python interpreter's full command line in a portable way. That is, I want to get the original argv of the interpreter, not the sys.argv which excludes options to the interpreter itself (like -m, -O, etc.).
sys.flags tells us which boolean options were set, but it doesn't tell us about -m arguments, and the set of flags is bound to change over time, creating a maintenance burden.
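For illustration, the boolean switches are visible through sys.flags, but nothing there records a -m target:

```python
import sys

# sys.flags exposes boolean interpreter options (the exact attribute
# set varies across versions), but no attribute records a -m module.
print(sys.flags.optimize)             # count of -O flags passed
print(sys.flags.dont_write_bytecode)  # 1 if -B was passed
```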
On Linux you can use procfs to retrieve the original command line, but this is not portable (and it's sort of gross):
open('/proc/{}/cmdline'.format(os.getpid())).read().split('\0')
You can use ctypes to call Py_GetArgcArgv, which CPython exposes from its C API:
~$ python2 -B -R -u
Python 2.7.9 (default, Dec 11 2014, 04:42:00)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Persistent session history and tab completion are enabled.
>>> import ctypes
>>> argv = ctypes.POINTER(ctypes.c_char_p)()
>>> argc = ctypes.c_int()
>>> ctypes.pythonapi.Py_GetArgcArgv(ctypes.byref(argc), ctypes.byref(argv))
1227013240
>>> argc.value
4
>>> argv[0]
'python2'
>>> argv[1]
'-B'
>>> argv[2]
'-R'
>>> argv[3]
'-u'
I'm going to add another answer to this. @bav had the right answer for Python 2.7, but it breaks in Python 3, as @szmoore points out (not just 3.7). The code below, however, works in both Python 2 and Python 3 (the key is c_wchar_p in Python 3 instead of c_char_p in Python 2) and properly converts argv into a Python list so that it's safe to use in other Python code without segfaulting:
def get_python_interpreter_arguments():
    argc = ctypes.c_int()
    argv = ctypes.POINTER(ctypes.c_wchar_p if sys.version_info >= (3,) else ctypes.c_char_p)()
    ctypes.pythonapi.Py_GetArgcArgv(ctypes.byref(argc), ctypes.byref(argv))

    # Ctypes are weird. They can't be used in list comprehensions, you can't use `in`
    # with them, and you can't use a for-each loop on them. We have to do an
    # old-school for-i loop.
    arguments = list()
    for i in range(argc.value - len(sys.argv) + 1):
        arguments.append(argv[i])

    return arguments
You'll notice that it also returns only the interpreter arguments and excludes the arguments found in sys.argv. You can eliminate this behavior by removing - len(sys.argv) + 1.
I can't get git check-ignore to work from subprocess.run. What have I missed in the following non-working example?
tools-for-dev > python
Python 3.8.11 (default, Aug 6 2021, 08:56:27)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from subprocess import run
>>> run(['git', 'a'])
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: src2prod/README.md
modified: src2prod/src/project.py
modified: src2prod/tools/debug/lof.py
CompletedProcess(args=['git', 'a'], returncode=0)
>>> run(['git', 'check-ignore', '**/*'])
CompletedProcess(args=['git', 'check-ignore', '**/*'], returncode=1)
>>> quit()
tools-for-dev > git check-ignore **/*
multimd/changes/x-todo-x.txt
spkpb/changes/x-todo-x.txt
spkpb/dist
spkpb/dist/spkpb-0.0.10b0-py3-none-any.whl
spkpb/dist/spkpb-0.0.10b0.tar.gz
...
I was not sure, but I thought the run command took the first parameter given and enclosed all the others in single quotes. As @torek pointed out in the comments, the run command does not enclose arguments in quotes. Instead, subprocess.run is called with shell=False unless specified otherwise, and if no shell is invoked in the first place, shell expansion cannot happen.
Again, as @torek observed, the best solution is probably to use glob.glob, with the option recursive=True in your case, since you have the ** pattern.
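A sketch of that approach (my own illustration, not from the original post): since subprocess.run with shell=False performs no shell expansion, expand the pattern in Python first, then hand the real paths to git.

```python
import glob
import subprocess

def expand_patterns(patterns):
    """Expand glob patterns in Python; ** requires recursive=True."""
    paths = []
    for pattern in patterns:
        paths.extend(glob.glob(pattern, recursive=True))
    return paths

def check_ignored(patterns=('**/*',)):
    """Run git check-ignore on the expanded paths (assumes a git repo
    in the current directory)."""
    paths = expand_patterns(patterns)
    if not paths:
        return []
    result = subprocess.run(['git', 'check-ignore', *paths],
                            capture_output=True, text=True)
    return result.stdout.splitlines()
```

The names expand_patterns and check_ignored are mine, chosen for the example.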
check-ignore does not work with glob patterns, but with plain paths. If in the repository you were able to create a file called **/*, then it would probably find it (without shell expansion), but since you cannot do it, the only possible result is
Exit Code 1: None of the provided paths are ignored.
Here is a very ugly patch... See the 1st comment below for a better approach.
from os import popen

depth = 10  # <-- Just for the example but in real use
            #     this value must be found automatically.

for i in range(-1, depth + 1):
    if i == -1:
        pathsearch = '**'
    else:
        pathsearch = '**' + '/**'*i + '/*'

    filesfound = popen(f'git check-ignore {pathsearch}').read()

    if filesfound:
        print(filesfound)
Say you are writing a script you want to be able to either directly execute from the command-line or import the functions elsewhere. As a command-line executable, you may want to pass flags as options. If you are importing the script later, it may become tedious to make each option a parameter in every function. Below I have a script that I hope illustrates my point using the verbosity option.
#!/usr/bin/python
def getArgs():
    parser = argparse.ArgumentParser()
    parser.add_argument('input', type=int)
    parser.add_argument('-v', '--verbose', action='store_true')
    return parser.parse_args()

def main(input, verbose):
    result = calculation(input, verbose)
    if verbose:
        print(str(input) + " squared is " + str(result))
    else:
        print(result)

def calculation(input, verbose):
    if verbose:
        print("Doing Calculation")
    result = input * input
    return result

if __name__ == '__main__':  # checks whether this script is being executed directly; will not run if imported into another script
    import argparse
    args = getArgs()
    if args.verbose:
        print("You have enabled verbosity")
    main(args.input, args.verbose)
Here's an illustrative execution:
user@machine ~ $ ./whatever.py 7
49
user@machine ~ $ ./whatever.py -v 7
You have enabled verbosity
Doing Calculation
7 squared is 49
user@machine ~ $ python
Python 3.7.3 (default, Mar 26 2019, 21:43:19)
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import whatever
>>> whatever.main(7,False)
49
>>> whatever.main(7,True)
Doing Calculation
7 squared is 49
This script works, but I believe there is a cleaner way to handle the command-line options in the case you import the script later, such as forcing a default option. I suppose one option would be to treat the option as a global variable, but I still suspect there is a less verbose (pun intended) way to include these options in later functions.
When you have a number of functions that all share common parameters, put the parameters in an object and consider making the functions methods of its type:
class Square:
    def __init__(self, v=False):
        self.verb = v
    def calculate(self, x):
        if self.verb: print(…)
        return x*x
    def main(self, x):
        if self.verb: print(…)
        y = self.calculate(x)
        print("%s squared is %s" % (x, y) if self.verb else y)

if __name__ == "__main__":
    args = getArgs()
    Square(args.verbose).main(args.input)
(The default of False is generally what an API client wants.)
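For reference, here is a runnable fill-in of the sketch above; the print bodies are my guesses, reusing the messages from the question's script.

```python
class Square:
    def __init__(self, v=False):
        self.verb = v  # the shared verbosity option, stored once

    def calculate(self, x):
        if self.verb:
            print("Doing Calculation")
        return x * x

    def main(self, x):
        if self.verb:
            print("You have enabled verbosity")
        y = self.calculate(x)
        print("%s squared is %s" % (x, y) if self.verb else y)
```

An importer can then write Square().calculate(7) and never mention verbosity at all.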
My question is more theoretical than practical: I've found many answers that explain how, but not why, we should use a list in a subprocess.Popen call.
For example, as is known:
Python 2.7.10 (default, Oct 14 2015, 16:09:02)
[GCC 5.2.1 20151010] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> cmd = subprocess.Popen(["python", "-V"], stdout=subprocess.PIPE)
Python 2.7.10
Then I was messing around in UNIX and found something interesting:
mvarge#ubuntu:~$ strace -f python -V 2>&1
execve("/usr/bin/python", ["python", "-V"], [/* 29 vars */]) = 0
Probably both execve and the list model that subprocess uses are somehow related, but can anyone give a good explanation for this?
Thanks in advance.
The underlying C-level representation is a char *[] array. Representing this as a list in Python is just a very natural and transparent mapping.
You can use a string instead of a list with shell=True; the shell is then responsible for parsing the command line into a char *[] array. However, the shell adds a number of pesky complexities; see the many questions about why you want to avoid shell=True for a detailed explanation.
The command-line arguments argv and the environment envp are just two of many OS-level structures which are essentially null-terminated arrays of strings.
A process is an OS-level abstraction; to create a process, you have to use the OS API, which dictates what you should use. It is not necessary to use a list: for example, a string (lpCommandLine) is the native interface on Windows (CreateProcess()). POSIX uses execv(), and therefore the native interface is a sequence of arguments (argv). Naturally, the subprocess Python module uses these interfaces to run external commands (create new processes).
The technical (uninteresting) answer is that in "why we must", the "must" part is not correct, as Windows demonstrates.
To understand "why it is", you could ask the creators of the CreateProcess() and execv() functions.
To understand "why we should" use a list, look at the table of contents for Unix (list) and Windows (string): How Command Line Parameters Are Parsed — the task that should be simple is complicated on Windows.
The main difference is that on POSIX the caller is responsible for splitting the command line into separate parameters, while on Windows the command itself parses its parameters. Different programs may, and do, use different algorithms to parse them. The subprocess module uses MS C runtime rules (subprocess.list2cmdline()) to combine the args list into a command line. It is much harder for a programmer to understand how the parameters might be parsed on Windows.
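You can see the Windows-style joining for yourself: subprocess.list2cmdline is exposed on all platforms (undocumented, but it is what the module uses internally).

```python
import subprocess

# How an args list would be joined into a single Windows command line
# under MS C runtime quoting rules: arguments with spaces get quoted.
print(subprocess.list2cmdline(['echo', 'hello world']))
```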
Let's say my shell script returns a value of '19' when it is run. I'd like to store that value (without any return value of 0 or empty lines) into a variable in my python code for use later.
There are many questions here similar to mine, but I have yet to find a solution where the shell script returns '19' without an extra return value of 0 or extra newlines.
Using subprocess.call('bash TestingCode', shell=True) in the Python code prints exactly what I want; however, when I store the result of this command in a variable and then print the variable, it prints with an extra 0.
answer = subprocess.call('bash TestingCode', shell=True)
print answer
>>19
>>0
I then tried an example from this question: How to return a value from a shell script in a python script
However, it returns an extra empty line instead.
answer = subprocess.check_output('bash TestingCode', shell=True)
print answer
>>19
>>
I really appreciate the help!
UPDATE: TestingCode script
#!/bin/bash
num=19
echo $num
Just call it like this:
import subprocess
answer = subprocess.check_output('bash TestingCode', shell=True)
answer = answer.rstrip()
The reason is that your shell script is printing 19 followed by a new line. The return value from subprocess.check_output() will therefore include the new line produced by the shell script. Calling str.rstrip() will remove any trailing whitespace, leaving just '19' in this case.
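One caveat worth adding (my note, not part of the original answer): on Python 3, check_output returns bytes, so decode before stripping. Here 'echo 19' stands in for the question's TestingCode script.

```python
import subprocess

# Python 3: check_output returns bytes; decode, then strip the trailing newline.
out = subprocess.check_output(['echo', '19'])
answer = out.decode().rstrip()
print(answer)
```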
Try subprocess.Popen instead: it returns a Popen object rather than the exit status, so you won't see the extra 0 (which is subprocess.call's return code being printed).
This works for me. I suspect you have a problem in your shell script that is causing the output.
$ cat test.sh
#!/bin/sh
exit 19
(0) austin#Austins-Mac-8:~
$ python2.7
Python 2.7.10 (default, Aug 22 2015, 20:33:39)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> subprocess.call('bash test.sh', shell=True)
19
>>>
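The two answers differ because exit status and stdout are separate channels; a quick sketch, using true and echo as stand-ins for the scripts in the question:

```python
import subprocess

status = subprocess.call(['true'])             # exit status of the process
out = subprocess.check_output(['echo', '19'])  # captured stdout

print(status)                # the exit code, 0 for true
print(out.decode().strip())  # the output text
```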
I have a shell script that has a command to run a python script. I want 4 variables (for eg: var1,var2,var3,var4) from the shell script to be used in the python script. Any suggestions how to do this?
For eg: I want to replace "lastTest44", firstTest44 and A-S00000582 with variables from the shell script.
driver.find_element_by_id("findKey_input").clear()
driver.find_element_by_id("findKey_input").send_keys("lastTest44")
driver.find_element_by_id("ST_View_lastTest44, firstTest44").click()
driver.find_element_by_link_text("A-S00000582").click()
Just use command line arguments:
Shell Script
a=1
b=2
python test1.py "$a" "$b"
Python Script
import sys
var1 = sys.argv[1]
var2 = sys.argv[2]
print var1, var2
What you're looking to use are called command line arguments. These are parameters that are specified at the time of calling the particular piece of code you're looking to run.
In Python, these are accessible through the sys module under a variable called argv. This is an array of all the arguments passed in from the caller, where each value within the array is a string.
For example, say the code I'm writing takes in parameters to draw a square. This could require 4 parameters - An x coordinate, y coordinate, a width, and a height. The Python code for this might look like this:
import sys
x = sys.argv[1]
y = sys.argv[2]
width = sys.argv[3]
height = sys.argv[4]
# Some more code follows.
A few things to note:
Each argument is of type string. This means that in this case, I could not perform any sort of arithmetic until converting them into the correct types that I want.
The first entry in sys.argv is the name of the script being run. You'll want to start reading from the second position, sys.argv[1], instead of the zero-th index you'd normally use.
There is some more detailed information here, which could lead you to better ways of handling command line arguments. To get started though, this would work well enough.
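For instance, the string-to-int conversion mentioned in the first note might look like this (a sketch; the function name is mine, and the argv slice assumes the four-argument square example above):

```python
def parse_square_args(argv):
    """argv[0] is the script name; the next four entries are the
    square's parameters, which arrive as strings."""
    x, y, width, height = (int(a) for a in argv[1:5])
    return x, y, width, height
```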
I think this will do what you want:
2014-06-05 09:37:57 [tmp]$ export VAR1="a"
2014-06-05 09:38:01 [tmp]$ export VAR2="b"
2014-06-05 09:38:05 [tmp]$ export VAR3="c"
2014-06-05 09:38:08 [tmp]$ export VAR4="d"
2014-06-05 09:38:12 [tmp]$ python
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from os import environ
>>> environ['VAR1']
'a'
>>> environ['VAR2']
'b'
>>> environ['VAR3']
'c'
>>> environ['VAR4']
'd'
>>> environ['VAR5']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/UserDict.py", line 23, in __getitem__
raise KeyError(key)
KeyError: 'VAR5'
Remember to catch KeyError and respond accordingly or use the get method (from the dict class) and specify a default to be used when the key is not present:
>>> environ.get('VAR5', 'not present')
'not present'
More: https://docs.python.org/2/library/os.html
I want to add what worked in my case:
I have my variables in a file which was being sourced in a shell script, and I needed to pass those variables to Python from that same file.
I have pandas and Spark involved as well.
My expected result is to concatenate the path to pass to to_csv, which is achieved.
**Shell**:
. path/to/variable/source file # sourcing the variables
python path/to/file/extract.py "$OUTBOUND_PATH"
**Python**:
import sys
import pandas as pd
# If a Spark session is involved, import the SparkSession as well

outbound_path = sys.argv[1]  # argument collected from the shell invocation
file_name = "/report.csv"

geo.toPandas().to_csv(outbound_path + file_name, mode="w+")