Bash: Tokenize string using shell rules without eval'ing it?


I'm writing a wrapper script. The original program's arguments are in a separate file, args. The script needs to split contents of args using shell parameter rules and then run the program. A partial solution (set + eval) was offered in Splitting a string to tokens according to shell parameter rules without eval:
#!/usr/bin/env bash
STDOUT="$1"
STDERR="$2"
( set -f ; eval "set -- $(cat args)"; exec run_in_container "$@" >"$STDOUT" 2>"$STDERR" )
but in my case args is user-generated. One can easily imagine
args: echo "Hello, 'world'! $(rm -rf /)" (not cool, but harmless: commands are run in, e.g., a Docker container)
args: bash -c "$JAVA_HOME/<...> > <...> && <...>" (harmful: $JAVA_HOME was intended to be container's value of environment variable JAVA_HOME, but actually will be substituted earlier, when eval'ing the command in the wrapper script's subshell.)
I tried Python, and this works:
#!/usr/bin/env python
import shlex, subprocess, sys
with open('args', 'r') as argsfile:
    args = argsfile.read()
with open(sys.argv[1], 'w') as outfile, open(sys.argv[2], 'w') as errfile:
    exit(subprocess.call(["run_in_container"] + shlex.split(args), stdout=outfile, stderr=errfile))
Is there a way to do shlex in bash: tokenize the string using shell parameter rules, but don't substitute any variables' values, don't execute $(...) etc.?
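As a small illustration of why the Python version above is safe for user-generated input, here is a sketch that runs shlex.split on the two example lines from the question. The second line uses a hypothetical command in place of the elided one, and the tokenize helper is a name introduced only for this example. shlex.split applies POSIX quoting rules but performs no parameter expansion and no command substitution, so $JAVA_HOME and $(...) should come through as literal text.

```python
import shlex

# Example lines as they might appear in the args file.
risky = '''echo "Hello, 'world'! $(rm -rf /)"'''
deferred = 'bash -c "$JAVA_HOME/bin/java -version"'  # hypothetical command


def tokenize(line):
    """Split a line using POSIX shell quoting rules; nothing is expanded or executed."""
    return shlex.split(line)


risky_tokens = tokenize(risky)
deferred_tokens = tokenize(deferred)

print(risky_tokens)
print(deferred_tokens)
```

The resulting lists can be appended to ["run_in_container"] and handed to subprocess.call, as in the script above, so the substitutions are left for the shell inside the container.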

Related

Run a bash command with variables as a python subprocess

I have this shell command:
$ docker run -it --env-file=.env -e "CONFIG=$(cat /path/to/your/config.json | jq -r tostring)" algolia/docsearch-scraper
And I want to run it as a python subprocess.
I thought I'll only need an equivalent of the jq -r tostring, but if I use the config.json as a normal string the " don't get escaped. I also escaped them by using json.load(config.json).
With the original jq command the " don't get escaped either and it's just returning the json string.
When I use the JSON returned as a string in a Python subprocess, I always get a FileNotFoundError on the subprocess line.
@main.command()
def algolia_scrape():
    with open(f"{WORKING_DIR}/conf_dev.json") as conf:
        CONFIG = json.load(conf)
    subprocess.Popen(f'/usr/local/bin/docker -it --env-file={WORKING_DIR}/algolia.env -e "CONFIG={json.dumps(CONFIG)}" algolia/docsearch-scraper')

You get "file not found" because, without shell=True, you are trying to run a command whose name is literally /usr/local/bin/docker -it ..., when you want to run /usr/local/bin/docker with some arguments. Passing the JSON through the shell would be a nightmare, because you would need to escape every shell metacharacter in the string. Instead, break the command up into a list of strings yourself, as the shell would.
def algolia_scrape():
    with open(f"{WORKING_DIR}/conf_dev.json") as conf:
        CONFIG = json.load(conf)
    # one list element per argument; "run" is the subcommand from the shell version
    p = subprocess.Popen(['/usr/local/bin/docker', 'run', '-it',
                          f'--env-file={WORKING_DIR}/algolia.env',
                          '-e', f'CONFIG={json.dumps(CONFIG)}',
                          'algolia/docsearch-scraper'])
You generally want to save the result of subprocess.Popen() because you will need to wait for the process when it terminates.
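To see that a JSON string survives intact when it is passed as one list element, here is a minimal sketch that does not need docker. A child Python process stands in for the container, and the config contents are made up for the example. subprocess.run is used here because it waits for the child to finish.

```python
import json
import subprocess
import sys

# Made-up configuration, standing in for the contents of conf_dev.json.
config = {"index_name": "docs", "start_urls": ["https://example.com/"]}

# Stand-in for docker: a child Python process that prints its first argument.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])",
         f"CONFIG={json.dumps(config)}"]

# No shell is involved, so the quotes and braces in the JSON need no escaping.
result = subprocess.run(child, capture_output=True, text=True, check=True)
received = result.stdout.strip()
print(received)
```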

Using ssh and sed within a python script with os.system properly

I am trying to run an ssh command within a python script using os.system to add a 0 at the end of a fully matched string in a remote server using ssh and sed.
I have a file called nodelist in a remote server that's a list that looks like this.
test-node-1
test-node-2
...
test-node-11
test-node-12
test-node-13
...
test-node-21
I want to use sed to search for test-node-1 and, when a full match is found, add a 0 at the end of that line. The file must end up looking like this:
test-node-1 0
test-node-2
...
test-node-11
test-node-12
test-node-13
...
test-node-21
However, when I run the first command,
hostname = 'test-node-1'
function = 'nodelist'
os.system(f"ssh -i ~/.ssh/my-ssh-key username@serverlocation \"sed -i '/{hostname}/s/$/ 0/' ~/{function}.txt\"")
The result becomes like this,
test-node-1 0
test-node-2
...
test-node-11 0
test-node-12 0
test-node-13 0
...
test-node-21
I tried adding a \b to the command like this,
os.system(f"ssh -i ~/.ssh/my-ssh-key username@serverlocation \"sed -i '/\b{hostname}\b/s/$/ 0/' ~/{function}.txt\"")
The command doesn't work at all.
I have to manually type in the node name instead of using a variable like so,
os.system(f"ssh -i ~/.ssh/my-ssh-key username@serverlocation \"sed -i '/\btest-node-1\b/s/$/ 0/' ~/{function}.txt\"")
to make my command work.
What's wrong with my command, why can't I do what I want it to do?

This code has serious security problems; fixing them requires reengineering it from scratch. Let's do that here:
#!/usr/bin/env python3
import os.path
import shlex # note, quote is only here in Python 3.x; in 2.x it was in the pipes module
import subprocess
import sys
# can set these from a loop if you choose, of course
username = "whoever"
serverlocation = "whereever"
hostname = 'test-node-1'
function = 'somename'
desired_cmd = ['sed', '-i',
               f'/\\b{hostname}\\b/s/$/ 0/',
               f'{function}.txt']
desired_cmd_str = ' '.join(shlex.quote(word) for word in desired_cmd)
print(f"Remote command: {desired_cmd_str}", file=sys.stderr)
# could just pass the below direct to subprocess.run, but let's log what we're doing:
ssh_cmd = ['ssh', '-i', os.path.expanduser('~/.ssh/my-ssh-key'),
           f"{username}@{serverlocation}", desired_cmd_str]
ssh_cmd_str = ' '.join(shlex.quote(word) for word in ssh_cmd)
print(f"Local command: {ssh_cmd_str}", file=sys.stderr) # log equivalent shell command
subprocess.run(ssh_cmd) # but locally, run without a shell
If you run this (except for the subprocess.run at the end, which would require a real SSH key, hostname, etc), output looks like:
Remote command: sed -i '/\btest-node-1\b/s/$/ 0/' somename.txt
Local command: ssh -i /home/yourname/.ssh/my-ssh-key whoever@whereever 'sed -i '"'"'/\btest-node-1\b/s/$/ 0/'"'"' somename.txt'
That's correct/desired output; the funny '"'"' idiom is how one safely injects a literal single quote inside a single-quoted string in a POSIX-compliant shell.
What's different? Lots:
We're generating the commands we want to run as arrays, and letting Python do the work of converting those arrays to strings where necessary. This avoids shell injection attacks, a very common class of security vulnerability.
Because we're generating lists ourselves, we can change how we quote each one: We can use f-strings when it's appropriate to do so, raw strings when it's appropriate, etc.
We aren't passing ~ to the remote server. It's unnecessary, because the home directory is the default place for an SSH session to start, and the precautions we're using to prevent values from being parsed as code by a shell also prevent ~ from having any effect: replacing ~ with the value of HOME is done by the shell that invokes sed, not by sed itself. For the same reason, because we aren't invoking any local shell at all, we need os.path.expanduser for the ~ in ~/.ssh/my-ssh-key to be honored.
Because we aren't using a raw string, we need to double the backslashes in \b to ensure that they're treated as literal rather than syntactic by Python.
Critically, we're never passing data in a context where it could be parsed as code by any shell, either local or remote.
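One way to convince yourself that the quoting is sound is to round-trip it: shlex.split follows POSIX quoting rules, so it should recover the original list from the string that shlex.quote produced. The sketch below does that for the sed command and for a deliberately hostile argument list, which is made up for the example.

```python
import shlex

hostname = "test-node-1"
# A raw f-string is used here, so the backslashes in \b are not doubled.
remote_argv = ["sed", "-i", rf"/\b{hostname}\b/s/$/ 0/", "somename.txt"]

# Quote each word, then join into the single string that ssh hands to the remote shell.
remote_cmd = " ".join(shlex.quote(word) for word in remote_argv)
print(remote_cmd)

# Splitting with shell quoting rules should give back exactly what we started with.
round_trip = shlex.split(remote_cmd)

# A hostile value stays a single, inert argument after quoting.
hostile_argv = ["echo", "it's; rm -rf ~ $(id)"]
hostile_cmd = " ".join(shlex.quote(word) for word in hostile_argv)
hostile_round_trip = shlex.split(hostile_cmd)
print(hostile_cmd)
```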

python3.7 optionparser input option with asterisk(*) becomes a file in the folder

Env:
python3.7
OptionParser with a option [ add_option('-t', '--target', action='append', dest='targets') ]
OS: CentOS7.6
Problem:
So I am using this option to input a list of targets, and with this command line:
parser -t logs* -t test
There's a file "logs.tar.gz" in the directory where I run this command line.
When I print the value of targets, this is what I get:
['logs.tar.gz', 'test']
So I believe this is a system-level 'problem', and what I want to know is:
Is there any way to make logs* stay logs* in Python, without typing logs\* on the command line?

The shell is what expands the *. Python can do nothing about it, since it never gets to see logs*; it only receives the already-expanded file names.
You can force your shell to interpret the * as a literal value with some quoting:
parser -t "logs*" -t test
This works in zsh; quoting suppresses glob expansion in bash and other POSIX shells as well.
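To make the point concrete, here is a small sketch that feeds OptionParser the two argument lists the shell would produce, one for the unquoted pattern (already expanded, assuming logs.tar.gz is the only match) and one for the quoted pattern. The parser only ever sees the list it is given.

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option('-t', '--target', action='append', dest='targets')

# Unquoted logs*: the shell has replaced the pattern before Python starts.
options, _ = parser.parse_args(['-t', 'logs.tar.gz', '-t', 'test'])
expanded = options.targets

# Quoted "logs*": the shell passes the pattern through untouched.
options, _ = parser.parse_args(['-t', 'logs*', '-t', 'test'])
quoted = options.targets

print(expanded)
print(quoted)
```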

Running subprocesses command with two string inputs

I'm trying to validate a certificate with a CA bundle file. The original Bash command takes two file arguments like this;
openssl verify -CAfile ca-ssl.ca cert-ssl.crt
I'm trying to figure out how to run the above command in python subprocess whilst having ca-ssl.ca and cert-ssl.crt as variable strings (as opposed to files).
If I ran the command with variables (instead of files) in bash then this would work;
ca_value=$(<ca-ssl.ca)
cert_value=$(<cert-ssl.crt)
openssl verify -CAfile <(echo "$ca_value") <(echo "$cert_value")
However, I'm struggling to figure out how to do the above with Python, preferably without needing to use shell=True. I have tried the following, but it doesn't work and instead prints the openssl help text:
certificate = ''' cert string '''
ca_bundle = ''' ca bundle string '''
def ca_valid(cert, ca):
    ca_validation = subprocess.Popen(['openssl', 'verify', '-CAfile', ca, cert], stdin=subprocess.PIPE, stdout=subprocess.PIPE, bufsize=1)
    ca_validation_output = ca_validation.communicate()[0].strip()
    ca_validation.wait()
ca_valid(certificate, ca_bundle)
Any guidance/clues on what I need to look further into would be appreciated.

Bash process substitution <(...) in the end is supplying a file path as an argument to openssl.
You will need to make a helper function to create this functionality since Python doesn't have any operators that allow you to inline pipe data into a file and present its path:
import subprocess
def validate_ca(cert, ca):
    with filearg(ca) as ca_path, filearg(cert) as cert_path:
        ca_validation = subprocess.Popen(
            ['openssl', 'verify', '-CAfile', ca_path, cert_path],
            stdout=subprocess.PIPE,
        )
        return ca_validation.communicate()[0].strip()
Where filearg is a context manager which creates a named temporary file with your desired text, closes it, hands the path to you, and then removes it after the with scope ends.
import os
import tempfile
from contextlib import contextmanager
@contextmanager
def filearg(txt):
    # delete=False so the file survives being closed; we remove it ourselves
    with tempfile.NamedTemporaryFile('w', delete=False) as fh:
        fh.write(txt)
    try:
        yield fh.name
    finally:
        os.remove(fh.name)
Anything accessing this temporary file (like the subprocess) needs to work inside the context manager.
By the way, the Popen.wait(self) is redundant since Popen.communicate(self) waits for termination.
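The lifetime rule can be checked without openssl. In this sketch, filearg is repeated so the example is self-contained, and a child Python process that prints the file named by its first argument stands in for openssl. The path is readable inside the with block and should be gone once the block ends.

```python
import os
import subprocess
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def filearg(txt):
    # Write the text to a named temporary file, close it, and yield its path.
    with tempfile.NamedTemporaryFile('w', delete=False) as fh:
        fh.write(txt)
    try:
        yield fh.name
    finally:
        os.remove(fh.name)


# Stand-in for openssl: prints the contents of the file named by argv[1].
reader = [sys.executable, "-c", "import sys; print(open(sys.argv[1]).read())"]

with filearg("cert string") as path:
    existed = os.path.exists(path)
    inside = subprocess.run(reader + [path], capture_output=True,
                            text=True).stdout.strip()

gone = not os.path.exists(path)
print(inside, existed, gone)
```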

If you want to use process substitution, you will have to use shell=True. This is unavoidable. The <(...) process substitution syntax is bash syntax; you simply must call bash into service to parse and execute such code.
Additionally, you have to ensure that bash is invoked, as opposed to sh. On some systems sh may refer to an old Bourne shell (as opposed to the Bourne-again shell bash) in which case process substitution will definitely not work. On some systems sh will invoke bash, but process substitution will still not work, because when invoked under the name sh the bash shell enters something called POSIX mode. Here are some excerpts from the bash man page:
...
INVOCATION
... When invoked as sh, bash enters posix mode after the startup files are read. ....
...
SEE ALSO
...
http://tiswww.case.edu/~chet/bash/POSIX -- a description of posix mode
...
From the above web link:
Process substitution is not available.
On POSIX systems, /bin/sh is the default shell in Python, whether you're using os.system() or subprocess.Popen() with shell=True. So you'll have to specify the argument executable='bash', or executable='/bin/bash' if you want to specify the full path.
This is working for me:
subprocess.Popen('printf \'argument: "%s"\\n\' verify -CAfile <(echo ca_value) <(echo cert_value);', executable='bash', shell=True).wait()
## argument: "verify"
## argument: "-CAfile"
## argument: "/dev/fd/63"
## argument: "/dev/fd/62"
## 0
Here's how you can actually embed the string values from variables:
# wrap s in single quotes, escaping any embedded single quote as '\''
bashEsc = lambda s: "'" + s.replace("'", "'\\''") + "'"
ca_value = 'x'
cert_value = 'y'
cmd = 'printf \'argument: "%%s"\\n\' verify -CAfile <(echo %s) <(echo %s);' % (bashEsc(ca_value), bashEsc(cert_value))
subprocess.Popen(cmd, executable='bash', shell=True).wait()
## argument: "verify"
## argument: "-CAfile"
## argument: "/dev/fd/63"
## argument: "/dev/fd/62"
## 0

Getting console output of a Perl script through Python

There are a variety of posts and resources explaining how to use Python to get output of an outside call. I am familiar with using these--I've used Python to get output of jars and exec several times, when it was not realistic or economical to re-implement the functionality of that jar/exec inside Python itself.
I am trying to call a Perl script via Python's subprocess module, but I have had no success with this particular Perl script. I carefully followed the answers here, Call Perl script from Python, but had no results.
I was able to get the output of this test Perl script from this question/answer: How to call a Perl script from Python, piping input to it?
#!/usr/bin/perl
use strict;
use warnings;
my $name = shift;
print "Hello $name!\n";
Using this block of Python code:
import subprocess
var = "world"
args_test = ['perl', 'perl/test.prl', var]
pipe = subprocess.Popen(args_test, stdout=subprocess.PIPE)
out, err = pipe.communicate()
print out, err
However, if I swap out the arguments and the Perl script with the one I need output from, I get no output at all.
args = ['perl', 'perl/my-script.prl', '-a', 'perl/file-a.txt',
'-t', 'perl/file-t.txt', 'input.txt']
which runs correctly when entered on the command line, e.g.
>perl perl/my-script.prl -a perl/file-a.txt -t perl/file-t.txt input.txt
but this produces no output when called via subprocess:
pipe = subprocess.Popen(args, stdout=subprocess.PIPE)
out, err = pipe.communicate()
print out, err
I've done another sanity check as well. This correctly outputs the help message of Perl as a string:
import subprocess
pipe = subprocess.Popen(['perl', '-h'], stdout=subprocess.PIPE)
out, err = pipe.communicate()
print out, err
As shown here:
>>> ================================ RESTART ================================
>>>
Usage: perl [switches] [--] [programfile] [arguments]
-0[octal] specify record separator (\0, if no argument)
-a autosplit mode with -n or -p (splits $_ into @F)
-C[number/list] enables the listed Unicode features
-c check syntax only (runs BEGIN and CHECK blocks)
-d[:debugger] run program under debugger
-D[number/list] set debugging flags (argument is a bit mask or alphabets)
-e program one line of program (several -e's allowed, omit programfile)
-f don't do $sitelib/sitecustomize.pl at startup
-F/pattern/ split() pattern for -a switch (//'s are optional)
-i[extension] edit <> files in place (makes backup if extension supplied)
-Idirectory specify @INC/#include directory (several -I's allowed)
-l[octal] enable line ending processing, specifies line terminator
-[mM][-]module execute "use/no module..." before executing program
-n assume "while (<>) { ... }" loop around program
-p assume loop like -n but print line also, like sed
-P run program through C preprocessor before compilation
-s enable rudimentary parsing for switches after programfile
-S look for programfile using PATH environment variable
-t enable tainting warnings
-T enable tainting checks
-u dump core after parsing program
-U allow unsafe operations
-v print version, subversion (includes VERY IMPORTANT perl info)
-V[:variable] print configuration summary (or a single Config.pm variable)
-w enable many useful warnings (RECOMMENDED)
-W enable all warnings
-x[directory] strip off text before #!perl line and perhaps cd to directory
-X disable all warnings
None
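Note that in the snippets above err is always None, because stderr is not redirected to a pipe. One possibility worth ruling out, which the question does not confirm, is that the script writes to stderr or exits with an error. The sketch below captures stderr and the exit status as well; a child Python process that writes only to stderr stands in for the Perl script.

```python
import subprocess
import sys

# Stand-in for the Perl script: writes only to stderr and exits with status 3.
child = [sys.executable, "-c",
         "import sys; sys.stderr.write('only on stderr\\n'); sys.exit(3)"]

proc = subprocess.Popen(child,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,   # without this, err is None
                        universal_newlines=True)  # decode output as text
out, err = proc.communicate()

print(repr(out), repr(err), proc.returncode)
```

If out is empty while err or the return code is not, the script ran but reported a problem, for example a relative path that does not resolve from the working directory Python was started in.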
