CONTEXT
I am working on a simulation cluster.
To keep it as flexible as possible (it has to work with different simulation software), we created a Python script that parses a config file defining environment variables and the command line that starts the simulation. This command is launched through the SLURM sbatch command (shell $COMMAND).
ISSUE
From Python, all environment variables are set by reading the config file.
I have an issue with the variable COMMAND, which references other environment variables (written as shell variables).
For example
COMMAND = "fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE"
os.environ['COMMAND']=COMMAND
NUMPROCS = "32"
os.environ['NUMPROCS']=NUMPROCS
[...]
exe = Popen(['sbatch','template_document.sbatch'], stdout=PIPE, stderr=PIPE)
sbatch distributes COMMAND to all simulation nodes, where it is used as the command line.
COMMAND refers to other exported environment variables, but the shell treats it strictly as text, so the command line fails. The $ references stay literal instead of being expanded, for example:
'fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE'
SOLUTION I AM LOOKING FOR
I am looking for a simple solution
Solution 1: one to three lines of Python that evaluate COMMAND the way the shell would when echoing it
Solution 2: a shell command that expands the variables inside the string $COMMAND
At the end the command launched from within sbatch should be
fluent -3ddp -n32 -hosts=./hosts -file /path/to/JOBFILE
You have a few options:
Partial or no support for bash's variable substitution, e.g. implement some Python functionality that reproduces bash's $VARIABLE syntax.
Reproduce all of bash's variable substitution facilities that the config file is allowed to use ($VARIABLE, ${VARIABLE}, ${VARIABLE/x/y}, $(cmd), whatever).
Let bash do the heavy lifting, at the cost of performance and possibly security, depending on how much you trust the content of the config files.
I'll show the third one here, since it's the most resilient (again, security issues notwithstanding). Let's say you have this config file, config.py:
REGULAR = "some-text"
EQUALS = "hello = goodbye" # trap #1: search of '='
SUBST = "decorated $REGULAR"
FANCY = "xoxo${REGULAR}xoxo"
CMDOUT = "$(date)"
BASH_A = "trap" # trap #2: avoid matching variables like BASH_ARGV
QUOTES = "'\"" # trap #3: quoting
Then your python program can run the following incantation:
bash -c 'source <(sed "s/^/export /" config.py | sed "s/[[:space:]]*=[[:space:]]*/=/") && env | grep -f <(cut -d= -f1 config.py | grep -E -o "\w+" | sed "s/.*/^&=/")'
which will produce the following output:
SUBST=decorated some-text
CMDOUT=Thu Nov 28 12:18:50 PST 2019
REGULAR=some-text
QUOTES='"
FANCY=xoxosome-textxoxo
EQUALS=hello = goodbye
BASH_A=trap
Which you can then read with python, but note that the quotes are now gone, so you'll have to account for that.
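Reading that output back into Python could look roughly like this. It is a minimal sketch, assuming the KEY=value lines have already been captured as text; parse_env_output is a hypothetical helper name, and values that contain newlines are not handled.

```python
def parse_env_output(text):
    """Turn KEY=value lines (as printed by env) into a dict."""
    variables = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values such as 'hello = goodbye' survive.
        name, sep, value = line.partition("=")
        if sep:
            variables[name] = value
    return variables


# A few lines copied from the sample output above.
sample = "SUBST=decorated some-text\nREGULAR=some-text\nEQUALS=hello = goodbye\nBASH_A=trap\n"
config = parse_env_output(sample)
print(config["EQUALS"])
```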
Explanation of the incantation:
bash -c 'source ...instructions... && env | grep ...expressions...' tells bash to read & interpret the instructions, then grep the environment for the expressions. We're going to turn the config file into instructions which modify bash's environment.
If you try using set instead of env, the output will be inconsistent with respect to quoting. Using env avoids trap #3.
Instructions: We're going to create instructions of the form:
export FANCY="xoxo${REGULAR}xoxo"
so that bash can interpret them and env can read them.
sed "s/^/export /" config.py prefixes the variables with export.
sed "s/[[:space:]]*=[[:space:]]*/=/" converts the assignment format to syntax that bash can read with source. Using s/x/y/ instead of s/x/y/g avoids trap #1.
source <(...command...) causes bash to treat the output of the command as a file and run its lines, one by one.
Of course, one way to avoid this complexity is to have the file use bash syntax to begin with. If that were the case, we would use source config.sh instead of source <(...command...).
Expressions: We want to grep the output of env for patterns like ^FANCY=.
cut -d= -f1 config.py | grep -E -o "\w+" finds the variable names in config.py.
sed "s/.*/^&=/" turns variable names like FANCY to grep search expressions such as ^FANCY=. This is to avoid trap #2.
grep -f <(...command...) gets grep to treat the output of the command as a file containing one search expression in each line, which in this case would be ^FANCY=, ^CMDOUT= etc.
EDIT
Since you actually want to just pass this environment to another bash command rather than use it in python, you can actually just have python run this:
bash -c 'source <(sed "s/^/export /" config.py | sed "s/[[:space:]]*=[[:space:]]*/=/") && $COMMAND'
(assuming that COMMAND is specified in the config file).
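Letting the shell expand the variables stored inside $COMMAND can be tried out from Python as follows. This is only a sketch: it assumes bash is on the PATH, reuses the values from the question, and relies on eval, which executes whatever the string contains, so it is only appropriate for trusted config files.

```python
import os
import subprocess

# Values taken from the question; in practice they come from the config file.
env = dict(
    os.environ,
    NUMPROCS="32",
    JOBFILE="/path/to/jobfile",
    COMMAND="fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE",
)

# eval makes bash parse the content of $COMMAND a second time, which expands
# the variables it mentions; echo prints the result instead of launching it.
result = subprocess.run(
    ["bash", "-c", 'eval "echo $COMMAND"'],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
expanded = result.stdout.strip()
print(expanded)
```

Inside the batch script itself the same idea would be eval "$COMMAND", which runs the expanded command line rather than printing it.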
It seems I did not explain the issue well enough, but your third solution seems to match what I need... though so far I have not managed to adapt it.
Based on your third (bash) solution, let me state it more directly:
Let's say I have the following after running Python, and that this cannot be modified:
EXPORT COMMAND='fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE'
EXPORT JOBFILE='/path/to/jobfile'
EXPORT NUMPROCS='32'
EXPORT WHATSOEVER='SPECIFIC VARIABLE TO SIMULATION SOFTWARE'
I wish to execute the following from the slurm batch file (bash), using $COMMAND / $JOBFILE /$NUMPROCS
fluent -3ddp -n32 -hosts=./hosts -file /path/to/jobfile
Please note: I have a backup solution in Python. I managed to substitute each $VARIABLE with its value using regex substitution, on the assumption that a $VARIABLE does not itself contain another $variable. It just takes a lot of lines for what seemed to me a simple request.
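For the Python side, the standard library already covers the simple $VARIABLE and ${VARIABLE} forms through os.path.expandvars, so the regex may not be needed. A minimal sketch, assuming the variables are already in os.environ; expand_command is a hypothetical helper name, the loop is only there to cope with variables that contain other variables, and undefined variables are left untouched. Forms such as $(cmd) are not handled.

```python
import os


def expand_command(command, max_passes=10):
    """Expand $VAR and ${VAR} from os.environ, repeating until nothing changes."""
    for _ in range(max_passes):
        expanded = os.path.expandvars(command)
        if expanded == command:
            break
        command = expanded
    return command


# Values from the question.
os.environ["NUMPROCS"] = "32"
os.environ["JOBFILE"] = "/path/to/jobfile"
os.environ["COMMAND"] = expand_command(
    "fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE"
)
print(os.environ["COMMAND"])
```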
I am writing a bash script (e.g. program.sh) in which I call a Python script that reads a list of files from a directory.
The Python script (read_files.py) is as follows:
import os

def files(path):
    # Yield the names of the regular files directly inside `path`.
    for filename in os.listdir(path):
        if os.path.isfile(os.path.join(path, filename)):
            yield filename

for filename in files('/home/testfiles'):
    print(filename)
Now I want to keep the string filename and use it in the bash script.
e.g.
program.sh:
#!/bin/bash
python read_files.py
$Database_maindir/filename
.
.
.
How could I keep the string filename (the names of files in the directory) and write a loop in order to execute commands in bash script for each filename?
The Python script in the question doesn't do anything that Bash cannot already do all by itself, and simpler and easier. Use simple native Bash instead:
shopt -s nullglob
for path in /home/testfiles/*; do
    if [[ -f "$path" ]]; then
        filename=$(basename "$path")
        echo "do something with $filename"
    fi
done
If the Python script does something more than what you wrote in the question,
for example it does some complex computation and spits out filenames,
which would be complicated to do in Bash,
then you do have a legitimate use case to keep it.
In that case, you can iterate over the lines in the output like this:
python read_files.py | while read -r filename; do
    echo "do something with $filename"
done
Are you looking for something like this?
for filename in $(python read_files.py); do
    someCommand "$filename"
done
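If the per-file work can be driven from Python instead, the loop can live there and no file names have to cross back into bash. A minimal sketch, assuming each file only needs one external command; regular_files is a hypothetical helper name and echo stands in for the real command.

```python
import os
import subprocess


def regular_files(directory):
    """Yield the names of the regular files directly inside directory."""
    for name in sorted(os.listdir(directory)):
        if os.path.isfile(os.path.join(directory, name)):
            yield name


directory = "/home/testfiles"  # path from the question
if os.path.isdir(directory):
    for name in regular_files(directory):
        # Passing a list avoids word splitting on names that contain spaces.
        subprocess.run(["echo", "do something with", name], check=True)
```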
I am trying to run a bash while loop inside a Python3.6 script. What I have tried so far is:
subprocess.run(args=['while [ <condition> ]; do <command> done;'])
I get the following error:
FileNotFoundError: [Errno 2] No such file or directory
Is there a way to run such a while loop inside Python?
The part that's tripping you up is providing the args as a list. From the documentation:
If the cmd argument to popen2 functions is a string, the command is executed through /bin/sh. If it is a list, the command is directly executed.
This seems to do what you want:
subprocess.run('i=0; while [ $i -lt 3 ]; do i=`expr $i + 1`; echo $i; done', shell=True)
Notice it's specified as a string instead of a list.
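The list form also works as long as the shell is named explicitly as the program to run. A small sketch, assuming bash is on the PATH; the loop is the same counter as above, written with $((...)) instead of expr.

```python
import subprocess

script = 'i=0; while [ "$i" -lt 3 ]; do i=$((i + 1)); echo "$i"; done'

# List form: the first element is the program, the rest are its arguments,
# so the whole loop is handed to bash as a single -c argument.
result = subprocess.run(
    ["bash", "-c", script],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout, end="")
```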
Running a bash for loop from Python 3.x is very much like running a while loop.
#!/bin/bash
for i in */; do
    zip -r "${i%/}.zip" "$i"
done
This iterates over the current directory and zips every subdirectory. To run the above bash script from Python:
import subprocess
subprocess.run('for i in */; do zip -r "${i%/}.zip" "$i"; done', shell=True)
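The same job can also be done without a shell. The following is a sketch using shutil.make_archive from the standard library rather than the zip command, so the resulting archives may differ in details such as compression settings; zip_subdirectories is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def zip_subdirectories(parent):
    """Create <name>.zip next to every non-hidden subdirectory of parent."""
    parent = Path(parent)
    archives = []
    for entry in sorted(parent.iterdir()):
        if entry.is_dir() and not entry.name.startswith("."):
            # base_dir keeps the directory name as a prefix inside the archive,
            # similar to zip -r "dir.zip" "dir/".
            archive = shutil.make_archive(
                str(entry), "zip", root_dir=str(parent), base_dir=entry.name
            )
            archives.append(Path(archive))
    return archives
```

Calling zip_subdirectories(".") would then correspond to running the loop in the current directory.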
I have a Bash script that needs to know its full path. I'm trying to find a broadly-compatible way of doing that without ending up with relative or funky-looking paths. I only need to support Bash, not sh, csh, etc.
What I've found so far:
1. The accepted answer to Getting the source directory of a Bash script from within addresses getting the path of the script via dirname $0, which is fine, but that may return a relative path (like .), which is a problem if you want to change directories in the script and have the path still point to the script's directory. Still, dirname will be part of the puzzle.
2. The accepted answer to Bash script absolute path with OS X (OS X specific, but the answer works regardless) gives a function that will test to see if $0 looks relative and if so will pre-pend $PWD to it. But the result can still have relative bits in it (although overall it's absolute); for instance, if the script is t in the directory /usr/bin and you're in /usr and you type bin/../bin/t to run it (yes, that's convoluted), you end up with /usr/bin/../bin as the script's directory path. Which works, but...
3. The readlink solution on this page, which looks like this:
# Absolute path to this script. /home/user/bin/foo.sh
SCRIPT=$(readlink -f $0)
# Absolute path this script is in. /home/user/bin
SCRIPTPATH=`dirname $SCRIPT`
But readlink isn't POSIX and apparently the solution relies on GNU's readlink where BSD's won't work for some reason (I don't have access to a BSD-like system to check).
So, various ways of doing it, but they all have their caveats.
What would be a better way? Where "better" means:
Gives me the absolute path.
Takes out funky bits even when invoked in a convoluted way (see comment on #2 above). (E.g., at least moderately canonicalizes the path.)
Relies only on Bash-isms or things that are almost certain to be on most popular flavors of *nix systems (GNU/Linux, BSD and BSD-like systems like OS X, etc.).
Avoids calling external programs if possible (e.g., prefers Bash built-ins).
(Updated, thanks for the heads up, wich) It doesn't have to resolve symlinks (in fact, I'd kind of prefer it left them alone, but that's not a requirement).
Here's what I've come up with (edit: plus some tweaks provided by sfstewman, levigroker, Kyle Strand, and Rob Kennedy), that seems to mostly fit my "better" criteria:
SCRIPTPATH="$( cd -- "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )"
That SCRIPTPATH line seems particularly roundabout, but we need it rather than SCRIPTPATH=`pwd` in order to properly handle spaces and symlinks.
The inclusion of output redirection (>/dev/null 2>&1) handles the rare(?) case where cd might produce output that would interfere with the surrounding $( ... ) capture. (Such as cd being overridden to also ls a directory after switching to it.)
Note also that esoteric situations, such as executing a script that isn't coming from a file in an accessible file system at all (which is perfectly possible), is not catered to there (or in any of the other answers I've seen).
The -- after cd and before "$0" are in case the directory starts with a -.
I'm surprised that the realpath command hasn't been mentioned here. My understanding is that it is widely portable / ported.
Your initial solution becomes:
SCRIPT=$(realpath "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
And to leave symbolic links unresolved per your preference:
SCRIPT=$(realpath -s "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
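For comparison, Python's standard library draws the same distinction: os.path.abspath normalizes the path textually and leaves symlinks alone (roughly what realpath -s does), whereas os.path.realpath resolves them. A small sketch, assuming a POSIX system where symlinks can be created; the directory and file names are made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    real_dir = os.path.join(tmp, "real")
    link_dir = os.path.join(tmp, "link")
    os.mkdir(real_dir)
    os.symlink(real_dir, link_dir)  # link -> real

    script = os.path.join(link_dir, ".", "t.sh")  # hypothetical script path
    logical = os.path.abspath(script)    # symlink kept:     .../link/t.sh
    physical = os.path.realpath(script)  # symlink resolved: .../real/t.sh

print(os.path.basename(os.path.dirname(logical)))
print(os.path.basename(os.path.dirname(physical)))
```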
The simplest way that I have found to get a full canonical path in Bash is to use cd and pwd:
ABSOLUTE_PATH="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/$(basename "${BASH_SOURCE[0]}")"
Using ${BASH_SOURCE[0]} instead of $0 produces the same behavior regardless of whether the script is invoked as <name> or source <name>.
I just had to revisit this issue today and found Get the source directory of a Bash script from within the script itself:
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
There's more variants at the linked answer, e.g. for the case where the script itself is a symlink.
Get the absolute path of a shell script
It does not use the -f option in readlink, and it should therefore work on BSD/Mac OS X.
Supports
source ./script (When called by the . dot operator)
Absolute path /path/to/script
Relative path like ./script
/path/dir1/../dir2/dir3/../script
When called from symlink
When symlink is nested eg) foo->dir1/dir2/bar bar->./../doe doe->script
When the caller changes the script's name
I am looking for corner cases where this code does not work. Please let me know.
Code
pushd . > /dev/null
SCRIPT_PATH="${BASH_SOURCE[0]}";
while([ -h "${SCRIPT_PATH}" ]); do
    cd "`dirname "${SCRIPT_PATH}"`"
    SCRIPT_PATH="$(readlink "`basename "${SCRIPT_PATH}"`")";
done
cd "`dirname "${SCRIPT_PATH}"`" > /dev/null
SCRIPT_PATH="`pwd`";
popd > /dev/null
echo "script=[${SCRIPT_PATH}]"
echo "pwd =[`pwd`]"
Known issues
The script must be on disk somewhere, even if that is over a network. If you try to run this script from a pipe, it will not work:
wget -o /dev/null -O - http://host.domain/dir/script.sh |bash
Technically speaking, it is undefined. Practically speaking, there is no sane way to detect this. (A co-process can not access the environment of the parent.)
Use:
SCRIPT_PATH=$(dirname `which $0`)
which prints to standard output the full path of the executable that would be run if the given argument were entered at the shell prompt (which is what $0 contains).
dirname strips the non-directory suffix from a file name.
Hence you end up with the full path of the script, no matter if the path was specified or not.
As realpath is not installed per default on my Linux system, the following works for me:
SCRIPT="$(readlink --canonicalize-existing "$0")"
SCRIPTPATH="$(dirname "$SCRIPT")"
$SCRIPT will contain the real file path to the script and $SCRIPTPATH the real path of the directory containing the script.
Before using this read the comments of this answer.
Easy to read? Below is an alternative. It ignores symlinks
#!/bin/bash
currentDir=$(
    cd "$(dirname "$0")"
    pwd
)
echo -n "current "
pwd
echo script $currentDir
Since I posted the above answer a couple years ago, I've evolved my practice to using this linux specific paradigm, which properly handles symlinks:
ORIGIN=$(dirname $(readlink -f $0))
Simply:
BASEDIR=$(readlink -f $0 | xargs dirname)
Fancy operators are not needed.
You may try to define the following variable:
CWD="$(cd -P -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd -P)"
Or you can try the following function in Bash:
realpath () {
    [[ $1 = /* ]] && echo "$1" || echo "$PWD/${1#./}"
}
This function takes one argument. If the argument is already an absolute path, it is printed as is; otherwise $PWD plus the filename argument (without the ./ prefix) is printed.
Related:
Bash script absolute path with OS X
Get the source directory of a Bash script from within the script itself
Answering this question very late, but I use:
SCRIPT=$( readlink -m $( type -p ${0} )) # Full path to script handling Symlinks
BASE_DIR=`dirname "${SCRIPT}"` # Directory script is run in
NAME=`basename "${SCRIPT}"` # Actual name of script even if linked
We have placed our own product realpath-lib on GitHub for free and unencumbered community use.
Shameless plug but with this Bash library you can:
get_realpath <absolute|relative|symlink|local file>
This function is the core of the library:
function get_realpath() {
    if [[ -f "$1" ]]
    then
        # file *must* exist
        if cd "$(echo "${1%/*}")" &>/dev/null
        then
            # file *may* not be local
            # exception is ./file.ext
            # try 'cd .; cd -;' *works!*
            local tmppwd="$PWD"
            cd - &>/dev/null
        else
            # file *must* be local
            local tmppwd="$PWD"
        fi
    else
        # file *cannot* exist
        return 1 # failure
    fi
    # reassemble realpath
    echo "$tmppwd"/"${1##*/}"
    return 0 # success
}
It doesn't require any external dependencies, just Bash 4+. Also contains functions to get_dirname, get_filename, get_stemname and validate_path validate_realpath. It's free, clean, simple and well documented, so it can be used for learning purposes too, and no doubt can be improved. Try it across platforms.
Update: After some review and testing we have replaced the above function with something that achieves the same result (without using dirname, only pure Bash) but with better efficiency:
function get_realpath() {
    [[ ! -f "$1" ]] && return 1 # failure : file does not exist.
    [[ -n "$no_symlinks" ]] && local pwdp='pwd -P' || local pwdp='pwd' # do symlinks.
    echo "$( cd "$( echo "${1%/*}" )" 2>/dev/null; $pwdp )"/"${1##*/}" # echo result.
    return 0 # success
}
This also includes an environment setting no_symlinks that provides the ability to resolve symlinks to the physical system. By default it keeps symlinks intact.
Considering this issue again: there is a very popular solution that is referenced within this thread that has its origin here:
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
I have stayed away from this solution because of the use of dirname - it can present cross-platform difficulties, particularly if a script needs to be locked down for security reasons. But as a pure Bash alternative, how about using:
DIR="$( cd "$( echo "${BASH_SOURCE[0]%/*}" )" && pwd )"
Would this be an option?
If we use Bash I believe this is the most convenient way as it doesn't require calls to any external commands:
THIS_PATH="${BASH_SOURCE[0]}";
THIS_DIR=$(dirname $THIS_PATH)
The accepted solution has the drawback (for me) of not being "source-able":
if you call it via "source ../../yourScript", $0 would be "bash"!
The following function (for bash >= 3.0) gives me the right path, however the script is called (directly or through source, with an absolute or a relative path):
(by "right path", I mean the full absolute path of the script being called, even when called from another path, directly or with "source")
#!/bin/bash
echo $0 executed
function bashscriptpath() {
    local _sp=$1
    local ascript="$0"
    local asp="$(dirname $0)"
    #echo "b1 asp '$asp', b1 ascript '$ascript'"
    if [[ "$asp" == "." && "$ascript" != "bash" && "$ascript" != "./.bashrc" ]] ; then asp="${BASH_SOURCE[0]%/*}"
    elif [[ "$asp" == "." && "$ascript" == "./.bashrc" ]] ; then asp=$(pwd)
    else
        if [[ "$ascript" == "bash" ]] ; then
            ascript=${BASH_SOURCE[0]}
            asp="$(dirname $ascript)"
        fi
        #echo "b2 asp '$asp', b2 ascript '$ascript'"
        if [[ "${ascript#/}" != "$ascript" ]]; then asp=$asp ;
        elif [[ "${ascript#../}" != "$ascript" ]]; then
            asp=$(pwd)
            while [[ "${ascript#../}" != "$ascript" ]]; do
                asp=${asp%/*}
                ascript=${ascript#../}
            done
        elif [[ "${ascript#*/}" != "$ascript" ]]; then
            if [[ "$asp" == "." ]] ; then asp=$(pwd) ; else asp="$(pwd)/${asp}"; fi
        fi
    fi
    eval $_sp="'$asp'"
}
bashscriptpath H
export H=${H}
The key is to detect the "source" case and to use ${BASH_SOURCE[0]} to get back the actual script.
One liner
`dirname $(realpath $0)`
Bourne shell (sh) compliant way:
SCRIPT_HOME=`dirname $0 | while read a; do cd $a && pwd && break; done`
Perhaps the accepted answer to the following question may be of help.
How can I get the behavior of GNU's readlink -f on a Mac?
Given that you just want to canonicalize the name you get from concatenating $PWD and $0 (assuming that $0 is not absolute to begin with), just use a series of regex replacements along the line of abs_dir=${abs_dir//\/.\//\/} and such.
Yes, I know it looks horrible, but it'll work and is pure Bash.
Try this:
cd $(dirname $([ -L $0 ] && readlink -f $0 || echo $0))
I have used the following approach successfully for a while (not on OS X though), and it only uses a shell built-in and handles the 'source foobar.sh' case as far as I have seen.
One issue with the (hastily put together) example code below is that the function uses $PWD which may or may not be correct at the time of the function call. So that needs to be handled.
#!/bin/bash
function canonical_path() {
    # Handle relative vs absolute path
    [ ${1:0:1} == '/' ] && x=$1 || x=$PWD/$1
    # Change to dirname of x
    cd ${x%/*}
    # Combine new pwd with basename of x
    echo $(pwd -P)/${x##*/}
    cd $OLDPWD
}
echo $(canonical_path "${BASH_SOURCE[0]}")
type [
type cd
type echo
type pwd
Just for the hell of it I've done a bit of hacking on a script that does things purely textually, purely in Bash. I hope I caught all the edge cases.
Note that the ${var//pat/repl} that I mentioned in the other answer doesn't work since you can't make it replace only the shortest possible match, which is a problem for replacing /foo/../ as e.g. /*/../ will take everything before it, not just a single entry. And since these patterns aren't really regexes I don't see how that can be made to work. So here's the nicely convoluted solution I came up with, enjoy. ;)
By the way, let me know if you find any unhandled edge cases.
#!/bin/bash
canonicalize_path() {
    local path="$1"
    OIFS="$IFS"
    IFS=$'/'
    read -a parts < <(echo "$path")
    IFS="$OIFS"
    local i=${#parts[@]}
    local j=0
    local back=0
    local -a rev_canon
    while (($i > 0)); do
        ((i--))
        case "${parts[$i]}" in
            ""|.) ;;
            ..) ((back++));;
            *) if (($back > 0)); then
                   ((back--))
               else
                   rev_canon[j]="${parts[$i]}"
                   ((j++))
               fi;;
        esac
    done
    while (($j > 0)); do
        ((j--))
        echo -n "/${rev_canon[$j]}"
    done
    echo
}
canonicalize_path "/.././..////../foo/./bar//foo/bar/.././bar/../foo/bar/./../..//../foo///bar/"
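The purely textual idea is easy to cross-check in Python. The sketch below is not a translation of the bash function: it walks the components forwards with a stack instead of scanning backwards with a counter, and it assumes the input is an absolute path. The last line compares the result against posixpath.normpath from the standard library.

```python
import posixpath


def canonicalize_path(path):
    """Textually resolve '.', '..' and repeated slashes in an absolute path."""
    kept = []
    for part in path.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if kept:
                kept.pop()  # a '..' at the root is simply dropped
        else:
            kept.append(part)
    return "/" + "/".join(kept)


messy = "/.././..////../foo/./bar//foo/bar/.././bar/../foo/bar/./../..//../foo///bar/"
print(canonicalize_path(messy))
print(canonicalize_path(messy) == posixpath.normpath(messy))
```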
Yet another way to do this:
shopt -s extglob
selfpath=$0
selfdir=${selfpath%%+([!/])}
while [[ -L "$selfpath" ]];do
    selfpath=$(readlink "$selfpath")
    if [[ ! "$selfpath" =~ ^/ ]];then
        selfpath=${selfdir}${selfpath}
    fi
    selfdir=${selfpath%%+([!/])}
done
echo $selfpath $selfdir
More simply, this is what works for me:
MY_DIR=`dirname $0`
source $MY_DIR/_inc_db.sh
In a directory with 30 CSV files, running:
find . -name "*.csv" | (xargs python ~/script.py)
How can I have python properly run on each file passed by xargs? I do print sys.stdin and it's just one file. I try for file in stdin loop, but there's nothing there. What am I missing?
In fact xargs does not pass anything to stdin. It passes everything it reads from its own stdin as arguments to the command you give it.
You can debug your command invocation with an echo:
find . -name "*.csv" | (xargs echo python ./script.py)
You will see all your files output on one line.
So to access your files from the argument list in Python, use this in your script:
import sys
for argument in sys.argv[1:]:
    print(argument)
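A slightly fuller version of such a script might look like this. It is a sketch under the assumption that each argument is the path of a CSV file; count_rows is a hypothetical stand-in for whatever the real script does with each file.

```python
import csv
import sys


def count_rows(path):
    """Return the number of rows in one CSV file."""
    with open(path, newline="") as handle:
        return sum(1 for _ in csv.reader(handle))


def main(argv):
    # argv[0] is the script name; the file names appended by xargs follow it.
    for path in argv[1:]:
        print(f"{path}: {count_rows(path)} rows")


if __name__ == "__main__":
    main(sys.argv)
```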
By default xargs passes as many file names as fit to a single invocation; with xargs -n 1, script.py is run exactly once for each CSV file:
python ~/script.py file1.csv
python ~/script.py file2.csv
python ~/script.py file3.csv
python ~/script.py file4.csv
etc
If you want to run it like
python ~/script.py file1.csv file2.csv file3.csv
then do
python ~/script.py `find . -name "*.csv"`
or
python ~/script.py `ls *.csv`
(the " may have to be escaped, not sure)
EDIT: note the difference between ` and '