subprocess cp leaves some files empty

I'm trying to copy some files from one directory to another. I want all files in one directory to end up in the root of another directory.
This command does exactly what I want when I run it in the terminal:
cp -rv ./src/CopyPasteIntoBuildDir/* ./build-root/src/
This line of Python, however, copies most of the files just like the above command, but it leaves some of the new files empty. Specifically, files in subdirectories are left empty.
subprocess.check_call("cp -rv ./src/CopyPasteIntoBuildDir/* ./build-root/src/", shell=True)
It creates the files if they're not there, and it truncates them if they are.
What is going on?

Assuming that you've decided to use cp rather than native Python operations:
This code will be much more reliable if you write it so that it doesn't invoke a shell at all. To avoid the need for /* on the source (and its side effects, e.g. failing on directories whose expanded list of names exceeds the ARG_MAX limit on combined environment and command-line size), use . as the last element of the name of the directory whose contents are to be copied, instead of passing a wildcard that needs to be expanded by a shell.
subprocess.check_call(["cp", "-R", "--", '%s/.' % src, dest])
cp -R is used rather than cp -rv because -R, unlike -r, is POSIX-standardized (and thus portable across all compliant UNIX-like platforms).
Demonstrating In Action (copy/pasteable code)
tempdir=$(mktemp -d -t testdir.XXXXXX)
trap 'rm -rf "$tempdir"' EXIT
cd "$tempdir"
mkdir -p ./src/CopyPasteIntoBuildDir/subdir-1 ./build-root/src/
touch ./src/CopyPasteIntoBuildDir/file-1
touch ./src/CopyPasteIntoBuildDir/subdir-1/file-2
script='
import sys, subprocess
src = sys.argv[1]
dest = sys.argv[2]
subprocess.check_call(["cp", "-R", "--", "%s/." % src, dest])
'
python -c "$script" ./src/CopyPasteIntoBuildDir ./build-root/src/
find ./build-root -type f -print
rm -rf "$tempdir"
...emits output akin to:
./build-root/src/file-1
./build-root/src/subdir-1/file-2
...showing that content was correctly recursively copied with no prefix.
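For comparison, here is a minimal sketch of the native-Python route that this answer sets aside. It assumes Python 3.8 or newer (for the dirs_exist_ok parameter of shutil.copytree); the function name copy_contents is made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_contents(src, dest):
    """Recursively copy everything inside src into dest (like `cp -R src/. dest`)."""
    # dirs_exist_ok=True lets dest already exist; files with the same name
    # are overwritten, other files already in dest are left alone.
    return shutil.copytree(src, dest, dirs_exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "src", "CopyPasteIntoBuildDir")
        dest = Path(tmp, "build-root", "src")
        (src / "subdir-1").mkdir(parents=True)
        dest.mkdir(parents=True)
        (src / "file-1").write_text("one\n")
        (src / "subdir-1" / "file-2").write_text("two\n")

        copy_contents(src, dest)
        # List what ended up under build-root/src, relative to the temp dir.
        for path in sorted(p for p in dest.rglob("*") if p.is_file()):
            print(path.relative_to(tmp))
```

No shell and no external cp are involved, so there is no glob to expand and no argument-length limit to hit.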

So apparently this is a problem with sh. Using bash instead worked.
subprocess.check_call("cp -rv ./src/CopyPasteIntoBuildDir/* ./build-root/src/", shell=True, executable="/bin/bash")
EDIT: See accepted answer!

Related

env command not working with find command

I'm trying to write a script:
env PYTHONPATH=$PYTHONPATH: $Dir/scripts find * -name '*.py' -exec pylint {} \; | grep . && exit 1
The code is finding all scripts in the root directory instead of using the environment variable I set. How can I write this so that it only looks for files in the directory I set as the value of PYTHONPATH?

env PYTHONPATH=$PYTHONPATH: $Dir/scripts isn't doing what you think it's doing. Including $PYTHONPATH carries over the former value of PYTHONPATH, meaning whatever it is already set to, or an empty string if it is unset. The space after the colon also ends the assignment, so env treats $Dir/scripts as the command to run. It looks like what you want would be env PYTHONPATH=$Dir/scripts, but there's actually an easier way.
If you have __init__.py files in your directory, you can just do pylint ./some-directory. If you don't, you can use xargs: find . -type f -name "*.py" | xargs pylint. If you wanted to pass the directory instead of have it coded to . (your current calling directory) you could do that too:
# set directory to first argument
dir="$1"
# check if "dir" was actually provided, if not, set to `.`
if [ -z "$dir" ]; then dir=.; fi
find "$dir" -type f -name "*.py" | xargs pylint
You could save that in a script or function and then call it either with a directory (like run-pylint-on-everything.sh ~/foo/bar) or without one, in which case it would run starting from your current shell location.
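The same idea can be sketched in Python, in case the wrapper ends up living next to the code it lints. This is only an illustration: the function names find_python_files and run_pylint are made up here, and it assumes pylint is installed for the interpreter that runs the script.

```python
import subprocess
import sys
from pathlib import Path


def find_python_files(root="."):
    """Return every *.py file under root, like `find root -type f -name '*.py'`."""
    return sorted(str(p) for p in Path(root).rglob("*.py") if p.is_file())


def run_pylint(root="."):
    """Run pylint once over all files found and return its exit status."""
    files = find_python_files(root)
    if not files:
        return 0  # nothing to lint
    # Assumption: pylint is importable by this interpreter.
    return subprocess.call([sys.executable, "-m", "pylint", *files])
```

Calling run_pylint("scripts") would lint everything under scripts/, and run_pylint() with no argument starts from the current directory, mirroring the default in the shell version.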

There's no space in the PYTHONPATH value; that was a typo. I want to run the command on the CLI instead of in a script.

Bash script which sets pythonpath

The mssql-cli uses the following bash script to execute the actual Python script. As I understand the code, the while loop determines the directory of the executed script, and this path then gets added to PYTHONPATH.
There are no .py files in the current directory, so why is the path added to PYTHONPATH? Could someone please explain to me what the first part of the script is doing? Thank you for helping me out here.
#!/bin/bash
SOURCE="${BASH_SOURCE[0]}"
while [ -h "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
SOURCE="$(readlink "$SOURCE")"
[[ $SOURCE != \/* ]] && SOURCE="$DIR/$SOURCE" # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done
DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
# Set the python io encoding to UTF-8 by default if not set.
if [ -z ${PYTHONIOENCODING+x} ]; then export PYTHONIOENCODING=utf8; fi
export PYTHONPATH="${DIR}:${PYTHONPATH}"
python -m mssqlcli.main "$@"

Wonder if this is still relevant for you, but I've marked it as a fun puzzle to revisit later... Long story short: it adds the directory where this script file is located, with all symbolic links resolved (neither the resulting filename nor any directory leading up to it is a symbolic link), to PYTHONPATH.
It's basically the same thing as doing so with readlink (or realpath):
export PYTHONPATH="$(dirname "$(readlink -f "${BASH_SOURCE[0]}")"):${PYTHONPATH}"
Line by line dissection:
SOURCE="${BASH_SOURCE[0]}"
This sets SOURCE to the path with which this script was called or sourced.
while [ -h "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
We enter the while loop if the SOURCE path refers to a symbolic link: on the first iteration, if this file itself is a symbolic link; on later iterations, if that link points to another link.
DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
This changes into the directory containing SOURCE, resolving symbolic links along the way (a slightly simplified description of -P), so it lands in the directory the link(s) pointed to, and then prints the working directory after that change as an absolute path. All of that happens in a subshell, and the result is assigned to the variable DIR.
SOURCE="$(readlink "$SOURCE")"
SOURCE is assigned the path that results from resolving the symlink: literally the target the link points to (as shown by, for instance, ls -l), which may be relative or absolute.
[[ $SOURCE != \/* ]] && SOURCE="$DIR/$SOURCE" # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
If the SOURCE value we have obtained by symbolic link resolution does not begin with / (i.e. it is a relative path), DIR (the directory holding the SOURCE with which we entered the loop) and the resolved link target SOURCE are joined with / to form a new SOURCE (making it an absolute path), and we go back to the top of the loop. NOTE: escaping / with \ seems unnecessary in this case.
done
When done, SOURCE points to a file that is not a symbolic link. Its path may still contain symbolic links at this point, which is taken care of in the next step.
DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
Once more, as in the loop. DIR should now point to the directory where the resolved (non-symlink) SOURCE file (the target of what was originally called or sourced) resides.
# Set the python io encoding to UTF-8 by default if not set.
if [ -z ${PYTHONIOENCODING+x} ]; then export PYTHONIOENCODING=utf8; fi
Exports PYTHONIOENCODING only if the variable is unset. ${PYTHONIOENCODING+x} expands to x when the variable is set, even to an empty string, and to nothing when it is unset, so the -z test succeeds only in the unset case. NOTE: this differs from ${PYTHONIOENCODING:+x}, which also expands to nothing when the variable is set but empty; the form without the colon leaves an explicitly empty value alone.
export PYTHONPATH="${DIR}:${PYTHONPATH}"
PYTHONPATH now starts with the absolute, fully resolved path (no symbolic links anywhere along it) of the directory where this very script (or the file the link points to) resides.
python -m mssqlcli.main "$@"
Runs the mssqlcli.main module with Python, passing all of the script's arguments through.
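To make the loop easier to follow, here is a rough Python transcription of the same steps. It is a sketch rather than anything mssql-cli ships: the name resolve_script_dir is invented, os.path.realpath stands in for cd -P ... && pwd, and the demo builds a throwaway chain of relative symlinks.

```python
import os
import tempfile


def resolve_script_dir(source):
    """Follow symlinks on the final path component, then return its real directory."""
    while os.path.islink(source):
        # DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
        link_dir = os.path.realpath(os.path.dirname(source))
        # SOURCE="$(readlink "$SOURCE")"
        target = os.readlink(source)
        # A relative target is relative to the directory holding the link.
        source = target if os.path.isabs(target) else os.path.join(link_dir, target)
    # Final DIR=...: resolve any symlinks left in the directory part.
    return os.path.realpath(os.path.dirname(source))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "real"))
        os.makedirs(os.path.join(tmp, "links"))
        script = os.path.join(tmp, "real", "script.sh")
        open(script, "w").close()
        # links/a -> ../real/script.sh (relative), links/b -> a
        os.symlink(os.path.join("..", "real", "script.sh"), os.path.join(tmp, "links", "a"))
        os.symlink("a", os.path.join(tmp, "links", "b"))

        found = resolve_script_dir(os.path.join(tmp, "links", "b"))
        print(found == os.path.dirname(os.path.realpath(script)))
```

In everyday Python code, os.path.dirname(os.path.realpath(path)) gives the same directory in one call, which is the counterpart of the readlink -f one-liner above.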

Fastest way to merge a directory tree

I have multiple directories of the form
foo/bar/baz/alpha_1/beta/gamma/files/uniqueFile1
foo/bar/baz/alpha_2/beta/gamma/files/uniqueFile2
foo/bar/baz/alpha_3/beta/gamma/files/uniqueFile3
What is the fastest way to merge these directories to a single directory structure like
foo/bar/baz/alpha/beta/gamma/files/uniqueFile1...uniqueFile3
I could write a Python script to do that, but is there a faster way to do it on a Debian machine? Can rsync help in this case?
EDIT:
Apologies for not making it clear earlier: the depth in the examples is ~10-12 and I do not know some of the directory names, such as alpha*; these are randomly generated while logs are written out. I was using find with wildcards to list these files earlier, but now another level has been added to the path, which caused my find queries to go from 0.004s to over a minute. So I am looking for a faster solution.
/known_fixed_path_5_levels/*/known_name*/*/fixed_path_2_levels/n_unique_files
has become
/known_fixed_path_5_levels/*/known_name*/*/xx*/fixed_path_2_levels/unique_file_1
/known_fixed_path_5_levels/*/known_name*/*/xx*/fixed_path_2_levels/unique_file_2
.
.
/known_fixed_path_5_levels/*/known_name*/*/xx*/fixed_path_2_levels/unique_file_n
I basically want to collect all those unique files into one place like how it was before.

With find:
mkdir --parents foo/bar/baz/alpha/beta/gamma/files  # create target directory if necessary
find foo/bar/baz/alpha_[1-3]/beta/gamma/files -type f -exec cp {} foo/bar/baz/alpha/beta/gamma/files \;

As the question is not clear about copying or moving, here are two ways that avoid copying. Even the second one doesn't actually copy your data!
Simple bash command
Simply:
cd foo/bar/baz
mv -it alpha/beta/gamma/files alpha_*/beta/gamma/files/uniqueFile*
with the -i switch to prevent overwriting.
This will work perfectly for a small number of files.
More robust and adaptive find syntax
Or by using find:
cd foo/bar/baz
find alpha_* -mindepth 3 -type f -exec mv -it alpha/beta/gamma/files {} +
Advantages of using find:
you can add many tests such as -name, -mtime and so on
find will never pass more files to the command (mv) than the command line can hold.
cp -al: a UN*X-specific concept
Under Un*x, you can create hard links, which are not symbolic links but additional entries in the directory tree for the same inode.
Note: as both names must reference the same inode, this works only within a single filesystem.
By using
cp -ialt alpha/beta/gamma/files alpha_*/beta/gamma/files/uniqueFile*
you get references to all the inodes in one directory while keeping only one copy of each file's data.
Using bash's globstar feature:
cd foo/bar/baz
shopt -s globstar
cp -alit alpha/beta/gamma/files alpha_*/**/uniqueFile*
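Since the question mentions falling back to a Python script, here is a minimal sketch of the hard-link approach in Python. The function name merge_with_hardlinks and the demo layout are illustrative only. It skips names that already exist in the target instead of prompting, and like cp -al it only works within one filesystem.

```python
import glob
import os
import tempfile


def merge_with_hardlinks(pattern, target_dir):
    """Hard-link every regular file matching pattern into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    linked = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue
        dest = os.path.join(target_dir, os.path.basename(path))
        if os.path.exists(dest):
            continue  # never overwrite an existing name
        os.link(path, dest)  # new directory entry, same inode, no data copied
        linked.append(dest)
    return linked


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for n in (1, 2, 3):
            d = os.path.join(tmp, "alpha_%d" % n, "beta", "gamma", "files")
            os.makedirs(d)
            open(os.path.join(d, "uniqueFile%d" % n), "w").close()

        pattern = os.path.join(tmp, "alpha_*", "beta", "gamma", "files", "*")
        target = os.path.join(tmp, "alpha", "beta", "gamma", "files")
        for dest in merge_with_hardlinks(pattern, target):
            print(os.path.relpath(dest, tmp))
```

The glob still has to walk the wildcard levels, so this is unlikely to be faster than find at locating the files; what it saves is the data copy.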

Relative Path issue in shell script [duplicate]

This question already has answers here:
How do I get the directory where a Bash script is located from within the script itself?
(74 answers)
Closed 6 years ago.
I have a Bash script that needs to know its full path. I'm trying to find a broadly-compatible way of doing that without ending up with relative or funky-looking paths. I only need to support Bash, not sh, csh, etc.
What I've found so far:
1. The accepted answer to Getting the source directory of a Bash script from within addresses getting the path of the script via dirname $0, which is fine, but that may return a relative path (like .), which is a problem if you want to change directories in the script and have the path still point to the script's directory. Still, dirname will be part of the puzzle.
2. The accepted answer to Bash script absolute path with OS X (OS X specific, but the answer works regardless) gives a function that will test to see if $0 looks relative and if so will pre-pend $PWD to it. But the result can still have relative bits in it (although overall it's absolute). For instance, if the script is t in the directory /usr/bin and you're in /usr and you type bin/../bin/t to run it (yes, that's convoluted), you end up with /usr/bin/../bin as the script's directory path. Which works, but...
3. The readlink solution on this page, which looks like this:
# Absolute path to this script. /home/user/bin/foo.sh
SCRIPT=$(readlink -f $0)
# Absolute path this script is in. /home/user/bin
SCRIPTPATH=`dirname $SCRIPT`
But readlink isn't POSIX and apparently the solution relies on GNU's readlink where BSD's won't work for some reason (I don't have access to a BSD-like system to check).
So, various ways of doing it, but they all have their caveats.
What would be a better way? Where "better" means:
1. Gives me the absolute path.
2. Takes out funky bits even when invoked in a convoluted way (see comment on #2 above). (E.g., at least moderately canonicalizes the path.)
3. Relies only on Bash-isms or things that are almost certain to be on most popular flavors of *nix systems (GNU/Linux, BSD and BSD-like systems like OS X, etc.).
4. Avoids calling external programs if possible (e.g., prefers Bash built-ins).
5. (Updated, thanks for the heads up, wich) It doesn't have to resolve symlinks (in fact, I'd kind of prefer it left them alone, but that's not a requirement).

Here's what I've come up with (edit: plus some tweaks provided by sfstewman, levigroker, Kyle Strand, and Rob Kennedy), that seems to mostly fit my "better" criteria:
SCRIPTPATH="$( cd -- "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )"
That SCRIPTPATH line seems particularly roundabout, but we need it rather than SCRIPTPATH=`pwd` in order to properly handle spaces and symlinks.
The inclusion of output redirection (>/dev/null 2>&1) handles the rare(?) case where cd might produce output that would interfere with the surrounding $( ... ) capture. (Such as cd being overridden to also ls a directory after switching to it.)
Note also that esoteric situations, such as executing a script that isn't coming from a file in an accessible file system at all (which is perfectly possible), is not catered to there (or in any of the other answers I've seen).
The -- after cd and before "$0" are in case the directory starts with a -.

I'm surprised that the realpath command hasn't been mentioned here. My understanding is that it is widely portable / ported.
Your initial solution becomes:
SCRIPT=$(realpath "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
And to leave symbolic links unresolved per your preference:
SCRIPT=$(realpath -s "$0")
SCRIPTPATH=$(dirname "$SCRIPT")

The simplest way that I have found to get a full canonical path in Bash is to use cd and pwd:
ABSOLUTE_PATH="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/$(basename "${BASH_SOURCE[0]}")"
Using ${BASH_SOURCE[0]} instead of $0 produces the same behavior regardless of whether the script is invoked as <name> or source <name>.

I just had to revisit this issue today and found Get the source directory of a Bash script from within the script itself:
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
There are more variants at the linked answer, e.g. for the case where the script itself is a symlink.

Get the absolute path of a shell script
It does not use the -f option in readlink, and it should therefore work on BSD/Mac OS X.
Supports
source ./script (When called by the . dot operator)
Absolute path /path/to/script
Relative path like ./script
/path/dir1/../dir2/dir3/../script
When called from symlink
When the symlink is nested, e.g. foo->dir1/dir2/bar bar->./../doe doe->script
When the caller changes the script's name
I am looking for corner cases where this code does not work. Please let me know.
Code
pushd . > /dev/null
SCRIPT_PATH="${BASH_SOURCE[0]}";
while([ -h "${SCRIPT_PATH}" ]); do
cd "`dirname "${SCRIPT_PATH}"`"
SCRIPT_PATH="$(readlink "`basename "${SCRIPT_PATH}"`")";
done
cd "`dirname "${SCRIPT_PATH}"`" > /dev/null
SCRIPT_PATH="`pwd`";
popd > /dev/null
echo "script=[${SCRIPT_PATH}]"
echo "pwd =[`pwd`]"
Known issues
The script must be on disk somewhere, even if that is over a network. If you try to run this script from a pipe, it will not work:
wget -o /dev/null -O - http://host.domain/dir/script.sh |bash
Technically speaking, it is undefined. Practically speaking, there is no sane way to detect this. (A co-process can not access the environment of the parent.)

Use:
SCRIPT_PATH=$(dirname `which $0`)
which prints to standard output the full path of the executable that would have been executed when the passed argument had been entered at the shell prompt (which is what $0 contains)
dirname strips the non-directory suffix from a file name.
Hence you end up with the full path of the script, no matter if the path was specified or not.

As realpath is not installed per default on my Linux system, the following works for me:
SCRIPT="$(readlink --canonicalize-existing "$0")"
SCRIPTPATH="$(dirname "$SCRIPT")"
$SCRIPT will contain the real file path to the script and $SCRIPTPATH the real path of the directory containing the script.
Before using this read the comments of this answer.

Easy to read? Below is an alternative. It ignores symlinks
#!/bin/bash
currentDir=$(
cd $(dirname "$0")
pwd
)
echo -n "current "
pwd
echo script $currentDir
Since I posted the above answer a couple of years ago, I've moved to this Linux-specific approach, which properly handles symlinks:
ORIGIN=$(dirname $(readlink -f $0))

Simply:
BASEDIR=$(readlink -f $0 | xargs dirname)
Fancy operators are not needed.

You may try to define the following variable:
CWD="$(cd -P -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd -P)"
Or you can try the following function in Bash:
realpath () {
[[ $1 = /* ]] && echo "$1" || echo "$PWD/${1#./}"
}
This function takes one argument. If the argument already has an absolute path, print it as it is, otherwise print $PWD variable + filename argument (without ./ prefix).
Related:
Bash script absolute path with OS X
Get the source directory of a Bash script from within the script itself

Answering this question very late, but I use:
SCRIPT=$( readlink -m $( type -p ${0} )) # Full path to script handling Symlinks
BASE_DIR=`dirname "${SCRIPT}"` # Directory script is run in
NAME=`basename "${SCRIPT}"` # Actual name of script even if linked

We have placed our own product realpath-lib on GitHub for free and unencumbered community use.
Shameless plug but with this Bash library you can:
get_realpath <absolute|relative|symlink|local file>
This function is the core of the library:
function get_realpath() {
if [[ -f "$1" ]]
then
# file *must* exist
if cd "$(echo "${1%/*}")" &>/dev/null
then
# file *may* not be local
# exception is ./file.ext
# try 'cd .; cd -;' *works!*
local tmppwd="$PWD"
cd - &>/dev/null
else
# file *must* be local
local tmppwd="$PWD"
fi
else
# file *cannot* exist
return 1 # failure
fi
# reassemble realpath
echo "$tmppwd"/"${1##*/}"
return 0 # success
}
It doesn't require any external dependencies, just Bash 4+. Also contains functions to get_dirname, get_filename, get_stemname and validate_path validate_realpath. It's free, clean, simple and well documented, so it can be used for learning purposes too, and no doubt can be improved. Try it across platforms.
Update: After some review and testing we have replaced the above function with something that achieves the same result (without using dirname, only pure Bash) but with better efficiency:
function get_realpath() {
[[ ! -f "$1" ]] && return 1 # failure : file does not exist.
[[ -n "$no_symlinks" ]] && local pwdp='pwd -P' || local pwdp='pwd' # do symlinks.
echo "$( cd "$( echo "${1%/*}" )" 2>/dev/null; $pwdp )"/"${1##*/}" # echo result.
return 0 # success
}
This also includes an environment setting no_symlinks that provides the ability to resolve symlinks to the physical system. By default it keeps symlinks intact.

Considering this issue again: there is a very popular solution that is referenced within this thread that has its origin here:
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
I have stayed away from this solution because of the use of dirname - it can present cross-platform difficulties, particularly if a script needs to be locked down for security reasons. But as a pure Bash alternative, how about using:
DIR="$( cd "$( echo "${BASH_SOURCE[0]%/*}" )" && pwd )"
Would this be an option?

If we use Bash, I believe this is the most convenient way, as BASH_SOURCE is built in and dirname is the only external command called:
THIS_PATH="${BASH_SOURCE[0]}";
THIS_DIR=$(dirname "$THIS_PATH")

The accepted solution has the drawback (for me) of not being "source-able":
if you call it via "source ../../yourScript", $0 would be "bash"!
The following function (for bash >= 3.0) gives me the right path, however the script is called (directly or through source, with an absolute or a relative path):
(by "right path", I mean the full absolute path of the script being called, even when called from another path, directly or with "source")
#!/bin/bash
echo $0 executed
function bashscriptpath() {
local _sp=$1
local ascript="$0"
local asp="$(dirname $0)"
#echo "b1 asp '$asp', b1 ascript '$ascript'"
if [[ "$asp" == "." && "$ascript" != "bash" && "$ascript" != "./.bashrc" ]] ; then asp="${BASH_SOURCE[0]%/*}"
elif [[ "$asp" == "." && "$ascript" == "./.bashrc" ]] ; then asp=$(pwd)
else
if [[ "$ascript" == "bash" ]] ; then
ascript=${BASH_SOURCE[0]}
asp="$(dirname $ascript)"
fi
#echo "b2 asp '$asp', b2 ascript '$ascript'"
if [[ "${ascript#/}" != "$ascript" ]]; then asp=$asp ;
elif [[ "${ascript#../}" != "$ascript" ]]; then
asp=$(pwd)
while [[ "${ascript#../}" != "$ascript" ]]; do
asp=${asp%/*}
ascript=${ascript#../}
done
elif [[ "${ascript#*/}" != "$ascript" ]]; then
if [[ "$asp" == "." ]] ; then asp=$(pwd) ; else asp="$(pwd)/${asp}"; fi
fi
fi
eval $_sp="'$asp'"
}
bashscriptpath H
export H=${H}
The key is to detect the "source" case and to use ${BASH_SOURCE[0]} to get back the actual script.

One liner
`dirname $(realpath $0)`

Bourne shell (sh) compliant way:
SCRIPT_HOME=`dirname $0 | while read a; do cd $a && pwd && break; done`

Perhaps the accepted answer to the following question may be of help.
How can I get the behavior of GNU's readlink -f on a Mac?
Given that you just want to canonicalize the name you get from concatenating $PWD and $0 (assuming that $0 is not absolute to begin with), just use a series of regex replacements along the line of abs_dir=${abs_dir//\/.\//\/} and such.
Yes, I know it looks horrible, but it'll work and is pure Bash.

Try this:
cd $(dirname $([ -L $0 ] && readlink -f $0 || echo $0))

I have used the following approach successfully for a while (not on OS X though), and it only uses a shell built-in and handles the 'source foobar.sh' case as far as I have seen.
One issue with the (hastily put together) example code below is that the function uses $PWD which may or may not be correct at the time of the function call. So that needs to be handled.
#!/bin/bash
function canonical_path() {
# Handle relative vs absolute path
[ ${1:0:1} == '/' ] && x=$1 || x=$PWD/$1
# Change to dirname of x
cd ${x%/*}
# Combine new pwd with basename of x
echo $(pwd -P)/${x##*/}
cd $OLDPWD
}
echo $(canonical_path "${BASH_SOURCE[0]}")
type [
type cd
type echo
type pwd

Just for the hell of it I've done a bit of hacking on a script that does things purely textually, purely in Bash. I hope I caught all the edge cases.
Note that the ${var//pat/repl} that I mentioned in the other answer doesn't work since you can't make it replace only the shortest possible match, which is a problem for replacing /foo/../ as e.g. /*/../ will take everything before it, not just a single entry. And since these patterns aren't really regexes I don't see how that can be made to work. So here's the nicely convoluted solution I came up with, enjoy. ;)
By the way, let me know if you find any unhandled edge cases.
#!/bin/bash
canonicalize_path() {
local path="$1"
OIFS="$IFS"
IFS=$'/'
read -a parts < <(echo "$path")
IFS="$OIFS"
local i=${#parts[@]}
local j=0
local back=0
local -a rev_canon
while (($i > 0)); do
((i--))
case "${parts[$i]}" in
""|.) ;;
..) ((back++));;
*) if (($back > 0)); then
((back--))
else
rev_canon[j]="${parts[$i]}"
((j++))
fi;;
esac
done
while (($j > 0)); do
((j--))
echo -n "/${rev_canon[$j]}"
done
echo
}
canonicalize_path "/.././..////../foo/./bar//foo/bar/.././bar/../foo/bar/./../..//../foo///bar/"
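For readers who find the reverse scan hard to follow, the same purely textual clean-up can be sketched in Python. Note that this is a forward, stack-based variant rather than a line-for-line port of the Bash above, and it assumes an absolute POSIX path; the standard library's posixpath.normpath does much the same job.

```python
import posixpath


def canonicalize_path(path):
    """Textually normalise an absolute POSIX path without touching the filesystem."""
    stack = []
    for part in path.split("/"):
        if part in ("", "."):
            continue  # empty parts come from repeated or trailing slashes
        if part == "..":
            if stack:
                stack.pop()  # drop the previous component
            # at the root, ".." stays at the root
        else:
            stack.append(part)
    return "/" + "/".join(stack)


if __name__ == "__main__":
    messy = ("/.././..////../foo/./bar//foo/bar/.././bar/../foo/bar"
             "/./../..//../foo///bar/")
    print(canonicalize_path(messy))   # /foo/bar/foo/bar
    print(posixpath.normpath(messy))  # /foo/bar/foo/bar
```

Like the Bash version, this never looks at the filesystem, so a .. that follows a symlinked directory is collapsed textually and may not match where the kernel would actually end up.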

Yet another way to do this:
shopt -s extglob
selfpath=$0
selfdir=${selfpath%%+([!/])}
while [[ -L "$selfpath" ]];do
selfpath=$(readlink "$selfpath")
if [[ ! "$selfpath" =~ ^/ ]];then
selfpath=${selfdir}${selfpath}
fi
selfdir=${selfpath%%+([!/])}
done
echo $selfpath $selfdir

More simply, this is what works for me:
MY_DIR=`dirname $0`
source $MY_DIR/_inc_db.sh

bash globbing in subprocess

If I have a directory in which I want to delete all but one file, I might do this in bash:
cd /tmp/a
rm -rf !(specialfile)
cd -
Translating this to the most obvious Python code fails for me:
>>> subprocess.Popen( 'cd /tmp/a; rm -rf !(specialfile); cd -', stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True).communicate()
with this message:
('', "/bin/sh: -c: line 0: syntax error near unexpected token `('\n/bin/sh: -c: line 0: `cd /tmp/a; rm -rf !(specialfile); cd -'\n")
The next best thing in Python seems to be:
p = '/tmp/a'
for i in os.listdir( p ):
    if i != 'specialfile':
        os.remove( os.path.join( p, i ) )
but of course this doesn't handle files and subdirectories equally well. Is there a better way?

Update: As @isedev and OP @JohnSchmitt point out in comments, subprocess.Popen with shell=True invokes sh, not bash (and sh may or may not be bash, depending on the platform), but use of the extended pattern matching operator !(...) requires (a) bash with (b) the extglob option turned on (see below for background).
Thus, the answer is to:
invoke bash explicitly with a command string passed via the -c command-line option.
turn on the extglob shell option, via the -O command-line option (without it, the glob !(specialfile) triggers the syntax error the OP encountered).
Borrowing from @JohnSchmitt's own comment, we get:
subprocess.Popen("bash -O extglob -c 'cd /tmp/a; rm -rf !(file2); cd -'",
stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True).communicate()
(The less elegant alternative is to add shopt -s extglob; to the bash command string, before the rm command.)
Background:
!(specialfile) is an instance of an extended pattern matching operator (see man bash, section Pattern Matching); these extended operators are by default NOT enabled; shopt -s extglob enables them (shopt -u extglob disables them).
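A variation worth considering, sketched here under the assumption that bash is on PATH: pass an argument list so that no outer sh is involved, and hand the file name to bash as $1 instead of pasting it into the command string. The helper names are made up for illustration, and the !("$1") form is my adaptation, not something taken from the comments.

```python
import subprocess


def extglob_rm_command(keep):
    """Build an argv that removes everything except `keep` in the current directory."""
    # The file name travels as $1, so it needs no shell quoting.
    # "_" only fills the $0 slot of the inline script.
    return ["bash", "-O", "extglob", "-c", 'rm -rf -- !("$1")', "_", keep]


def remove_all_but_with_bash(directory, keep):
    # cwd= replaces the `cd /tmp/a; ...; cd -` sequence from the question.
    subprocess.check_call(extglob_rm_command(keep), cwd=directory)
```

Usage would look like remove_all_but_with_bash("/tmp/a", "specialfile"); check_call raises CalledProcessError if rm reports a failure.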

You can use os.walk as @Bakuriu mentioned. It is very important to traverse the directory tree from the bottom up, so that each directory is already empty when you reach it, except for the ones on the path to 'specialfile'. Those can't be removed, which is why os.rmdir needs the try clause.
import os
top = '/tmp/a'  # directory to clean out
for root, dirs, files in os.walk(top, topdown=False):
    for name in files:
        if name != 'specialfile':
            os.remove(os.path.join(root, name))
    for name in dirs:
        try:
            os.rmdir(os.path.join(root, name))
        except OSError:
            # directory still holds 'specialfile' (or a parent of it)
            pass
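If the goal is to mirror rm -rf !(specialfile) exactly, meaning only the top-level entry named specialfile is spared and everything else goes regardless of type, a shorter sketch is possible. The function name remove_all_but is illustrative, not an existing API.

```python
import os
import shutil
import tempfile


def remove_all_but(directory, keep):
    """Delete every entry directly inside directory except the one named keep."""
    with os.scandir(directory) as it:
        entries = list(it)  # snapshot first, then modify the directory
    for entry in entries:
        if entry.name == keep:
            continue
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)  # real directory: remove recursively
        else:
            os.remove(entry.path)      # file or symlink (even one to a directory)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "specialfile"), "w").close()
        open(os.path.join(tmp, "junk"), "w").close()
        os.makedirs(os.path.join(tmp, "subdir", "nested"))
        remove_all_but(tmp, "specialfile")
        print(os.listdir(tmp))  # ['specialfile']
```

Unlike the os.walk version, a file called specialfile inside a subdirectory is not protected here, which matches what the shell glob does.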
