Moving all contents of a directory to another in Python - python

I've been trying to figure this out for hours with no luck. I have a list of directories, each with its own subdirectories and files. I'm trying to traverse all of them and move their contents to a specific location. I tried shutil and glob but couldn't get them to work. I also tried running shell commands with subprocess.call, and that did not work either. I understand it failed because I didn't apply it properly, but I couldn't find any solution that moves all contents of one directory to another.
files = glob.glob('Food101-AB/*/')
dest = 'Food-101/'
if not os.path.exists(dest):
    os.makedirs(dest)
subprocess.call("mv Food101-AB/* Food-101/", shell=True)
# for child in files:
#     shutil.move(child, dest)
I'm trying to move everything in Food101-AB to Food-101

The shutil module of the standard library is the way to go:
>>> import shutil
>>> shutil.move("Food101-AB", "Food-101")
If you don't want to move Food101-AB folder itself, try using this:
import shutil
import os
for i in os.listdir("Food101-AB"):
    shutil.move(os.path.join("Food101-AB", i), "Food-101")
For more information about move function:
https://docs.python.org/3/library/shutil.html#shutil.move
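The loop above can be wrapped in a small helper. The following is only a sketch: the function name move_contents is made up for illustration, and the demo runs inside a temporary directory with invented file names rather than the real Food101-AB data.

```python
import shutil
import tempfile
from pathlib import Path


def move_contents(src, dest):
    """Move every entry inside src into dest, leaving src itself in place."""
    src, dest = Path(src), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # Snapshot the listing first so the directory is not modified while iterating.
    for entry in list(src.iterdir()):
        # shutil.move handles both plain files and whole subdirectories.
        moved.append(shutil.move(str(entry), str(dest)))
    return moved


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "Food101-AB")
    (source / "apple_pie").mkdir(parents=True)
    (source / "apple_pie" / "1.jpg").touch()
    (source / "README.txt").touch()

    move_contents(source, Path(tmp, "Food-101"))
    remaining = sorted(p.name for p in source.iterdir())
    arrived = sorted(p.name for p in Path(tmp, "Food-101").iterdir())
```

After the call, the source directory still exists but is empty, and both the file and the subdirectory sit under the destination.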

Try changing the call function to run, so that you can retrieve the stdout, stderr and return code of your shell command:
from subprocess import run, CalledProcessError
source_dir = "full/path/to/src/folder"
dest_dir = "full/path/to/dest/folder"
try:
    res = run(["mv", source_dir, dest_dir], check=True, capture_output=True)
except CalledProcessError as ex:
    print(ex.stdout, ex.stderr, ex.returncode)

Related

How to suppress a folder or a file with os module [duplicate]

How can I delete a file or folder?
os.remove() removes a file.
os.rmdir() removes an empty directory.
shutil.rmtree() deletes a directory and all its contents.
Path objects from the Python 3.4+ pathlib module also expose these instance methods:
pathlib.Path.unlink() removes a file or symbolic link.
pathlib.Path.rmdir() removes an empty directory.
Python syntax to delete a file
import os
os.remove("/tmp/<file_name>.txt")
or
import os
os.unlink("/tmp/<file_name>.txt")
or
the pathlib library (Python 3.4+):
import pathlib
file_to_rem = pathlib.Path("/tmp/<file_name>.txt")
file_to_rem.unlink()
Path.unlink(missing_ok=False)
The unlink method removes a file or symbolic link.
If missing_ok is false (the default), FileNotFoundError is raised if the path does not exist.
If missing_ok is true, FileNotFoundError exceptions will be ignored (same behavior as the POSIX rm -f command).
Changed in version 3.8: The missing_ok parameter was added.
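On interpreters older than 3.8, where missing_ok is not available, catching FileNotFoundError gives much the same effect. A minimal sketch, assuming a hypothetical helper name unlink_quietly and a temporary file created only for the demo:

```python
import tempfile
from pathlib import Path


def unlink_quietly(path):
    """Delete path if it exists; return True if something was removed."""
    try:
        Path(path).unlink()
        return True
    except FileNotFoundError:
        # Same outcome as unlink(missing_ok=True): a missing file is not an error.
        return False


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp, "demo.txt")
    target.touch()
    first = unlink_quietly(target)   # the file existed and was removed
    second = unlink_quietly(target)  # already gone, no exception raised
```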
Best practice
First, check if the file or folder exists and then delete it. You can achieve this in two ways:
Check first with os.path.isfile("/path/to/file").
Use exception handling.
EXAMPLE for os.path.isfile
#!/usr/bin/python
import os
myfile = "/tmp/foo.txt"
# If file exists, delete it.
if os.path.isfile(myfile):
    os.remove(myfile)
else:
    # Otherwise, inform the user.
    print("Error: %s file not found" % myfile)
Exception Handling
#!/usr/bin/python
import os
# Get input (on Python 2, use raw_input instead of input).
myfile = input("Enter file name to delete: ")
# Try to delete the file.
try:
    os.remove(myfile)
except OSError as e:
    # If it fails, inform the user.
    print("Error: %s - %s." % (e.filename, e.strerror))
Respective output
Enter file name to delete : demo.txt
Error: demo.txt - No such file or directory.
Enter file name to delete : rrr.txt
Error: rrr.txt - Operation not permitted.
Enter file name to delete : foo.txt
Python syntax to delete a folder
shutil.rmtree()
Example for shutil.rmtree()
#!/usr/bin/python
import shutil
# Get directory name (on Python 2, use raw_input instead of input).
mydir = input("Enter directory name: ")
# Try to remove the tree; if it fails, report the error.
try:
    shutil.rmtree(mydir)
except OSError as e:
    print("Error: %s - %s." % (e.filename, e.strerror))
Use
shutil.rmtree(path[, ignore_errors[, onerror]])
(See complete documentation on shutil) and/or
os.remove
and
os.rmdir
(Complete documentation on os.)
Here is a robust function that uses both os.remove and shutil.rmtree:
import os
import shutil
def remove(path):
    """ param <path> could either be relative or absolute. """
    if os.path.isfile(path) or os.path.islink(path):
        os.remove(path)  # remove the file or symlink
    elif os.path.isdir(path):
        shutil.rmtree(path)  # remove dir and all its contents
    else:
        raise ValueError("file {} is not a file or dir.".format(path))
You can use the built-in pathlib module (requires Python 3.4+, but there are backports for older versions on PyPI: pathlib, pathlib2).
To remove a file there is the unlink method:
import pathlib
path = pathlib.Path(name_of_file)
path.unlink()
Or the rmdir method to remove an empty folder:
import pathlib
path = pathlib.Path(name_of_folder)
path.rmdir()
Deleting a file or folder in Python
There are multiple ways to delete a file in Python; the most common are the following:
os.remove() removes a file.
os.unlink() removes a file. It is the Unix name for the same operation as remove().
shutil.rmtree() deletes a directory and all its contents.
pathlib.Path.unlink() deletes a single file. The pathlib module is available in Python 3.4 and above.
os.remove()
Example 1: Basic Example to Remove a File Using os.remove() Method.
import os
os.remove("test_file.txt")
print("File removed successfully")
Example 2: Checking if File Exists using os.path.isfile and Deleting it With os.remove
import os
# Check whether the file exists
if os.path.isfile("test.txt"):
    # os.remove() removes the file
    os.remove("test.txt")
    # Print a confirmation message
    print("File Deleted successfully")
else:
    # Show a message instead of throwing an error
    print("File does not exist")
Example 3: Python Program to Delete all files with a specific extension
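import os
# A raw string keeps the backslashes literal. A raw string cannot end in a
# backslash, so the separator is added by os.path.join instead.
my_path = r'C:\Python Pool\Test'
for file_name in os.listdir(my_path):
    if file_name.endswith('.txt'):
        os.remove(os.path.join(my_path, file_name))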
Example 4: Python Program to Delete All Files Inside a Folder
To delete all files inside a particular directory, you simply have to use the * symbol as the pattern string.
# Import the os and glob modules
import os, glob
# Loop through everything in the folder and delete the files one by one
for file in glob.glob("pythonpool/*"):
    # os.remove() cannot delete directories, so skip them
    if os.path.isfile(file):
        os.remove(file)
        print("Deleted " + str(file))
os.unlink()
os.unlink() is an alias of os.remove(); unlink is the traditional Unix name for removing a file.
Note: os.unlink() and os.remove() have the same functionality and syntax. Both delete the file at the given path.
Both are functions in the os module of Python's standard library.
shutil.rmtree()
Example 1: Python Program to Delete a Directory Using shutil.rmtree()
import shutil
import os
# location
location = "E:/Projects/PythonPool/"
# directory
dir = "Test"
# path
path = os.path.join(location, dir)
# removing directory
shutil.rmtree(path)
Example 2: Python Program to Delete a Directory Using shutil.rmtree() (condensed, without comments)
import shutil
import os
location = "E:/Projects/PythonPool/"
dir = "Test"
path = os.path.join(location, dir)
shutil.rmtree(path)
pathlib.Path.rmdir() to remove Empty Directory
The pathlib module provides different ways to interact with your files. rmdir() is a Path method that deletes an empty folder. First create a Path() for the directory, then call rmdir(). If the folder is empty, it is deleted; otherwise an OSError is raised and nothing is removed.
This is a good way to delete empty folders without any fear of losing actual data.
from pathlib import Path
q = Path('foldername')
q.rmdir()
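To see that rmdir() refuses to touch a folder that still has content, here is a small sketch. It works in a temporary directory, and the file name data.txt is invented for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp, "foldername")
    folder.mkdir()
    (folder / "data.txt").touch()

    # rmdir() on a non-empty folder raises OSError and deletes nothing.
    try:
        folder.rmdir()
        refused = False
    except OSError:
        refused = True
    still_there = (folder / "data.txt").exists()

    # Once the folder is empty, rmdir() succeeds.
    (folder / "data.txt").unlink()
    folder.rmdir()
    removed = not folder.exists()
```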
How do I delete a file or folder in Python?
For Python 3, to remove the file and directory individually, use the unlink and rmdir Path object methods respectively:
from pathlib import Path
dir_path = Path.home() / 'directory'
file_path = dir_path / 'file'
file_path.unlink() # remove file
dir_path.rmdir() # remove directory
Note that you can also use relative paths with Path objects, and you can check your current working directory with Path.cwd.
For removing individual files and directories in Python 2, see the section so labeled below.
To remove a directory with contents, use shutil.rmtree, and note that this is available in Python 2 and 3:
from shutil import rmtree
rmtree(dir_path)
Demonstration
New in Python 3.4 is the Path object.
Let's use one to create a directory and file to demonstrate usage. Note that we use the / to join the parts of the path; this works around issues between operating systems and issues from using backslashes on Windows (where you'd need to either double up your backslashes like \\ or use raw strings, like r"foo\bar"):
from pathlib import Path
# .home() is new in 3.5, otherwise use os.path.expanduser('~')
directory_path = Path.home() / 'directory'
directory_path.mkdir()
file_path = directory_path / 'file'
file_path.touch()
and now:
>>> file_path.is_file()
True
Now let's delete them. First the file:
>>> file_path.unlink() # remove file
>>> file_path.is_file()
False
>>> file_path.exists()
False
We can use globbing to remove multiple files - first let's create a few files for this:
>>> (directory_path / 'foo.my').touch()
>>> (directory_path / 'bar.my').touch()
Then just iterate over the glob pattern:
>>> for each_file_path in directory_path.glob('*.my'):
...     print(f'removing {each_file_path}')
...     each_file_path.unlink()
...
removing ~/directory/foo.my
removing ~/directory/bar.my
Now, demonstrating removing the directory:
>>> directory_path.rmdir() # remove directory
>>> directory_path.is_dir()
False
>>> directory_path.exists()
False
What if we want to remove a directory and everything in it?
For this use-case, use shutil.rmtree
Let's recreate our directory and file:
file_path.parent.mkdir()
file_path.touch()
and note that rmdir fails unless it's empty, which is why rmtree is so convenient:
>>> directory_path.rmdir()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "~/anaconda3/lib/python3.6/pathlib.py", line 1270, in rmdir
    self._accessor.rmdir(self)
  File "~/anaconda3/lib/python3.6/pathlib.py", line 387, in wrapped
    return strfunc(str(pathobj), *args)
OSError: [Errno 39] Directory not empty: '/home/username/directory'
Now, import rmtree and pass the directory to the function:
from shutil import rmtree
rmtree(directory_path) # remove everything
and we can see the whole thing has been removed:
>>> directory_path.exists()
False
Python 2
If you're on Python 2, there's a backport of the pathlib module called pathlib2, which can be installed with pip:
$ pip install pathlib2
And then you can alias the library to pathlib
import pathlib2 as pathlib
Or just directly import the Path object (as demonstrated here):
from pathlib2 import Path
If that's too much, you can remove files with os.remove or os.unlink
from os import unlink, remove
from os.path import join, expanduser
remove(join(expanduser('~'), 'directory/file'))
or
unlink(join(expanduser('~'), 'directory/file'))
and you can remove directories with os.rmdir:
from os import rmdir
rmdir(join(expanduser('~'), 'directory'))
Note that there is also os.removedirs. It only removes empty directories recursively, but it may suit your use-case.
This is my function for deleting dirs. The "path" requires the full pathname.
import os
def rm_dir(path):
    cwd = os.getcwd()
    if not os.path.exists(os.path.join(cwd, path)):
        return False
    os.chdir(os.path.join(cwd, path))
    for file in os.listdir():
        print("file = " + file)
        os.remove(file)
    print(cwd)
    os.chdir(cwd)
    os.rmdir(os.path.join(cwd, path))
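shutil.rmtree is synchronous: it returns only once the removal has finished, or raises an exception.
If you still want to confirm that the path is gone (for example, on Windows a directory can linger briefly while another process holds a handle to it), you can poll in a while loop:
import os
import shutil
import time
shutil.rmtree(path)
while os.path.exists(path):
    time.sleep(0.1)  # avoid a busy loop
print('done')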
import os
folder = '/Path/to/yourDir/'
# Delete the files in folder and in its immediate subdirectories (one level deep)
for f in os.listdir(folder):
    filePath = os.path.join(folder, f)
    if os.path.isfile(filePath):
        os.remove(filePath)
    elif os.path.isdir(filePath):
        for f1 in os.listdir(filePath):
            insideFilePath = os.path.join(filePath, f1)
            if os.path.isfile(insideFilePath):
                os.remove(insideFilePath)
For deleting files:
os.unlink(path, *, dir_fd=None)
or
os.remove(path, *, dir_fd=None)
Both functions are semantically identical. They remove (delete) the file at path. If path is a directory, an exception is raised.
For deleting folders:
shutil.rmtree(path, ignore_errors=False, onerror=None)
or
os.rmdir(path, *, dir_fd=None)
In order to remove whole directory trees, shutil.rmtree() can be used. os.rmdir only works when the directory is empty and exists.
For deleting folders recursively towards parent:
os.removedirs(name)
It removes the leaf directory and then each empty parent directory in turn, stopping at the first parent that still has content.
For example, os.removedirs('abc/xyz/pqr') will remove the directories 'abc/xyz/pqr', 'abc/xyz' and 'abc', in that order, if they are empty.
For more info check official doc: os.unlink , os.remove, os.rmdir , shutil.rmtree, os.removedirs
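The upward pruning done by os.removedirs can be sketched as follows. The demo runs in a temporary directory, and the sentinel file keep.txt and the file notes.txt are assumptions added only so the pruning has somewhere to stop.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # A sentinel file keeps `base` non-empty, so the upward pruning stops there.
    open(os.path.join(base, "keep.txt"), "w").close()

    # Case 1: every directory on the way up is empty, so all three are removed.
    os.makedirs(os.path.join(base, "abc", "xyz", "pqr"))
    os.removedirs(os.path.join(base, "abc", "xyz", "pqr"))
    abc_gone = not os.path.exists(os.path.join(base, "abc"))

    # Case 2: abc/ holds a file, so only the empty directories below it are removed.
    os.makedirs(os.path.join(base, "abc", "xyz", "pqr"))
    open(os.path.join(base, "abc", "notes.txt"), "w").close()
    os.removedirs(os.path.join(base, "abc", "xyz", "pqr"))
    abc_kept = os.path.isdir(os.path.join(base, "abc"))
    xyz_gone = not os.path.exists(os.path.join(base, "abc", "xyz"))
```

Because the function keeps walking up until it meets a non-empty parent, it is worth being sure which ancestors may be empty before calling it.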
To remove all files in folder
import os
import glob
files = glob.glob(os.path.join('path/to/folder/*'))
# Or, to match only the csv files in the folder:
# files = glob.glob(os.path.join('path/to/folder/*.csv'))
for file in files:
    os.remove(file)
To remove all folders in a directory
from shutil import rmtree
import os
# A relative path is resolved against the current working directory.
for dirct in os.listdir('path/to/folder'):
    full_path = os.path.join('path/to/folder', dirct)
    if os.path.isdir(full_path):  # rmtree raises on plain files
        rmtree(full_path)
To avoid the TOCTOU issue highlighted by Éric Araujo's comment, you can catch an exception to call the correct method:
import os
import shutil
def remove_file_or_dir(path: str) -> None:
    """ Remove a file or directory """
    try:
        shutil.rmtree(path)
    except NotADirectoryError:
        os.remove(path)
This works because shutil.rmtree() will only remove directories, while os.remove() and os.unlink() will only remove files.
My personal preference is to work with pathlib objects: they offer a more pythonic and less error-prone way to interact with the filesystem, especially if you develop cross-platform code.
In that case, you might use pathlib3x. It offers a backport of the latest (at the date of writing this answer Python 3.10.a0) Python pathlib for Python 3.6 or newer, and a few additional functions like "copy", "copy2", "copytree", "rmtree" etc ...
It also wraps shutil.rmtree:
$> python -m pip install pathlib3x
$> python
>>> import pathlib3x as pathlib
# delete a directory tree
>>> my_dir_to_delete=pathlib.Path('c:/temp/some_dir')
>>> my_dir_to_delete.rmtree(ignore_errors=True)
# delete a file
>>> my_file_to_delete=pathlib.Path('c:/temp/some_file.txt')
>>> my_file_to_delete.unlink(missing_ok=True)
You can find it on GitHub or PyPI.
Disclaimer: I'm the author of the pathlib3x library.
I recommend using subprocess if writing a beautiful and readable code is your cup of tea:
import subprocess
subprocess.Popen("rm -r my_dir", shell=True)
And if you are not a software engineer, then maybe consider using Jupyter; you can simply type bash commands:
!rm -r my_dir
Traditionally, you use shutil:
import shutil
shutil.rmtree(my_dir)

Passing a variable in shell/python

I'm using a python script where I'm using a shell command to copy from local to hdfs.
import os
import logging
import subprocess
filePath = "/tmp"
keyword = "BC10^Dummy-Segment"
for root, dirs, files in os.walk(filePath):
    for file in files:
        if keyword in file:
            subprocess.call(["hadoop fs -copyFromLocal /tmp/BC10%5EDummy-Segment* /user/app"], shell=True)
            subprocess.call(["hadoop fs -rm /tmp/BC10%5EDummy-Segment*"], shell=True)
I'm seeing this error:
copyFromLocal: `/tmp/BC10^Dummy-Segment*': No such file or directory
rm: `/tmp/BC10^Dummy-Segment_2019': No such file or directory
Updated code:
import glob
import subprocess
import os
from urllib import urlencode, quote_plus
filePath = "/tmp"
keyword = "BC10^Dummy-Segment"
wildcard = os.path.join(filePath, '{0}*'.format(keyword))
print(wildcard)
files = [urlencode(x, quote_via=quote_plus) for x in glob.glob(wildcard)]
subprocess.check_call(["hadoop", "fs", "-copyFromLocal"] + files + ["/user/app"])
#subprocess.check_call(["hadoop", "fs", "-rm"] + files)
Seeing error when I run:
Traceback (most recent call last):
  File "ming.py", line 11, in <module>
    files = [urlencode(x, quote_via=quote_plus) for x in glob.glob(wildcard)]
TypeError: urlencode() got an unexpected keyword argument 'quote_via'
I'm guessing you are URL-encoding the path to pass it properly to Hadoop, but in doing so you basically hide it from the shell. There really are no files matching the wildcard /tmp/BC10%5EDummy-Segment* where % etc are literal characters.
Try handling the glob from Python instead. With that, you can also get rid of that pesky shell=True; and with that change, it is finally actually correct and useful to pass the commands as a list of strings (never a list of a single space-separated string, and with shell=True, don't pass a list at all). Notice also the switch to check_call so we trap errors and don't delete the source files if copying them failed. (See also https://stackoverflow.com/a/51950538/874188 for additional rationale.)
import glob
import subprocess
import os
from urllib import quote_plus  # Python 2; on Python 3 use: from urllib.parse import quote_plus
filePath = "/tmp"
keyword = "BC10^Dummy-Segment"
wildcard = os.path.join(filePath, '{0}*'.format(keyword))
files = [quote_plus(x) for x in glob.glob(wildcard)]
subprocess.check_call(["hadoop", "fs", "-copyFromLocal"] + files + ["/user/app"])
subprocess.check_call(["hadoop", "fs", "-rm"] + files)
This will not traverse subdirectories; but neither would your attempt with os.walk() do anything actually useful if it found files in subdirectories. If you actually want that to happen, please explain in more detail what the script should do.
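The idea of expanding the wildcard in Python and passing an argument list without a shell can be tried without Hadoop. In this sketch a child Python process that prints its arguments stands in for the hadoop command, and the file names are invented for the demo.

```python
import glob
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    for name in ("BC10^Dummy-Segment_2019", "BC10^Dummy-Segment_2020", "other.txt"):
        open(os.path.join(tmp, name), "w").close()

    # Expand the wildcard in Python; '^' needs no quoting because no shell is involved.
    matches = sorted(glob.glob(os.path.join(tmp, "BC10^Dummy-Segment*")))

    # Stand-in for the real command: a child process that prints each argument it receives.
    child = [sys.executable, "-c", "import sys; print('\\n'.join(sys.argv[1:]))"]
    result = subprocess.run(child + matches, check=True,
                            stdout=subprocess.PIPE, universal_newlines=True)
    received = result.stdout.splitlines()
```

Each matched path reaches the child process as one intact argument, which is the property the list form of the command relies on.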

Subprocess.run() inside loop

I would like to loop over files using subprocess.run(), something like:
import os
import subprocess
path = os.chdir("/test")
files = []
for file in os.listdir(path):
    if file.endswith(".bam"):
        files.append(file)
for file in files:
    process = subprocess.run("java -jar picard.jar CollectHsMetrics I=file", shell=True)
How do I correctly call the files?
shell=True is insecure if you are including user input in it. @eatmeimadanish's answer allows anybody who can write a file in /test to execute arbitrary code on your machine. This is a huge security vulnerability!
Instead, supply a list of command-line arguments to the subprocess.run call. You likely also want to pass in check=True; otherwise, your program would finish without an exception if the java command fails!
import os
import subprocess
os.chdir("/test")
for file in os.listdir("."):
    if file.endswith(".bam"):
        subprocess.run(
            ["java", "-jar", "picard.jar", "CollectHsMetrics", "I=" + file], check=True)
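To illustrate why the list form is safe, the sketch below passes a deliberately hostile file name as a single argument. A child Python process stands in for the java command, and the file name is a made-up example.

```python
import subprocess
import sys

# A hypothetical hostile file name, as could appear in a world-writable directory.
file = "x.bam; echo INJECTED"

# Stand-in for the java command: a child process that prints how many
# arguments it received, followed by the last one.
child = [sys.executable, "-c",
         "import sys; print(len(sys.argv) - 1); print(sys.argv[-1])"]
result = subprocess.run(child + ["I=" + file], check=True,
                        stdout=subprocess.PIPE, universal_newlines=True)
count, received = result.stdout.splitlines()
```

The child sees exactly one argument containing the semicolon as ordinary text; with shell=True and string formatting, the part after the semicolon would have been run as a second command.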
Seems like you might be overcomplicating it.
import os
import subprocess
path = os.chdir("/test")
for file in os.listdir(path):
    if file.endswith(".bam"):
        subprocess.run("java -jar picard.jar CollectHsMetrics I={}".format(file), shell=True)

Python file copying deletes original file

I've got the program below which runs via cron and backs up asterisk call recordings.
It works fine for the most part, however if a call is in progress at the time then the act of trying to copy it seems to kill it, i.e. it disappears from both the source and destination.
Is there any way to prevent this, i.e. could I test if a file is in use somehow before trying to copy it?
Thanks
from datetime import datetime
from glob import iglob
from os.path import basename, dirname, isdir
from os import makedirs
from sys import argv
from shutil import copyfile
def copy_asterisk_files_tree(src, fullpath=None):
    DEST = datetime.now().strftime('/mnt/shardik/asteriskcalls/' + src)
    if fullpath is None:
        fullpath = src
    if not isdir(DEST):
        makedirs(DEST)
    for path in iglob(src + '/*'):
        if isdir(path):
            copy_asterisk_files_tree(path, fullpath)
        else:
            subdir = '%s/%s' % (
                DEST, dirname(path)[len(fullpath) + 1:]
            )
            if not isdir(subdir):
                makedirs(subdir)
            copyfile(path, '%s/%s' % (
                subdir, basename(path).replace(':', '-')
            ))
if __name__ == '__main__':
    if len(argv) != 2:
        print('You must specify the source path as the first argument!')
        exit(1)
    copy_asterisk_files_tree(argv[1])
What you need to do is use a lock. Take a look at the docs ...
https://docs.python.org/2/library/fcntl.html#fcntl.flock
fcntl.flock(fd, op)
Perform the lock operation op on file descriptor fd (file objects
providing a fileno() method are accepted as well). See the Unix manual
flock(2) for details. (On some systems, this function is emulated
using fcntl().)
This has also been answered on SO in previous questions, such as this one: Locking a file in Python, which uses filelock (https://pypi.python.org/pypi/filelock/). Filelock is platform independent.
You could also write to a temporary file/s and merge them, but I'd much prefer
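A minimal sketch of a non-blocking fcntl.flock check follows, assuming a Unix system where flock is native (for example Linux). The helper name try_lock is made up. These locks are advisory, so the check only helps if the program writing the recordings takes the same lock, which the question does not confirm.

```python
import fcntl
import tempfile


def try_lock(fileobj):
    """Return True if an exclusive advisory lock was acquired without blocking."""
    try:
        fcntl.flock(fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        # The lock is held through another open file description.
        return False


with tempfile.NamedTemporaryFile() as recording:
    got_first = try_lock(recording)

    # A second, independent open of the same file cannot take the lock
    # while the first handle still holds it.
    with open(recording.name, "rb") as other:
        got_second = try_lock(other)

    # After the first handle unlocks, a new handle can acquire the lock.
    fcntl.flock(recording, fcntl.LOCK_UN)
    with open(recording.name, "rb") as other:
        got_after_unlock = try_lock(other)
```

A backup script could skip any file for which the non-blocking attempt fails and pick it up on the next cron run.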

How do I iteratively copy logs from the local drive to a network share?

I'm new to Python. I'm running version 3.3. I'd like to iteratively copy all wildcard named folders and files from the C drive to a network share. Wildcard named folders are called "Test_1", "Test_2", etc. with folders containing the same named folder, "Pass". The files in "Pass" end with .log. I do NOT want to copy the .log files in the Fail folder. So, I have this:
C:\Test_1\Pass\a.log
C:\Test_1\Fail\a.log
C:\Test_1\Pass\b.log
C:\Test_1\Fail\b.log
C:\Test_2\Pass\y.log
C:\Test_2\Fail\y.log
C:\Test_2\Pass\z.log
C:\Test_2\Fail\z.log
but only want to copy
C:\Test_1\Pass\a.log
C:\Test_1\Pass\b.log
C:\Test_2\Pass\y.log
C:\Test_2\Pass\z.log
to:
\\share\Test_1\Pass\a.log
\\share\Test_1\Pass\b.log
\\share\Test_2\Pass\y.log
\\share\Test_2\Pass\z.log
The following code works but I don't want to copy tons of procedural code. I'd like to make it object oriented.
import shutil, os
from shutil import copytree
def main():
    source = ("C:\\Test_1\\Pass\\")
    destination = ("\\\\share\\Test_1\\Pass\\")
    if os.path.exists ("C:\\Test_1\\Pass\\"):
        shutil.copytree (source, destination)
        print ('Congratulations! Copy was successfully completed!')
    else:
        print ('There is no Actual folder in %source.')
main()
Also, I noticed it is not printing the "else" print statement when the os path does not exist. How do I accomplish this? Thanks in advance!
This is not a perfect example but you could do this:
import glob, os, shutil
#root directory
start_dir = 'C:\\'
def copy_to_remote(local_folders, remote_path):
    if os.path.exists(remote_path):
        for source in local_folders:
            # source starts with start_dir. Take the path relative to it and add remote path.
            # (str.lstrip would strip a set of characters, not a prefix.)
            dest = os.path.join(remote_path, os.path.relpath(source, start_dir))
            try:
                shutil.copytree(source, dest)
                print ('Congratulations! Copy was successfully completed!')
            except FileExistsError as fe_err:
                print(fe_err)
            except PermissionError as pe_err:
                print(pe_err)
    else:
        print('{} - does not exist'.format(remote_path))
# Find all directories that start start_dir\Test_ and have subdirectory Pass
dir_list = glob.glob(os.path.join(start_dir, 'Test_*\\Pass'))
if dir_list:
    copy_to_remote(dir_list, '\\\\Share\\' )
Documentation for glob can be found here.
import os, shutil
def remotecopy(local, remote):
    if os.path.exists(local):
        shutil.copytree(local, remote)
        print('Congratulations! Copy was successfully completed!')
    else:
        print('There is no folder at {}.'.format(local))
Then just remotecopy(r"C:\Local\Whatever", r"C:\Remote\Whatever")
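The selection of only the Pass folders can also be sketched with pathlib (Python 3.4+, so newer than the 3.3 mentioned in the question). The function name copy_pass_folders is hypothetical, and the demo uses a temporary directory in place of C:\ and the network share.

```python
import shutil
import tempfile
from pathlib import Path


def copy_pass_folders(local_root, remote_root):
    """Copy every Test_*/Pass folder to the same relative path under remote_root."""
    copied = []
    for source in sorted(Path(local_root).glob("Test_*/Pass")):
        if not source.is_dir():
            continue
        # Keep the Test_N/Pass part of the path and re-root it under the destination.
        dest = Path(remote_root) / source.relative_to(local_root)
        shutil.copytree(str(source), str(dest))
        copied.append(dest)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    local, share = Path(tmp, "local"), Path(tmp, "share")
    for test, log in (("Test_1", "a.log"), ("Test_2", "y.log")):
        for outcome in ("Pass", "Fail"):
            (local / test / outcome).mkdir(parents=True)
            (local / test / outcome / log).touch()
    share.mkdir()

    copy_pass_folders(local, share)
    result = sorted(p.relative_to(share).as_posix() for p in share.rglob("*.log"))
```

Only the logs under Pass arrive at the destination; the Fail folders are never matched by the glob pattern.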
