Python shutil - copy file + file tree for specific file only - python

I am trying to copy files from a list with shutil and place them in the same folder structure in another directory:
import os
import shutil

src = "/sourcedoc/1/"
dest = "/destdoc/"

files_to_find = []
with open('filelist.txt') as fh:
    for row in fh:
        files_to_find.append(row.strip())

for root, dirs, files in os.walk(src):
    for _file in files:
        if _file in files_to_find:
            print('Found file in: ' + str(root))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.copy(os.path.abspath(root + '/' + _file), dest + _file)
I want to create folders in the destination with the same names as the ones containing the files to be copied, and then copy the files into them.
The final goal is a structure like "/destdoc/1/" containing the copied files from the list. However, what I get is all the files dumped directly into the destination directory, without the folder structure. It seems that shutil.copytree is no help either, since it copies all of the files in the folders rather than just the ones from my list.
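One way to get that layout (an untested sketch, reusing the paths and filelist.txt from the question) is to rebuild each source folder under dest with os.path.relpath before copying:

import os
import shutil

src = "/sourcedoc/1/"
dest = "/destdoc/"

with open('filelist.txt') as fh:
    files_to_find = {row.strip() for row in fh}

for root, dirs, files in os.walk(src):
    for _file in files:
        if _file in files_to_find:
            # path of this folder relative to the parent of src, e.g. "1" or "1/sub"
            rel_dir = os.path.relpath(root, os.path.dirname(src.rstrip('/')))
            target_dir = os.path.join(dest, rel_dir)
            os.makedirs(target_dir, exist_ok=True)
            shutil.copy(os.path.join(root, _file), os.path.join(target_dir, _file))

This only recreates the folders that actually contain a matching file.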

Related

How to copy a file two folders back by replacing file name with names of folders using python

I have several folders such as a1-b1, a1-b2, a1-b3, a2-b2 and so on. Within each folder there are subfolders, e.g. c_1, c_2, c_3 and so on. Within each subfolder I have data files with the same name, e.g. abc.dat. I want to copy abc.dat two folders up, renaming it after the folder and subfolder it came from, e.g. a1-b1-c_1.dat, a1-b1-c_2.dat, a1-b3-c_1.dat, etc.
My present approach can only copy one folder up, and it also renames the abc.dat files in their existing directories, which I would now like to avoid: the copies should get the new names two folders up, while the originals remain abc.dat in their current directories. Thanks in advance for your support!
input_dir = "/user/my_data"
# Walk through all files in the directory that contains the files to copy
for root, dirs, files in os.walk(input_dir):
for filename in files:
if filename == 'abc.dat':
base = os.path.join(os.path.abspath(root))
#Get current name
old_name = os.path.join(base, filename)
#Get parent folder
parent_folder = os.path.basename(base)
#New name based on parent folder
new_file_name = parent_folder + ".dat" #assuming same extension
new_abs_name = os.path.join(base, new_file_name)
#Rename to new name
os.rename(old_name,new_abs_name)
#Copy to one level up
one_level_up = os.path.normpath(os.path.join(base, os.pardir))
one_level_up_name = os.path.join(one_level_up, new_file_name)
shutil.copy(new_abs_name,one_level_up_name)
So, I figured out a solution myself. Here it is!
import os
import shutil

current_dir = os.getcwd()
for dirpath, dirs, files in os.walk(current_dir):
    for f in files:
        if f.endswith('.dat'):
            folder_1 = os.path.split(os.path.split(dirpath)[0])[1]
            folder_2 = os.path.split(dirpath)[1]
            new_name = folder_1 + '-' + folder_2 + '.dat'
            os.rename(os.path.join(dirpath, f),
                      os.path.join(dirpath, new_name))
            total_copy_path = os.path.join(dirpath, new_name)
            shutil.copy(total_copy_path, current_dir)
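If the originals should keep the name abc.dat, a variant sketch that skips the in-place os.rename and copies straight to the new name two folders up (assuming the /user/my_data/a1-b1/c_1/abc.dat layout described in the question):

import os
import shutil

input_dir = "/user/my_data"

for root, dirs, files in os.walk(input_dir):
    for filename in files:
        if filename == 'abc.dat':
            subfolder = os.path.basename(root)        # e.g. c_1
            parent = os.path.dirname(root)            # e.g. /user/my_data/a1-b1
            two_up = os.path.dirname(parent)          # e.g. /user/my_data
            new_name = os.path.basename(parent) + '-' + subfolder + '.dat'  # a1-b1-c_1.dat
            # copy under the new name; the original abc.dat is left untouched
            shutil.copy(os.path.join(root, filename),
                        os.path.join(two_up, new_name))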

Iterate over files over several directories to extract data

I have a series of files that are nested as shown in the attached image. For each "inner" folder (e.g. the 001717528 one), I want to extract a row of data from each of the FITS files and create a CSV file that contains all the rows, named after the "inner" folder (e.g. 001717528.csv with data from the 18 fits files). The data-extracting part is easy, but I have trouble coding the iteration.
I don't really know how to iterate over both the outer folders such as 0017 and the inner folders, and name the csv files as I want.
My code is looking like this:
for subdir, dirs, files in os.walk('../kepler'):
    for file in files:
        filepath = subdir + os.sep + file
        if filepath.endswith(".fits"):
            # extract data
            # write to csv file
Apparently this will iterate over all files in the kepler folder so it doesn't work.
If you need to keep track of how far you've walked into the directory structure, you can count the file path delimiter (os.sep). In your case it's / because you're on a Mac.
for path, dirs, _ in os.walk("../kepler"):
    if path.count(os.sep) == 2:
        # path should be ../kepler/0017
        for dir in dirs:
            filename = dir + ".csv"
            data_files = os.listdir(path + os.sep + dir)
            for file in data_files:
                if file.endswith(".fits"):
                    # Extract data
                    # Write to CSV file
As far as I can tell this meets your requirements, but let me know if I've missed something.
Try this code; it should print the file path of all your ".fits" files:
#!/usr/bin/python
import os

base_dir = './test'
for root, dirs, files in os.walk(base_dir, topdown=False):
    for name in files:
        if name.endswith(".fits"):
            file_path = os.path.join(root, name)  # path of the file
            print(file_path)
            # do your treatment on file_path
All you have to do is add your specific treatment.
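For the per-folder CSV part, a rough sketch assuming the ../kepler layout from the question; extract_row() is a hypothetical stand-in for the data-extracting code:

import csv
import os

base_dir = '../kepler'

for root, dirs, files in os.walk(base_dir):
    fits_files = sorted(f for f in files if f.endswith('.fits'))
    if not fits_files:
        continue  # not an "inner" folder, nothing to do
    inner_name = os.path.basename(root)                 # e.g. 001717528
    csv_path = os.path.join(root, inner_name + '.csv')  # e.g. .../001717528.csv
    with open(csv_path, 'w', newline='') as out:
        writer = csv.writer(out)
        for name in fits_files:
            # extract_row() is hypothetical: plug in your own data-extracting code here
            writer.writerow(extract_row(os.path.join(root, name)))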

Python recursively traverse through all subdirs and write the filenames to output file

I want to recursively traverse all subdirs in a root folder and write all filenames to an output file. Then, in each subdir, create a new output file inside that subdir and recursively traverse its own subdirs, appending the filenames to the new output file.
So in the example below, the Music folder should get a Music.m3u8 file listing all filenames found recursively in every subdir. Then the Rock folder should get a Rock.m3u8 file listing all filenames found recursively within the Rock folder. Finally, each Album folder should get Album1.m3u8, Album2.m3u8, etc. with the filenames in its own folder. How can I do this in Python 3.6?
Music
....Rock
........Album1
........Album2
....Hip-Hop
........Album3
........Album4
This is what I have, but it only adds the filenames of each folder to that folder's output file; it doesn't recursively add them to the root output file.
import os

rootdir = '/Users/bayman/Music'
ext = [".mp3", ".flac"]
for root, dirs, files in os.walk(rootdir):
    path = root.split(os.sep)
    if any(file.endswith(tuple(ext)) for file in files):
        m3ufile = str(os.path.basename(root)) + '.m3u8'
        list_file_path = os.path.join(root, m3ufile)
        with open(list_file_path, 'w') as list_file:
            list_file.write("#EXTM3U\n")
            for file in sorted(files):
                if file.endswith(tuple(ext)):
                    list_file.write(file + '\n')
You're doing with open(list_file_path, 'w') as list_file: each time through the outer loop. But you're not creating, or writing to, any top-level file, so of course you don't get one. If you want one, you have to explicitly create it. For example:
rootdir = '/Users/bayman/Music'
ext = [".mp3", ".flac"]
with open('root.m3u', 'w') as root_file:
    root_file.write("#EXTM3U\n")
    for root, dirs, files in os.walk(rootdir):
        path = root.split(os.sep)
        if any(file.endswith(tuple(ext)) for file in files):
            m3ufile = str(os.path.basename(root)) + '.m3u8'
            list_file_path = os.path.join(root, m3ufile)
            with open(list_file_path, 'w') as list_file:
                list_file.write("#EXTM3U\n")
                for file in sorted(files):
                    if file.endswith(tuple(ext)):
                        root_file.write(os.path.join(root, file) + '\n')
                        list_file.write(file + '\n')
(I'm just guessing at what you actually want in that root file here; you presumably know the answer to that and don't have to guess…)
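If every level (Music, Rock, each Album) really should get a playlist of everything below it, one possible sketch is to register each track with the playlist of every ancestor directory, assuming the layout shown in the question:

import os

rootdir = os.path.abspath('/Users/bayman/Music')
exts = ('.mp3', '.flac')
playlists = {}  # directory -> track paths relative to that directory

for root, dirs, files in os.walk(rootdir):
    for f in sorted(files):
        if f.endswith(exts):
            track = os.path.join(root, f)
            # register the track with this folder and every ancestor up to rootdir
            d = root
            while True:
                playlists.setdefault(d, []).append(os.path.relpath(track, d))
                if d == rootdir:
                    break
                d = os.path.dirname(d)

for d, tracks in playlists.items():
    playlist = os.path.join(d, os.path.basename(d) + '.m3u8')  # e.g. Music/Music.m3u8
    with open(playlist, 'w') as fh:
        fh.write('#EXTM3U\n')
        for t in tracks:
            fh.write(t + '\n')

Each entry is written relative to the playlist's own folder, so the files stay playable if the Music tree is moved as a whole.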

Moving files from an unknown folder to another

I am extracting .tar.gz files which contain folders (with files of many extensions). I want to move all the .txt files from those folders to another one, but I don't know the folder names.
.txt files location ---> my_path/extracted/?unknown_name_folder?/file.txt
I want to do ---> my_path/extracted/file.txt
My code:
os.mkdir('extracted')
t = tarfile.open('xxx.tar.gz', 'r')
for member in t.getmembers():
    if ".txt" in member.name:
        t.extract(member, 'extracted')
###
I would try extracting the tar file first (See here)
import tarfile
tar = tarfile.open("xxx.tar.gz")
tar.extractall()
tar.close()
and then use the os.walk() method (See here)
import os
txt_files = []
for root, dirs, files in os.walk('.\\xxx\\'):
    # keep the full path, and compare the end of the name (not the start) against '.txt'
    txt_files += [os.path.join(root, name) for name in files if name[-4:] == '.txt']
OR use the glob package to gather the txt files as suggested by #alper in the comments below:
txt_files = glob.glob('./**/*.txt', recursive=True)
This is untested, but should get you pretty close
And obviously move them once you get the list of text files
new_path = ".\\extracted\\"
for path in txt_files:
name = path[path.rfind('\\'):]
os.rename(path, new_path + name)
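Alternatively, the intermediate folders can be skipped entirely by flattening the paths while extracting. A rough sketch using the xxx.tar.gz and extracted/ names from the question:

import os
import shutil
import tarfile

os.makedirs('extracted', exist_ok=True)
with tarfile.open('xxx.tar.gz', 'r:gz') as t:
    for member in t.getmembers():
        if member.isfile() and member.name.endswith('.txt'):
            # read the member from the archive and write it flat into extracted/
            src = t.extractfile(member)
            dst_path = os.path.join('extracted', os.path.basename(member.name))
            with open(dst_path, 'wb') as dst:
                shutil.copyfileobj(src, dst)

Note that two members with the same basename in different folders would overwrite each other here.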

Move files from multiple directories to single directory

I am trying to use os.walk() to go through a number of directories and move the contents of each directory into a single folder (dir).
In this particular example I have hundreds of .txt files that need to be moved. I tried using shutil.move() and os.rename(), but it did not work.
import os
import shutil

current_wkd = os.getcwd()
print(current_wkd)

# make sure that these directories exist
dir_src = current_wkd
dir_dst = '.../Merged/out'

for root, dir, files in os.walk(top=current_wkd):
    for file in files:
        if file.endswith(".txt"):  # match files with this extension
            print(file)
            # need to move files (1.txt, 2.txt, etc) to 'dir_dst'
            # tried: shutil.move(file, dir_dst) -> error
If there is a way to move all the contents of the directories, I would be interested in how to do that as well.
Your help is much appreciated! Thanks.
Here is the file directory and its contents:
current_wkd == ".../Merged"
In current_wkd there is:
Dir1
Dir2
Dir3...
combine.py  # python script file to be executed
In each directory there are hundreds of .txt files.
Simple path math is required to find source files and destination files precisely.
import os
import shutil

src_dir = os.getcwd()
dst_dir = src_dir + " COMBINED"
os.makedirs(dst_dir, exist_ok=True)

for root, _, files in os.walk(src_dir):
    for f in files:
        if f.endswith(".txt"):
            full_src_path = os.path.join(root, f)  # root is already the full directory path
            full_dst_path = os.path.join(dst_dir, f)
            os.rename(full_src_path, full_dst_path)
You have to pass the complete path of the source file, and make sure dir_dst exists.
for root, dir, files in os.walk(top=current_wkd):
    for file in files:
        if file.endswith(".txt"):  # match files with this extension
            shutil.move(os.path.join(root, file), dir_dst)
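For the secondary question of moving all the contents rather than just the .txt files, a sketch along the same lines; 'out' is a hypothetical destination name and name collisions between source directories are not handled:

import os
import shutil

dir_src = os.getcwd()
dir_dst = os.path.join(dir_src, 'out')  # hypothetical destination folder
os.makedirs(dir_dst, exist_ok=True)

for root, dirs, files in os.walk(dir_src):
    if os.path.abspath(root) == os.path.abspath(dir_dst):
        continue  # skip files already sitting in the destination
    for f in files:
        # no extension check, so every file gets moved
        shutil.move(os.path.join(root, f), os.path.join(dir_dst, f))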
