I am extracting .tar.gz files that contain folders (holding files with many different extensions). I want to move all the .txt files from those folders into another folder, but I don't know the folders' names.
.txt files location ---> my_path/extracted/?unknown_name_folder?/file.txt
I want to do ---> my_path/extracted/file.txt
My code:
import os
import tarfile

os.mkdir('extracted')
t = tarfile.open('xxx.tar.gz', 'r')
for member in t.getmembers():
    # Extracts only .txt members, but keeps their folder structure
    if member.name.endswith(".txt"):
        t.extract(member, 'extracted')
###
I would try extracting the tar file first (See here)
import tarfile
tar = tarfile.open("xxx.tar.gz")
tar.extractall()
tar.close()
and then use the os.walk() method (See here)
import os
txt_files = []
for root, dirs, files in os.walk('.\\xxx\\'):
    # Keep the full path of every file whose name ends in .txt
    txt_files += [os.path.join(root, name) for name in files if name.endswith('.txt')]
OR use the glob package to gather the .txt files, as suggested by @alper in the comments below:
import glob
txt_files = glob.glob('./**/*.txt', recursive=True)
This is untested, but should get you pretty close
And obviously move them once you get the list of text files
new_path = ".\\extracted\\"
for path in txt_files:
    # basename() drops the folder part, so no separator is duplicated
    name = os.path.basename(path)
    os.rename(path, os.path.join(new_path, name))
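As an alternative to extracting everything and moving files afterwards, the .txt members could be written straight into the target folder with their folder part dropped. This is only a minimal sketch: the function name extract_txt_flat is made up for illustration, and it assumes that no two folders in the archive contain a .txt file with the same name (a later one would overwrite an earlier one).

```python
import os
import shutil
import tarfile


def extract_txt_flat(archive_path, dest_dir):
    """Copy every .txt member of the archive into dest_dir, ignoring subfolders."""
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            # Skip directories, links, and anything that is not a .txt file
            if not member.isfile() or not member.name.endswith(".txt"):
                continue
            # Tar member names use "/" as separator on every platform
            target = os.path.join(dest_dir, member.name.rsplit("/", 1)[-1])
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

With the names from the question, the call would be extract_txt_flat('xxx.tar.gz', 'extracted').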
Related
I have a path mydir that contains folders file1, file2, ..., file100. Each folder holds .crc, .bak, and other files. I want to remove all of those files, keep only the .parquet files, and name each .parquet file after its folder.
For example, the file1 folder has .crc and .bak files; after removing them only the .parquet file is left, and I need to name it file1.parquet.
I tried this for one folder but could not do it for all folders using Python.
Can someone help me solve this?
import os
mydir='c/users/name/files'
for f in os.listdir(mydir):
    if f.endswith(".parquet"):
        continue
    os.remove(os.path.join(mydir, f))
Following Miguel's comment, you can use glob like this:
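import glob
import os
import shutil
mydir = "./"
dirs_to_remove = set()
# Match .parquet files exactly one folder below mydir
for src_file in glob.glob(os.path.join(mydir, '*', '*.parquet')):
    dir_ = os.path.dirname(src_file)
    dirs_to_remove.add(dir_)
    # file1/<name>.parquet becomes file1.parquet next to the folder
    dst_file = dir_ + '.parquet'
    os.rename(src_file, dst_file)
for dir_ in dirs_to_remove:
    # os.rmdir() would fail here because the .crc/.bak files are still inside
    shutil.rmtree(dir_)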
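The same clean-up could also be sketched with pathlib. This version assumes that each folder holds exactly one .parquet file and that everything else in the folder may be deleted; folders that break that assumption are left untouched. The function name keep_parquet_only is hypothetical.

```python
import shutil
from pathlib import Path


def keep_parquet_only(base_dir):
    """Turn base_dir/<folder>/<anything>.parquet into base_dir/<folder>.parquet."""
    renamed = []
    # sorted() takes a snapshot, so renaming does not disturb the iteration
    for folder in sorted(Path(base_dir).iterdir()):
        if not folder.is_dir():
            continue
        parquet_files = sorted(folder.glob("*.parquet"))
        if len(parquet_files) != 1:
            # Assumption violated: leave this folder untouched
            continue
        target = folder.with_name(folder.name + ".parquet")
        parquet_files[0].rename(target)
        # Delete the folder together with its leftover .crc, .bak, ... files
        shutil.rmtree(folder)
        renamed.append(target)
    return renamed
```

Because rmtree deletes without asking, it may be worth trying this on a copy of the data first.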
I have a series of files that are nested as shown in the attached image. For each "inner" folder (e.g. the 001717528 one), I want to extract a row of data from each of the FITS files and create a CSV file that contains all the rows, named after the "inner" folder (e.g. 001717528.csv holding data from the 18 FITS files). The data-extracting part is easy, but I have trouble coding the iteration.
I don't really know how to iterate over both the outer folders (such as 0017) and the inner folders, and name the CSV files the way I want.
My code is looking like this:
for subdir, dirs, files in os.walk('../kepler'):
    for file in files:
        filepath = subdir + os.sep + file
        if filepath.endswith(".fits"):
            # extract data
            # write to csv file
            pass
Apparently this will iterate over all files in the kepler folder so it doesn't work.
If you need to keep track of how far you've walked into the directory structure, you can count the file path delimiter (os.sep). In your case it's / because you're on a Mac.
import os
for path, dirs, _ in os.walk("../kepler"):
    if path.count(os.sep) == 2:
        # path should be ../kepler/0017
        for dir in dirs:
            filename = dir + ".csv"
            data_files = os.listdir(path + os.sep + dir)
            for file in data_files:
                if file.endswith(".fits"):
                    # Extract data
                    # Write to CSV file
                    pass
As far as I can tell this meets your requirements, but let me know if I've missed something.
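Another way to look at it: the CSV name only depends on the folder that directly contains each FITS file, so the files can be grouped by that folder without counting the depth. The sketch below assumes inner folder names are unique across the outer folders; the function name group_fits_by_folder is made up for illustration.

```python
import os
from collections import defaultdict


def group_fits_by_folder(base_dir):
    """Map the name of each folder holding .fits files to the paths of those files."""
    groups = defaultdict(list)
    for root, _dirs, files in os.walk(base_dir):
        for name in sorted(files):
            if name.endswith(".fits"):
                # basename(root) is the "inner" folder, e.g. 001717528
                groups[os.path.basename(root)].append(os.path.join(root, name))
    return dict(groups)
```

Each key plus ".csv" then gives the CSV file name, and the list under that key holds the FITS files whose rows belong in it.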
Try this code; it should print the file path of every ".fits" file:
#!/usr/bin/python
import os
base_dir = './test'
for root, dirs, files in os.walk(base_dir, topdown=False):
    for name in files:
        if name.endswith(".fits"):
            file_path = os.path.join(root, name)  # full path of the file
            print(file_path)
            # do your treatment on file_path
All you have to do is add your specific treatment.
I work in audio and I need a number of files transcribed by a third party. To do so I have to swap out an entire directory of .wav files with .mp3s I have compressed while still maintaining the file directory. It's about 20,000 files.
e.g.
wav:
Folder1
    Folder 1a
        sound1.wav
        sound2.wav
    Folder 1b
        sound3.wav
        sound4.wav
Folder2
    Folder 2a
        Folder 2aa
            sound5.wav
            sound6.wav
        Folder 2ab
            sound7.wav
    Folder2b
        sound8.wav
etc.
mp3:
Folder1
    sound1.mp3
    sound2.mp3
    sound3.mp3
    sound4.mp3
    sound5.mp3
    sound6.mp3
    sound7.mp3
    sound8.mp3
etc.
I had to group them together to do the batch compression in Adobe Audition, but now I would like to be able to switch them out with the wav files that are perfectly identical save for file extension as doing this manually is not a reasonable option.
Any help would be greatly appreciated. I have a little experience with python so that language is preferable, but I'm open to any solutions.
You can use a combination of glob and shutil to do this. Try running this script from inside Folder1.
from glob import glob
from shutil import move
import os
wav_files = glob('**/*.wav', recursive=True)
for wf in wav_files:
    file_path = os.path.splitext(wf)[0]
    file_head = os.path.split(file_path)[-1]
    try:
        # The .mp3 is expected in the current folder under the same base name
        move('./{}.mp3'.format(file_head),
             '{}.mp3'.format(file_path))
    except OSError:
        print('Could not find or move file {}.mp3, it may not exist.'.format(file_head))
What I understand is that you want the same directory structure for the MP3 files as for the WAV files.
You can:
- browse the directory structure of the WAV files and build a mapping between base names (file names without extension) and relative paths;
- browse the directory that holds the MP3 files, look up each file's relative path in the mapping, create the target directory if it is missing, and move the file in.
For instance:
import os
wav_dir = 'path/to/MyWav'  # parent of Folder1...
musics = {}
# Step 1: index every WAV file by its base name
for root, dirnames, filenames in os.walk(wav_dir):
    for filename in filenames:
        basename, ext = os.path.splitext(filename)
        if ext.lower() == '.wav':
            relpath = os.path.relpath(root, wav_dir)
            print('indexing "{0}" to "{1}"...'.format(filename, relpath))
            musics[basename] = relpath
        else:
            print('skipping "{0}"...'.format(filename))
mp3_dir = 'path/to/MyMp3'
out_dir = wav_dir  # or somewhere else
# Step 2: walk the MP3 folder and move each file to its indexed location
for root, dirnames, filenames in os.walk(mp3_dir):
    for filename in filenames:
        basename, ext = os.path.splitext(filename)
        if ext.lower() == '.mp3' and basename in musics:
            relpath = musics[basename]
            path = os.path.join(out_dir, relpath)
            if not os.path.exists(path):
                print('creating directory "{0}"...'.format(path))
                os.makedirs(path)
            src_path = os.path.join(root, filename)
            dst_path = os.path.join(path, filename)
            if src_path != dst_path:
                print('moving "{0}" to "{1}"...'.format(filename, relpath))
                os.rename(src_path, dst_path)
        else:
            print('skipping "{0}"...'.format(filename))
print("Done.")
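Matching files by base name only works if every base name occurs once in the WAV tree; with around 20,000 files that may be worth checking before anything is moved. The following is a small sketch of such a check, not part of either answer; the function name find_duplicate_basenames is made up for illustration.

```python
import os
from collections import defaultdict


def find_duplicate_basenames(top_dir, extension=".wav"):
    """Return {base name: [folders]} for base names found in more than one folder."""
    locations = defaultdict(list)
    for root, _dirs, files in os.walk(top_dir):
        for filename in files:
            basename, ext = os.path.splitext(filename)
            if ext.lower() == extension:
                locations[basename].append(os.path.relpath(root, top_dir))
    # Keep only the names that are ambiguous
    return {name: dirs for name, dirs in locations.items() if len(dirs) > 1}
```

An empty result means each MP3 has exactly one possible destination.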
My directory contains several folders, each with several subdirectories of their own. I need to move all of the files that contain 'Volume.csv' into a directory called Volume.
Folder1
|---1Area.csv
|---1Circumf.csv
|---1Volume.csv
Folder2
|---2Area.csv
|---2Circumf.csv
|---2Volume.csv
Volume
I'm trying combinations of os.walk and regex to retrieve the files by filename but not having much luck.
Any ideas?
Thank you!
Sunworshipper, thank you for the answer!
I ran the following code and it moved the entire directory rather than just file name containing 'Volume'. Is it clear why that happened?
import os
import shutil
source_dir = "~/Stats/"
dest_dir = "~/Stats/Volume/"
file_paths = set()
for dir_, _, files in os.walk(source_dir):
    for fileName in files:
        if "Volume" in fileName:
            relDir = os.path.relpath(dir_, source_dir)
            file_paths.add(relDir)
for matched in file_paths:
    shutil.move(matched, dest_dir)
You can use glob for this. It returns a list of path names matching the expression you give it.
import glob
import shutil
dest = 'testfiles/'
files = glob.glob('*/*test.csv')
for file in files:
    shutil.move(file, dest)
I used relative paths but you can also use absolute paths.
shutil moves the documents to the new location. See the glob.glob documentation for more info.
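For the folder layout in the question, a pathlib-based variant might look like the sketch below. It assumes the destination may sit inside the source tree (as Volume does) and that matching file names are unique; the function name move_matching and its parameters are made up for illustration.

```python
import shutil
from pathlib import Path


def move_matching(source_dir, dest_dir, substring="Volume"):
    """Move files whose name contains substring from source_dir's tree into dest_dir."""
    # expanduser() resolves a leading "~", which os.walk and shutil do not do
    source = Path(source_dir).expanduser()
    dest = Path(dest_dir).expanduser()
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() builds the full list first, so moving files does not disturb the search
    for path in sorted(source.rglob("*" + substring + "*")):
        # Skip directories (such as Volume itself) and files already in place
        if path.is_file() and path.parent != dest:
            shutil.move(str(path), str(dest / path.name))
            moved.append(path.name)
    return moved
```

The is_file() test matters here: the pattern also matches the Volume directory, and moving that would repeat the problem of whole directories being moved.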
import os
import shutil
Set up your source and destination directories.
source_dir = "/Users/nenad/Documents/Python Files/Random Tests"
dest_dir = "/Users/nenad/Documents/Python Files/Random Tests/volume"
This set will now hold paths of all files matching your substring.
file_paths = set()
Now I only consider the directories that contain a file which has a substring "hello" in the filename.
for dir_, _, files in os.walk(source_dir):
    for fileName in files:
        if "hello" in fileName:
            # Store the full path so the move works from any working directory
            file_paths.add(os.path.join(dir_, fileName))
And now you just move them to your destination with shutil.
for matched in file_paths:
    shutil.move(matched, dest_dir)
Sorry for the misread :)
Best regards
I am trying to use the os.walk() function to go through a number of directories and move the contents of each directory into a single "folder" (dir).
In this particular example I have hundreds of .txt files that need to be moved. I tried using shutil.move() and os.rename(), but it did not work.
import os
import shutil
current_wkd = os.getcwd()
print(current_wkd)
# make sure that these directories exist
dir_src = current_wkd
dir_dst = '.../Merged/out'
for root, dir, files in os.walk(top=current_wkd):
    for file in files:
        if file.endswith(".txt"):  # match files that have this extension
            print(file)
            # need to move files (1.txt, 2.txt, etc) to 'dir_dst'
            # tried: shutil.move(file, dir_dst) = error
If there is a way to move all the contents of the directories, I would be interested in how to do that as well.
Your help is much appreciated! Thanks.
Here is the file directory and contents
current_wk == ".../Merged"
In current_wk there is:
Dir1
Dir2
Dir3..
combine.py # python script file to be executed
In each directory there are hundreds of .txt files.
Simple path math is required to find source files and destination files precisely.
import os
import shutil
src_dir = os.getcwd()
dst_dir = src_dir + " COMBINED"
# os.rename() needs the destination folder to exist
os.makedirs(dst_dir, exist_ok=True)
for root, _, files in os.walk(src_dir):
    for f in files:
        if f.endswith(".txt"):
            # root already contains the path from src_dir down to the file
            full_src_path = os.path.join(root, f)
            full_dst_path = os.path.join(dst_dir, f)
            os.rename(full_src_path, full_dst_path)
You have to prepare the complete path of source file, and make sure dir_dst exists.
for root, dir, files in os.walk(top=current_wkd):
    for file in files:
        if file.endswith(".txt"):  # match files that have this extension
            shutil.move(os.path.join(root, file), dir_dst)
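When hundreds of files from several directories end up in one folder, two of them may share a name (for example Dir1/1.txt and Dir2/1.txt), and the destination may itself sit inside the tree being walked. The sketch below shows one possible way to handle both cases; the function name merge_txt_files and the "_1", "_2" suffix scheme are assumptions for illustration, not something the answers above prescribe.

```python
import os
import shutil


def merge_txt_files(src_dir, dst_dir):
    """Move every .txt file under src_dir into dst_dir, renaming on name clashes."""
    os.makedirs(dst_dir, exist_ok=True)
    dst_abs = os.path.abspath(dst_dir)
    moved = []
    for root, dirs, files in os.walk(src_dir):
        # Pruning dirs in place keeps os.walk out of the destination folder
        dirs[:] = sorted(d for d in dirs
                         if os.path.abspath(os.path.join(root, d)) != dst_abs)
        for name in sorted(files):
            if not name.endswith(".txt"):
                continue
            stem, ext = os.path.splitext(name)
            target = os.path.join(dst_dir, name)
            counter = 1
            while os.path.exists(target):
                # Name already taken: try 1_1.txt, 1_2.txt, ...
                target = os.path.join(dst_dir, "{}_{}{}".format(stem, counter, ext))
                counter += 1
            shutil.move(os.path.join(root, name), target)
            moved.append(target)
    return moved
```

Dropping the endswith check would move all the contents of the directories instead of only the .txt files.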