read all the files in a directory in python

I am pretty new to Python. I have a directory that contains 2 subdirectories, and each subdirectory contains 100 text files. I want to read the content of every file in both subdirectories and put it all into a single text file, so that each file's content occupies a single line in the new file. How can I achieve this in Python? Thank you.

Since you don't have any code yet, you could start from here. You can use the glob module to load files from a single level of subdirectories, or use os.walk() to walk an arbitrary directory structure, depending on your requirements.
To open, say, all text files in an arbitrary nesting of directories:
import os
import fnmatch
for dirpath, dirs, files in os.walk('Test'):
    for filename in fnmatch.filter(files, '*.txt'):
        with open(os.path.join(dirpath, filename)) as f:
            # deal with this file here; the next loop iteration will present you with a new file
            # use the file object f to do whatever you want with that file
            content = f.read()
Since you are new, as you said, watch out for the indentation. Good luck!
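To connect the loop above back to the original question (one line per file in a single output file), here is a minimal sketch. The function name merge_text_files, the UTF-8 encoding, and the choice to join the lines of each file with spaces are all assumptions for illustration, not something the question specifies.

```python
import os


def merge_text_files(root_dir, output_path):
    """Write the content of every .txt file under root_dir to output_path, one file per line."""
    merged = 0
    with open(output_path, "w", encoding="utf-8") as out:
        for dirpath, dirnames, filenames in os.walk(root_dir):
            dirnames.sort()  # visit subdirectories in a predictable order
            for filename in sorted(filenames):
                if not filename.endswith(".txt"):
                    continue
                path = os.path.join(dirpath, filename)
                # Skip the output file itself in case it lives inside root_dir
                if os.path.abspath(path) == os.path.abspath(output_path):
                    continue
                with open(path, encoding="utf-8") as f:
                    # split() with no arguments breaks on any whitespace, including
                    # newlines, so joining with spaces collapses the file into one line
                    one_line = " ".join(f.read().split())
                out.write(one_line + "\n")
                merged += 1
    return merged
```

Calling merge_text_files('Test', 'combined.txt') would then produce one line per input file. Note that collapsing whitespace this way also squeezes repeated spaces inside a file; if that matters, replace only the newline characters instead.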

Related

Python: Finding files in directory but ignoring folders and their contents

So my program search_file.py is trying to look for .log files in the directory it is currently placed in. I used the following code to do so:
import os
# This is to get the directory that the program is currently running in
dir_path = os.path.dirname(os.path.realpath(__file__))
# for loop is meant to scan through the current directory the program is in
for root, dirs, files in os.walk(dir_path):
    for file in files:
        # Check if file ends with .log, if so print file name
        if file.endswith('.log'):
            print(file)
My current directory is as follows:
search_file.py
sample_1.log
sample_2.log
extra_file (this is a folder)
And within the extra_file folder we have:
extra_sample_1.log
extra_sample_2.log
Now, when the program runs and prints the files out it also takes into account the .log files in the extra_file folder. But I do not want this. I only want it to print out sample_1.log and sample_2.log. How would I approach this?

Try this:
import os
files = os.listdir()
for file in files:
    if file.endswith('.log'):
        print(file)
The problem in your code is that os.walk traverses the whole directory tree, not just your current directory. os.listdir returns a list of the names of all entries in a single directory, defaulting to the current working directory, which is what you are looking for.
os.walk documentation
os.listdir documentation

By default, os.walk does a top-down traversal of the tree, so the first tuple it yields describes the root directory itself. So just ask for the first one. And since you don't really care about root or dirs, use _ as the "don't care" variable name:
# get root files list.
_, _, files = next(os.walk(dir_path))
for file in files:
    # Check if file ends with .log, if so print file name
    if file.endswith('.log'):
        print(file)
It's also common to use glob:
import os
from glob import glob
dir_path = os.path.dirname(os.path.realpath(__file__))
for file in glob(os.path.join(dir_path, "*.log")):
    print(file)
This runs the risk of matching a directory whose name ends in ".log", so you could also add a check using os.path.isfile(file).
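As a variation on the answers above, os.scandir can list a single directory and tell files from folders in one pass, which covers the "directory named something.log" case. This is only a sketch; the function name list_log_files and its suffix parameter are made up for illustration.

```python
import os


def list_log_files(directory, suffix=".log"):
    """Return the names of regular files directly inside directory that end with suffix."""
    with os.scandir(directory) as entries:
        # entry.is_file() is False for folders, so nothing inside them is visited
        return sorted(
            entry.name
            for entry in entries
            if entry.is_file() and entry.name.endswith(suffix)
        )
```

With the layout from the question, list_log_files(dir_path) should return only sample_1.log and sample_2.log, because the contents of extra_file are never scanned.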

Python loop through directories

I am trying to use the Python os library to loop through all the subdirectories of my root directory, find files with a specific name, and rename them.
Just to make it clear this is my tree structure
My python file is located at the root level.
What I am trying to do is target the directory 942ba, loop through all its subdirectories, locate the file 000000, and rename it to 000000.csv.
The current code I have is as follows:
import os
root = '<path-to-dir>/942ba956-8967-4bec-9540-fbd97441d17f/'
for dirs, subdirs, files in os.walk(root):
    for f in files:
        print(dirs)
        if f == '000000':
            dirs = dirs.strip(root)
            f_new = f + '.csv'
            os.rename(os.path.join(r'{}'.format(dirs), f), os.path.join(r'{}'.format(dirs), f_new))
But this is not working: when I run my code, for some reason it strips the date from the subdirectories.
Can anyone help me understand how to solve this issue?

A more compact way to iterate through the folders and select only the files you are looking for is below:
import os
source_folder = '<path-to-dir>/942ba956-8967-4bec-9540-fbd97441d17f/'
files = [os.path.normpath(os.path.join(root, f)) for root, dirs, files in os.walk(source_folder) for f in files if '000000' in f and not f.endswith('.gz')]
for file in files:
    os.rename(file, f"{file}.csv")
The list comprehension stores the full path to each file you are looking for. You can change the condition inside the comprehension to anything you need. I use this code snippet a lot to find just images of a certain type, or to remove unwanted files from the selected files.
In the for loop, the files are renamed by adding the .csv extension.

I would use glob to find the files.
import os, glob
zdir = '942ba956-8967-4bec-9540-fbd97441d17f'
# '**' with recursive=True also matches files inside nested subdirectories
files = glob.glob('*{}/**/000000'.format(zdir), recursive=True)
for fly in files:
    os.rename(fly, '{}.csv'.format(fly))
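A note on why the code in the question misbehaves: str.strip(chars) treats its argument as a set of characters to remove from both ends of the string, not as a prefix, so dirs.strip(root) can eat characters that belong to the subdirectory name. Since os.walk already yields the full path of each directory, no stripping is needed. A minimal sketch along those lines follows; the function name add_csv_extension and its target_name parameter are hypothetical.

```python
import os


def add_csv_extension(root, target_name="000000"):
    """Rename every file called target_name under root to target_name + '.csv'."""
    renamed = []
    for dirpath, _subdirs, files in os.walk(root):
        # dirpath is already the full path of the directory being visited
        if target_name in files:
            old_path = os.path.join(dirpath, target_name)
            new_path = old_path + ".csv"
            os.rename(old_path, new_path)
            renamed.append(new_path)
    return sorted(renamed)
```

Because the match is an exact file name, files such as 000000.gz are left alone.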

How to transfer multiple files from sub directories to a single path folder using python?

I have a list of names (all unique) of WAV files:
2003211085_2003211078_ffc0d543799a2984c60c581d.wav
2003214817_2003214800_92720fb19bf9216c2f160733.wav
2003233142_2003233136_8c42d206701830dff6032d41.wav
2003256235_2003256218_4e71bf77b0ffb907990d2e30.wav
2003276239_2003276196_dad6aff70f37817fcd75ffb8.wav
2003352182_2003352170_b1f2990d5f867408cc39c445.wav
There is a directory called \019\Recordings where all of these files are located under various subfolders.
I want to write a Python app that pulls these WAV files, based on their unique names, from all these subfolders and places them into a single target folder.
I'm new to Python and tried using:
import glob, os
import shutil
target_list_of_wav_names = ["2003211085_2003211078_ffc0d543799a2984c60c581d.wav",
                            "2003214817_2003214800_92720fb19bf9216c2f160733.wav",
                            "2003233142_2003233136_8c42d206701830dff6032d41.wav"
                            "2003352182_2003352170_b1f2990d5f867408cc39c445.wav"]
for file in glob.glob('//19/Recordings*.wav', recursive=True):
    print(file)
    if file in target_list_of_wav_names:
        shutil.move(file, "C:/Users/ivd/Desktop/autotranscribe"+file)
But the files do not show up in the target folder.
How can I fix this?

glob is just a utility for finding files based on a wildcard. It returns a list of the paths that match your pattern.
So you'll still need to actually move each file with another function.
You could use os.rename or shutil.move to move it:
import glob, os
for file in glob.glob("*.wav"):
    os.rename(file, f'destinationfolder/{file}')

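import glob, os
import shutil
target_list_of_wav_names = ['example_wav1.wav', 'example_wav2.wav']  # ...... etc
# '**' together with recursive=True descends into every subfolder
for file in glob.glob('/019/Recordings/**/*.wav', recursive=True):
    print(file)
    # glob returns full paths, so compare only the file name
    if os.path.basename(file) in target_list_of_wav_names:
        shutil.move(file, os.path.join("/mydir", os.path.basename(file)))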
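The same idea can be written with pathlib, where rglob searches every subfolder and path.name gives the bare file name for the comparison. Treat this as a sketch under assumptions: the function name collect_files is invented here, and the target folder is assumed to lie outside the source tree.

```python
import shutil
from pathlib import Path


def collect_files(source_root, target_dir, wanted_names):
    """Move files under source_root whose names are in wanted_names into target_dir."""
    wanted = set(wanted_names)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() builds the full list first, so the tree is not changed while it is scanned
    for path in sorted(Path(source_root).rglob("*.wav")):
        # rglob yields full paths, so compare only the final name component
        if path.is_file() and path.name in wanted:
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved
```

Two things in the question's code are worth checking against this: the third and fourth list entries are missing a comma between them, so Python joins them into one long string, and the full path returned by glob is compared against bare file names, which never match.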

Unzipping a file with subfolders into the same directory without creating an extra folder

I hope I don't duplicate here, but I didn't find a solution until now since the answers don't include subfolders. I have a zipfile that contains a folder which contains files and subfolders.
I want to extract the files within the folder (my_folder) and the subfolder to a specific path: Users/myuser/Desktop/another. I want only the files and subfolders in the another dir. With my current code, a directory my_folder is created and my files and subfolders are placed inside it. But I don't want that directory created. This is what I am doing:
import zipfile
with zipfile.ZipFile("Users/myuser/Desktop/another/my_file.zip", "r") as zip_ref:
    zip_ref.extractall("Users/myuser/Desktop/another")
I tried listing all the files within the zip archive and extracting them manually:
from zipfile import ZipFile
with ZipFile('Users/myuser/Desktop/another/myfile.zip', 'r') as zipObj:
    # Get a list of all archived file names from the zip
    listOfFileNames = zipObj.namelist()
    for fileName in listOfFileNames:
        print(fileName)
        zipObj.extract(fileName, 'Users/myuser/Desktop/another/')
This yields the same result. I then tried creating a new list, stripping the names so that they don't include the name of the folder anymore, but then it tells me that there is no item named xyz in the archive.
Finally I leveraged those two questions/code (extract zip file without folder python and Extract files from zip without keeping the structure using python ZipFile?) and this works, but only if there are no subfolders involved. If there are subfolders it throws me the error FileNotFoundError: [Errno 2] No such file or directory: ''. What I want though is that the files in the subdirectory get extracted to the subdirectory.
I can only use this code if I skip all directories:
import os
import shutil
import zipfile
my_zip = "Users/myuser/Desktop/another/myfile.zip"
my_dir = "Users/myuser/Desktop/another/"
with zipfile.ZipFile(my_zip, 'r') as zip_file:
    for member in zip_file.namelist():
        filename = os.path.basename(member)
        print(filename)
        # skip directories
        if not filename:
            continue
        # copy file (taken from zipfile's extract)
        source = zip_file.open(member)
        target = open(os.path.join(my_dir, filename), "wb")
        with source, target:
            shutil.copyfileobj(source, target)
So I am looking for a way to do this which would also extract subdirs to their respective dir. That means I want a structure within /Users/myuser/Desktop/another:
-file1
-file2
-file3
...
- subfolder
    -file1
    -file2
    -file3
    ...
I have the feeling this must be doable with shutil, but I don't really know how.
Is there a way I can do this? Thanks so much for any help. Very much appreciated.
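One possible approach, sketched here under assumptions rather than offered as a definitive answer: instead of reducing each member to its base name, drop only the first path component (the my_folder level) and recreate the rest of the path under the destination. The function name extract_without_top_folder is hypothetical, and the sketch assumes every member of the archive lives under a single top-level folder; anything stored at the archive root is skipped.

```python
import os
import shutil
import zipfile


def extract_without_top_folder(zip_path, dest_dir):
    """Extract zip_path into dest_dir, dropping the first folder level of each member."""
    extracted = []
    with zipfile.ZipFile(zip_path) as zip_file:
        for info in zip_file.infolist():
            # Member names inside a zip archive always use '/' as the separator
            parts = info.filename.split("/")
            # Drop the top-level folder, e.g. 'my_folder/sub/a.txt' -> ['sub', 'a.txt']
            relative_parts = [p for p in parts[1:] if p]
            if not relative_parts or ".." in relative_parts:
                # the top folder entry itself, a file at the archive root, or an unsafe path
                continue
            target = os.path.join(dest_dir, *relative_parts)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            # Create the subfolder first so that open() does not fail
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            with zip_file.open(info) as source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            extracted.append("/".join(relative_parts))
    return sorted(extracted)
```

The difference from the basename version in the question is that the subfolder part of each name is kept and its directory is created before the file is written, so files in subfolders land in their respective subfolder instead of being flattened.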

I want to use python to walk through directories to get to text files and processed them

I have many folders in a directory:
/home/me/Documents/coverage
/coverage contains 50 folders all beginning with H:
/home/me/Documents/coverage/H1 (etc)
In each H*** folder there is a text file which I need to extract data from.
I have been trying to use glob and os.walk to use a script that is saved in /coverage to walk into each of these H folders, open the .txt file and process it, but I have had no luck at all.
Would this be a good starting point? (where path = /coverage)
for filename in glob.glob(os.path.join(path, "H*")):
    folder = open(glob.glob(H*))
And then try and open the .txt file?

Just gather all the txt files in one shot using glob wildcards.
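You can do it like this:
import glob
path = "/home/me/Documents/coverage/H*/*.txt"
for filename in glob.glob(path):
    # the with statement closes each file once its block ends
    with open(filename) as file_stream:
        data = file_stream.read()  # process the file's content here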
cheers
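To make the "process it" step a little more concrete, here is a small sketch that keeps track of which H folder each text file came from. Counting lines is only a stand-in for whatever extraction is actually needed, and the function name count_lines_per_folder, the folder_pattern parameter, and the UTF-8 encoding are assumptions.

```python
import glob
import os


def count_lines_per_folder(base_dir, folder_pattern="H*"):
    """Map each matching folder name to the total number of lines in its .txt files."""
    counts = {}
    pattern = os.path.join(base_dir, folder_pattern, "*.txt")
    for filename in sorted(glob.glob(pattern)):
        # The parent directory name identifies the H folder the file belongs to
        folder = os.path.basename(os.path.dirname(filename))
        with open(filename, encoding="utf-8") as f:
            counts[folder] = counts.get(folder, 0) + sum(1 for _ in f)
    return counts
```

For the layout in the question, count_lines_per_folder('/home/me/Documents/coverage') would return a dictionary keyed by H1, H2, and so on.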
