How to open all files in a folder in Python? [duplicate] - python

This question already has answers here:
How do I list all files of a directory?
(21 answers)
Closed 1 year ago.
How do I open all files in a folder in Python? I need to open all the files in a folder so that I can index them for language processing.

Here is an example. What it does:
os.listdir(base_path) returns the names of the entries in your directory (names only, not full paths; subdirectories are included).
open(os.path.join(base_path, filename), 'r') opens the current file read-only (you will not be able to write to it). The name has to be joined with the directory that was listed, not with os.getcwd(), unless the two happen to be the same.
import os
base_path = 'yourBasePath'
for filename in os.listdir(base_path):
    with open(os.path.join(base_path, filename), 'r') as f:
        pass  # do your stuff
How to open every file in a folder

I would recommend looking at the pathlib library: https://docs.python.org/3/library/pathlib.html
You could do something like:
from pathlib import Path
folder = Path('<folder to index>')
# get all the files in the folder and its subfolders ('**' makes the pattern recursive)
files = folder.glob('**/*.csv')  # assuming the files are CSV
for file in files:
    with open(file, 'r') as f:
        print(f.readlines())

You can use os.walk to list all the files in your folder, including those in subfolders.
See the os.walk documentation for details.
import os
folderpath = r'folderpath'
for root, dirs, files in os.walk(folderpath, topdown=False):
    for name in files:
        print(os.path.join(root, name))
    for name in dirs:
        print(os.path.join(root, name))

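You can use os.walk, which takes the folder to traverse as its argument:
import os
os.walk('folderpath')  # yields a (root, dirs, files) tuple for each directory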
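Since the goal in the question is to index the files for language processing, the listing step and the opening step can be combined into one helper. This is only a minimal sketch: the name read_all_files, the UTF-8 default, and the choice to skip subdirectories are assumptions for illustration, not something the question specifies.

```python
import os


def read_all_files(folder, encoding="utf-8"):
    """Return a dict mapping path -> text for every regular file directly in folder.

    Hypothetical helper: the name and the encoding default are assumptions.
    """
    texts = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # os.listdir also returns subdirectories; skip them
        # errors="replace" keeps the loop going if a file is not valid UTF-8
        with open(path, "r", encoding=encoding, errors="replace") as f:
            texts[path] = f.read()
    return texts
```

Calling read_all_files('yourBasePath') gives one string per file, which can then be passed to whatever tokenizer or indexer you use.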

Related

How to create a zip file using zipfile in Python?

import os
from zipfile import ZipFile
from os.path import basename
src = r"C:\git\mytest"
full_path = []
for root, dirs, files in os.walk(src, topdown=False):
    for name in files:
        full_path.append(os.path.join(root, name))
with ZipFile('output.zip', 'w') as zipObj:
    for item in full_path:
        zipObj.write(item, basename(item))
I am trying to create a zip file containing the files of a specific folder.
The folder has some files in it, and they should be added to the zip file.
With the code above, a zip file is created, but it contains no files. I cannot work out the reason.
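One possible cause, offered as an assumption because the question does not show the folder: os.walk silently yields nothing when the source path does not exist or cannot be read, so the list of files stays empty and an empty archive is written. The sketch below checks for that case explicitly and stores each file under its path relative to the source folder, so that files with the same name in different subfolders do not collide. The function name zip_folder is hypothetical.

```python
import os
from zipfile import ZipFile


def zip_folder(src, zip_path):
    """Write every file under src into zip_path; return the archived names.

    Hypothetical helper for illustration, not the asker's original code.
    """
    if not os.path.isdir(src):
        # os.walk would yield nothing here and produce an empty archive
        raise NotADirectoryError(src)
    zip_abs = os.path.abspath(zip_path)
    with ZipFile(zip_path, "w") as zip_obj:
        for root, dirs, files in os.walk(src):
            for name in files:
                full = os.path.join(root, name)
                if os.path.abspath(full) == zip_abs:
                    continue  # do not add the archive to itself
                # arcname relative to src keeps the subfolder structure
                zip_obj.write(full, arcname=os.path.relpath(full, src))
        return zip_obj.namelist()
```

Printing the returned list is a quick way to confirm that the archive actually contains what you expect.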

Save outputs to new directory in Python

I am trying to identify all .kml files in a specific directory and save them into a new directory. Is this possible? I'm able to print the file paths, but I would like to use Python to copy those files to a new directory.
Here is my code so far:
import os
# traverse whole directory
for root, dirs, files in os.walk(r'C:\Users\file_path_here'):
    # select file name
    for file in files:
        # check the extension of files
        if file.endswith('.kml'):
            # print whole path of files
            print(os.path.join(root, file))
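Try this:
import os
new_dir = r'C:\Users\new_path_here'  # destination folder (placeholder)
os.makedirs(new_dir, exist_ok=True)
# traverse whole directory
for root, dirs, files in os.walk(r'C:\Users\file_path_here'):
    # select file name
    for each_file in files:
        # check the extension of files
        if each_file.endswith('.kml'):
            src_path = os.path.join(root, each_file)
            # print whole path of files
            print(src_path)
            # open the file by its full path, not just its name
            with open(src_path, 'r') as kml_file:
                content = kml_file.read()
            # write a copy with the same name into the new directory
            with open(os.path.join(new_dir, each_file), 'w') as f:
                f.write(content)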
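Reading and rewriting the text works for .kml files, but the standard library's shutil module can copy a file without loading it into Python at all. The following is a minimal sketch of that alternative. The function name copy_by_extension is hypothetical, and it assumes the destination folder lies outside the folder being walked.

```python
import os
import shutil


def copy_by_extension(src_dir, dst_dir, extension=".kml"):
    """Copy every file under src_dir whose name ends with extension into dst_dir.

    Hypothetical helper; returns the list of destination paths.
    Files with the same name in different subfolders overwrite one another.
    Assumes dst_dir is not inside src_dir.
    """
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for root, dirs, files in os.walk(src_dir):
        for name in files:
            if name.lower().endswith(extension):
                # copy2 copies the contents and tries to preserve metadata
                copied.append(shutil.copy2(os.path.join(root, name), dst_dir))
    return copied
```

If the subfolder layout matters, build the destination path from os.path.relpath(root, src_dir) instead of copying everything into one flat folder.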

How can I iterate over all files in all folders of one master folder? [duplicate]

This question already has answers here:
How to use glob() to find files recursively?
(28 answers)
Closed 3 years ago.
So I wrote a Python script that would do a certain thing to a certain .txt file:
with open("1.txt") as f:
    for line in f:
        pass  # DoStuff
Now this works for one .txt file.
I have one master folder. In the master folder I have several other folders, and in each of those folders I also have several .txt files.
How can I iterate over all of this to apply my script to every .txt file in every folder in the master folder?
You can use os.walk():
import os
path = 'c:\\projects\\hc2\\'
files = []
# r=root, d=directories, f=files
for r, d, f in os.walk(path):
    for file in f:
        # endswith avoids matching names that merely contain '.txt'
        if file.endswith('.txt'):
            files.append(os.path.join(r, file))
for f in files:
    print(f)
Output:
c:\projects\hc2\app\readme.txt
c:\projects\hc2\app\release.txt
c:\projects\hc2\web\readme.txt
c:\projects\hc2\whois\download\afrinic.txt
c:\projects\hc2\whois\download\apnic.txt
c:\projects\hc2\whois\download\arin.txt
c:\projects\hc2\whois\download\lacnic.txt
c:\projects\hc2\whois\download\ripe.txt
c:\projects\hc2\whois\out\test\resources\asn\afrinic\3068.txt
c:\projects\hc2\whois\out\test\resources\asn\afrinic\37018.txt
You could use glob.iglob() and os.walk() for that.
Here is a little function for you:
import glob, os
def list_of_files(path, extension, recursive=False):
    '''
    Yield the path of each file in path with the target extension.
    If recursive, it will loop over subfolders as well.
    '''
    if not recursive:
        for file_path in glob.iglob(path + '/*.' + extension):
            yield file_path
    else:
        for root, dirs, files in os.walk(path):
            for file_path in glob.iglob(root + '/*.' + extension):
                yield file_path
It needs the glob and os modules, imported above.
In your case, pass recursive=True so that the subfolders are searched:
for file in list_of_files(path='master_folder_path_here', extension='txt', recursive=True):
    ...
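You can use the glob module from Python:
from glob import glob
file_list = glob("(folder path)/*/*")
This gives you a list of the paths exactly one level down, that is, everything directly inside each immediate subfolder of the folder (directories included). Deeper levels are not matched by this pattern.
You can then iterate over the list and do your operations.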
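The duplicate target mentions using glob() recursively. A pattern such as "*/*" only reaches a fixed depth, whereas "**" together with recursive=True matches any number of directory levels. This is a minimal sketch under that approach; the function name find_txt_files is hypothetical.

```python
import glob
import os


def find_txt_files(master_folder):
    """Return a sorted list of all .txt files under master_folder, at any depth.

    Hypothetical helper for illustration.
    """
    # glob.escape protects special characters such as '[' in the folder name;
    # '**' matches zero or more directory levels only when recursive=True
    pattern = os.path.join(glob.escape(master_folder), "**", "*.txt")
    return sorted(glob.glob(pattern, recursive=True))
```

Note that glob skips files and folders whose names start with a dot unless the pattern names them explicitly.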

Open a file without specifying the subdirectory python

Let's say my Python script is in a folder "/main". I have a bunch of text files inside subfolders in main. I want to be able to open a file just by specifying its name, not the subdirectory it's in.
So open_file('test1.csv') should open test1.csv even if its full path is /main/test/test1.csv.
I don't have duplicated file names, so that should not be a problem.
I am using Windows.
You could use os.walk to find your filename in a subfolder structure:
import os
def find_and_open(filename):
    for root_f, folders, files in os.walk('.'):
        if filename in files:
            # here you can either open the file
            # or just return the full path and process the file
            # somewhere else
            with open(os.path.join(root_f, filename)) as f:
                return f.read()
    return None  # no file with that name was found
If you have a very deep folder structure, you might want to limit the depth of the search.
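One way to limit the depth, sketched here under assumptions: os.walk lets you edit the dirs list in place when walking top-down, and emptying it stops the walk from descending further. The function name find_file and the max_depth parameter are hypothetical.

```python
import os


def find_file(filename, start=".", max_depth=3):
    """Return the path of the first match at most max_depth folders below start.

    Hypothetical helper; returns None if nothing is found within that depth.
    """
    start = os.path.abspath(start)
    for root, dirs, files in os.walk(start):
        if filename in files:
            return os.path.join(root, filename)
        # depth of the current folder relative to start (0 for start itself)
        rel = os.path.relpath(root, start)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if depth >= max_depth:
            # emptying dirs in place stops os.walk from going deeper
            dirs[:] = []
    return None
```

With max_depth=0 only the starting folder itself is searched; each increment allows one more level of subfolders.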
import os
def get_file_path(file):
    for (root, dirs, files) in os.walk('.'):
        if file in files:
            return os.path.join(root, file)
This should work. It returns the path (or None if no file matches), so you should handle opening the file in your own code.
import os
def open_file(filename):
    # only works if the file sits directly in this folder; subfolders are not searched
    f = open(os.path.join('/path/to/main/', filename))
    return f

Python - Need to loop through directories looking for TXT files

I am a total Python newbie.
I need to loop through a directory looking for .txt files, and then read and process them individually. I would like to set this up so that whatever directory the script is in is treated as the root of this action. For example, if the script is in /bsepath/workDir, then it would loop over all of the files in workDir and its children.
What I have so far is:
#!/usr/bin/env python
import os
scrptPth = os.path.realpath(__file__)
for file in os.listdir(scrptPth)
    with open(file) as f:
        head,sub,auth = [f.readline().strip() for i in range(3)]
        data=f.read()
        #data.encode('utf-8')
    pth = os.getcwd()
    print head,sub,auth,data,pth
This code is giving me an invalid syntax error, and I suspect that is because os.listdir does not like file paths in standard string format. Also, I don't think that I am doing the looped action right. How do I reference a specific file in the looped action? Is it packaged as a variable?
Any help is appreciated.
import os, fnmatch
def findFiles(path, pattern):
    for root, dirs, files in os.walk(path):
        for file in fnmatch.filter(files, pattern):
            yield os.path.join(root, file)
Use it like this, and it will find all text files somewhere within the given path (recursively):
for textFile in findFiles(r'C:\Users\poke\Documents', '*.txt'):
    print(textFile)
The invalid syntax error comes from the missing colon at the end of the for line. Beyond that, os.listdir expects a directory as input, and os.path.realpath(__file__) is the path of the script file itself. So, to get the directory in which the script resides, use:
scrptPth = os.path.dirname(os.path.realpath(__file__))
Also, os.listdir returns just the filenames, not the full path.
So open(file) will not work unless the current working directory happens to be the directory where the script resides. To fix this, use os.path.join:
import os
scrptPth = os.path.dirname(os.path.realpath(__file__))
for file in os.listdir(scrptPth):
    with open(os.path.join(scrptPth, file)) as f:
        ...  # read from f here
Finally, if you want to recurse through subdirectories, use os.walk:
import os
scrptPth = os.path.dirname(os.path.realpath(__file__))
for root, dirs, files in os.walk(scrptPth):
    for filename in files:
        if not filename.endswith('.txt'):
            continue  # only process .txt files
        filename = os.path.join(root, filename)
        with open(filename, 'r') as f:
            head,sub,auth = [f.readline().strip() for i in range(3)]
            data=f.read()
            #data.encode('utf-8')
