How to open a file using only its extension? - python

I have a Python script which opens a specific text file located in a specific directory (the working directory) and performs some actions.
(Assume that if there is a text file in the directory, there will never be more than one such .txt file.)
with open('TextFileName.txt', 'r') as f:
    for line in f:
        # perform some string manipulation and calculations
        # write some results to a different text file
        with open('results.txt', 'a') as r:
            r.write(someResults)
My question is: how can I have the script locate the text (.txt) file in the directory and open it without explicitly providing its name (i.e. without giving 'TextFileName.txt')? That way, no argument specifying which text file to open would be required for this script to run.
Is there a way to achieve this in Python?

You could use os.listdir to get the files in the current directory, and filter them by their extension:
import os
txt_files = [f for f in os.listdir('.') if f.endswith('.txt')]
if len(txt_files) != 1:
    raise ValueError('should be only one txt file in the current directory')
filename = txt_files[0]
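With filename in hand, the rest of the question's script works unchanged; a minimal sketch, where the per-line computation is a stand-in for whatever string manipulation the question actually performs:
with open(filename, 'r') as f:
    for line in f:
        someResults = line.upper()  # stand-in for the real string manipulation and calculations
        with open('results.txt', 'a') as r:
            r.write(someResults)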

You can also use glob, which is simpler than using os directly:
import glob
text_files = glob.glob('*.txt')
# the wildcard catches all files ending with .txt and returns them as a list
if len(text_files) != 1:
    raise ValueError('should be only one txt file in the current directory')
filename = text_files[0]
glob searches the current working directory (the one reported by os.getcwd()). You can change the working directory with:
os.chdir(r'cur_working_directory')
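Alternatively, you can skip os.chdir entirely and build the pattern from the directory path; a minimal sketch, where working_dir is a hypothetical placeholder for the directory you care about:
import glob
import os
working_dir = '/path/to/working_directory'  # hypothetical placeholder, adjust as needed
txt_files = glob.glob(os.path.join(working_dir, '*.txt'))
if len(txt_files) != 1:
    raise ValueError('should be only one txt file in the directory')
filename = txt_files[0]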

Since Python version 3.4, it is possible to use the great pathlib library. It offers a glob method which makes it easy to filter according to extensions:
from pathlib import Path
path = Path(".") # current directory
extension = ".txt"
file_with_extension = next(path.glob(f"*{extension}")) # returns the file with extension or None
if file_with_extension:
with open(file_with_extension):
...

Related

How to modify this script so that it does not delete all of my files when trying to delete files that do not have matching XML files?

I am trying to delete all .JPG files that do not have .xml files with the same name attached to them. However, when I run this script, all of the files in my directory are deleted, not just the desired images. How can I change this script so that it only deletes the images without corresponding .xml files?
Note: The only files I have in the directory are .JPG and .XML.
import os
from tqdm import tqdm
path = 'C:\\users\\my_username\\path_to_directory_with_xml_and_jpg_images'
files = os.listdir(path)
for file in tqdm(files):
    filename, filetype = file.split('.')
    if filetype == 'xml':
        continue
    imgfile = os.path.join(path, file)
    xmlfile = os.path.join(path, filename + '.xml')
    if not os.path.exists(xmlfile):
        print('{} deleted.'.format(imgfile))
        os.remove(imgfile)
It's hard to tell why your code doesn't work as we don't know the exact contents of the directory. But a simpler way to do what you want could be to use the amazing pathlib library (Python >= 3.4). The method Path.with_suffix() will make the task quite easy, together with Path.glob():
from pathlib import Path
path = Path('C:\\users\\my_username\\path_to_directory_with_xml_and_jpg_images')
for imgfile in path.glob("*.jpg"):
    xmlfile = imgfile.with_suffix(".xml")
    if not xmlfile.exists():
        imgfile.unlink()
        print(imgfile, 'deleted.')
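Since the original script deleted more than intended, it may be worth doing a dry run first; a minimal sketch that only reports what would be removed, assuming the same directory layout:
from pathlib import Path
path = Path('C:\\users\\my_username\\path_to_directory_with_xml_and_jpg_images')
for imgfile in path.glob("*.jpg"):
    if not imgfile.with_suffix(".xml").exists():
        # dry run: report only; switch to imgfile.unlink() once the output looks right
        print('would delete:', imgfile)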

Reading files from a folder using the os module

For a pattern recognition application, I want to read and operate on jpeg files from another folder using the os module.
I tried to use str(file) and file.encode('latin-1'), but they both give me errors.
I tried:
import os
allLines = []
path = 'results/'
fileList = os.listdir(path)
for file in fileList:
    file = open(os.path.join('results/' + str(file.encode('latin-1'))), 'r')
    allLines.append(file.read())
print(allLines)
but I get an error saying:
No such file or directory "results/b'thefilename"
when I expect a list with the desired file names that are accessible
If you can use Python 3.4 or newer, you can use the pathlib module to handle the paths.
from pathlib import Path
all_lines = []
path = Path('results/')
for file in path.iterdir():
    with file.open() as f:
        all_lines.append(f.read())
print(all_lines)
By using the with statement, you don't have to close the file descriptor by hand (which is currently missing), even if an exception is raised at some point.
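For what it's worth, the error in the question comes from str(file.encode('latin-1')): converting the bytes object back with str() produces a string like "b'thefilename'", which is then joined into a path that doesn't exist. If you want to stay with the os module, a minimal sketch with the encoding step simply dropped:
import os
all_lines = []
path = 'results/'
for name in os.listdir(path):
    with open(os.path.join(path, name)) as f:
        all_lines.append(f.read())
print(all_lines)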

File scanner in python 3

I am learning Python atm, and in order to do something useful whilst learning, I have created a small plan:
1. Read a specific disc drive partition. Outcome: list of directories
2. Iterate over each file within the directory and subdirectories. Outcome: list of files within directories
3. Read file information: extension. Outcome: file extension
4. Read file information: size. Outcome: size
5. Read file information: date created. Outcome: date
6. Read file information: date modified. Outcome: date
7. Read file information: owner. Outcome: ownership
At step 1 I have tried several approaches. With scandir:
import os
x = [f.name for f in os.scandir('my_path') if f.is_file()]
with open('write_to_file_path', 'w') as f:
    for row in x:
        print(row)
        f.write("%s\n" % str(row))
f.close()
and this:
import os
rootDir = '/Users/Ivan/Desktop/revit dynamo/'
for dirName, subdirList, fileList in os.walk(rootDir):
    print('Found directory: %s' % dirName)
    for fname in fileList:
        print('\t%s' % fname)
Though I have a hard time writing the result into a txt file.
May I ask what would be an ideal approach to audit specific directories, with all the relevant information extracted and stored as a table in a txt file for now?
P.S.: my first question here, so please do not judge too strictly :)
Since you are learning Python 3, I would suggest, as an alternative to low-level path manipulation with os.path, that you try pathlib (part of the standard library as of Python 3.4):
from pathlib import Path
p = Path(mydir)
# list mydir content
for child in p.iterdir():
    print(child)
# recursive iteration
for child in p.glob("**/*"):
    if child.is_dir():
        pass  # do dir stuff
    else:
        print(child.suffix)   # extension
        print(child.owner())  # file owner
        child_info = child.stat()
        # file size, modification time
        print(child_info.st_size, child_info.st_mtime)
File creation time is platform-dependent, but this post presents some solutions.
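As a rough sketch of those platform differences (not exhaustive): on Windows st_ctime holds the creation time, macOS exposes it as st_birthtime, and most Linux filesystems don't report it through plain stat() at all, so a hypothetical helper might fall back to the modification time:
import sys
from pathlib import Path

def creation_time(p: Path) -> float:
    # best-effort creation timestamp; falls back to st_mtime where unavailable
    st = p.stat()
    if sys.platform == 'win32':
        return st.st_ctime  # creation time on Windows
    return getattr(st, 'st_birthtime', st.st_mtime)  # macOS has st_birthtime; otherwise fall back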
The string of a Path can be accessed as str(p).
To write to a file using pathlib:
textfile = Path(myfilepath)
# create the file if it doesn't exist
textfile.touch()
# open the file, write a string, then close the file
textfile.write_text(mystringtext)
# open the file with a context manager
with textfile.open('r') as f:
    f.read()
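Putting that together for the audit in the question, a minimal sketch that walks the tree and writes one tab-separated row per file (audit.txt is a hypothetical output name, creation time is left out because of the platform issue above, and Path.owner() is not available on Windows):
from pathlib import Path
rootDir = Path('/Users/Ivan/Desktop/revit dynamo/')  # directory from the question
outfile = Path('audit.txt')                          # hypothetical output file
with outfile.open('w') as out:
    out.write('path\textension\tsize\tmodified\towner\n')
    for child in rootDir.glob('**/*'):
        if child.is_file():
            info = child.stat()
            out.write(f"{child}\t{child.suffix}\t{info.st_size}\t{info.st_mtime}\t{child.owner()}\n")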

Reading/Writing to and from a file using the path

How do I go about reading from or writing to a file that isn't in the same folder as the script itself?
I'm trying to write a function that lets the user locate the file by its path: the user inputs the file name, then the path, and I then have to concatenate them together to form the finished path and locate the file.
Use os.path.join() to construct file paths from distinct parts:
import os.path
dirname = '/foo/bar/baz'
filename = 'ham_n_spam'
path = os.path.join(dirname, filename)
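Since the question is about reading and writing as well, a short usage sketch on top of that; the input() prompts are just one way to collect the two parts from the user:
import os.path
dirname = input('Directory: ')   # e.g. /foo/bar/baz
filename = input('File name: ')  # e.g. ham_n_spam
path = os.path.join(dirname, filename)
with open(path, 'w') as f:       # write to the file
    f.write('some text\n')
with open(path) as f:            # read it back
    print(f.read())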

Python: How to read all files in a directory?

I found this piece of code that reads all the lines of a specific file.
How can I edit it to make it read all the files (html, text, php, etc.) in the directory "folder" one by one, without me having to specify the path to each file? I want to search each file in the directory for a keyword.
import errno
import glob
import sys
path = '/Users/folder/index.html'
files = glob.glob(path)
for name in files:
    try:
        with open(name) as f:
            sys.stdout.write(f.read())
    except IOError as exc:
        if exc.errno != errno.EISDIR:
            raise
Update Python 3.4+
Read all files
from pathlib import Path
for child in Path('.').iterdir():
    if child.is_file():
        print(f"{child.name}:\n{child.read_text()}\n")
Read all files filtered by extension
from pathlib import Path
for p in Path('.').glob('*.txt'):
    print(f"{p.name}:\n{p.read_text()}\n")
Read all files in directory tree filtered by extension
from pathlib import Path
for p in Path('.').glob('**/*.txt'):
    print(f"{p.name}:\n{p.read_text()}\n")
Or equivalently, use Path.rglob(pattern):
from pathlib import Path
for p in Path('.').rglob('*.txt'):
    print(f"{p.name}:\n{p.read_text()}\n")
Path.open()
As an alternative to Path.read_text() [or Path.read_bytes() for binary files], there is also Path.open(mode='r', buffering=-1, encoding=None, errors=None, newline=None), which is like the built-in Python function open().
from pathlib import Path
for p in Path('.').glob('*.txt'):
    with p.open() as f:
        print(f"{p.name}:\n{f.read()}\n")
Or, sticking with os:
import os
your_path = 'some_path'
files = os.listdir(your_path)
keyword = 'your_keyword'
for file in files:
    if os.path.isfile(os.path.join(your_path, file)):
        f = open(os.path.join(your_path, file), 'r')
        for x in f:
            if keyword in x:
                pass  # do what you want
        f.close()
os.listdir(your_path) will list all the contents of the directory, and os.path.isfile will check whether each entry is a file or not.
