reading files from a folder using os module

For a pattern recognition application, I want to read and operate on JPEG files from another folder using the os module.
I tried str(file) and file.encode('latin-1'), but both give me errors.
I tried:
allLines = []
path = 'results/'
fileList = os.listdir(path)
for file in fileList:
    file = open(os.path.join('results/'+ str(file.encode('latin-1'))), 'r')
    allLines.append(file.read())
print(allLines)
but I get an error saying:
No such file or directory "results/b'thefilename"
when I expect a list of the desired files that I can access.

If you can use Python 3.4 or newer, you can use the pathlib module to handle the paths.
from pathlib import Path
all_lines = []
path = Path('results/')
for file in path.iterdir():
    with file.open() as f:
        all_lines.append(f.read())
print(all_lines)
By using the with statement, you don't have to close the file by hand (which your code currently never does), and the file is closed even if an exception is raised at some point.
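Since the question is about JPEG files, which are binary, a variant of the loop above that reads bytes instead of text may be closer to what is needed. This is only a minimal sketch and not the answerer's code: the function name read_jpegs and the suffix filter are assumptions made for illustration.

```python
from pathlib import Path


def read_jpegs(folder):
    """Return {file name: raw bytes} for the JPEG files directly inside folder."""
    images = {}
    for entry in sorted(Path(folder).iterdir()):
        # iterdir() also yields sub-directories, so keep regular files only
        if entry.is_file() and entry.suffix.lower() in {".jpg", ".jpeg"}:
            # read_bytes() opens the file in binary mode and closes it again
            images[entry.name] = entry.read_bytes()
    return images
```

With the folder from the question, it could be called as images = read_jpegs('results/').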

Related

Getting FileNotFoundError when trying to open a file for reading in Python 3

I am using the OS module to open a file for reading, but I'm getting a FileNotFoundError.
I am trying to:
find all the files in a given sub-directory whose names contain "mda"
for each of those files, grab the string in the filename just after two "_"s (it indicates a specific code called an SIC)
open that file for reading
write to a master file for some MapReduce processing later
When I try to do the opening, I get the following error:
  File "parse_mda_SIC.py", line 16, in <module>
    f = open(file, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'mda_3357_2017-03-08_1000230_000143774917004005__3357.txt'
I suspect the issue is either with the "file" variable or the fact that the files are one directory down, but I am confused about why this would occur when I am using os to address that lower directory.
I have the following code:
working_dir = "data/"
for file in os.listdir(working_dir):
    if (file.find("mda") != -1):
        SIC = re.findall("__(\d+)", file)
        f = open(file, 'r')
I would expect to be able to open the file without issue and then create my list from the data. Thanks for your help.

This should work for you. os.listdir returns bare file names, so open(file) looks for each name only in the current working directory (usually the directory you run the script from). You need to prepend the directory:
for file in os.listdir(working_dir):
    if (file.find("mda") != -1):
        SIC = re.findall(r"__(\d+)", file)
        f = open(os.path.join(working_dir, file), 'r')
Also, it's good practice to open files using a with context manager, as it will handle closing your file when it is no longer needed:
for file in os.listdir(working_dir):
    if (file.find("mda") != -1):
        SIC = re.findall(r"__(\d+)", file)
        with open(os.path.join(working_dir, file), 'r') as f:
            pass  # do stuff with f here
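Putting the pieces from the question together, the whole flow might look like the sketch below. The function name collect_mda_files and the choice to return (SIC, text) pairs are assumptions for illustration; the question does not say what the master file should contain.

```python
import os
import re


def collect_mda_files(working_dir):
    """Return (SIC, text) pairs for files in working_dir whose names contain 'mda'."""
    results = []
    for name in sorted(os.listdir(working_dir)):
        if "mda" not in name:
            continue
        match = re.search(r"__(\d+)", name)  # digits right after a double underscore
        if match is None:
            continue  # no SIC code in this file name
        # listdir() returns bare names, so join each one with the directory
        path = os.path.join(working_dir, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        with open(path, "r") as f:
            results.append((match.group(1), f.read()))
    return results
```

Because every path is built with os.path.join(working_dir, name), the function does not depend on which directory the script is started from.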

You need to prepend the directory, like this:
f = open(os.path.join(working_dir, file), 'r')

Django can't find a file stored in my application folder

I have a folder that stores a JSON file in my Django application folder, i.e. test_data/data.json.
In my tests.py, I am trying to read this file using the following code:
with open('/test_data/data.json', 'r') as f:
    self.response_data = json.load(f)
However, I keep on getting the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/test_data/data.json'
What am I doing wrong? Thanks.
Edit: I tried removing the leading slash, yet I still get the same error.

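Try this. Note that it only works when the tests are started from the application folder, because os.getcwd() returns the current working directory:
import os
with open(os.getcwd() + '/test_data/data.json', 'r') as f:
    self.response_data = json.load(f)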

If you're opening files in directories close to where your code is, it is common to place
import os
DIRNAME = os.path.dirname(__file__) # the directory of this file
at the top of the file.
Then you can open files in a test_data subdirectory with
with open(os.path.join(DIRNAME, 'test_data', 'data.json'), 'rb') as fp:
    self.response_data = json.load(fp)
You probably want to open JSON files, which should be UTF-8 encoded, in 'rb' (read-binary) mode. json.load accepts binary files from Python 3.6 onwards.
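The same idea can be wrapped in a small helper so that each test does not repeat the path handling. This is a sketch under assumptions: the helper name load_json_beside is hypothetical, and it opens the file in text mode with an explicit UTF-8 encoding rather than in binary mode.

```python
import json
from pathlib import Path


def load_json_beside(module_file, *parts):
    """Load a JSON file located relative to the directory containing module_file."""
    # resolve() makes the path absolute, so the current working directory is irrelevant
    base_dir = Path(module_file).resolve().parent
    with open(base_dir.joinpath(*parts), encoding="utf-8") as fp:
        return json.load(fp)
```

In tests.py it could be used as self.response_data = load_json_beside(__file__, 'test_data', 'data.json').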

File scanner in python 3

I am learning Python at the moment and, in order to do something useful whilst learning, I have created a small plan:
Read a specific disc drive partition. Outcome: List of directories
Iterate over each file within the directory and its subdirectories. Outcome: List of files within directories
Read file information: extension. Outcome: File extension
Read file information: size. Outcome: Size
Read file information: date created. Outcome: Date
Read file information: date modified. Outcome: Date
Read file information: owner. Outcome: Ownership
At step 1 I have tried several approaches, scandir:
import os
x = [f.name for f in os.scandir('my_path') if f.is_file()]
with open('write_to_file_path', 'w') as f:
    for row in x:
        print(row)
        f.write("%s\n" % str(row))
    f.close()
and this:
import os
rootDir = ('/Users/Ivan/Desktop/revit dynamo/')
for dirName, subdirList, fileList in os.walk(rootDir):
    print('Found directory: %s' % dirName)
    for fname in fileList:
        print('\t%s' % fname)
However, I have a hard time writing the result into a txt file.
May I ask what would be an ideal approach to audit specific directories, with all relevant information extracted and stored as a table in a txt file for now?
P.S.: this is my first question here, so please do not judge too strictly :)

Since you are learning Python 3, as an alternative to low-level path manipulation with os.path you could try pathlib (part of the standard library as of Python 3.4):
from pathlib import Path
p = Path(mydir)
# list mydir content
for child in p.iterdir():
    print(child)
# recursive iteration
for child in p.glob("**/*"):
    if child.is_dir():
        pass  # do dir stuff
    else:
        print(child.suffix)  # extension
        print(child.owner())  # file owner
        child_info = child.stat()
        # file size, mod time
        print(child_info.st_size, child_info.st_mtime)
File creation time is platform-dependent, but this post presents some solutions.
The string of a Path can be accessed as str(p).
To write to a file using pathlib:
textfile = Path(myfilepath)
# create file if it doesn't exist
textfile.touch()
# open file, write string, then close file
textfile.write_text(mystringtext)
# open file with context manager
with textfile.open('r') as f:
    content = f.read()
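To store the result as a table, the pieces above can be combined with the csv module. The following is a minimal sketch, not a complete audit tool: the function name audit_directory, the column names and the tab-separated layout are assumptions, and the owner and creation date are left out because both are platform-dependent. The report file should live outside the scanned directory, otherwise it will list itself.

```python
import csv
from datetime import datetime
from pathlib import Path


def audit_directory(root, report_path):
    """Write one tab-separated row per file under root; return the number of files."""
    count = 0
    with open(report_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter="\t")
        writer.writerow(["path", "extension", "size_bytes", "modified"])
        for child in sorted(Path(root).rglob("*")):
            if not child.is_file():
                continue  # skip directories and other non-file entries
            info = child.stat()
            modified = datetime.fromtimestamp(info.st_mtime).isoformat(timespec="seconds")
            writer.writerow([str(child), child.suffix, info.st_size, modified])
            count += 1
    return count
```

The resulting text file opens directly in a spreadsheet program, and the return value gives a quick check of how many files were recorded.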

How to open a file only using its extension?

I have a Python script which opens a specific text file located in a specific directory (the working directory) and performs some actions.
(Assume that if there is a text file in the directory, there will never be more than one such .txt file.)
with open('TextFileName.txt', 'r') as f:
    for line in f:
        pass  # perform some string manipulation and calculations
    # write some results to a different text file
    with open('results.txt', 'a') as r:
        r.write(someResults)
My question is how I can have the script locate the text (.txt) file in the directory and open it without explicitly providing its name (i.e. without giving the 'TextFileName.txt'). So, no arguments for which text file to open would be required for this script to run.
Is there a way to achieve this in Python?

You could use os.listdir to get the files in the current directory, and filter them by their extension:
import os
txt_files = [f for f in os.listdir('.') if f.endswith('.txt')]
if len(txt_files) != 1:
    raise ValueError('should be only one txt file in the current directory')
filename = txt_files[0]

You can also use glob, which is easier than os.listdir here:
import glob
# the wildcard matches every file ending in .txt; glob returns the matches as a list
text_file = glob.glob('*.txt')
if len(text_file) != 1:
    raise ValueError('should be only one txt file in the current directory')
filename = text_file[0]
A relative pattern such as '*.txt' is matched in the current working directory (the one returned by os.getcwd()).
You can change the working directory with
import os
os.chdir(r'cur_working_directory')

Since Python 3.4, you can use the pathlib library. It offers a glob method that makes it easy to filter by extension (the f-string below needs Python 3.6 or newer):
from pathlib import Path
path = Path(".")  # current directory
extension = ".txt"
# first file with the extension, or None if there is no such file
file_with_extension = next(path.glob(f"*{extension}"), None)
if file_with_extension:
    with open(file_with_extension) as f:
        ...
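The question assumes there is at most one .txt file, so it may be worth checking that assumption instead of silently taking the first match. A minimal sketch, with the hypothetical helper name find_single_file:

```python
from pathlib import Path


def find_single_file(directory, extension):
    """Return the only file in directory with the given extension, else raise ValueError."""
    matches = [p for p in Path(directory).glob(f"*{extension}") if p.is_file()]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one {extension} file in {directory}, found {len(matches)}"
        )
    return matches[0]
```

The script could then start with: with open(find_single_file('.', '.txt')) as f. Raising an error for zero or several matches makes a broken assumption visible immediately.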

I want to process every file inside a folder line by line and get a particular matching string

I am trying to process every file inside a folder line by line. I need to check for a particular string and write it into an Excel sheet. With my code, if I explicitly give the file name, the code works. If I try to get all the files, it throws an IOError. The code I wrote is below.
import os
def test_extract_programid():
    folder = 'C://Work//Scripts//CMDC_Analysis//logs'
    for filename in os.listdir(folder):
        print filename
        with open(filename, 'r') as fo:
            strings = ("/uri")
            <conditions>
            for line in fo:
                if strings in line:
                    <conditions>
I think the error is that the file is already open when the for loop starts, but I am not sure. Printing the file name prints it correctly.
The error shown is IOError: [Errno 2] No such file or directory:

If your working directory is not the same as folder, then you need to give open the path to the file as well:
with open(folder+'/'+filename, 'r') as fo:
Alternatively, you can use glob, which returns each match with the folder already prefixed:
import glob
for filename in glob.glob(folder+'/*'):
    print(filename)

It can't open the file because filename does not include the folder. You should build the full path:
for filename in os.listdir(folder):
    print(os.path.join(folder, filename))
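For reference, the whole scan could be written as below. This is a sketch in Python 3 (the question's code is Python 2), the function name find_matching_lines is hypothetical, and writing the hits to an Excel sheet is left out.

```python
import os


def find_matching_lines(folder, needle):
    """Return (file name, line number, line) for each line containing needle."""
    hits = []
    for filename in sorted(os.listdir(folder)):
        path = os.path.join(folder, filename)  # full path, not just the bare name
        if not os.path.isfile(path):
            continue  # skip sub-directories
        # errors="replace" keeps the scan going if a log contains undecodable bytes
        with open(path, "r", errors="replace") as fo:
            for number, line in enumerate(fo, start=1):
                if needle in line:
                    hits.append((filename, number, line.rstrip("\n")))
    return hits
```

With the values from the question, it would be called as find_matching_lines(folder, "/uri").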
