How to avoid a FileNotFoundError with os.listdir [duplicate] - python

This question already has answers here:
Python raising FileNotFoundError for file name returned by os.listdir
(3 answers)
Closed 2 months ago.
I'm developing a loop in which each CSV file in a specified directory is resampled and then exported to a new file. I get a FileNotFoundError even though I have tried various folders and exact folder paths.
# Specify folder name
serial = '015'
# Specify directory (note - '...' substitute for the full path used back to the drive letter)
root_dir = '...\\CleanTemps\\{}\\'.format(str(serial))
# loop
for filename in os.listdir(root_dir):
    if filename.endswith('.csv'):
        print(filename)
        # Pull in the file
        df = pd.read_csv(filename)
This prints a list of the eight .csv files in that folder. However, when I use the following code to pull in the files (one by one, as a df to modify), I receive the FileNotFoundError:
# loop
for filename in os.listdir(root_dir):
    if filename.endswith('.csv'):
        # Pull in the file
        df = pd.read_csv(filename)

The path to your file is composed of root_path plus the file name. You can use:
import os
from pathlib import Path
root_path = Path(root_dir)
for filename in os.listdir(root_path):
    if filename.endswith('.csv'):
        # Pull in the file
        df = pd.read_csv(root_path / filename)
or you can use:
for filepath in root_path.glob("*.csv"):
    df = pd.read_csv(filepath)

You must provide the full (or relative) path to the file, not just its name. That path starts from the directory that holds your files, and you can use os.path.join to build it:
df = pd.read_csv(os.path.join(root_dir, filename))
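A minimal sketch of the two approaches side by side. It assumes a throwaway temporary directory in place of the real CleanTemps folder, and the file names (a.csv, b.csv, notes.txt) and helper function names are made up for illustration. It only builds the paths; each one could then be handed to pd.read_csv.

```python
import os
import tempfile
from pathlib import Path


def csv_paths_with_join(root_dir):
    """Build full paths by joining the directory with each name from os.listdir."""
    return sorted(
        os.path.join(root_dir, name)
        for name in os.listdir(root_dir)
        if name.endswith(".csv")
    )


def csv_paths_with_glob(root_dir):
    """Let pathlib yield paths that already include the directory."""
    return sorted(str(p) for p in Path(root_dir).glob("*.csv"))


# Hypothetical stand-in for the real data folder.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.csv", "b.csv", "notes.txt"):
        Path(demo_dir, name).write_text("x,y\n1,2\n")
    joined = csv_paths_with_join(demo_dir)
    globbed = csv_paths_with_glob(demo_dir)
    joined_names = [os.path.basename(p) for p in joined]
    same_result = joined == globbed
```

Both helpers return paths that include the directory, which is what the bare names from os.listdir were missing.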

Related

Searching for an excel file in two Directories and creating a path

I posted a similar question a week ago about searching through subdirectories to find a specific Excel file. This time, however, I need to find a specific file in either one of two directories and build a path based on which folder the file is located in.
Here is the code I have so far. My work computer is running Python 2.7.18. There are no errors, but when I write the df out as an Excel file, nothing appears at the output path.
ExcelFilePath = sys.argv[1]
OutputFilePath = sys.argv[2]
# path of excel directory and use glob glob to get all the DigSym files
for (root, subdirs, files) in os.walk(ExcelFilePath):
    for f in files:
        if '/**/Score_Green_*' in f and '.xlsx' in f:
            ScoreGreen_Files = os.path.join(root, f)
            for f in ScoreGreen_Files:
                df1 = pd.read_excel(f)
                df1.to_excel(OutputFilePath)
OutputFilePath is an argument you're passing in. It won't have a value unless you supply one as a command-line argument.
If you want to return the path, the variable you need to return is ScoreGreen_Files. You also don't need to iterate through ScoreGreen_Files, as it is a single path string: the file you're looking for.
import os
import sys

import pandas as pd

ExcelFilePath = sys.argv[1]
OutputFilePath = sys.argv[2]
# Walk the Excel directory to find the Score_Green file
for (root, subdirs, files) in os.walk(ExcelFilePath):
    for f in files:  # f is a single bare file name, never a path
        # A glob pattern such as '/**/Score_Green_*' never occurs inside a file name,
        # so test the name itself instead
        if f.startswith('Score_Green_') and f.endswith('.xlsx'):
            ScoreGreen_File = os.path.join(root, f)
            df1 = pd.read_excel(ScoreGreen_File)
            df1.to_excel(OutputFilePath)
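For the "either one of two directories" part of the question, one possible sketch is to walk each directory in turn and match the bare file names against a shell-style pattern with fnmatch. The function name find_first_match, the folder names and the file Score_Green_01.xlsx are assumptions made for this illustration. The function itself relies only on os.walk and fnmatch; the demo harness uses tempfile.TemporaryDirectory, which assumes Python 3.

```python
import fnmatch
import os
import tempfile


def find_first_match(directories, pattern):
    """Return the full path of the first file whose name matches pattern, or None."""
    for directory in directories:
        for root, _subdirs, files in os.walk(directory):
            for name in sorted(files):
                # fnmatch compares the bare file name against a shell-style pattern
                if fnmatch.fnmatch(name, pattern):
                    return os.path.join(root, name)
    return None


# Hypothetical layout: the file lives in a subfolder of the second directory.
with tempfile.TemporaryDirectory() as base:
    dir_a = os.path.join(base, "folder_a")
    dir_b = os.path.join(base, "folder_b")
    nested = os.path.join(dir_b, "nested")
    os.makedirs(dir_a)
    os.makedirs(nested)
    target = os.path.join(nested, "Score_Green_01.xlsx")
    open(target, "w").close()

    found = find_first_match([dir_a, dir_b], "Score_Green_*.xlsx")
    found_matches_target = found == target
    missing = find_first_match([dir_a], "Score_Green_*.xlsx")
```

The returned path already says which folder held the file, and a None result makes the "not found in either directory" case explicit before anything is passed to pd.read_excel.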

iterate over files in directory and use file names as variables, and assign the file path to the variable

I was trying to iterate through a folder and get the names and paths of its files in Databricks using PySpark.
Then a thought occurred to me: could we turn each file name into a variable and assign the path of that file to it?
We could use dbutils to create widgets and pass the file name as a parameter, to make things easier.
Working on this, I got as far as obtaining the file paths and file names.
But I couldn't figure out how to create the variables and assign each file's path to the variable named after that file.
Here's the code:
import pandas as pd
import os
list1 =[]
list2 =[]
directory='/dbfs/FileStore/tables'
dir='/FileStore/tables'
for filename in os.listdir(directory):
    if filename.endswith(".csv") or filename.endswith(".txt"):
        file_path = os.path.join(dir, filename)
        print(file_path)
        print(filename)
        list1.append(file_path)
        list2.append(filename)
Thanks in advance
If you're set on assigning paths to variables with the file name, then you can try:
...
for filename in os.listdir(directory):
    if filename.endswith(".csv") or filename.endswith(".txt"):
        file_path = os.path.join(dir, filename)
        print(file_path)
        print(filename)
        exec("%s = '%s'" % (filename, file_path))
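Notice that the additional set of quotes avoids syntax and name errors on the right-hand side. However, this solution is still fraught with problems. A name such as data.csv is not a valid Python identifier, so the generated assignment fails for it. Backslashes in a file path are also interpreted as escape sequences; in the example below, \f turns into a form feed (\x0c):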
filename = 'file1'
filepath = '\maindir\foo'
exec("%s = '%s'" % (filename, filepath))
file1
'\\maindir\x0coo'
But a dictionary seems much better suited to this situation:
...
filenames_and_paths = {}
for filename in os.listdir(directory):
    if filename.endswith(".csv") or filename.endswith(".txt"):
        file_path = os.path.join(dir, filename)
        print(file_path)
        print(filename)
        filenames_and_paths[filename] = file_path
I'm not sure why you've created the two lists for the paths and names, but if they are needed you can also use a dictionary comprehension. Note that list2 holds the names and list1 holds the paths:
filenames_and_paths = {name: path for name, path in zip(list2, list1)}
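The dictionary approach can be sketched in a self-contained form as follows. The helper name map_names_to_paths and the sample file names are assumptions for illustration, and a temporary directory stands in for /dbfs/FileStore/tables.

```python
import os
import tempfile


def map_names_to_paths(directory, extensions=(".csv", ".txt")):
    """Return {file name: full path} for names ending with one of the extensions."""
    return {
        name: os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(extensions)  # str.endswith accepts a tuple of suffixes
    }


# Hypothetical folder contents.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("sales.csv", "readme.txt", "image.png"):
        open(os.path.join(demo_dir, name), "w").close()
    paths = map_names_to_paths(demo_dir)
    sales_path = paths["sales.csv"]
    expected_sales_path = os.path.join(demo_dir, "sales.csv")
```

Looking a path up with paths["sales.csv"] gives the same convenience as a variable named after the file, without generating code from file names.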
With PySpark, I'd rather suggest using the Hadoop FS API to list files, as os.listdir won't work with external buckets/storage.
Here is an example that you can adapt:
# access hadoop fs via the JVM
Path = sc._gateway.jvm.org.apache.hadoop.fs.Path
conf = sc._jsc.hadoopConfiguration()
# list directory
directory = Path("/dbfs/FileStore/tables/*.csv")
gs = directory.getFileSystem(conf).globStatus(directory)
# create tuples (filename, filepath), you can also filter specific files here...
paths = []
if gs:
    paths = [(f.getPath().getName(), f.getPath().toString()) for f in gs]
for filename, file_path in paths:
    # your process
    pass

Why I am not getting into the path file " the system cannot find the path specified " [duplicate]

This question already has answers here:
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I am using a Python function, called from Robot Framework, to copy a file from a source to a destination.
I have used os.path.join(), os.listdir() and os.path.normpath() to reach the folder, and shutil to copy.
But every time I get this error:
WindowsError: [Error 3] The system cannot find the path specified: '\\10.28.108.***\\folder\\folder2\\out/*.*'
My code
from pathlib import Path
import shutil
import os
#filename = Path ("\\10.28.108.***\folder\folder2\out\001890320181228184056-HT.xml")
source = os.listdir("\\10.28.108.***\folder\folder2\out")
destination = "\\10.28.108.***\folder\folder2\"
for files in source:
    if files.endswith(".xml"):
        shutil.copy(files, destination)
You can read your file this way:
filename = secure_filename(file_name.filename)
file_split = os.path.splitext(filename)
filename = file_split[0] + '__' + str(uuid.uuid4()) + file_split[1]
filepath = os.path.join(dest_dir, filename)
syspath = os.path.join(upload_dir, filepath)
file_name.save(syspath)
First, check that you can open this folder (\\10.28.108.***\folder\folder2\out) from your file explorer.
Second, a path to a remote folder must start with two backslashes, and a raw string keeps them intact. For example:
source = os.listdir(r"\\10.28.108.xxx\folder\folder2\out")
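A short sketch of the two points involved: how a raw string preserves backslashes, and how each name from os.listdir is joined back onto its folder before copying. The share name nas\temp and the helper copy_xml_files are hypothetical, and temporary directories stand in for the network share so that the example can run anywhere.

```python
import os
import shutil
import tempfile

# Hypothetical share name, for illustration only.
plain = "\\nas\temp"    # "\\" collapses to one backslash and "\t" becomes a tab
raw = r"\\nas\temp"     # raw string: every backslash is kept as typed


def copy_xml_files(source_dir, destination_dir):
    """Copy every .xml file, joining source_dir onto each bare name first."""
    copied = []
    for name in sorted(os.listdir(source_dir)):
        if name.endswith(".xml"):
            shutil.copy(os.path.join(source_dir, name), destination_dir)
            copied.append(name)
    return copied


# Temporary directories stand in for the remote source and destination folders.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name in ("report.xml", "skip.log"):
        open(os.path.join(src, name), "w").close()
    copied = copy_xml_files(src, dst)
    destination_listing = os.listdir(dst)
```

Passing only the bare name to shutil.copy, as in the question's loop, makes Python look for the file in the current working directory rather than on the share.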

Opening multiple CSV files

I am trying to open multiple Excel files. My program throws a "FileNotFoundError", although the file is present in the directory.
Here is the code:
import os
import pandas as pd
path = "C:\\GPA Calculations for CSM\\twentyfourteen"
files = os.listdir(path)
print (files)
df = pd.DataFrame()
for f in files:
    df = pd.read_excel(f, 'Internal', skiprows=7)
    print("file name is " + f)
    print(df.loc[0][1])
    print(df.loc[1][1])
    print(df.loc[2][1])
Program gives error on df = pd.read_excel(f,'Internal', skiprows = 7).
I opened the same file on another program (which opens single file) and that worked fine. Any suggestions or advice would be highly appreciated.
os.listdir returns bare file names, relative to the directory (path) you pass as the argument. You therefore need to join the path and the file name to get the full path of each file. In your loop:
for filename in files:
    abspath = os.path.join(path, filename)
    # ...then use abspath wherever f was used, for example:
    df = pd.read_excel(abspath, 'Internal', skiprows=7)
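The cause can be checked directly with a small experiment. This is only a sketch: the randomly generated file name is an assumption chosen so that it cannot already exist in the current working directory, and a temporary folder stands in for the real one.

```python
import os
import tempfile
import uuid

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical, randomly named file that exists only inside `folder`.
    unique_name = "demo_" + uuid.uuid4().hex + ".xlsx"
    open(os.path.join(folder, unique_name), "w").close()

    name = os.listdir(folder)[0]
    # The bare name is resolved against the current working directory...
    found_by_name = os.path.exists(name)
    # ...while the joined path points at the file's real location.
    found_by_path = os.path.exists(os.path.join(folder, name))
```

Here found_by_name is False and found_by_path is True, which mirrors the FileNotFoundError raised when the bare name is passed to pd.read_excel.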
