Opening multiple CSV files

I am trying to open multiple Excel files, but my program raises a FileNotFoundError even though the files are present in the directory.
Here is the code:
import os
import pandas as pd
path = "C:\\GPA Calculations for CSM\\twentyfourteen"
files = os.listdir(path)
print (files)
df = pd.DataFrame()
for f in files:
    df = pd.read_excel(f, 'Internal', skiprows=7)
    print("file name is " + f)
    print(df.loc[0][1])
    print(df.loc[1][1])
    print(df.loc[2][1])
The program fails on the line df = pd.read_excel(f,'Internal', skiprows = 7).
I opened the same file in another program (which opens a single file) and that worked fine. Any suggestions or advice would be highly appreciated.

os.listdir returns bare file names, relative to the directory (path) you pass as the argument, not to the current working directory. You therefore need to join the path and the file name to get the full path of each file. In your loop:
for filename in files:
    abspath = os.path.join(path, filename)
    # use abspath wherever the original loop used f
    df = pd.read_excel(abspath, 'Internal', skiprows=7)
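The behaviour described above can be checked without any Excel files. The following is a minimal sketch using only the standard library; the helper name list_full_paths and the file name a.xlsx are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def list_full_paths(directory):
    """Return the full path of every entry in `directory`, sorted by name."""
    # os.listdir() yields bare names such as "a.xlsx", so each one is joined
    # back onto the directory before it is handed to any reader.
    return [os.path.join(directory, name) for name in sorted(os.listdir(directory))]


if __name__ == "__main__":
    # Demonstration in a throwaway directory (a stand-in for the real folder).
    with tempfile.TemporaryDirectory() as folder:
        open(os.path.join(folder, "a.xlsx"), "w").close()
        print(os.listdir(folder))        # bare name only
        print(list_full_paths(folder))   # name joined onto the folder
```

Passing each element of list_full_paths(path) to the reader, instead of the bare names, should avoid the FileNotFoundError regardless of the current working directory.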

Related

saving csv files to new directory

I am trying to use this code to write my edited csv files to a new directory. Does anyone know how I specify the directory?
I have tried this but it doesn't seem to be working.
dir = r'C:/Users/PycharmProjects/pythonProject1' # raw string for windows.
csv_files = [f for f in Path(dir).glob('*.csv')] # finds all csvs in your folder.
cols = ['Temperature']
for csv in csv_files:  # iterate over the list
    df = pd.read_csv(csv)  # read csv
    df[cols].to_csv('C:/Users/Desktop', csv.name, index=False)
    print(f'{csv.name} saved.')

I think your only problem is the way you're calling to_csv(): you pass it a directory and a file name as two separate arguments. I tried that and got this error:
IsADirectoryError: [Errno 21] Is a directory: '/Users/zyoung/Desktop/processed'
because to_csv() expects a single path to a file, not a directory path followed by a file name.
You need to join the output directory and the CSV's file name, and pass the result, like:
out_dir = PurePath(base_dir, r"processed")
# ...
# ...
csv_out = PurePath(out_dir, csv_in.name)
df[cols].to_csv(csv_out, index=False)
I'm writing to the subdirectory processed in my current directory ("."), and using PurePath to do smart joins of the path components. Joining on csv_in.name rather than csv_in keeps only the file name, so the output still lands in out_dir when base_dir is an absolute path.
Here's the complete program I wrote for myself to test this:
import os
from pathlib import Path, PurePath
import pandas as pd
base_dir = r"."
out_dir = PurePath(base_dir, r"processed")
csv_files = [x for x in Path(base_dir).glob("*.csv")]
if not os.path.exists(out_dir):
    os.mkdir(out_dir)
cols = ["Temperature"]
for csv_in in csv_files:
    df = pd.read_csv(csv_in)
    # join on the file name only, so the output always lands in out_dir
    csv_out = PurePath(out_dir, csv_in.name)
    df[cols].to_csv(csv_out, index=False)
    print(f"Saved {csv_out.name}")
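One detail worth checking when joining an output directory and an input path with pathlib: if the input path is absolute, the join discards the directory that precedes it. The sketch below is an illustration only; the helper name output_path and the example paths are hypothetical.

```python
from pathlib import Path, PurePosixPath


def output_path(out_dir, csv_in):
    """Build the destination path for `csv_in` inside `out_dir`."""
    # Only the final component (.name) is kept. Joining an absolute csv_in
    # directly would discard out_dir, because an absolute segment resets
    # the path being built.
    return Path(out_dir) / Path(csv_in).name


if __name__ == "__main__":
    # PurePosixPath is used here so the demonstration behaves the same on
    # every operating system.
    print(PurePosixPath("processed", "/data/in/a.csv"))  # out_dir is dropped
    print(output_path("processed", "/data/in/a.csv"))    # stays in out_dir
```

With a relative base directory such as "." the difference does not show up, which is probably why it is easy to miss during testing.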

Searching for an excel file in two Directories and creating a path

I posted a similar question a week ago about searching through subdirectories to find a specific Excel file. This time, I need to find a specific file in either one of two directories and build a path based on which folder the file is in.
Here is the code I have so far. My work computer runs Python 2.7.18. There are no errors, but when I write the DataFrame out as an Excel file, nothing appears at the output path.
ExcelFilePath = sys.argv[1]
OutputFilePath = sys.argv[2]
# path of excel directory and use glob glob to get all the DigSym files
for (root, subdirs, files) in os.walk(ExcelFilePath):
    for f in files:
        if '/**/Score_Green_*' in f and '.xlsx' in f:
            ScoreGreen_Files = os.path.join(root, f)
            for f in ScoreGreen_Files:
                df1 = pd.read_excel(f)
                df1.to_excel(OutputFilePath)

OutputFilePath is an argument you're passing in. It won't have a value unless you supply one as the second command-line argument.
If you want to return the path, the variable you need is ScoreGreen_Files. You also don't need to iterate over ScoreGreen_Files: it is a single path string, so looping over it yields one character at a time rather than files.
Note too that `in` tests for a literal substring, so the glob pattern '/**/Score_Green_*' never matches a bare file name. Checking the prefix and the extension does what was intended.
import os
import sys
import pandas as pd
ExcelFilePath = sys.argv[1]
OutputFilePath = sys.argv[2]
# walk the Excel directory tree and pick out the Score_Green files
for (root, subdirs, files) in os.walk(ExcelFilePath):
    for f in files:
        # f is a single bare file name, so test it directly
        if f.startswith('Score_Green_') and f.endswith('.xlsx'):
            ScoreGreen_File = os.path.join(root, f)
            df1 = pd.read_excel(ScoreGreen_File)
            df1.to_excel(OutputFilePath)
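For the "either one of two directories" part of the question, a minimal sketch along the following lines may help. It assumes the file can be identified by a shell-style pattern on its name; the helper name find_first and the pattern shown are hypothetical. It uses only os.walk and fnmatch, both of which are also available in Python 2.7.

```python
import fnmatch
import os


def find_first(pattern, directories):
    """Return the path of the first file matching `pattern`, or None."""
    for directory in directories:
        for root, _subdirs, files in os.walk(directory):
            for name in sorted(files):
                # fnmatch applies shell-style wildcards to the bare file name.
                if fnmatch.fnmatch(name, pattern):
                    return os.path.join(root, name)
    return None
```

A call such as find_first("Score_Green_*.xlsx", [first_dir, second_dir]) searches the directories in the order given, so the returned path also tells you which folder held the file.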

Finding a name of a file in a list and renaming that file with the index in the list

Newbie in Python here. I'm trying to get a list of file names (.wav) from an Excel file, find the files with those names under some directory, and rename those wav files to their index in the list.
Here is a simplified version of my code:
import glob
import pandas as pd
import os
path_before = ""
path_after = ""
path_excel = ""
# the excel file with the names ("Title") of the wav files, I want to save the file name + extension
data = pd.read_excel(path_excel+"data.xlsx")
data['Title_temp'] = data['Title']+'.wav'
filelist = data['Title_temp']
# finding the wav files to be renamed
file_names_all = glob.glob(path_before+'\*')
# and get rid of the directory to just keep the names
get_name = []
for file in file_names_all:
    temp_name = file.split('\\')
    get_name.append(temp_name[-1])
# for the wav files to be renamed, go through the filelist from the excel file
# and if the name is on filelist, rename it to the index of filelist,
# so the new name should be a number.wav
# Also, all the file names are in the list and they all should be renamed,
# but I couldn't find a better way to do this, so the code below
for filename in get_name:
    if filename in filelist:
        try:
            os.rename(filename, filelist.index(filename))
        except:
            print('File ' + filename + ' could not be renamed!')
    else: print('File ' + filename + ' could not be found!')
I printed the file names for both the directory and the Excel list, and they match (with the .wav extensions and everything), but when I run the code I always get the message that the file could not be found. Could somebody tell me what's wrong? (The code is written in a Jupyter notebook on Windows.)

Assuming your code is working and you need to reduce its size and complexity, here is another version.
import pandas as pd
import os
path_before = "path to the files to be renamed"
path_excel = "path to the excel file"
data = pd.read_excel('{}/data.xlsx'.format(path_excel))
files_to_rename = os.listdir(path_before)
for index, file_title in enumerate(data['Title']):
    try:
        # as per your code, the Excel file holds names without an extension, hence the .wav added here
        if '{}.wav'.format(file_title) in files_to_rename:
            os.rename(os.path.join(path_before, '{}.wav'.format(file_title)), os.path.join(path_before, '{}.wav'.format(index)))
        else:
            print("File {}.wav not found in {}".format(file_title, path_before))
    except Exception as ex:
        print("Cannot rename file: {} ({})".format(file_title, ex))
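Before renaming anything, it can help to compute the rename pairs and inspect them first. The sketch below is a dry-run variant of the loop above, using a plain list of titles in place of the Excel column; the helper name plan_renames is hypothetical.

```python
import os


def plan_renames(titles, directory, ext=".wav"):
    """Pair each existing '<title><ext>' with '<index><ext>'; report the rest."""
    present = set(os.listdir(directory))
    plan, missing = [], []
    for index, title in enumerate(titles):
        old_name = title + ext
        if old_name in present:
            # Both sides are full paths, so the rename does not depend on
            # the current working directory.
            plan.append((os.path.join(directory, old_name),
                         os.path.join(directory, str(index) + ext)))
        else:
            missing.append(old_name)
    return plan, missing
```

Once the plan looks right, each pair can be passed to os.rename(old, new). The list of titles could come from something like data['Title'].tolist(), assuming the column holds names without extensions as in the question.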

Open csv using Python using relative filepath

os.chdir(r"C:\Downloads")
I'm getting stuck with reading in files in Python.
Why does specifying the relative file path not work when reading the file?
files = os.listdir(r"csvfilestoimport")
files
['file1.csv', 'file2.csv']
df1 = pd.concat([pd.read_csv(f) for f in files])
FileNotFoundError: [Errno 2] File file.csv does not exist:'file1.csv'

os was my choice before I learned about pathlib.
from pathlib import Path
import pandas as pd
path = Path(r"C:\Downloads")
df = pd.concat([pd.read_csv(f) for f in path.rglob("*.csv")])
With pathlib you don't have to join the directory and file name manually: rglob yields complete paths and also searches subfolders such as csvfilestoimport.
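The difference from os.listdir can be seen in a small standard-library sketch. The helper name csv_paths is hypothetical; the point is only that each path returned by rglob already includes its parent directories.

```python
from pathlib import Path


def csv_paths(folder):
    """Return every .csv file under `folder`, searched recursively and sorted."""
    # Each result already carries its parent directories, so it can be
    # opened from any working directory without a manual join.
    return sorted(Path(folder).rglob("*.csv"))
```

Because the search is recursive, it also picks up CSV files in any other subfolder; Path(folder).glob("*.csv") restricts it to the top level.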

Try creating a new file with a name that you are sure does not already exist anywhere on your computer, and check that it is created in the folder you expect. Then try to read it.
Now, with your example. Note that
files = os.listdir(r"csvfilestoimport")
['file1.csv', 'file2.csv']
returns bare names for files that actually live at
['csvfilestoimport\file1.csv', 'csvfilestoimport\file2.csv']
So you need to prepend the folder to each name. os.path.join does this safely; a raw string cannot end in a backslash, so r"csvfilestoimport\" would be a syntax error.
df1 = pd.concat([pd.read_csv(os.path.join("csvfilestoimport", f)) for f in files])

See this example:
root_path = r"C:\Downloads"
filelist = glob.glob(os.path.join(root_path, "*.csv"))
df1 = pd.concat([pd.read_csv(f) for f in filelist])

In os.chdir(), try giving the full path of the Downloads folder, "C:\Users\xxxx\Downloads", and try again:
os.chdir(r'C:\Users\xxxxx\Downloads')

Renaming multiple files in a directory using Python

I'm trying to rename multiple files in a directory using this Python script:
import os
path = '/Users/myName/Desktop/directory'
files = os.listdir(path)
i = 1
for file in files:
    os.rename(file, str(i)+'.jpg')
    i = i+1
When I run this script, I get the following error:
Traceback (most recent call last):
File "rename.py", line 7, in <module>
os.rename(file, str(i)+'.jpg')
OSError: [Errno 2] No such file or directory
Why is that? How can I solve this issue?
Thanks.

You are not giving the whole path when renaming. Do it like this:
import os
path = '/Users/myName/Desktop/directory'
files = os.listdir(path)
for index, file in enumerate(files):
    os.rename(os.path.join(path, file), os.path.join(path, ''.join([str(index), '.jpg'])))
Edit: thanks to tavo. The first version of this answer would have moved the files to the current directory; that is fixed now.

You have to make this path the current working directory first. The rest of the code has no errors.
To make it the current working directory:
os.chdir(path)
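Changing the working directory affects every relative path used later in the same process. If that is a concern, one option is to switch only for the duration of the rename and then switch back. The following is a minimal sketch; the context manager name working_directory is hypothetical.

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Switch to `path` for the duration of a with-block, then switch back."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the original directory even if the block raises.
        os.chdir(previous)
```

Inside `with working_directory(path):` the original loop with bare file names works as written, and the rest of the program keeps its own working directory.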

import os
# raw strings keep the backslashes in Windows paths literal
Source_Path = r'E:\Binayak\deep_learning\Datasets\Class_2'
Destination = r'E:\Binayak\deep_learning\Datasets\Class_2_Dest'
# os.mkdir(Destination)  # uncomment if the destination folder does not exist yet
def main():
    for count, filename in enumerate(os.listdir(Source_Path)):
        dst = "Class_2_" + str(count) + ".jpg"
        # rename each file and move it into the destination folder
        os.rename(os.path.join(Source_Path, filename), os.path.join(Destination, dst))
# Driver Code
if __name__ == '__main__':
    main()

As per @daniel's comment, os.listdir() returns just the file names and not the full path of each file. Use os.path.join(path, file) to get the full path and rename that.
import os
path = 'C:\\Users\\Admin\\Desktop\\Jayesh'
files = os.listdir(path)
for file in files:
    os.rename(os.path.join(path, file), os.path.join(path, 'xyz_' + file + '.csv'))

Just playing with the accepted answer: define the path variable and the file list:
path = "/Your/path/to/folder/"
files = os.listdir(path)
and then loop over that list:
for index, file in enumerate(files):
    # print(file)
    os.rename(path + file, path + 'file_' + str(index) + '.jpg')
or do the same in one line with a list comprehension:
[os.rename(path + file, path + 'jog_' + str(index) + '.jpg') for index, file in enumerate(files)]
I think the first is more readable; the comprehension is the same loop with its body moved in front of the for clause. Note that path must end with a slash for the string concatenation to produce a valid path.

If your files get renamed in a seemingly random order, sort the file names first. The code below sorts the names and then renames the files.
import os
import re
path = 'target_folder_directory'
files = os.listdir(path)
files.sort(key=lambda var:[int(x) if x.isdigit() else x for x in re.findall(r'[^0-9]|[0-9]+', var)])
for i, file in enumerate(files):
    # join with os.path.join so a separator is placed between folder and file name
    os.rename(os.path.join(path, file), os.path.join(path, "{}".format(i) + ".jpg"))
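A sort key that mixes int and str values can raise a TypeError in Python 3 when one name has a number where another has text at the same position (for example "1a.jpg" and "a1.jpg"). The sketch below is one possible variant that tags each chunk so numbers and text are never compared directly; the function name natural_key is hypothetical, and ordering all numeric chunks before text chunks is a design choice made here, not something the original answer specifies.

```python
import re


def natural_key(name):
    """Sort key that orders runs of digits by their numeric value."""
    key = []
    for chunk in re.findall(r"[0-9]+|[^0-9]+", name):
        if chunk[0] in "0123456789":
            # Numeric chunk: compared as an integer, sorted before text.
            key.append((0, int(chunk), ""))
        else:
            # Text chunk: compared as a string.
            key.append((1, 0, chunk))
    return key
```

It would be used as files.sort(key=natural_key), so that a name like img2.jpg sorts before img10.jpg.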

I wrote a quick and flexible script for renaming files, if you want a working solution without reinventing the wheel.
It renames files in the current directory by passing replacement functions.
Each function specifies a change you want done to all the matching file names. The code will determine the changes that will be done, and displays the differences it would generate using colors, and asks for confirmation to perform the changes.
You can find the source code here, and place it in the folder of which you want to rename files https://gist.github.com/aljgom/81e8e4ca9584b481523271b8725448b8
It works in PyCharm; I haven't tested it in other consoles.
After you define a few replacement functions, the script runs them in turn. For each one it shows the differences for all matching files in the directory, and you can confirm the replacements or decline them.

This works for me; starting the index at 1 numbers the dataset from 1.
import os
path = '/Users/myName/Desktop/directory'
files = os.listdir(path)
for index, file in enumerate(files, start=1):
    os.rename(os.path.join(path, file), os.path.join(path, ''.join([str(index), '.jpg'])))
But this will not work if your current image names already start with a number, presumably because a new name such as 1.jpg can collide with a file that has not been renamed yet.
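One way around that limitation is to rename in two passes, so that no target name is still occupied when it is assigned. The following is a sketch under the assumption that the directory contains only the files to be renumbered (os.listdir also returns subdirectories); the function name renumber and the temporary naming scheme are hypothetical.

```python
import os
import uuid


def renumber(directory, ext=".jpg", start=1):
    """Rename every entry in `directory` to start, start+1, ... plus `ext`."""
    names = sorted(os.listdir(directory))
    # Pass 1: move everything to unique temporary names so that no target
    # name such as "1.jpg" is still held by a file waiting to be renamed.
    tag = uuid.uuid4().hex
    staged = []
    for position, name in enumerate(names):
        temp = os.path.join(directory, "{}_{}.tmp".format(tag, position))
        os.rename(os.path.join(directory, name), temp)
        staged.append(temp)
    # Pass 2: assign the final sequential names.
    for number, temp in enumerate(staged, start=start):
        os.rename(temp, os.path.join(directory, "{}{}".format(number, ext)))
```

The files are numbered in plain sorted order of their original names; a different ordering can be obtained by changing the sort in the first line of the function.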
