How to write a script that edits every Excel file inside a folder - python

I want to write a script that writes a certain text to cell A1 of every Excel file inside a folder. I'm not sure how to get Python to open every file one by one, make changes to A1, and then save over the original file.
import os
import openpyxl
os.chdir('C:/Users/jdal/Downloads/excelWorkdir')
folderList = os.listdir()
for file in in os.walk():
for name in file:
if name.endswith(".xlsx" and ".xls")
wb = openpyxl.load_workbook()
sheet = wb.get_sheet_by_name('Sheet1')
sheet['A1'] = 'input whatever here!'
sheet['A1'].value
wb.save()

I see the following errors in your code:
There is an error in the .endswith call; it should be
name.endswith((".xlsx", ".xls"))
i.e. it needs to be fed a tuple of allowed endings. (The expression ".xlsx" and ".xls" evaluates to just ".xls", so the original check never matches .xlsx files.)
Your if lacks a : at the end, and your indentation seems to be broken.
You should pass one argument to .load_workbook and one argument to .save, i.e. the name of the file to read or write.
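Putting those fixes together, a minimal sketch might look like the following. The function name set_a1_in_folder and its parameters are illustrative and not from the original post. It assumes every workbook contains a sheet named Sheet1, and it only handles .xlsx files, since openpyxl does not read the legacy .xls format.

```python
import os

import openpyxl


def set_a1_in_folder(folder, text, sheet_name="Sheet1"):
    """Write `text` to cell A1 of `sheet_name` in every .xlsx file in `folder`."""
    changed = []
    for name in sorted(os.listdir(folder)):
        # Skip non-workbooks and the "~$" lock files Excel creates for open files
        if not name.lower().endswith(".xlsx") or name.startswith("~$"):
            continue
        path = os.path.join(folder, name)
        wb = openpyxl.load_workbook(path)  # load_workbook needs the file path
        wb[sheet_name]["A1"] = text  # assumption: the sheet exists in every file
        wb.save(path)  # saving to the same path overwrites the original
        changed.append(name)
    return changed
```

Calling set_a1_in_folder('C:/Users/jdal/Downloads/excelWorkdir', 'input whatever here!') would then update each workbook in place, without needing os.chdir.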

I would iterate through the folder and use pandas to read the files as temporary data frames. From there, they are easy to manipulate.
Assuming you are in the relevant directory:
import pandas as pd
import os

files = os.listdir()
for name in files:
    if name.endswith('.csv'):
        # Read the file into a dataframe and edit a cell
        df = pd.read_csv(name)
        df.iloc[0, 0] = 'changed value'
        # Overwrite the original file with the edited dataframe
        df.to_csv(name, index=False)
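One caveat with the snippet above: by default pd.read_csv treats the first line of the file as column headers, so df.iloc[0, 0] is the first data row rather than the cell a spreadsheet would show as A1. A possible variant, assuming the goal really is the top-left cell, reads the file without a header row. The helper names here are hypothetical.

```python
import os

import pandas as pd


def set_first_cell(path, value):
    """Overwrite the top-left cell (A1) of a CSV file in place."""
    # header=None keeps the first line as data, so row 0 is the file's first row;
    # dtype=str and keep_default_na=False read every cell as plain text
    # instead of converting it to a number or NaN
    df = pd.read_csv(path, header=None, dtype=str, keep_default_na=False)
    df.iloc[0, 0] = value
    df.to_csv(path, header=False, index=False)


def set_first_cell_in_folder(folder, value):
    """Apply set_first_cell to every .csv file directly inside `folder`."""
    for name in sorted(os.listdir(folder)):
        if name.endswith(".csv"):
            set_first_cell(os.path.join(folder, name), value)
```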

Related

xlwings open workbook using context manager and overwrite it after making changes

What I want to do is open a CSV file, make changes inside it, and save it. My code looks like this:
savePath = r'D:\file.csv'
with xw.App(visible=True) as app:
    wb = app.books.open(savePath)
    sheet1 = wb.sheets[0]
    saveDF = sheet1.range(wb.sheets[0].used_range.address).options(pd.DataFrame, chunksize=30_000, index=False).value
    wb.sheets[0].clear()
    wb.sheets[0].range('A1').options(index=False, header=True).value = saveDF
    wb.api.SaveAs(savePath, FileFormat=FileFormat.xlCSV)
This code kind of works, but when it saves the file it asks me if I want to overwrite the file, since it already exists.
So what I did was save it as a ".csv.txt" file, then remove the ".csv" file and rename the ".csv.txt" file back to ".csv" using the code below:
savePath = r'D:\file.csv'
with xw.App(visible=True) as app:
    wb = app.books.open(savePath)
    sheet1 = wb.sheets[0]
    saveDF = sheet1.range(wb.sheets[0].used_range.address).options(pd.DataFrame, chunksize=30_000, index=False).value
    wb.sheets[0].clear()
    wb.sheets[0].range('A1').options(index=False, header=True).value = saveDF
    wb.api.SaveAs(savePath + '.txt', FileFormat=FileFormat.xlCSV)
    os.remove(savePath)
    os.rename(savePath + '.txt', savePath)
The issue I have here is that I get this error:
"PermissionError: [WinError 32] The process cannot access the file because it is being used by another process:"
This means that Python tries to rename the file while it is still being saved.
So my questions are:
Is there a way to overwrite a CSV file without having to manually confirm the "file already exists - do you want to overwrite it" prompt?
Is there anything I can change in my code to avoid the [WinError 32] shown above?
How can I change my code so it does not open two instances of Excel (right now it opens a blank one, probably when I use the "with" statement, and a second one with my file when I use app.books.open)?
Thank you in advance for any kind of help.
I had the same problem, caused by a corrupted Excel file saved by ExcelWriter.
If you write an Excel file with pd.ExcelWriter in earlier lines and later open the same file with xlwings, use a context manager for the writer so the file is closed properly first.
You don't have to overwrite the file through SaveAs; you can simply call wb.save(path=None). It will save the file under the same name.
Just make sure the file you are using is a healthy Excel file, i.e. not corrupted.

python: error reading files into data frame

I'm trying to import multiple CSV files from one folder into one data frame. This is my code. It can iterate through the files and print their names successfully, and it can read one file into a data frame, but combining them raises an error. I saw many similar questions, but the answers are complex; I thought the 'pythonic' way is to keep things simple, and I am new to this. Thanks in advance for any help. The error message is always: No such file or directory: 'some file name', which makes no sense to me because the file name was printed successfully in the print step.
import pandas as pd
# this works
df = pd.read_csv("headlines/2017-1.csv")
print(df)
path = 'C:/.../... /.../headlines/'  # <-- full path, shortened here
files = os.listdir(path)
print(files)  # <-- prints all file names successfully
for filename in files:
    print(filename)  # <-- successfully prints all file names
    df = pd.read_csv(filename)  # <-- error here
    df2.append(df)  # append to data frame
It seems like your current working directory is different from your path. Call os.chdir(path) before attempting to read your CSV files.
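Changing the working directory works because os.listdir returns bare file names without the folder. An alternative that leaves the working directory alone is to join each name with the folder before reading it. The sketch below also collects the frames and combines them with pd.concat, since DataFrame.append returned a new frame rather than modifying df2 and was removed in pandas 2.0. The function name read_folder is illustrative.

```python
import os

import pandas as pd


def read_folder(path):
    """Read every .csv file in `path` and return one combined DataFrame."""
    frames = []
    for filename in sorted(os.listdir(path)):
        if filename.endswith(".csv"):
            # os.listdir gives names only, so prepend the folder
            frames.append(pd.read_csv(os.path.join(path, filename)))
    # Concatenate once at the end; this raises ValueError if no CSV was found
    return pd.concat(frames, ignore_index=True)
```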

Saving multiple Excel Files to a Specific Path with Unique Filenames

In a loop I adjust the CSV structure of each file.
Now I want to save them into the assigned folder with unique file names.
I can save to a CSV file, but then the CSV file gets overwritten, leaving only the final modified result of the test5 file. I want to save each CSV under its own name in the format filename + _modified.
I have 5 csv files:
Test1.csv
test2.csv
test3.csv
test4.csv
test5.csv
I import them:
for x in allFiles:
    print(x)
    stop=1
    with open(x, 'r') as thecsv:
        base=os.path.basename(ROT)
        filename=os.path.splitext(base)[0]
        print(name)
Now I loop through the files, manipulate them, and save each one as a DataFrame.
This is working fine.
Now I want to save each file separately in the output folder with a unique name (filename + _modified).
Output='J:\Temp\Output'
This is what I tried:
df2.to_csv(output+filename+'//_modified.csv'),sep=';',header=False,index=False)
also tried:
df2.to_csv(output(os.path.join(name+'//_modified.csv'),sep=';',header=False,index=False)
I'm hoping the output folder will look like this:
test1_modified.csv
test2_modified.csv
test3_modified.csv
test4_modified.csv
test5_modified.csv
I would do something like this, making a new name before the call to write it out:
testFiles = ["test1.csv", "test2.csv", "test3.csv",
             "test4.csv", "test5.csv"]
# iterate over each one
for f in testFiles:
    # strip the old extension
    f = f.replace(".csv", "")
    # I'd use join, but you can use +
    newName = "_".join([f, "modified.csv"])
    print(newName)
    # make your call to write it out
I would also check the pandas docs for writing out, it's simpler than what you're trying: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
import pandas as pd
# read data
iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
# write data to local
iris.to_csv("iris.csv")
I found the solution to my problem:
df.to_csv(output+'\\'+filename+'.csv',sep=';',header=False,index=False)
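For building the output name, a small helper based on os.path may be less error-prone than concatenating separators by hand. This is only a sketch; modified_path and its suffix parameter are hypothetical names, not part of the original code.

```python
import os


def modified_path(source_path, output_dir, suffix="_modified"):
    """Return output_dir/<stem><suffix><ext> for the given source file."""
    base = os.path.basename(source_path)  # e.g. "test1.csv"
    stem, ext = os.path.splitext(base)  # ("test1", ".csv")
    return os.path.join(output_dir, stem + suffix + ext)
```

Inside the loop this could be used as df2.to_csv(modified_path(x, Output), sep=';', header=False, index=False), so each input file gets its own output file.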

how to move multiple images from a folder to another folder in python?

I am trying to move multiple images from one folder to another using shutil.move(). I have saved the image names in a CSV file.
ex: [img1 , img25, img55....]
I have tried the code below:
import pandas as pd
import shutil
cop_folder = path to folder containing images
destination_folder = path where I want to move the images
df = pd.read_csv('', header = None)
for i in df:
    if i in cop_folder:
        shutil.move( i, dest_folder)
    else:
        print('fail')
It fails with:
TypeError: 'in <string>' requires string as left operand, not int
Try this approach:
import pandas as pd
import os

def move_file(old_file_path, new_directory):
    if not os.path.isdir(new_directory):
        os.mkdir(new_directory)
    base_name = os.path.basename(old_file_path)
    new_file_path = os.path.join(new_directory, base_name)
    # Deletes a file if that file already exists there, you can change this behavior
    if os.path.exists(new_file_path):
        os.remove(new_file_path)
    os.rename(old_file_path, new_file_path)

cop_folder = 'origin-folder\\'
destination_folder = 'dest_folder\\'
df = pd.read_csv('files.csv', header=None)
for i in df[0]:
    filename = os.path.join(cop_folder, i)
    move_file(filename, destination_folder)
The file names inside the CSV must include an extension. If they don't, use filename = os.path.join(cop_folder, i + '.jpg') instead.
There are a few issues here. Firstly, you are iterating over a dataframe, which yields the column labels, not the values; that is what causes the error you posted (with header=None the labels are integers). If you really want to use pandas just to import a CSV, you could change it to for i in df.iterrows(), but even then it won't simply return the file name; it yields (index, Series) pairs. You'd probably be better off using the standard csv module to read the file. That way your file names are read in as lists of strings and behave as you intended.
Secondly, unless there is something else going on in your code, you can't look for files in a folder using the 'in' keyword; on a path string it only performs a substring test. You'll need to construct a full file path by joining the folder path and the file name.
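A sketch of that suggestion using only the standard library is shown below. It assumes the CSV holds one file name per row in the first column, with the extension included; the function name move_listed_files is made up for illustration.

```python
import csv
import os
import shutil


def move_listed_files(csv_path, source_dir, dest_dir):
    """Move each file named in the CSV's first column; return the names not found."""
    os.makedirs(dest_dir, exist_ok=True)
    missing = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            name = row[0].strip()
            # Build the full path instead of testing `name in folder_string`
            source = os.path.join(source_dir, name)
            if os.path.isfile(source):
                shutil.move(source, os.path.join(dest_dir, name))
            else:
                missing.append(name)
    return missing
```

Returning the missing names, rather than printing 'fail', makes it easier to see which entries in the CSV did not match a file.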

read csv in for loops and move to the next one python

Sorry for not being so clear. I want to read CSV files in a for loop. Each file is then processed with some calculations. Afterwards I want to read the next file and do the same. Instead of manually changing the file names, how can I do this with a loop?
My code below is not working; the way I pass the file names to pd.read_csv is wrong. But how do I solve this?
filenumber=0
for files in range(4):
    filenames=["file1","file2",
               "file3","file4"]
    os.chdir(r"/folder")
    results=pd.read_csv('files[filenumber].csv',sep=',',header=0, index_col=None)
    # do something with the file and then move to the next file
    filenumber=+1
I guess you are looking for this:
filenames=["file1","file2","file3","file4"]
for i in range(len(filenames)):
    filename = filenames[i]+'.csv'
    results=pd.read_csv(filename,sep=',',header=0, index_col=None)
    # Now do whatever operations you want
Since you have a pattern in your file names, another way would be:
for i in range(4):
    filename = 'file'+str(i+1)+'.csv'
    results=pd.read_csv(filename,sep=',',header=0, index_col=None)
    # Now do whatever operations you want
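If the file names are not known in advance, another option is to let the file system supply them. The sketch below uses pathlib to find every .csv file in a folder; csv_files and process_all are hypothetical helper names, and the process argument stands in for whatever calculation is applied to each file.

```python
from pathlib import Path


def csv_files(folder):
    """Return the .csv files directly inside `folder`, sorted by name."""
    return sorted(Path(folder).glob("*.csv"))


def process_all(folder, process):
    """Call process(path) for each CSV file and collect the results by file name."""
    return {path.name: process(path) for path in csv_files(folder)}
```

For example, process_all("/folder", lambda p: pd.read_csv(p, sep=',', header=0, index_col=None)) would read each file in turn. Note that the sort is alphabetical, so file10.csv comes before file2.csv.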
You can also walk a whole directory tree automatically with os.walk, which visits the starting directory and every subdirectory below it:
import csv
import os

for root, dirs, files in os.walk(".\\your_directory_to_start\\"):
    # for each directory in the tree...
    for file in files:
        # ...look at each file and keep only the CSV files
        if file.endswith(".csv"):
            # store the full path to the file in a variable and show it
            ruta_completa = os.path.join(root, file)
            print(ruta_completa)
            # open the file and read its rows into a list
            with open(ruta_completa, newline="") as mi_archivo:
                mi_csv = csv.reader(mi_archivo)
                mis_datos = list(mi_csv)
            # show all the data on screen
            print(mis_datos)
            # the first row
            print(mis_datos[0])
            # the first cell of the first row
            print(mis_datos[0][0])
            # do whatever you want... even create a new xlsx file or csv file
