PermissionError: [Errno 13] Permission denied: output.csv [closed] - python

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 2 years ago.
After the user provides a source directory, the following script reads in the list of CSVs it contains. It then copies one CSV's contents row by row into a new CSV; once 100,000 rows are reached, a new output CSV is started, and this continues until the original CSV has been copied completely. The process is then repeated for the next CSV file in the directory.
I sometimes encounter the above PermissionError and am not sure how to fix it, yet other runs complete with no issues at all. I've verified that neither the input nor the output files are open on my machine. I've also tried changing my directory folder's properties to not be read-only, though this setting always reverts. When the error does occur, it is always within a few seconds of first starting to process a CSV; once a file has been processing for about 5 seconds, the error no longer occurs for that CSV, though it can reappear once the script moves on to a new input CSV.
"""
Script processes all csv's in a provided directory and returns
csv's with a maximum of 100,000 rows
"""
import csv
import pathlib
import argparse
import os
import glob
def _get_csv_list(
*, description: str = "Process csv file directory.",
):
"""
Uses argument parser to set up working directory, then
extracts list of csv file names from directory
Args: Directory string
Returns list of csv file name strings
"""
parser = argparse.ArgumentParser(description=description)
parser.add_argument(
"SRC", type=pathlib.Path, help="source (input) directory"
)
parsed_arg = parser.parse_args()
os.chdir(parsed_arg.SRC)
return glob.glob("*.{}".format("csv"))
def _process_csv(file_name):
"""
Iterates through csv file and copies each row to output
file. Once 100,000 rows is reached, a new file is started
Args: file name string
"""
file_index = 0
max_records_per_file = 100_000
with open(file_name) as _file:
reader = csv.reader(_file)
first_line = _file.readline()
first_line_list = first_line.split(",")
for index, row in enumerate(reader):
if index % max_records_per_file == 0:
file_index += 1
with open(
f"output_{file_name.strip('.csv')}_{file_index}.csv",
mode="xt",
encoding="utf-8",
newline="\n",
) as buffer:
writer = csv.writer(buffer)
writer.writerow(first_line_list)
else:
try:
with open(
f"output_{file_name.strip('.csv')}_{file_index}.csv",
mode="at",
encoding="utf-8",
newline="\n",
) as buffer:
writer = csv.writer(buffer)
writer.writerow(row)
except FileNotFoundError as error:
print(error)
with open(
f"output_{file_name.strip('.csv')}_{file_index}.csv",
mode="xt",
encoding="utf-8",
newline="\n",
) as buffer:
writer = csv.writer(buffer)
writer.writerow(first_line_list)
writer.writerow(row)
def main():
"""
Primary function for limiting csv file size
Cmd Line: python csv_row_limiter.py . (Replace '.' with other path
if csv_row_limiter.py directory and csv directory are different)
"""
csv_list = _get_csv_list()
for file_name in csv_list:
_process_csv(file_name)
if __name__ == "__main__":
main()
Also, please note that the only requirement for the input CSVs' contents is that they have a large number of rows (100,000+) with some amount of data.
Any ideas of how I might resolve this issue?
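Independently of the permission issue, the split itself can keep each output file open for an entire chunk instead of reopening it for every row, which greatly reduces the number of `open()` calls. It also sidesteps the `strip('.csv')` pitfall: `str.strip` removes a set of characters from both ends, not a suffix. The sketch below is one possible rewrite, not the original author's code (Python 3, names hypothetical):

```python
import csv
import os

def split_csv(path, max_rows=100_000):
    """Split one csv into output_<stem>_<n>.csv files of at most max_rows data rows."""
    stem = os.path.splitext(os.path.basename(path))[0]
    with open(path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)  # keep the header to repeat in every output file
        out, writer, count, index = None, None, 0, 0
        for row in reader:
            if writer is None or count == max_rows:
                if out is not None:
                    out.close()
                index += 1
                # "x" mode fails loudly if an output file already exists
                out = open(f"output_{stem}_{index}.csv", "x", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
                count = 0
            writer.writerow(row)
            count += 1
        if out is not None:
            out.close()
```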

Try opening it as root, i.e. run this Python script with root or su privileges: log in as root and then run the script. Hope this helps.
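On Windows, an intermittent `[Errno 13]` on a file that is not open anywhere is often caused by an antivirus or search-indexing service briefly locking newly created files, which would match the error appearing only in the first seconds of processing each CSV. One workaround is to retry the `open()` a few times before giving up; a sketch (the function name and retry parameters are illustrative, not from the original script):

```python
import time

def open_with_retry(path, mode, retries=5, delay=0.5, **kwargs):
    """Retry open() briefly when another process holds a transient lock."""
    for attempt in range(retries):
        try:
            return open(path, mode, **kwargs)
        except PermissionError:
            if attempt == retries - 1:
                raise  # still locked after all retries: give up and re-raise
            time.sleep(delay)
```

Each `open(...)` call in `_process_csv` would then become `open_with_retry(..., mode="at", ...)` and so on.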

Related

How to run a python script from a text file context menu and compare it to other text files?

I wrote a python script that takes two files as input and then saves the difference between them as output in another file.
I bound it to a batch file (.cmd, see below) and added the batch file to the context menu of text files, so when I right-click a text file and select it, a cmd window pops up and I type the address of the file to compare.
Batch file content:
@echo off
cls
python "C:\Users\User\Desktop\Difference of Two Files.py" %1
Python Code:
import sys
import os
f1 = open(sys.argv[1], 'r')
f1_name = str(os.path.basename(f1.name)).rsplit('.')[0]
f2_path = input('Enter the path of file to compare: ')
f2 = open(f2_path, 'r')
f2_name = str(os.path.basename(f2.name)).rsplit('.')[0]
f3 = open(f'{f1_name} - {f2_name} diff.txt', 'w')
file1 = set(f1.read().splitlines())
file2 = set(f2.read().splitlines())
difference = file1.difference(file2)
for i in difference:
    f3.write(i + '\n')
f1.close()
f2.close()
f3.close()
Now, my question is how I can replace typing the 2nd file path with a drag-and-drop solution that accepts more than one file.
I don't have any problem with the Python code and can extend it myself to include more files. I just don't know how to edit the batch file so that instead of taking just one path by typing, it takes several files by drag and drop.
I would appreciate your help.
Finally, I've figured it out myself!
I Post the final code, maybe it helps somebody.
# This script prints those lines in the 1st file that are not in the other added files
# and saves the results into a 3rd file on Desktop.
import sys
import os
f1 = open(sys.argv[1], 'r')
f1_name = str(os.path.basename(f1.name)).rsplit('.')[0]
reference_set = set(f1.read().splitlines())
compare_files = input('Drag and drop files into this window to compare: ')
compare_files = compare_files.strip('"').rstrip('"')
compare_files_list = compare_files.split('\"\"')
compare_set = set()
for file in compare_files_list:
    with open(os.path.abspath(file), 'r') as f2:
        file_content = set(f2.read().splitlines())
        compare_set.update(file_content)
f3 = open(f'C:\\Users\\User\\Desktop\\{f1_name} diff.txt', 'w')
difference = reference_set.difference(compare_set)
for i in difference:
    f3.write(i + '\n')
f1.close()
f3.close()
The idea came from the fact that dragging and dropping a file onto cmd copies its path, surrounded by double quotes, into the window. I used the repeated double quotes between paths to build a list; you can see the rest in the code.
However, there's a downside: you can't drag multiple files together, you have to add them one by one, but it's better than nothing. ;)
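For what it's worth, dropping several files onto the .cmd file itself (rather than into the console window) passes each path as a separate, already-quoted argument, so `%*` in the batch file forwards them all and Python can read them from `sys.argv` with no prompt at all. A sketch under that assumption (the helper name is hypothetical):

```python
import sys

def diff_lines(reference_path, other_paths):
    """Return the lines present in the reference file but in none of the others."""
    with open(reference_path) as ref:
        reference = set(ref.read().splitlines())
    seen = set()
    for path in other_paths:
        with open(path) as f:
            seen.update(f.read().splitlines())
    return reference.difference(seen)

if __name__ == "__main__":
    # Invoked as: python diff.py file1 file2 [file3 ...]
    for line in sorted(diff_lines(sys.argv[1], sys.argv[2:])):
        print(line)
```

The batch file would then use `%*` instead of `%1` to pass every dropped file through.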

Why is my created file blank?

I'm having trouble storing data, minus the header, into a new file. I don't understand Python well enough to debug this.
Ultimately, I'd like to extract data from each file and store it in one main csv file, rather than opening each file individually and copying and pasting everything into the main csv file.
My code is as follows:
import csv, os
# os.makedirs() will create a folder with the quoted name
os.makedirs('HeaderRemoved', exist_ok=True)
# Loop through every file in the current working directory.
for csvFilename in os.listdir('directory'):
    if not csvFilename.endswith('.csv'):
        continue  # skips non-csv files
    print('Removing header from ' + csvFilename + '...')
    ### Read the CSV file in (skipping first row) ###
    csvRows = []
    csvFileObj = open(csvFilename)
    readerObj = csv.reader(csvFileObj)
    for row in readerObj:
        if readerObj.line_num == 1:
            continue  # skips first row
        csvRows.append(row)
    print(csvRows)  # -----------> check to see if anything is stored in the list
    csvFileObj.close()
    # TODO: Write out the CSV file
    csvFileObj = open(os.path.join('HeaderRemoved', 'directory/mainfile.csv'), 'w',
                      newline='')
    csvWriter = csv.writer(csvFileObj)
    for row in csvRows:
        csvWriter.writerow(row)
    csvFileObj.close()
The csv files being scanned and read contain text and numbers. I do not know whether this might prevent the script from properly reading and storing the data in the csvRows list.
The problem comes from reusing the same variable when you loop over your file names. See the documentation for listdir: it returns a list of filenames. Your newfile then no longer points to the file, but to a string filename from the directory.
https://docs.python.org/3/library/os.html#os.listdir
with open(scancsvFile, 'w') as newfile:
    array = []
    #for row in scancsvFile
    for newfile in os.listdir('directory'):  # <---- you're reassigning the variable newfile here
        if newfile.line_num == 1:
            continue
        array.append(lines)
    newfile.close()
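There is a second blank-output trap in the question's code worth noting: the output file is opened with mode `'w'` inside the loop under a fixed name, so each input file re-truncates it and only the last file's rows can survive. One way to get the combined, header-free CSV the question is after is to open the output once, outside the loop. A sketch, not the original code, with hypothetical names:

```python
import csv
import os

def strip_headers(src_dir, out_path):
    """Combine every .csv in src_dir into out_path, dropping each file's header row."""
    # Open the output once, so it is not re-truncated per input file.
    with open(out_path, "w", newline="") as out_f:
        writer = csv.writer(out_f)
        for name in sorted(os.listdir(src_dir)):
            if not name.endswith(".csv"):
                continue  # skip non-csv files
            with open(os.path.join(src_dir, name), newline="") as in_f:
                reader = csv.reader(in_f)
                next(reader, None)  # skip the header row
                writer.writerows(reader)
```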

Error when trying to read and write multiple files

I modified the code based on comments from experts in this thread. Now the script reads and writes all the individual files, iterating over them, highlighting matches, and writing the output. The current issue: after highlighting the last instance of the search item, the script drops all the remaining content after that last instance in each file's output.
Here is the modified code:
import os
import sys
import re
source = raw_input("Enter the source files path:")
listfiles = os.listdir(source)
for f in listfiles:
    filepath = source + '\\' + f
    infile = open(filepath, 'r+')
    source_content = infile.read()
    color = ('red')
    regex = re.compile(r"(\b be \b)|(\b by \b)|(\b user \b)|(\bmay\b)|(\bmight\b)|(\bwill\b)|(\b's\b)|(\bdon't\b)|(\bdoesn't\b)|(\bwon't\b)|(\bsupport\b)|(\bcan't\b)|(\bkill\b)|(\betc\b)|(\b NA \b)|(\bfollow\b)|(\bhang\b)|(\bbelow\b)", re.I)
    i = 0; output = ""
    for m in regex.finditer(source_content):
        output += "".join([source_content[i:m.start()],
                           "<strong><span style='color:%s'>" % color[0:],
                           source_content[m.start():m.end()],
                           "</span></strong>"])
        i = m.end()
    outfile = open(filepath, 'w+')
    outfile.seek(0)
    outfile.write(output)
    print "\nProcess Completed!\n"
    infile.close()
    outfile.close()
raw_input()
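The truncation the question describes happens because the text after the final match is never appended: the loop only copies up to each match's end, so adding `output += source_content[i:]` after the `finditer` loop (before writing) fixes it. A minimal Python 3 sketch of that highlight loop with the tail included (the function name is hypothetical):

```python
import re

def highlight(text, pattern, color="red"):
    """Wrap every regex match in a styled <span>, keeping the tail after the last match."""
    pieces, i = [], 0
    for m in pattern.finditer(text):
        pieces.append(text[i:m.start()])
        pieces.append("<strong><span style='color:%s'>%s</span></strong>"
                      % (color, m.group()))
        i = m.end()
    pieces.append(text[i:])  # the piece the original loop drops
    return "".join(pieces)
```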
The error message tells you what the error is:
No such file or directory: 'sample1.html'
Make sure the file exists, or use a try statement to give it a default behavior.
The reason you get that error is that the Python script has no knowledge of where the files you want to open are located.
You have to provide the file path when opening, as I have done below: simply concatenate the source path + '\\' + filename and save the result in a variable named filepath, then pass this variable to open().
import os
import sys
source = raw_input("Enter the source files path:")
listfiles = os.listdir(source)
for f in listfiles:
    filepath = source + '\\' + f  # This is the file path
    infile = open(filepath, 'r')
Also, there are a couple of other problems with your code. If you want to open the file for both reading and writing, you have to use r+ mode. Moreover, on Windows, if you open a file in r+ mode you may have to call file.seek() before file.write() to avoid another issue; you can read the reason for using file.seek() here.

add file name without file path to csv in python

I am using Blair's Python script which modifies a CSV file to add the filename as the last column (script appended below). However, instead of adding the file name alone, I also get the Path and File name in the last column.
I run the below script in windows 7 cmd with the following command:
python C:\data\set1\subseta\add_filename.py C:\data\set1\subseta\20100815.csv
The resulting ID field is populated with C:\data\set1\subseta\20100815.csv, although all I need is 20100815.csv.
I'm new to python so any suggestion is appreciated!
import csv
import sys

def process_file(filename):
    # Read the contents of the file into a list of lines.
    f = open(filename, 'r')
    contents = f.readlines()
    f.close()
    # Use a CSV reader to parse the contents.
    reader = csv.reader(contents)
    # Open the output and create a CSV writer for it.
    f = open(filename, 'wb')
    writer = csv.writer(f)
    # Process the header.
    header = reader.next()
    header.append('ID')
    writer.writerow(header)
    # Process each row of the body.
    for row in reader:
        row.append(filename)
        writer.writerow(row)
    # Close the file and we're done.
    f.close()

# Run the function on all command-line arguments. Note that this does no
# checking for things such as file existence or permissions.
map(process_file, sys.argv[1:])
Use os.path.basename(filename). See http://docs.python.org/library/os.path.html for more details.
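As a quick sketch of what that does (imported as `ntpath` here, which is what `os.path` resolves to on Windows, so the example behaves identically on any OS):

```python
import ntpath  # os.path on Windows; imported directly so this runs anywhere

def filename_only(path):
    """Strip the directory part from a Windows-style path."""
    return ntpath.basename(path)
```

In Blair's script this would mean `row.append(os.path.basename(filename))` in place of `row.append(filename)`.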

No-such-file-or-directory error in Python

I have written this code
import os
import csv
import time

class upload_CSV:
    def __init__(self):
        coup = []
        self.coup = coup
        self.csv_name = 'Coup.csv'

    def loader(self, rw):
        with open(self.csv_name, 'rb') as csv_file:
            reader = csv.reader(csv_file, delimiter=',')
            for row in reader:
                self.coup.append(row[rw])
            self.coup = self.coup[1:]
            csv_file.flush()
            csv_file.close()
        return self.coup

    def update(self, rw, message):
        #try:
        with open(self.csv_name, 'rb') as csv_file1:
            reader = csv.reader(csv_file1, delimiter=',')
            csv_file1.flush()  # To clean the register for reuse
            csv_file1.close()
        #except Exception as ex:
        #    error = 'An error occured loading data in the reader'
        #    #raise
        #    return ex
        os.remove(self.csv_name)
        writer = csv.writer(open(self.csv_name, 'wb'))
        for row in reader:
            if row[rw] == message:
                print str(row), ' has been removed'
            else:
                writer.writerow(row)
        return message
I am trying to read the content of a csv into a list first. Once I get the relevant data, I need to go back to my csv and write a new version without that record. I keep getting the single error
Line 27 in update
with open(csv_name,'rb')as csvfile1:
Python: IOError: [Errno 2] No such file or directory 'Coup.csv'
when i call the Update function
I have looked at this question, Python: IOError: [Errno 2] No such file or directory, but nothing seems to work. It's as if the first function has a lock on the file. Any help would be appreciated.
It would be enormously helpful if we saw the traceback to know exactly what line is producing the error...but here is a start...
First, you have two spots in your code where you are working with a filename that expects to only be available in the current directory. That is one possible point of failure in your code if you run it outside the directory containing the file:
self.csv_name = 'Coup.csv'
...
with open(self.csv_name,'rb') as csv_file:
...
with open('Coup.csv~','rb') as csv_file1:
...
And then, you are also referring to a variable that won't exist:
def update(self, rw, message):
    ...
    # self.csv_name? or csv_file1?
    os.remove(csv_name)
    writer = csv.writer(open(csv_name, 'wb'))
Also, how can you be sure this temp file will exist? Is it guaranteed? I normally wouldn't recommend relying on a system-temporary file.
with open('Coup.csv~','rb') as csv_file1:
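More generally, the rewrite-without-a-record step the question attempts can avoid both iterating a reader after its file is closed and deleting the original up front: write the kept rows to a temporary file, then swap it in. A Python 3 sketch (names hypothetical, not the original class):

```python
import csv
import os
import tempfile

def remove_rows(csv_path, col, value):
    """Rewrite csv_path, dropping every row whose column `col` equals `value`."""
    # Write kept rows to a temp file in the same directory, then atomically replace.
    with open(csv_path, newline="") as src, \
         tempfile.NamedTemporaryFile("w", delete=False, newline="",
                                     dir=os.path.dirname(csv_path) or ".") as tmp:
        writer = csv.writer(tmp)
        for row in csv.reader(src):
            if row and row[col] == value:
                continue  # drop the matching record
            writer.writerow(row)
    os.replace(tmp.name, csv_path)
```

Because the original file is replaced only after the temp file is fully written, a crash mid-rewrite never leaves you without `Coup.csv`.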
