I have to run my Python script on Windows too, and that is where the problems began.
I'm scraping locally saved HTML files and saving .csv versions with the data I want. On Ubuntu it runs over 100k+ files with no problems. But on Windows it says:
IOError: [Errno 13] Permission denied
It is not a permissions problem: I've rechecked, and running it with Administrator rights makes no difference.
It breaks exactly on the line where I open the file:
with open(of, 'w') as output:
    ...
I tried creating the first of the 100k files from the Python console, and from a new blank script in the same directory as my code, and it works...
So it seems doable.
Then I tried output = open(of, 'w') instead of the code above, but no luck.
The weird thing is that it creates a directory with the same name as the file, and then breaks with the IOError.
I started thinking it could be a csv thing... nope. Apart from other attempts that didn't help, the most interesting result comes from the following code:
with open(of + '.txt', 'w') as output:
    ...
Astonishingly, it creates a directory ending in .csv AND a file ending in .csv.txt with the right data!
Aargh!
Changing the file open mode to 'w+' or 'wb' didn't make a difference either.
Any ideas?
You can get permission denied if the file is opened up in another application.
Follow this link to see if any other process is using it: http://www.techsupportalert.com/content/how-find-out-which-windows-process-using-file.htm
Otherwise, I would try opening the file for reading instead of writing, to see whether you can access it at all.
-Brian
Damn it, it's working now! It was like saying I can't find my glasses while wearing them.
Thanks Brian, but that wasn't the error. The problem was that my code was using the Ubuntu path separator, even though the full path to the csv output file looked completely correct. I replaced it with os.sep and it started working like a charm :)
Thanks again!
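For anyone landing here, the fix described above (not hard-coding the Ubuntu separator) can be sketched roughly like this. It is only an illustration, not the original script: build_output_path, write_csv and the file names are hypothetical. It relies on os.path.join and os.path.splitext, so the same code should produce a valid path on both Ubuntu and Windows.

```python
import os
import tempfile


def build_output_path(out_dir, html_name):
    """Return the .csv path for a scraped HTML file (hypothetical helper)."""
    base, _ext = os.path.splitext(os.path.basename(html_name))
    # os.path.join inserts the separator of the current platform
    return os.path.join(out_dir, base + ".csv")


def write_csv(out_path, text):
    """Create the parent directory if needed, then write the file."""
    parent = os.path.dirname(out_path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    with open(out_path, "w") as output:
        output.write(text)


if __name__ == "__main__":
    # Hypothetical output location inside a fresh temporary directory
    out_dir = os.path.join(tempfile.mkdtemp(), "csv_out")
    out_path = build_output_path(out_dir, "page_0001.html")
    write_csv(out_path, "col_a,col_b\n1,2\n")
    print(out_path)
```

Building every path with os.path.join (rather than concatenating strings with "/") avoids having to think about os.sep at all.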
Yeah, I know this or similar questions have been posted to this forum, but so far none of the answers has been sufficient to solve my problem:
Here I have the following code:
with open(filename,'r',buffering=2000000) as f:
    f.readline() # takes header away
    for i, l in enumerate(f): # count the number of lines
        print('Counting {}'.format(i),end='\r')
        pass
The file is a 23 GB csv file. I get the following error:
File "programs\readbigfile.py", line 33, in <module>
for i, l in enumerate(f): # count the number of lines
PermissionError: [Errno 13] Permission denied
The error always happens at line number 1374200. I checked the file with a text editor and there is nothing unusual at that line. The same thing happened to me with a smaller version of the same file (a few gigabytes less). Then suddenly it worked.
The file is not being used by any other process at all.
Any ideas of why this error occurs in the middle of the file?
P.S. I am running this program on a computer with an Intel i5-6500 CPU, 16 GB of memory and an NVIDIA GeForce GTX 750 Ti card.
System is Windows 10. Python 3.7.6 x64/Anaconda
The file is on a local disk, no networking involved.
Whatever it is, I think your code is ok.
My ideas:
Do you need this buffering? Did you try removing it completely?
I see you're running on Windows. I don't know if that's important, but many weird issues happen on Windows.
If you're trying to access it on a network disk (Samba etc.), maybe it's not fully synced?
Are you sure nothing else tries to access this file in the meantime? Excel?
Did you try reading this file with csv.reader? I don't think it would help though, just wondering.
You can wrap the loop in try/except and, when the error is raised, check os.stat or os.access to see whether you have permissions.
Maybe printing is at fault, since it sounds like a huge file. Did you try without printing? You might want to add if i % 1000 == 0: print(...)
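A couple of the ideas above (printing only occasionally, and checking os.access / os.stat when the error is raised) could be combined roughly as follows. This is just a sketch: count_lines and report_every are names made up here, and the default buffering is used instead of the 2000000 from the question.

```python
import os


def count_lines(path, report_every=100000):
    """Count the lines after the header, printing progress only occasionally."""
    count = 0
    try:
        with open(path, "r") as f:
            f.readline()  # skip the header
            for count, _line in enumerate(f, start=1):
                if count % report_every == 0:
                    print("Counting {}".format(count), end="\r")
    except OSError as err:  # PermissionError is a subclass of OSError
        # Report what the OS says about the file at the moment of failure
        print("\nFailed after {} lines: {}".format(count, err))
        print("readable according to os.access:", os.access(path, os.R_OK))
        print("size according to os.stat:", os.stat(path).st_size)
        raise
    return count
```

If os.access still reports the file as readable after the failure, that would point away from a real permissions issue and towards the file or the disk.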
I found out the error was due to a file writing error, caused either by a bad disk block or by a disk system failure. The point is, the file had a CRC error somewhere in the middle, which I corrected just by creating the file again. It is a random error, so if you find yourself in the same situation, one of the things to check is the integrity of the file itself.
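One way to check whether a file can be read end to end is to read it in binary chunks and note where the first failure happens. The sketch below assumes that a damaged region shows up as an OSError during read(), which may not hold on every system; find_read_failure is a hypothetical helper name.

```python
def find_read_failure(path, chunk_size=1024 * 1024):
    """Read the file in binary chunks; return the offset where reading fails, or None."""
    offset = 0
    with open(path, "rb") as f:
        while True:
            try:
                chunk = f.read(chunk_size)
            except OSError as err:
                print("Read failed near byte offset {}: {}".format(offset, err))
                return offset
            if not chunk:
                return None  # reached the end of the file without errors
            offset += len(chunk)
```

A return value of None only means that every byte could be read, not that the contents are what was originally written.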
I'm writing a web scraper that has to download 8000 files in total. In my script I download the files consecutively and delete the previous one after the relevant information has been extracted. To delete the file, I use os.remove(downloaded_file). So far, in 500+ downloads, 3 times it didn't remove the file but only removed its contents, so an exception occurred when the script tried to copy data from an empty file. Has anyone experienced this, or can anyone explain what is happening?
Working on Windows 10.
I couldn't find any relevant information on this error so far.
def copy_to_master_and_delete_df(downloaded_file,master_file):
    '''open a downloaded csv file, copy the data (line 10), append to master file and delete the downloaded file'''
    while not os.path.exists(downloaded_file):
        time.sleep(0.5)
        log(f'waiting for {bank} {quarter} to download')
    with open(downloaded_file, encoding='utf-8') as df:
        data = list(df.readlines())[-1]
    os.remove(downloaded_file)
    while os.path.exists(downloaded_file):
        time.sleep(0.1)
        log(f'waiting for {bank} {quarter} to be deleted')
    with open(master_file, 'a', encoding='utf-8') as mf:
        mf.write(data)
The line data = list(df.readlines())[-1] raises an exception:
Exception has occurred: IndexError
list index out of range
It happens because of what I described earlier: the contents are removed, but not the file itself.
To work around this problem a little, I added an infinite loop:
while os.path.exists(downloaded_file):
    time.sleep(0.1)
    log(f'waiting for {bank} {quarter} to be deleted')
that allows me to manually delete the file so that the script doesn't break down.
I'm asking for help because it has gone to the next level. The script somehow skipped the loop where I check whether the file is deleted (again, the content is deleted but the file is not) and downloaded the next one, so the script crashed when it read an empty file.
Any ideas as to why this is happening or how to handle it?
I suspect that's a buffer flush problem. Try following the remove with a call to os.sync() (Unix only) or os.fsync() on Windows, or use the buffering option of open() to disable buffering when opening the file.
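Independently of the flush idea, a more defensive version of the function could look like the sketch below. It rests on an assumption that is not confirmed anywhere in this thread: that the "empty" file is a download which exists on disk but has not been fully written yet. wait_until_complete and copy_last_line_and_delete are hypothetical names, and the polling values are arbitrary.

```python
import os
import time


def wait_until_complete(path, poll=0.5, timeout=60.0):
    """Wait until the file exists, is non-empty, and its size has stopped changing."""
    deadline = time.time() + timeout
    last_size = -1
    while time.time() < deadline:
        if os.path.exists(path):
            size = os.path.getsize(path)
            if size > 0 and size == last_size:
                return True
            last_size = size
        time.sleep(poll)
    return False


def copy_last_line_and_delete(downloaded_file, master_file, poll=0.5):
    """Append the last line of the download to the master file, then delete the download."""
    if not wait_until_complete(downloaded_file, poll=poll):
        raise RuntimeError("download did not finish: " + downloaded_file)
    with open(downloaded_file, encoding="utf-8") as df:
        lines = df.readlines()
    if not lines:
        raise RuntimeError("downloaded file is empty: " + downloaded_file)
    with open(master_file, "a", encoding="utf-8") as mf:
        mf.write(lines[-1])
    # Delete only after the data has been written to the master file
    os.remove(downloaded_file)
```

Raising an explicit error on an empty file at least turns the IndexError into something that says what went wrong, and deleting last means a failure never loses data.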
I'm using Python 2.7 on Windows XP.
My script relies on tempfile.mkstemp and tempfile.mkdtemp to create a lot of files and directories with the following pattern:
_,_tmp = mkstemp(prefix=section,dir=indir,text=True)
<do something with file>
os.close(_)
Running the script always incurs the following error (although the exact line number changes, etc.). The actual file that the script is attempting to open varies.
OSError: [Errno 24] Too many open files: 'path\\to\\most\\recent\\attempt\\to\\open\\file'
Any thoughts on how I might debug this? Also, let me know if you would like additional information. Thanks!
EDIT:
Here's an example of use:
out = os.fdopen(_,'w')
out.write("Something")
out.close()
with open(_) as p:
    p.read()
You probably don't have the same value stored in _ at the time you call os.close(_) as at the time you created the temp file. Try assigning to a named variable instead of _.
It would help you and us if you could provide a very small code snippet that demonstrates the error.
Why not use tempfile.NamedTemporaryFile with delete=False? This lets you work with Python file objects, which is one bonus. It can also be used as a context manager (which should take care of making sure the file is properly closed):
with tempfile.NamedTemporaryFile('w',prefix=section,dir=indir,delete=False) as f:
    pass #Do something with the file here.
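If you want to stay with mkstemp, the advice about using a named variable could look roughly like this. It is a sketch rather than the asker's actual script: write_temp and its default prefix are made up for illustration. The descriptor returned by mkstemp is kept in fd and wrapped with os.fdopen, so leaving the with block closes it exactly once.

```python
import os
import tempfile


def write_temp(text, prefix="section_", directory=None):
    """Create a temp file with mkstemp, write to it, and return its path."""
    fd, path = tempfile.mkstemp(prefix=prefix, dir=directory, text=True)
    # os.fdopen wraps the descriptor; leaving the with block closes it
    with os.fdopen(fd, "w") as out:
        out.write(text)
    return path
```

Because every descriptor is closed before the function returns, calling this in a loop should not accumulate open files.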
try:
    directoryListing = os.listdir(inputDirectory)
    #other code goes here, it iterates through the list of files in the directory
except WindowsError as winErr:
    print("Directory error: " + str((winErr)))
This works fine, and I have tested that it doesn't choke and die when the directory doesn't exist, but I read in a Python book that I should be using "with" when opening files. Is there a preferred way to do what I am doing?
You are perfectly fine. The os.listdir function does not open files, so there is nothing you need to close. You would use the with statement when reading a text file or similar.
An example of a with statement:
with open('yourtextfile.txt') as file: #this is like file=open('yourtextfile.txt')
    lines=file.readlines() #read all the lines in the file
#when the code executed in the with statement is done, the file is automatically closed, which is why most people use this (no need for .close()).
What you are doing is fine. The with statement is indeed the preferred way of opening files, but listdir is perfectly acceptable for just reading the directory.
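For completeness: if you are on Python 3.6 or later (an assumption, since the question does not say), os.scandir returns an iterator that does support the with statement. A rough sketch, with list_file_names as a made-up helper name and OSError caught instead of WindowsError so that it also works outside Windows:

```python
import os


def list_file_names(input_directory):
    """Return the names of regular files in a directory, or [] if it cannot be read."""
    try:
        # os.scandir returns an iterator that supports the with statement (Python 3.6+)
        with os.scandir(input_directory) as entries:
            return sorted(entry.name for entry in entries if entry.is_file())
    except OSError as err:  # covers a missing directory on any platform
        print("Directory error: " + str(err))
        return []
```

This is optional; plain os.listdir inside try/except remains perfectly acceptable.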
I have written a few lines of code in Python to see if I can make it read a text file, make a list out of it where the lines are lists themselves, and then turn everything back into a string and write it as output on a different file. This may sound silly, but the idea is to shuffle the items once they are listed, and I need to make sure I can do the reading and writing correctly first. This is the code:
import csv,StringIO
datalist = open('tmp/lista.txt', 'r')
leyendo = datalist.read()
separando = csv.reader(StringIO.StringIO(leyendo), delimiter = '\t')
macrolist = list(separando)
almosthere = ('\t'.join(i) for i in macrolist)
justonemore = list(almosthere)
arewedoneyet = '\n'.join(justonemore)
with open('tmp/randolista.txt', 'w') as newdoc:
    newdoc.write(arewedoneyet)
    newdoc.close()
datalist.close()
This seems to work just fine when I run it line by line in the interpreter, but when I save it as a separate Python script (myscript.py) and run it, nothing happens. The output file is not even created. After having a look at similar issues raised here, I have introduced the 'with' statement (before that I opened the output file through output = open()), and I have tried flushing as well as closing the file... Nothing seems to work. The standalone script does not seem to do much, but the code can't be too wrong if it works in the interpreter, right?
Thanks in advance!
P.S.: I'm new to Python and fairly new to programming, so I apologise if this is due to a shallow understanding of a basic issue.
Where is the input file, and where do you want to save the output file? For this kind of script I think it's better to use absolute paths.
Use:
open('/tmp/lista.txt', 'r')
instead of:
open('tmp/lista.txt', 'r')
I think the error may be related to this.
It may have something to do with where you start your interpreter.
Try using an absolute path /tmp/randolista.txt instead of the relative path tmp/randolista.txt to isolate the problem.
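Relative paths are resolved against the current working directory, which can differ between the interpreter session and the standalone script. One possible way to take the working directory out of the picture is to build paths relative to the script file itself. The helper below is only a sketch and path_near is a made-up name.

```python
import os


def path_near(anchor_file, *parts):
    """Build an absolute path relative to the directory containing anchor_file."""
    base_dir = os.path.dirname(os.path.abspath(anchor_file))
    return os.path.join(base_dir, *parts)
```

Inside myscript.py you could then call path_near(__file__, 'tmp', 'lista.txt'), assuming the tmp directory sits next to the script. Printing os.getcwd() at the top of the script is also a quick way to see where the relative paths currently point.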