I'm trying to create and write a file only if it does not exist yet, in a way that is co-operatively safe from race conditions, and I'm having a (probably stupid) problem. First, here's the code:
import os

def safewrite(text, filename):
    print "Going to open", filename
    fd = os.open(filename, os.O_CREAT | os.O_EXCL, 0666) ##### problem line?
    print "Going to write after opening fd", fd
    os.write(fd, text)
    print "Going to close after writing", text
    os.close(fd)
    print "Going to return after closing"

#test code to verify file writing works otherwise
f = open("foo2.txt", "w")
f.write("foo\n");
f.close()
f = open("foo2.txt", "r")
print "First write contents:", f.read()
f.close()
os.remove("foo2.txt")

#call the problem method
safewrite ("test\n", "foo2.txt")
Then the problem: I get an exception:
First write contents: foo
Going to open foo2.txt
Going to write after opening fd 5
Traceback (most recent call last):
  File "/home/user/test.py", line 21, in <module>
    safewrite ("test\n", "foo2.txt")
  File "/home/user/test.py", line 7, in safewrite
    os.write(fd, text)
OSError: [Errno 9] Bad file descriptor
Probable problem line is marked in the code above (I mean, what else could it be?), but I can't figure out how to fix it. What is the problem?
Note: above was tested in a Linux VM, with Python 2.7.3. If you try the code and it works for you, please write a comment with your environment.
Alternative code to do the same thing at least as safely is also very welcome.
Change the line:
fd = os.open(filename, os.O_CREAT | os.O_EXCL, 0666)
to be instead:
fd=os.open(filename, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0666)
You must open the file with a flag such that you can write to it (os.O_WRONLY).
From open(2):
DESCRIPTION
The argument flags must include one of the following access modes: O_RDONLY, O_WRONLY, or O_RDWR. These request opening the file read-only,
write-only, or read/write, respectively.
From write(2):
NAME
write - write to a file descriptor
...
ERRORS
EAGAIN The file descriptor fd has been marked non-blocking (O_NONBLOCK) and the write would block.
EBADF fd is not a valid file descriptor or is not open for writing.
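A minimal sketch of the corrected function in Python 3 syntax, assuming the goal is to fail loudly when the file already exists. The name safe_write and the 0o644 permission mode are illustrative choices, not part of the original code.

```python
import os


def safe_write(text, filename):
    """Create filename and write text to it; fail if it already exists."""
    # O_CREAT | O_EXCL performs the existence check and the creation in a
    # single call, raising FileExistsError if the file is already there.
    # O_WRONLY is the access mode that permits writing to the descriptor.
    fd = os.open(filename, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    # os.fdopen() wraps the descriptor in a file object and closes it
    # when the with block exits.
    with os.fdopen(fd, "w") as f:
        f.write(text)
```

A second call with the same filename raises FileExistsError instead of overwriting the file, which is the co-operative safety the question asks for.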
I have a question quite similar to this question, where I need the following conditions to be upheld:
If a file is opened for reading, that file may only be opened for reading by any other process/program
If a file is opened for writing, that file may only be opened for reading by any other process/program
The solution posted in the linked question uses a third-party library which adds an arbitrary .LOCK file in the same directory as the file in question. That solution only works with respect to the program in which that library is being used; it doesn't prevent any other process/program from using the file, as they may not be implemented to check for a .LOCK association.
In essence, I wish to replicate this result using only Python's standard library.
BLUF: Need a standard library implementation specific to Windows for exclusive file locking
To give an example of the problem set, assume there is:
1 file on a shared network/drive
2 users on separate processes/programs
Suppose that User 1 is running Program A on the file and at some point the following is executed:
with open(fp, 'rb') as f:
    while True:
        chunk = f.read(10)
        if chunk:
            # do something with chunk
        else:
            break
Thus they are iterating through the file 10 bytes at a time.
Now User 2 runs Program B on the same file a moment later:
with open(fp, 'wb') as f:
    for b in data: # some byte array
        f.write(b)
On Windows, the file in question is immediately truncated and Program A stops iterating (even if it wasn't done) and Program B begins to write to the file. Therefore I need a way to ensure that the file may not be opened in a different mode that would alter its content if previously opened.
I was looking at the msvcrt library, namely the msvcrt.locking() interface. What I have been successful at doing is ensuring that a file opened for reading can be locked for reading, but nobody else can read the file (as I lock the entire file):
>>> f1 = open(fp, 'rb')
>>> f2 = open(fp, 'rb')
>>> msvcrt.locking(f1.fileno(), msvcrt.LK_LOCK, os.stat(fp).st_size)
>>> next(f1)
b"\x00\x05'\n"
>>> next(f2)
PermissionError: [Errno 13] Permission denied
This is an acceptable result, just not the most desired.
In the same scenario, User 1 runs Program A which includes:
with open(fp, 'rb') as f:
    msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, os.stat(fp).st_size)
    # repeat while block
    msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, os.stat(fp).st_size)
When User 2 then runs Program B a moment later, the same result occurs and the file is truncated.
At this point, I would've liked a way to throw an error to User 2 stating the file is opened for reading somewhere else and cannot be written at this time. But if User 3 came along and opened the file for reading, then there would be no problem.
Update:
A potential solution is to change the permissions of a file (with exception catching if the file is already in use):
>>> os.chmod(fp, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
>>> with open(fp, 'wb') as f:
...     # do something
PermissionError: [Errno 13] Permission denied <fp>
This doesn't feel like the best solution (particularly if the users didn't have the permission to even change permissions). Still looking for a proper locking solution but msvcrt doesn't prevent truncating and writing if the file is locked for reading. There still doesn't appear to be a way to generate an exclusive lock with Python's standard library.
For those who are interested in a Windows specific solution:
import os
import ctypes
import msvcrt
import pathlib

# Windows constants for file operations
NULL = 0x00000000
CREATE_ALWAYS = 0x00000002
OPEN_EXISTING = 0x00000003
FILE_SHARE_READ = 0x00000001
FILE_ATTRIBUTE_READONLY = 0x00000001 # strictly for file reading
FILE_ATTRIBUTE_NORMAL = 0x00000080 # strictly for file writing
FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000
GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000

_ACCESS_MASK = os.O_RDONLY | os.O_WRONLY
_ACCESS_MAP = {os.O_RDONLY: GENERIC_READ,
               os.O_WRONLY: GENERIC_WRITE
               }

_CREATE_MASK = os.O_CREAT | os.O_TRUNC
_CREATE_MAP = {NULL: OPEN_EXISTING,
               os.O_CREAT | os.O_TRUNC: CREATE_ALWAYS
               }

win32 = ctypes.WinDLL('kernel32.dll', use_last_error=True)
win32.CreateFileW.restype = ctypes.c_void_p
INVALID_FILE_HANDLE = ctypes.c_void_p(-1).value

def _opener(path: pathlib.Path, flags: int) -> int:
    access_flags = _ACCESS_MAP[flags & _ACCESS_MASK]
    create_flags = _CREATE_MAP[flags & _CREATE_MASK]
    if flags & os.O_WRONLY:
        share_flags = NULL
        attr_flags = FILE_ATTRIBUTE_NORMAL
    else:
        share_flags = FILE_SHARE_READ
        attr_flags = FILE_ATTRIBUTE_READONLY
    attr_flags |= FILE_FLAG_SEQUENTIAL_SCAN
    h = win32.CreateFileW(path, access_flags, share_flags, NULL, create_flags, attr_flags, NULL)
    if h == INVALID_FILE_HANDLE:
        raise ctypes.WinError(ctypes.get_last_error())
    return msvcrt.open_osfhandle(h, flags)

class _FileControlAccessor(pathlib._NormalAccessor):
    open = staticmethod(_opener)

_control_accessor = _FileControlAccessor()

class Path(pathlib.WindowsPath):
    def _init(self) -> None:
        self._closed = False
        self._accessor = _control_accessor

    def _opener(self, name, flags) -> int:
        return self._accessor.open(name, flags)
I'm trying to learn to manipulate files in Python, but I can't get the open function to work. I have made a .txt file called foo that holds the content "hello world!" in my user directory (/home/yonatan) and typed this line into the shell:
open('/home/yonatan/foo.txt')
What I get in return is:
<_io.TextIOWrapper name='/home/yonatan/foo.txt' mode='r' encoding='UTF-8'>
I get what that means, but why don't I get the content?
open() returns a file object.
You then need to use read() to read the whole file
f = open('/home/yonatan/foo.txt', 'r')
contents = f.read()
Or you can use readline() to read just one line
line = f.readline()
and don't forget to close the file at the end
f.close()
An example iterating through the lines of the file (using with, which ensures file.close() gets called at the end of its block):
file_path = '/home/yonatan/foo.txt'
with open(file_path) as file:
    for line in file:
        print(line)
A great resource on I/O and file handling operations.
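The read patterns described above can be sketched as two small helpers. This is an illustrative Python 3 sketch; the function names read_all and read_lines are hypothetical and not part of any library.

```python
def read_all(path):
    """Return the whole file as one string."""
    # The mode defaults to 'r' (read text); the with block closes the file.
    with open(path) as f:
        return f.read()


def read_lines(path):
    """Return the file's lines with the trailing newline stripped."""
    with open(path) as f:
        return [line.rstrip("\n") for line in f]
```

For the file in the question, print(read_all('/home/yonatan/foo.txt')) would display the text stored in foo.txt rather than the file object's repr.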
open() already defaults to read mode ('r'), so the mode is not the problem; what is missing is a call to read() on the returned file object.
Try:
f = open("/home/yonatan/foo.txt", "r")
print(f.read())
In my code, user uploads file which is saved on server and read using the server path. I'm trying to delete the file from that path after I'm done reading it. But it gives me following error instead:
An error occurred while reading file. [WinError 32] The process cannot access the file because it is being used by another process
I'm reading the file using with, and I've tried f.close() and also f.closed, but it's the same error every time.
This is my code:
f = open(filePath)
with f:
    line = f.readline().strip()
    tempLst = line.split(fileSeparator)
    if(len(lstHeader) != len(tempLst)):
        headerErrorMsg = "invalid headers"
        hjsonObj["Line No."] = 1
        hjsonObj["Error Detail"] = headerErrorMsg
        data['lstErrorData'].append(hjsonObj)
        data["status"] = True
        f.closed
        return data
f.closed
after this code I call the remove function:
os.remove(filePath)
Edit: using with open(filePath) as f: and then trying to remove the file gives the same error.
Instead of:
f.closed
You need to say:
f.close()
closed is just a boolean property on the file object to indicate if the file is actually closed.
close() is a method on the file object that actually closes the file.
Side note: attempting a file delete after closing a file handle is not 100% reliable. The file might still be getting scanned by the virus scanner or indexer. Or some other system hook is holding on to the file reference, etc... If the delete fails, wait a second and try again.
Use the code below:
import os
os.startfile('your_file.py')
To delete after completion:
os.remove('your_file.py')
This
import os
path = 'path/to/file'
with open(path) as f:
    for l in f:
        print(l, end='')
os.remove(path)
should work; the with statement automatically closes the file after the nested block of code.
If it fails, the file could be in use by some external process. You can use a retry pattern:
import time
while True:
    try:
        os.remove(path)
        break
    except OSError:
        time.sleep(1)
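The loop above retries forever if the file never becomes free. A bounded variant might look like the following sketch; the helper name remove_with_retry and its attempts and delay parameters are hypothetical, chosen only for illustration.

```python
import os
import time


def remove_with_retry(path, attempts=5, delay=1.0):
    """Try to delete path, retrying while the OS reports it as in use.

    Returns True once the file is gone, False if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            # Already gone: nothing left to do.
            return True
        except PermissionError:
            # On Windows, "file in use" (WinError 32) surfaces as
            # PermissionError; wait and try again unless out of attempts.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

Returning False instead of looping indefinitely lets the caller decide whether to report the failure or schedule the cleanup for later.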
There is probably an application that is opening the file; check and close the application before executing your code:
os.remove(file_path)
Delete files that are not used by another application.
Goal = Open file, encrypt file, write encrypted file.
Trying to use the PyPDF2 module to accomplish this. I have verified that "input" is a file-type object. I have researched this error and it translates to "file not found". I believe that it is linked somehow to the file or file path, but I am unsure how to debug or troubleshoot it. I am getting the following error:
Traceback (most recent call last):
  File "CommissionSecurity.py", line 52, in <module>
    inputStream = PyPDF2.PdfFileReader(input)
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1065, in __init__
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1660, in read
IOError: [Errno 22] Invalid argument
Below is the relevant code. I'm not sure how to correct this issue because I'm not really sure what the issue is. Any guidance is appreciated.
for ID in FileDict:
    if ID in EmailDict :
        path = "C:\\Apps\\CorVu\\DATA\\Reports\\AlliD\\Monthly Commission Reports\\Output\\pdcom1\\"
        #print os.listdir(path)
        file = os.path.join(path + FileDict[ID])
        with open(file, 'rb') as input:
            print type(input)
            inputStream = PyPDF2.PdfFileReader(input)
            output = PyPDF2.PdfFileWriter()
            output = inputStream.encrypt(EmailDict[ID][1])
            with open(file, 'wb') as outputStream:
                output.write(outputStream)
    else : continue
I think your problem might be caused by the fact that you use the same filename to both open and write to the file, opening it twice:
with open(file, 'rb') as input :
with open(file, 'wb') as outputStream :
The w mode will truncate the file, thus the second line truncates the input.
I'm not sure what your intention is, because you can't really read from the beginning of the file and overwrite it at the same time. Even if you try to write to the end of the file, you'll have to position the file pointer somewhere.
So create an extra output file that has a different name; you can always rename that output file to your input file after both files are closed, thus overwriting your input file.
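The write-to-another-file-then-rename approach can be sketched generically as follows. This is a sketch under stated assumptions: rewrite_in_place and its transform callback are hypothetical names, and transform stands in for whatever processing (such as the PDF encryption step) produces the new bytes.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Replace the contents of path with transform(old_bytes)."""
    # Read the original completely and close it before writing anything.
    with open(path, "rb") as src:
        data = src.read()
    # Create the temporary file next to the original so that os.replace()
    # stays on the same filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst:
            dst.write(transform(data))
        # Both files are closed at this point, so the rename can proceed.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.remove(tmp_path)
        raise
```

Because the original is never opened in 'wb' mode, a failure during processing leaves the input file untouched.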
Or you could first read the complete file into memory, then write to it:
with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)
    output = PyPDF2.PdfFileWriter()
    output = input.encrypt(EmailDict[ID][1])
with open(file, 'wb') as outputStream:
    output.write(outputStream)
Notes:
you assign inputStream, but never use it
you assign PdfFileWriter() to output, and then assign something else to output in the next line. Hence, you never used the result from the first output = line.
Please check carefully what you're doing, because it feels there are numerous other problems with your code.
Alternatively, here are some other tips that may help:
The documentation suggests that you can also use the filename as first argument to PdfFileReader:
stream – A File object or an object that supports the standard read
and seek methods similar to a File object. Could also be a string
representing a path to a PDF file.
So try:
inputStream = PyPDF2.PdfFileReader(file)
You can also try to set the strict argument to False:
strict (bool) – Determines whether user should be warned of all
problems and also causes some correctable problems to be fatal.
Defaults to True.
For example:
inputStream = PyPDF2.PdfFileReader(file, strict=False)
Using open(file, 'rb') was causing the issue because PdfFileReader() does that automagically. I just removed the with statement and that corrected the problem.
with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)
This error was raised because the PDF file was empty.
My PDF file was empty, and that is why the error was raised. So first I filled my PDF file with some data and then started reading it using PyPDF2.PdfFileReader,
and it solved my problem!
Late, but you may be opening an invalid PDF file, or an empty file that's named x.pdf which you assume is a PDF file.
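A quick sanity check before handing a file to a PDF reader might look like this sketch. The helper name looks_like_pdf is hypothetical, and the check is deliberately strict: it assumes the "%PDF-" signature sits at the very start of the file, whereas some readers tolerate a few leading bytes.

```python
import os


def looks_like_pdf(path):
    """Return True if path is non-empty and starts with the PDF signature."""
    if os.path.getsize(path) == 0:
        # An empty file can never be parsed as a PDF.
        return False
    with open(path, "rb") as f:
        # PDF files begin with the bytes "%PDF-" followed by a version.
        return f.read(5) == b"%PDF-"
```

Calling this first separates "the file is empty or not a PDF" from genuine reader errors.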
I am getting an interesting error while trying to use Unpickler.load(), here is the source code:
open(target, 'a').close()
scores = {};
with open(target, "rb") as file:
    unpickler = pickle.Unpickler(file);
    scores = unpickler.load();
    if not isinstance(scores, dict):
        scores = {};
Here is the traceback:
Traceback (most recent call last):
File "G:\python\pendu\user_test.py", line 3, in <module>:
save_user_points("Magix", 30);
File "G:\python\pendu\user.py", line 22, in save_user_points:
scores = unpickler.load();
EOFError: Ran out of input
The file I am trying to read is empty.
How can I avoid getting this error, and get an empty variable instead?
Most of the answers here have dealt with how to manage EOFError exceptions, which is really handy if you're unsure about whether the pickled object is empty or not.
However, if you're surprised that the pickle file is empty, it could be because you opened the filename through 'wb' or some other mode that could have over-written the file.
for example:
filename = 'cd.pkl'
with open(filename, 'wb') as f:
    classification_dict = pickle.load(f)
This will over-write the pickled file. You might have done this by mistake before using:
...
open(filename, 'rb') as f:
And then got the EOFError because the previous block of code over-wrote the cd.pkl file.
When working in Jupyter, or in the console (Spyder) I usually write a wrapper over the reading/writing code, and call the wrapper subsequently. This avoids common read-write mistakes, and saves a bit of time if you're going to be reading the same file multiple times through your travails
I would check that the file is not empty first:
import os
import pickle
scores = {} # scores is an empty dict already
if os.path.getsize(target) > 0:
    with open(target, "rb") as f:
        unpickler = pickle.Unpickler(f)
        # if file is not empty scores will be equal
        # to the value unpickled
        scores = unpickler.load()
Also open(target, 'a').close() is doing nothing in your code and you don't need to use ;.
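The size check can be wrapped in a small function that also covers a missing or truncated file. This is a minimal sketch; the name load_scores is hypothetical, and treating a non-dict payload as empty mirrors the isinstance check in the question.

```python
import os
import pickle


def load_scores(target):
    """Return the dict pickled in target, or {} if it is missing or empty."""
    if not os.path.exists(target) or os.path.getsize(target) == 0:
        return {}
    try:
        with open(target, "rb") as f:
            scores = pickle.load(f)
    except EOFError:
        # A truncated pickle is treated like an empty file.
        return {}
    # Fall back to an empty dict if the file held some other object.
    return scores if isinstance(scores, dict) else {}
```

With this helper the caller always receives a dict, so no separate open(target, 'a').close() call is needed to create the file beforehand.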
It is very likely that the pickled file is empty.
It is surprisingly easy to overwrite a pickle file if you're copying and pasting code.
For example the following writes a pickle file:
pickle.dump(df,open('df.p','wb'))
And if you copied this code to reopen it, but forgot to change 'wb' to 'rb' then you would overwrite the file:
df=pickle.load(open('df.p','wb'))
The correct syntax is
df=pickle.load(open('df.p','rb'))
As you see, that's actually a natural error ..
A typical construct for reading from an Unpickler object would be like this ..
try:
    data = unpickler.load()
except EOFError:
    data = list() # or whatever you want
EOFError is raised simply because the unpickler was reading an empty file; the name just means "End of File".
You can catch that exception and return whatever you want from there.
open(target, 'a').close()
scores = {};
try:
    with open(target, "rb") as file:
        unpickler = pickle.Unpickler(file);
        scores = unpickler.load();
        if not isinstance(scores, dict):
            scores = {};
except EOFError:
    return {}
if path.exists(Score_file):
    try :
        with open(Score_file , "rb") as prev_Scr:
            return Unpickler(prev_Scr).load()
    except EOFError :
        return dict()
Had the same issue. It turns out when I was writing to my pickle file I had not used the file.close(). Inserted that line in and the error was no more.
I have encountered this error many times, and it always occurred because I didn't close the file after writing to it. If the file is not closed, the content can stay in the buffer and the file stays empty.
To save the content to the file, either the file must be closed or the file object must go out of scope.
That's why loading gives the "Ran out of input" error: the file is empty. So you have two options:
file_object.close()
file_object.flush(): if you don't want to close your file partway through the program, you can use flush(), which forces the buffered content out to the file.
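One way to make the close happen automatically is to do the write inside a with block, as in this sketch. The function names save_scores and load_saved_scores are hypothetical.

```python
import pickle


def save_scores(scores, target):
    """Pickle scores to target; leaving the with block flushes and closes."""
    with open(target, "wb") as f:
        pickle.dump(scores, f)


def load_saved_scores(target):
    """Read back what save_scores() wrote."""
    with open(target, "rb") as f:
        return pickle.load(f)
```

Because the file is closed as soon as save_scores() returns, a later load in the same program sees the full pickle rather than an empty file.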
This error comes when your pickle file is empty (0 Bytes). You need to check the size of your pickle file first. This was the scenario in my case. Hope this helps!
Note that opening the file in 'a' mode, or any other mode containing 'a', also causes this error: append mode positions the file pointer at the end of the file, so there is nothing left to read.
pointer = open('makeaafile.txt', 'ab+')
tes = pickle.load(pointer, encoding='utf-8')
temp_model = os.path.join(models_dir, train_type + '_' + part + '_' + str(pc))
# print(type(temp_model)) # <class 'str'>
filehandler = open(temp_model, "rb")
# print(type(filehandler)) # <class '_io.BufferedReader'>
try:
    pdm_temp = pickle.load(filehandler)
except UnicodeDecodeError:
    # rewind first: the failed load has already consumed part of the file
    filehandler.seek(0)
    pdm_temp = pickle.load(filehandler, fix_imports=True, encoding="latin1")
from os.path import getsize as size
from pickle import *
if size(target)>0:
    with open(target,'rb') as f:
        scores={i:j for i,j in enumerate(load(f))}
else: scores={}
Line 1.
We import the function getsize from the os.path submodule and rename it with as for brevity. The point here is that we import only the single function we need, not the whole library.
Line 2.
Same idea, but when we don't know at the outset which names from a module we will use, we can import all of them with *.
Line 3.
A conditional statement: if the size of your file is greater than 0, the file is not empty. target is a variable that should be defined earlier.
Just an example: target=(r'd:\dir1\dir.2..\YourDataFile.bin')
Line 4.
with open(target) as file: is a construct for opening any file; you don't need to call file.close() afterwards. It helps to avoid some typical errors such as "Ran out of input" or permission problems.
'rb' mode means "read binary": you can only read (load) the data from your binary file, not modify or rewrite it.
Line 5.
A dictionary comprehension that builds a dict from the loaded data.
Line 6. If your data file is empty, no error is raised; scores is just an empty dictionary.