I'm trying to create a file that is only user-readable and -writable (0600).
Is the only way to do so by using os.open() as follows?
import os
fd = os.open('/path/to/file', os.O_WRONLY, 0o600)
myFileObject = os.fdopen(fd)
myFileObject.write(...)
myFileObject.close()
Ideally, I'd like to be able to use the with keyword so I can close the object automatically. Is there a better way to do what I'm doing above?
What's the problem? file.close() will close the file even though it was open with os.open().
with os.fdopen(os.open('/path/to/file', os.O_WRONLY | os.O_CREAT, 0o600), 'w') as handle:
handle.write(...)
This answer addresses multiple concerns with the answer by vartec, especially the umask concern.
import os
import stat
# Define file params
fname = '/tmp/myfile'
flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL # Refer to "man 2 open".
mode = stat.S_IRUSR | stat.S_IWUSR # This is 0o600.
umask = 0o777 ^ mode # Prevents always downgrading umask to 0.
# For security, remove file with potentially elevated mode
try:
os.remove(fname)
except OSError:
pass
# Open file descriptor
umask_original = os.umask(umask)
try:
fdesc = os.open(fname, flags, mode)
finally:
os.umask(umask_original)
# Open file handle and write to file
with os.fdopen(fdesc, 'w') as fout:
fout.write('something\n')
If the desired mode is 0600, it can more clearly be specified as the octal number 0o600. Even better, just use the stat module.
Even though the old file is first deleted, a race condition is still possible. Including os.O_EXCL with os.O_CREAT in the flags will prevent the file from being created if it exists due to a race condition. This is a necessary secondary security measure to prevent opening a file that may already exist with a potentially elevated mode. In Python 3, FileExistsError with [Errno 17] is raised if the file exists.
Failing to first set the umask to 0 or to 0o777 ^ mode can lead to an incorrect mode (permission) being set by os.open. This is because the default umask is usually not 0, and it will be applied to the specified mode. For example, if my original umask is 2 i.e. 0o002, and my specified mode is 0o222, if I fail to first set the umask, the resulting file can instead have a mode of 0o220, which is not what I wanted. Per man 2 open, the mode of the created file is mode & ~umask.
The umask is restored to its original value as soon as possible. This getting and setting is not thread safe, and a threading.Lock must be used in a multithreaded application.
For more info about umask, refer to this thread.
update
Folks, while I thank you for the upvotes here, I myself have to argue against my originally proposed solution below. The reason is doing things this way, there will be an amount of time, however small, where the file does exist, and does not have the proper permissions in place - this leave open wide ways of attack, and even buggy behavior.
Of course creating the file with the correct permissions in the first place is the way to go - against the correctness of that, using Python's with is just some candy.
So please, take this answer as an example of "what not to do";
original post
You can use os.chmod instead:
>>> import os
>>> name = "eek.txt"
>>> with open(name, "wt") as myfile:
... os.chmod(name, 0o600)
... myfile.write("eeek")
...
>>> os.system("ls -lh " + name)
-rw------- 1 gwidion gwidion 4 2011-04-11 13:47 eek.txt
0
>>>
(Note that the way to use octals in Python is by being explicit - by prefixing it with "0o" like in "0o600". In Python 2.x it would work writing just 0600 - but that is both misleading and deprecated.)
However, if your security is critical, you probably should resort to creating it with os.open, as you do and use os.fdopen to retrieve a Python File object from the file descriptor returned by os.open.
The question is about setting the permissions to be sure the file will not be world-readable (only read/write for the current user).
Unfortunately, on its own, the code:
fd = os.open('/path/to/file', os.O_WRONLY, 0o600)
does not guarantee that permissions will be denied to the world. It does try to set r/w for the current user (provided that umask allows it), that's it!
On two very different test systems, this code creates a file with -rw-r--r-- with my default umask, and -rw-rw-rw- with umask(0) which is definitely not what is desired (and poses a serious security risk).
If you want to make sure that the file has no bits set for group and world, you have to umask these bits first (remember - umask is denial of permissions):
os.umask(0o177)
Besides, to be 100% sure that the file doesn't already exist with different permissions, you have to chmod/delete it first (delete is safer, since you may not have write permissions in the target directory - and if you have security concerns, you don't want to write some file where you're not allowed to!), otherwise you may have a security issue if a hacker created the file before you with world-wide r/w permissions in anticipation of your move. In that case, os.open will open the file without setting its permissions at all and you're left with a world r/w secret file...
So you need:
import os
if os.path.isfile(file):
os.remove(file)
original_umask = os.umask(0o177) # 0o777 ^ 0o600
try:
handle = os.fdopen(os.open(file, os.O_WRONLY | os.O_CREAT, 0o600), 'w')
finally:
os.umask(original_umask)
This is the safe way to ensure the creation of a -rw------- file regardless of your environment and configuration. And of course you can catch and deal with the IOErrors as needed. If you don't have write permissions in the target directory, you shouldn't be able to create the file, and if it already existed the delete will fail.
I would like to suggest a modification of A-B-B's excellent answer that separates the concerns a bit more clearly. The main advantage would be that you can handle exceptions that occur during opening the file descriptor separately from other problems during actual writing to the file.
The outer try ... finally block takes care of handling the permission and umask issues while opening the file descriptor. The inner with block deals with possible exceptions while working with the Python file object (as this was the OP's wish):
try:
oldumask = os.umask(0)
fdesc = os.open(outfname, os.O_WRONLY | os.O_CREAT, 0o600)
with os.fdopen(fdesc, "w") as outf:
# ...write to outf, closes on success or on exceptions automatically...
except IOError, ... :
# ...handle possible os.open() errors here...
finally:
os.umask(oldumask)
If you want to append to the file instead of writing, then the file descriptor should be opened like this:
fdesc = os.open(outfname, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
and the file object like this:
with os.fdopen(fdesc, "a") as outf:
Of course all other usual combinations are possible.
I'd do differently.
from contextlib import contextmanager
#contextmanager
def umask_helper(desired_umask):
""" A little helper to safely set and restore umask(2). """
try:
prev_umask = os.umask(desired_umask)
yield
finally:
os.umask(prev_umask)
# ---------------------------------- […] ---------------------------------- #
[…]
with umask_helper(0o077):
os.mkdir(os.path.dirname(MY_FILE))
with open(MY_FILE, 'wt') as f:
[…]
File-manipulating code tends to be already try-except-heavy; making it even worse with os.umask's finally isn't going to bring your eyes any more joy. Meanwhile, rolling your own context manager is that easy, and results in somewhat neater indentation nesting.
Related
What is the most elegant way to solve this:
open a file for reading, but only if it is not already opened for writing
open a file for writing, but only if it is not already opened for reading or writing
The built-in functions work like this
>>> path = r"c:\scr.txt"
>>> file1 = open(path, "w")
>>> print file1
<open file 'c:\scr.txt', mode 'w' at 0x019F88D8>
>>> file2 = open(path, "w")
>>> print file2
<open file 'c:\scr.txt', mode 'w' at 0x02332188>
>>> file1.write("111")
>>> file2.write("222")
>>> file1.close()
scr.txt now contains '111'.
>>> file2.close()
scr.txt was overwritten and now contains '222' (on Windows, Python 2.4).
The solution should work inside the same process (like in the example above) as well as when another process has opened the file.
It is preferred, if a crashing program will not keep the lock open.
I don't think there is a fully crossplatform way. On unix, the fcntl module will do this for you. However on windows (which I assume you are by the paths), you'll need to use the win32file module.
Fortunately, there is a portable implementation (portalocker) using the platform appropriate method at the python cookbook.
To use it, open the file, and then call:
portalocker.lock(file, flags)
where flags are portalocker.LOCK_EX for exclusive write access, or LOCK_SH for shared, read access.
The solution should work inside the same process (like in the example above) as well as when another process has opened the file.
If by 'another process' you mean 'whatever process' (i.e. not your program), in Linux there's no way to accomplish this relying only on system calls (fcntl & friends). What you want is mandatory locking, and the Linux way to obtain it is a bit more involved:
Remount the partition that contains your file with the mand option:
# mount -o remount,mand /dev/hdXY
Set the sgid flag for your file:
# chmod g-x,g+s yourfile
In your Python code, obtain an exclusive lock on that file:
fcntl.flock(fd, fcntl.LOCK_EX)
Now even cat will not be able to read the file until you release the lock.
EDIT: I solved it myself! By using directory existence & age as a locking mechanism! Locking by file is safe only on Windows (because Linux silently overwrites), but locking by directory works perfectly both on Linux and Windows. See my GIT where I created an easy to use class 'lockbydir.DLock' for that:
https://github.com/drandreaskrueger/lockbydir
At the bottom of the readme, you find 3 GITplayers where you can see the code examples execute live in your browser! Quite cool, isn't it? :-)
Thanks for your attention
This was my original question:
I would like to answer to parity3 (https://meta.stackoverflow.com/users/1454536/parity3) but I can neither comment directly ('You must have 50 reputation to comment'), nor do I see any way to contact him/her directly. What do you suggest to me, to get through to him?
My question:
I have implemented something similiar to what parity3 suggested here as an answer: https://stackoverflow.com/a/21444311/3693375 ("Assuming your Python interpreter, and the ...")
And it works brilliantly - on Windows. (I am using it to implement a locking mechanism that works across independently started processes. https://github.com/drandreaskrueger/lockbyfile )
But other than parity3 says, it does NOT work the same on Linux:
os.rename(src, dst)
Rename the file or directory src to dst. ... On Unix, if dst exists
and is a file,
it will be replaced silently if the user has permission.
The operation may fail on some Unix flavors if src and dst
are on different filesystems. If successful, the renaming will
be an atomic operation (this is a POSIX requirement).
On Windows, if dst already exists, OSError will be raised
(https://docs.python.org/2/library/os.html#os.rename)
The silent replacing is the problem. On Linux.
The "if dst already exists, OSError will be raised" is great for my purposes. But only on Windows, sadly.
I guess parity3's example still works most of the time, because of his if condition
if not os.path.exists(lock_filename):
try:
os.rename(tmp_filename,lock_filename)
But then the whole thing is not atomic anymore.
Because the if condition might be true in two parallel processes, and then both will rename, but only one will win the renaming race. And no exception raised (in Linux).
Any suggestions? Thanks!
P.S.: I know this is not the proper way, but I am lacking an alternative. PLEASE don't punish me with lowering my reputation. I looked around a lot, to solve this myself. How to PM users in here? And meh why can't I?
Here's a start on the win32 half of a portable implementation, that does not need a seperate locking mechanism.
Requires the Python for Windows Extensions to get down to the win32 api, but that's pretty much mandatory for python on windows already, and can alternatively be done with ctypes. The code could be adapted to expose more functionality if it's needed (such as allowing FILE_SHARE_READ rather than no sharing at all). See also the MSDN documentation for the CreateFile and WriteFile system calls, and the article on Creating and Opening Files.
As has been mentioned, you can use the standard fcntl module to implement the unix half of this, if required.
import winerror, pywintypes, win32file
class LockError(StandardError):
pass
class WriteLockedFile(object):
"""
Using win32 api to achieve something similar to file(path, 'wb')
Could be adapted to handle other modes as well.
"""
def __init__(self, path):
try:
self._handle = win32file.CreateFile(
path,
win32file.GENERIC_WRITE,
0,
None,
win32file.OPEN_ALWAYS,
win32file.FILE_ATTRIBUTE_NORMAL,
None)
except pywintypes.error, e:
if e[0] == winerror.ERROR_SHARING_VIOLATION:
raise LockError(e[2])
raise
def close(self):
self._handle.close()
def write(self, str):
win32file.WriteFile(self._handle, str)
Here's how your example from above behaves:
>>> path = "C:\\scr.txt"
>>> file1 = WriteLockedFile(path)
>>> file2 = WriteLockedFile(path) #doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
LockError: ...
>>> file1.write("111")
>>> file1.close()
>>> print file(path).read()
111
I prefer to use filelock, a cross-platform Python library which barely requires any additional code. Here's an example of how to use it:
from filelock import FileLock
lockfile = r"c:\scr.txt"
lock = FileLock(lockfile + ".lock")
with lock:
file = open(path, "w")
file.write("111")
file.close()
Any code within the with lock: block is thread-safe, meaning that it will be finished before another process has access to the file.
Assuming your Python interpreter, and the underlying os and filesystem treat os.rename as an atomic operation and it will error when the destination exists, the following method is free of race conditions. I'm using this in production on a linux machine. Requires no third party libs and is not os dependent, and aside from an extra file create, the performance hit is acceptable for many use cases. You can easily apply python's function decorator pattern or a 'with_statement' contextmanager here to abstract out the mess.
You'll need to make sure that lock_filename does not exist before a new process/task begins.
import os,time
def get_tmp_file():
filename='tmp_%s_%s'%(os.getpid(),time.time())
open(filename).close()
return filename
def do_exclusive_work():
print 'exclusive work being done...'
num_tries=10
wait_time=10
lock_filename='filename.lock'
acquired=False
for try_num in xrange(num_tries):
tmp_filename=get_tmp_file()
if not os.path.exists(lock_filename):
try:
os.rename(tmp_filename,lock_filename)
acquired=True
except (OSError,ValueError,IOError), e:
pass
if acquired:
try:
do_exclusive_work()
finally:
os.remove(lock_filename)
break
os.remove(tmp_filename)
time.sleep(wait_time)
assert acquired, 'maximum tries reached, failed to acquire lock file'
EDIT
It has come to light that os.rename silently overwrites the destination on a non-windows OS. Thanks for pointing this out # akrueger!
Here is a workaround, gathered from here:
Instead of using os.rename you can use:
try:
if os.name != 'nt': # non-windows needs a create-exclusive operation
fd = os.open(lock_filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
os.close(fd)
# non-windows os.rename will overwrite lock_filename silently.
# We leave this call in here just so the tmp file is deleted but it could be refactored so the tmp file is never even generated for a non-windows OS
os.rename(tmp_filename,lock_filename)
acquired=True
except (OSError,ValueError,IOError), e:
if os.name != 'nt' and not 'File exists' in str(e): raise
# akrueger You're probably just fine with your directory based solution, just giving you an alternate method.
To make you safe when opening files within one application, you could try something like this:
import time
class ExclusiveFile(file):
openFiles = {}
fileLocks = []
class FileNotExclusiveException(Exception):
pass
def __init__(self, *args):
sMode = 'r'
sFileName = args[0]
try:
sMode = args[1]
except:
pass
while sFileName in ExclusiveFile.fileLocks:
time.sleep(1)
ExclusiveFile.fileLocks.append(sFileName)
if not sFileName in ExclusiveFile.openFiles.keys() or (ExclusiveFile.openFiles[sFileName] == 'r' and sMode == 'r'):
ExclusiveFile.openFiles[sFileName] = sMode
try:
file.__init__(self, sFileName, sMode)
finally:
ExclusiveFile.fileLocks.remove(sFileName)
else:
ExclusiveFile.fileLocks.remove(sFileName)
raise self.FileNotExclusiveException(sFileName)
def close(self):
del ExclusiveFile.openFiles[self.name]
file.close(self)
That way you subclass the file class. Now just do:
>>> f = ExclusiveFile('/tmp/a.txt', 'r')
>>> f
<open file '/tmp/a.txt', mode 'r' at 0xb7d7cc8c>
>>> f1 = ExclusiveFile('/tmp/a.txt', 'r')
>>> f1
<open file '/tmp/a.txt', mode 'r' at 0xb7d7c814>
>>> f2 = ExclusiveFile('/tmp/a.txt', 'w') # can't open it for writing now
exclfile.FileNotExclusiveException: /tmp/a.txt
If you open it first with 'w' mode, it won't allow anymore opens, even in read mode, just as you wanted...
What is the most elegant way to solve this:
open a file for reading, but only if it is not already opened for writing
open a file for writing, but only if it is not already opened for reading or writing
The built-in functions work like this
>>> path = r"c:\scr.txt"
>>> file1 = open(path, "w")
>>> print file1
<open file 'c:\scr.txt', mode 'w' at 0x019F88D8>
>>> file2 = open(path, "w")
>>> print file2
<open file 'c:\scr.txt', mode 'w' at 0x02332188>
>>> file1.write("111")
>>> file2.write("222")
>>> file1.close()
scr.txt now contains '111'.
>>> file2.close()
scr.txt was overwritten and now contains '222' (on Windows, Python 2.4).
The solution should work inside the same process (like in the example above) as well as when another process has opened the file.
It is preferred, if a crashing program will not keep the lock open.
I don't think there is a fully crossplatform way. On unix, the fcntl module will do this for you. However on windows (which I assume you are by the paths), you'll need to use the win32file module.
Fortunately, there is a portable implementation (portalocker) using the platform appropriate method at the python cookbook.
To use it, open the file, and then call:
portalocker.lock(file, flags)
where flags are portalocker.LOCK_EX for exclusive write access, or LOCK_SH for shared, read access.
The solution should work inside the same process (like in the example above) as well as when another process has opened the file.
If by 'another process' you mean 'whatever process' (i.e. not your program), in Linux there's no way to accomplish this relying only on system calls (fcntl & friends). What you want is mandatory locking, and the Linux way to obtain it is a bit more involved:
Remount the partition that contains your file with the mand option:
# mount -o remount,mand /dev/hdXY
Set the sgid flag for your file:
# chmod g-x,g+s yourfile
In your Python code, obtain an exclusive lock on that file:
fcntl.flock(fd, fcntl.LOCK_EX)
Now even cat will not be able to read the file until you release the lock.
EDIT: I solved it myself! By using directory existence & age as a locking mechanism! Locking by file is safe only on Windows (because Linux silently overwrites), but locking by directory works perfectly both on Linux and Windows. See my GIT where I created an easy to use class 'lockbydir.DLock' for that:
https://github.com/drandreaskrueger/lockbydir
At the bottom of the readme, you find 3 GITplayers where you can see the code examples execute live in your browser! Quite cool, isn't it? :-)
Thanks for your attention
This was my original question:
I would like to answer to parity3 (https://meta.stackoverflow.com/users/1454536/parity3) but I can neither comment directly ('You must have 50 reputation to comment'), nor do I see any way to contact him/her directly. What do you suggest to me, to get through to him?
My question:
I have implemented something similiar to what parity3 suggested here as an answer: https://stackoverflow.com/a/21444311/3693375 ("Assuming your Python interpreter, and the ...")
And it works brilliantly - on Windows. (I am using it to implement a locking mechanism that works across independently started processes. https://github.com/drandreaskrueger/lockbyfile )
But other than parity3 says, it does NOT work the same on Linux:
os.rename(src, dst)
Rename the file or directory src to dst. ... On Unix, if dst exists
and is a file,
it will be replaced silently if the user has permission.
The operation may fail on some Unix flavors if src and dst
are on different filesystems. If successful, the renaming will
be an atomic operation (this is a POSIX requirement).
On Windows, if dst already exists, OSError will be raised
(https://docs.python.org/2/library/os.html#os.rename)
The silent replacing is the problem. On Linux.
The "if dst already exists, OSError will be raised" is great for my purposes. But only on Windows, sadly.
I guess parity3's example still works most of the time, because of his if condition
if not os.path.exists(lock_filename):
try:
os.rename(tmp_filename,lock_filename)
But then the whole thing is not atomic anymore.
Because the if condition might be true in two parallel processes, and then both will rename, but only one will win the renaming race. And no exception raised (in Linux).
Any suggestions? Thanks!
P.S.: I know this is not the proper way, but I am lacking an alternative. PLEASE don't punish me with lowering my reputation. I looked around a lot, to solve this myself. How to PM users in here? And meh why can't I?
Here's a start on the win32 half of a portable implementation, that does not need a seperate locking mechanism.
Requires the Python for Windows Extensions to get down to the win32 api, but that's pretty much mandatory for python on windows already, and can alternatively be done with ctypes. The code could be adapted to expose more functionality if it's needed (such as allowing FILE_SHARE_READ rather than no sharing at all). See also the MSDN documentation for the CreateFile and WriteFile system calls, and the article on Creating and Opening Files.
As has been mentioned, you can use the standard fcntl module to implement the unix half of this, if required.
import winerror, pywintypes, win32file
class LockError(StandardError):
pass
class WriteLockedFile(object):
"""
Using win32 api to achieve something similar to file(path, 'wb')
Could be adapted to handle other modes as well.
"""
def __init__(self, path):
try:
self._handle = win32file.CreateFile(
path,
win32file.GENERIC_WRITE,
0,
None,
win32file.OPEN_ALWAYS,
win32file.FILE_ATTRIBUTE_NORMAL,
None)
except pywintypes.error, e:
if e[0] == winerror.ERROR_SHARING_VIOLATION:
raise LockError(e[2])
raise
def close(self):
self._handle.close()
def write(self, str):
win32file.WriteFile(self._handle, str)
Here's how your example from above behaves:
>>> path = "C:\\scr.txt"
>>> file1 = WriteLockedFile(path)
>>> file2 = WriteLockedFile(path) #doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
LockError: ...
>>> file1.write("111")
>>> file1.close()
>>> print file(path).read()
111
I prefer to use filelock, a cross-platform Python library which barely requires any additional code. Here's an example of how to use it:
from filelock import FileLock
lockfile = r"c:\scr.txt"
lock = FileLock(lockfile + ".lock")
with lock:
file = open(path, "w")
file.write("111")
file.close()
Any code within the with lock: block is thread-safe, meaning that it will be finished before another process has access to the file.
Assuming your Python interpreter, and the underlying os and filesystem treat os.rename as an atomic operation and it will error when the destination exists, the following method is free of race conditions. I'm using this in production on a linux machine. Requires no third party libs and is not os dependent, and aside from an extra file create, the performance hit is acceptable for many use cases. You can easily apply python's function decorator pattern or a 'with_statement' contextmanager here to abstract out the mess.
You'll need to make sure that lock_filename does not exist before a new process/task begins.
import os,time
def get_tmp_file():
filename='tmp_%s_%s'%(os.getpid(),time.time())
open(filename).close()
return filename
def do_exclusive_work():
print 'exclusive work being done...'
num_tries=10
wait_time=10
lock_filename='filename.lock'
acquired=False
for try_num in xrange(num_tries):
tmp_filename=get_tmp_file()
if not os.path.exists(lock_filename):
try:
os.rename(tmp_filename,lock_filename)
acquired=True
except (OSError,ValueError,IOError), e:
pass
if acquired:
try:
do_exclusive_work()
finally:
os.remove(lock_filename)
break
os.remove(tmp_filename)
time.sleep(wait_time)
assert acquired, 'maximum tries reached, failed to acquire lock file'
EDIT
It has come to light that os.rename silently overwrites the destination on a non-windows OS. Thanks for pointing this out # akrueger!
Here is a workaround, gathered from here:
Instead of using os.rename you can use:
try:
if os.name != 'nt': # non-windows needs a create-exclusive operation
fd = os.open(lock_filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
os.close(fd)
# non-windows os.rename will overwrite lock_filename silently.
# We leave this call in here just so the tmp file is deleted but it could be refactored so the tmp file is never even generated for a non-windows OS
os.rename(tmp_filename,lock_filename)
acquired=True
except (OSError,ValueError,IOError), e:
if os.name != 'nt' and not 'File exists' in str(e): raise
# akrueger You're probably just fine with your directory based solution, just giving you an alternate method.
To make you safe when opening files within one application, you could try something like this:
import time
class ExclusiveFile(file):
openFiles = {}
fileLocks = []
class FileNotExclusiveException(Exception):
pass
def __init__(self, *args):
sMode = 'r'
sFileName = args[0]
try:
sMode = args[1]
except:
pass
while sFileName in ExclusiveFile.fileLocks:
time.sleep(1)
ExclusiveFile.fileLocks.append(sFileName)
if not sFileName in ExclusiveFile.openFiles.keys() or (ExclusiveFile.openFiles[sFileName] == 'r' and sMode == 'r'):
ExclusiveFile.openFiles[sFileName] = sMode
try:
file.__init__(self, sFileName, sMode)
finally:
ExclusiveFile.fileLocks.remove(sFileName)
else:
ExclusiveFile.fileLocks.remove(sFileName)
raise self.FileNotExclusiveException(sFileName)
def close(self):
del ExclusiveFile.openFiles[self.name]
file.close(self)
That way you subclass the file class. Now just do:
>>> f = ExclusiveFile('/tmp/a.txt', 'r')
>>> f
<open file '/tmp/a.txt', mode 'r' at 0xb7d7cc8c>
>>> f1 = ExclusiveFile('/tmp/a.txt', 'r')
>>> f1
<open file '/tmp/a.txt', mode 'r' at 0xb7d7c814>
>>> f2 = ExclusiveFile('/tmp/a.txt', 'w') # can't open it for writing now
exclfile.FileNotExclusiveException: /tmp/a.txt
If you open it first with 'w' mode, it won't allow anymore opens, even in read mode, just as you wanted...
I wish to write to a file based on whether that file already exists or not, only writing if it doesn't already exist (in practice, I wish to keep trying files until I find one that doesn't exist).
The following code shows a way in which a potentially attacker could insert a symlink, as suggested in this post in between a test for the file and the file being written. If the code is run with high enough permissions, this could overwrite an arbitrary file.
Is there a way to solve this problem?
import os
import errno
file_to_be_attacked = 'important_file'
with open(file_to_be_attacked, 'w') as f:
f.write('Some important content!\n')
test_file = 'testfile'
try:
with open(test_file) as f: pass
except IOError, e:
# Symlink created here
os.symlink(file_to_be_attacked, test_file)
if e.errno != errno.ENOENT:
raise
else:
with open(test_file, 'w') as f:
f.write('Hello, kthxbye!\n')
Edit: See also Dave Jones' answer: from Python 3.3, you can use the x flag to open() to provide this function.
Original answer below
Yes, but not using Python's standard open() call. You'll need to use os.open() instead, which allows you to specify flags to the underlying C code.
In particular, you want to use O_CREAT | O_EXCL. From the man page for open(2) under O_EXCL on my Unix system:
Ensure that this call creates the file: if this flag is specified in conjunction with O_CREAT, and pathname already exists, then open() will fail. The behavior of O_EXCL is undefined if O_CREAT is not specified.
When these two flags are specified, symbolic links are not followed: if pathname is a symbolic link, then open() fails regardless of where the symbolic link points to.
O_EXCL is only supported on NFS when using NFSv3 or later on kernel 2.6 or later. In environments where NFS O_EXCL support is not provided, programs that rely on it for performing locking tasks will contain a race condition.
So it's not perfect, but AFAIK it's the closest you can get to avoiding this race condition.
Edit: the other rules of using os.open() instead of open() still apply. In particular, if you want use the returned file descriptor for reading or writing, you'll need one of the O_RDONLY, O_WRONLY or O_RDWR flags as well.
All the O_* flags are in Python's os module, so you'll need to import os and use os.O_CREAT etc.
Example:
import os
import errno
flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
try:
file_handle = os.open('filename', flags)
except OSError as e:
if e.errno == errno.EEXIST: # Failed as the file already exists.
pass
else: # Something unexpected went wrong so reraise the exception.
raise
else: # No exception, so the file must have been created successfully.
with os.fdopen(file_handle, 'w') as file_obj:
# Using `os.fdopen` converts the handle to an object that acts like a
# regular Python file object, and the `with` context manager means the
# file will be automatically closed when we're done with it.
file_obj.write("Look, ma, I'm writing to a new file!")
For reference, Python 3.3 implements a new 'x' mode in the open() function to cover this use-case (create only, fail if file exists). Note that the 'x' mode is specified on its own. Using 'wx' results in a ValueError as the 'w' is redundant (the only thing you can do if the call succeeds is write to the file anyway; it can't have existed if the call succeeds):
>>> f1 = open('new_binary_file', 'xb')
>>> f2 = open('new_text_file', 'x')
For Python 3.2 and below (including Python 2.x) please refer to the accepted answer.
This code will easily create a file if one does not exists.
import os
if not os.path.exists('file'):
open('file', 'w').close()
I am writing a Python script in which I write output to a temporary file and then move that file to its final destination once it is finished and closed. When the script finishes, I want the output file to have the same permissions as if it had been created normally through open(filename,"w"). As it is, the file will have the restrictive set of permissions used by the tempfile module for temp files.
Is there a way for me to figure out what the "default" file permissions for the output file would be if I created it in place, so that I can apply them to the temp file before moving it?
For the record, I had a similar issue, here is the code I have used:
import os
from tempfile import NamedTemporaryFile
def UmaskNamedTemporaryFile(*args, **kargs):
fdesc = NamedTemporaryFile(*args, **kargs)
# we need to set umask to get its current value. As noted
# by Florian Brucker (comment), this is a potential security
# issue, as it affects all the threads. Considering that it is
# less a problem to create a file with permissions 000 than 666,
# we use 666 as the umask temporary value.
umask = os.umask(0o666)
os.umask(umask)
os.chmod(fdesc.name, 0o666 & ~umask)
return fdesc
There is a function umask in the os module. You cannot get the current umask per se, you have to set it and the function returns the previous setting.
The umask is inherited from the parent process. It describes, which bits are not to be set when creating a file or directory.
This way is slow but safe and will work on any system that has a 'umask' shell command:
def current_umask() -> int:
"""Makes a best attempt to determine the current umask value of the calling process in a safe way.
Unfortunately, os.umask() is not threadsafe and poses a security risk, since there is no way to read
the current umask without temporarily setting it to a new value, then restoring it, which will affect
permissions on files created by other threads in this process during the time the umask is changed.
On recent linux kernels (>= 4.1), the current umask can be read from /proc/self/status.
On older systems, the safest way is to spawn a shell and execute the 'umask' command. The shell will
inherit the current process's umask, and will use the unsafe call, but it does so in a separate,
single-threaded process, which makes it safe.
Returns:
int: The current process's umask value
"""
mask: Optional[int] = None
try:
with open('/proc/self/status') as fd:
for line in fd:
if line.startswith('Umask:'):
mask = int(line[6:].strip(), 8)
break
except FileNotFoundError:
pass
except ValueError:
pass
if mask is None:
import subprocess
mask = int(subprocess.check_output('umask', shell=True).decode('utf-8').strip(), 8)
return mask
import os
def current_umask() -> int:
tmp = os.umask(0o022)
os.umask(tmp)
return tmp
This function is implemented in some python packages, e.g. pip and setuptools.
What is the most elegant way to solve this:
open a file for reading, but only if it is not already opened for writing
open a file for writing, but only if it is not already opened for reading or writing
The built-in functions work like this
>>> path = r"c:\scr.txt"
>>> file1 = open(path, "w")
>>> print file1
<open file 'c:\scr.txt', mode 'w' at 0x019F88D8>
>>> file2 = open(path, "w")
>>> print file2
<open file 'c:\scr.txt', mode 'w' at 0x02332188>
>>> file1.write("111")
>>> file2.write("222")
>>> file1.close()
scr.txt now contains '111'.
>>> file2.close()
scr.txt was overwritten and now contains '222' (on Windows, Python 2.4).
The solution should work inside the same process (like in the example above) as well as when another process has opened the file.
It is preferred, if a crashing program will not keep the lock open.
I don't think there is a fully crossplatform way. On unix, the fcntl module will do this for you. However on windows (which I assume you are by the paths), you'll need to use the win32file module.
Fortunately, there is a portable implementation (portalocker) using the platform appropriate method at the python cookbook.
To use it, open the file, and then call:
portalocker.lock(file, flags)
where flags are portalocker.LOCK_EX for exclusive write access, or LOCK_SH for shared, read access.
The solution should work inside the same process (like in the example above) as well as when another process has opened the file.
If by 'another process' you mean 'whatever process' (i.e. not your program), in Linux there's no way to accomplish this relying only on system calls (fcntl & friends). What you want is mandatory locking, and the Linux way to obtain it is a bit more involved:
Remount the partition that contains your file with the mand option:
# mount -o remount,mand /dev/hdXY
Set the sgid flag for your file:
# chmod g-x,g+s yourfile
In your Python code, obtain an exclusive lock on that file:
fcntl.flock(fd, fcntl.LOCK_EX)
Now even cat will not be able to read the file until you release the lock.
EDIT: I solved it myself! By using directory existence & age as a locking mechanism! Locking by file is safe only on Windows (because Linux silently overwrites), but locking by directory works perfectly both on Linux and Windows. See my GIT where I created an easy to use class 'lockbydir.DLock' for that:
https://github.com/drandreaskrueger/lockbydir
At the bottom of the readme, you find 3 GITplayers where you can see the code examples execute live in your browser! Quite cool, isn't it? :-)
Thanks for your attention
This was my original question:
I would like to answer to parity3 (https://meta.stackoverflow.com/users/1454536/parity3) but I can neither comment directly ('You must have 50 reputation to comment'), nor do I see any way to contact him/her directly. What do you suggest to me, to get through to him?
My question:
I have implemented something similiar to what parity3 suggested here as an answer: https://stackoverflow.com/a/21444311/3693375 ("Assuming your Python interpreter, and the ...")
And it works brilliantly - on Windows. (I am using it to implement a locking mechanism that works across independently started processes. https://github.com/drandreaskrueger/lockbyfile )
But other than parity3 says, it does NOT work the same on Linux:
os.rename(src, dst)
Rename the file or directory src to dst. ... On Unix, if dst exists
and is a file,
it will be replaced silently if the user has permission.
The operation may fail on some Unix flavors if src and dst
are on different filesystems. If successful, the renaming will
be an atomic operation (this is a POSIX requirement).
On Windows, if dst already exists, OSError will be raised
(https://docs.python.org/2/library/os.html#os.rename)
The silent replacing is the problem. On Linux.
The "if dst already exists, OSError will be raised" is great for my purposes. But only on Windows, sadly.
I guess parity3's example still works most of the time, because of his if condition
if not os.path.exists(lock_filename):
try:
os.rename(tmp_filename,lock_filename)
But then the whole thing is not atomic anymore.
Because the if condition might be true in two parallel processes, and then both will rename, but only one will win the renaming race. And no exception raised (in Linux).
Any suggestions? Thanks!
P.S.: I know this is not the proper way, but I am lacking an alternative. PLEASE don't punish me with lowering my reputation. I looked around a lot, to solve this myself. How to PM users in here? And meh why can't I?
Here's a start on the win32 half of a portable implementation, that does not need a seperate locking mechanism.
Requires the Python for Windows Extensions to get down to the win32 api, but that's pretty much mandatory for python on windows already, and can alternatively be done with ctypes. The code could be adapted to expose more functionality if it's needed (such as allowing FILE_SHARE_READ rather than no sharing at all). See also the MSDN documentation for the CreateFile and WriteFile system calls, and the article on Creating and Opening Files.
As has been mentioned, you can use the standard fcntl module to implement the unix half of this, if required.
import winerror, pywintypes, win32file
class LockError(StandardError):
pass
class WriteLockedFile(object):
"""
Using win32 api to achieve something similar to file(path, 'wb')
Could be adapted to handle other modes as well.
"""
def __init__(self, path):
try:
self._handle = win32file.CreateFile(
path,
win32file.GENERIC_WRITE,
0,
None,
win32file.OPEN_ALWAYS,
win32file.FILE_ATTRIBUTE_NORMAL,
None)
except pywintypes.error, e:
if e[0] == winerror.ERROR_SHARING_VIOLATION:
raise LockError(e[2])
raise
def close(self):
self._handle.close()
def write(self, str):
win32file.WriteFile(self._handle, str)
Here's how your example from above behaves:
>>> path = "C:\\scr.txt"
>>> file1 = WriteLockedFile(path)
>>> file2 = WriteLockedFile(path) #doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
LockError: ...
>>> file1.write("111")
>>> file1.close()
>>> print file(path).read()
111
I prefer to use filelock, a cross-platform Python library which barely requires any additional code. Here's an example of how to use it:
from filelock import FileLock
lockfile = r"c:\scr.txt"
lock = FileLock(lockfile + ".lock")
with lock:
file = open(path, "w")
file.write("111")
file.close()
Any code within the with lock: block is thread-safe, meaning that it will be finished before another process has access to the file.
Assuming your Python interpreter, and the underlying os and filesystem treat os.rename as an atomic operation and it will error when the destination exists, the following method is free of race conditions. I'm using this in production on a linux machine. Requires no third party libs and is not os dependent, and aside from an extra file create, the performance hit is acceptable for many use cases. You can easily apply python's function decorator pattern or a 'with_statement' contextmanager here to abstract out the mess.
You'll need to make sure that lock_filename does not exist before a new process/task begins.
import os,time
def get_tmp_file():
filename='tmp_%s_%s'%(os.getpid(),time.time())
open(filename).close()
return filename
def do_exclusive_work():
print 'exclusive work being done...'
num_tries=10
wait_time=10
lock_filename='filename.lock'
acquired=False
for try_num in xrange(num_tries):
tmp_filename=get_tmp_file()
if not os.path.exists(lock_filename):
try:
os.rename(tmp_filename,lock_filename)
acquired=True
except (OSError,ValueError,IOError), e:
pass
if acquired:
try:
do_exclusive_work()
finally:
os.remove(lock_filename)
break
os.remove(tmp_filename)
time.sleep(wait_time)
assert acquired, 'maximum tries reached, failed to acquire lock file'
EDIT
It has come to light that os.rename silently overwrites the destination on a non-windows OS. Thanks for pointing this out # akrueger!
Here is a workaround, gathered from here:
Instead of using os.rename you can use:
try:
if os.name != 'nt': # non-windows needs a create-exclusive operation
fd = os.open(lock_filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
os.close(fd)
# non-windows os.rename will overwrite lock_filename silently.
# We leave this call in here just so the tmp file is deleted but it could be refactored so the tmp file is never even generated for a non-windows OS
os.rename(tmp_filename,lock_filename)
acquired=True
except (OSError,ValueError,IOError), e:
if os.name != 'nt' and not 'File exists' in str(e): raise
# akrueger You're probably just fine with your directory based solution, just giving you an alternate method.
To make you safe when opening files within one application, you could try something like this:
import time
class ExclusiveFile(file):
openFiles = {}
fileLocks = []
class FileNotExclusiveException(Exception):
pass
def __init__(self, *args):
sMode = 'r'
sFileName = args[0]
try:
sMode = args[1]
except:
pass
while sFileName in ExclusiveFile.fileLocks:
time.sleep(1)
ExclusiveFile.fileLocks.append(sFileName)
if not sFileName in ExclusiveFile.openFiles.keys() or (ExclusiveFile.openFiles[sFileName] == 'r' and sMode == 'r'):
ExclusiveFile.openFiles[sFileName] = sMode
try:
file.__init__(self, sFileName, sMode)
finally:
ExclusiveFile.fileLocks.remove(sFileName)
else:
ExclusiveFile.fileLocks.remove(sFileName)
raise self.FileNotExclusiveException(sFileName)
def close(self):
del ExclusiveFile.openFiles[self.name]
file.close(self)
That way you subclass the file class. Now just do:
>>> f = ExclusiveFile('/tmp/a.txt', 'r')
>>> f
<open file '/tmp/a.txt', mode 'r' at 0xb7d7cc8c>
>>> f1 = ExclusiveFile('/tmp/a.txt', 'r')
>>> f1
<open file '/tmp/a.txt', mode 'r' at 0xb7d7c814>
>>> f2 = ExclusiveFile('/tmp/a.txt', 'w') # can't open it for writing now
exclfile.FileNotExclusiveException: /tmp/a.txt
If you open it first with 'w' mode, it won't allow anymore opens, even in read mode, just as you wanted...