Opening a file on Windows with exclusive locking in Python - python

I have a question quite similar to this question, where I need the following conditions to be upheld:
If a file is opened for reading, that file may only be opened for reading by any other process/program
If a file is opened for writing, that file may only be opened for reading by any other process/program
The solution posted in the linked question uses a third-party library that adds an arbitrary .LOCK file in the same directory as the file in question. That solution only works with respect to programs that use that library; it doesn't prevent any other process/program from using the file, as they may not be implemented to check for a .LOCK association.
In essence, I wish to replicate this result using only Python's standard library.
BLUF: Need a standard library implementation specific to Windows for exclusive file locking
To give an example of the problem set, assume there is:
1 file on a shared network/drive
2 users on separate processes/programs
Suppose that User 1 is running Program A on the file and at some point the following is executed:
with open(fp, 'rb') as f:
    while True:
        chunk = f.read(10)
        if chunk:
            pass  # do something with chunk
        else:
            break
Thus they are iterating through the file 10 bytes at a time.
Now User 2 runs Program B on the same file a moment later:
with open(fp, 'wb') as f:
    for b in data: # some byte array
        f.write(b)
On Windows, the file in question is immediately truncated and Program A stops iterating (even if it wasn't done) and Program B begins to write to the file. Therefore I need a way to ensure that the file may not be opened in a different mode that would alter its content if previously opened.
I was looking at the msvcrt library, namely the msvcrt.locking() interface. What I have been successful at doing is ensuring that a file opened for reading can be locked for reading, but nobody else can read the file (as I lock the entire file):
>>> f1 = open(fp, 'rb')
>>> f2 = open(fp, 'rb')
>>> msvcrt.locking(f1.fileno(), msvcrt.LK_LOCK, os.stat(fp).st_size)
>>> next(f1)
b"\x00\x05'\n"
>>> next(f2)
PermissionError: [Errno 13] Permission denied
This is an acceptable result, just not the most desired.
In the same scenario, User 1 runs Program A which includes:
with open(fp, 'rb') as f:
    msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, os.stat(fp).st_size)
    # repeat while block
    msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, os.stat(fp).st_size)
When User 2 then runs Program B a moment later, the same result occurs: the file is truncated.
At this point, I would've liked a way to throw an error to User 2 stating the file is opened for reading somewhere else and cannot be written at this time. But if User 3 came along and opened the file for reading, then there would be no problem.
Update:
A potential solution is to change the permissions of a file (with exception catching if the file is already in use):
>>> os.chmod(fp, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
>>> with open(fp, 'wb') as f:
...     # do something
...
PermissionError: [Errno 13] Permission denied <fp>
This doesn't feel like the best solution (particularly if the users didn't have the permission to even change permissions). Still looking for a proper locking solution but msvcrt doesn't prevent truncating and writing if the file is locked for reading. There still doesn't appear to be a way to generate an exclusive lock with Python's standard library.

For those who are interested in a Windows specific solution:
import os
import ctypes
import msvcrt
import pathlib
# Windows constants for file operations
NULL = 0x00000000
CREATE_ALWAYS = 0x00000002
OPEN_EXISTING = 0x00000003
FILE_SHARE_READ = 0x00000001
FILE_ATTRIBUTE_READONLY = 0x00000001 # strictly for file reading
FILE_ATTRIBUTE_NORMAL = 0x00000080 # strictly for file writing
FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000
GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000
_ACCESS_MASK = os.O_RDONLY | os.O_WRONLY
_ACCESS_MAP = {os.O_RDONLY: GENERIC_READ,
               os.O_WRONLY: GENERIC_WRITE
               }
_CREATE_MASK = os.O_CREAT | os.O_TRUNC
_CREATE_MAP = {NULL: OPEN_EXISTING,
               os.O_CREAT | os.O_TRUNC: CREATE_ALWAYS
               }
win32 = ctypes.WinDLL('kernel32.dll', use_last_error=True)
win32.CreateFileW.restype = ctypes.c_void_p
INVALID_FILE_HANDLE = ctypes.c_void_p(-1).value
def _opener(path: pathlib.Path, flags: int) -> int:
    access_flags = _ACCESS_MAP[flags & _ACCESS_MASK]
    create_flags = _CREATE_MAP[flags & _CREATE_MASK]
    if flags & os.O_WRONLY:
        share_flags = NULL
        attr_flags = FILE_ATTRIBUTE_NORMAL
    else:
        share_flags = FILE_SHARE_READ
        attr_flags = FILE_ATTRIBUTE_READONLY
    attr_flags |= FILE_FLAG_SEQUENTIAL_SCAN
    h = win32.CreateFileW(path, access_flags, share_flags, NULL, create_flags, attr_flags, NULL)
    if h == INVALID_FILE_HANDLE:
        raise ctypes.WinError(ctypes.get_last_error())
    return msvcrt.open_osfhandle(h, flags)
class _FileControlAccessor(pathlib._NormalAccessor):
    open = staticmethod(_opener)
_control_accessor = _FileControlAccessor()
class Path(pathlib.WindowsPath):
    def _init(self) -> None:
        self._closed = False
        self._accessor = _control_accessor
    def _opener(self, name, flags) -> int:
        return self._accessor.open(name, flags)
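As a rough illustration of what `_opener` decides before it calls `CreateFileW`, the flag translation can be pulled out into a pure function that runs on any platform. This is only a sketch: the name `createfile_params` is hypothetical, and the handling of `os.O_RDWR` and of flag combinations outside `_CREATE_MAP` is an assumption that goes beyond the answer above, which maps only `O_RDONLY` and `O_WRONLY`.

```python
import os

# Win32 constants, same values as in the answer above.
GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000
FILE_SHARE_READ = 0x00000001
CREATE_ALWAYS = 0x00000002
OPEN_EXISTING = 0x00000003


def createfile_params(flags):
    """Translate os.open-style flags into CreateFileW arguments.

    Returns (desired_access, share_mode, creation_disposition).
    Hypothetical helper: O_RDWR support is an assumption of this sketch.
    """
    if flags & os.O_RDWR:
        access = GENERIC_READ | GENERIC_WRITE
    elif flags & os.O_WRONLY:
        access = GENERIC_WRITE
    else:
        access = GENERIC_READ

    # Writers share nothing (exclusive); readers allow other readers only.
    writing = bool(access & GENERIC_WRITE)
    share = 0 if writing else FILE_SHARE_READ

    # Mirror the answer: O_CREAT | O_TRUNC means "always create a fresh file".
    truncate = os.O_CREAT | os.O_TRUNC
    creation = CREATE_ALWAYS if (flags & truncate) == truncate else OPEN_EXISTING
    return access, share, creation


read_params = createfile_params(os.O_RDONLY)
write_params = createfile_params(os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
update_params = createfile_params(os.O_RDWR)
```

With a share mode of 0 for writers, a second open of the same file by another process should fail with a sharing violation instead of truncating the file, which is the behavior the question asks for.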

Related

File write of unpickled byte array has no permissions set by default

I have deserialized file data in this dictionary arranged as follows -
[filename1 : bytearray with file contents]
[filename2 : bytearray with file contents]
[filename3 : bytearray with file contents]
...
Now, when I write the data to disk at my destination folder using
for f,bArr in depickled_.items():
    with open(os.path.join(r"S:\test", f), "wb") as fWr:
        fWr.write(bytearray(bArr))
        fWr.close() # <- probably redundant
The files are getting written as expected, but they have no permissions applied to them by default which I find odd. Therefore, I cannot open any of the written files as is, but when I fiddle with the security settings to allow myself read access then they open as expected.
Any idea what's going wrong and how I can fix it? I am the sole administrator (and user) of this computer.
More info:
Python version 3.7
Windows 10 Home
Added some permissions using pywin32 module.
import win32security
import ntsecuritycon as con
from win32 import win32api
for f,bArr in depickled_.items():
    full_path = os.path.join(r"S:\test", f)
    # with open(full_path, "wb+") as fWr:
    fd = os.open(full_path, os.O_CREAT | os.O_WRONLY, 0o777) # will probably work for linux
    with open(fd, 'wb') as fWr:
        fWr.write(bytearray(bArr))
        fWr.close()
    # user, domain, type = win32security.LookupAccountName ("", win32api.GetUserName())
    user, domain, type = win32security.LookupAccountName ("", "Everyone")
    sd = win32security.GetFileSecurity(full_path, win32security.DACL_SECURITY_INFORMATION)
    dacl = sd.GetSecurityDescriptorDacl()
    # Delete all existing permissions
    for index in range(0, dacl.GetAceCount()):
        dacl.DeleteAce(0)
    dacl.AddAccessAllowedAce(win32security.ACL_REVISION, con.FILE_ALL_ACCESS, user)
    sd.SetSecurityDescriptorDacl(1, dacl, 0)
    win32security.SetFileSecurity(full_path, win32security.DACL_SECURITY_INFORMATION, sd)

How to not write to file while reading and vice versa

I have a Python program (say reader.py) which reads from the file settings.py:
while( True ):
    ...
    execfile( 'settings.py' )
    ...
But there is another Python program (say writer.py) that writes to this file:
...
try:
    settings = open('settings.py', 'w')
    settings.truncate()
    settings.write( 'some text')
except IOError:
    print('Cannot write to file')
finally:
    settings.close()
...
Note 1: reader.py and writer.py do not "know" about each other.
Note 2: reader.py reads settings.py cyclically, whereas writer.py writes to the file whenever the user wants (not necessarily right after he/she clicks "write"; there is simply no rule about when writes happen).
Question: What is the best way to coordinate the two programs so that they do not conflict? I know this might depend on the platform. I am using Linux. The distributions are Ubuntu and Scientific Linux.
EDIT1: If I choose to use a FIFO, I encounter the following problem: once the writer has written to the settings file it will probably never write again, but the reader should still have access to the settings. In other words, the reader should be able to read from the file without waiting for the writer in this case. Otherwise the reader has to wait for the writer.
Ordinary use of a FIFO does not allow the reader to read from the file unless the writer is writing (it blocks until something has been written). How can I deal with this problem?
You may be interested in using a named pipe for your interprocess communication. Available on Linux, it is a special type of file designed for client (writer.py) and server (reader.py) tasks. After writing to the pipe, the client will wait until the server has received the data. This allows you to sync the two processes somewhat.
Linux Manual for FiFo
Python doc: os.mkfifo(path[, mode])
I found the following solution which seems to be working. I use flock to create locks.
Reader:
import errno
import fcntl
from time import *
path = "testLock.py"
f = open(path, "r")
while True:
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        break
    except IOError as e:
        if e.errno != errno.EAGAIN:
            raise
        else:
            sleep(1)
            print 'Waiting...'
#reader's action
execfile(path)
#drop lock
fcntl.flock(f, fcntl.LOCK_UN)
Writer:
import errno
import fcntl
from time import *
path = "testLock.py"
f = open(path, "w")
while True:
    try:
        fcntl.flock(f, fcntl.LOCK_SH | fcntl.LOCK_NB)
        break
    except IOError as e:
        if e.errno != errno.EAGAIN:
            raise
        else:
            sleep(1)
            print 'Waiting...'
#writer's action
for i in (1,10,2):
    f.write('print "%d" % i')
    sleep(1)
#drop lock
fcntl.flock(f, fcntl.LOCK_UN)
I have some questions here:
Question 1: Is this the correct usage of LOCK_EX and LOCK_SH? I mean, are they in the right place?
Question 2: Is the reader's action, i.e. execfile, correct here? If the file is already open, does execfile try to open it anyway?
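On Question 1, the usual flock convention is the reverse of the code above: readers take a shared lock (LOCK_SH) so that several can read at once, and the writer takes an exclusive lock (LOCK_EX). Note also that opening with "w" truncates the file before any lock is requested. The following is a minimal Python 3 sketch of that arrangement, not a drop-in replacement for the code above; the helper name `try_lock` and the temporary demo file are assumptions made for illustration, and the writer is opened in append mode so that nothing is truncated before the lock is held.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(fileobj, exclusive):
    """Attempt a non-blocking flock; return True if the lock was obtained."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(fileobj, mode | fcntl.LOCK_NB)
        return True
    except OSError as e:
        # EAGAIN/EACCES mean "someone else holds a conflicting lock".
        if e.errno not in (errno.EAGAIN, errno.EACCES):
            raise
        return False


# Demo on a throwaway file; each open() gets its own open file description,
# so the locks conflict just as they would between separate processes.
fd, demo_path = tempfile.mkstemp()
os.close(fd)

reader_a = open(demo_path, "r")
reader_b = open(demo_path, "r")
writer = open(demo_path, "a")  # "a" avoids truncating before the lock is held

results = {
    "reader_a": try_lock(reader_a, exclusive=False),
    "reader_b": try_lock(reader_b, exclusive=False),
    "writer_while_reading": try_lock(writer, exclusive=True),
}

# Once both readers drop their shared locks, the writer can proceed.
fcntl.flock(reader_a, fcntl.LOCK_UN)
fcntl.flock(reader_b, fcntl.LOCK_UN)
results["writer_after_release"] = try_lock(writer, exclusive=True)

for handle in (reader_a, reader_b, writer):
    handle.close()
os.remove(demo_path)
```

These locks are advisory: they only coordinate programs that also call flock on the same file, so both reader.py and writer.py would need to follow the protocol.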

Shared file access between Python and Matlab

I have a Matlab application that writes to a .csv file and a Python script that reads from it. These operations happen concurrently and at their own respective periods (not necessarily the same). All of this runs on Windows 7.
I wish to know:
Would the OS inherently provide some sort of locking mechanism so that only one of the two applications - Matlab or Python - has access to the shared file?
In the Python application, how do I check if the file is already "open"ed by the Matlab application? What's the loop structure for this so that the Python application is blocked until it gets access to read the file?
I am not sure about the Windows API for locking files.
Here's a possible solution:
While Matlab has the file open, you create an empty file called "data.lock" or something to that effect.
When Python tries to read the file, it will check for the lock file, and if it is there, then it will sleep for a given interval.
When Matlab is done with the file, it can delete the "data.lock" file.
It's a programmatic solution, but it is simpler than digging through the Windows API and finding the right calls in Matlab and Python.
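The lock-file protocol described above could be sketched on the Python side roughly as follows. The function names, the timeout, and the polling interval are assumptions for illustration. One deliberate difference from the description: instead of checking for the lock file and then sleeping, this sketch tries to create it atomically with `os.O_CREAT | os.O_EXCL`, which avoids a race between the check and the creation. It only helps if the Matlab side follows the same convention.

```python
import os
import tempfile
import time


def acquire_lock(lock_path, timeout=5.0, poll=0.05):
    """Create lock_path atomically; return True if acquired before timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)  # someone else holds the lock; wait and retry
        else:
            os.close(fd)
            return True


def release_lock(lock_path):
    """Delete the lock file so the other program can proceed."""
    os.remove(lock_path)


# Demo in a temporary directory (the name "data.lock" follows the answer).
demo_dir = tempfile.mkdtemp()
lock_path = os.path.join(demo_dir, "data.lock")

first = acquire_lock(lock_path)                 # nobody holds the lock yet
second = acquire_lock(lock_path, timeout=0.2)   # still held, so this times out
release_lock(lock_path)
third = acquire_lock(lock_path)                 # free again after release

release_lock(lock_path)
os.rmdir(demo_dir)
```

A stale lock file left behind by a crashed program would block everyone, so a real deployment would probably need some cleanup policy.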
If Python is only reading the file, I believe you have to lock it in MATLAB because a read-only open call from Python may not fail. I am not sure how to accomplish that; you may want to read this question: atomically creating a file lock in MATLAB (file mutex)
However, if you are simply consuming the data with Python, did you consider using a socket instead of a file?
In Windows on the Python side, CreateFile can be called (directly or indirectly via the CRT) with a specific sharing mode. For example, if the desired sharing mode is FILE_SHARE_READ, then the open will fail if the file is already open for writing. If the latter call instead succeeds, then a future attempt to open the file for writing will fail (e.g. in Matlab).
The Windows CRT function _wsopen_s allows setting the sharing mode. You can call it with ctypes in a Python 3 opener:
import sys
import os
import ctypes
import ctypes.util
__all__ = ['shdeny', 'shdeny_write', 'shdeny_read']
_SH_DENYRW = 0x10 # deny read/write mode
_SH_DENYWR = 0x20 # deny write mode
_SH_DENYRD = 0x30 # deny read
_S_IWRITE = 0x0080 # for O_CREAT, a new file is not readonly
if sys.version_info[:2] < (3,5):
    _wsopen_s = ctypes.CDLL(ctypes.util.find_library('c'))._wsopen_s
else:
    # find_library('c') may be deprecated on Windows in 3.5, if the
    # universal CRT removes named exports. The following probably
    # isn't future proof; I don't know how the '-l1-1-0' suffix
    # should be handled.
    _wsopen_s = ctypes.CDLL('api-ms-win-crt-stdio-l1-1-0')._wsopen_s
_wsopen_s.argtypes = (ctypes.POINTER(ctypes.c_int), # pfh
                      ctypes.c_wchar_p,             # filename
                      ctypes.c_int,                 # oflag
                      ctypes.c_int,                 # shflag
                      ctypes.c_int)                 # pmode
def shdeny(file, flags):
    fh = ctypes.c_int()
    err = _wsopen_s(ctypes.byref(fh),
                    file, flags, _SH_DENYRW, _S_IWRITE)
    if err:
        raise IOError(err, os.strerror(err), file)
    return fh.value
def shdeny_write(file, flags):
    fh = ctypes.c_int()
    err = _wsopen_s(ctypes.byref(fh),
                    file, flags, _SH_DENYWR, _S_IWRITE)
    if err:
        raise IOError(err, os.strerror(err), file)
    return fh.value
def shdeny_read(file, flags):
    fh = ctypes.c_int()
    err = _wsopen_s(ctypes.byref(fh),
                    file, flags, _SH_DENYRD, _S_IWRITE)
    if err:
        raise IOError(err, os.strerror(err), file)
    return fh.value
For example:
if __name__ == '__main__':
    import tempfile
    filename = tempfile.mktemp()
    fw = open(filename, 'w')
    fw.write('spam')
    fw.flush()
    fr = open(filename)
    assert fr.read() == 'spam'
    try:
        f = open(filename, opener=shdeny_write)
    except PermissionError:
        fw.close()
        with open(filename, opener=shdeny_write) as f:
            assert f.read() == 'spam'
    try:
        f = open(filename, opener=shdeny_read)
    except PermissionError:
        fr.close()
        with open(filename, opener=shdeny_read) as f:
            assert f.read() == 'spam'
    with open(filename, opener=shdeny) as f:
        assert f.read() == 'spam'
    os.remove(filename)
In Python 2 you'll have to combine the above openers with os.fdopen, e.g.:
f = os.fdopen(shdeny_write(filename, os.O_RDONLY|os.O_TEXT), 'r')
Or define an sopen wrapper that lets you pass the share mode explicitly and calls os.fdopen to return a Python 2 file. This will require a bit more work to get the file mode from the passed in flags, or vice versa.

Error with os.open in Python

I'm trying to create and write a file if it does not exist yet, so that it is co-operatively safe from race conditions, and I'm having a (probably stupid) problem. First, here's the code:
import os
def safewrite(text, filename):
    print "Going to open", filename
    fd = os.open(filename, os.O_CREAT | os.O_EXCL, 0666) ##### problem line?
    print "Going to write after opening fd", fd
    os.write(fd, text)
    print "Going to close after writing", text
    os.close(fd)
    print "Going to return after closing"
#test code to verify file writing works otherwise
f = open("foo2.txt", "w")
f.write("foo\n");
f.close()
f = open("foo2.txt", "r")
print "First write contents:", f.read()
f.close()
os.remove("foo2.txt")
#call the problem method
safewrite ("test\n", "foo2.txt")
Then the problem: I get an exception:
First write contents: foo
Going to open foo2.txt
Going to write after opening fd 5
Traceback (most recent call last):
  File "/home/user/test.py", line 21, in <module>
    safewrite ("test\n", "foo2.txt")
  File "/home/user/test.py", line 7, in safewrite
    os.write(fd, text)
OSError: [Errno 9] Bad file descriptor
Probable problem line is marked in the code above (I mean, what else could it be?), but I can't figure out how to fix it. What is the problem?
Note: above was tested in a Linux VM, with Python 2.7.3. If you try the code and it works for you, please write a comment with your environment.
Alternative code to do the same thing at least as safely is also very welcome.
Change the line:
fd = os.open(filename, os.O_CREAT | os.O_EXCL, 0666)
to be instead:
fd=os.open(filename, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0666)
You must open the file with a flag such that you can write to it (os.O_WRONLY).
From open(2):
DESCRIPTION
The argument flags must include one of the following access modes: O_RDONLY, O_WRONLY, or O_RDWR. These request opening the file read-only,
write-only, or read/write, respectively.
From write(2):
NAME
write - write to a file descriptor
...
ERRORS
EAGAIN The file descriptor fd has been marked non-blocking (O_NONBLOCK) and the write would block.
EBADF fd is not a valid file descriptor or is not open for writing.
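For readers on Python 3, the corrected function might look like the sketch below. It keeps the name `safewrite` from the question; the temporary directory and the demo variables are assumptions added so the example is self-contained. `os.O_EXCL` together with `os.O_CREAT` makes the open fail if the file already exists, and `os.O_WRONLY` is what makes the descriptor writable.

```python
import os
import shutil
import tempfile


def safewrite(text, filename):
    """Create filename and write text to it; fail if it already exists."""
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
    fd = os.open(filename, flags, 0o666)
    # os.fdopen wraps the descriptor and closes it when the block exits.
    with os.fdopen(fd, "w") as f:
        f.write(text)


# Demo: the first call creates the file, the second one is refused.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "foo2.txt")

safewrite("test\n", demo_path)
with open(demo_path) as f:
    written = f.read()

try:
    safewrite("again\n", demo_path)
    second_call_failed = False
except FileExistsError:
    second_call_failed = True

shutil.rmtree(demo_dir)
```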

How do I set permissions (attributes) on a file in a ZIP file using Python's zipfile module?

When I extract files from a ZIP file created with the Python zipfile module, none of the files are writable; they are read-only, etc.
The file is being created and extracted under Linux and Python 2.5.2.
As best I can tell, I need to set the ZipInfo.external_attr property for each file, but this doesn't seem to be documented anywhere I could find. Can anyone enlighten me?
This seems to work (thanks Evan, putting it here so the line is in context):
buffer = "path/filename.zip" # zip filename to write (or file-like object)
name = "folder/data.txt" # name of file inside zip
bytes = "blah blah blah" # contents of file inside zip
zip = zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED)
info = zipfile.ZipInfo(name)
info.external_attr = 0777 << 16L # give full access to included file
zip.writestr(info, bytes)
zip.close()
I'd still like to see something that documents this... An additional resource I found was a note on the Zip file format: http://www.pkware.com/documents/casestudies/APPNOTE.TXT
This link has more information than anything else I've been able to find on the net. Even the zip source doesn't have anything. Copying the relevant section for posterity. This patch isn't really about documenting this format, which just goes to show how pathetic (read non-existent) the current documentation is.
# external_attr is 4 bytes in size. The high order two
# bytes represent UNIX permission and file type bits,
# while the low order two contain MS-DOS FAT file
# attributes, most notably bit 4 marking directories.
if node.isfile:
    zipinfo.compress_type = ZIP_DEFLATED
    zipinfo.external_attr = 0644 << 16L # permissions -rw-r--r--
    data = node.get_content().read()
    properties = node.get_properties()
    if 'svn:special' in properties and \
       data.startswith('link '):
        data = data[5:]
        zipinfo.external_attr |= 0120000 << 16L # symlink file type
        zipinfo.compress_type = ZIP_STORED
    if 'svn:executable' in properties:
        zipinfo.external_attr |= 0755 << 16L # -rwxr-xr-x
    zipfile.writestr(zipinfo, data)
elif node.isdir and path:
    if not zipinfo.filename.endswith('/'):
        zipinfo.filename += '/'
    zipinfo.compress_type = ZIP_STORED
    zipinfo.external_attr = 040755 << 16L # permissions drwxr-xr-x
    zipinfo.external_attr |= 0x10 # MS-DOS directory flag
    zipfile.writestr(zipinfo, '')
Also, this link has the following.
Here the low order byte presumably means the rightmost (lowest) byte of the four bytes. So this one is for MS-DOS and can presumably be left as zero otherwise.
external file attributes: (4 bytes)
The mapping of the external attributes is
host-system dependent (see 'version made by'). For
MS-DOS, the low order byte is the MS-DOS directory
attribute byte. If input came from standard input, this
field is set to zero.
Also, the source file unix/unix.c in the sources for InfoZIP's zip program, downloaded from Debian's archives has the following in comments.
/* lower-middle external-attribute byte (unused until now):
* high bit => (have GMT mod/acc times) >>> NO LONGER USED! <<<
* second-high bit => have Unix UID/GID info
* NOTE: The high bit was NEVER used in any official Info-ZIP release,
* but its future use should be avoided (if possible), since it
* was used as "GMT mod/acc times local extra field" flags in Zip beta
* versions 2.0j up to 2.0v, for about 1.5 years.
*/
So taking all this together, it looks like only the second highest byte is actually used, at least for Unix.
EDIT: I asked about the Unix aspect of this on Unix.SX, in the question "The zip format's external file attribute". Looks like I got a couple of things wrong. Specifically both of the top two bytes are used for Unix.
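To make the layout concrete, here is a small Python 3 sketch that splits an external_attr value into its Unix half (the top two bytes, holding file type and permission bits) and its MS-DOS half (the bottom two bytes). The helper name `describe_external_attr` is hypothetical; `stat.filemode` is the standard library function that renders a mode as an `ls`-style string.

```python
import stat


def describe_external_attr(external_attr):
    """Return (ls-style mode string, MS-DOS directory flag) for a ZIP entry."""
    unix_mode = (external_attr >> 16) & 0xFFFF  # file type + permission bits
    dos_attrs = external_attr & 0xFFFF          # MS-DOS FAT attributes
    is_dos_dir = bool(dos_attrs & 0x10)         # bit 4 marks directories
    return stat.filemode(unix_mode), is_dos_dir


regular_file = describe_external_attr(0o100755 << 16)
directory = describe_external_attr((0o040755 << 16) | 0x10)
symlink = describe_external_attr(0o120777 << 16)
```

Here `0o100000`, `0o040000`, and `0o120000` are the regular-file, directory, and symlink type bits (`stat.S_IFREG`, `stat.S_IFDIR`, `stat.S_IFLNK`).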
Look at this: Set permissions on a compressed file in python
I'm not entirely sure if that's what you want, but it seems to be.
The key line appears to be:
zi.external_attr = 0777 << 16L
It looks like it sets the permissions to 0777 there.
The earlier answers did not work for me (on OS X 10.12). I found that as well as the executable flags (octal 755), I also need to set the "regular file" flag (octal 100000). I found this mentioned here: https://unix.stackexchange.com/questions/14705/the-zip-formats-external-file-attribute
A complete example:
import time
import zipfile
zipname = "test.zip"
filename = "test-executable"
zip = zipfile.ZipFile(zipname, 'w', zipfile.ZIP_DEFLATED)
f = open(filename, 'r')
bytes = f.read()
f.close()
info = zipfile.ZipInfo(filename)
info.date_time = time.localtime()
info.external_attr = 0100755 << 16L
zip.writestr(info, bytes, zipfile.ZIP_DEFLATED)
zip.close()
A complete example of my specific usecase, creating a zip of a .app so that everything in the folder Contents/MacOS/ is executable: https://gist.github.com/Draknek/3ce889860cea4f59838386a79cc11a85
You can extend the ZipFile class to change the default file permission:
from zipfile import ZipFile, ZipInfo
import time
class PermissiveZipFile(ZipFile):
    def writestr(self, zinfo_or_arcname, data, compress_type=None):
        if not isinstance(zinfo_or_arcname, ZipInfo):
            zinfo = ZipInfo(filename=zinfo_or_arcname,
                            date_time=time.localtime(time.time())[:6])
            zinfo.compress_type = self.compression
            if zinfo.filename[-1] == '/':
                zinfo.external_attr = 0o40775 << 16 # drwxrwxr-x
                zinfo.external_attr |= 0x10 # MS-DOS directory flag
            else:
                zinfo.external_attr = 0o664 << 16 # ?rw-rw-r--
        else:
            zinfo = zinfo_or_arcname
        super(PermissiveZipFile, self).writestr(zinfo, data, compress_type)
This example changes the default file permission to 664 and keeps 775 for directories.
Related code:
ZipFile.writestr, Python 2.7
ZipFile.writestr, Python 3.6
Also look at what Python's zipfile module does:
def write(self, filename, arcname=None, compress_type=None):
    ...
    st = os.stat(filename)
    ...
    zinfo = ZipInfo(arcname, date_time)
    zinfo.external_attr = (st[0] & 0xFFFF) << 16L # Unix attributes
    ...
To set permissions (Unix attributes) on a file in a ZIP file using Python's zipfile module, pass the attributes as bits 16-31 of the external_attr of ZipInfo.
The Python zipfile module accepts the 16-bit "Mode" field (which stores the st_mode field from struct stat, containing user/group/other permissions, setuid/setgid and symlink info, etc.) of the ASi extra block for Unix in the external_attr bits mentioned above.
You may also import Python's "stat" module to get the mode constant definitions.
You may also set 3 in create_system to specify the operating system which created the ZIP archive: 3 = Unix; 0 = Windows.
Here is an example:
#!/usr/bin/python
import stat
import zipfile
def create_zip_with_symlink(output_zip_filename, link_source, link_target):
    zipInfo = zipfile.ZipInfo(link_source)
    zipInfo.create_system = 3
    unix_st_mode = stat.S_IFLNK | stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR | stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP | stat.S_IROTH | stat.S_IWOTH | stat.S_IXOTH
    zipInfo.external_attr = unix_st_mode << 16
    zipOut = zipfile.ZipFile(output_zip_filename, 'w', compression=zipfile.ZIP_DEFLATED)
    zipOut.writestr(zipInfo, link_target)
    zipOut.close()
create_zip_with_symlink('cpuinfo.zip', 'cpuinfo.txt', '/proc/cpuinfo')
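The same idea applies to regular files. As a minimal Python 3 sketch (the helper name `add_file_with_mode`, the archive member names, and the in-memory buffer are assumptions for illustration), the block below stores two entries with different permissions and then reads the modes back from the archive to confirm they were recorded.

```python
import io
import stat
import zipfile


def add_file_with_mode(zf, arcname, data, mode=0o644):
    """Write data into zf as a regular file with the given Unix permissions."""
    info = zipfile.ZipInfo(arcname)
    info.create_system = 3                       # 3 = Unix
    info.compress_type = zipfile.ZIP_DEFLATED
    # Regular-file type bit plus permissions, placed in bits 16-31.
    info.external_attr = (stat.S_IFREG | mode) << 16
    zf.writestr(info, data)


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    add_file_with_mode(zf, "bin/tool.sh", b"#!/bin/sh\necho hi\n", 0o755)
    add_file_with_mode(zf, "docs/readme.txt", b"hello\n")

# Read the archive back and render each stored mode as an ls-style string.
with zipfile.ZipFile(buffer) as zf:
    stored_modes = {
        info.filename: stat.filemode(info.external_attr >> 16)
        for info in zf.infolist()
    }
```

Whether an extraction tool honors these bits is up to the tool; the archive only records them.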
When you do it like this, does it work alright?
zf = zipfile.ZipFile("something.zip")
for name in zf.namelist():
    f = open(name, 'wb')
    f.write(zf.read(name))
    f.close()
If not, I'd suggest throwing in an os.chmod in the for loop with 0777 permissions like this:
zf = zipfile.ZipFile("something.zip")
for name in zf.namelist():
    f = open(name, 'wb')
    f.write(zf.read(name))
    f.close()
    os.chmod(name, 0777)
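Rather than forcing every file to 0777, the extraction loop could reuse the permissions recorded in the archive. The following Python 3 sketch assumes a Unix-like system and an archive whose entries carry Unix modes in external_attr; the function name `extract_with_permissions` and the demo file names are hypothetical.

```python
import os
import shutil
import stat
import tempfile
import zipfile


def extract_with_permissions(zip_path, dest_dir):
    """Extract every member, then apply the Unix mode stored in the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            extracted_path = zf.extract(info, dest_dir)
            mode = (info.external_attr >> 16) & 0o777  # permission bits only
            if mode:  # archives made on other systems may store no mode
                os.chmod(extracted_path, mode)


# Demo: build a small archive with one executable entry, then extract it.
demo_dir = tempfile.mkdtemp()
zip_path = os.path.join(demo_dir, "something.zip")

info = zipfile.ZipInfo("run.sh")
info.external_attr = (stat.S_IFREG | 0o755) << 16
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr(info, "#!/bin/sh\n")

out_dir = os.path.join(demo_dir, "out")
extract_with_permissions(zip_path, out_dir)
restored_mode = stat.S_IMODE(os.stat(os.path.join(out_dir, "run.sh")).st_mode)

shutil.rmtree(demo_dir)
```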
