Multiple users append to the same file at the same time in Python

I'm working on a Python script that will be accessed via the web, so multiple users may try to append to the same file at the same time. My worry is that this could cause a race condition: if multiple users write to the file simultaneously, it might become corrupted.
For example:
#!/usr/bin/env python
g = open("/somepath/somefile.txt", "a")
new_entry = "foobar"
g.write(new_entry)
g.close()
Will I have to use a lockfile for this, as this operation looks risky?

You can use file locking:
import fcntl

new_entry = "foobar"
with open("/somepath/somefile.txt", "a") as g:
    fcntl.flock(g, fcntl.LOCK_EX)  # take an exclusive lock before writing
    g.write(new_entry)
    fcntl.flock(g, fcntl.LOCK_UN)  # release so other writers can proceed
Note that on some systems, locking is not needed if you're only writing small buffers, because appends on these systems are atomic.

If you are doing this on Linux and each entry goes out in a single write of less than 4 KB, the append is effectively atomic and you should be good.
More to read here: Is file append atomic in UNIX?
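That guarantee only helps if the whole entry reaches the kernel in a single write call. Here is a minimal sketch (path reused from the question) that uses an unbuffered os.write on an O_APPEND descriptor, so Python's buffering cannot split the record:
import os

# One os.write() on an O_APPEND descriptor hands the whole record
# to the kernel at once; a buffered f.write() may split it.
fd = os.open("/somepath/somefile.txt", os.O_WRONLY | os.O_APPEND | os.O_CREAT)
try:
    os.write(fd, b"foobar\n")
finally:
    os.close(fd)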

You didn't state which platform you use, but here is a module you can use that is cross-platform:
File locking in Python
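For example, the third-party portalocker package wraps the platform-specific locking calls. A hedged sketch, assuming its Lock context manager API (pip install portalocker):
import portalocker

# Lock acquires an exclusive lock on entry and releases it on exit.
with portalocker.Lock("/somepath/somefile.txt", mode="a", timeout=10) as g:
    g.write("foobar\n")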

Depending on your platform and the filesystem the file lives on (e.g. NFS), this may not be doable in a safe manner. Perhaps you could have each writer use its own file and merge the results afterwards?
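A minimal sketch of that idea, reusing the path from the question (the pid-based naming is purely illustrative):
import glob
import os

# Each writer appends only to its own file, so no locking is needed.
own_path = "/somepath/somefile.%d.txt" % os.getpid()
with open(own_path, "a") as g:
    g.write("foobar\n")

# Later, one merge pass combines the per-writer files.
with open("/somepath/somefile.txt", "w") as merged:
    for part in sorted(glob.glob("/somepath/somefile.*.txt")):
        with open(part) as p:
            merged.write(p.read())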

Related

Execute bytes string in Python from an .exe file [duplicate]

How can I load an exe file—stored as a base64 encoded string—into memory and execute it without writing it to disk?
The point is to put some kind of control/password/serial system in place and compile it with py2exe, so that I could execute that embedded file whenever I want in my code.
All of the mechanisms Python has for executing a child process require a filename.
And so does the underlying CreateProcess function in the Win32 API, so there's not even an easy way around it by dropping down to that level.
There is a way to do this by dropping down to ZwCreateProcess/NtCreateProcess. If you know how to use the low-level NT API, this post should be all you need to understand it. If you don't… it's way too much to explain in an SO answer.
Alternatively, of course, you can create or use a RAM drive, or even simulate a virtual filesystem, but that's getting a little silly as an attempt to avoid creating a file.
So, the right answer is to write the exe to a file, then execute it. For example, something like this:
import base64
import os
import subprocess
import tempfile

fd, path = tempfile.mkstemp(suffix='.exe')
code = base64.b64decode(encoded_code)
os.write(fd, code)
os.fchmod(fd, 0o711)  # mark executable; note os.fchmod is POSIX-only
os.close(fd)
try:
    result = subprocess.call(path)
finally:
    os.remove(path)
This should work on both Windows and *nix, but it's completely untested, and will probably have bugs on at least one platform.
Obviously, if you want to execute it multiple times, don't remove it until you're done with it. Or just use some appropriate persistent directory, and write it only if it's missing or out of date.
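A hedged sketch of that caching approach (the cache directory, helper name, and hashing scheme are illustrative, not part of the original answer):
import base64
import hashlib
import os
import subprocess

CACHE_DIR = os.path.join(os.path.expanduser("~"), ".myapp")  # hypothetical

def run_embedded(encoded_code):
    code = base64.b64decode(encoded_code)
    # Key the file name on the payload's hash so a changed payload
    # gets written out again, while an unchanged one is reused.
    digest = hashlib.sha256(code).hexdigest()[:16]
    path = os.path.join(CACHE_DIR, "embedded-%s.exe" % digest)
    if not os.path.exists(path):
        if not os.path.isdir(CACHE_DIR):
            os.makedirs(CACHE_DIR)
        with open(path, "wb") as f:
            f.write(code)
    return subprocess.call(path)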
Encoding the exe:
import base64

# encode the exe file as base64 data
with open("Sample.exe", 'rb') as f:
    read_exe_to_base64 = base64.b64encode(f.read())

# the encoded data will be a really big blob of text, e.g.:
b'TVqQAAMAAAAEAAAA//8AALgAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAyAAAAA4fug4AtAnNIbgBTM0hVGhpcyBwcm9ncmFtIGNhbm5vdCBiZSBydW4gaW4gRE9TIG1vZGUuDQ0KJAAAAAAAAAA9AHveeWEVjXlhFY15YRWN+n0bjXhhFY0QfhyNfmEVjZB+GI14YRWNUmljaHlhFY0AAAAAAAAAAAAAAA'

# decode the exe file:
with open("Sample2.exe", 'wb') as f:
    f.write(base64.b64decode(read_exe_to_base64))
The exe file will be created in that folder. If you don't want users to see it, just decode it into some out-of-the-way folder and delete it after use.

Is file-input thread-safe in Python?

In Python2, is it safe to have multiple threads read from a single unchanging disk file using code such as:
with open(pathname, 'rb') as f:
    f.seek(file_position)
    data = f.read(number_of_bytes)
No process has, or will have, write-permission for the file.
Obviously, reading files this way is not atomic. The Python 2 documentation says nothing (that I could find) about file objects and threads. Here is the documentation for the seek method:
https://docs.python.org/2/library/stdtypes.html?highlight=seek#file-objects
This is a critical issue for my system, so if pointers into the documentation could be provided, that would be reassuring.
Thank you.
If each thread executes the code you've given, each opens the file separately, and this is safe. I'm not sure what documentation to refer you to; it is simply a consequence of a process being allowed to have the same file open more than once. You may not be on a POSIX system, but for reference POSIX describes an open file description as the thing created by open() (in C, but wrapped by Python) that holds the file offset and other information relevant to accessing the file.
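To make that concrete, here is a minimal sketch in which each thread opens the file itself and therefore gets its own offset (the file name, offsets, and sizes are made up):
import threading

def read_at(pathname, file_position, number_of_bytes, results, idx):
    # A per-thread open() gives a per-thread file offset, so no
    # locking is needed for read-only access.
    with open(pathname, 'rb') as f:
        f.seek(file_position)
        results[idx] = f.read(number_of_bytes)

results = [None] * 4
threads = [threading.Thread(target=read_at,
                            args=('data.bin', i * 1024, 1024, results, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()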

Checking if file in use (by other process) by trying to rename it to same name (in Python on Windows)

I want to detect if a file is being written to by another process before I start to read the contents of that file.
This is on Windows and I am using Python (2.7.x).
(By the way, the Python script is providing a service where it acts on files that are placed in a specified folder. It acts on the files as soon as they are detected and it deletes the files after having acted on them. So I don't want to start acting on a file that is only partially written.)
I have found empirically that trying to rename the file to the same name will fail if the file is being written to (by another process) and will succeed (as a no-op) if the file is not in use by another process.
Something like this:
import os

def isFileInUse(filePath):
    try:
        os.rename(filePath, filePath)
        return False
    except OSError:  # rename fails while another process holds the file
        return True
I haven't seen anything documented about the behaviour of os.rename when source and destination are the same.
Does anyone know of something that might go wrong with what I am doing above?
I emphasize that I am looking for a solution that works on Windows, and I note that os.access doesn't seem to work: even with os.W_OK it returns True while the file is being written by another process.
One thing that is nice about the above solution (renaming to the same name) is that it is atomic - which is not true if I try to rename to a temp name, then rename back to the original name.
Since you only want to read the file, why not just try to do it? This is the operation you are attempting:
try:
    with open("file.txt", "r") as handle:
        content = handle.read()
except IOError as msg:
    pass  # error handling goes here
This will try to read the content, and fail if the file is locked, or unreadable.
I see no reason to check if the file is locked if you just want to read from it - just try reading and see if that throws an exception.
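Since the files arrive in a watched folder and may still be mid-write, one pragmatic variant is to retry the read a few times before giving up. A sketch (the helper name and retry parameters are made up, and it relies on the writer holding the file open exclusively, as the question observed):
import time

def read_when_ready(file_path, attempts=5, delay=0.5):
    for _ in range(attempts):
        try:
            with open(file_path, "r") as handle:
                return handle.read()
        except IOError:
            time.sleep(delay)  # writer may still be finishing
    raise IOError("gave up waiting for %s" % file_path)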

Avoiding partially written files in Python

What is the "safest" way to write files in Python? I've heard about atomic file writing but I am not sure of how to do it and how to handle it.
What you want is atomic file replacement, so that an unfinished file never appears on disk: the target location only ever holds either the complete new version or the complete old version.
The method for Python is described here:
Atomic file replacement in Python
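The core of that recipe: write the complete new contents to a temporary file in the same directory, flush it to disk, then rename it over the target. A minimal sketch (os.replace requires Python 3.3+; on older versions os.rename is atomic on POSIX but fails on Windows when the target exists):
import os
import tempfile

def atomic_write_text(path, data):
    dirname = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the data reaches the disk
        os.replace(tmp_path, path)  # atomic swap into place
    except Exception:
        os.remove(tmp_path)
        raise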
You can write to a temp file and rename it yourself, but there are a lot of gotchas in doing it correctly. I prefer to use the nice atomicwrites library.
Install it:
pip install atomicwrites==1.4.0  # check whether there is a newer version
And then just use it as a context:
from atomicwrites import atomic_write

with atomic_write('foo.txt', overwrite=True) as f:
    f.write('Hello world.')
    # 'foo.txt' does not exist yet; it is created when the context closes
with open("path", "w") as f:
f.write("Hello, World")
Using the with statement guarantees that the file is closed no matter what happens (it is equivalent to a try ... finally).
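For reference, the with block above behaves roughly like this explicit try ... finally:
f = open("path", "w")
try:
    f.write("Hello, World")
finally:
    f.close()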

Python OSError: Too many open files

I'm using Python 2.7 on Windows XP.
My script relies on tempfile.mkstemp and tempfile.mkdtemp to create a lot of files and directories with the following pattern:
_,_tmp = mkstemp(prefix=section,dir=indir,text=True)
<do something with file>
os.close(_)
Running the script always incurs the following error (although the exact line number changes, etc.). The actual file that the script is attempting to open varies.
OSError: [Errno 24] Too many open files: 'path\\to\\most\\recent\\attempt\\to\\open\\file'
Any thoughts on how I might debug this? Also, let me know if you would like additional information. Thanks!
EDIT:
Here's an example of use:
out = os.fdopen(_, 'w')
out.write("Something")
out.close()
with open(_) as p:
    p.read()
You probably don't have the same value stored in _ at the time you call os.close(_) as at the time you created the temp file. Try assigning to a named variable instead of _.
It would help you and us if you could provide a very small code snippet that demonstrates the error.
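For illustration, here is the questioner's pattern with the descriptor and the path kept in separate named variables (section and indir come from the question):
import os
from tempfile import mkstemp

fd, tmp_path = mkstemp(prefix=section, dir=indir, text=True)
with os.fdopen(fd, 'w') as out:  # fdopen takes ownership of the descriptor
    out.write("Something")
# `out` is closed here, which closes the descriptor exactly once.
with open(tmp_path) as p:
    p.read()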
Why not use tempfile.NamedTemporaryFile with delete=False? This lets you work with Python file objects, which is one bonus. Also, it can be used as a context manager (which should take care of all the details, making sure the file is properly closed):
import tempfile

with tempfile.NamedTemporaryFile('w', prefix=section, dir=indir, delete=False) as f:
    pass  # Do something with the file here.
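As a possible follow-up usage (a sketch, with section and indir from the question): with delete=False the file survives the context, so you can reopen it by f.name, and removing it afterwards is your responsibility:
import os
import tempfile

with tempfile.NamedTemporaryFile('w', prefix=section, dir=indir,
                                 delete=False) as f:
    f.write("Something")
    path = f.name  # keep the name so we can reopen and remove it later

with open(path) as p:
    p.read()
os.remove(path)  # delete=False means cleanup is up to you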
