Cross-platform cooperative file locking through link/folder creation?

I would like to implement a cooperative file locking mechanism in Python that would also work on remote partitions (e.g. NFS), with simple code (I want to avoid third-party modules, because I want a specific piece of open-source code to have no dependencies).
There are solutions out there that look relatively cross-platform, but they are more complicated than I would like: ideally the exact same lines of code would run on all platforms.
A solution is to use some atomic operation that tries to create a lock and fails if it cannot (e.g. a lock in the form of a directory). Creating a directory is atomic on Unix, so that's a good first step. Now, what would be an equivalent solution for Windows? I have read somewhere that creating a link might be atomic (how would that be done?); if creating a directory is also atomic on Windows, that would be even better, as the same code could be used on both Windows and Unix, but I can't find out whether this is the case.
To summarize: what would be simple, cross-platform Python code (no library) for creating (and releasing) a cooperative file lock that also works on remote partitions? The directory-creation route looks promising, but does it work on Windows?
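To make the idea concrete, here is a minimal sketch of the directory-based approach I have in mind (the ".lock" suffix, timeout and polling interval are arbitrary choices; whether os.mkdir fails the same atomic way on Windows and over NFS is exactly what I am asking):

import os
import time

class DirLock:
    # Cooperative lock built on directory creation: os.mkdir raises
    # OSError if the directory already exists, which on POSIX makes
    # it an atomic test-and-set.
    def __init__(self, path, timeout=10.0, poll_interval=0.1):
        self.lock_dir = path + ".lock"
        self.timeout = timeout
        self.poll_interval = poll_interval

    def __enter__(self):
        deadline = time.time() + self.timeout
        while True:
            try:
                os.mkdir(self.lock_dir)  # the atomic step
                return self
            except OSError:
                if time.time() > deadline:
                    raise RuntimeError("could not acquire " + self.lock_dir)
                time.sleep(self.poll_interval)

    def __exit__(self, *exc_info):
        os.rmdir(self.lock_dir)  # release

Cooperating processes would then wrap their accesses in "with DirLock('/mnt/nfs/shared/data.txt'): ...".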

Related

What is the recommended practice in django to execute external scripts?

I'm planning to build a WebApp that will need to execute scripts based on an argument that a user will provide in a text field or in the URL.
Possible solutions that I have found:
create a lib directory in the root directory of the project, put the scripts there, and import them from views.
use the subprocess module to run the scripts directly, in the following way:
subprocess.call(['python', 'somescript.py', argument_1,...])
where argument_1 is whatever the end user provides.
I'm planning to build a WebApp that will need to execute scripts
Why should it "execute scripts"? Turn your "scripts" into proper modules, import the relevant functions and call them. The fact that Python can be used as a "scripting language" doesn't mean it's not a proper programming language.
Approach (1) should be the default approach. Never use subprocess unless you absolutely have to.
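A minimal sketch of approach (1); the module and function names, and the project layout, are made up for illustration:

# myproject/lib/somescript.py -- the former script, now a module
def run(argument):
    # Do whatever somescript.py used to do, and return the result.
    return "processed " + argument

# myproject/views.py
from django.http import HttpResponse
from myproject.lib import somescript

def run_script(request):
    argument = request.GET.get("arg", "")
    result = somescript.run(argument)  # a plain function call: easy to
    return HttpResponse(result)        # pass values and catch exceptions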
Disadvantages of subprocessing:
Depends on the underlying OS and, in your case, on Python (i.e. is the python command the same interpreter as the one running the original script?).
Potentially harder to make safe.
Harder to pass values, return results and report errors.
Eats more memory and CPU (a side effect is that you can utilize all CPU cores, but since you are writing a web app it is likely you do that anyway).
Generally harder to code and maintain.
Advantages of subprocessing:
Isolates the runtime. This is useful if for example scripts are uploaded by users. You don't want them to mess with your application.
Related to 1: potentially easier to dynamically add scripts. Not that you should do that anyway. It also becomes harder when you have more than one server and you need to synchronize them.
Well, you can run non-Python code that way. But it doesn't apply to your case.

Standalone Python interpreter

I want to run a Python program without any underlying OS.
I have read articles on running Python on small microcontrollers, but I want it on a bigger processor (Intel, ARM).
My criteria are:
It could be directly run as binary.
The Python interpreter could be loaded, onto which I can run my program.
At worst, tell me an extremely small, basic OS I can run it on.
Note: I want to use my program like a minimalistic operating system. I should be able to load it like any other OS, and it should be able to access memory and have basic I/O.
Note 2: Will there be limitations in terms of Python's functions?
Note: this answer covers x86 exclusively, one of the two architectures (next to ARM) requested by the OP.
It could be directly run as binary.
Binary? Python is not compiled to native machine code, so no standalone binary is produced. I think you mean just "run a Python program directly" here.
You could implement an additional compilation step, so that Python source files are compiled to bytecode prior to being executed, of course.
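The standard library can already perform that compilation step explicitly; the file names below are illustrative:

import py_compile

# Compile a source file to bytecode ahead of time; the result is the
# same kind of .pyc the interpreter would produce on first import.
py_compile.compile("somescript.py", cfile="somescript.pyc")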
The Python interpreter could be loaded, onto which I can run my program.
"loaded" is a problem here. You need software to load the interpreter, displaying a chicken-egg problem. Intel x86 solves the problem by using a so-called BIOS (Basic I/O System), which starts further, user-defined programs. This "user-defined" program would be your Python interpreter then.
On more modern machines, UEFI is used instead of the legacy BIOS.
I want to use my program like a minimalistic operating system. I should be able to load it like any other OS, and it should be able to access memory and have basic I/O.
The aforementioned BIOS provides, as the acronym says, basic I/O functionality like reading/writing from/to disks, reading/writing from/to the screen, etc. Either use these basic routines and build abstractions on top of them, or circumvent them and rewrite everything from scratch. That includes graphics drivers (a basic VGA driver will suffice), disk drivers (for loading Python files from disk), and a filesystem (a simple FAT-16 is sufficient).
After all, you not only need to write a Python interpreter but a whole development environment from scratch.
Will there be limitations in terms of Python's functions?
It depends on what you implement. For networking you need the appropriate drivers; for file access, a filesystem plus a secondary-storage driver. You are the ultimate master of the system you create, so it is up to you how un/limited your Python environment will be.

How to track changes on files like dropbox does?

Does anybody know how to execute methods (in Python) when files are modified, like Dropbox does with its Continuous Data Protection mechanism, which can track exactly when a file is modified and sync it?
Of course it would not cover the entire hard disk, just a specified directory.
Note: for Windows and Linux. Mac is a plus ;)
On Linux, pyinotify will probably do what you want. But note the caveats mentioned in the inotify(7) manpage, in particular:
Note that the event queue can overflow. In this case, events are lost. Robust applications should handle the possibility of lost events gracefully.
If monitoring an entire directory subtree, and a new subdirectory is created in that tree, be aware that by the time you create a watch for the new subdirectory, new files may already have been created in the subdirectory. Therefore, you may want to scan the contents of the subdirectory immediately after adding the watch.
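A minimal pyinotify sketch; the watched path is illustrative, and the auto_add flag addresses the subdirectory caveat quoted above:

import pyinotify

class Handler(pyinotify.ProcessEvent):
    def process_IN_MODIFY(self, event):
        print("modified:", event.pathname)

    def process_IN_CREATE(self, event):
        print("created:", event.pathname)

wm = pyinotify.WatchManager()
mask = pyinotify.IN_MODIFY | pyinotify.IN_CREATE | pyinotify.IN_DELETE
wm.add_watch("/path/to/dir", mask, rec=True, auto_add=True)

# Blocks and dispatches events to the handler as they arrive.
pyinotify.Notifier(wm, Handler()).loop()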
I'm not sure if Python has any cross-platform solution for this, but if you are only interested in Windows-based solutions, you should look into directory change notifications. To call the Win32 API functions, you can look into pywin32.
On Linux, there seems to be a bunch of solutions, including fschange, dnotify and inotify. I'm not sure which one is the recommended solution, but inotify seems to be the most complete solution.
Not all platforms have such a feature. If it's not available for a given platform, you'll have to emulate such notifications by checking directory contents periodically.
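A crude but portable polling fallback could look like this (the interval and the reliance on modification times are arbitrary choices):

import os
import time

def watch(directory, callback, interval=1.0):
    # Poll `directory` and invoke callback(path) for every file whose
    # modification time changed since the previous scan.
    mtimes = {}
    while True:
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished between listdir and stat
            if mtimes.get(path) not in (None, mtime):
                callback(path)
            mtimes[path] = mtime
        time.sleep(interval)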
What you need is rsync. There are several implementations of rsync in Python. Check these out:
http://pypi.python.org/pypi/rsync.py/2.0
http://code.activestate.com/recipes/577518-rsync-algorithm/
Looking for cross-platform rsync-like functionality in python, such as rsync.py
Controlling rsync with Python?

Would it be a good idea to make Python store compiled code in a file stream instead of pyc files?

I'm wondering if it wouldn't be better if Python stored the compiled code in a file stream of the original source file. This would work on file systems supporting forks/data streams, with a fallback if this is not possible:
On Windows, using ADS (Alternate Data Streams), as sketched after this list
On OS X, using resource forks
On Linux, using extended file attributes if the compiled file is under 32 kB
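To illustrate the Windows case: on NTFS an alternate data stream is addressed through the ordinary file API as "filename:streamname" (the stream name here is made up, and this only works on Windows with NTFS):

import marshal

source_path = "module.py"  # illustrative path
with open(source_path) as f:
    code = compile(f.read(), source_path, "exec")

# Write the marshalled code object into an alternate data stream
# attached to the source file itself.
with open(source_path + ":bytecode", "wb") as stream:
    stream.write(marshal.dumps(code))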
Doing this would solve the problem of polluting the source tree, and of situations like a .py file being removed while its stale .pyc remains and keeps being loaded and used.
What do you think about this: does it sound like a good idea or not? What issues do you see?
You sure do sacrifice an awful lot of portability this way -- right now .pyc files are uncommonly portable (often used by heterogeneous systems on a LAN through some kind of network file system arrangement, for example, though I've never been a fan of the performance characteristics of that approach), while your approach would only work on very specific filesystems and (I suspect) never across a network mount on heterogeneous machines.
So, it would be a dire mistake to make the behavior you want the default one -- but it would surely be neat to have it as an option available for specific request if your deployment environment doesn't care about all of the above issues and does care about some of those you mention. Another "cool option to have", that I would actually use about 100 times more often, is to put the .pyc "files" in a database instead of having them in filesystems.
The cool thing is that this is (relatively) easily accomplished as an add-on "import hack" one way or another (depending on Python versions) -- most easily in recent-enough versions with importlib, Brett Cannon's masterpiece (but that might make backporting to older Python versions harder than other ways... too much depends on exactly what versions you need to support, a detail which I don't see in your Q, so I won't go into the implementation details, but the general idea doesn't change much across implementations).
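To give the flavour of such an import hack on Python 3, here is a minimal sketch that serves modules from an in-memory dict standing in for the database (the names and the storage are made up for illustration):

import importlib.abc
import importlib.util
import sys

# Toy "database" of module sources; a real version would fetch
# (pre-compiled) code from an actual database instead.
SOURCE_DB = {"hello": "def greet():\n    return 'hello from the database'\n"}

class DBFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    def find_spec(self, fullname, path, target=None):
        if fullname in SOURCE_DB:
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        code = compile(SOURCE_DB[module.__name__], "<db>", "exec")
        exec(code, module.__dict__)

sys.meta_path.insert(0, DBFinder())

import hello
print(hello.greet())  # -> hello from the database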
One problem I foresee is that it then means that each platform has different behaviour.
The next is that not every filesystem OS X supports also supports resource forks (and the way it stores them on non-HFS filesystems, as ._ files, is universally hated by everyone else).
Having said that, I have often been bitten by a .pyc file being used by apache because the apache process can't read the .py file I have replaced. But I think that this is not the solution: a better deployment process is ;)

Limiting the features of an embedded python instance

Is there a way to limit the abilities of Python scripts running under an embedded interpreter? Specifically I wish to prevent the scripts from doing things like the following:
Importing Python extension modules (i.e. .pyd modules), except those specifically allowed by the application.
Manipulating processes in any way (i.e. starting new processes, or terminating the application).
Any kind of networking.
Manipulating the file system (e.g. creating, modifying and deleting files).
No. There's no easy way to prevent those things on CPython. Your options are:
Edit CPython source code and remove things you don't want - provide mocking methods for all those things. Very error-prone and hard to do. This is the approach of Google's App Engine.
Use RestrictedPython. However, with it you can't prevent your user from exhausting the available memory or running infinite eat-all-CPU loops.
Use another Python implementation. PyPy has a sandbox mode you can use. Jython runs under Java, and I guess Java can be sandboxed.
Maybe this can be helpful: it has an example of how to work with the ast module.
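One way ast helps here is a static check that rejects disallowed imports before the script runs; the whitelist is illustrative, and note that a purely static check is not a real sandbox (__import__, getattr tricks and so on can still slip past, as the answer above explains):

import ast

ALLOWED_MODULES = {"math", "json"}  # illustrative whitelist

def check_imports(source):
    # Walk the syntax tree and reject any import of a module that is
    # not on the whitelist. Raises ValueError on the first offender.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name.split(".")[0] not in ALLOWED_MODULES:
                raise ValueError("import of %r is not allowed" % name)

check_imports("import math")  # passes silently
try:
    check_imports("import socket")
except ValueError as exc:
    print(exc)  # import of 'socket' is not allowed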
What you want is Google's Unladen Swallow project, which the Python version of App Engine runs on.
Modules are severely restricted, ctypes are not allowed, sockets are matched against some policy or other, in other words you get a sandboxed version of Python, in line with their Java offering.
I'd like to point out that this makes the system almost useless. Well, useless for anything cooler than yet another App Engine app. Forget monkey-patching system modules; even access to your own stack is restricted. Totally un-dynamic-like.
OT: games typically embed Lua for scripting, perhaps you should check it out.
