How to track changes on files like dropbox does?

Does anybody know how to execute methods (in Python) when files are modified, the way Dropbox does with its Continuous Data Protection mechanism, which can track exactly when a file is modified and sync it?
Of course it would not cover the entire hard disk, only a specified directory.
Note: for Windows and Linux. Mac is a plus ;)

On Linux, pyinotify will probably do what you want. But note the caveats mentioned in the inotify(7) manpage, in particular:
Note that the event queue can overflow. In this case, events are lost. Robust applications should handle the possibility of lost events gracefully.
If monitoring an entire directory subtree, and a new subdirectory is created in that tree, be aware that by the time you create a watch for the new subdirectory, new files may already have been created in the subdirectory. Therefore, you may want to scan the contents of the subdirectory immediately after adding the watch.

I'm not sure if Python has any cross-platform solution for this, but if you are only interested in Windows-based solutions, you should look into directory change notifications. To call the Win32 API functions, you can look into pywin32.
On Linux, there seem to be a bunch of solutions, including fschange, dnotify and inotify. I'm not sure which one is recommended, but inotify seems to be the most complete.
Not all platforms have such a feature. If it's not available for a given platform, you'll have to emulate such notifications by checking directory contents periodically.
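For the "checking directory contents periodically" fallback, here is a minimal sketch using only the standard library. The function names (snapshot, diff_snapshots, watch) are made up for illustration, and it compares modification time and size only, so a change that leaves both untouched would be missed.

```python
import os
import time


def snapshot(root):
    """Map every file path under root to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return (created, modified, deleted) as sorted lists of paths."""
    created = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return created, modified, deleted


def watch(root, on_change, interval=1.0):
    """Poll root forever, calling on_change(created, modified, deleted)."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        changes = diff_snapshots(previous, current)
        if any(changes):
            on_change(*changes)
        previous = current
```

Something like watch("/some/dir", print) would then print the three lists whenever anything changes. Polling costs a full directory walk per interval, so it is best kept to modestly sized trees.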

What you need is rsync. There are several implementations of rsync in Python. Check these out:
http://pypi.python.org/pypi/rsync.py/2.0
http://code.activestate.com/recipes/577518-rsync-algorithm/
Looking for cross-platform rsync-like functionality in python, such as rsync.py
Controlling rsync with Python?

Related

Cross-platform cooperative file locking through link/folder creation?

I would like to implement a cooperative file locking mechanism in Python that would also work on remote partitions (e.g. NFS), using simple code (I want to avoid using a third-party module, because I want some specific open-source code to not have dependencies).
There are solutions out there that look relatively cross-platform, but they are more complicated than I would like: ideally the exact same lines of code would run on all platforms.
A solution is to use some atomic operation that tries to create a lock and fails if it cannot (e.g. a lock in the form of a directory). Creating a directory is atomic on Unix, so that's a good first step. Now, what would be an equivalent solution for Windows? I have read somewhere that maybe creating a link (how?) would be atomic; if creating a directory is atomic on Windows, that would be even better, as the same code could be used for both Windows and Unix, but I can't find out whether this is the case.
To summarize: what would be simple, cross-platform Python code (no library) for creating (and releasing) a cooperative file lock that also works on remote partitions? The directory creation route looks promising, but does it work on Windows?
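To make the directory-creation idea concrete, here is a minimal sketch. DirLock is a hypothetical name, not an existing module. It assumes os.mkdir either creates the directory or raises FileExistsError, which is exactly the atomicity the question asks about on Windows and NFS, so this illustrates the approach rather than answering that question. It also does nothing about stale locks left behind by a crashed process.

```python
import os
import time


class DirLock:
    """Cooperative lock that is held while a lock directory exists."""

    def __init__(self, path, timeout=10.0, poll=0.1):
        self.path = path
        self.timeout = timeout
        self.poll = poll

    def acquire(self):
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                os.mkdir(self.path)  # assumed atomic: succeeds for one caller only
                return
            except FileExistsError:
                if time.monotonic() >= deadline:
                    raise TimeoutError("could not acquire %r" % self.path)
                time.sleep(self.poll)

    def release(self):
        os.rmdir(self.path)

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False
```

Usage would be along the lines of `with DirLock("/shared/data.lock"): ...`, with every cooperating process agreeing on the same lock path.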

As a beginner in Python, how should I work with installation directories?

I'm a self-taught, amateur, purely recreational programmer. I don't understand all the fancy programming lingo, and I certainly don't have any good resources, apart from this website, where I can go for help. (i.e., Please dumb it down for me!) I would imagine my question here is somewhat common, but I honestly couldn't find any answers on Google or this website, probably because I don't know the proper terminology to search for.
~~~
Having said that, I feel I have a pretty solid grasp on the basics of Python. And now, I've created an application that I'd like to share with a friend. My application accesses JPEG image files on my computer using a directory path that I've written into the code itself. However, I'd like my friend to be able to store these image files anywhere on their computer, not necessarily in the file folder that I've been using.
I assume the best way to accomplish this is to allow my friend to choose the directory path for themselves and then to write their chosen directory path to a file at a predetermined location on their computer. My application would then have that file's location prewritten into its code. This way, it would be trivially easy to open the file at the predetermined location, and then that file would point my application to my friend's chosen directory path.
1.) Are any of my intuitions here misguided? Are there better ways of doing this?
2.) If you think my general approach is a reasonable one, then is there a good/common place on the computer where applications typically store their directory paths upon installation?
Any advice - or any recommended resources - would be very much appreciated! Thanks!

Well, the standard way to do this is a lot more complicated and platform-specific:
On traditional Unix, this is pretty simple; you create a text file in some simple format (e.g., that used by ConfigParser), named, say, ~/.myprogram.cfg, and you write a line to it that looks like image_path=/path/to/images.
On most modern Linux systems, or any other FreeDesktop/XDG-based system, you should (at least for GUI apps) instead use a special directory looked up in the environment as XDG_CONFIG_HOME, falling back to ~/.config, instead of using ~.
On Windows, the standard place to store stuff like this is the Windows Registry (e.g., by using winreg), by creating a key for your program and storing a value with name image_path and value /path/to/images there.
On Mac, the standard place to store stuff like this is in the NSUserDefaults database (e.g., by using PyObjC, which isn't part of the stdlib but does come built-in with Apple's pre-installed Python) by opening the default domain for your program and adding a value with key image_path and value… well, you probably want a Cocoa bookmark (maybe even a security-scoped one), not a path.
That probably all sounds way, way too complicated.
One option is to use a library that wraps this all up for you. If you're already using a heavy-duty framework like, say, Qt, it probably has functionality built-in to do that. Otherwise, it may take a lot of searching to find something.
A simpler alternative is to just pretend everything is like traditional Unix. That will work on Windows and Mac. It will be slightly annoying on some Windows versions that your config file will be visible in their home directory, but not a huge deal. It means you won't get some of the bonus features that Mac provides, like being able to magically follow the directory if the user moves it somewhere else on his hard drive, or remembering the settings if he reinstalls OS X and migrates his old settings, but again, usually that's fine.
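A minimal sketch of the "pretend everything is traditional Unix" route, using configparser from the standard library. The file name follows the ~/.myprogram.cfg example above; the section name "paths" and the two function names are made up for illustration (ConfigParser needs every option to live inside a section).

```python
import configparser
import os

CONFIG_PATH = os.path.join(os.path.expanduser("~"), ".myprogram.cfg")


def save_image_path(image_path, config_path=CONFIG_PATH):
    """Store the chosen image directory in the config file."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(config_path)  # keep any settings already stored
    if not config.has_section("paths"):
        config.add_section("paths")
    config.set("paths", "image_path", image_path)
    with open(config_path, "w") as f:
        config.write(f)


def load_image_path(config_path=CONFIG_PATH):
    """Return the stored image directory, or None if nothing is stored yet."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(config_path)  # a missing file is silently ignored
    return config.get("paths", "image_path", fallback=None)
```

On first run load_image_path() returns None, which is the cue to ask the user for a directory and call save_image_path() with it.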
In between the extremes, you can pretend everything is like Linux, using a special, and unobtrusive, location for the files on Windows and Mac just as you do there. Both platforms have APIs to look up special directories, called "application data" on Windows and "application support" on Mac. Using PyWin32 or PyObjC, respectively, these are pretty easy to look up. (For example, see this answer.) Then you just create a subdirectory there named My App on Windows, or com.mydomain.myapp on Mac, and store the file there.
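If you'd rather not pull in PyWin32 or PyObjC just for this, a rough stdlib-only approximation is to read the conventional environment variables and paths. Note this swaps the platform APIs mentioned above for a guess: it assumes APPDATA is set on Windows and that ~/Library/Application Support is the right place on Mac. user_config_dir is a made-up name.

```python
import os
import sys


def user_config_dir(app_name, platform=sys.platform, env=os.environ):
    """Guess a per-user directory where app_name can keep its settings."""
    home = os.path.expanduser("~")
    if platform.startswith("win"):
        # Assumption: APPDATA points at the roaming "application data" folder.
        base = env.get("APPDATA") or os.path.join(home, "AppData", "Roaming")
    elif platform == "darwin":
        # Assumption: the conventional "application support" location.
        base = os.path.join(home, "Library", "Application Support")
    else:
        # XDG convention: XDG_CONFIG_HOME, falling back to ~/.config.
        base = env.get("XDG_CONFIG_HOME") or os.path.join(home, ".config")
    return os.path.join(base, app_name)
```

The function only computes the path; you would still call os.makedirs(path, exist_ok=True) before writing the config file there.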

What are common strategies for updating python programs?

I have a Windows program that I made with python and py2exe. I'd like to create an updating feature so that the software can be readily updated.
What are common ways of going about this?

If you think your code might benefit others, you could put it up on PyPI. Then having different versions is just updating your package, or telling your clients to use easy_install to get the latest version. This doesn't push updates, though.
You can try Esky, which is an auto-update framework for managing different versions, including fetching new versions and rolling back partial updates. It can be found on PyPI.
That said, I haven't used Esky. If you wish to roll your own auto-update feature, you might want to look at Boxed Dice to see how they got around to it.

When you package an app with py2exe, the result is usually a single executable (perhaps with some data files). The simplest way to update this is to prompt the user to download and install a new version every once in a while (how you check with a server that such a new version exists is a different question).
If you want to reduce the size of the download, applications commonly resort to breaking themselves up into multiple DLLs and updating only the relevant DLLs. With a Python application you don't have DLLs, but you have an even easier option: you can keep most of your app's logic outside the exe in .pyc files, and update just some of these .pyc files.
Now, mind you, .pyc files are easily "decompilable" into Python (a somewhat obfuscated version of your original code), but having an exe made with py2exe isn't much safer, because py2exe is open-source software and packs all the same files inside the exe anyway.
To conclude, my suggestion is don't bother. How large can your application be? With today's fast connections, it's easier to just make the user download a whole new version than to invest a lot of time into building partial-update functionality into your program.

Would it be a good idea to make python store compiled code in a file stream instead of pyc files?

I'm wondering if it wouldn't be better if Python stored the compiled code in a file stream of the original source file. This would work on file systems supporting forks/data streams, and fall back to .pyc files where this is not possible.
On Windows using ADS (Alternative Data Streams)
On OS X using resource forks
On Linux using extended file attributes if compiled file is under 32k
Doing this would solve the problem of polluting the source tree, and would avoid problems such as a .pyc remaining after its .py was removed and still being loaded and used.
What do you think about this: does it sound like a good idea or not? What issues do you see?

You sure do sacrifice an awful lot of portability this way -- right now .pyc files are uncommonly portable (often used by heterogeneous systems on a LAN through some kind of network file system arrangement, for example, though I've never been a fan of the performance characteristics of that approach), while your approach would only work on very specific filesystems and (I suspect) never across a network mount on heterogeneous machines.
So, it would be a dire mistake to make the behavior you want the default one -- but it would surely be neat to have it as an option available for specific request if your deployment environment doesn't care about all of the above issues and does care about some of those you mention. Another "cool option to have", that I would actually use about 100 times more often, is to put the .pyc "files" in a database instead of having them in filesystems.
The cool thing is that this is (relatively) easily accomplished as an add-on "import hack" one way or another (depending on Python versions) -- most easily in recent-enough versions with importlib, Brett Cannon's masterpiece (but that might make backporting to older Python versions harder than other ways... too much depends on exactly what versions you need to support, a detail which I don't see in your Q, so I won't go into the implementation details, but the general idea doesn't change much across implementations).
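To give a feel for the "bytecode in a database" option, here is a rough sketch of just the storage half, using sqlite3 and marshal. It is deliberately not an import hook: wiring it into importlib is the version-dependent part, and that is left out. open_cache and load_code are made-up names, and the table layout is an assumption. Marshalled code is specific to the Python version that wrote it, so a real cache would also need to key on that.

```python
import marshal
import os
import sqlite3


def open_cache(db_path):
    """Open (or create) the database that holds compiled code."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS bytecode ("
        "path TEXT PRIMARY KEY, mtime_ns INTEGER, code BLOB)"
    )
    return db


def load_code(db, source_path):
    """Return a code object for source_path, compiling only when stale."""
    mtime_ns = os.stat(source_path).st_mtime_ns
    row = db.execute(
        "SELECT mtime_ns, code FROM bytecode WHERE path = ?", (source_path,)
    ).fetchone()
    if row is not None and row[0] == mtime_ns:
        return marshal.loads(row[1])  # cache hit: source unchanged
    with open(source_path, "rb") as f:
        code = compile(f.read(), source_path, "exec")
    db.execute(
        "INSERT OR REPLACE INTO bytecode VALUES (?, ?, ?)",
        (source_path, mtime_ns, marshal.dumps(code)),
    )
    db.commit()
    return code
```

The returned code object can be run with exec(code, namespace), which is roughly what a loader would do when populating a module.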

One problem I foresee is that it then means that each platform has different behaviour.
The next is that not every filesystem OS X supports also supports resource forks (and the way it stores them in non-hfs filesystems is universally hated by everyone else: ._ )
Having said that, I have often been bitten by a .pyc file being used by apache because the apache process can't read the .py file I have replaced. But I think that this is not the solution: a better deployment process is ;)

Self updating py2exe/py2app application

I maintain a cross-platform application, based on PyQt, that runs on Linux, Mac and Windows.
The windows and mac versions are distributed using py2exe and py2app, which produces quite large bundles (~40 MB).
I would like to add an "auto update" functionality, based on patches to limit downloads size:
check for new versions on an http server
download the patches needed to update to the last version
apply the patches list and restart the application
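The "which patches do I need" part of those steps could look something like the sketch below. The manifest format, a list of (from_version, to_version, patch_name) tuples, is purely hypothetical, as are the function names. Fetching the manifest over HTTP and applying the patches are left out.

```python
def parse_version(text):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def patches_needed(current, available):
    """Follow the patch chain from current to the newest reachable version.

    available is an iterable of (from_version, to_version, patch_name).
    Returns (final_version, [patch_name, ...]) in the order to apply them.
    """
    by_source = {src: (dst, name) for src, dst, name in available}
    chain = []
    version = current
    while version in by_source:
        dst, name = by_source[version]
        if parse_version(dst) <= parse_version(version):
            raise ValueError("patch %r does not move forward" % name)
        chain.append(name)
        version = dst
    return version, chain
```

If the returned list is empty, the installed version is already the newest one the server knows about.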
I have some questions:
what is the preferred way to update a Windows application, since open files are locked and can't be overwritten?
how do I prepare and apply the patches? perhaps using bsdiff/bspatch?
[update]
I made a simple class to make patches with bsdiff, which is very efficient, as advertised on their site: a diff on two py2exe versions of my app (~75 MB uncompressed) produces a 44 kB patch! Small enough for me, I will stick to this format.
The code is available in the 'update' package of pyflu, a small library of Python code.

I don't believe py2exe supports patched updates. However, if you do not bundle the entire package into a single EXE (py2exe website example - bottom of page), you can get away with smaller updates by just replacing certain files, like the EXE file, for example. This can reduce the size of your updates significantly.
You can write a separate updater app, which can be downloaded and run from inside your application. This app may be different for every update, as the files that need to be updated may change.
Once the application launches the updater, it will need to close itself so the files can be overwritten. Once the updater is complete, you can have it reopen the application before closing itself.

I don't know about patches, but on OS X the "standard" for this with cocoa apps is Sparkle. Basically it does "appcasting". It downloads the full app each time. It might be worth looking at it for inspiration.
I imagine on OS X you can probably just download the actual part of your app bundle that contains your specific code (not the libs etc that get packaged in) and that'd be fairly small and easy to replace in the bundle.
On Windows, you'd probably be able to do a similar trick by not bundling your app into one exe - thus letting you change the one file that has actually changed.
I'd imagine your actual Python code would be much less than 40 MB, so that's probably the way to go.
As for replacing the running app, first you need to find its location, so you could use sys.executable as a starting point. Then you could probably fork a child process, kill the parent process, and have the child do the actual replacement?
I'm currently playing around with a small wxPython app and wondering about exactly this problem. I'd love to hear about what you come up with.
Also, how big is your app when compressed? If it compresses well then maybe you can still afford to send the whole thing.

This is 4 years old now, but what about Esky?

Since py2exe puts all of the compiled modules of your app into a ZIP file, you could try to update this file by creating a script that updates it from a given set of files.
Then replace the remaining files that have changed (which should be few, if any).
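A sketch of what such a script could look like, using the standard zipfile module. Since zipfile can't replace an entry in place, this rewrites the archive to a temporary file and swaps it in. update_zip is a made-up name and the archive layout is an assumption, not something py2exe prescribes. On Windows the final swap will fail while the running app still has the ZIP open, so it has to run from a separate process after the app exits.

```python
import os
import tempfile
import zipfile


def update_zip(zip_path, replacements):
    """Rewrite zip_path, replacing or adding the given entries.

    replacements maps archive names to their new contents (bytes).
    """
    folder = os.path.dirname(os.path.abspath(zip_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".zip", dir=folder)
    os.close(fd)
    try:
        with zipfile.ZipFile(zip_path) as src, \
                zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
            # Copy every entry that is not being replaced.
            for item in src.infolist():
                if item.filename not in replacements:
                    dst.writestr(item, src.read(item.filename))
            # Write the new or updated entries.
            for name, data in replacements.items():
                dst.writestr(name, data)
        os.replace(tmp_path, zip_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Writing to a temporary file in the same folder means a failed update leaves the original archive untouched.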

An old post, but I thought I'd mention pyupdater with pyinstaller.
It supports Amazon and scp.
In the future, according to recent github posts, they plan to support free alternatives.
