This is specifically geared towards managing MP3 files, but it should easily work for any directory structure with a lot of files.
I want to find or write a daemon (preferably in Python) that will watch a folder with many subfolders that should all contain X number of MP3 files. Any time a file is added, updated or deleted, the change should be reflected in a database (preferably PostgreSQL). If a file is simply moved, I'm willing to accept that its rows are deleted and recreated, but updating the existing rows would make me happiest.
The Stack Overflow question Managing a large collection of music has a little of what I want.
I basically just want a database that I can then do whatever I want with. My most up-to-date database right now is my iTunes.xml file, but I don't want to depend on that too much, as I don't always want to use iTunes for my music management. I see plenty of projects out there that do a little of what I want, but either in a format I can't access or more complex than I need. If there is some media player out there that can watch a folder and update an easily accessible database, then I am all for it.
The reason I'm leaning towards writing my own is because it would be nice to choose my database and schema myself.
Another answer already suggested pyinotify for Linux; let me add watch_directory for Windows (a good discussion of the possibilities on Windows is here, and that module is one example) and fsevents on the Mac. Unfortunately, I don't think there's a single cross-platform module offering a uniform interface to these various system-specific ways of getting directory-change notifications.
Once you manage to get such events, updating an appropriate SQL database is simple!-)
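For the database side, a minimal sketch of what that "update on event" step could look like with psycopg2; the tracks table, its columns and the connection string are assumptions of mine, and the ON CONFLICT upsert needs PostgreSQL 9.5 or newer:

import os
import psycopg2

# Assumed schema: CREATE TABLE tracks (path TEXT PRIMARY KEY, mtime DOUBLE PRECISION);
conn = psycopg2.connect("dbname=music user=me")  # hypothetical connection string

def file_added_or_updated(path):
    # Insert a row for a new file, or refresh its mtime if it already exists.
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO tracks (path, mtime) VALUES (%s, %s) "
            "ON CONFLICT (path) DO UPDATE SET mtime = EXCLUDED.mtime",
            (path, os.path.getmtime(path)))

def file_deleted(path):
    with conn, conn.cursor() as cur:
        cur.execute("DELETE FROM tracks WHERE path = %s", (path,))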
If you use Linux, you can use PyInotify.
inotify can notify you about filesystem events when your program is running.
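A minimal pyinotify sketch, assuming a Linux box and a hypothetical /path/to/music folder; the print calls stand in for whatever database updates you want to make:

import pyinotify

class Handler(pyinotify.ProcessEvent):
    def process_IN_CREATE(self, event):
        print("created:", event.pathname)      # insert a row here

    def process_IN_DELETE(self, event):
        print("deleted:", event.pathname)      # delete the row here

    def process_IN_CLOSE_WRITE(self, event):
        print("updated:", event.pathname)      # update the row here

wm = pyinotify.WatchManager()
mask = pyinotify.IN_CREATE | pyinotify.IN_DELETE | pyinotify.IN_CLOSE_WRITE
wm.add_watch('/path/to/music', mask, rec=True, auto_add=True)
pyinotify.Notifier(wm, Handler()).loop()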
IMO, the best media player that has these features is Winamp. It rescans the music folders every X minutes, which is enough for music (but of course a little less efficient than letting the operating system watch for changes).
But as you were asking for suggestions on writing your own, you could make use of pyinotify (Linux only). If you're running Windows, you can use the ReadDirectoryChangesW API call.
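A condensed sketch of that Windows approach with pywin32, loosely following Tim Golden's well-known example (the watched path is a placeholder):

import os
import win32con
import win32file

FILE_LIST_DIRECTORY = 0x0001
path_to_watch = r"C:\Music"  # placeholder

handle = win32file.CreateFile(
    path_to_watch, FILE_LIST_DIRECTORY,
    win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE | win32con.FILE_SHARE_DELETE,
    None, win32con.OPEN_EXISTING, win32con.FILE_FLAG_BACKUP_SEMANTICS, None)

while True:
    # Blocks until something changes anywhere under path_to_watch.
    changes = win32file.ReadDirectoryChangesW(
        handle, 8192, True,
        win32con.FILE_NOTIFY_CHANGE_FILE_NAME | win32con.FILE_NOTIFY_CHANGE_LAST_WRITE,
        None, None)
    for action, filename in changes:
        print(action, os.path.join(path_to_watch, filename))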
I have a requirement to test two applications (via automation using Python).
The requirement is, for example: we have a system called “www.abc.com”
where we develop and merge code every 2 weeks, and then we have another system called “www.xyz.com” (basically a backup of the first system); every time we do a release and add or edit something in the main system, we update our backup system as well.
Now the question is: I need to test both systems after every release (every 2 weeks) to see whether they are in sync (identical).
How do I write a Python automation test script (multiple tests) to check whether, for example, the databases, servers, UI, front end, and code base are the same in both systems? Is that possible? If so, any help and advice would be appreciated, so that I can implement a solution.
There are several ways you could approach this:
Assuming you are using some sort of source control, you could write a script to make sure that the repo is up to date and then report back the results. See here and here. This probably won't cover the data in your databases, but there are numerous ways to compare database backups, and it will depend on what programs you are using.
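For instance, if both systems are deployed from Git, a tiny sketch of that check could compare the checked-out commits (the repository paths are hypothetical):

import subprocess

def head_commit(repo_path):
    # Hash of the commit currently checked out at repo_path.
    return subprocess.check_output(
        ["git", "-C", repo_path, "rev-parse", "HEAD"], text=True).strip()

def test_codebases_in_sync():
    assert head_commit("/srv/abc") == head_commit("/srv/xyz")  # hypothetical paths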
Another or additional way you might check is to write a script that gathers a list of hashes or checksums of all the files you care about in both systems and then compares the lists for differences.
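A sketch of that hashing idea, assuming the test machine can see both file trees (the two root paths are placeholders):

import hashlib
import os

def checksums(root):
    # Map each file's path relative to root to its SHA-256 digest.
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                sums[os.path.relpath(full, root)] = hashlib.sha256(f.read()).hexdigest()
    return sums

abc, xyz = checksums("/srv/abc/files"), checksums("/srv/xyz/files")  # placeholders
print("only in abc:", abc.keys() - xyz.keys())
print("only in xyz:", xyz.keys() - abc.keys())
print("different:  ", {k for k in abc.keys() & xyz.keys() if abc[k] != xyz[k]})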
I saw this post on Medium, and wondered how one might go about managing multiple python scripts.
How I Hacked Amazon's Wifi Button
This describes a system where you need to run one or more scripts continuously to catch and react to events in your network.
My question: let's say I had multiple Python scripts that I wanted to run while I work on other things. What approaches are available to manage these scripts? I have to imagine there is a better way than having a large number of terminal windows running each script individually.
I am coming back to python, and have no formal training in computer programming, so any guidance you can provide will be greatly appreciated.
Let's say I had multiple python scripts that I wanted to run. What
approaches are available to manage these scripts? I have to imagine
there is a better way than having a large number of terminal windows
running each script individually.
If you have several .py files in a directory that you want to run, in no particular order, you can do:
import glob
pyFiles = glob.glob('path/*.py')
for pyFile in pyFiles:
    execfile(pyFile)  # Python 2 built-in; on Python 3 use exec(open(pyFile).read())
Your system already runs a large number of background processes, with output to the system log or occasionally to a service-specific log file.
A common arrangement for quick and dirty deployments -- where you don't necessarily want to invest in making the scripts robust and well-behaved enough to run as proper services -- is to start the script inside screen or tmux. You can detach when you don't need to be looking at it, and can reattach at any time -- even from a remote login -- to view the output, or to troubleshoot.
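If you'd rather stay in Python than juggle terminal windows, a tiny launcher sketch with subprocess can start each script as its own background process (the script names are placeholders):

import subprocess
import sys

scripts = ["watch_button.py", "log_events.py"]  # placeholder script names

procs = [subprocess.Popen([sys.executable, s]) for s in scripts]
for p in procs:
    p.wait()  # or poll() in a loop and restart anything that crashed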
Take a look at luigi (I've not used it).
https://github.com/spotify/luigi
These days (five years after the question was asked) a lot of people use docker compose. But that's a little heavyweight, depending on what you want to do.
I just saw bugy's script server today. Maybe it could be a solution for you or somebody else.
(I am just trying to find a Tampermonkey-like script structure for Python.)
I have to set up a program which reads in some parameters from a widget/GUI, calculates some stuff based on database values and the input, and finally sends some ASCII files via FTP to remote servers.
In general, I would suggest a Python program for these tasks: write a Qt widget as a GUI (interactively changing views, putting numbers into tables, setting up check boxes, switching between various layers - I've never done something this complex in Python, but I have some experience in IDL with event handling etc.), and set up data classes that have functions both to create the ASCII files with the given convention and to send them via FTP to some remote server.
However, since my company is a bunch of Windows users, each sitting at their personal desktop, installing python and all necessary libraries on each individual machine would be a pain in the ass.
In addition, in a future version the program is supposed to become smart and do some optimization 24/7. Therefore, it makes sense to put it on a server. As I personally prefer Linux, the server is already set up with Ubuntu Server.
The idea is now to run my application on the server. But how can the users access and control the program?
The easiest way for everybody to access something like a common control panel would be a browser, I guess. I have to make sure only one person at a time is sending signals to the same units, but that should be doable via flags in the database.
After some googling, next to QtWebKit, Django seems to be the first choice for such a task. But...
Can I run a full-fledged Python program underneath my web application? Is Django the right tool to do so?
As mentioned previously, in the intermediate future (~1 year) we might have to implement some computationally expensive tasks. Would it then also be possible to use C, as one can from normal Python?
Another question I have is about development. In order to become productive, we have to advance in small steps. Can I first create regular Python classes, which can later be imported into my web application? (The same question applies to the widgets/Qt part.)
Finally: Is there a better way to go? Any standards, any references?
Django is a good candidate for the website, however:
It is not a good idea to run heavy functionality from within the website itself; it should happen in a separate process.
All functions should be asynchronous, i.e. you should never wait for something to complete.
I would personally recommend writing a separate process with a message queue; the website would only ask that process for statuses and always display a result to the user immediately.
You can use ajax so that the browser will always have the latest result.
ZeroMQ or Celery are useful for implementing the functionality.
You can implement functionality in C pretty easily. I recommend, however, that you write that functionality as pure C with a SWIG wrapper rather than as a Python extension module. That way the functionality will be portable and not dependent on the Python website.
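To make the separate-process idea concrete, here is a minimal Celery sketch; the broker/backend URLs, the task name and its arguments are assumptions of mine, and the Django view would only enqueue the task and poll its state via ajax:

# tasks.py -- runs in a Celery worker process, not inside the web server
from celery import Celery

app = Celery("tasks",
             broker="redis://localhost:6379/0",    # assumed broker
             backend="redis://localhost:6379/1")   # assumed result backend

@app.task
def run_optimization(params):
    # ... heavy calculation based on database values and the user's input ...
    return {"status": "done", "params": params}

# In a Django view you would only enqueue and poll, never block:
#   result = run_optimization.delay(params)                  # returns immediately
#   state = run_optimization.AsyncResult(result.id).state    # read later via ajax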
What I want to do:
I've got two directories. Each one contains about 90,000 XML and BAK files.
The XML files need to stay in sync between the two folders whenever a file changes (the newer one should be copied, of course).
The problem is:
Because of the huge number of files, and the fact that one of the directories is a network share, I can't just loop through the directory and compare os.path.getmtime(file) values.
Even watchdog and PyQt don't work (tried the solutions from here and here).
The question:
Is there any other way to get file-change events (on Windows) that works for this configuration, without looping through all those files?
So I finally found the solution:
I changed some of my network share settings and used the FileSystemWatcher.
To prevent files from being re-synced as a result of the sync itself, I use an MD5 file hash.
The code I use can be found on pastebin (it's quick and dirty, and covers just the parts mentioned in the question).
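For reference, the hash check itself can be as small as this sketch (mine, not the pastebin code; paths are placeholders):

import hashlib
import os
import shutil

def md5(path):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def copy_if_changed(src, dst):
    # Only copy when the contents really differ, so a sync can't re-trigger itself.
    if not os.path.exists(dst) or md5(src) != md5(dst):
        shutil.copy2(src, dst)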
The code in https://stackoverflow.com/a/12345282/976427 seems to work for me when passed a network share.
I'm risking giving an answer that's way off here (you didn't specify requirements regarding speed, etc.), but... Dropbox would do exactly that for you for free, and would require writing no code at all.
Of course it might not suit your needs if you require real-time syncing, or if you want to avoid "sharing" your files with a third party (although you can encrypt them first).
Can you use the second option on this page?
http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
Since you mention watchdog, I assume you are running under Linux. For the local machine inotify can help, but for the network share you are out of luck.
Mercurial's inotify extension http://hgbook.red-bean.com/read/adding-functionality-with-extensions.html has the same limitation.
In a similar situation (10K+ files) I have used a cloned Mercurial repository with inotify on both the server and the local machine. They automatically committed and notified each other of changes. It had a slight delay (no problem in my case), but as a benefit it kept a full history of changes and resynced easily after one of the systems had been down.
When an audio or MIDI clip is played (triggered), its name needs to be sent using OSC to another application.
LiveAPI is an interface which allows one to explore and automate Ableton Live using python scripts.
The code to do this must be written in a python script, which must be placed in a specific folder where Ableton Live can find it, selected in Live's Preferences.
More information about the LiveAPI can be found on these sites:
http://www.assembla.com/wiki/show/live-api
http://groups.google.com/group/liveapi
According to the LiveAPI documentation, the Clip object has a "name" attribute which holds the clip name. Presumably that's what you want to send in your OSC packets.
Also, it's worth mentioning that the Max/MSP support in Live8 will probably be a lot more comfortable to work with than LiveAPI, which is pretty much a dead project. Max/MSP supposedly has OSC support, which was added to support the JazzMutant Lemur, but I'm not sure how much of that made it into Live. Anyways, it's worth keeping in mind for when Live8 is released.
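On the sending side, an OSC message carrying the clip name is a one-liner with the python-osc package; in this sketch the host, port and address pattern are my own choices, and inside Live's embedded Python you would instead read the name from the Clip object and use whatever OSC helper your remote script bundles:

from pythonosc.udp_client import SimpleUDPClient  # pip install python-osc

client = SimpleUDPClient("127.0.0.1", 9000)   # host/port where vvvv listens (assumed)
clip_name = "Intro Loop"                      # in a Live script this comes from clip.name
client.send_message("/live/clip/playing", clip_name)  # arbitrary address pattern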
I know about Max 4 Live, but as I see it, it's kind of a different thing. Yes, it will probably be able to interface with Live to do all the stuff which people do now with LiveAPI. Some even think that M4L may not even go through LiveAPI, and use some internal interface instead (since Ableton and Cycling 74 are developing it together). From the promo videos on ableton.com site I think that M4L will mostly be about making and modifying sound, and not so much about controlling/reading other instruments, effects, clips etc.
I would not say that the LiveAPI project is dead, because a lot of hardware MIDI controllers rely on LiveAPI to do some auto-mapping magic. When you look at the MIDI Remote Scripts folder in Live, you'll see that each controller has its own folder with a Python script. So I definitely think that LiveAPI is going to stay, and that this door into Live will remain open. They even created a new folder called Framework which contains some newer code, probably required for the new Akai controller to work with Live (that's the theory people believe, at least).
The application in which I plan to use the playing clip's name is called vvvv, so I don't want to have to bring Max into this; it's not really needed.
I had some success with someone's modification of the original LiveAPI code, but it only worked when I requested all the clips' names, not when I asked for just a single one. I didn't have time to play with it later, and the thing for which I was preparing this has passed. I plan to work it out eventually, but it's not that urgent anymore.