How to check if files have been modified? - python

Okay, so I'm looking for an easy way to check if the contents of files in a folder have changed. And if one did change, it updates the version of that file.
I'm guessing this is what is called logging? I'm am completely new to this, so is a bit hard to explain the concept of what I'm looking for. I'll give an example:
Let's I have a reference folder that contains my original data.
Then for every time I run my code it inspects the contents of the files in said reference folder.
If the contents are the same, then it the code continues to run normally.
But if the contents of the files have changed, it updates the version of that file (for example: from '1.0.0' to '1.0.1') and keeps a copy of the changes.
Is there a way to do this in python or a module that helps me accomplish this? Or where can I start looking into this?

Related

Python module to keep a file location variable updated?

I have a variable in my python script that holds the path to a file as a string. Is there a way, through a python module, etc, to keep the file path updated if the file were to be moved to another destination?
Short answer: no, not with a string.
Long answer: If you want to use only a string to record the location of this file, you're probably out of luck unless you use other tools to find the location of the file, or record whenever it moves - I don't know what those tools are.
You don't give a lot of info in your question about what you want this variable for; as #DeepSpace says in his comment, if you're trying to make this String follow the file between different runs of this program, then you'd be better off making an argument for the script. If, however, you expect the file to move sometime during the execution of your program, you might be able to use a file object to keep track of it. Or, rather - instead of keeping the filepath in memory, keep a file descriptor in memory instead (the kind you would generate by using the open() function, and just never close that file until the program terminates. You can use seek to return to the start of the file if you needed to read it multiple times. Problems with this include that it's not memory-safe, and it's absolutely not a best practice.
TL;DR
Your best bet is probably to go with a solution like #DeepSpace mentioned where you could go and call your script with a parameter in command-line which will then force the user to input a valid path.
This is actually a really good question, but unfortunately purely Pythonly speaking, this is impossible.
No python module will be able to dynamically linked a variable to a path on the file-system. You will need an external script or routine which will update any kind of data structure that will hold the path value.
Even then, the name of the file could change, but not it's location. here is what I mean by that.
Let's say you were to wrapped that file in a folder only containing that specific file. Since you now know that it's location is fixed (theoretically speaking), you can have another python script/routine that will read the filename and store it in a textfile. Your other script could go and get that file name (considering your routine would sync that file on a regular basis). But, as soon as the location of the file changes, how can you possibly know where it is now. It has to be manually hard coded somewhere to have something close to the behavior your expecting.
Note that my example is not in any way a solution to go-to for your problem. I'm actually trying to underline the shortcomings of such a feature.

Getting the time where a file is copied to a folder (Python)

I'm trying to write a Python script that runs on Windows. Files are copied to a folder every few seconds, and I'm polling that folder every 30 seconds for names of the new files that were copied to the folder after the last poll.
What I have tried is to use one of the os.path.getXtime(folder_path) functions and compare that with the timestamp of my previous poll. If the getXtime value is larger than the timestamp, then I work on those files.
I have tried to use the function os.path.getctime(folder_path), but that didn't work because the files were created before I wrote the script. I tried os.path.getmtime(folder_path) too but the modified times are usually smaller than the poll timestamp.
Finally, I tried os.path.getatime(folder_path), which works for the first time the files were copied over. The problem is I also read the files once they were in the folder, so the access time keeps getting updated and I end up reading the same files over and over again.
I'm not sure what is a better way or function to do this.
You've got a bit of an XY problem here. You want to know when files in a folder change, you tried a homerolled solution, it didn't work, and now you want to fix your homerolled solution.
Can I suggest that instead of terrible hackery, you use an existing package designed for monitoring for file changes? One that is not a polling loop, but actually gets notified of changes as they happen? While inotify is Linux-only, there are other options for Windows.

Efficient way to track inter-runtime changes to a directory?

I know that I can use a library such as Watchdog in order to track changes to a directory during the runtime of a Python program. However, what if I want to track changes to the same directory between invocations of the same program? For example, if I have the following dir when I run the program the first time:
example/
file1
file2
I then quit, delete one file, and add another one:
example/
file2
file3
Now, when I start the program the second time, I would like to efficiently get a summary of the changes done ("deleted file1, added file3") to the directory since I last ran the program.
I know that I could brute force a solution by (for example) saving a list of all files when the program quits, create a new list when it starts, and then compare the two. However, is there a more efficient way to do this - preferably one that makes use of the underlying OS/filesystem AND is deployable cross-platform?

PYTHON - Search a directory

Is it possible for the user to input a specific directory on their computer, and for Python to write the all of the file/folder names to a text document? And if so, how?
The reason I ask is because, if you haven't seen my previous posts, I'm learning Python, and I want to write a little program that tracks how many files are added to a specific folder on a daily basis. This program isn't like a project of mine or anything, but I'm writing it to help me further learn Python. The more I write, the more I seem to learn.
I don't need any other help than the original question so far, but help will be appreciated!
Thank you!
EDIT: If there's a way to do this with "pickle", that'd be great! Thanks again!
os.walk() is what you'll be looking for. Keep in mind that this doesn't change the current directory your script is executing from; it merely iterates over all possible paths that stem from the arguments to it.
for dirpath, dirnames, files in os.walk(os.path.abspath(os.curdir)):
print files
# insert long (or short) list of files here
You probably want to be using os.walk, this will help you in creating a listing of the files down from your directory.
Once you have the listing, you can indeed store it in a file using pickle, but that's another problem (i.e.: Pickle is not about generating the listing but about storing it).
While the approach of #Makato is certainly right, for your 'diff' like application you want to capture the inode 'stat()' information of the files in your directory and pickle that python object from day-to-day looking for updates; this is one way to do it - overkill maybe - but more suitable than save/parse-load from text-files IMO.

viewing files in python?

I am creating a sort of "Command line" in Python. I already added a few functions, such as changing login/password, executing, etc., But is it possible to browse files in the directory that the main file is in with a command/module, or will I have to make the module myself and use the import command? Same thing with changing directories to view, too.
Browsing files is as easy as using the standard os module. If you want to do something with those files, that's entirely different.
import os
all_files = os.listdir('.') # gets all files in current directory
To change directories you can issue os.chdir('path/to/change/to'). In fact there are plenty of useful functions found in the os module that facilitate the things you're asking about. Making them pretty and user-friendly, however, is up to you!
I'd like to see someone write a a semantic file-browser, i.e. one that auto-generates tags for files according to their input and then allows views and searching accordingly.
Think about it... take an MP3, lookup the lyrics, run it through Zemanta, bam! a PDF file, a OpenOffice file, etc., that'd be pretty kick-butt! probably fairly intensive too, but it'd be pretty dang cool!
Cheers,
-C

Categories

Resources