List out of range error in python code - python

I am writing a simple python program which allows us to list all video files in a directory and play them according to the user input. However, I am getting an list out of range error while running this code.
Code:
import os
from subprocess import Popen
def processFile(currentDir):
# Get the absolute path of the currentDir parameter
currentDir = os.path.abspath(currentDir)
global list
list=[]
filesInCurDir = os.listdir(currentDir)
# Traverse through all files
for file in filesInCurDir:
curFile = os.path.join(currentDir, file)
# Check if it's a normal file or directory
if os.path.isfile(curFile):
# Get the file extension
curFileExtension = curFile[-3:]
# Check if the file has an extension of typical video files
if curFileExtension in ['avi', 'dat', 'mp4', 'mkv', 'vob']:
# We have got a video file! Increment the counter
processFile.counter += 1
list.append('curFile')
# Print it's name
print(processFile.counter, file)
else:
# We got a directory, enter into it for further processing
processFile(curFile)
if __name__ == '__main__':
# Get the current working directory
currentDir = os.getcwd()
print('Starting processing in %s' % currentDir)
# Set the number of processed files equal to zero
processFile.counter = 0
# Start Processing
processFile(currentDir)
# We are done. Exit now.
print('\n -- %s Movie File(s) found in directory %s --' \
% (processFile.counter, currentDir))
print('Enter the file you want to play')
x = int(input())
path = list[x-1]
oxmp=Popen(['omxplayer',path])

Aha, found your problem.
In processFile, you say
def processFile(currentDir):
# ...
global list
list=[]
# ...
processFile(...)
This means that, whenever you recurse, you are clearing the list again! This means that the processFile.counter number becomes out-of-sync with the actual length of list.
Three notes on this:
Storing variables on a function like processFile.counter is generally frowned upon, AFAIK.
There's no need for a separate counter; you can simply put len(list) to find the number of entries in your list.
To fix the list problem itself, consider initializing the list variable outside of the function or passing it in as a parameter to be modified.

Related

How to "refresh" a variable from file

I would like a variable to be stored in .py file and be imported into a main Python program.
Let me explain the problem in code. In my home folder I have the following files:
testcode.py
testmodule.py
testcode contains the following code:
import pprint
while __name__ == '__main__':
import testmodule
variableFromFile=testmodule.var
print("Variable from file is "+str(variableFromFile))
print("Enter variable:")
variable=input()
Plik=open('testowymodul.py','w')
Plik.write('var='+variable)
Plik.close()
and testmodule contains:
var=0
Now when I launched testcode.py, and input as variables 1,2,3,4,5 I got the following output:
Variable from file is 0
Enter variable:
1
Variable from file is 0
Enter variable:
2
Variable from file is 0
Enter variable:
3
Variable from file is 0
Enter variable:
4
Variable from file is 0
Enter variable:
5
Variable from file is 0
Enter variable:
But I would like to refresh this variable every time it is printed on screen, so I expect in this line:
print("Variable from file is "+str(variableFromFile))
to update the variable's value. Instead, I get in output only the first value of the variable, so the program print 0 every time. Only restarting the program will refresh the value of var.
Is there a way to import variables from file, change them at runtime and then then use their updated values later on in the script?
I believe your basic problem stems from the use of the variable in the testmodule.py file. As written you code imports this file and thus the pyhton interpreter assigns the value of testmodule.var to the contents thyat exist at load time. The code which attempts to update the variable isn't working the way you intended, since Plik.write('var='+variable) is creating a text string of the form "var = n". Thus subsequent attempts to import the testmodule and get the testmodule.var variable will result in a 0 value.
To fix this problem, as suggested by #JONSG, requires you abandon the import context and do something along the lines of the following:
#Contents of testmodule.txt file
0
#Contents of the testcode.py file
def readVar(fn):
with open(fn, 'r') as f:
return f.read().strip()
def writeVar(fn, val):
with open(fn,'w') as f:
f.write(val)
def runcode():
varfile = 'testmodule.txt' #Assumes testmodule.txt is in same folder as code
variableFromFile= readVar(varfile)
print("Variable from file is "+str(variableFromFile))
variable=input("Enter variable: ")
writeVar(varfile, variable)
def main():
runcode()
if __name__ == "__main__":
main()
Now every time you run the file, the latest variable data will be loaded and then updated with a new value.

Python3.6 |- Import a module from a list with __import__ -|

I'm creating a really basic program that simulates a terminal with Python3.6, its name is Prosser(The origin is that "Prosser" sounds like "Processer" and Prosser is a command processer).
A problem that I'm having is with command import, this is, all the commands are stored in a folder called "lib" in the root folder of Prosser and inside it can have folders and files, if a folder is in the "lib" dir it can't be named as folder anymore, its name now is package(But this doesn't care for now).
The interface of the program is just a input writed:
Prosser:/home/pedro/documents/projects/prosser/-!
and the user can type a command before the text, like a normal terminal:
Prosser:/home/pedro/documents/projects/prosser/-! console.write Hello World
let's say that inside the "lib" folder exists one folder called "console" and inside it has a file called "write.py" that has the code:
class exec:
def main(args):
print(args)
As you can see the first 2 lines is like a important structure for command execution: The class "exec" is the main class for the command execution and the def "main" is the main and first function that the terminal will read and execute also pass the arguments that the user defined, after that the command will be responsible to catch any error and do what it will be created to do.
At this moment, everything is ok, but now comes the true help that I need of U guys, the command import.
Like I writed the user can type any command, and in the example above I typed a "console.write Hello World" and exists one folder called "console" and one file "write.py". The point is that the packages can be defined by a "dot", this is:
-! write Hello World
Above I only typed "write" and this says that the file is only inside the "lib" folder, it doesn't has a package to storage and separate it, so it is a Freedom command(A command that doesn't has packages or nodes).
-! console.write Hello World
Now I typed above "console.write" and this says that the file has a package or node to storage and separate it, this means that it is a Tied command(A command that has packages or nodes).
With that, a file is separated from the package(s) with a dot, the more dots you put, more folders will be navigated to find the file and proceed to the next execution.
Code
Finnaly the code. With the import statement I tryied this:
import os
import form
curDir = os.path.dirname(os.path.abspath(__file__)) # Returns the current direrctory open
while __name__ == "__main__":
c = input('Prosser:%s-! ' % (os.getcwd())).split() # Think the user typed "task.kill"
path = c[0].split('.') # Returns a list like: ["task", "kill"]
try:
args = c[1:] # Try get the command arguments
format = form.formater(args) # Just a text formatation like create a new line with "$n"
except:
args = None # If no arguments the args will just turn the None var type
pass
if os.path.exists(curDir + "/lib/" + "/".join(path) + ".py"): # If curDir+/lib/task/kill.py exists
module = __import__("lib." + '.'.join(path)) # Returns "lib.task.kill"
module.exec.main(args) # Execute the "exec" class and the def "**main**"
else:
pathlast = path[-1] # Get the last item, in this case "kill"
path.remove(path[-1]) # Remove the last item, "kill"
print('Error: Unknow command: ' + '.'.join(path) + '. >>' + pathlast + '<<') # Returns an error message like: "Error: Unknow command: task. >>kill<<"
# Reset the input interface
The problem is that when the line "module = __import__("lib." + '.'.join(path))" is executed the console prints the error:
Traceback (most recent call last):
File "/home/pedro/documents/projects/prosser/main.py", line 18, in <module>
module.exec.main(path) # Execute the "exec" class and the def "**main**"
AttributeError: module 'lib' has no attribute 'exec'
I also tried to use:
module = \_\_import\_\_(curDir + "lib." + '.'.join(path))
But it gets the same error. I think it's lighter for now. I'd like if someone help me or find some replacement of the code. :)
I think you have error here:
You have diffirent path here:
if os.path.exists(curDir + "/lib/" + "/".join(path) + ".py")
And another here, you dont have curDir:
module = __import__("lib." + '.'.join(path)) # Returns "lib.task.kill"
You should use os.path.join to build paths like this:
module = __import__(os.path.join(curdir, 'lib', path + '.py'))

How to extract chains from a PDB file?

I would like to extract chains from pdb files. I have a file named pdb.txt which contains pdb IDs as shown below. The first four characters represent PDB IDs and last character is the chain IDs.
1B68A
1BZ4B
4FUTA
I would like to 1) read the file line by line
2) download the atomic coordinates of each chain from the corresponding PDB files.
3) save the output to a folder.
I used the following script to extract chains. But this code prints only A chains from pdb files.
for i in 1B68 1BZ4 4FUT
do
wget -c "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId="$i -O $i.pdb
grep ATOM $i.pdb | grep 'A' > $i\_A.pdb
done
The following BioPython code should suit your needs well.
It uses PDB.Select to only select the desired chains (in your case, one chain) and PDBIO() to create a structure containing just the chain.
import os
from Bio import PDB
class ChainSplitter:
def __init__(self, out_dir=None):
""" Create parsing and writing objects, specify output directory. """
self.parser = PDB.PDBParser()
self.writer = PDB.PDBIO()
if out_dir is None:
out_dir = os.path.join(os.getcwd(), "chain_PDBs")
self.out_dir = out_dir
def make_pdb(self, pdb_path, chain_letters, overwrite=False, struct=None):
""" Create a new PDB file containing only the specified chains.
Returns the path to the created file.
:param pdb_path: full path to the crystal structure
:param chain_letters: iterable of chain characters (case insensitive)
:param overwrite: write over the output file if it exists
"""
chain_letters = [chain.upper() for chain in chain_letters]
# Input/output files
(pdb_dir, pdb_fn) = os.path.split(pdb_path)
pdb_id = pdb_fn[3:7]
out_name = "pdb%s_%s.ent" % (pdb_id, "".join(chain_letters))
out_path = os.path.join(self.out_dir, out_name)
print "OUT PATH:",out_path
plural = "s" if (len(chain_letters) > 1) else "" # for printing
# Skip PDB generation if the file already exists
if (not overwrite) and (os.path.isfile(out_path)):
print("Chain%s %s of '%s' already extracted to '%s'." %
(plural, ", ".join(chain_letters), pdb_id, out_name))
return out_path
print("Extracting chain%s %s from %s..." % (plural,
", ".join(chain_letters), pdb_fn))
# Get structure, write new file with only given chains
if struct is None:
struct = self.parser.get_structure(pdb_id, pdb_path)
self.writer.set_structure(struct)
self.writer.save(out_path, select=SelectChains(chain_letters))
return out_path
class SelectChains(PDB.Select):
""" Only accept the specified chains when saving. """
def __init__(self, chain_letters):
self.chain_letters = chain_letters
def accept_chain(self, chain):
return (chain.get_id() in self.chain_letters)
if __name__ == "__main__":
""" Parses PDB id's desired chains, and creates new PDB structures. """
import sys
if not len(sys.argv) == 2:
print "Usage: $ python %s 'pdb.txt'" % __file__
sys.exit()
pdb_textfn = sys.argv[1]
pdbList = PDB.PDBList()
splitter = ChainSplitter("/home/steve/chain_pdbs") # Change me.
with open(pdb_textfn) as pdb_textfile:
for line in pdb_textfile:
pdb_id = line[:4].lower()
chain = line[4]
pdb_fn = pdbList.retrieve_pdb_file(pdb_id)
splitter.make_pdb(pdb_fn, chain)
One final note: don't write your own parser for PDB files. The format specification is ugly (really ugly), and the amount of faulty PDB files out there is staggering. Use a tool like BioPython that will handle parsing for you!
Furthermore, instead of using wget, you should use tools that interact with the PDB database for you. They take FTP connection limitations into account, the changing nature of the PDB database, and more. I should know - I updated Bio.PDBList to account for changes in the database. =)
It is probably a little late for asnwering this question, but I will give my opinion.
Biopython has some really handy features that would help you achieve such a think easily. You could use something like a custom selection class and then call it for each one of the chains you want to select inside a for loop with the original pdb file.
from Bio.PDB import Select, PDBIO
from Bio.PDB.PDBParser import PDBParser
class ChainSelect(Select):
def __init__(self, chain):
self.chain = chain
def accept_chain(self, chain):
if chain.get_id() == self.chain:
return 1
else:
return 0
chains = ['A','B','C']
p = PDBParser(PERMISSIVE=1)
structure = p.get_structure(pdb_file, pdb_file)
for chain in chains:
pdb_chain_file = 'pdb_file_chain_{}.pdb'.format(chain)
io_w_no_h = PDBIO()
io_w_no_h.set_structure(structure)
io_w_no_h.save('{}'.format(pdb_chain_file), ChainSelect(chain))
Lets say you have the following file pdb_structures
1B68A
1BZ4B
4FUTA
Then have your code in load_pdb.sh
while read name
do
chain=${name:4:1}
name=${name:0:4}
wget -c "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId="$name -O $name.pdb
awk -v chain=$chain '$0~/^ATOM/ && substr($0,20,1)==chain {print}' $name.pdb > $name\_$chain.pdb
# rm $name.pdb
done
uncomment the last line if you don't need the original pdb's.
execute
cat pdb_structures | ./load_pdb.sh

python - Monitor if file is being requested/read from external application [duplicate]

I have a log file being written by another process which I want to watch for changes. Each time a change occurs I'd like to read the new data in to do some processing on it.
What's the best way to do this? I was hoping there'd be some sort of hook from the PyWin32 library. I've found the win32file.FindNextChangeNotification function but have no idea how to ask it to watch a specific file.
If anyone's done anything like this I'd be really grateful to hear how...
[Edit] I should have mentioned that I was after a solution that doesn't require polling.
[Edit] Curses! It seems this doesn't work over a mapped network drive. I'm guessing windows doesn't 'hear' any updates to the file the way it does on a local disk.
Did you try using Watchdog?
Python API library and shell utilities to monitor file system events.
Directory monitoring made easy with
A cross-platform API.
A shell tool to run commands in response to directory changes.
Get started quickly with a simple example in Quickstart...
If polling is good enough for you, I'd just watch if the "modified time" file stat changes. To read it:
os.stat(filename).st_mtime
(Also note that the Windows native change event solution does not work in all circumstances, e.g. on network drives.)
import os
class Monkey(object):
def __init__(self):
self._cached_stamp = 0
self.filename = '/path/to/file'
def ook(self):
stamp = os.stat(self.filename).st_mtime
if stamp != self._cached_stamp:
self._cached_stamp = stamp
# File has changed, so do something...
If you want a multiplatform solution, then check QFileSystemWatcher.
Here an example code (not sanitized):
from PyQt4 import QtCore
#QtCore.pyqtSlot(str)
def directory_changed(path):
print('Directory Changed!!!')
#QtCore.pyqtSlot(str)
def file_changed(path):
print('File Changed!!!')
fs_watcher = QtCore.QFileSystemWatcher(['/path/to/files_1', '/path/to/files_2', '/path/to/files_3'])
fs_watcher.connect(fs_watcher, QtCore.SIGNAL('directoryChanged(QString)'), directory_changed)
fs_watcher.connect(fs_watcher, QtCore.SIGNAL('fileChanged(QString)'), file_changed)
It should not work on windows (maybe with cygwin ?), but for unix user, you should use the "fcntl" system call. Here is an example in Python. It's mostly the same code if you need to write it in C (same function names)
import time
import fcntl
import os
import signal
FNAME = "/HOME/TOTO/FILETOWATCH"
def handler(signum, frame):
print "File %s modified" % (FNAME,)
signal.signal(signal.SIGIO, handler)
fd = os.open(FNAME, os.O_RDONLY)
fcntl.fcntl(fd, fcntl.F_SETSIG, 0)
fcntl.fcntl(fd, fcntl.F_NOTIFY,
fcntl.DN_MODIFY | fcntl.DN_CREATE | fcntl.DN_MULTISHOT)
while True:
time.sleep(10000)
Check out pyinotify.
inotify replaces dnotify (from an earlier answer) in newer linuxes and allows file-level rather than directory-level monitoring.
For watching a single file with polling, and minimal dependencies, here is a fully fleshed-out example, based on answer from Deestan (above):
import os
import sys
import time
class Watcher(object):
running = True
refresh_delay_secs = 1
# Constructor
def __init__(self, watch_file, call_func_on_change=None, *args, **kwargs):
self._cached_stamp = 0
self.filename = watch_file
self.call_func_on_change = call_func_on_change
self.args = args
self.kwargs = kwargs
# Look for changes
def look(self):
stamp = os.stat(self.filename).st_mtime
if stamp != self._cached_stamp:
self._cached_stamp = stamp
# File has changed, so do something...
print('File changed')
if self.call_func_on_change is not None:
self.call_func_on_change(*self.args, **self.kwargs)
# Keep watching in a loop
def watch(self):
while self.running:
try:
# Look for changes
time.sleep(self.refresh_delay_secs)
self.look()
except KeyboardInterrupt:
print('\nDone')
break
except FileNotFoundError:
# Action on file not found
pass
except:
print('Unhandled error: %s' % sys.exc_info()[0])
# Call this function each time a change happens
def custom_action(text):
print(text)
watch_file = 'my_file.txt'
# watcher = Watcher(watch_file) # simple
watcher = Watcher(watch_file, custom_action, text='yes, changed') # also call custom action function
watcher.watch() # start the watch going
Well after a bit of hacking of Tim Golden's script, I have the following which seems to work quite well:
import os
import win32file
import win32con
path_to_watch = "." # look at the current directory
file_to_watch = "test.txt" # look for changes to a file called test.txt
def ProcessNewData( newData ):
print "Text added: %s"%newData
# Set up the bits we'll need for output
ACTIONS = {
1 : "Created",
2 : "Deleted",
3 : "Updated",
4 : "Renamed from something",
5 : "Renamed to something"
}
FILE_LIST_DIRECTORY = 0x0001
hDir = win32file.CreateFile (
path_to_watch,
FILE_LIST_DIRECTORY,
win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE,
None,
win32con.OPEN_EXISTING,
win32con.FILE_FLAG_BACKUP_SEMANTICS,
None
)
# Open the file we're interested in
a = open(file_to_watch, "r")
# Throw away any exising log data
a.read()
# Wait for new data and call ProcessNewData for each new chunk that's written
while 1:
# Wait for a change to occur
results = win32file.ReadDirectoryChangesW (
hDir,
1024,
False,
win32con.FILE_NOTIFY_CHANGE_LAST_WRITE,
None,
None
)
# For each change, check to see if it's updating the file we're interested in
for action, file in results:
full_filename = os.path.join (path_to_watch, file)
#print file, ACTIONS.get (action, "Unknown")
if file == file_to_watch:
newText = a.read()
if newText != "":
ProcessNewData( newText )
It could probably do with a load more error checking, but for simply watching a log file and doing some processing on it before spitting it out to the screen, this works well.
Thanks everyone for your input - great stuff!
Check my answer to a similar question. You could try the same loop in Python. This page suggests:
import time
while 1:
where = file.tell()
line = file.readline()
if not line:
time.sleep(1)
file.seek(where)
else:
print line, # already has newline
Also see the question tail() a file with Python.
This is another modification of Tim Goldan's script that runs on unix types and adds a simple watcher for file modification by using a dict (file=>time).
usage: whateverName.py path_to_dir_to_watch
#!/usr/bin/env python
import os, sys, time
def files_to_timestamp(path):
files = [os.path.join(path, f) for f in os.listdir(path)]
return dict ([(f, os.path.getmtime(f)) for f in files])
if __name__ == "__main__":
path_to_watch = sys.argv[1]
print('Watching {}..'.format(path_to_watch))
before = files_to_timestamp(path_to_watch)
while 1:
time.sleep (2)
after = files_to_timestamp(path_to_watch)
added = [f for f in after.keys() if not f in before.keys()]
removed = [f for f in before.keys() if not f in after.keys()]
modified = []
for f in before.keys():
if not f in removed:
if os.path.getmtime(f) != before.get(f):
modified.append(f)
if added: print('Added: {}'.format(', '.join(added)))
if removed: print('Removed: {}'.format(', '.join(removed)))
if modified: print('Modified: {}'.format(', '.join(modified)))
before = after
Here is a simplified version of Kender's code that appears to do the same trick and does not import the entire file:
# Check file for new data.
import time
f = open(r'c:\temp\test.txt', 'r')
while True:
line = f.readline()
if not line:
time.sleep(1)
print 'Nothing New'
else:
print 'Call Function: ', line
Well, since you are using Python, you can just open a file and keep reading lines from it.
f = open('file.log')
If the line read is not empty, you process it.
line = f.readline()
if line:
// Do what you want with the line
You may be missing that it is ok to keep calling readline at the EOF. It will just keep returning an empty string in this case. And when something is appended to the log file, the reading will continue from where it stopped, as you need.
If you are looking for a solution that uses events, or a particular library, please specify this in your question. Otherwise, I think this solution is just fine.
Simplest solution for me is using watchdog's tool watchmedo
From https://pypi.python.org/pypi/watchdog I now have a process that looks up the sql files in a directory and executes them if necessary.
watchmedo shell-command \
--patterns="*.sql" \
--recursive \
--command='~/Desktop/load_files_into_mysql_database.sh' \
.
As you can see in Tim Golden's article, pointed by Horst Gutmann, WIN32 is relatively complex and watches directories, not a single file.
I'd like to suggest you look into IronPython, which is a .NET python implementation.
With IronPython you can use all the .NET functionality - including
System.IO.FileSystemWatcher
Which handles single files with a simple Event interface.
This is an example of checking a file for changes. One that may not be the best way of doing it, but it sure is a short way.
Handy tool for restarting application when changes have been made to the source. I made this when playing with pygame so I can see effects take place immediately after file save.
When used in pygame make sure the stuff in the 'while' loop is placed in your game loop aka update or whatever. Otherwise your application will get stuck in an infinite loop and you will not see your game updating.
file_size_stored = os.stat('neuron.py').st_size
while True:
try:
file_size_current = os.stat('neuron.py').st_size
if file_size_stored != file_size_current:
restart_program()
except:
pass
In case you wanted the restart code which I found on the web. Here it is. (Not relevant to the question, though it could come in handy)
def restart_program(): #restart application
python = sys.executable
os.execl(python, python, * sys.argv)
Have fun making electrons do what you want them to do.
Seems that no one has posted fswatch. It is a cross-platform file system watcher. Just install it, run it and follow the prompts.
I've used it with python and golang programs and it just works.
ACTIONS = {
1 : "Created",
2 : "Deleted",
3 : "Updated",
4 : "Renamed from something",
5 : "Renamed to something"
}
FILE_LIST_DIRECTORY = 0x0001
class myThread (threading.Thread):
def __init__(self, threadID, fileName, directory, origin):
threading.Thread.__init__(self)
self.threadID = threadID
self.fileName = fileName
self.daemon = True
self.dir = directory
self.originalFile = origin
def run(self):
startMonitor(self.fileName, self.dir, self.originalFile)
def startMonitor(fileMonitoring,dirPath,originalFile):
hDir = win32file.CreateFile (
dirPath,
FILE_LIST_DIRECTORY,
win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE,
None,
win32con.OPEN_EXISTING,
win32con.FILE_FLAG_BACKUP_SEMANTICS,
None
)
# Wait for new data and call ProcessNewData for each new chunk that's
# written
while 1:
# Wait for a change to occur
results = win32file.ReadDirectoryChangesW (
hDir,
1024,
False,
win32con.FILE_NOTIFY_CHANGE_LAST_WRITE,
None,
None
)
# For each change, check to see if it's updating the file we're
# interested in
for action, file_M in results:
full_filename = os.path.join (dirPath, file_M)
#print file, ACTIONS.get (action, "Unknown")
if len(full_filename) == len(fileMonitoring) and action == 3:
#copy to main file
...
Since I have it installed globally, my favorite approach is to use nodemon. If your source code is in src, and your entry point is src/app.py, then it's as easy as:
nodemon -w 'src/**' -e py,html --exec python src/app.py
... where -e py,html lets you control what file types to watch for changes.
Here's an example geared toward watching input files that write no more than one line per second but usually a lot less. The goal is to append the last line (most recent write) to the specified output file. I've copied this from one of my projects and just deleted all the irrelevant lines. You'll have to fill in or change the missing symbols.
from PyQt5.QtCore import QFileSystemWatcher, QSettings, QThread
from ui_main_window import Ui_MainWindow # Qt Creator gen'd
class MainWindow(QMainWindow, Ui_MainWindow):
def __init__(self, parent=None):
QMainWindow.__init__(self, parent)
Ui_MainWindow.__init__(self)
self._fileWatcher = QFileSystemWatcher()
self._fileWatcher.fileChanged.connect(self.fileChanged)
def fileChanged(self, filepath):
QThread.msleep(300) # Reqd on some machines, give chance for write to complete
# ^^ About to test this, may need more sophisticated solution
with open(filepath) as file:
lastLine = list(file)[-1]
destPath = self._filemap[filepath]['dest file']
with open(destPath, 'a') as out_file: # a= append
out_file.writelines([lastLine])
Of course, the encompassing QMainWindow class is not strictly required, ie. you can use QFileSystemWatcher alone.
Just to put this out there since no one mentioned it: there's a Python module in the Standard Library named filecmp which has this cmp() function that compares two files.
Just make sure you don't do from filecmp import cmp to not overshadow the built-in cmp() function in Python 2.x. That's okay in Python 3.x, though, since there's no such built-in cmp() function anymore.
Anyway, this is how its use looks like:
import filecmp
filecmp.cmp(path_to_file_1, path_to_file_2, shallow=True)
The argument shallow defaults to True. If the argument's value is True, then only the metadata of the files are compared; however, if the argument's value is False, then the contents of the files are compared.
Maybe this information will be useful to someone.
watchfiles (https://github.com/samuelcolvin/watchfiles) is a Python API and CLI that uses the Notify (https://github.com/notify-rs/notify) library written in Rust.
The rust implementation currently (2022-10-09) supports:
Linux / Android: inotify
macOS: FSEvents or kqueue, see features
Windows: ReadDirectoryChangesW
FreeBSD / NetBSD / OpenBSD / DragonflyBSD: kqueue
All platforms: polling
Binaries available on PyPI (https://pypi.org/project/watchfiles/) and conda-forge (https://github.com/conda-forge/watchfiles-feedstock).
You can also use a simple library called repyt, here is an example:
repyt ./app.py
related #4Oh4 solution a smooth change for a list of files to watch;
import os
import sys
import time
class Watcher(object):
running = True
refresh_delay_secs = 1
# Constructor
def __init__(self, watch_files, call_func_on_change=None, *args, **kwargs):
self._cached_stamp = 0
self._cached_stamp_files = {}
self.filenames = watch_files
self.call_func_on_change = call_func_on_change
self.args = args
self.kwargs = kwargs
# Look for changes
def look(self):
for file in self.filenames:
stamp = os.stat(file).st_mtime
if not file in self._cached_stamp_files:
self._cached_stamp_files[file] = 0
if stamp != self._cached_stamp_files[file]:
self._cached_stamp_files[file] = stamp
# File has changed, so do something...
file_to_read = open(file, 'r')
value = file_to_read.read()
print("value from file", value)
file_to_read.seek(0)
if self.call_func_on_change is not None:
self.call_func_on_change(*self.args, **self.kwargs)
# Keep watching in a loop
def watch(self):
while self.running:
try:
# Look for changes
time.sleep(self.refresh_delay_secs)
self.look()
except KeyboardInterrupt:
print('\nDone')
break
except FileNotFoundError:
# Action on file not found
pass
except Exception as e:
print(e)
print('Unhandled error: %s' % sys.exc_info()[0])
# Call this function each time a change happens
def custom_action(text):
print(text)
# pass
watch_files = ['/Users/mexekanez/my_file.txt', '/Users/mexekanez/my_file1.txt']
# watcher = Watcher(watch_file) # simple
if __name__ == "__main__":
watcher = Watcher(watch_files, custom_action, text='yes, changed') # also call custom action function
watcher.watch() # start the watch going
The best and simplest solution is to use pygtail:
https://pypi.python.org/pypi/pygtail
from pygtail import Pygtail
import sys
while True:
for line in Pygtail("some.log"):
sys.stdout.write(line)
import inotify.adapters
from datetime import datetime
LOG_FILE='/var/log/mysql/server_audit.log'
def main():
start_time = datetime.now()
while True:
i = inotify.adapters.Inotify()
i.add_watch(LOG_FILE)
for event in i.event_gen(yield_nones=False):
break
del i
with open(LOG_FILE, 'r') as f:
for line in f:
entry = line.split(',')
entry_time = datetime.strptime(entry[0],
'%Y%m%d %H:%M:%S')
if entry_time > start_time:
start_time = entry_time
print(entry)
if __name__ == '__main__':
main()
The easiest solution would get the two instances of the same file after an interval and Compare them. You Could try something like this
while True:
# Capturing the two instances models.py after certain interval of time
print("Looking for changes in " + app_name.capitalize() + " models.py\nPress 'CTRL + C' to stop the program")
with open(app_name.capitalize() + '/filename', 'r+') as app_models_file:
filename_content = app_models_file.read()
time.sleep(5)
with open(app_name.capitalize() + '/filename', 'r+') as app_models_file_1:
filename_content_1 = app_models_file_1.read()
# Comparing models.py after certain interval of time
if filename_content == filename_content_1:
pass
else:
print("You made a change in " + app_name.capitalize() + " filename.\n")
cmd = str(input("Do something with the file?(y/n):"))
if cmd == 'y':
# Do Something
elif cmd == 'n':
# pass or do something
else:
print("Invalid input")
If you're using windows, create this POLL.CMD file
#echo off
:top
xcopy /m /y %1 %2 | find /v "File(s) copied"
timeout /T 1 > nul
goto :top
then you can type "poll dir1 dir2" and it will copy all the files from dir1 to dir2 and check for updates once per second.
The "find" is optional, just to make the console less noisy.
This is not recursive. Maybe you could make it recursive using /e on the xcopy.
I don't know any Windows specific function. You could try getting the MD5 hash of the file every second/minute/hour (depends on how fast you need it) and compare it to the last hash. When it differs you know the file has been changed and you read out the newest lines.
I'd try something like this.
try:
f = open(filePath)
except IOError:
print "No such file: %s" % filePath
raw_input("Press Enter to close window")
try:
lines = f.readlines()
while True:
line = f.readline()
try:
if not line:
time.sleep(1)
else:
functionThatAnalisesTheLine(line)
except Exception, e:
# handle the exception somehow (for example, log the trace) and raise the same exception again
raw_input("Press Enter to close window")
raise e
finally:
f.close()
The loop checks if there is a new line(s) since last time file was read - if there is, it's read and passed to the functionThatAnalisesTheLine function. If not, script waits 1 second and retries the process.

How do I watch a file for changes?

I have a log file being written by another process which I want to watch for changes. Each time a change occurs I'd like to read the new data in to do some processing on it.
What's the best way to do this? I was hoping there'd be some sort of hook from the PyWin32 library. I've found the win32file.FindNextChangeNotification function but have no idea how to ask it to watch a specific file.
If anyone's done anything like this I'd be really grateful to hear how...
[Edit] I should have mentioned that I was after a solution that doesn't require polling.
[Edit] Curses! It seems this doesn't work over a mapped network drive. I'm guessing windows doesn't 'hear' any updates to the file the way it does on a local disk.
Did you try using Watchdog?
Python API library and shell utilities to monitor file system events.
Directory monitoring made easy with
A cross-platform API.
A shell tool to run commands in response to directory changes.
Get started quickly with a simple example in Quickstart...
If polling is good enough for you, I'd just watch if the "modified time" file stat changes. To read it:
os.stat(filename).st_mtime
(Also note that the Windows native change event solution does not work in all circumstances, e.g. on network drives.)
import os
class Monkey(object):
def __init__(self):
self._cached_stamp = 0
self.filename = '/path/to/file'
def ook(self):
stamp = os.stat(self.filename).st_mtime
if stamp != self._cached_stamp:
self._cached_stamp = stamp
# File has changed, so do something...
If you want a multiplatform solution, then check QFileSystemWatcher.
Here an example code (not sanitized):
from PyQt4 import QtCore
#QtCore.pyqtSlot(str)
def directory_changed(path):
print('Directory Changed!!!')
#QtCore.pyqtSlot(str)
def file_changed(path):
print('File Changed!!!')
fs_watcher = QtCore.QFileSystemWatcher(['/path/to/files_1', '/path/to/files_2', '/path/to/files_3'])
fs_watcher.connect(fs_watcher, QtCore.SIGNAL('directoryChanged(QString)'), directory_changed)
fs_watcher.connect(fs_watcher, QtCore.SIGNAL('fileChanged(QString)'), file_changed)
It should not work on windows (maybe with cygwin ?), but for unix user, you should use the "fcntl" system call. Here is an example in Python. It's mostly the same code if you need to write it in C (same function names)
import time
import fcntl
import os
import signal
FNAME = "/HOME/TOTO/FILETOWATCH"
def handler(signum, frame):
print "File %s modified" % (FNAME,)
signal.signal(signal.SIGIO, handler)
fd = os.open(FNAME, os.O_RDONLY)
fcntl.fcntl(fd, fcntl.F_SETSIG, 0)
fcntl.fcntl(fd, fcntl.F_NOTIFY,
fcntl.DN_MODIFY | fcntl.DN_CREATE | fcntl.DN_MULTISHOT)
while True:
time.sleep(10000)
Check out pyinotify.
inotify replaces dnotify (from an earlier answer) in newer linuxes and allows file-level rather than directory-level monitoring.
For watching a single file with polling, and minimal dependencies, here is a fully fleshed-out example, based on answer from Deestan (above):
import os
import sys
import time
class Watcher(object):
running = True
refresh_delay_secs = 1
# Constructor
def __init__(self, watch_file, call_func_on_change=None, *args, **kwargs):
self._cached_stamp = 0
self.filename = watch_file
self.call_func_on_change = call_func_on_change
self.args = args
self.kwargs = kwargs
# Look for changes
def look(self):
stamp = os.stat(self.filename).st_mtime
if stamp != self._cached_stamp:
self._cached_stamp = stamp
# File has changed, so do something...
print('File changed')
if self.call_func_on_change is not None:
self.call_func_on_change(*self.args, **self.kwargs)
# Keep watching in a loop
def watch(self):
while self.running:
try:
# Look for changes
time.sleep(self.refresh_delay_secs)
self.look()
except KeyboardInterrupt:
print('\nDone')
break
except FileNotFoundError:
# Action on file not found
pass
except:
print('Unhandled error: %s' % sys.exc_info()[0])
# Call this function each time a change happens
def custom_action(text):
print(text)
watch_file = 'my_file.txt'
# watcher = Watcher(watch_file) # simple
watcher = Watcher(watch_file, custom_action, text='yes, changed') # also call custom action function
watcher.watch() # start the watch going
Well after a bit of hacking of Tim Golden's script, I have the following which seems to work quite well:
import os
import win32file
import win32con
path_to_watch = "." # look at the current directory
file_to_watch = "test.txt" # look for changes to a file called test.txt
def ProcessNewData( newData ):
print "Text added: %s"%newData
# Set up the bits we'll need for output
ACTIONS = {
1 : "Created",
2 : "Deleted",
3 : "Updated",
4 : "Renamed from something",
5 : "Renamed to something"
}
FILE_LIST_DIRECTORY = 0x0001
hDir = win32file.CreateFile (
path_to_watch,
FILE_LIST_DIRECTORY,
win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE,
None,
win32con.OPEN_EXISTING,
win32con.FILE_FLAG_BACKUP_SEMANTICS,
None
)
# Open the file we're interested in
a = open(file_to_watch, "r")
# Throw away any exising log data
a.read()
# Wait for new data and call ProcessNewData for each new chunk that's written
while 1:
# Wait for a change to occur
results = win32file.ReadDirectoryChangesW (
hDir,
1024,
False,
win32con.FILE_NOTIFY_CHANGE_LAST_WRITE,
None,
None
)
# For each change, check to see if it's updating the file we're interested in
for action, file in results:
full_filename = os.path.join (path_to_watch, file)
#print file, ACTIONS.get (action, "Unknown")
if file == file_to_watch:
newText = a.read()
if newText != "":
ProcessNewData( newText )
It could probably do with a load more error checking, but for simply watching a log file and doing some processing on it before spitting it out to the screen, this works well.
Thanks everyone for your input - great stuff!
Check my answer to a similar question. You could try the same loop in Python. This page suggests:
import time
while 1:
where = file.tell()
line = file.readline()
if not line:
time.sleep(1)
file.seek(where)
else:
print line, # already has newline
Also see the question tail() a file with Python.
This is another modification of Tim Goldan's script that runs on unix types and adds a simple watcher for file modification by using a dict (file=>time).
usage: whateverName.py path_to_dir_to_watch
#!/usr/bin/env python
import os, sys, time
def files_to_timestamp(path):
files = [os.path.join(path, f) for f in os.listdir(path)]
return dict ([(f, os.path.getmtime(f)) for f in files])
if __name__ == "__main__":
path_to_watch = sys.argv[1]
print('Watching {}..'.format(path_to_watch))
before = files_to_timestamp(path_to_watch)
while 1:
time.sleep (2)
after = files_to_timestamp(path_to_watch)
added = [f for f in after.keys() if not f in before.keys()]
removed = [f for f in before.keys() if not f in after.keys()]
modified = []
for f in before.keys():
if not f in removed:
if os.path.getmtime(f) != before.get(f):
modified.append(f)
if added: print('Added: {}'.format(', '.join(added)))
if removed: print('Removed: {}'.format(', '.join(removed)))
if modified: print('Modified: {}'.format(', '.join(modified)))
before = after
Here is a simplified version of Kender's code that appears to do the same trick and does not import the entire file:
# Check file for new data.
import time
f = open(r'c:\temp\test.txt', 'r')
while True:
line = f.readline()
if not line:
time.sleep(1)
print 'Nothing New'
else:
print 'Call Function: ', line
Well, since you are using Python, you can just open a file and keep reading lines from it.
f = open('file.log')
If the line read is not empty, you process it.
line = f.readline()
if line:
// Do what you want with the line
You may be missing that it is ok to keep calling readline at the EOF. It will just keep returning an empty string in this case. And when something is appended to the log file, the reading will continue from where it stopped, as you need.
If you are looking for a solution that uses events, or a particular library, please specify this in your question. Otherwise, I think this solution is just fine.
Simplest solution for me is using watchdog's tool watchmedo
From https://pypi.python.org/pypi/watchdog I now have a process that looks up the sql files in a directory and executes them if necessary.
watchmedo shell-command \
--patterns="*.sql" \
--recursive \
--command='~/Desktop/load_files_into_mysql_database.sh' \
.
As you can see in Tim Golden's article, pointed by Horst Gutmann, WIN32 is relatively complex and watches directories, not a single file.
I'd like to suggest you look into IronPython, which is a .NET python implementation.
With IronPython you can use all the .NET functionality - including
System.IO.FileSystemWatcher
Which handles single files with a simple Event interface.
This is an example of checking a file for changes. One that may not be the best way of doing it, but it sure is a short way.
Handy tool for restarting application when changes have been made to the source. I made this when playing with pygame so I can see effects take place immediately after file save.
When used in pygame make sure the stuff in the 'while' loop is placed in your game loop aka update or whatever. Otherwise your application will get stuck in an infinite loop and you will not see your game updating.
file_size_stored = os.stat('neuron.py').st_size
while True:
try:
file_size_current = os.stat('neuron.py').st_size
if file_size_stored != file_size_current:
restart_program()
except:
pass
In case you wanted the restart code which I found on the web. Here it is. (Not relevant to the question, though it could come in handy)
def restart_program(): #restart application
python = sys.executable
os.execl(python, python, * sys.argv)
Have fun making electrons do what you want them to do.
Seems that no one has posted fswatch. It is a cross-platform file system watcher. Just install it, run it and follow the prompts.
I've used it with python and golang programs and it just works.
ACTIONS = {
1 : "Created",
2 : "Deleted",
3 : "Updated",
4 : "Renamed from something",
5 : "Renamed to something"
}
FILE_LIST_DIRECTORY = 0x0001
class myThread (threading.Thread):
def __init__(self, threadID, fileName, directory, origin):
threading.Thread.__init__(self)
self.threadID = threadID
self.fileName = fileName
self.daemon = True
self.dir = directory
self.originalFile = origin
def run(self):
startMonitor(self.fileName, self.dir, self.originalFile)
def startMonitor(fileMonitoring,dirPath,originalFile):
hDir = win32file.CreateFile (
dirPath,
FILE_LIST_DIRECTORY,
win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE,
None,
win32con.OPEN_EXISTING,
win32con.FILE_FLAG_BACKUP_SEMANTICS,
None
)
# Wait for new data and call ProcessNewData for each new chunk that's
# written
while 1:
# Wait for a change to occur
results = win32file.ReadDirectoryChangesW (
hDir,
1024,
False,
win32con.FILE_NOTIFY_CHANGE_LAST_WRITE,
None,
None
)
# For each change, check to see if it's updating the file we're
# interested in
for action, file_M in results:
full_filename = os.path.join (dirPath, file_M)
#print file, ACTIONS.get (action, "Unknown")
if len(full_filename) == len(fileMonitoring) and action == 3:
#copy to main file
...
Since I have it installed globally, my favorite approach is to use nodemon. If your source code is in src, and your entry point is src/app.py, then it's as easy as:
nodemon -w 'src/**' -e py,html --exec python src/app.py
... where -e py,html lets you control what file types to watch for changes.
Here's an example geared toward watching input files that write no more than one line per second but usually a lot less. The goal is to append the last line (most recent write) to the specified output file. I've copied this from one of my projects and just deleted all the irrelevant lines. You'll have to fill in or change the missing symbols.
from PyQt5.QtCore import QFileSystemWatcher, QSettings, QThread
from ui_main_window import Ui_MainWindow # Qt Creator gen'd
class MainWindow(QMainWindow, Ui_MainWindow):
def __init__(self, parent=None):
QMainWindow.__init__(self, parent)
Ui_MainWindow.__init__(self)
self._fileWatcher = QFileSystemWatcher()
self._fileWatcher.fileChanged.connect(self.fileChanged)
def fileChanged(self, filepath):
QThread.msleep(300) # Reqd on some machines, give chance for write to complete
# ^^ About to test this, may need more sophisticated solution
with open(filepath) as file:
lastLine = list(file)[-1]
destPath = self._filemap[filepath]['dest file']
with open(destPath, 'a') as out_file: # a= append
out_file.writelines([lastLine])
Of course, the encompassing QMainWindow class is not strictly required, ie. you can use QFileSystemWatcher alone.
Just to put this out there since no one mentioned it: there's a Python module in the Standard Library named filecmp which has this cmp() function that compares two files.
Just make sure you don't do from filecmp import cmp to not overshadow the built-in cmp() function in Python 2.x. That's okay in Python 3.x, though, since there's no such built-in cmp() function anymore.
Anyway, this is how its use looks like:
import filecmp
filecmp.cmp(path_to_file_1, path_to_file_2, shallow=True)
The argument shallow defaults to True. If the argument's value is True, then only the metadata of the files are compared; however, if the argument's value is False, then the contents of the files are compared.
Maybe this information will be useful to someone.
watchfiles (https://github.com/samuelcolvin/watchfiles) is a Python API and CLI that uses the Notify (https://github.com/notify-rs/notify) library written in Rust.
The rust implementation currently (2022-10-09) supports:
Linux / Android: inotify
macOS: FSEvents or kqueue, see features
Windows: ReadDirectoryChangesW
FreeBSD / NetBSD / OpenBSD / DragonflyBSD: kqueue
All platforms: polling
Binaries available on PyPI (https://pypi.org/project/watchfiles/) and conda-forge (https://github.com/conda-forge/watchfiles-feedstock).
You can also use a simple library called repyt, here is an example:
repyt ./app.py
related #4Oh4 solution a smooth change for a list of files to watch;
import os
import sys
import time
class Watcher(object):
running = True
refresh_delay_secs = 1
# Constructor
def __init__(self, watch_files, call_func_on_change=None, *args, **kwargs):
self._cached_stamp = 0
self._cached_stamp_files = {}
self.filenames = watch_files
self.call_func_on_change = call_func_on_change
self.args = args
self.kwargs = kwargs
# Look for changes
def look(self):
for file in self.filenames:
stamp = os.stat(file).st_mtime
if not file in self._cached_stamp_files:
self._cached_stamp_files[file] = 0
if stamp != self._cached_stamp_files[file]:
self._cached_stamp_files[file] = stamp
# File has changed, so do something...
file_to_read = open(file, 'r')
value = file_to_read.read()
print("value from file", value)
file_to_read.seek(0)
if self.call_func_on_change is not None:
self.call_func_on_change(*self.args, **self.kwargs)
# Keep watching in a loop
def watch(self):
while self.running:
try:
# Look for changes
time.sleep(self.refresh_delay_secs)
self.look()
except KeyboardInterrupt:
print('\nDone')
break
except FileNotFoundError:
# Action on file not found
pass
except Exception as e:
print(e)
print('Unhandled error: %s' % sys.exc_info()[0])
# Call this function each time a change happens
def custom_action(text):
print(text)
# pass
watch_files = ['/Users/mexekanez/my_file.txt', '/Users/mexekanez/my_file1.txt']
# watcher = Watcher(watch_file) # simple
if __name__ == "__main__":
watcher = Watcher(watch_files, custom_action, text='yes, changed') # also call custom action function
watcher.watch() # start the watch going
The best and simplest solution is to use pygtail:
https://pypi.python.org/pypi/pygtail
from pygtail import Pygtail
import sys
while True:
for line in Pygtail("some.log"):
sys.stdout.write(line)
import inotify.adapters
from datetime import datetime
LOG_FILE='/var/log/mysql/server_audit.log'
def main():
start_time = datetime.now()
while True:
i = inotify.adapters.Inotify()
i.add_watch(LOG_FILE)
for event in i.event_gen(yield_nones=False):
break
del i
with open(LOG_FILE, 'r') as f:
for line in f:
entry = line.split(',')
entry_time = datetime.strptime(entry[0],
'%Y%m%d %H:%M:%S')
if entry_time > start_time:
start_time = entry_time
print(entry)
if __name__ == '__main__':
main()
The easiest solution would get the two instances of the same file after an interval and Compare them. You Could try something like this
while True:
# Capturing the two instances models.py after certain interval of time
print("Looking for changes in " + app_name.capitalize() + " models.py\nPress 'CTRL + C' to stop the program")
with open(app_name.capitalize() + '/filename', 'r+') as app_models_file:
filename_content = app_models_file.read()
time.sleep(5)
with open(app_name.capitalize() + '/filename', 'r+') as app_models_file_1:
filename_content_1 = app_models_file_1.read()
# Comparing models.py after certain interval of time
if filename_content == filename_content_1:
pass
else:
print("You made a change in " + app_name.capitalize() + " filename.\n")
cmd = str(input("Do something with the file?(y/n):"))
if cmd == 'y':
# Do Something
elif cmd == 'n':
# pass or do something
else:
print("Invalid input")
If you're using windows, create this POLL.CMD file
#echo off
:top
xcopy /m /y %1 %2 | find /v "File(s) copied"
timeout /T 1 > nul
goto :top
then you can type "poll dir1 dir2" and it will copy all the files from dir1 to dir2 and check for updates once per second.
The "find" is optional, just to make the console less noisy.
This is not recursive. Maybe you could make it recursive using /e on the xcopy.
I don't know any Windows specific function. You could try getting the MD5 hash of the file every second/minute/hour (depends on how fast you need it) and compare it to the last hash. When it differs you know the file has been changed and you read out the newest lines.
I'd try something like this.
try:
f = open(filePath)
except IOError:
print "No such file: %s" % filePath
raw_input("Press Enter to close window")
try:
lines = f.readlines()
while True:
line = f.readline()
try:
if not line:
time.sleep(1)
else:
functionThatAnalisesTheLine(line)
except Exception, e:
# handle the exception somehow (for example, log the trace) and raise the same exception again
raw_input("Press Enter to close window")
raise e
finally:
f.close()
The loop checks if there is a new line(s) since last time file was read - if there is, it's read and passed to the functionThatAnalisesTheLine function. If not, script waits 1 second and retries the process.

Categories

Resources