I am using a simple Python with statement, such as the one below, to write to a log file.
with open(filename, 'a+') as f:
    do_stuff1()
    f.write('stuff1 complete. \n')
    do_stuff2()
    f.write('stuff2 complete. \n')
    do_stuff3()
    f.write('stuff3 complete. \n')
I am finding that my script fails intermittently at do_stuff2(); however, in the log file I do not find the line "stuff1 complete" that I would expect if the file had been closed correctly, as should happen when using with. The only reason I know the script is failing at do_stuff2() even though my own log is empty is that this function calls an API that does its own logging, and that other log file tells me that do_stuff2() started even though it did not complete.
My question is what sort of error would have to occur inside the with statement that would not only stop execution but also prevent the file from being closed correctly?
Some additional information:
The script is a scheduled task that runs late at night.
I have never been able to reproduce the problem by running the process interactively.
The problem occurs roughly once every 2-3 nights.
I do see errors in the Windows event logs that point to a DLL file, the .NET Framework and a 0xC0000005 error, which is an access violation. The API used by do_stuff2() uses this DLL, which in turn uses the .NET Framework.
Obviously I am going to try to fix the underlying problem, but at this point my question is focused on what could happen inside the with block (potentially several layers below my code) that could break its intended functionality of closing the file properly regardless of whether the body of the with executes successfully.
The with statement can only close the file if Python itself keeps running, i.e. if the block either completes or raises a Python exception. If there is a segfault inside an extension, no Python exception is raised and the process dies without giving Python a chance to close the file or flush its buffers. You can call f.flush() at several places to force Python to hand the buffered data to the operating system.
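A minimal sketch of that flushing idea, reusing the code from the question; os.fsync() is added after flush() so each line also reaches the disk rather than just the OS buffers:
import os

with open(filename, 'a+') as f:
    do_stuff1()
    f.write('stuff1 complete. \n')
    f.flush()                 # push Python's internal buffer to the OS
    os.fsync(f.fileno())      # ask the OS to write it out to disk
    do_stuff2()
    f.write('stuff2 complete. \n')
    f.flush()
    os.fsync(f.fileno())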
Related
I have a scraper that runs on windows command line. The scraper shuts down after about two days of running (due to an unspecified error). How can I implement some code to have the program run again when an error is encountered? Also, how can I make it run from the point it just stopped?
The only way you can make it resume from where it stopped is to store your progress somewhere else; for something basic, you can simply use a text file to record your scraping output/position. As for the unspecified error: if the error stops the script, you can wrap your whole code in a while loop.
last_scraped_point = "load this from wherever you store your progress"
while True:
    try:
        # Start (or resume) scraping from last_scraped_point,
        # updating it after every successfully scraped item
        pass  # your scraping code goes here
    except KeyboardInterrupt:
        break  # let Ctrl-C actually stop the loop
    except Exception:
        continue  # any other error: loop around and resume from last_scraped_point
This will simply keep restarting the scraping loop from where it stopped working.
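As a rough sketch of the "store your progress in a text file" idea, something like the following could be used; STATE_FILE, save_progress() and load_progress() are made-up names for illustration:
import os

STATE_FILE = 'last_scraped_point.txt'  # hypothetical progress file

def save_progress(point):
    # Overwrite the file with the most recent successfully scraped point
    with open(STATE_FILE, 'w') as f:
        f.write(str(point))

def load_progress(default='start'):
    # Return the saved point, or the default if no progress has been recorded yet
    if not os.path.exists(STATE_FILE):
        return default
    with open(STATE_FILE) as f:
        return f.read().strip()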
On some occasions, my Python program won't respond because there seems to be a deadlock. Since I have no idea where this deadlock happens, I'd like to set a breakpoint or dump the stack of all threads after 10 seconds in order to learn what my program is waiting for.
Use the logging module and put logger.debug() calls in strategic places throughout your program. You can disable these messages with one single setting (logger.setLevel) if you want to, and you can choose whether to write them to, e.g., stderr or to a file.
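For instance, a minimal setup along those lines might look like this (the message text and checkpoint are just placeholders):
import logging
import sys

logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
logger = logging.getLogger(__name__)

logger.debug('about to open the database connection')  # strategic checkpoint
# A single setting silences all of these again:
# logger.setLevel(logging.WARNING)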
import pdb
from your_test_module import TestCase
testcase = TestCase()
testcase.setUp()
pdb.runcall(testcase.specific_test)
Then press Ctrl-C at your leisure. The KeyboardInterrupt will cause pdb to drop into the debugger prompt.
Well, as it turns out, it was because my database was locked (a connection wasn't closed) and when the tests were tearing down (and the database schema was being erased so that the database is clean for the next tests), psycopg2 just ignored the KeyboardInterrupt exception.
I solved my problem using the faulthandler module (for earlier Python versions, there is a PyPI backport). faulthandler allows me to dump the stack trace to any file (including sys.stderr) after a period of time, repeatedly, using faulthandler.dump_traceback_later(3, repeat=True). That allowed me to set a breakpoint where my program stopped responding and tackle the issue effectively.
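For reference, a minimal sketch of that faulthandler usage; main() is a placeholder for whatever code stops responding:
import faulthandler
import sys

# Dump the stack of every thread to stderr every 3 seconds until cancelled
faulthandler.dump_traceback_later(3, repeat=True, file=sys.stderr)
try:
    main()   # placeholder for the code that hangs
finally:
    faulthandler.cancel_dump_traceback_later()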
I use a program (on Windows), whose name I won't disclose, that can be opened from the command line without going through any authentication. I'm trying to create some security to prevent others from accessing it this way.
I plan on replacing the built-in binary for this program with a batch file whose first line calls my own authentication system (implemented in Python, compiled to an .exe with py2exe), and whose second line runs the command that opens the program.
My original plan was to have my script (auth.py) stop the batch file from executing the second line if authentication failed with something like this:
import sys

if not authenticated:
    print('Authentication failed')
    sys.exit(1)
else:
    print('Successfully authenticated!')
I had counted on sys.exit(1) to do this, and I didn't bother testing it until I was done developing the script. Now I realize that sys.exit only exits the Python process.
I either need a way to stop the batch process FROM the Python script, or a way for the batch file to detect the exit code from auth.py (1 if it failed, 0 if it passed) and launch the other program only if the exit code is 0.
If anyone could give me any suggestions or help of any sort, I would really appreciate it.
Thanks!
Use subprocess to call the program on successful authentication. So your python script would launch the program, not a batch file.
import subprocess
import sys

if not authenticated:
    print('Authentication failed')
    sys.exit(1)
else:
    print('Successfully authenticated!')
    proc = subprocess.Popen([program])  # 'program' is the path to the executable
Please note: if the user has permission to start the program from within a Python or batch script, nothing stops them from accessing it directly. This will not keep anyone out of the program, except maybe the extremely non-technical.
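If the batch file still needs to see the program's exit status, a variant of the same idea can wait for the program and pass its exit code back; 'program' is again a placeholder for the executable's path:
import subprocess
import sys

ret = subprocess.call([program])   # runs the program and waits for it to finish
sys.exit(ret)                      # hand its exit code back to the calling batch file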
You could do something really complicated to try to find the parent PID of the Python process and kill that, or you could just check %ERRORLEVEL% in your batch file. Something like:
python auth.py
if %ERRORLEVEL% neq 0 exit /B 1
I found these two methods; I hope they might help:
http://metazin.wordpress.com/2008/08/09/how-to-kill-a-process-in-windows-using-python/ http://code.activestate.com/recipes/347462-terminating-a-subprocess-on-windows/
I have a Python script which is working fine so far. However, my program does not exit properly. I can step through it in the debugger right up to the final return, but the program keeps running.
main.main() does a lot of stuff: it downloads (HTTP, FTP, SFTP, ...) some CSV files from a data provider, converts the data into a standardized file format and loads everything into the database.
This works fine. However, the program does not exit. How can I find out where the program is "waiting"?
There is more than one provider; the script terminates correctly for all providers except one (an SFTP download, for which I'm using paramiko).
if __name__ == "__main__":
main.log = main.log2both
filestoconvert = []
#filestoconvert = glob.glob(r'C:\Data\Feed\ProviderName\download\*.csv')
main.main(['ProviderName'], ['download', 'convert', 'load'], filestoconvert)
I'm happy for any thoughts and ideas!
If your program does not terminate, it most likely means you have a thread still running.
To list all the running threads you can use:
threading.enumerate()
This function lists all Thread objects that are currently alive (see the documentation).
If this is not enough, you might need a short script on top of the following function (see the documentation):
sys._current_frames()
So to print the stack trace of all alive threads you would do something like:
import sys, traceback, threading

thread_names = {t.ident: t.name for t in threading.enumerate()}
for thread_id, frame in sys._current_frames().items():
    print("Thread %s:" % thread_names.get(thread_id, thread_id))
    traceback.print_stack(frame)
    print()
Good luck!
You can invoke the Python debugger for a script.py with
python -m pdb script.py
You find the pdb commands at http://docs.python.org/library/pdb.html#debugger-commands
You'd better use GDB, which lets you pinpoint hung processes, similar to jstack in Java.
This question is 10 years old, but I'm posting my solution for anyone with a similar issue of a non-finishing Python script like mine.
In my case, the debugging process didn't help; all debugging output showed only one thread. But the suggestion by @JC Plessis that some work must still be going on helped me find the cause.
I was using Selenium with the Chrome driver, and I was shutting down the Selenium session after closing the only open tab with
driver.close()
But later I changed the code to use a headless browser, and the Selenium driver was no longer closed by driver.close(), so the Python script was stuck indefinitely. It turns out that the right way to shut down the Selenium driver was actually:
driver.quit()
That solved the problem, and the script was finally finishing again.
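For reference, a minimal sketch of the pattern that ended up working: a headless Chrome driver that is always quit, even if the scraping code raises (this assumes a Selenium version where webdriver.Chrome() accepts an options argument):
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)
try:
    driver.get('https://example.com')   # placeholder for the real scraping work
finally:
    driver.quit()   # shuts down the driver and the browser, unlike driver.close()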
You can use sys.settrace to pinpoint which function blocks. Then you can use pdb to step through it.
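A rough sketch of that approach: install a trace function that prints every function entry, so the last line printed before the hang points at the blocking call.
import sys

def trace_calls(frame, event, arg):
    # Print every function entered; the last line printed shows where execution hangs
    if event == 'call':
        code = frame.f_code
        print('entering %s (%s:%d)' % (code.co_name, code.co_filename, frame.f_lineno))
    return trace_calls

sys.settrace(trace_calls)
# ... then run the code that hangs ...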
I have a Python app running on Linux. It is called every minute from cron. It checks a directory for files and, if it finds one, processes it, which can take several minutes. I don't want the next cron job to pick up the file currently being processed, so I lock it using the code below, which calls portalocker. The problem is it doesn't seem to work: the next cron job manages to get a file handle returned for the file already being processed.
import sys
import portalocker

def open_and_lock(full_filename):
    file_handle = open(full_filename, 'r')
    try:
        portalocker.lock(file_handle,
                         portalocker.LOCK_EX | portalocker.LOCK_NB)
        return file_handle
    except IOError:
        sys.exit(-1)
Any ideas what I can do to lock the file so no other process can get it?
UPDATE
Thanks to #Winston Ewert I checked through the code and found the file handle was being closed way before the processing had finished. It seems to be working now except the second process blocks on portalocker.lock rather than throwing an exception.
After fumbling with many schemes, this works in my case. I have a script that may be executed multiple times simultaneously. I need these instances to wait their turn to read/write to some files. The lockfile does not need to be deleted, so you avoid blocking all access if one script fails before deleting it.
import fcntl

def acquireLock():
    ''' acquire exclusive lock file access '''
    locked_file_descriptor = open('lockfile.LOCK', 'w+')
    fcntl.lockf(locked_file_descriptor, fcntl.LOCK_EX)
    return locked_file_descriptor

def releaseLock(locked_file_descriptor):
    ''' release exclusive lock file access '''
    locked_file_descriptor.close()

lock_fd = acquireLock()
# ... do stuff with exclusive access to your file(s)
releaseLock(lock_fd)
You're using the LOCK_NB flag which means that the call is non-blocking and will just return immediately on failure. That is presumably happening in the second process. The reason why it is still able to read the file is that portalocker ultimately uses flock(2) locks, and, as mentioned in the flock(2) man page:
flock(2) places advisory locks only; given suitable permissions on a file, a process is free to ignore the use of flock(2) and perform I/O on the file.
To fix it you could use the fcntl.flock function directly (portalocker is just a thin wrapper around it on Linux) and catch the exception it raises when the lock cannot be acquired, which tells you whether the lock succeeded.
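A sketch of what that could look like, mirroring the open_and_lock() function from the question (in Python, a failed non-blocking flock surfaces as an exception rather than a return value):
import fcntl
import sys

def open_and_lock(full_filename):
    file_handle = open(full_filename, 'r')
    try:
        # Non-blocking exclusive lock; raises an error if another process holds it
        fcntl.flock(file_handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (OSError, IOError):
        file_handle.close()
        sys.exit(-1)
    return file_handle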
Don't use cron for this. Linux has inotify, which can notify applications when a filesystem event occurs. There is a Python binding for inotify called pyinotify.
Thus, you don't need to lock the file -- you just need to react to IN_CLOSE_WRITE events (i.e. when a file opened for writing was closed). (You also won't need to spawn a new process every minute.)
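A rough sketch of the pyinotify approach; the directory path and the process_file() helper are placeholders for the existing processing code:
import pyinotify

class Handler(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        # Called when a file opened for writing is closed, i.e. fully written
        process_file(event.pathname)   # placeholder for the real processing code

wm = pyinotify.WatchManager()
wm.add_watch('/path/to/incoming', pyinotify.IN_CLOSE_WRITE)
notifier = pyinotify.Notifier(wm, Handler())
notifier.loop()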
An alternative to using pyinotify is incron which allows you to write an incrontab (very much in the same style as a crontab), to interact with the inotify system.
What about manually creating an old-fashioned .lock file next to the file you want to lock?
Just check whether it's there: if not, create it; if it is, exit prematurely. After finishing, delete it.
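A small sketch of that idea; using os.O_CREAT | os.O_EXCL makes creating the lock file atomic, so two processes cannot both believe they created it (the file names are placeholders):
import os
import sys

LOCK_PATH = 'data.csv.lock'   # lock file next to the file being processed

try:
    # Atomic create: fails if the lock file already exists
    fd = os.open(LOCK_PATH, os.O_CREAT | os.O_EXCL)
except FileExistsError:
    sys.exit('another process is already working on the file')
try:
    pass  # ... process the file ...
finally:
    os.close(fd)
    os.remove(LOCK_PATH)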
I think fcntl.lockf is what you are looking for.