Debugging strategy for a bug (apparently) affected by timing - python

I'm fairly inexperienced with Python, so I find myself at a loss as to how to approach this bug. I've inherited a Python app that mainly just copies files and folders from one place to another. All the files and folders are on the local machine and the user has full admin rights, so there are no networking or security issues here.
What I've found is that a number of files fail to get copied from one directory to another unless I slow the code down somehow. If I just run the program, it fails; if I step through with a debugger or add print statements to the copy loop, it succeeds. The difference seems to be either the timing of the loop or things moving around in memory.
I've seen this sort of bug in compiled languages before, and it usually indicates either a race condition or memory corruption. However, there is only one thread and no interaction with other processes, so a race condition seems impossible. Memory corruption remains a possibility, but I'm not sure how to investigate it in Python.
Here is the loop in question. As you can see, it's rather simple:
def concat_dirs(dir, subdir):
    return dir + "/" + subdir

for name in fullnames:
    install_name = concat_dirs(install_path, name)
    dirname = os.path.dirname(install_name)
    if not os.path.exists(dirname):
        os.makedirs(dirname)
    shutil.copyfile(concat_dirs(java_path, name), install_name)
That loop usually fails to copy the files unless I either step through it with a debugger or add this statement after the shutil.copyfile line.
print "copied ", concat_dirs(java_path, name), " to ", install_name
If I add that statement or step through in debug, the loop works perfectly and consistently. I'm tempted to say "good enough" with the print statement but I know that's just masking an underlying problem.
I'm not asking you to debug my code because I know you can't; I'm asking for a debug strategy. How do I approach finding this bug?

You do have a race condition: you check for the existence of dirname and then try to create it. That should cause the program to bomb if something unexpected happens, but...
In Python, we say that it's easier to ask forgiveness than permission. Go ahead and create that directory each time, then apologize if it already exists:
import errno
import os
import shutil

for name in fullnames:
    source_name = os.path.join(java_path, name)
    dest_name = os.path.join(install_path, name)
    dest_dir = os.path.dirname(dest_name)
    try:
        os.makedirs(dest_dir)
    except OSError as exc:
        if exc.errno != errno.EEXIST:
            raise
    shutil.copyfile(source_name, dest_name)
I'm not sure how I'd troubleshoot this, other than by trying it the non-racy way and seeing what happens. There may be a subtle filesystem issue that's making this run oddly.
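On Python 3.2 and later, the same ask-forgiveness pattern collapses into a single call, since os.makedirs grew an exist_ok flag. A minimal sketch under the same assumptions about fullnames and the two path variables:

import os
import shutil

for name in fullnames:
    dest_name = os.path.join(install_path, name)
    # exist_ok=True (Python 3.2+) makes makedirs a no-op if the directory exists
    os.makedirs(os.path.dirname(dest_name), exist_ok=True)
    shutil.copyfile(os.path.join(java_path, name), dest_name)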

How to detect cycles in directory traversal

I am using Python on Ubuntu (Linux). I would also like this code to work on modern-ish Windows PCs. I can take or leave Macs, since I have no plan to get one, but I am hoping to make this code as portable as possible.
I have written some code that is supposed to traverse a directory and run a test on all of its subdirectories (and later, some code that will do something with each file, so I need to know how to detect links there too).
I have added a check for symlinks, but I do not know how to protect against hardlinks that could cause infinite recursion. Ideally, I'd like to also protect against duplicate detections in general (root: [A,E], A: [B,C,D], E: [D,F,G], where D is the same file or directory in both A and E).
My current thought is to compare the path from the root directory to the current folder against the resolved path being tested, and to skip the entry as an instance of a cycle if they don't match. However, I think that would take a lot of extra I/O, or it might just retrace the (actually cyclic) path that was just created.
How do I properly detect cycles in my filesystem?
def find(self) -> bool:
    if self._remainingFoldersToSearch:
        current_folder = self._remainingFoldersToSearch.pop()
        if not current_folder.is_symlink():
            contents = current_folder.iterdir()
            try:
                for item in contents:
                    if item.is_dir():
                        if item.name == self._indicator:
                            potentialArchive = [x.name for x in item.iterdir()]
                            if self._conf in potentialArchive:
                                self._archives.append(item)
                                if self._onArchiveReadCallback:
                                    self._onArchiveReadCallback(item)
                        else:
                            self._remainingFoldersToSearch.append(item)
                            self._searched.append(item)
                            if self._onFolderReadCallback:
                                self._onFolderReadCallback(item)
            except PermissionError:
                logging.info("Invalid permissions accessing folder:", exc_info=True)
        return True
    else:
        return False
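One approach worth sketching here: identify every directory by the (st_dev, st_ino) pair from os.stat and skip anything already seen. That catches symlink loops, hard-link duplicates, and shared subtrees in one pass without comparing paths, and on Windows, Python 3.5+ fills st_ino from the file index, so the same idea stays portable. The function below is a minimal standalone sketch, not a drop-in replacement for the find method above; walk_no_cycles and its explicit stack are my own names:

import os

def walk_no_cycles(root):
    # Yield each directory under root exactly once, keyed by (st_dev, st_ino).
    seen = set()
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            st = os.stat(path)  # follows symlinks, so the key is the target's identity
        except OSError:
            continue  # broken link or directory that vanished mid-walk
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue  # reached again via a link: a cycle or duplicate subtree
        seen.add(key)
        yield path
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    # Queue real subdirectories and directory symlinks alike;
                    # the seen set is what prevents infinite recursion.
                    if entry.is_dir():
                        stack.append(entry.path)
        except PermissionError:
            pass  # same policy as the code above: skip unreadable folders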

Can GetFileAttributesExA return stale information on an SMB3 mount?

I'm trying to figure the root cause of an issue in a single-threaded Python program that essentially goes like this (heavily simplified):
# Before running
os.remove(path)

# While running
if os.path.isfile(path):
    with open(path) as fd:
        ...
I'm essentially seeing erratic behavior where isfile (which uses stat, itself using GetFileAttributesExA under the hood in Python 2.7, see here) can return True when the file doesn't exist, failing the next open call.
Since path is on an SMB3 network share, I suspect caching behavior of some kind. Is it possible that GetFileAttributesExA returns stale information?
Reducing SMB client caching from the default (10s) to 0s seems to make the issue disappear:
Set-SmbClientConfiguration -DirectoryCacheLifetime 0 -FileInfoCacheLifetime 0
(Note: The correct fix here is to try opening the file and catch the exception, of course, but I'm puzzled by the issue and would like to understand the root cause.)
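For completeness, the race-free EAFP pattern the note alludes to simply skips the isfile probe; a minimal sketch:

try:
    with open(path) as fd:
        data = fd.read()
except IOError:  # FileNotFoundError on Python 3
    data = None  # the file really is gone; no stat result is trusted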

How to detect a .lock file in a geodatabase

I am very new to Python, but I have written a simple Python script tool for automating the process of updating mosaic datasets at my job. The tool runs great, but sometimes I get the dreaded 9999999 error, or "the geodatabase already exists", when I try to overwrite the data.
The file structure is c:\users\my.name\projects\ImageryMosaic\Alachua_2014\Alachua_2014_mosaic.gdb. After some research, I determined that the lock was being placed on the FGDB whenever I opened the newly created mosaic dataset inside the FGDB to check for errors after running the tool. I would like to be able to overwrite the data instead of having to delete it, so I am using the arcpy.env.overwriteOutput setting in my script. This works fine unless I open the dataset after running the tool. Since other people will be using this tool, I don't want them scratching their heads for hours like me, so it would be nice if the script tool could look for the presence of a .lock file in the geodatabase. That way I could at least provide a statement in the script as to why the tool failed, in lieu of the unhelpful 9999999 error. I know about arcpy.TestSchemaLock, but I don't think that will work in this case since I am not trying to place a lock, and I want to overwrite the FGDB, not edit it.
Late answer, but the function below will check for lock files in a given (gdb) path.
import glob
import os

def lockFileExist(path=None):
    # Look for any .lock file directly inside the given file geodatabase folder
    if path is None:
        raise Exception("Invalid path!")
    return len(glob.glob(os.path.join(path, "*.lock"))) > 0

if lockFileExist(r"D:\sample.gdb"):
    print "Lock file found in gdb. Aborting..."
else:
    print "No lock files found! Ready for processing..."

Ignore exceptions thrown and caught inside a library

The Python standard library and other libraries I use (e.g. PyQt) sometimes use exceptions for non-error conditions. Look at the following excerpt from the function os.get_exec_path(). It uses multiple try statements to catch exceptions that are thrown while trying to find some environment data.
try:
    path_list = env.get('PATH')
except TypeError:
    path_list = None

if supports_bytes_environ:
    try:
        path_listb = env[b'PATH']
    except (KeyError, TypeError):
        pass
    else:
        if path_list is not None:
            raise ValueError(
                "env cannot contain 'PATH' and b'PATH' keys")
        path_list = path_listb

    if path_list is not None and isinstance(path_list, bytes):
        path_list = fsdecode(path_list)
These exceptions do not signify an error and are thrown under normal conditions. When using exception breakpoints for one of these exceptions, the debugger will also break in these library functions.
Is there a way in PyCharm or in Python in general to have the debugger not break on exceptions that are thrown and caught inside a library without any involvement of my code?
In PyCharm, go to Run --> View Breakpoints and check "On raise" and "Ignore library files".
The first option makes the debugger stop whenever an exception is raised, instead of just when the program terminates, and the second option tells PyCharm to ignore exceptions that are raised and handled entirely inside library files, so it breaks mainly in your own code.
The solution was found thanks to CrazyCoder's link to the feature request, which has since been added.
For a while I had a complicated scheme which involved something like the following:
try( Closeable ignore = Debugger.newBreakSuppression() )
{
    ... library call which may throw ...
}   // <-- exception looks like it is thrown here
This allowed me to never be bothered by exceptions that were thrown and swallowed within library calls. If an exception was thrown by a library call and was not caught, then it would appear as if it occurred at the closing curly bracket.
The way it worked was as follows:
Closeable is an interface which extends AutoCloseable without declaring any checked exceptions.
ignore is just a name that tells IntelliJ IDEA to not complain about the unused variable, and it is necessary because silly java does not support try( Debugger.newBreakSuppression() ).
Debugger is my own class with debugging-related helper methods.
newBreakSuppression() was a method which would create a thread-local instance of some BreakSuppression class which would take note of the fact that we want break-on-exception to be temporarily suspended.
Then I had an exception breakpoint with a break condition that would invoke my Debugger class to ask whether it is okay to break, and the Debugger class would respond with a "no" if any BreakSuppression objects were instantiated.
That was extremely complicated, because the VM throws exceptions before my code has loaded, so the filter could not be evaluated during program startup, and the debugger would pop up a dialog complaining about that instead of ignoring it. (I am not complaining about that; I hate silent errors.) So, I had to have a terrible, horrible, do-not-try-this-at-home hack where the break condition would look like this: java.lang.System.err.equals( this ). Normally, this would never return true, because System.err is not equal to a thrown exception, so the debugger would never break. However, when my Debugger class got initialized, it would replace System.err with a class of its own, which provided an implementation of equals(Object) that returned true if the debugger should break. So, essentially, I was using System.err as an eternal global variable.
Eventually I ditched this whole scheme because it was overly complicated and performed very badly: exceptions apparently get thrown very often in the Java software ecosystem, so evaluating an expression on every thrown exception slows everything down tremendously.
This feature is not implemented yet; you can vote for it:
add ability to break (add breakpoint) on exceptions only for my files
There is another SO answer with a solution:
Debugging with pycharm, how to step into project, without entering django libraries
It is working for me, except I still go into the "_pydev_execfile.py" file, but I haven't stepped into other files after adding them to the exclusion in the linked answer.

Dropping a file onto a script to run as argument causes exception in Vista

Edit: OK, I could swear that the way I'd tested it showed that the getcwd was also causing the exception, but now it appears it's just the file creation. When I move the try-except blocks around, it actually does catch it like you'd think it would. So chalk that up to user error.
Original Question:
I have a script I'm working on that I want to be able to drop a file on to have it run with that file as an argument. I checked in this question, and I already have the mentioned registry keys (apparently the Python 2.6 installer takes care of it). However, it's throwing an exception that I can't catch. Running it from the console works correctly, but when I drop a file on it, it throws an exception and then closes the console window. I tried to have it redirect standard error to a file, but it threw the exception before the redirection occurred in the script. With a little testing and some quick eyesight, I saw that it was throwing an IOError when I tried to create the file to write the error to.
import sys
import os

#os.chdir("C:/Python26/Projects/arguments")

try:
    print sys.argv
    raw_input()
    os.getcwd()
except Exception,e:
    print sys.argv + '\n'
    print e
    f = open("./myfile.txt", "w")
If I run this from the console with any or no arguments, it behaves as one would expect. If I run it by dropping a file on it, for instance test.txt, it runs, prints the arguments correctly, then when os.getcwd() is called, it throws the exception, and does not perform any of the stuff from the except: block, making it difficult for me to find any way to actually get the exception text to stay on screen. If I uncomment the os.chdir(), the script doesn't fail. If I move that line to within the except block, it's never executed.
I'm guessing running by dropping the file on it, which according to the other linked question, uses the WSH, is somehow messing up its permissions or the cwd, but I don't know how to work around it.
Seeing as this is probably not Python related, but a Windows problem (I for one could not reproduce the error given your code), I'd suggest attaching a debugger to the Python interpreter when it is started. Since you start the interpreter implicitly by a drag&drop action, you need to configure Windows to auto-attach a debugger each time Python starts. If I remember correctly, this article has the needed information to do that (you can substitute another debugger if you are not using Visual Studio).
Apart from that, I would take a snapshot with ProcMon while dragging a file onto your script, to get an idea of what is going on.
As pointed out in my edit above, the errors were caused by the working directory changing to C:\windows\system32, where the script isn't allowed to create files. I don't know how to get it to not change the working directory when started that way, but was able to work around it like this.
if len(sys.argv) == 1:
    files = [filename for filename in os.listdir(os.getcwd())
             if filename.endswith(".txt")]
else:
    files = [filename for filename in sys.argv[1:]]
Fixing the working directory can be managed this way I guess.
exepath = sys.argv[0]
os.chdir(exepath[:exepath.rfind('\\')])
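A slightly more portable spelling of the same fix, sketched with os.path so it doesn't hand-parse backslashes:

import os
import sys

# Change to the directory containing the script, whatever the platform separator
os.chdir(os.path.dirname(os.path.abspath(sys.argv[0])))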
