Q: Python stops working at end of script in Batch

I'm running a series of python scripts from the Command window via a batch file.
Previously it worked without issue. Now, however, without any change to the code, every time a script finishes I get a "Python.exe has stopped working" error. The scripts have actually completed processing, but I need to close the error window for the batch to proceed.
I've tried adding sys.exit to the ends of the scripts, but that makes no difference. The first script has no issue, but every script after it does.
How do I stop this error from happening?
Batch File
C:\Path\to\Python\ArcGIS64bitversion C:\Path\to\Script1
C:\Path\to\Python\ArcGIS64bitversion C:\Path\to\Script2
C:\Path\to\Python\ArcGIS64bitversion C:\Path\to\Script3
C:\Path\to\Python\ArcGIS64bitversion C:\Path\to\Script4a
C:\Path\to\Python\ArcGIS64bitversion C:\Path\to\Script4b
C:\Path\to\Python\ArcGIS64bitversion C:\Path\to\Script4c
C:\Path\to\Python\ArcGIS64bitversion C:\Path\to\Script4d
C:\Path\to\Python\ArcGIS64bitversion C:\Path\to\Script5
C:\Path\to\Python\ArcGIS64bitversion C:\Path\to\Script6
The Python scripts do all actually complete. Scripts 2-5 all use multiprocessing; script 6 does not use multiprocessing and still experiences the error.
General Script Structure
import statements
global variables
get data statements

def Code():
    try:
        code
        sys.exit
    except:
        print error in text file

def multiprocessing():
    pool = multiprocessing.Pool(32)
    pool.map(Code, listofData)

if main statement:
    try:
        code
        multiprocessing()
        sys.exit
    except:
        print error to text file
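A side note on the sys.exit calls in this structure: written without parentheses, sys.exit is only a reference to the function and does nothing; and even sys.exit() would be swallowed here, since a bare except: catches SystemExit too. A small demonstration:

```python
import sys

def demo():
    sys.exit      # no-op: this only references the function, never calls it
    return "reached"

def demo_call():
    sys.exit(0)   # this actually raises SystemExit
    return "not reached"

print(demo())  # prints "reached"
```

So the sys.exit lines in the scripts above, with or without parentheses, are not what ends the process.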
Script 2 (the first script to error)
import arcpy, fnmatch, os, shutil, sys, traceback
import multiprocessing
from time import strftime
#===========================================================================================
ras_dir = r'C:\Path\to\Input'
working_dir = r'C:\Path\to\Output'
output_dir = os.path.join(working_dir, 'Results')
if not os.path.isdir(output_dir):
    os.mkdir(output_dir)
#===========================================================================================
global input_files1
global raslist
global ras
raslist = []
input_files1 = []
#===========================================================================================
for r, d, f in os.walk(working_dir):
    for inFile in fnmatch.filter(f, '*.shp'):
        input_files1.append(os.path.join(r, inFile))
for r, d, f in os.walk(ras_dir):
    for rasf in fnmatch.filter(f, '*.tif'):
        raslist.append(os.path.join(r, rasf))
ras = raslist[0]
del rasf, raslist

def rasextract(file):
    arcpy.CheckOutExtension("Spatial")
    arcpy.env.overwriteOutput = True
    proj = file.split('.')
    proj = proj[0] + '.' + proj[1] + '.prj'
    arcpy.env.outputCoordinateSystem = arcpy.SpatialReference(proj)
    try:
        filename = str(file)
        filename = filename.split('\\')
        filename = filename[-1]
        filename = filename.split('.')
        filename = filename[0]
        tif_dir = output_dir + '\\' + filename
        os.mkdir(tif_dir)
        arcpy.env.workspace = tif_dir
        arcpy.env.scratchWorkspace = tif_dir
        dname = tif_dir + '\\' + filename + '_ras.tif'
        fname = working_dir + '\\' + filename + '_ras.tif'
        bufname = tif_dir + '\\' + filename + '_rasbuf.shp'
        arcpy.Buffer_analysis(file, bufname, "550 METERS", "FULL", "ROUND", "ALL")
        newras = arcpy.sa.ExtractByMask(ras, bufname)
        newras.save(rasname)
        print "Saved " + filename + " ras"
        sys.exit
    except:
        var = traceback.format_exc()
        x = str(var)
        timecode = strftime("%a, %d %b %Y %H:%M:%S + 0000")
        logfile = open(r'C:\ErrorLogs\Log_Script2_rasEx.txt', "a+")
        ent = "\n"
        logfile.write(timecode + " " + x + ent)
        logfile.close()

def MCprocess():
    pool = multiprocessing.Pool(32)
    pool.map(rasextract, input_files1)

if __name__ == '__main__':
    try:
        arcpy.CheckOutExtension("Spatial")
        ras_dir = r'C:\Path\to\Input'
        working_dir = r'C:\Path\to\Output'
        output_dir = os.path.join(working_dir, 'Results')
        if not os.path.isdir(output_dir):
            os.mkdir(output_dir)
        #=============================================================
        raslist = []
        input_files1 = []
        #=============================================================
        for r, d, f in os.walk(working_dir):
            for inFile in fnmatch.filter(f, '*.shp'):
                input_files1.append(os.path.join(r, inFile))
        for r, d, f in os.walk(ras_dir):
            for demf in fnmatch.filter(f, '*.tif'):
                demlist.append(os.path.join(r, rasf))
        ras = raslist[0]
        del rasf, raslist
        MCprocess()
        sys.exit
    except:
        var = traceback.format_exc()
        x = str(var)
        timecode = strftime("%a, %d %b %Y %H:%M:%S + 0000")
        logfile = open(r'C:\ErrorLogs\Log_Script2_rasEx.txt', "a+")
        ent = "\n"
        logfile.write(timecode + " " + x + ent)
        logfile.close()
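One possibility worth ruling out (an assumption, not a confirmed diagnosis): the script never calls pool.close()/pool.join(), so worker processes may still be alive when the interpreter shuts down, and abandoned pool workers at exit can produce exactly this kind of crash-on-exit. A minimal sketch of the cleanup pattern (square stands in for the real worker function):

```python
import multiprocessing

def square(x):
    # stand-in for the real worker (rasextract in the question)
    return x * x

def MCprocess(data):
    pool = multiprocessing.Pool(2)
    try:
        return pool.map(square, data)
    finally:
        pool.close()  # no more work will be submitted
        pool.join()   # wait for all workers to exit cleanly

if __name__ == '__main__':
    print(MCprocess([1, 2, 3]))  # [1, 4, 9]
```

If the pool is cleaned up explicitly before the script ends, Windows has nothing left to tear down "unexpectedly" at interpreter exit.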
NEW error message
This error was encountered after disabling error reporting.

Windows is catching the error.
Try disabling 'Windows Error Reporting' in the Registry. After that, a traceback/error should be shown. Here you can find instructions on how to disable WER for Windows 10.

Posting this as it is the only web search result I could find matching my error (which was that a script ran flawlessly in IDLE, but threw the "Python has stopped working" error when called from a batch (.bat) file). Full disclosure: I was using shelve, not arcpy.
I think the issue is that you are somehow leaving files 'open', and then when the script ends, Python is forced to clean up the open files in an 'unplanned' fashion. Inside the IDE this is caught and handled, but once in a batch file, the issue bubbles up to give the 'stopped working' error.
Contrast:
f = open("example.txt", "r")
with:
f = open("example.txt", "r")
f.close()
The first will error out from a .bat file; the second will not.
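Going one step further than calling close() manually, a with statement guarantees the file is closed even if an exception is raised mid-read, so nothing is left for the interpreter to clean up at exit. A small illustration:

```python
def read_first_line(path):
    # The with statement guarantees the file is closed when the block
    # exits, whether normally or via an exception mid-read.
    with open(path, "r") as f:
        line = f.readline()
    assert f.closed  # already closed here, no explicit close() needed
    return line
```

Using this pattern for every file (and shelve) handle removes the "unplanned cleanup" at interpreter shutdown entirely.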

Related

How to search and read a text file in a specific folder faster using python

I've written a simple python script to search for a log file in a folder (which has approx. 4 million files) and read the file.
Currently, the average time taken for the entire operation is 20 seconds. I was wondering if there is a way to get the response faster.
Below is my script
import re
import os
import timeit
from datetime import date

log_path = "D:\\Logs Folder\\"
rx_file_name = r"[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12}"
log_search_script = True
today = str(date.today())

while log_search_script:
    try:
        log_search = input("Enter image file name: ")
        file_name = re.search(rx_file_name, log_search).group()
        log_file_name = str(file_name) + ".log"
        print(f"\nLooking for log file '{log_file_name}'...\n")
        pass
    except:
        print("\n ***** Invalid input. Try again! ***** \n")
        continue
    start = timeit.default_timer()
    if log_file_name in os.listdir(log_path):
        log_file = open(log_path + "\\" + log_file_name, 'r', encoding="utf8")
        print('\n' + "--------------------------------------------------------" + '\n')
        print(log_file.read())
        log_file.close()
        print('\n' + "--------------------------------------------------------" + '\n')
        print("Time Taken: " + str(timeit.default_timer() - start) + " seconds")
        print('\n' + "--------------------------------------------------------" + '\n')
    else:
        print("Log File Not Found")
    search_again = input('\nDo you want to search for another log ("y" / "n") ?').lower()
    if search_again[0] == 'y':
        print("======================================================\n\n")
        continue
    else:
        log_search_script = False
Your problem is the line:
if log_file_name in os.listdir(log_path):
This has two problems:
os.listdir will create a huge list, which can take a lot of time (and space...).
The ... in ... part will then go over that huge list linearly and search for the file.
Instead, let your OS do the hard work and "ask for forgiveness, not permission". Just assume the file is there and try to open it. If it is not actually there, an error will be raised, which we catch:
try:
    with open(log_path + "\\" + log_file_name, 'r', encoding="utf8") as log_file:
        print(log_file.read())
except FileNotFoundError:
    print("Log File Not Found")
You can use glob.
import glob
print(glob.glob(directory_path))
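If the try/open style feels too implicit, a direct existence check also avoids scanning the 4-million-entry directory, since os.path.isfile is a single stat() call. A sketch (log_path and log_file_name as in the question; read_log is my name):

```python
import os

def read_log(log_path, log_file_name):
    full_path = os.path.join(log_path, log_file_name)
    # os.path.isfile is a single stat() call -- no directory listing needed
    if os.path.isfile(full_path):
        with open(full_path, "r", encoding="utf8") as f:
            return f.read()
    return None  # caller can print "Log File Not Found"
```

Note that glob.glob, by contrast, still enumerates the directory, so it does not help with the timing problem.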

Incomplete extraction of Zip files using Python

I have a script where I download a bunch of Zip files (150+) from a website and unzip them. I just noticed that the Zip files weren't completely extracting - i.e., there should be 68 files in each directory and there are only 62. The script ran fine with no errors.
Any thoughts? I tried running one Zip file through by itself and it extracted fine. Could the operation be timing out or something? Please forgive my code, I'm new.
I'm running Python 2.7.
import csv, urllib, urllib2, zipfile
from datetime import date

dlList = []
dloadUrlBase = r"https://websoilsurvey.sc.egov.usda.gov/DSD/Download/Cache/SSA/"
dloadLocBase = r"Z:/Shared/Corporate/Library/GIS_DATA/Soils/"
stateDirList = []
countyDirList = []
fileNameList = []
unzipList = []
extractLocList = []
logfile = 'log_{}.txt'.format(date.today())

with open(r'N:\Shared\Service Areas\Geographic Information Systems\Tools and Scripts\Soil_Downloads\FinalListforDownloads.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        stateDirList.append(row['StateDir'])
        countyDirList.append(row['CountyDir'])
        fileNameList.append(row['File_Name'])

for state, county, fileName in zip(stateDirList, countyDirList, fileNameList):
    dloadDir = dloadLocBase + state + r"/" + county + "/" + fileName
    requestURL = dloadUrlBase + fileName
    extractLocList.append(dloadLocBase + state + r"/" + county + "/")
    try:
        urllib.urlretrieve(requestURL, dloadDir)
        print requestURL + " found"
        urllib.urlcleanup()
        unzipList.append(dloadDir)
        f = open(logfile, 'a+')
        f.write(dloadDir + " has been downloaded")
        f.close()
    except:
        pass

for zFile, uzDir in zip(unzipList, extractLocList):
    zip_ref = zipfile.ZipFile(zFile, "r")
    zip_ref.extractall(uzDir)
    zip_ref.close()
Instead of just passing when an error is raised, log or print what the error is. That should indicate what the issue, or set of issues, is.
except Exception as e:
    print e  # or print e.message
Turns out that this was a syncing issue with my network. We use a cloud-based network that syncs with our offices, so somehow not all of the files were being synced and they were getting left in a queue.
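Even with the network issue fixed, the script itself can detect incomplete extractions by comparing the archive's member list against what actually landed on disk. A sketch (extract_and_verify is my name, not from the question):

```python
import os
import zipfile

def extract_and_verify(zip_path, dest_dir):
    """Extract a zip archive and report any members missing on disk afterwards."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        # testzip() returns the name of the first corrupt member, or None
        bad = zf.testzip()
        if bad is not None:
            raise zipfile.BadZipFile("Corrupt member: " + bad)
        zf.extractall(dest_dir)
        missing = [name for name in zf.namelist()
                   if not os.path.exists(os.path.join(dest_dir, name))]
    return missing  # an empty list means every member was written
```

Logging any non-empty missing list would have flagged the 62-of-68 directories immediately instead of silently succeeding.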

How to skip unhashable (corrupt) files while md5 fingerprinting?

The code below makes an md5/metadata fingerprint, but crashes on files with unknown corruption (e.g., files that can be copied, and mostly even opened, but that cannot be hashed or zipped up [to disguise their corruption]).
Question: How can one make this code skip or ignore any and all problem files and just do the rest? Imagine 1 million files on 8 TB. Otherwise, I leave it running with no real-time monitoring of progress, and 2 days later I find out that nothing got hashed because a couple of problem files made the code hang.
Part of the code (see full code below):
def createBasicInfoListFromDisk():
    global diskCompareListDetails, onlyFileNameOnDisk, driveLetter, walk_dir
    walk_dir = os.path.abspath(walk_dir)
    for root, subdirs, files in os.walk(walk_dir, topdown=True, onerror=None, followlinks=True):
        for filename in files:
            file_path = os.path.join(root, filename)
            temp = file_path.split(":")
            driveLetter = temp[0]
            filePathWithoutDriveLetter = temp[1]
            fileSize = os.path.getsize(file_path)
            mod_on = get_last_write_time(file_path)
            print('\t- file %s (full path: %s)' % (filename, file_path))
            print('FileName : {filename} is of size {size} and was modified on{mdt}'.format(filename=file_path, size=fileSize, mdt=mod_on))
            diskCompareListDetails.append("\"" + filePathWithoutDriveLetter + "\",\"" + str(fileSize) + "\",\"" + mod_on + '"')
            onlyFileNameOnDisk.append("\"" + filePathWithoutDriveLetter + "\"")
    return
Error:
FileName : T:\problemtest\problemfile.doc is of size 27136 and was modified on2010-10-10 13:58:32
Traceback (most recent call last):
File "t:\scripts\test.py", line 196, in <module>
createBasicInfoListFromDisk()
File "t:\scripts\test.py", line 76, in createBasicInfoListFromDisk
mod_on = get_last_write_time(file_path)
File "t:\scripts\test.py", line 61, in get_last_write_time
convert_time_to_human_readable = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime))
OSError: [Errno 22] Invalid argument
Full code:
import os
import sys
import time
import datetime
import difflib
import decimal
import hashlib
from pip._vendor.distlib.compat import raw_input

csvListDetails = list()
csvCompareListDetails = list()
diskCompareListDetails = list()
onlyFileNameOnDisk = list()
addedFiles = list()
removedFiles = list()
driveLetter = ""
finalFilesToChange = list()
finalFilesToDelete = list()
changedFiles = list()
csvfilewithPath = "md5.csv"
import shutil
walk_dir = ""

def findAndReadCSVFile(fileName):
    global csvListDetails
    global csvCompareListDetails
    haveIgnoredLine = 0
    foundFile = 0
    try:
        inputFileHandler = open(fileName, "rt", encoding='utf-8')
        update_time = get_last_write_time(fileName)
        print("\n Found md5.csv, last updated on: %s" % update_time)
        foundFile = 1
    except (OSError, IOError, FileNotFoundError):
        print("\n md5.csv not found. Will create a new one.")
        return foundFile
    for line in inputFileHandler:
        if (haveIgnoredLine == 0):
            haveIgnoredLine = 1
            continue
        rowItem = line.replace("\n", "").split('","')
        csvCompareListDetails.append('"' + rowItem[3] + ',"' + rowItem[2] + '","' + rowItem[1] + '"')
        lineDetails = list()
        for detailNum in range(0, len(rowItem)):
            lineDetails.append('"' + (rowItem[detailNum].replace('"', '')) + '"')
        csvListDetails.append(lineDetails)
    inputFileHandler.close()
    return foundFile

def get_last_write_time(filename):
    st = os.stat(filename)
    convert_time_to_human_readable = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime))
    return convert_time_to_human_readable

def createBasicInfoListFromDisk():
    global diskCompareListDetails, onlyFileNameOnDisk, driveLetter, walk_dir
    walk_dir = os.path.abspath(walk_dir)
    for root, subdirs, files in os.walk(walk_dir, topdown=True, onerror=None, followlinks=True):
        for filename in files:
            file_path = os.path.join(root, filename)
            temp = file_path.split(":")
            driveLetter = temp[0]
            filePathWithoutDriveLetter = temp[1]
            fileSize = os.path.getsize(file_path)
            mod_on = get_last_write_time(file_path)
            print('\t- file %s (full path: %s)' % (filename, file_path))
            print('FileName : {filename} is of size {size} and was modified on{mdt}'.format(filename=file_path, size=fileSize, mdt=mod_on))
            diskCompareListDetails.append("\"" + filePathWithoutDriveLetter + "\",\"" + str(fileSize) + "\",\"" + mod_on + '"')
            onlyFileNameOnDisk.append("\"" + filePathWithoutDriveLetter + "\"")
    return

def compareLogAndDiskLists():
    global addedFiles, removedFiles
    diff = difflib.unified_diff(csvCompareListDetails, diskCompareListDetails, fromfile='file1', tofile='file2', lineterm='', n=0)
    lines = list(diff)[2:]
    addedFiles = [line[1:] for line in lines if line[0] == '+']
    removedFiles = [line[1:] for line in lines if line[0] == '-']
    return

def displayInfoForUserInput():
    global finalFilesToChange, finalFilesToDelete
    changedOrNewFileCount = 0
    noLongerExistingFilesCount = 0
    totalSizeOfChange = 0
    for line in addedFiles:
        if line not in removedFiles:
            changedOrNewFileCount = changedOrNewFileCount + 1
            elements = line.replace("\n", "").split('","')
            sizeOfFile = int(elements[1].replace('"', ''))
            totalSizeOfChange = totalSizeOfChange + sizeOfFile
            finalFilesToChange.append(elements[0] + '"')
    for line in removedFiles:
        elements = line.split('","')
        if elements[0] + '"' not in onlyFileNameOnDisk:
            noLongerExistingFilesCount = noLongerExistingFilesCount + 1
            finalFilesToDelete.append(elements[0] + '"')
    GBModSz = decimal.Decimal(totalSizeOfChange) / decimal.Decimal('1073741824')
    print("\n New or modified files on drive: {} (need to hash)".format(changedOrNewFileCount))
    print(" Obsolete lines in md5.csv (files modified or not on drive): {} (lines to delete)".format(noLongerExistingFilesCount))
    print(" {} files ({:.2f} GB) needs to be hashed.".format(changedOrNewFileCount, GBModSz))
    userInput = raw_input("\n Proceed with hash? (Y/N, Yes/No) ")
    if (userInput.strip().upper() == "Y" or userInput.strip().upper() == "YES"):
        print("Continuing Processing...")
    else:
        print("You opted not to continue, Exiting")
        sys.exit()
    return

def processFiles(foundFile):
    if (foundFile == 1):
        oldFileName = walk_dir + "/md5.csv"
        shutil.copy(oldFileName, getTargetFileName(oldFileName))
    BLOCKSIZE = 1048576 * 4
    global changedFiles
    for fileToHash in finalFilesToChange:
        hasher = hashlib.new('md5')
        fileToUse = driveLetter + ":" + fileToHash.replace('"', '')
        with open(fileToUse, 'rb') as afile:
            buf = afile.read(BLOCKSIZE)
            while len(buf) > 0:
                hasher.update(buf)
                buf = afile.read(BLOCKSIZE)
        fileDetails = list()
        fileDetails.append(hasher.hexdigest())
        fileDetails.append(get_last_write_time(fileToUse))
        fileDetails.append(os.path.getsize(fileToUse))
        fileDetails.append(fileToHash)
        changedFiles.append(fileDetails)
    return

def getTargetFileName(oldFileName):
    targetFileName = walk_dir + "/generated_on_" + get_last_write_time(oldFileName).replace(" ", "_").replace("-", "").replace(":", "")
    targetFileName = targetFileName + "__archived_on_" + datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    targetFileName = targetFileName + "__md5.csv"
    return targetFileName

def writeCSVFile(fileName):
    try:
        outputFileHandler = open(fileName, "wt", encoding='utf-8')
        outputFileHandler.write("\"md5Hash\",\"LastWriteTime\",\"Length\",\"FullName\"\n")
        for details in csvListDetails:
            if details[3] in finalFilesToDelete:
                continue
            if details[3] in finalFilesToChange:
                continue
            outputFileHandler.write("{},{},{},{}\n".format(details[0], details[1], details[2], details[3]))
        for details in changedFiles:
            outputFileHandler.write("\"{}\",\"{}\",\"{}\",{}\n".format(details[0], details[1], details[2], details[3]))
        outputFileHandler.close()
    except (OSError, IOError, FileNotFoundError) as e:
        print("ERROR :")
        print("File {} is either not writable or some other error: {}".format(fileName, e))
    return

if __name__ == '__main__':
    walk_dir = raw_input("\n Enter drive or directory to scan: ")
    csvfilewithPath = walk_dir + "/md5.csv"
    print("\n Drive to scan: " + walk_dir)
    foundFile = 0
    foundFile = findAndReadCSVFile(csvfilewithPath)
    createBasicInfoListFromDisk()
    compareLogAndDiskLists()
    displayInfoForUserInput()
    processFiles(foundFile)
    writeCSVFile(csvfilewithPath)
Trying this fix, no luck:
def get_last_write_time(filename):
    try:
        st = os.stat(filename)
        convert_time_to_human_readable = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime))
        return convert_time_to_human_readable
    except OSError:
        pass
    return "ERROR"

def createBasicInfoListFromDisk():
I agree with IMCoins, and I'm very curious as to why the except isn't catching the error.
So the first thing I would do is go to the source where the OSError is being raised and try to catch it explicitly.
def get_last_write_time(filename):
    try:
        st = os.stat(filename)
        convert_time_to_human_readable = time.strftime("%Y-%m-%d %H:%M:%S",
                                                       time.localtime(st.st_mtime))
        return convert_time_to_human_readable
    except OSError:
        pass
    return "ERROR"  # or whatever string you want to add
Updated answer, for the updated post.
As stated earlier, a bare except statement (one with no exception type specified) catches everything. So, in order to do what you want... I'm afraid the possible answers are either:
Make a method that identifies corrupted files and handles them properly.
Make try/except statements that encapsulate every part of your code where there could be an error.
Let me warn you about the second solution, though: sometimes there are system errors that you do not want to hide. I believe you should print the exception that you catch, in order to identify further problems you may encounter.
Just so you know, as you may not: your error is not in a try/except statement. Your error is in (if I copied and pasted properly in my editor) line 196, createBasicInfoListFromDisk(), then line 76, mod_on = get_last_write_time(file_path).
As you also mentioned you are using Python 3.x, I suggest looking into the suppress context manager (https://docs.python.org/3/library/contextlib.html#contextlib.suppress).
I hope this helps.
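For reference, the contextlib.suppress approach mentioned above would look roughly like this (a sketch against the question's get_last_write_time; the "ERROR" sentinel matches the asker's attempted fix):

```python
import os
import time
from contextlib import suppress

def get_last_write_time(filename):
    # Any OSError (including the Errno 22 from the question's traceback)
    # is silently suppressed and the sentinel string is returned instead.
    with suppress(OSError):
        st = os.stat(filename)
        return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime))
    return "ERROR"
```

This is equivalent to the try/except OSError/pass version, just more compact.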

Setting a limit for running time with a Python 'while' loop

I have some questions related to setting a maximum running time in Python. I would like to use pdfminer to convert PDF files to .txt. The problem is that very often some files are impossible to decode and take an extremely long time, so I want to use time.time() to limit the conversion time for each file to 20 seconds. In addition, I run under Windows, so I cannot use the signal module.
I succeeded in running the conversion code with pdfminer.convert_pdf_to_txt() (in my code it is "c"), but I could not integrate time.time() into the while loop. It seems to me that in the following code, the while loop and time.time() do not work.
In summary, I want to:
Convert the PDf file to a .txt file
The time limit for each conversion is 20 seconds. If it runs out of time, throw an exception and save an empty file
Save all the txt files under the same folder
If there are any exceptions/errors, still save the file, but with empty content.
Here is the current code:
import converter as c
import os
import timeit
import time

yourpath = 'D:/hh/'
for root, dirs, files in os.walk(yourpath, topdown=False):
    for name in files:
        t_end = time.time() + 20
        try:
            while time.time() < t_end:
                c.convert_pdf_to_txt(os.path.join(root, name))
                t = os.path.split(os.path.dirname(os.path.join(root, name)))[1]
                a = str(os.path.split(os.path.dirname(os.path.join(root, name)))[0])
                g = str(a.split("\\")[1])
                with open("D:/f/" + g + "&" + t + "&" + name + ".txt", mode="w") as newfile:
                    newfile.write(c.convert_pdf_to_txt(os.path.join(root, name)))
                print "yes"
                if time.time() > t_end:
                    print "no"
                    with open("D:/f/" + g + "&" + t + "&" + name + ".txt", mode="w") as newfile:
                        newfile.write("")
        except KeyboardInterrupt:
            raise
        except:
            for name in files:
                t = os.path.split(os.path.dirname(os.path.join(root, name)))[1]
                a = str(os.path.split(os.path.dirname(os.path.join(root, name)))[0])
                g = str(a.split("\\")[1])
                with open("D:/f/" + g + "&" + t + "&" + name + ".txt", mode="w") as newfile:
                    newfile.write("")
You have the wrong approach.
You define the end time and then enter the while loop only if the current timestamp is lower than the end timestamp (which, at that point, is always True). So the while loop is entered and you get stuck in the converting function.
I would suggest the signal module, which is already included in Python. It allows you to quit a function after n seconds. A basic example can be seen in this Stack Overflow answer.
Your code would be like this:
import converter as c
import os
import timeit
import time
import threading
import thread

yourpath = 'D:/hh/'
for root, dirs, files in os.walk(yourpath, topdown=False):
    for name in files:
        try:
            timer = threading.Timer(5.0, thread.interrupt_main)
            timer.start()  # the timer must be started, or it never fires
            try:
                c.convert_pdf_to_txt(os.path.join(root, name))
            except KeyboardInterrupt:
                print("no")
                with open("D:/f/" + g + "&" + t + "&" + name + ".txt", mode="w") as newfile:
                    newfile.write("")
            else:
                timer.cancel()
                t = os.path.split(os.path.dirname(os.path.join(root, name)))[1]
                a = str(os.path.split(os.path.dirname(os.path.join(root, name)))[0])
                g = str(a.split("\\")[1])
                print("yes")
                with open("D:/f/" + g + "&" + t + "&" + name + ".txt", mode="w") as newfile:
                    newfile.write(c.convert_pdf_to_txt(os.path.join(root, name)))
        except KeyboardInterrupt:
            raise
        except:
            for name in files:
                t = os.path.split(os.path.dirname(os.path.join(root, name)))[1]
                a = str(os.path.split(os.path.dirname(os.path.join(root, name)))[0])
                g = str(a.split("\\")[1])
                with open("D:/f/" + g + "&" + t + "&" + name + ".txt", mode="w") as newfile:
                    newfile.write("")
Just for the future: Four spaces indentation and not too much whitespace ;)
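On Python 3, the same give-up-after-N-seconds idea can be expressed with concurrent.futures instead of the thread module (a sketch; convert is a stand-in for c.convert_pdf_to_txt, and note that with a thread pool the slow conversion keeps running in the background after the timeout rather than being killed):

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def convert(path):
    # stand-in for the slow PDF-to-text conversion
    time.sleep(0.2)
    return "extracted text"

def convert_with_timeout(path, seconds):
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(convert, path)
        try:
            return future.result(timeout=seconds)
        except FutureTimeout:
            return ""  # save an empty file instead, as the question requires
```

If the conversion must actually be aborted (not just abandoned), a ProcessPoolExecutor or a separate multiprocessing.Process that can be terminated is the more thorough option.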

Debug python that wont respect a catch statement

I am trying to run AVG as part of a program. The program is normally executed automatically, so I can't see the standard output from Python.
When I run the program by calling it directly, it works perfectly; however, when I run it via automation, it fails.
It will say in the syslog -> "Starting scan of: xxx", but it never says "Unexpected error" OR "Scan Results". Which means it's failing, but not using the catch statement or reporting the error in the "out" variable.
The offending function:
# Scan File for viruses
# fpath -> fullpath, tname -> filename, tpath -> path to file
def scan(fpath, tname, tpath):
    syslog("Starting scan of: " + tname)
    command = ["avgscan",
               "--report=" + tpath + "scan_result-" + tname + ".txt",
               fpath]
    try:
        out = subprocess.call(command)
        syslog("Scan Results: " + str(out))
    except:
        syslog("Unexpected error: " + sys.exc_info()[0])
    finally:
        syslog("Finished scan()")
Both ideas so far are about the debugging code itself; prior to this, the scan was just a simple subprocess.call(command) with a simple syslog output. The with statement and the try/catch were added to help the debugging.
I suspect the error is actually from the opening of the debug file; with statements do not prevent exceptions from being raised. In fact, they usually raise exceptions of their own.
Note the change of the scope of the try/except block.
# Scan File for viruses
# fpath -> fullpath, tname -> filename, tpath -> path to file
def scan(fpath, tname, tpath):
    syslog("Starting scan of: " + tname)
    command = ["avgscan",
               "--report=" + tpath + "scan_result-" + tname + ".txt",
               fpath]
    try:
        with open(tpath + tname + "-DEBUG.txt", "w") as output:
            out = subprocess.call(command, stdout=output, stderr=output)
        syslog("Scan Results: " + str(out))
    except:
        syslog("Unexpected error: " + sys.exc_info()[0])
    finally:
        syslog("Finished scan()")
So I solved it, insofar as I am no longer using AVG Scan and am using libclamscan instead.
By using a scanner that works directly with Python, the results are faster and the errors are all gone. In case someone comes across this via a search, here is the code I am now using:
import os.path
import pyclamav  # scanfile is a function in pyclamav, so import the package itself

def r_scan(fpath):
    viruslist = []
    if os.path.isfile(fpath):
        viruslist = f_scan(fpath, viruslist)
    for root, subFolders, files in os.walk(fpath):
        for filename in files:
            viruslist = f_scan(
                os.path.join(root, filename), viruslist)
    writeReport(fpath, viruslist)

def f_scan(filename, viruslist):
    result = pyclamav.scanfile(filename)
    if result[0] > 0:
        viruslist.append([result[1], filename])
    return viruslist

def writeReport(fpath, viruslist):
    header = "Scan Results: \n"
    body = ""
    for virusname, filename in viruslist:
        body = body + "\nVirus Found: " + virusname + " : " + filename
    with open(fpath + "-SCAN_RESULTS.txt", 'w') as f:
        f.write(header + body)
