httplib2.IncompleteRead: AttributeError: 'module' object has no attribute 'IncompleteRead' - python

I've been using a script that is no longer maintained, which downloads your entire Google Drive to local storage. I managed to fix a few issues to do with deprecation, and the script seemed to be running smoothly; however, at seemingly random points it breaks, and I receive the following error.
File "drive.py", line 169, in download_file
except httplib2.IncompleteRead:
AttributeError: 'module' object has no attribute 'IncompleteRead'
These are the modules I am using:
import gflags, httplib2, logging, os, pprint, sys, re, time
from apiclient.discovery import build
from oauth2client.file import Storage
from oauth2client.client import AccessTokenRefreshError, flow_from_clientsecrets
from oauth2client.tools import run_flow
And here is the code that is causing the error
if is_google_doc(drive_file):
    try:
        download_url = drive_file['exportLinks']['application/pdf']
    except KeyError:
        download_url = None
else:
    download_url = drive_file['downloadUrl']
if download_url:
    try:
        resp, content = service._http.request(download_url)
    except httplib2.IncompleteRead:
        log( 'Error while reading file %s. Retrying...' % drive_file['title'].replace( '/', '_' ) )
        print 'Error while reading file %s. Retrying...' % drive_file['title'].replace( '/', '_' )
        download_file( service, drive_file, dest_path )
        return False
    if resp.status == 200:
        try:
            target = open( file_location, 'w+' )
        except:
            log( "Could not open file %s for writing. Please check permissions." % file_location )
            print "Could not open file %s for writing. Please check permissions." % file_location
            return False
        target.write( content )
        return True
    else:
        log( 'An error occurred: %s' % resp )
        print 'An error occurred: %s' % resp
        return False
else:
    # The file doesn't have any content stored on Drive.
    return False
I am assuming this error has something to do with losing the connection while downloading, and I am unfamiliar with the httplib2 module.
The full code can be found here.
Thank you in advance to anyone who can shed some light on a possible fix.

I've been updating that drive backup script, and have encountered the same error. I haven't yet worked out what exception is being thrown, but in order to reveal what it is (and allow the script to keep running) I've made the following change:
Remove this:
- except httplib2.IncompleteRead:
- log( 'Error while reading file %s. Retrying...' % drive_file['title'].replace( '/', '_' ) )
Replace it with this:
+ except Exception as e: #httplib2.IncompleteRead: # no longer exists
+ log( traceback.format_exc(e) + ' Error while reading file %s. Retrying...' % drive_file['title'].replace( '/', '_' ) )
This does have the downside that if it encounters an exception consistently, it may enter an endless loop. However, it will then reveal the actual exception being thrown, so the "except:" can be updated appropriately.
This change is visible in the repository here.
If I encounter the error again I'll update this answer with more detail.
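If the underlying problem really is a dropped connection, then on Python 2 the exception that actually surfaces may be httplib.IncompleteRead or socket.error from the standard library (which httplib2 sits on top of), rather than anything defined by httplib2 itself. A minimal sketch of a bounded retry built on that assumption, reusing the log helper and service._http.request call from the question:

import httplib  # on Python 3 this would be http.client
import socket

def request_with_retries(service, drive_file, download_url, attempts=3):
    # Retry a Drive download a few times if the read is cut short.
    for attempt in range(attempts):
        try:
            return service._http.request(download_url)
        except (httplib.IncompleteRead, socket.error) as e:
            log('Error while reading file %s (%s), attempt %d of %d' %
                (drive_file['title'].replace('/', '_'), e, attempt + 1, attempts))
    return None, None

Unlike the recursive retry in the question, this loop gives up after a fixed number of attempts, so a file that consistently fails cannot recurse forever.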

Related

Log file monitoring until specific string is found in a log file

I would like to monitor a log file until a specific string is found in the file, using Jython.
Sample code:
import codecs
import sys
import time

cnt = True
while cnt:
    if 'User TechAdmin is logged out' in open('<File path>/Log_File_Test.log').read():
        if 'ERROR' in open('<File path>/Log_File_Test.log').read():
            raise StandardError("Error Found in File - Please refer log")
        else:
            read_f = ''
            cnt = False
    else:
        time.sleep(3)
The code should monitor the file until it finds "User TechAdmin is logged out". Once that is found, I would like to check whether the string "ERROR" exists in the file; if it does, raise an exception and close the file.
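A sketch of one way to structure that polling loop, assuming the same placeholder log path and marker strings as the question:

import time

LOG_PATH = '<File path>/Log_File_Test.log'  # placeholder path from the question

def wait_for_logout(poll_seconds=3):
    # Poll the log until the logout message appears, then check it for errors.
    while True:
        with open(LOG_PATH) as f:
            contents = f.read()
        if 'User TechAdmin is logged out' in contents:
            if 'ERROR' in contents:
                raise StandardError("Error Found in File - Please refer log")
            return
        time.sleep(poll_seconds)

The with block closes the file after every poll, so there is no separate close step once the error check is done (on Jython 2.5 this needs from __future__ import with_statement; Jython 2.7 supports it natively).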

Python - variable not defined but I think it is

I am pulling images from the Internet Archive as a test of some Python code, and I am incorporating the requests module. My code is as follows (note: not the entire code, just the relevant section):
image_results = []
image_hashes = []
session = requests.Session()
for image in image_list:
    if txtUrl not in image:
        continue
    try:
        self.rslBox.AppendText("[v] Downloading %s" % image + "\n")
        self.rslBox.Refresh()
        response = session.get(image)
    except:
        self.rslBox.AppendText("[!] Failed to download: %s" % image + "\n")
        self.rslBox.Refresh()
        # continue
    if "image" in response.headers['content-type']:
        sha1 = hashlib.sha1(response.content).hexadigest()
        if sha1 not in image_hashes:
            image_hashes.append(sha1)
            image_path = "WayBackImages/%s-%s" % (sha1.image.split("/")[-1])
            with open(image_path, "wb") as fd:
                fd.write(response.content)
            self.rslBox.AppendText("[*] Saved %s" % images + "\n")
            self.rslBox.Refresh()
            info = pyexifinfo.get_json(image_path)
            info[0]['ImageHash'] = sha1
            image_results.append(info[0])
image_results_json = json.dumps(image_results)
data_frame = pandas.read_json(image_results_json)
csv = data_frame.to_csv('results.csv')
self.rslBox.AppendText("[*] Finished writing CSV to results.csv" + '\n')
self.rslBox.Refresh()
return
When I run my code, I get the following message:
Traceback (most recent call last):
File "C:\eclipse-workspace\test\tabbedPage.py", line 136, in OnSearch
if "image" in response.headers['content-type']:
NameError: name 'response' is not defined
But response is defined in the try statement - or so I would think. It only complains about the if "image" line - why?
I am new to python and I am using python3.6 and pydev with Eclipse.
Thanks!
Something inside your try block failed. Your except caught it and handled the error, but since there is no raise in it, execution continues, and response was never set.
It's because you are assigning response inside the try block. If the exception gets thrown, response never gets assigned.
A workaround for this is to put the code that relies on response being defined inside that try block.
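A minimal sketch of that workaround, with the download loop trimmed down and a hypothetical save_image() standing in for the hashing and saving logic from the question:

import requests

session = requests.Session()

for url in image_list:  # image_list as in the question
    try:
        response = session.get(url)
        # everything that needs `response` stays inside the try,
        # so a failed request can never leave it undefined
        if "image" in response.headers.get('content-type', ''):
            save_image(response.content)  # hypothetical helper
    except requests.RequestException:
        print("[!] Failed to download: %s" % url)
        continue  # move on to the next URL instead of touching `response`

Catching requests.RequestException instead of a bare except also avoids hiding unrelated programming errors, which is part of what made the original failure confusing.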

How can I get around this try except block?

I wrote a try/except block that I now realize was a bad idea, because it keeps swallowing exceptions 'blindly', which makes them hard to debug. The problem is that I don't know how to write it another way, short of going through each of the methods that are called, manually reading all the exceptions they can raise, and making a case for each.
How would you structure this code?
def get_wiktionary_audio(self):
    '''function for adding audio path to a definition, this is meant to be run before trying to get a specific URL'''
    #this path is where the audio will be saved, only added the kwarg for testing with a different path
    path = "study_audio/%s/words" % (self.word.language.name)
    try:
        wiktionary_url = "http://%s.wiktionary.org/wiki/FILE:en-us-%s.ogg" % (self.word.language.wiktionary_prefix, self.word.name)
        wiktionary_page = urllib2.urlopen(wiktionary_url)
        wiktionary_page = fromstring(wiktionary_page.read())
        file_URL = wiktionary_page.xpath("//*[contains(concat(' ', @class, ' '), ' fullMedia ')]/a/@href")[0]
        file_number = len(self.search_existing_audio())
        relative_path = '%s/%s%s.ogg' % (path, self.word.name, file_number)
        full_path = '%s/%s' % (settings.MEDIA_ROOT, relative_path)
        os.popen("wget -q -O %s 'http:%s'" % (full_path, file_URL))
    except:
        return False
    WordAudio.objects.create(word=self.word, audio=relative_path, source=wiktionary_url)
    return True
Often, exceptions come with error strings which can be used to pinpoint the problem. You can access this value like so:
try:
    # code block
except Exception as e:
    print str(e)
You can also print what class of exception it is along with any error messages by using the repr method:
try:
    # code block
except Exception as e:
    print repr(e)
One way I like to go about it is to configure Python logging and log the output. This gives you a lot of flexibility in what you do with the log output. The example below logs the exception traceback.
import traceback
import logging

logger = logging.getLogger(__name__)

try:
    ...
except Exception as e:
    logger.exception(traceback.format_exc())  # the traceback
    logger.exception(e)  # just the exception message
First, your code is un-Pythonic. You are using 'self' in what looks like a standalone function, while "self" is usually reserved for methods of a class, so reading the code feels unnatural. Second, my style is to line up the "=" signs for readability. My advice is to start over and use standard Pythonic conventions; you can pick these up by going through Python tutorials.
Raise exceptions early and often, but only when the code genuinely cannot continue. You could also move some of the name assignments outside the try/except block.
def get_wiktionary_audio(self):
    '''function for adding audio path to a definition, this is meant to be run before trying to get a specific URL'''
    #this path is where the audio will be saved, only added the kwarg for testing with a different path
    path = "study_audio/%s/words" % (self.word.language.name)
    try:
        wiktionary_url = "http://%s.wiktionary.org/wiki/FILE:en-us-%s.ogg" % (self.word.language.wiktionary_prefix, self.word.name)
        wiktionary_page = urllib2.urlopen(wiktionary_url)
        wiktionary_page = fromstring(wiktionary_page.read())
        file_URL = wiktionary_page.xpath("//*[contains(concat(' ', @class, ' '), ' fullMedia ')]/a/@href")[0]
        file_number = len(self.search_existing_audio())
        relative_path = '%s/%s%s.ogg' % (path, self.word.name, file_number)
        full_path = '%s/%s' % (settings.MEDIA_ROOT, relative_path)
        os.popen("wget -q -O %s 'http:%s'" % (full_path, file_URL))
    except Exception as e:
        print e
    WordAudio.objects.create(word=self.word, audio=relative_path, source=wiktionary_url)
    return True

Python else issues making an FTP program

I am having an issue with the else statement in this program. I have checked my spacing and it seems to be correct, but I keep getting a syntax error on the else statement. The program creates a file and then attempts to upload it to an FTP server; if the upload fails, it should not say anything to the user and just continue, trying again the next time the program loops. Any help you could provide would be greatly appreciated.
#IMPORTS
import ConfigParser
import os
import random
import ftplib
from ftplib import FTP
#LOOP PART 1
from time import sleep
while True:
    #READ THE CONFIG FILE SETUP.INI
    config = ConfigParser.ConfigParser()
    config.readfp(open(r'setup.ini'))
    path = config.get('config', 'path')
    name = config.get('config', 'name')
    #CREATE THE KEYFILE
    filepath = os.path.join((path), (name))
    if not os.path.exists((path)):
        os.makedirs((path))
    file = open(filepath,'w')
    file.write('text here')
    file.close()
    #Create Full Path
    fullpath = path + name
    #Random Sleep to Accomidate FTP Server
    sleeptimer = random.randrange(1,30+1)
    sleep((sleeptimer))
    #Upload File to FTP Server
    try:
        host = '0.0.0.0'
        port = 3700
        ftp = FTP()
        ftp.connect(host, port)
        ftp.login('user', 'pass')
        file = open(fullpath, "rb")
        ftp.cwd('/')
        ftp.storbinary('STOR ' + name, file)
        ftp.quit()
        file.close()
    else:
        print 'Something is Wrong'
    #LOOP PART 2
    sleep(180.00)
else is valid as part of a try/except block, but it only runs if no exception is raised, and there must be an except clause defined before it.
(edit) Most people skip the else clause entirely and just write the code after exiting (dedenting) from the try/except clauses. A sketch of a fixed version of your upload block follows the tutorial below.
The quick tutorial is:
try:
    # some statements that are executed until an exception is raised
    ...
except SomeExceptionType, e:
    # if some type of exception is raised
    ...
except SomeOtherExceptionType, e:
    # if another type of exception is raised
    ...
except Exception, e:
    # if *any* exception is raised - but this is usually evil because it hides
    # programming errors as well as the errors you want to handle. You can get
    # a feel for what went wrong with:
    traceback.print_exc()
    ...
else:
    # if no exception is raised
    ...
finally:
    # run regardless of whether exception was raised
    ...
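Applied to the question, the SyntaxError goes away once there is an except before the else (or once the else is dropped entirely). A minimal sketch of the upload step that fails silently, as the question asks, and lets the outer while loop retry later, keeping the names from the question:

    try:
        host = '0.0.0.0'
        port = 3700
        ftp = FTP()
        ftp.connect(host, port)
        ftp.login('user', 'pass')
        file = open(fullpath, "rb")
        ftp.cwd('/')
        ftp.storbinary('STOR ' + name, file)
        ftp.quit()
        file.close()
    except ftplib.all_errors:
        # Say nothing and carry on; the outer while loop retries on the next pass.
        pass
    #LOOP PART 2
    sleep(180.00)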

Download big files via FTP with python

I'm trying to download a daily backup file from my server to my local storage server, but I'm running into some problems.
I wrote this code (the irrelevant parts, such as the email function, are removed):
import os
from time import strftime
from ftplib import FTP
import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEBase import MIMEBase
from email.MIMEText import MIMEText
from email import Encoders
day = strftime("%d")
today = strftime("%d-%m-%Y")
link = FTP(ftphost)
link.login(passwd = ftp_pass, user = ftp_user)
link.cwd(file_path)
link.retrbinary('RETR ' + file_name, open('/var/backups/backup-%s.tgz' % today, 'wb').write)
link.delete(file_name) #delete the file from online server
link.close()
mail(user_mail, "Download database %s" % today, "Database sucessfully downloaded: %s" % file_name)
exit()
And I run it from a crontab entry like:
40 23 * * * python /usr/bin/backup-transfer.py >> /var/log/backup-transfer.log 2>&1
It works with small files, but with the backup files (about 1.7 GB) it freezes: the downloaded file reaches about 1.2 GB and then never grows (I waited about a day), and the log file is empty.
Any ideas?
P.S.: I'm using Python 2.6.5
Sorry to answer my own question, but I found a solution.
I tried ftputil with no success, then I tried several approaches and finally this works:
def ftp_connect(path):
    link = FTP(host = 'example.com', timeout = 5) #Keep low timeout
    link.login(passwd = 'ftppass', user = 'ftpuser')
    debug("%s - Connected to FTP" % strftime("%d-%m-%Y %H.%M"))
    link.cwd(path)
    return link

downloaded = open('/local/path/to/file.tgz', 'wb')

def debug(txt):
    print txt

link = ftp_connect(path)
file_size = link.size(filename)
max_attempts = 5 #I dont want death loops.

while file_size != downloaded.tell():
    try:
        debug("%s while > try, run retrbinary\n" % strftime("%d-%m-%Y %H.%M"))
        if downloaded.tell() != 0:
            link.retrbinary('RETR ' + filename, downloaded.write, downloaded.tell())
        else:
            link.retrbinary('RETR ' + filename, downloaded.write)
    except Exception as myerror:
        if max_attempts != 0:
            debug("%s while > except, something going wrong: %s\n \tfile lenght is: %i > %i\n" %
                  (strftime("%d-%m-%Y %H.%M"), myerror, file_size, downloaded.tell())
            )
            link = ftp_connect(path)
            max_attempts -= 1
        else:
            break
debug("Done with file, attempt to download m5dsum")
[...]
In my log file i found:
01-12-2011 23.30 - Connected to FTP
01-12-2011 23.30 while > try, run retrbinary
02-12-2011 00.31 while > except, something going wrong: timed out
file lenght is: 1754695793 > 1754695793
02-12-2011 00.31 - Connected to FTP
Done with file, attempt to download m5dsum
Sadly, I have to reconnect to the FTP server even if the file has been fully downloaded; in my case that is not a problem, because I have to download the md5sum too.
As you can see, I wasn't able to detect the timeout and resume the transfer on the same connection; when I get a timeout, I simply reconnect. If someone knows how to reconnect without creating a new ftplib.FTP instance, let me know ;)
You might try setting the timeout. From the docs:
# timeout in seconds
link = FTP(host=ftp_host, user=ftp_user, passwd=ftp_pass, acct='', timeout=3600)
I implemented code with ftplib which can monitor the connection, reconnect, and re-download the file in case of failure. Details here: How to download big file in python via ftp (with monitoring & reconnect)?
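The core idea in both answers is the same: compare the remote size against the bytes written so far, and resume with the rest argument of retrbinary after a failure. A compact sketch of that pattern, with hypothetical host, credentials, and file names standing in for the real ones:

from ftplib import FTP, all_errors

def download_with_resume(host, user, passwd, remote_name, local_path, max_attempts=5):
    # Download remote_name, resuming from the current local size after each failure.
    with open(local_path, 'wb') as local_file:
        for attempt in range(max_attempts):
            try:
                ftp = FTP(host, timeout=60)
                ftp.login(user, passwd)
                remote_size = ftp.size(remote_name)
                # REST picks the transfer up from however many bytes we already have
                ftp.retrbinary('RETR ' + remote_name, local_file.write,
                               rest=local_file.tell() or None)
                ftp.quit()
                if local_file.tell() == remote_size:
                    return True
            except all_errors:
                continue  # connection dropped or timed out; reconnect on the next attempt
    return False

Some servers only report SIZE correctly in binary mode, so if ftp.size() returns None it may be necessary to send a TYPE I command first.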
