How to handle TimeoutException in Selenium, Python

First of all, I created several functions to use instead of the default "find_element_by_..." methods, plus a login() function that creates the "browser". This is how I use them:
def login():
    browser = webdriver.Firefox()
    return browser

def find_element_by_id_u(browser, element):
    obj = WebDriverWait(browser, 10).until(
        lambda browser: browser.find_element_by_id(element)
    )
    return obj

#########
driver = login()
find_element_by_id_u(driver, 'the_id')
Now I run these tests through Jenkins (and launch them on a virtual machine). When a TimeoutException occurs, the browser session is not killed, so I have to go to the VM manually and kill the Firefox process; Jenkins will not finish its job while a browser process is still alive.
So I faced this problem, and I expect it can be resolved through exception handling.
I tried adding the handler below to my custom functions, but then it is not clear where exactly the exception occurred. Even if I get a line number, it takes me into my custom function, not to the place where it was called:
def find_element_by_id_u(browser, element):
    try:
        obj = WebDriverWait(browser, 1).until(
            lambda browser: browser.find_element_by_id(element)
        )
        return obj
    except TimeoutException:
        print "Timeout Exception for element '{elem}' using find_element_by_id\n".format(elem=element)
        print traceback.format_exc()
        browser.close()
        sys.exit(1)

#########
driver = login()
driver.get(host)
find_element_by_id_u(driver, 'jj_username').send_keys('login' + Keys.TAB + 'passwd' + Keys.RETURN)
This prints the line number of the string "lambda browser: browser.find_element_by_id(element)", which is useless for debugging. In my case the program is nearly 3000 lines long, so I need a proper line number.
Can you please share your experience with this?
PS: I split my program into a few scripts, one of which contains only the Selenium part; that's why I need the login() function, so I can call it from another script and use the returned object there.

Well, after turning this over in my mind for a while, I've found a proper solution.
def login():
    browser = webdriver.Firefox()
    return browser

def find_element_by_id_u(browser, element):
    obj = WebDriverWait(browser, 10).until(
        lambda browser: browser.find_element_by_id(element)
    )
    return obj

#########
try:
    driver = login()
    find_element_by_id_u(driver, 'the_id')
except TimeoutException:
    print traceback.format_exc()
    driver.quit()  # quit(), not close(), so the Firefox process is killed
    sys.exit(1)
It was so obvious that I missed it :(
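Since the root problem here is the Firefox process surviving a TimeoutException, the cleanup can also be guaranteed with a try/finally wrapper instead of calling close() in every handler. Below is a minimal sketch of that pattern only; FakeDriver is a stand-in for webdriver.Firefox() (invented for illustration) so nothing here needs a real browser:

```python
from contextlib import contextmanager

class FakeDriver:
    """Stand-in for webdriver.Firefox(); records whether quit() ran."""
    def __init__(self):
        self.quit_called = False

    def quit(self):
        self.quit_called = True

@contextmanager
def managed_browser(factory=FakeDriver):
    browser = factory()
    try:
        yield browser
    finally:
        browser.quit()  # runs even when a TimeoutException escapes the block

driver_ref = None
try:
    with managed_browser() as driver:
        driver_ref = driver
        raise RuntimeError("simulated TimeoutException")
except RuntimeError:
    pass

print(driver_ref.quit_called)  # → True
```

With a real driver you would pass webdriver.Firefox as the factory; the finally block then quits the browser no matter which exception ends the test.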

Related

Handling website errors with selenium python

I am scraping a website with Selenium and send an alert if something specific happens. Generally my code works fine, but sometimes the website doesn't load the elements, or it shows an error message like: "Sorry, something went wrong! Please refresh the page and try again!" In both cases my script waits until the elements are loaded, but they never do, and then my program just does nothing. I usually use requests and BeautifulSoup for web scraping, so I am not that familiar with Selenium, and I am not sure how to handle these errors: my code doesn't raise an error message, it just waits for elements that will most likely never load. If I manually refresh the page, the program continues to work. My idea would be something like: if it takes more than 10 seconds to load, refresh the page and try again.
My code looks somewhat like this:
def get_data():
    data_list = []
    while len(data_list) < 3:
        try:
            data = driver.find_elements_by_class_name('text-color-main-secondary.text-sm.font-bold.text-left')
            count = len(data)
            data_list.append(data)
            driver.implicitly_wait(2)
            time.sleep(.05)
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            WebDriverWait(driver, 3).until(EC.visibility_of_element_located(
                (By.CLASS_NAME,
                 'text-color-main-secondary.text-sm.font-bold.text-left'.format(str(count + 1)))))
        except TimeoutException:
            break
    text = []
    elements = []
    for i in range(len(data_list)):
        for j in range(len(data_list[i])):
            t = data_list[i][j].text
            elements.append(data_list[i][j])
            for word in t.split():
                if '#' in word:
                    text.append(word)
    return text, elements

option = webdriver.ChromeOptions()
option.add_extension('')
path = ''
driver = webdriver.Chrome(executable_path=path, options=option)
driver.get('')
login(passphrase)
driver.switch_to.window(driver.window_handles[0])
while True:
    try:
        infos, elements = get_data()
        data, message = check_data(infos, elements)
        if data:
            send_alert(message)
        time.sleep(600)
        driver.refresh()
    except Exception as e:
        exception_type, exception_object, exception_traceback = sys.exc_info()
        line_number = exception_traceback.tb_lineno
        print("an exception occurred - {}".format(e) + " in line: " + str(line_number))
You can use try and except to overcome this problem. First, wait for the element with a 10-second timeout; if the element is not present, refresh the page. Here is a basic version of the code:
try:
    # wait up to 10s for the element to load; if it does not, control jumps to the except block
    WebDriverWait(driver, 10).until(EC.visibility_of_element_located(
        (By.CLASS_NAME, 'text-color-main-secondary.text-sm.font-bold.text-left')))
except TimeoutException:
    driver.refresh()
    # locate the element here again
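The refresh-then-retry idea generalizes to a small helper that retries a wait a fixed number of times. The sketch below uses a stub driver and Python's built-in TimeoutError in place of Selenium's TimeoutException, so the control flow can be shown without a browser; wait_with_refresh, StubDriver, and locate are invented names for illustration:

```python
def wait_with_refresh(driver, locate, attempts=3):
    """Call locate(driver); on timeout, refresh the page and retry."""
    last_exc = None
    for _ in range(attempts):
        try:
            return locate(driver)
        except TimeoutError as exc:  # Selenium would raise TimeoutException here
            last_exc = exc
            driver.refresh()
    raise last_exc

class StubDriver:
    """Pretends the element appears only after `fail_times` refreshes."""
    def __init__(self, fail_times):
        self.fail_times = fail_times
        self.refreshes = 0

    def refresh(self):
        self.refreshes += 1

def locate(drv):
    if drv.refreshes < drv.fail_times:
        raise TimeoutError("element not visible yet")
    return "element"

drv = StubDriver(fail_times=2)
print(wait_with_refresh(drv, locate))  # → element
```

With real Selenium, locate would be a lambda wrapping WebDriverWait(...).until(...), and the except clause would catch selenium.common.exceptions.TimeoutException instead.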

How to end a session in Selenium and start a new one?

I'm trying to quit the browser session and start a new one when I hit an exception. Normally I wouldn't do this, but in this specific case it seems to make sense.
def get_info(url):
    browser.get(url)
    try:
        # get page data
        business_type_x = '//*[@id="page-desc"]/div[2]/div'
        business_type = browser.find_element_by_xpath(business_type_x).text
        print(business_type)
    except Exception as e:
        print(e)
        # new session
        browser.quit()
        return get_info(url)
This results in this error: http.client.RemoteDisconnected: Remote end closed connection without response
I expected it to open a new browser window with a new session. Any tips are appreciated. Thanks!
You need to create the driver object again once you quit it. Initialize the driver in the get_info method again.
You can replace webdriver.Firefox() with whatever driver you are using.
def get_info(url):
    browser = webdriver.Firefox()
    browser.get(url)
    try:
        # get page data
        business_type_x = '//*[@id="page-desc"]/div[2]/div'
        business_type = browser.find_element_by_xpath(business_type_x).text
        print(business_type)
    except Exception as e:
        print(e)
        # new session
        browser.quit()
        return get_info(url)
You can also use the close method instead of quit, so that you do not have to recreate the browser object.
def get_info(url):
    browser.get(url)
    try:
        # get page data
        business_type_x = '//*[@id="page-desc"]/div[2]/div'
        business_type = browser.find_element_by_xpath(business_type_x).text
        print(business_type)
    except Exception as e:
        print(e)
        # new session
        browser.close()
        return get_info(url)
The difference between quit and close can be found in the documentation as well:
quit
close
This error message...
http.client.RemoteDisconnected: Remote end closed connection without response
...implies that the WebDriver instance, i.e. browser, was unable to communicate with the browsing context, i.e. the web browser session.
If your use case is to keep trying to invoke the same URL in a loop until the desired element is located, you can use the following solution:
def get_info(url):
    while True:
        browser.get(url)
        try:
            # get page data
            business_type_x = '//*[@id="page-desc"]/div[2]/div'
            business_type = browser.find_element_by_xpath(business_type_x).text
            print(business_type)
            break
        except NoSuchElementException as e:
            print(e)
            continue
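When the session itself has died (which is what the RemoteDisconnected error suggests), the robust variant of this loop creates a fresh driver per attempt and quits the dead one. A sketch of that pattern, with StubSession standing in for webdriver.Firefox and ConnectionError standing in for http.client.RemoteDisconnected (both substitutions are for illustration only):

```python
def fetch_with_new_session(make_driver, url, attempts=3):
    """Create a fresh driver per attempt; quit() it if the session dies."""
    for _ in range(attempts):
        driver = make_driver()
        try:
            return driver.get(url)
        except ConnectionError:  # stands in for http.client.RemoteDisconnected
            driver.quit()
    raise RuntimeError("all attempts failed")

class StubSession:
    """First instance is 'dead' and raises; the second one works."""
    created = 0

    def __init__(self):
        StubSession.created += 1
        self.alive = StubSession.created > 1

    def get(self, url):
        if not self.alive:
            raise ConnectionError("Remote end closed connection without response")
        return "page:" + url

    def quit(self):
        self.alive = False

result = fetch_with_new_session(StubSession, "http://example.com")
print(result)  # → page:http://example.com
```

The key design point is that the driver object is created inside the retry loop, so a dead session never leaks into the next attempt.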

How to put XPath into optional mode

This XPath may sometimes be available and sometimes not.
When reject is found, I run this if statement:
from selenium import webdriver
from selenium.webdriver.support.select import Select
from selenium.webdriver.firefox.options import Options
import time
import bs4
import requests

url = "abc"
options = Options()
options.set_preference("dom.webnotifications.enabled", False)
driver = webdriver.Firefox(executable_path=r"C:\driver\geckodriver.exe", options=options)
driver.get(url)
driver.maximize_window()
reject = driver.find_element_by_xpath("/html/body/div/div/div/main/div/section/div[2]/div[2]/div/ul/a[1]/div[3]/label")
if reject:
    driver.find_element_by_xpath("/html/body/div[1]/div/div/main/div/section/div[2]/div[2]/div/ul/a[1]/div[1]/span/i").click()
    time.sleep(1)
    driver.find_element_by_xpath("/html/body/div[1]/div/div/main/div/section/div[2]/div[2]/div/ul/a[1]/div[1]/div/ul/li[2]").click()
    time.sleep(2)
    driver.find_element_by_xpath("/html/body/div[3]/div/div/div[3]/button[1]").click()
    time.sleep(5)
# The code above blocks the code below from running (when reject does not exist).
neighbourhood = Select(driver.find_element_by_name("Locality"))
neighbourhood.select_by_value("5001641")
But the problem is that if the XPath for the reject variable doesn't exist, find_element_by_xpath raises an error and blocks the code below it.
How can I make this reject lookup optional, so that if the XPath is available the block runs, and if not, it is skipped and the code below still runs?
You could catch the exception. Something like the following:
...
reject = None
try:
    reject = driver.find_element_by_xpath("/html/body/div/div/div/main/div/section/div[2]/div[2]/div/ul/a[1]/div[3]/label")
except NoSuchElementException:
    print("No element found")
if reject:
    ...
If you need this more often, you could create a utility method for it.
def element_visible(xpath):
    try:
        driver.find_element_by_xpath(xpath)
        return True
    except NoSuchElementException:
        return False
A try-except block will do the trick.
reject = None
try:
    reject = driver.find_element_by_xpath("/html/body/div/div/div/main/div/section/div[2]/div[2]/div/ul/a[1]/div[3]/label")
except NoSuchElementException:
    print("An exception occurred")
Whenever the XPath is not found, the print statement is executed, and the code below it runs without an error, unlike before.
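An alternative that avoids exceptions altogether is the plural find_elements_by_xpath, which returns an empty list instead of raising when nothing matches. Sketched below with a stub driver (the dict-backed DOM and the names element_if_present/StubDriver are invented for illustration):

```python
def element_if_present(driver, xpath):
    """Return the first match or None; find_elements never raises for no match."""
    matches = driver.find_elements_by_xpath(xpath)
    return matches[0] if matches else None

class StubDriver:
    """Dict-backed lookup standing in for a real DOM."""
    def __init__(self, dom):
        self.dom = dom

    def find_elements_by_xpath(self, xpath):
        return [self.dom[xpath]] if xpath in self.dom else []

drv = StubDriver({"/html/body//label": "reject-label"})
print(element_if_present(drv, "/html/body//label"))    # → reject-label
print(element_if_present(drv, "/html/body//missing"))  # → None
```

With a real driver, `if element_if_present(driver, xpath):` then reads exactly like the intended optional check, with no try/except at the call site.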

Can't access some content on page using selenium and python

I am using selenium in a python script to login into a website where I can get an authorization key to access their API. I am able to login and navigate to the page where the authorization key is provided, I am using chrome driver for testing so I can see what's going on. When I get to the final page where the key is displayed, I can't find a way to access it. I can't see it in the page source, and when I try to access via the page element outer html, it doesn't print the value shown on the page. Here is a screenshot of what I see in the browser (I'm interested in accessing the content shown in response body):
this is the code snippet I am using to try to access the content:
auth_key = WebDriverWait(sel_browser, 10).until(
    EC.presence_of_element_located((By.XPATH, '//*[@id="responseBodyContent"]')))
print auth_key.get_attribute("outerHTML")
and this is what the print statement returns:
<pre id="responseBodyContent"></pre>
I've also tried:
print auth_key.text
which returns nothing. Is there a way I can extract this key from the page?
It looks like you need a custom wait: wait for the element, and then wait for its text.
First define a class, find the element, and then get the innerHTML of the element; finally, measure the length of that string.
See my example below.
class element_text_not_empty(object):
    def __init__(self, locator):
        self.locator = locator

    def __call__(self, driver):
        try:
            element = driver.find_element(*self.locator)
            if len(element.get_attribute('innerHTML').strip()) > 0:
                return element.get_attribute('innerHTML')
            else:
                return False
        except Exception as ex:
            print("Error while waiting: " + str(ex))
            return False

driver = webdriver.Chrome(chrome_path)
...
...
try:
    print("Start wait")
    result = WebDriverWait(driver, 20).until(
        element_text_not_empty((By.XPATH, '//*[@id="responseBodyContent"]')))
    print(result)
except Exception as ex:
    print("Error: " + str(ex))
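The custom-condition class above is really just a polling loop, and the same idea can be expressed as a plain helper, shown here without Selenium (poll_until, probe, and the key-123 value are invented for illustration; probe stands in for the element lookup):

```python
import time

def poll_until(probe, timeout=2.0, interval=0.05):
    """Call probe() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        value = probe()
        if value:
            return value
        time.sleep(interval)
    raise TimeoutError("condition not met within %.1fs" % timeout)

state = {"calls": 0}

def probe():
    """Pretend the response body is empty for the first two polls."""
    state["calls"] += 1
    return "key-123" if state["calls"] >= 3 else ""

found = poll_until(probe)
print(found)  # → key-123
```

WebDriverWait does essentially this internally: any callable that returns a falsy value is retried until the timeout, which is why returning False from __call__ in the class above keeps the wait going.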
Since the value of responseBodyContent is in JSON format, try this:
authkey_text = json.loads(auth_key.get_attribute('innerHTML'))
print str(authkey_text)

Python Selenium. How to use driver.set_page_load_timeout() properly?

from selenium import webdriver

driver = webdriver.Chrome()
driver.set_page_load_timeout(7)

def urlOpen(url):
    try:
        driver.get(url)
        print driver.current_url
    except:
        return
Then I have a list of URLs and call the above method:
if __name__ == "__main__":
    urls = ['http://motahari.ir/', 'http://facebook.com', 'http://google.com']
    # This order doesn't print anything
    # urls = ['http://facebook.com', 'http://google.com', 'http://motahari.ir/']
    # This order prints https://www.facebook.com/ and https://www.google.co.kr/?gfe_rd=cr&dcr=0&ei=3bfdWdzWAYvR8geelrqQAw&gws_rd=ssl
    for url in urls:
        urlOpen(url)
The problem is that when 'http://motahari.ir/' throws a TimeoutException, 'http://facebook.com' and 'http://google.com' afterwards always throw a TimeoutException too.
The browser keeps waiting for 'motahari.ir/' to load, but the loop just goes on (it doesn't open 'facebook.com', it keeps waiting for 'motahari.ir/') and keeps throwing timeout exceptions.
Initializing a webdriver instance takes long, so I pulled it out of the method, and I think that caused the problem. So should I reinitialize the webdriver instance whenever there is a timeout exception? And how? (Since I initialized the driver outside of the function, I can't reinitialize it in the except block.)
You will just need to clear the browser's cookies before continuing. (Sorry, I missed seeing this in your previous code)
from selenium import webdriver

driver = webdriver.Chrome()
driver.set_page_load_timeout(7)

def urlOpen(url):
    try:
        driver.get(url)
        print(driver.current_url)
    except:
        driver.delete_all_cookies()
        print("Failed")
        return

urls = ['http://motahari.ir/', 'https://facebook.com', 'https://google.com']
for url in urls:
    urlOpen(url)
Output:
Failed
https://www.facebook.com/
https://www.google.com/?gfe_rd=cr&dcr=0&ei=o73dWfnsO-vs8wfc5pZI
P.S. It is not very wise to use try...except... without a clear exception type, as this might mask different unexpected errors.
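The point in that postscript can be made concrete: a bare except swallows everything, including bugs in your own code, while catching the specific exception lets everything else surface. A minimal sketch (PageLoadTimeout, url_open, and fake_get are invented stand-ins for Selenium's TimeoutException, the urlOpen method, and driver.get):

```python
class PageLoadTimeout(Exception):
    """Stand-in for selenium.common.exceptions.TimeoutException."""

def url_open(driver_get, url):
    try:
        return driver_get(url)
    except PageLoadTimeout:
        return None  # the one failure we expect and tolerate
    # any other bug (NameError, TypeError, ...) now propagates
    # instead of being silently swallowed by a bare except

def fake_get(url):
    if "slow" in url:
        raise PageLoadTimeout(url)
    return "loaded:" + url

print(url_open(fake_get, "http://slow.example"))  # → None
print(url_open(fake_get, "http://google.com"))    # → loaded:http://google.com
```

In the real script this means writing `except TimeoutException:` (and keeping the delete_all_cookies() call inside it) rather than a bare `except:`.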
