How to check if the element is present in Sel-python - python

I am trying to automate a page that can sometimes lead to a custom error page. The page doesn't hit this error often, but when it does I want to display a message.
The code that I tried is as follows:
login_actions.enter_username(self.username)
login_actions.enter_password(self.password)
login_actions.login()
error_mesg = driver.find_elements(by=By.XPATH, value="//h2[@class='exception-http']")
if error_mesg:
    print("encountered an error")
else:
    pass  # continue with the actions
With the above code, the web driver tries to find the element first which doesn't appear often and the test case fails.
Can anyone suggest a solution to find the presence of elements?

To handle the possible custom error page, you can wrap the check in a try-except block and catch NoSuchElementException, as in the following solution:
from selenium.common.exceptions import NoSuchElementException

login_actions.enter_username(self.username)
login_actions.enter_password(self.password)
login_actions.login()
try:
    # Note the @ in the XPath attribute test: [@class='...']
    driver.find_element(by=By.XPATH, value="//h2[@class='exception-http']")
    print("encountered an error")
    # steps to leave the error page to continue with the other actions
except NoSuchElementException:
    pass
    # steps to continue with the other actions

Related

Selenium Python Instagram Scraping All Images in a post not working

I am writing a small code to download all images/videos in a post. Here is my code:
import urllib.request as reqq
from selenium import webdriver
import time
browser = webdriver.Chrome("D:\\Python_Files\\Programs\\chromedriver.exe")
browser.get(url)
browser.maximize_window()
url_list = ['https://www.instagram.com/p/CE9CZmsghan/']
img_urls = []
vid_urls = []
img_url = ""
vid_url = ""
for x in url_list:
    count = 0
    browser.get(x)
    while True:
        try:
            elements = browser.find_elements_by_class_name('_6CZji')
            elements[0].click()
            time.sleep(1)
        except:
            count+=1
            time.sleep(1)
            if count == 2:
                break
        try:
            vid_url = browser.find_element_by_class_name('_5wCQW').find_element_by_tag_name('video').get_attribute('src')
            vid_urls.append(vid_url)
        except:
            img_url = browser.find_element_by_class_name('KL4Bh').find_element_by_tag_name('img').get_attribute('src')
            img_urls.append(img_url)
for x in range(len(img_urls)):
    reqq.urlretrieve(img_urls[x],f"D:\\instaimg"+str(x+1)+".jpg")
for x in range(len(vid_urls)):
    reqq.urlretrieve(vid_urls[x],"D:\\instavid"+str(x+1)+".mp4")
browser.close()
This code extracts all the images in the post except the last image. IMO, this code is right. Do you know why this code doesn't extract the last image? Any help would be appreciated. Thanks!
Go to the URL that you're using in the example and open the inspector, and very carefully watch how the DOM changes as you click between images. There are multiple page elements with class KL4Bh because it tracks the previous image, the current image, and the next image.
So doing find_element_by_class_name('KL4Bh') returns the first match on the page.
OK, let's break down this loop and see what is happening:
first iteration
    page opens
    immediately click 'next' to second photo
    grab the first element for class 'KL4Bh' from the DOM
    the first element for that class is the first image (now the 'previous' image)
[... 2, 3, 4 same as 1 ...]
fifth iteration
    look for a "next" button to click
    find no next button
    elements[0] fails with index error
    grab the first element for class 'KL4Bh' from the DOM
    the first element for that class is still the fourth image
sixth iteration
    look for a "next" button to click
    find no next button
    elements[0] fails with index error
    error count exceeds threshold
    exit loop
Try something like this:
n = 0
while True:
    try:
        elements = browser.find_elements_by_class_name('_6CZji')
        elements[0].click()
        time.sleep(1)
    except IndexError:
        n=1
        count+=1
        time.sleep(1)
        if count == 2:
            break
    try:
        vid_url = browser.find_elements_by_class_name('_5wCQW')[n].find_element_by_tag_name('video').get_attribute('src')
        vid_urls.append(vid_url)
    except:
        img_url = browser.find_elements_by_class_name('KL4Bh')[n].find_element_by_tag_name('img').get_attribute('src')
        img_urls.append(img_url)
It will do the same thing as before, except that it now uses find_elements_by_class_name and indexes into the resulting list. When it gets to the last image, the index error from the failed button click also changes the index used for the image lookup from 0 to 1, so it takes the second element (the current image) on the last iteration of the loop.
There are still some serious problems with this code, but it does fix the bug you are seeing. One problem at a time :)
Edit
A few things that I think would improve this code:
When using try-except blocks to catch exceptions/errors, there are a few rules that should almost always be followed:
Name specific exceptions & errors to handle, don't use unqualified except. The reason for this is that by catching every possible error, we actually suppress and obfuscate the source of bugs. The only legitimate reason to do this is to generate a custom error message, and the last line of the except-block should always be raise to allow the error to propagate. It goes against how we typically think of software errors, but when writing code, errors are your friend.
The try-except blocks are also problematic because they are being used as a conditional control structure. Sometimes it seems easier to code like this, but it is usually a sign of incomplete understanding of the libraries being used. I am specifically referring to the block that is checking for a video versus an image, although the other one could be refactored too. As a rule, when doing conditional branching, use an if statement.
Using sleep with Selenium is almost always incorrect, but it's by far the most common pitfall for new Selenium users. What happens is that the developer starts getting errors about missing elements when trying to search the DOM. They correctly conclude that the page was not fully loaded in the browser before Selenium tried to read it. But sleep is not the right approach, because waiting for a fixed time gives no guarantee that the page will be fully loaded. Selenium has a built-in mechanism to handle this, called explicit wait (along with implicit wait and fluent wait). An explicit wait blocks until the condition you specify holds (for example, that the element is visible) or a timeout expires, so your code proceeds only once the element is actually ready.
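To make the difference from a fixed sleep concrete, here is a dependency-free sketch of the polling idea behind an explicit wait. It is not Selenium's implementation: the function name wait_until and its parameters are hypothetical, and in real code Selenium's own WebDriverWait(driver, timeout).until(condition) plays this role.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # short pause between checks, not one long fixed wait


if __name__ == "__main__":
    # Stand-in for "the element only shows up on the third lookup".
    lookups = iter([[], [], ["img"]])
    print(wait_until(lambda: next(lookups), timeout=2.0, poll=0.01))
```

The point of the design is that the code continues as soon as the condition holds and fails loudly if it never does, whereas time.sleep(1) does neither.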

Why does Selenium throw a "StaleElementReference" exception when I'm making a new lookup each time?

Testing a web-app in which I need to run a certain code on each page, so, while the next button is NOT disabled, I execute. When all pages are reviewed, the next button will get disabled, so the loop stops.
Found lots of questions on the topic, but most of them are storing the element in a variable in which case it makes total sense. In my case, driver.find_element_by_class_name("nextPage") is not stored in a variable, so shouldn't the element be located (again and again) at each iteration, getting a fresh element each time?
Now... I have managed to catch the error and continue my test, as you can see below, but I still don't get why the exception is raised, and it's totally random.
from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException
driver = webdriver.Firefox()
i = 1
while True:
    try:
        while driver.find_element_by_class_name("nextPage").is_enabled():
            print('page' + str(i))
            i += 1
            driver.find_element_by_class_name("nextPage").click()
    except StaleElementReferenceException:
        print('An exception to a weird error, continuing loop...')
        continue
    break
The Exception message is this:
selenium.common.exceptions.StaleElementReferenceException: Message: The element reference of is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed
Well, document.readyState only tells you whether the page has finished loading. It does not protect you from stale elements, because it cannot detect AJAX calls or other scripts that are still modifying the page.
A stale element occurs when an element was present and the DOM was then modified or reloaded.
To get past this error, I modified my method to something like this (in C#):
public static void StaleEnterTextExceptionHandler(this IWebDriver driver, By element, string value)
{
    try
    {
        driver.EnterText(element, value);
    }
    catch (Exception e)
    {
        // Retry only when the underlying failure is a stale element reference.
        if (e.GetBaseException() is StaleElementReferenceException)
        {
            StaleEnterTextExceptionHandler(driver, element, value);
        }
        else
            throw; // rethrow, preserving the original stack trace
    }
}
This ensures that I do not get that error again. You can do the same for click, or wrap Selenium's Click() in a new click method with stale-handling functionality.
Hope it helps! let me know if there is anything else on this.
The most probable reason is that the page is not completely loaded when the DOM is being searched, which causes the random failures.
Can you add a wait for the page to refresh? Sample code would be:
wait.until(lambda d: d.execute_script("return document.readyState")=='complete')
I am not a regular Python coder, so you may have to troubleshoot the code a bit (wait is assumed here to be a WebDriverWait instance). The idea is to wait until the page is ready, which is detected when the JavaScript expression "return document.readyState" evaluates to 'complete'.
Just add the above code after i += 1.
Hope it helps!!

How to make a while-loop run as long as an element is present on page using selenium in python?

I'm struggling with creating a while-loop which runs as long as a specific element is present on a website. What I currently have is by no means a solution I'm proud of, but it works. I would however very much appreciate some suggestions on how to change the below:
def spider():
    url = 'https://stackoverflow.com/questions/ask'
    driver.get(url)
    while True:
        try:
            unique_element = driver.find_element_by_class_name("uniqueclass")
            do_something()
        except NoSuchElementException:
            print_data_to_file(entries)
            break
        do_something_else()
As you can see, the first thing I do within the while-loop is to check for a unique element which is only present on pages containing data I'm interested in. Thus, when I reach a page without this information, I'll get the NoSuchElementException and break.
How can I achieve the above without having to make a while True?
driver.find_elements won't throw any error. If the returned list is empty, it means there aren't any more elements.
def spider():
    url = 'https://stackoverflow.com/questions/ask'
    driver.get(url)
    while len(driver.find_elements_by_class_name("uniqueclass")) > 0:
        do_something()
        do_something_else()
You could also use an explicit wait with the expected condition staleness_of; however, you won't be able to execute do_something() while it waits, and it is meant for short waiting periods.

Stop when error

In python, I am using selenium to scrape a webpage. There is a button that I need to repeatedly click until there is no button anymore. So far, I have code like:
count = 20
while count:
    driver.find_elements_by_class_name('buttontext')[0].click()
    count-=1
The problem is that I don't actually know how many times I need to do this - count = 20 is incorrect. What I actually need is for it to keep going until the command encounters an error (IndexError: list index out of range), and then stop. How can I do this?
Follow the EAFP approach - make an endless loop and break it once there is no element found:
from selenium.common.exceptions import NoSuchElementException
while True:
    try:
        button = driver.find_element_by_class_name("buttontext")
        button.click()
    except NoSuchElementException:
        break
You should use the try statement to handle exceptions. It will run your code until an exception is raised. Your code should look something like this:
try:
    while True:
        click_function()
except Exception:  # replace Exception with the specific error you expect
    got_an_error()  # or just a pass
I would follow the Selenium docs' recommendation and use .findElements().
findElement should not be used to look for non-present elements, use
findElements(By) and assert zero length response instead.
buttons = driver.find_elements_by_class_name("buttontext")
while buttons:
    buttons[0].click()
    # might need a slight pause here depending on what your click triggers
    buttons = driver.find_elements_by_class_name("buttontext")
Do you need this?
while True:
    try:
        driver.find_elements_by_class_name('buttontext')[0].click()
    except IndexError:
        break
try runs some code, and except catches the error type that you name.
If you don't name an error type, except catches all errors.

Learning to use Assert and Asserttrue in Python for selenium web driver

I'm trying to make a Python WebDriver script that loads a webpage, then asserts true and runs a print command if a text or object is there, and otherwise just continues running my loop. I'm a noob to Python and have been learning from Learn Python the Hard Way and reading documentation. I've spent the last day or so trying to get my code to find text or elements, but it doesn't feed back any info... Here is my code so far, minus the top part about going to the webpage; I am just stuck on this count loop assert logic.
count = 1000.00
while count < 1000.03:
driver.find_element_by_id("select").clear()
driver.find_element_by_id("select").send_keys(str(count))
time.sleep(1)
driver.find_element_by_id("BP").click()
driver.find_element_by_id("BP").click()
count += 0.01 ## increases count to test
highervalue = driver.find_element_by_link_text("Select Error")
assertTrue(driver.link_contains_match_for("")) ##could also be, ##text_containt_match_for("ex") or driver.assertTrue(element in WEBPAGE)??
print 'Someone is %.7r' %count
else:
print 'I have %.7r' %count
time.sleep(1)
Then the loop starts over again. The issue I am having is that I want to find "Select Error" on the webpage in some kind of form, link, or text, and then, if it is there, print a message, and if not, just continue my loop.
Is it better to use assert/asserttrue, or something like
def is_element_present(self, how, what):
    try:
        self.driver.find_element(by=how, value=what)
    except NoSuchElementException:
        return False
    return True
or
Some other examples I have searched about that could be used:
self.assertTrue(self.is_element_present(By.ID, "FOO"))
self.assertTrue(self.is_element_present(By.TEXT, "BAR"))
self.assertTrue(self.is_text_present("FOO"))
self.assertTrue(self.driver.is_text_present("FOO"))
Can someone let me know how I would write the part when I find something in the webpage and it gives me feedback if found?
Selenium contains methods known as waits to assist with this specific situation. I would read up on the documentation of explicit waits. Also, there is another Stack Overflow question which may be useful to look at.
In addition, and slightly off topic, it is atypical to use asserts the way you have in the code shown. In general, an assert is not a way to check whether a condition holds; it states a condition that must be true for the test to pass, and if it is false the test stops because there is no reason to continue.
Cheers.
Assuming that your "Select Error" message is actually a link_text, you could try using a 'try/except' statement.
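from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.ui import WebDriverWait

try:
    WebDriverWait(driver, 3).until(
        expected_conditions.text_to_be_present_in_element((By.LINK_TEXT, "Select Error"), "Select Error")
    )
    print("Link Found...")
except TimeoutException:
    pass
This code waits up to 3 seconds for the "Select Error" link text, catches the TimeoutException that WebDriverWait raises if the link never appears, and then continues with your code.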
