if else loop on Python. Checking a class name with Selenium

I have an appointment system where I have to wait until a link becomes available. If the link is available, click it. If not, go back and forward (because the page doesn't let me reload) and check again until it is available.
while True:
    if driver.find_element_by_class_name("linkButton"):
        pass  # do something
    else:
        driver.back()
        driver.forward()
        # check again
The program doesn't throw any error, but when I force the if to be false, the else branch does nothing.
I can't test it with a link that is unavailable, because on the page the link is still available; that's why I force the if to be false.

First of all, the find_element_*() methods don't return true/false; they either return a WebElement instance (which is truthy) or raise NoSuchElementException (or another exception).
The existence check is usually done via find_elements_*() methods which, if an element is not found, would return an empty list, which is falsy:
while True:
    if driver.find_elements_by_class_name("linkButton"):
        pass  # do something
    else:
        driver.refresh()
Note that I think you just meant to refresh() the page instead of going back and forward.
And, you should also add some time delays between the attempts.
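A minimal sketch of the retry-with-delay idea, assuming you want to give up after a fixed number of attempts. The helper name poll_until_found and its parameters are made up for illustration; it only relies on the fact that an empty list is falsy, so it works with anything that returns a list of matches.

```python
import time


def poll_until_found(find, on_miss, attempts=10, delay=2.0, sleep=time.sleep):
    """Call find() until it returns a non-empty list or attempts run out.

    find    -- callable returning a list of matches (empty list = not found)
    on_miss -- callable run after each miss, e.g. a page refresh
    Returns the first match, or None if nothing appeared in time.
    """
    for attempt in range(attempts):
        matches = find()
        if matches:                      # a non-empty list is truthy
            return matches[0]
        if attempt == attempts - 1:      # last try failed: stop here
            break
        on_miss()
        sleep(delay)                     # pause so the site is not hammered
    return None


# Hypothetical usage with a real driver (not executed here):
# from selenium.webdriver.common.by import By
# link = poll_until_found(
#     find=lambda: driver.find_elements(By.CLASS_NAME, "linkButton"),
#     on_miss=driver.refresh,
# )
# if link is not None:
#     link.click()
```

In current Selenium releases the lookup is written driver.find_elements(By.CLASS_NAME, "linkButton"), which is why the usage comment uses that form.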

Related

Selenium Python Instagram Scraping All Images in a post not working

I am writing a small code to download all images/videos in a post. Here is my code:
import urllib.request as reqq
from selenium import webdriver
import time

browser = webdriver.Chrome("D:\\Python_Files\\Programs\\chromedriver.exe")
browser.maximize_window()
url_list = ['https://www.instagram.com/p/CE9CZmsghan/']
img_urls = []
vid_urls = []
img_url = ""
vid_url = ""
for x in url_list:
    count = 0
    browser.get(x)
    while True:
        try:
            elements = browser.find_elements_by_class_name('_6CZji')
            elements[0].click()
            time.sleep(1)
        except:
            count += 1
            time.sleep(1)
            if count == 2:
                break
        try:
            vid_url = browser.find_element_by_class_name('_5wCQW').find_element_by_tag_name('video').get_attribute('src')
            vid_urls.append(vid_url)
        except:
            img_url = browser.find_element_by_class_name('KL4Bh').find_element_by_tag_name('img').get_attribute('src')
            img_urls.append(img_url)
for x in range(len(img_urls)):
    reqq.urlretrieve(img_urls[x], "D:\\instaimg" + str(x + 1) + ".jpg")
for x in range(len(vid_urls)):
    reqq.urlretrieve(vid_urls[x], "D:\\instavid" + str(x + 1) + ".mp4")
browser.close()
This code extracts all the images in the post except the last image. IMO, this code is right. Do you know why this code doesn't extract the last image? Any help would be appreciated. Thanks!

Go to the URL that you're using in the example and open the inspector, and very carefully watch how the DOM changes as you click between images. There are multiple page elements with class KL4Bh because it tracks the previous image, the current image, and the next image.
So doing find_element_by_class_name('KL4Bh') returns the first match on the page.
OK, let's break down this loop and see what is happening:
- first iteration
  - page opens
  - immediately click 'next' to second photo
  - grab the first element for class 'KL4Bh' from the DOM
  - the first element for that class is the first image (now the 'previous' image)
- [... iterations 2, 3, 4 same as 1 ...]
- fifth iteration
  - look for a "next" button to click
  - find no next button
  - elements[0] fails with IndexError
  - grab the first element for class 'KL4Bh' from the DOM
  - the first element for that class is still the fourth image
- sixth iteration
  - look for a "next" button to click
  - find no next button
  - elements[0] fails with IndexError
  - error count exceeds threshold
  - exit loop
Try something like this:
n = 0
while True:
    try:
        elements = browser.find_elements_by_class_name('_6CZji')
        elements[0].click()
        time.sleep(1)
    except IndexError:
        n = 1
        count += 1
        time.sleep(1)
        if count == 2:
            break
    try:
        vid_url = browser.find_elements_by_class_name('_5wCQW')[n].find_element_by_tag_name('video').get_attribute('src')
        vid_urls.append(vid_url)
    except:
        img_url = browser.find_elements_by_class_name('KL4Bh')[n].find_element_by_tag_name('img').get_attribute('src')
        img_urls.append(img_url)
It does the same thing as before, except that it now uses find_elements_by_class_name and indexes into the resulting list. When it reaches the last image, the IndexError from the failed button click also switches the index used for the image lookup from 0 to 1, so the last iteration of the loop takes the second element (the current image).
There are still some serious problems with this code, but it does fix the bug you are seeing. One problem at a time :)
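To see the off-by-one without a browser, here is a small pure-Python simulation of the loop. It is only a sketch: it assumes, as described above, that the DOM holds just the previous, current and next image in that order, and that the post has at least two images. The function name scrape and the fixed flag are made up for illustration.

```python
def scrape(total_images, fixed):
    """Simulate the scraping loop on a carousel of `total_images` images.

    Assumption: the DOM contains only the previous, current and next
    image (in that order), and total_images >= 2.
    """
    current = 1      # image currently shown (1-based)
    grabbed = []     # images the loop would have downloaded
    misses = 0       # how often no "next" button was found
    n = 0            # index into the list of matching image elements
    while True:
        if current < total_images:   # a "next" button exists: click it
            current += 1
        else:                        # elements[0] would raise IndexError
            if fixed:
                n = 1                # the fix: look at the second match
            misses += 1
            if misses == 2:
                break
        # Images present in the DOM after the click attempt
        dom = [i for i in (current - 1, current, current + 1)
               if 1 <= i <= total_images]
        grabbed.append(dom[n])
    return grabbed


print(scrape(5, fixed=False))  # [1, 2, 3, 4, 4]
print(scrape(5, fixed=True))   # [1, 2, 3, 4, 5]
```

The original loop grabs the fourth image twice and never reaches the fifth; with the index switch the last pass picks up the current image instead.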
Edit
A few things that I think would improve this code:
When using try-except blocks to catch exceptions/errors, there are a few rules that should almost always be followed:
Name specific exceptions & errors to handle, don't use unqualified except. The reason for this is that by catching every possible error, we actually suppress and obfuscate the source of bugs. The only legitimate reason to do this is to generate a custom error message, and the last line of the except-block should always be raise to allow the error to propagate. It goes against how we typically think of software errors, but when writing code, errors are your friend.
The try-except blocks are also problematic because they are being used as a conditional control structure. Sometimes it seems easier to code like this, but it is usually a sign of incomplete understanding of the libraries being used. I am specifically referring to the block that is checking for a video versus an image, although the other one could be refactored too. As a rule, when doing conditional branching, use an if statement.
Using sleep with selenium is almost always incorrect, but it's by far the most common pitfall for new selenium users. What happens is that the developer will start getting errors about missing elements when trying to search the DOM. They will correctly conclude that it is because the page was not fully loaded in the browser before selenium tried to read it. But using sleep is not the right approach, because waiting for a fixed time gives no guarantee that the page will be fully loaded. Selenium has a built-in mechanism to handle this, called explicit wait (along with implicit wait and fluent wait). An explicit wait blocks until the condition you specify holds (for example, that the element is visible) and raises a TimeoutException if that never happens within the timeout, so your code does not proceed on a half-loaded page.
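The first two rules might look like this in practice. This is a sketch with made-up helper names (media_url, download_all): branching is done with if on lists such as find_elements() would produce, and the only try/except names the one exception it expects.

```python
def media_url(video_srcs, image_srcs):
    """Return (kind, url) for the current slide.

    video_srcs / image_srcs -- lists of src values collected from the
    matches of a find_elements() call; an empty list means "not there",
    so a plain if is enough and no try/except is needed.
    """
    if video_srcs:
        return "video", video_srcs[0]
    if image_srcs:
        return "image", image_srcs[0]
    raise LookupError("slide contains neither a video nor an image")


def download_all(urls, fetch):
    """Fetch every URL and collect only the failures we anticipate."""
    failed = []
    for url in urls:
        try:
            fetch(url)
        except OSError as exc:   # network or file errors only
            failed.append((url, exc))
        # Anything else (TypeError, NameError, ...) is a bug and propagates.
    return failed
```

With urllib.request.urlretrieve as fetch, connection problems surface as URLError, which is a subclass of OSError, while a typo in your own code still stops the script with a traceback.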

Why does Selenium throw a "StaleElementReference" exception when I'm making a new lookup each time?

Testing a web-app in which I need to run a certain code on each page, so, while the next button is NOT disabled, I execute. When all pages are reviewed, the next button will get disabled, so the loop stops.
Found lots of questions on the topic, but most of them are storing the element in a variable in which case it makes total sense. In my case, driver.find_element_by_class_name("nextPage") is not stored in a variable, so shouldn't the element be located (again and again) at each iteration, getting a fresh element each time?
Now... I have managed to catch the error and continue my test, as you can see below, but I still don't get why the exception is raised, and it's totally random.
from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException

driver = webdriver.Firefox()
i = 1
while True:
    try:
        while driver.find_element_by_class_name("nextPage").is_enabled():
            print('page' + str(i))
            i += 1
            driver.find_element_by_class_name("nextPage").click()
    except StaleElementReferenceException:
        print('An exception to a weird error, continuing loop...')
        continue
    break
The Exception message is this:
selenium.common.exceptions.StaleElementReferenceException: Message: The element reference of is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed

Well, document.readyState only tells you whether the page has finished loading. It does not stop you from getting a stale element, because it cannot detect Ajax calls or other scripts that are still running.
An element becomes stale when it was found and the DOM was then modified or reloaded.
To get past this error, I wrote a method something like this (in C#):
public static void StaleEnterTextExceptionHandler(this IWebDriver driver, By element, string value)
{
    try
    {
        driver.EnterText(element, value);
    }
    catch (Exception e)
    {
        // Retry only when the underlying cause is a stale element reference.
        if (e.GetBaseException() is StaleElementReferenceException)
        {
            StaleEnterTextExceptionHandler(driver, element, value);
        }
        else
        {
            throw; // rethrow, preserving the original stack trace
        }
    }
}
This retries the action whenever the element has gone stale, so the error no longer stops the test. You can do the same for click, or just wrap the Selenium Click() in a new click method with stale handling.
Hope it helps! let me know if there is anything else on this.
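For the Python code in the question, the same retry idea could be sketched as below. The helper retry_on is a made-up name, and unlike the recursive C# version it gives up after a fixed number of attempts. The important detail is that the element lookup happens inside the retried action, so every attempt works on a freshly located element.

```python
def retry_on(exc_type, action, attempts=3):
    """Run action(); retry while it raises exc_type, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exc_type:
            if attempt == attempts:
                raise   # out of attempts: let the last error propagate


# Hypothetical usage with a real driver (not executed here):
# from selenium.common.exceptions import StaleElementReferenceException
# from selenium.webdriver.common.by import By
# retry_on(
#     StaleElementReferenceException,
#     lambda: driver.find_element(By.CLASS_NAME, "nextPage").click(),
# )
```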

The most probable reason is that the page is not completely loaded when the DOM is searched, which causes the random failures.
Please add a wait for the page to finish loading. Sample code:
from selenium.webdriver.support.ui import WebDriverWait
wait = WebDriverWait(driver, 10)  # the 10-second timeout is an arbitrary choice
wait.until(lambda d: d.execute_script("return document.readyState") == 'complete')
I am not a regular Python coder, so you may have to troubleshoot the code a bit. The idea is to wait until the page has finished loading, which is detected by the JavaScript expression document.readyState returning 'complete'.
Just add the wait after i += 1.
Hope it helps!!

Element sometimes appears and sometimes does not, how to continue script either way?

My selenium (python) script is doing some work on a website that is a dashboard, i.e. the actual webpage/link does not change as I interact with elements on this dashboard. I mention this because I considered simply hopping around different links via the driver.get() command to avoid some of the issues I am having right now.
At one point, the script reaches a part of the dashboard that for some reason during some of my test runs has an element and other times this element is not present. I have a few lines that interact with this element as it is in the way of another element that I would like to .click(). So when this element is not present my script stops working. This is a script that will repeat the same actions with small variations at the beginning, so I need to somehow integrate some sort of an 'if' command I guess.
My goal is to write something that does this:
- If this element is NOT present skip the lines of code that interact with it and jump to the next line of action. If the element is present obviously carry on.
day = driver.find_element_by_xpath('/html/body/div[4]/table/tbody/tr[4]/td[2]/a')
ActionChains(driver).move_to_element(day).click().perform()
driver.implicitly_wait(30)
lizard2 = driver.find_element_by_xpath('/html/body/div[5]/div/div[2]/div[1]/img')
ActionChains(driver).move_to_element(lizard2).perform()
x2 = driver.find_element_by_xpath('/html/body/div[5]/div/div[2]/div[2]')
ActionChains(driver).move_to_element(x2).click().perform()
driver.execute_script('window.scrollTo(0, document.body.scrollHeight)')
So the block that starts with 'lizard2' and goes to the ActionChains line is the block that interacts with this element that is sometimes apparent sometimes not as the script goes back and forth doing the task.
The top two lines are just the code that gets me to the phase of the dashboard that has this randomly appearing element. And the scroll command at the end is what follows. As mentioned before I need the middle part to be ignored and continue to the scroll part if this 'lizard2' and 'x2' elements are not found.
I apologize if this is confusing and not really concise, I am excited to hear what you guys think and provide any additional information/details, thank you!

You can simply perform your find_element_by_xpath within a try/except block like so:
from selenium.common.exceptions import NoSuchElementException
try:
    myElement = driver.find_element_by_xpath(...)
    # Element exists, do X
except NoSuchElementException:
    pass  # Element doesn't exist, do Y
Edit: Just like to add that someone in the comments to your question suggested that this method was 'hacky', it's actually very Pythonic and very standard.

private boolean isElementPresent(By by) {
    try {
        driver.findElement(by);
        return true;
    } catch (NoSuchElementException e) {
        return false;
    }
}

How to make a while-loop run as long as an element is present on page using selenium in python?

I'm struggling with creating a while-loop which runs as long as a specific element is present on a website. What I currently have is by no means a solution I'm proud of, but it works. I would however very much appreciate some suggestions on how to change the below:
def spider():
    url = 'https://stackoverflow.com/questions/ask'
    driver.get(url)
    while True:
        try:
            unique_element = driver.find_element_by_class_name("uniqueclass")
            do_something()
        except NoSuchElementException:
            print_data_to_file(entries)
            break
        do_something_else()
As you can see, the first thing I do within the while-loop is to check for a unique element which is only present on pages containing data I'm interested in. Thus, when I reach a page without this information, I'll get the NoSuchElementException and break.
How can I achieve the above without having to make a while True?

driver.find_elements won't throw any error. If the returned list is empty, it means there aren't any more elements.
def spider():
    url = 'https://stackoverflow.com/questions/ask'
    driver.get(url)
    while len(driver.find_elements_by_class_name("uniqueclass")) > 0:
        do_something()
        do_something_else()
You could also use explicit wait with expected condition staleness_of, however you won't be able to execute do_something() and it's used for short waiting periods.

How to test for absence of an element without waiting for a 30 second timeout

I am writing some functional tests and it takes 5 minutes to do some simple single-page testing because the find_element function takes 30 seconds to complete when the element is not found. I need to test for the absence of an element without having to wait for a timeout. I've been searching but so far have not found any alternative to find_element(). Here's my code:
def is_extjs_checkbox_selected_by_id(self, id):
    start_time = time.time()
    find_result = self.is_element_present(By.XPATH, "//*[@id='" + id + "'][contains(@class,'x-form-cb-checked')]")  # This line is S-L-O-W
    self.step(">>>>>> This took " + str(time.time() - start_time) + " seconds")
    return find_result

def is_element_present(self, how, what):
    try:
        self.driver.find_element(by=how, value=what)
    except NoSuchElementException:
        return False
    return True
Thanks.
Well, I followed most of the recommendations here and in the other links, and in the end I still failed to achieve the goal. The code below has exactly the same behavior: it takes 30 seconds when the element is not found.
# Fail
def is_element_present_timeout(self, id_type, id_locator, secs_wait_before_testing):
    start_time = time.time()
    driver = self.driver
    time.sleep(secs_wait_before_testing)
    element_found = True
    try:
        element = WebDriverWait(driver, 0).until(
            EC.presence_of_element_located((id_type, id_locator))
        )
    except:
        element_found = False
    elapsed_time = time.time() - start_time
    self.step("elapsed time : " + str(elapsed_time))
    return element_found
Here's a second approach using the idea of getting all elements
# Fail
def is_element_present_now(self, id_type, id_locator):
    driver = self.driver
    # This line blocks for 30 seconds if the id_locator is not found, i.e. fail
    els = driver.find_elements(By.ID, id_locator)
    the_length = len(els)
    if the_length == 0:
        result = False
    else:
        result = True
    self.step('length=' + str(the_length))
    return result
Note, I've unaccepted previous answer since following poster's advice did not produce a successful result.

Given the code you show in the question, if you have a 30 second wait before Selenium decides that the element is missing, this means you are using implicit waits. You should stop using implicit waits and use explicit waits exclusively. The reason is that sooner or later you'll want to use an explicit wait to control precisely how long Selenium waits. Unfortunately, implicit and explicit waits just don't mix. The details are explained in this answer.
There are two general methods I use to test for the absence of an element, depending on conditions. The one problem that we face when using Selenium to test dynamic applications is the following: at which point can you be certain that the element whose absence you want to check is not going to show up a fraction of a second later than the moment at which you check? For instance, if you perform a test which clicks a button, which launches an Ajax request, which may cause an element indicating an error to be added to the DOM when the request fails, and you check for the presence of the error element immediately after asking Selenium to click the button, chances are you're going to miss the error message, if it so happens that the request fails. You'd have to wait at least a little bit to give the Ajax request a chance to complete. How long depends on your application.
This being said, here are the two methods that I use.
Pair an Absence With a Presence
I pair the absence test with a presence test, use find_elements rather than find_element, and check for a length of zero on the list returned by find_elements.
By "I pair the absence test with a presence test" I mean that I identify a condition like the presence of another element on the page. This condition must be such that if the condition is true, then I do not need to wait to test for the absence: I know that if the condition is true, the element whose absence I want to test must be either finally absent or present on the page. It won't show up a fraction of a second later.
For instance, I have a button that when clicked performs a check through an Ajax call, and once the Ajax call is done puts on the page <p>Check complete</p> and under it adds paragraphs with error messages if there are errors. To test for the case where there is no error, I'd wait for <p>Check complete</p> to come up and only then check for the absence of the error messages. I'd use find_elements to check for these error messages and if the length of the returned list is 0, I know there are none. Checking for these error messages themselves does not have to use any wait.
This also works for cases where there is no Ajax involved. For instance, if a page is generated by the server with or without a certain element, then there is no need to wait and the method described here can be used. The "presence" of the page (that is, the page has finished loading) is the only condition needed to check for the absence of the element you want to check.
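A rough sketch of the pairing for the "Check complete" example, written against any object with a Selenium-style find_elements(by, value) method. Several things here are assumptions for illustration: the helper names wait_for and check_passed, the p.error selector for error paragraphs, and an implicit wait of zero. The strings "xpath" and "css selector" are the values of By.XPATH and By.CSS_SELECTOR, and wait_for is a stdlib stand-in for what WebDriverWait(...).until(...) would do in a real test.

```python
import time


def wait_for(condition, timeout=10.0, poll=0.5):
    """Poll condition() until it is truthy; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


def check_passed(driver, timeout=10.0):
    """Return True if the check completed and produced no error messages."""
    # Presence: block until the "Check complete" paragraph has been added.
    wait_for(
        lambda: driver.find_elements("xpath", "//p[text()='Check complete']"),
        timeout,
    )
    # Absence: the page is final now, so a single immediate lookup suffices.
    # "p.error" is a hypothetical selector for the error paragraphs.
    return len(driver.find_elements("css selector", "p.error")) == 0
```

The wait is spent only on the element that is expected to appear; the absence check itself returns at once.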
Wait for a Timeout
In cases where I cannot benefit from some positive condition to pair with my absence check, then I use an explicit wait:
import selenium.webdriver.support.expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from nose.tools import assert_raises

def test():
    WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, ".foo")))

assert_raises(TimeoutException, test)
In the code above, driver is a Selenium WebDriver instance created previously, timeout is the desired timeout in seconds, and assert_raises is just an example of an assertion that could be used. The assertion passes if TimeoutException is raised.

After researching this and getting nowhere I finally came up with an acceptable solution. It's kind of embarrassingly simple how easy this was to solve. All I had to do was to manipulate the timeout with the implicitly_wait() function, which for some reason I was ignoring/not noticing. I'm using the syntax provided by Louis but it would be even simpler and would work just the same by taking my original is_element_present() function above and adding the driver.implicitly_wait() functions before and after the find_element() function.
# Success
def is_element_present(self, id_type, id_locator):
    driver = self.driver
    element_found = True
    driver.implicitly_wait(1)
    try:
        element = WebDriverWait(driver, 0).until(
            EC.presence_of_element_located((id_type, id_locator))
        )
    except:
        element_found = False
    driver.implicitly_wait(30)
    return element_found
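A variation on the same idea, as a sketch only: a small context manager lowers the implicit wait for one lookup and restores it even if the lookup raises. The names implicit_wait and restore_to are made up, and the caller has to pass the value to restore (30 here, matching the suite's setting above) because the helper does not read the current timeout from the driver.

```python
from contextlib import contextmanager


@contextmanager
def implicit_wait(driver, seconds, restore_to):
    """Temporarily change the driver's implicit wait, then restore it."""
    driver.implicitly_wait(seconds)
    try:
        yield driver
    finally:
        driver.implicitly_wait(restore_to)   # runs even if the body raises


def is_element_present(driver, by, value, restore_to=30):
    """Check for an element without paying the full implicit wait."""
    with implicit_wait(driver, 0, restore_to):
        # find_elements returns an empty list instead of raising
        return len(driver.find_elements(by, value)) > 0
```

Called as is_element_present(driver, By.ID, "some-id"), this returns immediately when the element is missing, and the 30-second implicit wait is back in place for the rest of the test.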

I wrote a simple method in Java:
public boolean isElementPresent(long waitTime, String elementLocation) {
    WebDriverWait wait = new WebDriverWait(driver, waitTime); // you can set the wait time in seconds
    try {
        wait.until(ExpectedConditions.visibilityOfElementLocated(By.xpath(elementLocation)));
    } catch (Exception e) {
        return false;
    }
    return true;
}
(you can see examples in python: http://selenium-python.readthedocs.org/en/latest/waits.html)
