Selenium: avoid ElementNotVisibleException exception - python

I've never studied HTML seriously, so what I'm going to say is probably not quite right.
While writing my Selenium code, I noticed that some buttons on some webpages do not redirect to other pages, but instead change the structure of the current one. From what I've understood, this happens because some JavaScript code modifies it.
So, when I want to get some data which is not present on the first page load, I just have to click the right sequence of buttons to obtain it, right?
The page I want to load is https://watch.nba.com/, and what I want to get is the match list of a given day. The fastest path to get it is to open the calendar:
calendary = driver.find_element_by_class_name("calendar-btn")
calendary.click()
and click the selected day:
page = calendary.find_element_by_xpath("//*[contains(text(), '" + time.strftime("%d") + "')]")
page.click()
running this code, I get this error:
selenium.common.exceptions.ElementNotVisibleException
I read somewhere that the problem is that the page is not correctly loaded, or the element is not visible/clickable, so I tried with this:
wait = WebDriverWait(driver, 10)
page = wait.until(EC.visibility_of_element_located((By.XPATH, "//*[contains(text(), '" + time.strftime("%d") + "')]")))
page.click()
But this time I get this error:
selenium.common.exceptions.TimeoutException
Can you help me to solve at least one of these two problems?

The reason you are getting this behavior is that the page is loaded with iframes (I can count 15 on the main page). Once you click the calendar button, you will need to switch your context either back to the default content or into the specific iframe where the calendar resides. There is plenty of code out there that shows how to switch into and out of an iframe. Hope this helps.
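A minimal sketch of that context switch, using the old-style Selenium 3 bindings from the question. The frame locator here is a placeholder (the page has many iframes, so you'd inspect which one actually holds the calendar), and the day XPath is built the same way as in the question:

```python
import time

# Day-cell XPath, built the same way as in the question.
DAY_XPATH = "//*[contains(text(), '" + time.strftime("%d") + "')]"

def click_calendar_day(driver, timeout=10):
    """Switch into the iframe that holds the calendar, click today's
    cell, then switch back. Requires a live browser session; the frame
    locator below is a placeholder -- inspect the page for the real one."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    wait = WebDriverWait(driver, timeout)
    # Wait until the frame exists, then move the driver's context into it.
    wait.until(EC.frame_to_be_available_and_switch_to_it(
        (By.TAG_NAME, "iframe")))
    # Inside the frame the day cell is now searchable and clickable.
    day = wait.until(EC.element_to_be_clickable((By.XPATH, DAY_XPATH)))
    day.click()
    # Always return to the top-level document afterwards.
    driver.switch_to.default_content()
```

Without the switch, the XPath search runs against the top-level document, where the calendar's elements simply do not exist, which is exactly what produces the TimeoutException.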

Related

Python Selenium Error - StaleElementReferenceException: Message: stale element reference: element is not attached to the page document

Today I'm having trouble with an <a href> button that does not have any ID to identify it, so let me explain the problem a bit more. I have a structure like this one (let's assume XXX is an anonymized path):
wait = WebDriverWait(driver, 5)
el = wait.until(EC.presence_of_element_located((By.ID, 'XXX1')))
entries = el.find_elements_by_tag_name('tr')
for i in range(len(entries)):
    if entries[i].find_element_by_xpath(XXX2).text == compare:
        el = wait.until(EC.element_to_be_clickable((By.ID, XXX3)))
        el.click()
        el = wait.until(EC.presence_of_element_located((By.ID, XXX4)))
        entries2 = el.find_elements_by_tag_name('tr')
        for j in range(len(entries2)):
            # Some statements...
            xpath = ".../a"
            your_element = WebDriverWait(driver, 10)\
                .until(EC.element_to_be_clickable((By.XPATH, xpath)))  # Here the problem
            your_element.click()
I'm getting information from a hybrid page (dynamic and static) using ChromeDriver. Once I get a big table, every row contains a button that shows more info, so I need to click it to open that too. The main problem appears when this operation iterates: the error above is shown. In summary, first I search something and click on the search button; then I get a table where every row (in the last column) has a button that, once opened, shows more information. Consequently I need to open it and close it before moving to the next row. It works with the first row, but with the second one it crashes. I would really appreciate any advice on how to handle this problem.
Thanks in advance!
The problem is that your click changes the DOM within your loop. For me this never worked.
One solution is to re-query within the loop to make sure you are at the correct position. Your third line:
entries = el.find_elements_by_tag_name('tr')
should be executed on every iteration, and with a counter you can make sure you are at the correct position in your <tr> entries.
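A rough sketch of that re-query pattern, iterating by index instead of holding on to row objects that go stale after each click (the table ID and the '.../a' XPath are placeholders from the question):

```python
def click_rows_one_by_one(driver, table_id, timeout=5):
    """Click a button inside each table row, re-locating the table and
    its rows on every pass so the references are fresh after each
    DOM-changing click. Requires a live browser; table_id and the
    row-button XPath are placeholders."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    wait = WebDriverWait(driver, timeout)
    table = wait.until(EC.presence_of_element_located((By.ID, table_id)))
    row_count = len(table.find_elements_by_tag_name('tr'))

    for i in range(row_count):
        # Re-query the table and its rows each time: the click below
        # mutates the DOM, which invalidates previously found elements
        # and causes StaleElementReferenceException on the next loop.
        table = wait.until(EC.presence_of_element_located((By.ID, table_id)))
        rows = table.find_elements_by_tag_name('tr')
        rows[i].find_element_by_xpath(".../a").click()  # placeholder XPath
```

The index is the only thing carried between iterations; every WebElement is looked up fresh, which is what avoids the stale reference.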

Python Selenium unable to locate element when it's there?

I am trying to scrape product's delivery date data from a bunch of lists of product urls.
I am running the Python file in a terminal with multi-processing, so this ultimately opens up multiple Chrome browsers (10 - 15 of them), which slows down my computer quite a bit.
My code basically clicks a block that contains shipping options, which would show a pop-up box with the estimated delivery time. I have included an example of a product url in the code below.
I noticed that some of my Chrome browsers freeze and do not locate and click the element like my code intends. I've incorporated refreshing the page into my code just in case that would do the trick, but it doesn't seem like the frozen browsers are even refreshing.
I don't know why this happens, as I have set the webdriver to wait until the element is clickable. Do I just increase the time in time.sleep() or the seconds in WebDriverWait() to resolve this?
chromedriver = "path to chromedriver"
driver = webdriver.Chrome(chromedriver, options=options)
# Example url
url = "https://shopee.sg/Perfect-Diary-X-Sanrio-MagicStay-Loose-Powder-Weightless-Soft-velvet-Blurring-Face-Powder-With-Cosmetic-Puff-Oil-Control-Smooth-Face-Powder-Waterproof-Applicable-To-Mask-Face-Cinnamoroll-Purin-Gudetama-i.240344695.5946255000?ads_keyword=makeup&adsid=1095331&campaignid=563217&position=1"
driver.get(url)
time.sleep(2)
retries = 0
try:
    WebDriverWait(driver, 60).until(EC.element_to_be_clickable((By.XPATH, '//div[@class="flex flex-column"]//div[@class="shopee-drawer "]'))).click()
    while retries <= 5:
        try:
            shipping_block = WebDriverWait(driver, 60).until(EC.element_to_be_clickable((By.XPATH, '//div[@class="flex flex-column"]//div[@class="shopee-drawer "]'))).click()
            break
        except TimeoutException:
            driver.refresh()
            retries += 1
except (NoSuchElementException, StaleElementReferenceException):
    delivery_date = None
The element you desire will get displayed when you hover the mouse over it. The element's type is svg, which you need to handle accordingly.
You can use this XPath to hover the mouse:
((//*[name()='svg']//*[name()='g'])/*[name()='path'][starts-with(@d, 'm11 2.5c0 .1 0 .2-.1.3l')])
To get the text from the pop-up, check this XPath:
((//div[@class='shopee-drawer__contents']//descendant::div[4]))
You can use get_attribute("innerText") to get all the values.
You can check the same answer here; I hope it will help.
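Put together, the hover-then-read idea above might look like this. The two XPaths are the ones suggested in this answer (treat them as starting points, not guarantees), and a live browser session on the product page is assumed:

```python
# XPaths quoted from the answer above -- starting points, not guarantees.
ARROW_XPATH = ("((//*[name()='svg']//*[name()='g'])/*[name()='path']"
               "[starts-with(@d, 'm11 2.5c0 .1 0 .2-.1.3l')])")
POPUP_XPATH = "((//div[@class='shopee-drawer__contents']//descendant::div[4]))"

def read_delivery_popup(driver, timeout=10):
    """Hover over the svg arrow so the delivery pop-up renders, then
    read its innerText. Requires a live browser on the product page."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    wait = WebDriverWait(driver, timeout)
    # The svg path is present but only triggers the pop-up on hover.
    arrow = wait.until(EC.presence_of_element_located((By.XPATH, ARROW_XPATH)))
    ActionChains(driver).move_to_element(arrow).perform()
    # Now the pop-up contents should become visible and readable.
    popup = wait.until(EC.visibility_of_element_located((By.XPATH, POPUP_XPATH)))
    return popup.get_attribute("innerText")
```

innerText is used rather than .text because it also captures text the driver considers hidden mid-animation.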
First, don't use the headless option in webdriver.ChromeOptions(), so the browser window pops up and you can observe whether the element actually gets clicked.
Second, your code just clicks the element; it doesn't GET anything. After the click opens the new window, it should do something like this:
items = WebDriverWait(driver, 60).until(
    EC.visibility_of_all_elements_located((By.CLASS_NAME, 'load-done'))
)
for item in items:
    deliver_date = item.get_attribute('text')
    print(deliver_date)

Getting an element by attribute and using driver to click on a child element in webscraping - Python

Webscraping results from Indeed.com
-Searching 'Junior Python' in 'Los Angeles, CA' (Done)
-Sometimes a popup window opens. Close the window if the popup occurs. (Done)
-The top 3 results are sponsored, so skip these and go to the real results
-Click on the result summary section, which opens up a side panel with the full summary
-Scrape the full summary
-When the result summary is clicked, the url changes. Rather than opening a new window, I would like to scrape the side panel's full summary
-Each real result is under ('div':{'data-tn-component':'organicJob'}). I am able to get the job title, company, and short summary using BeautifulSoup. However, I would like to get the full summary in the side panel.
Problem
1) When I try to click on the link (using Selenium; the job title or the short summary, which opens up the side panel), the code only ends up clicking on the 1st link, which is the 'sponsored' one. I am unable to locate and click on a real result under id='jobOrganic'.
2) Once a real result is clicked (manually), I can see that the full summary side panel is under <td id='auxCol'> and, within this, under another element. The full summary is contained within the <p> tags. When I try to have Selenium scrape the full summary using findAll('div', {'id':'vjs-desc'}), all I get is a blank result [].
3) The url also changes when the side panel is opened. I tried using Selenium to have the driver get the new url and then soup the url to get results, but all I'm getting is the 1st sponsored result, which is not what I want. I'm not sure why BeautifulSoup keeps getting the result for sponsored, even when I'm running the code under the id='jobOrganic' real results.
Here is my code. I've been working on this for almost two days now; I went through Stack Overflow, documentation, and Google, but was unable to find the answer. I'm hoping someone can point out what I'm doing wrong and why I'm unable to get the full summary.
Thanks and sorry for this being so long.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup as bs
url = 'https://www.indeed.com/'
driver = webdriver.Chrome()
driver.get(url)
whatinput = driver.find_element_by_id('text-input-what')
whatinput.send_keys('Junior Python')
whereinput = driver.find_element_by_id('text-input-where')
whereinput.click()
whereinput.clear()
whereinput.send_keys('Los Angeles, CA')
findbutton = driver.find_element_by_xpath('//*[@id="whatWhere"]/form/div[3]/button')
findbutton.click()
try:
    popup = driver.find_element_by_id('prime-popover-close-button')
    popup.click()
except:
    pass
This is where I'm stuck. The result summary is under {'data-tn-component':'organicJob'}, span class='summary'. Once I click on this, the side panel opens up.
soup = bs(driver.page_source, 'html.parser')
contents = soup.findAll('div', {"data-tn-component": "organicJob"})
for each in contents:
    summary = driver.find_element_by_class_name('summary')
    summary.click()
This opens the side panel, but it clicks the first sponsored link on the whole page, not the real result. This basically goes outside the 'organicJob' result set for some reason.
url = driver.current_url
driver.get(url)
I tried to set the new url after clicking on the link (sponsored) to test whether I can even get the side panel full summary (albeit sponsored, for test purposes).
soup=bs(driver.page_source,'html.parser')
fullsum = soup.findAll('div',{"id":"vjs-desc"})
print(fullsum)
This actually prints out the full summary of the side panel, but it keeps printing the same 1st result over and over through the whole loop, instead of moving on to the next one.
The problem is that you are fetching divs using BeautifulSoup but clicking using Selenium, which is not aware of your collected divs.
Since you are using the find_element_by_class_name() method of the driver object, it searches the whole page instead of your intended div object (each, in the for loop). Thus, it ends up fetching the same first result from the whole page in each iteration.
One quick workaround is possible using only Selenium (this will be slower, though):
elements = driver.find_elements_by_tag_name('div')
for element in elements:
    # get_attribute() returns None when the attribute is absent, so guard it
    if "organicJob" in (element.get_attribute("data-tn-component") or ""):
        summary = element.find_element_by_class_name('summary')
        summary.click()
The above code will search for all the divs and iterate over them to find divs whose data-tn-component attribute contains organicJob. Once it finds one, it will search for the element with class name summary and click on it.
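If the slowness matters, a possible refinement (sketched here, not taken from the original answer) is to let Selenium match the attribute directly with a CSS selector instead of walking every div on the page:

```python
def click_organic_summaries(driver):
    """Click the summary of each organic result. The CSS attribute
    selector narrows the search on the browser side, and scoping the
    summary lookup to the result element (not the driver) is what
    keeps each click on the right job card."""
    results = driver.find_elements_by_css_selector(
        'div[data-tn-component="organicJob"]')
    for result in results:
        result.find_element_by_class_name('summary').click()
```

The key point either way is that the inner find_element call is made on the result element, not on the driver, so it cannot drift back to the first sponsored link.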

python selenium: how to get first hidden button's url?

The page that I worked on has an invisible, hidden options button.
* download sample video on the page (the button is hidden by the html)
[button1] (<- LINK_TEXT is 'button1')
[button2]
[button3]
So, I used 'EC.element_to_be_clickable'.
This code is working, but this approach cannot be used if I don't know the button's LINK_TEXT. The LINK_TEXT is different for each page.
I want to get only the video's first link url (e.g. button1).
_sDriver = webdriver.Firefox()
_sDriver.get('www.example.com/video')
wait = WebDriverWait(_sDriver, 10)
download_menu = _sDriver.find_element_by_id("download-button")
action = ActionChains(_sDriver)
action.move_to_element(download_menu).perform()
documents_url = wait.until(EC.element_to_be_clickable((By.LINK_TEXT, "button1"))).get_attribute('href')
My code gets the url of 'button1', but if I don't know the 'button1' text, how can I get the first hidden button's url using Python?
Thanks for your help.
First of all, by "button" I assume you mean an a element in this case.
And, since the button is hidden, element_to_be_clickable would not work; use presence_of_element_located instead. To get the very first a element, use the "by tag name" locator:
documents_url = wait.until(EC.presence_of_element_located((By.TAG_NAME, "a"))).get_attribute('href')
There could be a better way to locate the element; without seeing the actual HTML of the mentioned "button" elements, it is difficult to tell.
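One guess at such a better way: scope the lookup to the download menu itself, so an unrelated a elsewhere on the page cannot win the "first a" race. This reuses the download-button id from the question's code and assumes a live browser session:

```python
def first_download_url(driver, timeout=10):
    """Hover over the download button, then return the href of the
    first link inside it, whatever its link text is. Requires a live
    browser session; 'download-button' is the id from the question."""
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.support.ui import WebDriverWait

    menu = driver.find_element_by_id("download-button")
    ActionChains(driver).move_to_element(menu).perform()
    # Searching from `menu` (not `driver`) keeps the match inside the
    # menu. WebDriverWait ignores NoSuchElementException by default,
    # so this polls until the first <a> child appears.
    link = WebDriverWait(driver, timeout).until(
        lambda d: menu.find_element_by_tag_name("a"))
    return link.get_attribute("href")
```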

Selenium internal id of element don't change when ajax executed

I use this great solution for waiting until the page loads completely.
But for one page it doesn't work:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("https://vodafone.taleo.net/careersection/2a/jobsearch.ftl")
element = driver.find_element_by_xpath(".//*[@id='currentPageInfo']")
print element.id, element.text
driver.find_element_by_xpath(".//a[@id='next']").click()
element = driver.find_element_by_xpath(".//*[@id='currentPageInfo']")
print element.id, element.text
Output:
{52ce3a9f-0efb-49e1-be86-70446760e422} 1 - 25 of 1715
{52ce3a9f-0efb-49e1-be86-70446760e422} 26 - 50 of 1715
How to explain this behavior?
P.S.
With PhantomJS the same thing occurs
Selenium lib version 2.47.1
Edit
The page uses ajax calls.
This solution is used in tasks similar to the one described in this article.
Without the HTML one can only guess:
Reading the linked answer, the behaviour you observe is most probably because clicking the "next" button does not load the whole page again, but only makes an ajax call or something similar and fills an already existing table with new values.
This means the whole page stays "the same", and thus the "current page info" element still has the same id (just some JavaScript changed its text value).
To check this you can do the following:
Write another test method identical to the one you have, but this time replace this line:
driver.find_element_by_xpath(".//a[@id='next']").click()
with this line:
driver.refresh()
If it now gives you different ids, then I'm pretty sure my guess is correct.
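The suggested check, sketched in Python (driver.refresh() is the Python binding's spelling of the Java driver.navigate().refresh(); a live browser session on the same page is assumed):

```python
def ids_before_and_after_refresh(driver):
    """Compare Selenium's internal element id before and after a full
    page refresh. A refresh rebuilds the DOM, so the two ids should
    differ -- unlike the ajax pagination in the question, where the
    element object survives and only its text changes."""
    element = driver.find_element_by_xpath(".//*[@id='currentPageInfo']")
    old_id = element.id
    driver.refresh()  # full reload, not an ajax update
    element = driver.find_element_by_xpath(".//*[@id='currentPageInfo']")
    return old_id, element.id
```

If the returned pair differs, the earlier identical ids were indeed the same DOM node being updated in place.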
