I got stuck with extracting href="/ttt/play" from the following HTML code.
<div class="collection-list-wrapper-2 w-dyn-list">
<div class="w-dyn-items">
<div typeof="ListItem" class="collection-item-2 w-clearfix w-dyn-item">
<div class="div-block-38 w-hidden-medium w-hidden-small w-hidden-tiny"><img src="https://global-uploads.webflow.com/59cf_home.svg" width="16" height="16" alt="Official Link" class="image-28">
<a property="url" href="/ttt/play" class="link-block-4 w-inline-block">
<div class="row-7 w-row"><div class="column-10 w-col w-col-2"><img height="25" property="image" src="https://global-fb0edc0001b4b11d/5a77ba9773fd490001ddaaaa_play.png" alt="Play" class="image-23"><h2 property="name" class="heading-34">Play</h2><div style="background-color:#d4af37;color:white" class="text-block-28">GOLD LEVEL</div><div class="text-block-30">HOT</div><div style="background-color:#d4af37;color:white" class="text-block-28 w-condition-invisible">SILVER LEVEL</div></div></div></a>
</div>
<div typeof="ListItem" class="collection-item-2 w-clearfix w-dyn-item">
This is my code in Python:
driver = webdriver.PhantomJS()
driver.implicitly_wait(20)
driver.set_window_size(1120, 550)
driver.get(website_url)
tag = driver.find_elements_by_class_name("w-dyn-item")[0]
tag.find_element_by_tag_name("a").click()
url = driver.current_url
print(url)
driver.quit()
When I print url using print(url), I want to see url equal to website_url/ttt/play, but instead of it I get website_url.
It looks like the click event does not work and the new link is not really opened.
When using .click() it must be "visible" (you using PhantomJS) and not hidden, in a drop-down for example.
Also make sure the page is completely loaded.
As i see it you have two options:
Ether use selenium to revile it, and then click.
Use java script to do the actual click
I strongly suggest to click with javascript, its much faster and more reliable.
Here is a little wrapper to make things easier:
def execute_script(driver, xpath):
""" wrapper for selenium driver execute_script
:param driver: selenium driver
:param xpath: (str) xpath to the element
:return: execute_script result
"""
execute_string = "window.document.evaluate('{}', document, null, 9, null).singleNodeValue.click();".format(xpath)
return driver.execute_script(execute_string)
The wrapper basically implement this technique to click on elements with javascript.
then in your selenium script use the wrapper like so:
execute_script(driver, element_xpath)
you can also make it more general to not only do clicks, but scrolls and other magic..
ps. in my example i use xpath, but you can also use css_path basically, what-ever runs in javascript.
Related
Trying to identify a javascript button on a website and press it to extend the page.
The website in question is the tencent appstore after performing a basic search. At the bottom of the page is a button titled "div.load-more-new" where upon pressing will extend the page with more apps.
the html is as follows
<div data-v-33600cb4="" class="load-more-btn-new" style="">
<a data-v-33600cb4="" href="javascript:void(0);">加载更多
<i data-v-33600cb4="" class="load-more-icon">
</i>
</a>
</div>
At first I thought I could identify the button using BeautifulSoup but all calls to find result as empty.
from selenium import webdriver
import BeautifulSoup
import time
url = 'https://webcdn.m.qq.com/webapp/homepage/index.html#/appSearch?kw=%25E7%2594%25B5%25E5%25BD%25B1'
WebDriver = webdriver.Chrome('/chromedriver')
WebDriver.get(url)
time.sleep(5)
# Find using BeuatifulSoup
soup = BeautifulSoup(WebDriver.page_source,'lxml')
button = soup.find('div',{'class':'load-more-btn-new'})
[0] []
After looking around here, it became apparent that even if I could it in BeuatifulSoup, it would not help in pressing the button. Next I tried to find the element in the driver and use .click()
driver.find_element_by_class_name('div.load-more-btn-new').click()
[1] NoSuchElementException
driver.find_element_by_css_selector('.load-more-btn-new').click()
[2] NoSuchElementException
driver.find_element_by_class_name('a.load-more-new.load-more-btn-new[data-v-33600cb4]').click()
[3] NoSuchElementException
but all return with the same error: 'NoSuchElementException'
Your selections wont work, cause they do not point on the <a>.
This one selects by class name and you try to click the <div> that holds your <a>:
driver.find_element_by_class_name('div.load-more-btn-new').click()
This one is very close but is missing the a in selection:
driver.find_element_by_css_selector('.load-more-btn-new').click()
This one try to find_element_by_class_name but is a wild mix of tag, attribute and class:
driver.find_element_by_class_name('a.load-more-new.load-more-btn-new[data-v-33600cb4]').click()
How to fix?
Select your element more specific and nearly like in your second apporach:
driver.find_element_by_css_selector('.load-more-btn-new a').click()
or
driver.find_element_by_css_selector('a[data-v-33600cb4]').click()
Note:
While working with newer selenium versions you will get DeprecationWarning: find_element_by_ commands are deprecated. Please use find_element()*
from selenium.webdriver.common.by import By
driver.find_element(By.CSS_SELECTOR, '.load-more-btn-new a').click()
I am trying to web scraping a popular movie database on the Internet. After a few thousands of requests, my spider is detected and I have to manually click on a simple button to continue my data collection.
I have tried to automatize this process by clicking automatically in that button. For that purpose, I am using Selenium library. I have located the xpath of the button, but, for some reason, .click() method doesn't work.
Here is my code:
def click_button():
options = Options()
options.headless = True
options.add_argument("--window-size=1920,1200")
DRIVER_PATH = 'geckodriver'
driver = webdriver.Firefox(options=options, executable_path=DRIVER_PATH)
driver.get('https://www.filmaffinity.com/es/main.html')
element = driver.find_element(By.XPATH,"//input[#value='Send']")
element.click()
driver.quit()
Also, I have tried other common alternatives, such as waiting until the element gets clickable or visible, but this didn't work. I have also tried to execute the click as a javascript script.
This is the window I see when my spider is detected:
Too many request window (picture)
As you see, there is a reCaptcha-protection icon on the lower right corner, but I don't have to solve any captcha puzzle to confirm I am not a robot, I have just to click on the send button you can see on the picture.
The html which contains the button is the following one:
<div class="content">
<h1>Too many requests</h1>
<div class="image">
<img height="400" src="/images/too-many-request.jpg">
</div>
<div class="text">
Are you sure you do not dream of electric sheep?
</div>
<form name="toomanyrequest">
<div class="alert">
please enter the Captcha to continue.
</div>
<div>
<input type="submit" value="Send">
</div>
</form>
</div>
Which do you think is the trouble with my code or approach? I have to find a way to click on that button to continue my scraping, but I don't know what I am doing wrong (I have little experience at this field).
Maybe the xpath is not correct? Maybe the captcha protection is blocking the click action? I don't get any exception when I execute my code; but nothing happens and the "Too many request" window does not disappear.
Thank you very much.
Try this:
driver.execute_script("arguments[0].click()", element)
or:
driver.execute_script("arguments[0].click();", element)
I'm trying to click on an element (radio button) using Selenium (in Python), I can't disclose the URL because it's a private corporate intranet, but will share the relevants part of code.
So basically this element is within an iframe, thus, I've used the following code to get the element:
# Select the item on main DOM that will udpate the iframe contents
wait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//*[#id='sm20']"))).click()
# Don't sleep, but only WedDriverWait...
wait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[#name='ifrInterior']")))
# Select the element inside the iframe and click
wait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[#name='gestionesPropias_Equipo']"))).click()
The HTML code for the element is this:
<span class="W1">
<input type="radio" name="gestionesPropias_Equipo" value="1" onclick="javascript:indicadorPropiasEquipo(); ocultarPaginacion('1'); limpiarDatosCapaResultado();">
</span>
I'm interested in clicking this because when we click on it a drop-down is enabled:
If clicked then the dropdown is enabled:
The intersting HTML code for this is
<span id="filtroAsignado" class="W30">
<select name="nuumaAsignado" class="W84">
<option value="">[Todo mi equipo]</option></select>
</span>
Debugging a bit Selenium I can see that the elemtn is found:
And this is actually the base64 image of the element, which is the expected radio button
So I'm wondering why the element actually does not get clicked??
UPDATE: Based on request from #DebanjanB, I'm adding the HTML code from the iframe, which is enclosed inside a div in the main page:
<div id="contenido">
<iframe frameborder="0" id="ifrInterior" name="ifrInterior" src="Seguimiento.do?metodo=inicio" scrolling="auto" frameborder="0"></iframe>
</div>
Actually if I look for the word "iframe", there's only one...
Now checking the iframe source itself, has several iframes hidden but the element I need to interact with is in the iframe mentioned above, the only thing that I forgot to mention is that it's inside a form, but I guess that's not relevant? You can see the whole structure in the following image:
Great question. If you know that selenium found the element, you can use Javascript to click the element directly.
The syntax is:
driver.execute_script("arguments[0].click();", element)
You can also do this, which works the same way:
automate.execute_script("arguments[0].click();", wait.until(EC.element_to_be_clickable((By.XPATH, 'Your xpath here'))))
Essentially you are having Selenium run a javascript click on the element you have found which bypasses Selenium. Let me know if this helps!
I don't see any such issue with your code block either. Perhaps you can try out either of the following options:
Using ActionChains:
ActionChains(driver).move_to_element(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[#name='gestionesPropias_Equipo' and #type='radio']")))).click().perform()
Using executeScript() method:
driver.execute_script("arguments[0].click();", WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[#name='gestionesPropias_Equipo' and #type='radio']"))))
What is wrong in the below code
import os
import time
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("http://x.x.x.x/html/load.jsp")
elm1 = driver.find_element_by_link_text("load")
time.sleep(10)
elm1.click()
time.sleep(30)
driver.close()
The page source is
<body>
<div class="formcenterdiv">
<form class="form" action="../load" method="post">
<header class="formheader">Loader</header>
<div align="center"><button class="formbutton">load</button></div>
</form>
</div>
</body></html>
I want to click on button load. when I ran the above code getting this error
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: load
As the documentation says, find_elements_by_link_text only works on a tags:
Use this when you know link text used within an anchor tag. With this
strategy, the first element with the link text value matching the
location will be returned. If no element has a matching link text
attribute, a NoSuchElementException will be raised.
The solution is to use a different selector like find_element_by_class_name:
elm1 = driver.find_element_by_class_name('formbutton')
Did you try using Xpath?
As the OP said, find_elements_by_link_text works on a tags only:
Below code might help you out
driver.get_element_by_xpath("/html/body/div/form/div/button")
I am using Python and Selenium to scrape a webpage, in some cases, I can't get it to work,
*
I would like to access the element with text 'PInt', wich is the second link in the below code.
The xPath for it (copied from Developer console) is: //[#id="submenu1"]/a[2]
<div id="divTest" onscroll="SetDivPosition();" style="height: 861px;">
<div class="menuTitle" id="title1">
</div>
<div class="subtitle" id="submenu1">
<img src="images/spacer.gif" border="0" width="2px" height="12px">
Mov<br>
<img src="images/spacer.gif" border="0" width="2px" height="12px">
PInt<br>
<img src="images/spacer.gif" border="0" width="2px" height="12px">
SWAM / SWIF<br>
</div>
...
A snipped of my code is:
try:
res = driver.find_elements_by_link_text('PInt')
print("res1:{}".format(res))
res = driver.find_element(By.XPATH,'//*[#id="submenu1"]/a[3]')
print("res:{} type[0]:{}".format(res,res[0]))
itm1 = res[0]
itm1.click()
I get the error:
Unable to locate element:
{"method":"xpath","selector":"//*[#id="submenu1"]/a[2]"}
My question is, how can I get the right xPath of the element or any other way to access the element?
UPDATE:
This might be important, the issue with
Message: invalid selector: Unable to locate an element with the xpath expression (and I've tried all proposed solutions) might be that this is after authenticate in the webpage (User + Pwd) before, everything works.
I noticed that the url driver.current_url after login is static (asp page).
Also this part I am trying to access in a frameset and frame
html > frameset > frameset > frame:nth-child(1)
Thanks to #JeffC to point me in the right direction.
since the page has some frames, I manage to access the element first by switching to the right frame (using xPath)
and then access the element.
driver.switch_to.default_content()
driver.switch_to.frame(driver.find_element_by_xpath('html / frameset / frameset / frame[1]'))
driver.find_element_by_xpath("//a[contains(text(),'PInt')]").click()
BTW, in case you want to run the script from a crontab,you need to setup a display:
30 5 * * * export DISPLAY=:0; python /usr/.../main.py
To see a full list of all the ways of selecting elements using selenium, you can read all about it in the documentation.
Using xpath:
res = driver.find_element_by_xpath(u'//*[#id="submenu1"]/a[2]')
Using css selector:
res = driver.find_element_by_css_selector('#submenu1 a:nth-of-type(2)')
Try with any of the below xpath. Sometimes automatically generated xpath does not work.
//a[contains(text(),'PInt')]
or
//div[#id='submenu1']//a[contains(text(),'PInt')]
Also I would suggest you to set some wait time before clicking on above link in case if above xpath does not work
To find xPath in chrome:
Right click the element you want
Inspect, which will open the developer window and highlight the selected element.
Right click the highlighted element and choose copy > copy by xpath
Here is a list of all the different ways to locate an element Locating Elements