I've written a script in Python with Selenium to click a certain link on a webpage and download an Excel file. However, when I run the script, it throws a timeout exception. How can I make it work? Any help will be greatly appreciated.
Link to the site: webpage
Script I've tried with:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get('replace_with_above_link')
item = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".hasmore #dlink")))
item.click()
driver.quit()
Html elements which contain the dropdown options:
<li class="hasmore drophover"><span>Share & more</span><div><ul><li><button class="tooltip" tip="Use a customizable report creator that can<br>output HTML, CSV, or a shareable link." id="share_on_ajax_result_table">Modify & Share Table</button></li><li><button class="tooltip" tip="Get a bit of widget code to emed this table on your site">Embed this Table</button></li><li><button class="tooltip" tip="Convert the table below to comma-separated values<br>suitable for use with excel">Get as Excel Workbook (experimental)</button><a id="dlink" style="display: none;"></a></li><li><button class="tooltip" tip="Export table as <br>suitable for use with excel">Get table as CSV (for Excel)</button></li><li><button class="tooltip" tip="">Strip Mobile Formatting</button></li><li><a id="a_ajax_result_table" name="ajax_result_table" href="#ajax_result_table::none">Copy Link to Table to Clipboard</a></li><li><button class="tooltip" tip="">About Sharing Tools</button></li><li><button class="tooltip" tip="">Video: SR Sharing Tools & How-to</button></li><li><button class="tooltip" tip="">Video: Stats Table Tips & Tricks</button></li></ul></div></li>
Location of that file in that webpage (the desired link is marked with pencil):
The target link is hidden, so waiting for its visibility will always fail. Try handling the button node instead:
item = wait.until(EC.visibility_of_element_located((By.XPATH, "//li[span='Share & more']")))
item.click()
wait.until(lambda driver: "drophover" in item.get_attribute("class"))
item.find_element_by_xpath(".//button[.='Get as Excel Workbook (experimental)']").click()
You are trying to click the link with the text Get as Excel Workbook (experimental), and as per your comment you can already click the Share & more link. The intended <a> element, however, has its style attribute set to display: none;. So, to invoke click() and start the download, you can use the following code block:
Get_as_Excel_Workbook_link = driver.find_element_by_xpath("//li[@class='hasmore drophover']//ul//li//a[@id='dlink']")
driver.execute_script("arguments[0].removeAttribute('style')", Get_as_Excel_Workbook_link)
Get_as_Excel_Workbook_link.click()
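As an alternative to stripping the style attribute, a hidden element can also be clicked from JavaScript, because Selenium's own click() refuses elements that are not displayed. A minimal sketch: js_click is a hypothetical helper name, and whether the site actually reacts to a click on the empty <a id="dlink"> is an assumption that would need checking in the browser.

```python
def js_click(driver, element):
    """Click `element` from page JavaScript instead of Selenium's click().

    `driver` is any Selenium WebDriver; `element` is a WebElement that was
    located earlier (it may be hidden with display: none).
    """
    # arguments[0] is the WebElement passed after the script string.
    driver.execute_script("arguments[0].click();", element)
```

Usage would look like js_click(driver, Get_as_Excel_Workbook_link), with the element located exactly as in the block above.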
Update A
As per your comment :
I am not sure if the xpath which you have used is a valid one or not :
"//li[a[@id='dlink']]/a"
You tried using :
Get_link = driver.find_element_by_xpath("//li[a[@id='dlink']]/a")
print(Get_link.get_attribute("outerHTML"))
But why? Is there any necessity?
As per my research and analysis you can be assured that you are at the right place. See the formatted version of the HTML you have shared and the resolution of the xpath I have provided.
<li class="hasmore drophover"><span>Share & more</span>
<div>
<ul>
<li><button class="tooltip" tip="Use a customizable report creator that can<br>output HTML, CSV, or a shareable link." id="share_on_ajax_result_table">Modify & Share Table</button></li>
<li><button class="tooltip" tip="Get a bit of widget code to emed this table on your site">Embed this Table</button></li>
<li><button class="tooltip" tip="Convert the table below to comma-separated values<br>suitable for use with excel">Get as Excel Workbook (experimental)</button>
<a id="dlink" style="display: none;"></a>
</li>
<li><button class="tooltip" tip="Export table as <br>suitable for use with excel">Get table as CSV (for Excel)</button></li>
<li><button class="tooltip" tip="">Strip Mobile Formatting</button></li>
<li><a id="a_ajax_result_table" name="ajax_result_table" href="#ajax_result_table::none">Copy Link to Table to Clipboard</a></li>
<li><button class="tooltip" tip="">About Sharing Tools</button></li>
<li><button class="tooltip" tip="">Video: SR Sharing Tools & How-to</button></li>
<li><button class="tooltip" tip="">Video: Stats Table Tips & Tricks</button></li>
</ul>
</div>
</li>
So the result you have seen is correct. Now, for your understanding, I have inserted some text, MyLink, within the intended tag:
<a id="dlink" style="display: none;"></a>
Converted as :
<a id="dlink" style="display: none;">MyLink</a>
See the result :
Check my solution once again; I can assure you that it works.
Update B
"unable to locate element" is a good message to debug. Apart from "display: none;", you may have glossed over the actual issue when you wrote: clicked on the share&more link in the first place and found it working. Troubles come up when i try to initiate a click on the link.
If you look at the HTML, the element sits beside a button with class="tooltip", so you need to induce a waiter as follows:
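# perform click on the link Share&more
# wait for presence rather than visibility: the link is styled display: none, so a visibility wait would time out
Get_as_Excel_Workbook_link = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//li[@class='hasmore drophover']//ul//li//a[@id='dlink']")))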
driver.execute_script("arguments[0].removeAttribute('style')", Get_as_Excel_Workbook_link)
Get_as_Excel_Workbook_link.click()
I'm trying to automate a download via Selenium using Python. The website I'm trying to download from has multiple options, with each option having a HTML HREF and an Excel HREF. So the site code looks like this:
</ul>
<li><a class="pnid-642 pv-pid-0 pvid-9972 cid-31"> </a>24. Option 24
<ul>
<li><table width='50%'><tr><td width='20%'> </td><td width='50%'>Select type</td><td width='15%'><A title='Html' HREF='/apps/carteras/genera_xsl_v2.0.php?param=RWJybTl4VEV4MnlHc0VSQVd5T1VKV3Q3STg4Rk5oS1RYUDdaa1dFbDhoWkwzam53L3huQzBnPT0='><span class="fa fa-file-code-o fa-2x" aria-hidden="true"'></span></a></td><td width='15%'><A title='Excel' HREF='/apps/carteras/genera_xsl2xls.php?param=RWJybTl4VEV4MnlHc0VSQVd5T1VKV3Q3STg4Rk5oS1RYUDdaa1dFbDhoWkwzam53L3huQzBnPT0='><span class="fa fa-file-excel-o fa-2x" aria-hidden="true"></span></a></td></tr></table></li>
</ul>
<li><a class="pnid-642 pv-pid-0 pvid-9972 cid-31"> </a>25. Option 25
<ul>
<li><table width='50%'><tr><td width='20%'> </td><td width='50%'>Select type<td width='15%'><A title='Html' HREF='/apps/carteras/genera_xsl_v2.0.php?param=RWJybTl4VEV4MnlHc0VSQVd5T1VKVTBSRDZ5aVNsb2JYUDdaa1dFbDhoWkwzam53L3huQzBnPT0='><span class="fa fa-file-code-o fa-2x" aria-hidden="true"'></span></a></td><td width='15%'><A title='Excel' HREF='/apps/carteras/genera_xsl2xls.php?param=RWJybTl4VEV4MnlHc0VSQVd5T1VKVTBSRDZ5aVNsb2JYUDdaa1dFbDhoWkwzam53L3huQzBnPT0='><span class="fa fa-file-excel-o fa-2x" aria-hidden="true"></span></a></td></tr></table></li>
</ul>
I'm trying to automate the download of the Option 25 Excel file, but as you can see the Excel HREF are identical for each option on the website. Is there a way I can use Selenium to download only that Excel file?
Thanks
To identify the Excel link for option 25, use the following XPath:
driver.find_element(By.XPATH, "//li[contains(., '25. Option 25')]/ul/li//a[@title='Excel']").click()
If you want to make it dynamic you can create a method and pass the option text as parameter.
def DownloadFileOptions(optionName):
    driver.find_element(By.XPATH, "//li[contains(., '{}')]/ul/li//a[@title='Excel']".format(optionName)).click()
DownloadFileOptions('25. Option 25')
DownloadFileOptions('24. Option 24')
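One thing to watch with the format() approach: an option name containing an apostrophe would break the quoted string inside the XPath. A rough sketch of how the expression could be built safely; xpath_literal and excel_link_xpath are hypothetical helper names, not part of Selenium.

```python
def xpath_literal(text):
    """Return `text` as a valid XPath 1.0 string literal."""
    if "'" not in text:
        return "'{}'".format(text)
    if '"' not in text:
        return '"{}"'.format(text)
    # Both quote types occur: split on ' and glue the pieces back with concat().
    pieces = ", \"'\", ".join("'{}'".format(part) for part in text.split("'"))
    return "concat({})".format(pieces)


def excel_link_xpath(option_name):
    """Build the XPath for the Excel link that belongs to one option."""
    return "//li[contains(., {})]/ul/li//a[@title='Excel']".format(
        xpath_literal(option_name)
    )
```

The result can be passed straight to driver.find_element(By.XPATH, excel_link_xpath('25. Option 25')).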
I would suggest using WebDriverWait and waiting for the element to be clickable.
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//li[contains(., '25. Option 25')]/ul/li//a[@title='Excel']"))).click()
You need the following imports.
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
You could try to find the element by the text it contains using
driver.find_elements_by_xpath("//*[contains(text(), '25. Option 25')]")
My objective is to open a webpage, and click the app button for a specific app, like Anaplan. In the past, I've used get element by CSS selector with the combination of class, and ID, as shown in this past post.
first_item = driver.find_element_by_id("anaplan")
I've come across a webpage where the buttons seem to have literally no ID whatsoever, or unique values:
HTML output of the Anaplan App button:
<a
aria-label="launch app Anaplan"
class="chiclet a--no-decoration"
data-se="app-card"
href="https://gartner.okta.com/home/anaplan/0oaforg08lyATdLuw4x6/2487"
draggable="true"
><article class="chiclet--article">
<button
class="chiclet--action"
tabindex="0"
aria-label="Settings for Anaplan"
data-se="app-card-settings-button"
>
<svg
class="chiclet--action-kebab"
width="20"
height="4"
viewBox="0 0 20 4"
fill="#B7BCC0"
xmlns="http://www.w3.org/2000/svg"
>
<circle cx="2" cy="2" r="2"></circle>
<circle cx="10" cy="2" r="2"></circle>
<circle cx="18" cy="2" r="2"></circle>
</svg>
</button>
<section class="chiclet--main" data-se="app-card-main">
<img
class="chiclet--main-logo"
src="https://ok11static.oktacdn.com/fs/bcg/4/gfs1ev15ab63zqgZ91d8"
alt="Anaplan logo"
/>
</section>
<footer class="chiclet--footer" data-se="app-card-footer">
<o-tooltip content="Anaplan" position="bottom" class="hydrated"
><div slot="content"></div>
<div aria-describedby="o-tooltip-0">
<h1 class="chiclet--app-title" data-se="app-card-title">Anaplan</h1>
</div>
</o-tooltip>
</footer>
</article>
</a>
I grabbed the Xpath of the Anaplan button, which shows the following:
/html[@class='hydrated wf-proximanova-n4-inactive wf-inactive']/body[@class='default']/div[@id='root']
/div[@class='enduser-app ']/section[@class='content-frame']
/main[@class='main-container has-top-bar']/div[@class='dashboard--main']/section[@id='main-content']/section[@class='chiclet-area']
/section[@class='chiclet-grid--container']
/section/section[@class='chiclet-grid section-appear-done section-enter-done']
/a[@class='chiclet a--no-decoration'][1]/article[@class='chiclet--article']
The only difference between apps is the number in the brackets:
/a[@class='chiclet a--no-decoration'][1], where 1 seems to be Anaplan, 3 is G Drive, and so on. Is there a way to select elements such as this where there appears to be no unique identifier at all?
To locate the first button you can use one of the following XPaths: //a[@aria-label='launch app Anaplan'] or //a[contains(@href,'anaplan')]. There are many other unique combinations, and the same can be done with CSS selectors.
Similarly, there are several combinations for all the other navigation buttons you provided here.
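To illustrate the XPath and CSS equivalence, here is a small sketch that builds both locators from the visible app name. app_link_locators is a hypothetical helper, and it assumes every tile follows the same aria-label="launch app <name>" pattern as the Anaplan one shown in the question.

```python
def app_link_locators(app_name):
    """Return equivalent XPath and CSS locators for one app tile."""
    label = "launch app {}".format(app_name)
    return {
        # exact match on the aria-label attribute, XPath flavour
        "xpath": "//a[@aria-label='{}']".format(label),
        # the same exact match written as a CSS attribute selector
        "css": "a[aria-label='{}']".format(label),
    }
```

For example, driver.find_element(By.CSS_SELECTOR, app_link_locators("Anaplan")["css"]) should pick the same <a> as the XPath version.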
If the element is located inside an <iframe>, you have to switch to that <iframe> first and switch back out of it afterwards.
Locate the <iframe> with
iframe = driver.find_element_by_xpath("//iframe[@name='iframeName']")  # or whatever locator matches it
Then switch_to the <iframe>:
driver.switch_to.frame(iframe)
If you then need to continue anywhere outside the <iframe>, switch out of it with
driver.switch_to.default_content()
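Because it is easy to forget the switch back, the two calls above can be paired in a context manager. This is only a sketch: inside_frame is a hypothetical name, and it relies on nothing beyond the driver.switch_to.frame() and driver.switch_to.default_content() calls already shown.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame):
    """Switch into `frame` for the duration of a with-block.

    `frame` can be anything driver.switch_to.frame() accepts:
    a WebElement, a frame name/id, or an index.
    """
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        # Always return to the top-level document, even if the block raised.
        driver.switch_to.default_content()
```

With that in place, the lookups inside the frame go into a block such as: with inside_frame(driver, iframe): ...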
It is possible both with xpath and css.
Example of xpath:
Anaplan:
//a[contains(@aria-label, 'Anaplan')]/article/button
Or:
//button[contains(@aria-label, 'Settings for Anaplan')]
Spam Quarantine:
//a[contains(@aria-label, 'Spam Quarantine')]
G-suite
//a[contains(@aria-label, 'G Suite Drive')]
The main idea is that you can find an element by matching part of an attribute's value.
Update:
If an element is located inside an iframe, you should wait for the iframe to load and then switch to it. Selenium has a very convenient method for this: frame_to_be_available_and_switch_to_it
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
driver = webdriver.Chrome()
driver.get(url)
wait = WebDriverWait(driver, 15)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[data-testid=shell-content]")))
After switching to the iframe, you can work with the elements inside it.
I am trying to scrape GoFundMe information but can't seem to extract the number of donors.
This is the html I am trying to navigate. I am attempting to retrieve 11.1K,
<ul class="list-unstyled m-meta-list m-meta-list--default">
<li class="m-meta-list-item">
<button class="text-stat disp-inline text-left a-button a-button--inline" data-element-id="btn_donors" type="button" data-analytic-event-listener="true">
<span class="text-stat-value text-underline">11.1K</span>
<span class="m-social-stat-item-title text-stat-title">donors</span>
I've tried using
donors = soup.find_all('li', class_ = 'm-meta-list-item')
for donor in donors:
print(donor.text)
The class/button seems to be hidden inside another class? How can I extract it?
I'm new to beautifulsoup but have used selenium quite a bit.
Thanks in advance.
These fundraiser pages all have similar HTML, and that value is retrieved dynamically. I would suggest using Selenium and a CSS class selector:
from selenium import webdriver
d = webdriver.Chrome()
d.get('https://www.gofundme.com/f/treatmentforsiyona?qid=7375740208a5ee878a70349c8b74c5a6')
num = d.find_element_by_css_selector('.text-stat-value').text
print(num)
d.quit()
Learn more about selenium:
https://sqa.stackexchange.com/a/27856
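The element text comes back as a display string such as 11.1K rather than a number. If a numeric value is needed, something along these lines could convert it. This is a sketch only: parse_compact_number is a hypothetical helper, and the K/M/B suffixes are an assumption about how the site abbreviates counts.

```python
def parse_compact_number(text):
    """Convert a display string such as '11.1K' or '1,204' to an int."""
    multipliers = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
    cleaned = text.strip().replace(",", "").upper()
    if cleaned and cleaned[-1] in multipliers:
        # round() guards against float noise such as 11099.999999999998
        return int(round(float(cleaned[:-1]) * multipliers[cleaned[-1]]))
    return int(float(cleaned))
```

Note that the result is only as precise as the abbreviated text, so 11.1K becomes 11100 even if the real count is slightly different.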
Get the ID from gofundme.com/f/{THEID} and call the API:
/web-gateway/v1/feed/THEID/donations?sort=recent&limit=20&offset=20
Process the data:
for people in apiResponse['references']['donations']:
    print(people['name'])
Use the browser console to find the API host.
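The steps above could be sketched as follows. The helper names are hypothetical, and the API host is deliberately left as a parameter because it has to be looked up in the browser console; https://api.example.invalid in the usage note is only a placeholder. The path and the ['references']['donations'] layout are taken from the answer and are not verified here.

```python
from urllib.parse import urlencode, urlparse


def campaign_id(campaign_url):
    """Return THEID from a URL shaped like https://www.gofundme.com/f/THEID?..."""
    parts = urlparse(campaign_url).path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "f":
        raise ValueError("not a /f/<id> campaign URL: {!r}".format(campaign_url))
    return parts[1]


def donations_url(api_host, campaign_url, limit=20, offset=0):
    """Build the donations feed URL for one campaign (api_host is an assumption)."""
    query = urlencode({"sort": "recent", "limit": limit, "offset": offset})
    return "{}/web-gateway/v1/feed/{}/donations?{}".format(
        api_host.rstrip("/"), campaign_id(campaign_url), query
    )


def donor_names(api_response):
    """Pull the donor names out of an already decoded JSON response."""
    return [donation["name"] for donation in api_response["references"]["donations"]]
```

For instance, donations_url("https://api.example.invalid", page_url, offset=20) gives the URL to fetch with whatever HTTP client you prefer, and donor_names() takes the decoded JSON.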
I got stuck with extracting href="/ttt/play" from the following HTML code.
<div class="collection-list-wrapper-2 w-dyn-list">
<div class="w-dyn-items">
<div typeof="ListItem" class="collection-item-2 w-clearfix w-dyn-item">
<div class="div-block-38 w-hidden-medium w-hidden-small w-hidden-tiny"><img src="https://global-uploads.webflow.com/59cf_home.svg" width="16" height="16" alt="Official Link" class="image-28">
<a property="url" href="/ttt/play" class="link-block-4 w-inline-block">
<div class="row-7 w-row"><div class="column-10 w-col w-col-2"><img height="25" property="image" src="https://global-fb0edc0001b4b11d/5a77ba9773fd490001ddaaaa_play.png" alt="Play" class="image-23"><h2 property="name" class="heading-34">Play</h2><div style="background-color:#d4af37;color:white" class="text-block-28">GOLD LEVEL</div><div class="text-block-30">HOT</div><div style="background-color:#d4af37;color:white" class="text-block-28 w-condition-invisible">SILVER LEVEL</div></div></div></a>
</div>
<div typeof="ListItem" class="collection-item-2 w-clearfix w-dyn-item">
This is my code in Python:
driver = webdriver.PhantomJS()
driver.implicitly_wait(20)
driver.set_window_size(1120, 550)
driver.get(website_url)
tag = driver.find_elements_by_class_name("w-dyn-item")[0]
tag.find_element_by_tag_name("a").click()
url = driver.current_url
print(url)
driver.quit()
When I print url using print(url), I want to see url equal to website_url/ttt/play, but instead of it I get website_url.
It looks like the click event does not work and the new link is not really opened.
When you use .click(), the element must be visible (you are using PhantomJS) and not hidden, for example inside a drop-down.
Also make sure the page has loaded completely.
As I see it, you have two options:
Either use Selenium to reveal it, and then click.
Use JavaScript to do the actual click.
I strongly suggest clicking with JavaScript; it's much faster and more reliable.
Here is a little wrapper to make things easier:
def execute_script(driver, xpath):
    """ wrapper for selenium driver execute_script
    :param driver: selenium driver
    :param xpath: (str) xpath to the element
    :return: execute_script result
    """
    # pass the xpath as a script argument so quotes inside it cannot break the JavaScript string
    execute_string = "window.document.evaluate(arguments[0], document, null, 9, null).singleNodeValue.click();"
    return driver.execute_script(execute_string, xpath)
The wrapper basically implements this technique to click on elements with JavaScript.
Then, in your Selenium script, use the wrapper like so:
execute_script(driver, element_xpath)
You can also make it more general so that it performs not only clicks but also scrolls and other magic.
P.S. In my example I use XPath, but you can also use a CSS path; basically, whatever runs in JavaScript.
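A rough idea of what the more general version might look like. run_on_xpath and the action names are made up for illustration, and only click and scrollIntoView are covered. The script returns False when the XPath matches nothing, so the caller can tell a miss from a hit.

```python
# 9 is XPathResult.FIRST_ORDERED_NODE_TYPE, the same constant the wrapper above uses.
_JS_TEMPLATE = (
    "var node = document.evaluate(arguments[0], document, null, 9, null).singleNodeValue;"
    "if (node === null) { return false; }"
    "%s"
    "return true;"
)

_ACTIONS = {
    "click": "node.click();",
    "scroll": "node.scrollIntoView();",
}


def run_on_xpath(driver, xpath, action="click"):
    """Run a small JavaScript action on the first node matching `xpath`."""
    if action not in _ACTIONS:
        raise ValueError("unknown action: {!r}".format(action))
    script = _JS_TEMPLATE % _ACTIONS[action]
    # The XPath travels as arguments[0], so it needs no escaping.
    return driver.execute_script(script, xpath)
```

So run_on_xpath(driver, element_xpath, "scroll") followed by run_on_xpath(driver, element_xpath) would scroll to the element and then click it.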
I am trying to select a hyperlink in a document from a website, but not sure how to select it using Selenium.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
names = 'Catostomus discobolus yarrowi'
driver = webdriver.Firefox()
driver.get("http://ecos.fws.gov/ecos/home.action")
SciName = driver.find_element_by_id('searchbox')
SciName.send_keys(names)
SciName.send_keys(Keys.RETURN)
The above code gets to the page that I am interested in working on, but not sure how to select the hyperlink. I am interested in selecting the first hyperlink. The html of interest is
Zuni Bluehead Sucker (<strong>Catostomus discobolus</strong> yarrowi)
</h4>
<div class='url'>ecos.fws.gov/speciesProfile/profile/speciesProfile.action?spcode=E063</div>
<span class='description'>
States/US Territories in which the Zuni Bluehead Sucker is known to or is believed to occur: Arizona, New Mexico; US Counties in which the Zuni ...
</span>
<ul class='sitelinks'></ul>
</div>
I am guessing I could use find_element_by_xpath, but have been unable to do so successfully. I will want to always select the first hyperlink. Also, the hyperlink name will change based on the species name entered.
I added the following code:
SciName = driver.find_element_by_css_selector("a[href*='http://ecos.fws.gov/speciesProfile/profile/']")
SciName.click()
I should have read the selenium documentation more thoroughly.
Try this (partial link text is used because the full link text also includes the scientific name):
SciName = driver.find_element_by_partial_link_text("Zuni Bluehead Sucker")
SciName.click()