Select hyperlink in html document using Python and Selenium

I am trying to select a hyperlink on a web page, but I am not sure how to select it using Selenium.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
names = 'Catostomus discobolus yarrowi'
driver = webdriver.Firefox()
driver.get("http://ecos.fws.gov/ecos/home.action")
SciName = driver.find_element_by_id('searchbox')
SciName.send_keys(names)
SciName.send_keys(Keys.RETURN)
The code above reaches the search results page I am interested in, but I am not sure how to select the hyperlink. I want to select the first hyperlink. The HTML of interest is:
Zuni Bluehead Sucker (<strong>Catostomus discobolus</strong> yarrowi)
</h4>
<div class='url'>ecos.fws.gov/speciesProfile/profile/speciesProfile.action?spcode=E063</div>
<span class='description'>
States/US Territories in which the Zuni Bluehead Sucker is known to or is believed to occur: Arizona, New Mexico; US Counties in which the Zuni ...
</span>
<ul class='sitelinks'></ul>
</div>
I am guessing I could use find_element_by_xpath, but I have been unable to do so successfully. I always want to select the first hyperlink, and the link text will change depending on the species name entered.

I added the following code:
SciName = driver.find_element_by_css_selector("a[href*='http://ecos.fws.gov/speciesProfile/profile/']")
SciName.click()
I should have read the selenium documentation more thoroughly.

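Try this (a partial match, because the full link text also contains the scientific name in parentheses):
SciName = driver.find_element_by_partial_link_text("Zuni Bluehead Sucker")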
SciName.click()
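Since the goal is to always click the first result regardless of the species name, a small helper can make that explicit. This is only a sketch: `first_link_containing` is a hypothetical name, `driver` is assumed to be a live WebDriver, and the locator strategy is passed as the plain string "css selector", which is the value of Selenium's `By.CSS_SELECTOR` constant. It also assumes the result links keep the `speciesProfile/profile/` path shown in the question.

```python
CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def first_link_containing(driver, href_fragment):
    """Return the first <a> whose href contains href_fragment, or None."""
    selector = f"a[href*='{href_fragment}']"
    # find_elements returns matches in document order, or an empty list.
    links = driver.find_elements(CSS_SELECTOR, selector)
    return links[0] if links else None


# Possible usage with a real driver (not executed here):
# link = first_link_containing(driver, "speciesProfile/profile/")
# if link is not None:
#     link.click()
```

Matching on a fragment of the href rather than on the link text keeps the lookup independent of the species name that was searched for.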

Related

Identifying html structures with data-v-xxxxxxxx and pressing them using selenium

I am trying to identify a JavaScript button on a website and press it to extend the page.
The website in question is the Tencent app store after performing a basic search. At the bottom of the page is a button with the class "load-more-btn-new"; pressing it extends the page with more apps.
The HTML is as follows:
<div data-v-33600cb4="" class="load-more-btn-new" style="">
<a data-v-33600cb4="" href="javascript:void(0);">加载更多
<i data-v-33600cb4="" class="load-more-icon">
</i>
</a>
</div>
At first I thought I could identify the button using BeautifulSoup, but every call to find returns an empty result.
from selenium import webdriver
from bs4 import BeautifulSoup
import time
url = 'https://webcdn.m.qq.com/webapp/homepage/index.html#/appSearch?kw=%25E7%2594%25B5%25E5%25BD%25B1'
WebDriver = webdriver.Chrome('/chromedriver')
WebDriver.get(url)
time.sleep(5)
# Find using BeautifulSoup
soup = BeautifulSoup(WebDriver.page_source,'lxml')
button = soup.find('div',{'class':'load-more-btn-new'})
[0] []
After looking around here, it became apparent that even if I could find it with BeautifulSoup, that would not help in pressing the button. Next I tried to find the element in the driver and use .click():
driver.find_element_by_class_name('div.load-more-btn-new').click()
[1] NoSuchElementException
driver.find_element_by_css_selector('.load-more-btn-new').click()
[2] NoSuchElementException
driver.find_element_by_class_name('a.load-more-new.load-more-btn-new[data-v-33600cb4]').click()
[3] NoSuchElementException
All of them fail with the same error: NoSuchElementException.

Your selectors won't work because they do not point to the <a>.
This one calls find_element_by_class_name but passes a tag plus a class instead of a bare class name, and it targets the <div> that holds your <a>:
driver.find_element_by_class_name('div.load-more-btn-new').click()
This one is very close but is missing the a in the selector:
driver.find_element_by_css_selector('.load-more-btn-new').click()
This one also calls find_element_by_class_name, but passes a wild mix of tag, attribute and class:
driver.find_element_by_class_name('a.load-more-new.load-more-btn-new[data-v-33600cb4]').click()
How to fix?
Select your element more specifically, much as in your second approach:
driver.find_element_by_css_selector('.load-more-btn-new a').click()
or
driver.find_element_by_css_selector('a[data-v-33600cb4]').click()
Note:
With newer Selenium versions you will get a DeprecationWarning: find_element_by_* commands are deprecated. Please use find_element() instead:
from selenium.webdriver.common.by import By
driver.find_element(By.CSS_SELECTOR, '.load-more-btn-new a').click()
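To extend the page fully, the click usually has to be repeated. The following is a rough sketch under a stated assumption: the "load more" link is removed from the DOM once nothing is left to load, which has not been verified for this site. `click_until_gone` is a hypothetical helper, `driver` is assumed to be a live WebDriver, and "css selector" is the string value of `By.CSS_SELECTOR`.

```python
import time

CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def click_until_gone(driver, selector, max_clicks=20, pause=1.0):
    """Click the first match of selector until none is left; return the click count."""
    clicks = 0
    while clicks < max_clicks:
        matches = driver.find_elements(CSS_SELECTOR, selector)
        if not matches:
            break  # button no longer in the DOM (assumed to mean "all loaded")
        matches[0].click()
        clicks += 1
        time.sleep(pause)  # crude pause so the next batch can render
    return clicks


# Possible usage with a real driver (not executed here):
# click_until_gone(driver, ".load-more-btn-new a")
```

The `max_clicks` cap is there so the loop cannot run forever if the button never disappears; an explicit WebDriverWait would be a more robust replacement for the fixed pause.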

How do I extract text from a button using Beautiful Soup?

I am trying to scrape GoFundMe information but can't seem to extract the number of donors.
This is the HTML I am trying to navigate. I am attempting to retrieve 11.1K:
<ul class="list-unstyled m-meta-list m-meta-list--default">
<li class="m-meta-list-item">
<button class="text-stat disp-inline text-left a-button a-button--inline" data-element-id="btn_donors" type="button" data-analytic-event-listener="true">
<span class="text-stat-value text-underline">11.1K</span>
<span class="m-social-stat-item-title text-stat-title">donors</span>
I've tried using:
donors = soup.find_all('li', class_ = 'm-meta-list-item')
for donor in donors:
    print(donor.text)
The class/button seems to be hidden inside another class? How can I extract it?
I'm new to BeautifulSoup but have used Selenium quite a bit.
Thanks in advance.

These fundraiser pages all have similar HTML, and that value is retrieved dynamically. I would suggest using Selenium and a CSS class selector:
from selenium import webdriver
d = webdriver.Chrome()
d.get('https://www.gofundme.com/f/treatmentforsiyona?qid=7375740208a5ee878a70349c8b74c5a6')
num = d.find_element_by_css_selector('.text-stat-value').text
print(num)
d.quit()
Learn more about selenium:
https://sqa.stackexchange.com/a/27856
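Once the rendered HTML is available (for example from `driver.page_source`), the value can also be pulled out by class with the standard library alone. This is a minimal sketch, not a replacement for BeautifulSoup: `ClassTextCollector` and `texts_with_class` are hypothetical names, and the parser only captures text that directly follows the matching start tag, which is enough for the `<span>` in the question.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collect the direct text of every element whose class list contains target."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.capturing = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.capturing = self.target in classes
        if self.capturing:
            self.values.append("")

    def handle_endtag(self, tag):
        self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.values[-1] += data


def texts_with_class(html, target):
    """Return the stripped text of each element carrying the class target."""
    parser = ClassTextCollector(target)
    parser.feed(html)
    return [value.strip() for value in parser.values]


sample = '<span class="text-stat-value text-underline">11.1K</span>'
print(texts_with_class(sample, "text-stat-value"))  # ['11.1K']
```

Note that the class is matched as a whole token, so `text-stat` on the surrounding button does not match `text-stat-value`.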

Get the ID from the URL gofundme.com/f/{THEID} and call the API:
/web-gateway/v1/feed/THEID/donations?sort=recent&limit=20&offset=20
Then process the data:
for people in apiResponse['references']['donations']:
    print(people['name'])
Use the browser console to find the API host.
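A sketch of the two steps described above, using only the standard library. The host is deliberately left as a parameter because the answer does not name it; the `https` scheme, the function names, and the response shape (`references` then `donations` then `name`, taken from the loop in the answer) are all assumptions and not a documented API.

```python
from urllib.parse import urlencode


def donations_url(host, fundraiser_id, limit=20, offset=0):
    """Build the donations feed URL; the host has to be discovered by the reader."""
    query = urlencode({"sort": "recent", "limit": limit, "offset": offset})
    return f"https://{host}/web-gateway/v1/feed/{fundraiser_id}/donations?{query}"


def donor_names(api_response):
    """Extract donor names from an already decoded JSON response (assumed shape)."""
    donations = api_response.get("references", {}).get("donations", [])
    return [donation["name"] for donation in donations if "name" in donation]
```

Paging would then amount to calling `donations_url` with increasing `offset` values until `donor_names` returns an empty list.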

Can't trigger a click on a certain link using selenium

I've written a script in Python with Selenium to click on a certain link in a webpage to download an Excel file. However, when I execute my script, it throws a timeout exception. How can I make it work? Any help will be greatly appreciated.
Link to the site: webpage
Script I've tried with:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get('replace_with_above_link')
item = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".hasmore #dlink")))
item.click()
driver.quit()
Html elements which contain the dropdown options:
<li class="hasmore drophover"><span>Share & more</span><div><ul><li><button class="tooltip" tip="Use a customizable report creator that can<br>output HTML, CSV, or a shareable link." id="share_on_ajax_result_table">Modify & Share Table</button></li><li><button class="tooltip" tip="Get a bit of widget code to emed this table on your site">Embed this Table</button></li><li><button class="tooltip" tip="Convert the table below to comma-separated values<br>suitable for use with excel">Get as Excel Workbook (experimental)</button><a id="dlink" style="display: none;"></a></li><li><button class="tooltip" tip="Export table as <br>suitable for use with excel">Get table as CSV (for Excel)</button></li><li><button class="tooltip" tip="">Strip Mobile Formatting</button></li><li><a id="a_ajax_result_table" name="ajax_result_table" href="#ajax_result_table::none">Copy Link to Table to Clipboard</a></li><li><button class="tooltip" tip="">About Sharing Tools</button></li><li><button class="tooltip" tip="">Video: SR Sharing Tools & How-to</button></li><li><button class="tooltip" tip="">Video: Stats Table Tips & Tricks</button></li></ul></div></li>
Location of that file in that webpage (the desired link is marked with pencil):

The target link is hidden, so waiting for its visibility will always fail. You should handle the button node instead:
item = wait.until(EC.visibility_of_element_located((By.XPATH, "//li[span='Share & more']")))
item.click()
wait.until(lambda driver: "drophover" in item.get_attribute("class"))
item.find_element_by_xpath(".//button[.='Get as Excel Workbook (experimental)']").click()

You are trying to click the link labelled Get as Excel Workbook (experimental). As per your comment, you can already click the Share & more link, but the <a> element you want has its style attribute set to display: none;. To invoke click() and start the download, you can use the following code block:
Get_as_Excel_Workbook_link = driver.find_element_by_xpath("//li[@class='hasmore drophover']//ul//li//a[@id='dlink']")
driver.execute_script("arguments[0].removeAttribute('style')", Get_as_Excel_Workbook_link)
Get_as_Excel_Workbook_link.click()
Update A
As per your comment:
I am not sure whether the XPath you used is valid:
"//li[a[@id='dlink']]/a"
You tried using:
Get_link = driver.find_element_by_xpath("//li[a[@id='dlink']]/a")
print(Get_link.get_attribute("outerHTML"))
But why? Is that necessary?
From my analysis, you can be assured that you are at the right place. See the formatted version of the HTML you shared and how the XPath I provided resolves against it.
<li class="hasmore drophover"><span>Share & more</span>
<div>
<ul>
<li><button class="tooltip" tip="Use a customizable report creator that can<br>output HTML, CSV, or a shareable link." id="share_on_ajax_result_table">Modify & Share Table</button></li>
<li><button class="tooltip" tip="Get a bit of widget code to emed this table on your site">Embed this Table</button></li>
<li><button class="tooltip" tip="Convert the table below to comma-separated values<br>suitable for use with excel">Get as Excel Workbook (experimental)</button>
<a id="dlink" style="display: none;"></a>
</li>
<li><button class="tooltip" tip="Export table as <br>suitable for use with excel">Get table as CSV (for Excel)</button></li>
<li><button class="tooltip" tip="">Strip Mobile Formatting</button></li>
<li><a id="a_ajax_result_table" name="ajax_result_table" href="#ajax_result_table::none">Copy Link to Table to Clipboard</a></li>
<li><button class="tooltip" tip="">About Sharing Tools</button></li>
<li><button class="tooltip" tip="">Video: SR Sharing Tools & How-to</button></li>
<li><button class="tooltip" tip="">Video: Stats Table Tips & Tricks</button></li>
</ul>
</div>
</li>
So the result you have seen is correct. For your understanding, I inserted the text MyLink within the intended tag:
<a id="dlink" style="display: none;"></a>
Converted as :
<a id="dlink" style="display: none;">MyLink</a>
See the result :
Check my solution once again; I can assure you that it works.
Update B
"Unable to locate element" is a useful message for debugging. Apart from "display: none;", you may have hidden the actual issue by saying that you "clicked on the share&more link in the first place and found it working. Troubles come up when i try to initiate a click on the link."
If you look at the HTML, the element sits beside a class="tooltip" button, so you need to add an explicit wait as follows:
# perform click on the link Share & more
# wait for presence rather than visibility: the link is display: none until its style is removed
Get_as_Excel_Workbook_link = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//li[@class='hasmore drophover']//ul//li//a[@id='dlink']")))
driver.execute_script("arguments[0].removeAttribute('style')", Get_as_Excel_Workbook_link)
Get_as_Excel_Workbook_link.click()
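An alternative to removing the style attribute is to fall back to a JavaScript click when the element is not displayed. This is only a sketch: `click_even_if_hidden` is a hypothetical helper, `driver` and `element` are assumed to be a live WebDriver and WebElement, and whether a scripted click on this particular empty `<a id="dlink">` starts the download on that site has not been verified. Clicking the visible button, as suggested in the first answer, may still be the more reliable route.

```python
def click_even_if_hidden(driver, element):
    """Click natively when the element is displayed, otherwise via JavaScript."""
    if element.is_displayed():
        element.click()
        return "native"
    # A scripted click does not require the element to be visible.
    driver.execute_script("arguments[0].click();", element)
    return "javascript"
```

The return value only reports which path was taken, which can help when debugging why a click had no effect.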

Fetching name and email from a web page [duplicate]

I'm trying to fetch data from a link. I want to fetch the name, email, location, and similar content from the web page and store it. I have written code for this, but whenever I run it, it just stores a blank list.
Please help me copy this data from the web page.
I want to fetch the company name, email, and phone number from this link and put them in an Excel file. I want to do the same for all pages of the website. I already have the logic to fetch the links in the browser and switch between them, but I'm unable to fetch the data from the website. Can anybody suggest an improvement to the code I have written?
Below is the code I have written:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
import time
from lxml import html
import requests
import xlwt
browser = webdriver.Firefox() # Get local session of firefox
# 0 wait until the pages are loaded
browser.implicitly_wait(3) # 3 secs should be enough. if not, increase it
browser.get("http://ae.bizdirlib.com/taxonomy/term/1493") # Load page
links = browser.find_elements_by_css_selector("h2 > a")
#print link
for link in links:
    link.send_keys(Keys.CONTROL + Keys.RETURN)
    link.send_keys(Keys.CONTROL + Keys.PAGE_UP)
    #tree = html.fromstring(link.text)
    time.sleep(5)
    companyNameElement = browser.find_elements_by_css_selector(".content.clearfix>div>fieldset>div>ul>li").text
    companyName = companyNameElement
    print companyNameElement
The Html code is given below
<div class="content">
<div id="node-946273" class="node node-country node-promoted node-full clearfix">
<div class="content clearfix">
<div itemtype="http://schema.org/Corporation" itemscope="">
<fieldset>
<legend>Company Information</legend>
<div style="width:100%;">
<div style="float:right; width:340px; vertical-align:top;">
<br/>
<ul>
<li>
<strong>Company Name</strong>
:
<span itemprop="name">Sabbro - F.Z.C</span>
</li>
</ul>
When I run it, I get the error 'list' object has no attribute 'text'. Can somebody help me improve the code and make it work? I have been stuck on this issue for a long time.

companyNameElement = browser.find_elements_by_css_selector(".content.clearfix>div>fieldset>div>ul>li").text
companyName = companyNameElement
print companyNameElement
find_elements_by... returns a list, and a list has no .text attribute. You can either access the first element of that list or use the equivalent find_element_by... method, which returns just the first matching element.
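If every matching `<li>` is wanted rather than just the first, the list can be iterated and `.text` read from each element. A minimal sketch, assuming `browser` is a live WebDriver; `texts_of_all` and `label_and_value` are hypothetical helpers, "css selector" is the string value of `By.CSS_SELECTOR`, and the "label : value" layout is inferred from the HTML in the question.

```python
CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def texts_of_all(driver, selector):
    """Return the stripped text of every element matching selector."""
    return [el.text.strip() for el in driver.find_elements(CSS_SELECTOR, selector)]


def label_and_value(text):
    """Split 'Company Name : Sabbro - F.Z.C' into ('Company Name', 'Sabbro - F.Z.C')."""
    label, _, value = text.partition(":")
    return label.strip(), value.strip()


# Possible usage with a real driver (not executed here):
# rows = texts_of_all(browser, ".content.clearfix>div>fieldset>div>ul>li")
# pairs = [label_and_value(row) for row in rows]
```

The resulting pairs are in a convenient shape for writing to a spreadsheet, one column per label.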

Search for Error Elements in Selenium After .submit() Using Python

I have some selenium code to input various search terms into a website's search field using the following code:
browser = webdriver.Chrome()
browser.get(url)
search_box1 = browser.find_element_by_id('searchText-0')
search_box2 = browser.find_element_by_id('searchText-2-dateInput')
search_box1.send_keys("Foobars")
search_box2.send_keys("2013")
search_box1.submit()
and then I have more code written to grab the number of hits that result from the given search query. However, for some values of "Foobars" in particular years, there are no hits and the query results in a page like this:
<body class="search">
<div id="skip">...</div>
<div style = "display:non;">...</div>
<div id="container" class="js">
<div id="header">...</div>
<div id="search">...</div>
<div id="helpContent">...</div>
<div id="main-body" class="noBg">
<div class="error">
<div>Sorry. There are no articles that contain all the keywords you entered.</div>
<p>Possible reasons:
</p>
<ul></ul>
<p></p>
<p></p>
</div>
</div>
How can I check that the search query is this rather than the page I get when there are hits from the search query? I was going to implement an if statement to check that the search query returned something, but I can't seem to figure out the right syntax to get the error element I need to do this. I've tried things like:
Error=browser.find_element_by_name('error')
or
Error=browser.find_element_by_xpath("//div[@class='error']")
But I keep getting the error:
selenium.common.exceptions.NoSuchElementException: Message: u'no such element\n
I want to identify the error element so I can do something like
if Error == "There are no articles that contain all the keywords you entered":
    do something
else:
    do something else
or even better, something that will tell me if the error exists to use for the conditional. Any help would be much appreciated.

Perhaps you are getting there too fast? Try:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
....
Error = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.XPATH, "//div[@class='error']")))

I found a solution that seems to work pretty well. You need to import NoSuchElementException from the selenium.common.exceptions module and then use try/except. In the example code I navigate to a page given by url and input Foobars and 2013 into two search fields. Then I recover the number of hits, which is stored in navBreadcrumb.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
browser = webdriver.Chrome()
browser.get(url)
search_box1 = browser.find_element_by_id('searchText-0')
search_box2 = browser.find_element_by_id('searchText-2-dateInput')
search_box1.send_keys("Foobars")
search_box2.send_keys("2013")
search_box1.submit()
try:
    Hits = browser.find_element_by_id('navBreadcrumb').text
    Hits = int(Hits)
except NoSuchElementException:
    Hits = 0
browser.quit()
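For the "tell me if the error exists" part of the question, the plural `find_elements` can serve as an existence check, because it returns an empty list instead of raising NoSuchElementException. A minimal sketch, assuming `browser` is a live WebDriver and the error block is the `div` with class `error` shown in the question; `search_error_text` is a hypothetical helper and "css selector" is the string value of `By.CSS_SELECTOR`.

```python
CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def search_error_text(driver):
    """Return the text of the first div.error, or None when no error block exists."""
    errors = driver.find_elements(CSS_SELECTOR, "div.error")
    return errors[0].text if errors else None


# Possible usage with a real driver (not executed here):
# message = search_error_text(browser)
# if message is None:
#     ...  # the search returned hits
# else:
#     ...  # no articles matched
```

If an implicit wait is configured, the empty-list case only returns after that timeout has elapsed, so the check can be slow on pages that do have hits.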
