Parsing sub-element using selenium - python

Here is a screenshot of the html structure:
I want to go through each item of the league-list (every league-item) and look up the value of aria-expanded.
Here is my code:
_1bet = 'https://1bet.com/ca/sports/tennis?time_range=all'
driver.get(_1bet) # enter the website
league1 = driver.find_elements_by_class_name('league-list')[0] # first league-item
league1.find_element_by_xpath(".//div[@class='league-title.d-flex.justify-content-between.align-items-center.collapsible']")
Selenium can't find any element and I don't understand why.
for reference I was inspired by another post :
Parsing nested elements using selenium not working - python

First of all, you have to add a wait to make sure the page is loaded before accessing the elements on it.
Then you can get all those elements into a list and iterate over it, reading each element's attribute.
I'm not sure the list is initially expanded as it is presented in your picture.
I'm also not sure all those league-title elements match the selector you provided.
If not, we can correct the code accordingly.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 20)
_1bet = 'https://1bet.com/ca/sports/tennis?time_range=all'
driver.get(_1bet) # enter the website
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.league-list")))
league1 = driver.find_elements_by_class_name('league-list')[0] # first league-item
titles = league1.find_elements_by_xpath(".//div[@class='league-title.d-flex.justify-content-between.align-items-center.collapsible']")
for title in titles:
    print(title.get_attribute("aria-expanded"))
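As an aside, the dot-joined class list in that XPath is CSS-selector syntax; in XPath, @class is matched as a single space-separated string. A quick offline check with Python's standard-library XML parser shows the difference (the sample markup below is invented for illustration, not taken from the real page):

```python
import xml.etree.ElementTree as ET

# Invented sample mirroring the structure described in the question.
html = """
<div class="league-list">
  <div class="league-title d-flex justify-content-between align-items-center collapsible"
       aria-expanded="true">League A</div>
  <div class="league-title d-flex justify-content-between align-items-center collapsible"
       aria-expanded="false">League B</div>
</div>
"""
root = ET.fromstring(html)

# @class is one space-separated string in XPath, so match with spaces:
spaced = ".//div[@class='league-title d-flex justify-content-between align-items-center collapsible']"
titles = root.findall(spaced)
print([t.get("aria-expanded") for t in titles])  # ['true', 'false']

# The dot-joined form from the question matches nothing:
dotted = ".//div[@class='league-title.d-flex.justify-content-between.align-items-center.collapsible']"
print(root.findall(dotted))  # []
```

If the real page's class attribute differs from this sample, adjust the space-separated string accordingly.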

Related

How to click on the first link of an opensecrets page without getting an indexError?

I'm trying to click on the first link in the results page of an opensecrets.org query. I've searched "chase" using selenium, and I now want to go to the page of the first link available. Eventually I want to loop through a whole list of searches and click on the first link, but now I can't even get my code to run with this one element's specific xpath. When I run this code:
result = driver.find_elements_by_xpath(
    '//*[@id="___gcse_0"]/div/div/div/div[5]/div[2]/div/div/div[1]/div[1]/div[1]/div[1]/div/a')[0]
result.click()
I get an error: IndexError: list index out of range. Does anyone know why I'm getting this error, or is there a more general way to do this so that it would work with any search? I keep seeing answers for clicking the first link of a Google search, but they all use the tag name "cite", so I'm unsure how to apply that to an opensecrets search.
To just click on the first link, you can use the below XPath:
(//a[contains(@class, 'gs-title')])[1]
And in code, you can use it like this:
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.XPATH, "(//a[contains(@class, 'gs-title')])[1]"))).click()
Imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
And if you'd like to do it in a loop:
counter = 1
action = ActionChains(driver)
for link in driver.find_elements(By.XPATH, f"(//a[contains(@class, 'gs-title')])[{counter}]"):
    action.move_to_element(link).click().perform()
    if counter == 1:
        break
    counter = counter + 2

How to scrape <li> tag with class like active/selected?

I'm trying to scrape a list from a website. There are two different lists, and one will load only after the first option is chosen. Issue is, I'm unable to select the first option. I scraped the list of all available options. But after writing it, I have to select it from the given option, and I'm unable to do so.
I've tried using browser.find_element_by_css_selector(....).click(), but it's showing an element-not-found exception even after putting in the proper wait condition. I think that's because it's unable to find that element.
browser.find_element_by_css_selector("#Brand_name").send_keys(company[i])
element= browser.find_element_by_css_selector("#Brand_name_selectWrap")
browser.implicitly_wait(5) # seconds
browser.find_element_by_css_selector("""#Brand_name_selectWrap > ul > li.selected""").click()
PS: The following is the link I'm trying to scrape. I need all the mobiles listed for every company.
https://bangalore.quikr.com/Escrow/post-classifieds-ads/?postadcategoryid=227
Can someone kindly suggest some better way?
You can gather all options and their rel attributes into a dictionary and then loop that with appropriate wait conditions for the sublist to appear:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
d = webdriver.Chrome()
d.get('https://bangalore.quikr.com/Escrow/post-classifieds-ads/?postadcategoryid=227')
options = {i.get_attribute('textContent'):i.get_attribute('rel') for i in d.find_elements_by_css_selector('#Brand_name_selectWrap .optionLists li:not(.optionHeading) a')}
input_element = d.find_element_by_id('Brand_name')
for k, v in options.items():
    input_element.click()
    input_element.send_keys(k)
    selector = '[rel="' + v + '"]'
    WebDriverWait(d, 3).until(EC.element_to_be_clickable((By.CSS_SELECTOR, selector))).click()
    WebDriverWait(d, 2).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#Model_selectWrap.showCustomSelect")))
I created a dictionary that holds all the options for mobiles. The keys are the actual text that goes into the input box, and the values are the rel attribute values of those elements. Each option has a rel attribute, which means I can enter the phone name via the key to generate the dropdown, then use the rel attribute in a CSS attribute = value selector to make sure I click on the right one.
The rel attribute inside anchor tags (<a>) describes the relation between the current document and the document the link points to.
The selector variable just holds the current CSS attribute = value selector for getting a mobile dropdown option by its rel attribute value.
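The dictionary-building idea can be sketched offline with the standard-library parser; the markup and rel values below are invented to mimic the brand dropdown, not taken from the real page:

```python
import xml.etree.ElementTree as ET

# Invented fragment mimicking the brand dropdown structure.
html = """
<div id="Brand_name_selectWrap">
  <ul class="optionLists">
    <li class="optionHeading"><a>Popular</a></li>
    <li><a rel="43">Samsung</a></li>
    <li><a rel="44">Apple</a></li>
  </ul>
</div>
"""
root = ET.fromstring(html)

# Same idea as the answer's dict comprehension: text -> rel, skipping headings.
options = {a.text: a.get("rel")
           for li in root.iter("li") if li.get("class") != "optionHeading"
           for a in li.findall("a")}
print(options)  # {'Samsung': '43', 'Apple': '44'}

# Build the attribute = value selector from a chosen key:
selector = '[rel="' + options["Samsung"] + '"]'
print(selector)  # [rel="43"]
```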
The element is covered by another element, so you'll need to use ActionChains to perform the click.
You'll need to import it:
from selenium.webdriver.common.action_chains import ActionChains
Then click on the input, and only after that call send_keys:
input_el = browser.find_element_by_css_selector('#Brand_name')
ActionChains(browser).move_to_element(input_el).click().perform()
It's a good practice to use WebDriverWait to validate your conditions:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
wait = WebDriverWait(browser, 15)
input_el = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,"#Brand_name")))
ActionChains(browser).move_to_element(input_el).click().perform()

Selenium Python Unable to locate element

I'm trying to gather price information for each product variation from this web page: https://www.safetysign.com/products/7337/ez-pipe-marker
I'm using Selenium and FireFox with Python 3 and Windows 10.
Here is my current code:
driver = webdriver.Firefox()
driver.get('https://www.safetysign.com/products/7337/ez-pipe-marker')
#frame = driver.find_element_by_class_name('product-dual-holder')
# driver.switch_to.frame('skuer5c866ddb91611')
# driver.implicitly_wait(5)
driver.find_element_by_id('skuer5c866ddb91611-size-label-324').click()
price = driver.find_element_by_class_name("product-pricingnodecontent product-price-content").text.replace('$', '')
products.at[counter, 'safetysign.com Price'] = price
print(price)
print(products['safetysign.com URL'].count()-counter)
So I'm trying to start by just selecting the first product variation by id (I've also tried the class name). But I get an "Unable to locate element" error. As suggested in numerous SO posts, I tried to change frames (even though I can't find a frame tag in the HTML that contains this element). I tried switching to different frames using the index, class name, and id of different div elements that I thought might be frames, but none of this worked. I also tried using waits, but that returned the same error.
Any idea what I am missing or doing wrong?
To locate the elements you have to induce WebDriverWait for visibility_of_all_elements_located(). You can create a list and iterate over it to click() each item, using the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver=webdriver.Firefox(executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("https://www.safetysign.com/products/7337/ez-pipe-marker")
for product in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//form[@class='product-page-form']//div[@class='sku-contents']//following::ul[1]/li//label[starts-with(@for, 'skuer') and contains(., 'Pipe')]"))):
    WebDriverWait(driver, 20).until(EC.visibility_of(product)).click()
driver.quit()
They may well be dynamic. Select by a label type selector instead, and index in to click on the required item, e.g. 0 for the item you mention (first in the list). Also, add a wait condition for the labels to be present.
If you want to limit it to just those 5 size choices, then use the following CSS selector instead of label:
.sku-contents ul:nth-child(3) label
i.e.
sizes = WebDriverWait(driver,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".sku-contents ul:nth-child(3) label")))
sizes[0].click()
After selecting size you can grab the price from the price node depending on whether you want the price for a given sample size e.g. 0-99.
To get final price use:
.product-under-sku-total-label
Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
url = 'https://www.safetysign.com/products/7337/ez-pipe-marker'
driver = webdriver.Chrome()
driver.get(url)
labels = WebDriverWait(driver,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "label")))
labels[0].click()
price0to99 = driver.find_element_by_css_selector('.product-pricingnodecontent').text
priceTotal = driver.find_element_by_css_selector('.product-under-sku-total-label').text
print(priceTotal, price0to99)
# driver.quit()

Trouble getting the second link when the first link contains a certain keyword right next to it

I've created a script in Python, in association with Selenium, to get the first link (as served by duckduckgo.com) for any search term, unless the keyword Ad is right next to that link, as in the image below. If the first link contains that very keyword, the script should get the second link instead and quit.
My current search is houzz
This is my try (it always gets the first link irrespective of the presence of that keyword Ad):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
link = "https://duckduckgo.com/?q={}&ia=web"
def get_info(driver, keyword):
    driver.get(link.format(keyword))
    for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "h2.result__title"))):
        lead_link = item.find_element_by_css_selector("a.result__a").get_attribute("href")
        break
    print(lead_link)

if __name__ == '__main__':
    chromeOptions = webdriver.ChromeOptions()
    chromeOptions.add_argument("--headless")
    driver = webdriver.Chrome(options=chromeOptions)
    wait = WebDriverWait(driver, 10)
    try:
        get_info(driver, "*houzz*")
    finally:
        driver.quit()
How can I rectify my script to get the second link when the Ad keyword is adjacent to the first one?
It looks like just add #links:
lead_link = item.find_element_by_css_selector("#links a.result__a").get_attribute("href")
The ads are inside of a #ads div
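The scoping idea can be checked offline with the standard-library parser; the markup below is invented to mimic the ad/organic split, with ElementTree's limited XPath standing in for the CSS selector:

```python
import xml.etree.ElementTree as ET

# Invented markup: ad links live under #ads, organic results under #links.
html = """
<div>
  <div id="ads"><a class="result__a" href="https://ad.example.com">Sponsored</a></div>
  <div id="links"><a class="result__a" href="https://www.houzz.com">Houzz</a></div>
</div>
"""
root = ET.fromstring(html)

# An unscoped search picks up the ad first (document order):
print(root.find(".//a").get("href"))  # https://ad.example.com

# Scoping to the #links container skips the ad block entirely:
links_div = root.find(".//div[@id='links']")
print(links_div.find(".//a").get("href"))  # https://www.houzz.com
```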
You can use the XPath
//h2[not(./span)]/a
Here, h2 is the container for the entire link plus the Ad icon; h2 elements with span children are excluded, since those spans hold the Ad icons; the a child is the result hyperlink you actually want.
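ElementTree can't evaluate not() predicates, but the same filtering logic can be mimicked in plain Python to sanity-check the idea (sample markup invented for illustration):

```python
import xml.etree.ElementTree as ET

# Invented results markup: the first h2 carries an Ad badge in a span.
html = """
<div>
  <h2 class="result__title"><span>Ad</span><a class="result__a" href="https://ad.example.com">Sponsored</a></h2>
  <h2 class="result__title"><a class="result__a" href="https://www.houzz.com">Houzz</a></h2>
</div>
"""
root = ET.fromstring(html)

# Mimic //h2[not(./span)]/a: keep only h2 elements without a span child.
links = [h2.find("a") for h2 in root.iter("h2") if h2.find("span") is None]
print(links[0].get("href"))  # https://www.houzz.com
```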

Python Selenium: find and click an element

How do I use Python to simply find a link containing text/id/etc and then click that element?
My imports:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
Code to go to my URL:
browser = webdriver.Firefox() # Get local session of firefox
browser.get(myURL) # Load page
#right here I want to look for element containing "Click Me" in the text
#and then click it
Look at the method list of the WebDriver class - http://selenium.googlecode.com/svn/trunk/docs/api/py/webdriver_remote/selenium.webdriver.remote.webdriver.html
For example, to find an element by its ID and click it, the code will look like this:
element = browser.find_element_by_id("your_element_id")
element.click()
Hi, there are a couple of ways to find a link (or any element):
element = driver.find_element_by_id("element-id")
element = driver.find_element_by_name("element-name")
element = driver.find_element_by_xpath("//input[@id='element-id']")
element = driver.find_element_by_link_text("link-text")
element = driver.find_element_by_class_name("class-name")
I think the best option for you is find_element_by_link_text, since it's a link.
Once you have the element saved in a variable, call its click function: element.click() or element.send_keys(Keys.RETURN).
Take a look at the selenium-python documentation; there are a couple of examples there.
You just use this code
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
driver = webdriver.Firefox()
driver.get("https://play.spotify.com/")# here change your link
wait = WebDriverWait(driver, 250)
# it will wait up to 250 seconds for an element to come into view; you can change the value
submit=wait.until(EC.presence_of_element_located((By.LINK_TEXT, 'Click Me')))
submit.click()
Here an explicit wait is the best fit for JavaScript-enabled websites: suppose the page loads in the webdriver, but due to JavaScript some element only appears a few seconds later; then it's best to wait for that element.
For a link containing text:
browser = webdriver.Firefox()
browser.find_element_by_link_text(link_text).click()
You can also do it by xpath:
browser.find_element_by_xpath("//a[@href='your_link']").click()
Finding Element by Link Text
driver.find_element_by_link_text("")
Finding Element by Name
driver.find_element_by_name("")
Finding Element By ID
driver.find_element_by_id("")
Try SST - it is a simple yet very good test framework for python.
Install it first: http://testutils.org/sst/index.html
Then:
Imports:
from sst.actions import *
First, define a variable for the element you're after:
element = assert_element(text="some words and more words")
I used this: http://testutils.org/sst/actions.html#assert-element
Then click that element:
click_element(element)
And that's it.
First, start an instance of the browser. Then you can use any of the following methods to get an element or link. After locating the element, you can use element.click(), element.send_keys(Keys.RETURN), or any other method of the selenium webdriver element.
browser = webdriver.Firefox()
Selenium provides the following methods to locate elements in a page:
To find individual elements. These methods will return individual element.
browser.find_element_by_id(id)
browser.find_element_by_name(name)
browser.find_element_by_xpath(xpath)
browser.find_element_by_link_text(link_text)
browser.find_element_by_partial_link_text(partial_link_text)
browser.find_element_by_tag_name(tag_name)
browser.find_element_by_class_name(class_name)
browser.find_element_by_css_selector(css_selector)
To find multiple elements (these methods will return a list). You can then iterate through the list, or locate an individual element with elementlist[n].
browser.find_elements_by_name(name)
browser.find_elements_by_xpath(xpath)
browser.find_elements_by_link_text(link_text)
browser.find_elements_by_partial_link_text(partial_link_text)
browser.find_elements_by_tag_name(tag_name)
browser.find_elements_by_class_name(class_name)
browser.find_elements_by_css_selector(css_selector)
browser.find_element_by_xpath("")
browser.find_element_by_id("")
browser.find_element_by_name("")
browser.find_element_by_class_name("")
Inside the ("") you have to put the respective xpath, id, name, class_name, etc...
