selenium accessing text input element with changing ID using chrome webdriver - python

I am having trouble accessing an input element on this specific webpage: http://prod.symx.com/MTECorp/config.asp?cmd=edit&CID=428D77C8A7ED4DA190E6170116F3A71B
If the page has timed out, click on the link below
https://www.mtecorp.com/click-find/
and then click the hyperlink "RL_reactors" to get to the page.
On this page, I am trying to access the search bar (an input element) to type in a part number that the company sells. This is for a school project that collects pricing data from different companies. I am using PyCharm (Python) and Selenium to write the script. This is the relevant snippet of my code:
# web scraping for MTE product cost list
# reading excel files on the drive
import time
from openpyxl import workbook, load_workbook
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
.........................
..more code
..........................
# part that is getting stuck
if (selection >= 1) and (selection <= 7):
    print("valid selection going to page...")
    if selection == 1:
        target = driver.find_element(By.XPATH, "/html/body/main/article/div/div/div/table/tbody/tr[1]/td[1]/a")
        driver.execute_script("arguments[0].click();", target)
        element = WebDriverWait(driver, 100).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".plxsty_pid"))).send_keys("test")
        print("passed clickable element argument\n")
Currently, my code does reach the RL_reactors page, but when I use a CSS selector with the class name, Selenium doesn't find the element I'm trying to get. Many would ask why I don't locate the element by its id or by an XPath built on the id. The reason is that the element's id changes on every run of the script: on one run the id might be "hr8", and on the next it could be "dsfsih". From what I have observed, the only parts of the element that stay constant are the value and the class name. I have tried XPath, id, CSS selector and so on, but with no result. Any suggestions?
Thanks!

The link opens the target page in a new tab, and Selenium does not switch to a new tab on its own, so after the JavaScript click the driver is still looking at the original page (hence it cannot locate the class you are searching for). You have to tell Selenium explicitly to switch to the new window.
url = "https://www.mtecorp.com/click-find/"
driver.get(url)
target=driver.find_element(By.XPATH,"/html/body/main/article/div/div/div/table/tbody/tr[1]/td[1]/a")
driver.execute_script("arguments[0].click();", target)
driver.switch_to.window(driver.window_handles[1])
# keep the element itself; clear() and send_keys() both return None
element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".plxsty_pid")))
element.clear()
element.send_keys("test")
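Indexing window_handles[1] assumes the new tab is already open and is the second handle. As a rough sketch (the helper name switch_to_new_window and its parameters are made up for illustration, not part of Selenium), you could remember the handles before the click and switch to whichever handle appears afterwards. It only relies on driver.window_handles and driver.switch_to.window():

```python
import time


def switch_to_new_window(driver, handles_before, timeout=10.0, poll=0.2):
    """Switch to the first window handle that was not open before the click.

    Hypothetical helper: `handles_before` is a copy of driver.window_handles
    taken before the click that opens the new tab.
    """
    deadline = time.monotonic() + timeout
    while True:
        new_handles = [h for h in driver.window_handles if h not in handles_before]
        if new_handles:
            driver.switch_to.window(new_handles[0])
            return new_handles[0]
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window or tab appeared")
        time.sleep(poll)


# Intended use with a real driver (not executed here):
#   handles_before = list(driver.window_handles)
#   driver.execute_script("arguments[0].click();", target)
#   switch_to_new_window(driver, handles_before)
```

After the switch, the wait for ".plxsty_pid" runs against the new tab instead of the original page.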
Alternatively, instead of clicking the link, you can grab its href and load it in the same window by calling driver.get() again, so no tab switch is needed.
url = "https://www.mtecorp.com/click-find/"
driver.get(url)
target_link=driver.find_element(By.XPATH,"/html/body/main/article/div/div/div/table/tbody/tr[1]/td[1]/a").get_attribute('href')
driver.get(target_link)
# keep the element itself; clear() and send_keys() both return None
element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".plxsty_pid")))
element.clear()
element.send_keys("test")

Related

How to use selenium for webscraping google flights?

I'm trying to pull the airline names and prices of a specific flight. I'm having trouble with the XPath and/or with using the right HTML tags, because when I run the code below, all I get back is 14 empty lists.
from selenium import webdriver
from lxml import html
from time import sleep
driver = webdriver.Chrome(r"C:\Users\14074\Python\chromedriver")
URL = 'https://www.google.com/travel/flights/searchtfs=CBwQAhopagwIAxIIL20vMHBseTASCjIwMjEtMTItMjNyDQgDEgkvbS8wMWYwOHIaKWoNCAMSCS9tLzAxZjA4chIKMjAyMS0xMi0yN3IMCAMSCC9tLzBwbHkwcAGCAQsI____________AUABSAGYAQE&tfu=EgYIAhAAGAA'
driver.get(URL)
sleep(1)
tree = html.fromstring(driver.page_source)
for flight_tree in tree.xpath('//div[@class="TQqf0e sSHqwe tPgKwe ogfYpf"]'):
    title = flight_tree.xpath('.//*[@id="yDmH0d"]/c-wiz[2]/div/div[2]/div/c-wiz/div/c-wiz/div[2]/div[2]/div/div[2]/div[6]/div/div[2]/div/div[1]/div/div[1]/div/div[2]/div[2]/div[2]/span/text()')
    price = flight_tree.xpath('.//span[contains(@data-gs, "CjR")]')
    print(title, price)
#driver.close()
This is just the first part of my code but I can't really continue without getting this to work. If anyone has some ideas on what I'm doing wrong that would be amazing! It's been driving me crazy. Thank you!
I noticed a few issues with your code. First, I believe that when you open this page, Google shows the "I agree to terms and conditions" popup before the page content, so you need to click that button first.
Also, you should call find_elements_by_xpath directly on the driver instead of parsing the page source, because the driver queries the live page after JavaScript has rendered its content. You can find more info here: python tree.xpath return empty list
For more on scraping with Selenium and Python, you could check out this guide: https://www.webscrapingapi.com/python-selenium-web-scraper/
I used the following code to scrape the titles. (I also changed the XPaths, copying them directly from Google Chrome. To do that, right-click an element -> Inspect, then in the Elements tab right-click the element -> Copy -> Copy XPath.)
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
# I used these for the code to work on my windows subsystem linux
option = webdriver.ChromeOptions()
option.add_argument('--no-sandbox')
option.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(ChromeDriverManager().install(), options=option)
URL = 'https://www.google.com/travel/flights/searchtfs=CBwQAhopagwIAxIIL20vMHBseTASCjIwMjEtMTItMjNyDQgDEgkvbS8wMWYwOHIaKWoNCAMSCS9tLzAxZjA4chIKMjAyMS0xMi0yN3IMCAMSCC9tLzBwbHkwcAGCAQsI____________AUABSAGYAQE&tfu=EgYIAhAAGAA'
driver.get(URL)
driver.find_element_by_xpath('//*[@id="yDmH0d"]/c-wiz/div/div/div/div[2]/div[1]/div[4]/form/div[1]/div/button/span').click() # this is necessary to press the I agree button
elements = driver.find_elements_by_xpath('//*[@id="yDmH0d"]/c-wiz[2]/div/div[2]/div/c-wiz/div/c-wiz/div[2]/div[3]/div[3]/c-wiz/div/div[2]/div[1]/div/div/ol/li')
for flight_tree in elements:
    title = flight_tree.find_element_by_xpath('.//*[@class="W6bZuc YMlIz"]').text
    print(title)
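The loop above works because each lookup is relative to the current list item rather than to the whole document. To illustrate just that idea without a browser, here is a small sketch using the standard library's ElementTree on a hand-written snippet. The markup, SAMPLE and extract_rows are invented for the example and are not Google's real page structure:

```python
import xml.etree.ElementTree as ET

# Invented markup standing in for a rendered result list.
SAMPLE = """
<ol>
  <li><h3>Tokyo</h3><div><span>from</span><span>38,473</span></div></li>
  <li><h3>Mumbai</h3><div><span>from</span><span>3,515</span></div></li>
</ol>
"""


def extract_rows(markup):
    """Return (title, price) pairs, one per <li>, using row-relative paths."""
    root = ET.fromstring(markup)
    rows = []
    for item in root.findall(".//li"):
        # "./..." starts at the current <li>, not at the document root
        title = item.findtext("./h3")
        price = item.findtext("./div/span[2]")
        rows.append((title, price))
    return rows


print(extract_rows(SAMPLE))
```

An expression such as .//*[@id="yDmH0d"]/... inside the loop looks for the element with that id below the current row, which is likely why the original attempt returned empty lists.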
I tried the code below, with the window maximized and explicit waits, and could successfully extract the information:
Sample code :
driver = webdriver.Chrome(driver_path)
driver.maximize_window()
driver.get("https://www.google.com/travel/flights/searchtfs=CBwQAhopagwIAxIIL20vMHBseTASCjIwMjEtMTItMjNyDQgDEgkvbS8wMWYwOHIaKWoNCAMSCS9tLzAxZjA4chIKMjAyMS0xMi0yN3IMCAMSCC9tLzBwbHkwcAGCAQsI____________AUABSAGYAQE&tfu=EgYIAhAAGAA")
wait = WebDriverWait(driver, 10)
titles = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//div/descendant::h3")))
for name in titles:
    print(name.text)
    price = name.find_element(By.XPATH, "./../following-sibling::div/descendant::span[2]").text
    print(price)
Imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Output :
Tokyo
₹38,473
Mumbai
₹3,515
Dubai
₹15,846

Dynamic element (table) in page is not updated when I use click() in Selenium, so I can't retrieve the new data

Page I need to scrape data from: Digikey Search result
Issue
Each table shows at most 100 rows, so I have to move between multiple tables using the next-page button.
As the code below shows, I do click it, but every time I get the results of the first table; my click action ActionChains(driver).click(element).perform() does not move on to the next table.
Keep in mind that no new page is opened. The click is intercepted by some JavaScript that loads a new table of data on the same page.
My Expectations
I am just trying to confirm that I can move to the next table; then I will edit the code to loop through all of them.
This piece of code should return the data from the second table of results, but it actually returns the values from the first table, which loaded initially with the URL. This means that either the click did not occur, or it occurred but the WebDriver's view of the page is not updated after the JavaScript changes the page.
I would appreciate any help. Thanks.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located
from selenium.webdriver import ActionChains
import time
import sys
url = "https://www.digikey.com/en/products/filter/coaxial-connectors-rf-terminators/382?s=N4IgrCBcoA5QjAGhDOl4AYMF9tA"
chrome_driver_path = "..PATH\\chromedriver"
chrome_options = Options()
chrome_options.add_argument("--headless")
# named "browser" so the imported webdriver module is not shadowed
browser = webdriver.Chrome(
    executable_path=chrome_driver_path,
    options=chrome_options,
)
with browser as driver:
    wait = WebDriverWait(driver, 10)
    driver.get(url)
    wait.until(presence_of_element_located((By.CSS_SELECTOR, "tbody")))
    element = driver.find_element_by_css_selector("button[data-testid='btn-next-page']")
    ActionChains(driver).click(element).perform()
    time.sleep(10)  # too much time, I know, but this rules out a waiting issue
    results = driver.find_elements_by_css_selector("tbody")
    for count in results:
        countArr = count.text
        print(countArr)
        print()
    driver.close()
Finally found a solution!
Source of the solution.
As expected, the issue was the click action itself. It was either not performed correctly or not performed at all, as described in the source question.
The solution is to click the button by executing JavaScript.
Change the line
ActionChains(driver).click(element).perform()
to the following:
driver.execute_script("arguments[0].click();", element)
That's it.
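To check that the table really changed, rather than sleeping for a fixed 10 seconds, one option is to compare some piece of page state before and after the click. This is only a sketch: js_click_and_wait and read_state are hypothetical names, and read_state is whatever callable you choose for reading the table (for example its text).

```python
import time


def js_click_and_wait(driver, button, read_state, timeout=10.0, poll=0.25):
    """Click `button` through JavaScript, then poll until read_state() changes.

    Hypothetical helper: `read_state` is a zero-argument callable returning
    something comparable, such as the text of the results table.
    """
    before = read_state()
    driver.execute_script("arguments[0].click();", button)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        after = read_state()
        if after != before:
            return after
        time.sleep(poll)
    raise TimeoutError("content did not change after the click")


# Intended use with a real driver (not executed here):
#   table_text = js_click_and_wait(
#       driver, element,
#       read_state=lambda: driver.find_element("css selector", "tbody").text,
#   )
```

If the call times out, the click most likely did not trigger the page's next-table handler, which is the symptom described in the question.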

Having trouble referencing to a certain element on page with Selenium

I am having a terribly hard time referencing a certain "next page" button on a website that I am trying to scrape links from [https://www.sreality.cz/adresar?strana=2]. If you scroll down you can see a red right-arrow button that you can click to go to the next page, which makes the website load new dynamic content. Every approach seems to report the same exact error, and I don't know how I am supposed to point to the element without running into it.
This is the code that I currently have :
from selenium import webdriver
chromedriver_path = "/home/user/Dokumenty/iCloud/RealityScraper/chromedriver"
driver = webdriver.Chrome(chromedriver_path)
print("WebDriver Successfully Initialized")
driver.get("https://www.sreality.cz/adresar?strana=2")
links = driver.find_elements_by_css_selector("h2.title a")
nextPage = driver.find_element_by_css_selector("li.paging-item a.btn-paging-pn.icof.icon-arr-right.paging-next")
for link in links:
    print(link.get_attribute("href"))
nextPage.click()
The "nextPage" variable holds the element that should be clicked once the loop over "links" has finished scraping all the links from the company titles. However, when I run this code I get an error:
selenium.common.exceptions.StaleElementReferenceException: Message:
stale element reference: element is not attached to the page document
I have been searching for various fixes online, but none of them resolved the issue. I think that at this point the issue is not that the element loads too slowly, but that Selenium has trouble finding the element because of a wrong reference.
Because of this I tried using XPath to point precisely to the element, so I changed the "nextPage" variable to:
nextPage = driver.find_element_by_xpath("""/html/body/div[2]/div[1]/div[2]/div[2]/div[4]/div/div/div/div[2]/div/div[2]/ul[1]/li[12]/a""")
This returns exactly the same error as stated above. I have been trying to find a solution for hours now and I can't understand where the issue lies. I would be grateful if anyone could explain what I am doing wrong. Thanks to anyone.
If you want to get the ng-href attribute of every link on every page, you can use the loop below. Alternatively, you could look into their API.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
# driver is the webdriver instance created in the question
driver.get("https://www.sreality.cz/adresar?strana=2")
wait = WebDriverWait(driver, 10)
while True:
    try:
        # re-locate the links and the button on every pass so no reference goes stale
        links = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "h2.title > a")))
        # print(len(links))
        for link in links:
            print(link.get_attribute("ng-href"))
        nextPage = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.btn-paging-pn.icof.icon-arr-right.paging-next")))
        nextPage.click()
        time.sleep(10)
    except Exception as e:
        # the wait times out once there is no clickable next button left
        print(e)
        break
First of all, avoid absolute XPaths; they break easily. Use relative XPaths instead.
Secondly, I think you are getting the error because clicking the "Next" button loads new page content with a different DOM, so the element reference you stored earlier no longer points to anything in the current document.
You can try searching for the element again after every page load (that is, every time after clicking the "Next" button).
# imports
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
# initialize
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 20)
action = ActionChains(driver)
# Try the code below and see if it works.
Next_btn = wait.until(EC.presence_of_element_located((By.XPATH, '(//li[@class="paging-item"])[2]')))
action.move_to_element(Next_btn).click().perform()
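Both answers come down to the same rule: look the elements up again on every page instead of holding on to references from an earlier page. A minimal sketch of that pattern follows. The function collect_links and the parameters link_css, next_css and settle are hypothetical, and "css selector" is the strategy string behind By.CSS_SELECTOR:

```python
def collect_links(driver, link_css, next_css, max_pages=50, settle=lambda: None):
    """Gather href values page by page, re-locating elements on each pass.

    Hypothetical helper: `settle` is called after each click so the caller can
    plug in whatever wait suits the site (an explicit wait, a short sleep, ...).
    """
    hrefs = []
    for _ in range(max_pages):
        # Read the attributes immediately; the elements may go stale after the click.
        for link in driver.find_elements("css selector", link_css):
            hrefs.append(link.get_attribute("href"))
        next_buttons = driver.find_elements("css selector", next_css)
        if not next_buttons:
            break  # no next button on this page: last page reached
        next_buttons[0].click()
        settle()
    return hrefs


# Intended use with a real driver (not executed here):
#   collect_links(driver, "h2.title > a", "a.paging-next",
#                 settle=lambda: time.sleep(2))
```

Storing the href strings rather than the elements is what keeps the result usable after the page has moved on.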

Selenium webdriver finds elements from the previous page, not the current

I am trying to automate a YouTube search, click on a search result, and then follow the recommended videos on the right-hand side. The code I wrote goes to YouTube, runs the search, and clicks on a video, which opens in the browser. However, I cannot get it to click on one of the recommended videos.
The problem seems to be that, when I use recommended_videos = driver.find_elements_by_id("video-title") to get a list of elements of recommended videos from the right, what I get instead is a list from the previous page (when I first type in the word and get search results).
The code does work properly when I go directly to a video link with driver.get(url), instead of doing a search first.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
import random
seed = 1
random.seed(seed)
driver = webdriver.Firefox()
driver.get("http://www.youtube.com/")
element = driver.find_element_by_tag_name("input")
# Put the word "history" in the search box and hit enter
element.send_keys("history")
element.send_keys(Keys.RETURN)
time.sleep(5)
# Get a list of elements (videos) that get returned by the search
search_results = driver.find_elements_by_id("video-title")
# Click randomly on one of the first five results
search_results[random.randint(0,4)].click()
# Go to the end of the page (I don't know if this is necessary)
html = driver.find_element_by_tag_name('html')
html.send_keys(Keys.END)
time.sleep(10)
# Get the recommended videos the same way as above. This is where the problem starts, because recommended_videos essentially becomes the same thing as the previous page's search_results, even though the browser is in a new page now.
recommended_videos = driver.find_elements_by_id("video-title")
recommended_videos[random.randint(0,4)].click()
So when I click (last line), I get the error
ElementNotInteractableException: Element <a id="video-title" class="yt-simple-endpoint style-scope ytd-video-renderer" href="/watch?v=1oean5l__Cc"> could not be scrolled into view
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
import random
seed = 1
random.seed(seed)
driver = webdriver.Chrome()
driver.get("https://www.youtube.com")
element = driver.find_element_by_tag_name("input")
# Put the word "history" in the search box and hit enter
element.send_keys("history")
element.send_keys(Keys.RETURN)
time.sleep(5)
# Get a list of elements (videos) that get returned by the search
search_results = driver.find_elements_by_id("video-title")
# Click randomly on one of the first results
search_results[random.randint(0,10)].click()
time.sleep(4)
# Locate the recommended videos with an XPath specific to the watch page, instead of reusing the "video-title" id that the search results page also uses.
while True:
    recommended_videos = driver.find_elements_by_xpath("//*[@id='dismissable']/div/a")
    print(recommended_videos)
    recommended_videos[random.randint(1,4)].click()
    time.sleep(4)
I am also completely new to Selenium, and thank you for getting me interested in it. I hope this code helps you. Change the driver to Firefox if you want.
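A possible explanation for the original error, offered as an assumption rather than something checked against YouTube's markup: the site swaps pages without a full reload, so the search results may stay in the DOM but hidden, and find_elements_by_id("video-title") returns them first. If that is the case, filtering on WebElement.is_displayed() before choosing is one way around it. The helper name pick_displayed is made up for this sketch:

```python
import random


def pick_displayed(elements, rng=random):
    """Return a random element among those currently displayed.

    Hypothetical helper: works on any objects with an is_displayed() method,
    which Selenium WebElements provide.
    """
    visible = [el for el in elements if el.is_displayed()]
    if not visible:
        raise LookupError("none of the matched elements is displayed")
    return rng.choice(visible)


# Intended use with a real driver (not executed here):
#   candidates = driver.find_elements("id", "video-title")
#   pick_displayed(candidates).click()
```

This keeps the original locator and only skips the matches that the browser could not scroll into view.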

Python Selenium: find and click an element

How do I use Python to simply find a link containing text/id/etc and then click that element?
My imports:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
Code to go to my URL:
browser = webdriver.Firefox() # Get local session of firefox
browser.get(myURL) # Load page
#right here I want to look for element containing "Click Me" in the text
#and then click it
Look at the method list of the WebDriver class - http://selenium.googlecode.com/svn/trunk/docs/api/py/webdriver_remote/selenium.webdriver.remote.webdriver.html
For example, to find an element by its ID and click it, the code looks like this:
element = browser.find_element_by_id("your_element_id")
element.click()
Hi, there are several ways to find a link (or any element):
element = driver.find_element_by_id("element-id")
element = driver.find_element_by_name("element-name")
element = driver.find_element_by_xpath("//input[@id='element-id']")
element = driver.find_element_by_link_text("link-text")
element = driver.find_element_by_class_name("class-name")
I think the best option for you is find_element_by_link_text, since it's a link.
Once you have saved the element in a variable, call the click function: element.click() or element.send_keys(Keys.RETURN)
Take a look at the selenium-python documentation; there are a couple of examples there.
You just use this code
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
driver = webdriver.Firefox()
driver.get("https://play.spotify.com/")# here change your link
wait=WebDriverWait(driver,250)
# waits up to 250 seconds for the element to be present in the DOM; you can change the value
submit=wait.until(EC.presence_of_element_located((By.LINK_TEXT, 'Click Me')))
submit.click()
An explicit wait is the best fit for JavaScript-heavy websites: if the page loads in the webdriver but some elements only appear a few seconds later because of JavaScript, a wait is the right tool.
For a link containing text:
browser = webdriver.Firefox()
browser.find_element_by_link_text(link_text).click()
You can also do it by xpath:
browser.find_element_by_xpath("//a[@href='you_link']").click()
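Since the question asks for a link containing some text, an XPath using contains() is another option. The sketch below only builds the expression string; xpath_literal and link_containing are hypothetical helper names. XPath 1.0 has no escape character inside string literals, so text that holds both quote types is assembled with concat():

```python
def xpath_literal(text):
    """Quote `text` as an XPath 1.0 string literal."""
    if "'" not in text:
        return "'" + text + "'"
    if '"' not in text:
        return '"' + text + '"'
    # Both quote types present: split on single quotes and glue the pieces
    # back together with a double-quoted single quote between them.
    pieces = ["'" + part + "'" for part in text.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


def link_containing(text):
    """XPath for an <a> element whose visible text contains `text`."""
    return "//a[contains(normalize-space(.), " + xpath_literal(text) + ")]"


print(link_containing("Click Me"))
# Intended use with a real driver (not executed here):
#   browser.find_element_by_xpath(link_containing("Click Me")).click()
```

Unlike an exact link-text match, this also finds links where "Click Me" is only part of the text.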
Finding Element by Link Text
driver.find_element_by_link_text("")
Finding Element by Name
driver.find_element_by_name("")
Finding Element By ID
driver.find_element_by_id("")
Try SST - it is a simple yet very good test framework for python.
Install it first: http://testutils.org/sst/index.html
Then:
Imports:
from sst.actions import *
First define a variable - element - that you're after
element = assert_element(text="some words and more words")
I used this: http://testutils.org/sst/actions.html#assert-element
Then click that element:
click_element(element)
And that's it.
First start one instance of the browser. Then you can use any of the following methods to get an element or link. After locating the element you can use element.click() or element.send_keys(Keys.RETURN) or any other method of the located element.
browser = webdriver.Firefox()
Selenium provides the following methods to locate elements in a page:
To find individual elements. These methods will return individual element.
browser.find_element_by_id(id)
browser.find_element_by_name(name)
browser.find_element_by_xpath(xpath)
browser.find_element_by_link_text(link_text)
browser.find_element_by_partial_link_text(partial_link_text)
browser.find_element_by_tag_name(tag_name)
browser.find_element_by_class_name(class_name)
browser.find_element_by_css_selector(css_selector)
To find multiple elements (these methods will return a list). Later you can iterate through the list or with elementlist[n] you can locate individual element from list.
browser.find_elements_by_name(name)
browser.find_elements_by_xpath(xpath)
browser.find_elements_by_link_text(link_text)
browser.find_elements_by_partial_link_text(partial_link_text)
browser.find_elements_by_tag_name(tag_name)
browser.find_elements_by_class_name(class_name)
browser.find_elements_by_css_selector(css_selector)
browser.find_element_by_xpath("")
browser.find_element_by_id("")
browser.find_element_by_name("")
browser.find_element_by_class_name("")
Inside the ("") you have to put the respective xpath, id, name, class_name, etc...
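A note for readers on newer Selenium releases: the find_element_by_* and find_elements_by_* helpers used throughout these answers were deprecated in Selenium 4 and dropped in later 4.x versions, where find_element(by, value) and find_elements(by, value) are used instead. The sketch below maps the old method suffixes to the locator strategy strings that the By constants stand for. The names LEGACY_SUFFIX_TO_STRATEGY, find_one and find_all are invented for the example:

```python
# Locator strategy strings; these are the values behind selenium's By constants
# (By.ID == "id", By.LINK_TEXT == "link text", and so on).
LEGACY_SUFFIX_TO_STRATEGY = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
}


def find_one(driver, how, value):
    """Stand-in for driver.find_element_by_<how>(value)."""
    return driver.find_element(LEGACY_SUFFIX_TO_STRATEGY[how], value)


def find_all(driver, how, value):
    """Stand-in for driver.find_elements_by_<how>(value)."""
    return driver.find_elements(LEGACY_SUFFIX_TO_STRATEGY[how], value)


# Intended use with a real driver (not executed here):
#   find_one(browser, "link_text", "Click Me").click()
```

In real code you would normally import By from selenium.webdriver.common.by and pass By.LINK_TEXT directly; the plain strings are used here only to keep the sketch free of imports.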
