Clicking on LI link using Selenium Webdriver (Python)

New to Selenium here. Thanks for the help in advance! (Solved)
I've had success clicking on links using the code below, but I'm having a hard time clicking on links inside an LI element. I've referenced a couple of other Stack Overflow pages but have yet to find a solution.
In this case, I am trying to click on the page number "2" and then run my scraper (which works for page 1) on all subsequent pages. Note that clicking on page 2 changes the table (i.e., a new set of stock tickers and information gets pulled up), but the page URL itself does not change.
Website link: https://www.gurufocus.com/insider/summary
Here is what I am trying to click on:
The number 2, highlighted in yellow
What I see when I inspect the element: an li with the text "2" inside a ul with class el-pager.
I can click on a different link (titled "Can Aggregated Insider Trading Activities Predict the Market?") on the same page via the code below, but when I input "2" instead, I get an error: "NoSuchElementException: Message: no such element: Unable to locate element: {"method":"link text","selector":"2"}"
In summary, I would like to "click" on page 2 to bring up more stock information and then run my scraper through it (then use a for loop for the rest of the pages). I won't have any trouble creating the for loop to scrape multiple pages, but I can't seem to get Selenium to click on the next page for me.
Solved Code - thanks for all the help everyone!
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
PATH = "C:\\Users\\MYUSERNAME\\Webdrivers\\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://www.gurufocus.com/insider/summary")
try:
    # wait for the pager to render, then click the "2" button
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//ul[@class='el-pager']/li[text()='2']"))
    )
    element.click()
except:
    driver.quit()

Your issue is that page "2" is not link text. Notice that "Can Aggregated Insider Trading Activities Predict the Market?" is link text because its tag is "a". Add this to click on your page 2:
driver.find_element_by_xpath("//ul[@class='el-pager']/li[text()='2']").click()
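If you then want to walk every page (the for loop mentioned in the question), here is a minimal sketch, assuming the pager keeps the same ul.el-pager markup on every page; the page range below is a placeholder for the real page count:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome("C:\\Users\\MYUSERNAME\\Webdrivers\\chromedriver.exe")
driver.get("https://www.gurufocus.com/insider/summary")
wait = WebDriverWait(driver, 10)

for page in range(2, 6):  # placeholder range; set to the real page count
    # re-locate the pager button on each pass so the reference is fresh
    button = wait.until(EC.presence_of_element_located(
        (By.XPATH, f"//ul[@class='el-pager']/li[text()='{page}']")))
    button.click()
    # ... run the working page-1 scraper here on the refreshed table ...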

Related

selenium accessing text input element with changing ID using chrome webdriver

I am having trouble accessing an input element on this specific webpage. http://prod.symx.com/MTECorp/config.asp?cmd=edit&CID=428D77C8A7ED4DA190E6170116F3A71B
If the webpage has timed out, just go ahead and click the link below
https://www.mtecorp.com/click-find/
and click on the hyperlink "RL_reactors" to get to the page.
On this page, I am trying to access the search bar/input element to type in a part number that the company sells. This is for a school project collecting pricing data from different companies. I am using PyCharm (Python) and Selenium to write this script. This is the relevant snippet of my code at the moment:
# web scraping for MTE product cost list
# reading excel files on the drive
import time
from openpyxl import Workbook, load_workbook
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
.........................
..more code
..........................
# part that is getting stuck
if (selection >= 1) and (selection <= 7):
    print("valid selection going to page...")
    if selection == 1:
        target = driver.find_element(By.XPATH, "/html/body/main/article/div/div/div/table/tbody/tr[1]/td[1]/a")
        driver.execute_script("arguments[0].click();", target)
        element = WebDriverWait(driver, 100).until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, ".plxsty_pid"))).send_keys("test")
        print("passed clickable element argument\n")
Currently, my code does get to the RL_reactors page as shown below, but when I use a CSS selector by class name it doesn't find the element I'm trying to get. Of course many would say: why not use XPath, etc.? The reason I can't use XPath is that the element id changes on every run of the script. For example, on the first run the id might be "hr8", while on another run it could be "dsfsih". From what I can observe, the only parts of the element that stay constant are the value and the class name. I have tried XPath, id, CSS selector, and so on, but to no result. Any suggestions?
Thanks!
Because you are using JavaScript to click the link on your website, Selenium doesn't switch tabs (hence it cannot locate the class you are searching for). You can explicitly tell Selenium to change to the new tab window.
url = "https://www.mtecorp.com/click-find/"
driver.get(url)
target = driver.find_element(By.XPATH, "/html/body/main/article/div/div/div/table/tbody/tr[1]/td[1]/a")
driver.execute_script("arguments[0].click();", target)
driver.switch_to.window(driver.window_handles[1])  # focus the newly opened tab
element = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.CSS_SELECTOR, ".plxsty_pid")))
element.clear()
element.send_keys('test')
Alternatively, instead of clicking the link, you can grab its href and open it directly by calling driver.get() again.
url = "https://www.mtecorp.com/click-find/"
driver.get(url)
target_link = driver.find_element(By.XPATH, "/html/body/main/article/div/div/div/table/tbody/tr[1]/td[1]/a").get_attribute('href')
driver.get(target_link)
element = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.CSS_SELECTOR, ".plxsty_pid")))
element.clear()
element.send_keys("test")

Python web scraping with Selenium on Dynamic Page - Issue with looping to next element

I am hoping someone can please help me out and put me out of my misery. I have recently started to learn Python and wanted to challenge myself with some web-scraping.
Over the past couple of days I have been trying to web-scrape this website (https://ebn.eu/?p=members). On the website, I am interested in:
Clicking on each logo image which brings up a pop-up
From the pop-up scrape the link which is behind the text "VIEW FULL PROFILE"
Move to the next logo and do the same for each
I have managed to get Selenium up and running, but the issue is that it keeps opening the first logo and copying the same link instead of moving on to the next one. I have tried various approaches but keep running into a brick wall.
My code so far:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
PATH = "/home/ed/Documents/python/chromedriver" # location of the webdriver - amend as required
url = "https://ebn.eu/?p=members"
driver = webdriver.Chrome(PATH)
driver.get(url)
member_list = []
# Flow to open page and click on link to extract href
members = driver.find_elements_by_xpath('//*[@class="projectImage"]')  # looking for the required class - the image which on click brings up the info

for member in members:
    print(member.text)  # to see that loop went to next iteration
    member.find_element_by_xpath('//*[@class="projectImage"]').click()
    wait = WebDriverWait(driver, 10)
    element = wait.until(EC.element_to_be_clickable((By.LINK_TEXT, 'VIEW FULL PROFILE')))
    links = driver.find_element_by_partial_link_text("VIEW FULL PROFILE")
    href = links.get_attribute("href")
    member_list.append(href)
    member.find_element_by_xpath("/html/body/div[5]/div[1]/button").click()

print(member_list)
driver.quit()
P.S.: I have tried changing the member.find line to:
member.find_element_by_xpath('.//*[@class="projectImage"]').click() but then I get "Unable to find element".
Any help is very much appreciated.
Thanks
If you study the HTML of the page, each logo has an onclick script which triggers the JS and renders the pop-up. You can make use of it.
You can find the onclick script on the child element img.
So your logic should be: (1) get the child elements, (2) take the first child element (which is always img in your case), (3) get its onclick script text, (4) execute the script.
import time  # needed for time.sleep below

for member in members:
    print(member.text)  # to see that loop went to next iteration
    # member.find_element_by_xpath('//*[@class="projectImage"]').click()
    # begin of modification
    child_elems = member.find_elements_by_css_selector("*")   # get the child elems
    onclick_script = child_elems[0].get_attribute('onclick')  # get the img's onclick value
    driver.execute_script(onclick_script)                     # execute the JS
    time.sleep(5)                                             # wait for the pop-up to render
    # end of modification
    wait = WebDriverWait(driver, 10)
    element = wait.until(EC.element_to_be_clickable((By.LINK_TEXT, 'VIEW FULL PROFILE')))
    links = driver.find_element_by_partial_link_text("VIEW FULL PROFILE")
    href = links.get_attribute("href")
    member_list.append(href)
    member.find_element_by_xpath("/html/body/div[5]/div[1]/button").click()

print(member_list)
You need to import the time module. I prefer time.sleep over wait.until; it's easier to use when you are starting with web scraping.
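For contrast, here are the two waiting styles side by side (a sketch only; driver and the link text come from the code above):
import time

time.sleep(5)  # unconditional pause: always blocks the full 5 seconds
element = WebDriverWait(driver, 10).until(  # polls and returns as soon as the link is clickable
    EC.element_to_be_clickable((By.LINK_TEXT, 'VIEW FULL PROFILE')))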

Having trouble referencing to a certain element on page with Selenium

I am having a terribly hard time locating a certain "next page" button on a website that I am trying to scrape links from [https://www.sreality.cz/adresar?strana=2]. If you scroll down you can see a red right-arrow button that you can click to go to the next page, which makes the website load new dynamic content. Every approach seems to report the same exact error and I don't know how I am supposed to point to the element without running into it.
This is the code that I currently have:
from selenium import webdriver
chromedriver_path = "/home/user/Dokumenty/iCloud/RealityScraper/chromedriver"
driver = webdriver.Chrome(chromedriver_path)
print("WebDriver Successfully Initialized")
driver.get("https://www.sreality.cz/adresar?strana=2")
links = driver.find_elements_by_css_selector("h2.title a")
nextPage = driver.find_element_by_css_selector("li.paging-item a.btn-paging-pn.icof.icon-arr-right.paging-next")
for link in links:
    print(link.get_attribute("href"))

nextPage.click()
The "nextPage" variable is holding a supposed value to be clicked on once the "links" variable search finishes scraping all the links from the company titles. However when I run this code I get an error :
selenium.common.exceptions.StaleElementReferenceException: Message:
stale element reference: element is not attached to the page document
I have been searching for fixes online but none of them resolved the issue. I think the problem at this point is not that the element loads too slowly, but rather that Selenium has trouble finding the element because of a wrong reference.
Because of this I tried using XPath to point to the actual element more precisely, so I changed the "nextPage" variable to:
nextPage = driver.find_element_by_xpath("""/html/body/div[2]/div[1]/div[2]/div[2]/div[4]/div/div/div/div[2]/div/div[2]/ul[1]/li[12]/a""")
This returns exactly the same error as above. I have been trying to find a solution for hours now and can't see where the issue lies. I would be grateful if anyone could explain what I am doing wrong. Thanks to anyone.
If you want to get all the ng-href tags from every page, you can loop until the next-page button stops appearing. Or you could look into their API.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from time import sleep

driver.get("https://www.sreality.cz/adresar?strana=2")
wait = WebDriverWait(driver, 10)

while True:
    try:
        links = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "h2.title > a")))
        # print(len(links))
        for link in links:
            print(link.get_attribute("ng-href"))
        nextPage = wait.until(EC.element_to_be_clickable(
            (By.CSS_SELECTOR, "a.btn-paging-pn.icof.icon-arr-right.paging-next")))
        nextPage.click()
        sleep(10)  # give the next page time to load
    except Exception as e:
        print(e)
        break
First of all, never use an absolute XPath; it breaks easily. Use a relative XPath instead.
Secondly, I think the error you are getting is because clicking the "Next" button loads a new page with a different DOM structure, so your old element reference is no longer attached.
You can try searching for the element again after every new page load (after clicking the "Next" button each time), as in the loop sketch following the snippet below.
# imports
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By

# initialize
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 20)
action = ActionChains(driver)

# try the code below and see if it works
Next_btn = wait.until(EC.presence_of_element_located((By.XPATH, '(//li[@class="paging-item"])[2]')))
action.move_to_element(Next_btn).click().perform()
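A minimal sketch of that re-search-per-page loop, assuming the same locator matches the button on every page and that the wait simply times out on the last one:
from selenium.common.exceptions import TimeoutException

while True:
    try:
        Next_btn = wait.until(EC.presence_of_element_located(
            (By.XPATH, '(//li[@class="paging-item"])[2]')))
        action.move_to_element(Next_btn).click().perform()
        # ... scrape the freshly loaded page here ...
    except TimeoutException:
        break  # no "Next" button within 20 s: assume this was the last page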

How do I use Selenium to automate clicks on multiple links within an iframe?

I am trying to webscrape data for several bills initiated in the Peruvian Congress from the following website: http://www.congreso.gob.pe/pley-2016-2021
Basically, I want to click on each link in the search results, scrape the relevant information for the bill, return to the search results and then click on the next link for the next bill and repeat the process. Obviously with so many bills over congressional sessions it would be great if I could automate this.
So far I've been able to accomplish everything up until clicking on the next bill. I've been able to use Selenium to initiate a web browser that displays the search results, click on the first link using the xpath embedded in the iframe and then scrape the content with beautifulsoup and then navigate back to the search results. What I'm having trouble with is being able to click on the next bill in the search results because I'm not sure how to iterate over the xpath (or how to iterate over something that would take me to each subsequent bill). I'd like to be able to scrape the information for all of the bills on each page and then be able to navigate to the next page of the search results.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import requests
driver = webdriver.Chrome('C:\\Users\\km13\\chromedriver.exe')
driver.get("http://www.congreso.gob.pe/pley-2016-2021")
WebDriverWait(driver, 50).until(EC.frame_to_be_available_and_switch_to_it((By.NAME, 'ventana02')))
elem = WebDriverWait(driver, 10).until(EC.element_to_be_clickable(
(By.XPATH, "//a[#href='/Sicr/TraDocEstProc/CLProLey2016.nsf/641842f7e5d631bd052578e20058a231/243a65573d33ecc905258449007d20cc?OpenDocument']")))
elem.click()
soup = BeautifulSoup(driver.page_source, 'lxml')
table = soup.find('table', {'bordercolor' : '#6583A0'})
table_items = table.findAll('font')
table_authors = table.findAll('a')
for item in table_items:
    content = item.contents[0]
    print(content)

for author in table_authors:
    authors = author.contents[0]
    print(authors)

driver.back()
So far this is the code I have that launches the web browser, clicks on the first link of the search results, scrapes the necessary data, and then returns to the search results.
The following code will navigate to different pages in the search results:
elem = WebDriverWait(driver, 10).until(EC.element_to_be_clickable(
(By.XPATH, "//a[contains(#onclick,'D32')]/img[contains(#src,'Sicr/TraDocEstProc/CLProLey')]")))
elem.click()
I guess my specific problem is being able to figure out how to automate clicking on subsequent bills in the iframe because once I'm able to do that, I'm assuming I could just loop over the bills on each page and then nest that loop within a function that loops over the pages of search results.
UPDATE: Somewhat with the help of the answer below, I applied its logic but used BeautifulSoup to scrape the href links in the iframe and stored them in a list, concatenating the necessary string elements to create a list of XPaths for all of the bills on the page:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import requests
driver = webdriver.Chrome('C:\\Users\\km13\\chromedriver.exe')
driver.get("http://www.congreso.gob.pe/pley-2016-2021")
WebDriverWait(driver, 50).until(EC.frame_to_be_available_and_switch_to_it((By.NAME, 'ventana02')))
soup = BeautifulSoup(driver.page_source, 'lxml')
table = soup.find('table', {'cellpadding' : '2'})
table_items = table.find_all('a')
for item in table_items:
    elem = WebDriverWait(driver, 10).until(EC.element_to_be_clickable(
        (By.XPATH, "//a[@href='" + item.get('href') + "']")))
    elem.click()
    driver.back()
Now my issue is that it clicks the link for the first item in the loop and goes back to the search results, but does not progress to the next item (the code just times out). I'm also pretty new to writing loops in Python, so I was wondering whether there is a way to iterate through the list of XPaths so that I can click one, scrape the info on that page, click back to the search results, and then click on the next item in the list?
Here is my logic for this problem:
1. First get into the iframe using switchTo.
2. Get the web elements for XPath "//a" using driver.findElements into a variable billLinks, as this frame contains links only for bills.
3. Now iterate through billLinks and perform the desired actions.
I hope this solution helps.
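In Python, a minimal sketch of that logic (a sketch only; it collects the hrefs up front so driver.back() can never invalidate the element references):
WebDriverWait(driver, 50).until(
    EC.frame_to_be_available_and_switch_to_it((By.NAME, 'ventana02')))
bill_links = [a.get_attribute('href')
              for a in driver.find_elements_by_xpath('//a')]  # hrefs come back as absolute URLs
for link in bill_links:
    driver.get(link)  # open each bill directly instead of clicking and backing out
    # ... scrape the bill page with BeautifulSoup here ...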

Webscrape Flashscore with Python/Selenium

I started learning to scrape websites with Python and Selenium. I chose Selenium because I need to navigate through the website and I also have to log in.
I wrote a script that opens a Firefox window and loads www.flashscore.com. With this script I am also able to log in and navigate to the different sports sections (main menu) they have.
The code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
# open website
driver = webdriver.Firefox()
driver.get("http://www.flashscore.com")
# login
driver.find_element_by_id('signIn').click()
username = driver.find_element_by_id("email")
password = driver.find_element_by_id("passwd")
username.send_keys("*****")
password.send_keys("*****")
driver.find_element_by_name("login").click()
# go to the tennis section
link = driver.find_element_by_link_text('Tennis')
link.click()
#go to the live games tab in the tennis section
# ?????????????????????????????
Then it got more difficult. I also want to navigate to, for example, the "live games" and "finished" tabs within a sports section. This part wouldn't work. I tried many things but couldn't get into either of these tabs. When analyzing the website I saw that they use some iframes. I also found some code for switching to an iframe window, but the problem is I can't find the name of the iframe that holds the tabs I want to click on. Maybe the iframes are not the problem and I'm looking in the wrong place. (Maybe the problem is caused by some JavaScript?)
Can anybody please help me with this?
No, the iframes are not the problem in this case. The "Live games" element is not inside an iframe. Locate it by link text and click:
live_games_link = driver.find_element_by_link_text("LIVE Games")
live_games_link.click()
You may need to wait for this link to be clickable before actually trying to click it:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
wait = WebDriverWait(driver, 10)
live_games_link = wait.until(EC.element_to_be_clickable((By.LINK_TEXT, "LIVE Games")))
live_games_link.click()
