Why is this xpath not working using PhantomJS? - python

I'm building a scraper that needs to scrape prices from a dozen of different websites .
All websites use JS to show the price, so I went with selenium in order to scrape the data needed .
Before starting to build the scraper, I created a list of xpaths I need to get the price element for each url I was scraping in an external file .
I got those xpath using FireFox and Firebug, HoĆ wever I get an errror, everytime I try to get these element with selenium (PhantomJS driver ):
selenium.common.exceptions.NoSuchElementException: Message: {"errorMessage":"Una
ble to find element with xpath './div'","request":{"headers":{"Accept":"applicat
ion/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"89
","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:51048","User
-Agent":"Python-urllib/2.7"},"httpVersion":"1.1","method":"POST","post":"{\"usin
g\": \"xpath\", \"sessionId\": \"27f41b80-cf63-11e6-bbcc-13b1a315759a\", \"value
\": \"./div\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"ele
ment","directory":"/","path":"/element","relative":"/element","port":"","host":"
","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/
element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/27f41b80-cf
63-11e6-bbcc-13b1a315759a/element"}}
It seems that my xpath is wrong, However I double checked it, using other plugins and it's correct everytime I test that xpath on Firefox .
Here are two different xpaths that both should work, (they did using Firfox , but didn't with selenium) :
"id('regular-hero')/div[3]/div[1]/div[2]/div/div[1]/div/div/span"
"/html/body/div[2]/div[3]/div[1]/div[2]/div/div[2]/div/div/span/text()[1]"
And here's the target page html code Here
And here's the code to get the elemnts using selenium :
self.browser = wd.PhantomJS()
for n in xrange(len(self.url_list)):
url = self.url_list[n]
provider = self.provider_list[n]
self.browser.get(url)
for plan in provider:
for hosting_plan in provider[plan]:
xpath = hosting_plan.values()[0] # Get the xpath of a plan
price_elem = self.browser.find_element_by_xpath("//*")
print price_elem
self.browser.close()
All the loops are used to traverse the JSON external file that hold the xpath list .
What is wrong ? What should I try ? Can lxml help me (Given that the HTML code is broken sometimes)?

According to provided link you can match required element with following XPath:
//span[#class="term-price"]
If your element generated with JavaScript you need to wait some time for element appearance:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
element = WebDriverWait(driver,10).until(EC.presence_of_element_located((By.XPATH, "//span[#class='term-price']")))

Related

Selenium doesn't find element loaded from Ajax

I have been trying to get access to the 4 images on this page: https://altkirch-alsace.fr/serviable/demarches-en-ligne/prendre-un-rdv-cni/
However the grey region seems to be Ajax-loaded (according to its class name). I want to get the element <div id="prestations"> inside of it but can't access it, nor any other element within the grey area.
I have tried to follow several answers to similar questions, but no matter how long I wait I get an error that the element is not found ; the element is here when I click "Inspect element" but I don't see it when I click "View source". Does that mean I can't access it through selenium?
Here is my code:
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
driver = webdriver.Firefox()
driver.get("https://altkirch-alsace.fr/serviable/demarches-en-ligne/prendre-un-rdv-cni/")
element = WebDriverWait(driver, 10) \
.until(lambda x: x.find_element(By.ID, "prestations"))
print(element)
You're not using WebDriverWait(...).until properly. Your lambda is using find_element, which throws exception when it is called and the element is not found.
You should use it like this instead:
from selenium.webdriver.support import expected_conditions as EC
...
element = WebDriverWait(driver, 10).until(
EC.presence_of_element_located((By.ID, "prestations"))
)
Got the same problem. There is no issue with waiting. There is a frame on that webpage:
<iframe src="https://www.rdv360.com/mairie-d-altkirch?ajx_md=1" width="100%"
height="600" style="border:0px;"></iframe>
The Document Object Model - DOM of the main website does not contain any informations about the webpage that is loaded into a frame.
Even waiting hours you will not find any elements inside this frame, as it has its own DOM.
You need to switch the WebDriver context to this frame. Than you may access the frame's DOM:
As on your website the iframe has no id, you may search for all frames like described on the selenium webpage (https://www.selenium.dev/documentation/webdriver/browser/frames/):
The code searches for all HTML-tags of type "iframe" and takes the first (maybe it should be [0] not [1])
iframe = driver.find_elements(By.TAG_NAME,'iframe')[1]
driver.switch_to.frame(iframe)
Now you may find your desired elements.
The solution that worked in my case on a webpage like this:
<html>
...
<iframe ... id="myframe1">
...
<\iframe>
<iframe ... id="myframe2">
...
<\iframe>
...
</html>
my_iframe = my_WebDriver.find_element(By.ID, 'myframe_1')
my_Webdriver.swith_to.frame(my_iframe)
also working:
my_WebDriver.switch_to.frame('myframe_1')
According Selenium docs you may use iframe's name, id or the element itself to switch to that frame.

selenium accessing text input element with changing ID using chrome webdriver

I am having trouble accessing a input element from this specific webpage. http://prod.symx.com/MTECorp/config.asp?cmd=edit&CID=428D77C8A7ED4DA190E6170116F3A71B
if the webpage has timed out just go ahead and click on this clink below
https://www.mtecorp.com/click-find/
and click on the hyperlink "RL_reactors" to take you to the page.
On this page, I am currently trying to access the search bar/ input element of the webpage to type in a part number that the company sells. This is for school projects and collecting data from different companies for pricing and etc. I am using pycharm(python) and selenium to write this script. Currently, this is the snippet of my code at the moment
# web scraping for MTE product cost list
# reading excel files on the drive
import time
from openpyxl import workbook, load_workbook
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
.........................
..more code
..........................
#part that is getting stuck on
if((selection >= 1) and (selection <= 7)):
print("valid selection going to page...")
if(selection == 1):
target=driver.find_element(By.XPATH,"/html/body/main/article/div/div/div/table/tbody/tr[1]/td[1]/a")
driver.execute_script("arguments[0].click();", target)
element = WebDriverWait(driver,100).until(EC.element_to_be_clickable((By.CSS_SELECTOR,".plxsty_pid"))).send_keys("test")
print("passed clickabel element agruement\n")
currently, my code does go to the RL_reactors page as shown below but however when I'm using CSS selector by class name it doesn't recognize the class type I'm trying to get. Now of course many would say why not use XPath and etc. The reason I cant use XPath and etc is that the element id changes for every iteration of the script. So for example the 1st run of the program id name would be "hr8" when for the other script the program name could be "dsfsih". For my observation, the only part of the element that stays constant is the value and the class name. I have tried using XPath, id, ccselector, and such but to no result. Any suggestions
thanks!
Because you are using javascript to click the link on your website, selenium doesn't change the tab (hence it cannot locate the class you are searching for). You can explicitly tell selenium to change the tab window.
url = "https://www.mtecorp.com/click-find/"
driver.get(url)
target=driver.find_element(By.XPATH,"/html/body/main/article/div/div/div/table/tbody/tr[1]/td[1]/a")
driver.execute_script("arguments[0].click();", target)
driver.switch_to.window(driver.window_handles[1])
element = WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,".plxsty_pid"))).clear()
element = WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,".plxsty_pid"))).send_keys('test')
Alternatively, instead of clicking the link, you can grab the href and open it in a new instance of selenium by calling driver.get() again.
url = "https://www.mtecorp.com/click-find/"
driver.get(url)
target_link=driver.find_element(By.XPATH,"/html/body/main/article/div/div/div/table/tbody/tr[1]/td[1]/a").get_attribute('href')
driver.get(target_link)
element = WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,".plxsty_pid"))).clear()
element = WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,".plxsty_pid"))).send_keys("test")

Clicking on LI link using Selenium Webdriver (Python)

New to Selenium here. Thanks for the help in advance! (Solved)
I've had success at clicking on links using the code below, but having a hard time clicking on links under LI. I've referenced a couple of other stackoverflow pages, but have yet to find a solution.
In this case, I am trying to click on the page number "2", and then run my scraper (which I have working for page 1) for all subsequent pages. Note that clicking on page 2 will cause a change in the table (aka, a new set of stock tickers and information gets pulled up), but the website link itself will not change.
Website link: https://www.gurufocus.com/insider/summary
Here is what I am trying to click on:
The number 2, highlighted in yellow
What I see when I inspect the element: inspect
I can click on a different link (titled "Can Aggregated Insider Trading Activities Predict the Market?" on the same page via code below, but when I input "2" instead, I get an error message, "NoSuchElementException: Message: no such element: Unable to locate element: {"method":"link text","selector":"2"}"
In summary, I would like to "click" on page 2, and bring up more stock information, and then run my scraper through it (then use a for loop for the rest of the pages). I won't have any troubles creating the for loop to scrape multiple pages, but I can't seem to get Selenium to click on the next page for me.
Solved Code - thanks for all the help everyone!
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
PATH = "C:\\Users\\MYUSERNAME\\Webdrivers\\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://www.gurufocus.com/insider/summary")
driver.find_element_by_xpath("//ul[#class='el-pager']/li[text()='2']").click
try:
element = WebDriverWait(driver, 10).until(
EC.presence_of_element_located((By.XPATH, "//ul[#class='el-pager']/li[text()='2']"))
)
element.click()
except:
driver.quit()
Your issue is because page '2' is not a "link text". If you will notice 'Can Aggregated Insider Trading Activities Predict the Market?" is link text because its tag is "a". Add this to click on your page 2
driver.find_element_by_xpath("//ul[#class='el-pager']/li[text()='2']").click

Selenium not finding an element

I am trying to retrieve an element that I would like to click on. Here's the opening of the website with Selenium in Python:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument('--dns-prefetch-disable')
driver = webdriver.Chrome("./chromedriver", options=chrome_options)
website = "https://www.agronet.gov.co/estadistica/Paginas/home.aspx?cod=4"
driver.get(website) # loads the page
Then, I look for the element I'm interested in:
driver.find_element_by_xpath('//*[#id="cmbDepartamentos"]')
which raises a NoSuchElementException error. When looking at the html source (driver.page_source), indeed "cmbDepartamentos" does not exist! and the text of the dropdown menu I am trying to locate which is "Departamentos:" does not exist either. How can I deal with this?
This should work:
iframe=driver.find_element_by_xpath('//div[#class="iframe"]//iframe')
driver.switch_to.frame(iframe)
driver.find_element_by_xpath('//*[#id="cmbDepartamentos"]').click()
Notes:
The reason for NoSuchElementException error is that the element is
inside an iframe. Unless you switch your driver to that iframe,
the identification will not work.
CTRL + F in the Dev Tools panel, then search for the xpath you
defined in your script is always a good way to rule out issues with
your xpath definition, as cause for NoSuchElementException error (and in your case, the xpath is correct)
You might want to consider adding a WebdriverWait for a complete load of the search area/iframe before attempting to find the "Departamentos" field

Python Selenium: find and click an element

How do I use Python to simply find a link containing text/id/etc and then click that element?
My imports:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
Code to go to my URL:
browser = webdriver.Firefox() # Get local session of firefox
browser.get(myURL) # Load page
#right here I want to look for element containing "Click Me" in the text
#and then click it
Look at method list of WebDriver class - http://selenium.googlecode.com/svn/trunk/docs/api/py/webdriver_remote/selenium.webdriver.remote.webdriver.html
For example, to find element by its ID and click it, the code will look like this
element = browser.find_element_by_id("your_element_id")
element.click()
Hi there are a couple of ways to find a link (or any element):
element = driver.find_element_by_id("element-id")
element = driver.find_element_by_name("element-name")
element = driver.find_element_by_xpath("//input[#id='element-id']")
element = driver.find_element_by_link_text("link-text")
element = driver.find_element_by_class_name("class-name")
I think the best option for you is find_element_by_link_text since it's a link.
Once you saved the element in a variable, you call the click function: element.click() or element.send_keys(Keys.RETURN)
Take a look to the selenium-python documentation, there are a couple of examples there.
You just use this code
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
driver = webdriver.Firefox()
driver.get("https://play.spotify.com/")# here change your link
wait=WebDriverWait(driver,250)
# it will wait for 250 seconds an element to come into view, you can change the #value
submit=wait.until(EC.presence_of_element_located((By.LINK_TEXT, 'Click Me')))
submit.click()
here wait is best fit for javascript enabled website let suppose if you load the page in webdriver but due to javascript some element loads after some second then it best suite to use wait element there.
for link containing text
browser = webdriver.firefox()
browser.find_element_by_link_text(link_text).click()
You can also do it by xpath:
Browser.find_element_by_xpath("//a[#href='you_link']").click()
Finding Element by Link Text
driver.find_element_by_link_text("")
Finding Element by Name
driver.find_element_by_name("")
Finding Element By ID
driver.find_element_by_id("")
Try SST - it is a simple yet very good test framework for python.
Install it first: http://testutils.org/sst/index.html
Then:
Imports:
from sst.actions import *
First define a variable - element - that you're after
element = assert_element(text="some words and more words")
I used this: http://testutils.org/sst/actions.html#assert-element
Then click that element:
click_element('element')
And that's it.
First start one instance of browser. Then you can use any of the following methods to get element or link. After locating the element you can use element.click() or element.send_keys(Keys.RETURN) or any other object of selenium webdriver
browser = webdriver.Firefox()
Selenium provides the following methods to locate elements in a page:
To find individual elements. These methods will return individual element.
browser.find_element_by_id(id)
browser.find_element_by_name(name)
browser.find_element_by_xpath(xpath)
browser.find_element_by_link_text(link_text)
browser.find_element_by_partial_link_text(partial_link_text)
browser.find_element_by_tag_name(tag_name)
browser.find_element_by_class_name(class_name)
browser.find_element_by_css_selector(css_selector)
To find multiple elements (these methods will return a list). Later you can iterate through the list or with elementlist[n] you can locate individual element from list.
browser.find_elements_by_name(name)
browser.find_elements_by_xpath(xpath)
browser.find_elements_by_link_text(link_text)
browser.find_elements_by_partial_link_text(partial_link_text)
browser.find_elements_by_tag_name(tag_name)
browser.find_elements_by_class_name(class_name)
browser.find_elements_by_css_selector(css_selector)
browser.find_element_by_xpath("")
browser.find_element_by_id("")
browser.find_element_by_name("")
browser.find_element_by_class_name("")
Inside the ("") you have to put the respective xpath, id, name, class_name, etc...

Categories

Resources