Finding xpaths on pages running script - python

I'm trying to scrape a webpage using Selenium. The XPaths suggested by inspecting the page and right-clicking are unstable (/html/body/table[2]/tbody/tr[1]/td/form/table/tbody/tr[2]), so I tried the following instead:
driver = webdriver.Chrome("path")
driver.get("https://www.bundesfinanzhof.de/entscheidungen/entscheidungen-online")
time.sleep(1)
links=driver.find_element_by_xpath('//tr[@class="SuchForm"]')
or even
links=driver.find_elements_by_xpath('//*[@class="SuchForm"]')
Neither returns any results. However, earlier on in the page I can obtain:
links=driver.find_element_by_xpath('//iframe')
links.get_attribute('src')
It seems that after:
<script language="JavaScript" src="/rechtsprechung/jscript/list.js" type="text/javascript"></script>
I can no longer get to any of the elements.
The question "How do I determine the correct XPath?" suggests that parts within a script are impossible to parse. However, the path I am after does not seem to be within a script. Am I misinterpreting how scripts work on a page?
For instance, later on there is a path:
/html/body/table[2]/tbody/tr[1]/td/script
I would expect this to create such a problem. I am by no means a programmer, so my understanding of this subject is limited. Can someone explain what the problem is and, if possible, suggest a solution?
Attempted using solutions from:
Find element text using xpath in selenium-python NOt Working
xpath does not work with this site, pls verify

The table is located inside an iframe, so you need to switch to that iframe before handling the required tr:
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver.get("https://www.bundesfinanzhof.de/entscheidungen/entscheidungen-online")
wait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@src='https://juris.bundesfinanzhof.de/cgi-bin/rechtsprechung/list.py?Gericht=bfh&Art=en']")))
link = driver.find_element_by_xpath('//tr[@class="SuchForm"]')
Use driver.switch_to.default_content() to switch back from the iframe.
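The switch-in and switch-back steps can be paired so that the second one is never forgotten. Below is a minimal sketch, not part of the original answer: `inside_frame` is a hypothetical helper name, and `driver` is assumed to be any object exposing Selenium's `switch_to.frame` and `switch_to.default_content`.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block, then switch back.

    `frame_reference` is whatever `driver.switch_to.frame` accepts:
    a frame name or id, an index, or a frame WebElement.
    (`inside_frame` is a hypothetical helper, not a Selenium API.)
    """
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if a lookup inside the frame raises an exception.
        driver.switch_to.default_content()


# Intended usage with a real driver (assumes `By` is imported from Selenium):
#
#   iframe = driver.find_element(By.XPATH, "//iframe")
#   with inside_frame(driver, iframe):
#       row = driver.find_element(By.XPATH, '//tr[@class="SuchForm"]')
#       text = row.text
```

It is safest to read whatever is needed from the located elements inside the with-block, since they belong to the frame's document.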

Related

python selenium: clicking buttons with various html element/css attributes

I'm trying to create Python automation scripts using Selenium to book appointment times, but I'm having trouble clicking certain elements. To confirm an appointment, the steps are: go to the webpage > select the date > select the time > hit confirm.
This is what I've got so far:
driver = webdriver.Chrome()
driver.get('<url.date>')  # this works
driver.find_element(By.ID,'corresponding time id').click()  # this works
driver.find_element(By.DATA-TEST-ID,'order_summary_page-button-book').click()  # this is where I'm struggling
I've also tried locating it by the button class, by CSS selector, by the 'Reserve Now' text, and by XPath.
When I use the XPath, I get a NoSuchElementException.
I tried adding a WebDriverWait, but that didn't seem to work either.
The appointment times are really hard to get, so I'd rather not put any wait times into the code, so that it can execute and book as soon as slots open.
here are the elements:
https://i.stack.imgur.com/S62kR.png
any help is appreciated!
Selenium has no DATA-TEST-ID locator strategy.
In this case you have to use either an XPath or a CSS selector:
driver.find_element(By.XPATH,'//button[@data-test-id="order_summary_page-button-book"]').click()
OR
driver.find_element(By.CSS_SELECTOR,'button[data-test-id="order_summary_page-button-book"]').click()
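Both locators follow the same pattern for any attribute, not only data-test-id. A small sketch of that pattern, where `attr_css` and `attr_xpath` are hypothetical helper names rather than Selenium functions, and values containing a double quote are rejected rather than escaped to keep the example short:

```python
def _check_value(value):
    # Quoting rules differ between CSS and XPath, so this sketch simply
    # refuses values that would need escaping.
    if '"' in value:
        raise ValueError("value must not contain a double quote")


def attr_css(tag, attribute, value):
    """Build a CSS selector such as button[data-test-id="x"]."""
    _check_value(value)
    return f'{tag}[{attribute}="{value}"]'


def attr_xpath(tag, attribute, value):
    """Build an XPath such as //button[@data-test-id="x"]."""
    _check_value(value)
    return f'//{tag}[@{attribute}="{value}"]'


css = attr_css("button", "data-test-id", "order_summary_page-button-book")
xpath = attr_xpath("button", "data-test-id", "order_summary_page-button-book")
```

The resulting strings can be passed to `driver.find_element` with `By.CSS_SELECTOR` and `By.XPATH` respectively.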
Ideally you should use an explicit wait before clicking the element:
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, '//button[@data-test-id="order_summary_page-button-book"]'))).click()
You need the following imports:
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
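On the concern about adding wait time: an explicit wait does not pause for the full timeout. It re-checks its condition at a short interval and returns as soon as the condition holds. The following is a simplified, Selenium-free sketch of that idea; `poll_until` and its parameters are hypothetical names, and the real `WebDriverWait` differs in details (for example, it raises Selenium's `TimeoutException` and by default ignores `NoSuchElementException` between attempts).

```python
import time


def poll_until(condition, timeout=10.0, interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value immediately once available; raises TimeoutError
    if `timeout` seconds pass first. (Hypothetical helper for illustration.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # no further sleeping once the condition holds
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)
```

The timeout is therefore an upper bound, not a fixed delay: if the button is already present, the first check succeeds and the click follows right away.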

Issues with Selenium and a Single Page Application (Ember.js)

I just started learning to write Python today. I am trying to make a web scraper, but I am having trouble getting Selenium to click on the specific boxes needed to reach the correct page. The site uses Ember.js, so it is a single-page application, which throws things off a little. I found some material on Stack Overflow about it, but I wasn't able to get it to work. I keep getting the "NoSuchElementException" error. Here is my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
PATH = r'C:\Users\dme03\Desktop\desktop\webdrivers\chromedriver_win32\chromedriver.exe'
driver = webdriver.Chrome(PATH)
driver.get('https://thedailyrecord.com/reader-rankings-interface/')
time.sleep(15)
mim = driver.find_element(By.XPATH, '//img[@src="//media.secondstreetapp.com/1973100?width=370&height=180&cropmode=Fill&anchor=Center"]/parent::div')
mim.click
For the XPATH, I had to select a child element and specify the parent to select the correct element. The key identifiers in the parent element kept changing, so i was unable to select just that. I believe the XPATH is correct because when I put that into the website, it showed up as the correct line.
I have also tried using the code below, as i have seen some people suggest. When using that, I get a timed out error.
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//img[@src="//media.secondstreetapp.com/1973100?width=370&height=180&cropmode=Fill&anchor=Center"]/parent::div')))
Any help would be appreciated
NEW CODE for iFrames
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, '//iframe[@src="https://embed-882706.secondstreetapp.com/embed/ae709978-d6d1-499e-9f5e-136cc36eab1b/"]')))
mim = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//img[@src="//media.secondstreetapp.com/1973100?width=370&height=180&cropmode=Fill&anchor=Center"]/parent::div')))
mim.click
I am having issues clicking the item. To be fair, i am not sure that you are supposed to click that element to change the webpage. That is probably an issue.
Edit: nvm im just dumb that totally worked lol :))
You are getting the "NoSuchElementException" error because the desired element is inside an iframe, so you have to switch to that iframe first.
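A side note on the snippets in the question: `mim.click` without parentheses only looks up the method and does not call it, so no click is sent even once the element has been located inside the iframe. A tiny Selenium-free illustration, using a hypothetical `FakeElement` stand-in rather than a real WebElement:

```python
class FakeElement:
    """Stand-in for a WebElement; it only records whether click() ran."""

    def __init__(self):
        self.clicked = False

    def click(self):
        self.clicked = True


element = FakeElement()

element.click  # evaluates to a bound method; nothing is called
clicked_without_parentheses = element.clicked

element.click()  # the parentheses perform the call
clicked_with_parentheses = element.clicked
```

With a real driver the same rule applies: the call has to be written as `mim.click()`.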

Automate Login with Selenium and Python

I am trying to use Selenium to log into a printer and conduct some tests. I really do not have much experience with this process and have found it somewhat confusing.
First, to get values that I may have wanted I opened the printer's web page in Chrome and did a right click "view page source" - This turned out to be not helpful. From here I can only see a bunch of <script> tags that call some .js scripts. I am assuming this is a big part of my problem.
Next I selected the "inspect" option after right clicking. From here I can see the actual HTML that is loaded. I logged into the site and recorded the process in Chrome. With this I was able to identify the variables which contain the Username and Password. I went to this part of the HTML, did a right click and copied the XPath. I then tried to use the Selenium find_element_by_xpath but still no luck. I have tried all the other methods too (find by ID and by name); however, each returns an error that the element is not found.
I feel like there is something fundamental here that I am not understanding. Does anyone have any experience with this???
Note: I am using Python 3.7 and Selenium, however I am not opposed to trying something other than Selenium if there is a more graceful way to accomplish this.
My code looks something like this:
EDIT
Here is my updated code - I can confirm this is not just a time/wait issue. I have managed to successfully grab the first two outer elements but the second I go deeper it errors out.
def sel_test():
    chromeOptions = Options()
    chromeOptions.add_experimental_option("useAutomationExtension", False)
    browser = webdriver.Chrome(chrome_options=chromeOptions)
    url = 'http://<ip address>/'
    browser.get(url)
    try:
        element = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.XPATH, '//*[@id="ccrx-root"]')))
    finally:
        browser.quit()
The element that I want is buried in this tag - Maybe this has something to do with it? Maybe related to this post
<frame name="wlmframe" src="../startwlm/Start_Wlm.htm?arg11=">
As mentioned in this post you can only work with the current frame which is seen. You need to tell selenium to switch frames in order to access child frames.
For example:
browser.switch_to.frame('wlmframe')
This will then load the nested content so you can access the children
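When it is not obvious which frame holds the element, the frames can be searched one by one. The sketch below is an assumption-laden illustration, not part of the original answer: `find_in_frames` is a hypothetical helper, `driver` is assumed to expose Selenium's `find_elements`, `switch_to.frame` and `switch_to.parent_frame`, and the strategy string "css selector" is the value of `By.CSS_SELECTOR`.

```python
def find_in_frames(driver, by, value, max_depth=3):
    """Search the current document, then nested frames, for one element.

    Returns the first match or None. On success the driver is left
    switched into the frame that contains the match, so the element can
    be used right away. (Hypothetical helper, not a Selenium API.)
    """
    matches = driver.find_elements(by, value)
    if matches:
        return matches[0]
    if max_depth == 0:
        return None
    # "css selector" is the string behind By.CSS_SELECTOR; the selector
    # group covers both <iframe> and old-style <frame> elements.
    for frame in driver.find_elements("css selector", "iframe, frame"):
        driver.switch_to.frame(frame)
        found = find_in_frames(driver, by, value, max_depth - 1)
        if found is not None:
            return found
        driver.switch_to.parent_frame()  # nothing here, step back out
    return None
```

After working with the returned element, `driver.switch_to.default_content()` returns to the top-level document.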
Your issue is most likely due to either the element not loading on the page until after your bot searches for it, or a pop-up changing the XPath of the element.
Try this:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
delay = 3 # seconds
try:
    elementUsername = WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.XPATH, 'element-xpath')))
    elementUsername.send_keys('your username')
except TimeoutException:
    print("Loading took too much time!")
you can find out more about this here

How do I click a button in a form using Selenium and Python 2.7?

I am trying to create a Python program that periodically checks a website for a specific update. The site is secured and multiple clicks are required to get to the page that I want to monitor. Unfortunately, I am stuck trying to figure out how to click a specific button. Here is the button code:
<input type="button" class="bluebutton" name="manageAptm" value="Manage Interview Appointment" onclick="javascript:programAction('existingApplication', '0');">
I have tried numerous ways to access the button and always get "selenium.common.exceptions.NoSuchElementException:" error. The obvious approach to access the button is XPath and using the Chrome X-Path Helper tool, I get the following:
/html/body/form/table[@class='appgridbg mainContent']/tbody/tr/td[2]/div[@class='maincontainer']/div[@class='appcontent'][1]/table[@class='colorgrid']/tbody/tr[@class='gridItem']/td[6]/input[@class='bluebutton']
If I include the above as follows:
browser.find_element_by_xpath("/html/body/form/table[@class='appgridbg mainContent']/tbody/tr/td[2]/div[@class='maincontainer']/div[@class='appcontent'][1]/table[@class='colorgrid']/tbody/tr[@class='gridItem']/td[6]/input[@class='bluebutton']").submit()
I still get the NoSuchElementException error.
I am new to selenium and so there could be something obvious that I am missing; however, after much Googling, I have not found an obvious solution.
On another note, I have also tried find_element_by_name('manageAptm') and find_element_by_class_name('bluebutton') and both give the same error.
Can someone advise on how I can effectively click this button with Selenium?
Thank you!
Following up on your attempts and @har07's comment: find_element_by_name('manageAptm') should work, but the element might not be immediately available, so you may need to wait for it:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
manageAppointment = wait.until(EC.presence_of_element_located((By.NAME, "manageAptm")))
manageAppointment.click()
Also, check if the element is inside an iframe or not. If yes, you would need to switch into the context of it and only then issue the "find" command:
driver.switch_to.frame("frame_name_or_id")
driver.find_element_by_name('manageAptm').click()

Getting the HTML code of a particular div using selenium in python

Using Selenium in Python, I'd like to download a page, and save the HTML code of a particular div, identified by its id. I've got the following:
from selenium.webdriver import Firefox
from selenium.webdriver.support.ui import WebDriverWait
...
with closing(Firefox()) as browser:
    browser.get(current_url)
    WebDriverWait(browser, timeout=3).until(lambda x: x.find_element_by_id('element_id'))
    element = browser.find_element_by_id('element_id')
element is of type selenium.webdriver.remote.webelement.WebElement. Is there a way to get the HTML code (not processed in any way) from element? Is there some better way, using Selenium, of accomplishing this task?
Right from pydoc selenium.webdriver.remote.webelement.WebElement:
| text
| Gets the text of the element.
Use the .text attribute.
If you really are after the HTML source of the element
then please see: Get HTML Source of WebElement in Selenium WebDriver using Python
As stated above, it's not as straightforward as you'd like.
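For the markup rather than the visible text, a commonly used approach is to read the element's `outerHTML` or `innerHTML` DOM property through `get_attribute`. A minimal sketch, where `element_html` is a hypothetical wrapper name and `element` is assumed to be a Selenium WebElement (or anything with a compatible `get_attribute` method):

```python
def element_html(element, include_own_tag=True):
    """Return the browser's HTML serialization of an element.

    include_own_tag=True  -> "outerHTML" (the element's tag plus its content)
    include_own_tag=False -> "innerHTML" (only the content inside the tag)
    (Hypothetical helper wrapping WebElement.get_attribute.)
    """
    dom_property = "outerHTML" if include_own_tag else "innerHTML"
    return element.get_attribute(dom_property)
```

Note that this is the browser's serialization of the current DOM, so it may differ from the bytes originally sent by the server, for example after scripts have modified the page.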
