I'm using Python / Selenium to submit a form, then have the web driver wait for the next page to load using an expected condition keyed on a class name.
My problem is that there are two pages that can be displayed, but they do not share a unique element (that I can find) that is not also on the original page. One page has a unique class name of mobile_txt_holder and the other possible page has a class name of notfoundcopy. I would like to use a wait that looks for mobile_txt_holder OR notfoundcopy to appear.
Is it possible to combine two expected conditions into one wait?
Basic idea of what I am looking for but obviously won't work:
WebDriverWait(driver, 30).until(EC.presence_of_element_located(
(By.CLASS_NAME, "mobile_txt_holder")))
or .until(EC.presence_of_element_located((By.CLASS_NAME, "notfoundcopy")))
I really just need the program to wait until the next page loads so that I can parse the source.
Sample HTML:
<p class="notfoundcopy">Unfortunately, the number you entered is not in our tracking system.</p>
Apart from clubbing together two expected_conditions through an or clause, we can easily construct a CSS selector to take care of the requirement. The following CSS selector will look for the element either in the mobile_txt_holder class or in the notfoundcopy class:
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, ".mobile_txt_holder, .notfoundcopy")))
You can find a detailed discussion in selenium two xpath tests in one
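If you are on Selenium 4 or later, you can also combine the two conditions directly with EC.any_of, which succeeds as soon as either condition does. A minimal sketch, assuming the two class names from the question:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# Wait until either element is present; any_of returns the first match found
element = WebDriverWait(driver, 30).until(
    EC.any_of(
        EC.presence_of_element_located((By.CLASS_NAME, "mobile_txt_holder")),
        EC.presence_of_element_located((By.CLASS_NAME, "notfoundcopy"))))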
I'm trying to scrape some profiles of people on LinkedIn from a specific job. To do this, I was trying to find the People button and click it to look specifically at the relevant people.
The path is as follows:
From the signed-out LinkedIn home -> I sign in and go to the LinkedIn home -> I type "hr" in the search bar and hit enter.
In the result page of hr, on the left side of the page, there is a navigation list that says "On this page". One of the options includes "People" and that is what I want to target.
The link to the page is: https://www.linkedin.com/search/results/all/?keywords=hr&origin=GLOBAL_SEARCH_HEADER&sid=Xj2
The HTML of the button for 'People' in the navigation list is:
<li>
<button aria-current="false" class="search-navigation-panel_button" data-target-section-id="PTFmMNSPSz2LQRzwynhRBQ==" role="link" type="button"> People </button>
</li>
I have tried to find this button through By.LINK_TEXT with the keyword People, but it did not work. I have also tried By.XPATH with "//button[@data-target-section-id='RIK0XK7NRnS21bVSiNaicw==']", but it also does not find it.
How can I make Selenium find this custom attribute so I can locate this button through data-target-section-id="PTFmMNSPSz2LQRzwynhRBQ=="?
Another issue that I am having is that I can target all the relevant people on the page and loop through them but I cannot extract the link of each of the profiles. It only takes the first link of the first person and never updates the variable again through the loop.
For example, if the first person is Ian, and the second is Brian, it gives me the link for Ian's profile even if 'users' is Brian.
Debugging the loop I can see the correct list of people in all_users but it only gets the href of the first person in the list and never updates.
Here is the code of that:
all_users = driver.find_elements(By.XPATH, "//*[contains(@class, 'entity-result__title-line entity-result__title-line--2-lines')]")
for users in all_users:
    print(users)
    get_links = users.find_element(By.XPATH, "//*[contains(@href, 'miniProfileUrn')]")
    print(get_links.get_attribute('href'))
I have also tried to do By.XPATH "//button[@data-target-section-id='RIK0XK7NRnS21bVSiNaicw==']" but it also does not find it.
The data-target-section-id that you mention is not the same as the one that the button has (PTFmMNSPSz2LQRzwynhRBQ==). Check that this is not dynamic before targeting it.
Your XPath is not bad but, as I told you, fix the target id:
driver.find_element(By.XPATH, "//button[@data-target-section-id='PTFmMNSPSz2LQRzwynhRBQ==']").click()
Where "driver" is your WebDriver instance.
Given the HTML:
<li>
<button aria-current="false" class="search-navigation-panel_button" data-target-section-id="PTFmMNSPSz2LQRzwynhRBQ==" role="link" type="button"> People </button>
</li>
The data-target-section-id attribute values like PTFmMNSPSz2LQRzwynhRBQ== are dynamically generated and are bound to change sooner or later. They may change the next time you access the application afresh, or even on the next application startup, so they can't be used in locators.
Solution
The desired element is a dynamic element, so to click on it you need to induce WebDriverWait for element_to_be_clickable(), and you can use either of the following locator strategies:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button.search-navigation-panel_button[data-target-section-id]"))).click()
Using XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@class='search-navigation-panel_button' and @data-target-section-id][contains(., 'People')]"))).click()
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
If you want to locate several elements with the same attribute, replace find_element with find_elements. See if that works to find not just the first element matching your search, but all elements with that attribute.
Review the Selenium: Locating Elements documentation and see if you can try each and every option they have for locating elements.
Something else to try: locate the parent list item, then click the button inside it:
list_element = driver.find_element(By.XPATH, "//li[button[@data-target-section-id='RIK0XK7NRnS21bVSiNaicw==']]")
list_element.find_element(By.TAG_NAME, "button").click()
It looks like the reason your People button locator isn't working is that the data-target-section-id is dynamic. Mine is showing as hopW8RkwTN2R9dPgL6Fm/w==. We can get around that by using an XPath to locate the element based on the contained text, "People", e.g.
//button[text()='People']
Turns out that matches two elements on the page, because many of the left nav links are repeated as rounded buttons at the top of the page, so we can further refine our locator to
//button[text()='People'][@data-target-section-id]
Having said that, that link only scrolls the page so you don't really need to click that.
From there, you want to get the links to each person listed under the People heading. We first need the DIV that contains the People section. It's kinda messy because the IDs on those elements are also dynamic, so we need to find the H2 that contains "People" and then work our way back up the DOM to the DIV that contains only that section. We can get that using the XPath below:
//div[@class='search-results-container']/div[.//h2[text()='People']]
From there, we want all of the A tags that uniquely link to a person... and there are a lot of A tags in that section, but most are not ones we want, so we need to do more filtering. I found that the XPath below locates each unique URL in that section.
//a[contains(@href,'miniProfileUrn')][contains(@class,'scale-down')]
Combining the two XPaths, we get
//div[@class='search-results-container']/div[.//h2[text()='People']]//a[contains(@href,'miniProfileUrn')][contains(@class,'scale-down')]
which locates all unique URLs belonging to a person in the People section of the page.
Using this, your code would look like
all_users = driver.find_elements(By.XPATH, "//div[#class='search-results-container']/div[.//h2[text()='People']]//a[contains(#href,'miniProfileUrn')][contains(#class,'scale-down')]")
for user in all_users:
print(user.get_attribute('href'))
NOTE: The reason your code was only returning the first href repeatedly is that you are searching from an existing element with an XPath, so you need to add a "." at the start of the XPath to indicate that the search should start from the referenced element.
get_links = users.find_element(By.XPATH, ".//*[contains(@href, 'miniProfileUrn')]")
^ add period here
I've eliminated that step in my code so you won't need it there.
I am trying to automate a process on a website that dynamically generates IDs for its elements.
IDs have this form:
ZCODE:FORM:j_1279323:element
I managed to make CSS or XPath selectors for most of the elements.
I am struggling, though, with a ul/li element that I manage to click on via its id but not with a relative XPath, which is what I aim to achieve:
I have tried all sorts of XPaths:
/html[1]/body[1]/div[37]/div[1]/ul[1]/li[13]
also:
//div[contains(@id, 'voie_panel')]/div/ul/li[13]
And many other different ways...
All the XPath/CSS selectors I tested work perfectly in the Chrome developer console.
I only manage to drop the list down, but when I try to access the list element... it times out.
I am using WebDriverWait, and I have also tried to pause the program at the exact point where the element has to be loaded in order to click on the list.
I wait for the element with:
myElem = WebDriverWait(self.driver, 30).until(ec.element_to_be_clickable((By.XPATH, css)))
To summarize the situation:
It works smoothly with the ID but times out with an XPath or CSS selector.
Can someone recommend a strategy to overcome this ?
If you are using Selenium WebDriver, you can add this to your code:
chrome_options = webdriver.ChromeOptions()
preferences = {"safebrowsing.enabled": True}
chrome_options.add_experimental_option("prefs", preferences)
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=master_path + "/chromedriver.exe")
driver.find_element_by_xpath("/html[1]/body[1]/div[37]/div[1]/ul[1]/li[13]").click()
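Since IDs of the form ZCODE:FORM:j_1279323:element are only dynamic in the middle, a more robust approach is to anchor the locator on the stable fragments and pair it with an explicit wait. A minimal sketch, reusing the voie_panel fragment from the question:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# Match the stable part of the container id, then wait for the list item to be clickable
item = WebDriverWait(driver, 30).until(
    EC.element_to_be_clickable((By.XPATH, "//div[contains(@id, 'voie_panel')]//ul/li[13]")))
item.click()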
I'm having a bit of an issue using Selenium with Python. There is a page I'm scraping, and I'm accessing the children of a parent element. However, each time I run the script, it's not guaranteed that I'll be able to get the children.
So for example, I have:
filters = driver.find_element_by_class_name("classname")
filters_children = filters.find_elements_by_class_name("anotherclassname")
And I print out filters_children[1] just to make sure.
Around 60% of the time it will work fine and filters_children will have a list of the children elements. However, the other 40% of the time it'll have a NoneType, so it won't be able to grab the elements.
I tried using a sleep of up to 10 seconds after the page rendered, but that hasn't helped a whole lot.
Your parent class might be too broad, and sometimes you might get a different element, in which case your second query will fail to find the proper child.
When searching via CSS selector, you can combine multiple nested classes by putting spaces between them, so you can collapse your nested query into one.
I also suggest that you use a wait-until in this case to ensure that the element will be present. Compared to sleep, this polls the page periodically until your condition is met.
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
wait = WebDriverWait(driver, 30)
wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".classname .anotherclassname")))
If the element also needs to be visible, change presence_of_all_elements_located to visibility_of_any_elements_located.
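Since presence_of_all_elements_located returns the matching elements, you can also use the wait's return value directly instead of re-querying. A small sketch along the lines of the original code:
# The wait itself returns the list of matched children
filters_children = wait.until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".classname .anotherclassname")))
print(filters_children[1].text)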
I'm trying to retrieve text from a webpage marked up as a SPAN within a CLASS. I've tried this with XPath, but this won't work because the tag is encountered multiple times. I use Jupyter Notebook to write the program.
Here is an example from Instagram:
<div class="C4VMK">
<a class="FPmhX notranslate TlrDj" title="henkbrinkman1994"
href="/henkbrinkman1994/">henkbrinkman1994</a>
<span>Awesome!</span>
</div>
In this case I want to get the text 'Awesome!' in the SPAN tag.
How can I do this in Selenium with Python?
I don't have an Instagram account, nor do I have permission to use automation to collect information from their site (see their terms of service), so I can't really test this. The idea is that you would use find_element_by_xpath() to find the particular post (or find_elements_by_xpath() to get all of them).
my_post = driver.find_element_by_xpath('/xpath/to/a/post')
Then for each post use the same method to get the list of comments:
post_comments = my_post.find_elements_by_xpath('./relative/xpath/to/comments')
You can then loop through the objects in post_comments to get the text.
for post in post_comments:
    print(post.text)
[there are probably more efficient ways of doing this, but this will get you started]
The desired element looks to be a dynamic element, so to get the comment with the text Awesome! you need to induce WebDriverWait for the element to be visible, and you can use either of the following solutions:
XPATH#1:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[contains(@class,'notranslate') and contains(.,'henkbrinkman1994')]//following::span[1]"))).get_attribute("innerHTML"))
XPATH#2:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[contains(@class,'notranslate') and @title='henkbrinkman1994']//following::span[1]"))).get_attribute("innerHTML"))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can try an alternative locator strategy, because there is no guarantee that your XPath will work every time.
Use the CSS selector below:
div[class='C4VMK'] span
OR
.C4VMK span
Make sure your element has been loaded and is visible; if not, try different explicit wait conditions to make this work.
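For completeness, a minimal sketch wiring that CSS selector to an explicit wait (reusing the WebDriverWait imports shown in the previous answer):
comment = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.CSS_SELECTOR, "div[class='C4VMK'] span")))
print(comment.text)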
I want to iterate through a set of URLs using Selenium. From time to time I get 'element is not attached to the page document'. Reading a couple of other questions indicated that it's because I am changing the page that is being looked at. But I am not satisfied with that argument, since:
for url in urlList:
    driver.get(url)
    WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.XPATH, '//div/div')))
    # ^ WebDriverWait should have taken care of it
    myString = driver.find_element_by_xpath('//div/div').get_attribute("innerHTML")
    # ^ Error occurs here
    # Then I call this function to go through other elements, given other conditions not shown
    if myString:
        getMoreElements(driver)
But if I add a delay like this:
for url in urlList:
    driver.get(url)
    time.sleep(5)  # <<< IT WORKS, BUT WHY?
    element = WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.XPATH, '//div/div')))
    myString = driver.find_element_by_xpath('//div/div').get_attribute("innerHTML")  # Error occurred here
I feel I am hiding the problem by adding the delay right there. I have implicitly_wait set to 30s and set_page_load_timeout to 90s; that should have been sufficient. So why am I still forced to add what looks like a useless time.sleep?
Did you try the XPath //div/div manually in the dev tools to see how many divs are found on the page? I think there should be many, so your explicit wait below is very easily satisfied: perhaps in no more than a second after browser.get(), Selenium finds such a div and your wait ends.
WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.XPATH, '//div/div')))
Consider the following possibility:
Because of the explicit wait issue above, the page load is not complete, and more and more //div/div elements are still being rendered to the page at the very moment you ask Selenium to find one such div and interact with it.
Now think about the possibility that the first div Selenium finds will be deleted or moved to another DOM node.
Do you think that probability is high or low? I think it's very high: div is a very common tag in today's web pages, and you are using such a relaxed XPath that very many matching divs will be found, and each one of them can cause the 'Element Stale' issue.
To resolve your issue, use a stricter locator that waits for some specific element, rather than such a hasty XPath that matches very common, numerous elements.
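As an illustration of that advice, a minimal sketch that waits on a page-specific element instead of the generic //div/div. The content-body class name here is hypothetical, standing in for whatever unique element marks your fully loaded page:
# 'content-body' is a hypothetical, page-specific marker class; substitute a real one
WebDriverWait(driver, 30).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, "div.content-body")))
myString = driver.find_element_by_xpath('//div/div').get_attribute("innerHTML")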
What you observe as element is not attached to the page document is entirely possible.
Analysis:
In your code, while iterating over urlList, we open a URL and then wait for the WebElement with the XPath //div/div using the ExpectedConditions clause presence_of_element_located, which does not necessarily mean that the element is visible or clickable.
Hence, when you next try driver.find_element_by_xpath('//div/div').get_attribute("innerHTML"), the reference from the previous search/find_element is not found.
Solution:
The solution to your question would be to change the ExpectedConditions clause from presence_of_element_located to element_to_be_clickable, which checks that the element is visible and enabled so that you can even click it.
Code Block:
Your optimized code block may look like:
for url in urlList:
    driver.get(url)
    WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, '//div/div')))
    myString = driver.find_element_by_xpath('//div/div').get_attribute("innerHTML")
Your other solution:
Your other solution works because you are covering up Selenium's work with time.sleep(5), which is not a part of best practices.
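A further general pattern for this error, separate from the answers above, is to retry the lookup when a stale reference is caught. A rough sketch, reusing the names from the question:
from selenium.common.exceptions import StaleElementReferenceException

for url in urlList:
    driver.get(url)
    for attempt in range(3):
        try:
            # Re-find the element on every attempt; a stale node means the DOM replaced it mid-render
            myString = WebDriverWait(driver, 30).until(
                EC.presence_of_element_located((By.XPATH, '//div/div'))).get_attribute("innerHTML")
            break
        except StaleElementReferenceException:
            continue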