I am scraping a website whose content is split across multiple pages. Clicking certain buttons advances to the next page, but all the pages share exactly the same URL. This means most of the elements I have scraped aren't visible until the buttons of the same type before them have been clicked, so trying to click them raises the following error:
ElementNotInteractableException: Message: Element <div class="answer"> could not be scrolled into view
The elements are all of the same class, and they are all children of instances of another class; each parent has multiple instances of the child class.
To find the elements, I call the .find_elements_by_class_name() method twice:
lists = []
for i in Firefox.find_elements_by_class_name('parent'):
    lists.append(i.find_elements_by_class_name('child'))
There is exactly one element in each of the sublists that needs to be clicked. The element is determined by its attribute and identified using list indexing, so I have no idea what its XPath is.
None of the elements is visible until the preceding element is clicked, so waiting for them is a must to avoid ElementNotInteractableException.
I am using the following syntax:
wait = WebDriverWait(Firefox, 3)
wait.until(EC.visibility_of_element_located((By.XPATH, xpath)))
So I need the XPath of an element located using the aforementioned methods. Unfortunately, Selenium doesn't support this natively.
But I have found a trick here:
In [1]: el
Out[1]: <Element span at 0x109187f50>
In [2]: el.getroottree().getpath(el)
Out[2]: '/html/body/div/table[2]/tbody/tr[1]/td[3]/table[2]/tbody/tr/td[1]/p[4]/span'
So I think that if I can build an lxml tree from the Selenium page source and then somehow convert Selenium elements into lxml elements, it can be done, though I don't know exactly how...
What about something like this (maybe you need to use contains for class):
wait.until(EC.visibility_of_element_located((By.XPATH, f'//*[@class="parent"][{index}]//*[@class="child"][{subindex}]')))
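One caveat with an expression of this shape: in XPath, a position predicate such as [2] written directly after a step counts among the matching children of each parent node, not among all matches in the document. If index and subindex come from Python list positions (as returned by find_elements), wrapping each step in parentheses makes the position count in document order. A minimal sketch, assuming the class attributes are exactly "parent" and "child"; the helper name nth_child_xpath is hypothetical:

```python
def nth_child_xpath(parent_index, child_index, parent_cls="parent", child_cls="child"):
    """Build an XPath for lists[parent_index][child_index] (0-based Python indices).

    Assumption: the class attributes are exactly parent_cls / child_cls.
    """
    # XPath positions are 1-based, Python list indices are 0-based.
    p = parent_index + 1
    c = child_index + 1
    # Parentheses make [n] count matches in document order instead of
    # counting among siblings.
    parent = f"(//*[@class='{parent_cls}'])[{p}]"
    return f"({parent}//*[@class='{child_cls}'])[{c}]"


print(nth_child_xpath(0, 2))
# prints ((//*[@class='parent'])[1]//*[@class='child'])[3]
```

The returned string can then be used as the xpath in wait.until(EC.visibility_of_element_located((By.XPATH, xpath))).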
There is a table whose XPath is .//table[@id='target'] in the target webpage, and I want to get all the data in the table (the text of every td in the table).
Should I write the wait.until statement as
wait.until(EC.visibility_of_all_elements_located((By.XPATH, ".//table[@id='target']")))
or as
wait.until(EC.visibility_of_all_elements_located((By.XPATH, ".//table[@id='target']//td")))
Both commands will NOT give you what you are looking for.
visibility_of_all_elements_located will NOT really wait for ALL the elements that will eventually match the passed locator.
On each poll it looks up the elements that match the locator at that moment and returns them once every one of those is visible, so it knows nothing about elements that have not been added to the page yet.
So, if the table is filled in gradually, you may still have to add some sleep after that command to make sure all the elements are there.
Also, I think waiting for the visibility of the table's internal elements is better than waiting for the visibility of the table element itself.
So, I would use something like this:
wait.until(EC.visibility_of_all_elements_located((By.XPATH, ".//table[@id='target']//td")))
The td elements are descendants of the table rather than direct children, so .//table[@id='target']/descendant::td should be the right XPath.
all_table_data = wait.until(EC.visibility_of_all_elements_located((By.XPATH, ".//table[@id='target']/descendant::td")))
all_table_data is a list that contains all the web elements. Print it as shown below and that should give you all the data available in the Selenium viewport.
for data in all_table_data:
    print(data.text)
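If the goal is the table's data rather than a flat list of cells, the same lookups can be grouped by row. A minimal sketch, assuming table is the WebElement for .//table[@id='target'] obtained after the wait; the helper name table_rows is hypothetical, and "xpath" is the string value of By.XPATH:

```python
def table_rows(table):
    """Return the table's text as a list of rows, each a list of cell strings."""
    rows = []
    # Look up rows relative to the table element, then cells relative to each row.
    for tr in table.find_elements("xpath", ".//tr"):
        cells = [td.text for td in tr.find_elements("xpath", "./td")]
        if cells:  # skip rows without <td> cells, e.g. header rows made of <th>
            rows.append(cells)
    return rows
```

With a real driver this might be called as table_rows(driver.find_element("xpath", ".//table[@id='target']")).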
I am scraping the Xbox website with Selenium, but I ran into a problem when extracting someone's followers and friends: both elements have the same class, with no other property setting them apart, so I need to find all elements with that class, append them to a list, and take the first and second values. I just need to know how to find all elements with a class while using a wait, as seen below:
followers = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".item-value-data"))).text
#this currently only gets the first element
I know how to do this without a wait, by simply using the plural find_elements call, but I couldn't find anything about doing it with a wait.
With visibility_of_element_located, WebDriverWait returns as soon as at least 1 element matching the passed locator is visible.
There is no built-in expected condition in Selenium's Python bindings that waits for a predefined number of matching elements or anything like that.
What you can do is add a short sleep after the wait to let the page finish loading, and then get the list of desired elements.
Like this:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".item-value-data")))
followers = []
followers_els = driver.find_elements_by_css_selector(".item-value-data")
for el in followers_els:
    followers.append(el.text)
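As an alternative to sleeping, WebDriverWait.until accepts any callable that takes the driver and returns a truthy value, so a count-based wait can be written by hand. A minimal sketch, assuming an expected count of 2 (followers and friends); the helper name at_least_n_visible is hypothetical, and "css selector" is the string value of By.CSS_SELECTOR:

```python
def at_least_n_visible(locator, n):
    """Build a wait condition that succeeds once n matching elements are displayed."""
    def condition(driver):
        visible = [el for el in driver.find_elements(*locator) if el.is_displayed()]
        # Returning False tells WebDriverWait to poll again.
        return visible if len(visible) >= n else False
    return condition


# Usage with a real driver (not executed here); the followers/friends order
# is an assumption about the page:
# els = WebDriverWait(driver, 20).until(
#     at_least_n_visible(("css selector", ".item-value-data"), 2))
# followers, friends = els[0].text, els[1].text
```

The wait still raises TimeoutException if fewer than n elements become visible in time.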
I am using a school class schedule website, and I want to access the div element that contains info on how many seats are in a class and who is teaching it, in order to scrape it. I first find the element that contains the div I want, and then I try to find that div using XPath. The problem is that when I use either find_element_by_xpath or find_elements_by_xpath to get the div, I get this error:
'list' object has no attribute 'find_element_by_xpath'
Is this error happening because the div element I want to find is nested? Is there a way to get nested elements using a div tag?
Here is the code I have currently:
driver = webdriver.Chrome(ChromeDriverManager().install())
url = "https://app.testudo.umd.edu/soc/202008/INST"
section_container = driver.find_elements_by_id('INST366')
sixteen_grid = section_container.find_element_by_xpath(".//div[@class = 'sections sixteen colgrid']").text
The info I want is in this element:
<div class="sections sixteen colgrid"></div>
It is currently inside the element with this id:
<div id="INST366" class="course"></div>
It would be greatly appreciated if anyone could help me out with this.
From documentation of find_elements_by_id:
Returns : list of WebElement - a list with elements if any was found. An empty list if not
Which means section_container is a list. You can't call find_element_by_xpath on a list, but you can call it on each element within the list, because those are WebElement instances.
What does the documentation say about find_element_by_id?
Returns : WebElement - the element if it was found
In this case you can use find_element_by_xpath directly. Which one should you use? It depends on your need: whether you only need the first match to keep digging for information, or you need all the matches.
After fixing that you will encounter a second problem: your information is only displayed after JavaScript runs when you click on "Show Sections", so you need to do that before locating what you want. To do so, find the link (the a element with the href) and click on it.
The new code will look like this:
from selenium import webdriver
from time import sleep
driver = webdriver.Chrome()
url = "https://app.testudo.umd.edu/soc/202008/INST"
section_container = driver.find_element_by_id('INST366')
section_info = section_container.find_element_by_xpath(".//div[@class='sections sixteen colgrid']").text
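For the case where all matches are needed, the list returned by a find_elements call has to be iterated, and the nested search is run on each WebElement in turn. A minimal sketch of that pattern; the helper name nested_texts is hypothetical, and "xpath" is the string value of By.XPATH:

```python
def nested_texts(containers, xpath):
    """Collect the text of the first element matching xpath inside each container."""
    texts = []
    for container in containers:  # containers is a plain Python list
        # find_elements returns [] instead of raising when nothing matches.
        matches = container.find_elements("xpath", xpath)
        if matches:
            texts.append(matches[0].text)
    return texts


# Usage with a real driver (not executed here):
# courses = driver.find_elements("class name", "course")
# print(nested_texts(courses, ".//div[@class='sections sixteen colgrid']"))
```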
I'm trying to create a program extracting all persons I follow on Instagram. I'm using Python, Selenium and Chromedriver.
To do so, I first get the number of followed persons and click on the 'following' button:
nb_abonnements = int(webdriver.find_element_by_xpath('/html/body/span[1]/section[1]/main/div[1]/header/section[1]/ul/li[3]/a/span').text)
abonnements = webdriver.find_element_by_xpath('/html/body/span[1]/section[1]/main/div[1]/header/section[1]/ul/li[3]/a')
I then use the following code to get the followers and scroll the popup page in case I can't find one:
followers_panel = webdriver.find_element_by_xpath('/html/body/div[3]/div/div/div[2]')
while i < nb_abonnements:
followed = webdriver.find_element_by_xpath('/html/body/div[3]/div/div/div[2]/ul/div/li[{}]/div/div[2]/div/div/div/a'.format(i+1)).text
#the followeds are in an ul-list
i += 1
except NoSuchElementException:
The problem is that once i reaches 12, the program raises the exception and scrolls. From there, it still can't find the next follower and is stuck in a loop where it does nothing but scroll. I've checked the source code of the IG page, and it turns out the path is still good, but apparently I can no longer access the elements the way I do, probably because the ul-list in which I am accessing them has become too long (line 5 of the program).
I can't work out how to solve this. I hope you can help.
UPDATE: the DOM looks like this:
The ul is the list of the followers.
The li elements contain the info I'm trying to extract (the username). Even when I go to the webpage myself, open the popup window, scroll a little and let everything load, I can't find the element I'm looking for by typing the XPath manually into the search bar of the DOM inspector, although the path is correct, as I can check by looking at the DOM.
I've tried various webdrivers for selenium, currently I am using chromedriver 2.45.615291. I've also put an explicit wait to wait for the element to show (WebDriverWait(webdriver, 10).until(EC.presence_of_element_located((By.XPATH, '/html/body/div[3]/div/div/div[2]/ul/div/li[{}]/div/div[2]/div/div/div/a'.format(i+1))))), but I just get a timeout exception: selenium.common.exceptions.TimeoutException: Message:.
It just seems like once the ul list is too long (which is from the moment I've scrolled down enough to load new people), I can't access any element of the list by its XPATH, even the elements that were already loaded before I began scrolling.
Instead of using an XPath for each of the child elements, find the ul-list element and then find all of its child elements using something like ul_list_element.find_elements_by_tag_name(). Then iterate through each element in the collection and get the required text.
I've found a solution: I just access the element through the XPath like this: find_element_by_xpath("(//*[@class='FPmhX notranslate _0imsa '])[{}]".format(i)). I don't know why it didn't work the other way, but like this it works just fine.
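That suggestion could look roughly like this. A minimal sketch, assuming panel is the popup element located earlier (followers_panel in the question) and that each li contains a link whose text is the username; the helper name usernames_in is hypothetical, and "tag name" is the string value of By.TAG_NAME:

```python
def usernames_in(panel):
    """Return the link text found in every <li> of the first <ul> inside panel."""
    ul = panel.find_element("tag name", "ul")
    names = []
    for li in ul.find_elements("tag name", "li"):
        links = li.find_elements("tag name", "a")
        # Assumption: some links (e.g. an avatar) have no text, so keep the
        # first link that does.
        text = next((a.text for a in links if a.text), None)
        if text:
            names.append(text)
    return names
```

Because the li elements are looked up relative to the ul each time the function runs, it can be called again after every scroll to pick up newly loaded entries.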
Quick info: I'm using Mac OS, Python 3.
I have about 800 links that need to be clicked on a page (and many more pages to go, so I need automation).
They were hidden, because you only see those links when you hover over them.
I fixed that by injecting a CSS rule (just mentioning it in case it's the reason this isn't working).
When I try to find the elements by XPath, it does not click the links afterwards, and it also doesn't find all of them, always just 4 (even when more are displayed in view).
When I click Copy XPath in the inspector, it gives me:
But it doesn't work when I use it like this:
So two questions:
How do I get them all?
How do I get it to click on each of them?
The pattern in the XPath is the same, with the index in /li[3] being the only number that changes. For this I wrote a for loop that generates them all based on the count on the page, which works.
So if it can be done with the XPaths I generate myself, which correspond to what Copy XPath gives me in the inspector, then I only need question 2 answered.
PS.: this is HTML of parent of that first HTML:
<li onclick="openPopup(event, 'collect', {item_id: 165214})" class="collect" data-item-id="165214">Display</li>
This XPath,
will select all a links with anchor text equal to "Display".
As per your question, the HTML you have shared, and your code attempts, there is no need to get the <li> tags. Instead we will get the <a> tags in a list. So, to answer your first question (how do I get them all), you can use the following line of code:
all_Display = driver.find_elements_by_xpath("//*[@id='tiles']//li/div[2]/ul/li[@class='collect']/a[@title='Display']")
Next, to click on each of them, create a loop that iterates through all the <a> tags as follows:
all_Display = driver.find_elements_by_xpath("//*[@id='tiles']//li/div[2]/ul/li[@class='collect']/a[@title='Display']")
for each_Display in all_Display:
    each_Display.click()
Using an XPath with elements by position is not ideal. Instead use a CSS selector to match the attributes for the targeted elements.
Something like:
all_Display = driver.find_elements_by_css_selector("#tiles li[onclick][data-item-id] a[title]")
You can then click them in a loop, provided none of them loads a new page:
for element in all_Display:
    element.click()
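If clicking does change the page (for example by re-rendering the list), elements collected up front can go stale. A common workaround is to look the elements up again before every click and address them by index. A minimal sketch, assuming the number of matching links does not shrink while clicking; the helper name click_each is hypothetical, and "css selector" is the string value of By.CSS_SELECTOR:

```python
def click_each(driver, locator):
    """Click every element matching locator, re-locating before each click."""
    clicked = 0
    while True:
        # Fresh lookup on every pass so no stale reference is reused.
        elements = driver.find_elements(*locator)
        if clicked >= len(elements):
            break
        elements[clicked].click()
        clicked += 1
    return clicked


# Usage with a real driver (not executed here):
# click_each(driver, ("css selector", "#tiles li[onclick][data-item-id] a[title]"))
```

The return value is the number of clicks performed, which can be compared against the expected link count on the page.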