selenium get element by css selector - python

I am trying to get user details from each block, as given below:
driver.get("https://www.facebook.com/public/karim-pathan")
wait = WebDriverWait(driver, 10)
li_link = []
for s in driver.find_elements_by_class_name('clearfix'):
print s
print s.find_element_by_css_selector('_8o._8r.lfloat._ohe').get_attribute('href')
print s.find_element_by_tag_name('img').get_attribute('src')
It says:
unable to find element with css selector
Any hint is appreciated.

Just a guess, based on the assumption that you are not logged in: you are getting the exception because not every element with class clearfix contains a descendant matching ._8o._8r.lfloat._ohe, so your code isn't reaching the required elements. In any case, if you are trying to fetch the href and the img source of the results, you need not iterate over all the clearfix elements. As @leo.fcx suggested, your CSS selector is incorrect; with the selector leo provided, you can achieve the desired result as:
driver.get("https://www.facebook.com/public/karim-pathan")
for s in driver.find_elements_by_css_selector('._8o._8r.lfloat._ohe'): // there didn't seemed to iterate over each class of clearfix
print s.get_attribute('href')
print s.find_element_by_tag_name('img').get_attribute('src')
P.S. Sorry for any syntax errors; I have never explored the Python bindings. :)

Since your selector is made up of all the class names applied to the element, adding a . to the beginning of your CSS selector should fix it.
Try this:
s.find_element_by_css_selector('._8o._8r.lfloat._ohe')
instead of:
s.find_element_by_css_selector('_8o._8r.lfloat._ohe')
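For illustration, a space-separated class attribute maps onto a CSS selector by prefixing every class with a dot and joining them without spaces (a minimal sketch):
class_attr = "_8o _8r lfloat _ohe"          # value of the element's class attribute
css = "." + ".".join(class_attr.split())    # -> "._8o._8r.lfloat._ohe"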

Adding to what @leo.fcx pointed out about the selector, wait for the search results to become visible:
wait.until(EC.visibility_of_element_located((By.ID, "all_search_results")))
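Putting both fixes together, a minimal sketch (assuming an unauthenticated session and the usual imports) could look like this:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.facebook.com/public/karim-pathan")
wait = WebDriverWait(driver, 10)
# wait for the results container before querying inside it
wait.until(EC.visibility_of_element_located((By.ID, "all_search_results")))
for s in driver.find_elements_by_css_selector('._8o._8r.lfloat._ohe'):
    print(s.get_attribute('href'))
    print(s.find_element_by_tag_name('img').get_attribute('src'))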

Related

Selenium - Get element after a specific element (Python)

As you can see from the picture, the pagination is a bit different, so my idea is to get the position of the current-page span and then get the link from the a tag that follows it.
current_page= driver.find_element(By.CSS_SELECTOR,"span.current-page")
driver.find_element(By.TAG_NAME,"a").click()
but I don't know how to use the current_page element to find the a tag after it and click on it.
Thanks for your help in advance.
To get the next element using the current-page element as a reference, you can use the following CSS selector one-liner, or the XPath option below.
next_page= driver.find_element(By.CSS_SELECTOR,"span.current-page+a")
print(next_page.get_attribute("href"))
next_page.click()
Or you can use this:
current_page= driver.find_element(By.CSS_SELECTOR,"span.current-page")
current_page.find_element(By.XPATH,"./following::a[1]").click()
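If you need to keep clicking through pages, a minimal sketch built on the same sibling selector (assuming the selector stops matching on the last page) might be:
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException

while True:
    try:
        # "+" picks the <a> immediately following span.current-page
        next_page = driver.find_element(By.CSS_SELECTOR, "span.current-page + a")
    except NoSuchElementException:
        break  # no sibling link left: assume this is the last page
    next_page.click()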

I found a span on a website that is not visible and I can't scrape it! Why?

Currently I'm trying to scrape data from a website, using Selenium.
Everything was working as it should, until I realised I have to scrape a tooltip text.
I have already found different threads on Stack Overflow that provide an answer, but I have not managed to solve this issue so far.
After a few hours of frustration, I realised the following:
this span has nothing to do with the tooltip, I guess, because the tooltip looks like this:
There is actually a span that I can't read. I try to read it like this:
bewertung = driver.find_elements_by_xpath('//span[@class="a-icon-alt"]')
for item in bewertung:
    print(item.text)
So Selenium finds this element, but unfortunately .text returns nothing. Why is it always empty?
And what is the span from the first screenshot for? By the way, it is not displayed on the website either.
Since you've mentioned that Selenium finds this element, I assume you must have printed the len of the bewertung list,
something like:
print(len(bewertung))
If this list has some elements in it, you could probably use innerText:
bewertung = driver.find_elements_by_xpath('//span[@class="a-icon-alt"]')
for item in bewertung:
    print(item.get_attribute("innerText"))
Note that you are using find_elements, which won't throw any error when nothing matches; it just returns an empty list.
If you used find_element instead, it would throw the exact error, as sketched below.
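For example (a minimal sketch using the XPath from the question):
from selenium.common.exceptions import NoSuchElementException

# find_elements: silently returns [] when nothing matches
print(len(driver.find_elements_by_xpath('//span[@class="a-icon-alt"]')))
# find_element: raises, so you see exactly which locator failed
try:
    driver.find_element_by_xpath('//span[@class="a-icon-alt"]')
except NoSuchElementException as error:
    print(error)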
Also, I think you have the XPath of the wrong span (one which does not appear in the UI; sometimes such elements don't appear until some actions are triggered).
You can try this XPath instead:
//i[@data-hook='average-stars-rating-anywhere']//span[@data-hook='acr-average-stars-rating-text']
Something like this in code:
bewertung = driver.find_elements_by_xpath("//i[@data-hook='average-stars-rating-anywhere']//span[@data-hook='acr-average-stars-rating-text']")
for item in bewertung:
    print(item.text)
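One more note (standard Selenium behaviour, not specific to this page): .text only returns the visible text of an element, which is why a hidden span yields an empty string. The textContent property is not tied to visibility, so the following sketch should still return the hidden text:
for item in bewertung:
    print(item.get_attribute("textContent"))          # DOM property lookup
    print(driver.execute_script(
        "return arguments[0].textContent;", item))    # same thing via JavaScript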

Selenium on Twitter for user information: no such elements

I am trying to use Selenium to get specific users' information (e.g., the number of followers) by opening their pages using their IDs. The thing is, although I can find the needed information in the inspector, I cannot locate it using Selenium, even with the help of ChroPath, which tells you an XPath or CSS selector you can use to locate it. It keeps saying: no such element... I'm quite confused. I'm not even trying to automatically log in or anything.
Here is the code:
from selenium import webdriver
driver = webdriver.Chrome(executable_path='E:/data mining/chromedriver.exe')
driver.get('https://twitter.com/intent/user?user_id=823730524426477568')
ele = driver.find_element_by_class_name('css-901oao css-16my406 r-poiln3 r-bcqeeo r-qvutc0').text
print(ele)
Error:
Message: no such element: Unable to locate element: {"method":"css selector","selector":".r-poiln3 r-bcqeeo r-qvutc0"}
(Session info: chrome=88.0.4324.104)
It's so strange, because everything is right on the first page, and I don't even need to scroll down to see the information, but it won't be scraped...
To fix the issue with your code, try this:
ele = driver.find_element_by_css_selector('.css-901oao.css-16my406.r-poiln3.r-bcqeeo.r-qvutc0').text
However, because of how the site is formatted, it won't get you the result you want. It appears that not only do they rotate the class names, but there isn't enough variability in how the elements are labelled to make anything but XPath a viable option for getting specific data (unless you want to go through a list of all the elements with the same class to find what you need). After some initial site interactions, these XPaths worked for me:
following = driver.find_element_by_xpath('//*[#id="react-root"]/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div[2]/div[4]/div[1]/a/span[1]/span').text
followers = driver.find_element_by_xpath('//*[#id="react-root"]/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div[2]/div[4]/div[2]/a/span[1]/span').text
Problem is, there are over 100 elements with classes 'css-901oao css-16my406 r-poiln3 r-bcqeeo r-qvutc0' on that page.
To avoid an absolute XPath, you could use something like this for, e.g., the number of followers:
find the span with a descendant span whose text is 'Followers'; above that, at the same level (a sibling), there is a span whose child holds the number of followers.
ele = driver.find_element_by_xpath("//span[descendant::span[text()='Followers']]/preceding-sibling::span/span").text
The selectors which you are using are very fragile. ChroPath gives fragile XPaths which change on the next run, so the script fails. You might like to use the relative XPaths generated by SelectorsHub instead, which are much more robust.

Python, Selenium: can't find element by xpath when ul list is too long

I'm trying to create a program extracting all the people I follow on Instagram. I'm using Python, Selenium and ChromeDriver.
To do so, I first get the number of followed people and click on the 'following' button:
nb_abonnements = int(webdriver.find_element_by_xpath('/html/body/span[1]/section[1]/main/div[1]/header/section[1]/ul/li[3]/a/span').text)
sleep(randrange(1,3))
abonnements = webdriver.find_element_by_xpath('/html/body/span[1]/section[1]/main/div[1]/header/section[1]/ul/li[3]/a')
abonnements.click()
I then use the following code to get the followed users, scrolling the popup page when I can't find the next one:
followers_panel = webdriver.find_element_by_xpath('/html/body/div[3]/div/div/div[2]')
while i < nb_abonnements:
    try:
        print(i)
        followed = webdriver.find_element_by_xpath('/html/body/div[3]/div/div/div[2]/ul/div/li[{}]/div/div[2]/div/div/div/a'.format(i+1)).text
        # the followed users are in a ul list
        i += 1
        followed_list.append(followed)
    except NoSuchElementException:
        webdriver.execute_script("arguments[0].scrollBy(0,400)", followers_panel)
        sleep(7)
The problem is that once i reaches 12, the program raises the exception and scrolls. From there, it still can't find the next followed user and gets stuck in a loop where it does nothing but scroll. I've checked the source code of the IG page, and it turns out the path is still good, but apparently I can't access the elements as I did before, probably because the ul list in which I am accessing them has become too long (line 5 of the program).
I can't work out how to solve this. I hope you can be of some help.
UPDATE: the DOM looks like this:
html
  body
    span
      script
      ...
    div[3]
      div
        ...
        div
          div
            div[2]
              ul
                div
                  li
                  li
                  li
                  li
                  ...
                  li
The ul is the list of the followers.
The lis contain the info I'm trying to extract (the usernames). Even when I go to the webpage myself, open the popup window, scroll a little and let everything load, I can't find the element I'm looking for by typing the XPath into the search bar of the DOM manually, although the path is correct; I can check that by looking at the DOM.
I've tried various webdrivers for Selenium; currently I am using chromedriver 2.45.615291. I've also added an explicit wait for the element to show (WebDriverWait(webdriver, 10).until(EC.presence_of_element_located((By.XPATH, '/html/body/div[3]/div/div/div[2]/ul/div/li[{}]/div/div[2]/div/div/div/a'.format(i+1))))), but I just get a timeout exception: selenium.common.exceptions.TimeoutException: Message:.
It just seems like once the ul list is too long (which is from the moment I've scrolled down enough to load new people), I can't access any element of the list by its XPath, not even the elements that were already loaded before I began scrolling.
Instead of using an XPath for each child element, find the ul element, then find all of its children using something like ul_element.find_elements_by_tag_name(), and iterate through each element in the collection to get the required text, as sketched below.
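A minimal sketch of that idea (the panel XPath, the webdriver variable, nb_abonnements and sleep come from the question; that each li exposes its username in an a tag is an assumption, as is re-reading the list after every scroll):
followers_panel = webdriver.find_element_by_xpath('/html/body/div[3]/div/div/div[2]')
followed_list = []
while len(followed_list) < nb_abonnements:
    # re-query the children instead of addressing each li by index
    items = followers_panel.find_elements_by_tag_name('li')
    for li in items[len(followed_list):]:
        followed_list.append(li.find_element_by_tag_name('a').text)  # assumed li structure
    webdriver.execute_script("arguments[0].scrollBy(0,400)", followers_panel)
    sleep(2)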
I've found a solution: I just access the elements through an XPath like this: find_element_by_xpath("(//*[@class='FPmhX notranslate _0imsa '])[{}]".format(i)). I don't know why it didn't work the other way, but like this it works just fine.

Using Python and Selenium why am I unable to find link by link text?

I have a list web element that has a bunch of links within it. The HTML looks like:
<li>
<span class="ss-icon"></span> Remove
<a href="/sessions/new"><span class="ss-icon"></span> Sign in to save items</a>
...
When I try to do something like:
link = element.find_element_by_link_text('Sign in to save items')
I get an error that says:
NoSuchElementException: Message: Unable to locate element:
{"method":"link text","selector":"Sign in to save items"}
I have been able to find this link by instead doing find_elements_by_tag_name('a') and then picking the link with the correct href, but I would like to understand why the first method fails.
It has happened to me before that the find_element_by_link_text method sometimes works and sometimes doesn't, even on a single page. I think it's not a reliable way to access elements; the best way is to use find_element_by_id. But in your case, as I visit the page, there is no id to help you. Still, you can try find_element_by_xpath in two ways:
1- Matching on the title: find_element_by_xpath("//a[contains(@title, 'Sign in to save items')]")
2- Matching on the text: find_element_by_xpath("//a[contains(text(), 'Sign in to save items')]")
Hope it helps.
The problem is, most likely, extra spaces before and/or after the link text. You can still approach it with a "partial link text" match:
element.find_element_by_partial_link_text('Sign in to save items')
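Alternatively, an XPath with normalize-space() sidesteps the surrounding whitespace as well (a sketch, equivalent in spirit):
# normalize-space() trims and collapses whitespace before comparing
link = element.find_element_by_xpath(".//a[normalize-space(.)='Sign in to save items']")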

Categories

Resources