I want to extract data in <div class="user-profile_list __relatives"> ... (see image)
Source code of the page https://gist.github.com/mascai/59e3bf779c2ba7cecb973ab9653ed419
My code
def get_relatives(driver):
    relatives = []
    relatives_container = driver.find_element_by_class_name("user-profile_list __relatives")
    return relatives

driver = webdriver.Chrome(executable_path='chromedriver')
get_relatives(driver)
Error text
Message: no such element: Unable to locate element: {"method":"css selector","selector":".user-profile_list __relatives"}
This happens a lot: find_element_by_class_name accepts only a single class name, so a value containing a space fails. It is better to use XPath and match on the class attribute:
relatives_container = driver.find_element_by_xpath('//*[@class="user-profile_list __relatives"]')
You can also try contains() in XPath. It also works when the element has multiple classes and you write only one of them:
relatives_container = driver.find_element_by_xpath("//*[contains(@class, 'user-profile_list __relatives')]")
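One caveat with contains(@class, ...) is that it is a plain substring test, so a short value can also match longer, unrelated class names. A common XPath idiom pads the normalized class attribute with spaces and searches for the padded token instead. The sketch below only builds such an expression as a string; xpath_for_class is a hypothetical helper name, not part of Selenium.

```python
def xpath_for_class(class_name, tag="*"):
    """Build an XPath matching elements whose class list contains class_name as a whole token."""
    # A single class token must not contain whitespace.
    if not class_name or any(ch.isspace() for ch in class_name):
        raise ValueError("pass exactly one class name, without spaces")
    # Pad the attribute and the token with spaces so only whole tokens match.
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


if __name__ == "__main__":
    print(xpath_for_class("__relatives"))
    print(xpath_for_class("user-profile_list", tag="div"))
```

The resulting string could then be passed to driver.find_element_by_xpath(...) in the same way as the expressions above.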
Related
I am writing a YouTube scraper in selenium. I am unable to scrape comments.
Following is my code:
driver.get("https://www.youtube.com/watch?v=b5jt2bhSeXs")
wait = WebDriverWait(driver, 20)
time.sleep(10)
# wait.until(EC.presence_of_element_located((By.TAG_NAME, "ytd-comments")))
wait.until(EC.presence_of_element_located((By.TAG_NAME, "ytd-item-section-renderer")))
get_comments_and_commentors(driver)
def get_comments_and_commentors(driver):
    scroll_down()
    main_div = driver.find_element_by_id("contents")
    inner = main_div.find_element_by_class_name("style-scope.ytd-item-section-renderer")
    here = inner.find_element_by_tag_name("ytd-comment-renderer")
    print(here)
The scroll_down function scrolls to the end of the page to load all the comments. It is working, because I can see the page scrolling when I run the script.
The issue I am facing is that Selenium is unable to find the tag "ytd-comment-renderer". Instead of the tag I have also tried the class name, but it gives the same error.
I keep getting this error for both the tag and the class name:
"selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"ytd-comment-renderer"}"
To fix this I tried this:
wait.until(EC.presence_of_element_located((By.TAG_NAME, "ytd-comment-renderer")))
But then I got a TimeoutException.
What can I do to fix this error?
I am trying to locate a span element on a web page. I have tried XPath, but it raises a timeout error. I want to locate the title span element of a Facebook Marketplace product.
Here is my code:
def title_detector():
    title = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, 'path'))).text
    list_data = title.split("ISBN", 1)
Try this xpath //span[contains(text(),'isbn')]
You can't locate pseudo-elements with XPath, only through a CSS selector.
I see it's Facebook, with its ugly class names...
I'm not sure this will work for you, because these class names may be dynamic, but it worked for me this time.
Anyway, the css_locator for that span element is .dati1w0a.qt6c0cv9.hv4rvrfc.discj3wi .d2edcug0.hpfvmrgz.qv66sw1b.c1et5uql.lr9zc1uh.a8c37x1j.keod5gw0.nxhoafnm.aigsh9s9.qg6bub1s.fe6kdd0r.mau55g9w.c8b282yb.iv3no6db.o0t2es00.f530mmz5.hnhda86s.oo9gr5id
So, since we are trying to get its ::before content, we can do that with the following JavaScript:
span_locator = ".dati1w0a.qt6c0cv9.hv4rvrfc.discj3wi .d2edcug0.hpfvmrgz.qv66sw1b.c1et5uql.lr9zc1uh.a8c37x1j.keod5gw0.nxhoafnm.aigsh9s9.qg6bub1s.fe6kdd0r.mau55g9w.c8b282yb.iv3no6db.o0t2es00.f530mmz5.hnhda86s.oo9gr5id"
script = "return window.getComputedStyle(document.querySelector('{}'),':before').getPropertyValue('content')".format(span_locator)
print(driver.execute_script(script).strip())
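The computed value of the content property usually comes back wrapped in quotes, and keywords such as none or normal mean there is no generated text. Here is a minimal sketch of cleaning that value on the Python side. clean_css_content is a hypothetical helper name, and the exact strings a given browser returns are an assumption worth checking against the real page.

```python
def clean_css_content(raw):
    """Strip the surrounding quotes from a computed CSS `content` value.

    Returns None for missing values and for the keywords `none` / `normal`.
    """
    if raw is None:
        return None
    value = raw.strip()
    if value in ("", "none", "normal"):
        return None
    # Computed values usually arrive as a quoted string, e.g. '"some text"'.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        value = value[1:-1]
    return value


if __name__ == "__main__":
    print(clean_css_content('"ISBN 123"'))
    print(clean_css_content("none"))
```

It could be applied to the string returned by driver.execute_script(script) before splitting on "ISBN".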
If the CSS selector above does not work because the class names are dynamic, try to locate that span with a more stable CSS selector. It is possible; you just have to inspect the page several times to see which class names are stable and which are not.
UPD:
You don't need to locate the pseudo-element there; it is enough to catch the span itself. Something like this will do:
span_locator = ".dati1w0a.qt6c0cv9.hv4rvrfc.discj3wi .d2edcug0.hpfvmrgz.qv66sw1b.c1et5uql.lr9zc1uh.a8c37x1j.keod5gw0.nxhoafnm.aigsh9s9.qg6bub1s.fe6kdd0r.mau55g9w.c8b282yb.iv3no6db.o0t2es00.f530mmz5.hnhda86s.oo9gr5id"
def title_detector():
    title = WebDriverWait(driver, 10).until(
        # Pass the variable itself, not the string 'span_locator'
        EC.presence_of_element_located((By.CSS_SELECTOR, span_locator))).text
    title = title.strip()
    list_data = title.split("ISBN", 1)
I am trying to retrieve information about the reviewers posted on the Google Play Store. I tried to extract each reviewer's name and the rating they gave. The code returns the first set of information and then raises an error. I tried to debug it but couldn't fix it. Here is my code:
driver = webdriver.Chrome(driverLocation)
driver.get('https://play.google.com/store/apps/details?id=com.getsomeheadspace.android&showAllReviews=true')
driver.implicitly_wait(20)
parentElement = driver.find_element_by_xpath('//h3[contains(text(),"User reviews")]//parent::div/div[1]/div')
for child in parentElement.find_element_by_tag_name('div'):
    name = child.find_element_by_class_name('X43Kjb').text
    indvd_rating = child.find_element_by_xpath('//span[@class="nt2C1d"]//div[@aria-label]').get_attribute('aria-label')
    print(name)
    print(indvd_rating)
driver.quit()
Error is:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".X43Kjb"}
Could anyone point out where I am making the mistake?
I'm writing a program to iterate through elements on a webpage. I start the browser like so:
self.browser = webdriver.Chrome(executable_path="C:/Users/me/chromedriver.exe")
self.browser.get("https://www.google.com/maps/place/Foster+Street+Coffee/@36.0016436,-78.9018397,19z/data=!4m7!3m6!1s0x89ace473f05b7d39:0x42c63a92682d9ec3!8m2!3d36.0016427!4d-78.9012927!9m1!1b1")
This opens the site, where I can find the elements I'm interested in using:
reviews = self.browser.find_elements_by_class_name("section-review-line")
Now I have a list of elements with class name "section-review-line", which seems to populate correctly. I'd like to iterate through this list and pick out subelements with some logic. To get the subelements, which I know exist with class name "section-review-review-content", I try this:
for review in reviews:
    content = review.find_element_by_class_name("section-review-review-content")
This errors out with:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".section-review-review-content"}
OK, here is all the information you need from each review.
reviews = driver.find_elements_by_class_name("section-review-content")
for review in reviews:
    reviewer = review.find_element_by_class_name("section-review-title").text
    numOfReviews = review.find_element_by_xpath(".//div[@class='section-review-subtitle']//span[contains(.,'reviews')]").text.strip().replace('.','')
    numberOfStarts = review.find_element_by_class_name("section-review-stars").get_attribute('aria-label').strip()
    publishDate = review.find_element_by_class_name("section-review-publish-date").text
    content = review.find_element_by_class_name("section-review-review-content").text
Ah, figured it out.
I was using a strange page that had an empty element at the top, which caused the error. The large majority of the elements did not have this problem. Using try/except solved it like so:
from selenium.common.exceptions import NoSuchElementException

reviews = self.browser.find_elements_by_class_name("section-review-line")
for review in reviews:
    try:
        content = review.find_element_by_class_name("section-review-review-content")
        rtext = content.find_element_by_class_name("section-review-text").text
    except NoSuchElementException:
        # Skip review lines that lack the expected subelements
        continue
I am trying to get image and video url from https://www.google.com/trends/home/all/IN
Here is the code:
driver = webdriver.PhantomJS('/usr/local/bin/phantomjs')
driver.set_window_size(1124, 850)
driver.get("https://www.google.com/trends/home/all/IN")
trend = {}
def getGooglerends():
    try:
        #Does this line makes any sense
        #element = WebDriverWait(driver, 20).until(lambda driver: driver.find_elements_by_class_name('md-list-block ng-scope'))
        for s in driver.find_elements_by_class_name('md-list-block ng-scope'):
            print s.find_element_by_tag_name('img').get_attribute('src')
            print s.find_element_by_tag_name('img').get_attribute('alt')
            print s.find_elements_by_class_name('image-wrapper ng-scope').get_attribute('href')
    except:
        getNDTVTrends()

getGooglerends()
which gives
WebDriverException: Message: {"errorMessage":"Compound class names not permitted","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"111","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:57213","User-Agent":"Python-urllib/2.7"},"httpVersion":"1.1","method":"POST","post":"{\"using\": \"class name\", \"sessionId\": \"648251c0-1cc7-11e5-bf1c-4ff79ddbdce4\", \"value\": \"md-list-block ng-scope\"}","url":"/elements","urlParsed":{"anchor":"","query":"","file":"elements","directory":"/","path":"/elements","relative":"/elements","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/elements","queryKey":{},"chunks":["elements"]},"urlOriginal":"/session/648251c0-1cc7-11e5-bf1c-4ff79ddbdce4/elements"}}
Screenshot: available via screen
Any suggestion for this error?
Compound class names not permitted
This basically means that you cannot have spaces in the class name you pass. You need to switch to another kind of selector, such as CSS or XPath.
I'm not really sure what you are trying to select, but for example the following XPath selects a list of items containing that class:
//div[@class="homepage-trending-stories generic-container ng-scope"]/md-list[@class="md-list-block ng-scope"]
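For the CSS route, each class in the compound name becomes its own .class token with no spaces in between. A minimal sketch of that conversion follows. css_for_classes is a hypothetical helper, not a Selenium function, and it assumes plain class names that need no CSS escaping.

```python
def css_for_classes(compound, tag=""):
    """Turn a space-separated class attribute value into a CSS selector."""
    classes = compound.split()
    if not classes:
        raise ValueError("expected at least one class name")
    # ".a.b" matches elements carrying both classes, in any order.
    return tag + "".join("." + name for name in classes)


if __name__ == "__main__":
    print(css_for_classes("md-list-block ng-scope"))
    print(css_for_classes("md-list-block ng-scope", tag="md-list"))
```

The resulting selector could be passed to driver.find_elements_by_css_selector(...) in place of the failing class-name lookup.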