ProductNames is a list containing the elements I need. When I use this line:
ProductNames[3].find_element_by_css_selector('.aok-align-bottom').get_attribute("innerHTML")
I'm getting this:
<span class="a-icon-alt">4.3 out of 5 stars</span>
So how can I extract only the text 4.3 out of 5 stars from the span tag?
You should include >span in your css_selector too, and call get_attribute("innerHTML") on the <span class="a-icon-alt">4.3 out of 5 stars</span> element itself.
Try something like this:
ProductNames[3].find_element_by_css_selector('.aok-align-bottom>span').get_attribute("innerHTML")
You don't extract from innerHTML. Rather you extract text or the value of any attribute of a WebElement.
To extract the text _4.3 out of 5 stars_ you need to move one step deeper to the <span> and you can use the following Locator Strategy:
ProductNames[3].find_element_by_css_selector('.aok-align-bottom>span.a-icon-alt').get_attribute("innerHTML")
Or simply:
ProductNames[3].find_element_by_css_selector('.aok-align-bottom>span').get_attribute("innerHTML")
As an alternative, you can also use the text attribute as follows:
ProductNames[3].find_element_by_css_selector('.aok-align-bottom>span.a-icon-alt').text
Or simply:
ProductNames[3].find_element_by_css_selector('.aok-align-bottom>span').text
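If all you have is the innerHTML string of the parent (the markup from the question), the text can also be recovered without another Selenium call. This is only a rough sketch using the standard library's html.parser; the TextExtractor class and inner_html_to_text function are names made up for the illustration, not part of Selenium.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are ignored.
        self.parts.append(data)


def inner_html_to_text(fragment):
    """Return the text of an innerHTML string with all tags removed."""
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts).strip()


# The innerHTML value shown in the question.
inner_html = '<span class="a-icon-alt">4.3 out of 5 stars</span>'
print(inner_html_to_text(inner_html))
```

Locating the span directly, as in the lines above, is still the simpler route when the element is reachable.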
References
You can find a couple of relevant discussions in:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium
I want to scrape the URLs within the HTML of the 'Racing-Next to Go' section of www.tab.com.au.
Here is an excerpt of the HTML:
<a ng-href="/racing/2020-07-31/MACKAY/MAC/R/8" href="/racing/2020-07-31/MACKAY/MAC/R/8"><i ng-
All I want to scrape is the last bit of that HTML which is a link, so:
/racing/2020-07-31/MACKAY/MAC/R/8
I have tried to find the element by using xpath, but I can't get the URL I need.
My code:
from selenium import webdriver

driver = webdriver.Firefox(executable_path=r"C:\Users\Harrison Pollock\Downloads\Python\geckodriver-v0.27.0-win64\geckodriver.exe")
driver.get('https://www.tab.com.au/')
elements = driver.find_elements_by_xpath('/html/body/ui-view/main/div[1]/ui-view/version[2]/div/section/section/section/race-list/ul/li[1]/a')
for e in elements:
    print(e.text)
You probably want to use get_attribute instead of .text. See the Selenium documentation for get_attribute.
elements = driver.find_elements_by_xpath('/html/body/ui-view/main/div[1]/ui-view/version[2]/div/section/section/section/race-list/ul/li[1]/a')
for e in elements:
    print(e.get_attribute("href"))
Yes, you can use the getAttribute(attributeLocator) function for this.
selenium.getAttribute("//xpath@href");
Specify the XPath of the element, followed by @ and the name of the attribute whose value you need (here href).
The value /racing/2020-07-31/MACKAY/MAC/R/8 in the HTML is the value of the href attribute, not the innerText of the element.
Solution
Instead of using the text attribute you need to use get_attribute("href") and the effective lines of code will be:
elements = driver.find_elements_by_xpath('/html/body/ui-view/main/div[1]/ui-view/version[2]/div/section/section/section/race-list/ul/li[1]/a')
for e in elements:
    print(e.get_attribute("href"))
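One thing to watch for: get_attribute("href") typically gives back the fully resolved URL rather than the literal attribute text, so the result may start with the site's scheme and host. If only the path from the question is wanted, it can be cut out with urllib.parse. A small sketch, where path_of is a hypothetical helper name and the absolute URL is an assumed example of what the driver might return:

```python
from urllib.parse import urlsplit


def path_of(href):
    """Return only the path component of an absolute or relative URL."""
    return urlsplit(href).path


# Assumed example value; a relative href passes through unchanged.
print(path_of("https://www.tab.com.au/racing/2020-07-31/MACKAY/MAC/R/8"))
print(path_of("/racing/2020-07-31/MACKAY/MAC/R/8"))
```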
I have this site and I'm trying to obtain the location and size of an element based on this XPath: "//div[@class='titlu']"
As you can see, the element is visible and there is nothing special about it.
The problem is that when I search for the XPath like this:
e = self.driver.find_element_by_xpath(xpath)
the location and size of e are both 0.
Also, for some reason, e.text returns an empty string, and I have to get the actual text with e.get_attribute("textContent").
Do you have any idea how I can get the location and size of this element?
There are two elements matching this XPath. driver.find_element_by_xpath returns the first one, while you are looking for the second one. Use the ancestor <div> with its id attribute to make the XPath unique:
"//div[@id='content-detalii']//div[@class='titlu']"
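The effect of scoping the search to an ancestor id can be tried without a browser. The sketch below uses xml.etree.ElementTree, which supports only a subset of XPath, on invented markup; the "meniu" id and both title strings are assumptions for the illustration, since the real page is not shown.

```python
import xml.etree.ElementTree as ET

# Invented markup: two divs share class "titlu", only one sits under the
# div with id "content-detalii".
PAGE = """
<body>
  <div id="meniu"><div class="titlu">Hidden title</div></div>
  <div id="content-detalii"><div class="titlu">Visible title</div></div>
</body>
""".strip()

root = ET.fromstring(PAGE)

# The unscoped expression matches both divs; find() returns the first one.
all_matches = root.findall(".//div[@class='titlu']")
first_match = root.find(".//div[@class='titlu']")

# Scoping through the ancestor's id leaves a single match.
scoped_match = root.find(".//div[@id='content-detalii']//div[@class='titlu']")

print(len(all_matches), first_match.text, scoped_match.text)
```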
I am trying to select "seltarget" from a list of other options.
HTML:
<div class="filter-item" data-reactid=".3.1.0.0.1.0.0.3.0.0.0.3">seltarget</div>
When exported to a python file it recommends that I locate the element by xpath:
driver.find_element_by_xpath("//div[@id='quickswitcher']/div/nav/ul/li/div/div/menu/div[2]/div/div/div/div[5]").click()
This is working for the time being because "seltarget" just happens to be 5th in the list. The above line will select whichever option is 5th in the list. I want to be more specific and select the element "seltarget"
Locating by link_text works only with <a> tags. To locate this element you can use the class, if it is unique:
driver.find_element_by_class_name('filter-item').click()
Or the data-reactid attribute
driver.find_element_by_css_selector('[data-reactid*="3.1.0.0.1.0.0.3.0.0.0.3"]').click()
CSS selectors in WebDriver cannot match on an element's text (:contains is a jQuery/Sizzle extension, not standard CSS), so to locate by the text use XPath:
driver.find_element_by_xpath('//div[contains(text(), "seltarget")]').click()
If you want to find the div by its text, you can use the following.
driver.find_element_by_xpath("//div[text()='seltarget']")
Trying to figure out how to access the text in the screenshot below without pulling all the span tags.
Doing element = driver.find_elements_by_id('response') gives me a list, but I can't seem to dig down further to access the text I want.
I also tried this after doing some searching:
element = driver.find_element_by_xpath("//div[@id='response']/pre")
But I get the same result.
Any tips?
element.get_attribute('innerHTML')
This gives you everything between the element's opening and closing tags, including any nested tags.
element.text
Should give out the contents of the element without any HTML tags.
When the text sits directly inside a plain div, element.text does not always extract it.
Example:
<div>the text here</div>
I recommend using a library called html2text (from html2text import html2text) and then:
html2text(element.get_attribute("outerHTML"))
That will do the trick!
I'm trying to scrape data from a page with JavaScript loaded content. For example, the content I want is in the following format:
<span class="class">text</span>
...
<span class="class">more text</span>
I used find_element_by_xpath('//span[@class="class"]').text, but it only returned the first instance of the specified class. Basically, I want a list like [text, more text] etc. I found the find_elements_by_xpath() function, but the .text at the end results in the error exceptions.AttributeError: 'list' object has no attribute 'text'.
find_element_by_xpath returns one element, which has a text attribute.
find_elements_by_xpath() returns all matching elements as a list, so you need to loop through it and read the text attribute of each element.
all_spans = driver.find_elements_by_xpath("//span[@class='class']")
for span in all_spans:
    print(span.text)
Please refer to the Selenium Python API docs for more details about find_elements_by_xpath(xpath).
This returns a list of items:
all_spans = driver.find_elements_by_xpath("//span[@class='class']")
for span in all_spans:
    print(span.text)
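If the goal is an actual list like [text, more text] rather than printed lines, a list comprehension over the result does it. A minimal sketch: collect_texts is a hypothetical helper, and the SimpleNamespace objects are stand-ins for the WebElements that find_elements_by_xpath would return, so the snippet runs without a browser.

```python
from types import SimpleNamespace


def collect_texts(elements):
    """Build a list with the .text of every element in the given list."""
    return [element.text for element in elements]


# Stand-ins for WebElements; with Selenium you would pass all_spans instead.
fake_spans = [SimpleNamespace(text="text"), SimpleNamespace(text="more text")]
texts = collect_texts(fake_spans)
print(texts)
```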