I want to read out the text in this HTML element using Selenium with Python. I just can't find a way to locate or select it without using the text itself (I don't want that, because its content changes):
<div font-size="14px" color="text" class="sc-gtsrHT jFEWVt">0.101 ONE</div>
Do you have an idea how I could select it? The conventional ways listed in the documentation don't seem to work for me. To be honest, I'm not very good with HTML, which doesn't make things any easier.
Thank you in advance
Try this (note that find_element_by_class_name only accepts a single class, so pick one of the two):
element = browser.find_element_by_class_name('jFEWVt').text
Or use a loop if there are several matching elements:
elements = browser.find_elements_by_class_name('jFEWVt')
for e in elements:
    print(e.text)
print(browser.find_element_by_xpath("//*[@class='sc-gtsrHT jFEWVt']").text)
You could simply grab it by class name. There are two class names here, and find_element_by_class_name only accepts one, so use a CSS selector to match both at once. That works if the class names aren't dynamic; otherwise you'd have to right-click and copy the XPath, or find another unique identifier.
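For example, a sketch of the CSS-selector variant (assuming the two class names stay stable across page loads):
element = browser.find_element_by_css_selector('div.sc-gtsrHT.jFEWVt')
print(element.text)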
Find by XPath, as long as the font-size and color attributes are consistent. It would be like:
//div[@font-size='14px' and @color='text' and starts-with(@class,'sc-')]
I guess the class name is random?
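In Selenium that might look like this (a sketch; it assumes those attributes really are present on the div, as in the snippet above):
value = browser.find_element_by_xpath("//div[@font-size='14px' and @color='text' and starts-with(@class,'sc-')]").text
print(value)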
I have a page with a list of cards with information.
The XPath locator matching each of the cards is:
self.automatic_payments_cards_list = (By.XPATH, '//*[@id="page-inner"]/div/div/main/lseg-gateway-billing-payment-line-info')
I'm trying to get the text of a specific element for every card on the page. The target XPaths are:
//*[#id="page-inner"]/div/div/main/lseg-gateway-billing-payment-line-info[1]/lseg-card/div/lseg-card-container/ng-transclude/div/div[4]/div/div[3]
//*[#id="page-inner"]/div/div/main/lseg-gateway-billing-payment-line-info[2]/lseg-card/div/lseg-card-container/ng-transclude/div/div[4]/div/div[3]
//*[#id="page-inner"]/div/div/main/lseg-gateway-billing-payment-line-info[3]/lseg-card/div/lseg-card-container/ng-transclude/div/div[4]/div/div[3]
I know that with this code I get all the text on each card:
for i in range(len(self.driver.find_elements(*self.automatic_payments_cards_list))):
    print(self.driver.find_elements(*self.automatic_payments_cards_list)[i].text)
But I don't want to get all the text on the cards, only the text at these specific XPaths:
//*[#id="page-inner"]/div/div/main/lseg-gateway-billing-payment-line-info[**X**]**/lseg-card/div/lseg-card-container/ng-transclude/div/div[4]/div/div[3]**
Can you guys guide me in finding a solution to this?
The best way to achieve this is by using find_element_by_xpath on each card element.
all_card_els = driver.find_elements_by_xpath('//*[@id="page-inner"]/div/div/main/lseg-gateway-billing-payment-line-info')
for card_el in all_card_els:
    specific_el_within_card = card_el.find_element_by_xpath('.//lseg-card/div/lseg-card-container/ng-transclude/div/div[4]/div/div[3]')
The . at the start of the XPath is essential to make sure the search happens within the selected element; without the . you will always end up getting the first element that matches this XPath anywhere on the page. You can now use specific_el_within_card however you like inside the loop, or append it to an external list.
PS: You can access the text via specific_el_within_card.text (it's a property, not a method), since you mentioned you wanted to extract info for each card.
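For example, collecting the values into a list (a sketch continuing the loop above):
card_texts = []
for card_el in all_card_els:
    specific_el_within_card = card_el.find_element_by_xpath('.//lseg-card/div/lseg-card-container/ng-transclude/div/div[4]/div/div[3]')
    card_texts.append(specific_el_within_card.text)
print(card_texts)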
It is simple. :-) Build the XPath as a dynamic string and pass it inside a loop, e.g.:
parent_locator_string = string1 + str(i) + string2
which, for your case, expands to something like:
'//*[@id="page-inner"]/div/div/main/lseg-gateway-billing-payment-line-info[' + str(i) + ']/lseg-card/div/lseg-card-container/ng-transclude/div/div[4]/div/div[3]'
If I have some HTML:
<span class="select2-selection__rendered" id="select2-plotResults-container" role="textbox" aria-readonly="true" title="50">50</span>
And I want to find it using something like:
driver.find_element_by_xpath('//*[contains(text(), "50")]')
The problem is that there is a 500 somewhere earlier on the webpage and it's picking up on that; is there a way to search for an exact match for 50?
Instead of contains, search for a specific text value:
driver.find_element_by_xpath('//*[text()="50"]')
And if you know it will be a span element, you can be a little more specific:
driver.find_element_by_xpath('//span[text()="50"]')
Note that your question asks how to find an element by its text value. If possible, and if it applies to your situation, you should instead look for a specific class or id, if known and consistent.
You can search for it by its absolute XPath. For that, inspect the page and find the element, then right-click it and copy its XPath or full XPath.
Otherwise you can use the id:
driver.find_element_by_id("select2-plotResults-container")
Here is more on locating elements.
Use something like this:
msg_box = driver.find_element_by_class_name('_3u328') and driver.find_element_by_xpath('//div[@data-tab = "{}"]'.format('1'))
I wonder if anybody can help :)
I am using Python lxml and cssselect to scrape data from HTML pages.
I can select most classes with ease using this method and find it very convenient, but I am having a problem selecting class names with a space.
For example, I want to extract "Blah blah" from the following element:
<li class="feature height">Blah blah</li>
I have tried using the following CSS selectors without success (the whole path is not included, as this is not the problem):
li.feature.height
li.feature height
li.feature:height
Anybody know how to do this? I can't find the answer and am sure it must be a fairly common thing that people need to do...
I cannot just select the parent element
li.feature
as the data is not in the same order on different pages; the same applies to nth-element selections...
I've been scratching my head on this a while now and have searched a lot; hope somebody knows!
I can work around it by getting the data using regexes, and that works, but I wonder if there is a simpler solution...
Thanks for your help in advance!
Matt
Extra information as requested: it doesn't work because it returns an empty list (or a negative result for a boolean check).
So if I use the following:
css_9_seed_height = 'html body div.seedicons ul li.feature.height'
# 9. Get seed_height
seed_height_obj = root.cssselect(css_9_seed_height)
print(seed_height_obj)
This returns an empty list, i.e. the class is not found, but it's there.
You can assume that root.cssselect() works correctly, as I am retrieving lots of other info in the same way.
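For reference, li.feature.height is the standard way to match both classes with cssselect. A minimal, self-contained sketch against the snippet above (assuming lxml and the cssselect package are installed; the wrapping markup is made up for illustration):
from lxml import html

snippet = '<div class="seedicons"><ul><li class="feature height">Blah blah</li></ul></div>'
root = html.fromstring(snippet)
# Selects <li> elements carrying both the "feature" and "height" classes
print(root.cssselect('li.feature.height')[0].text)  # -> Blah blah
If this works in isolation but not on the real page, the selector itself is probably not the culprit.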
When I go to a certain webpage I am trying to find a certain element and piece of text:
<span class="Bold Orange Large">0</span>
This didn't work (it gave a "compound class names" error or something like that):
elem = browser.find_elements_by_class_name("Bold Orange Large")
So I tried this (but I'm not sure it worked, because I don't really understand the right way to do CSS selectors in Selenium...):
elem = browser.find_elements_by_css_selector("span[class='Bold Orange Large']")
Once I find the span element, I want to find the number that is inside...
num = elem.(what to put here??)
Any help with css selectors, class names, and finding element text would be great!!
Thanks.
EDIT:
My other problem is that there are multiple of those exact span elements, but with different numbers inside... how can I deal with that?
You're correct in your usage of CSS selectors! Your first attempt was failing because there are spaces in the class name, and Selenium's by-class-name lookup cannot handle compound class names at all. I think that is a bad development practice to begin with, so it's not your problem. Selenium itself does not include an HTML editor, because that's already been done before.
Try looking here: How to find/replace text in html while preserving html tags/structure.
Also this one is relevant and popular as well: RegEx match open tags except XHTML self-contained tags
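As for the follow-up in the edit: find_elements_by_css_selector returns a list, so you can read the number out of each match in a loop. A sketch, assuming the class names stay exactly as shown in your HTML:
elems = browser.find_elements_by_css_selector("span.Bold.Orange.Large")
for elem in elems:
    num = int(elem.text)  # .text is a property; int() assumes the span holds a whole number
    print(num)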
I am using Selenium webdriver in Python for a web-scraping project.
The webpage I am working on has a number of table entries with the same class name:
<table class="table1 text print">
I am using find_element_by_class_name. However, I am getting an error:
Compound class names not permitted
Another question:
How do I iterate through all the tables having the same CSS class name?
Thanks
You should use find_elements_by_class_name. This will return an iterable object.
The error you describe happens when you provide multiple class names rather than a single one. An easy way around this is to get the elements using a CSS selector or XPath. Alternatively, you could use find_elements_by_class_name with one class name, then iterate through that list to find the elements that also match the additional class names you want.
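For instance, a sketch using a CSS selector to require all three classes at once, then iterating (method names follow the older Selenium API used throughout this thread):
tables = driver.find_elements_by_css_selector("table.table1.text.print")
for table in tables:
    print(table.text)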