I have a web page something like:
<div class='product-pod-padding'>
<a class='header product-pod--ie-fix' href='link1'/>
<div> SKU#1</div>
</div>
<div class='product-pod-padding'>
<a class='header product-pod--ie-fix' href='link2'/>
<div> SKU#2</div>
</div>
<div class='product-pod-padding'>
<a class='header product-pod--ie-fix' href='link3'/>
<div> SKU#3</div>
</div>
When I tried to loop through the products with the following code, it will give us expected outcome:
products=driver.find_elements_by_xpath("//div[#class='product-pod-padding']")
for index, product in enumerate(products):
print(product.text)
SKU#1
SKU#2
SKU#3
However, if I try to locate the href of each product, it will only return the first item's link:
products=driver.find_elements_by_xpath("//div[#class='product-pod-padding']")
for index, product in enumerate(products):
print(index)
print(product.text)
url=product.find_element_by_xpath("//a[#class='header product-pod--ie-fix']").get_attribute('href')
print(url)
SKU#1
link1
SKU#2
link1
SKU#3
link1
What should I do to get the corrected links?
This should make your code functional:
[...]
products=driver.find_elements_by_xpath("//div[#class='product-pod-padding']")
for index, product in enumerate(products):
print(index)
print(product.text)
url=product.find_element_by_xpath(".//a[#class='header product-pod--ie-fix']").get_attribute('href')
print(url)
[..]
The crux here is the dot in front of xpath, which means searching within the element only.
You need to use a relative XPath in order to locate a node inside another node.
//a[#class='header product-pod--ie-fix'] will always return a first match from the beginning of the DOM.
You need to put a dot . on the front of the XPath locator
".//a[#class='header product-pod--ie-fix']"
This will retrieve you a desired element inside a parent element:
url=product.find_element_by_xpath(".//a[#class='header product-pod--ie-fix']").get_attribute('href')
So, your entire code could be as following:
products=driver.find_elements_by_xpath("//div[#class='product-pod-padding']")
for index, product in enumerate(products):
print(index)
url=product.find_element_by_xpath(".//a[#class='header product-pod--ie-fix']").get_attribute('href')
print(url)
Related
My html code looks like:
<li>
<div class="level1">
<div id="li_hw2" class="toggle open" </div>
<ul style="" mw="220">
<li>
<div class ="level2">
...
</li>
</ul>
I am currently on the element with the id = "li_hw2", which was found by
level_1_elem = self.driver.find_element(By.ID, "li_hw2")
Now i want to go from level_1_elem to class = "level2". Is it possible to go to the parent li and than to level2? Maybe with xpath?
Hint: It is neccassary to go via the parent li and not directly to the element level2 with
self.driver.find_element(By.Class_Name, "level2")
The best-suited locator for your usecase is xpath, since you want to traverse upward as well as downwards in the HTMLDOM.
level_1_elem = self.driver.find_element(By.XPATH, "//div[#class='li_hw2']")
and then using level_1_elem web element, You can do the following :
to directly go to following-sibling
level_1_elem.find_element(By.XPATH, ".//following-sibling::ul/descendant::div[#class='level2']")
Are you sure about the html i think the ul should group all the li if it s the case then it s easy if not i realy dont get that html.
//div[#class="level1"]/parent::li/parent::ul/li/div[#class="level2"]
Hey geeks I am new to selenium and automate testing and I am trying to extract Span values(i.e. l e k C N t in below example) from webpage but either it gives empty value or No such element error. Can anyone help in it!
Language: python
Webpage:
<div _ngcontent-serverapp-c6="" class="captcha-letters">
<span _ngcontent-serverapp-c6="">l</span>
<span _ngcontent-serverapp-c6="">e</span>
<span _ngcontent-serverapp-c6="">k</span>
<span _ngcontent-serverapp-c6="">C</span>
<span _ngcontent-serverapp-c6="">N</span>
<span _ngcontent-serverapp-c6="">t</span>
</div>
To print the text from each span element element on the page, this should work:
elements = driver.find_elements_by_css_selector('span')
for element in elements:
print(element.text)
If you need to be more specific and grab only certain span elements, we will need more HTML to go off of.
I want to get the href from a <p> tag using an XPath expression.
I want to use the text from <h1> tag ('Cable Stripe Knit L/S Polo') and simultaneously text from the <p> tag ('White') to find the href in the <p> tag.
Note: There are more colors of one item (more articles with different <p> tags, but the same <h1> tag)!
HTML source
<article>
<div class="inner-article">
<a href="/shop/tops-sweaters/ix4leuczr/a1ykz7f2b" style="height:150px;">
</a>
<h1>
<a href="/shop/tops-sweaters/ix4leuczr/a1ykz7f2b" class="name-link">Cable Stripe Knit L/S Polo
</a>
</h1>
<p>
White
</p>
</div>
</article>
I've tried this code, but it didn't work.
specificProductColor = driver.find_element_by_xpath("//div[#class='inner-article' and contains(text(), 'White') and contains(text(), 'Cable')]/p")
driver.get(specificProductColor.get_attribute("href"))
As per the HTML source, the XPath expression to get the href tags would be something like this:
specificProductColors = driver.find_elements_by_xpath("//div[#class='inner-article']//a[contains(text(), 'White') or contains(text(), 'Cable')]")
specificProductColors[0].get_attribute("href")
specificProductColors[1].get_attribute("href")
Since there are two hyperlink tags, you should be using find_elements_by_xpath which returns a list of elements. In this case it would return two hyperlink tags, and you could get their href using the get_attribute method.
I've got working code. It's not the fastest one - this part takes approximately 550 ms, but it works. If someone could simplify that, I'd be very thankful :)
It takes all products with the specified keyword (Cable) from the product page and all products with a specified color (White) from the product page as well. It compares href links and matches wanted product with wanted color.
I also want to simplify the loop - stop both for loops if the links match.
specificProduct = driver.find_elements_by_xpath("//div[#class='inner-article']//*[contains(text(), '" + productKeyword[arrayCount] + "')]")
specificProductColor = driver.find_elements_by_xpath("//div[#class='inner-article']//*[contains(text(), '" + desiredColor[arrayCount] + "')]")
for i in specificProductColor:
specProductColor = i.get_attribute("href")
for i in specificProduct:
specProduct = i.get_attribute("href")
if specProductColor == specProduct:
print(specProduct)
wantedProduct = specProduct
driver.get(wantedProduct)
The html is like this:
<body>
<div class="div_a">
<ul class="ul">
<li>li</li>
<li>li</li>
</ul>
</div>
<div class="div_b">
<a>link</a>
<ul>
<li>div_b li</li>
</ul>
</div>
</body>
try to get div_a's li
node = page.xpath("//div[#class='div_a']")
li1 = node.xpath("//li")
but li1 got the all li element in page not only div_a's. I cannot figure what the issue is.
Your XPATH - //li - is actually taking elements from the root element, hence you get all li . If you want to take only the element inside node , you should give relative XPATH. Example -
li1 = node.xpath(".//li")
. in above means current element, which would be the div element with class attribute as 'div_a'.
Fix your second XPath to be relative instead of absolute as Anand suggests, or simply use a single XPath to get the li elements in the first place:
li1 = page.xpath("//div[#class='div_a']//li")
Working in lxml, I want to get the href attribute of all links with an img child that has title="Go to next page".
So in the following snippet:
<a class="noborder" href="StdResults.aspx">
<img src="arrowr.gif" title="Go to next page"></img>
</a>
I'd like to get StdResults.aspx back.
I've got this far:
next_link = doc.xpath("//a/img[#title='Go to next page']")
print next_link[0].attrib['href']
But next_link is the img, not the a tag - how can I get the a tag?
Thanks.
Just change a/img... to a[img...]: (the brackets sort of mean "such that")
import lxml.html as lh
content='''<a class="noborder" href="StdResults.aspx">
<img src="arrowr.gif" title="Go to next page"></img>
</a>'''
doc=lh.fromstring(content)
for elt in doc.xpath("//a[img[#title='Go to next page']]"):
print(elt.attrib['href'])
# StdResults.aspx
Or, you could go even farther and use
"//a[img[#title='Go to next page']]/#href"
to retrieve the values of the href attributes.
You can also select the parent node or arbitrary ancestors by using //a/img[#title='Go to next page']/parent::a or //a/img[#title='Go to next page']/ancestor::a respectively as XPath expressions.