Locating an element using Selenium with Python

Edited based on the answers:
I am using Selenium with Python and trying to locate a button in a web application in Chrome. The block of code has an iframe, as mentioned in the answer.
<iframe data-bind="attr: { src: src, foo: $root.registerTargetDisplayFrame($data, $element) }, event: {load: function() {loaded(true);}, focus: $root.blurredNavigationPane}" src="https://products.com/InfoShareAuthor/home">
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
<html>
<head>code here
<frameset id="IshTop" class="infoshareauthor" framespacing="0" border="0" bordercolor="#FFFFFF" frameborder="0" rows="31,25,*,0">
<frame id="MenuBar" scrolling="no" name="MenuBar" src="./MainMenuBar.asp">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<body>
<div id="Top-Menu-Container">
<div id="top-menu-wrapper">
<div id="top-menu">
<form name="MainBar">
<script type="text/javascript" language="javascript">
<table cellspacing="0" cellpadding="0" border="0">
<tbody>
<tr>
<td width="95" valign="bottom">
<td width="95" valign="bottom">
<div style="POSITION: relative;">
<div height="30" style="POSITION: absolute; z-index:0; top: 4px; margin-left: -5px">
<a href="javascript:TabSelect(1);">
<img border="0" src="./UIFramework/tab_active.png">
</a>
</div>
<div onclick="javascript:TabSelect(1);" style="POSITION: absolute; z-index:2; top: -8px">
<table cellspacing="0" cellpadding="0" border="0">
<tbody>
<tr>
<td id="MenuButton1" class="tab_active" width="95" valign="bottom" height="30" align="center" style="cursor:pointer;padding-bottom:2px;" name="Repository">Repository</td>
</tr>
</tbody>
</table>
</div>
</div>
</td>
<td width="95" valign="bottom">
<td width="95" valign="bottom">
<td width="95" valign="bottom">
<td width="95" valign="bottom">
</tr>
</tbody>
</table>
</form>
</div>
<div id="top-help">
<div id="top-nav-links">
</div>
</div>
</body>
</html>
</frame>
<frame id="BreadCrumbs" frameborder="0" border="0" scrolling="no" name="BreadCrumbs" src="./BreadCrumbs.asp">
<frameset id="Application" bordercolor="#0099CC" frameborder="0" rows="0,*,0,0,0,0">
<frameset id="HiddenFrameSet" bordercolor="#0099CC" frameborder="0" rows="0,0,,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1">
<noframes> It looks like your browser doesn't support frames. This page requires frames in order to function. <br><br>For more information, please <a href='http://www.trisoftcms.com/en/contact-us.html' target=_blank style='white-space:nowrap'>contact us</a>. </noframes>
</frameset>
</html>
</iframe>
I switched frames using this:
iframe = browser.find_element_by_xpath("//iframe[@src='https://products.com/InfoShareAuthor/home']")
browser.switch_to.frame(iframe)
The code that I wrote:
browser.find_element_by_xpath("//td[@id='MenuButton1'][@name='Repository'][contains(text(),'Repository')]")
I could find the element using this XPath when I did a Firebug search.
I also tried:
browser.find_element_by_id("MenuButton1")
and
browser.find_element_by_name("Repository")
Note: When I click the button, the URL does not change. Just a list of items in the application expands. Also, IDs and the Names are unique for the seven five menu buttons. None of the menu buttons work.
Does anyone have any idea what might be wrong? I am very new to Python and Selenium.

This doesn't exactly answer your question, but it does address what you're trying to do: it is likely you can accomplish the same task (and many others) with SDL's API client ISHRemote.
https://github.com/sdl/ISHRemote
For example, if you're looking for all the directories under '\General':
Import-Module ISHRemote
# first authenticate
$session = New-IshSession -IshPassword $password -IshUserName $username -WsBaseUrl 'https://ccms.example.com/InfoShareWS/'
# get a list of all the child folders under General
Get-IshFolder -IshSession $session -FolderPath '\General' -Recurse -Depth 2
Or if you're trying to get a list of files in a particular directory:
Import-Module ISHRemote
# first authenticate
$session = New-IshSession -IshPassword $password -IshUserName $username -WsBaseUrl 'https://ccms.example.com/InfoShareWS/'
# get all content in this folder
Get-IshFolderContent -IshSession $session -FolderPath 'General\path\to\topics'
With ISHRemote, you can also find and update publications, move content, modify metadata, etc.
Hope that helps.

You can try loading the iframe URL directly. That avoids issues with Selenium waiting for the iframe to load.
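The posted markup also suggests another thing worth checking: MenuButton1 lives inside the MenuBar <frame>, which is itself nested inside the iframe, so switching into the iframe alone may not be enough. Below is a minimal sketch, assuming `browser` is an already-open WebDriver and a recent Selenium 4 release (where the find_element_by_* helpers are no longer available). The function name is hypothetical, and the strings "xpath" and "id" are simply the values behind By.XPATH and By.ID:

```python
def click_repository_tab(browser):
    """Switch into the nested frames, click the tab, then switch back out."""
    # Start from the top-level document so repeated calls behave the same.
    browser.switch_to.default_content()

    # Outer <iframe>, located by its src attribute ("xpath" is By.XPATH).
    outer = browser.find_element(
        "xpath", "//iframe[@src='https://products.com/InfoShareAuthor/home']"
    )
    browser.switch_to.frame(outer)

    # Inner <frame id="MenuBar" name="MenuBar"> inside the frameset;
    # switch_to.frame also accepts a frame's name or id as a string.
    browser.switch_to.frame("MenuBar")

    # The button is only reachable once the driver is inside MenuBar.
    browser.find_element("id", "MenuButton1").click()

    # Return to the top-level document for whatever comes next.
    browser.switch_to.default_content()
```

If the frames load slowly, an explicit wait before each switch may also be needed; that part is left out here to keep the sketch short.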

Related

How to click an element using Selenium and Python

I have a problem with clicking an element using XPath in Selenium. Here is the HTML element in question:
<label for="file" class="pb default" style="display: inline-block;margin: 5px 10px;">Select File</label>
Do you know the solution for this? Any response is really appreciated.
UPDATE:
Here is the whole source code of the page:
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
<BASE HREF="http://1.1.1.19/webclient/utility">
<link rel=stylesheet type="text/css" href="../webclient/skins/skins-20190807-1904/iot18/css/filex.css">
</head>
<body style="background: transparent; background-color: transparent">
<form name="IMPORT" id="IMPORT" enctype="multipart/form-data" method="post">
<input type="hidden" NAME="componetId" VALUE="itemimage_3_1-if">
<input type="hidden" NAME="controlId" VALUE="itemimage_3_1">
<table width="100%" cellspacing="0" align="center" class="maintable">
<tr>
<td align="left" style="white-space: nowrap;">
<label for="file" class="pb default" style="display: inline-block;margin: 5px 10px;">Select File</label>
<input id="fileName" onmousedown="" type="text" value="" class="fld fld_ro text ib" readonly size="50"/>
<input id="file" type="file" name="value" title="Specify a New File" onchange="" size="35" class="text" style=" width: 0.1px;height: 0.1px !important;fileStyle: 0;overflow: hidden;position: absolute;z-index: -1;opacity: 0;" value="" onclick="if(!parent.undef(parent.firingControl) && parent.firingControl.id==this.id){parent.sendEvent('clientonly','clickFileButton', this.id)}">
</td>
</tr>
</table>
</form>
</body>
<script>
document.querySelector('#file').addEventListener('change', function(e){
document.querySelector('#fileName').value = e.currentTarget.files[0].name;
});
</script>
</html>
You can try //label[@for='file'] or //label[@for='file'][text()='Select File'].
If this does not work, you may be targeting the wrong element, or you may not be waiting long enough for it to appear. When uploading files, I target the input of type file, //input[@type='file'], not the label. You may need to provide a bigger part of your HTML.
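A rough sketch of that approach, assuming `driver` is an open WebDriver on a recent Selenium 4 release: instead of clicking the label, the file path is sent straight to the <input id="file" type="file"> element. The function name is hypothetical, "css selector" is the value behind By.CSS_SELECTOR, and whether send_keys succeeds on an input hidden with opacity: 0 is something to verify against your own browser and driver:

```python
import os


def upload_file(driver, path):
    """Send a local file path to the page's <input id="file" type="file">."""
    # File inputs are normally given an absolute path.
    absolute = os.path.abspath(path)

    # Target the real file input rather than the <label> that decorates it.
    file_input = driver.find_element("css selector", "input#file[type='file']")
    file_input.send_keys(absolute)
    return absolute
```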
If class="pb default" is unique on the web page, then try this:
driver.find_element(By.XPATH, '//*[@class="pb default"]').click()
Or you can search by the text Select File:
driver.find_element(By.XPATH, '//*[contains(text(),"Select File")]').click()
Well, no, you don't click on a <label> element. However, you may need to locate the <label> element for other purposes.
To identify the <label> element you can use either of the following Locator Strategies:
Using css_selector:
element = driver.find_element(By.CSS_SELECTOR, "label.pb.default[for='file']")
Using xpath:
element = driver.find_element(By.XPATH, "//label[@class='pb default' and text()='Select File']")
Update
The element may be within an <iframe> or within a #shadow-root. Otherwise the locator works perfectly.

How to use border-radius while converting html to pdf using xhtmltopdf

I am trying to round the corners of my table, but border-radius doesn't seem to work when I convert the HTML below to PDF using the xhtmltopdf PDF generator. Below is the HTML for the content; the file name is sticker_print.html:
<div class="sticker" style="height:196px">
<table class="sticker_box" align="left">
<tr>
<td style="border: 1px solid #222;background-color: #ffffff;">
<h3 style="border-bottom: 1px solid #222222;">Batch Sticker</h3>
<h5 style="padding: 0 0 0 10px;">Batch ID</h5>
<p>MFG Date</p>
<p style="padding-bottom:0px;"><img src="http://www.computalabel.com/Images/C128ff@2x.png" width="195px" height="26px"><span> Bar Code </span></p>
<p style="text-align: left; padding-bottom: 0px;">
<img src="https://www.kaspersky.com/content/en-global/images/repository/isc/2020/9910/a-guide-to-qr-codes-and-how-to-scan-qr-codes-2.png" width="65px" height="65px">
<span style="display: block;margin-top: 0px;">QR Code</span>
</p>
</td>
</tr>
</table>
</div>
PDF CODE
pdf = render_to_pdf('sticker_print.html')
return HttpResponse(pdf, content_type='application/pdf')
Even though I'm not using the same PDF engine as you (and your question is 6 months old), I solved this issue by using corner-radius instead of border-radius on a table cell or div.

Click without id or name (Ghost)

I'm using Ghost for Python 2.7 and I'm trying to click on a link that is in a table. The problem is that I have no ID, name... This is the HTML code:
<table id="table_webbookmarkline_2" cellpadding="4" cellspacing="0" border="0" width="100%">
<tr valign="top">
<td>
<a href="/dana/home/launch.cgi?url=.ahuvs%3A%2F%2Fhq0l5458452ERA-w-Xz8G3LKe8JNM%2F.ISDXWXaWXUivecOc" target="_blank" onClick='javascript:openBookmark(
this.href, "yes", "yes");
return false;' ><img src="/dana-cached/imgs/icn18x18WebBookmarkPop.gif" alt="This will open in a new TAB" width="18" height="18" border="0" ></a>
</td>
<td width="100%" align="left">
<a href="/dana/home/launch.cgi?url=.ahuvs%3A%2F%2Fhq0l5458452ERA-w-Xz8G3LKe8JNM%2F.ISDXWXaWXUivecOc" target="_blank" onClick='JavaScript:openBookmark(
this.href, "yes", "yes");
return false;' ><b>**LINK WHERE I WANT TO CLICK**</b> </a><br><span class="cssSmall"></span>
</td>
</tr>
</table>
How can I click on this kind of link?
Seems like Ghost's Session.click() takes a CSS selector. Here only the table has an ID, so a selector that takes the second td that is a descendant of that ID and finds the a element should work:
session.click('#table_webbookmarkline_2 td:nth-child(2) a')

xPath: Difficulties matching expression with actual source code

From this Deutsche Börse web page, under the table header Issuer I want to get the string content 'db X-trackers' in the cell next to the one with Name in it.
Using my web browser, I inspect that table area and get the code, which I've pasted into this XML tree just so that I can test my xPath.
<root>
<div class="row">
<div class="col-lg-12">
<h2>Issuer</h2>
</div>
</div>
<div class="table-responsive">
<table class="table">
<tbody>
<tr>
<td>Name</td>
<td class="text-right">db X-trackers</td>
</tr>
</tbody>
</table>
</div>
</root>
According to FreeFormatter.com, my xPath below succeeds in retrieving the correct element (Text='db X-trackers'):
my_xpath = "//h2['Issuer']/ancestor::div[@class='row']/following-sibling::div//td['Name']/following-sibling::td[1]/text()"
Note: It goes to <h2>Issuer</h2> first to identify the right place to start working from.
However, when I run this on the actual web page using Selenium WebDriver, None is returned.
import re
from selenium.common.exceptions import NoSuchElementException

def get_sibling(driver, my_xpath):
    try:
        find_value = driver.find_element_by_xpath(my_xpath).text
    except NoSuchElementException:
        return None
    else:
        value = re.search(r"(.+)", find_value).group()
        return value
I don't believe anything is wrong in the function itself, so either the xPath must be faulty or there is something in the actual web page source code that throws it off.
When studying the actual Source code in Chrome, it looks a bit messier than what I see with Inspector, which is what I used to create the little XML tree above.
<div class="box">
<div class="row">
<div class="col-lg-12">
<h2>Issuer</h2>
</div>
</div>
<div class="table-responsive">
<table class="table">
<tbody>
<tr>
<td >
Name
</td>
<td class="text-right" >
db X-trackers
</td>
</tr>
<tr>
<td >
Product Family
</td>
<td class="text-right" >
db X-trackers
</td>
</tr>
<tr>
<td >
Homepage
</td>
<td class="text-right" >
<a target="_blank" href="http://www.etf.db.com">www.etf.db.com</a>
</td>
</tr>
</tbody>
</table>
</div>
Are there some peculiarities in the source code above, or is my xPath (or function) wrong?
I would use the following and following-sibling axes:
//h2[. = "Issuer"]/following::table//td[. = "Name"]/following-sibling::td
First we locate the h2 element, then get the following table element. In the table element we look for the td element with Name text and then get the following td sibling.
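One thing to watch for in the real page source: the cell text is padded with newlines and spaces, so an exact comparison such as . = "Name" may not match there. In XPath 1.0 the usual remedy is normalize-space(), for example td[normalize-space(.) = "Name"]. The sketch below illustrates the same idea with only the standard library. ElementTree supports neither normalize-space() nor the following axis, so the comparison is done in Python instead, and the helper names are made up for this example:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the page source from the question, whitespace included.
SOURCE = """<div class="box">
  <h2>Issuer</h2>
  <table class="table">
    <tbody>
      <tr>
        <td >
          Name
        </td>
        <td class="text-right" >
          db X-trackers
        </td>
      </tr>
      <tr>
        <td >
          Homepage
        </td>
        <td class="text-right" >
          <a target="_blank" href="http://www.etf.db.com">www.etf.db.com</a>
        </td>
      </tr>
    </tbody>
  </table>
</div>"""


def normalize_space(text):
    """Mimic XPath normalize-space(): trim and collapse whitespace runs."""
    return " ".join((text or "").split())


def cell_after(root, label):
    """Return the normalized text of the cell next to the one named label."""
    for row in root.iter("tr"):
        cells = row.findall("td")
        if len(cells) >= 2 and normalize_space(cells[0].text) == label:
            # itertext() also picks up text inside child elements such as <a>.
            return normalize_space("".join(cells[1].itertext()))
    return None


root = ET.fromstring(SOURCE)

# An exact comparison finds nothing because of the surrounding whitespace.
exact_matches = [td for td in root.iter("td") if td.text == "Name"]
print(exact_matches)             # []

# Comparing normalized text finds the neighbouring cell.
print(cell_after(root, "Name"))  # db X-trackers
```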

Remove matched tags in html files?

I have some html files, each of which contains
<td id="MenuTD" style="vertical-align: top;">
...
</td>
where ... can contain anything, and </td> matches <td id="MenuTD" style="vertical-align: top;">. I would like to remove this part from the html files.
Similarly, I may also want to remove some other tags in the files.
How shall I program that in Python?
I am looking at HTMLParser module in Python 2.7, but haven't figured out if that can help.
You can accomplish this using BeautifulSoup. You have two options, depending on what you want to do with the element you're removing.
Set up:
from bs4 import BeautifulSoup
html_doc = """
<html>
<header>
<title>A test</title>
</header>
<body>
<table>
<tr>
<td id="MenuTD" style="vertical-align: top;">
Stuff here <a>with a link</a>
<p>Or paragraph tags</p>
<div>Or a DIV</div>
</td>
<td>Another TD element, without the MenuTD id</td>
</tr>
</table>
</body>
</html>
"""
soup = BeautifulSoup(html_doc, "html.parser")
Option 1 is to use the extract() method. Using this, you will retain a copy of your extracted element so that you can utilize it later in your application:
Code:
menu_td = soup.find(id="MenuTD").extract()
At this point, the element you are removing has been saved to the menu_td variable. Do what you want with that. Your HTML in the soup variable no longer contains your element though:
print(soup.prettify())
Outputs:
<html>
<header>
<title>
A test
</title>
</header>
<body>
<table>
<tr>
<td>
Another TD element, without the MenuTD id
</td>
</tr>
</table>
</body>
</html>
Everything in the MenuTD element has been removed. You can see it is still in the menu_td variable though:
print(menu_td.prettify())
Outputs:
<td id="MenuTD" style="vertical-align: top;">
Stuff here
<a>
with a link
</a>
<p>
Or paragraph tags
</p>
<div>
Or a DIV
</div>
</td>
Option 2: Utilize .decompose(). If you do not need a copy of the removed element, you can utilize this function to remove it from the document and destroy the contents.
Code:
soup.find(id="MenuTD").decompose()
It doesn't return anything (unlike .extract()). It does, however, remove the element from your document:
print(soup.prettify())
Outputs:
<html>
<header>
<title>
A test
</title>
</header>
<body>
<table>
<tr>
<td>
Another TD element, without the MenuTD id
</td>
</tr>
</table>
</body>
</html>
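Since the question mentions HTMLParser, here is a standard-library-only sketch of the same removal, written for Python 3's html.parser rather than the Python 2.7 module. It is not part of the BeautifulSoup approach above. The class and function names are hypothetical, and it assumes reasonably well-formed markup in which every opening tag of the removed element's type has a matching closing tag:

```python
from html.parser import HTMLParser


class ElementRemover(HTMLParser):
    """Re-emit HTML while dropping every element whose id matches."""

    def __init__(self, target_id):
        # Keep entities such as &amp; untouched so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.target_id = target_id
        self.out = []
        self.skip_tag = None  # tag name of the element being dropped
        self.depth = 0        # nesting depth of same-named tags inside it

    def _emit(self, text):
        if self.skip_tag is None:
            self.out.append(text)

    def handle_starttag(self, tag, attrs):
        if self.skip_tag is not None:
            if tag == self.skip_tag:
                self.depth += 1  # e.g. a <td> of a table nested in the target
            return
        if dict(attrs).get("id") == self.target_id:
            self.skip_tag, self.depth = tag, 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth.
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_tag is not None:
            if tag == self.skip_tag:
                self.depth -= 1
                if self.depth == 0:
                    self.skip_tag = None  # matching close tag reached
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")


def remove_by_id(html, target_id):
    """Return html with the element carrying target_id removed."""
    parser = ElementRemover(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = (
    '<table><tr><td id="MenuTD" style="vertical-align: top;">menu'
    "<table><tr><td>inner</td></tr></table></td>"
    "<td>keep &amp; stay</td></tr></table>"
)
print(remove_by_id(sample, "MenuTD"))
```

To process several files, read each one, pass its contents through remove_by_id, and write the result back. Counting only tags of the same name keeps void elements such as <br> or <img> from confusing the depth tracking.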
