I am trying to automate a process using selenium and a webdriver.
The html looks like this:
<span class="contract-item">
<span class="contract-label">
<span class="contract-name">Jimmy</span>
</span>
<div class="current-stats">
<span class="info"></span>
The issue I am facing is that there are many 'contract-item' and 'info' elements. I only want the info for specific contract names. However, once I find the 'contract-name' element, I no longer have a handle on the matching info. How do I get the 'info' for a specific name?
I have this so far.
team_name = self.driver.find_elements_by_xpath("//*[contains(text(), {})]".format(jimmy))[1]
Many thanks!
Below is the required XPath:
//*[@class='contract-item' and
contains(.,'Jimmy')]/div/span[@class='info']
You just need to replace the contract name string (i.e. Jimmy) with the desired one and you will get the corresponding info.
First you need to get the element that contains the name, which you did correctly:
//*[contains(text(), "{}")]
Then you need to go up to the nearest common parent of the info element and the element you found. Each /.. goes one element up the HTML tree.
//*[contains(text(), "{}")]/../..
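Finally, find the correct element by filtering on class. Selenium locators must return elements, not text nodes, so select the span and read its .text. Note that the match is case-sensitive, so pass the name exactly as it appears on the page.
//*[contains(text(), "{}")]/../..//span[@class="info"]
So your expression should be:
info = self.driver.find_element_by_xpath('//*[contains(text(), "{}")]/../..//span[@class="info"]'.format('Jimmy')).text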
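To sanity-check the "find the container, then look inside it" idea without a browser, here is a minimal sketch using only Python's standard library. The second contract (Anna) and the info values are made up for illustration, and info_for is a hypothetical helper name. ElementTree supports only a subset of XPath (no contains()), so the name comparison is done in Python rather than inside the expression.

```python
import xml.etree.ElementTree as ET

# Sample markup modelled on the question; "Anna" and the info texts are invented.
HTML = """
<div>
  <span class="contract-item">
    <span class="contract-label"><span class="contract-name">Jimmy</span></span>
    <div class="current-stats"><span class="info">12 goals</span></div>
  </span>
  <span class="contract-item">
    <span class="contract-label"><span class="contract-name">Anna</span></span>
    <div class="current-stats"><span class="info">7 goals</span></div>
  </span>
</div>
"""

def info_for(root, name):
    """Return the info text of the contract-item whose contract-name equals name."""
    for item in root.iterfind(".//span[@class='contract-item']"):
        name_el = item.find(".//span[@class='contract-name']")
        if name_el is not None and (name_el.text or "").strip() == name:
            # Search only inside this item, so other items' info is never picked up.
            info = item.find(".//span[@class='info']")
            return info.text if info is not None else None
    return None

root = ET.fromstring(HTML)
print(info_for(root, "Anna"))
```

The point is that the second lookup is scoped to the matching item, which is what the /../.. step achieves in the XPath version.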
I'm trying to find one element using Selenium in Python using the following code:
element_A = driver.find_element(By.XPATH, '/html/body/div[5]/div[1]/div[2]/div[3]/div/div/div/div[1]/div[1]/div[2]/div[1]/div[3]/div[1]/i')
in this HTML code:
<div class="display-flex>
<span class=" cc-customer-info__value"="">212121 <i title="copiar" class="fa fa-clone _clipboard_" data-clipboard="212121 "></i> <br>
</div>
I want the number 212121, but I'm getting no such element error. The problem is this number can be different every time that I open the website. It's the number of the customer.
Is it possible to help me to locate this element?
I'm also trying to find two more elements:
Customer profile, which also changes
<text class="legend-graph font-menu" x="416" y="38">Customer Profile: A</text>
and Diamond, which can also vary
<span class="cc-customer-info__value ">Diamond</span>
Thank you!
For the first one, the number is the text of the span element, not of the i tag, so target the span:
driver.find_element(By.XPATH, '/html/body/div[5]/div[1]/div[2]/div[3]/div/div/div/div[1]/div[1]/div[2]/div[1]/div[3]/div[1]/span')
Try this; it may work. Otherwise, try to shorten the XPath you are creating. This one is an absolute XPath, which breaks whenever the page layout changes.
You can also use Selenium IDE to record the scenario and check which locator it captures for the element.
You could also try the browser DevTools or other XPath-locating tools that are available as Chrome extensions.
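An alternative to the absolute path is to anchor on a stable attribute. In the HTML from the question, the i element carries the customer number in its data-clipboard attribute, so a short relative locator could read it from there. The sketch below checks that idea offline with the standard library on a cleaned-up copy of the snippet (the fixed quotes and closing tags are my assumption, since the posted markup is malformed). customer_number is a hypothetical helper, and the Selenium line in the comment is what I would expect the equivalent to look like, untested against the real page.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the posted snippet so that it parses as XML (assumption).
SNIPPET = """
<div class="display-flex">
  <span class="cc-customer-info__value">212121 <i title="copiar" class="fa fa-clone _clipboard_" data-clipboard="212121 "></i></span>
</div>
"""

def customer_number(root):
    """Read the number from the first <i> that has a data-clipboard attribute."""
    icon = root.find(".//i[@data-clipboard]")
    if icon is None:
        return None
    # The attribute value has a trailing space in the posted HTML, so strip it.
    return icon.get("data-clipboard").strip()

root = ET.fromstring(SNIPPET)
number = customer_number(root)

# Possible Selenium 4 equivalent (assumes `driver` exists and
# `from selenium.webdriver.common.by import By` has been run):
# driver.find_element(By.CSS_SELECTOR, "i[data-clipboard]").get_attribute("data-clipboard").strip()
```

Because the locator does not depend on the position of any div, it should keep working when the number changes or the surrounding layout shifts.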
I am using Scrapy selector to extract fields from html
xpath = /html/body/path/to/element/text()
This is similar to question scrapy get nth-child text of same class
and following the documentation we can use .getall() method to get all element and select specific one from the list.
selected_list = Selector(text=soup.prettify()).xpath(xpath).getall()
Is it possible to directly specify which nth element to select in the xpath itself?
Something like below
xpath = /html/body/path/to/element/text(2) #to select 3 child text
Example
<body>
<div>
<i class="ent_sprite remind_icon">
</i>
text that needs to be
</div>
</body>
The result of response.xpath('/body/div/text()').getall() consists of 2 elements:
'\n'
'text that needs to be'
You can use the following-sibling:: axis to select the siblings that come after a node. In this case you want the nearest text() node after the <i> tag, so you can do:
response.xpath('//i[@class="ent_sprite remind_icon"]/following-sibling::text()').get()
This gives you the nearest text() node after <i class="ent_sprite remind_icon">.
If you want the nth nearest following sibling of a node, append a position predicate [n] to the step (positions start at 1, not 0). In our case that becomes:
'//i[@class="ent_sprite remind_icon"]/following-sibling::text()[n]'
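For a quick offline check of which text sits after the <i> tag, the standard library is enough. Note that this is not Scrapy: ElementTree has no text() nodes, but it exposes the text that follows an element as .tail, which plays the role of following-sibling::text()[1] for this example. The variable names are my own.

```python
import xml.etree.ElementTree as ET

# The example document from the question, unchanged apart from being a Python string.
doc = ET.fromstring("""<body>
<div>
<i class="ent_sprite remind_icon">
</i>
text that needs to be
</div>
</body>""")

icon = doc.find(".//i[@class='ent_sprite remind_icon']")

# .tail is the text between </i> and the next tag; strip the surrounding newlines.
text_after_icon = (icon.tail or "").strip()
print(text_after_icon)
```

In Scrapy itself you would keep the XPath from the answer above; this only confirms that the wanted string is the first text node after the icon.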
I am writing a web crawler to check availability.
I want to check the title for a specific time slot. However, if the title is 'NO' there is no href; otherwise there is one. Therefore the element's XPath depends on the title, and the title changes every time, so I can't rely on a fixed XPath.
If I want to check the availability of 09:00~11:00, how can I do that?
I tried to find it by XPath. However, since the XPath changes as described, I can't check the specific time I want.
Thanks in advance.
Below is the HTML code.
<span class="rs">07:00~09:00</span><img src="../images/reservation_btn04.gif" title="NO"><br>
<span class="rs">09:00~11:00</span><img src="../images/reservation_btn04.gif" title="NO"><br>
<span class="rs">11:00~13:00</span><img src="../images/reservation_btn04.gif" title="NO"><br>
<span class="rs">13:00~15:00</span><img src="../images/reservation_btn03.gif" title="YES"><br>
<span class="rs">15:00~17:00</span><img src="../images/reservation_btn03.gif" title="YES"><br>
<span class="rs">17:00~19:00</span><img src="../images/reservation_btn03.gif" title="YES"><br>
<span class="rs">19:00~21:00</span><img src="../images/reservation_btn04.gif" title="NO"><br>
Given the HTML you have shared, to check the availability of any timespan, e.g. 09:00~11:00, you can use the following solution:
You can create a function that takes the timespan as an argument and extracts the availability as follows:
def check_availability(myTimeSpan):
    print(driver.find_element_by_xpath("//span[@class='rs'][.='" + myTimeSpan + "']//following::img[1]").get_attribute("title"))
Now you can call the function check_availability() with any timespan as follows:
check_availability("09:00~11:00")
If the text 09:00~11:00 is fixed, you can locate the img element like this:
element = driver.find_element_by_xpath("//span[@class='rs' and contains(text(),'09:00~11:00')]/following-sibling::img")
To check whether the title attribute of the element is "YES":
if element.get_attribute("title") == 'YES':
    pass  # do whatever you want
To get the href attribute of your required element:
source = driver.find_element_by_xpath("//span[@class='rs' and contains(text(),'09:00~11:00')]/following-sibling::img[@title='YES']/preceding-sibling::a").get_attribute("href")
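The "span, then the next img" pairing can be checked offline against the posted rows. The sketch below uses only the standard library, which has no following-sibling axis, so it walks the siblings in Python instead. The wrapping div, the self-closed img and br tags, and the helper name availability are my assumptions.

```python
import xml.etree.ElementTree as ET

# Three rows copied from the question; the wrapping <div> and the self-closed
# <img/> and <br/> tags are assumptions so that the snippet parses as XML.
SLOTS = """
<div>
  <span class="rs">09:00~11:00</span><img src="../images/reservation_btn04.gif" title="NO"/><br/>
  <span class="rs">11:00~13:00</span><img src="../images/reservation_btn04.gif" title="NO"/><br/>
  <span class="rs">13:00~15:00</span><img src="../images/reservation_btn03.gif" title="YES"/><br/>
</div>
"""

def availability(container, timespan):
    """Return the title of the first <img> after the matching span, or None."""
    children = list(container)
    for index, child in enumerate(children):
        is_slot = child.tag == "span" and child.get("class") == "rs"
        if is_slot and (child.text or "").strip() == timespan:
            # Mirror following-sibling::img[1]: take the first img after the span.
            for sibling in children[index + 1:]:
                if sibling.tag == "img":
                    return sibling.get("title")
            return None
    return None

container = ET.fromstring(SLOTS)
print(availability(container, "13:00~15:00"))
```

Returning the title instead of printing it makes the result easy to use in an if statement or a polling loop.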
I'm trying to rewrite someone's library to parse some XML returned with requests. However, they use lxml in a way I'm not used to. I believe it's using regular expressions to find the data, and while most of the library works, it doesn't work when the site being parsed has the file id in a list structure. Essentially, I get a page back and I'm looking for an id that matches the href athlete number. So say I want to get just the ids for athlete 567377.
</div>
</a></div>
<ul class='list-entries'>
<li class='entity-details feed-entry' id='Activity-123120999590'>
<div class='avatar avatar-athlete avatar-default'>
<a class='avatar-content' href='/athletes/567377' >
</a>
</div>
</li>
<li class='entity-details feed-entry' id='Activity-16784940202'>
<div class='avatar avatar-athlete avatar-default'>
<a class='avatar-content' href='/athletes/5252525'>
</a>
</div>
The code:
lst_group_activity = parser.xpath(".//li[substring(@id, 1, 8)='Activity']")
This returns all list items perfectly, but for all activities. I only want the ones related to the right athlete. The library uses the following, with an @href test, to select the right athlete.
lst_athlethe_act_in_group_activity = parser.xpath(".//li[substring(@id, 1, 8)='Activity']/*[@href='/athletes/"+athlethe_id+"']/..")
However, this never seems to work. It finds the activity but then throws them all away.
Is there a better way to get this working? Any tutorial that can point me in the right direction to correlate to the next element.
The element with the href attribute isn't an immediate child of your li element, so your XPath is failing. You're matching:
.//li/*[@href="..."]
You want:
.//li/div/a[@href="..."]
(You could match * instead of a if you think another element might contain the href attribute, and you can match against .//li//a[@href="..."] if you think the path to the a element might not always be li/div/a.)
So to find the li element:
parser.xpath(".//li[substring(@id, 1, 8)='Activity']/div/a[@href='/athletes/%s']/../.." % '5252525')
But you can also write that without the ../..:
parser.xpath(".//li[substring(@id, 1, 8)='Activity' and div/a/@href='/athletes/%s']" % '5252525')
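Here is a small offline check of the li/div/a relationship on a tidied copy of the posted fragment. It uses the standard library's ElementTree rather than lxml, so the id prefix test is done in Python (ElementTree has no substring()). The closing tags added to the fragment and the helper name activity_ids_for are my assumptions.

```python
import xml.etree.ElementTree as ET

# Tidied copy of the fragment from the question; closing tags were added so it parses.
FEED = """
<ul class="list-entries">
  <li class="entity-details feed-entry" id="Activity-123120999590">
    <div class="avatar avatar-athlete avatar-default">
      <a class="avatar-content" href="/athletes/567377"></a>
    </div>
  </li>
  <li class="entity-details feed-entry" id="Activity-16784940202">
    <div class="avatar avatar-athlete avatar-default">
      <a class="avatar-content" href="/athletes/5252525"></a>
    </div>
  </li>
</ul>
"""

def activity_ids_for(root, athlete_id):
    """Return the ids of Activity list items that link to the given athlete."""
    # The link is a grandchild of the li (li/div/a), not a direct child.
    link_path = "div/a[@href='/athletes/%s']" % athlete_id
    return [
        li.get("id")
        for li in root.iter("li")
        if li.get("id", "").startswith("Activity") and li.find(link_path) is not None
    ]

feed = ET.fromstring(FEED)
print(activity_ids_for(feed, "567377"))
```

Testing for the link from the li, instead of navigating down and back up with ../.., corresponds to the second lxml expression above.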
Here is the HTML I'm dealing with
<a class="_54nc" href="#" role="menuitem">
<span>
<span class="_54nh">Other...</span>
</span>
</a>
I can't seem to get my XPath structured correctly to find this element with the link. There are other elements on the page with the same attributes as <a class="_54nc"> so I thought I would start with the child and then go up to the parent.
I've tried a number of variations, but I would think something like this:
crawler.get_element_by_xpath('//span[@class="_54nh"][contains(text(), "Other")]/../..')
None of the things I've tried seem to be working. Any ideas would be much appreciated.
Or, more cleanly, use //*[.='Other...']/../.. Here . is the element's own string value, so the predicate matches the element whose text is exactly "Other...", and each /.. then steps up one level to the parent.
Alternatively, if you want to find the a tag directly, use the CSS selector [role='menuitem'], which is a better option if the role attribute is unique.
How about trying this:
crawler.get_element_by_xpath('//a[@class="_54nc"][./span/span[contains(text(), "Other")]]')
Try this:
crawler.get_element_by_xpath("//a[@class='_54nc']//span[.='Other...']")
This selects the span with the exact text "Other..." inside an a element whose class is "_54nc". You can replace "Other..." with other text to find the corresponding element(s).
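To see the "start from the a and test its inner span" approach working on the posted markup, here is a minimal standard-library sketch. The extra "Edit" menu item is invented so that there are two links with the same class, and link_with_label is a hypothetical helper. The text comparison happens in Python because ElementTree does not support contains().

```python
import xml.etree.ElementTree as ET

# The second <a> is the one from the question; the first ("Edit") is made up
# to show that links sharing the same class are told apart by their label.
MENU = """
<div>
  <a class="_54nc" href="#" role="menuitem">
    <span>
      <span class="_54nh">Edit</span>
    </span>
  </a>
  <a class="_54nc" href="#" role="menuitem">
    <span>
      <span class="_54nh">Other...</span>
    </span>
  </a>
</div>
"""

def link_with_label(root, label):
    """Return the <a class="_54nc"> whose inner span text equals label, or None."""
    for link in root.iterfind(".//a[@class='_54nc']"):
        inner = link.find("span/span[@class='_54nh']")
        if inner is not None and (inner.text or "").strip() == label:
            return link
    return None

menu = ET.fromstring(MENU)
match = link_with_label(menu, "Other...")
print(match.get("role"))
```

Matching is case-sensitive here, just as it is in XPath, so "other" would not find the link.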