I am working on a script that collects all of the entries written by users under a specific title on a website (in this case, the title is "python (programlama dili)"). I would like to read the number that shows how many pages currently exist under this title.
The reason for reading this number from the element is that the page count can grow while the script runs, because users keep adding entries. So the script has to read the number from the element itself.
In this case, I need to read "122" as the value and assign it to an int variable. I use Selenium with the Firefox web driver to collect the entries.
It would be better to access it using XPath.
Try getting the value attribute of the element. You mentioned you can find the element using XPath, so you can do the following:
user_count = element.get_attribute('value')
If that gets you the number (as a string), you can convert it to an int as usual:
value = int(user_count)
First select the link inside the element with the class last, then read its href attribute. Don't forget to split the href to get the number.
my_val = driver.find_element_by_css_selector(".last [href]").get_attribute("href").split("p=")[1]
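A minimal sketch of the split step, assuming the last pager link's href carries the page number in a `p` query parameter. The URL below is made up for illustration and `page_count_from_href` is a hypothetical helper name. Parsing the query string instead of splitting on "p=" avoids picking up anything that follows the parameter.

```python
from urllib.parse import urlparse, parse_qs


def page_count_from_href(href):
    """Return the integer value of the `p` query parameter in href."""
    query = parse_qs(urlparse(href).query)
    # parse_qs maps each parameter name to a list of string values.
    return int(query["p"][0])


# Hypothetical href, as might be read via element.get_attribute("href").
href = "https://example.com/python--programlama-dili?p=122"
page_count = page_count_from_href(href)
```

The result is already an int, so it can be used directly as the upper bound of a page loop.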
I have the custom attribute upgrade-test="secondary-pull mktg-data-content" in the following code snippet:
<section class="dvd-pull tech-pull-- secondary-pull--anonymous tech-pull--digital secondary-pull--dvd-ping tech-pull--minimise" upgrade-test="secondary-pull mktg-data-content" data-js="primary-pull" style="--primary-direct-d_user-bottom-pos:-290px;">
I am able to identify my element successfully by doing the following:
element = driver.find_element(By.XPATH, "//section[contains(@upgrade-test, 'mktg-data-content')]")
The mktg-data-content part changes every time a user goes to a different page; for example, it could be sales-data-content for the sales page.
What I am after is a way to retrieve the dynamic text of this custom attribute and store it in a variable. Any help would really be appreciated. Thanks.
You need to fetch the attribute value using element.get_attribute("upgrade-test") and then need to do string manipulation.
elementval = driver.find_element(By.XPATH, "//section[contains(@upgrade-test, 'secondary-pull')]").get_attribute("upgrade-test")
print(elementval.split(" ")[-1])
Note: split(" ") splits the string on spaces and returns a zero-indexed list; index -1 is the last item in the list.
Since the list has two items here, you can use this as well:
print(elementval.split(" ")[1])
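Picking the token by position works only while the attribute keeps the same word order. As a hedged alternative, the sketch below selects the token by its suffix instead. It assumes the dynamic part always ends in "-data-content" (as in the question's mktg-data-content and sales-data-content examples); `dynamic_token` is a hypothetical helper name.

```python
def dynamic_token(attr_value, suffix="-data-content"):
    """Return the first whitespace-separated token ending with suffix, or None."""
    for token in attr_value.split():
        if token.endswith(suffix):
            return token
    return None


# Value copied from the question's upgrade-test attribute.
elementval = "secondary-pull mktg-data-content"
page_token = dynamic_token(elementval)
```

In a real script, `elementval` would come from `get_attribute("upgrade-test")` as shown above.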
You need to find a unique locator for that element.
Without seeing that page we can only guess, so I can guess tech-pull--digital and secondary-pull--dvd-ping class names are making a unique combination.
If so you can use the following code:
attribute_val = driver.find_element(By.CSS_SELECTOR, "section.tech-pull--digital.secondary-pull--dvd-ping").get_attribute("upgrade-test")
print(attribute_val.split(" ")[-1])
The first line here locates the element and retrieves the desired attribute value, while the second line isolates the last part of that value, as explained by KunduK.
I am trying to get specific contents of the second "datadisplaytable": the schedule type and the instructor's name. The line below:
datadisplaytable = soup.find(class_='datadisplaytable').text
gets me the text of the whole element, which contains all the other 'datadisplaytable's that I intend to loop through for the specific data I need.
Using XPath in Selenium only gets me the contents of the selected path, and trying to use a for loop in Selenium returns "WebElement not iterable".
Which brings me to the question:
How do I get the schedule type and instructor?
catalog = https://prod-ssb-01.dccc.edu/PROD/bwckschd.p_disp_dyn_sched
The term and subject can be any.
Use find_elements_by_xpath('<your xpath>'), not find_element_by_xpath. Note the difference: the first one has an s.
Remember that find_elements returns a list of web elements which you can loop through, whereas find_element returns a single web element.
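A short sketch of looping over the list that find_elements returns. It uses the Selenium 4 style call `driver.find_elements(by, value)` and passes the strategy as the plain string "xpath", which is the value of `By.XPATH`. The helper name `texts_of_all` is hypothetical.

```python
def texts_of_all(driver, xpath):
    """Return the visible text of every element matching xpath."""
    # find_elements returns a list, so it can be iterated directly;
    # it is an empty list (not an exception) when nothing matches.
    elements = driver.find_elements("xpath", xpath)
    return [element.text for element in elements]
```

It could be called as `texts_of_all(driver, "//table[@class='datadisplaytable']")`, assuming the tables on the page carry that class; the second item of the returned list would then be the second table's text.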
I'm using Selenium with the find_element_by_xpath method to do some web scraping. I have a problem with a path that changes from page to page: I know how the path is written, but one of the strings within the path changes through my loop. I would like to know how I can use regex to solve it.
I have this code for one of the pages, but when I go through all pages the string "NUMBER" below changes:
browser.find_element_by_xpath(re.compile('//*[@id="exhibDetail:exhib"]/section[3]/div[2]/div/div[2]/div/div/div[NUMBER]/div').click()
I want to know if it was possible to use regex in order to say that it has to click whatever the "NUMBER" as long as the rest of the path is the same so I tried this but I'm not sure about the syntax and how to use regex here:
browser.find_element_by_xpath('//*[@id="exhibDetail:exhib"]/section[3]/div[2]/div/div[2]/div/div/div['). + re.compile("^[1-9]\d*$") + ']/div').click()
browser.find_element_by_xpath(re.compile('^//*[@id="exhibDetail:exhib"]/section[3]/div[2]/div/div[2]/div/div/div[')).click()
browser.find_element_by_xpath('//*[@id="exhibDetail:exhib"]/section[3]/div[2]/div/div[2]/div/div/div[1]/div').click()
browser.find_element_by_xpath('//*[@id="exhibDetail:exhib"]/section[3]/div[2]/div/div[2]/div/div/div[9]/div').click()
browser.find_element_by_xpath('//*[@id="exhibDetail:exhib"]/section[3]/div[2]/div/div[2]/div/div/div[4]/div').click()
browser.find_element_by_xpath('//*[@id="exhibDetail:exhib"]/section[3]/div[2]/div/div[2]/div/div/div[10]/div').click()
browser.find_element_by_xpath('//*[@id="exhibDetail:exhib"]/section[3]/div[2]/div/div[2]/div/div/div[6]/div').click()
The path changes more or less in this manner (randomly), not incrementally one by one.
How do I solve this problem?
Welcome to SO.
If you are trying to pass the NUMBER as part of xpath in your loop then you can do the below.
If NUMBER is an integer:
browser.find_element_by_xpath("//*[@id='exhibDetail:exhib']/section[3]/div[2]/div/div[2]/div/div/div[%i]/div" % NUMBER).click()
If NUMBER is a string:
browser.find_element_by_xpath("//*[@id='exhibDetail:exhib']/section[3]/div[2]/div/div[2]/div/div/div[%s]/div" % NUMBER).click()
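The same substitution can be sketched with an f-string, which handles both the integer and the string case in one place. `BASE` and `tile_xpath` are hypothetical names; the path itself is taken from the question.

```python
BASE = "//*[@id='exhibDetail:exhib']/section[3]/div[2]/div/div[2]/div/div"


def tile_xpath(number):
    """Build the XPath for the div at the given 1-based position."""
    # int() normalises "10" and 10 alike and raises ValueError for
    # non-numeric strings, so nothing unexpected ends up in the XPath.
    return f"{BASE}/div[{int(number)}]/div"


paths = [tile_xpath(n) for n in (1, 9, 4)]
```

Each resulting string can then be passed to find_element_by_xpath inside the loop.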
I want to know if it was possible to use regex in order to say that it
has to click whatever the "NUMBER" as long as the rest of the path is
the same
If you want to select those div elements disregarding their position (that is what the predicates [1], [2], etc. are testing) then just don't use the predicates at all:
//*[@id="exhibDetail:exhib"]/section[3]/div[2]/div/div[2]/div/div/div/div
I asked my previous question here:
Xpath pulling number in table but nothing after next span
This worked, and I managed to see the number I wanted in a Firefox plugin called XPath Checker. The results are shown below.
So I know I can find this number with this XPath, but when I run a Python script to find and save the number, it says it cannot find it.
try:
    views = browser.find_element_by_xpath("//div[@class='video-details-inside']/table//span[@class='added-time']/preceding-sibling::text()")
except NoSuchElementException:
    print("NO views")
    views = 'n/a'
    pass
I know that pass is not best practice, but I am just testing this at the moment, trying to find the number. I'm wondering if I need to change something at the end of the XPath, like .text, as XPath Checker normally shows results a little differently, like below:
I needed to use the XPath I gave rather than the one used in the above picture because I only want the number and not the date. You can see part of the source in my previous question.
Thanks in advance! Scratching my head here.
The xpath used in find_element_by_xpath() has to point to an element, not a text node and not an attribute. This is a critical thing here.
The easiest approach here would be to:
get the td's text (parent)
get the span's text (child)
remove child's text from parent's
Code:
span = browser.find_element_by_xpath("//div[@class='video-details-inside']/table//span[@class='added-time']")
td = span.find_element_by_xpath('..')
views = td.text.replace(span.text, '').strip()
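The string step on its own can be sketched as below, independent of the browser. The sample texts are hypothetical stand-ins for `td.text` and `span.text`, and `text_without_child` is a hypothetical helper name. Unlike the one-liner above, it removes only the first occurrence of the child's text.

```python
def text_without_child(parent_text, child_text):
    """Return parent_text with the first occurrence of child_text removed."""
    # count=1 so a repeated substring elsewhere in the parent is left alone.
    return parent_text.replace(child_text, "", 1).strip()


# Hypothetical cell text: a view count followed by the span's "added" time.
views = text_without_child("1234 2 days ago", "2 days ago")
```

If the child's text could also appear earlier in the parent's own text, this would strip the wrong occurrence, so it is worth checking against the real page.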
Hi I am trying to locate an element by its class name and the text that it contains
<div class="fc-day-number">15</div>
There are a bunch of fc-day-number elements on the page with different values; I need the one with, for example, 15.
I do
driver.find_element_by_class_name("fc-day-content")
but I also need its text to be equal to 15, and I am stuck here. Please help.
You can use xpath:
driver.find_element_by_xpath("//div[@class='fc-day-content' and text()='15']")
fc_day_contents = driver.find_elements_by_class_name("fc-day-content")
the_one_you_want = [x for x in fc_day_contents if "15" == x.text][0]
The first line puts all elements with class name "fc-day-content" in a list (notice that it is find_elements with an s; this returns a list of all matching elements, whether by_class_name, by_name, by_id or whatever).
The second line goes through each element, checks whether its text is "15", and returns the matches as a (probably smaller) list.
The [0] at the end returns the first item in that list (you can remove it if you want a list of all the ones that are "15").
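Indexing with [0] raises IndexError when nothing matches. A small variation, sketched here with a hypothetical helper name `first_with_text`, returns None instead and stops at the first match:

```python
def first_with_text(elements, wanted):
    """Return the first element whose .text equals wanted, or None."""
    # next() with a default avoids IndexError on an empty result.
    return next((el for el in elements if el.text == wanted), None)
```

It could be called as `first_with_text(fc_day_contents, "15")`, where `fc_day_contents` is the list from the first line above.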
For things like this, prefer to use JavaScript:
els = driver.execute_script("""
return Array.prototype.slice.call(document.getElementsByClassName("fc-day-content"))
.filter(function (x) { return x.textContent === "15"; });
""")
assert len(els) == 1
el = els[0]
What this does is get all elements that have the class fc-day-content as the class. (By the way, your question uses fc-day-content and fc-day-number. Unclear which one you're really looking for but it does not matter in the grand scheme of things.) The call to Array.prototype.slice creates an array from the return value of getElementsByClassName because this method returns an HTMLCollection and thus filter is not available on it. Once we have the array, run a filter to narrow it to the elements that have 15 for text. This array of elements is returned by the JavaScript code. The assert is to make sure we don't unwittingly get more than one element, which would be a surprise. And then the element is extracted from the list.
(If you care about IE compatibility, textContent is not available before IE 9. If you need support for IE 8 or earlier, you might be able to get by with innerText or innerHTML or you could check that the element holds a single text node with value 15.)
I prefer not to do it like TehTris does (find_elements_by_class_name plus a Python loop to find the one with the text) because that method takes 1 round-trip between Selenium client and Selenium server to get all the elements of class fc-day-content plus 1 round-trip per element that was found. So if you have 15 elements on your page with the class fc-day-content, that's 16 round-trips. If you run a test through Browser Stack or Sauce Labs, that's going to slow things down considerably.
And I prefer to avoid an XPath expression with @class='fc-day-content' because as soon as you add a new class to your element, this expression breaks. Maybe the element you care about has just one CSS class now but applications change. You could use XPath's contains() function but then you run into other complications, and once you take care of everything, it becomes a bit unwieldy. See this answer for how to use it robustly.
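For reference, one common form of the contains() approach pads the class attribute with spaces so that only whole class names match. The sketch below just builds that expression as a string; `class_and_text_xpath` is a hypothetical helper, and it assumes neither the class name nor the text contains a single quote.

```python
def class_and_text_xpath(tag, css_class, text):
    """Build an XPath for tag elements that carry css_class and have exactly this text."""
    # Padding both sides with spaces keeps "fc-day-content" from matching
    # a longer class name such as "fc-day-content-extra".
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {css_class} ') and normalize-space(.)='{text}']"
    )


xpath = class_and_text_xpath("div", "fc-day-content", "15")
```

The length of the resulting expression illustrates the "unwieldy" point made above.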