I am trying to get specific contents of the second "datadisplaytable" is the Schedule type and Instructors name. The line below:
datadisplaytable = soup.find(class_='datadisplaytable').text
gets me the whole class which contains all other 'datadisplaytable's which I intend to loop through for the specific data that I need.
Using "Xpath" in selenium only gets me the contents of the selected path. and trying to use a for loop in selenium returns "WebElement not iterable'
Which brings me to the question,
How do I get the schedule type and Instructor.
catalog = https://prod-ssb-01.dccc.edu/PROD/bwckschd.p_disp_dyn_sched
term and subject is any.
Use find_elements_by_xpath (‘<your xpath>) not find_element_by_xpath. See the difference one is with s.
Remember find_elements return a list of web elements which you can loop through where as find_element return a single web element.
Related
I have the custom attribute called upgrade-test="secondary-pull mktg-data-content in the following code snippet:
<section class="dvd-pull tech-pull-- secondary-pull--anonymous tech-pull--digital secondary-pull--dvd-ping tech-pull--minimise" upgrade-test="secondary-pull mktg-data-content" data-js="primary-pull" style="--primary-direct-d_user-bottom-pos:-290px;">
I am able to identify my element successfully by doing the following:
element = driver.find_element(By.XPATH, "//section[contains(#upgrade-test, 'mktg-data-content')]")
This mktg-data-content gets changed every time a user goes to a different page for example it could be sales-data-content for the sales page etc.
What I am after is to find a way to retrieve this dynamic text of this custom attribute and pass it to my variable. Any help would really be appreciated. Thanks
You need to fetch the attribute value using element.get_attribute("upgrade-test") and then need to do string manipulation.
elementval = driver.find_element(By.XPATH, "//section[contains(#upgrade-test, 'secondary-pull')]").get_attribute("upgrade-test")
print(elementval.split(" ")[-1])
Note:- splitted with space,which returns zero based list, index -1 means the last value of the list,
since you have two elements in the list you can use this as well
print(elementval.split(" ")[1])
You need to find a unique locator for that element.
Without seeing that page we can only guess, so I can guess tech-pull--digital and secondary-pull--dvd-ping class names are making a unique combination.
If so you can use the following code:
attribute_val = driver.find_element(By.CSS_SELECTOR, "section.tech-pull--digital.secondary-pull--dvd-ping").get_attribute("upgrade-test")
print(attribute_val.split(" ")[-1])
The first line here locates the element and retrieves the desired attribute value while the second line isolates the first part of the desired attribute value as explained by KunduK
I'm trying to make an attendance app, using Selenium; the script should read the names and click on a designated checkbox when a name is matched. What happens is that it keeps clicking on the first checkbox repeatedly. When I looked into it turns out that it only reads the first li only.
here's the HTML code
<div class="members">
<ul class="memberlist"></ul>
</div>
the li are generated via fetch API from the database
when I try these in python, it only reads the first li element over and over
for i in range(len(df)):
print(driver.find_element(By.XPATH,".//ul[#class='memberlist']/li/p").text)
for i in range(len(df)):
print(driver.find_element(By.CLASS_NAME,"name").text)
also, I tried the solution from this and I don't think this is the correct syntax
selenium python for loop printing only first element repeatedly
Update: as per dosas comment, this did list all the names instead of just one, then I wrote this to click on the checkboxes of the attendees.
for i in range(len(df)):
# match names of the sheet with the 'name' class in the webpage
if df['name'][i]==driver.find_elements(By.CLASS_NAME,"name")[i].text:
driver.find_elements(By.CLASS_NAME, week)[i].click()
print(driver.find_elements(By.CLASS_NAME,"name")[i].text)
else:
print('No match')
Now I have a different problem, the attendance excel sheet has to be in the same order as the list on the web page, otherwise, it'll show 'No match'. So how do I make it search the whole page for each particular name?
I am working on a script that aims "taking all of the entries which are written by users" under a specific title in a website(In this case, the title is "python(programlama dili"). I would like to read the number which shows the current number of pages under this specific title.
The reason behind reading the number of this elements is that number of pages can increase at the time when we run the script due to increasing number of entries by users. Thus, I should take the number which exist within the element via script.
In this case, I need to read "122" as the value and assign it to a int variable . I use Selenium to take all entries and Firefox web driver.
It Would be better if you try to access it using the xpath.
Try and get the value attribute of the element, you've mentioned you can find the element using xpath so you can do the following.
user_count = element.get_attribute('value')
If that gets you the number (as a string) then you can just convert to an int as usual
value = int(user_count)
First pick the selector .last and then you can extract the reference of that. Don't forget to split the reference.
my_val = driver.find_element_by_css_selector(".last [href]").split("p=")[1]
Using Selenium to perform some webscraping. Have it log in to a site, where an HTML table of data is returned with five values at a time. I'm going to have Selenium scrape a particular bit of data off the table, write to a file, click next, and repeat with the next five.
New automation script. I've a myriad of variations of get_attribute, find_elements_by_class_name, etc. Example:
pnum = prtnames.get_attribute("title")
for x in prtnames:
print('pnum')
Here's the HTML from one of the returned values:
<div class="text-container prtname"><span class="PrtName" title="P011">P011</span></div>
I need to get that "P011" value. Obviously Selenium doesn't have "find_elements_by_title", and there is no HTML id for the value. The Xpath for that line of HTML is:
//*[#id="printerConnectTable"]/tbody/tr[5]/td/table/tbody/tr[1]/td[2]/div/span
But I don't see a reference to "title" or "P011" in that Xpath.
pnum = prtnames.get_attribute("title")
AttributeError: 'list' object has no attribute 'get_attribute'
It's like get_attribute doesn't exist, but there is some (albeit not much) documentation on it.
Fundamentally I'd like to grab that "P011" value and print to console, then I know Selenium is working with the right data.
P.S. I'm self-taught with all of this, I'm automating a sysadmin task.
I think the problem is that prtnames is a list of element, not a specific element. You can use a list comprehension if you want a list of the attributes of titles for the list of prtnames.
pnums = [x.get_attribute('title') for x in prtnames]
I am trying to get data from the website but I want to select first 1000 link open one by one and get data from there.
I have tried:
list_links = driver.find_elements_by_tag_name('a')
for i in list_links:
print (i.get_attribute('href'))
through this getting extra links which are not required.
for example: https://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1,2,3,4,5,%3E5&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment,Residential-House,Villa,Residential-Plot&cityName=Mumbai
we will get more than 50k link. How to open only first 1000 link has in below with properties photos.
Edit
I have tried this also:
driver.find_elements_by_xpath("//div[#class='.l-srp__results.flex__item']")
driver.find_element_by_css_selector('a').get_attribute('href')
for matches in driver:
print('Liking')
print (matches)
#matches.click()
time.sleep(5)
But getting error: TypeError: 'WebDriver' object is not iterable
Why I am not getting link by using this line: driver.find_element_by_css_selector('a').get_attribute('href')
Edit 1
I am trying to sort links as per below but getting error
result = re.findall(r'https://www.magicbricks.com/propertyDetails/', my_list)
print (result)
Error: TypeError: expected string or bytes-like object
or Tried
a = ['https://www.magicbricks.com/propertyDetails/']
output_names = [name for name in a if (name[:45] in my_list)]
print (output_names)
Not getting anything.
All links are in list. Please suggest
Thank you in advance. Please suggest
Selenium is not a good idea for web scraping. I would suggest you to use JMeter which is FREE and Open Source.
http://www.testautomationguru.com/jmeter-how-to-do-web-scraping/
If you want to use selenium, the approach you are trying to follow is not a stable approach - clicking and grabbing the data. Instead I would suggest you to follow this - something similar here. The example is in java. But you could get the idea.
driver.get("https://www.yahoo.com");
Map<Integer, List<String>> map = driver.findElements(By.xpath("//*[#href]"))
.stream() // find all elements which has href attribute & process one by one
.map(ele -> ele.getAttribute("href")) // get the value of href
.map(String::trim) // trim the text
.distinct() // there could be duplicate links , so find unique
.collect(Collectors.groupingBy(LinkUtil::getResponseCode)); // group the links based on the response code
More info is here.
http://www.testautomationguru.com/selenium-webdriver-how-to-find-broken-links-on-a-page/
I believe you should collect all the elements in list which having tag name "a" with "href" properties which is not null.
Then traverse through the list and click on element one by one.
Create a list of type WebElement and store all the valid links.
Here you can apply more filters or conditions i.e. link contains some characters or some other condition.
To Store the WebElement in list you can use driver.findEelements() this method will return list of type WebElement.