I am trying to extract data from this website:
https://apps.ecology.wa.gov/tcpwebreporting/reports/ust?CityZip=Seattle&County=King&StoredSubstance=Unleaded%20Gasoline
On the page, I click the > button to see more details for each gas station. I want to scrape that data, but I can't find a way to click the > button from my code.
I am able to extract each row's elements. What should I do next?
driver = webdriver.Chrome(executable_path=r'C:\Users\Owner\Desktop\Career\Coltura\chromedriver.exe')
driver.get('https://apps.ecology.wa.gov/tcpwebreporting/reports/ust?CityZip=Seattle&County=King&StoredSubstance=Unleaded%20Gasoline')
buttons = driver.find_elements_by_class_name(' details-control parent-td clickable parent-control')
driver.find_elements_by_tag_name('tr')
<tr class="clickable odd details" role="row">
<td class=" details-control parent-td clickable parent-control">
<button title="Toggle more information about the site RICK'S CHEVRON GROCERY" class="btn btn-sm btn-whitesmoke"></button>
</td>
<td class=" parent-td">27</td>
<td class=" parent-td">41179492</td>
<td class=" parent-td">A3602</td>
<td class=" parent-td">RICK'S CHEVRON GROCERY</td>
<td class=" parent-td">8506 5TH AVE NE</td>
<td class=" parent-td">Seattle</td>
<td class=" parent-td">98115</td>
<td class=" parent-td">King</td>
<td class=" parent-td">Northwest</td>
</tr>
To locate an element with multiple class names, use *_by_css_selector, not *_by_class_name.
I suggest calling .location_once_scrolled_into_view before clicking the element.
If you mean clicking each arrow button, this should do it:
import time

driver.get('https://apps.ecology.wa.gov/tcpwebreporting/reports/ust?CityZip=Seattle&County=King&StoredSubstance=Unleaded%20Gasoline')
# add some wait here.....
arrows = driver.find_elements_by_css_selector('td[class*="details-control"]')
for arrow in arrows:
    # accessing this property scrolls the element into view
    arrow.location_once_scrolled_into_view
    time.sleep(0.5)
    arrow.click()
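As a small illustration of the multiple-class point above, the class attribute can be turned into a compound CSS selector mechanically. This is only a sketch: classes_to_css is a hypothetical helper, not part of Selenium, and it assumes the class names contain no characters that would need CSS escaping.

```python
def classes_to_css(tag, class_attr):
    """Build a CSS selector matching `tag` elements that carry every class in `class_attr`."""
    # str.split() with no arguments also discards leading/trailing spaces,
    # such as the leading space in the class attribute from the question.
    names = class_attr.split()
    return tag + "".join("." + name for name in names)


# Class attribute copied from the <td> in the question.
selector = classes_to_css("td", " details-control parent-td clickable parent-control")
print(selector)
```

The resulting string can be passed to find_elements_by_css_selector just like the selector in the answer. It is stricter than td[class*="details-control"], since it requires all four classes to be present.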
I am trying to scrape the data in a bunch of rows. I am able to expand an individual row using the following:
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//*[@id="7858101"]'))).click()
The problem is that each row has a different id. The rows share a common class name, so I have also tried:
WebDriverWait(driver, 60).until(EC.presence_of_elements_located((By.CLASS_NAME, 'course-row normal faculty-BU active'))).click()
I have attached a few rows below. Any suggestions on how I can fix this?
<tr id="7858101" class="course-row normal faculty-BU active" data-cid="7858101" data-cc="ACTG1P01" data-year="2021" data-session="FW" data-type="UG" data-subtype="UG" data-level="Year1" data-fn2_notes="BB" data-duration="2" data-class_type="ASY" data-course_section="1" data-days=" " data-class_time="" data-room1="ASYNC" data-room2="" data-location="ASYNC" data-location_desc="" data-instructor="Zhang, Xia (Celine)" data-msg="0" data-main_flag="1" data-secondary_type="E" data-startdate="1631073600" data-enddate="1638853200" data-faculty_code="BU" data-faculty_desc="Goodman School of Business">
<td class="arrow"><span class="fa fa-angle-down"></span></td>
<td class="course-code">ACTG 1P01 </td>
<td class="title">Introduction to Financial Accounting <div class="details-loader" style="display: none;"><span class="fa fa-refresh fa-spin fa-fw"></span></div></td>
<td class="duration">D2</td>
<td class="days"> </td>
<td class="time"> </td>
<!-- <td class="start" data-sort-value="1631073600">Sep 08, 2021</td> -->
<!-- <td class="end" data-sort-value="1638853200">Dec 07, 2021</td> -->
<td class="type">ASY</td>
<td class="data"><div style="" class="course-details-data">
<div class="description">
<h3>Introduction to Financial Accounting</h3>
<p class="page-intro">Fundamental concepts of financial accounting as related to the balance sheet, income statement and statement of cash flows. Understanding the accounting cycle and routine transactions. Integrates both theoretical and practical application of accounting concepts.</p>
<p><strong>Format:</strong> Lectures, discussion, 3 hours per week.</p>
<p><strong>Restrictions:</strong> open to BAcc majors.</p>
<p><strong>Exclusions:</strong> Completion of this course will replace previous assigned grade and credit obtained in ACTG 1P11, 1P91 and 2P51.</p>
<p><strong>Notes:</strong> Open to Bachelor of Accounting majors. </p>
</div>
<div class="vitals">
<ul>
<li><strong>Duration:</strong> Sep 08, 2021 to Dec 07, 2021</li>
<li>
<strong>Location:</strong> ASYNC </li>
<li><strong>Instructor:</strong> Zhang, Xia (Celine)</li>
<li><strong>Section:</strong> 1</li>
</ul>
</div>
<hr>
</div>
</td>
</tr>
<tr id="3724102" class="course-row normal faculty-BU active" data-cid="3724102" data-cc="ACTG1P01" data-year="2021" data-session="FW" data-type="UG" data-subtype="UG" data-level="Year1" data-fn2_notes="BB" data-duration="2" data-class_type="LEC" data-course_section="2" data-days=" M R " data-class_time="1100-1230" data-room1="GSB306" data-room2="" data-location="GSB306" data-location_desc="" data-instructor="Zhang, Xia (Celine)" data-msg="0" data-main_flag="1" data-secondary_type="E" data-startdate="1631073600" data-enddate="1638853200" data-faculty_code="BU" data-faculty_desc="Goodman School of Business">
<td class="arrow"><span class="fa fa-angle-right"></span></td>
<td class="course-code">ACTG 1P01 </td>
<td class="title">Introduction to Financial Accounting <div class="details-loader"><span class="fa fa-refresh fa-spin fa-fw"></span></div></td>
<td class="duration">D2</td>
<td class="days">
<table class="coursecal">
<thead>
<tr>
<th class="">S</th>
<th class="active">M</th>
<th class="">T</th>
<th class="">W</th>
<th class="active">T</th>
<th class="">F</th>
<th class="">S</th>
</tr>
</thead>
<tbody>
<tr>
<td class="weekend "></td>
<td class="active"></td>
<td class=""></td>
<td class=""></td>
<td class="active"></td>
<td class=""></td>
<td class="weekend "></td>
</tr>
</tbody>
</table>
</td>
<td class="time">1100-1230</td>
<!-- <td class="start" data-sort-value="1631073600">Sep 08, 2021</td> -->
<!-- <td class="end" data-sort-value="1638853200">Dec 07, 2021</td> -->
<td class="type">LEC</td>
<td class="data"></td>
</tr>
You are almost there...
You can retrieve a list of all the relevant web elements with the driver.find_elements method and then iterate over the list, clicking each element.
Since course-row normal faculty-BU active is actually several class names, not a single class name, you should use XPath or a CSS selector there.
It's also recommended to use the visibility_of_element_located expected condition here rather than presence_of_elements_located: a presence condition is fulfilled as soon as the element is in the DOM, even if it has not been rendered on the page yet, while visibility_of_element_located waits until the element is actually displayed.
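import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, '//tr[@class = "course-row normal faculty-BU active"]')))
time.sleep(0.4)  # short delay to let ALL the elements load
# find_elements (plural) returns a list that can be iterated over
elements = driver.find_elements(By.XPATH, '//tr[@class = "course-row normal faculty-BU active"]')
for element in elements:
    element.click()
    # scrape the data you need here etc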
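The "find all, then click each" loop described above could be wrapped in a small function. This is a minimal sketch, assuming Selenium 4, where find_elements(by, value) is the supported call. click_each is a hypothetical helper name, and the locator strategy is passed as the plain string "xpath" (the value of By.XPATH) so that the function itself needs no Selenium import. It also assumes that clicking one row does not re-render the others; if it does, the stored elements may go stale and would have to be located again.

```python
import time


def click_each(driver, by, value, pause=0.0):
    """Click every element matching the locator and return how many were clicked."""
    # find_elements returns a (possibly empty) list, unlike find_element,
    # which returns a single element and cannot be iterated over.
    elements = driver.find_elements(by, value)
    for element in elements:
        element.click()
        if pause:
            time.sleep(pause)  # optional delay between clicks
    return len(elements)


# Example call (needs a live driver, so it is left commented out):
# click_each(driver, "xpath", '//tr[@class="course-row normal faculty-BU active"]', pause=0.4)
```

Returning the count makes it easy to notice when the locator matched nothing.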
Because the id attributes of the <tr> elements are dynamic, to identify all the <tr>s and click each of them you need to induce WebDriverWait for visibility_of_all_elements_located() and use a locator that does not depend on the id, as follows:
Using CSS_SELECTOR:
elements = WebDriverWait(driver, 60).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr.course-row.normal.faculty-BU.active[data-faculty_desc='Goodman School of Business'] a[data-cc][data-cid]")))
for element in elements:
    element.click()
Using XPATH:
elements = WebDriverWait(driver, 60).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[@class='course-row normal faculty-BU active' and @data-faculty_desc='Goodman School of Business']//a[@data-cc and @data-cid]")))
for element in elements:
    element.click()
My code:
from selenium import webdriver
driver.get('http://www.datiopen.it/it/opendata/Mappa_delle_stazioni_ferroviarie_in_Italia')
element = driver.find_element_by_id("Tabella")
time.sleep(5)
element.click()
time.sleep(5)
a=driver.find_element_by_id('rId_48').get_attribute('innerHTML')
print(a)
My output:
<td role="gridcell" style="" title="" aria-describedby="list_"><a title="Vedi su Google Maps" href="javascript:StatPortalOpenData.ODataUtility.openInStreetView(45.0760003999999,7.5911782);"><img alt="Vedi su Google Maps" height="25" width="25" style="vertical-align:middle" src="/sites/all/modules/spodata/metadata/viewer/multidimensional_viewer/img/streetView.png"></a></td>
<td role="gridcell" style="" class="" title="COLLEGNO" aria-describedby="list_Cccomune_608711150">COLLEGNO</td>
My desired output:
<td role="gridcell" style="" class="" title="COLLEGNO" aria-describedby="list_Cccomune_608711150">COLLEGNO</td><td role="gridcell" style="" class="" title="CITTA' METROPOLITANA DI TORINO" aria-describedby="list_Ccprovincia_1472723626">CITTA' METROPOLITANA DI TORINO</td>
So it is the second block of <td> </td>
Thank you!
If you want to target a specific row and cell, you can use the following CSS selectors.
To print the element:
print(driver.find_element_by_css_selector("#rId_48>td:nth-child(2)").get_attribute('outerHTML'))
print(driver.find_element_by_css_selector("#rId_48>td:nth-child(3)").get_attribute('outerHTML'))
Or, to print the text of the element:
print(driver.find_element_by_css_selector("#rId_48>td:nth-child(2)").text)
print(driver.find_element_by_css_selector("#rId_48>td:nth-child(3)").text)
It looks like you selected the parent element of the <td> you want. Just use a css selector to get all the td elements inside it:
a=driver.find_elements_by_css_selector('#rId_48 td')
Note the "s" in find_elements.... This returns a list of all the <td> elements. So the one you want should be a[1].
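Note that the two answers count cells differently: nth-child in CSS starts at 1, while the Python list returned by find_elements_by_css_selector starts at 0. Here is a small sketch of the correspondence; cell_selector and list_index are hypothetical helper names used only for illustration.

```python
def cell_selector(row_id, column):
    """CSS selector for the `column`-th child <td> (1-based) of the row with id `row_id`."""
    if column < 1:
        raise ValueError("nth-child counts from 1")
    return f"#{row_id} > td:nth-child({column})"


def list_index(column):
    """Index of the same cell in the list returned for '#<row_id> td' (0-based)."""
    return column - 1


print(cell_selector("rId_48", 2))  # selector for the second cell
print(list_index(2))               # position of that cell in the list
```

So td:nth-child(2) and a[1] refer to the same cell, and td:nth-child(3) corresponds to a[2].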
I'm trying to automate a task and am stuck in the middle.
I am not able to select an option from a submenu.
I've tried every solution I could find on Stack Overflow and nothing works.
I'm attaching the code.
<input id="arid_WIN_0_2000053" class="text " readonly="" style="top: 0px; left: 0px; width: 72px; height: 21px;" title="Screen" type="text">
This is the element I need to click so that a drop-down appears.
The drop-down is in a different section, and its code is:
<table class="MenuTable" style="width: 93px;" cellspacing="0" cellpadding="0">
<tbody class="MenuTableBody">
<tr class="MenuTableRow">
<td class="MenuEntryName" nowrap="">Screen</td>
<td class="MenuEntryNoSub" arvalue="Screen"></td>
</tr>
<tr class="MenuTableRow">
<td class="MenuEntryName" nowrap="">File</td>
<td class="MenuEntryNoSub" arvalue="File"></td>
</tr>
<tr class="MenuTableRow">
<td class="MenuEntryName" nowrap="">Printer</td>
<td class="MenuEntryNoSub" arvalue="Printer"></td>
</tr>
<tr class="MenuTableRow">
<td class="MenuEntryNameHover" nowrap="">(clear)</td>
<td class="MenuEntryNoSubHover" arvalue=""></td>
</tr>
</tbody>
</table>
Once I have clicked the element with ID arid_WIN_0_2000053, I need to select the option File.
Thanks in advance.
As per the HTML, to select an option (e.g. File) from the submenu, you can use either of the following solutions:
driver.find_element_by_xpath("//input[normalize-space(@class)='text' and @title='Screen'][starts-with(@id,'arid_WIN_0_')]").click()
driver.find_element_by_xpath("//table[@class='MenuTable']//tr[@class='MenuTableRow']//td[@class='MenuEntryName' and contains(.,'File')]").click()
Or
driver.find_element_by_xpath("//input[normalize-space(@class)='text' and @title='Screen'][starts-with(@id,'arid_WIN_0_')]").click()
driver.find_element_by_xpath("//table[@class='MenuTable']//tr[@class='MenuTableRow']//td[@class='MenuEntryNoSub' and @arvalue='File']").click()
Alternatively, use this CSS locator: .MenuTableRow:nth-of-type(2) .MenuEntryName
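One way to sanity-check these locators without a browser is to run the same logic against the posted markup using Python's standard library. This is only a sketch under assumptions: it treats the menu table from the question as well-formed XML, and since xml.etree.ElementTree supports just a subset of XPath, the checks are simplified versions of the locators above rather than the same expressions.

```python
import xml.etree.ElementTree as ET

# Menu table copied from the question.
MENU_HTML = """
<table class="MenuTable" style="width: 93px;" cellspacing="0" cellpadding="0">
<tbody class="MenuTableBody">
<tr class="MenuTableRow">
<td class="MenuEntryName" nowrap="">Screen</td>
<td class="MenuEntryNoSub" arvalue="Screen"></td>
</tr>
<tr class="MenuTableRow">
<td class="MenuEntryName" nowrap="">File</td>
<td class="MenuEntryNoSub" arvalue="File"></td>
</tr>
<tr class="MenuTableRow">
<td class="MenuEntryName" nowrap="">Printer</td>
<td class="MenuEntryNoSub" arvalue="Printer"></td>
</tr>
<tr class="MenuTableRow">
<td class="MenuEntryNameHover" nowrap="">(clear)</td>
<td class="MenuEntryNoSubHover" arvalue=""></td>
</tr>
</tbody>
</table>
""".strip()

root = ET.fromstring(MENU_HTML)

# Cells matched by class and visible text (the first XPath variant).
by_text = [td for td in root.iter("td")
           if td.get("class") == "MenuEntryName" and td.text == "File"]

# Cells matched by the arvalue attribute (the second XPath variant).
by_value = root.findall(".//td[@arvalue='File']")

# 1-based position of the "File" row, as used by :nth-of-type(n).
rows = root.findall(".//tr[@class='MenuTableRow']")
labels = [row.find("td").text for row in rows]
file_position = labels.index("File") + 1

print(len(by_text), len(by_value), file_position)
```

Each locator should match exactly one cell, and the position of the File row explains the 2 in .MenuTableRow:nth-of-type(2).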
I'm using Ghost for Python 2.7 and I'm trying to click a link that is inside a table. The problem is that the link has no ID, name, etc. This is the HTML code:
<table id="table_webbookmarkline_2" cellpadding="4" cellspacing="0" border="0" width="100%">
<tr valign="top">
<td>
<a href="/dana/home/launch.cgi?url=.ahuvs%3A%2F%2Fhq0l5458452ERA-w-Xz8G3LKe8JNM%2F.ISDXWXaWXUivecOc" target="_blank" onClick='javascript:openBookmark(
this.href, "yes", "yes");
return false;' ><img src="/dana-cached/imgs/icn18x18WebBookmarkPop.gif" alt="This will open in a new TAB" width="18" height="18" border="0" ></a>
</td>
<td width="100%" align="left">
<a href="/dana/home/launch.cgi?url=.ahuvs%3A%2F%2Fhq0l5458452ERA-w-Xz8G3LKe8JNM%2F.ISDXWXaWXUivecOc" target="_blank" onClick='JavaScript:openBookmark(
this.href, "yes", "yes");
return false;' ><b>**LINK WHERE I WANT TO CLICK**</b> </a><br><span class="cssSmall"></span>
</td>
</tr>
</table>
How can I click this kind of link?
Seems like Ghost's Session.click() takes a CSS selector. Here only the table has an ID, so a selector that takes the second td that is a descendant of that ID and finds the a element should work:
session.click('#table_webbookmarkline_2 td:nth-child(2) a')
I have the following HTML:
<tbody role="alert" aria-live="polite" aria-relevant="all">
<tr class="odd">
<td class="">program user</td>
<td class="">program pass</td>
<td class="">program email</td>
<td class="">Program User</td>
<td class="">
<span class="ui-icon ui-icon-closethick"></span>
</td>
</tr>
<tr class="even">
<td class="">progman</td>
<td class="">progman_name</td>
<td class="">progman_lastname</td>
<td class="">Program Manager</td>
<td class="">
<span class="ui-icon ui-icon-closethick"></span>
This displays a table of users and:
<span class="ui-icon ui-icon-closethick"></span>
is the 'x' button, which I am trying to locate so I can delete a specific user, e.g. 'Program Manager' or 'Program User'.
Is this possible?
I assume that you are trying to find a specific user (for which you already know the name) and click on the associated delete button. Something like this should work:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("your-url-here")
elem = driver.find_element_by_xpath("//tbody/tr/td[text()='your-name-here']/..//span")
elem.click()
driver.close()
Well I found the solution:
from selenium.webdriver.common.action_chains import ActionChains

one = driver.find_element_by_xpath("//td[@class='' and text()='Program Manager']/..//span[@class='ui-icon ui-icon-closethick']")
ActionChains(driver).double_click(one).perform()
Basically, I find the <td> containing the text "Program Manager", move up to its parent <tr>, and then search all of its descendants (with //) for the span with class 'ui-icon ui-icon-closethick'.
And it worked!
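To reuse this for other users, the XPath could be built from the label. This is a rough sketch rather than the poster's exact locator: delete_button_xpath is a hypothetical helper, and it swaps in normalize-space() and contains() so that stray whitespace in the cell or extra classes on the span do not break the match. It assumes the label contains no single quote.

```python
def delete_button_xpath(label):
    """XPath for the 'x' span in the same table row as the cell whose text is `label`."""
    if "'" in label:
        # The label is embedded in a single-quoted XPath string literal.
        raise ValueError("labels containing a single quote are not supported")
    return (
        "//td[normalize-space()='" + label + "']"  # cell holding the user label
        "/.."                                      # up to the parent <tr>
        "//span[contains(@class, 'ui-icon-closethick')]"  # the 'x' button in that row
    )


print(delete_button_xpath("Program Manager"))
```

The string can then be passed to driver.find_element_by_xpath as in the solution above.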