Selenium not finding tab link elements by various methods - python

I'm having issues with Selenium locating a set of specific tab link elements by IDs or link text. Using Selenium, I'm trying to click/loop through each of the tabs ("DESCRIPTION AND PRICE", "FINISH", and "NOTES") and scrape the subsequent table (see screenshot).
Below is the HTML of the tabs. When my loop first loads the page, the "DESCRIPTION AND PRICE" tab is active, and the subsequent table is easily scraped with BeautifulSoup (by searching for the table with the specific table ID). However, after the "D+P" table is scraped, when I try to click the "FINISH" tab with Selenium, I get a NoSuchElementException error.
I hope to be able to click on the "FINISH" and "NOTES" tabs using the link text method (since the tabs are different from page to page). This results in the error.
driver.find_element_by_link_text("FINISH").click()
I've also tried the ID method, but this fails too.
driver.find_element_by_id("cphMain_tbTabs_rptTabs_lnkTab_1").click()
I've also tried various wait methods in case the element just wasn't yet loaded, although I get the same error when attempting to wait for an element ID, because it can't find the ID.
Another consideration is that I'm not seeing any mention of an iframe in the html.
<div id="cphMain_upTabs">
<div id="cphMain_divTabs" class="tabs">
<div id="cphMain_tbTabs_divTabs">
<ul class="tabset">
<li><a id="cphMain_tbTabs_rptTabs_lnkTab_0" class="tab active" href="javascript:__doPostBack('ctl00$cphMain$tbTabs$rptTabs$ctl01$lnkTab','')" style="font-weight:bold;">DESCRIPTION AND PRICE</a></li>
<li><a id="cphMain_tbTabs_rptTabs_lnkTab_1" class="tab" href="javascript:__doPostBack('ctl00$cphMain$tbTabs$rptTabs$ctl02$lnkTab','')" style="font-weight:normal;">FINISH</a></li>
<li><a id="cphMain_tbTabs_rptTabs_lnkTab_2" class="tab" href="javascript:__doPostBack('ctl00$cphMain$tbTabs$rptTabs$ctl04$lnkTab','')" style="font-weight:normal;">NOTES</a></li>
</ul>

Looks like my issue was with the first page I'm checking. I've added try/except logic, and now I'm able to skip the first page and scrape the info for the subsequent pages. Not sure what the issue is with the first page...
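Since the tab links in the HTML above are plain __doPostBack links and the page source is already being parsed, one possible workaround is to read the href from the parsed HTML and fire the postback directly instead of locating the <a> element with Selenium. This is only a sketch: the helper names postback_args and trigger_postback are made up for illustration, driver is assumed to be a Selenium WebDriver already on the page, and it has not been tried against the site in question.

```python
import re

# Matches href values such as: javascript:__doPostBack('ctl00$...$lnkTab','')
POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)','([^']*)'\)")


def postback_args(href):
    """Return (event_target, event_argument) from a __doPostBack href, or None."""
    match = POSTBACK_RE.search(href or "")
    return match.groups() if match else None


def trigger_postback(driver, href):
    """Fire the postback behind a tab link without clicking the <a> element.

    `driver` is assumed to be a Selenium WebDriver (hypothetical caller-supplied
    object). Returns False when the href is not a __doPostBack link.
    """
    args = postback_args(href)
    if args is None:
        return False
    # execute_script exposes the extra arguments to the script as arguments[0], arguments[1], ...
    driver.execute_script("__doPostBack(arguments[0], arguments[1]);", *args)
    return True
```

If the tab really is missing on some pages (as seemed to be the case for the first page), trigger_postback simply returns False for a missing or non-postback href, so the loop can skip that tab. The table still has to be re-read from driver.page_source after the postback completes.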

Related

Selenium get URL of "a" Tag without href attribute

I'm facing an element like:
<li _ngcontent-bcp-c271="">
<a _ngcontent-bcp-c271="">2018</a>
<!---->
<!---->
</li>
This element is clickable but since it does not have a href attribute, and I think it should use some script for the click event, I don't have a solution to get the URL from this element.
The code that I use most of the time is as follows:
driver.find_element(By.TAG_NAME, 'li').find_element(By.TAG_NAME, 'a').get_attribute('href')
Update:
I need to know the URL before I click on the button.
The answer is no, you cannot do that.
You cannot get the URL before clicking such an element, because the URL is created dynamically by a script; it is not stored statically on the page.
The easiest way to find the URL is to click the element and then read the page's URL with:
driver.current_url
Another way is to read the page's JavaScript, find the code that handles the click on this link, and take the URL from it if it is written there explicitly.
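The click-then-read approach can be sketched roughly as follows. The function name url_after_click and the timeout and poll parameters are assumptions for illustration; the polling loop is there because a script-driven navigation may not have updated driver.current_url at the moment click() returns.

```python
import time


def url_after_click(driver, element, timeout=5.0, poll=0.1):
    """Click `element` and return the URL the browser ends up on.

    `driver` is assumed to be a Selenium WebDriver and `element` a WebElement.
    Returns None if the URL has not changed within `timeout` seconds.
    """
    before = driver.current_url
    element.click()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = driver.current_url
        if current != before:
            return current
        time.sleep(poll)  # give the page's script time to navigate
    return None
```

After collecting the URL, driver.back() can be used to return to the original page before handling the next link.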

I cannot find a "button onclick" element using Selenium and Python

I am automating a process using Selenium and Python. Right now, I am trying to click a button on a web page (sorry, I cannot share the link, since it requires credentials to log in), but my code cannot find this button element. I have tried every selector (by ID, CSS selector, XPath, etc.) and done a lot of googling, but with no success.
Here is the source content from the web page:
<button onclick="javascript: switchTabs('public');" aria-selected="false" itemcount="-1" type="button" title="Public Reports" dontactassubmit="false" id="public" aria-label="" class="col-xs-12 text-left list-group-item tabbing_class active"> Public Reports </button>
I also added a sleep command before this to make sure the page is fully loaded, but it does not work.
Can anyone help me select this onclick button?
Please let me know if you need more info.
Edit: you can take a look at this picture to get more insight (https://ibb.co/cYXWkL0). The yellow arrow indicates the button I want to click on.
The element you are trying to click is inside an iframe, so you need to switch the driver into the iframe content before accessing elements inside it.
I can't give you a specific code solution, since you didn't share a link to the page or the full HTML block. Solutions to similar questions are available, and more results can be found with a Google search.
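As a generic illustration of switching into an iframe (not a solution specific to this page, which was not shared), the sketch below looks for an element by ID in the top-level document and then in each top-level iframe. The helper name find_in_frames is hypothetical; the locator strings "id" and "tag name" are the values of Selenium's By.ID and By.TAG_NAME; nested iframes are not handled.

```python
def find_in_frames(driver, element_id):
    """Return the first element with `element_id`, or None.

    Searches the top-level document first, then each top-level iframe.
    On success the driver is left switched into the frame that holds the
    element, so the caller can interact with it right away.
    """
    driver.switch_to.default_content()
    found = driver.find_elements("id", element_id)  # "id" == By.ID
    if found:
        return found[0]
    for frame in driver.find_elements("tag name", "iframe"):  # By.TAG_NAME
        driver.switch_to.frame(frame)
        found = driver.find_elements("id", element_id)
        if found:
            return found[0]
        # Not in this frame: go back to the top before trying the next one.
        driver.switch_to.default_content()
    return None
```

For the button in the question, a call might look like find_in_frames(driver, "public"), followed by .click() on the result and driver.switch_to.default_content() once done.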

Get links from ::before ::after using Selenium in Python

I am trying to scrape a website using Selenium in Python in order to extract a few links.
But for some of the tags, I am not able to find the links. When I inspect the element for these links, it points me to ::before and ::after. One way to do this is to click on it, which opens a new window, and get the link from the new window. But this solution is quite slow. Can someone tell me how I can fetch these links directly from this page?
It looks like the links you are trying to extract are not stored statically inside the i elements you see there. They are generated dynamically by JavaScript running on that page.
So the answer is no: you cannot extract these links from that page without iterating over its elements the way a human user would.

How to distinguish between product's page and a regular page

I am trying to scrape:
https://www.lanebryant.com/
My crawler starts from a URL and then follows all the links on that page. I scraped another site where my logic works by checking whether the URL contains the string "products" and then downloading the product's information. This site has no such marker. How do I distinguish between a product page and a regular page? (All it requires is an if statement.) I hope my question is clear. For the record, here is a product page on this site:
https://www.lanebryant.com/faux-wrap-maxi-dress/prd-358414#color/0000081590
Something that might help in this case is to go through several product pages (visually at first) and look for similarities in their HTML. If you're new to this, open the page, right-click, and choose "View Page Source" (this is how to do it in Chrome). For the example page you gave, one element that is probably relevant is <input type="submit" class="cta-btn btn btn--full mar-add-to-bag asc-bag-action grid__item" value="Add to Bag">, which corresponds to the "Add to Bag" button.
Then you might look into how to use BeautifulSoup to go through the HTML elements of the page and filter based on this.
Hope that helps!
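A rough sketch of that "if statement", using only the standard library rather than BeautifulSoup. Both checks are heuristics inferred from the single product page above: the /prd-<digits> path pattern is an assumption generalized from that one URL, and the function names are made up for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: product URLs end in /prd-<digits>, as in the one example URL.
PRODUCT_PATH_RE = re.compile(r"/prd-\d+/?$")


class _AddToBagFinder(HTMLParser):
    """Sets .found when an <input type="submit" value="Add to Bag"> is seen."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if (tag == "input" and attrs.get("type") == "submit"
                and attrs.get("value") == "Add to Bag"):
            self.found = True


def looks_like_product_url(url):
    """Cheap check on the URL path only (the #fragment is ignored)."""
    return bool(PRODUCT_PATH_RE.search(urlparse(url).path))


def has_add_to_bag(html):
    """Check the downloaded page source for the "Add to Bag" button."""
    finder = _AddToBagFinder()
    finder.feed(html)
    return finder.found
```

The crawler could then use something like `if looks_like_product_url(url) or has_add_to_bag(html):` before downloading product information. It is worth checking a handful of other product and non-product pages before trusting either test.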

Selenium cannot find element in chromedriver

I am trying to automate some actions on a website. My script fills out a form and clicks post and then the website essentially asks if you are sure you want to post and you need to click the post button a second time. The problem is, while the old post button is no longer visible to the user, Selenium can only find the old post button and insists that the new post button does not exist.
Here is what the HTML looks like for the new post button.
<span id="__w2__w927tml1_submit_question_anon">
<a class="modal_cancel modal action" href="#"
id="__w2__w927tml1_add_original">Ask Original Question</a>
</span>
I have tried every locator I can think of, but Selenium always locates the old post button. The closest I've got is locating the parent element shown above, but when I try to go through its children, it says there are none. I am at a loss for what to do here. Thanks for your help.
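One thing that might be worth trying when a locator keeps matching a hidden element is to collect all matches and keep only one that is actually displayed. This is a sketch under assumptions, not a confirmed fix for this page: first_visible is a hypothetical helper, and driver is assumed to be a Selenium WebDriver.

```python
def first_visible(driver, by, value):
    """Return the first matching element that is currently displayed, or None.

    find_elements returns every match (including hidden ones left over from
    an earlier step); is_displayed() filters out the ones the user cannot see.
    """
    for element in driver.find_elements(by, value):
        if element.is_displayed():
            return element
    return None
```

With the HTML above, a call could look like first_visible(driver, "css selector", "a.modal_cancel"), where "css selector" is the value of Selenium's By.CSS_SELECTOR. If the modal is rendered after the first click, an explicit wait before the lookup may also be needed.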
