Unable to parse data with BeautifulSoup: Python3 - python

The following is one of the tables on the website I am scraping. Under 'tbody' I want to click the 'MS' button in every row, both the 'odd' and 'even' class rows, which opens a different table for further parsing.
I am using Selenium and Python 3 to perform the web scraping.
The current code only clicks the 'MS' button in the first row. How can I write a for loop that iterates through all the rows and clicks the 'MS' button in each one?
Thank you.
Here is the code:
table_0 = table.find_element_by_tag_name('tbody')
for buttons in table_0.find_elements_by_tag_name("tr"):
    buttons.find_elements_by_xpath('//tr[@class="odd"]')
    buttons.find_element_by_xpath('//button[text()="MS"]').click()
for buttons in table_0.find_elements_by_tag_name("tr"):
    buttons.find_elements_by_xpath('//tr[@class="even"]')
    buttons.find_element_by_xpath('//button[text()="MS"]').click()

You should be able to use a CSS selector to gather those buttons for clicking:
.btn-group.btn-group-xs button:first-child
The selector certainly matches the buttons on the page.
I'm not sure whether you need waits, but maybe something like:
elements = driver.find_elements_by_css_selector(".btn-group.btn-group-xs button:first-child")
for element in elements:
    element.click()
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get("https://ibl.mdanderson.org/fasmic/#!/")
driver.find_element_by_css_selector("input[type='text']").send_keys("AKT1 (3 mutations)")
driver.find_element_by_css_selector("input[type='text']").send_keys(Keys.RETURN)
elements = driver.find_elements_by_css_selector(".btn-group.btn-group-xs button:first-child")
for element in elements:
    element.click()
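If the buttons are rendered after an asynchronous load, they may not be present the moment the page opens. A minimal sketch (assuming the same page, selector, and search term as above) that swaps the fixed flow for explicit waits could look like this:
# Sketch only: same page, selector, and search term as above, with explicit waits.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get("https://ibl.mdanderson.org/fasmic/#!/")

search = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "input[type='text']")))
search.send_keys("AKT1 (3 mutations)")
search.send_keys(Keys.RETURN)

# Wait until the result buttons exist, then click each one.
buttons = wait.until(EC.presence_of_all_elements_located(
    (By.CSS_SELECTOR, ".btn-group.btn-group-xs button:first-child")))
for index in range(len(buttons)):
    # Re-locate on every iteration in case a click re-renders the table rows.
    driver.find_elements_by_css_selector(
        ".btn-group.btn-group-xs button:first-child")[index].click()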

Related

Web scraping table data with drop-down menu help in either Pandas, Beautiful Soup or Selenium

I am trying to scrape data from this website:
https://www.shanghairanking.com/rankings/grsssd/2021
Initially pandas gets me out of the gate and I can scrape the table, but I am struggling with the drop-down menus. I want to select the options next to the total score box, which are PUB, CIT, etc. When I inspect the element it looks like it may be JavaScript, and the usual methods of iterating over these options don't work. I have tried BeautifulSoup and, most recently, Selenium to select the drop-downs by hand. This works for the default table data:
'''
import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.support.ui import Select
driver = webdriver.Chrome('/Users/martinbell/Downloads/chromedriver')
driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')
submit = driver.find_element_by_xpath("//input[@value='CIT']").click()
'''
Doesn't get me anywhere.
Your code would not work because you first have to click the dropdown open and then traverse the options in it. Here is the refactored code.
Note that I have used time.sleep for demonstration purposes; for robust code and good practice, use an explicit wait such as WebDriverWait.
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')
time.sleep(10)
driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()

# The commented code below loops through all the dropdown options and performs actions on each.
# opt_ele = driver.find_elements(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li")
# for ele in opt_ele:
#     print(ele.text)
#     ele.click()
#     print('perform your actions here')
#     driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()

# If you do not want to loop through but just want to select CIT, here is the line:
driver.find_element(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li[text()='CIT']").click()
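For the explicit-wait version mentioned above, a minimal sketch (same XPaths, WebDriverWait instead of time.sleep) might look like this:
# Sketch only: same XPaths as the refactored code, with explicit waits.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 20)
driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')

# Open the third dropdown, then pick the CIT option once it is clickable.
wait.until(EC.element_to_be_clickable(
    (By.XPATH, "(//*[@class='inputWrapper'])[3]"))).click()
wait.until(EC.element_to_be_clickable(
    (By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li[text()='CIT']"))).click()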

How to select a g-menu-item in selenium (Python)

I am trying to find news articles related to different web searches within a custom time frame. If someone knows a better way to do this than with Selenium, please do tell. I tried to use APIs but they are all expensive.
My Code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome(executable_path=r".\chromedriver.exe")
driver.get("https://www.google.com")
cookie_button = driver.find_element_by_id('L2AGLb')
cookie_button.click()
search_bar = driver.find_element_by_name("q")
search_bar.clear()
search_bar.send_keys("Taylor Swift")
search_bar.send_keys(Keys.RETURN)
news_button = driver.find_element_by_css_selector('[data-hveid="CAEQAw"]')
news_button.click()
tools_button = driver.find_element_by_id('hdtb-tls')
tools_button.click()
recent_button = driver.find_element_by_class_name('KTBKoe')
recent_button.click()
custom_range_button = dashboards_button = driver.find_elements_by_tag_name('g-menu-item')[6]
custom_range_button.click()
driver.close()
I am trying to click the custom range menu item in the Tools menu.
I have tried to select the span, all parent divs, and now the g-menu-item itself; in every case I have received the error:
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable
I understand this can happen when the element is not visible or cannot be clicked, but I also tried adding sleep(2) between each operation. In the browser that Selenium opens I can see it reach that menu, and then it still cannot find the custom range button.
In short:
Do you know how I can click this custom range button?
Or do you know how to scrape a lot of news articles from specific dates (without expensive APIs)?
Those g-menu elements are a special type of tag;
you can locate them using:
//*[name()='g-menu-item']
This XPath will point to all the matching nodes.
Further, I see that you are interested in the 6th item:
(//*[name()='g-menu-item'])[6]
In code:
custom_range_button = driver.find_element_by_xpath("(//*[name()='g-menu-item'])[6]")
custom_range_button.click()
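Since the original error was ElementNotInteractableException, it may also help to wait until the item is actually clickable rather than sleeping. A minimal sketch under that assumption:
# Sketch only: waits for the 6th g-menu-item to become clickable before clicking,
# which usually avoids ElementNotInteractableException on animated menus.
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

custom_range_button = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, "(//*[name()='g-menu-item'])[6]")))
custom_range_button.click()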

Click on dynamic dropdown element with selenium and python

I am trying to click on dropdown options using Selenium and Python. The problem is that the dropdown values are different each time I open the browser, therefore I cannot hard-code the HTML tags from the webpage.
I'm wondering if there's any way of clicking the values?
I tried this line of code, following the answer provided by Sers in How to select/click in a dropdown content using selenium chromewebdriver / python as an example, but I'm not sure how I should change/compose the class here.
att4 = WebDriverWait(driver, 2).until(ec.visibility_of_element_located((
    By.CLASS_NAME, f"div.awsui-select-option awsui-select-option-selectable[title='{l}']"))).click()
l stands for each option in my dropdown; I need to click on each of them.
You are using a wrong, non-stable locator.
Try this:
att4 = WebDriverWait(driver, 2).until(ec.visibility_of_element_located((By.XPATH, "//div[@data-value='Bambino']"))).click()
Or
att4 = WebDriverWait(driver, 2).until(ec.visibility_of_element_located((By.XPATH, "//div[@title='Bambino']"))).click()
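If you need to click every option rather than a single hard-coded one, a minimal sketch built on the same @title locator could look like this; option_titles is a stand-in for the question's l values:
# Sketch only: option_titles is a placeholder for the list of titles the
# question loops over as `l`; the @title locator comes from the answer above.
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

wait = WebDriverWait(driver, 10)
option_titles = ["Bambino"]  # replace with your own values of l
for title in option_titles:
    wait.until(EC.element_to_be_clickable(
        (By.XPATH, f"//div[@title='{title}']"))).click()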

Unable to find element by any means

I'm kind of a newbie to Selenium; I started learning it for my job some time ago. Right now I'm working on code that opens the browser, goes to the specified website, puts a product ID in the search box, searches, and then opens the product. Once the product is open, it needs to extract its name and price and write them to a CSV file. I'm struggling with this a bit.
The main problem right now is that Selenium is unable to open the product after searching for it. I've tried by ID, name, and class, and it still didn't work.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import csv
driver = webdriver.Firefox()
driver.get("https://www.madeiramadeira.com.br/")
assert "Madeira" in driver.title
elem = driver.find_element_by_id("input-autocomplete")
elem.clear()
elem.send_keys("525119")
elem.send_keys(Keys.RETURN)
product_link = driver.find_element_by_id('variant-url').click()
The error I get is usually this:
NoSuchElementException: Message: Unable to locate element: [id="variant-url"]
There are multiple elements with id="variant-url", so you could use an index to click on the desired element. You also need to handle the cookie pop-up, I think. Check the code below; hope it will help.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Disable the notifications window which is displayed on the page
option = Options()
option.add_argument("--disable-notifications")
driver = webdriver.Chrome(r"Driver_Path", chrome_options=option)
driver.get("https://www.madeiramadeira.com.br/")
assert "Madeira" in driver.title
elem = driver.find_element_by_id("input-autocomplete")
elem.clear()
elem.send_keys("525119")
elem.send_keys(Keys.RETURN)
# Click the cookie pop-up button
accept = driver.find_element_by_xpath("//button[@id='lgpd-button-agree']")
accept.click()
Or with an explicit wait:
accept = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//button[@id='lgpd-button-agree']")))
accept.click()
# Here the XPath uses an index like [1], [2] to click on the specific element
product_link = driver.find_element_by_xpath("(//a[@id='variant-url'])[1]")
product_link.click()
Or with an explicit wait:
product_link = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "(//a[@id='variant-url'])[1]")))
product_link.click()
I used to get this error all the time when starting. The problem is that there are probably multiple elements with the same ID "variant-url". There are two ways you can fix this.
By using driver.find_elements_by_id:
product_links = driver.find_elements_by_id('variant-url')
product_links[2].click()
This creates a list of all the elements with id 'variant-url' and then clicks the one at index 2. This works, but it is annoying to find the correct index of the button you want to click, and it also takes a long time if there are many elements with the same ID.
By using XPaths or CSS selectors:
This way is a lot easier, as each element has a specific XPath or selector. It will look like this:
product_link = driver.find_element_by_xpath("XPATH GOES HERE").click()
To get an XPath or selector: 1. open developer mode in your browser by inspecting the element, 2. right-click the element in the F12 menu, 3. hover over Copy, 4. choose Copy XPath.
Hope this helps you :D
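For the name/price extraction and the CSV file mentioned in the question, a rough sketch could look like the following; the two selectors are placeholders, not the site's real ones:
# Sketch only: PRODUCT_NAME_SELECTOR and PRODUCT_PRICE_SELECTOR are placeholders
# to be replaced with selectors copied from the opened product page.
import csv

name = driver.find_element_by_css_selector("PRODUCT_NAME_SELECTOR").text
price = driver.find_element_by_css_selector("PRODUCT_PRICE_SELECTOR").text

# Append one row per product to the output file.
with open("products.csv", "a", newline="", encoding="utf-8") as f:
    csv.writer(f).writerow([name, price])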

Selenium Stops Extracting Option Text After Click Command

I am trying to use Selenium to extract dynamically loaded content. The content is on http://www.afl.com.au/stats
I am attempting to navigate to the 'Players' tab, then obtain a list of all the Seasons available. When I do this from the Teams tab, the following code works:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
driver = webdriver.Chrome(executable_path=r'D:\ChromeDriver\chromedriver.exe')
driver.get('http://www.afl.com.au/stats')
dropdown_menu = Select(driver.find_element_by_xpath('//*[@id="selTeamSeason"]'))
for option in dropdown_menu.options:
    print(option.text)
which gives me a list of all the options available in the Seasons tab.
However, when I click to the 'Players' tab first, I am unable to get the same list with almost identical code:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
import time
driver = webdriver.Chrome(executable_path=r'D:\ChromeDriver\chromedriver.exe')
driver.get('http://www.afl.com.au/stats')
driver.find_element_by_xpath('//*[@id="stats_tab"]/ul/li[2]').click()
time.sleep(3)
dropdown_menu = Select(driver.find_element_by_xpath('//*[@id="selTeamSeason"]'))
for option in dropdown_menu.options:
    print(option.text)
The click successfully executes, I wait for the content to update, but instead of printing all the years (2001 to 2018), Selenium prints 18 instances of empty strings. I am thoroughly stumped. Any help at all would be appreciated.
In your first case, the locator (//*[@id="selTeamSeason"]) points to the season dropdown of the Teams tab, and the page has only one matching node at that time, so it works for you.
But in the second case there are 2 matching nodes for the same locator, and Selenium automatically picks the first one (a hidden element in your case).
So try to build a unique XPath that works in both tabs.
You can try the //div[@id='stats-player-stats']//select[@id='selTeamSeason'] locator for the season dropdown on the Players tab and //div[@id='stats-team-stats']//select[@id='selTeamSeason'] for the season dropdown on the Teams tab.
Hope this will work for you.
Instead of using Select, just find the element by XPath and get all the option tags as below; tested and it works.
element = driver.find_element_by_xpath('//*[@id="selTeamSeason"]')
all_options = element.find_elements_by_tag_name("option")
for option in all_options:
    print(option.text)
In your first attempt the following locator strategy worked on the default TEAMS tab:
dropdown_menu = Select(driver.find_element_by_xpath('//*[@id="selTeamSeason"]'))
because the first match for that locator was the dropdown itself, which is not the case on the PLAYERS tab. On the PLAYERS tab, to list all the options available in the Seasons dropdown you can use the following code block:
dropdown_menu = Select(driver.find_element_by_xpath("//div[#id='stats-player-stats']//select[#id='selTeamSeason']".replace("#", "@")))
for option in dropdown_menu.options:
    print(option.text)
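If a fixed time.sleep still leaves the option text empty, another option (a sketch, reusing the Players-tab XPath from the answer above) is to wait until the dropdown actually has non-empty option text before reading it:
# Sketch only: reuses the Players-tab XPath above and waits until at least
# one <option> has non-empty text before printing them.
from selenium.webdriver.support.ui import Select, WebDriverWait

PLAYERS_SEASON_XPATH = "//div[@id='stats-player-stats']//select[@id='selTeamSeason']"

def season_options_loaded(drv):
    select = Select(drv.find_element_by_xpath(PLAYERS_SEASON_XPATH))
    return any(opt.text.strip() for opt in select.options)

WebDriverWait(driver, 10).until(season_options_loaded)
for option in Select(driver.find_element_by_xpath(PLAYERS_SEASON_XPATH)).options:
    print(option.text)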
