I was writing a web scraper in Python using Selenium on a Google Form page, and I want to be able to select a particular radio button. I tried selecting all the buttons at once, so it ended up selecting the last one, as expected. Stepping through the elements by a fixed multiple is also not possible, as each form contains a different number of buttons.
So I want a way to get the number of radio buttons in a group and a way to select the desired one.
Test site
If there are any other suggestions, I will be happy to hear them.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
import Resources
from selenium.webdriver.support.select import Select
driver = webdriver.Chrome(executable_path= Resources.driverChrome)
driver.maximize_window()
driver.get(Resources.linkTest)
time.sleep(3)
try:
    email = driver.find_element_by_xpath("//input[@type='email']")
    if email.is_displayed() and email.is_enabled():
        email.send_keys(Resources.emailTest)
except:
    print("Email box was not found")

containers = driver.find_elements_by_xpath("//div[@class='freebirdFormviewerViewNumberedItemContainer']")
sNoBoxes = driver.find_elements_by_xpath("//input[@type='text']")
time.sleep(2)

radios = driver.find_elements_by_xpath("//div[@class='appsMaterialWizToggleRadiogroupOffRadio exportOuterCircle']")
for radio in radios:
    radio.click()

for sNo, container in enumerate(containers):
    print("\n\n" + str(sNo) + " " + container.text)

button = driver.find_element_by_xpath("//span[contains(text(),'Submit')]")
button.click()
time.sleep(3)

try:
    driver.find_element_by_xpath("//span[contains(text(),'View score')]").click()
except:
    pass
First, you can find all radio buttons on the webpage with the following code:
radiobuttons = driver.find_elements_by_class_name("appsMaterialWizToggleRadiogroupElContainer")
Result: a list of radio button elements
Next, you can count the radio buttons. Note that this list is global for the whole page; the radio button "Option 2" is the 11th button, not the 2nd. If you want to click the very first radio button, use:
radiobuttons[0].click()
Result: the radio button is clicked
To click the n-th radio button, use:
radiobuttons[n-1].click()
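Since the question also asks for the number of radio buttons in a group, here is a minimal sketch that scopes the search to each question container instead of the whole page. It reuses the class names from the question's own XPaths (Google may change these at any time) and the same find_element_by_* API used above:
containers = driver.find_elements_by_xpath("//div[@class='freebirdFormviewerViewNumberedItemContainer']")
for i, container in enumerate(containers):
    # The leading "." keeps the search relative to this container,
    # so only the radio buttons of this one question are returned.
    radios = container.find_elements_by_xpath(".//div[@class='appsMaterialWizToggleRadiogroupOffRadio exportOuterCircle']")
    print("Question", i + 1, "has", len(radios), "radio buttons")
    choice = 1  # index of the desired option within this group (0-based)
    if choice < len(radios):
        radios[choice].click()
Result: each group is counted separately, and the chosen option within each group is clicked.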
Related
I am building a web scraper that has to try a combination of multiple drop-down menu options and gather data from each combination.
So basically there're 5 drop-downs. I have to gather data from all of the possible combinations of the drop-down options. For each combination, I have to press a button to pull up the page with all the data on it. I am storing all the data in a dictionary.
This is the website: http://siops.datasus.gov.br/filtro_rel_ges_covid_municipal.php?S=1&UF=12;&Municipio=120001;&Ano=2020&Periodo=20
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
# General Stuff about the website
path = '/Users/admin/desktop/projects/scraper/chromedriver'
options = Options()
options.headless = True
options.add_argument("--window-size=1920,1200")
driver = webdriver.Chrome(options=options, executable_path=path)
website = 'http://siops.datasus.gov.br/filtro_rel_ges_covid_municipal.php'
driver.get(website)
# Initial Test: printing the title
print(driver.title)
print()
# Dictionary to Store stuff in
totals = {}
# Drop Down Menus
year_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbAno"]'))
uf_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbUF"]'))
### THIS IS WHERE THE ERROR IS OCCURING ###
# Choose from the drop down menus
uf_select.select_by_value('29')
year_select.select_by_value('2020')
# Submit button on the page
submit_button = driver.find_element(By.XPATH, '//*[@id="container"]/div[2]/form/div[2]/div/input[2]')
submit_button.click()
# Pulling data from the webpage
nameof = driver.find_element(By.XPATH, '//*[@id="arearelatorio"]/div[1]/div/table[1]/tbody/tr[2]').text
total_balance = driver.find_element(By.XPATH, '//*[@id="arearelatorio"]/div[1]/div/table[3]/tbody/tr[9]/td[2]').text
paid_expenses = driver.find_element(By.XPATH, '//*[@id="arearelatorio"]/div[1]/div/table[4]/tbody/tr[11]/td[4]').text
# Update Dictionary with the new info
totals.update({nameof: [total_balance, paid_expenses]})
totals.update({'this is a test': ['testing stuff']})
# Print the final Dictionary and quit
print(totals)
driver.quit()
For some reason, this code does not work when trying one possible combination (selecting value 29 from the UF drop-down and value 2020 from the year drop-down). If I comment out the two drop-down selections, it works perfectly fine.
How do I try multiple combinations of drop-down options during a single iteration?
Try this instead:
# Drop Down Menus
### THIS IS WHERE THE ERROR IS OCCURING ###
# Choose from the drop down menus
uf_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbUF"]'))
uf_select.select_by_value('29')
year_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbAno"]'))
year_select.select_by_value('2020')
This works for me. With your example I get a stale element reference error, which means the element reference is lost. I haven't checked, but maybe the select is somehow updated and loses its reference when the other one is selected.
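To try every combination, a minimal sketch along these lines should work. It assumes the same URL and XPaths as the question, shows only two of the five drop-downs, and re-locates the Select elements on every iteration precisely to avoid the stale reference problem described above:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select

driver = webdriver.Chrome()
website = 'http://siops.datasus.gov.br/filtro_rel_ges_covid_municipal.php'
driver.get(website)

totals = {}

# Read the available option values once, up front (skipping empty placeholder options).
uf_values = [o.get_attribute('value') for o in Select(driver.find_element(By.XPATH, '//*[@id="cmbUF"]')).options if o.get_attribute('value')]
year_values = [o.get_attribute('value') for o in Select(driver.find_element(By.XPATH, '//*[@id="cmbAno"]')).options if o.get_attribute('value')]

for uf in uf_values:
    for year in year_values:
        # Re-open the form and re-locate the <select> elements each time,
        # so a stale reference is never reused after the page changes.
        driver.get(website)
        Select(driver.find_element(By.XPATH, '//*[@id="cmbUF"]')).select_by_value(uf)
        Select(driver.find_element(By.XPATH, '//*[@id="cmbAno"]')).select_by_value(year)
        driver.find_element(By.XPATH, '//*[@id="container"]/div[2]/form/div[2]/div/input[2]').click()
        # ... scrape the report and update totals here, as in the original script ...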
I am new to web scraping. I am trying to scrape a website (for flight prices) using Python and Selenium; the site has drop-down menus for fields like departure city and arrival city. The first option listed should be selected and used.
Let's say the departure city field is a button; on clicking that button, a new HTML page is loaded with an input box and a list, where every list element contains a button.
While debugging I found that once my keyword is entered, the options loading screen appears, but the options are never loaded, even after increasing the sleep time (I have attached an image of this).
image with error
Manually there are no issues with the drop down menu. As soon as I enter my departure city, options will be loaded and I choose the respective location.
I also tried to use ActionChains, unfortunately with no luck. I am attaching my code below. Any help is appreciated.
# Manually trying to access the first element:
# Accessing the input field:
fly_from = browser.find_element("xpath", "//button[contains(., 'Departing')]").click()
element = WebDriverWait(browser, 10).until(ec.presence_of_element_located((By.ID, "location")))
# Sending the keyword:
element.send_keys("Los Angeles")
# Selecting the first element (XPath indices start at 1, so li[1] is the first item):
first_elem = browser.find_element("xpath", "//div[@class='classname default-padding']//div//ul//li[1]//button")
Using ActionChains:
fly_from = browser.find_element("xpath", "//button[contains(., 'Departing')]").click()
time.sleep(10)
element = WebDriverWait(browser, 10).until(ec.presence_of_element_located((By.ID, "location")))
element.send_keys("Los Angeles")
time.sleep(3)
list_elem = browser.find_element("xpath", "//div[@class='class name default-padding']//div//ul//li[1]//button")
size = element.size
action = ActionChains(browser)
action.move_to_element_with_offset(to_element=list_elem, xoffset=int(0.5 * size['width']), yoffset=70).click().perform()
action.click(on_element=list_elem)
action.perform()
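As a rough sketch (untested, and reusing the class name and structure from the question's own XPath, which may need adjusting for the real page), waiting until the first suggestion button is actually clickable before picking it could look like this; note again that XPath positions are 1-based, so the first list item is li[1], not li[0]:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec

# Open the "Departing" picker and type the search keyword.
browser.find_element("xpath", "//button[contains(., 'Departing')]").click()
location = WebDriverWait(browser, 10).until(ec.presence_of_element_located((By.ID, "location")))
location.send_keys("Los Angeles")

# Wait until the first suggestion button is rendered and clickable, then click it.
first_option = WebDriverWait(browser, 10).until(
    ec.element_to_be_clickable((By.XPATH, "//div[@class='classname default-padding']//ul//li[1]//button")))
first_option.click()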
I want to store this site as text file:
https://www.tradingview.com/symbols/BTCUSD/technicals/
But I have a hard time since its content depends on "buttons" that generate the content dynamically.
I wrote this code:
import requests
from bs4 import BeautifulSoup
results = requests.get("https://www.tradingview.com/symbols/BTCUSD/technicals/")
src = results.content
soup = BeautifulSoup(src, "html.parser")
var = soup.find_all()
with open('readme.txt', 'w') as f:
    f.write(str(var))
But it appears it only fetches the source code, not the dynamically generated content. E.g. when clicking the "1 minute" button, the content changes. I want some sort of "snapshot" of the content "1 minute" generates, and then later search it for the keywords "buy" or "sell".
Really stuck on the first step of fetching the website's dynamic content... Can anyone help?
If you want to be able to get different content based on which buttons you click, you'll want to use Selenium instead of BeautifulSoup.
Here's a rough example which uses the page you provided:
from selenium import webdriver
from selenium.webdriver.common.by import By
import os
driver_path = os.path.abspath('chromedriver')
driver = webdriver.Chrome(driver_path)
driver.get('https://www.tradingview.com/symbols/BTCUSD/technicals/')
driver.implicitly_wait(10)
# Get the '1 minute' button element
btn_1min = driver.find_element(By.XPATH, '//*[@id="1m"]')
# Get the '15 minutes' button element
btn_15min = driver.find_element(By.XPATH, '//*[@id="15m"]')
# Decide which button to click on based on user input
choice = ''
while True:
    time_input = input('Choose which button to click:\n> ')
    if time_input == '1':
        btn_1min.click()  # Click the '1 minute' button
        choice = '1 min'
        break
    elif time_input == '15':
        btn_15min.click()  # Click the '15 minute' button
        choice = '15 min'
        break
    else:
        print("Invalid input - try again")
# Get the 'Summary' section of the selected page
summary = driver.find_element(By.XPATH, '//*[@id="technicals-root"]/div/div/div[2]/div[2]')
# Get the element containing the signal ("Buy", "Sell", etc.)
signal = summary.find_element(By.XPATH, '//*[@id="technicals-root"]/div/div/div[2]/div[2]/span[2]')
# Get the current text of the signal element
signal_txt = signal.text
print(f'{choice} Summary: {signal_txt}')
driver.quit()
This script gives us the option to choose which button to click via user input, and prints different content based on which button was chosen.
For example: If we input '1' when prompted, thereby telling the script to click the "1 minute" button, we get the signal currently displayed in the Summary section of this button's page, which in this case happens to be "BUY":
Choose which button to click:
> 1
1 min Summary: BUY
If you want to read up on how to use Selenium with Python, here's the documentation: https://selenium-python.readthedocs.io
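Since the original goal was to save the result to a text file and then look for the keywords "buy" or "sell", a small addition at the end of the script above could do that. This just continues the script and reuses its choice and signal_txt variables:
# Save the scraped signal to a text file, as in the original attempt.
with open('readme.txt', 'w') as f:
    f.write(f'{choice} Summary: {signal_txt}\n')

# Simple keyword check on the signal text.
if 'BUY' in signal_txt.upper():
    print('Keyword found: buy')
elif 'SELL' in signal_txt.upper():
    print('Keyword found: sell')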
I am a total noob trying to put together code that will do the things below.
1.) Log in with my credentials to the page https://www.alza.sk/EN/
2.) Refreshes the page in a specific interval, for example "60 seconds"
3.) Checks for the "In stock" text on any product displayed, or on a specific product that is opened in one browser tab.
4.) Clicks the "Add to cart" button beside the specific product opened in that browser tab and continues to the shopping cart by clicking the "Proceed to Checkout" button, then clicks the "Continue" button. If a popup window appears with the two buttons "Do not add anything" and "Add the selected items to your cart", it clicks the first button, "Do not add anything". On the next page it selects the "Bratislava - main shop" checkbox and then clicks the "Confirm your selection" button in the popup window. Afterwards it clicks the "All payments option" section so all options are displayed and selects the "Cash / Card (when collected)" checkbox. After that it clicks the "Continue" button. This is the step where I need to log in by entering my credentials, so I'll update this later today, but I suppose it would be best to log in in step 1.), as I described.
This is all I could get together so far, but I am unable to add everything I described so that it works.
from selenium import webdriver
import time
import urllib
driver = webdriver.Chrome()
driver.get("https://www.alza.sk/EN/")
while True:
    driver.refresh()
    time.sleep(60)
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()
driver.get("input url here")

expected_text = "Available"
found = False

while not found:
    # Re-read the button text on every pass, so the loop can actually end.
    actual_text = driver.find_element(By.XPATH, '//button[text()="Available"]').text
    if actual_text == expected_text:
        found = True
    else:
        driver.refresh()
        time.sleep(60)  # input time interval here

driver.find_element(By.XPATH, '//button[text()="Available"]').click()
# add other steps
This is just a suggestion on how to proceed with the script. If you can update the question with the URL and proper code, it will be easier to give a proper solution.
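One small variation on the sketch above (same placeholder URL and button text): using find_elements (plural) avoids the NoSuchElementException that find_element raises while the button is still missing, so the loop can keep refreshing until the product actually shows up:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()
driver.get("input url here")

while True:
    # find_elements returns an empty list (instead of raising) when nothing matches.
    buttons = driver.find_elements(By.XPATH, '//button[text()="Available"]')
    if buttons:
        buttons[0].click()
        break
    time.sleep(60)  # input time interval here
    driver.refresh()
# add other steps (cart, checkout, ...)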
I have a problem scraping data from the following site: https://arcc.sdcounty.ca.gov/Pages/Assessors-Roll-Tax.aspx.
I have to do these steps in order:
Select a drop down option "Street Address'
Enter a street address into a text field (ie 43 Hadar Dr)
Click the 'Submit' button.
After clicking submit, I should be directed to a page that has the APN number for a given address.
The problem:
I am able to do the above steps. However, when I select a drop-down option and then type the address into the textbox, it fails: the address in the textbox is cleared for some reason before 'Submit' is clicked, but ONLY when I have selected a drop-down option.
I have tried using Selenium's expected conditions to trigger the input in the text box after a drop-down option has been selected, but that did nothing. I am looking for any help identifying why this problem occurs, as well as any advice on solutions.
Thanks. Much appreciated.
My code:
driver = webdriver.Chrome()
driver.get('https://arcc.sdcounty.ca.gov/Pages/Assessors-Roll-Tax.aspx')
#Selects drop down option ('Street Address')
mySelect = Select(driver.find_element_by_id("ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ctl00_ddlSearch"))
my=mySelect.select_by_value('0')
wait = WebDriverWait(driver,300)
#Enter address in text box to left of drop down
driver.find_element_by_id("ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ct l00_txtSearch").send_keys("11493 hadar dr")
#Click 'Submit' button to return API numbers associated with address
driver.find_element_by_id("ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ctl00_btnSearch").click()
driver.quit()
Just changed a few things in your code to make it work.
mySelect = Select(driver.find_element_by_id("ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ctl00_ddlSearch"))
To find_element_by_name(...):
mySelect = Select(driver.find_element_by_name("ctl00$ctl43$g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12$ctl00$ddlSearch"))
And
my=mySelect.select_by_value('0')
To select_by_visible_text('...'):
my = mySelect.select_by_visible_text("Street Address")
And
driver.find_element_by_id("ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ct l00_txtSearch").send_keys("11493 hadar dr")
To find_element_by_xpath(...), since I usually get better results when finding elements by xpath.
driver.find_element_by_xpath('//*[@id="ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ctl00_txtSearch"]').send_keys("11493 hadar dr")
This is how it all looks:
from selenium.webdriver.support.ui import WebDriverWait
from selenium import webdriver
from selenium.webdriver.support.ui import Select
driver = webdriver.Chrome()
driver.get('https://arcc.sdcounty.ca.gov/Pages/Assessors-Roll-Tax.aspx')
#Selects drop down option ('Street Address')
mySelect = Select(driver.find_element_by_name("ctl00$ctl43$g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12$ctl00$ddlSearch"))
my = mySelect.select_by_visible_text("Street Address")
wait = WebDriverWait(driver,300)
#Enter address in text box to left of drop down
driver.find_element_by_xpath('//*[@id="ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ctl00_txtSearch"]').send_keys("11493 hadar dr")
#Click 'Submit' button to return API numbers associated with address
driver.find_element_by_id("ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ctl00_btnSearch").click()
driver.quit()
Not sure if this is your situation, but one thing that jumped out from your question is the text box input... Often, when filling in a website text box, even though the text is clearly visible, the value is not actually registered by the page until the focus (cursor) is clicked or tabbed out and away from the text box.
Tabbing the text cursor out of the text entry box first, before 'clicking submit', will often solve this issue.
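As a minimal sketch combining this suggestion with the locators from the answer above (same element IDs, untested against the live page), tabbing out could look like this:
from selenium.webdriver.common.keys import Keys

# ... after selecting "Street Address" in the drop-down, as in the answer above ...
search_box = driver.find_element_by_xpath('//*[@id="ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ctl00_txtSearch"]')
search_box.send_keys("11493 hadar dr")

# Tab out of the text box so the page registers the typed value before submitting.
search_box.send_keys(Keys.TAB)

driver.find_element_by_id("ctl00_ctl43_g_d30f33ca_a5a7_4f69_bb21_cd4abc25ea12_ctl00_btnSearch").click()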