How to click a label without id with Selenium - python

I'm testing with this site here any https://www.nike.com.br/cosmic-unity-153-169-211-324680
And I'm trying after a few seconds that the page loads you must select the size and I can't select the size automatically with Selenium. Can someone help me?
Look, when it appears for you to select the size of the sneaker, I'm in Brazil and I select the size 40 of the sneaker, only if you inspect the "40" you will see that it is a label, and this label has no id, this label is the following html code snippet:
<label for="tamanho__id40">40</label>
How could I click on this label in Selenium?
I currently have this code:
import datetime
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.desired_capabilities
import DesiredCapabilities from selenium.webdriver.support.ui
import WebDriverWait from selenium.webdriver.common.by
import By from selenium.webdriver.support
import expected_conditions as EC
import time
option = Options()
prefs = {'profile.default_content_setting_values': {'images': 2}}
option.add_experimental_option('prefs', prefs)
driver = webdriver.Chrome(options = option)
# Navigate to url
driver.get"https://www.nike.com.br/cosmic-unity-153-169-211-324680")
What would I have to add to be able to click on this label that has no id?

1 You need to accept cookies
2 Use Selenium's explicit waits. To use them you will need to import:
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
3 Use reliable locators. I propose using this xpath locator for 40 shoe size: //label[#for="tamanho__id40"]
4 I added some chrome_options for dealing with this site.
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--disable-blink-features")
chrome_options.add_argument("--disable-blink-features=AutomationControlled")
chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36")
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(executable_path='/snap/bin/chromium.chromedriver', options=chrome_options)
driver.get("https://www.nike.com.br/cosmic-unity-153-169-211-324680")
wait = WebDriverWait(driver, 15)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '.cc-allow'))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, '//label[#for="tamanho__id40"]'))).click()

Related

Selecting page number from dropdown using Selenium python

I am attempting to change the page number to page 2 from the default of page 1 on https://stats.oecd.org/Index.aspx?DataSetCode=REVDEU
Currently my code opens the page, and seemingly does nothing, eventually timing out.
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select
driver = webdriver.Chrome()
driver.get("https://stats.oecd.org/Index.aspx?DataSetCode=REVDEU")
select = Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable(
(By.XPATH, "//select[starts-with(#id, 'PAGE')][starts-with(#name,'PAGE')]"))))
select.select_by_visible_text('2')
You are useing the Select not correctly.
This worked for me:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.select import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service('C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(options=options, service=webdriver_service)
wait = WebDriverWait(driver, 5)
url = "https://stats.oecd.org/Index.aspx?DataSetCode=REVDEU"
driver.get(url)
select = Select(driver.find_element(By.ID, 'PAGE'))
select.select_by_value('2')
The result is

Closing an ad button in a website using selenium

I was working on a web scraping project with Selenium and was trying to scrape news from the site https://www.businesstimes.com.sg/government-economy.
But whenever I open the site with the selenium automated chrome window, one ad comes up in a popup which I want to close.
import selenium
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as LM
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.support.ui import Select
import pandas as pd
import time
options = webdriver.ChromeOptions()
"""options.add_argument("enable-automation")
options.add_argument("--headless")"""
lists = ['disable-popup-blocking']
caps = DesiredCapabilities().CHROME
caps["pageLoadStrategy"] = "normal"
options.add_argument("--window-size=1920,1080")
options.add_argument("--disable-extensions")
options.add_argument("--disable-notifications")
options.add_argument("--disable-Advertisement")
options.add_argument("--disable-popup-blocking")
driver = webdriver.Chrome(executable_path= r"E:\chromedriver\chromedriver.exe", options=options, desired_capabilities=caps) #paste your own choromedriver path
driver.get('https://www.businesstimes.com.sg/government-economy')
I tried two methods, one was by the xpath method and one by the CSS selector method, but both failed.
#1st method
driver.find_element(By.CSS_SELECTOR, 'button[data-id="pclose-btn"]').click()
#2nd method
driver.find_element_by_xpath("//div[#class='bz-el bz-pclose-btn knd-BUTTON']").click()
Please help me with this. Thank you!
Element you trying to access is inside nested iframe.
This should work:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service('C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(service=webdriver_service, options=options)
url = 'https://www.businesstimes.com.sg/government-economy'
driver.get(url)
wait = WebDriverWait(driver, 20)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[id*='prestitial']")))
wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe")))
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[data-id='pclose-btn']"))).click()

Selenium sometimes clicking button, sometimes not without raising any errors [duplicate]

I need some help.
There is URL: https://www.inipec.gov.it/cerca-pec/-/pecs/companies.
I need to click checkbox Captcha:
My code is look like:
import os, urllib.request, requests, datetime, time, random, ssl, json, codecs, csv, urllib
from urllib.request import Request, urlopen
from urllib.request import urlretrieve
from datetime import datetime
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoAlertPresentException
from selenium.webdriver.chrome.options import Options
chromedriver = "chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
chrome_options = webdriver.ChromeOptions()
driver = webdriver.Chrome(executable_path=chromedriver, chrome_options=chrome_options)
driver.get("https://www.inipec.gov.it/cerca-pec/-/pecs/companies")
driver.switch_to_default_content()
element = driver.find_elements_by_css_selector('iframe')[1]
driver.switch_to_frame(element)
driver.find_elements_by_xpath('//*[#id="recaptcha-anchor"]/div[1]').click()
During the execution, there is an error:
driver.find_elements_by_xpath('//*[#id="recaptcha-anchor"]/div1').click()
AttributeError: 'list' object has no attribute 'click'
Please, help to fix it.
Solution update (11-Feb-2020)
Using the following set of binaries:
Selenium v3.141.0
ChromeDriver v80.0
Chrome Version 80.0
You can use the following updated block of code as a solution:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("https://www.inipec.gov.it/cerca-pec/-/pecs/companies")
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[name^='a-'][src^='https://www.google.com/recaptcha/api2/anchor?']")))
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//span[#id='recaptcha-anchor']"))).click()
Original solution
Within the URL https://www.inipec.gov.it/cerca-pec/-/pecs/companies to invoke click() on the reCAPTCHA checkbox you need to:
Induce WebDriverWait for the desired frame to be available and switch to it.
Induce WebDriverWait for the desired element to be clickable.
You can use the following solution:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver = webdriver.Chrome(executable_path=r'C:\WebDrivers\chromedriver.exe', chrome_options=options)
driver.get("https://www.inipec.gov.it/cerca-pec/-/pecs/companies")
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[name^='a-'][src^='https://www.google.com/recaptcha/api2/anchor?']")))
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//span[#class='recaptcha-checkbox goog-inline-block recaptcha-checkbox-unchecked rc-anchor-checkbox']/div[#class='recaptcha-checkbox-checkmark']"))).click()
I resolved this, you can try this with your landing website url.
from selenium import webdriver
from selenium.webdriver.support.select import Select
from selenium.common.exceptions import SessionNotCreatedException
options = webdriver.ChromeOptions()
prefs = {"download.default_directory": download_dir}
options.add_experimental_option("prefs", prefs)
options.add_argument("--no-sandbox")
driver = webdriver.Chrome("/usr/bin/chromedriver", chrome_options = options)
driver.get("https://www.google.com/recaptcha/api2/demo")
driver.maximize_window()
price = driver.find_element_by_xpath("//div[#class='g-recaptcha']")
price_content = price.get_attribute('innerHTML')
start = str(price_content).find(";k=")+len(";k=")
end = str(price_content).find("&co")
driver.implicitly_wait(20)
driver.execute_script("document.getElementById('g-recaptcha-response').style.display = '';")
recaptcha_text_area = driver.find_element_by_id("g-recaptcha-response")
recaptcha_text_area.clear()
recaptcha_text_area.send_keys(price_content[start:end])
#.....................................................................................
button = driver.find_element_by_id("recaptcha-demo-submit")

extracting csv download link from an webpage using python

I want to extract the CSV download URL from website - https://www.nseindia.com/option-chain
enter image description here
Code I used till now
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
s = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=s)
driver.get("https://www.nseindia.com/option-chain")
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.ID,
"equity_underlyingVal")))
nifty = (driver.find_element(By.XPATH, '//*
[#id="equity_underlyingVal"]').text).replace('NIFTY ',
'').replace(',','')
time_stamp = driver.find_element(By.XPATH, '//*
[#id="equity_timeStamp"]').text
I need the csv link to be load in pandas df. I dont want to use selenium or if using selenium, I need it as headless. Let me know if anyone has a better idea about extracting data directly into pandas datafream..
You can extract the downloading link contained in that element with Selenium as following:
link = driver.find_element(By.CSS_SELECTOR, '#downloadOCTable').get_attribute("href")
As the download link is not present in the href attribute, the best approach is to download the csv file.
Interacting in headless mode can cause problems if the window-size argument is not specified, and a workaround to download files in headless mode is to specify the download path using the driver.command_executor method.
Code snippet to download csv in headless mode-
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import os
options = Options()
#add necessary arguments
options.add_argument("user-agent= Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36")
options.add_argument("--window-size=1920,1080")
options.add_argument("--headless")
driver = webdriver.Chrome(ChromeDriverManager().install(),options=options)
driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
#set download path (set to current working directory in this example)
params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow','downloadPath':os.getcwd()}}
command_result = driver.execute("send_command", params)
driver.get("https://www.nseindia.com/option-chain")
#wait for table details to appear
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, '//*[#id="equity_optionChainTable"]')))
#find and click on download csv button
download_button=driver.find_element_by_xpath('//*[#id="downloadOCTable"]')
download_button.click()

There is an error in registering on the site using webdriver Selenium Python

I am trying to register a new account for this site. However, I cannot register there because of an error or block for ChromeDriver (selenium Python).
I am using this code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import random
import string
from time import sleep
from selenium.webdriver.common.keys import Keys
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'
options = webdriver.ChromeOptions()
options.add_argument(f'user-agent={user_agent}')
options.add_argument('disable-infobars')
options.add_argument('--profile-directory=Default')
options.add_argument("--incognito")
options.add_argument("--disable-plugins-discovery")
options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors", "safebrowsing-disable-download-protection", "safebrowsing-disable-auto-update", "disable-client-side-phishing-detection"])
options.add_argument('--disable-extensions')
options.add_argument("start-maximized")
options.add_argument("--disable-blink-features")
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'chromedriver1.exe')
driver.get('https://www.nordstrom.com/signin?cm_sp=SI_SP_A-_-SI_SP_B-_-SI_SP_C&origin=tab&ReturnURL=https%3A%2F%2Fwww.nordstrom.com%2F')
def email(stringLength=8):
letters = string.ascii_lowercase
return ''.join(random.choice(letters) for i in range(stringLength))
email = email(6) + "#gmail.com"
sleep(5)
# email
driver.find_element_by_name("email").send_keys(email)
sleep(5)
# next
driver.find_element_by_id('account-check-next-button').send_keys(Keys.ENTER)
I think the website is blocking WebDriver. When I use Chrome in my computer, I don't encounter any problems, but using ChromeDriver, this is the issue I receive.
Open form use:
driver.get('https://www.nordstrom.com/signin')
Not
driver.get('https://www.nordstrom.com/signin?cm_sp=SI_SP_A-_-SI_SP_B-_-SI_SP_C&origin=tab&ReturnURL=https%3A%2F%2Fwww.nordstrom.com%2F')
Try to use explicit wait before click:
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
wait = WebDriverWait(driver, 30)
wait.until(EC.element_to_be_clickable(
(By.CSS_SELECTOR, 'button[alt="next button"]')))
btn = driver.find_element_by_css_selector('button[alt="next button"]')
btn.click()
It must work because I reproduced your error. The problem is that:
You should use click() for clicking, not send_keys(Keys.ENTER) In your case you click Enter before email is completely input, just before # symbol.
Your time.sleep(5) is not enough. Use explicit wait. Or, in the works case increase your sleep (if you don't case about the speed)
Read here how to use Selenium's wait instead on time.sleep()
Update:
Clear all cookies, cache, application data for this site and try again. First manually. Looks like it block users after some unsuccessful attempts. Even valid emails.
UPDATE:
Unfortunately, Nordstrom is blocking automated requests...
Nordstrom is tracking all customers actions, so it's unlikely to be used for testing. I would suggest to try other sites to save your time.

Categories

Resources