How to click the button to crawl the reviews using python

How to click the button to crawl the reviews using python - python

I am trying to crawl the reviews on this website: https://www.bol.com/nl/p/Matras-140x200-7-zones-koudschuim-premium-plus-tijk-15-cm-medium/9200000118425897/.
However, I have to click a button ( Toon meer) to show all the reviews.
<div class="load-more load-more--divider load-more--reviews js-review-load-more-container">
<a data-href="/nl/rnwy/productPage/reviews?productId=9200000118425896&offset=5&limit=10&loadMore=true" class="review-load-more__button js-review-load-more-button" data-test="review-load-more"><div class="css-loader css-loader--reviews"></div>
Toon meer</a>
</div>
I use the below code :
import requests
import pandas as pd
from selenium import webdriver
from bs4 import BeautifulSoup
from datetime import datetime
start_time = datetime.now()
data = []
link = "https://www.bol.com/nl/p/Matras-140x200-7-zones-koudschuim-premium-plus-tijk-15-cm-medium/9200000118425897/"
op = webdriver.ChromeOptions()
op.add_argument('--ignore-certificate-errors')
op.add_argument('--incognito')
op.add_argument('--headless')
driver = webdriver.Chrome(executable_path='D:/Desktop/work/real/chromedriver.exe',options=op)
driver.get(link)
driver.find_element_by_css_selector('div.review-load-more__button js-review-load-more-button').click()
However, it throws an error:
No such element: Unable to locate element: {"method":"css selector","selector":"div.review-load-more__button js-review-load-more-button"} .
Is there any solution?

Css selectors cannot select an element by containing text.
Try using xpath. The last line of your script should look something like:
wait = WebDriverWait(driver, 10)
wait.until(expected_conditions.element_to_be_clickable((By.XPATH, "//a[contains(., 'Toon meer')]")).click()

To click on Toon meer you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
driver.get('https://www.bol.com/nl/p/Matras-140x200-7-zones-koudschuim-premium-plus-tijk-15-cm-medium/9200000118425896/')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[data-test='consent-modal-confirm-btn']>span"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.review-load-more__button.js-review-load-more-button"))).click()
Using XPATH:
driver.get('https://www.bol.com/nl/p/Matras-140x200-7-zones-koudschuim-premium-plus-tijk-15-cm-medium/9200000118425896/')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[#data-test='consent-modal-confirm-btn']/span"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[#class='review-load-more__button js-review-load-more-button' and contains(., 'Toon meer')]"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

When you get the page a popup comes with an accept button click it and then proceed with clicking your element.
driver.get('https://www.bol.com/nl/p/Matras-140x200-7-zones-koudschuim-premium-plus-tijk-15-cm-medium/9200000118425896/')
wait=WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[#class='js-confirm-button']"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, "//a[#data-test='review-load-more']"))).click()
Import
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Related

Can't find out xpath with selenium in jobsite.co.uk website

I want to find out the "Accept All" button xpath for click accept cookies.
Code trials:
from ast import Pass
import time
from selenium import webdriver
driver = driver = webdriver.Chrome(executable_path=r'C:\Users\Nahid\Desktop\Python_code\Jobsite\chromedriver.exe') # Optional argument, if not specified will search path.
driver.get('http://jobsite.co.uk/')
driver.maximize_window()
time.sleep(1)
#find out XPath in div tag but there has another span tag
cookie = driver.find_element_by_xpath('//div[#class="privacy-prompt-button primary-button ccmgt_accept_button "]/span')
cookie.click()

The desired element:
<div id="ccmgt_explicit_accept" class="privacy-prompt-button primary-button ccmgt_accept_button ">
<span>Accept All</span>
</div>
is a <span> tag having an ancestor <div>.
Solution
To click on the clickable element you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following locator strategies:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.privacy-prompt-button.primary-button.ccmgt_accept_button>span"))).click()
Using XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[text()='Accept All']"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Your XPath looks correct but if can be improved.
Also you should use WebDriverWait expected conditions instead of hardcoded sleeps.
As following:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("--start-maximized")
s = Service('C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(options=options, service=s)
url = 'http://jobsite.co.uk/'
wait = WebDriverWait(driver, 10)
driver.get(url)
wait.until(EC.element_to_be_clickable((By.ID, "ccmgt_explicit_accept"))).click()

How to click on the play button with selenuim python

I'm trying to click the play button on this website
Here is my code below
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
import time
browser = webdriver.Chrome()
browser.get("https://audiomack.com/iamrealtraffic")
time.sleep(10)
a = driver.find_element_by_xpath('//path[#d="M70.79 103c-.988 0-1.79-.802-1.79-1.79V69.866c0-.99.802-1.79 1.79-1.79l29.104 16.118s1.344 1.343 0 2.686C98.55 88.225 70.79 103 70.79 103"]')
webdriver.ActionChains(browser).move_to_element(a).click(a).perform()

Are you sure you have the right xpath?
I would try this xpath:
/html/body/div[2]/div[3]/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/span[1]/button
Also I would use implicitlywait instead of time.sleep.

To click on the play icon you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following locator strategies:
Using CSS_SELECTOR:
driver.get("https://audiomack.com/black-sherif/song/kweku-the-traveler")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.row button[data-action='play']"))).click()
Using XPATH:
driver.get("https://audiomack.com/black-sherif/song/kweku-the-traveler")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[#class='row']//button[#data-action='play']"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Getting id or xpath of parts of a website

I want to use selenium to login in the website https://www.winamax.es/account/login.php?redir=/apuestas-deportivas. The case is that I don't find the xpath/id/text to get te next code running successfully:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
options = webdriver.ChromeOptions()
options.add_argument("window-size=1920,1080")
options.add_argument('--disable-blink-features=AutomationControlled')
driver=webdriver.Chrome(options=options,executable_path=r"chromedriver.exe")
driver.get("https://www.winamax.es/account/login.php?redir=/apuestas-deportivas")
WebDriverWait(driver=driver, timeout=15).until(
lambda x: x.execute_script("return document.readyState === 'complete'")
)
upload_field = driver.find_element_by_xpath("//input[#type='email']")
I don't only want the specific xpath for this example, but I prefer a method to obtain the xpath or something similar to get the code working for other parts of the website

<iframe id="iframe-login" data-node="iframe" name="login" scrolling="auto" frameborder="1" style="min-height: 280px; width: 100%;"></iframe>
Your element is in an iframe. Switch to it.
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "iframe-login")))
Import
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Full working code.
wait = WebDriverWait(driver, 10)
driver.get("https://www.winamax.es/account/login.php?redir=/apuestas-deportivas")
wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "iframe-login")))
upload_field = wait.until(EC.element_to_be_clickable((By.XPATH, "//input[#type='email']")))
upload_field.send_keys("stuff")

The email element is within an <iframe> so you have to:
Induce WebDriverWait for the desired frame to be available and switch to it.
Induce WebDriverWait for the desired element to be clickable.
You can use either of the following Locator Strategies:
Using CSS_SELECTOR:
driver.get("https://www.winamax.es/account/login.php?redir=/apuestas-deportivas")
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe#iframe-login")))
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[type='email']"))).send_keys("scraper#stackoverflow.com")
Using XPATH:
driver.get("https://www.winamax.es/account/login.php?redir=/apuestas-deportivas")
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[#id='iframe-login']")))
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[#type='email']"))).send_keys("scraper#stackoverflow.com")
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Browser Snapshot:
Reference
You can find a couple of relevant discussions in:
Ways to deal with #document under iframe
Switch to an iframe through Selenium and python
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element while trying to click Next button with selenium
selenium in python : NoSuchElementException: Message: no such element: Unable to locate element

How do I paginate to the next page in selenium python

How do I paginate to the next page in selenium python?
Code trials:
import os,json,requests, time
from pprint import pprint
from selenium import webdriver
import csv
driver = webdriver.Chrome()
driver.get('https://www.ebay.co.uk/b/Cordless-Drills/184655/bn_9863722?rt=nc&_dmd=1')
time.sleep(0.5)
links = driver.find_elements_by_xpath('//div[#class="s-item__wrapper clearfix"]//a[#class="s-item__link"]')
for link in links:
print(link.get_attribute('href'))

To paginate to the next page you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
driver.get('https://www.ebay.co.uk/b/Cordless-Drills/184655/bn_9863722?rt=nc&_dmd=1')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#gdpr-banner-accept[aria-label='Accept privacy terms and settings']"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "li.ebayui-pagination__li.ebayui-pagination__li--selected +li > a"))).click()
Using XPATH:
driver.get('https://www.ebay.co.uk/b/Cordless-Drills/184655/bn_9863722?rt=nc&_dmd=1')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[#id='gdpr-banner-accept']"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//li[#class='ebayui-pagination__li ebayui-pagination__li--selected']//following::li[1]/a"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Browser Snapshot:

Selenium to automate clicking a tab using Python

I am using Selenium with Python to webscrape a page which includes JavaScript.
The racecourse result tabs towards the top of the page eg "Ludlow","Dundalk" are manually clickable but do not have any obvious hyperlinks attached to them.
...
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome(executable_path='C:/A38/chromedriver_win32/chromedriver.exe')
driver.implicitly_wait(30)
driver.maximize_window()
# Navigate to the application home page
driver.get("https://www.sportinglife.com/racing/results/2020-11-23")
...
This works so far. I have used BeautifulSoup to find the label names of the NewGenericTabs eg "Ludlow","Dundalk" etc. However, the following code, to attempt to automate clicking a tab, times out each time.
WebDriverWait(driver, 60).until(EC.element_to_be_clickable((By.LINK_TEXT, "Ludlow"))).click()
Would welcome any help.

The WebElement aren't <a> tags but <span> tags so By.LINK_TEXT wouldn't work.
To click on the desired elements you can use either of the following xpath based Locator Strategies:
Ludlow:
driver.get("https://www.sportinglife.com/racing/results/2020-11-23")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[#mode='primary']"))).click()
driver.find_element_by_xpath("//span[#data-test-id='generic-tab' and text()='Ludlow']").click()
Dundalk:
driver.get("https://www.sportinglife.com/racing/results/2020-11-23")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[#mode='primary']"))).click()
driver.find_element_by_xpath("//span[#data-test-id='generic-tab' and text()='Dundalk']").click()
Ideally, to click on the elements you need to induce WebDriverWait for the element_to_be_clickable() and you can use the following xpath based Locator Strategies:
Ludlow:
driver.get("https://www.sportinglife.com/racing/results/2020-11-23")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[#mode='primary']"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[#data-test-id='generic-tab' and text()='Ludlow']"))).click()
Dundalk:
driver.get("https://www.sportinglife.com/racing/results/2020-11-23")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[#mode='primary']"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[#data-test-id='generic-tab' and text()='Dundalk']"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to click the button to crawl the reviews using python - python

Css selectors cannot select an element by containing text. Try using xpath. The last line of your script should look something like: wait = WebDriverWait(driver, 10) wait.until(expected_conditions.element_to_be_clickable((By.XPATH, "//a[contains(., 'Toon meer')]")).click()

Related

Can't find out xpath with selenium in jobsite.co.uk website

How to click on the play button with selenuim python

Getting id or xpath of parts of a website

How do I paginate to the next page in selenium python

Selenium to automate clicking a tab using Python

Categories

Resources