I'm trying to understand how to scrape this betting website https://www.betaland.it/
I'm trying to scrape all the table rows that have inside the information of the 1X2 odds of the italian "Serie A".
The code I have written is this:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located
import time
import sys

url = 'https://www.betaland.it/sport/calcio/italia/serie-a-OIA-scommesse-sportive-online'

# absolute path to the chromedriver binary
chrome_driver_path = '/Users/39340/PycharmProjects/pythonProject/chromedriver'

chrome_options = Options()
chrome_options.add_argument('--headless')

# BUG FIX: the original assigned the driver to the name `webdriver`,
# shadowing the imported selenium module; use a distinct name instead.
browser = webdriver.Chrome(
    executable_path=chrome_driver_path, options=chrome_options
)

with browser as driver:
    # maximum time (seconds) to wait for the element to appear
    wait = WebDriverWait(driver, 10)
    # load the Serie A odds page
    driver.get(url)
    # block until the pre-match events container is present in the DOM
    wait.until(presence_of_element_located((By.ID, 'prematch-container-events-1-33')))
    # each 'simple-row' element holds one match with its 1X2 odds
    results = driver.find_elements_by_class_name('simple-row')
    print(results)
    for quote in results:
        quoteArr = quote.text
        print(quoteArr)
        print()
    # NOTE(review): redundant — exiting the `with` block already quits the
    # driver; kept for parity with the original flow.
    driver.close()
And the error that I have is:
Traceback (most recent call last):
File "C:\Users\39340\PycharmProjects\pythonProject\main.py", line 41, in <module>
wait.until(presence_of_element_located((By.ID, 'prematch-container-events-1-33')))
File "C:\Users\39340\PycharmProjects\pythonProject\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
P.S.: if you try to access the bookmakers you have to set an Italian IP address. Italian bookmakers are available only from Italy.
It's basically a timeout error, which means the given time to load the page or find the element (as in this case) is insufficient. So first, try increasing the wait time from 10 to 15, or even 30.
Secondly you can use other element identifiers like xpath, css_selector and others instead of id and adjust the wait time like said in point one.
Related
I am trying to select 'Female' Radio Button in the webpage
import time
import selenium.common.exceptions
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Launch Chrome and open the form page.
driver = webdriver.Chrome(executable_path="C:\Drivers\chrome\chromedriver.exe")
driver.get("https://fs2.formsite.com/meherpavan/form2/index.html?1537702596407")

# Wait (up to 60 s) for the 'Female' radio button to become clickable,
# then click it via JavaScript to sidestep click interception.
wait = WebDriverWait(driver, 60)
female_radio = wait.until(EC.element_to_be_clickable((By.ID, "RESULT_RadioButton-7_1")))
driver.execute_script("arguments[0].click();", female_radio)

# Report the selected state of both radio buttons.
print(driver.find_element_by_id("RESULT_RadioButton-7_0").is_selected())
print(driver.find_element_by_id("RESULT_RadioButton-7_1").is_selected())
Error:
C:\Users\kkumaraguru\PycharmProjects\pythonProject\venv\Scripts\python.exe C:/Users/kkumaraguru/PycharmProjects/SeleniumProject/RadioButtons.py
Traceback (most recent call last):
File "C:\Users\kkumaraguru\PycharmProjects\SeleniumProject\RadioButtons.py", line 14, in <module>
element = wait.until(EC.element_to_be_clickable((By.ID, "RESULT_RadioButton-7_1")))
File "C:\Users\kkumaraguru\PycharmProjects\pythonProject\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Process finished with exit code 1
It seems that it times out waiting for an element with ID RESULT_RadioButton-7_1 to be present on the page. I'd open the page yourself to make sure such element is present. You can do this using javascript in the browser's console: document.getElementById("RESULT_RadioButton-7_1"). If this doesn't work then try to debug through the code, and check what HTML Selenium is looking at to make sure is what you expect.
You can do that using JS intervention, Also make sure to maximize the windows size like below :
# Launch Chrome; `driver_path` is not defined in this snippet —
# presumably the path to the chromedriver binary; confirm with the caller.
driver = webdriver.Chrome(driver_path)
driver.maximize_window()
# Implicit wait: every element lookup polls for up to 30 s before failing.
driver.implicitly_wait(30)
driver.get("https://fs2.formsite.com/meherpavan/form2/index.html?1537702596407")
#time.sleep(5)
element = driver.find_element(By.ID, "RESULT_RadioButton-7_1")
# Click via JavaScript in case a native click would be intercepted.
driver.execute_script("arguments[0].click();", element)
Actually this is my second question on the same subject.
But in the original question, I put so many functions in my code which played a role as distraction.
So in this post, I deleted all the unnecessary functions and focused on my problem.
What I want to do is the following:
1. open a url with the firefox browser(using selenium)
2. click into a page
3. click every thumbnail using a loop until the loop hits a "NoSuchElementException" error
4. stop the loop when the loop hit the error
Here is my code.
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import time
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

#1. opening Firefox and go to an URL
driver = webdriver.Firefox()
url = "https://www.amazon.com/Kraft-Original-Macaroni-Microwaveable-Packets/dp/B005ECO3H0"
driver.get(url)
action = ActionChains(driver)
time.sleep(5)

#2. going to the main images page
driver.find_element_by_css_selector('#landingImage').click()
time.sleep(2)

#3 click thumbnails #ivImage_0, #ivImage_1, ... until none is left.
# BUG FIX: the original `for i in range(1, 10)` capped the loop at 9
# iterations (with `i` unused) even though the stated intent is to keep
# clicking until NoSuchElementException is raised.
n = 0
while True:
    try:
        driver.find_element_by_css_selector(f'#ivImage_{n}').click()
        # wait up to 20 s (polling every 1 s) for the fullscreen viewer
        element = WebDriverWait(driver, 20, 1).until(
            EC.presence_of_element_located((By.CLASS_NAME, "fullscreen"))
        )
        n = n + 1
    except NoSuchElementException:
        # no thumbnail with this index: all of them have been clicked
        break

driver.close()
and error stacktrace is like this
Exception has occurred: NoSuchElementException
Message: Unable to locate element: #ivImage_6
File "C:\Users\Administrator\Desktop\pythonworkspace\except_test.py", line 25, in <module>
driver.find_element_by_css_selector(f'#ivImage_{n}').click()
FYI, all the thumbnail images have IDs of the form ivImage_<number>
I don't know why my try-except statement is not working.
Am I missing something?
I am following the example here (under the Python tab): https://www.selenium.dev/documentation/en/
Code here:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located

# This example requires Selenium WebDriver 3.13 or newer (context-manager
# support on the driver).
with webdriver.Firefox() as driver:
    # budget of 10 seconds for the explicit wait below
    wait = WebDriverWait(driver, 10)
    driver.get("https://google.com/ncr")
    # type the query and submit it with ENTER
    driver.find_element_by_name("q").send_keys("cheese" + Keys.RETURN)
    # block until the first result heading exists, then print its text
    first_result = wait.until(presence_of_element_located((By.CSS_SELECTOR, "h3>div")))
    print(first_result.get_attribute("textContent"))
I've run this code and got it to work, displaying the first result "Show More". However, other times when I run this code it doesn't work, and gives a random timeout error without a message:
Traceback (most recent call last):
File "c:\Users\User\PyCharm_Projects\Project\sample_test.py", line 12, in <module>
first_result = wait.until(presence_of_element_located((By.CSS_SELECTOR, "h3>div")))
File "C:\Program Files\Python38\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
My question is: what is causing the timeout error if it doesn't happen every time? I've tried locating the elements with other methods (XPATH, link text). I've looked at the other following examples, but nothing that they post seemed to fix this problem:
Tried, didn't work
- Selenium random timeout exceptions without any message
Non-applicable solutions
- Instagram search bar with selenium
- Selenium Timeoutexception Error
- Random TimeoutException even after using ui.WebDriverWait() chrome selenium python
I am on Python 3.8, Firefox 68.6.0, and here is the relevant packages from 'pip freeze'
- beautifulsoup4==4.8.2
- requests==2.22.0
- selenium==3.141.0
- urllib3==1.25.8
- webencodings==0.5.1
Thank you!
I have made some customized actions for this cases, for instance:
def findXpath(xpath, driver, waitTime=10):
    """Wait for the element located by *xpath*, retrying up to 3 times.

    Returns the located WebElement; raises Exception after three failed
    waits.  ``waitTime`` was a free (undefined) global in the original
    snippet — it is now a parameter with a default, which is
    backward-compatible for existing two-argument callers.
    """
    # local imports keep this answer snippet self-contained
    import random
    from time import sleep

    count = 0
    while True:
        if count == 3:
            # Give up after three attempts.  (The `break` that followed
            # this `raise` in the original was unreachable and is removed.)
            raise Exception("Cannot found element %s after retrying 3 times.\n" % xpath)
        try:
            element = WebDriverWait(driver, waitTime).until(
                EC.presence_of_element_located((By.XPATH, xpath)))
            return element
        except Exception:
            # Narrowed from a bare `except:` so KeyboardInterrupt and
            # SystemExit are no longer swallowed.
            count += 1
            # brief randomized backoff (0.1–0.5 s) before retrying
            sleep(random.randint(1, 5) * 0.1)
Please refer to the solution below; I have executed this on Chrome and Firefox browsers a couple of times and it's working fine. A TimeoutException is thrown when a command does not complete in enough time.
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium import webdriver

# Launch Chrome and open Google's no-country-redirect page.
driver = webdriver.Chrome(executable_path=r"chromedriver.exe")
driver.maximize_window()
driver.get("https://google.com/ncr")

# Wait for the search box to become visible, then submit the query.
search_wait = WebDriverWait(driver, 20)
search_wait.until(
    EC.visibility_of_element_located((By.NAME, "q"))).send_keys("cheese" + Keys.RETURN)

# Wait for the first result heading and print its text content.
result_wait = WebDriverWait(driver, 20)
first_result = result_wait.until(
    EC.visibility_of_element_located((By.CSS_SELECTOR, "h3>div")))
print(first_result.get_attribute("textContent"))
Output
Show more
I am trying to create a python + selenium script in order to fetch CSV from a salesforce pardot page.
Can Somebody please help me in accessing the nested span element inside the dropdown list which will be generated on button click.
I am adding my code done so far and I am getting Timeout Error While running my script.
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
# NOTE: the duplicate `from selenium import webdriver` in the original
# has been removed.

try:
    # Configure Chrome to download files into C:\Pardot without prompting.
    chrome_options = webdriver.ChromeOptions()
    prefs = {'download.default_directory': r'C:\Pardot'}
    chrome_options.add_experimental_option('prefs', prefs)
    driver = webdriver.Chrome(executable_path="D:\XXX XXXX\XXXX\drivers\chromedriver.exe", options=chrome_options)

    # Log in to Pardot.
    driver.get('https://pi.pardot.com/engagementStudio/studio#/15627/reporting')
    user_name = driver.find_element_by_css_selector('#email_address')
    user_name.send_keys('XXXXXXXXXXXXXXXXXXX')
    password = driver.find_element_by_css_selector('#password')
    # BUG FIX: the original line was missing the closing quote on the
    # password literal — a SyntaxError that prevented the script from
    # running at all.
    password.send_keys('XXXXXXXXXXXXXXXXX')
    submit_button = driver.find_element_by_css_selector('input.btn')
    submit_button.click()

    # The reporting UI lives inside an iframe; switch into it first.
    WebDriverWait(driver, 10).until(
        EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe#content-frame")))
    # BUG FIX: text_to_be_present_in_element_value checks an element's
    # `value` attribute, which a <span> does not have, so that condition
    # could never succeed and always timed out (the reported error).
    # text_to_be_present_in_element matches the span's visible text.
    WebDriverWait(driver, 10).until(EC.text_to_be_present_in_element(
        (By.CSS_SELECTOR, "span[data-qa='reporting-filter-trigger-value']"), 'All Time'))
except (RuntimeError) as e:
    print(e)
finally:
    time.sleep(10)
    driver.close()
Screenshot for span DOM element
Error Stack trace:
C:\Users\Vipul.Garg\AppData\Local\Programs\Python\Python37\python.exe "D:/Vipul Garg Backup/XXXX/testingCSVfetching.py"
Traceback (most recent call last):
File "D:/Vipul Garg Backup/XXXX/testingCSVfetching.py", line 22, in <module>
WebDriverWait(driver, 10).until(EC.text_to_be_present_in_element_value(text_='All Time',locator=(By.CSS_SELECTOR,"span[data-qa='reporting-filter-trigger-value']")))
File "C:\Users\Vipul.Garg\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
try this approach
# Grab the filter trigger button's inner HTML — it contains the nested
# <span> elements (including the current filter value) — and print it.
# Relies on an already-initialised `driver` from the surrounding script.
spans = driver.find_element_by_css_selector('.slds-button_reset.slds-grow.slds-has-blur-focus.trigger-button').get_attribute('innerHTML')
print(spans)
You should be getting all span.
I'm trying to make Selenium wait for a specific element (near the bottom of the page) since I have to wait until the page is fully loaded.
I'm confused by it's behavior.
I'm not an expert in Selenium but I expect this work:
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Firefox()
wait = WebDriverWait(driver, 10)
def load_page():
driver.get('http://www.firmy.cz/?geo=0&q=hodinov%C3%BD+man%C5%BEel&thru=sug')
wait.until(EC.visibility_of_element_located((By.PARTIAL_LINK_TEXT, 'Zobrazujeme')))
html = driver.page_source
print html
load_page()
TIMEOUT:
File "C:\Python27\lib\site-packages\selenium\webdriver\support\wait.py", line 78, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
I'm just trying to see the HTML of the fully loaded page. It raises TimeoutException but I'm sure that that element is already there. I've already tried another approach.
wait.until(EC.visibility_of_element_located(driver.find_element_by_xpath('//a[#class="companyTitle"]')))
But this approach raises error too:
selenium.common.exceptions.NoSuchElementException:
Message: Unable to locate element:
{"method":"xpath","selector":"//a[#class=\"companyTitle\"]"}
Loading the site takes a long time, so use implicit waiting.
In this case, when you are interested in the whole HTML, you don't have to wait for a specific element at the bottom of the page.
The load_page function will print the HTML as soon as the whole site is loaded if you give the browser enough time to do this with implicitly_wait().
from selenium import webdriver
driver = webdriver.Firefox()
# wait max 30 seconds till any element is located
# or the site is loaded
driver.implicitly_wait(30)
def load_page():
driver.get('http://www.firmy.cz/?geo=0&q=hodinov%C3%BD+man%C5%BEel&thru=sug')
html = driver.page_source
print html
load_page()
The main issue in your code is wrong selectors.
If you want to wait till web element with text Zobrazujeme will loaded and then print page source:
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Firefox()
wait = WebDriverWait(driver, 10)
def load_page():
driver.get('http://www.firmy.cz/?geo=0&q=hodinov%C3%BD+man%C5%BEel&thru=sug')
wait.until(EC.visibility_of_element_located((By.CLASS_NAME , 'switchInfoExt')))
html = driver.page_source
print html
load_page()