I am trying to use explicit waits with Undetected Chromedriver (v2). Rather than executing the statements as soon as the element has loaded, the script appears to pause until the wait time expires.
When I use the normal Selenium chromedriver, everything works as expected (the "opt-in" is closed in 1-2 seconds), and when I use sleeps instead of waits, the statements are executed much more quickly.
Can anyone see the problem?
Here's the code:
class My_Chrome(uc.Chrome):
    def __del__(self):
        pass

options = uc.ChromeOptions()
arguments = [
    '--log-level=3', '--no-first-run', '--no-service-autorun', '--password-store=basic',
    '--start-maximized',
    '--window-size=1920,1080',
    '--credentials_enable_service=False',
    '--profile.password_manager_enabled=False',
    '--add_experimental_option("detach", True)'
]
for argument in arguments:
    options.add_argument(argument)
driver = My_Chrome(options=options)
wait = WebDriverWait(driver, 20)
driver.get('https://www.oddschecker.com')
try:
    opt_in = wait.until(EC.visibility_of_element_located((By.XPATH, "//span[text()='Not Now']/..")))
    VirtualClick(driver, opt_in)
    current_time('Closing opt-in')
except:
    pass
I'm attempting to run a Selenium script locally and open three non-headless browsers. I'm using multiprocessing Pools (and have tried regular multiprocessing as well) and have come across an interesting issue: three browser sessions open, but only the first one actually navigates to the target_url and attempts control. The other two just sit and wait and do nothing.
Here is the execution code that's relevant
run_id = str(uuid.uuid4())
options = Options()
#options.binary_location = '/opt/headless-chromium' #works for lambda
start_time = time.time()
options.binary_location = '/usr/bin/google-chrome' #testing
#options.add_argument('--headless') don't need headless
options.add_argument('--no-sandbox')
options.add_argument('--verbose')
#options.add_argument('--single-process')
options.add_argument('--user-data-dir=/tmp/user-data')# test add
options.add_argument('--data-path=/tmp/data-path')
options.add_argument('--disk-cache-dir=/tmp/cache-dir')
options.add_argument('--homedir=/tmp')
#options.add_argument('--disable-gpu')# test add
#options.add_argument("--remote-debugging-port=9222") test remove
#options.add_argument('--disable-dev-shm-usage')
#'/opt/chromedriver' not found
logger.info("Before driver initiated")
# job_id = event['job_id']
# run_id = event['run_id']
send_log(job_id, run_id, "JOB START", True, "", time.time() - start_time)
retries = 0
drivers = []
try:
    #driver = webdriver.Chrome('/opt/chromedriver_89', chrome_options=options)
    driver = webdriver.Chrome('/opt/chromedriver90', chrome_options=options)
    #driver2 = webdriver.Chrome('/opt/chromedriver90', chrome_options=options)
    break
except Exception as e:
    print(str(e))
    logger.info('exception with driver instantiation, retrying... ' + str(e))
    #time.sleep(5)
    driver = None
driver = webdriver.Chrome('/opt/chromedriver', chrome_options=options)
....
and here is how i'm invoking each process
from multiprocessing import Pool
pool = Pool(processes=3)
for i in range(3):
    pool.apply_async(invoke, args=("https://macau-flash-sale.myshopify.com/",))
pool.close()
pool.join()
Is it possible that despite the multiple processes, selenium is not communicating with the other two browser instances properly?
Any ideas are greatly appreciated!
I don't think selenium can handle multiple browsers at once. I recommend writing 3 different python scripts and running them all at once through the terminal. If they need to communicate information to each other, probably the easiest way is to just get them to write to a text file.
For anyone else struggling with this, Selenium CAN be used with async/multiprocessing.
But you cannot specify the same user/data directories for each session. I had to remove these parameters so that each Chrome session creates unique directories for itself.
Just remove the lines below and it'll work.
options.add_argument('--user-data-dir=/tmp/user-data')# test add
options.add_argument('--data-path=/tmp/data-path')
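If the directories need to stay configurable, one alternative to removing the arguments is to give every worker its own temporary profile directory. The following is only a minimal sketch: the helper name `unique_profile_args` and the `chrome-profile-` prefix are hypothetical and do not come from the original post; it relies only on the standard library's `tempfile.mkdtemp` and the `--user-data-dir` switch already used above.

```python
import tempfile


def unique_profile_args(prefix="chrome-profile-"):
    """Return Chrome arguments pointing at a fresh profile directory.

    Hypothetical helper: the name and the prefix are illustrative only.
    """
    # mkdtemp creates a new, uniquely named directory on every call,
    # so two workers never end up sharing the same --user-data-dir.
    profile_dir = tempfile.mkdtemp(prefix=prefix)
    return ["--user-data-dir=" + profile_dir]


if __name__ == "__main__":
    # Each call yields a different directory.
    print(unique_profile_args())
    print(unique_profile_args())
```

Inside the function run by each process, the result could be applied with `for arg in unique_profile_args(): options.add_argument(arg)`. The temporary directories are not cleaned up automatically in this sketch.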
I am making an automated Python script that opens chromedriver in a loop until it finds a specific element on the webpage the driver loads (using Selenium). This eventually eats up resources, since it is constantly opening and closing the driver while in the loop.
Is there a way to reuse an existing chromedriver window instead of opening and closing one on every iteration until a condition is satisfied?
If that is not possible, is there an alternative way to go about this that you would recommend?
Thanks!
Script:
from selenium.webdriver.common.keys import Keys
from selenium import webdriver
import pyautogui
import time
import os
def snkrs():
    driver = webdriver.Chrome('/Users/me/Desktop/Random/chromedriver')
    driver.get('https://www.nike.com/launch/?s=in-stock')
    time.sleep(3)
    pyautogui.click(184,451)
    pyautogui.click(184,451)
    current = driver.current_url
    driver.get(current)
    time.sleep(3.5)
    elem = driver.find_element_by_xpath("//*[@id='j_s17368440']/div[2]/aside/div[1]/h1")
    ihtml = elem.get_attribute('innerHTML')
    if ihtml == 'MOON RACER':
        os.system("clear")
        print("SNKR has not dropped")
        time.sleep(1)
    else:
        print("SNKR has dropped")
        pyautogui.click(1303,380)
        pyautogui.hotkey('command', 't')
        pyautogui.typewrite('python3 messages.py') # Notifies me by text
        pyautogui.press('return')
        pyautogui.click(928,248)
        pyautogui.hotkey('ctrl', 'z') # Kills the bash loop
snkrs()
Bash loop file:
#!/bin/bash
while [ 1 ]
do
python snkrs.py
done
You are defining a method that contains the chromedriver launch and then running through the method once (not looping) so each method call generates a new browser instance. Instead of doing that, do something more like this...
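import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()  # launched once; assumes chromedriver is on PATH
url = 'https://www.nike.com/launch/?s=in-stock'
driver.get(url)
# toggle grid view
WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[aria-label='Show Products as List']"))).click()
# wait for shoes to drop
while not driver.find_elements(By.XPATH, "//div[@class='figcaption-content']//h3[contains(.,'MOON RACER')]"):
    print("SNKR has not dropped")
    time.sleep(300)  # 300s = 5 mins, don't spam their site
    driver.get(url)
print("SNKR has dropped")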
I simplified your code, changed the locator, and added a loop. The script launches a browser (once), loads the site, clicks the grid view toggle button, and then looks for the desired shoe to be displayed in this list. If the shoes don't exist, it just sleeps for 5 mins, reloads the page, and tries again. There's no need to refresh the page every 1s. You're going to draw attention to yourself and the shoes aren't going to be refreshed on the site that often anyway.
If you're just trying to wait until something changes on the page then this should do the trick:
snkr_has_not_dropped = True
while snkr_has_not_dropped:
    elem = driver.find_element_by_xpath("//*[@id='j_s17368440']/div[2]/aside/div[1]/h1")
    ihtml = elem.get_attribute('innerHTML')
    if ihtml == 'MOON RACER':
        print("SNKR has not dropped")
        driver.refresh()
    else:
        print("SNKR has dropped")
        snkr_has_not_dropped = False
Just need to refresh the page and try again.
driver.page_source doesn't return all of the source code. It prints some parts of the page in detail, but a large part of the code is missing. How can I fix this?
This is my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
def htmlToLuna():
    url ='https://codefights.com/tournaments/Xph7eTJQssbXjDLzP/A'
    driver = webdriver.Chrome('C:\\Python27\\chromedriver\\chromedriver.exe')
    driver.get(url)
    web=open('web.txt','w')
    web.write(driver.page_source)
    print driver.page_source
    web.close()
print htmlToLuna()
Here is a simple piece of code. All it does is open the URL, get the length of the page source, wait five seconds, and then get the length of the page source again.
import time
from selenium import webdriver

if __name__=="__main__":
    browser = webdriver.Chrome()
    browser.get("https://codefights.com/tournaments/Xph7eTJQssbXjDLzP/A")
    initial = len(browser.page_source)
    print(initial)
    time.sleep(5)
    new_source = browser.page_source
    print(len(new_source))
see the output:
15722
48800
Do you see that the length of the page source increases after the wait? You must make sure that the page is fully loaded before getting the source. The code above is not a proper implementation, though, since it waits blindly.
Here is a better way to do this: the browser waits until the element of your choice is found. The timeout is set to 10 seconds.
if __name__=="__main__":
    browser = webdriver.Chrome()
    browser.get("https://codefights.com/tournaments/Xph7eTJQssbXjDLzP/A")
    try:
        WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '.CodeMirror > div:nth-child(1) > textarea:nth-child(1)'))) # 10 seconds delay
        print("Result:")
        print(len(browser.page_source))
    except TimeoutException:
        print("Your exception message here!")
The output: Result: 52195
Reference:
https://stackoverflow.com/a/26567563/7642415
http://selenium-python.readthedocs.io/locating-elements.html
Hold on! Even that won't guarantee that you get the full page source, since individual elements are loaded dynamically. As soon as the browser finds the element, it moves on. So make sure you pick an element whose presence really means the page has fully loaded.
P.S. I'm using Python 3 and the webdriver is on my environment PATH, so my code needs to be modified a bit to work with Python 2.x. I guess only the print statements need to change.
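When no single element reliably marks the end of loading, a rough fallback is to poll until the page source stops growing, building on the length comparison shown earlier. This is only a heuristic sketch under stated assumptions: the function name `wait_for_stable_source` and its parameters are hypothetical, and an unchanged length over a few reads does not prove the page is complete. It only uses the `page_source` attribute of the driver.

```python
import time


def wait_for_stable_source(driver, interval=1.0, stable_checks=3, timeout=30.0):
    """Poll driver.page_source until its length stops changing.

    Returns the last source read once the length has been identical on
    `stable_checks` consecutive re-reads. Raises TimeoutError otherwise.
    Hypothetical helper: a heuristic, not a guarantee of a full load.
    """
    deadline = time.monotonic() + timeout
    last_length = -1
    unchanged = 0
    while time.monotonic() < deadline:
        source = driver.page_source
        if len(source) == last_length:
            unchanged += 1
            if unchanged >= stable_checks:
                return source
        else:
            # The page is still changing: remember the new size, start over.
            last_length = len(source)
            unchanged = 0
        time.sleep(interval)
    raise TimeoutError("page source did not stabilise within %s seconds" % timeout)
```

It could be called as `html = wait_for_stable_source(browser)` after `browser.get(...)`. An explicit wait on a meaningful element remains preferable whenever one exists.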
I've looked all over and found some very nice solutions to wait for page load and window load (and yes I know there are a lot of Stack questions regarding them in isolation). However, there doesn't seem to be a way to combine these two waits effectively.
Taking my primary inspiration from the end of this post, I came up with these two functions to be used in a "with" statement:
#This is within a class where the browser is at self.driver
@contextmanager
def waitForLoad(self, timeout=60):
    oldPage = self.driver.find_element_by_tag_name('html')
    yield
    WebDriverWait(self.driver, timeout).until(staleness_of(oldPage))

@contextmanager
def waitForWindow(self, timeout=60):
    oldHandles = self.driver.window_handles
    yield
    WebDriverWait(self.driver, timeout).until(
        lambda driver: len(oldHandles) != len(self.driver.window_handles)
    )

#example
button = self.driver.find_element_by_xpath(xpath)
if button:
    with self.waitForLoad():
        button.click()
These work great in isolation. However, they can't be combined due to the way each one checks their own internal conditions. For example, this code will fail to wait until the second page has loaded because the act of switching windows causes "oldPage" to become stale:
@contextmanager
def waitForWindow(self, timeout=60):
    with self.waitForLoad(timeout):
        oldHandles = self.driver.window_handles
        yield
        WebDriverWait(self.driver, timeout).until(
            lambda driver: len(oldHandles) != len(self.driver.window_handles)
        )
        self.driver.switch_to_window(self.driver.window_handles[-1])
Is there some Selenium method that will allow these to work together?
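One possible way to combine the two waits, offered only as a sketch and not as a confirmed Selenium feature, is to stop checking staleness of the old page and instead check readiness in the new window after switching to it. The names `poll_until` and `wait_for_new_window` are hypothetical; the driver calls used (`window_handles`, `switch_to.window`, `execute_script`) are standard WebDriver methods. Note that `document.readyState` says nothing about content loaded later via AJAX.

```python
import time
from contextlib import contextmanager


def poll_until(condition, timeout=60.0, interval=0.25):
    """Call condition() repeatedly until it returns a truthy value."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(interval)


@contextmanager
def wait_for_new_window(driver, timeout=60.0):
    """Wait for a window opened inside the `with` body, switch to it,
    then wait for that window's document to finish loading."""
    old_handles = set(driver.window_handles)
    yield
    # 1. Wait until a handle appears that was not there before.
    new_handles = poll_until(
        lambda: [h for h in driver.window_handles if h not in old_handles],
        timeout,
    )
    # 2. Switch first, so the load check runs against the new window.
    driver.switch_to.window(new_handles[-1])
    # 3. Check readiness of the new window, not staleness of the old page.
    poll_until(
        lambda: driver.execute_script("return document.readyState") == "complete",
        timeout,
    )
```

Usage would look like `with wait_for_new_window(self.driver): button.click()`. The hand-written `poll_until` keeps the sketch free of imports; `WebDriverWait(...).until(...)` could take its place.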
The Selenium docs show that we must set a timeout for a wait.
For example, this code is from the docs:
wait = WebDriverWait(driver, 10)
element = wait.until(EC.element_to_be_clickable((By.ID,'someid')))
I wonder: must we always set a timeout? Or is there some method that waits until all of the AJAX code has finished downloading, so that the driver only starts interacting with web elements afterwards (I mean without any fixed timeout; it just loads everything and only then starts interacting)?
Hopefully this code will help you. This is how I solved this issue.
#Check with jQuery if it has any outstanding ajax
def ajax_complete(self):
    try:
        return 0 == self.execute_script("return jQuery.active")
    except:
        pass
#Create a method to wait for ajax to complete
driver.wait_for_ajax = lambda: WebDriverWait(driver, 10).until(ajax_complete, "")
driver.implicitly_wait(30)
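A variation on the predicate above, sketched under the assumption that the page may not load jQuery at all: the check is moved into the JavaScript so that a missing `jQuery` object counts as "nothing pending" instead of raising and being swallowed. The function name `ajax_complete` is reused from the answer; the script string is my own and only tracks requests made through jQuery, not `fetch` or raw `XMLHttpRequest` calls.

```python
def ajax_complete(driver):
    """Return True when the page has no pending jQuery requests.

    Pages that do not load jQuery are treated as having nothing pending.
    """
    script = "return (typeof jQuery === 'undefined') || (jQuery.active === 0);"
    # bool() turns an unexpected None from the driver into False,
    # so a surrounding wait keeps polling instead of passing early.
    return bool(driver.execute_script(script))


# Intended usage with Selenium's explicit wait (not executed here):
#   WebDriverWait(driver, 10).until(ajax_complete)
```

A timeout is still required by `WebDriverWait`; this predicate only lets the wait end as soon as the condition holds.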