Unable to pull text from elements within shadow-root using Python Selenium - python

I am having a terrible time getting text from a shadow root. I found several docs and my code looks right to me, except I always get:
Traceback (most recent call last):
File "C:\python\vttest.py", line 29, in
print(driver.execute_script("return document.querySelector('vt-ui-shell').shadowRoot.querySelector('url-view').shadowRoot.querySelector('vt-ui-main-generic-report').shadowRoot.querySelector('vt-ui-url-card').shadowRoot.querySelector('vt-ui-generic-card').shadowRoot.querySelector('p')").text)
File "C:\Users\barberion.NATJ\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 634, in execute_script
return self.execute(command, {
File "C:\Users\barberion.NATJ\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\barberion.NATJ\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.JavascriptException: Message: javascript error: Cannot read properties of null (reading 'shadowRoot')
(Session info: chrome=96.0.4664.93)
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver import DesiredCapabilities # necessary in headless mode, need this library to accept ssl
from selenium.webdriver import Chrome
from selenium.webdriver.support.ui import WebDriverWait
#from selenium.webdriver.support.select import Select
capabilities = DesiredCapabilities.CHROME.copy()
capabilities['acceptSslCerts'] = True
capabilities['acceptInsecureCerts'] = True
opts = Options()
#opts.headless = True # comment this out to get out of headless
#opts.add_argument("--window-size=1400x1400")
opts.add_argument("--enable-javascript")
opts.add_argument("--start-maximized")
opts.add_argument('--disable-gpu')
searchvt=input("Enter URL: ")
driver = Chrome(options=opts,desired_capabilities=capabilities,executable_path='C:/python/chromedriver.exe')
driver.get('https://www.virustotal.com/gui/home/url')
url='//*[@id="view-container"]/home-view'
time.sleep(.5)
driver.find_element_by_xpath(url).send_keys(searchvt)
time.sleep(.5)
driver.find_element_by_xpath(url).send_keys(Keys.RETURN)
driver.maximize_window()
driver.set_page_load_timeout(5)
time.sleep(3)
print(driver.execute_script("return document.querySelector('vt-ui-shell').shadowRoot.querySelector('url-view').shadowRoot.querySelector('vt-ui-main-generic-report').shadowRoot.querySelector('vt-ui-url-card').shadowRoot.querySelector('vt-ui-generic-card').shadowRoot.querySelector('p')").text)
My error must be somewhere in the print statement :( Any help is most appreciated!
Edit:
To be more specific, I want to submit a URL or domain to VirusTotal and then read the result text at the top of the page. If you use MSN.com, the text I want is at the top, where it says "No security vendors flagged this URL as malicious".

If you update to Selenium 4.0 and use a Chromium-based browser v96+, you can use the new shadow_root property in Python Selenium and avoid JavaScript entirely.
This is a working example:
from selenium.webdriver.common.by import By

# assumes a Selenium 4 session, e.g. driver = webdriver.Chrome()
driver.get('http://watir.com/examples/shadow_dom.html')
shadow_host = driver.find_element(By.CSS_SELECTOR, '#shadow_host')
shadow_root = shadow_host.shadow_root
shadow_content = shadow_root.find_element(By.CSS_SELECTOR, '#shadow_content')
assert shadow_content.text == 'some text'
Source: https://titusfortner.com/2021/11/22/shadow-dom-selenium.html
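Applied to the VirusTotal report page from the question, a sketch of the same approach could look like the following. The custom-element names are the ones the asker eventually found to work (see further down); VirusTotal's markup changes often, so they may need adjusting, and driver is assumed to already be on the URL report page:
from selenium.webdriver.common.by import By

# Each .shadow_root hop returns a ShadowRoot object; ShadowRoot lookups
# only accept CSS selectors, so every step below uses By.CSS_SELECTOR.
url_view = driver.find_element(By.CSS_SELECTOR, 'url-view')
url_card = url_view.shadow_root.find_element(By.CSS_SELECTOR, 'vt-ui-url-card')
generic_card = url_card.shadow_root.find_element(By.CSS_SELECTOR, 'vt-ui-generic-card')
print(generic_card.find_element(By.CSS_SELECTOR, 'p').text)
# e.g. "No security vendors flagged this URL as malicious"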

My shadow DOMs were just out of order. Once I modified the print to:
print(driver.execute_script("return document.querySelector('url-view').shadowRoot.querySelector('vt-ui-url-card').shadowRoot.querySelector('vt-ui-generic-card').querySelector('p')").text)
everything functioned as expected.
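If the fixed time.sleep(3) ever turns out to be too short, the same script can be polled with WebDriverWait instead. A sketch, using the corrected JS chain above; the 15-second timeout is arbitrary:
from selenium.common.exceptions import JavascriptException
from selenium.webdriver.support.ui import WebDriverWait

JS = ("return document.querySelector('url-view')"
      ".shadowRoot.querySelector('vt-ui-url-card')"
      ".shadowRoot.querySelector('vt-ui-generic-card')"
      ".querySelector('p')")

# Keep polling while intermediate hosts are still missing (which raises a
# JavascriptException) or while the final querySelector still returns null.
wait = WebDriverWait(driver, 15, ignored_exceptions=(JavascriptException,))
verdict = wait.until(lambda d: d.execute_script(JS))
print(verdict.text)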

Related

I am trying to use Selenium in Python to click an element in a new Chrome tab, but I am getting a "no such element" error even though the element is there

I have all the modules I need and my code looks pretty good. I am trying to click the add-shortcut button using Selenium. It is my first time using Selenium, but I am pretty sure I wrote the code right. This is all of my code:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
PATH = "C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get('chrome://newtab')
add_button = driver.find_element_by_id('addShortcut')
add_button.click()
These are the Chrome elements:
https://i.stack.imgur.com/m0DJR.png
This is my error:
Traceback (most recent call last):
File "C:\Users\User\Desktop\Srikar's Stuff\Programming\Python\WebScraper.py", line 7, in <module>
add_button = driver.find_element_by_id('addShortcut')
File "C:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 360, in find_element_by_id
return self.find_element(by=By.ID, value=id_)
File "C:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 976, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "C:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="addShortcut"]"}
(Session info: chrome=89.0.4389.90)
Is it an error with my code or something else?
Help!!??
Google is using a shadow root here. Like an iframe, it hides what is underneath it. I found the following workaround:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
PATH = "C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get('chrome://newtab')
delay = 1 # seconds
js_query = "return document.getElementsByTagName('ntp-app')[0].shadowRoot.getElementById('mostVisited').shadowRoot"
try:
    _ = WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.TAG_NAME, 'ntp-app')))
    print("Page is ready!")
except TimeoutException:
    print("Loading took too much time!")
most_visited = driver.execute_script(js_query)
add_button = most_visited.find_element_by_id('addShortcut')
add_button.click()
This should work, though it is a little bit hacky.
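On Selenium 4 with a Chromium browser v96+, the same traversal can also be written without execute_script by using the shadow_root property. A sketch; the element names come from the JS query above, and Chrome's new-tab markup can change between versions:
from selenium.webdriver.common.by import By

driver.get('chrome://newtab')
# Hop through the nested shadow roots with the Selenium 4 shadow_root property.
ntp_root = driver.find_element(By.TAG_NAME, 'ntp-app').shadow_root
most_visited_root = ntp_root.find_element(By.CSS_SELECTOR, '#mostVisited').shadow_root
most_visited_root.find_element(By.CSS_SELECTOR, '#addShortcut').click()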
The icons aren't on that page. They're in an <iframe> called chrome-untrusted://new-tab-page/custom_background_image?url=, and I don't think you can bring that up on its own. Why would you want to? Unless you log in, this Chrome isn't connected to your desktop Chrome.

24/7 Python selenium "while true:" script can't execute driver.get("URL")

This is an odd problem.
My script runs a while True: loop 24/7. It prints 'False' every sleep(55) seconds, and once current_time is between the specified times it calls driver.get("URL") and does a few actions on a website.
But sometimes it hits an error where it cannot open driver.get('https://www.instagram.com/accounts/login/?source=auth_switcher') and prints the error below.
Traceback (most recent call last):
File "/home/matt/insta/users/nycforest/lcf.py", line 254, in <module>
lcf_time(input_begin_time,input_end_time,input_begin_time2,input_end_time2)
File "/home/matt/insta/users/nycforest/lcf.py", line 241, in lcf_time
login()
File "/home/matt/insta/users/nycforest/lcf.py", line 40, in login
driver.get('https://www.instagram.com/accounts/login/?source=auth_switcher')
File "/home/matt/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
self.execute(Command.GET, {'url': url})
File "/home/matt/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/home/matt/.local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: timeout
(Session info: headless chrome=78.0.3904.70)
(Driver info: chromedriver=2.42.591071 (0b695ff80972cc1a65a5cd643186d2ae582cd4ac),platform=Linux 5.4.0-1020-aws x86_64)
I tried adding driver.set_page_load_timeout(20) after driver.get('URL'), but it still hits the error above.
from pyvirtualdisplay import Display
from datetime import datetime, time
from itertools import islice
from random import randint
from time import sleep
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
input_begin_time = time(18,40)
input_end_time = time(18,50)
input_begin_time2 = time(21,20)
input_end_time2 = time(21,30)
opts = webdriver.ChromeOptions()
opts.add_argument('--headless')
opts.add_argument('--disable-gpu')
opts.add_argument('--no-sandbox')
opts.add_argument('--disable-dev-shm-usage')
opts.add_argument('--enable-features=NetworkService,NetworkServiceInProcess')
def login():
    sleep(2)
    driver.get('https://www.instagram.com/accounts/login/?source=auth_switcher')
    driver.set_page_load_timeout(20)
    sleep(3)
    input_username = driver.find_element_by_name('username').send_keys(username)
    input_password = driver.find_element_by_name('password').send_keys(password)
    button_login = driver.find_element_by_css_selector('#loginForm > div > div:nth-child(3) > button')
    button_login.click()
# def other_functions()
driver = webdriver.Chrome(options=opts)
def lcf_time(time_begin1, time_end1, time_begin2, time_end2, curren_time=None):
    current_time = datetime.now().time()
    if time_begin1 < current_time < time_end1 or time_begin2 < current_time < time_end2:
        login()
        # other_functions()
        driver.close()
    else:
        print("False")
while True:
    lcf_time(input_begin_time,input_end_time,input_begin_time2,input_end_time2)
    sleep(55)
The odd thing is: I have a few copies of the exact same script running with different variables, and some of them are fine while others consistently hit this error 60% of the time. I need a more reliable way to call driver.get("URL") 24/7.
Is this a compatibility issue? Am I doing something wrong?
Instead of doing it with Selenium, you can do it with plain Python or a "curl" command.
Read here how to log in to Instagram using its exposed API for developers.
That way you can create a bash/shell script to trigger it whenever you want, and you won't need to keep the IDE running.
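If you would rather keep the Selenium loop, a common pattern is to set the page-load timeout before calling driver.get() and retry on TimeoutException. A minimal sketch; the helper name, timeout, and retry count below are illustrative, not from the original script:
from selenium.common.exceptions import TimeoutException

def get_with_retry(driver, url, attempts=3, timeout=30):
    """Call driver.get(url), retrying a few times if the page load times out."""
    driver.set_page_load_timeout(timeout)  # must be set *before* driver.get()
    for attempt in range(1, attempts + 1):
        try:
            driver.get(url)
            return True
        except TimeoutException:
            print(f"Attempt {attempt}/{attempts}: page load timed out, retrying...")
    return False
login() could then call get_with_retry(driver, 'https://www.instagram.com/accounts/login/?source=auth_switcher') and skip the rest of that run when it returns False.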

How to tell google that I accept cookies python

I am trying to accept cookies and search on Google, but I am facing a problem I have never faced before.
WebDriver can't find the accept-cookies button element. I looked everywhere. I tried to see whether something changes the XPath (is there anything that triggers another element so the XPath changes?), but I came up with nothing. I tried both accept.send_keys(Keys.RETURN) and accept.click(). Nothing seems to work.
What I have so far:
def GoogleIt():
    path = "C:\Program Files (x86)\chromedriver.exe"
    driver = webdriver.Chrome(path)
    driver.get("https://google.com")
    wait(5)
    accept = driver.find_element_by_xpath("/html/body/div/c-wiz/div[2]/div/div/div/div/div[2]/form/div/div[2]")
    accept.send_keys(Keys.RETURN)
    accept.click()
    search = driver.find_element_by_xpath("/html/body/div/div[2]/form/div[2]/div[1]/div[1]/div/div[2]/input")
    search.click()
    search.send_keys(talk)
Note: wait() is just another name for time.sleep().
Error:
Traceback (most recent call last):
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\tkinter\__init__.py", line 1883, in __call__
return self.func(*args)
File "C:/Users/Teknoloji/Desktop/Phyton/Assistant V1/Assistant.py", line 20, in GoogleIt
accept = driver.find_element_by_xpath("/html/body/div/c-wiz/div[2]/div/div/div/div/div[2]/form/div/div[2]")
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
return self.find_element(by=By.XPATH, value=xpath)
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 976, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/div/c-wiz/div[2]/div/div/div/div/div[2]/form/div/div[2]"}
(Session info: chrome=85.0.4183.121)
Process finished with exit code -1
If you need more information I am here.
Thanks...
To click the accept-cookies button, which is inside an iframe, you need to switch to the iframe first.
Use WebDriverWait() with frame_to_be_available_and_switch_to_it() and the following CSS selector.
Then use WebDriverWait() with element_to_be_clickable() and the following XPath.
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[src^='https://consent.google.com']")))
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//div[@id='introAgreeButton']"))).click()
You need to import the following libraries:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
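Putting those pieces together, a minimal end-to-end sketch might look like this (the selectors are the ones above; Google's consent markup changes frequently, so they may need updating):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://google.com")

wait = WebDriverWait(driver, 10)
# The consent dialog lives in its own iframe, so switch into it first...
wait.until(EC.frame_to_be_available_and_switch_to_it(
    (By.CSS_SELECTOR, "iframe[src^='https://consent.google.com']")))
# ...click the agree button...
wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@id='introAgreeButton']"))).click()
# ...then switch back so later lookups see the main page again.
driver.switch_to.default_content()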
In case KunduK's solution doesn't work for some readers, use the By.ID parameter instead of By.CSS_SELECTOR in the frame_to_be_available_and_switch_to_it() method:
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "name_of_the_iframe")))
After the latest update of Selenium, the above solutions did not work for me.
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium import webdriver
from selenium.webdriver.common.by import By
firefox_binary = FirefoxBinary()
browser = webdriver.Firefox(firefox_binary=firefox_binary)
Let's say you want to get this page from Google:
browser.get('https://www.google.com/search?q=cats&source=lnms&tbm=isch')
To continue, you have to click on "Accept all"
Then, you use Selenium to click the button:
browser.find_element(By.XPATH,"//.[@aria-label='Accept all']").click()
Then, you can continue to the Google page!
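Since the consent dialog can take a moment to render, wrapping that click in an explicit wait is usually more reliable. A sketch reusing the same aria-label; the 10-second timeout is arbitrary, and whether you first need the iframe switch from the earlier answer depends on which consent variant you get:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait until the "Accept all" control is clickable, then click it.
WebDriverWait(browser, 10).until(
    EC.element_to_be_clickable((By.XPATH, "//*[@aria-label='Accept all']"))
).click()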

Click on the attribute of a tag with the Selenium library

I am trying to click on an element by its class attribute in a div tag with the Selenium library. Here is the code:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get('https://www.flashscore.com/')
driver.find_element(By.CSS_SELECTOR, ".header__button header__button--search").click()
And here is the error displayed:
>>> driver.find_element(By.CSS_SELECTOR, ".header__button header__button--search").click();
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\avis\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 976, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "C:\Users\avis\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\avis\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".header__button header__button--search"}
(Session info: chrome=85.0.4183.102)
The code is inspired by the Selenium documentation: https://www.selenium.dev/documentation/en/getting_started_with_webdriver/performing_actions_on_the_aut/
I did some research and found a function which makes it possible to overcome the exception: element_to_be_clickable().
According to the docs, it is used to wait until the element is displayed and clickable.
And I used it like this:
element = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CLASS_NAME, "header__button header__button--search"))
element.click();
But this syntax error is displayed in the console:
File "", line 2
element.click()
^ SyntaxError: invalid syntax
yet I do not see a syntax error myself.
Where can the error come from?
And is the function being used correctly?
In HTML, you specify multiple classes with a space:
<span class="class1 class2">....</span>
In Selenium, when searching by a single class, just use one of the class names:
driver.find_element(By.CSS_SELECTOR, ".class1")
This code will open the site with Selenium and click the Search button:
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
# prevent version errors and plugin warning, may not be needed for you
options = webdriver.ChromeOptions()
options.add_argument("disable-extensions")
options.add_argument("disable-plugins")
options.experimental_options["useAutomationExtension"] = False # prevent load error - Error Loading Extension - Failed to load extension from ... - Could not load extension from ... Loading of unpacked extensions is disabled
driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
# main code
driver.get('https://www.flashscore.com/')
element = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".header__button--search")))
driver.find_element(By.CSS_SELECTOR, ".header__button--search").click()
To find an element using multiple classes, check this post:
Find div element by multiple class names?

Selenium - Python element not clickable

I am trying to browse Facebook via Selenium in Python.
Here is my script so far.
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
usr = ""# I have put 3 different accounts and tested it. Same error
pwd = ""
driver = webdriver.Chrome('E:\python_libs\chromedriver.exe')
driver.get("https://www.facebook.com")
assert "Facebook" in driver.title
elem = driver.find_element_by_id("email")
elem.send_keys(usr)
elem = driver.find_element_by_id("pass")
elem.send_keys(pwd)
elem.send_keys(Keys.RETURN)
elem = driver.find_element_by_css_selector(".input.textInput")
elem.send_keys("Posted using Python's Selenium WebDriver bindings!")
elem = driver.find_element_by_css_selector(".selected")
elem.click()
time.sleep(5)
driver.close()
This script, when run on Windows with chromedriver obtained and correctly installed, returns an error.
Traceback (most recent call last):
File "C:\Users\Home\Desktop\facebook_post.py", line 28, in <module>
elem.click()
File "C:\Python27\lib\site-packages\selenium-2.42.0- py2.7.egg\selenium\webdriver\remote\webelement.py", line 60, in click
self._execute(Command.CLICK_ELEMENT)
File "C:\Python27\lib\site-packages\selenium-2.42.0- py2.7.egg\selenium\webdriver\remote\webelement.py", line 370, in _execute
return self._parent.execute(command, params)
File "C:\Python27\lib\site-packages\selenium-2.42.0-py2.7.egg\selenium\webdriver\remote\webdriver.py", line 172, in execute
self.error_handler.check_response(response)
File "C:\Python27\lib\site-packages\selenium-2.42.0-py2.7.egg\selenium\webdriver\remote\errorhandler.py", line 164, in check_response
raise exception_class(message, screen, stacktrace)
WebDriverException: Message: u'unknown error: Element is not clickable at point (481, 185). Other element would receive the click: <input type="file" class="_n _5f0v" title="Choose a file to upload" accept="image/*" name="file" id="js_0">\n (Session info: chrome=35.0.1916.114)\n (Driver info: chromedriver=2.10.267521,platform=Windows NT 6.1 x86)'
I am unable to make head or tail of this.
Any help would be appreciated.
The Graph API is not feasible here, as I want to browse as myself and not as some application.
If, however, browsing as myself can be done via the Graph API or some other means, please do tell.
If you're doing this process several times, Selenium has some options in WebDriverWait to avoid wasting too much time and to check whether elements are visible.
Here is the documentation on waits in Selenium, and this is a section of the Python Selenium documentation that talks about the expected conditions classes. It seems that "clickable" is one of the conditions you can check for.
In my case, I wrote a simple function that took care of visibility and clicking and called it every time I needed to click on something dynamic.
My example code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Browser = webdriver.Chrome()
def wait_until_visible_then_click(element):
    element = WebDriverWait(Browser, 5, poll_frequency=.2).until(
        EC.visibility_of(element))
    element.click()
EDIT:
The links above seem to be outdated now. This is the new documentation on waits, and here are the expected conditions docs.
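A usage sketch with the helper above (the selector is the one from the original question's script, so it may well be stale by now):
from selenium.webdriver.common.by import By

# Wait for the status box to become visible, click it, then type the post text.
status_box = Browser.find_element(By.CSS_SELECTOR, ".input.textInput")
wait_until_visible_then_click(status_box)
status_box.send_keys("Posted using Python's Selenium WebDriver bindings!")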
