Project: saving all the URLs/titles from https://theuselessweb.com/
Code to test (only 3 pages and print not save):
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from time import sleep
PATH = r"C:\Users\XXX\Documents\scraping\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://theuselessweb.com/")
driver.switch_to.window(driver.window_handles[-1])
button = driver.find_element_by_id("button")
for i in range(3):
    button.click()
    sleep(2)
    driver.switch_to.window(driver.window_handles[-1])
    print(driver.current_url)
    print(driver.title)
    driver.close()
Error(s):
DevTools listening on ws://127.0.0.1:60235/devtools/browser/a5ea4ab0-fba6-4a34-b0ee-8926876c554f
[11636:4168:0626/143411.535:ERROR:device_event_log_impl.cc(214)] [14:34:11.535] USB: usb_device_handle_win.cc:1058 Failed to read descriptor from node connection: Ein an das System angeschlossenes Gerät funktioniert nicht. (0x1F)
[11636:4168:0626/143411.552:ERROR:device_event_log_impl.cc(214)] [14:34:11.552] USB: usb_device_handle_win.cc:1058 Failed to read descriptor from node connection: Ein an das System angeschlossenes Gerät funktioniert nicht. (0x1F)
[11636:4168:0626/143411.555:ERROR:device_event_log_impl.cc(214)] [14:34:11.555] USB: usb_device_handle_win.cc:1058 Failed to read descriptor from node connection: Ein an das System angeschlossenes Gerät funktioniert nicht. (0x1F)
https://thatsthefinger.com/ #this is what I want
The finger, deal with it. #this is what I want
Traceback (most recent call last):
File "C:\Users\XXX\Documents\scraping\programs\linkscraping.py", line 16, in <module>
button.click()
File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\remote\webelement.py", line 80, in click
self._execute(Command.CLICK_ELEMENT)
File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\remote\webelement.py", line 633, in _execute
return self._parent.execute(command, params)
File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchWindowException: Message: no such window: target window already closed
from unknown error: web view not found
(Session info: chrome=91.0.4472.124)
It prints out the URL and title of the first website and then crashes. Also, every time I run the driver.get(ANYURL) command, it opens the link AND the Chrome settings (chrome://settings/triggeredResetProfileSettings). Maybe this messes it up. Either way, it would be really helpful if I could get rid of this unwanted window too.
Here is a solution to the problem. It still opens every link, but since Chrome runs headless the windows are not visible to the user.
In this case, x is the number of random websites you want to extract.
The code opens the site, clicks the button x times, then switches to each opened tab in turn and prints its URL and title. At the end, it closes Chrome.
from selenium.webdriver.chrome.options import Options
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
options = Options()
options.headless = True
driver = webdriver.Chrome(
ChromeDriverManager().install(),
options=options
)
x = 10
driver.get('https://theuselessweb.com/')
button = driver.find_element_by_id("button")
for i in range(x):
    button.click()
for i in range(x):
    driver.switch_to.window(driver.window_handles[i+1])
    print(driver.current_url)
    print(driver.title)
driver.quit()
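A note on why the loop in the question fails, as far as the traceback suggests: driver.close() closes the tab that currently has focus, so the next button.click() is sent to a window that no longer exists. Below is a minimal sketch of a variant that remembers the original tab and switches back to it after each visit. The name collect_pages and its parameters are made up for illustration; driver and button are assumed to be a Selenium WebDriver and the element with id "button".

```python
import time


def collect_pages(driver, button, count, pause=2.0):
    """Click `button` `count` times and return a list of (url, title) pairs.

    `driver` is assumed to behave like a Selenium WebDriver and `button`
    like a WebElement. Both are passed in, so nothing here starts a browser.
    """
    original = driver.current_window_handle  # the tab that holds the button
    results = []
    for _ in range(count):
        known = set(driver.window_handles)
        button.click()
        time.sleep(pause)  # crude wait for the new tab, as in the question
        new_handles = [h for h in driver.window_handles if h not in known]
        if not new_handles:
            continue  # no new tab appeared, move on to the next click
        driver.switch_to.window(new_handles[-1])
        results.append((driver.current_url, driver.title))
        driver.close()                     # closes only the newly opened tab
        driver.switch_to.window(original)  # refocus before the next click
    return results
```

With a real session this could be called as collect_pages(driver, driver.find_element_by_id("button"), 3), and the returned list written to a file instead of printed.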
I am having a terrible time getting text from a shadow root. I found several docs and my code looks right to me, except I always get:
Traceback (most recent call last):
File "C:\python\vttest.py", line 29, in
print(driver.execute_script("return document.querySelector('vt-ui-shell').shadowRoot.querySelector('url-view').shadowRoot.querySelector('vt-ui-main-generic-report').shadowRoot.querySelector('vt-ui-url-card').shadowRoot.querySelector('vt-ui-generic-card').shadowRoot.querySelector('p')").text)
File "C:\Users\barberion.NATJ\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 634, in execute_script
return self.execute(command, {
File "C:\Users\barberion.NATJ\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\barberion.NATJ\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.JavascriptException: Message: javascript error: Cannot read properties of null (reading 'shadowRoot')
(Session info: chrome=96.0.4664.93)
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver import DesiredCapabilities # necessary in headless mode, need this library to accept ssl
from selenium.webdriver import Chrome
from selenium.webdriver.support.ui import WebDriverWait
#from selenium.webdriver.support.select import Select
capabilities = DesiredCapabilities.CHROME.copy()
capabilities['acceptSslCerts'] = True
capabilities['acceptInsecureCerts'] = True
opts = Options()
#opts.headless = True # comment this out to get out of headless
#opts.add_argument("--window-size=1400x1400")
opts.add_argument("--enable-javascript")
opts.add_argument("--start-maximized")
opts.add_argument('--disable-gpu')
searchvt=input("Enter URL: ")
driver = Chrome(options=opts,desired_capabilities=capabilities,executable_path='C:/python/chromedriver.exe')
driver.get('https://www.virustotal.com/gui/home/url')
url='//*[@id="view-container"]/home-view'
time.sleep(.5)
driver.find_element_by_xpath(url).send_keys(searchvt)
time.sleep(.5)
driver.find_element_by_xpath(url).send_keys(Keys.RETURN)
driver.maximize_window()
driver.set_page_load_timeout(5)
time.sleep(3)
print(driver.execute_script("return document.querySelector('vt-ui-shell').shadowRoot.querySelector('url-view').shadowRoot.querySelector('vt-ui-main-generic-report').shadowRoot.querySelector('vt-ui-url-card').shadowRoot.querySelector('vt-ui-generic-card').shadowRoot.querySelector('p')").text)
My error must be somewhere in the print :( Any help is most appreciated!
Edit:
To be more specific, I want to submit a URL or domain to VirusTotal and then read the result text at the top of the page. If you use MSN.com, the text I want is at the top, where it says "No security vendors flagged this URL as malicious".
If you update to Selenium 4.0 and use a Chromium browser v96+ you can use the new shadow_root property in Python Selenium and avoid using JavaScript entirely.
This is a working example:
driver.get('http://watir.com/examples/shadow_dom.html')
shadow_host = driver.find_element(By.CSS_SELECTOR, '#shadow_host')
shadow_root = shadow_host.shadow_root
shadow_content = shadow_root.find_element(By.CSS_SELECTOR, '#shadow_content')
assert shadow_content.text == 'some text'
Source: https://titusfortner.com/2021/11/22/shadow-dom-selenium.html
My shadow DOMs were just out of order. Once I modified the print to:
print(driver.execute_script("return document.querySelector('url-view').shadowRoot.querySelector('vt-ui-url-card').shadowRoot.querySelector('vt-ui-generic-card').querySelector('p')").text)
everything worked as expected.
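For nested shadow roots like the ones above, the shadow_root property from Selenium 4 can be chained instead of building one long JavaScript string. The following is only a sketch: the helper name find_in_shadow is made up, and the selectors in the usage note are taken from the corrected print above on the assumption that the page structure is unchanged.

```python
# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR,
# spelled out here so the sketch has no import-time dependency.
CSS = "css selector"


def find_in_shadow(driver, host_selectors, leaf_selector):
    """Walk through nested shadow hosts and return the final element.

    `host_selectors` lists the shadow hosts from the outermost inwards.
    `leaf_selector` is looked up inside the innermost shadow root.
    Assumes Selenium 4 objects, where elements expose `.shadow_root`.
    """
    context = driver
    for selector in host_selectors:
        host = context.find_element(CSS, selector)
        context = host.shadow_root  # step into this host's shadow tree
    return context.find_element(CSS, leaf_selector)
```

For the VirusTotal page this might be called as find_in_shadow(driver, ["url-view", "vt-ui-url-card"], "vt-ui-generic-card p").text. If a host is missing, find_element raises at the exact step that failed, which is easier to debug than "Cannot read properties of null".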
As you can see, I'm starting to code a bot to cop sneakers on SNKRS. I've already coded a few useless things, but never mind those. I want the bot to click on the log in button. When I run the code, it opens Chrome and then the Nike website, but it doesn't click on the button and I get this error message:
Traceback (most recent call last):
File "C:/Users/xxx/xxx/xxx/xxx/xxx/SNKRS_bot/snkrs bot.py", line 11, in <module>
loginBtn = driver.find_elements_by_xpath("/html/body/div[2]/div/div/div[1]/div/header/div[1]/section/div/ul/li[1]/button").click()
AttributeError: 'list' object has no attribute 'click'
Here is my code :
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
import time
PATH = "C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://www.nike.com/ch/fr/launch?s=upcoming")
time.sleep(3)
loginBtn = driver.find_elements_by_xpath("/html/body/div[2]/div/div/div[1]/div/header/div[1]/section/div/ul/li[1]/button").click()
time.sleep(6)
driver.quit()
Thanks a lot
See my comment to your posted question. I did in the end find your button, but it is not clickable without resorting to using JavaScript:
from selenium import webdriver
PATH = "C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
try:
    driver.implicitly_wait(6) # wait up to 6 seconds to find an element
    driver.get("https://www.nike.com/ch/fr/launch?s=upcoming")
    loginBtns = driver.find_elements_by_xpath("/html/body/div[2]/div/div/div[1]/div/header/div[1]/section/div/ul/li[1]/button")
    if loginBtns:
        #loginBtns[0].click()
        driver.execute_script('arguments[0].click()', loginBtns[0])
    else:
        print('Could not find login button.')
finally:
    driver.quit()
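The AttributeError in the question comes from find_elements_by_xpath (plural), which returns a list even when only one element matches, and a list has no click() method. A small sketch of the pattern used above, wrapped in a helper, follows. The name click_first is made up for illustration and driver is assumed to be a Selenium WebDriver.

```python
def click_first(driver, elements):
    """Click the first element of a find_elements-style list via JavaScript.

    Returns True if something was clicked and False if the list was empty,
    so the caller can decide how to report a missing button.
    """
    if not elements:
        return False
    # Same JavaScript click as in the answer above.
    driver.execute_script("arguments[0].click()", elements[0])
    return True
```

Usage would look like: if not click_first(driver, loginBtns): print('Could not find login button.')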
When I'm executing this code with Selenium using Python:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
driver = webdriver.Chrome(executable_path=r'/Users/qa/Documents/Python/chromedriver')
The error occurred:
Traceback (most recent call last):
File "/Users/qa/Documents/Python/try.py", line 4, in <module>
driver = webdriver.Chrome(executable_path=r'/Users/qa/Documents/Python/chromedriver')
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 81, in __init__
desired_capabilities=desired_capabilities)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
self.start_session(capabilities, browser_profile)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created
from disconnected: unable to connect to renderer
(Session info: chrome=71.0.3578.98)
(Driver info: chromedriver=2.44.609545 (c2f88692e98ce7233d2df7c724465ecacfe74df5),platform=Mac OS X 10.13.6 x86_64)
Can someone help me? Thanks.
I had a similar error, first getting the error message:
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited normally.
(unknown error: DevToolsActivePort file doesn't exist)
This was solved by adding options.add_argument("--remote-debugging-port=9230") to the ChromeOptions(). The program then ran once, after which I got the same error message as above:
selenium.common.exceptions.SessionNotCreatedException: Message: session not created
from disconnected: unable to connect to renderer
(Session info: headless chrome=89.0.4389.114)
The problem here was that the Chrome process was not closed properly by the program, so the process was still active on the debugging port. To solve this problem, kill the process holding the port with sudo kill -9 $(sudo lsof -t -i:9230) and add the following lines to the end of the code:
driver.stop_client()
driver.close()
driver.quit()
Since I didn't find this answer anywhere, I hope it helps someone.
If you have options.add_argument("--remote-debugging-port=9222"), change it to options.add_argument("--remote-debugging-port=9230").
Or simply adding options.add_argument("--remote-debugging-port=9230") fixed it in my case.
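Since a hard-coded port can still be held by a leftover Chrome process, one option is to ask the operating system for a free port on each run. This is only a sketch with made-up helper names (find_free_port, debugging_port_argument). There is also a small window between releasing the port and Chrome binding it, so it is not a guarantee.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


def debugging_port_argument(port):
    """Build the Chrome command-line switch used in the answers above."""
    return f"--remote-debugging-port={port}"
```

The result would then be passed as options.add_argument(debugging_port_argument(find_free_port())).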
This error message...
selenium.common.exceptions.SessionNotCreatedException: Message: session not created
from disconnected: unable to connect to renderer
...implies that ChromeDriver was unable to initiate/spawn a new browser session, i.e. a Chrome browser session.
You need to consider one fact:
As you are using Mac OS X, the executable_path key must be given a value such as:
'/Users/qa/Documents/Python/chromedriver'
So the line will be:
driver = webdriver.Chrome(executable_path='/Users/qa/Documents/Python/chromedriver')
Note: The path contains no backslashes, so the r prefix is not needed and can be dropped.
Additionally, ensure that /etc/hosts on your system contains the following entry :
127.0.0.1 localhost.localdomain localhost
#or
127.0.0.1 localhost loopback
I ran into this same issue on a Windows 10 machine. What I had to do to resolve the issue was to open the Task Manager and exit all Python.exe processes, along with all Chrome.exe processes. After doing this,
I faced the same error, and adding the cleanup code solved the problem. The code with the finally block looks like this:
@staticmethod
def fetch_music_download_url(music_name: str):
    chrome_driver_service = Service(ChromeDriverManager(chrome_type=ChromeType.GOOGLE).install())
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument("--remote-debugging-port=9230")
    driver = webdriver.Chrome(service=chrome_driver_service, options=chrome_options)
    try:
        driver.maximize_window()
        driver.get('http://example.cn/music/?page=audioPage&type=migu&name=' + music_name)
        driver.implicitly_wait(5)
        # do some logic
    except Exception as e:
        logger.error(e)
    finally:
        # add the close logic
        driver.stop_client()
        driver.close()
        driver.quit()
        chrome_driver_service.stop()
The key is to stop the Chrome driver service after using it by adding chrome_driver_service.stop(). I hope this code helps other people facing the same issue.
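The same cleanup can be packaged as a context manager so it runs even if the scraping code raises. This is a sketch only: the name closing_browser is made up, driver is assumed to have a quit() method, and service (optional) is assumed to have a stop() method, like the Service object used above.

```python
from contextlib import contextmanager


@contextmanager
def closing_browser(driver, service=None):
    """Yield `driver`, then always quit it and stop `service` if given."""
    try:
        yield driver
    finally:
        try:
            driver.quit()  # ends the session and closes all windows
        finally:
            if service is not None:
                service.stop()  # stop the driver process as well
```

Usage would be: with closing_browser(webdriver.Chrome(service=svc, options=opts), svc) as driver: driver.get(...). The nested finally makes sure the service is stopped even when quit() itself fails.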
This worked for me on WINDOWS OS
from selenium import webdriver
import time
from bs4 import BeautifulSoup
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome(options=options)
driver.get(url)
time.sleep(3)
page = driver.page_source
driver.quit()
soup = BeautifulSoup(page, 'html.parser')
Hope you'd find this useful. I also used it more comprehensively here.
Hello everyone. Forgive me, I don't think I explained my problem well, so I will try again. My written English sometimes fails me, sorry if it is hard to understand.
What I want to do is automate access to a website, which I link here:
RUNT
I have solved the first part, which is entering the data into the form and solving the "I'm not a robot" check before submitting.
I'm going to post all the code in Python:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
import threading
import time
import csv
import os
# options = webdriver.ChromeOptions()
# options.add_argument(
# r'user-data-dir=C:\Users\RADEON\AppData\Local\Google\Chrome\user Data\default')
#
# options.add_extension(r"C:\Users\RADEON\Documents\Web Driver\Selenium\exs.crx")
# driver = webdriver.Chrome(
# executable_path="C:\\Users\\RADEON\\Documents\\Web Driver\\chrome Driver\\chromedriver.exe",
# chrome_options=options
# )
#
# driver = webdriver.Firefox()
#
# driver.get("https://www.runt.com.co/consultaCiudadana/#/consultaVehiculo")
# assert "Consulta Ciudadano - RUNT" in driver.title
#
# wait = WebDriverWait(driver, 2)
# wait.until(EC.presence_of_element_located((By.ID, "noPlaca")))
#
#
# wait.until(EC.presence_of_element_located((By.ID, "noPlaca")))
class Runt:
    def __init__(self, placa, nit, time):
        self.placa = placa
        self.nit = nit
        self.options = webdriver.ChromeOptions()
        self.options.add_extension(
            r"C:\Users\RADEON\Documents\Web Driver\cp.crx")
        self.driver = webdriver.Chrome(
            chrome_options=self.options)
        self.wait = WebDriverWait(self.driver, time)
        self.wait_API = WebDriverWait(self.driver, 150)

    def closeBrowser(self):
        self.driver.close()

    def run(self):
        driver = self.driver
        wait = self.wait
        wait_api = self.wait_API
        driver.get("https://www.runt.com.co/consultaCiudadana/#/consultaVehiculo")
        wait.until(EC.presence_of_element_located((By.ID, "noPlaca")))
        placa = driver.find_element_by_id('noPlaca')
        placa.clear()
        placa.send_keys(self.placa)
        wait.until(EC.presence_of_element_located((By.NAME, "noDocumento")))
        owner = driver.find_element_by_name('noDocumento')
        owner.clear()
        owner.send_keys(self.nit)
        # /html/body/div[1]/div/div[1]/div[2]/div[1]/div[2]/div[7]/div[1]
        wait_api.until(EC.presence_of_element_located((
            By.XPATH, "//*[@id='captcha']/div/div[2]/a[1]")))
        while (driver.find_element_by_xpath("//*[@id='captcha']/div/div[2]/a[1]").get_attribute("innerText") != "Solved"):
            print("Search Solution....")
        print("solution found...")
        if(driver.find_element_by_xpath("//*[@id='captcha']/div/div[2]/a[1]").get_attribute("innerText") == "Solved"):
            driver.find_element_by_xpath(
                "/html/body/div[1]/div/div[1]/div[2]/div[1]/div[2]/div[1]/div[3]/div[2]/div/div/form/div[8]/button").click()
            WebDriverWait(driver, 20).until(EC.element_to_be_clickable(
                (By.CSS_SELECTOR, "div.panel.panel-default>div.panel-heading>h4.panel-title a"))).click()
            vigente = driver.find_element_by_xpath(
                "//*[@id='pnlRevisionTecnicoMecanicaNacional']/div/div/div/table/tbody/tr[1]/td[5]")
            print(vigente.get_attribute("innerText"))
runt2 = Runt("aqd470", 63364079, 2)
# runt1 = Runt("aqd470", 45884847, 2)
#
# thread1 = threading.Thread(target=runt1.run)
thread2 = threading.Thread(target=runt2.run)
#
# thread1.start()
thread2.start()
# r'C:\Users\RADEON\Documents\Web Driver\csv.csv'
Ignore the threads, I was just doing some tests.
When I run this program on the website mentioned above, it waits for the "I am not a robot" check to be solved and submits the form, and then the page with the information I want appears.
However, the information does not appear in the HTML until I click on the following div:
<div class="panel-heading" ng-click="togglePanel('pnlRevisionTecnicoMecanicaNacional');
consultarDetalle('RevisionTecnicoMecanicaNacional')">
<h4 class="panel-title">
<i class="i_hoja s_25_o1 ib"></i>
<a> Certificado de revisión técnico mécanica y de gases (RTM)</a>
</h4>
</div>
You can use this example data: enter "AFD329" for Nplaca and "6656954" for Documento.
The other fields can be left at their defaults.
I need to click on that element to load the rest of the query. I would appreciate it a lot if you could help me. The traceback is:
Traceback (most recent call last):
File "C:\Users\RADEON\AppData\Local\Programs\Python\Python36-32\lib\threading.py", line 916, in _bootstrap_inner
self.run()
File "C:\Users\RADEON\AppData\Local\Programs\Python\Python36-32\lib\threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "bypass.py", line 86, in run
"/html/body/div[1]/div/div[1]/div[2]/div[1]/div[2]/div[7]/div[1]").click()
File "C:\Users\RADEON\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 80, in click
self._execute(Command.CLICK_ELEMENT)
File "C:\Users\RADEON\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 633, in _execute
return self._parent.execute(command, params)
File "C:\Users\RADEON\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\RADEON\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotVisibleException: Message: element not interactable
(Session info: chrome=70.0.3538.102)
(Driver info: chromedriver=2.43.600210 (68dcf5eebde37173d4027fa8635e332711d2874a),platform=Windows NT 10.0.16299 x86_64)
I am trying to browse facebook via selenium in python.
Here is my script so far.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
usr = ""# I have put 3 different accounts and tested it. Same error
pwd = ""
driver = webdriver.Chrome('E:\python_libs\chromedriver.exe')
driver.get("https://www.facebook.com")
assert "Facebook" in driver.title
elem = driver.find_element_by_id("email")
elem.send_keys(usr)
elem = driver.find_element_by_id("pass")
elem.send_keys(pwd)
elem.send_keys(Keys.RETURN)
elem = driver.find_element_by_css_selector(".input.textInput")
elem.send_keys("Posted using Python's Selenium WebDriver bindings!")
elem = driver.find_element_by_css_selector(".selected")
elem.click()
time.sleep(5)
driver.close()
This script when run on windows with chromedriver obtained and correctly installed returns an error.
Traceback (most recent call last):
File "C:\Users\Home\Desktop\facebook_post.py", line 28, in <module>
elem.click()
File "C:\Python27\lib\site-packages\selenium-2.42.0- py2.7.egg\selenium\webdriver\remote\webelement.py", line 60, in click
self._execute(Command.CLICK_ELEMENT)
File "C:\Python27\lib\site-packages\selenium-2.42.0- py2.7.egg\selenium\webdriver\remote\webelement.py", line 370, in _execute
return self._parent.execute(command, params)
File "C:\Python27\lib\site-packages\selenium-2.42.0-py2.7.egg\selenium\webdriver\remote\webdriver.py", line 172, in execute
self.error_handler.check_response(response)
File "C:\Python27\lib\site-packages\selenium-2.42.0-py2.7.egg\selenium\webdriver\remote\errorhandler.py", line 164, in check_response
raise exception_class(message, screen, stacktrace)
WebDriverException: Message: u'unknown error: Element is not clickable at point (481, 185). Other element would receive the click: <input type="file" class="_n _5f0v" title="Choose a file to upload" accept="image/*" name="file" id="js_0">\n (Session info: chrome=35.0.1916.114)\n (Driver info: chromedriver=2.10.267521,platform=Windows NT 6.1 x86)'
I am unable to make head or tail of this.
Any help would be appreciated.
The Graph API is not feasible here as I want to browse as myself and not as some application.
If however browsing as myself can be done by graph API or some other means, please do tell.
If you're doing this process several times, Selenium has some options in WebDriverWait to not waste too much time and check to see if elements are visible.
Here is the documentation on Waits in Selenium and
this is a section of the Python documentation of Selenium that talks about the Expected Conditions classes. It seems that "clickable" is one of the conditions that you can check for.
In my case, I wrote a simple function that took care of visibility and clicking and called it every time I needed to click on something dynamic.
My example code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Browser = webdriver.Chrome()
def wait_until_visible_then_click(element):
    element = WebDriverWait(Browser,5,poll_frequency=.2).until(
        EC.visibility_of(element))
    element.click()
EDIT:
The links above seem to be nerfed. This is the new documentation on waits and here are the expected conditions docs.
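For the "Other element would receive the click" case from the question, visibility alone may not be enough, because the element can be visible and still covered. One hedged alternative is to retry the click itself for a short time. This is a plain-Python retry loop, not Selenium's WebDriverWait. The name click_when_ready and the retry_on parameter are made up for illustration; with Selenium you would probably pass retry_on=(WebDriverException,).

```python
import time


def click_when_ready(element, timeout=5.0, poll=0.2, retry_on=(Exception,)):
    """Retry element.click() until it succeeds or `timeout` seconds pass.

    Exceptions listed in `retry_on` are treated as "not clickable yet".
    Once the deadline has passed, the last such exception is re-raised.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            element.click()
            return
        except retry_on:
            if time.monotonic() >= deadline:
                raise
            time.sleep(poll)  # give the page time to settle, then retry
```

It can be used wherever the example function above is called, for instance click_when_ready(elem, timeout=5, poll=0.2).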