"Permission Denied" error ruins Selenium scraping

I've been scraping a website using Selenium (Python Webdriver). When I try to have it click() an option, I get a permission denied error. Full stack trace:
Traceback (most recent call last):
File "scrape.py", line 19, in <module>
subjectOptions[1].click()
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webelement.py", line 45, in click
self._execute(Command.CLICK_ELEMENT)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webelement.py", line 194, in _execute
return self._parent.execute(command, params)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 153, in execute
self.error_handler.check_response(response)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 147, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: u"'Error: Permission denied for <http://localhost/scrape_test> to get property HTMLDocument.compatMode' when calling method: [wdIMouse::move]"
Here is the code that causes the problem. I know for a fact that the option I'm trying to click exists (based on print):
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait #available since 2.4.0
import time
# Create a new instance of the FireFox driver
driver = webdriver.Firefox()
# go to the local version of the page for testing
driver.get("http://localhost/scrape_test")
# Find the select by ID, get its options
selectElement = driver.find_element_by_id("CLASS_SRCH_WRK2_SUBJECT$65$")
subjectOptions = selectElement.find_elements_by_tag_name("option")
# Click the desired option
subjectOptions[1].click()
I'm using Firefox 8.0.1 on Mac OS X 10.7.2

Looks like it's a webdriver bug. The latest log entry from the programmer who last modified one of the selenium source code files says:
This leads to permissions errors, which I've still been unable to
reduce:
Error: Permission denied for http://www.finn.no to get property
HTMLDocument.compatMode' when calling method: [wdIMouse::move]
There is some discussion about the issue here, here and here.
According to the discussion it should work fine with Firefox 7. Also, this related issue implies that the link is still clicked in spite of the error, so it might work inside a try/except.
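A minimal sketch of the try/except idea, not a confirmed fix: the helper below clicks an element and returns the exception instead of raising it, so the script can carry on and check afterwards whether the click took effect. `click_tolerating` and its parameters are hypothetical names; the exception class is passed in so the helper itself does not depend on Selenium.

```python
def click_tolerating(element, tolerated_errors):
    """Click `element`; return the caught exception instead of raising it.

    `tolerated_errors` is an exception class (or tuple of classes) to swallow,
    for example selenium.common.exceptions.WebDriverException.
    Returns None when click() raised nothing.
    """
    try:
        element.click()
    except tolerated_errors as exc:
        # The click may still have happened; the caller should verify.
        return exc
    return None
```

With the question's code this could be called as `click_tolerating(subjectOptions[1], WebDriverException)`, followed by a check such as `subjectOptions[1].is_selected()` to confirm that the option really changed.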

For the time being, you can get around the problem with the workaround given here.

Related

Python, Variables in 2 files and selenium

I’ve created main.py and login.py and tried to link the two files together, but I don’t know how to do that correctly.
I’m using Selenium and the program works, but it opens two Chrome windows when I want just one. The second window goes ahead and works perfectly, but once it is done with the login file this error suddenly appears:
Traceback (most recent call last):
File "C:/Users/alebu/PycharmProjects/selenium/Main.py", line 9, in <module>
Login.login()
File "C:\Users\alebu\PycharmProjects\selenium\Login.py", line 6, in login
Main.browser.find_element_by_link_text('Log in').click()
File "C:\Users\alebu\PycharmProjects\selenium\venv\lib\site-
packages\selenium\webdriver\remote\webdriver.py", line 419, in
find_element_by_link_text
return self.find_element(by=By.LINK_TEXT, value=link_text)
File "C:\Users\alebu\PycharmProjects\selenium\venv\lib\site-
packages\selenium\webdriver\remote\webdriver.py", line 955, in find_element
'value': value})['value']
File "C:\Users\alebu\PycharmProjects\selenium\venv\lib\site-
packages\selenium\webdriver\remote\webdriver.py", line 312, in execute
self.error_handler.check_response(response)
File "C:\Users\alebu\PycharmProjects\selenium\venv\lib\site-
packages\selenium\webdriver\remote\errorhandler.py", line 242, in
check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element:
Unable to locate element: {"method":"link text","selector":"Log in"}
(Session info: chrome=64.0.3282.167)
(Driver info: chromedriver=2.35.528161
(5b82f2d2aae0ca24b877009200ced9065a772e73),platform=Windows NT 10.0.16299
x86_64)
Process finished with exit code 1
I have probably done something wrong with the variables.
main.py:
from selenium import webdriver
import Login
driver_location = r"C:\webDrivers\chromedriver.exe"  # raw string keeps the backslashes literal
options = webdriver.ChromeOptions()
options.add_argument('--lang=en')
browser = webdriver.Chrome(executable_path=driver_location,
                           chrome_options=options)
Login.login()
login.py:
def login():
    import Main
    from time import sleep
    Main.browser.get('https://www.instagram.com')
    Main.browser.find_element_by_link_text('Log in').click()
    Main.browser.find_element_by_name('username').send_keys('*******')
    Main.browser.find_element_by_name('password').send_keys('********')
    Main.browser.find_element_by_xpath('//form/span/button[text()="Log in"]').click()
    sleep(3)
    Main.browser.find_element_by_link_text('Not Now').click()
    sleep(2)
    print("Logged In")
The weird thing is: before the program was in one unique file and it worked perfectly.

I would suggest reading some material on how import works in Python here.
To get what you have working quickly, do not import Main inside your login function. Try passing the driver as an argument to your login function, like so (note: once the browser is passed in as an argument and Main is no longer imported, you use browser rather than Main.browser):
from time import sleep
def login(browser):
    browser.get('https://www.instagram.com')
Then when you go to call login from your main, you will want to pass browser as an argument:
Login.login(browser)
This should fix the problem you were having with two browsers opening. If the Log in link is still not found, please read this, and ask a new question if you still need help once you understand how it works.
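To see why two browser windows opened, here is a small self-contained sketch that reproduces the import behaviour without Selenium. It writes throwaway copies of Main.py and Login.py into a temporary directory (with `object()` standing in for `webdriver.Chrome(...)`, an assumption made purely for illustration) and runs Main.py as a script. Because the running script is registered under the name `__main__`, the `import Main` inside `login()` executes the file a second time under the name `Main`, so the "browser" is created twice. The name `run_demo` is hypothetical.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

MAIN_SRC = textwrap.dedent('''
    import Login
    print("creating browser in module", __name__)
    browser = object()  # stand-in for webdriver.Chrome(...)
    Login.login()
''')

LOGIN_SRC = textwrap.dedent('''
    def login():
        import Main  # runs Main.py again if it was started as a script
        print("login() sees the browser created in module", Main.__name__)
''')


def run_demo():
    """Run the two-file layout from the question and return its output lines."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, src in (("Main.py", MAIN_SRC), ("Login.py", LOGIN_SRC)):
            with open(os.path.join(tmp, name), "w") as fh:
                fh.write(src)
        result = subprocess.run(
            [sys.executable, "Main.py"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )
        return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The "creating browser" line is printed twice, once for `__main__` and once for `Main`, and both `login()` calls report the copy created in `Main`. Passing the browser as an argument avoids the second execution entirely.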

Timeout when using click() webdriver selenium function Python

This is my first web scraping project and I'm using selenium webdriver with Python in order to dynamically generate some csv files after choosing a few options on a website (though I'm not there yet).
However, I'm facing an unexpected timeout when the execution reaches a button click(). The click is performed but it gets stuck in there and does not continue the execution till the timeout.
Any clues on how to solve that?
Thanks!!
from selenium import webdriver
from selenium.webdriver.support.ui import Select
import time
driver = webdriver.Firefox()
driver.get('http://www8.receita.fazenda.gov.br/SimplesNacional/Aplicacoes/ATBHE/estatisticasSinac.app/Default.aspx')
driver.find_element_by_id('ctl00_ctl00_Conteudo_AntesTabela_lnkOptantesPorCNAE').click()
Select(driver.find_element_by_id("ctl00_ctl00_Conteudo_AntesTabela_ddlColuna")).select_by_visible_text("Município")
filtro_uf = driver.find_element_by_id('ctl00_ctl00_Conteudo_AntesTabela_btnFiltros')
for i in range(1, 28):
    filtro_uf.click()
    uf = Select(driver.find_element_by_id("ctl00_ctl00_Conteudo_AposTabela_ddlUf"))
    uf.options[i].click()
    time.sleep(2)
    driver.find_element_by_id('chkTodosMunicipios').click()
    time.sleep(2)
    driver.find_element_by_xpath("//*[contains(text(),'Ok')]").click()
    time.sleep(2)
    # Here is where my code get stuck and gets a timeout
    driver.find_element_by_id('ctl00_ctl00_Conteudo_AntesTabela_btnExibir').click()
The error I get:
Traceback (most recent call last):
File "/home/hissashi/Desktop/Python3/WS_SINAC/download_SINAC.py", line 22, in <module> driver.find_element_by_id('ctl00_ctl00_Conteudo_AntesTabela_btnExibir').click()
File "/home/hissashi/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webelement.py", line 80, in click
self._execute(Command.CLICK_ELEMENT)
File "/home/hissashi/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webelement.py", line 501, in _execute
return self._parent.execute(command, params)
File "/home/hissashi/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 308, in execute
self.error_handler.check_response(response)
File "/home/hissashi/.local/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: Timeout loading page after 300000ms

I've found a workaround for the problem.
Apparently, the click() function blocks the code until the page is "completely" loaded. However, for some reason, the page keeps loading forever (without anything else to load) and it holds my code till it reaches the timeout limit.
Instead of using click, I've changed it to key ENTER and the page still keeps loading forever but it doesn't hold the code anymore.
# Before: click() blocks until the page load finishes
driver.find_element_by_id('ctl00_ctl00_Conteudo_AntesTabela_btnExibir').click()
# After: send the ENTER key instead (u'\ue007' is the value of Keys.ENTER)
driver.find_element_by_id('ctl00_ctl00_Conteudo_AntesTabela_btnExibir').send_keys(u'\ue007')

Encountering selenium.common.exceptions.WebDriverException only on specific web page

I'm trying to create a content scraper to gather a list of medical terms and I'm using https://www.merriam-webster.com/browse/medical for that task. All I have in my code right now is:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
driver.get("https://www.merriam-webster.com/browse/medical/a")
driver.close()
The following error message is shown after every run of the program using this or any https://www.merriam-webster.com URL:
Traceback (most recent call last):
File "mw-download.py", line 5, in <module>
driver.get("https://www.merriam-webster.com/browse/medical/a")
File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 268, in get
self.execute(Command.GET, {'url': url})
File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Reached error page: about:neterror?e=dnsNotFound&u=https%3A//nop.xpanama.net/if.html%3Fadflag%3D1%26cb%3DKHg6C73w9F&c=UTF-8&f=regular&d=Firefox%20can%E2%80%99t%20find%20the%20server%20at%20nop.xpanama.net.
I have tried changing the URL to different sites, with and without https support to test if it was something with https, but I haven't encountered this error with any other site. I also tried removing the driver.close() command at the end of the script to see if trying to close the driver before the page contents were loaded is what was causing the problem but the problem persists even without that line of code.
Any help in understanding this error and fixing it would be greatly appreciated!
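One way to read this message, offered as a sketch rather than a diagnosis: the `about:neterror` address in the exception is itself a URL-encoded query string, and decoding it shows which host Firefox failed to resolve. The function name `describe_neterror` is hypothetical; the message text is copied from the traceback above.

```python
from urllib.parse import parse_qs, urlsplit

MESSAGE = (
    "Reached error page: about:neterror?e=dnsNotFound"
    "&u=https%3A//nop.xpanama.net/if.html%3Fadflag%3D1%26cb%3DKHg6C73w9F"
    "&c=UTF-8&f=regular"
    "&d=Firefox%20can%E2%80%99t%20find%20the%20server%20at%20nop.xpanama.net."
)


def describe_neterror(message):
    """Return (error_code, failing_host) decoded from an about:neterror message."""
    # Everything after the first "?" is an ordinary URL-encoded query string.
    query = message.split("?", 1)[1]
    fields = parse_qs(query)
    error_code = fields["e"][0]
    failing_url = fields["u"][0]
    return error_code, urlsplit(failing_url).hostname


if __name__ == "__main__":
    print(describe_neterror(MESSAGE))
```

For the message above this gives the error code dnsNotFound and the host nop.xpanama.net, so the lookup that failed was for that host rather than for www.merriam-webster.com itself. Why the page requests that host is not something the message reveals.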

Can't open IE using selenium in python

I am running on a Windows 10 machine, Internet Explorer 11, python 3.6, selenium 3.4.3 with IEDriverServer 3.5. I am trying to open up IE using the following code.
from selenium import webdriver
import os
driverLocation = "C:\\Users\\JD\\PycharmProjects\\Lib\\IEDriverServer.exe"
os.environ["webdriver.ie.driver"] = driverLocation
driver = webdriver.Ie(driverLocation)
google = "https://google.com"
driver.get(google)
The output:
Traceback (most recent call last):
File "C:/Users/J/PycharmProjects/Automation/IE_Test.py", line 7, in <module>
driver = webdriver.Ie(driverLocation)
File "C:\Users\JD\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\ie\webdriver.py", line 57, in __init__
desired_capabilities=capabilities)
File "C:\Users\JD\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 98, in __init__
self.start_session(desired_capabilities, browser_profile)
File "C:\Users\JD\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 188, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "C:\Users\JD\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 256, in execute
self.error_handler.check_response(response)
File "C:\Users\JD\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Invalid capabilities in alwaysMatch: unknown capability named platform
Any help would be greatly appreciated thanks.
UPDATE:
I added this to my previous code,
capabilities = DesiredCapabilities.INTERNETEXPLORER
print(capabilities["platform"])
print(capabilities["browserName"])
OUTPUT:
WINDOWS
internet explorer
File "C:\Users\JD\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Invalid capabilities in alwaysMatch: unknown capability named platform
UPDATE:
I have also tried setting the capabilities explicitly, but I still receive the same error, "unknown capability named platform":
caps = DesiredCapabilities.INTERNETEXPLORER.copy()
caps["platform"] = "WINDOWS"
caps["browserName"] = "internet explorer"
caps["requireWindowFocus"] = True
browser = webdriver.Ie(capabilities=caps,
                       executable_path="C:\\Users\\JD\\PycharmProjects\\Lib\\IEDriverServer.exe")
browser.get("https://www.facebook.com/")

I had the same problem for a couple of days.
My workaround was to delete the platform and version keys from the capabilities dictionary.
Example:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
# create capabilities (copy so the shared class-level dict is not modified)
capabilities = DesiredCapabilities.INTERNETEXPLORER.copy()
# delete platform and version keys
capabilities.pop("platform", None)
capabilities.pop("version", None)
#start an instance of IE
driver = webdriver.Ie(executable_path="C:\\your\\path\\to\\IEDriverServer.exe", capabilities=capabilities)
driver.get("https://www.google.com/")
My guess so far is that this error happens because w3c_caps is passed as the only valid set of capabilities. You can see that in the traceback:
response = self.execute(Command.NEW_SESSION, parameters)
If you click through to that line in the source, you will see:
w3c_caps["alwaysMatch"].update(capabilities)
As you can see here, _W3C_CAPABILITY_NAMES holds different values from the ones we are passing.
We are passing "WINDOWS" as "platform", while _W3C_CAPABILITY_NAMES has "platformName" and accepts only lowercase values. The same goes for the "version" key.
So we were adding capabilities that are not recognized.
This workaround is by no means perfect, and I was able to start IE in selenium java without deleting some of the capabilities.
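The key-dropping workaround can be factored into a small helper that returns a cleaned copy and leaves the original dictionary untouched. This is only a sketch on a plain dict: `without_legacy_keys` is a hypothetical helper, and the sample dict merely mimics the values printed in the question (the empty `version` value is an assumption).

```python
def without_legacy_keys(capabilities, legacy_keys=("platform", "version")):
    """Return a copy of `capabilities` with the legacy keys removed."""
    return {key: value for key, value in capabilities.items()
            if key not in legacy_keys}


# Stand-in for DesiredCapabilities.INTERNETEXPLORER (values assumed).
sample_caps = {
    "browserName": "internet explorer",
    "platform": "WINDOWS",
    "version": "",
}
cleaned = without_legacy_keys(sample_caps)
```

The cleaned dictionary would then be passed to `webdriver.Ie(...)` through the `capabilities` argument, as in the example above, while `sample_caps` keeps all of its keys.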
EDIT: Another solution can be found here, in Grimlek's comment, which essentially says that you should delete "capabilities": w3c_caps from start_session(self, capabilities, browser_profile=None) (in remote\webdriver.py). The code in question looks like this:
w3c_caps["alwaysMatch"].update(capabilities)
parameters = {"capabilities": w3c_caps,
              "desiredCapabilities": capabilities}
Then you would not need to delete keys from capabilities.
ANOTHER EDIT: I have just updated my selenium-python from 3.4.3 to 3.5.0, and there is no longer any need to mess with capabilities.

Python script runs in console but errors out as a script

I'm working on a script to pull some information from a site that I must login to use. I'm using Python 2.7.12 and Selenium 3.4.3.
#!/usr/bin/python
from selenium import webdriver
browser = webdriver.Firefox(firefox_binary='/usr/bin/firefox', executable_path="./geckodriver")
# Get to the login page
browser.get('https://example.com')
browser.find_element_by_link_text('Application').click()
# Login
browser.find_element_by_id('username').send_keys('notmyusername')
browser.find_element_by_id('password').send_keys('notmypassword')
browser.find_element_by_css_selector('.btn').click()
# Open the application
browser.find_element_by_id('ctl00_Banner1_ModuleList_ctl01_lnkModule').click()
If I copy this code and paste it into the python console, it runs just fine and goes to the page I want. However, when I run the script from the terminal (bash on Linux Mint 18), it errors out. Here's the output with the try and catch statements removed:
Traceback (most recent call last):
File "./script.py", line 14, in <module>
browser.find_element_by_id('ctl00_Banner1_ModuleList_ctl01_lnkModule').click()
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 289, in find_element_by_id
return self.find_element(by=By.ID, value=id_)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 791, in find_element
'value': value})['value']
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: [id="ctl00_Banner1_ModuleList_ctl01_lnkModule"]
I don't even know how to go about troubleshooting this. Any help?

What is most probably happening is that when you run the script from bash, it runs too quickly and the find_element_by_id call starts before the browser has finished loading the page, which results in this error.
As @murali-selenium suggested, you should probably add some wait time before you start looking for elements in the document.
That can be achieved this way:
#!/usr/bin/python
from selenium import webdriver
import time
wait_time_sec = 1
browser = webdriver.Firefox(firefox_binary='/usr/bin/firefox', executable_path="./geckodriver")
# Get to the login page
browser.get('https://example.com')
time.sleep(wait_time_sec)
browser.find_element_by_link_text('Application').click()
time.sleep(wait_time_sec)
# Login
browser.find_element_by_id('username').send_keys('notmyusername')
browser.find_element_by_id('password').send_keys('notmypassword')
browser.find_element_by_css_selector('.btn').click()
time.sleep(wait_time_sec)
# Open the application
try:
    browser.find_element_by_id('ctl00_Banner1_ModuleList_ctl01_lnkModule').click()
except:
    print('failed')
#browser.stop()

Per the Selenium docs, calling browser.implicitly_wait(10) # seconds tells the driver to keep polling the page for up to 10 seconds before deciding the element isn't there. The default is 0 seconds. Once set, the implicit wait applies for the life of the webdriver object.
There are a lot of other nifty tools to wait for elements to load, documented at readthedocs.io.
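To illustrate what such waiting does, here is a library-free sketch of the polling idea: retry a lookup until it succeeds or a deadline passes, instead of sleeping for a fixed time. `wait_for` and its parameters are hypothetical names, not Selenium's API. Selenium ships its own version of this in `selenium.webdriver.support.ui.WebDriverWait`, which is the better choice in real scripts.

```python
import time


def wait_for(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value or `timeout` elapses.

    Exceptions raised by condition() (for example NoSuchElementException
    while the page is still loading) are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception as exc:
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "condition not met within %s seconds" % timeout
            ) from last_error
        time.sleep(poll_interval)
```

In the script above, the last step could then read `wait_for(lambda: browser.find_element_by_id('ctl00_Banner1_ModuleList_ctl01_lnkModule')).click()`, which returns as soon as the element appears rather than after a fixed delay.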
