I am trying to get a list of email addresses from a website and am very close. The code I have can be seen below. I am getting the following error.
What happens is that there is a page of links which are then clicked on and in the following page there is an email address.
I am trying to print out the email address inside each of the pages after the link is clicked.
Here is an example of a page that the link clicks through to.
Traceback (most recent call last):
  File "scrape.py", line 34, in <module>
    lookup(driver)
  File "scrape.py", line 26, in lookup
    emailAdress = driver.find_element_by_xpath('//div[@id="widget-contact"]//a').get_attribute('href')
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 293, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 752, in find_element
    'value': value})['value']
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidSelectorException:
I am using python 2.7.13.
# -*- coding: utf-8 -*-
from lxml import html
import requests
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
def init_driver():
    driver = webdriver.Firefox()
    driver.wait = WebDriverWait(driver, 5)
    return driver
def lookup(driver):
    driver.get("http://www.sportbirmingham.org/directory?sport=&radius=15&postcode=B16+8QG&submit=Search")
    try:
        for link in driver.find_elements_by_xpath('//h2[@class="heading"]/a'):
            link.click()
            emailAdress = driver.find_element_by_xpath('//div[@id="widget-contact"]//a').get_attribute('href')
            print emailAdress
    except TimeoutException:
        print "not found"
if __name__ == "__main__":
    driver = init_driver()
    lookup(driver)
    time.sleep(5)
    driver.quit()
When I try and continue to the next page of links I get the following error
  File "scrape.py", line 43, in <module>
    lookup(driver)
  File "scrape.py", line 26, in lookup
    links.extend([link.get_attribute('href') for link in driver.find_elements_by_xpath('//h2[@class="heading"]/a')])
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webelement.py", line 139, in get_attribute
    self, name)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 465, in execute_script
    'args': converted_args})['value']
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: The element reference is stale. Either the element is no longer attached to the DOM or the page has been refreshed.
You just need a more precise XPath (and to read the element's text property):
emailAdress = driver.find_element_by_xpath('//div[@class="body"]/dl/dd[2]').text
Note that this example was run with Python 3. Let me know if it works for you.
I would also recommend to use "XPath Helper" extension for Chrome.
This seems to be a copy/paste issue. Sometimes when you copy code from StackOverflow answers, hidden characters come along with it. In the Python shell your XPath appears as '//div[@id="widget-contact"]//a??'. You should retype it manually to get rid of those ??...
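As a rough way to check for the hidden characters mentioned above, the sketch below lists every character in a string that falls outside printable ASCII. The function name `find_suspicious_chars` and the sample string with two zero-width characters are made up for illustration; they are not from the original question.

```python
def find_suspicious_chars(text):
    """Return (index, code point) pairs for characters outside printable ASCII."""
    return [
        (index, "U+%04X" % ord(char))
        for index, char in enumerate(text)
        if not (32 <= ord(char) <= 126)
    ]


# Hypothetical example: an XPath with two zero-width characters pasted in at the end.
xpath = u'//div[@id="widget-contact"]//a\u200c\u200b'
for index, code_point in find_suspicious_chars(xpath):
    print("index %d: %s" % (index, code_point))
```

If the list comes back empty, the string contains only printable ASCII and the selector problem lies elsewhere.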
Also note that your code will not work because it gets stuck on the first iteration: after the first click there is no going back to the search page.
Try the code below instead:
from selenium.common.exceptions import NoSuchElementException
def lookup(driver):
    driver.get("http://www.sportbirmingham.org/directory?sport=&radius=15&postcode=B16+8QG&submit=Search")
    # Collect the hrefs as plain strings so they stay valid after navigation.
    links = [link.get_attribute('href') for link in driver.find_elements_by_xpath('//h2[@class="heading"]/a')]
    page_counter = 1
    # Walk the numbered pagination links until the next page number is missing.
    while True:
        try:
            page_counter += 1
            driver.find_element_by_link_text(str(page_counter)).click()
            links.extend([link.get_attribute('href') for link in driver.find_elements_by_xpath('//h2[@class="heading"]/a')])
        except NoSuchElementException:
            break
    # Visit each collected URL directly instead of clicking stale elements.
    try:
        for link in links:
            driver.get(link)
            try:
                emailAdress = driver.find_element_by_xpath('//div[@id="widget-contact"]//a').text
                print emailAdress
            except NoSuchElementException:
                print "No email specified"
    except TimeoutException:
        print "not found"
I'm trying to access the "fullcollege" page with a bot I'm making for students. The problem is that I can't even select an element on it, and the error below shows up. I recently tested my code on other sites such as Instagram and everything worked perfectly. Does anyone know what I can do to solve this? Thanks in advance.
My code:
from time import sleep
from selenium import webdriver
browser = webdriver.Firefox()
browser.implicitly_wait(5)
browser.get('https://www.fullcollege.cl/fullcollege/')
sleep(5)
username_input = browser.find_element_by_xpath("//*[@id='textfield-3610-inputEl']")
password_input = browser.find_element_by_xpath("//*[@id='textfield-3611-inputEl']")
username_input.send_keys("username")
password_input.send_keys("password")
sleep(5)
browser.close()
The error:
Traceback (most recent call last):
  File "c:\Users\marti\OneDrive\Escritorio\woo\DiscordBot\BetterCollege\tester.py", line 11, in <module>
    username_input = browser.find_element_by_xpath("//*[@id='textfield-3610-inputEl']")
  File "C:\Users\marti\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "C:\Users\marti\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 976, in find_element
    return self.execute(Command.FIND_ELEMENT, {
  File "C:\Users\marti\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\marti\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: //*[@id='textfield-3610-inputEl']
The username and password fields are inside an iframe, so you need to switch to it first.
browser.get('https://www.fullcollege.cl/fullcollege/')
WebDriverWait(browser,10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe#logFrame")))
sleep(5)
username_input = browser.find_element_by_xpath("//input[@id='textfield-3610-inputEl']")
password_input = browser.find_element_by_xpath("//input[@id='textfield-3611-inputEl']")
username_input.send_keys("username")
password_input.send_keys("password")
Import the libraries below as well.
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
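Once the driver has switched into an iframe, later lookups stay scoped to that frame until you switch back. One possible way to keep that tidy is a small context manager, sketched below. The helper name `inside_frame` is hypothetical; it relies only on `driver.switch_to.frame(...)` and `driver.switch_to.default_content()`, and unlike the explicit wait above it does not wait for the frame to become available.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block, then switch back."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Return to the top-level document even if a lookup inside the frame fails.
        driver.switch_to.default_content()


# Possible usage with the page from the question (assumes the iframe id is "logFrame"):
# with inside_frame(browser, "logFrame"):
#     browser.find_element_by_xpath("//input[@id='textfield-3610-inputEl']").send_keys("username")
```

The `finally` clause is the point of the design: code after the with-block always starts from the main document again.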
I'm new to Selenium.
I'd like to ask about a problem with some code (not actually mine).
This is the code:
aww = email.strip().split('|')
driver = webdriver.Chrome()
driver.get("https://stackoverflow.com/users/signup?ssrc=head&returnurl=%2fusers%2fstory%2fcurrent")
time.sleep(5)
loginform = driver.find_element_by_xpath("//button[@data-provider='google']")
loginform.click()
mailform = driver.find_element_by_id('identifierId')
mailform.send_keys(aww[0])
driver.find_element_by_xpath("//div[@id='identifierNext']").click()
time.sleep(3)
passform = driver.find_element_by_css_selector("input[type='password']")
passform.send_keys(aww[1])
driver.find_element_by_id('passwordNext').click()
time.sleep(3)
driver.get("https://myaccount.google.com/lesssecureapps?pli=1")
open('LIVE.txt', 'a+').write(f"CHECKED : {aww[0]}|{aww[1]}")
time.sleep(3)
lessoff = driver.find_element_by_xpath('//div[@class="hyMrOd "]/div/div/div//div[@class="N9Ni5"]').click()
driver.delete_all_cookies()
driver.close()
I'm using this code to automate turning on "less secure apps" in Gmail,
and the following error pops up:
Traceback (most recent call last):
  File "C:\Users\ASUS\Downloads\ok\less.py", line 59, in <module>
    lessoff = driver.find_element_by_xpath('//div[@class="hyMrOd "]/div/div/div//div[@class="N9Ni5"]').click()
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 976, in find_element
    return self.execute(Command.FIND_ELEMENT, {
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//div[@class="hyMrOd "]/div/div/div//div[@class="N9Ni5"]"}
  (Session info: chrome=86.0.4240.183)
Any help would be appreciated. Sorry for my English :)
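You can simply target this CSS selector and call .click() to toggle the less secure apps setting. Note that input[type='checkbox'] is a CSS selector, not an XPath, so it has to go through find_element_by_css_selector.
driver.find_element_by_css_selector("input[type='checkbox']").click()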
It looks like the element couldn't be found, so give it some time to load. You can use WebDriverWait(...).until(...) for this.
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait
wait(driver, 10).until(EC.presence_of_element_located((By.XPATH, 'YOUR_XPATH')))
Before you click an element, make sure it is there. The code above waits up to 10 seconds for the element to be located; if it is not located in that time, it raises an exception.
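To make the idea of an explicit wait concrete, here is a minimal polling loop in plain Python. It is only an illustration of the general mechanism (retry a check until it succeeds or a deadline passes), not Selenium's actual implementation. The names `wait_until` and `ready_on_third_call` are invented for this sketch, and it raises a plain RuntimeError where Selenium would raise its own TimeoutException.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.time() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.time() >= deadline:
            raise RuntimeError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll_interval)


# Stand-in for "is the element there yet?": succeeds on the third check.
attempts = []


def ready_on_third_call():
    attempts.append(1)
    return len(attempts) >= 3


wait_until(ready_on_third_call, timeout=2.0, poll_interval=0.01)
print("checks needed: %d" % len(attempts))
```

Compared with a fixed time.sleep(5), a loop like this returns as soon as the condition holds and fails loudly when it never does.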
I am trying to use webdriver to click the login button, and the page changes correctly, but then the program stops with "selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable".
This is the code where the problem occurs:
browser.find_element_by_xpath(
    '//*[@id="emap-rsids-content"]/div/div[3]/div/div[1]/div/div/div/input').send_keys(uid)
browser.find_element_by_xpath(
    '//*[@id="emap-rsids-content"]/div/div[3]/div/div[2]/div/div/div/input').send_keys(pwd)
# click to sign in
browser.find_element_by_xpath('//*[@id="emap-rsids-content"]/div/div[3]/div/div[3]/div/button').send_keys(Keys.ENTER)
time.sleep(3)
browser.find_element_by_xpath('/html/body/main/article/section[1]/div/div/div/div[2]/div/div/div[2]/div[2]').click()
This is the traceback:
Traceback (most recent call last):
  File "C:/Users/14638/Desktop/auto_sign_zzu_jksb-master/auto_sign.py", line 68, in sign_in
    time.sleep(3)
  File "C:\Users\14638\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\webelement.py", line 80, in click
    self._execute(Command.CLICK_ELEMENT)
  File "C:\Users\14638\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\14638\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\14638\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable
  (Session info: chrome=83.0.4103.116)
Line 68 is time.sleep(3). The click works and the browser moves to a new page, but the error still says the element is not interactable.
I have tried two methods. One is
'''
browser.find_element_by_xpath('//*[@id="emap-rsids-content"]/div/div[3]/div/div[3]/div/button').click()
'''
and the other is
'''
pages=browser.find_element_by_xpath('//*[@id="emap-rsids-content"]/div/div[3]/div/div[3]/div/button')
browser.execute_script("arguments[0].click();", pages)
'''
but neither works.
I think you are trying to click an element that has not finished loading. What you have to do is wait until it has.
First add these imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
Then add this after logging in.
WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.XPATH, '/html/body/main/article/section[1]/div/div/div/div[2]/div/div/div[2]/div[2]')))
Then do
browser.find_element_by_xpath('/html/body/main/article/section[1]/div/div/div/div[2]/div/div/div[2]/div[2]').click()
You need to update the page_source:
browser.find_element_by_xpath(
    '//*[@id="emap-rsids-content"]/div/div[3]/div/div[1]/div/div/div/input').send_keys(uid)
browser.find_element_by_xpath(
    '//*[@id="emap-rsids-content"]/div/div[3]/div/div[2]/div/div/div/input').send_keys(pwd)
# click to sign in
time.sleep(5) # add some time to load the page
browser.get(browser.current_url)
time.sleep(1)
browser.find_element_by_xpath('//*[@id="emap-rsids-content"]/div/div[3]/div/div[3]/div/button').send_keys(Keys.ENTER)
time.sleep(3)
browser.find_element_by_xpath('/html/body/main/article/section[1]/div/div/div/div[2]/div/div/div[2]/div[2]').click()
I'm pretty new to Selenium and I'm following the docs to make some bots,
but when I try to log in to social media sites (Twitter/Instagram) it doesn't send the keystrokes.
Code:
#!/usr/bin/env python3
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
browser = webdriver.Firefox()
browser.get("https://www.instagram.com/accounts/login/")
elem = browser.find_element_by_name("username")
elem.send_keys('Laptops' + Keys.RETURN)
time.sleep(4)
browser.quit()
I've tried browser.find_element_by_name/class_name/xpath, but nothing seemed to work.
Error:
Traceback (most recent call last):
  File "ig.py", line 50, in <module>
    login(driver)
  File "ig.py", line 15, in login
    driver.find_element_by_xpath("//div/input[@name='username']").send_keys(username)
  File "/home/lario/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 365, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "/home/lario/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 843, in find_element
    'value': value})['value']
  File "/home/lario/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 308, in execute
    self.error_handler.check_response(response)
  File "/home/lario/.local/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: //div/input[@name='username']
os=ubuntu17
driver=firefox/geckodriver
python3.6
selenium3.6
I know this code shouldn't work as is, because you need both a username and a password, but it doesn't even reach the send_keys call because of the error on the line above it.
I've had some problems with Firefox driver on linux too.
I copied your code and only changed the driver to Chrome and it worked.
You can get the Chrome driver here:
http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.chrome.webdriver
Try adding a wait for a presence condition. It may, however, require EC.element_to_be_clickable if the element is disabled until some check is performed.
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
...
usernamefield = WebDriverWait(self.driver, 10)\
    .until(EC.presence_of_element_located((By.NAME, 'username')))
usernamefield.send_keys("Laptops")
passwordfield = WebDriverWait(...
I am using selenium bindings for python to download all links on a page that contain a string "VS". The problem is that the second item in the list is not a valid web page (returns 404 error), also if I manually click on the broken link it returns:
error.html - 404 error page does not exist.
And if I run the following code it raises an error.
import selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import StaleElementReferenceException, NoSuchElementException, NoSuchWindowException
# To prevent download dialog
profile = webdriver.FirefoxProfile()
profile.set_preference('browser.download.folderList', 2) # custom location
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'application/msword, application/vnd.ms-powerpoint')
driver = webdriver.Firefox(profile)
driver.get("http://www.SOME_URL.com/")
links = driver.find_elements_by_partial_link_text("VS")
for link in links:
    url = link.get_attribute("href")
    try:
        driver.get(url)
    except StaleElementReferenceException:
        pass
The error:
Traceback (most recent call last):
  File "C:\Users\lskrinjar\Dropbox\work\preracun\src\web_data_mining\get_files_from_web.py", line 79, in <module>
    url = link.get_attribute("href")
  File "C:\Python27\lib\site-packages\selenium-2.44.0-py2.7.egg\selenium\webdriver\remote\webelement.py", line 93, in get_attribute
    resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
  File "C:\Python27\lib\site-packages\selenium-2.44.0-py2.7.egg\selenium\webdriver\remote\webelement.py", line 385, in _execute
    return self._parent.execute(command, params)
  File "C:\Python27\lib\site-packages\selenium-2.44.0-py2.7.egg\selenium\webdriver\remote\webdriver.py", line 173, in execute
    self.error_handler.check_response(response)
  File "C:\Python27\lib\site-packages\selenium-2.44.0-py2.7.egg\selenium\webdriver\remote\errorhandler.py", line 166, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: Element not found in the cache - perhaps the page has changed since it was looked up
Stacktrace:
    at fxdriver.cache.getElementAt (resource://fxdriver/modules/web-element-cache.js:8329:1)
    at Utils.getElementAt (file:///c:/users/lskrin~1/appdata/local/temp/tmppsn7tm/extensions/fxdriver@googlecode.com/components/command-processor.js:7922:10)
    at WebElement.getElementAttribute (file:///c:/users/lskrin~1/appdata/local/temp/tmppsn7tm/extensions/fxdriver@googlecode.com/components/command-processor.js:11107:31)
    at DelayedCommand.prototype.executeInternal_/h (file:///c:/users/lskrin~1/appdata/local/temp/tmppsn7tm/extensions/fxdriver@googlecode.com/components/command-processor.js:11635:16)
    at fxdriver.Timer.prototype.setTimeout/<.notify (file:///c:/users/lskrin~1/appdata/local/temp/tmppsn7tm/extensions/fxdriver@googlecode.com/components/command-processor.js:548:5)
The links list is a list of elements on the webpage. Once you navigate away from that page, the elements no longer exist (you have a list where each entry points to a web element that is gone). Instead of keeping the list of elements, you should build a list of URL strings before navigating:
list_of_links = []
links = driver.find_elements_by_partial_link_text("VS")
for link in links:
    list_of_links.append(link.get_attribute("href"))
for string_link in list_of_links:
    try:
        driver.get(string_link)
    except StaleElementReferenceException:
        pass
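A slightly more defensive variant of the same idea is sketched below: read every href once, while the elements are still attached to the page, and drop empty values and duplicates along the way. The helper name `collect_hrefs` is hypothetical; it assumes only that each element offers `get_attribute("href")`, as Selenium web elements do.

```python
def collect_hrefs(elements):
    """Read href from each element once, skipping empty values and duplicates."""
    urls = []
    seen = set()
    for element in elements:
        href = element.get_attribute("href")
        if href and href not in seen:
            seen.add(href)
            urls.append(href)
    return urls


# Possible usage with the driver from the question:
# for url in collect_hrefs(driver.find_elements_by_partial_link_text("VS")):
#     driver.get(url)
```

Because the result is a list of plain strings, nothing in it can go stale when the browser loads another page.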