I have a project in which I'm using Selenium to open five links in sequence. It stops at the third link. I've followed the same approach that worked for the previous links, allowed 17 seconds, and watched the page load before the script continues to run in my console. I'm just not sure why it can't find this link, and I hope it's something I'm simply overlooking...
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
import csv
import time
username = "xxxxxxx"
password = "xxxxxxx"
driver = webdriver.Firefox()
driver.get("https://tm.login.trendmicro.com/simplesaml/saml2/idp/SSOService.php")
assert "Trend" in driver.title
elem1 = driver.find_element_by_class_name("input_username")
elem2 = driver.find_element_by_class_name("input_password")
elem3 = driver.find_element_by_id("btn_logon")
elem1.send_keys(username)
elem2.send_keys(password)
elem3.send_keys(Keys.RETURN)
time.sleep(7)
assert "No results found." not in driver.page_source
elem4 = driver.find_element_by_css_selector("a.float-right.open-console")
elem4.send_keys(Keys.RETURN)
time.sleep(17)
elem5 = driver.find_element_by_tag_name("a.btn_left")
elem5.send_keys(Keys.RETURN)
Well, one of the reasons is that elem5 is looked up by tag name, but you are passing it a CSS selector. "a.btn_left" is not an HTML tag name, so your script will never find it; it simply doesn't exist in the DOM.
You either need to find it by css_selector or, better yet, by XPath. To make a locator as reliable and future-proof as possible, I always try to find elements with at least two descriptors in the XPath when I can (see the sketch below).
Change this:
elem5 = driver.find_element_by_tag_name("a.btn_left")
To this:
elem5 = driver.find_element_by_css_selector("a.btn_left")
You will almost never use tag_name, mostly because it always retrieves the first element with that tag: passing "a" would find the first link on the page, while "a.btn_left" matches no tag at all.
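As an example of the two-descriptor idea mentioned above, an XPath can match on more than one attribute, so a single markup change is less likely to break the locator. A minimal sketch; the class comes from your code, but the href value here is a hypothetical placeholder:
# Match on both the class and (hypothetically) part of the href
elem5 = driver.find_element_by_xpath("//a[contains(@class, 'btn_left') and contains(@href, 'console')]")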
I wound up solving it with the code below. I increased the delay to 20 seconds. Believe it or not, I did try finding by CSS; I kept the a.btn_left and cycled through all the matching elements, and none of them worked. Fortunately, I could reach the element with tab and key presses, so that works for now.
time.sleep(20)
driver.get("https://wfbs-svc-nabu.trendmicro.com/wfbs-svc/portal/en/view/cm")
elem5 = driver.find_element_by_link_text("Devices")
elem5.send_keys(Keys.ENTER)
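If the fixed delays ever turn flaky, an explicit wait is the usual alternative. A minimal sketch, assuming the "Devices" link text from the snippet above; it waits up to 20 seconds for the link to become clickable instead of sleeping unconditionally:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 20 s for the "Devices" link to be clickable, then activate it
elem5 = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.LINK_TEXT, "Devices")))
elem5.send_keys(Keys.ENTER)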
I'm very new to Python and am learning web scraping by trying to get the search results from the website below after a user types in some information, and then printing the results. Everything works great up until the last two lines of this script: when I include them, nothing happens. However, when I remove them and just type them into the shell after the script has finished running, they work exactly as I intended. Can you think of a reason this is happening? As I'm a beginner, I'm also very open to any easier solution you might see. All feedback is welcome. Thank you!
#Setup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
#Open Chrome
driver = webdriver.Chrome()
driver.get("https://myutilities.seattle.gov/eportal/#/accountlookup/calendar")
#Wait for site to load
time.sleep(10)
#Click on street address search box
elem = driver.find_element(By.ID, 'sa')
elem.click()
#Get input from the user
addr = input('Please enter part of your address and press enter to search.\n')
#Enter user input into search box
elem.send_keys(addr)
#Get search results
elem = driver.find_element(By.XPATH, ('/html/body/app-root/main/div/div/account-lookup/div/section/div[2]/div[2]/div/div/form/div/ul/li/div/div[1]'))
print(elem.text)
I haven't used Selenium in a while, so I can only point you in the right direction. It seems to me you need to iterate over the individual entries and print those, as opposed to printing the entire div as one element.
You should remove the parentheses from the XPath expression. You can also shorten it as follows:
Code:
elems = driver.find_elements(By.XPATH, '//*[@class="addressResults"]/div')
for elem in elems:
    print(elem.text)
You are using an absolute XPath; what you should be looking into are relative XPaths. Something like this should do it:
elems = driver.find_elements(By.XPATH, "//*[@id='addressResults']/div")
for elem in elems:
    ...
I ended up figuring out my problem - I just needed to add a bit that waits until the search results actually load before proceeding with the script. Tossing in a time.sleep(5) did the trick. Eventually I'll add a check that an element has loaded before moving on, but this lets me continue for now. Thanks everyone for your answers!
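For reference, a minimal sketch of that element check, assuming the addressResults container from the answers above; it waits until the result divs exist instead of sleeping a fixed five seconds:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 s for the search results to appear, then print them
elems = WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.XPATH, '//*[@class="addressResults"]/div')))
for elem in elems:
    print(elem.text)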
I am trying to find news articles related to different web searches within a custom time frame. If someone knows a better way to do this than with Selenium, please do tell. I tried to use APIs, but they are all expensive.
My Code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome(executable_path=r".\chromedriver.exe")
driver.get("https://www.google.com")
cookie_button = driver.find_element_by_id('L2AGLb')
cookie_button.click()
search_bar = driver.find_element_by_name("q")
search_bar.clear()
search_bar.send_keys("Taylor Swift")
search_bar.send_keys(Keys.RETURN)
news_button = driver.find_element_by_css_selector('[data-hveid="CAEQAw"]')
news_button.click()
tools_button = driver.find_element_by_id('hdtb-tls')
tools_button.click()
recent_button = driver.find_element_by_class_name('KTBKoe')
recent_button.click()
custom_range_button = driver.find_elements_by_tag_name('g-menu-item')[6]
custom_range_button.click()
driver.close()
I am trying to click the custom range button in the menu that opens.
I have tried selecting the span, all parent divs, and now the g-menu-item itself; in every case I have received the error:
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable
I understand this can happen when the element is not visible or cannot be clicked, but I also tried adding sleep(2) between each operation, and I can see in the browser that Selenium opens that it reaches that menu and then can't find the custom range button.
In short:
Do you know how I can click this custom range button?
Or do you know how to scrape a lot of news articles from specific dates (without expensive APIs)?
Those g-menu elements are a special type of tag; you can locate them using:
//*[name()='g-menu-item']
This XPath will match all such nodes.
Further, I see that you are interested in the 6th item:
(//*[name()='g-menu-item'])[6]
In code:
custom_range_button = driver.find_element_by_xpath("(//*[name()='g-menu-item'])[6]")
custom_range_button.click()
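Since the menu only exists after the Recent dropdown opens, it may also help to wait for the item explicitly rather than sleeping. A sketch using the same locator (untested against the live page):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 s for the 6th g-menu-item to become clickable, then click it
custom_range_button = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, "(//*[name()='g-menu-item'])[6]")))
custom_range_button.click()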
I'm kind of a newbie to Selenium; I started learning it for my job some time ago. Right now I'm working on a script that opens the browser, goes to the specified website, puts a product ID in the search box, searches, and then opens the product. Once it opens the product, it needs to extract the name and price and write them to a CSV file. I'm struggling with it a bit.
The main problem right now is that Selenium is unable to open the product after searching for it. I've tried by ID, name, and class, and it still didn't work.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import csv
driver = webdriver.Firefox()
driver.get("https://www.madeiramadeira.com.br/")
assert "Madeira" in driver.title
elem = driver.find_element_by_id("input-autocomplete")
elem.clear()
elem.send_keys("525119")
elem.send_keys(Keys.RETURN)
product_link = driver.find_element_by_id('variant-url').click()
The error I get is usually this:
NoSuchElementException: Message: Unable to locate element: [id="variant-url"]
There are multiple elements with id="variant-url", so you could use an index to click on the desired element. You also, I think, need to handle the cookie pop-up. Check the code below; hope it will help.
# Imports needed for the snippet below
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# Disable the notifications window which is displayed on the page
option = Options()
option.add_argument("--disable-notifications")
driver = webdriver.Chrome(r"Driver_Path", chrome_options=option)
driver.get("https://www.madeiramadeira.com.br/")
assert "Madeira" in driver.title
elem = driver.find_element_by_id("input-autocomplete")
elem.clear()
elem.send_keys("525119")
elem.send_keys(Keys.RETURN)
# Click the cookie pop-up button
accept = driver.find_element_by_xpath("//button[@id='lgpd-button-agree']")
accept.click()
Or with an explicit wait:
accept = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//button[@id='lgpd-button-agree']")))
accept.click()
# Use an XPath with an index like [1], [2] to click a specific element
product_link = driver.find_element_by_xpath("(//a[@id='variant-url'])[1]")
product_link.click()
Or with an explicit wait:
product_link = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "(//a[@id='variant-url'])[1]")))
product_link.click()
I used to get this error all the time when I was starting out. The problem is that there are probably multiple elements with the same ID "variant-url". There are two ways you can fix this.
By using driver.find_elements_by_id:
product_links = driver.find_elements_by_id('variant-url')
product_links[2].click()
This builds a list of all the elements with id 'variant-url' and then clicks the one at index 2. This works, but it is annoying to find the correct index of the button you want to click, and it also takes a long time if there are many elements with the same ID.
By using Xpaths or CSS selectors
This way is a lot easier, as each element has a specific XPath or selector. It will look like this:
product_link = driver.find_element_by_xpath("XPATH GOES HERE").click()
To get an XPath or selector:
1. Go into developer mode in your browser by inspecting the element.
2. Right-click the element in the F12 menu.
3. Hover over Copy.
4. Click Copy XPath.
Hope this helps you :D
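As for the CSV-writing part of the question, here is a minimal sketch with Python's csv module; the two selectors are hypothetical placeholders, not the site's real markup:
import csv

# Hypothetical locators for the product name and price on the product page
name = driver.find_element_by_css_selector("h1.product-title").text
price = driver.find_element_by_css_selector("span.price").text

# Append one row per product to the output file
with open("products.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow([name, price])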
I'm trying to use Selenium to create a new message in my mailbox. I have a problem finding the napisz (en: 'write') button on my e-mail website. I tried to use driver.find_element_by_link_text, but it doesn't work. I've managed to work around this problem using XPath, but I'm very curious why the first method fails.
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
browser = webdriver.Firefox()
browser.get('https://profil.wp.pl/login.html?zaloguj=poczta&url=https://poczta.wp.pl/profil/')
elem_login = browser.find_element_by_name('login_username')
elem_login.send_keys('stack_scraper_wp@wp.pl')
elem_password = browser.find_element_by_name('password')
elem_password.send_keys('thankyouforhelp')
elem_zaloguj_button = browser.find_element_by_id('btnSubmit')
elem_zaloguj_button.click()
browser.get('https://poczta.wp.pl/d635/indexgwt.html#start')
elem_napisz_button = browser.find_element_by_link_text('napisz')
elem_napisz_button.click()
EDIT: I tried to use the same XPath today, but it failed. Is it possible that it's somehow dynamic, and that's causing the problem?
.find_element_by_link_text() looks for <a> elements only. In your case, the target is a button element, so it cannot be located with this locator.
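If you need to click that element anyway, an XPath matching on visible text works for non-link elements too. A sketch; the 'napisz' label comes from the question, but the actual tag on the live page may differ from button:
# Locate a non-<a> element by its visible text (the tag name is an assumption)
elem_napisz_button = browser.find_element_by_xpath("//button[normalize-space(text())='napisz']")
elem_napisz_button.click()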
When scraping data from NASDAQ, there are tickers like ACHC that have empty pages.
My program iterates through all ticker symbols, and when it gets to this one it times out because there is no data to grab. I am trying to figure out a way to check whether the page is empty and, if so, skip the ticker but continue the loop. The code is pretty long, so I'll post the most relevant part: the beginning of the loop, where it opens the page:
## navigate to the annual income statement page
url = url_form.format(symbol, "income-statement")
browser.get(url)
company_xpath = "//h1[contains(text(), 'Company Financials')]"
company = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.XPATH, company_xpath))).text
annuals_xpath = "//thead/tr[th[1][text() = 'Period Ending:']]/th[position()>=3]"
annuals = get_elements(browser,annuals_xpath)
Selenium doesn't have a built-in method for determining whether an element exists, so the most common approach is a try/except block.
from selenium.common.exceptions import TimeoutException
...
try:
    company = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.XPATH, company_xpath))).text
except TimeoutException:
    continue
This should keep the loop going without crashing, assuming that continue works as expected with your loop.
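For context, here is a minimal sketch of how that try/except sits inside the ticker loop; symbols and url_form are placeholders standing in for your real data:
# `symbols` and `url_form` are placeholders for the question's actual data
for symbol in symbols:
    url = url_form.format(symbol, "income-statement")
    browser.get(url)
    try:
        company = WebDriverWait(browser, 10).until(
            EC.presence_of_element_located((By.XPATH, company_xpath))).text
    except TimeoutException:
        continue  # empty page: skip this ticker and move on to the next
    # ... rest of the scraping for tickers that do have data ...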
You can use libraries like requests or urllib to fetch that web page and check whether what you need is there. These libraries are much faster than Selenium because they just fetch the source of the page. If you're looking for particular tags or structures, such as tables, take a look at BeautifulSoup, which you can use together with requests to identify very specific parts of the page.
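A rough sketch of that kind of pre-check; the URL pattern and the "Company Financials" marker are lifted from the question's code, and note that a page which renders its data with JavaScript may not expose it to a plain HTTP fetch:
import requests
from bs4 import BeautifulSoup

def has_financials(symbol, url_form):
    # Fetch the raw page and look for the heading that Selenium waits for
    resp = requests.get(url_form.format(symbol, "income-statement"), timeout=10)
    soup = BeautifulSoup(resp.text, "html.parser")
    return soup.find("h1", string=lambda s: s and "Company Financials" in s) is not None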