I'm trying to automate filling out an Easily Apply job application on Indeed. Here is an example of a job application on Indeed that uses the Easily Apply approach. I've tried every which way to navigate the nested iframes; however, I cannot find an approach that works. I even found that this question has been asked before; unfortunately, the solution given there does not work for me. Below is my code as it stands now:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get('indeed_url_goes_here')
driver.find_element_by_class_name('indeed-apply-button').click()
driver.switch_to_frame(driver.find_element_by_xpath('/html/body/iframe'))
driver.switch_to_frame(driver.find_element_by_xpath('//*[@id="indeedapply-modal-preload-iframe"]'))
driver.find_element_by_class_name('applicant.name')
Find the first parent iframe and switch to it and then to the nested frame by index.
Complete working code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get("http://www.indeed.com/viewjob?jk=2e3d019aa34a2801&q=bartender&tk=1a9g51n08a3iof6h&from=web&advn=5333586156877432&sjdu=UvkB_mgi5f7NyMagFcTHP0E6zA3mclLGHWb8Kte-0FV3cY2ZuZvj3LUvh8wnnxrqeYWG3HpvTXBK3G4htWfwgfQeMa0N1Tds6VxYb4V3Vlg&pub=4a1b367933fd867b19b072952f68dceb")
driver.find_element_by_class_name('indeed-apply-button').click()
wait = WebDriverWait(driver, 10)
frame = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "iframe[name$=modal-iframe]")))
driver.switch_to.frame(frame)
driver.switch_to.frame(0)
print(driver.find_element_by_css_selector("h1.jobtitle").text)
Prints the job title from the popup: Bartender/Mixologist.
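The same idea, switching into the parent iframe first and then into the frame nested inside it, can be wrapped in a small helper. This is only a sketch: enter_nested_frames is a hypothetical name, not part of Selenium, and it assumes every frame already exists when it is called (so wait for the outer frame first, as in the code above). It relies only on driver.switch_to.default_content() and driver.switch_to.frame(), which accepts an index, a frame name or ID, or an iframe WebElement.

```python
def enter_nested_frames(driver, frame_refs):
    """Switch from the top-level document down through each frame in order.

    Each entry in frame_refs may be an index, a frame name/ID string,
    or an already-located iframe WebElement.
    """
    # Start from the top so the path is always resolved from the same place.
    driver.switch_to.default_content()
    for ref in frame_refs:
        # Each switch is relative to the frame selected by the previous one.
        driver.switch_to.frame(ref)
```

For the Indeed popup this might be called as enter_nested_frames(driver, [frame, 0]), where frame is the element returned by the wait above.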
Well first off, the element doesn't have a class name - it has a regular name and an ID, so use either driver.find_element_by_name or driver.find_element_by_id.
I am trying to scrape a website and I am unable to send keys to the email/password boxes to log in.
from config import root_dir, username_placer, password_placer
import os
import pathlib
os.chdir(root_dir)
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select
import time
import math
import datetime as dt
import pandas as pd
from parse_account_name import parse_account_name
class Placer:
    def __init__(self, account_name):
        # Set webdriver parameters
        chrome_options = webdriver.ChromeOptions()
        chrome_options.add_argument('--kiosk-printing')
        chrome_options.add_argument("window-size=1440,1080")
        chromedriver = r'.\_reference\chromedriver.exe'
        self.driver = webdriver.Chrome(executable_path=chromedriver, options=chrome_options)
        self.driver.get('https://permits.placer.ca.gov/CitizenAccess/Default.aspx')
        # Log in
        WebDriverWait(self.driver, 30).until(lambda d: d.find_element_by_xpath('//*[@id="ctl00_PlaceHolderMain_LoginBox_txtUserId"]')).send_keys(username_placer)
        self.driver.find_element_by_xpath('//*[@id="ctl00_PlaceHolderMain_LoginBox_txtPassword"]').send_keys(password_placer)
        self.driver.find_element_by_xpath('//*[@id="ctl00_PlaceHolderMain_LoginBox_btnLogin"]').click()
        time.sleep(2)
def search_placer(account_name):
    scraper = Placer(account_name)
    #
if __name__ == '__main__':
    account_name = '''test'''
    result = search_placer(account_name)
    print(result)
It will just return a TimeoutException. I've tried using EC.element_to_be_clickable, but I get the same error. Without waiting and just trying to skip to sending keys, I (predictably) get NoSuchElementException.
If I print the page source, it returns the collapsed HTML.
It seems like the page and elements load by the time it looks for the login box element, so I'm not sure why it can't find it. What should I try at this point?
The issue you are facing is that the elements you are trying to reach are inside of an IFRAME. Using Selenium, you have to first switch to the IFRAME before interacting with that part of the page. Once you are done with elements inside the IFRAME, make sure you switch back to the default content.
Additional feedback...
If you are going to locate an element using the ID, always use By.ID instead of By.XPATH. It's much simpler syntax and easier to read.
You should always use the EC class methods when possible rather than writing your own custom wait conditions.
Using sleep is a bad practice and should be avoided whenever possible. Instead use WebDriverWait each time you need to wait for an element as you did earlier in your code.
Using these suggestions, the updated code would be something like
self.driver.get('https://permits.placer.ca.gov/CitizenAccess/Default.aspx')
self.driver.switch_to.frame(self.driver.find_element(By.ID, "ACAFrame"))
# Log in
WebDriverWait(self.driver, 10).until(EC.element_to_be_clickable((By.ID, "ctl00_PlaceHolderMain_LoginBox_txtUserId"))).send_keys(username_placer)
self.driver.find_element(By.ID, "ctl00_PlaceHolderMain_LoginBox_txtPassword").send_keys(password_placer)
self.driver.find_element(By.ID, "ctl00_PlaceHolderMain_LoginBox_btnLogin").click()
self.driver.switch_to.default_content()
# don't use sleep... instead use a WebDriverWait
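Since it is easy to forget the switch back to the default content, one option is to wrap the frame switch in a context manager. This is a minimal sketch rather than anything Selenium provides: inside_frame is a hypothetical helper name, and it only uses driver.switch_to.frame() and driver.switch_to.default_content().

```python
from contextlib import contextmanager

@contextmanager
def inside_frame(driver, frame_ref):
    """Run a block inside an iframe, then return to the top-level document."""
    # frame_ref may be an index, a name/ID string, or an iframe WebElement.
    driver.switch_to.frame(frame_ref)
    try:
        yield driver
    finally:
        # Runs even if the block raises, so later lookups start from the top.
        driver.switch_to.default_content()
```

With this, the login steps could go inside a block such as "with inside_frame(self.driver, "ACAFrame"):", and the driver is back on the main page as soon as the block ends. Note that it always returns to the top-level document, not to an enclosing frame.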
from selenium import webdriver
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome(executable_path='C:/chromedriver/chromedriver.exe')
driver.get('https://ggl-maxim.com/')
driver.find_element_by_xpath('//*[@id="body"]/div/div[2]/div/div[2]/fieldset/input[1]').send_keys('tnrud3080')
driver.find_element_by_xpath('//*[@id="body"]/div/div[2]/div/div[2]/fieldset/input[2]').send_keys('tnrud3080')
driver.find_element_by_xpath('//*[@id="body"]/div/div[2]/div/div[2]/fieldset/button[1]').click()
time.sleep(2)
driver.get('https://ggl-maxim.com/api/popup/popup_menu.asp?mobile=0&lobby=EVOLUTION')
wait = WebDriverWait(driver, 20)
wait.until(EC.frame_to_be_available_and_switch_to_it("gameIframe"))
wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".svg--1nrnH")))
targets = driver.find_elements_by_css_selector(".svg--1nrnH")
res = []
for el in targets:
    res.append(el.get_attribute('innerHTML'))
print(*res, sep='\n')
This code gets the SVG elements (the game records) that you can see in the picture. However, if you click the button I labelled "multi" at the bottom of the picture, the records also appear on the right side of the page. I found that this part shows the records faster than the first one. To use it, I need to get the SVG values only from that div. How can I do that? Please help me!
The second approach is not faster and is harder to implement, as each container is loaded separately and starts to reload after the first load is done. It looks like a nightmare to automate.
I tried Selenium's explicit waits and time.sleep(); neither approach worked.
The code below clicks the button, switches to the new iframe and tries to get the containers' content. But the content is almost always empty for the reasons described above.
from selenium import webdriver
import time
from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome(executable_path='/snap/bin/chromium.chromedriver')
driver.get('http://ggl-maxim.com/')
driver.find_element_by_xpath('//*[@id="body"]/div/div[2]/div/div[2]/fieldset/input[1]').send_keys('tnrud3080')
driver.find_element_by_xpath('//*[@id="body"]/div/div[2]/div/div[2]/fieldset/input[2]').send_keys('tnrud3080')
driver.find_element_by_xpath('//*[@id="body"]/div/div[2]/div/div[2]/fieldset/button[1]').click()
time.sleep(2)
driver.get('http://ggl-maxim.com/api/popup/popup_menu.asp?mobile=0&lobby=EVOLUTION')
wait = WebDriverWait(driver, 30)
wait.until(EC.frame_to_be_available_and_switch_to_it("gameIframe"))
wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".svg--1nrnH")))
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "span[data-role=button-label]"))).click()
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".sidebar-container>iframe")))
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".sidebar-container")))
iframe2 = driver.find_element_by_css_selector('iframe[src^="https://evo.kplaycasino.com/frontend/evo/r2/"]')
driver.switch_to.frame(iframe2)
# wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".list--BLuiJ .svg--1nrnH")))
time.sleep(20)
targets = driver.find_elements_by_css_selector(".list--BLuiJ .svg--1nrnH")
res = []
for el in targets:
    res.append(el.get_attribute('innerHTML'))
print(*res, sep='\n')
As you can see, even 20 seconds is not enough, because the content is reloading on the fly.
I left the explicit wait commented out so you can verify that it doesn't work either.
However, from my answer you can learn how to find a locator which starts with a specific text:
iframe2 = driver.find_element_by_css_selector('iframe[src^="https://evo.kplaycasino.com/frontend/evo/r2/"]')
Here, src^= means that src starts with the specified text.
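For content that keeps reloading, one thing that might be worth trying instead of a fixed sleep is to poll until the scraped value stops changing. The sketch below is plain Python, not a Selenium feature; wait_until_stable and its parameters are hypothetical names, and I have not verified that it copes with this particular page.

```python
import time

def wait_until_stable(read, checks=3, interval=0.5, timeout=30.0):
    """Call read() until it returns the same non-empty value `checks` times in a row."""
    deadline = time.monotonic() + timeout
    last, streak = None, 0
    while time.monotonic() < deadline:
        current = read()
        if current and current == last:
            # Same non-empty value as the previous poll: extend the streak.
            streak += 1
        else:
            # Value changed (or is empty): restart the count.
            streak = 1 if current else 0
        last = current
        if streak >= checks:
            return current
        time.sleep(interval)
    raise TimeoutError("content did not settle within the timeout")
```

Here read would be a function that collects the innerHTML of the ".list--BLuiJ .svg--1nrnH" elements and returns them as a list. If the elements are replaced while being read, Selenium may raise StaleElementReferenceException, which read would need to catch and treat as an empty result.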
I am attempting to scrape the website basketball-reference and am running into an issue I can't seem to solve. I am trying to grab the box score element for each game played. This is something I was able to easily do with urlopen but b/c other portions of the site require Selenium I thought I would rewrite the entire process with Selenium
The issue seems to be that even if I wait until I see the first element load using WebDriverWait, when I then move on to grabbing the elements I get nothing returned.
One thing I found interesting is that if I print the full page from urlopen with something like print (uClient.read()), I get roughly 300 more lines of HTML after beautifying than when doing the same with print (driver.page_source), even if I set an implicit wait of 5 minutes.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome('/usr/local/bin/chromedriver')
driver.wait = WebDriverWait(driver, 10)
driver.get('https://www.basketball-reference.com/boxscores/')
driver.wait.until(EC.presence_of_element_located((By.XPATH,'//*[@id="content"]/div[3]/div[1]')))
box = driver.find_elements_by_class_name('game_summary expanded nohover')
print (box)
driver.quit()
Try the code below; it works on my computer. Let me know if you still face a problem.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
driver.wait = WebDriverWait(driver, 60)
driver.get('https://www.basketball-reference.com/boxscores/')
driver.wait.until(EC.presence_of_element_located((By.XPATH, '//*[@id="content"]/div[3]/div[1]')))
boxes = driver.wait.until(
    EC.presence_of_all_elements_located((By.XPATH, "//div[@class=\"game_summary expanded nohover\"]")))
print("Number of Elements Located : ", len(boxes))
for box in boxes:
    print(box.text)
    print("-----------")
driver.quit()
If it resolves your problem, please mark it as the answer. Thanks.
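For what it's worth, the class-name locator in the question expects a single class name, so a space-separated value such as 'game_summary expanded nohover' will not match the element. Besides the XPath above, a compound CSS selector is another way to require all three classes. The helper below is a small sketch (classes_to_css is a hypothetical name) and it does not escape unusual characters in class names.

```python
def classes_to_css(class_attr, tag=""):
    """Turn a space-separated class attribute into a compound CSS selector."""
    # "a b c" -> ".a.b.c"; every listed class must be present on the element.
    return tag + "".join("." + name for name in class_attr.split())

selector = classes_to_css("game_summary expanded nohover", tag="div")
print(selector)  # div.game_summary.expanded.nohover
```

The result can be passed to driver.find_elements(By.CSS_SELECTOR, selector). Unlike the exact @class comparison in the XPath, this matches regardless of the order of the classes or any extra classes on the element.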
Actually, the site doesn't require Selenium at all. All the data is available through a simple requests call (it's just inside comments in the HTML, so you would only need to parse those). Secondly, you can grab the box scores quite easily with pandas:
import pandas as pd
dfs = pd.read_html('https://www.basketball-reference.com/boxscores/')
for idx, table in enumerate(dfs[:-2]):
    print (table)
    if (idx+1)%3 == 0:
        print("-----------")
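As for the data sitting inside HTML comments, here is a rough sketch of how those comments could be pulled out using only the standard library. CommentCollector and extract_comments are hypothetical names, and the page text is assumed to have been downloaded already (for example with requests or urlopen).

```python
from html.parser import HTMLParser

class CommentCollector(HTMLParser):
    """Collects the text of every <!-- ... --> comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->", without the delimiters.
        self.comments.append(data)

def extract_comments(html):
    """Return the contents of all HTML comments found in the given markup."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return collector.comments
```

Each returned string is itself a piece of HTML, so it can be parsed in a second pass to get at the tables hidden inside it.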
I'm new to using Selenium but I watched enough videos and followed enough articles to know something is missing. I'm trying to get values from TradingView but the problem I'm running into is that I simply can't find any of the elements, not by Xpath or Css. I went ahead and tried to do a simple visibility element test as shown in the code below and to my surprise it times out.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
# Stops the UI interface (chrome browser) from popping up
# chrome_options.add_argument("--headless")
driver = webdriver.Chrome(executable_path='c:\se\chromedriver.exe', options=chrome_options)
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time
driver.get("https://www.tradingview.com/chart/")
timeout = 20
try:
    WebDriverWait(driver, timeout).until(EC.visibility_of_element_located((By.XPATH, "/html/body/div[1]")))
    print("Page loaded")
except TimeoutException:
    print("Timed out waiting for page to load")
    driver.quit()
I tried to click on one of the chart buttons too using the following and that doesn't work either. I noticed that unlike many other websites for Tradingview the elements don't have names and don't generate a relative path (only full) using Xpath.
driver.find_element_by_xpath('/html/body/div[2]/div[5]/div/div[2]/div/div/div/div/div[4]').click()
Any help is greatly appreciated!
I think there must be an issue with xpath.
When I try to click the AAPL button it is working for me.
The xpath I used is:
(//div[contains(text(),'AAPL')])[1]
If you specify exactly which element should be clicked, I will try it.
Also, get familiar with the concept of frames, because these types of websites have a lot of frames in them.
Hoping you can help. I'm relatively new to Python and Selenium. I'm trying to pull together a simple script that will automate news searching on various websites. The primary focus was football and to go and get me the latest Manchester United news from a couple of places and save the list of link titles and URLs for me. I could then look through the links myself and choose anything I wanted to review.
In trying the Independent newspaper (https://www.independent.co.uk/), I seem to have come up against an "element not interactable" problem when using the following approaches:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome('chromedriver')
driver.get('https://www.independent.co.uk')
time.sleep(3)
#accept the cookies/privacy bit
OK = driver.find_element_by_id('qcCmpButtons')
OK.click()
#wait a few seconds, just in case
time.sleep(5)
search_toggle = driver.find_element_by_class_name('icon-search.dropdown-toggle')
search_toggle.click()
This throws the selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable error
I've also tried with XPATH
search_toggle = driver.find_element_by_xpath('//*[@id="quick-search-toggle"]')
and I also tried ID.
I did a lot of reading on here and then also tried using WebDriverWait and execute_script methods:
element = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, '//*[@id="quick-search-toggle"]')))
driver.execute_script("arguments[0].click();", element)
This didn't seem to error but the search box never appeared, i.e. the appropriate click didn't happen.
Any help you could give would be fantastic.
Thanks,
Pete
Your locator is //*[@id="quick-search-toggle"], and it matches 2 elements on the page. The first is invisible and the second is visible. By default Selenium uses the first match, but the element you want is the second one, so you need a more specific locator. Try this:
search_toggle = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//div[@class="row secondary"]//a[@id="quick-search-toggle"]')))
search_toggle.click()
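If a unique locator is hard to come by, another option is to fetch every match and keep the first one that is actually displayed. This is just a sketch: first_displayed is a hypothetical helper, and it relies only on WebElement.is_displayed().

```python
def first_displayed(elements):
    """Return the first element that reports itself as displayed, or None."""
    for element in elements:
        # is_displayed() is False for the hidden duplicate of the toggle.
        if element.is_displayed():
            return element
    return None
```

It could be used as first_displayed(driver.find_elements(By.XPATH, '//*[@id="quick-search-toggle"]')), with a None check before calling click(). Unlike the explicit wait above, it does not wait for anything, so the page has to be loaded already.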
First you need to open search box, then send search keys:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
import os
chrome_options = Options()
chrome_options.add_argument("--start-maximized")
browser = webdriver.Chrome(executable_path=os.path.abspath(os.getcwd()) + "/chromedriver", options=chrome_options)
link = 'https://www.independent.co.uk'
browser.get(link)
# accept privacy
button = browser.find_element_by_xpath('//*[@id="qcCmpButtons"]/button').click()
# open search box
li = browser.find_element_by_xpath('//*[@id="masthead"]/div[3]/nav[2]/ul/li[1]')
search_tab = li.find_element_by_tag_name('a').click()
# send keys to search box
search = browser.find_element_by_xpath('//*[@id="gsc-i-id1"]')
search.send_keys("python")
search.send_keys(Keys.RETURN)
Can you try the steps below?
search_toggle = driver.find_element_by_xpath('//*[@class="row secondary"]/nav[2]/ul/li[1]/a')
search_toggle.click()