I'm building a project that opens my orders page on Amazon and collects data such as product name, price, and delivery date using Selenium (there is no API for this, and it can't be done with bs4). I can log in and reach the orders page without any problem. But I'm stuck at finding the delivery date using find element by class (I chose class because all the delivery date texts share the same class): Selenium says it cannot find the element.
No, it's not in an iframe, as I don't see the "This Frame" option when I right-click on that element.
Here is the code:
import requests
from selenium import webdriver
import time
userid = ''  # userid goes here
passwd = ''  # passwd goes here
browser = webdriver.Chrome()
browser.get('https://www.amazon.in/gp/your-account/order-history?ref_=ya_d_c_yo')
email_input = browser.find_element_by_id('ap_email')
email_input.send_keys(userid)
email_input.submit()
passwd_input = browser.find_element_by_id('ap_password')
passwd_input.send_keys(passwd)
passwd_input.submit()
time.sleep(5)
date = browser.find_element_by_class_name('a-color-secondary value')
print(date.text)
Finding the element by XPath seems to work, but it fails to find the date for all orders because the XPath is different for every element.
Any help is appreciated.
Thanks
This refers to the following line:
date = browser.find_element_by_class_name('a-color-secondary value')
It seems your target element has multiple class names, a-color-secondary and value. Unfortunately, .find_element_by_class_name accepts only a single class name.
Instead, you can use .find_element_by_css_selector:
date = browser.find_element_by_css_selector('.a-color-secondary.value')
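If the goal is the delivery date of every order rather than just the first match, the same compound selector can be passed to the plural lookup. A minimal sketch, assuming Selenium 4's `find_elements(by, value)` call with the plain string "css selector" as the locator strategy (the same string that `By.CSS_SELECTOR` holds); the helper names `class_attr_to_css` and `collect_texts` are made up for illustration:

```python
def class_attr_to_css(class_attr):
    """Turn an HTML class attribute such as 'a-color-secondary value'
    into a compound CSS selector such as '.a-color-secondary.value'."""
    return "".join("." + name for name in class_attr.split())


def collect_texts(driver, class_attr):
    """Return the text of every element that carries all the given classes."""
    selector = class_attr_to_css(class_attr)
    # "css selector" is the string value of By.CSS_SELECTOR in Selenium 4.
    elements = driver.find_elements("css selector", selector)
    return [element.text for element in elements]
```

With the `browser` object from the question, `collect_texts(browser, 'a-color-secondary value')` would return one string per matching element. Other elements on the page may share these classes, so the result may need filtering.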
I am new to web scraping. I am trying to scrape a website (to get flight prices) using Python Selenium. The site has drop-down menus for fields such as departure city and arrival city, and the first option listed should be selected and used.
Let's say the departure city field is a button. On clicking that button, a new HTML page is loaded with an input box and a list in which every list element contains a button.
While debugging I found that once my keyword is entered, the options loading screen appears, but the options are not loaded even after increasing the sleep time (I have attached an image of this).
[Image: screenshot of the error]
Manually there are no issues with the drop-down menu. As soon as I enter my departure city, the options are loaded and I choose the respective location.
I also tried using ActionChains, but with no luck. I am attaching my code below. Any help is appreciated.
# Manually trying to access the first element:
# Accessing the input field:
fly_from = browser.find_element("xpath", "//button[contains(., 'Departing')]").click()
element = WebDriverWait(browser,10).until(ec.presence_of_element_located((By.ID,"location")))
# Sending the keyword:
element.send_keys("Los Angeles")
# Selecting the first element:
first_elem = browser.find_element("xpath", "//div[@class='classname default-padding']//div//ul//li[0]]//button")
Using Action Chains:
fly_from = browser.find_element("xpath", "//button[contains(., 'Departing')]").click()
time.sleep(10)
element = WebDriverWait(browser, 10).until(ec.presence_of_element_located((By.ID, "location")))
element.send_keys("Los Angeles")
time.sleep(3)
list_elem = browser.find_element("xpath", "//div[@class='class name default-padding']//div//ul//li[0]]//button")
size=element.size
action = ActionChains(browser)
action.move_to_element_with_offset(to_element=list_elem,xoffset =0.5*size['width'], yoffset=70).click().perform()
action.click(on_element = list_elem)
action.perform()
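One detail in the XPath strings above is worth checking independently of the loading problem: XPath positions are 1-based, so the first list item is `li[1]`, not `li[0]`, and each `[` needs exactly one closing `]`. A small sketch using the standard library's ElementTree (which supports a subset of XPath) on made-up markup; the element structure and option labels are assumptions, not the real page:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for the drop-down list.
MARKUP = """
<div class="classname default-padding">
  <div>
    <ul>
      <li><button>first option</button></li>
      <li><button>second option</button></li>
    </ul>
  </div>
</div>
"""

root = ET.fromstring(MARKUP)

# Position predicates start at 1: li[1] is the first <li> under <ul>.
first = root.find(".//ul/li[1]/button")
second = root.find(".//ul/li[2]/button")
print(first.text, "|", second.text)
```

Whether this is what prevents the options from loading on the real site is not something the snippet can show; it only illustrates the indexing rule.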
I am trying to pull data from the following page:
https://www.lifeinscouncil.org/industry%20information/ListOfFundNAVs
I have to select the name of the company, select a date from the calendar, and click the Get Data button.
I am trying to achieve this using Selenium WebDriver with Chrome in Python, but I am stuck on how to pass the date parameter to the page.
It seems the page posts back after a date is selected from the calendar.
The date needs to be selected from the calendar, otherwise the webpage does not return the data.
I have tried using the requests POST method as well, but I am not able to get the NAV data.
I need to iterate this over a period of 5 years on a daily (trading days) basis.
PS: I am bad at understanding DOM elements and have only basic knowledge of Python and coding. By profession I am a data analyst.
Thanks in advance.
Kiran Jain
Edit: adding current code below:
from selenium import webdriver
url='https://www.lifeinscouncil.org/industry%20information/ListOfFundNAVs'
opt = webdriver.ChromeOptions()
prefs = {"profile.managed_default_content_settings.images": 2}
opt.add_argument("--start-maximized")
# opt.add_argument("--headless")
opt.add_argument("--disable-notifications")
opt.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=opt)
driver.get('https://www.lifeinscouncil.org/industry%20information/ListOfFundNAVs');
insurer = driver.find_element_by_id("MainContent_drpselectinscompany")
nav_date=driver.find_element_by_id('MainContent_txtdateselect')
get_data_btn=driver.find_element_by_id('MainContent_btngetdetails')
options=insurer.find_elements_by_tag_name("option")
data=[]
row={'FirmName','SFIN','fundName','NAVDate','NAV'}
for option in options:
    print('here')
    print(option.get_attribute("value") + ' ' + option.text)
    if(option.text!='--Select Insurer--'):
        option.click()
        driver.find_element_by_id("MainContent_imgbtncalender").click()#Calender Icon
        driver.find_element_by_link_text("June").click()#Date
        driver.find_element_by_link_text("25").click()#Date
        get_data_btn=driver.find_element_by_id('MainContent_btngetdetails') #this is put here again because on clicking the date, the page is reloaded
        get_data_btn.click()
        print('clicked')
driver.quit()
The date is in an "a" tag. You can try selecting the date by its link text.
driver.find_element_by_id("MainContent_imgbtncalender").click()#Calender Icon
driver.find_element_by_link_text("27").click()#Date
As per your comment, I tried to traverse through the dates, but it only worked for that particular month. I tried using send_keys() on that text box and it does not work. Below is the code to traverse the dates for one month.
driver.get("https://www.lifeinscouncil.org/industry%20information/ListOfFundNAVs")
driver.find_element_by_id("MainContent_drpselectinscompany").click()
driver.find_element_by_xpath("//option[starts-with(text(),'Aditya')]").click()
driver.find_element_by_id("MainContent_imgbtncalender").click()
driver.find_element_by_link_text("1").click()
driver.find_element_by_id("MainContent_btngetdetails").click()
dateval = 2
while True:
    if dateval == 32:
        break
    try:
        driver.find_element_by_id("MainContent_imgbtncalender").click()
        driver.find_element_by_link_text(str(dateval)).click()
        driver.find_element_by_id("MainContent_btngetdetails").click()
        dateval+=1
        time.sleep(2)
    except:
        driver.switch_to.default_content()
        dateval+=1
        time.sleep(2)
time.sleep(5)
driver.quit()
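For the five-year range mentioned in the question, the month-by-month clicking above needs a list of target dates to drive it. A minimal sketch that generates them with the standard library; it treats every Monday-to-Friday date as a trading day, which is an assumption (exchange holidays are not excluded), and `weekdays_between` is a hypothetical helper name:

```python
from datetime import date, timedelta


def weekdays_between(start, end):
    """Yield every Monday-to-Friday date from start to end, inclusive."""
    current = start
    while current <= end:
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            yield current
        current += timedelta(days=1)


# Example: one calendar week, Monday 21 June 2021 to Sunday 27 June 2021.
days = list(weekdays_between(date(2021, 6, 21), date(2021, 6, 27)))
print(days)
```

Each date's `day`, `month` and `year` can then be used to decide which calendar link to click; dates on which the site returns no data can simply be skipped.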
I am trying to build a web scraper for Redfin. I noticed that for the Redfin Estimate, the class name is 'statsValue'. But that name exists in 5 places, so, given that the class name "info-block avm" exists only once, I thought I could use that to get the statsValue, as it seems to be a sort of parent class.
I am totally new to web scraping and Selenium. Here is my code:
from selenium import webdriver
from selenium.webdriver.remote import webelement
import pandas as pd
import time
driver = webdriver.Chrome('chromedriver.exe')
driver.get('https://www.redfin.com/')
time.sleep(2)
search_box = driver.find_element_by_name('searchInputBox')
search_box.send_keys('693 Bluebird Canyon Drive, Laguna Beach, CA 92651')
search_box.submit()
time.sleep(2)
# element = driver.find_elements_by_class_name('statsValue')
# print(element[0].get_attribute('innerHTML'))
element = driver.find_element_by_class_name('info-block avm')
driver.quit()
The problem I am having is: once I find 'info-block avm', how do I return the value underneath it, as shown in the picture posted?
Below are examples of how you can get the price. Information about how to use selectors can be found in the following links: css and xpath.
price = driver.find_element_by_css_selector('.info-block.avm .statsValue').text
price = driver.find_element_by_css_selector('.avm .statsValue').text
element = driver.find_element_by_class_name('avm')
price = element.find_element_by_class_name('statsValue').text
Best practice is to use Explicit or Implicit waits instead of time.sleep().
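The idea behind an explicit wait is to poll for a condition until it holds or a timeout expires, instead of sleeping for a fixed time. Selenium ships this as `WebDriverWait`; the sketch below is only a plain-Python illustration of the polling loop, with `wait_until` as a hypothetical helper name:

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without the condition becoming truthy.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

For example, `wait_until(lambda: driver.find_elements('css selector', '.avm .statsValue'))` would return the list of matching elements as soon as it is non-empty, assuming Selenium 4's `find_elements(by, value)`.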
Background
I am writing a script to automate web form submissions for a set of websites using Python and Selenium. However, the script needs to work on similar websites where the underlying HTML and general structure of a form is not known in advance.
I have managed to get this working well for standard input fields (e.g. text fields, radio buttons, etc), however I am having difficulty handling date picker fields, as these fields are very site-specific.
Despite these site-specific differences, there appears to be a common usage pattern: (i) click the date picker field to reveal the calendar pop-up; then (ii) click on the desired date.
Revealing the pop-up calendar by clicking on the date picker field is straightforward and results in a new snippet of HTML (defining the calendar) being inserted somewhere within the page source. If I could extract just this new HTML snippet then I could use this information to reason about the input field being a calendar and how to interact with it.
Question
How can I extract new HTML code that was generated by a user interaction, such as clicking on an input field?
Edited response according to the user's feedback
Get html content of element
How to get & print WebElement instance content with python / selenium:
my_web_element.get_attribute('innerHTML')
# Or, depending what you want
my_web_element.get_attribute('outerHTML')
How to wait for an element
In this example I click on a link with id 'target' and wait for an element with id 'calendar' to become visible.
driver.find_element_by_id('target').click()
calendar = WebDriverWait(driver, 30).until(
ec.visibility_of_element_located((By.ID, 'calendar'))
)
Conclusion
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
import selenium.webdriver.support.expected_conditions as ec
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get('http://spam.egg.bacon')
driver.find_element_by_id('target').click()
calendar = WebDriverWait(driver, 30).until(
ec.visibility_of_element_located((By.ID, 'calendar'))
)
print(calendar.get_attribute('innerHTML'))
Prints: <div id="calendar">I'm a calendar</div> with this complete example: https://gist.github.com/arount/285e5add758bb14074efc4708f1d6b07
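When the id of the inserted element is not known in advance, as in the question, one possible approach is to snapshot `driver.page_source` before and after the click and keep only the lines that are new. A rough sketch using `difflib` from the standard library; `added_html_lines` is a hypothetical helper, and it assumes the serialized page is split across lines (a page serialized as one long line would need to be pretty-printed first):

```python
import difflib


def added_html_lines(before, after):
    """Return the lines present in `after` that were inserted or changed
    relative to `before`, ignoring blank lines and indentation."""
    before_lines = [line.strip() for line in before.splitlines() if line.strip()]
    after_lines = [line.strip() for line in after.splitlines() if line.strip()]
    matcher = difflib.SequenceMatcher(a=before_lines, b=after_lines, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(after_lines[j1:j2])
    return added


# Made-up page snapshots standing in for driver.page_source.
before = "<body>\n<input id='date'>\n</body>"
after = "<body>\n<input id='date'>\n<div class='cal'>\n<a>1</a>\n</div>\n</body>"
print(added_html_lines(before, after))
```

In a real session the two snapshots would be taken immediately before the click and after waiting for the page to settle.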
I'd like to use python selenium to search at https://www.homeaway.com/
The following works:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("https://www.homeaway.com/")
driver.find_element_by_xpath("//input[@id='searchKeywords']").send_keys("Philadelphia, PA, USA")
But I'm running into an issue using the calendar dropdown date picker, as it's not taking any of the values.
I've tried the following:
Attempt 1 to enter a date into the start and end date fields:
driver.find_element_by_xpath("//input[@id='stab-searchbox-start-date']").send_keys("02/01/2017")
driver.find_element_by_xpath("//input[@id='stab-searchbox-end-date']").send_keys("03/01/2017")
Note: It looks like the homeaway website completely ignores the above commands unless you manually click on the website with your mouse first and then run them. In other words, the above commands do not work without a manual mouse click on the website first.
Attempt 2 to enter a date into the start and end date fields:
driver.find_element_by_xpath("//div[@id='2017-02-01_2017-02']").click()
Attempt 3 to enter a date into the start and end date fields:
driver.execute_script("document.querySelectorAll('#stab-searchbox-start-date')[0].value = '02/01/2017'")
driver.execute_script("document.querySelectorAll('#stab-searchbox-end-date')[0].value = '03/01/2017'")
driver.find_element_by_xpath("//button[@class='btn btn-primary btn-lg searchbox-submit js-searchSubmit']").click()
Note: this looks like it works, but the dates are not actually registered when you click on search, despite being entered into the start and end date text boxes. It seems that homeaway only registers the dates if you use the calendar dropdown.
Note that you cannot use send_keys on inputs of this type when typed text input is not supported.
Try this code; it will work:
from selenium import webdriver
url='http://www.homeaway.com/'
driver = webdriver.Chrome()
driver.maximize_window()
driver.get(url)
driver.find_element_by_id("search-location").send_keys("Philadelphia, PA, USA")
driver.find_element_by_id('search-checkin').click()
driver.find_element_by_xpath('//*[@id="ui-datepicker-div"]//*[text()="30"]').click()
driver.find_element_by_id('search-checkout').click()
driver.find_element_by_xpath('//*[@id="ui-datepicker-div"]//*[text()="31"]').click()
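Regarding Attempt 3 in the question: assigning `.value` from JavaScript does not fire the events a real user interaction would, which is one possible reason the dates are not registered. A sketch that sets the value and then dispatches `input` and `change` events through `execute_script`; whether this particular site listens for those events is an assumption, and `set_value_with_events` is a hypothetical helper:

```python
# JavaScript run in the page: arguments[0] is the element, arguments[1] the value.
SET_VALUE_JS = """
const el = arguments[0];
el.value = arguments[1];
el.dispatchEvent(new Event('input', { bubbles: true }));
el.dispatchEvent(new Event('change', { bubbles: true }));
"""


def set_value_with_events(driver, element, value):
    """Set an input's value from JavaScript and notify page scripts.

    `driver` is expected to offer Selenium's execute_script(script, *args).
    """
    driver.execute_script(SET_VALUE_JS, element, value)
```

If the site still ignores the value, clicking through the calendar widget as in the answer above remains the fallback.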