I am trying to create a piece of code that takes an input from the user, searches for that input in their Gmail account, and then checks all the checkboxes for messages whose sender matches the input.
Here is my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
from selenium.webdriver import Remote
from selenium.webdriver.support.ui import WebDriverWait
import urllib
driver = webdriver.Chrome('D:/chromedriver_win32/chromedriver.exe')
driver.get("http://www.gmail.com")
elem = driver.find_element_by_id('Email')
elem.send_keys("****************")
elem2 = driver.find_element_by_id('next')
elem2.send_keys(Keys.ENTER)
driver.maximize_window()
driver.implicitly_wait(20)
elem3 = driver.find_element_by_name('Passwd')
elem3.send_keys("*************")
driver.find_element_by_id('signIn').send_keys(Keys.ENTER)
driver.implicitly_wait(20)
inp = 'randominput'
driver.find_element_by_name('q').send_keys(inp)
driver.find_element_by_css_selector('button.gbqfb').send_keys(Keys.ENTER)
x = driver.current_url
for i in driver.find_elements_by_css_selector('.zA.zE'):
    print(i.find_element_by_class_name('zF').get_attribute('name'))
    if i.find_element_by_class_name('zF').get_attribute('name') == inp:
        i.find_element_by_css_selector('.oZ-jc.T-Jo.J-J5-Ji').click()
The main problem is that although the webdriver shows the new page with the search results, when the code interacts with the page it interacts with the previous one.
I have tried adding an implicit wait, and when I check the current URL it shows the new one.
The problem is that this is a single-page app, so Selenium isn't going to wait for the new data to load. You need to figure out a way to wait for the search results to come back, whether it's based on the 'Loading...' indicator that appears at the top or just waiting for the first result to change.
Grab an element off of the first page and wait for it to go stale. That will tell you that the page is loading. Then wait for the element you want. This is about the only way to ensure the page has reloaded and that you aren't referencing the original page on a dynamic site.
To wait for stale, see staleness_of on http://selenium-python.readthedocs.io/waits.html.
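For example, here is a minimal sketch of that pattern applied to the script above (the selectors are the question's obfuscated Gmail class names and may have changed; driver and Keys come from your existing code):

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# grab a row from the current (pre-search) result list
old_row = driver.find_element_by_css_selector('.zA.zE')
# trigger the search
driver.find_element_by_css_selector('button.gbqfb').send_keys(Keys.ENTER)

wait = WebDriverWait(driver, 20)
# the old row going stale tells you the results are being replaced
wait.until(EC.staleness_of(old_row))
# now wait for the fresh rows and iterate over those instead
rows = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.zA.zE')))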
Related
Is there a way to set timeout when specific element in webpage loads
Note: I am talking about loading a webpage using the driver.get() method.
I tried, for example, setting the page load timeout to 10 s and checking whether the element is present, but if it is not present I have to load the page again from the start.
Edit:
To be clear: I don't want to load the full URL.
I want driver.get() to load the page only until the element is found, and then stop loading the rest of it.
In your examples you simply used the driver.get() method, which loads the full URL and only then executes the next command. One way around that is driver.set_page_load_timeout().
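A minimal sketch of the timeout route (URL is a placeholder; once the timeout fires, driver.get() raises TimeoutException and you are left with whatever has rendered so far):

from selenium import webdriver
from selenium.common.exceptions import TimeoutException

driver = webdriver.Chrome()
driver.set_page_load_timeout(10)  # stop waiting for the full page after 10 seconds
try:
    driver.get(URL)  # URL is a placeholder
except TimeoutException:
    pass  # the load was cut short; work with whatever has rendered so far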
Edit:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

op = Options()
op.page_load_strategy = 'none'  # don't block in driver.get() until the page finishes loading
driver = webdriver.Chrome(options=op)
driver.get(URL)
# wait only until the element you need is present (here: any div)
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located(
        (By.XPATH, '//div')))
# then tell the browser to stop fetching the rest of the page
driver.execute_script("window.stop();")
PS. You will keep getting off-topic answers if you post vague questions without a line of code.
webdriver waits for a page to load by default, but it does not wait for loading inside frames or for ajax requests. That means that when you use .get('url'), your browser waits until the page is completely loaded and then goes to the next command in the code. But when an ajax request is posted, webdriver does not wait, and it's your responsibility to wait an appropriate amount of time for the page, or a part of the page, to load; for that there is a module named expected_conditions.
When the element is not found you get an error, so you can retry with try/except until the element is found:
while True:
    try:
        element = driver.find_element(by=By.<Your Choice>, value=<Name or Xpath or ...>)
        break  # found it, stop retrying
    except NoSuchElementException:  # from selenium.common.exceptions
        time.sleep(1)  # not there yet, try again in a second
It keeps trying until the element is found.
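For completeness, the expected_conditions module mentioned above does the same polling for you and gives up cleanly after a timeout; a minimal sketch with a placeholder locator:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# polls every 500 ms for up to 10 seconds, then raises TimeoutException
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "something")))  # placeholder locator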
I'm trying to pull the airline names and prices of a specific flight. I'm having trouble with the XPath and/or using the right HTML tags, because when I run the code below, all I get back is 14 empty lists.
from selenium import webdriver
from lxml import html
from time import sleep
driver = webdriver.Chrome(r"C:\Users\14074\Python\chromedriver")
URL = 'https://www.google.com/travel/flights/search?tfs=CBwQAhopagwIAxIIL20vMHBseTASCjIwMjEtMTItMjNyDQgDEgkvbS8wMWYwOHIaKWoNCAMSCS9tLzAxZjA4chIKMjAyMS0xMi0yN3IMCAMSCC9tLzBwbHkwcAGCAQsI____________AUABSAGYAQE&tfu=EgYIAhAAGAA'
driver.get(URL)
sleep(1)
tree = html.fromstring(driver.page_source)
for flight_tree in tree.xpath('//div[@class="TQqf0e sSHqwe tPgKwe ogfYpf"]'):
    title = flight_tree.xpath('.//*[@id="yDmH0d"]/c-wiz[2]/div/div[2]/div/c-wiz/div/c-wiz/div[2]/div[2]/div/div[2]/div[6]/div/div[2]/div/div[1]/div/div[1]/div/div[2]/div[2]/div[2]/span/text()')
    price = flight_tree.xpath('.//span[contains(@data-gs, "CjR")]')
    print(title, price)
#driver.close()
This is just the first part of my code but I can't really continue without getting this to work. If anyone has some ideas on what I'm doing wrong that would be amazing! It's been driving me crazy. Thank you!
I noticed a few issues with your code. First of all, I believe that when you enter this page, Google will first show you the "I agree to terms and conditions" popup before showing you the content of the page, so you need to click that button first.
Also, you should use the find_elements_by_xpath function directly on the driver instead of on the page source, as this also allows the JavaScript content to render. You can find more info here: python tree.xpath return empty list
To get more info on how to scrape using selenium and python you could check out this guide: https://www.webscrapingapi.com/python-selenium-web-scraper/
I used the following code to scrape the titles. (I also changed the XPaths to do so, by extracting them directly from Google Chrome. You can do that by right-clicking on an element -> Inspect, and in the Elements tab, right-click on the element -> Copy -> Copy XPath.)
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
# I used these for the code to work on my windows subsystem linux
option = webdriver.ChromeOptions()
option.add_argument('--no-sandbox')
option.add_argument('--disable-dev-sh-usage')
driver = webdriver.Chrome(ChromeDriverManager().install(), options=option)
URL = 'https://www.google.com/travel/flights/search?tfs=CBwQAhopagwIAxIIL20vMHBseTASCjIwMjEtMTItMjNyDQgDEgkvbS8wMWYwOHIaKWoNCAMSCS9tLzAxZjA4chIKMjAyMS0xMi0yN3IMCAMSCC9tLzBwbHkwcAGCAQsI____________AUABSAGYAQE&tfu=EgYIAhAAGAA'
driver.get(URL)
driver.find_element_by_xpath('//*[@id="yDmH0d"]/c-wiz/div/div/div/div[2]/div[1]/div[4]/form/div[1]/div/button/span').click()  # this is necessary to press the "I agree" button
elements = driver.find_elements_by_xpath('//*[@id="yDmH0d"]/c-wiz[2]/div/div[2]/div/c-wiz/div/c-wiz/div[2]/div[3]/div[3]/c-wiz/div/div[2]/div[1]/div/div/ol/li')
for flight_tree in elements:
    title = flight_tree.find_element_by_xpath('.//*[@class="W6bZuc YMlIz"]').text
    print(title)
I tried the code below, with the screen maximized and explicit waits, and could successfully extract the information:
Sample code:
driver = webdriver.Chrome(driver_path)
driver.maximize_window()
driver.get("https://www.google.com/travel/flights/searchtfs=CBwQAhopagwIAxIIL20vMHBseTASCjIwMjEtMTItMjNyDQgDEgkvbS8wMWYwOHIaKWoNCAMSCS9tLzAxZjA4chIKMjAyMS0xMi0yN3IMCAMSCC9tLzBwbHkwcAGCAQsI____________AUABSAGYAQE&tfu=EgYIAhAAGAA")
wait = WebDriverWait(driver, 10)
titles = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//div/descendant::h3")))
for name in titles:
    print(name.text)
    price = name.find_element(By.XPATH, "./../following-sibling::div/descendant::span[2]").text
    print(price)
Imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Output:
Tokyo
₹38,473
Mumbai
₹3,515
Dubai
₹15,846
The page I need to scrape data from: Digikey search results
Issue
Only 100 rows can be shown in each table, so I have to move between multiple tables using the NextPageButton.
As illustrated in the code below, I do click it, but the results I retrieve every time are the first table's results; the click action ActionChains(driver).click(element).perform() doesn't move me on to the next table's results.
Keep in mind that NO new page is opened; the click is intercepted by some JavaScript that does rich UI work on the same page to load a new table of data.
My Expectations
I am just trying to validate that I can move to the next table; then I will edit the code to loop through all of them.
This piece of code should return the data in the second table of results, BUT it actually returns the values from the first table, which loaded initially with the URL. This means that either the click action didn't occur, or it occurred but the WebDriver content isn't being updated when interacting with dynamic JavaScript elements on the page.
I would appreciate any help. Thanks!
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located
from selenium.webdriver import ActionChains
import time
import sys
url = "https://www.digikey.com/en/products/filter/coaxial-connectors-rf-terminators/382?s=N4IgrCBcoA5QjAGhDOl4AYMF9tA"
chrome_driver_path = "..PATH\\chromedriver"
chrome_options = Options()
chrome_options.add_argument("--headless")
webdriver = webdriver.Chrome(
    executable_path=chrome_driver_path,
    options=chrome_options
)
with webdriver as driver:
    wait = WebDriverWait(driver, 10)
    driver.get(url)
    wait.until(presence_of_element_located((By.CSS_SELECTOR, "tbody")))
    element = driver.find_element_by_css_selector("button[data-testid='btn-next-page']")
    ActionChains(driver).click(element).perform()
    time.sleep(10)  # too much time, I know, but just to make sure it is not a waiting issue; something should have updated
    results = driver.find_elements_by_css_selector("tbody")
    for count in results:
        countArr = count.text
        print(countArr)
        print()
    driver.close()
Finally found a SOLUTION!
Source of the solution.
As expected, the issue was in the clicking action itself: it is somehow not being done right, or not being done at all, as illustrated in the solution's source question.
The solution is to click the button using JavaScript execution.
Change line 30
ActionChains(driver).click(element).perform()
to the following:
driver.execute_script("arguments[0].click();",element)
That's it..
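To then loop through all the tables, here is a rough sketch building on the code above; I'm assuming the next-page button carries a disabled attribute on the last page, so check how the real button behaves:

while True:
    # read the rows of the table that is currently displayed
    for table in driver.find_elements_by_css_selector("tbody"):
        print(table.text)
    next_button = driver.find_element_by_css_selector("button[data-testid='btn-next-page']")
    if next_button.get_attribute("disabled"):  # assumption: disabled on the last page
        break
    driver.execute_script("arguments[0].click();", next_button)
    time.sleep(2)  # crude wait for the new table; a staleness wait would be cleaner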
I am trying to automate a YouTube search, click on a search result, and then follow the recommended videos on the right-hand side. The code I wrote goes to YouTube, makes the search, and clicks on a video, and the video opens in the browser. However, I cannot get it to click on one of the recommended videos.
The problem seems to be that when I use recommended_videos = driver.find_elements_by_id("video-title") to get the recommended-video elements on the right, what I get instead is a list from the previous page (from when I first typed in the word and got search results).
The code does work properly when I go directly to a video link with driver.get(url), instead of doing a search first.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
import random
seed = 1
random.seed(seed)
driver = webdriver.Firefox()
driver.get("http://www.youtube.com/")
element = driver.find_element_by_tag_name("input")
# Put the word "history" in the search box and hit enter
element.send_keys("history")
element.send_keys(Keys.RETURN)
time.sleep(5)
# Get a list of elements (videos) that get returned by the search
search_results = driver.find_elements_by_id("video-title")
# Click randomly on one of the first five results
search_results[random.randint(0,4)].click()
# Go to the end of the page (I don't know if this is necessary)
html = driver.find_element_by_tag_name('html')
html.send_keys(Keys.END)
time.sleep(10)
# Get the recommended videos the same way as above. This is where the problem starts, because recommended_videos essentially becomes the same thing as the previous page's search_results, even though the browser is in a new page now.
recommended_videos = driver.find_elements_by_id("video-title")
recommended_videos[random.randint(0,4)].click()
So when I click (last line), I get the error
ElementNotInteractableException: Element <a id="video-title" class="yt-simple-endpoint style-scope ytd-video-renderer" href="/watch?v=1oean5l__Cc"> could not be scrolled into view
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
import random
seed = 1
random.seed(seed)
driver = webdriver.Chrome()
driver.get("https://www.youtube.com")
element = driver.find_element_by_tag_name("input")
# Put the word "history" in the search box and hit enter
element.send_keys("history")
element.send_keys(Keys.RETURN)
time.sleep(5)
# Get a list of elements (videos) that get returned by the search
search_results = driver.find_elements_by_id("video-title")
# Click randomly on one of the first five results
search_results[random.randint(0,10)].click()
# Go to the end of the page (I don't know if this is necessary)
time.sleep(4)
# Get the recommended videos the same way as above. This is where the problem starts, because recommended_videos essentially becomes the same thing as the previous page's search_results, even though the browser is in a new page now.
while True:
    recommended_videos = driver.find_elements_by_xpath("//*[@id='dismissable']/div/a")
    print(recommended_videos)
    recommended_videos[random.randint(1,4)].click()
    time.sleep(4)
I am also completely new, and thank you for sparking my interest in Selenium. May this code help you; change the driver to Firefox if you want.
I have these imports:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
Browser set up via
browser = webdriver.Firefox()
browser.get(loginURL)
However sometimes I do
browser.switch_to_frame("nameofframe")
And it won't work (sometimes it does, sometimes it doesn't).
I am not sure if this is because Selenium isn't actually waiting for pages to load before it executes the rest of the code or what. Is there a way to force a webpage load?
Because sometimes I'll do something like
browser.find_element_by_name("txtPassword").send_keys(password + Keys.RETURN)
#sends login information, goes to next page and clicks on Relevant Link Text
browser.find_element_by_partial_link_text("Relevant Link Text").click()
And it'll work great most of the time, but sometimes I'll get an error where it can't find "Relevant Link Text" because it can't "see" it or some other such thing.
Also, is there a better way to check if an element exists or not? That is, what is the best way to handle:
browser.find_element_by_id("something")
When that element may or may not exist?
You could use WebDriverWait:
from contextlib import closing
from selenium.webdriver import Chrome as Browser
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import NoSuchFrameException
def frame_available_cb(frame_reference):
    """Return a callback that checks whether the frame is available."""
    def callback(browser):
        try:
            browser.switch_to_frame(frame_reference)
        except NoSuchFrameException:
            return False
        else:
            return True
    return callback

with closing(Browser()) as browser:
    browser.get(url)
    # wait for frame
    WebDriverWait(browser, timeout=10).until(frame_available_cb("frame name"))
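Selenium's expected_conditions module also ships ready-made conditions that cover both parts of the question; a minimal equivalent sketch:

from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# wait for the frame and switch to it in one step
WebDriverWait(browser, 10).until(
    EC.frame_to_be_available_and_switch_to_it("nameofframe"))

# for "an element that may or may not exist": wait up to 10 seconds and get
# a TimeoutException instead of an immediate NoSuchElementException
element = WebDriverWait(browser, 10).until(
    EC.presence_of_element_located((By.ID, "something")))  # placeholder locator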