How to forward to another subpage - Python

Today I started my first project in Selenium with Python.
I'm going to write a program that automates buying items on a website. I plan to expand this bot's functionality in future updates, but right now I need this first part working.
What the program should do right now:
The program has to log in to the site.
The program has to navigate to a subpage - here is the problem: I get logged out of the site, so the next step of course doesn't work.
The program has to choose a size and then click "add to cart."
The program works fine if I don't navigate directly but instead click through the site. That is, jumping straight from the home page to the product page doesn't work - I get logged out - but clicking through, for example, home -> shoes -> name of shoe, works fine.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
import time

browser = webdriver.Chrome(executable_path=r'C:\Users\damia\Desktop\chromedriver.exe')

stronaLog = "https://www.zalando-lounge.pl/#/login"
stronaKup = "https://www.zalando-lounge.pl/campaigns/ZZO0TCY/categories/5999626/articles/AD115O085-A12"

# open the login page
browser.get(stronaLog)

# log in (send_keys returns None, so assigning its result is pointless)
browser.find_element_by_id('form-email').send_keys('myemail')
browser.find_element_by_id('form-password').send_keys('mypass' + Keys.ENTER)
time.sleep(1)

# save a cookie
cookie = {'name': 'MojCiasteczek', 'value': '666666'}
browser.add_cookie(cookie)

# forwarding to the next page and buying the product
browser.get(stronaKup)
browser.find_element_by_xpath("//*[contains(text(), '41 1/3')]").click()
browser.find_element_by_css_selector('#addToCartButton > div.articleButton___add-to-cart___1Nngf.core___flipper___3yDf4').click()

It looks like the authentication process hasn't completed by the time you attempt to navigate to a URL that requires the user to be logged in.
As discussed in the comments, to test this theory a sleep was added to wait for authentication to complete and that appeared to work.
Rather than using a sleep, a more robust approach would be to wait for some element to appear on the landing page. I'll give steps for a general example:
Here are the import statements you'll need:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Identify some element on the landing page after login - a welcome message, or anything that's not also on the login page. For this example, let's say the element you identify has id="Welcome"; yours will of course be different.
Use this code to wait for that element:
WebDriverWait(browser, 5).until(
    EC.presence_of_element_located((By.ID, "Welcome")))
This code will wait up to 5 seconds for that element to be present, but will return as soon as it finds it. Change the seconds to wait to the maximum possible time you would expect the login to take.
After that you should be able to navigate to your other url as an authenticated user:
browser.get(stronaKup)
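Putting it together with the question's code, the flow would look something like this (a minimal sketch - the "Welcome" id is a placeholder you'd replace with a real element from your landing page):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome(executable_path=r'C:\Users\damia\Desktop\chromedriver.exe')
stronaKup = "https://www.zalando-lounge.pl/campaigns/ZZO0TCY/categories/5999626/articles/AD115O085-A12"

browser.get("https://www.zalando-lounge.pl/#/login")
browser.find_element_by_id('form-email').send_keys('myemail')
browser.find_element_by_id('form-password').send_keys('mypass' + Keys.ENTER)

# block until the post-login page has rendered ("Welcome" is a placeholder id)
WebDriverWait(browser, 10).until(
    EC.presence_of_element_located((By.ID, "Welcome")))

# the session is now authenticated, so direct navigation should survive
browser.get(stronaKup)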

Related

How do I test every link on a webpage with Selenium using Python and pytest or Selenium Firefox IDE?

So I'm trying to learn Selenium for automated testing. I have the Selenium IDE and the WebDrivers for Firefox and Chrome, both are in my PATH, on Windows. I've been able to get basic testing working but this part of the testing is eluding me. I've switched to using Python because the IDE doesn't have enough features, you can't even click the back button.
I'm pretty sure this has been answered elsewhere but none of the recommended links provided an answer that worked for me. I've searched Google and YouTube with no relevant results.
I'm trying to find every link on a page, which I've been able to accomplish, even listing them - I would think this would be just a default test. I even got it to print the text of the link, but when I try to click the link it doesn't work. I've tried waits of various sorts, including visibility_of_any_elements_located and time.sleep(5), to wait before trying to click the link.
I've tried clicking the link after waiting with self.driver.find_element(By.LINK_TEXT, ("lnktxt")).click(), but none of that works. The code below does work for listing: it prints the link text, the URL, and the link text again from a variable.
I guess I'm not sure how to get a variable into By.LINK_TEXT or the ...by_link_text call, assuming that would work. I figured if I got the text into a variable I could reuse it; that worked for print but not for click().
I basically want to be able to load a page, list all links, click a link, go back and click the next link, etc.
The only post this site recommended that might be helpful was...
How can I test EVERY link on the WEBSITE with Selenium
But it's Java-based, and I've been trying to learn Python for the past month, so I'm not ready to learn Java just to make this work. The IDE doesn't seem to have an easy option for this, or from all my searches it's not well documented.
Here is my current Selenium code in Python.
import pytest
import time
import json
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

wait_time_out = 15

class TestPazTestAll2():
    def setup_method(self, method):
        self.driver = webdriver.Firefox()
        self.vars = {}

    def teardown_method(self, method):
        self.driver.quit()

    def test_pazTestAll(self):
        self.driver.get('https://poetaz.com/poems/')
        lnks = self.driver.find_elements_by_tag_name("a")
        print("Total Links", len(lnks))
        # traverse the list and print each link's text and href
        for lnk in lnks:
            print(lnk.get_attribute("text"))
            lnktxt = lnk.get_attribute("text")
            print(lnk.get_attribute("href"))
            print(lnktxt)
Again, I'm sure I missed something in my searches but after hours of searching I'm reaching out.
Any help is appreciated.
I basically want to be able to load a page, list all links, click a link, go back and click the next link, etc.
I don't recommend doing this. Driving the browser with Selenium is slow, and you're not using the browser for anything that actually requires one.
What I recommend instead is simply sending requests to those scraped links and asserting the response status codes.
import requests

link_elements = self.driver.find_elements_by_tag_name("a")
urls = map(lambda l: l.get_attribute("href"), link_elements)
for url in urls:
    response = requests.get(url)
    assert response.status_code == 200
(You also might need to prepend some base url to those strings found in href attributes.)
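If any of the collected hrefs turn out to be relative, here is a sketch of resolving them with the standard library's urljoin (note that get_attribute("href") usually returns an absolute URL already, so this mainly matters if you collect raw attribute values some other way):
import requests
from urllib.parse import urljoin

base_url = self.driver.current_url
for link in self.driver.find_elements_by_tag_name("a"):
    href = link.get_attribute("href")
    if not href:
        continue  # skip anchors without an href
    # resolve a possibly-relative href against the page URL before requesting it
    response = requests.get(urljoin(base_url, href))
    assert response.status_code == 200, href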

Struggling to click the load more button with Selenium

I plan to build a scraper that'll utilize both Selenium and BeautifulSoup.
I'm struggling to click the load more button with Selenium. I've managed to detect the button, scroll to it, etc. - I just can't figure out a way to click the button repeatedly.
Any suggestions on how to pass this hurdle?
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import TimeoutException, NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time, requests
from bs4 import BeautifulSoup

def search_agent(zip):
    location = bot.find_element_by_name('hheroquotezip')
    time.sleep(3)
    location.clear()
    location.send_keys(zip)
    location.submit()

def load_all_agents():
    # click "load more" until there are no more results to load
    while True:
        try:
            #more_button = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'results.length'))).click()
            more_button = wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="searchResults"]/div[3]/button'))).click()
        except TimeoutException:
            break
        # wait for results to load
        wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.seclection-result .partners-detail')))
    print("Complete")
    bot.quit()

# define zip for the search query
zip = 20855
bot = webdriver.Safari()
wait = WebDriverWait(bot, 10)

# fetch the agents page
bot.get('https://www.erieinsurance.com/find-an-insurance-agent')
search_agent(zip)
load_all_agents()
With the above approach, the console spits out these errors:
[Error] Refused to load https://9203275.fls.doubleclick.net/activityi;src=9203275;type=agent0;cat=agent0;ord=7817740349177;gtm=2wg783;auiddc=373080108.1594822533;~oref=https%3A%2F%2Fwww.erieinsurance.com%2Ffind-an-insurance-agent-results%3Fzipcode%3D20855? because it does not appear in the frame-src directive of the Content Security Policy.
[Error] Refused to connect to https://api.levelaccess.net/analytics/3.0/results because it does not appear in the connect-src directive of the Content Security Policy.
Creating an answer to post a couple of images.
When I ran the attached script in Chrome it worked fine.
When @furas did the same in Firefox he had the same result.
I ran the same script 10 times back to back and I wasn't refused.
What I note based on the error is that the iframe seems browser-sensitive:
In Chrome this header contains Chromium scripts:
In Firefox it contains no scripts:
Have a look and see what you get manually in your Safari.
A simple answer might be to not use Safari - use Chrome or Firefox. Is that an option? (If it MUST be Safari, just say and I'll look again.)
Finally, a couple of quick additional notes.
The site is using Angular, so you might want to consider Protractor if you're struggling with synchronisation. (Protractor helps with some script-syncing capabilities.)
Also worth a note: don't feel you have to land on the home page and then navigate as a user. Update your URL to the search results page and feed in the zip code to save yourself some time:
https://www.erieinsurance.com/find-an-insurance-agent-results?zipcode=20855
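A sketch of that shortcut, reusing the question's setup (the zipcode query parameter is taken from the URL above):
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait

zip_code = 20855
bot = webdriver.Safari()
wait = WebDriverWait(bot, 10)

# land directly on the results page instead of driving the home-page search form
bot.get('https://www.erieinsurance.com/find-an-insurance-agent-results?zipcode={}'.format(zip_code))
From there you can call the question's load_all_agents() directly.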
[edit/update]
Is this the same thing? https://github.com/SeleniumHQ/selenium/issues/458
A bug closed in 2016 around "Content Security Policies" - logged as an Apple issue.

How to handle Google Authenticator with Selenium

I need to download a massive number of Excel files (estimated: 500 - 1000) from sellercentral.amazon.de. Downloading manually is not an option, as every download needs several clicks until the Excel file pops up.
Since Amazon cannot provide me a simple XML with its structure, I decided to automate this on my own. The first thing that came to mind was Selenium and Firefox.
The Problem:
A login to Seller Central is required, as well as two-factor authentication (2FA). So if I log in once, I can open another tab, enter sellercentral.amazon.de and be instantly logged in.
I can even open another instance of the browser and be instantly logged in there too. They might be using session cookies. The target URL to "scrape" is https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu .
But when I open the URL from my Python script with Selenium WebDriver, a new instance of the browser is launched, in which I am not logged in. Even though there are instances of Firefox running at the same time in which I am logged in. So I guess the instances launched by Selenium are somewhat different.
What I've tried:
I tried setting a time delay after the first .get() (to open the site), then logging in manually, and after that redoing the .get() - which makes the script run for ages.
from selenium import webdriver
import time

browser = webdriver.Firefox()
browser.get("https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu")
# long pause to leave time to log in manually
time.sleep(30000)
browser.get("https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu")
elements = browser.find_elements_by_tag_name("browse-node-component")
print(str(elements))
What am I looking for?
I need a solution to use the two-factor authentication token from Google Authenticator.
I want Selenium to open as a tab in the existing instance of the Firefox browser, where I will already have logged in beforehand. Then no login (should be) required, and the "scraping" and downloading can be done.
If there's no direct way, maybe someone comes up with a workaround?
I know Selenium cannot download the files itself, as the popups are no longer part of the browser. I'll fix that when I get there.
Important Side-Notes:
Firefox is not a given! I'll gladly accept a solution for any browser.
Here is code that reads the Google Authenticator token and uses it during login. It uses JS to open the new tab.
Install the pyotp package before running the test code.
pip install pyotp
Test code:
from pyotp import TOTP
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu")
wait = WebDriverWait(driver, 10)

# enter the email
email = wait.until(EC.presence_of_element_located((By.XPATH, "//input[@name='email']")))
email.send_keys("email goes here")

# enter the password
driver.find_element_by_xpath("//input[@name='password']").send_keys("password goes here")

# click on the sign-in button
driver.find_element_by_xpath("//input[@id='signInSubmit']").click()

# wait for the 2FA field to display
authField = wait.until(EC.presence_of_element_located((By.XPATH, "xpath goes here")))

# get the current token from Google Authenticator
totp = TOTP("secret goes here")
token = totp.now()
print(token)

# enter the token in the UI
authField.send_keys(token)

# click on the button to complete 2FA
driver.find_element_by_xpath("xpath of the button goes here").click()

# now open a new tab
driver.execute_script("""window.open("https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu")""")

# continue with your logic from here
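As for reusing an already-logged-in Firefox session: Selenium can't attach to a browser that's already running, but you can point it at an existing Firefox profile so its cookies come along. A rough sketch, assuming Selenium 3's FirefoxProfile (the profile path below is a placeholder - look up the real one via about:profiles in Firefox):
from selenium import webdriver

# placeholder path - find your real profile folder via about:profiles in Firefox
profile_path = r"C:\Users\you\AppData\Roaming\Mozilla\Firefox\Profiles\xxxxxxxx.default"
profile = webdriver.FirefoxProfile(profile_path)
driver = webdriver.Firefox(firefox_profile=profile)

# if the profile's login cookies are still valid, this page should load logged in
driver.get("https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu")
Note that Selenium copies the profile to a temporary directory, so this only helps if the site's login cookies are persistent rather than session-scoped.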

Python Selenium - Webdriver interacts with data from previous page

I am trying to create a piece of code that takes an input from the user, searches for it in their Gmail account, and then checks the boxes of all messages whose sender matches the input.
Here is my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
from selenium.webdriver import Remote
from selenium.webdriver.support.ui import WebDriverWait
import urllib

driver = webdriver.Chrome('D:/chromedriver_win32/chromedriver.exe')
driver.get("http://www.gmail.com")

# enter the email address and go to the password step
elem = driver.find_element_by_id('Email')
elem.send_keys("****************")
elem2 = driver.find_element_by_id('next')
elem2.send_keys(Keys.ENTER)

driver.maximize_window()
driver.implicitly_wait(20)

# enter the password and sign in
elem3 = driver.find_element_by_name('Passwd')
elem3.send_keys("*************")
driver.find_element_by_id('signIn').send_keys(Keys.ENTER)
driver.implicitly_wait(20)

# search for the user's input
inp = 'randominput'
driver.find_element_by_name('q').send_keys(inp)
driver.find_element_by_css_selector('button.gbqfb').send_keys(Keys.ENTER)
x = driver.current_url

# check every unread message whose sender matches the input
for i in driver.find_elements_by_css_selector('.zA.zE'):
    print(i.find_element_by_class_name('zF').get_attribute('name'))
    if i.find_element_by_class_name('zF').get_attribute('name') == inp:
        i.find_element_by_css_selector('.oZ-jc.T-Jo.J-J5-Ji').click()
The main problem is that although the webdriver shows the new page with the search results, when the code interacts with the page it interacts with the previous one.
I have tried an implicit wait, and when I check the current URL it shows the new one.
The problem is that this is a single-page app, so Selenium isn't going to wait for new data to load. You need to figure out a way to wait for the search results to come back, whether based on the 'Loading...' indicator that appears at the top or just waiting for the first result to change.
Grab an element off of the first page and wait for it to go stale. That will tell you that the page is loading. Then wait for the element you want. This is about the only way to ensure that the page has reloaded and you aren't referencing the original page in a dynamic app.
To wait for staleness, see staleness_of at http://selenium-python.readthedocs.io/waits.html.
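A minimal sketch of that pattern, reusing the selectors from the question (grab an element from the old result list, trigger the search, wait for it to go stale, then wait for the fresh results):
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# grab any element from the page as it is now
old_row = driver.find_element_by_css_selector('.zA.zE')

# trigger the search
driver.find_element_by_name('q').send_keys(inp)
driver.find_element_by_css_selector('button.gbqfb').click()

wait = WebDriverWait(driver, 10)
# the old element detaching from the DOM means the page is re-rendering
wait.until(EC.staleness_of(old_row))
# now wait for the new results before touching them
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '.zA.zE')))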

Webdriver.get() will not navigate to another page until I add a wait before the call

I have a Python 3.5 script that logs into a web site (getting redirected after login) and then navigates to another page where it does its actual job. If I call webdriver.get() for this second page right after the login, it doesn't navigate to the target location and stays on the post-login redirect page. But if I put a breakpoint there, or just time.sleep(5), it opens up fine. I found this question with C# code, where the suggestion is to add async waits. Is there a Pythonic way to work around this?
My code for reference:
# underneath is webdriver.Chrome() wrapped up in some exception handling
mydriver = get_web_driver()
# do the first get, wait for the form to load, then fill in the form and call form.submit()
login(mydriver, login_page, "username", "password")
# perform the second get() with the same webdriver
open_invite_page(mydriver, group_invite_page)
You need to apply the concept of Explicit Waits via the WebDriverWait class; you just need to choose which of the Expected Conditions to wait for. It sounds like title_is or title_contains could work in your case:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

login(mydriver, login_page, "username", "password")

# wait for the invite page to be opened
wait = WebDriverWait(mydriver, 10)
wait.until(
    EC.title_is("Invite Page")  # TODO: change to the real page title
)
open_invite_page(mydriver, group_invite_page)
Note that there are multiple built-in Expected Conditions. You may, for instance, wait for a specific element to be present, visible or clickable on the invite page. And, you can create custom Expected Conditions if you would not find a good fit among the built-in conditions.
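A custom condition is just a callable that takes the driver and returns a truthy value when the wait should end. A minimal sketch (the "/invite" fragment is a made-up example you'd replace with something from the real invite page URL):
class url_contains_fragment(object):
    """Expected condition: true once the current URL contains the given fragment."""
    def __init__(self, fragment):
        self.fragment = fragment

    def __call__(self, driver):
        return self.fragment in driver.current_url

# hypothetical fragment - replace with a piece of the real invite page URL
wait.until(url_contains_fragment("/invite"))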
