How to handle Google Authenticator with Selenium - python

I need to download a massive number of Excel files (estimated: 500 - 1000) from sellercentral.amazon.de. Downloading them manually is not an option, as every download needs several clicks before the Excel file appears.
Since Amazon cannot provide me with a simple XML export of the data, I decided to automate this myself. The first things that came to mind were Selenium and Firefox.
The Problem:
A login to Seller Central is required, as well as two-factor authentication (2FA). Once I log in, I can open another tab, enter sellercentral.amazon.de, and am instantly logged in.
I can even open another instance of the browser and be instantly logged in there too, so they are probably using session cookies. The target URL to "scrape" is https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu .
But when I open the URL from my Python script with Selenium WebDriver, a new instance of the browser is launched, in which I am not logged in, even though there are instances of Firefox running at the same time in which I am logged in. So I guess the instances launched by Selenium are somehow different.
What I've tried:
I tried setting a time delay after the first .get() (to open the site), logging in manually during the delay, and then repeating the .get() - but that just makes the script run forever.
from selenium import webdriver
import time
browser = webdriver.Firefox()
browser.get("https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu")
# Pause so I can log in manually (this is why the script runs for so long)
time.sleep(30000)
browser.get("https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu")
elements = browser.find_elements_by_tag_name("browse-node-component")
print(str(elements))
What am I looking for?
I need a solution that uses the two-factor authentication token from Google Authenticator.
I want Selenium to open as a tab in the existing instance of the Firefox browser, where I will have already logged in beforehand. That way no login (should be) required, and the "scraping" and downloading can proceed.
If there's no direct way, maybe someone comes up with a workaround?
I know Selenium cannot download the files itself, as the download pop-ups are no longer part of the browser. I'll deal with that when I get there.
Important Side-Notes:
Firefox is not a given! I'll gladly accept a solution for any browser.

Here is code that reads the Google Authenticator token and uses it in the login. JavaScript is used to open the new tab.
Install the pyotp package before running the test code.
pip install pyotp
Test code:
from pyotp import TOTP
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Firefox()
driver.get("https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu")
wait = WebDriverWait(driver, 10)
# enter the email
email = wait.until(EC.presence_of_element_located((By.XPATH, "//input[@name='email']")))
email.send_keys("email goes here")
# enter the password
driver.find_element_by_xpath("//input[@name='password']").send_keys("password goes here")
# click on the sign-in button
driver.find_element_by_xpath("//input[@id='signInSubmit']").click()
# wait for the 2FA field to display
authField = wait.until(EC.presence_of_element_located((By.XPATH, "xpath goes here")))
# get the current token from the Google Authenticator shared secret
totp = TOTP("secret goes here")
token = totp.now()
print(token)
# enter the token in the UI
authField.send_keys(token)
# click on the button to complete 2FA
driver.find_element_by_xpath("xpath of the button goes here").click()
# now open a new tab
driver.execute_script("""window.open("https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu")""")
# continue with your logic from here
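As an aside, if you want to sanity-check the value totp.now() produces without a browser in the loop, the computation pyotp implements (RFC 6238 TOTP) is short enough to reproduce with the standard library alone. This is only a sketch for verification; the secret below is a made-up example, not a real Amazon secret:

```python
# Standard-library sketch of what pyotp's TOTP(...).now() computes (RFC 6238).
import base64
import hashlib
import hmac
import struct
import time

def totp_now(secret_b32, digits=6, period=30, for_time=None):
    """Return the TOTP code for a base32-encoded shared secret."""
    pad = "=" * (-len(secret_b32) % 8)               # base32 requires padding
    key = base64.b32decode(secret_b32.upper() + pad)
    counter = int(for_time if for_time is not None else time.time()) // period
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(totp_now("JBSWY3DPEHPK3PXP"))  # six-digit code, changes every 30 seconds
```

The secret string you pass to TOTP(...) is the same base32 value you scanned into the Google Authenticator app when setting up 2FA.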

Related

Open browser with selenium without creating new instance

I am automating a form filler using Selenium; however, the user needs to be logged in to their Google account. Selenium opens a new browser instance where the user is not logged in, and I cannot automate the login process due to two-factor authentication.
So far I've found
import webbrowser
webbrowser.open('www.google.com', new=2)
which will open the window the way I want, with the user logged in; however, I am unable to interact with the page, unlike with Selenium. Is there a way I can get Selenium to open a window the way webbrowser does? Or a way to interact with the page through webbrowser? I have checked the docs of both and not seen an answer to this.
You don't need to make the user log in to their account again. You can use the same Chrome profile that you use for your normal browser. This will enable you to use all your accounts from Chrome without logging in to them explicitly.
Here is how you can do this :
First get the user chrome profile path
Mine was : C:\Users\hpoddar\AppData\Local\Google\Chrome\User Data
If you have multiple profiles you might need to get the Profile id for your chrome google account as well.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
chrome_path = r"C:\Users\hpoddar\Desktop\Tools\chromedriver_win32\chromedriver.exe"
options = webdriver.ChromeOptions()
options.add_argument(r"--user-data-dir=C:\Users\hpoddar\AppData\Local\Google\Chrome\User Data")
# Specify this if there are multiple profiles in Chrome
# options.add_argument('--profile-directory=Profile 1')
s = Service(chrome_path)
driver = webdriver.Chrome(service=s, options=options)
driver.get("https://www.google.co.in")

Selenium Chromedriver doesn't keep me logged in in some websites

I'm using Selenium and ChromeDriver to scrape data from a website.
I need to keep my account logged in after closing the driver; for this purpose I use the default Chrome profile every time.
Here you can see my code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
urlpage = 'https://example.com/'
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=C:\\Users\\MyName\\AppData\\Local\\Google\\Chrome\\User Data")
driver = webdriver.Chrome(options=options)
driver.get(urlpage)
The problem is that for some websites (e.g. https://projecteuler.net/) it works, so I'm still logged in in the following session, but for others (like https://www.fundraiso.ch, the one I need) it doesn't, although in the "normal" browser I'm still logged in after I close the window.
Does anyone know how to fix this problem?
EDIT:
I didn't mention that I can't automate the login because the website has a maximum number of logins, and if I exceed it the website will block my account.

Why isn't my selenium browser logged into sites that firefox is always logged into?

I am trying to scrape a site that is behind authentication. When I use Firefox, I am always already logged in on the site (even after a reboot) and I can go straight to the pages.
When I open Selenium and tell it to use my Firefox profile, the Selenium browser isn't logged in. Then I have to go through about 4 or 5 minutes of clicking pictures to prove I'm a person. But if I open Firefox myself immediately afterwards, I am still logged in - no problem.
So I really don't understand what is happening. I know that my Firefox profile is being loaded: when I initialize the browser, it takes around 60 seconds for the profile to be copied over before the browser opens. My code is below:
from selenium import webdriver
url_start = 'https://authenticatedsection.some site.com/'
fp = webdriver.FirefoxProfile('C:/Users/Claudia/AppData/Roaming/Mozilla/Firefox/Profile/minvococ.default-release')
# initialize the browser with the profile
browser = webdriver.Firefox(fp)
browser.get(url_start)
You're logged in because of the cookies stored by your Firefox browser. Selenium opens a new browser process with no cookies. If you only need to enter a username and password to log in, this can be done with Selenium. If you need to pass a "human test" (CAPTCHA) such as the picture-clicking task you're describing, that cannot be done by a scraper: it is specifically designed to prevent bots from logging in to that page.
Why do you want your scraper to log in to your Firefox profile?

How to forward to another subpage

Today I started my first project in Selenium with Python.
I'm writing a program that will automate buying items on a website. I'm going to expand the bot's functions in future updates, but right now I need this first part done.
What the program should do right now:
The program has to log in to the site.
The program has to navigate to a subsite - here is the problem: I get logged off the site, so the next step doesn't work.
The program has to choose a size and then click "add to cart".
The program works fine if I don't navigate directly but instead click through the site. That is, jumping from the home page straight to the product page doesn't work - I get logged off. When I click through, for example, home -> shoes -> name of shoe, everything works fine.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
browser = webdriver.Chrome(executable_path=r'C:\Users\damia\Desktop\chromedriver.exe')
stronaLog = "https://www.zalando-lounge.pl/#/login"
stronaKup = "https://www.zalando-lounge.pl/campaigns/ZZO0TCY/categories/5999626/articles/AD115O085-A12"
# open the login page
browser.get(stronaLog)
# log in
email = browser.find_element_by_id('form-email').send_keys('myemail')
password = browser.find_element_by_id('form-password').send_keys('mypass' + Keys.ENTER)
time.sleep(1)
# save a cookie
cookie = {'name': 'MojCiasteczek', 'value': '666666'}
browser.add_cookie(cookie)
# navigate to the next page and buy the product
browser.get(stronaKup)
browser.find_element_by_xpath("//*[contains(text(), '41 1/3')]").click()
browser.find_element_by_css_selector('#addToCartButton > div.articleButton___add-to-cart___1Nngf.core___flipper___3yDf4').click()
It looks like the authentication process hasn't completed by the time you attempt to navigate to a url that requires the user to be logged in.
As discussed in the comments, to test this theory a sleep was added to wait for authentication to complete and that appeared to work.
Rather than using a sleep, a more robust approach would be to wait for some element to appear on the landing page. I'll give steps for a general example:
here are the import statements you'll need:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Identify some element on the landing page after login... Like a welcome text, or anything really that's not also on the login page. For this example let's say the element you identify has an id="Welcome", of course yours will be different.
Use this code to wait for that element:
WebDriverWait(browser, 5).until(
    EC.presence_of_element_located((By.ID, "Welcome")))
This code will wait up to 5 seconds for that element to be present, but will return as soon as it finds it. Change the seconds to wait to the maximum possible time you would expect the login to take.
After that you should be able to navigate to your other url as an authenticated user:
browser.get(stronaKup)

Selenium not storing cookies

I'm working on automating a game I want to get ahead in, called Pokemon Vortex. When I log in using Selenium it works just fine; however, when I attempt to load a page that requires a user to be logged in, I am sent right back to the login page (I have tried it outside of Selenium with the same browser, Chrome).
This is what I have
import time
from selenium import webdriver
from random import randint
driver = webdriver.Chrome(r'C:\Program Files (x86)\SeleniumDrivers\chromedriver.exe')
driver.get('https://zeta.pokemon-vortex.com/dashboard/')
time.sleep(5) # Let the user actually see something!
usernameLoc = driver.find_element_by_id('myusername')
passwordLoc = driver.find_element_by_id('mypassword')
usernameLoc.send_keys('mypassword')
passwordLoc.send_keys('12345')
submitButton = driver.find_element_by_id('submit')
submitButton.submit()
time.sleep(3)
driver.get('https://zeta.pokemon-vortex.com/map/10')
time.sleep(10)
I'm using Python 3.6+ and I literally just installed Selenium today, so it's up to date. How do I force Selenium to hold onto cookies?
Using a pre-defined user profile might solve your problem. This way your session data is saved with the profile instead of being discarded when the driver closes.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("--user-data-dir=C:/Users/user_name/AppData/Local/Google/Chrome/User Data")
driver = webdriver.Chrome(options=options)
driver.get("https://xyz.com")
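The user-data directory differs per operating system, so a small helper saves hard-coding the path per machine. These are Chrome's usual default install locations; adjust if your setup differs:

```python
# Locate Chrome's default user-data directory so --user-data-dir doesn't
# need to be hard-coded. Paths are the usual defaults per platform.
import os
import sys

def chrome_user_data_dir():
    home = os.path.expanduser("~")
    if sys.platform.startswith("win"):
        return os.path.join(home, "AppData", "Local", "Google", "Chrome", "User Data")
    if sys.platform == "darwin":
        return os.path.join(home, "Library", "Application Support", "Google", "Chrome")
    return os.path.join(home, ".config", "google-chrome")

print(chrome_user_data_dir())
```

You would then pass it as options.add_argument("--user-data-dir=" + chrome_user_data_dir()).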
