Python Selenium web driver with chrome driver gets detected - python

I assumed that the Chrome browsing session opened by Selenium would behave the same as my local Google Chrome installation. But when I try to search on this website, even if I only open it with Selenium and control the search manually, I get an error message, whereas the search returns results fine when I use regular Chrome with my own profile or in an incognito window.
Whenever I search for this issue, I find results stating that mouse movements or clicking patterns give it away. But that is not the case here, since I controlled the browser manually after opening it. Something in the HTTP request gives it away. Is there any way to overcome that?
The website in question is: https://www.avnet.com/wps/portal/us
The error message when in automated session.

Regarding the website in question, https://www.avnet.com/wps/portal/us, I am not sure about the exact issue you are facing; your code block would have given us more leads about what is going wrong. However, I am able to access the mentioned URL just fine:
Code Block :
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('https://www.avnet.com/wps/portal/us')
print("Page Title is : %s" %driver.title)
Console Output :
Page Title is : Avnet: Quality Electronic Components & Services
Update
I took another look at the issue you are facing. I have read the entire HTML DOM and found no traces of bot detection mechanisms. Had any bot detection mechanism been implemented, the website would not even have allowed you to traverse/scrape the DOM tree to find the Search Box.
Further debugging the issue the following are my observations :
Through your automated script you can proceed as far as sending the search text to the Search Box successfully.
When you search for a valid product manually, the auto-suggestions are displayed in a <span> tag, as in the HTML below, and you can click any of them to browse to the specific product.
Auto Suggestions :
SPAN tag HTML :
<span id="auto-suggest-parts-dspl">
<p class="heading">Recommended Parts</p>
<dl class="suggestion">
<div id="list_1" onmouseover="hoverColor(this)" onmouseout="hoverColorOut(this)" class="">
<div class="autosuggestBox">
AM8TW-4805DZ
<p class="desc1">Aimtec</p>
<p class="desc2">Module DC-DC 2-OUT 5V/-5V 0.8A/-0.8A 8W 9-Pin DIP Tube</p>
</div>
</div>
This <span> is not triggered/generated when the page is driven through WebDriver.
In the absence of the auto-suggestions, forcing a search results in the error page.
Conclusion
The main issue seems to be either with the form-control class or with the function scrollDown(event,this) associated with onkeydown event.
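To confirm this observation from a script, one option is to poll for the suggestion container before pressing Enter. This is only a minimal sketch: the helper name suggestions_rendered and the timeout values are assumptions made for illustration, and the only driver call it relies on is find_elements(by, value), where the string "id" is the value behind Selenium's By.ID.

```python
import time


def suggestions_rendered(driver, container_id="auto-suggest-parts-dspl",
                         timeout=5.0, poll=0.25):
    """Return True if the suggestion container shows up within `timeout` seconds.

    `driver` is any object exposing Selenium's find_elements(by, value) method.
    """
    deadline = time.monotonic() + timeout
    while True:
        # "id" is the locator strategy string behind selenium's By.ID.
        if driver.find_elements("id", container_id):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

If this returns False after typing a valid part number, the script is in the same state described above, and sending Keys.RETURN would be expected to lead to the error page.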

#TooLongForComment
To reproduce this issue
from random import randint
from time import sleep
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument('--disable-infobars')
options.add_argument('--disable-extensions')
options.add_argument('--profile-directory=Default')
options.add_argument('--incognito')
options.add_argument('--disable-plugins-discovery')
options.add_argument('--start-maximized')
browser = webdriver.Chrome('./chromedriver', chrome_options=options)
browser.get('https://www.avnet.com/wps/portal/us')
try:
    search_box_id = 'searchInput'
    myElem = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.ID, search_box_id)))
    elem = browser.find_element_by_id(search_box_id)
    sleep(randint(1, 5))
    s = 'CA51-SM'
    for c in s:  # randomize key pressing
        elem.send_keys(c)
        sleep(randint(1, 3))
    elem.send_keys(Keys.RETURN)
except TimeoutException as e:
    pass
finally:
    browser.close()
I've used hexedit to edit chromedriver key from $cdc_ to fff..
Investigate how it's done by reading every JavaScript block, look at this answer for detection example
Try to add extension to modify headers and mask under Googlebot by changing user-agent & referrer options.add_extension('/path-to-modify-header-extension')

Related

Selenium: Is there a way to force click an element when another element obscures it?

I am currently running in headless mode (uploaded to CircleCI Docker) and the site I am working on has an Accept Cookies banner. Apparently, in headless mode, it obscures the Log In button I need to click to proceed with my test. Is there a way to disable this check?
We have .click({force: true}) in Cypress. Do we have a counterpart in Selenium?
Website is https://mailtrap.io/
Backstory:
Before going to Mailtrap, I came from another origin. I tried to use Options() for Firefox before visiting Mailtrap, but it doesn't work. If I set it at the start of the run, it messes up the other site as well.
Here is the code:
options = Options()
options.set_preference("network.cookie.cookieBehavior", 2)
context.driver.get("https://mailtrap.io/signin")
loginButton = context.mailTrapPage.getLoginButton()
context.driver.execute_script("arguments[0].click();", loginButton)
Here is the error I'm getting:
selenium.common.exceptions.ElementClickInterceptedException: Message: Element <a class="login_next_button button button--medium button--block" href="#"> is not clickable at point (388,614) because another element <p class="cookies-banner__text"> obscures it
Stacktrace:
RemoteError@chrome://remote/content/shared/RemoteError.jsm:12:1
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:192:5
ElementClickInterceptedError@chrome://remote/content/shared/webdriver/Errors.jsm:291:5
webdriverClickElement@chrome://remote/content/marionette/interaction.js:166:11
interaction.clickElement@chrome://remote/content/marionette/interaction.js:125:11
clickElement@chrome://remote/content/marionette/actors/MarionetteCommandsChild.jsm:204:29
receiveMessage@chrome://remote/content/marionette/actors/MarionetteCommandsChild.jsm:92:31
Yes, you can click any existing element, including one covered by another element, with a JavaScript click as follows:
element = driver.find_element(By.XPATH, element_xpath_locator)
driver.execute_script("arguments[0].click();", element)
However, remember that Selenium is designed to imitate actions performed by a human user via the GUI, and a human user cannot perform such an action. As a user you would have to scroll the page, open a dropdown, etc. to make the element accessible and only then click it. If you just need to perform an action without caring whether a human user could do it, you can use JavaScript clicks and other methods outside the actions Selenium covers.
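One way to keep the human-like click as the default and use the JavaScript click only when needed is a small fallback helper. This is a sketch rather than an established Selenium feature: the name click_or_js_click is made up here, and the ImportError fallback exists only so the snippet can be read and tested without Selenium installed.

```python
try:
    from selenium.common.exceptions import ElementClickInterceptedException
except ImportError:  # stand-in so the sketch runs without Selenium installed
    class ElementClickInterceptedException(Exception):
        pass


def click_or_js_click(driver, element):
    """Try a normal click; fall back to a JavaScript click if it is intercepted.

    Returns "native" or "js" so the caller can log which path was taken.
    """
    try:
        element.click()
        return "native"
    except ElementClickInterceptedException:
        # Same script as in the answer above: it skips the overlay check.
        driver.execute_script("arguments[0].click();", element)
        return "js"
```

Logging the returned value makes it visible when a test only passed because the overlay check was bypassed.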
UPD
Your specific problem is caused by selecting the wrong element.
The following code works for me:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(service=webdriver_service, options=options)
url = 'https://mailtrap.io/signin'
driver.get(url)
wait = WebDriverWait(driver, 20)
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#signinTopBlock a[href*='signin']"))).click()

How to click on "Accept Cookie" button without name & ID in Python Selenium

I need to accept the cookies but the button has no name or ID. This is related to this issue but no answer as of now.
I tried the following, based on other options I found online but in vain:
driver.find_element_by_xpath('/html/body/div/button[1]').click()
driver.find_element_by_xpath("//a[. = 'Accept all cookies']").click()
driver.find_element_by_class_name('button.accept-settings-button')
driver.find_element_by_name('Accept all cookies').click()
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
button = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, '//a[contains(., "Accept all cookies")]')))
button.click()
driver.find_element_by_css_selector("button.accept-settings-button").click()
driver.find_element_by_css_selector("body > div > button.accept-settings-button").click()
driver.switch_to_frame(driver.find_element_by_css_selector('main'))
Here is the website: https://careers.portmandentalcare.com/Search.aspx (you need to wait for the cookies to appear) and the js element:
<div class="main"><h1>Cookie preferences</h1><h2>Our use of cookies</h2><p>We use necessary cookies to make our site work. We also like to set optional analytics cookies to help us improve it. You can switch these off if you wish. Using this tool will set a cookie on your device to remember your preferences.</p><div>For more detailed information about the cookies we use, see our <a class="policy-link" href="#">Cookies policy</a>.</div><button type="button" class="accept-settings-button" aria-describedby="necessary-note">Accept all cookies</button><div class="note" id="necessary-note">By accepting recommended settings you are agreeing to the use of both our necessary and analytical cookies</div><h2>Necessary (always active)</h2><p>Necessary cookies enable core functionality such as security, network management, and accessibility. You may disable these by changing your browser settings, but this may affect how the website functions.</p><h2 id="analytics">Analytics</h2><p id="analytics-note">We use analytical cookies to help us to improve our website by collecting and reporting information on how you use it. For more information on how these cookies work please see our Cookies policy. The cookies collect information in an anonymous form.</p><div><div class="toggle-button-wrapper"><b>Off</b> <button type="button" class="toggle-analytics-button" role="switch" aria-labelledby="analytics" aria-describedby="analytics-note" aria-checked="false"></button> <b>On</b></div></div><button type="button" class="save-settings-button">Save my settings</button></div>
Thanks for your help.
P.S. using latest chromedriver with these settings
chrome_options = webdriver.ChromeOptions()
# chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-logging')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument("--remote-debugging-port=9222") # to avoid DevToolsActivePort file doesn't exist
if os_system == 'linux':
    driver = webdriver.Chrome(executable_path='/usr/bin/chromedriver', options=chrome_options)
elif os_system == 'mac':
    driver = webdriver.Chrome(executable_path='/Users/user1/chromedriver', options=chrome_options)
<iframe src="https://careers.portmandentalcare.com/ico-compliance/index.html?v=637472026522948225" class="ico-compliance-plugin" title="Cookie preferences" tabindex="0"></iframe>
Since the element is in an iframe wait till the iframe is available and switch to it and then click the element.
wait = WebDriverWait(driver, 30)
driver.get("https://careers.portmandentalcare.com/Search.aspx")
wait.until(EC.frame_to_be_available_and_switch_to_it((By.CLASS_NAME, "ico-compliance-plugin")))
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button.accept-settings-button"))).click()
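After the click, the driver is still focused on the iframe, so later lookups on the main page would fail until it switches back. A small context manager can make that pairing explicit. This is a sketch: inside_frame is a hypothetical helper name, while switch_to.frame() and switch_to.default_content() are the standard WebDriver calls.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame):
    """Switch into `frame`, and always switch back to the top-level document."""
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        # Runs even if the body raises, so the driver is never left in the iframe.
        driver.switch_to.default_content()
```

It would be used as "with inside_frame(driver, iframe_element): ..." around the cookie-button click, with iframe_element being the located ico-compliance-plugin iframe.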
You can use any attribute to find an element.
For the element above, use the class of the "Accept all cookies" button.
The CSS selector is:
button.accept-settings-button
The XPath is:
//button[@class="accept-settings-button"]
So:
driver.find_element_by_xpath('//button[@class="accept-settings-button"]').click()
driver.find_element_by_css_selector('button.accept-settings-button').click()
driver.find_element_by_css_selector('button[class="accept-settings-button"]').click()
You can use any of the three methods.
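An attribute-based XPath like the one above can be sanity-checked without launching a browser, using the limited XPath subset in Python's xml.etree.ElementTree. The snippet below is a sketch on a trimmed, well-formed excerpt of the dialog markup from the question. Real pages are often not valid XML, so this only checks the shape of the expression, and it does not address the iframe, which still has to be switched into first.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed excerpt of the cookie dialog markup quoted in the question.
SNIPPET = """
<div class="main">
  <button type="button" class="accept-settings-button">Accept all cookies</button>
  <button type="button" class="save-settings-button">Save my settings</button>
</div>
"""

root = ET.fromstring(SNIPPET)

# ElementTree supports the [@attribute='value'] predicate used in the XPath.
matches = root.findall(".//button[@class='accept-settings-button']")
labels = [button.text for button in matches]
print(labels)  # ['Accept all cookies']
```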

Ebay website hangs after login button clicked - Selenium Python

I have written the following code to login to a website. So far it simply gets the webpage, accepts cookies, but when I try to login by clicking the login button, the page hangs and the login page never loads.
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException, ElementNotInteractableException
# Accept consent cookies
def accept_cookies(browser):
    try:
        browser.find_element_by_xpath('//*[@id="gdpr-banner-accept"]').click()
    except NoSuchElementException:
        print('Cookies already accepted')
# Webpage parameters
base_site = "https://www.ebay-kleinanzeigen.de/"
# Setup remote control browser
fireFoxOptions = webdriver.FirefoxOptions()
#fireFoxOptions.add_argument("--headless")
browser = webdriver.Firefox(executable_path = '/home/Webdriver/bin/geckodriver',firefox_options=fireFoxOptions)
browser.get(base_site)
accept_cookies(browser)
# Click login pop-up
browser.find_elements_by_xpath("//*[contains(text(), 'Einloggen')]")[1].click()
Note: There are two login buttons (one popup & one in the page), I've tried both with the same result.
I have done similar with other websites, no problem. So am curious as to why it doesn't work here.
Any thoughts on why this might be? Or how to get around this?
I modified your code a bit adding a couple of optional arguments and on execution I got the following result:
Code Block:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver.get("https://www.ebay-kleinanzeigen.de/")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@id='gdpr-banner-accept']"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[contains(text(), 'Einloggen')]"))).click()
Observation: My observation was similar to yours: the page hangs and the login page never loads.
Deep Dive
While inspecting the DOM tree of the webpage you will find that some of the <script> and <link> tags refer to JavaScript files containing the keyword dist. As an example:
<script type="text/javascript" async="" src="/static/js/lib/node_modules/@ebayk/prebid/dist/prebid.10o55zon5xxyi.js"></script>
window.BelenConf.prebidFileSrc = '/static/js/lib/node_modules/@ebayk/prebid/dist/prebid.10o55zon5xxyi.js';
This is a clear indication that the website is protected by Bot Management service provider Distil Networks and the navigation by ChromeDriver gets detected and subsequently blocked.
Distil
As per the article There Really Is Something About Distil.it...:
Distil protects sites against automatic content scraping bots by observing site behavior and identifying patterns peculiar to scrapers. When Distil identifies a malicious bot on one site, it creates a blacklisted behavioral profile that is deployed to all its customers. Something like a bot firewall, Distil detects patterns and reacts.
Further,
"One pattern with Selenium was automating the theft of Web content", Distil CEO Rami Essaid said in an interview last week. "Even though they can create new bots, we figured out a way to identify Selenium the a tool they're using, so we're blocking Selenium no matter how many times they iterate on that bot. We're doing that now with Python and a lot of different technologies. Once we see a pattern emerge from one type of bot, then we work to reverse engineer the technology they use and identify it as malicious".
Reference
You can find a couple of detailed discussion in:
Unable to use Selenium to automate Chase site login
Webpage Is Detecting Selenium Webdriver with Chromedriver as a bot
Is there a version of selenium webdriver that is not detectable
from selenium import webdriver
from selenium_stealth import stealth
import time
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# options.add_argument("--headless")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r"C:\Users\DIPRAJ\Programming\adclick_bot\chromedriver.exe")
stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )
url = "https://bot.sannysoft.com/"
driver.get(url)
time.sleep(5)
driver.quit()

Dynamic dropdown doesn't populate with auto suggestions on https://www.nseindia.com/ when values are passed using Selenium and Python

driver = webdriver.Chrome('C:/Workspace/Development/chromedriver.exe')
driver.get('https://www.nseindia.com/companies-listing/corporate-filings-actions')
inputbox = driver.find_element_by_xpath('/html/body/div[7]/div[1]/div/section/div/div/div/div/div/div[1]/div[1]/div[1]/div/span/input[2]')
inputbox.send_keys("Reliance")
I'm trying to scrape the table from this website that appears after you key in the company name in the text field above it. The attached code block works well with similar drop-downs on a normal Google search and the Wolfram website, but when I run my script on the required website (it essentially just inputs the required text in the text field), the dropdown shows 'No Records Found', whereas it works well when done manually.
I executed your test adding a few tweaks and ran the test as follows:
Code Block:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://www.nseindia.com/companies-listing/corporate-filings-actions')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@id='Corporate_Actions_equity']//input[@placeholder='Company Name or Symbol']"))).send_keys("Reliance")
Observation: Similar to your observation, I hit the same roadblock with no results.
Deep Dive
It seems the click() on the element with the text Get Data does happen. But while inspecting the DOM tree of the webpage you will find that some of the <script> tags refer to JavaScript files containing the keyword akam. As an example:
<script type="text/javascript" src="https://www.nseindia.com/akam/11/3b383b75" defer=""></script>
<noscript><img src="https://www.nseindia.com/akam/11/pixel_3b383b75?a=dD02ZDMxODU2ODk2YTYwODA4M2JlOTlmOGNkZTY3Njg4ZWRmZjE4YmMwJmpzPW9mZg==" style="visibility: hidden; position: absolute; left: -999px; top: -999px;" /></noscript>
This is a clear indication that the website is protected by Bot Manager, an advanced bot detection service provided by Akamai, and that the response gets blocked.
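The manual inspection described above can be approximated in code by collecting every <script> src from the page source and filtering on a path marker. This is a rough sketch using only the standard library. The names ScriptSrcCollector and scripts_matching are made up for illustration, and a matching path is only a heuristic hint, not proof that a particular bot-management product is active.

```python
from html.parser import HTMLParser


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def scripts_matching(html, marker):
    """Return script URLs that contain `marker` (for example "/akam/")."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    return [src for src in collector.sources if marker in src]
```

With a live session this could be called as scripts_matching(driver.page_source, "/akam/").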
Bot Manager
As per the article Bot Manager - Foundations:
Conclusion
So it can be concluded that the request for the data is detected as being performed by Selenium driven WebDriver instance and the response is blocked.
References
A couple of documentations:
Bot Manager
Bot Manager : Foundations
tl; dr
A couple of relevant discussions:
Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection
Unable to use Selenium to automate Chase site login

How to login within Wells Fargo account using Selenium and Python

I'm trying to write scripts to download relevant data from my Wells Fargo account, and using Selenium within a python script seems like a good potential solution. However, when I try to log in, I get redirected to a different login screen, which keeps redirecting to itself each time I try to log in again. How do I get past this infinite redirect?
The simplest way to reproduce this is to run this script to open a selenium Chrome browser:
from selenium import webdriver
browser = webdriver.Chrome()
browser.get('https://www.wellsfargo.com/')
Then manually log in. On a normal browser this works fine. In the selenium browser, I get the redirect page.
Someone else has the same problem. If there's a way to do this with Selenium I would be thrilled, but would also be happy with other methods to automate data downloads from Wells Fargo.
Try this code:
from selenium.webdriver import Chrome
browser = Chrome()
browser.get('https://wellsfargo.com')
# Locate the username field and type the username
user = browser.find_element_by_id('userid')
user.send_keys('USERNAME')
# Locate the password field and type the password
passwd = browser.find_element_by_id('password')
passwd.send_keys('PASSWORD')
passwd.submit()
You should be logged in and redirected to the account page.
Please tell me how it works out for you.
Since I'm not a client of the bank, I cannot go beyond this point. :)
More information about which action causes the redirect to a different login screen would have helped us debug your issue. However, to log in to your Wells Fargo account you need to induce WebDriverWait for the desired element_to_be_clickable(), and you can use the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("--disable-extensions")
# options.add_argument('disable-infobars')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('https://www.wellsfargo.com/')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.formElement.formElementText#userid"))).send_keys("BenLindsay")
driver.find_element_by_css_selector("input.formElement.formElementPassword#password").send_keys("BenLindsay")
driver.find_element_by_css_selector("input#btnSignon[name='btnSignon']").click()
