Send keys to window DOM with Selenium in Python to bypass captcha

I am trying to get data from a webpage (https://www.educacion.gob.es/teseo/irGestionarConsulta.do). It has a captcha on the entry page, which I solve manually before moving to the results page.
Going back from the results page to the entry page does not change the captcha if I use the browser's own "go back" button; but if I use Selenium WebDriver's driver.back() method, the captcha sometimes changes, which I would rather avoid.
In short: I want Selenium to act on the browser window itself, rather than on the document (or any element within the HTML), and send the ALT+ARROW_LEFT keys to that window.
This, apparently, cannot be done with:
from selenium.webdriver import Firefox
from selenium.webdriver.common.keys import Keys
driver = Firefox()
driver.get(url)
xpath = ???
driver.find_element_by_xpath(xpath).send_keys(Keys.ALT, Keys.ARROW_LEFT)
because send_keys delivers the keys to a specific element, and my target is the window itself, not any particular element of the document.
I have also tried with ActionChains:
from selenium.webdriver.common.action_chains import ActionChains
action = ActionChains(driver)
action.key_down(Keys.LEFT_ALT).key_down(Keys.ARROW_LEFT).send_keys().key_up(Keys.LEFT_ALT).key_up(Keys.ARROW_LEFT)
action.perform()
This also does not work (I have tried several combinations). The documentation states that key_down/key_up take an element to send the keys to; if it is None (the default), the currently focused element is used. So the same issue arises here: how to put the focus on the window (the browser) itself.
I have thought about using the mouse controls, but I assume it will be the same: I know how to make the mouse reach any element in the document, but I want to reach the window/browser.
I have thought of identifying the target through driver.current_window_handle, but that was also fruitless.
Can this be done from Selenium? If so, how? Can other libraries do it, perhaps pyppeteer or playwright?

Try executing JavaScript instead: driver.execute_script("window.history.go(-1)")
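A minimal sketch of that approach, assuming the captcha is solved by hand while the script pauses (the input() pause is just one convenient way to hand control back to the script):
from selenium.webdriver import Firefox

driver = Firefox()
driver.get("https://www.educacion.gob.es/teseo/irGestionarConsulta.do")
# solve the captcha manually in the opened browser and run the query,
# then press Enter in the terminal to let the script continue
input("Press Enter once the results page has loaded... ")
# scrape the results page here, then go back through the browser's own
# history via JavaScript instead of driver.back()
driver.execute_script("window.history.go(-1)")
As for the libraries mentioned at the end of the question: Playwright's Python API exposes page.go_back() for the same history navigation, though whether it leaves this particular captcha untouched would need testing.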

Related

Selenium Webdriver cannot find element which loads later than DOM

I've been trying to scrape this page using Selenium, and as you may notice, on the first load a pop-up appears asking you to agree to cookies. This pop-up appears around a second later than the rest of the DOM.
The problem is that even after adding a sleep or a WebDriverWait, the driver still cannot click the Accept button in the pop-up.
Inspecting the page shows that the XPath of the element is /html/body/div/div[2]/div[4]/button[1], so I tried:
driver.get("https://www.planespotters.net/airlines")
time.sleep(5)
driver.find_element(By.XPATH, '/html/body/div/div[2]/div[4]/button[1]').click()
with no effect: I get an error saying the element does not exist.
I tried it even with:
driver.execute_script('document.querySelector("#notice > div.message-component.message-row.button-container > button.message-component.message-button.no-children.focusable.primary.sp_choice_type_11").click()')
and
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '/html/body/div/div[2]/div[4]/button[1]'))).click()
Any suggestions on how to bypass this pop-up?
If you scroll up in devtools you'll notice the modal window is inside an iframe. I queried the 'planespotters' header image (because it's the top item) and you can see the iframe quite clearly in the element tree.
For Selenium, switch to the iframe first; this finds the iframe whose src contains 'privacy' and switches to it:
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[contains(@src,'privacy')]")))
Then you can complete your action. Be sure to keep that wait in there too.
Afterwards you can switch back to your main window and carry on with your test:
driver.switch_to.default_content()
You can also try introducing an implicit wait for your driver. This is set once per driver session and affects all element lookups: the driver will wait up to 10 seconds for an element to appear before throwing an error, and continues as soon as the element is found:
driver.implicitly_wait(10)
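Putting the pieces together, a minimal sketch of the whole flow (the Accept-button XPath is the one from the question; absolute XPaths like this are site-specific and fragile, and the 10-second timeouts are illustrative):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.implicitly_wait(10)
driver.get("https://www.planespotters.net/airlines")
# wait for the consent iframe and switch into it
WebDriverWait(driver, 10).until(
    EC.frame_to_be_available_and_switch_to_it(
        (By.XPATH, "//iframe[contains(@src,'privacy')]")))
# wait for the Accept button inside the iframe and click it
WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable(
        (By.XPATH, "/html/body/div/div[2]/div[4]/button[1]"))).click()
# return to the main document before continuing the test
driver.switch_to.default_content()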

click link automatically and scraping

I am trying to extract all the product data from this page:
https://www.shufersal.co.il/online/he/קטגוריות/סופרמרקט/חטיפים%2C-מתוקים-ודגני-בוקר/c/A25
I tried:
shufersal = "https://www.shufersal.co.il/online/he/%D7%A7%D7%98%D7%92%D7%95%D7%A8%D7%99%D7%95%D7%AA/%D7%A1%D7%95%D7%A4%D7%A8%D7%9E%D7%A8%D7%A7%D7%98/%D7%97%D7%98%D7%99%D7%A4%D7%99%D7%9D%2C-%D7%9E%D7%AA%D7%95%D7%A7%D7%99%D7%9D-%D7%95%D7%93%D7%92%D7%A0%D7%99-%D7%91%D7%95%D7%A7%D7%A8/c/A25"
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(ChromeDriverManager().install())
import time
driver.get(shufersal)
products = driver.find_elements_by_css_selector("li.miglog-prod.miglog-sellingmethod-by_unit")
The problem is that the product details are shown only when I click the product.
Is there any option to click all the links automatically and scrape the pop-ups that open?
What you want can be achieved, but it is a significantly more time-consuming process.
You'll have to:
Identify the elements that you need to click on the page (they'll probably share the same class) and then select them all:
buttons_to_click = driver.find_elements_by_css_selector({RELEVANT SELECTOR HERE})
Then loop through all the clickable elements: click each one, wait for the pop-up to load, scrape the data, and close the pop-up (a fuller sketch follows below):
scraped_list = []
for button_instance in buttons_to_click:
    button_instance.click()
    # scrape the information you need here and append it to scraped_list
    # then find and click the button that closes the pop-up
    driver.find_element_by_xpath({XPATH TO ELEMENT}).click()
For this to work properly it is important to set Selenium's implicit wait parameter: if the driver does not find a required element, it waits up to X seconds for it to load, and only throws an error once the X seconds have passed (you can handle that error in your code if it is expected).
In your case you need the wait because, after you click a product to display the pop-up, the information may take a few seconds to load; without an implicit wait, your script would exit with an element-not-found error. More information on Selenium's waits can be found here: https://selenium-python.readthedocs.io/waits.html
# put this line immediately after creating the driver object
driver.implicitly_wait(10) # seconds
Suggestion:
I suggest you always use XPath when looking up elements: its syntax can emulate all of Selenium's other selectors, it is fast, and it will make for an easier transition to C-compiled HTML parsers, which you'll need if you scale your scraper (I recommend lxml, a Python package that uses a compiled parser).
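A runnable sketch of the loop described above, reusing the shufersal URL variable and the product selector from the question; the pop-up and close-button selectors here are hypothetical placeholders that you would need to replace with the real ones from the page:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(ChromeDriverManager().install())
driver.implicitly_wait(10)  # wait up to 10 s for elements to appear
driver.get(shufersal)

scraped_list = []
buttons_to_click = driver.find_elements_by_css_selector(
    "li.miglog-prod.miglog-sellingmethod-by_unit")
for button_instance in buttons_to_click:
    button_instance.click()
    # hypothetical selector for the pop-up body; scrape whatever you need
    popup = driver.find_element_by_css_selector("div.modal-content")
    scraped_list.append(popup.text)
    # hypothetical selector for the pop-up's close button
    driver.find_element_by_css_selector("button.close").click()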

Why won't Selenium in Python click the pop-up privacy button?

I am having some issues with Selenium not clicking the pop-up privacy button on https://www.transfermarkt.com/
Here is my code so far:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.transfermarkt.com/')
accept_button = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[2]/button')
accept_button.click()
It comes up saying:
NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/div/div[2]/div[3]/div[2]/button"}
Anyone have any thoughts?
I believe the issue is that the element you are trying to click is inside an iframe. In order to click that element you'll have to first switch to that frame. I noticed the iframe has title="SP Consent Message" so I'll use a CSS Selector to identify it based on that. Your code with the added line:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.transfermarkt.com/')
driver.switch_to.frame(driver.find_element_by_css_selector('iframe[title="SP Consent Message"]'))
accept_button = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[2]/button')
accept_button.click()
Note that you may have to switch back to the default frame to continue your test; I'm not 100% sure, as that iframe goes away after the click.
Also, it seems some folks don't get that popup when they hit the website; I'm not sure why, but it may be something you want to look into when deciding how to test this.
Like @Prophet says, you should improve the XPath for the button (it too seems to have a unique title, so I would use the CSS selector 'button[title="ACCEPT ALL"]'), but this works for me to click it.
There are 2 issues here:
You need to add a wait / delay before accept_button = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[2]/button') to let the page load.
It's very, very bad practice to use absolute XPaths. You should define a unique, short and clear relative XPath expression.
Additionally, I see no pop-up privacy button when I open that page, so maybe that element is indeed not there.
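Combining both answers, a sketch that uses an explicit wait for the consent iframe, switches into it, and clicks the button via its unique title (the timeouts are illustrative):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get('https://www.transfermarkt.com/')
# wait for the consent iframe to be available and switch into it
WebDriverWait(driver, 10).until(
    EC.frame_to_be_available_and_switch_to_it(
        (By.CSS_SELECTOR, 'iframe[title="SP Consent Message"]')))
# wait for the accept button and click it via its unique title
WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable(
        (By.CSS_SELECTOR, 'button[title="ACCEPT ALL"]'))).click()
# switch back in case the rest of the test needs the main document
driver.switch_to.default_content()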

Scroll down when the Scrollbar has no tag with Selenium for python

I want to scroll down inside an element where the scrollbar doesn't have its own tag/element. I did quite a bit of research on this but couldn't find anything for this specific case. I'm using Selenium for Python.
We are talking about the site www.swap.gg, a site for CS:GO skins; in order to load every item from the bot, you need to scroll down inside the "Bot inventory" element (visible when inspecting the page with F12).
There is no key you can use to scroll down there, so the .send_keys() command doesn't work here. I also tried touchactions.scroll_from_element(element, xoffset, yoffset) but didn't have any luck either. I'm using Firefox as the webdriver, in case that matters.
The XPath of the element is "/html/body/div/div[1]/section[1]/div[2]/div[3]/div[2]/div[2]"
The CSS selector is "div.is-paddingless:nth-child(3)"
Any ideas?
Find the parent div:
element = driver.find_element_by_xpath('/html/body/div/div[1]/section[1]/div[2]/div[3]/div[2]/div[2]')
Scroll down to trigger the reload:
driver.execute_script("arguments[0].scrollBy(0, 500)", element)
import time
import selenium.webdriver

driver = selenium.webdriver.Chrome()
driver.get('https://swap.gg/')
element = driver.find_element_by_xpath('/html/body/div/div[1]/section[1]/div[2]/div[3]/div[2]/div[2]')
# scroll the inventory container in steps, pausing so new items can load
for i in range(20):
    driver.execute_script("arguments[0].scrollBy(0, 500)", element)
    time.sleep(2)
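The fixed range(20) is arbitrary; as a variant, you could keep scrolling until the container's scrollHeight stops growing, which suggests no more items are being loaded (a sketch, reusing the same driver and element as above):
# scroll until the container's height stops changing
last_height = driver.execute_script("return arguments[0].scrollHeight", element)
while True:
    driver.execute_script("arguments[0].scrollBy(0, 500)", element)
    time.sleep(2)  # give the site time to load more items
    new_height = driver.execute_script("return arguments[0].scrollHeight", element)
    if new_height == last_height:
        break  # height unchanged: assume everything has loaded
    last_height = new_height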

Python Scraping - Selenium - Element is not clickable at point

I'm trying to click the next button on a webpage multiple times, and I need to scrape the page after each click. The following code is shortened but illustrates my situation: I am able to scrape the required tag after the first click, but the second click fails. Here is the code:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
browser = webdriver.Firefox()
browser.get('http://www.agoda.com/the-coast-resort-koh-phangan/hotel/koh-phangan-th.html')
browser.find_element_by_xpath("//a[@id='next-page']").click()
browser.find_element_by_xpath("//a[@id='next-page']").click()
browser.quit()
I get this error:
WebDriverException: Message: Element is not clickable at point
Although when I use BeautifulSoup to look at the page source after the first click, the element I am trying to click again is clearly there.
I've read in other answers that I might need to wait some time, but I'm not sure how to implement that.
The first time you click, the next page starts to load. You can't click a link on a page that is still loading, or on a page that is about to be replaced by another one, so the message is quite correct. You simply have to wait for the next page to appear before you can click the link again. You could sleep for N seconds or, much better, wait for the element to appear before clicking it again.
Have a look at: Expected Conditions
There you would probably want to use element_to_be_clickable on each attempt to click: it waits up to N seconds for the element to become clickable and throws an exception if the element does not reach that state in time.
Here is a good blog post on all this: How to get Selenium to wait for page load after a click
Here is a simple example of how you can use that. Right after the first click on the link, you wait for the next page's link to become clickable before clicking it:
xpath = "//a[#id='next-page']"
WebDriverWait(browser.webdriver, 10).until(
expected_conditions.element_to_be_clickable((By.XPATH, xpath)))
browser.find_element_by_xpath(xpath).click()
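A sketch of the complete loop from the question, waiting before each click; it assumes the next-page link disappears (or never becomes clickable) on the last page, which ends the loop:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException, TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions

browser = webdriver.Firefox()
browser.get('http://www.agoda.com/the-coast-resort-koh-phangan/hotel/koh-phangan-th.html')
xpath = "//a[@id='next-page']"
while True:
    # ... scrape the current page here ...
    try:
        WebDriverWait(browser, 10).until(
            expected_conditions.element_to_be_clickable((By.XPATH, xpath)))
        browser.find_element_by_xpath(xpath).click()
    except (NoSuchElementException, TimeoutException):
        break  # no clickable next-page link left: we are done
browser.quit()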
