Automating a site using seleniumrequests - Python

I am trying to automate some processes on a site. At first I tried to use the requests library, but a captcha came back in response. Now I'm using seleniumrequests, and here's the problem: when I log in using Selenium tools only, everything works fine, but I can't add coupons on the site and confirm them.
from seleniumrequests import Firefox

driver = Firefox()
user = '000000'
password = '000000'

# open the site and log in through the regular form
driver.get("https://1xstavka.ru/")
driver.find_element_by_id('curLoginForm').click()
driver.find_element_by_id('auth_id_email').send_keys(user)
driver.find_element_by_id('auth-form-password').send_keys(password)
driver.find_element_by_class_name('auth-button__text').click()
But if you use:
from seleniumrequests import Firefox
driver = Firefox()
driver.request('GET', 'https://1xstavka.ru')
The window opens for a second and immediately closes; a 200 response is received, but there are no cookies. It's the same with the POST requests I'm trying to automate the process with: after the POST request the response is 200, but nothing happens on the site.
driver.request('POST', 'https://1xstavka.ru/user/auth', json=json)
Please tell me what is wrong or how this problem can be solved.

I was unable to access the URL specified in the question, but for the captcha/coupon steps I created a loop with an intermediate pause. This gives me the chance to enter the captcha manually and then let the loop continue.
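A minimal sketch of that pause-and-resume pattern, assuming a hypothetical list of coupon codes; input() simply blocks the script until the captcha has been solved by hand in the open browser:

from seleniumrequests import Firefox

driver = Firefox()
driver.get("https://1xstavka.ru/")
# ... log in with the Selenium steps from the question ...

coupons = ['coupon1', 'coupon2']  # hypothetical coupon codes
for coupon in coupons:
    # ... fill in the coupon form with Selenium here ...
    # pause until the captcha has been solved manually in the browser
    input('Solve the captcha for %s in the browser, then press Enter...' % coupon)
    # ... click the confirm button and continue with the next coupon ...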

Related

How can I return to the window of the driver that is currently open after my script is executed?

I am using Selenium WebDriver with ChromeDriver. In script1 I open a URL with driver.get("...") and do some web scraping (for example, clicking some buttons, getting some information, and logging in to the site).
When script1 runs and finishes, I want to run another script (script2) that continues in the last opened window (so that I don't have to spend a lot of time logging in to the site and clicking buttons until I reach where I want to be).
For example, imagine you log in to your Gmail account, click some buttons to reach your mailbox, and your script finishes right there; then you want to run another script to open your emails one by one.
# script1
driver = webdriver.Chrome()
driver.get("https://gmail.google.com/inbox/")
inbox_button = driver.find_element_by_xpath('//*[@id=":5a"]/div/div[2]/span/a')
inbox_button.click()
# the code finishes successfully right here

# script2
from script1 import driver

# placeholder XPath; grab the email rows, then click the button inside each
emails = driver.find_elements_by_xpath("path to emails")
print('email buttons:', emails)
for email in emails:
    email.find_element_by_tag_name("button").click()
I do not want to open a new chrome driver and run my code line by line again. I expect something that refers to the current chrome driver.
You need to save your session cookies to be able to return to the previous state.
You can do this manually:
import time

# Manually log in to the website, then print the cookies
time.sleep(60)
print(driver.get_cookies())

# Later, use add_cookie() to restore each cookie (fill in the real values)
driver.add_cookie({'domain': ''})
But this solution is not very elegant. You can instead use pickle to store and load the cookies
1. pickle is part of the Python standard library, so nothing extra needs to be installed (the pickle5 backport at https://pypi.org/project/pickle5/ is only needed on older Python versions)
2. After logging in to Gmail, save the cookies
import pickle
pickle.dump(driver.get_cookies(), open("cookies.pkl", "wb"))
3. Finally, when opening the browser for the second script, load the cookies and add them to the driver again
import pickle
cookies = pickle.load(open("cookies.pkl", "rb"))
for cookie in cookies:
    driver.add_cookie(cookie)
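Putting the steps together, a minimal end-to-end sketch. One caveat that is easy to miss: Selenium only lets you add a cookie for the domain the browser is currently on, so you must driver.get() the site before restoring the cookies.

import pickle
from selenium import webdriver

# --- first run: log in, then save the session cookies ---
driver = webdriver.Chrome()
driver.get("https://gmail.google.com/")
# ... perform the login steps here ...
with open("cookies.pkl", "wb") as f:
    pickle.dump(driver.get_cookies(), f)

# --- second run: restore the cookies instead of logging in again ---
driver = webdriver.Chrome()
driver.get("https://gmail.google.com/")  # must be on the domain first
with open("cookies.pkl", "rb") as f:
    for cookie in pickle.load(f):
        driver.add_cookie(cookie)
driver.refresh()  # reload so the restored session takes effect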

Unable to programmatically log in to a website

So I am trying to log in programmatically (Python) to https://www.datacamp.com/users/sign_in using my email & password.
I have tried two methods of logging in: one using the requests library and another using Selenium (code below). Both times I hit a [403] issue.
Could someone please help me log in to it programmatically?
Thank you!
Using the requests library:
import requests
r = requests.get("https://www.datacamp.com/users/sign_in")
r  # gives <Response [403]>
Using Selenium WebDriver:
driver = webdriver.Chrome(executable_path=driver_path, options=option)
driver.get("https://www.datacamp.com/users/sign_in")
driver.find_element_by_id("user_email") # there is supposed to be form element with id=user_email for inputting email
An implicit wait at least should have worked, like this:
from selenium import webdriver
driver = webdriver.Chrome(executable_path='/snap/bin/chromium.chromedriver')
driver.implicitly_wait(10)
url = "https://www.datacamp.com/users/sign_in"
driver.get(url)
driver.find_element_by_id("user_email").send_keys("test#dsfdfs.com")
driver.find_element_by_css_selector("#new_user>button[type=button]").click()
BUT
The real issue is that the site uses anti-scraping software.
If you open the browser Console and inspect the request itself, you'll see that it is blocked.
That means the site blocks your connection even before you try to log in.
Here is a similar question with different solutions: Can a website detect when you are using Selenium with chromedriver?
Not all of its answers will work for you, so try the different approaches suggested.
With Firefox you'll have the same issue (I've already checked).
You have to add a wait after driver.get("https://www.datacamp.com/users/sign_in") and before driver.find_element_by_id("user_email") to let the page load.
Try something like WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'user_email')))
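A self-contained version of that explicit wait, with the imports that are easy to miss:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.datacamp.com/users/sign_in")

# wait up to 10 seconds for the email field to appear before using it
email_field = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "user_email"))
)
email_field.send_keys("test@example.com")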

Failed to use Selenium to automatically click a link on a website

I want to use Selenium to automatically log in to a website (https://www.cypress.com/) and download some materials.
I successfully open the website using Selenium, but when I use Selenium to click the "Log in" button, it shows this:
Access Denied
Here is my code:
from time import sleep
from selenium import webdriver

class Cypress():
    def extractData(self):
        browser = webdriver.Chrome(executable_path=r"C:\chromedriver.exe")
        browser.get("https://www.cypress.com/")
        sleep(5)
        element = browser.find_element_by_link_text("Log in")
        sleep(1)
        element.click()

if __name__ == "__main__":
    a = Cypress()
    a.extractData()
Can anyone give me an idea?
The website is protected by Akamai (CDN, bot-management services, or whatever else is loaded there).
I took a quick glance and it seems the Akamai service worker is up. I don't see any sensor-data protection, but Selenium is simply detected as a webdriver (among plenty of other things) and flagged. Try to log in using requests, or ask the website owner to give you API access for your project.
Akamai cookies are present, so the protection surely is too; the 301 you got is the bot protection stopping you from automating a protected endpoint.
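A minimal sketch of the requests-based login suggested above; the endpoint URL and form field names here are assumptions, so read the real values from the login request in your browser's network tab:

import requests

session = requests.Session()
# a realistic User-Agent helps avoid the most basic bot filtering
session.headers.update({"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"})

# hypothetical endpoint and field names; inspect the actual login
# request in the browser's network tab for the correct ones
resp = session.post(
    "https://www.cypress.com/user/login",
    data={"name": "my_user", "pass": "my_password"},
)
print(resp.status_code)
# the session now carries any cookies the server set, for later downloads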

Selenium Firefox browser is stuck after downloading pdf

Was hoping someone could help me understand what's going on:
I'm using Selenium with the Firefox browser to download a PDF (I need Selenium to log in to the corresponding website):
le = browser.find_elements_by_xpath('//*[@title="Download PDF"]')
time.sleep(5)
if le:
pdf_link = le[0].get_attribute("href")
browser.get(pdf_link)
The code does download the pdf, but after that just stays idle.
This seems to be related to the following browser settings:
fp.set_preference("pdfjs.disabled", True)
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")
If I disable the first, it doesn't hang but opens the PDF instead of downloading it. If I disable the second, a "Save As" pop-up window shows up. Could someone explain how to handle this?
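For reference, a commonly used Firefox profile setup for silent PDF downloads; this is only a sketch, and the exact preference set that works can vary by Firefox version (the browser.download.* keys tell Firefox where to save without prompting):

from selenium import webdriver

fp = webdriver.FirefoxProfile()
fp.set_preference("pdfjs.disabled", True)            # don't render PDFs inline
fp.set_preference("browser.download.folderList", 2)  # use a custom download dir
fp.set_preference("browser.download.dir", "/tmp/downloads")
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")

browser = webdriver.Firefox(firefox_profile=fp)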
For me, the best way to solve this was to let Firefox render the PDF in the browser via pdf.js and then send a subsequent fetch via the Python requests library with the selenium cookies attached. More explanation below:
There are several ways to render a PDF via Firefox + Selenium. If you're using the most recent version of Firefox, it'll most likely render the PDF via pdf.js so you can view it inline. This isn't ideal because now we can't download the file.
You can disable pdf.js via Selenium options but this will likely lead to the issue in this question where the browser gets stuck. This might be because of an unknown MIME-Type but I'm not totally sure. (There's another StackOverflow answer that says this is also due to Firefox versions.)
However, we can bypass this by passing Selenium's cookie session to requests.session().
Here's a toy example:
import requests
from selenium import webdriver

pdf_url = "/url/to/some/file.pdf"

# set up the driver with whatever options you need
driver = webdriver.Firefox()  # pass your options here

# do whatever you need to do to auth/login/click/etc.

# navigate to the PDF URL in case the PDF link issues a
# redirect, because requests.session() does not persist cookies
driver.get(pdf_url)

# get the URL from Selenium
current_pdf_url = driver.current_url

# create a requests session
session = requests.session()

# add Selenium's cookies to requests
selenium_cookies = driver.get_cookies()
for cookie in selenium_cookies:
    session.cookies.set(cookie["name"], cookie["value"])

# Note: if headers are also important, you'll need to use
# something like seleniumwire to get the headers from Selenium

# finally, re-send the request with requests.session
pdf_response = session.get(current_pdf_url)

# access the bytes response from the session
pdf_bytes = pdf_response.content
I highly recommend using seleniumwire over regular Selenium because it extends Python Selenium to let you inspect headers, wait for requests to finish, use proxies, and much more.
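A small sketch of the seleniumwire usage hinted at here; it is a drop-in wrapper, so you import webdriver from seleniumwire instead of selenium and then read the captured traffic from driver.requests:

from seleniumwire import webdriver

driver = webdriver.Firefox()
driver.get("https://example.com/some/page")

# each captured request exposes its URL, headers, and response
for request in driver.requests:
    if request.response:
        print(request.url, request.headers)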

How to avoid login each time with Selenium python

I have the following code to login in a website.
from selenium import webdriver

driver = webdriver.Chrome(r"C:\webdrivers\chromedriver.exe")
driver.get("https://examplesite.com")
driver.find_element_by_id("username").send_keys("MyUsername")
driver.find_element_by_id("password").send_keys("MyPassword")
I do some clicks on that homepage, and then a second page, https://secondpage.com/some/text, opens in a different tab. I need to do some automated testing
on this second page, but if I try to work directly on the second page by changing the line above from this
driver.get ("https://examplesite.com")
to this
driver.get ("https://secondpage.com/some/text")
I'm being redirected to the first page, https://examplesite.com, to log in again.
I've tried to pass the credentials directly in the get command like this:
driver.get ("https://MyUsarname:MyPassword#secondpage.com/some/text")
but the same thing happens and I'm redirected to the login page.
Is there a way to run the script directly on the second page without needing to log in each time I test something?
Maybe somehow make Selenium remember that I'm already logged in?
Thanks for any help
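A sketch of the cookie-reuse approach from the pickle answer earlier on this page, applied to this question's code; whether the replayed cookies satisfy secondpage.com depends on how the site's session handling works, so treat this as a starting point:

import pickle
from selenium import webdriver

driver = webdriver.Chrome(r"C:\webdrivers\chromedriver.exe")

# first run: log in once, then save the session cookies
driver.get("https://examplesite.com")
driver.find_element_by_id("username").send_keys("MyUsername")
driver.find_element_by_id("password").send_keys("MyPassword")
# ... click through to complete the login ...
with open("cookies.pkl", "wb") as f:
    pickle.dump(driver.get_cookies(), f)

# later runs: restore the cookies and go straight to the second page
driver.get("https://examplesite.com")  # must visit the domain before add_cookie()
with open("cookies.pkl", "rb") as f:
    for cookie in pickle.load(f):
        driver.add_cookie(cookie)
driver.get("https://secondpage.com/some/text")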
