With the following code, I am able to open a web page and retrieve its contents.
Based on this web page's contents, I would like to execute a POST on this page where I supply some form data.
How can this be done with the Selenium / ChromeDriver API?
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
browser = webdriver.Chrome(executable_path=r"/usr/local/share/chromedriver")
url = 'https://somewebpage.com'
browser.get(url)  # loads the page; the HTML is then available via browser.page_source
I don't think this is possible with selenium alone.
What you could do is fill the form / click on the submit button with something like this:
input_a = driver.find_element_by_id("input_a")  # locate the first form field
input_b = driver.find_element_by_id("input_b")  # locate the second form field
input_a.send_keys("some data")
input_b.send_keys("some data")
driver.find_element_by_name("submit").click()  # click the submit button
If you really want to create the POST request yourself, you should look into the https://github.com/cryzed/Selenium-Requests package, which will allow you to create POST requests just like the Requests package but with Selenium.
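For illustration, a minimal sketch with Selenium-Requests (the endpoint URL and form field names below are assumptions, not from your page):
from seleniumrequests import Chrome
driver = Chrome()  # a regular Selenium Chrome driver with a .request() method added
# send a POST with form data; the browser session's cookies are reused
response = driver.request(
    'POST',
    'https://somewebpage.com/form-handler',  # hypothetical endpoint
    data={'input_a': 'some data', 'input_b': 'some data'},  # hypothetical field names
)
print(response.status_code)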
I am trying to scrape data from this website.
I need to log in here.
Here is my code.
I don't really know how to do that.
How can I do this?
Thanks in advance. <3
To log in you can go directly to the login page and then go to the page you want to scrape.
You have to download chromedriver from here and specify its path in the script below.
This is how you can do it:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
driver = webdriver.Chrome(executable_path=r'PATH')#PUT YOUR CHROMEDRIVER PATH HERE
driver.get("https://secure.vietnamworks.com/login/vi?client_id=3") #LOGIN URL
driver.find_element_by_id("email").send_keys("YOUR-EMAIL#gmail.com") #PUT YOUR EMAIL HERE
driver.find_element_by_id('login__password').send_keys('PASSWORD') #PUT YOUR PASSWORD HERE
driver.find_element_by_id("button-login").click()
driver.get("https://www.vietnamworks.com/technical-specialist-scientific-instruments-lam-viec-tai-hcm-chi-tuyen-nam-tuoi-tu-26-32-chi-nhan-cv-tieng-anh-1336108-jv/?source=searchResults&searchType=2&placement=1336109&sortBy=date") #THE WEB PAGE YOU NEED TO SCRAPE
And then you can get the data from the web page.
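For example, a minimal sketch of pulling the data out once that page has loaded (the BeautifulSoup step is my choice, not part of the original answer):
from bs4 import BeautifulSoup  # assumes bs4 is installed
html = driver.page_source  # full HTML of the page you navigated to
soup = BeautifulSoup(html, "html.parser")
print(soup.title.string)  # sanity check: the page title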
I don't think you need to go through the main page and navigate from there. You can just use the link https://www.vietnamworks.com/dang-nhap?from=home-v2&type=login and then enter your credentials on the page that loads.
After the page loads, you use
driver.find_element_by_xpath('//*[@id="email"]').send_keys("youremail")
password_element = driver.find_element_by_xpath('//*[@id="login__password"]')
password_element.send_keys("yourpassword")
password_element.submit()
The XPath is just the path to the element you need; .submit() does the same thing as pressing Enter.
I'm working on trying to automate a game I want to get ahead in called Pokemon Vortex. When I log in using Selenium it works just fine; however, when I attempt to load a page that requires a user to be logged in, I am sent right back to the login page (I have tried it outside of Selenium with the same browser, Chrome).
This is what I have
import time
from selenium import webdriver
from random import randint
driver = webdriver.Chrome(r'C:\Program Files (x86)\SeleniumDrivers\chromedriver.exe')
driver.get('https://zeta.pokemon-vortex.com/dashboard/')
time.sleep(5) # Let the user actually see something!
usernameLoc = driver.find_element_by_id('myusername')
passwordLoc = driver.find_element_by_id('mypassword')
usernameLoc.send_keys('myusername')
passwordLoc.send_keys('12345')
submitButton = driver.find_element_by_id('submit')
submitButton.submit()
time.sleep(3)
driver.get('https://zeta.pokemon-vortex.com/map/10')
time.sleep(10)
I'm using Python 3.6+ and I literally just installed Selenium today, so it's up to date. How do I force Selenium to hold onto cookies?
Using a pre-defined user profile might solve your problem. This way your cache will be saved and will not be deleted.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("--user-data-dir=C:/Users/user_name/AppData/Local/Google/Chrome/User Data")
driver = webdriver.Chrome(options=options)
driver.get("xyz.com")
I am trying to automate logging into my salesforce account. When I use my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
browser= webdriver.Firefox()
browser.get("https://xxxxx.my.salesforce.com/?
SAMLRequest=&startURL=%2Fidp%2Flogin%3Fapp%
3D0sp0g000000Gmhj&un=xxxxx.xxxxx%40xxxxx.com")
elem=browser.find_element_by_id("username")
elem.send_keys("xxxxx.xxxx#xxxxx.com")
elem_pass=browser.find_element_by_id("password")
elem_pass.send_keys("xxxxxxx")
rem_me=browser.find_element_by_id("rememberUn")
rem_me.click()
elem.send_keys(Keys.ENTER)
As you can see, I pass the URL, enter the username and password, and click "remember me".
When I run this with Selenium, it goes to an email 2FA authentication page.
But when I do it manually:
Copy the url mentioned above.
Paste it into the address bar of firefox browser.
The username and password show up already populated.
When I hit enter, it logs me in. (No 2FA).
Is Salesforce somehow detecting that request is from selenium?
And is there a way to get around it?
Could this be related to this?
Different results when using Selenium + Python
Yup, I got it resolved. I had to import cookies from Firefox and use them with Selenium:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import os
os.chdir(r"C:\Users\tsingh\Desktop\Cookies")
ffprofile = webdriver.FirefoxProfile(r'C:\Users\tsingh\Desktop\Cookies')
browser = webdriver.Firefox(firefox_profile=ffprofile)
I need to download the whole content of HTML pages: images, CSS, JS.
First Option:
Download the page with urllib or requests,
Extract the page info with Beautiful Soup or lxml,
Download all linked resources, and
Edit the links in the original page to be relative.
Disadvantages
Multiple steps.
The downloaded page will never be identical to the remote page, possibly because of JS or AJAX content.
Second Option
Some authors recommend automating the web browser to download the page, so the JavaScript and AJAX get executed before the download:
scraping ajax sites and java script
I want to use this option.
First attempt
So I copied this piece of Selenium code to do two steps:
Open the URL in the Firefox browser.
Download the page.
The code
import os
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
profile = webdriver.FirefoxProfile()
profile.set_preference('browser.download.folderList', 2)
profile.set_preference('browser.download.manager.showWhenStarting', False )
profile.set_preference('browser.download.dir', os.environ["HOME"])
profile.set_preference("browser.helperApps.alwaysAsk.force", False )
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'text/html,text/webviewhtml,text/x-server-parsed-html,text/plaintext,application/octet-stream');
browser = webdriver.Firefox(profile)
def open_new_tab(url):
    ActionChains(browser).send_keys(Keys.CONTROL, "t").perform()
    browser.get(url)
    return browser.current_window_handle
# call the function
open_new_tab("https://www.google.com")
# Result: the browser is opened at the given url, no download occurs
Result
Unfortunately no download occurs; it just opens the browser at the URL provided (the first step only).
Second attempt
I then thought of downloading the page with a separate function, so I added this one.
The function added
def save_current_page():
    ActionChains(browser).send_keys(Keys.CONTROL, "s").perform()
# call the function
open_new_tab("https://www.google.com")
save_current_page()
Result
# Same as before: the browser is opened at the given url, no download occurs.
Question
How can I automate downloading web pages with Selenium?
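One workaround (a sketch, not from the thread above): Selenium sends keystrokes to the page content, not to the browser chrome, so Ctrl+S typically never reaches Firefox's Save dialog. Instead you can write the rendered DOM out yourself (the file name is my choice):
def save_current_page(path="page.html"):
    # write the rendered HTML (after JS/AJAX have run) to a local file
    with open(path, "w", encoding="utf-8") as f:
        f.write(browser.page_source)
open_new_tab("https://www.google.com")
save_current_page()  # creates page.html next to the script
Note this captures the HTML only; linked images, CSS and JS would still need to be fetched separately.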
I'm trying to parse the HTML of a webpage that requires being logged in. I can get the HTML of a webpage using this script:
from urllib2 import urlopen
from BeautifulSoup import BeautifulSoup
import re
webpage = urlopen('https://www.example.com')
soup = BeautifulSoup(webpage)
print soup
#This would print the source of example.com
But trying to get the source of a webpage that I'm logged into proves to be more difficult.
I tried replacing ('https://www.example.com') with ('https://user:pass@example.com') but I got an Invalid URL error.
Anyone know how I could do this?
Thanks in advance.
Selenium WebDriver (http://seleniumhq.org/projects/webdriver/) might be good for your needs here. You can log in to the page and then print the contents of the HTML. Here's an example:
from selenium import webdriver
# initiate
driver = webdriver.Firefox() # initiate a driver, in this case Firefox
driver.get("http://example.com") # go to the url
# locate the login form
username_field = driver.find_element_by_name(...) # get the username field
password_field = driver.find_element_by_name(...) # get the password field
# log in
username_field.send_keys("username") # enter in your username
password_field.send_keys("password") # enter in your password
password_field.submit() # submit it
# print HTML
html = driver.page_source
print html
I suggest you use Mechanize.
Python mechanize login to website
In mechanize you set up a browser object so cookies etc. are taken care of.
You can iterate through the forms and links, e.g.:
for form in browser.forms():
    print form
You can select the form you want and fill it in as you like.
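For illustration, a minimal mechanize login sketch (the URL and form control names are assumptions):
import mechanize
br = mechanize.Browser()
br.open("https://www.example.com/login")  # hypothetical login URL
br.select_form(nr=0)                      # pick the first form on the page
br["username"] = "your_username"          # hypothetical control names
br["password"] = "your_password"
response = br.submit()                    # cookies are kept by the Browser object
print(response.read())                    # HTML of the logged-in page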
You can try sending a POST request to the login form (with the login credentials), then save the received cookie and supply it when downloading the page where you need to be logged in.
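A sketch of that idea with the requests library, which keeps the cookie in a session for you (the URL and field names are assumptions):
import requests
session = requests.Session()  # stores cookies across requests
# hypothetical login endpoint and form field names
session.post("https://www.example.com/login",
             data={"username": "your_username", "password": "your_password"})
# the session now carries the login cookie automatically
page = session.get("https://www.example.com/members-only")
print(page.text)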
We can do it using the Selenium module as below:
from selenium import webdriver
# initiate
my_browser = webdriver.Firefox()
my_browser.get("fill with url of the login page")
try:
    my_browser.implicitly_wait(35)
    username_field = my_browser.find_element_by_name('enter the value of the name attribute')  # value of the name attribute in the source code
    password_field = my_browser.find_element_by_name('enter the value of the name attribute')
    username_field.send_keys("fill with user_name")
    password_field.send_keys("fill with password")
    password_field.submit()  # submit it
finally:
    print('Look into the browser')