Selenium opens the webpage with the URL forced to lowercase - python

I'm running this code:
from selenium import webdriver
driver = webdriver.Chrome("chromedriver.exe")
driver.implicitly_wait(1)
driver.get("https://www.vivaticket.com/it/search?categoryId=10&provinceCode=BO")
The page that opens has an all-lowercase URL, "https://www.vivaticket.com/it/search?categoryid=10&provincecode=bo", and finds no results.
Do you have any idea how to deal with this?
Thanks for the help.

I think you can bypass this using url-encoding:
def encode(string):
    return "".join("%{0:0>2}".format(format(ord(char), "x")) for char in string)
province = "BO"
url = "https://www.vivaticket.com/it/search?categoryid=10&provincecode="+encode(province)
driver.get(url)
In your case, %42%4F stands for BO.
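As a quick sanity check on the encoding idea above, the sketch below percent-encodes each character and decodes it again with the standard library's `urllib.parse.unquote`. It assumes ASCII-only input (one byte per character). Whether the site then keeps the intended capitalization is not something this snippet can confirm.

```python
from urllib.parse import unquote


def encode(string):
    # Percent-encode every character as %XX (two lowercase hex digits).
    # Assumes ASCII input, where each character is a single byte.
    return "".join("%{:02x}".format(ord(char)) for char in string)


encoded = encode("BO")
print(encoded)           # %42%4f
print(unquote(encoded))  # BO
```

Hex digits in percent-encoding are case-insensitive, so %42%4f and %42%4F decode to the same text.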

The site itself does this on the first visit for some reason. What I found works is simply navigating twice.
String url = "https://www.vivaticket.com/it/search?categoryId=10&provinceCode=BO";
driver.get(url);
driver.get(url);
// proceed with your script
The second navigation keeps the proper capitalization.
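For a Python script like the one in the question, the same workaround might be sketched as below. The helper name `get_preserving_case` and the `attempts` parameter are made up for illustration. The exact string comparison against `driver.current_url` is an assumption and may be too strict if the browser normalizes the URL in other ways.

```python
def get_preserving_case(driver, url, attempts=2):
    """Navigate to `url`, retrying while the browser ends up on a different URL.

    `driver` is any Selenium WebDriver. Returns True once `current_url`
    matches `url` exactly, False if every attempt ended somewhere else.
    """
    for _ in range(attempts):
        driver.get(url)
        if driver.current_url == url:
            return True
    return False
```

Usage would be `get_preserving_case(driver, "https://www.vivaticket.com/it/search?categoryId=10&provinceCode=BO")` in place of the plain `driver.get(...)` call.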

Related

How to automatically find link of download button and download corresponding file with Python?

I have permission to download some weather data from the following website:
https://www.meteobridel.com/messnetz/index3.php#
I was wondering if it is possible to automatically find the download URL behind the 'CSV' button and then download that CSV file with Python.
I tried this, but it didn't work:
from selenium import webdriver
browser = webdriver.Safari()
url = 'https://meteobridel.lu/?page_id=5'
browser.get(url)
browser.find_element_by_xpath('//*[@id="CSV"]').click()
browser.close()
Thanks already!
Try:
from selenium import webdriver
browser = webdriver.Safari()
url = 'https://meteobridel.lu/?page_id=5'
browser.get(url)
browser.find_element_by_xpath("//body/div[@id='main']/div[1]/div[1]/div[1]/a[4]").click()
browser.close()
Checking the page you provided, I can't find an element with the ID "CSV".
Maybe try getting the button by class:
browser.find_element_by_xpath(r"//a[contains(@class, 'buttons-csv')]").click()
The element is inside an iframe, so you have to switch to it first. Since the frame's id is unique, you can switch like this:
browser.switch_to.frame("iframe")
browser.find_element_by_xpath('//span[contains(text(),"CSV")]/..').click()
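The switch-then-click step above could be wrapped in a small helper that also switches back to the main document afterwards. This is only a sketch: `click_inside_frame` is a hypothetical name, and it uses the generic `find_element(by, value)` call because the `find_element_by_*` helpers used in this thread are no longer available in newer Selenium 4 releases.

```python
def click_inside_frame(driver, frame_ref, xpath):
    """Switch into a frame, click the element matching `xpath`, switch back."""
    driver.switch_to.frame(frame_ref)
    try:
        # "xpath" is the locator strategy string that selenium's By.XPATH holds.
        driver.find_element("xpath", xpath).click()
    finally:
        # Always return to the top-level document, even if the click fails.
        driver.switch_to.default_content()
```

With the values from the answer above, the call would be `click_inside_frame(browser, "iframe", '//span[contains(text(),"CSV")]/..')`.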

always "wrong password" message in selenium automated login

I'm trying to automate a duolingo login with Selenium with the code posted below.
While everything seems to work as expected at first, I always get a "Wrong password" message on the website after the login button is clicked.
I have checked the password time and time again and even changed it to one without special characters, but still the login fails.
I have seen in other examples that there is sometimes an additional password input field, however I cannot find one while inspecting the html.
What could I be missing ?
(Side note: I'm also open to a completely different solution without a webdriver since I really only want to get to the duolingo.com/learn page to scrape some data, but as of yet I haven't found an alternative way to login)
The code used:
from selenium import webdriver
from time import sleep
url = "https://www.duolingo.com/"
def login():
    driver = webdriver.Chrome()
    driver.get(url)
    sleep(2)
    hve_acnt_btn = driver.find_element_by_xpath("/html/body/div/div/div/span[1]/div/div[1]/div[2]/div/div[2]/a")
    hve_acnt_btn.click()
    sleep(2)
    email_input = driver.find_element_by_xpath("/html/body/div[1]/div[3]/div[2]/form/div[1]/div/label[1]/div/input")
    email_input.send_keys("email@email.com")
    sleep(2)
    pwd_input = driver.find_element_by_css_selector("input[type=password]")
    pwd_input.clear()
    pwd_input.send_keys("password")
    sleep(2)
    login_btn = driver.find_element_by_xpath("/html/body/div[1]/div[3]/div[2]/form/div[1]/button")
    login_btn.click()
    sleep(5)
login()
I couldn't post the website's html because of the character limit, so here is the link to the duolingo page: Duolingo
Switch to Firefox or another browser that does not tell the page it is being automated. See my earlier answer to a very similar issue here: https://stackoverflow.com/a/57778034/8375783
Long story short: When you start Chrome it will run with navigator.webdriver=true. You can check it in console. Pages can detect that flag and block login or other actions, hence the invalid login. This is a read-only flag set by the browser during startup.
With Chrome I couldn't log in to Duolingo either. After I switched the driver to Firefox, the very same code just worked.
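The flag mentioned above can also be read from the script itself rather than from the console. The helper below is a minimal sketch: `reports_automation` is a made-up name, and the result may differ between browsers and browser versions.

```python
def reports_automation(driver):
    """Return True if the page can see navigator.webdriver set to true."""
    # execute_script runs the JavaScript in the current page and returns its value.
    return driver.execute_script("return navigator.webdriver") is True
```

Calling `reports_automation(driver)` right after `driver.get(url)` shows what the site is able to detect for that browser.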
Also if I may recommend, try to use Xpath with attributes.
Instead of this:
hve_acnt_btn = driver.find_element_by_xpath("/html/body/div/div/div/span[1]/div/div[1]/div[2]/div/div[2]/a")
You can use:
hve_acnt_btn = driver.find_element_by_xpath('//*[@data-test="have-account"]')
Same goes for:
email_input = driver.find_element_by_xpath("/html/body/div[1]/div[3]/div[2]/form/div[1]/div/label[1]/div/input")
vs:
email_input = driver.find_element_by_xpath('//input[@data-test="email-input"]')

how to load multiple urls in driver.get()?

How to load multiple urls in driver.get() ?
I am trying to load 3 urls in below code, but how to load the other 2 urls?
And afterwards the next challenge is to pass authentication for all the urls as well which is same.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome(executable_path=r"C:/Users/RYadav/AppData/Roaming/Microsoft/Windows/Start Menu/Programs/Python 3.8/chromedriver.exe")
driver.get("https://fleet.my.salesforce.com/reportbuilder/reportType.apexp")#put here the adress of your page
elem = driver.find_element_by_xpath('//*[@id="ext-gen63"]')#put here the content you have put in Notepad, ie the XPath
button = driver.find_element_by_id('ext-gen63')
print(elem.get_attribute("class"))
button.click()
driver.close()
Try the code below:
def getUrls(targeturl):
    driver = webdriver.Chrome(executable_path=r" path for chromedriver.exe")
    driver.get("http://www."+targeturl+".com")
    # perform your tasks here
    driver.quit()
webPage = ['google','facebook','gmail']
for i in webPage:
    print(i)
    getUrls(i)
You can't load more than one URL at a time per WebDriver instance. If you want to do that, you will probably need something like the multiprocessing module. For an iterative solution, just create a list of every URL you need and loop through it. That way you won't have the credential problem either.
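The iterative approach described above might look like the sketch below, which reuses one driver for every URL. `visit_all` is a hypothetical helper, and collecting `driver.title` is only a stand-in for whatever work is needed on each page. Reusing the same driver keeps the same browser session, which may help with the shared authentication, though that depends on the site.

```python
def visit_all(driver, urls):
    """Load each URL in turn with the same driver and collect the page titles."""
    titles = {}
    for url in urls:
        driver.get(url)
        # Placeholder for the real per-page work (login, scraping, clicks, ...).
        titles[url] = driver.title
    return titles
```

A call such as `visit_all(driver, [first_url, second_url, third_url])` then replaces three separate `driver.get(...)` lines.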

How to prevent Selenium Webdriver adding double slash in driver.get() url in Python

I am using Selenium WebDriver to get an internal corporate web page, which is a search form.
My code will successfully open a browser and pull up the target page, but there is an unwanted double slash in the url which affects subsequent search form behavior.
Instead of showing 'http://example.web.company.com/directory/subdirectory/target_page.cfm'
I get: 'http://example.web.company.com//directory/subdirectory/target_page.cfm'
Note double slash after ".com". Does anyone know how/why the extra slash is inserted, and how I can prevent it?
This is somewhat complicated because I have to complete an internal login before the browser will open the page. The request redirects to the login page, I fill in the prompts, and then the code below loads the requested page successfully. It just brings up the double-slash version.
import getpass
import requests, lxml.html
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
# Get user input credentials
user = input('Enter ID: ')
password = getpass.getpass('Password: ')
# Driver
driver = webdriver.Chrome(executable_path=r'C:\Drivers\Chromedriver\chromedriver.exe')
# Target web page
driver.get('http://example.web.company.com/directory/subdirectory/target_page.cfm')
# Navigate Logon Page
elem = driver.find_element_by_id('ID')
elem.send_keys(user)
elem = driver.find_element_by_id('PASSWORD')
elem.send_keys(password)
elem = driver.find_element_by_id('Submit')
elem.click()
If I then try to execute code to search for something using the web form, the double slash url version will successfully display a list of results for a partial-matched search term (normal behavior). But if I enter an exact (valid) search term, I get an error, which I guess seems to have to do with relative linking. But I don't have control over those pages, I'm just a user.
# Search routine from page returned above
item = input('SEARCH TERM: ')
elem = driver.find_element_by_name('search_name')
elem.send_keys(item)
elem.send_keys(Keys.RETURN)
This site provided some background on single and double slash urls and relative linking: https://sitebulb.com/hints/internal/url-contains-a-double-slash/
These stackoverflow threads touch on webdriver and driver.get(), but I wasn't able to find an answer to the question: Where does the extra slash come from and how to prevent it?
double slash for xpath. Selenium Java Webdriver
Selenium driver.get() modifying URL
I eventually learned that the target result page is not accessible by automation under the same directory path that works manually. But there is another directory tree that does work with automation, and both touch the same database.
I cannot make my code work using the original target search site (because it only attempts the directory path that will not work with automation). But I am able to use other loopable code to link to target results using the directory tree that does work.
I don't know exactly why the automation won't accept the same path that works manually, but I have a consistent workaround. I believe the issue with the double slash is not relevant.

xpath returns more than one result, how to handle in python

I have started using Selenium with Python. I am able to change the message text using find_element_by_id. I want to do the same with find_element_by_xpath, but it does not work because the XPath matches two elements. I want to try this out to learn about XPath.
I want to scrape a page using Python, and I need clarity on using XPath, mainly for going to the next page.
#This code works:
import time
import requests
from selenium import webdriver
driver = webdriver.Chrome()
url = "http://www.seleniumeasy.com/test/basic-first-form-demo.html"
driver.get(url)
eleUserMessage = driver.find_element_by_id("user-message")
eleUserMessage.clear()
eleUserMessage.send_keys("Testing Python")
time.sleep(2)
driver.close()
#This works fine. I wish to do the same with xpath.
#I inspect the input box in Chrome and copy the XPath '//*[@id="user-message"]', which seems to refer to the other box as well.
# I wish to use the xpath method to write text in this box as follows, which does not work.
driver = webdriver.Chrome()
url = "http://www.seleniumeasy.com/test/basic-first-form-demo.html"
driver.get(url)
eleUserMessage = driver.find_elements_by_xpath('//*[@id="user-message"]')
eleUserMessage.clear()
eleUserMessage.send_keys("Test Python")
time.sleep(2)
driver.close()
To elaborate on my comment you would use a list like this:
eleUserMessage_list = driver.find_elements_by_xpath('//*[@id="user-message"]')
my_desired_element = eleUserMessage_list[0] # or maybe [1]
my_desired_element.clear()
my_desired_element.send_keys("Test Python")
time.sleep(2)
The only real difference between find_elements_by_xpath and find_element_by_xpath is the first option returns a list that needs to be indexed. Once it's indexed, it works the same as if you had run the second option!
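Instead of guessing between index [0] and [1], one option is to pick the first match that is actually visible. This is only a sketch under the assumption that the unwanted duplicate is hidden on the page, which the thread does not confirm. `first_displayed` is a hypothetical helper name.

```python
def first_displayed(elements):
    """Return the first element whose is_displayed() is True, else None."""
    for element in elements:
        if element.is_displayed():
            return element
    return None
```

It would be used as `my_desired_element = first_displayed(eleUserMessage_list)`, followed by the same `clear()` and `send_keys(...)` calls as above.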
