How to split Selenium Python code into multiple functions - python

I'm writing a Python program that tests some functions of a website. It logs in to the site, checks its version, and runs tests that depend on that version. I want to write several tests, but some steps repeat, for example logging in.
I tried to split my code into functions such as hue_login() and call it from every test that needs a login. I log in with Selenium WebDriver. When I call hue_login() from another function that also uses Selenium WebDriver, I end up with two browser windows: one opened by hue_login(), where I get logged in, and a second one where the calling function tries to open the URL I want to visit after login. Because I'm not logged in in the second window, the page doesn't load and the remaining tests in that function fail.
Example:
def hue_version():
    url = global_var.domain + global_var.about
    response = urllib.request.urlopen(url)
    htmlparser = etree.HTMLParser()
    xpath = etree.parse(response, htmlparser).xpath('/html/body/div[4]/div/div/h2/text()')
    string = "".join(xpath)
    # Escape the dots so they match literal periods in the version string
    pattern = re.compile(r'(\d{1,2})\.(\d{1,2})\.(\d{1,2})')
    return pattern.search(string).group()
hue_ver = hue_version()
print(hue_ver)
if hue_ver == '3.9.0':
    pass  # do something
elif hue_ver == '3.7.0':
    pass  # do something else
else:
    print("Hue version not recognized!")
def hue_login():
    driver = webdriver.Chrome(global_var.chromeDriverPath)
    driver.get(global_var.domain + global_var.loginPath)
    input_username = driver.find_element_by_name('username')
    input_password = driver.find_element_by_name('password')
    input_username.send_keys(username)
    input_password.send_keys(password)
    input_password.submit()
    sleep(1)
    driver.find_element_by_id('jHueTourModalClose').click()
def file_browser():
    hue_login()
    click_file_browser_link = global_var.domain + global_var.fileBrowserLink
    driver = webdriver.Chrome(global_var.chromeDriverPath)
    driver.get(click_file_browser_link)
How can I call hue_login() from file_browser() so that the rest of file_browser() runs in the same window that hue_login() opened?

Create the driver once at module level so that both functions share the same browser window:
driver = webdriver.Chrome(global_var.chromeDriverPath)
def hue_login():
    driver.get(global_var.domain + global_var.loginPath)
    input_username = driver.find_element_by_name('username')
    input_password = driver.find_element_by_name('password')
    input_username.send_keys(username)
    input_password.send_keys(password)
    input_password.submit()
    sleep(1)
    driver.find_element_by_id('jHueTourModalClose').click()
def file_browser():
    hue_login()
    click_file_browser_link = global_var.domain + global_var.fileBrowserLink
    driver.get(click_file_browser_link)
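An alternative to a module-level global is to pass the driver into each function explicitly. The sketch below shows only that wiring. RecordingDriver, the base_url parameter, and the two paths are hypothetical stand-ins so that it runs without a browser; in real code you would pass the webdriver.Chrome(...) instance instead.

```python
class RecordingDriver:
    """Hypothetical stand-in for a Selenium WebDriver; it only records URLs."""

    def __init__(self):
        self.visited = []

    def get(self, url):
        self.visited.append(url)


def hue_login(driver, base_url):
    # Real code would also fill in and submit the login form here.
    driver.get(base_url + "/accounts/login/")  # hypothetical login path


def file_browser(driver, base_url):
    # Reuse the caller's driver, so login and navigation share one window.
    hue_login(driver, base_url)
    driver.get(base_url + "/filebrowser/")  # hypothetical file browser path


driver = RecordingDriver()
file_browser(driver, "http://hue.example")
print(driver.visited)
```

Because every function receives the same driver object, no function ever opens a second window on its own.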

Related

Get uBlock Origin logger data using Python and Selenium

I'd like to know the number of blocked trackers detected by uBlock Origin using Python (running on a Linux server, so no GUI) and Selenium (with the Firefox driver). I don't necessarily need to block them, but I need to know how many there are.
uBlock Origin has a logger (https://github.com/gorhill/uBlock/wiki/The-logger#settings-dialog) which I'd like to scrape.
This logger is available through a URL like moz-extension://fc469b55-3182-4104-a95c-6b0b4f87cf0f/logger-ui.html#_ where fc469b55-3182-4104-a95c-6b0b4f87cf0f is the UUID of the uBlock Origin add-on.
In this logger, each entry is a div with class "logEntry" (yellow oblong in the screenshot), and I'd like to get the data in the green oblong.
So far, I have this:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options as FirefoxOptions
browser_options = FirefoxOptions()
browser_options.headless = True
# Activate add on
str_ublock_extension_path = "/usr/local/bin/uBlock0_1.45.3b10.firefox.signed.xpi"
browser = webdriver.Firefox(executable_path='/usr/loca/bin/geckodriver',options=browser_options)
str_id = browser.install_addon(str_ublock_extension_path)
# Getting the UUID which is new each time the script is launched
profile_path = browser.capabilities['moz:profile']
id_extension_firefox = "uBlock0@raymondhill.net"
with open('{}/prefs.js'.format(profile_path), 'r') as file_prefs:
    lines = file_prefs.readlines()
    for line in lines:
        if 'extensions.webextensions.uuids' in line:
            sublines = line.split(',')
            for subline in sublines:
                if id_extension_firefox in subline:
                    internal_uuid = subline.split(':')[1][2:38]
str_uoo_panel_url = "moz-extension://" + internal_uuid + "/logger-ui.html#_"
ubo_logger = browser.get(str_uoo_panel_url)
ubo_logger_log_entries = ubo_logger.find_element(By.CLASS_NAME, "logEntry")
for log_entrie in ubo_logger_log_entries:
    print(log_entrie.text)
Using this "weird" URL with moz-extension:// seems to work, since print(browser.page_source) displays relevant HTML.
Problem: ubo_logger.find_element(By.CLASS_NAME, "logEntry") returns nothing. What did I do wrong?
Note that browser.get() returns None, so the element lookups have to be made on the driver object itself. I found this to work:
parent = driver.find_element(by=By.XPATH, value='//*[@id="vwContent"]')
children = parent.find_elements(by=By.XPATH, value='./child::*')
for child in children:
    attributes = (child.find_element(by=By.XPATH, value='./child::*')).find_elements(by=By.XPATH, value='./child::*')
    print(attributes[4].text)
You could then also do:
if attributes[4].text.isdigit():
    result = int(attributes[4].text)
This converts the resulting text into an int.
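The UUID lookup in the question slices the prefs.js line at fixed character offsets, which breaks if the line layout shifts. A minimal sketch of a more tolerant lookup follows. It assumes the preference value is a JSON object stored as an escaped string on a single user_pref line, as in SAMPLE_PREFS; both that sample line and the extension_uuid helper are illustrative, not part of Selenium or Firefox.

```python
import json
import re

# Assumed shape of the relevant prefs.js line (one user_pref call per line).
SAMPLE_PREFS = r'user_pref("extensions.webextensions.uuids", "{\"uBlock0@raymondhill.net\":\"fc469b55-3182-4104-a95c-6b0b4f87cf0f\"}");'


def extension_uuid(prefs_text, extension_id):
    """Return the internal UUID recorded for extension_id, or None if absent."""
    match = re.search(
        r'user_pref\("extensions\.webextensions\.uuids",\s*"(.*)"\);',
        prefs_text,
    )
    if match is None:
        return None
    # Undo the quote escaping, then read the value as a JSON mapping.
    mapping = json.loads(match.group(1).replace('\\"', '"'))
    return mapping.get(extension_id)


print(extension_uuid(SAMPLE_PREFS, "uBlock0@raymondhill.net"))
```

In the original script you would pass the contents of prefs.js as prefs_text and build the moz-extension:// URL from the returned value.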

How can I use my Python script as a proxy for URLs

I have a script that checks the input link: if it matches one I specified in the code, the script handles it; otherwise it opens the link in Chrome.
I want to make that script act as a kind of default browser, to gain speed compared with opening the browser, capturing the link with an extension, and then sending it to my script via POST.
I used procmon to see which registry key the process queries. It seems to check HKCU\Software\Classes\ChromeHTML\shell\open\command, so I added a key there and set its command value to my script path and arguments (-- %1; the -- is only there for testing purposes).
Unfortunately, when the program queries this key to send a link, Windows prompts me to choose a browser instead of running my script, which isn't what I want.
Any ideas?
In HKEY_CURRENT_USER\Software\Classes\ChromeHTML\Shell\open\command, replace the (Default) value with "C:\Users\samdra.r\AppData\Local\Programs\Python\Python39\pythonw.exe" "[Script_path_here]" %1
When you launch a link, you'll be asked to set a default browser only once (Windows asks again each time you change the key). I selected Chrome in my case.
As for the Python script, here it is:
import sys
import browser_cookie3
import requests
from bs4 import BeautifulSoup as BS
import re
import os
import asyncio
import shutil
def Prep_download(args):
    settings = os.path.abspath(__file__.split("NewAltDownload.py")[0]+'/settings.txt')
    if args[1] == "-d" or args[1] == "-disable":
        with open(settings, 'r+') as f:
            f.write(f.read()+"\n"+"False")
        print("Background program disabled, exiting...")
        exit()
    if args[1] == "-e" or args[1] == "-enable":
        with open(settings, 'r+') as f:
            f.write(f.read()+"\n"+"True")
    link = args[-1]
    with open(settings, 'r+') as f:
        try:
            data = f.read()
            osupath = data.split("\n")[0]
            state = data.split("\n")[1]
        except:
            f.write(f.read()+"\n"+"True")
            print("Possible first run, wrote True, exiting...")
            exit()
    if state == "True":
        asyncio.run(Download_map(osupath, link))
async def Download_map(osupath, link):
    # Parentheses are needed here: "and" binds tighter than "or"
    if link.split("/")[2] == "osu.ppy.sh" and (link.split("/")[3] == "b" or link.split("/")[3] == "beatmapsets"):
        with requests.get(link) as r:
            link = r.url.split("#")[0]
        BMID = []
        id = re.sub("[^0-9]", "", link)
        # Collect the numeric IDs of the beatmaps already in the Songs folder
        for ids in os.listdir(os.path.abspath(osupath+("/Songs/"))):
            if re.match(r"(^\d*)",ids).group(0).isdigit():
                BMID.append(re.match(r"(^\d*)",ids).group(0))
        if id in BMID:
            print(link+": Map already exist")
            os.system('"'+os.path.abspath("C:/Program Files (x86)/Google/Chrome/Application/chrome.exe")+'" '+link)
            return
        if not id.isdigit():
            print("Invalid id")
            return
        cj = browser_cookie3.load()
        print("Downloading", link, "in", os.path.abspath(osupath+"/Songs/"))
        headers = {"referer": link}
        with requests.get(link) as r:
            t = BS(r.text, 'html.parser').title.text.split("·")[0]
        with requests.get(link+"/download", stream=True, cookies=cj, headers=headers) as r:
            if r.status_code == 200:
                try:
                    id = re.sub("[^0-9]", "", link)
                    with open(os.path.abspath(__file__.split("NewAltDownload.pyw")[0]+id+" "+t+".osz"), "wb") as otp:
                        otp.write(r.content)
                    shutil.copy(os.path.abspath(__file__.split("NewAltDownload.pyw")[0]+id+" "+t+".osz"),os.path.abspath(osupath+"/Songs/"+id+" "+t+".osz"))
                except:
                    print("You either aren't connected on osu!'s website or you're limited by the API, in which case you now have to wait 1h and then try again.")
    else:
        # Not a beatmap link: hand it to Chrome
        os.system('"'+os.path.abspath("C:/Program Files (x86)/Google/Chrome/Application/chrome.exe")+'" '+link)
args = sys.argv
if len(args) == 1:
    print("No arguments provided, exiting...")
    exit()
Prep_download(args)
You get the %1 argument (the link) with sys.argv[-1] (sys.argv is a list), and from there you just check whether the link looks like the one you're after (in my case it needs to look like https://osu.ppy.sh/b/ or https://osu.ppy.sh/beatmapsets/).
If it does, run your own code; otherwise, just launch the Chrome executable with the link as its argument. If the beatmap ID is already in the Songs folder, I also open the link in Chrome.
To make it work in the background I had to fight with subprocesses and other tricks, and in the end it suddenly started working with pythonw and the .pyw extension.
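The link check described above can also be written with urllib.parse instead of splitting on "/" by hand. This is only a sketch of that one decision: the is_beatmap_link name and the example links are made up for illustration, and the host and path prefixes are the ones mentioned in the answer.

```python
from urllib.parse import urlsplit


def is_beatmap_link(url):
    """Return True for https://osu.ppy.sh/b/... or https://osu.ppy.sh/beatmapsets/... links."""
    parts = urlsplit(url)
    # Drop empty segments produced by leading or doubled slashes.
    segments = [segment for segment in parts.path.split("/") if segment]
    return (
        parts.netloc == "osu.ppy.sh"
        and len(segments) > 0
        and segments[0] in ("b", "beatmapsets")
    )


for example in (
    "https://osu.ppy.sh/beatmapsets/123#osu/456",
    "https://osu.ppy.sh/users/1",
    "https://example.com/b/1",
):
    handler = "script" if is_beatmap_link(example) else "chrome"
    print(example, "->", handler)
```

Grouping the two path checks in one membership test also avoids the and/or precedence pitfall of a hand-written condition.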

How to get data from a webpage using Selenium and show it using Flask?

Hello, I'm a theologian, and one of the things I usually have to do is translate from Latin to English or Spanish. To do that I use an online dictionary and check whether a specific word is in the nominative case or the dative case (Latinist stuff)...
I've written a simple Python script using Selenium that fetches the dictionary's page and extracts the case of the word. It all works as I want, but...
There is always a 'but', haha. I want to take the data I extract with Selenium and display it on a web page using Flask. I wrote the code for that, but it doesn't work...
My code:
from flask import Flask
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from tabulate import tabulate
import sys
import os
app = Flask(__name__)
chrome_opt = Options()
chrome_opt.binary_location = g_chrome_bin = os.environ.get("GOOGLE_CHROME_BIN")
chrome_opt.add_argument('--headless')
chrome_opt.add_argument('--no-sandbox')
chrome_opt.add_argument('--disable-dev-shm-usage')
selenium_driver_path = os.environ.get("CHROMEDRIVER_PATH")
driver = webdriver.Chrome(executable_path= selenium_driver_path if selenium_driver_path else "./chromedriver", options=chrome_opt)
def analyze (words):
    ws = words.split()
    sentence = []
    for w in ws:
        driver.get('http://archives.nd.edu/cgi-bin/wordz.pl?keyword=' + w)
        pre = driver.find_element_by_xpath('//pre')
        sentence = sentence + [[w] + [ pre.text.replace('.', '') ]]
    return tabulate(sentence, headers=["Word", "Dictionary"])
#analyze("pater noster qui est in celis")
@app.route("/api/<string:ws>")
def api (ws):
    return analyze(ws)
driver.close()
if __name__ == "__main__":
    app.run(debug=True)
When I go to http://localhost:5000/api/pater (for example) I get an Internal Server Error, and the console shows selenium.common.exceptions.InvalidSessionIdException: Message: invalid session id
You close your driver session (driver.close()) before the main block runs. So when you make an API request and analyze() calls driver.get(), that driver is already closed. Either initialise a new driver for every call to analyze() and close it at the end of the function, or don't close the driver session at all.
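The first option (a fresh driver per call, closed at the end) can be sketched roughly as follows. To keep the example runnable without a browser, analyze() takes a make_driver callable, and FakeDriver and make_fake_driver are hypothetical stand-ins; the text extraction and tabulate step from the question are left out. In the real app, make_driver would be something like lambda: webdriver.Chrome(options=chrome_opt).

```python
class FakeDriver:
    """Hypothetical stand-in for webdriver.Chrome; it fails once it is closed."""

    def __init__(self):
        self.urls = []
        self.closed = False

    def get(self, url):
        if self.closed:
            raise RuntimeError("invalid session id")
        self.urls.append(url)

    def quit(self):
        self.closed = True


def analyze(words, make_driver):
    driver = make_driver()  # a new session for this call only
    try:
        looked_up = []
        for word in words.split():
            driver.get("http://archives.nd.edu/cgi-bin/wordz.pl?keyword=" + word)
            looked_up.append(word)
        return looked_up
    finally:
        driver.quit()  # always release the session, even if a lookup fails


created = []


def make_fake_driver():
    driver = FakeDriver()
    created.append(driver)
    return driver


print(analyze("pater noster", make_fake_driver))
print(analyze("qui est", make_fake_driver))
```

Starting a browser on every request is slow, so keeping one long-lived driver and never closing it (the second option) may be the better trade-off for a small personal tool.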

Python Selenium Webpage fill: To download data from links

I've put together this code to download files in a loop from a web page that has multiple download links. When a download link is clicked, the page shows a web form that has to be filled in and submitted before the download starts. When I run the code I get a problem in the try/except block (Error: Too broad exception clause), and towards the end there is an error associated with submit (Error: method submit maybe static). Both of these subsequently result in 'SyntaxError: invalid syntax'. Any suggestions or help would be much appreciated. Thank you.
import os
from selenium import webdriver
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList",2)
fp.set_preference("browser.download.manager.showWhenStarting",False)
fp.set_preference("browser.download.dir", os.getcwd())
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/x-msdos-program")
driver = webdriver.Firefox(firefox_profile=fp)
driver.get('http://def.com/catalog/attribute')
#This is to find the download links in the webpage one by one
i=0
while i<1:
try:
driver.find_element_by_xpath('//*[@title="xml (Open in a new window)"]').click()
except:
i=1
#Once the download link is clicked this has to fill the form for submission which will download the file
class FormPage(object):
def fill_form(self, data):
driver.find_element_by_xpath('//input[@type = "radio" and @value = "Non-commercial"]').click()
driver.find_element_by_xpath('//input[@type = "checkbox" and @value = "R&D"]').click()
driver.find_element_by_xpath('//input[@name = "name_d"]').send_keys(data['name_d'])
driver.find_element_by_xpath('//input[@name = "mail_d"]').send_keys(data['mail_d'])
return self
def submit(self):
driver.find_element_by_xpath('//input[@value = "Submit"]').click()
data = {
'name_d': 'abc',
'mail_d': 'xyz@gmail.com',
}
FormPage().fill_form(data).submit()
driver.quit()
Actually you have two warnings and one error:
1 - "Too broad exception clause": this is a warning telling you that you should catch specific exceptions, not all of them. Your except line should look like except [TheExceptionYouAreHandling]:, for example except ValueError:. However, this does not stop your code from running.
2 - "Method submit may be static": this is a warning telling you that submit could be a static method (basically, a method that doesn't use self). To suppress this warning you can use the @staticmethod decorator like this:
@staticmethod
def submit():
    ...
3 - "SyntaxError: invalid syntax": this is what stops your code from running. It is an error telling you that something in your code is written incorrectly. I think it may be the indentation of your class. Try this:
i=0
while i<1:
    try:
        driver.find_element_by_xpath('//*[@title="xml (Open in a new window)"]').click()
    except:
        i=1
#Once the download link is clicked this has to fill the form for submission which will download the file
class FormPage(object):
    def fill_form(self, data):
        driver.find_element_by_xpath('//input[@type = "radio" and @value = "Non-commercial"]').click()
        driver.find_element_by_xpath('//input[@type = "checkbox" and @value = "R&D"]').click()
        driver.find_element_by_xpath('//input[@name = "name_d"]').send_keys(data['name_d'])
        driver.find_element_by_xpath('//input[@name = "mail_d"]').send_keys(data['mail_d'])
        return self
    def submit(self):
        driver.find_element_by_xpath('//input[@value = "Submit"]').click()
data = {
    'name_d': 'abc',
    'mail_d': 'xyz@gmail.com',
}
FormPage().fill_form(data).submit()
driver.quit()
One more thing: these are simple errors and warnings, and you should be able to fix them yourself by carefully reading what each message says. I also recommend reading about exceptions.
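The first two points can be seen together in a small example that is independent of Selenium. The Parser class and its to_int method are made up for illustration: the method never touches self, so it is declared static, and it catches only the exception it expects. In the question's loop, the specific exception to catch would most likely be NoSuchElementException from selenium.common.exceptions.

```python
class Parser:
    @staticmethod
    def to_int(text):
        """Convert text to an int, or return None if it is not a number."""
        try:
            return int(text)
        except ValueError:
            # Only a failed conversion is handled; any other error still surfaces.
            return None


print(Parser.to_int("42"))
print(Parser.to_int("abc"))
```

A bare except: would also swallow unrelated problems such as KeyboardInterrupt, which is why linters flag it.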

Python Selenium: Unable to Find Element After First Refresh

I've seen a few instances of this question, but I was not sure how to apply the changes to my particular situation. I have code that monitors a webpage for changes and refreshes every 30 seconds, as follows:
import sys
import ctypes
from time import sleep
from Checker import Checker
USERNAME = sys.argv[1]
PASSWORD = sys.argv[2]
def main():
    crawler = Checker()
    crawler.login(USERNAME, PASSWORD)
    crawler.click_data()
    crawler.view_page()
    while crawler.check_page():
        crawler.wait_for_table()
        crawler.refresh()
    ctypes.windll.user32.MessageBoxW(0, "A change has been made!", "Attention", 1)
if __name__ == "__main__":
    main()
The problem is that Selenium will always show an error stating it is unable to locate the element after the first refresh has been made. The element in question, I suspect, is a table from which I retrieve data using the following function:
def get_data_cells(self):
    contents = []
    table_id = "table.datadisplaytable:nth-child(4)"
    table = self.driver.find_element(By.CSS_SELECTOR, table_id)
    cells = table.find_elements_by_tag_name('td')
    for cell in cells:
        contents.append(cell.text)
    return contents
I can't tell if the issue is in the above function or in the main(). What's an easy way to get Selenium to refresh the page without returning such an error?
Update:
I've added a wait function and adjusted the main() function accordingly:
def wait_for_table(self):
    table_selector = "table.datadisplaytable:nth-child(4)"
    delay = 60
    try:
        wait = ui.WebDriverWait(self.driver, delay)
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, table_selector)))
    # WebDriverWait raises selenium.common.exceptions.TimeoutException, not the built-in TimeoutError
    except TimeoutException:
        print("Operation timeout! The requested element never loaded.")
Since the same error still occurs, either my wait function is not working properly or this is not a timing issue.
I've run into the same issue while doing web scraping before and found that re-sending the GET request (instead of refreshing) seemed to eliminate it.
It's not very elegant, but it worked for me.
I appear to have fixed my own problem.
My refresh() function was written as follows:
def refresh(self):
    self.driver.refresh()
All I did was switch frames right after the driver.refresh() call. That is:
def refresh(self):
    self.driver.refresh()
    self.driver.switch_to.frame("content")
This took care of it. Refreshing resets the driver to the top-level document, so the table inside the "content" frame could not be found until I switched back into the frame. The page now refreshes without issues.
