Using AngularJS (Protractor) with Selenium in Python

I'm trying to select a textarea in an Angular 1 application using Selenium, but it can't be seen in the DOM. There's a module called Pytractor. I've been trying to solve this with it, but I'm unable to use it correctly.
Can anyone help me with this?

You can also use the regular selenium bindings to test AngularJS applications. You would need to use Explicit Waits to wait for elements to appear or disappear, for the title or URL to change, and so on: for any condition that has to hold before you can continue testing the page.
Example (waiting for textarea element to appear):
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
wait.until(EC.visibility_of_element_located((By.TAG_NAME, "textarea")))
There is one important thing that pytractor (as protractor itself) provides - it knows when AngularJS is settled and ready - models are updated, there are no outstanding async requests etc. It doesn't mean you have to use it to test AngularJS applications, but it gives you an advantage.
Additionally, pytractor provides you with new locators, e.g. you can find an element by model or binding. It also doesn't mean you cannot find the same element using other location techniques which regular selenium python provides out-of-the-box.
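As a rough sketch of that last point, an element bound with ng-model can usually be reached with an ordinary CSS attribute selector. The helper name model_locator and the model name vm.comment are hypothetical, and the sketch assumes the template writes the attribute as plain ng-model (not data-ng-model or another prefixed variant).

```python
def model_locator(model, tag="*"):
    """Build a (strategy, selector) pair matching elements bound to an ng-model."""
    # "css selector" is the string value behind selenium's By.CSS_SELECTOR.
    return ("css selector", "{}[ng-model='{}']".format(tag, model))


# Usage with a live driver (not executed here; vm.comment is a made-up model name):
# textarea = driver.find_element(*model_locator("vm.comment", tag="textarea"))
```

The pair can also be passed to an explicit wait, e.g. EC.visibility_of_element_located(model_locator(...)), since expected conditions take the same (strategy, selector) tuple.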
Note that pytractor is not actively developed and maintained at the moment.

You may just mine protractor for useful code snippets. I personally use this function that blocks until Angular is done rendering the page.
def wait_for_angular(self, selenium_driver):
    selenium_driver.set_script_timeout(10)
    selenium_driver.execute_async_script("""
        callback = arguments[arguments.length - 1];
        angular.element('html').injector().get('$browser').notifyWhenNoOutstandingRequests(callback);""")
Replace 'html' with whatever element carries your 'ng-app'.
Solution for Angular 1 comes from protractor/lib/clientsidescripts.js#L51. It could also be possible to adapt the solution to Angular 2 using this updated code

Related

How to use selenium to scrape data that is generated on the website?

I am writing Python code in which I need to generate an SHA hash. To do this I use an online SHA generator. I send the input (the data to be hashed) through Selenium, which works. However, I am then unable to read the generated output (the text string). I use the find_element_by_xpath function to get this data, but it only returns an empty string. I don't understand what I am doing wrong. Can somebody tell me how I can do this? Or is there any other method, besides using Selenium, to achieve this?
I have used the following code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
def driver_init():
    driver = webdriver.Chrome(executable_path='chromedriver.exe')
    driver.wait = WebDriverWait(driver, 5)
    return driver
def get_data(driver, val):
    driver.get('https://passwordsgenerator.net/sha1-hash-generator/')
    xpath = '//*[@id="txt1"]'
    box = driver.find_element_by_xpath(xpath)
    box.send_keys(val)
    element = driver.wait.until(EC.presence_of_element_located((By.XPATH, '//*[@id="txt2"]')))
    return element.text
driver = driver_init()
w = get_data(driver, '100011')
The reason you're not seeing that text in your variable is that its value is being set by scripts running on the page. You can read the element's current value with element.get_attribute('value').
A word of caution though: these types of hashes are generally used for security purposes, and unless you strongly trust this website, I would suggest using a local solution like an SSL (openSSL, cryptography, etc.) library to resolve these hashes instead of sending requests over the web. If something were to happen to this site's server or if it was taken over by a malicious actor, they could modify the site to send them all your (previously) secure data.
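A minimal sketch of the local approach, using Python's standard hashlib module instead of the website. It assumes the input should be hashed as UTF-8 text; the helper name sha1_hex is only illustrative.

```python
import hashlib


def sha1_hex(text):
    """Return the SHA-1 digest of a string as lowercase hex, computed locally."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Same input as in the question; nothing leaves the machine.
print(sha1_hex("100011"))
```

Whether the result matches the website exactly depends on how the site encodes its input, so it is worth comparing one known value before relying on it.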

Decoding Class names on facebook through Selenium

I noticed that facebook has some weird class names that look computer generated. What I don't know is if these classes are at least constant over time or they change in some time interval? Maybe someone who has experience with that can answer. Only thing I can see is that when I exit Chrome and open it again it is still the same, so at least they don't change every browser session.
So I'd guess the best way to go about scraping facebook would be to use some elements in user interface and assume structure is always the same, like for example to get address from About section something like this:
from selenium import webdriver
driver = webdriver.Chrome("C:/chromedriver.exe")
driver.get("https://www.facebook.com/pg/Burma-Superstar-620442791345784/about/?ref=page_internal")
# wait some time
address_elements = driver.find_elements_by_xpath("//span[text()='FIND US']/../following-sibling::div//button[text()='Get Directions']/../../preceding-sibling::div[1]/div/span")
for item in address_elements:
    print(item.text)
You are largely correct. Facebook is built with ReactJS, which is evident from the presence of the following keywords and tags within the HTML DOM:
{"react_render":true,"reflow":true}
<!-- react-mount-point-unstable -->
["React-prod"]
["ReactDOM-prod"]
ReactComposerTaggerType:{r:["t5r69"],be:1}
So the dynamically generated class names are bound to change from time to time.
Solution
The solution would be to use the static attributes to construct a dynamic Locator Strategy.
To retrieve the first line of the address just below the text FIND US, induce WebDriverWait with the expected_conditions method visibility_of_element_located() and read the text of the element it returns:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[normalize-space()='FIND US']//following::span[2]"))).text)
References
You can find some relevant discussions in:
Logging Facebook using selenium
Why Selenium driver fail to recognize ID element of Facebook login page?
Outro
Note: Scraping Facebook violates section 3.2.3 of their Terms of Service, and your account may be questioned or even restricted ("Facebook Jail"). Use the Facebook Graph API instead.

Selenium download entire html

I have been trying to use selenium to scrape an entire web page. I expect at least a handful of them are SPAs built with Angular, React, or Vue, which is why I am using Selenium.
I need to download the entire page (if some content isn't loaded from lazy loading because of not scrolling down that is fine). I have tried setting a time.sleep() delay, but that has not worked. After I get the page I am looking to hash it and store it in a db to compare later and check to see if the content has changed. Currently the hash is different every time and that is because selenium is not downloading the entire page, each time a different partial amount is missing. I have confirmed this on several web pages not just a singular one.
I also have probably a 1000+ web pages to go through by hand just getting all the links so I do not have time to find an element on them to make sure it is loaded.
How long this process takes is not important. If it takes 1+ hours so be it, speed is not important only accuracy.
If you have an alternative idea please also share.
My driver declaration
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
driverPath = '/usr/lib/chromium-browser/chromedriver'
def create_web_driver():
    options = webdriver.ChromeOptions()
    options.add_argument('headless')
    # set the window size
    options.add_argument('window-size=1200x600')
    # try to initialize the driver
    try:
        driver = webdriver.Chrome(executable_path=driverPath, chrome_options=options)
    except WebDriverException:
        print("failed to start driver at path: " + driverPath)
    return driver
My URL call (my timeout = 20):
driver.get(url)
time.sleep(timeout)
content = driver.page_source
content = content.encode('utf-8')
hashed_content = hashlib.sha512(content).hexdigest()
^ getting different hash here every time since same url not producing same web page
As the applications under test (AUT) are based on Angular, React, or Vue, Selenium seems to be the right choice.
Since you are fine with lazily loaded content being missing when the page is not scrolled, the use case is feasible. However, not having time to find an element on each page to make sure it is loaded can't really be compensated for with time.sleep(), as time.sleep() has certain drawbacks. You can find a detailed discussion in How to sleep webdriver in python for milliseconds. It is also worth mentioning that the state of the HTML DOM will be different for each of the 1000-odd web pages.
Solution
A couple of viable solutions:
A potential solution would be to induce WebDriverWait and ensure that some HTML elements are loaded, as per the discussion How can I make sure if some HTML elements are loaded for Selenium + Python?, validating at least one of the following:
Page Title
Page Heading
Another solution would be to tweak the pageLoadStrategy capability. You can set pageLoadStrategy to the same value for all 1000-odd web pages, choosing one of:
normal (full page load)
eager (interactive)
none
You can find a detailed discussion in How to make Selenium not wait till full page load, which has a slow script?
If you set pageLoadStrategy, page_source will be read at the same point in each page load, and you may then see identical hashed_content.
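A hedged sketch of setting that capability from Python. The helper apply_page_load_strategy is hypothetical; it relies only on the set_capability(name, value) method that selenium's Options classes provide, and on the three strategy values listed above.

```python
VALID_STRATEGIES = ("normal", "eager", "none")


def apply_page_load_strategy(options, strategy="eager"):
    """Set the pageLoadStrategy capability on a selenium Options object."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError("unknown pageLoadStrategy: {!r}".format(strategy))
    # The capability decides the point at which driver.get() returns.
    options.set_capability("pageLoadStrategy", strategy)
    return options


# Usage with a real browser (not executed here):
# from selenium import webdriver
# options = apply_page_load_strategy(webdriver.ChromeOptions(), "normal")
# driver = webdriver.Chrome(options=options)
```

Note that none of the strategies waits for content a framework renders after the load event, so this alone may not make the hashes stable on SPA pages.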
In my experience time.sleep() does not work well with dynamic loading times.
If the page is javascript-heavy you have to use the WebDriverWait clause.
Something like this:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get(url)
element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "[my-attribute='my-value']")))
Change 10 to whatever timeout you want, and By.CSS_SELECTOR and its value to whatever type you want to use as a reference for a locator.
You can also wrap the WebDriverWait around a Try/Except statement with the TimeoutException exception, which you can get from the submodule selenium.common.exceptions in case you want to set a hard limit.
You can probably set it inside a while loop if you truly want it to check forever until the page's loaded, because I couldn't find any reference in the docs about waiting "forever", but you'll have to experiment with it.
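One way to sketch that loop without choosing an element per page is to poll page_source until it stops changing and hash it only then. This is an assumption-laden sketch, not a guaranteed fix: the function name stable_page_hash and its parameters are hypothetical, and pages that embed timestamps, nonces, or rotating ads may never settle, in which case it raises TimeoutError.

```python
import hashlib
import time


def stable_page_hash(driver, interval=1.0, stable_reads=3, timeout=60.0):
    """Hash driver.page_source once it is identical on several consecutive reads."""
    deadline = time.monotonic() + timeout
    previous = driver.page_source
    unchanged = 1  # consecutive reads that returned the same source
    while unchanged < stable_reads:
        if time.monotonic() > deadline:
            raise TimeoutError("page source did not settle within the timeout")
        time.sleep(interval)
        current = driver.page_source
        if current == previous:
            unchanged += 1
        else:
            # The DOM is still changing, so restart the count from this snapshot.
            previous = current
            unchanged = 1
    return hashlib.sha512(previous.encode("utf-8")).hexdigest()


# Usage with a live driver (not executed here):
# driver.get(url)
# hashed_content = stable_page_hash(driver, interval=2.0, stable_reads=3, timeout=120.0)
```

Since speed is not a concern in the question, a generous interval and timeout trade time for a better chance that the source has really settled.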

Python/Selenium:Different ways to click this specific button

I am trying to understand Python in general, as I just switched over from using VBA. I am interested in the possible ways you could approach this single issue. I already went around it by just going to the link directly, but I need to understand it and apply it here.
from selenium import webdriver
chromedriver = r'C:\Users\dd\Desktop\chromedriver.exe'
browser = webdriver.Chrome(chromedriver)
url = 'https://www.fake.com/'
browser.get(url)
browser.find_element_by_id('txtLoginUserName').send_keys("Hello")
browser.find_element_by_id('txtLoginPassword').send_keys("There")
browser.find_element_by_id('btnLogin').click()
At this point, I am trying to navigate to a particular button/link.
Here is the info from the page/element
T-Mobile
Here are some of the things I tried:
for elem in browser.find_elements_by_xpath("//*[contains(text(), 'T-Mobile')]"):
    elem.click
browser.execute_script("InitiateCallBack(187, True, T-Mobile, https://www.fake.com/, TMobile)")
I also attempted to look for tags and use css selector all of which I deleted out of frustration!
Specific questions
How do I utilize the innertext,"T-Mobile", to click the button?
How would I execute the onclick event?
I've tried to read the following links, but still have not succeeded in coming up with a different way. Part of it is probably because I don't understand the specific syntax yet. These are just some of the things I looked at. I spent about 3 hours trying various things before I came here!
selenium python onclick() gives StaleElementReferenceException
http://selenium-python.readthedocs.io/locating-elements.html
Python: Selenium to simulate onclick
https://stackoverflow.com/questions/43531654/simulate-a-onclick-with-selenium-
https://stackoverflow.com/questions/45360707/python-selenium-using-onclick
Running javascript in Selenium using Python
How do I utilize the innertext,"T-Mobile", to click the button?
find_elements_by_link_text would be appropriate for this case.
elements = driver.find_elements_by_link_text('T-Mobile')
for elem in elements:
    elem.click()
There's also a by_partial_link_text locator as well if you don't have the full exact text.
How would I execute the onclick event?
The simplest way would be to simply call .click() on the element as shown above and the event should, naturally, execute at that time.
Alternatively, you can retrieve the onclick attribute and use driver.execute_script to run the js.
for elem in elements:
    script = elem.get_attribute('onclick')
    driver.execute_script(script)
Edit:
note that in your code you did element.click -- this does nothing. element.click() (note the parens) calls the click method.
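For anyone coming from VBA, the difference can be seen without a browser. In this small sketch FakeElement is a hypothetical stand-in that only counts clicks; it is not a Selenium class.

```python
class FakeElement:
    """Hypothetical stand-in for a WebElement; it only counts clicks."""

    def __init__(self):
        self.clicks = 0

    def click(self):
        self.clicks += 1


elem = FakeElement()
elem.click                           # evaluates to the bound method and discards it
clicks_without_parens = elem.clicks  # still 0: nothing was called
elem.click()                         # the parentheses actually call the method
clicks_with_parens = elem.clicks     # now 1
```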
is there a way to utilize browser.execute_script() for the onclick event
execute_script can fire the equivalent event, but there may be more listeners that you miss by doing this. Using the element click method is the most sound. There may very well be many implementation details of the site that may hinder your automation efforts, but those possibilities are endless. Without seeing the actual context, it's hard to say.
You can use JS methods to click an element or otherwise interact with the page, but you may miss certain event listeners that occur when using the site 'normally'; you want to emulate, more or less, the normal use as closely as possible.
As per the HTML you have shared, the website clearly relies on JavaScript. So to click() the link with the text T-Mobile, induce WebDriverWait with the expected_conditions clause element_to_be_clickable. You can use the following code block:
WebDriverWait(driver, 20).until(expected_conditions.element_to_be_clickable((By.XPATH, "//a[contains(.,'T-Mobile')]"))).click()
you can use it
<div class="button c_button s_button" onclick="submitForm('rMTF')" style="margin-bottom: 30px;">
<input class="v_small" type="button"></input>
<span>
Reset
</span>

How do I click a button in a form using Selenium and Python 2.7?

I am trying to create a Python program that periodically checks a website for a specific update. The site is secured and multiple clicks are required to get to the page that I want to monitor. Unfortunately, I am stuck trying to figure out how to click a specific button. Here is the button code:
<input type="button" class="bluebutton" name="manageAptm" value="Manage Interview Appointment" onclick="javascript:programAction('existingApplication', '0');">
I have tried numerous ways to access the button and always get "selenium.common.exceptions.NoSuchElementException:" error. The obvious approach to access the button is XPath and using the Chrome X-Path Helper tool, I get the following:
/html/body/form/table[@class='appgridbg mainContent']/tbody/tr/td[2]/div[@class='maincontainer']/div[@class='appcontent'][1]/table[@class='colorgrid']/tbody/tr[@class='gridItem']/td[6]/input[@class='bluebutton']
If I include the above as follows:
browser.find_element_by_xpath("/html/body/form/table[@class='appgridbg mainContent']/tbody/tr/td[2]/div[@class='maincontainer']/div[@class='appcontent'][1]/table[@class='colorgrid']/tbody/tr[@class='gridItem']/td[6]/input[@class='bluebutton']").submit()
I still get the NoSuchElementException error.
I am new to selenium and so there could be something obvious that I am missing; however, after much Googling, I have not found an obvious solution.
On another note, I have also tried find_element_by_name('manageAptm') and find_element_by_class_name('bluebutton') and both give the same error.
Can someone advise on how I can effectively click this button with Selenium?
Thank you!
Following your attempts and @har07's comment, find_element_by_name('manageAptm') should work, but the element might not be immediately available and you may need to wait for it:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
manageAppointment = wait.until(EC.presence_of_element_located((By.NAME, "manageAptm")))
manageAppointment.click()
Also, check if the element is inside an iframe or not. If yes, you would need to switch into the context of it and only then issue the "find" command:
driver.switch_to.frame("frame_name_or_id")
driver.find_element_by_name('manageAptm').click()
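If it is not known in advance which frame holds the button, the check can be sketched as a small search over the top document and its iframes. The helper find_in_frames is hypothetical; it uses only driver.find_elements, driver.switch_to.frame and driver.switch_to.default_content, and it assumes the element is at most one frame level deep.

```python
def find_in_frames(driver, by, value):
    """Return the first match in the top document or a top-level iframe, else None.

    On success the driver is left switched into the frame that holds the element,
    so the returned element can be clicked right away.
    """
    driver.switch_to.default_content()
    matches = driver.find_elements(by, value)
    if matches:
        return matches[0]
    # "tag name" is the string value behind selenium's By.TAG_NAME.
    for frame in driver.find_elements("tag name", "iframe"):
        driver.switch_to.frame(frame)
        matches = driver.find_elements(by, value)
        if matches:
            return matches[0]
        driver.switch_to.default_content()
    return None


# Usage with a live driver (not executed here; "name" is the value of By.NAME):
# button = find_in_frames(driver, "name", "manageAptm")
# if button is not None:
#     button.click()
```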
