I have to validate that a web application, when executed in the client browser, is fetching some assets (*.js) from a particular remote server.
Say two options exist: it either gets the script from server A or it gets a copy from server B. I need to assert (based on some preconditions) that the script was downloaded from server A.
The question: is there a way to inspect the source URL of loaded JavaScript using Selenium (preferably with Python)?
Here is a possible solution that extracts the URLs of the JavaScript libraries loaded by the Stack Overflow site. You should adapt it to the site you are working on.
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://stackoverflow.com/")
scripts = driver.find_elements_by_tag_name('script')
for script in scripts:
    print(script.get_attribute("src"))
Example of output:
http://rules.quantcount.com/rules-p-c1rF4kxgLUzNc.js
http://edge.quantserve.com/quant.js
http://b.scorecardresearch.com/beacon.js
https://www.google-analytics.com/analytics.js
https://ajax.googleapis.com/ajax/libs/jquery/1.12.4/jquery.min.js
https://cdn.sstatic.net/Js/stub.en.js?v=9798373e8e81
There are various strategies to locate elements in a page.
You can use the most appropriate one for your case (http://selenium-python.readthedocs.io/locating-elements.html)
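To answer the original question directly (asserting which server a script was downloaded from), a small sketch along these lines should work; the host name server-a.example.com is just a placeholder:
# Collect the src of every external <script> and assert at least one comes
# from the expected server; "server-a.example.com" is a placeholder host.
scripts = driver.find_elements_by_css_selector("script[src]")
sources = [s.get_attribute("src") for s in scripts]
assert any("server-a.example.com" in src for src in sources), sources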
Related
I want to write a program that returns the current or most recently visited URL in my browser on my computer (Windows 10). Is there any way I can get that URL?
I tried using Python and SQLite to access the Chrome history database at C:\Users\%USERNAME%\AppData\Local\Google\Chrome\User Data\Default\History and it worked, but while I'm using the browser the database gets locked.
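Roughly, the kind of query I mean looks like this (the urls table and last_visit_time column are my understanding of Chrome's schema):
import os
import sqlite3

# Chrome's History file is a SQLite database; this fails with "database is
# locked" while Chrome is running, which is exactly the problem.
history = os.path.expandvars(
    r"C:\Users\%USERNAME%\AppData\Local\Google\Chrome\User Data\Default\History")
con = sqlite3.connect(history)
row = con.execute("SELECT url FROM urls ORDER BY last_visit_time DESC LIMIT 1").fetchone()
print(row[0] if row else "no history found")
con.close()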
I know that by using Wireshark one can see the packets when accessing a URL, but I cannot find the complete URL in those packet fields, only the server name (i.e. stackoverflow.com).
I'd like to know whether there is a way to capture that information the way Wireshark does, but getting only the complete URL and nothing else. Thank you!
I found a solution to this by using mitmproxy: https://mitmproxy.org/. This video on YouTube helped me with the installation and setup process: https://www.youtube.com/watch?v=7BXsaU42yok. The video explains the installation on Mac, but it's not so different from Windows. Then you can use Python to capture and process the URLs contained within the HTTPS requests by using the flow.request.pretty_url property: https://docs.mitmproxy.org/stable/addons-scripting/.
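For reference, a minimal addon sketch along those lines (the file name url_logger.py is just an example name):
# url_logger.py -- run with: mitmdump -s url_logger.py
from mitmproxy import http

def request(flow: http.HTTPFlow) -> None:
    # pretty_url gives the full URL, including the host, even for HTTPS requests
    print(flow.request.pretty_url)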
I am scraping website data using the Selenium package in Python, with functions like:
driver.find_element_by_ ... (...)
I want to test this code on the Travis CI platform, where it has no access to a browser or the network.
How can I test this code? It uses the following methods to retrieve data:
driver.get(url)
driver.find_element()
Because it processes the HTML internally, I can't feed static HTML data directly to it for testing. Can you suggest the best way to do this?
If you can put the web page on the same machine as your test script, Selenium can open a local HTML file with the file: protocol, like this:
driver.get('file:///C:/Users/xxxxx/Desktop/test.html')
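A minimal sketch of how that can look, assuming you have saved a copy of the page as a local fixture (the path tests/fixtures/test.html is just an example):
from pathlib import Path
from selenium import webdriver

# Build a file:// URL from a local copy of the page so no network is needed
fixture = Path("tests/fixtures/test.html").resolve()
driver = webdriver.Firefox()
driver.get(fixture.as_uri())
print(driver.find_element_by_tag_name("h1").text)
driver.quit()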
I'm working on a selenium test and I need to get a chrome extension from the chrome app store to use in the test. Right now it's a manual process to update to a newer version of the extension.
Current Flow:
1. Manually download the extension through a Chrome extension downloader.
2. Store the .crx file in a location visible to the selenium test.
3. Execute test with that extension.
I was hoping that Google had an API that could be hit to download the extension, but I've been unable to find anything to that effect. Has anyone run into a situation like this and been able to solve it?
Basically you just have to capture the redirect URL and then request that.
In Python:
import requests

# plugin_id is the ID at the end of the URL on the extension's Web Store page.
plugin_id = "your_extension_id_here"
# Assumed Chrome Web Store update endpoint; it answers with a redirect to the .crx.
url = "https://clients2.google.com/service/update2/crx"
blah = requests.get(url, params={'prodversion': '57.0', 'x': "id=" + plugin_id,
                                 'response': 'redirect'}, verify=False, stream=True)
blahFile = requests.get(blah.url)
with open("yourExtension.crx", 'wb') as extension:
    extension.write(blahFile.content)
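As a follow-up, the downloaded .crx can then be loaded into the browser session under test via the standard ChromeOptions API; a short usage sketch:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Load the previously downloaded extension into the Chrome session under test
options = Options()
options.add_extension("yourExtension.crx")
driver = webdriver.Chrome(options=options)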
I'm using Selenium + Python + ChromeDriver to test a web application. The application contains tables with data that can be sorted using various embedded filters. The problem is that after the first test executes, the application saves its current state (such as which table page is open and which sorting method is applied) in the browser's local storage, so when the next test starts the data appears already filtered... But I need the default data filters for each test, so I need to set default key:value pairs or clear the storage before each test case. I found this solution
driver.get('javascript:localStorage.clear();')
but get
selenium.common.exceptions.WebDriverException: Message: unknown error:unsupported protocol
How can I manage (change or clear) Chrome local storage using Selenium?
You should execute the script instead:
driver.execute_script('window.localStorage.clear();')
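If you also need to seed default key:value pairs rather than only clearing, execute_script accepts extra arguments; a small sketch (the key name tableFilters is made up for illustration):
def reset_local_storage(driver):
    # Wipe everything the application stored between tests
    driver.execute_script("window.localStorage.clear();")
    # Optionally seed a default entry instead; key and value here are placeholders
    driver.execute_script("window.localStorage.setItem(arguments[0], arguments[1]);",
                          "tableFilters", "default")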
I'm looking to embark on a personal project: creating an application with which I can save docs/text/images from the site my browser is currently on. I have done a lot of research and concluded that two approaches are possible for now: using cookies, or using a packet sniffer to identify the IP address (the packet sniffer method being more relevant at the moment).
I would like to automate the application so that I don't have to copy the URL from my browser and paste it into a script that uses urllib.
Are there any suggestions that experienced network programmers can provide with regards to the process or modules or libraries I need?
thanks so much
jonathan
If you want to download all images, docs, and text while you're actively browsing (which is probably a bad idea considering the sheer amount of bandwidth), then you'll want something more than urllib2. I assume you don't want to have to keep copying and pasting all the URLs into a script to download everything; if that is not the case, a simple urllib2 and BeautifulSoup filter would do wonders.
However, if my assumption is correct, then you are probably going to want to investigate Selenium. From there you can launch a Selenium-controlled browser window (it defaults to Firefox) and do your browsing normally. The best option then is to continually poll the current URL; if it has changed, identify all of the elements you want to download and use urllib2 to fetch them. Since I don't know exactly what you want to download, I can't help much with that part, but here is what something like that would look like in Selenium:
from selenium import webdriver
from time import sleep

# Start up the web browser
browser = webdriver.Firefox()
current_url = browser.current_url

while True:
    try:
        # If the URL has changed, identify and download your items
        if browser.current_url != current_url:
            # Download the stuff here
            current_url = browser.current_url
    except Exception:
        # Triggered once you close the web browser
        break
    # Sleep for half a second to avoid demolishing your machine with constant polling
    sleep(0.5)
Once again I advise against doing this, as constantly downloading images, text, and documents would take up a huge amount of space.