how do I use an existing profile in-place with Selenium Webdriver?

I am trying to do what the title says, using Python's Selenium Webdriver, with an existing Firefox profile living in <PROFILE-DIR>.
What I've tried
used the options.profile option on the driver:
#!/usr/bin/env python
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
from selenium.webdriver import Firefox, DesiredCapabilities
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.firefox.options import Options
options = Options()
options.profile = '<PROFILE_DIR>'
webdriver = Firefox(options=options)
This copies the existing profile to a temporary location. I can see it works because the new session I start has access to the profile's old cookies, etc. But it is not what I am after: I want to use the profile in-place.
tried to pass the profile as '--profile ' to the args capability: change the code above to
capabilities = DesiredCapabilities.FIREFOX.copy()
capabilities['args'] = '--profile <PROFILE-DIR>'
webdriver = Firefox(desired_capabilities=capabilities)
Nothing doing: viewing the geckodriver.log after closing the session still shows me something like Running command: "/usr/bin/firefox" "--marionette" "-foreground" "-no-remote" "-profile" "/tmp/rust_mozprofileOFKY46", i.e. still using a temporary profile (which is not even a copy of the existing one; the old cookies are not there).
tried the same with capabilities['args'] = ['-profile', '<PROFILE-DIR>'] instead (i.e. a list of strings instead of a single string); same result.
read a bunch of other SO posts, none of which do it. This is mostly because they're specific to Chrome (to whose driver you can apparently pass command-line options; I haven't seen something like this for geckodriver), or because they fall back to copies of existing profiles.
The most relevant answer in this direction implements essentially the same hack I'd thought of, in a pinch:
start the driver with a copy of your existing profile using options.profile as described above;
close the driver instance manually when done (e.g. with a Ctrl+C, or SIGINT) so that the temp-profile directory isn't deleted;
copy over all of the stuff that's left on top of the existing profile, giving you access to whatever leftovers you wanted from the automated session.
This is ugly and feels unnecessary. Besides, geckodriver's failure to remove temporary profiles (which I'd be relying on) is considered a bug.
Surely I'm misunderstanding how to pass those capability options mentioned above, or some such thing. But the docs could do a better job of giving examples.
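For reference, geckodriver's capability documentation describes passing -profile and the directory as two separate entries in the args list of moz:firefoxOptions (not as a top-level args capability) so that Firefox picks up an existing profile on the local filesystem. A minimal sketch of that route, assuming Selenium 4 and a geckodriver/Firefox pair that honours it; the helper names profile_args and start_firefox_in_place are hypothetical, and behaviour may differ between versions:

```python
def profile_args(profile_dir):
    """Return the Firefox command-line arguments that select a profile directory."""
    # Two separate list entries, as on a real command line: -profile <dir>
    return ["-profile", str(profile_dir)]


def start_firefox_in_place(profile_dir):
    """Start Firefox on profile_dir without asking Selenium to copy it (sketch)."""
    # Imported lazily so that profile_args() can be used without Selenium installed.
    from selenium.webdriver import Firefox
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for arg in profile_args(profile_dir):
        # add_argument() appends to the "args" list inside moz:firefoxOptions
        options.add_argument(arg)
    return Firefox(options=options)
```

Firefox locks a profile while it is in use, so the profile should not be open in another Firefox instance at the same time.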

I've come up with a solution that allows the user to use a Firefox profile in place, by passing the profile path dynamically via an environment variable to Geckodriver.
I started from Geckodriver 0.32.0 and changed it so that you simply need to provide the Firefox profile directory via the environment variable FIREFOX_PROFILE_DIR.
The code change is in src/browser.rs, line 88, replacing:
let mut profile = match options.profile {
    ProfileType::Named => None,
    ProfileType::Path(x) => Some(x),
    ProfileType::Temporary => Some(Profile::new(profile_root)?),
};
with:
let mut profile = if let Ok(profile_dir) = std::env::var("FIREFOX_PROFILE_DIR") {
    Some(Profile::new_from_path(Path::new(&profile_dir))?)
} else {
    match options.profile {
        ProfileType::Named => None,
        ProfileType::Path(x) => Some(x),
        ProfileType::Temporary => Some(Profile::new(profile_root)?),
    }
};
You may refer to my Git commit to see the diff against the original Geckodriver code.
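With such a patched build, the variable only has to be present in the environment of the geckodriver process. A sketch of one way to arrange that from Python, assuming Selenium 4 (whose Firefox Service accepts an env mapping) and a hypothetical path to the patched binary. FIREFOX_PROFILE_DIR is only understood by the patched geckodriver described above, not by stock builds, and the helper names are made up for illustration:

```python
import os


def geckodriver_env(profile_dir, base_env=None):
    """Return a copy of the environment with FIREFOX_PROFILE_DIR set."""
    # Copy so that the current process environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["FIREFOX_PROFILE_DIR"] = str(profile_dir)
    return env


def start_with_patched_geckodriver(geckodriver_path, profile_dir):
    """Launch Firefox through the patched geckodriver (sketch, not verified here)."""
    # Imported lazily so that geckodriver_env() works without Selenium installed.
    from selenium.webdriver import Firefox
    from selenium.webdriver.firefox.service import Service

    service = Service(executable_path=geckodriver_path,
                      env=geckodriver_env(profile_dir))
    return Firefox(service=service)
```

Passing the variable through the service keeps it scoped to the driver process instead of exporting it for the whole shell session.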

Related

Selenium | Loading multiple extensions into Selenium's ChromeDriver, only receiving the last defined

I'm currently attempting to load two extensions into Selenium's ChromeDriver. Ublock Origin and Ghostery. Looking online, the solution for this tends to just be as simple as adding an argument for each extension. However, when I attempt to add these two arguments, it will only load the second defined extension and ignore the first.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
PATH = r"C:\Program Files (x86)\chromedriver.exe"
ublock = r'C:\Users\Senuvox\PycharmProjects\Projects\SeleniumExtensions\1.35.2_0'
ghostery = r'C:\Users\Senuvox\PycharmProjects\Projects\SeleniumExtensions\8.5.5_0'
options = Options()
options.add_argument('load-extension=' + ghostery)
options.add_argument('load-extension=' + ublock)
options.add_argument('--ignore-ssl-errors=yes')
options.add_argument('--ignore-certificate-errors')
driver = webdriver.Chrome(PATH, options=options)
driver.create_options()
driver.get("http://www.google.com")
When one runs this script, as stated before, only Ublock will load into the Selenium chrome browser. Similarly, if I swap the order so Ublock is first and Ghostery is second, only Ghostery will load.
Additionally I have attempted to one-line it by adding commas in-between the two extension variables. Unfortunately, this provides an error as add_argument takes only two positional arguments.
Any insights on how I might solve this would be greatly appreciated!

# Use the .crx file path of each extension
options.add_extension('C:\\Users\\ublock-1.35.2_0.crx')
options.add_extension('C:\\Users\\ghostery-8.5.5_0.crx')
useful resource
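If the extensions are unpacked directories rather than .crx files, another commonly used route is a single load-extension argument. When the switch is repeated, only the last value is kept, whereas one switch with a comma-separated list of directories covers all of them. A minimal sketch; the helper name load_extension_argument is hypothetical, and whether the switch is honoured depends on the Chrome version:

```python
def load_extension_argument(extension_dirs):
    """Build one --load-extension switch covering every unpacked extension."""
    dirs = [str(d) for d in extension_dirs]
    if not dirs:
        raise ValueError("at least one extension directory is required")
    # Chrome expects a comma-separated list in a single switch.
    return "--load-extension=" + ",".join(dirs)


# Directory names taken from the question above.
ublock = r"C:\Users\Senuvox\PycharmProjects\Projects\SeleniumExtensions\1.35.2_0"
ghostery = r"C:\Users\Senuvox\PycharmProjects\Projects\SeleniumExtensions\8.5.5_0"
argument = load_extension_argument([ublock, ghostery])
# Then pass it once: options.add_argument(argument)
```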

Python Selenium Code behaving differently depending on calling mode

I've stumbled on an easy-to-reproduce problem that's driving me crazy.
I'm trying to take a picture of the screen after entering a container code in a website form, using Python and PhantomJS webdriver (selenium).
The code that makes it possible is quite short, and reproducing it in the console works fine. But if this same code is within a function or a script, it doesn't behave the same.
Here is the code, which works for me when I type it line by line in the console:
(Python version 2.7.9, selenium 2.53.6)
> python
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.set_window_size(1280, 1024)
driver.get('http://www.track-trace.com/container')
driver.find_element_by_name('number').clear()
driver.find_element_by_name('number').send_keys('CGMU5109933')
driver.find_element_by_xpath('//input[@name="commit" and @value="Track direct"]').click()
driver.save_screenshot('./x.png')
However, this same code inside a function or a script only behaves the same until it reaches the click.
It freezes while loading, and the screenshot shows just that.
It doesn't matter what type of wait I try, implicit or explicit, the loading button won't end.
Here is the same code inside a function in foo.py module to make trying it easier.
The picture that is taken won't match the picture taken by the above code.
# foo.py
from selenium import webdriver
def try_it():
    driver = webdriver.PhantomJS()
    driver.set_window_size(1280, 1024)
    driver.get('http://www.track-trace.com/container')
    driver.find_element_by_name('number').clear()
    driver.find_element_by_name('number').send_keys('CGMU5109933')
    driver.find_element_by_xpath('//input[@name="commit" and @value="Track direct"]').click()
    driver.save_screenshot('./x.png')
> python
>>> import foo
>>> foo.try_it()
The code must be in a function, as it is called on demand when new search requests arrive at a web service that is integrated in an application.
I always search first for existing questions that may have valid answers, but this time there doesn't seem to be anything similar to my problem.
Any idea why this may be happening and how to avoid it would be very much appreciated. If any other code or clarification is needed don't hesitate to ask.

This is purely a timing issue. Button clicks don't block: once a click is issued, the call returns and Python runs the very next line right away. You need to wait until the next page has finished loading before you can take the screenshot. I would use an explicit wait that blocks until an element you're interested in on the next page has loaded. Likewise, I think you do need to handle the popup asking if you really want to use direct.
My script:
from explicit import waiter
from selenium import webdriver
from selenium.webdriver.common.by import By
def locate_container(driver, container_id):
    url = 'http://www.track-trace.com/container'
    track_direct_xpath = '//input[@name="commit" and @value="Track direct"]'
    im_sure_css = 'div.modal-footer button.jq-directinfo-continue'
    tracking_details_header_css = 'div#wrapper > div.inner > h1'
    # Load the container search page
    driver.get(url)
    # Locate the container field, enter container_id, click direct search button
    waiter.find_write(driver, 'number', container_id, clear_first=True, by=By.NAME)
    waiter.find_element(driver, track_direct_xpath, By.XPATH).click()
    # Locate I'm Sure button and click it
    waiter.find_element(driver, im_sure_css, By.CSS_SELECTOR).click()
    # Wait for the "Tracking details for Container: XXX" header to load
    waiter.find_element(driver, tracking_details_header_css, By.CSS_SELECTOR)
    # Now we know the page has loaded and we can take the screenshot:
    driver.save_screenshot('./x.png')
def main():
    driver = webdriver.PhantomJS()
    try:
        driver.set_window_size(1280, 1024)
        locate_container(driver, 'CGMU5109933')
    finally:
        driver.quit()
if __name__ == "__main__":
    main()
(Full disclosure: I maintain the explicit package, which is meant to simplify using explicit waits. You can replace it with direct waits and get the same effect. To install it, simply run pip install explicit.)
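For readers who prefer not to add a dependency, the idea behind an explicit wait can be sketched in a few lines: poll a condition until it holds or a timeout expires, and only then take the screenshot. The helpers below are hypothetical stand-ins, not the explicit package's API; Selenium's own WebDriverWait(driver, timeout).until(...) offers the same kind of behaviour.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once timeout seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


def screenshot_when_loaded(driver, header_css, path="./x.png"):
    """Block until an element matching header_css exists, then save a screenshot."""
    # "css selector" is the value of Selenium's By.CSS_SELECTOR constant;
    # find_elements returns an empty (falsy) list while nothing matches.
    wait_until(lambda: driver.find_elements("css selector", header_css))
    driver.save_screenshot(path)
```

With the selectors from the script above, a call would look like screenshot_when_loaded(driver, 'div#wrapper > div.inner > h1').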

Selenium Python Firefox webdriver : can't modify profile

I want to use, on a webdriver Firefox instance, the "new tab instead of window" option.
1/ I created a profile with this option on, but when I use the profile a lot of options are OK but not this one.
2/ After loading the profile I tried to change the option in the code, but it doesn't work.
My code :
profile = webdriver.FirefoxProfile(os.path.join(s_path, name))
profile.set_preference("browser.link.open_newwindow.restriction", 0)
profile.set_preference("browser.link.open_newwindow", 3)
profile.set_preference("browser.link.open_external", 3)
profile.set_preference("browser.startup.homepage","http://www.google.fr")
profile.update_preferences()
print(os.path.join(s_path, name))
driver = webdriver.Firefox(set_profile())
All is OK (the start homepage is google.fr) except this option which is not OK.
It seems that Selenium copies the profile to a temp dir, where user.js has the wrong line:
user_pref("browser.link.open_newwindow", 2);
Python 3.4.2, Windows 7, Firefox 39.0, Selenium lib 2.46

From what I've researched, browser.link.open_newwindow is a frozen setting and it's always synced with the value 2. If you dig up the source of the selenium Python bindings, you would find a set of frozen settings that is applied after your custom settings are set.
Note that in the Java bindings this set of default frozen settings is explicitly hardcoded:
/**
 * Profile preferences that are essential to the FirefoxDriver operating correctly. Users are not
 * permitted to override these values.
 */
private static final ImmutableMap<String, Object> FROZEN_PREFERENCES =
    ImmutableMap.<String, Object>builder()
        .put("app.update.auto", false)
        .put("app.update.enabled", false)
        .put("browser.download.manager.showWhenStarting", false)
        .put("browser.EULA.override", true)
        .put("browser.EULA.3.accepted", true)
        .put("browser.link.open_external", 2)
        .put("browser.link.open_newwindow", 2) // here it is
        // ...
And a bit of an explanation, from the issue titled "Firefox only supports windows not tabs":
This is a known issue and unfortunately we will not be supporting tabs.
We force Firefox to open all links in a new window. We can't access
the tabs to know when to switch. When we move to marionette (Mozilla
project) in the future we should be able to do this but for now it is
working as intended
A workaround solution would be to change the target of a link manually - may not work in all of the cases depending on how a new link is opened.

"browser.link.open_newwindow" is a frozen preference, which means that it can't be modified using profile.set_preference("browser.link.open_newwindow", 3)
The solution is to use profile.DEFAULT_PREFERENCES["frozen"]["browser.link.open_newwindow"] = 3 instead. (the other non-frozen preferences can be set with set_preference without problem)
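Why set_preference appears to be ignored can be illustrated with a simplified model of the merge order described above: user preferences are collected first, and the frozen set is applied on top when the profile is written. This is a sketch of the mechanism only, not Selenium's actual code; effective_preferences is a hypothetical name, and the two frozen values are the ones shown in the Java excerpt.

```python
FROZEN = {
    "browser.link.open_external": 2,
    "browser.link.open_newwindow": 2,
}


def effective_preferences(user_prefs, frozen=FROZEN):
    """Return the preferences that end up in user.js in this simplified model."""
    merged = dict(user_prefs)
    merged.update(frozen)  # frozen values are applied last, so they win
    return merged


requested = {
    "browser.link.open_newwindow": 3,
    "browser.startup.homepage": "http://www.google.fr",
}
result = effective_preferences(requested)
```

In this model the homepage survives while open_newwindow falls back to 2, which matches the behaviour reported in the question. Changing the frozen table itself, as the second answer does, is what lets the value 3 through.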

multiple download folders using Selenium Webdriver?

I'm using the Selenium webdriver (in Python) to automate the downloading of thousands of files. I want the files to be saved in different folders. The following code works, but it requires quitting and restarting the webdriver multiple times, which slows down the process:
import os
from selenium import webdriver
some_list = ["item1", "item2", "item3"]  # over 300 items on the actual code
for item in some_list:
    download_folder = "/Users/myusername/Desktop/" + item
    os.makedirs(download_folder)
    fp = webdriver.FirefoxProfile()
    fp.set_preference("browser.download.folderList", 2)
    fp.set_preference("browser.download.manager.showWhenStarting", False)
    fp.set_preference("browser.download.dir", download_folder)
    fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/plain")
    browser = webdriver.Firefox(firefox_profile=fp)
    # a bunch of code that accesses the site and downloads the files
    browser.close()
    browser.quit()
So, at every iteration I have to quit the webdriver and restart it, which is pretty inefficient. Is there a better way to do this? Apparently we can't change the Firefox profile after the webdriver is instantiated (see these two previous questions), but perhaps there is some alternative I'm missing?
(Mac OS X 10.6.8, Python 2.7.5, Selenium 2.2.0)

No, I don't think you can do it.
Option one: specify different default directories for one FirefoxProfile
You can't. In my opinion, this is a limitation of Firefox, not Selenium. However, this Firefox limitation looks like the correct design to me. browser.download.dir is the default download destination; if it allowed multiple directories, it would no longer be a "default".
Option two: switch multiple FirefoxProfile for one driver instance
If not doing it in Firefox, can FirefoxProfile be switched for same driver instance? As far as I know, the answer is no. (You have already done some research on this)
Option three: use normal non-Selenium way to do the downloading
If you want to avoid using this auto-downloading approach and do it the normal way (like Auto-it etc.), then it falls in the category of "How To Download Files With Selenium And Why You Shouldn’t". But in this case, your code can be simplified.
some_list = ["item1", "item2", "item3"]  # over 300 items on the actual code
for item in some_list:
    download_folder = "/Users/myusername/Desktop/" + item
    some_way_magically_do_the_downloading(download_folder)
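As one possible reading of option three, the placeholder could be filled in with a plain download from the Python 3 standard library. This is a sketch under assumptions: the file URLs are known up front, no browser session or cookies are needed to fetch them, and download_to_folder is a hypothetical helper name.

```python
import os
import shutil
import urllib.parse
import urllib.request


def download_to_folder(url, download_folder):
    """Fetch url and save it in download_folder under its original file name."""
    os.makedirs(download_folder, exist_ok=True)
    # Derive the file name from the URL path; fall back to a fixed name.
    filename = os.path.basename(urllib.parse.urlparse(url).path) or "download"
    target = os.path.join(download_folder, filename)
    # Stream the response to disk instead of holding it in memory.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Because the destination is an ordinary function argument, each item can go to its own folder without restarting any browser. If the site requires a logged-in session, this approach would need the relevant cookies or headers passed along as well.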

Running Selenium WebDriver using Python with extensions (.crx files)

I went to Chrome Extension Downloader to snag the .crx file for 'Adblock-Plus_v1.4.1'.
I threw it in the directory I'm working in, and then ran:
from selenium import webdriver
chop = webdriver.ChromeOptions()
chop.add_extension('Adblock-Plus_v1.4.1.crx')
driver = webdriver.Chrome(chrome_options = chop)
It totally acknowledges that it exists, but it gives me what looks like a ChromeDriver.exe style message:
ERROR:extension_error_reporter.cc(56)] Extension error: Package is invalid: 'CRX_PUBLIC_KEY_INVALID'.
Then eventually a webdriver exception:
selenium.common.exceptions.WebDriverException: Message: u'Extension could not be installed'
I am almost 100% sure that there is nothing wrong with my code, because of the fact it puts a ChromeDriver type message first before throwing the exception.
I also tried to pack it myself by going to 'C:\Documents and Settings\\*UserName*\Local Settings\Application Data\Google\Chrome\User Data\Default\Extensions' on chrome://extensions/ with developer mode on, tried to use that .crx that was created and got the exact same error message
I also tried a different way:
chop = webdriver.ChromeOptions()
chop.add_argument('--load_extension=Adblock-Plus_v1.4.1.crx')
driver = webdriver.Chrome(chrome_options = chop)
this doesn't cause an exception or even a Chrome Driver error, but if I manually go to chrome://extensions/ it doesn't say that the extension is loaded...
I'm thinking my problem has to do with the actual .crx file itself, because of the nature of the error message. But at the same time I'm not sure, because if I spawn a webdriver.Chrome() session and then manually go to chrome://extensions/, I can drag and drop to install the same .crx file.
Edit: I realized I didn't actually ask a question so here it is:
What am I doing wrong? Why can't I load this chrome extension? Is it my code, or the .crx file itself?
UPDATE: @Pat Meeker
I've tried this, but I'm losing something in the translation from Java to Python.
capability = webdriver.DesiredCapabilities.CHROME returns a dictionary that has all my arguments in it, so I'm pretty sure the only part that I need to do is add the arguments.
options = webdriver.ChromeOptions()
options.add_argument('--user-data-dir=C:/Users/USER_NAME/AppData/Local/Google/Chrome/User Data/Default/')
This is what I have right now, and whenever I try to driver = webdriver.Chrome(chrome_options=options) chrome opens up, and it seems to remember its previous position, but NOTHING more, no bookmarks, no extensions no nothing.

Just add this extra line in your program
from selenium.webdriver.chrome.options import Options it will work...
like this
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chop = webdriver.ChromeOptions()
chop.add_extension('Adblock-Plus_v1.4.1.crx')
driver = webdriver.Chrome(chrome_options = chop)

From my limited experience, the problem is with the load-extension argument and not your code, as I had the same problem when testing an extension that's not from the Chrome Web Store.
I managed to solve it by installing the extension with Drag & Drop and using only the --user-data-dir argument.
This worked for me with C# and Chrome 33, I know it sounds flimsy but it works for me for several months now so I hope it'll help.
