selenium, chromedriver_autoinstaller, & pyinstaller in onefile - python

I am trying to create a onefile .exe using pyinstaller that will utilize chromedriver_autoinstaller so that the chromedriver is always up to date. The code works fine when run within the IDE but once I use pyinstaller to create the .exe it throws the 'Specific issue with chrome_autoinstaller' message. Because the chromedriver_autoinstaller is not working, the program then cannot find chromedriver. The program does work in an .exe without the autoinstaller by directly referencing the chromedriver file path but I would prefer to utilize this package if possible.
class LoginPCC:
    def __init__(self):  # create an instance of this class. Begins by logging in
        try:
            chrome_options = webdriver.ChromeOptions()
            settings = {
                "recentDestinations": [{
                    "id": "Save as PDF",
                    "origin": "local",
                    "account": "",
                }],
                "selectedDestinationId": "Save as PDF",
                "version": 2
            }
            prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings),
                     "plugins.always_open_pdf_externally": True}
            chrome_options.add_experimental_option('prefs', prefs)
            chrome_options.add_argument('--kiosk-printing')
            try:
                chromedriver_autoinstaller.install()
            except:
                print('Specific issue with chrome_autoinstaller')
            try:
                self.driver = webdriver.Chrome(options=chrome_options)
            except:
                print('Cannot find chromedriver')
        # (the except clause matching the outer try is not shown in the post)
python 3.7
pyinstaller 4.0.dev
selenium 3.141.0
chromedriver-autoinstaller 0.2

The below worked for me:
chromedriver_autoinstaller.install(cwd=True)
The first time you launch your EXE, a new folder is created in the folder where you placed the EXE. This is where the chromedriver will be stored. It is not the prettiest solution, but it works.
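If the result depends on where the EXE is launched from, the target folder can be pinned explicitly. The following is a minimal sketch, assuming (this thread does not confirm it) that install(cwd=True) places the driver under the process's current working directory. executable_dir is a hypothetical helper name. PyInstaller sets sys.frozen in bundled apps, and sys.executable then points at the EXE itself.

```python
import os
import sys


def executable_dir() -> str:
    """Return the folder holding the running EXE, or the working directory when not frozen."""
    if getattr(sys, "frozen", False):
        # In a PyInstaller bundle, sys.executable is the path of the EXE itself.
        return os.path.dirname(os.path.abspath(sys.executable))
    # Running from an IDE or a plain interpreter: fall back to the working directory.
    return os.path.abspath(os.getcwd())


if __name__ == "__main__":
    target = executable_dir()
    print("chromedriver folder would be created under:", target)
    # Assumed usage (requires chromedriver_autoinstaller to be installed):
    #   os.chdir(target)
    #   chromedriver_autoinstaller.install(cwd=True)
```

Changing into that folder before calling install(cwd=True) should keep the download next to the EXE even when the program is started from a shortcut or another directory, provided the assumption above holds.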

Related

Enable backgroundCSS option when Saving PDFs of webpages with Selenium

I'm trying to "Print as PDF" a set of webpage reports repeatedly with a Selenium ChromeDriver script in Python. As you can see below, if we have the "Background graphics" option selected, the header of the report renders perfectly.
However, when I run Selenium, I cannot figure out how to set the parameter for background graphics. The saved PDF will instead look like this:
Here is a minimally reproducible script that I've been attempting to run:
import selenium
from selenium import webdriver
import json
from time import sleep
import os
chrome_options = webdriver.ChromeOptions()
settings = {
"recentDestinations": [{
"id": "Save as PDF",
"origin": "local",
"account": "",
}],
"selectedDestinationId": "Save as PDF",
"version": 2,
#"cssBackground": 2,
}
prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings), }
chrome_options.add_experimental_option('prefs', prefs)
chrome_options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=CHROMEDRIVER_PATH)
driver.get("https://dash.gallery/dash-multipage-report/")
sleep(3)
driver.execute_script('window.print();')
sleep(5)
driver.quit()
I've done some digging on Chromium's documentation and I believe that I've found the cssBackground setting here. Additionally, there's a BackgroundGraphicsModeRestriction in this file that might be affecting things? I found what I thought would be a useful lead to setting the value in this StackOverflow post but I can't seem to put the pieces together and check that "Background graphics" box from the start. Any help that you can provide would be incredibly appreciated as I'm pretty stuck here. I've also tried rendering my page's CSS with -webkit-print-color-adjust:exact; but it doesn't seem to capture the entire background.
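For reference, an answer further down this page sets an isCssBackgroundEnabled key in the same sticky-settings blob. Below is a minimal sketch of building the prefs that way. build_print_prefs is a hypothetical helper, and whether Chrome honours the key may depend on the Chrome version, so treat it as an assumption to test rather than a confirmed fix.

```python
import json


def build_print_prefs(css_background: bool = True) -> dict:
    """Build Chrome prefs that preselect 'Save as PDF' in the print preview."""
    settings = {
        "recentDestinations": [{"id": "Save as PDF", "origin": "local", "account": ""}],
        "selectedDestinationId": "Save as PDF",
        "version": 2,
        # Key name taken from another answer on this page; not verified here.
        "isCssBackgroundEnabled": css_background,
    }
    # Chrome expects the appState value as a JSON string, not a nested dict.
    return {"printing.print_preview_sticky_settings.appState": json.dumps(settings)}


if __name__ == "__main__":
    prefs = build_print_prefs()
    print(prefs)
    # Assumed usage with the script above:
    #   chrome_options.add_experimental_option('prefs', prefs)
```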

Printing a PDF with Selenium Chrome Driver in headless mode

I have no problems printing without headless mode, however once I enable headless mode, it just refuses to print a PDF. I'm currently working on an app with a GUI, so I'd rather not have the Selenium webdriver visible to the end user if possible.
For this project I'm using an older version of Selenium, 4.2.0, together with Python 3.9.
import os
from os.path import exists
import json
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver import Chrome, ChromeOptions
# Paths
dir_path = os.getcwd()
download_path = os.path.join(dir_path, "letters")
chrome_path = os.path.join(dir_path, "chromium\\app\\Chrome-bin\\chrome.exe")
user_data_path = os.path.join(dir_path, "sessions")
website = "https://www.google.com/"
def main():
    print_settings = {
        "recentDestinations": [{
            "id": "Save as PDF",
            "origin": "local",
            "account": "",
        }],
        "selectedDestinationId": "Save as PDF",
        "version": 2,
        "isHeaderFooterEnabled": False,
        "isLandscapeEnabled": True
    }
    options = ChromeOptions()
    options.binary_location = chrome_path
    options.add_argument("--start-maximized")
    options.add_argument('--window-size=1920,1080')
    options.add_argument(f"user-data-dir={user_data_path}")
    options.add_argument("--headless")
    options.add_argument('--enable-print-browser')
    options.add_experimental_option("prefs", {
        "printing.print_preview_sticky_settings.appState": json.dumps(print_settings),
        "savefile.default_directory": download_path,  # Change default directory for downloads
        "download.default_directory": download_path,  # Change default directory for downloads
        "download.prompt_for_download": False,  # To auto download the file
        "download.directory_upgrade": True,
        "profile.default_content_setting_values.automatic_downloads": 1,
        "safebrowsing.enabled": True
    })
    options.add_argument("--kiosk-printing")
    driver = Chrome(service=Service(ChromeDriverManager().install()), options=options)
    driver.get(website)
    driver.execute_script("window.print();")
    if exists(os.path.join(user_data_path, "Google.pdf")):
        print("YAY!")
    else:
        print(":(")
if __name__ == '__main__':
    main()
For anyone else coming across this with a similar issue, I fixed it by using the print method described here: Selenium print PDF in A4 format
Using my example from above, I replaced:
driver.execute_script("window.print();")
with:
import base64  # add this to the imports at the top of the script
pdf_data = driver.execute_cdp_cmd("Page.printToPDF", print_settings)
with open('Google.pdf', 'wb') as file:
    file.write(base64.b64decode(pdf_data['data']))
And that worked just fine for me.
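Note that Page.printToPDF has its own parameter names (for example landscape, displayHeaderFooter and printBackground), which differ from the print-preview keys in print_settings. Here is a hedged sketch that passes those parameters explicitly. save_pdf is a hypothetical helper, and it accepts any object exposing execute_cdp_cmd, as Selenium's Chrome driver does.

```python
import base64


def save_pdf(driver, path: str, landscape: bool = True) -> int:
    """Ask Chrome for a PDF of the current page over CDP and write it to path."""
    params = {
        "landscape": landscape,        # counterpart of isLandscapeEnabled
        "displayHeaderFooter": False,  # counterpart of isHeaderFooterEnabled
        "printBackground": True,       # include background graphics
    }
    result = driver.execute_cdp_cmd("Page.printToPDF", params)
    # The command returns the PDF as a base64-encoded string under "data".
    pdf_bytes = base64.b64decode(result["data"])
    with open(path, "wb") as handle:
        handle.write(pdf_bytes)
    return len(pdf_bytes)
```

Because the PDF bytes come back over the DevTools connection, this route does not rely on the print dialog or on a download directory, which is presumably why it also works in headless mode.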

ChromeDriver shuts down printing window immediately - Solution?

I am trying to print my html code to pdf by opening up the html document with selenium (chrome-driver) in Python. Having opened the html file, I try to print the document to pdf but whenever I open the printer window (manually or through selenium commands) it immediately shuts down.
I have tried with the following code which cannot produce a .pdf file:
chrome_options = webdriver.ChromeOptions()
settings = {"recentDestinations": [{"id": "Save as PDF", "origin": "local", "account": ""}], "selectedDestinationId": "Save as PDF", "version": 2}
prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings),
'savefile.default_directory': path_html}
chrome_options.add_experimental_option('prefs', prefs)
chrome_options.add_argument('--kiosk-printing')
browser = webdriver.Chrome(CHROMEDRIVER_PATH, options=chrome_options)
browser.get(path_html)
browser.execute_script('window.print();')
browser.close()
I am using the latest chromedriver version from https://chromedriver.chromium.org/downloads for windows.
If I simply open my own browser (not through selenium) I can easily print the html document to a .pdf file.
UPDATE:
If someone bumps into the same problem: I solved it by using a wrapper that converts the HTML file to PDF. It essentially goes through the above code.
Check out: the converter function from https://pypi.org/project/pyhtml2pdf/
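The script in the question calls browser.close() immediately after window.print(). One possibility, which this thread does not confirm, is that the browser exits before the PDF has been written. Below is a minimal sketch of polling for the output file instead of closing right away. wait_for_file is a hypothetical helper, and the file name Chrome chooses is something you would need to check yourself.

```python
import os
import time


def wait_for_file(path: str, timeout: float = 10.0, poll: float = 0.25) -> bool:
    """Return True once path exists and is non-empty, or False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            return True
        time.sleep(poll)
    # One last check in case the file appeared during the final sleep.
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

In the script above this would sit between browser.execute_script('window.print();') and browser.close(), with the expected PDF path inside the savefile.default_directory folder.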

Python 3 Chrome Selenium Keep Downloaded Jar File

I'm using Selenium, Chrome, and Python 3.
Here's what I'm doing to set everything up
import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time
chrome_options = Options()
options = webdriver.ChromeOptions()
prefs = {
"download.default_directory": r"C:\Download\Dir",
"download.directory_upgrade": "true",
"download.prompt_for_download": "false",
"disable-popup-blocking": "true",
'download.neverAsk.saveToDisk': 'application/octet-stream, application/json, jar, ' +
'text/comma-separated-values, text/csv, application/csv, ' +
'application/excel, application/vnd.ms-excel, ' +
'application/vnd.msexcel, text/anytext, text/plaintext, ' +
'image/png, image/pjpeg, image/jpeg, application/zip',
"safebrowsing.enabled": True
}
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=options)
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-software-rasterizer')
chrome_options.add_argument('--safebrowsing-disable-download-protection')
Then I have some credential secrets imported, and some website navigation to get to the file to download.
Afterwards, I click on the link to download the file:
# Click Upgrade Jar File
driver.find_element_by_xpath('//a[@id="accordionPanel:j_id_3a:1:j_id_3j"]').click()
The problem I have is here. Chrome asks me if I'd like to keep the jar file after I've downloaded it.
I've been reading through a lot of documentation and I don't understand how I can circumvent this. I believe that it should be added to prefs or as a chrome_options.add_argument, but so far I've had no luck with the options I've found.
Update 01:
A little update, I was able to get it to work "silently", which allows you to bypass the "keep" button, but I haven't yet found a solution that will bypass the "keep" button when you're looking at the GUI.
options = webdriver.ChromeOptions()
prefs = {
"download.default_directory": r"C:\Download\Dir",
"download.directory_upgrade": "true",
"download.prompt_for_download": "false",
"disable-popup-blocking": "true",
"safebrowsing.enabled": False,
"default_content_settings": "contentSettings",
"download": "download"
}
options.add_experimental_option("prefs", prefs)
options.add_argument("--headless")
options.add_argument("--disable-notifications")
options.add_argument('--disable-gpu')
options.add_argument('--disable-software-rasterizer')
driver = webdriver.Chrome(options=options)
options.add_argument('--safebrowsing-disable-download-protection')
Update 02
Adapted code from a comment:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
import time
import os
# Secrets
user_id = [grabs from a function that queries secrets]
user_password = [grabs from a function that queries secrets]
download_dir = r"C:\Download\Dir"
options = Options()
options.add_argument("--disable-infobars")
options.add_argument("start-maximized")
options.add_argument("--disable-extensions")
options.add_argument("--disable-popup-blocking")
# disable the banner "Chrome is being controlled by automated test software"
options.add_experimental_option("useAutomationExtension", False)
options.add_experimental_option("excludeSwitches", ['enable-automation'])
prefs = {
"download.default_directory": r"C:\Download\Dir",
'download.prompt_for_download': False,
'download.extensions_to_open': 'jar',
'safebrowsing.enabled': True
}
capabilities = DesiredCapabilities().CHROME
options.add_experimental_option('prefs', prefs)
capabilities.update(options.to_capabilities())
driver = webdriver.Chrome(r'.\chromedriver.exe', options=options)
url = 'URL GOES HERE'
print("Navigating to", url)
driver.get(url)
# Define Text Boxes For Login
print('Logging into GoAnywhere Website')
username = driver.find_element_by_id('email')
password = driver.find_element_by_name('secret')
# Enter Username/Password
username.send_keys(user_id)
password.send_keys(user_password)
password.send_keys('\n')
# Wait for Page to load
....[Navigation Logic Here]....
# Click Upgrades
print("Selecting Upgrades")
time.sleep(1)
driver.find_element_by_xpath('//div[@id="accordionPanel"]/div[5]').click()
# Click Upgrade Jar File
time.sleep(1)
print("Downloading upgrade jar to", download_dir)
driver.find_element_by_xpath('//a[@id="accordionPanel:j_id_3a:1:j_id_3j"]').click()
....[Logic to do stuff with file after downloaded]....
UPDATE 05-28-2021
After doing some more testing, I have determined that the code below works flawlessly on a Microsoft Windows platform that does not have any Chrome browser policies enabled.
The code will likely fail when the Microsoft Windows platform is under corporate IT control. Google provides policy templates that allow corporate IT to set Chrome policies that can override the selenium code within this answer.
For example, this is one of the policies that corporate IT can set that will negate the selenium code within this answer.
If either SafeBrowsingProtectionLevel_StandardProtection or SafeBrowsingProtectionLevel_EnhancedProtection is enabled, Google Chrome gives you warnings about potentially risky sites, downloads, and extensions.
CATEGORY !!SafeBrowsing_Category
POLICY !!SafeBrowsingExtendedReportingEnabled_Policy
#if version >= 4
SUPPORTED !!SUPPORTED_WIN7
#endif
EXPLAIN !!SafeBrowsingExtendedReportingEnabled_Explain
VALUENAME "SafeBrowsingExtendedReportingEnabled"
VALUEON NUMERIC 1
VALUEOFF NUMERIC 0
END POLICY
POLICY !!SafeBrowsingProtectionLevel_Policy
#if version >= 4
SUPPORTED !!SUPPORTED_WIN7
#endif
EXPLAIN !!SafeBrowsingProtectionLevel_Explain
PART !!SafeBrowsingProtectionLevel_Part DROPDOWNLIST
VALUENAME "SafeBrowsingProtectionLevel"
ITEMLIST
NAME !!SafeBrowsingProtectionLevel_NoProtection_DropDown VALUE NUMERIC 0
NAME !!SafeBrowsingProtectionLevel_StandardProtection_DropDown VALUE NUMERIC 1
NAME !!SafeBrowsingProtectionLevel_EnhancedProtection_DropDown VALUE NUMERIC 2
END ITEMLIST
END PART
END POLICY
I also noted this within Chrome's source code. This is why the message "This type of file can harm your computer" is thrown when downloading a .jar file
namespace download_util
static const struct Executables {
const char* extension;
DownloadDangerLevel level;
} g_executables[] = {
// Some files are dangerous on all platforms.
truncated...
// Java.
{ "class", DANGEROUS },
{ "jar", DANGEROUS },
{ "jnlp", DANGEROUS },
truncated...
Jar files are considered dangerous by Google Chrome. The verbiage below is from Chrome's GitHub repository.
Note that the text below is talking about displaying warning messages in the user interface (UI) of the Chrome browser.
platform_settings.danger_level: (required) Controls how files should be handled by the UI in the absence of a better signal from the Safe Browsing ping. This applies to all file types where ping_setting is either SAMPLED_PING or NO_PING, and downloads where the Safe Browsing ping either fails, is disabled, or returns an UNKNOWN verdict. Exceptions are noted below.
The warning controlled here is a generic "This file may harm your computer." If the Safe Browsing verdict is UNCOMMON, POTENTIALLY_UNWANTED, DANGEROUS_HOST, or DANGEROUS, Chrome will show that more severe warning regardless of this setting.
This policy also affects also how subresources are handled for "Save As ..." downloads of complete web pages. If any subresource ends up with a file type that is considered DANGEROUS or ALLOW_ON_USER_GESTURE, then the filename will be changed to end in .download. This is done to prevent the file from being opened accidentally.
NOT_DANGEROUS: Safe to download and open, even if the download was accidental. No additional warnings are necessary.
DANGEROUS: Always warn the user that this file may harm their computer. We let them continue or discard the file. If Safe Browsing returns a SAFE verdict, we still warn the user.
CONCLUSION
The OP of this question stated in the comments that he was using a corporate IT controlled computer. It is HIGHLY LIKELY that SafeBrowsingProtection was enabled on the OP's system. With this extra level of security protection enforced by corporate IT the selenium code within this answer could not suppress the warning message being displayed in the OP's Chrome browser UI.
The OP stated that he was able to bypass the "This type of file can harm your computer" message when using selenium and Chrome in headless mode. The reason the message was suppressed is because headless mode does not use a UI thus the warning message will not be raised. Chrome's own documentation/source validates that this warning message is only displayed in the UI. Ref: platform_settings.danger_level above.
ORIGINAL POST 05-12-2021
The code below allows me to click on a href link for a .jar file without receiving "This type of file can harm your computer" message.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
chrome_options = Options()
chrome_options.add_argument("--disable-infobars")
chrome_options.add_argument("start-maximized")
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument("--disable-popup-blocking")
# disable the banner "Chrome is being controlled by automated test software"
chrome_options.add_experimental_option("useAutomationExtension", False)
chrome_options.add_experimental_option("excludeSwitches", ['enable-automation'])
prefs = {
'download.default_directory': 'download_directory',
'download.prompt_for_download': False,
'download.extensions_to_open': 'jar',
'safebrowsing.enabled': True
}
capabilities = DesiredCapabilities().CHROME
chrome_options.add_experimental_option('prefs', prefs)
capabilities.update(chrome_options.to_capabilities())
driver = webdriver.Chrome('/usr/local/bin/chromedriver', options=chrome_options)
# I used this site in my testing, because it had JAR files
url_main = 'http://www.jgoodies.com/downloads/demos/'
driver.get(url_main)
driver.implicitly_wait(20)
# download a jar file
download_jar_file = driver.find_element_by_xpath('//*[@id="post-70"]/div/table/tbody/tr[2]/td[5]/a')
download_jar_file.click()
The preference download.extensions_to_open is linked to this policy in Chrome's source code.
"AutoOpenFileTypes" : {
"os": ["win", "mac", "linux", "chromeos"],
"policy_pref_mapping_tests": [
{
"policies": { "AutoOpenFileTypes": ["exe", ".txt", "pdf"] },
"prefs": {
"download.extensions_to_open_by_policy": {"value" : ["exe", "pdf"] }
}
}
]
},
----------------------------------------
My system information
----------------------------------------
Platform: macOS
Python: 3.8.0
Selenium: 3.141.0
Chromedriver: 90.0.4430.24
----------------------------------------
Try adding chrome_options.add_argument('--safebrowsing-disable-download-protection') in your chrome options setup.
Edit:
Wait a second. You've defined options = webdriver.ChromeOptions(). Try setting the arguments like this?
options.add_argument('--disable-gpu')
options.add_argument('--disable-software-rasterizer')
options.add_argument('--safebrowsing-disable-download-protection')

how to save opened page as pdf in Selenium (Python)

Have tried all the solutions I could find on the Internet to be able to print a page that is open in Selenium in Python. However, while the print pop-up shows up, after a second or two it goes away, with no PDF saved.
Here is the code being tried. Based on the code here - https://stackoverflow.com/a/43752129/3973491
Coding on a Mac with Mojave 10.14.5.
from selenium import webdriver
from selenium.webdriver.support.select import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.options import Options
from selenium.common.exceptions import WebDriverException
import time
import json
options = Options()
appState = {
"recentDestinations": [
{
"id": "Save as PDF",
"origin": "local"
}
],
"selectedDestinationId": "Save as PDF",
"version": 2
}
profile = {'printing.print_preview_sticky_settings.appState': json.dumps(appState)}
# profile = {'printing.print_preview_sticky_settings.appState':json.dumps(appState),'savefile.default_directory':downloadPath}
options.add_experimental_option('prefs', profile)
options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'
driver = webdriver.Chrome(options=options, executable_path=CHROMEDRIVER_PATH)
driver.implicitly_wait(5)
driver.get(url)
driver.execute_script('window.print();')
$chromedriver --v
ChromeDriver 75.0.3770.90 (a6dcaf7e3ec6f70a194cc25e8149475c6590e025-refs/branch-heads/3770@{#1003})
Any hints or solutions as to what can be done to print the open HTML page to a PDF? I have spent hours trying to make this work. Thank you!
Update on 2019-07-11:
My question has been identified as a duplicate, but a) the other question seems to be using javascript code, and b) the answer does not solve the problem being raised in this question - it may be to do with more recent software versions. Chrome version being used is Version 75.0.3770.100 (Official Build) (64-bit), and chromedriver is ChromeDriver 75.0.3770.90. On Mac OS Mojave. Script is running on Python 3.7.3.
Update on 2019-07-11:
Changed the code to
from selenium import webdriver
import json
chrome_options = webdriver.ChromeOptions()
settings = {
"appState": {
"recentDestinations": [{
"id": "Save as PDF",
"origin": "local",
"account": "",
}],
"selectedDestinationId": "Save as PDF",
"version": 2
}
}
prefs = {'printing.print_preview_sticky_settings': json.dumps(settings)}
chrome_options.add_experimental_option('prefs', prefs)
chrome_options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=CHROMEDRIVER_PATH)
driver.get("https://google.com")
driver.execute_script('window.print();')
driver.quit()
And now, nothing happens. Chrome launches, loads url, print dialog appears but then nothing seems to happen - nothing in the default printer queue, and no pdf either - I even searched for the PDF files by looking up "Recent Files" on Mac.
The answer here worked when I did not have any other printer set up in my OS. But when I had another default printer, it did not work.
I don't understand how, but making a small change this way seems to work.
from selenium import webdriver
import json
chrome_options = webdriver.ChromeOptions()
settings = {
"recentDestinations": [{
"id": "Save as PDF",
"origin": "local",
"account": "",
}],
"selectedDestinationId": "Save as PDF",
"version": 2
}
prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings)}
chrome_options.add_experimental_option('prefs', prefs)
chrome_options.add_argument('--kiosk-printing')
CHROMEDRIVER_PATH = '/usr/local/bin/chromedriver'
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=CHROMEDRIVER_PATH)
driver.get("https://google.com")
driver.execute_script('window.print();')
driver.quit()
You can use the following code to print PDFs in A5 size with background css enabled:
import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import json
import time
chrome_options = webdriver.ChromeOptions()
settings = {
"recentDestinations": [{
"id": "Save as PDF",
"origin": "local",
"account": ""
}],
"selectedDestinationId": "Save as PDF",
"version": 2,
"isHeaderFooterEnabled": False,
"mediaSize": {
"height_microns": 210000,
"name": "ISO_A5",
"width_microns": 148000,
"custom_display_name": "A5"
},
"customMargins": {},
"marginsType": 2,
"scaling": 175,
"scalingType": 3,
"scalingTypePdf": 3,
"isCssBackgroundEnabled": True
}
mobile_emulation = { "deviceName": "Nexus 5" }
chrome_options.add_experimental_option("mobileEmulation", mobile_emulation)
chrome_options.add_argument('--enable-print-browser')
#chrome_options.add_argument('--headless')
prefs = {
'printing.print_preview_sticky_settings.appState': json.dumps(settings),
'savefile.default_directory': '<path>'
}
chrome_options.add_argument('--kiosk-printing')
chrome_options.add_experimental_option('prefs', prefs)
for dirpath, dirnames, filenames in os.walk('<source path>'):
    for fileName in filenames:
        print(fileName)
        driver = webdriver.Chrome("./chromedriver", options=chrome_options)
        driver.get(f'file://{os.path.join(dirpath, fileName)}')
        time.sleep(7)
        driver.execute_script('window.print();')
        driver.close()
The solution is not very good, but you can take a screenshot and convert it to PDF with Pillow:
from selenium import webdriver
from io import BytesIO
from PIL import Image
driver = webdriver.Chrome(executable_path='path to your driver')
driver.get('your url here')
img = Image.open(BytesIO(driver.find_element_by_tag_name('body').screenshot_as_png))
img.save('filename.pdf', "PDF", quality=100)
Here is the solution I use with Windows :
First download the ChromeDriver here : http://chromedriver.chromium.org/downloads and install Selenium
Then run this code (based on the accepted answer, slightly modified to work on Windows):
import json
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
settings = {"recentDestinations": [{"id": "Save as PDF", "origin": "local", "account": ""}], "selectedDestinationId": "Save as PDF", "version": 2}
prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings)}
chrome_options.add_experimental_option('prefs', prefs)
chrome_options.add_argument('--kiosk-printing')
browser = webdriver.Chrome(r"chromedriver.exe", options=chrome_options)
browser.get("https://google.com/")
browser.execute_script('window.print();')
browser.close()
I would suggest downloading the page source HTML, which can be done like so
in VB.NET:
Dim Html As String = webdriver.PageSource
Not sure how it is done in python but I'm sure it's very similar
Once you have done that then you can select the parts of the page you want to save using an html parser or by parsing it manually with string parsing code. Once you have the html for the part you want to save stored in a string then use an html to pdf converter library or program. There are lots of these for programming languages like C# and vb.net. I don't know about any for python but I'm sure some exist. Just do some research. (some are free and some are expensive)
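In Python's Selenium bindings the counterpart of webdriver.PageSource is the driver.page_source attribute. Below is a minimal sketch that saves it to disk so a separate HTML-to-PDF converter can pick it up. save_page_source is a hypothetical helper, and the choice of converter is left open, as in the answer above.

```python
def save_page_source(driver, path: str) -> int:
    """Write the current page's HTML to path and return the number of characters written."""
    html = driver.page_source  # attribute provided by Selenium's Python WebDriver
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(html)
    return len(html)
```

Keep in mind that page_source only contains the markup, so styles, images and scripts referenced by relative URLs may not resolve once the file is converted offline.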
