Webdriver open a file as soon as it finishes downloading - python

I'm writing some tests and I'm using the Firefox webdriver with a FirefoxProfile to download a file from an external URL, but I need to read that file as soon as it finishes downloading to retrieve some specific data.
I set my profile and driver like this:
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", '/some/path/')
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream")
ff = webdriver.Firefox(firefox_profile=fp)
Is there some way to know when the file finishes downloading, so that I know when to call the reader function without having to poll the download directory, waiting with time.sleep or using any Firefox add-on?
Thanks for any help :)

You could try attaching a file object to the file as it downloads and using it like a stream buffer: poll it during the download to get the data you need, and detect completion yourself, either by waiting for the file to reach the expected size or by assuming it is complete once no new data has arrived for a certain amount of time.
Edit:
You could try to look at the download tracking db in the profile folder as referenced here. Looks like you can wait for your file to have status 1.
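The "no new data for a certain amount of time" idea above can be sketched as follows. This is only a heuristic and it still polls, which the question wanted to avoid; the function name and the default values are made up for illustration, and a stalled download would look finished to it.

```python
import os
import time


def wait_until_stable(path, quiet_seconds=2.0, timeout=60.0, poll_interval=0.25):
    """Return True once `path` exists and its size has not changed for
    `quiet_seconds`; return False if `timeout` elapses first.

    Hypothetical helper: all names and defaults are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    last_change = time.monotonic()
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # the file does not exist yet
        now = time.monotonic()
        if size != last_size:
            # The size changed (or the file just appeared): restart the quiet timer.
            last_size = size
            last_change = now
        elif size >= 0 and now - last_change >= quiet_seconds:
            return True
        time.sleep(poll_interval)
    return False
```

You would call it with the full path of the expected file, for example `wait_until_stable('/some/path/report.csv')`, and only start the reader when it returns True.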

I like to use inotify to watch for these kinds of events. Some example code:
import os
from pyinotify import (
    EventsCodes,
    ProcessEvent,
    Notifier,
    WatchManager,
)
class EventManager(ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        file_path = os.path.join(event.path, event.name)
        # do something to file, you might want to wait a second here and
        # also test for existence because ff might be making temp files
wm = WatchManager()
notifier = Notifier(wm, EventManager())
wdd = wm.add_watch('/some/path', EventsCodes.ALL_FLAGS['IN_CLOSE_WRITE'], rec=True)
while True:
    try:
        notifier.process_events()
        if notifier.check_events():
            notifier.read_events()
    except:
        # stop watching on any error (including Ctrl+C), then re-raise
        notifier.stop()
        raise
The notifier decides which method to call on the event manager based on the name of the event, so in this case we are only watching for IN_CLOSE_WRITE events.
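The comment in the handler suggests testing for existence and ignoring temporary files. A minimal sketch of such a check is below. The helper name is hypothetical, and the assumption that Firefox's temporary files end in `.part` and that a finished file is non-empty may not hold for every download.

```python
import os


def is_finished_download(path, temp_suffixes=(".part",)):
    """Heuristic: True if `path` is an existing, non-empty file that does
    not carry one of the temporary-download suffixes.

    Hypothetical helper; the suffix list is an assumption.
    """
    if path.endswith(tuple(temp_suffixes)):
        return False
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

Inside `process_IN_CLOSE_WRITE` you could then call the reader only when `is_finished_download(file_path)` is true.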

It's far from ideal, but with Firefox you could check the target folder for a .part file, which is present while the download is still in progress (with other browsers you can do something similar).
A while loop will then block everything until the download completes:
import os
import time
def test_for_partfile():
    part_file = False
    download_dir = "C:\\Downloads"
    filelist = os.listdir(download_dir)
    for partfile in filelist:
        if partfile.endswith('.part'):
            part_file = True
    return part_file
while test_for_partfile():
    time.sleep(15)
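The loop above never gives up if a .part file is left behind. Here is a variant with a timeout, as a sketch only: the function name and defaults are illustrative, and it shares the weakness that it returns immediately if the download has not yet created its .part file.

```python
import time
from pathlib import Path


def wait_for_part_files(download_dir, timeout=300.0, poll_interval=1.0):
    """Block until no *.part file remains in `download_dir`.

    Returns True if the directory is clear, False if `timeout` elapses first.
    Hypothetical helper; names and defaults are assumptions.
    """
    deadline = time.monotonic() + timeout
    while any(Path(download_dir).glob("*.part")):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

Calling `wait_for_part_files("C:\\Downloads")` right after triggering the download then replaces the bare while loop.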

Related

Python tqdm download progress with playwright

I have the following code to download something (in this example it's a video).
from playwright.sync_api import sync_playwright
import func
import os, time, shutil
import requests
from tqdm.auto import tqdm
def download():
    with page.expect_download() as download_info:
        page.click("text=720p (MP4")
    download = download_info.value
    with requests.get(download.url, stream=True) as r:
        # check header to get content length, in bytes
        total_length = int(r.headers.get("Content-Length"))
        # implement progress bar via tqdm
        with tqdm.wrapattr(r.raw, "read", total=total_length, desc="") as raw:
            # save the output to a file
            with open(f"{os.path.basename(r.url)}", 'wb') as output:
                shutil.copyfileobj(raw, output)
    download.save_as(os.path.join(func.report_folder_path, download.suggested_filename))
# initialize navigation
with sync_playwright() as p:
    browser = p.chromium.launch(channel="msedge", headless=False)
    context = browser.new_context(accept_downloads=True)
    page = context.new_page()
    # go to canon s21 login
    print("Entering on download page")
    page.goto('https://www.jw.org/en/library/videos/#en/mediaitems/StudioFeatured/docid-702023003_1_VIDEO')
    page.wait_for_selector("text=Download")
    page.locator("text=Download").first.click()
    download()
    time.sleep(3)
    print('end')
I'm trying to show a file download progress bar while Playwright performs the download. This code downloads the file into the directory correctly, and although it does not raise a timeout error, the download function only returns after the 30 seconds of the default timeout.
Another thing I noticed is that the progress bar does not show the real download status, i.e. it shows 100% even though the real download is still in progress.
Any idea how to fix these problems?
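One possible reading of the second symptom, offered as an assumption rather than a confirmed diagnosis: `requests.get(download.url)` starts its own HTTP transfer, separate from the one the browser is performing, so the bar can only reflect the bytes read from the requests stream. The sketch below shows that relationship with a plain callback instead of tqdm; `copy_with_progress` is a hypothetical helper, and the in-memory streams stand in for `r.raw` and the output file.

```python
import io


def copy_with_progress(src, dst, on_progress, chunk_size=64 * 1024):
    """Copy `src` to `dst` in chunks, calling on_progress(bytes_so_far)
    after each chunk. Progress reflects only what this function reads."""
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        on_progress(copied)
    return copied


# In-memory streams standing in for the HTTP response and the output file.
source = io.BytesIO(b"x" * 150_000)
target = io.BytesIO()
seen = []
copy_with_progress(source, target, seen.append)
```

If that reading is right, a bar driven this way reaches 100% when your own copy finishes, regardless of how far the browser's separate download has progressed.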

Selenium (Python), How do I get the file path after the download completes of each download process using Pool?

I looked at other topics on almost the same issue, but they basically implement downloading and waiting for one file and getting one path. That works fine.
But what should I do when the files are downloaded almost simultaneously through a Pool, and each call gets back the same file from the directory?
Important: Chromium works in headless mode.
Now I have the following functions working together
def download_wait(path_to_downloads):
    seconds = 0
    download_waiting = True
    while download_waiting and seconds < 300:
        time.sleep(1)
        download_waiting = False
        for filename in os.listdir(path_to_downloads):
            if filename.endswith('.crdownload'):
                download_waiting = True
        seconds += 1
    print(f'File downloaded in {seconds} seconds')
def last_downloaded_file(download_dir):
    filename = max([f for f in os.listdir(download_dir)], key=lambda xa: os.path.getctime(os.path.join(download_dir, xa)))
    return filename
Any ideas for improving these functions? Each Pool worker needs to receive its own file.
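One way to keep workers from picking up each other's files, sketched under assumptions: give every worker its own empty download directory and wait for the single file that lands there. Both function names are hypothetical. Only `.crdownload` is treated as "in progress", and each driver is assumed to be pointed at its directory, for example through Chrome's `download.default_directory` preference.

```python
import os
import tempfile
import time


def make_worker_download_dir(base_dir):
    """Create a fresh, empty download directory for one worker."""
    return tempfile.mkdtemp(prefix="dl_", dir=base_dir)


def wait_for_single_download(download_dir, timeout=300.0, poll_interval=0.5):
    """Return the path of the finished file in `download_dir`,
    or None if nothing finished before `timeout`.

    Assumes the directory is used for exactly one download.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        names = os.listdir(download_dir)
        finished = [n for n in names if not n.endswith(".crdownload")]
        # Done when at least one file exists and none is still in progress.
        if finished and len(finished) == len(names):
            return os.path.join(download_dir, finished[0])
        time.sleep(poll_interval)
    return None
```

With this layout there is no need to guess the newest file by creation time, because each directory only ever holds one worker's download.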

Pythons Webbrowser Module Will Never Open a Link in a new Window

I was trying to automate opening multiple user profiles, given a list of names, on a few different sites, but I cannot find a way to open a link in a new window, which means I cannot sort the different sites I am opening into their own window collections.
Here is my code:
import webbrowser
chrome_path="C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe"
firefox_path="C:\\Program Files\\Mozilla Firefox\\Firefox.exe"
strURL = "http://www.python.org"
webbrowser.register('chrome', None,webbrowser.BackgroundBrowser(chrome_path),1)
webbrowser.register('firefox', None,webbrowser.BackgroundBrowser(chrome_path),1)
webbrowser.open(strURL, new=0)
webbrowser.open(strURL, new=1)
webbrowser.open(strURL, new=2)
webbrowser.get('chrome').open(strURL)
webbrowser.get('firefox').open(strURL)
webbrowser.get('chrome').open_new(strURL)
webbrowser.get('firefox').open_new(strURL)
No matter what value I pass for new (0, 1, or 2), all that ever happens is that a new tab opens in the last window I clicked on. I have tried all of the other methods that I found in the Python documentation for the webbrowser module, and everyone online just says to use "new=1" or webbrowser.open_new(), but neither of those works. Even when I point it at Firefox, it just goes to Chrome.
P.S.
I found a small workaround that I am not totally satisfied with.
import webbrowser
chrome_path = "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe %s"
chrome_path_NW = "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe %s --new-window"
firefox_path = "C:\\Program Files\\Mozilla Firefox\\Firefox.exe"
strURL = "http://www.python.org"
controller = webbrowser.get(chrome_path)
controllerNW = webbrowser.get(chrome_path_NW)
controllerNW.open(strURL, new=0)
controller.open(strURL, new=1)
controller.open(strURL, new=2)
controller.open("www.youtube.com", new=2)
The important thing to look at is the "chrome_path" variable. I have changed it so that it runs as a command and accepts arguments. I found some launch arguments for Chromium, here, that seem to work for Chrome too. "--new-window" opens a new window, and I can then open more tabs in that window, but this is a total workaround of Python's module, and I am not confident it won't break if I am using Chrome while this script runs. If there is any feature that lets me group links together to open in specific windows, that would be much more useful to me.

I realise this is a bit late, but hopefully I can help someone in the future.
Basically, you need to use the subprocess module to open a new window before you load a new webpage:
import subprocess
import time
import webbrowser
url = "http://www.python.org"  # the page to load
# raw string so the backslash that escapes the space reaches the shell unchanged
subprocess.Popen(r'open -a /Applications/Google\ Chrome.app --new', shell=True)
time.sleep(0.5) # this is to let the app open before you try to load a new page
webbrowser.open(url)
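For the "group links into specific windows" part of the question, another option is to bypass webbrowser and build the command line yourself with the `--new-window` flag the question already found. This is a sketch: both function names are hypothetical, and it assumes the browser opens every URL given after `--new-window` as a tab in that one new window.

```python
import subprocess


def new_window_command(browser_path, urls):
    """Build an argument list asking a Chromium-style browser to open
    all of `urls` in one new window (assumed behaviour of --new-window)."""
    if not urls:
        raise ValueError("need at least one URL")
    return [browser_path, "--new-window", *urls]


def open_group(browser_path, urls):
    """Launch the browser for one group of URLs and return the Popen handle."""
    return subprocess.Popen(new_window_command(browser_path, urls))
```

Calling `open_group(chrome_path, site_a_urls)` and then `open_group(chrome_path, site_b_urls)` would give one window per site if the assumption holds. Passing a list to Popen also avoids shell quoting problems with spaces in the path.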

How to perform a context menu action on a file using python 3

How can I perform a context menu action on a particular file?
I managed to open Explorer and get the list of files through Python using pywinauto.
I need to perform a context menu action on that file. Is this possible through pywinauto?
import pywinauto
path = "C:\\Users\\Vishnu\\Desktop\\DM-test\\"
pywinauto.Application().Start(r'explorer.exe')
explorer = pywinauto.Application().Connect(path='explorer.exe')
NewWindow = explorer.Window_(top_level_only=True, active_only=True, class_name='CabinetWClass')
NewWindow.AddressBandRoot.ClickInput()
NewWindow.TypeKeys(path+'{ENTER}', with_spaces=True, set_foreground=False)
The code above will open the explorer and navigate to the dir. This is the Context menu action required on the file:
I managed to find the registry value and changed my code to pass that action to the file. It works perfectly!
pywinauto.Application().start(r'"C:\Program Files (x86)\Qualcomm\QCAT 6.x\Bin\QCAT.exe" -txt "{}"'.format(fileName))

Arrgh! Nobody reads the docs... An example is provided in the main Readme: MS UI Automation Example. For your case it should look like this:
# no need to type the path, explorer.exe has a cmd param for that
pywinauto.Application().start(r'explorer.exe "{}"'.format(path))
# backend is important!!!
app = pywinauto.Application(backend="uia").connect(path="explorer.exe")
# use the connected app object here (the original snippet referred to an undefined `explorer`)
NewWindow = app.window(top_level_only=True, active_only=True, class_name='CabinetWClass')
file_item = NewWindow.ItemsView.get_item('dmlog20180517-121505slot0.dlf')
file_item.right_click_input()
app.ContextMenu["Convert to QCAT Text"].invoke()
# further actions depend on a process / dialog started...
More details about backends: Getting Started Guide.

Does anyone know how to use Python webbrowser.open under Windows

I tried webbrowser.open in Python, but it only works with IE. How can I make it open Chrome or Firefox instead? I don't want it to open in IE. I have tried many methods, but none of them work.
import time
import webbrowser
webbrowser.open('www.google.com')

You need to specify your web browser's name; for details see webbrowser.get:
import webbrowser
webbrowser.open('www.google.com')
a = webbrowser.get('firefox')
a.open('www.google.com') # True
UPDATE
If you have Chrome or Firefox installed on your computer, do the following:
chrome_path =r'C:\Users\Administrator\AppData\Local\Google\Chrome\Application\chrome.exe' # change to your chrome.exe path
# webbrowser just calls subprocess.Popen, so first make sure this works in your cmd:
# C:\Users\Administrator>C:\Users\Administrator\AppData\Local\Google\Chrome\Application\chrome.exe www.google.com
# there are two ways to solve your problem
# option 1: change \ to / in the path on Windows
# the backslashes are lost in browser = shlex.split(browser), which looks like a bug on Windows:
# ['C:UsersAdministratorAppDataLocalGoogleChromeApplicationchrome.exe', '%s']
a = webbrowser.get(r'C:/Users/Administrator/AppData/Local/Google/Chrome/Application/chrome.exe %s')
a.open('www.google.com') #True
# option 2: register the browser
webbrowser.register('chrome', None,webbrowser.BackgroundBrowser(r'C:\Users\Administrator\AppData\Local\Google\Chrome\Application\chrome.exe'))
a = webbrowser.get('chrome')
a.open('www.google.com') #True
Otherwise you can try Selenium; it provides many more functions and only needs chromedriver.
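The lost backslashes mentioned in the comments can be reproduced with shlex alone. `shlex.split` uses POSIX rules by default, where an unquoted backslash escapes the next character and is then dropped, so this is documented behaviour rather than something specific to webbrowser. The two variable names below are just for illustration.

```python
import shlex

# Same command written with backslashes and with forward slashes.
backslash_cmd = r'C:\Users\Administrator\AppData\Local\Google\Chrome\Application\chrome.exe %s'
slash_cmd = 'C:/Users/Administrator/AppData/Local/Google/Chrome/Application/chrome.exe %s'

# POSIX-mode splitting strips the backslashes from the first form...
print(shlex.split(backslash_cmd))
# ...but leaves the forward-slash form intact.
print(shlex.split(slash_cmd))
```

The first call prints the mangled path shown in the answer, which is why the forward-slash form works when passed to webbrowser.get.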
