I am writing a short script that will loop through the pdf files in a directory, open them iteratively and let me classify them. The code I've written almost does everything I want. It loops through the files and opens and closes them. Note I'm working on Linux.
However, keyboard focus does not always stay on the shell, and I don't want to have to keep pressing Alt-Tab. Does anyone know a way to use subprocess to run system commands without focus ever leaving the shell while Python is running?
A simplified version of what I've got so far:
import signal
import os
import subprocess
import glob

files = glob.glob("pdfs/*.pdf", recursive=True)
for ff in files:
    # get the evince call
    command = f"evince --fullscreen {ff}"
    # open the pdf
    process_cat = subprocess.Popen(command, stdout = subprocess.PIPE, shell = True)
    next = input("Keep viewing files? Y/N")
    # close the pdf
    os.kill(process_cat.pid, signal.SIGKILL)
    if next != "Y":
        break
idea 1: use of browser
No idea if this will help, but are you wedded to a Linux shell approach? Would viewing the files in a web browser perhaps work better?
If so, the standard-library webbrowser module might help:
import glob
import webbrowser

files = glob.glob("pdfs/*.pdf", recursive=True)
for ff in files:
    webbrowser.open_new(ff)
    # etc
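A slightly fuller sketch of the browser idea, in case it is useful. Browsers tend to handle local files more reliably when given an absolute file:// URI, so this converts each path first. The helper names pdf_uris and review are made up for illustration, and whether the browser takes focus away from the shell still depends on your window manager, so treat this as something to try rather than a guaranteed fix.

```python
import glob
import os
import pathlib
import webbrowser


def pdf_uris(directory):
    """Return absolute file:// URIs for every PDF directly inside `directory`."""
    pattern = os.path.join(directory, "*.pdf")
    return [pathlib.Path(p).resolve().as_uri() for p in sorted(glob.glob(pattern))]


def review(directory):
    """Open each PDF in a browser tab and ask whether to continue."""
    for uri in pdf_uris(directory):
        webbrowser.open_new_tab(uri)
        answer = input("Keep viewing files? Y/N ")
        if answer.strip().upper() != "Y":
            break
```

Calling review("pdfs") would then step through the same directory as in the question.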
idea 2: direct display of PNG
Would conversion to PNG and then direct display of the image file from python itself also be an alternative? This post may help in that case: View pdf image in an iPython Notebook
Related
I have a Python script I wrote which uses tkinter and pandas to:
choose a CSV file on the desktop,
import that data,
do some stuff,
export the newly created dataframe into a new CSV.
Is there a way to run the program without the person needing to open up a python IDE and run it from there?
Currently when I try to just click and run the tester.py program I see the cmd_line (terminal) box open briefly and then close without my tkinter prompt or anything else.
I wrote this program to help automate some tasks for non-technical coworkers. Ideally, they would just click on an exe file or a bat file, and the script would run, collect the user input it needs, and output the CSV file. Is there a way to set that up?
I've done some brief google searching but I haven't been able to find a clear answer.
import tkinter
import csv
import pandas as pd
from tkinter import Tk
from tkinter.filedialog import askopenfilename
Tk().withdraw() # we don't want a full GUI, so keep the root window from appearing
filename = askopenfilename() # show an "Open" dialog box and return the path to the selected file
print(filename)
df1 = pd.read_csv(filename)
df2 = df1.query("STATE != 'NY'") # stores records not in NY
df3 = df1[df1["FIRST_NAME"].str.contains(" ", regex=True)] # stores records that have a space in the first name
dferror = [df2, df3]
dferror = pd.concat(dferror).drop_duplicates().reset_index() # merges dataframes, drops duplicates
dferror.to_csv(r"C:\errors.csv") # raw string so the backslash is not read as an escape
edited to add my import clauses
You can write a small executable script based upon your OS
Windows
Create a .bat file containing the command that runs your Python file,
e.g. c:\python27\python.exe c:\somescript.py %*
For reference look here
MacOS / Linux
Create a .sh file which can be executed from your shell/terminal or by double-clicking its symlink.
Your .sh file will look like
#!/bin/bash
python PATH_TO_YOUR_PYTHON_FILE
Then you must make it executable via running the following in terminal
chmod u+x PATH_TO_.SH_FILE
Alternatively you can create a symbolic link to your python file and make it executable. This symlink will be executable by double click.
To create a symlink:
ln -sf PATH_TO_.SH_FILE PATH_OF_SYMLINK
If you put just a name in place of PATH_OF_SYMLINK it will be created in your present directory.
Thanks to @abarnert, the solution was to use PyInstaller, which allows me to "freeze" the code into an exe file.
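For anyone wanting to try the PyInstaller route, here is a rough sketch of driving it from Python instead of the command line. PyInstaller is a third-party package (pip install pyinstaller); --onefile and --windowed are real PyInstaller options, but build_pyinstaller_args and freeze are names invented here, and tester.py is simply the script name mentioned in the question.

```python
def build_pyinstaller_args(script_path, windowed=True):
    """Build the argument list for a single-file executable."""
    args = [script_path, "--onefile"]
    if windowed:
        # Suppress the console window; suits a tkinter dialog,
        # but print() output will no longer be visible.
        args.append("--windowed")
    return args


def freeze(script_path):
    """Run PyInstaller in-process (requires the pyinstaller package)."""
    import PyInstaller.__main__

    PyInstaller.__main__.run(build_pyinstaller_args(script_path))
```

Calling freeze("tester.py") should leave the executable in a dist folder next to where it was run.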
I am trying to use a script to open a file inside a network drive using python. The script is given below:
import os
import subprocess
filepath = r"O:\XXXX\test.xls"
subprocess.Popen(filepath, shell=True)
The network drive requires sign-in, but I am always signed in by default from the moment I turn on the computer. Also, os.listdir(folderpath) has no problem going into the network drive and listing all the files in the directory containing the file.
Tried some suggestions from similar posts but they don't work.
I am using Python 2.7, and Windows.
UPDATE:
No error was prompted after executing the script.
I am trying to open an Excel file. The script works to open Excel in other folders within the computer, but just not within the network drive.
Thanks to @J.F. Sebastian's suggestion: replacing subprocess.Popen(filepath, shell=True) with os.startfile(filepath) works.
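A minimal sketch of the os.startfile approach with a couple of guards added. os.startfile exists only on Windows, and the wrapper name open_with_default_app is made up here. It also assumes Python 3, since FileNotFoundError is not available in Python 2.7.

```python
import os
import sys


def open_with_default_app(path):
    """Open `path` with whatever application Windows associates with it."""
    if not os.path.exists(path):
        # Fail loudly instead of silently doing nothing.
        raise FileNotFoundError(path)
    if not sys.platform.startswith("win"):
        raise NotImplementedError("os.startfile is only available on Windows")
    os.startfile(path)
```

With the path from the question this would be called as open_with_default_app(r"O:\XXXX\test.xls").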
I think this could help you
import subprocess
file_path = r"X:\dir\excelfile.xlsx"
#~ also this works
#~ file_path = r"\\server\dir\excelfile.xlsx"
subprocess.call(file_path,shell = True)
I'm trying to code a simple application that must read all currently open files within a certain directory.
More specifically, I want to get a list of files open anywhere inside my Documents folder,
but I don't want only the processes' IDs or process names; I want the full path of each open file.
The thing is, I haven't found anything that does that.
I could do it neither in the Linux shell (using the ps and lsof commands) nor with Python's psutil library. Neither gives me the information I need, which is only the paths of currently open files in a directory.
Any advice?
P.S: I'm tagging this as python question (besides os related tags) because it would be a plus if it could be done using some python library.
This seems to work (on Linux):
import subprocess
import shlex
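cmd = shlex.split('lsof -F n +d .')
try:
    # universal_newlines=True returns text rather than bytes on Python 3
    output = subprocess.check_output(cmd, universal_newlines=True).splitlines()
except subprocess.CalledProcessError as err:
    # lsof may exit with a non-zero status and still print usable output
    output = err.output.splitlines()
# keep only the name fields and strip the leading 'n./'
output = [line[3:] for line in output if line.startswith('n./')]
# Out[3]: ['file.tmp']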
It reads open files from the current directory, non-recursively.
For a recursive search, use the +D option. Keep in mind that this is vulnerable to a race condition: by the time you get your output, the situation might already have changed. It is always best to try to do something (open the file) and check for failure, e.g. open the file and catch the exception, or check for a null FILE value in C.
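As an alternative to parsing lsof output, on Linux the same information can be read from /proc using only the standard library. This is a sketch, not a drop-in replacement: the function name open_files_under is made up, it is recursive by construction, it only sees processes you have permission to inspect, and it has the same race condition described above.

```python
import os


def open_files_under(directory):
    """Return sorted paths of files under `directory` held open by any visible process."""
    # Trailing separator so "/home/me/Documents2" does not match "/home/me/Documents".
    root = os.path.join(os.path.realpath(directory), "")
    found = set()
    try:
        pids = [name for name in os.listdir("/proc") if name.isdigit()]
    except OSError:
        return []  # no /proc: not Linux, or not mounted
    for pid in pids:
        fd_dir = os.path.join("/proc", pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed in the meantime
            if target.startswith(root):
                found.add(target)
    return sorted(found)
```

For the Documents folder in the question, the call would be open_files_under(os.path.expanduser("~/Documents")).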
I see that you can set where to download a file to through Webdriver, as follows:
from os import getcwd
from selenium import webdriver

fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", getcwd())
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/csv")
browser = webdriver.Firefox(firefox_profile=fp)
But I was wondering if there is a similar way to give the file a name when it is downloaded? Preferably it would not be tied to the profile, as I will be downloading ~6000 files through one browser instance and do not want to have to reinitialize the driver for each download.
I would suggest a slightly unusual approach: do not download files with Selenium if possible.
I mean get the file URL and use the urllib library to download the file and save it to disk in a 'manual' way. The issue is that Selenium doesn't have a tool to handle Windows dialogs, such as the 'save as' dialog. I'm not sure, but I doubt that it can handle any OS dialogs at all; please correct me if I'm wrong. :)
Here's a tiny example:
import urllib
urllib.urlretrieve( "http://www.yourhost.com/yourfile.ext", "your-file-name.ext")
The only job for us here is to make sure that we handle all the urllib Exceptions. Please see http://docs.python.org/2/library/urllib.html#urllib.urlretrieve for more info.
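The snippet above is Python 2. On Python 3 the function lives in urllib.request, and a minimal sketch with the exception handling mentioned might look like this. The name download_as is made up for illustration, and the URL and file name in the usage note are the placeholders from the example above.

```python
import urllib.request


def download_as(url, target_path):
    """Download `url` to `target_path`; return True on success, False on failure."""
    try:
        urllib.request.urlretrieve(url, target_path)
    except OSError as err:
        # urllib.error.URLError and its subclasses (HTTPError,
        # ContentTooShortError) all derive from OSError.
        print("Download of %s failed: %s" % (url, err))
        return False
    return True
```

Usage would be download_as("http://www.yourhost.com/yourfile.ext", "your-file-name.ext"). If the download needs the browser's login session, the cookies would have to be passed along as well, which this sketch does not attempt.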
I do not know if there is a pure Selenium handler for this, but here is what I have done when I needed to do something with the downloaded file.
Set up a loop that polls your download directory for the latest file that does not have a .part extension (this indicates a partial download and would occasionally trip things up if not accounted for). Put a timer on this to ensure that you don't go into an infinite loop in the case of a timeout or other error that causes the download not to complete. I used the output of the ls -t <dirname> command in Linux (my old code uses the commands module, which is deprecated, so I won't show it here :) ) and got the first file by using
# result = output of ls -t
result = result.split('\n')[1].split(' ')[-1]
If the while loop exits successfully, the topmost file in the directory will be your file, which you can then modify using os.rename (or anything else you like).
Probably not the answer you were looking for, but hopefully it points you in the right direction.
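The polling idea described above could be sketched without shelling out to ls, roughly as follows. The function name wait_for_download and its parameters are invented for illustration. It also assumes the browser may create a placeholder under the final name while the .part file still exists, so a file is only accepted once its .part companion is gone.

```python
import os
import time


def wait_for_download(directory, already_present, timeout=30.0, poll_interval=0.5):
    """Return the path of the newest finished file that was not in
    `already_present`, or None if nothing finishes before `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        names = set(os.listdir(directory))
        finished = [
            os.path.join(directory, name)
            for name in names
            if name not in already_present
            and not name.endswith(".part")      # partial download
            and name + ".part" not in names     # placeholder still being filled
            and os.path.isfile(os.path.join(directory, name))
        ]
        if finished:
            return max(finished, key=os.path.getmtime)
        time.sleep(poll_interval)
    return None
```

Take a snapshot with set(os.listdir(directory)) before triggering each download, pass it as already_present, and then rename the returned path with os.rename.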
Solution with code as suggested by the selected answer. Rename the file after each one is downloaded.
import os
os.chdir(SAVE_TO_DIRECTORY)
files = filter(os.path.isfile, os.listdir(SAVE_TO_DIRECTORY))
files = [os.path.join(SAVE_TO_DIRECTORY, f) for f in files] # add path to each file
files.sort(key=lambda x: os.path.getmtime(x))
newest_file = files[-1]
os.rename(newest_file, docName + ".pdf")
This answer was posted as an edit to the question naming a file when downloading with Selenium Webdriver by the OP user1253952 under CC BY-SA 3.0.
I have a rather simple program that writes HTML code ready for use.
It works fine, except that if one were to run the program from the Python command line, as is the default, the HTML file that is created is created where python.exe is, not where the program I wrote is. And that's a problem.
Do you know a way of getting the .write() function to write a file to a specific location on the disc (e.g. C:\Users\User\Desktop)?
Extra cool-points if you know how to open a file browser window.
The first problem is probably that you are not including the full path when you open the file for writing. For details on opening a web browser, read this fine manual.
import os
target_dir = r"C:\full\path\to\where\you\want\it"
fullname = os.path.join(target_dir,filename)
with open(fullname,"w") as f:
    f.write("<html>....</html>")
import webbrowser
url = "file://"+fullname.replace("\\","/")
webbrowser.open(url,True,True)
BTW: the code is the same in python 2.6.
I'll admit I don't know Python 3, so I may be wrong, but in Python 2, you can just check the __file__ variable in your module to get the name of the file it was loaded from. Just create your file in that same directory (preferably using os.path.dirname and os.path.join to remain platform-independent).
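For what it's worth, the __file__ idea works the same way in Python 3. A small sketch, with path_next_to as a made-up helper name:

```python
import os


def path_next_to(script_file, filename):
    """Return `filename` placed in the same directory as `script_file`."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, filename)
```

Inside your program you would write something like open(path_next_to(__file__, "page.html"), "w"), so the HTML file lands beside the script no matter which directory Python was started from.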