Solved: see my answer below, for anyone who might find it helpful.
I have two scripts a.py and b.py.
In my current directory "C:\Users\MyName\Desktop\MAIN", I run > python a.py.
The first script, a.py, runs in my current directory, does something to a bunch of files, and creates a new directory (testA) into which the edited versions of those files are moved as they are produced. Then I need to run b.py on the files in testA.
As a beginner, I would just copy and paste my b.py script into testA and execute the command again ("> python b.py"), which runs some commands on those new files and creates another folder (testB) with the edited files.
I am trying to eliminate the hassle of waiting for a.py to finish, moving into the new directory, pasting b.py there, and then running b.py. So I am trying to write a bash script that executes these scripts one after the other while maintaining my hierarchy of directories.
#!/usr/bin/env bash
python a.py && python b.py
Script a.py runs smoothly, but b.py does not execute at all. There are no error messages about b.py failing; I just think it cannot execute because once a.py is done, b.py does not exist in that NEW directory.
Is there a small snippet I can add to b.py that moves it into the new directory? I actually tried changing b.py's directory paths as well, but it did not work.
For example in b.py:
mydir = os.getcwd() # would be the same path as a.py
mydir_new = os.chdir(mydir+"\\testA")
I changed mydir to mydir_new in all instances within b.py, but that also made no difference... I also don't know how to move a script into a new directory within bash.
As a little flowchart of the folders:
MAIN # main folder with unedited files and both a.py and b.py scripts
|
| (execute a.py)
|
--------testA # first folder created with first edits of files
|
| (execute b.py)
|
--------------testB # final folder created with final edits of files
TLDR: How do I execute a.py and b.py from the main test folder (bash-script style?), if b.py relies on files created and stored in testA? Normally I copy b.py into testA and then run it, but now I have 200+ files, so copying and pasting is a waste of time.
The easiest answer is probably to change your working directory, then call the second .py file from where it is:
python a.py && cd testA && python ../b.py
Of course you might find it even easier to write a script that does it all for you, like so:
Save this as runTests.sh in the same directory as a.py is:
#!/bin/sh
python a.py
cd testA
python ../b.py
Make it executable:
chmod +x ./runTests.sh
Then you can simply enter your directory and run it:
./runTests.sh
I managed to get b.py to execute and produce the testB folder where I need it, while remaining in the MAIN folder. For anyone who might wonder: at the beginning of my b.py script I would normally use mydir = os.getcwd(), which is wherever b.py is.
To keep b.py in MAIN while making it work on files in other directories, I wrote this:
mydir = os.getcwd()  # would be the MAIN folder
mydir_tmp = mydir + "//testA"  # add the testA folder name
os.chdir(mydir_tmp)  # change the current working directory (os.chdir returns None, so there is no point assigning its result)
mydir = os.getcwd()  # read the working directory again; it now refers to testA
Running the bash script now works!
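For anyone adapting this: the chdir trick leaves the process sitting in testA afterwards. A small sketch (the helper name is mine, not from the original scripts) that does the same directory switch but restores the original working directory once the work in testA is done:

```python
import os
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily change the current working directory, then restore it."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)
```

With this, b.py can do `with working_directory("testA"): ...` and still end up back in MAIN when the block finishes, even if an exception is raised.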
In your batch file, you can set the %PYTHONPATH% variable to the folder with the Python module. This way, you don't have to change directories or use pushd for network drives. I believe you can also do something like
set "PYTHONPATH=%PYTHONPATH%;c:\the path\to\my folder\which contains my module"
This will append the path, I believe (this only works if you have already set %PYTHONPATH% in your environment variables).
If you haven't, you can also just do
set "PYTHONPATH=c:\the path\to\my folder\which contains my module"
Then, in the same batch file, you can do something like
python -m mymodule ...
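For completeness, the same effect is available from inside Python itself, without a batch file; this is a sketch, with a placeholder helper name, that extends the import search path at run time:

```python
import sys

def add_module_dir(path):
    """Prepend a folder to the import search path (same effect as PYTHONPATH)."""
    if path not in sys.path:
        sys.path.insert(0, path)
```

After `add_module_dir(r"c:\the path\to\my folder\which contains my module")`, a plain `import mymodule` will find the module there.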
Although there are already answers, I still wrote a script for the fun of it, and it could still help in some respects.
I wrote it for Python 3, so a few minor things need tweaking to run it on v2.x (e.g. the prints).
Anyway, the code creates a new folder relative to the location of a.py, creates and fills a script b.py with code, executes b.py, and displays b.py's results and errors.
The resulting path-structure is:
testFolder
|-testA
| |-a.py
|-testB
| |-b.py
The code is:
import os, sys, subprocess

def getRelativePathOfNewFolder(folderName):
    return "../" + folderName + "/"

def getAbsolutePathOfNewFolder(folderName):
    # create the new folder with an absolute path:
    # get the path of the current script:
    tmpVar = sys.argv[0]
    # cut off the file name after the last slash:
    tmpVar = tmpVar[:sys.argv[0].rfind("/")]
    # again, to go one folder up in the path, but this time keep the slash:
    tmpVar = tmpVar[:tmpVar.rfind("/") + 1]
    # append the name of the folder to be created:
    tmpVar += folderName + "/"
    # for the crazy ones out there, you could also write all of this as:
    # tmpVar = sys.argv[0][:sys.argv[0].rfind("/", 0, sys.argv[0].rfind("/") - 1) + 1] + folderName + "/"
    return tmpVar

if __name__ == "__main__":
    # do stuff here:
    # ...
    # create the new folder:
    bDir = getAbsolutePathOfNewFolder("testB")
    os.makedirs(bDir, exist_ok=True)  # makedirs can create nested dirs at once, e.g. "./new1/new2/andSoOn"
    # fill the new folder with stuff here:
    # ...
    # create a new python file in bDir with code in it:
    bFilePath = bDir + "b.py"
    with open(bFilePath, "a") as toFill:
        toFill.write("if __name__ == '__main__':")
        toFill.write("\n")
        toFill.write("\tprint('b.py was executed correctly!')")
        toFill.write("\n")
        toFill.write("\t#do other stuff")
    # execute the newly created python file
    args = ("python", bFilePath)
    popen = subprocess.Popen(args, stdout=subprocess.PIPE)
    # use the next line if a.py has to wait until the subprocess (here: b.py) has finished
    popen.wait()
    # you can get b.py's results with this:
    resultOfSubProcess, errorsOfSubProcess = popen.communicate()
    print(str(resultOfSubProcess))  # outputs: b'b.py was executed correctly!\r\n'
    print(str(errorsOfSubProcess))  # outputs: None
    # do other stuff
Instead of creating a new code file and filling it with code, you can of course simply copy an existing one, as shown here:
How do I copy a file in python?
Your b.py script could take the name of the directory as a parameter. Access the first parameter passed to b.py with:
import sys
dirname = sys.argv[1]
Then iterate over the files in the named directory with:
import os
for filename in os.listdir(dirname):
    process(filename)
Also see glob.glob and os.walk for more options for processing files.
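Putting both pieces together, a b.py along these lines might work; `collect_files` and `process` are placeholder names of mine, with `process` standing in for whatever the real per-file work is:

```python
import os
import sys

def collect_files(dirname):
    """Return the full paths of the regular files in dirname, sorted."""
    return sorted(
        os.path.join(dirname, name)
        for name in os.listdir(dirname)
        if os.path.isfile(os.path.join(dirname, name))
    )

def process(path):
    # placeholder: replace with the real per-file work
    print('processing', path)

if __name__ == '__main__' and len(sys.argv) > 1:
    for path in collect_files(sys.argv[1]):
        process(path)
```

Run it as `python b.py testA` from MAIN, so the directory name never has to be hard-coded.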
Related
Suppose I have two python scripts methods.py and driver.py.
methods.py has all the methods defined in it, and driver.py does the required job when I run it.
Let's say I am in a main directory with the two files driver.py and methods.py, and I have n subdirectories in the main directory named subdir1,subdir2,...,subdirn. All of these subdirectories have files which act as inputs to driver.py.
What I want to do is run driver.py in all these subdirectories and get my output from them, without writing copies of driver.py to disk.
How should I go about this?
At the moment, I am using the subprocess module to
Copy driver.py and methods.py to the subdirectories.
Run them.
The copying part is simple:
import subprocess
for i in range(n):
    cmd = "cp methods.py driver.py subdir" + str(i)
    p = subprocess.Popen(cmd, shell=True)
    p.wait()
# once every subdirectory has driver.py and methods.py, start running these scripts
for i in range(n):
    cmd = "cd subdir" + str(i) + " && python driver.py"
    p = subprocess.Popen(cmd, shell=True)
    p.wait()
Is there a way to do the above without using up disk space?
You might use Python's os.chdir() function to change the current working directory:
import os
#import methods
root = os.getcwd()
for d in ['subdir1', 'subdir2']:
    os.chdir(os.path.join(root, d))
    print("dir:", os.getcwd())
    exec(open("../driver.py").read())
I am also not sure whether you need Popen, since Python can execute Python files using the exec function. In this case it depends on how you import methods.py: do you simply import it, or use it in some other way inside your driver.py?
You could try to import it at top level inside your main script, or use an extended path like
exec(open("../methods.py").read())
inside your driver script. Keep in mind that none of these solutions is very elegant. Best would be to process the path inside your driver.py, as suggested by Gino Mempin; you could call os.chdir() from there.
Expanding on my initial comment: instead of copying driver.py everywhere, you can just make it accept a path to the subdirectory as a command-line argument. Then you'll also have to make sure it can do whatever it's supposed to do from any directory, which means taking into account the correct paths to files.
There are a number of ways to accept command-line args (see How to read/process command line arguments?). To keep it simple, let's just use sys.argv to get the path.
Here's a modified driver.py
import sys
from pathlib import Path
# Receive the path to the target subdir as command line args
try:
    subdir_path = Path(sys.argv[1]).resolve()
except Exception as e:
    print('ERROR resolving path to subdir')
    print(e)
    sys.exit(1)
# Do stuff, taking into account full path to subdir
print(f'Running driver on {subdir_path}')
with open(subdir_path.joinpath('input.txt'), 'r') as f:
    data = f.read()
print(f'Got data = {data}')
Let's say that after getting the path, driver.py expects to read a file (input.txt) from each subdirectory. So here you need the absolute path to the subdirectory (.resolve()) and you use that when accessing input.txt (.joinpath()). Basically, assume that driver.py will always be running from the main dir.
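In isolation, the two pathlib calls this relies on behave roughly like this (the directory does not even have to exist for the path arithmetic to work):

```python
from pathlib import Path

subdir = Path('subdir1').resolve()         # absolute path, anchored at the current dir
input_file = subdir.joinpath('input.txt')  # .../subdir1/input.txt
```

So whatever directory name arrives in sys.argv[1], the rest of the script only ever sees absolute paths.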
Sample usage would be:
main$ tree
.
├── driver.py
├── method.py
├── subdir1
│ └── input.txt
├── subdir2
│ └── input.txt
└── subdir3
└── input.txt
main$ python driver.py /temp/main/subdir3
Running driver on /temp/main/subdir3
Got data = 3333
Now, in method.py, you then don't need the "copy driver.py" code. Just loop through all the subdir{n} folders under main, then pass the full path to driver.py. You can still use the same Popen(...) code to call driver.py.
Here's a modified method.py:
import subprocess
from pathlib import Path
# Assume all subdirs are under the current directory
parent_path = Path.cwd()
for child_path in parent_path.iterdir():
    # Skip files/folders not named 'subdir{n}'
    if not child_path.is_dir() or 'subdir' not in child_path.stem:
        continue
    # iterdir returns full paths (cwd + /subdir)
    print(f'Calling driver.py on {child_path}')
    cmd = f'python driver.py {child_path}'
    p = subprocess.Popen(cmd, shell=True)
    p.wait()
Sample run:
main$ python method.py
Calling driver.py on /temp/main/subdir3
Running driver on /temp/main/subdir3
Got data = 3333
Calling driver.py on /temp/main/subdir2
Running driver on /temp/main/subdir2
Got data = 2222
Calling driver.py on /temp/main/subdir1
Running driver on /temp/main/subdir1
Got data = 1111
Notice also that all execution is done from main.
Some notes:
Instead of a hardcoded range(n), you can use iterdir or its equivalent os.listdir to get all the files/folders under a base path; iterdir conveniently returns full paths here. (Unless you need to go through the subdirs in a specific order.)
I prefer using pathlib over os.path because it offers more high-level abstractions to file/folder paths. If you want or need to use os.path, there is this table of os.path - pathlib equivalents.
I used f-strings here (see What is print(f"...")) which is only available for Python 3.6+. If you are on a lower version, just replace with any of the other string formatting methods.
I have a problem with relative paths in my Python 2.7 project. I have two files, let's call them script.py and importedScript.py, which live in different directories, because importedScript.py is in a subfolder.
importedScript.py has a method called openCSV(), which gets imported in script.py with
from subfolder.importedScript import openCSV
This works fine. The method openCSV(filename) has the following code inside:
script_path = os.path.dirname(os.path.abspath(__file__))
filepath = os.path.join(script_path, 'subfolder2/' + filename)
dataset = pd.read_csv(filepath)
This code imports a .csv file from a subfolder, and it also works fine if I run importedScript.py by itself.
The problem is that when I run script.py, the relative path in importedScript.py is generated incorrectly. For some reason, the system tries to load the file from "subfolder2/" instead of "subfolder/subfolder2".
Does anyone know how to fix this?
Edit: subfolder2 contains various .csv files, and I want to open different files from different Python files.
You can pass the __file__ variable to the method at call time:
def OpenCSV(file):
    here = os.path.dirname(os.path.abspath(file))
    # ...etc
It can then be called as OpenCSV(__file__).
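Spelled out a bit more, the suggestion amounts to something like the sketch below; the helper name is hypothetical, and pd.read_csv is left out so that just the path logic stands alone:

```python
import os

def csv_path(filename, caller_file):
    """Resolve filename inside subfolder2, relative to the *calling* script."""
    here = os.path.dirname(os.path.abspath(caller_file))
    return os.path.join(here, 'subfolder2', filename)
```

script.py would then call it as `csv_path('myfile.csv', __file__)`, so the lookup is anchored at the caller's own directory rather than at whatever the current working directory happens to be.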
I have a folder which contains .py scripts, and each of them contains the same variable, which I need in another script when a given script from that folder is used.
folder_x
main.py
folder_y
script1.py
script2.py
script3.py
So all the scripts are not used at the same time just one of them.
I found this solution https://stackoverflow.com/a/35524184/5708537
And it works well but I have to list all the scripts manually.
I thought I would automate this by making a list of the files and stripping off the .py ending:
path = '/home/folder_x/folder_y'
files = os.listdir(path)
module_list = [i for i in files if i.endswith('.py')]
module_list = [os.path.splitext(x)[0] for x in module_list]
Works like a charm.
But this part of the code still thinks that the scripts are in folder_x
variables = {}
for mod_name in module_list:
    mod = import_module(mod_name)
    variables[mod_name] = getattr(mod, 'var')
So how can I tell it that the scripts are in folder_y, and take that variable from each of them?
Or is there a better way to list scripts/modules from another folder, and get a variable from each of those?
If you want to import your own .py file, just place it in the same directory as the program that is calling it. For example:
from mymodule import *
This lets you run all functions and methods in that file.
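If, as in the question above, the scripts live in a different folder (folder_y), one hedged sketch is to load each file by its path with importlib.util instead of relying on the import search path; the variable name `var` is taken from the question, the helper name is mine:

```python
import importlib.util
import os

def load_variables(path, var_name='var'):
    """Import every .py file in `path` and collect the named module-level variable."""
    variables = {}
    for fname in os.listdir(path):
        if not fname.endswith('.py'):
            continue
        mod_name = os.path.splitext(fname)[0]
        spec = importlib.util.spec_from_file_location(
            mod_name, os.path.join(path, fname))
        mod = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(mod)
        variables[mod_name] = getattr(mod, var_name)
    return variables
```

Called as `load_variables('/home/folder_x/folder_y')`, this returns a dict mapping each script name to its `var`, without folder_y ever needing to be on sys.path.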
Let's assume the file structure below.
C:\folder1
file1.py
folder2
folder3
file3.py
I want file3.py to run file1.py from the command line with its arguments. Do I need to import folder1 or file1? How? And how do I call the script?
I tried the following
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
sys.path.append(os.path.join(currentdir, '../../'))
To run an external program in Python, some common choices are subprocess.Popen, subprocess.call, and os.system.
Take subprocess.Popen and your folder structure as example, here is file3.py:
import os
import subprocess
current_dir = os.path.dirname(os.path.realpath(__file__))
target_script = os.path.abspath(os.path.join(current_dir, '..', '..', 'file1.py'))
arg1 = 'test_value'
call_args = ['python', target_script, arg1]
subprocess.Popen(call_args)
The code above will run file1.py in a subprocess and pass the value of arg1 to it.
A more Pythonic solution is: put an __init__.py file under "folder1", "folder2" and "folder3"; then Python will treat these directories as packages.
In file1.py:
import sys

def func1(arg):
    print 'func1 received: %s' % arg

if __name__ == '__main__':
    # better to validate sys.argv here
    func1(sys.argv[1])
In this way, you can import file1.func1 in other python scripts, as well as run file1.py in command line directly.
Then, file3.py:
from ...file1 import func1
# "." means the current package, ".." one level above, and "..." two levels above
func1('test_value')
To execute file3.py: go to folder1's parent folder (i.e. C:\ in your example), then execute python -m folder1.folder2.folder3.file3
This solution may look more complicated, but as your project grows, a well-organized package structure will benefit you more.
Let's assume the following directory structure for a project:
<root>
__init__.py
helloworld.py
<moduleOne>
f.txt
__init__.py
printfile.py
where root and moduleOne are directories
Content of helloworld.py:
#!/usr/bin/python
import moduleOne.printfile
printf()
Content of moduleOne/printfile.py:
#!/usr/bin/python
f = open('f.txt')

def printf():
    print 'print file'
    print f

if __name__ == '__main__':
    printf()
My issue:
From moduleOne/, executing printfile.py is OK, but from root/, if I run helloworld.py the following error happens:
import moduleOne.printfile
File "/root/moduleOne/printfile.py", line 5, in <module>
f = open('f.txt')
IOError: [Errno 2] No such file or directory: 'f.txt'
How to solve this in python?
[Edited]
I solved (more or less) this issue with a workaround, but I still have a problem:
My solution:
In moduleOne/printfile
import sys

fname = 'moduleOne/f.txt'

def printf():
    f = open(fname)
    print 'print file'
    print f

if __name__ == '__main__':
    fname = 'f.txt'
    printf()
But....
Let's say I have a new directory under the root, called etc; then the new structure is:
<root>
__init__.py
helloworld.py
<moduleOne>
f.txt
__init__.py
printfile.py
<etc>
f2.txt
And now I need to access etc/f2.txt from moduleOne/printfile.py. How?
You need more abstraction.
Don't hard-code the file path in printfile.py
Don't access a global in the printf function.
Do accept a file handle as a parameter to the printf function:
def printf(file_handle):
    print 'print file'
    print file_handle
In a script that actually needs to know the path of f.txt (I guess helloworld.py in your case), put the path there, open the file, and pass it to printf:
from moduleOne.printfile import printf
my_f_file = open('/path/to/f.txt')
printf(my_f_file)
Better yet, get the file path from the command line
import sys
from moduleOne.printfile import printf
input_file_path = sys.argv[1]
my_f_file = open(input_file_path)
printf(my_f_file)
EDIT: You said on your Google+ cross-post:
full path is problem, the program will run on differents environments.
If you're trying to distribute your program to other users and machines, you should look into making a distribution package (see side note 3 below), and using package_data to include your configuration file, and pkgutil or pkg_resources to access the configuration file. See How do I use data in package_data from source code?
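As a minimal sketch of the pkgutil route (the package and resource names follow the question's layout; treat this as illustrative, not as the full packaging setup):

```python
import pkgutil

def read_packaged_file(package, resource):
    """Read a data file that ships inside a package, wherever it is installed."""
    return pkgutil.get_data(package, resource)  # bytes, or None if not found
```

So printfile.py could call `read_packaged_file('moduleOne', 'f.txt')` instead of `open('f.txt')`, and the lookup works no matter which directory the program was started from.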
Some side-notes:
Represent directories as the directory name with a trailing slash, à la the conventions of the tree command: root/ instead of <root>, moduleOne/ instead of <moduleOne>.
You're conflating "module" with "package". I suggest you rename moduleOne/ to packageOne/. A directory with an __init__.py file constitutes a package. A file ending in a .py extension is a module. Modules can be part of packages by physically existing inside a directory with an __init__.py file. Packages can be part of other packages by being a physical subdirectory of a parent directory with an __init__.py file.
Unfortunately, the term "package" is overloaded in Python and also can mean a collection of Python code for distribution and installation. See the Python Packaging Guide glossary.