Let's assume the file structure below.
C:\folder1
    file1.py
    folder2
        folder3
            file3.py
I want file3.py to run file1.py from the command line, passing it arguments. Do I need to import folder1 or file1? How? How do I call the script?
I tried the following
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
sys.path.append(os.path.join(currentdir, '../../'))
To run an external program from Python, some common choices are subprocess.Popen, subprocess.call and os.system.
Taking subprocess.Popen and your folder structure as an example, here is file3.py:
import os
import subprocess
current_dir = os.path.dirname(os.path.realpath(__file__))
target_script = os.path.abspath(os.path.join(current_dir, '..', '..', 'file1.py'))
arg1 = 'test_value'
call_args = ['python', target_script, arg1]
subprocess.Popen(call_args)
The code above runs file1.py in a subprocess and passes it the value of arg1 ('test_value').
A more Pythonic solution is to put an __init__.py file in each of "folder1", "folder2" and "folder3", so that Python treats these directories as packages.
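A variant of the same idea, as a minimal sketch: subprocess.run waits for the child to finish, and sys.executable reuses the interpreter that is running file3.py rather than whichever python comes first on PATH. The helper name run_script is made up for illustration and is not part of the answer above.

```python
import subprocess
import sys


def run_script(script_path, *args):
    """Hypothetical helper: run a script with the current interpreter, return its stdout."""
    completed = subprocess.run(
        [sys.executable, str(script_path), *args],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode the captured bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

In file3.py this would be called as run_script(target_script, arg1). Note that capture_output needs Python 3.7 or newer.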
In file1.py:
import sys
def func1(arg):
    print('func1 received: %s' % arg)

if __name__ == '__main__':
    # better to validate sys.argv here
    func1(sys.argv[1])
This way, you can import file1.func1 in other Python scripts, as well as run file1.py directly from the command line.
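For the "better to validate sys.argv" comment, one possible approach is argparse, which prints a usage message when the argument is missing. This is only a sketch; parse_value is a hypothetical name, not something from the answer.

```python
import argparse


def parse_value(argv=None):
    """Return the single positional argument; argv=None means sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="hypothetical CLI for file1.py")
    parser.add_argument("value", help="string to hand to func1")
    return parser.parse_args(argv).value


# In file1.py the guard would then read:
#   if __name__ == '__main__':
#       func1(parse_value())
```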
Then, file3.py:
from ...file1 import func1
# "." means current dir, ".." means one level above, and "..." is 2 levels above
func1('test_value')
To execute file3.py, go to folder1's parent folder (i.e. C:\ in your example), then run python -m folder1.folder2.folder3.file3
This solution may look more complicated, but as your project grows, a well-organized package structure pays off more and more.
The project has the structure shown below. I'm trying to import "mod.py" from "index.py":
from .. import mod
However, it gives the error: "ImportError: attempted relative import with no known parent package". If I use this option instead:
from pack1 import mod
then the error is: "ModuleNotFoundError: No module named 'pack1'"
PROJECT/
    pack1/
        __init__.py
        mod.py
    pack2/
        __init__.py
        index.py
What is the problem?
This is a recurring question on StackOverflow. Much of the confusion (in my opinion) comes from the fact that how Python interprets the files and folders it sees depends on where Python is run from. First, some terminology:
module: a file containing Python code.
package: a folder containing files with Python code and other folders.
When you start Python in a directory (folder), it doesn't "know" what the namespace of that directory should be. I.e., if you are working in Z:\path\to_my\project\ when you start Python:
1. it does NOT consider project to be a package.
2. any .py files you want to import from will be in their own namespace as modules.
3. any folders you want to import from will also be in their own namespace as packages.
What about __init__.py? Since version 3.3, Python has implicit namespace packages, which allows importing without needing to create an empty __init__.py file.
Consider #2: if you have two files: first.py and second.py:
path/
    to_my/
        project/
            >>Python is running here<<
            first.py
            second.py
with these contents:
# first.py
first_var = 'hello'
# second.py
from .first import first_var
second_var = first_var + ' world'
if you try to import like this:
>>> import second
Python basically does the following:
"ok, I see second.py"
"Reading that in as a module, chief!"
"Ok, it wants to import .first"
"The . means get the package (folder) that contains first.py"
"Wait, I don't have a parent package for first.py!"
"Better raise an error."
The same rules apply for #3 as well. If we add a few packages to the project like this:
path/
    to_my/
        project/
            >>Python is running here<<
            first.py
            second.py
            pack1/
                mod.py
                other_mod.py
            pack2/
                index.py
with the following contents:
# pack1/mod.py
mod_var = 1234
# pack1/other_mod.py
from .mod import mod_var
other_var = mod_var * 10
# pack2/index.py
from ..pack1 import mod
and when you try to import like this:
>>> from pack2 import index
The import in pack2/index.py is going to fail for the same reason the one in second.py did. Python will work its way up the import chain of dots like this:
"Reading in index.py as a module."
"Looks like it wants to import mod from ..pack1."
"Ok, . is the pack2 parent package namespace of index.py, found that."
"So, .. is the parent package of pack2."
"But, I don't have a parent package for pack2!"
"Better raise an error."
How do we make it work? Two things.
First, move where Python is running up one level so that all of the .py files and subfolders are considered to be part of the same package namespace, which allows the files to reference each other using relative references.
path/
    to_my/
        >>Python is running here now<<
        project/
            first.py
            second.py
            pack1/
                mod.py
                other_mod.py
            pack2/
                index.py
So now Python sees project as a package namespace, and all of the files within can use relative references up to that level.
This changes how you import when you are in the Python interpreter:
>>> from project.pack2 import index
Second, you make explicit references instead of relative references. That can make the import statements really long, but if you have several top-level modules that need to pull from one another, this is how you can do it. This is useful when you are defining your functions in one file and writing your script in another.
# first.py
first_var = 'hello'
# second.py
from first import first_var # we dropped the dot
second_var = first_var + ' world'
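The first fix (starting Python one level up) can be checked with a small experiment. The sketch below rebuilds the first.py/second.py layout, with the relative import, in a temporary folder and imports second from a fresh interpreter started in two different places. The names try_import and demo are made up for this illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def try_import(statement, cwd):
    """Run `statement` in a fresh interpreter whose working directory is `cwd`."""
    result = subprocess.run(
        [sys.executable, "-c", statement],
        cwd=str(cwd), capture_output=True, text=True,
    )
    return result.returncode, result.stdout.strip(), result.stderr.strip()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        project.mkdir()
        (project / "first.py").write_text("first_var = 'hello'\n")
        (project / "second.py").write_text(
            "from .first import first_var\n"
            "second_var = first_var + ' world'\n"
        )
        # Started inside project/: second is a top-level module, so "." has no parent package.
        inside = try_import("import second", cwd=project)
        # Started one level up: second is project.second, so "." resolves to project.
        above = try_import("import project.second as s; print(s.second_var)", cwd=tmp)
    return inside, above


if __name__ == "__main__":
    for outcome in demo():
        print(outcome)
```

The first call should fail with the "attempted relative import" ImportError, and the second should succeed, since only the starting directory differs.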
I hope this helps clear up some of the confusion about relative imports.
Suppose I have two python scripts methods.py and driver.py.
methods.py has all the methods defined in it, and driver.py does the required job when I run it.
Let's say I am in a main directory with the two files driver.py and methods.py, and I have n subdirectories in the main directory named subdir1,subdir2,...,subdirn. All of these subdirectories have files which act as inputs to driver.py.
What I want to do is run driver.py in all these subdirectories and get my output from them, without writing driver.py to disk.
How should I go about this?
At the moment, I am using the subprocess module to
Copy driver.py and methods.py to the subdirectories.
Run them.
The copying part is simple:
import subprocess

# n is the number of subdirectories (subdir1 ... subdirn)
for i in range(1, n + 1):
    cmd = "cp methods.py driver.py subdir" + str(i)
    p = subprocess.Popen(cmd, shell=True)
    p.wait()

# once every subdirectory has driver.py and methods.py, start running these codes
for i in range(1, n + 1):
    cmd = "cd subdir" + str(i) + " && python driver.py"
    p = subprocess.Popen(cmd, shell=True)
    p.wait()
Is there a way to do the above without using up disk space?
You might use Python's os.chdir() function to change the current working directory:
import os
#import methods
root = os.getcwd()
for d in ['subdir1', 'subdir2']:
    os.chdir(os.path.join(root, d))
    print("dir:", os.getcwd())
    exec(open("../driver.py").read())
I am also not sure if you need Popen, since Python is able to execute Python files using the exec function. In this case it depends on how you import methods.py. Do you simply import it, or use it some other way inside your driver.py?
You could try to import it at top level inside your main script, or use an extended path like:
exec(open("../methods.py").read())
inside your driver script. Keep in mind that none of these solutions is very elegant. The best approach would be to process the path inside driver.py, as suggested by Gino Mempin. You could call os.chdir() from there.
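If you do go the "run it in-process" route, the standard library's runpy.run_path is an alternative to exec(open(...).read()). A rough sketch, assuming driver.py only needs the working directory to be the subdirectory; run_in_directory is a hypothetical name:

```python
import os
import runpy


def run_in_directory(script_path, directory):
    """Execute a script with `directory` as the working directory.

    Returns the script's global namespace, as runpy.run_path does.
    """
    script_path = os.path.abspath(script_path)  # resolve before changing directory
    previous = os.getcwd()
    os.chdir(directory)
    try:
        return runpy.run_path(script_path, run_name="__main__")
    finally:
        os.chdir(previous)  # always restore the original working directory
```

Called as run_in_directory("driver.py", "subdir1") from the main directory. For the import of methods inside driver.py to work, the main directory should still be importable (for example because you started Python there).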
Expanding on my initial comment, instead of copying driver.py everywhere, you can just make it accept a path to the subdirectory as a command-line argument. Then, you'll have to also make sure it can do whatever it's supposed to do from any directory. This means taking into account correct paths to files.
There are a number of ways to accept command-line args (see How to read/process command line arguments?). To keep it simple, let's just use sys.argv to get the path.
Here's a modified driver.py:
import sys
from pathlib import Path
# Receive the path to the target subdir as command line args
try:
    subdir_path = Path(sys.argv[1]).resolve()
except Exception as e:
    print('ERROR resolving path to subdir')
    print(e)
    sys.exit(1)

# Do stuff, taking into account full path to subdir
print(f'Running driver on {subdir_path}')
with open(subdir_path.joinpath('input.txt'), 'r') as f:
    data = f.read()
    print(f'Got data = {data}')
Let's say after getting the path, driver.py expects to read a file (input.txt) from each subdirectory. So here you need to get the absolute path to the subdirectory (.resolve()) and to use that when accessing input.txt (.joinpath()). Basically, think that driver.py will always be running from the main dir.
Sample usage would be:
main$ tree
.
├── driver.py
├── method.py
├── subdir1
│   └── input.txt
├── subdir2
│   └── input.txt
└── subdir3
    └── input.txt
main$ python driver.py /temp/main/subdir3
Running driver on /temp/main/subdir3
Got data = 3333
Now, in method.py, you then don't need the "copy driver.py" code. Just loop through all the subdir{n} folders under main, then pass the full path to driver.py. You can still use the same Popen(...) code to call driver.py.
Here's a modified method.py:
import subprocess
from pathlib import Path
# Assume all subdirs are under the current directory
parent_path = Path.cwd()
for child_path in parent_path.iterdir():
    # Skip files/folders not named 'subdir{n}'
    if not child_path.is_dir() or 'subdir' not in child_path.stem:
        continue
    # iterdir returns full paths (cwd + /subdir)
    print(f'Calling driver.py on {child_path}')
    cmd = f'python driver.py {child_path}'
    p = subprocess.Popen(cmd, shell=True)
    p.wait()
Sample run:
main$ python method.py
Calling driver.py on /temp/main/subdir3
Running driver on /temp/main/subdir3
Got data = 3333
Calling driver.py on /temp/main/subdir2
Running driver on /temp/main/subdir2
Got data = 2222
Calling driver.py on /temp/main/subdir1
Running driver on /temp/main/subdir1
Got data = 1111
Notice also that all execution is done from main.
Some notes:
Instead of a hardcoded range(n), you can use iterdir (or os.listdir) to get all the files/folders under a base path. iterdir yields full paths built from the base path, so they are absolute when the base is absolute (as with Path.cwd() here); os.listdir returns bare names that you have to join to the base yourself. Neither guarantees any particular order, so sort the results if you need to go through the subdirs in a specific order.
I prefer using pathlib over os.path because it offers more high-level abstractions for file/folder paths. If you want or need to use os.path, there is this table of os.path - pathlib equivalents.
I used f-strings here (see What is print(f"...")), which are only available in Python 3.6+. If you are on a lower version, just replace them with any of the other string formatting methods.
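On the ordering note: if the order matters, one option is to sort the subdirectories by their numeric suffix. The sketch below also passes the command as a list, which sidesteps shell quoting when a path contains spaces. find_subdirs and run_driver_everywhere are hypothetical names, and the "subdir" + number naming is taken from the question.

```python
import re
import subprocess
import sys
from pathlib import Path


def find_subdirs(parent, prefix="subdir"):
    """Return directories named '<prefix><number>' under parent, in numeric order."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    matches = []
    for child in Path(parent).iterdir():
        found = pattern.match(child.name)
        if child.is_dir() and found:
            matches.append((int(found.group(1)), child))
    return [child for _, child in sorted(matches)]


def run_driver_everywhere(parent, driver="driver.py"):
    """Call the driver once per subdirectory, passing the path as an argument."""
    for subdir in find_subdirs(parent):
        # A list of arguments avoids shell quoting problems with spaces in paths.
        subprocess.run([sys.executable, str(driver), str(subdir)], check=True)
```

Sorting on the integer rather than the name keeps subdir10 after subdir2.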
python project files hierarchy:
parent/
    __init__.py
    one/
        __init__.py
        bar.py
    two/
        __init__.py
        foo.py
foo.py
from one import bar
I tried to run foo.py from a terminal in another directory (e.g. users/user) and got the following error:
No module named one
When I run foo.py, I guess Python tries to import the files relative to the directory the code was executed from. I tried a lot of approaches and eventually found one that works, but it is not elegant, and I hope there is a better solution.
foo.py
from pathlib import Path
import sys
sys.path.append(str(Path(__file__).parent.parent))
sys.path.append("..")
from one import bar
This solution is not elegant because it prevents me from putting all the imports at the top of the file.
The fact that you have an __init__.py in the parent directory suggests that parent is part of your package structure and that its parent directory, whatever that might be, should be on Python's module search path (sys.path). Therefore your import should really be:
from parent.one import bar
It can be useful for an application directory structure to have a single root. Then the __init__.py in that single root package can be used to load modules from subpackages, but this is certainly not a requirement. If that was not your intention, then you should probably delete the __init__.py that is in parent, as it is serving no purpose (and is confusing), and ensure that the directory parent is on sys.path.
HOWEVER: As long as the current directory you are in when you run your program is the parent directory of the root(s) of your package structure, Python should be able to find your packages with no special action on your part when you start it interactively or with python -m, because the current directory is then put on sys.path automatically. (When you run a script by file name, it is the script's own directory that is added instead.) If that is inconvenient, you can set the environment variable PYTHONPATH.
So, determine whether you should be changing your import statement or not based on which directories are part of your package structure. Then you should arrange for Python to find your packages either by setting the current directory, PYTHONPATH, or sys.path to the required directory -- but do this once. If you have to set sys.path, I would do this in your main program at startup, before it needs to import anything:
If foo.py is your main program, then at the top of the program I would have:
if __name__ == '__main__':
    from pathlib import Path
    import sys
    # foo.py lives in parent/two/, so .parent is two/ and .parent.parent is parent/
    # if your import statement is: from parent.one import bar, then:
    sys.path.insert(0, str(Path(__file__).parent.parent.parent))
    """
    # if your import statement is: from one import bar, then:
    sys.path.insert(0, str(Path(__file__).parent.parent))
    """
Why not let the parent act as a path provider for the child by creating a path dictionary? Like this:
import os

class parent:
    ...
    def createPathDict(self):
        self.path_dict = {}
        self.path_dict['parent'] = self.parentPath
        self.path_dict['one'] = os.path.join(self.parentPath, 'one')
        self.path_dict['two'] = os.path.join(self.parentPath, 'two')
        # self.path_dict['three'] = …
        # ...
From child 'two' you use the dictionary like this (I assume you use classes):
import sys

class foo:
    def __init__(self, parent):
        self.parent = parent
    def addPathsToPythDirs(self):
        sys.path.insert(1, self.parent.path_dict['one']) # better
        # sys.path.insert(0, self.parent.path_dict[key])
        ...
In that way you could keep your imports in foo.py
Why use sys.path.append(path) instead of sys.path.insert(1, path)?
Solved: see my answer below, for anyone who might find this helpful.
I have two scripts a.py and b.py.
In my current directory "C:\Users\MyName\Desktop\MAIN", I run > python a.py.
The first script, a.py runs in my current directory, does something to a bunch of files and creates a new directory (testA) with the edited versions of those files which are simultaneously moved into that new directory. Then I need to run b.py for the files in testA.
As a beginner, I would just copy and paste my b.py script into testA and execute the command again "> python b.py", which runs some commands on those new files and creates another folder (testB) with those edited files.
I am trying to eliminate the hassle of waiting for a.py to finish, move into that new directory, paste b.py, and then run b.py. I am trying to write a bash script that executes these scripts while maintaining my hierarchy of directories.
#!/usr/bin/env bash
python a.py && python b.py
Script a.py runs smoothly, but b.py does not execute at all. There are no error messages coming up about b.py failing, I just think it cannot execute because once a.py is done, b.py does not exist in that NEW directory.
Is there a small script I can add within b.py that moves it into the new directory? I actually tried changing b.py directory paths as well but it did not work.
For example in b.py:
mydir = os.getcwd() # would be the same path as a.py
mydir_new = os.chdir(mydir+"\\testA")
I changed mydirs to mydir_new in all instances within b.py, but that also made no difference...I also don't know how to move a script into a new directory within bash.
As a little flowchart of the folders:
MAIN # main folder with unedited files and both a.py and b.py scripts
|
| (execute a.py)
|
--------testA # first folder created with first edits of files
|
| (execute b.py)
|
--------------testB # final folder created with final edits of files
TLDR: How do I execute a.py and b.py from the main test folder (bash script style?), if b.py relies on files created and stored in testA. Normally I copy and paste b.py into testA, then run b.py - but now I have 200+ files so copying and pasting is a waste of time.
The easiest answer is probably to change your working directory, then call the second .py file from where it is:
python a.py && cd testA && python ../b.py
Of course you might find it even easier to write a script that does it all for you, like so:
Save this as runTests.sh in the same directory as a.py is:
#!/bin/sh
python a.py
cd testA
python ../b.py
Make it executable:
chmod +x ./runTests.sh
Then you can simply enter your directory and run it:
./runTests.sh
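If you would rather stay in Python (for instance on Windows, where the question's paths come from), the same sequence can be sketched with subprocess and its cwd argument. run_pipeline is a hypothetical name; the folder name testA is taken from the question.

```python
import subprocess
import sys
from pathlib import Path


def run_pipeline(main_dir):
    """Run a.py in main_dir, then run b.py (still stored in main_dir) inside testA."""
    main_dir = Path(main_dir).resolve()
    # check=True stops the pipeline if a.py fails, like `&&` in the shell.
    subprocess.run([sys.executable, str(main_dir / "a.py")], cwd=str(main_dir), check=True)
    # cwd= plays the role of `cd testA`; b.py itself is never copied.
    subprocess.run(
        [sys.executable, str(main_dir / "b.py")],
        cwd=str(main_dir / "testA"),
        check=True,
    )
```

Because b.py is addressed by its full path, it can stay in MAIN while its working directory is testA.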
I managed to get b.py executing and producing the testB folder where I need it to, while remaining in the MAIN folder. For anyone who might wonder, at the beginning of my b.py script I would simply use mydir = os.getcwd() which normally is wherever b.py is.
To keep b.py in MAIN while making it work on files in other directories, I wrote this:
mydir = os.getcwd() # would be the MAIN folder
mydir_tmp = os.path.join(mydir, "testA") # add the testA folder name
os.chdir(mydir_tmp) # change the current working directory (os.chdir returns None)
mydir = os.getcwd() # set the main directory again, now it points at testA
Running the bash script now works!
In your batch file, you can set the %PYTHONPATH% variable to the folder with the Python module. This way, you don't have to change directories or use pushd for network drives. I believe you can also do something like
set "PYTHONPATH=%PYTHONPATH%;c:\the path\to\my folder\which contains my module"
I believe this appends the path (it only works if you have already set %PYTHONPATH% in your environment variables).
If you haven't, you can also just do
set "PYTHONPATH=c:\the path\to\my folder\which contains my module"
Then, in the same batch file, you can do something like
python -m mymodule ...
Although there are already answers, I wrote a script for fun, and it might still be of help in some respects.
I wrote it for Python 3, so some minor things need tweaking to run it on v2.x (e.g. the prints).
Anyway, the code creates a new folder relative to the location of a.py, creates b.py and fills it with code, executes b.py and displays its results and errors.
The resulting path-structure is:
testFolder
|-testA
| |-a.py
|-testB
| |-b.py
The code is:
import os, sys, subprocess

def getRelativePathOfNewFolder(folderName):
    return "../" + folderName + "/"

def getAbsolutePathOfNewFolder(folderName):
    # create new folder with absolute path:
    # get path of current script:
    tmpVar = sys.argv[0]
    # separate path from last slash and file name:
    tmpVar = tmpVar[:sys.argv[0].rfind("/")]
    # again to go one folder up in the path, but this time let the slash be:
    tmpVar = tmpVar[:tmpVar.rfind("/")+1]
    # append name of the folder to be created:
    tmpVar += folderName + "/"
    # for the crazy ones out there, you could also write this like this:
    # tmpVar = sys.argv[0][:sys.argv[0].rfind("/", 0,
    #     sys.argv[0].rfind("/")-1)+1] + folderName + "/"
    return tmpVar

if __name__ == "__main__":
    # do stuff here:
    # ...
    # create new folder:
    bDir = getAbsolutePathOfNewFolder("testB")
    os.makedirs(bDir, exist_ok=True) # makedirs can create new nested dirs at once. e.g: "./new1/new2/andSoOn"
    # fill new folder with stuff here:
    # ...
    # create new python file in location bDir with code in it
    # ("w" overwrites, so re-running a.py does not append the code a second time):
    bFilePath = bDir + "b.py"
    with open(bFilePath, "w") as toFill:
        toFill.write("if __name__ == '__main__':")
        toFill.write("\n")
        toFill.write("\tprint('b.py was executed correctly!')")
        toFill.write("\n")
        toFill.write("\t#do other stuff")
    # execute newly created python file
    args = (
        "python",
        bFilePath
    )
    popen = subprocess.Popen(args, stdout=subprocess.PIPE)
    # use next line if the a.py has to wait until the subprocess execution is finished (in this case b.py)
    popen.wait()
    # you can get b.py's results with this:
    resultOfSubProcess, errorsOfSubProcess = popen.communicate()
    print(str(resultOfSubProcess)) # outputs: b'b.py was executed correctly!\r\n'
    print(str(errorsOfSubProcess)) # outputs: None
    # do other stuff
Instead of creating a new code file and filling it with code, you can of course simply copy an existing one, as shown here:
How do I copy a file in python?
Your b.py script could take the name of the directory as a parameter. Access the first parameter passed to b.py with:
import sys
dirname = sys.argv[1]
Then iterate over the files in the named directory with:
import os
for filename in os.listdir(dirname):
    process(filename)
Also see glob.glob and os.walk for more options for processing files.
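One detail worth spelling out: os.listdir returns bare file names, so they have to be joined to the directory before they can be opened from somewhere else. A small sketch (files_in is a hypothetical helper name):

```python
import os


def files_in(dirname):
    """Return full paths of the regular files directly inside dirname."""
    paths = []
    for filename in sorted(os.listdir(dirname)):
        full_path = os.path.join(dirname, filename)  # listdir returns bare names
        if os.path.isfile(full_path):
            paths.append(full_path)
    return paths


# In b.py this could be used as:
#   for path in files_in(sys.argv[1]):
#       process(path)
```

With this, b.py can stay in MAIN and be run as python b.py testA.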
Let's assume the following directory structure for a project:
<root>
    __init__.py
    helloworld.py
    <moduleOne>
        f.txt
        __init__.py
        printfile.py
where root and moduleOne are directories.
Content of helloworld.py:
#!/usr/bin/python
import moduleOne.printfile
moduleOne.printfile.printf()
Content of moduleOne/printfile.py:
#!/usr/bin/python
f = open('f.txt')

def printf():
    print('print file')
    print(f)

if __name__ == '__main__':
    printf()
My issue:
From moduleOne/, running printfile.py works, but from root/, if I run helloworld.py, the following error happens:
    import moduleOne.printfile
  File "/root/moduleOne/printfile.py", line 5, in <module>
    f = open('f.txt')
IOError: [Errno 2] No such file or directory: 'f.txt'
How to solve this in python?
[Edited]
I solved (more or less) this issue with a workaround, but still have a problem:
My solution:
In moduleOne/printfile.py:
import sys
fname = 'moduleOne/f.txt'

def printf():
    f = open(fname)
    print('print file')
    print(f)

if __name__ == '__main__':
    fname = 'f.txt'
    printf()
But....
Let's say I have a new directory under root, called etc. The new structure is then:
<root>
    __init__.py
    helloworld.py
    <moduleOne>
        f.txt
        __init__.py
        printfile.py
    <etc>
        f2.txt
And now I need to access etc/f2.txt from moduleOne/printfile.py. How?
You need more abstraction.
Don't hard-code the file path in printfile.py
Don't access a global in the printf function.
Do accept a file handle as a parameter to the printf function:
def printf(file_handle):
    print('print file')
    print(file_handle)
In a script that does actually need to know the path of f.txt (I guess helloworld.py in your case), put it there, open it, and pass it to printf:
from moduleOne.printfile import printf
my_f_file = open('/path/to/f.txt')
printf(my_f_file)
Better yet, get the file path from the command line
import sys
from moduleOne.printfile import printf
input_file_path = sys.argv[1]
my_f_file = open(input_file_path)
printf(my_f_file)
EDIT: You said on your Google+ cross-post:
full path is problem, the program will run on differents environments.
If you're trying to distribute your program to other users and machines, you should look into making a distribution package (see side note 3 below), and using package_data to include your configuration file, and pkgutil or pkg_resources to access the configuration file. See How do I use data in package_data from source code?
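A lighter-weight alternative, which is not what the answer above proposes, is to build paths relative to the module's own location instead of the current working directory. This is only a sketch and assumes the files stay where the question's tree puts them; path_near is a hypothetical helper name.

```python
from pathlib import Path


def path_near(module_file, *parts):
    """Build a path relative to the folder that contains module_file."""
    return Path(module_file).resolve().parent.joinpath(*parts)


# Inside moduleOne/printfile.py this would be used as:
#   F_TXT = path_near(__file__, "f.txt")
#   F2_TXT = path_near(__file__, "..", "etc", "f2.txt")
```

This works regardless of the directory the program is started from, but it does not cover installed or zipped distributions, which is where package_data and pkgutil come in.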
Some side-notes:
Represent directories as the directory name with a trailing slash, à la the conventions of the tree command: / instead of <root>, moduleOne/ instead of <moduleOne>
You're conflating "module" with "package". I suggest you rename moduleOne/ to packageOne/. A directory with an __init__.py file constitutes a package. A file ending in a .py extension is a module. Modules can be part of packages by physically existing inside a directory with an __init__.py file. Packages can be part of other packages by being a physical subdirectory of a parent directory with an __init__.py file.
Unfortunately, the term "package" is overloaded in Python and also can mean a collection of Python code for distribution and installation. See the Python Packaging Guide glossary.