What's the appropriate way to set up a Python environment automatically without using pip/setuptools?
I'm required to do this for a project and I'm not allowed to 'install' anything on the server. I don't have permissions for that and it's a script that will be used by multiple users.
I need to set up the PYTHONPATH so modules can be imported in the package as well as create symbolic links for my command line script. It's my first time doing this so I'll appreciate your help a lot.
My project folder looks something like:
ProjectName/
README
LICENSE
projectname/
projectname # command-line script
src/
module1.py
module2.py
tests/
test.py
Should I just create a quick bash script to do this or is there a better way to do it?
First of all, there are a bunch of solutions on stackoverflow regarding this but from the ones I tried none of them is working. I am working on a remote machine (linux). I am prototyping within the dir-2/module_2.py file using an ipython interpreter. Also I am trying to avoid using absolute paths as the absolute path in this remote machine is long and ugly, and I want my code to run on other machines upon download.
My directory structure is as follows:
/project-dir/
    dir-1/
        __init__.py
        module_1.py
    dir-2/
        __init__.py
        module_2.py
        module_3.py
Now I want to import module_1 from module_2. However, the solution suggested in a Stack Overflow post, namely using
sys.path.append('../..')
import module_1
does not work. I get the error: ModuleNotFoundError: No module named 'module_1'
Moreover, within the ipython interpreter, a statement like import .module_3 inside module_2 throws an error:
import .module_3
^ SyntaxError: invalid syntax
Isn't the dot operator supposed to work within the same directory as well? Overall I am quite confused by the importing mechanism. Any help with the initial problem is greatly appreciated! Thanks a lot!
Why didn't it work?
If you run module_1.py and want to import module_2, then you need something like
sys.path.append("../dir-2")
If you use sys.path.append("../..") instead, the folder you add to the path is the one containing project-dir, and there is no module_2.py file inside it.
The statement import .module_3 is not valid Python in any context, which is why you get a SyntaxError. Relative imports use the from form: from . import module_3. Even with the correct syntax, the import fails if you execute module_2.py directly, because you are then using module_2.py as a script and it has no parent package. To use relative imports you need to treat both module_2.py and module_3.py as modules of a package. That is, some other file imports module_2, and module_2 imports something from module_3 using this syntax.
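To make the script-versus-module difference concrete, here is a minimal sketch. It builds a throwaway package in a temporary folder (demo_pkg, VALUE and RESULT are made-up names for illustration), then runs module_2.py once as a script and once by importing it as part of the package. It assumes the interpreter puts the working directory on sys.path for -c, which is the default.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "module_3.py").write_text("VALUE = 3\n")
    (pkg / "module_2.py").write_text(
        "from . import module_3\n"
        "RESULT = module_3.VALUE + 1\n"
    )

    # Case 1: run module_2.py directly. It has no parent package,
    # so the relative import raises ImportError.
    as_script = subprocess.run(
        [sys.executable, str(pkg / "module_2.py")],
        capture_output=True, text=True,
    )
    script_failed = as_script.returncode != 0

    # Case 2: import module_2 as part of demo_pkg from the parent folder.
    as_module = subprocess.run(
        [sys.executable, "-c",
         "from demo_pkg import module_2; print(module_2.RESULT)"],
        capture_output=True, text=True, cwd=tmp,
    )
    module_output = as_module.stdout.strip()

print(script_failed, module_output)
```

The first run fails and the second prints the computed value, even though the file contents are identical in both cases.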
Suggestion on how you can proceed
One possible solution that solves both problems is properly organizing the project and (optionally, but a good idea) packaging your library (that is, making your code "installable"). Then, once your library is installed (in the virtual environment you are working in), you don't need hacky sys.path solutions. You will be able to import your library from any folder.
Furthermore, don't treat your modules as scripts (don't run your modules). Use a separate python file as your "executable" (or entry point) and import everything you need from there. With this, relative imports in your module*.py files will work correctly and you don't get confused.
A possible directory structure could be
/project-dir/
    apps/
        main.py
    yourlib/
        __init__.py
        dir-1/
            __init__.py
            module_1.py
        dir-2/
            __init__.py
            module_2.py
            module_3.py
Notice that the yourlib folder and its subfolders each contain an __init__.py file. With this structure, you only run main.py (the name does not need to be main.py).
Case 1: You don't want to package your library
If you don't want to package your library, you can add sys.path.append("../") in main.py to add the project-dir/ folder to the path. Note that "../" is resolved against the current working directory, so this only works when you launch main.py from inside apps/. With that, your yourlib library will be importable in main.py: you can import from yourlib and its subpackages, and the modules inside yourlib can use relative imports. (Subfolder names must be valid identifiers to be imported with the normal syntax, so prefer dir_2 over dir-2.) Alternatively, you can put main.py directly in the project-dir/ folder; then you don't need to change sys.path at all, because Python puts the directory of the script being run at the front of sys.path.
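If you go the sys.path route, one way to make it independent of where the script is launched from is to derive the folder from the script's own location rather than from a relative string. A minimal sketch, where project_root_for and add_to_sys_path are hypothetical helper names and the path below is a made-up stand-in for __file__:

```python
import sys
from pathlib import Path


def project_root_for(script_file):
    """Return the folder one level above the script's own folder."""
    return Path(script_file).resolve().parent.parent


def add_to_sys_path(directory):
    """Prepend a directory to sys.path, without adding it twice."""
    entry = str(directory)
    if entry not in sys.path:
        sys.path.insert(0, entry)
    return entry


# In apps/main.py one would call: add_to_sys_path(project_root_for(__file__))
# A made-up path stands in for __file__ here so the sketch runs anywhere.
root = project_root_for("/tmp/project-dir/apps/main.py")
entry = add_to_sys_path(root)
print(root.name)
```

Because the folder is computed from the file location, the import of yourlib no longer depends on the directory you happen to be in when you start the script.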
Note that you can also have a tests folder inside project-dir and to run a test file you can do the same as you did to run main.py.
Case 2: You want to package your library
The previous solution already solves your problems, but going the extra mile adds some benefits, such as dependency management and no need to change sys.path no matter where you are. There are several options to package your library and I will show one option using poetry due to its simplicity.
After installing poetry, you can run the command below in a terminal to create a new project
poetry new mylib
This creates the following folder structure
mylib/
    README.rst
    mylib/
        __init__.py
    pyproject.toml
    tests
You can then add the apps folder if you want, as well as subfolders inside mylib/ (each with a __init__.py file).
The pyproject.toml file specifies the dependencies and project metadata. You can edit it by hand and/or use poetry to add new dependencies, such as
poetry add pandas
poetry add --dev mypy
to add pandas as a dependency and mypy as a development dependency, for instance. After that, you can run
poetry install
to create a virtual environment and install your library in it (poetry build, by contrast, only produces the distributable sdist and wheel archives). You can activate the virtual environment with poetry shell, or run single commands in it with poetry run, and you will be able to import your library from anywhere. Because the library is installed in editable mode, you can change your library files without running poetry install again.
Finally, if you want to publish your library on PyPI for everyone to see, you can use
poetry publish --username your_pypi_username --password your_pypi_password
TL;DR
Use an organized project structure with a clear place for the scripts you execute. Particularly, it is better if the script you execute is outside the folder with your modules. Also, don't run a module as a script (otherwise you can't use relative imports).
I have written a few Python classes that exist in their own directory.
mylib/
__init__.py
a.py
b.py
I've also written two clients that use the library:
Google AppEngine (in the API directory)
Python script providing a command line interface and flags (in the CLI directory).
My entire project directory is:
myproject/
CLI/
command_line_client.py
API/
app.yaml
lib/
mylib/
__init__.py
a.py
b.py
I don't know if a canonical structure exists, but this seemed sensible because I can change the library once, and both the CLI and API will be updated.
However I'm unsure how this would actually work. Two problems specifically:
AppEngine requires libraries to exist in a lib subdirectory so they're deployed to AppEngine along with the app. How do I get mylib to the AppEngine lib subdirectory?
The CLI and mylib directory exist at the same level, so I'm unsure how Python imports work. How would my CLI Python script import the library?
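Just create a symbolic link to your mylib folder inside API/lib. After that you can easily access mylib from your API service. Note that a relative link target is resolved from the directory containing the link, hence ../../mylib:
myproject$ ln -s ../../mylib API/lib/mylib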
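If you would rather create the link from a setup script, the same idea can be sketched in Python. This assumes a POSIX system where symlinks need no special privileges and that mylib sits directly under myproject/; link_library is a hypothetical helper name. The relative target is computed from the link's own directory, because that is where the operating system resolves it from.

```python
import os
import tempfile
from pathlib import Path


def link_library(library_dir, lib_dir):
    """Create lib_dir/<name> as a relative symlink pointing at library_dir."""
    library_dir = Path(library_dir)
    link = Path(lib_dir) / library_dir.name
    # A relative target is resolved from the link's directory, not the cwd.
    target = os.path.relpath(library_dir, start=link.parent)
    os.symlink(target, link, target_is_directory=True)
    return link, target


with tempfile.TemporaryDirectory() as tmp:
    # Throwaway copy of the layout from the question.
    project = Path(tmp) / "myproject"
    (project / "mylib").mkdir(parents=True)
    (project / "mylib" / "__init__.py").write_text("")
    (project / "API" / "lib").mkdir(parents=True)

    link, link_target = link_library(project / "mylib", project / "API" / "lib")
    link_resolves = (link / "__init__.py").exists()

print(link_target, link_resolves)
```

Whether the deployment tooling follows symlinks is something to verify for your setup; the sketch only shows how to build a link that resolves correctly.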
I have a bunch of unittest test cases in separate directories. There is also a directory which just contains helper scripts for the tests. So my file tree looks like this
test_dir1
test_dir2
test_dir3
helper_scripts
Each python file in test_dir* will have these lines:
import sys
sys.path.append('../helper_scripts')
import helper_script
This all works fine, as long as I run the tests from within their directory. However, I would like to be at the project root and just run:
py.test
and have it traverse all the directories and run each test it finds. The problem is that the tests are being run from the wrong directory, so the sys.path.append doesn't append the helper_scripts directory; it appends a path relative to the parent of the project root. This makes all the imports fail with an ImportError.
Is there a way to tell py.test to run the test scripts from their own directory, i.e. change the cwd before executing them? If not, is there another test runner I can use that will?
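The effect described in the question can be reproduced with plain path arithmetic. A small sketch, using made-up directories under /home/user/project and a hypothetical helper resolve_from, shows where the same relative sys.path entry ends up depending on the working directory:

```python
import posixpath


def resolve_from(cwd, relative_entry):
    """Return the absolute directory a relative path refers to from cwd."""
    return posixpath.normpath(posixpath.join(cwd, relative_entry))


entry = "../helper_scripts"

# Running from inside a test directory: the entry points at the helpers.
from_test_dir = resolve_from("/home/user/project/test_dir1", entry)

# Running from the project root: the entry points outside the project.
from_root = resolve_from("/home/user/project", entry)

print(from_test_dir)
print(from_root)
```

This is why the imports only work when the tests are started from within their own directory.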
What I usually do is structure my project like this:
myproject/
    setup.py
    myproject/
        __init__.py
        mymodule.py
        tests/
            __init__.py
            test_dir1/
                test_mymodule.py
            helper_scripts/
                __init__.py
                helper_script.py
For running tests, I use a virtualenv with myproject installed in development mode using one of the following commands in the myproject root directory:
pip install -e .
python setup.py develop
Now in test_mymodule.py I can just say
from myproject.tests.helper_scripts import helper_script
I can then just run pytest and there's no need to change the working directory in tests at all.
See Pytest's Good Integration Practices for a great summary of pros and cons for different project directory structures.
os.chdir("newdir")
will change your current working directory
I would suggest that you instead configure your environment so that import helper_scripts will work regardless of the current directory. This is the recommended approach.
If you absolutely must though, you can use relative imports instead:
from .. import helper_script
I am developing python functions in different .py files (example DisplayTools.py, CollectionTools.py...) in order to import them as tools in a more general file Start.py. It works fine if all the files are in the same directory. I can say in Start.py "import DisplayTools" ...
But how can I organize these files in a more project-like and user-friendly way (where users only have to work on the Start.py file)? For example, with a file organization such as:
Project/
    Start.py
    Tools/
        DisplayTools.py
        CollectionTools.py
I've read about the use of __init__ files, but how do they work, where should I put them, and what do they contain?
I'd appreciate any help on how to organize my project this way.
Many thanks
I'd refactor your code organization just a bit and give your toplevel directory a more descriptive name. Today, I pick happy_bananas. So let's say you organize your files like this:
happy_bananas
    start.py
    DisplayTools.py
    CollectionTools.py
then all you need to do is add an empty __init__.py file and you can use it just like any other package, e.g.:
happy_bananas
    __init__.py
    start.py
    DisplayTools.py
    CollectionTools.py
And now you can do:
from happy_bananas import DisplayTools
just like you would have before.
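As a quick check that an empty __init__.py is enough, here is a minimal sketch that builds such a package in a temporary folder and imports from it. The show function is a made-up placeholder, and the sys.path.insert line stands in for having the parent folder of happy_bananas on the import path (which is what installing the package would give you).

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "happy_bananas"
    package.mkdir()
    (package / "__init__.py").write_text("")  # empty marker file
    (package / "DisplayTools.py").write_text(
        "def show(text):\n"
        "    return f'<<{text}>>'\n"
    )

    # Make the parent folder of the package importable for this demo only.
    sys.path.insert(0, tmp)
    try:
        DisplayTools = importlib.import_module("happy_bananas.DisplayTools")
        shown = DisplayTools.show("hi")
    finally:
        sys.path.remove(tmp)

print(shown)
```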
Now, to get this into your system, you need to package it and use an install script. You can do this using setuptools (distutils, the older alternative, was removed from the standard library in Python 3.12), but perhaps the simplest existing description of how to do this is in Zed Shaw's Learn Python The Hard Way Exercise 46. You really can just copy/paste those files as described there and end up with a directory structure like this:
happy_bananas
    setup.py
    tests
        test_happy_bananas.py
    happy_bananas
        __init__.py
        start.py
        DisplayTools.py
        .
        .
Then, when you have your setup script written, you can go into your folder and run python setup.py install (or python setup.py develop) and be able to import happy_bananas in any file.
On a separate note, the naming convention in Python (PEP 8) is to use snake_case for file and function names. So instead of DisplayTools.py, it would be better to name the file display_tools.py. CamelCase is usually reserved for class names only.
Well, for starters I would simply change my files to have a set of functions and some main code, since the files could also be executed.
For example:
if __name__ == "__main__":
    dosomething()
Then, in the main file, you simply import the other scripts and you can use the functions defined there without actually running the scripts.
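A complete file following this pattern could look like the sketch below. The body of dosomething and the main function are placeholders for illustration; only the guard at the bottom is the actual technique.

```python
def dosomething():
    """Placeholder for the real work this module provides."""
    return "did something"


def main():
    """Code that should run only when the file is executed directly."""
    print(dosomething())


if __name__ == "__main__":
    # True when run as "python thisfile.py", False when the file is imported.
    main()
```

Another file can then import the module and call dosomething() without triggering main().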
I have some problems in structuring my python project. Currently it is a bunch of files in the same folder. I have tried to structure it like
proj/
    __init__.py
    foo.py
    ...
    bar/
        __init__.py
        foobar.py
        ...
    tests/
        foo_test.py
        foobar_test.py
        ...
The problem is that I'm not able, from inner directories, to import modules from the outer directories. This is particularly annoying with tests.
I have read PEP 328 about relative imports and PEP 366 about relative imports from the main module. But both these methods require the base package to be in my PYTHONPATH. Indeed I obtain the following error
ValueError: Attempted relative import in non-package.
So I added the following boilerplate code on top of the test files
import os, sys
sys.path.append(os.path.join(os.getcwd(), os.path.pardir))
Still I get the same error. What is the correct way to
structure a package, complete with tests, and
add the base directory to the path to allow imports?
EDIT As requested in the comment, I add an example import that fails (in the file foo_test.py)
import os, sys
sys.path.append(os.path.join(os.getcwd(), os.path.pardir))
from ..foo import Foo
When you use the -m switch to run code, the current directory is added to sys.path. So the easiest way to run your tests is from the parent directory of proj, using the command:
python -m proj.tests.foo_test
To make that work, you will need to include an __init__.py file in your tests directory so that the tests are correctly recognised as part of the package.
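To see this end to end, here is a minimal sketch that recreates a tiny proj package in a temporary folder and runs the test module with -m from the parent directory. The contents of foo.py and foo_test.py are made up for illustration; the subprocess call is the equivalent of typing the command above in a shell.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    proj = Path(tmp) / "proj"
    tests = proj / "tests"
    tests.mkdir(parents=True)
    (proj / "__init__.py").write_text("")
    (tests / "__init__.py").write_text("")  # makes tests part of the package
    (proj / "foo.py").write_text("class Foo:\n    name = 'foo'\n")
    (tests / "foo_test.py").write_text(
        "from ..foo import Foo\n"
        "print(Foo.name)\n"
    )

    # Same as: cd <parent of proj> && python -m proj.tests.foo_test
    result = subprocess.run(
        [sys.executable, "-m", "proj.tests.foo_test"],
        capture_output=True, text=True, cwd=tmp,
    )
    test_output = result.stdout.strip()

print(test_output)
```

The relative import from ..foo succeeds here because -m runs the file as the module proj.tests.foo_test, so Python knows its parent package.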
I like to import modules using the full proj.NAME package prefix whenever possible. This is the approach the Google Python styleguide recommends.
One option to allow you to keep your package structure, use full package paths, and still move forward with development would be to use a virtualenv and put your project in develop mode. Your project's setup.py will need to use setuptools instead of distutils, to get the develop command.
This will let you avoid the sys.path.append stuff above:
% virtualenv ~/virt
% . ~/virt/bin/activate
(virt)~% cd ~/myproject
(virt)~/myproject% python setup.py develop
(virt)~/myproject% python tests/foo_test.py
Where foo_test.py uses:
from proj.foo import Foo
Now when you run python from within your virtualenv, sys.path will include your project, so all of its packages can be imported. You can create a shorter shell alias to enter your virtualenv without having to type . ~/virt/bin/activate every time.