What does "Mark directory as sources root" really do? - python

In PyCharm, if you right-click a folder inside your project, you can mark it as a sources root, so that you can import modules from that folder and its subfolders.
However, this only makes the program runnable inside PyCharm. If I try to run it from outside PyCharm (e.g. from the console), it complains that certain modules are not found, which is the problem I'm facing.
If I mark a certain folder as sources root my program runs fine, but I need to understand what the option does so that I can configure the program to find these modules even when not using PyCharm.
I want to know exactly what this option does, and how I can get the same behaviour without using it.
Is it just adding an __init__.py file in the root folder?
Is it doing something like:
import sys
sys.path.insert(0, my_folder)

First, __init__.py marks a directory as a regular package directory. Since Python 3.3 a directory without it can still be imported as a namespace package, so the file is no longer strictly required. With it, Python will look for submodules inside that directory for imports. It does not change where Python searches for the package itself.
"Mark directory as sources root" adds the directory to PYTHONPATH (not PATH) for the programs that PyCharm launches. In a shell, the equivalent is:
export PYTHONPATH="${PYTHONPATH}:/your/source/root"
PYTHONPATH augments the default search path for module files.
This lets the interpreter search for modules in the appended path. The default search path is installation dependent (normally it contains the standard library and site-packages directories).
You can also manipulate this search path using sys.path from inside a Python file.
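One way to see the effect of PYTHONPATH outside PyCharm is to start a child interpreter with the variable set and look at its sys.path. This is a minimal sketch; the helper name child_sys_path is made up for illustration.

```python
import json
import os
import subprocess
import sys


def child_sys_path(extra_dir):
    """Return sys.path as seen by a fresh interpreter that was started
    with extra_dir appended to PYTHONPATH (hypothetical helper)."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    # os.pathsep is ":" on Linux/macOS and ";" on Windows.
    env["PYTHONPATH"] = extra_dir if not existing else existing + os.pathsep + extra_dir
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling child_sys_path("/your/source/root") should return a list that contains that directory, which is the same thing the export line above achieves for programs started from that shell.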

Related

Is it necessary to add my project to the environment variables PATH or PYTHONPATH?

I have been reading a lot about how to set up projects, with an __init__.py in each folder of the structure and also one in the folder above the root folder where the project sits.
I have this folder structure, and running it in PyCharm works, since PyCharm adds the path to the environment variables when it starts.
C:\test_folder\folder_struture\project
C:\test_folder\folder_struture\project\pak1
C:\test_folder\folder_struture\project\pak1\pak_num1.py
C:\test_folder\folder_struture\project\pak1\__init__.py
C:\test_folder\folder_struture\project\program
C:\test_folder\folder_struture\project\program\program.py
C:\test_folder\folder_struture\project\program\__init__.py
C:\test_folder\folder_struture\project\__init__.py
C:\test_folder\folder_struture\__init__.py
When I try to run program.py, where I have:
from project.pak1 import pak_num1
I get an error saying that the module doesn't exist. When I add the project to the PYTHONPATH variable or the PATH variable on my Windows machine, everything works fine. Is it possible that the tutorials skip the step of adding the project's root folder to the environment variables because they assume you have already done it?
Do I need to add the project root to the environment variables for every project, or is there a way for Python to recognize that it is inside a Python module/package structure?
Adding it to the environment variables lets me use absolute imports,
but if I try a relative import with
from ..pak1 import pak_num1
I get:
ImportError: attempted relative import with no known parent package
If I run program.py, does Python look for the __init__.py in the same folder and, if it finds it, go one level up to find another __init__.py, and so on, to work out the structure?
If you are writing lots of little scripts and start wanting to organize some of them into utility packages, mostly to be used by other scripts (outside those packages), put all the scripts that are meant to be called from the command line (your entry points, or main scripts, or whatever you call the scripts that run as commands) side by side, in the same folder as the root folders of all your packages.
Then you can import anything from any package in the top-level scripts. Starting execution from the top-level scripts not only gives access to every package, it also allows all the package-internal imports to work, provided that an __init__.py file sits in every package and sub-package folder.
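A minimal sketch of that layout, built in a temporary directory so it can be run anywhere. The file names (main.py, pak1, pak_num1.py) and the helper run_top_level_script are assumptions chosen for illustration. The point is only that a script started from the folder that contains the packages can import them with no PYTHONPATH setup, because Python puts the script's own directory first on sys.path.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_top_level_script():
    """Create root/main.py next to root/pak1/ and run main.py (illustrative)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pkg = root / "pak1"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "pak_num1.py").write_text(
            "def hello():\n    return 'hello from pak1'\n"
        )
        # The entry point sits side by side with the package folder.
        (root / "main.py").write_text(
            "from pak1 import pak_num1\nprint(pak_num1.hello())\n"
        )
        # Drop PYTHONPATH to show that the layout alone is enough.
        env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
        result = subprocess.run(
            [sys.executable, str(root / "main.py")],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
```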
For code inside a package to import from a sibling package (a sibling in the top-level folder), you need either to append the folder that contains the sibling package to sys.path or, for example, to wrap everything in yet another folder, again with an __init__.py file in it. In that case you should start execution from outside this overall package, not from the scripts that were top-level before the change.
Think of packages as something to be imported for use, not as something to start running from a random inner entry point.
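To illustrate with the layout from the question: running program.py directly makes it the top-level script with no parent package, so the relative import fails, while starting it from outside the package with python -m keeps the package context. This is a sketch under assumptions: the -m invocation is not mentioned in the answer above, and the helper name and the temporary layout are made up for the demonstration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def relative_import_demo():
    """Return (direct_ok, module_ok) for two ways of starting program.py."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        for sub in ("pak1", "program"):
            (project / sub).mkdir(parents=True)
            (project / sub / "__init__.py").write_text("")
        (project / "__init__.py").write_text("")
        (project / "pak1" / "pak_num1.py").write_text("VALUE = 1\n")
        (project / "program" / "program.py").write_text(
            "from ..pak1 import pak_num1\nprint(pak_num1.VALUE)\n"
        )
        env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
        # 1) Run the file directly: it becomes __main__ with no parent package.
        direct = subprocess.run(
            [sys.executable, str(project / "program" / "program.py")],
            env=env, capture_output=True, text=True,
        )
        # 2) Run it as a module from the folder that contains "project".
        as_module = subprocess.run(
            [sys.executable, "-m", "project.program.program"],
            cwd=tmp, env=env, capture_output=True, text=True,
        )
        return direct.returncode == 0, as_module.returncode == 0
```

The first run should fail with the "attempted relative import with no known parent package" error from the question, and the second should succeed.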
Often a better alternative is to set up a virtualenv; every package you install in the environment becomes known in that environment. This also isolates dependencies from project to project, including the Python version in use if you need that. Note that this also solves a different, and rather hairy, problem.

Pycharm mistakes folder for module

I recently switched computers and reinstalled Python and PyCharm. When I tried to open my saved projects on the new computer I ran into a problem: PyCharm doesn't recognize the parent folder of some of the files, and mistakes it for a module.
The folder is called "Enrichment_extra_stuff". When I try to import a file in that folder from another file in that folder, PyCharm seems to recognize it, because it shows me all of the files inside it, but when I run the code I get the error ModuleNotFoundError: No module named 'Enrichment_extra_stuff'.
Also, weirdly, when I do an explicit import and just write import fr to import the file fr, PyCharm shows an error, but when I run it, it works as it should.
I tried digging around in PyCharm, but got confused and didn't find anything. My Python interpreter version is 3.8 and I program on Windows, if that helps.
A folder (or better, a directory) is not seen as a regular package unless you put an __init__.py file in it. The file can be empty, or it can hold the module content. This is Python: a module is either a file or a directory with __init__.py.
The second part is specific to PyCharm: PyCharm is built to handle large projects, and often your program is not in the base (root) directory of your project but in one or more subdirectories (e.g. src). So you should tell PyCharm explicitly which directories are the source roots. Go to the project structure panel (the panel with the files, usually on the left side), right-click your base source directory, select Mark Directory as in the pop-up menu, and then select Sources Root.
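Outside PyCharm, the same effect comes down to whether the folder that contains the package is on sys.path. A small sketch, assuming a hypothetical helper is_importable, that checks this with importlib.util.find_spec:

```python
import importlib
import importlib.util
import sys


def is_importable(name, search_dir):
    """Report whether the top-level module or package `name` can be found
    when search_dir is temporarily placed first on sys.path (illustrative)."""
    sys.path.insert(0, search_dir)
    try:
        importlib.invalidate_caches()  # make sure new directories are noticed
        return importlib.util.find_spec(name) is not None
    finally:
        sys.path.remove(search_dir)
```

In the question, the script being run lives inside Enrichment_extra_stuff, so that folder itself is first on sys.path. That is why import fr works at run time, while import Enrichment_extra_stuff only works if the folder's parent is on the path too.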

Do I need PYTHONPATH

There are many similar questions about PYTHONPATH and imports, but I didn't find exactly what I needed.
I have a git repository that contains a few python helper scripts. The scripts are naturally organized in a few packages. Something like:
scripts/main.py
scripts/other_main.py
scripts/__init__.py
a/foo.py
a/bar.py
a/__init__.py
b/foo.py
b/bar.py
b/__init__.py
__init__.py
scripts depends on a and b. I'm using absolute imports in all modules. I run python3 scripts/main.py. Everything works as long as I set PYTHONPATH to the root of my project.
However, I'd like to spare users the hassle of setting up an environment variable.
What would be the right way to go? I expected this to work like in Java, where the current dir is on the classpath by default, but that doesn't seem to be the case. I've also tried relative imports without success.
EDIT: it seems to work if I remove the top-level __init__.py
Firstly, I think you're right that you don't need the top-level __init__.py. Removing it doesn't solve any import error for me though.
You won't need to set PYTHONPATH, and there are a few alternatives that I can think of:
1. Use a virtual environment (https://virtualenv.pypa.io/en/latest/). This would also require you to package up your code into an installable package (https://packaging.python.org/). I won't explain this option further since it's not directly related to your question.
2. Move your modules under your scripts directory. Python automatically adds the script's directory to the Python path.
3. Modify the sys.path variable in your scripts so they can find your local modules.
The second option is the most straightforward.
The third option would require you to add some Python code to the top of your scripts, above your normal imports. In your main.py it would look like:
#!/usr/bin/env python
import os.path, sys
# abspath keeps this correct when the script is started with a relative path
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import a
import b
What this does is:
1. Take the filename of the script
2. Calculate the parent directory of the directory of the script
3. Prepend that directory to sys.path
4. Then do normal imports of your modules
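The same steps can be written with pathlib. This is only a sketch: project_root and add_project_root_to_path are names made up here, and the code assumes the script lives exactly one directory below the project root, as scripts/main.py does.

```python
import sys
from pathlib import Path


def project_root(script_file):
    """Return the parent of the directory that contains script_file."""
    return Path(script_file).resolve().parent.parent


def add_project_root_to_path(script_file):
    """Prepend the project root to sys.path (once) and return it as a string."""
    root = str(project_root(script_file))
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

In scripts/main.py this would be called as add_project_root_to_path(__file__) before import a and import b.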

Multiple user created modules from a directory not in PATH using just command window (Anaconda 3)

Up to this point I have organized my projects in such a way that everything I'm working on is in the same folder, so to play around/debug I have just launched Python from that folder like this:
C:\Users\Username\Dropbox\Projects\MyShinyProject>Python
>>>
However, I want to start organizing things better. I have created some "Utilities" classes that I expect I'll use over and over again. So they should be in their own folder.
So now, say I have a Projects folder (in Windows) with lots of subfolders of things I have been working on:
Projects
    Sandbox
        Sandbox1
        Sandbox2
    MyUtilities
        __init__.py
        Utility1.py
    MyShinyProject
        __init__.py
        ImportantClass.py
I would like to be able to go into the command prompt and use classes/functions from both the MyUtilities folder and from the MyShinyProject folder. However, if I launch Python from inside MyShinyProject, I don't have access to MyUtilities (or vice versa). I've tried doing a relative import like this:
>>>import ..MyUtilities.Utility1
But that doesn't work:
import ..MyUtilities.Utility1
^
SyntaxError
If it matters: I don't use an IDE. I just use Notepad++ and the command prompt. Also, I added the __init__.py files to the folders because I read somewhere you're supposed to do that when you make modules, but I don't understand how to get all of this working correctly, or if I'm even close to doing it right.
I also tried adding my Projects folder to the PATH variable in the Windows environment table, but that doesn't seem to work. Even after adding it importing doesn't work, and when I do this:
import sys
for x in sys.path:
    print(x)
...the folder I added to PATH does not appear (I tried adding it to the beginning and the end).
How can I use several of my user created modules together using the command prompt to import them?
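Assuming MyUtilities sits directly inside your Projects folder, in the console you can do this:
import sys
sys.path.append(r"C:\Users\Username\Dropbox\Projects")
import MyUtilities.Utility1
The raw string (r"...") matters: in a normal string literal "\U" starts a Unicode escape and the line fails with a SyntaxError. If you have an __init__.py in your Projects folder and would rather write import Projects.MyUtilities.Utility1, append the folder that contains Projects instead.
Or, if you want to add the directory to the Python path permanently, append it to the value of the environment variable called PYTHONPATH.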
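The sys.path approach can be tried end to end in a throwaway directory. The sketch below recreates a small part of the layout from the question (Projects/MyUtilities/Utility1.py) in a temporary folder; the function greet and the helper name are invented for the demonstration. With the Projects folder itself on sys.path, the package is imported as MyUtilities.Utility1.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_utility_from_projects():
    """Create Projects/MyUtilities/Utility1.py in a temp dir and import it."""
    projects = Path(tempfile.mkdtemp()) / "Projects"
    utilities = projects / "MyUtilities"
    utilities.mkdir(parents=True)
    (utilities / "__init__.py").write_text("")
    (utilities / "Utility1.py").write_text(
        "def greet():\n    return 'hi from Utility1'\n"
    )
    # Put the folder that CONTAINS the package on the search path.
    sys.path.append(str(projects))
    importlib.invalidate_caches()
    module = importlib.import_module("MyUtilities.Utility1")
    return module.greet()
```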

IntelliJ Python plugin & Run classpath

I have a project located at /home/myself/workspace/Project1, for which I created an SDK from a Python 2.7.3 virtualenv I have set up.
This project uses some external code that I have in an accessible directory, e.g. /home/myself/LIBRARY; this directory contains several directories with code, docs etc....
For example, there is a module "important_util" located at /home/myself/LIBRARY/mymodule/important_util.py.
Now, I added the whole dir /home/myself/LIBRARY in the SDK Classpath, and in the Editor window it appears just fine. The imports and calls are recognized and I can also navigate through the code in LIBRARY directories.
The problem is that, when I try to run my program, it fails at the first import using LIBRARY!!!
Traceback (most recent call last):
  File "/home/myself/workspace/Project1/my_program.py", line 10, in <module>
    from mymodule import important_util as ut
ImportError: No module named mymodule
I also tried to add the same directories to the section "Global Libraries" in the Sources section...but no luck.
I can't seem to find a way to add this code to the Run classpath, how would I be able to do this?
Make sure you have __init__.py in mymodule directory:
The __init__.py files are required to make Python treat the
directories as containing packages; this is done to prevent
directories with a common name, such as string, from unintentionally
hiding valid modules that occur later on the module search path. In
the simplest case, __init__.py can just be an empty file, but it can
also execute initialization code for the package or set the __all__
variable, described later. (Python tutorial, "Packages")
UPDATE: In IntelliJ IDEA additional directories should be added as Module Dependencies or configured as Libraries (to be added to the Dependencies) instead of through the Classpath tab of the Python SDK.
I've verified that this folder (D:\dev\lib) is added to the PYTHONPATH and import works.
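A quick way to make the same check from inside a run configuration is to print what the running interpreter actually sees. This is a minimal sketch; the helper name path_report is a placeholder, not part of any IDE or library.

```python
import os
import sys


def _same_dir(a, b):
    """Compare two directory paths after normalising them."""
    def norm(p):
        return os.path.normcase(os.path.abspath(p))
    return norm(a) == norm(b)


def path_report(directory):
    """Summarise whether `directory` is visible to the running interpreter."""
    env_entries = os.environ.get("PYTHONPATH", "").split(os.pathsep)
    return {
        "interpreter": sys.executable,
        "on_sys_path": any(_same_dir(p, directory) for p in sys.path if p),
        "in_PYTHONPATH": any(_same_dir(p, directory) for p in env_entries if p),
    }
```

Printing path_report("/home/myself/LIBRARY") at the top of my_program.py would show whether the Run configuration passes the library directory on to the interpreter, as opposed to the editor merely indexing it.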
In IntelliJ 14 it's a little different; you add modules/eggs like so:
Go to File -> Project Structure
Now select Modules and then "Dependencies" tab
Click the "+" icon and select "Library"
Click "New Library" and select Java (I know it's weird...)
Now choose one or more modules / eggs and click "OK".
Select "Classes" from categories.
Give your new library a name, "My Python not Java Library"
And finally click "Add Selected"
As of version 2017.1 the procedure has changed again. There is no Project Structure entry in the File menu. The current procedure is:
Go to Preferences/Settings: File -> Settings (IDE Name -> Preferences on macOS)
Select Build, Execution, Deployment
Select Python Interpreter
Open the project interpreter drop-down menu and select the path of the Python version required for the project.
Click Apply and wait a few minutes to let IntelliJ index the Python packages.
All errors should be gone now, and you should see the Python used in the project in the list of external libraries.
Happy Coding.
