Including a Python library with my script

My system administrator will not allow global installation of python packages.
I'm writing a script that people will invoke to perform certain actions for them. The script needs certain libraries like sqlalchemy and coloredlogs. I am, however, allowed to install Python libs in any local folder, i.e. not site-packages.
How would I go about installing the libs in the same folder as the script so that the script has access to them?
My folder hierarchy is like so:
script_to_invoke.py
scriptpack/
    bin/
        coloredlogs
        coloredlogs.egg
        ...
    utils/
        util1.py
        util2.py
(all the folders indicated have an __init__.py)
What I've tried so far:
within script_to_invoke.py I use
from scriptpack.utils import util1 # no problem here
from scriptpack.bin import coloredlogs # fails to find the import
I've looked at some other SO answers but I'm not sure how to correlate them with my problem.

I figured it out!
Python had to be directed to find the .egg files
This can be done by either:
Editing the PYTHONPATH variable BEFORE the interpreter is started, or
Appending the full paths to the egg files to sys.path
Code Below:
import sys
for entry in [<list of full path to egg files in bin dir>]:
    sys.path.append(str(entry))
# Proceed with local imports
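For instance, a minimal sketch that fills that list in by globbing the bin directory (the layout is the one from the question; adjust the paths if yours differs):
import sys
from pathlib import Path

# Collect every .egg in scriptpack/bin, relative to this script's location
bin_dir = Path(__file__).resolve().parent / "scriptpack" / "bin"
for egg in bin_dir.glob("*.egg"):
    sys.path.append(str(egg))

# Proceed with local imports, e.g.
# import coloredlogs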

You might want to try packaging everything up as a zipapp. Doing so makes a single zip file that acts as a Python script, but can contain a whole multitude of embedded packages. The steps to make it are:
Make a folder with the name of your program (testapp in my example)
Name your main script __main__.py and put it in that folder
Using pip, install the required packages to the folder with --target=/path/to/testapp
Run python3 -m zipapp testapp -p '/usr/bin/env python3' (providing the shebang line is optional; without it, users will need to run the package with python3 testapp.pyz, while with the shebang, they can just do ./testapp.pyz)
That creates a zip file with all your requirements embedded in it alongside your script, that doesn't even need to be unpacked to run (Python knows how to run zip apps natively). As a trivial example:
$ mkdir testapp
$ echo -e '#!/usr/bin/python3\nimport sqlalchemy\nprint(sqlalchemy)' > testapp/__main__.py
$ pip3 install --target=./testapp sqlalchemy
$ python3 -m zipapp testapp -p '/usr/bin/env python3'
$ ./testapp.pyz
<module 'sqlalchemy' from './testapp.pyz/sqlalchemy/__init__.py'>
showing how the simple main was able to access sqlalchemy from within the same zipapp. It's also smaller (thanks to the zipping) than distributing the uncompressed modules:
$ du -s -h testapp*
13M testapp
8.1M testapp.pyz
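If you prefer to build the archive from Python rather than the shell, the stdlib zipapp module exposes the same functionality. A minimal sketch, assuming the testapp folder from above:
import zipapp

# Bundle ./testapp (with its __main__.py and pip-installed deps) into testapp.pyz
zipapp.create_archive(
    "testapp",
    target="testapp.pyz",
    interpreter="/usr/bin/env python3",  # optional shebang, same as -p
)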

You can install these packages in a non-global location (generally in ~/.local/lib/python<x.y>) using the --user flag, e.g.:
pip install --user sqlalchemy coloredlogs
That way you don't have to worry about changing how imports work, and you're still compliant with your sysadmin's policies.
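If you want to see exactly where --user installs land on your machine, a quick stdlib check (the printed path will vary by platform and Python version):
import site

# Location pip uses for --user installs; created on first use
print(site.getusersitepackages())
# e.g. /home/me/.local/lib/python3.10/site-packages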

Related

How to check if all packages listed in requirements.txt file are used in Python project

I have a requirements file which contains all installed packages. After a big refactoring of the project, some of the listed packages are no longer needed. The problem is I'm not sure which ones.
Is there a way to determine which packages listed in the requirements.txt file are actually used in the code?
Alternative answer that uses a Python library: pipreqs. See Automatically create requirements.txt.
Running pipreqs with the default arguments will generate a requirements.txt for you.
$ pipreqs /home/project/location
Successfully saved requirements file in /home/project/location/requirements.txt
However, it looks like you're trying to clean up an old requirements.txt file. In this case, pipreqs also comes with a --diff flag, and a --clean flag. From the docs:
--diff <file> Compare modules in requirements.txt to project imports.
--clean <file> Clean up requirements.txt by removing modules that are not imported in project.
You can use --diff to determine which libraries need to be removed, and --clean to do it automagically.
If you are using a virtual environment and have a decent enough test suite that you can repeatedly and automatically run (it's best if you can run it on your local workspace via a script or a simple command), then the brute-force type of approach is to:
Set up a fresh copy of your project (e.g. git clone onto a separate folder)
Set up an empty/blank virtual environment
Run your tests
Install missing packages every time you encounter a "ModuleNotFoundError" type of error
Repeat until all tests pass
Export the packages you now have to a separate requirements.txt file (pip freeze > requirements.new.txt or any of the other ways from Automatically create requirements.txt)
For example, we have this (very) minimal example code:
# --- APP CODE
from numpy import array
from openpyxl import Workbook
from pydantic import BaseModel, ValidationError

class MyModel(BaseModel):
    x: int

# --- TEST CODE
import pytest

def test_my_model():
    with pytest.raises(ValidationError):
        MyModel(x="not an int")
After cloning this and setting up a brand-new virtual environment (but without yet installing any of the packages), the first attempt to run the tests yields:
(my_venv) $ pytest --maxfail=1 main.py
bash: /Users/me/.venvs/my-proj-7_w3b8eb/bin/pytest: No such file or directory
So then, you install pytest.
Then, you get:
(my_venv) $ pytest --maxfail=1 main.py
...
main.py:1: in <module>
from numpy import array
E ModuleNotFoundError: No module named 'numpy'
So then you install numpy.
Then, you get:
(my_venv) $ pytest --maxfail=1 main.py
...
main.py:2: in <module>
from openpyxl import Workbook
E ModuleNotFoundError: No module named 'openpyxl'
So then you install openpyxl.
...and so on, until all the tests pass. Of course, even when your automated tests pass, it's good to also do some manual basic tests to make sure everything is indeed working as before. Finally, generate a new copy of your requirements.txt file, and compare it with the old one, to check for any differences.
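If you'd rather not eyeball the two files, a minimal sketch that compares them as sets of lines (the file names are the ones from the steps above):
# Report entries present in the old requirements file but not the new one, and vice versa
old = set(open("requirements.txt").read().splitlines())
new = set(open("requirements.new.txt").read().splitlines())

print("No longer needed:", sorted(old - new))
print("Newly added:", sorted(new - old))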
Of course, as I mentioned at the start, this assumes you have a decent enough test suite that covers a large % of your code and use cases. (It's also one of the good reasons why you should be writing tests in the first place.)

Why does unittest require __init__.py to be present in Python 3.6?

I just came across this line in the python3.6 unittest source (/usr/lib/python3.6/unittest/loader.py:286):
is_not_importable = not os.path.isfile(os.path.join(start_dir, '__init__.py'))
which caused the unittest discovery to fail to run my tests. Why is this line still present in the python3.6 library (Ubuntu 17.10, I don't know if it matters), if __init__.py is no longer required since python3.3?
I believe that's a bug, but I want a confirmation.
When there's no __init__.py in the foo directory, the following command runs fine ({PROJECT_HOME} being a placeholder):
python3.6 -m unittest discover tests.foo -t {PROJECT_HOME} -p "*.py"
while this fails (with ImportError: Start directory is not importable):
python3.6 -m unittest discover tests/foo -t {PROJECT_HOME} -p "*.py"
The difference being . vs /. When there is an __init__.py, both commands work the same.
Python package directories depend on __init__.py to control Python's behaviour when importing modules.
So you have to follow the guidelines in packaging namespace packages; most cases will use native namespace packages.
Another important thing is PYTHONPATH: it is an environment variable which you can set to add additional directories where Python will look for modules and packages. For most installations, you should not set this variable, since it is not needed for Python to run; Python knows where to find its standard library.
The only reason to set it is to maintain directories of custom Python libraries that you do not want to install in the global default location (i.e., the site-packages directory).
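As a quick illustration (stdlib only): directories listed in PYTHONPATH are simply added to sys.path at interpreter startup, and sys.path is the list Python searches on import.
import os
import sys

# PYTHONPATH is just an environment variable...
print(os.environ.get("PYTHONPATH", "<not set>"))
# ...whose entries end up near the front of sys.path
print(sys.path[:5])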

Structuring python projects without path hacks

I have a shared python library that I use in multiple projects, so the structure looks like this:
Project1/
    main.py <--- (One of the projects that uses the library)
    ...
sharedlib/
    __init__.py
    ps_lib.py
    another.py
Now in each project's main.py I use the following hack to make it work:
import os
import sys
sys.path.insert(0, os.path.abspath('..'))
import sharedlib.ps_lib
...
Is there a way to do it without using this hack? Or is there a better way to organize the projects structure?
I think the best way would be to make sharedlib a real package. That means changing the structure a bit:
sharedlib/
    sharedlib/
        __init__.py
        ps_lib.py
        another.py
    setup.py
And using something like this in the setup.py (taken partially from Python-packaging "Minimal Structure"):
from setuptools import setup

setup(name='sharedlib',
      version='0.1',
      description='...',
      license='...',
      packages=['sharedlib'],  # you might need to change this if you have subfolders.
      zip_safe=False)
Then install it with python setup.py develop or pip install -e . when in the root folder of the sharedlib package.
That way (using the develop or -e option) changes to the contents of sharedlib/sharedlib/* files will be visible without re-installing the sharedlib package - although you may need to restart the interpreter if you're working in an interactive interpreter. That's because the interpreter caches already imported packages.
From the setuptools documentation:
Setuptools allows you to deploy your projects for use in a common directory or staging area, but without copying any files. Thus, you can edit each project’s code in its checkout directory, and only need to run build commands when you change a project’s C extensions or similarly compiled files. [...]
To do this, use the setup.py develop command.
(emphasis mine)
The most important thing is that you can import sharedlib everywhere now - no need to insert the sharedlib package in the PATH or PYTHONPATH anymore because Python (or at least the Python where you installed it) now treats sharedlib like any other installed package.
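A quick way to confirm the editable install worked as intended (the path printed is illustrative):
# After `pip install -e .`, the package should resolve to your checkout,
# not to a copy under site-packages
import sharedlib
print(sharedlib.__file__)
# e.g. /home/me/sharedlib/sharedlib/__init__.py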
The way we do it is to use bash entry-scripts for the python scripts. Our directory structure would look similar to the following:
/opt/stackoverflow/
-> bin
-> conf
-> lib
-> log
Our lib folder then contains all of our sub-projects
/opt/stackoverflow/lib/
-> python_algorithms
-> python_data_structures
-> python_shared_libraries
and then when we want to execute a python script, we'll execute it via a bash script within the bin directory
/opt/stackoverflow/bin/
-> quick_sort.sh
-> merge_sort.sh
and if we cat one of our entry scripts
cat merge_sort.sh
#!/bin/bash
export STACKOVERFLOW_HOME=/opt/stackoverflow
export STACKOVERFLOW_BIN=${STACKOVERFLOW_HOME}/bin
export STACKOVERFLOW_LIB=${STACKOVERFLOW_HOME}/lib
export STACKOVERFLOW_LOG=${STACKOVERFLOW_HOME}/log
export STACKOVERFLOW_CONF=${STACKOVERFLOW_HOME}/conf
# Do any pre-script server work here
export PYTHONPATH=${PYTHONPATH}:${STACKOVERFLOW_LIB}
/usr/bin/python "${STACKOVERFLOW_LIB}/python_algorithms/merge_sort.py" "$@" 2>&1
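For what it's worth, a sketch of what merge_sort.py then sees at runtime (the filter string is an assumption based on the layout above): because the wrapper exported PYTHONPATH, the lib directory is on sys.path, so the sibling sub-projects import as ordinary top-level packages.
import sys

# The wrapper's PYTHONPATH export surfaces here
print([p for p in sys.path if "stackoverflow" in p])
# e.g. ['/opt/stackoverflow/lib']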

My package installed with pip install shows no modules

I have a simple Python project with effectively one package (called forcelib) containing one module (also called forcelib):
- setup.py
- forcelib
|- __init__.py
|- forcelib.py
My setup.py is copied from the official example and has the obvious edits.
The problem is that I can install the forcelib package using pip but when I import forcelib, it only has the "double-underscore" attributes visible. That is, I cannot see the forcelib module.
Example to replicate:
git clone https://github.com/blokeley/forcelib
cd forcelib
pip install -e .
python
import forcelib
print(forcelib.__version__) # Correctly prints 0.1.2
dir(forcelib) # The only contents are the __version__, __path__ etc. double-underscore attributes. I had expected to see forcelib, example_read etc.
Perhaps I'm supposed to distribute just the module rather than bother with a package.
The (very small) project is on GitHub.
Any advice would be much appreciated.
It seems that there are 2 ways of doing it:
Keep the same directory structure but put the following in __init__.py
from .forcelib import *
Distribute a module, not a package. Follow the instructions to use the py_modules argument rather than the packages argument in setup.py. This would mean restructuring the project to:
setup.py
forcelib.py
Approach (1) can be seen here. It has the advantage of hiding the private functions and attributes (anything not in __all__), but the client can still see the module forcelib.forcelib, which I don't think it should be able to.
Approach (2) can be seen here. It is simpler, but has the disadvantage that it does not hide private functions and attributes.
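For reference, a minimal setup.py sketch for approach (2); this is not the project's actual file, and the metadata is illustrative:
from setuptools import setup

# Distribute forcelib.py (sitting next to setup.py) as a single module
setup(
    name='forcelib',
    version='0.1.2',
    py_modules=['forcelib'],
)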
Download the zip file and extract it. Then open a command prompt, go to the forcelib-master directory, and run:
python setup.py install
It will install the package successfully.

Unable to import from __init__.py [duplicate]

I'm having a hard time understanding how module importing works in Python (I've never done it in any other language before either).
Let's say I have:
myapp/__init__.py
myapp/myapp/myapp.py
myapp/myapp/SomeObject.py
myapp/tests/TestCase.py
Now I'm trying to get something like this:
myapp.py
===================
from myapp import SomeObject
# stuff ...
TestCase.py
===================
from myapp import SomeObject
# some tests on SomeObject
However, I'm definitely doing something wrong as Python can't see that myapp is a module:
ImportError: No module named myapp
In your particular case it looks like you're trying to import SomeObject from the myapp.py and TestCase.py scripts. From myapp.py, do
import SomeObject
since it is in the same folder. For TestCase.py, do
from ..myapp import SomeObject
However, this will work only if you are importing TestCase from the package. If you want to directly run python TestCase.py, you would have to mess with your path. This can be done within Python:
import sys
sys.path.append("..")
from myapp import SomeObject
though that is generally not recommended.
In general, if you want other people to use your Python package, you should use distutils to create a setup script. That way, anyone can install your package easily using a command like python setup.py install and it will be available everywhere on their machine. If you're serious about the package, you could even add it to the Python Package Index, PyPI.
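As a minimal sketch of such a setup script (shown with setuptools, the de-facto successor to distutils; the metadata is illustrative):
from setuptools import setup, find_packages

setup(
    name='myapp',
    version='0.1',
    packages=find_packages(),  # picks up myapp and its subpackages
)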
The import statement looks for files in the directories listed in your PYTHONPATH environment variable and in your local directory. So you can either put all your files in the same directory, or export the path by typing the following into a terminal:
export PYTHONPATH="$PYTHONPATH:/path_to_myapp/myapp/myapp/"
Exporting the path is a good way. Another way is to add a .pth file to your site-packages location.
On my Mac, Python keeps site-packages in /Library/Python, shown below:
/Library/Python/2.7/site-packages
I created a file called awesome.pth at /Library/Python/2.7/site-packages/awesome.pth and in the file put the following path that references my awesome modules
/opt/awesome/custom_python_modules
You can try
from myapp.myapp import SomeObject
because your project directory has the same name as myapp.py, which makes Python search the project directory first
You need to have
__init__.py
in all the folders that have code you need to interact with.
You also need to specify the top folder name of your project in every import even if the file you tried to import is at the same level.
In your first myapp directory, you can add a setup.py file with two lines of Python code:
from setuptools import setup
setup(name='myapp')
Then, in your first myapp directory on the command line, use pip install -e . to install the package.
pip install on Windows 10 defaults to installing in 'Program Files/PythonXX/Lib/site-packages', which is a directory that requires administrative privileges. So I fixed my issue by running pip install as Administrator (you have to open the command prompt as administrator even if you are logged in with an admin account). Also, it is safer to call pip from Python,
e.g.
python -m pip install <package-name>
instead of
pip install <package-name>
In my case it was a Windows vs Python surprise: although Windows filenames are not case-sensitive, Python imports are. So if you have a Stuff.py file, you need to import that name as-is.
Let's say I write a module:
import os

my_home_dir = os.environ['HOME']  # on Windows, use 'HOMEPATH'
file_abs_path = os.path.join(my_home_dir, "my_module.py")
with open(file_abs_path, "w") as f:
    f.write("print('I am loaded successfully')")
import importlib.util
importlib.util.find_spec('my_module')  # ==> cannot find
We have to tell Python where to look for the module: we have to add its directory to sys.path.
import sys
sys.path.append(my_home_dir)  # append the containing directory, not the file path
Now importlib.util.find_spec('my_module') returns:
ModuleSpec(name='my_module', loader=<_frozen_importlib_external.SourceFileLoader object at 0x7fa40143e8e0>, origin='/Users/name/my_module.py')
We created our module and told Python its path, so now we should be able to import it:
import my_module
# I am loaded successfully
This worked for me:
from .myapp import SomeObject
The . signifies a relative import: Python looks for myapp inside the current package rather than searching sys.path.
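As a tiny sketch of what the dot buys you, assume this hypothetical layout:
# mypkg/
#     __init__.py   <- contains the line below
#     myapp.py      <- defines SomeObject
#
# Inside mypkg/__init__.py, the leading dot resolves relative to the
# current package (mypkg), regardless of the directory Python was run from
from .myapp import SomeObject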
Short Answer:
python -m ParentPackage.Submodule
Executing the required file via the module flag worked for me. Let's say we have a typical directory structure like below:
my_project/
    Core/
        myScript.py
    Utils/
        helpers.py
    configs.py
Now if you want to run a file inside a directory that has imports from other modules, all you need to do is the following:
python -m Core.myScript
PS: You have to use dot notation to refer to the submodules (the files/scripts you want to execute). Also, I used Python 3.9+, so I didn't need any __init__.py or sys.path.append statements.
Hope that helps! Happy Coding!
If you use Anaconda you can do:
conda develop /Path/To/Your/Modules
from the shell, and it will write your path into a conda.pth file in the standard directory for third-party modules (site-packages in my case).
If you are using the IPython console, make sure your IDE (e.g., Spyder) is pointing to the right working directory (i.e., your project folder).
Besides the suggested solutions like the accepted answer, I had the same problem in PyCharm, and I didn't want to modify imports like the relative addressing suggested above.
I finally found out that if I mark my src/ directory (the root directory of my Python code) as a source root in the interpreter settings, the issue is resolved.
