Pytest needs to specify folder project when importing modules - python

I am working with Python 3.7, Anaconda, and Visual Studio Code.
I have a project folder called providers.
Inside providers I have the following folders:
Providers
├── __init__.py
├── _config
│   ├── database_connection.py
│   └── __init__.py
├── src
│   ├── overview.py
│   └── __init__.py
└── utils
    ├── pandas_functions.py
    └── __init__.py
I want to import a class named DatabaseConnection, defined in the file database_connection.py, into the file overview.py:
# Overview.py
from config.database_connection import DatabaseConnection
This works as expected.
I want to run some tests with pytest, and the test script looks like this:
from unittest.mock import patch, Mock
import pandas as pd
from config.database_connection import DatabaseConnection
from providers.utils.pandas_functions import get_df
@patch("providers.utils.pandas_functions.pd.read_sql")
def test_get_df(read_sql: Mock):
    read_sql.return_value = pd.DataFrame({"foo_id": [1, 2, 3]})
    results = get_df()
    read_sql.assert_called_once()
    pd.testing.assert_frame_equal(results, pd.DataFrame({"bar_id": [1, 2, 3]}))
But it gives me this error:
plugins: hypothesis-5.5.4, arraydiff-0.3, astropy-header-0.1.2, doctestplus-0.5.0, mock-3.4.0, openfiles-0.4.0, remotedata-0.3.2
collected 0 items / 1 error
===================================================================================================== ERRORS =====================================================================================================
________________________________________________________________________________ ERROR collecting tests/test_pandas_functions.py _________________________________________________________________________________
ImportError while importing test module 'C:\Users\jordi_adm\Documents\GitHub\mcf-pipelines\cpke-cash-advance\providers\tests\test_pandas_functions.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests\test_pandas_functions.py:3: in <module>
from config.database_connection import DatabaseConnection
E ModuleNotFoundError: No module named 'config'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================================================================ 1 error in 0.35s ================================================================================================
It cannot find the module config, whereas if I run the overview.py file the import works without problems.
If I write the import with providers at the beginning (from providers.config.database_connection import DatabaseConnection), pytest is able to import the module and run the test:
from unittest.mock import patch, Mock
import pandas as pd
from providers.config.database_connection import DatabaseConnection
from providers.utils.pandas_functions import get_df
@patch("providers.utils.pandas_functions.pd.read_sql")
def test_get_df(read_sql: Mock):
    read_sql.return_value = pd.DataFrame({"foo_id": [1, 2, 3]})
    results = get_df()
    read_sql.assert_called_once()
    pd.testing.assert_frame_equal(results, pd.DataFrame({"bar_id": [1, 2, 3]}))
plugins: hypothesis-5.5.4, arraydiff-0.3, astropy-header-0.1.2, doctestplus-0.5.0, mock-3.4.0, openfiles-0.4.0, remotedata-0.3.2
collected 1 item
tests\test_pandas_functions.py . [100%]
=============================================================================================== 1 passed in 0.29s ================================================================================================
If I try to run a script inside of this project with providers at the beginning, for example modifying overview.py:
from providers.config.database_connection import DatabaseConnection
I obtain this error:
from providers.config.database_connection import DatabaseConnection
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
c:\Users\jordi_adm\Documents\GitHub\mcf-pipelines\cpke-cash-advance\providers\run.py in
----> 2 from providers.config.database_connection import DatabaseConnection
ModuleNotFoundError: No module named 'providers'
Why does pytest need providers at the beginning of the import path?
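One likely explanation, offered as a sketch rather than a confirmed diagnosis: which spelling works depends on which directory is on sys.path. Because providers contains an __init__.py (and assuming tests does too), pytest's default import mode puts the parent of providers on sys.path, so the package is reachable only as providers.config. Running a script from inside providers puts that folder itself on sys.path, so config is reachable as a top-level name instead. The snippet below reproduces both situations with a throwaway layout; demo_providers and demo_config are hypothetical stand-ins for providers and config.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for the question's providers/config packages.
PKG, SUBPKG = "demo_providers", "demo_config"


def build_layout(root: Path) -> Path:
    """Create root/demo_providers/demo_config/database_connection.py plus __init__.py files."""
    sub_dir = root / PKG / SUBPKG
    sub_dir.mkdir(parents=True)
    (root / PKG / "__init__.py").write_text("")
    (sub_dir / "__init__.py").write_text("")
    (sub_dir / "database_connection.py").write_text("class DatabaseConnection:\n    pass\n")
    return root / PKG


def importable(name: str, path_entry: Path) -> bool:
    """Report whether `name` resolves while `path_entry` is on sys.path."""
    sys.path.insert(0, str(path_entry))
    importlib.invalidate_caches()
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of `name` cannot be imported.
        return False
    finally:
        sys.path.remove(str(path_entry))
        # Drop anything find_spec imported so each check starts clean.
        for mod in [m for m in sys.modules if m.split(".")[0] in (PKG, SUBPKG)]:
            del sys.modules[mod]


with tempfile.TemporaryDirectory() as tmp:
    project_parent = Path(tmp)
    package_dir = build_layout(project_parent)
    prefixed = f"{PKG}.{SUBPKG}.database_connection"
    bare = f"{SUBPKG}.database_connection"
    results = {
        "parent on path, prefixed": importable(prefixed, project_parent),
        "parent on path, bare": importable(bare, project_parent),
        "package on path, prefixed": importable(prefixed, package_dir),
        "package on path, bare": importable(bare, package_dir),
    }

for case, ok in results.items():
    print(f"{case}: {ok}")
```

Under these assumptions, the prefixed import resolves only when the parent folder is on the path, and the bare import only when the package folder itself is, which matches the two behaviours described in the question.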

Related

Problems with relative imports and pytest

I am having an issue whereby relative imports from a .py file within a module are not recognised as such by pytest.
Consider the following directory structure:
.
├── import_pytest_issue
│   ├── __init__.py
│   ├── main.py
│   ├── user.py
│   └── utils.py
└── tests
    ├── __init__.py
    └── test_main.py
main.py, user.py, and utils.py are as follows:
# ./import_pytest_issue/main.py
from user import User
if __name__ == "__main__":
    user = User("Fred")
    print(f"The user is called {user.name}, and their favourite number is {user.favourite_number}")
# ----------------------------------------------------------------------
# ./import_pytest_issue/user.py
from utils import random_number
class User:
    favourite_number = random_number()
    def __init__(self, name: str) -> None:
        self.name = name
# ----------------------------------------------------------------------
# ./import_pytest_issue/utils.py
from random import choice
def random_number():
    return choice(range(1, 11))
...and my test_main.py file is like this:
# ./tests/test_main.py
from import_pytest_issue.user import User
def test_user():
    user = User("Larry")
    assert user.name == "Larry"
    assert user.favourite_number in range(1, 11)
main.py runs without issue:
> python import_pytest_issue/main.py
# The user is called Fred, and their favourite number is 1
...but running pytest I get a ModuleNotFoundError:
> pytest
# ======================================================================== test session starts ========================================================================
# platform linux -- Python 3.10.6, pytest-7.2.1, pluggy-1.0.0
# collected 0 items / 1 error
# ============================================================================== ERRORS ===============================================================================
# ________________________________________________________________ ERROR collecting tests/test_main.py ________________________________________________________________
# ImportError while importing test module 'import-pytest-issue/tests/test_main.py'.
# Hint: make sure your test modules/packages have valid Python names.
# Traceback:
# /usr/lib/python3.10/importlib/__init__.py:126: in import_module
# return _bootstrap._gcd_import(name[level:], package, level)
# tests/test_main.py:1: in <module>
# from import_pytest_issue.user import User
# import_pytest_issue/user.py:1: in <module>
# from utils import random_number
# E ModuleNotFoundError: No module named 'utils'
# ====================================================================== short test summary info ======================================================================
# ERROR tests/test_main.py
# !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
# ========================================================================= 1 error in 0.05s ==========================================================================
FWIW this also happens if I run `python -m pytest`.
Any ideas?
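A plausible reading of the traceback, sketched under assumptions: python import_pytest_issue/main.py puts the import_pytest_issue folder itself on sys.path, so the bare from utils import ... resolves, whereas pytest imports the file as import_pytest_issue.user from the project root, where no top-level utils exists. Spelling the import through the package, here as a relative import, works when the module is imported as part of the package. The snippet rebuilds a throwaway copy of the package to show this; the temporary directory and the choice of a relative import are illustrative, not the asker's code.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# user.py rewritten with a relative import instead of `from utils import ...`.
USER_PY = textwrap.dedent(
    """
    from .utils import random_number


    class User:
        favourite_number = random_number()

        def __init__(self, name: str) -> None:
            self.name = name
    """
)

UTILS_PY = textwrap.dedent(
    """
    from random import choice


    def random_number():
        return choice(range(1, 11))
    """
)

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "import_pytest_issue"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "user.py").write_text(USER_PY)
    (pkg / "utils.py").write_text(UTILS_PY)

    # Mimic a test run started from the project root: only the root is on sys.path.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        user_module = importlib.import_module("import_pytest_issue.user")
        user = user_module.User("Larry")
    finally:
        sys.path.remove(tmp)

print(user.name, user.favourite_number in range(1, 11))
```

With this change main.py would import from import_pytest_issue.user and be started as python -m import_pytest_issue.main from the project root, because a relative import fails when the file is run directly as a script.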

How to best deal with circular import ImportError due to import statement in __init__ file

So I'm working on a rather big Python 3 project where I'm writing unit tests for some of the files using the unittest library. In one unit test, the tested file imports a function from another python package whose __init__ file itself imports from the tested file. This leads to an ImportError during the unit test.
It is desired for the __init__.py to import from the tested file periphery\foo.py, so I would like to know if there is a possibility to make the unit test work without removing the import from __init__.py
Since the project contains a lot of files, I created a minimal example that illustrates the structure and in which the error can be reproduced. The project structure looks like this
├───core
│       bar.py
│       __init__.py
│
├───periphery
│       foo.py
│       __init__.py
│
└───tests
    │   __init__.py
    │
    ├───core
    │       __init__.py
    │
    └───periphery
            test_foo.py
            __init__.py
The init file in core, core/__init__.py, contains the code
# --- periphery ---
from periphery.foo import Foo
while periphery/foo.py, which is the file to be tested, looks like
from core.bar import Bar
def Foo():
    bar = Bar()
    return bar
Finally, the unit test has the following structure
from unittest import TestCase
from periphery.foo import Foo
class Test(TestCase):
    def test_foo(self):
        """ Test that Foo() returns "bar" """
        self.assertEqual(Foo(), "bar")
Running the unit test yields the following error:
ImportError: cannot import name 'Foo' from partially initialized module 'periphery.foo' (most likely due to a circular import)
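One way to keep the import in core/__init__.py and still break the cycle is to defer the core.bar import until Foo is called, so that periphery.foo is fully initialised before core/__init__.py asks for Foo. The sketch below rebuilds the layout in a temporary directory under hypothetical package names (demo_core, demo_periphery) and assumes a trivial Bar class, since bar.py is not shown in the question.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents mirroring the question's layout.
FILES = {
    # The package __init__ still imports Foo, as the question requires.
    "demo_core/__init__.py": "from demo_periphery.foo import Foo\n",
    # Assumed minimal Bar; the real bar.py is not shown in the question.
    "demo_core/bar.py": "class Bar:\n    pass\n",
    "demo_periphery/__init__.py": "",
    # The import of Bar is moved inside the function (deferred import).
    "demo_periphery/foo.py": (
        "def Foo():\n"
        "    from demo_core.bar import Bar\n"
        "    return Bar()\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for relative_path, source in FILES.items():
        target = Path(tmp) / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Importing foo no longer triggers demo_core/__init__.py.
        foo_module = importlib.import_module("demo_periphery.foo")
        # demo_core is first imported here, when foo is already complete.
        result = foo_module.Foo()
    finally:
        sys.path.remove(tmp)

print(type(result).__name__)
```

The trade-off is that the import cost and any import error move from module load time to the first call of Foo.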

How to access a method from pytest, if there's no explicit root package or module

Simply speaking, I have this directory structure:
src/
    my_file1.py
tests/
    __init__.py
    my_file1_test.py
In my_file1.py:
def my_func1(a):
    return a + 99
How do I then access my_func1 from the tests? In my_file1_test.py I can't access the function:
# from ??? import ?? # is this needed at all?
def my_test1():
    res = my_func1(123)  # inaccessible
    assert res == 222
Will I have to create __init__.py in the src directory first?
update1
from src.myfile import my_func1
===>
ModuleNotFoundError: No module named 'scr'
And if I add __init__.py then it'll become:
ImportError: cannot import name 'my_func1' from 'src'
If you run pytest from your project root with
python -m pytest
you then have to import the function, as you guessed, from the src package. Note that the module name must match the file name, my_file1.py:
from src.my_file1 import my_func1
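As a small check of the answer above, assuming the command is started from the project root (python -m adds the current directory to sys.path): the module part of the import has to match the file name my_file1.py, so the import reads from src.my_file1 import my_func1. The sketch recreates the layout in a temporary directory, which is an illustration only.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    # An empty __init__.py is added for this demo; on Python 3.3+ a folder
    # without one can still be imported as a namespace package.
    (src / "__init__.py").write_text("")
    (src / "my_file1.py").write_text("def my_func1(a):\n    return a + 99\n")

    # Stand-in for "run from the project root": the root is on sys.path.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Equivalent to: from src.my_file1 import my_func1
        my_func1 = importlib.import_module("src.my_file1").my_func1
    finally:
        sys.path.remove(tmp)

res = my_func1(123)
print(res)
```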

Pytest fails on importing pyhive

I am working on writing some data tests. Super simple nothing crazy.
Here is what my current directory looks like.
.
├── README.md
├── hive_tests
│   ├── __pycache__
│   ├── schema_checks_hive.py
│   ├── test_schema_checks_hive.py
│   └── yaml
│       └── job_output.address_stats.yaml
└── postgres
    ├── __pycache__
    ├── schema_checks_pg.py
    ├── test_schema_checks_pg.py
    └── yaml
When I cd into postgres and run pytest, all my tests pass.
When I cd into hive_tests and run pytest, I get an import error.
Here is my schema_checks_hive.py file.
from pyhive import hive
import pandas as pd
import numpy as np
import os, sys
import yaml
def check_column_name_hive(schema, table):
    query = "DESCRIBE {0}.{1}".format(schema, table)
    df = pd.read_sql_query(query, conn)
    print(df.columns)
    return df.columns
check_column_name_hive('myschema', 'mytable')
Here is my test_schema_checks_hive.py file where the tests are located.
import schema_checks_hive as sch
import pandas as pd
import yaml
import sys, os
def test_column_names_hive():
    for filename in os.listdir('yaml'):
        data = ""
        with open("yaml/{0}".format(filename), 'r') as stream:
            data = yaml.safe_load(stream)
        schema = data['schema']
        table = data['table']
        cols = data['columns']
        df = sch.check_column_name_hive(schema, table)
        assert len(cols) == len(df)
        assert cols == df.tolist()
When I run Pytest I get an error that says:
ImportError while importing test module '/Usersdata/
tests/hive_tests/test_schema_checks_hive.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
test_schema_checks_hive.py:1: in <module>
import schema_checks_hive as sch
schema_checks_hive.py:1: in <module>
from pyhive import hive
E ModuleNotFoundError: No module named 'pyhive'
I would love any help! Thanks so much.
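The traceback says the interpreter running pytest cannot find pyhive, which often means pytest belongs to a different environment than the one where the package was installed. That is an assumption about this setup, not something the post confirms. A quick way to check is to print the interpreter path and probe for the package from the same environment that runs the tests:

```python
import importlib.util
import sys


def is_installed(package: str) -> bool:
    """Return True if the current interpreter can locate top-level `package`."""
    return importlib.util.find_spec(package) is not None


# Which Python is running, and can it see pyhive?
print("interpreter:", sys.executable)
print("pyhive importable:", is_installed("pyhive"))
```

If the interpreter shown here differs from the one used in the postgres folder, running the tests as python -m pytest with the interpreter that has pyhive installed is one thing to try.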

How do I load all modules under a subdirectory in Python?

I place my frequently used modules in a subdirectory lib/ and expect to load all modules into main.py by: (refer to Python: how to import from all modules in dir?)
from lib import *
but encounter the issue TypeError: 'module' object is not callable. More specifically, in main.py:
#!/usr/bin/env python
from lib import * # cause: TypeError: 'module' object is not callable
#from lib.DominatingSets import * # it works
dominatingSets = DominatingSets()
The full exception traceback:
$ python main.py
Traceback (most recent call last):
File "main.py", line 6, in <module>
dominatingSets = DominatingSets()
TypeError: 'module' object is not callable
The directories in a tree-like format.
$ tree -P '*.py' .
.
├── __init__.py
├── lib
│ ├── AnalyzeGraph.py
│ ├── AutoVivification.py
│ ├── DominatingSets.py
│ ├── __init__.py
│ ├── Output.py
│ ├── PlotGraph.py
│ ├── ProcessDatasets.py
│ └── ReadGTFS.py
├── main.py
The contents of lib/__init__.py are as follows. (refer to Loading all modules in a folder in Python)
from os.path import dirname, basename, isfile
import glob
modules = glob.glob(dirname(__file__)+"/*.py")
__all__ = [ basename(f)[:-3] for f in modules if isfile(f) and not basename(f).startswith('__')] # exclude __init__.py
This confusion happened, in part, because your module names are the same as the names of the classes you want to load from them. (At least, that's what makes it more confusing.) Your code does correctly load the modules that your classes are in. However, it doesn't load the classes out of those modules, and this is what you actually wanted to do.
Because your class DominatingSets is in the module lib.DominatingSets, its full path from root is lib.DominatingSets.DominatingSets.
from lib import *
in your case will do the same thing as
from lib import DominatingSets
from lib import AnalyzeGraph
# ...
However,
from lib import DominatingSets
is equivalent to
import lib.DominatingSets
DominatingSets = lib.DominatingSets
but lib.DominatingSets is a module (lib/DominatingSets.py), not the class you want.
from lib.DominatingSets import DominatingSets
is equivalent to
import lib.DominatingSets
DominatingSets = lib.DominatingSets.DominatingSets
which is why it works: this is the class you want imported into the name DominatingSets.
If you want to have from lib import * import all the classes in the submodules, you need to import these classes into the lib module. For example, in lib/__init__.py:
from DominatingSets import *
from AnalyzeGraph import *
# ...
While you're making changes, I'd suggest (as others have) using normal Python naming conventions, and have your module names in lowercase: change DominatingSets.py to dominatingsets.py. Then this code would become
from dominatingsets import *
from analyzegraph import *
# ...
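On Python 3 the bare from dominatingsets import * inside lib/__init__.py no longer resolves against the package, because implicit relative imports were removed; the explicit form from .dominatingsets import * is needed there. The sketch below demonstrates this with a temporary package; demo_lib and the empty DominatingSets class are placeholders, not the asker's code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    lib = Path(tmp) / "demo_lib"  # hypothetical stand-in for lib/
    lib.mkdir()
    # Placeholder class living in a lowercase module of the same name.
    (lib / "dominatingsets.py").write_text("class DominatingSets:\n    pass\n")
    # Explicit relative import: re-export the module's public names from the package.
    (lib / "__init__.py").write_text("from .dominatingsets import *\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demo_lib = importlib.import_module("demo_lib")
    finally:
        sys.path.remove(tmp)

# The class is now an attribute of the package itself, so a star import
# from the package would bring in the class rather than only the module.
dominating_sets = demo_lib.DominatingSets()
print(type(dominating_sets).__name__)
```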
Looking at your traceback, I think your problem might lie here.
Firstly, let's look at an example:
import datetime
d = datetime(2005, 23, 12)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'module' object is not callable
Basically, we've just imported the entire datetime module, and we're trying to call it like a class object within a module. Let's now do:
k = datetime.datetime(2005, 12, 22)
print(k)
2005-12-22 00:00:00
No problems this time, as we are referencing the datetime object type within the datetime module.
If we do:
from datetime import datetime
datetime
<class 'datetime.datetime'>
Again we reach the desired object, as we are importing the datetime class within the datetime module.
Also, using *
from datetime import *
d = datetime(2005, 3, 12)
will also work, as you are importing all the classes within the datetime module.
Your code saying:
from lib import * # This imports all MODULES within lib, not the classes
#from lib.DominatingSets import * # it works because you import the classes within the DominatingSets Module
You could either use from lib.DominatingSets import DominatingSets which should solve your problem, or if you stick to from lib import *, change your code to dominatingsets = DominatingSets.DominatingSets()
Hope this helps!
I learnt a lot from the accepted answer here ... but I still had problems with what to put in the lib/__init__.py file, if this directory lib is not actually included in PYTHONPATH.
I found that in addition to adding the parent directory of lib in the caller file, i.e.
sys.path.append( '.../parent_dir_of_lib' )
I either 1) had to do this in addition in the caller file:
sys.path.append( '.../parent_dir_of_lib/lib' )
Or 2) had to make the lib directory "self-loading", by putting this in its __init__.py:
import sys
from pathlib import Path
parent_dir_str = str( Path(__file__).resolve().parent )
sys.path.append( parent_dir_str )
from analyse_graph import *
from auto_vivification import *
...
