Accessing Folder Structure for Spark Submit zipped PyFile - python

I have a folder structure that looks like this:
├── repo
│ ├── src
│ └── main.py
│ ├── __init__.py
│ ├── utils
│ ├── __init__.py
│ └── helpers.py
│ └── predict
│ ├── __init__.py
│ └── predict.py
│
I am submitting a Spark job with --file src/predict/predict.py and --py-files src.zip, a zip of the entire source directory (the spark-submit command is built from main.py).
In predict.py, I want to call a function from helpers.py, so I import it: from src.utils.helpers import helper_func
However, this fails with ModuleNotFoundError: No module named 'src'. The imports work locally, but I am trying to understand how they work on EMR. Ideally I don't want to change the imports between local and EMR; I want them to be the same.
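One thing worth checking, offered as a guess rather than a diagnosis: files passed with --py-files are placed on the Python path, so from src.utils.helpers import ... can only resolve if src/ is a top-level entry inside the archive. If the zip was built from inside src/, the archive contains utils/ and predict/ at its root and there is no src to import. The sketch below builds a tiny stand-in archive (the helper body is invented for illustration) and imports from it locally by putting the zip on sys.path:

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_src_zip(zip_path: Path) -> None:
    """Write a tiny stand-in package into zip_path with 'src/' as the top-level entry."""
    files = {
        "src/__init__.py": "",
        "src/utils/__init__.py": "",
        # Hypothetical helper body, only here so the import has something to return
        "src/utils/helpers.py": "def helper_func():\n    return 'hello from helpers'\n",
    }
    with zipfile.ZipFile(zip_path, "w") as archive:
        for arcname, body in files.items():
            archive.writestr(arcname, body)


tmp_dir = Path(tempfile.mkdtemp())
zip_path = tmp_dir / "src.zip"
build_src_zip(zip_path)

# Putting the archive itself on sys.path is a local approximation of --py-files
sys.path.insert(0, str(zip_path))
importlib.invalidate_caches()

from src.utils.helpers import helper_func

result = helper_func()
print(result)
```

Listing the real archive with zipfile.ZipFile("src.zip").namelist() shows whether its entries start with src/ or not.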

Related

pytest no module named common

I'm trying to get started with python and pytest, I have the following project structure :
.
├── my_module
│   ├── __init__.py
│   ├── common
│   │   ├── __init__.py
│   │   └── utils.py
│   └── my_script.py
└── test
    ├── __init__.py
    └── test_my_script.py
When I run tests (using pytest), I get an error:
no module named 'common'.
I also have the following config files:
tox.ini
setup.py
setup.cfg
pyproject.toml
Does anyone know what I missed?
EDIT
Here is how I import utils in test_my_script.py:
from common.util import func1,func2
The common.util module is not importable from your tests because common lives inside the my_module package, so it is not a top-level name.
Imports should start from the working directory, which is your project root.
Consider using the following import instead (note that the file in your tree is utils.py, so the module is utils, not util):
from my_module.common.utils import func1, func2
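To see why the bare name fails, here is a small sketch that recreates a stand-in for the layout in a temporary directory and asks the import system to resolve both names. It assumes the project root is what ends up on sys.path and that no unrelated package named common is installed; the body of func1 is made up.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Recreate a stand-in for the layout from the question in a temporary directory
project_root = Path(tempfile.mkdtemp())
common_dir = project_root / "my_module" / "common"
common_dir.mkdir(parents=True)
(project_root / "my_module" / "__init__.py").write_text("")
(common_dir / "__init__.py").write_text("")
# Hypothetical function body, only for illustration
(common_dir / "utils.py").write_text("def func1():\n    return 1\n")

# Only the project root is importable, as when tests run from that directory
sys.path.insert(0, str(project_root))
importlib.invalidate_caches()

# 'common' is not a top-level name under the project root
# (assuming no unrelated installed package is called 'common')
bare_spec = importlib.util.find_spec("common")
# The fully qualified name resolves through the my_module package
full_spec = importlib.util.find_spec("my_module.common.utils")

print("common ->", bare_spec)
print("my_module.common.utils ->", full_spec.origin)
```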

Why my test files from a test folder can't import the source files from a source folder?

I have a problem where this is my project structure:
.
├── Resources/
├── src/
│   ├── __init__.py
│   ├── main.py
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── util.py
│   │   └── otherUtil.py
│   └── calculations/
│       ├── __init__.py
│       └── financials.py
└── tests/
    ├── __init__.py
    └── test.py
My problem is that I can't reach the classes in the src/ folder from the tests, although the code in src/ can reach the Resources folder through the first method shown.
I have tried:
Appending the home library path this way:
After these lines I used from src import util; I even tried from .src import util.
Then this way:
After these lines I again used from src import util, and also tried from .src import util.
Then without sys.path.append(), to no avail.
I have tried every combination I know, without success, and I don't want to install them as individual packages. Does anyone have an idea that would solve my problem?
Clarification edit:
I don't want to put the tests in the source folder; I want to keep them separate.
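You can put the directory that contains the package on sys.path before importing:
# test.py
import sys
# insert at 1; 0 is the script path (or '' in the REPL)
# the path must be the directory that contains utils/, not utils/ itself
sys.path.insert(1, '/path/to/src/')
import utils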
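A hard-coded absolute path breaks as soon as the project moves. As a rough sketch, the path can be derived from the test file's own location instead. The helper name add_project_root and its levels_up parameter are made up for this example; the demo at the bottom uses a throwaway directory rather than the real project.

```python
import sys
import tempfile
from pathlib import Path


def add_project_root(test_file: str, levels_up: int = 1) -> Path:
    """Put the directory `levels_up` levels above the test file's folder on sys.path."""
    root = Path(test_file).resolve().parents[levels_up]
    if str(root) not in sys.path:
        # Index 1 keeps the script's own directory (index 0) first
        sys.path.insert(1, str(root))
    return root


# Demo with a throwaway layout: <tmp>/tests/test.py
demo_root = Path(tempfile.mkdtemp())
(demo_root / "tests").mkdir()
demo_test_file = demo_root / "tests" / "test.py"
demo_test_file.write_text("")

root = add_project_root(str(demo_test_file))
print(str(root) in sys.path)
```

In tests/test.py this would be called as add_project_root(__file__) before any from src... import, so that the project root (the folder holding both src/ and tests/) becomes importable.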

Import Custom Module in Python [Although it works in PyCharm]

I see this question has already been asked thousands of times, but I'm utterly confused.
I'm trying to run hello.py, which imports from utils.common:
from utils.common import function
And I get the following error:
ModuleNotFoundError: No module named 'utils'
So my structure is the following:
── gig
   ├── __init__.py
   ├── src
   │   ├── __init__.py
   │   ├── hello
   │   │   ├── __init__.py
   │   │   └── hello.py
   │   └── utils
   │       ├── __init__.py
   │       ├── common.py
I added __init__.py to both gig and src, but nothing changes.
P.S. Imports work fine in PyCharm; the issue arises in a Docker container or when I run it locally from the terminal.
Any pointers are very much appreciated.
Cheers,
Giga
Usually the full import path, gig.src.utils.common, should work, provided the directory that contains gig is on sys.path.

Run single folder of pytest tests in vscode

I have a sort of a "micro-service" Python repo with a setup similar to the following:
sample
├── bar
│   ├── src
│   │   └── main.py
│   └── tests
│       └── test_main.py
├── foo
│   ├── src
│   │   └── main.py
│   └── tests
│       └── test_main.py
└── shared
    ├── src
    │   └── main.py
    └── tests
        └── test_main.py
In VS Code I only have the option of running all tests in foo, bar and shared, or running individual test methods in the subfolders. What I want is to be able to quickly run just foo/tests/.
Is there some way I can configure pytest/Python to do this? I don't want to split each top-level folder into its own workspace, because I regularly jump back and forth between them and don't want to have multiple windows open, one per workspace.
According to the pytest documentation, you should be able to run the command pytest foo/tests/ in the terminal.

Python module import issue subdir

I have the following directory structure in my Python3 project:
├── README.md
├── requirements.txt
├── schemas
│   ├── collector.sql
│   └── controller.sql
├── src
│   ├── controller.db
│   ├── controller.py
│   ├── measurement_agent.py
├── tests
│   ├── regression.py
│   ├── test-invalid-mac.py
│   ├── test-invalid-url.py
│   ├── test-register-ma-json.py
│   ├── test-register-ma-return-code.py
│   ├── test-send-capabilities-return-code.py
│   ├── test-valid-mac.py
│   └── test-valid-url.py
└── todo
In my tests folder I have some regression tests which are run to check the consistency of the code in src/measurement_agent.py. The problem is that I do not want to manually add the directory of measurement_agent.py to my path in order to import from it. I would like to know whether there is a way to tell Python to look in my project root for the import I am trying to use.
Currently I am doing:
import os.path
import sys
# Build the path to src/ relative to the current working directory (tests/)
ma = os.path.join(os.path.abspath('..'), 'src')
sys.path.append(ma)
from measurement_agent import check_hardware_address
and would like to be able to write just
from measurement_agent import check_hardware_address
without using any os.path tricks.
Any suggestions?
Relative imports
Make sure there is an __init__.py in every folder, including the top-most one (the parent).
Use a relative import, like this:
from ..src import measurement_agent
To run your code, cd to the directory that contains the parent directory and then run:
python -m parent.tests.regression
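A runnable sketch of that approach, under stated assumptions: parent is a placeholder name for the project's top-level folder, and the body of check_hardware_address is invented here just so the script prints something. It builds the layout in a temporary directory and runs the regression script as a module from the directory above parent:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for the project: 'parent' is a hypothetical name for the top-level folder
workdir = Path(tempfile.mkdtemp())
parent = workdir / "parent"
(parent / "src").mkdir(parents=True)
(parent / "tests").mkdir()
for package_dir in (parent, parent / "src", parent / "tests"):
    (package_dir / "__init__.py").write_text("")

# Placeholder implementation, not the real measurement agent
(parent / "src" / "measurement_agent.py").write_text(
    "def check_hardware_address(mac):\n"
    "    return len(mac.split(':')) == 6\n"
)
(parent / "tests" / "regression.py").write_text(
    "from ..src.measurement_agent import check_hardware_address\n"
    "print(check_hardware_address('00:11:22:33:44:55'))\n"
)

# Run as a module from the directory that contains 'parent'
completed = subprocess.run(
    [sys.executable, "-m", "parent.tests.regression"],
    cwd=workdir,
    capture_output=True,
    text=True,
)
print(completed.stdout.strip())
```

Running the same file directly (python parent/tests/regression.py) would fail on the relative import, because the script then has no parent package. Also note that test files with hyphens in their names, such as test-valid-mac.py, cannot be addressed with -m, since hyphens are not valid in module names.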
