How to set the project dir in dbt with an environment variable?

I am trying to locate my dbt_project.yml file, which is not in the root directory of my project. Previously, I was using an env var called DBT_PROJECT_DIR to define where the dbt_project.yml file is located, and it was working fine. In a similar way I am using DBT_PROFILES_DIR, and it still works correctly. But I cannot make DBT_PROJECT_DIR work. Any help is appreciated.

I'm fairly certain this is not supported. Are you not able to change into the directory containing dbt_project.yml before running dbt commands?
As a workaround, could you just add --project-dir $PROJECT_DIR to every command you plan to run?
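For example, a small wrapper along these lines would keep the flag consistent (a sketch; it assumes your env var is named DBT_PROJECT_DIR and that dbt is on the PATH):

import os
import subprocess
import sys

# Read the project dir from the environment and pass it explicitly via
# --project-dir, which dbt does accept on the command line.
project_dir = os.environ["DBT_PROJECT_DIR"]
subprocess.run(["dbt", *sys.argv[1:], "--project-dir", project_dir], check=True)

Saved as dbt_wrapped.py (a hypothetical name), running python dbt_wrapped.py run would behave like dbt run --project-dir $DBT_PROJECT_DIR.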

It is indeed not supported; this get_nearest_project_dir function is what's used to find the project dir. It would have to be adjusted to allow an environment variable, similar to how profiles are handled.
You could open an issue on GitHub and discuss adding this feature there.

Related

Why do I need to specify working directories and paths?

Whenever I do a project for computer science, I have to make sure all of my files are located in the same folder, or I'll have errors. If I want to use a file from somewhere else, I have to insert it into the path. I do these things but don't fully understand what is happening or why. Why is the path changed in the runtime environment?
When you run a Python script, it executes with a current working directory: say the script lives at /home/user/python.py and you launch it from /home/user. Relative paths are resolved against that working directory, not against the script file itself, so as long as permissions are set up right the script can reach any other directory by walking relative to it; for example, /home/example/file.txt is reachable as ../example/file.txt when running from /home/user.
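If you want the lookup to work no matter where the script is launched from, anchor the path to the script's own location instead. A minimal sketch (the target file is just an example):

import os

# Resolve a sibling directory relative to this script file rather than
# the current working directory, so launching from anywhere still works.
script_dir = os.path.dirname(os.path.abspath(__file__))
target = os.path.normpath(os.path.join(script_dir, "..", "example", "file.txt"))

with open(target) as f:
    print(f.read())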
Have you tried adding the path using sys.path.append? If you don't want to do that every time, you can set %PYTHONPATH% (on Windows) to include your custom path. That's what I do for my include folder.
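Something like this (a sketch; the directory is a placeholder for your own include folder):

import sys

# Make modules in a custom folder importable for the rest of this run.
sys.path.append(r"C:\Users\me\include")

# After this, "import my_helper" would find my_helper.py in that folder.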

Getting os.environ to work with Python run via NSSM

I am stuck on an environment variable mismatch.
I run a Python script on Windows 10 via a program called NSSM.
At runtime, I do the following:
1. Load in parameters from a text file.
2. Put its contents into the environment using os.environ.setdefault(name, value).
3. Try to read the environment variables back using os.environ[name].
Result: any variables I added do not show up.
I am not sure why the variables I add aren't available. Can you please tell me what I am doing wrong?
A starting point is that NSSM uses environment variables from the Windows HKLM registry: source (see bottom). I am not sure if this is why os.environ cannot see the relevant variables.
I've had trouble using os.environ.setdefault in the past as well. Instead of that, if you were trying to add to the PATH environment variable for example, do the following:
os.environ['PATH'] += ";" + the_path_to_the_file
EDIT:
Also, for creating a new variable:
os.environ['new_var'] = 'text'
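The difference matters because setdefault only writes when the name is not already present. A quick demonstration:

import os

os.environ["MY_VAR"] = "first"

# setdefault is a no-op when the variable already exists:
os.environ.setdefault("MY_VAR", "second")
print(os.environ["MY_VAR"])   # prints: first

# Plain assignment always overwrites:
os.environ["MY_VAR"] = "second"
print(os.environ["MY_VAR"])   # prints: second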
Well, it turns out that my problem was outside the scope of this question. @Recessive and @eryksun, thank you both for answering; it put me "onto the scent".
It turns out my problem was using Python pathlib's Path.home().
When running via the command prompt, it pulled the HOMEPATH environment variable.
When running via NSSM, it pulled the USERPROFILE environment variable.
This discrepancy in Path.home() was the real problem. It wasn't finding the environment variables because NSSM was looking in a totally different folder.
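If anyone hits the same thing, a quick diagnostic is to log what the process actually sees, both from an interactive prompt and under NSSM (a sketch; depending on the Python version, Path.home() consults HOME, USERPROFILE, or HOMEDRIVE plus HOMEPATH):

import os
from pathlib import Path

# Compare what Path.home() resolves to against the raw environment
# variables the process actually received.
print("Path.home():", Path.home())
for name in ("HOME", "USERPROFILE", "HOMEDRIVE", "HOMEPATH"):
    print(name, "=", os.environ.get(name))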

Proper way to run a PyCharm project

This is a seemingly simple problem, but it proves to be harder than expected.
I've created a project in PyCharm with the following layout:
bin
    main
helpers
    userhelper
models
    user
    session
tests
    userTest
In my main I run the code that calls everything, and this works like a charm in PyCharm. Now I want to run this on a server and start it with cron. How do I start it from cron while keeping all the module references in place?
I guess I need to add the root of my project to the Python path. To do this I invoke my project with the following bash script:
PYTHONPATH="${PYTHONPATH}:/home/steven/projectX"
export PYTHONPATH
python bin/main.py
But this does not seem to do anything. What would be the best way to periodically run bin/main.py within this project and have all my modules and things like ConfigParser.RawConfigParser().read(os.path.abspath("../configuration.cfg")) resolve relative to my project?
EDIT: I am not trying to fix my imports or debug my code. I have a large project in PyCharm that runs a simulation, which I want to invoke on the server and maintain within my development setup. The question is: how do I run this in the same way PyCharm does?
It sounds like you're interested in making a distributable Python package. You should read through the tutorial here. Ultimately, you're going to want to write a setup.py (sure, you could call it something else, but it's like renaming self -- why do it?) that will configure your project. Now, a word of advice, since I've seen many people go down a wrong path here: you NEVER want to modify your PYTHONPATH directly. It might be the quickest solution to get something up and working, but it WILL cause lasting problems.
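A minimal setup.py for a layout like the one above might look something like this (a sketch; the package name and entry point are placeholders, and bin would need an __init__.py plus a main() function in main.py for the entry point to resolve):

from setuptools import setup, find_packages

setup(
    name="projectX",
    version="0.1.0",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            # installs a projectx command that calls main() in bin/main.py
            "projectx = bin.main:main",
        ],
    },
)

After pip install ., cron can invoke the installed projectx command directly, without touching PYTHONPATH at all.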
So I've found the way of dealing with my issue here. The best option would obviously be to create a distributable Python package and use that. However, since I am developing a simulation with loads of features and desired outcomes, there is a lot of tuning going on, and having to create a package every time I want to run the project is a bit out of scope for this project.
What I do want is to be able to run the files in a similar way to how I do on my development machine with PyCharm. The way I resolved this is to have my main.py in the root of my project and to make all references relative to the executed file. This means my userhelper looks for the datastore as follows:
path = os.path.join(os.path.dirname(os.path.dirname(__file__)),
                    "resources", settings['datastore_filename'])
This resolves my issue of not being able to run my project on the server the same way it runs on my PC.

Why doesn't Python see the module?

I have the following problem with Python 2.7.3:
There's a little project with the following structure (see at the end); inside a virtual environment Python doesn't see a module, though it did before.
I've already tried rebooting and reinstalling the virtualenv directory -- nothing. Also, if I run the necessary script outside of any virtual environment (using ipython), it finds that module. I've been looking for the problem all night. Can anybody help me? For example, I'm trying to run the script from delete_ME_after.py (inside it, the module base.config_parser is imported).
Also, I've already added the path to this project to PYTHONPATH; still nothing works.
As it turns out, the above PYTHONPATH variable contains a tilde (~), which isn't expanded to the full home directory as expected. See here for more information.
Use one of the following instead:
export PYTHONPATH=/path/to/home/smm
export PYTHONPATH=${HOME}/smm
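You can see why the tilde breaks things from Python itself: nothing expands ~ inside sys.path entries, so the literal string never matches a real directory. A quick sketch:

import os

# The shell and expanduser() turn "~" into the home directory...
print(os.path.expanduser("~/smm"))   # e.g. /home/user/smm

# ...but ordinary path lookups do not; the "~" is kept literally:
print(os.path.abspath("~/smm"))      # e.g. /current/dir/~/smm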

Postgresql failed to start

I am trying to use PostgreSQL on Ubuntu. I installed it and everything was working fine. However, I needed to change the location of my database due to space constraints, so I followed an online guide to do it.
I proceeded to stop postgresql, create a new empty directory and give it permissions by using
chown postgres:postgres /my/dir/path
That worked fine too. Then I used
initdb -D /my/dir/path
to initialize my database. I also changed data_directory in the postgresql.conf file to point at my new directory.
When I now try to start the database, it says "The PostgreSQL server failed to start, please check the log file." However, there is no log file! Something got screwed up when I changed the default directory. How do I fix this?
First: You may find it easier to manage your Pg installs on Ubuntu using the custom tools Ubuntu provides as part of pg_wrapper: pg_createcluster, pg_dropcluster, pg_ctlcluster, etc. These integrate with the Ubuntu startup scripts and move the configuration to /etc/postgresql/, where Ubuntu likes to keep it, instead of the PostgreSQL default of keeping it in the datadir. To move where the actual files are stored, use a symbolic link (see below).
When you have a problem, how are you starting PostgreSQL?
If you're starting it via pg_ctl it should work fine because you have to specify the data directory location. If you're using your distro package scripts, though, they don't know you've moved the data directory.
On Ubuntu, you will need to change the configuration in /etc/postgresql to tell the scripts where the data dir is, probably pg_ctl.conf or start.conf for the appropriate version. I'm not sure of the specifics, as I've never needed to do it, and here's why:
There's a better way, though. Use a symbolic link from your old datadir location to the new one. PostgreSQL and the setup scripts will happily follow it and you won't have to change any configuration.
cd /var/lib/postgresql/9.1
mv main main.old
ln -s /new/datadir/location main
I'm guessing "9.1" because you didn't give your Ubuntu version or your PostgreSQL version.
An alternative is to use mount -o bind to map your new datadir location into the old place, so nothing notices the difference. Then add the bind mount to /etc/fstab to make it persistent across reboots. You only need to do that if one of the tools doesn't like the symbolic link approach. I don't think that'll be an issue with pg_wrapper etc.
You should also note that since you've used initdb manually, your new datadir will have its configuration directly inside the datadir, not in /etc/postgresql/.
It's way easier if you just use the Ubuntu cluster management scripts instead.
