Unsupported Interpolation Type using env variables in Hydra - python

What I'm trying to do: use environment variables in a Hydra config.
I worked from the following links: OmegaConf: Environment variable interpolation and Hydra: Job Configuration.
This is my config.yaml:
hydra:
  job:
    env_copy:
      - EXPNAME
      # I also tried hydra:EXPNAME and EXPNAME,
      # which return None
test: ${env:EXPNAME}
Then I set the environment variable (Ubuntu) with:
export EXPNAME="123"
The error I get is:
omegaconf.errors.UnsupportedInterpolationType: Unsupported interpolation type env
    full_key: test
    object_type=dict

Try this (the env resolver was removed a while ago in favor of oc.env):
test: ${oc.env:EXPNAME}
I don't think the rest of the config is needed if all you want is to access environment variables on your local machine.
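
For reference, here is a minimal sketch of the whole setup, assuming a recent Hydra 1.x with OmegaConf 2.1+ (where oc.env is available); the file names are just examples.

config.yaml:

test: ${oc.env:EXPNAME}

my_app.py:

import hydra
from omegaconf import DictConfig

@hydra.main(config_path=".", config_name="config")
def main(cfg: DictConfig) -> None:
    # With `export EXPNAME="123"` in the shell, this should print 123
    print(cfg.test)

if __name__ == "__main__":
    main()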

Related

Setting environment variable in python has no effect on cfgrib

I am using xarray with cfgrib to load grib files in Python. I have custom grib definitions, which I am providing to eccodes (backend for cfgrib) via the environment variable GRIB_DEFINITION_PATH.
This setup works well, as long as I run the Python script in an environment where the variable was already set.
Now I want to be more flexible with my setup and provide the environment variable from within Python using os.environ (see the example below).
But somehow when setting up the environment like this, the variable gets ignored and I don't understand why.
Can anyone provide me some insight into this mystery? Thanks in advance!
Here is an "MRE" of the setup.
import xarray as xr
import os

grib_definitions_path = "/paths/to/definitions:/split/like/this"
os.environ["GRIB_DEFINITION_PATH"] = grib_definitions_path

grib_file = '/path/to/grib/file'
backend_args = {
    "filter_by_keys": {"shortName": "P"}
}
array = xr.open_dataset(
    grib_file,
    engine="cfgrib",
    encode_cf=("geography", "vertical"),
    backend_kwargs=backend_args,
)["P"]
print(array.dims)
Executing the above code in a terminal fails for me with KeyError: 'P'. If, however, I first run
export GRIB_DEFINITION_PATH="/paths/to/definitions:/split/like/this"
the dimensions of array are printed as expected.

Using environmental variables with R reticulate

I have a Python package that I want to use in R via reticulate. However, a Python function from that package doesn't appear to see the environment variables set in the R session. How can I set an environment variable so that the Python function can see it?
So if I had a Python function like:
import os

def toy_function():
    return os.environ['ENVVAR']
I would like to be able to do:
library(reticulate)
source_python("toy_function.py")
Sys.setenv("ENVVAR" = "HELLO")
print(toy_function())
And see "HELLO". Currently I am getting an error that "ENVVAR" cannot be found.
Thank you!
Oh, it turns out there is a strange workaround for this, where you just need to set the environment variable directly in Python from R:
py_run_string("import os")
py_run_string("os.environ['ENNVAR'] = 'HELLO'")

Ansible Tower - How to get a list of the environment variables

Within Tower there are a lot of options for adding environment variables before execution. I have set some variables that get pulled into a Python inventory script. However, the script is responding with an error; I think the Python code is not getting the values, or the values are not in the correct format.
I would like to see how the environment variables are being exposed to the Python script. Is there a way to get these added to the debug output in the job log?
The problem was that I wasn't executing a playbook; I was executing a custom Python inventory script, and I needed to see how Ansible was loading the variables in order to troubleshoot why the script wouldn't pick them up. I added some code to the Python script to send me an email with the list of environment variables. You can also write this to a file on the drive, but if you are using Tower you have to expose the folder location under Admin Settings -> Jobs -> paths to expose. I decided it would be easier to just get an email while testing.
import os
import smtplib
import datetime
import time

ts = time.time()
st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')

output = ""
output += 'Time Start {} \r\n '.format(st)
for a in os.environ:
    output += 'Var: {} Value: {} \r\n'.format(a, os.getenv(a))

def send_email(addr_from, addr_to, contents):
    svr = smtplib.SMTP('smtp.mail.local', 25)
    msg = 'Subject: Subject goes here.\n\n{0}'.format(contents)
    svr.sendmail(addr_from, addr_to, msg)

send_email('addr_from@mail.com', 'addr_to@mail.com', output)
(The original post included screenshots of the Tower variables and of the script. That first attempt didn't work; the code that did work was also shown as a screenshot and is not reproduced here.)
The problem was that when you read an environment variable in Python whose value is a dictionary, it comes back with single quotes, and you have to convert those to double quotes and json.loads it to get it to load as a dictionary. Several problems were solved with this. I hope this helps others needing to troubleshoot Ansible with Python.
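
A minimal sketch of that quote fix (the variable name EXTRA_VARS and its value are hypothetical, just to illustrate the single-quote problem described above; the naive replace assumes the values themselves contain no quotes):

import json
import os

# Hypothetical example: a variable whose value looks like a Python dict literal
raw = os.getenv('EXTRA_VARS', "{'region': 'us-east-1', 'env': 'dev'}")

# json.loads() rejects single-quoted strings, so swap them for double quotes first
parsed = json.loads(raw.replace("'", '"'))
print(parsed['region'])  # -> us-east-1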
Just use debug along with the 'env' lookup. Below, HOME is the environment variable.
- name: Show env variable
  debug:
    msg: "{{ lookup('env','HOME') }}"
https://docs.ansible.com/ansible/latest/plugins/lookup/env.html

Export environment variables at runtime with airflow

I am currently converting workflows that used to be implemented in bash scripts to Airflow DAGs. In the bash scripts, I was just exporting the variables at run time with
export HADOOP_CONF_DIR="/etc/hadoop/conf"
Now I'd like to do the same in Airflow, but haven't found a solution for this yet. The one workaround I found was setting the variables with os.environ[VAR_NAME]='some_text' outside of any method or operator, but that means they get exported the moment the script gets loaded, not at run time.
Now when I try to call os.environ[VAR_NAME] = 'some_text' in a function that gets called by a PythonOperator, it does not work. My code looks like this
def set_env():
    os.environ['HADOOP_CONF_DIR'] = "/etc/hadoop/conf"
    os.environ['PATH'] = "somePath:" + os.environ['PATH']
    os.environ['SPARK_HOME'] = "pathToSparkHome"
    os.environ['PYTHONPATH'] = "somePythonPath"
    os.environ['PYSPARK_PYTHON'] = os.popen('which python').read().strip()
    os.environ['PYSPARK_DRIVER_PYTHON'] = os.popen('which python').read().strip()

set_env_operator = PythonOperator(
    task_id='set_env_vars_NOT_WORKING',
    python_callable=set_env,
    dag=dag)
Now when my SparkSubmitOperator gets executed, I get the exception:
Exception in thread "main" java.lang.Exception: When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
My use case where this is relevant is that I have a SparkSubmitOperator with which I submit jobs to YARN, so either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment. Setting them in my .bashrc or any other config is sadly not possible for me, which is why I need to set them at runtime.
Preferably I'd like to set them in an Operator before executing the SparkSubmitOperator, but if there was the possibility to pass them as arguments to the SparkSubmitOperator, that would be at least something.
From what I can see in the spark submit operator, you can pass environment variables to spark-submit as a dictionary:
:param env_vars: Environment variables for spark-submit. It
    supports yarn and k8s mode too.
:type env_vars: dict
Have you tried this?
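A hedged sketch of what that could look like, assuming the contrib-era SparkSubmitOperator from the question (the application path and connection id are placeholders):

from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

spark_submit = SparkSubmitOperator(
    task_id='spark_submit_job',
    application='/path/to/your_spark_job.py',  # placeholder path
    conn_id='spark_default',
    env_vars={
        'HADOOP_CONF_DIR': '/etc/hadoop/conf',
        'YARN_CONF_DIR': '/etc/hadoop/conf',
    },
    dag=dag,
)

The env_vars dictionary is passed to the spark-submit environment, which should satisfy the HADOOP_CONF_DIR/YARN_CONF_DIR check without touching .bashrc.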

Python os.getenv returning None, missing var on Windows 10

( >= Python 3.4 )
With os.getenv(), I am not able to retrieve some env vars, like %DATE% or %TIME%. For example:
print(os.getenv('computername')) # works
print(os.getenv('date')) # not working, returning None
print(os.getenv('time')) # not working, returning None
So I checked the list of environment variables Python can detect with:
[print(var) for var in os.environ]
It results in a long list of variables, but date and time are missing, at least.
A strange thing is that the last item of the output is a list of None objects, probably from the missing variables, but it only appears when using the interactive console, so it isn't really in the list... or maybe it's an artifact of the print function used in the interactive console?
Try
import os
varenv = [var for var in os.environ]
[print(var) for var in varenv]
with the interactive console from the command line.
Finally, when you are in the Windows cmd shell, echo %date% %computername% works great, with no difference in how these variables are processed.
I presume it may be related to Unicode, but I haven't found any answers yet in my research: why are %DATE% and %TIME% not working while %computername% works just fine?
In fact, %DATE% and %TIME% are not environment variables.
See the Microsoft documentation on User Environment Variables for the list of environment variables on your system.
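
To make the distinction concrete, here is a small sketch (Windows only, Python 3.7+ for capture_output): cmd.exe expands %DATE% and %TIME% itself as dynamic variables, so they never show up in the process environment that os.getenv() reads.

import os
import subprocess

print(os.getenv('COMPUTERNAME'))  # a real environment variable
print(os.getenv('DATE'))          # None: not an environment variable
print(os.getenv('TIME'))          # None: not an environment variable

# cmd.exe substitutes these dynamic variables itself at expansion time
result = subprocess.run('echo %DATE% %TIME%', shell=True,
                        capture_output=True, text=True)
print(result.stdout)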
