Airflow command gives error due to missing api_auth.deny_all - python

On our staging machine, running any airflow command gives error:
[2018-09-01 16:12:55,938] {__init__.py:37} CRITICAL - Cannot import api_auth.deny_all for API authentication due to: No module named api_auth.deny_all
api_auth seems to ship with airflow itself; I tried pip install api_auth and could not find such a package.
On the same machine, I tried reinstalling a clean airflow in a fresh virtualenv with pip install airflow, and I still get this error.
I tried again on my own laptop and airflow works fine, so I suspect the problem comes from the historical ~/airflow/airflow.cfg on the staging machine.
I am not familiar with the airflow.cfg settings and cannot find any clues on Google.
Does anyone know what may cause this issue and how to resolve it?

You are installing the wrong PyPI package.
Please install Airflow using the following:
pip install apache-airflow
instead of
pip install airflow
The Airflow package on PyPI has been renamed to apache-airflow since version 1.8.0.
Check the following link for documentation:
https://airflow.apache.org/installation.html#getting-airflow
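Once apache-airflow is installed (ideally after pip uninstall airflow, in a clean virtualenv), a quick sanity check - just a minimal sketch, not specific to any particular setup - is to confirm which distribution the interpreter actually picks up:
import airflow
print(airflow.__version__)  # apache-airflow reports 1.8.0 or later; the legacy airflow package on PyPI is no longer updated
If this prints a current version, the CRITICAL api_auth error from the old package should be gone.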

Related

Trying to run Airflow on Databricks but getting an error

I am trying to use Airflow on Databricks.
I have installed apache-airflow 1.10.6 from https://pypi.org/project/apache-airflow/.
I am using Python 3.6 on Databricks.
But I got this error:
import airflow
ModuleNotFoundError: No module named 'werkzeug.wrappers.json'; 'werkzeug.wrappers' is not a package
I have tried the following:
Apache Airflow : airflow initdb results in "ImportError: No module named json"
Apache Airflow : airflow initdb throws ModuleNotFoundError: No module named 'werkzeug.wrappers.json'; 'werkzeug.wrappers' is not a package error
But I still get the same problem.
Thanks
Note: By default, Airflow and its dependencies are not installed on Databricks.
You need to install the package explicitly.
Dependency installation: Using Databricks library utilities.
dbutils.library.installPyPI("Werkzeug")
You can install packages in several ways.
Method 1: Installing external packages using pip.
Syntax: %sh /databricks/python3/bin/pip install <packagename>
%sh
/databricks/python3/bin/pip install apache-airflow
Method 2: Using Databricks library utilities
Syntax:
dbutils.library.installPyPI("pypipackage", version="version", repo="repo", extras="extras")
dbutils.library.restartPython() # Removes Python state, but some libraries might not work without calling this function
To install apache-airflow using databricks library utilities use the below command.
dbutils.library.installPyPI("apache-airflow")
Method 3: GUI
Go to Clusters => Select Cluster => Libraries => Install New => Library Source "PyPI" => Package "apache-airflow" => Install
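For example, a minimal sketch of Method 2 for this specific error (the Werkzeug pin is an assumption rather than a value from the question; Werkzeug 0.15+ is what ships werkzeug.wrappers as a package with a json submodule):
# Run in a Databricks notebook cell; dbutils only exists inside Databricks
dbutils.library.installPyPI("apache-airflow", version="1.10.6")  # version taken from the question
dbutils.library.installPyPI("Werkzeug", version="0.16.1")  # assumed pin; 0.15+ provides werkzeug.wrappers.json
dbutils.library.restartPython()  # restarts the Python process so the new packages are picked up

# In a new cell, after the restart:
import airflow  # should no longer raise ModuleNotFoundError for werkzeug.wrappers.json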
Hope this helps. Do let us know if you have any further queries.

ImportError: import apache_beam as beam. Module not found

I've installed the apache_beam Python SDK and the Apache Airflow Python package in a Docker container.
Python Version: 3.5
Apache Airflow: 1.10.5
I'm trying to execute apache-beam pipeline using **DataflowPythonOperator**.
When I run a DAG from the Airflow UI, I get:
Import Error: import apache_beam as beam. Module not found
With the same setup I tried **DataflowTemplateOperator** and it's working perfectly fine.
When I tried the same Docker setup with Python 2 and Apache Airflow 1.10.3 two months back, the operator didn't return any errors and worked as expected.
After SSHing into the Docker container and checking the installed libraries (using pip freeze), I can see the installed versions of apache-beam and apache-airflow:
apache-airflow==1.10.5
apache-beam==2.15.0
Dockerfile:
RUN pip install --upgrade pip
RUN pip install --upgrade setuptools
RUN pip install apache-beam
RUN pip install apache-beam[gcp]
RUN pip install google-api-python-client
ADD . /home/beam
RUN pip install apache-airflow[gcp_api]
airflow operator:
new_task = DataFlowPythonOperator(
    task_id='process_details',
    py_file="path/to/file/filename.py",
    gcp_conn_id='google_cloud_default',
    dataflow_default_options={
        'project': 'xxxxx',
        'runner': 'DataflowRunner',
        'job_name': "process_details",
        'temp_location': 'GCS/path/to/temp',
        'staging_location': 'GCS/path/to/staging',
        'input_bucket': 'bucket_name',
        'input_path': 'GCS/path/to/bucket',
        'input-files': 'GCS/path/to/file.csv'
    },
    dag=test_dag)
This look like a known issue: https://github.com/GoogleCloudPlatform/DataflowPythonSDK/issues/46
Please run pip install six==1.10. This is a known issue in Beam (https://issues.apache.org/jira/browse/BEAM-2964) which we are trying to get fixed upstream.
So try installing six==1.10 using pip.
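One extra check worth doing (a sketch, and only an assumption about the cause): run the snippet below with the same Python interpreter the Airflow worker uses inside the container, to confirm that the six pin and apache-beam are visible to that interpreter and not just to the one pip freeze was run against.
import pkg_resources

# Print the versions visible to this interpreter (run it inside the container)
for name in ("six", "apache-beam", "apache-airflow"):
    print(name, pkg_resources.get_distribution(name).version)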
This may not be an option for you, but I was getting the same error with python 2. Executing the same script with python 3 resolved the error.
I was running through the dataflow tutorial:
https://codelabs.developers.google.com/codelabs/cpb101-simple-dataflow-py/
and when I follow the instructions as specified:
python grep.py
I get the error from the title of your post. I hit it with:
python3 grep.py
and it works as expected. I hope it helps. Happy hunting if it doesn't. See the link for details on what exactly I was running.
This GitHub link will help you solve your problem. Follow the steps below.
Read the following article on virtualenv; it will help with the later steps:
https://www.dabapps.com/blog/introduction-to-pip-and-virtualenv-python/?utm_source=feedly
Create a virtual environment (note: I created it in the cloudml-samples folder and named it env):
titanium-vim-169612:~/cloudml-samples$ virtualenv env
Activate virtual env
#titanium-vim-169612:~/cloudml-samples$ source env/bin/activate
Install Cloud Dataflow using the following link (this brings in apache_beam):
https://cloud.google.com/dataflow/docs/quickstarts/quickstart-python
Now you can check that apache_beam is present in env/lib/python2.7/site-packages/:
#titanium-vim-169612:~/cloudml-samples/flowers$ ls ../env/lib/python2.7/site-packages/
Run the sample
At this point, I got an error about missing tensorflow. I installed tensorflow in my virtualenv by using the link below (use installation steps for virtualenv),
https://www.tensorflow.org/install/install_linux#InstallingVirtualenv
The sample seems to work now.
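As a quick confirmation of the site-packages check above (a sketch, assuming the virtualenv named env is activated), you can ask Python where apache_beam is actually loaded from:
import apache_beam
print(apache_beam.__file__)  # should resolve to a path inside env/lib/python2.7/site-packages/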

How to install EB CLI on Windows?

I have spent the entire day trying to install the EB CLI on Windows in order to connect to AWS Elastic Beanstalk, but I keep getting the same error:
Running setup.py install for docker-py
Could not find .egg-info directory in install record for docker-py>=1.1.0 <=1.7.2 (from awsebcli)
I started out with the latest version of Python, but after reading about other users' issues on Stack Overflow I decided to downgrade to Python 3.4.0. However, I still get the same error, meaning that I cannot run eb init to connect to my Elastic Beanstalk instance since the command is not recognised.
I also tried uninstalling docker-py and reinstalling it - still not working.
Any ideas to what I am doing wrong?
It looks as if you may have version conflicts. See a similar issue here
Try installing awsebcli in a virtual environment, as suggested by the aws docs.

Local development of Google App Engine not importing built-in library

I followed the quickstart, then simply cloned hello_world from here. I already downloaded the google_appengine SDK from here, extracted it, and now I have the google_appengine folder alongside hello_world,
so I execute it like this:
It apparently runs well, until I start making requests to localhost:8080.
Then I get this error:
What's wrong with it? Did I miss something?
Google says that I can use the built-in libraries without manually installing them with pip.
PS: It works when I just deploy it to my project on Google, and it also works if I manually install webapp2 inside lib inside hello_world as described here and then request it locally.
My Python version is 2.7.6 on Ubuntu 14.04 32-bit.
If anybody can solve this, I would appreciate it.
Seems like this is an acknowledged bug in the App Engine SDK. As a temporary workaround, you may try these steps:
Uninstalling the following PIP packages resolved this issue for me.
sudo pip uninstall gcloud
sudo pip uninstall googleapis-common-protos
sudo pip uninstall protobuf
Credit to this thread:
https://groups.google.com/forum/?hl=nl#!topic/google-appengine/LucknWk8iaQ
Be sure to use the correct pip executable if you use virtualenv or have multiple Python versions installed.
Thanks to @Dmytro Sadovnychyi for the answer. Uninstalling those packages doesn't work for me because I never installed them in the first place, but it made me think that a built-in library might conflict with another package, so I decided to create a virtual environment - just a fresh environment, no need to install any packages.
Activate the environment, then execute dev_appserver.py hello_world, and now it works.
For now I'll stick with it until the next update, as said here.

Runtime error alongside Django on Amazon AWS EC2 Linux AMI issue

I am getting this error: signalling support is unavailable because the blinker library is not installed.
I am running Django 1.6.5 under python 2.6.9.
Is it possible that the error will go away if I update Python on the server to 2.7.x?
If so, how can I update the server without losing everything I have done up to this point creating my website on the instance?
Thanks so much in advance.
Just install blinker by typing pip install blinker in the console.
Be sure to install it in your virtualenv if you use one, by activating it before executing the pip command.
You may also want to review your staging procedure so that project dependencies are installed correctly.
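A minimal sketch of a follow-up check, using the same interpreter that runs your Django project, to confirm blinker landed in the right environment:
import blinker
print(blinker.__file__)  # shows which environment blinker was installed into; once this imports, the signalling warning should go away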
