I've installed the Apache Beam Python SDK and Apache Airflow in a Docker container.
Python Version: 3.5
Apache Airflow: 1.10.5
I'm trying to execute an Apache Beam pipeline using **DataFlowPythonOperator**.
When I run the DAG from the Airflow UI, I get:
Import Error: import apache_beam as beam. Module not found
With the same setup, **DataflowTemplateOperator** works perfectly fine.
When I tried the same Docker setup with Python 2 and Apache Airflow 1.10.3 two months ago, the operator didn't return any error and worked as expected.
After SSHing into the Docker container and checking the installed libraries (using pip freeze), I can see the installed versions of apache-beam and apache-airflow:
apache-airflow==1.10.5
apache-beam==2.15.0
Dockerfile:
RUN pip install --upgrade pip
RUN pip install --upgrade setuptools
RUN pip install apache-beam
RUN pip install apache-beam[gcp]
RUN pip install google-api-python-client
ADD . /home/beam
RUN pip install apache-airflow[gcp_api]
airflow operator:
from airflow.contrib.operators.dataflow_operator import DataFlowPythonOperator

new_task = DataFlowPythonOperator(
    task_id='process_details',
    py_file="path/to/file/filename.py",
    gcp_conn_id='google_cloud_default',
    dataflow_default_options={
        'project': 'xxxxx',
        'runner': 'DataflowRunner',
        'job_name': "process_details",
        'temp_location': 'GCS/path/to/temp',
        'staging_location': 'GCS/path/to/staging',
        'input_bucket': 'bucket_name',
        'input_path': 'GCS/path/to/bucket',
        'input-files': 'GCS/path/to/file.csv'
    },
    dag=test_dag)
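One way to narrow this down, as a rough sketch: pip freeze only describes the interpreter that pip belongs to, whereas a pipeline launched as a separate process may be started with a different interpreter found on PATH. The snippet below checks which interpreters can actually import apache_beam. The helper name can_import and the candidate names (python, python2, python3) are assumptions for illustration, not anything defined by Airflow or Beam.

```python
import shutil
import subprocess
import sys


def can_import(interpreter, module):
    """Return True if `interpreter` can import `module` in a fresh process."""
    exit_code = subprocess.call(
        [interpreter, "-c", "import " + module],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return exit_code == 0


if __name__ == "__main__":
    # Candidate names are assumptions; adjust to what exists in the image.
    for name in ("python", "python2", "python3"):
        path = shutil.which(name)
        if path is None:
            print("{}: not on PATH".format(name))
        else:
            print("{} ({}): apache_beam importable = {}".format(
                name, path, can_import(path, "apache_beam")))
```

Running this inside the container (as the same user that runs the Airflow worker) should show whether the module is missing only for some of the interpreters.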
This looks like a known issue: https://github.com/GoogleCloudPlatform/DataflowPythonSDK/issues/46
Please run pip install six==1.10. This is a known issue in Beam (https://issues.apache.org/jira/browse/BEAM-2964) which we are trying to get fixed upstream.
So try installing six==1.10 using pip.
This may not be an option for you, but I was getting the same error with Python 2. Executing the same script with Python 3 resolved the error.
I was running through the dataflow tutorial:
https://codelabs.developers.google.com/codelabs/cpb101-simple-dataflow-py/
and when I follow the instructions as specified:
python grep.py
I get the error from the title of your post. I then ran it with:
python3 grep.py
and it works as expected. I hope it helps. Happy hunting if it doesn't. See the link for details on what exactly I was running.
This GitHub link should help you solve your problem. Follow the steps below.
Read the following article on virtualenv; it will help with the later steps:
https://www.dabapps.com/blog/introduction-to-pip-and-virtualenv-python/?utm_source=feedly
Create a virtual environment (note that I created it in the cloudml-samples folder and named it env):
titanium-vim-169612:~/cloudml-samples$ virtualenv env
Activate the virtual environment:
titanium-vim-169612:~/cloudml-samples$ source env/bin/activate
Install cloud-dataflow using the following link (this brings in apache_beam):
https://cloud.google.com/dataflow/docs/quickstarts/quickstart-python
Now you can check that apache_beam is present in env/lib/python2.7/site-packages/:
titanium-vim-169612:~/cloudml-samples/flowers$ ls ../env/lib/python2.7/site-packages/
Run the sample.
At this point, I got an error about missing tensorflow. I installed tensorflow in my virtualenv using the link below (use the installation steps for virtualenv):
https://www.tensorflow.org/install/install_linux#InstallingVirtualenv
The sample seems to work now.
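The "check that apache_beam is present" step can also be done from inside Python, which confirms both that the virtual environment is active and where the module would be loaded from. This is only a sketch: it assumes Python 3.4 or newer (the steps above used Python 2.7, where listing site-packages as shown is the simpler check), and the helper names in_virtualenv and module_location are made up for illustration.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # The older virtualenv tool sets sys.real_prefix; venv instead makes
    # sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def module_location(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
    print("apache_beam resolves to:", module_location("apache_beam"))
```

If the last line prints None, or a path outside the env folder, the interpreter being used is probably not the one from the activated environment.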
I am using a virtual environment in Python. When I run pip list, or any command to install packages, it throws an error:
from pip._internal.distributions import (
ModuleNotFoundError: No module named 'pip._internal.distributions'
When I check the version with pip -V inside the venv, it shows the pip version (no error, works fine). But pip does not work with any other command. Please ask if anything is unclear.
Thank you!!!
EDIT:
I am using Windows 10 Pro. I installed Python 3.8.1 from the official website using the Windows installer. My pip version is 20.0.2; I installed pip by adding get-pip.py to the bin folder. I think there should not be any problem.
I use pyenv to run Python 3.5.2 and venv to manage my packages. I can successfully install the graph flower from GitHub, but not the garden.matplotlib flower.
This command works fine:
.venv/bin/python3 -m pip install https://github.com/kivy-garden/graph/archive/master.zip
while this one downloads, but fails installation:
.venv/bin/python3 -m pip install https://github.com/kivy-garden/garden.matplotlib/archive/master.zip
How can I solve this? I need to develop the code on my Mac and run it on a raspberry pi afterwards.
pip 19.2.2 (for this environment)
The graph repo has the necessary code to be installable via pip, while the garden.matplotlib repo does not.
I think the matplotlib one may be waiting to be converted to a new-style garden flower, which is designed to be pip-installable.
A workaround would be to just copy the matplotlib one into your app dir and import from there.
On our staging machine, running any airflow command gives this error:
[2018-09-01 16:12:55,938] {__init__.py:37} CRITICAL - Cannot import api_auth.deny_all for API authentication due to: No module named api_auth.deny_all
api_auth seems to ship with Airflow: I tried pip install api_auth and pip could not find such a library.
On the same machine, I tried installing a fresh, clean Airflow using virtualenv and pip install airflow, and I still get this error.
I tried again on my own laptop, and Airflow works fine. So I suspect the cause is the old ~/airflow/airflow.cfg on the staging machine.
I am not familiar with the airflow.cfg settings, and I cannot find any clue on Google.
Does anyone know what may cause this issue and how to resolve it?
You are installing the wrong Apache Airflow package.
Please install Airflow using the following:
pip install apache-airflow
instead of
pip install airflow
The Airflow package has been named apache-airflow since 1.8.0.
Check the following link for documentation:
https://airflow.apache.org/installation.html#getting-airflow
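To see which of the two package names is actually installed in a given environment, a small check like the following may help. It is a sketch that assumes Python 3.8 or newer (for importlib.metadata); the helper name installed_version is hypothetical. On older interpreters, pip show airflow and pip show apache-airflow give the same information.

```python
from importlib import metadata  # available from Python 3.8


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for dist in ("airflow", "apache-airflow"):
        print("{}: {}".format(dist, installed_version(dist)))
```

If the old airflow name reports a version, uninstalling it before installing apache-airflow avoids having both on the same path.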
I need a free optimizer for Python. I use PyCharm and Python 3.6 (I have Python 2.7 on my laptop too).
Now I want to install the Gurobi optimizer in PyCharm, but there are some problems:
When I tried to install the "gurobipy" library, the first error was about the pip version. It was 9.0.3 and I had to upgrade it to 10.0.1. I did that successfully, but now when I try to install gurobipy, I get another error: (AttributeError: module 'pip' has no attribute 'main')
After a quick search, I found that this is a problem with pip 10.0.1.
Now I'm really confused. Can anyone help me? I really need this optimizer in Python.
I see people with the pip 10.0.1 issue downgrading pip version via python -m pip install --upgrade pip==9.0.3. So, how about using the pip 9.0.3 and an older gurobipy (like gurobipy==x.x.x) which might work with the older pip?
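For background, pip 10 moved its internals, so any tool that still calls pip.main() fails with the AttributeError quoted in the question. A commonly used alternative is to run pip as a module of a specific interpreter instead of importing it. The following is a minimal sketch of that idea; the helper name pip_command is hypothetical, and the sketch only queries the pip version rather than installing anything.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a command line that runs pip under the current interpreter."""
    return [sys.executable, "-m", "pip"] + list(args)


if __name__ == "__main__":
    # Shows which pip is tied to this interpreter. To install a package,
    # pass e.g. pip_command("install", "some-package") instead.
    exit_code = subprocess.call(pip_command("--version"))
    print("pip exited with code", exit_code)
```

The same form works from a terminal as python -m pip install ..., which matches the downgrade command mentioned above.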
EDIT:
How to install gurobipy 8.0.1 for python without conda on Linux
Register an account on the Gurobi official website and login.
Download the latest version from the website.
Extract the package and go to the directory that contains the file setup.py
Run sudo python setup.py install
Add the following lines to your .bashrc file:
export GUROBI_HOME="/path/to/gurobi801/linux64"
export PATH="${PATH}:${GUROBI_HOME}/bin"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${GUROBI_HOME}/lib"
Or, to run from PyCharm, set LD_LIBRARY_PATH manually in PyCharm itself.
Test with import gurobipy
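If import gurobipy still fails after these steps, it can help to confirm that the process really sees the variables from .bashrc (a PyCharm run started from the desktop may not). The check below is a sketch under that assumption; the function name gurobi_env_problems is made up for illustration, and it compares paths as exact strings, so a trailing slash in one place and not the other would be reported as a mismatch.

```python
import os


def gurobi_env_problems(env):
    """Return a list of problems found in Gurobi-related environment variables."""
    problems = []
    home = env.get("GUROBI_HOME")
    if not home:
        problems.append("GUROBI_HOME is not set")
        return problems
    lib_dir = os.path.join(home, "lib")
    bin_dir = os.path.join(home, "bin")
    # Exact string comparison against each entry of the search paths.
    if lib_dir not in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        problems.append("LD_LIBRARY_PATH does not include " + lib_dir)
    if bin_dir not in env.get("PATH", "").split(os.pathsep):
        problems.append("PATH does not include " + bin_dir)
    return problems


if __name__ == "__main__":
    for problem in gurobi_env_problems(os.environ) or ["no problems found"]:
        print(problem)
```

Running it once from a terminal and once from inside PyCharm shows whether the two environments differ.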
I followed the quickstart, then simply cloned hello_world from here. I had already downloaded the google_appengine SDK from here. I extracted it, and now I have the google_appengine folder alongside hello_world.
So I execute it like this:
It apparently runs well until I send a request to localhost:8080.
Then I get this error:
What's wrong with it? Did I miss something?
Google says that I can use the built-in libraries without installing them manually with pip.
PS: It works when I deploy it to my project on Google. It also works if I manually install webapp2 inside lib inside hello_world, as described here, and then request it locally.
My Python version is 2.7.6 on Ubuntu 14.04 32-bit.
If anybody can solve this, I would appreciate it.
This seems to be an acknowledged bug in the App Engine SDK. As a temporary workaround, you may try these steps:
Uninstalling the following PIP packages resolved this issue for me.
sudo pip uninstall gcloud
sudo pip uninstall googleapis-common-protos
sudo pip uninstall protobuf
Credit to this thread:
https://groups.google.com/forum/?hl=nl#!topic/google-appengine/LucknWk8iaQ
Be sure to use the correct pip executable if you use virtualenv or have multiple Python versions installed.
Thanks to @Dmytro Sadovnychyi for the answer. Uninstalling those packages doesn't work for me because I never installed them in the first place. But it made me think that the built-in library may conflict with another package, so I decided to create a virtual environment: just a fresh environment, with no need to install any packages.
Activate the environment, then execute dev_appserver.py hello_world. Now it works.
For now I'll stick with this until the next update, as mentioned here.