Great expectations installation to AWS EMR - python

I am trying to use Great Expectations for data quality checks.
I run my jobs on an AWS EMR cluster, and I am trying to launch the Great Expectations job on EMR as well.
I have a bootstrap script that installs dependencies on the cluster. It looks like this:
#!/bin/bash
sudo yum install -y python3-devel
sudo python3 -m pip install --upgrade pip
sudo python3 -m pip install cython
sudo python3 -m pip install boto3==1.26.37
sudo python3 -m pip install great-expectations==0.15.36
According to the log output, all dependencies were installed correctly, but when the job started I got the following error:
ImportError: this version of pandas is incompatible with numpy < 1.17.3
your numpy version is 1.16.5.
Please upgrade numpy to >= 1.17.3 to use this pandas version
I tried to uninstall numpy and install it manually via pip in the bootstrap script like this, but it didn't help:
sudo python3 -m pip uninstall --yes numpy
sudo python3 -m pip install numpy==1.17.3
I don't understand why this happens.

Switching to a newer EMR release solved the problem.
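One possible reason for an error like this is that the job imports a different numpy than the one the bootstrap script installed. As a minimal sketch for checking that (the helper names `parse_version` and `meets_minimum` are made up for illustration, and the 1.17.3 threshold is taken from the error message above), something like the following could be run at the start of the job to print the numpy version and location the interpreter actually sees:

```python
import re
import sys


def parse_version(text):
    """Turn '1.16.5' into (1, 16, 5), ignoring any non-numeric suffix."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    try:
        import numpy
    except ImportError:
        print("numpy is not importable from this interpreter")
    else:
        # __file__ shows which site-packages directory numpy was loaded from.
        print("numpy", numpy.__version__, "loaded from", numpy.__file__)
        print("new enough for pandas:", meets_minimum(numpy.__version__, "1.17.3"))
```

If the printed location differs from where pip reported installing numpy, the job is picking up another copy.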

Related

Are pip and python consistent? Seems like the answer is no. Can someone help me decrypt the documentation?

So I'm trying to implement Stripe in a Django app and I'm having issues.
I installed Stripe using pip3 install stripe and it downloaded. However, when I run the server it says
ModuleNotFoundError: No module named 'stripe'
Looking around, I think I found some sort of answer on this page:
https://nomodulenamed.com/a/I-have-installed-the-package-using-pip#fail-to-install
Are pip and python consistent?
Seems like the answer is no.
pip3 -V returned pip 20.0.2 from /usr/local/lib/python3.7/site-packages/pip (python 3.7)
and
python3 -V returned Python 3.8.2
It seems that the easy fix is using python3 -m pip3 -V but that returns No module named pip3
and
python3 -m pip -V returns pip 20.1 from /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pip (python 3.8)
but pip -V returns zsh: command not found: pip
which leaves me quite confused.
Moreover, I'm supposed to do
# install your package
python -m pip <your-package-name>
so what goes in the place of
<your-package-name>
and am I supposed to use pip or pip3, since I use pip3 to install things?
Since you can have more than one Python 2 installation and more than one Python 3 installation on your machine, your question is best answered by understanding virtual environments.
This is precisely why virtual environments exist!
When you create a Python 3 virtual environment there is no need to call pip3, because pip is the environment's default pip.
Start by creating your virtual env. This assumes virtualenv and pip are installed; if not, install them on Ubuntu Linux with the two apt-get commands first:
sudo apt-get install virtualenv
sudo apt-get install python3-pip
python3 -m venv env
source env/bin/activate
pip install <yourpackage>
But I believe you are on macOS, since you are getting a zsh error.
Fix your installation by using Homebrew:
brew install python3
pip3 install virtualenv
virtualenv -p python3 <path-to your-project>
source <path-to your-project>/bin/activate
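The mismatch in the question (pip3 reporting Python 3.7 while python3 is 3.8) can be checked from inside Python itself. The sketch below is only illustrative: `is_importable` and `describe_interpreter` are hypothetical helper names, and `stripe` is the module from the question. It should be run with the same interpreter that runs the Django server.

```python
import importlib.util
import sys
import sysconfig


def is_importable(module_name):
    """Return True if this interpreter can find the top-level module."""
    return importlib.util.find_spec(module_name) is not None


def describe_interpreter():
    """Collect the facts that tell two Python installations apart."""
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        # Directory where `python -m pip install` puts pure-Python packages
        # for this interpreter (outside of --user installs).
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(key + ":", value)
    print("stripe importable:", is_importable("stripe"))
```

Comparing the site_packages line with the path printed by pip3 -V shows whether pip3 and python3 belong to the same installation.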

apt-get install python3-numpy doesn't install numpy on python3, but installed on python2.7

I'm trying to install numpy for Python 3 on a Jetson TX2, and I used sudo apt-get install python3-numpy to install it.
The installation succeeds, but numpy is installed for Python 2.7, not Python 3. How can I solve this?
When you flash your Jetson TX2 with JetPack (version), the numpy package is present for Python 2 by default, but not for Python 3.
To install numpy for Python 3, follow the steps below:
1. Check whether you have pip3 installed for Python 3. If not, install it:
sudo apt install python3-pip
2. Then install numpy using pip3:
pip3 install numpy
After installation, check the location /usr/local/lib/python3.6/dist-packages; you will find numpy installed for Python 3. Hope this helps!
I think this is because your default interpreter is Python 2.7.
Check this by running in the console:
python -V
Then you can target the Python 3 installation explicitly, as suggested above:
pip3 install numpy
Note: do not run this command with sudo, because pip would then run the package's setup.py as root.
In other words, since anyone can upload a malicious project to PyPI, this is a high-risk action.
You can try using the Python 3 package manager:
pip3 install --user numpy
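To see where a --user install ends up and whether the Python 3 interpreter will look there, the standard `site` module can be queried. This is a small sketch rather than part of the original answers; `user_site_info` is a made-up helper name.

```python
import site
import sys


def user_site_info():
    """Report the per-user site-packages directory and whether it is active."""
    user_dir = site.getusersitepackages()
    return {
        "user_site": user_dir,
        # ENABLE_USER_SITE is True only when the user directory is enabled;
        # it is off, for example, in a venv created without system site packages.
        "enabled": bool(site.ENABLE_USER_SITE),
        "on_sys_path": user_dir in sys.path,
    }


if __name__ == "__main__":
    for key, value in user_site_info().items():
        print(key + ":", value)
```

Run it with python3; if on_sys_path is False after the install, the interpreter is not searching the directory that pip3 installed into.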

"google-oauthlib-tool: command not found" when trying to install google assistant on raspbian

I'm trying to install google assistant on a newly set up pi 3 with Raspbian. I got the message "No module named googlesamples.assistant.auth_helpers" so I followed the instructions given in answer to this question: No module named googlesamples.assistant.auth_helpers
The first 2 commands appear to complete OK, but the third command gives
"google-oauthlib-tool: command not found"
My programming skills are too rusty to work out what's going wrong.
Python 3.5.3; I've forgotten how to find the SDK version, but it should be the latest one.
Any help greatly appreciated.
Have you followed the instructions to set up a virtual Python environment?
sudo apt-get update
sudo apt-get install python3-dev python3-venv
# Use python3.4-venv if the package cannot be found.
python3 -m venv env
env/bin/python -m pip install --upgrade pip setuptools
source env/bin/activate
Then you should be able to install the oauth tool with pip:
python -m pip install --upgrade google-auth-oauthlib[tool]
You can display all of your installed packages using pip freeze
pip freeze | grep google
Before running python -m pip install --upgrade google-auth-oauthlib[tool] , run
pip install google-auth
And then:
python -m pip install --upgrade google-auth-oauthlib[tool]
google-auth is a dependency of google-auth-oauthlib on Raspbian.
You need to locate the tool via locate google-oauthlib-tool
Then cd into that path and run the tool from there with your arguments.
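A "command not found" error for a pip-installed tool can mean that the script was installed into a directory that is not on PATH, for example a virtual environment that is not activated. As a hedged sketch, the following looks for a command both on PATH and in the current interpreter's scripts directory; `find_console_script` is a hypothetical helper, not part of any Google SDK.

```python
import os
import shutil
import sysconfig


def find_console_script(name):
    """Return the path of a command-line script, or None if it is not found.

    Checks PATH first, then the scripts directory of the running interpreter
    (env/bin inside a virtual environment on Linux).
    """
    on_path = shutil.which(name)
    if on_path is not None:
        return on_path
    candidate = os.path.join(sysconfig.get_path("scripts"), name)
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    location = find_console_script("google-oauthlib-tool")
    print(location or "google-oauthlib-tool was not found for this interpreter")
```

Running it with env/bin/python checks the virtual environment even when the environment has not been activated in the current shell.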

Supporting matplotlib for both python 2 and python 3 on Mac OS X

We're building code that we want to run on both Python 2 & 3. It uses matplotlib. My local machine runs OS X Yosemite.
The matplotlib installation documentation provides instructions for both python 2 & 3, but implies that both cannot be supported on a single Mac. Is this true, and if not how can both be supported with matplotlib?
(Parenthetically, I know that separate installations can be made with virtual environments or machines. However, I've found these cumbersome on Macs. On the other hand, I'm also testing builds on a commercial cloud-based build tester that uses separate VMs for each configuration, which works reasonably well.)
I too find virtualenvs annoying for this sort of thing, and have run into strange issues on OS X virtualenvs with matplotlib in particular.
But there is a really nice tool for supporting parallel installations of different package & python versions: conda. It will manage parallel environments with any Python version; for your case you can do the following:
Install miniconda
Create a Python 3 environment: conda create -n py3env python=3.5 matplotlib
Create a Python 2 environment: conda create -n py2env python=2.7 matplotlib
Activate the one you want with, e.g. source activate py2env
And you're ready to go. For more information on conda environments, see the conda-env docs.
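Once both environments exist, one way to confirm which one is active is a script that runs unchanged under Python 2 and Python 3. The sketch below is an illustration only (`describe_environment` is a made-up name); it reports the interpreter version and, if matplotlib can be imported, its version.

```python
import sys


def describe_environment():
    """Return the Python version and the matplotlib version (or None)."""
    try:
        import matplotlib
        mpl_version = matplotlib.__version__
    except ImportError:
        mpl_version = None
    return {
        "python": "%d.%d" % (sys.version_info[0], sys.version_info[1]),
        "matplotlib": mpl_version,
    }


if __name__ == "__main__":
    info = describe_environment()
    # Single-argument print() behaves the same in Python 2 and Python 3.
    print("Python %s, matplotlib %s" % (info["python"], info["matplotlib"]))
```

Running it once after source activate py3env and once after source activate py2env should report a different Python version each time.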
This appears to work:
python 3: install https://www.python.org/ftp/python/3.5.2/python-3.5.2-macosx10.6.pkg
curl -O https://bootstrap.pypa.io/get-pip.py
python3 get-pip.py
pip3 install nose
pip3 install matplotlib
pip3 install cobra
pip3 install numpy
pip3 install scipy
pip3 install openpyxl
pip3 install future
pip3 install recordtype
pip3 install lxml
pip3 install python-libsbml
python 2: install https://www.python.org/ftp/python/2.7.12/python-2.7.12-macosx10.6.pkg
curl -O https://bootstrap.pypa.io/get-pip.py
python get-pip.py
sudo pip2 install nose
sudo pip2 install matplotlib
sudo pip2 install cobra
sudo pip2 install numpy
sudo pip2 install scipy
sudo pip2 install openpyxl
sudo pip2 install future
sudo pip2 install recordtype
sudo pip2 install lxml
sudo pip2 install python-libsbml
sudo pip2 uninstall python-dateutil # deal with bug in six; see http://stackoverflow.com/a/27634264/509882
sudo pip2 install python-dateutil==2.2

ImportError: No module named psycopg2

In installation process of OpenERP 6, I want to generate a config file with these commands:
cd /home/openerp/openerp-server/bin/
./openerp-server.py -s --stop-after-init -c /home/openerp/openerp-server.cfg
But it always showed the message: ImportError: No module named psycopg2
When I checked for psycopg2 package, it's already installed. Package python-psycopg2-2.4.5-1.rhel5.x86_64 is already installed to its latest version. Nothing to do. What's wrong with this? My server is CentOS, I've installed Python 2.6.7.
Step 1: Install the dependencies
sudo apt-get build-dep python-psycopg2
Step 2: Run this command in your virtualenv
pip install psycopg2-binary
Ref: Fernando Munoz
Use psycopg2-binary instead of psycopg2.
pip install psycopg2-binary
Or you will get the warning below:
UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
Reference: Psycopg 2.7.4 released | Psycopg
I faced the same issue and resolved it with following commands:
sudo apt-get install libpq-dev
pip install psycopg2
Try installing
psycopg2-binary
with
pip install psycopg2-binary --user
Try running import psycopg2 in the Python console. If you get the error, check sys.path, which is where Python looks for installed modules, and see whether the parent directory of python-psycopg2-2.4.5-1.rhel5.x86_64 is listed there. If it is not in sys.path, run export PYTHONPATH=<parent directory of python-psycopg2-2.4.5-1.rhel5.x86_64> before running the OpenERP server.
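The sys.path check described above can be scripted. The following is a minimal sketch (the function name `dirs_containing` is made up here): it lists which directories on the module search path contain a given package, so you can see whether the directory holding psycopg2 is visible to the interpreter that runs the server. It avoids newer syntax so that it should also run on an old Python 2 interpreter, though that is an assumption and has not been checked on 2.6.

```python
import os
import sys


def dirs_containing(package, paths=None):
    """Return the search-path directories that hold the given package."""
    if paths is None:
        paths = sys.path
    found = []
    for entry in paths:
        # An empty entry in sys.path stands for the current directory.
        base = entry or os.getcwd()
        if os.path.isdir(os.path.join(base, package)):
            found.append(base)
    return found


if __name__ == "__main__":
    hits = dirs_containing("psycopg2")
    if hits:
        print("psycopg2 found in: %s" % ", ".join(hits))
    else:
        print("psycopg2 is not in any of these directories:")
        for entry in sys.path:
            print("  %s" % entry)
```

If the package manager's install directory is missing from the printed list, that is the directory to add through PYTHONPATH.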
Import Error on Mac OS
If psycopg2 installs but you are unable to import it in your .py file, then the problem is libpq, its linkage, and the openssl library on which libpq depends. The overall steps are reproduced below. Check them one by one to find the source of the error in your case, and troubleshoot from there.
Check that openssl is installed and make sure it's working.
Check whether libpq is installed on your system; it may not be installed, or not linked. If it is not installed, install it with the command brew install libpq. This installs the libpq library. As per the documentation:
libpq is the C application programmer's interface to PostgreSQL. libpq is a set of library functions that allow client programs to pass queries to the PostgreSQL backend server and to receive the results of these queries.
Link libpq using brew link libpq. If this doesn't work, use the command brew link libpq --force.
Also put the following in your .zshrc file: export PATH="/usr/local/opt/libpq/bin:$PATH". This creates all the necessary linkages for the libpq library.
Now restart the terminal or use the command source ~/.zshrc.
Now use the command pip install psycopg2. It will work.
This works even when you are working in a conda environment.
N.B. pip install psycopg2-binary should be avoided because, as per the developers of the psycopg2 library:
The use of the -binary packages in production is discouraged because in the past they proved unreliable in multithread environments. This might have been fixed in more recent versions but I have never managed to reproduce the failure.
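Whether libpq is visible to the dynamic loader can also be probed from Python with the standard library's `ctypes.util.find_library`. This is a rough sketch under the assumption that the lookup reflects what psycopg2 will see at import time, which is not guaranteed (libraries outside the default search locations may not be reported); `libpq_location` is a hypothetical helper name.

```python
import ctypes.util


def libpq_location():
    """Return the name or path of libpq as found by the loader, or None."""
    # find_library takes the library name without the "lib" prefix.
    return ctypes.util.find_library("pq")


if __name__ == "__main__":
    found = libpq_location()
    if found is None:
        print("libpq was not found in the default library search locations")
    else:
        print("libpq found:", found)
```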
Try with these:
virtualenv -p /usr/bin/python3 test_env
source test_env/bin/activate
pip install psycopg2
Run python and try to import it. If you insist on installing it for your system's Python, try:
pip3 install psycopg2
I recently faced this issue on my production server. I had installed psycopg2 using
sudo pip install psycopg2
It worked beautifully on my local machine, but gave me a hard time on my EC2 server.
sudo python -m pip install psycopg2
The above command worked for me there. Posting it here in case it helps someone in the future.
sudo pip install psycopg2-binary
You need to install the psycopg2 module.
On CentOS:
Make sure Python 2.7+ is installed. If not, follow these instructions: http://toomuchdata.com/2014/02/16/how-to-install-python-on-centos/
# Python 2.7.6:
$ wget http://python.org/ftp/python/2.7.6/Python-2.7.6.tar.xz
$ tar xf Python-2.7.6.tar.xz
$ cd Python-2.7.6
$ ./configure --prefix=/usr/local --enable-unicode=ucs4 --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
$ make && make altinstall
$ yum install postgresql-libs
# First get the setup script for Setuptools:
$ wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py
# Then install it for Python 2.7 and/or Python 3.3:
$ python2.7 ez_setup.py
$ easy_install-2.7 psycopg2
Even though this is a CentOS question, here are the instructions for Ubuntu:
$ sudo apt-get install python3-pip python-distribute python-dev
$ easy_install psycopg2
Cite: http://initd.org/psycopg/install/
For python3 on ubuntu, this worked for me:
$ sudo apt-get update
$ sudo apt-get install libpq-dev
$ sudo pip3 install psycopg2-binary
I had the same problem, and this snippet alone solved it.
pip install psycopg2
I ran into the same issue when I switched from Windows 10 to Ubuntu. The following worked for me, after googling and trying numerous suggestions for 2 hours:
sudo apt-get install libpq-dev
then
pip3 install psycopg2
I hope this helps someone who has encountered the same problem, especially when switching from Windows to Linux (Ubuntu).
I did two things to solve this issue:
Use Python 3.6 instead of 3.8.
Change the Django version to 2.2 (it may work with some higher versions, but I changed to 2.2).
For Python3
Step 1: Install Dependencies
sudo apt-get install python3 python-dev python3-dev
Step 2: Install
pip install psycopg2
Check whether your project's virtual env is activated; if it is not, activate it. Execute the following commands:
workon <your_env_name>
python manage.py runserver
It's working for me
It's very simple, not sure why nobody mentioned this for mac before.
brew install postgresql
pip3 install psycopg2
In simple terms, psycopg2 wants us to install postgres first.
Solved the issue with the solution below:
Basically, the issue is due to the _bz2.cpython-36m-x86_64-linux-gnu.so file. Try to find its location.
Check the Python install location (which python3). Example: /usr/local/bin/python3
Copy the file to INSTALL_LOCATION/lib/python3.6:
cp -rvp /usr/lib64/python3.6/lib-dynload/_bz2.cpython-36m-x86_64-linux-gnu.so /usr/local/lib/python3.6
try:
pip install psycopg2 --force-reinstall --no-cache-dir
Python2 importerror no module named psycopg2
pip install psycopg2-binary
Requirement already satisfied...
Solved by following steps:
sudo curl https://bootstrap.pypa.io/pip/2.7/get-pip.py -o get-pip.py
sudo python get-pip.py
sudo python -m pip install psycopg2-binary
pip install psycopg-binary
The line above helped me
For Python3 use this:
sudo apt-get install -y python3-psycopg2
