I've added deploy commands to my Elastic Beanstalk deployment which download the Anaconda installer, and install it into /anaconda. Everything goes well, but I cannot seem to correctly modify the PATH of my instance to include /anaconda/bin as suggested by the Anaconda installation page. If I SSH into an instance and manually add it, everything works fine. But this is obviously not the correct approach, as machines will be added automatically by EB.
So my question is: how can I use Anaconda in my script?
A couple more details:
I've tried adding /anaconda/bin to the system PATH every way I can think of: pre/post deploy scripts, custom environment variables, etc. It seems that no matter what I do, the modifications don't persist to when the application is run.
I've tried to include Anaconda by adding it to sys.path: sys.path.append('/anaconda/bin'), to no avail.
Using the following: sys.path.append('/anaconda/lib/python2.7/site-packages') allows me to import some packages, but fails on import pandas. Strangely enough, if I SSH into the instance and run the application with its Python (/opt/python/run/venv/bin/python2.7), it runs fine. Am I going crazy? Why does it fail on a specific import statement when run via EB?
Found the answer: import pandas was failing because matplotlib was failing to initialize, because it was trying to get the current user's home directory. Since the application is run via WSGI, the HOME variable is set to /home/wsgi, but that directory doesn't exist. Creating the directory via a deployment command fixed the issue.
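(As an aside, a minimal sketch of an alternative fix: point matplotlib at a writable config directory before the first import, so it never needs $HOME. MPLCONFIGDIR is matplotlib's standard override; the /tmp path is just an assumption.)
import os
os.environ['MPLCONFIGDIR'] = '/tmp/matplotlib'  # any directory the WSGI user can write to
if not os.path.isdir('/tmp/matplotlib'):
    os.makedirs('/tmp/matplotlib')
import pandas  # matplotlib can now initialize without a usable $HOME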
My overall setup to use Anaconda on Elastic Beanstalk is as follows:
.ebextensions/options.config contains:
commands:
  00_download_conda:
    command: 'wget http://repo.continuum.io/archive/Anaconda-2.0.1-Linux-x86_64.sh'
    test: test ! -d /anaconda
  01_install_conda:
    command: 'bash Anaconda-2.0.1-Linux-x86_64.sh -b -f -p /anaconda'
    test: test ! -d /anaconda
  02_create_home:
    command: 'mkdir -p /home/wsgi'
00_download_conda simply downloads Anaconda. See here for the latest Anaconda version download link. The test entries are EB's way of running a command conditionally: the command executes only if the test command succeeds (here, only while /anaconda doesn't exist yet). This just prevents re-downloading during development.
01_install_conda installs Anaconda with the options -b (batch mode, no user input), -f (don't fail if the install directory already exists), and -p /anaconda (the target directory).
02_create_home creates the missing directory.
And finally - to use Anaconda inside your python application: sys.path.append('/anaconda/lib/python2.7/site-packages')
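For example, at the top of the application module (a sketch for the Python 2.7 setup above; the pandas import is only there to show that Anaconda's packages now resolve):
import sys
sys.path.append('/anaconda/lib/python2.7/site-packages')
import pandas  # served from Anaconda's site-packages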
Cheers!
I have set up two pipelines for a Python package. One is for Windows, the other is for Linux. The one for Windows works as expected. However, if I copy the Linux executable to a Raspberry, it won't run. If I double click it, nothing happens and if I execute it using a terminal, I get permission denied. If I build the Python package locally on my Raspberry, everything works as expected.
So basically my question is: do I need to specifically target Linux ARM for my Python app to run on my Raspberry? If so, how can I achieve this? When creating a pipeline, I can only choose between x86 and x64 architectures.
Repo can be found here.
This is the pipeline I use for building and publishing:
trigger:
- master

jobs:
- job: 'Raspberry'
  pool:
    name: arm32 # Already tried to use a self-hosted build agent, but didn't get it to work
  variables:
    python.version: '3.7'
  steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: '$(python.version)'
  - script: |
      cd AzurePipelinesWithPython
      python -m pip install --upgrade pip
    displayName: 'Install dependencies'
  - script: pip install pyinstaller
    name: 'pyinstaller'
  - script: cd AzurePipelinesWithPython && pyinstaller --onefile --noconfirm --clean test.py
    name: 'build'
  - task: PublishBuildArtifacts@1
    inputs:
      pathtoPublish: './AzurePipelinesWithPython/dist/'
      artifactName: 'AzurePipelinesWithPython-raspi-$(python.version)'
Sorry for not being able to post an Azure DevOps repo; it belongs to our corporate subscription and isn't public.
So basically my question is, do I need to specifically target Linux ARM for my Python app to run on my Raspberry?
No, you don't need to target ARM explicitly for your app. In fact, I don't think removing that task would make much difference. If you're using a hosted agent, it contains several Python versions by default. And if you use a self-hosted agent, you just need to make sure a suitable version is installed on the local machine.
However, if I copy the Linux executable to a Raspberry, it won't run.
If I double click it, nothing happens and if I execute it using a
terminal, I get permission denied.
As for the strange behavior you've met, I don't think it comes from the Azure DevOps side. I searched and found several similar issues like yours; see one, two, three, four.
So I would treat it as a common issue in a Linux environment: try executing the program from a terminal after giving it permission to run with chmod u+x file_name. If the issue persists, you can also run file on the executable to check that it was built for the right architecture. Hope all this gives you a correct direction to resolve your issue.
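For example, on the Raspberry Pi (a sketch; test is the executable name implied by the question's PyInstaller step):
chmod u+x ./test   # grant execute permission
file ./test        # an ARM build reports something like "ELF 32-bit ... ARM"; an x64 build reports "x86-64"
./test             # then run it from the terminal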
After installing the Google Cloud SDK and connecting to the desired Firebase project, I am receiving the following when running any gsutil command:
ERROR: (gsutil) "C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe": command not found
My current setup is:
windows 10
Google Cloud SDK 281.0.0
bq 2.0.53
core 2020.02.14
gsutil 4.47
python 3.7
My theory is that, while installed "correctly", Python doesn't have access to the gsutil commands.
I had the same problem and was able to solve it by setting a new environment variable for CLOUDSDK_PYTHON. On Windows 10 you can do this from the command line in two ways:
Set an env variable for the current terminal session
set CLOUDSDK_PYTHON=C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe
Set a permanent env variable
setx CLOUDSDK_PYTHON C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe
The file path will probably be different for everyone, so first check where python.exe is located and use your own path. I hope this helps.
Run:
set CLOUDSDK_PYTHON=C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe
Note: there should be no quotes around the Python path (i.e. not "C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe"), or the SDK would attempt to run the interpreter with the quotes as part of the path, which we know won't work.
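To confirm the variable took effect, a quick check in the same cmd session (a sketch):
echo %CLOUDSDK_PYTHON%
gsutil version -l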
To see a list of components that are available and currently installed, run command:
gcloud components list
To update all installed components to the latest available version (282.0) of the Cloud SDK, run command:
gcloud components update
You can also reinstall it by following this document. While the Cloud SDK currently uses Python 2 by default, you can use an existing Python installation if necessary by unchecking the option to 'Install Bundled Python'.
As was suggested above, reinstalling using bundled Python worked for me. I had incorrectly assumed from Google's docs that I should choose between the bundled and my current Python install, not realizing both could run without conflict.
The syntax needed to be a little different for me in CMD and/or PowerShell. Also, I installed Python via the Microsoft Store, so the command for me was:
SETX CLOUDSDK_PYTHON "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.1520.0_x64__qbz5n2kfra8p0\python3.9.exe"
You can get the exact path by running the Python app from the Start menu and then reading the window title.
I have a Dockerfile containing the lines:
COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt
I would like to set some breakpoints on libraries installed via requirements in my local IDE. I am wondering how to launch the docker image such that these files are accessible from my local IDE. The relevant modules are located within the image at:
/usr/local/lib/python3.7/site-packages
So, I was thinking of using:
docker run \
  -v site_pkgs:/usr/local/lib/python3.7/site-packages \
  --entrypoint python3 \
  app-dev
But this seems to overwrite the container's directory, rendering it unable to find the modules it expects. Any guidance on how to perform this type of debugging on a running container would be very helpful. Thank you!
a) If you just want to get the Python module code out of the container and call it from test code in your local IDE, then you don't need to run the container at all; just copy the files out of it:
docker cp <container_id>:/usr/local/lib/python3.7/site-packages .
After that, these modules' .py files will be on your local machine (the Docker host), and you can debug them in your local IDE with your own test code.
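For example (a sketch; app-dev is the image from the question, pkgsrc is an arbitrary container name), docker cp also works on a container that has never been started:
docker create --name pkgsrc app-dev                            # create a container without running it
docker cp pkgsrc:/usr/local/lib/python3.7/site-packages ./pkgs # copy the installed packages out
docker rm pkgsrc                                               # clean up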
b) If you want your local IDE to debug the code inside the container directly, then the VS Code IDE is your choice.
NOTE: you currently need to use the Insiders build, as this is a pretty new feature (and a great one, I think).
See Developing inside a Container: VS Code keeps the IDE on your local host machine while still letting it debug the code running in the container.
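As a minimal sketch of the configuration that feature reads, a .devcontainer/devcontainer.json could look like this (the image name comes from the question, the extension ID is the standard Python extension, and the exact schema may vary by VS Code version):
{
    "name": "app-dev",
    "image": "app-dev",
    "extensions": ["ms-python.python"]
}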
I'm trying to setup regular backups of rethinkdb, but keep running into issues. How do you setup rethinkdb-dump to run from cron?
Here is my script:
$ cat backup.sh
#!/bin/bash
NOW=$(date +"%Y-%m-%d-%H-%M")
/usr/bin/rethinkdb dump -e my_db -f /root/db_backup/$NOW.tar.gz
The script runs just fine when I run it manually. However, when I try to run it from cron it doesn't work, and I get the following on stderr:
Error when launching 'rethinkdb-dump': No such file or directory
The rethinkdb-dump command depends on the RethinkDB Python driver, which must be installed.
If the Python driver is already installed, make sure that the PATH environment variable
includes the location of the backup scripts, and that the current user has permission to
access and run the scripts.
Instructions for installing the RethinkDB Python driver are available here:
http://www.rethinkdb.com/docs/install-drivers/python/
It appears to be a Python environment issue, but I cannot figure out how to make it happy... thoughts? Help!
When cron runs your backup.sh script, it runs without your usual PATH setup, so it cannot find rethinkdb-dump.
First, let's find out where rethinkdb-dump is:
which rethinkdb-dump
(that's on my machine; I'd guess it's different on yours)
/usr/local/bin/rethinkdb-dump
Now append that directory to the PATH in your backup.sh script:
#!/bin/bash
export PATH="$PATH:/path/to/folder/contain-rethinkdb-dump"
# The rest of your script normally
Taking my example, I would put:
export PATH="$PATH:/usr/local/bin"
I suspect your rethinkdb-dump lives outside the folders cron searches; cron's default PATH is typically just /usr/bin:/bin, which doesn't even include /usr/local/bin.
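Alternatively, you can set PATH in the crontab itself (a sketch; the schedule and log path are assumptions):
PATH=/usr/local/bin:/usr/bin:/bin
# run the backup daily at 02:00 and capture output for troubleshooting
0 2 * * * /root/backup.sh >> /var/log/db_backup.log 2>&1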
The Python installer for Windows installs scripts and packages in subfolders here:
$env:APPDATA\Python\Python37 for PowerShell
%APPDATA%\Python\Python37 for cmd
cd into this directory to see \Scripts and \site-packages (pip packages).
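So if a per-user pip script such as rethinkdb-dump isn't found on Windows, putting that Scripts folder on PATH for the current cmd session is a quick check (a sketch; adjust the Python version folder to yours):
set PATH=%PATH%;%APPDATA%\Python\Python37\Scripts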
Not sure where the Hadoop forum is... this seems like the closest bet.
I am trying to set up the cluster to run the Hortonworks platform, meaning I need bdutil working.
However, while I can run the install.py script inside the bootstrapping folder, I cannot get any of the gcloud or bdutil functions to work. I initially thought there was an incompatibility between the 64-bit Python install and the 32-bit Google Cloud SDK, so I installed a 32-bit Python 2.7 and forced Cygwin to use this path by temporarily deleting the environment variable pointing to the 64-bit install.
Below is a log of my errors, as well as info regarding contents of the dirs. Assistance would be greatly appreciated. I've been fighting with this for three days now.
--KNOWS WHICH PYTHON TO USE
$ which python
/cygdrive/c/Users/MJ/Anaconda/python
--INSIDE FOLDER WITH PYTHON SCRIPTS
MJ#Speed_rAcer ~/google-cloud-sdk/bin/bootstrapping
$ ls
__init__.py bq.py install.py setup.py
bootstrapping.py gcutil.py prerun.py setup.pyc
bootstrapping.pyc gsutil.py print_env_info.py
--RUNS PYTHON SCRIPT (install.py). DIDN'T LET ME PICK Y OR N BUT SAYS IT'S CONFIGURED
MJ#Speed_rAcer ~/google-cloud-sdk/bin/bootstrapping
$ python install.py
Do you want to help improve the Google Cloud SDK (Y/n)?
All components are up to date.
Update %PATH% to include Cloud SDK binaries? (Y/n)?
The Google Cloud SDK is currently in developer preview. To help improve the
quality of this product, we collect anonymized data on how the SDK is used.
You may choose to opt out of this collection now (by choosing 'N' at the below
prompt), or at any time in the future by running the following command:
gcloud config set --scope=user disable_usage_reporting true
This will install all the core command line tools necessary for working with
the Google Cloud Platform.
The following directory has been added to your PATH.
C:\Users\MJ\home\google-cloud-sdk\bin
Create a new command shell for the changes to take effect.
For more information on how to get started, please visit:
https://developers.google.com/cloud/sdk/gettingstarted
--NEW SHELL. SHOWING COMMANDS I SHOULD BE ABLE TO RUN (first is gcloud.cmd)
MJ#Speed_rAcer ~/google-cloud-sdk/bin
$ ls
bootstrapping gcloud - Copy.cmd gcutil.cmd gsutil.cmd
bq.cmd gcloud.cmd git-credential-gcloud.cmd sdk
--TRY ONE TO EXECUTE COMMAND
$ ./gcloud auth login
-bash: ./gcloud: No such file or directory
--TRY TWO TO EXECUTE COMMAND
MJ#Speed_rAcer ~/google-cloud-sdk/bin
$ gcloud auth login
-bash: gcloud: command not found
The Cloud SDK for Windows instructions are for Windows, where "command shell" means cmd.exe. Typing gcloud auth login at the bash prompt instructs bash to find an executable file named gcloud on PATH. You installed gcloud for Windows, so gcloud.cmd was installed. bash does not do suffix-based search for commands, so it doesn't find gcloud.cmd when searching for gcloud.
You can do one of the following to get bash to recognize the gcloud command:
1. Install the Cygwin shell scripts:
gcloud.cmd components update
2. Define an alias:
alias gcloud='cmd /c gcloud.cmd'
3. In the directory containing gcloud.cmd, create an executable wrapper script:
echo cmd /c gcloud.cmd \"\$@\" > gcloud
chmod +x gcloud
4. Run cmd to get a Windows command prompt. You won't be in bash anymore.
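For example, after option 3, the command that failed in the question should work from bash when run from that directory (a sketch):
$ ./gcloud auth login
And once ~/google-cloud-sdk/bin is on bash's PATH, plain gcloud auth login works too.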