Hadoop, gcloud utilities, bdutils within Cygwin. No connection, commands not recognized - python

Not sure where the Hadoop forum is...this seems the closest bet.
I am trying to set up the cluster to run the Hortonworks platform, meaning I need bdutil working.
However, while I can run the install.py script inside the bootstrapping folder, I cannot get any of the gcloud or bdutil functions to work. I initially thought there was an incompatibility between the 64-bit Python install and the 32-bit GC SDK, so I installed a 32-bit Python 2.7 and forced Cygwin to use this path by temporarily deleting the environment variable with the path to the 64-bit install.
Below is a log of my errors, as well as info regarding contents of the dirs. Assistance would be greatly appreciated. I've been fighting with this for three days now.
--KNOWS WHICH PYTHON TO USE
$ which python
/cygdrive/c/Users/MJ/Anaconda/python
--INSIDE FOLDER WITH PYTHON SCRIPTS
MJ@Speed_rAcer ~/google-cloud-sdk/bin/bootstrapping
$ ls
__init__.py bq.py install.py setup.py
bootstrapping.py gcutil.py prerun.py setup.pyc
bootstrapping.pyc gsutil.py print_env_info.py
--RUNS PYTHON SCRIPT (install.py). DIDN'T LET ME PICK Y OR N BUT SAYS IT'S CONFIGURED
MJ@Speed_rAcer ~/google-cloud-sdk/bin/bootstrapping
$ python install.py
Do you want to help improve the Google Cloud SDK (Y/n)?
All components are up to date.
Update %PATH% to include Cloud SDK binaries? (Y/n)?
The Google Cloud SDK is currently in developer preview. To help improve the
quality of this product, we collect anonymized data on how the SDK is used.
You may choose to opt out of this collection now (by choosing 'N' at the below
prompt), or at any time in the future by running the following command:
gcloud config set --scope=user disable_usage_reporting true
This will install all the core command line tools necessary for working with
the Google Cloud Platform.
The following directory has been added to your PATH.
C:\Users\MJ\home\google-cloud-sdk\bin
Create a new command shell for the changes to take effect.
For more information on how to get started, please visit:
https://developers.google.com/cloud/sdk/gettingstarted
--NEW SHELL. SHOWING COMMANDS I SHOULD BE ABLE TO RUN (first is gcloud.cmd)
MJ@Speed_rAcer ~/google-cloud-sdk/bin
$ ls
bootstrapping gcloud - Copy.cmd gcutil.cmd gsutil.cmd
bq.cmd gcloud.cmd git-credential-gcloud.cmd sdk
--TRY ONE TO EXECUTE COMMAND
$ ./gcloud auth login
-bash: ./gcloud: No such file or directory
--TRY TWO TO EXECUTE COMMAND
MJ@Speed_rAcer ~/google-cloud-sdk/bin
$ gcloud auth login
-bash: gcloud: command not found

The Cloud SDK for Windows instructions are for Windows, where "command shell" means cmd.exe. gcloud auth login at the bash prompt instructs bash to find an executable file gcloud on PATH. You installed gcloud for Windows, so gcloud.cmd was installed. bash does not do suffix based search for commands, so it doesn't find gcloud.cmd when searching for gcloud.
You can do one of the following to get bash to recognize the gcloud command:
Run this to install the Cygwin shell scripts:
gcloud.cmd components update
Run:
alias gcloud='cmd /c gcloud.cmd'
In the directory containing gcloud.cmd run:
echo cmd /c gcloud.cmd \"\$@\" > gcloud
chmod +x gcloud
Run cmd to get a Windows command prompt. You won't be in bash anymore.
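The lookup difference described above can be sketched in Python. This is a simplified model, not how bash or cmd.exe are actually implemented: `lookup` is a hypothetical helper, and the extension list is an assumed stand-in for the Windows PATHEXT variable.

```python
import os
import tempfile


def lookup(name, dirs, extensions=("",)):
    """Return the first file called name+ext found in dirs, or None."""
    for directory in dirs:
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate):
                return candidate
    return None


def demo():
    # Build a throwaway "bin" directory that only holds gcloud.cmd,
    # like the Windows install of the Cloud SDK in the question.
    with tempfile.TemporaryDirectory() as bin_dir:
        open(os.path.join(bin_dir, "gcloud.cmd"), "w").close()
        # bash-style: exact name only, no suffix search.
        bash_like = lookup("gcloud", [bin_dir])
        # cmd.exe-style: also try suffixes (assumed subset of PATHEXT).
        cmd_like = lookup(
            "gcloud", [bin_dir], extensions=("", ".exe", ".bat", ".cmd")
        )
        return bash_like, os.path.basename(cmd_like)


if __name__ == "__main__":
    print(demo())
```

The exact-name lookup finds nothing, which mirrors the "command not found" error, while the suffix-aware lookup finds gcloud.cmd.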

Related

why am i getting (gsutil): "C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe": command not found

After installing the Google Cloud SDK and connecting to the desired Firebase project, I am receiving:
ERROR: (gsutil)
"C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe":
command not found when running any gsutil command.
My current setup is:
windows 10
Google Cloud SDK 281.0.0
bq 2.0.53
core 2020.02.14
gsutil 4.47
python 3.7
My theory is that, while installed "correctly", Python doesn't have access to the gsutil commands.
I had the same problem and was able to solve it by setting a new environment variable, CLOUDSDK_PYTHON. On Windows 10 you can do this from the command line in two ways:
Set an env variable for the current terminal session
set CLOUDSDK_PYTHON=C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe
Set a permanent env variable
setx CLOUDSDK_PYTHON "C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe"
The file path will probably be different for everyone, so first check where python.exe is located and use your own path. I hope this helps.
Run:
set CLOUDSDK_PYTHON=C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe
Note: There should be no quotes around the python path like this "C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe" or it would attempt to run the command with quotes, which we know won't work.
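Why the quotes matter can be illustrated with a small Python sketch. It assumes the variable's value is used literally as a file name, which is what the error message in the question suggests; `strip_wrapping_quotes` and `points_to_file` are hypothetical helpers, not part of the Cloud SDK.

```python
import os
import sys


def strip_wrapping_quotes(value):
    """Remove one pair of surrounding double quotes, if present."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def points_to_file(value):
    """True if value, taken literally, names an existing file."""
    return os.path.isfile(value)


# The running interpreter stands in for the python.exe path in the answers.
quoted = '"' + sys.executable + '"'

if __name__ == "__main__":
    print("quoted value is a file:  ", points_to_file(quoted))
    print("unquoted value is a file:", points_to_file(strip_wrapping_quotes(quoted)))
```

With the quotes kept as part of the value, no file of that name exists, so the lookup fails even though the interpreter is installed.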
To see a list of components that are available and currently installed, run command:
gcloud components list
To update all installed components to the latest available version (282.0) of Cloud SDK, run command:
gcloud components update
You can also reinstall it by following this document. While Cloud SDK currently uses Python 2 by default, you can use an existing Python installation if necessary by unchecking the option to 'Install Bundled Python'.
As was suggested above, reinstalling with the bundled Python worked for me. I had incorrectly assumed from Google's docs that I should choose between the bundled Python and my current Python install, not realizing both could run without conflict.
The syntax needed to be a little different for me in CMD and/or PowerShell. Also, I installed Python via the Microsoft Store, so the command for me was:
SETX CLOUDSDK_PYTHON "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.1520.0_x64__qbz5n2kfra8p0\python3.9.exe"
You can get the exact path by running the Python app from the Start menu and then reading the window title.

How to execute Kaggle Api commands on windows system?

I'm referring to https://github.com/Kaggle/kaggle-api
I tried executing the sample commands listed on the page in Windows CMD and Python's IDLE. I'm not sure where they should be executed, or how I can get to the Kaggle CLI.
Eg. command: kaggle datasets list -s demographics
Windows CMD says: 'kaggle' is not recognized as an internal or external command,
operable program or batch file.
Assuming the Kaggle API has been successfully installed using pip, and the Python install location along with the location of the Scripts\ folder have been added to the PATH, running kaggle directly in the Windows command prompt (CMD) should work.
To check that Python and the Scripts\ folder have been added to the PATH, execute the command WHERE python3 followed by WHERE kaggle.
If either of the two commands above produces output equivalent to INFO: Could not find files for the given pattern(s), manually modify the PATH using the directions in Excursus: Setting environment variables to add both the Python install location and the location of the Scripts\ folder.
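The same PATH check can be sketched in Python with the standard library's `shutil.which`, which resolves a command name against PATH much as WHERE does. `find_commands` is a hypothetical helper name used only for this illustration.

```python
import shutil


def find_commands(names):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_commands(["python3", "kaggle"]).items():
        print(name, "->", path or "not found on PATH")
```

Any name reported as not found points to a directory that still needs to be added to PATH.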
You can run Bash commands on Windows using the Bash shell, which is a little tricky to launch the first time. You can find instructions on how to do that here: https://www.windowscentral.com/how-install-bash-shell-command-line-windows-10
Hope that helps! :)

Python Error on Google Cloud Install. How do I properly set the environment variable?

I am trying to install the Google Cloud SDK on my Windows machine. I have Python 2.7 currently installed on this machine, and it's located in the System Variables Path like this -> C:\Python27\;
I am getting this error during installation:
ERROR: gcloud failed to load: DLL load failed: %1 is not a valid Win32
application.
The error message also prompts me to check the Python executable by saying:
If it is not, please set the CLOUDSDK_PYTHON environment variable to
point to a working Python 2.7 executable.
So, I'm trying to set the CLOUDSDK_PYTHON environment variable in the install.sh shell script...But nothing is working. Here is the code from that file:
echo Welcome to the Google Cloud SDK!
if [ -z "$CLOUDSDK_PYTHON" ]; then
  if [ -z "$(which python)" ]; then
    echo
    echo "To use the Google Cloud SDK, you must have Python installed and on your PATH."
    echo "As an alternative, you may also set the CLOUDSDK_PYTHON environment variable"
    echo "to the location of your Python executable."
    exit 1
  fi
  CLOUDSDK_PYTHON="python"
fi
I have tried python2.7, and the path to the executable, C:\Python27, but I'm getting this error when I try to run the script with those variables:
install.sh: line 128: $'python\r': command not found
I found this stack question, but none of the solutions worked for me. Any help would be greatly appreciated.
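The `\r` in `$'python\r'` is a carriage return, which suggests that install.sh was saved with Windows (CRLF) line endings, so bash reads the trailing `\r` as part of the command name. A minimal sketch, assuming that is the cause here, of detecting and removing those line endings; `has_crlf` and `to_unix_line_endings` are hypothetical helper names.

```python
def has_crlf(data):
    """True if the byte string contains Windows-style CRLF line endings."""
    return b"\r\n" in data


def to_unix_line_endings(data):
    """Return a copy of data with every CRLF replaced by LF."""
    return data.replace(b"\r\n", b"\n")


# Two lines from the script in the question, as they would look with CRLF.
sample = b'CLOUDSDK_PYTHON="python"\r\nfi\r\n'

if __name__ == "__main__":
    print(has_crlf(sample))
    print(to_unix_line_endings(sample))
```

To apply this to a real file, read it in binary mode ("rb"), convert the bytes, and write the result back in binary mode ("wb") so Python does not translate the line endings again.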
I had the same issue when the SDK was pointing to the virtualenv Python. I solved it by using the default Python 2.7 in Ubuntu.
Type this in a terminal:
export CLOUDSDK_PYTHON=/usr/bin/python
This is because the gcloud.bat command can't find the right python.exe. I solved the problem by simply putting
SET CLOUDSDK_PYTHON=pathWherePythonexeLocate
into the file cloud_env.bat in google Cloud SDK file folder.
Revising install.sh won't help, because it does nothing to the environment: install.sh is only run when you first install the gcloud SDK.
The SDK only supports Python 2.7, so the path should point to Python 2.7, such as C:\myname\soft\python27.exe
Two configurations fixed my issue with this.
My laptop runs Windows 10, and I found that there was a file:
C:\Users\<myusername>\AppData\Local\Microsoft\WindowsApps\python.exe
That file is 0 KB in size. This directory was ahead of the C:\Python27 path where Python was actually installed. I tried moving C:\Python27 higher in the Path string, but this did not work.
While I did not reboot, I did open a fresh CMD window and confirmed that C:\Python27 was higher in the path than the AppData directory. Still did not work.
When I set CLOUDSDK_PYTHON, just the directory "path" was not enough. The FULL path must be provided, including the executable name.
Making these two changes enabled gcloud to work.
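The two pitfalls in this answer (a directory given instead of the executable, and a 0 KB python.exe) can be checked with a short Python sketch. `diagnose_python_path` is a hypothetical helper written for this illustration; it is not something the Cloud SDK provides.

```python
import os
import tempfile


def diagnose_python_path(path):
    """Classify a candidate CLOUDSDK_PYTHON value (hypothetical helper)."""
    if os.path.isdir(path):
        return "directory: include the executable name"
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "zero-byte file: probably not a real interpreter"
    return "ok"


def demo():
    with tempfile.TemporaryDirectory() as folder:
        stub = os.path.join(folder, "python.exe")
        open(stub, "w").close()  # empty file, like the 0 KB python.exe above
        return diagnose_python_path(folder), diagnose_python_path(stub)


if __name__ == "__main__":
    print(demo())
```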
Of course just as I finished typing the above, I saw email from Google regarding the change below.
IMPORTANT NOTE: Python 2.7 will no longer get updates after Jan 1, 2020, so gcloud as of v274.0.0 will run with Python 3.x. I can't find a web page announcing this, but there is mention of the change on this page: https://cloud.google.com/sdk/docs/quickstart-linux
The way I solved this was simply by downloading the Versioned SDK instead of the Interactive SDK. I manually added gcloud to my path, and all worked. I still don't know why the interactive download was not finding Python from my systems path, but the Versioned SDK without Python worked.
Thanks for the tips @DanCornilescu.
If you are facing the issue on 274.0.0 on Windows, it is being tracked in the public bug https://issuetracker.google.com/issues/146458519
An employee replied:
We have a patch for two files that are causing these problems. These
apply in two cases (both on Windows):
1. A new install fails, or
2. You are unable to run gcloud after performing a components update.
For case # 1, please download the attached file install.bat, and copy
it to the location where you have attempted to install gcloud, e.g.
C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk. Then run
it, e.g.
> cd C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk
> .\install.bat
For both cases #1 and #2, download the attached file gcloud.cmd, and
copy it to the bin directory under your gcloud installation, e.g.
C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\bin. When
prompted to replace the previous copy, type Yes. This should allow
you to run gcloud without being prompted to set CLOUDSDK_PYTHON.
The files are attached in the public bug tracker.
Add CLOUDSDK_PYTHON to your system variables and assign it the full path to your python.exe file.
Restart your services so that the change can take effect.

Problems running rethinkdb-dump from cron

I'm trying to set up regular backups of rethinkdb, but keep running into issues. How do you set up rethinkdb-dump to run from cron?
Here is my script:
$ cat backup.sh
#!/bin/bash
NOW=$(date +"%Y-%m-%d-%H-%M")
/usr/bin/rethinkdb dump -e my_db -f /root/db_backup/$NOW.tar.gz
The script runs just fine when I run it manually. However, when I try to run it from cron it doesn't work, and I get the following on stderr:
Error when launching 'rethinkdb-dump': No such file or directory
The rethinkdb-dump command depends on the RethinkDB Python driver, which must be installed.
If the Python driver is already installed, make sure that the PATH environment variable
includes the location of the backup scripts, and that the current user has permission to
access and run the scripts.
Instructions for installing the RethinkDB Python driver are available here:
http://www.rethinkdb.com/docs/install-drivers/python/
It appears to be a Python environment issue, but I cannot figure out how to make it happy... thoughts? Help!
When cron runs your backup.sh script, it may run without the correct PATH set up and so cannot find rethinkdb-dump.
First, let's find out where rethinkdb-dump is:
which rethinkdb-dump
(This is the result on my PC; it may well be different on yours.)
/usr/local/bin/rethinkdb-dump
Now, try appending that directory to the PATH in your script backup.sh:
#!/bin/bash
export PATH="$PATH:/path/to/folder/contain-rethinkdb-dump"
# The rest of your script normally
So, taking my example, I would put it like this:
export PATH="$PATH:/usr/local/bin"
I think your rethinkdb-dump lives outside the normal bin folders (/usr/bin, /usr/local/bin, etc.).
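The effect of appending a directory to PATH can be sketched in Python with `shutil.which`. This is an illustration under assumptions: it targets a POSIX system, `/usr/bin:/bin` is only an assumed example of a minimal cron PATH, and `fake-rethinkdb-dump` is a made-up stand-in so the sketch does not depend on RethinkDB being installed.

```python
import os
import shutil
import stat
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as extra_dir:
        # Create an executable stand-in for rethinkdb-dump in a directory
        # that is not part of the minimal PATH.
        tool = os.path.join(extra_dir, "fake-rethinkdb-dump")
        with open(tool, "w") as handle:
            handle.write("#!/bin/sh\n")
        os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)

        cron_path = "/usr/bin:/bin"  # assumed minimal PATH, for illustration
        before = shutil.which("fake-rethinkdb-dump", path=cron_path)
        after = shutil.which(
            "fake-rethinkdb-dump", path=cron_path + os.pathsep + extra_dir
        )
        return before, after is not None


if __name__ == "__main__":
    print(demo())
```

The lookup fails with the minimal PATH and succeeds once the extra directory is appended, which is what the export line in backup.sh achieves.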
The Python installer for Windows installs scripts and packages in subfolders here:
$env:APPDATA\Python\Python37 for PowerShell
%APPDATA%\Python\Python37 for cmd
cd into this directory to see the Scripts and site-packages (pip packages) folders.

Installing Anaconda on Amazon Elastic Beanstalk

I've added deploy commands to my Elastic Beanstalk deployment which download the Anaconda installer, and install it into /anaconda. Everything goes well, but I cannot seem to correctly modify the PATH of my instance to include /anaconda/bin as suggested by the Anaconda installation page. If I SSH into an instance and manually add it, everything works fine. But this is obviously not the correct approach, as machines will be added automatically by EB.
So my question is: how can I use Anaconda in my script?
A couple more details:
I've tried adding /anaconda/bin to the system PATH all ways I can think of. Pre/post deploy scripts, custom environment variables, etc. It seems that no matter what I do, the modifications don't persist to when the application is run.
I've tried to include Anaconda via adding it to sys.path: sys.path.append('/anaconda/bin')
to no avail. Using the following: sys.path.append('/anaconda/lib/python2.7/site-packages') allows me to import some packages but fails on import pandas. Strangely enough, if I SSH into the instance and run the application with their python (/opt/python/run/venv/bin/python2.7) it runs fine. Am I going crazy? Why does it fail on a specific import statement when run via EB?
Found the answer: import pandas was failing because matplotlib was failing to initialize, because it was trying to get the current user's home directory. Since the application is run via WSGI, the HOME variable is set to /home/wsgi but this directory doesn't exist. So, creating this directory via deployment command fixed this issue.
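A quick way to check for this condition from inside the application is sketched below. It only reports whether HOME points at an existing directory; `home_exists` is a hypothetical helper, and the fix described above (creating the directory with a deployment command) stays the same.

```python
import os


def home_exists(environ=os.environ):
    """True if HOME is set and names an existing directory."""
    home = environ.get("HOME")
    return bool(home) and os.path.isdir(home)


if __name__ == "__main__":
    print("HOME =", os.environ.get("HOME"))
    print("HOME directory exists:", home_exists())
```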
My overall setup to use Anaconda on Elastic Beanstalk is as follows:
.ebextensions/options.config contains:
commands:
  00_download_conda:
    command: 'wget http://repo.continuum.io/archive/Anaconda-2.0.1-Linux-x86_64.sh'
    test: test ! -d /anaconda
  01_install_conda:
    command: 'bash Anaconda-2.0.1-Linux-x86_64.sh -b -f -p /anaconda'
    test: test ! -d /anaconda
  02_create_home:
    command: 'mkdir -p /home/wsgi'
00_download_conda simply downloads Anaconda. See here for latest Anaconda version download link. The test commands are EB's way of letting you execute the command only if the test succeeds; here the test passes only when /anaconda does not exist yet, which prevents double downloading when in development.
01_install_conda installs Anaconda with options -b -f -p /anaconda, which install it into the specified directory (-p), without user input (-b), and without raising an error if that directory already exists (-f).
02_create_home creates the missing directory.
And finally - to use Anaconda inside your python application: sys.path.append('/anaconda/lib/python2.7/site-packages')
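A slightly more defensive version of that line is sketched here. It appends the directory only if it exists and is not already on sys.path; `add_site_packages` is a hypothetical helper, and the path is the one from this answer, so adjust it to your own install.

```python
import os
import sys


def add_site_packages(path):
    """Append path to sys.path if it is a directory and not already listed."""
    if os.path.isdir(path) and path not in sys.path:
        sys.path.append(path)
        return True
    return False


# Path taken from the answer above; this is a no-op if it does not exist.
add_site_packages("/anaconda/lib/python2.7/site-packages")
```

Call it before any import that depends on the Anaconda packages, such as import pandas.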
Cheers!
