How to SSH and run PythonOperator in Airflow - python

Is there a way to SSH to a server and run a PythonOperator with Airflow? I am looking for something like SSHExecuteOperator, but instead of executing a bash command, executing a python callable.

This is an SSH key authentication issue.
On the host you want to connect to, run ssh-keygen -t rsa and press Enter through all the prompts.
You will get two RSA key files (id_rsa and id_rsa.pub). Copy the private key into the Airflow environment and note its full path, and make sure the public key is listed in the remote user's ~/.ssh/authorized_keys.
Then just add the below to the extras of your connection in the Airflow UI:
{"key_file": "/usr/local/airflow/.ssh/id_rsa", "no_host_key_check": true}
Re-parse the DAG and trigger it.
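With the connection in place, a minimal sketch of wiring it into a DAG might look like this (Airflow 1.10-style contrib imports; the conn_id my_ssh_conn is an assumption, and the callable is shipped as an inline python -c script because SSHOperator only runs shell commands):
from datetime import datetime
from airflow import DAG
from airflow.contrib.hooks.ssh_hook import SSHHook
from airflow.contrib.operators.ssh_operator import SSHOperator

dag = DAG("remote_python", start_date=datetime(2021, 1, 1), schedule_interval=None)

run_remote = SSHOperator(
    task_id="run_remote_python",
    # "my_ssh_conn" is a hypothetical conn_id configured as described above
    ssh_hook=SSHHook(ssh_conn_id="my_ssh_conn"),
    command='python -c "print(\'hello from the remote host\')"',
    dag=dag,
)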

Related

Airflow Remote PythonOperator

I have a python script in a local file, and I don't want to SCP it to the remote machine and run it there with an SSHOperator triggered remotely by Airflow. How can I run a local .py file on a remote machine and get the results?
I need SSHOperator with python_callable, not bash_command.
Can anyone show me a remote custom operator sample like SSHPYTHONOperator ?
I solved the problem as follows:
gettime = """
import os
import datetime
def gettimes():
    print(True)
gettimes()
"""
remote_python_get_delta_times = SSHOperator(
    task_id="get_delta_times",
    do_xcom_push=True,
    command='MYVAR=`python -c "%s"`; echo $MYVAR' % gettime,
    dag=dag,
    ssh_hook=remote,
)
I see an SSH operator in the Airflow docs: https://airflow.apache.org/docs/apache-airflow/1.10.13/_api/airflow/contrib/operators/ssh_operator/index.html
If that doesn't work out for you, then you'll have to create a custom operator using an SSH library like Paramiko,
and then use it to pull code from GitHub/S3, or SCP your file to the server and execute it there.
You would need to make sure all your dependencies are also installed on the remote server.
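A minimal sketch of such a custom operator, assuming Paramiko and Airflow 1.10-style imports (the class name RemotePythonOperator and all connection parameters are hypothetical):
import inspect
import paramiko
from airflow.models import BaseOperator

class RemotePythonOperator(BaseOperator):
    """Run a Python callable on a remote host by piping its source to python3."""

    def __init__(self, host, username, key_file, python_callable, **kwargs):
        super(RemotePythonOperator, self).__init__(**kwargs)
        self.host = host
        self.username = username
        self.key_file = key_file
        self.python_callable = python_callable

    def execute(self, context):
        # Serialize the callable's source and append a call to it, so the
        # remote interpreter runs it; the callable must be self-contained.
        source = inspect.getsource(self.python_callable)
        script = source + "\n%s()\n" % self.python_callable.__name__
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(self.host, username=self.username, key_filename=self.key_file)
        try:
            stdin, stdout, stderr = client.exec_command("python3 -")
            stdin.write(script)
            stdin.channel.shutdown_write()
            output = stdout.read().decode()
            if stdout.channel.recv_exit_status() != 0:
                raise RuntimeError("Remote callable failed: %s" % stderr.read().decode())
            return output
        finally:
            client.close()
As noted above, anything the callable imports (here just the standard library) must already be installed on the remote server.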

Python os how to accept verification for SSH key

So basically I have a function that launches a Glue development endpoint, and I want to programmatically launch Zeppelin, then use the IP from the endpoint to SSH into it so I can use it from my local browser. To do so, I run the following commands in the terminal:
cd ~/zepp081
bin/zeppelin-daemon.sh start
ssh -i pem_file_path -NTL 9007:169.254.76.1:9007 glue@ip_address
When I run the last command, I get the prompt:
The authenticity of host 'ip_address' can't be established.
ECDSA key fingerprint is X.
Are you sure you want to continue connecting (yes/no/[fingerprint])?
at which point I type yes and continue on with my life. I would like to automate this in python using the following command:
import os
ssh_string = "ssh -i pem_file_path -NTL 9007:169.254.76.1:9007 glue@ip_address"
os.system(f"cd ~/zepp081 && bin/zeppelin-daemon.sh start && {ssh_string}")
The first two commands in the os.system call run successfully, but the third fails saying Host key verification failed. How can I update the command to have it continue connecting?
ssh -o "StrictHostKeyChecking=no" .....
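Applied to the snippet above, that looks like this (a sketch; on OpenSSH 7.6+ you could use accept-new instead of no to still record new host keys):
import os

# Same command as above, with host key checking disabled so the prompt
# never appears (note this also skips the protection the prompt provides).
ssh_string = ("ssh -o StrictHostKeyChecking=no "
              "-i pem_file_path -NTL 9007:169.254.76.1:9007 glue@ip_address")
os.system(f"cd ~/zepp081 && bin/zeppelin-daemon.sh start && {ssh_string}")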

Run crontab job with P4

I have a shell script which in turn calls a python script. Before the python script runs, I set up environment variables in order to have the right P4 configuration.
shell: /home/ag/ump_prod/run.sh
python script: /home/ag/ump_prod/cron.py
Environment conf: /home/ag/ump_prod/env.conf
The python script executes P4 command-line commands via the subprocess module.
Here is the code for the shell script
#!/bin/sh
. /home/ag/ump_prod/env.conf
python /home/ag/ump_prod/cron.py
env.conf
export SHELL=/bin/bash
export USER=ag
export MAIL=/var/mail/ag
export HOME=/home/ag
export LOGNAME=ag
export P4CONFIG=/home/ag/ump_prod/.perforce
The perforce config /home/ag/ump_prod/.perforce :
P4CLIENT=ag_ump
P4EDITOR=/usr/bin/vim
P4PORT=rsh:ssh -2 -q -a -x -l p4server p4.****.com /bin/true
P4USER=ag
Running the shell script manually works without any issues.
However, when I run it via a cron job, it complains that it cannot connect to the server.
Error Message:
['TCP receive failed.\n', 'read: socket stdio: Connection reset by peer\n', 'Perforce client error:\n', '\tTCP receive failed.\n', '\tread: socket stdio: Connection reset by peer\n']
Please let me know where I could be going wrong in setting the environment variables for P4 configs. Thanks in advance!
I found the solution: I needed to start the ssh-agent for the cron job as well, just as one is started when you log in interactively. The cron job's script needs to include eval `ssh-agent` and load the key before running the P4 commands.
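Concretely, a sketch of the updated run.sh (the key path is an assumption):
#!/bin/sh
# Start an agent for this cron session and load the SSH key P4 needs,
# mirroring what happens automatically in an interactive login.
eval `ssh-agent -s`
ssh-add /home/ag/.ssh/id_rsa   # hypothetical key path
. /home/ag/ump_prod/env.conf
python /home/ag/ump_prod/cron.py
ssh-agent -k                   # clean up the agent when done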

Python ssh tunneling over multiple machines with agent

A little context is in order for this question: I am making an application that copies files/folders from one machine to another in python. The connection must be able to go through multiple machines. I quite literally have the machines connected in serial so I have to hop through them until I get to the correct one.
Currently, I am using python's subprocess module (Popen). As a very simplistic example I have
import subprocess
# need to set strict host checking to no since we connect to different
# machines over localhost
tunnel_string = "ssh -oStrictHostKeyChecking=no -L9999:127.0.0.1:9999 -ACt machine1 ssh -L9999:127.0.0.1:22 -ACt -N machineN"
proc = subprocess.Popen(tunnel_string.split())
# Do work, copy files etc. over ssh on localhost with port 9999
proc.terminate()
My question:
When doing it like this, I cannot seem to get agent forwarding to work, which is essential in something like this. Is there a way to do this?
I tried using the shell=True keyword in Popen like so
tunnel_string = "eval `ssh-agent` && ssh-add && ssh -oStrictHostKeyChecking=no -L9999:127.0.0.1:9999 -ACt machine1 ssh -L9999:127.0.0.1:22 -ACt -N machineN"
proc = subprocess.Popen(tunnel_string, shell=True)
# etc
The problem with this is that the names of the machines are given by user input, meaning a user could easily inject malicious shell code. A second problem is that a new ssh-agent process is left running every time I make a connection.
I have a nice function in my bashrc which identifies already-running ssh-agents, sets the appropriate environment variables, and adds my ssh key, but of course subprocess cannot reference functions defined in my bashrc. I tried setting executable="/bin/bash" together with shell=True in Popen, to no avail.
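For what it's worth, one way to avoid the shell while still chaining the hops is to keep the argument-list form and quote only the inner command; a sketch, assuming the machine names come straight from user input:
import shlex
import subprocess

def build_tunnel(first_hop, last_hop):
    # The inner ssh command is executed by the remote shell on first_hop,
    # so its arguments are quoted; the outer command uses no shell at all,
    # which removes the injection risk from the machine names.
    inner = ["ssh", "-L9999:127.0.0.1:22", "-ACt", "-N", last_hop]
    outer = ["ssh", "-oStrictHostKeyChecking=no",
             "-L9999:127.0.0.1:9999", "-ACt", first_hop,
             " ".join(shlex.quote(arg) for arg in inner)]
    return subprocess.Popen(outer)

proc = build_tunnel("machine1", "machineN")
# Do work, copy files etc. over ssh on localhost with port 9999
proc.terminate()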
You should give Fabric a try.
It provides a basic suite of operations for executing local or remote
shell commands (normally or via sudo) and uploading/downloading files,
as well as auxiliary functionality such as prompting the running user
for input, or aborting execution.
The program below will give you a test run.
First install Fabric with pip install fabric (this example uses Fabric 1.x; the fabric.api module was removed in Fabric 2), then save the code below in fabfile.py:
from fabric.api import *

env.hosts = ['server url/IP']  # change to your server
env.user = 'username'          # username for the server
env.password = 'password'      # password

def run_interactive():
    with settings(warn_only=True):
        cmd = 'clear'
        while cmd != 'stop fabric':
            run(cmd)
            cmd = raw_input('Command to run on server: ')
Change to the directory containing your fabfile and run fab run_interactive; each command you enter will then be run on the server.
I tested your first simplistic example and agent forwarding worked. The only thing I can see that might cause problems is that the environment variables SSH_AGENT_PID and SSH_AUTH_SOCK are not set correctly in the shell that you execute your script from. You might use ssh -v to get a better idea of where things are breaking down.
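A quick sanity check from inside the script itself (a sketch):
import os

# ssh -A relies on these pointing at a live agent in the current session.
print("SSH_AUTH_SOCK:", os.environ.get("SSH_AUTH_SOCK"))
print("SSH_AGENT_PID:", os.environ.get("SSH_AGENT_PID"))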
Try setting up a SSH config file: https://linuxize.com/post/using-the-ssh-config-file/
I frequently am required to tunnel through a bastion server and I use a configuration like so in my ~/.ssh/config file. Just change the host and user names. This also presumes that you have entries for these host names in your hosts (/etc/hosts) file.
Host my-bastion-server
    HostName my-bastion-server
    User user123
    AddKeysToAgent yes
    UseKeychain yes
    ForwardAgent yes

Host my-target-host
    HostName my-target-host
    User user123
    AddKeysToAgent yes
    UseKeychain yes
I then gain access with syntax like:
ssh my-bastion-server -At 'ssh my-target-host -At'
And I issue commands against my-target-host like:
ssh my-bastion-server -AT 'ssh my-target-host -AT "ls -la"'

Connecting to EC2 using keypair (.pem file) via Fabric

Does anyone have a Fabric recipe that shows how to connect to EC2 using a pem file?
I tried writing it in the manner of this question:
Python Fabric run command returns "binascii.Error: Incorrect padding"
But I'm faced with an encoding issue when I execute the run() function.
To use the pem file I generally add the pem to the ssh agent, then simply refer to the username and host:
ssh-add ~/.ssh/ec2key.pem
fab -H ubuntu@ec2-host deploy
or specify the env information (without the key) like the example you linked to:
env.user = 'ubuntu'
env.hosts = [
    'ec2-host'
]
and run as normal:
fab deploy
Without addressing your encoding issue, you might put your EC2 stuff into an ssh config file:
~/.ssh/config
or, if global:
/etc/ssh/ssh_config
There you can specify your host, ip address, user, identify file, etc., so it's a simple matter of:
ssh myhost
Example:
Host myhost
    User ubuntu
    HostName 174.129.254.215
    IdentityFile ~/.ssh/mykey.pem
For more details: man ssh_config
Another thing you can do is set the key_filename in the env variable: https://stackoverflow.com/a/5327496/1729558
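A sketch of that approach in Fabric 1.x style (the key path and task body are placeholders):
from fabric.api import env, run

env.user = 'ubuntu'
env.hosts = ['ec2-host']
env.key_filename = '~/.ssh/ec2key.pem'  # hypothetical path to your pem file

def deploy():
    # run() now authenticates with the pem file instead of a password
    run('uname -a')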
