Losing stdout data in Python

I'm trying to make a Python script which runs a bash script on a remote machine via SSH and then parses its output. The bash script writes a lot of data (about 5 megabytes of text / 50k lines) to stdout, and here is the problem: I get all the data in only ~10% of cases. In the other 90% of cases I get about 97% of what I expect, and it always looks like the output is trimmed at the end. This is what my script looks like:
import subprocess
import re
import sys
import paramiko

def run_ssh_command(ip, port, username, password, command):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(ip, port, username, password)
    stdin, stdout, stderr = ssh.exec_command(command)
    output = ''
    while not stdout.channel.exit_status_ready():
        solo_line = ''
        # Collect stdout data when available
        if stdout.channel.recv_ready():
            # Retrieve up to 2048 bytes
            solo_line = stdout.channel.recv(2048)
        output += solo_line
    ssh.close()
    return output

result = run_ssh_command(server_ip, server_port, login, password, 'cat /var/log/somefile')
print "result size: ", len(result)
I'm pretty sure the problem is some internal buffer overflowing, but which one, and how do I fix it?
Thank you very much for any tip!

When stdout.channel.exit_status_ready() starts returning True, there might still be a lot of data on the remote side, waiting to be sent. But you only receive one more chunk of 2048 bytes and quit.
Instead of checking the exit status, you could keep calling recv(2048) until it returns an empty string, which means that no more data is coming:
output = ''
next_chunk = True
while next_chunk:
    next_chunk = stdout.channel.recv(2048)
    output += next_chunk
But really you probably just want:
output = stdout.read()
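Putting it together, a minimal sketch of the read()-based approach (reusing the question's server_ip, server_port, login and password placeholders; not the original poster's code):

import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(server_ip, server_port, login, password)

stdin, stdout, stderr = ssh.exec_command('cat /var/log/somefile')
output = stdout.read()                            # blocks until the remote command closes its stdout
exit_status = stdout.channel.recv_exit_status()   # available once the output has been drained
ssh.close()

print "result size: ", len(output)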

May I suggest a less crude way to execute commands over SSH, via the Fabric library.
It may look like this (omitting ssh authentication details):
from fabric import Connection

with Connection('user@localhost') as con:
    res = con.run('~/test.sh', hide=True)
    lines = res.stdout.split('\n')
    print('{} lines read.'.format(len(lines)))
given the test script ~/test.sh
#!/bin/bash
for i in {1..1234}
do
    echo "Line $i"
done
all of the output is correctly consumed

Related

Python paramiko use exec command in remote server [duplicate]

I wrote this code in Paramiko:
from paramiko import SSHClient, AutoAddPolicy
import time

ssh = SSHClient()
ssh.set_missing_host_key_policy(AutoAddPolicy())
ssh.connect(hostname, username=user, password=passwd, timeout=3)
session = ssh.invoke_shell()
session.send("\n")
session.send("echo step 1\n")
time.sleep(1)
session.send("sleep 30\n")
time.sleep(1)
while not session.recv_ready():
    time.sleep(2)
output = session.recv(65535)
session.send("echo step 2\n")
time.sleep(1)
output += session.recv(65535)
I'm trying to execute several commands on my Linux server. The problem is that my Python code does not wait for a command to finish executing; for example, if I try to execute sleep 30, Python does not wait 30 seconds for the command to finish. How can I resolve this problem? I tried a while recv_ready() loop, but it still does not wait.
Use exec_command: http://docs.paramiko.org/en/1.16/api/channel.html
stdin, stdout, stderr = ssh.exec_command("my_long_command --arg 1 --arg 2")
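For example, a small sketch (assuming an already-connected SSHClient named ssh; not from the original answer): reading the output and then calling recv_exit_status() blocks until the remote command finishes, which gives you the waiting behaviour you are after.

stdin, stdout, stderr = ssh.exec_command("sleep 30 && echo done")
output = stdout.read()                            # blocks until the command finishes and closes stdout
exit_status = stdout.channel.recv_exit_status()   # available once the output has been drained
print(output)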
The following code works for me:
from paramiko import SSHClient, AutoAddPolicy
import time

ssh = SSHClient()
ssh.set_missing_host_key_policy(AutoAddPolicy())
ssh.connect('111.111.111.111', username='myname', key_filename='/path/to/my/id_rsa.pub', port=1123)

sleeptime = 0.001
outdata, errdata = '', ''
ssh_transp = ssh.get_transport()
chan = ssh_transp.open_session()
# chan.settimeout(3 * 60 * 60)
chan.setblocking(0)
chan.exec_command('ls -la')
while True:  # monitoring process
    # Reading from output streams
    while chan.recv_ready():
        outdata += chan.recv(1000)
    while chan.recv_stderr_ready():
        errdata += chan.recv_stderr(1000)
    if chan.exit_status_ready():  # If completed
        break
    time.sleep(sleeptime)
retcode = chan.recv_exit_status()
ssh_transp.close()
print(outdata)
print(errdata)
Please note that the history command cannot be executed over ssh as is.
See example here: https://superuser.com/questions/962001/incorrect-output-of-history-command-of-ssh-how-to-read-the-timestamp-info-corre
In case you do not need to read stdout and stderr separately, you can use much more straightforward code:
stdin, stdout, stderr = ssh_client.exec_command(command)
stdout.channel.set_combine_stderr(True)
output = stdout.readlines()
The readlines call reads until the command finishes and returns the complete output.
In case you need the output separately, do not be tempted to remove the set_combine_stderr and call readlines on stdout and stderr separately. That might deadlock. See Paramiko ssh die/hang with big output
For a correct code that reads the outputs separately, see Run multiple commands in different SSH servers in parallel using Python Paramiko.
Obligatory warning: Do not use AutoAddPolicy – you lose protection against MITM attacks by doing so. For a correct solution, see Paramiko "Unknown Server".


Capturing standard out from a Paramiko command

I have a wrapper around Paramiko's SSHClient.exec_command(). I'd like to capture standard out. Here's a shortened version of my function:
def __execute(self, args, sudo=False, capture_stdout=True, plumb_stderr=True,
              ignore_returncode=False):
    argstr = ' '.join(pipes.quote(arg) for arg in args)
    channel = ssh.get_transport().open_session()
    channel.exec_command(argstr)
    channel.shutdown_write()

    # Handle stdout and stderr until the command terminates
    captured = []
    def do_capture():
        while channel.recv_ready():
            o = channel.recv(1024)
            if capture_stdout:
                captured.append(o)
            else:
                sys.stdout.write(o)
                sys.stdout.flush()
        while plumb_stderr and channel.recv_stderr_ready():
            sys.stderr.write(channel.recv_stderr(1024))
            sys.stderr.flush()

    while not channel.exit_status_ready():
        do_capture()

    # We get data after the exit status is available, why?
    for i in xrange(100):
        do_capture()

    rc = channel.recv_exit_status()
    if not ignore_returncode and rc != 0:
        raise Exception('Got return code %d executing %s' % (rc, args))

    if capture_stdout:
        return ''.join(captured)

paramiko.SSHClient.execute = __execute
In do_capture(), whenever channel.recv_ready() tells me that I can receive data from the command's stdout, I call channel.recv(1024) and append the data to my buffer. I stop when the command's exit status is available.
However, it seems like more stdout data comes at some point after the exit status.
# We get data after the exit status is available, why?
for i in xrange(100):
    do_capture()
I can't just call do_capture() once, as it seems like channel.recv_ready() will return False for a few milliseconds, and then True, and more data is received, and then False again.
I'm using Python 2.7.6 with Paramiko 1.15.2.
I encountered the same problem. The problem is that after the command has exited there may still be data in the stdout or stderr buffers, still on its way over the network, or whatever else. I read through Paramiko's source code, and apparently all data has been read once chan.recv() returns an empty string.
So this is my attempt to solve it; so far it has been working.
import socket
from contextlib import closing

def run_cmd(ssh, cmd, stdin=None, timeout=-1, recv_win_size=1024):
    '''
    Run command on server, optionally sending data to its stdin

    Arguments:
    ssh           -- An instance of paramiko.SSHClient connected
                     to the server the commands are to be executed on
    cmd           -- The command to run on the remote server
    stdin         -- String to write to command's standard input
    timeout       -- Timeout for command completion in seconds.
                     Set to None to make the execution blocking.
    recv_win_size -- Size of chunks the output is read in

    Returns:
    A tuple containing (exit_status, stdout, stderr)
    '''
    with closing(ssh.get_transport().open_session()) as chan:
        chan.settimeout(timeout)
        chan.exec_command(cmd)

        if stdin:
            chan.sendall(stdin)
            chan.shutdown_write()

        stdout, stderr = [], []

        # Until the command exits, read from its stdout and stderr
        while not chan.exit_status_ready():
            if chan.recv_ready():
                stdout.append(chan.recv(recv_win_size))
            if chan.recv_stderr_ready():
                stderr.append(chan.recv_stderr(recv_win_size))

        # Command has finished, read exit status
        exit_status = chan.recv_exit_status()

        # Ensure we gobble up all remaining data
        while True:
            try:
                sout_recvd = chan.recv(recv_win_size)
                if not sout_recvd and not chan.recv_ready():
                    break
                else:
                    stdout.append(sout_recvd)
            except socket.timeout:
                continue

        while True:
            try:
                serr_recvd = chan.recv_stderr(recv_win_size)
                if not serr_recvd and not chan.recv_stderr_ready():
                    break
                else:
                    stderr.append(serr_recvd)
            except socket.timeout:
                continue

    stdout = ''.join(stdout)
    stderr = ''.join(stderr)
    return (exit_status, stdout, stderr)
I encountered the same issue.
This link (Paramiko: how to ensure data is received between commands) gave me some help, in explaining that after exit_status_ready() returns True you still have to receive possible additional data. In my tests (with a couple of screens of output), in every single run there was additional data to read after exit_status_ready() returned True.
But the way it reads the remaining data is not correct: it uses recv_ready() to check whether there is something to read, and once recv_ready() returns False it exits. That will work most of the time, but the following situation can happen: recv_ready() can return False to indicate that at that moment there is nothing to receive, yet that doesn't mean it is the end of all the data. In my tests I would leave the test running, and sometimes it would take half an hour for the issue to appear.
I found the solution by reading the following sentence in the Channel.recv() documentation: "If a string of length zero is returned, the channel stream has closed."
So we can just have a single loop and read all the data until recv() returns a zero-length result. At that point the channel stream is closed, but just to make sure the exit status is ready we can add an additional loop that sleeps until channel.exit_status_ready() returns True.
Note that this will only work with a channel that does not have a pty enabled (which is the default).
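A sketch of that approach (my own illustration, not code from the original answer), assuming an already-connected SSHClient named ssh and a command string named command:

import time

chan = ssh.get_transport().open_session()
chan.exec_command(command)

chunks = []
while True:
    data = chan.recv(1024)
    if not data:                 # zero-length read: the channel stream has closed
        break
    chunks.append(data)

while not chan.exit_status_ready():  # normally ready right after the stream closes
    time.sleep(0.1)
exit_status = chan.recv_exit_status()
output = ''.join(chunks)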

Reading from pipe in Python is impossible

Hello, I have the following code in Python 2.6:
command = "tcpflow -c -i any port 5559"
port_sniffer = subprocess.Popen(command, stdout=subprocess.PIPE, bufsize=1, shell=True)
while True:
line = port_sniffer.stdout.readline()
#do some stuff with line
The purpose of this code is to sniff the traffic between two processes (A and B) that communicate on port 5559.
Now let me describe the different scenarios I am having:
1) Code above is not running:
A and B are communicating and I can see it clearly in the logs, and the Linux command netstat -napl | grep 5559 shows that the processes are communicating on the desired port.
2) Code above is not running and I am sniffing by running tcpflow -c -i any port 5559 directly from the shell:
I can see the communication on the console clearly :-).
3) Code above is running: Processes can't communicate. netstat -napl | grep 5559 prints nothing and the logs give out errors!!!
4) Code above is running in debug mode: I can't seem to step past the line line = port_sniffer.stdout.readline()
I tried using an iterator instead of a while loop (not that it should matter, but I am pointing it out anyway). I also tried different values for bufsize (None, 1, and 8).
Please help!!
So after a quick read through the docs I found these two sentences:
On Unix, if args is a string, the string is interpreted as the name or
path of the program to execute
and
The shell argument (which defaults to False) specifies whether to use
the shell as the program to execute. If shell is True, it is
recommended to pass args as a string rather than as a sequence.
Based on this, I would recommend recreating your command as a list:
command = ["tcpflow -c", "-i any port 5559"] #I don't know linux, so double check this line!!
The general idea is this (also from the docs):
If args is a sequence, the first item specifies the command string,
and any additional items will be treated as additional arguments to
the shell itself. That is to say, Popen does the equivalent of:
Popen(['/bin/sh', '-c', args[0], args[1], ...])
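In other words, with shell=True only the first list item is the command line; extra items become arguments to /bin/sh itself. A sketch of the two usual ways to invoke tcpflow (my own illustration, so double-check it for your setup):

import subprocess

# 1) a single string with shell=True (what the question already does)
p1 = subprocess.Popen("tcpflow -c -i any port 5559", stdout=subprocess.PIPE, shell=True)

# 2) an argument list without a shell
p2 = subprocess.Popen(["tcpflow", "-c", "-i", "any", "port", "5559"], stdout=subprocess.PIPE)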
Additionally, it seems that to read from your process, you should use communicate(). So
while True:
    line = port_sniffer.stdout.readline()
would become
while True:
    line = port_sniffer.communicate()[0]
But keep in mind this note from the docs:
Note The data read is buffered in memory, so do not use this method if the data size is large or unlimited.
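Given that note, if the sniffer produces a lot of output you may prefer to keep reading line by line rather than buffering everything; a rough sketch (my own, not from the docs):

import subprocess

port_sniffer = subprocess.Popen("tcpflow -c -i any port 5559",
                                stdout=subprocess.PIPE, bufsize=1, shell=True)
# readline() returns '' only when the process closes its stdout, so this
# loop keeps printing lines for as long as the sniffer keeps running
for line in iter(port_sniffer.stdout.readline, ''):
    print line.rstrip()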
If I had to guess, I think the problem you're having is that you aren't running your program as root. tcpflow needs to be run as a privileged user if you want to be able to sniff other people's traffic (otherwise that'd be a serious security vulnerability). I wrote the following programs and they worked just fine for your scenario.
server.py
#!/usr/bin/python
import socket

s = socket.socket()
host = socket.gethostname()
port = 12345
s.bind((host, port))
s.listen(5)

while True:
    c, addr = s.accept()
    print 'Connection from', addr
    c.send('Test string 1234')
    x = c.recv(1024)
    while x != 'q':
        print "Received " + x
        c.send('Blah')
        x = c.recv(1024)
    print "Closing connection"
    c.close()
client.py
#!/usr/bin/python
import socket, sys
from time import sleep
from datetime import datetime

s = socket.socket()
host = socket.gethostname()
port = 12345
s.connect((host, port))

c = sys.stdin.read(1)  # Type a char to send to initiate the sending loop
while True:
    s.send(str(datetime.now()))
    sleep(3)
    msg = s.recv(1024)
flow.py
#!/usr/bin/python
import subprocess

command = 'tcpflow -c -i any port 12345'
sniffer = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
while True:
    print sniffer.stdout.readline()

paramiko combine stdout and stderr

I am trying to combine the output of stdout and stderr. My belief is that this can be done with the set_combine_stderr() method of a Channel object.
This is what I am doing:
ssh = paramiko.SSHClient()
# I connect and everything is OK, then:
chan = ssh.invoke_shell()
chan.set_combine_stderr(True)
chan.exec_command('python2.6 subir.py')
resultado = chan.makefile('rb', -1)
However, I get the following error when I try to store the result (last line above, chan.makefile() ):
Error: Channel closed.
Any help would be greatly appreciated
While it is true that set_combine_stderr diverts stderr to the stdout stream, it does so in chaotic order, so you do not get the result you probably want, namely, the lines combined in the order written, as if you were running the command in a local terminal window. Instead, use get_pty. That will cause the server to run the lines through a pseudo-terminal, keeping them in chronological sequence.
Here's a test program, outerr.py, that writes alternating lines on stdout and stderr. Assume it's sitting in the home directory of llmps@meerkat2.
#!/usr/bin/env python
import sys

for x in xrange(1, 101):
    (sys.stdout, sys.stderr)[x%2].write('This is line #%s, on std%s.\n' %
                                        (x, ('out', 'err')[x%2]))
Now try the following code to run it remotely:
#!/usr/bin/env python
import paramiko

def connect():
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect('meerkat2', username='llmps', password='..')
    return ssh

def runTest(ssh):
    tran = ssh.get_transport()
    chan = tran.open_session()
    # chan.set_combine_stderr(True)
    chan.get_pty()
    f = chan.makefile()
    chan.exec_command('./outerr.py')
    print f.read(),

if __name__ == '__main__':
    ssh = connect()
    runTest(ssh)
    ssh.close()
If you run the above, you should see 100 lines in order as written. If, instead, you comment out the chan.get_pty() call and uncomment the chan.set_combine_stderr(True) call, you will get clumps of stdout and stderr lines interspersed randomly from run to run.
OK, I know this is quite an old topic, but I ran into the same problem and found a (maybe not-so-)pretty solution: just run the command on the remote server redirecting stderr to stdout, and then always read from stdout. For example:
client = paramiko.SSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect('hostname', username='user', password='pass')
stdin, stdout, stderr = client.exec_command('python your_script.py 2>&1')
print stdout.read()
@AaronMcSmooth: I am referring to the stdout and stderr of the computer I am connecting to (via SSH).
I ended up doing this:
stdin, stdout, stderr = ssh.exec_command(...)
output = stdout.read().strip() + stderr.read().strip()
For the purpose of my application, it doesn't matter to distinguish between stdout and stderr, but I don't think that's the best way to combine the two.
The code of SSHClient.exec_command() is (looking at paramiko's source code):
def exec_command(self, command, bufsize=-1):
    chan = self._transport.open_session()
    chan.exec_command(command)
    stdin = chan.makefile('wb', bufsize)
    stdout = chan.makefile('rb', bufsize)
    stderr = chan.makefile_stderr('rb', bufsize)
    return stdin, stdout, stderr
I am performing the same actions on the channel but receive the Channel is closed error.
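For reference, a sketch of combining the two streams with exec_command instead of invoke_shell (the same pattern as in the exec_command source quoted above; assumes an already-connected SSHClient named ssh):

stdin, stdout, stderr = ssh.exec_command('python2.6 subir.py')
stdout.channel.set_combine_stderr(True)   # merge stderr into the stdout stream
resultado = stdout.read()                 # both streams, though not necessarily in written order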
