Opening a remote file using Paramiko in Python is slow [duplicate]

This question already has an answer here:
Reading file opened with Python Paramiko SFTPClient.open method is slow
(1 answer)
Closed 7 months ago.
I am using Paramiko to open a remote SFTP file in Python. With the file object returned by Paramiko, I am reading the file line by line and processing the information. This seems really slow compared to using Python's built-in open on a local file. Following is the code I am using to get the file object.
Using paramiko (slower by 2 times) -
client = paramiko.SSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(myHost,myPort,myUser,myPassword)
sftp = client.open_sftp()
fileObject = sftp.file(fullFilePath,'rb')
Using os -
import os
fileObject = open(fullFilePath,'rb')
Am I missing anything? Is there a way to make reads from the Paramiko file object as fast as reads from the local file object?
Thanks!!

Your problem is likely to be caused by the file being a remote object. You've opened it on the server and are requesting one line at a time - because it's not local, each request takes much longer than if the file was sitting on your hard drive. The best alternative is probably to copy the file down to a local location first, using Paramiko's SFTP get.
Once you've done that, you can open the local copy with Python's built-in open.
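For example, a minimal sketch of that approach, reusing the client and sftp setup from the question (the local path here is an assumption):
localFilePath = '/tmp/local_copy.dat'            # assumed local destination
sftp.get(fullFilePath, localFilePath)            # download the whole file in one go
with open(localFilePath, 'rb') as fileObject:    # built-in open, local-disk speed
    for line in fileObject:
        pass  # process the line as before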

I was having the same issue and, for security reasons, I could not copy the file locally. I solved it by using a combination of prefetching and BytesIO:
import io

def fetch_file_as_bytesIO(sftp, path):
    """
    Using the SFTP client, retrieve the file at the given path, using prefetching.
    :param sftp: the SFTP client
    :param path: path of the file to retrieve
    :return: BytesIO with the file content
    """
    with sftp.file(path, mode='rb') as file:
        file_size = file.stat().st_size
        file.prefetch(file_size)
        file.set_pipelined()
        return io.BytesIO(file.read(file_size))
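For instance, a possible usage of that helper, assuming sftp is set up as in the question (process is a placeholder for your own handling):
content = fetch_file_as_bytesIO(sftp, fullFilePath)
for line in content:   # a BytesIO can be iterated line by line like a regular binary file
    process(line)      # 'process' is hypothetical; replace with your own logic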

Here is a way that works by running the command line (cat) over SSH with Paramiko and reading all lines at once. It works well for me:
import paramiko
client = paramiko.SSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.WarningPolicy())
client.connect(hostname=host, port=port, username=user, key_filename=ssh_file)
stdin, stdout, stderr = client.exec_command('cat /proc/net/dev')
net_dump = stdout.readlines()
#your entire file is now in net_dump .. do as you wish with it below ...
client.close()
The files I open are quite small so it all depends on your file size. Worth a try :)

Related

Is it possible to transfer files from a directory using SCP in Python but ignore hidden files or sym links?

I'm currently utilising Paramiko and SCPClient in Python to transfer a directory from one server to another as a means of backup. This works well; however, I do not want it to copy hidden files (.file_name) or symbolic links. Is this possible?
Unfortunately rsync isn't an option for me as it's not available on either of the remote servers I connect to. My script is below (sensitive info replaced with dummy data). Note I need to connect to a jump host before being able to connect to target_1 or target_2.
import os
import shutil
import time
import paramiko
from scp import SCPClient
#set up ssh variables
j_host = '00.00.00.00'
target_host_1 = '00.00.00.001'
target_host_2 = '00.00.00.002'
port_no = 22
username = ''
passw = ''
#set up temporary folder on local machine to store files
path = "/local_path/"
os.mkdir(path)
#create SSH Client for jump server
jump_host=paramiko.SSHClient()
jump_host.set_missing_host_key_policy(paramiko.AutoAddPolicy())
jump_host.connect(j_host, username=username, password=passw)
#set up channel to connect to 1 via jump server
jump_host_transport_1 = jump_host.get_transport()
src_addr = (j_host, port_no)
dest_addr_1 = (target_host_1, port_no)
jump_host_channel_1 = jump_host_transport_1.open_channel("direct-tcpip", dest_addr_1, src_addr)
#set up channel to connect to 2 via jump server
jump_host_transport_2 = jump_host.get_transport()
dest_addr_2 = (target_host_2, port_no)
jump_host_channel_2 = jump_host_transport_2.open_channel("direct-tcpip", dest_addr_2, src_addr)
#function which sets up target server, either 1 or 2
def create_SSHClient(server, port, user, password, sock):
    target = paramiko.SSHClient()
    target.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    target.connect(server, port, user, password, sock=sock)
    return target
#invoke above function to set up connections for 1 & 2
ssh_1 = create_SSHClient(target_host_1, port_no, username, passw, jump_host_channel_1)
ssh_2 = create_SSHClient(target_host_2, port_no, username, passw, jump_host_channel_2)
#delete old files in backup folder
command = "rm -rf /filepath/{*,.*}"
stdin, stdout, stderr = ssh_2.exec_command(command)
lines = stdout.readlines()
#print(lines)
#pause to ensure old directory is cleared
time.sleep(5)
#SCPCLient takes a paramiko transport as an argument, sets up file transfer connection
scp_1 = SCPClient(ssh_1.get_transport())
scp_2 = SCPClient(ssh_2.get_transport())
#get files from 1, store on local machine, put on 2
scp_1.get('/filepath/.', '/target_folder_local/', recursive=True)
scp_2.put('/target_folder_local/.', '/filepath/', recursive=True)
#remove temporary folder
shutil.rmtree(path)
#close connections
ssh_1.close()
ssh_2.close()
jump_host.close()
There's no API in SCPClient to skip hidden files or symbolic links.
For upload, it's easy, if you copy the SCPClient's code and modify it as you need. See the os.walk loop in _send_recursive function.
If you do not want to modify the SCPClient's code, you will have to iterate the files on your own, calling SCPClient.put for each, as in the sketch below. It will be somewhat less efficient, as it will start a new SCP server for each file.
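A rough sketch of that per-file upload, assuming the local tree from the script above is being pushed to target 2 via scp_2 and that the remote subdirectories already exist (paths reuse the script's placeholders):
import os

for root, dirs, files in os.walk('/target_folder_local/'):
    # prune hidden directories so os.walk does not descend into them
    dirs[:] = [d for d in dirs if not d.startswith('.')]
    for name in files:
        local_path = os.path.join(root, name)
        # skip hidden files and symbolic links
        if name.startswith('.') or os.path.islink(local_path):
            continue
        relative = os.path.relpath(local_path, '/target_folder_local/')
        remote_path = '/filepath/' + relative.replace(os.sep, '/')
        scp_2.put(local_path, remote_path)  # one SCP session per file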
For download, you might be able to modify the SCPClient code to respond with non-zero code to C commands fed by the server for the files you do not want to download.
Check the _recv_file function. There where name is resolved, check for names or attributes of files you are not interested in downloading and do chan.send('\x01') and exit the function.
Though why do you want to use SCP? Use SFTP. It is much better suited for custom rules you need.
Paramiko does not have recursive SFTP transfer functionality (but pysftp does, see pysftp vs. Paramiko). You won't be able to use it as-is anyway, for your specific needs, for the same reason you cannot use it with SCP.
But check my answer to Python pysftp get_r from Linux works fine on Linux but not on Windows. It shows a simple recursive SFTP download code. Just modify it slightly to skip the files you do not want to download.
Something like
if (not S_ISLNK(mode)) and (not entry.filename.startswith(".")):
(see Checking if a file on SFTP server is a symbolic link, and deleting the symbolic link, using Python Paramiko/pysftp)
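Putting those pieces together, a minimal recursive SFTP download sketch with plain Paramiko that skips hidden entries and symbolic links (the function name and paths are placeholders, and sftp is a Paramiko SFTPClient):
import os
from stat import S_ISDIR, S_ISLNK

def get_r_filtered(sftp, remote_dir, local_dir):
    os.makedirs(local_dir, exist_ok=True)
    for entry in sftp.listdir_attr(remote_dir):
        # skip symbolic links and hidden entries
        if S_ISLNK(entry.st_mode) or entry.filename.startswith('.'):
            continue
        remote_path = remote_dir + '/' + entry.filename
        local_path = os.path.join(local_dir, entry.filename)
        if S_ISDIR(entry.st_mode):
            get_r_filtered(sftp, remote_path, local_path)  # recurse into subdirectory
        else:
            sftp.get(remote_path, local_path)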

Upload new file to SFTP server using Paramiko without having to overwrite an existing file

I am trying to upload a file via SFTP to my server. But instead of just uploading it, I have to explicitly tell my script which file to overwrite on the server. I don't know how to change that.
#!/usr/bin/python3
import paramiko
k = paramiko.RSAKey.from_private_key_file("/home/abdulkarim/.ssh/id_rsa")
c = paramiko.SSHClient()
c.set_missing_host_key_policy(paramiko.AutoAddPolicy())
print("connecting")
c.connect( hostname = "do-test", username = "abdulkarim", pkey = k )
print("connected")
sftp = c.open_sftp()
sftp.put('/home/abdulkarim/Skripte/data/test.txt', '/home/abdulkarim/test/test1.txt')
c.close()
In the below call, the second (remotepath) parameter refers to the path where the file will be stored on the server. There is no requirement for the remote file to actually exist; it will be created.
sftp.put('/home/abdulkarim/Skripte/data/test.txt', '/home/abdulkarim/test/test1.txt')
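If you do not want to hard-code the remote file name at all, one option (just a sketch, reusing the question's paths) is to derive it from the local name, so each upload keeps its own name instead of replacing a fixed target:
import os

local_path = '/home/abdulkarim/Skripte/data/test.txt'
remote_path = '/home/abdulkarim/test/' + os.path.basename(local_path)  # keep the local file name
sftp.put(local_path, remote_path)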
Obligatory warning: Do not use AutoAddPolicy – You are losing a protection against MITM attacks by doing so. For a correct solution, see Paramiko "Unknown Server".

How to update a file in server using SFTP in Paramiko

I want to go to a path on a remote SFTP server and verify if the file is present. If the file is present, then I want to open the file and update its contents.
Is it possible with SFTP in Paramiko?
The Paramiko SFTP client has the SFTPClient.open method, which is an equivalent of the regular Python open function. It returns a file-like object, which you can then use as if you were editing a local file:
ssh = paramiko.SSHClient()
# ...
ssh.connect(...)
sftp = ssh.open_sftp()
with sftp.open("/remote/path/file.txt", "r+") as f:
    f.seek(10)
    f.write(b'foo')
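To first verify that the file is present, as the question asks, one possible check is SFTPClient.stat, which raises an error for a missing path (catching FileNotFoundError works on Python 3 with recent Paramiko; older setups may need IOError):
path = "/remote/path/file.txt"
try:
    sftp.stat(path)            # raises if the file does not exist
except FileNotFoundError:
    print("file is not present")
else:
    with sftp.open(path, "r+") as f:
        f.seek(10)
        f.write(b'foo')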

pysftp putfo creates an empty file on SFTP server but not streaming the content from StringIO

My code first writes lines to a CSV in io.StringIO():
import csv
import io

fileBuffer = io.StringIO()
# write header
header_writer = csv.DictWriter(fileBuffer, fieldnames=columnNames)
header_writer.writeheader()
# write lines
writer = csv.writer(fileBuffer, delimiter=',')
for line in data:
    line_dec = line.decode('ISO-8859-1')
    # print([line_dec])
    writer.writerow([line_dec])
The following code also prints all expected rows:
print(fileBuffer.getvalue())  # -> prints all expected rows
I can also successfully connect to the SFTP server using pysftp, and even inside the with pysftp.Connection block the code still returns all expected rows:
with pysftp.Connection(host, username=user, password=pw, cnopts=cnopts) as sftp:
    print('successfully connected to {} via Port 22'.format(host))
    print(fileBuffer.getvalue())              # -> prints all expected rows
    sftp.putfo(fileBuffer, file2BeSavedAs)    # -> no rows put on FTP Server
Here comes the actual problem:
Unfortunately, the code only creates the file without writing the data (the body) into it. On the other hand, my code does not return any error message.
How can I put a CSV from StringIO to an SFTP server?
You have to seek the read pointer of the buffer back to the beginning before you try to upload the buffer:
fileBuffer.seek(0)
sftp.putfo(fileBuffer, file2BeSavedAs)
Though a better approach is to write the CSV directly to the server, without an intermediate buffer. Use Connection.open to obtain a file-like object representing a file on the SFTP server:
with sftp.open(file2BeSavedAs, mode='w', bufsize=32768) as f:
    writer = csv.writer(f, delimiter=',')
    # ...
For the purpose of the bufsize argument, see:
Writing to a file on SFTP server opened using Paramiko/pysftp "open" method is slow
For a similar question, with progress display, see:
How to use Paramiko getfo to download file from SFTP server to memory to process it
Though pysftp is dead; you had better use Paramiko. See pysftp vs. Paramiko. With Paramiko, the code would be pretty much the same.
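For illustration, a rough Paramiko equivalent of the direct-write approach above (the connection details, columnNames, data and file2BeSavedAs reuse the names from the question):
import csv
import paramiko

ssh = paramiko.SSHClient()
ssh.load_system_host_keys()
ssh.connect(host, username=user, password=pw)
sftp = ssh.open_sftp()

with sftp.open(file2BeSavedAs, mode='w', bufsize=32768) as f:
    writer = csv.writer(f, delimiter=',')
    writer.writerow(columnNames)                      # header row
    for line in data:
        writer.writerow([line.decode('ISO-8859-1')])  # same decoding as in the question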

How can I download file from SFTP using pysftp?

I have a SFTP server that contains a set of folders. I would like to download a file from one of these folders.
Can someone help me with the python code?
Where should I start?
Have you tried the example in pysftp's documentation?
import pysftp
with pysftp.Connection('hostname', username='me', password='secret') as sftp:
    with sftp.cd('public'):             # temporarily chdir to public
        sftp.put('/my/local/filename')  # upload file to public/ on remote
        sftp.get('remote_file')         # get a remote file
An addition to Michael Powers' answer: if the file belongs to one server account and your personal account is another one, and your server account has root permission, then follow Michael's method. If not, you may use an io buffer to transfer the file:
import io
import pysftp

server_sftp = pysftp.Connection(host=test_host, username=server_account,
                                password=server_pwd, cnopts=cn_opts)
your_user_sftp = pysftp.Connection(host=host_name, username=user_name,
                                   password=user_pwd, cnopts=cn_opts)
try:
    file_like = io.BytesIO()
    # download into the in-memory buffer, rewind it, then re-upload
    server_sftp.getfo(server_file_location, file_like)
    file_like.seek(0)
    your_user_sftp.putfo(file_like, target_file_location)
finally:
    file_like.close()
    server_sftp.close()
    your_user_sftp.close()
