Python script times out while uploading to an FTP server

I'm creating a CSV file and uploading it to an FTP server.
The upload goes to paths for multiple companies on that FTP server, like:
COMPANY1/FOO/BAR
But for some companies, at the end, I get this traceback:
Traceback (most recent call last):
  File "exporter.py", line 117, in <module>
    Exporter().run()
  File "exporter.py", line 115, in run
    self.upload(file, vendor['ftp_path'], filename)
  File "exporter.py", line 78, in upload
    sftp.chdir(dir)
  File "/home/johndoe/exports/daily/venv/local/lib/python2.7/site-packages/paramiko/sftp_client.py", line 580, in chdir
    if not stat.S_ISDIR(self.stat(path).st_mode):
  File "/home/johndoe/exports/daily/venv/local/lib/python2.7/site-packages/paramiko/sftp_client.py", line 413, in stat
    t, msg = self._request(CMD_STAT, path)
  File "/home/johndoe/exports/daily/venv/local/lib/python2.7/site-packages/paramiko/sftp_client.py", line 730, in _request
    return self._read_response(num)
  File "/home/johndoe/exports/daily/venv/local/lib/python2.7/site-packages/paramiko/sftp_client.py", line 781, in _read_response
    self._convert_status(msg)
  File "/home/johndoe/exports/daily/venv/local/lib/python2.7/site-packages/paramiko/sftp_client.py", line 807, in _convert_status
    raise IOError(errno.ENOENT, text)
IOError: [Errno 2] The requested file does not exist
I think the traceback is saying that the FTP path doesn't exist (but it does exist).
Yet when I run the script on exactly those problematic paths alone, it passes.
So it's not a logic error, and those FTP paths do exist - but can it be due to some timeout occurring?
Thanks,
Tom
Update:
I call the method like this:
self.upload(file, vendor['ftp_path'], filename)
And the actual method that does the uploading:
def upload(self, buffer, ftp_dir, filename):
    for dir in ftp_dir.split('/'):
        if dir == '':
            continue
        sftp.chdir(dir)
    with sftp.open(filename, 'w') as f:
        f.write(buffer)
        f.close()
    sftp.close()
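Note: one pattern that can produce exactly this "file does not exist" error is that SFTPClient.chdir is stateful. If the same sftp session is reused for several vendors, the directory changes from a previous vendor accumulate, and the next vendor's relative path no longer resolves from wherever the session now sits. A minimal defensive sketch, assuming sftp is a shared paramiko SFTPClient and the vendor paths are relative to the server root:
def upload(self, buffer, ftp_dir, filename):
    # reset to the server root on every call, so an earlier
    # vendor's chdir cannot break this vendor's relative path
    sftp.chdir('/')
    for dir in ftp_dir.split('/'):
        if dir == '':
            continue
        sftp.chdir(dir)
    with sftp.open(filename, 'w') as f:
        f.write(buffer)
This is only a sketch, not a confirmed diagnosis; it also leaves closing the connection to the caller, since sftp.close() after the first upload would break any later ones.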

Related

Using multiprocessing with boto3 S3 upload_fileobj causes SSLError

In AWS Lambda with the python3.9 runtime and boto3 1.20.32, I run the following code:
import boto3
from multiprocessing import Process

s3_client = boto3.client(service_name="s3")
s3_bucket = "bucket"
s3_other_bucket = "other_bucket"

def multiprocess_s3upload(tar_index: dict):
    def _upload(filename, bytes_range):
        src_key = ...
        # get a single raw file in the tar with a bytes range
        s3_obj = s3_client.get_object(
            Bucket=s3_bucket,
            Key=src_key,
            Range=f"bytes={bytes_range}"
        )
        # upload the raw file
        # error occurs here !!!!!
        s3_client.upload_fileobj(
            s3_obj["Body"],
            s3_other_bucket,
            filename
        )

    def _wait(procs):
        for p in procs:
            p.join()

    processes = []
    proc_limit = 256  # limit concurrent processes to avoid "too many open files" errors
    for filename, bytes_range in tar_index.items():
        # filename = "hello.txt"
        # bytes_range = "1024-2048"
        proc = Process(
            target=_upload,
            args=(filename, bytes_range)
        )
        proc.start()
        processes.append(proc)
        if len(processes) == proc_limit:
            _wait(processes)
            processes = []
    _wait(processes)
This program extracts partial raw files from a tar file in an S3 bucket, then uploads each raw file to another S3 bucket. There may be thousands of raw files in one tar file, so I use multiprocessing to speed up the S3 upload.
I get an SSLError in a subprocess, raised at random while processing the same tar file. I tried a different tar file and got the same result. Only the last subprocess threw the exception; the rest worked fine.
Process Process-2:
Traceback (most recent call last):
  File "/var/runtime/urllib3/response.py", line 441, in _error_catcher
    yield
  File "/var/runtime/urllib3/response.py", line 522, in read
    data = self._fp.read(amt) if not fp_closed else b""
  File "/var/lang/lib/python3.9/http/client.py", line 463, in read
    n = self.readinto(b)
  File "/var/lang/lib/python3.9/http/client.py", line 507, in readinto
    n = self.fp.readinto(b)
  File "/var/lang/lib/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "/var/lang/lib/python3.9/ssl.py", line 1242, in recv_into
    return self.read(nbytes, buffer)
  File "/var/lang/lib/python3.9/ssl.py", line 1100, in read
    return self._sslobj.read(len, buffer)
ssl.SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2633)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lang/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self._target(*self._args, **self._kwargs)
  File "/var/task/main.py", line 144, in _upload
    s3_client.upload_fileobj(
  File "/var/runtime/boto3/s3/inject.py", line 540, in upload_fileobj
    return future.result()
  File "/var/runtime/s3transfer/futures.py", line 103, in result
    return self._coordinator.result()
  File "/var/runtime/s3transfer/futures.py", line 266, in result
    raise self._exception
  File "/var/runtime/s3transfer/tasks.py", line 269, in _main
    self._submit(transfer_future=transfer_future, **kwargs)
  File "/var/runtime/s3transfer/upload.py", line 588, in _submit
    if not upload_input_manager.requires_multipart_upload(
  File "/var/runtime/s3transfer/upload.py", line 404, in requires_multipart_upload
    self._initial_data = self._read(fileobj, threshold, False)
  File "/var/runtime/s3transfer/upload.py", line 463, in _read
    return fileobj.read(amount)
  File "/var/runtime/botocore/response.py", line 82, in read
    chunk = self._raw_stream.read(amt)
  File "/var/runtime/urllib3/response.py", line 544, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/var/lang/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/var/runtime/urllib3/response.py", line 452, in _error_catcher
    raise SSLError(e)
urllib3.exceptions.SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2633)
According to this similar question from 10 years ago, Multi-threaded S3 download doesn't terminate, the root cause might be that boto3's S3 upload uses a non-thread-safe library for sending HTTP requests. However, that solution doesn't work for me.
I also found a boto3 issue about my question, where the problem disappeared without any change on the author's part:
Actually, the problem has recently disappeared on its own, without any (!) change on my part. As I thought, the problem was created and fixed by Amazon. I'm only afraid what if it will be a thing again...
Does anyone know how to fix this?
According to the boto3 documentation on multiprocessing (doc):
Resource instances are not thread safe and should not be shared across threads or processes. These special classes contain additional meta data that cannot be shared. It's recommended to create a new Resource for each thread or process:
My modified code:
def multiprocess_s3upload(tar_index: dict):
    def _upload(filename, bytes_range):
        src_key = ...
        # get a single raw file in the tar with a bytes range
        s3_client = boto3.client(service_name="s3")  # <<<< one client per process
        s3_obj = s3_client.get_object(
            Bucket=s3_bucket,
            Key=src_key,
            Range=f"bytes={bytes_range}"
        )
        # upload the raw file
        s3_client.upload_fileobj(
            s3_obj["Body"],
            s3_other_bucket,
            filename
        )

    def _wait(procs):
        ...
    ...
After this change, no SSLError exceptions seem to occur.
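The same documentation also recommends creating a separate Session per thread or process rather than relying on boto3's default session. A minimal sketch of the _upload worker written that way, with src_key elided as in the question:
import boto3

def _upload(filename, bytes_range):
    src_key = ...
    # one Session and one client per process; sessions, like
    # resources, should not be shared across processes
    session = boto3.session.Session()
    s3_client = session.client(service_name="s3")
    s3_obj = s3_client.get_object(
        Bucket=s3_bucket,
        Key=src_key,
        Range=f"bytes={bytes_range}"
    )
    s3_client.upload_fileobj(s3_obj["Body"], s3_other_bucket, filename)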

Unable to transfer file from master node to minion nodes using sftp in a python script

I am trying to send a file from the master node to minion nodes using a Python script, but a single error, OSError: Failure, keeps coming up.
I wrote this script to send the file from one local machine to another local machine.
My code:
#!/usr/bin/python3
import paramiko
import os

# Defining working connect
def workon(host):
    # Making a connection
    ssh_client = paramiko.SSHClient()
    # Auto-add the missing host key
    ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh_client.connect(hostname=host, username='username', password='password')
    ftp_client = ssh_client.open_sftp()
    ftp_client.put("/home/TrialFolder/HelloPython", "/home/")
    ftp_client.close()
    #stdin, stdout, stderr = ssh_client.exec_command("ls")
    #lines = stdout.readlines()
    #print(lines)

def main():
    hosts = ['192.16.15.32', '192.16.15.33', '192.16.15.34']
    threads = []
    for h in hosts:
        workon(h)

main()
Error:
Traceback (most recent call last):
  File "PythonMultipleConnectionUsinhSSH.py", line 28, in <module>
    main()
  File "PythonMultipleConnectionUsinhSSH.py", line 26, in main
    workon(h)
  File "PythonMultipleConnectionUsinhSSH.py", line 15, in workon
    ftp_client.put("/home/Sahil/HelloPython", "/home/")
  File "/usr/local/lib/python3.6/site-packages/paramiko/sftp_client.py", line 759, in put
    return self.putfo(fl, remotepath, file_size, callback, confirm)
  File "/usr/local/lib/python3.6/site-packages/paramiko/sftp_client.py", line 714, in putfo
    with self.file(remotepath, "wb") as fr:
  File "/usr/local/lib/python3.6/site-packages/paramiko/sftp_client.py", line 372, in open
    t, msg = self._request(CMD_OPEN, filename, imode, attrblock)
  File "/usr/local/lib/python3.6/site-packages/paramiko/sftp_client.py", line 813, in _request
    return self._read_response(num)
  File "/usr/local/lib/python3.6/site-packages/paramiko/sftp_client.py", line 865, in _read_response
    self._convert_status(msg)
  File "/usr/local/lib/python3.6/site-packages/paramiko/sftp_client.py", line 898, in _convert_status
    raise IOError(text)
OSError: Failure
First, you should make sure the target directory /home/ is writable for you. Then you should review the documentation for the put method. It says this about the second argument (remotepath):
The destination path on the SFTP server. Note that the filename should be included. Only specifying a directory may result in an error.
Try including the filename in the path, like:
...
ftp_client.put("/home/TrialFolder/HelloPython", "/home/HelloPython")
...
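If the remote file should simply keep the local file's name, os.path.basename can derive the remote path, so the two can never drift apart. A small sketch using the question's paths:
import os

local_path = "/home/TrialFolder/HelloPython"
# upload into /home/ under the file's own name -> /home/HelloPython
ftp_client.put(local_path, "/home/" + os.path.basename(local_path))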

Python pysftp.put raises "No such file" exception although file is uploaded

I am using pysftp to connect to a server and upload a file to it.
cnopts = pysftp.CnOpts()
cnopts.hostkeys = None
self.sftp = pysftp.Connection(host=self.serverConnectionAuth['host'],
                              port=self.serverConnectionAuth['port'],
                              username=self.serverConnectionAuth['username'],
                              password=self.serverConnectionAuth['password'],
                              cnopts=cnopts)
self.sftp.put(localpath=self.filepath+filename, remotepath=filename)
Sometimes it finishes with no error, but sometimes it puts the file correctly BUT raises the following exception. The file is read and processed by another program running on the server, so I can see that the file is there and is not corrupted:
File "E:\Anaconda\envs\py35\lib\site-packages\pysftp\__init__.py", line 364, in put
confirm=confirm)
File "E:\Anaconda\envs\py35\lib\site-packages\paramiko\sftp_client.py", line 727, in put
return self.putfo(fl, remotepath, file_size, callback, confirm)
File "E:\Anaconda\envs\py35\lib\site-packages\paramiko\sftp_client.py", line 689, in putfo
s = self.stat(remotepath)
File "E:\Anaconda\envs\py35\lib\site-packages\paramiko\sftp_client.py", line 460, in stat
t, msg = self._request(CMD_STAT, path)
File "E:\Anaconda\envs\py35\lib\site-packages\paramiko\sftp_client.py", line 780, in _request
return self._read_response(num)
File "E:\Anaconda\envs\py35\lib\site-packages\paramiko\sftp_client.py", line 832, in _read_response
self._convert_status(msg)
File "E:\Anaconda\envs\py35\lib\site-packages\paramiko\sftp_client.py", line 861, in _convert_status
raise IOError(errno.ENOENT, text)
FileNotFoundError: [Errno 2] No such file
How can I prevent the exception?
From the described behaviour, I assume that the file is removed, very shortly after it is uploaded, by some server-side process.
By default, pysftp.Connection.put verifies the upload by checking the size of the target file. If the server-side process manages to remove the file too quickly, reading the file size fails.
You can disable the post-upload check by setting the confirm parameter to False:
self.sftp.put(localpath=self.filepath+filename, remotepath=filename, confirm=False)
I believe the check is redundant anyway, see
How to perform checksums during a SFTP file transfer for data integrity?
For a similar question about Paramiko (which pysftp uses internally), see:
Paramiko put method throws "[Errno 2] File not found" if SFTP server has trigger to automatically move file upon upload
I also had this issue of the file automatically getting moved before Paramiko could stat the uploaded file and compare the local and uploaded file sizes.
@Martin_Prikryl's solution works fine for removing the error: pass confirm=False when using sftp.put or sftp.putfo.
If you want this check to still run, to verify the file has been uploaded fully, you can do something along these lines. For this to work you need to know the location the file is moved to and have the ability to read it:
import os

sftp.putfo(source_file_object, destination_file, confirm=False)
# stat the remote file at its post-move location
upload_size = sftp.stat(moved_path).st_size
# os.stat needs a path or a descriptor, not a file object,
# so stat the local source through its file descriptor
local_size = os.fstat(source_file_object.fileno()).st_size
if upload_size != local_size:
    raise IOError(
        "size mismatch in put! {} != {}".format(upload_size, local_size)
    )
Both sizes come from stat calls: sftp.stat for the remote file and os.fstat for the local one.

ftplib.error_perm: 553 Could not create file. (Python 2.4.4)

I am writing to the home directory of the user I'm FTPing into, so permissions shouldn't be an issue. FTP works in FileZilla.
I checked vsftpd.conf and made the local_enable=YES change.
On a Debian 4 system with Python 2.4.4 (I can't upgrade it), I am using this code with ftplib:
>>> f = ftplib.FTP('address', 'user', 'password')
>>> f.cwd('/home/user/some/dir/')
'250 Directory successfully changed.'
>>> myfile = '/full/path/of/file.txt'
>>> o = open(myfile, 'rb')
>>> f.storbinary('STOR ' + myfile, o)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.4/ftplib.py", line 415, in storbinary
    conn = self.transfercmd(cmd)
  File "/usr/lib/python2.4/ftplib.py", line 345, in transfercmd
    return self.ntransfercmd(cmd, rest)[0]
  File "/usr/lib/python2.4/ftplib.py", line 327, in ntransfercmd
    resp = self.sendcmd(cmd)
  File "/usr/lib/python2.4/ftplib.py", line 241, in sendcmd
    return self.getresp()
  File "/usr/lib/python2.4/ftplib.py", line 216, in getresp
    raise error_perm, resp
ftplib.error_perm: 553 Could not create file.
Any ideas why it fails?
You are not writing to a home directory, you are writing to /full/path/of/file.txt:
myfile = '/full/path/of/file.txt'
...
f.storbinary('STOR ' + myfile, o)
You have to pass only the file name to the STOR command (once the current working directory is already the correct target path):
f.cwd('/home/user/some/dir/')
f.storbinary('STOR file.txt', o)
or a correct absolute path for the remote host:
f.storbinary('STOR /home/user/some/dir/file.txt', o)
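Since this is Python 2.4, which predates the with statement, a complete sketch that derives the remote name from the local path could look like this ('address', 'user' and 'password' are the placeholders from the question):
import ftplib
import os

myfile = '/full/path/of/file.txt'
f = ftplib.FTP('address', 'user', 'password')
f.cwd('/home/user/some/dir/')
o = open(myfile, 'rb')
try:
    # STOR takes the remote name; send only the base name here
    f.storbinary('STOR ' + os.path.basename(myfile), o)
finally:
    o.close()
f.quit()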

Cannot write to Twisted FTP server

I am currently using the one-line Twisted FTP server to transfer files back and forth between machines:
twistd -n ftp
which works fine for downloading files from the server. However, when I try to write to the server using:
with open('testFile.bmp', 'rb') as f:
    ftp.storbinary('STOR ' + 'testFile.bmp', f)
with open('surrogate.py', 'rb') as f:
    ftp.storbinary('STOR ' + 'surrogateCode.py', f)
I get errors:
Traceback (most recent call last):
  File "client.py", line 13, in <module>
    ftp.storbinary('STOR ' + 'testFile.bmp', f)
  File "/usr/lib/python2.7/ftplib.py", line 461, in storbinary
    conn = self.transfercmd(cmd, rest)
  File "/usr/lib/python2.7/ftplib.py", line 368, in transfercmd
    return self.ntransfercmd(cmd, rest)[0]
  File "/usr/lib/python2.7/ftplib.py", line 331, in ntransfercmd
    resp = self.sendcmd(cmd)
  File "/usr/lib/python2.7/ftplib.py", line 244, in sendcmd
    return self.getresp()
  File "/usr/lib/python2.7/ftplib.py", line 219, in getresp
    raise error_perm, resp
ftplib.error_perm: 550 Requested action not taken: internal server error
I tried it with the WinSCP FTP client and received this error:
Copying files to remote side failed.
Requested action not taken: internal server error
I'm not sure if I am writing incorrectly or calling the server incorrectly.
Your code looks okay, and from your description of the problem (hitting it from both WinSCP and your Python client) I would hazard a guess that the issue is on the server side.
Using http://en.wikipedia.org/wiki/List_of_FTP_server_return_codes as a reference: error 550 together with ftplib.error_perm suggests that the user you authenticate as doesn't have write permission for that location.
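If anonymous access is the culprit, one route is to run the ftp tap with a real user database instead of the bare one-liner. A sketch, assuming your Twisted version supports the --root and --password-file options (check twistd ftp --help):
# serve /srv/ftp and allow authenticated logins;
# pass.dat holds username:password lines
twistd -n ftp --root /srv/ftp --password-file pass.dat
The client would then log in with those credentials, e.g. ftplib.FTP(host, 'username', 'password'), instead of anonymously.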
