I need to pull the latest files from an FTP folder. The problem is that the filenames contain spaces and follow a specific pattern.
Below is the code I have implemented:
from __future__ import with_statement  # must come before all other imports
import sys
from ftplib import FTP
import os
import socket
import time
import pandas as pd
import numpy as np
from glob import glob
import datetime as dt
ftp = FTP('')
ftp.login('','')
ftp.cwd('')
ftp.retrlines('LIST')
filematch='*Elig.xlsx'
downloaded = []
for filename in ftp.nlst(filematch):
    fhandle=open(filename, 'wb')
    print('Getting ' + filename)
    ftp.retrbinary('RETR '+ filename, fhandle.write)
    fhandle.close()
    downloaded.append(filename)
ftp.quit()
I understand that I can collect the output of the ftp.dir() command into a list, but since the filenames contain spaces, I am unable to split the lines correctly and pick the latest file of the type mentioned above.
Any help would be great.
You can get each file's modification time by sending the MDTM command, provided the FTP server supports it, and sort the files accordingly.
def get_newest_files(ftp, limit=None):
    """Retrieves newest files from the FTP connection.
    :ftp: The FTP connection to use.
    :limit: Abort after yielding this amount of files.
    """
    files = []
    # Decorate files with mtime.
    for filename in ftp.nlst():
        response = ftp.sendcmd('MDTM {}'.format(filename))
        _, mtime = response.split()
        files.append((mtime, filename))
    # Sort files by mtime and break after limit is reached.
    for index, decorated_filename in enumerate(sorted(files, reverse=True)):
        if limit is not None and index >= limit:
            break
        _, filename = decorated_filename # Undecorate
        yield filename
downloaded = []
# Retrieves the newest file from the FTP server.
for filename in get_newest_files(ftp, limit=1):
    print('Getting ' + filename)
    with open(filename, 'wb') as file:
        ftp.retrbinary('RETR '+ filename, file.write)
    downloaded.append(filename)
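To combine this with the '*Elig.xlsx' pattern from the question, something like the following sketch might help. The helper names parse_mdtm and newest_matching are made up for illustration, and the code assumes the server answers MDTM with a reply such as '213 20200423140048' (some servers append fractional seconds, which are dropped here). Matching is done on whole names from nlst(), so spaces in filenames are not a problem.

```python
from datetime import datetime
from fnmatch import fnmatch


def parse_mdtm(response):
    """Turn an MDTM reply such as '213 20200423140048' into a datetime."""
    _, stamp = response.split(None, 1)
    # Keep only YYYYMMDDHHMMSS; drop optional fractional seconds.
    return datetime.strptime(stamp.strip()[:14], "%Y%m%d%H%M%S")


def newest_matching(ftp, pattern):
    """Return the newest remote name matching pattern, or None if none match."""
    names = [name for name in ftp.nlst() if fnmatch(name, pattern)]
    if not names:
        return None
    # One MDTM round trip per candidate file.
    return max(names, key=lambda name: parse_mdtm(ftp.sendcmd("MDTM " + name)))
```

With an open connection, newest_matching(ftp, '*Elig.xlsx') would give the name to pass to RETR.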
The issue is that the FTP "LIST" command returns text meant for humans, and its format depends on the FTP server implementation.
Using PyFilesystem (in place of the standard ftplib) gives you a listing API (search for "walk") that returns Pythonic structures describing the files and directories hosted on the FTP server.
http://pyfilesystem2.readthedocs.io/en/latest/index.html
Related
I am trying to delete files older than x days from an FTP server using the code below.
#!/usr/bin/env python
from ftplib import FTP
import time
import sys
import os
ftp_user = sys.argv[1]
ftp_pwd = sys.argv[2]
host = sys.argv[3]
def remove_files(days, dir_path):
    ftp = FTP(host)
    ftp.login(ftp_user, ftp_pwd)
    ftp.cwd(dir_path)
    now = time.time()
    for file in ftp.nlst(dir_path):
        print("filename:", file)
        #file_path =os.path.join(dir_path, f)
        if not os.path.isfile(file):
            continue
        if os.stat(file).st_mtime < now - days * 86400:
            ftp.delete(file)
            print("Deleted ", file)
I am not getting any error, but the files are not deleted. I think the os module functions do not work against an FTP server. Is there an alternative way to delete files older than x days from FTP? I am calling this script from Ansible to automate the process.
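Indeed, os.path doesn't work with files over FTP; it only inspects the local filesystem.
You can use mlsd (available since Python 3.3, and only if the server supports the MLSD command) in the following manner:
ftp.mlsd(facts=["Modify"])
It returns a generator of tuples, each looking like: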
('favicon.ico', {'modify': '20110616024613'})
(the first item is the file name, the second is a dictionary with the last modified time).
To get more information about each file - for example, the file's type, use:
ftp.mlsd(facts=["Modify", "Type"])
This results in data like:
('.manifest.full', {'modify': '20200423140048', 'type': 'file'})
('14.04', {'modify': '20140327184332', 'type': 'OS.unix=symlink'})
('.pool', {'modify': '20200423134557', 'type': 'dir'})
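Putting this together for the original goal, a minimal sketch could select the names to delete from the mlsd output. The function name names_older_than is hypothetical, and the code assumes the 'modify' fact is a UTC timestamp in YYYYMMDDHHMMSS form, as in the samples above.

```python
from datetime import datetime, timedelta, timezone


def names_older_than(entries, days, now=None):
    """Yield names of plain files whose 'modify' fact is older than `days` days.

    `entries` is any iterable of (name, facts) pairs, for example the
    result of ftp.mlsd(facts=["modify", "type"]).
    """
    if now is None:
        # Naive UTC "now", to compare against the naive parsed timestamps.
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    cutoff = now - timedelta(days=days)
    for name, facts in entries:
        if facts.get("type", "").lower() != "file":
            continue  # skip directories, symlinks, '.' and '..'
        stamp = facts.get("modify")
        if not stamp:
            continue  # server did not report a modification time
        modified = datetime.strptime(stamp[:14], "%Y%m%d%H%M%S")
        if modified < cutoff:
            yield name


# Usage with an open connection (not executed here):
# for name in names_older_than(ftp.mlsd(facts=["modify", "type"]), days=30):
#     ftp.delete(name)
```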
Task: I need to connect to a client's FTP server, which has many directories, each of which may or may not contain .csv files. I need to go into each directory, open the files, and, if a file matches the given format, load it into a server.
So far I am able to connect to the FTP server and get the list of directories, but not the files inside each directory.
from ftplib import FTP
from sqlalchemy import create_engine
import os
import sys
import os.path
ftp=FTP('host')
ftp.login('user','pwd')
for files in ftp.dir():
    filenames=ftp.nlst(files)
    ftp.retrbinary("RETR " + a, file.write)
    file.close()
ftp.close() #CLOSE THE FTP CONNECTION
print "FTP connection closed. Goodbye"
I know that is not at all up to the mark.
It looks like you need a way to list the files in a given directory. Here is a function I often use for this on Unix systems (macOS included). It works on local paths, so it should be a good starting point, if not the final solution you are looking for.
import glob, os
def list_of_files(path, extension, recursive=False):
    '''
    Yield the path of each file under path with the target extension.
    If recursive, it will loop over subfolders as well.
    '''
    if not recursive:
        for file_path in glob.iglob(path + '/*.' + extension):
            yield file_path
    else:
        for root, dirs, files in os.walk(path):
            for file_path in glob.iglob(root + '/*.' + extension):
                yield file_path
Also, you can use ftp.cwd('..') to move to the parent directory and ftp.retrlines('LIST') to print the listing of the current directory.
Check the ftplib docs for some useful code snippets.
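For the FTP side of the task, a recursive walk over the remote tree could be sketched as below. This is only an illustration: iter_remote_files is a made-up name, and it assumes the server supports MLSD so that ftp.mlsd() can report each entry's type.

```python
import posixpath


def iter_remote_files(ftp, path, extension):
    """Yield remote paths under `path` whose names end with `extension`."""
    for name, facts in ftp.mlsd(path, facts=["type"]):
        kind = facts.get("type", "").lower()
        full = posixpath.join(path, name)
        if kind == "dir":
            # Recurse into real subdirectories only; 'cdir' and 'pdir'
            # (the '.' and '..' entries) are ignored.
            for found in iter_remote_files(ftp, full, extension):
                yield found
        elif kind == "file" and name.lower().endswith(extension):
            yield full
```

Each yielded path can then be passed to ftp.retrbinary('RETR ' + path, ...) to fetch the file.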
Can you please shed some light on what I am doing wrong here? I'm a Python newbie. This connects and I can get a list of files in a specific FTP directory, but for the love of BigBang, it isn't downloading any files. I need to download files whose names start with a specific string:
from ftplib import FTP_TLS
import fnmatch
import os
ftps = FTP_TLS('myftp_site.com')
ftps.login('userxx', 'pwxx')
ftps.prot_p()
ftps.cwd('Inbox')
print("File list:")
list_of_files = ftps.retrlines("LIST")
dest_dir = "C:\DownloadingDirectory"
for name in list_of_files:
    if fnmatch.fnmatch(name,"StartingFileName*"):
        with open(os.path.join(dest_dir, name), "wb") as f:
            ftps.retrbinary("RETR {}".format(name), f.write)
ftps.quit()
You are having problems getting the file list. LIST returns a long-format listing with file attributes, for instance
-rw------- 1 td dialout 543 Apr 3 20:18 .bash_history
Use NLST to get a short list of names only. Also, the retrlines() function is a bit unusual: it calls a callback for each line it receives (the default callback prints the line), and the call itself only returns a status string. You can pass your own callback to fill a list, or use the .nlst() method to get the list for you.
from ftplib import FTP_TLS
import fnmatch
import os
ftps = FTP_TLS('myftp_site.com')
ftps.login('userxx', 'pwxx')
ftps.prot_p()
ftps.cwd('Inbox')
print("File list:")
list_of_files = []
ftps.retrlines("NLST", list_of_files.append)
# ...or use the existing helper
# list_of_files = ftps.nlst()
dest_dir = r"C:\DownloadingDirectory"  # raw string so the backslash is kept literally
for name in list_of_files:
    if fnmatch.fnmatch(name,"StartingFileName*"):
        with open(os.path.join(dest_dir, name), "wb") as f:
            ftps.retrbinary("RETR {}".format(name), f.write)
ftps.quit()
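If a server only offers LIST, the name can sometimes still be recovered from each line. The sketch below assumes a Unix-style listing with eight attribute fields before the name, as in the .bash_history example above; other servers format LIST differently, so treat it as a fallback only. The function name name_from_list_line is made up for illustration.

```python
def name_from_list_line(line):
    """Return the filename from a Unix-style LIST line, or None.

    Splitting at most 8 times keeps any spaces inside the name intact.
    """
    fields = line.split(None, 8)
    if len(fields) < 9:
        return None  # e.g. a 'total 12' header line
    return fields[8]
```

The lines themselves can be collected with ftps.retrlines("LIST", lines.append), where lines is an initially empty list.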
I have a list of FTP sites (e.g. 10) in a text file and I need to download the last created file from each site. Is this possible? This is my code:
import os
from ftplib import FTP
ftp = FTP("xxx.xx.xx.xx1", "USERNAME1", "PASSWORD1")
ftp = FTP("xxx.xx.xx.xx2", "USERNAME2", "PASSWORD2")
ftp = FTP("xxx.xx.xx.xx3", "USERNAME3", "PASSWORD3")
ftp = FTP("xxx.xx.xx.xx4", "USERNAME4", "PASSWORD4")
ftp = FTP("xxx.xx.xx.xx5", "USERNAME5", "PASSWORD5")
ftp.login()
ftp.retrlines("LIST")
ftp.cwd("SmythIN/2014-10-29")  # this folder is named after the current date; how can I pass the current date when changing directory?
ftp.cwd("subFolder") # or ftp.cwd("folderOne/subFolder")
listing = []
ftp.retrlines("LIST", listing.append)
words = listing[0].split(None, 8)
filename = words[-1].lstrip()
# download the file
local_filename = os.path.join(r"c:\myfolder", filename)
lf = open(local_filename, "wb")
ftp.retrbinary("RETR " + filename, lf.write, 8*1024)
lf.close()
Updated code:
ftp.cwd("SmythIN/2014-10-29")  # the directory with today's date is already created
Looping through the servers and pulling the last file from a specified directory (if I understand your question correctly) is straightforward. Remembering which server each file came from should not be a problem either, since you can use different local directories or edit the filename as the file transfers. Here are my suggestions (to be adapted to your application, of course):
import os
import datetime as dt
from ftplib import FTP
# read in text file containing server login information and jam into dictionary
with open('server_file.txt','r') as tmp:
    servers = {}
    for r in tmp.read().split('\n'):
        if not r.strip():
            continue  # skip blank lines such as a trailing newline
        rs = r.split(',') # split r by comma
        servers[rs[0]] = {'uname':rs[1],'pwd':rs[2]}  # password as a string, not a list
# if you want to create a new directory to save the file to
heute = dt.datetime.strftime(dt.datetime.today(),'%Y%m%d')
if not os.path.isdir('my_dir' + heute):
    os.mkdir('my_dir' + heute)
for s in servers:
    ftp = FTP(s,servers[s]['uname'],servers[s]['pwd'])
    ftp.cwd('desired_subdir')
    # if you want to download the last file I would use nlst
    # (the order of nlst is server-dependent, so [-1] is only the last listed name)
    with open('local_file','wb') as lf:
        ftp.retrbinary('RETR ' + ftp.nlst()[-1], lf.write, 8*1024)
    ftp.quit()
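As for passing the current date as the folder name, one possible approach is to format today's date with strftime. The helper name dated_dir is made up for illustration; the YYYY-MM-DD layout is taken from the "SmythIN/2014-10-29" example in the question.

```python
from datetime import date


def dated_dir(base, day=None):
    """Build a remote path such as 'SmythIN/2014-10-29' for the given day."""
    if day is None:
        day = date.today()  # default to the current local date
    return "{}/{}".format(base.rstrip("/"), day.strftime("%Y-%m-%d"))
```

It would then be used as ftp.cwd(dated_dir("SmythIN")).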
I want to move a large number of files from a Windows system to a Unix FTP server using Python. I have a CSV which has the current full path and filename and the new base path to send it to (see here for an example dataset).
I have a script using os.renames to do the transfer and directory creation in Windows, but I can't figure out a way to do it easily via FTP.
import os, glob, arcpy, csv, sys, shutil, datetime
top=os.getcwd()
RootOutput = top
startpath=top
FileList = csv.reader(open('FileList.csv'))
filecount=0
successcount=0
errorcount=0
# Copy/Move to FTP when required
ftp = ftplib.FTP('xxxxxx')
ftp.login('xxxx', 'xxxx')
directory = '/TransferredData'
ftp.cwd(directory)
##f = open(RootOutput+'\\Success_LOG.txt', 'a')
##f.write("Log of files Succesfully processed. RESULT of process run #:"+str(datetime.datetime.now())+"\n")
##f.close()
##
for File in FileList:
    infile=File[0]
    # local network ver
    #outfile=RootOutput+File[4]
    #os.renames(infile, outfile)
    # ftp network ver
    # outfile=RootOutput+File[4]
    # ftp.mkd(directory)
    print infile, outfile
I tried the process in http://forums.arcgis.com/threads/17047-Upload-file-to-FTP-using-Python-ftplib, but that moves all files in a directory. I have the old and new full file names and just need it to create the intermediate directories.
Thanks,
The following might work (untested):
import ftplib
from ftplib import FTP
def mkpath(ftp, path):
    path = path.rsplit('/', 1)[0] # parent directory
    if not path:
        return
    try:
        ftp.cwd(path)
    except ftplib.error_perm:
        mkpath(ftp, path)  # make sure the grandparent exists first
        ftp.mkd(path)
ftp = FTP(...)
directory = '/TransferredData/'
for File in FileList:
    infile = File[0]
    outfile = File[4].split('\\') # need forward slashes in FTP
    outfile = directory + '/'.join(outfile)
    mkpath(ftp, outfile)
    with open(infile, 'rb') as fh:  # close the local file after the upload
        ftp.storbinary('STOR '+outfile, fh)
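An iterative variant of the same idea is sketched below; it avoids changing the working directory. The names remote_parents and ensure_parents are hypothetical, and the code assumes that an error_perm reply to MKD means the directory already exists, which is not guaranteed (it could also be a permissions problem).

```python
import ftplib


def remote_parents(remote_path):
    """List every ancestor directory of an absolute remote path, shallowest first."""
    # Accept Windows-style separators and ignore empty components.
    parts = [p for p in remote_path.replace("\\", "/").split("/") if p]
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]


def ensure_parents(ftp, remote_path):
    """Try to create each ancestor directory of remote_path in turn."""
    for directory in remote_parents(remote_path):
        try:
            ftp.mkd(directory)
        except ftplib.error_perm:
            pass  # assumed here to mean "already exists"
```

Calling ensure_parents(ftp, outfile) before ftp.storbinary would take the place of mkpath in the loop above.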