Delete files from FTP older than x days in Python - python

I am trying to delete files older than x days from an FTP server using the code below.
#!/usr/bin/env python
from ftplib import FTP
import time
import sys
import os
ftp_user = sys.argv[1]
ftp_pwd = sys.argv[2]
host = sys.argv[3]
def remove_files(days, dir_path):
    ftp = FTP(host)
    ftp.login(ftp_user, ftp_pwd)
    ftp.cwd(dir_path)
    now = time.time()
    for file in ftp.nlst(dir_path):
        print("filename:", file)
        #file_path =os.path.join(dir_path, f)
        if not os.path.isfile(file):
            continue
        if os.stat(file).st_mtime < now - days * 86400:
            ftp.delete(file)
            print("Deleted ", file)
I am not getting any error, but the files are not deleted. I think the os module functions do not work against an FTP server. Is there an alternative way to delete files older than x days from an FTP server? I am calling this script from Ansible to automate the process.

Indeed, os.path and os.stat operate on the local filesystem, so they can't inspect files on an FTP server.
If the server supports the MLSD command, you can use mlsd in the following manner:
ftp.mlsd(facts=["Modify"])
It yields tuples (it is a generator rather than a list), each looking like:
('favicon.ico', {'modify': '20110616024613'})
(the first item is the file name, the second is a dictionary with the last modified time).
To get more information about each file, for example the file's type, use:
ftp.mlsd(facts=["Modify", "Type"])
This results in data like:
('.manifest.full', {'modify': '20200423140048', 'type': 'file'})
('14.04', {'modify': '20140327184332', 'type': 'OS.unix=symlink'})
('.pool', {'modify': '20200423134557', 'type': 'dir'})
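Putting this together for the original question, a minimal sketch might look like the following. It assumes the server supports MLSD and reports the modify fact in UTC; parse_mlsd_time and remove_old_files are hypothetical helper names, not part of ftplib.

```python
from datetime import datetime, timedelta, timezone


def parse_mlsd_time(modify):
    """Parse an MLSD 'modify' fact (YYYYMMDDHHMMSS[.sss]) as an aware UTC datetime."""
    # Assumption: the server reports UTC; fractional seconds are ignored.
    return datetime.strptime(modify[:14], "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def remove_old_files(ftp, dir_path, days, now=None):
    """Delete plain files in dir_path older than `days` days; return their names."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    deleted = []
    ftp.cwd(dir_path)
    # Materialize the listing first so deletions happen after it is complete.
    for name, facts in list(ftp.mlsd(facts=["modify", "type"])):
        if facts.get("type") != "file":
            continue  # skip directories, symlinks, and the cdir/pdir entries
        modify = facts.get("modify")
        if modify is not None and parse_mlsd_time(modify) < cutoff:
            ftp.delete(name)
            deleted.append(name)
    return deleted
```

With a logged-in ftplib.FTP object, a call such as remove_old_files(ftp, dir_path, days) would replace the os.path.isfile and os.stat checks from the question.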

Related

Renaming log files with Python using wildcard and datetime

I was searching today for ways to manipulate some log files after processing them, and I found that Python has os.rename in the os module, but I have some doubts about wildcards.
I tried to put a wildcard "*****" in my file names, but Python does not seem to understand it.
My file names are:
Application_2021-08-06_hostname_[PID].log
Currently I have Python read these application files and search for defined words/phrases such as "User has logged in", "User disconnected", etc., and that works well. I'm using the datetime module so that Python always reads only the current day's files.
What I'm trying to do now is change the name of the file after Python has read it and acted on it. So when it finds "Today's sessions are done", it should rename the file to:
Application_06-08-2021_hostname_[PID].log
Because it will be easier to manipulate later.
Since [PID] always changes, this is the part where I wanted to use the wildcard, because it can be 56, 142, 3356, 74567 or anything.
Using os.rename, I get an error. What do you suggest?
Code:
import os
import time
from datetime import datetime
path = '/users/application/logs'
file_name = 'Application_%s_hostname_'% datetime.now().strftime('%Y-%m-%d')
new_file_name = 'Application_%s_hostname_'% datetime.now().strftime('%d-%m-%Y')
os.rename(file_name, new_file_name)
The error is:
OSError: [Errno 2] No such file or directory
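os.rename does not expand wildcards, and the names you pass contain neither the directory nor the PID suffix, so no such file exists. You can use glob, which supports wildcards: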
import glob, os
from datetime import datetime
current_date = datetime.now()
path = '/users/application/logs'
# note the use of * as wild card
filename ='Application_%s_hostname_*'% current_date.strftime('%Y-%m-%d')
full_path = os.path.join(path, filename)
replace_file = glob.glob(full_path)[0] # will return a list so take first element
# or loop to get all files
new_name = replace_file.replace( current_date.strftime('%Y-%m-%d'), current_date.strftime('%d-%m-%Y') )
os.rename(replace_file, new_name)
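If several processes write logs on the same day, the loop mentioned in the comment could be sketched as follows. The function name rename_dated_logs and the ".log" suffix in the pattern are assumptions based on the file names in the question; only the file name, not the directory part, is rewritten.

```python
import glob
import os
from datetime import datetime


def rename_dated_logs(path, day):
    """Rename every Application_<YYYY-MM-DD>_hostname_<PID>.log for `day`
    to the DD-MM-YYYY form and return the new paths."""
    old_stamp = day.strftime("%Y-%m-%d")
    new_stamp = day.strftime("%d-%m-%Y")
    # The * wildcard stands in for the changing PID.
    pattern = os.path.join(path, "Application_%s_hostname_*.log" % old_stamp)
    renamed = []
    for old_path in glob.glob(pattern):
        directory, old_name = os.path.split(old_path)
        # Replace the date in the file name only, never in the directory.
        new_path = os.path.join(directory, old_name.replace(old_stamp, new_stamp, 1))
        os.rename(old_path, new_path)
        renamed.append(new_path)
    return renamed
```

Calling rename_dated_logs('/users/application/logs', datetime.now()) would then cover all of today's files in one pass.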

How do I get FTP's current working directory using python

How can I get the current working directory of an FTP connection using Python?
I have the following code and want to store the names of the files in the root directory in a list.
from ftplib import FTP
ftp = FTP('domainname.com')
ftp.login(user='username',passwd = 'password')
You can use ftp.pwd() to get the current working directory.
Example:
from ftplib import FTP
ftp = FTP('domainname.com')
ftp.login(user='username',passwd = 'password')
ftp.pwd() #Current working dir
ftp.cwd("Destination_Path") #To change to a different path
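To also collect the file names into a list, as the question asks, one possible sketch combines pwd, cwd and nlst. The helper name list_directory is hypothetical, and the names returned depend on what the server sends for NLST.

```python
def list_directory(ftp, path="/"):
    """Return the names in `path`, restoring the previous working directory."""
    original = ftp.pwd()      # remember where the session currently is
    ftp.cwd(path)
    try:
        return ftp.nlst()     # names in the (new) current directory
    finally:
        ftp.cwd(original)     # go back even if the listing fails
```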

Pulling out the files from directories by connecting to ftp using python

Task: I need to connect to a client's FTP server, which has many directories, each of which may or may not contain .csv files. I need to go through every directory, open the files in it, and, if a file matches the given format, dump it onto a server.
At present I am able to connect to the FTP server and do this much:
I can get the list of directories, but not the files inside each directory.
from ftplib import FTP
from sqlalchemy import create_engine
import os
import sys
import os.path
ftp=FTP('host')
ftp.login('user','pwd')
for files in ftp.dir():
    filenames=ftp.nlst(files)
    ftp.retrbinary("RETR " + a, file.write)
    file.close()
ftp.close() #CLOSE THE FTP CONNECTION
print "FTP connection closed. Goodbye"
I know that is not at all up to the mark.
It looks like you want a way to get a list of the files in a given directory. Here is a function I often use for this task on Unix systems (macOS included). Note that it works on local paths, so it should be a good starting point, if not the final solution you are looking for.
import glob, os
def list_of_files(path, extension, recursive=False):
    '''
    Return a list of filepaths for each file into path with the target extension.
    If recursive, it will loop over subfolders as well.
    '''
    if not recursive:
        for file_path in glob.iglob(path + '/*.' + extension):
            yield file_path
    else:
        for root, dirs, files in os.walk(path):
            for file_path in glob.iglob(root + '/*.' + extension):
                yield file_path
Also, you can use ftp.cwd('..') to change to the parent directory and ftp.retrlines('LIST') to list the files in the current directory.
Check the ftplib docs for some useful code snippets.
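Since glob and os.walk only see local files, a remote equivalent has to walk the server itself. The following is a rough sketch that assumes the server supports MLSD and reports a type fact; iter_remote_files and download are hypothetical helper names.

```python
import os
import posixpath


def iter_remote_files(ftp, path, extension=".csv"):
    """Yield remote paths under `path` whose names end with `extension`."""
    # Materialize each listing before recursing into subdirectories.
    for name, facts in list(ftp.mlsd(path, facts=["type"])):
        kind = facts.get("type")
        full_path = posixpath.join(path, name)
        if kind == "dir":
            # Recurse into subdirectories; cdir/pdir entries are ignored.
            yield from iter_remote_files(ftp, full_path, extension)
        elif kind == "file" and name.lower().endswith(extension):
            yield full_path


def download(ftp, remote_path, local_dir):
    """Fetch one remote file into local_dir and return the local path."""
    local_path = os.path.join(local_dir, posixpath.basename(remote_path))
    with open(local_path, "wb") as fh:
        ftp.retrbinary("RETR " + remote_path, fh.write)
    return local_path
```

Each path yielded by iter_remote_files(ftp, "/") could then be passed to download before checking the file's format locally.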

Getting the latest files from FTP folder (filename having spaces) in Python

I have a requirement to pull the latest files from an FTP folder. The problem is that the filenames contain spaces and follow a specific pattern.
Below is the code I have implemented:
import sys
from ftplib import FTP
import os
import socket
import time
import pandas as pd
import numpy as np
from glob import glob
import datetime as dt
from __future__ import with_statement
ftp = FTP('')
ftp.login('','')
ftp.cwd('')
ftp.retrlines('LIST')
filematch='*Elig.xlsx'
downloaded = []
for filename in ftp.nlst(filematch):
    fhandle=open(filename, 'wb')
    print 'Getting ' + filename
    ftp.retrbinary('RETR '+ filename, fhandle.write)
    fhandle.close()
    downloaded.append(filename)
ftp.quit()
I understand that I can collect the output of the ftp.dir() command into a list, but since the filenames contain spaces, I am unable to split the lines correctly and pick the latest file of the type mentioned above.
Any help would be great.
If the FTP server supports it, you can get each file's mtime by sending the MDTM command and then sort the files accordingly.
def get_newest_files(ftp, limit=None):
    """Retrieves newest files from the FTP connection.
    :ftp: The FTP connection to use.
    :limit: Abort after yielding this amount of files.
    """
    files = []
    # Decorate files with mtime.
    for filename in ftp.nlst():
        response = ftp.sendcmd('MDTM {}'.format(filename))
        _, mtime = response.split()
        files.append((mtime, filename))
    # Sort files by mtime and break after limit is reached.
    for index, decorated_filename in enumerate(sorted(files, reverse=True)):
        if limit is not None and index >= limit:
            break
        _, filename = decorated_filename # Undecorate
        yield filename

downloaded = []
# Retrieves the newest file from the FTP server.
for filename in get_newest_files(ftp, limit=1):
    print('Getting ' + filename)
    with open(filename, 'wb') as file:
        ftp.retrbinary('RETR '+ filename, file.write)
    downloaded.append(filename)
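To restrict this to the '*Elig.xlsx' pattern from the question, the names can be filtered locally before MDTM is sent. This is only a sketch: newest_matching is a hypothetical helper, and it assumes the server answers MDTM with the usual "213 YYYYMMDDHHMMSS" reply, which sorts correctly as a string.

```python
from fnmatch import fnmatchcase


def newest_matching(ftp, pattern):
    """Return the name of the most recently modified file matching `pattern`,
    or None if nothing matches."""
    newest = None
    for filename in ftp.nlst():
        # Match on the client side, so spaces in names need no special handling.
        if not fnmatchcase(filename, pattern):
            continue
        response = ftp.sendcmd("MDTM " + filename)
        mtime = response.split(None, 1)[1].strip()  # drop the "213" reply code
        if newest is None or mtime > newest[0]:
            newest = (mtime, filename)
    return None if newest is None else newest[1]
```

The returned name can be passed to ftp.retrbinary('RETR ' + name, ...) exactly as in the loop above.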
The issue is that the FTP "LIST" command returns text meant for humans, whose format depends on the FTP server implementation.
Using PyFilesystem (in place of the standard ftplib) gives you a listing API (search for "walk") that returns Pythonic structures describing the files and directories hosted on the FTP server.
http://pyfilesystem2.readthedocs.io/en/latest/index.html

Trouble making a good ftp checking python program

Please bear with me; I started with Python only a week ago. Here's the problem: I want to connect to an FTP server. Assume the file structure on my FTP server and in my local directory is the same. I want my Python program to do the following:
1> On running the program, it should upload all the files that are NOT on the server but are in my local directory (upload just the missing files, not replace everything).
Say I add a new directory or a new file; it should be uploaded to the server as it is.
2> It should then check the modified times of the files on both the local machine and the server, and report which is the latest.
Now, I have made two programs:
1> One program that uploads ALL files from local to server, as is. I would rather it checked for missing files and then uploaded just the missing files and folders, not replace everything.
2> A second program that lists all files locally using os.walk and uploads all of them to the server without creating the correct directory structure.
All of them get copied to the root of the server. It then also checks the modified times.
I am in a mess right now trying to JOIN these two modules into one that does everything I want. If anyone could look at this code and try joining them to do what I want, that would be perfect. Sorry for being such a pain!!
PS: I might not have done everything the easy way!!
Code No 1:
import sys
from ftplib import FTP
import os
def uploadDir(localdir,ftp):
    """
    for each directory in an entire tree
    upload simple files, recur into subdirectories
    """
    localfiles = os.listdir(localdir)
    for localname in localfiles:
        localpath = os.path.join(localdir, localname)
        print ('uploading', localpath, 'to', localname)
        if not os.path.isdir(localpath):
            os.chdir(localdir)
            ftp.storbinary('STOR '+localname, open(localname, 'rb'))
        else:
            try:
                ftp.mkd(localname)
                print ('directory created')
            except:
                print ('directory not created')
            ftp.cwd(localname) # change remote dir
            uploadDir(localpath,ftp) # upload local subdir
            ftp.cwd('..') # change back up
            print ('directory exited')
def Connect(path):
    ftp = FTP("127.0.0.1")
    print ('Logging in.')
    ftp.login('User', 'Pass')
    uploadDir(path,ftp)
Connect("C:\\temp\\NVIDIA\\Test")
Code No 2:
import os,pytz,smtplib
import time
from ftplib import FTP
from datetime import datetime,timedelta
from email.mime.text import MIMEText
def Connect_FTP(fileName,pathToFile): # path from the local path
    (dir,file) = os.path.split(fileName)
    fo = open("D:\\log.txt", "a+") # LOgging Important Events
    os.chdir(dir)
    ftp = FTP("127.0.0.1")
    print ('Logging in.')
    ftp.login('User', 'Pass')
    l=dir.split(pathToFile)
    print(l[1])
    if file in ftp.nlst():
        print("file2Check:"+file)
        fo.write(str(datetime.now())+": File is in the Server. Checking the Versions....\n")
        Check_Latest(file,fileName,ftp,fo)
    else:
        print("File is not on the Server. So it is being uploaded!!!")
        fo.write(str(datetime.now())+": File is NOT in the Server. It is being Uploaded NOW\n")
        ftp.storbinary('STOR '+file, open(file, 'rb'))
    print("The End")
def Check_Latest(file2,path_on_local,ftp,fo): # Function to check the latest file, USING the "MOdified TIme"
    LOcalFile = os.path.getmtime(path_on_local)
    dloc=datetime.fromtimestamp(LOcalFile).strftime("%d %m %Y %H:%M:%S")
    print("Local File:"+dloc)
    localTimestamp=str(time.mktime(datetime.strptime(dloc, "%d %m %Y %H:%M:%S").timetuple())) # Timestamp to compare LOcalTime
    modifiedTime = ftp.sendcmd('MDTM '+file2) # Using MDTM command to get the MOdified time.
    IST = pytz.timezone('Asia/Kolkata')
    ServTime=datetime.strptime(modifiedTime[4:], "%Y%m%d%H%M%S")
    tz = pytz.timezone("UTC")
    ServTime = tz.localize(ServTime)
    j=str(ServTime.astimezone(IST)) # Changing TimeZone
    g=datetime.strptime(j[:-6],"%Y-%m-%d %H:%M:%S")
    ServerTime = g.strftime('%d %m %Y %H:%M:%S')
    serverTimestamp=str(time.mktime(datetime.strptime(ServerTime, "%d %m %Y %H:%M:%S").timetuple())) # Timestamp to compare Servertime
    print("ServerFile:"+ServerTime)
    if serverTimestamp > localTimestamp:
        print ("Old Version on The Client. You need to update your copy of the file")
        fo.write(str(datetime.now())+": Old Version of the file "+file2+" on the Client\n")
        return
    else:
        print ("The File on the Server is Outdated!!!New COpy Uploaded")
        fo.write(str(datetime.now())+": The server has an outdated file: "+file2+". An email is being generated\n")
        ftp.storbinary('STOR '+file2, open(file2, 'rb'))
def Connect(pathToFile):
    for path, subdirs, files in os.walk(pathToFile):
        for name in files:
            j=os.path.join(path, name)
            print(j)
            Connect_FTP(j,pathToFile)
Connect("C:\\temp\\NVIDIA\\Test")
Maybe this script will be useful.
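As one possible way to merge the two programs for the "upload only what is missing" part, the recursive walk from Code No 1 can consult the remote listing before each upload. This is a sketch under assumptions: upload_missing is a hypothetical name, the session already sits in the remote directory that mirrors localdir, ftp.nlst() returns bare names, and a server that rejects NLST on an empty directory is treated as having no entries.

```python
import os
from ftplib import error_perm


def upload_missing(ftp, localdir):
    """Upload files and directories under localdir that the server lacks.
    Returns the uploaded files as paths relative to localdir."""
    try:
        remote_names = set(ftp.nlst())
    except error_perm:
        remote_names = set()  # assumption: the server signals an empty directory
    uploaded = []
    for name in sorted(os.listdir(localdir)):
        localpath = os.path.join(localdir, name)
        if os.path.isdir(localpath):
            if name not in remote_names:
                ftp.mkd(name)          # create the missing remote directory
            ftp.cwd(name)              # descend on the server as well
            for sub in upload_missing(ftp, localpath):
                uploaded.append(name + "/" + sub)
            ftp.cwd("..")              # come back up
        elif name not in remote_names:
            with open(localpath, "rb") as fh:
                ftp.storbinary("STOR " + name, fh)
            uploaded.append(name)
    return uploaded
```

Files that already exist on both sides are skipped here; that is the point where a modified-time comparison along the lines of Check_Latest could be called.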
