How do I determine the most recently modified file from an FTP directory listing? Locally I used the max function on the Unix timestamp, but the FTP listing is harder to parse. The fields of each line are separated only by spaces.
from ftplib import FTP
ftp = FTP('ftp.cwi.nl')
ftp.login()
data = []
ftp.dir(data.append)
ftp.quit()
for line in data:
    print(line)
output:
drwxrwsr-x 5 ftp-usr pdmaint 1536 Mar 20 09:48 .
dr-xr-srwt 105 ftp-usr pdmaint 1536 Mar 21 14:32 ..
-rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 INDEX
Just to make some corrections:
date_str = ' '.join(line.split()[5:8])
time.strptime(date_str, '%b %d %H:%M') # import time
And to find the most recent file:
file_dict = {}
for line in data:
    col_list = line.split()
    date_str = ' '.join(col_list[5:8])
    # datePattern is a compiled regex, defined elsewhere, that selects the file names of interest
    if datePattern.search(col_list[8]):
        file_dict[time.strptime(date_str, '%b %d %H:%M')] = col_list[8]
# the dict keys are the parsed dates, so max() over the dict gives the newest one
s = file_dict[max(file_dict)]
print(s)
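A minimal sketch of the same idea, assuming a Unix-style listing like the one in the question. The function name newest_entry and its year argument are made up for illustration: the listing omits the year for recent files, so the caller has to supply one, and older entries that show a year instead of a time are parsed separately. The README line in the sample is hypothetical.

```python
from datetime import datetime


def newest_entry(listing_lines, year):
    """Return the name of the most recently modified regular file, or None."""
    newest_name, newest_time = None, None
    for line in listing_lines:
        # Split into at most 9 fields so file names containing spaces stay intact.
        cols = line.split(None, 8)
        if len(cols) < 9 or not cols[0].startswith('-'):
            continue  # skip directories, links and malformed lines
        month, day, time_or_year, name = cols[5], cols[6], cols[7], cols[8]
        if ':' in time_or_year:
            # Recent entry: "Mar 20 09:48", the year is not in the listing.
            text = '%s %s %s %s' % (year, month, day, time_or_year)
            stamp = datetime.strptime(text, '%Y %b %d %H:%M')
        else:
            # Older entry: "Mar 20 2019", the time is not in the listing.
            text = '%s %s %s' % (time_or_year, month, day)
            stamp = datetime.strptime(text, '%Y %b %d')
        if newest_time is None or stamp > newest_time:
            newest_name, newest_time = name, stamp
    return newest_name


listing = [
    'drwxrwsr-x 5 ftp-usr pdmaint 1536 Mar 20 09:48 .',
    'dr-xr-srwt 105 ftp-usr pdmaint 1536 Mar 21 14:32 ..',
    '-rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 INDEX',
    '-rw-r--r-- 1 ftp-usr pdmaint 100 Mar 21 10:00 README',  # hypothetical entry
]
print(newest_entry(listing, 2024))  # README
```

Passing the year explicitly also avoids parsing a day and month with no year, which fails for Feb 29.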
If the FTP server supports the MLSD command (and quite possibly it does), you can use the FTPDirectory class from an answer to a related question.
Create an ftplib.FTP instance (e.g. aftp) and an FTPDirectory instance (e.g. aftpdir), connect to the server, .cwd to the directory you want, and read the files using aftpdir.getdata(aftp). After that, you can get the name of the most recent file with:
import operator
max(aftpdir, key=operator.attrgetter('mtime')).name
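On Python 3.3 and later, ftplib.FTP has its own mlsd() method, which yields (name, facts) pairs, so a similar result is possible without a helper class. The sketch below is only an illustration: newest_from_mlsd is a made-up name, the sample entries are invented, and it assumes the server reports the "type" and "modify" facts (the latter as a YYYYMMDDHHMMSS string, which sorts correctly as text).

```python
def newest_from_mlsd(entries):
    """Return the name of the newest regular file among (name, facts) pairs."""
    files = [
        (facts['modify'], name)
        for name, facts in entries
        if facts.get('type') == 'file' and 'modify' in facts
    ]
    if not files:
        return None
    # Tuples compare on the timestamp string first, then on the name.
    return max(files)[1]


# Invented sample shaped like the output of FTP.mlsd()
sample = [
    ('.', {'type': 'cdir', 'modify': '20120321143200'}),
    ('INDEX', {'type': 'file', 'modify': '20120320094800', 'size': '5305'}),
    ('README', {'type': 'file', 'modify': '20120321100000', 'size': '100'}),
]
print(newest_from_mlsd(sample))  # README

# Against a real server it would be used roughly like this:
# from ftplib import FTP
# with FTP('ftp.cwi.nl') as ftp:
#     ftp.login()
#     print(newest_from_mlsd(ftp.mlsd()))
```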
To parse the date, you can use (from Python 2.5 onwards):
datetime.datetime.strptime('Mar 21 14:32', '%b %d %H:%M')
You can split each line and get the date:
date_str = ' '.join(line.split(' ')[5:8])
Then parse the date (check out egenix mxDateTime package, specifically the DateTimeFromString function) to get comparable objects.
Related
I have a large text file on the web that I am using requests to obtain and parse data from. The text file begins each line with a format like [Mon Oct 10 08:58:26 2022]. How can I get the latest 7 days or convert only the datetime to an object or string for storing and parsing later? I simply want to extract the timestamps from the log and print them
You can use TimedRotatingFileHandler to keep daily or 7-day logs.
See the documentation on the timed rotating file handler and on extracting timestamps from files.
Can you tell me if this snippet solves your problem?
from datetime import datetime
log_line = "[Sun Oct 09 06:14:26 2022] Wiladoc is browsing your wares."
_datetime = log_line[1:25]
_datetime_strp = datetime.strptime(_datetime, '%a %b %d %H:%M:%S %Y')
print(_datetime)
print(_datetime_strp)
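To cover the "latest 7 days" part of the question, the same strptime call can be applied to every line and compared against a cutoff. This is a rough sketch: recent_lines and STAMP_RE are names invented here, the sample lines are made up, and it assumes every relevant line starts with a bracketed timestamp in the format shown above.

```python
import re
from datetime import datetime, timedelta

# Matches a leading "[Mon Oct 10 08:58:26 2022]" and captures the text inside.
STAMP_RE = re.compile(r'^\[(\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4})\]')


def recent_lines(lines, now, days=7):
    """Return (timestamp, line) pairs no older than `days` before `now`."""
    cutoff = now - timedelta(days=days)
    kept = []
    for line in lines:
        match = STAMP_RE.match(line)
        if not match:
            continue  # line has no leading timestamp
        stamp = datetime.strptime(match.group(1), '%a %b %d %H:%M:%S %Y')
        if stamp >= cutoff:
            kept.append((stamp, line))
    return kept


sample_log = [
    '[Sat Oct 01 08:00:00 2022] An older entry.',
    '[Mon Oct 10 08:58:26 2022] A recent entry.',
    'A line without a timestamp.',
]
for stamp, line in recent_lines(sample_log, now=datetime(2022, 10, 10, 12, 0, 0)):
    print(stamp.isoformat(), line)
```

With requests, the lines would come from something like response.text.splitlines(), and now would be datetime.now().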
Is it possible to pick up the latest file in my folder?
I currently type in the month to pick the file I want, but I am not sure how to pick up the most recently modified file instead.
I am confused about how to use max here, because the date I input is used twice: I want to use it to find the latest smalldate folder and then pick up the file in that folder.
({smalldate} is the parent folder and {monthyear} is part of the file name.)
import os
import pandas as pd
monthyear = input("Enter Full Month and Year: ") # e.g. August 2021
smalldate = pd.to_datetime(monthyear, format="%B %Y").strftime("%b %Y")
userid = str(os.getlogin())
gco = (rf"C:\Users\{userid}\OneDrive\Report\{smalldate}\Detail Report - {monthyear}.xlsx")
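One possible approach, sketched with made-up helper names (latest_month_folder and latest_file): pick the newest parent folder by parsing its name with the same "%b %Y" format, then use max with os.path.getmtime as the key to pick the most recently modified file inside it. It assumes the month folders are named exactly like smalldate, for example "Aug 2021".

```python
import glob
import os
from datetime import datetime


def latest_month_folder(root):
    """Return the path of the sub-folder whose 'Mon YYYY' name is the latest."""
    dated = []
    for name in os.listdir(root):
        try:
            dated.append((datetime.strptime(name, '%b %Y'), name))
        except ValueError:
            continue  # not a month folder
    if not dated:
        return None
    return os.path.join(root, max(dated)[1])


def latest_file(folder, pattern='*.xlsx'):
    """Return the most recently modified file in folder matching pattern."""
    candidates = glob.glob(os.path.join(folder, pattern))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


# Usage, with the report root from the question:
# root = rf"C:\Users\{os.getlogin()}\OneDrive\Report"
# print(latest_file(latest_month_folder(root)))
```

Sorting by the parsed date rather than by folder name matters because "Sep 2021" sorts after "Oct 2021" alphabetically.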
I am currently working on a programme within the Django environment which operates off a JSON API provided by a third party. There is an object within that API which I want; however, the string it provides contains more information than I need.
The data I want is the created_at tag from the twitter api using tweepy. This created_at contains data in the following format:
"created_at": "Mon Aug 27 17:21:03 +0000 2012"
This is all fine; however, it returns the date AND time, whereas I simply want the time part of the above example, i.e. 17:21:03.
Is there any way I can just take this part of the created_at response string and store it in a separate variable?
You can use the dateutil module
from dateutil import parser
created_at = "Mon Aug 27 17:21:03 +0000 2012"
created_at = parser.parse(created_at)
print(created_at.time())
Output:
17:21:03
Try the code below.
my_datetime = response_from_twitter['created_at']
my_time = my_datetime.split(' ')[3]
# my_time will now contain time part.
You could just split the string into a list and take the 4th element:
time = source['created_at'].split(' ')[3]
What about a regular expression with re.search()?
>>> import re
>>> d = {"created_at": "Mon Aug 27 17:21:03 +0000 2012"}
>>> re.search(r'\d{2}:\d{2}:\d{2}', d['created_at']).group(0)
'17:21:03'
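For completeness, a standard-library variant of the parsing approach, assuming Python 3 (where strptime understands %z) and an English locale for the day and month names. Keeping the parsed datetime around can be handy if the date is needed later as well.

```python
from datetime import datetime

created_at = "Mon Aug 27 17:21:03 +0000 2012"
# %z consumes the "+0000" offset, so the result is timezone-aware.
parsed = datetime.strptime(created_at, '%a %b %d %H:%M:%S %z %Y')
time_part = parsed.time()
print(time_part)  # 17:21:03
```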
I've been struggling with this problem for a bit, I am trying to create a program that will create a datetime object based on the current date and time, create a second such object from our file data, find the difference between the two, and if it is greater than 10 minutes search for a "handshake file", which is a file we receive back when our file has successfully loaded. If we don't find that file, I want to kick out an error email.
My problem lies in being able to capture the result of my ls command in a meaningful way where I would be able to parse through it and see if the correct file exists. Here is my code:
"""
This module will check the handshake files sent by Pivot based on the following conventions:
- First handshake file (loaded to the CFL, *auditv2*): Check every half-hour
- Second handshake file (proofs are loaded and available, *handshake*): Check every 2 hours
"""
import smtplib
from email.mime.text import MIMEText
from datetime import datetime, timedelta
from csv import DictReader
from subprocess import *
from os import chdir
from glob import glob
def main():
    audit_in = '/prod/bcs/lgnp/clientapp/csvbill/audit_process/lgnp.smr.csv0000.audit.qty'
    with open(audit_in, 'rbU') as audit_qty:
        my_audit_reader = DictReader(audit_qty, delimiter=';', restkey='ignored')
        my_audit_reader.fieldnames = ("Property Code",
                                      "Pivot ID",
                                      "Inwork File",
                                      "Billing Manager E-mail",
                                      "Total Records",
                                      "Number of E-Bills",
                                      "Printed Records",
                                      "File Date",
                                      "Hour",
                                      "Minute",
                                      "Status")
        # Get current time to reconcile against
        now = datetime.now()
        # Change internal directory to location of handshakes
        chdir('/prod/bcs/lgnp/input')
        for line in my_audit_reader:
            piv_id = line['Pivot ID']
            status = line['Status']
            file_date = datetime(int(line['File Date'][:4]),
                                 int(line['File Date'][4:6]),
                                 int(line['File Date'][6:8]),
                                 int(line['Hour']),
                                 int(line['Minute']))
            # print(file_date)
            if status == 's':
                diff = now - file_date
                print diff
                print piv_id
                if 10 < (diff.seconds / 60) < 30:
                    proc = Popen('ls -lh *{0}*'.format(status),
                                 shell=True)  # figure out how to get output
                    print proc
def send_email(recipient_list):
    msg = MIMEText('Insert message here')
    msg['Subject'] = 'Alert!! Handshake files missing!'
    msg['From'] = r'xxx#xxx.com'
    msg['To'] = recipient_list
    s = smtplib.SMTP(r'xxx.xxx.xxx')
    s.sendmail(msg['From'], msg['To'], msg.as_string())
    s.quit()
if __name__ == '__main__':
    main()
Parsing ls output is not the best solution here. You can certainly do it, by parsing the result of subprocess.check_output or in some other way, but let me offer some advice.
If you find yourself parsing another program's output or logs to solve a standard problem, that is a good sign something is going wrong. Please consider other solutions, such as the ones below:
If the only thing you want is to see the contents of the directory use os.listdir like:
my_home_files = os.listdir(os.path.expanduser('~/my_dir')) # surely it's cross-platform
Now you have a list of files in your my_home_files variable.
You can filter them any way you want, or use glob.glob to match with metacharacters like this:
glob.glob("/home/me/handshake-*.txt") # will output everything matching the expression
# (say you have ids in your filenames).
After that you may want to check some stats of the file (such as the date of last access); consider using os.stat:
os.stat(my_home_files[0]) # outputs stats of the first file; os.listdir returns bare names, so join them with the directory path unless it is the current directory
# posix.stat_result(st_mode=33104, st_ino=140378115, st_dev=3306L, st_nlink=1, st_uid=23449, st_gid=59216, st_size=1442, st_atime=1421834474, st_mtime=1441831745, st_ctime=1441234474)
# see the os.stat documentation to understand how to read these fields
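Putting glob.glob and os.stat together, a check for a recent handshake file could look roughly like the sketch below. The function name recent_matches, its parameters, and the file pattern in the usage comment are assumptions made for illustration; the question does not show the real handshake file names.

```python
import glob
import os
import time


def recent_matches(directory, pattern, max_age_seconds, now=None):
    """Return sorted paths in directory matching pattern, modified recently enough."""
    now = time.time() if now is None else now
    hits = []
    for path in glob.glob(os.path.join(directory, pattern)):
        # st_mtime is the last modification time in seconds since the epoch.
        age = now - os.stat(path).st_mtime
        if age <= max_age_seconds:
            hits.append(path)
    return sorted(hits)


# Usage, with a hypothetical pattern built from the Pivot ID:
# found = recent_matches('/prod/bcs/lgnp/input', '*{0}*'.format(piv_id), 30 * 60)
# if not found:
#     send_email(recipient_list)
```

Because glob.glob returns full paths when given one, there is no need to chdir into the directory first.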
I'm trying to write imapsync-style software that connects to account1#host1.com on host1 and copies messages and folders to account2#host2.com on host2.
Supposing I've already fetched the selected message by its UID with:
msg = connection.fetch(idOne, '(RFC822)')
and msg is a valid message, here is the code I've tried for appending the message:
date = connection.fetch(idOne, '(INTERNALDATE)')[1][0]
date = date.split("\"")[1]
authconnection1.append(folder, "", date, msg)
Error is:
ValueError: date_time not of a known type
I've tried many other possible solutions (using dateutil to convert the date string to a datetime object, or using imaplib.Time2Internaldate, which gives the same ValueError: date_time not of a known type), but none of them seems to work. I've searched the web, but nobody else seems to have this issue.
Any ideas? I'm getting very frustrated with it...
Thank you very much
Update:
I've resolved the date_time issue: the append method of imaplib wants date_time to be an integer number of seconds, so to retrieve the date I've written this code:
# Fetch message from host1
msg = connection.fetch(idOne, '(RFC822)')
# retrieve this message internal date
date = connection.fetch(idOne, '(INTERNALDATE)')[1][0].split("\"")[1]
# Cast str date to datetime object
date = parser.parse(date)
# Removes tzinfo
date = date.replace(tzinfo=None)
# Calculates total seconds between today and the message internal date
date = (datetime.datetime.now()-date).total_seconds()
# Makes the append of the message
authconnection1.append(folder, "", date, msg)
But now this fails with error:
TypeError: expected string or buffer
So the issue is only changed...
Any ideas?
Update (RESOLVED):
imaplib was not behaving as I expected, so I wrote a workaround for appending messages with the right date/time. This is my code; I hope it helps others:
Function to convert the date to the right format:
def convertDate(date):
    from time import struct_time
    import datetime
    from dateutil import parser
    date = parser.parse(date)
    date = date.timetuple()
    return date
Main code:
#Get message
msg = authconnection.fetch(idOne, '(RFC822)')
#Get message date
date = authconnection.fetch(idOne, '(INTERNALDATE)')[1][0].split("\"")[1]
#Apply my function I've written above
date = convertDate(date)
#Append message with right date
authconnection1.append(folder, flags, date, msg[1][0][1])
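As an alternative that stays within the standard library, imaplib ships Internaldate2tuple, which parses an INTERNALDATE response into a time.struct_time that append accepts as date_time. The sketch below assumes Python 3, where fetch returns bytes; internaldate_for_append is a made-up name and the sample response is invented to match the shape of a real one.

```python
import imaplib
import time


def internaldate_for_append(fetch_item):
    """Turn one INTERNALDATE fetch item (bytes) into a struct_time for append()."""
    parsed = imaplib.Internaldate2tuple(fetch_item)
    if parsed is None:
        raise ValueError('no INTERNALDATE found in %r' % (fetch_item,))
    return parsed


# Invented item shaped like connection.fetch(idOne, '(INTERNALDATE)')[1][0]
sample = b'1 (INTERNALDATE "27-Aug-2012 17:21:03 +0000")'
stamp = internaldate_for_append(sample)
# The struct_time is expressed in the local time zone of the machine running this.
print(time.strftime('%Y-%m-%d %H:%M:%S', stamp))
# What imaplib would send to the server for this date:
print(imaplib.Time2Internaldate(stamp))

# Appending would then look like the last line of the workaround above:
# authconnection1.append(folder, flags, stamp, msg[1][0][1])
```

Unlike timetuple() on a dateutil result, this conversion takes the zone offset in the response into account when building the local time.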