Can anyone suggest a way in python to do logging with:
log rotation every day
compression of logs when they're rotated
optional - delete oldest log file to preserve X MB of free space
optional - sftp log files to server
Thanks for any responses,
Fred
log rotation every day: Use a TimedRotatingFileHandler
compression of logs: Set the encoding='bz2' parameter. (Note this "trick" will only work for Python 2. 'bz2' is no longer considered an encoding in Python 3.)
optional - delete oldest log file to preserve X MB of free space.
You could (indirectly) arrange this using a RotatingFileHandler. By setting the maxBytes parameter, the log file will roll over when it reaches a certain size. By setting the backupCount parameter, you can control how many rollovers are kept. The two parameters together allow you to control the maximum space consumed by the log files. You could probably subclass TimedRotatingFileHandler to incorporate this behavior into it as well.
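For instance, a minimal sketch of both ideas (the bz2 trick is Python 2 only; the file names are illustrative):

import logging.handlers

# daily rotation, compressed via the encoding trick (Python 2 only)
timed = logging.handlers.TimedRotatingFileHandler(
    'daily.log', when='midnight', backupCount=7, encoding='bz2')

# size-bounded rotation: at most ~(backupCount + 1) * maxBytes on disk
sized = logging.handlers.RotatingFileHandler(
    'sized.log', maxBytes=10 * 1024 * 1024, backupCount=5)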
Just for fun, here is how you could subclass TimedRotatingFileHandler. When you run the script below, it will write log files to /tmp/log_rotate*.
With a small value for time.sleep (such as 0.1), the log files fill up quickly, reach the maxBytes limit, and are then rolled over.
With a large time.sleep (such as 1.0), the log files fill up slowly, the maxBytes limit is not reached, but they roll over anyway when the timed interval (of 10 seconds) is reached.
All the code below comes from logging/handlers.py. I simply merged TimedRotatingFileHandler with RotatingFileHandler in the most straightforward way possible.
import time
import re
import os
import stat
import logging
import logging.handlers as handlers


class SizedTimedRotatingFileHandler(handlers.TimedRotatingFileHandler):
    """
    Handler for logging to a set of files, which switches from one file
    to the next when the current file reaches a certain size, or at certain
    timed intervals
    """
    def __init__(self, filename, maxBytes=0, backupCount=0, encoding=None,
                 delay=0, when='h', interval=1, utc=False):
        handlers.TimedRotatingFileHandler.__init__(
            self, filename, when, interval, backupCount, encoding, delay, utc)
        self.maxBytes = maxBytes

    def shouldRollover(self, record):
        """
        Determine if rollover should occur.

        Basically, see if the supplied record would cause the file to exceed
        the size limit we have.
        """
        if self.stream is None:  # delay was set...
            self.stream = self._open()
        if self.maxBytes > 0:  # are we rolling over?
            msg = "%s\n" % self.format(record)
            # due to non-posix-compliant Windows feature
            self.stream.seek(0, 2)
            if self.stream.tell() + len(msg) >= self.maxBytes:
                return 1
        t = int(time.time())
        if t >= self.rolloverAt:
            return 1
        return 0


def demo_SizedTimedRotatingFileHandler():
    log_filename = '/tmp/log_rotate'
    logger = logging.getLogger('MyLogger')
    logger.setLevel(logging.DEBUG)
    handler = SizedTimedRotatingFileHandler(
        log_filename, maxBytes=100, backupCount=5,
        when='s', interval=10,
        # encoding='bz2',  # uncomment for bz2 compression
    )
    logger.addHandler(handler)
    for i in range(10000):
        time.sleep(0.1)
        logger.debug('i=%d' % i)


demo_SizedTimedRotatingFileHandler()
Another way to compress the log file during rotation (new in Python 3.3) is to use the rotator class attribute of BaseRotatingHandler (and all classes that inherit from it), for example:
import gzip
import os
import logging
import logging.handlers


class GZipRotator:
    def __call__(self, source, dest):
        os.rename(source, dest)
        f_in = open(dest, 'rb')
        f_out = gzip.open("%s.gz" % dest, 'wb')
        f_out.writelines(f_in)
        f_out.close()
        f_in.close()
        os.remove(dest)


logformatter = logging.Formatter('%(asctime)s;%(levelname)s;%(message)s')
log = logging.handlers.TimedRotatingFileHandler('debug.log', 'midnight', 1, backupCount=5)
log.setLevel(logging.DEBUG)
log.setFormatter(logformatter)
log.rotator = GZipRotator()

logger = logging.getLogger('main')
logger.addHandler(log)
logger.setLevel(logging.DEBUG)

....
You can see more here.
In addition to unutbu's answer: here's how to modify the TimedRotatingFileHandler to compress using zip files.
import logging
import logging.handlers
import zipfile
import codecs
import sys
import os
import time
import glob


class TimedCompressedRotatingFileHandler(logging.handlers.TimedRotatingFileHandler):
    """
    Extended version of TimedRotatingFileHandler that compresses logs on rollover.
    """
    def doRollover(self):
        """
        Do a rollover; in this case, a date/time stamp is appended to the filename
        when the rollover happens. However, you want the file to be named for the
        start of the interval, not the current time. If there is a backup count,
        then we have to get a list of matching filenames, sort them and remove
        the one with the oldest suffix.
        """
        self.stream.close()
        # get the time that this sequence started at and make it a TimeTuple
        t = self.rolloverAt - self.interval
        timeTuple = time.localtime(t)
        dfn = self.baseFilename + "." + time.strftime(self.suffix, timeTuple)
        if os.path.exists(dfn):
            os.remove(dfn)
        os.rename(self.baseFilename, dfn)
        if self.backupCount > 0:
            # find the oldest log file and delete it
            s = glob.glob(self.baseFilename + ".20*")
            if len(s) > self.backupCount:
                s.sort()
                os.remove(s[0])
        # print "%s -> %s" % (self.baseFilename, dfn)
        if self.encoding:
            self.stream = codecs.open(self.baseFilename, 'w', self.encoding)
        else:
            self.stream = open(self.baseFilename, 'w')
        self.rolloverAt = self.rolloverAt + self.interval
        if os.path.exists(dfn + ".zip"):
            os.remove(dfn + ".zip")
        file = zipfile.ZipFile(dfn + ".zip", "w")
        file.write(dfn, os.path.basename(dfn), zipfile.ZIP_DEFLATED)
        file.close()
        os.remove(dfn)


if __name__ == '__main__':
    # Demo of using TimedCompressedRotatingFileHandler() to log every 5 seconds,
    # to one uncompressed file and five rotated and compressed files
    os.nice(19)  # I always nice test code
    logHandler = TimedCompressedRotatingFileHandler(
        "mylog", when="S", interval=5,
        backupCount=5)  # Total of six rotated log files, rotating every 5 secs
    logFormatter = logging.Formatter(
        fmt='%(asctime)s.%(msecs)03d %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    logHandler.setFormatter(logFormatter)
    mylogger = logging.getLogger('MyLogRef')
    mylogger.addHandler(logHandler)
    mylogger.setLevel(logging.DEBUG)

    # Write lines non-stop into the logger and rotate every 5 seconds
    ii = 0
    while True:
        mylogger.debug("Test {0}".format(ii))
        ii += 1
Be warned: the class signatures have changed in Python 3. Here is my working example for Python 3.6:
import logging.handlers
import os
import zlib


def namer(name):
    return name + ".gz"


def rotator(source, dest):
    print(f'compressing {source} -> {dest}')
    with open(source, "rb") as sf:
        data = sf.read()
        compressed = zlib.compress(data, 9)
        with open(dest, "wb") as df:
            df.write(compressed)
    os.remove(source)


err_handler = logging.handlers.TimedRotatingFileHandler(
    '/data/errors.log', when="M", interval=1,
    encoding='utf-8', backupCount=30, utc=True)
err_handler.rotator = rotator
err_handler.namer = namer

logger = logging.getLogger("Rotating Log")
logger.setLevel(logging.ERROR)
logger.addHandler(err_handler)
I guess it's too late to join the party, but here is what I did. I created a new class inheriting from the logging.handlers.RotatingFileHandler class and added a couple of lines to gzip the file before moving it.
https://github.com/rkreddy46/python_code_reference/blob/master/compressed_log_rotator.py
#!/usr/bin/env python

# Import all the needed modules
import logging.handlers
import sys
import time
import gzip
import os
import shutil
import random
import string

__version__ = 1.0
__descr__ = "This logic is written keeping in mind UNIX/LINUX/OSX platforms only"


# Create a new class that inherits from RotatingFileHandler. This is where we
# add the new feature to compress the logs
class CompressedRotatingFileHandler(logging.handlers.RotatingFileHandler):
    def doRollover(self):
        """
        Do a rollover, as described in __init__().
        """
        if self.stream:
            self.stream.close()
        if self.backupCount > 0:
            for i in range(self.backupCount - 1, 0, -1):
                sfn = "%s.%d.gz" % (self.baseFilename, i)
                dfn = "%s.%d.gz" % (self.baseFilename, i + 1)
                if os.path.exists(sfn):
                    # print "%s -> %s" % (sfn, dfn)
                    if os.path.exists(dfn):
                        os.remove(dfn)
                    os.rename(sfn, dfn)
            dfn = self.baseFilename + ".1.gz"
            if os.path.exists(dfn):
                os.remove(dfn)
            # These two lines below are the only new lines. I commented out the
            # os.rename(self.baseFilename, dfn) and replaced it with these two lines.
            with open(self.baseFilename, 'rb') as f_in, gzip.open(dfn, 'wb') as f_out:
                shutil.copyfileobj(f_in, f_out)
            # os.rename(self.baseFilename, dfn)
            # print "%s -> %s" % (self.baseFilename, dfn)
        self.mode = 'w'
        self.stream = self._open()


# Specify which file will be used for our logs
log_filename = "/Users/myname/Downloads/test_logs/sample_log.txt"

# Create a logger instance and set the facility level
my_logger = logging.getLogger()
my_logger.setLevel(logging.DEBUG)

# Create a handler using our new class that rotates and compresses
file_handler = CompressedRotatingFileHandler(filename=log_filename, maxBytes=1000000, backupCount=10)

# Create a stream handler that shows the same log on the terminal (just for debug purposes)
view_handler = logging.StreamHandler(stream=sys.stdout)

# Add all the handlers to the logging instance
my_logger.addHandler(file_handler)
my_logger.addHandler(view_handler)

# This is optional to beef up the logs (range instead of Python 2's xrange)
random_huge_data = "".join(random.choice(string.ascii_letters) for _ in range(10000))

# All this code is user-specific, write your own code if you want to play around
count = 0
while True:
    my_logger.debug("This is the message number %s" % str(count))
    my_logger.debug(random_huge_data)
    count += 1
    if count % 100 == 0:
        count = 0
        time.sleep(2)
I think the best option is to use the current implementation of TimedRotatingFileHandler and, after the log file has been renamed to the rotated version, just compress it:
import zipfile
import os
from logging.handlers import TimedRotatingFileHandler


class TimedCompressedRotatingFileHandler(TimedRotatingFileHandler):
    """
    Extended version of TimedRotatingFileHandler that compresses logs on rollover.
    """
    def find_last_rotated_file(self):
        dir_name, base_name = os.path.split(self.baseFilename)
        file_names = os.listdir(dir_name)
        result = []
        # we want to find a rotated file with e.g. a filename.2017-12-12... name
        prefix = '{}.20'.format(base_name)
        for file_name in file_names:
            if file_name.startswith(prefix) and not file_name.endswith('.zip'):
                result.append(file_name)
        result.sort()
        # join back onto the directory so the path is valid regardless of cwd
        return os.path.join(dir_name, result[0])

    def doRollover(self):
        super(TimedCompressedRotatingFileHandler, self).doRollover()

        dfn = self.find_last_rotated_file()
        dfn_zipped = '{}.zip'.format(dfn)
        if os.path.exists(dfn_zipped):
            os.remove(dfn_zipped)
        with zipfile.ZipFile(dfn_zipped, 'w') as f:
            f.write(dfn, dfn_zipped, zipfile.ZIP_DEFLATED)
        os.remove(dfn)
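A minimal usage sketch for this handler (file name and schedule are illustrative):

import logging

logger = logging.getLogger('compressed')
logger.setLevel(logging.DEBUG)
logger.addHandler(TimedCompressedRotatingFileHandler('app.log', when='midnight', backupCount=7))
logger.debug('this file will be zipped after the next midnight rollover')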
To copy the file, gzip the copy (named with the epoch time), and then clear out the existing file in a way that won't upset the logging module:
import gzip
import logging
import os
from shutil import copy2
from time import time


def logRoll(logfile_name):
    log_backup_name = logfile_name + '.' + str(int(time()))
    try:
        copy2(logfile_name, log_backup_name)
    except IOError as err:
        logging.debug(' No logfile to roll')
        return
    f_in = open(log_backup_name, 'rb')
    f_out = gzip.open(log_backup_name + '.gz', 'wb')
    f_out.writelines(f_in)
    f_out.close()
    f_in.close()
    os.remove(log_backup_name)
    f = open(logfile_name, 'w')
    f.close()
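A hypothetical way to drive this, e.g. once a day from a scheduler (the file name is illustrative):

import logging

logging.basicConfig(filename='app.log', level=logging.DEBUG)
logRoll('app.log')  # copies app.log aside, gzips the copy, truncates app.log
logging.debug('logging continues into the freshly truncated file')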
Here is my solution (modified from evgenek's): it is simple and does not block Python code while gzipping huge log files:
import os
import subprocess

class GZipRotator:
    def __call__(self, source, dest):
        os.rename(source, dest)
        subprocess.Popen(['gzip', dest])
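You would attach it exactly like the earlier rotator example; a sketch (the handler settings are illustrative):

import logging.handlers

handler = logging.handlers.TimedRotatingFileHandler('debug.log', when='midnight', backupCount=5)
handler.rotator = GZipRotator()  # the external gzip process adds the .gz suffix itself

Since the compression runs in a separate process, a rollover returns immediately even for very large files.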
Below I have added a solution where I basically zip old backup logs into a zip archive with a timestamp on it, using one extra variable called ZipbackupCount (the number of old files to be zipped).

For example, we have logs like this (backupCount = 5 and ZipbackupCount = 2):

a.log.1
a.log.2
a.log.3
a.log.4
a.log.11-09-2020-11-11-11.zip

Once the count of backup logs hits 5, this triggers zipping a.log.5 and a.log.4 into the archive above, and it continues.
import os
import datetime
import gzip
import logging.handlers
import zipfile

from config.config import PROJECT_PATH, LOG_DIR, LOG_FILE_NAME, LOG_FILESIZE


class NewRotatingFileHandler(logging.handlers.RotatingFileHandler):
    def __init__(self, filename, **kws):
        backupCount = kws.get('backupCount', 0)
        self.backup_count = backupCount
        self.ZipbackupCount = kws.pop('ZipbackupCount', 0)
        self.file_name = filename
        self.log_dir = os.path.split(self.file_name)[0]
        self.log_file_name = os.path.split(self.file_name)[-1]
        logging.handlers.RotatingFileHandler.__init__(self, filename, **kws)

    def doArchive(self, old_log):
        # read bytes so they can be written to the binary gzip stream
        with open(old_log, 'rb') as log:
            with gzip.open(old_log + '.gz', 'wb') as comp_log:
                comp_log.writelines(log)
        os.remove(old_log)

    def doRollover(self):
        super(NewRotatingFileHandler, self).doRollover()
        zip_file_name = self.log_file_name + "." + datetime.datetime.now().strftime("%d-%m-%Y-%H-%M-%S") + ".zip"
        if os.path.exists(self.rotation_filename("%s.%d" % (self.baseFilename, self.backupCount))) and self.ZipbackupCount > 0 and self.file_name:
            with zipfile.ZipFile(os.path.join(self.log_dir, zip_file_name), "w", zipfile.ZIP_DEFLATED, allowZip64=True) as zf:
                for i in range(self.backupCount, self.backupCount - self.ZipbackupCount, -1):
                    sfn = self.rotation_filename("%s.%d" % (self.baseFilename, i))
                    if os.path.exists(sfn):
                        zf.write(sfn, "%s.%d" % (self.log_file_name, i))
                        os.remove(sfn)


# handler = NewRotatingFileHandler(filename=os.path.join(PROJECT_PATH, LOG_DIR, LOG_FILE_NAME),
#                                  maxBytes=LOG_FILESIZE, backupCount=5, ZipbackupCount=2)
#
# handler.doRollover()
I see a lot of answers focusing on overriding the doRollover method of the handler class, which is fine and works, but I think a much cleaner and recommended solution is to define the rotator attribute on the handler, which is checked for in the rotate method, rather than overriding doRollover.
import bz2
import logging
import os
from logging.handlers import RotatingFileHandler


def setup_logger():

    def bz2namer(name):
        return name + ".bz2"

    def bzip_rotator(source, dest):
        with open(source, "rb") as sf:
            data = sf.read()
            compressed = bz2.compress(data, 9)
            with open(dest, "wb") as df:
                df.write(compressed)
        os.remove(source)

    log = logging.getLogger("myapp")
    file_handler = RotatingFileHandler(filename="app.log", backupCount=10, maxBytes=10000000)
    file_handler.rotator = bzip_rotator
    file_handler.namer = bz2namer

    _log_format = "%(asctime)s [%(process)d] [%(levelname)s] [%(filename)s: %(funcName)s] - %(message)s "
    log_formatter = logging.Formatter(_log_format, datefmt="%Y-%m-%d %H:%M:%S")
    file_handler.setFormatter(log_formatter)
    file_handler.setLevel(logging.DEBUG)

    log.propagate = False
    log.addHandler(file_handler)
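After calling setup_logger(), anything logged through that named logger goes to the rotating handler, for example:

setup_logger()
logging.getLogger("myapp").warning("app started")

(Note the logger's own level is left unset here, so it inherits the root default of WARNING; add log.setLevel(logging.DEBUG) inside setup_logger if you want DEBUG records through.)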
Since Python 3.3 you can easily extend any subclass of BaseRotatingHandler to implement gzipping:
import os
from logging.handlers import TimedRotatingFileHandler
from gzip import open as gzip_open


class CompressingTimedRotatingFileHandler(TimedRotatingFileHandler):
    """TimedRotatingFileHandler with gzip compression on rotate"""

    def rotation_filename(self, default_name: str) -> str:
        """
        Extend the default filename to use gzip-ending.

        :param default_name: The default filename of this handler
        """
        return default_name + ".gz"

    def rotate(self, source: str, dest: str) -> None:
        """
        Rotate the current log

        :param source: The source filename. This is normally the base
                       filename, e.g. 'test.log'
        :param dest: The destination filename. This is normally
                     what the source is rotated to, e.g. 'test.log.1'.
        """
        # compress source file and write to destination
        with open(source, 'rb') as f_in, gzip_open(dest, 'wb') as f_out:
            f_out.writelines(f_in)
        # delete source file
        os.remove(source)
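A minimal usage sketch (the file name and schedule are illustrative):

import logging

logger = logging.getLogger('gz')
logger.setLevel(logging.DEBUG)
logger.addHandler(CompressingTimedRotatingFileHandler('test.log', when='midnight', backupCount=7))
logger.debug('rotated files are written as e.g. test.log.<date>.gz')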
Related
I am trying to zip logs once the logging handler reaches 10 backup files. For each log file I am checking the file size and creating a new file. Now, once 10 files have been created, the next rollover should zip all the logs, delete all 10 log files, and start a new file until there are 10 files again.
Here is what I tried:
import gzip
import logging.handlers
import os
import shutil


class CompressedRotatingFileHandler(logging.handlers.RotatingFileHandler):
    def doRollover(self):
        """
        Do a rollover, as described in __init__().
        """
        if self.backupCount > 0:
            for i in range(self.backupCount - 1, 0, -1):
                sfn = "%s.%d.gz" % (self.baseFilename, i)
                dfn = "%s.%d.gz" % (self.baseFilename, i + 1)
                if os.path.exists(sfn):
                    # print "%s -> %s" % (sfn, dfn)
                    if os.path.exists(dfn):
                        os.remove(dfn)
                    os.rename(sfn, dfn)
            dfn = self.baseFilename + ".1.gz"
            if os.path.exists(dfn):
                os.remove(dfn)
            # These two lines below are the only new lines. I commented out the
            # os.rename(self.baseFilename, dfn) and replaced it with these two lines.
            with open(self.baseFilename, 'rb') as f_in, gzip.open(dfn, 'wb') as f_out:
                shutil.copyfileobj(f_in, f_out)
            # os.rename(self.baseFilename, dfn)
            # print "%s -> %s" % (self.baseFilename, dfn)
        self.mode = 'w'


if __name__ == '__main__':
    name = 'test.log'
    logging.basicConfig(level=logging.INFO, filename=name, filemode="a+",
                        format="%(asctime)-15s %(levelname)-8s %(message)s")
    log = logging.getLogger()
    handler = CompressedRotatingFileHandler(name, maxBytes=2000, backupCount=10)

    ii = 0
    while True:
        logging.debug("Test {0}".format(ii))
        ii += 1
But I am not getting what I expect.
When you do the rollover, you need to create a .zip file from all the test.log.XX files and delete them. In case you don't have 10 test.log.XX files yet, you should fall back to the normal behavior of creating them.
The doRollover method looks like this:
import zipfile
import glob
import os


def doRollover(self):
    """
    Do a rollover, as described in __init__().
    """
    # This creates the .zip file from the 10 backup files.
    # Count all the test.log.XXX that already exist
    log_files = [
        f"{self.baseFilename}.{i}"
        for i in range(1, self.backupCount + 1)
        if os.path.exists(f"{self.baseFilename}.{i}")
    ]
    # Count all the backup.XXX.zip files that already exist
    # to avoid overwriting one.
    nzip_backups = len(glob.glob('backup.*.zip'))

    # If there are 10 (self.backupCount) test.log.XXX files,
    # create the zip and delete the files
    if len(log_files) == self.backupCount:
        with zipfile.ZipFile(f"backup.{nzip_backups + 1}.zip", "w") as myzip:
            for fn in log_files:
                if os.path.exists(fn):
                    myzip.write(fn, fn.split("/")[-1])
                    os.remove(fn)

    # In all cases, resort to the default behavior
    # (that copies test.log.4 -> test.log.5 etc.). Don't rewrite it,
    # just reuse the parent class' method.
    super().doRollover()
I also encourage you not to use the root logger (that you are using with log = logging.getLogger() and logging.debug("XXX")), but to use a child logger. Here, we will be using the "foobar" logger with logging.getLogger("foobar"). Since it does not already exist, we give it your handler and the correct Formatter.
By piecing everything together, it gives:
import logging
import logging.handlers
import os
import time
import glob
import zipfile


class CompressedRotatingFileHandler(logging.handlers.RotatingFileHandler):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def doRollover(self):
        """
        Do a rollover, as described in __init__().
        """
        # This creates the .zip file from the 10 backup files
        log_files = [
            f"{self.baseFilename}.{i}"
            for i in range(1, self.backupCount + 1)
            if os.path.exists(f"{self.baseFilename}.{i}")
        ]
        # get count of backup.XXX.zip
        nzip_backups = len(glob.glob('backup.*.zip'))

        if len(log_files) == self.backupCount:
            with zipfile.ZipFile(f"backup.{nzip_backups + 1}.zip", "w") as myzip:
                for fn in log_files:
                    if os.path.exists(fn):
                        myzip.write(fn, fn.split("/")[-1])
                        os.remove(fn)
            time.sleep(1)
        super().doRollover()


if __name__ == '__main__':
    name = 'test.log'
    log = logging.getLogger("foobar")
    log.setLevel(logging.DEBUG)
    log.propagate = False
    handler = CompressedRotatingFileHandler(
        name, mode="a+", maxBytes=2000, backupCount=10
    )
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)-15s %(levelname)-8s %(message)s"
        )
    )
    log.addHandler(handler)

    ii = 0
    while True:
        log.info(f"Test {ii}")
        time.sleep(0.3)
        ii += 1
I'm running a certain Python program for days. Every day I want one .log file without backups; basically, I want to overwrite by releasing everything and starting again. I used TimedRotatingFileHandler, but it keeps saving the previous logs. Is there any other way than making a timer thread to remove any extra log files?
If you can live with just one backup file, specify ...,backupCount=1,... when you make an instance.
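For instance (the file name is illustrative):

from logging.handlers import TimedRotatingFileHandler

# keeps the current log plus at most one rotated backup
handler = TimedRotatingFileHandler('daily.log', when='midnight', backupCount=1)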
Otherwise you can subclass TimedRotatingFileHandler and override its getFilesToDelete method.
class MyRFH(TimedRotatingFileHandler):
    """
    Handler for logging to a file, rotating the log file at certain timed
    intervals.

    If backupCount is > 0, when rollover is done, ALL previous logs are deleted.
    """
    def getFilesToDelete(self):
        """
        Determine the files to delete when rolling over.

        Will NOT keep any backup files.
        Modified from Standard Library Class
        """
        dirName, baseName = os.path.split(self.baseFilename)
        fileNames = os.listdir(dirName)
        result = []
        prefix = baseName + "."
        plen = len(prefix)
        for fileName in fileNames:
            if fileName[:plen] == prefix:
                suffix = fileName[plen:]
                if self.extMatch.match(suffix):
                    result.append(os.path.join(dirName, fileName))
        if self.backupCount == 0:
            result = []
##        if len(result) < self.backupCount:
##            result = []
##        else:
##            result.sort()
##            result = result[:len(result) - self.backupCount]
        return result
Adapting an example from the Logging Cookbook
import logging, time, glob, sys, os
import logging.config
import logging.handlers


class MyRFH(logging.handlers.TimedRotatingFileHandler):
    """
    Handler for logging to a file, rotating the log file at certain timed
    intervals.

    If backupCount is > 0, when rollover is done, ALL previous logs are deleted.
    """
    def getFilesToDelete(self):
        """
        Determine the files to delete when rolling over.

        Will NOT keep any backup files.
        Modified from Standard Library Class
        """
        dirName, baseName = os.path.split(self.baseFilename)
        fileNames = os.listdir(dirName)
        result = []
        prefix = baseName + "."
        plen = len(prefix)
        for fileName in fileNames:
            if fileName[:plen] == prefix:
                suffix = fileName[plen:]
                if self.extMatch.match(suffix):
                    result.append(os.path.join(dirName, fileName))
        if self.backupCount == 0:
            result = []
        return result


if __name__ == '__main__':
    LOG_FILENAME = 'logging_rotatingfile_example'

    # Set up a specific logger with our desired output level
    my_logger = logging.getLogger('MyLogger')
    my_logger.setLevel(logging.DEBUG)

    # Add the log message handler to the logger
    handler = MyRFH(LOG_FILENAME, when='s', interval=5, backupCount=1)
    my_logger.addHandler(handler)

    # Log some messages
    for i in range(200):
        my_logger.debug('i = %d' % i)
        time.sleep(.1)

    # See what files are created
    logfiles = glob.glob('%s*' % LOG_FILENAME)
    for filename in logfiles:
        print(filename)
I have a simple piece of code using the logging library. I want to put a summary of the debug output in a file inside a specific path, ./logs/.
In the middle of the code I have INFO messages that should be saved to the .log file, but it does not work yet.
I think I'm getting something very basic wrong, but I don't see it.
# coding=utf-8
import re
import os
import csv
import datetime, timedelta
import logging
import logging.config


def configure_logging(logger):
    # Configure logger with custom formatter.
    logger.setLevel(logging.DEBUG)
    formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")

    # Create file handler which logs DEBUG messages.
    now = datetime.now().strftime('%Y%m%d-%Hh%M')
    logname = './logs/' + now + '.log'
    fh = logging.FileHandler(logname)
    fh.setLevel(logging.DEBUG)
    fh.setFormatter(formatter)

    # Create console handler with a higher log level.
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(formatter)

    # Add handlers to the logger
    logger.addHandler(fh)
    logger.addHandler(ch)


def between(value, a, b):
    pos_a = value.find(a)  # Find and validate before-part.
    if pos_a == -1: return ""
    pos_b = value.rfind(b)  # Find and validate after part.
    if pos_b == -1: return ""
    adjusted_pos_a = pos_a + len(a)  # Return middle part.
    if adjusted_pos_a >= pos_b: return ""
    return value[adjusted_pos_a:pos_b]


def main():
    logger = logging.getLogger('Main')
    configure_logging(logger)
    module_logger = logging.getLogger('Extract Information')

    def scan_folder():
        path = '/Users/anna/PycharmProjects/extractData/DiarioOficial'
        with open('All.csv', 'w') as csvFile:
            headers = ['COMPANY NAME', 'CVE']
            writer = csv.writer(csvFile, delimiter=';')
            writer.writerow(headers)
            for (path, dirnames, file_names) in os.walk(path):
                for file_name in file_names:
                    if file_name.endswith(".txt"):
                        file_path = os.path.join(path, file_name)
                        mensaje = open(file_path).read()
                        module_logger.info('File is opened correct')
                        # Company Name
                        keywords_cap = ['SpA', 'SPA', 'LIMITADA', 'LTDA', 'S.A.', 'E.I.R.L.', 'S.L.']
                        # re.escape to solve the problem with metacharacters in keyword_obj
                        keywords_cap = map(re.escape, keywords_cap)
                        # sorting the items by length in descending order
                        keywords_cap.sort(key=len, reverse=True)
                        obj = re.compile(r'[:,;.]\s*"?([^:,;.]*?(?<!\w)(?:{}))'.format('|'.join(keywords_cap)))
                        obj2 = obj.search(mensaje)
                        if obj2:
                            # To obtain the first match in group(1)
                            company_name = obj2.group(1)
                        else:
                            company_name = "None"
                        # CVE Number of the file
                        regex = r"\s*CVE\s+([^|]*)"
                        matches = re.search(regex, mensaje)
                        if matches:
                            company_cve = matches.group(1).strip()
                        else:
                            company_cve = "None"
                        csvData = [company_name, company_cve]
                        csvData = [str(data).replace('\n', '').replace('\r', '') for data in csvData]
                        writer = csv.writer(csvFile, delimiter=';')
                        writer.writerow(csvData)

    scan_folder()


if __name__ == '__main__':
    main()
As can be seen, it is simple code that creates a CSV where data extracted from a .txt file is entered. Regex has been used to extract the data from the text file.
I've adapted your code to only focus on the logging part:
# coding=utf-8
import logging
from datetime import datetime  # You had a minor bug here, see https://stackoverflow.com/questions/415511/how-to-get-the-current-time-in-python


def configure_logging(logger):
    # Unchanged
    ...


def main():
    logger = logging.getLogger('Main')
    configure_logging(logger)
    # Here, you need to use the configured logger, not another one
    logger.info("This is a test writing in log file")


if __name__ == '__main__':
    main()
Note that you need to create the logs folder manually before running the code.
After running this, I have a file in the logs folder with the following content:
2018-08-17 10:13:09,304 - Main - INFO - This is a test writing in log file
I'm using the below script to re-encode my existing media files to MP4 using the HandBrake CLI. It's going to be a long process, so I'd like to have a way to capture files that have been created in the past 7 days, as well as the other filters (on file extensions), so that new content can be updated, while older content can be run on a separate script at different times. What do I have to change in the script to only capture files created in the past 7 days?
import os
import time
import subprocess
import sys
import httplib
import urllib
from xml.dom import minidom
import logging
import datetime

# Script Directory setup
myDateTime = datetime.datetime.now().strftime("%y-%m-%d-%H-%M")
logdir = 'D:\\logs\\'
logfile = logdir + 'TV_encode-' + myDateTime + '.log'

# Log Handler Setup
logger = logging.getLogger('TV_encode')
hdlr = logging.FileHandler(logfile)
formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
hdlr.setFormatter(formatter)
logger.addHandler(hdlr)
logger.setLevel(logging.INFO)

logger.info('Getting list of files to re-encode...')
fileList = []
rootdir = 'T:\\'
logger.info('Using %s as root directory for scan...' % rootdir)

for root, subFolders, files in os.walk(rootdir):
    for file in files:
        theFile = os.path.join(root, file)
        fileName, fileExtension = os.path.splitext(theFile)
        if fileExtension.lower() in ('.avi', '.divx', '.flv', '.m4v', '.mkv', '.mov', '.mpg', '.mpeg', '.wmv'):
            print 'Adding', theFile
            logger.info('Adding %s to list of file to re-encode.' % theFile)
            fileList.append(theFile)

runstr = '"C:\\Program Files\\Handbrake\\HandBrakeCLI.exe" -i "{0}" -o "{1}" --preset="Normal" --two-pass --turbo'

print '=======--------======='
logger.info('=======--------=======')
logger.info('Starting processing of files...')

while fileList:
    inFile = fileList.pop()
    logger.info('Original file: %s' % inFile)
    fileName, fileExtension = os.path.splitext(inFile)
    outFile = fileName + '.mp4'
    logger.info('New file: %s' % outFile)
    print 'Processing', inFile
    logger.info('Processing %s' % inFile)
    returncode = subprocess.call(runstr.format(inFile, outFile))
    time.sleep(5)
    print 'Removing', inFile
    logger.info('Removing %s' % inFile)
    os.remove(inFile)
    logger.info('Sending Pushover notification...')
    conn = httplib.HTTPSConnection("api.pushover.net:443")
    conn.request("POST", "/1/messages.json",
                 urllib.urlencode({
                     "token": "TOKENHERE",
                     "user": "USERKEY",
                     "message": "Re-encoding complete for %s" % fileName,
                 }), {"Content-type": "application/x-www-form-urlencoded"})
    conn.getresponse()
os.path.getmtime(filename) will give you the modification time in seconds since the epoch.
Use the datetime module to convert it to a datetime object, and compare it as usual.
import datetime
import os

ONE_WEEK_AGO = datetime.datetime.today() - datetime.timedelta(days=7)

mod_date = datetime.datetime.fromtimestamp(os.path.getmtime(theFile))
if mod_date > ONE_WEEK_AGO:
    ...  # use the file.
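Dropped into the os.walk loop of your script, the check could look like this (a sketch; theFile, logger and fileList come from your code):

if fileExtension.lower() in ('.avi', '.divx', '.flv', '.m4v', '.mkv', '.mov', '.mpg', '.mpeg', '.wmv'):
    mod_date = datetime.datetime.fromtimestamp(os.path.getmtime(theFile))
    if mod_date > ONE_WEEK_AGO:
        logger.info('Adding %s to list of file to re-encode.' % theFile)
        fileList.append(theFile)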
My script allows me to import photos into a specific folder and update my MongoDB database via a JSON file import.
I set up a logging system with log rotation and compression.
I have two problems:
Some actions are not logged to the file, but to the console (subprocess.call).
The logger creates three files instead of one file.
Python script:
def moveFTPFiles(serverName, userName, passWord, remotePath, localPath, deleteRemoteFiles=False, onlyDiff=False):
    """Connect to an FTP server and bring down files to a local directory"""
    import os
    import sys
    import glob
    from sets import Set
    import ftplib

    logger.info(' Deleting Files ')
    os.chdir(localDirectoryPath)
    files = glob.glob('*.*')
    for filename in files:
        os.unlink(filename)

    logger.info(' Retreiving Files ')
    try:
        ftp = ftplib.FTP(serverName)
        ftp.login(userName, passWord)
        ftp.cwd(remotePath)
        logger.info(' Connecting ')
        if onlyDiff:
            lFileSet = Set(os.listdir(localPath))
            rFileSet = Set(ftp.nlst())
            transferList = list(rFileSet - lFileSet)
            logger.info(' Missing ' + str(len(transferList)))
        else:
            transferList = ftp.nlst()
        delMsg = ""
        filesMoved = 0
        for fl in transferList:
            # create a full local filepath
            localFile = localPath + fl
            # print "Create a full local filepath: " + localFile
            grabFile = True
            if grabFile:
                # open the local file
                fileObj = open(localFile, 'wb')
                # Download the file a chunk at a time using RETR
                ftp.retrbinary('RETR ' + fl, fileObj.write)
                # Close the file
                # print "Close the file "
                fileObj.close()
                filesMoved += 1
                # print "Uploaded: " + str(filesMoved)
                # sys.stdout.write(str(filesMoved)+' ')
                # sys.stdout.flush()
            # Delete the remote file if requested
            if deleteRemoteFiles:
                ftp.delete(fl)
                delMsg = " and Deleted"
        logger.info('Files Moved' + delMsg + ': ' + str(filesMoved) + ' on ' + timeStamp())
    except ftplib.all_errors as e:
        logger.error('We have a problem on moveFTPFiles' + '%s' % e)
    ftp.close()  # Close FTP connection
    ftp = None


def timeStamp():
    """returns a formatted current time/date"""
    import time
    return str(time.strftime("%a %d %b %Y %I:%M:%S %p"))


def importData(serverName, userName, passWord, directory, filematch, source, destination):
    import socket
    import ftplib
    import os
    import subprocess
    import json

    try:
        ftp = ftplib.FTP(serverName)
        ftp.login(userName, passWord)
        ftp.cwd(directory)
        logger.info(' Connecting ')
        # Loop through matching files and download each one individually
        for filename in ftp.nlst(filematch):
            fhandle = open(filename, 'wb')
            logger.info(' Getting ' + filename)
            ftp.retrbinary('RETR ' + filename, fhandle.write)
            fhandle.close()

            # convert xml to json
            logger.info(' Convert ' + filename + ' to .json ')
            subprocess.call('xml2json -t xml2json -o stockvo.json stockvo.xml --strip_text', shell=True)

            # remove xml file
            logger.info(' Delete ' + filename)
            os.unlink(source + 'stockvo.xml')

            # modify json file
            logger.info(' Modify .json file')
            data = json.loads(open("stockvo.json").read())
            with open("stockvo.json", "w") as outfile:
                json.dump(data["Stock"]["Vehicule"], outfile)

            # import json file to MongoDB
            logger.info(' Import json file to MongoDB')
            subprocess.call('mongoimport --db AutoPrivilege -c cars stockvo.json --jsonArray --upsert --drop', shell=True)

            # remove old json file
            logger.info('Delete old .json file')
            # if file exists, delete it
            myfile = destination + "stockvo.json"
            if os.path.isfile(myfile):
                os.remove(myfile)
            # os.unlink(destination+'stockvo.json')

            # move json file
            logger.info('Move .json')
            os.system('mv %s %s' % (source + 'stockvo.json', destination + 'stockvo.json'))
    except ftplib.all_errors as e:
        logger.error('We have a problem on importData' + '%s' % e)
    ftp.close()  # Close FTP connection
    ftp = None
import time
import re
import os
import stat
import logging
import logging.handlers as handlers


class SizedTimedRotatingFileHandler(handlers.TimedRotatingFileHandler):
    """
    Handler for logging to a set of files, which switches from one file
    to the next when the current file reaches a certain size, or at certain
    timed intervals
    """
    def __init__(self, filename, mode='a', maxBytes=0, backupCount=0, encoding=None,
                 delay=0, when='h', interval=1, utc=False):
        # If rotation/rollover is wanted, it doesn't make sense to use another
        # mode. If for example 'w' were specified, then if there were multiple
        # runs of the calling application, the logs from previous runs would be
        # lost if the 'w' is respected, because the log file would be truncated
        # on each run.
        if maxBytes > 0:
            mode = 'a'
        handlers.TimedRotatingFileHandler.__init__(
            self, filename, when, interval, backupCount, encoding, delay, utc)
        self.maxBytes = maxBytes

    def shouldRollover(self, record):
        """
        Determine if rollover should occur.

        Basically, see if the supplied record would cause the file to exceed
        the size limit we have.
        """
        if self.stream is None:  # delay was set...
            self.stream = self._open()
        if self.maxBytes > 0:  # are we rolling over?
            msg = "%s\n" % self.format(record)
            self.stream.seek(0, 2)  # due to non-posix-compliant Windows feature
            if self.stream.tell() + len(msg) >= self.maxBytes:
                return 1
        t = int(time.time())
        if t >= self.rolloverAt:
            return 1
        return 0


if __name__ == '__main__':
    # log to a file
    log_filename = '/opt/log/importData.log'
    logger = logging.getLogger('importData')
    logger.setLevel(logging.DEBUG)
    handler = SizedTimedRotatingFileHandler(
        log_filename, maxBytes=100, backupCount=5,
        when='s', interval=10,
        # encoding='bz2', # uncomment for bz2 compression
    )
    formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
    handler.setFormatter(formatter)
    logger.addHandler(handler)

    # --- constant connection values
    ftpServerName = "xxxxx.xxxxxx"
    ftpU = "xxxxxxxx"
    ftpP = "xxxxxx"
    remoteDirectoryPath = "/xxxxxx/xxxxxx/xxxxxx/xxxxxxx/"
    localDirectoryPath = "/xxxxx/xxxxxxxxx/xxxxxxxx/xxxxxxx/"
    directory = '/xxxxxxx/'
    filematch = '*.xml'
    source = '/xxxxxx/xxxxxxxx/'
    destination = '/xxxxxxxx/xxxxxx/'
    deleteAfterCopy = False  # set to true if you want to clean out the remote directory
    onlyNewFiles = True  # set to true to grab & overwrite all files locally

    importData(ftpServerName, ftpU, ftpP, directory, filematch, source, destination)
    moveFTPFiles(ftpServerName, ftpU, ftpP, remoteDirectoryPath, localDirectoryPath, deleteAfterCopy, onlyNewFiles)
importData.log:
2015-04-23 11:33:57,408 INFO Files Moved: 1145 on Thu 23 Apr 2015 11:33:57 AM
importData.log.2015-04-23_11-33-40:

2015-04-23 11:33:40,896 INFO Deleting Files
2015-04-23 11:33:40,956 INFO Retreiving Files

importData.log.2015-04-23_11-33-41:

2015-04-23 11:33:41,386 INFO Connecting
2015-04-23 11:33:41,825 INFO Missing 1145
Can anyone suggest a way in Python to solve my problems?
SizedTimedRotatingFileHandler(log_filename, maxBytes=100, ...
Does exactly what it's configured to do - a maximum of 100 bytes will be logged into one file. Increase the max size to a few (hundred?) megabytes and you'll get infrequent rotation.
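For example (the exact size is up to you):

handler = SizedTimedRotatingFileHandler(
    log_filename, maxBytes=100 * 1024 * 1024,  # 100 MB per file instead of 100 bytes
    backupCount=5, when='midnight',
)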
Regarding only partial logging to the file, you define the handler only for the 'importData' module. Other modules will write to the default handler (likely the console).
Regarding the subprocess.call() itself, it's not actually logging anything. Unless you capture the output using the stdout argument, the output will be printed to the normal stdout. You need to set up a pipe and read from it into the logger instead.
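A sketch of that approach, using the mongoimport call from your script:

import subprocess

proc = subprocess.Popen(
    ['mongoimport', '--db', 'AutoPrivilege', '-c', 'cars',
     'stockvo.json', '--jsonArray', '--upsert', '--drop'],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in proc.stdout:
    # forward each line of the child's output into your logger
    logger.info(line.decode().rstrip())
proc.wait()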