I'm invoking another program from the command line to create visual studio solutions and build them. This program outputs the results of those commands.
I want to print warning lines that are output in yellow text rather than the default grey and error lines in red.
Let's assume that my cmd.exe console has already been modified to support rendering ANSI escape codes to color output.
I've done quite a bit of searching for solutions, but most of the things I've found are made for Linux/OSX. I did find a script that, given regex as input, could replace text using the specified rules.
regex script
Is it possible for me to run this script in the background, still connected to cmd.exe, so that it runs the regex search and replace on all text output to the console before that text is displayed in the cmd.exe window? I could put this into a batch file or Python script.
I wanted to lay out the specific application, but to make this question potentially more generic: how do I apply an existing script/program to a running cmd.exe prompt in the background, such that the user can still run commands in the prompt, but the background program is applied to the output of the commands the user runs?
I'm open to trying PowerShell if there are no other performant, viable solutions.
The regular expression to detect if a line is an error just searches for the word error
"\berror\b"
It's the same search for a warning
"\bwarning\b"
Edit: Adding the better solution first. This solution sets up a Pipe so it can receive the output from the external program, then prints the colorized result in realtime.
#Python 2
from subprocess import Popen, PIPE

def invoke(command):
    process = Popen(command, stdout=PIPE, bufsize=1)
    with process.stdout:
        # b'' is a bytes literal. Per various other SO posts, we use this
        # method to iterate, to work around bugs in Python 2
        for line in iter(process.stdout.readline, b''):
            line = line.rstrip()
            if not line:
                continue
            line = line.decode()
            if "error" in line:
                print (bcolors.FAIL + line + bcolors.ENDC)
            elif "warning" in line:
                print (bcolors.WARNING + line + bcolors.ENDC)
            else:
                print (line)
    error_code = process.wait()
    return error_code
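A call might look like the following; the build command and solution name are placeholders, and bcolors is defined in the script further down:
# hypothetical usage of invoke() above
exit_code = invoke(['msbuild', 'MySolution.sln'])
if exit_code != 0:
    print ("build failed with code %d" % exit_code)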
To accomplish this, I piped the output of the build command to a file. I then wrote this Python script to install a required dependency, loop through the file contents, and print each line with the appropriate coloring.
I will now look into a solution that colors the output in real time, as this one requires the user to wait for the build to complete before seeing the colored output.
#Python 2
import sys
import pip

def install(package):
    if hasattr(pip, 'main'):
        pip.main(['install', package])
    else:
        pip._internal.main(['install', package])

class bcolors:
    WARNING = '\033[93m'
    FAIL = '\033[91m'
    ENDC = '\033[0m'

def print_text():
    install('colorama')
    try:
        import colorama
        colorama.init()
    except ImportError:
        print ("could not import colorama")
    if len(sys.argv) != 2:
        print ("usage: python pretty_print \"file_name\"")
        return 0
    file_name = sys.argv[1]
    with open(file_name, "r") as readfile:
        for line in readfile:
            line = line.rstrip()
            if not line:
                continue
            if "error" in line:
                print (bcolors.FAIL + line + bcolors.ENDC)
            elif "warning" in line:
                print (bcolors.WARNING + line + bcolors.ENDC)
            else:
                print (line)
    return 0

if __name__ == "__main__":
    ret = print_text()
    sys.exit(ret)
I built a python (2.7) script that parses a txt file with this code:
cnt = 1
logFile = open( logFilePath, 'r' )
for line in logFile:
    if errorCodeGetHostName in line:
        errorHostNameCnt = errorHostNameCnt + 1
        errorGenericCnt = errorGenericCnt + 1
        reportFile.write( "--- Error: GET HOST BY NAME # line " + str( cnt ) + "\n\r" )
        reportFile.write( line )
    elif errorCodeSocke462 in line:
        errorSocket462Cnt = errorSocket462Cnt + 1
        errorGenericCnt = errorGenericCnt + 1
        reportFile.write( "--- Error: SOCKET -462 # line " + str( cnt ) + "\n\r" )
        reportFile.write( line )
    elif errorCodeMemory in line:
        errorMemoryCnt = errorMemoryCnt + 1
        errorGenericCnt = errorGenericCnt + 1
        reportFile.write( "--- Error: MEMORY NOT RELEASED # line " + str( cnt ) + "\n\r" )
        reportFile.write( line )
    cnt = cnt + 1
I want to add the line number of each error, and for this purpose I added a counter (cnt), but its value does not match the real line number.
This is a piece of my log file:
=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2017.06.13 17:05:43 =~=~=~=~=~=~=~=~=~=~=~=
UTC Time fetched from server #1: '0.pool.ntp.org'
*** Test (cycle #1) starting...
--- Test01 completed successfully!
--- Test02 completed successfully!
--- Test03 completed successfully!
--- Test04 completed successfully!
--- Test01 completed successfully!
--- Test02 completed successfully!
INF:[CONFIGURATION] Completed
--- Test03 completed successfully!
Firmware Version: 0.0.0
*** Test (cycle #1) starting...
How can I get the real line number?
Thanks for the help.
Apart from the line-ending issue, there are some other issues with this code.
Filehandles
As remarked in one of the comments, it is best to open files with a with-statement.
Separation of functions
Now you have one big loop where you loop over the original file, parse it, and immediately write to the report file. I think it would be best to separate those.
Make one function that loops over the log and returns the details you need, and a second function that loops over those details and writes them to a report. This is a lot more robust, and easier to debug and test when something goes wrong.
I would also keep the I/O as far outside as possible. If you later want to stream to a socket or something, this can be done easily.
DRY
Lines 6 to 24 of your code contain a lot of lines that are almost the same, and if you want to report another error, you need to add another 5 nearly identical lines. I would use a dict and a for-loop to cut down on the boilerplate code.
Pythonic
A smaller remark is that you don't use the handy things Python offers, like yield, the with-statement, enumerate, or collections.Counter. Also, variable naming is not according to PEP-8, but that is mainly aesthetic.
My attempt
errors = {
    error_code_hostname: {'error_msg': '--- Error: GET HOST BY NAME # line %i'},
    error_code_socket_462: {'error_msg': '--- Error: SOCKET -462 # line %i'},
    error_code_memory: {'error_msg': '--- Error: MEMORY NOT RELEASED # line %i'},
}
Here you define what errors can occur and what the error message should look like
def get_events(log_filehandle):
    # start counting at 1 so line numbers match the file
    for line_no, line in enumerate(log_filehandle, 1):
        for error_code, error in errors.items():
            if error_code in line:
                yield line_no, error_code, line
This just takes a filehandle (it can be a Stream or Buffer too) and looks for error codes in each line; if it finds one, it yields it together with the line number and the line.
import collections

def generate_report(report_filehandle, error_list):
    error_counter = collections.Counter()
    for line_no, error_code, error_line in error_list:
        error_counter['generic'] += 1
        error_counter[error_code] += 1
        error_msg = format_error_msg(line_no, error_code)
        report_filehandle.write(error_msg)
        report_filehandle.write(error_line)
    return error_counter
This loops over the found errors. It increments the counters, formats the message, and writes it to the report file.
def format_error_msg(line_no, error_code):
    return errors[error_code]['error_msg'] % line_no
This uses string-formatting to generate a message from an error_code and line_no
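For example, assuming error_code_socket_462 is one of the key strings defined in the errors dict above:
format_error_msg(17, error_code_socket_462)
# -> '--- Error: SOCKET -462 # line 17'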
with open(log_filename, 'r') as log_filehandle, open(report_filename, 'w') as report_filehandle:
    error_list = get_events(log_filehandle)
    error_counter = generate_report(report_filehandle, error_list)
This ties it all together. You could use the error_counter to add a summary to the report, or write a summary to another file or database.
This approach has the advantage that if your error recognition changes, you can do this independent of the reporting and vice-versa
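For example, a summary based on the returned Counter could be appended to the report like this (a sketch reusing the names from the snippets above):
# hypothetical summary step, using the error_counter returned above
with open(report_filename, 'a') as report_filehandle:
    report_filehandle.write('--- Summary ---\n')
    for error_code, count in error_counter.items():
        report_filehandle.write('%s: %i occurrences\n' % (error_code, count))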
Intro: the log that I want to parse comes from an embedded platform programmed in C.
I found in the embedded code that somewhere there is a printf with \n\r instead of \r\n. I replaced each \n\r with \r\n, which corresponds to the Windows CR LF.
With this change the Python script works, and I can identify each error by its line.
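For anyone hitting the same problem, the normalization could also be done as a one-off in Python; a minimal sketch, assuming hypothetical file names:
# normalize the non-standard \n\r sequences to Windows \r\n
# 'putty.log' and 'putty_fixed.log' are placeholder names
with open('putty.log', 'rb') as f:
    data = f.read()
with open('putty_fixed.log', 'wb') as f:
    f.write(data.replace(b'\n\r', b'\r\n'))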
I've been trying to run a Java program and capture its STDOUT output to a file from a Python script. The idea is to run test files through my program and check if the output matches the expected answers.
Per this and this SO questions, using subprocess.call is the way to go. In the code below, I am doing subprocess.call(command, stdout=f) where f is the file I opened.
The resulted file is empty and I can't quite understand why.
import glob
import subprocess

test_path = '/path/to/my/testfiles/'
class_path = '/path/to/classfiles/'
jar_path = '/path/to/external_jar/'
test_pattern = 'test_case*'
temp_file = 'res'

tests = glob.glob(test_path + test_pattern)  # find all test files
for i, tc in enumerate(tests):
    with open(test_path + temp_file, 'w') as f:
        # cd into directory where the class files are and run the program
        command = 'cd {p} ; java -cp {cp} package.MyProgram {tc_p}' \
            .format(p=class_path,
                    cp=jar_path,
                    tc_p=test_path + tc)
        # execute the command and direct all STDOUT to file
        subprocess.call(command.split(), stdout=f, stderr=subprocess.STDOUT)
    # diff is just a lambda func that uses os.system('diff')
    exec_code = diff(answers[i], test_path + temp_file)
    if exec_code == BAD:
        scream(':(')
I checked the docs for subprocess and they recommended using subprocess.run (added in Python 3.5). The run method returns the instance of CompletedProcess, which has a stdout field. I inspected it and the stdout was an empty string. This explained why the file f I tried to create was empty.
Even though the exit code from subprocess.call was 0 (success), it didn't mean that my Java program actually got executed. I ended up fixing this bug by breaking the command into two parts.
If you notice, I initially tried to cd into the correct directory and then execute the Java file, all in one command. Since cd is a shell built-in rather than an executable, that combined command can't work when passed as a plain argument list. I ended up removing cd from the command and calling os.chdir(class_path) instead. The command then contained only the string to run the Java program. This did the trick.
So, the code looked like this:
import os
import subprocess

good_code = 0

# Assume the same variables defined as in the original question
os.chdir(class_path)  # get into the class files directory first
for i, tc in enumerate(tests):
    with open(test_path + temp_file, 'w') as f:
        # run the program
        command = 'java -cp {cp} package.MyProgram {tc_p}' \
            .format(cp=jar_path,
                    tc_p=test_path + tc)
        # runs the command and redirects it into the file f
        # stores the instance of CompletedProcess
        out = subprocess.run(command.split(), stdout=f)
        # you can access useful info now
        assert out.returncode == good_code
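As an aside, subprocess.run (like subprocess.call) also accepts a cwd argument, which would avoid the os.chdir call; a minimal variant of the call above:
# equivalent alternative: let subprocess switch the working directory per call
out = subprocess.run(command.split(), stdout=f, cwd=class_path)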
I'd like to make the output of tail -F or something similar available to me in Python without blocking or locking. I've found some really old code to do that here, but I'm thinking there must be a better way or a library to do the same thing by now. Anyone know of one?
Ideally, I'd have something like tail.getNewData() that I could call every time I wanted more data.
Non Blocking
If you are on Linux (as Windows does not support calling select on files) you can use the subprocess module along with the select module.
import time
import subprocess
import select

f = subprocess.Popen(['tail', '-F', filename],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
p = select.poll()
p.register(f.stdout)

while True:
    if p.poll(1):
        print f.stdout.readline()
    time.sleep(1)
This polls the output pipe for new data and prints it when it is available. Normally the time.sleep(1) and print f.stdout.readline() would be replaced with useful code.
Blocking
You can use the subprocess module without the extra select module calls.
import subprocess

f = subprocess.Popen(['tail', '-F', filename],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
    line = f.stdout.readline()
    print line
This will also print new lines as they are added, but it will block until the tail program is closed, probably with f.kill().
Using the sh module (pip install sh):
from sh import tail
# runs forever
for line in tail("-f", "/var/log/some_log_file.log", _iter=True):
    print(line)
[update]
Since sh.tail with _iter=True is a generator, you can:
import sh
tail = sh.tail("-f", "/var/log/some_log_file.log", _iter=True)
Then you can "getNewData" with:
new_data = tail.next()
Note that if the tail buffer is empty, it will block until there is more data (from your question it is not clear what you want to do in this case).
[update]
This works if you replace -f with -F, but in Python it would be locking. I'd be more interested in having a function I could call to get new data when I want it, if that's possible. – Eli
A container generator placing the tail call inside a while True loop and catching eventual I/O exceptions will have almost the same effect as -F.
def tail_F(some_file):
    while True:
        try:
            for line in sh.tail("-f", some_file, _iter=True):
                yield line
        except sh.ErrorReturnCode_1:
            yield None
If the file becomes inaccessible, the generator will return None. However it still blocks until there is new data if the file is accessible. It remains unclear for me what you want to do in this case.
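For completeness, a hypothetical consumer of this generator that simply skips the None markers might look like:
# sketch of a consumer; the log path is a placeholder
for line in tail_F("/var/log/some_log_file.log"):
    if line is not None:
        print(line)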
Raymond Hettinger's approach seems pretty good:
import os

def tail_F(some_file):
    first_call = True
    while True:
        try:
            with open(some_file) as input:
                if first_call:
                    input.seek(0, 2)
                    first_call = False
                latest_data = input.read()
                while True:
                    if '\n' not in latest_data:
                        latest_data += input.read()
                        if '\n' not in latest_data:
                            yield ''
                            if not os.path.isfile(some_file):
                                break
                            continue
                    latest_lines = latest_data.split('\n')
                    if latest_data[-1] != '\n':
                        latest_data = latest_lines[-1]
                    else:
                        latest_data = input.read()
                    for line in latest_lines[:-1]:
                        yield line + '\n'
        except IOError:
            yield ''
This generator will return '' if the file becomes inaccessible or if there is no new data.
[update]
The second to last answer circles around to the top of the file it seems whenever it runs out of data. – Eli
I think the second will output the last ten lines whenever the tail process ends, which with -f is whenever there is an I/O error. The tail --follow --retry behavior is not far from this for most cases I can think of in unix-like environments.
Perhaps if you update your question to explain what is your real goal (the reason why you want to mimic tail --retry), you will get a better answer.
The last answer does not actually follow the tail and merely reads what's available at run time. – Eli
Of course, tail will display the last 10 lines by default... You can position the file pointer at the end of the file using file.seek; I will leave a proper implementation as an exercise to the reader.
IMHO the file.read() approach is far more elegant than a subprocess based solution.
Purely pythonic solution using non-blocking readline()
I am adapting Ijaz Ahmad Khan's answer to yield lines only when they are completely written (i.e. they end with a newline char), which gives a pythonic solution with no external dependencies:
import time
from typing import Iterator

def follow(file, sleep_sec=0.1) -> Iterator[str]:
    """Yield each line from a file as they are written.
    `sleep_sec` is the time to sleep after empty reads."""
    line = ''
    while True:
        tmp = file.readline()
        if tmp is not None:
            line += tmp
            if line.endswith("\n"):
                yield line
                line = ''
        elif sleep_sec:
            time.sleep(sleep_sec)

if __name__ == '__main__':
    with open("test.txt", 'r') as file:
        for line in follow(file):
            print(line, end='')
The only portable way to tail -f a file appears to be, in fact, to read from it and retry (after a sleep) if the read returns 0. The tail utilities on various platforms use platform-specific tricks (e.g. kqueue on BSD) to efficiently tail a file forever without needing sleep.
Therefore, implementing a good tail -f purely in Python is probably not a good idea, since you would have to use the least-common-denominator implementation (without resorting to platform-specific hacks). Using a simple subprocess to open tail -f and iterating through the lines in a separate thread, you can easily implement a non-blocking tail operation in Python.
Example implementation:
import threading, Queue, subprocess

tailq = Queue.Queue(maxsize=10)  # buffer at most 10 lines

def tail_forever(fn):
    p = subprocess.Popen(["tail", "-f", fn], stdout=subprocess.PIPE)
    while 1:
        line = p.stdout.readline()
        tailq.put(line)
        if not line:
            break

threading.Thread(target=tail_forever, args=(fn,)).start()

print tailq.get()         # blocks
print tailq.get_nowait()  # throws Queue.Empty if there are no lines to read
All the answers that use tail -f are not pythonic.
Here is the pythonic way (using no external tool or library):
import time

def follow(thefile):
    while True:
        line = thefile.readline()
        if not line or not line.endswith('\n'):
            time.sleep(0.1)
            continue
        yield line

if __name__ == '__main__':
    logfile = open("run/foo/access-log", "r")
    loglines = follow(logfile)
    for line in loglines:
        print(line, end='')
So, this is coming quite late, but I ran into the same problem again, and there's a much better solution now. Just use pygtail:
Pygtail reads log file lines that have not been read. It will even
handle log files that have been rotated. Based on logcheck's logtail2
(http://logcheck.org)
Ideally, I'd have something like tail.getNewData() that I could call every time I wanted more data
We've already got one and it's very nice. Just call f.read() whenever you want more data. It will start reading where the previous read left off and read through to the end of the data stream:
f = open('somefile.log')
p = 0
while True:
    f.seek(p)
    latest_data = f.read()
    p = f.tell()
    if latest_data:
        print latest_data
        print str(p).center(10).center(80, '=')
For reading line-by-line, use f.readline(). Sometimes the file being read will end with a partially written line. Handle that case by finding the current file position with f.tell() and using f.seek() to move the file pointer back to the beginning of the incomplete line; a sketch follows. See this ActiveState recipe for working code.
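A minimal sketch of that readline/tell/seek pattern, assuming a placeholder file name and polling interval:
import time

with open('somefile.log') as f:
    while True:
        pos = f.tell()              # remember where this line starts
        line = f.readline()
        if not line or not line.endswith('\n'):
            f.seek(pos)             # rewind past the partial line
            time.sleep(0.1)         # wait for the writer to finish it
        else:
            print(line)             # or hand the complete line elsewhere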
You could use the 'tailer' library: https://pypi.python.org/pypi/tailer/
It has an option to get the last few lines:
# Get the last 3 lines of the file
tailer.tail(open('test.txt'), 3)
# ['Line 9', 'Line 10', 'Line 11']
And it can also follow a file:
# Follow the file as it grows
for line in tailer.follow(open('test.txt')):
    print line
If one wants tail-like behaviour, that one seems to be a good option.
Another option is the tailhead library, which provides Python versions of both the tail and head utilities, plus an API that can be used in your own module.
Originally based on the tailer module, its main advantage is the ability to follow files by path, i.e. it can handle the situation when a file is recreated. Besides, it has some bug fixes for various edge cases.
Python is "batteries included" - it has a nice solution for it: https://pypi.python.org/pypi/pygtail
Reads log file lines that have not been read. Remembers where it finished last time, and continues from there.
import sys
from pygtail import Pygtail
for line in Pygtail("some.log"):
    sys.stdout.write(line)
You can also use the 'awk' command.
See more at: http://www.unix.com/shell-programming-scripting/41734-how-print-specific-lines-awk.html
awk can be used to tail the last line, the last few lines, or any line in a file.
This can be called from Python, as sketched below.
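For example, printing the last line of a file from Python (awk's END block prints the last record read; the file name is a placeholder):
import subprocess

# print the last line of 'some.log' via awk
last_line = subprocess.check_output(['awk', 'END{print}', 'some.log'])
print(last_line)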
If you are on Linux, you can implement a non-blocking solution in Python in the following way.
import subprocess
subprocess.call('xterm -title log -hold -e \"tail -f filename\"&', shell=True, executable='/bin/csh')
print "Done"
# -*- coding:utf-8 -*-
import sys
import time

class Tail():
    def __init__(self, file_name, callback=sys.stdout.write):
        self.file_name = file_name
        self.callback = callback

    def follow(self, n=10):
        try:
            # open the file
            with open(self.file_name, 'r', encoding='UTF-8') as f:
                # with open(self.file_name, 'rb') as f:
                self._file = f
                self._file.seek(0, 2)
                # store the length of the file
                self.file_length = self._file.tell()
                # print the last n lines
                self.showLastLine(n)
                # keep reading the file and print whatever gets appended
                while True:
                    line = self._file.readline()
                    if line:
                        self.callback(line)
                    time.sleep(1)
        except Exception as e:
            print('Failed to open the file; check that it exists and that you have read permission')
            print(e)

    def showLastLine(self, n):
        # assume roughly 100 characters per line; 1 or 1000 would work too
        len_line = 100
        # n defaults to 10 and can also be passed in via follow()
        read_len = len_line * n
        # last_lines holds the content to be printed at the end
        while True:
            # if the chunk to read is longer than the stored file length,
            # read the whole file and break
            if read_len > self.file_length:
                self._file.seek(0)
                last_lines = self._file.read().split('\n')[-n:]
                break
            # read read_len characters from the end, then count
            # the newline characters in that chunk
            self._file.seek(-read_len, 2)
            last_words = self._file.read(read_len)
            # count is the number of newline characters
            count = last_words.count('\n')
            if count >= n:
                # at least n newlines: easy case, just take the last n lines
                last_lines = last_words.split('\n')[-n:]
                break
            # fewer than n newlines
            else:
                # if there is no newline at all, treat the whole chunk
                # as one line
                if count == 0:
                    len_perline = read_len
                # otherwise estimate the average line length from the chunk
                else:
                    len_perline = read_len // count
                # grow the read length accordingly and re-check
                read_len = len_perline * n
        for line in last_lines:
            self.callback(line + '\n')

if __name__ == '__main__':
    py_tail = Tail('test.txt')
    py_tail.follow(1)
A simple tail function from the PyPI package tailread.
You can install it via pip install tailread.
Recommended for tail access of large files.
from io import BufferedReader

def readlines(bytesio, batch_size=1024, keepends=True, **encoding_kwargs):
    '''bytesio: file path or BufferedReader
    batch_size: size to be processed
    '''
    path = None
    if isinstance(bytesio, str):
        path = bytesio
        bytesio = open(path, 'rb')
    elif not isinstance(bytesio, BufferedReader):
        raise TypeError('The first argument to readlines must be a file path or a BufferedReader')

    bytesio.seek(0, 2)
    end = bytesio.tell()

    buf = b""
    for p in reversed(range(0, end, batch_size)):
        bytesio.seek(p)
        lines = []
        remain = min(end - p, batch_size)
        while remain > 0:
            line = bytesio.readline()[:remain]
            lines.append(line)
            remain -= len(line)
        cut, *parsed = lines
        for line in reversed(parsed):
            if buf:
                line += buf
                buf = b""
            if encoding_kwargs:
                line = line.decode(**encoding_kwargs)
            yield from reversed(line.splitlines(keepends))
        buf = cut + buf

    if path:
        bytesio.close()

    if encoding_kwargs:
        buf = buf.decode(**encoding_kwargs)
    yield from reversed(buf.splitlines(keepends))
for line in readlines('access.log', encoding='utf-8', errors='replace'):
    print(line)
    if 'line 8' in line:
        break

# line 11
# line 10
# line 9
# line 8
I'm starting to work on problems for Google's Code Jam. However, there seems to be a problem with my submission. Whenever I submit I am told "Your output should start with 'Case #1: '". My output comes from a print statement starting with "Case #%s: %s" % (y + 1, p), which prints Case #1: etc. when I run my code.
I looked into it and it said "Your output should start with 'Case #1: ': If you get this message, make sure you did not upload the source file in place of the output file, and that you're outputting case numbers properly. The first line of the output file should always start with "Case #1:", followed by a space or the end of the line."
So what is an output file and how would I incorporate it into my code?
Extra info: this is my code; I'm saving it as GoogleCode1.py and submitting that file. I wrote it in IDLE.
import string

firstimput = raw_input("cases ")
for y in range(int(firstimput)):
    nextimput = raw_input("imput ")
    firstlist = string.split(nextimput)
    firstlist.reverse()
    p = ""
    for x in range(len(firstlist)):
        p = p + firstlist[x] + " "
    p = p[:-1]
    print "Case #%s: %s" % (y + 1, p)
Run the script in a shell, and redirect the output.
python GoogleCode1.py > GoogleCode1.out
I/O redirection aside, the other way to do this would be to read from and write to files. Look up file handling in Python:
input_file = open('/path/to/input_file')
output_file = open('/path/to/output_file', 'w')

for line in input_file:
    answer = myFunction(line)
    output_file.write("Case #x: " + str(answer))

input_file.close()
output_file.close()
Cheers
Make sure you're submitting a file containing what your code outputs -- don't submit the code itself during a practice round.