python pattern match and process - python

I am trying to parse a log with a bunch of lines.
The line I am trying to parse from a live trace (kind of a tail of the file) is the one that starts with "Contact".
I need to use everything between the brackets, i.e. whatever is within
[2a00:c30:7141:230:1066:4f46:7243:a6d2], and the number after the colon that follows the brackets (56791),
as variables.
I have tried with a regex search but I do not know how to deal with it.
Contact: "200" <sip:200#[2a00:c30:7141:230:1066:4f46:7243:a6d2]:56791;transport=udp;registering_acc=example_com>;expires=600

If the format is always the same:
for line in logfile:
    if "Contact" in line:
        myIPAddress = line.split('[')[1].split(']')[0]
        myPort = line.split(']:')[1].split(';')[0]
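For instance, applied to the sample Contact line above (stored in a string here just for a quick check), the two splits give the address and the port:
line = 'Contact: "200" <sip:200#[2a00:c30:7141:230:1066:4f46:7243:a6d2]:56791;transport=udp;registering_acc=example_com>;expires=600'
myIPAddress = line.split('[')[1].split(']')[0]   # '2a00:c30:7141:230:1066:4f46:7243:a6d2'
myPort = line.split(']:')[1].split(';')[0]       # '56791'
print myIPAddress, myPort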

Use a regex to do so:
import re

logfile = open('xxx.log')
p = r'\[([a-f0-9:]+)\]:([0-9]+)'
pattern = re.compile(p)
for line in logfile:
    if line.startswith('Contact:'):
        print pattern.search(line).groups()
logfile.close()

If you're getting new entries through something like tail -f $logfile, you can pipe the output of that into this:
import re
import sys

for line in sys.stdin:
    m = re.match(r'Contact: .*?\[(.*?)\]:(\d+)', line)
    if m is not None:
        address, port = m.groups()
        print address, port
This basically reads each line that comes in on standard input and tries to find the items you are interested in. If a line does not match, it prints nothing.
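For example, assuming the script above is saved as contact_parser.py (a name chosen here purely for illustration) and the live trace is written to sip.log, you would run something like tail -f sip.log | python contact_parser.py, and each matching address/port pair would be printed as it arrives.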

data = re.search(r'Contact: .*?\[(.*?)\]:(\d+)', line_in_file)
if data:
    # data.group(1) and data.group(2) already hold the address and the port;
    # the splits below also extract the raw bracketed part of the line
    temp = line_in_file.split('[')
    temp1 = temp[1].split(';')
    hexValues = re.findall('[a-f0-9]', temp1[0])

Related

I am able to read the txt file line by line, but I am not sure how I can now search for and replace a particular string with X

I am currently trying to develop a Python script to sanitize configuration. My objective is to read the txt file line by line, which I could do using the following code:
fh = open('test.txt')
for line in fh:
    print(line)
fh.close()
The output came up as follows:
hostname
198.168.1.1
198.168.1.2
snmp string abck
Now I want to:
Search for the string matching "hostname" and replace it with X
Search for the IPv4 addresses using the regular expression
\b(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(.(?1)){3}\b and replace them with X.X\1 (replacing only the first two octets with X)
Replace anything after "snmp string" with X
So the final file output I am looking for is:
X
x.x.1.1
x.x.1.2
snmp string x
I could not orchestrate everything together. Any help or guidance will be greatly appreciated.
There are lots of approaches to this, but here's one: rather than just printing each line of the file, store each line in a list:
with open("test.txt") as fh:
contents = []
for line in fh:
contents.append(line)
print(contents)
Now you can loop through that list in order to perform your regex operations. I'm not going to write that code for you, but you can use python's inbuilt regex library.
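For what it's worth, here is a minimal sketch of that idea, assuming the four-line file layout shown above. Note that Python's re module does not support the (?1) recursion syntax used in the pattern from the question, so a plain IPv4 pattern is used instead; the exact replacement strings simply follow the sample output.
import re

ipv4 = re.compile(r'\b(\d{1,3})\.(\d{1,3})\.(\d{1,3}\.\d{1,3})\b')

with open("test.txt") as fh:
    contents = [line.rstrip("\n") for line in fh]

sanitized = []
for line in contents:
    if "hostname" in line:
        line = "X"                          # rule 1: mask the hostname line
    elif line.startswith("snmp string"):
        line = "snmp string x"              # rule 3: mask anything after "snmp string"
    else:
        line = ipv4.sub(r"x.x.\3", line)    # rule 2: mask only the first two octets
    sanitized.append(line)

print("\n".join(sanitized))
Run against the sample content, this prints X, x.x.1.1, x.x.1.2 and snmp string x, matching the desired output.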

Matching a simple string with regex not working?

I have a large txt-file and want to extract all strings with these patterns:
/m/meet_the_crr
/m/commune
/m/hann_2
Here is what I tried:
import re

with open("testfile.txt", "r") as text_file:
    contents = text_file.read().replace("\n", "")
    print(re.match(r'^\/m\/[a-zA-Z0-9_-]+$', contents))
The result I get is a simple "None". What am I doing wrong here?
You need to not remove the line ends, and you need to use the re.MULTILINE flag so that multiple results are returned from the bigger text:
# write a demo file
with open("t.txt","w") as f:
f.write("""
/m/meet_the_crr\n
/m/commune\n
/m/hann_2\n\n
# your text looks like this after .read().replace(\"\\n\",\"\")\n
/m/meet_the_crr/m/commune/m/hann_2""")
Program:
import re

regex = r"^\/m\/[a-zA-Z0-9_-]+$"

with open("t.txt", "r") as f:
    contents = f.read()
    found_all = re.findall(regex, contents, re.M)

print(found_all)
print("-")
print(open("t.txt").read())
Output:
['/m/meet_the_crr', '/m/commune', '/m/hann_2']
Filecontent:
/m/meet_the_crr
/m/commune
/m/hann_2
# your text looks like this after .read().replace("\n","")
/m/meet_the_crr/m/commune/m/hann_2
This is essentially what Wiktor Stribiżew told you in his comment, although he also suggested using a better pattern: r'^/m/[\w-]+$'
There is nothing logically wrong with your code, and in fact your pattern will match the inputs you describe:
result = re.match(r'^\/m\/[a-zA-Z0-9_-]+$', '/m/meet_the_crr')
if result:
    print(result.groups())  # this line is reached, as there is a match
Since you did not specify any capture groups, you will see () being printed to the console. You could capture the entire input, and then it would be available, e.g.
result = re.match(r'(^\/m\/[a-zA-Z0-9_-]+$)', '/m/meet_the_crr')
if result:
    print(result.groups(1)[0])
/m/meet_the_crr
You are reading a whole file into a variable (into memory) using .read(). With .replace("\n", ""), you remove all the newlines in the string. The re.match(r'^\/m\/[a-zA-Z0-9_-]+$', contents) call then tries to match a string that consists entirely of the \/m\/[a-zA-Z0-9_-]+ pattern, which is impossible after all the previous manipulations.
There are at least two ways out. Either remove .replace("\n", "") (to prevent newline removal) and use re.findall(r'^/m/[\w-]+$', contents, re.M) (the re.M option enables matching whole lines rather than the whole text), or read the file line by line, use your re.match version to check each line for a match, and if it matches, add it to the final list.
Example:
import re

with open("testfile.txt", "r") as text_file:
    contents = text_file.read()
    print(re.findall(r'^/m/[\w-]+$', contents, re.M))
Or
import re

with open("testfile.txt", "r") as text_file:
    for line in text_file:
        if re.match(r'/m/[\w-]+\s*$', line):
            print(line.rstrip())
Note I used \w to make the pattern somewhat shorter, but if you are working in Python 3 and only want to match ASCII letters and digits, also pass the re.ASCII option.
Also, / is not a special character in Python regex patterns, so there is no need to escape it.

Searching for a given string IP-address in a logfile

I am working on a project to search for an IP address and see if it is in the logfile. I made some good progress, but got stuck when dealing with searching for certain items in the logfile format.
Here is what I have:
IP = raw_input('Enter IP Address:')

with open('RoutingTable.txt', 'r') as searchIP:
    for line in searchIP:
        if IP in line:
            ipArray = line.split()
            print ipArray
            if IP == ipArray[0]:
                print "Success"
            else:
                print "Fail"
As you can see this is very bad code, but I am new to Python and programming, so I used this to make sure I can at least open the file and compare the first item to the string I enter.
Here is an example of the file content (my actual file has thousands of entries):
https://pastebin.com/ff40sij5
I would like a way to store all the IPs (just the IP and no other junk) in an array, and then a loop to go through all the items in the array and compare each one with the user-defined IP.
For example, for this line all I care about is 10.20.70.0/23:
D EX 10.20.70.0/23 [170/3072] via 10.10.10.2, 6d06h, Vlan111
[170/3072] via 10.10.10.2, 6d06h, Vlan111
[170/3072] via 10.10.10.2, 6d06h, Vlan111
[170/3072] via 10.10.10.2, 6d06h, Vlan111
Please help.
Thanks
Damon
Edit: I have been digging into setting flags, but that only works in some cases since, as you can see, not all lines start with D; some start with O (for OSPF routes) and C (directly connected).
Here is what I am doing:
f = open("RoutingTable.txt")
Pr = False
for line in f.readlines():
if Pr: print line
if "EX" in line:
Pr = True
print line
if "[" in line:
Pr = False
f.close()
That gives me a slightly cleaner result, but still the whole line instead of just the IP.
Do you necessarily need to store all the IPs by themselves? You can do the following, where you grab all the data into a list and check if your input string resides inside the list:
your_file = 'RoutingTable.txt'
IP = raw_input('Enter IP Address:')

with open(your_file, 'r') as f:
    data = f.readlines()

for d in data:
    if IP in d:
        print 'success'
        break
else:
    print 'fail'
The else statement only triggers when you don't break, i.e. there is no success case.
If you cannot read everything into memory, you can iterate over each line like you did in your post, but thousands of lines should be easily doable.
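For instance, a minimal line-by-line variant that avoids readlines() (same hypothetical RoutingTable.txt, otherwise the same idea as above) could look like this:
IP = raw_input('Enter IP Address:')

with open('RoutingTable.txt', 'r') as f:
    for line in f:          # iterate lazily, one line at a time
        if IP in line:
            print 'success'
            break
    else:
        print 'fail'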
Edit
import re

your_file = 'RoutingTable.txt'
ip_addresses = []
IP = raw_input('Enter IP Address:')

with open(your_file, 'r') as f:
    data = f.readlines()

for d in data:
    res = re.search('(\d+\.\d+\.\d+\.\d+\/\d+)', d)
    if res:
        ip_addresses.append(res.group(1))

for ip_addy in ip_addresses:
    if IP == ip_addy:
        print 'success'
        break
else:
    print 'fail'

print ip_addresses
First up, I'd like to mention that your initial way of handling the file opening and closing (where you used a context manager, the "with open(..)" part) is better. It's cleaner and stops you from forgetting to close it again.
Second, I would personally approach this with a regular expression. If you know you'll be getting the same pattern of it beginning with D EX or O, etc. and then an address and then the bracketed section, a regular expression shouldn't be much work, and they're definitely worth understanding.
This is a good resource to learn generally about them: http://regular-expressions.mobi/index.html?wlr=1
Different languages have different ways to interpret the patterns. Here's a link for python specifics for it (remember to import re): https://docs.python.org/3/howto/regex.html
There is also a website called regexr (I don't have enough reputation for another link) that you can use to mess around with expressions on to get to grips with it.
To summarise, I'd personally keep the initial context manager for opening the file, then use the readlines method from your edit, and inside that, use a regular expression to get out the address from the line, and stick the address you get back into a list.
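For what it's worth, a rough sketch of that suggestion might look like the following; the prefix regex and the RoutingTable.txt name are illustrations based on the sample lines above, not code from the answer.
import re

prefix_re = re.compile(r'(\d+\.\d+\.\d+\.\d+/\d+)')

IP = raw_input('Enter IP Address:')
prefixes = []

with open('RoutingTable.txt') as f:
    for line in f.readlines():
        m = prefix_re.search(line)    # pull just the network prefix out of each route line
        if m:
            prefixes.append(m.group(1))

print 'Success' if IP in prefixes else 'Fail'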

Grep a string in python

Friends,
I have a situation where I need to grep a word from a string:
[MBeanServerInvocationHandler]com.bea:Name=itms2md01,Location=hello,Type=ServerRuntime
What I want to grep is the word that is assigned to the variable Name in the above string, which is itms2md01.
In my case I have to grep whichever string is assigned to Name=, so there is no particular string I have to search for.
Tried:
import re
import sys

file = open(sys.argv[2], "r")
for line in file:
    if re.search(sys.argv[1], line):
        print line,
Deak is right. As I do not have enough reputation to comment, I am showing it below. I am not going down to the file level; just see this as an example:
import re
str1 = "[MBeanServerInvocationHandler]com.bea:Name=itms2md01,Location=hello,Type=ServerRuntime"
pat = r'(?<=Name=)\w+(?=,)'
print re.search(pat, str1).group()
Accordingly, you can apply your logic to the file content with this pattern.
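For instance, a minimal sketch applying the same pattern to each line of a file (the my.log filename is just a placeholder):
import re

pat = r'(?<=Name=)\w+(?=,)'

with open('my.log') as f:
    for line in f:
        m = re.search(pat, line)
        if m:
            print m.group()   # e.g. itms2md01 for the sample string above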
I like to use named groups, because I'm often searching for more than one thing. But even for one item in the search, it still works nicely, and I can remember very easily what I was searching for.
I'm not certain that I fully understand the question, but if you are saying that the user can pass a key to search the value for and also a file from which to search, you can do that like this:
So, for this case, I might do:
import re
import sys

file = open(sys.argv[2], "r")
for line in file:
    match = re.search(r"%s=(?P<item>[^,]+)" % sys.argv[1], line)
    if match is not None:
        print match.group('item')
I am assuming that is the purpose, as you have included sys.argv[1] into the search, though you didn't mention why you did so in your question.

Read file in python from specific file

I have a big log file, and I want to read the relevant part from this log.
Every section starts with ###start log###, so I need to find the last occurrence of ###start log### and read the lines from there until the end of the file.
I have seen a solution that can find a line by its seek offset (a number), but I don't know the offset; I only know the content of the line.
What is the best solution for this case?
I'd suggest reading the file backwards until the first occurrence of the start tag.
You may do it in one of two ways: if the file fits into memory try this: Read a file in reverse order using python
If the file is too large - you may find this link helpful:
http://code.activestate.com/recipes/120686-read-a-text-file-backwards/
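If the file does fit into memory, a minimal sketch of that idea is to read the whole file and cut it at the last occurrence of the tag (str.rfind does the backwards search for you; big.log is just a placeholder name):
with open('big.log') as f:
    text = f.read()

start = text.rfind('###start log###')          # index of the last occurrence, or -1 if absent
if start != -1:
    relevant = text[start:].splitlines()[1:]   # the lines after the last start tag
    print('\n'.join(relevant))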
Given the size of the file, you basically need to read the file in reverse order. There are some posts on how to read a file in reverse order in Python; if you are on a Unix system, you may also take a look at the unix tac command, then read its output through a pipe and stop when you hit the start of the log:
>>> from subprocess import PIPE, Popen
>>> from itertools import takewhile
>>> with Popen(['tac', 'tmp.txt'], stdout=PIPE) as proc:
... iter = takewhile(lambda line: line != b'###start log###\n', proc.stdout)
... lines = list(iter)
Then the last log lines in correct order would be:
>>> list(reversed(lines))
with open(filename) as handle:
    text = handle.read()

lines = text.splitlines()
lines.reverse()
i = next(i for i, line in enumerate(lines) if line == '###start log###')
relevant_lines = lines[:i]
relevant_lines.reverse()
