I have a file with several IP addresses: about 900 IPs on 4 lines of text. I would like the output to be 1 IP per line. How can I accomplish this? Based on other code, I have come up with this, but it fails because multiple IPs are on a single line:
import sys
import re

try:
    if sys.argv[1:]:
        print "File: %s" % (sys.argv[1])
        logfile = sys.argv[1]
    else:
        logfile = raw_input("Please enter a log file to parse, e.g /var/log/secure: ")
    try:
        file = open(logfile, "r")
        ips = []
        for text in file.readlines():
            text = text.rstrip()
            regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$',text)
            if regex is not None and regex not in ips:
                ips.append(regex)
        for ip in ips:
            outfile = open("/tmp/list.txt", "a")
            addy = "".join(ip)
            if addy is not '':
                print "IP: %s" % (addy)
                outfile.write(addy)
                outfile.write("\n")
    finally:
        file.close()
        outfile.close()
except IOError, (errno, strerror):
    print "I/O Error(%s) : %s" % (errno, strerror)
The $ anchor in your expression is preventing you from finding anything but the last entry. Remove that, then use the list returned by .findall():
found = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})',text)
ips.extend(found)
re.findall() will always return a list, which could be empty.
If you only want unique addresses, use a set instead of a list.
If you need to validate IP addresses (including ignoring private-use networks and local addresses), consider using the ipaddress.IPv4Address() class.
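The set and ipaddress suggestions can be combined. The following is a minimal Python 3 sketch, not the answerer's code: the names IPV4_CANDIDATE and unique_public_ips are made up for illustration, and it assumes that "private-use and local" means whatever ipaddress reports as is_private or is_loopback.

```python
import ipaddress
import re

# Loose candidate pattern; real validation is left to ipaddress.
IPV4_CANDIDATE = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def unique_public_ips(text):
    """Return valid, non-private IPv4 addresses in first-seen order."""
    seen = set()
    result = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            address = ipaddress.IPv4Address(candidate)
        except ValueError:
            # AddressValueError is a ValueError, e.g. for 999.999.999.999
            continue
        if address.is_private or address.is_loopback:
            continue
        if candidate not in seen:
            seen.add(candidate)
            result.append(candidate)
    return result
```

The set only tracks what has been seen; the list keeps the original order, which a bare set would not.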
The findall function returns a list of matches, and you aren't iterating through each match.
# The trailing $ is dropped so that every address on the line can match.
regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})',text)
if regex is not None:
    for match in regex:
        if match not in ips:
            ips.append(match)
Extracting IP Addresses From File
I answered a similar question in this discussion. In short, it's a solution based on one of my ongoing projects for extracting Network and Host Based Indicators from different types of input data (e.g. string, file, blog posting, etc.): https://github.com/JohnnyWachter/intel
I would import the IPAddresses and Data classes, then use them to accomplish your task in the following manner:
#!/usr/bin/env python
"""Extract IPv4 Addresses From Input File."""
from Data import CleanData  # Format and Clean the Input Data.
from IPAddresses import ExtractIPs  # Extract IPs From Input Data.


def get_ip_addresses(input_file_path):
    """
    Read contents of input file and extract IPv4 Addresses.

    :param input_file_path: fully qualified path to input file. Expecting str
    :returns: dictionary of IPv4 and IPv4-like Address lists
    :rtype: dict
    """
    input_data = []  # Empty list to house formatted input data.
    input_data.extend(CleanData(input_file_path).to_list())
    results = ExtractIPs(input_data).get_ipv4_results()
    return results
Now that you have a dictionary of lists, you can easily access the data you want and output it in whatever way you desire. The below example makes use of the above function; printing the results to console, and writing them to a specified output file:
# Extract the desired data using the aforementioned function.
ipv4_list = get_ip_addresses('/path/to/input/file')
# Open your output file in 'append' mode.
with open('/path/to/output/file', 'a') as outfile:
    # Ensure that the list of valid IPv4 Addresses is not empty.
    if ipv4_list['valid_ips']:
        for ip_address in ipv4_list['valid_ips']:
            # Print to console
            print(ip_address)
            # Write to output file, one address per line.
            outfile.write(ip_address + '\n')
Without the re.MULTILINE flag, $ matches only at the end of the string (or just before a trailing newline at the end of the string).
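A small Python 3 sketch of that anchor behaviour, using made-up sample addresses; the variable names are illustrative only:

```python
import re

text = "10.0.0.1 10.0.0.2\n10.0.0.3 10.0.0.4"
pattern = r"\d{1,3}(?:\.\d{1,3}){3}$"

# Default: $ anchors only at the very end of the whole string.
last_only = re.findall(pattern, text)

# re.MULTILINE: $ also anchors before each newline, so one match per line.
per_line = re.findall(pattern, text, re.MULTILINE)

# No anchor at all: every address is found.
no_anchor = re.findall(pattern.rstrip("$"), text)
```

Here last_only holds only the final address, per_line holds the last address of each line, and no_anchor holds all four.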
To make debugging easier, split the code into several parts that you can test independently.
import re

def extract_ips(data):
    return re.findall(r"\d{1,3}(?:\.\d{1,3}){3}", data)
This regex filters out some valid IPs, e.g., 2130706433 or "1::1".
Conversely, the regex matches invalid strings, e.g., 999.999.999.999. You could validate an IP string using socket.inet_aton() or the more general socket.inet_pton(). You could even split the input into pieces without searching for IPs and use these functions to keep the valid ones.
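The split-and-validate idea might look like the sketch below (Python 3). It assumes the addresses are separated by whitespace, and the helper names is_valid_ip and valid_ips are hypothetical. It uses socket.inet_pton(), which is stricter than inet_aton(): it accepts only dotted-quad IPv4 and standard IPv6 text forms, so an integer form such as 2130706433 is rejected here.

```python
import socket


def is_valid_ip(token):
    """Return True if token parses as an IPv4 or IPv6 address."""
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            socket.inet_pton(family, token)
            return True
        except (OSError, ValueError):
            continue  # not valid for this family, try the next one
    return False


def valid_ips(text):
    """Split on whitespace and keep only the tokens that are valid IPs."""
    return [token for token in text.split() if is_valid_ip(token)]
```

If the input uses commas or other separators, the split step would need to change accordingly.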
If the input file is small and you don't need to preserve the original order of the IPs:
with open(filename) as infile, open(outfilename, "w") as outfile:
    outfile.write("\n".join(set(extract_ips(infile.read()))))
Otherwise:
with open(filename) as infile, open(outfilename, "w") as outfile:
    seen = set()
    for line in infile:
        for ip in extract_ips(line):
            if ip not in seen:
                seen.add(ip)
                print >>outfile, ip
I am currently trying to develop a Python script to sanitize a configuration. My objective is to read a text file line by line, which I can do using the following code:
fh = open('test.txt')
for line in fh:
    print(line)
fh.close()
The output came up as follows:
hostname
198.168.1.1
198.168.1.2
snmp string abck
Now I want to:
Search for the string matching "hostname" and replace it with X
Search for the IPv4 addresses using the regular expression
\b(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(.(?1)){3}\b and replace with X.X\1 (replacing only the first two octets with X)
Anything after "snmp string" needs to be replaced with X
So the final output I am looking for is:
X
x.x.1.1
x.x.1.2
snmp string x
I could not orchestrate everything together. Any help or guidance will be greatly appreciated.
There are lots of approaches to this, but here's one: rather than just printing each line of the file, store each line in a list:
with open("test.txt") as fh:
    contents = []
    for line in fh:
        contents.append(line)
print(contents)
Now you can loop through that list in order to perform your regex operations. I'm not going to write that code for you, but you can use Python's built-in re module.
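For readers who want a starting point, here is one possible Python 3 sketch of those regex operations. It is an illustration under assumptions, not the answerer's code: the names OCTET, IPV4, SNMP, sanitize_line and sanitize_lines are made up, and because Python's re module does not support the (?1) subroutine syntax used in the question, the octet pattern is spelled out four times instead.

```python
import re

# One IPv4 octet (0-255), as in the question's pattern.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
# Capture only the last two octets so they can be kept.
IPV4 = re.compile(r"\b{o}\.{o}\.({o}\.{o})\b".format(o=OCTET))
# Keep the "snmp string" keyword, replace whatever follows it.
SNMP = re.compile(r"(snmp string)\s+.*")


def sanitize_line(line):
    """Mask the hostname, the first two octets, and the SNMP string."""
    line = line.replace("hostname", "X")
    line = IPV4.sub(r"X.X.\1", line)
    line = SNMP.sub(r"\1 X", line)
    return line


def sanitize_lines(lines):
    """Sanitize every line, dropping the trailing newline first."""
    return [sanitize_line(line.rstrip("\n")) for line in lines]
```

Calling sanitize_lines(contents) on the list built above would give one sanitized string per input line, ready to be written back out.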
I'm trying to write code using netmiko that will execute a few show commands and save the results into a text file named after a variable (which is the IP here).
For example, if I enter the IP 8.8.8.8, I want the results to be saved into a text file named 8.8.8.8. Any ideas?
Since the SwIp variable appears to contain the IP address (as a string), you can do it by slightly changing how you open() the file.
# Text mode ('w') accepts str on both Python 2 and Python 3.
with open(SwIp, 'w') as f:
    f.write(str(pre_r)+'\n')
Note I added a trailing newline to the data written to the file and removed the unnecessary f.close() in your code (the with will do that for you automatically).
The problem is that print returns None, and you're setting pre_r equal to that print call's return value:
pre_r = print(connection.send_command(command))
Instead, set pre_r equal to the data, and print that instead:
for command in commands:
    pre_r = connection.send_command(command)
    print(pre_r)
    # Text mode ('w'), because send_command() returns a string.
    with open(SwIp, 'w') as f:
        f.write(pre_r)
You also may want to move that for loop inside the with statement:
with open(SwIp, 'w') as f:
    f.writelines([connection.send_command(cmd) for cmd in commands])
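Putting the pieces together, a hedged Python 3 sketch could look like this. It assumes only that the connection object has a send_command(command) method returning a string, as the answers above rely on. FakeConnection is a hypothetical stand-in so the example runs without a device, and save_command_output is an illustrative name.

```python
import os
import tempfile


def save_command_output(connection, commands, path):
    """Run each command and write all outputs to one text file."""
    with open(path, "w") as f:
        for command in commands:
            # send_command() is assumed to return the device output as str.
            f.write(connection.send_command(command) + "\n")


class FakeConnection:
    """Hypothetical stand-in for a live device connection."""

    def send_command(self, command):
        return "output of " + command


# Demo: the file name is simply the IP address string.
switch_ip = "8.8.8.8"
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, switch_ip)
    save_command_output(FakeConnection(), ["show version", "show clock"], path)
    with open(path) as f:
        saved_text = f.read()
```

Opening the file once, outside the loop, avoids overwriting the output of earlier commands.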
I am working on a project to search an IP address and see if it is in the logfile. I made some good progress but got stuck when dealing with searching certain items in the logfile format.
Here is what I have:
IP = raw_input('Enter IP Address:')
with open ('RoutingTable.txt', 'r') as searchIP:
    for line in searchIP:
        if IP in line:
            ipArray = line.split()
            print ipArray
            if IP == ipArray[0]:
                print "Success"
            else:
                print "Fail"
As you can see this is very bad code but I am new to Python and programming so I used this to make sure I can at least open file and compare first item to string I enter.
Here is an example of the file content (my actual file has thousands of entries):
https://pastebin.com/ff40sij5
I would like a way to store all IPs (just IP and not other junk) in an array and then a loop to go through all items in array and compare with user defined IP.
For example, for this line all I care about is 10.20.70.0/23
D EX 10.20.70.0/23 [170/3072] via 10.10.10.2, 6d06h, Vlan111
[170/3072] via 10.10.10.2, 6d06h, Vlan111
[170/3072] via 10.10.10.2, 6d06h, Vlan111
[170/3072] via 10.10.10.2, 6d06h, Vlan111
Please help.
Thanks
Damon
Edit: I am trying to set flags, but that only works in some cases. As you can see, not all lines start with D; some start with O (for OSPF routes) and C (directly connected).
Here is what I am doing:
f = open("RoutingTable.txt")
Pr = False
for line in f.readlines():
    if Pr: print line
    if "EX" in line:
        Pr = True
        print line
    if "[" in line:
        Pr = False
f.close()
That gives me a bit cleaner result but still whole line instead of just IP.
Do you necessarily need to store all the IPs by themselves? You can do the following, where you grab all the data into a list and check if your input string resides inside the list:
your_file = 'RoutingTable.txt'
# raw_input() returns the typed text as a string (Python 2).
IP = raw_input('Enter IP Address:')
with open(your_file, 'r') as f:
    data = f.readlines()
for d in data:
    if IP in d:
        print 'success'
        break
else:
    print 'fail'
The else statement only triggers when you don't break, i.e. there is no success case.
If you cannot read everything into memory, you can iterate over each line like you did in your post, but thousands of lines should be easily doable.
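The for/else behaviour described above can be seen in isolation in this small Python 3 sketch. The function name contains_ip and the sample lines are made up for illustration. Note that "in" is a substring test, so searching for 10.1.1.1 would also succeed on a line containing 10.1.1.10.

```python
def contains_ip(lines, ip):
    """Return 'success' if any line contains ip, otherwise 'fail'."""
    for line in lines:
        if ip in line:
            result = "success"
            break
    else:
        # Runs only when the loop finished without hitting break.
        result = "fail"
    return result


routes = [
    "D EX 10.20.70.0/23 [170/3072] via 10.10.10.2, 6d06h, Vlan111",
    "C    10.10.10.0/24 is directly connected, Vlan111",
]
```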
Edit
import re

your_file = 'RoutingTable.txt'
ip_addresses = []
# raw_input() returns the typed text as a string (Python 2).
IP = raw_input('Enter IP Address:')
with open(your_file, 'r') as f:
    data = f.readlines()
for d in data:
    # Grab the first "a.b.c.d/len" prefix on the line, if any.
    res = re.search(r'(\d+\.\d+\.\d+\.\d+/\d+)', d)
    if res:
        ip_addresses.append(res.group(1))
for ip_addy in ip_addresses:
    if IP == ip_addy:
        print 'success'
        break
else:
    print 'fail'
print ip_addresses
First up, I'd like to mention that your initial way of handling the file opening and closing (where you used a context manager, the "with open(..)" part) is better. It's cleaner and stops you from forgetting to close it again.
Second, I would personally approach this with a regular expression. If you know you'll be getting the same pattern of it beginning with D EX or O, etc. and then an address and then the bracketed section, a regular expression shouldn't be much work, and they're definitely worth understanding.
This is a good resource to learn generally about them: http://regular-expressions.mobi/index.html?wlr=1
Different languages have different ways to interpret the patterns. Here's a link for python specifics for it (remember to import re): https://docs.python.org/3/howto/regex.html
There is also a website called regexr (I don't have enough reputation for another link) that you can use to mess around with expressions on to get to grips with it.
To summarise, I'd personally keep the initial context manager for opening the file, then use the readlines method from your edit, and inside that, use a regular expression to get out the address from the line, and stick the address you get back into a list.
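That summary can be sketched in Python 3 as follows. This is one possible reading, with assumptions: the pattern expects prefixes written as a.b.c.d/len like the sample line in the question, and the names PREFIX, extract_prefixes and sample are illustrative. With a real file, the list of lines would come from a "with open(...)" block instead of the hard-coded sample.

```python
import re

# Dotted quad followed by a slash and a prefix length.
PREFIX = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}/\d{1,2})\b")


def extract_prefixes(lines):
    """Collect the first route prefix found on each line."""
    prefixes = []
    for line in lines:
        match = PREFIX.search(line)
        if match:
            prefixes.append(match.group(1))
    return prefixes


sample = [
    "D EX 10.20.70.0/23 [170/3072] via 10.10.10.2, 6d06h, Vlan111",
    "                   [170/3072] via 10.10.10.2, 6d06h, Vlan111",
]
prefixes = extract_prefixes(sample)
```

A user-supplied value can then be checked with a plain membership test such as "10.20.70.0/23" in prefixes.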
I am trying to parse a log with a bunch of lines.
The line I am trying to parse from a live trace (a kind of tail of the file) is the one that starts with "Contact".
I need to capture everything between the brackets, whatever it is,
[2a00:c30:7141:230:1066:4f46:7243:a6d2], and the number after the colon that follows the brackets (56791),
as variables.
I have tried a regex search, but I do not know how to deal with it.
Contact: "200" <sip:200#[2a00:c30:7141:230:1066:4f46:7243:a6d2]:56791;transport=udp;registering_acc=example_com>;expires=600
If the format is always the same:
for line in logfile:
    if "Contact" in line:
        myIPAddress=line.split('[')[1].split(']')[0]
        myPort=line.split(']:')[1].split(';')[0]
Use a regex to do so:
import re
logfile = open('xxx.log')
p = r'\[([a-f0-9:]+)\]:([0-9]+)'
pattern = re.compile(p)
for line in logfile:
    if line.startswith('Contact:'):
        m = pattern.search(line)
        if m:  # skip Contact lines without a bracketed address
            print m.groups()
logfile.close()
If you are getting new entries through something like tail -f $logfile, you can pipe the output of that to this:
import re
import sys

for line in sys.stdin:
    m = re.match(r'Contact: .*?\[(.*?)\]:(\d+)', line)
    if m is not None:
        address, port = m.groups()
        print address, port
Basically, this reads each line that comes in on standard input and tries to find the items you are interested in. If a line does not match, it prints nothing.
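The same matching step, written as a small testable Python 3 function. This is a sketch with assumptions: the names CONTACT and parse_contact are made up, and the character class restricts the bracketed part to hex digits and colons, which is narrower than the (.*?) used above.

```python
import re

# Bracketed IPv6 address followed by a colon and a port number.
CONTACT = re.compile(r"Contact: .*?\[([0-9A-Fa-f:]+)\]:(\d+)")


def parse_contact(line):
    """Return (address, port) for a Contact line, or None if no match."""
    match = CONTACT.match(line)
    if match is None:
        return None
    address, port = match.groups()
    return address, int(port)


sample_line = (
    'Contact: "200" <sip:200#[2a00:c30:7141:230:1066:4f46:7243:a6d2]:56791;'
    "transport=udp;registering_acc=example_com>;expires=600"
)
```

parse_contact(sample_line) returns the address as a string and the port as an integer; other lines return None.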
match = re.search(r'Contact: .*?\[(.*?)\]:(\d+)', line_in_file)
if match:
    temp = line_in_file.split('[')
    temp1 = temp[1].split(';')
    hexValues = re.findall('[a-f0-9]', temp1[0])
I'm trying to create a Python script to look up a specific string in a txt file.
For example, I have the text file dbname.txt, which includes the following:
Level1="50,90,40,60"
Level2="20,10,30,80"
I need the script to search the file for the user's input and print the level that contains that value, like:
Please enter the quantity : 50
The level is : Level1
I am stuck on the part that searches the file.
Any advice?
Thanks in advance
In these sorts of limited cases, I would recommend regular expressions.
import re
import os
You need a file to get the info out of. Make a directory for it if it's not there, and then write the file:
if not os.path.isdir('/tmp'):
    os.mkdir('/tmp')
filepath = '/tmp/foo.txt'
with open(filepath, 'w') as file:
    file.write('Level1="50,90,40,60"\n'
               'Level2="20,10,30,80"')
Then read the info and parse it:
with open(filepath) as file:
    txt = file.read()
We'll use a regular expression with two capturing groups, the first for the Level, the second for the numbers:
mapping = re.findall(r'(Level\d+)="(.*)"', txt)
This will give us a list of tuple pairs. Semantically I'd consider them keys and values. Then get your user input and search your data:
user_input = raw_input('Please enter the quantity: ')
I typed 50, and then:
for key, value in mapping:
    # Compare against whole numbers so that "5" does not match "50".
    if user_input in value.split(','):
        print('The level is {0}'.format(key))
which prints:
The level is Level1
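The same steps can be wrapped into two small Python 3 functions. This is only a sketch: parse_levels and find_level are hypothetical names, and it assumes the file always uses the Level<N>="a,b,c" layout shown in the question.

```python
import re


def parse_levels(text):
    """Map each level name to its list of quantity strings."""
    levels = {}
    for name, values in re.findall(r'(Level\d+)="(.*?)"', text):
        levels[name] = values.split(",")
    return levels


def find_level(levels, quantity):
    """Return the first level whose list contains quantity, else None."""
    for name, values in levels.items():
        if quantity in values:
            return name
    return None


txt = 'Level1="50,90,40,60"\nLevel2="20,10,30,80"'
levels = parse_levels(txt)
```

Splitting the values into a list means the lookup compares whole numbers, so a partial entry such as 5 finds nothing.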
Use the mmap module, which is an efficient way to search a large file. mmap doesn't read the entire file into memory (it pages it in on demand) and supports both find() and rfind():
import mmap

with open("hello.txt", "r+b") as f:
    # memory-map the file, size 0 means whole file
    mm = mmap.mmap(f.fileno(), 0)
    # the mapping holds bytes, so search for a bytes pattern
    position = mm.find(b'blah')
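A self-contained Python 3 variant of the mmap approach, as a sketch: find_in_file is an illustrative name, the temporary file exists only so the example can run, and a read-only mapping is used since nothing is written. Note that mmap raises ValueError on an empty file.

```python
import mmap
import os
import tempfile


def find_in_file(path, needle):
    """Return the byte offset of the first match of needle, or -1."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm.find(needle)


# Demo with a throwaway file.
tmp = tempfile.NamedTemporaryFile(delete=False)
try:
    tmp.write(b"hello blah world")
    tmp.close()
    offset = find_in_file(tmp.name, b"blah")
    missing = find_in_file(tmp.name, b"absent")
finally:
    os.remove(tmp.name)
```

As with str.find(), a result of -1 means the pattern is not present.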