Find and replace regex within text file (mac addresses)

This has been asked elsewhere, but none of those solutions worked for me. I am trying to search and replace using open(file) instead of fileinput, because I print an "x of y completed" message as the script works, and fileinput writes that message into the file rather than to the terminal. My test file is 100 MAC addresses separated by newlines.
All I want to do is find each line matching the MAC address regex and replace it with "MAC ADDRESS WAS HERE". Below is what I have; it only writes the replacement string once, at the bottom of the file.
#!/usr/bin/env python3
import sys
import getopt
import re
import socket
import os
import fileinput
import time
file = sys.argv[1]
regmac = re.compile("^(([a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}|([a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}|([0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$")
regmac1 = "^(([a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}|([a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}|([0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$"
regv4 = re.compile(r'^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$')
regv41 = '^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'
menu = {}
menu['1']="MAC"
menu['2']="IPV4"
menu['3']="IPV6"
menu['4']="STRING"
menu['5']="EXIT"
while True:
    options=menu.keys()
    sorted(options)
    for entry in options:
        print(entry, menu[entry])
    selection = input("Please Select:")
    if selection == '1':
        print("MAC chosen...")
        id = str('mac')
        break
    elif selection == '2':
        print("IPV4 chosen")
        id = str('ipv4')
        break
    elif selection == '3':
        print("IPV6 chosen")
        id = str('ipv6')
        break
    elif selection == '4':
        print("String chosen")
        id = str('string')
        break
    elif selection == '5':
        print("Exiting...")
        exit()
    else:
        print("Invalid selection!")
macmatch = 0
total = 0
while id == 'mac':
    with open(file, 'r') as i:
        for line in i.read().split('\n'):
            matches = regmac.findall(line)
            macmatch += 1
        print("I found",macmatch,"MAC addresses")
        print("Filtering found MAC addresses")
        i.close()
    with open(file, 'r+') as i:
        text = i.readlines()
        text = re.sub(regmac, "MAC ADDRESS WAS HERE", line)
        i.write(text)
The above puts "MAC ADDRESS WAS HERE" at the end of the last line without replacing any MAC addresses.
I am fundamentally missing something. If someone could point me in the right direction, that would be great!
Caveat: I have this working via fileinput, but cannot display progress with it, so I am trying the approach above. Thanks again!

All, I figured it out. Posting working code just in case someone happens upon this post.
#!/usr/bin/env python3
#Rewriting Sanitizer script from bash
#Import Modules, trying to not download any additional packages. Using regex to make this python2 compatible (does not have ipaddress module).
import sys
import getopt
import re
import socket
import os
import fileinput
import time
#Usage Statement sanitize.py /path/to/file, add help statement
#Test against calling entire directories, * usage
#Variables
file = sys.argv[1]
regmac = re.compile("^(([a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}|([a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}|([0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$")
regmac1 = "^(([a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}|([a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}|([0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$"
regv4 = re.compile(r'^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$')
regv41 = '^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'
#Functions
menu = {}
menu['1']="MAC"
menu['2']="IPV4"
menu['3']="IPV6"
menu['4']="STRING"
menu['5']="EXIT"
while True:
    options=menu.keys()
    sorted(options)
    for entry in options:
        print(entry, menu[entry])
    selection = input("Please Select:")
    if selection == '1':
        print("MAC chosen...")
        id = str('mac')
        break
    elif selection == '2':
        print("IPV4 chosen")
        id = str('ipv4')
        break
    elif selection == '3':
        print("IPV6 chosen")
        id = str('ipv6')
        break
    elif selection == '4':
        print("String chosen")
        id = str('string')
        break
    elif selection == '5':
        print("Exiting...")
        exit()
    else:
        print("Invalid selection!")
macmatch = 0
total = 0
while id == 'mac':
    with open(file, 'r') as i:
        for line in i.read().split('\n'):
            matches = regmac.findall(line)
            macmatch += 1
        print("I found",macmatch,"MAC addresses")
        print("Filtering found MAC addresses")
        i.close()
    with open(file, 'r') as i:
        lines = i.readlines()
    with open(file, 'w') as i:
        for line in lines:
            line = re.sub(regmac, "MAC ADDRESS WAS HERE", line)
            i.write(line)
        i.close()
    break
The above overwrites each regex match (found MAC address) with "MAC ADDRESS WAS HERE". Hopefully this helps someone. Suggestions for making this more efficient, or for another way to accomplish it, are welcome. I will mark this as the answer once I am able to, in 2 days.
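As one possible tightening of the working code above, here is a minimal sketch that reads the file once, replaces in memory, and reports progress to the terminal. The names MAC_RE, sanitize_lines, sanitize_file and print_progress are made up for illustration. The pattern covers the same three MAC notations as regmac, but I dropped the trailing "?" on the assumption that blank lines should not count as matches; keep it if you rely on that behaviour.

```python
import re

# Same three MAC notations as the original pattern, written as a raw string.
# The optional "?" is removed so an empty line is not treated as a match.
MAC_RE = re.compile(
    r"^(?:(?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2}"
    r"|(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}"
    r"|(?:[0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})$"
)


def print_progress(done, total):
    # Printed to the terminal, never into the file being rewritten.
    print("{} of {} completed".format(done, total))


def sanitize_lines(lines, placeholder="MAC ADDRESS WAS HERE", report=None):
    """Return (new_lines, replaced_count); report(done, total) is optional."""
    total = len(lines)
    replaced = 0
    new_lines = []
    for done, line in enumerate(lines, 1):
        # subn returns the rewritten line and how many substitutions it made.
        new_line, count = MAC_RE.subn(placeholder, line)
        replaced += count
        new_lines.append(new_line)
        if report is not None:
            report(done, total)
    return new_lines, replaced


def sanitize_file(path):
    """Rewrite the file at path in place and return the number of replacements."""
    with open(path, "r") as handle:
        lines = handle.readlines()
    new_lines, replaced = sanitize_lines(lines, report=print_progress)
    with open(path, "w") as handle:
        handle.writelines(new_lines)
    return replaced
```

Calling sanitize_file(sys.argv[1]) would slot into the 'mac' branch of the menu script. Because the count comes from subn, it reflects actual replacements rather than the number of lines read.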

Related

I cannot delete a file with python

First off, I am not too familiar with Python and am still learning. This is also my first time posting here, so sorry if I mess up some details.
I am running experiments and need to run multiple replicates. The issue arises when I need to start a new set of replicates: the program moves the experiments that have already run into a new folder and is then supposed to start the new replicates. However, for the last experiment only, the environment.cfg file is not transferred and the program crashes.
from os import listdir,chdir
import subprocess
from random import randrange,sample
from shutil import copy,copytree,rmtree
from os import mkdir,remove
import csv
import time
import shutil
test = input("Is this a test? (y/n)")
if test == "y":
    text = input("What are you testing?")
    testdoc=open("Testing Documentation.txt","a")
    testdoc.write(text)
elif test != "y" and test != "n":
    print("Please type in y or n")
    test = input("Is this a test? (y/n)")
expnum = int(input("Number of Experiments: "))
exptype = int(input("Which experiment do you want to run? (1/2)"))
repnum= int(input("How many replicates do you want?"))
print(f"You want {repnum} replicates, each replicate will contain {expnum} for a total of {repnum*expnum}")
confirm = input("Is this correct? (y/n)")
if confirm == "y":
    for rep in range(repnum):
        if exptype == 1:
            for ex in range (expnum):
                num= ex + 1
                mkdir('experiment_'+str(num)) #create a directory named cdir
                copy('./default_files/environment.cfg','./experiment_'+str(num)+'/environment.cfg') #copy one file from one place to another
                #WSIZE
                env_file = open('./experiment_'+str(num)+'/environment.cfg','r')
                env_file_content = []
                for i in env_file:
                    env_file_content.append(i.split(' '))
                #env_file_content = [['a','b'],['c','d']]
                #access first line: env_file_content[0] ; note that index in python start from 0
                #access first element in second line: env_file_content[1][0] ; note that index in python start from 0
                n = num #number of resources
                var0 = '100' #resource inflow
                var1 = '0.01' #resource outflow
                var3 = '1' # The minimum amount of resource required
                reactiontype = ['not','nand','and','orn','or','andn','nor','xor','equ']
                reward = ["1.0","1.0","2.0","2.0","4.0","4.0","8.0","8.0","16.0"]
                #n = sample(range(10),1)[0]
                out = open('./experiment_'+str(num)+'/environment.cfg','w')
                for i in range(n):
                    out.write('RESOURCE res'+str(i)+':inflow='+var0+':outflow='+var1+'\n')
                sc=0
                for i in range(n):
                    out.write('REACTION reaction'+str(i)+' '+reactiontype[sc]+' process:resource=res'+str(i)+':value='+reward[sc]+':min='+var3+ '\n')
                    sc+=1
                    if sc==len(reactiontype):
                        sc = 0
                out.close()
                ##RUN Avida from python
                copy('./experiment_' + str(num) + '/environment.cfg', './')
                print("starting experiment_" + str(num))
                proc = subprocess.Popen(['./avida'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
                # Wait for the subprocess to finish
                output, error = proc.communicate()
                # Close the subprocess
                proc.terminate()
                shutil.move('./data','./experiment_' +str(num))
                #copytree('./data', './experiment_' + str(num) + '/data')
                #rmtree('./data')
                remove('./environment.cfg')
            replicatenum = rep + 1
            mkdir('replicate_'+str(replicatenum)) #create a directory named replicate_
            for repl in range(expnum):
                numb = repl + 1
                source = './experiment_'+str(numb)
                dest = './replicate_' + str(replicatenum) + '/experiment_' + str(numb)
                shutil.move(source, dest)
I tried renaming the folder, but that also failed. The issue seems to arise only while the program is running: running the same move code after the program has crashed causes no problems.
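One possible cause, which nothing in the post confirms: env_file is opened for reading and never closed. On Windows, a file with an open handle usually cannot be moved or deleted. In CPython the handles from earlier loop iterations are normally released when env_file is rebound, but the handle for the last experiment stays open until the script exits. That would fit the symptom that only the last experiment's environment.cfg fails, and only while the program is running. A minimal sketch of reading the file inside a with block, so the handle is closed before anything is moved, is below. The helper name read_config_rows and the temporary directory layout are made up for the demonstration.

```python
import os
import shutil
import tempfile


def read_config_rows(path):
    """Read a config file into a list of space-split rows, then close it."""
    with open(path, "r") as env_file:
        return [row.split(" ") for row in env_file]


# Self-contained demonstration in a temporary directory (hypothetical layout).
base = tempfile.mkdtemp()
exp_dir = os.path.join(base, "experiment_1")
os.mkdir(exp_dir)
cfg_path = os.path.join(exp_dir, "environment.cfg")
with open(cfg_path, "w") as out:
    out.write("RESOURCE res0:inflow=100:outflow=0.01\n")

# The handle is already closed when read_config_rows returns.
rows = read_config_rows(cfg_path)

rep_dir = os.path.join(base, "replicate_1")
os.mkdir(rep_dir)
moved_to = shutil.move(exp_dir, os.path.join(rep_dir, "experiment_1"))
print("moved to", moved_to)
```

If this is the cause, replacing the bare open(...) for env_file with a with block in the original loop should be enough; the same applies to testdoc, which is also never closed.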

Python for loop only running once?

This script checks a CSV of species names against a database in another CSV and reports which names appear in both. The issue is that, although it reads all the search terms correctly, it only searches for the first one; i.e., if I print speciesl before 'for row in p', all species names are returned correctly.
from pathlib import Path
import os
import csv
p = csv.reader(open('Paldat.csv','r',newline=''), delimiter=',')
with open('newsssssss.csv','r',newline='\n')as r:
    for line in r:
        taxons=line.split(',')
        no = ['\r\n']
        noo = ['\n']
        if taxons == no:
            continue
        elif taxons == noo:
            continue
        else:
            speciesl = []
            for val in taxons:
                val = val.replace('\n','')
                speciesl.append(val)
            g=speciesl[0].lower()
            if len(speciesl) < 2:
                continue
            else:
                s=speciesl[1].lower()
                for row in p: #This loop seems to be the issue
                    genus = row[0].lower()
                    species = row[1].lower()
                    if g == genus and s == species:
                        print('Perfect match')
                        print(g)
                    elif s == species:
                        print(speciesl)
                        print('Species found')
                    else:
                        continue
                else:
                    continue
Here is part of Paldat.csv:
Camassia,leichtlinii,monad,monad,large (51-100 µm),-,-,-,-,-,sulcate,heteropolar,oblate,-,elliptic,-,-,boat-shaped,no suitable term,aperture(s) sunken,1,sulcus,sulcate,aperture membrane ornamented,-,-,-,"reticulate, heterobrochate, perforate",-,-,-,-,-,-,-,-,-,-,-,present,,
Cistus,parviflorus,monad,monad,medium-sized (26-50 µm),-,-,-,-,-,colporate,isopolar,-,spheroidal,circular,-,-,spheroidal,circular,"aperture(s) sunken, not infolded",3,colporus,"colporate, tricolporate",-,-,-,-,striato-reticulate,-,-,-,-,-,-,-,-,-,-,-,absent,,
Camellia,japonica,monad,monad,medium-sized (26-50 µm),41-50 µm,36-40 µm,41-50 µm,41-50 µm,41-50 µm,colpate,isopolar,-,spheroidal,circular,oblique,prolate,-,triangular,aperture(s) sunken,3,colpus,"colpate, tricolpate",operculum,"granulate, scabrate, reticulate",-,-,microreticulate,-,-,-,-,-,-,-,-,-,-,-,-,,
Camellia,sinensis,monad,monad,medium-sized (26-50 µm),41-50 µm,36-40 µm,41-50 µm,41-50 µm,41-50 µm,colporate,isopolar,oblate,-,triangular,oblique,isodiametric,-,triangular,aperture(s) sunken,3,colporus,"colporate, tricolporate",operculum,"scabrate, verrucate, gemmate",-,-,"verrucate, perforate",-,-,-,-,-,-,-,-,-,-,-,-,,
And part of newsssssss.csv:
Camassia,leichtlinii
Camellia,japonica
Camellia,sinensis
Chrysanthemum,leucanthemum
Cirsium,arvense
Cissus,quadrangularis

Try removing "newline='\n'" from the "open" line.
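Another likely cause, independent of the newline argument: csv.reader returns a one-shot iterator, so 'for row in p' consumes every row of Paldat.csv on the first species and has nothing left to yield for the second. A minimal sketch that loads the database into memory once and then checks each query against it is below. The function names load_database and classify are made up, and the in-memory CSV uses only the first three columns of the sample rows from the question.

```python
import csv
import io


def load_database(handle):
    """Read every (genus, species) pair from the database CSV exactly once."""
    return [
        (row[0].lower(), row[1].lower())
        for row in csv.reader(handle)
        if len(row) >= 2
    ]


def classify(queries, database):
    """Return (genus, species, verdict) for each query found in the database."""
    pairs = set(database)
    species_names = {species for _genus, species in database}
    results = []
    for genus, species in queries:
        key = (genus.lower(), species.lower())
        if key in pairs:
            results.append((genus, species, "Perfect match"))
        elif key[1] in species_names:
            results.append((genus, species, "Species found"))
    return results


# Stand-in for Paldat.csv, truncated to three columns per row.
paldat = io.StringIO(
    "Camassia,leichtlinii,monad\n"
    "Cistus,parviflorus,monad\n"
    "Camellia,japonica,monad\n"
    "Camellia,sinensis,monad\n"
)
# Stand-in for newsssssss.csv.
queries = [
    ("Camassia", "leichtlinii"),
    ("Camellia", "japonica"),
    ("Camellia", "sinensis"),
    ("Chrysanthemum", "leucanthemum"),
    ("Cirsium", "arvense"),
    ("Cissus", "quadrangularis"),
]

database = load_database(paldat)
results = classify(queries, database)
for genus, species, verdict in results:
    print(genus, species, verdict)
```

With the real files, load_database would be called on open('Paldat.csv', newline='') inside a with block. Wrapping the original reader as list(csv.reader(...)) is a smaller change with the same effect.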

Python log.txt making fancy. grep/regex

I have a log.txt containing lines like this:
"[25-Feb-2016 11:27:16 +0200]: Login failed .... 212.153.100.19 Get/.... emailaddress#email.com"........
How can I write a script that extracts (via grep or a regex) only the dates, IP addresses and email addresses and writes them to another .txt file?
The most important thing is that each date stays together with its corresponding IP and email.
I tried the following code, but it splits the data into three separate lists:
import os
import re
import datetime
filename = 'log.txt'
newfilename = 'output.txt'
if os.path.exists(filename):
    data = open(filename,'r')
    bulkemails = data.read()
else:
    print "File not found."
    raise SystemExit
r = re.compile(r'[\w\.-]+#[\w\.-]+\b')
results = r.findall(bulkemails)
emails = ""
for x in results:
    emails += str(x)+"\n"
ip = re.compile(r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b')
result = ip.findall(bulkemails)
ip =""
for y in result:
    ip += str(y)+"\n"
dt = re.compile(r'(\d{4})-(\d{2})-(\d{2})')
result = dt.findall(bulkemails)
dt =""
for z in result:
    dt += str(z)+"\n"
def writefile():
    f = open(newfilename, 'w')
    f.write(emails + ip + dt)
    f.close()
    print "File written."
def overwrite_ok():
    response = raw_input("Are you sure you want to overwrite "+str(newfilename)+"? Yes or No\n")
    if response == "Yes":
        writefile()
    elif response == "No":
        print "Aborted."
    else:
        print "Please enter Yes or No."
        overwrite_ok()
if os.path.exists(newfilename):
    overwrite_ok()
else:
    writefile()
So I would like output.txt to contain the following:
25-Feb-2016 11:27:16 +0200] -- 212.153.100.19 -- emailaddress#email.com"
25-Feb-2016 11:27:16 +0200] -- 212.153.100.10 -- emailaddress1#email.com"
25-Feb-2016 11:27:16 +0200] -- 212.153.100.11 -- emailaddress2#email.com"
Thanks for help, and have a nice day :)

You should make a regex with three groups, one for the time, one for the IP and one for the email.
import re
my_regex = re.compile(r".+?(\d{2}-\w+-\d{4}).+?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).+?\b([\w.\d]+#[\w.\d]+)(?:\b|$)")
with open("somefile") as f_logs:
    logs = f_logs.readlines()
for line in logs:
    # sub() returns the rewritten line, so print it (or write it to the output file)
    print(my_regex.sub(r"[\1] -- \2 -- \3", line.rstrip("\n")))
You can check it on regex101
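A variant of the same idea, as a sketch: the pattern in the answer captures only the date, while the desired output also shows the time and UTC offset. The pattern below assumes every line starts its timestamp in the bracketed form from the sample ("[25-Feb-2016 11:27:16 +0200]") and that addresses use "#" as in the question. LOG_LINE and extract_line are made-up names.

```python
import re

# One pattern with three named groups, so each timestamp stays paired with
# the IP and the address found on the same line.
LOG_LINE = re.compile(
    r"\[(?P<stamp>\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2} [+-]\d{4})\]"
    r".*?(?P<ip>\b\d{1,3}(?:\.\d{1,3}){3}\b)"
    r".*?(?P<email>[\w.-]+#[\w.-]+)"
)


def extract_line(line):
    """Return 'stamp -- ip -- email' for a matching line, otherwise None."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    return "{} -- {} -- {}".format(
        match.group("stamp"), match.group("ip"), match.group("email")
    )


sample = (
    "[25-Feb-2016 11:27:16 +0200]: Login failed .... "
    "212.153.100.19 Get/.... emailaddress#email.com"
)
summary = extract_line(sample)
print(summary)
```

To build output.txt, apply extract_line to every line of log.txt and write out the results that are not None, one per line.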

Checking if a file exists in Python 2.7

I am using a simple python script to search and play songs on my laptop. The code goes as follows :-
import os
d_name = raw_input("enter drive name:-")
choice = raw_input("song or video(s/v):-")
if(choice == 's'):
    s_name = raw_input("enter song name:- ")
    flag = 1
elif(choice=='v'):
    s_name = raw_input("enter video name:-")
    flag = 2
if(flag == 1):
    f_s_name = "start "+d_name+":/"+s_name+".mp3"
elif(flag == 2):
    f_s_name = "start "+d_name+":/"+s_name+".mp4"
# list the chosen drive (the original passed the literal string "d_name:/")
dir_list = os.listdir(d_name+":/")
i=0
while(1):
    if(not(os.system(f_s_name))):
        break
    else:
        if(flag == 1):
            f_s_name = "start "+d_name+":/"+dir_list[i]+"/"+s_name+".mp3"
        elif(flag == 2):
            f_s_name = "start "+d_name+":/"+dir_list[i]+"/"+s_name+".mp4"
        i = i+1
The above program works fine, but whenever a call to os.system() fails, a dialog box pops up claiming that the song is not there; this repeats until the song is found. How can I prevent that dialog box from popping up?

You'd use os.path.exists to test whether the file you're about to start actually exists; if it is not found, do not try to start that file:
import os
....
filename = '{}:/{}/{}.mp3'.format(d_name, dir_list[i], s_name)
if os.path.exists(filename):
    os.system('start ' + filename)
else:
    print "File {} was not found".format(filename)
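If the file may sit more than one folder deep, a related approach is to locate it first and only then launch it. Below is a minimal sketch using os.walk, written for Python 3. The helper name find_file and the temporary folder layout are made up for the demonstration.

```python
import os
import tempfile


def find_file(root, filename):
    """Return the first path under root whose base name is filename, else None."""
    for folder, _subdirs, files in os.walk(root):
        if filename in files:
            return os.path.join(folder, filename)
    return None


# Demonstration on a throwaway directory tree (hypothetical layout).
root = tempfile.mkdtemp()
nested = os.path.join(root, "music", "rock")
os.makedirs(nested)
song_path = os.path.join(nested, "song.mp3")
with open(song_path, "w") as handle:
    handle.write("")

found = find_file(root, "song.mp3")
missing = find_file(root, "missing.mp3")
print(found, missing)
```

Since nothing is started until a path is found, no "not found" dialog should appear. On Windows, the returned path could then be passed to os.startfile or to the existing os.system('start ...') call.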

Python: For loop only loops over the first part of a txt file

Recently I had to make a script for my internship to check if a subnet occurs in a bunch of router/switch configs.
I've made a script that generates the output. Now I need a second script (I couldn't get it to work into one), that reads the output, if the subnet occurs write it to aanwezig.txt, if not write to nietAanwezig.txt.
A lot of other answers helped me to make this script and it works but it only executes for the first 48 IPs and there are over 2000...
The code of checkOutput.py:
def main():
    file = open('../iprangesclean.txt', 'rb')
    aanwezig = open('../aanwezig.txt', 'w')
    nietAanwezig = open('../nietAanwezig.txt', 'w')
    output = open('output.txt', 'rb')
    for line in file:
        originalLine = line
        line.rstrip()
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        line = line.replace('\r', '')
        one,two,three,four = line.split('.')
        # 3Byte IP:
        ipaddr = str(one) + "." + str(two) + "." + str(three)
        counter = 1
        found = 0
        for lijn in output:
            if re.search("\b{0}\b".format(ipaddr),lijn) and found == 0:
                found = 1
            else:
                found = 2
            print counter
            counter= counter + 1
        if found == 1:
            aanwezig.write(originalLine)
            print "Written to aanwezig"
        elif found == 2:
            nietAanwezig.write(originalLine)
            print "Written to nietAanwezig"
        found = 0
    file.close()
    aanwezig.close()
    nietAanwezig.close()
main()
The format of iprangesclean.txt is like following:
10.35.6.0/24
10.132.42.0/24
10.143.26.0/24
10.143.30.0/24
10.143.31.0/24
10.143.32.0/24
10.35.7.0/24
10.143.35.0/24
10.143.44.0/24
10.143.96.0/24
10.142.224.0/24
10.142.185.0/24
10.142.32.0/24
10.142.208.0/24
10.142.70.0/24
and so on...
Part of output.txt (I can't give you everything because it has user information):
*name of device*.txt:logging 10.138.200.100
*name of device*.txt:access-list 37 permit 10.138.200.96 0.0.0.31
*name of device*.txt:access-list 38 permit 10.138.200.100
*name of device*.txt:snmp-server host 10.138.200.100 *someword*
*name of device*.txt:logging 10.138.200.100

Try this change:
for lijn in output:
    found = 0 # put this here
    if re.search("\b{0}\b".format(ipaddr),lijn) and found == 0:
        found = 1
    else:
        found = 2
    print counter
    counter= counter + 1
    """Indent one level so it is in the for statement"""
    if found == 1:
        aanwezig.write(originalLine)
        print "Written to aanwezig"
    elif found == 2:
        nietAanwezig.write(originalLine)
        print "Written to nietAanwezig"
If I understand the problem correctly, this should guide you to the right direction. The if statement is currently not executed in the for statement. If this does solve your problem, then you don't need the found variable either. You can just have something like:
for counter, lijn in enumerate(output, 1):
    if re.search("\b{0}\b".format(ipaddr),lijn):
        aanwezig.write(originalLine)
        print "Written to aanwezig"
    else:
        nietAanwezig.write(originalLine)
        print "Written to nietAanwezig"
    print counter
Please let me know if I have misunderstood the question.
Note: I haven't tested the code above; try it out as a starting point.
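Three other things in the question's script may matter as well, independent of the indentation. First, output is a file object, so the inner loop can only read it once; for every later subnet it yields nothing. Second, "\b" in a normal (non-raw) string is a backspace character, not a regex word boundary. Third, the dots in ipaddr are unescaped and match any character. The script also never imports re. Below is a sketch in Python 3 that addresses these by reading output.txt into a list once and building the pattern with a raw string and re.escape. The function name split_by_presence is made up, and the range lines in the demonstration are chosen to fit the output.txt excerpt rather than taken from iprangesclean.txt.

```python
import re


def split_by_presence(range_lines, output_lines):
    """Split range lines into (present, absent) by their first three octets."""
    present, absent = [], []
    for original in range_lines:
        cleaned = original.strip()
        if not cleaned:
            continue
        # "10.138.200.0/24" -> "10.138.200"
        prefix = ".".join(cleaned.split(".")[:3])
        # Raw string for \b, and re.escape so each "." matches a literal dot.
        pattern = re.compile(r"\b" + re.escape(prefix) + r"\b")
        # output_lines is a list, so it can be scanned again for every range.
        if any(pattern.search(line) for line in output_lines):
            present.append(original)
        else:
            absent.append(original)
    return present, absent


# Demonstration data; the range lines are illustrative.
range_lines = ["10.138.200.0/24\n", "10.35.6.0/24\n"]
output_lines = [
    "*name of device*.txt:logging 10.138.200.100\n",
    "*name of device*.txt:access-list 37 permit 10.138.200.96 0.0.0.31\n",
]
present, absent = split_by_presence(range_lines, output_lines)
print(present, absent)
```

With the real files, output_lines would come from open('output.txt').readlines(), and the two returned lists would be written to aanwezig.txt and nietAanwezig.txt with writelines.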
