python nesting loops - python

I am trying to use a nested loop to combine data into a single line by matching MAC addresses that appear in both files.
The loop runs fine without the regex. With the regex search below, however, it only goes through MAC_Lines once: it prints the correct result for the first entry in MAC_Lines and then stops. I'm unsure how to make it move on to the next line and repeat the process for every entry in MAC_Lines.
try:
    for mac in MAC_Lines:
        MAC_address = re.search(r'([a-fA-F0-9]{2}[:|\-]?){6}', mac, re.I)
        MAC_address_final = MAC_address.group()
        for arp in ARP_Lines:
            ARP_address = re.search(r'([a-fA-F0-9]{2}[:|\-]?){6}', arp, re.I)
            ARP_address_final = ARP_address.group()
            if MAC_address_final == ARP_address_final:
                print mac + arp
                continue
except Exception:
    print 'completed.'
Results:
13,64,00:0c:29:36:9f:02,giga-swx 0/213,172.20.13.70, 00:0c:29:36:9f:02, vlan 64
completed.

I learned that the issue was how I opened the files. I should have used the with open(...) as ... statement for both files, so that the ARP file is closed and reopened, and therefore read from the start again, on every pass of the outer loop. Below is the code I was looking for:
with open('MAC_List.txt', 'r') as read0:
    for items0 in read0:
        MAC_address = re.search(r'([a-fA-F0-9]{2}[:|\-]?){6}', items0, re.I)
        if MAC_address:
            mac_addy = MAC_address.group().upper()
            # Reopen the ARP file for every MAC line so it is read from the top
            with open('ARP_List.txt', 'r') as read1:
                for items1 in read1:
                    ARP_address = re.search(r'([a-fA-F0-9]{2}[:|\-]?){6}', items1, re.I)
                    if ARP_address:
                        # Upper-case both sides so the comparison ignores case
                        arp_addy = ARP_address.group().upper()
                        if mac_addy == arp_addy:
                            print(items0.strip() + ' ' + items1.strip())
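The version above reopens ARP_List.txt once for every line of MAC_List.txt. As a hedged alternative (not the code from the question), the ARP lines could be indexed by MAC address in a dictionary first, so each input is read only once. The sample lines and the helper names find_mac and join_by_mac are made up for illustration, and the regex is the one from the question with the literal "|" dropped from the character class.

```python
import re

# Same pattern as in the question, minus the literal "|" in the character class
MAC_RE = re.compile(r'(?:[a-fA-F0-9]{2}[:\-]?){6}')

def find_mac(line):
    """Return the first MAC address found in line, upper-cased, or None."""
    match = MAC_RE.search(line)
    return match.group().upper() if match else None

def join_by_mac(mac_lines, arp_lines):
    """Pair every MAC line with the ARP lines that share its address."""
    arp_by_mac = {}
    for arp in arp_lines:
        key = find_mac(arp)
        if key:
            arp_by_mac.setdefault(key, []).append(arp.strip())
    joined = []
    for mac in mac_lines:
        key = find_mac(mac)
        for arp in arp_by_mac.get(key, []):
            joined.append(mac.strip() + ' ' + arp)
    return joined

# Hypothetical sample data; with real files, pass the open file objects instead
mac_lines = ['13,64,00:0c:29:36:9f:02,giga-swx 0/213\n',
             '14,64,00:0C:29:AA:BB:CC,giga-swx 0/214\n']
arp_lines = ['172.20.13.71, 00:0c:29:aa:bb:cc, vlan 64\n',
             '172.20.13.70, 00:0c:29:36:9f:02, vlan 64\n']
for row in join_by_mac(mac_lines, arp_lines):
    print(row)
```

Because both inputs are only iterated once, this also works when they are plain file objects that cannot be looped over a second time.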

Related

IndexError: list index out of range in Python Script

I'm new to Python, so I apologize if this question has already been answered. I've used this script before and it worked, so I'm not at all sure what is wrong.
I'm trying to transform a MALLET output document into a long list of file, topic, weight rows rather than a wide list of topics, documents and weights.
Here's what the original CSV I'm trying to convert looks like, except that there are 30 topics in it (it's a text file called mb_composition.txt):
0 file:/Users/mandyregan/Dropbox/CPH-DH/MiningtheSurge/txt/Abizaid.txt 6.509147794508226E-6 1.8463345214533957E-5 3.301298069640119E-6 0.003825178550032757 0.15240841618294929 0.03903974304065183 0.10454783676528623 0.1316719812119471 1.8018057013225344E-5 4.869261713020613E-6 0.0956868156114931 1.3521101623203115E-5 9.514591058923748E-6 1.822741355900598E-5 4.932324961835634E-4 2.756817586271138E-4 4.039186874601744E-5 1.0503346606335033E-5 1.1466132458804392E-5 0.007003443189848799 6.7094360963952E-6 0.2651753488982284 0.011727025879070194 0.11306132549594633 4.463460490946615E-6 0.0032751230536005056 1.1887304822238514E-5 7.382714572306351E-6 3.538808652077042E-5 0.07158823129977483
1 file:/Users/mandyregan/Dropbox/CPH-DH/MiningtheSurge/txt/Jeffrey,%20Jim%20-%20Chk5-%20ASC%20-%20FINAL%20-%20Sept%202017.docx.txt 4.296636200313062E-6 1.218750594272488E-5 1.5556725986514498E-4 0.043172816021532695 0.04645757277949794 0.01963429696910822 0.1328206370818606 0.116826297071711 1.1893574776047563E-5 3.2141605637859693E-6 0.10242945223692496 0.010439315937573735 0.2478814493196687 1.2031769351093548E-5 0.010142417179693447 2.858721603853616E-5 2.6662348272204834E-5 6.9331747684835E-6 7.745091995495631E-4 0.04235638910274044 4.428844900369446E-6 0.0175105406405736 0.05314379308820005 0.11788631730736487 2.9462944350793084E-6 4.746133386282654E-4 7.846714475661223E-6 4.873270616886766E-6 0.008919869163605806 0.02884824479155971
And here's the python script I'm trying to use to convert it:
infile = open('mallet_output_files/mb_composition.txt', 'r')
outfile = open('mallet_output_files/weights.csv', 'w+')
outfile.write('file,topicnum,weight\n')
for line in infile:
    tokens = line.split('\t')
    fn = tokens[1]
    topics = tokens[2:]
    #outfile.write(fn[46:] + ",")
    for i in range(0,59):
        outfile.write(fn[46:] + ",")
        outfile.write(topics[i*2]+','+topics[i*2+1]+'\n')
I'm running this in the terminal with python reshape.py and I get this error:
Traceback (most recent call last):
  File "reshape.py", line 12, in <module>
    outfile.write(topics[i*2]+','+topics[i*2+1]+'\n')
IndexError: list index out of range
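Any idea what I'm doing wrong here? I can't seem to figure it out and am frustrated because I know I've used this script many times before with success! If it helps, I'm on Mac OS X with Python version 2.7.10.
The problem is that the loop expects far more values than each line of your file contains: range(0,59) runs 59 times and reads two entries per pass, so it needs up to 118 values after the file name, while your lines have only 30.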
If you just want to print out the topics in the list up to the nth topic per line, you should probably define your range by the actual number of topics per line:
for i in range(len(topics) // 2):
    outfile.write(fn[46:] + ",")
    outfile.write(topics[i*2]+','+topics[i*2+1]+'\n')
Stated more pythonically, it would look something like this:
# Group the topics into tuple-pairs for easier management
paired_topics = [tuple(topics[i:i+2]) for i in range(0, len(topics), 2)]
# Iterate the paired topics and print them each on a line of output
for topic in paired_topics:
    outfile.write(fn[46:] + ',' + ','.join(topic) + '\n')
You need to debug your code. Try printing out variables.
infile = open('mallet_output_files/mb_composition.txt', 'r')
outfile = open('mallet_output_files/weights.csv', 'w+')
outfile.write('file,topicnum,weight\n')
for line in infile:
    tokens = line.split('\t')
    fn = tokens[1]
    topics = tokens[2:]
    # outfile.write(fn[46:] + ",")
    for i in range(0,59):
        # Add a print statement like this (str.format also works on Python 2.7)
        print('Topics {}: {} and {}'.format(i, i*2, i*2+1))
        outfile.write(fn[46:] + ",")
        outfile.write(topics[i*2]+','+topics[i*2+1]+'\n')
Does your 'topics' list only have 30 elements? It looks like you're trying to access items far outside the available range, i.e., topics[x] where x >= 30.
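To tie the answers together: the sample rows in the question hold a document index, a file path, and then a run of numeric columns. The original loop reads those columns as topic/weight pairs, while the sample appears to hold one weight per topic. A minimal sketch, assuming each column is simply the weight of topic 0, 1, 2, ... in order (an assumption about this particular file), numbers the columns with enumerate instead of hard-coding a count. The function name reshape_line and the shortened sample row are made up for illustration, and os.path.basename stands in for the hard-coded fn[46:] slice.

```python
import os

def reshape_line(line):
    """Turn one wide row into a list of (file, topicnum, weight) tuples."""
    tokens = line.rstrip('\n').split('\t')
    name = os.path.basename(tokens[1])  # file name without the directory part
    weights = tokens[2:]
    # enumerate() numbers however many columns are present, so no IndexError
    return [(name, topicnum, weight) for topicnum, weight in enumerate(weights)]

# Shortened, made-up row in the same tab-separated layout as the question
sample = '0\tfile:/Users/me/txt/Abizaid.txt\t6.5E-6\t1.8E-5\t0.0038\n'
print('file,topicnum,weight')
for name, topicnum, weight in reshape_line(sample):
    print('{},{},{}'.format(name, topicnum, weight))
```

If the file really does contain explicit topic/weight pairs, the pairing approach from the first answer is the one to use instead.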

Python: Having trouble replacing lines from file

I'm trying to build a subtitle translator using DeepL, but it isn't running perfectly. I managed to translate the subtitles, but I'm having problems replacing the lines. I can see that the lines are translated because it prints them, but it doesn't replace them. Whenever I run the program, the output is the same as the original file.
This is the code responsible:
def translate(input, output, languagef, languaget):
    file = open(input, 'r').read()
    fileresp = open(output,'r+')
    subs = list(srt.parse(file))
    for sub in subs:
        try:
            linefromsub = sub.content
            translationSentence = pydeepl.translate(linefromsub, languaget.upper(), languagef.upper())
            print(str(sub.index) + ' ' + translationSentence)
            for line in fileresp.readlines():
                newline = fileresp.write(line.replace(linefromsub,translationSentence))
        except IndexError:
            print("Error parsing data from deepl")
This is how the file looks:
1
00:00:02,470 --> 00:00:04,570
- Yes, I do.
- (laughs)
2
00:00:04,605 --> 00:00:07,906
My mom doesn't want
to babysit everyday
3
00:00:07,942 --> 00:00:09,274
or any day.
4
00:00:09,310 --> 00:00:11,977
But I need
my mom's help sometimes.
5
00:00:12,013 --> 00:00:14,046
She's just gonna
have to be grandma today.
Help will be appreciated :)
Thanks.
You are opening fileresp with r+ mode. When you call readlines(), the file's position will be set to the end of the file. Subsequent calls to write() will then append to the file. If you want to overwrite the original contents as opposed to append, you should try this instead:
allLines = fileresp.readlines()
fileresp.seek(0) # Set position to the beginning
fileresp.truncate() # Delete the contents
for line in allLines:
    fileresp.write(...)
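The read, seek, truncate, write sequence can be tried out on a throwaway file. This is only a sketch of the technique, not the translator itself: the helper name replace_in_file and the file contents are made up, and no DeepL or srt calls are involved.

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Rewrite the file at path in place, replacing old with new on every line."""
    with open(path, 'r+') as handle:
        all_lines = handle.readlines()   # position is now at the end of the file
        handle.seek(0)                   # go back to the beginning
        handle.truncate()                # delete the old contents
        for line in all_lines:
            handle.write(line.replace(old, new))

# Throwaway file with made-up content, just to show the effect
fd, path = tempfile.mkstemp(suffix='.srt')
with os.fdopen(fd, 'w') as handle:
    handle.write('1\n- Yes, I do.\n')
replace_in_file(path, 'Yes, I do.', 'Oui.')
with open(path) as handle:
    result = handle.read()
os.remove(path)
print(result)
```

Note that a subtitle's content can span two lines, as in subtitle 2 of the sample file, so a line-by-line replace would not find those entries.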
Update
It's difficult to see what you're trying to accomplish with r+ mode here, but it seems you have two separate input and output files. If that's the case, consider:
def translate(input, output, languagef, languaget):
    file = open(input, 'r').read()
    fileresp = open(output, 'w') # Use w mode instead
    subs = list(srt.parse(file))
    for sub in subs:
        try:
            linefromsub = sub.content
            translationSentence = pydeepl.translate(linefromsub, languaget.upper(), languagef.upper())
            print(str(sub.index) + ' ' + translationSentence)
            fileresp.write(translationSentence) # Write the translated sentence
        except IndexError:
            print("Error parsing data from deepl")

Output Print is Slow

I am writing a script, and this part of the code makes its output print slowly. I think it's the nested loop that is causing the issue (I used a dictionary there). Is there an alternative way to make my script print the results without the wait?
Log = open("file.txt")
for LogLine in Log:
    flag = True
    for key, ConfLine in Conf.items():
        for patterns in ConfLine:
            patterns = DateString + patterns
            if re.match(patterns, LogLine):
                flag = False
                break
        if(flag == False):
            break
    if(flag):
        print LogLine.strip()
C Panda's answer is good but it's not obvious that a regex full of | is the fastest way to try all regexes. Test the performance of this alternative:
pats = [re.compile(date_string+pat) for conf in Conf.values() for pat in conf]
with open('file.txt') as log:
    for line in log:
        # Print only the lines that match none of the patterns, like the original loop
        if not any(pat.match(line) for pat in pats):
            print(line.strip())
On a side note, here's how your current code could be written with a clean break and no need for flag:
for ConfLine, patterns in ((c, p) for c in Conf.values() for p in c):
    patterns = DateString + patterns
    if re.match(patterns, LogLine):
        break
else:
    print LogLine.strip()
Try the following. It should give you a substantial speed-up. Make the appropriate changes for Python 2.x.
pats = (date_string+pat for conf in Conf.values() for pat in conf)
master_pat = re.compile('|'.join(pats))
with open('file.txt') as log:
    for line in log:
        # Print only the lines that match none of the patterns, like the original loop
        if not master_pat.match(line):
            print(line.strip())
If I misread the logic and it is not working, please comment.
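Both suggestions can be compared on a small, self-contained example. This is a sketch under stated assumptions: date_string, conf and log_lines are made-up stand-ins for the question's DateString, Conf and log file, and each alternative in the combined pattern is wrapped in a non-capturing group so the alternation cannot bleed into neighbouring patterns. It keeps the lines that match none of the patterns, as the loop in the question does.

```python
import re

# Hypothetical stand-ins for the question's DateString, Conf and log file
date_string = r'\d{4}-\d{2}-\d{2} '
conf = {'noise': ['DEBUG', 'heartbeat'], 'audit': ['login ok']}
log_lines = ['2020-01-01 DEBUG cache warm\n',
             '2020-01-01 ERROR disk full\n',
             '2020-01-02 login ok for bob\n',
             '2020-01-02 WARN retrying\n']

# Compile once, outside the loop over the log lines
pats = [re.compile(date_string + pat) for group in conf.values() for pat in group]
master_pat = re.compile('|'.join('(?:' + date_string + pat + ')'
                                 for group in conf.values() for pat in group))

# Keep the lines that match none of the patterns
kept_any = [line.strip() for line in log_lines
            if not any(pat.match(line) for pat in pats)]
kept_master = [line.strip() for line in log_lines
               if not master_pat.match(line)]
print(kept_any)
print(kept_any == kept_master)
```

Which variant is faster depends on the patterns and the data, so it is worth timing both on a real log file (for example with the timeit module).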

python reading lines in file and joining them

I have this code
with open ('ip.txt') as ip :
    ips = ip.readlines()
with open ('user.txt') as user :
    usrs = user.readlines()
with open ('pass.txt') as passwd :
    passwds = passwd.readlines()
with open ('prefix.txt') as pfx :
    pfxes = pfx.readlines()
with open ('time.txt') as timer :
    timeout = timer.readline()
with open ('phone.txt') as num :
    number = num.readline()
which opens all those files and joins them in this form:
result = ('Server:{0} # U:{1} # P:{2} # Pre:{3} # Tel:{4}\n{5}\n'.format(b,c,d,a,number,ctime))
print (result)
cmd = ("{0}{1}#{2}".format(a,number,b))
print (cmd)
I expected it to print like this:
Server:x.x.x.x # U:882 # P:882 # Pre:900 # Tel:456123456789
900456123456789#x.x.x.x
but the output was like this
Server:x.x.x.x
# U:882 # P:882 # Pre:900
# Tel:456123456789
900
456123456789#187.191.45.228
New output:
Server:x.x.x.x # U:882 # P:882 # Pre:900 # Tel:['456123456789']
900['456123456789']#x.x.x.x
How can I solve this?
Maybe you should remove the newline using strip().
Example:
with open ('ip.txt') as ip :
    ips = ip.readline().strip()
readline() reads one line at a time, whereas readlines() reads the entire file as a list of lines.
I am guessing from your limited example that b has a newline embedded. That's because of readlines(), which keeps the trailing newline on every line. The Python idiom to use here is ip.read().splitlines(), where ip is one of your file handles.
See the Python docs for more splitlines options.
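The difference between the two calls is easy to see on an in-memory file. In this small sketch, io.StringIO stands in for an open text file and the contents are made up.

```python
import io

# io.StringIO stands in for an open text file; the contents are made up
ip_file = io.StringIO('x.x.x.x\ny.y.y.y\n')
with_newlines = ip_file.readlines()

ip_file.seek(0)  # rewind so the same data can be read a second time
without_newlines = ip_file.read().splitlines()

print(with_newlines)
print(without_newlines)
```

The first list keeps the trailing newline on every entry, which is what breaks the formatted output in the question; the second does not.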
Apart from the other great answers, for completeness' sake I am going to post an alternative using string.translate. It also covers the case where a \n has accidentally been inserted into the middle of your string, as in '123\n456\n78', which rstrip or strip would not handle.
Server:x.x.x.x # U:882 # P:882 # Pre:900 # Tel:['456123456789']
900['456123456789']#x.x.x.x
You get this because you're printing a list. To resolve it, you need to join the strings in your list number.
Altogether, the solution will be something like this:
import string
# prepare for string translation to get rid of new lines
tbl = string.maketrans("","")
result = ('Server:{0} # U:{1} # P:{2} # Pre:{3} # Tel:{4}\n{5}\n'.format(b,c,d,a,''.join(number),ctime))
# this will translate all new lines to ""
print (result.translate(tbl, "\n"))
cmd = ("{0}{1}#{2}".format(a,''.join(number),b))
print (cmd.translate(tbl, "\n"))
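string.maketrans and the two-argument form of translate used above exist only in Python 2. A rough Python 3 equivalent, assuming the same goal of deleting every newline from the text, uses a translation table that maps the newline character to None. The variable names and sample values below are made up.

```python
# Map the newline character to None so that translate() deletes it
drop_newlines = {ord('\n'): None}

# Made-up values: strings with stray newlines and a one-element list like `number`
prefix = '900\n'
number = ['456123456789\n']
server = 'x.x.x.x\n'

cmd = '{0}{1}#{2}'.format(prefix, ''.join(number), server)
clean_cmd = cmd.translate(drop_newlines)
print(clean_cmd)
```

For a single character, cmd.replace('\n', '') does the same job; the table form is mainly useful when several characters have to be removed at once.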

Automator/Applescript rename files if

I have a large list of images that have been misnamed by my artist. I was hoping to avoid giving him more work by using Automator, but I'm new to it. Right now they're named in order what001a and what002a, but they should be what001a and what001b. So basically the odd-numbered images are A and the even-numbered ones are B. I need a script that changes the even-numbered images to B and renumbers them all in the proper sequence. How would I go about writing that script?
A small Ruby script embedded in an AppleScript provides a very comfortable solution, allowing you to select the files to rename right in Finder and displaying an informative success or error message.
The algorithm renames files as follows:
number = first 3 digits in filename # e.g. "006"
letter = the letter following those digits # e.g. "a"
if number is even, change letter to its successor # e.g. "b"
number = (number + 1)/2 # 5 or 6 => 3
replace number and letter in filename
And here it is:
-- ask for files
set filesToRename to choose file with prompt "Select the files to rename" with multiple selections allowed
-- prepare ruby command
set ruby_script to "ruby -e \"s=ARGV[0]; m=s.match(/(\\d{3})(\\w)/); n=m[1].to_i; a=m[2]; a.succ! if n.even?; r=sprintf('%03d',(n+1)/2)+a; puts s.sub(/\\d{3}\\w/,r);\" "
tell application "Finder"
    -- process files, record errors
    set counter to 0
    set errors to {}
    repeat with f in filesToRename
        try
            do shell script ruby_script & (f's name as text)
            set f's name to result
            set counter to counter + 1
        on error
            copy (f's name as text) to the end of errors
        end try
    end repeat
    -- display report
    set msg to (counter as text) & " files renamed successfully!\n"
    if errors is not {} then
        set AppleScript's text item delimiters to "\n"
        set msg to msg & "The following files could NOT be renamed:\n" & (errors as text)
        set AppleScript's text item delimiters to ""
    end if
    display dialog msg
end tell
Note that it will fail when the filename contains spaces.
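For readers who want to check the renaming rule without touching any files, the algorithm listed above can be sketched as a pure Python function. This is an illustration, not part of the AppleScript solution: the function name new_name is made up, and chr(ord(letter) + 1) only approximates Ruby's succ for the simple "a" to "b" case.

```python
import re

def new_name(filename):
    """Apply the renaming rule to one file name; return None if it does not match."""
    match = re.search(r'(\d{3})(\w)', filename)
    if match is None:
        return None
    number = int(match.group(1))          # e.g. 6
    letter = match.group(2)               # e.g. "a"
    if number % 2 == 0:
        letter = chr(ord(letter) + 1)     # even numbers get the next letter: "a" -> "b"
    replacement = '%03d%s' % ((number + 1) // 2, letter)   # 5 or 6 => 3
    return filename[:match.start()] + replacement + filename[match.end():]

for name in ['what001a.png', 'what002a.png', 'what005a.png', 'what006a.png']:
    print(name, '->', new_name(name))
```

Since the function only computes names, it can be run on a list of file names first to review the result before any actual renaming is done.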
A friend of mine wrote a Python script to do what I needed. I figured I'd post it here as an answer for anyone who stumbles upon a similar problem. It is in Python, though, so if anyone wants to convert it to AppleScript for those who may need it, go for it.
import os
import re
import shutil
def toInt(str):
    # Convert the matched digits to an int; fall back to 0 if that fails
    try:
        return int(str)
    except ValueError:
        return 0
filePath = "./"
extension = "png"
dirList = os.listdir(filePath)
regx = re.compile("[0-9]+a")
for filename in dirList:
    ext = filename[-len(extension):]
    if(ext != extension): continue
    rslts = regx.search(filename)
    if(rslts == None): continue
    pieces = regx.split(filename)
    if(len(pieces) < 2): pieces.append("")
    filenumber = toInt(rslts.group(0).rstrip("a"))
    # Integer division, so the result stays an int on Python 3 as well
    newFileNum = (filenumber + 1) // 2
    fileChar = "b"
    if(filenumber % 2): fileChar = "a"
    newFileName = "%s%03d%s%s" % (pieces[0], newFileNum, fileChar, pieces[1])
    shutil.move("%s%s" % (filePath, filename), "%s%s" % (filePath, newFileName))
