Beginning Python programmer here. I am currently stuck writing a small Python script that opens a txt source file, finds a specific number in it with a regular expression (107.5 in this case), and replaces that 107.5 with a new number. The new number comes from a second txt file, which contains 30 numbers. Each time a number has been replaced, the script uses the next number for its replacement. Although the command prompt does seem to print a successful find and replace, an "IndexError: list index out of range" occurs after the 30th loop...
My hunch is that I somehow have to limit my loop with something like "for i in range x". However, I am not sure which list this should be and how I can incorporate that loop limitation in my current code. Any help is much appreciated!
nTemplate = [" "]
output = open(r'C:\Users\Sammy\Downloads\output.txt','rw+')
count = 0
for line in templateImport:
    priceValue = re.compile(r'107.5')
    if priceValue.sub(pllines[count], line) != None:
        priceValue.sub(pllines[count], line)
        nTemplate.append(line)
        count = count + 1
        print('found a match. replaced ' + '107.5 ' + 'with ' + pllines[count] )
        print(nTemplate)
    else:
        nTemplate.append(line)
The IndexError is raised because you are incrementing count in each iteration of the loop, but haven't added an upper limit based on how many values the pllines list actually contains. You should break out of the loop when it reaches len(pllines) in order to avoid the error.
Another issue which you may not have noticed is with your usage of the sub() method. It returns a new string with the appropriate replacements and does not modify the original, so its return value has to be used.
If the pattern doesn't exist in the string, it returns the original string unchanged, never None, so your if condition is always true. Your nTemplate list therefore probably never had any of the replaced strings appended to it, only the original lines. Unless you need to do some other actions if the pattern was found in the line, you can do away with the if condition (as I have in the example below).
Since the priceValue object is the same for all lines, it can be moved outside the loop.
The following code should work:
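import re

# templateImport and pllines come from the earlier part of your script
nTemplate = [" "]
# 'rw+' is rejected by Python 3; 'r+' opens an existing file for reading and writing
output = open(r'C:\Users\Sammy\Downloads\output.txt', 'r+')
count = 0
# Escape the dot so it matches a literal "." rather than any character
priceValue = re.compile(r'107\.5')
for line in templateImport:
    # Stop once every replacement value has been used
    if count == len(pllines):
        break
    nTemplate.append(priceValue.sub(pllines[count], line))
    count = count + 1
print(nTemplate)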
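If the goal is to move to the next number only when a replacement actually happens, and to keep the remaining lines once the numbers run out, one possible variation is to pass a function to sub(). This is only a sketch under assumptions: replace_prices, template and the sample values are made-up names for illustration, not part of your script.

```python
import re


def replace_prices(template_lines, replacements, target="107.5"):
    """Replace each occurrence of target with the next unused replacement."""
    pattern = re.compile(re.escape(target))  # escape so "." is matched literally
    remaining = iter(replacements)

    def next_value(match):
        # Once the replacements run out, keep the matched text unchanged.
        return next(remaining, match.group(0))

    return [pattern.sub(next_value, line) for line in template_lines]


# Hypothetical sample data standing in for the two txt files
template = ["price: 107.5\n", "no price here\n", "a 107.5 and b 107.5\n"]
new_lines = replace_prices(template, ["1.0", "2.0"])
print(new_lines)
# ['price: 1.0\n', 'no price here\n', 'a 2.0 and b 107.5\n']
```

Because the iterator simply runs dry, no index is involved and the IndexError cannot occur.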
Working on a project where I am given raw log data and need to parse it into a readable state. I know enough Python to strip off all the unneeded parts so that I am left with just the raw data, which needs to be split and formatted, but I can't figure out a way to break it apart when multiple records are put on the same line, which does not always happen.
This is the string value I am getting so far.
*190205*12,6000,0000000,12,6000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000*190206*01,2050,0100550,01,4999,0000000,,
I need to break it apart so that each line starts with the number value, but since I can't assume there will only be one or two of them, I can't think of a way to do it, and the number of comma-separated values after it varies, so I can't go by length. This is what I am looking to get from the above example, so that I can use it for further operations.
*190205*12,6000,0000000,12,6000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000
*190206*01,2050,0100550,01,4999,0000000,,
txt = "*190205*12,6000,0000000,12,6000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000*190206*01,2050,0100550,01,4999,0000000,,"
output = list()
i = 0
x = txt.split("*")
while i < len(x):
    if len(x[i]) == 0:
        i += 1
        continue
    print ("*{0}*{1}".format(x[i],x[i+1]))
    output.append("*{0}*{1}".format(x[i],x[i+1]))
    i += 2
Use split to tokenize the text between the * characters.
Then print each pair of consecutive tokens.
You can use regex:
([*][0-9]*[*])
You can catch the header part with this and then split according to it.
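As a rough sketch of that idea, re.findall can grab each header together with everything up to the next asterisk. The pattern below is my own assumption about the format (digits between two asterisks, and no asterisk inside the data), and the sample string is shortened.

```python
import re

# Shortened sample in the same shape as the string from the question
txt = ("*190205*12,6000,0000000,13,2590,0000000"
       "*190206*01,2050,0100550,01,4999,0000000,,")

# A record is a header such as *190205* followed by everything up to the next "*"
records = re.findall(r"\*[0-9]+\*[^*]*", txt)
for record in records:
    print(record)
# *190205*12,6000,0000000,13,2590,0000000
# *190206*01,2050,0100550,01,4999,0000000,,
```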
Same answer as #mujiga, but I thought a dict might be better for further operations
txt = "*190205*12,6000,0000000,12,6000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000*190206*01,2050,0100550,01,4999,0000000,,"
datadict=dict()
i=0
x=txt.split("*")
while i < len(x):
    if len(x[i]) == 0:
        i += 1
        continue
    datadict[x[i]]=x[i+1]
    i += 2
Adding on to #Ali Nuri Seker's suggestion to use regex, here's a simple one without lookarounds (which might actually hurt it in this case)
>>> import re
>>> string = '''*190205*12,6000,0000000,12,6000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000*190206*01,2050,0100550,01,4999,0000000,,'''
>>> print(re.sub(r'([\*][0-9,]+[\*]+[0-9,]+)', r'\n\1', string))
#Output
*190205*12,6000,0000000,12,6000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000,13,2590,0000000,13,7000,0000000,13,7000,0000000
*190206*01,2050,0100550,01,4999,0000000,,
Program Details:
I am writing a program for python that will need to look through a text file for the line:
Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.
Problem:
After the program has found that line, it should store the line in an array and extract the value 19.612545 from f= 19.612545.
Question:
So far I have been able to store the line in an array after finding it. However, I am not sure what to use to search through the stored string and extract the value of f. Does anyone have any suggestions or tips on how to accomplish this?
Depending upon how you want to go at it, CosmicComputer is right to refer you to Regular Expressions. If your syntax is this simple, you could always do something like:
line = 'Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.'
splitByComma=line.split(',')
fValue = splitByComma[1].replace('f= ', '').strip()
print(fValue)
Results in 19.612545 being printed (still a string though).
Split your line by commas, grab the 2nd chunk, and break out the f value. Error checking and conversions left up to you!
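For comparison, the regular-expression route mentioned above could look roughly like this, including the conversion to float. The pattern is my own guess at the number format (optional sign, decimals, optional exponent), so treat it as a sketch rather than the one true expression.

```python
import re

line = 'Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.'

# "f=" at a word boundary, optional spaces, then a number with optional exponent
match = re.search(r'\bf=\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)', line)
if match:
    f_value = float(match.group(1))
    print(f_value)  # 19.612545
else:
    f_value = None  # the line did not contain an f value
```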
Using regular expressions here is madness. Just use string.find as follows (where string is the name of the variable that holds your string):
index = string.find('f=')
index = index + 2  # skip over "f="
rest = string[index:]  # cut off everything before the value
fields = rest.split(',')  # split the remaining text on commas
your_value = fields[0].strip()  # first field, minus the leading space
I know it's ugly, but it's nothing compared with RE.
I'm not going to lie. I'm trying to do an assignment and I'm being beaten by it.
I need to have python prompt the user to enter a room number, then lookup that room number in a supplied .txt file which has csv [comma-separated values], and then show multiple results if there are any.
I was able to get Python to return the first result OK, but then it stops. I got around the csv thing by using a hash command and .split (I would rather read it as a csv, although I couldn't get it to work). I had to edit the external file so that instead of the data being separated by commas it was separated by semicolons, which is not ideal as I am not supposed to be messing with the supplied file.
Anyhow...
My external file looks like this:
roombookings.txt
6-3-07;L1;MSW001;1
6-3-07;L2;MSP201;1
6-3-07;L3;WEB201;1
6-3-07;L4;WEB101;1
6-3-07;L5;WEB101;1
7-3-07;L1;MSW001;2
7-3-07;L2;MSP201;2
7-3-07;L3;WEB201;2
7-3-07;L4;WEB101;2
7-3-07;L5;WEB101;2
8-3-07;L1;WEB101;1
8-3-07;L2;MSP201;3
Here's what my code looks like:
roomNumber = (input("Enter the room number: "))

def find_details(id2find):
    rb_text = open('roombookings.txt', 'r')
    each_line = rb_text.readline()
    while each_line != '':
        s = {}
        (s['Date'], s['Room'], s['Course'], s['Stage']) = each_line.split(";")
        if id2find == (s['Room']):
            rb_text.close()
            return(s)
        each_line = rb_text.readline()
    rb_text.close()

room = find_details(roomNumber)
if room:
    print("Date: " + room['Date'])
    print("Room: " + room['Room'])
    print("Course: " + room['Course'])
    print("Stage: " + room['Stage'])
If I run the program, I get prompted for a room number. If I enter, say, "L1"
I get:
Date: 6-3-07
Room: L1
Course: MSW001
Stage: 1
I should get 3 positive matches. I guess my loop is broken? Please help me save my sanity!
Edit. I've tried the solutions here but keeps either crashing the program (I guess I'm not closing the file properly?) or giving me errors. I've seriously been trying to sort this for 2 days and keep in mind I'm at a VERY basic level. I've read multiple textbooks and done many Google searches but it's all still beyond me, I'm afraid. I appreciate the assistance though.
Your code does "return(s)" the first time the "id2find" argument is exactly equal to the room.
If you want multiple matches, you could create an empty list before entering the loop, append every match to the list WITHOUT returning, return the list, and then use a for-loop to print out each match.
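A minimal sketch of that list-based approach follows. To keep it runnable without the file, this version takes the lines as a second argument (my own change, not in your code); you could pass the open file object there instead of the sample list.

```python
def find_details(id2find, lines):
    """Return a list with one dict per booking whose room equals id2find."""
    matches = []
    for line in lines:
        line = line.strip()          # drop the trailing newline
        if not line:
            continue                 # skip blank lines
        date, room, course, stage = line.split(";")
        if room == id2find:
            matches.append({"Date": date, "Room": room,
                            "Course": course, "Stage": stage})
    return matches                   # returned only after every line was checked


# Hypothetical sample standing in for the lines of roombookings.txt
sample = ["6-3-07;L1;MSW001;1\n", "6-3-07;L2;MSP201;1\n",
          "7-3-07;L1;MSW001;2\n", "8-3-07;L1;WEB101;1\n"]
for booking in find_details("L1", sample):
    print(booking["Date"], booking["Course"], booking["Stage"])
```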
First, to iterate over the lines of a file, loop over the file object directly:
for line in rb_text:
    # do something
Second, your function returns after the first match, so how can it match more than one record? Maybe you need something like:
def find_details(id2find):
    rb_text = open('roombookings.txt', 'r')
    for line in rb_text:
        s = {}
        # strip() drops the trailing newline so it doesn't end up in s['Stage']
        (s['Date'], s['Room'], s['Course'], s['Stage']) = line.strip().split(";")
        if id2find == (s['Room']):
            yield s
    rb_text.close()
And then:
for room in find_details(roomNumber):
    print("Date: " + room['Date'])
    print("Room: " + room['Room'])
    print("Course: " + room['Course'])
    print("Stage: " + room['Stage'])
And yes, you better use some CSV parser.
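With the standard csv module you could leave the supplied file untouched. The sketch below assumes the original file is comma-separated with the four columns in the order shown in the question; find_bookings and FIELDS are names I made up, and an in-memory string stands in for the real file.

```python
import csv
import io

# Column names are an assumption based on the question's dict keys
FIELDS = ["Date", "Room", "Course", "Stage"]


def find_bookings(id2find, csv_file):
    """Yield one dict per row whose Room column equals id2find."""
    reader = csv.DictReader(csv_file, fieldnames=FIELDS)
    for row in reader:
        if row["Room"] == id2find:
            yield row


# Stand-in for open('roombookings.txt', newline='') with comma-separated data
sample = io.StringIO("6-3-07,L1,MSW001,1\n"
                     "6-3-07,L2,MSP201,1\n"
                     "7-3-07,L1,MSW001,2\n")
rooms = list(find_bookings("L1", sample))
for room in rooms:
    print(room["Date"], room["Course"], room["Stage"])
```

The reader takes care of the line endings, so no manual split or strip is needed.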
Your problem is the return(s) in find_details(). As soon as you have found an entry, you leave the function, so the remaining lines are never checked. One solution is to create an empty list at the beginning, e.g. results = [], and then append all entries which match your requirements (results.append(s)), returning the list after the loop.
I'm working through some python problems on pythonchallenge.com to teach myself python and I've hit a roadblock, since the string I am to be using is too large for python to handle. I receive this error:
my-macbook:python owner1$ python singleoccurrence.py
Traceback (most recent call last):
File "singleoccurrence.py", line 32, in <module>
myString = myString.join(line)
OverflowError: join() result is too long for a Python string
What alternatives do I have for this issue? My code looks like such...
#open file testdata.txt
#for each character, check if already exists in array of checked characters
#if so, skip.
#if not, character.count
#if count > 1, repeat recursively with first character stripped off of page.
# if count = 1, add to valid character array.
#when string = 0, print valid character array.
valid = []
checked = []
myString = ""
def recursiveCount(bigString):
    if len(bigString) == 0:
        print "YAY!"
        return valid
    myChar = bigString[0]
    if myChar in checked:
        return recursiveCount(bigString[1:])
    if bigString.count(myChar) > 1:
        checked.append(myChar)
        return recursiveCount(bigString[1:])
    checked.append(myChar)
    valid.append(myChar)
    return recursiveCount(bigString[1:])
fileIN = open("testdata.txt", "r")
line = fileIN.readline()
while line:
    line = line.strip()
    myString = myString.join(line)
    line = fileIN.readline()
myString = recursiveCount(myString)
print "\n"
print myString
string.join doesn't do what you think. join is used to combine a list of words into a single string with the given separator, i.e.:
>>> ",".join(('foo', 'bar', 'baz'))
'foo,bar,baz'
The code snippet you posted will attempt to insert myString between every character in the variable line. You can see how that will get big quickly :-). Are you trying to read the entire file into a single string, myString? If so, the way you want to concatenate the strings is like this:
myString = myString + line
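To see the difference on a small scale, here is a short illustration with made-up values (written for Python 3, so print is a function here). The last part shows another common way to build one string from many lines: collect the pieces and join them once with an empty separator.

```python
my_string = "ab"
line = "xyz"

# str.join puts the separator (my_string) between the items of the iterable;
# iterating over a string yields its characters one by one.
joined = my_string.join(line)
print(joined)  # xabyabz

# Joining a sequence of pieces once, with "" as the separator
lines = ["first\n", "second\n"]
whole = "".join(part.strip() for part in lines)
print(whole)   # firstsecond
```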
While I'm here... since you're learning Python here are some other suggestions.
There are easier ways to read an entire file into a variable. For instance:
fileIN = open("testdata.txt", "r")
myString = fileIN.read()
(This won't have the exact behaviour of your existing strip() code, but may in fact do what you want.)
Also, I would never recommend practical Python code use recursion to iterate over a string. Your code will make a function call (and a stack entry) for every character in the string. Also I'm not sure Python will be very smart about all the uses of bigString[1:]: it may well create a second string in memory that's a copy of the original without the first character. The simplest way to process every character in a string is:
for mychar in bigString:
    ... do your stuff ...
Finally, you are using the list named "checked" to see if you've ever seen a particular character before. But the membership test on lists ("if myChar in checked") is slow. In Python you're better off using a dictionary:
checked = {}
...
if myChar not in checked:  # has_key() was removed in Python 3; "in" works in both
    checked[myChar] = True
...
This exercise you're doing is a great way to learn several Python idioms.
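Putting those suggestions together, one possible non-recursive version uses collections.Counter from the standard library to count every character in a single pass. This is a sketch in Python 3, and single_occurrence_chars is a name I picked, not something from your code.

```python
from collections import Counter


def single_occurrence_chars(text):
    """Return the characters that occur exactly once, in order of appearance."""
    counts = Counter(text)  # one pass over the text, dictionary-style lookups
    return [ch for ch in text if counts[ch] == 1]


print(single_occurrence_chars("aabcdde"))  # ['b', 'c', 'e']
```

There is no recursion and no slicing, so the amount of work grows linearly with the length of the string.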