How to fix IndexError: String index out of range - Python

I am trying to make a parser for my text adventure. I used a text file called test.txt.
I keep getting IndexError: string index out of range. How can I fix this?
parser.py
def parse(file):
    data = {}
    with open(file, "r") as f:
        lines = f.readlines()
        f.close()
    for line in lines:
        line = line.strip()
        if line[0] == "#":
            name = line[1:]
            name = name.replace("\n", "")
            data[name] = {}
        if line[0] == "-":
            prop = line.split(":")
            prop_name = prop[0].replace("-", "")
            prop_name = prop_name.replace("\n", "")
            prop_desc = prop[1][1:]
            prop_desc = prop_desc.replace("\n", "")
            data[name][prop_name] = prop_desc
    return data
print(parse("test.txt"))
test.txt
#hello
-desc: Hello World! Lorem ipsum
-north: world
#world
-desc: World Hello! blah
-south: hello

You're stripping the newlines (line = line.strip()), so a blank line becomes an empty string and line[0] is out of range.
Test that the line is truthy before indexing it:
if line and line[0] == "-":
Or, better, skip blank lines at the beginning of the loop:
for line in lines:
    if not line.strip():
        continue
    # rest of code

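Since there are many "\n" characters in your text, you can drop them while reading the file.
Try this:
with open(file, "r") as f:
    # read() returns the whole file; splitlines() removes the trailing "\n" from each line
    lines = f.read().splitlines()
Blank lines still come through as empty strings, so line[0] still needs a guard. The with block closes the file for you, so no f.close() is needed.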
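Putting both suggestions together, a minimal sketch of the parser with a blank-line guard could look like the following. It takes the text directly instead of a filename so it runs without test.txt; the name parse_text and the use of str.partition are my own choices for illustration, not part of the original code.

```python
def parse_text(text):
    """Parse '#room' headers and '-prop: value' lines into a nested dict."""
    data = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Blank line: there is no line[0] to look at, so skip it.
            continue
        if line[0] == "#":
            name = line[1:]
            data[name] = {}
        elif line[0] == "-" and name is not None:
            # partition splits on the first ":" only, so values may contain colons.
            prop_name, _, prop_desc = line[1:].partition(":")
            data[name][prop_name.strip()] = prop_desc.strip()
    return data


sample = (
    "#hello\n"
    "-desc: Hello World! Lorem ipsum\n"
    "-north: world\n"
    "\n"
    "#world\n"
    "-desc: World Hello! blah\n"
    "-south: hello\n"
)
print(parse_text(sample))
```

To use it on a file, read the file first: parse_text(open("test.txt").read()).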

Related

Deleting a specific word from a text file in python

I used this code to delete a word from a text file.
f = open('./test.txt','r')
a = ['word1','word2','word3']
lst = []
for line in f:
    for word in a:
        if word in line:
            line = line.replace(word,'')
    lst.append(line)
f.close()
f = open('./test.txt','w')
for line in lst:
    f.write(line)
f.close()
But for some reason, if other words share characters with the word I want to delete, those characters get deleted too. For example, in my code:
def cancel():
    global refID
    f1=open("refID.txt","r")
    line=f1.readline()
    flag = 0
    while flag==0:
        refID=input("Enter the reference ID or type 'q' to quit: ")
        for i in line.split(','):
            if refID == i:
                flag=1
        if flag ==1:
            print("reference ID found")
            cancelsub()
        elif (len(refID))<1:
            print("Reference ID not found, please re-enter your reference ID\n")
            cancel()
        elif refID=="q":
            flag=1
        else:
            print("reference ID not found\n")
    menu()
def cancelsub():
    global refIDarr, index
    refIDarr=[]
    index=0
    f = open('flightbooking.csv')
    csv_f = csv.reader(f)
    for row in csv_f:
        refIDarr.append(row[1])
    for i in range (len(refIDarr)):
        if refID==refIDarr[i]:
            index=i
    print(index)
    while True:
        proceed=input("You are about to cancel your flight booking, are you sure you would like to proceed? y/n?: ")
        while proceed>"y" or proceed<"n" or (proceed>"n" and proceed<"y") :
            proceed=input("Invalid entry. \nPlease enter y or n: ")
        if proceed=="y":
            Continue()
            break
        elif proceed=="n":
            main_menu
            break
            exit
        break
def Continue():
    lines = list()
    with open('flightbooking.csv', 'r') as readFile:
        reader = csv.reader(readFile)
        for row in reader:
            lines.append(row)
            for field in row:
                if field ==refID:
                    lines.remove(row)
                    break
    with open('flightbooking.csv', 'w') as writeFile:
        writer = csv.writer(writeFile)
        writer.writerows(lines)
    f = open('refID.txt','r')
    a=refIDarr[index]
    print(a)
    lst = []
    for line in f:
        for word in a:
            if word in line:
                line = line.replace(word,'')
        lst.append(line)
    print(lst)
    f.close()
    f = open('refID.txt','w')
    for line in lst:
        f.write(line)
    f.close()
    print("Booking successfully cancelled")
    menu()
When the code runs, the refID variable holds a single word, and the code should replace just that word with an empty string. Instead it takes that word, e.g. 'AB123', finds every other word that contains an 'A', a 'B', or any of the digits, and removes those characters as well. How do I make it delete only the whole word?
Text file before running code:
AD123,AB123
Expected Output in the text file:
AD123,
Output in text file:
D,
Edit: I have added the entire code, and maybe you can help now after seeing that the array is being appended to and then being used to delete from a text file.
Here's my take.
refIDarr = ["AB123"]
a = refIDarr[0]  # a == "AB123"
Strings in Python are iterable, so for word in a runs five iterations, and each word is actually a single character.
Something like the following is being executed.
if "A" in line:
    line = line.replace("A","")
if "B" in line:
    line = line.replace("B","")
if "1" in line:
    line = line.replace("1","")
if "2" in line:
    line = line.replace("2","")
if "3" in line:
    line = line.replace("3","")
The correct way to do this is to loop over refIDarr:
for word in refIDarr:
    line = line.replace(word,'')
NOTE: You don't need the if statement: if the word is not in the line, replace returns the line unchanged.
"abc".replace("bananan", "") => "abc"
Here's a working example:
refIDarr = ["hello", "world", "lol"]
with open('mytext.txt', "r") as f:
    data = f.readlines()
for word in refIDarr:
    data = [line.replace(word, "") for line in data]
with open("mytext.txt", "w") as newf:
    newf.writelines(data)
The problem is here:
a=refIDarr[index]
If refIDarr is a list of words, indexing it makes a a single word. Later, when you iterate over a (for word in a:), word is a single character, not a word as you expect, so the code ends up removing the word's characters from the file instead of the word itself.
To avoid that, remove a=refIDarr[index] and change your loop to:
for line in f:
    for word in refIDarr:
        if word in line:
            line = line.replace(word,'')
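One caveat: looping over all of refIDarr removes every ID in that list from the file, and str.replace also hits longer IDs that merely contain the target (removing "AB123" from "AB1234" leaves "4"). If the goal is to delete only the selected ID, a sketch along these lines may be closer to what the question asks for. It assumes the file holds comma-separated IDs as in the example, and remove_id is a hypothetical helper name.

```python
def remove_id(line, ref_id):
    """Return the comma-separated line without any entry equal to ref_id."""
    # Remember whether the line ended with a newline so it can be restored.
    newline = "\n" if line.endswith("\n") else ""
    ids = line.rstrip("\n").split(",")
    # Compare whole entries, so "AB123" never matches part of "AB1234".
    kept = [item for item in ids if item != ref_id]
    return ",".join(kept) + newline


print(remove_id("AD123,AB123\n", "AB123"))
```

Unlike the expected output in the question, this also drops the now-dangling comma; keep it by joining differently if the file format needs it.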

Python to read txt files and delete lines that contains same part

I have tons (1000+) of txt files that look like this:
TextTextText('aaa/bbb`ccc' , "ddd.eee");
TextTextText('yyy/iii`ooo' , "rrr.ttt");
TextTextText('aaa/fff`ggg' , "hhh.jjj");
What I want to achieve is to delete all lines that contain the same "aaa" part and keep only one line with it (remove all duplicates).
My code so far:
import os
from collections import Counter
sourcepath = os.listdir('Process_Directory3/')
for file in sourcepath:
    inputfile = 'Process_Directory3/' + file
    outputfile = "Output_Directory/" + file
    lines_seen = set()
    outfile = open(outputfile, "w")
    for line in open(inputfile, "r"):
        print(line)
        cut_line = line.split("'")
        new_line = cut_line[1]
        cut_line1 = new_line.split("/")
        new_line1 = cut_line1[0]
        if new_line1 not in lines_seen:
            outfile.write(new_line1)
            lines_seen.add(new_line1)
outfile.close()
My code is not working at all; I don't get any results.
Console Report:
Line13 in <module>
new_line = cut_line[1]
IndexError: list index out of range
Sorry for my bad writing, it's my first post so far :D
Best Regards
Update:
I added
startPattern = "TextTextText"
if(startPattern in line):
to make sure I target only lines that begin with "TextTextText", but for some reason the .txt file I get in the destination folder contains only one line, "aaa".
In the end, here is the fully working code:
import os
sourcepath = os.listdir('Process_Directory3/')
for file in sourcepath:
    inputfile = 'Process_Directory3/' + file
    outputfile = "Output_Directory/" + file
    lines_seen = set()
    outfile = open(outputfile, "w")
    for line in open(inputfile, "r"):
        if line.startswith("TextTextText"):
            try:
                cut_line = line.split("'")
                new_line = cut_line[1]
                cut_line1 = new_line.split("/")
                new_line1 = cut_line1[0]
                if new_line1 not in lines_seen:
                    outfile.write(line)
                    lines_seen.add(new_line1)
            except:
                pass
        else:
            outfile.write(line)
    outfile.close()
Thanks for a great help guys!
Use a try-except block in the inner for loop. This keeps your program from being interrupted by an error on any line that doesn't contain ' or /.
Update:
I've tried the code given below and it worked fine for me.
import os
sourcepath = os.listdir('Process_Directory3/')
for file in sourcepath:
    inputfile = 'Process_Directory3/' + file
    outputfile = "Output_Directory/" + file
    lines_seen = set()
    outfile = open(outputfile, "w")
    for line in open(inputfile, "r"):
        try:
            cut_line = line.split("'")
            new_line = cut_line[1]
            cut_line1 = new_line.split("/")
            new_line1 = cut_line1[0]
            if new_line1 not in lines_seen:
                outfile.write(line) # Replaced new_line1 with line
                lines_seen.add(new_line1)
        except IndexError:
            # Line has no ' in it, so cut_line[1] does not exist; skip it
            pass
    outfile.close() # This line was having bad indentation
Input file:
TextTextText('aaa/bbb`ccc' , "ddd.eee");
TextTextText('yyy/iii`ooo' , "rrr.ttt");
TextTextText('aaa/fff`ggg' , "hhh.jjj");
TextTextText('WWW/fff`ggg' , "hhh.jjj");
TextTextText('yyy/iii`ooo' , "rrr.ttt");
Output File:
TextTextText('aaa/bbb`ccc' , "ddd.eee");
TextTextText('yyy/iii`ooo' , "rrr.ttt");
TextTextText('WWW/fff`ggg' , "hhh.jjj");
It looks like your file contains a line that has no ' in it; in that case line.split("'") produces a list with a single element, for example:
line = "blah blah blah"
cut_line = line.split("'")
print(cut_line) # ['blah blah blah']
so trying to get cut_line[1] raises an error, since only cut_line[0] exists. Because this code runs inside a loop, you can avoid the error by skipping to the next iteration with continue whenever cut_line has too few elements. Just replace:
cut_line = line.split("'")
new_line = cut_line[1]
with:
cut_line = line.split("'")
if len(cut_line) < 2:
    continue
new_line = cut_line[1]
This ignores every line without a '.
I think using a regular expression would make it easier. I have made a simplified working code using re.
import re
lines = [
    "",
    "dfdsa sadfsadf sa",
    "TextTextText('aaa/bbb`ccc' ,dsafdsafsA ",
    "TextTextText('yyy/iii`ooo' ,SDFSDFSDFSA ",
    "TextTextText('aaa/fff`ggg' ,SDFSADFSDF ",
]
lines_seen = set()
out_lines = []
for line in lines:
    # SEARCH FOR 'xxx/ TEXT in the line -----------------------------------
    re_result = re.findall(r"'[a-z]+\/", line)
    if re_result:
        print(f're_result {re_result[0]}')
        if re_result[0] not in lines_seen:
            print(f'>>> newly found {re_result[0]}')
            lines_seen.add(re_result[0])
            out_lines.append(line)
print('------------')
for line in out_lines:
    print(line)
Result
re_result 'aaa/
>>> newly found 'aaa/
re_result 'yyy/
>>> newly found 'yyy/
re_result 'aaa/
------------
TextTextText('aaa/bbb`ccc' ,dsafdsafsA
TextTextText('yyy/iii`ooo' ,SDFSDFSDFSA
You can experiment with regular expressions at regex101.com.
Try r"'.+/" to match any characters between ' and /, or r"'[a-zA-Z]+/" to match lower- and uppercase letters between ' and /.
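The answers above share one idea: extract a key from each line and keep only the first line per key. A small sketch of that idea as a reusable function follows. The pattern and the name dedupe_by_key are assumptions for illustration; lines without a key are kept unchanged, which mirrors the else branch in the asker's final code.

```python
import re

# Assumed key pattern: whatever sits between a ' and the next /.
KEY_PATTERN = re.compile(r"'([^'/]+)/")


def dedupe_by_key(lines):
    """Keep the first line for each key; pass through lines that have no key."""
    seen = set()
    kept = []
    for line in lines:
        match = KEY_PATTERN.search(line)
        if match is None:
            # No 'key/ part on this line, so there is nothing to deduplicate.
            kept.append(line)
            continue
        key = match.group(1)
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


sample = [
    "TextTextText('aaa/bbb`ccc' , \"ddd.eee\");",
    "TextTextText('yyy/iii`ooo' , \"rrr.ttt\");",
    "TextTextText('aaa/fff`ggg' , \"hhh.jjj\");",
]
for line in dedupe_by_key(sample):
    print(line)
```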

Python replace value in text file

I'm trying to replace a value in a specific line in a text file.
Each line of my text file contains the search term, its count, and the date and time.
Text file:
MemTotal,5,2016-07-30 12:02:33,781
model name,3,2016-07-30 13:37:59,074
model,3,2016-07-30 15:39:59,075
How can I replace for example the count of the searchterm for line 2 (model name,3,2016-07-30 13:37:59,074)?
This is what I have already:
f = open('file.log','r')
filedata = f.read()
f.close()
newdata = filedata.replace("2", "3")
f = open('file.log', 'w')
f.write(newdata)
f.close()
It replaces every 2 in the file.
You have to change three things in your code to get the job done:
1. Read the file using readlines:
filedata = f.readlines()
2. Modify the line you want to change (keep in mind that Python indices start at 0, and don't forget the newline character \n at the end of the string):
filedata[1] = 'new count,new search term,new date and time\n'
3. Save the file using a for loop:
for line in filedata:
    f.write(line)
Here is the full code (notice I used the with context manager to open/close the file):
with open('file.log', 'r') as f:
    filedata = f.readlines()
filedata[1] = 'new count,new search term,new date and time\n'
with open('file.log', 'w') as f:
    for line in filedata:
        f.write(line)
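If only the count should change and the rest of the line should stay as it is, one option is to split the chosen line into its fields and rebuild it. This is a sketch that assumes the layout shown in the question (term, count, timestamp); set_count is a hypothetical helper name.

```python
def set_count(lines, line_index, new_count):
    """Return a copy of lines with the count field of one line replaced."""
    updated = list(lines)
    # maxsplit=2 yields term, count, and the rest; the timestamp keeps its own comma.
    term, _old_count, rest = updated[line_index].split(",", 2)
    updated[line_index] = f"{term},{new_count},{rest}"
    return updated


log = [
    "MemTotal,5,2016-07-30 12:02:33,781\n",
    "model name,3,2016-07-30 13:37:59,074\n",
    "model,3,2016-07-30 15:39:59,075\n",
]
print(set_count(log, 1, 4)[1], end="")
```

The returned list can be written back with the same with open('file.log', 'w') block shown above.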
My solution:
count = 0
line_number = 0
replace = ""
f = open('examen.log','r')
term = "MemTotal"
for line in f.read().split('\n'):
    if term in line:
        # Replace only the first "5" on the matching line
        replace= line.replace("5", "25", 1)
        line_number = count
    count = count + 1
print(line_number)
f.close()
f = open('examen.log','r')
filedata = f.readlines()
f.close()
filedata[line_number]=replace+'\n'
print(filedata[line_number])
print(filedata)
f = open('examen.log','w')
for line in filedata:
    f.write(line)
f.close()
You only need to define the searchterm & the replace value

Replace string in line without adding new line?

I want to replace a string in the line that contains patternB, something like this:
from:
some lines
line contain patternA
some lines
line contain patternB
more lines
to:
some lines
line contain patternA
some lines
line contain patternB xx oo
more lines
I have code like this:
inputfile = open("d:\myfile.abc", "r")
outputfile = open("d:\myfile_renew.abc", "w")
obj = "yaya"
dummy = ""
item = []
for line in inputfile:
    dummy += line
    if line.find("patternA") != -1:
        for line in inputfile:
            dummy += line
            if line.find("patternB") != -1:
                item = line.split()
                dummy += item[0] + " xx " + item[-1] + "\n"
                break
outputfile.write(dummy)
It does not replace the line containing "patternB" as expected, but adds a new line below it, like this:
some lines
line contain patternA
some lines
line contain patternB
line contain patternB xx oo
more lines
What can I do with my code?
Of course it does: you append line to dummy at the beginning of the for loop and then append the modified version again inside the if statement. Also, why check for patternA if you treat that line like every other line?
inputfile = open("d:\myfile.abc", "r")
outputfile = open("d:\myfile_renew.abc", "w")
obj = "yaya"
dummy = ""
item = []
for line in inputfile:
    if line.find("patternB") != -1:
        item = line.split()
        dummy += item[0] + " xx " + item[-1] + "\n"
    else:
        dummy += line
outputfile.write(dummy)
The simplest approach is:
1. Read the whole file into a string
2. Call str.replace on it
3. Write the string back to the file
If you want to keep iterating line by line (for a big file):
for line in inputfile:
    if line.find("patternB") != -1:
        dummy = line.replace('patternB', 'patternB xx oo')
        outputfile.write(dummy)
    else:
        outputfile.write(line)
This is slower than the other answers, but it can process big files.
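As a runnable illustration of the line-by-line approach, here is a sketch that uses io.StringIO in place of real files so it can be tried without touching the disk. The function name append_after_pattern and its parameters are my own; any open file objects can be passed instead of the StringIO buffers.

```python
import io


def append_after_pattern(infile, outfile, pattern, suffix):
    """Copy infile to outfile, appending suffix to each line that contains pattern."""
    for line in infile:
        if pattern in line:
            # Strip the newline, add the suffix, then put the newline back.
            line = line.rstrip("\n") + suffix + "\n"
        outfile.write(line)


source = io.StringIO(
    "some lines\n"
    "line contain patternA\n"
    "some lines\n"
    "line contain patternB\n"
    "more lines\n"
)
target = io.StringIO()
append_after_pattern(source, target, "patternB", " xx oo")
print(target.getvalue(), end="")
```

Only one line is held in memory at a time, which is what makes this suitable for large files.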
This should work:
import os
def replace():
    # Raw strings keep the backslashes in the Windows paths literal
    f1 = open(r"d:\myfile.abc", "r")
    f2 = open(r"d:\myfile_renew.abc", "w")
    ow = input("Enter word you wish to replace:")
    nw = input("Enter new word:")
    for line in f1:
        templ = line.split()
        # Swap matching words, then rejoin with spaces so words don't run together
        f2.write(" ".join(nw if i == ow else i for i in templ))
        f2.write('\n')
    f1.close()
    f2.close()
    os.remove(r"d:\myfile.abc")
    os.rename(r"d:\myfile_renew.abc", r"d:\myfile.abc")
replace()
You can use str.replace:
s = '''some lines
line contain patternA
some lines
line contain patternB
more lines'''
print(s.replace('patternB', 'patternB xx oo'))

Why is my code printing incorrectly to the text file?

I have this code:
with open("pool2.txt", "r") as f:
    content = f.readlines()
    for line in content:
        line = line.strip().split(' ')
        try:
            line[0] = float(line[0])+24
            line[0] = "%.5f" % line[0]
            line = ' ' + ' '.join(line)
        except:
            pass
with open("pool3.txt", "w") as f:
    f.writelines(content)
It should take lines that look like this:
-0.597976 -6.85293 8.10038
Into a line that has 24 added to the first number. Like so:
23.402024 -6.85293 8.10038
When I use print in the code to print the line, the line is correct, but when it prints to the text file, it prints as the original.
The original text file can be found here.
When you loop through an iterable like:
for line in content:
    line = ...
line is just a name bound to the current element (see note 1). Assigning to it rebinds the name and leaves content untouched.
What can you do? Iterate over the indices, so you assign directly to the current element:
for i in range(len(content)):
    content[i] = ...
1: See @MarkRansom's comment.
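An alternative to indexing is to build a new list and write that out. The sketch below assumes space-separated fields and keeps the "%.5f" format from the question; shift_first_column is a hypothetical name, and lines whose first field is not a number are passed through unchanged.

```python
def shift_first_column(lines, offset=24.0):
    """Return new lines with offset added to the first field where it is numeric."""
    result = []
    for line in lines:
        fields = line.strip().split(" ")
        try:
            fields[0] = "%.5f" % (float(fields[0]) + offset)
        except ValueError:
            # First field is not a number (header or blank line): keep the line as-is.
            result.append(line)
            continue
        result.append(" ".join(fields) + "\n")
    return result


content = ["-0.597976 -6.85293 8.10038\n", "header line\n"]
print(shift_first_column(content))
```

The result can then be passed to f.writelines() in place of content.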
