I'm writing a script to automatically annotate a txt file.
I open the txt file and split it into a list of lines, then iterate over every line. For each line, I want to check whether the previous element in the list (the line before it in the text) is empty (a paragraph break in the text) and, if so, add an annotation.
final_list = []
something = open(x, 'r', encoding='utf8', errors='ignore')
file = something.read()
y = file.split("\n")
for position, i in enumerate(y):
    if position == 0:
        final_list.append(i)
    elif position > 0:
        z = i[position-1]
        if z == '':
            final_list.append("<p>"+i)
return final_list
I expect to get a final list with all the lines of the original text, some of them marked with the <p> element, but when I iterate over the list Python gives me an
IndexError: string index out of range
I cannot understand where the problem is.
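The error comes from i[position-1]: i is the current line (a string), so that expression indexes into the string rather than into the list y. Since you need the list itself, iterate over its length and index into y instead of relying on the value from enumerate.
You can try this:
for position in range(len(y)):
    if position == 0:
        final_list.append(y[position])
    elif position > 0:
        z = y[position-1]
        if z == '':
            final_list.append("<p>" + y[position])
        else:
            final_list.append(y[position])  # keep unmarked lines too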
How about something like this:
last_line = ''
output_lines = []
with open('file.txt', 'r') as f:
    for line in f:
        line = line.strip()
        if last_line == '':  # if last line was empty, start a new paragraph
            output_lines.append('<p>')
            output_lines.append(line)
        elif line == '':  # if current line is empty close the paragraph
            output_lines.append('</p>')
        else:
            output_lines.append(line)
        last_line = line
Related
How can I use a list comprehension for this example?
l_words = list_word(domain)
words = list(l_words)
l_lines = [line for count, line in enumerate(open(file_found,"rb"))]
lines = list(l_lines)
# from here
for bline in lines:
    for word in words:
        try:
            aline = bline.decode()
            line = return_only_ascii(aline)
            if line == "\n":
                break
            if line.find(word) != -1:
                login += 1
        except UnicodeDecodeError:  # assumed handler; the original except clause was not shown
            pass
# to here
I need to look for 83 words in each .txt file, so for every line I have to check whether each of those words occurs in it.
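One way to collapse the loop between the two markers is a single generator expression passed to sum(). This is only a minimal sketch: it assumes the lines have already been decoded to str and that blank lines should simply be skipped. count_word_hits and the sample data are hypothetical names, not part of the original code.

```python
def count_word_hits(lines, words):
    """Count (line, word) pairs where the word occurs in the line."""
    return sum(
        1
        for line in lines
        if line != "\n"      # skip blank lines, as the break in the loop does
        for word in words
        if word in line      # same test as line.find(word) != -1
    )


# Made-up sample data, only for illustration.
sample_lines = ["admin login page\n", "\n", "user login\n"]
sample_words = ["login", "admin", "root"]
login = count_word_hits(sample_lines, sample_words)
print(login)
```

A generator expression is used instead of a list comprehension because only the count is needed, so no intermediate list has to be built.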
word = "some string"
file1 = open("songs.txt", "r")
flag = 0
index = 0
for line in file1:
    index += 1
    if word in line:
        flag = 1
        break
if flag == 0:
    print(word + " not found")
else:
    #I would like to print not only the line that has the string, but also the previous and next lines
    print(?)
    print(line)
    print(?)
file1.close()
Use contents = file1.readlines(), which reads the file into a list of lines.
Then loop through contents and, if the word is found at index i, print contents[i-1], contents[i] and contents[i+1]. Make sure to handle the edges: if the word is on the first line, contents[i-1] silently wraps around to the last line, and if it is on the last line, contents[i+1] raises an IndexError.
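The readlines() approach could look roughly like the sketch below. find_with_context is a hypothetical helper name, and the sample list stands in for the result of file1.readlines(); a missing neighbour is reported as None instead of raising or wrapping around.

```python
def find_with_context(lines, word):
    """Return (previous, current, next) for the first line containing word.

    A missing neighbour (match on the first or last line) is returned as None.
    Returns None if the word does not occur in any line.
    """
    for i, line in enumerate(lines):
        if word in line:
            previous = lines[i - 1] if i > 0 else None
            following = lines[i + 1] if i + 1 < len(lines) else None
            return previous, line, following
    return None


# In the real script this list would come from file1.readlines().
contents = ["first song\n", "some string here\n", "last song\n"]
result = find_with_context(contents, "some string")
if result is None:
    print("some string not found")
else:
    for text in result:
        if text is not None:
            print(text, end="")
```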
word = "some string"
file1 = open("songs.txt", "r")
flag = 0
index = 0
previousline = ''
nextline = ''
for line in file1:
    index += 1
    if flag == 1:
        # the line right after the match: print all three and stop
        nextline = line
        print(previousline + finalline + nextline)
        print(finalindex - 1, finalindex, finalindex + 1)
        break
    elif word in line:
        finalindex = index
        finalline = line
        flag = 1
    else:
        previousline = line
file1.close()
You basically already had the main ingredients:
you have line (the line you are currently evaluating)
you have the index (index)
The remaining task is to store the previous and next lines in variables and then print the results.
I have not tested it, but the code should be something like the above.
It splits into three cases: the word is found on the current line; the word was found and flagged on the previous iteration; or nothing has been flagged yet.
I believe the else-if branch should not fire unless flag == 1.
Aspiring Python newb (2 months) here. I am trying to create a program that inserts information at two specific places in each line of a .txt file, creating a new file in the process.
The information in the source file is something like this:
1,340.959,859.210,0.0010,VV53
18abc,34099.9590,85989.2100,0.0010,VV53
00y46646464,34.10,859487.2970,11.4210,RP27
Output would be:
1,7340.959,65859.210,0.0010,VV53
18abc,734099.9590,6585989.2100,0.0010,VV53
00y46646464,734.10,65859487.2970,11.4210,RP27
Every line is different, and there are hundreds of them. The markers I'm looking for are the first and second occurrences of a comma (,): the new text needs to be added right after each of them. You'll know what I mean when you see the code.
I have gotten as far as this: the program finds the correct places and inserts what I need, but doesn't write more than 1 line to the new file. I tried debugging and seeing what's going on 'under the hood', all seemed good there.
Lots of scrapping code and chin-holding later I'm still stuck where I was a week ago.
tl;dr Code only outputs 1 line to new file, need hundreds.
f = open('test.txt', 'r')
new = open('new.txt', 'w')
first = ['7']
second = ['65']
line = f.readline()
templist = list(line)
counter = 0
while line != '':
    for i, j in enumerate(templist):
        if j == ',':
            place = i + 1
            templist1 = templist[:place]
            templist2 = templist[place:]
            counter += 1
            if counter == 1:
                for i, j in enumerate(templist2):
                    if j == ',':
                        place = i + 1
                        templist3 = templist2[:place]
                        templist4 = templist2[place:]
                        templist5 = templist1 + first + templist3 + second + templist4
                        templist6 = ''.join(templist5)
                        new.write(templist6)
                        counter += 1
                        break
                if counter == 2:
                    break
            break
    line = f.readline()
    templist = list(line)
f.close()
new.close()
If I'm understanding your samples and code correctly, this might be a valid approach:
with open('test.txt', 'r') as infd, open('new.txt', 'w') as outfd:
    for line in infd:
        # strip the trailing newline so it is not written twice
        fields = line.rstrip('\n').split(',')
        fields[1] = '7' + fields[1]
        fields[2] = '65' + fields[2]
        outfd.write('{}\n'.format(','.join(fields)))
I need to convert lines of different lengths to one dictionary. It's for player stats. The text file is formatted like below. I need to return a dictionary with each player's stats.
{Lebron James:(25,7,1),(34,5,6), Stephen Curry: (25,7,1),(34,5,6), Draymond Green: (25,7,1),(34,5,6)}
Data:
Lebron James
25,7,1
34,5,6
Stephen Curry
25,7,1
34,5,6
Draymond Green
25,7,1
34,5,6
I need help starting the code. So far I have code that removes the blank lines and turns each line into a list.
myfile = open("stats.txt","r")
for line in myfile.readlines():
    if line.rstrip():
        line = line.replace(",","")
        line = line.split()
I think this should do what you want:
data = {}
with open("myfile.txt", "r") as f:
    for line in f:
        # Skip empty lines
        line = line.rstrip()
        if len(line) == 0:
            continue
        toks = line.split(",")
        if len(toks) == 1:
            # New player, assumed to have no commas in name
            player = toks[0]
            data[player] = []
        elif len(toks) == 3:
            data[player].append(tuple(int(tok) for tok in toks))
        else:
            raise ValueError("unexpected line: " + line)  # or something
The format is somewhat ambiguous, so we have to make some assumptions about what the names can be. I've assumed that names can't contain commas here, but you could relax that a bit if needed by trying to parse int,int,int, and falling back on treating it as a name if it fails to parse.
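The fallback mentioned above could be sketched as follows. This is an illustration under assumptions, not a tested solution for the real file: parse_stats is a hypothetical name, the sample list stands in for the lines of the stats file, and any non-blank line that does not parse as exactly three integers is treated as a player name.

```python
def parse_stats(lines):
    """Build {player: [(int, int, int), ...]} from an iterable of text lines."""
    data = {}
    player = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        try:
            stats = tuple(int(tok) for tok in line.split(","))
            if len(stats) != 3:
                raise ValueError("expected three numbers")
        except ValueError:
            # Not int,int,int: treat the whole line as a player name.
            player = line
            data[player] = []
        else:
            if player is None:
                raise ValueError("stats line before any player name")
            data[player].append(stats)
    return data


# Made-up sample, including a name that contains a comma.
sample = ["Lebron James\n", "25,7,1\n", "34,5,6\n", "Curry, Stephen\n", "25,7,1\n"]
print(parse_stats(sample))
```

The trade-off is that a malformed stats line such as 25,7 is silently taken as a name, so stricter validation may be preferable if the input is not trusted.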
Here's a simple way to do this:
scores = {}
with open('stats.txt', 'r') as infile:
    i = 0
    for line in infile.readlines():
        if line.rstrip():
            if i%3!=0:
                t = tuple(int(n) for n in line.split(","))
                j = j+1
                if j==1:
                    score1 = t # save for the next step
                if j==2:
                    score = (score1,t) # finalize tuple
                    scores.update({name:score}) # add to dictionary
            else:
                name = line[0:-1] # trim \n and save the key
                j = 0 # start over
            i=i+1 #increase counter
print(scores)
Maybe something like this:
For Python 2.x
myfile = open("stats.txt","r")
lines = filter(None, (line.rstrip() for line in myfile))
dictionary = dict(zip(lines[0::3], zip(lines[1::3], lines[2::3])))
For Python 3.x
myfile = open("stats.txt","r")
lines = list(filter(None, (line.rstrip() for line in myfile)))
dictionary = dict(zip(lines[0::3], zip(lines[1::3], lines[2::3])))
I am new to both programming and Python. I need to read a list from a file, use a while-loop or for-loop to alphabetize that list, and then write the alphabetized list to a second file. The list is not being sorted and nothing is written to the file. Any insight or constructive criticism is welcome.
unsorted_fruits = open("unsorted_list.txt", "r") #open file containing list
sorted_fruits = open("sorted_list.txt", "w") #open file writing to
usfl = [unsorted_fruits.read()] #create variable to work with list
def insertion_sort(list): #this function has sorted other list
    for index in range(1, len(list)):
        value = list[index]
        i = index - 1
        while i >= 0:
            if value < list[i]:
                list[i+1] = list[i]
                list[i] = value
                i = i - 1
            else:
                break
insertion_sort(usfl) #calling the function to sort
print usfl #print list to show its sorted
sfl = usfl
sorted_fruits.write(list(sfl)) #write the sorted list to the file
unsorted_fruits.close()
sorted_fruits.close()
exit()
If insertion_sort worked before, I guess it works now, too. The problem is that usfl contains only one element, the content of the file.
If you have a fruit on each line, you can use this to populate your list:
usfl = [line.rstrip() for line in unsorted_fruits]
or, if it is a comma-separated list, you can use:
usfl = unsorted_fruits.read().split(',')
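To see the difference, here is a small sketch that uses io.StringIO as a stand-in for the opened file. The fruit names are made up for illustration, and the one-fruit-per-line layout is an assumption about the input.

```python
import io

# Stand-in for the contents of the unsorted file (hypothetical data).
text = "pear\napple\nbanana\n"

# Wrapping read() in a list gives a single element: the whole file as one string.
whole = [io.StringIO(text).read()]

# Iterating over the file object gives one element per line.
per_line = [line.rstrip() for line in io.StringIO(text)]

print(len(whole), whole)
print(len(per_line), per_line)
```

With only one element in the list, the loop in insertion_sort has nothing to compare, which would explain why the output looks unsorted.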
Your problem seems to be the way you're handling files.
Try something along the lines of:
input_file = open("unsorted_list.txt", "r")
output_file = open("sorted_list.txt", "w")
# Define your sorting function here (for example insertion_sort from the question).
list_of_lines = list(input_file)  # Transform your file into a list of strings, one per line.
insertion_sort(list_of_lines)
long_string = "".join(list_of_lines)  # Turn your now sorted list (e.g. ["cat\n", "dog\n", "ferret\n"])
                                      # into one long string (e.g. "cat\ndog\nferret\n").
output_file.write(long_string)
input_file.close()
output_file.close()
exit()
Let me begin by saying thank you for all the answers. Using them to guide my searching and tinkering, I have produced code that works as required.
infile = open("unsorted_fruits.txt", "r")
outfile = open("sorted_fruits.txt", "w")
all_lines = infile.readlines()
for line in all_lines:
    print line,
def insertion_sort(list):
    for index in range(1, len(list)):
        value = list[index]
        i = index - 1
        while i >= 0:
            if value < list[i]:
                list[i+1] = list[i]
                list[i] = value
                i = i - 1
            else:
                break
insertion_sort(all_lines)
all_sorted = str(all_lines)
print all_sorted
outfile.write(all_sorted)
print "\n"
infile.close()
outfile.close()
exit()