How to iterate over readlines() in Python

I am trying to read the lines of a txt file into a Python list and iterate over them. The script should print every line, but it returns an error instead. I'm using the readlines() function, but when I call list.remove(lines), I get this error:
  File "quotes.py", line 20, in main
    list.remove(lines)
TypeError: remove() takes exactly one argument (0 given)
def main():
    while True:
        try:
            text_file = open("childrens-catechism.txt", "r")
            lines = text_file.readlines()
            # print lines
            # print len(lines)
            if len(lines) > 0:
                print lines
                list.remove(lines)
                time.sleep(60)
            else:
                print "No more lines!"
                break
            text_file.close()
I can't see what I'm doing wrong. I know it has to do with list.remove(). Thank you in advance.

You can iterate over the file object directly. It yields one line at a time, so you need neither readlines() nor a list, and the whole file never has to sit in memory:
import time

def main():
    with open("childrens-catechism.txt", "r") as file:
        for line in file:
            print line,
            time.sleep(60)

If you do want to keep the list and remove each line after printing it, read the lines once and then consume the list from the front:
import time

def main():
    with open("childrens-catechism.txt", "r") as file:
        lines = file.readlines()
    while len(lines) > 0:
        print lines[0],
        lines.pop(0)  # remove the line that was just printed
        time.sleep(60)
    print "No more lines to remove"

Here lines is a list holding the lines of your txt file. list.remove(lines) is not what you want: list is Python's built-in list type, not your variable, so the call treats lines as the list to modify and receives no element to remove, hence the TypeError. You can delete elements from lines like this:
del lines[0]
del lines[1]
...
or
lines.remove("something")
The logic is that remove() deletes one element from a list: write the list first, then .remove(), and put the element you want to delete inside the parentheses.

When opening a file, we can convert its lines into a list:
lines = list(open("childrens-catechism.txt", "r"))
From this list we can now remove entries with length greater than zero. Iterate over a copy (lines[:]) so that removing from lines does not make the loop skip entries:
for line in lines[:]:
    if len(line) > 0:
        # do sth
        lines.remove(line)
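Removing items from a list while looping over that same list skips elements, because the loop position keeps advancing while the list shrinks. The sketch below (written for Python 3; both function names are made up for illustration) contrasts the two behaviours on a small list of lines:

```python
def remove_while_iterating(items):
    """Buggy pattern: remove from the same list the loop is walking."""
    items = list(items)       # work on a copy of the caller's list
    for item in items:
        items.remove(item)    # shrinks the list under the loop
    return items              # every second item is left behind


def remove_over_copy(items):
    """Safe pattern: loop over a snapshot, remove from the real list."""
    items = list(items)
    for item in items[:]:     # items[:] is a shallow copy
        items.remove(item)
    return items


lines = ["a\n", "b\n", "c\n", "d\n"]
print(remove_while_iterating(lines))  # ['b\n', 'd\n']
print(remove_over_copy(lines))        # []
```

If the goal is only to process and discard every line, a plain loop over the file without any removal is usually simpler.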

If you are trying to read all the lines from the file and then print them in order, and then delete them after printing them I would recommend this approach:
import time

try:
    file = open("childrens-catechism.txt")
    lines = file.readlines()
    while len(lines) != 0:
        print lines[0],
        lines.remove(lines[0])
        time.sleep(60)
except IOError:
    print 'No such file in directory'
This prints the first line and then deletes it. When the first element is removed, the remaining elements shift down by one, so the former lines[1] becomes the new start of the list, namely lines[0].
EDITED:
If you wanted to delete the line from the file as well as from the list of lines you will have to do this:
import time

try:
    file = open("childrens-catechism.txt", 'r+')  # open the file for reading and writing
    lines = file.readlines()
    while len(lines) != 0:
        print lines[0],
        lines.remove(lines[0])
        time.sleep(60)
    file.truncate(0)  # this truncates the file to 0 bytes
except IOError:
    print 'No such file in directory'
As for deleting the lines from the file one at a time, I am not sure whether that is possible or efficient.
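Deleting one line at a time from the file itself is possible, at the cost of rewriting the file after every line. The following is a minimal sketch in Python 3, not part of the original answer; pop_first_line is a hypothetical helper name, and it assumes the file is small enough to read into memory:

```python
def pop_first_line(path):
    """Return the first line of the file at `path` and rewrite the file without it.

    Returns None when the file is already empty.
    """
    with open(path, "r") as f:
        lines = f.readlines()
    if not lines:
        return None
    # Rewrite the file with everything except the first line.
    with open(path, "w") as f:
        f.writelines(lines[1:])
    return lines[0]
```

Calling it in a loop until it returns None prints and removes each line in turn. Every call rewrites the whole file, so this only makes sense for short files.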

Related

Search for string and delete line that contains string and the line underneath

I have a text file that contains
### 174.10.150.10 on 2018-06-20 12:19:47.533613 ###
IP : 174.10.150.10 :
IP : ALL :
I currently have code that uses Regex to search for a date/time string.
How can I delete a line that contains the string that I find? I want to delete that line and also the line underneath.
So both of these lines would get deleted:
### 174.10.150.10 on 2018-06-20 12:19:47.533613 ###
IP : 174.10.150.10 :
My code currently just adds 'None' to the bottom of the text file.
import re

def run():
    try:
        with open('file.txt', 'r') as f:
            with open('file.txt', 'a') as f2:
                reg = re.compile('###\s+\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}.+(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{0,})\s###')
                for line in f:
                    m = reg.match(line)
                answer = raw_input("Delete line? ")
                if answer == "y":
                    # delete line that contains "###" and line underneath
                    f2.write(str(m))
                else:
                    print("You chose no.")
    except OSError as e:
        print (e)

run()
(EDIT: I now understand from your comments that you have a blank line after two data lines, so when you want to delete a line you also want to delete the next two lines. My code has been adjusted to do that.)
Here is some code, making various changes to your code. I wrote a new file rather than overwriting the old file, for safety and to avoid needing to keep the entire file in memory at once. I combined the with lines into one line, for readability; similarly, I split the regex string to allow shorter lines of code. To avoid having more than one line in memory at once, I used a countdown variable skipline to note if a line is to be skipped in the new file. I also show each line before asking whether or not to delete it (with its following line). Note that lines that do not have the date and time are copied, by checking that the regexp match variable is None. Finally, I changed raw_input to input so this code will run in Python 3. Change it back to raw_input for Python 2.
By the way, the reason your code just adds 'None' to the end of the file is that you put your write line outside the main loop over the lines of the file. Thus you write only the regex match object for the last line of the file. Since the last line in your file does not have a date and time, the regex did not match so the string representation of the failed match is 'None'. In your second with statement you opened file.txt in append mode, so that 'None' is appended to the file.
I want to emphasize that you should create a new file. If you really want to overwrite the old file, the safe way to do that is to create a new file first with a slightly different name. Then if that file is made successfully, overwrite the old file with the new file and rename one copy to something like file.bak. This takes possible OS errors into account, as your code attempts to do. Without something like that, an error could end up deleting your file completely or mangling it. I leave that part of the code to you.
import re

def run():
    try:
        with open('file.txt', 'r') as f, open('file.tmp', 'w') as f2:
            # Raw strings keep the backslashes in the pattern intact
            reg = re.compile(r'###\s+\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
                             r'.+(\d{4}-\d{2}-\d{2}\s\d{2}'
                             r':\d{2}:\d{2}.\d{0,})\s###')
            skipline = 0  # do not skip lines
            for line in f:
                if skipline:
                    skipline -= 1
                    continue  # Don't write or process this line
                m = reg.match(line)
                if m:
                    answer = input("Delete line {} ? ".format(m.group()))
                    if answer == "y":
                        skipline = 2  # leave out this and next 2 lines
                    else:
                        print("You chose no.")
                if not skipline:
                    f2.write(line)
    except OSError as e:
        print(e)

run()
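The safe-overwrite step that the answer leaves to the reader could look roughly like the sketch below. It is an illustration under assumptions, not the answerer's code: replace_with_backup is a hypothetical helper, the .bak suffix follows the file.bak suggestion above, and it requires Python 3.3 or later for os.replace.

```python
import os
import shutil


def replace_with_backup(original, new_file, backup_suffix=".bak"):
    """Keep a backup copy of `original`, then move `new_file` over it.

    Returns the path of the backup copy.
    """
    backup = original + backup_suffix
    shutil.copy2(original, backup)   # preserve the old contents first
    os.replace(new_file, original)   # then swap in the new file
    return backup
```

For example, replace_with_backup('file.txt', 'file.tmp') would be called only after file.tmp has been written and closed without errors. If anything fails before that point, file.txt is still untouched.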
I refactored the filtering part into a generator function called filter_lines and moved the regex into a module-level variable. This approach makes use of an iterator.
import re

regex = re.compile(r'###\s+\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}.+(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{0,})\s###')

def filter_lines(lines):
    it = iter(lines)
    try:
        while True:
            line = next(it)
            m = regex.match(line)
            if m:
                # You may add the question-answer code here to ask the user whether to delete the matched line.
                next(it)  # Consume the line following the matched "###" line
                continue
            yield line
    except StopIteration:
        # Under PEP 479, a StopIteration that escapes a generator function is converted to RuntimeError, so it has to be caught here.
        # https://www.python.org/dev/peps/pep-0479/
        pass

def run():
    try:
        with open('file.txt', 'r') as f:
            with open('file.txt', 'a') as f2:
                filtered_lines = list(filter_lines(f.readlines()))
                print(*filtered_lines, sep='')
                # You may use the following line to actually write the result to a file
                # (f2 is opened in append mode, so this appends to file.txt rather than replacing it)
                # f2.writelines(filtered_lines)
    except OSError as e:
        print(e)

run()
This program should print the resultant content.
With some basic refactoring, here's the result...
import re

valid_lines = []

def run():
    try:
        with open('file.txt', 'r') as f:
            reg = re.compile(r'###\s+\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}.+(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{0,})\s###\s?')
            lines = f.readlines()
            invalid_index = -10
            for a in range(len(lines)):
                reg_result = reg.match(lines[a])
                if invalid_index == (a - 1):
                    # Skip the line underneath the deleted line
                    continue
                if reg_result is not None:
                    # The line matches the regexp.
                    answer = raw_input("Delete line? ")
                    if answer.lower() == 'y':
                        # Remember the index so the line underneath is skipped too
                        invalid_index = a
                    else:
                        print("You chose no.")
                        valid_lines.append(lines[a])
                else:
                    valid_lines.append(lines[a])
        with open('file.txt', 'w') as f:
            # Overwrite the file...
            f.writelines(valid_lines)
    except OSError as e:
        print (e)

run()
If you want to remove any lines that start with ### then, maybe you should consider this as the regexp: ###.*
EDIT: In your regular expression, you should add a \s? at the end to optionally match \n, as the file contains newlines. Also, use fullmatch() instead of match().

Read file and find if all lines are the same length

Using python I need to read a file and determine if all lines are the same length or not. If they are I move the file into a "good" folder and if they aren't all the same length I move them into a "bad" folder and write a word doc that says which line was not the same as the rest. Any help or ways to start?
You should use all():
with open(filename) as read_file:
    length = len(read_file.readline())
    if all(len(line) == length for line in read_file):
        pass  # Move to good folder
    else:
        pass  # Move to bad folder
Since all() is short-circuiting, it will stop reading the file at the first non-match.
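One detail to watch with this approach: len(line) includes the trailing newline, so a last line without a newline would count as one character shorter. A minimal Python 3 sketch that strips the newline before comparing (the function name all_lines_same_length is an assumption made for illustration):

```python
def all_lines_same_length(path):
    """Return True if every line in the file has the same length, ignoring newlines."""
    with open(path) as f:
        first = f.readline()
        if not first:
            return True  # empty file: nothing to compare
        length = len(first.rstrip("\n"))
        # all() stops at the first line whose length differs
        return all(len(line.rstrip("\n")) == length for line in f)
```

The result can then drive the move into the good or bad folder, for example with shutil.move.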
First off, you can read the file (example.txt in the full script below) and put all lines in a list, content:
with open(filename) as f:
    content = f.readlines()
Next you need to trim the newline characters from the end of each line and put the result in another list, result:
result = []
for line in content:
    line = line.strip()
    result.append(line)
Now it's not that hard to get the length of every line, and since you want to know which lines are bad, you loop through the list:
lengths = []
for line in result:
    lengths.append(len(line))
So the i-th element of result has the length stored in the i-th element of lengths. We can find which line length occurs most often in the list in a single line:
most_occuring = max(set(lengths), key=lengths.count)
Now we can write another for-loop to check which lengths don't match the most common one and add those lines to bad_lines:
bad_lines = []
for i in range(len(lengths)):
    if (lengths[i] != most_occuring):
        bad_lines.append([i, result[i]])
The next step is to check where the file needs to go, the good folder or the bad folder:
import os

if len(bad_lines) == 0:
    # Good file, move it to the good folder, use the os or shutil module
    os.rename("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
else:
    # Bad file, one or more lines are bad, thus move it to the bad folder
    os.rename("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
The last step is writing the bad lines to another file, which is doable since we already have them in the list bad_lines:
with open("bad_lines.txt", "w") as f:
    for bad_line in bad_lines:
        f.write("[%3i] %s\n" % (bad_line[0], bad_line[1]))
It's not a doc file, but I think this is a nice start. You can take a look at the docx module if you really want to write to a doc file.
EDIT: Here is an example python script.
with open("example.txt") as f:
    content = f.readlines()

result = []
lengths = []
# Strip the file of \n
for line in content:
    line = line.strip()
    result.append(line)
    lengths.append(len(line))

most_occuring = max(set(lengths), key=lengths.count)

bad_lines = []
for i in range(len(lengths)):
    if (lengths[i] != most_occuring):
        # Append the bad line to bad_lines
        bad_lines.append([i, result[i]])

# Check if it's a good or a bad file
# if len(bad_lines) == 0:
#     Good file
#     Move file to the good folder...
# else:
#     Bad file

# Text mode ("w"), because the lines are written as str
with open("bad_lines.txt", "w") as f:
    for bad_line in bad_lines:
        f.write("[%3i] %s\n" % (bad_line[0], bad_line[1]))
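The most-common-length step can also be written with collections.Counter, which avoids rescanning the list for every candidate length. This is an alternative sketch in Python 3, not the answerer's code: find_bad_lines is a hypothetical name, it reports 1-based line numbers, and it strips only the trailing newline rather than all surrounding whitespace.

```python
from collections import Counter


def find_bad_lines(lines):
    """Return (line_number, text) pairs whose length differs from the most common length."""
    stripped = [line.rstrip("\n") for line in lines]
    if not stripped:
        return []
    # most_common(1) gives [(length, count)] for the most frequent length
    expected = Counter(len(text) for text in stripped).most_common(1)[0][0]
    return [(number, text)
            for number, text in enumerate(stripped, start=1)
            if len(text) != expected]
```

An empty result means the file is good; otherwise the pairs can be written to the report file.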

How to write into nth line in a file with python

I want to write a line at a particular line number. My file contains the following data:
one
two
three
four
five
Six
Seven
Eight
Nine
Ten
Eleven
Here is my code -
f = open("file.txt",'r+')
for i, line in enumerate(f):
    if i == 2:
        f.write("Added at 2nd line")
    elif i == 3:
        f.write("Added at 2nd line")
    elif i > 29:
        break
f.close()
After running the above code I get the following output:
one
two
three
four
five
Six
Seven
Eight
Nine
Ten
ElevenAdded at 2nd line
Please help me write to a particular line number in the file.
I have tried this and it works, and it is fast.
First, have a look at the linecache module.
If that does not fit, this is what you need to do:
Files in Python are iterators, meaning they can be looped over, or have iteration operations applied to them.
To get every 5th line, for example, would be:
import itertools

with open(filename, 'r') as f:
    fifthlines = itertools.islice(f, 0, None, 5)
    for line in fifthlines:
        pass  # do something with line
To skip a series of lines, use a no-op for loop; here we skip 10 lines, then read 10:
for _ in itertools.islice(f, 0, 10):
    pass
for line in itertools.islice(f, 0, 10):
    pass  # do something with each of the next 10 lines
With the itertools library, and a quick scan through the Python tutorial you can figure out the rest of the script easily enough.
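The two islice patterns can be tried without a file, since islice works on any iterator. A small Python 3 sketch (the helper names every_nth and skip_then_take are made up for illustration):

```python
from itertools import islice


def every_nth(lines, n):
    """Return every n-th item, starting with the first."""
    return list(islice(iter(lines), 0, None, n))


def skip_then_take(lines, skip, take):
    """Skip `skip` items, then return the next `take` items."""
    it = iter(lines)
    for _ in islice(it, skip):
        pass                      # consume and discard the skipped items
    return list(islice(it, take))


lines = ["line %d\n" % i for i in range(30)]
print(every_nth(lines, 5))            # lines 0, 5, 10, 15, 20, 25
print(skip_then_take(lines, 10, 10))  # lines 10 through 19
```

Both calls must share one iterator, as skip_then_take does, because each islice continues from wherever the previous one stopped.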
Your question is not very clear to me, but this is the way i understand it. I tried to use your approach with enumerate and new variable line_num (your line number) which you can set manually in the script or from the terminal:
main_file = open('file.txt', 'r')
lines = []
line_num = 4  # if you want to set it manually in the script
#line_num = input('Type the line number: ')  # if you want to set it from the terminal
for i, line in enumerate(main_file):
    if i+1 != line_num:
        str_line = line
    else:
        # enumerate starts from zero so increment i by 1
        str_line = line.rstrip('\n') + ' (added at %s line)\n' % (i+1)
    lines.append(str_line)
    if i > 29:
        break
main_file.close()
with open('file.txt', 'w') as main_file:
    main_file.writelines(lines)
I've divided the work with the file into two blocks:
- the first reads the file's lines and adds them to the lines list (including the edited line)
- the second writes lines back into file.txt with the writelines method, inside a with statement.
Never write to the same file you are reading unless it is a direct (binary) file made of fixed-size blocks! You could corrupt it and have a hard time recovering.
In your use case the damage is limited. Because you use the iterator interface of the file object, Python reads ahead a whole chunk (for a small file like this one, the whole file), and the file pointer is immediately positioned at the end. You then get the lines one at a time, but when you write, you (fortunately) write at the end of the file. If you had read the file with readline, your write would have overwritten the following lines.
The correct way is to rename the input file, open it read-only, open a new file write-only, copy and modify at will, and delete the input file once everything has succeeded.
The simple alternative, when the file is small, is to load everything into memory and rewind the file before writing it all back.
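f = open("file.txt",'r+')
lines = f.readlines()
f.seek(0)  # rewind before writing everything back
for i, line in enumerate(lines):
    if i == 2 or i == 3:
        # enumerate counts from zero, so these are the 3rd and 4th lines
        f.write("Added at 2nd line\n")
    f.write(line)  # write the original line back as well
f.close()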
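The load-everything-then-write-back idea generalizes to a small helper. The sketch below is one possible Python 3 version under stated assumptions: insert_line is a hypothetical name, line numbers are 1-based, and the file is small enough to hold in memory.

```python
def insert_line(path, line_number, text):
    """Insert `text` as line `line_number` (1-based) of the file at `path`."""
    with open(path, "r") as f:
        lines = f.readlines()
    index = max(line_number - 1, 0)
    # When appending after a last line that has no newline, terminate it first.
    if index >= len(lines) and lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"
    lines.insert(index, text.rstrip("\n") + "\n")
    with open(path, "w") as f:
        f.writelines(lines)
```

With the file from the question, insert_line("file.txt", 2, "Added at 2nd line") would place the new text between "one" and "two". A line number past the end of the file appends the text.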
You can use fileinput. With inplace=True, standard output is redirected into the file, so every line you want to keep must be written out again:
import fileinput
import sys

for line in fileinput.input('file.txt', inplace=True):
    if fileinput.filelineno() == 2:
        sys.stdout.write('Added at 2nd line\n')
    else:
        sys.stdout.write(line)
or you have to write a second file:
with open('file.txt', 'r') as input_file, open('new_file.txt', 'w') as output_file:
    for index, line in enumerate(input_file):
        if index == 1:
            output_file.write('Added at 2nd line\n')
        else:
            output_file.write(line)

parse blocks of text from text file using Python

I am trying to parse some text files and need to extract blocks of text. Specifically, the lines that start with "1:" and 19 lines after the text. The "1:" does not start on the same row in each file and there is only one instance of "1:". I would prefer to save the block of text and export it to a separate file. In addition, I need to preserve the formatting of the text in the original file.
Needless to say I am new to Python. I generally work with R but these files are not really compatible with R and I have about 100 to process. Any information would be appreciated.
The code that I have so far is:
tmp = open(files[0],"r")
lines = tmp.readlines()
tmp.close()
num = 0
a=0
for line in lines:
    num += 1
    if "1:" in line:
        a = num
        break
a = num is the line number for the block of text I want. I then want to save the next 19 lines to another file, but can't figure out how to do this. Any help would be appreciated.
Here is one option. Read all lines from your file. Iterate until you find your line and return the next 19 lines. You would need to handle situations where your file doesn't contain an additional 19 lines.
def next_19_lines(path):
    fh = open(path, 'r')
    all_lines = fh.readlines()
    fh.close()
    for count, line in enumerate(all_lines):
        if "1:" in line:
            return all_lines[count+1:count+20]
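The worry about files that end before 19 more lines are available turns out to be harmless with slicing: a slice that runs past the end of a list simply stops at the end. The Python 3 sketch below shows this; block_after_marker is a hypothetical helper that, unlike the snippet above, also keeps the marker line, since the question asks for that line plus the 19 after it.

```python
def block_after_marker(lines, marker="1:", following=19):
    """Return the first line starting with `marker` plus up to `following` lines after it."""
    for index, line in enumerate(lines):
        if line.startswith(marker):
            # Slicing past the end of the list is safe: it just returns fewer lines.
            return lines[index:index + 1 + following]
    return []  # marker not found
```

Writing the block out with writelines keeps the original formatting, because the lines still carry their own newlines.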
Could be done in a one-liner...
open(files[0]).read().split('1:', 1)[1].split('\n')[:20]
or more readable
txt = open(files[0]).read() # read the file into a big string
before, after = txt.split('1:', 1) # split the file on the first "1:"
after_lines = after.split('\n') # create lines from the after text
lines_to_save = after_lines[:20] # rest of the "1:" line plus the 19 lines after it
then join the lines with a newline (and add a newline to the end) before writing it to a new file:
out_text = "1:" # add back "1:"
out_text += "\n".join(lines_to_save) # add the saved lines with newlines between them
out_text += "\n" # add a newline at the end
open("outputfile.txt", "w").write(out_text)
To comply with best practice for reading and writing files you should also be using the with statement to ensure that the file handles are closed as soon as possible. You can create convenience functions for it:
def read_file(fname):
    "Returns contents of file with name `fname`."
    with open(fname) as fp:
        return fp.read()

def write_file(fname, txt):
    "Writes `txt` to a file named `fname`."
    with open(fname, 'w') as fp:
        fp.write(txt)
then you can replace the first line above with:
txt = read_file(files[0])
and the last line with:
write_file("outputfile.txt", out_text)
I always prefer to read the file into memory first, but sometimes that's not possible. If you want to use iteration then this will work:
def process_file(fname):
    with open(fname) as fp:
        for line in fp:
            if line.startswith('1:'):
                break
        else:
            return  # no '1:' in file
        yield line  # yield line containing '1:'
        for i, line in enumerate(fp):
            if i >= 19:
                break
            yield line

if __name__ == "__main__":
    with open('output.txt', 'w') as fp:
        for line in process_file('intxt.txt'):
            fp.write(line)
It's using the else: clause on a for-loop, which you don't see very often anymore but which was created for just this purpose (the else clause is executed if the for-loop doesn't break).

Python: Print next x lines from text file when hitting string

The situation is as follows:
I have a .txt file with the results of several nslookups.
I want to loop through the file, and every time it hits the string "Non-authoritative answer:" the script has to print the following 8 lines from that position. If it works I should get all the positive results on my screen :).
First I had the following code:
#!/bin/usr/python
file = open('/tmp/results_nslookup.txt', 'r')
f = file.readlines()
for positives in f:
    if 'Authoritative answers can be found from:' in positives:
        print positives
file.close()
But that only printed "Authoritative answers can be found from:" as many times as it occurred in the .txt.
The code I have now:
#!/bin/usr/python
file = open('/tmp/results_nslookup.txt', 'r')
lines = file.readlines()
i = lines.index('Non-authoritative answer:\n')
for line in lines[i-0:i+9]:
    print line,
file.close()
But when I run it, it prints the first result nicely to my screen but does not print the other positive results.
p.s. I am aware of socket.gethostbyname("foobar.baz") but first I want to solve this basic problem.
Thank you in advance!
You can use the file as an iterator, then print the next 8 lines every time you find your sentence:
with open('/tmp/results_nslookup.txt', 'r') as f:
    for line in f:
        if line == 'Non-authoritative answer:\n':
            for i in range(8):
                print(next(f).strip())
Each time you use the next() function on the file object (or loop over it in a for loop), it'll return the next line in that file, until you've read the last line.
Instead of the range(8) for loop, I'd actually use itertools.islice:
from itertools import islice

with open('/tmp/results_nslookup.txt', 'r') as f:
    for line in f:
        if line == 'Non-authoritative answer:\n':
            print(''.join(islice(f, 8)))
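To experiment with this pattern without the nslookup file, the same idea can be wrapped in a function that collects the blocks instead of printing them. This is a Python 3 sketch; blocks_after is a hypothetical name, and it works on any iterable of lines:

```python
from itertools import islice


def blocks_after(lines, marker, count):
    """Return, for each line equal to `marker`, the list of up to `count` lines after it."""
    it = iter(lines)
    blocks = []
    for line in it:
        if line.rstrip("\n") == marker:
            # islice pulls from the same iterator, so the outer loop resumes after the block
            blocks.append(list(islice(it, count)))
    return blocks
```

With an open file, blocks_after(f, 'Non-authoritative answer:', 8) would return one list per positive result. A marker that falls inside a block is consumed as part of that block and does not start a new one.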
with open('/tmp/results_nslookup.txt', 'r') as f:
    for line in f:
        if line == 'Non-authoritative answer:\n':
            for _ in range(8):
                print(next(f).rstrip('\n'))
By the way: avoid using file as a variable name, because in Python 2 it is the name of a built-in type.
