Reading text file - for loop gives unexpected output - python

I am learning how to read txt files and find something in them. The example below outputs the entire txt file. I am trying to get it to print out "found it" when it finds the word "thanks" in the txt file. Where am I wrong?
This is the txt file I am reading:
this is a
demo file
for exercises
thanks
bye
This is the code I have written:
f = open("demo.txt", "r")
print(f.readline())
print(f.readline())
for word in f:
    print(word)
    if word == "thanks":
        print("found it")
This is the output:
this is a
demo file
for exercises
thanks
bye
Process finished with exit code 0

with open("demo.txt", "r") as f:
    for word in f:
        print(word)
        if "thanks" in word:
            print("found it")
            break

Files are iterable, so if you want to read a text file line by line, all you have to do is iterate over it. Also, you must ensure the file is closed after use, which is easily done using the with statement. Finally, each line ends with a newline character (in text mode, Python translates the platform's line ending to "\n"), which you may want to strip before comparing.
In other words, your code should look something like:
# nb: "r" (read) is the default
with open("path/to/your/file") as f:
    for line in f:
        # removes the ending newline marker
        line = line.rstrip("\n")
        print(line)
        # given your spec 'when it finds the word "thanks"'
        # I assume that it doesn't matter if there's
        # something else in the line, so we test for
        # containment.
        if "thanks" in line:
            print("found it")
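To see why the original word == "thanks" comparison never matched, it can help to look at the raw line. The sketch below uses io.StringIO as a stand-in for demo.txt (an assumption made only so it runs without a file on disk) and compares the line before and after stripping:

```python
import io

# Stand-in for demo.txt so the sketch runs without a file on disk.
demo = io.StringIO("this is a\ndemo file\nfor exercises\nthanks\nbye\n")

raw = ""
for line in demo:
    if "thanks" in line:
        raw = line
        break

print(repr(raw))  # repr() makes the trailing newline visible

# The newline is still attached, so an exact comparison fails.
exact_match = raw == "thanks"
# Once the newline is stripped, the comparison succeeds.
stripped_match = raw.rstrip("\n") == "thanks"
print(exact_match, stripped_match)
```

Note also that the two print(f.readline()) calls in the question consume the first two lines before the loop starts, so the loop itself only ever sees the remaining three.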

Related

Python 3: How to loop through a text file and find matching keywords?

To start, I am a complete newb to coding and don’t know what I’m doing.
I am working with a database txt file and have got it imported and open. I now need to loop through the file, find a specific keyword (number), and print this out to a new file. I have tried endlessly to understand coding to no avail. Can someone explain how to do this to me? Please explain in a dumbed down way so an idiot like me can understand.
file1 = open('database.txt', 'r')
Lines = file1.readlines()
pattern = "gene_numbers_here"
for line in Lines:
    if pattern in line:
        print(..., file = open("gene1found.txt",'w'))
Use readlines to load up the txt file line by line into a list of strings
file1 = open('myfile.txt', 'r')
Lines = file1.readlines()
Now for the looping:
for line in Lines:
    print(line)
Based on your problem, you are actually wanting to do a "pattern search" in a string.
For that, just use the same code from the looping example and add a membership test with the in operator to check, line by line, whether your pattern occurs in the txt file.
# declare the pattern
pattern = "this_pattern_only"
# loop through the list of strings in Lines
for line in Lines:
    # pattern search statement
    if pattern in line:
        print("pattern exist")
    else:
        print("pattern does not exist")
If you want to print this to a file, just change the print code lines I made.
Check out more on the write functionalities here:
https://www.w3schools.com/python/python_file_write.asp
Based on your new info about the code, try this:
# file1 is database, file2 is output
file1 = open('database.txt', 'r')
file2 = open('gene1found.txt', 'w')
Lines = file1.readlines()
pattern = "gene_numbers_here"
# search and write lines with gene pattern
print("Searching database ...")
for line in Lines:
    if pattern in line:
        file2.write(line)
print("Search complete !")
# close the files
file1.close()
file2.close()
This will write the gene lines with the pattern you want to your file.
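The same search can also be written with the with statement, so both files are closed automatically even if an error occurs. This is only a sketch: the function name copy_matching_lines and its parameters are made up for illustration and are not part of the original answer.

```python
def copy_matching_lines(src_path, dst_path, pattern):
    """Copy every line of src_path that contains pattern to dst_path.

    Returns the number of lines written.
    """
    count = 0
    # Both files are closed automatically when the with block ends.
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:  # iterating reads one line at a time
            if pattern in line:
                dst.write(line)  # the line already ends with "\n"
                count += 1
    return count
```

With the file names from the question, the call would be copy_matching_lines('database.txt', 'gene1found.txt', 'gene_numbers_here'). Iterating over the file directly, instead of calling readlines(), avoids loading the whole database into memory.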

How can I split information in a python script?

This is my code. What it should do is open the file called example.txt in the same directory and it should only print out the first word of a big list.
with open('example.txt') as file:
    line = 'example.txt'
    important_info = line.split()
    print(important_info[0])
I'm pretty sure I messed up but I don't know how.
I first coded this and it worked
acc = ('info blah bloh blrjejw bfwe tee')
tui = acc.split()
print(tui[0])
In the code I showed above it only prints the first word for one line. But I want something that can do over 100 lines quickly. I think I'm close.
I want to make sure I understand: you want this program to read the first line of a file and print the first word, right?
You're on the right track. You're accidentally splitting the name of the file rather than its contents; you're missing the code that reads the contents of the file.
To explain your code:
with open('example.txt') as file:
    line = 'example.txt'
    important_info = line.split()
    print(important_info[0])
(important_info is a list containing just the filename, i.e. ['example.txt'], so printing the first element just gives the string example.txt.)
Something like this would work (reading the first line, splitting it by whitespace so that it's a list of words, and then printing the first word in that list):
f = open("example.txt", "r")
print(f.readline().split()[0])
You need to read the file:
with open('example.txt') as file:
    line = file.read()
    important_info = line.split()
    print(important_info[0])
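Since the question mentions handling over 100 lines, here is a sketch that takes the first word of every line rather than only the first word of the file. The helper name first_words is hypothetical, and the sample list stands in for the contents of example.txt.

```python
def first_words(lines):
    """Return the first whitespace-separated word of each non-blank line."""
    words = []
    for line in lines:
        parts = line.split()
        if parts:  # skip blank lines, which split into an empty list
            words.append(parts[0])
    return words


# Sample data standing in for the lines of example.txt.
sample = ["info blah bloh\n", "\n", "second line here\n"]
print(first_words(sample))  # → ['info', 'second']
```

Because an open file is iterable line by line, the same function can be called as first_words(file) inside a with open('example.txt') as file: block.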

print next line only if it begins with a specific word

I am a bit new to Python and I was wondering if anyone can help. Basically I am reading the contents of a file, and when I find the word "prb" I want to check the next line using the next() function; if it starts with the word "rt", I want to print both lines. So far I wrote this piece of code:
with open('/home/user/Desktop/3rdstep.txt', 'r') as f:
    f.readline()
    for line in f:
        if "prb" in line:
            try:
                myword = next(f)
                if "rt" in myword:
                    print(line.strip())
                    print(myword)
            except:
                print("pass")
This works fine, but the only problem is that it randomly skips "rt" words for a reason I don't know. Can anyone help, please, or has anyone done something similar?
Thanks
If your input has two consecutive lines starting with 'prb' followed by a line starting with 'rt', then the match is skipped. The only exception is when they are the first three lines in the file, because the initial f.readline() call already consumes the first of them. The skip happens because for line in f: reads the first line starting with 'prb' and myword = next(f) reads the second. On the following iteration, line is therefore the one starting with 'rt', which fails the "prb" test, so the pair is never printed.
Instead of reading the next line you could store the previous line and then check if two lines match:
prev = ''
with open('/home/user/Desktop/3rdstep.txt') as f:
    for line in f:
        if prev.startswith('prb') and line.startswith('rt'):
            print(prev.strip())
            print(line)
        prev = line
You may use if myword.startswith("rt"): instead of if "rt" in myword:
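The previous-line approach can be checked against exactly the input that trips up next(): two consecutive 'prb' lines followed by an 'rt' line. This is a sketch under assumptions: the generator name prb_rt_pairs and the sample text are made up, and io.StringIO stands in for the real file.

```python
import io


def prb_rt_pairs(lines):
    """Yield (prb_line, rt_line) for each 'prb' line directly followed by an 'rt' line."""
    prev = ""
    for line in lines:
        line = line.rstrip("\n")
        if prev.startswith("prb") and line.startswith("rt"):
            yield prev, line
        prev = line  # remember this line for the next iteration


# Stand-in for the real file: note the two consecutive 'prb' lines.
sample = io.StringIO("prb 1\nprb 2\nrt 2\nother\nprb 3\nrt 3\n")
pairs = list(prb_rt_pairs(sample))
print(pairs)  # → [('prb 2', 'rt 2'), ('prb 3', 'rt 3')]
```

Because no line is ever consumed outside the for loop, every line gets compared with its predecessor and no pair can be skipped.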

Read next line in Python

I am trying to figure out how to search for a string in a text file, and if that string is found, output the next line.
I've looked at some similar questions on here but couldn't get anything from them to help me.
This is the program I have made. I have made it solely to solve this specific problem and so it's also probably not perfect in many other ways.
def searcher():
    print("Please enter the term you would like the definition for")
    find = input()
    with open ('glossaryterms.txt', 'r') as file:
        for line in file:
            if find in line:
                print(line)
So the text file will be made up of the term and then the definition below it.
For example:
Python
A programming language I am using
If the user searches for the term Python, the program should output the definition.
I have tried different combinations of print (line+1) etc. but no luck so far.
Your code handles each line as a term. In the code below, f is an iterator, so you can use next() to move it to the next element:
with open('test.txt') as f:
    for line in f:
        # consume the following line; '' is returned if the file has ended
        nextLine = next(f, '')
        if 'A' == line.strip():
            print(nextLine)
If your file is small, you may simply read it with readlines(), which returns a list of strings, each ending with a \n character. Then find the index of the selected word and print the item at position + 1 in that list.
This can be done as:
def searcher():
    print("Please enter the term you would like the definition for")
    find = input()
    with open("glossaryterms.txt", "r") as f:
        words = list(map(str.strip, f.readlines()))
    try:
        print(words[words.index(find) + 1])
    except (ValueError, IndexError):
        print("Sorry the word is not found.")
You could try it quick and dirty with a flag.
found = False
with open ('glossaryterms.txt', 'r') as file:
    for line in file:
        if found:
            print (line)
            found = False
        if find in line:
            found = True
It's just important to have the "if found:" check before the code that sets the flag, so that once your search term is found, the line read on the next iteration is the one printed.
In my mind, the easiest way would be to cache the last line. This means that on any iteration you would have the previous line, and you'd check on that, keeping the loop relatively similar.
For example:
def searcher():
    last_line = ""
    print("Please enter the term you would like the definition for")
    find = input()
    with open ('glossaryterms.txt', 'r') as file:
        for line in file:
            if find in last_line:
                print(line)
            last_line = line
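If the glossary file strictly alternates between a term line and a definition line, as the question describes, another option is to load it into a dictionary once and look terms up by exact match. This is a sketch under that assumption; load_glossary is a hypothetical helper, and io.StringIO stands in for glossaryterms.txt.

```python
import io


def load_glossary(lines):
    """Build a {term: definition} dict from alternating term/definition lines."""
    stripped = (line.strip() for line in lines)
    # zip() pulls two items from the same iterator per step: (term, definition).
    return dict(zip(stripped, stripped))


# Stand-in for glossaryterms.txt.
sample = io.StringIO(
    "Python\nA programming language I am using\n"
    "Loop\nRepeats a block of code\n"
)
glossary = load_glossary(sample)
print(glossary.get("Python", "Sorry, the term was not found."))
```

Unlike the substring test find in line, the dictionary lookup only matches whole terms, so searching for "Py" would not return the definition of "Python". A trailing term with no definition line is silently dropped by zip().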

Python: How to go through a file and replace curse words with a "censored"

Basically, I want a script that opens a file, and then goes through the file and sees if the file contains any curse words. If a line in the file contains a curse word, then I want to replace that line with "CENSORED". So far, I think I'm just messing up the code somehow because I'm new to Python:
filename = input("Enter a file name: ")
censor = input("Enter the curse word that you want censored: ")
with open(filename) as fi:
    for line in fi:
        if censor in line:
            fi.write(fi.replace(line, "CENSORED"))
    print(fi)
I am new to this, so I'm probably just messing something up...
By "line" I mean that this file (if "Hat" was a curse word):
There Is
A
Hat
Would be:
There Is
A
CENSORED
You cannot write to the same file you are reading, for two reasons:
You opened the file in read-only mode, you cannot write to such a file. You'd have to open the file in read-write mode (using open(filename, mode='r+')) to be able to do what you want.
You are replacing data as you read, with lines that are most likely going to be shorter or longer. You cannot do that in a file. For example, replacing the word cute with censored would create a longer line, and that would overwrite not just the old line but the start of the next line as well.
You need to write out your changed lines to a new file, and at the end of that process replace the old file with the new.
Note that your replace() call is also incorrect; you'd call it on the line:
line = line.replace(censor, 'CENSORED')
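The write-to-a-new-file-then-swap approach described above can be sketched as follows. The function name censor_lines and its parameters are assumptions for illustration. It replaces the whole line, as the question asks, and uses only standard-library calls (tempfile.NamedTemporaryFile and os.replace).

```python
import os
import tempfile


def censor_lines(path, censor, replacement="CENSORED"):
    """Rewrite path so that every line containing censor becomes replacement."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, so that the final
    # rename stays on one filesystem.
    with open(path) as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for line in src:
            dst.write(replacement + "\n" if censor in line else line)
    # Both files are closed here; swap the new file in for the old one.
    os.replace(dst.name, path)
```

Calling censor_lines(filename, censor) with the two values read by input() in the question would rewrite the file. Note that this sketch does not preserve the original file's permissions.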
The easiest way for you to achieve what you want is to use the fileinput module; it'll let you replace a file in-place, as it'll handle writing to another file and the file swap for you:
import fileinput
filename = input("Enter a file name: ")
censor = input("Enter the curse word that you want censored: ")
for line in fileinput.input(filename, inplace=True):
    line = line.replace(censor, 'CENSORED')
    print(line, end='')
The print() call is a little magic here; the fileinput module temporarily replaces sys.stdout meaning that print() will write to the replacement file rather than your console. The end='' tells print() not to include a newline; that newline is already part of the original line read from the input file.
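To try the stdout redirection without touching a real file, the following sketch creates a throwaway file in a temporary directory, censors it in place, and reads it back. The file name sample.txt and its contents are made up for the demonstration.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w") as f:
        f.write("There Is\nA\nHat\n")

    # While this block runs, print() writes into the file, not to the console.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            print(line.replace("Hat", "CENSORED"), end="")

    # stdout is restored once the fileinput block has closed.
    with open(path) as f:
        result = f.read()

print(result)
```

Using fileinput.input() as a context manager makes sure the redirection ends and the temporary backup file is cleaned up as soon as the loop finishes.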
Consider:
filename = input("Enter a file name: ")
censor = input("Enter the curse word that you want censored: ")
# Open the file, iterate through the lines and censor them, storing them in lines list
with open(filename) as f:
    lines = [line.replace(censor, 'CENSORED').strip() for line in f]
# If you want to re-write the censored file, re-open it, and write the lines
with open(filename, 'w') as f:
    f.write('\n'.join(lines))
We're using a list comprehension to censor the lines of the file.
If you want to replace the entire line, and not just the word, replace
lines = [line.replace(censor, 'CENSORED').strip() for line in f]
with
lines = ['CENSORED' if censor in line else line.strip() for line in f]
filename = input("Enter a file name: ")
censor = input("Enter the curse word that you want censored: ")
with open(filename) as fi:
    for line in fi:
        if censor in line:
            print("CENSORED")
        else:
            # the line already ends with a newline, so don't add another
            print(line, end='')
# curse_word is assumed to hold the word to censor
with open('filename.txt', 'r') as data:
    the_lines = data.readlines()
with open('filename.txt', 'w') as data:
    for line_content in the_lines:
        if curse_word in line_content:
            data.write('Censored\n')
        else:
            data.write(line_content)
You have only opened the file for reading. Some options:
Read the whole file in, do the replacement, and write it over the original file again.
Read the file line-by-line, process and write the lines to a new file, then delete the old file and rename in the new file.
Use the fileinput module, which does all the work for you.
Here's an example of the last option:
import fileinput, sys
for line in fileinput.input(inplace=1):
    line = line.replace('bad','CENSORED')
    sys.stdout.write(line)
And use:
test.py file1.txt file2.txt file3.txt
Each file will be edited in place.
