Printing the output of a Line search - python

I'm new to programming pretty much in general and I am having difficulty trying to get this command to print its output to the .txt document. My goal in the end is to be able to swap the term "Sequence" for a variable so I can integrate it into a custom easygui for multiple inputs and returns, but that's a story for later down the road. For the sake of testing and completing the current project I will just alter the term manually.
I've been successful in getting another program to send its output to a .txt, but this one is being difficult. I don't know if I have been overlooking something simple, but I have been stuck on this for more time than I would like.
When it searches for the lines it prints the fields in the file I want; however, when it goes to write, it finds the last line of the file and puts that in the .txt as the output. I know what the issue is but I haven't been able to wrap my head around how to fix it, mainly due to my lack of knowledge of the language, I think.
I am using Sublime Text 2 on Windows
def main():
    import os
    filelist = list()
    filed = open('out.txt', 'w')
    searchfile = open("asdf.csv")
    for lines in searchfile:
        if "Sequence" in lines:
            print lines
    filelist.append(lines)
    TheString = " ".join(filelist)
    searchfile.close()
    filed.write(TheString)
    filed.close()
main()

It sounds like you want the lines you are printing out to be collected in the variable "filelist", which will then be written to the file at the .write() call. Only a difference of indentation (which is significant in Python) prevents this from happening:
def main():
    import os
    filelist = list()
    filed = open('out.txt', 'w')
    searchfile = open("asdf.csv")
    for lines in searchfile:
        if "Sequence" in lines:
            print lines
            filelist.append(lines)
    TheString = " ".join(filelist)
    searchfile.close()
    filed.write(TheString)
    filed.close()
main()
Having
filelist.append(lines)
at the same level of indentation as
print lines
tells Python that they are in the same block, and that the second statement also belongs to the "then" clause of the if statement.

Your problem is that you are not appending inside the loop. As a consequence, you only append the last line. Do it like this:
for lines in searchfile:
    if "Sequence" in lines:
        print lines
        filelist.append(lines)
BONUS: This is the "pythonic" way to do what you want:
def main():
    with open('asdf.csv', 'r') as src, open('out.txt', 'w') as dest:
        # the "in" test is case-sensitive, so the term must match the file's spelling
        dest.writelines(line for line in src if 'Sequence' in line)

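def main():
    seq = "Sequence"
    # open() works in both Python 2 and 3; file() exists only in Python 2
    record = open("out.txt", "w")
    search = open("in.csv", "r")
    output = list()
    for line in search:
        # collect every matching line, not just the last one
        if seq in line: output.append(line)
    search.close()
    record.write(" ".join(output))
    record.close()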
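For anyone on Python 3, here is a minimal sketch of the same idea with the search term passed in as a parameter, which is what the question ultimately wants. The function name copy_matching_lines and its parameters are my own (hypothetical), not from the posts above, and the match is a plain case-sensitive substring test.

```python
def copy_matching_lines(src_path, dest_path, term):
    """Copy every line of src_path that contains term into dest_path.

    Returns the number of lines written.
    """
    written = 0
    with open(src_path, "r") as src, open(dest_path, "w") as dest:
        for line in src:
            if term in line:      # case-sensitive substring test
                dest.write(line)  # the line already ends with its newline
                written += 1
    return written
```

Calling copy_matching_lines("asdf.csv", "out.txt", "Sequence") should give the same out.txt contents as the one-liner in the BONUS answer, and swapping the term later only means passing a different third argument.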

Related

replace new line in a different file with an underscore (without using with)

I posted a question yesterday along similar lines but didn't quite get the response I wanted because I wasn't specific enough. Basically, the function takes a .txt file as the argument and returns a string with all \n characters replaced with an '_' on the same line. I want to do this without using with. I thought I did this correctly, but when I run it and check the file, nothing has changed. Any pointers?
This is what I did:
def one_line(filename):
    wordfile = open(filename)
    text_str = wordfile.read().replace("\n", "_")
    wordfile.close()
    return text_str
one_line("words.txt")
but to no avail. I open the text file and it remains the same.
The contents of the textfile are:
I like to eat
pancakes every day
and the output that's supposed to be shown is:
>>> one_line("words.txt")
'I like to eat_pancakes every day_'
The fileinput module in the Python standard library allows you to do this in one fell swoop.
import fileinput
for line in fileinput.input(filename, inplace=True):
    line = line.replace('\n', '_')
    print(line, end='')
The requirement to avoid a with statement is trivial to satisfy but rather pointless. Anything which looks like
with open(filename) as handle:
    stuff
can simply be rewritten as
handle = open(filename)
try:
    stuff
finally:
    handle.close()
If you take out the try/finally you have a bug which leaves handle open if an error happens. The purpose of the with context manager for open() is to simplify this common use case.
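Here is a sketch of the asker's function that avoids with, still closes the file if an error happens, and also writes the result back (the original only returned the string, which is why the file never changed). The name replace_newlines_in_place is made up for this example.

```python
def replace_newlines_in_place(filename):
    """Replace every newline in the file with '_' and return the new text."""
    handle = open(filename)  # opened before try, so finally always has a handle
    try:
        text = handle.read().replace("\n", "_")
    finally:
        handle.close()

    handle = open(filename, "w")  # "w" truncates the file before writing
    try:
        handle.write(text)
    finally:
        handle.close()
    return text
```

With the two-line words.txt from the question, this should return the string shown in the expected output and leave the same string in the file.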
You are missing some steps. After you obtain the updated string, you need to write it back to the file, example below without using with
def one_line(filename):
    wordfile = open(filename)
    text_str = wordfile.read().replace("\n", "_")
    wordfile.close()
    return text_str
def write_line(s):
    # Open the file in write mode
    wordfile = open("words.txt", 'w')
    # Write the updated string to the file
    wordfile.write(s)
    # Close the file
    wordfile.close()
s = one_line("words.txt")
write_line(s)
Or using with
with open("file.txt",'w') as wordfile:
    #Write the updated string to the file
    wordfile.write(s)
with pathlib you could achieve what you want this way:
from pathlib import Path
path = Path(filename)
contents = path.read_text()
contents = contents.replace("\n", "_")
path.write_text(contents)

Read next line in Python

I am trying to figure out how to search for a string in a text file, and if that string is found, output the next line.
I've looked at some similar questions on here but couldn't get anything from them to help me.
This is the program I have made. I have made it solely to solve this specific problem and so it's also probably not perfect in many other ways.
def searcher():
    print("Please enter the term you would like the definition for")
    find = input()
    with open ('glossaryterms.txt', 'r') as file:
        for line in file:
            if find in line:
                print(line)
So the text file will be made up of the term and then the definition below it.
For example:
Python
A programming language I am using
If the user searches for the term Python, the program should output the definition.
I have tried different combinations of print (line+1) etc. but no luck so far.
Your code is handling each line as a term. In the code below, f is an iterator, so you can use next to move it to the next element:
with open('test.txt') as f:
    for line in f:
        # lines come in term/definition pairs; '' is used if the file ends on a term
        nextLine = next(f, '')
        if 'A' == line.strip():
            print(nextLine)
If your file is small, you can simply read it with readlines(), which returns a list of strings, each ending with a \n character, then find the index of the selected word and print the item at position + 1 in that list.
This can be done as:
def searcher():
    print("Please enter the term you would like the definition for")
    find = input()
    with open("glossaryterms.txt", "r") as f:
        words = list(map(str.strip, f.readlines()))
        try:
            print(words[words.index(find) + 1])
        except (ValueError, IndexError):
            # ValueError: term not in the file; IndexError: term is the last line
            print("Sorry the word is not found.")
You could try it quick and dirty with a flag.
found = False  # must exist before the first iteration
with open ('glossaryterms.txt', 'r') as file:
    for line in file:
        if found:
            print (line)
            found = False
        if find in line:
            found = True
It's just important to have the "if found:" check before setting the flag, so that once the search term is found, the line printed is the one from the next iteration.
In my mind, the easiest way would be to cache the last line. This means that on any iteration you would have the previous line, and you'd check on that - keeping the loop relatively similar
For example:
def searcher():
    last_line = ""
    print("Please enter the term you would like the definition for")
    find = input()
    with open ('glossaryterms.txt', 'r') as file:
        for line in file:
            if find in last_line:
                print(line)
            last_line = line
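If the file really is strict term/definition pairs, another option is to load it into a dict once and look terms up exactly, which avoids a substring match also hitting definitions that mention the term. This is only a sketch under that assumption; load_glossary is a hypothetical name, not something from the question.

```python
def load_glossary(path):
    """Build a {term: definition} dict from alternating term/definition lines."""
    glossary = {}
    with open(path, "r") as f:
        for term_line in f:
            # Pull the line after the term; "" if the file ends on a term.
            definition_line = next(f, "")
            glossary[term_line.strip()] = definition_line.strip()
    return glossary
```

A lookup then becomes load_glossary("glossaryterms.txt").get(find), which returns None when the term is not in the file.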

Deleting a specific word from a file in python

I am quite new to python and have just started importing text files. I have a text file which contains a list of words, I want to be able to enter a word and this word to be deleted from the text file. Can anyone explain how I can do this?
text_file=open('FILE.txt', 'r')
ListText = text_file.read().split(',')
DeletedWord=input('Enter the word you would like to delete:')
NewList=(ListText.remove(DeletedWord))
I have this so far which takes the file and imports it into a list, I can then delete a word from the new list but want to delete the word also from the text file.
Here's what I would recommend, since it's fairly simple and I don't think you're concerned with performance:
f = open("file.txt",'r')
lines = f.readlines()
f.close()
excludedWord = "whatever you want to get rid of"
newLines = []
for line in lines:
    newLines.append(' '.join([word for word in line.split() if word != excludedWord]))
f = open("file.txt", 'w')
for line in lines:
    f.write("{}\n".format(line))
f.close()
This allows for a line to have multiple words on it, but it will work just as well if there is only one word per line
In response to the updated question:
You cannot directly edit the file (or at least I don't know how), but must instead get all the contents in Python, edit them, and then re-write the file with the altered contents.
Another thing to note: lst.remove(item) will throw out the first instance of item in lst, and only the first one. So the second instance of item will be safe from .remove(). This is why my solution uses a list comprehension to exclude all instances of excludedWord from the list. If you really want to use .remove() you can do something like this:
while excludedWord in lst:
    lst.remove(excludedWord)
But I would discourage this in favor of the equivalent list comprehension.
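Applied to the comma-separated file from the question, the list-comprehension approach could look roughly like this. It is a sketch in Python 3; delete_word and the sep parameter are names I made up, and it assumes the words in the file do not themselves contain the separator.

```python
def delete_word(path, word, sep=","):
    """Remove every occurrence of word from a sep-separated word file.

    Returns how many entries were removed.
    """
    with open(path, "r") as f:
        words = f.read().split(sep)

    # Keep everything except exact matches (ignoring surrounding whitespace).
    kept = [w for w in words if w.strip() != word]

    with open(path, "w") as f:
        f.write(sep.join(kept))
    return len(words) - len(kept)
```

Unlike a single .remove() call, this drops every copy of the word and does not raise an error when the word is absent; it just returns 0.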
We can replace strings in files (some imports needed):
import os
import sys
import fileinput
for line in fileinput.input('file.txt', inplace=1):
    sys.stdout.write(line.replace('old_string', 'new_string'))
Find this (maybe) here: http://effbot.org/librarybook/fileinput.htm
If 'new_string' is changed to '', this is the same as deleting 'old_string'.
So I was trying something similar; here are some points for people who might end up reading this thread. The only way you can replace the modified contents is by opening the same file in "w" mode. Then Python just overwrites the existing file.
I tried this using "re" and sub():
import re
f = open("inputfile.txt", "rt")
inputfilecontents = f.read()
f.close()  # close the read handle before reopening for writing
newline = re.sub("trial","",inputfilecontents)
f = open("inputfile.txt","w")
f.write(newline)
f.close()
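One caveat with a bare re.sub("trial", ...): it also eats the word when it is part of a longer word such as "trials". Here is a sketch that only removes whole words, assuming that is what you want. strip_word is a hypothetical helper name; re.escape and \b are standard features of the re module.

```python
import re

def strip_word(text, word):
    """Remove whole-word occurrences of word from text."""
    # re.escape keeps characters like '.' in word from acting as regex syntax;
    # \b only matches at a word boundary, so longer words are left alone.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.sub(pattern, "", text)
```

The surrounding spaces are left in place, so you may still want to tidy up whitespace afterwards.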
@Wnnmaw, your code is a little bit wrong there: the write loop has to go over newLines, not lines. It should go like this:
f = open("file.txt",'r')
lines = f.readlines()
f.close()
excludedWord = "whatever you want to get rid of"
newLines = []
for line in lines:
    newLines.append(' '.join([word for word in line.split() if word != excludedWord]))
f = open("file.txt", 'w')
for line in newLines:
    f.write("{}\n".format(line))
f.close()

Write strings to another file

The Problem - Update:
I could get the script to print out but had a hard time trying to figure out a way to put the stdout into a file instead of on a screen. the below script worked on printing results to the screen. I posted the solution right after this code, scroll to the [ solution ] at the bottom.
First post:
I'm using Python 2.7.3. I am trying to extract the last words of a text file after the colon (:) and write them into another txt file. So far I am able to print the results on the screen and it works perfectly, but when I try to write the results to a new file it gives me str has no attribute write/writeline. Here is the code snippet:
# the txt file I'm trying to extract last words from and write strings into a file
#Hello:there:buddy
#How:areyou:doing
#I:amFine:thanks
#thats:good:I:guess
x = raw_input("Enter the full path + file name + file extension you wish to use: ")
def ripple(x):
    with open(x) as file:
        for line in file:
            for word in line.split():
                if ':' in word:
                    try:
                        print word.split(':')[-1]
                    except (IndexError):
                        pass
ripple(x)
The code above works perfectly when printing to the screen. However I have spent hours reading Python's documentation and can't seem to find a way to have the results written to a file. I know how to open a file and write to it with writeline, readline, etc, but it doesn't seem to work with strings.
Any suggestions on how to achieve this?
PS: I didn't add the code that caused the write error, because I figured this would be easier to look at.
End of First Post
The Solution - Update:
Managed to get python to extract and save it into another file with the code below.
The Code:
inputFile = open ('c:/folder/Thefile.txt', 'r')
outputFile = open ('c:/folder/ExtractedFile.txt', 'w')
tempStore = outputFile
for line in inputFile:
    for word in line.split():
        if ':' in word:
            splitting = word.split(':')[-1]
            tempStore.writelines(splitting +'\n')
            print splitting
inputFile.close()
outputFile.close()
Update:
Check out droogans' code over mine; it was more efficient.
Try this:
with open('workfile', 'w') as f:
    f.write(word.split(':')[-1] + '\n')
If you really want to use the print method, you can:
from __future__ import print_function
print("hi there", file=f)
according to "Correct way to write line to file in Python". You should add the __future__ import if you are using Python 2; if you are using Python 3, print is already a function.
I think your question is good, and when you're done, you should head over to code review and get your code looked at for other things I've noticed:
# the txt file I'm trying to extract last words from and write strings into a file
#Hello:there:buddy
#How:areyou:doing
#I:amFine:thanks
#thats:good:I:guess
First off, thanks for putting example file contents at the top of your question.
x = raw_input("Enter the full path + file name + file extension you wish to use: ")
I don't think this part is necessary. You can just create a better parameter for ripple than x. I think file_loc is a pretty standard one.
def ripple(x):
    with open(x) as file:
With open, you are able to mark the operation happening to the file. I also like to name my file object according to its job. In other words, with open(file_loc, 'r') as r: reminds me that r.foo is going to be my file that is being read from.
        for line in file:
            for word in line.split():
                if ':' in word:
First off, your for word in line.split() statement does nothing but put the "Hello:there:buddy" string into a list: ["Hello:there:buddy"]. A better idea would be to pass split an argument, which does more or less what you're trying to do here. For example, "Hello:there:buddy".split(":") would output ['Hello', 'there', 'buddy'], making your search for colons an accomplished task.
                    try:
                        print word.split(':')[-1]
                    except (IndexError):
                        pass
Another advantage is that you won't need to check for an IndexError, since you'll have, at least, an empty string, which, when split, comes back as a list holding one empty string. In other words, it'll write nothing for that line.
ripple(x)
For ripple(x), you would instead call ripple('/home/user/sometext.txt').
So, try looking over this, and explore code review. There's a guy named Winston who does really awesome work with Python and self-described newbies. I always pick up new tricks from that guy.
Here is my take on it, re-written out:
import os #for renaming the output file
def ripple(file_loc='/typical/location/while/developing.txt'):
    outfile = "output.".join(os.path.basename(file_loc).split('.'))
    with open(file_loc, 'r') as r, open(outfile, 'w') as w:
        lines = r.readlines() #everything is one giant list
        # strip each line's newline first, so the join does not double it
        w.write('\n'.join([line.rstrip('\n').split(':')[-1] for line in lines]))
ripple()
Try breaking this down, line by line, and changing things around. It's pretty condensed, but once you pick up comprehensions and using lists, it'll be more natural to read code this way.
You are trying to call .write() on a string object.
You either got your arguments mixed up (you'll need to call fileobject.write(yourdata), not yourdata.write(fileobject)) or you accidentally re-used the same variable for both your open destination file object and storing a string.
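To make the direction of the call concrete, here is a small Python 3 sketch where write() is called on the file object and the string is the argument. The function name extract_last_fields and the sep parameter are my own, and unlike the question's code it splits each whole line on the separator rather than splitting on whitespace first.

```python
def extract_last_fields(src_path, dest_path, sep=":"):
    """Write the text after the last sep on each line of src_path to dest_path.

    Returns the extracted strings as a list.
    """
    results = []
    with open(src_path, "r") as src, open(dest_path, "w") as dest:
        for line in src:
            line = line.rstrip("\n")
            if sep in line:
                last = line.split(sep)[-1]
                # The file object owns write(); the string is only the data.
                dest.write(last + "\n")
                results.append(last)
    return results
```

Lines without the separator are skipped, which mirrors the if ':' in word check in the original.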

How to delete specific strings from a file?

I have a data file (unstructured, messy file) from which I have to scrub specific list of strings (delete strings).
Here is what I am doing but with no result:
infile = r"messy_data_file.txt"
outfile = r"cleaned_file.txt"
delete_list = ["firstname1 lastname1","firstname2 lastname2"....,"firstnamen lastnamen"]
fin=open(infile,"")
fout = open(outfile,"w+")
for line in fin:
    for word in delete_list:
        line = line.replace(word, "")
    fout.write(line)
fin.close()
fout.close()
When I execute the file, I get the following error:
NameError: name 'word' is not defined
The readlines method returns a list of lines, not words, so your code would only work where one of your words is on a line by itself.
Since files are iterators over lines, this can be done much more easily:
infile = "messy_data_file.txt"
outfile = "cleaned_file.txt"
delete_list = ["word_1", "word_2", "word_n"]
with open(infile) as fin, open(outfile, "w+") as fout:
    for line in fin:
        for word in delete_list:
            line = line.replace(word, "")
        fout.write(line)
To remove the string within the same file, I used this code
f = open('./test.txt','r')
a = ['word1','word2','word3']
lst = []
for line in f:
    for word in a:
        if word in line:
            line = line.replace(word,'')
    lst.append(line)
f.close()
f = open('./test.txt','w')
for line in lst:
    f.write(line)
f.close()
Based on your comment "I am double clicking the .py file. It seems to invoke the python application which disappears after a couple of seconds. I dont get any error thought" I believe your issue is the script is not finding the input file. That is also why you are not getting any output. When you double click on it... I actually can't recall where the interpreter is going to look but I think it's where the python.exe is installed.
Use a fully qualified path like so.
# Depends on your OS
infile = r"C:\tmp\messy_data_file.txt"
outfile = r"C:\tmp\cleaned_file.txt"
infile = r"/etc/tmp/messy_data_file.txt"
outfile = r"/etc/tmp/cleaned_file.txt"
Also, for your sanity, run it from the command-line instead of double clicking. It'll be much easier to catch errors/output.
To the OP,
Ross Patterson's method above works perfectly for me, i.e.
infile = "messy_data_file.txt"
outfile = "cleaned_file.txt"
delete_list = ["word_1", "word_2", "word_n"]
fin = open(infile)
fout = open(outfile, "w+")
for line in fin:
    for word in delete_list:
        line = line.replace(word, "")
    fout.write(line)
fin.close()
fout.close()
Example:
I have a file named messy_data_file.txt that includes the following words (animals), not necessarily on the same line. Like this:
Goat
Elephant
Horse Donkey Giraffe
Lizard
Bird
Fish
When I modify the code to read (actually just adding the words to delete to the "delete_list" line):
infile = "messy_data_file.txt"
outfile = "cleaned_file.txt"
delete_list = ["Donkey", "Goat", "Fish"]
fin = open(infile)
fout = open(outfile, "w+")
for line in fin:
    for word in delete_list:
        line = line.replace(word, "")
    fout.write(line)
fin.close()
fout.close()
The resulting "cleaned_file.txt" looks like this:
Elephant
Horse Giraffe
Lizard
Bird
There is a blank line where "Goat" used to be (where, oddly, removing "Donkey" did not) but for my purposes, this works fine.
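The blank line is there because the line "Goat\n" becomes "\n" after the replacement, and that newline still gets written. "Donkey" sat on a line with other words, so its line never became empty. If the leftovers bother you, here is a sketch that skips lines emptied by the removal and collapses the doubled spaces. scrub_file is a hypothetical name, and note that it normalises all whitespace within a line, which may not be wanted for every file.

```python
def scrub_file(in_path, out_path, delete_list):
    """Copy in_path to out_path with every string in delete_list removed.

    Lines left empty by the removal are dropped; lines that were already
    blank are kept. Runs of whitespace inside a line become one space.
    """
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            was_blank = not line.strip()
            for word in delete_list:
                line = line.replace(word, "")
            cleaned = " ".join(line.split())  # also trims both ends
            if cleaned or was_blank:
                fout.write(cleaned + "\n")
```

With the animal file above and delete_list = ["Donkey", "Goat", "Fish"], this should write only the four lines Elephant, Horse Giraffe, Lizard and Bird.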
I also add input("Press Enter to exit...") to the very end of the code to keep the command line window from opening and slamming shut on me when I'm double-clicking the remove_text.py file to run it, but take note that you'll catch no errors this way.
To catch errors, I run it from the command line (where C:\Just_Testing is the directory where all my files are, i.e. remove_text.py and messy_data_file.txt)
like this:
C:\Just_Testing\>py remove_text.py
or
C:\Just_Testing>python remove_text.py
works exactly the same.
Of course, like when writing HTML, I guess it never hurts to use a fully qualified path when running py or python from somewhere other than the directory you happen to be sitting in, such as:
C:\Windows\System32\>python C:\Users\Me\Desktop\remove_text.py
Of course in the code it would be:
infile = r"C:\Users\Me\Desktop\messy_data_file.txt"
outfile = r"C:\Users\Me\Desktop\cleaned_file.txt"
Be careful to use the same fully qualified path to place your newly created cleaned_file.txt in or it will be created wherever you may be and that could cause confusion when looking for it.
Personally, I have the PATH in my Environment Variables set to point to all my Python installs i.e. C:\Python3.5.3, C:\Python2.7.13, etc. so I can run py or python from anywhere.
Anyway, I hope making fine-tuning adjustments to this code from Mr. Patterson can get you exactly what you need. :)
Maybe you can add encoding='utf-8' when opening fin and fout.
Here is the modified version you may want to use:
fin = open(infile, "r", encoding='utf-8')
fout = open(outfile, "w+", encoding='utf-8')
The need for this (adding utf-8) mostly comes up on Windows. For plain reading, writing, and appending it usually isn't a problem, but for more advanced operations on a file, such as replacing text in it, you should do this.
Hope this helps you.
The code below reads the old data, then writes back every line except the ones equal to the string you don't want. (This also works if you want to remove empty lines.)
lines = []
with open("file.txt", "r+") as f:
    for i in f.readlines():
        lines.append(i)
with open("file.txt", "w") as f:
    for i in lines:
        # compare without the trailing newline, otherwise nothing ever matches
        if i.rstrip("\n") != "The string you want to remove":
            f.write(i)
