I am simply iterating through an external file (which contains a phrase) and want to see whether a line exists which has the word 'Dad' in it. If I find it, I want to replace it with 'Mum'. Here is the program I've built... but I am not sure why it isn't working?!
message_file = open('test.txt','w')
message_file.write('Where\n')
message_file.write('is\n')
message_file.write('Dad\n')
message_file.close()
message_temp_file = open('testTEMP.txt','w')
message_file = open('test.txt','r')
print(message_file.read())
for line in message_file:
    if line == 'Dad':                  # look for the word
        message_temp_file.write('Mum') # replace it with Mum in the temp file
    else:
        message_temp_file.write(line)  # else, just write the line unchanged
message_file.close()
message_temp_file.close()
import os
os.remove('test.txt')
os.rename('testTEMP.txt','test.txt')
This should be so simple...it's annoyed me! Thanks.
You don't have any lines that are "Dad". You have a line that is "Dad\n", but no "Dad". In addition, since you've already called message_file.read(), the cursor is at the end of your file, so for line in message_file hits StopIteration immediately and the loop body never runs. You should do message_file.seek(0) just before your for loop.
print(message_file.read())
message_file.seek(0)
for line in message_file:
    if line.strip() == "Dad":
        ...
That should put the cursor back at the beginning of the file, and strip out the newline and get you what you need.
Note that this exercise is a great example of how not to do things in general! The better implementation would have been:
in_ = message_file.read()
out = in_.replace("Dad","Mum")
message_temp_file.write(out)
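Spelled out as a minimal, self-contained sketch (same file names as the question, using with statements so the handles are closed automatically):
import os

with open('test.txt', 'r') as message_file:
    contents = message_file.read()              # read the whole file at once

with open('testTEMP.txt', 'w') as message_temp_file:
    message_temp_file.write(contents.replace('Dad', 'Mum'))

os.remove('test.txt')
os.rename('testTEMP.txt', 'test.txt')           # swap the temp file into place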
print(message_file.read())
here you already read the whole file.
Nothing is left for the for loop to check
A file object always remembers where it stopped to read/write the last time you accessed it.
So if you call print(message_file.readline()), the first line of the file is read and printed. Next time you call the same command, the second line is read and printed and so on until you reach the end of the file. By using print(message_file.read()) you have read the whole file and any further call of read or readline will give you nothing
You can get the current position by message_file.tell() and set it to a certain value by message_file.seek(value), or simply reopen the file
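For example, a small sketch (the file name is made up):
f = open('example.txt', 'r')       # hypothetical file
print(f.readline())                # reads and prints line 1
print(f.tell())                    # current position in the file
print(f.readline())                # reads and prints line 2
f.seek(0)                          # jump back to the beginning
print(f.read())                    # reads the whole file again
f.close()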
The problem most likely is due to the fact that your conditional will only match the string "Dad", when the string is actually "Dad\n". You could either update your conditional to:
if line == "Dad\n":
OR
if "Dad" in line:
Lastly, you also read the entire file when you call print(message_file.read()). You either need to remove that line, or you need to put a call to message_file.seek(0) in order for the loop that follows to actually do anything.
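Putting both fixes together, the loop could look roughly like this (sticking to the question's variable names; a newline is written after 'Mum' so the replacement stays on its own line):
message_file.seek(0)                    # rewind after the earlier read()
for line in message_file:
    if "Dad" in line:                   # or: line == "Dad\n"
        message_temp_file.write('Mum\n')
    else:
        message_temp_file.write(line)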
Is there a way to precurse a write function in python (I'm working with fasta files but any write function that works with text files should work)?
The only way I could think of is to read the whole file in as an array, count the number of lines up to where I want to start, and just re-write that array, from that value on, to a text file.
I was just thinking there might be a write option or something somewhere.
I would add some code, but I'm writing it right now, and everyone on here seems to be pretty well versed, and probably know what I'm talking about. I'm an EE in the CS domain and just calling on the StackOverflow community to enlighten me.
From what I understand you want to truncate a file from the start - i.e. remove the first n lines.
Then no - there is no way you can do that without reading in the lines and ignoring them - this is what I would do:
import shutil

remove_to = 5  # Remove lines 0 to 5
try:
    with open('precurse_me.txt') as inp, open('temp.txt', 'w') as out:
        for index, line in enumerate(inp):
            if index <= remove_to:
                continue
            out.write(line)
    # If you don't want to replace the original file - delete this
    shutil.move('temp.txt', 'precurse_me.txt')
except Exception as e:
    raise e
Here I open a file for the output and then use shutil.move() to replace the input file only after the processing (the for loop) is complete. I do this so that I don't break the 'precurse_me.txt' file in case the processing fails. I wrap the whole thing in a try/except so that if anything fails it doesn't try to move the file by accident.
The key is the for loop - read the input file line by line; using the enumerate() function to count the lines as they come in.
Ignore those lines (by using continue) until the index says to not ignore the line - after that simply write each line to the out file.
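The same skip can also be written with itertools.islice, which consumes the unwanted lines for you; a rough sketch with the same file names:
import itertools
import shutil

remove_to = 5  # Remove lines 0 to 5, as above
with open('precurse_me.txt') as inp, open('temp.txt', 'w') as out:
    # islice skips the first remove_to + 1 lines and yields the rest
    out.writelines(itertools.islice(inp, remove_to + 1, None))
shutil.move('temp.txt', 'precurse_me.txt')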
I want my Python code to read a file that will contain numbers only in one line. But that one line will not necessarily be the first one. I want my program to ignore all empty lines until it gets to the first line with numbers.
The file will look something like this:
In this example I would want my Python code to ignore the first 2 lines, which are empty, and just grab the first non-empty one.
I know that when doing the following I can read the first line:
import sys
line = sys.stdin.readline()
And I tried doing a for loop like the following to try to get it done:
for line in sys.stdin.readlines():
    values = line.split()
    # rest of code ....
However I cannot get the code to work properly if the first line of the file is empty. I did try a while loop but then it became an infinite loop. Any suggestions on how to properly skip empty lines and perform specific actions only on the first line that is not empty?
Here is an example of a function to get the next line containing some non-whitespace character, from a given input stream.
You might want to modify the exact behaviour in the event that no line is found (e.g. return None or do something else instead of raising an exception).
import sys
import re
def get_non_empty_line(fh):
    for line in fh:
        if re.search(r'\S', line):
            return line
    raise EOFError
line = get_non_empty_line(sys.stdin)
print(line)
Note: you can happily call the function more than once; the iteration (for line in f:) will carry on from wherever it got to the last time.
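For instance, assuming a file with at least two non-empty lines (the file name is made up):
with open('numbers.txt') as fh:            # hypothetical input file
    first = get_non_empty_line(fh)
    second = get_non_empty_line(fh)        # picks up where the first call stopped
    print(first.split())
    print(second.split())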
You probably want to use the continue keyword with a check if the line is empty, like this:
for line in sys.stdin.readlines():
    if not line.strip():
        continue
    values = line.split()
    # rest of code ....
I have 2 files, passwd and dictionary. The passwd is a test file with one word, while the dictionary has a list of a few lines of words. My program so far reads and compares only the first line of the dictionary file. For example, my dictionary file contains (egg, fish, red, blue) and my passwd file contains only (egg).
The program runs just fine, but once I switch the word egg in the dictionary file to, let's say, last in the list, the program won't read it and won't pull up any results.
My code is below.
#!/usr/bin/passwd
import crypt

def testPass(line):
    e = crypt.crypt(line, "HX")
    print e

def main():
    dictionary = open('dictionary', 'r')
    password = open('passwd', 'r')
    for line in dictionary:
        for line2 in password:
            if line == line2:
                testPass(line2)
    dictionary.close()
    password.close()

main()
If you do
for line in file_obj:
....
you are implicitly using the readline method of the file, advancing the file pointer with each call. This means that after the inner loop is done for the first time, it will no longer be executed, because there are no more lines to read.
One possible solution is to keep one -- preferably the smaller -- file in memory using readlines. This way, you can iterate over it for each line you read from the other file.
file_as_list = file_obj.readlines()
for line in file_obj_2:
    for line2 in file_as_list:
        ...
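Applied to the question's code, that could look roughly like this (keeping the original Python 2 style and file names, and holding the smaller passwd file in memory):
import crypt

def testPass(line):
    e = crypt.crypt(line, "HX")
    print e

def main():
    password = open('passwd', 'r')
    passwords = password.readlines()        # small file: keep it in memory
    password.close()
    dictionary = open('dictionary', 'r')
    for line in dictionary:
        for line2 in passwords:             # re-iterating a list works every time
            if line == line2:
                testPass(line2)
    dictionary.close()

main()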
Once your inner loop runs once, it will have reached the end of the password file. When the outer loop hits its second iteration, there's nothing left to read in the password file because you haven't moved the file pointer back to the start of the file.
There are many solutions to the problem. You can use seek to move the file pointer back to the start. Or, you can read the whole password file once and save the data in a list. Or, you can reopen the file on every iteration of the outer loop. The choice of which is best depends on the nature of the data (how many lines there are, are they on a slow network share or fast local disk?) and what your performance requirements are.
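A sketch of the seek variant, using the question's handles (the password file is rewound before every pass of the outer loop):
for line in dictionary:
    password.seek(0)                 # back to the start of the password file
    for line2 in password:
        if line == line2:
            testPass(line2)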
I want to know how to edit a file on the fly row by row in python.
For example I have a text file where I usually have:
key value
key value
key value
key value
key value
...
they are not necessarily the same pair for each line. It's just the way I explained it.
I would like to show key and value line by line (on my terminal) and then do one of these two things:
-just press enter (or whatever hot-key) to go ahead and read (show) the next line.
-enter a new value and then hit enter; this will actually replace the value (that was being shown) in the file and finally go ahead and show the next key/value pair.
This goes on until the end of the file, or possibly until I type 'quit' or some other keyword; it doesn't matter.
-Being able to go back to the previous row would be a plus (in case of accidentally going to the next row), but it's not too important for now.
I often find myself editing huge files in a very tedious and repetitive way, and text editors are really frustrating with their cursors going everywhere when pressing the arrow keys. Also having to use backspace to delete is annoying.
I know how to read a file and how to write a file in Python, but not in such an interactive way; I only know how to write the whole file at once. Plus I don't know whether it is safe to open the same file for both reading and writing. I also know how to manipulate each line, split the text into a list of values, etc... all I really need is to understand how to modify the file at that exact current line and handle this type of interaction well.
what is the best way to do this?
All the answers focus on loading the contents of the file into memory, modifying them, and then saving everything back to disk on close, so I thought I'd give an in-place approach a try:
import os

sep = " "
with open("inline-t.txt", "rb+") as fd:
    seekpos = fd.tell()
    line = fd.readline()
    while line:
        print line
        next = raw_input(">>> ")
        if next == ":q":
            break
        if next:
            values = line.split(sep)
            newval = values[0] + sep + next + '\n'
            if len(newval) == len(line):
                fd.seek(seekpos)
                fd.write(newval)
                fd.flush()
                os.fsync(fd)
            else:
                remaining = fd.read()
                fd.seek(seekpos)
                fd.write(newval + remaining)
                fd.flush()
                os.fsync(fd)
                fd.seek(seekpos)
                line = fd.readline()
        seekpos = fd.tell()
        line = fd.readline()
The script simply opens the file, reads line by line, and rewrites it if the user inputs a new value. If the length of the data matches previous data, seek and write are enough. If the new data is of different size, we need to clean-up after us. So the remainder of the file is read, appended to the new data, and everything is rewritten to disk. fd.flush and os.fsync(fd) guarantee that changes are indeed available in the file as soon as it is written out. Not the best solution, performance-wise, but I believe this is closer to what he asked.
Also, consider that there might be a few quirks in this code, and I'm sure there's room for optimizing -- perhaps one global read at the beginning to avoid re-reading the rest of the file every time a change of a different length is made, or something like that.
The way I would go about this is to load all the lines of the text file into a list, and then iterate through that list, changing the values as you go along. Then at the very end (when you get to the last line or whenever you want), write that whole list out to a file with the same name, so that it overwrites the old file.
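A rough sketch of that approach (the file name and the single-space separator are assumptions for the example):
with open('data.txt') as f:                 # hypothetical file of "key value" lines
    lines = f.readlines()

for i, line in enumerate(lines):
    key, value = line.split(' ', 1)
    new = raw_input('%s %s>>> ' % (key, value.rstrip()))
    if new == ':q':
        break
    if new:
        lines[i] = key + ' ' + new + '\n'   # replace the value for this key

with open('data.txt', 'w') as f:            # overwrite the old file
    f.writelines(lines)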
I need to get a specific line number from a file that I am passing into a python program I wrote. I know that the line I want will be line 5, so is there a way I can just grab line 5, and not have to iterate through the file?
If you know how many bytes you have before the line you're interested in, you could seek to that point and read out a line. Otherwise, a "line" is not a first class construct (it's just a list of characters terminated by a character you're assigning a special meaning to - a newline). To find these newlines, you have to read the file in.
Practically speaking, you could use the readline method to read off 5 lines and then read your line.
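For example (a sketch; the file name is made up):
with open('somefile.txt') as f:     # hypothetical file name
    for _ in range(4):              # read off the first four lines
        f.readline()
    line5 = f.readline()            # the fifth line
print(line5)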
Why are you trying to do this?
You can use linecache:
import linecache
get = linecache.getline
print(get(path_of_file, number_of_line))
I think the following should do:
line_number = 4

# Avoid reading the whole file
f = open('path/to/my/file', 'r')
count = 1
for i in f:                        # iterate line by line (readline() here would loop over characters)
    if count == line_number:
        print i
        break
    count += 1
f.close()

# By reading the whole file
f = open('path/to/my/file', 'r')
lines = f.read().splitlines()
print lines[line_number-1]         # Index starts from 0
f.close()
This should give you the 4th line in the file.