I'm looking at how to do file input and output in Python. I've written the following code to read a list of names (one per line) from a file into another file, checking each line against a given name and appending text to the lines that match. The code works. Could it be done better?
I wanted to use the with open(...) statement for both the input and output files, but I can't see how they could be in the same block, which would mean I'd need to store the names in a temporary location.
def filter(txt, oldfile, newfile):
    '''\
    Read a list of names from a file line by line into an output file.
    If a line begins with a particular name, insert a string of text
    after the name before appending the line to the output file.
    '''
    outfile = open(newfile, 'w')
    with open(oldfile, 'r', encoding='utf-8') as infile:
        for line in infile:
            if line.startswith(txt):
                line = line[0:len(txt)] + ' - Truly a great person!\n'
            outfile.write(line)
    outfile.close()
    return # Do I gain anything by including this?

# input the name you want to check against
text = input('Please enter the name of a great person: ')
letsgo = filter(text, 'Spanish', 'Spanish2')
Python allows putting multiple open() calls in a single with statement, separated by commas. Your code would then be:
def filter(txt, oldfile, newfile):
    '''\
    Read a list of names from a file line by line into an output file.
    If a line begins with a particular name, insert a string of text
    after the name before appending the line to the output file.
    '''
    with open(newfile, 'w') as outfile, open(oldfile, 'r', encoding='utf-8') as infile:
        for line in infile:
            if line.startswith(txt):
                line = line[0:len(txt)] + ' - Truly a great person!\n'
            outfile.write(line)

# input the name you want to check against
text = input('Please enter the name of a great person: ')
letsgo = filter(text, 'Spanish', 'Spanish2')
And no, you don't gain anything by putting an explicit return at the end of your function. You can use return to exit early, but you had it at the end, and the function will exit there without it anyway. (Of course, with functions that return a value, you use return to specify the value to return.)
Using multiple open() calls in a single with statement was not supported in Python 2.5, when the with statement was introduced, or in Python 2.6, but it is supported in Python 2.7 and Python 3.1 or newer.
http://docs.python.org/reference/compound_stmts.html#the-with-statement
http://docs.python.org/release/3.1/reference/compound_stmts.html#the-with-statement
If you are writing code that must run in Python 2.5, 2.6 or 3.0, nest the with statements as the other answers suggested or use contextlib.nested.
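For completeness, here is a rough Python 2 sketch of the contextlib.nested fallback (my own illustration, not from the original answer), reusing the filter arguments from above. Note that Python 2's built-in open() takes no encoding parameter, so it is dropped here, and nested() will not close the first file if opening the second one fails:
# Python 2 only: contextlib.nested was deprecated in 2.7 and removed in 3.2.
from contextlib import nested

def filter(txt, oldfile, newfile):
    with nested(open(newfile, 'w'), open(oldfile, 'r')) as (outfile, infile):
        for line in infile:
            if line.startswith(txt):
                line = line[0:len(txt)] + ' - Truly a great person!\n'
            outfile.write(line)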
Use nested blocks, like this:
with open(newfile, 'w') as outfile:
    with open(oldfile, 'r', encoding='utf-8') as infile:
        # your logic goes right here
You can nest your with blocks. Like this:
with open(newfile, 'w') as outfile:
    with open(oldfile, 'r', encoding='utf-8') as infile:
        for line in infile:
            if line.startswith(txt):
                line = line[0:len(txt)] + ' - Truly a great person!\n'
            outfile.write(line)
This is better than your version because you guarantee that outfile will be closed even if your code encounters exceptions. Obviously you could do that with try/finally, but with is the right way to do this.
Or, as I have just learnt, you can have multiple context managers in a with statement as described by @steveha. That seems to me to be a better option than nesting.
And for your final minor question, the return serves no real purpose. I would remove it.
Sometimes you might want to open a variable number of files and treat each one the same; you can do this with contextlib:
from contextlib import ExitStack

filenames = ['file1.txt', 'file2.txt', 'file3.txt']
with open('outfile.txt', 'a') as outfile:
    with ExitStack() as stack:
        file_pointers = [stack.enter_context(open(file, 'r')) for file in filenames]
        for fp in file_pointers:
            outfile.write(fp.read())
I posted a similar question yesterday but didn't quite get the response I wanted because I wasn't specific enough. Basically, the function takes a .txt file as the argument and returns a string with all \n characters replaced with an '_', on a single line. I want to do this without using with. I thought I did this correctly, but when I run it and check the file, nothing has changed. Any pointers?
This is what I did:
def one_line(filename):
    wordfile = open(filename)
    text_str = wordfile.read().replace("\n", "_")
    wordfile.close()
    return text_str

one_line("words.txt")
But to no avail: I open the text file and it remains the same.
The contents of the textfile are:
I like to eat
pancakes every day
and the output that's supposed to be shown is:
>>> one_line("words.txt")
'I like to eat_pancakes every day_'
The fileinput module in the Python standard library allows you to do this in one fell swoop.
import fileinput

for line in fileinput.input(filename, inplace=True):
    line = line.replace('\n', '_')
    print(line, end='')
The requirement to avoid a with statement is trivial but rather pointless. Anything which looks like
with open(filename) as handle:
    stuff
can simply be rewritten as
handle = open(filename)
try:
    stuff
finally:
    handle.close()
If you take out the try/finally you have a bug which leaves handle open if an error happens. The purpose of the with context manager for open() is to simplify this common use case.
You are missing some steps. After you obtain the updated string, you need to write it back to the file. Here is an example without using with:
def one_line(filename):
    wordfile = open(filename)
    text_str = wordfile.read().replace("\n", "_")
    wordfile.close()
    return text_str

def write_line(s):
    # Open the file in write mode
    wordfile = open("words.txt", 'w')
    # Write the updated string to the file
    wordfile.write(s)
    # Close the file
    wordfile.close()

s = one_line("words.txt")
write_line(s)
Or, using with:
with open("file.txt", 'w') as wordfile:
    # Write the updated string to the file
    wordfile.write(s)
With pathlib you could achieve what you want this way:
from pathlib import Path
path = Path(filename)
contents = path.read_text()
contents = contents.replace("\n", "_")
path.write_text(contents)
I wrote the following Python code snippet to append a lowercase p character to each line of a txt file:
f = open('helloworld.txt', 'r')
for line in f:
    line += 'p'
print(f.read())
f.close()
However, when I execute this Python program, it prints nothing at all:
zhiwei#zhiwei-Lenovo-Rescuer-15ISK:~/Documents/1001/ass5$ python3 helloworld.py
Can anyone tell me what's wrong with my code?
Currently, you are only reading each line and not writing to the file. Reopen the file in write mode and write your full string to it, like so:
newf = ""
with open('helloworld.txt', 'r') as f:
    for line in f:
        newf += line.strip() + "p\n"

# the with blocks close the files automatically
with open('helloworld.txt', 'w') as f:
    f.write(newf)
Well, if you type help(f) in the shell, you get "Character and line based layer over a BufferedIOBase object, buffer." What that means: the first time you read from the buffer you get the content, but if you read it again, it's empty.
So do it like this:
with open(oldfile, 'r') as f1, open(newfile, 'w') as f2:
    newline = ''
    for line in f1:
        newline += line.strip() + "p\n"
    f2.write(newline)
open(filePath, openMode) takes two arguments: the first one is the path to your file, and the second one is the mode it will be opened in. When you use 'r' as the second argument, you are actually telling Python to open it as a read-only file.
If you want to write to it, you need to open it in write mode, using 'w' as the second argument. You can find more about how to read and write files in Python in the official documentation.
If you want to read and write at the same time, you have to open the file in both reading and writing modes. You can do this simply by using 'r+' mode.
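For illustration only (not part of the original answer), here is a minimal sketch of the 'r+' approach for this question's helloworld.txt, assuming the goal is to append 'p' to every line in place:
# Open for reading and writing without truncating; the file must already exist.
with open('helloworld.txt', 'r+') as f:
    lines = [line.rstrip('\n') + 'p\n' for line in f]  # append 'p' to each line
    f.seek(0)            # rewind before writing
    f.writelines(lines)
    f.truncate()         # discard leftover bytes if the new content is shorter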
It seems that your for loop has already read the file to the end, so f.read() returns an empty string.
If you just need to print the lines in the file, you could move the print into the for loop, as in print(line). And it is better to read the lines before the for loop:
f = open("filename", "r")
lines = f.readlines()
for line in lines:
line += "p"
print(line)
f.close()
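As an aside, the reason f.read() returned an empty string is that the file position was already at the end after the loop. A seek(0) rewind (my own sketch, not part of the original answer) shows this, reusing the same placeholder "filename":
f = open("filename", "r")
for line in f:
    print(line.rstrip("\n") + "p")
f.seek(0)           # move the file position back to the start
print(f.read())     # read() now returns the whole file again
f.close()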
If you need to modify the file, you need to create another file object, open it in "w" mode, and use f.write(line) to write the modified lines into the new file.
Besides, it is better to use the with statement instead of a bare open(); it is more Pythonic:
with open("filename", "r") as f:
lines = f.readlines()
for line in lines:
line += "p"
print(line)
When using the with statement, you don't need to close the file yourself, which is simpler.
I'm fairly new to Python and I'm having an issue with my Python script (split_fasta.py). Here is an example of my issue:
list = ["1.fasta", "2.fasta", "3.fasta"]
for file in list:
contents = open(file, "r")
for line in contents:
if line[0] == ">":
new_file = open(file + "_chromosome.fasta", "w")
new_file.write(line)
I've left the bottom part of the program out because it's not needed. My issue is that when I run this program in the same directory as my .fasta files, it works great:
python split_fasta.py *.fasta
But if I'm in a different directory and I want the program to output the new files (e.g. 1.fasta_chromosome.fasta) to my current directory, it doesn't:
python /home/bin/split_fasta.py /home/data/*.fasta
This still creates the new files in the same directory as the fasta files. The issue here I'm sure is with this line:
new_file = open(file + "_chromosome.fasta", "w")
Because if I change it to this:
new_file = open("seq" + "_chromosome.fasta", "w")
It creates an output file in my current directory.
I hope this makes sense to some of you and that I can get some suggestions.
You are giving the full path of the old file, plus a new name. So basically, if file == /home/data/something.fasta, the output file will be file + "_chromosome.fasta", which is /home/data/something.fasta_chromosome.fasta.
If you use os.path.basename on file, you will get the name of the file (i.e. in my example, something.fasta)
From @Adam Smith:
You can use os.path.splitext to get rid of the .fasta
basename, _ = os.path.splitext(os.path.basename(file))
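To make the difference concrete, here is what those calls return for a hypothetical path (illustration only, not from the original answer):
import os

path = "/home/data/something.fasta"
print(os.path.basename(path))                     # 'something.fasta'
print(os.path.splitext(os.path.basename(path)))   # ('something', '.fasta')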
Getting back to the code example, I saw many things that are not recommended in Python. I'll go into details.
Avoid shadowing builtin names, such as list, str, int... It is not explicit and can lead to potential issues later.
When opening a file for reading or writing, you should use the with syntax. This is highly recommended since it takes care of closing the file:
with open(filename, "r") as f:
data = f.read()
with open(new_filename, "w") as f:
f.write(data)
If you have an empty line in your file, line[0] == ... will raise an IndexError exception. Use line.startswith(...) instead.
Final code:
files = ["1.fasta", "2.fasta", "3.fasta"]
for file in files:
with open(file, "r") as input:
for line in input:
if line.startswith(">"):
new_name = os.path.splitext(os.path.basename(file)) + "_chromosome.fasta"
with open(new_name, "w") as output:
output.write(line)
Often, people come to me and say "that's ugly". Not really :). The levels of indentation make it clear which context is which.
I need to add a single line to the top of a text file, and it looks like the only options available to me take more lines of code than I would expect from Python. Something like this:
f = open('filename','r')
temp = f.read()
f.close()
f = open('filename', 'w')
f.write("#testfirstline")
f.write(temp)
f.close()
Is there no easier way? Additionally, I see this two-handle example more often than opening a single handle for reading and writing ('r+') - why is that?
Python makes a lot of things easy and contains libraries and wrappers for a lot of common operations, but the goal is not to hide fundamental truths.
The fundamental truth you are encountering here is that you generally can't prepend data to an existing flat structure without rewriting the entire structure. This is true regardless of language.
There are ways to save a filehandle or make your code less readable, many of which are provided in other answers, but none change the fundamental operation: You must read in the existing file, then write out the data you want to prepend, followed by the existing data you read in.
By all means save yourself the filehandle, but don't go looking to pack this operation into as few lines of code as possible. In fact, never go looking for the fewest lines of code -- that's obfuscation, not programming.
I would stick with separate reads and writes, but we certainly can express each more concisely:
Python 2:
with file('filename', 'r') as original: data = original.read()
with file('filename', 'w') as modified: modified.write("new first line\n" + data)
Python 3:
with open('filename', 'r') as original: data = original.read()
with open('filename', 'w') as modified: modified.write("new first line\n" + data)
Note: the file() function is not available in Python 3.
Another approach:
with open("infile") as f1:
    with open("outfile", "w") as f2:
        f2.write("#test firstline\n")
        for line in f1:
            f2.write(line)
Or a one-liner:
open("outfile", "w").write("#test firstline\n" + open("infile").read())
Thanks for the opportunity to think about this problem :)
Cheers
with open("file", "r+") as f: s = f.read(); f.seek(0); f.write("prepend\n" + s)
You can save one write call with this:
f.write('#testfirstline\n' + temp)
When using 'r+', you would have to rewind the file after reading and before writing.
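For example, a hedged sketch of that 'r+' variant (my illustration, using the question's placeholder 'filename', which must already exist):
with open('filename', 'r+') as f:
    temp = f.read()
    f.seek(0)                            # rewind after reading, before writing
    f.write('#testfirstline\n' + temp)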
Here's a three-liner that I think is clear and flexible. It uses the list.insert function, so if you truly want to prepend to the file, use l.insert(0, 'insert_str'). When I actually did this for a Python module I am developing, I used l.insert(1, 'insert_str') because I wanted to skip the '# -*- coding: utf-8 -*-' string at line 0. Here is the code:
f = open(file_path, 'r'); s = f.read(); f.close()
l = s.splitlines(); l.insert(0, 'insert_str'); s = '\n'.join(l)
f = open(file_path, 'w'); f.write(s); f.close()
This does the job without reading the whole file into memory, though it may not work on Windows:
import os
import shutil

def prepend_line(path, line):
    with open(path, 'r') as old:
        os.unlink(path)
        with open(path, 'w') as new:
            new.write(str(line) + "\n")
            shutil.copyfileobj(old, new)
One possibility is the following:
import os
open('tempfile', 'w').write('#testfirstline\n' + open('filename', 'r').read())
os.rename('tempfile', 'filename')
If you wish to prepend text in the file after a specific string, you can use the function below:
def prepend_text(file, text, after=None):
    ''' Prepend file with given raw text '''
    f_read = open(file, 'r')
    buff = f_read.read()
    f_read.close()
    f_write = open(file, 'w')
    inject_pos = 0
    if after:
        pattern = after
        inject_pos = buff.find(pattern) + len(pattern)
    f_write.write(buff[:inject_pos] + text + buff[inject_pos:])
    f_write.close()
So first you open the file, read it and save it all into one string.
Then we find the character position in the string where the injection will happen. Finally, with a single write and some careful slicing of the string, we rewrite the whole file, now including the injected text.
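A hypothetical usage example of the prepend_text helper above (the file name is invented for illustration):
# Insert at the very top of the file:
prepend_text('notes.txt', 'HEADER\n')

# Insert immediately after the first occurrence of 'TODO':
prepend_text('notes.txt', ' [seen]', after='TODO')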
Am I not seeing something, or couldn't we just use a buffer large enough to read in the input file in parts (instead of the whole content) and, with this buffer, traverse the file while it is open, swapping file and buffer contents as we go?
This seems much more efficient (especially for big files) than reading the whole content into memory, modifying it in memory, and writing it back to the same file or (even worse) a different one. Sorry that I don't have time to implement a sample snippet right now; I'll get back to this later, but maybe you get the idea.
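To make the idea concrete, here is a rough sketch of that buffer-swapping approach (my own illustration written to the description above, not taken from any answer; the helper name and buffer size are arbitrary). It opens the file in 'r+b' mode and alternately reads a chunk ahead and writes the pending bytes behind it, so only about one buffer's worth of data is held in memory at a time:
import os

def prepend_in_place(path, text, bufsize=64 * 1024):
    """Prepend text to a file by shuffling fixed-size buffers."""
    pending = text.encode()                 # bytes waiting to be written
    with open(path, 'r+b') as f:
        f.seek(0, os.SEEK_END)
        remaining = f.tell()                # unread bytes of the original content
        read_pos = 0                        # where the next unread chunk starts
        write_pos = 0                       # where the next pending bytes go
        while pending:
            f.seek(read_pos)
            chunk = f.read(min(remaining, max(len(pending), bufsize)))
            remaining -= len(chunk)
            read_pos += len(chunk)
            f.seek(write_pos)
            f.write(pending)                # safe: this region was already read
            write_pos += len(pending)
            pending = chunk                 # the chunk becomes the next write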
As I suggested in this answer, you can do it using the following:
import fileinput
from pathlib import Path
from typing import Union

def prepend_text(filename: Union[str, Path], text: str):
    with fileinput.input(filename, inplace=True) as file:
        for line in file:
            if file.isfirstline():
                print(text)
            print(line, end="")
If you rewrite it like this:
with open('filename') as f:
    read_data = f.read()

with open('filename', 'w') as f:
    f.write("#testfirstline\n" + read_data)
It's rather short and simple.
For 'r+' the file needs to exist already.
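A quick demonstration of that point (hypothetical missing file; in Python 3 the failure shows up as FileNotFoundError):
try:
    open('does_not_exist.txt', 'r+')
except FileNotFoundError as e:
    print(e)   # 'r+' only opens an existing file; it never creates one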
This worked for me:
def prepend(text, file):
    with open(file, "r") as fr:
        read = fr.read()
    with open(file, "w") as fw:
        fw.write(text + read)
I am new to Python programming...
I have a .txt file. It looks like this:
0,Salary,14000
0,Bonus,5000
0,gift,6000
I want to replace the first '0' value with '1' in each line. How can I do this? Can anyone help me, with sample code?
Thanks in advance.
Nimmyliji
I know that you're asking about Python, but forgive me for suggesting that perhaps a different tool is better for the job. :) It's a one-liner via sed:
sed 's/^0,/1,/' yourtextfile.txt > output.txt
This applies the regex /^0,/ (which matches any 0, that occurs at the beginning of a line) to each line and replaces the matched text with 1, instead. The output is directed into the file output.txt specified.
inFile = open("old.txt", "r")
outFile = open("new.txt", "w")
for line in inFile:
outFile.write(",".join(["1"] + (line.split(","))[1:]))
inFile.close()
outFile.close()
If you would like something more general, take a look at the Python csv module. It contains utilities for processing comma-separated values (abbreviated as CSV) in files, but it can work with an arbitrary delimiter, not only a comma. As your sample is obviously a CSV file, you can use it as follows:
import csv
reader = csv.reader(open("old.txt"))
writer = csv.writer(open("new.txt", "w"))
writer.writerows(["1"] + line[1:] for line in reader)
To overwrite the original file with the new one:
import os
os.remove("old.txt")
os.rename("new.txt", "old.txt")
I think that writing to a new file and then renaming it is more fault-tolerant and less likely to corrupt your data than directly overwriting the source file. Imagine that your program raised an exception after the source file was already read into memory and reopened for writing. You would lose the original data, and your new data wouldn't be saved because of the crash. In my case, you only lose the new data while preserving the original.
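As a side note (my addition, not from the original answer), on Python 3.3+ you can replace the remove/rename pair with a single os.replace call, which overwrites the destination in one step:
import os

# Equivalent to the os.remove() + os.rename() above, done as one operation.
os.replace("new.txt", "old.txt")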
o=open("output.txt","w")
for line in open("file"):
s=line.split(",")
s[0]="1"
o.write(','.join(s))
o.close()
Or you can use fileinput with in-place editing:
import fileinput

for line in fileinput.FileInput("file", inplace=1):
    s = line.split(",")
    s[0] = "1"
    # each line already ends with a newline, so suppress print's own
    print(','.join(s), end='')
f = open(filepath, 'r')
data = f.readlines()
f.close()
edited = []
for line in data:
    edited.append('1' + line[1:])
f = open(filepath, 'w')
f.writelines(edited)
f.flush()
f.close()
Or in Python 2.5+:
with open(filepath, 'r') as f:
    data = f.readlines()
with open(outfilepath, 'w') as f:
    for line in data:
        f.write('1' + line[1:])
This should do it. I wouldn't recommend it for a truly big file though ;-)
What is going on (ex 1):
1: Open the file in read mode
2,3: Read all the lines into a list (each line is a separate index) and close the file.
4,5,6: Iterate over the list, constructing a new list where each line has the first character replaced by a 1. The line[1:] slices the string from index 1 onward. We concatenate the 1 with the truncated line.
7,8,9: Reopen the file in write mode, write the list to the file (overwrite), flush the buffer, and close the file handle.
In Ex. 2:
I use the with statement, which lets the file handle close itself, but I do essentially the same thing.