Insert string at the beginning of each line - python

How can I insert a string at the beginning of each line in a text file? I have the following code:
f = open('./ampo.txt', 'r+')
with open('./ampo.txt') as infile:
    for line in infile:
        f.insert(0, 'EDF ')
f.close
I get the following error:
'file' object has no attribute 'insert'

Python comes with batteries included:
import fileinput
import sys

for line in fileinput.input(['./ampo.txt'], inplace=True):
    sys.stdout.write('EDF {l}'.format(l=line))
Unlike the solutions already posted, this also preserves file permissions.
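If you also want to keep a copy of the original while rewriting in place, fileinput.input() accepts a backup suffix; here is a minimal sketch (the '.bak' suffix is just an example):
import fileinput
import sys

# Rewrite ./ampo.txt in place, keeping the original as ./ampo.txt.bak
for line in fileinput.input(['./ampo.txt'], inplace=True, backup='.bak'):
    sys.stdout.write('EDF ' + line)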

You can't modify a file inplace like that. Files do not support insertion. You have to read it all in and then write it all out again.
You can do this line by line if you wish. But in that case you need to write to a temporary file and then replace the original. So, for small enough files, it is just simpler to do it in one go like this:
with open('./ampo.txt', 'r') as f:
    lines = f.readlines()

lines = ['EDF ' + line for line in lines]

with open('./ampo.txt', 'w') as f:
    f.writelines(lines)

Here's a solution where you write to a temporary file and move it into place. You might prefer this version if the file you are rewriting is very large, since it avoids keeping the contents of the file in memory, as versions that involve .read() or .readlines() will. In addition, if there is any error in reading or writing, your original file will be safe:
from shutil import move
from tempfile import NamedTemporaryFile

filename = './ampo.txt'
tmp = NamedTemporaryFile(delete=False)

with open(filename) as finput:
    with open(tmp.name, 'w') as ftmp:
        for line in finput:
            ftmp.write('EDF ' + line)

move(tmp.name, filename)

For a file not too big:
with open('./ampo.txt', 'rb+') as f:
    x = f.read()
    f.seek(0, 0)
    f.writelines(('EDF ', x.replace('\n', '\nEDF ')))
    f.truncate()
Note that, in this case (the content gets longer), the f.truncate() is not strictly necessary: we write back at least as many bytes as we read, so nothing of the old content can be left past the new end.
truncate() really matters when the rewritten content is shorter than the original, because closing the file does not shrink it; without truncate(), the trailing bytes of the old content would remain after the new end.
Keeping the call is a cheap safeguard either way.
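A quick way to see why truncate() matters when the new content is shorter (this demo assumes a throwaway file named demo.txt):
# Write a 10-character file, then overwrite it with shorter content.
with open('demo.txt', 'w') as f:
    f.write('0123456789')

with open('demo.txt', 'r+') as f:
    f.write('abc')              # no truncate(): the old tail survives
print(open('demo.txt').read())  # abc3456789

with open('demo.txt', 'r+') as f:
    f.write('abc')
    f.truncate()                # cut the file at the current position
print(open('demo.txt').read())  # abc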
For a big file, to avoid putting the whole content of the file in RAM at once:
import os

def addsomething(filepath, ss):
    if filepath.rfind('.') > filepath.rfind(os.sep):
        a, _, c = filepath.rpartition('.')
        tempi = a + 'temp.' + c
    else:
        tempi = filepath + 'temp'
    with open(filepath, 'rb') as f, open(tempi, 'wb') as g:
        g.writelines(ss + line for line in f)
    os.remove(filepath)
    os.rename(tempi, filepath)

addsomething('./ampo.txt', 'WZE')

f = open('./ampo.txt', 'r')
lines = ['EDF ' + l for l in f.readlines()]
f.close()
f = open('./ampo.txt', 'w')
f.writelines(lines)
f.close()

Related

Adding a comma to end of first row of csv files within a directory using python

I've got some code that lets me open all CSV files in a directory and run through them, removing the top two lines of each file. Ideally, during this process, I would also like it to add a single comma at the end of the new first line (what would originally have been line 3).
Another possible approach would be to remove the trailing commas on all other rows that appear in each of the CSVs.
Any thoughts or approaches would be gratefully received.
import glob
path='P:\pytest'
for filename in glob.iglob(path+'/*.csv'):
    with open(filename, 'r') as f:
        lines = f.read().split("\n")
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
        o = open(filename, 'w')
        for line in lines:
            o.write(line+'\n')
        o.close()
Adding a counter in there can solve this:
import glob
path=r'C:/Users/dsqallihoussaini/Desktop/dev_projects/stack_over_flow'
for filename in glob.iglob(path+'/*.csv'):
    with open(filename, 'r') as f:
        lines = f.read().split("\n")
        print(lines)
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
        o = open(filename, 'w')
        counter=0
        for line in lines:
            counter=counter+1
            if counter==1:
                o.write(line+',\n')
            else:
                o.write(line+'\n')
        o.close()
One possible problem with your code is that you are reading the whole file into memory, which might be fine. If you are reading larger files, then you want to process the file line by line.
The easiest way to do that is to use the fileinput module: https://docs.python.org/3/library/fileinput.html
Something like the following should work:
#!/usr/bin/env python3
import glob
import fileinput

# inplace makes a backup of the file, then any output to stdout is written
# to the current file.
# change the glob..below is just an example.
#
# Iterate through each file in the glob.iglob() results
with fileinput.input(files=glob.iglob('*.csv'), inplace=True) as f:
    for line in f:  # Iterate over each line of the current file.
        if f.filelineno() > 2:  # Skip the first two lines
            # Note: 'line' has the newline in it.
            # Insert the comma if line 3 of the file, otherwise output original line
            print(line[:-1]+',') if f.filelineno() == 3 else print(line, end="")
I've added some encoding as well, as mine was throwing an error, but encoding fixed that up nicely:
import glob
path=r'C:/whateveryourfolderis'
for filename in glob.iglob(path+'/*.csv'):
    with open(filename, 'r',encoding='utf-8') as f:
        lines = f.read().split("\n")
        #print(lines)
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
        o = open(filename, 'w',encoding='utf-8')
        counter=0
        for line in lines:
            counter=counter+1
            if counter==1:
                o.write(line+',\n')
            else:
                o.write(line+'\n')
        o.close()
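The alternative mentioned in the question, removing the trailing commas from all the other rows instead of adding one to the first row, could look roughly like this (a hedged sketch reusing the same glob pattern; it strips one trailing comma, if present, from every line after the new first one):
import glob

path = r'C:/whateveryourfolderis'
for filename in glob.iglob(path + '/*.csv'):
    with open(filename, 'r', encoding='utf-8') as f:
        lines = f.read().split("\n")
    lines = lines[2:]                      # drop the first two lines as before
    with open(filename, 'w', encoding='utf-8') as o:
        for i, line in enumerate(lines):
            if i > 0 and line.endswith(','):
                line = line[:-1]           # strip the trailing comma on later rows
            o.write(line + '\n')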

reading .txt file in python

I have a problem with some code in Python. I want to read a .txt file. I use this code:
f = open('test.txt', 'r') # We need to re-open the file
data = f.read()
print(data)
I would like to read ONLY the first line from this .txt file. I use
f = open('test.txt', 'r') # We need to re-open the file
data = f.readline(1)
print(data)
But I am seeing that on screen only the first letter of the line shows.
Could you help me read all the letters of the line? (I mean, read the whole line of the .txt file.)
with open("file.txt") as f:
    print(f.readline())
This will open the file using a with context block (which will close the file automatically when we are done with it) and read the first line. This is the same as:
f = open("file.txt")
print(f.readline())
f.close()
Your attempt with f.readline(1) won't work because the argument is the maximum number of characters to read, so it will only return the first character.
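A quick way to see the difference (assuming test.txt starts with the line "hello world"):
with open("test.txt") as f:
    print(f.readline(1))   # h            -- at most one character
with open("test.txt") as f:
    print(f.readline())    # hello world  -- the whole first line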
Second method:
with open("file.txt") as f:
    print(f.readlines()[0])
Or you could do the above, which reads a list of all the lines and prints only the first one.
To read the fifth line, use
with open("file.txt") as f:
    print(f.readlines()[4])
Or:
with open("file.txt") as f:
    lines = []
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    print(lines[-1])
The -1 index refers to the last item of the list (here, the fifth line read).
Learn more:
with statement
files in python
readline method
Your first try is almost there; you should have done the following:
f = open('my_file.txt', 'r')
line = f.readline()
print(line)
f.close()
A safer approach to reading a file is:
with open('my_file.txt', 'r') as f:
    print(f.readline())
Both ways will print only the first line.
Your error was that you passed 1 to readline, which means you want to read a size of 1, i.e. only a single character. Please refer to https://www.w3schools.com/python/ref_file_readline.asp
After your suggestions, I tried this and it works:
f = open('test.txt', 'r')
data = f.readlines()[1]
print(data)
Use with open(...) instead:
with open("test.txt") as file:
    line = file.readline()
    print(line)
Call f.readline() without parameters.
It will return the first line as a string and move the cursor to the second line.
The next time you call f.readline() it will return the second line and move the cursor to the next one, and so on.
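A minimal sketch of that behaviour (again assuming test.txt exists and has at least two lines):
with open('test.txt') as f:
    first = f.readline()    # reads line 1; the cursor is now at line 2
    second = f.readline()   # reads line 2
    print(first, end='')
    print(second, end='')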

Writing a new text file in python

I'm writing code that goes over a text file, counting how many words are in every line, and I'm having trouble putting the result (many lines, each consisting of a number) into a new text file.
My code:
in_file = open("our_input.txt")
out_file = open("output.txt", "w")
for line in in_file:
    line = (str(line)).split()
    x = (len(line))
    x = str(x)
    out_file.write(x)
in_file.close()
out_file.close()
But the file I'm getting has all the numbers together on one line.
How do I separate them in the file I'm making?
You need to add a newline after each line:
out_file.write(x + '\n')
Also, as a more Pythonic way of dealing with files, you can use a with statement to open the files, which will close them at the end of the block.
And instead of multiple assignments and converting the length to a string, you can use the str.format() method to do all of this in one line:
with open("our_input.txt") as in_file, open("output.txt", "w") as out_file:
    for line in in_file:
        out_file.write('{}\n'.format(len(line.split())))
Add a newline to the file while writing:
in_file = open("our_input.txt")
out_file = open("output.txt", "w")
for line in in_file:
    line = (str(line)).split()
    x = (len(line))
    x = str(x)
    out_file.write(x)
    # Write newline
    out_file.write('\n')
in_file.close()
out_file.close()
As the previous answers have pointed out, you need to write a newline to separate the output.
Here is yet another way to write the code:
with open("our_input.txt") as in_file, open("output.txt", "w") as out_file:
    res = map(lambda line: len(line.split()), in_file)
    for r in res:
        out_file.write('%d\n' % r)

Take lines from two files, output into same line- python

I am attempting to open two files, then take the first line in the first file, write it to an output file, then take the first line in the second file and append it to the same line in the output file, separated by a tab.
I've attempted to code this, and my outfile just ends up being the whole contents of the first file, followed by the entire contents of the second file. I included print statements just because I wanted to see something going on in the terminal while the script was running; that is why they are there. Any ideas?
import sys
InFileName = sys.argv[1]
InFile = open(InFileName, 'r')
InFileName2 = sys.argv[2]
InFile2 = open(InFileName2, 'r')
OutFileName = "combined_data.txt"
OutFile = open(OutFileName, 'a')
for line in InFile:
    OutFile.write(str(line) + '\t')
    print line
for line2 in InFile2:
    OutFile.write(str(line2) + '\n')
    print line
InFile.close()
InFile2.close()
OutFile.close()
You can use zip for this:
with open(file1) as f1, open(file2) as f2, open("combined_data.txt", "w") as fout:
    for t in zip(f1, f2):
        fout.write('\t'.join(x.strip() for x in t) + '\n')
In the case where your two files don't have the same number of lines (or if they're REALLY BIG), you could use itertools.izip_longest(f1,f2,fillvalue='')
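In Python 3 that helper is spelled itertools.zip_longest; a minimal sketch, assuming the same file1 and file2 names as above:
from itertools import zip_longest

with open(file1) as f1, open(file2) as f2, open("combined_data.txt", "w") as fout:
    for t in zip_longest(f1, f2, fillvalue=''):
        fout.write('\t'.join(x.strip() for x in t) + '\n')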
Perhaps this gives you a few ideas:
Adding entries from multiple files in python
o = open('output.txt', 'wb')
fh = open('input.txt', 'rb')
fh2 = open('input2.txt', 'rb')
for line in fh.readlines():
    o.write(line.strip('\r\n') + '\t' + fh2.readline().strip('\r\n') + '\n')
## If you want to write the remaining lines from input2.txt:
# for line in fh2.readlines():
#     o.write(line.rstrip('\r\n') + '\n')
fh.close()
fh2.close()
o.close()
This will give you:
line1_of_file_1 line1_of_file_2
line2_of_file_1 line2_of_file_2
line3_of_file_1 line3_of_file_2
line4_of_file_1 line4_of_file_2
Where the space in my output example is a [tab]
Note: no line ending is appended to the file for obvious reasons.
For this to work, the line endings would need to be consistent in both file 1 and file 2.
To check this:
print 'File 1:'
f = open('input.txt', 'rb')
print [f.read()[:200]]
f.close()
print 'File 2:'
f = open('input2.txt', 'rb')
print [f.read()[:200]]
f.close()
This should give you something like
File 1:
['This is\ta lot of\t text\r\nWith a few line\r\nendings\r\n']
File 2:
['Give\r\nMe\r\nSome\r\nLove\r\n']

Using python, how to read a file starting at the seventh line?

I have a text file structured as:
date
downland
user
date data1 date2
201102 foo bar 200 50
201101 foo bar 300 35
So the first six lines of the file are not needed. Filename: dnw.txt
f = open('dwn.txt', 'rb')
How do I "split" this file starting at line 7 to EOF?
with open('dwn.txt') as f:
    for i in xrange(6):
        f.next()
    for line in f:
        process(line)
Update: use next(f) for python 3.x.
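For Python 3 that update looks like this (a minimal sketch; process() stands in for whatever you do with each line):
with open('dwn.txt') as f:
    for i in range(6):
        next(f)           # skip the first six lines
    for line in f:
        process(line)     # process() is your own handling function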
Itertools answer!
from itertools import islice
with open('foo') as f:
    for line in islice(f, 6, None):
        print line
Python 3:
with open("file.txt", "r") as f:
    for i in range(6):
        f.readline()
    for line in f:
        # process lines 7 to the end
        pass
with open('test.txt', 'r') as fo:
    for i in xrange(6):
        fo.next()
    for line in fo:
        print "%s" % line.strip()
In fact, to answer precisely the question as it was written, "How do I 'split' this file starting at line 7 to EOF?", you can do the following.
In case the file is not big:
with open('dwn.txt', 'rb+') as f:
    for i in xrange(6):
        print f.readline()
    content = f.read()
    f.seek(0, 0)
    f.write(content)
    f.truncate()
In case the file is very big:
with open('dwn.txt', 'rb+') as ahead, open('dwn.txt', 'rb+') as back:
    for i in xrange(6):
        print ahead.readline()
    x = 100000
    chunk = ahead.read(x)
    while chunk:
        print repr(chunk)
        back.write(chunk)
        chunk = ahead.read(x)
    back.truncate()
The truncate() call is essential to produce the EOF you asked for, i.e. to cut the file at its new end. Without executing truncate(), a tail of the old content, as long as the six removed lines, would remain at the end of the file.
The file must be opened in binary mode to prevent any such problem.
When Python reads '\r\n' in text mode, it transforms it into '\n' (that's universal newline support, enabled by default); that is to say, the chunk strings contain only '\n' even if the file contained '\r\n'.
If the file is of Macintosh origin, it contains only CR = '\r' newlines before the treatment, but they will be changed to '\n' or '\r\n' (according to the platform) during the rewriting on a non-Macintosh machine.
If it is a file of Linux origin, it contains only LF = '\n' newlines, which, on Windows, will be changed to '\r\n' (I don't know about a Linux file processed on a Macintosh).
The reason is that Windows writes '\r\n' whatever it is asked to write: '\n', '\r' or '\r\n'. Consequently, more characters would be rewritten than were read, the offset between the file pointers ahead and back would shrink, and the rewriting would become a mess.
In HTML sources, there are also various kinds of newlines.
That's why it's always preferable to open files in binary mode when they are processed this way.
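A small check of that translation, using a throwaway file written with Windows-style line endings (the file name is just an example):
# Write raw CRLF bytes, then read them back in text mode and in binary mode.
with open('demo_crlf.txt', 'wb') as f:
    f.write(b'line1\r\nline2\r\n')

with open('demo_crlf.txt', 'r') as f:
    print(repr(f.read()))   # 'line1\nline2\n'  -- text mode translates the newlines

with open('demo_crlf.txt', 'rb') as f:
    print(repr(f.read()))   # b'line1\r\nline2\r\n'  -- binary mode gives the raw bytes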
Alternative version
You can use read() directly if you know the character position pos of the linebreak (e.g. a '\n') separating the header from the part of interest, i.e. the point at which you want to split your input text:
with open('input.txt', 'r') as txt_in:
    txt_in.seek(pos)
    second_half = txt_in.read()
If you are interested in both halves, you could also investigate the following method:
with open('input.txt', 'r') as txt_in:
    all_contents = txt_in.read()
    first_half = all_contents[:pos]
    second_half = all_contents[pos:]
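If you don't already know pos, one hedged way to obtain it (assuming a plain text file read with the same default settings) is to read the six header lines once and record the position with tell():
with open('input.txt', 'r') as f:
    for _ in range(6):
        f.readline()
    pos = f.tell()   # position just past the sixth line; line 7 starts here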
You can read the entire file into an array/list and then just start at the index appropriate to the line you wish to start reading at.
f = open('dwn.txt', 'rb')
fileAsList = f.readlines()
fileAsList[0] #first line
fileAsList[1] #second line
#!/usr/bin/python
with open('dnw.txt', 'r') as f:
    lines_7_through_end = f.readlines()[6:]
print "Lines 7+:"
i = 7
for line in lines_7_through_end:
    print " Line %s: %s" % (i, line)
    i += 1
Prints:
Lines 7+:
Line 7: 201102 foo bar 200 50
Line 8: 201101 foo bar 300 35
Edit:
To rebuild dwn.txt without the first six lines, do this after the above code:
with open('dnw.txt', 'w') as f:
    for line in lines_7_through_end:
        f.write(line)
I have created a script used to cut an Apache access.log file several times a day.
It's not the original topic of the question, but I think it can be useful if you want to store the file cursor position after reading the first six lines.
I needed to set the cursor to the last line parsed during the previous execution.
To this end, I used the file.seek() and file.tell() methods, which allow storing the cursor position in a file.
My code:
import os

ENCODING = "utf8"
CURRENT_FILE_DIR = os.path.dirname(os.path.abspath(__file__))

# This file is used to store the last cursor position
cursor_position = os.path.join(CURRENT_FILE_DIR, "access_cursor_position.log")

# Log file with new lines
log_file_to_cut = os.path.join(CURRENT_FILE_DIR, "access.log")
cut_file = os.path.join(CURRENT_FILE_DIR, "cut_access", "cut.log")

# Set in from_line
from_position = 0
try:
    with open(cursor_position, "r", encoding=ENCODING) as f:
        from_position = int(f.read())
except Exception as e:
    pass

# We read log_file_to_cut to put new lines in cut_file
with open(log_file_to_cut, "r", encoding=ENCODING) as f:
    with open(cut_file, "w", encoding=ENCODING) as fw:
        # We set cursor to the last position used (during last run of script)
        f.seek(from_position)
        for line in f:
            fw.write("%s" % (line))

    # We save the last position of cursor for next usage
    with open(cursor_position, "w", encoding=ENCODING) as fw:
        fw.write(str(f.tell()))
Just do f.readline() six times. Ignore the returned value.
Solutions based on readlines() are not satisfactory in my opinion, because readlines() reads the entire file. The user then has to iterate over the lines again (in the file or in the produced list) to process what he wants, while it could have been done without reading the interesting lines twice. Moreover, if the file is big, the memory is burdened by the file's entire content, while a for line in file loop would have been much lighter.
The repetition of readline() can be done like this:
nb = 6
exec(nb * 'f.readline()\n')
It's a short piece of code, and nb is programmatically adjustable.
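A plain loop achieves the same repetition without exec, and nb stays just as adjustable (assuming f is the already opened file object):
nb = 6
for _ in range(nb):
    f.readline()   # read and discard one header line per iteration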
