Writing a list to a file with Python, with newlines - python

How do I write a list to a file? writelines() doesn't insert newline characters, so I need to do:
f.writelines([f"{line}\n" for line in lines])

Use a loop:
with open('your_file.txt', 'w') as f:
    for line in lines:
        f.write(f"{line}\n")
For Python <3.6:
with open('your_file.txt', 'w') as f:
    for line in lines:
        f.write("%s\n" % line)
For Python 2, one may also use:
with open('your_file.txt', 'w') as f:
    for line in lines:
        print >> f, line
If you're keen on a single function call, at least remove the square brackets [], so that the strings to be printed get made one at a time (a genexp rather than a listcomp) -- no reason to take up all the memory required to materialize the whole list of strings.
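A minimal sketch of that single-call form, assuming a small placeholder list and a hypothetical output path in a temporary directory (neither comes from the question):

```python
import os
import tempfile

lines = ["alpha", "beta", "gamma"]  # placeholder data

# Hypothetical output path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "out.txt")

with open(path, "w") as f:
    # Generator expression: each string is built only when writelines() asks for it.
    f.writelines(f"{line}\n" for line in lines)

# Read the file back to check what was written.
with open(path) as f:
    written = f.read()
```

The only difference from the question's version is the missing pair of square brackets, so no intermediate list of strings is created.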

What are you going to do with the file? Does this file exist for humans, or other programs with clear interoperability requirements?
If you are just trying to serialize a list to disk for later use by the same Python app, you should be pickling the list.
import pickle
with open('outfile', 'wb') as fp:
    pickle.dump(itemlist, fp)
To read it back:
with open('outfile', 'rb') as fp:
    itemlist = pickle.load(fp)

Simpler is:
with open("outfile", "w") as outfile:
    outfile.write("\n".join(itemlist))
To ensure that all items in the item list are strings, use a generator expression:
with open("outfile", "w") as outfile:
    outfile.write("\n".join(str(item) for item in itemlist))
Remember that join builds the entire output string in memory before it is written, so watch memory consumption for very large lists.

Using Python 3 and Python 2.6+ syntax:
with open(filepath, 'w') as file_handler:
    for item in the_list:
        file_handler.write("{}\n".format(item))
This is platform-independent. It also terminates the final line with a newline character, which is a UNIX best practice.
Starting with Python 3.6, "{}\n".format(item) can be replaced with an f-string: f"{item}\n".
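To make the point about the final newline concrete, here is a small sketch comparing the per-item loop with the "\n".join approach used in other answers. The io.StringIO buffers stand in for real files and the item list is a placeholder; both are assumptions for illustration only.

```python
import io

items = ["a", "b", "c"]  # placeholder data

# join puts the separator only BETWEEN items, so the last line is unterminated.
joined = io.StringIO()
joined.write("\n".join(items))

# The loop writes a newline AFTER every item, including the last one.
looped = io.StringIO()
for item in items:
    looped.write("{}\n".format(item))

joined_text = joined.getvalue()
looped_text = looped.getvalue()
```

Here joined_text ends with the last item, while looped_text ends with a newline.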

Yet another way. Serialize to json using simplejson (included as json in python 2.6):
>>> import simplejson
>>> f = open('output.txt', 'w')
>>> simplejson.dump([1,2,3,4], f)
>>> f.close()
If you examine output.txt:
[1, 2, 3, 4]
This is useful because the syntax is pythonic, it's human readable, and it can be read by other programs in other languages.

I thought it would be interesting to explore the benefits of using a genexp, so here's my take.
The example in the question uses square brackets to create a temporary list, and so is equivalent to:
file.writelines( list( "%s\n" % item for item in list ) )
This needlessly constructs a temporary list of all the lines that will be written out, which may consume significant amounts of memory depending on the size of your list and how verbose the output of str(item) is.
Dropping the square brackets (equivalent to removing the wrapping list() call above) instead passes a temporary generator to file.writelines():
file.writelines( "%s\n" % item for item in list )
This generator creates the newline-terminated representation of each item on demand (i.e. as they are written out). This is nice for a couple of reasons:
Memory overheads are small, even for very large lists
If str(item) is slow there's visible progress in the file as each item is processed
This avoids memory issues, such as:
In [1]: import os
In [2]: f = file(os.devnull, "w")
In [3]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
1 loops, best of 3: 385 ms per loop
In [4]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
ERROR: Internal Python error in the inspect module.
Below is the traceback from this internal error.
Traceback (most recent call last):
...
MemoryError
(I triggered this error by limiting Python's max. virtual memory to ~100MB with ulimit -v 102400).
Putting memory usage to one side, this method isn't actually any faster than the original:
In [4]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
1 loops, best of 3: 370 ms per loop
In [5]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
1 loops, best of 3: 360 ms per loop
(Python 2.6.2 on Linux)
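The session above is Python 2. As a rough Python 3 sketch of the same comparison, the standard-library tracemalloc module can report peak allocations for each variant. The helper name peak_bytes and the size n are assumptions chosen for illustration, and the absolute numbers will vary by machine and Python version.

```python
import os
import tracemalloc

def peak_bytes(make_lines):
    """Return the peak traced memory while writing make_lines() to os.devnull."""
    with open(os.devnull, "w") as sink:
        tracemalloc.start()
        sink.writelines(make_lines())
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return peak

n = 200_000  # arbitrary size chosen for the illustration

# List comprehension: every string exists in memory before writing starts.
list_peak = peak_bytes(lambda: ["%s\n" % i for i in range(n)])

# Generator expression: strings are produced one at a time as they are written.
gen_peak = peak_bytes(lambda: ("%s\n" % i for i in range(n)))

print(list_peak, gen_peak)
```

The list variant should show a much higher peak, since it holds all n strings at once.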

Because I'm lazy...
import json
a = [1,2,3]
with open('test.txt', 'w') as f:
    f.write(json.dumps(a))
# Now read the file back into a Python list object
with open('test.txt', 'r') as f:
    a = json.loads(f.read())

Serialize the list into a text file as comma-separated values:
mylist = dir()
with open('filename.txt','w') as f:
    f.write( ','.join( mylist ) )

In Python 3 you can use print and * for argument unpacking:
with open("fout.txt", "w") as fout:
    print(*my_list, sep="\n", file=fout)

Simply:
with open("text.txt", 'w') as file:
    file.write('\n'.join(yourList))

In general
The syntax of the writelines() method is:
fileObject.writelines( sequence )
Example
#!/usr/bin/python
# Open a file for appending ("rw+" is rejected by Python 3's open())
fo = open("foo.txt", "a")
seq = ["This is 6th line\n", "This is 7th line"]
# Write sequence of lines at the end of the file.
fo.writelines( seq )
# Close the opened file
fo.close()
Reference
http://www.tutorialspoint.com/python/file_writelines.htm

file.write('\n'.join(list))

Using numpy.savetxt is also an option:
import numpy as np
np.savetxt('list.txt', list, delimiter="\n", fmt="%s")

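You can also use the print function if you're on Python 3, as follows. Open the file in text mode ("w"), since print writes str rather than bytes. Note that this writes the list's repr on a single line.
with open("myfile.txt", "w") as f:
    print(mylist, file=f)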

with open("test.txt", "w") as fp:
    for line in list12:
        fp.write(line+"\n")

Why don't you try
file.write(str(list))

I recently found Path to be useful. Helps me get around having to with open('file') as f and then writing to the file. Hope this becomes useful to someone :).
from pathlib import Path
import json
a = [[1,2,3],[4,5,6]]
# write
Path("file.json").write_text(json.dumps(a))
# read
json.loads(Path("file.json").read_text())

You can also do the following.
Example:
my_list=[1,2,3,4,5,"abc","def"]
with open('your_file.txt', 'w') as file:
    for item in my_list:
        file.write("%s\n" % item)
Output:
In your_file.txt items are saved like:
1
2
3
4
5
abc
def
Your script also saves the items as shown above.
Otherwise, you can use pickle:
import pickle
my_list=[1,2,3,4,5,"abc","def"]
#to write
with open('your_file.txt', 'wb') as file:
    pickle.dump(my_list, file)
#to read
with open('your_file.txt', 'rb') as file:
    Outlist = pickle.load(file)
print(Outlist)
Output:
[1, 2, 3, 4, 5, 'abc', 'def']
pickle dumps the list as a list, so loading it gives back the same list.
The same result is also possible with simplejson:
import simplejson as sj
my_list=[1,2,3,4,5,"abc","def"]
#To write
with open('your_file.txt', 'w') as file:
    sj.dump(my_list, file)
#To read
with open('your_file.txt', 'r') as file:
    mlist=sj.load(file)
print(mlist)

This logic first converts each item in the list to a string (str). Sometimes the list contains tuples, like
alist = [(112, 'tiger'),
         (113, 'lion')]
This logic writes each tuple to the file on its own line. We can later use eval on each line when reading the file back:
outfile = open('outfile.txt', 'w') # open a file in write mode
for item in list_to_persistence: # iterate over the list items
    outfile.write(str(item) + '\n') # write to the file
outfile.close() # close the file
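For the reading side, the answer mentions eval. A more restrictive alternative from the standard library is ast.literal_eval, which is swapped in here rather than being the answer's own method. In this sketch the sample tuples are placeholders and an io.StringIO buffer stands in for the real file.

```python
import ast
import io

records = [(112, "tiger"), (113, "lion")]  # placeholder data

buffer = io.StringIO()  # stands in for a real file on disk
for item in records:
    buffer.write(str(item) + "\n")  # one repr per line, as in the answer

buffer.seek(0)
# literal_eval only accepts Python literals (tuples, numbers, strings, ...),
# so it does not execute arbitrary code the way eval can.
loaded = [ast.literal_eval(line.strip()) for line in buffer if line.strip()]
```

This round trip only works for items whose str() output is a valid Python literal.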

Another way of iterating and adding newline:
for item in items:
    filewriter.write(f"{item}" + "\n")

In Python 3 you can use this loop (file objects have no print method, so pass the file to the print function instead):
with open('your_file.txt', 'w') as f:
    for item in my_list:
        print(item, file=f)

Redirecting stdout to a file might also be useful for this purpose:
from contextlib import redirect_stdout
with open('test.txt', 'w') as f:
    with redirect_stdout(f):
        for item in mylst:
            print(item)

I suggest this solution:
with open('your_file.txt', 'w') as f:
    list(map(lambda item : f.write("%s\n" % item),my_list))

Let avg be the list, then (with NumPy imported as n):
In [29]: a = n.array(avg)
In [31]: a.tofile('avgpoints.dat', sep='\n', format='%f')
You can use %e or %s instead, depending on your requirement.

I think you are looking for an answer like this:
f = open('output.txt','w')
values = [3, 15.2123, 118.3432, 98.2276, 118.0043]
f.write('a= {:>3d}, b= {:>8.4f}, c= {:>8.4f}, d= {:>8.4f}, e= {:>8.4f}\n'.format(*values))
f.close()

poem = '''\
Programming is fun
When the work is done
if you wanna make your work also fun:
use Python!
'''
f = open('poem.txt', 'w') # open for 'w'riting
f.write(poem) # write text to file
f.close() # close the file
How It Works:
First, open a file by using the built-in open function and specifying the name of the file and the mode in which we want to open the file. The mode can be a read mode ('r'), write mode ('w') or append mode ('a'). We can also specify whether we are reading, writing, or appending in text mode ('t') or binary mode ('b'). There are actually many more modes available and help(open) will give you more details about them. By default, open() considers the file to be a 't'ext file and opens it in 'r'ead mode.
In our example, we first open the file in write text mode and use the write method of the file object to write to the file and then we finally close the file.
The above example is from the book "A Byte of Python" by Swaroop C H.
swaroopch.com
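As a small illustration of the modes described in that passage (this sketch is not from the book; the file name and contents are placeholders written to a temporary directory):

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "w") as f:   # 'w' creates the file or truncates an existing one
    f.write("first\n")

with open(path, "a") as f:   # 'a' appends to the existing content
    f.write("second\n")

with open(path) as f:        # default mode is 'rt': read, text
    content = f.read()

with open(path, "rb") as f:  # 'b' returns bytes instead of str
    raw = f.read()
```

After this runs, content holds both lines as a str and raw holds the same data as bytes.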

Related

Improving the speed of a python script

I have an input file containing a list of strings.
I am iterating through every fourth line starting on line two.
From each of these lines I make a new string from the first and last 6 characters and put this in an output file only if that new string is unique.
The code I wrote to do this works, but I am working with very large deep sequencing files, and it has been running for a day without making much progress. So I'm looking for any suggestions to make this much faster if possible. Thanks.
def method():
    target = open(output_file, 'w')
    with open(input_file, 'r') as f:
        lineCharsList = []
        for line in f:
            #Make string from first and last 6 characters of a line
            lineChars = line[0:6]+line[145:151]
            if not (lineChars in lineCharsList):
                lineCharsList.append(lineChars)
                target.write(lineChars + '\n') #If string is unique, write to output file
            for skip in range(3): #Used to step through four lines at a time
                try:
                    check = line #Check for additional lines in file
                    next(f)
                except StopIteration:
                    break
    target.close()
Try defining lineCharsList as a set instead of a list:
lineCharsList = set()
...
lineCharsList.add(lineChars)
That'll improve the performance of the in operator. Also, if memory isn't a problem at all, you might want to accumulate all the output in a list and write it all at the end, instead of performing multiple write() operations.
You can use https://docs.python.org/2/library/itertools.html#itertools.islice:
import itertools
def method():
    with open(input_file, 'r') as inf, open(output_file, 'w') as ouf:
        seen = set()
        for line in itertools.islice(inf, None, None, 4):
            line = line.rstrip('\n')  # drop the newline so line[-6:] is the last 6 characters
            s = line[:6]+line[-6:]
            if s not in seen:
                seen.add(s)
                ouf.write("{}\n".format(s))
Besides using set as Oscar suggested, you can also use islice to skip lines rather than use a for loop.
As stated in this post, islice preprocesses the iterator in C, so it should be much faster than using a plain vanilla python for loop.
Try replacing
lineChars = line[0:6]+line[145:151]
with
lineChars = ''.join([line[0:6], line[145:151]])
as it can be more efficient, depending on the circumstances.

Sorting a file in ascending order

I was given a file with one name per line in random order myInput01.txt and I need to order it in ascending order and output the ordered names one name per line to a file named myOutput01.txt.
myhandle = open('myInput01.txt', 'r')
aLine = myhandle.readlines()
sorted(aLine)
aLine = myOutput01.txt
print myOutput01.txt
For future visitors, the easiest and most concise way of doing this in Python (assuming a sort isn't going to blow your system memory) is:
with open('myInput01.txt') as fin, open('myOutput01.txt', 'w') as fout:
    fout.writelines(sorted(fin))
So, this part is ok:
myhandle = open('myInput01.txt', 'r')
aLine = myhandle.readlines()
You open a file (get a file handler in myhandle) and read its lines into aLine.
Now, there's a problem with:
sorted(aLine)
The sorted function doesn't do anything to the aLine argument. It returns a sorted new list. So it's either you use aLine.sort() to sort in-place or assign the output of the sorted function to another variable:
sorted_lines = sorted(aLine)
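The difference between the two options can be seen in a short sketch. The names and sample lines here are placeholders, shaped like what readlines() would return:

```python
names = ["Mia\n", "Ann\n", "Zoe\n"]  # placeholder lines, as readlines() would return

result = sorted(names)   # returns a new sorted list; names itself is untouched
unchanged = list(names)  # copy taken to show that names still has its original order

names.sort()             # sorts the same list in place and returns None
```

After sorted(), the original list keeps its order; after .sort(), it matches result.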
Take a look at this sorting tutorial.
Also, these two lines are very problematic:
aLine = myOutput01.txt
print myOutput01.txt
You're overwriting your aLine variable with something called myOutput01.txt, which is unknown to the script (what is it? where is it defined?). You need to proceed in a similar way as to read a file. You need to open a handler and write stuff to the file using that handler as a reference.
You need:
mywritehandle = open('myOutput01.txt', 'w')
mywritehandle.writelines(sorted_lines)
mywritehandle.close()
Or, to avoid having to call close() explicitly:
with open('myOutput01.txt', 'w') as mywritehandle:
    mywritehandle.writelines(sorted_lines)
You should familiarize yourself with file objects and be aware that myOutput01.txt is very different to "myOutput01.txt".
outputFile = open('myOutput01.txt','w')
inputFile = open('myInput01.txt','r')
content = inputFile.readlines()
for name in sorted(content):
    # lines from readlines() keep their '\n', so strip it before adding one
    outputFile.write(name.rstrip('\n') + '\n')
inputFile.close()
outputFile.close()

writing a list to a txt file in python

list consists of RANDOM strings inside it
#example
list = [1,2,3,4]
filename = ('output.txt')
outfile = open(filename, 'w')
outfile.writelines(list)
outfile.close()
my result in the file
1234
so now how do I make the program produce the result that I want which is:
1
2
3
4
myList = [1,2,3,4]
with open('path/to/output', 'w') as outfile:
    outfile.write('\n'.join(str(i) for i in myList))
By the way, the list that you have in your post contains ints, not strings.
Also, please NEVER name your variables list or dict or after any other built-in type, since doing so shadows the built-in.
writelines() needs a list of strings with line separators appended to them but your code is only giving it a list of integers. To make it work you'd need to use something like this:
some_list = [1,2,3,4]
filename = 'output.txt'
outfile = open(filename, 'w')
outfile.writelines([str(i)+'\n' for i in some_list])
outfile.close()
In Python file objects are context managers which means they can be used with a with statement so you could do the same thing a little more succinctly with the following which will close the file automatically. It also uses a generator expression (enclosed in parentheses instead of brackets) to do the string conversion since doing that avoids the need to build a temporary list just to pass to the function.
with open(filename, 'w') as outfile:
    outfile.writelines((str(i)+'\n' for i in some_list))

create standard compliant file

I have a comma delimited file. The lines look like this...
1,2,3,4,5
6,7,8
9,10
11,12,13,14,15
I need to have exactly 5 columns across all lines. So the new file will be...
1,2,3,4,5
6,7,8,,
9,10,,,
11,12,13,14,15
In other words, if there are fewer than 4 commas in a line, add the required number to the end. I was told that there is a Python module that does exactly this. Where can I find such a module? Is awk better suited for this type of task?
The module you are looking for is the csv module. You'd still need to ensure that your rows meet your minimum length requirement:
import csv
# Python 3: open csv files in text mode with newline=''
with open('faultyfile.csv', newline='') as infile, open('output.csv', 'w', newline='') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile, dialect=reader.dialect)
    for row in reader:
        if len(row) < 5:
            row.extend([''] * (5 - len(row)))
        writer.writerow(row)
If you don't mind using awk, then it is easy:
$ cat data.txt
1,2,3,4,5
6,7,8
9,10
11,12,13,14,15
$ awk -F, 'BEGIN {OFS=","} {print $1,$2,$3,$4,$5}' data.txt
1,2,3,4,5
6,7,8,,
9,10,,,
11,12,13,14,15
with open('somefile.txt') as f:
    rows = []
    for line in f:
        rows.append(line.rstrip('\n').split(","))  # strip the newline before splitting
max_cols = len(max(rows,key=len))
for row in rows:
    row.extend(['']*(max_cols-len(row)))
print("\n".join(",".join(r) for r in rows))
If you are sure that every row should be n items long (in this case 5) and you know that before opening the file, it is more memory efficient to do something like this:
with open("f1","r") as f1, open("f2","w") as f2:
    for line in f1:
        line = line.rstrip("\n")  # add the commas before the newline, not after it
        f2.write(line+(","*(4-line.count(",")))+"\n")
def correct_file(fname):
    with open(fname) as f:
        data = [ line[:-1]+(4-line.count(','))*',' + '\n' for line in f ]
    with open(fname,'w') as f:
        f.writelines(data)
As noted in the comments, this reads the entire file into memory when you really don't need to. To avoid doing it all in one go:
import shutil
def correct_file(fname):
    with open(fname,'r') as fin, open('temp','w') as fout:
        for line in fin:
            new = line[:-1]+(4-line.count(','))*',' + '\n'
            fout.write(new)
    shutil.move('temp',fname)
This will overwrite any file named temp in the current directory. Of course, you can always use the tempfile module to get around that ...
And for the slightly more verbose, but bullet-proof (?) version:
import shutil
import tempfile
import atexit
import os
def try_delete(fname):
    try:
        os.unlink(fname)
    except OSError:
        if os.path.exists(fname):
            print("Couldn't delete existing file", fname)
def correct_file(fname):
    with open(fname,'r') as fin, tempfile.NamedTemporaryFile('w',delete=False) as fout:
        atexit.register(lambda f=fout.name: try_delete(f)) #Need a closure here ...
        for line in fin:
            new = line[:-1]+(4-line.count(','))*',' + '\n'
            fout.write(new)
    # Move only after both files are closed
    shutil.move(fout.name,fname) #This should get rid of the temporary file ...
This might work for you (GNU sed):
sed ':a;s/,/&/4;t;s/$/,/;ta' file

Python: Concise / elegant way to reformat a set of text files?

I have written a Python script to process a set of ASCII files within a given dir. I wonder if there is a more concise and/or "pythonesque" way to do it, without losing readability?
Python Code
import os
import fileinput
import glob
import string
indir='./'
outdir='./processed/'
for filename in glob.glob(indir+'*.asc'): # get a list of input ASCII files to be processed
    fin=open(indir+filename,'r') # input file
    fout=open(outdir+filename,'w') # out: processed file
    lines = iter(fileinput.input([indir+filename])) # iterator over all lines in the input file
    fout.write(next(lines)) # just copy the first line (the header) to output
    for line in lines:
        val=iter(string.split(line,' '))
        fout.write('{0:6.2f}'.format(float(val.next()))), # first value in the line has it's own format
        for x in val: # iterate over the rest of the numbers in the line
            fout.write('{0:10.6f}'.format(float(val.next()))), # the rest of the values in the line has a different format
        fout.write('\n')
    fin.close()
    fout.close()
An example:
Input:
;;; This line is the header line
-5.0 1.090074154029272 1.0034662411357929 0.87336062116561186 0.78649408279093869 0.65599958665017222 0.4379879132749317 0.26310799350679176 0.087808018565486673
-4.9900000000000002 1.0890770415316042 1.0025480136545413 0.87256100700428996 0.78577373527626004 0.65539842673645277 0.43758616966566649 0.26286647978335914 0.087727357602906453
-4.9800000000000004 1.0880820021223023 1.0016316956763136 0.87176305623792771 0.78505488659611744 0.65479851808106115 0.43718526271594083 0.26262546925502467 0.087646864773454014
-4.9700000000000006 1.0870890372077564 1.0007172884938402 0.87096676998908273 0.78433753775986659 0.65419986152386733 0.4367851929843618 0.26238496225635727 0.087566540188423345
-4.9600000000000009 1.086098148170821 0.99980479337809591 0.87017214936140763 0.78362168975984026 0.65360245789061966 0.4363859610200459 0.26214495911617541 0.087486383957276398
Processed:
;;; This line is the header line
-5.00 1.003466 0.786494 0.437988 0.087808
-4.99 1.002548 0.785774 0.437586 0.087727
-4.98 1.001632 0.785055 0.437185 0.087647
-4.97 1.000717 0.784338 0.436785 0.087567
-4.96 0.999805 0.783622 0.436386 0.087486
Other than a few minor changes, due to how Python has changed through time, this looks fine.
You're mixing two different styles of next(); the old way was it.next() and the new is next(it). You should use the string method split() instead of going through the string module (that module is there mostly for backwards compatibility with Python 1.x). There's no need to go through the almost useless "fileinput" module, since open file handles are also iterators (that module comes from a time before Python's file handles were iterators).
Edit: As @codeape pointed out, glob() returns the full path. Your code would not have worked if indir was something other than "./". I've changed the following to use the correct listdir/os.path.join solution. I'm also more familiar with the "%" string interpolation than string formatting.
Here's how I would write this in more idiomatic modern Python
import os
def reformat(fin, fout):
    fout.write(next(fin)) # just copy the first line (the header) to output
    for line in fin:
        fields = line.split(' ')
        # Make a format header specific to the number of fields
        fmt = '%6.2f' + ('%10.6f' * (len(fields)-1)) + '\n'
        fout.write(fmt % tuple(map(float, fields)))
basenames = os.listdir(indir) # get a list of input ASCII files to be processed
for basename in basenames:
    input_filename = os.path.join(indir, basename)
    output_filename = os.path.join(outdir, basename)
    with open(input_filename, 'r') as fin, open(output_filename, 'w') as fout:
        reformat(fin, fout)
The Zen of Python says "There should be one-- and preferably only one --obvious way to do it". It's interesting how you used functions which, during the last 10+ years, were "obviously" the right solution, but no longer are. :)
fin=open(indir+filename,'r') # input file
fout=open(outdir+filename,'w') # out: processed file
#code
fin.close()
fout.close()
can be written as:
with open(indir+filename,'r') as fin, open(outdir+filename,'w') as fout:
    #code
In python 2.6, you can use:
with open(indir+filename,'r') as fin:
    with open(outdir+filename,'w') as fout:
        #code
And the line
lines = iter(fileinput.input([indir+filename]))
is useless. You can just iterate over an open file(fin in your case)
You can also do line.split(' ') instead of string.split(line, ' ')
If you change those things, there is no need to import string and fileinput.
Edit: I didn't know you can use inline code. That's cool
In my build script, I have this code:
inFile = open(sourceFile,'r')
outFile = open(targetFile,'w')
for line in inFile:
    line = doKeywordSubstitution(line)
    outFile.write(line)
inFile.close()
outFile.close()
I don't know of a way to make this any more concise. Putting the line-changing logic in a different function looks neater to me though.
I may be missing the point of your code, but I don't understand why you have lines = iter(fileinput.input([indir+filename])).
I don't understand why do you use: string.split(line, ' ') instead of just line.split(' ').
Well maybe I would write the string-processing part like this:
values = line.split(' ')
values[0] = '{0:6.2f}'.format(float(values[0]))
values[1:] = ['{0:10.6f}'.format(float(v)) for v in values[1:]]
fout.write(' '.join(values) + '\n')  # formatting drops the original newline, so add it back
At least for me this looks better but this might be subjective :)
Instead of indir I would use os.curdir. Instead of "./processed" I would do: os.path.join(os.curdir, 'processed').
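Pulling the per-line formatting into its own function, as suggested in the answers above, might look like the sketch below. The helper name format_line is hypothetical, and unlike the question's loop (which skips every other value) this version formats every field on the line.

```python
def format_line(line):
    """Format the first field as 6.2f and every remaining field as 10.6f."""
    fields = line.split()  # split on any whitespace, which also drops the newline
    first = "{0:6.2f}".format(float(fields[0]))
    rest = "".join("{0:10.6f}".format(float(v)) for v in fields[1:])
    return first + rest + "\n"

formatted = format_line("1.5 2.25\n")
```

Keeping the formatting in one small function makes it easy to check on a single line before running it over a whole directory of files.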
