I am using Notepad++ to restructure some data. Each .txt file has 99 lines. I am trying to run a python script to create 99 single-line files.
Here is the .py script I am currently running, which I found in a previous thread on the topic. I'm not sure why, but it isn't quite doing the job:
yourfile = open('filename.TXT', 'r')
counter = 0
magic = yourfile.readlines()
for i in magic:
    counter += 1
    newfile = open(('filename_' + str(counter) + '.TXT'), "w")
    newfile.write(i)
    newfile.close()
When I run this particular script, it simply creates a copy of the host file, and it still has 99 lines.
You may want to change the structure of your script a bit:
with open('filename.txt', 'r') as f:
    for i, line in enumerate(f):
        with open('filename_{}.txt'.format(i), 'w') as wf:
            wf.write(line)
With this structure, context managers close the file handles for you, and you don't have to read the lines in a separate step, so the logic flows better.
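As a minimal sketch of the same idea, wrapped in a function so it can be tried on a scratch file. The name split_into_line_files, the out_dir parameter and the numbering from 1 (via enumerate's start argument) are my own assumptions, not something from the question:

```python
import os
import tempfile


def split_into_line_files(path, out_dir):
    """Write each line of `path` to its own file in `out_dir`.

    Hypothetical helper: output names are <stem>_<n><ext>, numbered from 1.
    Returns the list of paths that were written.
    """
    stem, ext = os.path.splitext(os.path.basename(path))
    written = []
    with open(path, 'r') as src:
        # enumerate(..., start=1) replaces the manual counter
        for number, line in enumerate(src, start=1):
            target = os.path.join(out_dir, '{}_{}{}'.format(stem, number, ext))
            with open(target, 'w') as dst:
                dst.write(line)
            written.append(target)
    return written


if __name__ == '__main__':
    # Try it on a throwaway 99-line file in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, 'filename.txt')
        with open(source, 'w') as f:
            f.writelines('row {}\n'.format(n) for n in range(1, 100))
        files = split_into_line_files(source, tmp)
        print(len(files))
```

Writing the pieces to a separate directory keeps them from mixing with the source file while you check the result.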
You can use the following piece of code to achieve that. It's commented, but feel free to ask.
#reading info from infile with 99 lines
infile = 'filename.txt'
#using context handler to open infile and readlines
with open(infile, 'r') as f:
    lines = f.readlines()
#initializing counter
counter = 0
#for each line, create a new file and write line to it.
for line in lines:
    #define outfile name
    outfile = 'filename_' + str(counter) + '.txt'
    #create outfile and write line
    with open(outfile, 'w') as g:
        g.write(line)
    #add +1 to counter
    counter += 1
magic = yourfile.readlines(99)
Please try removing the '99', like this:
magic = yourfile.readlines()
I tried it and got 99 files with a single line each.
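For context on why the argument matters: readlines() takes an optional size hint, not a line count. Once the total length of the lines read so far passes the hint, it stops reading. A small sketch with an in-memory file (io.StringIO stands in for the real file, and the 99 eight-character lines are made up for illustration):

```python
import io

# 99 made-up lines of 8 characters each (7 characters plus the newline)
text = ''.join('line {:02d}\n'.format(n) for n in range(1, 100))

all_lines = io.StringIO(text).readlines()      # no hint: every line is returned
some_lines = io.StringIO(text).readlines(99)   # hint of 99: stops after roughly 99 characters

print(len(all_lines))
print(len(some_lines))
```

With the hint in place the second call returns only the first handful of lines, so the loop would create far fewer than 99 files.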
I've got some code that opens all CSV files in a directory and removes the top 2 lines of each file. Ideally, during this process I would also like it to add a single comma at the end of the new first line (originally line 3).
Another possible approach would be to remove the trailing commas on all other rows in each of the CSVs.
Any thoughts or approaches would be gratefully received.
import glob
path = r'P:\pytest'
for filename in glob.iglob(path+'/*.csv'):
    with open(filename, 'r') as f:
        lines = f.read().split("\n")
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
        o = open(filename, 'w')
        for line in lines:
            o.write(line+'\n')
        o.close()
Adding a counter can solve this:
import glob
path=r'C:/Users/dsqallihoussaini/Desktop/dev_projects/stack_over_flow'
for filename in glob.iglob(path+'/*.csv'):
    with open(filename, 'r') as f:
        lines = f.read().split("\n")
        print(lines)
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
        o = open(filename, 'w')
        counter=0
        for line in lines:
            counter=counter+1
            if counter==1:
                o.write(line+',\n')
            else:
                o.write(line+'\n')
        o.close()
One possible problem with your code is that you are reading the whole file into memory, which might be fine. If you are reading larger files, then you want to process the file line by line.
The easiest way to do that is to use the fileinput module: https://docs.python.org/3/library/fileinput.html
Something like the following should work:
#!/usr/bin/env python3
import glob
import fileinput
# inplace makes a backup of the file, then any output to stdout is written
# to the current file.
# change the glob..below is just an example.
#
# Iterate through each file in the glob.iglob() results
with fileinput.input(files=glob.iglob('*.csv'), inplace=True) as f:
    for line in f:  # Iterate over each line of the current file.
        if f.filelineno() > 2:  # Skip the first two lines
            # Note: 'line' has the newline in it.
            # Insert the comma if line 3 of the file, otherwise output original line
            if f.filelineno() == 3:
                print(line[:-1] + ',')
            else:
                print(line, end="")
I've added an encoding as well, since mine was throwing an error; specifying the encoding fixed that nicely:
import glob
path=r'C:/whateveryourfolderis'
for filename in glob.iglob(path+'/*.csv'):
    with open(filename, 'r',encoding='utf-8') as f:
        lines = f.read().split("\n")
        #print(lines)
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
        o = open(filename, 'w',encoding='utf-8')
        counter=0
        for line in lines:
            counter=counter+1
            if counter==1:
                o.write(line+',\n')
            else:
                o.write(line+'\n')
        o.close()
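The question also floated the opposite approach: instead of adding a comma to the new first line, strip the trailing comma from every remaining row. A rough sketch of that variant, assuming each file fits comfortably in memory. The function name and the sample data are hypothetical:

```python
import os
import tempfile


def drop_header_and_trailing_commas(filename, encoding='utf-8'):
    """Remove the first two lines of a file and any trailing commas on the rest."""
    with open(filename, 'r', encoding=encoding) as f:
        lines = f.read().splitlines()
    # Skip the two header lines, then strip commas at the end of each row.
    cleaned = [line.rstrip(',') for line in lines[2:]]
    with open(filename, 'w', encoding=encoding) as f:
        f.writelines(line + '\n' for line in cleaned)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, 'sample.csv')
        with open(sample, 'w', encoding='utf-8') as f:
            f.write('junk\nmore junk\na,b,c\n1,2,3,\n4,5,6,\n')
        drop_header_and_trailing_commas(sample)
        with open(sample, encoding='utf-8') as f:
            print(f.read())
```

Note that rstrip(',') removes every trailing comma, so a row that legitimately ends in one or more empty fields would lose them; whether that is acceptable depends on your data.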
I have a problem with a code in python. I want to read a .txt file. I use the code:
f = open('test.txt', 'r') # We need to re-open the file
data = f.read()
print(data)
I would like to read ONLY the first line from this .txt file. I use
f = open('test.txt', 'r') # We need to re-open the file
data = f.readline(1)
print(data)
But only the first letter of the line is shown on screen.
Could you help me read all the letters of the line? (I mean, read the whole line of the .txt file.)
with open("file.txt") as f:
    print(f.readline())
This opens the file in a with block (which closes the file automatically when we are done with it) and reads the first line. It is equivalent to:
f = open("file.txt")
print(f.readline())
f.close()
Your attempt with f.readline(1) won't work because the argument is the maximum number of characters to read from the line, so it returns only the first character.
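To see the effect of the size argument without touching a real file, here is a small sketch that uses io.StringIO as a stand-in for the opened file (the sample text is made up):

```python
import io

f = io.StringIO('first line\nsecond line\n')  # stand-in for an open text file

one_char = f.readline(1)   # reads at most 1 character of the current line
rest = f.readline()        # reads the remainder of that same line

print(repr(one_char))  # 'f'
print(repr(rest))      # 'irst line\n'
```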
Second method:
with open("file.txt") as f:
    print(f.readlines()[0])
Or you could also do the above which will get a list of lines and print only the first line.
To read the fifth line, use
with open("file.txt") as f:
    print(f.readlines()[4])
Or:
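with open("file.txt") as f:
    lines = []
    # append() adds each line as one item; += would add its characters one by one
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    print(lines[-1])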
The -1 represents the last item of the list
Learn more:
with statement
files in python
readline method
Your first try is almost there, you should have done the following:
f = open('my_file.txt', 'r')
line = f.readline()
print(line)
f.close()
A safer approach to read file is:
with open('my_file.txt', 'r') as f:
    print(f.readline())
Both ways will print only the first line.
Your error was that you passed 1 to readline, which means you want to read at most 1 character. Please refer to https://www.w3schools.com/python/ref_file_readline.asp
I tried this and it works, after your suggestions:
f = open('test.txt', 'r')
data = f.readlines()[0]  # index 0 is the first line; [1] would be the second
print(data)
Use with open(...) instead:
with open("test.txt") as file:
    line = file.readline()
    print(line)
Keep f.readline() without parameters.
It returns the first line as a string and moves the cursor to the second line.
The next time you call f.readline(), it returns the second line and moves the cursor on to the next one, and so on.
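A short sketch of that cursor behaviour, again with io.StringIO standing in for the opened file and made-up contents:

```python
import io

f = io.StringIO('alpha\nbeta\n')  # stand-in for open("test.txt")

first = f.readline()    # 'alpha\n', cursor now at the start of line 2
second = f.readline()   # 'beta\n', cursor now at the end of the file
at_end = f.readline()   # '' because there is nothing left to read

print(repr(first), repr(second), repr(at_end))
```

The empty string at the end is how readline() signals that the file is exhausted; a blank line inside the file would come back as '\n' instead.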
I'm currently using Python 3 on Ubuntu 18.04. I'm not a programmer by any means and I'm not asking for a code review, however, I'm having an issue that I can't seem to resolve.
I have 1 text file named content.txt that I'm reading lines from.
I have 1 text file named standard.txt that I'm reading lines from.
I have 1 text file named outfile.txt that I'm writing to.
content = open("content.txt", "r").readlines()
standard = open("standard.txt", "r").readlines()
outfile = "outfile.txt"
outfile_set = set()
with open(outfile, "w") as f:
    for line in content:
        if line not in standard:
            outfile_set.add(line)
    f.writelines(sorted(outfile_set))
I'm not sure where to put the following line though. My for loop nesting may all be off:
f.write("\nNo New Content")
Any code examples to make this work would be most appreciated. Thank you.
If I understand correctly, you want to write outfile_set to the outfile if it is not empty, and otherwise write the string "\nNo New Content".
Replace the line
    f.writelines(sorted(outfile_set))
with
    if outfile_set:
        f.writelines(sorted(outfile_set))
    else:
        f.write("\nNo New Content")
I'm assuming that you want to write "No new content" to the file if every line in content is in standard. So you might do something like:
with open(outfile, "w") as f:
    for line in content:
        if line not in standard:
            outfile_set.add(line)
    if len(outfile_set) > 0:
        f.writelines(sorted(outfile_set))
    else:
        f.write("\nNo New Content")
Your original code was almost there!
You can reduce your runtime a lot by using set/frozenset:
with open("content.txt", "r") as f:
    content = frozenset(f.readlines()) # only get distinct values from file
with open("standard.txt", "r") as f:
    standard = frozenset(f.readlines()) # only get distinct values from file
# only keep whats in content but not in standard
outfile_set = sorted(content-standard) # set difference, no loops or tests needed
with open ("outfile.txt","w") as outfile:
    if outfile_set:
        outfile.writelines(sorted(outfile_set))
    else:
        outfile.write("\nNo New Content")
You can read more about it here:
set operator list (Python 2, but valid for 3; I can't find this overview in the Python 3 docs)
set difference
Demo:
# Create files
with open("content.txt", "w") as f:
    for n in map(str,range(1,10)): # use range(1,10,2) for no changes
        f.writelines(n+"\n")
with open("standard.txt", "w") as f:
    for n in map(str,range(1,10,2)):
        f.writelines(n+"\n")
# Process files:
with open("content.txt", "r") as f:
    content = set(f.readlines())
with open("standard.txt", "r") as f:
    standard = set(f.readlines())
# only keep whats in content but not in standard
outfile_set = sorted(content-standard)
with open ("outfile.txt","w") as outfile:
    if outfile_set:
        outfile.writelines(sorted(outfile_set))
    else:
        outfile.write("\nNo New Content")
with open ("outfile.txt") as f:
    print(f.read())
Output:
2
4
6
8
or
No New Content
I am attempting to open two files then take the first line in the first file, write it to an out file, then take the first line in the second file and append it to the same line in the output file, separated by a tab.
I've attempted to code this, and my outfile just ends up being the whole contents of the first file, followed by the entire contents of the second file. I included print statements just because I wanted to see something going on in the terminal while the script was running, that is why they are there. Any ideas?
import sys
InFileName = sys.argv[1]
InFile = open(InFileName, 'r')
InFileName2 = sys.argv[2]
InFile2 = open(InFileName2, 'r')
OutFileName = "combined_data.txt"
OutFile = open(OutFileName, 'a')
for line in InFile:
    OutFile.write(str(line) + '\t')
    print line
for line2 in InFile2:
    OutFile.write(str(line2) + '\n')
    print line
InFile.close()
InFile2.close()
OutFile.close()
You can use zip for this:
with open(file1) as f1,open(file2) as f2,open("combined_data.txt","w") as fout:
    for t in zip(f1,f2):
        fout.write('\t'.join(x.strip() for x in t)+'\n')
In the case where your two files don't have the same number of lines (or if they're REALLY BIG), you could use itertools.izip_longest(f1,f2,fillvalue=''), which is named itertools.zip_longest in Python 3.
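Here is how the unequal-length case might look in Python 3, where the function is called itertools.zip_longest. This is only a sketch: the helper name combine_columns and the sample file names are assumptions for illustration.

```python
import os
import tempfile
from itertools import zip_longest


def combine_columns(path1, path2, out_path):
    """Join line N of path1 and line N of path2 with a tab."""
    with open(path1) as f1, open(path2) as f2, open(out_path, 'w') as fout:
        # fillvalue='' pads the shorter file with empty strings
        for left, right in zip_longest(f1, f2, fillvalue=''):
            fout.write(left.rstrip('\n') + '\t' + right.rstrip('\n') + '\n')


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, 'first.txt')
        second = os.path.join(tmp, 'second.txt')
        combined = os.path.join(tmp, 'combined_data.txt')
        with open(first, 'w') as f:
            f.write('a\nb\nc\n')
        with open(second, 'w') as f:
            f.write('1\n2\n')
        combine_columns(first, second, combined)
        with open(combined) as f:
            print(f.read())
```

Both input files are read lazily, one line at a time, so this also works for files that do not fit in memory.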
Perhaps this gives you a few ideas:
Adding entries from multiple files in python
o = open('output.txt', 'wb')
fh = open('input.txt', 'rb')
fh2 = open('input2.txt', 'rb')
for line in fh.readlines():
    o.write(line.strip('\r\n') + '\t' + fh2.readline().strip('\r\n') + '\n')
## If you want to write the remaining lines from input2.txt:
# for line in fh2.readlines():
#     o.write(line.rstrip('\r\n') + '\n')
fh.close()
fh2.close()
o.close()
This will give you:
line1_of_file_1 line1_of_file_2
line2_of_file_1 line2_of_file_2
line3_of_file_1 line3_of_file_2
line4_of_file_1 line4_of_file_2
Where the space in my output example is a [tab]
Note: no line ending is appended to the file for obvious reasons.
For this to work, the line endings need to be correct in both file 1 and file 2.
To check this:
print('File 1:')
f = open('input.txt', 'rb')
print([f.read()[:200]])  # show the first 200 bytes, line endings included
f.close()
print('File 2:')
f = open('input2.txt', 'rb')
print([f.read()[:200]])
f.close()
This should give you something like
File 1:
['This is\ta lot of\t text\r\nWith a few line\r\nendings\r\n']
File 2:
['Give\r\nMe\r\nSome\r\nLove\r\n']
How can I insert a string at the beginning of each line in a text file? I have the following code:
f = open('./ampo.txt', 'r+')
with open('./ampo.txt') as infile:
    for line in infile:
        f.insert(0, 'EDF ')
f.close
I get the following error:
'file' object has no attribute 'insert'
Python comes with batteries included:
import fileinput
import sys
for line in fileinput.input(['./ampo.txt'], inplace=True):
    sys.stdout.write('EDF {l}'.format(l=line))
Unlike the solutions already posted, this also preserves file permissions.
You can't modify a file in place like that. Files do not support insertion. You have to read it all in and then write it all out again.
You can do this line by line if you wish. But in that case you need to write to a temporary file and then replace the original. So, for small enough files, it is just simpler to do it in one go like this:
with open('./ampo.txt', 'r') as f:
    lines = f.readlines()
lines = ['EDF '+line for line in lines]
with open('./ampo.txt', 'w') as f:
    f.writelines(lines)
Here's a solution where you write to a temporary file and move it into place. You might prefer this version if the file you are rewriting is very large, since it avoids keeping the contents of the file in memory, as versions that involve .read() or .readlines() will. In addition, if there is any error in reading or writing, your original file will be safe:
from shutil import move
from tempfile import NamedTemporaryFile
filename = './ampo.txt'
tmp = NamedTemporaryFile(delete=False)
with open(filename) as finput:
    with open(tmp.name, 'w') as ftmp:
        for line in finput:
            ftmp.write('EDF '+line)
move(tmp.name, filename)
For a file not too big:
with open('./ampo.txt', 'rb+') as f:
    x = f.read()
    f.seek(0,0)
    f.writelines(('EDF ', x.replace('\n','\nEDF ')))
    f.truncate()
Note that in this case, where the content grows, f.truncate() is not strictly necessary: the new data overwrites every byte of the old data, and the file simply ends where the last write ended.
That is what I observed in my tests.
But I prefer to be cautious and keep the call anyway. When the new content is shorter than the old, closing the file does not cut it off at the current position, so the tail of the old content remains in the file unless you truncate it.
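The effect of f.truncate() is easy to check on a throwaway file. The sketch below is written for Python 3 and uses bytes; the helper names rewrite and read_back and the sample contents are made up for the demonstration:

```python
import os
import tempfile


def rewrite(path, new_bytes, truncate):
    """Overwrite the start of the file with new_bytes, optionally truncating."""
    with open(path, 'rb+') as f:
        f.seek(0)
        f.write(new_bytes)
        if truncate:
            f.truncate()  # cut the file off at the current position


def read_back(path):
    with open(path, 'rb') as f:
        return f.read()


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo.txt')
        with open(path, 'wb') as f:
            f.write(b'0123456789')

        rewrite(path, b'abc', truncate=False)
        print(read_back(path))  # b'abc3456789': the old tail is still there

        rewrite(path, b'xy', truncate=True)
        print(read_back(path))  # b'xy'
```

So the call only matters when the rewritten content is shorter than the original, which is why keeping it is a cheap safeguard.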
For a big file, to avoid loading the whole content of the file into RAM at once:
import os
def addsomething(filepath, ss):
    if filepath.rfind('.') > filepath.rfind(os.sep):
        a,_,c = filepath.rpartition('.')
        tempi = a + 'temp.' + c
    else:
        tempi = filepath + 'temp'
    with open(filepath, 'rb') as f, open(tempi,'wb') as g:
        g.writelines(ss + line for line in f)
    os.remove(filepath)
    os.rename(tempi,filepath)
addsomething('./ampo.txt','WZE')
f = open('./ampo.txt', 'r')
# Build the prefixed lines eagerly; map() is lazy in Python 3
lines = ['EDF ' + l for l in f.readlines()]
f.close()
f = open('./ampo.txt', 'w')
f.writelines(lines)
f.close()