i = 1
with open("randomStuff\\test\\brief.txt") as textFile:
    lines = [line.split('\n') for line in textFile]
    for row in lines:
        for elem in row:
            with open(elem + ".txt", "w") as newLetter:
                newLetter.writelines(elem)
                i += 1
I have a txt file with names. I want to create a file for each of those names, like:
firstnameLastname.txt
The names appear inside the files too.
At the moment it is working fine, but it also creates one empty file called ".txt".
Can someone tell me why? If I'm right, the problem should be in the loops.
The empty file comes from the split: a line like "name\n" split on '\n' gives ['name', ''], and that trailing empty string is what creates ".txt". Add an if statement to skip empty strings:
Edit
i = 1
with open("randomStuff\\test\\brief.txt") as textFile:
    lines = [line.split('\n') for line in textFile]
    for row in lines:
        for elem in row:
            if elem == "":
                continue
            with open(elem + ".txt", "w") as newLetter:
                newLetter.writelines(elem)
                i += 1
continue jumps to the next loop cycle without executing the code below it.
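You can see where the empty string comes from in a quick REPL check (the name here is made up):
>>> 'JohnSmith\n'.split('\n')
['JohnSmith', '']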
I don't know why you have so many loops:
from pathlib import Path

text_file_content = Path("randomStuff/test/brief.txt").read_text().splitlines()
for line in text_file_content:
    if line:  # in case you have a newline at the end of your file, which you probably should
        with open(f"{line}.txt", "w") as new_letter:
            new_letter.writelines(line)
My problem is the following: I have one text file with more than 1000 rows, and I want to read it line by line. I am trying the code below, but I am not getting the expected output.
My source file:
uuid;UserGroup;Name;Description;Owner;Visibility;Members ----> header of the file
id:;group1;raji;xyzabc;ramya;public;
abc
def
geh
id:group2;raji;rtyui;ramya;private
cvb
nmh
poi
import csv

output = []
temp = []
fo = open('usergroups.csv', 'r')
for line in fo:
    #next(uuid)
    line = line.strip()
    if not line:
        continue  #ignore empty lines
    #temp.append(line)
    if not line.startswith('id:') and not None:
        temp.append(line)
        print(line)
    else:
        if temp:
            line += ";" + ",".join(temp)
            temp.clear()
        output.append(line)
print("\n".join(output))
with open('new.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerows(output)
I am getting this output:
id;group1;raji;xyzabc;ramya;public;uuid;UserGroup;Name;Description;Owner;Visibility;Members
id:group2;raji;rtyui;ramya;private;abc,def,geh
So whenever a line does not start with 'id' it should be appended to the previous line.
my desired output:
uuid;UserGroup;Name;Description;Owner;Visibility;Members ----> header of the file
id;group1;raji;xyzabc;ramya;public;abc,def,geh
id:group2;raji;rtyui;ramya;private;cvb,nmh,poi
There are a few mistakes. I'll only show the relevant corrections:
Use
if not line.startswith('id'):
No 'id:', since you also have a line starting with 'id;', plus you state yourself that a line has to start with "id" (no ":" there). The `and not None` part is unnecessary, because it is always true.
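A quick REPL check confirms that the extra condition changes nothing:
>>> not None
True
>>> "abc".startswith('id') and not None  # equivalent to just the startswith test
False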
The other part:
output.append(line.split(';'))
because writerows needs an iterable (list) of "row" objects, and a row object is a list of strings. So you need a list of lists, which the above is, thanks to the extra split.
(Of course, now the line print("\n".join(output)) fails, but writer.writerows(output) works.)
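A minimal illustration of the shape writerows expects (the file name demo.csv is just an example):
import csv

rows = [line.split(';') for line in ["id;group1;raji", "id:group2;raji"]]
# rows is now a list of lists; each inner list becomes one CSV row
with open('demo.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)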
I don't know if it will help you, but with regex this problem is solved in a very simple way. I leave the code here in case you are interested.
import regex as re  # the standard library re module would also work here
input_text = """uuid;UserGroup;Name;Description;Owner;Visibility;Members ----> header of the file
id;group1;raji;xyzabc;ramya;public;
abc
def
geh
id:group2;raji;rtyui;ramya;private
cvb
nmh
poi"""
formatted = re.sub(r"\n(?!(id|\n))", "", input_text)
print(formatted)
uuid;UserGroup;Name;Description;Owner;Visibility;Members ----> header of the file
id;group1;raji;xyzabc;ramya;public;abcdefgeh
id:group2;raji;rtyui;ramya;privatecvbnmhpoi
This code just replaces the regular expression \n(?!(id|\n)) with the empty string. This regular expression matches every line break that is not followed by "id" or by another line break (so we keep the separation between the two id lines).
Writing to a file has not been included here, but the formatted string is available to work with, as in your original code.
Note: this is not really an answer to your question so much as a solution to your problem.
The structure is by and large the same, with a few changes for readability.
Readable code is easier to get right
import csv
output = []
temp = []
currIdLine = ""
with open('usergroups.csv', 'r') as f:
    for dirtyline in f.readlines():
        line = dirtyline.strip()
        if not line:
            print("Skipping empty line")
            continue
        if line.startswith('uuid'):  # append the header to the output
            output.append(line)
            continue
        if line.startswith('id'):
            if temp:
                print(temp)
                output.append(currIdLine + ";" + ','.join(temp))  # based on the current input, there is a bug here: the output will contain two sequential ';' characters
                temp.clear()
            currIdLine = line
        else:
            temp.append(line)
    output.append(currIdLine + ";" + ','.join(temp))
print(output)
I am trying to write a python program that will constantly read a text file line by line and each time it comes across a line with the word 'SPLIT' it will write the contents to a new text file.
Please could someone point me in the right direction for writing a new text file each time the script comes across the word 'SPLIT'. I have no problem reading a text file with Python; I'm unsure how to split on the keyword and create an individual text file each time.
THE SCRIPT BELOW WORKS IN 2.7.13
file_counter = 0
done = False
with open('test.txt') as input_file:
    # with open("test"+str(file_counter)+".txt", "w") as out_file:
    while not done:
        for line in input_file:
            if "SPLIT" in line:
                done = True
                file_counter += 1
            else:
                print(line)
                out_file = open("test"+str(file_counter)+".txt", "a")
                out_file.write(line)
                #out_file.write(line.strip()+"\n")
print file_counter
You need two loops: one that iterates over the output file names, and another inside it that writes the input contents to the currently active output file until "SPLIT" is found:
out_n = 0
done = False
with open("test.txt") as in_file:
    while not done:  #loop over output file names
        with open(f"out{out_n}.txt", "w") as out_file:  #generate an output file name
            while not done:  #loop over lines in the input file and write to the output file
                try:
                    line = next(in_file).strip()  #strip whitespace for consistency
                except StopIteration:
                    done = True
                    break
                if "SPLIT" in line:  #more robust than 'if line == "SPLIT\n":'
                    break
                else:
                    out_file.write(line + '\n')  #must add the newline back because we stripped it out earlier
        out_n += 1  #increment the output file name integer
text = open("test.txt").read()  # assuming the file content has been read into a string first
for line in text.splitlines():
    if " SPLIT " in line:
        # write to a new file here
        pass
To write to a new file, check here:
https://www.tutorialspoint.com/python/python_files_io.htm
or
https://docs.python.org/3.6/library/functions.html#open
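Putting the pieces together, a minimal sketch (the output names out0.txt, out1.txt, ... are made up) could look like this:
out_n = 0
out_file = open("out0.txt", "w")
with open("test.txt") as text_file:
    for line in text_file:
        if "SPLIT" in line:
            # close the current file and start a fresh one for the next chunk
            out_file.close()
            out_n += 1
            out_file = open("out{}.txt".format(out_n), "w")
        else:
            out_file.write(line)
out_file.close()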
I need to add some text to the end of a specific line in a text file. I'm currently trying to use a method similar to this:
entryList = [5,4,3,8]
dataFile = open("file.txt","a+")
for i in dataFile:
    for j in entryList:
        lines[i] = lines[i].strip()+entryList[j]+" "
        dataFile.write(lines[i])
I'd like to add the numbers immediately following the text.
Text file setup:
it is
earlier it was
before that it was
later it will be
You have mentioned a specific line in the question, but in your code you are writing to every line.
import fileinput

entryList = [5,4,3,8]
count = 0
for line in fileinput.FileInput('data.txt', inplace=1):
    # strip the trailing newline so the number lands on the same line; print adds it back
    print line.rstrip('\n') + str(entryList[count]) + " "
    count += 1
Reading from and then writing to the same file is not such a good idea. I suggest opening it as needed:
entryList = [5,4,3,8]
with open("file.txt", "r") as dataFile:
    # You may want to add '\n' at the end of the format string
    lines = ["{} {} ".format(s.strip(), i) for s, i in zip(dataFile, entryList)]
with open("file.txt", "w") as outFile:
    outFile.writelines(lines)
I have an original text file with 100 rows and 40 columns of data.
I would like to write an individual text file for each data row of the original text file.
I can only work out how to do it the long way:
Data = loadtxt('Data.txt')
Row1 = Data[0,:]
np.savetxt('Row1.txt', [Row1])
Row2 = Data[1,:]
np.savetxt('Row2.txt', [Row2])
Row3 = Data[2,:]
etc....
Is there a way of using a loop to make this process quicker/do it all at once so I can avoid doing this 100 times?
I was thinking something along the lines of
with open('Data.txt') as f:
    for line in f:
        line_out = f.readlines()
        with open(line + '.txt','w') as fout:
            fout.write(line_out)
This doesn't work but I can't work out what the code should be.
You're on the right track. This should give you files with names corresponding to each line number:
counter = 0
with open("sampleInput.txt",'rU') as f:
    for i in f:
        newFileName = 'newFile_' + str(counter)
        outFile = open(newFileName, 'w')
        outFile.write(i)
        outFile.close()
        counter += 1
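If you want to keep the numpy approach from the question, a loop with enumerate does the same job (a sketch, assuming import numpy as np and the same Data.txt):
import numpy as np

Data = np.loadtxt('Data.txt')
for i, row in enumerate(Data):
    # one output file per row: Row1.txt, Row2.txt, ...
    np.savetxt('Row{}.txt'.format(i + 1), [row])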
Consider fileNames.txt, which contains all the words for creating the multiple .txt files.
f = open('fileNames.txt', 'r+')
for line in f:
    if '\n' in line:
        line = line[:-1]  #assuming \n at the end of the line
    new = open("%s.txt" % line, "w+")
    new.write("File with name %s" % line)  #content for each file
    new.close()
New files cannot be created if '\n' is present in the name string, hence it is stripped first.
If fileNames.txt contains the words frog, four and legs (one per line), then three files named frog.txt, four.txt and legs.txt will be created.
So I'm new to Python and I'm trying to write a script that iterates through all .txt files in a directory, counts the number of lines in each one (excluding lines that are blank or commented out), and writes the final output to a CSV. The final output should look something like this:
agprices, avi, adp
132, 5, 8
I'm having trouble with the syntax to save each count as the value of the dictionary. Here is my code below:
#!/usr/bin/env python
import csv
import copy
import os
import sys

#get current working dir, set count, and select file delimiter
d = os.getcwd()
count = 0
ext = '.txt'

#parses through files and saves to a dict
series_dict = {}
txt_files = [i for i in os.listdir(d) if os.path.splitext(i)[1] == ext]
#selects all files with .txt extension
for f in txt_files:
    with open(os.path.join(d,f)) as file_obj:
        series_dict[f] = file_obj.read()
        if line.strip(): #Exclude blank lines
            continue
        else if line.startswith("#"): #Exclude commented lines
            continue
        else
            count += 1
#Need to save count as val in dict here

#save the dictionary with key/val pairs to a csv
with open('seriescount.csv', 'wb') as f:
    w = csv.DictWriter(f, series_dict.keys())
    w.writeheader()
    w.writerow(series_dict)
So here's the edit:
#!/usr/bin/env python
import csv
import copy
import os
import sys
import glob

#get current working dir, set count, and select file delimiter
os.chdir('/Users/Briana/Documents/Misc./PythonTest')

#parses through files and saves to a dict
series = {}
for fn in glob.glob('*.txt'):
    with open(fn) as f:
        series[fn] = (1 for line in f if line.strip() and not line.startswith('#'))
print series

#save the dictionary with key/val pairs to a csv
with open('seriescount.csv', 'wb') as f:
    w = csv.DictWriter(f, series.keys())
sum(names.values())
I'm getting an indentation error on the second-to-last line and am not quite sure why. Also, I'm not positive that I'm writing the syntax correctly in the last part. Again, I'm simply trying to return a dictionary with file names and line counts, like {a: 132, b: 245, c: 13}.
You can try something along these lines:
os.chdir(ur_directory)
names = {}
for fn in glob.glob('*.txt'):
    with open(fn) as f:
        names[fn] = sum(1 for line in f if line.strip() and not line.startswith('#'))
print names
That will print a dictionary similar to:
{'test_text.txt': 20, 'f1.txt': 3, 'lines.txt': 101, 'foo.txt': 6, 'dat.txt': 6, 'hello.txt': 1, 'f2.txt': 4, 'neglob.txt': 8, 'bar.txt': 6, 'test_reg.txt': 6, 'mission_sp.txt': 71, 'test_nums.txt': 8, 'test.txt': 7, '2591.txt': 8303}
And you can use that Python dict in csv.DictWriter.
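For example, a sketch of that last step (binary mode because this answer targets Python 2, matching the print statements above):
import csv

with open('seriescount.csv', 'wb') as f:
    w = csv.DictWriter(f, fieldnames=sorted(names))
    w.writeheader()    # file names become the header row
    w.writerow(names)  # the corresponding counts become the second row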
If you want the sum of those, just do:
sum(names.values())
I think you should make two changes to your script:
Use glob.glob() to get the list of files matching your desired suffix
Use for line in file_obj to iterate through the lines
Other problem:
The indentation is wrong on your last few lines
You could count your lines in your files with this 1-liner:
line_nums = sum(1 for line in open(f) if line.strip() and line[0] != '#')
that would shorten your code segment to
for f in txt_files:
    count += sum(1 for line in open(os.path.join(d,f))
                 if line[0] != '#' and line.strip())
It looks like you want to use a dictionary to keep track of the counts. You could create one at the top like this: counts = {}
Then (once you fix your tests) you can update it for each non-comment line:
counts = {}
txt_files = [i for i in os.listdir(d) if os.path.splitext(i)[1] == ext]
#selects all files with .txt extension
for f in txt_files:
    counts[f] = 0  # create an entry in the dictionary to keep track of one file's lines
    with open(os.path.join(d,f)) as file_obj:
        for line in file_obj:  # iterate the lines rather than reading the whole file
            if line.startswith("#"):  #Exclude commented lines
                continue
            elif line.strip():  #Exclude blank lines
                counts[f] += 1