Removing comma from Text file using Python - python

I'm starting to play around with Python and trying to merge a couple of files I have into a single file. When I use the below code:
import glob
path = "C:\\Users\\abc\\OneDrive\\Trading\\"
read_files = glob.glob(path + "*.txt")
with open("result.txt", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            outfile.write(infile.read())
My output file has many lines where the stock code is surrounded by runs of commas, for example:
ASX:MCR,,,,,,,
,ASX:RHC,,,,,,
,,ASX:LTR,,,,,
,,,,ASX:MAY,,,
,,,,,,ASX:ANP,
How can I remove all the commas to get a list of stock codes, one per line, with any duplicates removed? For example:
ASX:BGT
ASX:CNB
ASX:BFG
ASX:ICI
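
One possible approach, sketched here under assumptions rather than taken from an answer: read the files in text mode, split each line on commas, keep the non-empty fields, and skip codes that have already been seen. The helper names `extract_codes` and `merge_codes` are hypothetical, and the sketch assumes the input files are plain text in the platform's default encoding.

```python
import glob
import os


def extract_codes(lines):
    """Return the unique non-empty comma-separated fields, in first-seen order."""
    seen = set()
    codes = []
    for line in lines:
        for field in line.strip().split(","):
            field = field.strip()
            if field and field not in seen:
                seen.add(field)
                codes.append(field)
    return codes


def merge_codes(pattern, out_path):
    """Collect codes from every file matching pattern and write one per line."""
    lines = []
    for name in sorted(glob.glob(pattern)):
        if os.path.abspath(name) == os.path.abspath(out_path):
            continue  # do not read the output file back in
        with open(name, "r") as infile:
            lines.extend(infile)
    codes = extract_codes(lines)
    with open(out_path, "w") as outfile:
        for code in codes:
            outfile.write(code + "\n")
    return codes
```

With the `path` variable from the question, `merge_codes(path + "*.txt", "result.txt")` would write each code once, on its own line, in the order it first appears.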

Related

Trying to merge all text files in a folder and append file as well

I am trying to merge all text files in a folder. I have this part working, but when I try to write the file name before the contents of each text file, I get an error that reads: TypeError: a bytes-like object is required, not 'str'
The code below must be pretty close, but something is definitely off. Any thoughts what could be wrong?
import glob
folder = 'C:\\my_path\\'
read_files = glob.glob(folder + "*.txt")
with open(folder + "final_result.txt", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            outfile.write(f)
            outfile.write(infile.read())
outfile.close
outfile.write(f) seems to be your problem: you opened the file in binary mode with 'wb', but f is a str. You can convert it to bytes using encode. You also don't need to close outfile in your last line, because the with block already does that (and without parentheses you aren't calling the function anyway). So something like this might work for you:
import glob
folder = 'C:\\my_path\\'
read_files = glob.glob(folder + "*.txt")
with open(folder + "final_result.txt", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            outfile.write(f.encode('utf-8'))
            outfile.write(infile.read())
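
The code above writes each full path immediately followed by the file's bytes, with no separator between them. As a sketch of one variation (not part of the answer), the function below writes only the base name on its own line and skips the output file so that it is not merged into itself on a second run. The name `merge_with_names` and the choice of UTF-8 for encoding the file name are assumptions.

```python
import glob
import os


def merge_with_names(folder, out_name="final_result.txt"):
    """Concatenate every .txt file in folder, each preceded by its file name."""
    out_path = os.path.join(folder, out_name)
    names = sorted(glob.glob(os.path.join(folder, "*.txt")))
    with open(out_path, "wb") as outfile:
        for name in names:
            if os.path.abspath(name) == os.path.abspath(out_path):
                continue  # do not merge the output file into itself
            # The output is opened in binary mode, so the name must be bytes.
            outfile.write(os.path.basename(name).encode("utf-8") + b"\n")
            with open(name, "rb") as infile:
                data = infile.read()
            outfile.write(data)
            if data and not data.endswith(b"\n"):
                outfile.write(b"\n")  # keep the next file name on its own line
    return out_path
```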

How to combine several text files into one file?

I want to combine several text files into one output file.
My original code downloads 100 text files; each time, I filter several words out of the text and then write it to the output file.
Here is the part of my code that is supposed to combine the new text with the output text. Each time, the result overwrites the output file: the previous content is deleted and only the new text is added.
import fileinput
import glob
urls = ['f1.txt', 'f2.txt','f3.txt']
N =0;
print "read files"
for url in urls:
    read_files = glob.glob(urls[N])
    with open("result.txt", "wb") as outfile:
        for f in read_files:
            with open(f, "rb") as infile:
                outfile.write(infile.read())
    N+=1
I also tried this:
import fileinput
import glob
urls = ['f1.txt', 'f2.txt','f3.txt']
N =0;
print "read files"
for url in urls:
    file_list = glob.glob(urls[N])
    with open('result-1.txt', 'w') as file:
        input_lines = fileinput.input(file_list)
        file.writelines(input_lines)
    N+=1
Are there any suggestions?
I need to concatenate approximately 100 text files into one .txt file, in sequence. (Each time I read one file, I add it to result.txt.)
The problem is that you are re-opening the output file on each loop iteration, which truncates it every time unless you explicitly open it in append mode.
The glob logic is also unnecessary when you already know the filename.
Try this instead:
with open("result.txt", "wb") as outfile:
    for url in urls:
        with open(url, "rb") as infile:
            outfile.write(infile.read())
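
If the output really has to be reopened for every file, for example because each file is downloaded and filtered in a separate step, append mode is the alternative the answer mentions. A minimal sketch, where `append_file` is a hypothetical helper name:

```python
def append_file(src_path, dest_path="result.txt"):
    """Append the bytes of src_path to the end of dest_path."""
    # "ab" creates dest_path if it is missing and never truncates it.
    with open(dest_path, "ab") as outfile:
        with open(src_path, "rb") as infile:
            outfile.write(infile.read())
```

Because append mode keeps whatever is already in the file, result.txt should be deleted or emptied before a fresh run; otherwise the new text lands after the output of the previous run.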

combining text files in python

I am trying to combine multiple text files in a directory into one file. I want to write a HEADER and END statement in the combined file. The Python script I am currently using combines all the files into one, but I cannot figure out how to write a HEADER and END statement for each file in the combined file.
filenames = ['pm.pdb.B10010001.txt', 'pm.pdb.B10020001.txt', ...]
with open('/pdb3c91.0/output.txt', 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            for line in infile:
                outfile.write(line)
Just write the two lines yourself, one before and one after each file's contents.
filenames = ['pm.pdb.B10010001.txt', 'pm.pdb.B10020001.txt', ...]
with open('/pdb3c91.0/output.txt', 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            outfile.write("HEADER\n")
            for line in infile:
                outfile.write(line)
            outfile.write("END\n")
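
One detail the loop above does not handle: if the last line of an input file lacks a trailing newline, END is glued onto that line. The following is a hedged sketch of a variation that checks for this. `combine_with_markers` is a hypothetical name, and the HEADER and END strings are taken from the answer.

```python
def combine_with_markers(filenames, out_path):
    """Write each file's lines between a HEADER line and an END line."""
    with open(out_path, "w") as outfile:
        for fname in filenames:
            outfile.write("HEADER\n")
            last = "\n"  # an empty file needs no extra newline before END
            with open(fname) as infile:
                for line in infile:
                    outfile.write(line)
                    last = line
            if not last.endswith("\n"):
                outfile.write("\n")  # keep END on its own line
            outfile.write("END\n")
```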

Combining Regex files in Python

I have 48 .rx.txt files and I'm trying to combine them using Python. I know that when you combine .rx.txt files, you have to include a "|" in between the files.
Here's the code that I'm using:
import glob
read_files = filter(lambda f: f!='final.txt' and f!='result.txt', glob.glob('*.txt'))
with open("REGEXES.rx.txt", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            outfile.write(infile.read())
        outfile.write('|')
But when I try to run that I get this error:
Traceback (most recent call last):
  File "/Users/kosay.jabre/Desktop/Password Assessor/RegexesNEW/CombineFilesCopy.py", line 10, in <module>
    outfile.write('|')
TypeError: a bytes-like object is required, not 'str'
Any ideas on how I can combine my files into one file?
Your REGEXES.rx.txt is opened in binary mode, but with outfile.write('|') you are attempting to write a str to it instead of bytes. It seems that all of your files contain text data, so instead of opening them as binary, open them as text, i.e.:
with open("REGEXES.rx.txt", "w") as outfile:
    for f in read_files:
        with open(f, "r") as infile:
            outfile.write(infile.read())
        outfile.write('|')
In Python 2.7.x your code will work fine, but in Python 3.x you should add the b prefix to the string, outfile.write(b'|'), which makes it a bytes literal that can be written to a file opened in binary mode.
Your code for Python 3.x then becomes:
import glob
read_files = filter(lambda f: f!='final.txt' and f!='result.txt', glob.glob('*.txt'))
with open("REGEXES.rx.txt", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            outfile.write(infile.read())
        outfile.write(b'|')
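
Both versions above also write a "|" after the last file, and any line break at the end of a file becomes part of the combined pattern. In Python's `re` syntax a trailing "|" adds an empty alternative, which matches the empty string. The following sketch shows one way around this, assuming each file holds a single pattern and that dropping trailing line breaks is acceptable for these files. `combine_patterns` is a hypothetical name.

```python
def combine_patterns(paths):
    """Join the contents of the given files with "|" between them only."""
    patterns = []
    for path in paths:
        with open(path, "r") as infile:
            # Drop trailing line breaks so they do not end up inside the regex.
            text = infile.read().rstrip("\r\n")
        if text:  # skip empty files, which would add an empty alternative
            patterns.append(text)
    return "|".join(patterns)
```

The returned string could then be written to REGEXES.rx.txt opened in text mode with "w", using the same filtered file list as in the question.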

Python blank txt file creation

I am trying to create text files in bulk based on a list. A text file (titles.txt) contains a number of lines, each one a title, and the aim is to create one blank text file per title. Below are my non-working code, the contents of titles.txt, and the expected output.
titles = open("C:\\Dropbox\\Python\\titles.txt",'r')
for lines in titles.readlines():
    d_path = 'C:\\titles'
    output = open((d_path.lines.strip())+'.txt','a')
    output.close()
titles.close()
titles.txt
Title-A
Title-B
Title-C
New blank files to be created under the directory c:\\titles\\:
Title-A.txt
Title-B.txt
Title-C.txt
It's a little difficult to tell what you're attempting here, but hopefully this will be helpful:
import os.path
with open('titles.txt') as f:
    for line in f:
        newfile = os.path.join('C:\\titles',line.strip()) + '.txt'
        ff = open( newfile, 'a')
        ff.close()
If you want to replace existing files with blank files, you can open your files with mode 'w' instead of 'a'.
The following should work.
import os
titles='C:/Dropbox/Python/titles.txt'
d_path='c:/titles'
with open(titles,'r') as f:
    for l in f:
        # append '.txt' so the new files match the expected Title-A.txt names
        with open(os.path.join(d_path,l.strip() + '.txt'),'w') as _:
            pass
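
Both answers assume that the target directory already exists, and a blank line in titles.txt would produce a file named just ".txt". As a sketch of a variation that covers both cases, the function below uses pathlib; `create_blank_files` is a hypothetical name, and the paths passed to it would be the ones from the question.

```python
from pathlib import Path


def create_blank_files(titles_path, target_dir):
    """Create an empty <title>.txt in target_dir for each non-blank line."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the directory if needed
    created = []
    with open(titles_path) as f:
        for line in f:
            title = line.strip()
            if not title:
                continue  # skip blank lines instead of creating ".txt"
            new_file = target / (title + ".txt")
            new_file.touch()  # creates the file; existing contents are kept
            created.append(new_file.name)
    return created
```

Like mode 'a' in the first answer, `touch()` leaves an existing file's contents alone; to blank out existing files instead, `new_file.write_text("")` could be used in its place.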
