I want to find string e.g. "Version1" from my files of a folder which contains multiple ".c" and ".h" files in it and replace it with "Version2.2.1" using python file.
Anyone know how this can be done?
Here's a solution using os, glob and ntpath. The results are saved in a directory called "output". You need to put this in the directory where you have the .c and .h files and run it.
Create a separate directory called output and put the edited files there:
import glob
import ntpath
import os
output_dir = "output"
if not os.path.exists(output_dir):
os.makedirs(output_dir)
for f in glob.glob("*.[ch]"):
with open(f, 'r') as inputfile:
with open('%s/%s' % (output_dir, ntpath.basename(f)), 'w') as outputfile:
for line in inputfile:
outputfile.write(line.replace('Version1', 'Version2.2.1'))
Replace strings in place:
IMPORTANT! Please make sure to back up your files before running this:
import glob
for f in glob.glob("*.[ch]"):
with open(f, "r") as inputfile:
newText = inputfile.read().replace('Version1', 'Version2.2.1')
with open(f, "w") as outputfile:
outputfile.write(newText)
Related
I would like to read all the contents from all the text files in a directory. I have 4 text files in the "path" directory, and my codes are;
for filename in os.listdir(path):
filepath = os.path.join(path, filename)
with open(filepath, mode='r') as f:
content = f.read()
thelist = content.splitlines()
f.close()
print(filepath)
print(content)
print()
When I run the codes, I can only read the contents from only one text file.
I will be thankful that there are any advice or suggestions from you or that you know any other informative inquiries for this question in stackoverflow.
If you need to filter the files' name per suffix, i.e. file extension, you can either use the string method endswith or the glob module of the standard library https://docs.python.org/3/library/glob.html
Here an example of code which save each file content as a string in a list.
import os
path = '.' # or your path
files_content = []
for filename in filter(lambda p: p.endswith("txt"), os.listdir(path)):
filepath = os.path.join(path, filename)
with open(filepath, mode='r') as f:
files_content += [f.read()]
With the glob way here an example
import glob
for filename in glob.glob('*txt'):
print(filename)
This should list your file and you can read them one by one. All the lines of the files are stored in all_lines list. If you wish to store the content too, you can keep append it too
from pathlib import Path
from os import listdir
from os.path import isfile, join
path = "path_to_dir"
only_files = [f for f in listdir(path) if isfile(join(path, f))]
all_lines = []
for file_name in only_files:
file_path = Path(path) / file_name
with open(file_path, 'r') as f:
file_content = f.read()
all_lines.append(file_content.splitlines())
print(file_content)
# use all_lines
Note: when using with you do not need to call close() explicitly
Reference: How do I list all files of a directory?
Basically, if you want to read all the files, you need to save them somehow. In your example, you are overriding thelist with content.splitlines() which deletes everything already in it.
Instead you should define thelist outside of the loop and use thelist.append(content.splitlines) each time, which adds the content to the list each iteration
Then you can iterate over thelist later and get the data out.
so I have managed to concatenate every single .txt file of one directory into one file with this code:
import os
import glob
folder_path = "/Users/EnronSpam/enron1/ham"
for filename in glob.glob(os.path.join(folder_path, '*.txt')):
with open(filename, 'r', encoding="latin-1") as f:
text = f.read()
with open('new.txt', 'a') as a:
a.write(text)
but in my 'EnronSpam' folder there are actually multiple directories (enron 1-6), each of which has a ham directory. How is it possible to go through each directory and add every single file of that directory into one file?
If you just want to collect all the txt files from the enron[1-6]/ham folders try this:
glob.glob("/Users/EnronSpam/enron[1-6]/ham/*.txt")
It will pick up all txt files from the enron[1-6] folders' ham subfolders.
Also a slightly reworked snippet of the original code looks like this:
import glob
glob_path = "/Users/EnronSpam/enron[1-6]/ham/*.txt"
with open("new.txt", "w") as a:
for filename in glob.glob(glob_path):
with open(filename, "r", encoding="latin-1") as f:
a.write(f.read())
Instead of always opening and appending to the new file it makes more sense to open it right at the beginning and write the content of the ham txt files.
So, given that the count and the names of the directories are known, you should just add the full paths in a list and loop execute it all for each element:
import os
import glob
folder_list = ["/Users/EnronSpam/enron1/ham", "/Users/EnronSpam/enron2/ham", "/Users/EnronSpam/enron3/ham"]
for folder in folder_list:
for filename in glob.glob(os.path.join(folder, '*.txt')):
with open(filename, 'r', encoding="latin-1") as f:
text = f.read()
with open('new.txt', 'a') as a:
a.write(text)
I would need a script that owerwrites every file with .prj extension in a gven directory, and writes a certain string in these files. I started with this code to check if it works, and it works perfectly if I copy and run the .py file from the desired directory:
import os
for file in os.listdir("./"):
if file.endswith(".prj"):
with open(file, "r+") as f:
f.truncate(0)
f.write('Some text')
Then I added a feature where I can select the desired directory, so I would not have to copy the .py file every time.
import os
from tkinter.filedialog import askdirectory
path = askdirectory(title='Choose a directory')
for file in os.listdir(path):
if file.endswith(".prj"):
with open(file, "r+") as f:
f.truncate(0)
f.write('Some text')
Unfortunately this does not seem to work. I checked, and the path and file variables got the same values in both codes, so I do not know where I went wrong.
You can't use open(file,'r+') if you are not in the directory of the file.
You have to specify the path:
open(path+"/"+file, "r+")
os.listdir() returns filenames only without the parent directory information. Use os.path.join(path, file):
path = askdirectory(title='Choose a directory')
for file in os.listdir(path):
if file.endswith(".prj"):
with open(os.path.join(path, file), "r+") as f:
f.truncate(0)
f.write('Some text')
I want to write a program for this: In a folder I have n number of files; first read one file and perform some operation then store result in a separate file. Then read 2nd file, perform operation again and save result in new 2nd file. Do the same procedure for n number of files. The program reads all files one by one and stores results of each file separately. Please give examples how I can do it.
I think what you miss is how to retrieve all the files in that directory.
To do so, use the glob module.
Here is an example which will duplicate all the files with extension *.txt to files with extension *.out
import glob
list_of_files = glob.glob('./*.txt') # create the list of file
for file_name in list_of_files:
FI = open(file_name, 'r')
FO = open(file_name.replace('txt', 'out'), 'w')
for line in FI:
FO.write(line)
FI.close()
FO.close()
import sys
# argv is your commandline arguments, argv[0] is your program name, so skip it
for n in sys.argv[1:]:
print(n) #print out the filename we are currently processing
input = open(n, "r")
output = open(n + ".out", "w")
# do some processing
input.close()
output.close()
Then call it like:
./foo.py bar.txt baz.txt
You may find the fileinput module useful. It is designed for exactly this problem.
I've just learned of the os.walk() command recently, and it may help you here.
It allows you to walk down a directory tree structure.
import os
OUTPUT_DIR = 'C:\\RESULTS'
for path, dirs, files in os.walk('.'):
for file in files:
read_f = open(os.join(path,file),'r')
write_f = open(os.path.join(OUTPUT_DIR,file))
# Do stuff
Combined answer incorporating directory or specific list of filenames arguments:
import sys
import os.path
import glob
def processFile(filename):
fileHandle = open(filename, "r")
for line in fileHandle:
# do some processing
pass
fileHandle.close()
def outputResults(filename):
output_filemask = "out"
fileHandle = open("%s.%s" % (filename, output_filemask), "w")
# do some processing
fileHandle.write('processed\n')
fileHandle.close()
def processFiles(args):
input_filemask = "log"
directory = args[1]
if os.path.isdir(directory):
print "processing a directory"
list_of_files = glob.glob('%s/*.%s' % (directory, input_filemask))
else:
print "processing a list of files"
list_of_files = sys.argv[1:]
for file_name in list_of_files:
print file_name
processFile(file_name)
outputResults(file_name)
if __name__ == '__main__':
if (len(sys.argv) > 1):
processFiles(sys.argv)
else:
print 'usage message'
from pylab import *
import csv
import os
import glob
import re
x=[]
y=[]
f=open("one.txt",'w')
for infile in glob.glob(('*.csv')):
# print "" +infile
csv23=csv2rec(""+infile,'rb',delimiter=',')
for line in csv23:
x.append(line[1])
# print len(x)
for i in range(3000,8000):
y.append(x[i])
print ""+infile,"\t",mean(y)
print >>f,""+infile,"\t\t",mean(y)
del y[:len(y)]
del x[:len(x)]
I know I saw this double with open() somewhere but couldn't remember where. So I built a small example in case someone needs.
""" A module to clean code(js, py, json or whatever) files saved as .txt files to
be used in HTML code blocks. """
from os import listdir
from os.path import abspath, dirname, splitext
from re import sub, MULTILINE
def cleanForHTML():
""" This function will search a directory text files to be edited. """
## define some regex for our search and replace. We are looking for <, > and &
## To replaced with &ls;, > and &. We might want to replace proper whitespace
## chars to as well? (r'\t', ' ') and (f'\n', '<br>')
search_ = ((r'(<)', '<'), (r'(>)', '>'), (r'(&)', '&'))
## Read and loop our file location. Our location is the same one that our python file is in.
for loc in listdir(abspath(dirname(__file__))):
## Here we split our filename into it's parts ('fileName', '.txt')
name = splitext(loc)
if name[1] == '.txt':
## we found our .txt file so we can start file operations.
with open(loc, 'r') as file_1, open(f'{name[0]}(fixed){name[1]}', 'w') as file_2:
## read our first file
retFile = file_1.read()
## find and replace some text.
for find_ in search_:
retFile = sub(find_[0], find_[1], retFile, 0, MULTILINE)
## finally we can write to our newly created text file.
file_2.write(retFile)
This thing also works for reading multiple files, my file name is fedaralist_1.txt and federalist_2.txt and like this, I have 84 files till fedaralist_84.txt
And I'm reading the files as f.
for file in filename:
with open(f'federalist_{file}.txt','r') as f:
f.read()
I'm trying to use python to search for a string in a folder which contains multiple .txt files.
My objective is to find those files containing the string and move/or re-write them in another folder.
what I have tried is:
import os
for filename in os.listdir('./*.txt'):
if os.path.isfile(filename):
with open(filename) as f:
for line in f:
if 'string/term to be searched' in line:
f.write
break
probably there is something wrong with this but, of course, cannot figure it out.
os.listdir argument must be a path, not a pattern. You can use glob to accomplish that task:
import os
import glob
for filename in glob.glob('./*.txt'):
if os.path.isfile(filename):
with open(filename) as f:
for line in f:
if 'string/term to be searched' in line:
# You cannot write with f, because is open in read mode
# and must supply an argument.
# Your actions
break
As Antonio says, you cannot write with f because it is open in read mode.
A possible solution to avoid the problem is the following:
import os
import shutil
source_dir = "your/source/path"
destination_dir = "your/destination/path"
for top, dirs, files in os.walk(source_dir):
for filename in files:
file_path = os.path.join(top, filename)
check = False
with open(file_path, 'r') as f:
if 'string/term to be searched' in f.read():
check = True
if check is True:
shutil.move(file_path, os.path.join(destination_dir , filename))
Remember that if your source_dir or destination_dir contains some "special characters" you have to put the double back-slash.
For example, this:
source_dir = "C:\documents\test"
should be
source_dir = "C:\documents\\test"