I know there's a lot of content about reading & writing out there, but I'm still not quite finding what I need specifically.
I have 5 files (i.e. in1.txt, in2.txt, in3.txt, ...), and I want to open and read each one, run the data through a function I have, and then write the returned value to a corresponding new file (i.e. out1.txt, out2.txt, out3.txt, ...).
I want to do this in one program run. I'm not sure how to write the loop to process all the numbered files in one run.
If you want them to be processed serially, you can use a for loop as follows:
inPrefix = "in"
outPrefix = "out"
for i in range(1, 6):
    inFile = inPrefix + str(i) + ".txt"
    with open(inFile, 'r') as f:
        fileLines = f.readlines()
    # process the content of each file
    processedOutput = process(fileLines)
    # write the result to the matching output file
    outFile = outPrefix + str(i) + ".txt"
    with open(outFile, 'w') as f:
        f.write(processedOutput)
Note: this assumes that the input and output files are in the same directory as the script.
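If the files live somewhere else, a small variation using pathlib handles the path joining. This is a minimal sketch assuming the same process() function as above (taking a list of lines and returning a string); the directory path is a placeholder:
from pathlib import Path

base = Path("/path/to/data")  # placeholder; point this at the real folder
for i in range(1, 6):
    inFile = base / f"in{i}.txt"
    outFile = base / f"out{i}.txt"
    fileLines = inFile.read_text().splitlines(keepends=True)
    outFile.write_text(process(fileLines))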
If you just want to process every .txt file in a directory, one by one, you can do:
import os

count = 0
directory = "dir/where/your/files/are/"
for filename in os.listdir(directory):
    if filename.endswith(".txt"):
        count += 1
        with open(directory + filename, "r") as read_file:
            return_of_your_function = do_something_with_data(read_file.read())
        with open(directory + str(count) + filename, "w") as write_file:
            write_file.write(return_of_your_function)
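One caveat: os.listdir() returns names in arbitrary order, so the output numbering above may differ between runs. Sorting first makes it deterministic; this sketch still assumes your own do_something_with_data() function:
txt_files = sorted(f for f in os.listdir(directory) if f.endswith(".txt"))
for count, filename in enumerate(txt_files, start=1):
    with open(os.path.join(directory, filename)) as read_file:
        result = do_something_with_data(read_file.read())
    with open(os.path.join(directory, str(count) + filename), "w") as write_file:
        write_file.write(result)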
Here you go! I would do something like this:
(Assuming all the input .txt files are in the same input folder)
input_path = '/path/to/input/folder/'
output_path = '/path/to/output/folder/'
for count in range(1, 6):
    input_file = input_path + 'in' + str(count) + '.txt'
    output_file = output_path + 'out' + str(count) + '.txt'
    with open(input_file, 'r') as f:
        content = f.readlines()
    output = process_input(content)
    with open(output_file, 'w') as f:
        f.write(output)
This code builds text from a data frame of sentences, then saves each one as an .ssml file.
How can I get the files to be saved in a new folder?
num = 0
for i in range(len(sentences)):
    txt = sentences[i]
    new_txt = starter + txt + ender
    print(new_txt)
    num = num + 1
    with open("text" + str(num) + ".ssml", 'w+') as f:
        f.writelines(new_txt)
Add this at the start:
import os
folder_name = 'my_folder'
os.makedirs(folder_name, exist_ok=True)
Then change:
with open("text" + str(num) + ".ssml", 'w+') as f:
to:
with open(os.path.join(folder_name, f'text{num}.ssml'), 'w+') as f:
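os.path.join keeps the path portable across operating systems. Put together with the loop from the question (sentences, starter and ender as defined there), the whole thing might look like:
import os

folder_name = 'my_folder'
os.makedirs(folder_name, exist_ok=True)

for num, txt in enumerate(sentences, start=1):
    new_txt = starter + txt + ender
    with open(os.path.join(folder_name, f'text{num}.ssml'), 'w') as f:
        f.write(new_txt)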
With Python I'm attempting to edit a series of text files to insert a series of strings. I can do so successfully with a single txt file. Here's my working code that appends messages before and after the main body within the txt file:
filenames = ['text_0.txt']
with open("text_0.txt", "w") as outfile:
    for filename in filenames:
        with open(filename) as infile:
            header1 = "Message 1:"
            lines = "\n\n\n\n"
            header2 = "Message 2:"
            contents = header1 + infile.read() + lines + header2
            outfile.write(contents)
I'm seeking some assistance in structuring a script to iteratively make the same edits to a series of similar txt files in the directory. There are 20 or so similar txt files, all structured the same way: text_0.txt, text_1.txt, text_2.txt, and so on. Any assistance is greatly appreciated.
To loop through a folder of text files, you can do it like this:
import os

YOURDIRECTORY = "TextFilesAreHere"  # the folder that holds your text files
for file in os.listdir(YOURDIRECTORY):
    filename = os.fsdecode(file)
    with open(YOURDIRECTORY + "/" + filename, "r") as f:
        ...  # do what you want with the file
If you already know the file naming, then you can simply loop over the names. Note that each file has to be read before it is reopened in write mode, because "w" truncates it:
filenames = [f'text_{index}.txt' for index in range(21)]
for file_name in filenames:
    with open(file_name) as infile:
        body = infile.read()
    header1 = "Message 1:"
    lines = "\n\n\n\n"
    header2 = "Message 2:"
    with open(file_name, "w") as outfile:
        outfile.write(header1 + body + lines + header2)
Or loop over the directory like:
import os

for filename in os.listdir(directory):
    ...  # do something, like check whether the filename is in a list
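Combining the two approaches, a minimal sketch for the task in the question (assuming the text_N.txt files sit in the current directory) could look like this:
import os

for filename in sorted(os.listdir('.')):
    if not (filename.startswith('text_') and filename.endswith('.txt')):
        continue
    with open(filename) as infile:
        body = infile.read()
    with open(filename, 'w') as outfile:
        outfile.write('Message 1:' + body + '\n\n\n\n' + 'Message 2:')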
I have a directory with JSON files that all have the same structure. I would like to loop through them and replace the value of the key 'imgName' with the basename of the filename plus '.png'. I managed to do that; however, when I dump the content of my dictionaries back into the JSON files, they are empty. I'm sure there is some problem with the logic of my loops, but I can't figure it out.
Here is my code:
distorted_json = glob.glob('...')
for filename in distorted_json:
    with open(filename) as f:
        json_content = json.load(f)
        basename = os.path.basename(filename)
        if basename[-9:] == '.min.json':
            name = basename[0:-9]
        for basename in json_content:
            json_content['imgName'] = name + '.png'

for filename in distorted_json:
    with open(filename, 'w') as f:
        json.dumps(json_content)
Thank you very much in advance for any advice!
You need to use json.dump, which writes to a file: json.dump(json_content, f). (json.dumps only returns a string, so nothing was ever written.) Also remove the second loop and move its contents into the first loop; the inner for basename in json_content: loop is unnecessary as well.
for filename in distorted_json:
    with open(filename) as f:
        json_content = json.load(f)
    basename = os.path.basename(filename)
    if basename[-9:] == '.min.json':
        name = basename[0:-9]
        json_content['imgName'] = name + '.png'
        with open(filename, 'w') as f:
            json.dump(json_content, f)
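For reference, a compact variant using pathlib; the '*.min.json' pattern here is an assumption standing in for the elided glob argument in the question:
import json
from pathlib import Path

# '*.min.json' is an assumed pattern; substitute the real glob argument
for path in Path('.').glob('*.min.json'):
    data = json.loads(path.read_text())
    data['imgName'] = path.name[:-len('.min.json')] + '.png'
    path.write_text(json.dumps(data))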
So I'm writing a script to take large csv files and divide them into chunks. These files each have lines formatted like this:
01/07/2003,1545,12.47,12.48,12.43,12.44,137423
The first field is the date, and the next field is a time value; the data points are at minute granularity. My goal is to fill each output file with 8 days' worth of data, so I want to write all of a file's lines for each 8-day span into a new file.
Right now, I'm only seeing the program write one line per "chunk," rather than all the lines. Code is shown below; the screenshots (not reproduced here) showed how the chunk directories are made, along with a file and its contents.
For reference, the day-8 file contains just one line, with the time 1559, meaning it stored only the last line written before the mod operator became true. So I'm thinking that everything is getting overwritten somehow, since only the last values are being stored.
import os
import time

CWD = os.getcwd()
WRITEDIR = CWD + "/Divided Data/"
if not os.path.exists(WRITEDIR):
    os.makedirs(WRITEDIR)

FILEDIR = CWD + "/SP500"
os.chdir(FILEDIR)

valid_files = []
filelist = open("filelist.txt", 'r')
for file in filelist:
    cur_file = open(file.rstrip() + ".csv", 'r')
    cur_file.readline()  # skip first line
    prev_day = ""
    count = 0
    chunk_count = 1
    for line in cur_file:
        day = line[3:5]
        WDIR = WRITEDIR + "Chunk"
        cur_dir = os.getcwd()
        path = WDIR + " " + str(chunk_count)
        if not os.path.exists(path):
            os.makedirs(path)
        if (day != prev_day):
            # print(day)
            prev_day = day
            count += 1
            # Create new directory
            if (count % 8 == 0):
                chunk_count += 1
                PATH = WDIR + " " + str(chunk_count)
                if not os.path.exists(PATH):
                    os.makedirs(PATH)
                print("Chunk count: " + str(chunk_count))
                print("Global count: " + str(count))
        temp_path = WDIR + " " + str(chunk_count)
        os.chdir(temp_path)
        fname = file.rstrip() + str(chunk_count) + ".csv"
        with open(fname, 'w') as f:
            try:
                f.write(line + '\n')
            except:
                print("Could not write to file. \n")
        os.chdir(cur_dir)
        if (chunk_count >= 406):
            continue
    cur_file.close()
    # count += 1
The answer is in the comments, but let me give it here so that your question is answered.
You're opening your file in 'w' mode, which overwrites all previously written content. You need to open it in 'a' (append) mode instead:
fname = file.rstrip() + str(chunk_count) + ".csv"
with open(fname, 'a') as f:
See more on the open function and its modes in the Python documentation, which notes that 'w' means "open for writing, truncating the file first".
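A quick throwaway sketch (with a hypothetical demo.txt) to see the difference between the two modes:
with open("demo.txt", "w") as f:
    f.write("first\n")
with open("demo.txt", "w") as f:  # 'w' truncates: "first" is gone
    f.write("second\n")
with open("demo.txt", "a") as f:  # 'a' appends after the existing content
    f.write("third\n")
with open("demo.txt") as f:
    print(f.read())  # prints "second" then "third"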
I have written a script in Python which works on a single file. I couldn't find an answer for making it run on multiple files and giving separate output for each input file.
import re

out = open('/home/directory/a.out', 'w')
infile = open('/home/directory/a.sam', 'r')
for line in infile:
    if not line.startswith('#'):
        samlist = line.strip().split()
        if 'I' in samlist[5] or 'D' in samlist[5]:  # note: "'I' or 'D' in ..." would always be True
            match = re.findall(r'(\d+)I', samlist[5])  # remember to change I and D here as well
            intlist = [int(x) for x in match]
            ## if len(intlist) < 10:
            for indel in intlist:
                if indel >= 10:
                    ## print indel
                    ### intlist contains the lengths of the insertions for each read
                    # print intlist
                    read_aln_start = int(samlist[3])
                    indel_positions = []
                    for num1, i_or_d, num2, m in re.findall(r'(\d+)([ID])(\d+)?([A-Za-z])?', samlist[5]):
                        if num1:
                            read_aln_start += int(num1)
                        if num2:
                            read_aln_start += int(num2)
                        indel_positions.append(read_aln_start)
                    # print indel_positions
                    out.write(str(read_aln_start) + '\t' + str(i_or_d) + '\t' + str(samlist[2]) + '\t' + str(indel) + '\n')
out.close()
I would like my script to take multiple files with names like a.sam, b.sam, c.sam, and for each file give me the corresponding output: aout.sam, bout.sam, cout.sam.
Can you please give me either a solution or a hint?
Regards,
Irek
Loop over filenames.
input_filenames = ['a.sam', 'b.sam', 'c.sam']
output_filenames = ['aout.sam', 'bout.sam', 'cout.sam']

for infn, outfn in zip(input_filenames, output_filenames):
    out = open('/home/directory/{}'.format(outfn), 'w')
    infile = open('/home/directory/{}'.format(infn), 'r')
    ...
UPDATE
The following code generates output_filenames from the given input_filenames.
import os

def get_output_filename(fn):
    filename, ext = os.path.splitext(fn)
    return filename + 'out' + ext

input_filenames = ['a.sam', 'b.sam', 'c.sam']  # or glob.glob('*.sam')
output_filenames = list(map(get_output_filename, input_filenames))  # list() because map is lazy in Python 3
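For example:
>>> get_output_filename('a.sam')
'aout.sam'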
I'd recommend wrapping that script in a function, using the def keyword, and passing the names of the input and output files as parameters to that function.
def do_stuff_with_files(infile, outfile):
    out = open(outfile, 'w')     # output file opened for writing
    infile = open(infile, 'r')   # input file opened for reading
    # the rest of your script
Now you can call this function for any combination of input and output file names.
do_stuff_with_files('/home/directory/a.sam', '/home/directory/a.out')
If you want to do this for all files in a certain directory, use the glob library. To generate the output filenames, just replace the last three characters ("sam") with "out".
import glob

indir, outdir = '/home/directory/', '/home/directory/out/'
files = glob.glob1(indir, '*.sam')
infiles = [indir + f for f in files]
outfiles = [outdir + f[:-3] + "out" for f in files]

for infile, outfile in zip(infiles, outfiles):
    do_stuff_with_files(infile, outfile)
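One caveat: glob.glob1 is an undocumented internal helper and may change or disappear between Python versions. An equivalent using only documented calls:
import glob
import os

files = [os.path.basename(p) for p in glob.glob(os.path.join(indir, '*.sam'))]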
The following script works with an input and an output file. It loops over all files in the given directory with the ".sam" extension, performs the specified operation on them, and writes the results to a separate output file for each.
import os

# Define the directory containing the files you are working with
path = '/home/directory'

# Get all the files in that directory with the desired
# extension (in this case ".sam")
files = [f for f in os.listdir(path) if f.endswith('.sam')]

# Loop over the files with that extension
for file in files:
    # Open the input file
    with open(path + '/' + file, 'r') as infile:
        # Open the output file; note that 'a' appends, so re-running
        # the script will add to any existing output file
        with open(path + '/' + file.split('.')[0] + 'out.' +
                  file.split('.')[1], 'a') as outfile:
            # Loop over the lines in the input file
            for line in infile:
                # If a line in the input file can be characterized in a
                # certain way, write a different line to the output file.
                # Otherwise write the original line (from the input file)
                # to the output file
                if line.startswith('Something'):
                    outfile.write('A different kind of something')
                else:
                    outfile.write(line)

# Note the absence of either an infile.close() or an outfile.close()
# statement. The with statement handles that for you.