I have a directory of JSON files that all share the same structure. I would like to loop through them and replace the value of the key 'imgName' with the basename of the filename plus '.png'. I managed to do that; however, when I dump the content of my dictionaries back into the JSON files, they come out empty. I'm sure there is some problem with the logic of my loops, but I can't figure it out.
Here is my code:
import glob
import json
import os

distorted_json = glob.glob('...')
for filename in distorted_json:
    with open(filename) as f:
        json_content = json.load(f)
        basename = os.path.basename(filename)
        if basename[-9:] == '.min.json':
            name = basename[0:-9]
        for basename in json_content:
            json_content['imgName'] = name + '.png'
for filename in distorted_json:
    with open(filename, 'w') as f:
        json.dumps(json_content)
Thank you very much in advance for any advice!
You need to use json.dump, which writes to a file: json.dump(json_content, f). json.dumps only returns a string and writes nothing. Also remove the second loop and move its contents into the first loop.
for filename in distorted_json:
    with open(filename) as f:
        json_content = json.load(f)
    basename = os.path.basename(filename)
    if basename[-9:] == '.min.json':
        name = basename[0:-9]
        json_content['imgName'] = name + '.png'
    with open(filename, 'w') as f:
        json.dump(json_content, f)
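The distinction is easy to verify. A minimal sketch (the dictionary content here is a hypothetical stand-in for your JSON files):

```python
import json
import os
import tempfile

data = {"imgName": "example.png"}  # hypothetical stand-in for your JSON content

# json.dumps only returns a string -- nothing is written anywhere
as_text = json.dumps(data)

# json.dump writes straight into the file object you pass it
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump(data, f)

# reading the file back gives the original dictionary
with open(path) as f:
    round_tripped = json.load(f)
os.remove(path)
```

Calling json.dumps(json_content) inside the write loop, as in the original code, discards the string and leaves the freshly truncated file empty, which is exactly the symptom described.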
I want all files in directory "path" to have the string "error" removed from them, and the result saved to the same file that was edited. My current code (below) ends up clearing the entire file, rather than just removing the string and keeping everything else the same.
import os
path = "path"
files = os.listdir(path)
error = "string"
for index, file in enumerate(files):
    with open(os.path.join(path, file)) as fin, open(os.path.join(path, file), "w+") as fout:
        for line in fin:
            line = line.replace(error, "f")
            fout.write(line)
import os
path = "path"
files = os.listdir(path)
error = "string"
for index, file in enumerate(files):
    with open(os.path.join(path, file), 'r') as fin:
        d = fin.read()
    with open(os.path.join(path, file), "w") as fout:
        d = d.replace(error, "")
        fout.write(d)
This is the correct way to do this:
import os
path = "path"
for file in os.listdir(path):
    full_path = os.path.join(path, file)
    if not os.path.isdir(full_path):
        with open(full_path, 'r+') as fd:
            contents = fd.read().replace('error', '')
            fd.seek(0)
            fd.write(contents)
            fd.truncate()
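If a crash mid-write worries you, an alternative approach (not from the answer above) is to write the new contents to a sibling temporary file and then swap it into place with os.replace, so the original is never left half-written:

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Rewrite path with occurrences of old replaced by new, via a temp-file swap."""
    with open(path, "r") as fin:
        contents = fin.read().replace(old, new)
    # write to a temp file in the same directory, then atomically swap it in
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as fout:
        fout.write(contents)
    os.replace(tmp_path, path)
```

The temp file must live in the same directory as the target, because os.replace is only atomic within a single filesystem.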
With Python I'm attempting to edit a series of text files to insert a series of strings. I can do so successfully with a single txt file. Here's my working code that appends messages before and after the main body within the txt file:
filenames = ['text_0.txt']
with open("text_0.txt", "w") as outfile:
    for filename in filenames:
        with open(filename) as infile:
            header1 = "Message 1:"
            lines = "\n\n\n\n"
            header2 = "Message 2:"
            contents = header1 + infile.read() + lines + header2
            outfile.write(contents)
I'm seeking some assistance in structuring a script to iteratively make the same edits to a series of similar txt files in the directory. There are 20 or so similar txt files, all structured the same way: text_0.txt, text_1.txt, text_2.txt, and so on. Any assistance is greatly appreciated.
To loop through a folder of text files, you can do it like this:
import os
YOURDIRECTORY = "TextFilesAreHere"  # the folder containing your text files
for file in os.listdir(YOURDIRECTORY):
    filename = os.fsdecode(file)
    with open(os.path.join(YOURDIRECTORY, filename), "r") as f:
        # do what you want with the file
        ...
If you already know the file naming then you can simply loop. Note that each file has to be read before it is reopened for writing, because opening with "w" truncates it:
filenames = [f'text_{index}.txt' for index in range(21)]
for filename in filenames:
    with open(filename) as infile:
        header1 = "Message 1:"
        lines = "\n\n\n\n"
        header2 = "Message 2:"
        contents = header1 + infile.read() + lines + header2
    with open(filename, "w") as outfile:
        outfile.write(contents)
Or loop the directory like:
import os
for filename in os.listdir(directory):
    # do something, like check the filename against a list
    ...
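Since the files follow the text_N.txt pattern, glob can also do the matching for you. A sketch of the same header/footer edit, wrapped in a function (the directory argument is whatever folder holds your files):

```python
import glob
import os

def stamp_files(directory):
    """Wrap every text_*.txt in directory with a header and footer message."""
    for path in sorted(glob.glob(os.path.join(directory, "text_*.txt"))):
        # read first, then reopen for writing, so "w" doesn't truncate unread content
        with open(path) as infile:
            contents = "Message 1:" + infile.read() + "\n\n\n\n" + "Message 2:"
        with open(path, "w") as outfile:
            outfile.write(contents)
```

sorted() just makes the processing order deterministic; glob itself returns paths in arbitrary order.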
I know there's a lot of content about reading & writing out there, but I'm still not quite finding what I need specifically.
I have 5 files (i.e. in1.txt, in2.txt, in3.txt....), and I want to open/read, run the data through a function I have, and then output the new returned value to corresponding new files (i.e. out1.txt, out2.txt, out3.txt....)
I want to do this in one program run. I'm not sure how to write the loop to process all the numbered files in one run.
If you want them to be processed serially, you can use a for loop as follows:
inpPrefix = "in"
outPrefix = "out"
for i in range(1, 6):
    inFile = inpPrefix + str(i) + ".txt"
    with open(inFile, 'r') as f:
        fileLines = f.readlines()
    # process the content of each file
    processedOutput = process(fileLines)
    # write to file
    outFile = outPrefix + str(i) + ".txt"
    with open(outFile, 'w') as f:
        f.write(processedOutput)
Note: This assumes that the input and output files are in the same directory as the script is in.
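To drop that assumption, you can build the paths explicitly with os.path.join instead of relying on the current working directory. The helper below and its directory argument are purely illustrative:

```python
import os

def numbered_path(prefix, i, directory="."):
    """Build a path like ./in1.txt from a prefix, an index, and a directory."""
    return os.path.join(directory, prefix + str(i) + ".txt")
```

You would then call, say, numbered_path("in", 3, "/some/data/dir") when opening the input and numbered_path("out", 3, "/some/output/dir") when writing, keeping the script independent of where it is launched from.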
If you are looking to run them one by one separately, you can do:
import os
count = 0
directory = "dir/where/your/files/are/"
for filename in os.listdir(directory):
    if filename.endswith(".txt"):
        count += 1
        with open(directory + filename, "r") as read_file:
            return_of_your_function = do_something_with_data(read_file.read())
        with open(directory + str(count) + filename, "w") as write_file:
            write_file.write(return_of_your_function)
Here you go! I would do something like this:
(Assuming all the input .txt files are in the same input folder)
input_path = '/path/to/input/folder/'
output_path = '/path/to/output/folder/'
for count in range(1, 6):
    input_file = input_path + 'in' + str(count) + '.txt'
    output_file = output_path + 'out' + str(count) + '.txt'
    with open(input_file, 'r') as f:
        content = f.readlines()
    output = process_input(content)
    with open(output_file, 'w') as f:
        f.write(output)
I have a csv with two columns, Directory and File Name. Each row in the csv shows which directory each file belongs in, like so:
Directory, File Name
DIR18, IMG_42.png
DIR12, IMG_16.png
DIR4, IMG_65.png
So far I have written code that grabs each directory and file name from the csv; now I need to move each file to its destination, like so:
movePng.py
import shutil
import os
import csv
from collections import defaultdict

columns = defaultdict(list)  # each value in each column is appended to a list
with open('/User/Results.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        for (k, v) in row.items():
            columns[k].append(v)

source = '/User/PNGItems'
files = os.listdir(source)
for f in files:
    pngName = f[:-4]
    for filename in columns['File Name']:
        fileName = filename[:-4]
        if pngName == fileName:
            # GET THIS POSITION IN columns['File Name'] for columns['Directory']
            shutil.move(f, source + '/' + DIRECTORY)
How do I get the index of the columns['File Name'] and grab the corresponding directory out of columns['Directory'] ?
You should read the assignments into a dictionary and then query that:
import csv

folder_assignment_file = "folders.csv"
file_folder = dict()
with open(folder_assignment_file, "r") as fh:
    reader = csv.reader(fh)
    for folder, filename in reader:
        file_folder[filename] = folder
And then get the target folder like so: DIRECTORY = file_folder[fileName].
Some other hints:
- filename and fileName are not good variable names; this will only lead to hard-to-find bugs, because Python is case sensitive
- use os.path.splitext to split the extension off the filename
- if not all your files are in one folder, the glob module and os.walk might come in handy
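The os.path.splitext hint is worth spelling out, since it replaces the fragile filename[:-4] slicing used above:

```python
import os.path

# splitext separates the extension cleanly, whatever its length
root, ext = os.path.splitext("IMG_42.png")

# slicing with [:-4] would silently mangle longer extensions like ".jpeg",
# whereas splitext handles them correctly
jpeg_root = os.path.splitext("photo.jpeg")[0]
```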
Edit:
Creating the dict can be made even nicer like so:
with open(folder_assignment_file, "r") as fh:
    reader = csv.reader(fh)
    file_folders = {filename: folder for folder, filename in reader}
To solve this I used @Peter Wood's suggestion and it worked beautifully. I also had to modify the shutil.move call.
Here is the code below
for f in files:
    pngName = f[:-4]
    for filename, directory in zip(columns['File Name'], columns['Directory']):
        fileName = filename[:-4]
        if pngName == fileName:
            directoryName = directory[1:]
            shutil.move(os.path.join(source, f), source + '/' + directoryName)
The code I am working with takes in a .pdf file, and outputs a .txt file. My question is, how do I create a loop (probably a for loop) which runs the code over and over again on all files in a folder which end in ".pdf"? Furthermore, how do I change the output each time the loop runs so that I can write a new file each time, that has the same name as the input file (ie. 1_pet.pdf > 1_pet.txt, 2_pet.pdf > 2_pet.txt, etc.)
Here is the code so far:
path="2_pet.pdf"
content = getPDFContent(path)
encoded = content.encode("utf-8")
text_file = open("Output.txt", "w")
text_file.write(encoded)
text_file.close()
The following script solves your problem:
import os

sourcedir = 'pdfdir'
dl = os.listdir(sourcedir)
for f in dl:
    fs = f.split(".")
    if fs[-1] == "pdf":
        path_in = os.path.join(sourcedir, f)
        content = getPDFContent(path_in)
        encoded = content.encode("utf-8")
        path_out = os.path.join(sourcedir, fs[0] + ".txt")
        text_file = open(path_out, 'w')
        text_file.write(encoded)
        text_file.close()
Create a function that encapsulates what you want to do to each file.
import os.path

def parse_pdf(filename):
    "Parse a pdf into text"
    content = getPDFContent(filename)
    encoded = content.encode("utf-8")
    # split off the pdf extension to add .txt instead
    (root, _) = os.path.splitext(filename)
    text_file = open(root + ".txt", "w")
    text_file.write(encoded)
    text_file.close()

Then apply this function to a list of filenames, like so:
for f in files:
    parse_pdf(f)
One way to operate on all PDF files in a directory is to invoke glob.glob() and iterate over the results:
import glob

for path in glob.glob('*.pdf'):
    content = getPDFContent(path)
    encoded = content.encode("utf-8")
    text_file = open("Output.txt", "w")
    text_file.write(encoded)
    text_file.close()
Another way is to allow the user to specify the files:
import sys

for path in sys.argv[1:]:
    ...
Then the user runs your script like python foo.py *.pdf.
You could use a recursive function to search the folder and all subfolders for files that end with pdf, then take those files and create a text file for each.
It could be something like:
import os

def convert_PDF(path, func):
    d = os.path.basename(path)
    if os.path.isdir(path):
        [convert_PDF(os.path.join(path, x), func) for x in os.listdir(path)]
    elif d[-4:] == '.pdf':
        func(path)

# based entirely on your example code
def convert_to_txt(path):
    content = getPDFContent(path)
    encoded = content.encode("utf-8")
    file_path = os.path.dirname(path)
    # replace the pdf extension with txt
    file_name = os.path.basename(path)[:-4] + '.txt'
    text_file = open(file_path + '/' + file_name, "w")
    text_file.write(encoded)
    text_file.close()

convert_PDF('path/to/files', convert_to_txt)
Because the actual operation is changeable, you can replace the function with whatever operation you need to perform (like using a different library, converting to a different type, etc.)
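The standard library can also do the recursive traversal for you: os.walk visits every subfolder without hand-written recursion. A sketch of the same idea, where func is whatever per-file operation you pass in (e.g. the convert_to_txt above):

```python
import os

def apply_to_pdfs(root, func):
    """Call func on the path of every .pdf file under root, including subfolders."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                func(os.path.join(dirpath, name))
```

The lower() call makes the extension check case-insensitive, so FILE.PDF is matched too.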