Write multiple text files to a directory in Python

I am working on saving text to different files. I have already created several files, and each text file has some text/paragraphs in it. Now I just want to save these files to a directory. I have already created the directory myself, but it is currently empty. I want to save these text files into it.
The partial code is below:
for doc in root:
    docID = doc.find('DOCID').text.strip()
    text = doc.find('TEXT').text.strip()  # ".strip()", not ",strip()"
    # one file per document, named after its DOCID
    with open("%s" % docID, 'w') as f:
        f.write(str(text))
So I have created all the files with text in them, and I also have an empty folder/directory. I just don't know how to put these files into the directory.
I would appreciate any help.
========================================================================
[Solved] Thank you all for your help! I figured it out, so I am editing my summary here. I had two problems:
1. My docID was saved as a tuple. I needed to convert it to a string without any extra symbols. Here is the reference I used: https://stackoverflow.com/a/17426417/9387211
2. I created a new path and wrote the text to it. I used this method: https://stackoverflow.com/a/8024254/9387211
Here is my updated code, which now works without problems. Thanks again, everyone!
import os

for doc in root:
    docID = doc.find('DOCID').text.strip()
    did = ''.join(map(str, docID))  # make sure the ID is a plain string
    text = doc.find('TEXT').text.strip()  # ".strip()", not ",strip()"
    # build the full path inside the destination folder
    filename = os.path.join(dst_folder_path, did)
    with open(filename, 'w') as f:
        f.write(str(text))

Suppose you have all the text files in your home directory (~/) and you want to move them to the /path/to/folder directory.
from shutil import copyfile
import os

docid_list = ['docid-1', 'docid-2']
dst_dir = '/path/to/folder'
for did in docid_list:
    # copyfile needs the full destination file path, not just the directory
    copyfile(did, os.path.join(dst_dir, did))
    os.remove(did)
This copies the docid files into /path/to/folder and removes them from the home directory (assuming you run it from the home directory).
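As an alternative to copying and then deleting, `shutil.move` does both in one call. The following is only a minimal sketch under the assumption that the files already exist in a known source folder; `move_files` and its parameter names are made up for illustration.

```python
import shutil
from pathlib import Path


def move_files(names, src_dir, dst_dir):
    """Move each named file from src_dir into dst_dir; return the new paths."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    moved = []
    for name in names:
        target = dst / name
        # shutil.move copies and deletes in one step (or renames when possible)
        shutil.move(str(Path(src_dir) / name), str(target))
        moved.append(target)
    return moved
```

For example, `move_files(['docid-1', 'docid-2'], '.', '/path/to/folder')` would move both files out of the current directory.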

You can build the full file path (directory plus file name) and pass it to open, like this:
doc_file = open(<file path>, 'w')
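Writing each file straight into the target directory avoids the separate move step. Here is a hedged sketch of that idea; `write_docs` is a hypothetical helper, and it assumes the document IDs and texts have already been extracted into (id, text) pairs.

```python
from pathlib import Path


def write_docs(docs, dst_dir):
    """Write each (doc_id, text) pair to dst_dir/doc_id; return the paths."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # make sure the folder exists
    paths = []
    for doc_id, text in docs:
        path = dst / doc_id  # full path: directory plus file name
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths
```

Calling `write_docs([("doc-1", "first text")], "my_folder")` would create `my_folder/doc-1` containing the text.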

Related

How do I make a list with values taken from different text files?

I have a folder, which I want to select manually, containing some number of .txt files. I want to write a program that lets me run it, select my folder, and then cycle through all the files in the folder and take a value from a fixed position in each one.
I have already written a piece of code that takes the value from a single .txt file:
mylines = []
with open('test1.txt', 'rt') as myfile:
    for myline in myfile:
        mylines.append(myline)
subline = mylines[58]
sub = subline.split(' ')
print(sub[5])
EDIT: I also have a piece of code that makes a list of the paths of all the files I want to use this on:
import glob
path = r'C:/Users/Etienne/.spyder-py3/test/*.UIT'
files = glob.glob(path)
print(files)
How can I use the first piece of code on every file in the list from the second piece of code, so that I end up with a list of values?
I have never worked with code before, but this would make my work a lot faster, so I want to pick up Python.
If I understood the problem correctly, the os module might be helpful for you.
The os.listdir() method in Python returns a list of all files and directories in the specified directory. For example:
import os
# Get the list of all files and directories
# in the root directory, you can change your directory
path = "/"
dir_list = os.listdir(path)
print("Files and directories in '", path, "' :")
# print the list
print(dir_list)
With this list you can iterate over your .txt files.
For more information, see:
How can I iterate over files in a given directory?
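To connect the two snippets from the question, the single-file code can be wrapped in a function and applied to every path that glob returns. This is a rough sketch, not a definitive solution: `field_from_line` and `collect_values` are hypothetical names, and the defaults (line index 58, field index 5) are taken from the question's code.

```python
import glob


def field_from_line(path, line_index=58, field_index=5):
    """Return one space-separated field from one line of a text file."""
    with open(path, "rt") as handle:
        lines = handle.readlines()
    # strip() removes a trailing newline if the field is last on its line
    return lines[line_index].split(" ")[field_index].strip()


def collect_values(pattern, line_index=58, field_index=5):
    """Apply field_from_line to every file matching a glob pattern."""
    return [
        field_from_line(path, line_index, field_index)
        for path in sorted(glob.glob(pattern))
    ]
```

With the pattern from the question, `collect_values(r'C:/Users/Etienne/.spyder-py3/test/*.UIT')` would return one value per file, in sorted file-name order.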

Edit multiple text files, and save as new files

This is my first post on Stack Overflow, so please be nice; I am a complete beginner with Python.
I want to read multiple files from a folder, split the text, and save the output as a new file. I have figured out this part of the code, but it only works on one file at a time. I have tried googling but can't find a way to run this code on multiple text files in a folder and save the result as "output" plus a number for each file in the folder. Is this doable?
with open("file_path") as fReader:
    corpus = fReader.read()
    loc = corpus.find("\n\n")
    print(corpus[:loc], file=open("output.txt","a"))
You could work with a list, like this:
from pathlib import Path

source_dir = Path("./")  # path to the directory
# iterate over source_dir (the original referred to an undefined name)
files = list(x for x in source_dir.iterdir() if x.is_file())
for i in range(len(files)):
    file = Path(files[i])
    outfile = "output_" + str(i) + file.suffix
    with open(file) as fReader, open(outfile, "w") as fOut:
        corpus = fReader.read()
        loc = corpus.find("\n\n")
        fOut.write(corpus[:loc])
Welcome to the site. Yes, what you are asking is completely doable, and you are on the right track. You will need to do a little research and practice with the os module, which is highly useful when working with files. The two functions you will want to look into are:
os.path.join()
os.listdir()
I would suggest you put two folders next to your Python file, one called data and the other called output to catch the results. Start by seeing whether you can write the code to list all the files in your data directory, then keep building on that loop. Something like this should list all the files:
# folder file lister/test writer
import os

source_folder_name = 'data'  # the folder to be read that is in the SAME directory as this file
output_folder_name = 'output'  # will be used later...
files = os.listdir(source_folder_name)

# get this working first
for f in files:
    print(f)

# make output file names and just write a 1-liner into each file...
for f in files:
    output_filename = f.split('.')[0]  # the part before the first period
    output_filename += '_output.csv'
    output_path = os.path.join(output_folder_name, output_filename)
    with open(output_path, 'w') as writer:
        writer.write('some data')
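One detail worth noting about the `find("\n\n")` slicing in the question: when a file contains no blank line, `find` returns -1 and `corpus[:-1]` silently drops the last character. The sketch below sidesteps that with `str.partition` and writes numbered outputs into a separate folder. It is an illustration under assumptions: `first_paragraph` and `split_folder` are hypothetical names, and the files are processed in sorted name order.

```python
from pathlib import Path


def first_paragraph(text):
    """Return the text before the first blank line, or all of it if none."""
    head, _sep, _rest = text.partition("\n\n")
    return head


def split_folder(source_dir, output_dir):
    """Write the first paragraph of each file to output_dir/output_<n><suffix>."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # list the sources before writing anything; skip sub-directories
    sources = sorted(p for p in Path(source_dir).iterdir() if p.is_file())
    written = []
    for number, source in enumerate(sources):
        target = out / f"output_{number}{source.suffix}"
        target.write_text(first_paragraph(source.read_text()))
        written.append(target)
    return written
```

Running `split_folder("data", "output")` would then produce `output/output_0.txt`, `output/output_1.txt`, and so on.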

How can I use file names as an input?

I wrote a python script (with pandas library) to create txt files. I also use a txt file as an input. It works well but I want to make it more automated.
My code starts like;
girdi = input("Lütfen gir: ")  # Turkish for "Please enter: "
input2 = girdi+".txt"
veriCNR = pd.read_table(
    input2, decimal=",",
    usecols=[
        "Chromosome",
        "Name",
.
.
.
I am entering the name of the files one by one and getting outputs like this:
.
.
.
outputCNR = girdi+".cnr"
sonTabloCNR.to_csv(outputCNR, sep="\t", index=False)
outputCNS = girdi+".cns"
sonTabloCNS.to_csv(outputCNS, sep="\t", index=False)
outputCNG = girdi+".genemetrics.cns"
sonTabloCNG.to_csv(outputCNG, sep="\t", index=False)
As you can see, I am also using the input name for the outputs. They are tab-separated text files with different file extensions.
I want to use all the txt files in a folder as input and run this script for every one of them.
I hope I explained it clearly.
P.S. I am not a programmer, so please explain any code :)
You can put the code that creates the files in a function that takes the file path as input:
def generate_results(input_filepath):
    # your code generating the files
    ...
Then you can call this function in a loop to generate all the result files.
Here's an example using glob to get all the text files in a directory:
import os
import glob

directory = "path to my directory"
for filepath in glob.glob(os.path.join(directory, "*.txt")):
    generate_results(filepath)
If you want to iterate over the files in a folder, use os.listdir(path). It returns a list of the filenames in path (by default, the current directory).
So for example:
import os

supported_extensions = ['.cns', '.cnr']
for filename in os.listdir():
    # Make sure file has desired extension
    if os.path.splitext(filename)[1] in supported_extensions:
        function(filename)
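Since the question derives every output name from the input name, that step can be separated from the pandas work. The following is a small sketch of one way to do it with pathlib; `output_paths` is a hypothetical helper, and the three suffixes are the ones shown in the question.

```python
from pathlib import Path


def output_paths(input_path, suffixes=(".cnr", ".cns", ".genemetrics.cns")):
    """Return one output path per suffix, next to the input file."""
    source = Path(input_path)
    base = source.with_suffix("")  # drop the trailing ".txt"
    return [base.parent / (base.name + suffix) for suffix in suffixes]
```

Inside a `generate_results(filepath)` function, the three returned paths could be passed to the three `to_csv` calls in place of the names built from typed input.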

Creating empty text files with specific names in a directory

I have two directories:
dir = path/to/annotations
and
dir_img = path/to/images
The format of image names in dir_img is image_name.jpg.
I need to create empty text files in dir as: image_name.txt, wherein I can later store annotations corresponding to the images. I am using Python.
I don't know how to proceed. Any help is highly appreciated. Thanks.
[Edit]: I tried the answer given here. It ran without any error but didn't create any files either.
This should create the empty files for you, and then you can proceed further.
import os

for f in os.listdir(source_dir):
    if f.endswith('.jpg'):
        file_path = os.path.join(target_dir, f.replace('.jpg', '.txt'))
        with open(file_path, "w+"):
            pass
You can use the module os to list the existing files, and then just open the file in mode w+ which will create the file even if you're not writing anything into it. Don't forget to close your file!
import os

for f in os.listdir(source_dir):
    if f.endswith('.jpg'):
        open(os.path.join(target_dir, f.replace('.jpg', '.txt')), 'w+').close()
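Note that opening in `w+` mode truncates a file that already exists, which would erase annotations saved earlier. A possible alternative, sketched here with pathlib, is `Path.touch()`, which creates a missing file but leaves existing content alone. `create_annotation_stubs` is a hypothetical name, and the sketch assumes lowercase `.jpg` extensions.

```python
from pathlib import Path


def create_annotation_stubs(image_dir, annotation_dir):
    """Create an empty <image_name>.txt in annotation_dir for each .jpg image."""
    annotations = Path(annotation_dir)
    annotations.mkdir(parents=True, exist_ok=True)
    created = []
    for image in sorted(Path(image_dir).glob("*.jpg")):
        stub = annotations / (image.stem + ".txt")
        stub.touch()  # creates the file if missing; does not truncate an existing one
        created.append(stub)
    return created
```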

Convert all pdf in a folder to text files and store them in different folders using python

I'm trying to convert all the PDFs stored in one folder (say, 60 PDFs) into text documents and store them in different folders. Each folder should have a unique name.
I tried this code. The folders were created, but the pdftotext conversion command doesn't work in the loop:
import os

def listfiles(path):
    for root, dirs, files in os.walk(path):
        for f in files:
            print(f)
            newpath = r'/home/user/files/'
            p=f.replace("pdf","")
            newpath=newpath+p
            if not os.path.exists(newpath): os.makedirs(newpath)
            os.system("pdftotext f f.txt")

f=listfiles("/home/user/reports")
One problem here is the os.system("pdftotext f f.txt") call. I assume you want the f's here replaced with the current file in the loop. If that is the case you need to change this to os.system("pdftotext {0} {0}.txt".format(f))
Another issue may be that the working directory is not being set up so the call to system is looking for the file in the wrong place. Try using os.chdir every time you change folders.
To place the text file in a different folder, try:
os.system("pdftotext {0} {1}/{0}.txt".format(f, newpath))
I don't know Python, but I think I can clearly see a mistake there. It looks like you are just replacing the ".pdf" with a ".txt". Since a PDF isn't just plain text, this won't work.
For the conversion, look at the top answer to this post:
Python module for converting PDF to text
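The `os.system` calls above break on file names containing spaces, and they depend on the working directory. One way around both, sketched here as an assumption rather than a tested recipe, is to pass full paths as an argument list to `subprocess.run`. The helper names are hypothetical, and `convert_all` only works if the `pdftotext` command-line tool is installed and on the PATH.

```python
import subprocess
from pathlib import Path


def build_pdftotext_command(pdf_path, out_root):
    """Return (command, text_path) for converting one PDF into its own folder."""
    pdf = Path(pdf_path)
    folder = Path(out_root) / pdf.stem  # one folder per PDF, named after it
    text_path = folder / (pdf.stem + ".txt")
    return ["pdftotext", str(pdf), str(text_path)], text_path


def convert_all(pdf_dir, out_root):
    """Run pdftotext on every PDF under pdf_dir (needs pdftotext installed)."""
    for pdf in sorted(Path(pdf_dir).rglob("*.pdf")):
        command, text_path = build_pdftotext_command(pdf, out_root)
        text_path.parent.mkdir(parents=True, exist_ok=True)
        # a list of arguments avoids shell quoting problems with spaces
        subprocess.run(command, check=True)
```

With the paths from the question, the call would be `convert_all("/home/user/reports", "/home/user/files")`.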
