Loop for opening .txt files

I have a list of .txt files in one folder, with names like "image1.txt", "image2.txt", "image3.txt", etc.
I need to perform some operations on each file.
I was trying it like this:
import glob
for each_file in glob.glob("C:\...\image\d+\.txt"):
    print(each_file)  # or whatever
But it doesn't seem to work. How can I solve this?

I think you are looking for something like this:
import os
for file in os.listdir('parent_folder'):
    with open(os.path.join('parent_folder', file), 'r') as f:
        data = f.read()
        # operation on data
# Alternatively
for i in range(10):  # image0.txt ... image9.txt; adjust the range to your file numbers
    with open(f'image{i}.txt', 'r') as f:
        data = f.read()
        # operation on data
The with statement closes the file for you when the block ends, even if an exception is raised, so you don't need to call close() yourself.
If you want to read and also write to the file in the same operation, use open(file, 'r+') and then the following:
with open(f'image{i}.txt', 'r+') as f:
    data = f.read()
    # operation on data
    f.seek(0)
    f.write(data)
    f.truncate()

Take this answer that I wrote.
Path objects have a read_text method. As long as it can decode the contents, it will read them, so you shouldn't have a problem with text files. Note that glob patterns are shell-style wildcards, not regular expressions, so \d+ will not match digits; use a wildcard such as image*.txt instead. Also, since you are using Windows paths, make sure to put an r before the string, like this: r"C:\...\image1.txt", or change the direction of the slashes. A quick example (rglob also searches subfolders):
from pathlib import Path
for f in Path(r"C:\...").rglob('image*.txt'):
    print(f.read_text())
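If the names must match image followed by digits exactly, one option is to let glob do a coarse wildcard match and then filter with a regular expression. This is a minimal sketch, assuming all files sit in a single folder; the helper name numbered_images and its folder argument are made up for illustration and are not part of the original answers.

```python
import re
from pathlib import Path

# Matches names such as image1.txt or image42.txt, but not image_old.txt.
IMAGE_NAME = re.compile(r"image(\d+)\.txt")


def numbered_images(folder):
    """Return image<N>.txt paths in `folder`, sorted by N (hypothetical helper)."""
    matches = []
    for path in Path(folder).glob("image*.txt"):  # coarse wildcard filter
        m = IMAGE_NAME.fullmatch(path.name)       # precise regex filter
        if m:
            matches.append((int(m.group(1)), path))
    # Sort numerically so image2.txt comes before image10.txt.
    return [path for _, path in sorted(matches)]
```

It could then be used as `for path in numbered_images('parent_folder'): print(path.read_text())`.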

Related

Write list of bytes into a file, but some records got lost

I am new to programming and got an issue with writing bytes. Here is what I wrote:
file = open('filePath/input.train', 'wb')
for i in range(len(myList)):
    file.write(bytes((myList[i]),'UTF-8'));
If I print 'i' here, it is 629.
The '.train' suffix is required by the project. In order to check it, I read it and write to a txt file:
file = open('filePath/input.train', 'rb')
content = file.read()
testFile = open('filePath/test.txt', 'wb')
testFile.write(content)
Now, the problem is, len(list) = 629 while I got 591 lines in test.txt file. It brought me problems later.
Why did this happen and how should I solve it?

First, when you open a file and write to it, remember to close it after writing. Until the file is closed (or flushed), some of the data may still be sitting in a buffer and not yet on disk. Like this:
file = open('filePath/input.train', 'wb')
for i in range(len(myList)):
    file.write(bytes((myList[i]),'UTF-8'));
file.close()
Second, Python does not need a ";" at the end of a statement.
Third, file is the name of a built-in in Python 2 (it is not a keyword), so don't use it as a variable name. You can use f, my_file, or anything else that doesn't shadow a built-in.
Fourth, iterating over the list directly is better than for i in range(len(xxx)).
Putting all of this together, your code can look like this:
f = open('filePath/input.train', 'wb')
for line in myList:
    f.write(bytes(line, 'UTF-8'))
f.close()
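The same write can be done with a with block, so the close (and the flush that comes with it) cannot be forgotten. This is a sketch, not the asker's code: the names write_records and count_lines are illustrative, and it assumes each record should end up on its own line, whereas the original code writes the strings as they are.

```python
def write_records(path, records):
    """Write each string in `records` as one UTF-8 encoded line."""
    with open(path, "wb") as f:
        for record in records:
            # Assumption: records carry no trailing newline, so add one.
            f.write(record.encode("utf-8") + b"\n")
    # Leaving the with block flushes and closes the file.


def count_lines(path):
    """Count lines by iterating over the file in binary mode."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)
```

Comparing count_lines(path) with len(records) right after writing is a quick way to check that nothing was lost.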

python clear content writing on same file

I am a newbie to Python. I have code that must write its output back to the same file, but when I do that it clears the file's contents. Please help me fix it.
How should I modify my code such that the contents will be written back on the same file?
My code:
import re
numbers = {}
with open('1.txt') as f,open('11.txt', 'w') as f1:
    for line in f:
        row = re.split(r'(\d+)', line.strip())
        words = tuple(row[::2])
        if words not in numbers:
            numbers[words] = [int(n) for n in row[1::2]]
        numbers[words] = [n+1 for n in numbers[words]]
        row[1::2] = map(str, numbers[words])
        indentation = (re.match(r"\s*", line).group())
        print (indentation + ''.join(row))
        f1.write(indentation + ''.join(row) + '\n')

In general, it's a bad idea to write over a file you're still processing (or change a data structure over which you are iterating). It can be done...but it requires much care, and there is little safety or restart-ability should something go wrong in the middle (an error, a power failure, etc.)
A better approach is to write a clean new file, then rename it to the old name. For example:
import re
import os
filename = '1.txt'
tempname = "temp{0}_{1}".format(os.getpid(), filename)
numbers = {}
with open(filename) as f, open(tempname, 'w') as f1:
    pass  # placeholder: file processing as before
os.rename(tempname, filename)  # on Windows this fails if filename exists; os.replace() overwrites it
Here I've dropped filenames (both original and temporary) into variables, so they can be easily referred to multiple times or changed. This also prepares for the moment when you hoist this code into a function (as part of a larger program), as opposed to making it the main line of your program.
You don't strictly need the temporary name to embed the process id, but it's a standard way of making sure the temp file is uniquely named (temp32939_1.txt vs temp_1.txt or tempfile.txt, say).
It may also be helpful to create backups of the files as they were before processing. In which case, before the os.rename(tempname, filename) you can drop in code to move the original data to a safer location or a backup name. E.g.:
backupname = filename + ".bak"
os.rename(filename, backupname)
os.rename(tempname, filename)
While beyond the scope of this question, if you used a read-process-overwrite strategy frequently, it would be possible to create a separate module that abstracted these file-handling details away from your processing code.
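As a rough illustration of what such a helper could look like, the sketch below wraps the write-to-temp-then-rename pattern in one function. The name rewrite_file and its transform callback are hypothetical. Unlike the line-by-line loop above, it reads the whole file into memory for brevity, and it uses tempfile.mkstemp and os.replace in place of the PID-based name and os.rename.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Replace the contents of `path` with transform(old_text) via a temp file."""
    with open(path, "r") as src:
        new_text = transform(src.read())

    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays on
    # the same filesystem.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as dst:
            dst.write(new_text)
        # os.replace overwrites an existing destination, including on Windows.
        os.replace(temp_path, path)
    except BaseException:
        # Something failed before the swap: discard the temp file and
        # leave the original untouched.
        os.unlink(temp_path)
        raise
```

For example, `rewrite_file('1.txt', str.upper)` would upper-case the file. Note that the replaced file gets the temp file's permissions, which this sketch does not try to preserve.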

Use
open('11.txt', 'a')
to append to the file, instead of 'w', which creates a new file or truncates an existing one.

If you want to read and modify a file in one go, use 'r+' mode.
f = open('/path/to/file.txt', 'r+')
content = f.read()
content = content.replace('oldstring', 'newstring')  # for example, change some substring in the whole file
f.seek(0)  # move to the beginning of the file
f.write(content)
f.truncate()  # cut off the leftover "tail" on disk if the new content is shorter than the old
f.close()

Python read/write file without closing

Sometimes when I open a file for reading or writing in Python
f = open('workfile', 'r')
or
f = open('workfile', 'w')
I read/write the file, and then at the end I forget to do f.close(). Is there a way to automatically close after all the reading/writing is done, or after the code finishes processing?

with open('file.txt','r') as f:
    # file is opened and accessible via f
    pass
# file will be closed before here

You could always use the with...as statement:
with open('workfile') as f:
    """Do something with file"""
or you could also use a try...finally block:
f = open('workfile', 'r')
try:
    """Do something with file"""
finally:
    f.close()
Since you say that you forget to add f.close(), I guess the with...as statement will be the best option for you, and given its simplicity, it's hard to see a reason for not using it!
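To see that with really does close the file, you can inspect the closed attribute of the file object afterwards. The two small functions below are only a demonstration and their names are made up; the second one shows that the file is closed even when an exception is raised inside the block.

```python
def read_first_line(path):
    """Return the first line of `path` and the file object used to read it."""
    with open(path, "r") as f:
        line = f.readline()
    return line, f  # f is already closed at this point


def fails_midway(path):
    """Raise inside the with block and return the file object for inspection."""
    try:
        with open(path, "r") as f:
            raise ValueError("simulated failure while processing")
    except ValueError:
        return f  # closed by the with statement despite the exception
```

In both cases the returned file object reports `closed` as True.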

Whatever you do with your file after you read it in, this is how you can read it and write it back. Run the script like this:
$ python myscript.py sample.txt sample1.txt
The first argument (sample.txt) is our "oldfile" and the second argument (sample1.txt) is our "newfile". Put the following code into a file called "myscript.py":
from sys import argv
script_name, oldfile, newfile = argv
with open(oldfile, "r") as src:
    content = src.read()
# now, you can rearrange your content here
with open(newfile, "w") as dst:
    dst.write(content)

Problem with file concatenation in Python?

I have 3 files 1.txt, 2.txt, and 3.txt and I am trying to concatenate together the contents of these files into one output file in Python. Can anyone explain why the code below only writes the content of 1.txt and not 2.txt or 3.txt? I'm sure it's something really simple, but I can't seem to figure out the problem.
import glob
import shutil
for my_file in glob.iglob('/Users/me/Desktop/*.txt'):
    with open('concat_file.txt', "w") as concat_file:
        shutil.copyfileobj(open(my_file, "r"), concat_file)
Thanks for the help!

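You constantly overwrite the same file, because opening it with "w" inside the loop truncates it on every iteration.
Either use:
with open('concat_file.txt', "a") as concat_file:
or open the output once, before the loop:
with open('concat_file.txt', "w") as concat_file:
    for my_file in glob.iglob('/Users/me/Desktop/*.txt'):
        shutil.copyfileobj(open(my_file, "r"), concat_file)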
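A slightly more defensive variant is sketched below. The function name concatenate and its parameters are illustrative. It opens the output once, closes every input after copying, sorts the matches so the order is predictable, and skips the output file in case it matches the same pattern.

```python
import glob
import os
import shutil


def concatenate(pattern, output_path):
    """Copy every file matching `pattern` into `output_path`, in sorted order."""
    output_abs = os.path.abspath(output_path)
    with open(output_path, "w") as out:          # opened once, before the loop
        for name in sorted(glob.glob(pattern)):
            if os.path.abspath(name) == output_abs:
                continue                         # don't copy the output into itself
            with open(name, "r") as src:         # each input is closed after copying
                shutil.copyfileobj(src, out)
```

A call might look like `concatenate('/Users/me/Desktop/*.txt', 'concat_file.txt')`.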

I believe that what's wrong with your code is that every loop iteration reopens concat_file.txt in "w" mode, which empties it before the next file is copied in.
If you manually unroll the loop you will see what I mean:
# my_file = '1.txt'
concat_file = open('concat_file.txt', 'w')  # file is emptied here
shutil.copyfileobj(open(my_file, 'r'), concat_file)
# my_file = '2.txt'
concat_file = open('concat_file.txt', 'w')  # ...and emptied again here
shutil.copyfileobj(open(my_file, 'r'), concat_file)
# ...
I'd suggest deciding beforehand which file you want all the files to be copied to, and opening it once, maybe like this:
import glob
import shutil
with open('output.txt', 'w') as output_file:
    for my_file in glob.iglob('/Users/me/Desktop/*.txt'):
        with open(my_file, "r") as source_file:
            shutil.copyfileobj(source_file, output_file)

Prepend a line to an existing file in Python

I need to add a single line to the first line of a text file and it looks like the only options available to me are more lines of code than I would expect from python. Something like this:
f = open('filename','r')
temp = f.read()
f.close()
f = open('filename', 'w')
f.write("#testfirstline")
f.write(temp)
f.close()
Is there no easier way? Additionally, I see this two-handle example more often than opening a single handle for reading and writing ('r+') - why is that?

Python makes a lot of things easy and contains libraries and wrappers for a lot of common operations, but the goal is not to hide fundamental truths.
The fundamental truth you are encountering here is that you generally can't prepend data to an existing flat structure without rewriting the entire structure. This is true regardless of language.
There are ways to save a filehandle or make your code less readable, many of which are provided in other answers, but none change the fundamental operation: You must read in the existing file, then write out the data you want to prepend, followed by the existing data you read in.
By all means save yourself the filehandle, but don't go looking to pack this operation into as few lines of code as possible. In fact, never go looking for the fewest lines of code -- that's obfuscation, not programming.

I would stick with separate reads and writes, but we certainly can express each more concisely:
Python2:
with file('filename', 'r') as original: data = original.read()
with file('filename', 'w') as modified: modified.write("new first line\n" + data)
Python3:
with open('filename', 'r') as original: data = original.read()
with open('filename', 'w') as modified: modified.write("new first line\n" + data)
Note: the file() built-in is not available in Python 3.

Another approach:
with open("infile") as f1:
    with open("outfile", "w") as f2:
        f2.write("#test firstline\n")
        for line in f1:
            f2.write(line)
or a one-liner:
open("outfile", "w").write("#test firstline\n" + open("infile").read())
Thanks for the opportunity to think about this problem :)
Cheers

with open("file", "r+") as f: s = f.read(); f.seek(0); f.write("prepend\n" + s)

You can save one write call with this:
f.write('#testfirstline\n' + temp)
When using 'r+', you would have to rewind the file after reading and before writing.

Here's a 3-liner that I think is clear and flexible. It uses the list.insert function, so if you truly want to prepend to the file use l.insert(0, 'insert_str'). When I actually did this for a Python module I am developing, I used l.insert(1, 'insert_str') because I wanted to skip the '# -*- coding: utf-8 -*-' string at line 0. Here is the code.
f = open(file_path, 'r'); s = f.read(); f.close()
l = s.splitlines(); l.insert(0, 'insert_str'); s = '\n'.join(l)
f = open(file_path, 'w'); f.write(s); f.close()

This does the job without reading the whole file into memory, though it may not work on Windows, where an open file cannot be unlinked.
import os
import shutil
def prepend_line(path, line):
    with open(path, 'r') as old:
        os.unlink(path)  # the open handle keeps the old data readable
        with open(path, 'w') as new:
            new.write(str(line) + "\n")
            shutil.copyfileobj(old, new)

One possibility is the following:
import os
open('tempfile', 'w').write('#testfirstline\n' + open('filename', 'r').read())
os.rename('tempfile', 'filename')

If you wish to insert text after a specific string in the file, you can use the function below.
def prepend_text(file, text, after=None):
    ''' Prepend file with given raw text '''
    f_read = open(file, 'r')
    buff = f_read.read()
    f_read.close()
    inject_pos = 0
    if after:
        found_at = buff.find(after)
        if found_at != -1:  # if the marker is missing, insert at the very start
            inject_pos = found_at + len(after)
    f_write = open(file, 'w')
    f_write.write(buff[:inject_pos] + text + buff[inject_pos:])
    f_write.close()
So first you open the file, read it and save it all into one string.
Then we try to find the character number in the string where the injection will happen. Then with a single write and some smart indexing of the string we can rewrite the whole file including the injected text now.

Am I not seeing something or couldn't we just use a buffer large-enough to read-in the input file in parts (instead of the whole content) and with this buffer traverse the file while it is open and keep exchanging file<->buffer contents?
This seems much more efficient (for big files especially) than reading the whole content in memory, modifying it in memory and writing it back to the same file or (even worse) a different one. Sorry that now I don't have time to implement a sample snippet, I'll get back to this later, but maybe you get the idea.
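One way to read the idea above is as a rolling buffer: read a chunk, write out the bytes that were waiting, and carry the rest forward. The sketch below is my interpretation, not code from the commenter; the name prepend_in_place and the chunk_size default are assumptions. It works on bytes, keeps at most len(data) + chunk_size bytes in memory, and is not crash-safe, since an interruption midway leaves the file half rewritten.

```python
def prepend_in_place(path, data, chunk_size=64 * 1024):
    """Insert the bytes `data` at the start of `path` using a rolling buffer."""
    with open(path, "r+b") as f:
        pending = data   # bytes waiting to be written
        read_pos = 0     # next unread byte of the original content
        write_pos = 0    # where the next write goes
        while True:
            f.seek(read_pos)
            chunk = f.read(chunk_size)
            read_pos += len(chunk)
            buffer = pending + chunk
            # Write only as many bytes as were just read, so the write never
            # runs past read_pos and clobbers unread data. At end of file,
            # flush whatever is left.
            to_write = buffer[:len(chunk)] if chunk else buffer
            f.seek(write_pos)
            f.write(to_write)
            write_pos += len(to_write)
            pending = buffer[len(to_write):]
            if not chunk:
                break
```

For text, the line to prepend would be encoded first, for example `prepend_in_place('filename', b'#testfirstline\n')`.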

As I suggested in this answer, you can do it using the following:
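import fileinput
from pathlib import Path
from typing import Union
def prepend_text(filename: Union[str, Path], text: str):
    # with inplace=True, print() output is redirected into the file
    with fileinput.input(filename, inplace=True) as file:
        for line in file:
            if file.isfirstline():
                print(text)
            print(line, end="")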

If you rewrite it like this:
with open('filename') as f:
    read_data = f.read()
with open('filename', 'w') as f:
    f.write("#testfirstline\n" + read_data)
It's rather short and simple.
For 'r+' the file needs to exist already.

This worked for me:
def prepend(text, path):
    with open(path, "r") as fr:
        existing = fr.read()
    with open(path, "w") as fw:
        fw.write(text + existing)
