How to save extracted data to a text file - Python

I have a text file with the following content:
this is the first line
this is the second line
this is the third line
this is the fourth line and contains the word fox.
The goal is to write code that reads the file, extracts the line with
the word fox in it, and saves that line to a new text file. Here is the code I have so far:
import os
import re
my_absolute_path = os.path.abspath(os.path.dirname(__file__))
with open('textfile', 'r') as helloFile:
    for line in helloFile:
        if re.findall("fox", line):
            print(line.strip())
This code prints the result of the parsed text, but that's not really what I want it to do. Instead, I would like the code to create a new text file with that line. Is there a way to accomplish this in Python?

You can do:
with open('textfile', 'r') as in_file, open('outfile', 'a') as out_file:
    for line in in_file:
        if 'fox' in line:
            out_file.write(line)
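Here I've opened the outfile in append (a) mode to accommodate multiple writes; note that write (w) mode would also allow multiple writes while the file stays open, and differs only in clearing any existing content first. I've also used the in (str.__contains__) check for substring existence (regex is absolutely overkill here).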
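If a fresh output file is wanted on every run, the same idea can be wrapped in a small function that opens the output in 'w' mode. This is only a sketch: the function name extract_lines and its parameters are made up for illustration and are not part of the answer above.

```python
def extract_lines(in_path, out_path, needle):
    """Copy every line of in_path that contains needle to out_path.

    Hypothetical helper for illustration; returns the number of lines written.
    """
    written = 0
    # 'w' truncates out_path first, so repeated runs do not accumulate lines.
    with open(in_path, "r") as in_file, open(out_path, "w") as out_file:
        for line in in_file:
            if needle in line:
                out_file.write(line)
                written += 1
    return written
```

Calling extract_lines('textfile', 'outfile', 'fox') on the sample file would write only the fourth line to outfile.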

Related

How to keep lines which contains specific string and remove other lines from .txt file?

Example: I want to keep the line which has word "hey" and remove others.
test.txt file:
first line
second one
heyy yo yo
fourth line
Code:
keeplist = ["hey"]
with open("test.txt") as f:
    for line in f:
        for word in keeplist:
It's hard to remove lines from a file. It's usually better to write a temporary file with the desired content and then rename it to the original file name.
import os
keeplist = ["hey"]
with open("test.txt") as f, open("test.txt.tmp", "w") as outf:
    for line in f:
        for word in keeplist:
            if word in line:
                outf.write(line)
                break
# os.replace also overwrites an existing file on Windows, where os.rename would fail
os.replace("test.txt.tmp", "test.txt")
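The same temporary-file approach can be written as a reusable function. A minimal sketch, assuming the temporary file may live next to the original with a .tmp suffix and that os.replace is used for the final swap; keep_lines is a hypothetical name, and any() stands in for the inner loop with break.

```python
import os


def keep_lines(path, keeplist):
    """Rewrite path so that only lines containing a string from keeplist remain.

    Hypothetical helper; the '.tmp' suffix is an assumption for illustration.
    """
    tmp_path = path + ".tmp"
    with open(path) as f, open(tmp_path, "w") as outf:
        for line in f:
            # any() stops at the first matching word, like an explicit break.
            if any(word in line for word in keeplist):
                outf.write(line)
    # Swap the filtered copy into place once both files are closed.
    os.replace(tmp_path, path)
```

With the sample test.txt, keep_lines("test.txt", ["hey"]) would leave only the line "heyy yo yo" in the file.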

Concatenating to every value in a list

So I have a file with some lines of text:
here's a sentence
look! another one
here's a third one too
and another one
one more
and I have some code that takes each line, puts it into a list, and then reverses the order of the whole list. But now I don't know how to write each line back to the file and delete the existing ones in the text file.
Also, when I run this code:
file_lines = open(file_name).readlines()
print(file_lines)
file_lines.reverse()
print(file_lines)
everything works and the line order is reversed, but when I run this code:
text_file = open(file_name, "w")
file_lines = open(file_name).readlines()
print(file_lines)
file_lines.reverse()
print(file_lines)
for line in file_lines:
    text_file.write(line)
it prints empty lists for some reason.
You can fix it by making just two small changes in your script.
Use 'r+' in place of 'w'.
Before performing the write operation, move the file position indicator to the beginning:
text_file.seek(0)
» rw_file.txt - before operation
here's a sentence
look! another one
here's a third one too
and another one
one more
Below is your script, modified to reverse the content of the file (tested and working).
def reverseFile(file_name):
    text_file = open(file_name, "r+")  # Do not use 'w+', it will erase your file content
    file_lines = [line.rstrip('\n') for line in text_file.readlines()]
    file_lines.reverse()
    print(file_lines)
    text_file.seek(0)  # Place file position indicator at beginning
    for line_item in file_lines:
        text_file.write(line_item + "\n")
    text_file.close()  # Flush the writes and release the file handle

reverseFile("rw_file.txt")
» rw_file.txt - after operation
one more
and another one
here's a third one too
look! another one
here's a sentence
If you open the file in 'w' mode, the file is erased. From the docs:
'w' for only writing (an existing file with the same name will be
erased)
You should also use the with keyword:
It is good practice to use the with keyword when dealing with file
objects. The advantage is that the file is properly closed after its
suite finishes...
I would recommend you read the contents of the file first, process that data, and then write:
def reverseFile(file_name):
    with open(file_name, 'r') as f:
        file_lines = [line.rstrip('\n') for line in f.readlines()]
    file_lines.reverse()
    with open(file_name, "w") as f:
        for line in file_lines:
            f.write(line + '\n')

reverseFile('text_lines.txt')
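The behaviour behind the empty lists can be reproduced in a few lines. The sketch below uses a throwaway temporary file, and the helper name is hypothetical; it shows that opening an existing file with 'w' empties it before anything is read.

```python
import os
import tempfile


def lines_after_opening_for_write(text):
    """Write text to a temp file, reopen it with 'w', then read it back.

    Hypothetical demo helper: returns the lines seen after the 'w' open.
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
        writer = open(path, "w")        # truncates the file immediately
        with open(path) as reader:
            lines = reader.readlines()  # nothing is left to read
        writer.close()
        return lines
    finally:
        os.remove(path)
```

This mirrors the order of operations in the question, where the file is opened for writing before readlines() is called.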

Replace only first line of text file in python

I have a text file which consists of many lines of text.
I would like to replace only the first line of a text file using Python v3.6, regardless of its contents. I do not need to do a line-by-line search and replace the line accordingly. This is not a duplicate of the question "Search and replace a line in a file in Python".
Here is my code:
import fileinput
file = open("test.txt", "r+")
file.seek(0)
file.write("My first line")
file.close()
The code works partially. If the original first line is longer than "My first line", the excess substring still remains. To be clearer, if the original line is "XXXXXXXXXXXXXXXXXXXXXXXXX", then the output will be "My first lineXXXXXXXXXXXXXX". I want the output to be only "My first line". Is there a better way to implement the code?
You can use readlines and writelines to do this.
For example, I created a file called "test.txt" that contains two lines. After opening the file, I can use f.readlines() to get all lines as a list of strings. Then the only thing I need to do is replace the first element of the list with whatever I want, and write the list back.
with open("test.txt") as f:
    lines = f.readlines()
lines  # ['This is the first line.\n', 'This is the second line.\n']
lines[0] = "This is the line that's replaced.\n"
lines  # ["This is the line that's replaced.\n", 'This is the second line.\n']
with open("test.txt", "w") as f:
    f.writelines(lines)
Reading and writing content to the file is already answered by @Zhang.
I am just giving an answer that is more efficient than reading all the lines.
Use: shutil.copyfileobj
from_file.readline() # and discard
to_file.write(replacement_line)
shutil.copyfileobj(from_file, to_file)
Reference
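Put together, the shutil.copyfileobj fragment might look like the following. This is a sketch under assumptions: replace_first_line, the temporary-file suffix, and the use of os.replace to swap the files are illustrative choices that the answer does not state.

```python
import os
import shutil


def replace_first_line(path, replacement_line):
    """Replace the first line of path, streaming the rest unchanged.

    Hypothetical helper; replacement_line should end with a newline.
    """
    tmp_path = path + ".tmp"  # assumed temporary name next to the original
    with open(path) as from_file, open(tmp_path, "w") as to_file:
        from_file.readline()                    # read and discard the old first line
        to_file.write(replacement_line)
        shutil.copyfileobj(from_file, to_file)  # copy the remainder in chunks
    os.replace(tmp_path, path)
```

Because the remainder is copied in chunks, the whole file never has to be held in memory as a list of lines.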

How to save the output into a new txt file?

I use this code to split an unstructured text file to its tokens and output each token in one line:
with open("C:\\...\\...\\...\\record-13.txt") as f:
    lines = f.readlines()
    for line in lines:
        words = line.split()
        for word in words:
            print(word)
Now I want to save the output into a new text file instead of printing it, so I modified the code to this:
with open("C:\\...\\...\\...\\record-13.txt") as f:
    lines = f.readlines()
    for line in lines:
        words = line.split()
        for word in words:
            file = open("tokens.txt", "w")
            file.write(word)
            file.close()
but it doesn't work. Would you please tell me what's wrong with that?
You are opening the file for each token, and because you are opening with mode 'w' the file is truncated. You can open with mode 'a' to append to the file, but that would be very inefficient.
A better way is to open the output file at the very start and let the context manager close it for you. There's also no need to read the entire file into memory at the start.
with open("in.txt") as in_file, open("tokens.txt", "w") as out_file:
    for line in in_file:
        words = line.split()
        for word in words:
            out_file.write(word)
            out_file.write("\n")
I suspect you want each word to be on a different line, so make sure you also write a new line character.
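An equivalent sketch lets print add the newline by passing the output file through its file parameter. The function name and its parameters are hypothetical and only serve the illustration.

```python
def write_tokens(in_path, out_path):
    """Write each whitespace-separated token of in_path on its own line."""
    with open(in_path) as in_file, open(out_path, "w") as out_file:
        for line in in_file:
            for word in line.split():
                # print appends "\n" after each word by default.
                print(word, file=out_file)
```

The output file is still opened once, so each token is appended to the same open file rather than truncating it.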

Replace a word in a file

I am new to Python programming.
I have a .txt file that looks like this:
0,Salary,14000
0,Bonus,5000
0,gift,6000
I want to replace the first '0' value with '1' in each line. How can I do this? Can anyone help me with some sample code?
Thanks in advance.
Nimmyliji
I know that you're asking about Python, but forgive me for suggesting that perhaps a different tool is better for the job. :) It's a one-liner via sed:
sed 's/^0,/1,/' yourtextfile.txt > output.txt
This applies the regex /^0,/ (which matches any 0, that occurs at the beginning of a line) to each line and replaces the matched text with 1, instead. The output is directed into the file output.txt specified.
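For readers who prefer to stay in Python, the same substitution can be sketched with the re module. The function name is hypothetical; the pattern is the one from the sed command.

```python
import re


def replace_leading_zero(in_path, out_path):
    """Rewrite a leading '0,' on each line as '1,', like sed 's/^0,/1,/'."""
    pattern = re.compile(r"^0,")
    with open(in_path) as in_file, open(out_path, "w") as out_file:
        for line in in_file:
            # count=1 mirrors sed, which replaces only the first match per line.
            out_file.write(pattern.sub("1,", line, count=1))
```

Lines that do not start with "0," are copied unchanged, just as sed would pass them through.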
inFile = open("old.txt", "r")
outFile = open("new.txt", "w")
for line in inFile:
    outFile.write(",".join(["1"] + (line.split(","))[1:]))
inFile.close()
outFile.close()
If you would like something more general, take a look at Python's csv module. It contains utilities for processing comma-separated values (abbreviated as csv) in files, and it can work with an arbitrary delimiter, not only a comma. Since your sample is obviously a csv file, you can use it as follows:
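import csv
# newline="" is what the csv module expects for files opened in Python 3;
# the with block closes both files before any later rename.
with open("old.txt", newline="") as src, open("new.txt", "w", newline="") as dst:
    writer = csv.writer(dst)
    writer.writerows(["1"] + line[1:] for line in csv.reader(src))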
To overwrite original file with new one:
import os
os.remove("old.txt")
os.rename("new.txt", "old.txt")
I think that writing to a new file and then renaming it is more fault-tolerant and less likely to corrupt your data than directly overwriting the source file. Imagine that your program raised an exception after the source file had been read into memory and reopened for writing: you would lose the original data, and the new data would not be saved because of the crash. With my approach, you only lose the new data while the original is preserved.
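The fault-tolerance argument can be made concrete with a small sketch. Everything named here is an assumption for illustration: rewrite_safely is a hypothetical helper, the temporary file is created with tempfile.mkstemp in the same directory, and os.replace performs the final swap.

```python
import os
import tempfile


def rewrite_safely(path, transform):
    """Apply transform to every line of path via a temporary file.

    Hypothetical helper: the original is replaced only after the new
    content has been written completely.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as dst, open(path) as src:
            for line in src:
                dst.write(transform(line))
        os.replace(tmp_path, path)  # reached only if no exception occurred
    except BaseException:
        os.remove(tmp_path)         # discard the partial file, keep the original
        raise
```

For the sample data, rewrite_safely("old.txt", lambda line: "1" + line[1:]) would change the leading 0 on each line; if transform raised midway, old.txt would be left as it was.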
o=open("output.txt","w")
for line in open("file"):
    s = line.split(",")
    s[0] = "1"
    o.write(','.join(s))
o.close()
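Or you can use fileinput with in-place editing:
import fileinput
for line in fileinput.FileInput("file", inplace=True):
    s = line.split(",")
    s[0] = "1"
    # With inplace=True, standard output is redirected into the file;
    # end="" avoids doubling the newline already present in line.
    print(",".join(s), end="")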
f = open(filepath,'r')
data = f.readlines()
f.close()
edited = []
for line in data:
    edited.append('1' + line[1:])
f = open(filepath,'w')
f.writelines(edited)
f.flush()
f.close()
Or in Python 2.5+:
with open(filepath,'r') as f:
    data = f.readlines()
with open(outfilepath, 'w') as f:
    for line in data:
        f.write('1' + line[1:])
This should do it. I wouldn't recommend it for a truly big file though ;-)
What is going on (ex 1):
1: Open the file in read mode
2,3: Read all the lines into a list (each line is a separate index) and close the file.
4,5,6: Iterate over the list, constructing a new list where each line has its first character replaced by a 1. The line[1:] slices the string from index 1 onward. We concatenate the 1 with the truncated string.
7,8,9,10: Reopen the file in write mode, write the list to the file (overwrite), flush the buffer, and close the file handle.
In Ex. 2:
I use the with statement, which closes the file handle automatically, but do essentially the same thing.
