Deleting a line in multiple files in python - python

I am a beginner in Python and I am practicing at the moment.
I want to write a script that takes a line I type in with raw_input, searches for that line in multiple files, and deletes it.
Something like this, but for more files:
word = raw_input("word: ")
f = open("file.txt","r")
lines = f.readlines()
f.close()
f = open("file.txt","w")
for line in lines:
    if line!=word+"\n":
        f.write(line)
f.close()
It's an easy task but it's actually hard for me since I can't find an example anywhere.

Instead of reading the entire file into memory, you should iterate through the file and write the lines that are OK to a temporary file. Once you've gone through the entire file, delete the original and rename the temporary file to the original's name. This is a classic pattern that you will encounter frequently.
I'd also recommend breaking this down into functions. You should first write the code for removing all occurrences of a line from only a single file. Then you can write another function that simply iterates through a list of filenames and calls the first function (that operates on individual files).
To get the filenames of all the files in the directory, use os.walk. If you do not want to apply this function to every file in the directory, you can set the files variable yourself to whatever list of filenames you want.
import os
def remove_line_from_file(filename, line_to_remove, dirpath=''):
    """Remove all occurrences of `line_to_remove` from file
    with name `filename`, contained at path `dirpath`.
    If `dirpath` is omitted, relative paths are used."""
    filename = os.path.join(dirpath, filename)
    temp_path = os.path.join(dirpath, 'temp.txt')
    with open(filename, 'r') as f_read, open(temp_path, 'w') as temp:
        for line in f_read:
            if line.strip() == line_to_remove:
                continue
            temp.write(line)
    os.remove(filename)
    os.rename(temp_path, filename)
def main():
    """Driver function"""
    directory = raw_input('directory: ')
    word = raw_input('word: ')
    dirpath, _, files = next(os.walk(directory))
    for f in files:
        remove_line_from_file(f, word, dirpath)
if __name__ == '__main__':
    main()
TESTS
All of these files are in the same directory. On the left is what they looked like before running the command, on the right is what they look like afterwards. The "word" I input was Remove this line.
a.txt
Before                           After
Foo                              Foo
Remove this line                 Bar
Bar                              Hello
Hello                            World
Remove this line
Remove this line
World
b.txt
Before                           After
Nothing                          Nothing
In                               In
This File                        This File
Should                           Should
Be Changed                       Be Changed
c.txt
Before                           After
Remove this line                 (empty)
d.txt
Before                           After
The last line will be removed    The last line will be removed
Remove this line
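For readers on Python 3, the same temporary-file pattern can be sketched roughly as follows. This is not the answer's code: it swaps the fixed temp.txt name for tempfile.mkstemp (so a real file called temp.txt in the directory is not clobbered) and uses os.replace instead of a remove-then-rename pair. The function name remove_line_from_files is made up for this illustration.

```python
import os
import tempfile


def remove_line_from_file(path, line_to_remove):
    """Rewrite `path` without the lines whose stripped text equals `line_to_remove`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final swap
    # stays on the same filesystem.
    fd, temp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as temp, open(path, "r") as source:
            for line in source:
                if line.strip() != line_to_remove:
                    temp.write(line)
        os.replace(temp_path, path)  # put the filtered copy in place of the original
    except BaseException:
        os.remove(temp_path)  # do not leave the temporary file behind on failure
        raise


def remove_line_from_files(paths, line_to_remove):
    """Apply remove_line_from_file to every path in `paths` (hypothetical helper)."""
    for path in paths:
        remove_line_from_file(path, line_to_remove)
```

One thing to be aware of with this sketch: the rewritten file gets the permissions of the freshly created temporary file, not those of the original.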

Something like this should work:
import os
source = '/some/dir/path/'
for root, dirs, filenames in os.walk(source):
    for f in filenames:
        # join with root, not source, so files in subdirectories are found
        path = os.path.join(root, f)
        this_file = open(path, "r")
        this_files_data = this_file.readlines()
        this_file.close()
        # rewrite the file with all lines except the one you don't want
        this_file = open(path, "w")
        for line in this_files_data:
            # lines from readlines() keep their trailing newline
            if line != "YOUR UNDESIRED LINE HERE" + "\n":
                this_file.write(line)
        this_file.close()
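A different way to do the read-then-rewrite step is the standard library's fileinput module with inplace=True, which redirects print output into the file currently being read. The sketch below is written for Python 3, and the function name delete_line_in_place is an assumption made for this example, not something from the answers above.

```python
import fileinput


def delete_line_in_place(paths, unwanted):
    """Drop every line equal to `unwanted` (ignoring the trailing newline) from each file."""
    paths = list(paths)
    if not paths:
        return  # with no files, fileinput would read standard input instead
    with fileinput.input(files=paths, inplace=True) as stream:
        for line in stream:
            if line.rstrip("\n") != unwanted:
                # With inplace=True, standard output is redirected into the current file.
                print(line, end="")
```

The list of paths could come from os.walk as in the answer above, for example by joining each root with its filenames.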

Related

Python change strings line by line in all files within a directory

So I have some files located in a directory.
Some of the files contain paths like this and some are empty: C:\d\folder\project\folder\Folder1\Folder2\Folder3\Module.c
What would be the best way to cut a path by counting backslashes from the end? In this case we need to keep only what comes after the 4th backslash, counting backward:
Folder1\Folder2\Folder3\Module.c
I need a function that will go through all the files and do this on each line of each file.
My current code, which does not work for some reason, is:
directory = os.listdir(//path_to_dir//)
for file in directory:
    with open (file) as f:
        for s in f:
            print('\\'.join(s.split('\\')[-4:]))
I would try something like this:
from pathlib import Path
def change(s):
    return '\\'.join(s.split('\\')[-4:])
folder = Path.cwd() / "folder" # here is your folder with files
files = folder.glob("*")
for f in files:
    with open(f, "r") as file:
        content = file.read()
    lines = content.split('\n')
    new_lines = []
    for line in lines:
        new_lines.append(change(line))
    with open(f, "w") as file:
        file.write("\n".join(new_lines))
It looks for all files in the subfolder named folder, applies the replacement to every line of every file, and saves the files.
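To see what the split/join step does on its own, here is a small worked example using the path from the question. The function name keep_last_components and its count and sep parameters are made up for this illustration.

```python
def keep_last_components(line, count=4, sep="\\"):
    """Return the last `count` separator-delimited components of `line`."""
    return sep.join(line.split(sep)[-count:])


example = r"C:\d\folder\project\folder\Folder1\Folder2\Folder3\Module.c"
print(keep_last_components(example))   # Folder1\Folder2\Folder3\Module.c
print(repr(keep_last_components("")))  # '' (empty lines pass through unchanged)
```

Lines with fewer than four components are returned unchanged, because slicing with [-4:] simply yields the whole list.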

Function that returns the content of multiple .txt files python

I'm trying to write a function that traverses a given path and opens/reads all the .txt files therein and returns these as a string or returns a value that I can use to apply text normalization.
Currently my code only returns the first .txt file it finds, except when I use a print(f.read()) statement, then it prints all the files it read.
I would like it to return all the files
def readtxt(path):
    import os
    for subdir, dirs, files in os.walk(path):
        for file in files:
            filepath = subdir + os.sep + file
            if file.endswith(".txt"):
                filelist = filepath.split()
                for file in filelist:
                    with open(os.path.join(path, filepath), 'r') as f:
                        lines = (f.read())
                        return lines
readtxt('/Users/path/')
When you use a return statement, the function ends. That's why it only gives you 1 file - it stops when it finds one. You could add them all into a list & return that afterwards instead.
found = []
And then, inside your loops, you simply do:
with open(os.path.join(path, filepath), 'r') as f:
    lines = (f.read())
    found.append(lines) # Append to list instead of returning
so that you can return everything you found using:
return found
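Put together, the function might look roughly like the sketch below. It is a simplified variant rather than the asker's exact code: the name read_txt_files is made up, and the path is built from subdir and the file name directly, since subdir as yielded by os.walk already starts with path.

```python
import os


def read_txt_files(path):
    """Return a list with the text of every .txt file found under `path`."""
    found = []
    for subdir, dirs, files in os.walk(path):
        for name in files:
            if name.endswith(".txt"):
                # subdir already includes `path`, so join it with the bare file name.
                with open(os.path.join(subdir, name), "r") as f:
                    found.append(f.read())
    return found  # returned once, after every file has been visited
```

If a single string is needed for text normalization, the list can be combined afterwards, for example with "\n".join(read_txt_files(path)).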

Creating a .txt file of file directories

I am trying to create a .txt file of file directories at a location, remove the prefixes and save the text file.
I use os.walk to build a list of the directories at a location and write it to a .txt file. That part works: I always get the text file of the directories.
The part where it removes the prefixes of those lines of directories in the next chunk of code doesn't work. It creates its own .txt file (as it is supposed to) but it is always empty.
If there is a solution that does all of this in one .txt file and one block of code that would be even better!
Here is what I have so far, and I'm using dummy directories for privacy's sake.
import os
from datetime import datetime
# this is to create a filename with the timestamp_directory_list for a .txt file
now = datetime.now()
filename = datetime.now().strftime("%Y_%m_%d_%H_%M_%S_directory_list.txt")
# uses os module to walk the directories and files
# within a given location, then writes it line by line to a .txt file
with open(filename, "w") as directory_list:
    for path, subdirs, files in os.walk(r"C:/Users"):
        for filenameX in files:
            f = os.path.join(path)
            directory_list.write(str(f) + os.linesep)
# Open up .txt file, read a line, trim the prefix, then save it
# this is to create a filename with the timestamp_directory_list for a .txt file
trim = datetime.now().strftime("%Y_%m_%d_%H_%M_%S_trimmed_directories.txt")
def remove_prefix(text, prefix):
    # Remove prefix from supplied text string
    if prefix in text:
        return text[len(prefix):]
    return text
with open(filename, "r") as trim_input, \
        open(trim, "a") as trim_output:
    for line in trim_input:
        print line
        if "C" in line:
            print line
            trim_output = remove_prefix(trim_input, 'C')
            trim_output.write(line+ os.linesep)
You've mixed up the variable names. I'd actually expect the code to raise an exception when run.
You use trim_output for both the output file and the trimmed line.
You call remove_prefix on the input file object, not on the line.
You get the trimmed line (overwriting, I think, the output file reference), but you write the untrimmed line to the output file.
Your code should be something like:
with open(filename, "r") as trim_input, \
        open(trim, "a") as trim_output:
    for line in trim_input:
        print line
        if "C" in line:
            # this if is a bit useless, you have another if inside remove_prefix,
            # also you are skipping all the lines without the prefix
            print line
            trimmed_line = remove_prefix(line, 'C')
            trim_output.write(trimmed_line+ os.linesep)
Later edit:
For the code to behave as stated in the initial description, the "if" should not be present and the code under it should be unindented by one level.
Also, remove_prefix is flawed:
def remove_prefix(text, prefix):
    # Remove prefix from supplied text string
    if prefix in text:
        # if prefix is "C", this is true for "Ctest" and also for "testC"
        return text[len(prefix):] # but it removes the first chars
    return text
it should be
def remove_prefix(text, prefix):
    # Remove prefix from supplied text string
    if text and text.startswith(prefix):
        return text[len(prefix):]
    return text
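Since the question also asks for a version that does everything in one file and one block of code, here is a minimal Python 3 sketch of that idea: trim each directory while walking and write it straight to the output. The function name write_trimmed_directories and its parameters are assumptions for this example. Unlike the original loop, it writes each directory once rather than once per file, and it writes "\n" rather than os.linesep, because a file opened in text mode already translates "\n" to the platform line ending.

```python
import os


def remove_prefix(text, prefix):
    """Strip `prefix` from the start of `text` if it is there."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


def write_trimmed_directories(root, output_path, prefix):
    """Write one line per directory under `root` that contains files, minus `prefix`."""
    with open(output_path, "w") as out:
        for path, subdirs, files in os.walk(root):
            if files:
                # Text mode translates "\n" to the platform line ending on write.
                out.write(remove_prefix(path, prefix) + "\n")
```

The output file is best kept outside root, so the walk does not pick up the file that is being written.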

Find fileS and then find a string in those files

I have written a function that finds all of the version.php files in a path. I am trying to take the output of that function and find a line from that file. The function that finds the files is:
def find_file():
    for root, folders, files in os.walk(acctPath):
        for file in files:
            if file == 'version.php':
                print os.path.join(root,file)
find_file()
There are several version.php files in the path and I would like to return a string from each of those files.
Edit:
Thank you for the suggestions, my implementation of the code didn't fit my need. I was able to figure it out by creating a list and passing each item to the second part. This may not be the best way to do it, I've only been doing python for a few days.
def cmsoutput():
    fileList = []
    for root, folders, files in os.walk(acctPath):
        for file in files:
            if file == 'version.php':
                fileList.append(os.path.join(root,file))
    for path in fileList:
        with open(path) as f:
            for line in f:
                if line.startswith("$wp_version ="):
                    version_number = line[15:20]
                    inst_path = re.sub('wp-includes/version.php', '', path)
                    version_number = re.sub('\';', '', version_number)
                    print inst_path + " = " + version_number
cmsoutput()
Since you want to use the output of your function, you have to return something. Printing it does not cut it. Assuming everything works it has to be slightly modified as follows:
import os
def find_file():
    for root, folders, files in os.walk(acctPath):
        for file in files:
            if file == 'version.php':
                return os.path.join(root,file)
foundfile = find_file()
Now variable foundfile contains the path of the file we want to look at. Looking for a string in the file can then be done like so:
with open(foundfile, 'r') as f:
    content = f.readlines()
    for lines in content:
        if '$wp_version =' in lines:
            print(lines)
Or in function version:
def find_in_file(string_to_find, file_to_search):
    with open(file_to_search, 'r') as f:
        content = f.readlines()
        for lines in content:
            if string_to_find in lines:
                return lines
# which you can call like this:
find_in_file("$wp_version =", find_file())
Note that the function versions above stop at the first match: find_file returns the first version.php it finds, and find_in_file returns the first line that contains the string. If you want to get them all, they have to be modified.
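One possible way to collect every match instead of only the first is sketched below for Python 3. The names find_files, find_all_in_file and find_in_tree are made up for this example, and the starting directory is passed in as a parameter instead of being read from the acctPath global.

```python
import os


def find_files(top, name):
    """Yield the path of every file called `name` below `top`."""
    for root, folders, files in os.walk(top):
        if name in files:
            yield os.path.join(root, name)


def find_all_in_file(string_to_find, file_to_search):
    """Return every line of `file_to_search` that contains `string_to_find`."""
    with open(file_to_search, "r") as f:
        return [line for line in f if string_to_find in line]


def find_in_tree(top, name, string_to_find):
    """Map each matching file path to the list of lines that contain the string."""
    return {path: find_all_in_file(string_to_find, path)
            for path in find_files(top, name)}
```

Under these assumptions, find_in_tree(acctPath, "version.php", "$wp_version =") would give one entry per version.php file.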

Printing content of all files in directory

I am trying to go to a directory and print out the content of all files in it.
for fn in os.listdir('Z:/HAR_File_Generator/HARS/job_search'):
    print(fn)
When I use this code all it does is print out the file names. How can I make it so I can actually get the content of the file? I have seen a lot of ways to possibly do this but I am wondering if there is a way to do it in the same format as I have it. It doesn't make sense to me that I'm not able to get the file content instead of the name. What would make sense to me is doing fn.read() and then printing it out but that does not work.
directory = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(directory):
    print(open(os.path.join(directory, fn), 'rb').read())
Edit: You should probably close your files too but that's a separate issue.
mydir = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(mydir):
    print open(mydir+'/'+fn).readlines()
Why is your code not printing any file contents? Because you are not reading any file contents.
For printing prettily:
for fn in os.listdir(mydir):
    for line in open(mydir+'/'+fn).readlines():
        print line
And to avoid the file-closing issue in case of much larger files:
for fn in os.listdir(mydir):
    with open(mydir+'/'+fn) as fil:
        print fil.readlines()
Assuming they're text files that can actually be printed:
dirpath = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(dirpath):
    with open(os.path.join(dirpath, fn), 'r') as f: # open the file
        for line in f: # go through each line
            print(line) # and print it
Or, in Python 3 (or Python 2 with the proper import):
dirpath = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(dirpath):
    with open(os.path.join(dirpath, fn), 'r') as f: # open the file
        print(*f, sep='') # and send every line to the print function
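One detail the snippets above do not handle: os.listdir also returns subdirectories, and calling open() on a directory raises an error. A minimal Python 3 sketch that skips them with os.scandir is shown below; the function name read_all_files is made up for this example, and it assumes the files are text that can be decoded with the default encoding.

```python
import os


def read_all_files(directory):
    """Return {file name: text} for the regular files directly inside `directory`."""
    contents = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():  # skip subdirectories, which open() cannot read
                with open(entry.path, "r") as f:
                    contents[entry.name] = f.read()
    return contents
```

Printing is then a separate step, for example looping over read_all_files(dirpath).items() and printing each text.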
