Printing content of all files in directory - python

I am trying to go to a directory and print out the content of all files in it.
for fn in os.listdir('Z:/HAR_File_Generator/HARS/job_search'):
    print(fn)
When I use this code all it does is print out the file names. How can I make it so I can actually get the content of the file? I have seen a lot of ways to possibly do this but I am wondering if there is a way to do it in the same format as I have it. It doesn't make sense to me that I'm not able to get the file content instead of the name. What would make sense to me is doing fn.read() and then printing it out but that does not work.

directory = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(directory):
    print(open(os.path.join(directory, fn), 'rb').read())
Edit: You should probably close your files too but that's a separate issue.

mydir = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(mydir):
    print open(mydir+'/'+fn).readlines()
Why is your code not printing any file contents? Because you are not reading any file contents.
To print one line at a time:
for fn in os.listdir(mydir):
    for line in open(mydir+'/'+fn).readlines():
        print line
And to make sure every file is closed, which matters more with many or very large files:
for fn in os.listdir(mydir):
    with open(mydir+'/'+fn) as fil:
        print fil.readlines()

Assuming they're text files that can actually be printed:
dirpath = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(dirpath):
    with open(os.path.join(dirpath, fn), 'r') as f: # open the file
        for line in f: # go through each line
            print(line) # and print it
Or, in Python 3 (or Python 2 with the proper import):
dirpath = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(dirpath):
    with open(os.path.join(dirpath, fn), 'r') as f: # open the file
        print(*f, sep='') # and send every line to the print function
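To pull the answers above together, here is a minimal Python 3 sketch. The function name read_all_files is made up for illustration, and errors='replace' is an assumption on my part so that a file that isn't valid text doesn't abort the loop. The key point is that os.listdir() returns bare names (strings), which is why fn.read() cannot work: each name has to be joined with the directory and opened first.

```python
import os


def read_all_files(dirpath):
    """Return {filename: text} for every regular file directly inside dirpath."""
    contents = {}
    for fn in sorted(os.listdir(dirpath)):
        full_path = os.path.join(dirpath, fn)  # listdir() yields names, not paths
        if not os.path.isfile(full_path):      # skip subdirectories
            continue
        # errors='replace' is an assumption: undecodable bytes are substituted
        # instead of raising UnicodeDecodeError
        with open(full_path, 'r', errors='replace') as f:
            contents[fn] = f.read()
    return contents
```

You would then loop over read_all_files('Z:/HAR_File_Generator/HARS/job_search').items() and print each name and its text.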

Related

Unable to load contents while reading a .txt file in Python3

I intend to extract some data stored in a .txt file using Python 3. However, when I try to print the file content, the program does not display anything in the console. This is the code snippet I use to read the file:
def get_data(directory):
    entries = os.listdir(directory)
    #print(entries)
    count = 0;
    for file in entries:
        #print(file)
        if file.endswith('.txt'):
            with open(file) as curr_file:
                #print(curr_file)
                #read data and write it to an
                #excel worksheet
                print(curr_file.readline())
                curr_file.close()
What kind of changes am I supposed to make to let the program display contents of the file?
Update: I tried printing all the files saved in entries and the result looks fine. The following is the code snippet I use to unzip files in the directory; I am not sure whether there is anything wrong with it.
def read_zip(path):
    file_list = os.listdir(path)
    #print(file_list)
    #create a new directory and store
    #the extracted file there
    directory = 'C:/Users/chent/Desktop/Test'
    try:
        if not os.path.exists(directory):
            os.makedirs(directory, exist_ok=True)
            print('Folder created')
    except FileExistsError:
        print ('Directory not created')
    for file in file_list:
        if file.endswith('.zip'):
            filePath=path+'/'+file
            zip_file = zipfile.ZipFile(filePath)
            for names in zip_file.namelist():
                zip_file.extract(names, directory)
            get_data(directory)
            zip_file.close()
Solution: It turns out that I didn't specify the file path in the with open() statement, so the program was unable to locate the files. To fix it, pass the full path (directory plus file name) to open(), e.g. with open(path + '/' + file, "r") as curr_file. See details in my updated code:
def get_data(path):
    files = os.listdir(path)
    for file in files:
        #print(file)
        try:
            if file.endswith('.txt'):
                print(file)
                with open('C:/Users/chent/Desktop/Test/' + file, "r", ) as curr_file:
                    # print(curr_file.readlines())
                    print(curr_file)
                    line = curr_file.readline()
                    print(line)
        except FileNotFoundError:
            print ('File not found')
path = 'C:/Users/chent/Desktop/Test'
get_data(path)
The problem is that you use curr_file.readline() which only returns the first line.
Use curr_file.read() to get the whole file contents.
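Both fixes (the full path and read() instead of readline()) can be combined. The following is only a sketch: the name read_txt_files is hypothetical, and it returns the text in a dict rather than printing it, so the Excel step mentioned in the question could be plugged in afterwards.

```python
import os


def read_txt_files(directory):
    """Return {filename: full text} for each .txt file in directory."""
    texts = {}
    for name in os.listdir(directory):
        if name.endswith('.txt'):
            # Join the directory with the bare name; open(name) alone would
            # look in the current working directory instead.
            with open(os.path.join(directory, name), 'r') as curr_file:
                # read() returns the whole file, readline() only one line
                texts[name] = curr_file.read()
            # no explicit close() needed: the with block closes the file
    return texts
```

Note that the explicit curr_file.close() from the question is dropped, since leaving the with block already closes the file.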

Changing file extensions, content being deleted

I stole a little script that is supposed to simply add an extension where none exists to the files from an export. But when I run it, I get results, yet the actual content of the files has been zeroed out.
Why is this happening?
import os, sys
path = 'C:/Users/jal!/Downloads/Sinopiadata/'
for file in os.listdir(path):
    if file != "complete.log" and file != "jasawn.py":
        os.chdir('C:/Users/jal!/Downloads/Sinopiadata/')
        file = (file)
        filename = file + ".json"
        filename = open(filename,'w')
There's always the rename method you can (or should, as mentioned in the comments) use:
import os
os.rename(file, file_with_extension)
You haven't put anything into the new file. If you want to copy from the file without the extension to the file with the extension, you have to read and write.
for file in os.listdir(path):
    if file != "complete.log" and file != "jasawn.py":
        os.chdir('C:/Users/jal!/Downloads/Sinopiadata/')
        file = (file)
        filename = file + ".json"
        with open(filename,'w') as newfile, open(file, 'r') as oldfile:
            newfile.write(oldfile.read())
You can also use shutil.copyfile()
filename = open(filename,'w')
opens the file for writing in truncating mode, which is why it gets emptied. There's no point in having that line there at all if you're only renaming things. You should just use os.rename(old_path, new_path).
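For completeness, a sketch of the os.rename approach applied to the whole directory. The function name add_extension and its parameters are made up for illustration, and the skip list simply mirrors the two file names excluded in the question. Because nothing is ever opened in 'w' mode, the contents are left alone.

```python
import os


def add_extension(directory, extension='.json',
                  skip=('complete.log', 'jasawn.py')):
    """Append extension to every extensionless file in directory.

    Returns the sorted list of new file names.
    """
    renamed = []
    for name in os.listdir(directory):
        old_path = os.path.join(directory, name)
        has_extension = os.path.splitext(name)[1] != ''
        if name in skip or has_extension or not os.path.isfile(old_path):
            continue
        # rename only changes the name; the file contents are untouched
        os.rename(old_path, old_path + extension)
        renamed.append(name + extension)
    return sorted(renamed)
```

Joining the directory onto each name also removes the need for the os.chdir call inside the loop.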

Find files and then find a string in those files

I have written a function that finds all of the version.php files in a path. I am trying to take the output of that function and find a line from that file. The function that finds the files is:
def find_file():
    for root, folders, files in os.walk(acctPath):
        for file in files:
            if file == 'version.php':
                print os.path.join(root,file)
find_file()
There are several version.php files in the path and I would like to return a string from each of those files.
Edit:
Thank you for the suggestions, my implementation of the code didn't fit my need. I was able to figure it out by creating a list and passing each item to the second part. This may not be the best way to do it, I've only been doing python for a few days.
def cmsoutput():
    fileList = []
    for root, folders, files in os.walk(acctPath):
        for file in files:
            if file == 'version.php':
                fileList.append(os.path.join(root,file))
    for path in fileList:
        with open(path) as f:
            for line in f:
                if line.startswith("$wp_version ="):
                    version_number = line[15:20]
                    inst_path = re.sub('wp-includes/version.php', '', path)
                    version_number = re.sub('\';', '', version_number)
                    print inst_path + " = " + version_number
cmsoutput()
Since you want to use the output of your function, you have to return something. Printing it does not cut it. Assuming everything works it has to be slightly modified as follows:
import os
def find_file():
    for root, folders, files in os.walk(acctPath):
        for file in files:
            if file == 'version.php':
                return os.path.join(root,file)
foundfile = find_file()
Now variable foundfile contains the path of the file we want to look at. Looking for a string in the file can then be done like so:
with open(foundfile, 'r') as f:
    content = f.readlines()
    for lines in content:
        if '$wp_version =' in lines:
            print(lines)
Or in function version:
def find_in_file(string_to_find, file_to_search):
    with open(file_to_search, 'r') as f:
        content = f.readlines()
        for lines in content:
            if string_to_find in lines:
                return lines
# which you can call it like this:
find_in_file("$wp_version =", find_file())
Note that the function version of the code above will terminate as soon as it finds one instance of the string you are looking for. Likewise, find_file() returns only the first version.php it encounters. If you want all of them, both functions have to be modified.
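One way to get every match from every version.php is to turn both functions into generators. This is just a sketch under my own naming: find_files and find_lines are hypothetical, and the top directory is passed in as a parameter instead of relying on the global acctPath.

```python
import os


def find_files(top, target='version.php'):
    """Yield the path of every file named target below top."""
    for root, _folders, files in os.walk(top):
        for name in files:
            if name == target:
                yield os.path.join(root, name)


def find_lines(paths, string_to_find):
    """Yield (path, line) for every line containing string_to_find."""
    for path in paths:
        with open(path, 'r') as f:
            for line in f:
                if string_to_find in line:
                    yield path, line.rstrip('\n')
```

It could be called as for path, line in find_lines(find_files(acctPath), "$wp_version ="): and each pair printed or parsed further.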

Deleting a line in multiple files in python

I am a beginner in Python and I am practicing at the moment.
What I want to write is a script that takes a line I enter with raw_input, searches for that line in multiple files, and deletes it.
Something like this but for more files:
word = raw_input("word: ")
f = open("file.txt","r")
lines = f.readlines()
f.close()
f = open("file.txt","w")
for line in lines:
    if line!=word+"\n":
        f.write(line)
f.close()
It's an easy task but it's actually hard for me since I can't find an example anywhere.
Instead of reading the entire file into memory, you should iterate through the file and write the lines that are OK to a temporary file. Once you've gone through the entire file, delete it and rename the temporary file to the name of the original file. This is a classic pattern that you'll most likely frequently encounter in the future.
I'd also recommend breaking this down into functions. You should first write the code for removing all occurrences of a line from only a single file. Then you can write another function that simply iterates through a list of filenames and calls the first function (that operates on individual files).
To get the filenames of all the files in the directory, use os.walk. If you do not want to apply this function to all of the files in the directory, you can set the files variable yourself to store whatever configuration of filenames you want.
import os
def remove_line_from_file(filename, line_to_remove, dirpath=''):
    """Remove all occurrences of `line_to_remove` from file
    with name `filename`, contained at path `dirpath`.
    If `dirpath` is omitted, relative paths are used."""
    filename = os.path.join(dirpath, filename)
    temp_path = os.path.join(dirpath, 'temp.txt')
    with open(filename, 'r') as f_read, open(temp_path, 'w') as temp:
        for line in f_read:
            if line.strip() == line_to_remove:
                continue
            temp.write(line)
    os.remove(filename)
    os.rename(temp_path, filename)
def main():
    """Driver function"""
    directory = raw_input('directory: ')
    word = raw_input('word: ')
    dirpath, _, files = next(os.walk(directory))
    for f in files:
        remove_line_from_file(f, word, dirpath)
if __name__ == '__main__':
    main()
TESTS
All of these files are in the same directory. On the left is what they looked like before running the command, on the right is what they look like afterwards. The "word" I input was Remove this line.
a.txt
    Foo                              Foo
    Remove this line                 Bar
    Bar                              Hello
    Hello                            World
    Remove this line
    Remove this line
    World
b.txt
    Nothing                          Nothing
    In                               In
    This File                        This File
    Should                           Should
    Be Changed                       Be Changed
c.txt
    Remove this line
d.txt
    The last line will be removed    The last line will be removed
    Remove this line
Something like this should work:
import os
source = '/some/dir/path/'
for root, dirs, filenames in os.walk(source):
    for f in filenames:
        fpath = os.path.join(root, f)  # root, not source, so files in subdirectories resolve
        this_file = open(fpath, "r")
        this_files_data = this_file.readlines()
        this_file.close()
        # rewrite the file with all lines except the one you don't want
        this_file = open(fpath, "w")
        for line in this_files_data:
            if line.rstrip("\n") != "YOUR UNDESIRED LINE HERE":
                this_file.write(line)
        this_file.close()
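For anyone on Python 3, the temporary-file pattern described above can also be sketched with the tempfile module, which avoids clashing with a real file that happens to be called temp.txt. The function names remove_line and remove_line_from_files are hypothetical, and returning the number of removed lines is my own addition.

```python
import os
import tempfile


def remove_line(path, line_to_remove):
    """Rewrite path without lines equal to line_to_remove.

    Returns how many lines were dropped.
    """
    removed = 0
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final
    # replace stays on the same filesystem.
    with tempfile.NamedTemporaryFile('w', dir=directory, delete=False) as tmp, \
            open(path, 'r') as src:
        for line in src:
            if line.rstrip('\n') == line_to_remove:
                removed += 1
            else:
                tmp.write(line)
    os.replace(tmp.name, path)  # swap the filtered copy in for the original
    return removed


def remove_line_from_files(paths, line_to_remove):
    """Apply remove_line to each path; return the total number of removed lines."""
    return sum(remove_line(p, line_to_remove) for p in paths)
```

One caveat: the replaced file ends up with the permissions of the temporary file, so restore them afterwards if that matters for your files.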

Iterating files with os.walk, but cannot open and print text files

Currently I am trying to write a function that will walk through the requested directory and print all the text of all the files.
Right now, the function works in displaying the file_names as a list so the files surely exist (and there is text in the files).
def PopularWordWalk (starting_dir, word_dict):
    print ("In", os.path.abspath(starting_dir))
    os.chdir(os.path.abspath(starting_dir))
    for (this_dir,dir_names,file_names) in os.walk(starting_dir):
        for file_name in file_names:
            fpath = os.path.join(os.path.abspath(starting_dir), file_name)
            fileobj = open(fpath, 'r')
            text = fileobj.read()
            print(text)
Here is my output with some checking of the directory contents:
>>> PopularWordWalk ('text_dir', word_dict)
In /Users/normanwei/Documents/Python for Programmers/Homework 4/text_dir
>>> os.listdir()
['.DS_Store', 'cats.txt', 'zen_story.txt']
The problem is that whenever I try to print the text, I get nothing. Eventually I want to push the text through some other functions, but as of now that seems moot without any text. Can anyone explain why no text is appearing? (Opening, reading, storing and printing the text manually in IDLE works, i.e. if I manually input 'cats.txt' instead of file_name.) I am currently running Python 3.
EDIT - The question has been answered - just have to remove the os.chdir line - see jojo's answer for explanation.
This line won't work
file = open(file_name, 'r')
Because it would require that these files exist in the same folder you are running the script from. You would have to provide the path to those files, as well as the file names
with open(os.path.join(starting_dir,file_name), 'r') as file:
    #do stuff
This way it will build the full path from the directory and the file name.
If you do os.chdir(os.path.abspath(starting_dir)) you go into starting_dir. Then for (this_dir,dir_names,file_names) in os.walk(starting_dir): will loop over nothing, because the relative path starting_dir is now resolved from inside starting_dir, where no such directory exists.
Long story short, comment out the line os.chdir(os.path.abspath(starting_dir)) and you should be good.
Alternatively if you want to stick to the os.chdir, this should do the job:
def PopularWordWalk (starting_dir, word_dict):
    print ("In", os.path.abspath(starting_dir))
    os.chdir(os.path.abspath(starting_dir))
    for (this_dir,dir_names,file_names) in os.walk('.'):
        for file_name in file_names:
            # after the chdir, build the path from this_dir; abspath(starting_dir)
            # would now point at a nested starting_dir that does not exist
            fpath = os.path.join(this_dir, file_name)
            with open(fpath, 'r') as fileobj:
                text = fileobj.read()
                print(text)
You'll want to join the root path with the file path. I'd change:
file = open(file_name, 'r')
to
fpath = os.path.join(this_dir, file_name)
file = open(fpath, 'r')
You may also want to use a name other than file, since it shadows a built-in in Python 2. I'd recommend fileobj.
Just to add on to the previous answer, you will have to join the absolute path and the relative path of the walk.
Try this:
fpath = os.path.abspath(os.path.join(this_dir, file_name))
f = open(fpath, 'r')
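Putting the answers together, a minimal Python 3 sketch without any os.chdir could look like this. The name read_tree is made up, and errors='replace' is an assumption so that non-text files such as .DS_Store don't raise a decoding error.

```python
import os


def read_tree(starting_dir):
    """Return {path: text} for every file below starting_dir, subdirectories included."""
    texts = {}
    for this_dir, _dir_names, file_names in os.walk(starting_dir):
        for file_name in file_names:
            # this_dir is the directory currently being walked, so the join
            # is correct for nested files too; no os.chdir() is needed
            fpath = os.path.join(this_dir, file_name)
            with open(fpath, 'r', errors='replace') as fileobj:
                texts[fpath] = fileobj.read()
    return texts
```

Printing is then just a loop over read_tree('text_dir').values(), and the same dict can be handed to the word-counting functions later.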
