Changing file extensions, content being deleted - python

I borrowed a little script that is supposed to simply add an extension to exported files that have none. But when I run it, I get results, yet the actual content of the files has been zeroed out.
Why is this happening?
import os, sys
path = 'C:/Users/jal!/Downloads/Sinopiadata/'
for file in os.listdir(path):
    if file != "complete.log" and file != "jasawn.py":
        os.chdir('C:/Users/jal!/Downloads/Sinopiadata/')
        file = (file)
        filename = file + ".json"
        filename = open(filename,'w')

There's always the rename method you can (or should, as mentioned in the comments) use:
import os
os.rename(file, file_with_extension)

You haven't put anything into the new file. If you want to copy from the file without the extension to the file with the extension, you have to read and write.
for file in os.listdir(path):
    if file != "complete.log" and file != "jasawn.py":
        os.chdir('C:/Users/jal!/Downloads/Sinopiadata/')
        file = (file)
        filename = file + ".json"
        with open(filename,'w') as newfile, open(file, 'r') as oldfile:
            newfile.write(oldfile.read())
You can also use shutil.copyfile().
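A minimal sketch of the shutil.copyfile() route, assuming you want to keep the originals and create copies that carry the extension. The function name copy_with_extension and its parameters are made up for illustration; the skip list mirrors the two file names from the question.

```python
import os
import shutil

def copy_with_extension(directory, extension=".json", skip=("complete.log", "jasawn.py")):
    """Copy each regular file in `directory` to a sibling named <name><extension>.

    Hypothetical helper: the originals are left untouched.
    """
    copies = []
    # Take a snapshot of the listing first so the new copies are not revisited.
    for name in sorted(os.listdir(directory)):
        source = os.path.join(directory, name)
        if name in skip or name.endswith(extension) or not os.path.isfile(source):
            continue
        target = source + extension
        shutil.copyfile(source, target)  # copies the file contents only
        copies.append(target)
    return copies
```

Because full paths are built with os.path.join(), no os.chdir() call is needed.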

filename = open(filename,'w')
opens the file for writing in truncating mode, which is why it gets emptied. There's no point in having that line there at all if you're only renaming things. You should just use os.rename(old_path, new_path).
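As a rough sketch of the os.rename() approach, assuming the goal is to add ".json" to every file that has no extension yet. The helper name add_extension and its parameters are illustrative and not part of the original script.

```python
import os

def add_extension(directory, extension=".json", skip=("complete.log", "jasawn.py")):
    """Rename every extension-less regular file in `directory` to <name><extension>.

    Hypothetical helper: nothing is opened, so no file is ever truncated.
    """
    renamed = []
    for name in sorted(os.listdir(directory)):
        old_path = os.path.join(directory, name)
        if name in skip or not os.path.isfile(old_path):
            continue
        if os.path.splitext(name)[1]:
            continue  # already has an extension
        new_path = old_path + extension
        os.rename(old_path, new_path)
        renamed.append(new_path)
    return renamed
```

Renaming never opens the file, so its contents stay exactly as they were.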

Related

Unable to load contents while reading a .txt file in Python3

I want to extract some data stored in a .txt file using Python 3. However, when I try to print the file content, the program does not display anything in the console. This is the code snippet I use to read the file:
def get_data(directory):
    entries = os.listdir(directory)
    #print(entries)
    count = 0;
    for file in entries:
        #print(file)
        if file.endswith('.txt'):
            with open(file) as curr_file:
                #print(curr_file)
                #read data and write it to an
                #excel worksheet
                print(curr_file.readline())
                curr_file.close()
What changes do I need to make so that the program displays the contents of the file?
Update: I tried printing all the files saved in entries and the result looks fine. The following is the code snippet I use to unzip the files in the directory; I am not sure whether there is anything wrong with it.
def read_zip(path):
    file_list = os.listdir(path)
    #print(file_list)
    #create a new directory and store
    #the extracted file there
    directory = 'C:/Users/chent/Desktop/Test'
    try:
        if not os.path.exists(directory):
            os.makedirs(directory, exist_ok=True)
            print('Folder created')
    except FileExistsError:
        print ('Directory not created')
    for file in file_list:
        if file.endswith('.zip'):
            filePath=path+'/'+file
            zip_file = zipfile.ZipFile(filePath)
            for names in zip_file.namelist():
                zip_file.extract(names, directory)
            get_data(directory)
            zip_file.close()
Solution: It turns out that I didn't include the directory in the path passed to open(), so the program could not locate the files. To fix it, pass the full path, for example with open(os.path.join(file_path, file), "r") as curr_file. See the details in my updated code:
def get_data(path):
    files = os.listdir(path)
    for file in files:
        #print(file)
        try:
            if file.endswith('.txt'):
                print(file)
                with open('C:/Users/chent/Desktop/Test/' + file, "r") as curr_file:
                    # print(curr_file.readlines())
                    print(curr_file)
                    line = curr_file.readline()
                    print(line)
        except FileNotFoundError:
            print ('File not found')
path = 'C:/Users/chent/Desktop/Test'
get_data(path)
The problem is that you use curr_file.readline() which only returns the first line.
Use curr_file.read() to get the whole file contents.
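Putting both points together (full path plus read()), a sketch could look like the following. The function name read_text_files is made up here, and returning a dict instead of printing is just one possible design.

```python
import os

def read_text_files(directory):
    """Return {file name: full contents} for every .txt file in `directory`.

    Hypothetical helper for illustration.
    """
    contents = {}
    for name in sorted(os.listdir(directory)):
        if name.endswith(".txt"):
            # os.listdir() yields bare names, so join them with the directory.
            with open(os.path.join(directory, name), "r") as curr_file:
                contents[name] = curr_file.read()  # whole file, not only the first line
    return contents
```

The with block closes each file on exit, so no explicit curr_file.close() is needed.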

Getting FileNotFoundError when trying to open a file for reading in Python 3

I am using the OS module to open a file for reading, but I'm getting a FileNotFoundError.
I am trying to:
1. find all the files in a given sub-directory that contain the word "mda"
2. for each of those files, grab the string in the filename just after two "_"s (indicates a specific code called an SIC)
3. open that file for reading
4. write to a master file for some MapReduce processing later
When I try to do the opening, I get the following error:
  File "parse_mda_SIC.py", line 16, in <module>
    f = open(file, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'mda_3357_2017-03-08_1000230_000143774917004005__3357.txt'
I suspect the issue is either with the "file" variable or with the fact that the file is one directory down, but I am confused about why this would occur when I am using os to address that lower directory.
I have the following code:
working_dir = "data/"
for file in os.listdir(working_dir):
    if (file.find("mda") != -1):
        SIC = re.findall("__(\d+)", file)
        f = open(file, 'r')
I would expect to be able to open the file without issue and then create my list from the data. Thanks for your help.
This should work for you. You need to prepend the directory, because os.listdir() returns bare file names and open() resolves a bare name against the current working directory, not against working_dir.
for file in os.listdir(working_dir):
    if (file.find("mda") != -1):
        SIC = re.findall(r"__(\d+)", file)
        f = open(os.path.join(working_dir, file), 'r')
It is also good practice to open files with a with statement (a context manager), as it closes the file for you when the block ends:
for file in os.listdir(working_dir):
    if (file.find("mda") != -1):
        SIC = re.findall(r"__(\d+)", file)
        with open(os.path.join(working_dir, file), 'r') as f:
            pass  # do stuff with f here
You need to prepend the directory, like this:
f = open(os.path.join(working_dir, file), 'r')
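For reference, the whole loop could be sketched roughly as below. The function name collect_mda_files and the choice to return a list of (SIC, contents) pairs are assumptions made for illustration; the question does not say how the master file is written.

```python
import os
import re

def collect_mda_files(working_dir):
    """Return a list of (sic_code, contents) for files whose name contains 'mda'.

    Hypothetical helper: sic_code is the run of digits after the first '__',
    or None when the name has no such part.
    """
    results = []
    for name in sorted(os.listdir(working_dir)):
        if "mda" not in name:
            continue
        match = re.search(r"__(\d+)", name)  # raw string avoids escape warnings
        sic = match.group(1) if match else None
        with open(os.path.join(working_dir, name), "r") as f:
            results.append((sic, f.read()))
    return results
```

Calling collect_mda_files("data/") would then work regardless of the directory the script is started from, as long as "data/" itself resolves.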

Cannot find a file in my tempfile.TemporaryDirectory() for Python3

I'm having trouble working with Python3's tempfile library in general.
I need to write a file in a temporary directory, and make sure it's there. The third party software tool I use sometimes fails so I can't just open the file, I need to verify it's there first using a 'while loop' or other method before just opening it. So I need to search the tmp_dir (using os.listdir() or equivalent).
Specific help/solution and general help would be appreciated in comments.
Thank you.
Small sample code:
import os
import tempfile
with tempfile.TemporaryDirectory() as tmp_dir:
    print('tmp dir name', tmp_dir)
    # write file to tmp dir
    fout = open(tmp_dir + 'file.txt', 'w')
    fout.write('test write')
    fout.close()
    print('file.txt location', tmp_dir + 'lala.fasta')
    # working with the file is fine
    fin = open(tmp_dir + 'file.txt', 'U')
    for line in fin:
        print(line)
    # but I cannot find the file in the tmp dir like I normally use os.listdir()
    for file in os.listdir(tmp_dir):
        print('searching in directory')
        print(file)
That's expected, because the temporary directory name doesn't end with a path separator (os.sep, a slash or backslash on most systems). Plain string concatenation therefore creates the file one level up, next to the temporary directory instead of inside it.
tmp_dir = D:\Users\foo\AppData\Local\Temp\tmpm_x5z4tx
tmp_dir + "file.txt"
=> D:\Users\foo\AppData\Local\Temp\tmpm_x5z4txfile.txt
Instead, join both paths to get a file inside your temporary dir:
fout = open(os.path.join(tmp_dir,'file.txt'), 'w')
Note that fin = open(tmp_dir + 'file.txt', 'U') does find the file, as expected, but it finds it in the directory that contains tmp_dir, not inside tmp_dir itself.
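A small sketch of the corrected flow, assuming the file only needs to exist while the temporary directory is alive. The function name write_and_list is hypothetical.

```python
import os
import tempfile

def write_and_list(filename="file.txt", text="test write"):
    """Write a file inside a fresh temporary directory and report what is there.

    Hypothetical helper: returns (file_exists, directory_listing).
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        file_path = os.path.join(tmp_dir, filename)  # inserts the separator
        with open(file_path, "w") as fout:
            fout.write(text)
        found = os.path.exists(file_path)  # check before trying to open it
        return found, os.listdir(tmp_dir)
```

With the path built by os.path.join(), os.listdir(tmp_dir) now includes the file, and os.path.exists() gives a direct way to check for it before opening.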

I want to process every file inside a folder line by line and get a particular matching string

I am trying to process every file inside a folder line by line. I need to check for a particular string and write it into an Excel sheet. With my code, if I explicitly give the file name, the code works. If I try to get all the files, it throws an IOError. The code I wrote is below.
import os
def test_extract_programid():
    folder = 'C://Work//Scripts//CMDC_Analysis//logs'
    for filename in os.listdir(folder):
        print filename
        with open(filename, 'r') as fo:
            strings = ("/uri")
            <conditions>
            for line in fo:
                if strings in line:
                    <conditions>
I think the error is that the file is already open when the for loop starts, but I am not sure. Printing the file name prints it correctly.
The error shown is IOError: [Errno 2] No such file or directory:
If your working directory is not the same as folder, then you need to give open() the path to the file as well:
with open(folder+'/'+filename, 'r') as fo:
Alternatively, you can use glob, which returns paths that already include the folder:
import glob
for filename in glob.glob(folder+'/*'):
    print filename
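It can't open the file because it is given only the bare file name. You should build the full path (note that os.sep is a string, not a function):
for filename in os.listdir(folder):
    print folder+os.sep+filename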
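For readers on Python 3, the search itself might be sketched as follows. The function name find_matching_lines and the returned tuple layout are assumptions, and the Excel step from the question is left out.

```python
import os

def find_matching_lines(folder, needle="/uri"):
    """Return (file name, line number, line) for every line containing `needle`.

    Hypothetical helper for illustration; subdirectories are skipped.
    """
    matches = []
    for filename in sorted(os.listdir(folder)):
        path = os.path.join(folder, filename)  # full path, not the bare name
        if not os.path.isfile(path):
            continue
        with open(path, "r") as fo:
            for line_number, line in enumerate(fo, start=1):
                if needle in line:
                    matches.append((filename, line_number, line.rstrip("\n")))
    return matches
```

The collected tuples could then be handed to whatever spreadsheet writer is in use.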

Printing content of all files in directory

I am trying to go to a directory and print out the content of all files in it.
for fn in os.listdir('Z:/HAR_File_Generator/HARS/job_search'):
    print(fn)
When I use this code all it does is print out the file names. How can I make it so I can actually get the content of the file? I have seen a lot of ways to possibly do this but I am wondering if there is a way to do it in the same format as I have it. It doesn't make sense to me that I'm not able to get the file content instead of the name. What would make sense to me is doing fn.read() and then printing it out but that does not work.
directory = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(directory):
    print(open(os.path.join(directory, fn), 'rb').read())
Edit: You should probably close your files too but that's a separate issue.
mydir = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(mydir):
    print(open(mydir+'/'+fn).readlines())
Why is your code not printing any file contents? Because you are not reading any file contents; os.listdir() only returns the names.
To print each line separately:
for fn in os.listdir(mydir):
    for line in open(mydir+'/'+fn).readlines():
        print(line)
And to make sure every file gets closed, which matters more with many or much larger files, use with:
for fn in os.listdir(mydir):
    with open(mydir+'/'+fn) as fil:
        print(fil.readlines())
Assuming they're text files that can actually be printed:
dirpath = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(dirpath):
    with open(os.path.join(dirpath, fn), 'r') as f: # open the file
        for line in f: # go through each line
            print(line) # and print it
Or, in Python 3 (or Python 2 with the proper import):
dirpath = 'Z:/HAR_File_Generator/HARS/job_search'
for fn in os.listdir(dirpath):
    with open(os.path.join(dirpath, fn), 'r') as f: # open the file
        print(*f, sep='') # and send every line to the print function
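If the directory may also contain subfolders or files that are not clean text, a slightly more defensive variant could look like this. It is only a sketch: the function name print_directory_contents is made up, and errors="replace" is one possible way to cope with undecodable bytes.

```python
import os

def print_directory_contents(dirpath):
    """Print the contents of every regular file in `dirpath`.

    Hypothetical helper: returns the names of the files that were printed.
    """
    printed = []
    for fn in sorted(os.listdir(dirpath)):
        full_path = os.path.join(dirpath, fn)
        if not os.path.isfile(full_path):
            continue  # open() would fail on a directory
        # errors="replace" substitutes a marker for bytes that cannot be decoded
        with open(full_path, "r", errors="replace") as f:
            print(f.read(), end="")
        printed.append(fn)
    return printed
```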
