I tried running the following code. The code should read hdf5 files from a directory and create, for every hdf5 file, a .png and a .txt file with the same name (I need these as input for the YOLO CNN).
The code does what I described, but only for 20 images! I added print(i) to check whether the for loop is working properly... and it is: it prints every file in the directory (over 200 files). But it only creates 20 .png and 20 .txt files.
import os
from os.path import isfile, join

import h5py
import matplotlib.pyplot as plt
import numpy as np

def process_fpath(path1, path2):
    sensor_dim = (101, 101)
    onlyfiles = [f for f in os.listdir(path1) if isfile(join(path1, f))]
    for i in onlyfiles:
        if i.endswith(".hdf"):
            print(i)
            # cut ".hdf"
            name = str(i[0:-5])
            # create png
            im = h5py.File(path1 + str(i), 'r')
            labels_im = im['labels']
            image = im['image']
            plt.imsave(path2 + name + '.png', image)
            # create txt
            exp = np.column_stack((np.zeros(np.size(labels_im,0)) , labels_im[:,0]/sensor_dim[0], labels_im[:,1]/sensor_dim[1], labels_im[:,3]/sensor_dim[0], labels_im[:,3]/sensor_dim[0]))
            np.savetxt(path2 + name + '.txt', exp, delimiter = ' ', fmt=['%d', '%8f', '%8f', '%8f', '%8f'])
            continue
        else:
            continue
This is my first post, so if something isn't right, please let me know.
Maybe it's because of the name variable? You remove 5 characters, but you only want to remove 4 (".hdf" has four characters): name = str(i[0:-4]). Because the current slice also cuts off the last real character of each name, different files can end up with the same truncated name and overwrite each other's .png and .txt, which would explain why only 20 files show up.
Not related to your question, but the last three lines are useless; you can remove them:
    continue
else:
    continue
Try running the code on one specific file that is not working, instead of looping over all of them, to understand what the problem is.
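As a small aside, os.path.splitext avoids counting characters altogether; a minimal sketch (the file name below is made up):
import os

fname = "example_123.hdf"            # hypothetical file name
name, ext = os.path.splitext(fname)  # -> ("example_123", ".hdf")
if ext == ".hdf":
    png_name = name + ".png"         # "example_123.png"
    txt_name = name + ".txt"         # "example_123.txt"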
I currently have .jpg and .txt files in a single folder. Some of them have identical names, and others are just .txt and .jpg files without matching names. Duplicates are basically a .jpg and a .txt file that have the same name, disregarding the file extensions. Now I need to move all these duplicates to a single folder, but I can't figure out how.
I have this code, but it doesn't move anything:
import os
import glob
import shutil

images = glob.glob('C:/Users/b/Images')
labels = glob.glob('C:/Users/b/Labels')

if os.name == 'nt':
    separator = '\\'
else:
    separator = '/'

duplicates = []
for txt in labels:
    # [-1] takes the last part of the path
    # .strip removes .TXT from the file name
    txt_name = txt.split(separator)[-1].strip('.txt')
    for wav in images:
        wav_name = wav.split(separator)[-1].strip('.WAV')
        wav_path = wav.strip(txt_name + '.WAV')
        # Check if the wav_name and txt_name are the same.
        # There is no check for case.
        if wav_name == txt_name:
            duplicates.append(wav)
            duplicates.append(txt)

for x in duplicates:
    shutil.move(x, 'C:/Users/b/All')
Here is code that solves the problem you originally described: if there is a .jpg in the Images directory that has a corresponding .txt file in the Labels directory, then both files are moved to the All directory. If that's not what you wanted, then you need to post a different question.
import os

imagepath = 'C:/Users/b/Images'
labelpath = 'C:/Users/b/Labels'
bothpath = 'C:/Users/b/All'

images = os.listdir(imagepath)
labels = os.listdir(labelpath)

imagenames = set(os.path.splitext(k)[0] for k in images if k[-4:] == '.jpg')
labelnames = set(os.path.splitext(k)[0] for k in labels if k[-4:] == '.txt')

for x in imagenames.intersection(labelnames):
    # The destination must include the file name; renaming onto a bare
    # directory path raises an error.
    os.rename(imagepath + os.sep + x + '.jpg', bothpath + os.sep + x + '.jpg')
    os.rename(labelpath + os.sep + x + '.txt', bothpath + os.sep + x + '.txt')
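Alternatively, shutil.move accepts a directory as the destination and keeps the original file name; a minimal sketch of the same loop, reusing the imagepath, labelpath, and bothpath variables from above:
import os
import shutil

for x in imagenames.intersection(labelnames):
    # shutil.move keeps the file name when the destination is a directory
    shutil.move(os.path.join(imagepath, x + '.jpg'), bothpath)
    shutil.move(os.path.join(labelpath, x + '.txt'), bothpath)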
I'm currently trying to make a script that takes a zip file inside a directory, checks if the zip file contains a specific name, and if so, moves the zip file to another directory. Running the following does move the first file. However, after it moves the first file, it fails to go through the rest of the files and gives me this error:
"WindowsError: [Error 32] The process cannot access the file because it is being used by another process: (here shows the location of the file)"
I wonder what the cause of this error could be.
items = os.listdir(location)
Asset_list = os.listdir(drive_location)

def get_list():
    for each in items:
        new_location = drive_location + "\\" + each
        if ".zip" in each:
            selected_zip = location + "\\" + each
            with ZipFile(str(selected_zip)) as zip:
                list_of_files = zip.namelist()
                for each in list_of_files:
                    if Asset_list[5] in each:
                        shutil.move(selected_zip, new_location)
You cannot keep checking the files inside the zip file and move it to another location at the same time. Change shutil.move to shutil.copy, or exit the with block before attempting to move the file. You can simply add a flag and then move the archive based on the value of that flag once the with block has been left:
items = os.listdir(location)
Asset_list = os.listdir(drive_location)

def get_list():
    for each in items:
        new_location = drive_location + "\\" + each
        if ".zip" in each:
            to_move: bool = False
            selected_zip = location + "\\" + each
            with ZipFile(str(selected_zip)) as zip:
                list_of_files = zip.namelist()
                for each in list_of_files:
                    if Asset_list[5] in each:
                        to_move = True
            # The archive is closed at this point, so it can be moved safely
            if to_move:
                shutil.move(selected_zip, new_location)
Although I'm not entirely sure what the line if Asset_list[5] in each: is meant to do. Are you checking whether one string is contained in the other, or do you want to compare the values for equality? If it's the latter, you need to change it to if Asset_list[5] == each:.
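For illustration (asset and names below are made-up stand-ins for Asset_list[5] and the zip's name list), containment and equality behave quite differently, and the inner loop can also be collapsed with any():
asset = "report"                          # hypothetical entry from Asset_list
names = ["report_2020.csv", "notes.txt"]  # hypothetical zip.namelist()

print(asset in names[0])   # True  - substring containment
print(asset == names[0])   # False - exact equality

# Set the flag without an explicit inner loop:
to_move = any(asset in name for name in names)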
I have a thousand .xvg files in a directory, and I need to read each one and store an output for it.
Currently I have Python code that works for only one file. Could you please suggest how to read all the files and store an output for every file separately?
f = open('/home/abc/xyz/coord/coord_1.xvg')
dat = f.readlines()
dat1 = dat[22:len(dat)]
dat2 = []
for k in dat1:
    dat2.append(k.split())
for k in dat2:
    if float(k[1]) >= 9.5:
        print('P')
        break
    elif float(k[1]) <= 5.9:
        print('R')
        break
    else:
        print('N')
Here's a version that keeps as much of your code as possible, to make it easier to follow.
import os

def process_files():
    " Will process all files in the folder using your code "
    for file in os.listdir("."):  # '.' corresponds to the current directory
        # You can specify whatever directory you like,
        # such as /usr/accidental_coder/working
        if file.endswith(".xvg"):
            # File found
            # Output will have the same name but with a .txt suffix
            with open(os.path.join(".", file), 'r') as infile, \
                 open(os.path.join(".", file.replace('.xvg', '.txt')), 'w') as ofile:
                # Using your original code,
                # left alone so you can see how to change it if desired
                # (note: yours can be shortened)
                dat = infile.readlines()
                dat1 = dat[22:len(dat)]
                dat2 = []
                for k in dat1:
                    dat2.append(k.split())
                for k in dat2:
                    if float(k[1]) >= 9.5:
                        ofile.write('P\n')
                        break
                    elif float(k[1]) <= 5.9:
                        ofile.write('R\n')
                        break
                    else:
                        ofile.write('N\n')

process_files()
Refactoring Your Code for Better Performance
It seems you only need to process the 23rd line of each file:
import os

def process_files():
    for file in os.listdir("."):
        # Examples of getting files from directories:
        # https://stackoverflow.com/questions/3964681/find-all-files-in-a-directory-with-extension-txt-in-python
        if file.endswith(".xvg"):
            with open(os.path.join(".", file), 'r') as infile, \
                 open(os.path.join(".", file.replace('.xvg', '.txt')), 'w') as ofile:
                # Skip the first 22 lines
                for _ in range(22):
                    next(infile)
                # Use the data on the 23rd line
                data = next(infile)
                k = data.split()
                if float(k[1]) >= 9.5:
                    ofile.write('P\n')
                elif float(k[1]) <= 5.9:
                    ofile.write('R\n')
                else:
                    ofile.write('N\n')

process_files()
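As an aside (not part of the original answer), the glob module can replace the listdir/endswith combination; a minimal sketch:
import glob
import os

# Collect every .xvg file in the current directory; adjust the pattern as needed
for path in glob.glob(os.path.join(".", "*.xvg")):
    out_path = path.replace(".xvg", ".txt")
    print(path, "->", out_path)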
I am trying to compare two lists of files from different directories. If a match is found, the file should be written to a different directory. Below is my code.
filelist = ['sample2\\output_1.txt', 'sample2\\output_2.txt', 'sample3\\asn_todlx_mf_output_3.txt']
filelist2 = ['sample\\output_1.txt', 'sample\\output_3.txt', 'sample\\output_7.txt', 'sample\\output_2.txt', 'sample1\\output_3.txt']
a = 1
for name in filelist:
    a = a + 1
    for x in filelist2:
        file1 = open(x, 'r')
        file2 = open(name, 'r')
        FO = open('right\\right_file' + str(a) + '.txt', 'w')
        for line1 in file1:
            for line2 in file2:
                if line1 == line2:
                    FO.write("%s\n" % (line1))
        FO.close()
        file1.close()
        file2.close()
For instance, output_1 from the 'sample' folder (filelist) is compared with every file in 'sample2' (filelist); if there is a match, it should be written to the 'right' folder as 'right_file1.txt'. But the script generates 15 files, from 'right_file1.txt' to 'right_file15.txt'. It works fine when I compare one file with a list of files. Please help me with this.
Here's how I would do it:
filelist1 = ['sample2\\output_1.txt', 'sample2\\output_2.txt', 'sample3\\asn_todlx_mf_output_3.txt']
filelist2 = ['sample\\output_1.txt', 'sample\\output_3.txt', 'sample\\output_7.txt', 'sample\\output_2.txt', 'sample1\\output_3.txt']

# Keep the directory part of the paths as strings
dir1 = '\\'.join(filelist1[0].split('\\')[:-1])
filelist1 = [x.split('\\')[-1] for x in filelist1]
dir2 = '\\'.join(filelist2[0].split('\\')[:-1])
filelist2 = [x.split('\\')[-1] for x in filelist2]

common = [x for x in filelist1 if x in filelist2]
print(common)
# ['output_1.txt', 'output_2.txt']

a = 1
for file in common:
    a += 1
    with open(dir1 + '\\' + file) as f_in:
        contents = f_in.readlines()
    with open('right\\right_file' + str(a) + '.txt', 'w') as f_out:
        f_out.writelines(contents)
Initially I look for the files that are common to the two lists and store their names in common. Then, for every file in the common list, I create a copy of it in the other directory, as you described. Notice the use of with, which handles closing and flushing the files; use it instead of managing the files manually unless you have a reason not to.
Finally, I did not get the logic behind your counter a, so I just copied it from you. It starts at the value 2! If you want to take the number from the file being copied, you have to go about it differently; your way makes the origin of the created files untraceable (see the sketch below for an alternative).
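If you want the copies to keep their original names instead of a counter (so the origin stays traceable), shutil.copyfile is enough; a minimal sketch reusing dir1 and common from above:
import shutil

for file in common:
    # Keep the original file name so the copy is traceable to its source
    shutil.copyfile(dir1 + '\\' + file, 'right\\' + file)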
Let me know if that worked for you.
I have two folders with PDFs of identical file names. I want to iterate through the first folder, get the first 3 characters of each filename, make that the 'current' page name, then use that value to grab the two corresponding PDFs from both folders, merge them, and write the result to a third folder.
The script below works as expected for the first iteration, but after that, the subsequent merged PDFs include all the previous ones (ballooning quickly to 72 pages within 8 iterations).
Some of this could be due to poor code, but I can't figure out where that is, or how to clear the inputs/outputs that are causing it to write more than two pages per iteration:
import os
from PyPDF2 import PdfFileMerger

merger = PdfFileMerger()
rootdir = 'D:/Python/Scatterplots/BoundaryEnrollmentPatternMap'

for subdir, dirs, files in os.walk(rootdir):
    for currentPDF in files:
        #print os.path.join(file[0:3])
        pagename = os.path.join(currentPDF[0:3])
        print "pagename is: " + pagename
        print "File is: " + pagename + ".pdf"
        input1temp = 'D:/Python/Scatterplots/BoundaryEnrollmentPatternMap/' + pagename + '.pdf'
        input2temp = 'D:/Python/Scatterplots/TraditionalScatter/' + pagename + '.pdf'
        input1 = open(input1temp, "rb")
        input2 = open(input2temp, "rb")
        merger.append(fileobj=input1, pages=(0, 1))
        merger.append(fileobj=input2, pages=(0, 1))
        outputfile = 'D:/Python/Scatterplots/CombinedMaps/Sch_' + pagename + '.pdf'
        print merger.inputs
        output = open(outputfile, "wb")
        merger.write(output)
        output.close()
        # clear all inputs - necessary?
        outputfile = []
        output = []
        merger.inputs = []
        input1temp = []
        input2temp = []
        input1 = []
        input2 = []

print "done"
My code / work is based on this sample:
https://github.com/mstamy2/PyPDF2/blob/master/Sample_Code/basic_merging.py
I think the error is that merger is initialized before the loop, so it accumulates all the documents; try moving the line merger = PdfFileMerger() into the loop body. merger.inputs = [] doesn't seem to help in this case.
A few notes about your code:
input1 = [] doesn't close the file; it leaves many files open in the program. Call input1.close() instead.
[] is an empty list; it is better to use None if a variable should not hold a meaningful value.
To remove a variable (e.g. output), use del output.
That said, clearing all the variables is not necessary; they will be freed by the garbage collector.
Use os.path.join to create input1temp and input2temp.
A short sketch putting these notes together is shown below.
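Putting those notes together, a minimal sketch (assuming the same directory layout as in the question), with the merger created inside the loop, with closing the inputs, and os.path.join building the paths:
import os
from PyPDF2 import PdfFileMerger

rootdir = 'D:/Python/Scatterplots/BoundaryEnrollmentPatternMap'
scatterdir = 'D:/Python/Scatterplots/TraditionalScatter'
outdir = 'D:/Python/Scatterplots/CombinedMaps'

for subdir, dirs, files in os.walk(rootdir):
    for currentPDF in files:
        pagename = currentPDF[0:3]
        merger = PdfFileMerger()  # fresh merger for every page
        with open(os.path.join(rootdir, pagename + '.pdf'), 'rb') as input1, \
             open(os.path.join(scatterdir, pagename + '.pdf'), 'rb') as input2:
            merger.append(fileobj=input1, pages=(0, 1))
            merger.append(fileobj=input2, pages=(0, 1))
            # Write while the inputs are still open; everything is closed afterwards
            with open(os.path.join(outdir, 'Sch_' + pagename + '.pdf'), 'wb') as output:
                merger.write(output)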