reading multiple images in python based on their filenames

I have a folder of 100 images of the human eye: 50 files named retina and 50 files named mask. I need to read all the images named retina1, retina2, ..., retina50 and store them in an object called retina, and likewise for the mask images.
I can read all the files in a folder with the code below, but I am not sure how to read them based on their filenames, i.e. to read the retina and mask images separately. I need this because I will implement image segmentation and a CNN classifier later.
for i in os.listdir():
    f = open(i,"r")
    f.read()
    f.close()

I would use the glob module to get the paths of the matching files.
Reference: the glob module documentation.
import glob
retina_images = glob.glob(r"C:\Users\Fabian\Desktop\stack\images\*retina*")
mask_images = glob.glob(r"C:\Users\Fabian\Desktop\stack\images\*mask*")
print(retina_images)
print(mask_images)
Now you can use the path lists to read in the correct files.
In my case the images are located under C:\Users\Fabian\Desktop\stack\images\, and * acts as a wildcard.
EDIT:
import glob
images = {}
patterns = ["retina", "mask"]
for pattern in patterns:
    images[pattern] = glob.glob(r"C:\Users\fabia\Desktop\stack\images\*{}*".format(pattern))
print(images)
Building a dict from your search patterns could be helpful.
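One thing the path lists do not guarantee is order: glob makes no promise about the order of its results, and a plain text sort puts retina10 before retina2. Below is a minimal sketch that groups files by prefix and sorts them by the number at the end of the name. It assumes names such as retina1.png ... retina50.png; the .png extension and the helper names collect_by_prefix and trailing_number are illustrative, not taken from the question.

```python
import re
from pathlib import Path


def trailing_number(path):
    """Return the integer at the end of the file stem, or -1 if there is none."""
    match = re.search(r"(\d+)$", Path(path).stem)
    return int(match.group(1)) if match else -1


def collect_by_prefix(folder, prefixes):
    """Map each prefix to the files in `folder` whose names start with it,
    sorted by the number at the end of the name."""
    folder = Path(folder)
    return {
        prefix: sorted(folder.glob(f"{prefix}*"), key=trailing_number)
        for prefix in prefixes
    }
```

A call such as collect_by_prefix("images", ["retina", "mask"]) would then give two ordered path lists, and each path can be handed to whichever image reader you use.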

You can limit the loop to match certain filenames by combining the loop with a generator expression.
for i in (j for j in os.listdir() if 'retina' in j):
    f = open(i,"r")
    f.read()
    f.close()
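Note that image files are binary, so opening them in text mode ("r") can raise a decoding error or alter the data; "rb" is the safer mode. A hedged sketch of the same filename filter that returns the raw bytes of each match follows. The helper name read_matching is hypothetical, and turning the bytes into pixels still requires an image library.

```python
import os


def read_matching(folder, keyword):
    """Return {filename: raw bytes} for every file in `folder` whose name contains `keyword`."""
    contents = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if keyword in name and os.path.isfile(path):
            # "rb" reads the bytes unchanged; the with-block closes the file automatically.
            with open(path, "rb") as f:
                contents[name] = f.read()
    return contents
```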

Related

How do I convert multiple PDFs into images from the same folder in Python?

from pdf2image import convert_from_path
images = convert_from_path('path.pdf',poppler_path=r"E:/software/poppler-0.67.0/bin")
for i in range(len(images)):
    images[i].save('image_name'+ str(i) +'.jpg', 'JPEG')
But now I want to convert more than 100 pdf files into images.
Is there any way?
Thanks in advance.

You can use glob to collect the file names into a list. The Python glob module is documented at https://docs.python.org/3/library/glob.html; "glob" is the general term for wildcard expansion in the (*nix) filesystem [https://en.wikipedia.org/wiki/Glob_(programming)]. It works under Windows as well.
Then you just loop over the files. Hey presto!
import glob
from pdf2image import convert_from_path
poppler_path = r"E:/software/poppler-0.67.0/bin"
pdf_filenames = glob.glob('/path/to/image_dir/*.pdf')
for pdf_filename in pdf_filenames:
    images = convert_from_path(pdf_filename, poppler_path=poppler_path)
    for i in range(len(images)):
        images[i].save(f"{pdf_filename}{i}.jpg", 'JPEG')
TIP: f"{pdf_filename}{i}.jpg" is a Python f-string, which gives the reader a better idea of what the string will eventually look like. You might want to zero-pad the integers there, because at some point you might want to glob or sort those files. There are lots of ways to achieve that; see "How to pad zeroes to a string?" for example.
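As one possible way to do the zero padding, the sketch below builds the output name from the PDF's stem and a padded page index, so that report.pdf page 7 becomes report_007.jpg next to the PDF. The helper name page_image_path, the underscore separator and the default of three digits are assumptions made for illustration.

```python
from pathlib import Path


def page_image_path(pdf_filename, page_index, digits=3):
    """Build e.g. 'dir/report_007.jpg' from 'dir/report.pdf' and page index 7."""
    pdf_path = Path(pdf_filename)
    # with_name swaps only the final path component, so the image lands next to the PDF.
    return pdf_path.with_name(f"{pdf_path.stem}_{page_index:0{digits}d}.jpg")
```

Inside the conversion loop this could be used as images[i].save(page_image_path(pdf_filename, i), 'JPEG'). Zero-padded names also sort correctly as plain text.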

You will probably need the os module.
First step:
Use the os.listdir function like this
os.listdir(path to folder containing pdf files)
to get a list of the names in that folder (join each name with the folder path before checking it).
Then use os.path.isfile() to check whether each path is a file or a folder.
If the path leads to a file, perform the conversion like this.
images = convert_from_path('path.pdf',poppler_path=r"E:/software/poppler-0.67.0/bin")
for i in range(len(images)):
    images[i].save('image_name'+ str(i) +'.jpg', 'JPEG')
Otherwise, use recursion to traverse the subfolder.
Here's a link to a repo where I recursively resized images in a folder. It could be useful for digesting this idea.
Link: recursive resizing of images in a given path.
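If hand-written recursion feels heavy, os.walk from the standard library already visits every subfolder. The following is a minimal sketch under that approach; the helper name find_pdfs is hypothetical, and the conversion step itself is left out.

```python
import os


def find_pdfs(root):
    """Return sorted paths of all .pdf files under `root`, including subfolders."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare in lower case so that .PDF is matched as well.
            if name.lower().endswith(".pdf"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Each returned path can then be passed to convert_from_path in a loop.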

Needs python or jupyter notebook helper hand image splitter

I am working with the Chest X-Ray14 dataset. The data contains about 112,200 images grouped in 12 folders (i.e. images1 to images12). The image labels are in a CSV file called Data_Entry_2017.csv. I want to split the images, based on the CSV labels (attribute "Finding Labels"), into their respective train and test folders.
Can anyone help me with Python or Jupyter notebook code for the split? I would be grateful.

import pandas as pd
df = pd.read_csv("Data_Entry_2017.csv")
infiltration_df = df[df["Finding Labels"] == "Infiltration"]
list_infiltration = infiltration_df.index.values.tolist()  # row labels of the matching rows; these are image names only if the image-name column is the index
Then you can go through each folder, check whether each image name is in the list of infiltration labels, and put the matches in a separate folder.
To read all image filenames in a folder, you can use os.listdir
from os import listdir
from os.path import isfile, join
imagefiles = [f for f in listdir(image_folder_name) if isfile(join(image_folder_name, f))]
For the train/test split, see the linked reference.
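For the split itself, one simple approach is to shuffle the selected image names with a fixed seed and cut the list in two. The sketch below uses only the standard library; the helper name split_names, the 20% test fraction and the seed value are illustrative assumptions, not part of the dataset's tooling.

```python
import random


def split_names(names, test_fraction=0.2, seed=0):
    """Shuffle `names` reproducibly and return (train_names, test_names)."""
    shuffled = list(names)
    # A dedicated Random instance keeps the split repeatable without touching global state.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

The two lists can then drive a copy step (for example with shutil.copy) into per-label train and test folders.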

Read images of multiple datatypes from multiple subdirectories

I want to read images of multiple file types from multiple subdirectories in Python using the glob function.
I have successfully read JPG images from the subdirectories, and I want to know how to read images of several file types. Below is the code I have tried so far.
########### READ IMAGES OF MULTIPLE DATATYPES FROM A SINGLE FOLDER ######
from glob import glob
from os.path import join
import cv2
files = []
for ext in ('*.jpg', '*.jpeg'):
    files.extend(glob(join("C:\\Python35\\target_non_target\\test_images", ext)))
count = 0
for i in range(len(files)):
    image = cv2.imread(files[i])
    print(image)
###### READ MULTIPLE JPG IMAGES FROM MULTIPLE SUBDIRECTORIES #########
import os
from glob import glob
folders = glob("C:\\Python36\\videos\\videos_new\\*")
img_list = []
for folder in folders:
    for f in glob(folder+"/*.jpg"):
        img_list.append(f)
for i in range(len(img_list)):
    print(img_list[i])
Both snippets work, but I am not sure how to combine them to read images of multiple file types from multiple subdirectories. I have a directory with multiple subdirectories that contain images of several types, such as JPG, PNG and JPEG. I want to read all those images and use them in my code.

Try this:
import glob
my_root = r'C:\Python36\videos\videos_new'
my_exts = ['*.jpg', '*.jpeg', '*.png']
# flatten the per-extension results into a single list of paths
files = [f for x in my_exts for f in glob.glob(my_root + '/**/' + x, recursive=True)]
See the linked page for various ways to recursively search folders.
On another note, instead of this:
for i in range(len(img_list)):
    print(img_list[i])
do this for a simpler, more readable for loop:
for img in img_list:
    print(img)
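An alternative to one glob call per extension is to walk the tree once and filter on the file suffix, which also handles upper-case extensions such as .JPG. This is a minimal sketch using pathlib; the helper name find_images and the suffix set are assumptions for illustration.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(root, suffixes=IMAGE_SUFFIXES):
    """Return sorted paths of all files under `root` whose extension is in `suffixes`."""
    return sorted(
        p for p in Path(root).rglob("*")
        # Lower-case the suffix so that .JPG and .Png are matched too.
        if p.is_file() and p.suffix.lower() in suffixes
    )
```

The result is a single flat list of Path objects; wrap each one in str() if your image reader expects a string.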

How do I load a file containing images into a python array?

Is there a way to use the pandas library to simply load the images (as pixelated data) into a single array?

Let's say you have a folder that only contains JPEG images.
First, import everything you'll need
from os import listdir
from os.path import isfile, join
import imageio
import numpy as np
Then, set the location of the folder that contains ONLY IMAGES. With this folder location, we will generate the list of full filenames for each and every image.
image_folder_path = "D:\\temp\\images"
onlyfiles = [f for f in listdir(image_folder_path) if isfile(join(image_folder_path, f))]
full_filenames = [join(image_folder_path,this_image) for this_image in onlyfiles]
Then, you can start an empty list, open one file at a time, and append each image to the list.
image_list = []
for this_filename in full_filenames:
    image_rgb_values = imageio.imread(this_filename)
    image_list.append(image_rgb_values.copy())
image_list = np.array(image_list)
Now the variable image_list holds all the images.
This works when all images have identical dimensions (width x height). If they differ, np.array cannot stack them into one regular array (recent NumPy versions raise an error), so keep the plain list in that case.
Hope it helps! =)

How to upload images in specifier path to python using Numerical order

I want my program to load all the images in a directory given by a path. I am using:
for subdirs, dirs, files in os.walk(args.imagesdirectory):
    for file in files:
        print("file is ",file)
        path=subdirs+'/'+file
        print("path is ",path)
        img = Image.open(path)
So my question is: how can I make the program always import the images in this order, 0001.jpg, then 0002.jpg, then 0003.jpg, etc., and not in an arbitrary order?
Thank you in advance.

If you know the names in advance, you could do that with the range() function,
i.e.
for filenum in range(1, len(files) + 1):
    img = open(f"{filenum:04d}.jpg", "rb")  # 0001.jpg, 0002.jpg, ...
    ...
Also, it's generally better to use with open(file) as f than f = open(file).

Assuming that you're using Py3.4 or greater, the pathlib module is very useful for work like this. It's part of the standard distribution.
I have a subdirectory on one of my drives called C:/Camera/Selected. This is how I can list the jpg images in sorted order. sorted() compares the names as text, which matches numerical order for zero-padded names such as 0001.jpg.
>>> from pathlib import Path
>>> for p in sorted(list(Path('C:/Camera/Selected').glob('*.jpg'))):
...     str(p)
...
'C:\\Camera\\Selected\\20150320_155849.jpg'
'C:\\Camera\\Selected\\20160905_184732.jpg'
'C:\\Camera\\Selected\\20170717_082735.jpg'
In your case, you would have img = Image.open(str(p)) within the for loop.
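If the names were not zero-padded (2.jpg, 10.jpg), a text sort would put 10.jpg first. A hedged sketch of a sort key that compares purely numeric stems by value is shown below; the names numeric_key and sorted_numerically are hypothetical, and non-numeric names are simply placed last.

```python
from pathlib import Path


def numeric_key(path):
    """Sort key: numeric stems by integer value first, all other names afterwards."""
    stem = Path(path).stem
    return (0, int(stem), "") if stem.isdecimal() else (1, 0, stem)


def sorted_numerically(paths):
    """Return `paths` ordered so that 2.jpg comes before 10.jpg."""
    return sorted(paths, key=numeric_key)
```

With os.walk, the same key can be applied as sorted(files, key=numeric_key) before the inner loop.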
