Prepare image data for tflearn - python

I have a few thousand pictures I want to use to train a model with tflearn, but I am having trouble getting the images into the right data format. I tried tflearn.data_utils.image_preloader(), but I get a ValueError. So I tried to write my own image_loader.py, but with that many pictures my RAM fills up.
Does anyone know a good tutorial or example of writing a CNN with a custom image set, including how to preprocess the data for tflearn?

The TensorFlow tutorial is a good place to start; it has a dedicated section on preprocessing.
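As for the RAM problem: tflearn's image_preloader does not load everything at once; it keeps the file paths and reads each image lazily when a batch is requested. A minimal sketch, assuming the images are sorted into one folder per class (the root path and image size below are placeholders):

    # image_preloader reads images lazily from disk, so the whole
    # dataset never has to fit in RAM.
    from tflearn.data_utils import image_preloader

    # mode='folder' expects one subdirectory per class,
    # e.g. dataset/class_a/, dataset/class_b/, ...
    X, Y = image_preloader('dataset/',               # hypothetical root folder
                           image_shape=(64, 64),     # all images resized to this
                           mode='folder',
                           categorical_labels=True,  # one-hot encode the labels
                           normalize=True)           # scale pixels to [0, 1]

    # X and Y behave like arrays but read images on demand, so they can be
    # passed straight to model.fit(X, Y, ...) without exhausting memory.

A ValueError from image_preloader is often a mismatch between mode and the layout on disk (mode='file' expects a text file of "path label" lines, while mode='folder' expects class subfolders), so that is worth double-checking first.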

Related

Why do the generated images from a cGAN look different from the preview I saw on TensorBoard during training?

I am training a conditional WGAN-GP.
I cannot understand why: during training, the fake images I produce look good enough on TensorBoard, as you can see in the first picture.
[image 1: TensorBoard grid of fake images during training]
But when training is finished and I load the model in another file and call the generator with the appropriate inputs (noise and a class label), the fake images it produces look like picture 2.
[image 2: images from the loaded generator, noticeably darker]
As you can see, the model (the generator) is working, but the pictures are darker, and I don't know if that is normal. Why do they look better on TensorBoard?
If anybody has an idea why this is happening, tell me and I will send code details.
Thank you!
I tried running it for more epochs, but the results when I use the generator are about the same. I expected the loaded generator to produce images like the ones I see in the TensorBoard grid of fake images.

How to build multi-input image process by Tensorflow ImageDataGenerator

I am attempting to build a multi-input CNN model.
Specifically, the model classifies words such as "arigatou", "hai", ... into 20 types of words, as shown in the attached image.
For this purpose, the input format I am assuming feeds images from 4 channels simultaneously.
However, I am having trouble figuring out how to process the image data.
Please let me know if there is a way to use ImageDataGenerator to create training data from a directory structure like the one in the image.
Thank you very much.
sample URL:
https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator?hl=zh-tw
You do not need multiple ImageDataGenerators. All you have to do is make a pandas DataFrame of all the image paths and then use ImageDataGenerator.flow_from_dataframe on it.
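A hedged sketch of that idea for the four-channel case: put the four image paths of each sample in four DataFrame columns, create one flow_from_dataframe stream per column with the same seed so the shuffling stays aligned, and zip them into a single multi-input generator. The column names, file name and sizes below are placeholders:

    import pandas as pd
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # One row per sample: four image paths (ch1..ch4 are made-up column
    # names) plus the word label.
    df = pd.read_csv('samples.csv')  # hypothetical file with columns ch1..ch4, label

    datagen = ImageDataGenerator(rescale=1.0 / 255)

    def make_flow(col):
        # Identical seed + identical DataFrame order keeps the four
        # streams shuffled in lockstep.
        return datagen.flow_from_dataframe(
            df, x_col=col, y_col='label',
            target_size=(64, 64), batch_size=32,
            class_mode='categorical', shuffle=True, seed=42)

    flows = [make_flow(c) for c in ['ch1', 'ch2', 'ch3', 'ch4']]

    def multi_input_generator():
        # Yields ([x1, x2, x3, x4], y) batches for a four-input model.
        while True:
            batches = [next(f) for f in flows]
            yield [b[0] for b in batches], batches[0][1]

The generator can then be fed to a Keras model that declares four image inputs.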

Why a convolution neural network model which works well on new test images fails on video stream?

I have implemented a convolutional neural network via transfer learning, using VGG19 to classify 5 different traffic signs. It works well with new test images, but when I apply the model to a video stream it doesn't classify the signs correctly.
Assuming the neural network works well on images, it should work the same on frames of a video stream. In the end, a video stream is a sequence of images.
The problem is not that it doesn't work on a video stream; it simply does not work on images similar to the ones in the video stream.
It is hard to pinpoint the exact problem, since the question does not have enough detail. However, some considerations are:
Obviously, there is a problem with the network's ability to generalize. Was the evaluation performed properly? For example, is there a train/validation split of the data?
Do the training error and the validation error indicate any possible issues, such as overfitting?
Is the data used to train the model similar enough to the video frames? Make sure each frame goes through the same preprocessing as your test images (see the sketch below).
Is the training dataset large enough? Data augmentation might help if there is not enough data.
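To rule out a preprocessing mismatch, here is a hedged sketch of running video frames through the same steps as still test images, assuming a saved Keras VGG19-based model (the model path, video path and input size are placeholders):

    import cv2
    import numpy as np
    from tensorflow.keras.applications.vgg19 import preprocess_input
    from tensorflow.keras.models import load_model

    model = load_model('traffic_sign_model.h5')  # hypothetical model file

    cap = cv2.VideoCapture('dashcam.mp4')        # hypothetical video file
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # OpenCV delivers BGR; Keras models are usually trained on RGB.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        rgb = cv2.resize(rgb, (224, 224))        # must match the training size
        x = preprocess_input(rgb.astype(np.float32)[None, ...])
        print(model.predict(x).argmax())         # predicted sign class
    cap.release()

If predictions on extracted frames are fine but live predictions are not, the gap is likely in the capture path (blur, lighting, resolution) rather than in the model itself.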

How to create my own dataset to train/test a convolutional neural network

So here is my question:
I want to make my very own dataset, using a motion-capture camera system to get ground-truth poses and one RGB camera to get images, and then use this as input to train/test a ConvNet.
I have looked around at other datasets for TensorFlow, Caffe and MATLAB. I have viewed the MNIST, Cats/Dogs, Iris, LSP, HumanEva, Human3.6M, FLIC, etc. datasets and have tried to understand their data as best I can. I have also watched people online trying to make their own datasets. The one thing they have in common is that when you use their datasets as an example, you usually download a .txt file that already contains the labels.
If anyone could explain how to feed the image data together with the labels into my network, it would be a tremendous help. I have written TensorFlow code before that reads a .txt file into the network and gets the correct predicted output, but my brain is missing something about how to input an image with a label. How do I create that dataset?
Your input images and your labels are two separate variables, and you will write separate bits of code to import them. The videos typically need to be converted to JPG files first (it's a royal pain to read video files directly, mostly because you can't easily skip around in a video).
Probably the easiest way to structure your data is via a CSV that contains filename, poseinfoA, poseinfoB, etc., where filename refers to the JPG image on disk.
To get started on the basics, I suggest looking at aymericdamien's tutorial examples; I haven't found tutorials anywhere else that are as clear and concise.
https://github.com/aymericdamien/TensorFlow-Examples
Those examples don't go into detail on the data input pipeline, though. To set up a good data input pipeline in TensorFlow, I suggest you use the new (as of TF 1.4) Dataset object. It will force you into a good input-pipeline workflow, it's the direction all data input in TensorFlow is going, and it's easy to test and debug when you write it this way. Here's the guide you want to follow.
https://www.tensorflow.org/programmers_guide/datasets
You can start your Dataset object from the CSV and use dataset.map() to load the images with tf.image.decode_jpeg, for example:
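Here is a minimal sketch of that pipeline, assuming a CSV with a header row and columns filename,poseA,poseB (the file name and column layout are placeholders):

    import tensorflow as tf

    def parse_row(line):
        # Split "filename,poseA,poseB" and load the image from disk.
        fields = tf.io.decode_csv(line, record_defaults=[[''], [0.0], [0.0]])
        image = tf.io.read_file(fields[0])
        image = tf.image.decode_jpeg(image, channels=3)
        image = tf.image.convert_image_dtype(image, tf.float32)
        image = tf.image.resize(image, [224, 224])  # fixed shape so batching works
        label = tf.stack(fields[1:])                # the pose values
        return image, label

    dataset = (tf.data.TextLineDataset('poses.csv')  # hypothetical CSV file
               .skip(1)                              # skip the header row
               .map(parse_row)
               .shuffle(1000)
               .batch(32))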
Since you're doing pose estimation I'll also suggest a nice blog I came across recently that will probably interest you. The topic is segmentation, but pose estimation is quite related.
http://blog.qure.ai/notes/semantic-segmentation-deep-learning-review

How to recognize real-scene images using scikit-learn?

I am new to scikit-learn. I have a lot of images, not all of the same size. Some of them are real-scene images, like
cdn.mayike.com/emotion/img/attached/1/image/dt/20170920/12/20170920121356_795.png
cdn.mayike.com/emotion/img/attached/1/image/mainImg/20170916/15/20170916153205_512.png
while others are not real-scene images, like
cdn.mayike.com/emotion/img/attached/1/image/dt/20170917/01/20170917011403_856.jpeg
cdn.mayike.com/emotion/img/attached/1/image/dt/20170917/14/20170917145613_197.png
I want to use scikit-learn to recognize which images are not real scenes. I think it is similar to http://scikit-learn.org/stable/auto_examples/applications/plot_face_recognition.html#sphx-glr-auto-examples-applications-plot-face-recognition-py, but I have no idea how to begin. How do I create a dataset and extract features from the images? Can someone tell me what I should do?
This is not directly a programming problem, and your questions touch on non-trivial current research.
You should read about natural scene statistics and get familiar with one of the current machine learning frameworks, such as TensorFlow or Caffe.
There are many tutorials out there to get started; for example, you could begin with a binary classifier that outputs whether a given image shows a natural scene or not.
Your dataset could have a directory structure like so:
-> Dataset
    -> natural_scenes
    -> artificial_images
DIGITS, for example, can use such a structure to create a dataset, and it can train models designed for Caffe and TensorFlow.
I would also recommend reading about finetuning neural networks, as you would need a lot of images in your database if you started training from scratch. In Caffe you can finetune pretrained models like CaffeNet or GoogLeNet.
I think those are some basic pointers that should get you started.
As for scikit-learn and face detection: face detection mostly looks for local candidates, i.e. image patches that could contain a face. Your problem, on the other hand, is a global one, since the whole image is concerned. That said, I would start with a neural network here, as it can extract both local and global features for you.
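If you nevertheless want a quick scikit-learn baseline first, here is a hedged sketch using a coarse color histogram as a global feature plus an SVM, on the Dataset/ layout above (paths and the bin count are placeholders):

    import numpy as np
    from pathlib import Path
    from PIL import Image
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    def histogram_features(path, bins=8):
        # Coarse RGB histogram: a simple global descriptor that works
        # regardless of the image size.
        img = np.asarray(Image.open(path).convert('RGB'))
        hist, _ = np.histogramdd(img.reshape(-1, 3),
                                 bins=(bins, bins, bins),
                                 range=((0, 256), (0, 256), (0, 256)))
        return hist.ravel() / hist.sum()

    X, y = [], []
    for label, folder in enumerate(['natural_scenes', 'artificial_images']):
        for p in Path('Dataset', folder).glob('*'):
            X.append(histogram_features(p))
            y.append(label)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    clf = SVC().fit(X_train, y_train)
    print('held-out accuracy:', clf.score(X_test, y_test))

Such a baseline will likely be weaker than a finetuned CNN, but it gives you a working dataset-plus-features pipeline to compare against.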
