How to convert an image to a dataset or numpy array and predict by fitting it to clf
import PIL as pillow
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
infilename=input()
im=Image.open(infilename)
imarr=np.array(im)
flatim=imarr.flatten('F')
clf=svm.SVC(gamma=0.0001,C=100)
x,y=im.size
#how to fit the numpy array to clf
clf.fit(flatim[:-1],flatim[:-1])
print("prediction:",clf.predict(flatim[-1]))
plt.imshow(flatim,camp=plt.cm.gray_r,interpolation='nearest')
plt.show()
Can anyone help, please? Thanks!
There is no reason to use SVM on a single image other than for the fun of doing it. Here are the fixes I made:
1) used .convert("L") to load the image as a 2D grayscale array;
2) created a dummy target variable y as a randomized 1D array;
3) fixed the typos when displaying the image again (plt.imshow): cmap instead of camp, and im instead of flatim.
import PIL as pillow
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
im=Image.open("sample.jpg").convert("L")
imarr=np.array(im)
flatim=imarr.flatten('F')
clf=svm.SVC()
#X,y=im.size
X = imarr                                        # each image row is treated as one sample
y = np.random.randint(2, size=imarr.shape[0])    # dummy 0/1 label for every row
clf.fit(X, y)
#how to fit the numpy array to clf
#clf.fit(flatim[:-1],flatim[:-1])
# I HAVE NO IDEA WHAT I"M DOING HERE!
print("prediction:", clf.predict(X[-2:-1]))      # predict on the second-to-last row (the slice keeps it 2D)
plt.imshow(im,cmap=plt.cm.gray_r,interpolation='nearest')
plt.show()
There is a good example of using SVM on the scikit-learn website (classifying the digits dataset). I guess that is what you are trying to reproduce, isn't it?
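For reference, a minimal sketch along the lines of that scikit-learn digits example (many small labelled images, each flattened into one feature row; the parameter values here are just illustrative):
from sklearn import datasets, svm

# load 1797 8x8 grayscale digit images with their labels
digits = datasets.load_digits()
n_samples = len(digits.images)
X = digits.images.reshape((n_samples, -1))   # flatten each 8x8 image to a 64-feature row
y = digits.target

clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(X[:-10], y[:-10])                    # train on all but the last 10 images
print("predictions:", clf.predict(X[-10:]))  # predict the held-out images
print("actual:     ", y[-10:])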
I am trying to use SUSI on hyperspectral data, but I am getting errors. I am sure that I am the problem and not SUSI.
import susi as su
import spectral as sp
import spectral.io.envi as envi
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
box = envi.open('C:/path/ref_16-2_22/normalised.hdr')
data = np.array(box.load())
som = su.SOMClassifier(n_rows=data.shape[0], n_columns=data.shape[1])
som.fit(data)
ValueError: estimator requires y to be passed, but the target y is None
som = su.SOMClustering(n_rows=data.shape[0], n_columns=data.shape[1])
som.fit(data)
ValueError: Found array with dim 3. None expected <= 2.
I am definitely the problem! Has anyone used SUSI on 3D data?
In general: the dimensions of the SOM (rows and columns) don't have to relate to the dimensions of your data.
For susi: you are using a classifier on data without class labels. In som.fit, you also need to pass the labels y:
som.fit(data, y)
data can be an n-D array; y would be a 1D array in your case, I guess.
Alternatively, you can use unsupervised clustering:
som = SOMClustering()
som.fit(data)
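Since the second error complains about a 3D array, a sketch of one common workaround for a hyperspectral cube (assuming shape (rows, cols, bands), with a random cube standing in for the real data) is to flatten the spatial dimensions so every pixel becomes one sample:
import numpy as np
import susi as su

data = np.random.rand(20, 20, 30)       # stand-in for the loaded hyperspectral cube (rows, cols, bands)
rows, cols, bands = data.shape
X = data.reshape(rows * cols, bands)    # one sample per pixel, one feature per band

som = su.SOMClustering(n_rows=10, n_columns=10)   # the SOM grid size is independent of the data shape
som.fit(X)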
[Disclaimer: I am the developer of susi.]
I'm getting the following error from my code:
ValueError: Expected 2D array, got scalar array instead:
array=99.
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Here is the code used:
#importing libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import linear_model
Physical_activity_df = pd.read_excel('C:/Users/Usuario/Desktop/LW_docs/Physical_activity_nopass.xlsx')
prediction_df = Physical_activity_df[['Activity_Score','Calories']]
prediction_df.plot(kind='scatter', x= 'Activity_Score', y= 'Calories')
plt.show()
#change to df variables
activity_score = pd.DataFrame(prediction_df['Activity_Score'])
calories = pd.DataFrame(prediction_df['Calories'])
lm = linear_model.LinearRegression()
model = lm.fit(activity_score,calories)
#predict new values for calories (FROM HERE COMES THE ERROR)
activity_score_new = 99
calories_predict = model.predict(activity_score_new)
calories_predict
Any idea about how to fix this issue? Thanks!
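For what it is worth, a minimal sketch of what the error message itself suggests, with dummy stand-in values in place of the spreadsheet: the value passed to predict has to be a 2D array with one row and one column, because the model was fit on a single-column DataFrame.
import numpy as np
from sklearn import linear_model

# dummy stand-in values; the real ones come from the Excel file above
activity_score = np.array([[50], [70], [90], [110]])
calories = np.array([1500, 1800, 2100, 2400])

model = linear_model.LinearRegression().fit(activity_score, calories)

activity_score_new = np.array([[99]])      # shape (1, 1): one sample, one feature
print(model.predict(activity_score_new))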
I was trying to print an image to analyze whether there are changes in the pixel intensities when the images are forged. Anyway, my doubt is related to numpy.printoptions.
I was trying the code below, and numpy.printoptions was not working:
Image of code snippet
Code:
import os
import cv2
import numpy
import numpy as np
import matplotlib.pyplot as plt

DIR = "D:/Work/ML/API/MNB/28 - Forgery/data/phase-01-training.tar/dataset-dist/chunks/1500_64_16"
TRAIN_CHUNKS = os.path.join(DIR, "train")
TRAIN_FAKE_CHUNKS = os.path.join(TRAIN_CHUNKS, "fake")
TRAIN_PRISTINE_CHUNKS = os.path.join(TRAIN_CHUNKS, "pristine")
IND = 2000
train_chunk_files = os.listdir(TRAIN_FAKE_CHUNKS)
src = cv2.imread(os.path.join(TRAIN_FAKE_CHUNKS, train_chunk_files[IND]))
print(src[:, :, 1].shape)
with numpy.printoptions(threshold=64):
    ok = np.copy(src[:, :, 1])
    print(ok)
    # print(src[:, :, 1])
plt.imshow(src)
plt.show()
But on the other hand, numpy.printoptions works fine for the MNIST dataset:
Working code snippet
Code:
import tensorflow as tf
print(tf.__version__)
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()
import numpy as np
np.set_printoptions(linewidth=200)
import matplotlib.pyplot as plt
print(type(training_images[0]))
plt.imshow(training_images[0])
print(training_labels[0])
print(training_images[0])
What mistake am I making here? How do I print the RGB image for each channel? I have checked the datatype of both images; it is numpy.ndarray.
Edit 1:
Things do not work with linewidth either.
Code with linewidth in np.printoptions
From the screenshot you have given in the "Image of code snippet" link, it seems printoptions prints the values in both cases. In case the values need to be printed properly (displaying all values), you could use:
with np.printoptions(linewidth=200):
    ok = np.copy(src[:, :, 1])
    print(ok)
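If the goal is also to print every value of each RGB channel, a small sketch (assuming src is the H x W x 3 BGR array returned by cv2.imread; a random array stands in for it here, and threshold=np.inf keeps numpy from summarizing the output with ...):
import numpy as np

src = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)   # stand-in for cv2.imread(...)

with np.printoptions(threshold=np.inf, linewidth=200):
    for c, name in enumerate(["blue", "green", "red"]):            # OpenCV stores channels as BGR
        print(name, "channel:")
        print(src[:, :, c])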
If this is not what you are looking for, please let me know in the comments section. Happy to correct the answer!
scipy.ndimage.imread has just been deprecated in scipy, so I switched my code to use pyplot directly, but the result was not the same. I am importing images for a learning algorithm built in Keras. I thought it would be a one-to-one change, but it isn't: training worked fine before, and after the switch my system doesn't train. Is there a Python guru out there who can explain what the difference is?
Scipy returns:
img_array : ndarray
The different colour bands/channels are stored in the third dimension, such that a grey-image is MxN, an RGB-image MxNx3 and an RGBA-image MxNx4.
scipy documentation
Matplotlib returns:
Return value is a numpy.array. For grayscale images, the return array is MxN. For RGB images, the return value is MxNx3. For RGBA images the return value is MxNx4.
matplotlib documentation
MWE:
from scipy import ndimage
import_image = (ndimage.imread("img.png").astype(float) -
255.0 / 2) / 255.0
print import_image[0]
import matplotlib.pyplot as plt
import_image = (plt.imread("img.png").astype(float) -
255.0 / 2) / 255.0
print import_image[0]
Here would be a true MCVE:
import matplotlib.pyplot as plt
import numpy as np
import scipy.ndimage
im = np.random.rand(20,20)
plt.imsave("img.png",im)
### Scipy
i2 = scipy.ndimage.imread("img.png")
print i2.shape, i2.min(), i2.max(), i2.dtype
# (20L, 20L, 4L) 1 255 uint8
### Matplotlib
i3 = plt.imread("img.png").astype(float)
print i3.shape, i3.min(), i3.max(), i3.dtype
# (20L, 20L, 4L) 0.00392156885937 1.0 float64
As can be seen,
scipy.ndimage.imread creates a numpy array of integer type ranging from 0 to 255, while
pyplot.imread creates a numpy array of float type ranging from 0.0 to 1.0.
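If that difference is all that broke the training, a sketch of how the original normalization might be adapted for pyplot (assuming a PNG, which imread already returns as floats in 0..1, so only the centring step remains):
import numpy as np
import matplotlib.pyplot as plt

plt.imsave("img.png", np.random.rand(20, 20))          # same test image as above

# pyplot already returns floats in [0, 1] for PNGs, so the old
# (x - 255/2) / 255 scaling reduces to subtracting 0.5
import_image = plt.imread("img.png").astype(float) - 0.5
print(import_image.min(), import_image.max())          # roughly -0.5 .. 0.5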
I have a script that reads in image data and then iterates over the images with the median filter in scipy.ndimage. From the iteration I create new arrays.
However, when I attempt to run the script with
run filtering.py
the filtering does not seem to work: the new arrays (months_f) are the same as the old ones.
import matplotlib.pyplot as plt
import numpy as numpy
from scipy import ndimage
import Image as Image
# Get images
#Load images
jan1999 = Image.open('jan1999.tif')
mar1999 = Image.open('mar1999.tif')
may1999 = Image.open('may1999.tif')
sep1999 = Image.open('sep1999.tif')
dec1999 = Image.open('dec1999.tif')
jan2000 = Image.open('jan2000.tif')
feb2000 = Image.open('feb2000.tif')
#Compute numpy arrays
jan1999 = numpy.array(jan1999)
mar1999 = numpy.array(mar1999)
may1999 = numpy.array(may1999)
sep1999 = numpy.array(sep1999)
dec1999 = numpy.array(dec1999)
jan2000 = numpy.array(jan2000)
feb2000 = numpy.array(feb2000)
########### Put arrays into a list
months = [jan1999, mar1999, may1999, sep1999, dec1999, jan2000, feb2000]
############ Filtering = 3,3
months_f = []
for image in months:
    image = scipy.ndimage.median_filter(image, size=(5,5))
    months_f.append(image)
Any help would be much appreciated :)
This is rather a comment but due to reputation limits I'm not able to write one.
The way you import your modules is a bit strange, especially "import .. as" with the identical name. I think a more Pythonic way would be
import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage
from PIL import Image
and then call
image = ndimage.median_filter(image, size=(...))
When I run your steps with an RGB test image, it seems to work.
What does jan1999.shape return?
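For reference, a minimal sketch of the corrected loop with those imports (a random array stands in for one of the TIFF images):
import numpy as np
from scipy import ndimage

jan1999 = np.random.rand(100, 100)             # stand-in for np.array(Image.open('jan1999.tif'))
months = [jan1999]

months_f = []
for image in months:
    months_f.append(ndimage.median_filter(image, size=(5, 5)))

print(np.array_equal(months[0], months_f[0]))  # False once the filter has actually been applied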