Hello, I am using Python to read the digit data provided by MNIST into a data structure I can use to train a neural network. To check that the data was read properly, I create an image from it using PIL. The resulting image is badly wrong, and I am not sure whether I am using PIL incorrectly or whether my data structures and methods are at fault.
The format of the two data files is described here:
http://yann.lecun.com/exdb/mnist/
Here are the applicable functions:
read_image_data reads the pixel data, organizing it into a list of 2D numpy arrays:
def read_image_data():
    fd = open("train-images.idx3-ubyte", "rb")
    images_bin_string = fd.read()
    num_images = struct.unpack(">i", images_bin_string[4:8])[0]
    image_data_bank = []
    uint32_num_bytes = 4
    current_index = 8
    num_rows = struct.unpack(">I", \
        images_bin_string[current_index: current_index + uint32_num_bytes])[0]
    num_cols = struct.unpack(">I", \
        images_bin_string[current_index + uint32_num_bytes: \
        current_index + uint32_num_bytes * 2])[0]
    current_index += 8
    i = 0
    while i < num_images:
        image_data = np.zeros([num_rows, num_cols])
        for j in range(num_rows - 1):
            for k in range(num_cols - 1):
                image_data[j][k] = images_bin_string[current_index + j * k]
        current_index += num_rows * num_cols
        i += 1
        image_data_bank.append(image_data)
    return image_data_bank
read_label_data reads the corresponding labels into a list
def read_label_data():
    fd = open("train-labels.idx1-ubyte", "rb")
    images_bin_string = fd.read()
    num_images = struct.unpack(">i", images_bin_string[4:8])[0]
    image_data_bank = []
    current_index = 8
    i = 0
    while i < num_images:
        image_data_bank.append(images_bin_string[current_index])
        current_index += 1
        i += 1
    return image_data_bank
collect_data zips the structures together
def collect_data():
    print("Reading image data...")
    image_data = read_image_data()
    print("Reading label data...")
    label_data = read_label_data()
    print("Zipping data sets...")
    all_data = np.array(list(zip(image_data, label_data)))
    return all_data
Lastly, run_test uses PIL to display the pixels from the first 28x28 numpy array created by read_image_data:
def run_test(data):
    example = data[0]
    pixel_data = example[0]
    number = example[1]
    print(number)
    im = Image.fromarray(pixel_data)
    im.show()
When I run the script:
Collecting data...
Reading image data...
Reading label data...
Zipping data sets...
5
I must be messing something up with the PIL library, but I do not know what.
That is a really weird looking 5. I am guessing that I went wrong somewhere in my organization of the data. The directions did say "Pixels are organized row-wise.", but I think I covered that by having my outer loop as the row loop then the inner as the column loop
UPDATE
I reversed the order of the row and column index in the np.arrays in read_image_data and it is making no difference.
image_data[k][j] = images_bin_string[current_index + j * k]
UPDATE
Ran quick test with matplotlib
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
imgplot = plt.imshow(pixel_data)
plt.show()
Here is what I got from matplotlib
That means the problem is definitely in my code and not in the library. The question is whether it lies in the way I pass the pixels to the imaging libraries or in how I structured the data. If anyone can find the mistake, I would greatly appreciate it.
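For what it's worth, one thing that stands out in read_image_data is the offset `current_index + j * k`: with row-wise storage, pixel (j, k) sits at `j * num_cols + k`, and `range(num_rows - 1)` also skips the last row and column. Below is a minimal sketch of a row-major read, assuming the IDX layout from the page linked above (a 16-byte big-endian header, then one unsigned byte per pixel). `parse_idx_images` is a hypothetical helper name and the demo bytes are synthetic, so treat it as an illustration rather than a drop-in fix.

```python
import struct
import numpy as np

def parse_idx_images(buf):
    """Parse IDX3 image bytes into an array of shape (num_images, rows, cols)."""
    magic, num_images, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != 2051:
        raise ValueError("unexpected magic number: {}".format(magic))
    # Pixels are stored row-wise, so a plain reshape recovers the 2D layout.
    pixels = np.frombuffer(buf, dtype=np.uint8, offset=16)
    return pixels.reshape(num_images, rows, cols)

# Synthetic file with two 3x4 images, so the sketch runs without the MNIST download.
header = struct.pack(">IIII", 2051, 2, 3, 4)
demo = header + bytes(range(24))
images = parse_idx_images(demo)

# Row-major check: pixel (j, k) of image n lives at n*rows*cols + j*cols + k.
n, j, k = 1, 2, 3
offset = 16 + n * 3 * 4 + j * 4 + k
print(images[n, j, k] == demo[offset])
```

With the real file, the same function could be fed the result of `open("train-images.idx3-ubyte", "rb").read()`.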
Using the program at this link, https://leon.bottou.org/projects/infimnist, I generated some data.
As far as I can tell, it is in some sort of binary format:
b"\x00\x00\x08\x01\x00\x00'\x10\x07\x02\x01\x00\x04\x01\x04\t\x05 ...
I need to extract the labels and pictures from two datasets generated this way.
with open("test10k-labels", "rb") as binary_file:
    data = binary_file.read()
    print(data)
>>> b"\x00\x00\x08\x01\x00\x00'\x10\x07\x02\x01\x00\x04\x01\x04\t\x05 ...
b"\x00\x00\x08\x01 ...".decode('ascii')
>>> "\x00\x00\x08\x01 ..."
I also tried the binascii package, but it did not work.
Thankful for any help!
Creating the Data
To create the dataset I am speaking of, download the package from the following link: https://leon.bottou.org/projects/infimnist.
$ cd dir_of_folder
$ make
Then I took the path of the resulting infimnist executable and ran:
$ app_path lab 10000 69999 > mnist60k-labels-idx1-ubyte
This should place the file I used in the folder.
The command after app_path can be replaced by any of the other commands listed on the project page.
Final update
It works!
Using some numpy functions the images can be returned to their normal orientation.
import struct
from array import array
import numpy as np
# for the labels
with open(path, "rb") as binary_file:
    binary_file.read(8)  # skip the 8-byte header (magic number and item count)
    y_train = np.array(array("B", binary_file.read()))
# for the images
with open("images path", "rb") as binary_file:
    images = []
    emnistRotate = True
    magic, size, rows, cols = struct.unpack(">IIII", binary_file.read(16))
    if magic != 2051:
        raise ValueError('Magic number mismatch, expected 2051, got {}'.format(magic))
    for i in range(size):
        images.append([0] * rows * cols)
    image_data = array("B", binary_file.read())
    for i in range(size):
        images[i][:] = image_data[i * rows * cols:(i + 1) * rows * cols]
        # for some reason EMNIST is mirrored and rotated
        if emnistRotate:
            x = image_data[i * rows * cols:(i + 1) * rows * cols]
            subs = []
            for r in range(rows):
                subs.append(x[(rows - r) * cols - cols:(rows - r)*cols])
            l = list(zip(*reversed(subs)))
            fixed = [item for sublist in l for item in sublist]
            images[i][:] = fixed
x = []
for image in images:
    x.append(np.rot90(np.flip(np.array(image).reshape((28,28)), 1), 1))
x_train = np.array(x)
Crazy solution for such a simple thing :)
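As far as I can tell, for square images the transpose done in the emnistRotate branch and the final np.rot90(np.flip(..., 1), 1) cancel each other out, so the result should match a plain reshape of the raw pixel bytes. The sketch below checks this on synthetic data. `roundabout` is a hypothetical helper that mimics the two steps for one image, and the 4x4 size is only there to keep the example small.

```python
import numpy as np

rows = cols = 4  # square images, as in MNIST (28x28); small here for readability
size = 3
rng = np.random.default_rng(0)
image_data = rng.integers(0, 256, size=size * rows * cols, dtype=np.uint8)

def roundabout(flat):
    """Mimic the emnistRotate step plus rot90/flip for one flat image."""
    subs = [flat[(rows - r) * cols - cols:(rows - r) * cols] for r in range(rows)]
    fixed = [item for sublist in zip(*reversed(subs)) for item in sublist]
    return np.rot90(np.flip(np.array(fixed).reshape((rows, cols)), 1), 1)

# Plain row-major reshape of the raw bytes.
direct = image_data.reshape(size, rows, cols)
via_loops = np.array([
    roundabout(image_data[i * rows * cols:(i + 1) * rows * cols])
    for i in range(size)
])
print(np.array_equal(direct, via_loops))
```

If that holds for the real files too, `np.frombuffer(raw, dtype=np.uint8).reshape(size, rows, cols)` on the bytes after the header might be all that is needed.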
OK, so looking at the python-mnist source, it seems the correct way to unpack the binary format is as follows:
import struct
from array import array
with open("test10k-labels", "rb") as binary_file:
    magic, size = struct.unpack(">II", binary_file.read(8))
    if magic != 2049:
        raise ValueError("Magic number mismatch, expected 2049, got {}".format(magic))
    labels = array("B", binary_file.read())
    print(labels)
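To see what that header contains, here is a small worked example on the bytes printed in the question. It only assumes that the 17 bytes shown there are the start of the label file; everything after the ellipsis is left out.

```python
import struct
from array import array

# The first bytes printed in the question: an 8-byte header, then nine labels.
sample = b"\x00\x00\x08\x01\x00\x00'\x10\x07\x02\x01\x00\x04\x01\x04\t\x05"

# Two big-endian unsigned 32-bit integers: magic number and item count.
magic, size = struct.unpack(">II", sample[:8])
labels = array("B", sample[8:])

print(magic, size)    # 2049 10000
print(list(labels))   # [7, 2, 1, 0, 4, 1, 4, 9, 5]
```

The apostrophe in the printed bytes is just byte 0x27, so `\x00\x00'\x10` is 0x00002710, i.e. 10000 labels.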
update
So I haven't tested this extensively, but the following code should work. It was taken and modified from the aforementioned python-mnist (see its source):
from array import array
import struct
with open("mnist8m-patterns-idx3-ubyte", "rb") as binary_file:
    images = []
    emnistRotate = True
    magic, size, rows, cols = struct.unpack(">IIII", binary_file.read(16))
    if magic != 2051:
        raise ValueError('Magic number mismatch, expected 2051, got {}'.format(magic))
    for i in range(size):
        images.append([0] * rows * cols)
    image_data = array("B", binary_file.read())
    for i in range(size):
        images[i][:] = image_data[i * rows * cols:(i + 1) * rows * cols]
        # for some reason EMNIST is mirrored and rotated
        if emnistRotate:
            x = image_data[i * rows * cols:(i + 1) * rows * cols]
            subs = []
            for r in range(rows):
                subs.append(x[(rows - r) * cols - cols:(rows - r)*cols])
            l = list(zip(*reversed(subs)))
            fixed = [item for sublist in l for item in sublist]
            images[i][:] = fixed
print(images)
previous answer:
You can use the python-mnist library:
from mnist import MNIST
mndata = MNIST('./data')
images, labels = mndata.load_training()
I want to run through a large TIFF stack (1500+ frames) and extract the coordinates of the local maxima in each frame. The code below does the job, but it is extremely slow for large files. When running on a small subset (e.g. 20 frames), each frame is processed almost instantly; when running on the whole dataset, each frame takes seconds.
Is there a way to make the code faster? I suspect the loading of the large TIFF file, but surely that should only be necessary once, at the start?
I have the following code:
import numpy as np
import pandas as pd
from pims import ImageSequence
from skimage.feature import peak_local_max
def cmask(index,array):
    radius = 3
    a,b = index
    nx,ny = array.shape
    y,x = np.ogrid[-a:nx-a,-b:ny-b]
    mask = x*x + y*y <= radius*radius
    return(sum(array[mask])) # sum of the pixel values inside the mask
images = ImageSequence('tryhard_red_small.tif')
frame_list = []
x = []
y = []
int_liposome = []
BG_liposome = []
for i in range(len(images[0])):
    tmp_frame = images[0][i]
    xy = pd.DataFrame(peak_local_max(tmp_frame, min_distance=8,threshold_abs=3000))
    x.extend(xy[0].tolist())
    y.extend(xy[1].tolist())
    for j in range(len(xy)):
        index = x[j],y[j]
        int_liposome.append(cmask(index,tmp_frame))
    frame_list.extend([i]*len(xy))
    print("Frame: %d of %d" % (i, len(images[0])))
features = pd.DataFrame(
    {'lip_int':int_liposome,
     'y' : y,
     'x' : x,
     'frame' : frame_list})
Have you tried profiling the code, say with %prun or %lprun in ipython? That'll tell you exactly where your slowdowns are occurring.
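Outside IPython, the standard library's cProfile gives the same kind of breakdown. Here is a minimal sketch. `slow_sum` and `process` are made-up stand-ins for the per-frame work, not the code from the question.

```python
import cProfile
import io
import pstats

def slow_sum(values):
    """Hypothetical stand-in for the per-frame work."""
    total = 0
    for v in values:
        total += v * v
    return total

def process(n_frames=50):
    return [slow_sum(range(2000)) for _ in range(n_frames)]

profiler = cProfile.Profile()
profiler.enable()
result = process()
profiler.disable()

# Collect the five entries with the largest cumulative time into a string.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The function that dominates the cumulative-time column is the one worth optimizing first.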
I can't make my own version of this without the TIFF stack, but I suspect the problem is that you're using lists to store everything. As the lists grow, Python periodically has to reallocate them to make room for appends and extensions. You could try getting the total count of maxima first, then allocating your output arrays, then rerunning to fill the arrays. Something like the following:
# run through once to get the count of local maxima
npeaks = (len(peak_local_max(f, min_distance=8, threshold_abs=3000))
          for f in images[0])
total_peaks = sum(npeaks)
# allocate storage arrays and rerun
x = np.zeros(total_peaks, float)
y = np.zeros_like(x)
int_liposome = np.zeros_like(x)
BG_liposome = np.zeros_like(x)
frame_list = np.zeros(total_peaks, int)
index_0 = 0
for frame_ind, tmp_frame in enumerate(images[0]):
    peaks = pd.DataFrame(peak_local_max(tmp_frame, min_distance=8,threshold_abs=3000))
    index_1 = index_0 + len(peaks)
    # copy the data from the DataFrame's underlying numpy array
    x[index_0:index_1] = peaks[0].values
    y[index_0:index_1] = peaks[1].values
    # iterate over the rows (peak coordinates), not the column labels
    for i, peak in enumerate(peaks.values, index_0):
        int_liposome[i] = cmask(peak, tmp_frame)
    frame_list[index_0:index_1] = frame_ind
    # update the starting index
    index_0 = index_1
    print("Frame: %d of %d" % (frame_ind, len(images[0])))
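Separately from list growth, two details in the question's loop might be worth a look. cmask calls the built-in sum() on a numpy array, which iterates element by element in Python. And `x[j], y[j]` appears to index the accumulated lists from their start rather than the current frame's peaks. Below is a minimal sketch of the mask sum using ndarray.sum() instead. `disk_sum` is a hypothetical name, and the all-ones frame is only there to make the result easy to check.

```python
import numpy as np

def disk_sum(index, frame, radius=3):
    """Sum the pixels of `frame` within `radius` of `index` (row, col)."""
    a, b = index
    nx, ny = frame.shape
    # Open grids of row/column offsets relative to the centre pixel.
    y, x = np.ogrid[-a:nx - a, -b:ny - b]
    mask = x * x + y * y <= radius * radius
    return frame[mask].sum()  # ndarray.sum runs in C; built-in sum() loops in Python

frame = np.ones((20, 20))
# On an all-ones frame the sum equals the number of pixels in the disk.
print(disk_sum((10, 10), frame))
```

Whether this is the actual bottleneck is something only a profile of the real data can tell.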
Basically, I'd like to implement a binary symmetric channel (BSC). A photo is converted into bits, some of the bits are flipped, and a new image is created. The issue I've encountered is that I get the same image back. I've added some print() statements to check whether error() works, and it looks like it does. Here's my code:
import numpy as np
import random as rand
# Seed
rand.seed(4)
# Filenames
in_name = 'in_img.png'
out_name = 'out_img.png'
# Into bits
in_bytes = np.fromfile(in_name, dtype="uint8")
in_bits = np.unpackbits(in_bytes)
data = list(in_bits)
# BSC
def error(x):
    p = 0.1
    is_wrong = rand.random() < p
    if is_wrong:
        if x == 1:
            return 0
        else:
            return 1
    else:
        return x
for i in data:
    i = error(i)
# To PNG
out_bits = np.array(data)
out_bytes = np.packbits(out_bits)
out_bytes.tofile(out_name)
While the problem in your code seems to be a duplicate as kazemakase points out in a comment, your code should not use such a loop and a Python list in the first place. With numpy one usually tries to push as many loops as possible into the numpy data types.
import numpy as np
def main():
    np.random.seed(4)
    in_name = 'in_img.png'
    out_name = 'out_img.png'
    bits = np.unpackbits(np.fromfile(in_name, np.uint8))
    bits ^= np.random.random(bits.shape) < 0.1
    np.packbits(bits).tofile(out_name)
if __name__ == '__main__':
    main()
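To illustrate both points without touching a file, here is a small sketch on an in-memory bit array. `bsc` is a hypothetical helper, and np.random.default_rng is used instead of the global seed purely as a matter of taste. The second half shows why the loop in the question has no effect: assigning to the loop variable rebinds a local name and never writes back into the list.

```python
import numpy as np

def bsc(bits, p, rng):
    """Flip each bit independently with probability p (binary symmetric channel)."""
    flips = rng.random(bits.shape) < p
    return bits ^ flips.astype(bits.dtype)

rng = np.random.default_rng(4)
bits = np.unpackbits(np.arange(256, dtype=np.uint8).repeat(50))
noisy = bsc(bits, 0.1, rng)
print("fraction flipped:", np.mean(bits != noisy))

# Rebinding the loop variable, as in the question, leaves the list untouched:
data = [0, 1, 0]
for i in data:
    i = 1 - i
print(data)  # [0, 1, 0]
```

One caveat: flipping bits in the raw bytes of a PNG also hits its header and compressed data, so the output will most likely not open as an image. Decoding to pixels first and flipping bits there is probably closer to what is wanted.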
This is a face recognition program using PCA. Everything works except for an index error that comes up at the end of the program.
When I run the code, I get an IndexError at the fourth-to-last line of my program:
distances.append((dist, y[i]))
IndexError: list index out of range
Can anyone help with this? I am new to Python, so I am not yet good at debugging this kind of problem.
Here is my code:
from sklearn.decomposition import RandomizedPCA
import numpy as np
import glob
import cv2
import math
import os.path
import string
#function to get ID from filename
def ID_from_filename(filename):
    part = string.split(filename, '/')
    return part[1].replace("s", "")
#function to convert image to right format
def prepare_image(filename):
    img_color = cv2.imread(filename)
    img_gray = cv2.cvtColor(img_color, cv2.cv.CV_RGB2GRAY)
    img_gray = cv2.equalizeHist(img_gray)
    return img_gray.flat
IMG_RES = 92 * 112 # img resolution
NUM_EIGENFACES = 10 # images per train person
NUM_TRAINIMAGES = 110 # total images in training set
#loading training set from folder train_faces
folders = glob.glob('train_faces/*')
# Create an array with flattened images X
# and an array with ID of the people on each image y
X = np.zeros([NUM_TRAINIMAGES, IMG_RES], dtype='int8')
y = []
# Populate training array with flattened images from subfolders of train_faces and names
c = 0
for x, folder in enumerate(folders):
    train_faces = glob.glob(folder + '/*')
    for i, face in enumerate(train_faces):
        X[c,:] = prepare_image(face)
        y.append(ID_from_filename(face))
        c = c + 1
# perform principal component analysis on the images
pca = RandomizedPCA(n_components=NUM_EIGENFACES, whiten=True).fit(X)
X_pca = pca.transform(X)
# load test faces (usually one), located in folder test_faces
test_faces = glob.glob('test_faces/*')
# Create an array with flattened images X
X = np.zeros([len(test_faces), IMG_RES], dtype='int8')
# Populate test array with flattened images from subfolders of train_faces
for i, face in enumerate(test_faces):
    X[i,:] = prepare_image(face)
# run through test images (usually one)
for j, ref_pca in enumerate(pca.transform(X)):
    distances = []
    # Calculate Euclidean distance from test image to each of the known images and save distances
    for i, test_pca in enumerate(X_pca):
        dist = math.sqrt(sum([diff**2 for diff in (ref_pca - test_pca)]))
        distances.append((dist, y[i]))
    found_ID = min(distances)[1]
    print("Identified (result: " + str(found_ID) + " - dist - " + str(min(distances)[0]) + ")")
Your i in the loop below goes up to len(X_pca) - 1:
for i, test_pca in enumerate(X_pca):
    dist = math.sqrt(sum([diff**2 for diff in (ref_pca - test_pca)]))
    distances.append((dist, y[i]))
However, your y is not necessarily built to have that length:
for x, folder in enumerate(folders):
    train_faces = glob.glob(folder + '/*')
    for i, face in enumerate(train_faces):
        X[c,:] = prepare_image(face)
        y.append(ID_from_filename(face))
X (and therefore X_pca) always has NUM_TRAINIMAGES = 110 rows, whereas y only gets one entry per file actually found. If the folders contain fewer than 110 images, you end up using an index i that is beyond the end of your list y.
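One way to rule this out is to build the training matrix from the images actually loaded, so that it cannot have more rows than there are labels. The following is only a sketch under that assumption. The `samples` list is a made-up stand-in for the flattened images and the IDs parsed from the filenames, and IMG_RES is shrunk to keep it readable.

```python
import numpy as np

IMG_RES = 6  # tiny stand-in for 92 * 112

# Hypothetical stand-ins for prepare_image(face) and ID_from_filename(face).
# uint8 (rather than int8) keeps grayscale values above 127 from wrapping.
samples = [(np.full(IMG_RES, 10 * n, dtype=np.uint8), str(n)) for n in range(1, 4)]

rows, y = [], []
for pixels, person_id in samples:
    rows.append(pixels)
    y.append(person_id)

X = np.vstack(rows)  # exactly one row per label, whatever the file count
assert len(X) == len(y)

# Nearest neighbour by Euclidean distance, pairing rows and labels with zip
# so that no index can run past the end of y.
ref = np.full(IMG_RES, 19.0)
distances = [(np.linalg.norm(ref - row), label) for row, label in zip(X, y)]
print(min(distances)[1])
```

The same idea applies after the PCA transform: iterate over `zip(X_pca, y)` instead of indexing y by position.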
What would be the fastest and most memory-efficient way to get the average over many frames of a 16-bit TIFF image as a numpy array?
What I have come up with so far is the code below. To my surprise, method2 was faster than method1.
But when profiling, never assume: test it! So I want to test more.
Is Wand worth trying? I did not include it here because, after installing ImageMagick-6.8.9-Q16 and setting the MAGICK_HOME environment variable, it still does not import. Is there any other library for multipage TIFF in Python? GDAL may be a little too much for this.
(edit) I included libtiff. method2 is still the fastest and quite memory efficient.
from time import time
#import cv2 ## no multi page tiff support
import numpy as np
from PIL import Image
#from scipy.misc import imread ## no multi page tiff support
import tifffile # http://www.lfd.uci.edu/~gohlke/code/tifffile.py.html
from libtiff import TIFF # https://code.google.com/p/pylibtiff/
fp = r"path/2/1000frames-timelapse-image.tif"
def method1(fp):
    '''
    using tifffile.py by Christoph (Version: 2014.02.05)
    (http://www.lfd.uci.edu/~gohlke/code/tifffile.py.html)
    '''
    with tifffile.TIFFfile(fp) as imfile:
        return imfile.asarray().mean(axis=0)
def method2(fp):
    'primitive peak memory friendly way with tifffile.py'
    with tifffile.TIFFfile(fp) as imfile:
        nframe, h, w = imfile.series[0]['shape']
        temp = np.zeros( (h,w), dtype=np.float64 )
        for n in range(nframe):
            curframe = imfile.asarray(n)
            temp += curframe
    return (temp / nframe)
def method3(fp):
    ' like method2 but using pillow 2.3.0 '
    im = Image.open(fp)
    w, h = im.size
    temp = np.zeros( (h,w), dtype=np.float64 )
    n = 0
    while True:
        curframe = np.array(im.getdata()).reshape(h,w)
        temp += curframe
        n += 1
        try:
            im.seek(n)
        except EOFError:  # raised by seek() when there are no more frames
            break
    return (temp / n)
def method4(fp):
    '''
    https://code.google.com/p/pylibtiff/
    documentation seems outdated.
    '''
    tif = TIFF.open(fp)
    header = tif.info()
    meta = dict() # extracting meta
    for l in header.splitlines():
        if l:
            if l.find(':')>0:
                parts = l.split(':')
                key = parts[0]
                value = ':'.join(parts[1:])
            elif l.find('=')>0:
                key, value =l.split('=')
            meta[key] = value
    nframes = int(meta['frames'])
    h = int(meta['ImageLength'])
    w = int(meta['ImageWidth'])
    temp = np.zeros( (h,w), dtype=np.float64 )
    for frame in tif.iter_images():
        temp += frame
    return (temp / nframes)
t0 = time()
avgimg1 = method1(fp)
print(time() - t0)
# 1.17-1.33 s
t0 = time()
avgimg2 = method2(fp)
print(time() - t0)
# 0.90-1.53 s usually faster than method1 by 20%
t0 = time()
avgimg3 = method3(fp)
print(time() - t0)
# 21 s
t0 = time()
avgimg4 = method4(fp)
print(time() - t0)
# 1.96 - 2.21 s # may not be accurate. I got a warning for every frame with the TIFF file I tested.
np.testing.assert_allclose(avgimg1, avgimg2)
np.testing.assert_allclose(avgimg1, avgimg3)
np.testing.assert_allclose(avgimg1, avgimg4)
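Simple logic would make me bet my money on method 1 or 3, since methods 2 and 4 have for-loops in them, and Python for-loops add per-iteration overhead that grows with the input. That said, the loops here run once per frame and each iteration is a single vectorized numpy addition, which would explain why your timings show method 2 keeping up with method 1.
I would definitely go for method 1: neat, clear to read...
To be really sure, just test them, I would say. If you don't feel like testing, I would go for method 1.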
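If you want to test the two approaches without the TIFF libraries getting in the way, here is a sketch on a synthetic stack. The random array only stands in for the real file, and the function names are made up. It checks that the frame-by-frame accumulation gives the same mean as loading everything at once, while only holding one float64 accumulator and one frame at a time.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for a 16-bit stack of shape (frames, height, width).
stack = rng.integers(0, 2**16, size=(100, 32, 48), dtype=np.uint16)

def mean_all_at_once(frames):
    """Like method1: needs the whole stack in memory."""
    return frames.mean(axis=0)

def mean_streaming(frame_iter, shape):
    """Like method2: one float64 accumulator plus the current frame."""
    acc = np.zeros(shape, dtype=np.float64)
    n = 0
    for frame in frame_iter:
        acc += frame
        n += 1
    return acc / n

a = mean_all_at_once(stack)
b = mean_streaming(iter(stack), stack.shape[1:])
np.testing.assert_allclose(a, b)
```

Timing both with timeit on data of realistic size would show whether the per-frame loop costs anything noticeable on your machine.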