I am trying to find the histogram values of an image using my own function, but when I run my code it prints the histogram values like this:
[1.000e+00 4.000e+00 1.000e+00 8.000e+00 8.000e+00 2.500e+01 2.100e+01
 4.500e+01 5.500e+01 8.800e+01 1.110e+02 1.220e+02 1.280e+02 1.370e+02
Is this normal, or is there another way to display the histogram values in an understandable form? Here is my function:
import numpy as np
import cv2

def histogram(img):
    height = img.shape[0]
    width = img.shape[1]
    hist = np.zeros((256))
    for i in np.arange(height):
        for j in np.arange(width):
            a = img.item(i, j)
            hist[a] += 1
    print(hist)

img = cv2.imread('rose.jpg', cv2.IMREAD_GRAYSCALE)
histogram(img)
Where you initialize your histogram, set its type to np.uint32 or similar since you can only ever have a whole, non-negative number of pixels of a given colour:
hist = np.zeros(256, dtype=np.uint32)
Check the type of your current array and find it is float64 with:
print(hist.dtype)
You can set suppress to True using np.set_printoptions; see https://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html
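For example, applied globally:
import numpy as np

np.set_printoptions(suppress=True)
print(hist)  # now prints plain decimals instead of scientific notation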
Alternatively you can print like this:
with np.printoptions(suppress=True):
    print(hist)
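As a side note, NumPy can also build the whole 256-bin histogram in one call, avoiding the Python-level loops entirely (a sketch, assuming an 8-bit grayscale image like the one above):
hist = np.bincount(img.ravel(), minlength=256)  # integer counts, one bin per intensity value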
I'm new to Python and matplotlib. I have implemented the k-means algorithm in order to compress an image into clusters and then plot the changed image.
My question is: I was not able to plot the new image without using the old one as a base. I tried a few things but could not quite get the result I want, and it seems like bad practice to pass the old image as an argument when I definitely don't use its values.
Can someone please help?
I tried to create a new ndarray but it did not work.
Here is my function:
def changePic(newPixelList, oldPixel, image_size):
    index = 0
    for pixel in newPixelList:
        oldPixel[index] = pixel.classification
        index += 1
    l = oldPixel.reshape(image_size)
    plt.imshow(l)
    plt.grid(False)
    plt.show()
As you can see, I don't really use the oldPixel values, just its structure. Here is my loadPic method, where X.copy() is passed as the oldPixel argument:
def loadPic():
    """
    Load pic to array
    :return: copy of original X, new list of pixels, image size
    """
    # data preparation (loading, normalizing, reshaping)
    path = 'dog.jpeg'
    A = imread(path)
    A = A.astype(float) / 255.
    img_size = A.shape
    X = A.reshape(img_size[0] * img_size[1], img_size[2])
    listOfPixel = []
    for pixel in X:
        listOfPixel.append(Pixel(pixel))
    return X.copy(), listOfPixel, img_size
Try this:
def changePic(newPixelList, oldPixel, image_size, picture_num):
    index = 0
    for pixel in newPixelList:
        oldPixel[index] = pixel.classification
        index += 1
    l = oldPixel.reshape(image_size)
    plt.figure(picture_num)
    plt.imshow(l)
    plt.grid(False)
    plt.show()
Every plot that you generate should have a different picture_num in order to have separate plots.
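If you also want to stop passing oldPixel altogether, a minimal sketch is to build a fresh array from the classifications (changePicNoOld is just an illustrative name, and this assumes each pixel.classification has the same shape as one entry of the original pixel array):
import numpy as np
import matplotlib.pyplot as plt

def changePicNoOld(newPixelList, image_size, picture_num):
    # Build a new flat array from the classifications and reshape it;
    # the old pixel array is no longer needed.
    new_pixels = np.array([pixel.classification for pixel in newPixelList])
    l = new_pixels.reshape(image_size)
    plt.figure(picture_num)
    plt.imshow(l)
    plt.grid(False)
    plt.show()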
I've constructed an image from some FITS files, and I want to save the resultant masked image as another FITS file. Here's my code:
import numpy as np
from astropy.io import fits
import matplotlib.pyplot as plt
#from astropy.nddata import CCDData
from ccdproc import CCDData
hdulist1 = fits.open('wise_neowise_w1-MJpersr.fits')
hdulist2 = fits.open('wise_neowise_w2-MJpersr.fits')
data1_raw = hdulist1[0].data
data2_raw = hdulist2[0].data
# Hide negative values in order to take logs
# Where {condition}==True, return data_raw, else return np.nan
data1 = np.where(data1_raw >= 0, data1_raw, np.nan)
data2 = np.where(data2_raw >= 0, data2_raw, np.nan)
# Calculation and image subtraction
w1mag = -2.5 * (np.log10(data1) - 9.0)
w2mag = -2.5 * (np.log10(data2) - 9.0)
color = w1mag - w2mag
## Find upper and lower 5th %ile of pixels
mask_percent = 5
masked_value_lower = np.nanpercentile(color, mask_percent)
masked_value_upper = np.nanpercentile(color, (100 - mask_percent))
## Mask out the upper and lower 5% of pixels
## Need to hide values outside the range [lower, upper]
color_masked = np.ma.masked_outside(color, masked_value_lower, masked_value_upper)
color_masked = np.ma.masked_invalid(color_masked)
plt.imshow(color)
plt.title('color')
plt.savefig('color.png')
plt.imshow(color_masked)
plt.title('color_masked')
plt.savefig('color_masked.png')
fits.writeto('color.fits', color, overwrite=True)
ccd = CCDData(color_masked, unit = 'adu')
ccd.write('color_masked.fits', overwrite=True)
hdulist1.close()
hdulist2.close()
When I use matplotlib.pyplot to imshow the images color and color_masked, they look as I expect. However, my two output files are identical: color_masked.fits == color.fits. I think I'm somehow not quite understanding the masking process properly. Can anyone see where I've gone wrong?
astropy.io.fits only handles normal arrays and that means it just ignores/discards the mask of your MaskedArray.
Depending on your use-case you have different options:
Saving the file so other FITS programs recognize the mask
I actually don't think that's possible. But some programs like DS9 can handle NaNs, so you could just set the masked values to NaN for the purpose of displaying them:
data_naned = np.where(color_masked.mask, np.nan, color_masked)
fits.writeto(filename, data_naned, overwrite=True)
They do still show up as "bright white spots" but they don't affect the color-scale.
If you want to take this a step further you could replace the masked pixels using a convolution filter before writing them to a file. Not sure if there's one in astropy that only replaces masked pixels though.
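Since the masked values are NaN after the step above, one candidate is astropy.convolution (a sketch reusing data_naned from above; interpolate_replace_nans fills only the NaN pixels, and 'color_filled.fits' is just an illustrative filename):
from astropy.io import fits
from astropy.convolution import Gaussian2DKernel, interpolate_replace_nans

kernel = Gaussian2DKernel(x_stddev=1)  # smoothing kernel used for the fill
data_filled = interpolate_replace_nans(data_naned, kernel)
fits.writeto('color_filled.fits', data_filled, overwrite=True)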
Saving the mask as an extension so you can read it back
You could use astropy.nddata.CCDData (available since astropy 2.0) to save it as a FITS file together with its mask:
from astropy.nddata import CCDData
ccd = CCDData(color_masked, unit='adu')
ccd.write('color_masked.fits', overwrite=True)
Then the mask will be saved in an extension called 'MASK' and it can be read using CCDData as well:
ccd2 = CCDData.read('color_masked.fits')
The CCDData behaves like a masked array in normal NumPy operations but you could also convert it to a masked-array by hand:
import numpy as np
marr = np.asanyarray(ccd2)
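A quick round-trip check (a sketch reusing marr from above) confirms the mask survived:
print(marr.mask.sum())  # number of masked pixels
print(marr.mean())      # mean computed over the unmasked pixels only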
I am confused about how matplotlib handles fp32 pixel intensities. To my understanding, it rescales the values between the max and min values of the image. However, when I try to view images originally in [0, 1] after rescaling their pixel intensities to [-1, 1] (by im*2-1) using imshow(), the image appears differently colored. How do I rescale so that the images don't differ?
EDIT: See the attached image.
PS: I need to do this as part of a program that outputs values in [-1, 1].
Following is the code used for this:
import numpy as np
import matplotlib.pyplot as plt
from scipy import misc

img = np.float32(misc.face(gray=False))
fig, ax = plt.subplots(1, 2)
img = img / 255  # convert to [0, 1] range
print(np.max(img), np.min(img))
img0 = ax[0].imshow(img)
plt.colorbar(img0, ax=ax[0])
print(np.max(2*img - 1), np.min(2*img - 1))
img1 = ax[1].imshow(2*img - 1)  # convert to [-1, 1] range
plt.colorbar(img1, ax=ax[1])
plt.show()
The max, min output is:
(1.0, 0.0)
(1.0, -1.0)
You are probably using matplotlib incorrectly here.
The normalization step should work correctly if it's active. The docs tell us that it is only active by default if the input image is of type float!
Code
import numpy as np
import matplotlib.pyplot as plt
from scipy import misc
fig, ax = plt.subplots(2,2)
# This usage shows different colors because there is no normalization
# FIRST ROW
f = misc.face(gray=True)
print(f.dtype)
g = f*2 # just some operation to show the difference between usages
ax[0,0].imshow(f)
ax[0,1].imshow(g)
# This usage makes sure that the input-image is of type float
# -> automatic normalization is used!
# SECOND ROW
f = np.asarray(misc.face(gray=True), dtype=float) # TYPE!
print(f.dtype)
g = f*2 # just some operation to show the difference between usages
ax[1,0].imshow(f)
ax[1,1].imshow(g)
plt.show()
Output
uint8
float64
Analysis
The first row shows the wrong usage, because the input is of type int and therefore no normalization will be used.
The second row shows the correct usage!
EDIT: sascha has correctly pointed out in the comments that rescaling is not applied to RGB images; their inputs must be ensured to be in the [0, 1] range.
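So for the original [-1, 1] RGB use case, a minimal sketch of the fix is to map the data back into [0, 1] just before displaying it:
import numpy as np
import matplotlib.pyplot as plt
from scipy import misc

img = np.float32(misc.face(gray=False)) / 255  # RGB floats in [0, 1]
img_pm1 = 2 * img - 1                          # program output in [-1, 1]

fig, ax = plt.subplots(1, 2)
ax[0].imshow(img)                # reference image
ax[1].imshow((img_pm1 + 1) / 2)  # mapped back to [0, 1]; renders identically
plt.show()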
I have a problem in which I have a bunch of images for which I have to generate histograms, but I have to generate a histogram for each pixel. That is, for a collection of n images, I have to count the values that pixel (0, 0) assumed and generate a histogram, and do the same for (0, 1), (0, 2), and so on. I coded the following method to do this:
class ImageData:
    def generate_pixel_histogram(self, images, bins):
        """
        Generate a histogram of the image for each pixel, counting
        the values assumed by each pixel in a specified number of bins
        """
        max_value = 0.0
        min_value = 0.0
        for i in range(len(images)):
            image = images[i]
            max_entry = max(max(p[1:]) for p in image.data)
            min_entry = min(min(p[1:]) for p in image.data)
            if max_entry > max_value:
                max_value = max_entry
            if min_entry < min_value:
                min_value = min_entry
        interval_size = (math.fabs(min_value) + math.fabs(max_value)) / bins
        for x in range(self.width):
            for y in range(self.height):
                pixel_histogram = {}
                for i in range(bins + 1):
                    key = round(min_value + (i * interval_size), 2)
                    pixel_histogram[key] = 0.0
                for i in range(len(images)):
                    image = images[i]
                    value = round(Utils.get_bin(image.data[x][y], interval_size), 2)
                    pixel_histogram[value] += 1.0 / len(images)
                self.data[x][y] = pixel_histogram
Each position of the matrix stores a dictionary representing a histogram. But since I do this for each pixel, and this computation takes considerable time, it seems to me to be a good problem to parallelize. However, I don't have experience with this and I don't know how to do it.
EDIT:
I tried what @Eelco Hoogendoorn suggested and it works perfectly. But applying it to my code, where the data are a large number of images generated with this constructor (after the values are calculated and are not just 0 anymore), I just get as h an array of zeros [0 0 0]. What I pass to the histogram method is an array of ImageData.
class ImageData(object):
    def __init__(self, width=5, height=5, range_min=-1, range_max=1):
        """
        The ImageData constructor
        """
        self.width = width
        self.height = height
        # The value range each pixel can assume
        self.range_min = range_min
        self.range_max = range_max
        self.data = np.arange(width * height).reshape(height, width)
# Another class, just the method here
def generate_pixel_histogram(realizations, bins):
    """
    Generate a histogram of the image for each pixel, counting
    the values assumed by each pixel in a specified number of bins
    """
    data = np.array([image.data for image in realizations])
    min_max_range = data.min(), data.max() + 1
    bin_boundaries = np.empty(bins + 1)

    # Function to wrap np.histogram, passing on only the first return value
    def hist(pixel):
        h, b = np.histogram(pixel, bins=bins, range=min_max_range)
        bin_boundaries[:] = b
        return h

    # Apply this for each pixel
    hist_data = np.apply_along_axis(hist, 0, data)
    print hist_data
    print bin_boundaries
Now I get:
hist_data = np.apply_along_axis(hist, 0, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/shape_base.py", line 104, in apply_along_axis
outshape[axis] = len(res)
TypeError: object of type 'NoneType' has no len()
Any help would be appreciated.
Thanks in advance.
As noted by john, the most obvious solution is to look for library functionality that will do this for you. It exists, and it will be orders of magnitude more efficient than what you are doing here.
Standard numpy has a histogram function that can be used for this purpose. If you have only a few values per pixel, it will be relatively inefficient, and it creates a dense histogram vector rather than the sparse one you produce here. Still, chances are good that the code below solves your problem efficiently.
import numpy as np

# Some example data; 128 images of 4x4 pixels
voxeldata = np.random.randint(0, 100, (128, 4, 4))

# We need to apply the same binning range to each pixel to get sensible output
globalminmax = voxeldata.min(), voxeldata.max() + 1

# Number of output bins
bins = 20
bin_boundaries = np.empty(bins + 1)

# Function to wrap np.histogram, passing on only the first return value
def hist(pixel):
    h, b = np.histogram(pixel, bins=bins, range=globalminmax)
    bin_boundaries[:] = b  # simply overwrite; result should be identical each time
    return h

# Apply this for each pixel
histdata = np.apply_along_axis(hist, 0, voxeldata)
print bin_boundaries
print histdata[:, 0, 0]  # print the histogram of an arbitrary pixel
But the more general message I'd like to convey, looking at your code sample and the type of problem you are working on, is: do yourself a favor and learn numpy.
Parallelization certainly would not be my first port of call in optimizing this kind of thing. Your main problem is that you're doing lots of looping at the Python level. Python is inherently slow at this kind of thing.
One option would be to learn how to write Cython extensions and write the histogram bit in Cython. This might take you a while.
Actually, taking a histogram of pixel values is a very common task in computer vision and it has already been efficiently implemented in OpenCV (which has python wrappers). There are also several functions for taking histograms in the numpy python package (though they are slower than the OpenCV implementations).
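For instance, a whole-image grayscale histogram in OpenCV is a single call (a sketch; the filename is a placeholder, and note this is one histogram per image rather than the per-pixel histogram across a stack computed above):
import cv2

img = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE)  # placeholder path
# calcHist(images, channels, mask, histSize, ranges)
hist = cv2.calcHist([img], [0], None, [256], [0, 256])
print(hist.ravel())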
I am using the PIL package in Python and I want to import the pixels into a matrix after I convert the image to grayscale. This is my code:
from PIL import Image
import numpy as np

imo = Image.open("/home/gauss/Pictures/images.jpg")
imo2 = imo.convert('L')
dim = imo2.size
pic_mat = np.zeros(shape=(dim[0], dim[1]))
for i in range(dim[0]):
    for j in range(dim[1]):
        pic_mat[i][j] = imo2.getpixel((i, j))
My question is about the size attribute. It returns a tuple (a, b) where a is the width of the picture and b is the height, but doesn't that mean that a is the number of columns in a matrix and b is the number of rows? I am wondering about this to check whether I set up my matrix properly.
Thank you
Try just doing:
pic_mat = np.array(imo.convert('L'))
You can also avoid doing things like shape=(dim[0], dim[1]) by slicing the size tuple, like this: shape=dim[:2] (the :2 is even redundant in this case, but I like to be careful...)
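To answer the size question directly: PIL reports (width, height), while the NumPy array is indexed (row, column), so its shape comes out as (height, width). A quick sketch to confirm the swap:
from PIL import Image
import numpy as np

imo = Image.open("/home/gauss/Pictures/images.jpg")
pic_mat = np.array(imo.convert('L'))

print(imo.size)       # (width, height)
print(pic_mat.shape)  # (height, width) -- axes swapped relative to size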