I need to convert an image to text, or to another image, that consists only of 0s and 1s. Is there a way to do this programmatically, preferably in Python?
Here is my attempt:
Step 1: Open the image:
from PIL import Image
srcImage = Image.open("src.jpg")
Step 2: Greyscale the image:
grayImage = srcImage.convert('L')
Step 3: Binarize the image:
binarizedImage = grayImage.point(lambda x: 0 if x<128 else 255, '1')
Now I am stuck. I want to convert black points to 1 and white points to 0, then save the result to a text file in which each pixel row becomes one line (in this example, 174 pixels of height give 174 lines) and each pixel becomes one character (310 pixels of width give 310 characters per line). Alternatively, I would like a larger image that shows a 0 in place of each white point and a 1 in place of each black point.
A solution to either case would be much appreciated.
Full binarization code (modified version of the PIL way of binarizing):
from PIL import Image
srcImage = Image.open("src.jpg")
grayImage = srcImage.convert('L')
binarizedImage = grayImage.point(lambda x: 0 if x<128 else 255, '1')
binarizedImage.save("binarized.png")
You can use the numpy library for this:
from PIL import Image
from scipy.ndimage import zoom
import numpy as np
srcImage = Image.open("src.jpg")
grayImage = srcImage.convert('L')
array = np.array(grayImage)
array = zoom(array, 310/174)
np.savetxt("binarized.txt", array<128, fmt="%d")
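Here np.array converts the PIL Image to a numpy array, zoom resamples the array by the given scale factor (interpolating between pixels), array < 128 creates a boolean array that is True for dark pixels, and fmt="%d" writes each value as an integer (1 or 0). Note that np.savetxt separates the values with spaces by default.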
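For the text-file case in the question (one line per pixel row, one character per pixel, dark pixels as 1 and light pixels as 0), a minimal sketch could look like the following. The helper name `image_to_bit_lines` and the threshold of 128 are assumptions for illustration, and the tiny in-memory image stands in for `src.jpg`.

```python
import numpy as np
from PIL import Image


def image_to_bit_lines(img, threshold=128):
    """Return one string of '0'/'1' per pixel row (dark pixel -> '1')."""
    gray = np.array(img.convert("L"))            # shape: (height, width)
    bits = (gray < threshold).astype(np.uint8)   # dark -> 1, light -> 0
    return ["".join(map(str, row)) for row in bits]


# Stand-in for Image.open("src.jpg"): a grayscale image 3 pixels wide, 2 high
demo = Image.fromarray(np.array([[0, 255, 0],
                                 [255, 255, 10]], dtype=np.uint8))
lines = image_to_bit_lines(demo)

# One text line per image row, one character per pixel, no separators
with open("binarized.txt", "w") as f:
    f.write("\n".join(lines) + "\n")
```

With a real photo, the number of lines equals the image height and the length of each line equals the image width, so no rescaling step is needed for this case.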
I have a set of very low-resolution pictures (in .png but I can easily convert them to something else). They all only have black or white pixels, like QR codes.
What I want is to be able to read them as binary matrix (a 1 for a black pixel and a zero for a white one).
I don't need anything more fancy than that, what should I use?
You can use PIL to read the image and then numpy to convert it to a matrix:
from PIL import Image
import numpy as np
im = Image.open("imageName.ext")
im_mat = np.asarray(im)
Alternatively you can do all in one step with opencv
import cv2
img = cv2.imread("imageName.ext")
In both cases you will have a matrix of shape HxWxC, where H is the height in pixels, W the width, and C the number of channels (3, or 4 if there is an alpha channel for transparency).
If your image is black and white and you only want a matrix of shape HxW, take one channel with
img = im_mat[:,:,0]  # 8-bit matrix
Finally, you can binarize it by applying a threshold or by comparing against a fixed value:
bin = img> 128
or
bin = img == 255
I corrected this last line; I had a typo in it.
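Putting those steps together, here is a sketch that assumes "black" means any value below 128 after conversion to 8-bit grayscale. The helper name `to_binary_matrix` is hypothetical, and the in-memory image stands in for a file on disk.

```python
import numpy as np
from PIL import Image


def to_binary_matrix(im, threshold=128):
    """Return an (H, W) uint8 matrix: 1 for black pixels, 0 for white ones."""
    # convert("L") collapses 1-bit, RGB or RGBA input to one 8-bit channel
    gray = np.asarray(im.convert("L"))
    return (gray < threshold).astype(np.uint8)


# Stand-in for Image.open("imageName.ext"): a 2x2 RGB checker pattern
rgb = np.array([[[0, 0, 0], [255, 255, 255]],
                [[255, 255, 255], [0, 0, 0]]], dtype=np.uint8)
matrix = to_binary_matrix(Image.fromarray(rgb))
print(matrix.shape)  # rows (height) first, then columns (width)
```

Converting to mode "L" first means the same function works regardless of how many channels the file has, so there is no need to pick a channel by hand.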
I have found some examples on this site and would like to create example 6. Can you help?
Create, as a numpy array, the image of the napkin. The squares have a size of 10×10. You may use the command numpy tile. Save the resulting image to a file.
In a standard grayscale image, black pixels are 0, gray pixels are 128, and white ones are 255:
import numpy as np
import matplotlib.pyplot as plt
# first create one 20 x 20 tile
a1 = np.zeros((20,20), dtype=int)
a1[10:20,0:10] = a1[0:10,10:20] = 128
a1[10:20,10:20] = 255
# fill the whole 100 x 100 area with the tiles
a = np.tile(a1, (5,5))
# plot and save
plt.imshow(a, 'Greys_r')
plt.savefig('pattern.png')
You could do this:
from PIL import Image
import numpy as np
# Make grey 2x2 image
TwoByTwo = np.full((2,2), 128, np.uint8)
# Change top-left to black, bottom-right to white
TwoByTwo[0,0] = 0
TwoByTwo[1,1] = 255
# Tile it
tiled = np.tile(TwoByTwo, (5,5))
# Make into PIL Image, rescale in size and save
Image.fromarray(tiled).resize((100,100), Image.NEAREST).save('result.png')
I'm trying to write simple code that loads an image, divides the value of each pixel by 2, and stores the image. The image is stored in an array [1280][720][3]. After changing the value of each pixel I've checked that the values are as expected. The values are correct in memory, but when I store the new image and read it back, the pixel values are not the same as before.
The image is 1280x720 pixels and each pixel has 3 bytes (one for each color rgb)
import matplotlib.image as mpimg
img = mpimg.imread('image.jpg') # (1280, 720, 3)
myImg = []
for row in img:
    myRow = []
    for pixel in row:
        myPixel = []
        for color in pixel:
            myPixel.append(color // 2)
        myRow.append(myPixel)
    myImg.append(myRow)
mpimg.imsave("foo.jpg", myImg)
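img is a numpy array, so you can just use img // 2. Integer division keeps the uint8 dtype, whereas img / 2 would produce floats, which imsave expects to lie in the 0-1 range. It's also much faster than looping over lists.
myImg = img // 2
mpimg.imsave("foo.jpg", myImg)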
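One likely reason the saved values differ from the computed ones is that JPEG is a lossy format, so pixel values generally change when the file is written. The sketch below compares a PNG and a JPEG round trip. It assumes a lossless format is acceptable, uses Pillow for saving rather than matplotlib, and uses a random array as a stand-in for the loaded image; the helper name `roundtrip` is hypothetical.

```python
import io

import numpy as np
from PIL import Image

# Stand-in for the loaded image: random 8-bit RGB data
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(72, 128, 3), dtype=np.uint8)

half = img // 2  # integer division keeps dtype uint8


def roundtrip(arr, fmt):
    """Save arr to an in-memory file in the given format and read it back."""
    buf = io.BytesIO()
    Image.fromarray(arr).save(buf, format=fmt)
    buf.seek(0)
    return np.array(Image.open(buf))


png_back = roundtrip(half, "PNG")
jpg_back = roundtrip(half, "JPEG")

print("PNG identical: ", np.array_equal(half, png_back))
print("JPEG identical:", np.array_equal(half, jpg_back))
```

If exact pixel values matter after saving, writing to PNG (or another lossless format) instead of JPEG is the simpler choice.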
Trying to convert image from RGB color space to YDbDr color space according to the formula:
Y = 0.299R + 0.587G + 0.114B
Db = -0.45R - 0.883G +1.333B
Dr = -1.333R + 1.116G + 0.217B
With the following code I'm trying to show only Y channel which should be grayscale image but I keep getting image all in blue color:
import numpy as np
from PIL import Image
import cv2
import matplotlib.pyplot as plt
img = cv2.imread("./pics/Slike_modela/Test/Proba/1_Color.png")
new_img = []
for row in img:
    new_row = []
    for pixel in row:
        Y = 0.299*pixel[2]+0.587*pixel[1]+0.114*pixel[0]
        Db = -0.45*pixel[2]-0.883*pixel[1]+1.333*pixel[0]
        Dr = -1.333*pixel[2]+1.116*pixel[1]+0.217*pixel[0]
        new_pixel = [Y, Db, Dr]
        new_row.append(new_pixel)
    new_img.append(new_row)
new_img_arr = np.array(new_img)
new_img_arr_y = new_img_arr.copy()
new_img_arr_y[:,:,1] = 0
new_img_arr_y[:,:,2] = 0
print (new_img_arr_y)
cv2.imshow("y image", new_img_arr_y)
key = cv2.waitKey(0)
When printing the result array I see correct numbers according to formula and correct shape of the array.
What is my mistake? How do I get the Y channel image, i.e. a grayscale image?
When processing images with Python, you really, really should try to avoid:
treating images as lists and appending millions and millions of pixels, each of which creates a whole new object and takes space to administer
processing images with for loops, which are very slow
The better way to deal with both of these is through using Numpy or other vectorised code libraries or techniques. That is why OpenCV, wand, scikit-image open and handle images as Numpy arrays.
So, you basically want to do a dot product of the colour channels with a set of 3 weights:
import cv2
import numpy as np
# Load image
im = cv2.imread('paddington.png', cv2.IMREAD_COLOR)
# Calculate Y using Numpy "dot()"
Y = np.dot(im[...,:3], [0.114, 0.587, 0.299]).astype(np.uint8)
That's it.
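As for why the image in the question looked blue: a three-channel array is displayed by OpenCV as B, G, R, so zeroing channels 1 and 2 leaves only the blue channel populated. A two-dimensional array, by contrast, is shown as grayscale. The sketch below extends the dot-product idea to all three YDbDr channels using the coefficients from the question. The function name `bgr_to_ydbdr` and the two-pixel test image are assumptions for illustration.

```python
import numpy as np

# Rows give Y, Db, Dr as weights of (R, G, B), taken from the question's formulas
RGB_TO_YDBDR = np.array([
    [ 0.299,  0.587, 0.114],
    [-0.450, -0.883, 1.333],
    [-1.333,  1.116, 0.217],
])


def bgr_to_ydbdr(im_bgr):
    """Convert an (H, W, 3) BGR image to float YDbDr channels."""
    rgb = im_bgr[..., ::-1].astype(np.float64)  # OpenCV stores channels as B, G, R
    return rgb @ RGB_TO_YDBDR.T


# Stand-in for cv2.imread(...): one white and one pure-blue pixel, in BGR order
im = np.array([[[255, 255, 255], [255, 0, 0]]], dtype=np.uint8)
ydbdr = bgr_to_ydbdr(im)

# A single-channel uint8 array is what displays as a grayscale image
y_gray = np.clip(np.rint(ydbdr[..., 0]), 0, 255).astype(np.uint8)
```

Passing `y_gray` (shape H x W) to `cv2.imshow` should then give a grayscale picture; Db and Dr can be negative, so they need their own scaling before display.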
I am trying to use a dicom image and manipulate it using OpenCV in a Python environment. So far I have used the pydicom library to read the dicom(.dcm) image data and using the pixel array attribute to display the picture using OpenCV imshow method. But the output is just a blank window. Here is the snippet of code I am using at this moment.
import numpy as np
import cv2
import pydicom as dicom
ds=dicom.dcmread('sample.dcm')
cv2.imshow('sample image dicom',ds.pixel_array)
cv2.waitKey(0)
If I print out the array that is used here, the output is different from what I would get with a normal numpy array. I have tried matplotlib's imshow method as well, and it was able to display the image with some colour distortions. Is there a way to convert the array into a legible format for OpenCV?
I faced a similar issue and used exposure.equalize_adapthist() (source). The resulting image isn't a hundred percent identical to what you would see in a DICOM viewer, but it's the best I was able to get.
import numpy as np
import cv2
import pydicom as dicom
from skimage import exposure
ds=dicom.dcmread('sample.dcm')
dcm_sample=ds.pixel_array
dcm_sample=exposure.equalize_adapthist(dcm_sample)
cv2.imshow('sample image dicom',dcm_sample)
cv2.waitKey(0)
I have figured out a way to get the image to show. As Dan mentioned in the comments, the values in the matrix were scaled down, so the output of imshow was too dark for the human eye to differentiate. In the end, the only thing I needed to do was multiply the entire matrix by 128. The image is showing perfectly now. Multiplying the matrix by 255 overexposes the picture and causes certain features to blow out. Here is the revised code.
import numpy as np
import cv2
import pydicom as dicom
ds=dicom.dcmread('sample.dcm')
dcm_sample=ds.pixel_array*128
cv2.imshow('sample image dicom',dcm_sample)
cv2.waitKey(0)
I don't think that is a correct answer. It works for that particular image because most of your pixel values are in the lower range. Check this OpenCV: How to visualize a depth image. It is for c++ but easily adapted to Python.
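A more general alternative to a fixed factor is to stretch each image's own minimum-to-maximum range onto 0..255 before display. The sketch below does this with plain numpy; the helper name `normalize_to_uint8` and the small test array are assumptions for illustration, and `cv2.normalize` with `cv2.NORM_MINMAX` is a comparable built-in route.

```python
import numpy as np


def normalize_to_uint8(arr):
    """Linearly map arr's own min..max range onto 0..255."""
    arr = arr.astype(np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:                       # constant image: avoid division by zero
        return np.zeros(arr.shape, dtype=np.uint8)
    scaled = (arr - lo) / (hi - lo)    # now in 0..1
    return np.rint(scaled * 255).astype(np.uint8)


# Stand-in for ds.pixel_array: 16-bit data that only uses a narrow, low range
pixels = np.array([[100, 150], [200, 300]], dtype=np.uint16)
display = normalize_to_uint8(pixels)
```

Because the scaling adapts to the data, it does not depend on most pixel values happening to sit in the lower range.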
This is the best way (in my opinion) to open the image in OpenCV as a numpy array while preserving the image quality:
import numpy as np
import pydicom, os, cv2
def dicom_to_numpy(ds):
    DCM_Img = ds
    rows = DCM_Img.get(0x00280010).value #Get number of rows from tag (0028, 0010)
    cols = DCM_Img.get(0x00280011).value #Get number of cols from tag (0028, 0011)
    Instance_Number = int(DCM_Img.get(0x00200013).value) #Get actual slice instance number from tag (0020, 0013)
    Window_Center = int(DCM_Img.get(0x00281050).value) #Get window center from tag (0028, 1050)
    Window_Width = int(DCM_Img.get(0x00281051).value) #Get window width from tag (0028, 1051)
    Window_Max = int(Window_Center + Window_Width / 2)
    Window_Min = int(Window_Center - Window_Width / 2)
    if (DCM_Img.get(0x00281052) is None):
        Rescale_Intercept = 0
    else:
        Rescale_Intercept = int(DCM_Img.get(0x00281052).value)
    if (DCM_Img.get(0x00281053) is None):
        Rescale_Slope = 1
    else:
        Rescale_Slope = int(DCM_Img.get(0x00281053).value)
    New_Img = np.zeros((rows, cols), np.uint8)
    Pixels = DCM_Img.pixel_array
    for i in range(0, rows):
        for j in range(0, cols):
            Pix_Val = Pixels[i][j]
            Rescale_Pix_Val = Pix_Val * Rescale_Slope + Rescale_Intercept
            if (Rescale_Pix_Val > Window_Max): #if intensity is greater than max window
                New_Img[i][j] = 255
            elif (Rescale_Pix_Val < Window_Min): #if intensity is less than min window
                New_Img[i][j] = 0
            else:
                New_Img[i][j] = int(((Rescale_Pix_Val - Window_Min) / (Window_Max - Window_Min)) * 255) #Normalize the intensities
    return New_Img
file_path = "C:/example.dcm"
image = pydicom.dcmread(file_path)
image = dicom_to_numpy(image)
#show image
cv2.imshow('sample image dicom',image)
cv2.waitKey(0)
cv2.destroyAllWindows()
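The per-pixel loop above can be slow on large slices. Below is a vectorized sketch of the same window/level idea using numpy only. The function name `apply_window` and its parameters are hypothetical, the window values would come from the DICOM tags read above, and it keeps fractional window bounds rather than truncating them to integers, so results can differ slightly from the loop version.

```python
import numpy as np


def apply_window(pixels, center, width, slope=1.0, intercept=0.0):
    """Map rescaled pixel values inside the window to 0..255; clip the rest."""
    lo = center - width / 2
    hi = center + width / 2
    values = pixels.astype(np.float64) * slope + intercept
    scaled = (values - lo) / (hi - lo)           # 0..1 inside the window
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)


# Stand-in for DCM_Img.pixel_array with an assumed window of 50..150
pixels = np.array([[0, 50, 100, 150, 200]], dtype=np.int16)
windowed = apply_window(pixels, center=100, width=100)
```

Values below the window become 0, values above it become 255, and values in between are scaled linearly, which is the same mapping the three branches of the loop implement.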