Pandas and Python image to numpy array [closed] - python

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I'm currently teaching myself pandas and Python for machine learning. I've done fine with text data so far, but handling image data with my limited knowledge of Python and pandas is tripping me up.
I have read a .csv file into a pandas DataFrame, and one of its columns contains the URL of an image. This is what I get when I call info() on the DataFrame:
dataframe = pandas.read_csv("./sample.csv")
dataframe.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total of 5 columns):
name 5000 non-null object
...
image 5000 non-null object
The image column contains the URL of each image. The problem is that I do not know how to load the image data from these URLs and store it as a numpy array for processing.
Any help is appreciated. Thanks in advance!

If you want to download the images listed in your dataframe from the web, then (for example) rotate them and save the results, you can use the following code:
import io
import urllib.request  # Python 3 replacement for urllib2

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from PIL import Image

df = pd.DataFrame({
    "name": ["Butterfly", "Birds"],
    "image": ["https://upload.wikimedia.org/wikipedia/commons/0/0c/Two-tailed_pasha_%28Charaxes_jasius_jasius%29_Greece.jpg",
              'https://upload.wikimedia.org/wikipedia/commons/c/c5/Bat_cave_in_El_Maviri_Sinaloa_-_Mexico.jpg']})

def rotate_image(image, theta):
    """
    3D rotation matrix around the X-axis by angle theta,
    applied along the last (colour-channel) axis of the image array
    """
    rotation_matrix = np.c_[
        [1,0,0],
        [0,np.cos(theta),-np.sin(theta)],
        [0,np.sin(theta),np.cos(theta)]
    ]
    return np.einsum("ijk,lk->ijl", image, rotation_matrix)

for i, imageUrl in enumerate(df.image):
    print(imageUrl)
    fd = urllib.request.urlopen(imageUrl)
    image_file = io.BytesIO(fd.read())  # keep the download in memory
    im = Image.open(image_file)
    # convert the PIL image to a numpy array explicitly before the array maths
    im_rotated = rotate_image(np.asarray(im), np.pi)
    fig = plt.figure()
    plt.imshow(im_rotated)
    plt.axis('off')
    fig.savefig(df.name.iloc[i] + ".jpg")  # .ix has been removed from pandas
If instead you want to show the pictures you can do:
plt.show()
The resulting pictures show the birds and the butterfly.
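Note that rotate_image above applies its matrix to the three colour values of each pixel, so it changes colours rather than turning the picture. If a geometric rotation is what is wanted, a minimal sketch with np.rot90 could look like the following. The small array is only a stand-in for a decoded image and is not data from the answer.

```python
import numpy as np

# Stand-in for a decoded image of shape (height, width, channels).
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)

# Rotate by 180 degrees in the (row, column) plane; the channel axis is untouched.
rotated = np.rot90(image, k=2, axes=(0, 1))
```

With k=1 or k=3 the height and width swap, so the output shape changes accordingly.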

As we don't know your csv file, you will have to tune pd.read_csv() for your case.
Here I'm using requests to download some images in memory.
These are then decoded with the help of scipy (which you should already have; if not, you can use Pillow too).
The decoded images are raw numpy arrays and are shown with matplotlib.
Keep in mind that we are not using temporary files here; everything is held in memory. See also this (answer by jfs).
For people missing some of the required libraries, the same should be possible with these substitutions (the code needs to be changed accordingly, of course):
requests can be replaced with urllib (standard library)
  I'm not showing code, but this SO question should be a good start
  another relevant SO question discusses in-memory processing with urllib
pandas can be replaced with csv (standard library)
scipy can be replaced with Pillow (although the internal storage might differ then)
matplotlib is just for demo purposes (not sure if Pillow allows showing images; edit: it seems it can)
I just selected some random images from a German news site.
Edit: Free images from Wikipedia are now used!
Code:
import requests # downloading images
import pandas as pd # csv- / data-input
# NOTE: scipy.misc.imread was removed in SciPy 1.2; on newer versions
# decode with Pillow or imageio instead
from scipy.misc import imread # image-decoding -> numpy-array
import matplotlib.pyplot as plt # only for demo / plotting

# Fake data -> pandas DataFrame
urls_df = pd.DataFrame({'urls': ['https://upload.wikimedia.org/wikipedia/commons/thumb/c/cb/Rescue_exercise_RCA_2012.jpg/500px-Rescue_exercise_RCA_2012.jpg',
                                 'https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/Clinotarsus_curtipes-Aralam-2016-10-29-001.jpg/300px-Clinotarsus_curtipes-Aralam-2016-10-29-001.jpg',
                                 'https://upload.wikimedia.org/wikipedia/commons/thumb/9/9f/US_Capitol_east_side.JPG/300px-US_Capitol_east_side.JPG']})

# Download & Decode
imgs = []
for i in urls_df.urls: # iterate over column / pandas Series
    r = requests.get(i, stream=True) # See link for stream=True!
    r.raw.decode_content = True # Content-Encoding
    imgs.append(imread(r.raw)) # Decoding to numpy-array

# imgs: list of numpy arrays with varying shapes of form (x, y, 3)
# as we got 3-color channels
# Beware!: downloading png's might result in a shape of (x, y, 4)
# as some alpha-channel might be available
# For more options: https://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.imread.html

# Plot
f, arr = plt.subplots(len(imgs))
for i in range(len(imgs)):
    arr[i].imshow(imgs[i])
plt.show()
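The substitutions mentioned above (urllib instead of requests, Pillow instead of scipy) can be sketched roughly as follows. The helper names bytes_to_array and url_to_array are made up for this illustration, and converting everything to RGB is an assumption made here so that all arrays end up with three channels.

```python
import io
import urllib.request

import numpy as np
from PIL import Image


def bytes_to_array(data):
    """Decode encoded image bytes (JPEG, PNG, ...) into a numpy array."""
    with Image.open(io.BytesIO(data)) as im:
        # convert("RGB") drops any alpha channel, so the result is (height, width, 3)
        return np.asarray(im.convert("RGB"))


def url_to_array(url, timeout=10):
    """Download one image and decode it in memory, without a temporary file."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return bytes_to_array(response.read())
```

Applied to the question's dataframe, something like dataframe["image"].map(url_to_array) would then give one array per row. Some servers reject requests that lack a User-Agent header, in which case a urllib.request.Request object with explicit headers may be needed.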
Output:

Related

How do I write a function that takes in two different image arrays, plots the "difference image," and returns the difference image array?

So I have these two images, which I loaded from FITS files with the code below. Now I want to write a function that takes both image arrays, subtracts one from the other, and plots the difference as a new picture. Can anyone help me out? I'm really stuck; I've been working on it for four hours to no avail.
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
Image_file = fits.open('https://raw.githubusercontent.com/msu-cmse-courses/cmse202-S21-student/master/data/m42_40min_ir.fits')
fourty_min_ir = Image_file[0].data
type(fourty_min_ir)
Image_file = fits.open('https://raw.githubusercontent.com/msu-cmse-courses/cmse202-S21-student/master/data/m42_40min_red.fits')
fourty_min_red = Image_file[0].data
type(fourty_min_red)
results_img=fourty_min_ir-fourty_min_red
plt.imshow(results_img)
plt.show()
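One way to wrap the subtraction in a reusable function is sketched below. The name difference_image and the plot flag are illustrative choices, not part of the original post. The arrays are cast to float first because subtracting unsigned-integer images would otherwise wrap around instead of producing negative values.

```python
import numpy as np


def difference_image(first, second, plot=False):
    """Return first - second as a float array, optionally plotting it."""
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    if first.shape != second.shape:
        raise ValueError(f"shape mismatch: {first.shape} vs {second.shape}")
    diff = first - second
    if plot:
        # imported lazily so the function also works where matplotlib is absent
        import matplotlib.pyplot as plt
        plt.imshow(diff, cmap="gray")
        plt.colorbar()
        plt.show()
    return diff
```

With the arrays from the question, the call would be difference_image(fourty_min_ir, fourty_min_red, plot=True).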

sklearn.preprocessing.StandardScaler ValueError: Expected 2D array, got 1D array instead [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 2 years ago.
I'm trying to work through a tutorial at http://www.semspirit.com/artificial-intelligence/machine-learning/regression/support-vector-regression/support-vector-regression-in-python/
but there's no csv file included, so I'm using my own data. Here's the code so far:
import numpy as np
import pandas as pd
from matplotlib import cm
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from scipy import stats
# Here's where I import my data; there's no csv file included in the tutorial
import quasar_functions as qf
dataset, datasetname, mags = qf.loaddata('sdss12')
S = np.asarray(dataset[mags])
t = np.asarray(dataset['z'])
t.reshape(-1,1)
# Feature scaling
from sklearn.preprocessing import StandardScaler as scale
sc_S = scale()
sc_t = scale()
S2 = sc_S.fit_transform(S)
t2 = sc_t.fit_transform(t)
The last line throws an error:
ValueError: Expected 2D array, got 1D array instead:
array=[4.17974 2.06468 5.46959 ... 0.41398 0.3672 1.9235 ].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
And yes, I've reshaped my target array t with t.reshape(-1,1) as shown here, here, here and here, but to no avail. Am I reshaping correctly?
Here are all my variables:
I am guessing you have a dataframe. reshape returns a new array rather than modifying t in place, so you need to reassign the variable, t = t.reshape(-1,1):
import numpy as np
import pandas as pd
dataset = pd.DataFrame(np.random.normal(2,1,(100,4)),columns=['z','x1','x2','x3'])
mags = ['x1','x2','x3']
S = np.asarray(dataset[mags])
t = np.asarray(dataset['z'])
t = t.reshape(-1,1)
from sklearn.preprocessing import StandardScaler as scale
sc_S = scale()
sc_t = scale()
S2 = sc_S.fit_transform(S)
t2 = sc_t.fit_transform(t)
To check it works:
np.mean(t2)
2.4646951146678477e-16
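The difference between calling reshape and reassigning its result can be seen in a small sketch like this one, which uses a made-up five-element array in place of the redshift column:

```python
import numpy as np

t = np.arange(5.0)        # 1D array standing in for dataset['z']

t.reshape(-1, 1)          # returns a new array; the result is discarded
shape_before = t.shape    # t itself is still 1D

t = t.reshape(-1, 1)      # keep the reshaped array
shape_after = t.shape     # now 2D with a single column
```

Only the second form gives StandardScaler the 2D input it asks for.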

ndarray with 3 dimension into pandas dataframe

I know this topic has been asked before, but as I'm new to Python I couldn't fully understand how to do it, and I would appreciate an explanation.
I have an ndarray cube (a stack of images of the same location, with the same size and shape, which differ in the wavelength at which they were taken).
I want to convert this cube into a pandas DataFrame so that I can iterate through specific rows.
I'm really confused by the large number of columns: I have 1024 columns in each image, and that confuses me when I need to index the images.
My end goal is to have the images in a DataFrame-like structure, perhaps some kind of image collection in which I can iterate over the rows of each image.
This is the code I have written so far:
import spectral.io.envi as envi
import matplotlib.pyplot as plt
import os
from spectral import *
import numpy as np
#Create the image path
#the path
img_path = r'N:\this\is\a\path\capture'
cali_path=r'N:\location\Image_Python'
#the specific file
img_file = 'emptyname_2019-08-13_11-05-46.hdr'
img_dark= 'DARKREF_emptyname_2019-08-13_11-05-46.hdr'
cali_hdr= 'Radiometric_1x1.hdr'
cali_img = 'Radiometric_1x1.cal'
img= envi.open(os.path.join(img_path,img_file)).load()
img_dark= envi.open(os.path.join(img_path,img_dark)).load()
img_cali= envi.open(os.path.join(cali_path,cali_hdr), image = os.path.join(cali_path,cali_img)).load()
cali_shape=img_cali.shape
dark_shape=img_dark.shape
img_shape=img.shape
print('shape image:',img_shape,'shape dark:',dark_shape,'calibration shape:',cali_shape)
wavelength=[float(i) for i in img.metadata['wavelength']]
#get the exposure time
tint=float(img.metadata['tint'])
print(tint)
#goal: subtract the dark reference from the DN image.
#step 1: for each column in the dark reference, calculate the mean, then subtract this mean line from the DN image.
#averaging along axis=0 takes the mean over each whole column and leaves a single row.
dark_1024=img_dark.mean(axis=0)
from numpy import asarray
import pandas as pd
img_np=asarray(img)
dark_np=asarray(img_dark)
cali_np=asarray(img_cali)
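One possible way to get such a cube into a DataFrame is to flatten the two spatial axes into a (row, col) MultiIndex and keep one column per band. The sketch below assumes the cube is laid out as (rows, cols, bands) and uses a tiny made-up array instead of the hyperspectral data; the band_N column names are also just placeholders.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the image cube, assumed layout: (rows, cols, bands)
cube = np.arange(2 * 3 * 4).reshape(2, 3, 4)
rows, cols, bands = cube.shape

# One DataFrame row per pixel, addressed by (row, col); one column per band
index = pd.MultiIndex.from_product([range(rows), range(cols)], names=["row", "col"])
df = pd.DataFrame(
    cube.reshape(rows * cols, bands),
    index=index,
    columns=[f"band_{b}" for b in range(bands)],
)

# All pixels of image row 0, across every band
row0 = df.loc[0]
```

For the real data, the wavelength list read from the metadata could serve as column labels instead of the placeholder names.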

How to generate an image from pixel values taken from a .csv file? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have a csv file that has RGB values in each row. I want to generate a 512 x 512 image with each pixel value taken from the rows of the csv. How do I go about doing this? Any help would be much appreciated.
2.053 163.5011 0.0522
2.053 163.4489 0.0517
2.053 163.3972 0.0511
2.053 163.3461 0.0506
2.053 163.2955 0.0501
...
Tags specify Python and matlab, I just picked the former since that's what I know. Here's the code:
import numpy as np
import matplotlib.pyplot as plt
#values = np.loadtxt("D://tempCode/test.txt") # for reading from file
values = np.random.random(size=(512,512,3)) # random pixels
plt.figure()
plt.imshow(values)
plt.show()
Result:
The critical bit is realising how your data is structured, i.e. whether it is given row by row or column by column. Then you'll need to call values.reshape(new_shape) on it to make it work with imshow. Have a look here for the documentation.
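How the row-by-row versus column-by-column distinction affects the reshape can be sketched with a small made-up table of pixels. The 2 x 3 size and the generated values are assumptions for illustration; for the question's file the table would come from np.loadtxt and the size would be 512 x 512.

```python
import numpy as np

height, width = 2, 3

# Stand-in for the loaded csv: one RGB triple per line, height * width lines
rows = np.arange(height * width * 3, dtype=float).reshape(-1, 3)

# Pixels listed row by row: line k is pixel (k // width, k % width)
img_row_major = rows.reshape(height, width, 3)

# Pixels listed column by column: line k is pixel (k % height, k // height)
img_col_major = rows.reshape(width, height, 3).transpose(1, 0, 2)
```

Both results have shape (height, width, 3), which is what imshow expects for colour images. Float values must lie in the range 0 to 1 and integer values in 0 to 255 to display without clipping.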
I guess the code should be something like this:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
# load the data (adjust the delimiter to match your file)
im_l = np.genfromtxt('image.csv', delimiter=',')
# reshape the data
img = np.reshape(im_l, (256,256,3)) # change 256's according to your data
# visualize the data
plt.figure()
plt.imshow(img)
# finally save the image as jpg file
image = Image.fromarray(img.astype('uint8'), 'RGB')
image.save('my_im.jpg')

Save 3D array into a stack of 2D images in Python

I made a 3D array consisting of the numbers 0 to 4. I want to save this 3D array as a stack of 2D images (if possible, as a *.tiff file). How should I do this?
import numpy as np
a = np.random.randint(0,5, size=(100,100,100))
a = a.astype('int8')
Actually, I got it working. This is my code.
With this code, I don't need to stack a series of 2D images (arrays) myself.
I make a 3D array and save it directly. That is all I did.
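import numpy as np
# the original import was `from skimage.external import tifffile as tif`;
# newer scikit-image releases no longer bundle tifffile, so the standalone package is used here
import tifffile as tif
a = np.random.randint(0,5, size=(100,100,100))
a = a.astype('int8')
tif.imwrite('a.tif', a, bigtiff=True)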
This should work. I haven't tested it, but I have separated color images into RGB slices using this method, and it should work in much the same way here, assuming you don't want to do anything with those pixel values first. (They will be very close to the same color in an image.)
import imageio
import numpy as np
a = np.random.randint(0,5, size=(100,100,100))
a = a.astype('int8')
for i in range(100):
    newimage = a[:, :, i]
    imageio.imwrite("path/to/image%d.tiff" %i, newimage)
What exactly do you mean by "stack"? As you refer to tiff as output format, I assume here you want your data in one file as a multiframe-tiff.
This can easily be done with imageio's mimwrite() function:
# import numpy as np
# a = np.random.randint(0,5, size=(100,100,100))
# a = a.astype('int8')
import imageio
imageio.mimwrite("image.tiff", a)
Note that this function expects the frame counter as the first axis, with y and x following. See also its documentation.
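If the frames happen to be stored along the last axis instead, the array can be rearranged before writing. A minimal sketch, assuming a hypothetical (y, x, frame) layout and a small array so the shapes are easy to follow:

```python
import numpy as np

# Hypothetical layout: (y, x, frame) with 6 frames of 4 x 5 pixels
a = np.random.randint(0, 5, size=(4, 5, 6)).astype("int8")

# Move the frame axis to the front: (frame, y, x)
frames_first = np.moveaxis(a, -1, 0)
```

frames_first can then be passed to the multi-frame writer in place of a.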
However, if I'm wrong and you want to have n (e.g. 100) separate tif-files, you can also use the normal imwrite() function in a loop:
n = len(a)
for i in range(n):
    imageio.imwrite(f'image_{i:03}.tiff', a[i])
