How to convert ndjson data into numpy to extract image data? - python

Following Google's Quick, Draw! doodle dataset, I would like to know how to get the 28x28 numpy image data out of a .ndjson file (just the image data).
I am aware that they also provide the dataset in a numpy version, but I am facing the same issue with another dataset, similar to Google's, which only provides the simplified .ndjson files.
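One possible approach (a minimal sketch, assuming the simplified Quick, Draw! ndjson format, where each line is a JSON object whose "drawing" field holds a list of strokes, each stroke being a pair of x and y coordinate lists in the 0-255 range; the helper name ndjson_to_28x28 is made up for illustration): render the strokes onto a 256x256 canvas with PIL and downscale to 28x28.

```python
import json

import numpy as np
from PIL import Image, ImageDraw


def ndjson_to_28x28(path, max_drawings=None):
    """Rasterize simplified Quick, Draw! strokes into 28x28 uint8 arrays."""
    images = []
    with open(path) as f:
        for i, line in enumerate(f):
            if max_drawings is not None and i >= max_drawings:
                break
            strokes = json.loads(line)["drawing"]
            # Simplified drawings use coordinates in [0, 255]
            canvas = Image.new("L", (256, 256), 0)
            draw = ImageDraw.Draw(canvas)
            for xs, ys in strokes:
                # A wide stroke keeps lines visible after downscaling
                draw.line(list(zip(xs, ys)), fill=255, width=8)
            images.append(np.array(canvas.resize((28, 28), Image.LANCZOS)))
    return np.stack(images)
```

The result is an (N, 28, 28) array, one row per line of the .ndjson file, which matches the shape of Google's own numpy bitmap release.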

Related

Image character segmentation from plates

I am working on creating an image dataset in CSV format. I looked into the MNIST dataset, which stores pixel values in CSV files. But I am looking for a tool which can segment characters from an image and save them in jpg/png format for further processing. Can anyone help me with it? I have already tried labelme and labelImg, but those only store annotations in formats like YOLO or XML.
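A minimal sketch of one way to do this without an annotation tool, using connected-component labelling from scipy (assuming dark characters on a light background; segment_characters and the threshold of 128 are made up for illustration):

```python
import numpy as np
from PIL import Image
from scipy import ndimage


def segment_characters(image_path, out_dir, min_area=50):
    """Find character-sized connected components and save each crop as PNG."""
    img = np.array(Image.open(image_path).convert("L"))
    mask = img < 128  # dark characters on a light background
    labels, n = ndimage.label(mask)
    boxes = []
    for sl in ndimage.find_objects(labels):
        ys, xs = sl
        # Skip specks smaller than min_area pixels (bounding-box area)
        if (ys.stop - ys.start) * (xs.stop - xs.start) >= min_area:
            boxes.append((xs.start, ys.start, xs.stop, ys.stop))
    boxes.sort()  # left-to-right reading order
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        Image.fromarray(img[y0:y1, x0:x1]).save(f"{out_dir}/char_{i}.png")
    return boxes
```

This breaks down for touching or overlapping characters, where something like a projection-profile split or watershed would be needed on top.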

3D visualization of .dicom files with ipyvolume

I'm trying to visualize a set of .dicom files using pydicom and ipyvolume.
I used pydicom to read the files, sorted them by their location, and turned the slices into a 3D array. I could draw a 3D model of the data using ipyvolume.pylab.plot_isosurface(), although I'm not sure this is the right way to visualize medical images (it's all solid pixels with the same opacity and color). I've also tried ipyvolume.pylab.volshow(), but that did not work.
Is there a right way to visualize medical images with ipyvolume, or is this just not the right library for that?
A DICOM file does not contain 'voxel' data, so you can't simply plot a DICOM in a 3D view. You should build a voxel volume from the slices of a DICOM series; after that, you can extract the final 3D model using a surface-extraction algorithm such as Marching Cubes. Take a look at CTU.
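As a sketch of the Marching Cubes step described above, using skimage.measure.marching_cubes (the synthetic sphere volume here is a stand-in for a real stack of sorted DICOM slices, e.g. np.stack([s.pixel_array for s in sorted_slices])):

```python
import numpy as np
from skimage import measure

# Fake volume: a solid sphere, standing in for stacked DICOM slices
z, y, x = np.mgrid[-16:16, -16:16, -16:16]
volume = (np.sqrt(x**2 + y**2 + z**2) < 10).astype(np.float32)

# Extract a triangle mesh of the iso-surface at the chosen threshold
verts, faces, normals, values = measure.marching_cubes(volume, level=0.5)
```

The resulting verts/faces mesh can then be handed to ipyvolume.pylab.plot_trisurf() or any other mesh viewer; for CT data, the level parameter would be a Hounsfield threshold for the tissue of interest.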
I haven't used ipyvolume, but looking at the documentation it ought to be able to visualize DICOM image sets.
If you want to try another package, I use SimpleITK to load DICOM images and itkwidgets to do volume visualization in a Jupyter notebook.
Here's a simple notebook that loads a DICOM series and displays it:
import SimpleITK as sitk
import itkwidgets
# Get the DICOM file names in the current directory
names = sitk.ImageSeriesReader.GetGDCMSeriesFileNames('.')
# Read the DICOM series
reader = sitk.ImageSeriesReader()
reader.SetFileNames(names)
img = reader.Execute()
itkwidgets.view(img)
If the directory has more than one DICOM series in it, GetGDCMSeriesFileNames has a seriesID parameter you can give it to specify which series to load.

Is there a way to resize non-image files for CNN similarly to image-based examples?

I am working on a research project involving brain wave data. The goal is to classify (1,0) each "image." The problem is essentially an image classification problem, where I could use a CNN, but it's not clean at all like most CNN examples online. The files that I have are tsv's (each file is an individual trial from a patient), and I have stacked them all into one pickle file with each having the participant ID and trial ID as an additional column.
I want to feed them through a CNN, but almost all examples online deal with equal-sized images. My data aren't of equal size, and they aren't images. I'm wanting to use PIL to make each file the same size, but is PIL even the correct tool for this, given that I don't have image files?
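One option that works on arbitrary numeric 2D arrays, without converting to images for PIL, is interpolation with scipy.ndimage.zoom (a minimal sketch; resize_trial and the channels-by-samples layout are assumptions about the data):

```python
import numpy as np
from scipy import ndimage


def resize_trial(arr, target_shape):
    """Resample a 2D trial (e.g. channels x samples) to a fixed shape."""
    factors = [t / s for t, s in zip(target_shape, arr.shape)]
    # order=1 is linear interpolation; order=0 would pick nearest values
    return ndimage.zoom(arr, factors, order=1)
```

Whether interpolation is appropriate depends on the data: stretching the time axis of brain-wave trials changes their effective sampling rate, so cropping or zero-padding to a common length may be the better choice for the time dimension.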

Is there a method of aligning 2D arrays based on features much like Image Registration

I have two 2D arrays: one consists of reference data, the other of measured data. While the matrices have the same shape, the measured data will not be perfectly centered; that is, the sample may not have been perfectly aligned with the detector. It could be rotated or translated. I would like to align the matrices based on their features, much like image registration. I'm hoping someone can point me toward a Python package capable of this, or let me know whether OpenCV can do this for numpy arrays with arbitrary values that do not fit the mold of a typical .png or .jpg file.
I have aligned images using OpenCV's image registration functions. I have attempted to convert my arrays to images using PIL, with the intent of using the image registration functions within OpenCV. If needed I can post my sample code, but at this point I want to know if there is a package with functions capable of doing this.
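For the translation part, skimage's phase_cross_correlation works directly on arbitrary-valued float arrays, with no image conversion needed (a minimal sketch; it only recovers translation, so rotation would need something extra, e.g. a log-polar transform or feature matching):

```python
import numpy as np
from scipy import ndimage
from skimage.registration import phase_cross_correlation

rng = np.random.default_rng(0)

# Reference and measured are arbitrary-valued 2D arrays, not images
reference = np.zeros((100, 100))
reference[40:60, 40:60] = rng.random((20, 20)) * 7.3
measured = ndimage.shift(reference, (5, -3))  # simulate misalignment

# shift is the offset to apply to `measured` to align it with `reference`
shift, error, diffphase = phase_cross_correlation(reference, measured)
aligned = ndimage.shift(measured, shift)
```

Passing upsample_factor to phase_cross_correlation gives subpixel precision if the misalignment is not a whole number of pixels.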

How do I prepare data (images) for py-faster-rcnn classification training?

I am trying to train my own image classifier with py-faster-rcnn using my own images.
It looks rather simple in the example here, but they are using a ready-made dataset (INRIA Person). The datasets are structured and cropped into sub-images (the datasets there actually contain both the original images and the person crops taken from them), plus a text annotation for each image with the coordinates of the crops. Pretty straightforward.
Still, I have no idea how this is done - do they use some sort of tool for this (I can hardly imagine such large amounts of data being cropped and annotated manually)?
Could anyone please suggest a solution for this one? Thanks.
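Annotations like these are typically produced with a labelling tool such as labelImg, which writes one annotation file per image; large public datasets are usually annotated manually, often by crowdsourcing. As a sketch of what such a file amounts to, here is a made-up helper that writes a Pascal VOC-style XML annotation (the layout that py-faster-rcnn's pascal_voc dataset loader reads) from a list of bounding boxes:

```python
import xml.etree.ElementTree as ET


def write_voc_annotation(filename, size, boxes, out_path):
    """Write a Pascal VOC-style XML annotation for one image.

    size:  (width, height, depth) of the image
    boxes: list of (class_name, xmin, ymin, xmax, ymax) tuples
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size_el = ET.SubElement(root, "size")
    for tag, val in zip(("width", "height", "depth"), size):
        ET.SubElement(size_el, tag).text = str(val)
    for name, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        bb = ET.SubElement(obj, "bndbox")
        for tag, val in zip(("xmin", "ymin", "xmax", "ymax"),
                            (xmin, ymin, xmax, ymax)):
            ET.SubElement(bb, tag).text = str(val)
    ET.ElementTree(root).write(out_path)
```

With a helper like this, you only need a tool (or script) that gives you boxes per image; the annotation files themselves are cheap to generate.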
