OpenCV stereoCalibration of two cameras using ChArUco - python

I want to calculate the relative transformation between two cameras (the [R|t] matrix) using multiple frames of a ChArUco board. My idea was to obtain image-object point pairs from all the frames and then use a function which takes all of the detected point pairs and outputs the relative transformation between the cameras (e.g. stereoCalibrate).
What is the best approach to do that? I could not get stereoCalibrate to work, since it always throws assertion errors -> bug report.
Current implementation (not working):
imagePointsA = []
imagePointsB = []
objectPoints = []

for frameA, frameB in color_framesets(...):
    try:
        # Find corners
        cornersA, idsA, rejected = cv2.aruco.detectMarkers(frameA, charucoDict)
        cornersB, idsB, rejected = cv2.aruco.detectMarkers(frameB, charucoDict)
        if not cornersA or not cornersB: raise Exception("No markers detected")

        retA, cornersA, idsA = cv2.aruco.interpolateCornersCharuco(cornersA, idsA, frameA, charucoBoard)
        retB, cornersB, idsB = cv2.aruco.interpolateCornersCharuco(cornersB, idsB, frameB, charucoBoard)
        if not retA or not retB: raise Exception("Can't interpolate corners")

        # Find common points in both frames (is there a nicer way?)
        objPtsA, imgPtsA = cv2.aruco.getBoardObjectAndImagePoints(charucoBoard, cornersA, idsA)
        objPtsB, imgPtsB = cv2.aruco.getBoardObjectAndImagePoints(charucoBoard, cornersB, idsB)

        # Create a dictionary for each frame: objectPoint -> imagePoint
        ptsA = {tuple(a): tuple(b) for a, b in zip(objPtsA[:, 0], imgPtsA[:, 0])}
        ptsB = {tuple(a): tuple(b) for a, b in zip(objPtsB[:, 0], imgPtsB[:, 0])}
        common = set(ptsA.keys()) & set(ptsB.keys())  # intersection between obj points

        for objP in common:
            objectPoints.append(np.reshape(objP, (1, 3)))
            imagePointsA.append(np.reshape(ptsA[objP], (1, 2)))
            imagePointsB.append(np.reshape(ptsB[objP], (1, 2)))
    except Exception as e:
        print(f"Skipped frame: {e}")
        continue

result = cv2.stereoCalibrateExtended(objectPoints, imagePointsA, imagePointsB, intrA, distA, intrB, distB, (848, 480), flags=cv2.CALIB_FIX_INTRINSIC+cv2.CALIB_USE_EXTRINSIC_GUESS)

I just made something similar earlier today. I assume you solved at least part of your issue, since you closed the mentioned bug. In any case, it seems to me that the problem is that you are passing a flat array of points, while stereoCalibrate expects an array of arrays of points (one array of points per frame with sufficient data).
On a related note, cv2.aruco.getBoardObjectAndImagePoints is probably not what you are looking for: cornersA and cornersB are already image points (of the chessboard pattern corners), and the object points (the positions of the chessboard pattern corners) can be computed from the ChArUco corner ids, while getBoardObjectAndImagePoints deals with the aruco marker corners, as far as I can tell.
Internally, cv2.aruco.calibrateCameraCharuco simply calls cv2.calibrateCamera with the passed corners as image points, and with object points computed from the passed corner IDs. Unfortunately, getting the object point for a corner ID isn't exposed in the API, but it's pretty easy to compute: https://github.com/opencv/opencv_contrib/blob/master/modules/aruco/src/charuco.cpp#L157-L166
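For illustration, here is a rough, untested sketch (my own variable names) of how the per-frame lists could be built. It assumes charucoBoard was created with cv2.aruco.CharucoBoard_create, whose chessboardCorners attribute holds the 3D corner position for each ChArUco corner id:
objectPoints, imagePointsA, imagePointsB = [], [], []

for frameA, frameB in color_framesets(...):
    # ... detect markers and interpolate ChArUco corners as in the question,
    # giving (cornersA, idsA) and (cornersB, idsB) for this frame pair ...
    ptsA = {int(i): c for i, c in zip(idsA.ravel(), cornersA)}
    ptsB = {int(i): c for i, c in zip(idsB.ravel(), cornersB)}
    common = sorted(set(ptsA) & set(ptsB))
    if len(common) < 6:
        continue  # too few shared corners to be a useful view
    objectPoints.append(charucoBoard.chessboardCorners[common].astype(np.float32))  # (N, 3)
    imagePointsA.append(np.array([ptsA[i] for i in common], dtype=np.float32))      # (N, 1, 2)
    imagePointsB.append(np.array([ptsB[i] for i in common], dtype=np.float32))

ret, _, _, _, _, R, T, E, F = cv2.stereoCalibrate(
    objectPoints, imagePointsA, imagePointsB,
    intrA, distA, intrB, distB, (848, 480),
    flags=cv2.CALIB_FIX_INTRINSIC)
R and T are then the rotation and translation from the first camera to the second.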


Why does Gmsh create more surfaces than expected when I use the gmsh.model.occ.cut() command?

I am trying to create a volume in Gmsh (using the Python API) by cutting some small cylinders from a bigger one.
When I do that, I expect to have one surface for each cut region; instead, I get the result in the figure. I have highlighted in red the surfaces that give me the problem (some cut regions behave as expected). As you can see, instead of one surface I get two, and sometimes they aren't even equal.
gmsh creates more surfaces than expected:
So, my questions are:
Why does gmsh behave like this?
How can I fix it, since I need predictable behavior?
Below is the code I used to generate the geometry.
To run, the code requires some parameters such as core_height, core_inner_radius and core_outer_radius, as well as the number of small cylinders and their radius.
gmsh.initialize(sys.argv)
#gmsh.initialize()
gmsh.clear()
gmsh.model.add("circle_extrusion")

inner_cyl_tag = 1
outer_cyl_tag = 2
inner_cyl = gmsh.model.occ.addCylinder(0, 0, 0, 0, 0, core_height, core_inner_radius, tag=inner_cyl_tag)
outer_cyl = gmsh.model.occ.addCylinder(0, 0, 0, 0, 0, core_height, core_outer_radius, tag=outer_cyl_tag)
core_tag = 3
cut1 = gmsh.model.occ.cut([(3, outer_cyl)], [(3, inner_cyl)], tag=core_tag)

#create a set of filled cylinders
#set position
angle_vector = np.linspace(0, 2*np.pi, number_of_hp+1)
pos_x = hp_radial_position*np.cos(angle_vector)
pos_y = hp_radial_position*np.sin(angle_vector)
pos_z = 0.0

#cut one cylinder at a time and assign the new core tag
for ii in range(0, len(angle_vector)):
    old_core_tag = core_tag
    heat_pipe = gmsh.model.occ.addCylinder(pos_x[ii], pos_y[ii], pos_z, 0, 0, core_height, hp_outer_radius, tag=-1)
    core_tag = heat_pipe+1
    core = gmsh.model.occ.cut([(3, old_core_tag)], [(3, heat_pipe)], tag=core_tag)

gmsh.model.occ.synchronize()

#get volume entities and assign physical groups
volumes = gmsh.model.getEntities(dim=3)
solid_marker = 1
gmsh.model.addPhysicalGroup(volumes[0][0], [volumes[0][1]], solid_marker)
gmsh.model.setPhysicalName(volumes[0][0], solid_marker, "solid_volume")

#get surface entities and apply physical groups
surfaces = gmsh.model.getEntities(dim=2)
surface_markers = np.arange(1, len(surfaces)+1, 1)
for ii in range(0, len(surfaces)):
    gmsh.model.addPhysicalGroup(2, [surfaces[ii][1]], tag=surface_markers[ii])

#We finally generate and save the mesh:
gmsh.model.mesh.generate(3)
gmsh.model.mesh.refine()
gmsh.model.mesh.refine()
gmsh.option.setNumber("Mesh.MshFileVersion", 2.2)  #save in ASCII 2 format
gmsh.write(mesh_name+".msh")

# Launch the GUI to see the results:
#if '-nopopup' not in sys.argv:
#    gmsh.fltk.run()

gmsh.finalize()
I do not think that you have additional surfaces in the sense of gmsh.model.occ surfaces. To me this looks like your volume mesh is sticking out of your surface mesh, i.e. the volume and surface meshes do not fit together.
Here is what I did to check your case:
First, I added the following lines at the beginning of your code to get a minimum working example:
import gmsh
import sys
import numpy as np
inner_cyl_tag = 1
outer_cyl_tag = 2
core_height = 1
core_inner_radius = 0.1
core_outer_radius = 0.2
number_of_hp = 5
hp_radial_position = 0.1
hp_outer_radius = 0.05
What I get with this code is the following:
To visualize it like this, go to "Tools"-->"Options"-->"Mesh" and check "2D element faces", "3D element edges" and "3D element faces".
You can see that there are some purple triangles sticking out of the green/yellowish surface triangles of the inner surfaces.
You could visualize your case the same way and toggle "3D element faces" on and off a few times.
So here is the explanation of this behaviour (I did not know gmsh behaved like this myself): when you generate your mesh and then refine it, the refinement is applied to the 2D surface mesh and the 3D volume mesh separately, which means the two meshes are no longer connected after the refinement. What I did next was to try what happens if you generate the 2D mesh only, refine it, and then generate the 3D mesh, i.e.:
replace:
gmsh.model.mesh.generate(3)
gmsh.model.mesh.refine()
gmsh.model.mesh.refine()
by:
gmsh.model.mesh.generate(2)
gmsh.model.mesh.refine()
gmsh.model.mesh.refine()
gmsh.model.mesh.generate(3)
The result then looks like this:
I hope that this was actually your problem. But in the future it would be good if you could provide a minimum working example of code that we can copy-paste to get the same visualization you showed us in your image.

Creating an ASCII art world map

I'd like to render an ASCII art world map given this GeoJSON file.
My basic approach is to load the GeoJSON into Shapely, transform the points using pyproj to Mercator, and then do a hit test on the geometries for each character of my ASCII art grid.
It looks (edit: mostly) OK when centered on the prime meridian:
But centered on New York City (lon_0=-74), it suddenly goes haywire:
I'm fairly sure I'm doing something wrong with the projections here. (And it would probably be more efficient to transform the ASCII map coordinates to lat/lon than to transform the whole geometry, but I am not sure how.)
import functools
import json
import shutil
import sys

import pyproj
import shapely.geometry
import shapely.ops

# Load the map
with open('world-countries.json') as f:
    countries = []
    for feature in json.load(f)['features']:
        # buffer(0) is a trick for fixing polygons with overlapping coordinates
        country = shapely.geometry.shape(feature['geometry']).buffer(0)
        countries.append(country)
    mapgeom = shapely.geometry.MultiPolygon(countries)

# Apply a projection
tform = functools.partial(
    pyproj.transform,
    pyproj.Proj(proj='longlat'),  # input: WGS84
    pyproj.Proj(proj='webmerc', lon_0=0),  # output: Web Mercator
)
mapgeom = shapely.ops.transform(tform, mapgeom)

# Convert to ASCII art
minx, miny, maxx, maxy = mapgeom.bounds
srcw = maxx - minx
srch = maxy - miny
dstw, dsth = shutil.get_terminal_size((80, 20))

for y in range(dsth):
    for x in range(dstw):
        pt = shapely.geometry.Point(
            (srcw*x/dstw) + minx,
            (srch*(dsth-y-1)/dsth) + miny  # flip vertically
        )
        if any(country.contains(pt) for country in mapgeom):
            sys.stdout.write('*')
        else:
            sys.stdout.write(' ')
    sys.stdout.write('\n')
I made an edit at the bottom, where I discovered a new problem (why there is no Canada, and the unreliability of Shapely and pyproj).
Even though it does not exactly solve the problem, I believe this approach has more potential than using pyproj and Shapely, and if you do more ASCII art in the future it will give you more possibilities and flexibility. First I will list the pros and cons.
PS: Initially I wanted to find the problem in your code, but I had problems running it, because pyproj was returning an error.
PROS
1) I was able to extract all points (Canada really is missing) and rotate the image
2) The processing is very fast, so you can create animated ASCII art.
3) Printing is done all at once, without the need to loop
CONS (known issues, solvable)
1) This approach is definitely not translating the geo-coordinates correctly - too flat, it should look more spherical
2) I didn't take the time to work out how to fill the borders, so only the borders have '*'. This approach therefore needs an algorithm to fill the countries. I think that shouldn't be a problem, since the JSON file contains the countries separated
3) You need 2 extra libs besides numpy - opencv (you can use PIL instead) and colorama, because my example is animated and I needed to 'clean' the terminal by moving the cursor to (0,0) instead of using os.system('cls')
4) I made it run only in Python 3. It works in Python 2 too, but I get an error with sys.stdout.buffer
Change the font size of your terminal to the smallest setting so the printed chars fit in the terminal. The smaller the font, the better the resolution.
The animation should look like the map is 'rotating'.
I used a little bit of your code to extract the data. The steps are in the comments.
import json
import sys
import time

import numpy as np
import colorama
import cv2

# understand terminal_size as how many letters in the X axis and how many in the Y axis. Sorry, not a good name
if len(sys.argv) > 1:
    terminal_size = (int(sys.argv[1]), int(sys.argv[2]))
else:
    terminal_size = (230, 175)

with open('world-countries.json') as f:
    countries = []
    minimal = 0  # This can be dangerous. Expecting negative values
    maximal = 0  # Expecting values bigger than 0
    for feature in json.load(f)['features']:  # getting data - I pretend here that geo coordinates are actually indexes of my numpy array
        indexes = np.int16(np.array(feature['geometry']['coordinates'][0])*2)
        if indexes.min() < minimal:
            minimal = indexes.min()
        if indexes.max() > maximal:
            maximal = indexes.max()
        countries.append(indexes)

countries = (np.array(countries)+np.abs(minimal))  # Transform geo-coordinates to image coordinates
correction = np.abs(minimal)  # because geo-coordinates have negative values, I need to shift them to start at 0

colorama.init()

def move_cursor(x, y):
    print("\x1b[{};{}H".format(y+1, x+1))

move = 0  # 'rotate' the globe
for i in range(1000):
    image = np.zeros(shape=[maximal+correction+1, maximal+correction+1])  # creating a clean image
    move -= 1  # you need to rotate with negative values, because negative indexes are understood by numpy; positive ones will end up with an error
    for i in countries:  # VERY STRANGE: when parsing the json, some countries have a different JSON structure
        if len(i.shape) == 2:
            image[i[:, 1], i[:, 0]+move] = 255  # indexes that once were geo-coordinates now serve to position the countries in the image
        if len(i.shape) == 3:
            image[i[0][:, 1], i[0][:, 0]+move] = 255
    cut = np.where(image == 255)  # Bounding box
    if move == -1:  # creating the bounding box here - removing empty edges from the sides, top and bottom - we need space. This needs to be done only once
        max_x, min_x = cut[0].max(), cut[0].min()
        max_y, min_y = cut[1].max(), cut[1].min()
    new_image = image[min_x:max_x, min_y:max_y]  # the bounding box
    new_image = new_image[::-1]  # reverse, because the map is upside down
    new_image = cv2.resize(new_image, terminal_size)  # resize so it fits inside the terminal
    ascii = np.chararray(shape=new_image.shape).astype('|S4')  # create a container for the ascii image
    ascii[:, :] = ''  # chararray contains some random letters - dunno why... cleaning it
    ascii[:, -1] = '\n'  # because I print everything all at once, I create new lines at the end of the image
    new_image[:, -1] = 0  # at the end of the image there can be country borders which would overwrite the '\n' created one step above
    ascii[np.where(new_image > 0)] = '*'  # transforming the image array to a chararray: anything with a pixel value higher than 0 becomes a star in the chararray mask
    move_cursor(0, 0)  # 'cleaning' the terminal for the next animation frame
    sys.stdout.buffer.write(ascii)  # print into terminal
    time.sleep(0.025)  # FPS
Maybe it would be good to explain the main algorithm in the code. I like to use numpy wherever I can. The whole idea is that I pretend that coordinates in the image, or whatever they may be (in your case geo-coordinates), are matrix indexes. I then have two matrices - the real image and a chararray as a mask. I take the indexes of the interesting pixels in the real image, and for the same indexes in the chararray mask I assign any letter I want. Thanks to this, the whole algorithm doesn't need a single loop.
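As a tiny, self-contained illustration of that indexing idea (with made-up points, not the GeoJSON data):
import sys
import numpy as np

pts = np.array([[1, 2], [3, 5], [4, 0]])   # pretend these (row, col) pairs are coordinates
mask = np.chararray(shape=(6, 8))
mask[:, :] = ' '                            # blank canvas
mask[pts[:, 0], pts[:, 1]] = '*'            # one fancy-indexing assignment, no loop
mask[:, -1] = '\n'                          # newline column so the whole frame prints in one call
sys.stdout.buffer.write(mask)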
About future possibilities
Imagine you also have information about terrain (altitude). Let's say you somehow create a grayscale image of the world map where the gray shades express altitude. Such a grayscale image would have shape x,y. You prepare a 3D matrix with shape = [x, y, 256]. For each of the 256 layers in the 3D matrix, you assign one letter from ' ....;;;;### and so on' that expresses the shade.
When you have this prepared, you can take your grayscale image where every pixel actually has 3 coordinates: x, y and shade value. So you will have 3 arrays of indexes out of your grayscale map image -> x, y, shade. Your new chararray is simply an extraction of your 3D matrix with the layer letters, because:
# Preparation phase
x, y = grayscale.shape
matrix_3d = np.chararray(shape=[x, y, 256])
table = ' ......;;;;;;;###### ...'
for i in range(256):
    matrix_3d[:, :, i] = table[i]
x_indexes = np.repeat(np.arange(x), y)   # row index of every pixel
y_indexes = np.tile(np.arange(y), x)     # column index of every pixel
chararray_image = np.chararray(shape=[x, y])
# Ready to print
...
shades = grayscale.reshape(x*y)
chararray_image[:, :] = matrix_3d[(x_indexes, y_indexes, shades)].reshape(x, y)
Because there is no loop in this process and you can print the chararray all at once, you can actually print a movie into the terminal with huge FPS.
For example, if you have footage of the rotating earth, you can make something like this - (250*70 letters), render time 0.03658 s.
You can of course take it to the extreme and make super-resolution in your terminal, but the resulting FPS is not that good: 0.23157 s, which is approximately 4-5 FPS. It is interesting to note that the FPS of this approach is enormous, but the terminal simply cannot handle the printing, so this low FPS is due to the limitations of the terminal and not of the calculation: the calculation for this high resolution took 0.00693 s, which is 144 FPS.
BIG EDIT - contradicting some of the above statements
I accidentally opened the raw JSON file and found out that CANADA and RUSSIA are there, with full correct coordinates. I made the mistake of relying on the fact that we both didn't have Canada in the result, so I assumed my code was OK. Inside the JSON, the data has a different, NOT-UNIFIED structure. Russia and Canada have a 'MultiPolygon', so you need to iterate over it.
What does this mean? Don't rely on Shapely and pyproj. Obviously they can't extract some countries, and if they can't do that reliably, you can't expect them to do anything more complicated.
After modifying the code, everything is all right.
CODE: This is how to load the file correctly
...
with open('world-countries.json') as f:
    countries = []
    minimal = 0
    maximal = 0
    for feature in json.load(f)['features']:  # getting data - I pretend here that geo coordinates are actually indexes of my numpy array
        for k in range(len(feature['geometry']['coordinates'])):
            indexes = np.int64(np.array(feature['geometry']['coordinates'][k]))
            if indexes.min() < minimal:
                minimal = indexes.min()
            if indexes.max() > maximal:
                maximal = indexes.max()
            countries.append(indexes)
...

Do anyone know how to extract Image coordinate from Marmot dataset?

Marmot is a document image dataset (http://www.icst.pku.edu.cn/cpdp/data/marmot_data.htm) in which several things are labelled, such as document body, image area, table area, table caption and so on. This dataset is used specifically for document image analysis research. All coordinates are given as 16-digit hexadecimal values in little-endian format. Has anyone worked with this dataset, and how can those 16-digit XY coordinates be converted to a human-understandable format?
I finally got the clue after some analysis, and I am posting it here in case anyone needs to investigate this dataset. They do mention the unit in which the given coordinates are converted into pixel values, but it was difficult to trace, because it is not in their manual/guideline; it is mentioned elsewhere, as an annotation.
First you have to convert the 16-character hexadecimal value using the IEEE 754 little-endian format. For example, here are the coordinates given for one label:
BBox=['4074145c00000005', '4074dd95999999a9', '4080921e74bc6a80', '406fb9999999999a']
Convert using Python:
conv_pound = [struct.unpack('!d', t.decode('hex'))[0] for t in BBox]
You will get values in "pound" units, where 1 pound is 1/72 inch (i.e. a typographic point). We usually use coordinates in pixel units, and 1 inch is 96 pixels. So,
conv_pound = [321.2724609375003, 333.8490234375009, 530.2648710937501, 253.8]
Then divide each value by 72 and multiply by 96 to finally get the corresponding pixel values, which are
in_pixel = [428.36328, 445.13203, 707.01983, 338.40000]
They count pixel positions from the bottom-left corner of the document image. If you count from the top-left corner (as we usually do), you have to subtract the 2nd and 4th values from the image height. If the image [height, width] is [1123, 793], then we can represent the above coordinates as integer values:
label_boundary = [428, 678, 707, 785]
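Putting those steps together, a minimal Python 3 sketch (using the example BBox above and assuming an image height of 1123 px):
import struct

BBox = ['4074145c00000005', '4074dd95999999a9', '4080921e74bc6a80', '406fb9999999999a']
img_height = 1123

# 1) 16-char hex string -> IEEE 754 double
vals = [struct.unpack('!d', bytes.fromhex(t))[0] for t in BBox]

# 2) 1/72 inch units -> pixels at 96 DPI
x0, y0, x1, y1 = [v / 72 * 96 for v in vals]

# 3) flip the 2nd and 4th values to measure from the top-left corner
print([round(x0), round(img_height - y0), round(x1), round(img_height - y1)])
# -> [428, 678, 707, 785]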
After staring at the XMLs for an hour, I've found the last missing piece in the answer by @MMReza:
You don't need to rely on the unit of measure in step 3. There is an attribute called "CropBox" on the root element "Page". Use that one to scale the coordinates.
I have something along the following lines (also inverse y axis here):
px0, py1, px1, py0 = list(map(hex_to_double, page.get("CropBox").split()))
pw = abs(px1 - px0)
ph = abs(py1 - py0)
for table in page.findall(".//Composite[@Label='TableBody']"):
    x0p, y1m, x1p, y0m = list(map(hex_to_double, table.get("BBox").split()))
    x0 = round(imgw*(x0p - px0)/pw)
    x1 = round(imgw*(x1p - px0)/pw)
    y0 = round(imgh*(py1 - y0m)/ph)
    y1 = round(imgh*(py1 - y1m)/ph)
In case anyone is trying to do this in Python 3 like I did, you only have to change step 2 of the other answer like this:
conv_pound = [struct.unpack('!d', bytes.fromhex(t))[0] for t in BBox]
I wanted to convert the coordinates, and I also wanted to verify that my conversion actually worked. So I made this script to read a label file and the respective image file, extract the coordinates of the table body (for example), and visualize them on the image. It can be used to extract other fields in a similar manner. The comments explain it all.
import glob
import struct
import binascii
import re

import cv2

xml_files = glob.glob("path_to_labeled_files/*.xml")

for i in xml_files:
    # Open the current file and read everything
    cur_file = open(i, "r")
    content = cur_file.read()

    # Find the index of all occurrences of only the needed portions (e.g. TableBody in this case)
    residxs = [l.start() for l in re.finditer('Label="TableBody"', content)]

    # Read the image
    img = cv2.imread("path_to_images_folder/" + i.split('/')[-1][:-3] + "jpg")

    # Traverse over all occurrences
    for r in residxs[:-1]:
        # List to store output points
        coords = []

        # Start index of an occurrence
        sidx = r
        # Substring from the whole file content
        substr = content[sidx:sidx+400]

        # Now find the start index and end index of the coordinates in this substring
        sidx = substr.find('BBox="')
        eidx = substr.find('" CLIDs')

        # String containing only the points
        points = substr[sidx+6:eidx]

        # Make the conversion (also take care of little and big endian in unpack)
        for j in points.split(' '):
            if j == '':
                continue
            coords.append(struct.unpack('>d', binascii.unhexlify(j))[0])
        if len(coords) != 4:
            continue

        # As suggested by MMReza
        for k in range(4):
            coords[k] = (coords[k]/72)*96
        coords[1] = img.shape[0] - coords[1]
        coords[3] = img.shape[0] - coords[3]

        # Print the extracted coordinates
        print(coords)

        # Visualize it on the image
        cv2.rectangle(img, (int(coords[0]), int(coords[1])), (int(coords[2]), int(coords[3])), (255, 0, 0), 2)

    cv2.imshow("frame", img)
    cv2.waitKey(0)

Faster implementation to quantize an image with an existing palette?

I am using Python 3.6 to perform basic image manipulation through Pillow. Currently, I am attempting to take 32-bit PNG images (RGBA) of arbitrary color compositions and sizes and quantize them to a known palette of 16 colors. Optimally, this quantization method should be able to leave fully transparent (A = 0) pixels alone, while forcing all semi-transparent pixels to be fully opaque (A = 255). I have already devised working code that performs this, but I wonder if it may be inefficient:
import math
from PIL import Image

# a list of 16 RGBA tuples
palette = [
    (0, 0, 0, 255),
    # ...
]

with Image.open('some_image.png').convert('RGBA') as img:
    for py in range(img.height):
        for px in range(img.width):
            pix = img.getpixel((px, py))
            if pix[3] == 0:  # Ignore fully transparent pixels
                continue
            # Perform exhaustive search for closest Euclidean distance
            dist = 450
            best_fit = (0, 0, 0, 0)
            for c in palette:
                if pix[:3] == c[:3]:  # If pixel matches exactly, break
                    best_fit = c
                    break
                tmp = math.sqrt(pow(pix[0]-c[0], 2) + pow(pix[1]-c[1], 2) + pow(pix[2]-c[2], 2))
                if tmp < dist:
                    dist = tmp
                    best_fit = c
            img.putpixel((px, py), best_fit[:3] + (255,))
    img.save('quantized.png')
I see two main inefficiencies in this code:
Image.putpixel() is a slow operation
Calculating the distance function multiple times per pixel is computationally wasteful
Is there a faster method to do this?
I've noted that Pillow has a native function Image.quantize() that seems to do exactly what I want. But as it is coded, it forces dithering in the result, which I do not want. This has been brought up in another StackOverflow question. The answer to that question was simply to extract the internal Pillow code and tweak the control variable for dithering, which I tested, but I find that Pillow corrupts the palette I give it and consistently yields an image where the quantized colors are considerably darker than they should be.
Image.point() is a tantalizing method, but it only works on each color channel individually, where color quantization requires working with all channels as a set. It'd be nice to be able to force all of the channels into a single channel of 32-bit integer values, which seems to be what the ill-documented mode "I" would do, but if I run img.convert('I'), I get a completely greyscale result, destroying all color.
An alternative method seems to be using NumPy and altering the image directly. I've attempted to create a lookup table of RGB values, but the three-dimensional indexing of NumPy's syntax is driving me insane. Ideally I'd like some kind of code that works like this:
img_arr = numpy.array(img)

# Find all unique colors
unique_colors = numpy.unique(img_arr.reshape(-1, 4), axis=0)

# Generate lookup table
colormap = numpy.empty(unique_colors.shape)
for i, c in enumerate(unique_colors):
    dist = 450
    best_fit = None
    for pc in palette:
        tmp = math.sqrt(pow(c[0] - pc[0], 2) + pow(c[1] - pc[1], 2) + pow(c[2] - pc[2], 2))
        if tmp < dist:
            dist = tmp
            best_fit = pc
    colormap[i] = best_fit

# Hypothetical pseudocode I can't seem to write out
for iy in range(img_arr.shape[0]):
    for ix in range(img_arr.shape[1]):
        if img_arr[iy, ix, 3] == 0:  # Skip transparent
            continue
        index = # Find index of matching color in unique_colors, somehow
        img_arr[iy, ix] = colormap[index]
I note with this hypothetical example that numpy.unique() is another slow operation, since it sorts the output. Since I cannot seem to finish the code the way I want, I haven't been able to test if this method is faster anyway.
I've also considered flattening the RGBA axis by packing the values into a single 32-bit integer, with the aim of creating a one-dimensional lookup table with a simpler index:
def shift(a):
    return a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3]

img_arr = numpy.apply_along_axis(shift, 2, img_arr)
But this operation seemed noticeably slow on its own.
I would prefer answers that involve only Pillow and/or NumPy, please. Unless using another library demonstrates a dramatic computational speed increase over any PIL- or NumPy-native solution, I don't want to import extraneous libraries to do something these two libraries should be reasonably capable of on their own.
for loops should be avoided for speed.
I think you should make a tensor like:
d2[x,y,color_index,rgb] = distance_squared
where rgb = 0..2 (0 = r, 1 = g, 2 = b).
Then compute the distance:
d[x, y, color_index] = sqrt(sum(rgb, d2))
Then select the color_index with the minimal distance:
c[x,y] = min_index(color_index, d)
Finally replace alpha as needed:
alpha = ceil(orig_image.alpha)
img = c,alpha
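A rough NumPy sketch of that idea (untested; my own variable names, with a placeholder palette standing in for the real 16 colors):
import numpy as np
from PIL import Image

palette = np.array([(0, 0, 0, 255)] * 16, dtype=np.int64)   # placeholder 16-color palette

img = Image.open('some_image.png').convert('RGBA')
orig = np.asarray(img)                                       # (H, W, 4) uint8
arr = orig.astype(np.int64)

# d2[y, x, color_index] = squared RGB distance from every pixel to every palette color
diff = arr[:, :, None, :3] - palette[None, None, :, :3]      # (H, W, 16, 3)
d2 = (diff * diff).sum(axis=3)                               # (H, W, 16)

best = d2.argmin(axis=2)                                     # (H, W) index of nearest palette color
out = palette[best].astype(np.uint8)                         # (H, W, 4)
out[..., 3] = 255                                            # force full opacity

transparent = orig[..., 3] == 0
out[transparent] = orig[transparent]                         # leave fully transparent pixels alone

Image.fromarray(out, 'RGBA').save('quantized.png')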

Vtk inserts incorrect color between nodes when mapping texture to mesh

Hi, I am trying to map a texture to a 3D mesh using Mayavi and the Python bindings of VTK. I am visualising a Wavefront .obj file. The obj is a 3D photograph of a face. The texture image is a composite of three 2D photographs.
Each node in the mesh has a (u, v) coordinate in the image, which defines its colour. Different regions of the mesh draw their colours from different sections of the image. To illustrate this, I have replaced the actual texture image with this one:
And mapped this to the mesh instead.
The problem I am having is illustrated around the nose. At the border between red and green there is an outline of blue. Closer inspection of this region in wireframe mode shows that it is not a problem with the uv mapping, but with how VTK interpolates colour between two nodes. For some reason it is adding a piece of blue in between two nodes where one is red and one is green.
This causes serious problems when visualising with the real texture.
Is there a way to force VTK to choose the colour of one or the other neighbouring node for the colour between them? I tried turning "edge clamping" on, but this did not achieve anything.
The code that I am using is below, and you can access the files in question here: https://www.dropbox.com/sh/ipel0avsdiokr10/AADmUn1-qmsB3vX7BZObrASPa?dl=0
I hope there is a simple solution.
from numpy import *
from mayavi import mlab
from tvtk.api import tvtk
import os
from vtk.util import numpy_support

def obj2array(f):
    """Function for reading a Wavefront obj"""
    if type(f) == str:
        if os.path.isfile(f) == False:
            raise ValueError('obj2array: unable to locate file ' + str(f))
        f = open(f)

    vertices = list()
    connectivity = list()
    uv = list()
    vt = list()
    fcount = 0

    for l in f:
        line = l.rstrip('\n')
        data = line.split()
        if len(data) == 0:
            pass
        else:
            if data[0] == 'v':
                vertices.append(atleast_2d(array([float(item) for item in data[1:4]])))
            elif data[0] == 'vt':
                uv.append(atleast_2d(array([float(item) for item in data[1:3]])))
            elif data[0] == 'f':
                nverts = len(data) - 1  # number of vertices comprising each face
                if fcount == 0:  # on first face establish face format
                    fcount = fcount + 1
                    if data[1].find('/') == -1:  # Case 1
                        case = 1
                    elif data[1].find('//') != -1:
                        case = 4
                    elif len(data[1].split('/')) == 2:
                        case = 2
                    elif len(data[1].split('/')) == 3:
                        case = 3
                if case == 1:
                    f = atleast_2d([int(item) for item in data[1:len(data)]])
                    connectivity.append(f)
                if case == 2:
                    splitdata = [item.split('/') for item in data[1:len(data)]]
                    f = atleast_2d([int(item[0]) for item in splitdata])
                    connectivity.append(f)
                if case == 3:
                    splitdata = [item.split('/') for item in data[1:len(data)]]
                    f = atleast_2d([int(item[0]) for item in splitdata])
                    connectivity.append(f)
                if case == 4:
                    splitdata = [item.split('//') for item in data[1:len(data)]]
                    f = atleast_2d([int(item[0]) for item in splitdata])
                    connectivity.append(f)

    vertices = concatenate(vertices, axis=0)

    if len(uv) == 0:
        uv = None
    else:
        uv = concatenate(uv, axis=0)

    if len(connectivity) != 0:
        try:
            conarray = concatenate(connectivity, axis=0)
        except ValueError:
            if triangulate == True:
                conarray = triangulate_mesh(connectivity, vertices)
            else:
                raise ValueError('obj2array: not all faces triangles?')
        if conarray.shape[1] == 4:
            if triangulate == True:
                conarray = triangulate_mesh(connectivity, vertices)

    return vertices, conarray, uv

# load texture image
texture_img = tvtk.Texture(interpolate=1, edge_clamp=1)
texture_img.input = tvtk.BMPReader(file_name='HM_1_repose.bmp').output

# load obj
verts, triangles, uv = obj2array('HM_1_repose.obj')

# make 0-indexed
triangles = triangles - 1

surf = mlab.triangular_mesh(verts[:, 0], verts[:, 1], verts[:, 2], triangles)

tc = numpy_support.numpy_to_vtk(uv)
pd = surf.mlab_source.dataset._vtk_obj.GetPointData()
pd.SetTCoords(tc)
surf.actor.actor.mapper.scalar_visibility = False
surf.actor.enable_texture = True
surf.actor.actor.texture = texture_img

mlab.show(stop=True)
You can turn off all interpolation (change interpolate=1 to interpolate=0 in your example), but there is no way to turn off interpolation only at the places where it would interpolate across sub-images of the texture - at least not without writing your own fragment shader. Turning it off entirely will likely look crude.
Another solution would be to create 3 texture images with transparent texels at each location that is not part of the actor's face. Then render the same geometry with the same texture coordinates but a different image each time (i.e., have 3 actors, each with the same polydata but a different texture image).
I just ran into this exact problem as well and found that the reason this happens is that VTK assumes there is a 1-to-1 relationship between points in the polydata and uv coordinates when rendering the actor and the associated vtkTexture. However, in my case and the case of the OP, there are neighbouring triangles that are mapped to different sections of the image, so they have very different uv coordinates. A point shared by such neighbouring faces can only have one uv coordinate (or TCoord) associated with it, but it actually needs 2 (or more, depending on your case).
My solution was to loop through and duplicate the points that lie on the seams/borders and create a new vtkCellArray containing triangles with these duplicated point ids. Then I simply replaced the vtkPolyData Polys() list with the new triangles. It would have been much easier to duplicate the points and update the existing point ids for each of the triangles that needed it, but I couldn't find a way to update the cells properly.
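For anyone who wants to try this, here is a rough, untested sketch of the simpler variant in NumPy terms: instead of duplicating only the seam points, it gives every triangle corner its own point. It assumes you can recover one uv pair per face corner (e.g. from the vt indices in the obj file, which the obj2array above currently discards):
import numpy as np

def split_corners(verts, triangles, uv_per_corner):
    """Duplicate mesh points so each triangle corner carries its own uv.

    verts:          (N, 3) float array of vertex positions
    triangles:      (M, 3) int array of point ids
    uv_per_corner:  (M, 3, 2) float array, one uv pair per face corner
    Returns new_verts (3M, 3), new_triangles (M, 3), new_uv (3M, 2).
    """
    new_verts = verts[triangles.reshape(-1)]               # one point per corner
    new_triangles = np.arange(len(new_verts)).reshape(-1, 3)
    new_uv = uv_per_corner.reshape(-1, 2)
    return new_verts, new_triangles, new_uv

# usage with the question's pipeline (uv_per_corner is hypothetical here):
# verts, triangles, uv = split_corners(verts, triangles, uv_per_corner)
# then build the triangular_mesh and call SetTCoords exactly as in the question.
This wastes some memory compared with duplicating only the seam points, but it sidesteps the bookkeeping of finding which points need more than one TCoord.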
