Convert pixel coordinates to frame coordinates - python

I am using a small window to detect Mario, who is represented by a red block. However, this red block is composed of 16 by 12 pixels. I want to take the pixel coordinates I found and convert them to a normal x/y coordinate system based on the frame shown in the image, which should be a 13 by 16 grid of blocks (NOT pixels).
So, for example, if Mario's box is in the upper left corner of the screen, the coordinates should be (0, 0).
I'm also not sure how to actually make the grid.
The code I'm using is as follows:
import numpy as np
from PIL import Image

class MarioPixels:
    def __init__(self):
        # Mario's red colour (248, 56, 0); only one 16-pixel row of the
        # 16x12 pattern is reproduced here
        self.mario = np.array([
            [[248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0],
             [248, 56, 0]]
        ])
        self.height = len(self.mario)     # number of pixel rows in the pattern
        self.width = len(self.mario[0])   # number of pixels per row in the pattern
        print(self.mario.shape)

    # find the difference in R, G and B values between what's in the window
    # and what's in the pattern
    def pixelDiff(self, p1, p2):
        return abs(p1[0] - p2[0]), abs(p1[1] - p2[1]), abs(p1[2] - p2[2])

    def isMario(self, window, pattern):
        total = [0, 0, 0]
        count = 0
        for line in range(len(pattern)):
            lineItem = pattern[line]
            sample = window[line]
            for pixelIdx in range(len(lineItem)):
                count += 1
                pixel1 = lineItem[pixelIdx]
                pixel2 = sample[pixelIdx]
                d1, d2, d3 = self.pixelDiff(pixel1, pixel2)
                # print(pixelIdx)
                total[0] = total[0] + d1  # sum of differences between all R values in window and pattern
                total[1] = total[1] + d2  # sum of differences between all G values in window and pattern
                total[2] = total[2] + d3  # sum of differences between all B values in window and pattern
                # Mario has a red hat
                # if line == 0 and pixelIdx == 4 and pixel2[0] != 248:
                #     return 1.0
        rscore = total[0] / (count * 255)  # divided by the number of places the R difference could be calculated
        gscore = total[1] / (count * 255)  # divided by the number of places the G difference could be calculated
        bscore = total[2] / (count * 255)  # divided by the number of places the B difference could be calculated
        # averaged to a value between 0 and 1: a number close to 0 means the object
        # (Mario, pipe, etc.) is there, a number close to 1 means it was not found
        return (rscore + gscore + bscore) / 3.0

    def searchForMario(self, step, state, pattern):
        height = self.height
        width = self.width
        imageIdx = 0
        bestScore = 1.1
        bestImage = None
        bestx1, bestx2, besty1, besty2 = 0, 0, 0, 0
        for y1 in range(0, 240 - height, 8):    # step through the rows, jumping by 8
            y2 = y1 + height
            for x1 in range(0, 256 - width, 3):  # jump by 3 columns
                x2 = x1 + width
                window = state[y1:y2, x1:x2, :]
                score = self.isMario(window, pattern)
                # print(imageIdx, score)
                if score < bestScore:
                    bestScore = score
                    bestImageIdx = imageIdx
                    bestImage = Image.fromarray(window)
                    bestx1, bestx2, besty1, besty2 = x1, x2, y1, y2
                imageIdx += 1
        bestImage.save('testrgb' + str(step) + '_' + str(bestImageIdx) + '_' + str(bestScore) + '.png')
        return bestx1, bestx2, besty1, besty2

It looks like you've got a pixel aspect ratio at play here, so the width and height of each "block" in pixels will be different.
Going by your code, your pixel space is 256x240 pixels, but you say it actually represents a 13x16 grid. That means every block in the x-domain is (256/13), or about 20, pixels wide, and in the y-domain (240/16) = 15 pixels tall. So "Mario", at 16x12 pixels, occupies less than one complete block. Looking at your image, this seems plausible: bushes and clouds also occupy less than one block.
I suggest you first make sure the 13x16 grid is correct, simply because it doesn't seem to match your pixel size exactly, and because the stride sizes in your ranges imply that blocks might actually be 3x8 pixels. Then you can draw the grid onto the pixel image by setting every pixel whose x-coordinate is exactly divisible by 20 to (0, 0, 0), a black RGB pixel, and likewise every pixel whose y-coordinate is exactly divisible by 15 (use the modulus operator %). To get the "block" coordinates, divide the x-coordinate by 20 and the y-coordinate by 15 and round down to the nearest whole number (or use // to do the rounding as part of the division), as in the sketch below.
I've assumed that your pixel coordinates also run from the top left (0, 0) to the bottom right (256, 240).
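Here is a minimal sketch of both steps, assuming the frame is a NumPy array of shape (240, 256, 3) with a top-left origin and using the approximate 20x15-pixel block size from above (the array layout and exact block size are assumptions):

import numpy as np

BLOCK_W, BLOCK_H = 20, 15   # roughly 256/13, exactly 240/16

def draw_grid(frame):
    # blacken every pixel whose x is divisible by 20 or whose y is divisible by 15
    gridded = frame.copy()
    gridded[:, ::BLOCK_W, :] = 0   # vertical grid lines
    gridded[::BLOCK_H, :, :] = 0   # horizontal grid lines
    return gridded

def to_block_coords(x, y):
    # integer division rounds down, giving the block a pixel falls in
    return x // BLOCK_W, y // BLOCK_H

print(to_block_coords(41, 32))   # a hit at pixel (41, 32) lies in block (2, 2)

The bestx1, besty1 corner returned by searchForMario can be fed straight into to_block_coords.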


Why are the coordinates of reconstructed 3D points different after the virtual camera intrinsic K has been changed proportionally for an image resize?

As far as I know, after an image resize the corresponding intrinsic parameter K also changes proportionally, but why are the coordinates of the 3D reconstruction of the same point not the same?
The following Python program is a simple experiment: the original image size is 1920x1080, and after the resize it becomes 640x480. The intrinsic parameter K1 corresponds to the original image and K2 to the resized one; RT1, RT2 are the extrinsic [R, T] matrices of the cameras (3x4, which should remain unchanged?). Without considering the effects of camera skew and distortion, why is there a difference in the reconstructed 3D points?
import cv2
import numpy as np

fx = 1040
fy = 1040
cx = 1920 / 2
cy = 1080 / 2
K1 = np.array([[fx, 0, cx],
               [0, fy, cy],
               [0, 0, 1]])
RT1 = np.array([[1, 0, 0, 4],
                [0, 1, 0, 5],
                [0, 0, 1, 6]])  # just randomly set
theta = np.pi / 6
RT2 = np.array([[np.cos(theta), -np.sin(theta), 0, 40],
                [np.sin(theta), np.cos(theta), 0, 50],
                [0, 0, 1, 60]])  # just randomly set
p1 = np.matmul(K1, RT1)  # projection matrix of camera 1
p2 = np.matmul(K1, RT2)  # projection matrix of camera 2
pt1 = np.array([100.0, 200.0])
pt2 = np.array([300.0, 400.0])
point3d1 = cv2.triangulatePoints(p1, p2, pt1, pt2)
# Remember to divide out the 4th row to make it homogeneous
point3d1 = point3d1 / point3d1[3]
print(point3d1)
[[-260.07160113]
 [ -27.39546108]
 [ 273.95189881]
 [   1.        ]]
Then resize the image and test whether the reconstructed 3D point is numerically equal:
rx = 640.0 / 1920.0
ry = 480.0 / 1080.0
fx = fx * rx
fy = fy * ry
cx = cx * rx
cy = cy * ry
K2 = np.array([[fx, 0, cx],
               [0, fy, cy],
               [0, 0, 1]])
p1 = np.matmul(K2, RT1)
p2 = np.matmul(K2, RT2)
pt1 = np.array([pt1[0] * rx, pt1[1] * ry])
pt2 = np.array([pt2[0] * rx, pt2[1] * ry])
point3d2 = cv2.triangulatePoints(p1, p2, pt1, pt2)
# Remember to divide out the 4th row to make it homogeneous
point3d2 = point3d2 / point3d2[3]
print(point3d2)
[[-193.03965985]
 [ -26.72133393]
 [ 189.12512305]
 [   1.        ]]
As you can see, point3d1 and point3d2 are not the same. Why?
After careful consideration, I was lucky enough to arrive at a more plausible explanation, which I state here to help others.
In short:
Image scaling must use a uniform scaling factor (so that fx and fy are scaled identically) in order to derive a correct intrinsic parameter K; otherwise the inconsistency of the x- and y-axis focal lengths with respect to the original image directly leads to deviations in the calculated 3D points!
Returning to the problem at the beginning: the given image size is 1920x1080 and its focal length is 1040 pixels, i.e. fx = fy = 1040. By definition fx = f/dx and fy = f/dy, where dx and dy are the physical pixel sizes and f is the physical focal length, so the prior dx = dy can be assumed and is constant. This "convention" should also be followed for any later image scaling.
Imagine if the scaled fx and fy were obtained with different proportions: dx and dy would no longer be equal, distorting the image; furthermore, in the projection matrix P = K*[R,t], fx and fy in K would vary disproportionately, leading to a deviation in the calculated P!
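As a minimal sketch of this conclusion, re-running the question's experiment with one uniform scale factor s = 640/1920 applied to both axes reproduces point3d1 (the factor is an assumption for illustration; it corresponds to a 640x360 image, preserving the aspect ratio):

import cv2
import numpy as np

s = 640.0 / 1920.0                        # one uniform scale factor for both axes
K3 = np.array([[1040 * s, 0, 960 * s],    # fx, fy, cx, cy all scaled by the same s
               [0, 1040 * s, 540 * s],
               [0, 0, 1]])
RT1 = np.array([[1, 0, 0, 4],
                [0, 1, 0, 5],
                [0, 0, 1, 6]], dtype=float)
theta = np.pi / 6
RT2 = np.array([[np.cos(theta), -np.sin(theta), 0, 40],
                [np.sin(theta), np.cos(theta), 0, 50],
                [0, 0, 1, 60]])
p1 = np.matmul(K3, RT1)
p2 = np.matmul(K3, RT2)
pt1 = np.array([100.0 * s, 200.0 * s])    # image points scaled by the same factor
pt2 = np.array([300.0 * s, 400.0 * s])
point3d3 = cv2.triangulatePoints(p1, p2, pt1, pt2)
point3d3 = point3d3 / point3d3[3]
print(point3d3)                           # matches point3d1, unlike point3d2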
BTW, I have similarly put a reference answer for the same experiment done in MATLAB at this link.

Creating a 3D rendering in Python with a camera: objects get progressively deformed toward the edge of the screen

So basically, the projection matrix works: as I move the camera around and look at the cube, there's no problem. But when I look to the side and the cube should be at the side of the screen, it gets deformed and scaled back.
From what I've heard, the projection is supposed to be:
translate (points - camera)
rotate by the negative of the camera's angles, so that z is forward, x is right and y is up; you now have points in the camera's 3D coordinate system
divide x/z and y/z so that points twice as far from you become twice as close to each other
Steps one and two are taken into account in the projection matrix; step 3 is the perspective.
But as I said, it gets distorted when the object isn't in the middle of the screen.
I tried posting images, but I don't have the reputation, and when I posted links, it said it was spam, so I am forced to upload the code as a whole:
https://jsfiddle.net/PoutineErable/w3z0mtes/69/
This is a version of the code that I modified from a YouTube video (in JS) that works, with (wasd) and (ijkl) for player and mouse movement.
I tried for around 4 days to get it working. Today I took the second video's JS code and made some light modifications to allow movement (wasd for movement and ijkl for mouse movement), and it works. Then I redid my code using the current model.
An even more stripped-down version, if you want: this one has no input and just renders an image; you enter the view angle at startup.
import numpy as np
import math as m
import pygame, sys, random

pygame.init()  # needed to get pygame initiated
print("\n"*20 + "-"*11, "start of program", "-"*11)

# --------------------------------------------------------- Start of math construct
# defining constants
PI = m.pi
WIDTH, HEIGHT = 600, 1000
fov = 70 * PI/180

# initialising the player variables
player_pos = np.array([0, 0, -5])
camera_rot_y = PI/180 * float(input("what's the horizontal angle (degrees)?:"))
camera_rot_x = PI/180 * float(input("what's the vertical angle (degrees)?:"))
camera_rot_z = 0

# defining math functions
def a(x, y):  # puts (0,0) at the center of the screen
    ''' (num, num) -> (num, num)
    takes coordinates in cartesian and outputs them in
    top-left-origin (PNG-style) coordinates
    '''
    return (x + WIDTH/2, HEIGHT/2 - y)

def b(array):
    ''' (array or list) -> tuple
    takes an array corresponding to coordinates in cartesian and
    outputs the coordinates in top-left-origin (PNG-style) form '''
    return a(array[0], array[1])

def Rx(rot_x):  # the rotation matrix about the x axis
    z = np.matrix([
        [1, 0, 0, 0],
        [0, m.cos(rot_x), m.sin(rot_x), 0],
        [0, -m.sin(rot_x), m.cos(rot_x), 0],
        [0, 0, 0, 1]
    ])
    return z

def Ry(rot_y):  # the rotation matrix about the y axis
    z = np.matrix([
        [m.cos(rot_y), 0, m.sin(rot_y), 0],
        [0, 1, 0, 0],
        [-m.sin(rot_y), 0, m.cos(rot_y), 0],
        [0, 0, 0, 1],
    ])
    return z

def Rz(rot_z):  # the rotation matrix about the z axis
    z = np.matrix([
        [m.cos(rot_z), m.sin(rot_z), 0, 0],
        [-m.sin(rot_z), m.cos(rot_z), 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1]
    ])
    return z
# ---------------------------------------------------------------------- End of math construct

# initialising the cube:
cube_ini = []
for i in [-1, 1]:
    for j in [-1, 1]:
        for k in [-1, 1]:
            cube_ini.append([i, j, k, 1])
cube = np.matrix(np.transpose(cube_ini))

# ---------------------------------------------------- pygame boilerplate part 2
screen = pygame.display.set_mode((WIDTH, HEIGHT))
pygame.display.set_caption("3d_renderer")
clock = pygame.time.Clock()

# ----------------------------------------------------------- Start of animation code
while True:
    keys = pygame.key.get_pressed()
    for event in pygame.event.get():
        if event.type == pygame.QUIT or keys[pygame.K_ESCAPE]:
            pygame.quit()
            # print("The program finished running \n \n")
            sys.exit()

    # creating the projection matrix
    translation_matrix = np.matrix([[1, 0, 0, -player_pos[0]],
                                    [0, 1, 0, -player_pos[1]],
                                    [0, 0, 1, -player_pos[2]],
                                    [0, 0, 0, 1]])
    rotation_matrix = np.dot(Ry(-camera_rot_y), Rz(-camera_rot_z))
    rotation_matrix = np.dot(Rx(-camera_rot_x), rotation_matrix)
    projection_matrix = np.dot(rotation_matrix, translation_matrix)

    # calculating the projection of the cube
    pos_cam_proj = np.dot(projection_matrix, cube)
    pos_cam_perspective = np.zeros((8, 2))
    for i in range(8):
        pos_cam_perspective[i, 0] = (0.5 * HEIGHT * pos_cam_proj[0, i] * (1/m.tan(fov))) / pos_cam_proj[2, i]
        pos_cam_perspective[i, 1] = (0.5 * HEIGHT * pos_cam_proj[1, i] * (1/m.tan(fov))) / pos_cam_proj[2, i]
    cube_screen = np.array(pos_cam_perspective)

    # ----------------------- drawing the lines
    screen.fill("Black")
    for i in range(8):
        for j in range(i, 8):
            pygame.draw.line(screen, "white", b(cube_screen[i][0:2]), b(cube_screen[j][0:2]))

    # final two boilerplate lines
    pygame.display.update()
    clock.tick(30)
# ------------------------------------------------------------ End of animation code
The important snippet of the code:
# creating the projection matrix
translation_matrix = np.matrix([[1, 0, 0, -player_pos[0]],
                                [0, 1, 0, -player_pos[1]],
                                [0, 0, 1, -player_pos[2]],
                                [0, 0, 0, 1]])
rotation_matrix = np.dot(Ry(-camera_rot_y), Rz(-camera_rot_z))
rotation_matrix = np.dot(Rx(-camera_rot_x), rotation_matrix)
projection_matrix = np.dot(rotation_matrix, translation_matrix)

# calculating the projection of the cube
pos_cam_proj = np.dot(projection_matrix, cube)
pos_cam_perspective = np.zeros((8, 2))
for i in range(8):
    pos_cam_perspective[i, 0] = (0.5 * HEIGHT * pos_cam_proj[0, i] * (1/m.tan(fov))) / pos_cam_proj[2, i]
    pos_cam_perspective[i, 1] = (0.5 * HEIGHT * pos_cam_proj[1, i] * (1/m.tan(fov))) / pos_cam_proj[2, i]
cube_screen = np.array(pos_cam_perspective)

# ----------------------- drawing the lines
screen.fill("Black")
for i in range(8):
    for j in range(i, 8):
        pygame.draw.line(screen, "white", b(cube_screen[i][0:2]), b(cube_screen[j][0:2]))
Your code is correct. However, you need to translate the cube instead of rotating it. Just try:
fov = 45 * PI/180
player_pos = np.array([0, 0, -4], dtype=np.float32)  # <-- dtype=np.float32
# camera_rot_y = PI/180 * float(input("what's the horizontal angle (degrees)?:"))
# camera_rot_x = PI/180 * float(input("what's the vertical angle (degrees)?:"))
camera_rot_y = 0
camera_rot_x = 0
camera_rot_z = 0
# [...]
angle = 0
while True:
    player_pos[0] = m.sin(m.radians(angle)) * 1.5
    angle += 2
    # [...]
Yes, my code did work. The problem was that I had the wrong fov term and I was way too close to the cube object.
fov = 60 * PI/180
# [ ... ]
player_pos = np.array([0,0,-10])
# [ ... ]
pos_cam_perspective[i,0] = 0.5 * (1/m.tan(fov/2)) * HEIGHT * pos_cam_proj[0,i] /pos_cam_proj[2,i]
pos_cam_perspective[i,1] = 0.5 * (1/m.tan(fov/2)) * HEIGHT * pos_cam_proj[1,i] /pos_cam_proj[2,i]
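For reference, the geometry behind the fov/2 term: a point on the edge of the view frustum at depth z sits at x = z * tan(fov/2), and mapping that edge to half the screen height gives the scale factor 0.5 * HEIGHT / tan(fov/2) used above (this assumes fov is the vertical field of view). A quick numeric check with assumed values:

import math as m

fov = 60 * m.pi / 180            # vertical field of view
HEIGHT = 1000                    # screen height, as in the question's code
z = 10.0                         # any depth
x_edge = z * m.tan(fov / 2)      # world-space x at the edge of the frustum
scale = 0.5 * HEIGHT / m.tan(fov / 2)
print(scale * x_edge / z)        # 500.0, i.e. exactly HEIGHT/2: the screen edge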

What is the "proper" way to get rid of gravity from accelerometer data?

The "gyro" array and accelwithg array are both the returned data from the hardware, respectevely for accelerometer and gyrometer.
My thought process was as follows:
Calculate time difference between each frame
add up all the angles
Rotation matrix for xyz rotation
Multiply the rotation matrix to the gravity array (0,0,9.8) to get an acceleration without gravity
However, I've noticed this method doesn't consistently work, as in the data varies a lot and the gravity doesn't get filtered out properly. Is there a better method to go on about this?
# gyroscope calculations
dt = (ts - last_ts_gyro) / 1000
last_ts_gyro = ts
gyro_angle_x = gyro[0] * dt
gyro_angle_y = gyro[1] * dt
gyro_angle_z = gyro[2] * dt

if firstGyro:
    total_x = gyro_angle_x
    total_y = gyro_angle_y
    total_z = gyro_angle_z
    firstGyro = False

# totals
total_x += gyro_angle_x
total_y += gyro_angle_y
total_z += gyro_angle_z

# rad => degree
dtotal_x = np.rad2deg(total_x) % 360
dtotal_y = np.rad2deg(total_y) % 360
dtotal_z = np.rad2deg(total_z) % 360

# rotation matrix
Qx = np.array(
    [[1, 0, 0], [0, np.cos(dtotal_x[0]), -np.sin(dtotal_x[0])], [0, np.sin(dtotal_x[0]), np.cos(dtotal_x[0])]])
Qy = np.array(
    [[np.cos(dtotal_y[0]), 0, np.sin(dtotal_y[0])], [0, 1, 0], [-np.sin(dtotal_y[0]), 0, np.cos(dtotal_y[0])]])
Qz = np.array(
    [[np.cos(dtotal_z[0]), -np.sin(dtotal_z[0]), 0], [np.sin(dtotal_z[0]), np.cos(dtotal_z[0]), 0], [0, 0, 1]])
Qxyz = Qx @ Qy @ Qz

# a - Qxyz*g to filter out gravity
g = np.array([[0], [0], [gravity_norm]])
rotated_g = Qxyz @ g
accelwithoutg = np.subtract(accelwithg, rotated_g)
You converted the gyro angles from radians to degrees and then used the NumPy trig functions, but those trig functions expect angles in radians, not degrees.
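A minimal sketch of the fix under that diagnosis: keep the integrated angles in radians and pass them straight to the trig functions (the helper name and the example angle values are assumptions for illustration, not from the question):

import numpy as np

def rotation_xyz(rx, ry, rz):
    # build Qx @ Qy @ Qz from angles given in radians
    Qx = np.array([[1, 0, 0],
                   [0, np.cos(rx), -np.sin(rx)],
                   [0, np.sin(rx), np.cos(rx)]])
    Qy = np.array([[np.cos(ry), 0, np.sin(ry)],
                   [0, 1, 0],
                   [-np.sin(ry), 0, np.cos(ry)]])
    Qz = np.array([[np.cos(rz), -np.sin(rz), 0],
                   [np.sin(rz), np.cos(rz), 0],
                   [0, 0, 1]])
    return Qx @ Qy @ Qz

total_x, total_y, total_z = 0.10, 0.20, 0.05   # integrated gyro angles, in radians
gravity_norm = 9.81
g = np.array([[0.0], [0.0], [gravity_norm]])
rotated_g = rotation_xyz(total_x, total_y, total_z) @ g   # gravity in the sensor frame
# accelwithoutg = np.subtract(accelwithg, rotated_g), as in the question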

Random values +/- 10 from a supplied x that will not exceed a global range of 0-255

I have found similar, but not suitable, answers to my issue.
I need to build some noise into received RGB values, but the values cannot exceed 255 or go below 0.
I have the following example:
red, green, blue = (253, 4, 130)
print(
    (np.random.randint(red - 10, red + 10),
     np.random.randint(green - 10, green + 10),
     np.random.randint(blue - 10, blue + 10)))
# output values cannot be over 255 or under 0
# The following output would not be ok:
>>> (257, -2, 132)
How can I generate a random +/- 10 value from any point within the range 0-255 that will not exceed 255 or go below 0?
You could place a conditional statement to check if the generated random number is within your required range.
Code:
import numpy as np

red, green, blue = (253, 4, 130)

def rand_within_rgb(inp, thres):
    # draw once, then clamp, so the value checked is the value returned
    val = np.random.randint(inp - thres, inp + thres)
    if val > 255:
        return 255
    elif val < 0:
        return 0
    else:
        return val

print((rand_within_rgb(red, 10),
       rand_within_rgb(green, 10),
       rand_within_rgb(blue, 10)))
Output:
(253, 0, 127)
If you don't like the clipping done above, you can change the function to the one below, which also gives the output you want.
def rand_within_rgb(inp, thres):
    if inp + thres > 255:
        return np.random.randint(inp - thres, 255)
    elif inp - thres < 0:
        return np.random.randint(0, inp + thres)
    else:
        return np.random.randint(inp - thres, inp + thres)
To ensure you stay within the boundaries, it is more elegant to use the min() and max() functions instead of if/else statements:
np.random.randint(max(x-10, 0), min(x+10, 255))
So your code can be changed to
def mutate_color(x):
    return np.random.randint(max(x - 10, 0), min(x + 10, 255))

red, green, blue = (253, 4, 130)
print(
    mutate_color(red),
    mutate_color(green),
    mutate_color(blue)
)
You can either compute the random value first and then clamp it to the desired [0, 255] range, or limit the random range first.
Compute random value, then clamp
def clamp(value, minValue, maxValue):
    """Returns `value` clamped to the range [minValue, maxValue]."""
    return max(minValue, min(value, maxValue))

clamp(np.random.randint(red - 10, red + 10), 0, 255)
Limit the random range
np.random.randint(max(0, red - 10), min(255, red + 10))
Note that the distribution of values will be different. In the first version, a value that's initially 5 could be changed to a value in the range [-5, 15], which would be clipped to [0, 15]. Since all values in the range [-5, 0] would be mapped to 0, there would be greater (6 out of 21) chance of the new value being 0.
With the second approach, the random range itself is adjusted. A value that's initially 5 would be changed to any value in the range [0, 15] with equal probability.
You will have to choose based on what you want.
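For completeness, a vectorized sketch (an assumption on my part that NumPy is acceptable here, as in the question) of the clamping approach applied to all three channels at once; note that np.random.randint's upper bound is exclusive, hence the 11:

import numpy as np

rgb = np.array([253, 4, 130])
noise = np.random.randint(-10, 11, size=3)   # uniform noise in [-10, 10]
noisy = np.clip(rgb + noise, 0, 255)         # clamp each channel to [0, 255]
print(noisy)                                 # e.g. [255   0 126]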

Python inquiry using a neural network

I am trying to modify code written by a software developer (Kyle Dickerson), and have written it up like this:
So I have this code:
from __future__ import division
## Kyle Dickerson
## kyle.dickerson#gmail.com
## Jan 15, 2008
##
## Self-organizing map using scipy
## This code is licensed and released under the GNU GPL
## This code uses a square grid rather than a hexagonal grid, as scipy allows for fast square grid computation.
## I designed sompy for speed, so attempting to read the code may not be very intuitive.
## If you're trying to learn how SOMs work, I would suggest starting with Paras Chopra's SOMPython code:
## http://www.paraschopra.com/sourcecode/SOM/index.php
## It has a more intuitive structure for those unfamiliar with scipy, however it is much slower.
## If you do use this code for something, please let me know, I'd like to know if it has been useful to anyone.
from random import *
from math import *
import sys
import scipy
import numpy

class SOM:
    def __init__(self, height=4, width=4, FV_size=3, learning_rate=0.005):
        self.height = height
        self.width = width
        self.FV_size = FV_size
        self.radius = (height+width)/3
        self.learning_rate = learning_rate
        self.nodes = scipy.array([[[random()*255 for i in range(FV_size)]
                                   for x in range(width)] for y in range(height)])
        self.nodes = scipy.array([[1,2,3],[4,5,6],[4,5,6],
            [4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],
            [4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6]])
        print "SOM", self.nodes

    def train(self, iterations=1000, train_vector=[[]]):
        for t in range(len(train_vector)):
            train_vector[t] = scipy.array(train_vector[t])
            print "training", train_vector[t], t
        time_constant = iterations/log(self.radius)
        delta_nodes = scipy.array([[[0 for i in range(self.FV_size)]
                                    for x in range(self.width)] for y in range(self.height)])
        for i in range(1, iterations+1):
            delta_nodes.fill(0)
            radius_decaying = self.radius*exp(-1.0*i/time_constant)
            rad_div_val = 2 * radius_decaying * i
            learning_rate_decaying = self.learning_rate*exp(-1.0*i/time_constant)
            sys.stdout.write("\rTraining Iteration: " + str(i) + "/" + str(iterations))
            for j in range(len(train_vector)):
                best = self.best_match(train_vector[j])
                for loc in self.find_neighborhood(best, radius_decaying):
                    influence = exp((-1.0 * (loc[2]**2)) / rad_div_val)
                    inf_lrd = influence*learning_rate_decaying
                    delta_nodes[loc[0], loc[1]] += inf_lrd*(train_vector[j] - self.nodes[loc[0], loc[1]])
            self.nodes += delta_nodes
        sys.stdout.write("\n")

    # Returns a list of points which live within 'dist' of 'pt'
    # Uses the Chessboard distance
    # pt is (row, column)
    def find_neighborhood(self, pt, dist):
        min_y = max(int(pt[0] - dist), 0)
        max_y = min(int(pt[0] + dist), self.height)
        min_x = max(int(pt[1] - dist), 0)
        max_x = min(int(pt[1] + dist), self.width)
        neighbors = []
        for y in range(min_y, max_y):
            for x in range(min_x, max_x):
                dist = abs(y - pt[0]) + abs(x - pt[1])
                neighbors.append((y, x, dist))
        return neighbors

    # Returns location of best match, uses Euclidean distance
    # target_FV is a scipy array
    def best_match(self, target_FV):
        loc = scipy.argmin((((self.nodes - target_FV)**2).sum(axis=2))**0.5)
        r = 0
        while loc > self.width:
            loc -= self.width
            r += 1
        c = loc
        return (r, c)

    # returns the Euclidean distance between two Feature Vectors
    # FV_1, FV_2 are scipy arrays
    def FV_distance(self, FV_1, FV_2):
        return (sum((FV_1 - FV_2)**2))**0.5

if __name__ == "__main__":
    print "Initialization..."
    colors = [[0, 0, 0], [0, 0, 255], [0, 255, 0],
              [0, 255, 255], [255, 0, 0], [255, 0, 255],
              [255, 255, 0], [255, 255, 255]]
    width = 32
    height = 32
    color_som = SOM(width, height, 3, 0.05)
    print "Training colors..."
    color_som.train(1000, colors)
    try:
        from PIL import Image
        print "Saving Image: sompy_test_colors.png..."
        img = Image.new("RGB", (width, height))
        for r in range(height):
            for c in range(width):
                img.putpixel((c, r), (int(color_som.nodes[r, c, 0]),
                                      int(color_som.nodes[r, c, 1]),
                                      int(color_som.nodes[r, c, 2])))
        print "color nodes", color_som.nodes
        img = img.resize((width*10, height*10), Image.NEAREST)
        img.save("sompy_test_colors.png")
    except:
        print "Error saving the image, do you have PIL (Python Imaging Library) installed?"
but when I try to go from

self.nodes = scipy.array([[[random()*255 for i in range(FV_size)]
                           for x in range(width)] for y in range(height)])

which was in the original code, to something like this:

self.nodes = scipy.array([[1,2,3],[4,5,6],[4,5,6],
    [4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],
    [4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6]])
I get the error message:
File "sompy5.py", line 112, in <module>
color_som.train(1000, colors)
File "sompy5.py", line 65, in train
best = self.best_match(train_vector[j])
File "sompy5.py", line 92, in best_match
loc = scipy.argmin((((self.nodes - target_FV)**2).sum(axis=2))**0.5)
File "/usr/lib/python2.7/dist-packages/numpy/core/_methods.py",
line 25, in _sum
out=out, keepdims=keepdims)
ValueError: 'axis' entry is out of bounds
Is there something that has to be done to get the vectors to match up?
This part is a 3-D array (3 square brackets to begin the argument):
self.nodes = scipy.array([[[random()*255 for i in range(FV_size)]
                           for x in range(width)] for y in range(height)])
This part is a 2-D array:
self.nodes = scipy.array([[1,2,3],[4,5,6],[4,5,6],
    [4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],
    [4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6],[4,5,6]])
So you need to turn self.nodes into the appropriate 3-D array.
EDIT: an example of the required syntax:
self.nodes = scipy.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(self.nodes)
>>> array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [10, 11, 12]]])
EDIT 2:
Another option is to build a linear array and then reshape():
myarray = scipy.array([1,2,3,4,5,6,7,8,9,10,11,12])
# 3 numbers for 3 dimensions; their product must equal the number of elements in the original array
myarray = myarray.reshape((2, 2, 3))
print(myarray)
>>> array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [10, 11, 12]]])
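Applied to the data from the question, the same reshape idea works whenever the grid dimensions multiply out to the number of elements. A small sketch, assuming the 16 vectors are meant for the class's default 4x4 grid with FV_size=3 (with width = height = 32, as in the question's __main__ block, self.nodes would instead need 32*32 = 1024 feature vectors):

import numpy
nodes2d = numpy.array([[1, 2, 3]] + [[4, 5, 6]] * 15)  # the 16 rows from the question
nodes3d = nodes2d.reshape((4, 4, 3))                   # (height, width, FV_size)
print(nodes3d.shape)                                   # (4, 4, 3)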
