Removing specific item from np.array when it matches location - python

For a Python program "Particle in a Box", I need to insert a gap in the x,y-plane. I keep track of the x- and y-location of each particle and the x- and y-components of the velocities in numpy arrays.
I have two errors.
1) position_y[z] <= 0 and position_x[z] > 0.8 and position_x[z] < 0.9 cannot be used for numpy arrays. I am new to numpy arrays, so please explain how to use any() or all() or other options.
2) The np.delete is not working: the particles are not disappearing. Is this because I am not using it the right way or is there another way to do it?
import math
import random
import numpy as np

def particles(n, gap):
    dt = 0.01
    position_x = []
    position_y = []
    speed_x = []
    speed_y = []
    for i in range(n):
        alpha = random.random() * 360
        speed = (0.1)*random.random() * alpha
        speed_x.append(math.sin(speed))
        speed_y.append(math.cos(speed))
        position_x.append(0.25)
        position_y.append(0.75)
    position_x = np.array(position_x)
    position_y = np.array(position_y)
    speed_x = np.array(speed_x)
    speed_y = np.array(speed_y)
Until here, it is working fine. The problem is somewhere in the following code.
    while True:
        position_x = position_x + speed_x * dt
        position_y = position_y + speed_y * dt
        # 'z' is the position number of the particle in the numpy array.
        for z in range(0, n):
            # Gap == 1 means there is a gap.
            if gap == 1:
                # The gap is at y = 0 and 0.8 < x < 0.9
                if position_y[z] <= 0 and position_x[z] > 0.8 and position_x[z] < 0.9:
                    np.delete(position_x, position_x[z])
                    np.delete(position_y, position_y[z])
                    np.delete(speed_x, speed_x[z])
                    np.delete(speed_y, speed_y[z])
After that, I plot each particle with plt.plot(position_x, position_y, 'ro') and call particles(100, 1).

The snippet:
np.delete(position_x, position_x[z])
isn't working because numpy.delete does not modify the array passed to it; it returns a new array without the deleted elements. Note also that its second argument is the index (or indices) to remove, not the value stored at that index.
So your code should be written like this:
position_x = np.delete(position_x, z)
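For the first error: a comparison on a NumPy array yields one boolean per element, so the conditions have to be combined with the element-wise & operator instead of and. The resulting boolean mask also makes the loop and np.delete unnecessary. A minimal sketch, assuming the four arrays from the question; the helper name remove_escaped is made up for illustration:

```python
import numpy as np

def remove_escaped(position_x, position_y, speed_x, speed_y):
    """Drop every particle that sits in the gap (y <= 0 and 0.8 < x < 0.9)."""
    # Each comparison gives a boolean array; & combines them element-wise.
    in_gap = (position_y <= 0) & (position_x > 0.8) & (position_x < 0.9)
    keep = ~in_gap
    # Boolean indexing returns new, shorter arrays, so reassign the results.
    return position_x[keep], position_y[keep], speed_x[keep], speed_y[keep]

# Illustrative data: only the first particle is inside the gap.
position_x = np.array([0.85, 0.50, 0.85])
position_y = np.array([-0.01, -0.01, 0.30])
speed_x = np.array([1.0, 2.0, 3.0])
speed_y = np.array([4.0, 5.0, 6.0])

position_x, position_y, speed_x, speed_y = remove_escaped(
    position_x, position_y, speed_x, speed_y)
print(position_x)
```

Because the arrays shrink, any later loop bound such as range(0, n) would have to use the current length, len(position_x), instead of the original n.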

Related

Fastest way to create list of (X,Y) incrementing tuples with step value?

I need a fast way to create a list of tuples representing image pixel coordinates (X, Y).
Where X is from 0 to size and Y is from 0 to size.
A step value of 1 results in X and Y values of (0, 1, 2, 3...) which is too many tuples. Using a step value greater than 1 will reduce processing time. For example, if the step value is 2 the values would be (0, 2, 4, 6...). If the step value is 4 the values would be (0, 4, 8, 12...).
In pure Python the range command might be used. However, NumPy is installed by default in my Linux distribution. In NumPy the arange command might be used, but I'm having a hard time wrapping my mind around NumPy array syntax.
PS: After a list of tuples is created it will be randomly shuffled and then read in the loop.
Edit 1
Using this answer below:
Instead of the image fading in it's doing some kind of weird wipe left to right. Using the code from the answer with a slight modification:
step = 4
size = self.play_rotated_art.size[0] - step
self.xy_list = [
    (x, y)
    for x in range(0, size - step, step)
    for y in range(0, size - step, step)
]
Bug Update
There was an error in my code, it's working fine now:
The updated code is:
self.step = 4
size = self.play_rotated_art.size[0] - self.step
self.xy_list = [
    (x, y)
    for x in range(0, size - self.step, self.step)
    for y in range(0, size - self.step, self.step)
]
shuffle(self.xy_list)
# Convert numpy array into python list & calculate chunk size
self.current_chunk = 0
self.chunk_size = int(len(self.xy_list) / 100)
# Where we stop copying pixels for current 1% chunk
end = self.current_chunk + self.chunk_size
if end > len(self.xy_list) - 1:
    end = len(self.xy_list) - 1
while self.current_chunk < end:
    x0, y0 = self.xy_list[self.current_chunk]
    x1 = x0 + self.step
    y1 = y0 + self.step
    box = (x0, y0, x1, y1)
    region = self.play_rotated_art.crop(box)
    self.fade.paste(region, box)
    self.current_chunk += 1
self.play_artfade_count += 1
return self.fade
TL;DR
I already have code with step value 1 but this code is overly complex and inefficient to request a modification. The above generic question would help others more and, still help me, if it were answered.
Existing code with step value 1:
def play_artfade2(self):
    ''' PILLOW VERSION:
        Fade in artwork in 100 chunks leaving loop after chunk and
        reentering after Tkinter updates screen and pauses.
    '''
    if self.play_artfade_count == 100:
        # We have completed a full cycle. Force graphical effects exit
        self.play_artfade_count = 0  # Reset art fade count
        self.play_rotated_value = -361  # Force Spin Art
        return None
    # Initialize numpy arrays first time through
    if self.play_artfade_count == 0:
        # Create black image to fade into
        self.fade = Image.new('RGBA', self.play_rotated_art.size, \
            color='black')
        # Generate a randomly shuffled array of the coordinates
        im = np.array(self.play_rotated_art)
        X,Y = np.where(im[...,0]>=0)
        coords = np.column_stack((X,Y))
        np.random.shuffle(coords)
        # Convert numpy array into python list & calculate chunk size
        self.xy_list = list(coords)
        self.current_chunk = 0
        self.chunk_size = int(len(self.xy_list) / 100)
    # Where we stop copying pixels for current 1% chunk
    end = self.current_chunk + self.chunk_size
    if end > len(self.xy_list) - 1:
        end = len(self.xy_list) - 1
    while self.current_chunk < end:
        x0, y0 = self.xy_list[self.current_chunk]
        x1 = x0 + 1
        y1 = y0 + 1
        box = (x0, y0, x1, y1)
        region = self.play_rotated_art.crop(box)
        self.fade.paste(region, box)
        self.current_chunk += 1
    self.play_artfade_count += 1
    return self.fade
Using Pillow's Image.crop() and Image.paste() is overkill for a single pixel but the initial working design was future focused to utilize "super pixels" with box size of 2x2, 3x3, 5x5, etc as image is resized from 200x200 to 333x333 to 512x512, etc.
I need fast way to create a list of tuples representing image pixel coordinates (X, Y).
Where X is from 0 to size and Y is from 0 to size
A list comprehension with range will work:
xsize = 10
ysize = 10
coords = [(x, y) for x in range(xsize) for y in range(ysize)]
# this verifies the shape is correct
assert len(coords) == xsize * ysize
If you want a step other than 1, set the step argument of range:
coords = [(x, y) for x in range(0, xsize, 2) for y in range(0, ysize, 2)]
You can use a generator expression:
size = 16
step = 4
coords = (
    (x, y)
    for x in range(0, size, step)
    for y in range(0, size, step)
)
Then you can iterate on that like you would do with a list:
for coord in coords:
    print(coord)
Using a generator instead of a list or tuple has the advantage of being more memory efficient.
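Since the question mentions NumPy and a later random shuffle, the same grid can also be built with np.arange and shuffled before use (a generator cannot be shuffled without first turning it into a list). This is only a sketch; the function name grid_coords and the seed parameter are assumptions, not part of the original code:

```python
import numpy as np

def grid_coords(size, step, seed=None):
    """Return a shuffled list of (x, y) tuples with 0 <= x, y < size."""
    axis = np.arange(0, size, step)
    # indexing="ij" makes x vary along the first axis, like the nested loops.
    xs, ys = np.meshgrid(axis, axis, indexing="ij")
    coords = np.column_stack((xs.ravel(), ys.ravel()))
    rng = np.random.default_rng(seed)
    rng.shuffle(coords)  # shuffles the rows (coordinate pairs) in place
    # Convert back to plain Python ints so the tuples work as a Pillow box.
    return [(int(x), int(y)) for x, y in coords]

coords = grid_coords(size=16, step=4, seed=0)
print(len(coords))
```

The returned list holds the same pairs as the list comprehension above, only in random order, so it can be read chunk by chunk in the fade loop.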

Filter in Fourier space does not behave like it's supposed to

This is a follow-up to an answered question that I asked and that can be found here.
I have several points (x,y,z coordinates) in a 3D box with associated masses. I want to draw a histogram of the mass density found in spheres of a given radius R. The idea is to compute a 3D histogram of my box (with a bin size much smaller than the radius), take its FFT, multiply by the filter (a ball in real space) and inverse-FFT the result. From there, I just compute the 1D histogram of the values obtained in each 3D bin.
Following the issue I had when using an analytic expression of the filter in Fourier space, I am now generating the ball in real space and taking its FFT to obtain my filter. However, the histogram I get from this method is really strange: where I would expect a Gaussian, I am getting this:
My code is the following:
import numpy as np
import matplotlib.pyplot as plt
import random
from numba import njit
# 1. Generate a bunch of points with masses from 1 to 3 separated by a radius of 1 cm
size = 100
radius = 1
rangeX = (0, size)
rangeY = (0, size)
rangeZ = (0, size)
rangem = (1,3)
qty = 300000 # or however many points you want
deltas = set()
for x in range(-radius, radius+1):
    for y in range(-radius, radius+1):
        for z in range(-radius, radius+1):
            if x*x + y*y + z*z<= radius*radius:
                deltas.add((x,y,z))
X = []
Y = []
Z = []
M = []
excluded = set()
for i in range(qty):
    x = random.randrange(*rangeX)
    y = random.randrange(*rangeY)
    z = random.randrange(*rangeZ)
    m = random.uniform(*rangem)
    if (x,y,z) in excluded: continue
    X.append(x)
    Y.append(y)
    Z.append(z)
    M.append(1)
    excluded.update((x+dx, y+dy, z+dz) for (dx,dy,dz) in deltas)
#print("There is ",len(X)," points in the box")
# Compute the 3D histogram
a = np.vstack((X, Y, Z)).T
b = 200
R = 10
H, edges = np.histogramdd(a, weights=M, bins = b)
Fh = np.fft.fftn(H, axes=(-3,-2, -1))
# Generate the filter in real space
Kreal = np.zeros((b,b,b))
X = edges[0]
Y = edges[1]
Z = edges[2]
mid = int(b/2)
s = (X.max()-X.min()+Y.max()-Y.min()+Z.max()-Z.min())/(3*b)
cst = 1/2 + (1/12 - (R/s)**2)*np.arctan((0.5*np.sqrt((R/s)**2-0.5))/(0.5-(R/s)**2)) + 1/3*np.sqrt((R/s)**2-0.5) + ((R/s)**2 - 1/12)*np.arctan(0.5/(np.sqrt((R/s)**2-0.5))) - 4/3*(R/s)**3*np.arctan(0.25/((R/s)*np.sqrt((R/s)**2-0.5)))
#njit(parallel=True)
def remp(Kreal):
    for i in range(b):
        for j in range(b):
            for k in range(b):
                a = cst - np.sqrt((X[i]-X[mid])**2 + (Y[j]-Y[mid])**2 + (Z[k]-Z[mid])**2)/s
                if a >= 0.1 and a < 0.2:
                    Kreal[i][j][k] = 0.1
                elif a >= 0.2 and a < 0.3:
                    Kreal[i][j][k] = 0.2
                elif a >= 0.3 and a < 0.4:
                    Kreal[i][j][k] = 0.3
                elif a >= 0.4 and a < 0.5:
                    Kreal[i][j][k] = 0.4
                elif a >= 0.5 and a < 0.6:
                    Kreal[i][j][k] = 0.5
                elif a >= 0.6 and a < 0.7:
                    Kreal[i][j][k] = 0.6
                elif a >= 0.7 and a < 0.8:
                    Kreal[i][j][k] = 0.7
                elif a >= 0.8 and a < 0.9:
                    Kreal[i][j][k] = 0.8
                elif a >= 0.9 and a < 0.99:
                    Kreal[i][j][k] = 0.9
                elif a >= 0.99:
                    Kreal[i][j][k] = 1
    return Kreal
Kreal = remp(Kreal)
Kreal = np.fft.ifftshift(Kreal)
Kh = np.fft.fftn(Kreal, axes=(-3,-2, -1))
Gh = np.multiply(Fh, Kh)
Density = np.real(np.fft.ifftn(Gh,axes=(-3,-2, -1)))
# Generate the filter in fourier space using its analytic expression
kx = 2*np.pi*np.fft.fftfreq(len(edges[0][:-1]))*len(edges[0][:-1])/(np.amax(X)-np.amin(X))
ky = 2*np.pi*np.fft.fftfreq(len(edges[1][:-1]))*len(edges[1][:-1])/(np.amax(Y)-np.amin(Y))
kz = 2*np.pi*np.fft.fftfreq(len(edges[2][:-1]))*len(edges[2][:-1])/(np.amax(Z)-np.amin(Z))
kr = np.sqrt(kx[:,None,None]**2 + ky[None,:,None]**2 + kz[None,None,:]**2)
kr *= R
Kh = (np.sin(kr)-kr*np.cos(kr))*3/(kr)**3
Kh[0,0,0] = 1
Gh = np.multiply(Fh, Kh)
Density2 = np.real(np.fft.ifftn(Gh,axes=(-3,-2, -1)))
D = Density.flatten()
N = np.mean(D)
D2 = Density2.flatten()
N2 = np.mean(D2)
# I then compute the histogram I want
hist, bins = np.histogram(D/N, bins='auto', density=True)
bin_centers = (bins[1:]+bins[:-1])*0.5
plt.plot(bin_centers, hist,'.',label = "Defining the Filter in real space")
hist, bins = np.histogram(D2/N2, bins='auto', density=True)
bin_centers = (bins[1:]+bins[:-1])*0.5
plt.plot(bin_centers, hist,'.',label = "Using analytic expression")
plt.xlabel('Normalised Density')
plt.ylabel('Probability density')
plt.legend()
plt.show()
Do you understand why this happens? Thank you very much for your help.
PS: the long list of if statements when I define the filter in real space comes from how I'm drawing the sphere on the grid. I assign the value 1 to all the bins that are 100% within the sphere, and then the value decreases as the volume occupied by the sphere in the bin decreases. I checked that it gives me a sphere of the wanted radius. Details on the subject can be found here (part 2.5 and figure 8 for accuracy).
--EDIT--
The code only seems to behave like this when all the particle masses are identical
My problem comes from how I am generating my filter. In my code, the way I assign weights to voxels that are not entirely in the sphere is discontinuous: for example, I give the weight 0.1 to a voxel whose volume ratio is between 0.1 and 0.2.
Thus what happens when all points have the same mass is this: I have multiples of 1 in my grid that I multiply by a finite number of coefficients, so there is a finite number of possible values that my grid can take, and some histogram bins are empty or at least 'less full'. This is less likely to happen when the masses of my particles are more continuously distributed.
A fix is thus to assign the right weight to the voxels.
def remp(Kreal):
    for i in range(b):
        for j in range(b):
            for k in range(b):
                a = cst - np.sqrt((X[i]-X[mid])**2 + (Y[j]-Y[mid])**2 + (Z[k]-Z[mid])**2)/s
                if a >= 0.1 and a < 0.99:
                    Kreal[i][j][k] = a
                elif a >= 0.99:
                    Kreal[i][j][k] = 1
    return Kreal
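The triple loop can also be written with broadcasting, which avoids the Python-level iteration over all b**3 voxels. This is a sketch under the assumption that X, Y, Z, mid, s, cst and b have the same meaning as in the code above; the function name remp_vectorized and the sample values are illustrative only:

```python
import numpy as np

def remp_vectorized(X, Y, Z, mid, s, cst, b):
    """Continuous voxel weights, following the same rule as the loop version."""
    # Offsets from the central bin, shaped so that they broadcast to (b, b, b).
    dx = (X[:b] - X[mid])[:, None, None]
    dy = (Y[:b] - Y[mid])[None, :, None]
    dz = (Z[:b] - Z[mid])[None, None, :]
    a = cst - np.sqrt(dx**2 + dy**2 + dz**2) / s
    # a >= 0.99 -> 1, 0.1 <= a < 0.99 -> a, anything else -> 0
    return np.where(a >= 0.99, 1.0, np.where(a >= 0.1, a, 0.0))

# Small illustrative grid (not the values from the question).
b = 8
edges = np.linspace(0.0, 8.0, b + 1)
Kreal = remp_vectorized(edges, edges, edges, mid=b // 2, s=1.0, cst=2.5, b=b)
print(Kreal.shape)
```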

Changing all items in a list at once

I am trying to create a simple "Particle in a box" in Python, where multiple particles are bouncing back and forth between boundaries.
I need to keep track of the x-pos and y-pos of the particle and the x- and y-components of the speed to plot these at the end.
By each dt, the program will calculate the new position. Instead of looping through each particle, I want to update the entire list at once. Otherwise, the calculation and replacements will take forever for more particles.
A similar question has already been asked. However, I calculate each value at each step, which is different from changing an item to a predetermined value.
So, how do I replace each item at once in a list after calculating the new value?
dt = 0.001
pos_x = []
pos_y = []
speed_x = []
speed_y = []
For-loop to set the speed of each particle:
for i in range(5):
    alpha = random.random() * 360
    speed = 0.1 * random.random() * alpha
    speed_x.append(math.sin(speed))
    speed_y.append(math.cos(speed))
    pos_x.append(0.25)
    pos_y.append(0.75)
For-loop to update the position of each particle:
for n in range(5):
    pos_x[n] = pos_x[n] + speed_x[n] * dt
    pos_y[n] = pos_y[n] + speed_y[n] * dt
After this, I will plot all the points and update the window each pause to let them move.
import numpy as np
if __name__ == "__main__":
    pos = np.array([5,5,5,5,5])
    speed = np.array([2,2,2,2,2])
    new_pos = pos + speed * 0.01
    print(new_pos)
Output:
[5.02 5.02 5.02 5.02 5.02]
With the numpy package you can easily add arrays together or multiply them with predefined values.
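Applied to the particle lists from the question, this means storing the positions and velocities as arrays once and updating all particles in a single expression. A minimal sketch; the array names pos and vel, the step helper, and the use of radians for the random angle are assumptions made here, not taken from the question:

```python
import numpy as np

dt = 0.001
n_particles = 5
rng = np.random.default_rng(0)

# One row per particle, columns are (x, y); all particles start at (0.25, 0.75).
pos = np.tile([0.25, 0.75], (n_particles, 1))
angle = rng.uniform(0.0, 2.0 * np.pi, size=n_particles)
vel = np.column_stack((np.sin(angle), np.cos(angle)))

def step(pos, vel, dt):
    """Advance every particle by one time step without a Python loop."""
    return pos + vel * dt

new_pos = step(pos, vel, dt)
print(new_pos.shape)
```

Keeping x and y in one (n, 2) array is a design choice; two separate 1D arrays, as in the question, work the same way with pos_x + speed_x * dt.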

Many particles in box - physics simulation

I'm currently trying to simulate many particles in a box bouncing around.
I've taken into account @kalhartt's suggestions and this is the improved code to initialize the particles inside the box:
import numpy as np
import scipy.spatial.distance as d
import matplotlib.pyplot as plt
# 2D container parameters
# Actual container is 50x50 but chose 49x49 to account for particle radius.
limit_x = 20
limit_y = 20
#Number and radius of particles
number_of_particles = 350
radius = 1
def force_init(n):
    # equivalent to np.array(list(range(number_of_particles)))
    count = np.linspace(0, number_of_particles-1, number_of_particles)
    x = (count + 2) % (limit_x-1) + radius
    y = (count + 2) / (limit_x-1) + radius
    return np.column_stack((x, y))
position = force_init(number_of_particles)
velocity = np.random.randn(number_of_particles, 2)
The initialized positions look like this:
Once I have the particles initialized I'd like to update them at each time-step. The code for updating follows the previous code immediately and is as follows:
# Updating
while np.amax(abs(velocity)) > 0.01:
    # Assume that velocity slowly dying out
    position += velocity
    velocity *= 0.995
    #Get pair-wise distance matrix
    pair_dist = d.cdist(position, position)
    pair_d = pair_dist<=4
    #If pdist [i,j] is <=4 then the particles are too close and so treat as collision
    for i in range(len(pair_d)):
        for j in range(i):
            # Only looking at upper triangular matrix (not inc. diagonal)
            if pair_d[i,j] ==True:
                # If two particles are too close then swap velocities
                # It's a bad hack but it'll work for now.
                vel_1 = velocity[j][:]
                velocity[j] = velocity[i][:]*0.9
                velocity[i] = vel_1*0.9
    # Masks for particles beyond the boundary
    xmax = position[:, 0] > limit_x
    xmin = position[:, 0] < 0
    ymax = position[:, 1] > limit_y
    ymin = position[:, 1] < 0
    # flip velocity and assume that it looses 10% of energy
    velocity[xmax | xmin, 0] *= -0.9
    velocity[ymax | ymin, 1] *= -0.9
    # Force maximum positions of being +/- 2*radius from edge
    position[xmax, 0] = limit_x-2*radius
    position[xmin, 0] = 2*radius
    position[ymax, 0] = limit_y-2*radius
    position[ymin, 0] = 2*radius
After updating it and letting it run to completion I get this result:
This is infinitely better than before but there are still patches that are too close together - such as:
Too close together. I think the updating works... and thanks to @kalhartt my code is wayyyy better and faster (and I learnt some things about numpy... props @kalhartt) but I still don't know where it's screwing up. I've tried changing the order of the actual updates, with the pair-wise distance going last or the position += velocity going last, but to no avail. I added the *0.9 to make the entire thing die down faster and I tried it with 4 to make sure that 2*radius (=2) wasn't too tight a criterion... but nothing seems to work.
Any and all help would be appreciated.
There are just two typos standing in your way. First, for i in range(len(positions)/2): only iterates over half of your particles. This is why half the particles stay within the x bounds (if you watch for many iterations it's clearer). Second, the second y condition should be a minimum (I assume): position[i][1] < 0. The following block works to bound the particles for me (I didn't test with the collision code so there could be problems there).
for i in range(len(position)):
    if position[i][0] > limit_x or position[i][0] < 0:
        velocity[i][0] = -velocity[i][0]
    if position[i][1] > limit_y or position[i][1] < 0:
        velocity[i][1] = -velocity[i][1]
As an aside, try to leverage numpy to eliminate loops when possible. It is faster, more efficient, and in my opinion more readable. For example force_init would look like this:
def force_init(n):
    # equivalent to np.array(list(range(number_of_particles)))
    count = np.linspace(0, number_of_particles-1, number_of_particles)
    x = (count * 2) % limit_x + radius
    y = (count * 2) / limit_x + radius
    return np.column_stack((x, y))
And your boundary conditions would look like this:
while np.amax(abs(velocity)) > 0.01:
    position += velocity
    velocity *= 0.995
    # Masks for particles beyond the boundary
    xmax = position[:, 0] > limit_x
    xmin = position[:, 0] < 0
    ymax = position[:, 1] > limit_y
    ymin = position[:, 1] < 0
    # flip velocity
    velocity[xmax | xmin, 0] *= -1
    velocity[ymax | ymin, 1] *= -1
Final note, it is probably a good idea to hard clip position to the bounding box with something like position[xmax, 0] = limit_x; position[xmin, 0] = 0. There may be cases where velocity is small and a particle outside the box will be reflected but not make it inside in the next iteration. So it will just sit outside the box being reflected forever.
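The reflection and the hard clip can be combined in one helper. A sketch assuming a box that spans 0 to limit_x and 0 to limit_y; the function name reflect_and_clip and the sample values are hypothetical:

```python
import numpy as np

def reflect_and_clip(position, velocity, limit_x, limit_y):
    """Flip the velocity of particles outside the box and move them back in."""
    upper = np.array([limit_x, limit_y], dtype=float)
    # Boolean array of shape (n, 2): True where a coordinate left the box.
    outside = (position < 0) | (position > upper)
    velocity[outside] *= -1
    # Clip in place so a slow particle cannot remain outside the box.
    np.clip(position, 0, upper, out=position)
    return position, velocity

position = np.array([[21.0, 5.0], [3.0, -1.0], [10.0, 10.0]])
velocity = np.array([[1.0, 1.0], [1.0, -1.0], [0.5, 0.5]])
reflect_and_clip(position, velocity, limit_x=20, limit_y=20)
print(position)
```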
EDIT: Collision
The collision detection is a much harder problem, but let's see what we can do. Let's take a look at your current implementation.
pair_dist = d.cdist(position, position)
pair_d = pair_dist<=4
for i in range(len(pair_d)):
    for j in range(i):
        # Only looking at upper triangular matrix (not inc. diagonal)
        if pair_d[i,j] ==True:
            # If two particles are too close then swap velocities
            # It's a bad hack but it'll work for now.
            vel_1 = velocity[j][:]
            velocity[j] = velocity[i][:]*0.9
            velocity[i] = vel_1*0.9
Overall a very good approach: cdist will efficiently calculate the distance between sets of points, and you find which points collide with pair_d = pair_dist<=4.
The nested for loops are the first problem. We need to iterate over True values of pair_d where j > i. First, your code actually iterates over the lower triangular region by using for j in range(i) so that j < i; this is not particularly important in this instance as long as i,j pairs are not repeated. However, NumPy has two builtins we can use instead: np.triu lets us set all values below a diagonal to 0, and np.nonzero gives us the indices of the non-zero elements in a matrix. So this:
pair_dist = d.cdist(position, position)
pair_d = pair_dist<=4
for i in range(len(pair_d)):
    for j in range(i+1, len(pair_d)):
        if pair_d[i, j]:
            ...
is equivalent to
pair_dist = d.cdist(position, position)
pair_d = np.triu(pair_dist<=4, k=1) # k=1 to exclude the diagonal
for i, j in zip(*np.nonzero(pair_d)):
    ...
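The equivalence of the two forms can be checked on a tiny hand-made matrix. This is only an illustration; the 3x3 boolean array below is invented and stands in for pair_dist<=4:

```python
import numpy as np

# Symmetric "too close" matrix for three particles (made-up values).
pair_d = np.array([[True,  True,  False],
                   [True,  True,  True],
                   [False, True,  True]])

# Nested loops over the strict upper triangle.
loop_pairs = [(i, j)
              for i in range(len(pair_d))
              for j in range(i + 1, len(pair_d))
              if pair_d[i, j]]

# Same pairs from np.triu + np.nonzero.
upper = np.triu(pair_d, k=1)  # k=1 zeroes the diagonal and everything below it
numpy_pairs = [(int(i), int(j)) for i, j in zip(*np.nonzero(upper))]
print(loop_pairs == numpy_pairs)
```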
The second problem (as you noted) is that the velocities are just switched and scaled instead of reflected. What we really want to do is negate and scale the component of each particle's velocity along the axis that connects them. Note that to do this we will need the vector connecting them, position[j] - position[i], and the length of that vector (which we already calculated). So unfortunately part of the cdist calculation gets repeated. Let's quit using cdist and do it ourselves instead. The goal here is to make two arrays diff and norm, where diff[i][j] is a vector pointing from particle i to j (so diff is a 3D array) and norm[i][j] is the distance between particles i and j. We can do this with numpy like so:
nop = number_of_particles
# Give pos a 3rd index so we can use np.repeat below
# equivalent to `pos3d = np.array([ position ])`
pos3d = position.reshape(1, nop, 2)
# 3D arrays with a repeated index so we can form combinations
# diff_i[i][j] = position[i] (for all j)
# diff_j[i][j] = position[j] (for all i)
diff_i = np.repeat(pos3d, nop, axis=1).reshape(nop, nop, 2)
diff_j = np.repeat(pos3d, nop, axis=0)
# diff[i][j] = vector pointing from position[i] to position[j]
diff = diff_j - diff_i
# norm[i][j] = sqrt( diff[i][j]**2 )
norm = np.linalg.norm(diff, axis=2)
# check for collisions and take the region above the diagonal
collided = np.triu(norm < radius, k=1)
for i, j in zip(*np.nonzero(collided)):
    # unit vector from i to j
    unit = diff[i][j] / norm[i][j]
    # flip velocity
    velocity[i] -= 1.9 * np.dot(unit, velocity[i]) * unit
    velocity[j] -= 1.9 * np.dot(unit, velocity[j]) * unit
    # push particle j to be radius units from i
    # This isn't particularly effective when 3+ points are close together
    position[j] += (radius - norm[i][j]) * unit
...
Since this post is long enough already, here is a gist of the code with my modifications.
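The two np.repeat calls can also be replaced by broadcasting, which builds the same diff array without the intermediate copies. A sketch, assuming position has shape (n, 2) as above; the helper name pairwise_diff and the sample points are made up here:

```python
import numpy as np

def pairwise_diff(position):
    """Return diff and norm with diff[i][j] = position[j] - position[i]."""
    # position[None, :, :][i][j] is position[j]; position[:, None, :][i][j] is position[i].
    diff = position[None, :, :] - position[:, None, :]
    norm = np.linalg.norm(diff, axis=2)
    return diff, norm

position = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
diff, norm = pairwise_diff(position)
print(norm[0, 1], norm[0, 2])
```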

Python Beginner - How to equate a regression line from clicks and display graphically?

I am reading Python Programming by John Zelle and I am stuck on one of the exercises, shown in the picture below.
You can view my code below. I know the code is very ugly. (Any tips are appreciated)
Here's my code so far:
from graphics import *
def regression():
    # creating the window for the regression line
    win = GraphWin("Regression Line - Start Clicking!", 500, 500)
    win.setCoords(0.0, 0.0, 10.0, 10.0)
    rect = Rectangle(Point(0.5, 0.1), Point(2.5, 2.1))
    rect.setFill("red")
    rect.draw(win)
    Text(rect.getCenter(), "Done").draw(win)
    message = Text(Point(5, 0.5), "Click in this screen")
    message.draw(win)
    points = [] # list of points
    n = 0 # count variable
    sumX = 0
    sumY = 0
    while True:
        p = win.getMouse()
        p.draw(win)
        # if user clicks in a red square it exits the loop and calculates the regression line
        if (p.getX() >= 0.5 and p.getX() <= 2.5) and (p.getY() >= 0.1 and p.getY() <= 2.1):
            break
        n += 1 # count of the points
        # get the sum of the X and Y points
        sumX = sumX + p.getX()
        sumY = sumY + p.getY()
        # tuple of the X and Y points
        dot = (p.getX(), p.getY())
        points.append(dot)
    avgX = sumX / n
    avgY = sumY / n
    top = 0
    bottom = 0
    # my ugly attempt at the regression equation shown in the book
    for i in points:
        gp = 0
        numer = points[gp][0] * points[gp][1]
        top = top + numer
        denom = points[gp][0] ** 2
        bottom = bottom + denom
        gp += 1
    m = (top - sumX * sumY) / (bottom - sumX ** 2)
    y1 = avgY + m * (0.0 - avgX)
    y2 = avgY + m * (10.0 - avgX)
    regressionline = Line(Point(0, y1), Point(10.0, y2))
    regressionline.draw(win)
    raw_input("Press <Enter> to quit.")
    win.close()
regression()
When I run the program the regression line never appears to be the real line of best fit. I believe I am interpreting the regression equation incorrectly in my code. What needs to be changed to get the correct regression line?
Issues:
from my_library import * should be avoided; better to specify exactly what you want from it. This helps keep your namespace uncluttered.
you've got one massive block of code; better to split it into separate functions. This makes it much easier to think about and debug, and may help you reuse code later. Sure, it's a toy problem, you're not going to reuse it - but the whole point of doing exercises is to develop good habits, and factoring your code this way is definitely a good habit! A general rule of thumb - if a function contains more than about a dozen lines of code, you should consider splitting it further.
the exercise asks you to keep track of x, y, xx, and xy running sums while getting input points. I think this is kind of a bad idea - or at least more C-ish than Python-ish - as it forces you to do two different tasks at once (get points and do math on them). My advice would be: if you are getting points, get points; if you are doing math, do math; don't try doing both at once.
similarly, I don't like the way you've got the regression calculation worrying about where the sides of the window are. Why should it know or care about windows? I hope you like my solution to this ;-)
Here's my refactored version of your code:
from graphics import GraphWin, Point, Line, Rectangle, Text
def draw_window():
    # create canvas
    win = GraphWin("Regression Line - Start Clicking!", 500, 500)
    win.setCoords(0., 0., 10., 10.)
    # exit button
    rect = Rectangle(Point(0.5, 0.1), Point(2.5, 2.1))
    rect.setFill("red")
    rect.draw(win)
    Text(rect.getCenter(), "Done").draw(win)
    # instructions
    Text(Point(5., 0.5), "Click in this screen").draw(win)
    return win
def get_points(win):
    points = []
    while True:
        p = win.getMouse()
        p.draw(win)
        # clicked the exit button?
        px, py = p.getX(), p.getY()
        if 0.5 <= px <= 2.5 and 0.1 <= py <= 2.1:
            break
        else:
            points.append((px,py))
    return points
def do_regression(points):
    num = len(points)
    x_sum, y_sum, xx_sum, xy_sum = 0., 0., 0., 0.
    for x,y in points:
        x_sum += x
        y_sum += y
        xx_sum += x*x
        xy_sum += x*y
    x_mean, y_mean = x_sum/num, y_sum/num
    m = (xy_sum - num*x_mean*y_mean) / (xx_sum - num*x_mean*x_mean)
    def lineFn(xval):
        return y_mean + m*(xval - x_mean)
    return lineFn
def main():
    # set up
    win = draw_window()
    points = get_points(win)
    # show regression line
    lineFn = do_regression(points)
    Line(
        Point(0., lineFn(0. )),
        Point(10., lineFn(10.))
    ).draw(win)
    # wait to close
    Text(Point(5., 5.), "Click to exit").draw(win)
    win.getMouse()
    win.close()
if __name__=="__main__":
    main()
the for loop is all messed up! you have an i that changes in the loop, but then use gp which is always 0.
you want something more like:
for (X, Y) in points:
    numer += X * Y
    denom += X * X
...or move gp = 0 to before the for loop.
...or drop that part completely and add a sumXY and a sumXX to the sumX and sumY.
either way, once you fix that it should be ok (well, or maybe some other bug....).
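Beyond the loop index, the slope in the refactored do_regression is m = (sum(x*y) - n*mean(x)*mean(y)) / (sum(x*x) - n*mean(x)**2), so the plain sums sumX and sumY cannot stand in for the means. A small sketch that checks this formula against np.polyfit; the function name slope_intercept and the sample points are made up for illustration:

```python
import numpy as np

def slope_intercept(points):
    """Least-squares line through a list of (x, y) tuples."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    xy_sum = sum(x * y for x, y in points)
    xx_sum = sum(x * x for x, _ in points)
    m = (xy_sum - n * x_mean * y_mean) / (xx_sum - n * x_mean ** 2)
    return m, y_mean - m * x_mean

points = [(1.0, 2.0), (2.0, 2.9), (3.0, 4.2), (4.0, 4.8)]
m, c = slope_intercept(points)
# Reference fit of a degree-1 polynomial; returns (slope, intercept).
ref_m, ref_c = np.polyfit([p[0] for p in points], [p[1] for p in points], 1)
print(abs(m - ref_m) < 1e-9, abs(c - ref_c) < 1e-9)
```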
