For the following code whose job is to perform Monte Carlo integration for a function f, I was wondering what would happen if I define f as y = sqrt(1-x^2), which is the equation for a unit quarter circle, and specify an endpoint that is greater than 1, since we know that f is only defined for 0<x<1.
import numpy as np
import matplotlib.pyplot as plt
def definite_integral_show(f, x0, x1, N):
    """Approximate the definite integral of f(x)dx between x0 and x1 using
    N random points
    Arguments:
    f -- a function of one real variable, must be nonnegative on [x0, x1]
    x0, x1 -- the endpoints of the integration interval
    N -- the number of random points to use
    """
    #First, let's compute fmax. We do that by evaluating f(x) on a grid
    #of points between x0 and x1
    #This assumes that f is generally smooth. If it's not, we're in trouble!
    x = np.arange(x0, x1, 0.01)
    y = f(x)
    print(y)
    f_max = max(y)
    #Now, let's generate the random points. The x's should be between
    #x0 and x1, so we first create points between 0 and (x1-x0), and
    #then add x0
    #The y's should be between 0 and fmax
    #
    # 0...(x1-x0)
    x_rand = x0 + np.random.random(N)*(x1-x0)
    print(x_rand)
    y_rand = 0 + np.random.random(N)*f_max
    #Now, let's find the indices of the points above and below
    #the curve. That is, for points below the curve, let's find
    # i s.t. y_rand[i] < f(x_rand)[i]
    #And for points above the curve, find
    # i s.t. y_rand[i] >= f(x_rand)[i]
    ind_below = np.where(y_rand < f(x_rand))
    ind_above = np.where(y_rand >= f(x_rand))
    #Finally, let's display the results
    plt.plot(x, y, color = "red")
    pts_below = plt.scatter(x_rand[ind_below[0]], y_rand[ind_below[0]], color = "green")
    pts_above = plt.scatter(x_rand[ind_above[0]], y_rand[ind_above[0]], color = "blue")
    plt.legend((pts_below, pts_above),
               ('Pts below the curve', 'Pts above the curve'),
               loc='lower left',
               ncol=3,
               fontsize=8)
def f1(x):
    return np.sqrt(1-x**2)
definite_integral_show(f1, 0, 6, 200)
To my surprise, the program still works and gives me the following picture.
I suspect that it works because in NumPy, nan's in an array are just ignored when performing operations on the array. However, I don't understand why the picture only contains points whose x and y coordinates are both between 0 to 1. Where are the points that aren't within this range, but whose values are computed by
x_rand = x0 + np.random.random(N)*(x1-x0)
y_rand = 0 + np.random.random(N)*f_max
You can just print out the arrays (for example by generating only one random point) and see that they go into neither ind_below nor ind_above...
That's because every ordered or equality comparison that involves nan returns False; only != returns True. (See also: What is the rationale for all comparisons returning false for IEEE754 NaN values?) So y_rand < nan and y_rand >= nan both evaluate to False, and those points are never plotted.
The easiest way to change the code is
ind_below = np.where(y_rand < f(x_rand))
ind_above = np.where(~(y_rand < f(x_rand)))
(optionally only compute the array once)
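The NaN behaviour described in this answer can be checked on a tiny array. The following is a minimal sketch; the sample values are made up for illustration and are not taken from the question.

```python
import numpy as np

y_rand = np.array([0.2, 0.5, 0.9])
x_rand = np.array([0.5, 2.0, 3.0])  # the last two lie outside the domain of sqrt(1 - x**2)

with np.errstate(invalid="ignore"):  # silence the "invalid value" warnings
    fx = np.sqrt(1 - x_rand**2)      # [0.866..., nan, nan]
    below = y_rand < fx              # nan comparisons are False
    above = y_rand >= fx             # ... in both directions
    not_below = ~(y_rand < fx)       # negation catches the nan cases

print(below)      # [ True False False]
print(above)      # [False False False]
print(not_below)  # [False  True  True]
```

With the negated mask, the points whose f(x) is nan are counted as "above the curve" instead of silently disappearing.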
Question
Given a plotting window, how does one generate random points at the perimeter of a square (perimeter of the plotting window)?
Background and attempt
I found a similar question with regards to a rectangle in javascript.
I managed to write a program to generate random points within limits but the question is regarding how one could find random points with the condition that they are at the edge of the plot (either x is equal to 5 or -5 ,or y is equal to 5 or -5 in this case).
import numpy as np
import matplotlib.pyplot as plt
# Parameters
n = 6 # number of points
a = 5 # upper bound
b = -5 # lower bound
# Random coordinates [b,a) uniform distributed
coordy = (b - a) * np.random.random_sample((n,)) + a # generate random y
coordx = (b - a) * np.random.random_sample((n,)) + a # generate random x
# Create limits (x,y)=((-5,5),(-5,5))
plt.xlim((b,a))
plt.ylim((b,a))
# Plot points
for i in range(n):
    plt.plot(coordx[i],coordy[i],'ro')
plt.show()
Summary
So to summarize, my question is how do I generate random coordinates given that they are at the edge of the plot/canvas. Any advice or help will be appreciated.
Here is what you can do:
from random import choice
import matplotlib.pyplot as plt
from numpy.random import random_sample
n = 6
a = 5
b = -5
plt.xlim((b,a))
plt.ylim((b,a))
for i in range(n):
    r = (b - a) * random_sample() + a
    random_point = choice([(choice([a,b]), r),(r, choice([a,b]))])
    plt.scatter(random_point[0],random_point[1])
plt.show()
Output:
One possible approach (though not very elegant) is the following: handle the horizontal and vertical edges separately. Suppose you want to draw a point on the top or bottom edge of the window. Then,
Select randomly the y coordinate as b or -b (that is, the lower or the upper bound)
Select randomly (uniform distribution) the x coordinate
Similar approach for right and left edges of the window.
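The steps above can be sketched with NumPy as follows. This is only an illustration: the function name random_edge_points and the 50/50 choice between horizontal and vertical edges are assumptions, not part of the original answer.

```python
import numpy as np

def random_edge_points(n, lo, hi, rng=None):
    """Return n points on the boundary of the square [lo, hi] x [lo, hi]."""
    rng = np.random.default_rng() if rng is None else rng
    free = rng.uniform(lo, hi, size=n)     # coordinate that varies along the edge
    fixed = rng.choice([lo, hi], size=n)   # coordinate pinned to one of the bounds
    on_horizontal = rng.random(n) < 0.5    # True: top/bottom edge, False: left/right edge
    x = np.where(on_horizontal, free, fixed)
    y = np.where(on_horizontal, fixed, free)
    return x, y

x, y = random_edge_points(6, -5, 5, rng=np.random.default_rng(0))
```

For a square all four edges have the same length, so picking an edge with equal probability keeps the points uniform along the perimeter.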
Hope that helps.
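You could use this, but it assumes you want to discard the points that turn out not to be on the edge. Since coordx is a NumPy array (which has no pop method), filter it with a boolean mask instead of removing items inside a loop:
coordx = coordx[coordx == a]
And then do the same for y. Note that with continuous random values an exact match with a is very unlikely, so almost every point will be discarded.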
Geometrically speaking, being on the edge requires that a point satisfy certain conditions. Assuming that we are talking about a grid whose dimensions are defined by x ~ [0, a] and y ~ [0, b]:
The y-coordinate is either 0 or b, with the x-coordinate within [0, a], or
The x-coordinate is either 0 or a, with the y-coordinate within [0, b]
There is obviously more than one way to go about this, but here is a simple method to get you started.
import numpy as np
import matplotlib.pyplot as plt
def plot_edges(n_points, x_max, y_max, x_min=0, y_min=0):
    # if x_max - x_min = y_max - y_min, plot a square
    # otherwise, plot a rectangle
    # points on the bottom/top edges: x is free, y is pinned to y_min or y_max
    horizontal_edge_x = np.random.uniform(x_min, x_max, n_points)
    horizontal_edge_y = np.asarray([y_min, y_max])[
        np.random.randint(2, size=n_points)
    ]
    # points on the left/right edges: x is pinned to x_min or x_max, y is free
    vertical_edge_x = np.asarray([x_min, x_max])[
        np.random.randint(2, size=n_points)
    ]
    vertical_edge_y = np.random.uniform(y_min, y_max, n_points)
    # plot generated points
    plt.scatter(horizontal_edge_x, horizontal_edge_y)
    plt.scatter(vertical_edge_x, vertical_edge_y)
    plt.show()
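For a rectangle whose sides differ in length, choosing each edge with equal probability makes the short edges denser than the long ones. One possible alternative, sketched here under the assumption that the points should be uniform along the perimeter, is to draw an arc-length position and map it onto the four edges. The function name random_perimeter_points is hypothetical.

```python
import numpy as np

def random_perimeter_points(n, x_min, x_max, y_min, y_max, rng=None):
    """Sample n points uniformly along the perimeter of a rectangle."""
    rng = np.random.default_rng() if rng is None else rng
    w, h = x_max - x_min, y_max - y_min
    t = rng.uniform(0, 2 * (w + h), size=n)  # position along the unrolled perimeter
    x = np.empty(n)
    y = np.empty(n)
    bottom = t < w
    right = (t >= w) & (t < w + h)
    top = (t >= w + h) & (t < 2 * w + h)
    left = t >= 2 * w + h
    x[bottom], y[bottom] = x_min + t[bottom], y_min
    x[right], y[right] = x_max, y_min + (t[right] - w)
    x[top], y[top] = x_max - (t[top] - w - h), y_max
    x[left], y[left] = x_min, y_max - (t[left] - 2 * w - h)
    return x, y

x, y = random_perimeter_points(6, -5, 5, -5, 5, rng=np.random.default_rng(0))
```

Each edge receives a share of the points proportional to its length, so the density per unit length is the same on all four sides.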
Can you try this out?
import random
import numpy as np
import matplotlib.pyplot as plt
# Parameters
n = 6 # number of points
a = 5 # upper bound
b = -5 # lower bound
coordx,coordy=[],[]
for i in range(n):
    xy = random.choice(['x','y'])
    if xy=='x':
        coordx.append(random.choice([b,a])) # pin x to the left or right edge
        coordy.append(random.uniform(b,a)) # generate random y in [b,a]
    if xy=='y':
        coordx.append(random.uniform(b,a)) # generate random x in [b,a]
        coordy.append(random.choice([b,a])) # pin y to the bottom or top edge
# Create limits (x,y)=((-5,5),(-5,5))
plt.xlim((b,a))
plt.ylim((b,a))
# Plot points
for i in range(n):
    plt.plot(coordx[i],coordy[i],'ro')
plt.show()
Here is a sample output:
Here's a way to do that:
import numpy as np
import matplotlib.pyplot as plt
# Parameters
n = 6 # number of points
a = 5 # upper bound
b = -5 # lower bound
# Random coordinates [b,a) uniform distributed
coordy = (b - a) * np.random.random_sample((n,)) + a # generate random y
coordx = (b - a) * np.random.random_sample((n,)) + a # generate random x
# This is the new code
reset_axis = np.random.choice([True, False], n) # select which axis to reset
reset_direction = np.random.choice([a,b], n) # select to go up / right or down / left
coordx[reset_axis] = reset_direction[reset_axis]
coordy[~reset_axis] = reset_direction[~reset_axis]
# end of new code.
# Create limits (x,y)=((-5,5),(-5,5))
plt.xlim((b,a))
plt.ylim((b,a))
# Plot points
for i in range(n):
    plt.plot(coordx[i],coordy[i],'ro')
plt.show()
The result is:
Let's say that I have the following data (measurements):
As you can see, there are a lot of sharp points (i.e. where the slope changes a lot). It would therefore be good to take some more measurements around those points. To do that I wrote a script:
I calculate the curvature of 3 consecutive points:
Menger curvature: https://en.wikipedia.org/wiki/Menger_curvature#Definition
Then I decide which values I should resample, based on the curvature.
...and I iterate until the average curvature goes down... but it does not work, because it goes up. Do you know why?
Here is the complete code (I stopped it once the number of x values reached 60):
import numpy as np
import matplotlib.pyplot as plt
def curvature(A,B,C):
    """Calculates the Menger curvature from three points, given as numpy arrays.
    Sources:
    Menger curvature: https://en.wikipedia.org/wiki/Menger_curvature#Definition
    Area of a triangle given 3 points: https://math.stackexchange.com/questions/516219/finding-out-the-area-of-a-triangle-if-the-coordinates-of-the-three-vertices-are
    """
    # Pre-check: Making sure that the input points are all numpy arrays
    if any(x is not np.ndarray for x in [type(A),type(B),type(C)]):
        print("The input points need to be a numpy array, currently it is a ", type(A))
    # Augment Columns
    A_aug = np.append(A,1)
    B_aug = np.append(B,1)
    C_aug = np.append(C,1)
    # Calculate Area of Triangle
    matrix = np.column_stack((A_aug,B_aug,C_aug))
    area = 1/2*np.linalg.det(matrix)
    # Special case: Two or more points are equal
    if np.all(A == B) or np.all(B == C):
        curvature = 0
    else:
        curvature = 4*area/(np.linalg.norm(A-B)*np.linalg.norm(B-C)*np.linalg.norm(C-A))
    # Return Menger curvature
    return curvature
def values_to_calulate(x,curvature_list, max_curvature):
    """Calculates the new x values which need to be calculated
    For each triple whose curvature exceeds max_curvature, adds the point
    one third of the way from x[i] to x[i+2] """
    i = 0
    new_x = np.empty(0)
    for curvature in curvature_list:
        if curvature > max_curvature:
            new_x = np.append(new_x, x[i]+(x[i+2]-x[i])/3 )
        i = i+1
    return new_x
def plot(x,y, title, xLabel, yLabel):
    """Just to visualize"""
    # Plot
    plt.scatter(x,y)
    plt.plot(x, y, '-o')
    # Give a title for the plot
    plt.title(title)
    # Give x axis label for the plot
    plt.xlabel(xLabel)
    # Give y axis label for the plot
    plt.ylabel(yLabel)
    plt.grid(True, which='both')
    plt.axhline(y=0, color='k')
    # Display the plot without blocking the loop
    plt.show(block=False)
    plt.pause(0.05)
### STARTS HERE
# Get x values of the sine wave
x = np.arange(0, 10, 1);
# Amplitude of the sine wave is sine of a variable like time
def function(x):
    return 1+np.sin(x)*np.cos(x)**2
y = function(x)
# Plot it
plot(x,y, title='Data', xLabel='Time', yLabel='Amplitude')
continue_Loop = True
while continue_Loop:
    curvature_list = np.empty(0)
    for i in range(len(x)-2):
        # Get the three points
        A = np.array([x[i],y[i]])
        B = np.array([x[i+1],y[i+1]])
        C = np.array([x[i+2],y[i+2]])
        # Calculate the curvature
        curvature_value = abs(curvature(A,B,C))
        curvature_list = np.append(curvature_list, curvature_value)
    print("len: ", len(x) )
    print("average curvature: ", np.average(curvature_list))
    # Calculate the points that need to be added
    x_new = values_to_calulate(x,curvature_list, max_curvature=0.3)
    # Add those values to the current x list:
    x = np.sort(np.append(x, x_new))
    # STOPPED IT AFTER len(x) == 60
    if len(x) >= 60:
        continue_Loop = False
    # Recompute y at the refined x values
    y = function(x)
    # Plot it
    plot(x,y, title='Data', xLabel='Time', yLabel='Amplitude')
This is how it should look:
EDIT:
If you let it run even further... :
To summarize my comments above:
You are computing the average curvature of your curve, which has no reason to go to 0. At every point, no matter how close your sample points get, the Menger curvature converges to the actual curvature of the curve at that point, not to 0.
An alternative would be to use the absolute change of the derivative between two points: keep sampling until abs(d(df/dx)) < some_threshold, where d(df/dx) = (df/dx)[n] - (df/dx)[n-1]
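A rough sketch of the derivative-change criterion suggested above could look like this. The function name refine, the bisection of the neighbouring intervals, and the max_points safety cap are all assumptions made for the illustration; the original comment only specifies the stopping condition.

```python
import numpy as np

def f(x):
    # same test function as in the question
    return 1 + np.sin(x) * np.cos(x)**2

def refine(f, x, threshold=0.2, max_points=200):
    """Add midpoints until the slope change between neighbouring
    intervals is below threshold everywhere (or max_points is reached)."""
    x = np.asarray(x, dtype=float)
    while len(x) < max_points:
        y = f(x)
        slopes = np.diff(y) / np.diff(x)     # df/dx on each interval
        jump = np.abs(np.diff(slopes))       # |d(df/dx)| at each interior point
        bad = np.where(jump > threshold)[0]  # jump[k] belongs to point x[k+1]
        if bad.size == 0:
            break
        # bisect the intervals on both sides of every flagged point
        idx = np.unique(np.concatenate([bad, bad + 1]))
        mids = 0.5 * (x[idx] + x[idx + 1])
        x = np.sort(np.concatenate([x, mids]))
    return x

x_refined = refine(f, np.arange(0, 10, 1.0))
print(len(x_refined))
```

Unlike the average curvature, the slope change between neighbouring intervals shrinks as the spacing shrinks (for a smooth function), so this loop has a natural stopping point.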
I have an application that requires a disk populated with 'n' points in a quasi-random fashion. I want the points to be somewhat random, but still have a more or less regular density over the disk.
My current method is to place a point, check if it's inside the disk, and then check if it is also far enough away from all other points already kept. My code is below:
import os
import random
import math
# ------------------------------------------------ #
# geometric constants
center_x = -1188.2
center_y = -576.9
center_z = -3638.3
disk_distance = 2.0*5465.6
disk_diam = 5465.6
# ------------------------------------------------ #
pts_per_disk = 256
closeness_criteria = 200.0
min_closeness_criteria = disk_diam/closeness_criteria
disk_center = [(center_x-disk_distance),center_y,center_z]
pts_in_disk = []
while len(pts_in_disk) < (pts_per_disk):
    potential_pt_x = disk_center[0]
    potential_pt_dy = random.uniform(-disk_diam/2.0, disk_diam/2.0)
    potential_pt_y = disk_center[1]+potential_pt_dy
    potential_pt_dz = random.uniform(-disk_diam/2.0, disk_diam/2.0)
    potential_pt_z = disk_center[2]+potential_pt_dz
    potential_pt_rad = math.sqrt((potential_pt_dy)**2+(potential_pt_dz)**2)
    if potential_pt_rad < (disk_diam/2.0):
        far_enough_away = True
        for pt in pts_in_disk:
            if math.sqrt((potential_pt_x - pt[0])**2+(potential_pt_y - pt[1])**2+(potential_pt_z - pt[2])**2) > min_closeness_criteria:
                pass
            else:
                far_enough_away = False
                break
        if far_enough_away:
            pts_in_disk.append([potential_pt_x,potential_pt_y,potential_pt_z])
outfile_name = "pt_locs_x_lo_"+str(pts_per_disk)+"_pts.txt"
outfile = open(outfile_name,'w')
for pt in pts_in_disk:
    outfile.write(" ".join([("%.5f" % (pt[0]/1000.0)),("%.5f" % (pt[1]/1000.0)),("%.5f" % (pt[2]/1000.0))])+'\n')
outfile.close()
In order to get the most even point density, what I do is basically iteratively run this script using another script, with the 'closeness' criteria reduced for each successive iteration. At some point, the script can not finish, and I just use the points of the last successful iteration.
So my question is rather broad: is there a better way to do this? My method is ok for now, but my gut says that there is a better way to generate such a field of points.
An illustration of the output is graphed below, one with a high closeness criteria, and another with a 'lowest found' closeness criteria (what I want).
A simple solution based on Disk Point Picking from MathWorld:
import numpy as np
import matplotlib.pyplot as plt
n = 1000
r = np.random.uniform(low=0, high=1, size=n) # radius
theta = np.random.uniform(low=0, high=2*np.pi, size=n) # angle
x = np.sqrt(r) * np.cos(theta)
y = np.sqrt(r) * np.sin(theta)
# for plotting circle line:
a = np.linspace(0, 2*np.pi, 500)
cx,cy = np.cos(a), np.sin(a)
fg, ax = plt.subplots(1, 1)
ax.plot(cx, cy,'-', alpha=.5) # draw unit circle line
ax.plot(x, y, '.') # plot random points
ax.axis('equal')
ax.grid(True)
fg.canvas.draw()
plt.show()
It gives.
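The square root on the radius is what makes the density uniform over the area. A quick numerical check, given here only as an illustrative sketch with an arbitrary seed and sample size, compares the fraction of points that land inside half the radius with and without the square root. A uniform disk should put about 25% of the points there, since that inner disk has a quarter of the area.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
u = rng.uniform(0, 1, n)

# radius = sqrt(u): P(radius < 0.5) = P(u < 0.25) = 0.25
frac_sqrt = np.mean(np.sqrt(u) < 0.5)
# radius = u (no square root): P(radius < 0.5) = 0.5, too dense near the center
frac_naive = np.mean(u < 0.5)

print(frac_sqrt, frac_naive)
```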
Alternatively, you also could create a regular grid and distort it randomly:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.tri as tri
n = 20
tt = np.linspace(-1, 1, n)
xx, yy = np.meshgrid(tt, tt) # create unit square grid
s_x, s_y = xx.ravel(), yy.ravel()
ii = np.argwhere(s_x**2 + s_y**2 <= 1).ravel() # mask off unwanted points
x, y = s_x[ii], s_y[ii]
triang = tri.Triangulation(x, y) # create triangluar grid
# distort the grid
g = .5 # distortion factor
rx = x + np.random.uniform(low=-g/n, high=g/n, size=x.shape)
ry = y + np.random.uniform(low=-g/n, high=g/n, size=y.shape)
rtri = tri.Triangulation(rx, ry, triang.triangles) # distorted grid
# for circle:
a = np.linspace(0, 2*np.pi, 500)
cx,cy = np.cos(a), np.sin(a)
fg, ax = plt.subplots(1, 1)
ax.plot(cx, cy,'k-', alpha=.2) # circle line
ax.triplot(triang, "g-", alpha=.4)
ax.triplot(rtri, 'b-', alpha=.5)
ax.axis('equal')
ax.grid(True)
fg.canvas.draw()
plt.show()
It gives
The triangles are just there for visualization. The obvious disadvantage is that depending on your choice of grid, either in the middle or on the borders (as shown here), there will be more or less large "holes" due to the grid discretization.
If you have a defined area like a disc (circle) that you wish to generate random points within you are better off using an equation for a circle and limiting on the radius:
x^2 + y^2 = r^2 (0 < r < R)
or parametrized to two variables
cos(a) = x/r
sin(a) = y/r
sin^2(a) + cos^2(a) = 1
To generate something like the pseudo-random distribution with low density you should take the following approach:
For randomly distributed ranges of r and a choose n points.
This allows you to generate your distribution to roughly meet your density criteria.
To understand why this works imagine your circle first divided into small rings of width dr, now imagine your circle divided into pie slices of angle da. Your randomness now has equal probability over the whole boxed area around the circle. If you divide the areas of allowed randomness throughout your circle you will get a more even distribution around the overall circle and small random variation for the individual areas giving you the pseudo-random look and feel you are after.
Now your job is just to generate n points for each given area. You will want to have n be dependent on r as the area of each division changes as you move out of the circle. You can proportion this to the exact change in area each space brings:
for the n-th to n+1-th ring:
d(Area,n,n-1) = Area(n) - Area(n-1)
The area of any given ring is:
Area = pi*(dr*n)^2 - pi*(dr*(n-1))^2
So the difference becomes:
d(Area,n,n-1) = [pi*(dr*n)^2 - pi*(dr*(n-1))^2] - [pi*(dr*(n-1))^2 - pi*(dr*(n-2))^2]
d(Area,n,n-1) = pi*[(dr*n)^2 - 2*(dr*(n-1))^2 + (dr*(n-2))^2]
You could expound this to gain some insight on how much n should increase but it may be faster to just guess at some percentage increase (30%) or something.
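Expanding the expressions above is short enough to do by hand. Writing A_n for the area of the n-th ring (a symbol introduced here for convenience), the derivation below suggests that the ring area grows linearly in n, so the number of points per ring would need to increase by a constant amount per ring instead of a fixed percentage.

```latex
\begin{align*}
A_n &= \pi (n\,dr)^2 - \pi \big((n-1)\,dr\big)^2
     = \pi\,dr^2 \big(n^2 - (n-1)^2\big)
     = \pi\,dr^2\,(2n - 1), \\
A_n - A_{n-1} &= \pi\,dr^2 \big[(2n - 1) - (2n - 3)\big]
     = 2\pi\,dr^2 .
\end{align*}
```

With a 0-based ring index k = n - 1, the ring area reads pi*dr^2*(2k+1).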
The example I have provided is a small subset and decreasing da and dr will dramatically improve your results.
Here is some rough code for generating such points:
import random
import math
R = 10.
n_rings = 10.
n_angles = 10.
dr = R/n_rings
da = 2*math.pi/n_angles
base_points_per_division = 3
increase_per_level = 1.1
points = []
ring = 0
while ring < n_rings:
    angle = 0
    while angle < n_angles:
        for i in range(int(base_points_per_division)):
            # random angle inside the current pie slice
            ra = angle*da + da*random.random()
            # random radius inside the current ring
            rr = ring*dr + dr*random.random()
            x = rr*math.cos(ra)
            y = rr*math.sin(ra)
            points.append((x,y))
        angle += 1
    base_points_per_division = base_points_per_division*increase_per_level
    ring += 1
I tested it with the parameters:
n_rings = 20
n_angles = 20
base_points = .9
increase_per_level = 1.1
And got the following results:
It looks more dense than your provided image, but I imagine further tweaking of those variables could be beneficial.
You can add an additional part to scale the density properly by calculating the number of points per ring.
points_per_ring = density*math.pi*(dr**2)*(2*n+1)
points_per_division = points_per_ring/n_angles
This will provide an even better scaled distribution.
density = .03
points = []
ring = 0
while ring < n_rings:
    angle = 0
    base_points_per_division = density*math.pi*(dr**2)*(2*ring+1)/n_angles
    while angle < n_angles:
        for i in range(int(base_points_per_division)):
            ra = angle*da + min(da,da*random.random())
            rr = ring*dr + dr*random.random()
            x = rr*math.cos(ra)
            y = rr*math.sin(ra)
            points.append((x,y))
        angle += 1
    ring += 1
Giving better results using the following parameters
R = 1.
n_rings = 10.
n_angles = 10.
density = 10/(dr*da) # ~ ten points per unit area
With a graph...
And for fun you can graph the divisions to see how well they match your distribution, and adjust.
Depending on how random the points need to be, it may be simple enough to just make a grid of points within the disk, and then displace each point by some small but random amount.
It may be that you want more randomness, but if you just want to fill your disc with an even-looking distribution of points that aren't on an obvious grid, you could try a spiral with a random phase.
import math
import random
import pylab
n = 300
alpha = math.pi * (3 - math.sqrt(5)) # the "golden angle"
phase = random.random() * 2 * math.pi
points = []
for k in range(n):
    theta = k * alpha + phase
    r = math.sqrt(float(k)/n)
    points.append((r * math.cos(theta), r * math.sin(theta)))
pylab.scatter(*zip(*points))
pylab.show()
Probability theory ensures that the rejection method is an appropriate method
to generate uniformly distributed points within the disk, D(0,r), centered at origin and of radius r. Namely, one generates points within the square [-r,r] x [-r,r], until a point falls within the disk:
do{
    generate P in [-r,r]x[-r,r];
}while(P[0]**2+P[1]**2>r**2);
return P;
unif_rnd_disk is a generator function implementing this rejection method:
import matplotlib.pyplot as plt
import numpy as np
import itertools
def unif_rnd_disk(r=1.0):
    pt=np.zeros(2)
    while True:
        yield pt
        while True:
            pt=-r+2*r*np.random.random(2)
            # accept only points inside the disk: x^2 + y^2 <= r^2
            if (pt[0]**2+pt[1]**2<=r**2):
                break
G=unif_rnd_disk()# generator of points in disk D(0,r=1)
X,Y=zip(*[pt for pt in itertools.islice(G, 1, 1000)])
plt.scatter(X, Y, color='r', s=3)
plt.axis('equal')
If we want to generate points in a disk centered at C(a,b), we have to apply a translation to the points in the disk D(0,r):
C=[2.0, -3.5]
plt.scatter(C[0]+np.array(X), C[1]+np.array(Y), color='r', s=3)
plt.axis('equal')
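The same rejection idea can also be written in vectorized form, which avoids the Python-level loop over single points. This is a sketch only; the function name sample_disk and the choice of drawing 2*n candidates per round are assumptions for the example.

```python
import numpy as np

def sample_disk(n, r=1.0, center=(0.0, 0.0), rng=None):
    """Return an (n, 2) array of points uniform in the disk of radius r."""
    rng = np.random.default_rng() if rng is None else rng
    accepted = np.empty((0, 2))
    while len(accepted) < n:
        # candidates in the bounding square [-r, r] x [-r, r]
        cand = rng.uniform(-r, r, size=(2 * n, 2))
        inside = (cand**2).sum(axis=1) <= r**2   # keep x^2 + y^2 <= r^2
        accepted = np.vstack([accepted, cand[inside]])
    # translate the disk D(0, r) to the requested center
    return accepted[:n] + np.asarray(center)

pts = sample_disk(1000, r=2.0, center=(2.0, -3.5), rng=np.random.default_rng(1))
```

About pi/4 (roughly 78.5%) of the candidates fall inside the disk, so one round of 2*n candidates is usually enough.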
I would like to use the Fourier transform to find the center of a simulated entity under periodic boundary conditions. Periodic boundary conditions mean that whenever something exits through one side of the box, it is wrapped around to appear on the opposite side, just like in the classic game Asteroids.
What I have for each time frame is an Nx3 matrix, where N is the number of points and the columns are the xyz coordinates. What I want to do is determine the center of that cloud even if it has moved across the periodic boundary and is, so to speak, stuck in between.
My idea for a solution is to make a (mass-weighted) histogram of these points, perform an FFT on it, and use the phase of the first Fourier coefficient to determine where in the box the maximum is.
As a test case I have used
import numpy as np
Points_x = np.random.randn(10000)
Box_min = -10
Box_max = 10
X = np.linspace( Box_min, Box_max, 100 )
### make a Histogram of the points
Histogram_Points = np.bincount( np.digitize( Points_x, X ), minlength=100 )
### make an artifical shift over the periodic boundary
Histogram_Points = np.r_[ Histogram_Points[45:], Histogram_Points[:45] ]
So now I can use FFT since it expects a periodic function anyways.
## doing fft
F = np.fft.fft(Histogram_Points)
## getting rid of everything but first harmonic
F[2:] = 0.
## back transforming
Fist_harmonic = np.fft.ifft(F)
That way I get a sine wave with its maximum exactly where the maximum of the histogram is.
Now I'd like to extract the position of the maximum not by taking the max function on the sine vector, but somehow it should be retrievable from the first (not the 0th) Fourier coefficient, since that should somehow contain the phase shift of the sine to have its maximum exactly at the maximum of the histogram.
Indeed, plotting
Cos_approx = cos( linspace(0,2*pi,100) * angle(F[1]) )
will give
But I can't figure out how to get the position of the peak from this angle.
Using the FFT is overkill when all you need is one Fourier coefficient. Instead, you can simply compute the dot product of your data with
w = np.exp(-2j*np.pi*np.arange(N) / N)
where N is the number of points. (The time to compute all the Fourier coefficients with the FFT is O(N*log(N)). Computing just one coefficient is O(N).)
Here's a script similar to yours. The data is put in y; the coordinates of the data points are in x.
import numpy as np
N = 100
# x coordinates of the data
xmin = -10
xmax = 10
x = np.linspace(xmin, xmax, N, endpoint=False)
# Generate data in y.
n = 35
y = np.zeros(N)
y[:n] = 1 - np.cos(np.linspace(0, 2*np.pi, n))
y[:n] /= 0.7 + 0.3*np.random.rand(n)
m = 10
y = np.r_[y[m:], y[:m]]
# Compute coefficient 1 of the discrete Fourier transform.
w = np.exp(-2j*np.pi*np.arange(N) / N)
F1 = y.dot(w)
print("F1 =", F1)
# Get the angle of F1 (in the interval [0,2*pi]).
angle = np.angle(F1.conj())
if angle < 0:
    angle += 2*np.pi
center_x = xmin + (xmax - xmin) * angle / (2*np.pi)
print("center_x = ", center_x)
# Create the first sinusoidal mode for the plot.
mode1 = (F1.real * np.cos(2*np.pi*np.arange(N)/N) -
         F1.imag*np.sin(2*np.pi*np.arange(N)/N))/np.abs(F1)
import matplotlib.pyplot as plt
plt.clf()
plt.plot(x, y)
plt.plot(x, mode1)
plt.axvline(center_x, color='r', linewidth=1)
plt.show()
This generates the plot:
To answer the question "Why F1.conj()?":
The complex conjugate of F1 is used because of the minus sign in
w = np.exp(-2j*np.pi*np.arange(N) / N) (which I used because it
is a common convention).
Since w can be written
w = np.exp(-2j*np.pi*np.arange(N) / N)
= cos(-2*pi*arange(N)/N) + 1j*sin(-2*pi*arange(N)/N)
= cos(2*pi*arange(N)/N) - 1j*sin(2*pi*arange(N)/N)
the dot product y.dot(w) is basically a projection of y onto
cos(2*pi*arange(N)/N) (the real part of F1) and -sin(2*pi*arange(N)/N)
(the imaginary part of F1). But when we figure out the phase of
the maximum, it is based on the functions cos(...) and sin(...). Taking
the complex conjugate accounts for the opposite sign of the sin()
function. If w = np.exp(2j*np.pi*np.arange(N) / N) were used instead, the
complex conjugate of F1 would not be needed.
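Two claims from this answer can be verified numerically: that the dot product with w equals coefficient 1 of the FFT, and that the angle of the conjugate recovers the peak position. The sketch below uses a synthetic cosine bump centered at an arbitrarily chosen index k0 = 80; that test signal is an assumption for illustration, not the data from the question.

```python
import numpy as np

N = 100
k = np.arange(N)
w = np.exp(-2j * np.pi * k / N)

# 1) The single dot product matches coefficient 1 of the full FFT.
rng = np.random.default_rng(0)
y_random = rng.random(N)
F1_dot = y_random.dot(w)
F1_fft = np.fft.fft(y_random)[1]

# 2) A periodic bump with its maximum at index k0 is located by the phase.
k0 = 80
y_bump = 1 + np.cos(2 * np.pi * (k - k0) / N)
F1 = y_bump.dot(w)
angle = np.angle(F1.conj()) % (2 * np.pi)   # map the angle into [0, 2*pi)
peak_index = angle / (2 * np.pi) * N

print(np.allclose(F1_dot, F1_fft))  # True
print(peak_index)
```

The recovered peak_index should come out at (approximately) 80, the position of the maximum of y_bump.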
You could calculate the circular mean directly on your data.
When calculating the circular mean, your data is mapped to -pi..pi. This mapped data is interpreted as angle to a point on the unit circle. Then the mean value of x and y component is calculated. The next step is to calculate the resulting angle and map it back to the defined "box".
import numpy as np
import matplotlib.pyplot as plt
Points_x = np.random.randn(10000)+1
Box_min = -10
Box_max = 10
Box_width = Box_max - Box_min
#Maps Points to Box_min ... Box_max with periodic boundaries
Points_x = (Points_x%Box_width + Box_min)
#Map Points to -pi..pi
Points_map = (Points_x - Box_min)/Box_width*2*np.pi-np.pi
#Calc circular mean
Pmean_map = np.arctan2(np.sin(Points_map).mean() , np.cos(Points_map).mean())
#Map back
Pmean = (Pmean_map+np.pi)/(2*np.pi) * Box_width + Box_min
#Plotting the result
plt.figure(figsize=(10,3))
plt.subplot(121)
plt.hist(Points_x, 100);
plt.plot([Pmean, Pmean], [0, 1000], c='r', lw=3, alpha=0.5);
plt.subplot(122,aspect='equal')
plt.plot(np.cos(Points_map), np.sin(Points_map), '.');
plt.ylim([-1, 1])
plt.xlim([-1, 1])
plt.grid()
plt.plot([0, np.cos(Pmean_map)], [0, np.sin(Pmean_map)], c='r', lw=3, alpha=0.5);