Generate Dataset from plot

Hello, I have a problem where I need to generate a dataset from a distribution shown on a scatter plot: the data points are mostly concentrated around the centre of a circle, and the rest lie in a ring at a particular radius around it. Any ideas for generating such datasets in Python?

One way of producing a distribution over a circular shape is to sample a one-dimensional distribution for the radius and then stretch it over the full 2π circumference of a circle.
The radius can then be drawn from either a uniform or a normal distribution.
import matplotlib.pyplot as plt
import numpy as np

def dist(R=4., width=1., num=1000, uniform=True):
    # radius: uniform in [R, R+width) or normal around R with std "width"
    if uniform:
        r = np.random.rand(num)*width+R
    else:
        r = np.random.normal(R, width, num)
    # spread the sampled radii evenly over the full circle
    phi = np.linspace(0,2.*np.pi, len(r))
    x= r * np.sin(phi)
    y = r* np.cos(phi)
    return x,y

fig, ax = plt.subplots(ncols=2, figsize=(9,4))
ax[0].set_title("uniform")
x,y = dist()
ax[0].plot(x,y, linestyle="", marker="o", markersize=2)
x,y = dist(0,1.2, 400)
ax[0].plot(x,y, linestyle="", marker="o", markersize=2)
ax[1].set_title("normal")
x,y = dist(4,0.4, uniform=False)
ax[1].plot(x,y, linestyle="", marker="o", markersize=2)
x,y = dist(0,0.6, uniform=False)
ax[1].plot(x,y, linestyle="", marker="o", markersize=2)
for a in ax:
    a.set_aspect("equal")
plt.show()

You can easily generate random numbers from some distribution centred on a point, for example a normal distribution centred on (0, 0).
x = np.random.normal(size=1000)
y = np.random.normal(size=1000)
plt.plot(x, y, 'o', alpha=0.6)
EDIT:
Here we generate random points in polar coordinates. First we draw a random angle (between 0 and 2π), then we add noise by multiplying the unit direction by a random radius.
n = 300
theta_out = np.random.uniform(low=0, high=2*np.pi, size=n)
noise_out = np.random.uniform(low=0.9, high=1.1, size=n)
x_out = np.cos(theta_out) * noise_out
y_out = np.sin(theta_out) * noise_out
theta_in = np.random.uniform(low=0, high=2*np.pi, size=n)
noise_in = np.random.uniform(low=0, high=0.5, size=n)
x_in = np.cos(theta_in) * noise_in
y_in = np.sin(theta_in) * noise_in
ax = plt.gca()
ax.set_aspect('equal')
plt.plot(x_out, y_out, 'o')
plt.plot(x_in, y_in, 'o')
Note that the point density increases as the radius decreases, because the radius is sampled uniformly while the area of a thin ring grows with its radius.
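If an even density per unit area is wanted instead, one common option is to sample the squared radius uniformly and take its square root. The sketch below is only an illustration of that idea; the helper name `sample_annulus` and its parameters are assumptions, not part of the answer above.

```python
import numpy as np

def sample_annulus(n, r_min=0.0, r_max=1.0, rng=None):
    """Draw n points uniformly by area from the annulus r_min <= r <= r_max."""
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    # Sampling r**2 uniformly keeps the density per unit area constant.
    r = np.sqrt(rng.uniform(r_min**2, r_max**2, size=n))
    return r * np.cos(theta), r * np.sin(theta)

# inner blob and outer ring, with the same radii as in the snippet above
x_in, y_in = sample_annulus(300, 0.0, 0.5, rng=np.random.default_rng(0))
x_out, y_out = sample_annulus(300, 0.9, 1.1, rng=np.random.default_rng(1))
```

The returned arrays can be passed to plt.plot(x, y, 'o') exactly like x_in, y_in and x_out, y_out above.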

Related

How to relate size parameter of .scatter() with radius?

I want to draw some circles using ax3.scatter(x1, y1, s=r1, facecolors='none', edgecolors='r'), where:
x1 and y1 are the coordinates of these circles
r1 is the radius of these circles
I thought that passing s=r1 would give the correct radius, but that's not the case.
How can I fix this?

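If you change the value of r (currently 5) to your desired radius, it works. Note that s is specified in points squared, so the size is measured in points rather than data units. This is adapted from the "Scatter Plots With a Legend" example on matplotlib.org. Should be scatter plots with attitude!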
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(19680801)
fig, ax = plt.subplots()
for color in ['tab:blue', 'tab:orange', 'tab:green']:
    r = 5 #radius
    n = 750 #number of circles
    x, y = np.random.rand(2, n)
    #scale = 200.0 * np.random.rand(n)
    scale = 3.14159 * r**2 #CHANGE r
    ax.scatter(x, y, c=color, s=scale, label=color,
               alpha=0.3, edgecolors='none')
ax.legend()
ax.grid(True)
plt.show()
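If the radii are meant to be in data units (the same units as x1 and y1), one alternative to scatter is to draw Circle patches. This is a minimal sketch under that assumption; the helper name `add_circles` and the sample data are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
from matplotlib.patches import Circle

def add_circles(ax, xs, ys, radii, **kwargs):
    """Add circles whose radii are measured in data units (hypothetical helper)."""
    patches = [Circle((x, y), r) for x, y, r in zip(xs, ys, radii)]
    collection = PatchCollection(patches, **kwargs)
    ax.add_collection(collection)
    ax.autoscale_view()  # make the axes limits include the new circles
    return collection

# made-up sample data standing in for x1, y1, r1 from the question
rng = np.random.default_rng(19680801)
x1, y1 = rng.random((2, 20))
r1 = rng.uniform(0.02, 0.1, size=20)

fig, ax = plt.subplots()
circles = add_circles(ax, x1, y1, r1, facecolors='none', edgecolors='r')
ax.set_aspect('equal')  # otherwise the circles are drawn as ellipses
# plt.show()  # uncomment to display the figure
```

Unlike the s parameter of scatter, these circles rescale with the axes when zooming, because their size is tied to the data coordinates.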

I have a problem with plotting sphere and a curve on it

I am trying to plot a curve on a sphere, but I cannot plot both at the same time. I generated some points with Euclidean norm 10 for my curve, and some other points for a sphere of radius 10, as follows.
Points for curve:
random_numbers=[]
basevalues=np.linspace(-0.9,0.9,100)
for i in range(len(basevalues)):
    t=random.random()
    random_numbers.append(t*10)
xvalues=[random_numbers[i]*np.cos(basevalues[i]) for i in range(len(basevalues))]
yvalues=[random_numbers[i]*np.sin(basevalues[i]) for i in range(len(basevalues))]
zvalues=[np.sqrt(100-xvalues[i]**2-yvalues[i]**2)for i in range(len(basevalues))]
where xvalues, yvalues and zvalues are the Cartesian components of the curve points.
Points for sphere:
u = np.linspace(0, 2 * np.pi, 100)
v = np.linspace(0, np.pi, 100)
x = 10 * np.outer(np.cos(u), np.sin(v))
y = 10 * np.outer(np.sin(u), np.sin(v))
z = 10 * np.outer(np.ones(np.size(u)), np.cos(v))
where x, y and z are the Cartesian components of the sphere points.
My problem:
When I plot the curve without the sphere, it works. But when I plot them together, only the sphere is shown.
The whole code is the following:
import matplotlib.pyplot as plt
import numpy as np
import random
#Curve points
random_numbers=[]
basevalues=np.linspace(-0.9,0.9,100)
for i in range(len(basevalues)):
    t=random.random()
    random_numbers.append(t*10)
xvalues=[random_numbers[i]*np.cos(basevalues[i]) for i in range(len(basevalues))]
yvalues=[random_numbers[i]*np.sin(basevalues[i]) for i in range(len(basevalues))]
zvalues=[np.sqrt(100-xvalues[i]**2-yvalues[i]**2)for i in range(len(basevalues))]
# Sphere points
u = np.linspace(0, 2 * np.pi, 100)
v = np.linspace(0, np.pi, 100)
x = 10 * np.outer(np.cos(u), np.sin(v))
y = 10 * np.outer(np.sin(u), np.sin(v))
z = 10 * np.outer(np.ones(np.size(u)), np.cos(v))
# Plot the surface and curve
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
circ = ax.plot(xvalues,yvalues,zvalues, color='green',linewidth=1)
sphere=ax.plot_surface(x, y, z, color='r')
ax.set_zlim(-10, 10)
plt.xlabel("X axes")
plt.ylabel("Y axes")
plt.show()
What I want to occur:
I would like to plot the curve on the sphere, but this does not happen with my code. I would appreciate any hint.

If you use a "." option for plotting the points, like
circ = ax.plot(xvalues, yvalues,zvalues, '.', color='green', linewidth=1)
you will see the points on top of the sphere for certain viewing angles, but they sometimes disappear even when they are in front of the sphere. This is a known issue explained in the matplotlib documentation:
My 3D plot doesn’t look right at certain viewing angles:
This is probably the most commonly reported issue with mplot3d. The problem is that – from some viewing angles – a 3D object would appear in front of another object, even though it is physically behind it. This can result in plots that do not look “physically correct.”
In the same document, the developers recommend using Mayavi for more advanced 3D plots in Python.

Using spherical coordinates, you can easily do that:
## plot a circle on the sphere using spherical coordinate.
import numpy as np
import matplotlib.pyplot as plt
# a complete sphere
R = 10
theta = np.linspace(0, 2 * np.pi, 1000)
phi = np.linspace(0, np.pi, 1000)
x_sphere = R * np.outer(np.cos(theta), np.sin(phi))
y_sphere = R * np.outer(np.sin(theta), np.sin(phi))
z_sphere = R * np.outer(np.ones(np.size(theta)), np.cos(phi))
# a complete circle on the sphere
x_circle = R * np.sin(theta)
y_circle = R * np.cos(theta)
# 3d plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(x_sphere, y_sphere, z_sphere, color='blue', alpha=0.2)
ax.plot(x_circle, y_circle, 0, color='green')
plt.show()
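The circle above lies on the equator (z = 0). As a rough sketch, the same spherical-coordinate idea gives a circle at any other latitude; the function name `circle_on_sphere` and its default arguments are assumptions chosen for illustration.

```python
import numpy as np

def circle_on_sphere(R=10.0, polar_angle=np.pi / 3, num=200):
    """Points of a constant-latitude circle on a sphere of radius R."""
    azimuth = np.linspace(0.0, 2.0 * np.pi, num)
    x = R * np.sin(polar_angle) * np.cos(azimuth)
    y = R * np.sin(polar_angle) * np.sin(azimuth)
    z = np.full(num, R * np.cos(polar_angle))  # constant height
    return x, y, z

x_c, y_c, z_c = circle_on_sphere()
# every point has Euclidean norm R, so the curve lies on the sphere
norms = np.sqrt(x_c**2 + y_c**2 + z_c**2)
```

The three arrays can be passed to ax.plot(x_c, y_c, z_c, color='green') on the 3D axes used above.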

How to draw a matching Bell curve over a histogram?

My code so far is below; I'm very new to programming and have been trying for a while.
Here I apply the Box-Muller transform to generate two normally distributed samples from uniform random samples. Then I create a histogram for each of them.
Now I would like to compare the obtained histograms with "the real thing": a standard bell curve. How do I draw such a curve to match the histograms?
import numpy as np
import matplotlib.pyplot as plt
N = 10000
z1 = np.random.uniform(0, 1.0, N)
z2 = np.random.uniform(0, 1.0, N)
R_sq = -2 * np.log(z1)
theta = 2 * np.pi * z2
z1 = np.sqrt(R_sq) * np.cos(theta)
z2 = np.sqrt(R_sq) * np.sin(theta)
fig = plt.figure()
ax = fig.add_subplot(2, 1, 1)
ax.hist(z1, bins=40, range=(-4, 4), color='red')
plt.title("Histogram")
plt.xlabel("z1")
plt.ylabel("frequency")
ax2 = fig.add_subplot(2, 1, 2)
ax2.hist(z2, bins=40, range=(-4, 4), color='blue')
plt.xlabel("z2")
plt.show()

To obtain a kernel density estimate, scipy.stats.gaussian_kde calculates a function that fits the data.
To just draw a Gaussian normal curve, there is scipy.stats.norm. Subtracting the mean and dividing by the standard deviation adapts the position to the given data.
Both curves are drawn such that the area below the curve is one. To adjust them to the size of the histogram, the curves need to be scaled by the length of the data times the bin width. Alternatively, the curves can stay unscaled and the histogram can be normalized instead by adding the parameter hist(..., density=True).
In the demo code the data is deliberately distorted to illustrate the difference between the KDE and the Gaussian normal.
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
x = np.linspace(-4,4,1000)
N = 10000
z1 = np.random.randint(1, 3, N) * np.random.uniform(0, .4, N)
z2 = np.random.uniform(0, 1, N)
R_sq = -2 * np.log(z1)
theta = 2 * np.pi * z2
z1 = np.sqrt(R_sq) * np.cos(theta)
z2 = np.sqrt(R_sq) * np.sin(theta)
fig = plt.figure(figsize=(12,4))
for ind_subplot, zi, col in zip((1, 2), (z1, z2), ('crimson', 'dodgerblue')):
    ax = fig.add_subplot(1, 2, ind_subplot)
    ax.hist(zi, bins=40, range=(-4, 4), color=col, label='histogram')
    ax.set_xlabel("z"+str(ind_subplot))
    ax.set_ylabel("frequency")
    binwidth = 8 / 40
    scale_factor = len(zi) * binwidth
    gaussian_kde_zi = stats.gaussian_kde(zi)  # fit the KDE to this subplot's data
    ax.plot(x, gaussian_kde_zi(x)*scale_factor, color='springgreen', linewidth=3, label='kde')
    std_zi = np.std(zi)
    mean_zi = np.mean(zi)
    # divide by std_zi so the rescaled normal pdf still integrates to one
    ax.plot(x, stats.norm.pdf((x-mean_zi)/std_zi)/std_zi*scale_factor, color='black', linewidth=2, label='normal')
    ax.legend()
plt.show()
With the original values, z1 and z2 closely resemble a normal distribution, so the black line (the Gaussian normal fitted to the data) and the green line (the KDE) nearly coincide.
The current code first calculates the actual mean and standard deviation of the data. If you want to compare against a perfect standard normal, you should use the curve with mean zero and standard deviation one. You'll see they're almost identical on the plot.
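The comparison with the standard normal (mean zero, standard deviation one) can also be checked numerically without plotting. This is only a sketch: the seed, the variable names and the use of np.histogram in place of ax.hist are choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10000
u1 = 1.0 - rng.random(N)  # in (0, 1], avoids log(0)
u2 = rng.random(N)
z = np.sqrt(-2.0 * np.log(u1)) * np.cos(2.0 * np.pi * u2)  # Box-Muller

counts, edges = np.histogram(z, bins=40, range=(-4, 4))
centers = 0.5 * (edges[:-1] + edges[1:])
binwidth = edges[1] - edges[0]

# standard normal density (mean 0, standard deviation 1) at the bin centres
pdf = np.exp(-0.5 * centers**2) / np.sqrt(2.0 * np.pi)
expected = pdf * N * binwidth  # same scaling as the curve drawn over the histogram

max_abs_diff = np.max(np.abs(counts - expected))
```

Here expected holds the bar heights a perfect bell curve would give, so it can be compared bin by bin with counts, or plotted against centers on top of the histogram.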

How to create a circle with uniformly distributed dots in the perimeter of it with scatterplot in python

Suppose I have a circle of radius 20, x**2 + y**2 = 400.
Now I want to draw the circle as a scatter plot with n_dots dots on its perimeter. So I wrote the code below:
n_dots = 200
x1 = np.random.uniform(-20, 20, n_dots//2)
y1_1 = (400 - x1**2)**.5
y1_2 = -(400 - x1**2)**.5
plt.figure(figsize=(8, 8))
plt.scatter(x1, y1_1, c = 'blue')
plt.scatter(x1, y1_2, c = 'blue')
plt.show()
But the dots are not uniformly distributed around the circle.
So how can I create a circle in a scatter plot where all the dots are uniformly distributed along the perimeter?

A simple way to plot evenly spaced points along the perimeter of a circle is to divide the full circle into equal angles, which gives the angle from the circle's center to each point. The coordinates (x, y) of each point can then be computed from its angle. Here is code that does this:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure(figsize=(8, 8))
n_dots = 120 # set number of dots
angs = np.linspace(0, 2*np.pi, n_dots, endpoint=False) # angles to the dots (2*pi excluded, it would duplicate 0)
cx, cy = (50, 20) # center of circle
xs, ys = [], [] # for coordinates of points to plot
ra = 20.0 # radius of circle
for ang in angs:
    # compute (x,y) for each point
    x = cx + ra*np.cos(ang)
    y = cy + ra*np.sin(ang)
    xs.append(x) # collect x
    ys.append(y) # collect y
plt.scatter(xs, ys, c = 'red', s=5) # plot points
plt.show()
Alternatively, numpy's broadcasting can be used to shorten the code:
import matplotlib.pyplot as plt
import numpy as np
fig=plt.figure(figsize=(8, 8))
n_dots = 120 # set number of dots
angs = np.linspace(0, 2*np.pi, n_dots, endpoint=False) # angles to the dots (2*pi excluded, it would duplicate 0)
cx, cy = (50, 20) # center of circle
ra = 20.0 # radius of circle
# with numpy's broadcasting feature...
# no need to do loop computation as in above version
xs = cx + ra*np.cos(angs)
ys = cy + ra*np.sin(angs)
plt.scatter(xs, ys, c = 'red', s=5) # plot points
plt.show()
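If "uniformly distributed" is meant in the random sense rather than evenly spaced, a possible variant is to draw the angles from a uniform distribution. This is a sketch only; the function name `random_points_on_circle` and its defaults are assumptions.

```python
import numpy as np

def random_points_on_circle(n_dots, radius=20.0, center=(0.0, 0.0), rng=None):
    """Random points whose angle is uniformly distributed over the circle."""
    rng = np.random.default_rng() if rng is None else rng
    angs = rng.uniform(0.0, 2.0 * np.pi, size=n_dots)
    xs = center[0] + radius * np.cos(angs)
    ys = center[1] + radius * np.sin(angs)
    return xs, ys

xs, ys = random_points_on_circle(200, rng=np.random.default_rng(0))
```

Sampling the angle instead of x avoids the crowding seen in the question, where a uniform x places fewer points per unit of arc near x = ±20.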

For a very general answer that also works in 2D:
import numpy as np
import matplotlib.pyplot as plt
def u_sphere_pts(dim, N):
    """
    uniform distribution points on hypersphere
    from uniform distribution in n-D (<-1, +1>) hypercube,
    clipped by unit 2 norm to get the points inside the insphere,
    normalize selected points to lie on surface of unit radius hypersphere
    """
    # uniform points in hypercube
    u_pts = np.random.uniform(low=-1.0, high=1.0, size=(dim, N))
    # n dimensional 2 norm squared
    norm2sq = (u_pts**2).sum(axis=0)
    # mask of points where 2 norm squared < 1.0
    in_mask = np.less(norm2sq, np.ones(N))
    # use mask to select points, norms inside unit hypersphere
    in_pts = np.compress(in_mask, u_pts, axis=1)
    in_norm2 = np.sqrt(np.compress(in_mask, norm2sq)) # only sqrt selected
    # return normalized points, equivalently, projected to hypersphere surface
    return in_pts/in_norm2
# show some 2D "sphere" points
N = 1000
dim = 2
fig2, ax2 = plt.subplots()
ax2.scatter(*u_sphere_pts(dim, N))
ax2.set_aspect('equal')
plt.show()
# plot histogram of angles
pts = u_sphere_pts(dim, 1000000)
theta = np.arctan2(pts[0,:], pts[1,:])
num_bins = 360
fig1, ax1 = plt.subplots()
n, bins, patches = plt.hist(theta, num_bins, facecolor='blue', alpha=0.5)
plt.show()
similar/related:
https://stackoverflow.com/questions/45580865/python-generate-an-n-dimensional-hypercube-using-rejection-sampling#comment78122144_45580865
Python Uniform distribution of points on 4 dimensional sphere
http://mathworld.wolfram.com/HyperspherePointPicking.html
Sampling uniformly distributed random points inside a spherical volume
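The rejection step above discards a growing share of the samples as dim increases, so fewer than N points come back. A commonly used alternative, swapped in here as a different technique rather than the answer's method, is to normalize independent Gaussian samples; the function name `gaussian_sphere_pts` is an assumption.

```python
import numpy as np

def gaussian_sphere_pts(dim, N, rng=None):
    """Uniform points on the unit hypersphere via normalized Gaussian samples."""
    rng = np.random.default_rng() if rng is None else rng
    # an isotropic Gaussian has no preferred direction,
    # so the normalized vectors are uniform on the sphere
    g = rng.standard_normal(size=(dim, N))
    return g / np.linalg.norm(g, axis=0)

pts = gaussian_sphere_pts(3, 1000, rng=np.random.default_rng(0))
```

This variant always returns exactly N points, in the same (dim, N) layout as u_sphere_pts.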

Matplotlib - contour and quiver plot in projected polar coordinates

I need to plot contour and quiver plots of scalar and vector fields defined on an uneven grid in (r,theta) coordinates.
As a minimal example of the problem, consider the contour plot of a stream function for a magnetic dipole; the contours of such a function are streamlines of the corresponding vector field (in this case, the magnetic field).
The code below takes an uneven grid in (r,theta) coordinates, maps it to the cartesian plane and plots a contour plot of the stream function.
import numpy as np
import matplotlib.pyplot as plt
r = np.logspace(0,1,200)
theta = np.linspace(0,np.pi/2,100)
N_r = len(r)
N_theta = len(theta)
# Polar to cartesian coordinates
theta_matrix, r_matrix = np.meshgrid(theta, r)
x = r_matrix * np.cos(theta_matrix)
y = r_matrix * np.sin(theta_matrix)
m = 5
psi = np.zeros((N_r, N_theta))
# Stream function for a magnetic dipole
psi = m * np.sin(theta_matrix)**2 / r_matrix
contour_levels = m * np.sin(np.linspace(0, np.pi/2,40))**2.
fig, ax = plt.subplots()
# ax.plot(x,y,'b.') # plot grid points
ax.set_aspect('equal')
ax.contour(x, y, psi, 100, colors='black',levels=contour_levels)
plt.show()
For some reason though, the plot I get doesn't look right:
If I interchange x and y in the contour function call, I get the desired result:
The same thing happens when I try to make a quiver plot of a vector field defined on the same grid and mapped to the x-y plane, except that interchanging x and y in the function call no longer works.
It seems like I made a stupid mistake somewhere but I can't figure out what it is.

If psi = m * np.sin(theta_matrix)**2 / r_matrix
then psi increases as theta goes from 0 to pi/2 and psi decreases as r increases.
So along a contour line of psi, r must increase as theta increases. That results in a curve that goes counterclockwise as it radiates out from the center. This is consistent with the first plot you posted, and with the result returned by the first version of your code with
ax.contour(x, y, psi, 100, colors='black',levels=contour_levels)
An alternative way to confirm the plausibility of the result is to look at a surface plot of psi:
import numpy as np
import matplotlib.pyplot as plt
import mpl_toolkits.mplot3d.axes3d as axes3d
r = np.logspace(0,1,200)
theta = np.linspace(0,np.pi/2,100)
N_r = len(r)
N_theta = len(theta)
# Polar to cartesian coordinates
theta_matrix, r_matrix = np.meshgrid(theta, r)
x = r_matrix * np.cos(theta_matrix)
y = r_matrix * np.sin(theta_matrix)
m = 5
# Stream function for a magnetic dipole
psi = m * np.sin(theta_matrix)**2 / r_matrix
contour_levels = m * np.sin(np.linspace(0, np.pi/2,40))**2.
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection='3d')
ax.set_aspect('equal')
ax.plot_surface(x, y, psi, rstride=8, cstride=8, alpha=0.3)
ax.contour(x, y, psi, colors='black',levels=contour_levels)
plt.show()
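The same conclusion can be checked numerically by solving psi = c for r along a single contour. This is a small sketch; the contour level c = 2 and the theta range are arbitrary values picked for illustration.

```python
import numpy as np

m = 5.0
c = 2.0  # an arbitrary contour level, chosen for illustration
theta = np.linspace(0.1, np.pi / 2, 50)

# solve psi = m * sin(theta)**2 / r = c for r
r = m * np.sin(theta)**2 / c
psi_on_contour = m * np.sin(theta)**2 / r  # equals c everywhere by construction

# Cartesian coordinates of the contour line
x, y = r * np.cos(theta), r * np.sin(theta)
```

Since sin(theta)**2 grows on (0, pi/2), r increases with theta along the contour, which matches the counterclockwise, outward-moving curves in the first plot.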
