I am trying to plot a 1D lattice graph, but I run into the following error:
NetworkXPointlessConcept: the null graph has no paths, thus there is no average shortest path length
What is the problem with this code?
Thanks.
N = 1000
x = 0
for n in range(1, N, 10):
    lattice_1d_distance = list()
    d = 0
    lattice_1d = nx.grid_graph(range(1,n))
    d = nx.average_shortest_path_length(lattice_1d)
    lattice_1d_distance.append(d)
    x.append(n)
plt.plot(x, lattice_1d_distance)
plt.show()
According to the networkx documentation, the input to nx.grid_graph is a list of dimensions.
Example:
print(list(range(1,4)))
[1, 2, 3]
nx.draw(nx.grid_graph(list(range(1,4))))  # a two-dimensional graph: there are 3 entries, and one of them is 1
print(list(range(1,5)))
[1, 2, 3, 4]
nx.draw(nx.grid_graph([1,2,3,4]))  # a three-dimensional graph: there are 4 entries, and one of them is 1
Therefore, say you want to either 1. plot the average distance against an increasing number of dimensions, keeping a constant size for each dimension, or 2. plot the average distance against an increasing size per dimension, keeping a constant number of dimensions:
import networkx as nx
import matplotlib.pyplot as plt
N = 10
x = []
lattice_1d_distance = []
for n in range(1, N):
    d = 0
    lattice_1d = nx.grid_graph([2]*n)  # increasing number of dimensions; each dimension has the same length
    d = nx.average_shortest_path_length(lattice_1d)
    lattice_1d_distance.append(d)
    x.append(n)
plt.plot(x, lattice_1d_distance)
plt.show()
N = 10
x = []
lattice_1d_distance = []
for n in range(1, N):
    d = 0
    lattice_1d = nx.grid_graph([n,n])  # two-dimensional graphs; each dimension grows in length
    d = nx.average_shortest_path_length(lattice_1d)
    lattice_1d_distance.append(d)
    x.append(n)
plt.plot(x, lattice_1d_distance)
plt.show()
Also, pay attention to how the list variables are declared: in your code x is initialised to 0 (an integer, which has no append method), and lattice_1d_distance is re-created inside the loop, so it should be declared as a list once, before the loop.
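For reference, here is a minimal sketch of how the original 1D-lattice loop could look once those issues are fixed (assuming a single dimension of length n is what you intended; the loop starts at n = 2 so the graph is never empty):

import networkx as nx
import matplotlib.pyplot as plt

N = 1000
x = []                      # a list, not the integer 0
lattice_1d_distance = []    # created once, outside the loop

for n in range(2, N, 10):
    lattice_1d = nx.grid_graph([n])   # one dimension of length n -> a 1D lattice (assumed intent)
    d = nx.average_shortest_path_length(lattice_1d)
    lattice_1d_distance.append(d)
    x.append(n)

plt.plot(x, lattice_1d_distance)
plt.show()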
I would like to apply function func over each row of a 2D ndarray arr of shape n x m, with a provided list of arguments args (of length n). That is, for each row i the function is executed as func(arr[i, :], args[i]).
This task can be accomplished with np.fromiter (using a for loop):
iterable = (func(row, arg) for row, arg in zip(arr, args))
results = np.fromiter(iterable, dtype=int)
However, this can take some time in the case of large arrays. According to unutbu's answer, using numpy's python utility functions (e.g. np.apply_along_axis) does not provide a significant speedup. Is there a way to optimize this process?
To avoid falling into the XY problem trap, below is my original problem statement:
I have an ndarray representing an image, shaped n x m. This image undergoes processing during which a specific index i is calculated for each row. I want to compose an image of the original shape (n x m) using the data to the right of index i for each row. That is, I want to resample each row[i:] of length m - i to m samples. Note that I want to use my own implementation of the resampling function (I don't want to use scipy.signal.resample etc.).
EDIT:
Test code with an example func (count argument added to np.fromiter as suggested by LudvigH):
import numpy as np
import matplotlib.pyplot as plt
def simple_slant_range_correction(
    row, height, n_samples, max_ground_range, max_slant_range, slant_range_resolution
):
    ground_ranges = np.linspace(height, max_ground_range, n_samples)
    slant_ranges = np.sqrt(ground_ranges ** 2 + height ** 2)
    slant_ranges_indicies = slant_ranges / slant_range_resolution - 1
    slant_ranges_indicies_floor = np.floor(slant_ranges_indicies).astype(np.int16)
    slant_ranges_indicies_ceil = np.clip(
        slant_ranges_indicies_floor + 1, 0, n_samples - 1
    )
    weight = slant_ranges_indicies - slant_ranges_indicies_floor
    return (
        weight * row[slant_ranges_indicies_ceil]
        + (1 - weight) * row[slant_ranges_indicies_floor]
    ).astype(np.float32)
if __name__ == "__main__":
    # Test parameters
    n, m = 100, 100
    max_slant_range = 50
    slant_range_resolution = max_slant_range / m
    # Create some dummy data
    data = np.zeros((n, m))
    h_indicies = np.ones((n), dtype=int)
    for i in np.arange(0, n, 5):
        data[:i, :i] += i
        h_indicies[:i] += 1
    heights = h_indicies * slant_range_resolution
    max_ground_ranges = np.sqrt(max_slant_range ** 2 - heights ** 2)
    # Perform resampling based on h_index
    iters = (
        simple_slant_range_correction(
            row, height, m, max_ground_range, max_slant_range, slant_range_resolution
        )
        for row, height, max_ground_range in zip(data, heights, max_ground_ranges)
    )
    data_sampled = np.fromiter(iters, dtype=np.dtype((np.float32, m)), count=n)
    # Plot data
    fig, axs = plt.subplots(1, 2)
    axs[0].plot(h_indicies + 0.5, np.arange(n) + 0.5, c="red")
    axs[0].imshow(data, vmin=0, vmax=data.max())
    axs[1].imshow(data_sampled, vmin=0, vmax=data.max())
    axs[0].set_axis_off()
    axs[1].set_axis_off()
    plt.tight_layout()
    plt.show()
It is typically faster to take advantage of vectorization by manipulating the data with numpy operations rather than with Python functions and objects. Below is an example of a way to solve the problem described at the end of your question using numpy vectorization.
import numpy as np
Choosing some array and column indices as an example:
#     1 2 3 3                    1
# A = 4 5 6 6      row_indices = 3
#     7 8 9 9                    2
A = np.array([[1,2,3,3],[4,5,6,6],[7,8,9,9]])
row_indices = np.array([1,3,2])
Use vector operations to build a boolean masking array and then multiply the original array by the mask:
N, M = np.shape(A)
col = np.arange(M, dtype=np.uint32)
B = np.outer(np.ones([1, N], dtype=np.uint32), col)
C = np.outer(row_indices, np.ones([1, M], dtype=np.uint32))
A_sampled = (B >= C) * A
print(A_sampled)
# output:
# 0 2 3 3
# 0 0 0 6
# 0 0 9 9
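As a side note (not part of the original answer), the same boolean mask can be built with numpy broadcasting instead of np.outer, which avoids allocating the two helper matrices explicitly:

# mask[i, j] is True where column j >= row_indices[i]
mask = np.arange(A.shape[1]) >= row_indices[:, None]
A_sampled = mask * A
print(A_sampled)   # same output as above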
I am currently trying to generate a random dataset based on a choice of k, the number of clusters, and xlim and ylim as the boundaries to be inputted. I want my output to be as follows:
array([[11.7282981 ,  6.89656728],
       [ 9.88391172,  5.83611126],
       [ 7.45631652,  7.88674093],
       [ 8.38232831,  7.82884638]])
This code is for a k-means project.
Here is my attempt. First, I create a cluster center, which is randomly generated within a range between 0 and the inputted xlim and ylim. Then I create 2 random points (2 in this case, but I will be doing 100) around the cluster center, with noise:
k = 2
xlim = 12
ylim = 12
f = []
for x in range(0,k):
    clusterCenter = [random.randint(0,xlim),random.randint(0,ylim)]
    cluster = np.random.randn(2, 2) + clusterCenter
    f.append(cluster)
f
Unfortunately, the output comes out to be:
[array([[11.7282981 , 6.89656728],
[ 9.88391172, 5.83611126]]),
array([[7.45631652, 7.88674093],
[8.38232831, 7.82884638]])]
which is not what I want, as I would like to put this into a pandas DataFrame. Can anyone help?
The numbers will be a lot greater. I have made it such that each generated cluster is a set of x and y co-ordinates, but ideally I would want:
cluster = np.random.randn(100, 2) + clusterCenter
So keep that in consideration! Any help would be greatly appreciated!
Replace f.append(cluster) with:
f = None # instead of []
...
if f is None:
    f = cluster
else:
    f = np.concatenate( (f, cluster) )
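Putting it together, here is a minimal sketch of the full loop with that change, plus the conversion to a pandas DataFrame you mention (the column names "x" and "y" are just placeholders):

import random
import numpy as np
import pandas as pd

k = 2
xlim = 12
ylim = 12

f = None
for _ in range(k):
    clusterCenter = [random.randint(0, xlim), random.randint(0, ylim)]
    cluster = np.random.randn(100, 2) + clusterCenter   # 100 noisy points around the centre
    if f is None:
        f = cluster
    else:
        f = np.concatenate((f, cluster))

df = pd.DataFrame(f, columns=["x", "y"])   # column names are only an example
print(df.head())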
Currently working on a task which requires me to plot a probability mass function on a graph. The mass function I have is for a biased coin being tossed three times:
P(H) = 0.75
P(T) = 0.25
X = 0,1,2,3
F(0) = P(X=0) = P(t,t,t) = 0.015625
F(1) = P(X=1) = P(h,t,t) + P(t,h,t) + P(t,t,h) = 0.140625
F(2) = P(X=2) = P(h,h,t) + P(h,t,h) + P(t,h,h) = 0.421875
F(3) = P(X=3) = P(h,h,h) = 0.421875
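(As a sanity check, these values follow the binomial distribution; a quick verification, assuming scipy is available:)

from scipy.stats import binom
for k in range(4):
    print(k, binom.pmf(k, n=3, p=0.75))   # 0.015625, 0.140625, 0.421875, 0.421875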
When I plot these points using the following code:
import numpy as np
import matplotlib.pyplot as plt
prob = np.array([0,0.015625,0.140625,0.421875,0.421875])
x = np.arange(0,3)
plt.bar(x,prob, width = 0.5)
plt.xlim(0.5,3.5)
plt.show()
I am met with this error:
ValueError: shape mismatch: objects cannot be broadcast to a single shape
The shape of the x array must match the shape of the prob array. I can suggest the following:
import numpy as np
import matplotlib.pyplot as plt
prob = np.array([0.015625,0.140625,0.421875,0.421875])
x = np.arange(4)
plt.bar(x, prob, width = 0.5)
plt.xticks(x)
plt.xlim(-0.5,3.5)
plt.show()
You have 5 elements in prob and 3 elements in x. matplotlib cannot draw the bar chart if the two arrays have different numbers of elements. Since you have 5 elements in prob, you need 5 positions on the x-axis to draw the bars.
Change x = np.arange(0, 3) to x = np.arange(0, 5) and plt.xlim(0.5,3.5) to plt.xlim(0.5,4.5), and you should get the plot.
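For completeness, a minimal sketch of the question's snippet with just those two changes applied (the leading 0 in prob is kept, as described above):

import numpy as np
import matplotlib.pyplot as plt

prob = np.array([0, 0.015625, 0.140625, 0.421875, 0.421875])
x = np.arange(0, 5)          # 5 x positions to match the 5 entries in prob
plt.bar(x, prob, width = 0.5)
plt.xlim(0.5, 4.5)
plt.show()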
I am iteratively plotting the np.exp results of 12 rows of data from a 2D array (12,5000), out_array. All data share the same x values, (x_d). I want the first 4 iterations to all plot as the same color, the next 4 to be a different color, and next 4 a different color...such that I have 3 different colors each corresponding to the 1st-4th, 5th-8th, and 9th-12th iterations respectively. In the end, it would also be nice to define these sets with their corresponding colors in a legend.
I have researched cycler (https://matplotlib.org/examples/color/color_cycle_demo.html), but I can't figure out how to assign colors into sets of iterations > 1. (i.e. 4 in my case). As you can see in my code example, I can have all 12 lines plotted with different (default) colors -or- I know how to make them all the same color (i.e. ...,color = 'r',...)
plt.figure()
for i in range(out_array.shape[0]):
    plt.plot(x_d, np.exp(out_array[i]), linewidth = 1, alpha = 0.6)
plt.xlim(-2,3)
I expect a plot like this, only with a total of 3 different colors, each corresponding to the chunks of iterations described above.
Another solution:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(10)
color = ['r', 'g', 'b']
for i in range(12):
    plt.plot(x, i*x, color[i//4])  # i//4 maps iterations 0-3, 4-7 and 8-11 to colors 0, 1 and 2
plt.show()
plt.figure()
n = 0
color = ['r','g','b']
for i in range(out_array.shape[0]):
    n = n + 1
    if n/4 <= 1:
        c = 1
    elif n/4 > 1 and n/4 <= 2:
        c = 2
    elif n/4 > 2:
        c = 3
    else:
        print(n)
    plt.plot(x_d, np.exp(out_array[i]), color = color[c-1])
plt.show()
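To also get the legend the question asks for, one option (just a sketch; the x_d and out_array below are dummy stand-ins for your data) is to label only the first line of each group of four:

import numpy as np
import matplotlib.pyplot as plt

x_d = np.linspace(-2, 3, 50)            # stand-in for your x values
out_array = np.random.rand(12, 50)      # stand-in for your (12, 5000) array

colors = ['r', 'g', 'b']
labels = ['rows 1-4', 'rows 5-8', 'rows 9-12']

plt.figure()
for i in range(out_array.shape[0]):
    group = i // 4
    plt.plot(x_d, np.exp(out_array[i]), color=colors[group], linewidth=1, alpha=0.6,
             label=labels[group] if i % 4 == 0 else '_nolegend_')   # one legend entry per group
plt.xlim(-2, 3)
plt.legend()
plt.show()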
I have a data set of 2 1D arrays. My goal is to count the points in each section of a grid (with a size of my choosing).
import numpy as np
import matplotlib.pyplot as plt
plt.figure(figsize=(8,7))
np.random.seed(5)
x = np.random.random(100)
y = np.random.random(100)
plt.plot(x,y,'bo')
plt.grid(True)
My Plot
I would like to be able to split each section into its own unique set of two 1D arrays (or one 2D array).
import numpy as np
def split(arr, cond):
    return [arr[cond], arr[~cond]]
a = np.array([1,3,5,7,2,4,6,8])
print(split(a, a<5))
This will return a list of two arrays: the values below 5 ([1, 3, 2, 4]) and the values of 5 or more ([5, 7, 6, 8]).
Try using this function based on the conditions you set (intervals of 0.2 it seems)
NOTE: to implement this correctly for your problem, you'll have to modify the split function seeing that you want to split the data into more than two sections. I'll leave that as an exercise for you to do :)
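If it helps, one possible sketch of that extension (assuming 0.2-wide bins over [0, 1], matching the default gridlines) uses np.digitize to assign each point a cell index and then masks per cell:

import numpy as np

np.random.seed(5)
x = np.random.random(100)
y = np.random.random(100)

edges = np.arange(0.2, 1.0, 0.2)        # interior bin edges: 0.2, 0.4, 0.6, 0.8 (assumed bin width)
cell_x = np.digitize(x, edges)          # column index 0..4 for each point
cell_y = np.digitize(y, edges)          # row index 0..4 for each point

# one pair of 1D arrays (x values, y values) per grid cell
sections = {(i, j): (x[(cell_x == i) & (cell_y == j)], y[(cell_x == i) & (cell_y == j)])
            for i in range(5) for j in range(5)}
print(len(sections[(0, 0)][0]))         # number of points in the bottom-left cell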
This function takes in two 1D arrays and returns a 2D matrix, in which each element is the number of points in the grid section corresponding to your image:
import numpy as np
def count_points(arr1, arr2, bin_width):
    x = np.floor(arr1/bin_width).astype(int)  # Bin number for each value
    y = np.floor(arr2/bin_width).astype(int)  # Bin number for each value
    counts = np.zeros(shape=(max(x)+1, max(y)+1), dtype=int)
    for i in range(x.shape[0]):
        row = max(y) - y[i]
        col = x[i]
        counts[row, col] += 1
    return counts
Note that x and y don't line up with the column and row indices, since the origin is at the bottom left in the plot but the "origin" (index [0, 0]) of the matrix is the top left. I rearranged the matrix so that the elements line up with what you see in the plot.
Example:
np.random.seed(0)
x = np.random.random(100)
y = np.random.random(100)
print(count_points(x, y, 0.2)) # 0.2 matches the default gridlines in matplotlib
# Output:
#[[8 4 5 4 0]
# [2 5 5 7 4]
# [7 1 3 8 3]
# [4 2 5 3 4]
# [4 4 3 1 4]]
Which matches the counts in the plot from the question.
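As a side note (not part of the original answer), numpy's built-in np.histogram2d can produce the same grid counts directly; a sketch assuming 5 bins of width 0.2 over [0, 1]:

import numpy as np

np.random.seed(0)
x = np.random.random(100)
y = np.random.random(100)

H, xedges, yedges = np.histogram2d(x, y, bins=5, range=[[0, 1], [0, 1]])
counts = np.flipud(H.T).astype(int)   # transpose and flip so rows run top-to-bottom like the plot
print(counts)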