I'm working on an animated bar plot to show how the number frequencies of rolling a six-sided die converge the more you roll the die. I'd like to show the number frequencies after each iteration, and for that I have to get a list of the number frequencies for that iteration in another list. Here's the code so far:
import numpy as np
import numpy.random as rd
rd.seed(23)
n_samples = 3
freqs = np.zeros(6)
frequencies = []
for roll in range(n_samples):
    x = rd.randint(0, 6)
    freqs[x] += 1
    print(freqs)
    frequencies.append(freqs)
print()
for x in frequencies:
    print(x)
Output:
[0. 0. 0. 1. 0. 0.]
[1. 0. 0. 1. 0. 0.]
[1. 1. 0. 1. 0. 0.]
[1. 1. 0. 1. 0. 0.]
[1. 1. 0. 1. 0. 0.]
[1. 1. 0. 1. 0. 0.]
Desired output:
[0. 0. 0. 1. 0. 0.]
[1. 0. 0. 1. 0. 0.]
[1. 1. 0. 1. 0. 0.]
[0. 0. 0. 1. 0. 0.]
[1. 0. 0. 1. 0. 0.]
[1. 1. 0. 1. 0. 0.]
The upper three lists do show the number frequencies after each iteration. However, when I append the list to the 'frequencies' list, every entry ends up showing the final number frequencies, as can be seen in the lower three lists. This one's got me stumped, and I am rather new to Python. How can I store each intermediate list, as shown in the first three lines of the output, inside another list? Thanks in advance!
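You can fix this by replacing frequencies.append(freqs) with frequencies.append(freqs.copy()). append stores a reference to the same array, so every entry in frequencies points at the one freqs object and always reflects its latest state. freqs.copy() creates an independent snapshot, so later changes to freqs won't change the copies that are already stored.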
import numpy as np
import numpy.random as rd
rd.seed(23)
n_samples = 3
freqs = np.zeros(6)
frequencies = []
for roll in range(n_samples):
    x = rd.randint(0, 6)
    freqs[x] += 1
    print(freqs)
    frequencies.append(freqs.copy())
print(frequencies)
print()
for x in frequencies:
    print(x)
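To see why the copy matters, here is a minimal sketch that stores the same array both ways. The names counts, aliased and snapshots are made up for this illustration and do not appear in the question.

```python
import numpy as np

counts = np.zeros(3)

aliased = []    # will hold references to the same array
snapshots = []  # will hold independent copies

for index in range(3):
    counts[index] += 1
    aliased.append(counts)
    snapshots.append(counts.copy())

# every entry of `aliased` is the very same object as `counts`
all_same_object = all(entry is counts for entry in aliased)

# a snapshot keeps the state it had when it was copied,
# an aliased entry shows whatever `counts` holds now
first_snapshot = snapshots[0].tolist()
first_aliased = aliased[0].tolist()

print(all_same_object)
print(first_snapshot)
print(first_aliased)
```

The aliased entries all change together because there is only one array behind them, which is the behaviour seen in the lower three lines of the question's output.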
Python keeps track of freqs as a single object: appending it stores a reference, not a snapshot, so its value keeps changing in every list entry even after it has been appended.
Here is a quick and dirty workaround:
import numpy as np
import numpy.random as rd
rd.seed(23)
n_samples = 3
freqs = np.zeros(6)
frequencies = []
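for roll in range(n_samples):
    x = rd.randint(0, 6)
    # update the running counts first so they accumulate across rolls
    freqs[x] += 1
    # then build an independent copy, element by element
    freqs_copy = []
    for item in freqs:
        freqs_copy.append(item)
    print(freqs_copy)
    frequencies.append(freqs_copy)
print()
for x in frequencies:
    print(x)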
The idea is to make a copy of "freqs" that would be independent of original "freqs". In the code above "freqs_copy" would be unique to each iteration.
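If the per-roll list is only needed for the animation, the loop can probably be avoided altogether. The following is just a sketch of an alternative, not what either answer proposes: it assumes the newer np.random.default_rng generator (so the rolls will differ from the rd.seed(23) run above) and builds all intermediate frequencies with a cumulative sum.

```python
import numpy as np

rng = np.random.default_rng(23)   # seed chosen only for reproducibility
n_samples = 3
n_faces = 6

# one roll per sample, encoded as a face index 0..5
rolls = rng.integers(0, n_faces, size=n_samples)

# one-hot encode each roll: row i has a single 1 in column rolls[i]
one_hot = np.eye(n_faces)[rolls]

# the running sum down the rows gives the frequencies after each roll
frequencies = np.cumsum(one_hot, axis=0)

for row in frequencies:
    print(row)
```

Each row of frequencies is already an independent state, so no copying is needed, and frequencies[i] can be handed straight to the bar plot for frame i.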
Originally I have a hashtag co-occurrence network stored in a dataframe like this:
0 ['#A', '#B', '#C', '#D']
1 ['#A', '#E']
2 ['#b', '#c', '#D']
3 ['#C', '#D']
Then I converted it into an adjacency matrix like this:
,#A,#B,#C,#D,#E,#F,#G,#H,#I,#J,#K
#A,0,1,1,0,1,1,1,1,0,1,0
#B,1,0,0,0,1,1,1,1,0,1,0
#C,1,0,0,0,1,1,1,1,0,1,0
...
I want to load the network into networkx in order to do the math and draw the graph. So I use the np.genfromtxt function to load the data into an ndarray. I have loaded the data successfully, but I don't know how to label the nodes.
mydata = genfromtxt(src5+fname[0], delimiter=',',encoding='utf-8',comments='**')
adjacency = mydata[1:,1:]
print(adjacency)
[[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]
By the way, can I just input the data from the original dataframe instead of using the adjacency matrix?
You can display both edge and node labels. Suppose you have the adjacency matrix and the hashtag list:
import numpy as np
# matrix from question
A = np.array([[0,1,1,0,1,1,1,1,0,1,0],
              [1,0,0,0,1,1,1,1,0,1,0],
              [1,0,0,0,1,1,1,1,0,1,0],
              [0,0,0,0,0,0,0,0,0,0,0],
              [1,1,1,0,0,0,0,0,0,0,0],
              [1,1,1,0,0,0,0,0,0,0,0],
              [1,1,1,0,0,0,0,0,0,0,0],
              [1,1,1,0,0,0,0,0,0,0,0],
              [0,0,0,0,0,0,0,0,0,0,0],
              [1,1,1,0,0,0,0,0,0,0,0],
              [0,0,0,0,0,0,0,0,0,0,0]])
# one label per row/column, in matrix order
labels = ['#A','#B','#C','#D','#E','#F','#G','#H','#I','#J','#K']
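If the matrix lives in a CSV file with the hashtags in the first row and first column, as in the question, both pieces can be read in one pass. This is only a sketch under that assumption: load_labelled_adjacency is a made-up helper name, and csv_text is a shortened stand-in for the real file with illustrative values.

```python
import csv
import io

import numpy as np

# Stand-in for the CSV file from the question (truncated to three hashtags;
# the values are illustrative, not the asker's real data).
csv_text = """,#A,#B,#C
#A,0,1,1
#B,1,0,0
#C,1,0,0
"""

def load_labelled_adjacency(handle):
    """Return (labels, matrix) from a CSV whose first row and column hold labels."""
    rows = list(csv.reader(handle))
    labels = rows[0][1:]   # header row without the empty corner cell
    # drop the label column and convert the remaining cells to floats
    matrix = np.array([row[1:] for row in rows[1:]], dtype=float)
    return labels, matrix

labels, A = load_labelled_adjacency(io.StringIO(csv_text))
print(labels)
print(A)
```

For a file on disk, the same helper should work with open(path, newline='', encoding='utf-8') in place of io.StringIO. Using the csv module here also sidesteps the '#' comment character that np.genfromtxt treats specially by default.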
Here is a visualisation example:
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
# labels to dict: node index -> hashtag
labels = {k: v for k, v in enumerate(labels)}
# create graph from the adjacency matrix
# (from_numpy_matrix was removed in NetworkX 3.0; from_numpy_array replaces it)
G = nx.from_numpy_array(A, parallel_edges=True)
# compute layout coords; k sets the optimal distance between nodes,
# larger values push the nodes further apart
coord = nx.spring_layout(G, k=0.55, iterations=20)
# optional: place the labels slightly above the nodes
node_label_coords = {}
for node, coords in coord.items():
    node_label_coords[node] = (coords[0], coords[1] + 0.04)
# draw the network, node and edge labels
plt.figure(figsize=(20, 14))
nx.draw_networkx_nodes(G, pos=coord)
nx.draw_networkx_edges(G, pos=coord)
nx.draw_networkx_edge_labels(G, pos=coord)
nx.draw_networkx_labels(G, pos=node_label_coords, labels=labels)
plt.show()
You can find more info on creating a graph from an adjacency matrix in the NetworkX documentation.
Update:
Refer to the set_node_attributes function to add attributes to your network nodes:
degree_centr = nx.degree_centrality(G)
nx.set_node_attributes(G, degree_centr, "degree")
nx.write_gexf(G, "test.gexf")
After saving the graph to a file with write_gexf, you'll have a file with attributes suitable for Gephi.
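As for starting from the original dataframe instead of the adjacency matrix: one possible route is to count co-occurring pairs directly from the hashtag lists. The sketch below assumes each dataframe row holds a plain Python list of hashtags; cooccurrence_edges is a hypothetical helper, and hashtag_lists just copies the four rows shown in the question.

```python
from collections import Counter
from itertools import combinations

# Stand-in for the dataframe column from the question: one hashtag list per row.
hashtag_lists = [
    ['#A', '#B', '#C', '#D'],
    ['#A', '#E'],
    ['#b', '#c', '#D'],
    ['#C', '#D'],
]

def cooccurrence_edges(tag_lists):
    """Count how often each unordered pair of hashtags shares a row."""
    counts = Counter()
    for tags in tag_lists:
        # sorted(set(...)) removes repeats within a row and fixes the pair order
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return [(u, v, w) for (u, v), w in counts.items()]

edges = cooccurrence_edges(hashtag_lists)
for edge in edges:
    print(edge)
```

The resulting (u, v, weight) triples can be passed to G.add_weighted_edges_from(edges) on an nx.Graph(), and since the nodes are then the hashtag strings themselves, no separate label dict is needed. Note that '#b' and '#B' are treated as different hashtags here; normalise the case first if that is not intended.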
I have a list of x, y points like in the picture above.
In code it looks like this:
np.array([[1.3,2.1],[1.5,2.2],[3.1,4.8]])
Now I would like to define a grid for which I can set the start, the number of columns and rows, and the row and column size, and then count the number of points in each cell.
In this example [0,0] has 1 point in it, [1,0] has 1, [2,0] has 3, [0,1] has 4 and so on.
I know it would probably be trivial to do by hand, even without numpy, but I need it to be as fast as possible, since I will have to process a ton of data this way.
What's a good way to do this? Basically, how do I create a 2D histogram of points? And more importantly, how can I do it as fast as possible?
I think numpy.histogram2d is the best option.
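import numpy as np
a = np.array([[1.3,2.1],[1.5,2.2],[3.1,4.8]])
# bin edges 0, 1, ..., 5 on both axes give a 5 x 5 grid of unit cells;
# H[i, j] counts the points whose x falls in bin i and whose y falls in bin j
H, _, _ = np.histogram2d(a[:, 0], a[:, 1], bins=(range(6), range(6)))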
print(H)
# [[0. 0. 0. 0. 0.]
# [0. 0. 2. 0. 0.]
# [0. 0. 0. 0. 0.]
# [0. 0. 0. 0. 1.]
# [0. 0. 0. 0. 0.]]
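To get a grid with a configurable start, cell size and number of columns and rows, the bin edges can be built explicitly and handed to np.histogram2d. The following is a sketch under those assumptions; grid_counts and its parameter names are invented for this example.

```python
import numpy as np

def grid_counts(points, origin, cell_size, n_cols, n_rows):
    """Count points per cell of a regular grid.

    origin    -- (x0, y0) of the grid's lower-left corner
    cell_size -- (width, height) of one cell
    Returns an integer array H where H[col, row] is the count for that cell.
    """
    x0, y0 = origin
    width, height = cell_size
    # n cells need n + 1 edges
    x_edges = x0 + width * np.arange(n_cols + 1)
    y_edges = y0 + height * np.arange(n_rows + 1)
    H, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=(x_edges, y_edges))
    return H.astype(int)

points = np.array([[1.3, 2.1], [1.5, 2.2], [3.1, 4.8]])
H = grid_counts(points, origin=(0.0, 0.0), cell_size=(1.0, 1.0), n_cols=5, n_rows=5)
print(H)
```

Points that fall outside the grid are simply not counted. The whole computation runs inside NumPy, so it should scale to large point sets much better than a Python loop.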
I have a PCM.txt file that contains the binary representation of PCM data, like below.
[1. 1. 0. 1. 0. 1. 1. 1. 1. 1. 0. 1. 1. 1. 1. 1. 1. 1. 0. 1. 1. 1. 0. 1.
0. 1. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 1.
0. 1. 0. 1. 0. 1. 0. 1. 1. 1. 1. 1. 0. 1. 1. 1. 1. 1. 1. 1. 0. 1. 1. 1.
0. 1. 0. 1. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0.
0. 1. 0. 1.]
1's meaning binary 1
0's meaning binary 0
This is nothing but 100 samples of data.
Is it possible to write Python code that reads this PCM.txt as input and plots the PCM data using matplotlib? Could you please give me some tips on how to implement this?
Will the plotted figure look like a square wave?
I think you want this:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(100)
y = [1.,1.,0.,1.,0.,1.,1.,1.,1.,1.,0.,1.,1.,1.,1.,1.,1.,1.,0.,1.,1.,1.,0.,1.,0.,1.,0.,1.,0.,0.,1.,0.,0.,0.,0.,0.,1.,0.,0.,0.,0.,0,0.,0.,1.,0.,0.,1.,0.,1.,0.,1.,0.,1.,0.,1.,1.,1.,1.,1.,0.,1.,1.,1.,1.,1.,1.,1.,0.,1.,1.,1.,0.,1.,0.,1.,0.,1.,0.,0.,1.,0.,0.,0.,0.,0.,1.,0.,0.,0.,0.,0.,0.,0.,1.,0.,0.,1.,0.,1.]
plt.step(x, y)
plt.show()
If you are having trouble reading the file, you can just use a regex to find things that look like numbers:
import matplotlib.pyplot as plt
import numpy as np
import re
# Slurp entire file
with open('PCM.txt') as f:
    s = f.read()
# Set y to anything that looks like a number, converted from text to float
# (plotting the raw strings would be treated as categorical data)
y = [float(v) for v in re.findall(r'[0-9]+\.?[0-9]*', s)]
# Set x according to number of samples found
x = np.arange(len(y))
# Plot that thing
plt.step(x, y)
plt.show()
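On the square wave question: a step plot holds each sample flat until the next one, so the result looks like a square wave wherever the bits alternate. As a rough sketch of what that means, the snippet below parses a shortened stand-in for the file contents (the text variable is an assumption, not the real PCM.txt) and builds the corner points of the staircase by hand.

```python
import numpy as np

# Text in the same shape as the printed array from the question (shortened).
text = "[1. 1. 0. 1. 0. 1.\n 1. 1.]"

# Drop the brackets, split on whitespace, convert to float.
samples = np.array(text.replace('[', ' ').replace(']', ' ').split(), dtype=float)

# Each sample i is held over the interval [i, i + 1), so every value is
# repeated at both ends of its interval; the vertical jumps appear where
# consecutive samples differ.
n = len(samples)
x_square = np.repeat(np.arange(n + 1), 2)[1:-1]
y_square = np.repeat(samples, 2)

print(x_square)
print(y_square)
```

Passing x_square and y_square to plt.plot should draw essentially the same staircase that plt.step produces from the raw samples.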