I am trying to implement a simple neural net. I want to print the initial pattern, weights and activations. I then want it to print the learning process (i.e. every pattern it goes through as it learns). So far I have been unable to do this: it prints the initial and final pattern (when I put print p in appropriate places), but nothing else. Hints and tips appreciated - I'm a complete newbie to Python!
#!/usr/bin/python
import random
p = [ [1, 1, 1, 1, 1],
      [1, 1, 1, 1, 1],
      [0, 0, 0, 0, 0],
      [1, 1, 1, 1, 1],
      [1, 1, 1, 1, 1] ] # pattern I want the net to learn
n = 5
alpha = 0.01
activation = [] # unit activations
weights = [] # weights
output = [] # output
def initWeights(n): # set weights to zero, n is the number of units
    global weights
    weights = [[[0]*n]*n] # initialised to zero
def initNetwork(p): # initialises units to activation
    global activation
    activation = p
def updateNetwork(k): # pick unit at random and update k times
    for l in range(k):
        unit = random.randint(0,n-1)
        activation[unit] = 0
        for i in range(n):
            activation[unit] += output[i] * weights[unit][i]
        output[unit] = 1 if activation[unit] > 0 else -1
def learn(p):
    for i in range(n):
        for j in range(n):
            weights += alpha * p[i] * p[j]
You have a problem with the line:
weights = [[[0]*n]*n]
When you use * on a list, you replicate object references, not the objects themselves. You are reusing the same n-length list of zeroes every time. This will cause:
>>> weights[0][1][0] = 8
>>> weights
[[[8, 0, 0], [8, 0, 0], [8, 0, 0]]]
The first item of all the sublists is 8, because they are one and the same list. You stored the same reference multiple times, and so modifying the n-th item on any of them will alter all of them.
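A minimal sketch of one way around this, assuming a flat n-by-n weight matrix is what was intended (the extra outer list in the question looks accidental). Each row is built by a list comprehension, so every row is a distinct list. The name init_weights is hypothetical.

```python
def init_weights(n):
    """Return an n x n list of zeros in which every row is a separate list."""
    # [0.0] * n is evaluated once per iteration, so n distinct row lists are created.
    return [[0.0] * n for _ in range(n)]

weights = init_weights(3)
weights[0][1] = 8          # modify a single cell
print(weights)             # only the first row is affected
```

Multiplying the innermost list is still safe here because its elements are immutable numbers; the aliasing only bites when the replicated element is itself a mutable list.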
This line is where you get "IndexError: list index out of range":
output[unit] = 1 if activation[unit] > 0 else -1
because output = [] is empty, so there is no element at index unit to assign to. You should use output.append() or ...
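As a rough sketch of one alternative to appending, the list can be pre-sized so that every unit already has a slot. This assumes one output value per unit and an initial value of 0, neither of which is stated in the question.

```python
import random

n = 5
activation = [0.0] * n   # assumed starting activations, one per unit
output = [0] * n         # pre-sized, so output[unit] is a valid index for any unit

unit = random.randint(0, n - 1)                      # pick a unit at random
output[unit] = 1 if activation[unit] > 0 else -1     # no IndexError now
print(output)
```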
For a given array (1 or 2-dimensional) I would like to know, how many "patches" there are of nonzero elements. For example, in the array [0, 0, 1, 1, 0, 1, 0, 0] there are two patches.
I came up with a function for the 1-dimensional case, where I first assume the maximal number of patches and then decrease that number if a neighbor of a nonzero element is nonzero, too.
def count_patches_1D(array):
    patches = np.count_nonzero(array)
    for i in np.nonzero(array)[0][:-1]:
        if (array[i+1] != 0):
            patches -= 1
    return patches
I'm not sure if that method works for two dimensions as well. I haven't come up with a function for that case and I need some help for that.
Edit for clarification:
I would like to count connected patches in the 2-dimensional case, including diagonals. So an array [[1, 0], [1, 1]] would have one patch as well as [[1, 0], [0, 1]].
Also, I am wondering if there is a built-in Python function for this.
The following should work:
import numpy as np
import copy
# create an array
A = np.array(
    [
        [0, 1, 1, 1, 0, 1],
        [0, 0, 1, 0, 0, 0],
        [1, 0, 0, 1, 0, 1],
        [1, 0, 0, 0, 0, 1],
        [0, 0, 1, 0, 0, 1]
    ]
)
def isadjacent(pos, newpos):
    """
    Check whether two coordinates are adjacent
    """
    # adjacent (including diagonals) if no coordinate differs by more than 1
    return np.all(np.abs(np.array(newpos) - np.array(pos)) < 2)
def count_patches(A):
    """
    Count the number of non-zero patches in an array.
    """
    # get non-zero coordinates
    coords = np.nonzero(A)
    # add them to a list
    inipatches = list(zip(*coords))
    # list to contain all patches
    allpatches = []
    while len(inipatches) > 0:
        patch = [inipatches.pop(0)]
        i = 0
        # check for all points adjacent to the points within the current patch
        while True:
            curpatch = patch[i]
            remaining = copy.deepcopy(inipatches)
            for j in range(len(remaining)):
                if isadjacent(curpatch, remaining[j]):
                    patch.append(remaining[j])
                    inipatches.remove(remaining[j])
                    if len(inipatches) == 0:
                        break
            if len(inipatches) == 0 or i == len(patch) - 1:
                # no points remaining, or every point in the patch has been checked
                break
            i += 1
        allpatches.append(patch)
    return len(allpatches)
print(f"Number of patches is {count_patches(A)}")
Number of patches is 5
This should work for arrays with any number of dimensions.
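On the question of a ready-made function: it is not built into Python itself, but SciPy's ndimage.label does connected-component labelling. The sketch below assumes SciPy is installed and that "connected including diagonals" means full connectivity, which is what generate_binary_structure(ndim, ndim) produces.

```python
import numpy as np
from scipy import ndimage

def count_patches_scipy(array):
    """Count connected non-zero patches, treating diagonal neighbours as connected."""
    array = np.asarray(array)
    # Full connectivity: every cell in the 3x...x3 neighbourhood counts as adjacent.
    structure = ndimage.generate_binary_structure(array.ndim, array.ndim)
    _, num_patches = ndimage.label(array, structure=structure)
    return num_patches

A = np.array([
    [0, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 1],
])
print(count_patches_scipy(A))                          # 2-D case
print(count_patches_scipy([0, 0, 1, 1, 0, 1, 0, 0]))   # 1-D case
```

The first value returned by ndimage.label is an array of the same shape in which each patch carries its own integer label, which is handy if the patches themselves are needed and not just their count.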
I am looking for a way to speed up a specific operation on tensors in PyTorch. Since it is a general operation on matrices, I am open to answers in NumPy as well.
Let's say I have a tensor with values from 0 to N-1 (N=4) where each value repeats the same number of times (R=2).
import torch
x = torch.Tensor([0, 0, 1, 1, 2, 2, 3, 3])
In this case, it is sorted, but any permutation of x is also in the set of considered tensors X.
I am getting an input tensor with values from 0 to N-1 but without any constraints on the repetition.
z = torch.tensor([3, 2, 3, 0, 2, 3, 1, 2])
And I would like to find an efficient implementation of foo such that y = foo(z). y should be some permutation of x (from the set X) that tries to do as few changes in z as possible (in terms of Hamming distance), for example
y = torch.tensor([3, 2, 3, 0, 2, 0, 1, 1])
The trivial solution is to keep counting the number of elements with the same value, but it is extremely inefficient to process elements one-by-one for larger tensors:
def foo(z):
    R = 2
    N = 4
    counters = [0] * N
    # first, we replace extra elements with -1
    y = []
    for elem in z:
        if counters[elem] < R:
            counters[elem] += 1
            y.append(elem)
        else:
            y.append(-1)
    y = torch.tensor(y)
    assert torch.equal(y, torch.tensor([3, 2, 3, 0, 2, -1, 1, -1]))
    # second, we replace -1 by "unfilled" counters
    for i in range(len(y)):
        if y[i] == -1:
            first_unfilled = [n for n in range(N) if counters[n] < R][0]
            counters[first_unfilled] += 1
            y[i] = first_unfilled
    return y
assert torch.equal(y, foo(z))
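One possible loop-free version in NumPy is sketched below. It assumes len(z) == N * R and that all values lie in 0..N-1, as in the example. The name foo_vectorized is hypothetical. The idea is to compute, for each element, how many equal values precede it, mark those with rank >= R as surplus, and overwrite them with the under-represented values in ascending order, which mirrors the "first unfilled" rule of the loop.

```python
import numpy as np

def foo_vectorized(z, N, R):
    """Replace surplus occurrences in z so that every value in 0..N-1 appears R times."""
    z = np.asarray(z)
    counts = np.bincount(z, minlength=N)           # occurrences of each value
    order = np.argsort(z, kind="stable")           # groups equal values, keeps input order
    starts = np.cumsum(counts) - counts            # start of each value's group when sorted
    rank_sorted = np.arange(len(z)) - starts[z[order]]
    rank = np.empty_like(rank_sorted)
    rank[order] = rank_sorted                      # rank[i] = earlier occurrences of z[i]
    surplus = rank >= R                            # positions that must be overwritten
    deficit = np.clip(R - counts, 0, None)         # copies missing for each value
    y = z.copy()
    y[surplus] = np.repeat(np.arange(N), deficit)  # fill in ascending value order
    return y

z = np.array([3, 2, 3, 0, 2, 3, 1, 2])
print(foo_vectorized(z, N=4, R=2))
```

Every step is a sort, a cumulative sum or an indexed assignment, so the same approach should carry over to PyTorch tensor operations, although that translation is not shown here.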
I have implemented the Discounted Cumulative Gain (DCG) and Normalized Discounted Cumulative Gain (NDCG) in Python. I am not sure whether the code is correct or whether I forgot some important criteria for DCG and NDCG. Here is my code so far:
import numpy as np
def get_dcg_score(predictions: np.ndarray, test_interaction_matrix: np.ndarray, topK = 10) -> float:
    """
    predictions - np.ndarray - predictions of the recommendation algorithm for each user.
    test_interaction_matrix - np.ndarray - test interaction matrix for each user.
    returns - float - mean dcg score over all user.
    """
    score = None
    # TODO: YOUR IMPLEMENTATION.
    score = []
    for idx, (pred,test) in enumerate(zip(predictions,test_interaction_matrix)):
        print(idx,pred,test)
        for i, (j,jj) in enumerate(zip(pred[:topK], test[:topK])):
            if i == 0 and jj == 1:
                sc = jj
                score.append(sc)
            if i != 0 and jj == 1:
                sc = jj / np.log2(j+2)
                score.append(sc)
            if (i != 0 and jj == 0) or ( i == 0 and jj == 0):
                continue
    score = sum(score)/len(predictions)
    return score
I evaluate this on the two arrays.
predictions = np.array([[0, 1, 2, 3], [3, 2, 1, 0]])
test_interaction_matrix = np.array([[1, 0, 0, 0], [0, 0, 0, 1]])
dcg_score = get_dcg_score(predictions, test_interaction_matrix, topK=4)
print(dcg_score)
assert np.isclose(dcg_score, 1), "1 expected"
Now for the NDCG I need to implement Ideal Discounted Cumulative Gain (IDCG) first and then divide DCG by IDCG.
Here is what I have for NDCG.
def get_ndcg_score(predictions: np.ndarray, test_interaction_matrix: np.ndarray, topK = 10) -> float:
    """
    predictions - np.ndarray - predictions of the recommendation algorithm for each user.
    test_interaction_matrix - np.ndarray - test interaction matrix for each user.
    topK - int - topK recommendations should be evaluated.
    returns - average ndcg score over all users.
    """
    score = None
    # TODO: YOUR IMPLEMENTATION.
    score_idcg = []
    for i, (vp, vt) in enumerate(zip(predictions,test_interaction_matrix)):
        element_sorted = sorted(vp,reverse=True)
        for j, (ele_p, ele_vt) in enumerate(zip(element_sorted, vt)):
            if j == 0 and ele_vt == 1:
                scr = ele_vt
                score_idcg.append(scr)
            if j != 0 and ele_vt == 1:
                scr = ele_vt / np.log2(j+2)
                score_idcg.append(scr)
            if (j != 0 and ele_vt == 0) or (j == 0 and ele_vt == 0):
                continue
    print(score_idcg)
    score_idcg = sum(score_idcg)/len(predictions)
    print(score_idcg)
    score_dcg = get_dcg_score(predictions, test_interaction_matrix, topK = 4)
    score_ndcg = score_dcg / score_idcg
    return score_ndcg
Again I test it on these two arrays:
predictions = np.array([[0, 1, 2, 3], [3, 2, 1, 0], [1, 2, 3, 0], [-1, -1, -1, -1]])
test_interaction_matrix = np.array([[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]])
ndcg_score = get_ndcg_score(predictions, test_interaction_matrix, topK=4)
assert np.isclose(ndcg_score, 1), "ndcg score is not correct."
Could somebody please look at my code and find why I don't get the right result for ndcg test? I just can't figure it out. Please also look at dcg implementation as well if it is faulty. Sorry for the horrible code. Write me if you need more info. Any suggestion is appreciated.
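For comparison, here is a sketch of how the usual binary-relevance definitions could be coded. Several things are assumptions read off the test data and not stated in the question: each row of predictions is taken to be a ranked list of item indices, each row of the interaction matrix is 0/1 relevance per item, negative indices mean "no recommendation", and users with no relevant test item are skipped when averaging (the expected value of 1 seems to require that). The function names are hypothetical. Note that the discount depends on the rank position and the gain is looked up by the predicted item index.

```python
import numpy as np

def dcg_at_k(ranked_items, relevance, top_k):
    """DCG of one user's ranked item indices against a 0/1 relevance vector."""
    ranked_items = np.asarray(ranked_items)[:top_k]
    relevance = np.asarray(relevance)
    valid = ranked_items >= 0                         # assumption: negative = no recommendation
    gains = np.zeros(len(ranked_items))
    gains[valid] = relevance[ranked_items[valid]]     # relevance of the item placed at each rank
    discounts = 1.0 / np.log2(np.arange(len(ranked_items)) + 2)
    return float(np.sum(gains * discounts))

def idcg_at_k(relevance, top_k):
    """Best achievable DCG: all relevant items placed at the top ranks."""
    n_relevant = min(int(np.sum(relevance)), top_k)
    return float(np.sum(1.0 / np.log2(np.arange(n_relevant) + 2)))

def mean_ndcg(predictions, test_interaction_matrix, top_k=10):
    """Average per-user NDCG, skipping users without any relevant item (assumption)."""
    scores = []
    for ranked_items, relevance in zip(predictions, test_interaction_matrix):
        ideal = idcg_at_k(relevance, top_k)
        if ideal == 0:
            continue
        scores.append(dcg_at_k(ranked_items, relevance, top_k) / ideal)
    return float(np.mean(scores)) if scores else 0.0

predictions = np.array([[0, 1, 2, 3], [3, 2, 1, 0], [1, 2, 3, 0], [-1, -1, -1, -1]])
test_interaction_matrix = np.array([[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]])
print(mean_ndcg(predictions, test_interaction_matrix, top_k=4))
```

Normalising per user and then averaging is a different quantity from dividing the mean DCG by the mean IDCG, which is what the code in the question does.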
I want to convert [0, 0, 1, 0, 1, 0, 1, 0] to [2, 4, 6] using ortools.
Where "2", "4", "6" in the second list are the index of "1" in the first list.
Using the below code I could get a list [0, 0, 2, 0, 4, 0, 6, 0]. How can I get [2, 4, 6]?
from ortools.sat.python import cp_model
model = cp_model.CpModel()
solver = cp_model.CpSolver()
work = {}
days = 8
horizon = 7
for i in range(days):
    work[i] = model.NewBoolVar("work(%i)" % (i))
model.Add(work[0] == 0)
model.Add(work[1] == 0)
model.Add(work[2] == 1)
model.Add(work[3] == 0)
model.Add(work[4] == 1)
model.Add(work[5] == 0)
model.Add(work[6] == 1)
model.Add(work[7] == 0)
v1 = [model.NewIntVar(0, horizon, "") for _ in range(days)]
for d in range(days):
    model.Add(v1[d] == d * work[d])
status = solver.Solve(model)
print("status:", status)
vec = []
for i in range(days):
    vec.append(solver.Value(work[i]))
print("work",vec)
vec = []
for v in v1:
    vec.append(solver.Value(v))
print("vec1",vec)
You should see this output on the console,
status: 4
work [0, 0, 1, 0, 1, 0, 1, 0]
vec1 [0, 0, 2, 0, 4, 0, 6, 0]
Thank you.
Edit:
I also wish to get a result as [4, 6, 2].
For just 3 variables, this is easy. In pseudo code:
The max index is max(work[i] * i)
The min index is min(horizon - (horizon - i) * work[i])
The medium is sum(i * work[i]) - max_index - min_index
But that is cheating.
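The arithmetic of those three formulas can be checked in plain Python on the fixed work vector from the question. This is only a sanity check of the pseudo code, not a CP-SAT model: inside the solver the max and min would have to be expressed with the model's own constraints (for example AddMaxEquality and AddMinEquality), which this sketch does not attempt.

```python
work = [0, 0, 1, 0, 1, 0, 1, 0]   # solved values of the Boolean variables
horizon = 7                        # largest possible index

# Unselected days contribute 0, so the max picks the largest selected index.
max_index = max(w * i for i, w in enumerate(work))
# Unselected days contribute horizon, selected days contribute i.
min_index = min(horizon - (horizon - i) * w for i, w in enumerate(work))
# With exactly three selected days, the remaining one is the sum minus the extremes.
medium_index = sum(i * w for i, w in enumerate(work)) - max_index - min_index

print([min_index, medium_index, max_index])  # [2, 4, 6]
```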
If you want more than 3 variables, you will need parallel arrays of Boolean variables that indicate the rank of each variable.
Let me sketch the full solution.
You need to build a graph. The X axis is the variables. The Y axis is the ranks. You have horizontal arcs going right, and diagonal arcs going right and up. If the variable is selected, you need to use a diagonal arc, otherwise a horizontal arc.
If using a diagonal arc, you will assign the current variable to the rank of the tail of the arc.
Then you need to add constraints to make it a contiguous path:
mass conservation at each node
variable is selected -> one of the diagonal arcs must be selected
variable is not selected -> one of the horizontal arcs must be selected
bottom left node has one outgoing arc
top right node has one incoming arc
I'm attempting to convert a double summation formula into code, but can't figure out the correct matrix/vector representation of it.
The first summation is i to n, and the second is over j > i to n.
I'm guessing there is a much more efficient & pythonic way of writing this?
I resorted to nested for loops to just get it working but, as expected, it runs very slowly with a large dataset:
def wapc_denom(weights, vols):
    x = []
    y = []
    for i, wi in enumerate(weights):
        for j, wj in enumerate(weights):
            if j > i:
                x.append(wi * wj * vols[i] * vols[j])
        y.append(np.sum(x))
    return np.sum(y)
Edit:
Using guidance from smci's answer I think I have a potential solution:
def wapc_denom2(weights, vols):
    return np.sum(np.tril(np.outer(weights, vols.T)**2, k=-1))
Assuming you want to count every term only once (for that you have to move the x = [] into the outer loop), one cheap way of computing the sum would be:
Create mock data
weights = np.random.random(10)
vols = np.random.random(10)
Do the calculation
wv = weights * vols
result = (wv.sum()**2 - wv@wv) / 2
Check that it's the same
def wapc_denom(weights, vols):
    y = []
    for i, wi in enumerate(weights):
        x = []
        for j, wj in enumerate(weights):
            if j > i:
                x.append(wi * wj * vols[i] * vols[j])
        y.append(np.sum(x))
    return np.sum(y)
assert np.allclose(result, wapc_denom(weights, vols))
Why does it work?
What we are doing is compute the sum of the full matrix, subtract the diagonal and divide by two. This is cheap because it is easy to verify that the sum of an outer product is just the product of the summed factors.
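Written out, with the shorthand a_i = w_i * vols_i (the symbol a is introduced here only for illustration), the identity behind the one-liner is:

```latex
\Bigl(\sum_{i=1}^{n} a_i\Bigr)^{2}
  = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j
  = \sum_{i=1}^{n} a_i^{2} + 2\sum_{i=1}^{n}\sum_{j>i} a_i a_j
\quad\Longrightarrow\quad
\sum_{i=1}^{n}\sum_{j>i} a_i a_j
  = \frac{1}{2}\Bigl[\Bigl(\sum_{i=1}^{n} a_i\Bigr)^{2} - \sum_{i=1}^{n} a_i^{2}\Bigr]
```

The first term on the right is wv.sum()**2 and the second is wv@wv, so the whole computation needs only O(n) time and memory.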
wi * wj * vols[i] * vols[j] is a telltale. vols is another vector, so first you want to compute the vector wv = w * vols.
Then (wj * vols[j]) * (wi * vols[i]) is an entry of the outer product of wv with itself, i.e. a column vector times a row vector. But you actually only want the sum, so I don't see a need to build the list with y.append(np.sum(x)); you're only going to sum it anyway with np.sum(y).
Also, the if j > i part means you only want the sum of one triangular part of that matrix, excluding the diagonal (the matrix is symmetric, so the upper and lower triangles give the same sum).
EDIT: the result is fully determined by wv alone. I didn't think we needed the matrix to get the sum, and we don't need the diagonal; @PaulPanzer found the most compact expression.
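For illustration only, the matrix route described in this answer might look like the sketch below, checked against the compact expression. The function name pairwise_sum_triu is hypothetical, and the sample vectors are arbitrary small integers.

```python
import numpy as np

def pairwise_sum_triu(weights, vols):
    """Sum of wv[i] * wv[j] over all pairs j > i, via the explicit outer product."""
    wv = np.asarray(weights) * np.asarray(vols)
    # k=1 keeps only entries strictly above the diagonal, i.e. j > i.
    return np.triu(np.outer(wv, wv), k=1).sum()

w = np.array([1, 2, 3, 4])
v = np.array([1, 3, 2, 2])
wv = w * v
closed_form = (wv.sum() ** 2 - wv @ wv) / 2
print(pairwise_sum_triu(w, v), closed_form)
```

The explicit matrix needs O(n^2) memory, so for large datasets the closed form is the better choice; the matrix version is mainly useful for seeing which terms are being summed.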
You can use triangular matrices in NumPy; check np.triu and np.meshgrid. Do:
np.prod(np.triu(np.meshgrid(weights,weights), 1) * np.triu(np.meshgrid(vols,vols), 1),0).sum(1).cumsum().sum()
Example:
w = np.arange(4) +1
v = np.array([1,3,2,2])
print(np.triu(np.meshgrid(w,w), k=1))
>>array([[[0, 2, 3, 4],
          [0, 0, 3, 4],
          [0, 0, 0, 4],
          [0, 0, 0, 0]],
         [[0, 1, 1, 1],
          [0, 0, 2, 2],
          [0, 0, 0, 3],
          [0, 0, 0, 0]]])
# example of prod + triu + meshgrid (your x values):
print(np.prod(np.triu(np.meshgrid(w,w), 1) * np.triu(np.meshgrid(v,v), 1),0))
>>array([[ 0,  6,  6,  8],
         [ 0,  0, 36, 48],
         [ 0,  0,  0, 48],
         [ 0,  0,  0,  0]])
print(np.prod(np.triu(np.meshgrid(w,w), 1) * np.triu(np.meshgrid(v,v), 1),0).sum(1).cumsum().sum())
>> 428
print(wapc_denom(w, v))
>> 428