How is GRU outputting the following values?

When I run the following code
from tensorflow import keras
import numpy as np
x = np.ones((1,2,1))
model = keras.models.Sequential()
model.add(keras.layers.GRU(
    units=1, activation='tanh', recurrent_activation='sigmoid',
    use_bias=True, kernel_initializer='ones',
    recurrent_initializer='ones', bias_initializer='zeros',
    return_sequences=True))
model.predict(x)
I get the output array([[[0.20482421], [0.34675306]]], dtype=float32).
When I do this by hand, I get 0.55 for the first value.
Assuming no biases and all weights set to 1:
hidden_(t-1) = 0
update_gate = sigmoid(1x1 + 1x0) = 0.73
relevance_gate = sigmoid(1x1 + 1x0) = 0.73
candidate_h(t) = tanh( 1 x (0 x 0.73) + 1 x 1) = tanh(1) = 0.76
h(t) = 0.73*0.76 + (1 - 0.73)x0 = 0.55
so shouldn't the first value of the output be 0.55?

You swapped the two terms in the last line, the hidden-state update. The update gate weights the previous hidden state, and (1 - update gate) weights the candidate.
With x = 1 and Ht-1 = 0:
Zt = Rt = sigmoid(1 * 1 + 1 * 0) = 0.73105857863
H~t = tanh(1 * 1 + 1 * (Rt * 0)) = 0.761594155956
Ht = Zt ⊙ Ht-1 + (1 - Zt) ⊙ H~t
Since Ht-1 = 0, this reduces to Ht = (1 - Zt) ⊙ H~t.
Following the GRU formula, h(t) = 0.73105857863 * 0 + (1 - 0.73105857863) * 0.761594155956 = 0.2048242148, which matches the output 0.20482421.
For the next time step,
Rt = Sigmoid(1 * 1 + 1 * 0.20482421) = 0.769381871687
Zt = Sigmoid(1 * 1 + 1 * 0.20482421) = 0.769381871687
H~t = tanh(1 * 1 + 0.769381871687 * 0.20482421 * 1) = 0.8202522791
Ht = 0.769381871687 * 0.20482421 + (1 - 0.769381871687) * 0.8202522791 = 0.346753079407
This matches with final output of 0.34675306.
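The two time steps above can be reproduced with a short script. This is a minimal sketch, assuming a single-unit GRU with all weights equal to 1 and zero biases, as in the question; the helper names sigmoid and gru_step are made up for illustration and are not part of Keras.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(x, h_prev, w=1.0, u=1.0):
    """One scalar GRU step with zero biases (hypothetical helper)."""
    z = sigmoid(w * x + u * h_prev)               # update gate Zt
    r = sigmoid(w * x + u * h_prev)               # reset gate Rt
    h_cand = math.tanh(w * x + u * (r * h_prev))  # candidate state H~t
    # The update gate keeps the OLD state; (1 - z) lets the candidate in.
    return z * h_prev + (1.0 - z) * h_cand


h = 0.0
outputs = []
for x in [1.0, 1.0]:  # the two time steps of np.ones((1, 2, 1))
    h = gru_step(x, h)
    outputs.append(h)

print(outputs)  # approximately [0.2048, 0.3468]
```

Swapping z and (1 - z) in the return line gives the 0.55 from the hand calculation in the question.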
References:
https://d2l.ai/chapter_recurrent-modern/gru.html#hidden-state
https://pytorch.org/docs/stable/generated/torch.nn.GRU.html

Related

Setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (6000,)

I'm trying to pass fch1 to pywt.cwt to compute the coefficients, but the call raises the ValueError quoted in the title.
import numpy as np
import pywt
samplingFreq = 100
dataPeriod = 60 # sec
numSamples = samplingFreq * dataPeriod * 3
print(numSamples)
readDataFromFile = open('input.ecg', 'rb')
datatype = np.dtype('B')
filedata = np.fromfile(readDataFromFile, datatype)
# print(filedata)
p = list(filedata)
ecgarr = np.array(p)
totSamples = len(ecgarr)
numRows = int(totSamples / numSamples)
curateData = [[0] * numSamples for i in range(numRows)]
totEntries = int(numSamples / 3)
fch1 = [[0] * 1 for i in range(totEntries)]
# Divide the total samples in sets of 'numSamples'
xcntr = 0
for x in range(0, totSamples, numSamples):
    ycntr = 0
    for y in range(x, (x + numSamples), 1):
        curateData[xcntr][ycntr] = ecgarr[y]
        ycntr = ycntr + 1
    xcntr = xcntr + 1
## Convert curateData into a channels
cntr = 0
for x in range(0, numRows, 1):
    cntr = 0
    for y in range(1, int(numSamples / 3), 1):
        #print(y)
        fch1[cntr] = curateData[x][3 * (y - 1)]
    # Do CWT
    coef, freq = pywt.cwt(fch1, np.arange(1, 129), 'morl')
    print(coef)
I can't work out why I'm getting the error. It is raised on this line:
coef, freq = pywt.cwt(fch1, np.arange(1, 129), 'morl')
Any help would be appreciated.
Thank you
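One likely cause can be seen without pywt. In the code as posted, cntr never advances, so only fch1[0] is overwritten with a number while every other entry stays a one-element list; a list that mixes numbers and lists cannot be converted to a regular array. The sketch below uses small made-up data in place of input.ecg, and it assumes that channel 1 is every third sample of a row.

```python
# Toy stand-in for the question's data: 2 rows of 12 interleaved samples
# (3 channels), instead of the real ECG file. These values are made up.
num_samples = 12
curate_data = [list(range(row * 100, row * 100 + num_samples))
               for row in range(2)]
tot_entries = num_samples // 3

# What the posted loop does: cntr is never incremented, so only element 0
# receives a scalar and the remaining entries stay one-element lists.
fch1_ragged = [[0] for _ in range(tot_entries)]
cntr = 0
for y in range(1, tot_entries):
    fch1_ragged[cntr] = curate_data[0][3 * (y - 1)]
kinds = {type(v).__name__ for v in fch1_ragged}
print(kinds)  # contains both 'int' and 'list'

# A flat alternative: take every third sample of the row by slicing.
fch1_flat = curate_data[0][0::3]
print(fch1_flat)
```

A flat list of numbers like fch1_flat is the kind of 1-D input that converts cleanly to an array before the wavelet transform.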

Backpropagation in Andrew Trask's snippet

I'm trying to figure out just one line in the below snippet I got from here
import numpy as np
X = np.array([ [0,0,1],[0,1,1],[1,0,1],[1,1,1] ])
y = np.array([[0,1,1,0]]).T
alpha,hidden_dim = (0.5,4)
synapse_0 = 2*np.random.random((3,hidden_dim)) - 1
synapse_1 = 2*np.random.random((hidden_dim,1)) - 1
for j in range(60000):
    layer_1 = 1/(1+np.exp(-(np.dot(X,synapse_0))))
    layer_2 = 1/(1+np.exp(-(np.dot(layer_1,synapse_1))))
    layer_2_delta = (layer_2 - y)*(layer_2*(1-layer_2))
    layer_1_delta = layer_2_delta.dot(synapse_1.T) * (layer_1 * (1-layer_1))
    synapse_1 -= (alpha * layer_1.T.dot(layer_2_delta))
    synapse_0 -= (alpha * X.T.dot(layer_1_delta))
The line I cannot figure out is:
layer_1_delta = layer_2_delta.dot(synapse_1.T) * (layer_1 * (1-layer_1))
Specifically, why are we taking the dot product with synapse_1 instead of layer_1?
By using synapse_1 in the delta calculation, isn't the partial derivative taken with respect to the weights instead of the layer_1 output, which is what we want?
I think this is what layer_1_delta should actually be:
layer_1_delta = layer_1.T.dot(layer_2_delta) * (layer_1 * (1-layer_1))
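A finite-difference check on a one-unit version of the network shows which factor belongs where. This is a sketch under stated assumptions: a single hidden activation l1, a single weight w1, and a squared-error loss 0.5 * (layer_2 - y)^2, which is the loss that the (layer_2 - y) term in the snippet corresponds to. The numeric values are arbitrary.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def loss(l1, w1, y):
    """Squared error of a one-unit output layer (assumed loss)."""
    l2 = sigmoid(l1 * w1)
    return 0.5 * (l2 - y) ** 2


l1, w1, y = 0.6, -0.8, 1.0  # arbitrary illustrative values

l2 = sigmoid(l1 * w1)
layer_2_delta = (l2 - y) * l2 * (1 - l2)

# d(l1 * w1)/d(l1) = w1, so the error passed back to layer_1 uses the WEIGHT.
grad_wrt_l1 = layer_2_delta * w1
# d(l1 * w1)/d(w1) = l1, so the weight update uses the ACTIVATION.
grad_wrt_w1 = layer_2_delta * l1

eps = 1e-6
numeric_l1 = (loss(l1 + eps, w1, y) - loss(l1 - eps, w1, y)) / (2 * eps)
numeric_w1 = (loss(l1, w1 + eps, y) - loss(l1, w1 - eps, y)) / (2 * eps)

print(grad_wrt_l1, numeric_l1)
print(grad_wrt_w1, numeric_w1)
```

In the matrix version the same split appears in the shapes: layer_2_delta.dot(synapse_1.T) has the shape of layer_1, while layer_1.T.dot(layer_2_delta) has the shape of synapse_1 and is the term used in the synapse_1 update.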

Optimizing function parameters

I'll briefly explain what the attached code should do. We fix the number of runs beforehand (runs = 100) and set I = 10.
For example, we set area_factor = 1. Then the function HH_model(I, area_factor) does the following:
It runs the simulation 100 times with this I and this area_factor and returns the number of runs in which the membrane potential rises more than 60 above rest; this is the check if max(v[:]-v_Rest) > 60.
Now I want to determine the area_factor for which the returned counts match the observations as closely as possible.
For example, I know from measurements:
HH_model(2*I,area_factor) = 70
HH_model(I,area_factor)=50
HH_model(0.5*I,area_factor) = 30
...
How can I find the area_factor for a given I so that the difference from the observations is minimal?
import matplotlib.pyplot as py
import numpy as np
import scipy.optimize as optimize
# HH parameters
v_Rest = -65 # in mV
gNa = 120 # in mS/cm^2
gK = 36 # in mS/cm^2
gL = 0.3 # in mS/cm^2
vNa = 115 # in mV
vK = -12 # in mV
vL = 10.6 # in mV
#Number of runs
runs = 30
c = 1 # in uF/cm^2
#performing bisection-procedure
ROOT = True
def HH_model(I,area_factor):
    count = 0
    t_end = 10 # in ms
    delay = 1 # in ms
    duration = 0.3 # in ms
    dt = 0.01 # in ms
    I = I
    area_factor = area_factor
    #geometry
    d = 2 # diameter in um
    r = d/2 # Radius in um
    l = 10 # Length of the compartment in um
    A = (2 * np.pi * r * l * 1e-8)*area_factor # surface [cm^2]
    C = c * A # uF
    for j in range(0,runs):
        # Introduction of equations and channels
        def alphaM(v): return 12 * ((2.5 - 0.1 * (v)) / (np.exp(2.5 - 0.1 * (v)) - 1))
        def betaM(v): return 12 * (4 * np.exp(-(v) / 18))
        def betaH(v): return 12 * (1 / (np.exp(3 - 0.1 * (v)) + 1))
        def alphaH(v): return 12 * (0.07 * np.exp(-(v) / 20))
        def alphaN(v): return 12 * ((1 - 0.1 * (v)) / (10 * (np.exp(1 - 0.1 * (v)) - 1)))
        def betaN(v): return 12 * (0.125 * np.exp(-(v) / 80))
        # compute the timesteps
        t_steps= t_end/dt+1
        # Compute the initial values
        v0 = 0
        m0 = alphaM(v0)/(alphaM(v0)+betaM(v0))
        h0 = alphaH(v0)/(alphaH(v0)+betaH(v0))
        n0 = alphaN(v0)/(alphaN(v0)+betaN(v0))
        # Allocate memory for v, m, h, n
        v = np.zeros((int(t_steps), 1))
        m = np.zeros((int(t_steps), 1))
        h = np.zeros((int(t_steps), 1))
        n = np.zeros((int(t_steps), 1))
        # Set Initial values
        v[:, 0] = v0
        m[:, 0] = m0
        h[:, 0] = h0
        n[:, 0] = n0
        ### Noise component
        knoise= 0.003 #uA/(mS)^1/2
        ### --------- Step3: SOLVE
        for i in range(0, int(t_steps)-1, 1):
            # Get current states
            vT = v[i]
            mT = m[i]
            hT = h[i]
            nT = n[i]
            # Stimulus current
            IStim = 0
            if delay / dt <= i <= (delay + duration) / dt:
                IStim = I * A # in uA
            else:
                IStim = 0
            # Compute change of m, h and n
            m[i + 1] = (mT + dt * alphaM(vT)) / (1 + dt * (alphaM(vT) + betaM(vT)))
            h[i + 1] = (hT + dt * alphaH(vT)) / (1 + dt * (alphaH(vT) + betaH(vT)))
            n[i + 1] = (nT + dt * alphaN(vT)) / (1 + dt * (alphaN(vT) + betaN(vT)))
            # Ionic currents
            iNa = gNa * m[i + 1] ** 3. * h[i + 1] * (vT - vNa)
            iK = gK * n[i + 1] ** 4. * (vT - vK)
            iL = gL * (vT-vL)
            Inoise = (np.random.normal(0, 1) * knoise * np.sqrt(gNa * A))
            IIon = ((iNa + iK + iL) * A) + Inoise #
            # Compute change of voltage
            v[i + 1] = vT + ((-IIon + IStim) / C) * dt # in ((uA / cm ^ 2) / (uF / cm ^ 2)) * ms == mV
        # adjust the voltage to the resting potential
        v = v + v_Rest
        # test if there was a spike
        if max(v[:]-v_Rest) > 60:
            count += 1
    return count
I tried the following:
I = 30
xdata = np.array([0.92*I,I,1.05*I])
ydata = np.array([28,100,110])
y0=np.array([1,1,1])
def g(y,xdata,ydata):
    return ydata - HH_model(xdata,y)
fit = optimize.leastsq(g, y0, args=(xdata, ydata))
This fails with:
File "", line 126, in HH_model
    v[i + 1] = vT + ((-IIon + IStim) / C) * dt
ValueError: could not broadcast input array from shape (3) into shape (1)
How can I get around this and pass the input in the correct format?

The result of your line 126 is a one-dimensional array of length 3 (shape (3,)), because xdata and y0 each hold three values and HH_model receives them as arrays. This size-3 array does not fit into an element of v, which has size-1 elements as you initialized them this way.
Therefore, you could add a [0]:
v[i + 1] = (vT + ((-IIon + IStim) / C) * dt)[0]
Furthermore, I think you do not need to allocate memory. You could for example use numpy.append in line 126.
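Another way to keep the shapes consistent is to call the model once per current with a single scalar area_factor and collect the residuals in a list. The sketch below illustrates that calling pattern only. It swaps in a plain grid search instead of optimize.leastsq, since HH_model is noisy and returns integer counts, which may not suit a gradient-based fit. toy_model is a hypothetical deterministic stand-in for HH_model, and the currents and candidate grid are made-up values.

```python
def toy_model(I, area_factor):
    """Hypothetical stand-in for HH_model(I, area_factor); NOT the real model."""
    return 100.0 * I / (I + 30.0 * area_factor)


def residuals(area_factor, currents, observed, model):
    # One scalar call per current, so every quantity inside the model is scalar.
    return [obs - model(I, area_factor) for I, obs in zip(currents, observed)]


def sum_of_squares(area_factor, currents, observed, model):
    return sum(r * r for r in residuals(area_factor, currents, observed, model))


currents = [15.0, 30.0, 60.0]
true_factor = 1.5  # used only to generate the fake observations below
observed = [toy_model(I, true_factor) for I in currents]

# Coarse grid of candidate area factors from 0.5 to 2.5.
candidates = [0.5 + 0.1 * k for k in range(21)]
best = min(candidates,
           key=lambda a: sum_of_squares(a, currents, observed, toy_model))
print(best)
```

With the real HH_model, each evaluation would need enough runs for the counts to be stable before candidate values can be compared.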

autograd differentiation example in PyTorch - should be 9/8?

In the example from the PyTorch autograd tutorial, they use the following graph:
x = [[1, 1], [1, 1]]
y = x + 2
z = 3y^2
o = mean( z ) # 1/4 * z.sum()
Thus, the forward pass gets us this:
x_i = 1, y_i = 3, z_i = 27, o = 27
In code this looks like:
import torch
# define graph
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
# if we don't do this, torch will only retain gradients for leaf nodes, ie: x
y.retain_grad()
z.retain_grad()
# does a forward pass
print(z, out)
However, I get confused by the gradients computed:
# now let's run our backward prop & get gradients
out.backward()
print(f'do/dx = {x.grad[0,0]}')
which outputs:
do/dx = 4.5
By chain rule, do/dx = do/dz * dz/dy * dy/dx, where:
dy/dx = 1
dz/dy = 9/2 given x_i=1
do/dz = 1/4 given x_i=1
which means:
do/dx = 1/4 * 9/2 * 1 = 9/8
However this doesn't match the gradients returned by Torch (9/2 = 4.5). Perhaps I have a math error (something with the do/dz = 1/4 term?), or I don't understand autograd in Torch.
Any pointers?

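The error is in dz/dy. Since z = 3y^2, its derivative is 6y, not 9/2:
do/dz = 1 / 4
dz/dy = 6y = 6 * 3 = 18
dy/dx = 1
Therefore, do/dx = do/dz * dz/dy * dy/dx = 1/4 * 18 * 1 = 9/2.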
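The chain-rule result can be cross-checked without PyTorch by a central finite difference on the same graph. This is a minimal sketch; the function name out_of is made up for illustration.

```python
def out_of(xs):
    """o = mean(3 * (x + 2)^2) over all elements of x."""
    return sum(3 * (x + 2) ** 2 for x in xs) / len(xs)


xs = [1.0, 1.0, 1.0, 1.0]  # the 2x2 tensor of ones, flattened

# Chain rule for one element: do/dz * dz/dy * dy/dx = 1/4 * 6y * 1
y0 = xs[0] + 2
analytic = (1 / 4) * (6 * y0) * 1

# Central finite difference with respect to the first element only.
eps = 1e-6
up = [xs[0] + eps] + xs[1:]
down = [xs[0] - eps] + xs[1:]
numeric = (out_of(up) - out_of(down)) / (2 * eps)

print(analytic, numeric)  # both close to 4.5
```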

change scaling for the schrodinger equation

I need to change the y-axis scaling of my plot for the Schrodinger equation so that it shows the difference between the theoretical calculation and ours, which is about 0.01 percent. On the plot I am getting, the scale is too coarse to show the difference. Here is the code from my project.
# -*- coding: utf-8 -*-
"""
Created on Sat Nov 05 12:25:14 2016
#author: produce
"""
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
#
c = .5 / 500 # c = delta x
x = np.arange(0, .5, c) # creates array of argument values from 0 to 1/2 in increments
# of delta x = c
psi = np.zeros(len(x)) # creates array of zeros which will be replaced by y values
k = 20 # starting energy for calculator of E
ans = 0 # The value of k, when we have y as between 0.004 and 0
ansPsi = 0
diff = 0.001
increment = 0.0001
done = False
while 1:
    # print k
    psi[0] = 1
    psi[1] = 1
    for i in range(0, len(x) - 2):
        psi[i + 2] = psi[i + 1] + (psi[i + 1] - psi[i]) - 2 * k * c * c * psi[i]
    # plt.plot(x,psi)
    # print(x,psi)
    # print (psi[i+2]--->)
    if (float(psi[i + 2]) < 0.004 and float(psi[i + 2]) > 0):
        ans = k
        ansPsi = psi[i + 2]
        # print ("NOW ENTERING INNER LOOP")
        while 1: # would be an infinite loop, but have a break statement
            # k = k - 0.00001
            k = k + increment
            for i in range(0, len(x) - 2):
                psi[i + 2] = psi[i + 1] + (psi[i + 1] - psi[i]) - 2 * k * c * c * psi[i]
            plt.plot(x, psi, 'r') #red solid line
            if (psi[i + 2] > ansPsi or psi[i + 2] < 0):
                done = True
                break
            else:
                ansPsi = psi[i + 2]
                ans = k
            # print (k, psi[i+2])
    if done:
        break
    k = k - diff
print("Value of k:", ans, "Value of Y:", ansPsi) # prints our answer for energy and psi[1/2]
k1 = 10 # 1st Higher Energy Value
k2 = 7 # 2nd Higher Energy Value
k3 = 3 # 1st Lower Energy Value
k4 = 1 # 2nd Lower Energy Value
kt = np.pi * np.pi * .5 # theoretical value
psi1 = np.zeros(len(x))
psi1[0] = 1
psi1[1] = 1
for i in range(0, len(x) - 2):
    psi1[i + 2] = psi1[i + 1] + (psi1[i + 1] - psi1[i]) - 2 * k1 * c * c * psi1[i]
# psi2 = np.zeros(len(x))
# psi2[0] = 1
# psi2[1] = 1
# for i in range (0,len(x)-2):
# psi2[i+2] = psi2[i+1] + (psi2[i+1] - psi2[i]) - 2*k2*c*c*psi2[i]
# plt.plot(x,psi2,'k')
# psi3 = np.zeros(len(x))
# psi3[0] = 1
# psi3[1] = 1
# for i in range (0,len(x)-2):
# psi3[i+2] = psi3[i+1] + (psi3[i+1] - psi3[i]) - 2*k3*c*c*psi3[i]
# plt.plot(x,psi3,'p')
psi4 = np.zeros(len(x))
psi4[0] = 1
psi4[1] = 1
for i in range(0, len(x) - 2):
    psi4[i + 2] = psi4[i + 1] + (psi4[i + 1] - psi4[i]) - 2 * k4 * c * c * psi4[i]
plt.plot(x, psi, 'r-', label='Corrected Energy')
psiT = np.zeros(len(x))
psiT[0] = 1
psiT[1] = 1
for i in range(0, len(x) - 2):
    psiT[i + 2] = psiT[i + 1] + (psiT[i + 1] - psiT[i]) - 2 * kt * c * c * psiT[i]
plt.plot(x, psiT, 'b-', label='Theoretical Energy')
plt.ylabel("Value of Psi")
plt.xlabel("X value from 0 to 0.5")
plt.title("Schrodingers equation for varying inital energy")
plt.legend(loc=3)
plt.yscale()
plt.show()

The code you shared fails since plt.yscale() needs an argument. I simply commented that line out.
Because your theoretical energy curve and your corrected energy curve differ by so little, no single y-axis scale can show both curves over the full range of x (i.e., from 0 to 0.5) and still separate them. Instead, you could plot the difference of the two curves:
plt.plot(x, psiT-psi)
plt.title("Size of Correction for Varying Initial Energy")
plt.ylabel(r"$\Delta$E")
plt.xlabel("X value from 0 to 0.5")
plt.show()
Also, it might be nice to tack some units on the x and y labels. :)
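To see how small the difference curve is before choosing an axis scale, the recurrence from the question can be wrapped in a function and the two solutions subtracted directly. This is a sketch under stated assumptions: shoot is a made-up helper name, and k_fit is a hypothetical stand-in 0.01 percent above the theoretical value, not the value the search loop in the question actually finds.

```python
import math


def shoot(k, c=0.5 / 500, n=500):
    """Integrate psi with the recurrence from the question (hypothetical helper)."""
    psi = [0.0] * n
    psi[0] = 1.0
    psi[1] = 1.0
    for i in range(n - 2):
        psi[i + 2] = psi[i + 1] + (psi[i + 1] - psi[i]) - 2 * k * c * c * psi[i]
    return psi


kt = math.pi * math.pi * 0.5  # theoretical value, as in the question
k_fit = kt * (1 + 1e-4)       # assumed stand-in: 0.01 percent above kt

psi_t = shoot(kt)
psi_fit = shoot(k_fit)

# Pointwise difference; this is the quantity plotted as psiT - psi above.
diff = [a - b for a, b in zip(psi_t, psi_fit)]
max_diff = max(abs(d) for d in diff)
print(max_diff)
```

The printed maximum gives a rough idea of the y-range to expect on the difference plot.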
