CNTK & python: How to pass input data to the eval func? - python

With CNTK I have created a network with 2 input neurons and 1 output neuron.
A line in the training file looks like
|features 1.567518 2.609619 |labels 1.000000
Then the network was trained with BrainScript. Now I want to use the network to predict values. For example, the input data is [1.82, 3.57]. What is the output of the net?
I have tried the following Python code, but I am new to Python and the code does not work. So my question is: how do I pass the input data [1.82, 3.57] to the eval function?
On stackoverflow there are some hints, here and here, but this is too abstract for me.
Thank you.
import cntk as ct
import numpy as np
z = ct.load_model("LR_reg.dnn", ct.device.cpu())
input_data= np.array([1.82, 3.57], dtype=np.float32)
pred = z.eval({ z.arguments[0] : input_data })
print(pred)

Here's the most defensive way of doing it. CNTK can be forgiving if you omit some of this when the network is specified with V2 constructs. Not sure about a network that was created with V1 code.
Basically you need a pair of brackets for each axis. Which axes exist in BrainScript? There is a batch axis, a sequence axis, and then the static axes of your network. You have one-dimensional data, so the following should work:
input_data= np.array([[[1.82, 3.57]]], dtype=np.float32)
This specifies a batch of one sequence, of length one, containing one 1-D vector of two elements. You can also try omitting the outermost brackets and see whether you get the same result.
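The nesting can be checked with NumPy alone before anything is handed to CNTK. This is only a sketch of the shape bookkeeping; the variable names batch_size, seq_len and n_features are illustrative and not part of the CNTK API.

```python
import numpy as np

# One pair of brackets per axis: batch, sequence, then the static axis.
input_data = np.array([[[1.82, 3.57]]], dtype=np.float32)

# Illustrative names only; CNTK does not expose these.
batch_size, seq_len, n_features = input_data.shape
print(input_data.shape)  # (1, 1, 2)

# Dropping the outermost brackets removes the batch axis.
without_batch = input_data[0]
print(without_batch.shape)  # (1, 2)
```

If the loaded model rejects the array, comparing these shapes with the shape reported by z.arguments[0] may help narrow down which axis is missing.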
Update, based on more information from the comment below: we should not forget that the V1 code also saved the part of the network that computes things like loss and accuracy. If we provide only the features, CNTK will complain that the labels have not been provided. There are two ways to deal with this issue. One possibility is to provide some fake labels, so that the network can evaluate these auxiliary operations. Another possibility is to identify the prediction node and use only that. If the prediction was called 'p' in V1, this Python code
p = z.find_by_name('p')
should create a CNTK function that only needs the features in order to compute the prediction.

Related

understanding tensorflow binary image classification results

For one of my first attempts at using TensorFlow I've followed the Binary Image Classification tutorial https://www.tensorflow.org/tutorials/keras/text_classification_with_hub#evaluate_the_model.
I was able to follow the tutorial fine, but then I wanted to try to inspect the results more closely, namely I wanted to see what predictions the model made for each item in the test data set.
In short, I wanted to see what "label" (1 or 0) it would predict applies to a given movie review.
So I tried:
results = model.predict(test_data.batch(512))
and then
for i in results:
    print(i)
This gives me close to what I would expect. A list of 25,000 entries (one for each movie review).
But the value of each item in the array is not what I would expect. I was expecting to see a predicted label, so either a 0 (for negative) or 1 (for positive).
But instead I get this:
[0.22731477]
[2.1199656]
[-2.2581818]
[-2.7382329]
[3.8788114]
[4.6112833]
[6.125982]
[5.100685]
[1.1270659]
[1.3210837]
[-5.2568426]
[-2.9904163]
[0.17620209]
[-1.1293088]
[2.8757455]
...and so on for 25,000 entries.
Can someone help me understand what these numbers mean?
Am I misunderstanding what the "predict" method does, or (since these numbers look similar to the word embedding vectors introduced in the first layer of the model) perhaps I am misunderstanding how the prediction relates to the word embedding layer and the ultimate classification label?
I know this is a major newbie question, but I appreciate your help and patience :)
According to the link that you provided, the numbers come from your output layer, which has no activation function. That code uses a Dense layer with 1 neuron and no activation, so it just multiplies the output of the previous layer by the weights, sums the result, and adds the bias. The output can therefore range from -infinity (negative class) to +infinity (positive class). If you want the output to lie between zero and one, you need an activation function such as sigmoid: model.add(tf.keras.layers.Dense(1, activation='sigmoid')). That maps everything to the range 0 to 1, so you can classify an item as the negative class if the output is less than 0.5 (the midpoint) and as the positive class otherwise.
Actually, your understanding of the predict function is correct. You simply did not add an activation that matches your assumption, which is why you get these values instead of values between 0 and 1.
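If retraining with a sigmoid output layer is not convenient, the same mapping can be applied to the raw outputs after the fact. The following is a minimal NumPy sketch, assuming the values returned by model.predict are unnormalized scores (logits); the sigmoid helper and the 0.5 threshold are choices made for this illustration, and the three scores are copied from the output in the question.

```python
import numpy as np

def sigmoid(x):
    """Map real-valued scores to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# First three raw outputs from the question, shaped like model.predict results.
logits = np.array([[0.22731477], [2.1199656], [-2.2581818]], dtype=np.float32)

probs = sigmoid(logits)                      # values strictly between 0 and 1
labels = (probs >= 0.5).astype(int).ravel()  # 1 = positive, 0 = negative
print(labels)
```

Since the sigmoid of 0 is 0.5, thresholding the probability at 0.5 is equivalent to checking whether the raw score is positive.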

Tensorflow Datasets, padded_batch, why allow different output_shapes, and is there a better way?

I'm trying to write Tensorflow 2.0 code which is good enough to share with other people. I have run into a problem with tf.data.Dataset. I have solved it, but I dislike my solutions.
Here is working Python code which generates padded batches from irregular data, two different ways. In one case, I re-use a global variable to supply the shape information. I dislike the global variable, especially because I know that the Dataset knows its own output shapes, and in the future I may have Dataset objects with several different output shapes.
In the other case, I extract the shape information from the Dataset object itself. But I have to jump through hoops to do it.
import numpy as np
import tensorflow as tf
print("""
Create a data set with the desired shape: 1 input per sub-element,
3 targets per sub-element, 8 elements of varying lengths.
""")
def gen():
    lengths = np.tile(np.arange(4,8), 2)
    np.random.shuffle(lengths)
    for length in lengths:
        inp = np.random.randint(1, 51, length)
        tgt = np.random.random((length, 3))
        yield inp, tgt
output_types = (tf.int64, tf.float64)
output_shapes = ([None], [None, 3])
dataset = tf.data.Dataset.from_generator(gen, output_types, output_shapes)
print("""
Using the global variable, output_shapes, allows the retrieval
of padded batches.
""")
for inp, tgt in dataset.padded_batch(3, output_shapes):
    print(inp)
    print(tgt)
    print()
print("""
Obtaining the shapes supplied to Dataset.from_generator()
is possible, but hard.
""")
default_shapes = tuple([[y.value for y in x.shape.dims] for x in dataset.element_spec]) # Crazy!
for inp, tgt in dataset.padded_batch(3, default_shapes):
    print(inp)
    print(tgt)
I don't quite understand why one might want to pad the data in a batch of unevenly-sized elements to any shapes other than the output shapes which were used to define the Dataset elements in the first place. Does anyone know of a use case?
Also, there is no default value for the padded_shapes argument. I show how to retrieve what I think is the sensible default value for padded_shapes. That one-liner works... but why is it so difficult?
I'm currently trying to subclass Dataset to provide the Dataset default shapes as a Python property. Tensorflow is fighting me, probably because the underlying Dataset is a C++ object while I'm working in Python.
All this trouble makes me wonder whether there is a cleaner approach than what I have tried.
Thanks for your suggestions.
Answering my own question. I asked this same question on Reddit. A Tensorflow contributor replied that TF 2.2 will provide a default value for the padded_shapes argument. I am glad to see that the development team has recognized the same need that I identified.
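For readers unsure what padding an unknown (None) dimension means in practice, here is a NumPy-only sketch of the idea: every element in a batch is padded to the length of the longest element in that batch. It does not use TensorFlow, and pad_batch is a hypothetical helper written for this illustration, not a tf.data function.

```python
import numpy as np

def pad_batch(sequences, pad_value=0):
    """Pad 1-D sequences to the length of the longest one in the batch."""
    max_len = max(len(seq) for seq in sequences)
    dtype = np.asarray(sequences[0]).dtype
    batch = np.full((len(sequences), max_len), pad_value, dtype=dtype)
    for row, seq in enumerate(sequences):
        batch[row, :len(seq)] = seq  # copy the data, leave the tail padded
    return batch

batch = pad_batch([
    np.array([3, 7, 1, 9]),
    np.array([5, 2, 8, 4, 6, 1]),
    np.array([2, 2, 2, 2, 2]),
])
print(batch.shape)  # (3, 6)
print(batch)
```

A different batch drawn from the same data could end up with a different padded length, which is why the padded size is a per-batch property rather than a fixed one.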

CNTK with python - activation for each layer

I am using the python API of CNTK to train some CNN that I save using the save_model function.
Now I want to run some analysis on my network afterwards. Specifically I want to take a look at the activations of each layer. Obviously I can run my network on some data called img like this:
model.eval(img)
But that will only give me the output of the last Layer in my Network. Is there some easy way to also get the output from the previous layers?
Actually, there is even an example provided for that task: https://github.com/Microsoft/CNTK/tree/master/Examples/Image/FeatureExtraction
Let me give you a short overview of the essential steps.
What matters is the name of the node whose output you want to get.
import numpy as np
from cntk import combine
# loaded_model, node_name, minibatch_source and features_si come from your own script
# get the node in the graph of which you desire the output
node_in_graph = loaded_model.find_by_name(node_name)
output_nodes = combine([node_in_graph.owner])
# evaluate the node e.g. using a minibatch_source
mb = minibatch_source.next_minibatch(1)
output = output_nodes.eval(mb[features_si])
# access the values as a one dimensional vector
out_values = output[0].flatten()
desired_output = out_values[np.newaxis]
Basically you do the same thing you would do anyway, with the difference that you retrieve an intermediate node instead of the final output.

Pymc3: Optimizing parameters with multiple data?

I've designed a model using Pymc3, and I have some trouble optimizing it with multiple data.
The model is a bit similar to the coal-mining disaster (as in the Pymc3 tutorial for those who know it), except there are multiple switchpoints.
The output of the network is a series of real numbers, for instance:
[151,152,150,20,19,18,0,0,0]
with Model() as accrochage_model:
    time=np.linspace(0,n_cycles*data_length,n_cycles*data_length)
    poisson = [Normal('poisson_0',5,1), Normal('poisson_1',10,1)]
    variance=3
    t = [Normal('t_0',0.5,0.01), Normal('t_1',0.7,0.01)]
    taux = [Bernoulli('taux_{}'.format(i),t[i]) for i in range(n_peaks)]
    switchpoint = [Poisson('switchpoint_{}'.format(i),poisson[i])*taux[i] for i in range(n_peaks)]
    peak=[Normal('peak_0',150,2),Normal('peak_1',50,2),Normal('peak_2',0,2)]
    z_init=switch(switchpoint[0]>=time%n_cycles,0,peak[0])
    z_list=[switch(sum(switchpoint[j] for j in range(i))>=time%n_cycles,0,peak[i]-peak[i-1]) for i in range(1,n_peaks)]
    z=(sum(z_list[i] for i in range(len(z_list))))
    z+=z_init
    m =Normal('m', z, variance,observed=data)
I have multiple realisations of the true distribution and I'd like to take all of them into account when optimizing the parameters of the system.
Right now the "data" that appears in observed=data is just one list of results, such as:
[151,152,150,20,19,18,0,0,0]
What I would like to do is give not just one but several lists of results,
for instance:
data=([151,152,150,20,19,18,0,0,0],[145,152,150,21,17,19,1,0,0],[151,149,153,17,19,18,0,0,1])
I tried using the shape parameter and making data an array of results but none of it seemed to work.
Does anyone have an idea of how it's possible to do the inference so that the network is optimized for an entire dataset and not a single output?

How to get feature vector column length in Spark Pipeline

I have an interesting question.
I am using Pipeline object to run a ML task.
This is what my Pipeline object looks like.
jpsa_mlp.pipeline.getStages()
Out[244]:
[StringIndexer_479d82259c10308d0587,
Tokenizer_4c5ca5ea35544bb835cb,
StopWordsRemover_4641b68e77f00c8fbb91,
CountVectorizer_468c96c6c714b1000eef,
IDF_465eb809477c6c986ef9,
MultilayerPerceptronClassifier_4a67befe93b015d5bd07]
All the estimators and transformers inside this pipeline object have been coded as part of class methods with JPSA being class object.
Now I want to add a method for hyperparameter tuning. So I use the following:
self.paramGrid = ParamGridBuilder()\
    .addGrid(self.pipeline.getStages()[5].layers, [len(self.pipeline.getStages()[3].vocab),10,3])\
    .addGrid(self.pipeline.getStages()[5].maxIter, [100,300])\
    .build()
The problem is that, for a neural network classifier, one of the hyperparameters is the hidden layer size. The layers attribute of the MLP classifier requires the sizes of the input, hidden and output layers. The input and output sizes are fixed (based on the data we have), so I wanted to set the input layer size to the size of my feature vector. However, I don't know the size of my feature vector, because the estimators inside the pipeline that create the feature vectors (CountVectorizer, IDF) have not been fit to the data yet.
The pipeline object will fit the data during cross-validation, using Spark's cross-validator object. Only then would I have a CountVectorizerModel from which to read the feature vector size.
If I had the CountVectorizer materialized, I could use countvectorizerModel.vocab to get the length of the feature vector and use that as the input layer value in the layers attribute of the MLP.
So how do I add hyperparameters for the layers of the MLP (both the hidden and input layer sizes)?
You can find out that information from your dataframe schema metadata.
Scala code:
val length = datasetAfterPipe.schema(datasetAfterPipe.schema.fieldIndex("columnName"))
  .metadata.getMetadata("ml_attr").getLong("num_attrs")
Since PySpark code was requested:
You can see them by "navigating" the metadata: datasetAfterPipe.schema["features"].metadata["ml_attr"]
Here is sample output (xxx stands for all the features assembled into the features column, and the last entry is the size):
Out:
{'attrs': {'numeric': [{'idx': xxxxxxx }]}, 'num_attrs': 337}
So you index into the metadata:
lenFeatureVect = datasetAfterPipe.schema["features"].metadata["ml_attr"]["num_attrs"]
print('Len feature vector:', lenFeatureVect)
Out:
337
Note: if you have a "scaled features" column, you need to read the attribute info from the pre-scaled "features" column instead. This assumes you scale after vectorizing, so that the scaling step in the Pipeline is fed feature vectors rather than the original columns.
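The metadata lookup above can be wrapped in a small helper. The sketch below works on a plain dictionary shaped like the ml_attr output shown in the answer, so it runs without Spark; feature_vector_length is a hypothetical name, the sample dictionary is made up for illustration, and the fallback that counts the listed attributes when num_attrs is absent is an assumption rather than documented Spark behaviour.

```python
def feature_vector_length(ml_attr):
    """Return the feature vector size from an ml_attr-style metadata dict."""
    if "num_attrs" in ml_attr:
        return int(ml_attr["num_attrs"])
    # Assumed fallback: count the attributes listed under each type.
    return sum(len(group) for group in ml_attr.get("attrs", {}).values())

# Made-up sample with the same layout as the output shown in the answer.
sample_ml_attr = {
    "attrs": {"numeric": [{"idx": 0, "name": "f0"}, {"idx": 1, "name": "f1"}]},
    "num_attrs": 2,
}
print("Len feature vector:", feature_vector_length(sample_ml_attr))
```

With a real DataFrame, the argument would be datasetAfterPipe.schema["features"].metadata["ml_attr"], as in the answer.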
