sklearn: Set the value of the attribute out_activation_ to 'logistic' - python

I need to set the attribute out_activation_ = 'logistic' in an MLPRegressor of sklearn. This attribute is supposed to take the names of the relevant activation functions ('relu', 'logistic', 'tanh', etc.). The problem is that I cannot find a way to control this attribute and set it to the preferred function. If someone has faced this problem before or knows something more, I would appreciate some help.
I have tried passing it as an argument to MLPRegressor(): error. I have tried the set_params() method: error. I have tried changing it manually through the Variable Explorer: error. Finally, I used MLPName.out_activation_ = 'logistic', but when I then called fit() it changed back to 'identity'.
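CODE:
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

X_train2, X_test2, y_train2, y_test2 = train_test_split(signals_final, masks, test_size=0.05, random_state=17)
scaler2 = MinMaxScaler()
X_train2 = scaler2.fit_transform(X_train2)
X_test2 = scaler2.transform(X_test2)
MatchingNetwork = MLPRegressor(alpha=1e-15, hidden_layer_sizes=(300,), random_state=1, max_iter=20000, activation='logistic', batch_size=64)
# Note: this sets the attribute on a new, throwaway instance, not on MatchingNetwork
MLPRegressor().out_activation_ = 'logistic'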

You cannot. The output activation is determined by the problem type at fit time. For regression, the identity activation is used; see the User Guide.
Here is the relevant bit of source code. You might be able to hack it by fitting one iteration, changing the attribute, then using partial_fit, since then this _initialize method won't be called again; but it's likely to break when back-propagating.
Generally I think the sklearn neural networks aren't designed to be super flexible: there are other packages that play that role, are more efficient (use GPUs), etc.
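A minimal sketch of the behaviour described in this answer, assuming small synthetic data (the arrays and hyperparameters below are illustrative, not the asker's). Whatever is assigned beforehand, fit() sets out_activation_ according to the estimator and target type:

```python
import warnings

import numpy as np
from sklearn.neural_network import MLPClassifier, MLPRegressor

warnings.filterwarnings("ignore")  # silence convergence warnings caused by the tiny max_iter

# Illustrative synthetic data (an assumption for this sketch)
rng = np.random.default_rng(0)
X = rng.random((40, 3))
y_reg = rng.random(40)               # continuous targets
y_clf = (y_reg > 0.5).astype(int)    # binary labels

reg = MLPRegressor(hidden_layer_sizes=(5,), activation="logistic", max_iter=5, random_state=1)
reg.out_activation_ = "logistic"     # assigned before fit; fit() overwrites it
reg.fit(X, y_reg)
reg_out = reg.out_activation_        # regressors end up with 'identity'

clf = MLPClassifier(hidden_layer_sizes=(5,), max_iter=5, random_state=1).fit(X, y_clf)
clf_out = clf.out_activation_        # binary classifiers end up with 'logistic'

print(reg_out, clf_out)
```

The activation argument of the constructor only controls the hidden layers, which is why passing activation='logistic' does not change the output layer.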

Related

Converting PyTorch to CoreML gives a TypeError: 'dict' object is not callable

I've been following Apple's coremltools docs for converting PyTorch segmentation models to CoreML.
While it works fine when we're loading a remote PyTorch model, I'm yet to figure out a working Python script to perform conversions with local/already-downloaded PyTorch models.
The following piece of code throws a TypeError: 'dict' object is not callable
# This works fine:
# model = torch.hub.load('pytorch/vision:v0.6.0', 'deeplabv3_resnet101', pretrained=True).eval()
model = torch.load('local_model_file.pth')
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0)
with torch.no_grad():
    output = model(input_batch)['out'][0]  # error here
torch_predictions = output.argmax(0)
There is a SO answer that offers a solution by initialising the model class and loading the state_dict, but I wonder what's the concrete solution when we don't have access to the PyTorch model?
In your code, model is a state dict, which is a dictionary mapping parameter names to parameter tensor values. As the linked answer stated, the right way to load a state dict is by (a) creating the model object that the state dict belongs to and then (b) using nn.Module.load_state_dict to load the state dict. To do (a), you need access to the model's class definition. If you don't have that access, then unfortunately I don't see any reliable way to load the state dict.
You might be able to guess what the class's __init__ looks like by looking at the parameter names in the state dict (e.g., 'module.stage1.rebnconvin.conv_s1.weight' looks like a convolution). However, even if the guess is correct and the state dict can be loaded, you still need to define the forward method, because the state dict only stores the parameters.
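Steps (a) and (b) can be sketched as follows. TinyNet is a hypothetical stand-in for the class that produced the checkpoint; in practice it would have to be the real model class, including its forward method:

```python
import os
import tempfile

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the class that produced the checkpoint."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 2, kernel_size=1)

    def forward(self, x):
        # Mimic the segmentation models' dict output with an 'out' entry
        return {"out": self.conv(x)}


# Create a checkpoint that holds parameters only, like the asker's .pth file
path = os.path.join(tempfile.mkdtemp(), "local_model_file.pth")
torch.save(TinyNet().state_dict(), path)

state = torch.load(path)  # a dict of tensors, not a module, so it cannot be called
is_plain_dict = isinstance(state, dict) and not callable(state)

model = TinyNet()              # (a) build the architecture
model.load_state_dict(state)   # (b) copy the saved parameters into it
model.eval()

with torch.no_grad():
    output = model(torch.zeros(1, 3, 4, 4))["out"][0]
torch_predictions = output.argmax(0)
```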

Accessing 'training' attribute in TensorFlow functional (functional API) Model

As the title states I'm wondering how I could access the privileged 'training' argument when I'm using the functional API.
So if I use subclassing, I can write something like:
class MyLayer(tf.keras.layers.Layer):
    def __init__(self):
        ...
        self.BN = tf.keras.layers.BatchNormalization()
    def call(self, inputs, training=None):
        return self.BN(inputs, training=training)
So I can control how my batchnorm behaves during training and prediction. But if I want to use the functional API:
input = tf.keras.Input(someshape)
normalized = tf.keras.layers.BatchNormalization()(input)
tf.keras.Model(inputs=input, outputs=normalized)
Now I can't really set the privileged 'training' argument for my batch_norm anymore. I love the functional API, it's just really so much fun to use, but having to build around this kind of thing is quite often a dealbreaker. I feel like I must be missing some important idea on how one would solve this here.
I'm aware that I could create a tf.keras.Input, which could hold the 'training' argument. But this would change it from a keyword arg to some element of a list, which creates very inconsistent code. Any smarter solution to this?
Edit: Should make it clear that I'm looking for a general idea that can be used for the 'training' arg, not just tackling the BatchNormalization in particular.
When you instantiate the model with model = tf.keras.Model(inputs=input, outputs=normalized), you do not pass a training value anywhere; the flag is supplied later, when the model is run. If you use the fit method or call your model as output_tensors = mymodel(input_tensors, training=True), the training flag is True; conversely, if you use the predict method or call output_tensors = mymodel(input_tensors, training=False), it is False (which is obvious if you call the model directly). The flag is then passed on to the layers inside the model.
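A small sketch of this, assuming TensorFlow 2.x with eager execution; the input shape and the random data are made up for illustration. The same functional model behaves differently depending on the training value given at call time:

```python
import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(4,))
normalized = tf.keras.layers.BatchNormalization()(inputs)
model = tf.keras.Model(inputs=inputs, outputs=normalized)

# Illustrative batch with mean near 5 and standard deviation near 3
x = (np.random.default_rng(0).normal(size=(8, 4)) * 3.0 + 5.0).astype("float32")

# Inference mode: the layer uses its moving averages (initially mean 0, variance 1)
out_infer = np.asarray(model(x, training=False))

# Training mode: the layer normalizes with the statistics of the current batch
out_train = np.asarray(model(x, training=True))

print(out_infer.mean(), out_train.mean())
```

No extra input tensor is needed for this: training stays a keyword argument of the model call.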

How to make statsmodels GLM.fit_constrained result picklable/store-and-reloadable

A GLM (and thus also OLS) regression with constraints on parameters can readily be run using the statsmodels GLM.fit_constrained() method, as with the code below (or here).
How can I make the GLMresults object resulting from such a statsmodels GLM.fit_constrained() regression picklable, so that the estimation result can be stored for re-use for prediction in a new session anytime later?
The GLMresults object obtained from fit_constrained() and containing the relevant estimation result has its .save() method that would normally readily pickle the object into a file.
This .save() works for the result from a standard (unconstrained) GLM regression, GLM.fit(). However, it doesn't work with the result from GLM.fit_constrained(). Instead, it throws a pickling error, seemingly because the patsy DesignMatrixBuilder is not picklable, so it links to the never-resolved issue here. This is at least the case for my Python 3.6.3 (running on Windows).
An example:
import statsmodels
import statsmodels.api as sm
import pandas as pd
# Define example data & constraints:
import numpy as np
df = pd.DataFrame(np.random.randint(0,100,size=(100, 5)), columns=list('ABCDF'))
y = df['A']
X = df[['B','C','D','F']]
constraints = ['B + C + D', 'C - F'] # Add two linear constraints on parameters: B+C+D = 0 & C-F = 0
statsmodels.genmod.families.links.identity()
OLS_from_GLM = sm.GLM(y, X)
# Unconstrained regression:
result_u = OLS_from_GLM.fit()
result_u.save('myfile_u.pickle') # This works
# Constrained regression - save() fails
result_c = OLS_from_GLM.fit_constrained(constraints)
result_c.save('myfile_c.pickle') # This fails with pickling error (tested in Python 3.6.3 on Windows): "NotImplementedError: Sorry, pickling not yet supported. See https://github.com/pydata/patsy/issues/26 if you want to help."
Is there a way to readily make the result from fit_constrained() picklable, or otherwise storable?
I below suggest a first workaround answer; it is trivial and works well for me so far. I do not know, however, whether it is truly advisable or whether its risks are large and/or any preferable alternative solution exists.
I got this to work by simply removing (commenting out) the line
res._results.constraints = lc
in the function definition of fit_constrained() within statsmodels' active generalized_linear_model.py script (in my case in the virtualenv folder \env\Lib\site-packages\statsmodels\genmod\generalized_linear_model.py).
Idling this line seems to have created no problem for my work; I can now readily save and reload the pickled file and use it to make correct predictions based on the stored estimation; the imposed parameter constraints remain respected and predictions made using .predict() remain unchanged after reloading.
I wonder though whether there is any major risk attached to this procedure. I am not familiar with the inner workings of the statsmodels library, or with its GLM.fit_constrained() method in particular. I reckon it's inadvisable to change anything in a pre-existing module one does not understand. However, it is the only way I am conveniently able to impose various constraints on my GLM parameters and to save the regression results so that I can readily re-use them for prediction in a later session.

Overwriting methods via mixin pattern does not work as intended

I am trying to introduce a mod/mixin for a problem. In particular I am focusing here on a SpeechRecognitionProblem. I intend to modify this problem and therefore I seek to do the following:
class SpeechRecognitionProblemMod(speech_recognition.SpeechRecognitionProblem):
    def hparams(self, defaults, model_hparams):
        SpeechRecognitionProblem.hparams(self, defaults, model_hparams)
        vocab_size = self.feature_encoders(model_hparams.data_dir)['targets'].vocab_size
        p = defaults
        p.vocab_size['targets'] = vocab_size
    def feature_encoders(self, data_dir):
        # ...
So this one does not do much. It calls the hparams() function from the base class and then changes some values.
Now, there are already some ready-to-go problems e.g. Libri Speech:
#registry.register_problem()
class Librispeech(speech_recognition.SpeechRecognitionProblem):
    # ..
However, in order to apply my modifications I am doing this:
#registry.register_problem()
class LibrispeechMod(SpeechRecognitionProblemMod, Librispeech):
    # ..
This should, if I am not mistaken, overwrite everything (with identical signatures) in Librispeech and instead call functions of SpeechRecognitionProblemMod.
Since I was able to train a model with this code I am assuming that it's working as intended so far.
Now here comes my problem:
After training I want to serialize the model. This usually works. However, it does not with my mod and I actually know why:
At a certain point hparams() gets called. Debugging to that point will show me the following:
self # {LibrispeechMod}
self.hparams # <bound method SpeechRecognitionProblem.hparams of ..>
self.feature_encoders # <bound method SpeechRecognitionProblemMod.feature_encoders of ..>
self.hparams should be <bound method SpeechRecognitionProblemMod.hparams of ..>! It would seem that for some reason hparams() of SpeechRecognitionProblem gets called directly instead of SpeechRecognitionProblemMod. But please note that it's the correct type for feature_encoders()!
The thing is that I know this is working during training. I can see that the hyper-parameters (hparams) are applied accordingly, simply because the model's graph node names change through my modifications.
There is one specialty I need to point out. tensor2tensor allows you to dynamically load a t2t_usr_dir, which contains additional Python modules that get loaded by import_usr_dir. I make use of that function in my serialization script as well:
if usr_dir:
    logging.info('Loading user dir %s' % usr_dir)
    import_usr_dir(usr_dir)
This could be the only culprit I can see at the moment although I would not be able to tell why this may cause the problem.
If anybody sees something I do not I'd be glad to get a hint what I'm doing wrong here.
So what is the error you're getting?
For the sake of completeness, this is the result of the wrong hparams() method being called:
NotFoundError (see above for traceback): Restoring from checkpoint failed.
Key transformer/symbol_modality_256_256/softmax/weights_0 not found in checkpoint
symbol_modality_256_256 is wrong. It should be symbol_modality_<vocab-size>_256 where <vocab-size> is a vocabulary size which gets set in SpeechRecognitionProblemMod.hparams.
So, this weird behavior came from the fact that I was remote debugging and the source files of the usr_dir were not correctly synchronized. Everything works as intended, but the source files were not matching.
Case closed.
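For reference, the override order the question relies on can be checked in isolation. The classes below are plain-Python stand-ins that only reuse the names from the question; they are not the tensor2tensor classes:

```python
class SpeechRecognitionProblem:
    def hparams(self):
        return "base"


class SpeechRecognitionProblemMod(SpeechRecognitionProblem):
    def hparams(self):
        # super() follows the MRO of the instance's class, so for
        # LibrispeechMod the next class in line is Librispeech.
        return "mod+" + super().hparams()


class Librispeech(SpeechRecognitionProblem):
    def hparams(self):
        return "libri+" + super().hparams()


class LibrispeechMod(SpeechRecognitionProblemMod, Librispeech):
    pass


mro_names = [cls.__name__ for cls in LibrispeechMod.__mro__]
resolved = LibrispeechMod().hparams()
owner = LibrispeechMod.hparams.__qualname__

print(mro_names)  # the mixin comes before Librispeech
print(resolved)   # mod+libri+base
print(owner)      # SpeechRecognitionProblemMod.hparams
```

Note that calling SpeechRecognitionProblem.hparams(self, ...) explicitly, as in the question, jumps straight to the base class and skips any hparams defined in Librispeech, whereas super() walks the whole chain.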

Tensorflow: Using weights trained in one model inside another, different model

I'm trying to train an LSTM in Tensorflow using minibatches, but after training is complete I would like to use the model by submitting one example at a time to it. I can set up the graph within Tensorflow to train my LSTM network, but I can't use the trained result afterward in the way I want.
The setup code looks something like this:
#Build the LSTM model.
cellRaw = rnn_cell.BasicLSTMCell(LAYER_SIZE)
cellRaw = rnn_cell.MultiRNNCell([cellRaw] * NUM_LAYERS)
cell = rnn_cell.DropoutWrapper(cellRaw, output_keep_prob = 0.25)
input_data = tf.placeholder(dtype=tf.float32, shape=[SEQ_LENGTH, None, 3])
target_data = tf.placeholder(dtype=tf.float32, shape=[SEQ_LENGTH, None])
initial_state = cell.zero_state(batch_size=BATCH_SIZE, dtype=tf.float32)
with tf.variable_scope('rnnlm'):
    output_w = tf.get_variable("output_w", [LAYER_SIZE, 6])
    output_b = tf.get_variable("output_b", [6])
outputs, final_state = seq2seq.rnn_decoder(input_list, initial_state, cell, loop_function=None, scope='rnnlm')
output = tf.reshape(tf.concat(1, outputs), [-1, LAYER_SIZE])
output = tf.nn.xw_plus_b(output, output_w, output_b)
...Note the two placeholders, input_data and target_data. I haven't bothered including the optimizer setup. After training is complete and the training session closed, I would like to set up a new session that uses the trained LSTM network whose input is provided by a completely different placeholder, something like:
with tf.Session() as sess:
    with tf.variable_scope("simulation", reuse=None):
        cellSim = cellRaw
        input_data_sim = tf.placeholder(dtype=tf.float32, shape=[1, 1, 3])
        initial_state_sim = cell.zero_state(batch_size=1, dtype=tf.float32)
        input_list_sim = tf.unpack(input_data_sim)
        outputsSim, final_state_sim = seq2seq.rnn_decoder(input_list_sim, initial_state_sim, cellSim, loop_function=None, scope='rnnlm')
        outputSim = tf.reshape(tf.concat(1, outputsSim), [-1, LAYER_SIZE])
        with tf.variable_scope('rnnlm'):
            output_w = tf.get_variable("output_w", [LAYER_SIZE, nOut])
            output_b = tf.get_variable("output_b", [nOut])
        outputSim = tf.nn.xw_plus_b(outputSim, output_w, output_b)
This second part returns the following error:
tensorflow.python.framework.errors.InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder' with dtype float
[[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
...Presumably because the graph I'm using still has the old training placeholders attached to the trained LSTM nodes. What's the right way to 'extract' the trained LSTM and put it into a new, different graph that has a different style of inputs? The variable scoping features that Tensorflow has seem to address something like this, but the examples in the documentation all talk about using variable scope as a way of managing variable names so that the same piece of code will generate similar subgraphs within the same graph. The 'reuse' feature seems to be close to what I want, but I don't find the Tensorflow documentation linked above to be clear at all on what it does. The cells themselves cannot be given a name (in other words,
cellRaw = rnn_cell.MultiRNNCell([cellRaw] * NUM_LAYERS, name="multicell")
is not valid), and while I can give a name to a seq2seq.rnn_decoder(), I presumably wouldn't be able to remove the rnn_cell.DropoutWrapper() if I used that node unchanged.
Questions:
What is the proper way to move trained LSTM weights from one graph to another?
Is it correct to say that starting a new session "releases resources", but doesn't erase the graph built in memory?
It seems to me like the 'reuse' feature allows Tensorflow to search outside of the current variable scope for variables with the same name (existing in a different scope), and use them in the current scope. Is this correct? If it is, what happens to all of the graph edges from the non-current scope that link to that variable? If it isn't, why does Tensorflow throw an error if you try to have the same variable name within two different scopes? It seems perfectly reasonable to define two variables with identical names in two different scopes, e.g. conv1/sum1 and conv2/sum1.
In my code I'm working within a new scope but the graph won't run without data to be fed into a placeholder from the initial, default scope. Is the default scope always 'in-scope' for some reason?
If graph edges can span different scopes, and names in different scopes can't be shared unless they refer to the exact same node, then that would seem to defeat the purpose of having different scopes in the first place. What am I misunderstanding here?
Thanks!
What is the proper way to move trained LSTM weights from one graph to another?
You can create your decoding graph first (with a saver object to save the parameters) and create a GraphDef object that you can import in your bigger training graph:
basegraph = tf.Graph()
with basegraph.as_default():
    ***your graph***
traingraph = tf.Graph()
with traingraph.as_default():
    tf.import_graph_def(basegraph.as_graph_def())
    ***your training graph***
Make sure you load your variables when you start a session for a new graph.
I don't have experience with this functionality, so you may have to look into it a bit more.
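A commonly used alternative to import_graph_def, shown here only as a sketch, is to save the trained variables to a checkpoint and restore them into a second graph that defines variables with the same names but its own placeholder. The sketch assumes TensorFlow 2.x and its tf.compat.v1 graph API rather than the older API used in the question; build() and the shapes are hypothetical, and only the output layer is modelled, not the LSTM.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph API (assumed available through TensorFlow 2.x)


def build(batch_size):
    """Hypothetical helper: same variable names, separate input placeholder."""
    x = tf1.placeholder(tf.float32, shape=[batch_size, 4])
    with tf1.variable_scope("rnnlm"):
        w = tf1.get_variable("output_w", [4, 6])
        b = tf1.get_variable("output_b", [6])
    return x, tf.matmul(x, w) + b, w


ckpt = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# "Training" graph: variable batch size; save the variables to a checkpoint
train_graph = tf.Graph()
with train_graph.as_default():
    _, _, w_train = build(batch_size=None)
    saver = tf1.train.Saver()
    with tf1.Session(graph=train_graph) as sess:
        sess.run(tf1.global_variables_initializer())
        trained_w = sess.run(w_train)
        saver.save(sess, ckpt)

# "Simulation" graph: one example at a time; variables are restored by name
sim_graph = tf.Graph()
with sim_graph.as_default():
    x_sim, y_sim, w_sim = build(batch_size=1)
    saver = tf1.train.Saver()
    with tf1.Session(graph=sim_graph) as sess:
        saver.restore(sess, ckpt)
        restored_w = sess.run(w_sim)
        out = sess.run(y_sim, feed_dict={x_sim: np.ones((1, 4), np.float32)})
```

Because the second graph is separate, it contains none of the training placeholders, so nothing from the training setup has to be fed.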
Is it correct to say that starting a new session "releases resources", but doesn't erase the graph built in memory?
Yep, the graph object still holds it.
It seems to me like the 'reuse' feature allows Tensorflow to search outside of the current variable scope for variables with the same name (existing in a different scope), and use them in the current scope. Is this correct? If it is, what happens to all of the graph edges from the non-current scope that link to that variable? If it isn't, why does Tensorflow throw an error if you try to have the same variable name within two different scopes? It seems perfectly reasonable to define two variables with identical names in two different scopes, e.g. conv1/sum1 and conv2/sum1.
No, reuse determines the behaviour when you call get_variable on an existing name: when it is True, the existing variable is returned; otherwise get_variable creates a new variable, and raises a ValueError if that name already exists in the same scope. Using the same name in two different scopes (conv1/sum1 and conv2/sum1) should normally not throw an error. Are you sure you're using tf.get_variable and not just tf.Variable?
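The scoping rules can be checked with a short sketch, again assuming TensorFlow 2.x and its tf.compat.v1 API; the scope and variable names are taken from the question:

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph API (assumed available through TensorFlow 2.x)

with tf.Graph().as_default():
    # The same variable name in two different scopes is fine
    with tf1.variable_scope("conv1"):
        a = tf1.get_variable("sum1", [2])
    with tf1.variable_scope("conv2"):
        b = tf1.get_variable("sum1", [2])

    # reuse=True returns the variable that already exists under that name
    with tf1.variable_scope("conv1", reuse=True):
        a_again = tf1.get_variable("sum1", [2])

    # Without reuse, asking for an existing name in the same scope is an error
    try:
        with tf1.variable_scope("conv1"):
            tf1.get_variable("sum1", [2])
        raised = False
    except ValueError:
        raised = True

    names = (a.name, b.name)

print(names, a_again is a, raised)
```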
In my code I'm working within a new scope but the graph won't run without data to be fed into a placeholder from the initial, default scope. Is the default scope always 'in-scope' for some reason?
I don't really see what you mean. Placeholders do not always have to be used. If a placeholder is not required for running an operation, you don't have to feed it.
If graph edges can span different scopes, and names in different scopes can't be shared unless they refer to the exact same node, then that would seem to defeat the purpose of having different scopes in the first place. What am I misunderstanding here?
I think your understanding or usage of scopes is flawed; see above.
