We want to tune a SageMaker PipelineModel with a HyperparameterTuner (or something similar) where several components of the pipeline have associated hyperparameters. Both components in our case are realized via SageMaker containers for ML algorithms.
model = PipelineModel(..., models = [ our_model, xgb_model ])
deploy = Estimator(image_uri = model, ...)
...
tuner = HyperparameterTuner(deploy, ..., tune_parameters, ...)
tuner.fit(...)
Now there is, of course, the problem of how to distribute the tune_parameters to the individual pipeline steps during tuning.
In scikit-learn this is achieved by specially naming the tuning parameters <StepName>__<ParameterName>.
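For reference, a minimal scikit-learn sketch of that naming convention (illustrative only, not our actual pipeline):

from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris

# Parameters are routed to steps via <StepName>__<ParameterName>.
pipe = Pipeline([('pca', PCA()), ('svc', SVC())])
param_grid = {
    'pca__n_components': [2, 3],   # goes to the 'pca' step
    'svc__C': [0.1, 1.0, 10.0],    # goes to the 'svc' step
}
X, y = load_iris(return_X_y=True)
search = GridSearchCV(pipe, param_grid, cv=3).fit(X, y)
print(search.best_params_)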
I don't see a way to achieve something similar with SageMaker, though. Also, a search for the two keywords brings up the same question here, but that is not really what we want to do.
Any suggestion how to achieve this?
If both models need to be jointly optimized, you could run a SageMaker HPO job in script mode and define both models in the script. Alternatively, you could run two HPO jobs, optimize each model separately, and then create the PipelineModel. There is no native support for running an HPO job on a PipelineModel.
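For illustration, a rough sketch of the first option: a single script-mode estimator tuned as one job. The entry script name, metric regex, S3 path, instance settings, and the prefixed hyperparameter names are placeholders; your train.py would have to parse the prefixes and route each value to the right model.

import sagemaker
from sagemaker.sklearn.estimator import SKLearn
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

# train.py (placeholder) defines and fits both pipeline components itself.
estimator = SKLearn(
    entry_point='train.py',
    framework_version='1.2-1',
    instance_type='ml.m5.xlarge',
    instance_count=1,
    role=sagemaker.get_execution_role(),
)

# Prefixes like our_model_* / xgb_* are just a convention invented here,
# analogous to scikit-learn's <step>__<param> naming.
hyperparameter_ranges = {
    'our_model_learning_rate': ContinuousParameter(0.001, 0.1),
    'xgb_max_depth': IntegerParameter(3, 10),
}

tuner = HyperparameterTuner(
    estimator,
    objective_metric_name='validation:score',   # must match a metric your script prints
    metric_definitions=[{'Name': 'validation:score',
                         'Regex': 'validation:score=([0-9\\.]+)'}],
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=20,
    max_parallel_jobs=2,
)
tuner.fit({'train': 's3://my-bucket/train'})     # placeholder S3 path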
I work at AWS and my opinions are my own.
I would like to finetune facebook/mbart-large-cc25 on my data using pre-training tasks, in particular Masked Language Modeling (MLM).
How can I do that in HuggingFace?
Edit: rewrote the question for the sake of clarity
Since you are doing everything in HuggingFace, fine-tuning a model on a pre-training task (assuming that the pre-training task is provided in HuggingFace) is pretty much the same for most models. Which tasks are you interested in fine-tuning mBART on?
HuggingFace provides extensive documentation for several fine-tuning tasks. For instance, the links below will help you fine-tune HF models for language modelling, MNLI, SQuAD, etc.: https://huggingface.co/transformers/v2.0.0/examples.html and https://huggingface.co/transformers/training.html
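For example, here is a minimal MLM fine-tuning sketch with the Trainer API. The checkpoint name, corpus path, and training arguments are placeholders; note that mBART itself is a sequence-to-sequence model pre-trained with a denoising objective, so it would need a seq2seq-style setup rather than the plain masked-LM recipe shown here.

from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "bert-base-multilingual-cased"  # placeholder masked-LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Tokenize a plain-text corpus (one example per line); path is a placeholder.
raw = load_dataset("text", data_files={"train": "my_corpus.txt"})
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# The collator masks tokens on the fly, which is what MLM training needs.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm-finetuned", num_train_epochs=1),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()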
I'm currently trying to integrate MLflow Tracking into my training pipeline and would like to log the hyperparameters of each training job of my hyperparameter tuning.
Does anyone know how to pull the list of hyperparameters that is shown on the SageMaker training job page in the AWS console? Is there another, smarter way to list and compare how models perform in SageMaker (and display that)?
I would assume there must be an easy and Pythonic way to do this (either via boto3 or the SageMaker SDK). I wasn't able to find it in CloudWatch.
Many thanks in advance!
There is indeed a rather Pythonic way in the SageMaker Python SDK:
import sagemaker

tuner = sagemaker.tuner.HyperparameterTuner.attach('<your-tuning-job-name>')
results = tuner.analytics().dataframe()  # all your tuning metadata, in pandas!
See full example here https://github.com/aws-samples/amazon-sagemaker-tuneranalytics-samples/blob/master/SageMaker-Tuning-Job-Analytics.ipynb
For doing more comparisons, go with what Oliver_Cruchant posted.
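If you then want to push these into MLflow, as the question mentions, a minimal sketch could look like the following. The metadata column names are what the analytics dataframe typically contains; treat them as an assumption and adjust to your job.

import mlflow
import sagemaker

tuner = sagemaker.tuner.HyperparameterTuner.attach('<your-tuning-job-name>')
df = tuner.analytics().dataframe()

# Columns in the analytics dataframe that are not hyperparameters (assumed names).
meta_cols = {'TrainingJobName', 'TrainingJobStatus', 'FinalObjectiveValue',
             'TrainingStartTime', 'TrainingEndTime', 'TrainingElapsedTimeSeconds'}

for _, row in df.iterrows():
    with mlflow.start_run(run_name=row['TrainingJobName']):
        params = {k: v for k, v in row.items() if k not in meta_cols}
        mlflow.log_params(params)                                    # one MLflow run per training job
        mlflow.log_metric('FinalObjectiveValue', row['FinalObjectiveValue'])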
To just get the hyperparameters with the SageMaker Python SDK (v1.65.0+):
import sagemaker

tuner = sagemaker.tuner.HyperparameterTuner.attach('your-tuning-job-name')
job_desc = tuner.describe()
job_desc['HyperParameterTuningJobConfig']['ParameterRanges']  # your tunable hyperparameters and their ranges
job_desc['TrainingJobDefinition']['StaticHyperParameters']    # your other, fixed hyperparameters
and with boto3:
import boto3

sm_client = boto3.client('sagemaker')
job_desc = sm_client.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName='your-tuning-job-name')
job_desc['HyperParameterTuningJobConfig']['ParameterRanges']  # your tunable hyperparameters and their ranges
job_desc['TrainingJobDefinition']['StaticHyperParameters']    # your other, fixed hyperparameters
Both ways return the result of calling the DescribeHyperParameterTuningJob API.
DescribeHyperParameterTuningJob API documentation: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeHyperParameterTuningJob.html
I am able to submit jobs to Azure ML services using a compute cluster. It works well, and the autoscaling combined with good flexibility for custom environments seems to be exactly what I need. However, so far all these jobs seem to only use one compute node of the cluster. Ideally I would like to use multiple nodes for a computation, but all methods that I see rely on rather deep integration with azure ML services.
My modelling case is a bit atypical. From previous experiments I identified a group of architectures (pipelines of preprocessing steps + estimators in Scikit-learn) that worked well.
Hyperparameter tuning for one of these estimators can be performed reasonably fast (couple of minutes) with RandomizedSearchCV. So it seems less effective to parallelize this step.
Now I want to tune and train this entire list of architectures.
This should be very easy to parallelize, since all architectures can be trained independently.
Ideally I would like something like (in pseudocode)
tuned = AzurePool.map(tune_model, [model1, model2,...])
However, I could not find any resources on how I could achieve this with an Azure ML Compute cluster.
An acceptable alternative would come in the form of a plug-and-play substitute for sklearn's CV-tuning methods, similar to the ones provided in dask or spark.
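For illustration, the kind of drop-in substitute I have in mind looks roughly like this with Dask (a sketch, not code I have running; the scheduler address and the toy search space are placeholders):

import joblib
from dask.distributed import Client
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

client = Client('tcp://scheduler-address:8786')  # placeholder Dask scheduler

X, y = load_iris(return_X_y=True)
search = RandomizedSearchCV(SVC(),
                            {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]},
                            n_iter=5, cv=3)

# The individual fits are farmed out to the Dask workers via joblib.
with joblib.parallel_backend('dask'):
    search.fit(X, y)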
There are a number of ways you could tackle this with AzureML. The simplest would be to just launch a number of jobs using the AzureML Python SDK (the underlying example is taken from here)
from azureml.train.sklearn import SKLearn

runs = []
for kernel in ['linear', 'rbf', 'poly', 'sigmoid']:
    for penalty in [0.5, 1, 1.5]:
        print('submitting run for kernel', kernel, 'penalty', penalty)
        script_params = {
            '--kernel': kernel,
            '--penalty': penalty,
        }
        estimator = SKLearn(source_directory=project_folder,
                            script_params=script_params,
                            compute_target=compute_target,
                            entry_script='train_iris.py',
                            pip_packages=['joblib==0.13.2'])
        runs.append(experiment.submit(estimator))
The above requires you to factor your training out into a script (or a set of scripts in a folder), along with the Python packages required. The SKLearn estimator above is a convenience wrapper for using scikit-learn. There are also estimators for TensorFlow, PyTorch, Chainer, and a generic one (azureml.train.estimator.Estimator); they all differ in the Python packages and base Docker image they use.
A second option, if you are actually tuning parameters, is to use the HyperDrive service like so (using the same SKLearn Estimator as above):
from azureml.train.sklearn import SKLearn
from azureml.train.hyperdrive.runconfig import HyperDriveConfig
from azureml.train.hyperdrive.sampling import RandomParameterSampling
from azureml.train.hyperdrive.run import PrimaryMetricGoal
from azureml.train.hyperdrive.parameter_expressions import choice

estimator = SKLearn(source_directory=project_folder,
                    script_params=script_params,
                    compute_target=compute_target,
                    entry_script='train_iris.py',
                    pip_packages=['joblib==0.13.2'])

param_sampling = RandomParameterSampling({
    "--kernel": choice('linear', 'rbf', 'poly', 'sigmoid'),
    "--penalty": choice(0.5, 1, 1.5)
})

hyperdrive_run_config = HyperDriveConfig(estimator=estimator,
                                         hyperparameter_sampling=param_sampling,
                                         primary_metric_name='Accuracy',
                                         primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                         max_total_runs=12,
                                         max_concurrent_runs=4)

hyperdrive_run = experiment.submit(hyperdrive_run_config)
Or you could use Dask to schedule the work, as you were mentioning. Here is a sample of how to set up Dask on an AzureML Compute cluster so you can do interactive work on it: https://github.com/danielsc/azureml-and-dask
There's also a ParallelTaskConfiguration class with a worker_count_per_node setting, which defaults to 1.
I would like to evaluate a custom-trained Tensorflow object detection model on a new test set using Google Cloud.
I obtained the initial checkpoints from:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
I know that the Tensorflow object-detection API allows me to run training and evaluation simultaneously by using:
https://github.com/tensorflow/models/blob/master/research/object_detection/model_main.py
To start such a job, I submit the following ml-engine job:
gcloud ml-engine jobs submit training [JOBNAME] \
    --runtime-version 1.9 \
    --job-dir=gs://path_to_bucket/model-dir \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
    --region us-central1 \
    --config object_detection/samples/cloud/cloud.yml \
    -- \
    --model_dir=gs://path_to_bucket/model_dir \
    --pipeline_config_path=gs://path_to_bucket/data/model.config
However, after I have successfully transfer-trained a model, I would like to calculate performance metrics, such as COCO mAP (http://cocodataset.org/#detection-eval) or PASCAL mAP (http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf), on a new test data set which has not been used before (neither during training nor during evaluation).
I have seen that there is a flag in model_main.py:
flags.DEFINE_string(
    'checkpoint_dir', None, 'Path to directory holding a checkpoint. If '
    '`checkpoint_dir` is provided, this binary operates in eval-only mode, '
    'writing resulting metrics to `model_dir`.')
But I don't know whether this really implies that model_main.py can be run in evaluation-only mode. If yes, how should I submit the ML Engine job?
Alternatively, are there any functions in the TensorFlow API which allow me to evaluate an existing output dictionary (containing bounding boxes, class labels, scores) based on COCO and/or PASCAL mAP? If there are, I could easily read in a TFRecord file locally, run inference, and then evaluate the output dictionary.
I know how to obtain these metrics for the evaluation data set that is evaluated during training in model_main.py. However, to my understanding I should still report model performance on a separate test data set, since I compare multiple models and perform some hyperparameter optimization, and should therefore not report results on the evaluation data set; am I right? On a more general note, I really cannot understand why one would switch from separate training and evaluation scripts (as in the legacy code) to a combined training and evaluation script.
Edit:
I found two related posts. However I do not think that the answers provided are complete:
how to check both training/eval performances in tensorflow object_detection
How to evaluate a pretrained model in Tensorflow object detection api
The latter was written while TF's object detection API still had separate evaluation and training scripts. This is no longer the case.
Thank you very much for any help.
If you specify checkpoint_dir and set run_once to true, then it should run evaluation exactly once on the eval dataset. I believe that metrics will be written to the model_dir and should also appear in your console logs. I usually just run this on my local machine, since it only does one pass over the dataset and is not a distributed job. Unfortunately I haven't tried running this particular code path on CMLE.
Regarding why we have a combined script: from the perspective of the Object Detection API, we were trying to write things in the tf.Estimator paradigm, but you are right that I personally found it a bit easier when the two functionalities lived in separate binaries. If you want, you can always wrap up this functionality in another binary :)
I am playing around with TensorFlow, and today I noticed that Google has also open-sourced the Python SDK for their Dataflow.
Currently, when I need to train and evaluate several networks in parallel, I either use Luigi and run one model training after another, or I use Spark and perform each model training within a map step.
All of this data processing is just one part of the pipeline.
I am wondering if there is, or if there is planned, something like a TensorFlow model training step inside of a Dataflow pipeline?
Is there currently some best practice around this?
Or do I have to run each model setting within the map step?
I went through the documentation and for now it seems to be really vague, so I'm asking here if someone has some experience with this.
There is nothing planned at this time.
If you can run the TensorFlow training on a single machine (it sounds like this is what you were doing with Spark), then it should be possible to do the training within a DoFn of a Dataflow pipeline.
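For example, a minimal sketch of that pattern (train_model and the list of model configurations are placeholders for your own code, not anything provided by the Dataflow SDK):

import apache_beam as beam

def train_model(config):
    # Placeholder: run your single-machine TensorFlow training here and
    # return whatever metrics you care about.
    return {'config': config, 'loss': 0.0}

class TrainModelFn(beam.DoFn):
    def process(self, config):
        # Each element is one model configuration; each worker trains it independently.
        yield train_model(config)

model_configs = [{'lr': 0.01}, {'lr': 0.001}]  # placeholder settings

with beam.Pipeline() as p:
    (p
     | 'CreateConfigs' >> beam.Create(model_configs)
     | 'TrainModels' >> beam.ParDo(TrainModelFn())
     | 'WriteResults' >> beam.Map(print))  # or beam.io.WriteToText(...) when running on Dataflow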