After deploying a custom PyTorch model with the boto3 client in Python, I noticed that a new S3 bucket had been created with no visible objects. Is there a reason for this?
The bucket that contained my model already had the keyword "sagemaker" in its name, so I don't see any issue there.
Here is the code that I used for deployment:
from sagemaker.pytorch import PyTorchModel

remote_model = PyTorchModel(
    name=model_name,
    model_data=model_url,
    role=role,
    sagemaker_session=sess,
    entry_point="inference.py",
    # image=image,
    framework_version="1.5.0",
    py_version="py3"
)

remote_predictor = remote_model.deploy(
    instance_type="ml.g4dn.xlarge",
    initial_instance_count=1,
    # update_endpoint=True,  # comment out or set to False if the endpoint doesn't exist
    endpoint_name=endpoint_name,  # define a unique endpoint name; if omitted, SageMaker generates one based on the container used
    wait=True
)
It was likely created as a default bucket by the SageMaker Python SDK. Note that the code you posted does not use boto3 (the general-purpose AWS Python SDK) but sagemaker, the SageMaker-specific Python SDK, which is higher level than boto3.
The SageMaker Python SDK uses S3 in multiple places, for example to stage training code when using a Framework estimator, and to stage inference code when deploying with a Framework model (your case). It gives you control over the S3 location to use, but if you don't specify one, it may fall back to an automatically generated default bucket, provided it has the permissions to create it.
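You can see which bucket that is directly from the SDK; for example:

import sagemaker

# The SDK creates (or reuses) a default bucket, typically named
# sagemaker-<region>-<account-id>, when no bucket is specified.
print(sagemaker.Session().default_bucket())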
To control the S3 location used for code staging, you can set the code_location parameter on either your PyTorch estimator (training) or your PyTorchModel (serving).
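As a minimal sketch for your model, assuming a bucket/prefix you own (the S3 path below is a placeholder):

from sagemaker.pytorch import PyTorchModel

remote_model = PyTorchModel(
    name=model_name,
    model_data=model_url,
    role=role,
    sagemaker_session=sess,
    entry_point="inference.py",
    framework_version="1.5.0",
    py_version="py3",
    # stage the repacked inference code here instead of the default bucket
    code_location="s3://my-sagemaker-bucket/code-staging",
)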
Related
I am looking at this example to implement the processing of incoming raw data for a SageMaker endpoint prior to model inference/scoring. This is all great, but I have two questions:
1. How can one debug this (e.g. can I invoke the endpoint without it being exposed as a RESTful API and then use SageMaker Debugger)?
2. SageMaker can be used "remotely", e.g. via VS Code. Can such a script be uploaded programmatically?
Thanks.
SageMaker Debugger is only for monitoring training jobs.
https://docs.aws.amazon.com/sagemaker/latest/dg/train-debugger.html
I don't think you can use it on endpoints.
The script that you have provided is used for both training and inference. The container used by the estimator takes care of which functions to run, so it is not possible to debug the script directly. But what are you trying to debug in the code: the training part or the inference part?
While creating the estimator, we need to give either the entry_point or the source directory. If you are using "entry_point", the value should be a relative path to the file; if you are using "source_dir", you should be able to give an S3 path. So before running the estimator, you can programmatically tar the files, upload the archive to S3, and then use that S3 path in the estimator, as sketched below.
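A rough sketch of that flow, assuming a local inference.py and a bucket you own (the names below are placeholders):

import tarfile
import boto3

# 1) Package the script(s) into a tar.gz archive, as SageMaker expects
#    for an S3-hosted source_dir
with tarfile.open("sourcedir.tar.gz", "w:gz") as tar:
    tar.add("inference.py")

# 2) Upload the archive to S3
boto3.client("s3").upload_file("sourcedir.tar.gz", "my-bucket", "code/sourcedir.tar.gz")

# 3) Point the estimator (or model) at the uploaded archive, e.g.
#    source_dir="s3://my-bucket/code/sourcedir.tar.gz", entry_point="inference.py"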
I am a bit confused about automating SageMaker model retraining.
Currently I have a notebook instance with a SageMaker Linear Learner model performing a classification task. Using an Estimator I run training, then deploy the model by creating an endpoint. Afterwards I use a Lambda function to invoke this endpoint and wire it up to API Gateway, which gives me an API endpoint that accepts POST requests and sends back a response with the predicted class.
Now I'm facing the problem of retraining. For that I use a serverless approach with a Lambda function that reads environment variables for the training jobs. The problem is that SageMaker does not allow you to overwrite a training job; you can only create a new one. My goal is to automate the step where the new training job and the new endpoint configuration are applied to the existing endpoint, so that I don't need to change anything in API Gateway. Is it somehow possible to automatically attach a new endpoint config to an existing endpoint?
Thanks
Yes, use the UpdateEndpoint API. However, if you are using the Python SageMaker SDK, be aware that there might be some documentation floating around asking you to call
model.deploy(..., update_endpoint=True)
This is apparently now deprecated in v2 of the SageMaker SDK:
You should instead use the Predictor class to perform this update:
from sagemaker.predictor import Predictor
predictor = Predictor(endpoint_name="YOUR-ENDPOINT-NAME", sagemaker_session=sagemaker_session_object)
predictor.update_endpoint(instance_type="ml.t2.large", initial_instance_count=1)
If I am understanding the question correctly, you should be able to use CreateEndpointConfig near the end of the training job, then use UpdateEndpoint:
Deploys the new EndpointConfig specified in the request, switches to using newly created endpoint, and then deletes resources provisioned for the endpoint using the previous EndpointConfig (there is no availability loss).
If the API Gateway / Lambda is routed via the endpoint ARN, that should not change after using UpdateEndpoint.
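A minimal boto3 sketch of that flow (the endpoint, config, and model names below are placeholders):

import boto3

sm = boto3.client("sagemaker")

# Create a new endpoint config that points at the newly trained model
sm.create_endpoint_config(
    EndpointConfigName="my-endpoint-config-v2",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "my-newly-trained-model",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
    }],
)

# Switch the existing endpoint over to the new config;
# the endpoint name (and ARN) stays the same
sm.update_endpoint(
    EndpointName="my-existing-endpoint",
    EndpointConfigName="my-endpoint-config-v2",
)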
Using the Python SageMaker SDK, one can launch a TensorFlow training job with the following code, specifying via the model_dir attribute the S3 bucket where the results should be placed:
import sagemaker
from sagemaker.tensorflow import TensorFlow
sess = sagemaker.Session()
tf_estimator = TensorFlow(model_dir='s3://bucket_name', ...)
tf_estimator.fit(...)
However, after training is done, I can see the output on the default Sagemaker bucket but not on the specified bucket, what could be going wrong?
Found an answer thanks to AWS support:
The TensorFlow estimator has sagemaker.estimator.Framework as a base class, which in turn has sagemaker.estimator.EstimatorBase as a base class, which accepts the parameter output_path.
So the initialization of the TensorFlow estimator to pass a custom output bucket would look like:
S3_BUCKET = 's3://xxx'
tf_estimator = TensorFlow(..., output_path=S3_BUCKET)
I am attempting to upload my cleaned (and k-fold split) data to S3 so that I can use SageMaker to create a model with it (since SageMaker wants the training and test data in S3). However, whenever I attempt to upload the CSV to S3, the code runs but I don't see the file in S3.
I have tried changing which folder I access in SageMaker and uploading different types of files, none of which works. In addition, I have tried the approaches in similar Stack Overflow posts without success.
Also note that I am able to manually upload my CSV to S3, just not programmatically from SageMaker.
The code below is what I currently have to upload to S3, which I copied directly from the AWS documentation for uploading files from SageMaker.
import io
import csv
import boto3
#key = "{}/{}/examples".format(prefix,data_partition_name)
#url = 's3n://{}/{}'.format(bucket, key)
name = boto3.Session().resource('s3').Bucket('nc-demo-sagemaker').name
print(name)
boto3.Session().resource('s3').Bucket('nc-demo-sagemaker').upload_file('train', '/')
print('Done writing to {}'.format('sagemaker bucket'))
I expect that when I run that code snippet, I am able to upload the training and test data to the folder I want for use in creating sagemaker models.
I always upload files from Sagemaker notebook instance to S3 using this code. This will upload all the specified folder's contents to S3. Alternatively, you can specify a single file to upload.
import sagemaker
s3_path_to_data = sagemaker.Session().upload_data(
    bucket='my_awesome_bucket',
    path='local/path/data/train',
    key_prefix='my_crazy_project_name/data/train'
)
I hope this helps!
The issue may be due to a lack of proper S3 permissions for your SageMaker notebook.
Your IAM user has its own permissions attached, which is what dictates whether you can manually upload the CSV via the S3 console.
SageMaker notebooks, however, run with their own IAM execution role, which needs S3 permissions added explicitly. You can see this in the SageMaker console; the default IAM role is prefaced with SageMaker-XXX. You can either edit this SageMaker-created IAM role or attach existing IAM policies that grant read/write permissions for S3.
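If you are unsure which role the notebook is running under (and therefore which role needs the S3 permissions), you can check it from inside the notebook:

import sagemaker

# Prints the ARN of the execution role attached to this notebook instance;
# that role needs s3:GetObject / s3:PutObject on your target bucket.
print(sagemaker.get_execution_role())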
Import the sagemaker library and use a SageMaker session to upload and download files to/from an S3 bucket.
import sagemaker
sagemaker_session = sagemaker.Session(default_bucket='MyBucket')
upload_data = sagemaker_session.upload_data(path='local_file_path', key_prefix='my_prefix')
print('upload_data : {}'.format(upload_data))
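For the opposite direction, newer versions of the SDK also provide Session.download_data; a minimal sketch (bucket and prefix are placeholders) might look like:

# Downloads everything under s3://MyBucket/my_prefix/ into ./downloaded/
sagemaker_session.download_data(
    path='downloaded',
    bucket='MyBucket',
    key_prefix='my_prefix'
)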
I'm brand new to Azure Databricks, and my mentor suggested I complete the Machine Learning Bootcamp at
https://aischool.microsoft.com/en-us/machine-learning/learning-paths/ai-platform-engineering-bootcamps/custom-machine-learning-bootcamp
Unfortunately, after successfully setting up Azure Databricks, I've run into some issues in step 2. I successfully added the 1_01_introduction file to my workspace as a notebook. However, while the tutorial talks about teaching how to mount data in Azure Blob Storage, it seems to skip that step, which causes all of the next tutorial coding steps to throw errors. The first code bit (which the tutorial tells me to run), and the error that comes up afterwards, are included below.
%run "../presenter/includes/mnt_blob"
Notebook not found: presenter/includes/mnt_blob. Notebooks can be specified via a relative path (./Notebook or ../folder/Notebook) or via an absolute path (/Abs/Path/to/Notebook). Make sure you are specifying the path correctly.
Stacktrace:
/1_01_introduction: python
As far as I can tell, the Azure Blob storage just isn't set up yet, and so the code I run (as well as the code in all of the following steps) can't find the tutorial items that are supposed to be stored in the blob. Any help you fine folks can provide would be most appreciated.
Setting up and mounting Blob Storage in Azure Databricks does take a few steps.
First, create a storage account and then create a container inside of it.
Next, keep a note of the following items:
Storage account name: The name of the storage account when you created it
Storage account key: This can be found in the Azure Portal on the resource page.
Container name: The name of the container
In an Azure Databricks notebook, create variables for the above items.
storage_account_name = "Storage account name"
storage_account_key = "Storage account key"
container = "Container name"
Then, use the below code to set a Spark config to point to your instance of Azure Blob Storage.
spark.conf.set("fs.azure.account.key.{0}.blob.core.windows.net".format(storage_account_name), storage_account_key)
To mount it to Azure Databricks, use the dbutils.fs.mount method. The source is the address of your instance of Azure Blob Storage and a specific container. The mount point is where it will be mounted in the Databricks File System (DBFS) on Azure Databricks. The extra_configs argument is where you pass in the Spark config so it doesn't always need to be set.
dbutils.fs.mount(
    source = "wasbs://{0}@{1}.blob.core.windows.net".format(container, storage_account_name),
    mount_point = "/mnt/<Mount name>",
    extra_configs = {"fs.azure.account.key.{0}.blob.core.windows.net".format(storage_account_name): storage_account_key}
)
With those set, you can now start using the mount. To check it can see files in the storage account, use the dbutils.fs.ls command.
dbutils.fs.ls("dbfs:/mnt/<Mount name>")
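Once the mount lists your files, you can read them with Spark like any DBFS path; for example, assuming a CSV sits in the container (the file name is a placeholder):

# Read a CSV from the mounted container into a Spark DataFrame
df = spark.read.csv("dbfs:/mnt/<Mount name>/my_data.csv", header=True, inferSchema=True)
display(df)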
Hope that helps!