A bit confused about automating SageMaker retraining of a model.
Currently I have a notebook instance with a SageMaker LinearLearner model performing a classification task. Using an Estimator I run the training, then deploy the model by creating an Endpoint. Afterwards I use a Lambda function to invoke this endpoint and add it to API Gateway, getting an API endpoint that can receive POST requests and send back a response with the class.
Now I'm facing the problem of retraining. For that I use a serverless approach and a Lambda function that gets environment variables for the training jobs. The problem is that SageMaker does not allow you to overwrite a training job; you can only create a new one. My goal is to automate the part where the new training job and the new endpoint config are applied to the existing endpoint, so that I don't need to change anything in API Gateway. Is it somehow possible to automatically attach a new endpoint config to an existing endpoint?
Thanks
Yes, use the UpdateEndpoint API. However, if you are using the Python SageMaker SDK, be aware that there is some documentation floating around asking you to call
model.deploy(..., update_endpoint=True)
This is apparently now deprecated in v2 of the SageMaker SDK.
You should instead use the Predictor class to perform this update:
from sagemaker.predictor import Predictor
predictor = Predictor(endpoint_name="YOUR-ENDPOINT-NAME", sagemaker_session=sagemaker_session_object)
predictor.update_endpoint(instance_type="ml.t2.large", initial_instance_count=1)
If I am understanding the question correctly, you should be able to use CreateEndpointConfig near the end of the training job, then use UpdateEndpoint:
Deploys the new EndpointConfig specified in the request, switches to using newly created endpoint, and then deletes resources provisioned for the endpoint using the previous EndpointConfig (there is no availability loss).
If the API Gateway / Lambda is routed via the endpoint ARN, that should not change after using UpdateEndpoint.
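For reference, a minimal boto3 sketch of that CreateEndpointConfig + UpdateEndpoint flow (the model, config, and endpoint names below are placeholders, not taken from the question):

import boto3

sm = boto3.client('sagemaker')

# Hypothetical names; substitute your own identifiers.
new_config_name = 'my-model-config-v2'

# Create a new endpoint config pointing at the freshly trained model.
sm.create_endpoint_config(
    EndpointConfigName=new_config_name,
    ProductionVariants=[{
        'VariantName': 'AllTraffic',
        'ModelName': 'my-newly-trained-model',
        'InstanceType': 'ml.m4.xlarge',
        'InitialInstanceCount': 1,
    }])

# Swap the existing endpoint over to the new config; the endpoint name
# (and ARN) stays the same, so API Gateway / Lambda need no changes.
sm.update_endpoint(
    EndpointName='my-existing-endpoint',
    EndpointConfigName=new_config_name)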
I am looking at this example to implement the processing of incoming raw data for a SageMaker endpoint prior to model inference/scoring. This is all great, but I have 2 questions:
How can one debug this (e.g. can I invoke the endpoint without it being exposed as a RESTful API and then use SageMaker Debugger)?
SageMaker can be used "remotely", e.g. via VS Code. Can such a script be uploaded programmatically?
Thanks.
SageMaker Debugger is only for monitoring training jobs.
https://docs.aws.amazon.com/sagemaker/latest/dg/train-debugger.html
I don't think you can use it on endpoints.
The script that you have provided is used both for training and inference. The container used by the estimator will take care of which functions to run, so it is not possible to debug the script directly. But what are you debugging in the code: the training part or the inference part?
While creating the estimator we need to give either the entry point or the source directory. If you are using "entry_point", the value should be the relative path to the file; if you are using "source_dir", you can give an S3 path. So before running the estimator, you can programmatically tar the files, upload the archive to S3, and then use the S3 path in the estimator.
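For example, a rough sketch of that tar-and-upload step (the bucket and key names below are placeholders, and the estimator line is only indicative):

import tarfile
import boto3

# Package the training/inference scripts into a sourcedir.tar.gz archive.
with tarfile.open('sourcedir.tar.gz', 'w:gz') as tar:
    tar.add('source', arcname='.')

# Upload the archive to S3 (bucket/key are hypothetical).
boto3.client('s3').upload_file('sourcedir.tar.gz', 'my-bucket', 'code/sourcedir.tar.gz')

# Then point the estimator at the uploaded archive, e.g.
# estimator = PyTorch(entry_point='serve.py',
#                     source_dir='s3://my-bucket/code/sourcedir.tar.gz', ...)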
I am using Python code that creates a batch pool using the Azure SDK for Python. I have deployed this code in Function Apps and I am using it in Data Factory to create the batch pool. I want to check whether the nodes have started or are in an idle state before I run the rest of the components in the pipeline. Is there any way I can check this in Data Factory?
No, there isn't a way to check it in Data Factory. Data Factory will call and run the Azure Function directly.
There is no direct way to check the node status via ADF, but you can consider the workaround below to achieve the same via ADF using a Web Activity.
Step 1: Make an API call to the Azure Batch service to get your node status details. We can leverage the Web Activity in Azure Data Factory to do this.
Step 2: Based on the response from Step 1, invoke the execution of the further activities in your pipeline.
Check the links below for more details about the Batch Service API and the Web Activity.
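If you prefer to keep the check inside your Python Function App instead, a rough sketch using the azure-batch SDK could look like the one below (the account name, key, URL and pool id are placeholders, and the exact client constructor arguments may differ between azure-batch versions):

from azure.batch import BatchServiceClient
from azure.batch.batch_auth import SharedKeyCredentials

# Hypothetical credentials and pool id.
credentials = SharedKeyCredentials('mybatchaccount', 'ACCOUNT_KEY')
batch_client = BatchServiceClient(
    credentials, batch_url='https://mybatchaccount.myregion.batch.azure.com')

# List the pool's compute nodes and check whether they are all idle yet.
nodes = list(batch_client.compute_node.list('my-pool-id'))
all_idle = bool(nodes) and all(node.state == 'idle' for node in nodes)
print('pool ready:', all_idle)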
I am developing an application in Python where I need to consume the current values of a metric for some pods (e.g. CPU, memory). I can get this info through an API (/apis/metrics.k8s.io/v1beta1/pods), but I am trying to use the Kubernetes client to access these metrics within my Python app.
I see that the V2beta2PodsMetricStatus model includes the information I need, but I cannot find the API endpoint I should use to reach this model.
Any help or alternative option would be very welcome, since I have been stuck on this issue for several days.
Thank you very much for your time.
I finally got the metrics by directly executing the relevant kubectl command. I don't like the solution, but it works. I hope to be able to use the kubernetes-client instead in the near future.
import json
import subprocess

# Query the metrics API directly through kubectl and parse the JSON output.
p = subprocess.getoutput('kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods')
ret_metrics = json.loads(p)
items = ret_metrics['items']
for item in items:
    print(item['metadata']['name'])
You could use the call_api method of the api_client, as below, to call an API which is not part of the core Kubernetes API.
ret_metrics = api_client.call_api('/apis/metrics.k8s.io/v1beta1/pods', 'GET', auth_settings = ['BearerToken'], response_type='json', _preload_content=False)
response = ret_metrics[0].data.decode('utf-8')
There is an open issue to support it via the V2beta2PodsMetricStatus model.
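For completeness, a fuller sketch of how the client could be set up around that call_api call (assuming kubeconfig-based authentication; adjust the config loading to your cluster):

import json
from kubernetes import client, config

# Load credentials from the local kubeconfig
# (use config.load_incluster_config() when running inside a pod).
config.load_kube_config()
api_client = client.ApiClient()

ret_metrics = api_client.call_api(
    '/apis/metrics.k8s.io/v1beta1/pods', 'GET',
    auth_settings=['BearerToken'], response_type='json',
    _preload_content=False)

metrics = json.loads(ret_metrics[0].data.decode('utf-8'))
for item in metrics['items']:
    print(item['metadata']['name'], item['containers'][0]['usage'])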
I have trained and deployed a model in Pytorch with Sagemaker. I am able to call the endpoint and get a prediction. I am using the default input_fn() function (i.e. not defined in my serve.py).
model = PyTorchModel(model_data=trained_model_location,
                     role=role,
                     framework_version='1.0.0',
                     entry_point='serve.py',
                     source_dir='source')
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
A prediction can be made as follows:
input = "0.12787057, 1.0612601, -1.1081504"
predictor.predict(np.genfromtxt(StringIO(input), delimiter=",").reshape(1, 3))
I want to serve the model with a REST API over HTTP POST using Lambda and API Gateway. I was able to use invoke_endpoint() for this with an XGBoost model in SageMaker. I am not sure what to send in the body for PyTorch.
client = boto3.client('sagemaker-runtime')
response = client.invoke_endpoint(EndpointName=ENDPOINT,
                                  ContentType='text/csv',
                                  Body=???)
I believe I need to understand how to write a custom input_fn to accept and process the type of data I am able to send through invoke_endpoint. Am I on the right track, and if so, how could the input_fn be written to accept a CSV from invoke_endpoint?
Yes, you are on the right track. You can send csv-serialized input to the endpoint without using the predictor from the SageMaker SDK, using other SDKs such as boto3, which is available in Lambda:
import json
import boto3

runtime = boto3.client('sagemaker-runtime')

payload = '0.12787057, 1.0612601, -1.1081504'

response = runtime.invoke_endpoint(
    EndpointName=ENDPOINT_NAME,
    ContentType='text/csv',
    Body=payload.encode('utf-8'))

result = json.loads(response['Body'].read().decode())
This will pass a csv-formatted input to the endpoint, which you may need to reshape in the input_fn to the dimensions expected by the model.
For example:
import numpy as np
import torch
from io import StringIO

def input_fn(request_body, request_content_type):
    if request_content_type == 'text/csv':
        return torch.from_numpy(
            np.genfromtxt(StringIO(request_body), delimiter=',').reshape(1, 3))
Note: I wasn't able to test the specific input_fn above with your input content and shape, but I have used this approach on an sklearn RandomForest a couple of times, and looking at the PyTorch SageMaker serving doc the rationale above should work.
Don't hesitate to use the endpoint logs in CloudWatch (available from the endpoint UI in the console) to diagnose any inference error; those logs are usually much more verbose than the high-level errors returned by the inference SDKs.
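On the Lambda side, a minimal handler sketch could look like the one below (the environment variable, the API Gateway proxy event shape, and the response format are assumptions; adapt them to your integration):

import json
import os
import boto3

runtime = boto3.client('sagemaker-runtime')
ENDPOINT_NAME = os.environ['ENDPOINT_NAME']  # assumed to be configured on the function

def lambda_handler(event, context):
    # Assumes an API Gateway proxy integration where the POST body is the
    # raw csv string, e.g. "0.12787057, 1.0612601, -1.1081504".
    payload = event['body']
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType='text/csv',
        Body=payload.encode('utf-8'))
    result = json.loads(response['Body'].read().decode())
    return {'statusCode': 200, 'body': json.dumps(result)}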
I have built an ML model (using the sklearn module), and I want to serve predictions from it via AWS API Gateway + a Lambda function.
My problems are:
I can't install sklearn + numpy etc. because of the Lambda capacity limitations (the bundle is greater than 140 MB).
Maybe it's a silly question, but do you know if there are better ways to do this task?
I've tried this tutorial in order to reduce the bundle size; however, it raises an exception because of the --use-wheel flag.
https://serverlesscode.com/post/scikitlearn-with-amazon-linux-container/
bucket = s3.Bucket(os.environ['BUCKET'])
model_obj = bucket.Object(os.environ['MODEL_NAME'])
model = pickle.loads(model_obj.get()['Body'].read())
model.predict(z_features)[0]
Where z_features are my features after applying a scaler.
Just figured it out!
The solution basically stands on top of AWS Lambda Layers.
I created an sklearn layer that contains only the relevant compiled libraries.
Then I ran sls package in order to pack a bundle containing those files together with my own handler.py code.
The last step was to run
sls deploy --package .serverless
Hope it'll be helpful to others.
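For reference, a minimal handler.py sketch under those assumptions (the layer puts sklearn/numpy on the Python path; the bucket and key environment variables and the event shape below are hypothetical):

import json
import os
import pickle
import boto3

s3 = boto3.resource('s3')

# Load the pickled model once per container, outside the handler.
model_bytes = s3.Object(os.environ['BUCKET'], os.environ['MODEL_NAME']).get()['Body'].read()
model = pickle.loads(model_bytes)

def handler(event, context):
    # Assumes the request body is a JSON list of already-scaled features.
    features = json.loads(event['body'])
    prediction = model.predict([features])[0]
    return {'statusCode': 200, 'body': json.dumps({'prediction': str(prediction)})}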
If you simply want to serve your sklearn model, you could skip the hassle of setting up a lambda function and tinkering with the API Gateway - just upload your model as a pkl file to FlashAI.io which will serve your model automatically for free. It handles high traffic environments and unlimited inference requests. For sklearn models specifically, just check out the user guide and within 5 minutes you'll have your model available as an API.
Disclaimer: I'm the author of this service