I'm trying to transcode some videos, but something is wrong with the way I am connecting.
Here's my code:
transcode = layer1.ElasticTranscoderConnection()
transcode.DefaultRegionEndpoint = 'elastictranscoder.us-west-2.amazonaws.com'
transcode.DefaultRegionName = 'us-west-2'
transcode.create_job(pipelineId, transInput, transOutput)
Here's the exception:
{u'message': u'The specified pipeline was not found: account=xxxxxx, pipelineId=xxxxxx.'}
To connect to a specific region in boto, you can use:
import boto.elastictranscoder
transcode = boto.elastictranscoder.connect_to_region('us-west-2')
transcode.create_job(...)
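Elastic Transcoder pipelines are region-scoped, so a "pipeline was not found" error with a valid pipeline ID usually means the client is pointed at a different region than the one the pipeline was created in; connecting explicitly to that region, as above, avoids the mismatch.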
I just started using boto the other day, and the previous answer didn't work for me - I don't know if the API changed or what (it seems a little odd if it did, but anyway). This is how I did it:
#!/usr/bin/env python
# Boto
import boto
# Debug
boto.set_stream_logger('boto')
# Pipeline Id
pipeline_id = 'lotsofcharacters-393824'
# The input object
input_object = {
    'Key': 'foo.webm',
    'Container': 'webm',
    'AspectRatio': 'auto',
    'FrameRate': 'auto',
    'Resolution': 'auto',
    'Interlaced': 'auto'
}
# The object (or objects) that will be created by the transcoding job;
# note that this is a list of dictionaries.
output_objects = [
    {
        'Key': 'bar.mp4',
        'PresetId': '1351620000001-000010',
        'Rotate': 'auto',
        'ThumbnailPattern': '',
    }
]
# Phone home
# - Har har.
et = boto.connect_elastictranscoder()
# Create the job
# - If successful, this will execute immediately.
et.create_job(pipeline_id, input_name=input_object, outputs=output_objects)
Obviously, this is a contrived example that just runs as a standalone Python script; it assumes you have a .boto file somewhere with your credentials in it.
Another thing to note is the PresetIds; you can find these in the AWS Management Console for Elastic Transcoder, under Presets. Finally, the keys and values that go into the dictionaries are lifted verbatim from the following link - as far as I can tell, they are passed straight through to a REST call (case sensitive, obviously).
AWS Create Job API
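If you'd rather not dig PresetIds out of the console, the same connection can also list them programmatically. A minimal sketch (not from the original answer; it assumes boto's layer1 list_presets call and the 'Presets' key of the REST response, so treat it as a starting point):

#!/usr/bin/env python
# Sketch: list available presets to find the PresetId you need.
# Assumes the same .boto credentials as above.
import boto

et = boto.connect_elastictranscoder()
response = et.list_presets()
for preset in response.get('Presets', []):
    print(preset['Id'], preset['Name'])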
I'm trying to get metrics from Azure Storage, like transaction_count, ingress, egress, server success latency, etc.
I am trying with the following code:
from azure.storage.blob import BlobServiceClient, BlobAnalyticsLogging, Metrics, CorsRule, RetentionPolicy
# blob_service_client is assumed to have been created earlier, e.g. via
# BlobServiceClient.from_connection_string("<connection string>")
# Create logging settings
logging = BlobAnalyticsLogging(read=True, write=True, delete=True, retention_policy=RetentionPolicy(enabled=True, days=5))
# Create metrics for requests statistics
hour_metrics = Metrics(enabled=True, include_apis=True, retention_policy=RetentionPolicy(enabled=True, days=5))
minute_metrics = Metrics(enabled=True, include_apis=True,retention_policy=RetentionPolicy(enabled=True, days=5))
# Create CORS rules
cors_rule = CorsRule(['www.xyz.com'], ['GET'])
cors = [cors_rule]
# Set the service properties
blob_service_client.set_service_properties(logging, hour_metrics, minute_metrics, cors)
# [END set_blob_service_properties]
# [START get_blob_service_properties]
properties = blob_service_client.get_service_properties()
# [END get_blob_service_properties]
print(properties)
This one does not give an error, but returns the following output:
{'analytics_logging': <azure.storage.blob._models.BlobAnalyticsLogging object at 0x7ffa629d8880>, 'hour_metrics': <azure.storage.blob._models.Metrics object at 0x7ffa629d84c0>, 'minute_metrics': <azure.storage.blob._models.Metrics object at 0x7ffa629d8940>, 'cors': [<azure.storage.blob._models.CorsRule object at 0x7ffa629d8a60>], 'target_version': None, 'delete_retention_policy': <azure.storage.blob._models.RetentionPolicy object at 0x7ffa629d85e0>, 'static_website': <azure.storage.blob._models.StaticWebsite object at 0x7ffa629d8a00>}
I understand that I am probably missing something; the documentation is quite dense and I don't understand it very well.
Thanks in advance for any possible answers.
You would probably want to read the service properties individually, e.g.:
analytics_logging = blob_service_client.get_service_properties().get('analytics_logging')
This way you should be able to process (or read) the output further in your code.
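For example, something along these lines should print the individual settings instead of the object reprs; this is only a sketch, with attribute names assumed from the azure-storage-blob v12 models (Metrics, RetentionPolicy, BlobAnalyticsLogging), so adjust if your SDK version differs:

props = blob_service_client.get_service_properties()

hour_metrics = props.get('hour_metrics')
if hour_metrics:
    print('hour metrics enabled:', hour_metrics.enabled)
    print('include APIs:', hour_metrics.include_apis)
    print('retention days:', hour_metrics.retention_policy.days)

analytics_logging = props.get('analytics_logging')
if analytics_logging:
    print('logging read/write/delete:',
          analytics_logging.read, analytics_logging.write, analytics_logging.delete)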
I am currently trying to deploy a Vertex pipeline to achieve the following:
Train a custom model (from a custom training Python package) and dump the model artifacts (the trained model and the data preprocessor that will be used at prediction time). This step is working fine, as I can see new resources being created in the storage bucket.
Create a model resource via ModelUploadOp. This step fails for some reason when specifying serving_container_environment_variables and serving_container_ports, with the error message in the errors section below. This is somewhat surprising, as they are both needed by the prediction container, and the environment variables are passed as a dict as specified in the documentation.
This step works just fine using gcloud commands:
gcloud ai models upload \
--region us-west1 \
--display-name session_model_latest \
--container-image-uri gcr.io/and-reporting/pred:latest \
--container-env-vars="MODEL_BUCKET=ml_session_model" \
--container-health-route=//health \
--container-predict-route=//predict \
--container-ports=5000
Create an endpoint.
Deploy the model to the endpoint.
There is clearly something that I am getting wrong with Vertex; the components documentation doesn't help much in this case.
Pipeline
from datetime import datetime
import kfp
from google.cloud import aiplatform
from google_cloud_pipeline_components import aiplatform as gcc_aip
from kfp.v2 import compiler
PIPELINE_ROOT = "gs://ml_model_bucket/pipeline_root"
@kfp.dsl.pipeline(name="session-train-deploy", pipeline_root=PIPELINE_ROOT)
def pipeline():
    training_op = gcc_aip.CustomPythonPackageTrainingJobRunOp(
        project="my-project",
        location="us-west1",
        display_name="train_session_model",
        model_display_name="session_model",
        service_account="name@my-project.iam.gserviceaccount.com",
        environment_variables={"MODEL_BUCKET": "ml_session_model"},
        python_module_name="trainer.train",
        staging_bucket="gs://ml_model_bucket/",
        base_output_dir="gs://ml_model_bucket/",
        args=[
            "--gcs-data-path",
            "gs://ml_model_data/2019-Oct_short.csv",
            "--gcs-model-path",
            "gs://ml_model_bucket/model/model.joblib",
            "--gcs-preproc-path",
            "gs://ml_model_bucket/model/preproc.pkl",
        ],
        container_uri="us-docker.pkg.dev/vertex-ai/training/scikit-learn-cpu.0-23:latest",
        python_package_gcs_uri="gs://ml_model_bucket/trainer-0.0.1.tar.gz",
        model_serving_container_image_uri="gcr.io/my-project/pred",
        model_serving_container_predict_route="/predict",
        model_serving_container_health_route="/health",
        model_serving_container_ports=[5000],
        model_serving_container_environment_variables={
            "MODEL_BUCKET": "ml_model_bucket/model"
        },
    )

    model_upload_op = gcc_aip.ModelUploadOp(
        project="and-reporting",
        location="us-west1",
        display_name="session_model",
        serving_container_image_uri="gcr.io/my-project/pred:latest",
        # When passing the following 2 arguments this step fails...
        serving_container_environment_variables={"MODEL_BUCKET": "ml_model_bucket/model"},
        serving_container_ports=[5000],
        serving_container_predict_route="/predict",
        serving_container_health_route="/health",
    )
    model_upload_op.after(training_op)

    endpoint_create_op = gcc_aip.EndpointCreateOp(
        project="my-project",
        location="us-west1",
        display_name="pipeline_endpoint",
    )

    model_deploy_op = gcc_aip.ModelDeployOp(
        model=model_upload_op.outputs["model"],
        endpoint=endpoint_create_op.outputs["endpoint"],
        deployed_model_display_name="session_model",
        traffic_split={"0": 100},
        service_account="name@my-project.iam.gserviceaccount.com",
    )
    model_deploy_op.after(endpoint_create_op)


if __name__ == "__main__":
    ts = datetime.now().strftime("%Y%m%d%H%M%S")
    compiler.Compiler().compile(pipeline, "custom_train_pipeline.json")
    pipeline_job = aiplatform.PipelineJob(
        display_name="session_train_and_deploy",
        template_path="custom_train_pipeline.json",
        job_id=f"session-custom-pipeline-{ts}",
        enable_caching=True,
    )
    pipeline_job.submit()
Errors and notes
When specifying serving_container_environment_variables and serving_container_ports the step fails with the following error:
{'code': 400, 'message': 'Invalid JSON payload received. Unknown name "MODEL_BUCKET" at \'model.container_spec.env[0]\': Cannot find field.\nInvalid value at \'model.container_spec.ports[0]\' (type.googleapis.com/google.cloud.aiplatform.v1.Port), 5000', 'status': 'INVALID_ARGUMENT', 'details': [{'#type': 'type.googleapis.com/google.rpc.BadRequest', 'fieldViolations': [{'field': 'model.container_spec.env[0]', 'description': 'Invalid JSON payload received. Unknown name "MODEL_BUCKET" at \'model.container_spec.env[0]\': Cannot find field.'}, {'field': 'model.container_spec.ports[0]', 'description': "Invalid value at 'model.container_spec.ports[0]' (type.googleapis.com/google.cloud.aiplatform.v1.Port), 5000"}]}]}
When commenting out serving_container_environment_variables and serving_container_ports, the model resource gets created, but deploying it manually to the endpoint results in a failed deployment with no output logs.
After some time researching the problem I stumbled upon this GitHub issue. The problem originated from a mismatch between the google_cloud_pipeline_components and Kubernetes API docs. In this case, serving_container_environment_variables is typed as an Optional[dict[str, str]] whereas it should have been typed as an Optional[list[dict[str, str]]]. A similar mismatch exists for the serving_container_ports argument as well. Passing the arguments following the Kubernetes documentation did the trick:
model_upload_op = gcc_aip.ModelUploadOp(
    project="my-project",
    location="us-west1",
    display_name="session_model",
    serving_container_image_uri="gcr.io/my-project/pred:latest",
    serving_container_environment_variables=[
        {"name": "MODEL_BUCKET", "value": "ml_session_model"}
    ],
    serving_container_ports=[{"containerPort": 5000}],
    serving_container_predict_route="/predict",
    serving_container_health_route="/health",
)
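For what it's worth, this list-of-dicts shape mirrors the Kubernetes EnvVar ({"name": ..., "value": ...}) and ContainerPort ({"containerPort": ...}) objects, which lines up with the model.container_spec.env and model.container_spec.ports fields named in the error above.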
I'm tasked with migrating repos to GitLab and I decided to automate the process using python-gitlab. Everything works fine except for binary or considered-binary files like compiled object files (.o) or .zip files. (I know that repositories are not the place for binaries. I work with what I've got and what I'm told to do.)
I'm able to upload them using:
import base64
import gitlab

project = gitlab.Gitlab("git_address", "TOKEN")
bin_content = base64.b64encode(open("my_file.o", 'rb').read()).decode()
and then:
data = {'branch':'main', 'commit_message':'go away', 'actions':[{'action': 'create', 'file_path': "my_file.o", 'content': bin_content, 'encode' : 'base64'}]}
project.commits.create(data)
Problem is that content of such files inside gitlab repository is something like:
f0VMRgIBAQAAAAAAAAAAAAEAPgABAAAAAAAAAAAAA....
Which is not what I want.
If I don't .decode() I get an error saying:
TypeError: Object of type bytes is not JSON serializable
Which is expected, since I sent a file opened in binary mode and encoded with base64.
I'd like to have such files uploaded/stored the same way as when I upload them using the web GUI "upload file" option.
Is it possible to achieve this using the python-gitlab API? If so, how?
The problem is that Python's base64.b64encode function will give you a bytes object, but REST APIs (specifically, JSON serialization) want strings. Also, the argument you want is encoding, not encode.
Here's the full example to use:
from base64 import b64encode
import gitlab
GITLAB_HOST = 'https://gitlab.com'
TOKEN = 'YOUR API KEY'
PROJECT_ID = 123 # your project ID
gl = gitlab.Gitlab(GITLAB_HOST, private_token=TOKEN)
project = gl.projects.get(PROJECT_ID)
with open('myfile.o', 'rb') as f:
    bin_content = f.read()

b64_content = b64encode(bin_content).decode('utf-8')
# b64_content must be a string!
f = project.files.create({'file_path': 'my_file.o',
                          'branch': 'main',
                          'content': b64_content,
                          'author_email': 'test@example.com',
                          'author_name': 'yourname',
                          'encoding': 'base64',  # important!
                          'commit_message': 'Create testfile'})
Then in the UI, you will see that GitLab has properly recognized the contents as binary rather than text.
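If you would rather keep the commits-with-actions approach from the question, the same fix should carry over; here is a sketch that reuses the project object and b64_content from above (the only changes from the question's payload are the key name and making sure the content is already a str):

data = {
    'branch': 'main',
    'commit_message': 'Add binary file',
    'actions': [{
        'action': 'create',
        'file_path': 'my_file.o',
        'content': b64_content,  # base64-encoded string
        'encoding': 'base64',    # 'encoding', not 'encode'
    }],
}
project.commits.create(data)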
As part of migrating to Python 3, I need to migrate from logservice to the Stackdriver Logging API. I have google-cloud-logging installed, and I can successfully fetch GAE application logs with, e.g.:
>>> from google.cloud.logging_v2 import LoggingServiceV2Client
>>> entries = LoggingServiceV2Client().list_log_entries(
...     ('projects/projectname',),
...     filter_='resource.type="gae_app" AND protoPayload.@type="type.googleapis.com/google.appengine.logging.v1.RequestLog"')
>>> print(next(iter(entries)))
proto_payload {
  type_url: "type.googleapis.com/google.appengine.logging.v1.RequestLog"
  value: "\n\ts~brid-gy\022\0018\032R5d..."
}
This gets me a LogEntry with text application logs in the proto_payload.value field. How do I deserialize that field? I've found lots of related mentions in the docs, but nothing pointing me to a google.appengine.logging.v1.RequestLog protobuf generated class anywhere that I can use, if that's even the right idea. Has anyone done this?
Woo! Finally got this working. I had to generate and use the Python bindings for the google.appengine.logging.v1.RequestLog protocol buffer myself, by hand. Here's how.
First, I cloned these two repos at head:
https://github.com/googleapis/googleapis.git
https://github.com/protocolbuffers/protobuf.git
Then, I generated request_log_pb2.py from request_log.proto by running:
protoc -I googleapis/ -I protobuf/src/ --python_out . googleapis/google/appengine/logging/v1/request_log.proto
Finally, I pip installed googleapis-common-protos and protobuf. I was then able to deserialize proto_payload with:
from google.cloud.logging_v2 import LoggingServiceV2Client
client = LoggingServiceV2Client(...)
log = next(iter(client.list_log_entries(
    ('projects/brid-gy',),
    filter_='logName="projects/brid-gy/logs/appengine.googleapis.com%2Frequest_log"')))
import request_log_pb2
pb = request_log_pb2.RequestLog.FromString(log.proto_payload.value)
print(pb)
You can use the LogEntry.to_api_repr() function to get a JSON version of the LogEntry.
>>> from google.cloud.logging import Client
>>> entries = Client().list_entries(filter_="severity:DEBUG")
>>> entry = next(iter(entries))
>>> entry.to_api_repr()
{'logName': 'projects/PROJECT_NAME/logs/cloudfunctions.googleapis.com%2Fcloud-functions',
 'resource': {'type': 'cloud_function', 'labels': {'region': 'us-central1', 'function_name': 'test', 'project_id': 'PROJECT_NAME'}},
 'labels': {'execution_id': '1zqolde6afmx'},
 'insertId': '000000-f629ab40-aeca-4802-a678-d513e605608e',
 'severity': 'DEBUG',
 'timestamp': '2019-10-24T21:55:14.135056Z',
 'trace': 'projects/PROJECT_NAME/traces/9c5201c3061d91c2b624abb950838b40',
 'textPayload': 'Function execution started'}
Do you really want to use the v2 API?
If not, use from google.cloud import logging and
set os.environ['GOOGLE_CLOUD_DISABLE_GRPC'] = 'true' (or a similar env setting).
That will effectively return JSON in payload instead of payload_pb.
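For reference, a rough sketch of that approach; the env var needs to be set before the client is created, and the exact payload shape depends on the library version, so treat this as an assumption rather than a guarantee:

import os

# Force the HTTP/JSON transport instead of gRPC; set this before importing
# or constructing the logging client.
os.environ['GOOGLE_CLOUD_DISABLE_GRPC'] = 'true'

from google.cloud import logging

client = logging.Client()
entries = client.list_entries(filter_='resource.type="gae_app"')
entry = next(iter(entries))
print(entry.payload)  # should now be parsed JSON rather than a raw proto_payload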
I am trying to serve static files using CherryPy, but I am unable to. I have looked at the tutorials, but setting it up like that is also not working properly.
All this is using Python 3.4
Config
config = {
    '/ws': {
        'tools.websocket.on': True,
        'tools.websocket.handler_cls': ChatWebSocketHandler,
        'tools.websocket.protocols': ['toto', 'mytest', 'hithere']
    },
    '/assets': {
        'tools.staticdir.on': True,
        'tools.staticdir.dir': constants.TEMPLATE_PATH
    },
}
I am starting up CherryPy like this:
app_root = Root(args.host, args.port, args.ssl, ssl_port=args.ssl_port)
cherrypy.quickstart(app_root, '', config=config)
The constant path is:
TEMPLATE_PATH = os.path.join(os.path.dirname(os.path.realpath(__file__)),"assets/")
I have also tried using paths like assets/ and /assets/ instead of the above constant.
The thing is, it does not recognize any of them and always gives a 404 error.
I too have had difficulty setting this up. I have a rather complicated setup, complete with multiple subdomains, that has evolved through several early versions of CherryPy and may no longer be the recommended approach, and I've not verified it will work in the simpler quickstart configuration you have here. However, the key in a setup that actually works for me is to put the config lines below in the web service object that you mount: I put the config dict that defines the static dir in the class definition, before any resources. It looks to me like you've defined your static dir in the configuration dict for a specific path rather than on the object itself. So perhaps try this in your hosted service object:
class WebService(object):
    _cp_config = {
        'tools.staticdir.on': True,
        'tools.staticdir.dir': '/path/to/serve/static/files/from'
    }

    @cherrypy.expose
    def index(self):
        [ ...additional resource definitions, etc ...]
Then later on:
my_cp_app = cherrypy.tree.mount(subDomain.WebService(),
                                '/subdomainFileLocation',
                                subdomainConfigDict)
cherrypy.quickstart(config=domainConfig)
I know you're working on Python 3. The above works for me on Python 2.7 + cherrypy-8.1.2. I hope this is helpful.
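For completeness, here is a minimal quickstart-style sketch of the same idea, untested against your exact setup; the main assumptions are that tools.staticdir.dir is given an absolute path and the app is mounted at '/':

import os
import cherrypy

# staticdir wants an absolute path
ASSETS_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'assets')

class Root(object):
    @cherrypy.expose
    def index(self):
        return "index"

config = {
    '/assets': {
        'tools.staticdir.on': True,
        'tools.staticdir.dir': ASSETS_DIR,
    },
}

if __name__ == '__main__':
    cherrypy.quickstart(Root(), '/', config=config)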