I'm new to Azure; I have used AWS for many years. I want to create an Azure Function using a Docker image (ideally published to Docker Hub). When I try to manually create the function in the portal (new account), the Docker publish option is greyed out no matter how I vary the options. I have set up billing.
The answer, as of September 2021, is that Azure cannot run Docker-based Functions on the Consumption service plan. That kind of sucks, but I'll see where that puts me.
The option magically stopped being greyed out.
Is it practically possible to simulate an AWS environment locally using Moto and Python?
I want to write an AWS Glue job that will fetch records from my local database and upload them to an S3 bucket for data-quality checks, and later trigger a Lambda function for a cron-job run, using the Moto library with the moto.mock_glue decorator. Any suggestion or documentation would be highly appreciated, as I don't see much of a clue on this. Thank you in advance.
AFAIK, moto is meant to patch boto modules for testing.
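For example, here is a minimal sketch of that patching in a test, assuming moto >= 5 (which replaced the per-service decorators with mock_aws) and boto3 installed; the bucket and key names are hypothetical:
import boto3
from moto import mock_aws

@mock_aws
def test_upload_record():
    # boto3 calls inside the decorated function hit moto's in-memory fake, not real AWS
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket="quality-check-bucket")
    s3.put_object(Bucket="quality-check-bucket", Key="record.json", Body=b"{}")
    assert s3.list_objects_v2(Bucket="quality-check-bucket")["KeyCount"] == 1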
I have experience working with LocalStack, a Docker container you can run locally that acts as a live service emulator for most AWS services (some are only available for paying users).
https://docs.localstack.cloud/getting-started/
You can see here which services are supported by the free version.
https://docs.localstack.cloud/user-guide/aws/feature-coverage/
In order to use it, you need to change the endpoint URL to point to the local service running in Docker.
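For example, a minimal sketch with boto3, assuming LocalStack is running on its default port 4566 (the bucket name is hypothetical):
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:4566",  # point the SDK at LocalStack instead of AWS
    aws_access_key_id="test",              # LocalStack accepts dummy credentials
    aws_secret_access_key="test",
    region_name="us-east-1",
)
s3.create_bucket(Bucket="my-local-bucket")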
As it's a Docker container, you can incorporate it into remote tests as well, e.g., if you're using k8s or a similar orchestrator.
I'm working on Python code where I need to build a Docker image and push it to GCR (Google Container Registry). I'm able to create a Docker image using the Docker SDK for Python; however, I can't find a way to push it to GCR. I was looking into the docker_client.images.push() method, but I don't see a way to connect to GCR with it. I can build the Docker image using docker_client.images.build(), but I can't find any way to push it to Google Container Registry. There are ways to push to a Docker registry, but I need it specifically for GCR.
I have already implemented this using the Google CLI and through Azure DevOps; however, I'm trying to do the same from a Python application.
Any help/suggestion is appreciated.
This seems to have worked for me using the Docker SDK:
import docker

# Connect to the local Docker daemon
client = docker.from_env()

# Build the image from the directory containing your Dockerfile
image, logs = client.images.build(path="<path/to/your/docker-repo>")

# Tag the image with its GCR destination, then push it
image_dest = "gcr.io/<your-gcp-project>/<your-repo-name>"
image.tag(image_dest)
client.api.push(image_dest)
I also had to configure Docker credentials for GCR with gcloud auth configure-docker.
Have you looked at the container registry python library?
https://github.com/google/containerregistry
Or better yet, have you considered switching to Go?
https://github.com/google/go-containerregistry
I'm looking to create a publisher that streams tweets containing a certain hashtag and sends them to a Pub/Sub topic.
The tweets will then be ingested with Cloud Dataflow and loaded into a BigQuery table.
In the following article they do something similar, where the publisher is hosted in a Docker image on a Google Compute Engine instance.
Can anyone recommend alternative Google Cloud resources that could host the publisher code more simply, avoiding the need to create a Dockerfile etc.?
The publisher would need to run constantly. Would Cloud Run, for example, be a suitable alternative?
There are some workarounds I can think of:
A quick way to avoid a container architecture is to run the on_data method inside a loop, for example with something like while(True), or to start a Stream as explained in Create your Python script, and run the code on a Compute Engine instance in the background with nohup python -u myscript.py. Alternatively, follow the steps described in Script on GCE to capture tweets, which uses tweepy.Stream to start the streaming (see the sketch after this list).
You might want to reconsider the Dockerfile option, since its configuration need not be difficult. See Tweets & pipelines, where a script reads the data and publishes to Pub/Sub: only 9 lines are used for the Dockerfile, and it is deployed on App Engine using Cloud Build. Another implementation with a Dockerfile that requires more steps is twitter-for-bigquery; in case it helps, you will see that it involves more specific steps and more configuration.
Cloud Functions is another option; in the guide Serverless Twitter with Google Cloud you can check the Design section to see if it fits your use case.
Airflow with Twitter Scraper could work for you, since Cloud Composer is a managed service for Airflow and lets you create an Airflow environment quickly. It uses the Twint library; check the Technical section in the link for more details.
Stream Twitter Data into BigQuery with Cloud Dataprep is a workaround that puts complex configuration aside. In this case the job won't run constantly, but it can be scheduled to run every few minutes.
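For the first workaround, here is a minimal sketch of a streaming publisher, assuming tweepy >= 4 (whose StreamingClient replaces the older tweepy.Stream used in the linked scripts) and google-cloud-pubsub installed; the project, topic, bearer token, and hashtag are hypothetical:
import tweepy
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-gcp-project", "tweets")

class PubSubStream(tweepy.StreamingClient):
    def on_data(self, raw_data):
        # Forward each raw tweet payload (bytes) to the Pub/Sub topic
        publisher.publish(topic_path, data=raw_data)

stream = PubSubStream("MY_BEARER_TOKEN")
stream.add_rules(tweepy.StreamRule("#myhashtag"))
stream.filter()  # blocks and runs constantly, as required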
I have a huge amount of data (hundreds of gigabytes) on Google BigQuery, and for ease of use (many post-query treatments) I'm working with the bigquery Python package. The problem is that I have to run all my queries again whenever I shut my laptop down, which is very expensive as my dataset is about one terabyte. I have thought about Google Compute Engine, but that is a poor solution, as I would still be paying for the machines if I don't stop them. My last idea is to mount a Docker image on our own sandbox, which is cheaper and can do exactly what I'm looking for. So I would like to know if someone has ever mounted a Docker image for BigQuery? Thanks for helping!
We mount all of our Python/BigQuery projects into Docker containers and push them to Google Container Registry.
Automated scheduling, dependency graphing, and logging can be handled with Google Cloud Composer (Airflow). It's pretty simple to set up, and Airflow has a Kubernetes Pod Operator that allows you to specify a Python file to run in your Docker image on GCR. You can use this workflow to make sure all of your queries and Python scripts run on GCP without having to worry about Google Compute Engine or any devops type of things.
https://cloud.google.com/composer/docs/how-to/using/using-kubernetes-pod-operator
https://cloud.google.com/composer/
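As a rough illustration, here is a minimal DAG sketch for that Kubernetes Pod Operator workflow, assuming an Airflow 1.10-era Cloud Composer environment (Airflow 2 moved the operator into the cncf.kubernetes provider package); the image path and script name are hypothetical:
from datetime import datetime
from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

with DAG("bigquery_docker_job",
         start_date=datetime(2021, 1, 1),
         schedule_interval="@daily",
         catchup=False) as dag:
    run_queries = KubernetesPodOperator(
        task_id="run_queries",
        name="run-queries",
        namespace="default",
        image="gcr.io/my-gcp-project/bq-jobs:latest",  # your image on GCR
        cmds=["python", "run_queries.py"],             # script inside the image
    )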
In AWS, you can assign a role to a VM, which then authorizes the instance when it makes queries to the AWS SDK. I am looking for similar functionality in Azure, or something that would enable me to do close to that.
I found this post, which suggests that this is not possible in the way AWS does it. Are there any workarounds? I really don't want the system administrator to have to log in to the instance and enter their Azure Active Directory credentials to authorize it.
Excellent question :). I would suggest waiting a few days; we have something in progress that seems to fit your need. I created this issue for tracking.
The simplest approach would be to create Service Principal credentials for these VMs. To do that, execute a post-deployment script that installs the CLI and runs "az ad sp create-for-rbac --sdk-auth > ~/mycredentials.json". Then your SDK script can start by reading this credential file.
The "create-for-rbac" commands already exists if you want to look at it (--sdk-auth is the new option coming), so you can see that you can specify all scope and permissions needed in this command.
(I own the Azure SDK for Python at MS)