DOCKER_BUILDKIT - passing a token secret during build time from github actions - python

I have a pip installable package that was published using twine to Azure DevOps artifact.
In my build image I need to download that package and install it with pip. So I need to
authenticate against azure artifacts to get it. So I am using artifacts-keyring to do so
The pip installable URL is something like this:
https://<token>@pkgs.dev.azure.com/<org>/<project>/_packaging/<feed>/pypi/simple/ --no-cache-dir <package-name>
I am trying to use Docker BuildKit to pass the token during build time.
I am mounting the secret and using it by getting its value with cat command substitution:
# syntax = docker/dockerfile:1.2
ENV ARTIFACTS_KEYRING_NONINTERACTIVE_MODE=true
RUN --mount=type=secret,id=azdevopstoken,dst=/run/secrets/azdevopstoken \
pip install --upgrade pip && \
pip install pyyaml numpy lxml artifacts-keyring && \
pip install -i https://"$(cat /run/secrets/azdevopstoken)"@pkgs.dev.azure.com/<org>/<project>/_packaging/<feed>/pypi/simple/ --no-cache-dir <package-name>
and it works locally from my workstation when I run:
(My source file containing the token in plain text is azdevopstoken.txt, inside my local project directory.)
DOCKER_BUILDKIT=1 docker image build --secret id=azdevopstoken,src=./azdevopstoken.txt --progress plain . -t my-image:tag
Now I am running this build command from a GitHub Actions pipeline,
and I get this output:
could not parse secrets: [id=azdevopstoken,src=./azdevopstoken.txt]: failed to stat ./azdevopstoken.txt: stat ./azdevopstoken.txt: no such file or directory
Error: Process completed with exit code 1.
This is expected, since I am not uploading the azdevopstoken.txt file: I don't want it in my repository, because it contains my token in plain text.
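One workaround I can think of is to write the secret to a temporary file inside the workflow run itself, roughly like the sketch below (the step name is illustrative, not my actual workflow):
- name: Build image with BuildKit secret
  env:
    AZ_DEVOPS_TOKEN: ${{ secrets.AZ_DEVOPS_TOKEN }}
  run: |
    printf '%s' "$AZ_DEVOPS_TOKEN" > ./azdevopstoken.txt
    DOCKER_BUILDKIT=1 docker image build --secret id=azdevopstoken,src=./azdevopstoken.txt --progress plain . -t my-image:tag
    rm ./azdevopstoken.txt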
Reading carefully here, I see there is a workaround to encrypt secrets. Perhaps this could be a good way to use BuildKit from my GitHub Actions pipeline, but it adds a step to my workflow, so I am not sure whether to follow this option or not.
Mainly because I am already passing the secret the old way, using the --build-arg flag at build time, like this:
ARG AZ_DEVOPS_TOKEN
ENV AZ_DEVOPS_TOKEN=$AZ_DEVOPS_TOKEN
RUN pip install --upgrade pip && \
pip install pyyaml numpy lxml artifacts-keyring && \
pip install -i https://$AZ_DEVOPS_TOKEN@pkgs.dev.azure.com/ORG/PROJECT/_packaging/feed/pypi/simple/ --no-cache-dir PACKAGE-NAME
My docker build command in GitHub Actions looks like this:
docker image build --build-arg AZ_DEVOPS_TOKEN="${{ secrets.AZ_DEVOPS_TOKEN }}" . -t containerregistry.azurecr.io/my-image:preview
It works perfectly. The thing is, I have heard --build-arg is not a safe way to pass sensitive information. Despite that, I ran docker history after this command, and I couldn't see the secret exposed or anything similar.
> docker history af-fem-uploader:preview-buildarg
IMAGE CREATED CREATED BY SIZE COMMENT
2eff105408c9 34 seconds ago RUN /bin/sh -c pip install pytest && pytest … 5MB buildkit.dockerfile.v0
<missing> 38 seconds ago WORKDIR /home/site/wwwroot/HttpUploadTrigger/ 0B buildkit.dockerfile.v0
<missing> 38 seconds ago ADD . /home/site/wwwroot # buildkit 1.9MB buildkit.dockerfile.v0
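Presumably, since the Dockerfile copies the build arg into an ENV, the value would persist in the final image's metadata even though docker history doesn't show it; a check like this (using the same tag as above) should confirm whether it is visible:
docker image inspect af-fem-uploader:preview-buildarg --format '{{.Config.Env}}'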
So what is the benefit of passing secrets via BuildKit to avoid exposing them, if I have to upload the file containing the secret to my repository?
Perhaps I am missing something.

Related

how to successfully run docker image as container

Below is my Dockerfile:
FROM python:3.9.0
ARG WORK_DIR=/opt/quarter_1
RUN apt-get update && apt-get install cron -y && apt-get install -y default-jre
# Install python libraries
COPY requirements.txt /tmp/requirements.txt
RUN pip install --upgrade pip && pip install -r /tmp/requirements.txt
WORKDIR $WORK_DIR
EXPOSE 8888
VOLUME /home/data/quarter_1/
# Copy etl code
# copy code on container under your workdir "/opt/quarter_1"
COPY . .
I connected to the server, then did the build with docker build -t my-python-app .
When I tried to run the container from the built image I got nothing and was not able to do it.
docker run -p 8888:8888 -v /home/data/quarter_1/:/opt/quarter_1 image_id
The workdir here is /opt/quarter_1.
Update based on comments
If I understand everything you've posted correctly, my suggestion here is to use a base Docker Jupyter image, modify it to add your pip requirements, and then add your files to the work path. I've tested the following:
Start with a dockerfile like below
FROM jupyter/base-notebook:python-3.9.6
COPY requirements.txt /tmp/requirements.txt
RUN pip install --upgrade pip && pip install -r /tmp/requirements.txt
COPY ./quarter_1 /home/jovyan/quarter_1
The above assumes you are running the build from the folder containing the Dockerfile, requirements.txt, and the quarter_1 folder with your build files.
Note "/home/jovyan" is the default working folder in this image.
Build the image
docker build -t biwia-jupyter:3.9.6 .
Start the container with port 8888 open, e.g.
docker run -p 8888:8888 biwia-jupyter:3.9.6
Connect to the container to get the access token. There are a few ways to do this, but for example:
docker exec -it CONTAINER_NAME bash
jupyter notebook list
Copy the token in the URL and connect using your server IP and port. You should be able to paste the token there, and afterwards access the folder you copied into the build, as I did below.
Jupyter screenshot
If you are deploying the image to different hosts this is probably the best way to do it using COPY/ADD etc., but otherwise look at using Docker Volumes which give you access to a folder (for example quarter_1) from the host, so you don't constantly have to rebuild during development.
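For example, a development-time bind mount might look like this (host path taken from the question, container path assuming the default jovyan home directory):
docker run -p 8888:8888 -v /home/data/quarter_1:/home/jovyan/quarter_1 biwia-jupyter:3.9.6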
Second edit for Python 3.9.0 request
Using the method above, 3.9.0 is not immediately available from Docker Hub. I doubt you'll have many compatibility issues between 3.9.0 and 3.9.6, but we'll build it anyway. We can download the Dockerfile folder from GitHub, update a build argument, create our own variant with 3.9.0, and do as above.
Assuming you have git. Otherwise download the repo manually.
Download the Jupyter Docker stack repo
git clone https://github.com/jupyter/docker-stacks
change into the base-notebook directory of the cloned repo
cd ./base-notebook
Build the image with python 3.9.0 instead
docker build --build-arg PYTHON_VERSION=3.9.0 -t jupyter-base-notebook:3.9.0 .
Create the version with your copied folders and the 3.9.0 base from the steps above, replacing the first line of the Dockerfile with:
FROM jupyter-base-notebook:3.9.0
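Assembled, the 3.9.0 variant of the earlier Dockerfile would look like this:
FROM jupyter-base-notebook:3.9.0
COPY requirements.txt /tmp/requirements.txt
RUN pip install --upgrade pip && pip install -r /tmp/requirements.txt
COPY ./quarter_1 /home/jovyan/quarter_1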
I've tested this and it works, running Python 3.9.0 without issue.
There are lots of ways to build Jupyter images, this is just one method. Check out docker hub for Jupyter to see their variants.

python cannot load en_core_web_lg module in azure app service with docker image

I have a Flask Python app that uses a spaCy model (md or lg). I am running it in a Docker container in VSCode and all works correctly on my laptop.
When I push the image to my azure container registry the app restarts but it doesn't seem to get past this line in the log:
Initiating warmup request to the container.
If I comment out the line nlp = spacy.load('en_core_web_lg'), the website loads fine (of course it doesn't work as expected).
I am installing the model in the docker file after installing the requirements.txt:
RUN python -m spacy download en_core_web_lg.
Docker file:
FROM python:3.6
EXPOSE 5000
# Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE 1
# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED 1
# steps needed for scipy
RUN apt-get update -y
RUN apt-get install -y python-pip python-dev libc-dev build-essential
RUN pip install -U pip
# Install pip requirements
ADD requirements.txt .
RUN python -m pip install -r requirements.txt
RUN python -m spacy download en_core_web_md
WORKDIR /app
ADD . /app
# During debugging, this entry point will be overridden. For more information, refer to https://aka.ms/vscode-docker-python-debug
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "Application.webapp:app"]
Try using en_core_web_sm instead of en_core_web_lg.
You can install the module with python -m spacy download en_core_web_sm
Noticed you asked your question over on MSDN. If en_core_web_sm worked but _md and _lg don't, increase your timeout by setting WEBSITES_CONTAINER_START_TIME_LIMIT to a value up to 1800 sec. The larger models might take a while to load and the container simply times out.
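If you're using the Azure CLI, the setting can be applied roughly like this (resource group and app name are placeholders):
az webapp config appsettings set --resource-group <resource-group> --name <app-name> --settings WEBSITES_CONTAINER_START_TIME_LIMIT=1800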
If you have already done that, email us at AzCommunity[at]microsoft[dot]com, ATTN Ryan, so we can take a closer look. Include your subscription id and app service name.

Python- Unable to Train Tensorflow Model Container in Sagemaker

I'm fairly new to Sagemaker and Docker. I am trying to train my own custom object detection algorithm in Sagemaker using an ECS container. I'm using this repo's files:
https://github.com/svpino/tensorflow-object-detection-sagemaker
I've followed the instructions exactly, and I'm able to run the image in a container perfectly fine on my local machine. But when I push the image to ECS to run in Sagemaker, I get the following message in Cloudwatch:
I understand that for some reason, when deployed to ECS, suddenly the image can't find python. At the top of my training script is the line #!/usr/bin/env python. I've tried running the which python command and changed the shebang to point to #!/usr/local/bin python, but I just get additional errors. I don't understand why this image would work on my local machine (tested with both Docker on Windows and Docker CE for WSL). Here's a snippet of the Dockerfile:
ARG ARCHITECTURE=1.15.0-gpu
FROM tensorflow/tensorflow:${ARCHITECTURE}-py3
RUN apt-get update && apt-get install -y --no-install-recommends \
wget zip unzip git ca-certificates curl nginx python
# We need to install Protocol Buffers (Protobuf). Protobuf is Google's language and platform-neutral,
# extensible mechanism for serializing structured data. To make sure you are using the most updated code,
# replace the linked release below with the latest version available on the Git repository.
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.10.1/protoc-3.10.1-linux-x86_64.zip
RUN unzip protoc-3.10.1-linux-x86_64.zip -d protoc3
RUN mv protoc3/bin/* /usr/local/bin/
RUN mv protoc3/include/* /usr/local/include/
# Let's add the folder that we are going to be using to install all of our machine learning-related code
# to the PATH. This is the folder used by SageMaker to find and run our code.
ENV PATH="/opt/ml/code:${PATH}"
RUN mkdir -p /opt/ml/code
WORKDIR /opt/ml/code
RUN pip install --upgrade pip
RUN pip install cython
RUN pip install contextlib2
RUN pip install pillow
RUN pip install lxml
RUN pip install matplotlib
RUN pip install flask
RUN pip install gevent
RUN pip install gunicorn
RUN pip install pycocotools
# Let's now download Tensorflow from the official Git repository and install Tensorflow Slim from
# its folder.
RUN git clone https://github.com/tensorflow/models/ tensorflow-models
RUN pip install -e tensorflow-models/research/slim
# We can now install the Object Detection API, also part of the Tensorflow repository. We are going to change
# the working directory for a minute so we can do this easily.
WORKDIR /opt/ml/code/tensorflow-models/research
RUN protoc object_detection/protos/*.proto --python_out=.
RUN python setup.py build
RUN python setup.py install
# If you are interested in using COCO evaluation metrics, you can run the following commands to add the
# necessary resources to your Tensorflow installation.
RUN git clone https://github.com/cocodataset/cocoapi.git
WORKDIR /opt/ml/code/tensorflow-models/research/cocoapi/PythonAPI
RUN make
RUN cp -r pycocotools /opt/ml/code/tensorflow-models/research/
# Let's put the working directory back to where it needs to be, copy all of our code, and update the PYTHONPATH
# to include the newly installed Tensorflow libraries.
WORKDIR /opt/ml/code
COPY /code /opt/ml/code
ENV PYTHONPATH=${PYTHONPATH}:tensorflow-models/research:tensorflow-models/research/slim:tensorflow-models/research/object_detection
RUN chmod +x /opt/ml/code/train
CMD ["/bin/bash","-c","chmod +x /opt/ml/code/train && /opt/ml/code/train"]

How is the content in "XDG_CACHE_HOME=/cache" different from the content in "-d /build"?

The Dockerfile below has the environment variable XDG_CACHE_HOME=/cache,
which allows the command
pip install -r requirements_test.txt
to utilise a local cache (as shown below) instead of downloading from the network.
But the Dockerfile below also has a /build folder.
So, I would like to understand
whether the purpose (content) of the /build folder is different from that of the /cache folder.
Dockerfile
FROM useraccount/todobackend-base:latest
MAINTAINER Development team <devteam@abc.com>
RUN apt-get update && \
# Development image should have access to source code and
# be able to compile python package dependencies that installs from source distribution
# python-dev has core development libraries required to build & compile python application from source
apt-get install -qy python-dev libmysqlclient-dev
# Activate virtual environment and install wheel support
# Python wheels are application package artifacts
RUN . /appenv/bin/activate && \
pip install wheel --upgrade
# PIP environment variables (NOTE: must be set after installing wheel)
# Configure docker image to output wheels to folder called /wheelhouse
# PIP cache location using XDG_CACHE_HOME to improve performance during test/build/release operation
ENV WHEELHOUSE=/wheelhouse PIP_WHEEL_DIR=/wheelhouse PIP_FIND_LINKS=/wheelhouse XDG_CACHE_HOME=/cache
# OUTPUT: Build artifacts (wheels) are output here
# Read more - https://www.projectatomic.io/docs/docker-image-author-guidance/
VOLUME /wheelhouse
# OUTPUT: Build cache
VOLUME /build
# OUTPUT: Test reports are output here
VOLUME /reports
# Add test entrypoint script
COPY scripts/test.sh /usr/local/bin/test.sh
RUN chmod +x /usr/local/bin/test.sh
# Set defaults for entrypoint and command string
ENTRYPOINT ["test.sh"]
CMD ["python", "manage.py", "test", "--noinput"]
# Add application source
COPY src /application
WORKDIR /application
Below is the docker-compose.yml file
test: # Unit & integration testing
  build: ../../
  dockerfile: docker/dev/Dockerfile
  volumes_from:
    - cache
  links:
    - db
  environment:
    DJANGO_SETTINGS_MODULE: todobackend.settings.test
    MYSQL_HOST: db
    MYSQL_USER: root
    MYSQL_PASSWORD: password
    TEST_OUTPUT_DIR: /reports
builder: # Generate python artifacts
  build: ../../
  dockerfile: docker/dev/Dockerfile
  volumes:
    - ../../target:/wheelhouse
  volumes_from:
    - cache
  entrypoint: "entrypoint.sh"
  command: ["pip", "wheel", "--no-index", "-f", "/build", "."]
db:
  image: mysql:5.6
  hostname: db
  expose:
    - "3386"
  environment:
    MYSQL_ROOT_PASSWORD: password
cache: # volume container
  build: ../../
  dockerfile: docker/dev/Dockerfile
  volumes:
    - /tmp/cache:/cache
    - /build
  entrypoint: "true"
The volumes below
volumes:
  - /tmp/cache:/cache
  - /build
are created in the volume container (cache).
entrypoint file test.sh:
#!/bin/bash
# Activate virtual environment
. /appenv/bin/activate
# Download requirements to build cache
pip download -d /build -r requirements_test.txt --no-input
# Install application test requirements
# -r allows the requirements to be mentioned in a txt file
# pip install -r requirements_test.txt
pip install --no-index -f /build -r requirements_test.txt
# Run test.sh arguments
exec "$@"
Edit:
pip download -d /build -r requirements_test.txt --no-input stores the files shown below in the /build folder.
pip install -r requirements_test.txt is picking dependencies from the /build folder:
The above two commands are not using the /cache folder.
1) So, why do we need the /cache folder? The pip install command is referring to /build.
2) In the test.sh file, from the aspect of using the /build vs /cache content, how is
pip install --no-index -f /build -r requirements_test.txt
different from the
pip install -r requirements_test.txt
command?
1) They might be the same, but might not be. As I understand what is being done here: /cache uses your host cache (/tmp/cache is on the host), and then the container builds the cache (using the host cache) and stores it in /build, which points to /var/lib/docker/volumes/hjfhjksahfjksa on your host.
So they might be the same at some point, but not always.
2) This container needs the cache stored in /build, so you need to use the -f flag to let pip know where it's located.
Python has a couple of different formats for packages. They're typically distributed as source code, which can run anywhere Python runs, but occasionally have C (or FORTRAN!) extensions that require an external compiler to build. The current recommended format is a wheel, which can be specific to a particular OS and specific Python build options, but doesn't depend on anything at the OS level outside of Python. The Python Packaging User Guide goes into a lot more detail on this.
The build volume contains .whl files for your application; the wheelhouse volume contains .whl files for other Python packages; the cache volume contains .tar.gz or .whl files that get downloaded from PyPI. The cache volume is only consulted when downloading things; the build and wheelhouse volumes are used to install code without needing to try to download at all.
The pip --no-index option says "don't contact public PyPI"; -f /build says "use artifacts located here". The environment variables mentioning /wheelhouse also have an effect. These combine to let you install packages using only what you've already built.
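Put together, the pattern from test.sh above is just:
pip download -d /build -r requirements_test.txt --no-input    # fills /build; any network downloads are cached under $XDG_CACHE_HOME (/cache)
pip install --no-index -f /build -r requirements_test.txt     # installs only from /build, without contacting PyPI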
The Docker Compose setup is a pretty long-winded way to build your application as wheels, and then make it available to a runtime image that doesn't have a toolchain.
The cache container does literally nothing. It has the two directories you show: /cache is a host-mounted directory, and /build is an anonymous volume. Other containers have volumes_from: cache to reuse these volumes. (Style-wise, adding named volumes: to the docker-compose.yml is almost definitely better.)
The builder container only runs pip wheel. It mounts an additional directory, ./target from the point of view of the Dockerfile, on to /wheelhouse. The pip install documentation discusses how caching works: if it downloads files they go into $XDG_CACHE_HOME (the /cache volume directory), and if it builds wheels they go into the /wheelhouse volume directory. The output of pip wheel will go into the /build volume directory.
The test container, at startup time, downloads some additional packages and puts them in the build volume. Then it does pip install --no-index to install packages only using what's in the build and wheelhouse volumes, without calling out to PyPI at all.
This setup is pretty complicated for what it does. Some general guidelines I'd suggest here:
Prefer named volumes to data-volume containers; see the Compose sketch after this list. (Very early versions of Docker didn't have named volumes, but anything running on a modern Linux distribution will.)
Don't establish a virtual environment inside your image; just install directly into the system Python tree.
Install software at image build time (in the Dockerfile), not at image startup time (in an entrypoint script).
Don't declare VOLUME in a Dockerfile pretty much ever; it's not necessary for this setup and when it has effects it's usually more confusing than helpful.
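For the first point, a named-volume version of the cache setup might look roughly like this (Compose version 2 syntax; the volume names are illustrative):
version: "2"
services:
  test:
    build:
      context: ../../
      dockerfile: docker/dev/Dockerfile
    volumes:
      - pip-cache:/cache
      - build:/build
volumes:
  pip-cache:
  build: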
A more typical setup would be to build all of this, in one shot, in a multi-stage build. The one downside of this is that downloads aren't cached across builds: if your list of requirements doesn't change then Docker will reuse it as a set, but if you add or remove any single thing, Docker will repeat the pip command to download the whole set.
This would look roughly like (not really tested):
# First stage: build and download wheels
FROM python:3 AS build
# Bootstrap some Python dependencies.
RUN pip install --upgrade pip \
&& pip install wheel
# This stage can need some extra host dependencies, like
# compilers and C libraries.
RUN apt-get update && \
apt-get install -qy python-dev libmysqlclient-dev
# Create a directory to hold built wheels.
RUN mkdir /wheel
# Install the application's dependencies (only).
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --wheel-dir=/wheel -r requirements.txt \
&& pip install --no-index --find-links=/wheel -r requirements.txt
# Build a wheel out of the application.
COPY . .
RUN pip wheel --wheel-dir=/wheel --no-index --find-links=/wheel .
# Second stage: actually run the application.
FROM python:3
# Bootstrap some Python dependencies.
RUN pip install --upgrade pip \
&& pip install wheel
# Get the wheels from the first stage.
RUN mkdir /wheel
COPY --from=build /wheel /wheel
# Install them.
RUN pip install --no-index --find-links=/wheel /wheel/*.whl
# Standard application metadata.
# The application should be declared as entry_points in setup.py.
EXPOSE 3000
CMD ["the_application"]

Docker compose available in python without internet access

I'm having the issue of needing to use docker-compose in Python (for the docker_service functionality in Ansible), but it's not possible to install it using pip because of a company network policy (the VM has no network access, only access to an RPM repository). I can, however, use a yum repository that contains docker-compose.
What I tried is to install "docker compose" (version 1.18.0) using yum. However, Python is not recognizing docker-compose and suggests that I use pip: "Unable to load docker-compose. Try pip install docker-compose"
Since in most cases I can solve this kind of issue by installing it with yum install python-, I already searched the web for a package called python-docker-compose, but with no result :(
minimalistic ansible script for test:
- name: Run using a project directory
  hosts: localhost
  gather_facts: no
  tasks:
    - docker_service:
        project_src: flask
        state: absent
hope anyone can help.
Use a virtual environment!
Unless you cannot do that either; it depends on whether the company policy is just not to write to everyone's Python (then you are fine) or whether you cannot use pip at all (even in your own environment).
If you CAN do that then:
virtualenv docker_compose -p python3
source docker_compose/bin/activate
pip install docker-compose
You get all this junk:
Collecting docker-compose
  Downloading https://files.pythonhosted.org/packages/67/03
Collecting docker-pycreds>=0.3.0 (from docker<4.0,>=3.4.1->docker-compose)
  Downloading https://files.pythonhosted.org/packages/ea/bf/7e70aeebc40407fbdb96fa9f79fc8e4722ea889a99378303e3bcc73f4ab5/docker_pycreds-0.3.0-py2.py3-none-any.whl
Building wheels for collected packages: PyYAML, docopt, texttable, dockerpty
Running setup.py bdist_wheel for PyYAML ... done
Stored in directory: /home/eserrasanz/.cache/pip/wheels/ad/da/0c/74eb680767247273e2cf2723482cb9c924fe70af57c334513f
Running setup.py bdist_wheel for docopt ... done
Stored in directory: /home/eserrasanz/.cache/pip/wheels/9b/04/dd/7daf4150b6d9b12949298737de9431a324d4b797ffd63f526e
Running setup.py bdist_wheel for texttable ... done
Stored in directory: /home/eserrasanz/.cache/pip/wheels/99/1e/2b/8452d3a48dad98632787556a0f2f90d56703b39cdf7d142dd1
Running setup.py bdist_wheel for dockerpty ... done
Stored in directory: /home/eserrasanz/.cache/pip/wheels/e5/1e/86/bd0a97a0907c6c654af654d5875d1d4383dd1f575f77cee4aa
Successfully installed PyYAML-3.13 cached-property-1.5.1 certifi-2018.8.24 chardet-3.0.4 docker-3.5.0 docker-compose-1.22.0 docker-pycreds-0.3.0 dockerpty-0.4.1 docopt-0.6.2 idna-2.6 jsonschema-2.6.0 requests-2.18.4 six-1.11.0 texttable-0.9.1 urllib3-1.22 websocket-client-0.53.0
After some digging around I solved the issue by doing a local download on a machine that has internet:
pip download -d /tmp/docker-compose-local docker-compose
Then archive all the packages that were downloaded into that folder:
cd /tmp
tar -czvf docker-compose-python.tgz ./docker-compose-local
Since the total size of the package is slightly bigger than 1 MB, I added the file to the Ansible docker role.
In the docker role a local install is done:
cd /tmp
tar -xzvf docker-compose-python.tgz
pip install --no-index --find-links file:/tmp/docker-compose-local/ docker_compose
I think you should be able to install using curl from GitHub, assuming this is not blocked by your network policy. Link: https://docs.docker.com/compose/install/#install-compose.
sudo curl -L "https://github.com/docker/compose/releases/download/1.22.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
docker-compose --version
# Outputs: docker-compose version 1.22.0, build 1719ceb
Hope this helps.
To complete the answer from E.Serra, you can use Ansible to create the virtualenv, use pip and install docker-compose in one task, this way:
- pip:
    name: docker-compose
    virtualenv: /my_app/venv
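If the host is offline, as in the question, the same task could presumably point pip at the local package directory via extra_args (path taken from the earlier answer):
- pip:
    name: docker-compose
    virtualenv: /my_app/venv
    extra_args: "--no-index --find-links=file:/tmp/docker-compose-local/"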
