Python - Unable to Train TensorFlow Model Container in SageMaker

I'm fairly new to SageMaker and Docker. I am trying to train my own custom object detection algorithm in SageMaker using an ECS container. I'm using this repo's files:
https://github.com/svpino/tensorflow-object-detection-sagemaker
I've followed the instructions exactly, and I'm able to run the image in a container perfectly fine on my local machine. But when I push the image to ECS to run in SageMaker, I get the following message in CloudWatch:
I understand that, for some reason, the image can't find Python once deployed. At the top of my training script is the line #!/usr/bin/env python. I've tried running the which python command and changed the shebang to point to #!/usr/local/bin python, but I just get additional errors. I don't understand why this image works on my local machine (tested with both Docker on Windows and Docker CE for WSL) but not in SageMaker. Here's a snippet of the Dockerfile:
ARG ARCHITECTURE=1.15.0-gpu
FROM tensorflow/tensorflow:${ARCHITECTURE}-py3
RUN apt-get update && apt-get install -y --no-install-recommends \
wget zip unzip git ca-certificates curl nginx python
# We need to install Protocol Buffers (Protobuf). Protobuf is Google's language and platform-neutral,
# extensible mechanism for serializing structured data. To make sure you are using the most updated code,
# replace the linked release below with the latest version available on the Git repository.
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.10.1/protoc-3.10.1-linux-x86_64.zip
RUN unzip protoc-3.10.1-linux-x86_64.zip -d protoc3
RUN mv protoc3/bin/* /usr/local/bin/
RUN mv protoc3/include/* /usr/local/include/
# Let's add the folder that we are going to be using to install all of our machine learning-related code
# to the PATH. This is the folder used by SageMaker to find and run our code.
ENV PATH="/opt/ml/code:${PATH}"
RUN mkdir -p /opt/ml/code
WORKDIR /opt/ml/code
RUN pip install --upgrade pip
RUN pip install cython
RUN pip install contextlib2
RUN pip install pillow
RUN pip install lxml
RUN pip install matplotlib
RUN pip install flask
RUN pip install gevent
RUN pip install gunicorn
RUN pip install pycocotools
# Let's now download Tensorflow from the official Git repository and install Tensorflow Slim from
# its folder.
RUN git clone https://github.com/tensorflow/models/ tensorflow-models
RUN pip install -e tensorflow-models/research/slim
# We can now install the Object Detection API, also part of the Tensorflow repository. We are going to change
# the working directory for a minute so we can do this easily.
WORKDIR /opt/ml/code/tensorflow-models/research
RUN protoc object_detection/protos/*.proto --python_out=.
RUN python setup.py build
RUN python setup.py install
# If you are interested in using COCO evaluation metrics, you can run the following commands to add the
# necessary resources to your Tensorflow installation.
RUN git clone https://github.com/cocodataset/cocoapi.git
WORKDIR /opt/ml/code/tensorflow-models/research/cocoapi/PythonAPI
RUN make
RUN cp -r pycocotools /opt/ml/code/tensorflow-models/research/
# Let's put the working directory back to where it needs to be, copy all of our code, and update the PYTHONPATH
# to include the newly installed Tensorflow libraries.
WORKDIR /opt/ml/code
COPY /code /opt/ml/code
ENV PYTHONPATH=${PYTHONPATH}:tensorflow-models/research:tensorflow-models/research/slim:tensorflow-models/research/object_detection
RUN chmod +x /opt/ml/code/train
CMD ["/bin/bash","-c","chmod +x /opt/ml/code/train && /opt/ml/code/train"]

Related

How to run an application based on TensorFlow 2 in a Docker container?

I am relatively new to TensorFlow, so I have been trying to run simple applications locally, and everything was going well.
At some point I wanted to Dockerize my application. Building the Docker image went with no errors; however, when I tried to run my application, I received the following error:
AttributeError: module 'tensorflow' has no attribute 'gfile'. Did you mean: 'fill'?
After googling the problem, I understood that it is caused by version differences between TF1 and TF2.
One explanation of the problem that I found is here.
Locally, I am using TF2 (specifically 2.9.1), inside a virtual environment.
When dockerizing, I also confirmed from inside the docker container that my TF version is the same.
I also tried to run the container in interactive mode, create a virtual environment, and install all dependencies manually, exactly the same way I did locally, but still with no success.
My Dockerfile is as follows:
FROM python:3-slim
# ENV VIRTUAL_ENV=/opt/venv
# RUN python3 -m venv $VIRTUAL_ENV
# ENV PATH="$VIRTUAL_ENV/bin:$PATH"
WORKDIR /objectDetector
RUN apt-get update
RUN apt-get install -y protobuf-compiler
RUN apt-get install ffmpeg libsm6 libxext6 -y
RUN pip3 install update && python3 -m pip install --upgrade pip
RUN pip3 install tensorflow==2.9.1
RUN pip3 install tensorflow-object-detection-api
RUN pip3 install opencv-python
RUN pip3 install opencv-contrib-python
COPY detect_objects.py .
COPY detector.py .
COPY helloWorld.py .
ADD data /objectDetector/data/
ADD models /objectDetector/models/
So my question is: How can I run an application using TensorFlow 2 from a Docker container?
Am I missing something here?
Thanks in advance for any help or explanation.
I believe that in TensorFlow 2.0:
tf.gfile was replaced by tf.io.gfile
Can you try this?
Have a nice day,
Gabriel
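For illustration, here is a minimal sketch of both workarounds, assuming the failing call is something like tf.gfile.GFile (the file name below is just a placeholder):
import tensorflow as tf

# Preferred: use the TF2 location of the file API directly.
with tf.io.gfile.GFile("labels.pbtxt", "r") as f:
    data = f.read()

# Shim for third-party code that still references tf.gfile.
tf.gfile = tf.io.gfile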

How to successfully run a Docker image as a container

Below is my Dockerfile:
FROM python:3.9.0
ARG WORK_DIR=/opt/quarter_1
RUN apt-get update && apt-get install cron -y && apt-get install -y default-jre
# Install python libraries
COPY requirements.txt /tmp/requirements.txt
RUN pip install --upgrade pip && pip install -r /tmp/requirements.txt
WORKDIR $WORK_DIR
EXPOSE 8888
VOLUME /home/data/quarter_1/
# Copy etl code
# copy code into the container under your workdir "/opt/quarter_1"
COPY . .
I connected to the server, then did the build with docker build -t my-python-app .
When I tried to run the container from the built image, I got nothing and was not able to do it:
docker run -p 8888:8888 -v /home/data/quarter_1/:/opt/quarter_1 image_id
(The working directory here is /opt/quarter_1.)
Update based on comments
If I understand everything you've posted correctly, my suggestion here is to use a base Docker Jupyter image, modify it to add your pip requirements, and then add your files to the work path. I've tested the following:
Start with a Dockerfile like the one below:
FROM jupyter/base-notebook:python-3.9.6
COPY requirements.txt /tmp/requirements.txt
RUN pip install --upgrade pip && pip install -r /tmp/requirements.txt
COPY ./quarter_1 /home/jovyan/quarter_1
The above assumes you are running the build from the folder containing the Dockerfile, requirements.txt, and the quarter_1 folder with your build files.
Note that /home/jovyan is the default working folder in this image.
Build the image
docker build -t biwia-jupyter:3.9.6 .
Start the container with port 8888 open, e.g.
docker run -p 8888:8888 biwia-jupyter:3.9.6
Connect to the container to access the token. There are a few ways to do this, but for example:
docker exec -it CONTAINER_NAME bash
jupyter notebook list
Copy the token from the URL and connect using your server IP and port. You should be able to paste the token there, and afterwards access the folder you copied into the build, as in the screenshot below.
[Jupyter screenshot]
If you are deploying the image to different hosts, this is probably the best way to do it, using COPY/ADD etc. Otherwise, look at using Docker volumes, which give you access to a folder (for example quarter_1) from the host, so you don't constantly have to rebuild during development; see the bind mount sketch below.
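For example, a bind mount along these lines (paths are illustrative) makes the host's quarter_1 folder visible inside the container without a rebuild:
docker run -p 8888:8888 -v "$(pwd)/quarter_1:/home/jovyan/quarter_1" biwia-jupyter:3.9.6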
Second edit for Python 3.9.0 request
Using the method above, 3.9.0 is not immediately available from Docker Hub. I doubt you'll have many compatibility issues between 3.9.0 and 3.9.6, but we'll build it anyway. We can download the Dockerfile folder from GitHub, update a build argument, create our own variant with 3.9.0, and proceed as above.
Assuming you have git. Otherwise download the repo manually.
Download the Jupyter Docker stack repo
git clone https://github.com/jupyter/docker-stacks
Change into the base-notebook directory of the cloned repo
cd ./base-notebook
Build the image with python 3.9.0 instead
docker build --build-arg PYTHON_VERSION=3.9.0 -t jupyter-base-notebook:3.9.0 .
Create the version with your copied folders and the 3.9.0 base from the steps above, replacing the first line of the Dockerfile with:
FROM jupyter-base-notebook:3.9.0
I've tested this and it works, running Python 3.9.0 without issue.
There are lots of ways to build Jupyter images; this is just one method. Check out Docker Hub for Jupyter to see their variants.

Python cannot load en_core_web_lg module in Azure App Service with Docker image

I have a Flask Python app that uses a spaCy model (md or lg). I am running it in a Docker container in VS Code, and everything works correctly on my laptop.
When I push the image to my Azure Container Registry, the app restarts, but it doesn't seem to get past this line in the log:
Initiating warmup request to the container.
If I comment out the line nlp = spacy.load('en_core_web_lg'), the website loads fine (of course it doesn't work as expected).
I am installing the model in the Dockerfile after installing requirements.txt:
RUN python -m spacy download en_core_web_lg
Docker file:
FROM python:3.6
EXPOSE 5000
# Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE 1
# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED 1
# steps needed for scipy
RUN apt-get update -y
RUN apt-get install -y python-pip python-dev libc-dev build-essential
RUN pip install -U pip
# Install pip requirements
ADD requirements.txt .
RUN python -m pip install -r requirements.txt
RUN python -m spacy download en_core_web_md
WORKDIR /app
ADD . /app
# During debugging, this entry point will be overridden. For more information, refer to https://aka.ms/vscode-docker-python-debug
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "Application.webapp:app"]
Try using en_core_web_sm instead of en_core_web_lg.
You can install the module with python -m spacy download en_core_web_sm
Noticed you asked your question over on MSDN. If en_core_web_sm worked but _md and _lg don't, increase your timeout by setting WEBSITES_CONTAINER_START_TIME_LIMIT to a value up to 1800 seconds. The image's size might make it take a while to load, and it simply times out.
If you've already done that, email us at AzCommunity[at]microsoft[dot]com ATTN Ryan so we can take a closer look. Include your subscription ID and App Service name.
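For reference, the setting can also be applied from the Azure CLI, roughly as below (app name and resource group are placeholders):
az webapp config appsettings set --name <app-name> --resource-group <resource-group> --settings WEBSITES_CONTAINER_START_TIME_LIMIT=1800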

Always installing VS Code plugin in Docker container doesn't work

I am using VS Code with a Docker container. I have the following entry in my user settings.json:
"remote.containers.defaultExtensions": [
"ms-python.python",
"ms-azuretools.vscode-docker",
"ryanluker.vscode-coverage-gutters"
]
But when I build or rebuild the container, these plugins don't get installed automatically inside the container.
Am I doing something wrong?
Modified
Here is what my Dockerfile looks like:
FROM ubuntu:bionic
RUN apt-get update
RUN apt-get install -y python3.6 python3-pip
RUN apt-get install -y git libgl1-mesa-dev
# Currently not using requirements.txt to improve caching
#COPY requirements.txt /home/projects/my_project/
#WORKDIR /home/projects/my_project/
#RUN pip3 install -r requirements.txt
RUN pip3 install torch pandas PyYAML==5.1.2 autowrap Cython==0.29.14
RUN pip3 install numpy==1.17.3 open3d-python==0.7.0.0 pytest==5.2.4 pptk
RUN pip3 install scipy==1.3.1 natsort matplotlib lxml opencv-python==3.2.0.8
RUN pip3 install Pillow scikit-learn testfixtures
RUN pip3 install pip-licenses pylint pytest-cov
RUN pip3 install autopep8
COPY . /home/projects/my_project/
This might be an old question, but to whomever it might concern, here is one solution. I encountered this problem where, in particular, the Python extension from VS Code would not install itself inside my Docker container. In order to get it to install the Python extension (and, for me, anything else), you have to specify the extension version, like:
"extensions": [
"ms-azuretools.vscode-docker",
"ms-python.python#2020.9.114305",
"ms-python.vscode-pylance"
]
If you want to see this in action, you can clone my repository. Simply open the repo in VS Code, install the Remote - Containers extension, and then it should start the Docker container all by itself.
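For context, that pinned "extensions" array lives in the repo's .devcontainer/devcontainer.json rather than in the user settings.json; a minimal sketch (the image name is just an example) might look like:
{
  "image": "ubuntu:bionic",
  "extensions": [
    "ms-azuretools.vscode-docker",
    "ms-python.python#2020.9.114305",
    "ms-python.vscode-pylance"
  ]
}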

Docker image with python3, chromedriver, chrome & selenium

My objective is to scrape the web with Selenium driven by Python from a docker container.
I've looked around and have not found a Docker image with all of the following installed:
Python 3
ChromeDriver
Chrome
Selenium
Is anyone able to link me to a docker image with all of these installed and working together?
Perhaps building my own isn't as difficult as I think, but it's eluded me thus far.
Any and all advice appreciated.
Try https://github.com/SeleniumHQ/docker-selenium.
It has Python installed:
$ docker run selenium/standalone-chrome python3 --version
Python 3.5.2
The instructions indicate you start it with
docker run -d -p 4444:4444 --shm-size=2g selenium/standalone-chrome
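Once the container is up, a script on the host can drive the browser through that port; here is a rough sketch using the Selenium 3 remote API (the URL assumes the container runs locally):
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Attach to the standalone Chrome server published on port 4444.
driver = webdriver.Remote(
    command_executor="http://localhost:4444/wd/hub",
    desired_capabilities=DesiredCapabilities.CHROME,
)
driver.get("https://example.com")
print(driver.title)
driver.quit()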
Edit:
To allow Selenium to run through Python, it appears you need to install the packages. Create this Dockerfile:
FROM selenium/standalone-chrome
USER root
RUN wget https://bootstrap.pypa.io/get-pip.py
RUN python3 get-pip.py
RUN python3 -m pip install selenium
Then you could run it with
docker build . -t selenium-chrome && \
docker run -it selenium-chrome python3
The advantage compared to the plain python docker image is that you won't need to install the chromedriver itself since it comes from selenium/standalone-chrome.
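Inside that image the bundled chromedriver is already on the PATH, so a script run in the container can create a local driver directly; a rough sketch (Selenium 3.8+ style, with the flags usually needed when Chrome runs as root in Docker):
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--headless")   # no display inside the container
options.add_argument("--no-sandbox") # needed when running as root
driver = webdriver.Chrome(options=options)
driver.get("https://example.com")
print(driver.title)
driver.quit()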
I like Harald's solution.
However, as of 2021, my environment needed some modifications.
Docker version 20.10.5, build 55c4c88
I changed the Dockerfile as follows.
FROM selenium/standalone-chrome
USER root
RUN apt-get update && apt-get install python3-distutils -y
RUN wget https://bootstrap.pypa.io/get-pip.py
RUN python3 get-pip.py
RUN python3 -m pip install selenium
https://hub.docker.com/r/joyzoursky/python-chromedriver/
It uses python3 as the base image and installs chromedriver, Chrome, and Selenium (as a pip package). I used the Alpine-based python3 version myself, as the image size is smaller.
$ cd [your working directory]
$ docker run -it -v $(pwd):/usr/workspace joyzoursky/python-chromedriver:3.6-alpine3.7-selenium sh
/ # cd /usr/workspace
See if the images suit your case; you could pip install selenium together with other packages via a requirements.txt file to build your own image, or take the project's Dockerfiles as a reference.
If you want to pip install more packages apart from selenium, you could build your own image as in this example:
First, in your working directory, you may have a requirements.txt storing the package versions you want to install:
selenium==3.8.0
requests==2.18.4
urllib3==1.22
... (your list of packages)
Then create the Dockerfile in the same directory like this:
FROM joyzoursky/python-chromedriver:3.6-alpine3.7
RUN mkdir packages
ADD requirements.txt packages
RUN pip install -r packages/requirements.txt
Then build the image:
docker build -t yourimage .
This differs from the official Selenium image in that Selenium is installed as a pip package on top of a Python base image. However, it is hosted by an individual, so there may be a higher risk of it no longer being maintained.
