How to create any AWS Lambda Python Layer? (Usage example with XGBoost) - python

I am having trouble creating a Lambda layer for the xgboost library. I'm running:
I'm grabbing a zip of xgboost and its dependencies from here (https://github.com/alexeybutyrev/aws_lambda_xgboost) and loading it into a layer. When I try to test my lambda, I get this error:
Unable to import module 'lambda_function': No module named 'xgboost.core'
It looks like __init__.py is trying to reference core.py via from .core import <stuff>
Has anyone encountered this error with AWS Lambda before?

EDIT: As @Marcin has remarked, the first answer provided works for packages under 262 MB.
A. Python packages within the Lambda Layer size limit
You can also do it with the AWS SAM CLI and Docker (see this link to install the SAM CLI) to build the packages inside a container. Basically, you initialize a default template with Python as the runtime and then specify the packages in the requirements.txt file. I found it easier than the article you mentioned. Here are the steps in case you want to use them in the future.
1. Initialize a default SAM template
Under any folder that you want to keep the project, you can type
sam init
This will prompt a series of questions. For a quick setup, we choose the Quick Start Templates as follows:
1 - AWS Quick Start Templates
2 - Python 3.8
Project name [sam-app]: your_project_name
1 - Hello World Example
Choosing the Hello World Example generates a default Lambda function with a requirements.txt file. Now we edit that file with the name of the package we want, in this case xgboost.
2. Specify packages to install
cd your_project_name
code hello_world/requirements.txt
As I have Visual Studio Code as my editor, this opens the file there. Now I can specify the xgboost package:
your_python_package
Here comes the reason to have Docker installed. Some packages rely on C++ extensions, so it is recommended to build them inside a container (especially on Windows). Now move to the folder where the template.yaml file is located and run the following (the -u flag tells SAM to build inside a container):
sam build -u
3. Zip packages
There are some files that you do not want included in your Lambda layer, because we only want to keep the Python libraries, so you can remove the following files:
rm .aws-sam/build/HelloWorldFunction/app.py
rm .aws-sam/build/HelloWorldFunction/__init__.py
rm .aws-sam/build/HelloWorldFunction/requirements.txt
and then zip the remaining content of the folder.
cp -r .aws-sam/build/HelloWorldFunction/ python/
zip -r my_layer.zip python/
where we place the libraries in a python/ folder, as required by the docs.
On Windows the zip command should be replaced with
Compress-Archive python/ my_layer.zip
4. Upload your Layer to AWS
On AWS go to Lambda, then choose Layers and Create Layer. Now, you can upload your .zip file as the image below shows
Notice that for zip files over 50 MB, you should upload the .zip file to an S3 bucket and provide its path, for example https://s3.amazonaws.com/mybucket/my_layer.zip.
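If you prefer to script this step, here is a minimal sketch using boto3 (the layer name is hypothetical, and the bucket/key follow the example path above):
import boto3

lambda_client = boto3.client("lambda")
response = lambda_client.publish_layer_version(
    LayerName="xgboost-layer",  # hypothetical layer name
    Description="XGBoost and its dependencies",
    Content={"S3Bucket": "mybucket", "S3Key": "my_layer.zip"},
    CompatibleRuntimes=["python3.8"],
)
print(response["LayerVersionArn"])  # attach this ARN to your function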
B. Python packages that exceed the Lambda Layer size limit
The xgboost package on its own is more than 300 MB, which exceeds this limit, so publishing it as a layer fails with a size error.
As @Marcin has kindly pointed out, the prior approach with the SAM CLI will not work directly for Python layers that exceed the limit. There's an open issue on GitHub about specifying a custom Docker image when running sam build -u, and a possible workaround of retagging the default lambda/lambci image.
So, how can we get around this? There are already some useful resources that I will just point to.
First, the Medium article that @Alex took as a solution, which follows this repo's code.
Second, alexeybutyrev's approach, which works by applying the strip command to reduce the library sizes (a small sketch of the idea follows). This approach can be found in a GitHub repo, where the instructions are provided.
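Roughly, the strip idea boils down to something like this sketch (assuming the layer's libraries sit under a python/ folder; paths are illustrative):
import pathlib
import subprocess

# Strip debug symbols from every compiled extension in the layer to shrink it.
for so_file in pathlib.Path("python").rglob("*.so"):
    subprocess.run(["strip", str(so_file)], check=False)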
Edit (December 2020)
This month AWS released container image support for AWS Lambda. Using the following tree structure for your project,
Project/
|-- app/
| |-- app.py
| |-- requirements.txt
| |-- xgb_trained.bin
|-- Dockerfile
you can deploy an XGBoost model with the following Docker image. Follow this repo's instructions for a detailed explanation.
# Dockerfile based on https://docs.aws.amazon.com/lambda/latest/dg/images-create.html
# Define global args
ARG FUNCTION_DIR="/function"
ARG RUNTIME_VERSION="3.6"
# Choose buster image
FROM python:${RUNTIME_VERSION}-buster as base-image
# Install aws-lambda-cpp build dependencies
RUN apt-get update && \
    apt-get install -y \
    g++ \
    make \
    cmake \
    unzip \
    libcurl4-openssl-dev \
    git
# Include global arg in this stage of the build
ARG FUNCTION_DIR
# Create function directory
RUN mkdir -p ${FUNCTION_DIR}
# Copy function code
COPY app/* ${FUNCTION_DIR}/
# Install python dependencies and runtime interface client
RUN python${RUNTIME_VERSION} -m pip install \
    --target ${FUNCTION_DIR} \
    --no-cache-dir \
    awslambdaric \
    -r ${FUNCTION_DIR}/requirements.txt
# Install xgboost from source
RUN git clone --recursive https://github.com/dmlc/xgboost
RUN cd xgboost; make -j4; cd python-package; python${RUNTIME_VERSION} setup.py install; cd;
# Multi-stage build: grab a fresh copy of the base image
FROM base-image
# Include global arg in this stage of the build
ARG FUNCTION_DIR
# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}
# Copy in the build image dependencies
COPY --from=base-image ${FUNCTION_DIR} ${FUNCTION_DIR}
ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]
CMD [ "app.handler" ]

So I was never able to figure out why it failed in this way. The solution I found that worked was to create an EC2 instance running amazon linux, install and zip the libraries there, then save to S3. See here for detailed instructions:
https://medium.com/@lucashenriquessilva/how-to-create-a-aws-lambda-python-layer-db2830e08b12

Related

Is there a way to download TextBlob corpora to Google Cloud Run?

I am using Python with TextBlob for sentiment analysis. I want to deploy my app (built in Plotly Dash) to Google Cloud Run with Google Cloud Build (without using Docker). When running locally in my virtual environment all goes fine, but after deploying it to the cloud the corpora are not downloaded. Looking at the requirements.txt file, there was also no reference to this corpora.
I have tried to add python -m textblob.download_corpora to my requirements.txt file but it doesn't download when I deploy it. I have also tried to add
import textblob
import subprocess
cmd = ['python','-m','textblob.download_corpora']
subprocess.run(cmd)
and
import nltk
nltk.download('movie_reviews')
to my script (callbacks.py, I am using Plotly Dash to make my app), all without success.
Is there a way to add this corpus to my requirements.txt file? Or is there another workaround to download this corpus? How can I fix this?
Thanks in advance!
Vijay
Since Cloud Run creates and destroys containers as needed for your traffic levels, you'll want to embed your corpora in the pre-built container to ensure a fast cold start time (instead of downloading it when the container starts).
The easiest way to do this is to add another line to your Dockerfile that downloads and installs the corpora at build time, like so:
RUN python -m textblob.download_corpora
Here's a full docker file for your reference:
# Python image to use.
FROM python:3.8
# Set the working directory to /app
WORKDIR /app
# copy the requirements file used for dependencies
COPY requirements.txt .
# Install any needed packages specified in requirements.txt
RUN pip install --trusted-host pypi.python.org -r requirements.txt
RUN python -m textblob.download_corpora
# Copy the rest of the working directory contents into the container at /app
COPY . .
# Run app.py when the container launches
ENTRYPOINT ["python", "app.py"]
Good luck,
Josh

Python wrapper for C++ .so is only useable when COPY'ing code into docker image and not when mounted by volume?

I have somewhat successfully dockerized a software repository (KPConv) that I plan to work with and extend with the following Dockerfile
FROM tensorflow/tensorflow:1.12.0-devel-gpu-py3
# Install other required python stuff
RUN apt-get update && apt install -y --fix-missing --no-install-recommends \
    python3-setuptools python3-pip python3-tk
RUN pip install --upgrade pip
RUN pip3 install numpy scikit-learn psutil matplotlib pyqt5 laspy
# Compile the custom operations and CPP wrappers
# For some reason this must be done within container, cannot access libcuda.so during docker build
# Ref: https://stackoverflow.com/questions/66575232
#COPY . /kpconv
#WORKDIR /kpconv/tf_custom_ops
#RUN sh compile_op.sh
#WORKDIR /kpconv/cpp_wrappers
#RUN sh compile_wrappers.sh
# Set the working directory to kpconv
WORKDIR /kpconv
# Set root user password so we can su/sudo later if need be
RUN echo "root:pass" | chpasswd
# Create a user and group akin to the host within the container
ARG USER_ID
ARG GROUP_ID
RUN addgroup --gid $GROUP_ID user
RUN adduser --disabled-password --gecos '' --uid $USER_ID --gid $GROUP_ID user
USER user
#Build
#sudo docker build -t kpconv-test \
# --build-arg USER_ID=$(id -u) \
# --build-arg GROUP_ID=$(id -g) \
# .
At the end of this Dockerfile I followed a post found here which describes a way to correctly set the permissions of files generated by/within a container so that the host machine/user can access them without having to alter the file permissions.
Also, this software repository makes use of custom tensorflow operations in C++ (KPConv/tf_custom_ops) along with Python wrappers for custom C++ code (KPConv/cpp_wrappers). The author of KPConv, Thomas Hugues, provides a bash script which compiles each to generate various .so files.
If I COPY the repository into the image during the build process (COPY . /kpconv), startup the container, call both of the compile bash scripts, and run the code then Python correctly loads the C++ wrapper (the generated .so grid_subsampling.cpython-35m-x86_64-linux-gnu.so) and begins running the software as expected/intended.
$ sudo docker run -it \
> -v /<myhostpath>/data_sets:/data \
> -v /<myhostpath>/_output:/output \
> --runtime=nvidia kpconv-test /bin/bash
user@eec8553dcb5d:/kpconv$ cd tf_custom_ops
user@eec8553dcb5d:/kpconv/tf_custom_ops$ sh compile_op.sh
user@eec8553dcb5d:/kpconv/tf_custom_ops$ cd ..
user@eec8553dcb5d:/kpconv$ cd cpp_wrappers/
user@eec8553dcb5d:/kpconv/cpp_wrappers$ sh compile_wrappers.sh
running build_ext
building 'grid_subsampling' extension
<Redacted for brevity>
user@eec8553dcb5d:/kpconv/cpp_wrappers$ cd ..
user@eec8553dcb5d:/kpconv$ python training_ModelNet40.py
Dataset Preparation
*******************
Loading training points
1620.2 MB loaded in 0.6s
Loading test points
411.6 MB loaded in 0.2s
<Redacted for brevity>
This works well and allows me run the KPConv software.
Also to note for later the .so file has the hash
user@eec8553dcb5d:/kpconv/cpp_wrappers/cpp_subsampling$ sha1sum grid_subsampling.cpython-35m-x86_64-linux-gnu.so
a17eef453f6d2370a15bc2a0e6714c978390c5c3 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
It also has the permissions
user@eec8553dcb5d:/kpconv/cpp_wrappers/cpp_subsampling$ ls -al grid_subsampling.cpython-35m-x86_64-linux-gnu.so
-rwxr-xr-x 1 user user 561056 Mar 14 02:16 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
However, this produces a difficult workflow for quickly editing the software for my purposes and quickly running it within the container: every change to the code requires a new build of the image. Thus, I would much rather mount/volume the KPConv code from the host into the container at runtime, so the edits are "live" within the container as it is running.
Doing this and using the Dockerfile at the top of the post (no COPY . /kpconv) to compile an image, perform the same compilation steps, and run the code
$ sudo docker run -it \
> -v /<myhostpath>/data_sets:/data \
> -v /<myhostpath>/KPConv_Tensorflow:/kpconv \
> -v /<myhostpath>/_output:/output \
> --runtime=nvidia kpconv-test /bin/bash
user@a82e2c1af21a:/kpconv$ cd tf_custom_ops/
user@a82e2c1af21a:/kpconv/tf_custom_ops$ sh compile_op.sh
user@a82e2c1af21a:/kpconv/tf_custom_ops$ cd ..
user@a82e2c1af21a:/kpconv$ cd cpp_wrappers/
user@a82e2c1af21a:/kpconv/cpp_wrappers$ sh compile_wrappers.sh
running build_ext
building 'grid_subsampling' extension
<Redacted for brevity>
user@a82e2c1af21a:/kpconv/cpp_wrappers$ cd ..
user@a82e2c1af21a:/kpconv$ python training_ModelNet40.py
I receive the following Python ImportError
user@a82e2c1af21a:/kpconv$ python training_ModelNet40.py
Traceback (most recent call last):
File "training_ModelNet40.py", line 36, in <module>
from datasets.ModelNet40 import ModelNet40Dataset
File "/kpconv/datasets/ModelNet40.py", line 40, in <module>
from datasets.common import Dataset
File "/kpconv/datasets/common.py", line 29, in <module>
import cpp_wrappers.cpp_subsampling.grid_subsampling as cpp_subsampling
ImportError: /kpconv/cpp_wrappers/cpp_subsampling/grid_subsampling.cpython-35m-x86_64-linux-gnu.so: failed to map segment from shared object
Why is this Python wrapper for C++ only useable when COPY'ing code into the docker image and not when mounted by volume?
This .so file has the same hash and permissions as the first described situation
user@a82e2c1af21a:/kpconv/cpp_wrappers/cpp_subsampling$ sha1sum grid_subsampling.cpython-35m-x86_64-linux-gnu.so
a17eef453f6d2370a15bc2a0e6714c978390c5c3 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
user@a82e2c1af21a:/kpconv/cpp_wrappers/cpp_subsampling$ ls -al grid_subsampling.cpython-35m-x86_64-linux-gnu.so
-rwxr-xr-x 1 user user 561056 Mar 14 02:19 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
On my host machine the file has the following permissions (it's on the host because /kpconv was mounted as a volume) (for some reason the container is in the future too, check the timestamps)
$ ls -al grid_subsampling.cpython-35m-x86_64-linux-gnu.so
-rwxr-xr-x 1 <myusername> <myusername> 561056 Mar 13 21:19 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
After some research on the error message, it looks like every result is specific to a particular situation, though most mention that the error results from some sort of permissions issue.
This Unix&Linux Stack answer I think provides the answer to what the actual problem is. But I am a bit too far from my days of working with C++ as an intern in college to necessarily understand how to use it to fix this issue. But I think the issue lies with the permissions between the container and host and between the users on each (that is, root on the container, user (Dockerfile) on the container, root on host, and <myusername> on host).
I have also attempted to first elevate permissions within the container using the root password created in the Dockerfile, then compiling the code, and running the software. But this results in the same issue. I have also tried compiling the code as user in the container, but running the software as root, again with the same issue.
Thus another clue I have found and provide is that there is seemingly something different with the .so when compiled "only within" the container (no --volume) and when it is compiled within the --volume (thus why I attempted to compare the file hashes). So maybe its not so much permissions but how the .so is loaded within the container by the kernel or how its location within the --volume effects that loading process?
EDIT: As for an SSCCE, you should be able to clone the linked repository to your machine and use the same Dockerfile. You do not need to specify the /data or /output volumes or alter the code in any way (it attempts to load the .so before loading the data, which will just error and end execution).
If you do not have a GPU or do not want to install nvidia-runtime you should be able to alter the Dockerfile base image to tensorflow:1.12.0-devel-py3 and run the code on CPU.
Your problem is caused by the dynamic linker failing to load the library. There could be several root causes for this:
Permissions. The user must have permission to load the library. When mounting file systems in Docker, the owner ID and group ID on the host are not necessarily the same IDs in the container, even though they may map to the same name.
Wrong binary format. The host OS compiled the binary in the wrong format. This can happen if you run the compile on, for example, macOS and use the result in a Linux container.
Wrong mount options. Mounting the volume with noexec, for example, will also prevent the library from being loaded (see the diagnostic sketch after this list).
Library differences between the two environments. Due to differences from the environment where the library was compiled, you might be missing some libraries, so use ldd grid_subsampling.cpython-35m-x86_64-linux-gnu.so and ldd -r -d -v grid_subsampling.cpython-35m-x86_64-linux-gnu.so to check all the linked libraries.
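To narrow it down from inside the container, a small diagnostic along these lines can help; it is only a sketch (the path is taken from the question) that prints the mount options of the /kpconv bind mount (look for noexec) and surfaces the loader's own error message:
import ctypes

SO_PATH = "/kpconv/cpp_wrappers/cpp_subsampling/grid_subsampling.cpython-35m-x86_64-linux-gnu.so"

# Print the mount entries covering /kpconv; a 'noexec' flag here would explain the error.
with open("/proc/mounts") as mounts:
    for line in mounts:
        if "/kpconv" in line:
            print(line.strip())

# Ask the dynamic loader directly and show its error message.
try:
    ctypes.CDLL(SO_PATH)
    print("library mapped successfully")
except OSError as err:
    print("loader error:", err)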

How to install external modules in a Python Lambda Function created by AWS CDK?

I'm using the Python AWS CDK in Cloud9 and I'm deploying a simple Lambda function that is supposed to send an API request to Atlassian's API when an Object is uploaded to an S3 Bucket (also created by the CDK). Here is my code for CDK Stack:
from aws_cdk import core
from aws_cdk import aws_s3
from aws_cdk import aws_lambda
from aws_cdk.aws_lambda_event_sources import S3EventSource
class JiraPythonStack(core.Stack):
    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        # The code that defines your stack goes here
        jira_bucket = aws_s3.Bucket(self,
                                    "JiraBucket",
                                    encryption=aws_s3.BucketEncryption.KMS)
        event_lambda = aws_lambda.Function(
            self,
            "JiraFileLambda",
            code=aws_lambda.Code.asset("lambda"),
            handler='JiraFileLambda.handler',
            runtime=aws_lambda.Runtime.PYTHON_3_6,
            function_name="JiraPythonFromCDK")
        event_lambda.add_event_source(
            S3EventSource(jira_bucket,
                          events=[aws_s3.EventType.OBJECT_CREATED]))
The lambda function code uses the requests module which I've imported. However, when I check the CloudWatch Logs, and test the lambda function - I get:
Unable to import module 'JiraFileLambda': No module named 'requests'
My Question is: How do I install the requests module via the Python CDK?
I've already looked around online and found this, but it seems to directly modify the Lambda function, which would result in stack drift (which I've been told is bad for IaC). I've also looked at the AWS CDK docs but didn't find any mention of external modules/libraries (I'm doing a thorough check for it now). Does anybody know how I can work around this?
Edit: It would appear I'm not the only one looking for this.
Here's another GitHub issue that's been raised.
It is not even necessary to use the experimental PythonFunction functionality in CDK - there is support built into CDK to build the dependencies into a simple Lambda package (not a Docker image). It uses Docker to do the build, but the final result is still a simple zip of files. The documentation shows it here: https://docs.aws.amazon.com/cdk/api/latest/docs/aws-lambda-readme.html#bundling-asset-code ; the gist is:
new Function(this, 'Function', {
  code: Code.fromAsset(path.join(__dirname, 'my-python-handler'), {
    bundling: {
      image: Runtime.PYTHON_3_9.bundlingImage,
      command: [
        'bash', '-c',
        'pip install -r requirements.txt -t /asset-output && cp -au . /asset-output'
      ],
    },
  }),
  runtime: Runtime.PYTHON_3_9,
  handler: 'index.handler',
});
I have used this exact configuration in my CDK deployment and it works well.
And for Python, it is simply
aws_lambda.Function(
    self,
    "Function",
    runtime=aws_lambda.Runtime.PYTHON_3_9,
    handler="index.handler",
    code=aws_lambda.Code.from_asset(
        "function_source_dir",
        bundling=core.BundlingOptions(
            image=aws_lambda.Runtime.PYTHON_3_9.bundling_image,
            command=[
                "bash", "-c",
                "pip install --no-cache -r requirements.txt -t /asset-output && cp -au . /asset-output"
            ],
        ),
    ),
)
UPDATE:
It now appears as though there is a new type of (experimental) Lambda Function in the CDK known as the PythonFunction. The Python docs for it are here. And this includes support for adding a requirements.txt file which uses a docker container to add them to your function. See more details on that here. Specifically:
If requirements.txt or Pipfile exists at the entry path, the construct will handle installing all required modules in a Lambda compatible Docker container according to the runtime.
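A minimal sketch of how that construct might be used (module and parameter names follow the experimental aws_cdk.aws_lambda_python docs; the entry directory is hypothetical):
from aws_cdk import aws_lambda, aws_lambda_python

fn = aws_lambda_python.PythonFunction(
    self,
    "MyFunction",
    entry="./my_function",  # directory containing index.py and requirements.txt
    index="index.py",
    handler="handler",
    runtime=aws_lambda.Runtime.PYTHON_3_8,
)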
Original Answer:
So this is the awesome bit of code my manager wrote that we now use:
def create_dependencies_layer(self, project_name, function_name: str) -> aws_lambda.LayerVersion:
    requirements_file = "lambda_dependencies/" + function_name + ".txt"
    output_dir = ".lambda_dependencies/" + function_name
    # Install requirements for layer in the output_dir
    if not os.environ.get("SKIP_PIP"):
        # Note: Pip will create the output dir if it does not exist
        subprocess.check_call(
            f"pip install -r {requirements_file} -t {output_dir}/python".split()
        )
    return aws_lambda.LayerVersion(
        self,
        project_name + "-" + function_name + "-dependencies",
        code=aws_lambda.Code.from_asset(output_dir)
    )
It's actually part of the Stack class as a method (not inside the init). The way we have it set up here is that we have a folder called lambda_dependencies which contains a text file for every lambda function we are deploying which just has a list of dependencies, like a requirements.txt.
And to utilise this code, we include in the lambda function definition like this:
get_data_lambda = aws_lambda.Function(
    self,
    .....
    layers=[self.create_dependencies_layer(PROJECT_NAME, GET_DATA_LAMBDA_NAME)]
)
You should install the dependencies of your Lambda locally before deploying it via CDK. CDK has no idea how to install the dependencies or which libraries should be installed.
In your case, you should install the requests dependency and any other libraries before executing cdk deploy.
For example,
pip install requests --target ./asset/package
There is an example for reference.
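As a sketch of how this fits into the stack from the question (names reused from there; the asset directory is hypothetical), the function's code asset simply points at the directory holding the handler and the pre-installed dependencies. Note that if the libraries live in an asset/package subfolder, the handler must add that folder to sys.path, or you can install them into the asset root instead:
event_lambda = aws_lambda.Function(
    self,
    "JiraFileLambda",
    runtime=aws_lambda.Runtime.PYTHON_3_6,
    handler="JiraFileLambda.handler",
    code=aws_lambda.Code.asset("./asset"),  # contains the handler and the installed packages
    function_name="JiraPythonFromCDK",
)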
Wanted to share 2 template repos I made for this (heavily inspired by some of the above):
https://github.com/iguanaus/cdk-ecs-python-with-requirements- - demo of ecs service of basic python function
https://github.com/iguanaus/cdk-lambda-python-with-requirements - demo of lambda python job with requirements.
Hope they are helpful for folks :)
Lastly; if you want to see a long thread on this subject, see here: https://github.com/aws/aws-cdk/issues/3660
I ran into this issue as well. I used a solution like @Kane and @Jamie suggest just fine when I was working on my Ubuntu machine. However, I ran into issues when working on macOS. Apparently some (all?) Python packages don't work on Lambda (a Linux environment) if they are pip installed on a different OS (see this Stack Overflow post).
My solution was to run the pip install inside a docker container. This allowed me to cdk deploy from my macbook and not run into issues with my python packages in lambda.
Suppose you have a directory lambda_layers/python in your CDK project that will house your Python packages for the Lambda layer.
import pathlib
import subprocess

current_path = str(pathlib.Path(__file__).parent.absolute())
pip_install_command = ("docker run --rm --entrypoint /bin/bash -v "
                       + current_path
                       + "/lambda_layers:/lambda_layers python:3.8 -c "
                       + "'pip3 install Pillow==8.1.0 -t /lambda_layers/python'")
subprocess.run(pip_install_command, shell=True)

lambda_layer = aws_lambda.LayerVersion(
    self,
    "PIL-layer",
    compatible_runtimes=[aws_lambda.Runtime.PYTHON_3_8],
    code=aws_lambda.Code.asset("lambda_layers"))
As an alternative to my other answer, here's a slightly different approach that also works with docker-in-docker (the bundling-options approach doesn't).
Set up the Lambda function like
lambda_fn = aws_lambda.Function(
    self,
    "Function",
    runtime=aws_lambda.Runtime.PYTHON_3_9,
    code=aws_lambda.Code.from_docker_build(
        "function_source_dir",
    ),
    handler="index.lambda_handler",
)
and in function_source_dir/ have these files:
index.py (to match the above code - you can name this whatever you like)
requirements.txt
Dockerfile
Set up your Dockerfile like
# Note that this dockerfile is only used to build the lambda asset - the
# lambda still just runs with a zip source, not a docker image.
# See the docstring for aws_lambda.Code.from_docker_build
FROM public.ecr.aws/lambda/python:3.9.2022.04.27.10-x86_64
COPY index.py /asset/
COPY requirements.txt /tmp/
RUN pip3 install -r /tmp/requirements.txt -t /asset
and the synth step will build your asset in docker (using the above dockerfile) then pull the built Lambda source from the /asset/ directory in the image.
I haven't looked into too much detail about why the BundlingOptions approach fails to build when running inside a docker container, but this one does work (as long as docker is run with -v /var/run/docker.sock:/var/run/docker.sock to enable docker-in-docker). As always, be sure to consider your security posture when doing this.

How to run a docker image in IBM Cloud functions?

I have a simple Python program that I want to run in IBM Cloud functions. Alas it needs two libraries (O365 and PySnow) so I have to Dockerize it and it needs to be able to accept a Json feed from STDIN. I succeeded in doing this:
FROM python:3
ADD requirements.txt ./
RUN pip install -r requirements.txt
ADD ./main ./main
WORKDIR /main
CMD ["python", "main.py"]
This runs with: cat env_var.json | docker run -i f9bf70b8fc89
I've added the Docker container to IBM Cloud Functions like this:
ibmcloud fn action create e2t-bridge --docker [username]/e2t-bridge
However when I run it, it times out.
Now I did see a possible solution route, where I dockerize it as an Openwhisk application. But for that I need to create a binary from my Python application and then load it into a rather complicated Openwhisk skeleton, I think?
But having a file you can simply run is the whole point of my Docker setup, so creating a binary of an interpreted language and then adding it into an OpenWhisk Docker image just feels awfully clunky.
What would be the best way to approach this?
It turns out you don't need to create a binary, you just need to edit the OpenWhisk skeleton like so:
# Dockerfile for example whisk docker action
FROM openwhisk/dockerskeleton
ENV FLASK_PROXY_PORT 8080
### Add source file(s)
ADD requirements.txt /action/requirements.txt
RUN cd /action; pip install -r requirements.txt
# Move the source files into the action directory
ADD ./main /action
# Rename our executable Python action
ADD /main/main.py /action/exec
CMD ["/bin/bash", "-c", "cd actionProxy && python -u actionproxy.py"]
And make sure that your Python code accepts a JSON feed (the action proxy passes it as the first command-line argument):
json_input = json.loads(sys.argv[1])
The whole explaination is here: https://github.com/iainhouston/dockerPython

Import libraries in lambda layers

I wanted to import the jsonschema library in my AWS Lambda in order to perform request validation. Instead of bundling the dependency with my app, I am looking to do this via Lambda Layers. I zipped all the dependencies under venv/lib/python3.6/site-packages/. I uploaded this as a Lambda layer and added it to my Lambda using the publish-layer-version and aws lambda update-function-configuration commands respectively. The zip folder is named "lambda-dep.zip" and all the files are under it. However when I try to import jsonschema in my lambda_function, I see the error below:
from jsonschema import validate
{
"errorMessage": "Unable to import module 'lambda_api': No module named 'jsonschema'",
"errorType": "Runtime.ImportModuleError"
}
Am I missing any steps, or is there a different mechanism to import anything within Lambda layers?
You want to make sure your .zip follows this folder structure when unzipped
python/lib/python3.6/site-packages/{LibrariesGoHere}.
Upload that zip, make sure the layer is added to the Lambda function and you should be good to go.
This is the structure that has worked for me.
Here is the script that I use to upload a layer:
#!/usr/bin/env bash
LAYER_NAME=$1 # input layer name, retrieved as arg
ZIP_ARTIFACT=${LAYER_NAME}.zip
LAYER_BUILD_DIR="python"
# note: put the libraries in a folder supported by the runtime, which means it should be named python
rm -rf ${LAYER_BUILD_DIR} && mkdir -p ${LAYER_BUILD_DIR}
docker run --rm -v `pwd`:/var/task:z lambci/lambda:build-python3.6 python3.6 -m pip --isolated install -t ${LAYER_BUILD_DIR} -r requirements.txt
zip -r ${ZIP_ARTIFACT} .
echo "Publishing layer to AWS..."
aws lambda publish-layer-version --layer-name ${LAYER_NAME} --zip-file fileb://${ZIP_ARTIFACT} --compatible-runtimes python3.6
# clean up
rm -rf ${LAYER_BUILD_DIR}
rm -r ${ZIP_ARTIFACT}
I added the content above to a file called build_layer.sh, then I call it as bash build_layer.sh my_layer. The script requires a requirements.txt in the same folder, and it uses Docker to have the same runtime used for Python3.6 Lambdas.
The arg of the script is the layer name.
After uploading a layer to AWS, be sure that the right layer version is referenced inside your Lambda.
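If you script that last step as well, something along these lines (a sketch; the function name and layer ARN are placeholders) attaches the new layer version to the function:
import boto3

client = boto3.client("lambda")
client.update_function_configuration(
    FunctionName="my-function",
    Layers=["arn:aws:lambda:us-east-1:123456789012:layer:my_layer:2"],
)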
Update from previous answers: Per AWS documentation, requirements have been changed to simply be placed in a /python directory without the rest of the directory structure.
https://aws.amazon.com/premiumsupport/knowledge-center/lambda-import-module-error-python/
Be sure your unzipped directory structure has libraries within a /python directory.
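A quick way to verify the structure before uploading is a check like this sketch (the zip name is taken from the question):
import zipfile

# Every entry in a Python layer archive should live under the python/ prefix.
with zipfile.ZipFile("lambda-dep.zip") as zf:
    bad = [name for name in zf.namelist() if not name.startswith("python/")]
    print("entries outside python/:", bad[:10] or "none")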
There is an easier method. Just create a python folder and install the packages into it using the -t (target) option. Note the "." at the end of the zip command below: it zips the contents of the current directory.
mkdir lambda_function
cd lambda_function
mkdir python
cd python
pip install yourPackages -t ./
cd ..
zip -r /tmp/lambda_layer.zip .
The zip file is now your lambda layer.
The step-by-step instructions, including video instructions, can be found here.
https://geektopia.tech/post.php?blogpost=Create_Lambda_Layer_Python
