How to import lxml from precompiled binary on AWS Lambda? - python

I'm trying to import the lxml library in Python to execute an AWS Lambda function but I'm getting the following error: [ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': No module named 'lxml'. To solve this, I followed the recommendation from this SO answer and used precompiled binaries from the following repo.
I used the lxml_amazon_binaries.zip file from that repo, which has this structure:
lxml_amazon_binaries
├── lxml
└── usr
I uploaded the entire zip file to an AWS Lambda layer, created a new Lambda function, and tested with a simple from lxml import etree, which led to the above error.
Am I uploading/using these binaries correctly? I'm not sure what caused the error. Using different Python runtimes didn't help.

The most reliable way to create an lxml layer is to use Docker, as explained in the AWS blog. Specifically, the verified steps are (executed on Linux, but Windows should also work as long as you have Docker):
1. Create an empty folder, e.g. mylayer.
2. Go to the folder and create a requirements.txt file with the following content:
    lxml
3. Run the following Docker command, which creates a layer for python3.8:
    docker run -v "$PWD":/var/task "lambci/lambda:build-python3.8" /bin/sh -c "pip install -r requirements.txt -t python/lib/python3.8/site-packages/; exit"
4. Archive the layer as a zip:
    zip -9 -r mylayer.zip python
5. Create a Lambda layer based on mylayer.zip in the AWS Console. Don't forget to set Compatible runtime to python3.8.
6. Add the layer created in step 5 to your function.
I tested the layer using your code:
from lxml import etree
def lambda_handler(event, context):
    root = etree.Element("root")
    root.append(etree.Element("child1"))
    print(etree.tostring(root, pretty_print=True))
It works correctly:
b'<root>\n  <child1/>\n</root>\n'
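The zip in the question has lxml_amazon_binaries/ at its top level, whereas the Docker steps produce python/lib/python3.8/site-packages/. Lambda unpacks a layer into /opt and, for Python runtimes, only searches /opt/python and /opt/python/lib/<runtime>/site-packages. A minimal sketch of a layout check follows; the helper names layer_has_package and make_zip are hypothetical and only serve this illustration.

```python
import io
import zipfile


def layer_has_package(zip_bytes, package, runtime="python3.8"):
    """Return True if `package` sits under a path Lambda adds to sys.path."""
    # A layer is unpacked into /opt; for Python, Lambda searches
    # /opt/python and /opt/python/lib/<runtime>/site-packages.
    prefixes = (
        f"python/{package}/",
        f"python/lib/{runtime}/site-packages/{package}/",
    )
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return any(name.startswith(prefixes) for name in archive.namelist())


def make_zip(paths):
    """Build an in-memory zip containing empty files at the given paths."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path in paths:
            archive.writestr(path, "")
    return buffer.getvalue()


# Layout from the question: the package is not under python/.
bad = make_zip(["lxml_amazon_binaries/lxml/__init__.py"])
# Layout produced by the Docker steps.
good = make_zip(["python/lib/python3.8/site-packages/lxml/__init__.py"])

print(layer_has_package(bad, "lxml"))   # False
print(layer_has_package(good, "lxml"))  # True
```

To check a real archive, read it with open("mylayer.zip", "rb").read() and pass the bytes in.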

Related

Unable to import module AWS Lambda Function

I'm having trouble running basically any Lambda function on AWS because Lambda cannot import the module.
I tried importing the packages as layers, which I probably did wrong. (https://www.linkedin.com/pulse/add-external-python-libraries-aws-lambda-using-layers-gabe-olokun/)
Then I tried importing the packages as environments (built locally or with a bash script):
locally, as in this video: https://www.youtube.com/watch?v=NGteAkN2WYc
or with bash scripting (AWS CloudShell): https://docs.aws.amazon.com/lambda/latest/dg/python-package.html
Either way, the Lambda function looks like this:
import psycopg2
def lambda_handler(event, context):
    # Connect to PostgreSQL database
    conn = psycopg2.connect(
        host=['host'],
        database=['postgres'],
        user=['user'],
        password=['password']
    )
And I'm hitting the following errors:
- If lambda_function.py is inside the "psycopg2+py" directory: "errorMessage": "Unable to import module 'lambda_function': No module named 'lambda_function'"
- If lambda_function.py is outside the "psycopg2+py" directory and directly inside the "postgresql" directory: "errorMessage": "Unable to import module 'lambda_function': No module named 'psycopg2'"
I believe the Handler is set correctly.
I should also mention that when I set up the environment to install the packages, I was using Python 3.9, the same version that I'm using for the Lambda function.
I also tried the same methods with another package, fastapi, and it still doesn't work, so it seems to be a general problem rather than a problem with a specific package.
I have no idea what else to try.
Here is how you can go about adding dependencies for a lambda function:
Install required python packages (in this case psycopg2):
# mkdir workspace; cd workspace
# pip3.9 install pip --upgrade
# pip3.9 install --platform manylinux2014_x86_64 \
    --target=./python/lib/python3.9/site-packages --implementation cp \
    --python-version 3.9 --only-binary=:all: --upgrade psycopg2
Rename _psycopg.xxxxxx.so file:
mv ./python/lib/python3.9/site-packages/psycopg2/_psycopg*.so ./python/lib/python3.9/site-packages/psycopg2/_psycopg.so
Zip the created python directory:
# zip -r requirements.zip python/
Create a lambda layer from the requirements.zip zip file (you can also use the portal instead of awscli):
# aws lambda publish-layer-version --layer-name dependencies \
    --description "Python packages" --license-info "MIT" \
    --zip-file fileb://requirements.zip --compatible-runtimes python3.9 \
    --compatible-architectures "x86_64"
Add the published layer version to your lambda function (or use the portal):
# aws lambda update-function-configuration --function-name Your_Lambda_Func_Name --layers arn:aws:lambda:us-east-1:xxxxxxxx:layer:dependencies:1
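The install step above pins the platform and Python version. If those do not match the Lambda runtime, the mismatch usually shows in the file names of compiled extension modules, which carry a tag such as cpython-39-x86_64-linux-gnu. The following is a rough sketch of a pre-upload check, assuming that naming convention; the function name mismatched_extensions and the sample file names are made up for the demonstration. Files without a cpython tag (for example a renamed _psycopg.so) cannot be judged by name and are skipped.

```python
import tempfile
from pathlib import Path


def mismatched_extensions(root, abi="cpython-39", platform="x86_64-linux-gnu"):
    """List tagged extension files under `root` built for another target."""
    expected = f"{abi}-{platform}"
    mismatched = []
    for path in sorted(Path(root).rglob("*.so")):
        # Tagged files look like name.cpython-39-x86_64-linux-gnu.so
        parts = path.name.split(".")
        tags = [part for part in parts[1:-1] if part.startswith("cpython-")]
        if tags and tags[0] != expected:
            mismatched.append(path.name)
    return mismatched


# Demonstration with empty placeholder files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    for name in (
        "_psycopg.cpython-39-x86_64-linux-gnu.so",  # matches the target
        "_speedups.cpython-311-darwin.so",          # built on a Mac for 3.11
        "plain.so",                                 # untagged, skipped
    ):
        Path(tmp, name).touch()
    found = mismatched_extensions(tmp)

print(found)  # ['_speedups.cpython-311-darwin.so']
```

Pointing the function at ./python/lib/python3.9/site-packages before zipping would flag packages that pip built or downloaded for the local machine rather than for Lambda.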
I found the solution: use Python 3.8 and add the packages as layers.
Here's a link to my Git repository with two working layers, psycopg2 and pandas (boto3 and os are already available in Lambda): https://github.com/alexdragut20/AWS
Create the layers using the "Upload a zip file" option, create a function from scratch, and add the layers to the function.
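The two errors in the question point at two different lookups. "No module named 'lambda_function'" means the file named by the Handler setting is not at the root of the uploaded zip, while "No module named 'psycopg2'" means the handler was found but its dependency was not on the import path. Here is a small sketch of the first check, assuming the handler module is a plain .py file; handler_module_path and diagnose are hypothetical helper names.

```python
def handler_module_path(handler):
    """Map a Handler setting such as 'lambda_function.lambda_handler' to a file path."""
    module, _, function = handler.rpartition(".")
    if not module or not function:
        raise ValueError(f"expected 'module.function', got {handler!r}")
    return module.replace(".", "/") + ".py"


def diagnose(handler, zip_names):
    """Say whether the handler file is where Lambda will look for it."""
    expected = handler_module_path(handler)
    if expected in zip_names:
        return "ok"
    # The file exists but is nested inside a folder of the archive.
    nested = [name for name in zip_names if name.endswith("/" + expected)]
    if nested:
        return f"move {nested[0]} to {expected}"
    return f"{expected} is missing"


names = ["psycopg2+py/lambda_function.py", "psycopg2+py/psycopg2/__init__.py"]
print(diagnose("lambda_function.lambda_handler", names))
# move psycopg2+py/lambda_function.py to lambda_function.py
```

The list of names can be taken from zipfile.ZipFile("package.zip").namelist() for a real archive.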

Cannot import name 'cygrpc' from 'grpc._cython' - Google Ads API

I want to deploy a working Python project from PyCharm to AWS Lambda. The project uses the google-ads library to get some report data from Google Ads.
I tried deploying to Lambda by uploading the complete project as a zip file, zipping all the folders and files inside the project rather than the project folder itself. But I got the following error:
{
  "errorMessage": "Unable to import module 'main': cannot import name 'cygrpc' from 'grpc._cython' (/var/task/grpc/_cython/__init__.py)",
  "errorType": "Runtime.ImportModuleError",
  "stackTrace": []
}
Assuming that the google-ads library itself works and that something is wrong with grpc (google-ads pulls in grpcio and related packages on its own), I tried to create a layer for grpcio, cython, and cygrpc, but the error remains the same.
I have created other projects and layers in AWS Lambda and they work, so I don't know what I am doing wrong here.
Any help would be really appreciated!
versions: google-ads-14.1.0, python-3.9, grpcio-1.43.0
Answering my own question after a lot of trial and error. I have made the answer generic so anyone can use it.
I believe you can fix any kind of ImportModuleError as long as your deployment package's file structure, your code, and the target architecture are correct; only then can you deploy and run your code successfully. To fix the structure and architecture, follow the steps below:
1- Install "Ubuntu 18.04 LTS" from the Microsoft Store (Windows 10).
2- Open CMD and run the following commands:
ubuntu1804
Enter your password, or create a user if asked.
cd /mnt/c    # You can choose any drive; I chose C.
mkdir my-lambda-folder    # Create the project folder.
cd my-lambda-folder    # Enter the project folder.
touch lambda_function.py    # Create a file called lambda_function.py.
Now copy and paste your code into the file you just created, i.e. lambda_function.py.
pip install --target ./package your-module-name
For example, pip install --target ./package google-ads installs the google-ads module inside the folder 'package'. The folder is created automatically if it does not exist.
cd package
zip -r ../my-deployment-package.zip .    # Create the deployment package, containing the installed library, in the root of your project folder, i.e. my-lambda-folder.
cd ..    # Go back to the root of your project folder.
zip -g my-deployment-package.zip lambda_function.py    # Add your Lambda function to the deployment package you just created.
(Optional) In my case I was using google-ads, and to run my code I also needed the google-ads.yaml file in my deployment package. So I ran the additional command zip -g my-deployment-package.zip google-ads.yaml (I had already placed this file in my project folder).
3- Upload my-deployment-package.zip to your Lambda function in the AWS console and you are good to go.
For me, it worked just by installing the packages with pip on Ubuntu in Docker, then zipping them and uploading the archive to AWS.
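The zip steps above exist to get one specific layout: the contents of package/ at the root of the archive, with lambda_function.py next to them. As an illustration only, the same layout can be produced with Python's zipfile module; build_package is a hypothetical helper, and the demo uses an empty stand-in for the installed library.

```python
import tempfile
import zipfile
from pathlib import Path


def build_package(package_dir, handler_file, output_zip):
    """Zip the contents of package_dir plus the handler file at the archive root."""
    package_dir = Path(package_dir)
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                # Paths are stored relative to package_dir, so installed
                # modules land at the zip root instead of under package/.
                archive.write(path, path.relative_to(package_dir).as_posix())
        archive.write(handler_file, Path(handler_file).name)
    return output_zip


# Demonstration in a temporary workspace with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    (workspace / "package" / "google").mkdir(parents=True)
    (workspace / "package" / "google" / "__init__.py").write_text("")
    (workspace / "lambda_function.py").write_text(
        "def lambda_handler(event, context):\n    return 'ok'\n"
    )
    archive_path = build_package(
        workspace / "package",
        workspace / "lambda_function.py",
        workspace / "my-deployment-package.zip",
    )
    with zipfile.ZipFile(archive_path) as archive:
        names = sorted(archive.namelist())

print(names)  # ['google/__init__.py', 'lambda_function.py']
```

This only reproduces the archive layout. Compiled dependencies still need to be installed on a Linux system compatible with Lambda, which is what the Ubuntu or Docker step provides.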

Unable to import module 'lambda_function': No module named *

I am trying to run a python lambda function that uses additional packages. However whenever I upload the .zip file to the lambda console I get the error:
{
  "errorMessage": "Unable to import module 'lambda_function': No module named '*'",
  "errorType": "Runtime.ImportModuleError"
}
I followed these instructions: https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-dependencies which told me to make sure my packages were in a directory local to my lambda function:
~/my-function$ pip install --target ./package Pillow
I am not using Pillow. This is sample code from their site. Nor am I using a package that you can access on Lambda already. It is one that I have got from github and need to attach to my app.
At first this didn't work, so I created a setup.cfg file and added:
[install]
prefix=
Now, when I use the pip command to install to the target, it works (it also adds lots of other folders besides my package, but I assumed they were needed, so I left them there).
When I go into the directory, the package is there.
I then found this answer: https://stackoverflow.com/a/12493244/5675125, which suggested that some hidden files might not be included and showed how I should zip them.
Again, the same error.
How do I get Lambda to recognise that my package is there?
If you require Pillow, the easiest way to use it in your function is through a popular repository of public layers (including Pillow), such as keithrozario/Klayers on GitHub. To use it, locate the ARN of the layer for your region. The list of ARNs for Python 3.8 is here.
For example, for us-east-1 the layer added for Python 3.7 would be:
Update
I just created the custom layer with instabot and can confirm that it works.
The technique uses the Docker tool described in a recent AWS blog post:
How do I create a Lambda layer using a simulated Lambda environment with Docker?
For this question, I verified it as follows:
1. Create an empty folder, e.g. mylayer.
2. Go to the folder and create a requirements.txt file with the following content:
    instabot
3. Run the following Docker command:
    docker run -v "$PWD":/var/task "lambci/lambda:build-python3.8" /bin/sh -c "pip install -r requirements.txt -t python/lib/python3.8/site-packages/; exit"
4. Remove numpy. instabot requires numpy, which is very large, so we remove it manually before creating the layer. We are going to use the numpy layer provided by AWS instead.
    sudo rm -rvf ./python/lib/python3.8/site-packages/numpy*
   If we don't remove numpy, the layer will be >50MB.
5. Create the layer as a zip:
    zip -9 -r mylayer.zip python
6. Create a Lambda layer based on mylayer.zip in the AWS Console. Don't forget to set Compatible runtimes to python3.8.
7. Add two layers to your function. The first one is AWSLambda-Python38-SciPy1x, provided by AWS, which includes numpy; the second one is the layer created above.
8. Test the layer in Lambda using the following function:
import json
from instabot import Bot
def lambda_handler(event, context):
    # TODO implement
    bot = Bot(base_path='/tmp')
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
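The numpy removal above is driven by archive size. Before uploading, it can help to see which top-level packages take the most space in the layer zip. The sketch below is one way to do that with the standard library; package_sizes is a hypothetical helper, the 50 MB constant mirrors the threshold mentioned in the steps, and the demo archive contains placeholder bytes rather than real packages.

```python
import io
import zipfile
from collections import Counter

DIRECT_UPLOAD_LIMIT = 50 * 1024 * 1024  # the 50 MB threshold from the steps


def package_sizes(zip_bytes, site_packages="python/lib/python3.8/site-packages/"):
    """Return (top-level name, uncompressed bytes) pairs, largest first."""
    sizes = Counter()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.filename.startswith(site_packages) and not info.is_dir():
                relative = info.filename[len(site_packages):]
                sizes[relative.split("/", 1)[0]] += info.file_size
    return sizes.most_common()


# Placeholder archive: sizes are invented for the demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    base = "python/lib/python3.8/site-packages/"
    archive.writestr(base + "numpy/core.so", b"x" * 1000)
    archive.writestr(base + "instabot/__init__.py", b"x" * 100)
layer = buffer.getvalue()

print(package_sizes(layer))               # [('numpy', 1000), ('instabot', 100)]
print(len(layer) <= DIRECT_UPLOAD_LIMIT)  # True
```

For a real layer, read mylayer.zip as bytes and pass it in; the first entries of the result show what to remove or move into a separate layer.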

How to package Scrapy dependency to lambda?

I am writing a Python application that depends on the Scrapy module. It works fine locally but fails when I run it from the AWS Lambda test console. My Python project has a requirements.txt file with the following dependency:
scrapy==1.6.0
I packaged all dependencies by following this link: https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html. I also put my source code (*.py) at the root level of the zip file. My packaging script can be found at https://github.com/zhaoyi0113/quote-datalake/blob/master/bin/deploy.sh.
It basically does two things: first, it runs pip install -r requirements.txt -t dist to download all dependencies to the dist directory; second, it copies the app's Python source code to the dist directory.
The deployment is done via terraform and below is the configuration file.
provider "aws" {
  profile = "default"
  region  = "ap-southeast-2"
}
variable "runtime" {
  default = "python3.6"
}
data "archive_file" "zipit" {
  type        = "zip"
  source_dir  = "crawler/dist"
  output_path = "crawler/dist/deploy.zip"
}
resource "aws_lambda_function" "test_lambda" {
  filename         = "crawler/dist/deploy.zip"
  function_name    = "quote-crawler"
  role             = "arn:aws:iam::773592622512:role/LambdaRole"
  handler          = "handler.handler"
  source_code_hash = "${data.archive_file.zipit.output_base64sha256}"
  runtime          = "${var.runtime}"
}
It zips the directory and uploads the file to Lambda.
In Lambda I get the runtime error Unable to import module 'handler': cannot import name 'etree' when there is an import scrapy statement. I don't use etree in my code, so I believe it is something used by scrapy.
My source code can be found at https://github.com/zhaoyi0113/quote-datalake/tree/master/crawler. There are only two simple Python files.
It works fine when I run them locally. The error only appears in Lambda. Is there a different way to package scrapy for Lambda?
Based on the communication with Tim, the issue is caused by incompatible library versions between local and lambda.
The easiest way to resolve this issue is to use the docker image lambci/lambda to build a package with the command:
$ docker run -v $(pwd):/outputs -it --rm lambci/lambda:build-python3.6 pip install scrapy -t /outputs/
You need to provide the entire dependency tree: scrapy has its own set of dependencies (and they may have dependencies too).
The easiest way to download all the required dependencies is to use pip:
$ pip install scrapy -t packages/
This will download scrapy and all its dependencies into the packages folder.
Scrapy has lxml and pyOpenSSL as dependencies, and these include compiled components. Unless they are statically compiled, they will likely require that the C libraries they depend on are also installed on the Lambda VM.
From the lxml documentation it requires:
libxml2 version 2.9.2 or later.
libxslt version 1.1.27 or later.
We recommend libxslt 1.1.28 or later.
Maybe try adding the installation of these to your deploy script. You should be able to use (I'm guessing at the package names) yum -y install libxml2 libxslt
Another good idea is to test your scripts on an Amazon Linux EC2 instance, as this is close to the environment that Lambda executes in.
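When a package imports locally but not in Lambda, it can help to see the exact import failure together with the interpreter and machine type the function actually runs on. The following is a minimal diagnostic sketch using only the standard library; import_report and the "modules" event key are assumptions made for this example, not part of any AWS API.

```python
import importlib
import platform


def import_report(module_names):
    """Try each import and record either 'ok' or the error text."""
    report = {
        "python": platform.python_version(),
        "machine": platform.machine(),
        "results": {},
    }
    for name in module_names:
        try:
            importlib.import_module(name)
            report["results"][name] = "ok"
        except ImportError as exc:
            # ModuleNotFoundError is a subclass of ImportError.
            report["results"][name] = f"{type(exc).__name__}: {exc}"
    return report


def handler(event, context):
    """Temporary Lambda entry point: returns the report for the listed modules."""
    return import_report(event.get("modules", ["lxml.etree", "OpenSSL"]))


# Local demonstration with one module that exists and one that does not.
print(import_report(["json", "no_such_module_for_demo"])["results"])
```

Deploying this as a temporary handler and comparing its output with a local run shows whether the failure comes from a missing package or from a compiled component built for a different Python version or platform.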

Any pyinstaller detailed example about hidden import for psutil?

I want to compile my Python code to a binary using PyInstaller, but hidden imports are blocking me. For example, the following code imports psutil and prints the CPU count:
# example.py
import psutil
print psutil.cpu_count()
And I compile the code:
$ pyinstaller -F example.py --hidden-import=psutil
When I run the output under dist:
ImportError: cannot import name _psutil_linux
Then I tried:
$ pyinstaller -F example.py --hidden-import=_psutil_linux
Still the same error. I have read the PyInstaller manual, but I still don't know how to use hidden imports. Is there a detailed example of this? Or at least an example that compiles and runs my example.py?
ENVs:
OS: Ubuntu 14.04
Python: 2.7.6
pyinstaller: 2.1
Hi, I hope you're still looking for an answer. Here is how I solved it:
Add a file called hook-psutil.py:
from PyInstaller.hooks.hookutils import (collect_data_files, collect_submodules)
datas = [('./venv/lib/python2.7/site-packages/psutil/_psutil_linux.so', 'psutil'),
         ('./venv/lib/python2.7/site-packages/psutil/_psutil_posix.so', 'psutil')]
hiddenimports = collect_submodules('psutil')
And then call pyinstaller --additional-hooks-dir=(the directory containing the above script) script.py
PyInstaller is hard to configure; cx_Freeze may be better. Both support Windows (you can download the exe directly) and Linux. Given example.py, on Windows, assuming you have installed Python in the default path (C:\Python27):
$ python c:\Python27\Scripts\cxfreeze example.py -s --target-dir some_path
cxfreeze is a Python script, so you should run it with python; the build files then end up under some_path (with a lot of xxx.pyd and xxx.dll files).
On Linux, just run:
$ cxfreeze example.py -s --target-dir some_path
This also outputs a lot of files (xxx.so) under some_path.
The drawback of cx_Freeze is that it does not copy all libraries to the target dir, which means you have to test your build in different environments. If any library is missing, just copy it to the target dir. One exception: if you build on CentOS 6 and run on CentOS 7, you will get an error about a missing libc.so.6; in that case you should build separately on both CentOS 7 and CentOS 6.
What worked for me is as follows:
Install python-psutil: sudo apt-get install python-psutil. If you have a previous installation of the psutil module from another method, for example from source or easy_install, remove it first.
Run pyinstaller as you did, without the hidden-import option.
I'm still facing the error.
Implementation:
1. A Python program that uses modules like platform, os, shutil and psutil. When I run the script directly with python, it works fine.
2. If I build a binary using PyInstaller, the build succeeds. But when I run the binary, I get a No module named psutil error. I have tried several methods, like adding the hidden import and other things. None of them works. I have been trying for almost two to three days.
Error:
ModuleNotFoundError: No module named 'psutil'
Command used to create the binary:
pyinstaller --hidden-import=['_psutil_linux'] --onefile --clean serverHW.py
I also tried --additional-hooks-dir=, which did not work either. When I run the binary I get the module-not-found error.
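One detail in the command above is that --hidden-import expects a plain module name, not a Python list literal, and the option is repeated once per module. Below is a small sketch of a helper that assembles such a command line; pyinstaller_command is a hypothetical function, and whether these particular hidden imports resolve the error depends on the PyInstaller and psutil versions in use.

```python
def pyinstaller_command(script, hidden_imports=(), hooks_dir=None):
    """Build a PyInstaller argument list with one flag per hidden import."""
    command = ["pyinstaller", "--onefile", "--clean"]
    for module in hidden_imports:
        # Each value is a fully qualified module name, e.g. psutil._psutil_linux
        command.append(f"--hidden-import={module}")
    if hooks_dir:
        command.append(f"--additional-hooks-dir={hooks_dir}")
    command.append(script)
    return command


command = pyinstaller_command("serverHW.py", ["psutil", "psutil._psutil_linux"])
print(" ".join(command))
# pyinstaller --onefile --clean --hidden-import=psutil --hidden-import=psutil._psutil_linux serverHW.py
```

The resulting list can be passed to subprocess.run(command, check=True) from the same Python environment in which psutil is installed, which also rules out building with a different interpreter than the one used to test the script.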
