I'm using Apex to deploy Lambda functions in AWS. I need to write a Lambda function that runs a cleanup script against an Oracle RDS instance in my AWS VPC. Oracle has a very nice Python library called cx_Oracle, but I'm having some problems using it in a Lambda function (running on Python 2.7). My first step was to try to run Oracle's own test code, as follows:
from __future__ import print_function
import json
import boto3
import boto3.ec2
import os
import cx_Oracle

def handle(event, context):
    # cx_Oracle connect strings take the form user/password@host:port/service
    con = cx_Oracle.connect('username/password@my.oracle.rds:1521/orcl')
    print(str(con.version))
    con.close()
When I try to run this piece of test code, I get the following response:
Unable to import module 'main': /var/task/cx_Oracle.so: invalid ELF header
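One way to sanity-check the bundled binary (a diagnostic sketch, run against the unpacked deployment zip):

file cx_Oracle.so
# a Lambda-compatible build should report something like:
#   ELF 64-bit LSB shared object, x86-64
# anything else (e.g., Mach-O) means it was built on the wrong platform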
Google has told me that this error arises because cx_Oracle is not a pure-Python Oracle implementation; it is a compiled extension that links against the Oracle client libraries (the ones shipped with SQL*Plus / Oracle Instant Client), which must be present on the host. The 'invalid ELF header' itself also suggests the cx_Oracle.so in my package was not built for Lambda's 64-bit Linux environment.
Obviously, pre-installing the Oracle client might be difficult.
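One approach I can see is to bundle the Instant Client shared libraries inside the zip itself, roughly like this (a sketch only; the Instant Client version, the file names, and the reliance on /var/task/lib being on the runtime's LD_LIBRARY_PATH are my assumptions):

# build on 64-bit Linux (e.g., an Amazon Linux EC2 box) so the binaries match Lambda
mkdir -p build/lib
pip install cx_Oracle -t build/
# hypothetical Instant Client download, unpacked alongside
cp instantclient_12_2/libclntsh.so* build/lib/
cp main.py build/
cd build && zip -r ../function.zip .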
Apex has the hooks {} functionality, which would allow me to pre-build things, but I'm having trouble finding documentation showing what happens to those artefacts and how that works. In theory I could download the libraries into a Nexus repository or an S3 bucket, and then in my hooks {} declaration add them to the zip file; I could then try to install them as part of the Python script (a sketch of such a hook follows the questions below). However, I have a few problems with this:
1. How are the 'built' artefacts accessed inside the Lambda function? Can they be? Have I misunderstood this?
2. Does a Python 2.7 Lambda function have enough access rights to the operating system of the host container to be able to install a library?
3. If the answer to question 2 is no, is there another way to write a Lambda function that runs some SQL against an Oracle RDS instance?
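For reference, the kind of hook declaration I have in mind in function.json (a sketch only; my possibly-wrong understanding is that the build hook runs in the function directory before Apex zips it, so whatever it writes there lands in the deployment package):

{
  "hooks": {
    "build": "pip install -r requirements.txt -t ."
  }
}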
Related
I am trying to connect to an AWS RDS SQL Server instance from AWS Lambda using a Python script, to query a table. But I am not seeing any AWS API for this, and when I try "import pyodbc" I see the below error.
Unable to import module 'lambda_function': No module named 'pyodbc'
Connection:
cnxn = pyodbc.connect("Driver={SQL Server};"
                      "Server=data-migration-source-instance.asasasas.eu-east-1.rds.amazonaws.com;"
                      "Database=sourcedb;"
                      "uid=source;pwd=source1234")
Any points on how to query RDS SQL Server?
The error you're getting means that the Lambda environment doesn't have the pyodbc module.
You should read up on dependency management in AWS Lambda. There are basically two strategies for including dependencies with your deployment: Lambda Layers, or bundling them into the deployment package zip.
If you're using the Serverless Framework, then serverless-python-requirements is an excellent plugin for managing your dependencies; it lets you choose your dependency-management strategy with minimal changes to your application.
You need to upload the Lambda's dependencies along with the code. If you deploy your Lambda manually (i.e. as a zip file, right from the console), you will need to bundle the pyodbc library. (More information is available here: https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-dependencies.)
If you're using any other deployment tool (Serverless, SAM, Chalice), it will be much easier: https://www.serverless.com/plugins/serverless-python-requirements, https://aws.github.io/chalice/topics/packaging.html#rd-party-packages, https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-using-build.html
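For the manual zip route, the build amounts to something like this (a sketch; note that pyodbc additionally needs the unixODBC and SQL Server driver shared libraries at runtime, which a plain pip install does not bundle):

# install the dependency into a staging directory, on Linux so the
# compiled extension matches the Lambda runtime
pip install pyodbc -t package/
cp lambda_function.py package/
cd package && zip -r ../deployment.zip .
# upload deployment.zip as the function's code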
I have written a Python job that uses SQLAlchemy to query a SQL Server database. However, when using external libraries with AWS Glue you are required to wrap those libraries in an egg file. This causes an issue with the SQLAlchemy package, as it uses pyodbc, which (to my understanding) cannot be wrapped in an egg because it has other dependencies.
I have attempted to find a way of connecting to a SQL Server database within a Python Glue job, but so far the closest advice I've been able to find suggests I write a Spark job instead, which isn't appropriate.
Does anyone have experience with connecting to SQL Server within a Python 3 Glue Job? If so can I have an example snippet of code + packages used?
Yes, I actually managed to do something similar by bundling the dependencies, including transitive dependencies.
Follow the steps below:
1 - Create a script which zips all of the code and dependencies into a zip file and uploads it to S3:
python3 -m pip install -r requirements.txt --target custom_directory
python3 -m zipapp custom_directory/   # produces custom_directory.pyz
mv custom_directory.pyz custom_directory.zip
Upload this zip instead of egg or wheel.
2 - Create a driver program that executes the Python program we just zipped in step 1.
import sys

if len(sys.argv) == 1:
    raise SyntaxError("Please provide a module to load.")

# make the bundled zip importable before importing from it
sys.path.append(sys.argv[1])

from your_module import your_function
sys.exit(your_function())
3 - You can then submit your job using:
spark-submit --py-files custom_directory.zip your_program.py
See:
How can you bundle all your python code into a single zip file?
I can't seem to get --py-files on Spark to work
I was trying to create an Azure Function using Python (HTTP trigger) to fetch data from a Gremlin graph.
I used
from gremlin_python.driver import client as clientDriver
to import the libraries, and it was working fine locally.
When I deploy the same code to the Azure portal and run it, I get a 500 internal error.
After trying some changes, I could see that the "from gremlin_python.driver import client as clientDriver" import statement is not working (when I remove this piece, the code works).
When we run the code in VS Code, we create a virtual env and install the gremlin packages, which is why it works locally and not in the Azure portal.
Could someone help me in resolving this issue?
For this problem, we need to make sure the requirements.txt is right. And if you just import the module with the line
from gremlin_python.driver import client as clientDriver
you need to add another line to import the gremlin_python.driver module explicitly:
import gremlin_python.driver
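For completeness, the requirements.txt should list the Gremlin client package, something like this (the pinned version here is an assumption):

gremlinpython==3.4.6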
Hope it helps~
I'm trying to use boto3 within a pipenv with Python 3.6.5.
So I installed it with
pipenv install boto3
So for testing purposes I'm using a single Flask app, and I add at the beginning of the file:
import boto3
However, without even running the program, PyLint warns me E0401: Unable to import 'boto3', and auto-completion only proposes botocore.
If I try to run the Flask app or to deploy it to Lambda (since that's the purpose of this app), I get a 500 error.
However, the strange thing is that if I use the REPL within the pipenv, in the same directory, and type
>>> import boto3
it succeeds, and I can use all the other boto3 commands. So in my opinion it is installed, but for a reason I can't think of, my Python file can't load it.
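A couple of checks that can narrow this down (a sketch; they only reveal which interpreter each tool is using):

pipenv --venv                # path of the virtualenv pipenv created
pipenv run python -c "import boto3; print(boto3.__file__)"
# if this succeeds while PyLint/Flask still fail, they are almost
# certainly running against a different interpreter than the pipenv one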
I have heard of file-naming conflicts, but honestly I doubt that's the reason here, since even if I rename the file and the Flask app to something unusual, it still can't load.
Any thoughts about it? Thanks a lot
I am trying to connect to an MSSQL database from AWS Lambda (using Python) and am really struggling to proceed further.
I tried many options with pyodbc, pypyodbc, and pymssql; they all work on my local development machine (Windows 7), however AWS Lambda is unable to find the required packages when deployed. I use Zappa for deployment of the Lambda package.
I have searched through many forums but am unable to see anything moving this ahead; any help would be highly appreciated.
Many thanks,
Akshay
I tried different trial-and-error steps and ended up with the following, which is working fine in AWS Lambda; I am using the pymssql package only.
1) Ran 'pip install pymssql' on an Amazon EC2 instance, since under the hood Amazon uses Linux AMIs to run its Lambda functions.
2) Copied the generated '.so' files and packaged them inside the Lambda deployment package.
[screenshot of the Lambda deployment package's folder structure omitted]
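The steps above amount to something like this (a sketch; the file names are assumptions):

# on an Amazon Linux EC2 instance, so the compiled .so files match Lambda
pip install pymssql -t build/
cp lambda_function.py build/
cd build && zip -r ../deployment.zip .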
Let me know if you further need help with this.
Try importing cython together with pymssql in your code.
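That is, a two-line sketch of the suggestion:

import cython   # imported first, per the suggestion above
import pymssql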