I want to test my Lambda locally using boto3, moto, and pytest. The Lambda is built with Chalice. When I call the function I pass in a fake event to make it run, but it is still missing the context object.
If someone knows the cleanest way to test this, that would be great.
I tried adding objects to my S3 bucket and retrieving the resulting events.
I tried simulating fake events.
@app.on_s3_event(bucket=s.BUCKET_NAME, events=['s3:ObjectCreated:*'], prefix=s.PREFIX_PREPROCESSED, suffix=s.SUFFIX)
def handle_pre_processed_event(event):
    """
    Lambda for preprocessed data
    :param event:
    :return:
    """
    # Retrieve the JSON file that was added to the S3 bucket
    json_s3 = get_json_file_s3(event.bucket, event.key)
    # Send all the records to DynamoDB
    insert_records(json_s3)
    # Change the path of the file by copying it and deleting the original
    change_path_file(event.key, s.PREFIX_PREPROCESSED)
Here is the lambda I want to test. Thanks for your responses.
If someone gets the same problem: it's because Chalice uses a wrapper around the handler. Pass your S3 notification event and a context object to the handler.
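A minimal pytest sketch of that idea is below, assuming the Chalice app lives in app.py, that you're on a moto release which still exposes mock_s3 (newer versions use mock_aws instead), and that insert_records / change_path_file are mocked or backed by their own fakes (omitted here). The bucket name and key are placeholders.
import json
from unittest.mock import MagicMock

import boto3
from moto import mock_s3

from app import handle_pre_processed_event  # hypothetical module layout


def fake_s3_notification(bucket, key):
    # Minimal S3 "ObjectCreated" notification payload in the shape Chalice's wrapper parses.
    return {
        "Records": [
            {
                "eventName": "ObjectCreated:Put",
                "s3": {"bucket": {"name": bucket}, "object": {"key": key}},
            }
        ]
    }


@mock_s3
def test_handle_pre_processed_event():
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket="fake-bucket")
    s3.put_object(Bucket="fake-bucket", Key="preprocessed/file.json",
                  Body=json.dumps([{"id": 1}]))

    # Chalice wraps the decorated function, so it is called with (event, context);
    # a MagicMock stands in for the Lambda context object.
    handle_pre_processed_event(
        fake_s3_notification("fake-bucket", "preprocessed/file.json"),
        MagicMock(),
    )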
I have a bucket called:
bucket_1
On bucket_1 I have S3 events set up for all new objects created within that bucket. What I would like to do is set up a rule so that, if the file that is dropped is prefixed with "data", it triggers a Lambda function that will process the data within the file.
What I'm struggling with is how to filter for a particular file. So far the code I have within Python is this:
import json
import boto3

def handler(event, context):
    s3 = boto3.resource('s3')
    message = json.loads(event['Records'][0]['Sns']['Message'])
    print("JSON: " + json.dumps(message))
    return message
This Lambda is triggered when an event is added to my SNS topic, but I just want to filter specifically for an object creation with a prefix of "data".
Has anyone done something similar to this before? For clarity, this is the workflow of the job that I would like to happen:
1. file added to bucket_1
2. notification sent to SNS topic [triggers the Python code above]
3. Python filters for an object-created notification and a file with a prefix of "data*" [triggers the Python below]
4. Python fetches the data from the S3 location, cleans it up, and places it into a table.
So specifically I am looking for how exactly to set up step 3.
After looking around I've found that this isn't actually done in Python but rather when configuring events on the S3 bucket itself.
You go to this area:
s3 > choose your bucket > properties > event notifications > create notification
Once you are there, you just select 'All object create events' and you can even specify the prefix and suffix of the file name. Then you will only receive notifications to that topic for what you've specified.
Hopefully this helps someone at some point!
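If you'd rather keep that configuration in code than click through the console, the same prefix filter can be set with boto3; here is a sketch, where the topic ARN is a placeholder and the call replaces any notification configuration already on the bucket:
import boto3

s3 = boto3.client("s3")

# Only keys starting with "data" will publish a notification to the SNS topic.
# Note: this call overwrites the bucket's existing notification configuration.
s3.put_bucket_notification_configuration(
    Bucket="bucket_1",
    NotificationConfiguration={
        "TopicConfigurations": [
            {
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:my-topic",  # placeholder
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": "data"}
                        ]
                    }
                },
            }
        ]
    },
)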
I'm trying to update a Docker image within a deployment in EKS. I'm running Python code from a Lambda function. However, I don't know how to use generate_presigned_url(). What should I pass as the ClientMethod parameter?
import boto3
client = boto3.client("eks")
url = client.generate_presigned_url()
These are the client methods that you can use as ClientMethod in the case of the EKS client:
'associate_encryption_config'
'associate_identity_provider_config'
'can_paginate'
'create_addon'
'create_cluster'
'create_fargate_profile'
'create_nodegroup'
'delete_addon'
'delete_cluster'
'delete_fargate_profile'
'delete_nodegroup'
'describe_addon'
'describe_addon_versions'
'describe_cluster'
'describe_fargate_profile'
'describe_identity_provider_config'
'describe_nodegroup'
'describe_update'
'disassociate_identity_provider_config'
'generate_presigned_url'
'get_paginator'
'get_waiter'
'list_addons'
'list_clusters'
'list_fargate_profiles'
'list_identity_provider_configs'
'list_nodegroups'
'list_tags_for_resource'
'list_updates'
'tag_resource'
'untag_resource'
'update_addon'
'update_cluster_config'
'update_cluster_version'
'update_nodegroup_config'
'update_nodegroup_version'
You can get more information about these methods in the documentation here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/eks.html#client
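For what it's worth, that list can be reproduced straight from the client object, so it stays in sync with whatever boto3 version you have installed:
import boto3

client = boto3.client("eks")
# Public methods on the EKS client; ClientMethod expects one of these names.
print([m for m in dir(client) if not m.startswith("_")])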
After over two weeks I suppose you've found your answer. Anyway, the ClientMethod mentioned (and not really well explained in the boto3 docs) is just the name of one of the methods you can call on the EKS client itself. I honestly think this is what KnowledgeGainer was trying to say by listing all the methods: basically, you can just pick one. This will give you the presigned URL.
For example, here I'm using a method that doesn't require any additional arguments, list_clusters:
>>> import boto3
>>> client = boto3.client("eks")
>>> client.generate_presigned_url("list_clusters")
'https://eks.eu-west-1.amazonaws.com/clusters?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAQKOXLHHBFT756PNG%2F20210528%2Feu-west-1%2Feks%2Faws4_request&X-Amz-Date=20210528T014603Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=d25dNCC17013ad9bc75c04b6e067105c23199c23cbadbbbeForExample'
If the method requires any additional arguments, you add those into Params as a dictionary:
>>> method_params = {'name': <your_cluster_name>}
>>> client.generate_presigned_url('describe_cluster', Params=method_params)
I've never written a recursive Python script before. I'm used to splitting a monolithic function up into sub AWS Lambda functions. However, this particular script I am working on is challenging to break up into smaller functions.
Here is the code I am currently using, for context. I am using one API request to return a list of objects within a table.
url_pega_EEvisa = requests.get('https://cloud.xxxx.com:443/prweb/api/v1/data/D_pxCaseList?caseClass=xx-xx-xx-xx', auth=(username, password))
pega_EEvisa_raw = url_pega_EEvisa.json()
pega_EEvisa = pega_EEvisa_raw['pxResults']
This returns every object (primary key) within a particular table as a list. For example:
['XX-XXSALES-WORK%20PO-1', 'XX-XXSALES-WORK%20PO-10', 'XX-XXSALES-WORK%20PO-100', 'XX-XXSALES-WORK%20PO-101', 'XX-XXSALES-WORK%20PO-102', 'XX-XXSALES-WORK%20PO-103', 'XX-XXSALES-WORK%20PO-104', 'XX-XXSALES-WORK%20PO-105', 'XX-XXSALES-WORK%20PO-106', 'XX-XXSALES-WORK%20PO-107']
I then use this list to make more GET requests in a for loop, which grabs all the data for each object.
for t in caseid:
    url = requests.get(('https://cloud.xxxx.com:443/prweb/api/v1/cases/{}'.format(t)), auth=(username, password)).json()
    data.append(url)
This particular Lambda function takes about 15 minutes, which is the limit for one AWS Lambda invocation. Ideally, I'd like to split the list into smaller parts and run the same process on each. I am struggling with marking the point where it last ran before failure and passing that information on to the next function.
Any help is appreciated!
I'm not sure if I entirely understand what you want to do with the data once you've fetched all the information about the case, but in terms of breaking up the work one Lambda is doing into many Lambdas, you should be able to chunk the list of cases and pass the chunks to new invocations of the same Lambda. Python pseudocode below; hopefully it helps illustrate the idea. I borrowed the chunks method from this answer, which helps break the list into batches.
import json

import boto3
import requests

client = boto3.client('lambda')

def chunks(lst, n):
    # Yield successive n-sized batches from lst (borrowed from the linked answer).
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

def handler(event, context):
    # username and password come from the original question's setup.
    url_pega_EEvisa = requests.get(
        'https://cloud.xxxx.com:443/prweb/api/v1/data/D_pxCaseList?caseClass=xx-xx-xx-xx',
        auth=(username, password))
    pega_EEvisa_raw = url_pega_EEvisa.json()
    pega_EEvisa = pega_EEvisa_raw['pxResults']
    for chunk in chunks(pega_EEvisa, 10):
        client.invoke(
            FunctionName='lambdaToHandleBatchOfTenCases',
            Payload=json.dumps(chunk)
        )
Hopefully that helps? Let me know if this was not on target 😅
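One caveat on the sketch above: client.invoke defaults to a synchronous RequestResponse call, so the parent Lambda would wait on every batch, which defeats the point of splitting the work. If that matters for your case, passing InvocationType='Event' fires each batch asynchronously instead:
client.invoke(
    FunctionName='lambdaToHandleBatchOfTenCases',
    InvocationType='Event',  # fire-and-forget so the parent Lambda doesn't wait
    Payload=json.dumps(chunk)
)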
I'm trying to implement a custom AWS Lambda layer in order to use it with my functions.
It should be a simple layer that gets some parameters from SSM and initializes PureSec's function_shield for protection of my services.
The code looks more or less like this:
import os
import boto3
import function_shield as shield

STAGE = os.environ['stage']
REGION = os.environ['region']
PARAMETERS_PREFIX = os.environ['parametersPrefix']


class ParameterNotFoundException(Exception):
    pass


session = boto3.session.Session(region_name=REGION)
ssm = session.client('ssm')

# function_shield config
parameter_path = f"/{PARAMETERS_PREFIX}/{STAGE}/functionShieldToken"

try:
    shield_token = ssm.get_parameter(
        Name=parameter_path,
        WithDecryption=True,
    )['Parameter']['Value']
except Exception:
    raise ParameterNotFoundException(f'Parameter {parameter_path} not found.')

policy = {
    "outbound_connectivity": "block",
    "read_write_tmp": "block",
    "create_child_process": "block",
    "read_handler": "block"
}


def configure(p):
    """
    update function_shield policy
    :param p: policy dict
    :return: null
    """
    policy.update(p)
    shield.configure({"policy": policy, "disable_analytics": True, "token": shield_token})


configure(policy)
I want to be able to link this layer to my functions so they are protected at runtime.
I'm using the Serverless Framework, and it seems like my layer was deployed just fine, as was my example function. Also, the AWS console shows me that the layer is linked to my function.
I named my layer 'shield' and tried to import it by its name in my test function:
import os
import shield

def test(event, context):
    # This should be reusable for easy tweaking whenever I need to give more or less
    # permissions to my lambda code.
    shield.configure(policy)
    os.system('ls')
    return {
        'rep': 'ok'
    }
Ideally, I should get an error in CloudWatch telling me that function_shield has prevented a child process from running; however, I instead receive an error telling me that there is no 'shield' module in my runtime.
What am I missing?
I couldn't find any custom code examples being used for layers apart from numpy, scipy, binaries, etc.
I'm sorry for my stupidity...
Thanks for your kindness!
You also need to name the file in your layer shield.py so that it's importable in Python. Note it does not matter how the layer itself is named. That's a configuration in the AWS world and has no effect on the Python world.
What does have an effect is the structure of the layer archive. You need to place the files you want to import into a python directory, zip it, and use that resulting archive as a layer (I'm presuming the Serverless Framework is doing this for you).
In the Lambda execution environment, the layer archive gets extracted into /opt, but it's only /opt/python that's declared in the PYTHONPATH. Hence the need for the "wrapper" python directory.
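As a concrete illustration of that layout, here is a small sketch that packages shield.py under the required python/ prefix using only the standard library (the file and archive names are just examples; the Serverless Framework would normally produce an equivalent structure for you):
import zipfile

# The archive must contain python/shield.py so that, once extracted to /opt,
# shield.py ends up under /opt/python, which is on PYTHONPATH in the Lambda runtime.
with zipfile.ZipFile("shield_layer.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("shield.py", arcname="python/shield.py")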
Have a look here; I have described all the necessary steps to set up and call custom Lambda layer functions from a Lambda.
https://medium.com/@nimesh.kumar031/how-to-set-up-layers-python-in-aws-lambda-functions-1355519c11ed?source=friends_link&sk=af4994c28b33fb5ba7a27a83c35702e3
I have a lambda function that takes in a dataset name, and creates a new lambda specifically for that dataset. Here's the code that sets this up:
lambda_response = lambda_client.create_function(
    FunctionName=job_name,
    Runtime='python3.6',
    Role=role_name,
    Handler='streaming_data_lambda.lambda_handler',
    Code={
        'S3Bucket': code_bucket,
        'S3Key': 'streaming_data_lambda.py.zip'
    },
    Timeout=30,
)
This appears to be creating a lambda correctly, and works when I kick it off manually. I want to have this created lambda run once an hour, so the original lambda script creates the following rules and targets:
rule_response = event_client.put_rule(
    Name=rule_name,
    ScheduleExpression=schedule_expression
)
event_client.put_targets(
    Rule=rule_name,
    Targets=[
        {
            'Id': lambda_response['FunctionName'],
            'Arn': lambda_response['FunctionArn'],
            'Input': json.dumps(input_string)
        }
    ]
)
where input_string is just something like {"datasetName": "name"}. I can see the rule in the CloudWatch Rules UI, and can see it's linked to the correct lambda and the input text is present. It triggers correctly every hour also, but fails to invoke the lambda function. If I look at the lambda in the UI and add the CloudWatch event rule I created as a trigger in the Designer section there, then it correctly kicks off the lambda, but is there some way to set this up in Python so I don't have to do this last step in the UI?
For anyone who might be looking for the answer to this in the future - you need to add permission for CloudWatch Events to invoke your Lambda function, like so:
lambda_client.add_permission(
    FunctionName=lambda_response['FunctionName'],
    StatementId=some_random_number,  # must be a unique statement identifier string
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule_response['RuleArn']
)
You may need to add
event_client.enable_rule(Name=rule_name)
after the put_rule call.
In this bit, there is possibly some extra config that the UI adds
event_client.put_rule(
    Name=rule_name,
    ScheduleExpression=schedule_expression)
try using "DescribeRule" on this rule after it is enabled and working and then duplicate any missing fields ( like RoleArn ) in the python
The answer mentioned by @meowseph_furbalin will work, but the issue with it is that each rule adds a new resource policy statement to the Lambda. AWS limits the maximum size of a function's resource policy to around 20 KB, so after hitting that limit the target will not be triggered.
One way to overcome this issue is to add wildcard matching to the Lambda permission so that a single resource policy statement matches every rule.
aws lambda add-permission --function-name lambda_name \
    --action 'lambda:InvokeFunction' --principal events.amazonaws.com \
    --statement-id '1' \
    --source-arn 'arn:aws:events:us-west-2:356280712205:rule/*'
After adding this permission you don't need to add permissions to the Lambda dynamically, and your Lambda will accept all the CloudWatch rules with a single resource policy statement.
Please keep in mind that while creating CloudWatch rules you have to keep the SID as '1' for all rules. If you want to change the SID, add another permission statement corresponding to that SID.
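Since the rest of this thread does the setup from boto3, the same wildcard permission can also be added in Python; a sketch, where the account ID and region are placeholders you would swap for your own:
lambda_client.add_permission(
    FunctionName='lambda_name',
    StatementId='1',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn='arn:aws:events:us-west-2:123456789012:rule/*',  # placeholder account/region
)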