Is it possible to have DynamoDB conditionally save a timestamp if the item is created?
It looks like the AWS Java SDK provides this functionality via the @DynamoDBAutoGeneratedTimestamp annotation.
You could also write a DynamoDB trigger (an AWS Lambda function) to do this for you:
https://aws.amazon.com/dynamodb/faqs/
Q: How do DynamoDB Triggers work?
The custom logic for a DynamoDB trigger is stored in an AWS Lambda
function as code. To create a trigger for a given table, you can
associate an AWS Lambda function to the stream (via DynamoDB Streams)
on a DynamoDB table. When the table is updated, the updates are
published to DynamoDB Streams. In turn, AWS Lambda reads the updates
from the associated stream and executes the code in the function.
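For illustration, here is a minimal Python sketch of what such a trigger's Lambda function could look like with boto3, assuming a hypothetical table named my-table with a string hash key pk and a created_at timestamp attribute (all of these names are assumptions, not taken from the question):

import datetime
import boto3

table = boto3.resource("dynamodb").Table("my-table")  # hypothetical table name

def handler(event, context):
    for record in event["Records"]:
        # Only stamp items that were just created, not modified or removed ones.
        if record["eventName"] != "INSERT":
            continue
        key = record["dynamodb"]["Keys"]["pk"]["S"]
        table.update_item(
            Key={"pk": key},
            # if_not_exists guards against overwriting an existing timestamp.
            UpdateExpression="SET created_at = if_not_exists(created_at, :now)",
            ExpressionAttributeValues={":now": datetime.datetime.utcnow().isoformat()},
        )

Because the function reacts only to INSERT records, the MODIFY event generated by its own update does not retrigger the timestamp write.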
I've written an AWS Glue ETL job script in Python, and I'm looking for the proper way to perform conditional writes to the DynamoDB table I'm using as the target.
# Write to DynamoDB
glueContext.write_dynamic_frame_from_options(
    frame=SelectFromCollection_node1665510217343,
    connection_type="dynamodb",
    connection_options={
        "dynamodb.output.tableName": args["OUTPUT_TABLE_NAME"]
    }
)
My script writes to DynamoDB with write_dynamic_frame_from_options. The AWS Glue connection parameter docs make no mention of any way to customize the write behavior in the connection options.
Is there a clean way to write conditionally without using boto?
You cannot do conditional writes with the EMR DynamoDB connector that Glue uses; it does a complete overwrite of the data. For that you would have to use Boto3 and distribute the writes with foreachPartition across the Spark executors.
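For what it's worth, a rough sketch of that Boto3-per-partition approach might look like the following, assuming a hypothetical output table with a partition key named id, and that every row contains only attribute types DynamoDB accepts (e.g. no raw floats):

import boto3

def write_partition(rows):
    # Create the client inside the function: it runs on the executors and
    # boto3 objects are not serializable from the driver.
    table = boto3.resource("dynamodb").Table("my-output-table")  # hypothetical name
    for row in rows:
        try:
            table.put_item(
                Item=row.asDict(),
                # Conditional write: only insert if no item with this key exists yet.
                ConditionExpression="attribute_not_exists(id)",
            )
        except table.meta.client.exceptions.ConditionalCheckFailedException:
            pass  # item already present; skip it

# Convert the DynamicFrame to a DataFrame and distribute the writes.
SelectFromCollection_node1665510217343.toDF().rdd.foreachPartition(write_partition)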
I have a script which checks if a specific value is inside a cell in a dynamodb table in AWS. I used to add hardcoded credentials containing the secret key in my script such as this:
from boto3.session import Session

dynamodb_session = Session(aws_access_key_id='access_key_id',
                           aws_secret_access_key='secret_access_key',
                           region_name='region')
dynamodb = dynamodb_session.resource('dynamodb')
table = dynamodb.Table('table_name')
Are there any other ways to use those credentials without adding them to my script? Thank you.
If you are running that code on an Amazon EC2 instance, then you simply need to assign an IAM Role to the instance and it will automatically receive credentials.
If you are running that code on your own computer, then use the AWS Command-Line Interface (CLI) aws configure command to store the credentials in a local configuration file. (It will be stored in ~/.aws/credentials).
Then, in both cases, you can simply use:
dynamodb = boto3.resource('dynamodb')
You can set the default region in that configuration too.
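For example, the snippet from the question could be reduced to something like this, relying entirely on the default credential chain (the key and attribute names here are hypothetical):

import boto3

# No keys in the code: credentials come from the IAM role or ~/.aws/credentials,
# and the region from the same configuration (or pass region_name= explicitly).
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('table_name')

response = table.get_item(Key={'id': 'some-key'})
value = response.get('Item', {}).get('some_attribute')
print(value == 'expected-value')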
I am looking to set up an alert notification, either from the Snowflake or AWS side, or via Glue jobs / Lambda functions using Python or Scala.
I would like to compare 2 tables which hold the table names and record counts for source and target.
Data is loaded from S3 to Snowflake via an AWS Glue job, and after that I would like to compare the 2 tables to verify that source and target record counts match, and send a notification for any mismatch.
Please let me know your inputs on how to achieve this.
Thanks,
Jo
If you are using AWS Glue to load the tables in Snowflake, you can continue using Glue to orchestrate the desired result:
Have Glue load the table.
Have Glue run a stored procedure in Snowflake comparing both tables.
https://snowflakecommunity.force.com/s/article/How-to-Use-AWS-Glue-to-Call-Procedures-in-Snowflake
Have AWS Glue send a notification through SNS (a rough Boto3 sketch of this step follows below).
https://aws.amazon.com/blogs/big-data/build-and-automate-a-serverless-data-lake-using-an-aws-glue-trigger-for-the-data-catalog-and-etl-jobs/
See the chapter "Monitoring and notification with Amazon CloudWatch Events".
If you need SQL for the stored procedure that compares two tables, please feel free to add a new question.
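As a rough sketch of the SNS step, assuming the comparison has already produced a list of mismatched tables and that an SNS topic already exists (the topic ARN and field names below are hypothetical):

import boto3

def notify_mismatches(mismatches, topic_arn="arn:aws:sns:us-east-1:123456789012:count-mismatch"):
    # Publish one message listing every table whose source/target counts differ.
    if not mismatches:
        return
    sns = boto3.client("sns")
    sns.publish(
        TopicArn=topic_arn,
        Subject="Source/target record count mismatch",
        Message="\n".join(
            f"{m['table']}: source={m['source_count']} target={m['target_count']}"
            for m in mismatches
        ),
    )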
I'm trying to create an AWS Lambda webservice that takes a payload with a new username / password to create a new database and user in an RDS instance.
I'd like to use Boto3 to accomplish this, but I can't seem to find any documentation for this function.
Is this possible using this setup?
Currently, the AWS SDKs for RDS (including the Boto3 SDK) do not support this, and neither does the AWS CLI.
That's because creating DB users is specific to each DB engine (MySQL, Oracle, etc.).
The option you have is to run a DDL statement using the respective database driver.
http://boto3.readthedocs.io/en/latest/reference/services/rds.html#RDS.Client.generate_db_auth_token documents how to create an auth token for connecting to an RDS instance, and http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.html covers other setup details.
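As a rough sketch of the DDL approach from inside Lambda, assuming a MySQL engine and the PyMySQL driver (both assumptions; the endpoint, master credentials, and payload fields below are hypothetical, and in practice the master password should come from Secrets Manager or IAM authentication rather than being hard-coded):

import pymysql

def handler(event, context):
    # Hypothetical payload fields from the webservice request.
    new_db = event["database"]
    new_user = event["username"]
    new_password = event["password"]

    conn = pymysql.connect(
        host="my-instance.abc123.us-east-1.rds.amazonaws.com",  # hypothetical endpoint
        user="admin",
        password="master-password",  # placeholder; load from Secrets Manager instead
        autocommit=True,
    )
    with conn.cursor() as cur:
        # DDL statements cannot use bind parameters, so validate the inputs first.
        cur.execute(f"CREATE DATABASE `{new_db}`")
        cur.execute(f"CREATE USER '{new_user}'@'%' IDENTIFIED BY '{new_password}'")
        cur.execute(f"GRANT ALL PRIVILEGES ON `{new_db}`.* TO '{new_user}'@'%'")
    conn.close()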
I have an AWS Lambda function which writes S3 file metadata to DynamoDB for every object created in an S3 bucket; for this I have an event trigger on the S3 bucket. I'm planning to automate testing using Python. Can anyone help me work out how to test the following for this Lambda function using the unittest package?
Verify that the DynamoDB table exists.
Validate whether the S3 bucket for the event trigger exists.
Verify that the file count in the S3 bucket matches the record count in the DynamoDB table.
This can be done using moto and unittest. What moto does is add a stateful mock for AWS: your code can continue calling boto3 like normal, but the calls won't actually be made to AWS. Instead, moto will build up state in memory.
For example, you could (as sketched below):
Activate the mock for DynamoDB
Create a DynamoDB table
Add items to the table
Retrieve items from the table and see they exist
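A minimal sketch of those DynamoDB steps with unittest, assuming a moto version that still exposes mock_dynamodb2 (newer releases rename it to mock_dynamodb or mock_aws) and a hypothetical table named metadata-table:

import unittest
import boto3
from moto import mock_dynamodb2


class TestMetadataTable(unittest.TestCase):
    @mock_dynamodb2
    def test_put_and_get_item(self):
        # The mock is active, so this creates an in-memory table; no AWS calls are made.
        dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
        table = dynamodb.create_table(
            TableName="metadata-table",
            KeySchema=[{"AttributeName": "object_key", "KeyType": "HASH"}],
            AttributeDefinitions=[{"AttributeName": "object_key", "AttributeType": "S"}],
            BillingMode="PAY_PER_REQUEST",
        )

        # Add an item, then read it back to confirm it exists.
        table.put_item(Item={"object_key": "file.csv", "size": 123})
        item = table.get_item(Key={"object_key": "file.csv"})["Item"]
        self.assertEqual(item["size"], 123)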
If you're building functionality for both DynamoDB and S3, you'd leverage both the mock_s3 and mock_dynamodb2 methods from moto.
I wrote up a tutorial on how to do this (it uses pytest instead of unittest but that should be a minor difference). Check it out: joshuaballoch.github.io/testing-lambda-functions/