Boto3 - Wait until AWS Database Migration Task is deleted - python

Requirement: Delete DMS Task, DMS Endpoints and Replication Instance.
Using: a Boto3 Python script in Lambda
My Approach:
1. Delete the database migration task first, as the endpoints and the replication instance can't be deleted before it is gone.
2. Delete Endpoints
3. Delete Replication Instance
Issue: When I run these 3 delete commands, I get the following error:
"errorMessage": "An error occurred (InvalidResourceStateFault) when calling the DeleteEndpoint operation: Endpoint arn:aws:dms:us-east-1:XXXXXXXXXXXXXX:endpoint:XXXXXXXXXXXXXXXXXXXXXX is part of one or more ReplicationTasks."
I know the data migration task takes some time to delete, and until then the endpoint is still attached to the task, so it can't be deleted.
There is an AWS CLI command to check whether the task is deleted or not: replication-task-deleted.
I could run that in a shell and wait (sleep) until I get the final status, and then execute the delete-endpoint script.
But I couldn't find an equivalent command in the Boto3 DMS docs.
Is there another Boto3 call I can use to check the status and make my Python script sleep until then?
Please let me know if I can approach the issue in a different way.

You need to use waiters. In your case, that is Waiter.ReplicationTaskDeleted.
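A minimal sketch of how that could look; the task ARN, endpoint ARN, and waiter timings below are assumptions:
import boto3
dms = boto3.client('dms')
task_arn = 'arn:aws:dms:us-east-1:123456789012:task:EXAMPLE'  # hypothetical ARN
# 1. Start deleting the migration task
dms.delete_replication_task(ReplicationTaskArn=task_arn)
# 2. Block until the task is actually gone before touching the endpoints
waiter = dms.get_waiter('replication_task_deleted')
waiter.wait(
    Filters=[{'Name': 'replication-task-arn', 'Values': [task_arn]}],
    WaiterConfig={'Delay': 15, 'MaxAttempts': 60},  # poll every 15 s, up to ~15 min
)
# 3. Now the endpoints and the replication instance can be deleted safely
# dms.delete_endpoint(EndpointArn=...)
# dms.delete_replication_instance(ReplicationInstanceArn=...)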

Related

Airflow: distinguish API- and UI-triggered dag runs

I'm using Apache Airflow 2.2.4. When I trigger a DAG run via UI click or via API call, I get context['dag_run'].external_trigger = True and context['dag_run'].run_type = 'scheduled' in both cases. I would like to distinguish between those two cases though. How can I do so?
Create a new role that doesn't have the permission action = website.
Create a new user that has this role and use it for your API calls.
From context["dag_run"] you can then get the "owner" and tell the two cases apart.

Auto renew the Kerberos ticket

I had to use Kerberos authentication for the first time. It kind of works, but I feel like I'm missing a lot of understanding of what is going on and how to cook it properly.
Basically, what I need is for my Python script to run every couple of hours and send a request to a remote web server using a domain account in an MS AD environment.
The following code provides me with a ready-to-go session instance:
from requests import Session
import gssapi
from requests_gssapi import HTTPSPNEGOAuth
session = Session()
session.auth = HTTPSPNEGOAuth(mech=gssapi.mechs.Mechanism.from_sasl_name("SPNEGO"))
The script was added to the crontab of a user on a Linux box, and kinit was used to obtain a ticket-granting ticket:
kinit -kt ~/ad_user.keytab ad_user@DOMAIN.COM
But after a while it all stopped because the ticket expired. The solution was simple: adding kinit to the crontab to run every 8 hours solved the issue.
I'm wondering if there is a better, more proper way to achieve the same? If I don't want/need to create a principal for the server in AD, but simply want some code to always have a valid ticket, can I avoid having a dedicated task in the user's crontab?
Why don't you initialize the ticket cache directly in your code? That makes it more transparent that your job relies on a Kerberos login and where it is located. (In 5 years, when you come back to this code, it will be hard to remember, and this might save you some grief later.)
It will also help ensure that someone can't accidentally break your job by removing a cron entry.
kinit from python script using keytab
I would do this in every script that requires authentication, to reduce external dependencies.
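A minimal sketch of that approach, combining a kinit call run from the script itself with the session setup above; the keytab path and principal are the ones from the question and may need adjusting:
import os
import subprocess
from requests import Session
import gssapi
from requests_gssapi import HTTPSPNEGOAuth
# Refresh the ticket cache from the keytab before building the session
subprocess.run(
    ['kinit', '-kt', os.path.expanduser('~/ad_user.keytab'), 'ad_user@DOMAIN.COM'],
    check=True,
)
session = Session()
session.auth = HTTPSPNEGOAuth(mech=gssapi.mechs.Mechanism.from_sasl_name("SPNEGO"))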

How to Read and Save Messages in Telegram and then Send message Using Python

I want to use Telegram client API.
I want to run run_until_disconnected() to receive all messages over 24 hours and save them in a database. This part is fine: I wrote the code and it's working. After some operations on the messages database, I want to send the result of that operation as a message to Telegram (to a channel or a user). I wrote the code for sending the message too, but when I try to use it I get an error that the database is locked or the session is locked...
What should I Do?
Please read: https://docs.telethon.dev/en/latest/quick-references/faq.html#id9
Solution according to the docs:
If you need two clients, use two sessions. If the problem persists and you're on Linux, you can use fuser my.session to find out the process locking the file. As a last resort, you can reboot your system.
If you really dislike SQLite, use a different session storage. There is an entire section covering that at Session Files.
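A minimal sketch of the two-session approach with Telethon; the API credentials, session names, and message target are assumptions:
from telethon import TelegramClient
api_id = 12345                 # assumption: your credentials from my.telegram.org
api_hash = '0123456789abcdef'  # assumption
# Two separate session files, one per client, so the long-running listener's
# SQLite session is never locked by the sending client.
listener = TelegramClient('listener', api_id, api_hash)
sender = TelegramClient('sender', api_id, api_hash)
async def main():
    await listener.start()
    await sender.start()
    # ... register listener event handlers and save messages to your database ...
    await sender.send_message('me', 'result of the database operation')  # or a channel/user
    await listener.run_until_disconnected()
listener.loop.run_until_complete(main())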

How to access Amazon EMR error message with Python

I am running EMR clusters kicked off with Airflow and I need some way of passing error messages back to Airflow. Airflow runs in Python, so I need this to be done in Python.
Currently the error logs are in the "Log URI" section under the configuration details. Accessing this might be one way to do it, but any way to access the error logs from EMR with Python would be much appreciated.
You can access the EMR logs in S3 with boto3, for example.
The S3 paths would be:
stderr : s3://<EMR_LOG_BUCKET_DEFINED_IN_EMR_CONFIGURATION>/logs/<CLUSTER_ID>/steps/<STEP_ID>/stderr.gz
stdout : s3://<EMR_LOG_BUCKET_DEFINED_IN_EMR_CONFIGURATION>/logs/<CLUSTER_ID>/steps/<STEP_ID>/stdout.gz
controller : s3://<EMR_LOG_BUCKET_DEFINED_IN_EMR_CONFIGURATION>/logs/<CLUSTER_ID>/steps/<STEP_ID>/controller.gz
syslog : s3://<EMR_LOG_BUCKET_DEFINED_IN_EMR_CONFIGURATION>/logs/<CLUSTER_ID>/steps/<STEP_ID>/syslog.gz
Cluster ID and Step ID can be passed to your different tasks via XCOM from the task(s) that creates the cluster/steps.
Warning for Spark (might be applicable to other types of steps):
This works if you submit your steps in client mode; if you use cluster mode, you would need to change the URL to fetch the application logs of the driver instead.
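A minimal sketch of reading one of those log files with boto3; the bucket, cluster ID, and step ID are placeholders you would pass in (e.g. via XCom):
import gzip
import boto3
bucket = 'my-emr-log-bucket'    # hypothetical bucket from the EMR configuration
cluster_id = 'j-XXXXXXXXXXXXX'  # placeholder
step_id = 's-XXXXXXXXXXXXX'     # placeholder
s3 = boto3.client('s3')
key = f'logs/{cluster_id}/steps/{step_id}/stderr.gz'
obj = s3.get_object(Bucket=bucket, Key=key)
stderr_text = gzip.decompress(obj['Body'].read()).decode('utf-8')
print(stderr_text)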

How to schedule an RDS snapshot and restore in the same script

So I'm scheduling an AWS Python job (through AWS Glue Python Shell) that is supposed to clone a MySQL RDS database (is taking a snapshot and restoring it the best way?) and perform SQL queries on the database. I have the boto3 library in the Python Shell and a SQL Python library I loaded. I currently have this code:
import boto3
client = boto3.client('rds')
# Create a snapshot of the database
snapshot_response = client.create_db_snapshot(
    DBSnapshotIdentifier='snapshot-identifier',
    DBInstanceIdentifier='instance-db',
)
# Restore db from snapshot
restore_response = client.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier='restored-db',
    DBSnapshotIdentifier='snapshot-identifier',
)
# Code that will perform sql queries on the restored-db database.
However, client.restore_db_instance_from_db_snapshot fails because it says the snapshot is still being created. So I understand that these calls are asynchronous, but I am not sure how to get the restore to work (either by making them synchronous, which may not be a good idea, or in some other way). Thanks for the help in advance :).
You can use a waiter:
waiter = client.get_waiter('db_cluster_snapshot_available')
Polls RDS.Client.describe_db_cluster_snapshots() every 30 seconds until a successful state is reached. An error is returned after 60 failed checks.
See: class RDS.Waiter.DBClusterSnapshotAvailable
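Since the question creates a DB instance snapshot (create_db_snapshot) rather than an Aurora cluster snapshot, the matching waiter would be db_snapshot_available, which polls describe_db_snapshots instead. A minimal sketch reusing the identifiers from the question; the waiter timings are assumptions:
import boto3
client = boto3.client('rds')
# Create a snapshot of the database
client.create_db_snapshot(
    DBSnapshotIdentifier='snapshot-identifier',
    DBInstanceIdentifier='instance-db',
)
# Block until the snapshot is available before restoring from it
client.get_waiter('db_snapshot_available').wait(
    DBSnapshotIdentifier='snapshot-identifier',
    WaiterConfig={'Delay': 30, 'MaxAttempts': 60},
)
# Restore db from the snapshot
client.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier='restored-db',
    DBSnapshotIdentifier='snapshot-identifier',
)
# Optionally wait for the restored instance to become available too
client.get_waiter('db_instance_available').wait(
    DBInstanceIdentifier='restored-db',
)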
