AWS Lambda (Python) firing up to 4x on S3 Object Created triggers

I have some simple lambda functions that are always getting fired up to 4x on almost every S3 Object Created (All) trigger and I can't seem to figure out why.
The files are relatively small right now, only a few KB, coming from Kinesis Firehose.
In the screenshot below, every log line beginning with "File:" is a duplicate invocation for the same written file. The "Uploading:" lines show where the function uploads the file to a different folder that another Lambda function watches (that one also executes multiple times!). That only happened once in the screenshot, but I've seen it happen repeatedly.
I did add some code so that when the function executes it first checks an S3 folder for a blank marker file to see whether the object has already been processed, and proceeds only if it hasn't. This catches some of the duplicate executions, but the problem I'm seeing is that sometimes the function executes twice within the same millisecond, so this check is obviously no good.
def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key']).decode('utf8')
    print "File: " + bucket + "/" + key
    exists = False
    tmp_f = key.replace('firehose/', '')
    tmp_f = tmp_f.split('/')
    tmp_f_date = tmp_f[0] + "-" + tmp_f[1] + "-" + tmp_f[2]
    tmp_f_name = "folder/" + tmp_f_date + "/" + tmp_f[0] + "-" + tmp_f[1] + "-" + tmp_f[2] + "-" + tmp_f[3] + "____" + (tmp_f[4].replace('.gz', '.txt'))
    try:
        s3_client.download_file(bucket, tmp_f_name, '/tmp/tmpf')
        if os.path.isfile('/tmp/tmpf'):
            exists = True
            print "--> File exists..."
    except:
        print "--> New file..."
    if exists:
        return False
    open('/tmp/tmpf', 'a').close()
    s3_client.upload_file('/tmp/tmpf', bucket, tmp_f_name)
    return process_file()  # returns True
Doing some research, I'm not finding a solid solution. Some say this is the nature of Lambda's "at least once" execution and that you should keep track of the request ID, but we're seeing executions fire within a millisecond or two of each other, so we'd need a Redis instance just to keep track? The S3 marker-file method above is apparently too slow.
This doesn't seem optimal so I'm HOPING I'm missing maybe a line of code or callback? Someone please help :)
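For what it's worth, the usual pattern for at-least-once delivery is an atomic test-and-set against a shared store, rather than a marker file: two S3 operations (download, then upload) can interleave between two invocations, whereas something like a DynamoDB put_item with ConditionExpression='attribute_not_exists(pk)' either claims the key or fails atomically. A minimal sketch of the idea, using an in-memory stand-in for the store (all names here are mine, not anyone's real API):

```python
import threading


class InMemoryClaims:
    """Stand-in for an external store. Inside Lambda you'd need something
    shared across containers, e.g. a DynamoDB conditional put; a dict
    guarded by a lock only demonstrates the atomic test-and-set shape."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()

    def claim(self, key):
        # Atomic check-and-set: returns True only for the first caller.
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            return True


def handle_once(record_key, claims, process):
    """Run process(record_key) at most once per key, even under
    duplicate deliveries of the same event."""
    if not claims.claim(record_key):
        return False  # duplicate delivery; skip
    process(record_key)
    return True
```

In a real function the claim key would be something like bucket + "/" + key, possibly combined with the object's ETag so a genuinely new version of the same key isn't dropped.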


Exception handling in AWS - Explicitly ignore some errors?

I have a Lambda function that's used to verify the integrity of a file, but for simplicity's sake, let's just assume that it copies a file from one bucket into another upon trigger (the Lambda gets triggered when it detects file ingestion).
The problem I am currently facing is that whenever I ingest a lot of files, the function gets triggered as many times as there are files, which leads to unnecessary invocations. One invocation can handle multiple files, so the earlier invocations typically process more than one file, and the later ones, realising there are no more files to process, raise a NoSuchKey error.
So I added try-except logic and have it log the NoSuchKey error at INFO level, so that it doesn't trigger another function that watches for the ERROR keyword and alerts the prod support team.
To do that, I used this code, taken from the AWS documentation:
import botocore
import boto3
import logging

# Set up our logger
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

client = boto3.client('s3')

try:
    logger.info('Executing copy statement')
    # main logic
except botocore.exceptions.ClientError as error:
    if error.response['Error']['Code'] == 'NoSuchKey':
        logger.info('No object found - File has been moved')
    else:
        raise error
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/error-handling.html
Despite that, I still get the NoSuchKey error. From CloudWatch, it seems that the logic in the "try" section executes, followed by an ERROR statement, and then the info log 'No object found - File has been moved' that's written in my "except" section.
I thought either "try" or "except" would execute, but in this case both seem to run.
Any pros willing to shed some light?
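Seeing log lines from both sections is actually expected: the try block runs up to the statement that raises, then control jumps to the except block, so both can log before and after the failure point. That alone doesn't emit an ERROR record, though, so the ERROR line presumably comes from somewhere else (a logger.exception call, boto3's own logging, or a separate invocation). A stdlib-only demonstration of the control flow, with a KeyError standing in for the S3 ClientError:

```python
import io
import logging

# Capture log output in a string so the flow can be inspected.
log = io.StringIO()
logging.basicConfig(stream=log, level=logging.INFO, force=True)
logger = logging.getLogger()


def copy_object():
    logger.info('Executing copy statement')  # runs before the failure
    raise KeyError('NoSuchKey')              # stand-in for the S3 error
    logger.info('never reached')             # skipped: raise exits the block


try:
    copy_object()
except KeyError:
    logger.info('No object found - File has been moved')
```

Both info messages end up in the log, and nothing after the raise inside the try block does.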

Python gspread On-Change Listener?

I'm making a script that checks a Google Sheet populated by a Google Form and returns the result as a live-feed visualization of a poll. I need to figure out how to update the value counts only when the sheet is updated, as opposed to polling every 60 seconds (or so).
Here is my current setup:
string = ""
while True:
    responses = gc.open("QOTD Responses").sheet1
    data = pd.DataFrame(responses.get_all_records())
    vals = data['Response'].value_counts()
    msg = "{} currently has {} votes. \n{} currently has {} votes.".format(
        vals.index[0], vals[0], vals.index[1], vals[1])  # renamed from str to avoid shadowing the built-in
    if msg != string:
        string = msg
        print(string)
    time.sleep(60)  # Updates 1440 times per day
I'm almost certain that there has to be a better way to do this, but what would that be?
Thanks!
You won't be able to do it with Python alone. You'll need to integrate with a trigger function from Google Apps Script.
You could use the onEdit trigger function to send a signal to your python script (via an http call for example).
To use a simple trigger, simply create a function that uses one of
these reserved function names:
onOpen(e) runs when a user opens a spreadsheet, document,
presentation, or form that the user has permission to edit.
onEdit(e) runs when a user changes a value in a spreadsheet.
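On the Python side, the signal from onEdit has to land somewhere; a common shape is a small HTTP endpoint that the Apps Script trigger calls with UrlFetchApp.fetch(...). A stdlib-only sketch of such a receiver (the port and the /sheet-updated path are my choices):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class SheetUpdateHandler(BaseHTTPRequestHandler):
    """Receives a POST from the Apps Script onEdit trigger."""

    def do_POST(self):
        # Drain the request body before responding.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        if self.path == "/sheet-updated":
            # Re-read the sheet here (gc.open(...)) instead of polling on a timer.
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # silence the default per-request logging


if __name__ == "__main__":
    HTTPServer(("", 8000), SheetUpdateHandler).serve_forever()
```

The while True / sleep loop then disappears: the sheet is only re-read inside do_POST, once per edit. Note the machine running this needs to be reachable from Google's servers (or fronted by a tunnel).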

Why does Firebase event return empty object on second and subsequent events?

I have a Python Firebase SDK on the server, which writes to Firebase real-time DB.
I have a Javascript Firebase client on the browser, which registers itself as a listener for "child_added" events.
Authentication is handled by the Python server.
With Firebase rules allowing reads, the client listener gets data on the first event (all data at that FB location), but only a key with empty data on subsequent child_added events.
Here's the listener registration:
firebaseRef.on(
    "child_added",
    function(snapshot, prevChildKey)
    {
        console.log("FIREBASE REF: ", firebaseRef);
        console.log("FIREBASE KEY: ", snapshot.key);
        console.log("FIREBASE VALUE: ", snapshot.val());
    }
);
"REF" is always good.
"KEY" is always good.
But "VALUE" is empty after the first full retrieval of that db location.
I tried instantiating the firebase reference each time anew inside the listen function. Same result.
I tried a "value" event instead of "child_added". No improvement.
The data on the Firebase side looks perfect in the FB console.
Here's how the data is being written by the Python admin to firebase:
def push_value(rootAddr, childAddr, data):
    try:
        ref = db.reference(rootAddr)
        posts_ref = ref.child(childAddr)
        new_post_ref = posts_ref.push()
        new_post_ref.set(data)
    except Exception:
        raise
And as I said, this works perfectly to put the data at the correct place in FB.
Why the empty event objects after the first download of the database, on subsequent events?
I found the answer. Like most things, it turned out to be simple, but took a couple of days to find. Maybe this will save someone else.
On the docs page:
http://firebase.google.com/docs/database/admin/save-data#section-push
"In JavaScript and Python, the pattern of calling push() and then
immediately calling set() is so common that the Firebase SDK lets you
combine them by passing the data to be set directly to push() as
follows..."
I suggest the wording should emphasize that you must do it that way.
The earlier Python example on the same page doesn't work:
new_post_ref = posts_ref.push()
new_post_ref.set({
    'author': 'gracehop',
    'title': 'Announcing COBOL, a New Programming Language'
})
A separate empty push() followed by set(data), as in this example, won't work in Python and JavaScript, because in those SDKs push() implicitly performs a set() of its own: the empty push fires the event listeners with empty data, and the subsequent set(data) doesn't fire an event carrying the data either.
In other words, the code in the question:
new_post_ref = posts_ref.push()
new_post_ref.set(data)
must be:
new_post_ref = posts_ref.push(data)
with set() not explicitly called.
Since this push() code happens only when new objects are written to FB, the initial download to the client wasn't affected.
Though the documentation may be trying to convey the evolution of the design, it fails to point out that only the last Python and JavaScript example given will work, and that the others shouldn't be used.

AWS boto - Instance Status/Snapshot Status won't update Python While Loop

So I am creating a Python script with boto to allow the user to prepare to expand their Linux root volume and partition. During the first part of the script, I would like a while loop (or something similar) that stops the script from continuing until:
a) the instance has fully stopped, and
b) the snapshot has finished being created.
Here are the code snippets for both of these:
Instance:
ins_prog = conn.get_all_instances(instance_ids=src_ins, filters={"instance-state": "stopped"})
while ins_prog == "[]":
    print src_ins + " is still shutting down..."
    time.sleep(2)
    if ins_prog != "[]":
        break
Snapshot:
snap_prog = conn.get_all_snapshots(snapshot_ids=snap_id, filters={"progress": "100"})
while snap_prog == "[]":
    print snap.update
    time.sleep(2)
    if snap_prog != "[]":
        print "done!"
        break
When calling conn.get_all_instances and conn.get_all_snapshots, they return an empty list if the filters match nothing, which prints as []. The problem is the while loop does not even run. It's as if it does not recognize [] as the string produced by the get_all functions.
If there is an easier way to do this, please let me know I am at a loss right now ):
Thanks!
Edit: Based on garnaat's help, here is the follow-up issue.
snap_prog = conn.get_all_snapshots(snapshot_ids=snap.id)[0]
print snap_prog
print snap_prog.id
print snap_prog.volume_id
print snap_prog.status
print snap_prog.progress
print snap_prog.start_time
print snap_prog.owner_id
print snap_prog.owner_alias
print snap_prog.volume_size
print snap_prog.description
print snap_prog.encrypted
Results:
Snapshot:snap-xxx
snap-xxx
vol-xxx
pending
2015-02-12T21:55:40.000Z
xxxx
None
50
Created by expandDong.py at 2015-02-12 21:55:39
False
Note how snap_prog.progress returns nothing, but snap_prog.status stays 'pending' when polled in a while loop.
SOLVED:
My colleague and I figured out how to get the snapshot loop working.
snap = conn.create_snapshot(src_vol)
while snap.status != 'completed':
    snap.update()
    print snap.status
    time.sleep(5)
    if snap.status == 'completed':
        print snap.id + ' is complete.'
        break
The snap.update() call simply refreshes the snap variable with the most recent information, after which snap.status returns "pending" | "completed". I also had an issue with snap.status not matching the status of the snapshot shown in the console. Apparently there is significant lag between the console and the SDK call: I had to wait ~4 minutes for the status to update to "completed" after the snapshot already showed as completed in the console.
If I wanted to check the state of a particular instance and wait until that instance reached some state, I would do this:
import time
import boto.ec2
conn = boto.ec2.connect_to_region('us-west-2') # or whatever region you want
instance = conn.get_all_instances(instance_ids=['i-12345678'])[0].instances[0]
while instance.state != 'stopped':
    time.sleep(2)
    instance.update()
The funny business with the get_all_instances call is necessary because that call returns a Reservation object which, in turn, has an instances attribute that is a list of all matching instances. So we are taking the first (and only) Reservation in the list and then getting the first (and only) Instance inside the reservation. You should probably put some error checking around that.
The snapshot can be handled in a similar way.
snapshot = conn.get_all_snapshots(snapshot_ids=['snap-12345678'])[0]
while snapshot.status != 'completed':
    time.sleep(2)
    snapshot.update()
The update() method on both objects queries EC2 for the latest state of the object and updates the local object with that state.
I will try to answer this generally first. So, you are querying a resource for its state. If a certain state is not met, you want to keep on querying/asking/polling the resource, until it is in the state you wish it to be. Obviously this requires you to actually perform the query within your loop. That is, in an abstract sense:
state = resource.query()
while state != desired_state:
    time.sleep(T)
    state = resource.query()
Think about how and why this works, in general.
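That abstract loop can also be packaged as a small reusable helper; a sketch with a timeout added so a stuck resource can't spin forever (the helper name and defaults are mine):

```python
import time


def wait_for_state(query, desired_state, interval=2.0, timeout=300.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll query() until it returns desired_state.

    Raises TimeoutError if the state isn't reached within `timeout`
    seconds. `clock` and `sleep` are injectable for testing.
    """
    deadline = clock() + timeout
    state = query()
    while state != desired_state:
        if clock() >= deadline:
            raise TimeoutError("still %r after %ss" % (state, timeout))
        sleep(interval)
        state = query()  # re-query: the old result never updates itself
    return state
```

For the question's case, query would be something like lambda: conn.get_all_snapshots(snapshot_ids=[snap_id])[0].status with desired_state='completed'.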
Now, regarding your code and question, there are some uncertainties you need to figure out yourself. First of all, I am very sure that conn.get_all_instances() returns an empty list in your case and not actually the string '[]'. That is, your check should be for an empty list instead of for a certain string(*). Checking for an empty list in Python is as easy as not l:
l = give_me_some_list()
if not l:
print "that list is empty."
The other problem in your code is that you expect too much of the language/architecture you are using. You query a resource once and store the result in ins_prog. After that, you keep checking ins_prog as if it would magically update in the background. No, that is not happening! You need to call conn.get_all_instances() periodically in order to get updated information.
(*) This is documented here: http://boto.readthedocs.org/en/latest/ref/ec2.html#boto.ec2.connection.EC2Connection.get_all_instances -- the docs explicitly state "Return type: list". Not string.

Python 3 (Bot) script stops working

I'm trying to connect to a TeamSpeak server using the QueryServer to make a bot. I've taken advice from this thread; however, I still need help.
This is The TeamSpeak API that I'm using.
Before the edits, this was the summary of what actually happened in my script (1 connection):
It connects.
It checks for the channel ID (and its own client ID)
It joins the channel and starts reading everything
If someone says a specific command, it executes the command and then it disconnects.
How can I make it so it doesn't disconnect? How can I make the script stay in a "waiting" state so it can keep reading after the command is executed?
I am using Python 3.4.1.
I tried learning threading, but either I'm dumb or it doesn't work the way I thought it would. There's another "bug": once it's waiting for events, if I don't trigger anything with a command, it disconnects after 60 seconds.
# Libraries
import ts3
import threading
import datetime
from random import choice, sample

# Data needed #
USER = "thisisafakename"
PASS = "something"
HOST = "111.111.111.111"
PORT = 10011
SID = 1

class BotPrincipal:
    def __init__(self, manejador=False):
        self.ts3conn = ts3.query.TS3Connection(HOST, PORT)
        self.ts3conn.login(client_login_name=USER, client_login_password=PASS)
        self.ts3conn.use(sid=SID)
        channelToJoin = Bot.GettingChannelID("TestingBot")
        try:  # Login with a client that is ok
            self.ts3conn.clientupdate(client_nickname="The Reader Bot")
            self.MyData = self.GettingMyData()
            self.MoveUserToChannel(ChannelToJoin, Bot.MyData["client_id"])
            self.suscribirEvento("textchannel", ChannelToJoin)
            self.ts3conn.on_event = self.manejadorDeEventos
            self.ts3conn.recv_in_thread()
        except ts3.query.TS3QueryError:  # Name already exists, 2nd client connects with this info
            self.ts3conn.clientupdate(client_nickname="The Writer Bot")
            self.MyData = self.GettingMyData()
            self.MoveUserToChannel(ChannelToJoin, Bot.MyData["client_id"])

    def __del__(self):
        self.ts3conn.close()

    def GettingMyData(self):
        respuesta = self.ts3conn.whoami()
        return respuesta.parsed[0]

    def GettingChannelID(self, nombre):
        respuesta = self.ts3conn.channelfind(pattern=ts3.escape.TS3Escape.unescape(nombre))
        return respuesta.parsed[0]["cid"]

    def MoveUserToChannel(self, idCanal, idUsuario, passCanal=None):
        self.ts3conn.clientmove(cid=idCanal, clid=idUsuario, cpw=passCanal)

    def suscribirEvento(self, tipoEvento, idCanal):
        self.ts3conn.servernotifyregister(event=tipoEvento, id_=idCanal)

    def SendTextToChannel(self, idCanal, mensajito="Error"):
        self.ts3conn.sendtextmessage(targetmode=2, target=idCanal, msg=mensajito)  # This works
        print("test")  # PROBLEM HERE This doesn't work. Why? The line above did work

    def manejadorDeEventos(sender, event):
        message = event.parsed[0]['msg']
        if "test" in message:  # This works
            Bot.SendTextToChannel(ChannelToJoin, "This is a test")  # This works

if __name__ == "__main__":
    Bot = BotPrincipal()
    threadprincipal = threading.Thread(target=Bot.__init__)
    threadprincipal.start()
Prior to using 2 bots, I tested launching SendTextToChannel on connect and it worked perfectly, letting me do whatever I wanted after it sent the text to the channel. The bug that makes the entire Python script stop only happens when it's triggered by manejadorDeEventos.
Edit 1 - Experimenting with threading.
I messed it up big time with threading, ending up with 2 clients connecting at the same time. Somehow I think one of them is reading the events and the other one is answering. The script doesn't close itself anymore, and that's a win, but having a clone connection doesn't look good.
Edit 2 - Updated code and actual state of the problem.
I managed to make the double connection work more or less "fine", but it disconnects if nothing happens in the room for 60 seconds. I tried using threading.Timer but I'm unable to make it work. The question's code has been updated accordingly.
I would like an answer that helps me both read from the channel and answer in it without needing to connect a second bot (as it currently does...). And I would give extra points if the answer also helps me understand an easy way to query the server every 50 seconds so it doesn't disconnect.
From looking at the source, recv_in_thread doesn't create a thread that loops around receiving messages until quit time, it creates a thread that receives a single message and then exits:
def recv_in_thread(self):
    """
    Calls :meth:`recv` in a thread. This is useful,
    if you used ``servernotifyregister`` and you expect to receive events.
    """
    thread = threading.Thread(target=self.recv, args=(True,))
    thread.start()
    return None
That implies that you have to repeatedly call recv_in_thread, not just call it once.
I'm not sure exactly where to do so from reading the docs, but presumably it's at the end of whatever callback gets triggered by a received event; I think that's your manejadorDeEventos method? (Or maybe it's something related to the servernotifyregister method? I'm not sure what servernotifyregister is for and what on_event is for…)
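The single-shot behaviour and the re-arm fix can be modelled without the library at all. A toy sketch (the class and names are mine, not the ts3 API):

```python
class OneShotReceiver:
    """Models a receiver whose recv_in_thread delivers at most ONE
    pending event per call, like the library method quoted above."""

    def __init__(self, events):
        self._events = list(events)
        self.handled = []

    def recv_in_thread(self):
        # Delivers a single event, then stops: it must be called again.
        if self._events:
            self.on_event(self._events.pop(0))

    def on_event(self, event):
        self.handled.append(event)
        self.recv_in_thread()  # re-arm so the NEXT event is also received
```

Without the re-arm call at the end of on_event, only the first event would ever be handled; with it, the whole queue drains. In the question's code, the equivalent would be calling self.ts3conn.recv_in_thread() at the end of manejadorDeEventos.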
That manejadorDeEventos brings up two side points:
You've declared manejadorDeEventos wrong. Every method has to take self as its first parameter. When you pass a bound method, like self.manejadorDeEventos, that bound self object is going to be passed as the first argument, before any arguments that the caller passes. (There are exceptions to this for classmethods and staticmethods, but those don't apply here.) Also, within that method, you should almost certainly be accessing self, not a global variable Bot that happens to be the same object as self.
If manejadorDeEventos is actually the callback for recv_in_thread, you've got a race condition here: if the first message comes in before your main thread finishes the on_event assignment, the recv_in_thread won't be able to call your event handler. (This is exactly the kind of bug that shows up one time in a million, making it a huge pain to debug when you discover it months after deploying or publishing your code.) So, reverse those two lines.
One last thing: a brief glimpse at this library's code is a bit worrisome. It doesn't look like it's written by someone who really knows what they're doing. The method I copied above only has 3 lines of code, but it includes a useless return None and a leaked Thread that can never be joined, not to mention that the whole design of making you call this method (and spawn a new thread) after each event received is weird, and even more so given that it's not really explained. If this is the standard client library for a service you have to use, then you really don't have much choice in the matter, but if it's not, I'd consider looking for a different library.
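As for the 60-second disconnect: that is typically an idle timeout, and any cheap query sent on a timer avoids it. A sketch of a keepalive thread (the helper name, the interval, and the choice of whoami() as the ping are mine; whoami() is just the harmless query the question's code already uses):

```python
import threading


def start_keepalive(conn, interval=50.0, stop_event=None):
    """Ping the server every `interval` seconds so the query
    connection is never idle long enough to be dropped."""
    stop_event = stop_event or threading.Event()

    def loop():
        # Event.wait doubles as an interruptible sleep: returns True
        # (and ends the loop) as soon as stop_event is set.
        while not stop_event.wait(interval):
            conn.whoami()  # any cheap query works as a keepalive

    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t
```

Usage would be start_keepalive(self.ts3conn) right after connecting, and setting the stop event on shutdown.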
