Use AWS Lambda to create CSV and save to S3 (Python)

I've got some Python code that goes to a URL, parses an HTML table, and saves the result to a CSV. Changes to the table happen frequently, and I'd like a trending view of these changes. To accomplish this, I'd like my code to run as a function in Lambda, and save snapshots of the table to S3 every 12 hours.
I've created the Lambda, used CloudWatch to trigger the function on a schedule, and given it permissions to access the relevant S3 bucket, but I can't find any resources on how to save the function's output to that bucket. Any pointers or alternate suggestions would be greatly appreciated. Thanks!
(Note: I have found a resource on here that describes this process using Node, which isn't out of the question, but I'd prefer to remain in Python if possible.)

import boto3

# Read the existing object, modify its contents, then write it back
s3_obj = boto3.resource('s3').Object(bucket, key)
file_contents = s3_obj.get()['Body'].read()
# ... build newfilecontents from file_contents ...
s3_obj.put(Body=newfilecontents)
Sorry for any typos; it's hard to answer when typing on a phone.
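Putting that together for the Lambda in the question, a minimal handler might look like the sketch below. The bucket name, key prefix, and the parsing step are placeholders rather than anything from the original post; boto3's put_object does the actual save to S3.

import csv
import io
from datetime import datetime, timezone
import boto3

s3 = boto3.client('s3')
BUCKET = 'my-snapshot-bucket'  # placeholder: use your own bucket name

def lambda_handler(event, context):
    # Placeholder: replace with your existing scrape-and-parse logic
    rows = [['column_a', 'column_b'], ['value_1', 'value_2']]
    # Build the CSV in memory; Lambda only offers writable disk under /tmp
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    # Timestamped key so each 12-hour snapshot is kept separately
    key = 'snapshots/table-' + datetime.now(timezone.utc).strftime('%Y%m%dT%H%M%SZ') + '.csv'
    s3.put_object(Bucket=BUCKET, Key=key, Body=buffer.getvalue().encode('utf-8'))
    return {'written': key}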

Related

Suggestions on automating Sharepoint workflows to process Excel Files

So I have the task of automating a workflow such that:
Whenever an Excel file (a.xlsx) is added or modified in a SharePoint folder ->
My custom data-extractor code will process this Excel file ->
The extracted data will be stored as a new Excel file (b.xlsx) in another folder on SharePoint.
This has to be achieved using Power Automate or Logic Apps with Azure Functions, but I am not able to wrap my head around how to go about it.
Has anyone implemented something like this before?
PS: My code is in Python.
So, when a.xlsx is created or updated, you want to perform some action on that file before saving it as b.xlsx in another folder.
If it is something that cannot be done with Power Automate/Logic Apps alone, you can insert an Azure Function into your flow in two different ways:
Using an Azure Function action (more here)
Using an HTTP action (more here)
You will need an Azure Function with an HTTP trigger; a minimal Python sketch follows below.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-http-webhook-trigger?tabs=in-process%2Cfunctionsv2&pivots=programming-language-python
If you can share what you need to do before saving as b.xlsx, I may be able to help more.
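As a rough illustration of that HTTP-trigger approach, an Azure Function written against the Python v1 programming model might look like the sketch below; the extraction step is just a placeholder for your own code, and the flow would pass the file content of a.xlsx in the request body.

import logging
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    # Power Automate / Logic Apps calls this endpoint with the bytes of a.xlsx
    logging.info('Processing incoming Excel file')
    excel_bytes = req.get_body()
    if not excel_bytes:
        return func.HttpResponse('No file content received', status_code=400)
    # Placeholder: run your custom data-extractor here (e.g. openpyxl or pandas)
    # and build the bytes of b.xlsx
    extracted_bytes = excel_bytes
    # Return the processed workbook; the flow's 'Create file' action can then
    # save it as b.xlsx in the target SharePoint folder
    return func.HttpResponse(
        extracted_bytes,
        status_code=200,
        mimetype='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')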

How to get a list of Folders and Files in Sharepoint

I have started to explore the Graph API, but Sharepoint is quite complicated and I am not sure how to proceed. I previously have worked with OneNote using this API successfully.
Purpose: There are thousands of folders/files and I need to go through the list in order to organize it in a better way. I am looking for a way to export this list into Excel/CSV using Python and Graph API
I want to dynamically get a list of all underlying Folders and files visible from this URL:
https://company1.sharepoint.com/teams/TEAM1/Shared Documents/Forms/AllItems.aspx?id=/teams/TEAMS_BI_BI-AVS/Shared Documents/Team Channel1/Folder_Name1&viewid=6fa603f8-82e2-477c-af68-8b3985cfa525
When I open this URL, I see that this folder is part of a private group called PRIVATE_GROUP1 (on the top left).
Looking at some sample API calls here:
GET /drives/{drive-id}/items/{item-id}/children -> Not sure what drive-id refers to
GET /groups/{group-id}/drive/items/{item-id}/children -> I assume group-id refers to the private group; not sure how to get the ID
GET /sites/{site-id}/drive/items/{item-id}/children -> Assuming site-id is 'company1.sharepoint.com'?
For all of the above, I'm not sure what item-id refers to...
Thanks
Refer to the code below; it might help you.
https://gist.github.com/keathmilligan/590a981cc629a8ea9b7c3bb64bfcb417
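For a more concrete starting point, here is a rough sketch of those Graph calls using plain requests. It assumes you already have an access token (for example from MSAL), and the site path and library index are placeholders, not values taken from the question.

import requests

GRAPH = 'https://graph.microsoft.com/v1.0'
headers = {'Authorization': 'Bearer ' + access_token}  # access_token acquired elsewhere, e.g. via MSAL

# 1. Resolve the site-id from the hostname and the site's server-relative path
site = requests.get(GRAPH + '/sites/company1.sharepoint.com:/teams/TEAM1', headers=headers).json()
site_id = site['id']

# 2. List the site's document libraries (drives) and pick one; its 'id' is the drive-id
drives = requests.get(GRAPH + '/sites/' + site_id + '/drives', headers=headers).json()['value']
drive_id = drives[0]['id']  # e.g. the 'Shared Documents' library

# 3. Walk folders recursively; item-id is the 'id' field of each returned item
def list_children(drive_id, item_id=None, prefix=''):
    url = (GRAPH + '/drives/' + drive_id + '/root/children' if item_id is None
           else GRAPH + '/drives/' + drive_id + '/items/' + item_id + '/children')
    for item in requests.get(url, headers=headers).json().get('value', []):
        print(prefix + item['name'])
        if 'folder' in item:
            list_children(drive_id, item['id'], prefix + item['name'] + '/')

list_children(drive_id)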

Writing to a CSV file in an S3 bucket using boto 3

I'm working on a project that needs to update a CSV file with user info periodically. The CSV is stored in an S3 bucket so I'm assuming I would use boto3 to do this. However, I'm not exactly sure how to go about this- would I need to download the CSV from S3 and then append to it, or is there a way to do it directly? Any code samples would be appreciated.
Ideally this is something DynamoDB would handle well (as long as you can create a hash key). Your CSV-based solution would require the following:
Download the CSV
Append new values to the CSV file
Upload the CSV
A big issue here is the possibility (not sure how this is planned) that the CSV file is updated multiple times before being uploaded, which would lead to data loss.
Using something like DynamoDB, you could have a table and just use the put_item API call to add new values as you see fit. Then, whenever you wish, you could write a Python script to scan all the values and write out a CSV file however you like (see the sketch below).
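A minimal sketch of that DynamoDB approach is below; the table name, hash key, and attribute names are invented for the example.

import csv
import boto3

# Assumed table 'user_info' with a string hash key 'user_id'
table = boto3.resource('dynamodb').Table('user_info')

# Add or overwrite one record whenever new user info arrives
table.put_item(Item={'user_id': 'u123', 'name': 'Alice', 'email': 'alice@example.com'})

# Later, dump the whole table to CSV (scan paginates via LastEvaluatedKey)
items, kwargs = [], {}
while True:
    resp = table.scan(**kwargs)
    items.extend(resp['Items'])
    if 'LastEvaluatedKey' not in resp:
        break
    kwargs['ExclusiveStartKey'] = resp['LastEvaluatedKey']

with open('users.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['user_id', 'name', 'email'])
    writer.writeheader()
    writer.writerows(items)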

Sharing files in Google Drive using Python

There are plenty of resources involving sharing files in Google Drive but I can't find anything useful to Python and the references by Google aren't Python-specific.
I've saved an item ID into the variable selected_id. In this case, I want to make that file shareable by URL to anyone for reading.
service.permissions().create(body={"role":"reader", "type":"anyone"}, fileId=selected_id)
This is what I have so far but I don't think I formatted it correctly at all.
Like the title states, how would I share a file by ID on GDrive using Python? Thanks in advance.
Looks like you've got the correct format as per Permissions:Create
Perhaps you just need to add .execute() on the end?
service.permissions().create(body={"role":"reader", "type":"anyone"}, fileId=selected_id).execute()
Also, if you're looking for examples using the Python API, check out the following repository:
https://github.com/gsuitedevs/python-samples
I would imagine that your code might look something like this:
https://github.com/gsuitedevs/python-samples/blob/master/drive/driveapp/main.py#L57
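For completeness, a small sketch of how the whole call chain might look with the Drive v3 client; the creds variable is assumed to hold authorized credentials with a Drive scope.

from googleapiclient.discovery import build

service = build('drive', 'v3', credentials=creds)  # creds obtained via your OAuth flow

# Make the file readable by anyone who has the link
service.permissions().create(fileId=selected_id, body={'role': 'reader', 'type': 'anyone'}).execute()

# Fetch the shareable link for convenience
meta = service.files().get(fileId=selected_id, fields='webViewLink').execute()
print(meta['webViewLink'])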

Writing, not uploading, to S3 with boto

Is there a way to directly write to a file in S3 rather than uploading a completed file? I'm aware of the various set_contents_from_ methods, which work well when I want to upload a completed file. I'm looking for a way to write directly to an S3 key as data comes in.
I see in the documentation that there is mention of an open_write method for Key objects, but it is specifically called out as not implemented. I'd rather not go with something cheesy like the following:
def write_to_s3(bucket, file_name, data):
    # boto offers no streaming write, so read, append, and rewrite the whole key
    key = bucket.get_key(file_name)
    current_data = key.get_contents_as_string()
    key.set_contents_from_string(current_data + data)
Any help is greatly appreciated.
Try using the S3 multipart upload feature to achieve this. You can upload each part sequentially and then complete the upload. See more details here: http://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html
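Sketched with boto3's client API (rather than the older boto library the question uses), the multipart pattern might look roughly like this; incoming_chunks() is a hypothetical generator standing in for however your data arrives.

import boto3

s3 = boto3.client('s3')
bucket, key = 'my-bucket', 'streamed/output.dat'  # placeholders

# Start the multipart upload and remember its UploadId
upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)['UploadId']

parts = []
for part_number, chunk in enumerate(incoming_chunks(), start=1):  # hypothetical data source
    # Every part except the last must be at least 5 MB
    resp = s3.upload_part(Bucket=bucket, Key=key, UploadId=upload_id,
                          PartNumber=part_number, Body=chunk)
    parts.append({'ETag': resp['ETag'], 'PartNumber': part_number})

# Stitch the parts together into a single S3 object
s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
                             MultipartUpload={'Parts': parts})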
