Azure Function to CosmosDB - python

I need help writing a function that takes a JSON payload and writes the values to a Cosmos DB. Everything I have read shows only single parameters.
name = req.params.get('name')
if not name:
    try:
        req_body = req.get_json()
    except ValueError:
        pass
    else:
        name = req_body.get('name')

if name:
    count = 1
    try:
        counter = container_client.read_item(item=name, partition_key=name)
        counter['count'] += 1
        container_client.replace_item(item=counter['id'], body=counter)
        count = counter['count']
    except exceptions.CosmosResourceNotFoundError:
        # Create a new item
        container_client.create_item({'id': name, 'count': count})

return func.HttpResponse(f"Hello, {name}! Current count is {count}.")
This code works, but I would like to pass something like {"name": "Kyle", "job": "engineer"} and have all of those values added to the table.

I followed this blog to achieve your requirement.
Try adding the JSON values in the format below and inserting them into Cosmos DB.
if name:
    newdocs = func.DocumentList()
    # Create the user details as JSON in a container of a Cosmos DB
    newproduct_dict = {
        "id": str(uuid.uuid4()),
        "name": name
    }
    newdocs.append(func.Document.from_dict(newproduct_dict))
    doc.set(newdocs)
Using this, I am able to add the JSON values to Cosmos DB.
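For the multi-field case specifically, one option is to take the parsed request body as a whole, attach the required id, and write it in a single call. This is only a minimal sketch, assuming an HTTP-triggered function and a container_client configured as in the question (the name/job fields are just an example payload):

import uuid
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    try:
        req_body = req.get_json()  # e.g. {"name": "Kyle", "job": "engineer"}
    except ValueError:
        return func.HttpResponse("Please pass a JSON body.", status_code=400)

    item = dict(req_body)               # copy every key/value from the request
    item['id'] = str(uuid.uuid4())      # Cosmos DB items require an 'id' field
    container_client.create_item(item)  # all fields are written, not just 'name'

    return func.HttpResponse(f"Stored document {item['id']}.")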

Related

For loop not iterating in Mailchimp

So, I'm trying to get every "sent to" list from the Mailchimp API. I've written this code to build a list of campaign IDs, which I can later join against the campaign names so I end up with both the list and the campaign name rather than only the campaign ID:
try:
    client = MailchimpMarketing.Client()
    client.set_config({
        "api_key": "xxxx",
        "server": "xxx"
    })
    response_id = client.campaigns.list(count=1000)
    df_full = pd.json_normalize(response_id['campaigns'])
    df_id = df_full['id']
    df_id = df_id.values.tolist()
    df_id_camp_name = df_full[['id', 'settings.subject_line']]
    # display(df_id)
except ApiClientError as error:
    print("Error: {}".format(error.text))
This part of the script gets the ID and the name of each campaign.
for id in df_id:
    try:
        client = MailchimpMarketing.Client()
        client.set_config({
            "api_key": "xxxxx",
            "server": "xxxx"
        })
        response_open = client.reports.get_campaign_recipients(id, count=1000)
        df_open = pd.json_normalize(response_open['sent_to'])
        df_open_full = df_open.append(df_open)
    except ApiClientError as error:
        print("Error: {}".format(error.text))

df_open_full.to_excel("campaign_reports_mailchimp.xlsx")
This last piece takes each ID from the generated list and supposedly runs the code for every ID. The problem is that the generated Excel file only contains one campaign and doesn't iterate over the rest of the IDs.
I know that the API has a count limit, but it should move on to the next ID once the cap is reached, since the for loop restarts itself.
What could be wrong in this case?
Thanks!
Adding some details about what each piece of code prints:
First block:
['076ff218d1', '08f16d1014', '0abfd11f5d', '0bb98a7839', '0bca92ee08', '0be206c6ac', '0e048c0d08', '0e28d0feee', '138271cade', '14bf2cd33b', '15c9ce24ed', '17b5302f30', '19721c1e8a', '1d8cc5c1da', '1f5362e2f4', '205480a063', '225bc2469b', '22f286dfbe', '26dec9723b', '2846539c87', '296e9c24f5', '2aa089aa37', '2f819130ff', '352d7913ae', '3a563ffb24', '3a73d3e5b6', '3c83f64170', '3d87a76763', '3e6a903948', '404ab63b91', '4198b629c6', '424b941199', '42e948e744', '46a29946a3', '48e56d7481', '4a0a55eb73', '4caf7e8cc1', '4e3c90946f', '53c8e8a8de', '54600382dd', '55a8915fb8', '5a28843185', '5d575f0de8', '60488c9e4b', '612b068a5b', '6161c05556', '61f5edcefa', '623c98757a', '689ae72a35', '68a8b5dadd', '6b3797ea1a', '6b606e78fb', '6dd276171d', '6ead2584c8', '6f99e38311', '70632fe9e7', '709b6fd5f8', '72a1b314b4', '74b92a828e', '75bdf2a3fe', '75cce79a85', '7687c62b55', '79e63229a8', '79f67ec7b8', '7f9dddc6c0', '807a75418e', '8548a5f525', '8602aa9cdd', '87c4bd5a07', '8bb822eeb3', '8ec05b63fa', '8f0c7d0cce', '900018816a', '924976c731', '933a2af282', '95a170d72c', '977beb5616', '98f8ee7fed', '99dbbc746c', '9a01a1b518', 'a1ad97ae8e', 'a4aa98b22b', 'a7c535a5b9', 'a978bab42b', 'ab13c82454', 'ab7c015390', 'acdc57b754', 'ad66024938', 'ae8464e351', 'ae95f63873', 'aeba2b962a', 'af0d9fe032', 'af6a4efe07', 'b19c553cd1', 'b5f4196035', 'b7a9ced6c8', 'b7eab10f0f', 'b80b52c97b', 'bd56fd7a6d', 'bdbb60aec7', 'c142343cfd', 'c2eb923ada', 'c407b9856d', 'c4636be5a1', 'c6145916ae', 'c84e39f8ef', 'c937f5876e', 'c97497c3e4', 'ca468b0942', 'cf2a040b92', 'cf81c2ac84', 'd006696585', 'd1b57067d2', 'd67915da02', 'd687b97dec', 'd698158ac5', 'd78cb47ccd', 'da0e85a878', 'dfc6a9bffc', 'dfe8e851e8', 'e08ce9ad82', 'e33f24fdcb', 'e4c478afb4', 'e8e3faaf5a', 'ebee2d5079', 'ecafe77954', 'ef1dae3863', 'f045de38f4', 'fa07a15c0e', 'fa3936c575', 'fa4def8ca1', 'fc1f708dc7', 'fe4f89c745']
Second block of code:
(screenshot of the display(df) output)
Also adding a warning displayed in the loop, because I'm using pd.append instead of pd.concat (since I don't have any other df to concatenate with):
FutureWarning: The frame.append method is deprecated and will be
removed from pandas in a future version. Use pandas.concat instead.
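For what it's worth, that warning also points at the underlying bug: df_open_full is rebuilt from the current df_open on every iteration (df_open_full = df_open.append(df_open)), so only the last campaign survives into the Excel file. Here is a minimal sketch of the loop with the frames accumulated in a list and concatenated once at the end (api_key/server are placeholders, and the client only needs to be configured once):

import pandas as pd
import mailchimp_marketing as MailchimpMarketing
from mailchimp_marketing.api_client import ApiClientError

client = MailchimpMarketing.Client()
client.set_config({"api_key": "xxxx", "server": "xxxx"})

frames = []
for campaign_id in df_id:
    try:
        response_open = client.reports.get_campaign_recipients(campaign_id, count=1000)
        frames.append(pd.json_normalize(response_open['sent_to']))
    except ApiClientError as error:
        print("Error: {}".format(error.text))

# Concatenate all per-campaign frames into one DataFrame
df_open_full = pd.concat(frames, ignore_index=True)
df_open_full.to_excel("campaign_reports_mailchimp.xlsx")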

AWS Glue create_partition using boto3 successful, but Athena not showing results for query

I have a Glue script to create new partitions using create_partition(). The Glue script runs successfully, and I can see the partitions in the Athena console when using SHOW PARTITIONS. For the create_partition call in the Glue script, I referred to this sample code: https://medium.com/@bv_subhash/demystifying-the-ways-of-creating-partitions-in-glue-catalog-on-partitioned-s3-data-for-faster-e25671e65574
When I try to run an Athena query for a newly added partition, I get no results.
Do I need to trigger the MSCK REPAIR TABLE command even if I add the partitions using create_partition? I'd appreciate any suggestions.
I found the solution myself and wanted to share it with the SO community so it may be useful to someone. The following code, when run as a Glue job, creates partitions that can also be queried in Athena for the new partition columns. Please change/add the parameter values (DB name, table name, partition columns) as needed.
import boto3
import urllib.parse
import os
import copy
import sys

# Configure database / table name and emp_id, file_id from workflow params?
DATABASE_NAME = 'my_db'
TABLE_NAME = 'enter_table_name'
emp_id_tmp = ''
file_id_tmp = ''

# Initialise the Glue client using Boto 3
glue_client = boto3.client('glue')

# Get the current table schema for the given database name & table name
def get_current_schema(database_name, table_name):
    try:
        response = glue_client.get_table(
            DatabaseName=database_name,
            Name=table_name
        )
    except Exception as error:
        print("Exception while fetching table info")
        sys.exit(-1)

    # Parse the table info required to create partitions from the table
    table_data = {}
    table_data['input_format'] = response['Table']['StorageDescriptor']['InputFormat']
    table_data['output_format'] = response['Table']['StorageDescriptor']['OutputFormat']
    table_data['table_location'] = response['Table']['StorageDescriptor']['Location']
    table_data['serde_info'] = response['Table']['StorageDescriptor']['SerdeInfo']
    table_data['partition_keys'] = response['Table']['PartitionKeys']
    return table_data

# Prepare the partition input list using table_data
def generate_partition_input_list(table_data):
    input_list = []  # Initializing empty list
    part_location = "{}/emp_id={}/file_id={}/".format(
        table_data['table_location'], emp_id_tmp, file_id_tmp)
    input_dict = {
        'Values': [
            emp_id_tmp, file_id_tmp
        ],
        'StorageDescriptor': {
            'Location': part_location,
            'InputFormat': table_data['input_format'],
            'OutputFormat': table_data['output_format'],
            'SerdeInfo': table_data['serde_info']
        }
    }
    input_list.append(input_dict.copy())
    return input_list

# Create the partition dynamically using the partition input list
table_data = get_current_schema(DATABASE_NAME, TABLE_NAME)
input_list = generate_partition_input_list(table_data)
try:
    create_partition_response = glue_client.batch_create_partition(
        DatabaseName=DATABASE_NAME,
        TableName=TABLE_NAME,
        PartitionInputList=input_list
    )
    print('Glue partition created successfully.')
    print(create_partition_response)
except Exception as e:
    # Handle the exception as per your business requirements
    print(e)
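As a quick sanity check after the job runs, the new partition can also be queried programmatically. Below is a minimal sketch using Athena via boto3 (the emp_id/file_id values and the S3 output location are placeholders):

import boto3

athena_client = boto3.client('athena')

# Run a query against the newly registered partition
query = "SELECT * FROM {}.{} WHERE emp_id = '123' AND file_id = '456' LIMIT 10".format(
    DATABASE_NAME, TABLE_NAME)
response = athena_client.start_query_execution(
    QueryString=query,
    QueryExecutionContext={'Database': DATABASE_NAME},
    ResultConfiguration={'OutputLocation': 's3://my-athena-query-results/'}
)
print(response['QueryExecutionId'])  # poll get_query_execution with this ID for status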

Parse GitHub API: getting "string indices must be integers" error

I need to loop through commits and get the name, date, and message info from the GitHub API:
https://api.github.com/repos/droptable461/Project-Project-Management/commits
I have tried many different things, but I keep getting stuck on the "string indices must be integers" error:
def git():
    # name, date, message
    # https://api.github.com/repos/droptable461/Project-Project-Management/commits
    # commit { author { name and date
    # commit { message
    #with urlopen('https://api.github.com/repos/droptable461/Project-Project-Management/commits') as response:
    #    source = response.read()
    #    data = json.loads(source)
    #    state = []
    #    for state in data['committer']:
    #        state.append(state['name'])
    #    print(state)
    link = 'https://api.github.com/repos/droptable461/Project-Project-Management/events'
    r = requests.get('https://api.github.com/repos/droptable461/Project-Project-Management/commits')
    #print(r)
    #one = r['commit']
    #print(one)
    for item in r.json():
        for c in item['commit']['committer']:
            print(c['name'], c['date'])
    return 'suc'
Need to get person who did the commit, date and their message.
item['commit']['committer'] is a dictionary object, so the line for c in item['commit']['committer']: iterates over the dictionary's keys.
Since you are then calling [] on a string (a dictionary key), you get the error.
Instead, that code should look more like:
import requests

def git():
    link = 'https://api.github.com/repos/droptable461/Project-Project-Management/events'
    r = requests.get('https://api.github.com/repos/droptable461/Project-Project-Management/commits')
    for item in r.json():
        print(item['commit']['committer']['name'])
        print(item['commit']['committer']['date'])
        print(item['commit']['message'])
    return 'suc'
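If you need the values back rather than printed, a small variation of the same idea (just a sketch, with a hypothetical helper name) collects one dict per commit:

import requests

def get_commit_info():
    # Hypothetical helper: returns a list of {name, date, message} dicts
    url = 'https://api.github.com/repos/droptable461/Project-Project-Management/commits'
    commits = []
    for item in requests.get(url).json():
        commits.append({
            'name': item['commit']['committer']['name'],
            'date': item['commit']['committer']['date'],
            'message': item['commit']['message'],
        })
    return commits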

Send Rekognition response to DynamoDB table using Lambda (Python)

I am using Lambda to detect faces and would like to send the response to a DynamoDB table.
This is the code I am using:
rekognition = boto3.client('rekognition', region_name='us-east-1')
dynamodb = boto3.client('dynamodb', region_name='us-east-1')

# --------------- Helper Functions to call Rekognition APIs ------------------
def detect_faces(bucket, key):
    response = rekognition.detect_faces(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        Attributes=['ALL'])
    TableName = 'table_test'
    for face in response['FaceDetails']:
        table_response = dynamodb.put_item(TableName=TableName, Item='{0} - {1}%')
    return response
My problem is with these lines:
for face in response['FaceDetails']:
    table_response = dynamodb.put_item(TableName=TableName, Item= {'key:{'S':'value'}, {'S':'Value')
I am able to see the result in the console.
I don't want to add specific item(s) to the table; I need the whole response to be transferred to the table.
To do this:
1. What should I add as the key and partition key in the table?
2. How do I transfer the whole response to the table?
I have been stuck on this for three days now and can't figure out any result. Please help!
******************* EDIT *******************
I tried this code:
rekognition = boto3.client('rekognition', region_name='us-east-1')

# --------------- Helper Functions to call Rekognition APIs ------------------
def detect_faces(bucket, key):
    response = rekognition.detect_faces(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        Attributes=['ALL'])
    TableName = 'table_test'
    for face in response['FaceDetails']:
        face_id = str(uuid.uuid4())
        Age = face["AgeRange"]
        Gender = face["Gender"]
        print('Generating new DynamoDB record, with ID: ' + face_id)
        print('Input Age: ' + Age)
        print('Input Gender: ' + Gender)
        dynamodb = boto3.resource('dynamodb')
        table = dynamodb.Table(os.environ['test_table'])
        table.put_item(
            Item={
                'id': face_id,
                'Age': Age,
                'Gender': Gender
            }
        )
    return response
It gave me two errors:
1. Error processing object xxx.jpg
2. cannot concatenate 'str' and 'dict' objects
Can you please help!
When you create a table in DynamoDB, you must specify at least a partition key. Go to your DynamoDB table and grab your partition key. Once you have it, you can create a new object that contains this partition key with some value, plus the object you want to store. The partition key is always a MUST when creating a new item in a DynamoDB table.
Your JSON object should look like this:
{
    "myPartitionKey": "myValue",
    "attr1": "val1",
    "attr2": "val2"
}
EDIT: After the OP updated his question, here's some new information:
For problem 1)
Are you sure the image you are trying to process is a valid one? If it is a corrupted file, Rekognition will fail and throw that error.
For problem 2)
You cannot concatenate a String with a Dictionary in Python. Your Age and Gender variables are dictionaries, not Strings, so you need to access an inner attribute within them. I am not a Python developer, but you need to access the 'Value' attribute inside your Gender object. The Age object, however, has 'Low' and 'High' attributes.
You can see the complete list of attributes in the docs
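To make that concrete, here is a minimal sketch of the corrected loop, assuming the same table object as in the edited code above (the attribute names in the Item are illustrative):

for face in response['FaceDetails']:
    face_id = str(uuid.uuid4())
    age = face['AgeRange']            # a dict, e.g. {'Low': 20, 'High': 30}
    gender = face['Gender']['Value']  # a string, e.g. 'Male'
    print('Generating new DynamoDB record, with ID: ' + face_id)
    print('Input Age: {}-{}'.format(age['Low'], age['High']))
    print('Input Gender: ' + gender)
    table.put_item(
        Item={
            'id': face_id,
            'AgeLow': age['Low'],
            'AgeHigh': age['High'],
            'Gender': gender
        }
    )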
Hope this helps!

Flask PyMongo ObjectId link not working

I am trying to access a document from a MongoDB collection named games by its _id. But, for example, if I access localhost:5000/solutie/5ae71f3e8e442b090e4c313b, it gives me the error ValueError: View function did not return a response, so it doesn't go through the if. I think I need to convert the value of the _id to another type, but I don't know how.
This is my flask code:
@app.route('/solutie/<id>')
def solu(id):
    games = mongo.db.games
    game_user = games.find_one({'_id': id})
    if game_user:
        return id
This is a document from my mongo database collection named games:
{
    "_id": {
        "$oid": "5ae71f3e8e442b090e4c313b"
    },
    "sursa1": "nothingfornow",
    "sursa2": "nothing4now",
    "corectat": 0,
    "player2": "test",
    "player1": "test2",
    "trimis1": 1,
    "trimis2": 1
}
There's an object type converter you can use for URL routing:
@app.route('/solutie/<ObjectId:game_id>')
def solu(game_id):
    games = mongo.db.games
    game_user = games.find_one_or_404({'_id': game_id})
    return game_user
See:
https://flask-pymongo.readthedocs.io/en/latest/#flask_pymongo.BSONObjectIdConverter
Also, don't shadow id, because it is a built-in Python function.
The second parameter of the find() method is an object describing which fields to include in the result.
This parameter is optional and if omitted, all fields are included in the result.
# @app.route('/solutie/<int:id>')  # or
@app.route('/solutie/<string:id>')
def solu(id):
    myclient = pymongo.MongoClient("mongodb://localhost:27017/")
    mydb = myclient["mydatabase"]
    games = mydb["games"]
    game_user = games.find({}, {"_id": id})
    if game_user is not None:
        return id
    else:
        return render_template("index.html")
Also, you should use an else condition.
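For what it's worth, the root cause is that the value from the URL arrives as a str, while the _id stored by Mongo is an ObjectId, so find_one({'_id': id}) never matches. Here is a minimal sketch of an explicit conversion, assuming the bson package that ships with PyMongo:

from bson.objectid import ObjectId
from bson.errors import InvalidId

@app.route('/solutie/<id>')
def solu(id):
    games = mongo.db.games
    try:
        # Convert the URL string into the ObjectId type stored in Mongo
        game_user = games.find_one({'_id': ObjectId(id)})
    except InvalidId:
        game_user = None
    if game_user:
        return id
    else:
        return render_template("index.html")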
