Create new key value in JSON data using Python / Pandas? - python

I'm trying to work with the Campaign Monitor API, posting JSON data through the API to update subscriber lists. I'm currently one change away from being able to send data.
Right now, my JSON data looks like this:
{
"EmailAddress": "subscriber1#example.com",
"Name": "New Subscriber One",
"CustomFields": [
{
"Key": "website",
"Value": "http://example.com"
},
{
"Key": "interests",
"Value": "magic"
},
{
"Key": "interests",
"Value": "romantic walks"
},
{
"Key": "age",
"Value": "",
"Clear": true
}
],
},
{
"EmailAddress": "subscriber2#example.com",
"Name": "New Subscriber Two",
},
{
"EmailAddress": "subscriber3#example.com",
"Name": "New Subscriber Three",
}
}
I still need to add a new key at the beginning of the JSON payload, in the form 'Subscribers' : my_json_data. How would I go about easily adding the Subscribers key and placing my full current JSON data into a list?
The final result should look like this:
{
'Subscribers' : [
{
"EmailAddress": "subscriber1#example.com",
"Name": "New Subscriber One",
"CustomFields": [
{
"Key": "website",
"Value": "http://example.com"
},
{
"Key": "interests",
"Value": "magic"
},
{
"Key": "interests",
"Value": "romantic walks"
},
{
"Key": "age",
"Value": "",
"Clear": true
}
],
},
{
"EmailAddress": "subscriber2#example.com",
"Name": "New Subscriber Two",
},
{
"EmailAddress": "subscriber3#example.com",
"Name": "New Subscriber Three",
}
}
]
}
I've tried to approach this by creating a new dictionary; however, when I convert that back to JSON I get more issues and headaches. Is there an easy way to keep everything as a JSON-formatted dataset and add the leading 'Subscribers' key?

This should do it, assuming you have valid JSON that has already been parsed into a Python object:
your_new_json = {}
your_new_json['Subscribers'] = [your_current_json]
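A minimal sketch of the full round trip, in case the data starts out as a JSON string. It assumes the current data is a JSON array of subscriber objects (the sample in the question is not valid JSON as posted, so this shape is an assumption), and the variable names are illustrative only.

```python
import json

# Assumed input: a JSON array of subscriber objects (shortened sample).
raw = '''[
  {"EmailAddress": "subscriber2#example.com", "Name": "New Subscriber Two"},
  {"EmailAddress": "subscriber3#example.com", "Name": "New Subscriber Three"}
]'''

subscribers = json.loads(raw)      # JSON text -> Python list of dicts

# If a single object was parsed instead of a list, wrap it in a list.
if isinstance(subscribers, dict):
    subscribers = [subscribers]

payload = {"Subscribers": subscribers}        # add the leading key
payload_json = json.dumps(payload, indent=2)  # back to JSON text
print(payload_json)
```

If the data is already a Python list, wrapping it in another list (as in `[your_current_json]`) would nest it one level too deep, which is why the sketch only wraps a lone dict.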

Related

Create/ re-create a list of dictionaries from a dictionary via Python Recursion function

I'm trying to parse this JSON object into multiple events, as that is the expected input for an ETL tool. I know this is quite straightforward with loops, if statements and explicitly defined search fields for each event. That method is not feasible here, because I have multiple heavily nested JSON objects and I would prefer to let Python recursion handle the heavy lifting. The following is a sample object, which consists of strings, lists and dicts (it covers most of the use cases in the data I have).
{
"event_name": "restaurants",
"properties": {
"_id": "5a9909384309cf90b5739342",
"name": "Mangal Kebab Turkish Restaurant",
"restaurant_id": "41009112",
"borough": "Queens",
"cuisine": "Turkish",
"address": {
"building": "4620",
"coord": {
"0": -73.9180155,
"1": 40.7427742
},
"street": "Queens Boulevard",
"zipcode": "11104"
},
"grades": [
{
"date": 1414540800000,
"grade": "A",
"score": 12
},
{
"date": 1397692800000,
"grade": "A",
"score": 10
},
{
"date": 1381276800000,
"grade": "A",
"score": 12
}
]
}
}
I want to convert it to the following list of dictionaries:
[
{
"event_name": "restaurants",
"properties": {
"restaurant_id": "41009112",
"name": "Mangal Kebab Turkish Restaurant",
"cuisine": "Turkish",
"_id": "5a9909384309cf90b5739342",
"borough": "Queens"
}
},
{
"event_name": "restaurant_address",
"properties": {
"zipcode": "11104",
"ref_id": "41009112",
"street": "Queens Boulevard",
"building": "4620"
}
},
{
"event_name": "restaurant_address_coord",
"ref_id": "41009112",
"0": -73.9180155,
"1": 40.7427742
},
{
"event_name": "restaurant_grades",
"properties": {
"date": 1414540800000,
"ref_id": "41009112",
"score": 12,
"grade": "A",
"index": "0"
}
},
{
"event_name": "restaurant_grades",
"properties": {
"date": 1397692800000,
"ref_id": "41009112",
"score": 10,
"grade": "A",
"index": "1"
}
},
{
"event_name": "restaurant_grades",
"properties": {
"date": 1381276800000,
"ref_id": "41009112",
"score": 12,
"grade": "A",
"index": "2"
}
}
]
Most importantly, these events will be broken up into independent structured tables. To join them later, we need primary keys / unique identifiers, so each nested dictionary should carry its parent's id as a ref_id field. In this case, ref_id = restaurant_id from the parent dictionary.
Most of the examples on the internet flatten the whole object and normalize it into a dataframe, but to use this ETL tool to its full potential it would be ideal to solve this with recursion and output a list of dictionaries.
This is what one might call a brute force method. Create a translator function to move each item into the correct part of the new structure (like a schema).
# input dict
d = {
"event_name": "demo",
"properties": {
"_id": "5a9909384309cf90b5739342",
"name": "Mangal Kebab Turkish Restaurant",
"restaurant_id": "41009112",
"borough": "Queens",
"cuisine": "Turkish",
"address": {
"building": "4620",
"coord": {
"0": -73.9180155,
"1": 40.7427742
},
"street": "Queens Boulevard",
"zipcode": "11104"
},
"grades": [
{
"date": 1414540800000,
"grade": "A",
"score": 12
},
{
"date": 1397692800000,
"grade": "A",
"score": 10
},
{
"date": 1381276800000,
"grade": "A",
"score": 12
}
]
}
}
def convert_structure(d: dict):
    ''' function to convert to new structure'''
    # the new dict
    e = {}
    e['event_name'] = d['event_name']
    e['properties'] = {}
    e['properties']['restaurant_id'] = d['properties']['restaurant_id']
    # and so forth...
    # keep building the new structure / template
    # return a list
    return [e]

# run & print
x = convert_structure(d)
print(x)
The result (for the part done) looks like this:
[{'event_name': 'demo', 'properties': {'restaurant_id': '41009112'}}]
If a pattern is identified, then the above could be improved...
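One pattern that could be factored out is a recursive walk: scalar values stay in the current event, every nested dict becomes a child event, and every dict inside a list becomes a child event with an index. The following is a minimal sketch of that idea, not the answerer's method. `split_events` and its `id_key` parameter are hypothetical names. It assumes that non-empty lists contain only dicts and that the parent's `restaurant_id` is passed down unchanged as `ref_id`. Child event names are built by joining the parent name and the key, so it yields `restaurants_address` rather than the singular `restaurant_address` in the desired output.

```python
def split_events(name, props, ref_id=None, id_key="restaurant_id"):
    """Recursively split a nested dict into a flat list of event dicts."""
    flat = {}
    if ref_id is not None:
        flat["ref_id"] = ref_id
    # Id handed to children: this level's own id if present, else the inherited one.
    own_id = props.get(id_key, ref_id)

    children = []  # (event_name, nested_dict, list_index_or_None)
    for key, value in props.items():
        if isinstance(value, dict):
            children.append((f"{name}_{key}", value, None))
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            for i, item in enumerate(value):
                children.append((f"{name}_{key}", item, i))
        else:
            flat[key] = value  # scalar (or non-dict list): keep at this level

    events = [{"event_name": name, "properties": flat}]
    for child_name, child, index in children:
        child_events = split_events(child_name, child, own_id, id_key)
        if index is not None:
            # Record the position of this item within its parent list.
            child_events[0]["properties"]["index"] = str(index)
        events.extend(child_events)
    return events


sample = {
    "event_name": "restaurants",
    "properties": {
        "restaurant_id": "41009112",
        "name": "Mangal Kebab Turkish Restaurant",
        "address": {"street": "Queens Boulevard", "coord": {"0": -73.9180155}},
        "grades": [{"grade": "A", "score": 12}, {"grade": "A", "score": 10}],
    },
}
events = split_events(sample["event_name"], sample["properties"])
for event in events:
    print(event)
```

Unlike the desired output in the question, this sketch always wraps the fields in a `properties` dict, including for the `coord` event.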

How does one pass a json file or object in a POST request using Python module 'requests'

I am using a site's REST API and have primarily been using Python's 'requests' module to GET JSON responses. The goal of the GET requests is ultimately to pull a user's form response, which ends up being a complex JSON document. To deal with this:
user_form_submission = requests.get('https://www.url/doc.json',
auth = (api_key, secret),
params = params)
python_obj = json.loads(user_form_submission.text)
trimmed_dict = python_obj['key'][0]['keys']
For context, this is what trimmed_dict would look like formatted as .json:
{
"Date": { "value": "2020-04-26", "type": "date" },
"Location": {
"value": "Test ",
"type": "text",
"geostamp": "lat=34.00000, long=-77.00000, alt=17.986118, hAccuracy=65.000000, vAccuracy=10.000000, timestamp=2020-04-26T23:39:56Z"
},
"form": {
"value": [
{
"form_Details": {
"value": [
{
"code": {
"value": "0000000000",
"type": "barcode"
},
"Name": { "value": "bob", "type": "text" }
}
],
"type": "group"
},
"Subtotal": { "value": "4", "type": "decimal" },
"form_detail2": {
"value": [
{
"name": {
"value": "billy",
"type": "text"
},
"code": {
"value": "00101001",
"type": "barcode"
},
"Classification": {
"value": "person",
"type": "select1"
},
"Start_Time": { "value": "19:43:00", "type": "time" },
"time": { "value": "4", "type": "decimal" }
}
],
"type": "subform"}
}
]
}
}
Now I have a portion of the JSON that contains both useful and useless fields. From this point, can I pass this object in a POST? I've tried every approach I can think of and have been shut down each time.
This is how I thought it would go:
json_post = requests.post(' https://url/api/doc.json',
auth = (api_key, secret),
json = {
"form_id" : 'https://url.form.com/formid',
'payload':{
json.dumps(trimmed_dict)
}})
But when I do this, I get the following error:
TypeError: Object of type set is not JSON serializable
How can I push this dict through this POST? If there's a more effective way of going about it, I am very open to suggestion.
Try removing the curly braces around json.dumps(trimmed_dict). json.dumps turns your trimmed_dict into a string, and a single value surrounded by braces becomes a Python set, which cannot be serialized.
Additionally, you could remove json.dumps and plug trimmed_dict into the structure directly as the value associated with payload. The json= argument of requests.post already serializes the whole dict for you.
Remove the extra {} from the payload. payload itself is a key, and json.dumps(trimmed_dict) as its value is enough:
json_post = requests.post('https://url/api/doc.json',
                          auth=(api_key, secret),
                          json={
                              "form_id": 'https://url.form.com/formid',
                              "payload": json.dumps(trimmed_dict)
                          })
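The difference between the failing and working variants can be reproduced without any network call. The sketch below uses only the standard `json` module and a shortened stand-in for `trimmed_dict`. Whether the API expects the payload as a nested object or as a JSON-encoded string is not stated in the question, so both are shown.

```python
import json

# Shortened stand-in for the trimmed_dict from the question.
trimmed_dict = {"Date": {"value": "2020-04-26", "type": "date"}}

# Braces around a single expression build a set, not a dict.
bad_payload = {json.dumps(trimmed_dict)}
error_message = None
try:
    json.dumps({"payload": bad_payload})
except TypeError as exc:
    error_message = str(exc)
print(type(bad_payload).__name__, "->", error_message)

# Variant 1: pass the dict itself; the payload is a nested JSON object.
body_nested = json.dumps({"payload": trimmed_dict})

# Variant 2: pass json.dumps(...) as the value; the payload is a JSON
# string that the receiver has to decode a second time.
body_string = json.dumps({"payload": json.dumps(trimmed_dict)})

print(body_nested)
print(body_string)
```

Since `requests.post(..., json=...)` serializes its argument itself, variant 1 corresponds to passing `trimmed_dict` directly and variant 2 to the `json.dumps(trimmed_dict)` form.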

How to flatten JSON response from Surveymonkey API

I'm setting up a Python function to use the Surveymonkey API to get survey responses from Surveymonkey.
The API returns responses in a JSON format with a deep recursive file structure.
I'm having issues trying to flatten this JSON so that it can go into Google Cloud Storage.
I have tried to flatten the response (sample below) using the json_normalize call shown after it. This works; however, it does not transform the data into the format that I am looking for.
{
"per_page": 2,
"total": 1,
"data": [
{
"total_time": 0,
"collection_mode": "default",
"href": "https://api.surveymonkey.com/v3/responses/5007154325",
"custom_variables": {
"custvar_1": "one",
"custvar_2": "two"
},
"custom_value": "custom identifier for the response",
"edit_url": "https://www.surveymonkey.com/r/",
"analyze_url": "https://www.surveymonkey.com/analyze/browse/",
"ip_address": "",
"pages": [
{
"id": "73527947",
"questions": [
{
"id": "273237811",
"answers": [
{
"choice_id": "1842351148"
},
{
"text": "I might be text or null",
"other_id": "1842351149"
}
]
},
{
"id": "273240822",
"answers": [
{
"choice_id": "1863145815",
"row_id": "1863145806"
},
{
"text": "I might be text or null",
"other_id": "1863145817"
}
]
},
{
"id": "273239576",
"answers": [
{
"choice_id": "1863156702",
"row_id": "1863156701"
},
{
"text": "I might be text or null",
"other_id": "1863156707"
}
]
},
{
"id": "296944423",
"answers": [
{
"text": "I might be text or null"
}
]
}
]
}
],
"date_modified": "1970-01-17T19:07:34+00:00",
"response_status": "completed",
"id": "5007154325",
"collector_id": "50253586",
"recipient_id": "0",
"date_created": "1970-01-17T19:07:34+00:00",
"survey_id": "105723396"
}
],
"page": 1,
"links": {
"self": "https://api.surveymonkey.com/v3/surveys/123456/responses/bulk?page=1&per_page=2"
}
}
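from pandas import json_normalize  # top-level function since pandas 1.0

# one row per answer, with the response, question and page ids as extra columns
answers_df = json_normalize(data=response_json['data'],
                            record_path=['pages', 'questions', 'answers'],
                            meta=['id', ['pages', 'questions', 'id'], ['pages', 'id']])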
Instead of returning a row for each question id, I need it to return a column for each question id, choice_id, and text field, so that there is only one row per survey_id (or per index in data).
The columns I would like to see are total_time, collection_mode, href, custom_variables.custvar_1, custom_variables.custvar_2, custom_value, edit_url, analyze_url, ip_address, pages.id, pages.questions.0.id, pages.questions.0.answers.0.choice_id, pages.questions.0.answers.0.text, pages.questions.0.answers.0.other_id
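One way to get a single row per response is to flatten each element of `data` recursively, joining keys with dots and using list positions as path segments. This is a minimal sketch under that assumption; `flatten` is a hypothetical helper, not part of pandas or the Surveymonkey API. It indexes every list, so it produces `pages.0.questions.0.id` rather than the `pages.questions.0.id` form listed above.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into one dict with dotted keys."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        # Use the list position as the key for each element.
        items = ((str(i), v) for i, v in enumerate(obj))
    else:
        return {prefix: obj}  # scalar leaf: one column

    row = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        row.update(flatten(value, path))
    return row


# Shortened stand-in for one element of response_json['data'].
response = {
    "id": "5007154325",
    "custom_variables": {"custvar_1": "one"},
    "pages": [
        {
            "id": "73527947",
            "questions": [
                {
                    "id": "273237811",
                    "answers": [
                        {"choice_id": "1842351148"},
                        {"text": "I might be text or null"},
                    ],
                }
            ],
        }
    ],
}
rows = [flatten(item) for item in [response]]
print(rows[0])
```

A list of such dicts can be passed to `pandas.DataFrame(rows)` to obtain one row per response, with missing columns filled with NaN where responses differ in shape.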

django - iterate between json response objects

I have a response object that I am receiving from an API call. The response contains several objects returned in a single call. What I want to do is grab information from each of the returned objects and store it in variables for use within the application. I know how to grab info from a JSON response when it returns a single object, but I am getting confused with multiple objects. I know how to automate the iteration with something like a for loop, but it won't iterate.
Here is a sample response that I am getting.
I want to grab the _id from both items.
{
'user':"<class 'synapse_pay_rest.models.users.user.User'>(id=..622d)",
'json':{
'_id':'..6e80',
'_links':{
'self':{
'href':'https://uat-api.synapsefi.com/v3.1/users/..22d/nodes/..56e80'
}
},
'allowed':'CREDIT-AND-DEBIT',
'client':{
'id':'..26a34',
'name':'Charlie Brown LLC'
},
'extra':{
'note':None,
'other':{
},
'supp_id':''
},
'info':{
'account_num':'8902',
'address':'PO BOX 85139, RICHMOND, VA, US',
'balance':{
'amount':'750.00',
'currency':'USD'
},
'bank_long_name':'CAPITAL ONE N.A.',
'bank_name':'CAPITAL ONE N.A.',
'class':'SAVINGS',
'match_info':{
'email_match':'not_found',
'name_match':'not_found',
'phonenumber_match':'not_found'
},
'name_on_account':' ',
'nickname':'SynapsePay Test Savings Account - 8902',
'routing_num':'6110',
'type':'BUSINESS'
},
<class 'synapse_pay_rest.models.nodes.ach_us_node.AchUsNode'>({
'user':"<class 'synapse_pay_rest.models.users.user.User'>(id=..622d)",
'json':{
'_id':'..56e83',
'_links':{
'self':{
'href':'https://uat-api.synapsefi.com/v3.1/users/..d622d/nodes/..6e83'
}
},
'allowed':'CREDIT-AND-DEBIT',
'client':{
'id':'599378ec6aef1b0021026a34',
'name':'Charlie Brown LLC'
},
'extra':{
'note':None,
'other':{
},
'supp_id':''
},
'info':{
'account_num':'8901',
'address':'PO BOX 85139, RICHMOND, VA, US',
'balance':{
'amount':'800.00',
'currency':'USD'
},
'bank_long_name':'CAPITAL ONE N.A.',
'bank_name':'CAPITAL ONE N.A.',
'class':'CHECKING',
'match_info':{
'email_match':'not_found',
'name_match':'not_found',
'phonenumber_match':'not_found'
},
'name_on_account':' ',
'nickname':'SynapsePay Test Checking Account - 8901',
'routing_num':'6110',
'type':'BUSINESS'
},
})
Here is the code that I have.
It won't grab any values.
The iteration needs to be done over the nodes variable, which is the JSON response object.
def listedLinkAccounts(request):
    currentUser = loggedInUser(request)
    currentProfile = Profile.objects.get(user = currentUser)
    user_id = currentProfile.synapse_id
    synapseUser = SynapseUser.by_id(client, str(user_id))
    options = {
        'page':1,
        'per_page':20,
        'type': 'ACH-US',
    }
    nodes = Node.all(synapseUser, **options)
    print(nodes)
    response = nodes
    _id = response["_id"]
    print(_id)
    return nodes
Here is a sample API response from the API documentation:
{
"error_code": "0",
"http_code": "200",
"limit": 20,
"node_count": 5,
"nodes": [
{
"_id": "594e5c694d1d62002f17e3dc",
"_links": {
"self": {
"href": "https://uat-api.synapsefi.com/v3.1/users/594e0fa2838454002ea317a0/nodes/594e5c694d1d62002f17e3dc"
}
},
"allowed": "CREDIT-AND-DEBIT",
"client": {
"id": "589acd9ecb3cd400fa75ac06",
"name": "SynapseFI"
},
"extra": {
"other": {},
"supp_id": "ABC124"
},
"info": {
"account_num": "7443",
"address": "PLACE DE LA REPUBLIQUE 4 CROIX 59170 FR",
"balance": {
"amount": "0.00",
"currency": "USD"
},
"bank_long_name": "3 SUISSES INTERNATIONAL",
"bank_name": "3 SUISSES INTERNATIONAL",
"name_on_account": " ",
"nickname": "Some Account"
},
"is_active": true,
"timeline": [
{
"date": 1498307689471,
"note": "Node created."
},
{
"date": 1498307690130,
"note": "Unable to send micro deposits as node type is not ACH-US."
}
],
"type": "WIRE-INT",
"user_id": "594e0fa2838454002ea317a0"
},
{
...
},
{
...
},
...
],
"page": 1,
"page_count": 1,
"success": true
}
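For a response shaped like the documentation sample, the ids sit one level down, inside each element of the "nodes" list, so indexing the top-level object with "_id" cannot work. A minimal sketch with plain dicts follows; the second id is a made-up placeholder. With the client library's node objects, as printed in the question, the access may need to go through an attribute of each object instead of a dict key; that part is an assumption and is not shown here.

```python
# Stand-in for a decoded API response, shaped like the documentation sample.
api_response = {
    "node_count": 2,
    "nodes": [
        {"_id": "594e5c694d1d62002f17e3dc", "type": "WIRE-INT"},
        {"_id": "placeholder-second-id", "type": "ACH-US"},  # hypothetical value
    ],
}

# Iterate over the list and pull the _id out of each node.
node_ids = []
for node in api_response["nodes"]:
    node_ids.append(node["_id"])

print(node_ids)
```

The same loop body is the place to collect any other per-node field, such as the nickname or balance.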

Python - Find value anywhere within JSON and return location

In Python I'm currently working with a very large JSON file with some deep dictionaries and arrays. My problem is that its structure is not consistent. The example below is essentially countries, with regions/states, cities, and suburbs. If there is only one suburb, it is returned as a dictionary, but if there is more than one, it is an array of dictionaries, so I have to add another line of code to go deeper. Sure, I can if/else and for-loop my way through it, but this is only a very small portion of the inconsistency, and handling every case with if/else does not feel right.
What I'd like to do is simply search anything within Belgium for the dictionary entry "code": "8400" and return its location within the JSON file. What would be my best approach for something like this? Thanks!
***SNIP***
{
"code": "BE",
"name": "Belgium",
"regions": {
"region": [
{
"code": "45",
"name": "Flanders",
"places": {
"place": [
{
"code": "1790",
"name": "Affligem"
},
{
"code": "8570",
"name": "Anzegem"
},
{
"code": "8630",
"name": "Diksmuide"
},
{
"code": "9600",
"name": "Ronse"
}
]
},
"subregions": {
"subregion": [
{
"code": "46",
"name": "Coast",
"places": {
"place": [
{
"code": "8300",
"name": "Knokke-Heist"
},
{
"code": "8400",
"name": "Oostende",
"subplaces": {
"subplace": {
"code": "8450",
"name": "Bredene"
}
}
},
{
"code": "8420",
"name": "De Haan"
},
{
"code": "8430",
"name": "Middelkerke"
},
{
"code": "8434",
"name": "Westende-Bad"
},
{
"code": "8490",
"name": "Jabbeke"
},
{
"code": "8660",
"name": "De Panne"
},
{
"code": "8670",
"name": "Oostduinkerke"
}
]
}
},
{
"code": "47",
"name": "Cities",
"places": {
"place": [
{
"code": "1000",
"name": "Brussels"
},
{
"code": "2000",
"name": "Antwerp"
},
{
"code": "8000",
"name": "Bruges"
},
{
"code": "8340",
"name": "Damme"
},
{
"code": "9000",
"name": "Gent"
}
]
}
},
{
"code": "48",
"name": "Interior",
"places": {
"place": [
{
"code": "2260",
"name": "Westerlo"
},
{
"code": "2400",
"name": "Mol"
},
{
"code": "2590",
"name": "Berlaar"
},
{
"code": "8500",
"name": "Kortrijk",
"subplaces": {
"subplace": {
"code": "8940",
"name": "Wervik"
}
}
},
{
"code": "8610",
"name": "Handzame"
},
{
"code": "8755",
"name": "Ruiselede"
},
{
"code": "8900",
"name": "Ieper"
},
{
"code": "8970",
"name": "Poperinge"
}
]
}
},
EDIT:
I was asked to show how I'm currently getting through this JSON file. Root is a dictionary containing the codes of the city/suburb I'm trying to search for. It doesn't say beforehand whether the code belongs to a city or a suburb. Below is the search I lazily coded while trying to learn how to dig through this JSON file, until I realized how complicated it was getting and got a bit stuck.
SNIP
for k in dataDict['countries']['country']:
    if k['code'] == root['country']:
        for y in k['regions']['region']['places']['place']:
            if y['code'] == root['place']:
                city = y['name']
            else:
                try:
                    for p in y['subplaces']['subplace']:
                        if p['code'] == root['place']:
                            city = p['name']
                except:
                    pass
If I understand correctly, each dictionary has the following structure:
{"code": # some int
"name": # some str
none / "country" / "place" / whatever # some dict or list
}
You can write a recursive function that handles one and only one dict at a time:
def foo(my_dict):
    if my_dict['code'] == root['place']:
        city = my_dict['name']
    elif "country" in my_dict:
        city = foo(my_dict['country'])
    elif "place" in my_dict:
        #
        # and so on...
    else:
        city = None
    return city
Hope this example will help you.
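A more general variant of the same recursive idea does not need to know the key names in advance: walk every dict and list, and record the path of keys and indexes that leads to each match. This is a minimal sketch; `find_paths` is a hypothetical name and the sample data is a shortened stand-in for the Belgium entry. Because dicts and lists are handled separately, it makes no difference whether "subplace" holds a single dict or a list of dicts.

```python
def find_paths(node, key, value, path=()):
    """Yield the path (tuple of keys/indexes) to every dict where node[key] == value."""
    if isinstance(node, dict):
        if node.get(key) == value:
            yield path
        for k, v in node.items():
            yield from find_paths(v, key, value, path + (k,))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_paths(item, key, value, path + (i,))


def get_by_path(node, path):
    """Follow a path returned by find_paths back to the matching dict."""
    for step in path:
        node = node[step]
    return node


belgium = {
    "code": "BE",
    "name": "Belgium",
    "regions": {"region": [{
        "code": "45",
        "name": "Flanders",
        "subregions": {"subregion": [{
            "code": "46",
            "name": "Coast",
            "places": {"place": [
                {"code": "8300", "name": "Knokke-Heist"},
                {"code": "8400", "name": "Oostende",
                 "subplaces": {"subplace": {"code": "8450", "name": "Bredene"}}},
            ]},
        }]},
    }]},
}

paths = list(find_paths(belgium, "code", "8400"))
print(paths)
print(get_by_path(belgium, paths[0])["name"])
```

Returning paths rather than values keeps the location information the question asks for, and `get_by_path` turns a path back into the matching entry when the name is needed.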
