I have the following actions parameter as part of my JSON data.
'actions': [{'action_type': 'onsite_conversion.post_save', 'value': '1'},
{'action_type': 'link_click', 'value': '2'},
{'action_type': 'post', 'value': '3'},
{'action_type': 'post_reaction', 'value': '4'},
{'action_type': 'video_view', 'value': '5'},
{'action_type': 'post_engagement', 'value': '6'},
{'action_type': 'page_engagement', 'value': '7'}],
The API can send any of the following action types at any time:
action_types = ["onsite_conversion.post_save", "link_click", "post", "post_reaction", "comment", "video_view", "post_engagement", "page_engagement"]
I tried to write a Python script that extracts these possible values from the JSON body, in the order of the action_types list. As the sample JSON data shows, the API did not send a comment value, so in this case the script should write 0 to the returned list. Below is my script:
def get_action_values(insight):
    action_types = ["onsite_conversion.post_save", "link_click", "post", "post_reaction",
                    "comment", "video_view", "post_engagement", "page_engagement"]
    action_type_values = []
    for action in action_types:
        if action in [item["action_type"] for item in insight["actions"]]:
            for item in insight["actions"]:
                if action in item["action_type"]:
                    if "value" in item:
                        action_type_values.append(item["value"])
                    else:
                        action_type_values.append("Null")
        else:
            action_type_values.append(0)
    return action_type_values
I expected it to return [1,2,3,4,0,5,6,7], but it returned [1, 2, 1, 3, 4, 6, 4, 0, 5, 6, 7]
Here is a possible solution:
The action_type_values list is initialized with zeros, one per entry of action_types (its length is len(action_types)); zero is the default value for any action_type missing from the insight dictionary.
The for loop then goes through every item in insight["actions"]. If the item's action_type is in action_types, its index is looked up and the corresponding entry of action_type_values is set to int(value) of that item.
def get_action_values(insight):
    action_types = ["onsite_conversion.post_save", "link_click", "post", "post_reaction", "comment", "video_view", "post_engagement", "page_engagement"]
    action_type_values = [0] * len(action_types)
    for item in insight["actions"]:
        if item["action_type"] in action_types:
            idx = action_types.index(item["action_type"])
            action_type_values[idx] = int(item["value"])
    return action_type_values
print(get_action_values(my_json))
[1, 2, 3, 4, 0, 5, 6, 7]
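A variation on the same idea, sketched under the assumption that each action type appears at most once in insight["actions"]: build a lookup dictionary first, then read the values in action_types order. The function name get_action_values_by_lookup and its default parameter are illustrative and not part of the answer above.

```python
ACTION_TYPES = ["onsite_conversion.post_save", "link_click", "post", "post_reaction",
                "comment", "video_view", "post_engagement", "page_engagement"]


def get_action_values_by_lookup(insight, action_types=ACTION_TYPES, default=0):
    """Return one value per entry of action_types, in that order."""
    # Map each reported action type to its integer value. Entries without
    # a "value" key are skipped, so they fall back to the default below.
    lookup = {item["action_type"]: int(item["value"])
              for item in insight.get("actions", [])
              if "value" in item}
    return [lookup.get(action, default) for action in action_types]


sample = {"actions": [
    {"action_type": "onsite_conversion.post_save", "value": "1"},
    {"action_type": "link_click", "value": "2"},
    {"action_type": "post", "value": "3"},
    {"action_type": "post_reaction", "value": "4"},
    {"action_type": "video_view", "value": "5"},
    {"action_type": "post_engagement", "value": "6"},
    {"action_type": "page_engagement", "value": "7"},
]}
print(get_action_values_by_lookup(sample))  # [1, 2, 3, 4, 0, 5, 6, 7]
```

The dictionary lookup avoids calling list.index for every item; if an action type were repeated, the last occurrence would win.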
As you can see in the code you posted and as it was already pointed out in the comments to your question:
{'action_type': 'link_click', 'value': '2'},
'value' has a value of '2', not 2, so if you expect integers in the resulting list, use:
                    if "value" in item:
                        action_type_values.append(int(item["value"]))
or, to be consistent with the string representation:
        else:
            action_type_values.append("0")
Another issue is caused by the condition if action in item["action_type"]: in your code. The action type 'post' is a substring of 'onsite_conversion.post_save', 'post_reaction' and 'post_engagement', so the condition is true for all of them as well as for 'post' itself. It is necessary to change the condition (as pointed out by barmar in the comments) to if action == item["action_type"]: to eliminate the duplicates in the result.
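A small illustration of the difference between the two conditions; the candidates list is only sample data made up from the action types in the question:

```python
action = "post"
candidates = ["onsite_conversion.post_save", "post", "post_reaction",
              "post_engagement", "link_click"]

# `in` applied to two strings is a substring test, so "post" matches
# every name that merely contains it.
substring_matches = [name for name in candidates if action in name]

# `==` matches only the identical string.
exact_matches = [name for name in candidates if action == name]

print(substring_matches)
print(exact_matches)
```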
Below is the entire corrected code:
dct_json = {'actions': [
{'action_type': 'onsite_conversion.post_save', 'value': '1'},
{'action_type': 'link_click', 'value': '2'},
{'action_type': 'post', 'value': '3'},
{'action_type': 'post_reaction', 'value': '4'},
{'action_type': 'video_view', 'value': '5'},
{'action_type': 'post_engagement', 'value': '6'},
{'action_type': 'page_engagement', 'value': '7'}],
}
# API can send the following options at any time;
action_types = ["onsite_conversion.post_save", "link_click", "post", "post_reaction", "comment", "video_view", "post_engagement", "page_engagement"]
def get_action_values(insight):
    action_types = ["onsite_conversion.post_save", "link_click", "post", "post_reaction",
                    "comment", "video_view", "post_engagement", "page_engagement"]
    action_type_values = []
    for action in action_types:
        if action in [item["action_type"] for item in insight["actions"]]:
            for item in insight["actions"]:
                if action == item["action_type"]:
                    if "value" in item:
                        action_type_values.append(int(item["value"]))
                    else:
                        action_type_values.append("Null")
        else:
            action_type_values.append(0)
    return action_type_values
print(get_action_values(dct_json))
which prints the expected result:
[1, 2, 3, 4, 0, 5, 6, 7]
Prompted by Jamiu S.'s answer, which tries to optimize the code but generally does not work properly (as of 2023-02-05 01:06 CET), failing with a KeyError or not printing all the values, below is improved code that eliminates one loop and uses the dictionary.get(key, default_value) syntax to eliminate an if/else section.
The code includes changed input data to show that it covers the cases in which the code in Jamiu S.'s answer fails or does not work properly. Notice that the returned list items are all strings, to be consistent with 'Null' and the 'action_type' values:
dct_json = {'actions': [
{'action_type': 'onsite_conversion.post_save', 'value': '1'},
{'action_type': 'link_click', 'value': '2'},
{'action_type': 'page_engagement-no-value', 'no-value': '8'},
{'action_type': 'post', 'value': '3'},
{'action_type': 'post_reaction', 'value': '4'},
{'action_type': 'video_view', 'value': '5'},
{'action_type': 'post_engagement', 'value': '6'},
{'action_type': 'page_engagement', 'value': '7'},
{'action_type': 'post', 'value': '33'},
]}
action_types = ["onsite_conversion.post_save", "link_click", "post",
"post_reaction", "comment", "video_view", "post_engagement",
"page_engagement", "page_engagement-no-value"]
def get_action_values(insight, action_types):
    action_type_values = []
    for action in action_types:
        lstofmatchingitems = [item for item in insight["actions"] if item["action_type"] == action]
        if lstofmatchingitems:
            for item in lstofmatchingitems:
                action_type_values.append(item.get("value", "Null"))
        else:
            action_type_values.append("0")
    return action_type_values
print(get_action_values(dct_json, action_types))
prints
['1', '2', '3', '33', '4', '0', '5', '6', '7', 'Null']
I only found questions where people wanted to merge lists into dictionaries or merge dictionaries, but not ones about merging lists that are already inside a dictionary.
Let's say I have a dictionary with the following structure:
myDict= {
'key1': [{'description': 'some description', 'listwithstrings': ['somestring1'], 'number': '1'}, {'listwithstrings': ['somestring1', 'somestring2'], 'description': 'some other description', 'number': '1'}],
'key2': [{'listwithstrings': ['somestring4'], 'description': 'some different description', 'number': '2'}, {'number': '2', 'listwithstrings': ['somestring5'], 'description': 'some different description'}],
'key3': [{'number': '3', 'listwithstrings': ['somestring7', 'somestring8'], 'description': 'only one entry'}]
}
Now I want to merge the entries under each key separately and remove the duplicates. I don't know in advance whether a key has several entries (it can have more than two) or just one, so I can't use a condition like number==1.
The result should be:
myCleanedDict= {
'key1': [{'description': ['some description', 'some other description'], 'listwithstrings': ['somestring1', 'somestring2'], 'number': '1'}],
'key2': [{'listwithstrings': ['somestring4', 'somestring5'], 'description': 'some different description', 'number': '2'}],
'key3': [{'number': '3', 'listwithstrings': ['somestring7', 'somestring8'], 'description': 'only one entry'}]
}
myDict = {
'key1': [
{
'description': 'some description',
'listwithstrings': ['somestring1'],
'number': '1'
},
{
'listwithstrings': ['somestring1', 'somestring2'],
'description': 'some other description',
'number': '1'
}
],
'key2': [
{
'listwithstrings': ['somestring4'],
'description': 'some different description',
'number': '2'
},
{
'number': '2',
'listwithstrings': ['somestring5'],
'description': 'some different description'
}
],
'key3': [
{
'number': '3',
'listwithstrings': ['somestring7', 'somestring8'],
'description': 'only one entry'
}
]
}
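newDict = {}
for key, entries in myDict.items():
    if len(entries) == 0:
        continue
    # Use the first entry as the merge target and wrap scalar values in lists.
    target = entries[0]
    for field in target:
        if not isinstance(target[field], list):
            target[field] = [target[field]]
    # Merge every remaining entry into the target, field by field.
    for entry in entries[1:]:
        for field, value in entry.items():
            if isinstance(value, list):
                target[field] += value
            else:
                target[field].append(value)
            # Drop duplicates while keeping the first-seen order.
            target[field] = list(dict.fromkeys(target[field]))
    # Unwrap lists that ended up with a single element.
    for field in target:
        if len(target[field]) == 1:
            target[field] = target[field][0]
    newDict[key] = [target]
print(newDict)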
I have a list inside a nested dictionary
body = {'Ready Date': '2020-01-31T12:00:00','Shipment Line List': [{'Description': 'Test', 'Weigth': '5',
'Height': '4.0','Length': '2.0', 'Width': '3.0'}, {'Description': 'Test', 'Weigth': '20', 'Height': '5',
'Length': '30', 'Width': '10'}]}
I want to iterate over the keys in the nested dictionary and replace "Weigth" with the correct spelling "Weight"
I tried this approach, but I am not getting the expected output
key = {"Weigth":"Weight"}
def find_replace(dict_body, dictionary):
    # is the item in the dict?
    for item in dict_body:
        # iterate by keys
        if item in dictionary.keys():
            # look up and replace
            dict_body = dict_body.replace(item, dictionary[item])
    # return updated dict
    return dict_body
a = find_replace(body,key)
print(a)
I think a better idea in this particular case is to treat everything as a string, do the replacement, and convert the result back to a dictionary. If you have multiple nested keys, it might just be easier this way, in two lines of code:
from ast import literal_eval
body = literal_eval(str(body).replace("Weigth","Weight"))
This outputs:
{'Ready Date': '2020-01-31T12:00:00',
'Shipment Line List': [{'Description': 'Test',
'Height': '4.0',
'Length': '2.0',
'Weight': '5',
'Width': '3.0'},
{'Description': 'Test',
'Height': '5',
'Length': '30',
'Weight': '20',
'Width': '10'}]}
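One caveat of the string round trip is that str.replace touches every occurrence of the text, values included, and literal_eval only accepts Python literals. As an alternative, here is a minimal sketch of a recursive rename that only touches dictionary keys; rename_keys is a hypothetical helper name, not something from the answer above.

```python
def rename_keys(obj, renames):
    """Return a copy of obj with dictionary keys renamed at every nesting level."""
    if isinstance(obj, dict):
        return {renames.get(key, key): rename_keys(value, renames)
                for key, value in obj.items()}
    if isinstance(obj, list):
        return [rename_keys(value, renames) for value in obj]
    return obj  # strings, numbers, etc. are returned unchanged


body = {'Ready Date': '2020-01-31T12:00:00',
        'Shipment Line List': [
            {'Description': 'Test', 'Weigth': '5', 'Height': '4.0',
             'Length': '2.0', 'Width': '3.0'},
            {'Description': 'Test', 'Weigth': '20', 'Height': '5',
             'Length': '30', 'Width': '10'}]}

fixed = rename_keys(body, {"Weigth": "Weight"})
print(fixed)
```

Because the function builds new dictionaries and lists, the original body is left untouched.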
I want to iterate over the keys in the nested dictionary and replace "Weigth" with the correct spelling "Weight"
Something like the following:
body = {'Ready Date': '2020-01-31T12:00:00', 'Shipment Line List': [{'Description': 'Test', 'Weigth': '5',
'Height': '4.0', 'Length': '2.0', 'Width': '3.0'},
{'Description': 'Test', 'Weigth': '20',
'Height': '5',
'Length': '30', 'Width': '10'}]}
for entry in body['Shipment Line List']:
    entry['Weight'] = entry['Weigth']
    del entry['Weigth']
print(body)
output
{'Ready Date': '2020-01-31T12:00:00', 'Shipment Line List': [{'Description': 'Test', 'Height': '4.0', 'Length': '2.0', 'Width': '3.0', 'Weight': '5'}, {'Description': 'Test', 'Height': '5', 'Length': '30', 'Width': '10', 'Weight': '20'}]}
I need to get the value of each business's name and append it to a list.
I need to get the name of each policy and append it to a list after checking its parent:
if parent is Marketing, the name has to be added to level1;
if parent is Advertising, the name has to be added to level2.
If Business is [] somewhere, I need to pass None instead of an empty list.
I also need to check whether the keys exist, because policies and Business may be missing.
If a list contains the same element more than once, for example 'Business': ['Customer', 'Customer'], only one of them should be kept.
The dictionary is below:
searchtest = [
{'_index': 'newtest',
'_type': '_doc',
'_id': '100',
'_score': 1.0,
'_source': {'id': '100',
'name': 'A',
'policies': [
{
'id': '332',
'name': 'Second division',
'parent': 'Marketing'},
{'id': '3323', 'name':
'First division',
'parent': 'Marketing'}
]
}
},
{'_index': 'newtest',
'_type': '_doc',
'_id': '101',
'_score': 1.0,
'_source': {
'id': '101',
'name': 'B',
'Business': [{'id': '9'}, {'id': '10', 'name': 'Customer'}],
'policies': [{'id': '332', 'name': 'Second division', 'parent': 'Marketing'}, {'id': '3323', 'name': 'First division', 'parent': 'Advertising'}]}}]
Code is below
def business(searchtest):
    for el in searchtest:
        Business_List = []
        if 'Business' in el['_source']:
            for j in el['_source']['Business']:
                if 'name' in j:
                    Business_List.append(j['name'])
                else:
                    Business_List.extend([])
        return Business_List
def policy(searchtest):
    for el in searchtest:
        level1 = []
        if 'policies' in el['_source']:
            for j in el['_source']['policies']:
                if 'parent' in j:
                    if 'Marketing' in j['parent']:
                        level1.append(j['name'])
                    else:
                        level1.extend([])
        level2 = []
        if 'policies' in el['_source']:
            for j in el['_source']['policies']:
                if 'parent' in j:
                    if 'Advertising' in j['parent']:
                        level2.append(j['name'])
                    else:
                        level2.extend([])
    return [level1, level2]
def data_product(searchtest):
    resp = []
    for el in searchtest:
        d = {
            'id': el['_source']['id'],
            'name': el['_source']['name'],
            'Business': business(searchtest),
            'level1': policy(searchtest)[0],
            'level2': policy(searchtest)[1]
        }
    resp.append(d)
    return resp
if __name__ == "__main__":
    import pprint
    pp = pprint.PrettyPrinter(4)
    pp.pprint(data_product(searchtest))
My output
[ { 'Business': [],
'id': '101',
'level1': ['Second division'],
'level2': ['First division'],
'name': 'B'}]
Expected output
[ { 'Business': [],
'id': '100',
'level1': ['Second division','First division'],
'level2': [],
'name': 'A'},
{ 'Business': ['Customer'],
'id': '101',
'level1': ['Second division'],
'level2': ['First division'],
'name': 'B'}]
If resp.append(d) is put inside the loop, does only one id repeat?
My whole code with the change:
searchtest = [{'_index': 'newtest',
'_type': '_doc',
'_id': '100',
'_score': 1.0,
'_source': {'id': '100',
'name': 'A',
'policies': [{'id': '332',
'name': 'Second division',
'parent': 'Marketing'},
{'id': '3323', 'name': 'First division', 'parent': 'Marketing'}]}},
{'_index': 'newtest',
'_type': '_doc',
'_id': '101',
'_score': 1.0,
'_source': {'id': '101',
'name': 'B',
'Business': [{'id': '9'}, {'id': '10', 'name': 'Customer'}],
'policies': [{'id': '332',
'name': 'Second division',
'parent': 'Marketing'},
{'id': '3323', 'name': 'First division', 'parent': 'Advertising'}]}}]
def business(el):
    Business_List = []
    # for el in searchtest:
    if 'Business' in el['_source']:
        for j in el['_source']['Business']:
            if 'name' in j:
                Business_List.append(j['name'])
            else:
                Business_List.extend([])
    return Business_List
def policy(searchtest):
    for el in searchtest:
        level1 = []
        if 'policies' in el['_source']:
            for j in el['_source']['policies']:
                if 'parent' in j:
                    if 'Marketing' in j['parent']:
                        level1.append(j['name'])
                    else:
                        level1.extend([])
        level2 = []
        if 'policies' in el['_source']:
            for j in el['_source']['policies']:
                if 'parent' in j:
                    if 'Advertising' in j['parent']:
                        level2.append(j['name'])
                    else:
                        level2.extend([])
    return [level1, level2]
def data_product(searchtest):
    resp = []
    for el in searchtest:
        d = {
            'id': el['_source']['id'],
            'name': el['_source']['name'],
            'Business': business(el),
            'level1': policy(searchtest)[0],
            'level2': policy(searchtest)[1]
        }
        resp.append(d)
    return resp
if __name__ == "__main__":
    import pprint
    pp = pprint.PrettyPrinter(4)
    pp.pprint(data_product(searchtest))
output:
[ { 'Business': [],
'id': '100',
'level1': ['Second division'],
'level2': ['First division'],
'name': 'A'},
{ 'Business': ['Customer'],
'id': '101',
'level1': ['Second division'],
'level2': ['First division'],
'name': 'B'}]
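The output above still shows the policies of the last hit for every entry, because policy(searchtest) loops over the whole list each time it is called. A minimal sketch that reads each hit on its own follows. The helper names unique_names and summarize_hit are illustrative, the parent is compared with == instead of a substring test, and duplicate names are assumed to be dropped while keeping their first-seen order. Missing keys yield empty lists, as in the expected output of the question.

```python
def unique_names(items, parent=None):
    """Collect 'name' values in order, skipping duplicates and nameless items."""
    names = []
    for item in items:
        # When a parent is requested, ignore items that belong elsewhere.
        if parent is not None and item.get("parent") != parent:
            continue
        name = item.get("name")
        if name is not None and name not in names:
            names.append(name)
    return names


def summarize_hit(hit):
    """Build one result row from a single search hit."""
    source = hit["_source"]
    policies = source.get("policies", [])
    return {
        "id": source["id"],
        "name": source["name"],
        "Business": unique_names(source.get("Business", [])),
        "level1": unique_names(policies, parent="Marketing"),
        "level2": unique_names(policies, parent="Advertising"),
    }


searchtest = [
    {"_id": "100", "_source": {
        "id": "100", "name": "A",
        "policies": [
            {"id": "332", "name": "Second division", "parent": "Marketing"},
            {"id": "3323", "name": "First division", "parent": "Marketing"}]}},
    {"_id": "101", "_source": {
        "id": "101", "name": "B",
        "Business": [{"id": "9"}, {"id": "10", "name": "Customer"}],
        "policies": [
            {"id": "332", "name": "Second division", "parent": "Marketing"},
            {"id": "3323", "name": "First division", "parent": "Advertising"}]}},
]

result = [summarize_hit(hit) for hit in searchtest]
print(result)
```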
I'm trying to connect to my office's SmartSheet API via Python to create some performance tracking dashboards that utilize data outside of SmartSheet. All I want to do is create a simple DataFrame where fields reflect columnId and cell values reflect the displayValue key in the Smartsheet dictionary. I am doing this using a standard API requests.get rather than SmartSheet's API documentation because I've found the latter less easy to work with.
The table (sample) is set up as:
Number Letter Name
1 A Joe
2 B Jim
3 C Jon
The JSON syntax from the sheet GET request is:
{'id': 339338304219012,
'name': 'Sample Smartsheet',
'version': 1,
'totalRowCount': 3,
'accessLevel': 'OWNER',
'effectiveAttachmentOptions': ['GOOGLE_DRIVE',
'EVERNOTE',
'DROPBOX',
'ONEDRIVE',
'LINK',
'FILE',
'BOX_COM',
'EGNYTE'],
'ganttEnabled': False,
'dependenciesEnabled': False,
'resourceManagementEnabled': False,
'cellImageUploadEnabled': True,
'userSettings': {'criticalPathEnabled': False, 'displaySummaryTasks': True},
'userPermissions': {'summaryPermissions': 'ADMIN'},
'hasSummaryFields': False,
'permalink': 'https://app.smartsheet.com/sheets/5vxMCJQhMV7VFFPMVfJgg2hX79rj3fXgVGG8fp61',
'createdAt': '2020-02-13T16:32:02Z',
'modifiedAt': '2020-02-14T13:15:18Z',
'isMultiPicklistEnabled': True,
'columns': [{'id': 6273865019090820,
'version': 0,
'index': 0,
'title': 'Number',
'type': 'TEXT_NUMBER',
'primary': True,
'validation': False,
'width': 150},
{'id': 4022065205405572,
'version': 0,
'index': 1,
'title': 'Letter',
'type': 'TEXT_NUMBER',
'validation': False,
'width': 150},
{'id': 8525664832776068,
'version': 0,
'index': 2,
'title': 'Name',
'type': 'TEXT_NUMBER',
'validation': False,
'width': 150}],
'rows': [{'id': 8660990817003396,
'rowNumber': 1,
'expanded': True,
'createdAt': '2020-02-14T13:15:18Z',
'modifiedAt': '2020-02-14T13:15:18Z',
'cells': [{'columnId': 6273865019090820, 'value': 1.0, 'displayValue': '1'},
{'columnId': 4022065205405572, 'value': 'A', 'displayValue': 'A'},
{'columnId': 8525664832776068, 'value': 'Joe', 'displayValue': 'Joe'}]},
{'id': 498216492394372,
'rowNumber': 2,
'siblingId': 8660990817003396,
'expanded': True,
'createdAt': '2020-02-14T13:15:18Z',
'modifiedAt': '2020-02-14T13:15:18Z',
'cells': [{'columnId': 6273865019090820, 'value': 2.0, 'displayValue': '2'},
{'columnId': 4022065205405572, 'value': 'B', 'displayValue': 'B'},
{'columnId': 8525664832776068, 'value': 'Jim', 'displayValue': 'Jim'}]},
{'id': 5001816119764868,
'rowNumber': 3,
'siblingId': 498216492394372,
'expanded': True,
'createdAt': '2020-02-14T13:15:18Z',
'modifiedAt': '2020-02-14T13:15:18Z',
'cells': [{'columnId': 6273865019090820, 'value': 3.0, 'displayValue': '3'},
{'columnId': 4022065205405572, 'value': 'C', 'displayValue': 'C'},
{'columnId': 8525664832776068, 'value': 'Jon', 'displayValue': 'Jon'}]}]}
Here are the two ways I've approached the problem:
INPUT:
from pandas.io.json import json_normalize
samplej = sample.json()
s_rows = json_normalize(data=samplej['rows'], record_path='cells', meta=['id', 'rowNumber'])
s_rows
OUTPUT:
DataFrame with columnId, value, displayValue, id, and rowNumber as their own fields.
If I could figure out how to transpose this data in the right way I could probably make it work, but that seems incredibly complicated.
INPUT:
samplej = sample.json()
cellist = []
def get_cells():
    srows = samplej['rows']
    for s_cells in srows:
        scells = s_cells['cells']
        cellist.append(scells)
get_cells()
celldf = pd.DataFrame(cellist)
celldf
OUTPUT:
This returns a DataFrame with the correct number of columns and rows, but each cell is populated with a dictionary that looks like
In [14]:
celldf.loc[1,1]
Out [14]:
{'columnId': 4022065205405572, 'value': 'B', 'displayValue': 'B'}
If there was a way to remove everything except the value corresponding to the displayValue key in every cell, this would probably solve my problem. Again, though, it seems weirdly complicated.
I'm fairly new to Python and to working with APIs, so there may be a simple way to address the problem that I'm overlooking. Or, if you have a suggestion for approaching the possible solutions I outlined above, I'm all ears. Thanks for your help!
You must make use of the columns field:
colnames = {x['id']: x['title'] for x in samplej['columns']}
columns = [x['title'] for x in samplej['columns']]
cellist = [{colnames[scells['columnId']]: scells['displayValue']
for scells in s_cells['cells']} for s_cells in samplej['rows']]
celldf = pd.DataFrame(cellist, columns=columns)
This gives as expected:
Number Letter Name
0 1 A Joe
1 2 B Jim
2 3 C Jon
If some cells could contain only a columnId but no displayValue field, scells['displayValue'] should be replaced in above code with scells.get('displayValue', defaultValue), where defaultValue could be None, np.nan or any other relevant default.
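A minimal sketch of that variant, written without pandas so that it runs on its own. The function name rows_from_sheet and the trimmed-down sheet dictionary are illustrative stand-ins for the real response:

```python
def rows_from_sheet(sheet, default=None):
    """Turn a sheet dictionary into a list of {column title: displayValue} rows."""
    # Map each column id to its human-readable title.
    titles = {col["id"]: col["title"] for col in sheet["columns"]}
    return [
        {titles[cell["columnId"]]: cell.get("displayValue", default)
         for cell in row["cells"]}
        for row in sheet["rows"]
    ]


# Trimmed-down stand-in for the response shown in the question.
sheet = {
    "columns": [{"id": 1, "title": "Number"}, {"id": 2, "title": "Letter"}],
    "rows": [
        {"cells": [{"columnId": 1, "value": 1.0, "displayValue": "1"},
                   {"columnId": 2, "value": "A", "displayValue": "A"}]},
        {"cells": [{"columnId": 1, "value": 2.0, "displayValue": "2"},
                   {"columnId": 2}]},  # empty cell: no displayValue key
    ],
}

rows = rows_from_sheet(sheet)
print(rows)
```

The resulting list of dictionaries can be passed to pd.DataFrame in the same way as cellist above.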
I have a dictionary like this:
a = {'compatibility': {'schema': ['attribute_variables/evar44',
'event42',
'container_visitors'],
'status': 'valid',
'supported_features': ['function_and',
'function_attr',
'function_container',
'function_event',
'function_event-exists',
'function_streq'],
'supported_products': ['o', 'data_warehouse', 'discover'],
'supported_schema': ['warehouse', 'n'],
'validator_version': '1.1.11'},
'definition': {'container': {'context': 'visitors',
'func': 'container',
'pred': {'func': 'and',
'preds': [{'description': 'e42',
'evt': {'func': 'event', 'name': 'metrics/event42'},
'func': 'event-exists'},
{'description': 'v44',
'func': 'streq',
'str': '544',
'val': {'func': 'attr', 'name': 'variables/evar44'}}]}},
'func': 'segment',
'version': [1, 0, 0]},
'description': '',
'id': 's2165c30c946ebceb',
'modified': '12',
'name': 'Apop',
'owner': {'id': 84699, 'login': 'max', 'name': 'Max'},
'reportSuiteName': 'App',
'rsid': 'test',
'siteTitle': 'App',
'tags': []}
I would like to extract the values of every "description", "func", and "str"/"num" key and return them in one DataFrame.
I tried it with this code, but I wasn't able to get every value and struggled to put the values into one DataFrame.
def findkeys(node, kv):
    if isinstance(node, list):
        for i in node:
            for x in findkeys(i, kv):
                yield x
    elif isinstance(node, dict):
        if kv in node:
            yield node[kv]
        for j in node.values():
            for x in findkeys(j, kv):
                yield x
For my example the output I would like to have:
pd.DataFrame(np.array([['e42', 'event', 'NaN'], ['v44', 'streq', '544']]),
             columns=['description', 'func', 'str/num'])
The code below collects the values of the "interesting" keys into a dict.
from collections import defaultdict
a = {'compatibility': {'schema': ['attribute_variables/evar44',
'event42',
'container_visitors'],
'status': 'valid',
'supported_features': ['function_and',
'function_attr',
'function_container',
'function_event',
'function_event-exists',
'function_streq'],
'supported_products': ['o', 'data_warehouse', 'discover'],
'supported_schema': ['warehouse', 'n'],
'validator_version': '1.1.11'},
'definition': {'container': {'context': 'visitors',
'func': 'container',
'pred': {'func': 'and',
'preds': [{'description': 'e42',
'evt': {'func': 'event', 'name': 'metrics/event42'},
'func': 'event-exists'},
{'description': 'v44',
'func': 'streq',
'str': '544',
'val': {'func': 'attr', 'name': 'variables/evar44'}}]}},
'func': 'segment',
'version': [1, 0, 0]},
'description': '',
'id': 's2165c30c946ebceb',
'modified': '12',
'name': 'Apop',
'owner': {'id': 84699, 'login': 'max', 'name': 'Max'},
'reportSuiteName': 'App',
'rsid': 'test',
'siteTitle': 'App',
'tags': []}
def walk_dict(d, interesting_keys, result, depth=0):
    for k, v in sorted(d.items(), key=lambda x: x[0]):
        if isinstance(v, dict):
            walk_dict(v, interesting_keys, result, depth + 1)
        elif isinstance(v, list):
            for entry in v:
                if isinstance(entry, dict):
                    walk_dict(entry, interesting_keys, result, depth + 1)
        else:
            if k in interesting_keys:
                result[k].append(v)
result = defaultdict(list)
walk_dict(a, ["description", "func", "str", "num"], result)
print(result)
output
defaultdict(<class 'list'>, {'func': ['container', 'and', 'event', 'event-exists', 'streq', 'attr', 'segment'], 'description': ['e42', 'v44', ''], 'str': ['544']})
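To get from these per-key lists to one row per predicate, one option is to record a row for every dictionary that carries a non-empty "description" and to read the other keys from that same dictionary. This is only a sketch: collect_rows is a hypothetical helper, the segment dictionary is a shortened version of the question's data, and the row reports the predicate's own "func" ('event-exists' for e42), not the nested 'event' shown in the question's desired output.

```python
def collect_rows(node, rows=None):
    """Walk nested dicts/lists and record one row per described dictionary."""
    if rows is None:
        rows = []
    if isinstance(node, dict):
        if node.get("description"):  # skips missing and empty descriptions
            rows.append({
                "description": node["description"],
                "func": node.get("func"),
                # "str" and "num" are alternatives; take whichever is present.
                "str/num": node.get("str", node.get("num")),
            })
        for value in node.values():
            collect_rows(value, rows)
    elif isinstance(node, list):
        for value in node:
            collect_rows(value, rows)
    return rows


segment = {
    "description": "",
    "definition": {"container": {"func": "container", "pred": {"func": "and", "preds": [
        {"description": "e42", "func": "event-exists",
         "evt": {"func": "event", "name": "metrics/event42"}},
        {"description": "v44", "func": "streq", "str": "544",
         "val": {"func": "attr", "name": "variables/evar44"}},
    ]}}},
}

rows = collect_rows(segment)
print(rows)
```

The list of row dictionaries can then be handed to pd.DataFrame(rows) to obtain the columns description, func and str/num.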