How to split JSON fields containing '_' into sub-objects? - python

I have the following JSON object, in which I need to post-process some labels:
{
'id': '123',
'type': 'A',
'fields':
{
'device_safety':
{
'cost': 0.237,
'total': 22
},
'device_unit_replacement':
{
'cost': 0.262,
'total': 7
},
'software_generalinfo':
{
'cost': 3.6,
'total': 10
}
}
}
I need to split the names of labels by _ to get the following hierarchy:
{
'id': '123',
'type': 'A',
'fields':
{
'device':
{
'safety':
{
'cost': 0.237,
'total': 22
},
'unit':
{
'replacement':
{
'cost': 0.262,
'total': 7
}
}
},
'software':
{
'generalinfo':
{
'cost': 3.6,
'total': 10
}
}
}
}
This is my current version, but I got stuck and am not sure how to deal with the hierarchy of fields:
import json
json_object = json.load(raw_json)
newjson = {}
for x, y in json_object['fields'].items():
    hierarchy = y.split("_")
    if len(hierarchy) > 1:
        for k in hierarchy:
            newjson[k] = ????
newjson = json.dumps(newjson, indent = 4)

Here is a recursive function that processes a dict and splits its keys on '_':
def splitkeys(dct):
    # Non-dict values (numbers, strings, ...) are returned unchanged.
    if not isinstance(dct, dict):
        return dct
    new_dct = {}
    for k, v in dct.items():
        bits = k.split('_')
        d = new_dct
        # Walk (or create) one nested dict per leading part of the key.
        for bit in bits[:-1]:
            d = d.setdefault(bit, {})
        d[bits[-1]] = splitkeys(v)
    return new_dct
>>> splitkeys(json_object)
{'fields': {'device': {'safety': {'cost': 0.237, 'total': 22},
'unit': {'replacement': {'cost': 0.262, 'total': 7}}},
'software': {'generalinfo': {'cost': 3.6, 'total': 10}}},
'id': '123',
'type': 'A'}
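Note that splitkeys splits keys at every level of the input, not only under 'fields'. If only the labels under 'fields' should be restructured, a variant along the following lines may be enough. This is a minimal sketch rather than part of the original answer: the name split_keys, the sep parameter, and the trimmed record sample are assumptions made for illustration, and it assumes no label is both a leaf and a prefix of another label (for example 'device' next to 'device_safety').

```python
def split_keys(dct, sep="_"):
    """Return a copy of dct in which keys containing sep become nested dicts."""
    if not isinstance(dct, dict):
        return dct
    new_dct = {}
    for key, value in dct.items():
        *parents, leaf = key.split(sep)
        node = new_dct
        for part in parents:
            # Reuse the sub-dict if an earlier key already created it.
            node = node.setdefault(part, {})
        node[leaf] = split_keys(value, sep)
    return new_dct


# Trimmed sample record (assumed for illustration).
record = {
    "id": "123",
    "type": "A",
    "fields": {
        "device_safety": {"cost": 0.237, "total": 22},
        "device_unit_replacement": {"cost": 0.262, "total": 7},
        "software_generalinfo": {"cost": 3.6, "total": 10},
    },
}

# Restructure only the 'fields' sub-dict; top-level keys are left untouched.
result = {**record, "fields": split_keys(record["fields"])}
```

Restricting the call to record["fields"] keeps unrelated keys that happen to contain an underscore from being split.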

Related

Fastest way to get specific key from a dict if it is found

I am currently writing a scraper that reads from an API that returns JSON. Calling response.json() gives back a dict, so we can use e.g. response["object"] to get the value we want. The current mock data looks like this:
data = {
'id': 336461,
'thumbnail': '/images/product/123456?trim&h=80',
'variants': None,
'name': 'Testing',
'data': {
'Videoutgång': {
'Typ av gränssnitt': {
'name': 'Typ av gränssnitt',
'value': 'PCI Test'
}
}
},
'stock': {
'web': 0,
'supplier': None,
'displayCap': '50',
'1': 0,
'orders': {
'CL': {
'ordered': -10,
'status': 1
}
}
}
}
The API response sometimes contains "orders -> CL" and sometimes doesn't. I need to handle both the happy path and the unhappy path, and I am looking for the fastest way to get a value from a dict.
So far I have done something like this:
data = {
'id': 336461,
'thumbnail': '/images/product/123456?trim&h=80',
'variants': None,
'name': 'Testing',
'data': {
'Videoutgång': {
'Typ av gränssnitt': {
'name': 'Typ av gränssnitt',
'value': 'PCI Test'
}
}
},
'stock': {
'web': 0,
'supplier': None,
'displayCap': '50',
'1': 0,
'orders': {
'CL': {
'ordered': -10,
'status': 1
}
}
}
}
if (
    "stock" in data
    and "orders" in data["stock"]
    and "CL" in data["stock"]["orders"]
    and "status" in data["stock"]["orders"]["CL"]
    and data["stock"]["orders"]["CL"]["status"]
):
    print(f'{data["stock"]["orders"]["CL"]["status"]}: {data["stock"]["orders"]["CL"]["ordered"]}')
1: -10
However, I would like to know: what is the fastest way to get the data from a dict if it is present?
Dictionary lookups are fast because Python implements dicts as hash tables.
In Big O terms, a key lookup in a dict takes constant time, O(1), on average, regardless of how many keys the dict holds. Here is another approach using the .get() method:
data = {
'id': 336461,
'thumbnail': '/images/product/123456?trim&h=80',
'variants': None,
'name': 'Testing',
'data': {
'Videoutgång': {
'Typ av gränssnitt': {
'name': 'Typ av gränssnitt',
'value': 'PCI Test'
}
}
},
'stock': {
'web': 0,
'supplier': None,
'displayCap': '50',
'1': 0,
'orders': {
'CL': {
'ordered': -10,
'status': 1
}
}
}
}
if (data.get('stock', {}).get('orders', {}).get('CL')):
    print(f'{data["stock"]["orders"]["CL"]["status"]}: {data["stock"]["orders"]["CL"]["ordered"]}')
Here is a nice writeup on lookups in Python with list and dictionary as example.
I see your point. Since your stock dict has only a handful of keys, it is hard to say whether the .get() method will be measurably faster than looping over the keys. With many more items, .get() would be much faster, because a hash lookup does not scan the keys one by one; with so few keys, using a loop will not make much difference.
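The nested lookup discussed in this thread can also be written so that a missing key, or a None in the middle of the path, falls through to a default. The sketch below shows two common styles: a small helper that walks the keys, and a try/except version. The names deep_get and status_line, and the two trimmed sample dicts, are hypothetical and not from the original posts; no claim is made here about which style is faster for a given payload.

```python
def deep_get(mapping, *keys, default=None):
    """Follow keys through nested dicts; return default if any step is missing."""
    node = mapping
    for key in keys:
        # Stop as soon as the current level is not a dict or lacks the key.
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def status_line(data):
    """Return 'status: ordered' for orders -> CL, or None if it is absent."""
    try:
        cl = data["stock"]["orders"]["CL"]
        return f'{cl["status"]}: {cl["ordered"]}'
    except (KeyError, TypeError):
        # KeyError: a key is missing; TypeError: an intermediate value is None.
        return None


# Trimmed sample payloads (assumed for illustration).
with_cl = {"stock": {"web": 0, "orders": {"CL": {"ordered": -10, "status": 1}}}}
without_cl = {"stock": {"web": 0, "orders": None}}

happy = status_line(with_cl)
unhappy = status_line(without_cl)
cl_status = deep_get(with_cl, "stock", "orders", "CL", "status")
missing = deep_get(without_cl, "stock", "orders", "CL", "status", default=0)
```

The try/except form does no extra checks when the keys are present, while deep_get keeps the call site short when many different paths have to be read.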

Python Streamlit select box menu returns string, but I need dict or list

I am stuck on this case: the Streamlit selectbox returns a string, but I need a dict or list to use further in my code.
I want to see company1, company2, company3 in the dropdown menu, and if the user chooses, for example, 'company2', get {'ID': 'zxc222', 'NAME': 'company2', 'DESC': 'comp2'}.
BaseObject = [{
'ID': 'zxc123',
'NAME': 'company1',
'DESC': 'comp1'
}, {
'ID': 'zxc222',
'NAME': 'company2',
'DESC': 'comp2'
}, {
'ID': 'zxc345',
'NAME': 'company3',
'DESC': 'comp3'
}]
lenbo = len(BaseObject)
options = []
for i in range(0, lenbo):
    options.append((BaseObject[i])['NAME'])
st.selectbox('Subdivision:', options)
You can do the conversion to a dict after the selectbox:
import streamlit as st
BaseObject = [{
'ID': 'zxc123',
'NAME': 'company1',
'DESC': 'comp1'
}, {
'ID': 'zxc222',
'NAME': 'company2',
'DESC': 'comp2'
}, {
'ID': 'zxc345',
'NAME': 'company3',
'DESC': 'comp3'
}]
lenbo = len(BaseObject)
options = []
for i in range(0, lenbo):
    options.append((BaseObject[i])['NAME'])
choice = st.selectbox('Subdivision:', options)
# Find the dict whose NAME matches the selected string.
chosen_base_object = None
for base_object in BaseObject:
    if base_object["NAME"] == choice:
        chosen_base_object = dict(base_object)
print(chosen_base_object)  # e.g. {'ID': 'zxc345', 'NAME': 'company3', 'DESC': 'comp3'} when 'company3' is selected
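The same conversion can be done without a loop per selection by indexing the list by NAME once. The sketch below is plain Python so it runs without Streamlit; the names companies, by_name and resolve_choice are hypothetical, and it assumes the NAME values are unique. In the app itself, the choice argument would be the string returned by st.selectbox('Subdivision:', options).

```python
companies = [
    {"ID": "zxc123", "NAME": "company1", "DESC": "comp1"},
    {"ID": "zxc222", "NAME": "company2", "DESC": "comp2"},
    {"ID": "zxc345", "NAME": "company3", "DESC": "comp3"},
]

# Build the lookup table once; assumes every NAME appears only once.
by_name = {company["NAME"]: company for company in companies}

# The dropdown options are simply the keys of the lookup table.
options = list(by_name)


def resolve_choice(choice):
    """Map the string returned by the dropdown back to the full dict."""
    return by_name.get(choice)


selected = resolve_choice("company2")
```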

How do I access a specific value in a nested Python dictionary?

I am trying to figure out how to filter for the dictionaries that have a status of "awaiting_delivery". I am not sure how to do this (or if it is impossible). I am new to python and programming. I am using Python 3.8.5 on VS Code on Ubuntu 20.04. The data below is sample data that I created that resembles json data from an API. Any help on how to filter for "status" would be great. Thank you.
nested_dict = {
'list_data': [
{
'id': 189530,
'total': 40.05,
'user_data': {
'id': 1001,
'first_name': 'jane',
'last_name': 'doe'
},
'status': 'future_delivery'
},
{
'id': 286524,
'total': 264.89,
'user_data': {
'id': 1002,
'first_name': 'john',
'last_name': 'doe'
},
'status': 'awaiting_delivery'
},
{
'id': 368725,
'total': 1054.98,
'user_data': {
'id': 1003,
'first_name': 'chris',
'last_name': 'nobody'
},
'status': 'awaiting_delivery'
},
{
'id': 422955,
'total': 4892.78,
'user_data': {
'id': 1004,
'first_name': 'mary',
'last_name': 'madeup'
},
'status': 'future_delivery'
}
],
'current_page': 1,
'total': 2,
'first': 1,
'last': 5,
'per_page': 20
}
#confirm that nested_dict is a dictionary
print(type(nested_dict))
#create a list(int_list) from the nested_dict dictionary
int_list = nested_dict['list_data']
#confirm that int_list is a list
print(type(int_list))
#create the int_dict dictionary from the int_list list
for int_dict in int_list:
    print(int_dict)
#this is my attempt at filtering the int_dict dictionary for all orders with a status of awaiting_delivery
for order in int_dict:
    int_dict.get('status')
    print(order)
Output from Terminal Follows:
<class 'dict'>
<class 'list'>
{'id': 189530, 'total': 40.05, 'user_data': {'id': 1001, 'first_name': 'jane', 'last_name': 'doe'}, 'status': 'future_delivery'}
{'id': 286524, 'total': 264.89, 'user_data': {'id': 1002, 'first_name': 'john', 'last_name': 'doe'}, 'status': 'awaiting_delivery'}
{'id': 368725, 'total': 1054.98, 'user_data': {'id': 1003, 'first_name': 'chris', 'last_name': 'nobody'}, 'status': 'awaiting_delivery'}
{'id': 422955, 'total': 4892.78, 'user_data': {'id': 1004, 'first_name': 'mary', 'last_name': 'madeup'}, 'status': 'future_delivery'}
id
total
user_data
status
You can obtain a filtered list of dicts by using a conditional list comprehension on your list of dicts:
# filter the data
list_data_filtered = [entry for entry in nested_dict['list_data']
                      if entry['status'] == 'awaiting_delivery']
# print out the results
for entry in list_data_filtered:
    print(entry)
# results
# {'id': 286524, 'total': 264.89, 'user_data': {'id': 1002, 'first_name': 'john', 'last_name': 'doe'}, 'status': 'awaiting_delivery'}
# {'id': 368725, 'total': 1054.98, 'user_data': {'id': 1003, 'first_name': 'chris', 'last_name': 'nobody'}, 'status': 'awaiting_delivery'}
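If some entries returned by the API might lack a 'status' key, entry['status'] would raise a KeyError. The sketch below wraps the same list comprehension in a small function that uses .get() instead. The function name filter_by_status and the shortened orders sample (including the entry without a status) are assumptions for illustration.

```python
def filter_by_status(orders, status):
    """Return the orders whose 'status' equals status; entries without one are skipped."""
    return [order for order in orders if order.get("status") == status]


# Shortened sample data (assumed for illustration).
orders = [
    {"id": 189530, "total": 40.05, "status": "future_delivery"},
    {"id": 286524, "total": 264.89, "status": "awaiting_delivery"},
    {"id": 368725, "total": 1054.98, "status": "awaiting_delivery"},
    {"id": 422955, "total": 4892.78},  # no status key at all
]

awaiting = filter_by_status(orders, "awaiting_delivery")
awaiting_ids = [order["id"] for order in awaiting]
```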

How to iterate over a JSON array and get values for a key which itself is a JSON object

I have been trying to do something that should be simple, yet I find it hard to solve.
I have a JSON object that looks like:
jsonObject = {
'attributes': {
'192': {  # <--- this key can change from time to time (a different number)
'id': '192',
'code': 'hello',
'label': 'world',
'options': [
{
'id': '211',
'label': '5'
},
{
'id': '1202',
'label': '8.5'
},
{
'id': '54',
'label': '9'
},
{
'id': '1203',
'label': '9.5'
},
{
'id': '58',
'label': '10'
}
]
}
},
'template': '12345',
'basePrice': '51233',
'oldPrice': '51212',
'productId': 'hello',
}
What I want to do is get the values from options (saving both id and label into a list).
So far I have only managed to do this:
for att, value in jsonObject.items():
    print(f"{att} - {value}")
How can I get the label and id?
You can try the following code:
attr = jsonObject['attributes']
temp = list(attr.values())[0]  # Same as "temp = attr['192']", but you said '192' can change.
options = temp['options']
for option in options:
    print(f"id: {option['id']}, label: {option['label']}")
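Since the question asks for the ids and labels to be saved into a list, the loop can be turned into a comprehension that collects (id, label) pairs. This is a sketch under the assumption that every attribute under 'attributes' should be included, whatever its key is. The names json_object and pairs, and the shortened sample, are hypothetical.

```python
# Shortened sample (assumed for illustration); the '192' key may differ.
json_object = {
    "attributes": {
        "192": {
            "id": "192",
            "code": "hello",
            "label": "world",
            "options": [
                {"id": "211", "label": "5"},
                {"id": "1202", "label": "8.5"},
                {"id": "54", "label": "9"},
            ],
        }
    },
    "template": "12345",
}

# Iterate over the attribute values so the changing key never has to be known.
pairs = [
    (option["id"], option["label"])
    for attribute in json_object["attributes"].values()
    for option in attribute.get("options", [])
]
```

Iterating over .values() avoids both hard-coding '192' and assuming that there is exactly one attribute.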

mongodb python get the element position from an array in a document

I use python + mongodb to store some item ranking data in a collection called chart
{
date: date1,
region: region1,
ranking: [
{
item: bson.dbref.DBRef(db.item.find_one()),
price: current_price,
version: '1.0'
},
{
item: bson.dbref.DBRef(db.item.find_another_one()),
price: current_price,
version: '1.0'
},
.... (and the array goes on)
]
}
Now my problem is, I want to make a history ranking chart for itemA. And according to the $ positional operator, the query should be something like this:
db.chart.find( {'ranking.item': bson.dbref.DBRef('item', itemA._id)}, ['$'])
And the $ operator doesn't work.
Any other possible solution to this?
The $ positional operator is only used in update(...) calls; you can't use it to return the position within an array.
However, you can use field projection to limit the fields returned to just those you need to calculate the position in the array from within Python:
db.foo.insert({
'date': '2011-04-01',
'region': 'NY',
'ranking': [
{ 'item': 'Coca-Cola', 'price': 1.00, 'version': 1 },
{ 'item': 'Diet Coke', 'price': 1.25, 'version': 1 },
{ 'item': 'Diet Pepsi', 'price': 1.50, 'version': 1 },
]})
db.foo.insert({
'date': '2011-05-01',
'region': 'NY',
'ranking': [
{ 'item': 'Diet Coke', 'price': 1.25, 'version': 1 },
{ 'item': 'Coca-Cola', 'price': 1.00, 'version': 1 },
{ 'item': 'Diet Pepsi', 'price': 1.50, 'version': 1 },
]})
db.foo.insert({
'date': '2011-06-01',
'region': 'NY',
'ranking': [
{ 'item': 'Coca-Cola', 'price': 1.00, 'version': 1 },
{ 'item': 'Diet Pepsi', 'price': 1.50, 'version': 1 },
{ 'item': 'Diet Coke', 'price': 1.25, 'version': 1 },
]})
def position_of(item, ranking):
    # Return the index of the first entry whose 'item' matches, or None.
    for i, candidate in enumerate(ranking):
        if candidate['item'] == item:
            return i
    return None
print([position_of('Diet Coke', x['ranking'])
       for x in db.foo.find({'ranking.item': 'Diet Coke'}, ['ranking.item'])])
# prints [1, 0, 2]
In this (admittedly trivial) example, returning just a subset of fields may not show much benefit; however, if your documents are especially large, doing so may show performance improvements.
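To build the ranking history the question asks about, the position can be paired with each document's date. The sketch below works on plain dicts so it runs without a MongoDB server; the charts list stands in for the documents a query would return, and the names charts and history are hypothetical. With a real collection, the projection would also need to include the date field.

```python
def position_of(item, ranking):
    """Return the index of item within ranking, or None if it is absent."""
    return next(
        (i for i, candidate in enumerate(ranking) if candidate["item"] == item),
        None,
    )


# Stand-in for the documents returned by the query (assumed for illustration).
charts = [
    {"date": "2011-04-01", "ranking": [{"item": "Coca-Cola"}, {"item": "Diet Coke"}, {"item": "Diet Pepsi"}]},
    {"date": "2011-05-01", "ranking": [{"item": "Diet Coke"}, {"item": "Coca-Cola"}, {"item": "Diet Pepsi"}]},
    {"date": "2011-06-01", "ranking": [{"item": "Coca-Cola"}, {"item": "Diet Pepsi"}, {"item": "Diet Coke"}]},
]

# One (date, zero-based position) pair per chart document.
history = [(doc["date"], position_of("Diet Coke", doc["ranking"])) for doc in charts]
```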
