Python - How to assign/map non-sequential JSON fields onto a dict - python

I have JSON in which the list of keys varies: not every key is always present, and the keys do not always sit at the same position. For example, "producers" is not always at array index [2] and "directors" is not always at [1]; it depends entirely on the JSON I pass into my function. Depending on what is listed in ['plist']['dict']['key'], the content is mapped to array entries 0, 1, 2, 3 (except for studio) ...
How can I find the corresponding array entry for cast, directors, producers etc. when each of them is not always located at the same array index?
In the end I always want to pull out the right data for the right field, even though ['plist']['dict']['key'] may vary.
...
def get_plist_meta(element):
    if isinstance(element, dict):
        return element["string"]
    return ", ".join(i["string"] for i in element)
...
### Default map if all fields are present
# 0 = cast
# 1 = directors
# 2 = producers
# 3 = screenwriters
plist_metadata = json.loads(dump_json)
### make fields match the given sequence 0 = cast, 1 = directors etc.
if 'cast' in plist_metadata['plist']['dict']['key']:
    print("Cast: ", get_plist_meta(plist_metadata['plist']['dict']['array'][0]['dict']))
if 'directors' in plist_metadata['plist']['dict']['key']:
    print("Directors: ", get_plist_meta(plist_metadata['plist']['dict']['array'][1]['dict']))
if 'producers' in plist_metadata['plist']['dict']['key']:
    print("Producers: ", get_plist_meta(plist_metadata['plist']['dict']['array'][2]['dict']))
if 'screenwriters' in plist_metadata['plist']['dict']['key']:
    print("Screenwriters: ", get_plist_meta(plist_metadata['plist']['dict']['array'][3]['dict']))
if 'studio' in plist_metadata['plist']['dict']['key']:
    print("Studio: ", plist_metadata['plist']['dict']['string'])
JSON:
{
    "plist": {
        "#version": "1.0",
        "dict": {
            "key": [
                "cast",
                "directors",
                "screenwriters",
                "studio"
            ],
            "array": [
                {
                    "dict": [
                        {
                            "key": "name",
                            "string": "Martina Piro"
                        },
                        {
                            "key": "name",
                            "string": "Ralf Stark"
                        }
                    ]
                },
                {
                    "dict": {
                        "key": "name",
                        "string": "Franco Camilio"
                    }
                },
                {
                    "dict": {
                        "key": "name",
                        "string": "Kai Meisner"
                    }
                }
            ],
            "string": "Helix Films"
        }
    }
}
JSON can also be obtained here: https://pastebin.com/JCXRs3Rw
Thanks in advance

If you prefer a more Pythonic solution, try this:
# We will use this function to extract the names from the subdicts. Single items are wrapped in a new list so the result is consistent, no matter how many names there were.
def get_names(name_dict):
    arrayfied = name_dict if isinstance(name_dict, list) else [name_dict]
    return [o["string"] for o in arrayfied]
# Make an iterable of (key, array entry) tuples; the variable is named plist_dict to avoid shadowing the built-in dict
plist_dict = plist_metadata['plist']['dict']
zipped = zip(plist_dict["key"], plist_dict["array"])
# Get the names from the subdicts and put them into a new dict
result = {k: get_names(v["dict"]) for k, v in zipped}
This will give you a new dict that looks like this:
{'cast': ['Martina Piro', 'Ralf Stark'], 'directors': ['Franco Camilio'], 'screenwriters': ['Kai Meisner']}
The new dict will only have the keys present in the original dict. Note that zip stops at the shorter input, so "studio", which has no array entry, is dropped here.
I'd advise checking out things like zip, map and so on, as well as list comprehensions and dict comprehensions.
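One caveat with pairing keys and array entries by position: it only lines up when scalar keys such as "studio" come last. A minimal sketch of one way around that is below. It assumes that "studio" is the only key stored in the top-level "string" field rather than in "array", and the names map_plist_fields and scalar_keys are made up for illustration.

```python
def map_plist_fields(plist_dict, scalar_keys=("studio",)):
    """Pair each list-valued key with its array entry; scalar keys read 'string'."""
    keys = plist_dict["key"]
    arrays = plist_dict.get("array", [])
    if isinstance(arrays, dict):
        # Assumption: a single array entry may arrive as a bare dict
        arrays = [arrays]
    # Drop scalar keys before pairing so positions line up with the array
    list_keys = [k for k in keys if k not in scalar_keys]
    result = {}
    for key, entry in zip(list_keys, arrays):
        people = entry["dict"]
        if isinstance(people, dict):
            people = [people]
        result[key] = [p["string"] for p in people]
    for key in scalar_keys:
        if key in keys:
            result[key] = plist_dict.get("string")
    return result


# Hypothetical input where "studio" sits in the middle of the key list
sample = {
    "key": ["cast", "studio", "directors"],
    "array": [
        {"dict": [{"key": "name", "string": "Martina Piro"},
                  {"key": "name", "string": "Ralf Stark"}]},
        {"dict": {"key": "name", "string": "Franco Camilio"}},
    ],
    "string": "Helix Films",
}
fields = map_plist_fields(sample)
print(fields)
```

With the real data you would call it as map_plist_fields(plist_metadata['plist']['dict']).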

I think this solves your problem:
import json
dump_json = """{"plist":{"#version":"1.0","dict":{"key":["cast","directors","screenwriters","studio"],"array":[{"dict":[{"key":"name","string":"Martina Piro"},{"key":"name","string":"Ralf Stark"}]},{"dict":{"key":"name","string":"Franco Camilio"}},{"dict":{"key":"name","string":"Kai Meisner"}}],"string":"Helix Films"}}}"""
plist_metadata = json.loads(dump_json)
roles = ['cast', 'directors', 'producers', 'screenwriters'] # all roles
names = {'cast': [], 'directors': [], 'producers': [], 'screenwriters': []} # stores the final output
j = 0 # keeps count of which array entry we are looking at in plist_metadata['plist']['dict']['array']
for x in roles: # cycle through all the possible roles
    if x in plist_metadata['plist']['dict']['key']: # if a role exists in the keys, we'll store it in names[role_name]
        y = plist_metadata['plist']['dict']['array'][j]['dict'] # keep track of value
        if isinstance(y, dict): # if it's a dict, encase it in a list
            y = [y]
        j += 1 # add to our plist-dict-array index
        names[x] = list(map(lambda entry: entry['string'], y)) # map each of the entries from {"key":"name","string":"Martina Piro"} to just "Martina Piro"
print(names)
def list_names(role_name):
    if role_name not in names.keys():
        return f'Invalid list request: Role name "{role_name}" not found.'
    return f'{role_name.capitalize()}: {", ".join(names[role_name])}'
print(list_names('cast'))
print(list_names('audience'))
Output:
{'cast': ['Martina Piro', 'Ralf Stark'], 'directors': ['Franco Camilio'], 'producers': [], 'screenwriters': ['Kai Meisner']}
Cast: Martina Piro, Ralf Stark
Invalid list request: Role name "audience" not found.

Related

How to select a certain key from a dictionary in a list of another list?

I have a nested JSON-file that looks like this:
[
    {
        "IsRecentlyVerified": true,
        "AddressInfo": {
            "Town": "Haarlem"
        },
        "Connections": [
            {
                "PowerKW": 17,
                "Quantity": 1
            }
        ],
        "NumberOfPoints": 1
    },
    {
        "IsRecentlyVerified": true,
        "AddressInfo": {
            "Town": "Haarlem"
        },
        "Connections": [
            {
                "PowerKW": 17,
                "Quantity": 1
            },
            {
                "PowerKW": 17,
                "Quantity": 1
            }
        ],
        "NumberOfPoints": 1
    }
]
As you can see, the list of this JSON-file consists of two dictionaries that each contains another list (= "Connections") that consists of at least one dictionary. In each dictionary of this JSON-file, I want to select all keys named "Quantity" to make a sum with its value (so in the example code above, I want to calculate that there are 3 Quantities in total).
The key "Quantity" is not always present, which is why I used in to check for it. I noticed that it now only finds the key "Quantity" when I specify the index, like this: if "Quantity" in ev_list[info]["Connections"][0]
def amountOfChargingStations():
    totalAmountOfChargingPolesInCity = 0
    for info in range(len(ev_list)):
        if "Town" in ev_list[info]["AddressInfo"]:
            if ev_list[info]["AddressInfo"]["Town"] == "Haarlem":
                totalAmountOfChargingStationsInCity = totalAmountOfChargingStationsInCity + 1
                if "Quantity" in ev_list[info]["Connections"][0]:
                    if ev_list[info]["Connections"]:
                        for pole in ev_list[info]["Connections"]:
                            totalAmountOfChargingPolesInCity = totalAmountOfChargingPolesInCity + pole["Quantity"]
                    else:
                        print("Can't find connection")
    print("There are at least", totalAmountOfChargingPolesInCity, "charging poles available.")
polesAndStations = amountOfChargingStations()
The problem is that it now only uses the first dictionary of each "Connections" list to make the sum. How can I select all keys named "Quantity" to make this sum, without knowing the total number of dictionaries in each "Connections" list? (The total varies from 1 up to more than 10.) Is there something like [0:end]?
As a one-liner:
total_quantity = sum(con['Quantity'] for dataset in data for con in dataset.get('Connections', []) if 'Quantity' in con)
given that data is your imported JSON.
EDIT: sorry, I did not read your code carefully enough.
Actually, you do not need to make it so complicated with the for loop over a range; it sounds like you are coming from another programming language.
With
for info in ev_list:
    ...
you already get the element itself and can change ev_list[info] to info.
Also, did you get totalAmountOfChargingStationsInCity from somewhere else? As written, it should raise an UnboundLocalError (local variable referenced before assignment).
I am still a fan of one-liners and list comprehensions, so this would do the trick for me:
def amountOfChargingStations():
    total_amount_of_charging_poles_in_city = 0
    total_amount_of_charging_stations_in_city = 0
    for info in ev_list:
        if "Town" in info["AddressInfo"]:
            if info["AddressInfo"]["Town"] == "Haarlem":
                total_amount_of_charging_stations_in_city = total_amount_of_charging_stations_in_city + 1
                # Default to 0 so a connection without "Quantity" does not break sum()
                total_amount_of_charging_poles_in_city += sum(
                    [con.get('Quantity', 0) for con in info.get('Connections', [])])
    print("There are at least", total_amount_of_charging_poles_in_city, "charging poles available.")
EDIT2: sorry, my mistake, changed the comprehension a bit.
dictionary.get('key', 'default if key is not in dictionary') is a safer way to call something from a dictionary.
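As a small illustration of that point, the sketch below compares .get with and without a default. The connection entries are made-up sample data, not taken from the real file.

```python
# Hypothetical entry that has no "Quantity" key
connection = {"PowerKW": 17}

# connection["Quantity"] would raise KeyError here
quantity = connection.get("Quantity", 0)   # falls back to the default, 0
missing = connection.get("Quantity")       # no default given: returns None

# With a numeric default, entries lacking the key simply contribute nothing
connections = [{"PowerKW": 17, "Quantity": 1}, {"PowerKW": 17}, {"Quantity": 2}]
total = sum(con.get("Quantity", 0) for con in connections)

print(quantity, missing, total)  # prints: 0 None 3
```

Leaving the default out matters when summing: None cannot be added to an int, so the default should be a number.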
You can try recursion:
import json
def get_quantity(o):
    if isinstance(o, dict):
        if "Quantity" in o:
            yield o["Quantity"]
        for v in o.values():
            yield from get_quantity(v)
    elif isinstance(o, list):
        for v in o:
            yield from get_quantity(v)
with open("your_file.json", "r") as f_in:
    data = json.load(f_in)
print(sum(get_quantity(data)))
Prints:
3
Here's a simple approach that, given the structure of your JSON as described, works as required:
quantity = 0
for d in json.loads(JDATA):
    if (_list := d.get('Connections')):
        for _d in _list:
            quantity += _d.get('Quantity', 0)
print(quantity)
Output:
3
Using jmespath, you can get the sum as:
import jmespath
print(sum(jmespath.search("[*].Connections[].Quantity", data), 0))

sort values from a dictionary/json file

I've got this discord.py command that makes a leaderboard from a json
cogs/coins.json (the dictionary) looks like this:
{
    "781524858026590218": {
        "name": "kvbot test platform",
        "total_coins": 129,
        "data": {
            "564050979079585803": {
                "name": "Bluesheep33",
                "coins": 127
            },
            "528647474596937733": {
                "name": "ACAT_",
                "coins": 2
            }
        }
    }
}
(The green strings with numbers in the json files are discord guild/member ids)
How do I make the code shorter and clearer?
Thanks for helping in advance, because I really don't know the solution
If the goal is to find the top ten entries in a dict, sorting once is much easier than going through the dict repeatedly and doing different things on each pass.
The code below is also a little safer, for example by using dict.get for access.
It is based on the sample JSON data.
import json
with open('cogs/coins.json', 'r') as f:
    coins_data = json.load(f)
# dict.get is a safe way to access a dict: it returns None for a missing key
guild_data = coins_data.get(str(ctx.guild.id))
if guild_data is None:  # If no data was found for this guild
    await ctx.send('No data')
    return
# dict.items() returns pairs of (key, value)
members_coins = list(guild_data['data'].items())
# Sort the list by the `coins` key in the value part of each pair; reverse=True for descending order
members_coins.sort(key=lambda x: x[1]['coins'], reverse=True)
output = ''
# list[:10] takes the first 10 items (if the list is shorter, that's okay, Python doesn't mind)
for member_id, vals in members_coins[:10]:
    output += f'{vals["name"]}: {vals["coins"]}\n'
    # output += f'<@{member_id}>: {vals["coins"]}\n'  # If you want a "mention" display of the user
await ctx.send(output)
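The sorting step can also be tried outside of a bot. The following is only a sketch: top_members and format_leaderboard are hypothetical helper names, the data is the sample from the question, and heapq.nlargest from the standard library is swapped in for the sort-then-slice approach.

```python
import heapq

# Sample data in the same shape as cogs/coins.json
coins_data = {
    "781524858026590218": {
        "name": "kvbot test platform",
        "total_coins": 129,
        "data": {
            "564050979079585803": {"name": "Bluesheep33", "coins": 127},
            "528647474596937733": {"name": "ACAT_", "coins": 2},
        },
    }
}


def top_members(coins_data, guild_id, limit=10):
    """Return up to `limit` member records with the most coins, highest first."""
    guild = coins_data.get(str(guild_id))
    if guild is None:
        return []
    members = guild.get("data", {}).values()
    return heapq.nlargest(limit, members, key=lambda m: m["coins"])


def format_leaderboard(members):
    """Render one 'rank. name: coins' line per member."""
    return "\n".join(
        f"{rank}. {m['name']}: {m['coins']}"
        for rank, m in enumerate(members, start=1)
    )


board = format_leaderboard(top_members(coins_data, 781524858026590218))
print(board)
```

Keeping the ranking in plain functions like these means the command body only has to load the file and send the resulting string.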

Nested dictionary access is returning a list

The issue I'm having is that when I try to access the values within a nested dictionary, I cannot, because it's returning a list instead of a dictionary.
I have a .json file with this format:
{
    "users": [
        {
            "1": {
                "1": "value",
                "2": "value"
            }
        }
    ]
}
I load the .json file and access the value I want by using this function:
def load_json(fn):
    with open(fn) as pf:
        data = json.load(pf)
    return data['users']['1']['2']
If I simply do return data, it is a dictionary, but if I try to go further by adding ['users'], it turns into a list and gives an error when I try to access key "1" or "2" inside of it.
My objective is to obtain the value of the nested key "2", for example, ideally without having to loop through it.
Your JSON contains an array (Python list) wrapping the inner dicts (that's what the [ and ] in your JSON indicate). All you need to do is change:
return data['users']['1']['2']
to:
return data['users'][0]['1']['2']
#                   ^^^ Added
to index the list to get into the inner dicts.
Given your data structure, and following it down:
data is a dictionary - with one key 'users' and a value of a list
data['users'] is a list - with one entry
data['users'][0] is a dictionary - with one key '1' and a value of a dictionary
data['users'][0]['1'] is a dictionary - with two keys '1' and '2'
So you need to do:
def load_json(fn):
    with open(fn) as pf:
        data = json.load(pf)
    return data['users'][0]['1']['2']
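To see that walk-through in action, here is a small sketch that prints the type found at each level. The walk helper is a made-up name, and the JSON is inlined as a string instead of being read from a file.

```python
import json

raw = '{"users": [{"1": {"1": "value", "2": "value"}}]}'
data = json.loads(raw)


def walk(obj, path):
    """Follow `path` one step at a time, printing the type found at each level."""
    current = obj
    print("start ->", type(current).__name__)
    for step in path:
        current = current[step]
        print(repr(step), "->", type(current).__name__)
    return current


# Note the integer 0 for the list and the string keys for the dicts
value = walk(data, ["users", 0, "1", "2"])
print(value)
```

The printed types go dict, list, dict, dict, str, which matches the levels described above.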

How to properly keep structure when removing keys in JSON using python?

I'm using this as a reference: Elegant way to remove fields from nested dictionaries
I have a large amount of JSON-formatted data here and we've determined a list of unnecessary keys (and all their underlying values) that we can remove.
I'm a bit new to working with JSON and Python specifically (I mostly did sysadmin work) and initially thought it was just a plain dictionary of dictionaries. While some of the data looks like that, several more pieces of data consist of dictionaries of lists, which can in turn contain more lists or dictionaries with no specific pattern.
The idea is to keep the data identical EXCEPT for the specified keys and associated values.
Test Data:
to_be_removed = ['leecher_here']
easy_modo = {
    'hello_wold': 'konnichiwa sekai',
    'leeching_forbidden': 'wanpan kinshi',
    'leecher_here': 'nushiyowa'
}
lunatic_modo = {
    'hello_wold': {
        'leecher_here': 'nushiyowa', 'goodbye_world': 'aokigahara'
    },
    'leeching_forbidden': 'wanpan kinshi',
    'leecher_here': 'nushiyowa',
    'something_inside': {
        'hello_wold': 'konnichiwa sekai',
        'leeching_forbidden': 'wanpan kinshi',
        'leecher_here': 'nushiyowa'
    },
    'list_o_dicts': [
        {
            'hello_wold': 'konnichiwa sekai',
            'leeching_forbidden': 'wanpan kinshi',
            'leecher_here': 'nushiyowa'
        }
    ]
}
Obviously, the original question posted there isn't accounting for lists.
My code, modified appropriately to work with my requirements.
from copy import deepcopy
def remove_key(json,trash):
    """
    <snip>
    """
    keys_set = set(trash)
    modified_dict = {}
    if isinstance(json,dict):
        for key, value in json.items():
            if key not in keys_set:
                if isinstance(value, dict):
                    modified_dict[key] = remove_key(value, keys_set)
                elif isinstance(value,list):
                    for ele in value:
                        modified_dict[key] = remove_key(ele,trash)
                else:
                    modified_dict[key] = deepcopy(value)
    return modified_dict
I'm sure something is messing with the structure, because it doesn't pass the test I wrote, in which the expected data is exactly the same minus the removed keys. The test shows that it is properly removing the data, but where there is supposed to be a list of dictionaries, only a dictionary is returned instead, which will have unfortunate implications down the line.
I'm sure it's because the function returns a dictionary, but I don't know how to proceed from here in order to maintain the structure.
At this point, I need help with what I could have overlooked.
When you go through your json file, you only need to determine whether it is a list, a dict or neither. Here is a recursive way to modify your input dict in place:
def remove_key(d, trash=None):
    if not trash:
        trash = []
    if isinstance(d, dict):
        # Copy the keys first: deleting while iterating over the dict itself raises RuntimeError
        for key in list(d):
            if key in trash:
                del d[key]
        # Recurse into the remaining values
        for value in d.values():
            remove_key(value, trash)
    elif isinstance(d, list):
        for value in d:
            remove_key(value, trash)
remove_key(lunatic_modo, to_be_removed)
remove_key(easy_modo, to_be_removed)
Result:
{
    "hello_wold": {
        "goodbye_world": "aokigahara"
    },
    "leeching_forbidden": "wanpan kinshi",
    "something_inside": {
        "hello_wold": "konnichiwa sekai",
        "leeching_forbidden": "wanpan kinshi"
    },
    "list_o_dicts": [
        {
            "hello_wold": "konnichiwa sekai",
            "leeching_forbidden": "wanpan kinshi"
        }
    ]
}
{
    "hello_wold": "konnichiwa sekai",
    "leeching_forbidden": "wanpan kinshi"
}
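If the original data has to stay untouched, a non-mutating variant is also possible. The sketch below is an alternative to the in-place approach, not the method from the answer above; without_keys is a made-up name and the sample data is a shortened version of the test data.

```python
def without_keys(obj, trash):
    """Return a copy of `obj` with every key in `trash` removed at any depth."""
    trash = set(trash)
    if isinstance(obj, dict):
        # Rebuild the dict, skipping unwanted keys and recursing into values
        return {k: without_keys(v, trash) for k, v in obj.items() if k not in trash}
    if isinstance(obj, list):
        # Rebuild the list so lists of dicts stay lists of dicts
        return [without_keys(item, trash) for item in obj]
    # Scalars (str, int, bool, None) are returned as they are
    return obj


original = {
    "leecher_here": "nushiyowa",
    "list_o_dicts": [
        {"hello_wold": "konnichiwa sekai", "leecher_here": "nushiyowa"}
    ],
}
cleaned = without_keys(original, ["leecher_here"])
print(cleaned)
print("leecher_here" in original)  # the input is left unchanged
```

Because each container type is rebuilt as the same type, the shape of the data is preserved, which is the property the failing test was checking for.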

Removing a key from a nested dictionary based on value

I previously asked about adding, and someone helped me out with append. My new problem is trying to delete a key with a nested list, e.g.:
JSON:
data = {"result":[{"name":"Teddy","list":{"0":"24","1":"43","2":"56"}},
{"name":"Barney","list":{"0":"24","1":"43","2":"56"}}]}
Python:
name = input("Input a key to delete") #Must hold a value.
data["result"].pop(name)
E.g. Barney => then delete Barney etc.
I use the method below to find a key, but I am not sure this is the correct approach.
Finding Barney:
for key in data['result']:
    if key['name'] == name:
        print("Found!!!!")
I am not sure. This surely does not work; maybe I should loop through each key? Any suggestion or code example would be appreciated.
After delete: now that Barney has been deleted, the dictionary looks like this.
data = {"result":[{"name":"Teddy","list":{"0":"24","1":"43","2":"56"}}]}
If the goal is to remove list items from the JSON document, you'll want to:
Convert the JSON document into a Python data structure
Manipulate the Python data structure
Convert the Python data structure back to a JSON document
Here is one such program:
import json
def delete_keys(json_string, name):
    data = json.loads(json_string)
    # Keep every entry whose name differs from the one to delete
    data['result'][:] = [d for d in data['result'] if d.get('name') != name]
    json_string = json.dumps(data)
    return json_string
j = '''
{"result":[{"name":"Teddy",
"list":{"0":"24","1":"43","2":"56"}},
{"name":"Barney","list":{"0":"24","1":"43","2":"56"}}]}'''
print(delete_keys(j, 'Barney'))
Result:
$ python x.py
{"result": [{"name": "Teddy", "list": {"0": "24", "1": "43", "2": "56"}}]}
Note this list comprehension:
data['result'][:] = [d for d in data['result'] if d.get('name') != name]
The form l[:] = [item for item in l if ...] is one way to delete several items from a list in place according to the indicated condition.
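To illustrate how slice assignment differs from plain rebinding, here is a short sketch with made-up records. The alias variable stands in for any other reference to the same list, such as data['result'] held elsewhere.

```python
records = [{"name": "Teddy"}, {"name": "Barney"}]
alias = records  # a second reference to the same list object

# Slice assignment replaces the contents of the existing list in place
records[:] = [r for r in records if r.get("name") != "Barney"]
print(alias)             # [{'name': 'Teddy'}]
print(alias is records)  # True: still one list object

# Plain assignment binds the name to a new list instead
records = [r for r in records if r.get("name") != "Teddy"]
print(records)           # []
print(alias)             # [{'name': 'Teddy'}]: the old list is untouched
```

In delete_keys either form would work, because the list is read back through data['result'] straight away; the in-place form matters when other code holds its own reference to the list.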
Since data['result'] is a list, you'll have to go to the index and delete the key. If you're looking to delete the key across all indices in the list, you could quickly write a function that iterates through the list and deletes the matching key
def delete_key(list_obj, key):
    for value in list_obj:
        if key in value:
            value.pop(key)
    return list_obj
result = delete_key(data["result"], 'key1')
You can convert the JSON into a JavaScript object.
var resultString = '{"result":[{"key1":"value1","key2":"value2"}, {"key1":"value3","key2":"value4"}]}';
var result = JSON.parse(resultString);
Once you do, it becomes clearer that this is an array of objects. You need to know which index you want to remove. If you don't know the index, you can use the .find method for arrays to locate the matching element (or .findIndex to get its position):
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find
var inventory = [
    {name: 'apples', quantity: 2},
    {name: 'bananas', quantity: 0},
    {name: 'cherries', quantity: 5}
];
function findCherries(fruit) {
    return fruit.name === 'cherries';
}
console.log(inventory.find(findCherries));
// { name: 'cherries', quantity: 5 }
Note, though, that find is not supported in Internet Explorer. Once you know the index, you can splice the array.
var myFish = ["angel", "clown", "drum", "mandarin", "surgeon"];
var removed = myFish.splice(3, 1);
// removed is ["mandarin"]
// myFish is ["angel", "clown", "drum", "surgeon"]
from https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/splice
