JSON formatting by appending dict values to list - python

I have a JSON object that looks like this:
{ "produktNr":"1234",
"artNr_01":"12",
"artNr_02":"23",
"artNr_03":"",
"artNr_04":"14",
"name_01":"abc",
"name_02":"der",
"test":"junk"
}
I would like to convert this into a dictionary like this:
{ "produktNr":"1234", "artNr":["12","23","","14"], "name":["abc","der"], "test":"junk"}
The conversion is driven by a given sequence, say seq = ["artNr","name"]: each entry of the sequence is searched for in the dictionary's keys, and the values of the matching keys are collected into a list.
My attempt so far:
tempDict = {}
for key,value in fmData.iteritems():
    for seqval in seq:
        if seqval in key:
            if seqval in tempDict:
                tempDict[seqval].append(value)
            else:
                x = []
                x.append(value)
                tempDict[seqval]=x
        else:
            tempDict[key] = value
This attempt has a few problems.
The list of values is not ordered, i.e. I get "artNr":["","14","12","23"]
instead of the values in the order [_01,_02,_03,_04].
The grouped items cannot be popped from the dictionary, since items cannot be deleted from a dictionary while looping over it, which results in:
{ "produktNr":"1234", "artNr":["12","23","","14"],"artNr_01":"12", "artNr_02":"23", "artNr_03":"","artNr_04":"14","name":["abc","der"],"name_01":"abc", "name_02":"der", "test":"junk"}
Would love to understand how to deal with this, especially if there's a pythonic way to solve this problem.

You may use OrderedDict from the collections module:
from collections import OrderedDict
import re
input_dict = { "produktNr":"1234",
"artNr_01":"12",
"artNr_02":"23",
"artNr_03":"",
"artNr_04":"14",
"name_01":"abc",
"name_02":"der",
"test":"junk" }
# split keys on the first '_'
m = re.compile('^([^_]*)_(.*)')
def _order_by( item ):
    # helper function for ordering the dict.
    # item is a (key, value) pair; the key is split on the first '_' and,
    # if that was successful, the second part is returned, otherwise the key
    # if key is something like artNr_42, return 42
    # if key is something like test, return test
    k,s = item
    try:
        return m.search(k).group(2)
    except AttributeError:
        # no '_' in the key, so m.search(k) returned None
        return k
# create ordered dict using helper function
orderedDict = OrderedDict( sorted(input_dict.items(), key=_order_by))
aggregated_dict = {}
for k, v in orderedDict.items():
    # split key
    match = m.search(k)
    if match:
        # key is splittable, i.e., key is something like artNr_42
        kk = match.group(1)
        if kk not in aggregated_dict:
            # create list and add value
            aggregated_dict[kk] = [v]
        else:
            # add value
            aggregated_dict[kk].append(v)
    else:
        # key is not splittable, i.e., key is something like produktNr
        aggregated_dict[k] = v
print(aggregated_dict)
which gives the desired output (the order of the keys in the printed dict can vary with the Python version):
{'produktNr': '1234', 'test': 'junk', 'name': ['abc', 'der'], 'artNr': ['12', '23', '', '14']}

You can build a new dictionary that groups the values of all keys containing '_' into lists, while the other keys and values are kept intact. This should do:
d = { "produktNr":"1234", "artNr_01":"12", "artNr_02":"23","artNr_03":"","artNr_04":"14","name_01":"abc","name_02":"der","test":"junk"}
new_d = {}
for k, v in d.items():
    k_new = k.split('_')[0]
    if '_' in k:
        if k_new not in new_d:
            new_d[k_new] = [v]
        else:
            new_d[k_new].append(v)
    else:
        new_d[k_new] = v
print(new_d)
# {'artNr': ['', '14', '23', '12'], 'test': 'junk', 'produktNr': '1234', 'name': ['der', 'abc']}
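Before Python 3.7, dicts are unordered collections, so the order in which the values are appended to the list is indeterminate there. From Python 3.7 on, dicts preserve insertion order, so the values are appended in the order in which the keys were inserted.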

A slight modification of your code:
tempDict = {}
for key,value in fmData.iteritems():
    seqval_in_key = "no"
    for seqval in seq:
        if seqval in key:
            seqval_in_key = "yes"
    for seqval in seq:
        if seqval in key:
            if seqval in tempDict:
                tempDict[seqval].append(value)
            else:
                x = []
                x.append(value)
                tempDict[seqval]=x
        else:
            if (seqval_in_key == "no"):
                tempDict[key] = value
print tempDict
Result:
{'produktNr': '1234', 'test': 'junk', 'name': ['abc', 'der'], 'artNr': ['14', '23', '', '12']}
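The ordering and leftover-key problems discussed in this thread can also be avoided by walking the keys in sorted order and building a fresh dictionary instead of modifying the original. The following is only a minimal sketch for Python 3, not a definitive implementation: the names group_by_prefix and fm_data are made up for illustration, and it assumes the numeric suffixes are zero-padded (_01, _02, ...) so that plain string sorting matches the numeric order.

```python
def group_by_prefix(data, seq):
    """Group values of keys shaped like '<prefix>_<suffix>' for prefixes in seq."""
    result = {}
    # Sorting the keys puts artNr_01 before artNr_02, and so on
    # (assumption: suffixes are zero-padded).
    for key in sorted(data):
        prefix, sep, _suffix = key.partition("_")
        if sep and prefix in seq:
            # Key belongs to a group: append its value to the group's list.
            result.setdefault(prefix, []).append(data[key])
        else:
            # Any other key is copied over unchanged.
            result[key] = data[key]
    return result


fm_data = {
    "produktNr": "1234",
    "artNr_01": "12",
    "artNr_02": "23",
    "artNr_03": "",
    "artNr_04": "14",
    "name_01": "abc",
    "name_02": "der",
    "test": "junk",
}

grouped = group_by_prefix(fm_data, ["artNr", "name"])
print(grouped)
```

Because a new dictionary is built, nothing has to be deleted while iterating, and the suffixed keys never end up in the result.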

Related

Reformating a dictionary of list based on items in another list

I have a dictionary of lists like
source = {"name":["hans","james","mat"],"country":["spain"],"language":["english","french"]}
and another list like
data_not_avail = ["hans","spain","mat"]
How can I reformat the source dictionary into the following format?
{
"exist":{"name":["james"], "language":["english","french"]},
"not_exist":{"name":["hans","mat"], "country":["spain"]}
}
I tried to solve it by finding the key of each item that is present in the list, but without success:
data_result = {}
keys_list = []
for v in data_not_avail:
    keys = [key for key, value in source.items() if v in value]
    data_result.update({keys[0]:[v]})
    keys_list.extend(keys)
Here is one approach: you can use a list comprehension (or Python's built-in filter) to filter every element of the lists in source against the contents of data_not_avail.
data = {"exist": {}, "not_exist": {}}
for key, value in source.items():
    data["exist"][key] = [v for v in value if v not in data_not_avail]
    data["not_exist"][key] = [v for v in value if v in data_not_avail]
    # if you don't need empty lists in the result
    if not data["exist"][key]:
        del data["exist"][key]
    if not data["not_exist"][key]:
        del data["not_exist"][key]
A naive way of solving it is this:
values = list(source.values())
exist_values = []
not_values = []
for l in values:
    temp_exist = []
    temp_not = []
    for item in l:
        if item not in data_not_avail:
            temp_exist.append(item)
        else:
            temp_not.append(item)
    exist_values.append(temp_exist)
    not_values.append(temp_not)
exist = {}
not_exist = {}
# take the keys from source so they line up with values
keys = list(source.keys())
for i,key in enumerate(keys):
    if len(exist_values[i]) != 0:
        exist[key] = exist_values[i]
    if len(not_values[i]) != 0:
        not_exist[key] = not_values[i]
print(exist, not_exist)
# {'name': ['james'], 'language': ['english', 'french']} {'name': ['hans', 'mat'], 'country': ['spain']}
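Both answers test membership against the data_not_avail list once per item. As a hedged alternative, the sketch below converts that list to a set first and fills the two buckets in a single pass, so empty lists never have to be deleted afterwards. The function name split_by_availability is hypothetical, and the sketch assumes the items are hashable (strings, as in the question).

```python
def split_by_availability(source, unavailable):
    """Split a dict of lists into 'exist' and 'not_exist' sub-dicts."""
    unavailable = set(unavailable)  # set lookup instead of scanning a list
    result = {"exist": {}, "not_exist": {}}
    for key, values in source.items():
        for value in values:
            bucket = "not_exist" if value in unavailable else "exist"
            # setdefault creates the list only when a value actually lands
            # in the bucket, so no empty lists are produced.
            result[bucket].setdefault(key, []).append(value)
    return result


source = {
    "name": ["hans", "james", "mat"],
    "country": ["spain"],
    "language": ["english", "french"],
}
data_not_avail = ["hans", "spain", "mat"]

result = split_by_availability(source, data_not_avail)
print(result)
```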

How could I create a dictionary taking specific strings of elements of a list?

I have a list whose elements contain fields separated by "_", and I need to take the 4th field as the value and the fields from the 5th onwards as the key of a dictionary.
My list:
['MLID_D_08_NGS_34_H08.fsa',
'MLID_D_17_W2205770_Michael_Jordan_A10.fsa',
'MLID_D_18_W2205770_Michael_Jordan_B10.fsa',
'MLID_D_19_W2205768_Maradona_Guti_C10.fsa',
'MLID_D_20_W2205768_Maradona_Guti_D10.fsa',
'MLID_D_38_No_DNA_F12.fsa']
I am trying to get a dictionary like this
thisdict = {
"34_H08": "NGS",
"Michael_Jordan_A10": "W2205770",
"Michael_Jordan_B10": "W2205770",
...
"DNA_F12": "No",
}
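An optimised way of creating the same dictionary (the extension is stripped first, and every field after the 4th goes into the key):
thisdict = dict(
    (lambda x: ('_'.join(x[4:]), x[3]))(s.split('.')[0].split('_'))
    for s in lst
)
Using the reduce function:
from functools import reduce  # reduce is not a builtin in Python 3
reduce(lambda x, y: x.update({'_'.join(y.split('.')[0].split('_')[4:]): y.split('.')[0].split('_')[3]}) or x, lst, {})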
Try this:
lst = ['MLID_D_08_NGS_34_H08.fsa',
'MLID_D_17_W2205770_Michael_Jordan_A10.fsa',
'MLID_D_18_W2205770_Michael_Jordan_B10.fsa',
'MLID_D_19_W2205768_Maradona_Guti_C10.fsa',
'MLID_D_20_W2205768_Maradona_Guti_D10.fsa',
'MLID_D_38_No_DNA_F12.fsa']
dic = {}
for name in lst:
    name = name.split(".")[0].split("_")
    dic["_".join(name[4:])] = name[3]
print(dic)
This code may help you.
lst = ['MLID_D_08_NGS_34_H08.fsa',
'MLID_D_17_W2205770_Michael_Jordan_A10.fsa',
'MLID_D_18_W2205770_Michael_Jordan_B10.fsa',
'MLID_D_19_W2205768_Maradona_Guti_C10.fsa',
'MLID_D_20_W2205768_Maradona_Guti_D10.fsa',
'MLID_D_38_No_DNA_F12.fsa']
lst = [a.split('.') for a in lst] # split by .
dict_ = {}
for l in lst:
    k = '_'.join(l[0].split('_')[4:]) # make key
    v = l[0].split('_')[3] # make value
    dict_[k]=v # add value to dict
print(dict_)
OUTPUT
{'34_H08': 'NGS',
'Michael_Jordan_A10': 'W2205770',
'Michael_Jordan_B10': 'W2205770',
'Maradona_Guti_C10': 'W2205768',
'Maradona_Guti_D10': 'W2205768',
'DNA_F12': 'No'}
As @matszwecja indicated, s.split('_') is the way to go.
You can access different parts of the split as follows:
lst = ['MLID_D_08_NGS_34_H08.fsa',
'MLID_D_17_W2205770_Michael_Jordan_A10.fsa',
'MLID_D_18_W2205770_Michael_Jordan_B10.fsa',
'MLID_D_19_W2205768_Maradona_Guti_C10.fsa',
'MLID_D_20_W2205768_Maradona_Guti_D10.fsa',
'MLID_D_38_No_DNA_F12.fsa']
thisdict = {'_'.join(s.split('.')[0].split('_')[4:]): s.split('_')[3] for s in lst}
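Since everything after the 4th field belongs to the key, the maxsplit argument of str.split can keep that tail in one piece, which avoids splitting and re-joining it. The sketch below illustrates this under a few assumptions: build_lookup is a made-up name, the extension is removed with os.path.splitext, and file names with fewer than five fields are silently skipped (the question does not say what should happen to them).

```python
import os


def build_lookup(filenames):
    """Map '<5th field onwards>' to '<4th field>' for each file name."""
    lookup = {}
    for filename in filenames:
        stem, _ext = os.path.splitext(filename)  # drop the ".fsa" extension
        # Split on at most 4 underscores: the 5th element keeps the rest intact.
        fields = stem.split("_", 4)
        if len(fields) < 5:
            continue  # assumption: ignore names that do not fit the pattern
        lookup[fields[4]] = fields[3]
    return lookup


lst = ['MLID_D_08_NGS_34_H08.fsa',
       'MLID_D_17_W2205770_Michael_Jordan_A10.fsa',
       'MLID_D_18_W2205770_Michael_Jordan_B10.fsa',
       'MLID_D_19_W2205768_Maradona_Guti_C10.fsa',
       'MLID_D_20_W2205768_Maradona_Guti_D10.fsa',
       'MLID_D_38_No_DNA_F12.fsa']

thisdict = build_lookup(lst)
print(thisdict)
```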

Get a specific value from a list of dictionaries

I want to get a specific value, '12122020', whose 'Key' is 'Expiration':
The 'Expiration' key can be placed at any position.
input :
my_list=[{'Key': 'Expiration', 'Value': '12122020'}, {'Key': 'Name', 'Value': 'Config Test 2'}]
my solution:
res = [sub['Value'] for sub in my_list if sub['Key'] =='Expiration' ]
print(res)
Sometimes the 'Expiration' tag is not present.
How can I handle that and avoid a NoneType object error?
If you could re-organize your data like so,
custom_dict = {'Expiration': '12122020', 'Name': 'Config Test 2'}
Then, you could write the code like this,
def get_key_value_from_dictionary_search(dict_data, key_search_phrase, value_search_phrase):
    for k,v in dict_data.items():
        # compare with ==, not "is": "is" tests object identity, not equality
        if k == key_search_phrase and v == value_search_phrase:
            return k, v
    # nothing matched: return a pair so the unpacking below still works
    return None, None
_k, _v = get_key_value_from_dictionary_search(custom_dict, "Expiration", "12122020")
print("Key : {}\nValue : {}".format(_k, _v))
If the Expiration key isn't present, your res evaluates to an empty list. So if you just check whether the list is empty, you'll know if Expiration was in there to begin with.
def get_result(lst, default="99999999"):
    res = [sub['Value'] for sub in lst if sub['Key'] == 'Expiration']
    if res:
        # there is something in the list, so return the first thing
        return res[0]
    else:
        # the list is empty, so Expiration wasn't in lst
        return default
print(get_result(my_list))
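The same "first match or a default" idea can be written with the built-in next() and a generator expression, which stops at the first hit instead of building the whole list. This is a small sketch, not the only way to do it; the name get_value and the "Owner" key in the usage lines are made up for illustration.

```python
def get_value(records, key, default=None):
    """Return the 'Value' of the first record whose 'Key' equals key."""
    # next() takes the first item produced by the generator, or `default`
    # when the generator is exhausted without a match.
    return next(
        (rec["Value"] for rec in records if rec.get("Key") == key),
        default,
    )


my_list = [{'Key': 'Expiration', 'Value': '12122020'},
           {'Key': 'Name', 'Value': 'Config Test 2'}]

print(get_value(my_list, "Expiration"))        # 12122020
print(get_value(my_list, "Owner", "missing"))  # missing
```

Using rec.get("Key") rather than rec["Key"] means a record without a 'Key' entry is skipped instead of raising KeyError.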

Get specific key of a nested iterable and check if its value exists in a list

I am trying to access a specific key in a nested dictionary, then match its value to a string in a list. If the string in the list contains the string in the dictionary value, I want to override the dictionary value with the list value. Below is an example.
my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
my_iterable = {'A':'xyz',
'B':'string6',
'C':[{'B':'string4', 'D':'123'}],
'E':[{'F':'321', 'B':'string1'}],
'G':'jkl'
}
The key I'm looking for is B, the objective is to override string6 with string6~, string4 with string4~, and so on for all B keys found in the my_iterable.
I have written a function to compute the Levenshtein distance between two strings, but I am struggling to write an efficient way to override the values of the keys.
def find_and_replace(key, dictionary, original_list):
    for k, v in dictionary.items():
        if k == key:
            #function to check if original_list item contains v
            yield v
        elif isinstance(v, dict):
            for result in find_and_replace(key, v, original_list):
                yield result
        elif isinstance(v, list):
            for d in v:
                if isinstance(d, dict):
                    for result in find_and_replace(key, d, original_list):
                        yield result
If I call
updated_dict = find_and_replace('B', my_iterable, my_list)
I want updated_dict to be the following:
{'A':'xyz',
'B':'string6~',
'C':[{'B':'string4~', 'D':'123'}],
'E':[{'F':'321', 'B':'string1~'}],
'G':'jkl'
}
Is this the right approach to the most efficient solution, and how can I modify it to return a dictionary with the updated values for B?
You can use the code below. I have assumed that the structure of the input dict stays the same throughout.
# Input List
my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
# Input Dict
# Removed duplicate key "B" from the dict
my_iterable = {'A':'xyz',
'B':'string6',
'C':[{'B':'string4', 'D':'123'}],
'E':[{'F':'321', 'B':'string1'}],
'G':'jkl',
}
# setting search key
search_key = "B"
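# Main code
for i, v in my_iterable.items():
    if i == search_key:
        if not isinstance(v,list):
            # use a separate loop variable so the outer key i is not shadowed
            search_in_list = [s for s in my_list if v in s]
            if search_in_list:
                my_iterable[i] = search_in_list[0]
    else:
        try:
            for j, k in v[0].items():
                if j == search_key:
                    search_in_list = [l for l in my_list if k in l]
                    if search_in_list:
                        v[0][j] = search_in_list[0]
        except (AttributeError, IndexError, KeyError, TypeError):
            # v is not a non-empty list of dicts, so there is nothing to update
            continue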
# print output
print (my_iterable)
# Result -> {'A': 'xyz', 'B': 'string6~', 'C': [{'B': 'string4~', 'D': '123'}], 'E': [{'F': '321', 'B': 'string1~'}], 'G': 'jkl'}
The above has scope for optimization, using a list comprehension or a function.
I hope this helps!
In some cases, if your nesting is kind of complex, you can treat the dictionary like a JSON string and do all sorts of replacements. It's probably not what people would call very pythonic, but it gives you a little more flexibility.
import re, json
my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
my_iterable = {'A':'xyz',
'B':'string6',
'C':[{'B':'string4', 'D':'123'}],
'E':[{'F':'321', 'B':'string1'}],
'G':'jkl'}
json_str = json.dumps(my_iterable, ensure_ascii=False)
for val in my_list:
    json_str = re.sub(re.compile(f"""("[B]":\\W?")({val[:-1]})(")"""), r"\1" + val + r"\3", json_str)
my_iterable = json.loads(json_str)
print(my_iterable)
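The question asks for a function that returns the updated dictionary for arbitrary nesting. One way to get that is to recurse over dicts and lists and build a new structure on the way back up. The sketch below is an illustration under stated assumptions, not the asker's Levenshtein-based matching: the name replace_key_values is made up, and "matching" here simply means that the stored string is a substring of a candidate from the list.

```python
def replace_key_values(obj, key, candidates):
    """Return a copy of obj in which every string stored under `key` is
    replaced by the first candidate that contains it (if there is one)."""
    if isinstance(obj, dict):
        updated = {}
        for k, v in obj.items():
            if k == key and isinstance(v, str):
                # Keep the original value when no candidate contains it.
                updated[k] = next((c for c in candidates if v in c), v)
            else:
                updated[k] = replace_key_values(v, key, candidates)
        return updated
    if isinstance(obj, list):
        return [replace_key_values(item, key, candidates) for item in obj]
    return obj  # plain values are returned unchanged


my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
my_iterable = {'A': 'xyz',
               'B': 'string6',
               'C': [{'B': 'string4', 'D': '123'}],
               'E': [{'F': '321', 'B': 'string1'}],
               'G': 'jkl'}

updated_dict = replace_key_values(my_iterable, 'B', my_list)
print(updated_dict)
```

Because the function builds new dicts and lists, my_iterable itself is left untouched.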

Python Iterating in a dictionary to get empty keys

Hello, I'm trying to iterate over a dictionary to get the keys whose values are empty, but I have no idea how to achieve this.
Any ideas?
def val(**args):
    args = args
    print args
    # print args
    # print (args.keys())
val(name = '', country = 'Canada', phone = '')
With this example I get {'country': 'Canada', 'name': '', 'phone': ''}, but what I'm really looking for is a list, built with append, of only the keys whose values are empty. The problem is that I get all the keys and not just the empty ones.
In that case I would like to return something like this:
name, phone
I appreciate your help.
Iterate the dictionary and extract keys where the value is an empty string:
empty_keys = [k for k, v in args.items() if v == '']
or as a function:
>>> def val(**args):
...     return [k for k, v in args.items() if v == '']
...
>>> val(name = '', country = 'Canada', phone = '')
['phone', 'name']
This is how you get a list of the empty keys:
empty = [k for k, v in args.items() if not v or v.isspace()]
Notice that the above includes the cases when the value is None or '' or only spaces.
The for statement can be used to iterate over the key/value pairs of a dictionary (via its items() method); then you can do what you want with them.
def val(args):
    outputList = []
    # iterating over a dict directly yields only its keys, so use items()
    for k, v in args.items():
        if v == '':
            outputList.append(k)
    return outputList
This function will return a list made up of the keys whose values are the empty string.
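The checks in these answers differ in what counts as "empty": v == '' matches only the empty string, while not v or v.isspace() also treats 0 and False as empty and fails on other non-string values. The sketch below spells the rule out explicitly and joins the result into the "name, phone" form asked for in the question. It is a hedged example for Python 3.6 or later, where keyword arguments keep their call order; the name empty_keys and the extra age argument are made up for illustration.

```python
def empty_keys(**kwargs):
    """Return the names of keyword arguments that are None or blank strings."""
    return [
        name
        for name, value in kwargs.items()
        # Only None and empty/whitespace-only strings count as empty here;
        # values such as 0 or False are deliberately left alone.
        if value is None or (isinstance(value, str) and not value.strip())
    ]


result = empty_keys(name="", country="Canada", phone="   ", age=0)
print(", ".join(result))  # name, phone
```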
