Merging dictionaries in a list that share the same id - python

I have a list of dicts like this:
[
{'id': 'A123',
'feature': {'name': 'jack', 'age' : '18' },
'create_time': '2022-5-17 10:29:47',
'is_fast': False},
{'id': 'A123',
'feature': {'gender': 'male'},
'create_time': '2022-5-17 10:29:47',
'is_fast': False},
{'id': 'A123',
'habit': {'name': 'read'},
'create_time': '2022-5-15 10:29:45',
'is_fast': False},
{'id': 'A456',
'feature': {'name': 'rose'},
'create_time': '2022-4-15 10:29:45',
'is_fast': False},
{'id': 'A456',
'habit': {'name': 'sport'},
'create_time': '2022-3-15 10:29:45',
'is_fast': False}
]
But I want to merge the entries that have the same "id" value using some function.
The desired output is as follows:
[
{'id': 'A123',
'feature': {'name': 'jack', 'age' : '18' ,'gender': 'male'},
'habit': {'name': 'read'},
'create_time': '2022-5-17 10:29:47', # latest time among the entries with the same id
'is_fast': False},
{'id': 'A456',
'feature': {'name': 'rose'},
'habit': {'name': 'sport'},
'create_time': '2022-4-15 10:29:45',
'is_fast': False},
]
How can I merge the dictionaries that share the same "id" value?

This should get you started... I put some inline notes to explain what the code is doing. You still need to implement a date time comparison.
def merge_dicts(lst):
    final = {}  # results
    for row in lst:  # iterate through list
        if row['id'] not in final:  # if current item id hasn't been seen
            final[row['id']] = row  # assign it to results with id as the key
        else:
            record = final[row['id']]  # otherwise compare to data already stored
            for k, v in row.items():  # iterate through dictionary items
                if k not in record:  # if key not in results
                    record[k] = v  # add the key and value
                    continue
                if record[k] == v:
                    continue  # if they are already equal move on
                if isinstance(v, dict):  # if it's a dictionary
                    record[k].update(v)  # update the dictionary
                else:  # must be a date/time string, so do some datetime comparison
                    """Do some date comparison and assign correct date"""
    return [v for k, v in final.items()]  # convert to list

print(merge_dicts(lst))
output:
[
{
'id': 'A123',
'feature': {'name': 'jack', 'age': '18', 'gender': 'male'},
'create_time': '2022-5-17 10:29:47',
'is_fast': False,
'habit': {'name': 'read'}
},
{
'id': 'A456',
'feature': {'name': 'rose'},
'create_time': '2022-4-15 10:29:45',
'is_fast': False,
'habit': {'name': 'sport'}
}
]
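The date comparison left open in the else branch could be filled in along these lines. This is a minimal sketch, assuming every create_time string follows the pattern seen in the question (year-month-day hour:minute:second, without zero padding); the helper name latest_time and the TIME_FORMAT constant are illustrative and not part of the answer above.

```python
from datetime import datetime

# Assumed layout of the create_time strings in the question.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def latest_time(a, b):
    """Return whichever of the two timestamp strings is the later one."""
    return max(a, b, key=lambda s: datetime.strptime(s, TIME_FORMAT))


print(latest_time('2022-5-17 10:29:47', '2022-5-15 10:29:45'))
# 2022-5-17 10:29:47

# Parsing matters once months or days differ in digit count:
print(latest_time('2022-10-1 00:00:00', '2022-9-30 23:59:59'))
# 2022-10-1 00:00:00
```

Inside merge_dicts, the else branch could then read record[k] = latest_time(record[k], v), provided k is 'create_time'.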

You can use the dict.setdefault method to initialize sub-dicts under keys that don't already exist to avoid cluttering up your code with conditional statements that test the existence of keys:
merged = {}
for d in lst:
    s = merged.setdefault(d['id'], d)
    for k, v in d.items():
        if isinstance(v, dict):
            s.setdefault(k, v).update(v)
        elif v > s[k]:  # the dates/times in this input happen to sort correctly as plain strings
            s[k] = v  # later dates/times take precedence
print(list(merged.values()))
Demo: https://replit.com/#blhsing/BlandCarelessPolygons#main.py
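Both answers above store the input dictionaries themselves in the result, so the original list is modified as a side effect. The following is a rough sketch of a variant that builds fresh dictionaries instead and compares create_time as parsed datetimes. The names merge_by_id and parse are made up for this example, the timestamp format is assumed from the sample data, and keeping the first-seen value for other scalar keys is an assumption rather than something the question specifies.

```python
from datetime import datetime


def parse(ts):
    # Assumed format, taken from the sample data in the question.
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def merge_by_id(rows):
    merged = {}
    for row in rows:
        target = merged.setdefault(row["id"], {})  # fresh dict per id
        for key, value in row.items():
            if isinstance(value, dict):
                # Copy nested items into a dict owned by the result.
                target.setdefault(key, {}).update(value)
            elif key == "create_time":
                if key not in target or parse(value) > parse(target[key]):
                    target[key] = value  # keep the latest timestamp
            else:
                target.setdefault(key, value)  # assumption: first value wins
    return list(merged.values())


rows = [
    {'id': 'A123', 'feature': {'name': 'jack', 'age': '18'},
     'create_time': '2022-5-17 10:29:47', 'is_fast': False},
    {'id': 'A123', 'feature': {'gender': 'male'},
     'create_time': '2022-5-17 10:29:47', 'is_fast': False},
    {'id': 'A123', 'habit': {'name': 'read'},
     'create_time': '2022-5-15 10:29:45', 'is_fast': False},
    {'id': 'A456', 'feature': {'name': 'rose'},
     'create_time': '2022-4-15 10:29:45', 'is_fast': False},
    {'id': 'A456', 'habit': {'name': 'sport'},
     'create_time': '2022-3-15 10:29:45', 'is_fast': False},
]
result = merge_by_id(rows)
for item in result:
    print(item)
```

After the call, rows[0]['feature'] still contains only 'name' and 'age', because the merge writes into new dictionaries.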

Related

How to extract nested dictionaries from dictionary into single dictionary?

I have a dictionary which contains some key-value pairs as strings, but some key-values are dictionaries.
The data looks like this:
{'amount': 123,
'baseUnit': 'test',
'currency': {'code': 'EUR'},
'dimensions': {'height': {'iri': 'http://www.example.com/data/measurement-height-12345',
'unitOfMeasure': 'm',
'value': 23},
'length': {'iri': 'http://www.example.com/data/measurement-length-12345',
'unitOfMeasure': 'm',
'value': 8322},
'volume': {'unitOfMeasure': '', 'value': 0},
'weight': {'iri': 'http://www.example.com/data/measurement-weight-12345',
'unitOfMeasure': 'KG',
'value': 23},
'width': {'iri': 'http://www.example.com/data/measurement-width-12345',
'unitOfMeasure': 'm',
'value': 1}},
'exportListNumber': '1234',
'iri': 'http://www.example.com/data/material-12345',
'number': '12345',
'orderUnit': 'sdf',
'producerFormattedPID': '12345',
'producerID': 'example',
'producerNonFormattedPID': '12345',
'stateID': 'm70',
'typeID': 'FERT'}
For the dimensions and price keys, the values are nested dictionaries. How can I extract that data so that the final variable is a flat dictionary with no nested dictionaries as values? For the price, I would need something like:
{'pricecurrencycode':'EUR','priceamount':123} instead of 'price': {'currency': {'code': 'EUR'}, 'amount': 123}.
The same should happen to the dimensions key: all of its nested dictionaries should be extracted, so that the result is easier to transform into a final dataframe.
You can define a recursive flatten function that gets called whenever the dictionary value is a dictionary.
Assuming python>=3.9:
def flatten(my_dict, prefix=""):
    res = {}
    for k, v in my_dict.items():
        if isinstance(v, dict):
            res |= flatten(v, prefix+k)
        else:
            res[prefix+k] = v
    return res
A slightly more verbose option for older python versions:
def flatten(my_dict, prefix=""):
    res = {}
    for k, v in my_dict.items():
        if isinstance(v, dict):
            for k_flat, v_flat in flatten(v, prefix+k).items():
                res[k_flat] = v_flat
        else:
            res[prefix+k] = v
    return res
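As a usage illustration, here is a small variant of the same recursive idea applied to the price example from the question. The sep parameter is a hypothetical addition (not in the answer above) that makes the joined key names easier to read; with sep="" the keys match the format the question asks for. It uses dict.update, so it should not depend on Python 3.9.

```python
def flatten(my_dict, prefix="", sep="_"):
    """Flatten nested dicts; key parts are joined with `sep` (hypothetical parameter)."""
    res = {}
    for k, v in my_dict.items():
        key = f"{prefix}{sep}{k}" if prefix else k
        if isinstance(v, dict):
            res.update(flatten(v, key, sep))  # recurse into nested dicts
        else:
            res[key] = v
    return res


data = {'price': {'currency': {'code': 'EUR'}, 'amount': 123}}

print(flatten(data))
# {'price_currency_code': 'EUR', 'price_amount': 123}

print(flatten(data, sep=""))
# {'pricecurrencycode': 'EUR', 'priceamount': 123}
```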

python - dictionary - update text values of keys - setting a priority (max principle)

I have the following strings as values for a dictionary key:
["low", "middle", "high", "very high"]
These are the options for the dictionary key 'priority'. A sample dict element is:
{'name': 'service', 'priority': value}
My task is to collect a list of such dictionaries, which all differ in the value of the 'priority' key.
my_list = [{'name': 'service', 'priority': 'low'}, {'name': 'service', 'priority': 'high'}]
In the end a final dictionary item should exist, that has the highest priority value. It should work like the maximum principle. In this case {'name': 'service', 'priority': 'high'} would be the result.
The problem is that the value is a string, not an integer.
Thanks for any ideas to get it working.
Here is an approach using the itertools module:
# Step 0: prepare data
score = ["low", "middle", "high", "very high"]
my_list = [{'name': 'service', 'priority': 'low', 'label1':'text'}, {'name': 'service', 'priority': 'middle', 'label2':'text'}, {'name': 'service_b', 'priority': 'middle'}, {'name': 'service_b', 'priority': 'very high'}]
my_list # to just show source data in list
Out[1]:
[{'name': 'service', 'priority': 'low', 'label1': 'text'},
{'name': 'service', 'priority': 'middle', 'label2': 'text'},
{'name': 'service_b', 'priority': 'middle'},
{'name': 'service_b', 'priority': 'very high'}]
# Step 0.5: convert byte strings (if there are any) to str
# my_list = [{k: (v.decode() if isinstance(v, bytes) else v) for k, v in i.items()} for i in my_list]
# Step 1: reorganize the "score" list into a more useful form: a dict mapping each label to its rank
score_dic = {label: rank for rank, label in enumerate(score)}
score_dic
Out[2]:
{'low': 0, 'middle': 1, 'high': 2, 'very high': 3}
# Step 2: get result (itertools.groupby only groups consecutive items, so my_list must already be ordered by 'name')
import itertools
[max(g, key=lambda b: score_dic[b['priority']]) for k, g in itertools.groupby(my_list, lambda x: x['name'])]
Out[3]:
[{'name': 'service', 'priority': 'middle', 'label2': 'text'},
{'name': 'service_b', 'priority': 'very high'}]
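One caveat with the groupby approach: itertools.groupby starts a new group every time the key changes, so entries with the same name that are not adjacent end up in separate groups. A minimal sketch of a variant that sorts first, assuming the input order is not guaranteed (the function name highest_per_name is made up for this example):

```python
from itertools import groupby
from operator import itemgetter

score = ["low", "middle", "high", "very high"]
rank = {label: i for i, label in enumerate(score)}


def highest_per_name(items):
    # Sort by name so that equal names are adjacent before grouping.
    ordered = sorted(items, key=itemgetter("name"))
    return [
        max(group, key=lambda d: rank[d["priority"]])
        for _, group in groupby(ordered, key=itemgetter("name"))
    ]


unsorted_items = [
    {'name': 'service', 'priority': 'low'},
    {'name': 'service_b', 'priority': 'very high'},
    {'name': 'service', 'priority': 'high'},
]
print(highest_per_name(unsorted_items))
# [{'name': 'service', 'priority': 'high'}, {'name': 'service_b', 'priority': 'very high'}]
```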
Is this what you want?
priorities = ["low", "middle", "high", "very high"]
items = [{'name': 'service', 'priority': 'high'}, {'name': 'service2', 'priority': 'high'}, {'name': 'service', 'priority': 'very high'}, {'name': 'service2', 'priority': 'very high'}]
max_priority = max(items, key=lambda item: priorities.index(item['priority']))['priority']
max_items = [item for item in items if item['priority'] == max_priority]
print(max_items)
Output:
[{'name': 'service', 'priority': 'very high'}, {'name': 'service2', 'priority': 'very high'}]
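If only a single winning dictionary is needed, as in the question's two-element example, max with a key function may already be enough. A short sketch, assuming every 'priority' value appears in the priorities list (list.index raises ValueError otherwise):

```python
priorities = ["low", "middle", "high", "very high"]
my_list = [
    {'name': 'service', 'priority': 'low'},
    {'name': 'service', 'priority': 'high'},
]

# Rank each dict by the position of its priority label in the list.
best = max(my_list, key=lambda item: priorities.index(item['priority']))
print(best)
# {'name': 'service', 'priority': 'high'}
```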

Removing duplicate dictionaries in list of dicts based on value uniqueness for a given key

I have a list of dictionaries:
dicts = [
{'id': 'item1', 'type': 'foo', 'metaId': 'metaId1'},
{'id': 'item2', 'type': 'foo', 'metaId': 'metaId2'},
{'id': 'item3', 'type': 'foo3', 'metaId': 'metaId3'},
{'id': 'item4', 'type': 'foo2', 'metaId': 'metaId2'},
{'id': 'item5', 'type': 'foo3', 'metaId': 'metaId3'},
{'id': 'item6', 'type': 'foo2', 'metaId': 'metaId2'},
{'id': 'item7', 'type': 'foo3', 'metaId': 'metaId3'},
{'id': 'item8', 'type': 'foo2', 'metaId': 'metaId2'},
{'id': 'item9', 'type': 'foo3', 'metaId': 'metaId3'}]
I want to loop through the list and create a new list, that contains dictionaries with unique values for key 'type'. I don't care which dictionaries stay, first instance with that key: value stays, the rest is omitted. So in the end I'd like to see:
expected = [
{'id': 'item1', 'type': 'foo', 'metaId': 'metaId1'},
{'id': 'item3', 'type': 'foo3', 'metaId': 'metaId3'},
{'id': 'item4', 'type': 'foo2', 'metaId': 'metaId2'}
]
Here is what I tried, definitely not what I need as it returns an empty list. I think I struggle with checking for a value in new sublist of dictionaries to make it excluded
keys_to_keep = set()
expected = []
for d in dicts:
    for key, value in d.items():
        if value not in expected:
            keys_to_keep.add(key)
    remove_keys = set(d) - keys_to_keep
for d in dicts:
    for k in remove_keys:
        del d[k]
dicts = expected
print(dicts)
The reason you always get an empty list is because you simply do:
dicts = expected
And expected is simply an empty list, which you never did anything to... not sure why you would think expected would ever change.
But you are overcomplicating things. Just keep a set of the unique values, and create a new list of dicts.
seen = set()
result = []
for d in dicts:
    if d['type'] not in seen:
        result.append(d)
        seen.add(d['type'])
This approach keeps the first dictionary encountered with that unique 'type'.
If, for example, you want the last one encountered, you could iterate over dicts in reverse order:
for d in reversed(dicts):
    ...
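Spelled out, the reversed variant might look like the sketch below. One detail worth noting is that iterating in reverse also builds the result in reverse, so this version flips it back at the end; whether that matters depends on the use case. The function name keep_last_by_type and the shortened sample data are made up for this example.

```python
def keep_last_by_type(dicts):
    seen = set()
    result = []
    for d in reversed(dicts):  # walk from the end, so the last occurrence wins
        if d['type'] not in seen:
            result.append(d)
            seen.add(d['type'])
    result.reverse()  # restore the original relative order
    return result


sample = [
    {'id': 'item1', 'type': 'foo'},
    {'id': 'item2', 'type': 'foo'},
    {'id': 'item3', 'type': 'foo3'},
    {'id': 'item4', 'type': 'foo2'},
    {'id': 'item5', 'type': 'foo3'},
]
print([d['id'] for d in keep_last_by_type(sample)])
# ['item2', 'item4', 'item5']
```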
You can create a temporary dictionary that holds the first dictionary encountered for each type, and then use the values that end up in it to create an updated list with one additional line of code.
dicts = [{'id': 'item1', 'type': 'foo', 'metaId': 'metaId1'},
{'id': 'item2', 'type': 'foo', 'metaId': 'metaId2'},
{'id': 'item3', 'type': 'foo3', 'metaId': 'metaId3'},
{'id': 'item4', 'type': 'foo2', 'metaId': 'metaId2'},
{'id': 'item5', 'type': 'foo3', 'metaId': 'metaId3'},
{'id': 'item6', 'type': 'foo2', 'metaId': 'metaId2'},
{'id': 'item7', 'type': 'foo3', 'metaId': 'metaId3'},
{'id': 'item8', 'type': 'foo2', 'metaId': 'metaId2'},
{'id': 'item9', 'type': 'foo3', 'metaId': 'metaId3'}]
temp = {}
for d in dicts:
    if d['type'] not in temp:
        temp[d['type']] = d
dicts = list(temp.values())  # Update list.
for d in dicts:
    print(d)
Keep track of the types already seen; when a type has not been seen yet, add its dictionary to the result list and mark the type as seen. A function that does this:
def transform(dicts):
    seen, result = set(), []
    for d in dicts:
        my_key = d['type']
        if my_key not in seen:
            result.append(d)
            seen.add(my_key)
    return result
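The same pattern can be generalized so that the deduplication key is passed in rather than hard-coded. This is only a sketch: unique_by and its key_func parameter are hypothetical names, and it relies on dictionaries preserving insertion order (Python 3.7+).

```python
def unique_by(dicts, key_func):
    """Keep the first dict for each distinct key_func(d) value."""
    first_seen = {}
    for d in dicts:
        first_seen.setdefault(key_func(d), d)  # later duplicates are ignored
    return list(first_seen.values())


dicts = [
    {'id': 'item1', 'type': 'foo', 'metaId': 'metaId1'},
    {'id': 'item2', 'type': 'foo', 'metaId': 'metaId2'},
    {'id': 'item3', 'type': 'foo3', 'metaId': 'metaId3'},
    {'id': 'item4', 'type': 'foo2', 'metaId': 'metaId2'},
]
by_type = unique_by(dicts, lambda d: d['type'])
by_meta = unique_by(dicts, lambda d: d['metaId'])
print([d['id'] for d in by_type])  # ['item1', 'item3', 'item4']
print([d['id'] for d in by_meta])  # ['item1', 'item2', 'item3']
```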

Python get only values in each dictionary

I intend to get the values from each dictionary in a list and put them in a new dictionary.
There are two keys and two values in each dictionary.
This is my array_list:
[{'Name': 'email',
'Value': 'mail#outlook.com'},
{'Name': 'name',
'Value': 'tester'},
{'Name': 'address',
'Value': 'abc'}]
My expected outcome (using both values from each dictionary):
{'email': 'mail#outlook.com',
'name': 'tester',
'address': 'abc'}
My current code:
outcome = {}
x = ""
for i in range(len(array_list)):
    for key,value in array_list[i].items():
        if key == 'Value':
            x = value
        elif key == 'Name':
            outcome[value] = x
I am still not able to get the expected outcome. Any help?
l = [{'Name': 'email',
'Value': 'mail#outlook.com'},
{'Name': 'name',
'Value': 'tester'},
{'Name': 'address',
'Value': 'abc'}]
{k['Name'] : k['Value'] for k in l}
the result is
{'address': 'abc', 'email': 'mail#outlook.com', 'name': 'tester'}
You are almost correct. There is just a problem in the if/elif.
After writing code, you should try to trace it by hand. Look carefully at your inner for loop. On each iteration it handles either 'Name' or 'Value', because if and elif are mutually exclusive, so when 'Name' is handled, x may still hold a value from an earlier dictionary. But the requirement is to create one key-value pair from each dictionary.
outcome = {}
array_list = [{'Name': 'email',
'Value': 'mail#outlook.com'},
{'Name': 'name',
'Value': 'tester'},
{'Name': 'address',
'Value': 'abc'}]
for i in range(len(array_list)):
    keys = array_list[i].keys()
    if 'Name' in keys and 'Value' in keys:
        outcome[array_list[i]['Name']] = array_list[i]['Value']
It is almost the same as your code, but the approach is different.
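To see concretely why the loop from the question misses, it can help to run it on the sample data. Because 'Name' comes before 'Value' in every dictionary, each name gets paired with the value left over from the previous dictionary. The snippet below reproduces that loop (iterating over the entries directly instead of by index) next to a direct lookup for comparison:

```python
array_list = [
    {'Name': 'email', 'Value': 'mail#outlook.com'},
    {'Name': 'name', 'Value': 'tester'},
    {'Name': 'address', 'Value': 'abc'},
]

# The loop from the question, with the same logic.
outcome = {}
x = ""
for entry in array_list:
    for key, value in entry.items():
        if key == 'Value':
            x = value
        elif key == 'Name':
            outcome[value] = x  # x still holds the previous entry's 'Value'

print(outcome)
# {'email': '', 'name': 'mail#outlook.com', 'address': 'tester'}

# Reading both keys from the same entry avoids the off-by-one pairing.
fixed = {entry['Name']: entry['Value'] for entry in array_list}
print(fixed)
# {'email': 'mail#outlook.com', 'name': 'tester', 'address': 'abc'}
```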

How to change values in a nested dictionary

I need to change values in a nested dictionary. Consider this dictionary:
stocks = {
'name': 'stocks',
'IBM': 146.48,
'MSFT': 44.11,
'CSCO': 25.54,
'micro': {'name': 'micro', 'age': 1}
}
I need to loop through all the keys and change the values of all the name keys.
stocks.name
stocks.micro.name
The values of these keys need to be changed, but I will not know beforehand which keys to change. So, I'll need to loop through the keys and change the values.
Example
change_keys("name", "test")
Output
{
'name': 'test',
'IBM': 146.48,
'MSFT': 44.11,
'CSCO': 25.54,
'micro': {'name': 'test', 'age': 1}
}
A recursive solution that supports an unknown number of nesting levels:
def change_key(d, required_key, new_value):
    for k, v in d.items():
        if isinstance(v, dict):
            change_key(v, required_key, new_value)
        if k == required_key:
            d[k] = new_value
stocks = {
'name': 'stocks',
'IBM': 146.48,
'MSFT': 44.11,
'CSCO': 25.54,
'micro': {'name': 'micro', 'age': 1}
}
change_key(stocks, 'name', 'new_value')
print(stocks)
# {'name': 'new_value',
# 'MSFT': 44.11,
# 'CSCO': 25.54,
# 'IBM': 146.48,
# 'micro': {'name': 'new_value',
# 'age': 1}
# }
def changeKeys(d, repl):
    for k, v in d.items():
        if isinstance(v, dict):
            changeKeys(v, repl)
        elif k == "name":
            d[k] = repl
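Both functions above only descend into values that are dictionaries. If the data might also contain lists of dictionaries (an assumption that goes beyond the question's example), the recursion could be extended roughly as follows. The name change_key_deep is made up for this sketch; like the answers above, it modifies the structure in place.

```python
def change_key_deep(obj, required_key, new_value):
    """Set every `required_key` to `new_value`, descending into dicts and lists."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == required_key:
                obj[k] = new_value  # replacing a value is safe while iterating
            else:
                change_key_deep(v, required_key, new_value)
    elif isinstance(obj, list):
        for item in obj:
            change_key_deep(item, required_key, new_value)


stocks = {
    'name': 'stocks',
    'IBM': 146.48,
    'micro': {'name': 'micro', 'age': 1},
    'others': [{'name': 'a'}, {'nested': {'name': 'b'}}],
}
change_key_deep(stocks, 'name', 'test')
print(stocks)
```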
