dict.update overwrites existing keys, how to avoid? - python

When using the update function to merge two dictionaries in Python, keys that exist in both dictionaries are apparently overwritten.
A simple example:
simple_dict_one = {'name': "tom", 'age': 20}
simple_dict_two = {'name': "lisa", 'age': 17}
simple_dict_one.update(simple_dict_two)
After the dicts are merged the following dict remains:
{'age': 17, 'name': 'lisa'}
So if you have the same key in both dicts, only one value remains (the last one, apparently).
I have a lot of names from several sources, so I would probably want a temp dict from each of those and then add it to one bigger dict.
Is there a way to merge two dicts and still keep all the keys? I guess you are only supposed to have one unique key, but then how would I merge two dicts without losing data?

Well, I have several sources I gather information from, for example an LDAP database and other sources, where I have Python functions that each create a temp dict, but I want a complete dict at the end that sort of concatenates or displays all the information gathered from all the sources, so I would have one dict holding all the info.
What you are trying to do with the 'merging' is not quite ideal. As you said yourself:
I guess you are only supposed to have one unique key
which makes it relatively and unnecessarily hard to gather all your information in one dict.
What you could do, instead of calling .update() on the existing dict, is add a sub-dict. Its key could be the name of the source from which you gathered the information, and its value the dict you receive from that source; if you need to store more than one dict from the same source, you can store them in a list.
Example
>>> data = {}
>>> person_1 = {'name': 'lisa', 'age': 17}
>>> person_2 = {'name': 'tom', 'age': 20}
>>> data['people'] = [person_1, person_2]
>>> data
{'people': [{'age': 17, 'name': 'lisa'}, {'age': 20, 'name': 'tom'}]}
Then whenever you need to add newly gathered information, you just add a new entry to the data dict
>>> ldap_data = {'foo': 1, 'bar': 'baz'} # just some dummy data
>>> data['ldap_data'] = ldap_data
>>> data
{'people': [{'age': 17, 'name': 'lisa'}, {'age': 20, 'name': 'tom'}],
'ldap_data': {'foo': 1, 'bar': 'baz'}}
The source-specific data is easily extractable from the data dict
>>> data['people']
[{'age': 17, 'name': 'lisa'}, {'age': 20, 'name': 'tom'}]
>>> data['ldap_data']
{'foo': 1, 'bar': 'baz'}
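If you really do want to merge flat dicts key by key without dropping anything, here is a minimal sketch (not part of the answer above; merge_keep_all is just an illustrative name) that collects colliding values into lists:
from collections import defaultdict

def merge_keep_all(*dicts):
    # values that share a key end up together in one list
    merged = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            merged[key].append(value)
    return dict(merged)

print(merge_keep_all({'name': 'tom', 'age': 20}, {'name': 'lisa', 'age': 17}))
# {'name': ['tom', 'lisa'], 'age': [20, 17]}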


validation check - dictionary value types [duplicate]

After converting my csv to dictionary with pandas, a sample of the dictionary will look like this:
[{'Name': '1234', 'Age': 20},
{'Name': 'Alice', 'Age': 30.1},
{'Name': '5678', 'Age': 41.0},
{'Name': 'Bob 1', 'Age': 14},
{'Name': '!##$%', 'Age': 65}]
My goal is to do a validation check on whether the columns are strings. I'm trying to use the pandera or schema libraries to achieve this, as the csv may contain a million rows. Therefore, I am trying to convert the dict as follows.
[{'Name': 1234, 'Age': 20},
{'Name': 'Alice', 'Age': 30.1},
{'Name': 5678, 'Age': 41.0},
{'Name': 'Bob 1', 'Age': 14},
{'Name': '!##$%', 'Age': 65}]
After converting the csv data to a dict, I use the following code to check whether Name is a string.
import pandas as pd
from schema import Schema, And, Use, Optional, SchemaError

schema = Schema([{'Name': str,
                  'Age': float}])
validated = schema.validate(data)  # data is the list of dicts shown above
Is it possible?
Is it possible?
For sure. You can use the int constructor to convert those strings to integers where possible.
for element in list_:
    try:
        element["Name"] = int(element["Name"])
    except ValueError:
        pass
A faster way to do it would be to use the isdigit method of str:
for element in list_:
    if element["Name"].isdigit():  # otherwise no need to convert
        element["Name"] = int(element["Name"])
That way you don't have to enter the try/except block at all.
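Putting the conversion and the validation together, here is a minimal sketch; it assumes the schema library imported in the question, and the Or(int, str) / Or(int, float) rules are my assumption about what the converted data should look like:
from schema import Schema, Or, SchemaError

records = [{'Name': '1234', 'Age': 20},
           {'Name': 'Alice', 'Age': 30.1},
           {'Name': '5678', 'Age': 41.0}]

# convert digit-only names to int before validating
for element in records:
    if element['Name'].isdigit():
        element['Name'] = int(element['Name'])

schema = Schema([{'Name': Or(int, str), 'Age': Or(int, float)}])
try:
    validated = schema.validate(records)
except SchemaError as exc:
    print('validation failed:', exc)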

making old keys the values for a new dictionary with list comprehension

I am trying to make a new list of dictionaries using a list comprehension. I have an old list whose dictionaries have 'age' and 'email' keys with their associated values. I want to create a new list of dictionaries where the 'age' and 'email' values become the VALUES of new keys called 'new_age' and 'new_email'.
How would I accomplish this?
entries = [{'age': 65, 'name': 'Tim', 'email': 'tim#bob.com'},
           {'age': 72, 'name': 'Andy', 'email': 'andy#bob.com'},
           {'age': 50, 'name': 'Bob', 'email': 'bob#bob.com'},
           {'age': 30, 'name': 'Shelly', 'email': 'shelly#shelly.com'}]
x =[{dictionary['new_age'],dictionary['new_email']} for dictionary in entries if dictionary['age'] >= 50]
So my new list of dictionaries 'x' is supposed to hold a new dictionary for each entry whose 'age' is 50 or above, containing just that entry's 'age' and 'email'.
So the form will look like this:
[{'new_age': 65, 'new_email': 'tim#bob.com'}, {...}, etc.]
My x list is just an example; it prints out the email and age if the age is 50 or above, but I need the key-value pairs in there as well, and this is where I am stuck.
{dictionary['new_age'],dictionary['new_email']} creates a set, not a dictionary (and it wouldn't work anyway because dictionary, which is an element of entries, doesn't contain the keys new_age and new_email)
To create a dictionary, you need key-value pairs like so:
[
    {'new_email': dictionary['email'], 'new_age': dictionary['age']}
    for dictionary in entries
    if dictionary['age'] >= 50
]
which gives what you're looking for:
[{'new_email': 'tim#bob.com', 'new_age': 65},
{'new_email': 'andy#bob.com', 'new_age': 72},
{'new_email': 'bob#bob.com', 'new_age': 50}]
You just need to fix how you build the dictionaries inside the list comprehension:
x = [
    {'new_age': dictionary['age'], 'new_email': dictionary['email']}
    for dictionary in entries
    if dictionary['age'] >= 50
]
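If more keys need renaming later, a hedged generalization (key_map is just an illustrative name, not something from the question) is to drive the comprehension from an old-to-new key mapping:
# hypothetical mapping of old keys to new keys; extend as needed
key_map = {'age': 'new_age', 'email': 'new_email'}

x = [
    {new_key: entry[old_key] for old_key, new_key in key_map.items()}
    for entry in entries
    if entry['age'] >= 50
]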

Filtering through a list with embedded dictionaries

I've got a JSON-format list of dictionaries, each with a nested dictionary inside; it looks like the following:
[{"id":13, "name":"Albert", "venue":{"id":123, "town":"Birmingham"}, "month":"February"},
{"id":17, "name":"Alfred", "venue":{"id":456, "town":"London"}, "month":"February"},
{"id":20, "name":"David", "venue":{"id":14, "town":"Southampton"}, "month":"June"},
{"id":17, "name":"Mary", "venue":{"id":56, "town":"London"}, "month":"December"}]
The number of entries in the list can be up to 100. I plan to present the 'name' of each entry, one result at a time, for those that have London as the town. The rest are of no use to me. I'm a beginner at Python, so I would appreciate a suggestion on how to go about this efficiently. I initially thought it would be best to remove all entries that don't have London and then go through the remainder one by one.
I also wondered if it might be quicker not to filter, but to cycle through the entire JSON and select the names of the entries whose town is London.
You can use filter:
data = [{"id":13, "name":"Albert", "venue":{"id":123, "town":"Birmingham"}, "month":"February"},
{"id":17, "name":"Alfred", "venue":{"id":456, "town":"London"}, "month":"February"},
{"id":20, "name":"David", "venue":{"id":14, "town":"Southampton"}, "month":"June"},
{"id":17, "name":"Mary", "venue":{"id":56, "town":"London"}, "month":"December"}]
london_dicts = filter(lambda d: d['venue']['town'] == 'London', data)

for d in london_dicts:
    print(d)
This is about as efficient as it gets, because:
the looping inside filter is done in C (in the case of CPython), and
filter returns an iterator (in Python 3), so the results are produced lazily, one at a time, as they are needed.
One way is to use a list comprehension:
>>> data = [{"id":13, "name":"Albert", "venue":{"id":123, "town":"Birmingham"}, "month":"February"},
{"id":17, "name":"Alfred", "venue":{"id":456, "town":"London"}, "month":"February"},
{"id":20, "name":"David", "venue":{"id":14, "town":"Southampton"}, "month":"June"},
{"id":17, "name":"Mary", "venue":{"id":56, "town":"London"}, "month":"December"}]
>>> [d for d in data if d['venue']['town'] == 'London']
[{'id': 17,
'name': 'Alfred',
'venue': {'id': 456, 'town': 'London'},
'month': 'February'},
{'id': 17,
'name': 'Mary',
'venue': {'id': 56, 'town': 'London'},
'month': 'December'}]
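Since the question only needs the names, a minimal follow-up sketch (using the same data variable as above) pulls them out directly:
london_names = [d['name'] for d in data if d['venue']['town'] == 'London']
print(london_names)
# ['Alfred', 'Mary']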

Insert new element to json dictionary with duplicate keys

I'm having trouble adding a new element to a JSON Dictionary.
The issue seems to be related to python dictionaries not allowing duplicate keys. How can I deal with this restriction?
import json
import datetime
current_dict = json.loads(open('cad_data.json').read())
print(current_dict)
# {'entries': [{'cad_value': '518', 'timestamp': '2017-10-24 16:15:34.813480'}, {'cad_value': '518', 'timestamp': '2017-10-24 17:15:34.813480'}]}
new_data = {'timestamp': datetime.datetime.now(), 'cad_value': '518'}
current_dict.update(new_data)
print(current_dict)
# {'entries': [{'cad_value': '518', 'timestamp': '2017-10-24 16:15:34.813480'}, {'cad_value': '518', 'timestamp': '2017-10-24 17:15:34.813480'}], 'timestamp': datetime.datetime(2017, 10, 25, 13, 44, 20, 548904), 'cad_value': '518'}
My code leads to an invalid dictionary/json.
You updated the outermost dictionary. You don't want to update any dictionary; you want to add another dictionary to the entries list:
current_dict['entries'].append(new_data)
Here current_dict['entries'] is an expression that resolves to the list object with dictionaries, and the above calls list.append() on that list object to add the new_data reference to the list, effectively adding another dictionary.
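To round this out, here is a minimal sketch of the full read-append-write cycle, assuming the same cad_data.json layout; note that datetime objects are not JSON-serializable by default, so the timestamp is stored as an ISO string:
import json
import datetime

with open('cad_data.json') as f:
    current_dict = json.load(f)

# store the timestamp as a string so json.dump can serialize it
new_data = {'timestamp': datetime.datetime.now().isoformat(), 'cad_value': '518'}
current_dict['entries'].append(new_data)

with open('cad_data.json', 'w') as f:
    json.dump(current_dict, f, indent=2)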

Use list of indices to manipulate a nested dictionary

I'm trying to perform operations on a nested dictionary (data retrieved from a yaml file):
data = {'services': {'web': {'name': 'x'}}, 'networks': {'prod': 'value'}}
I'm trying to modify the above using inputs like:
{'services.web.name': 'new'}
I converted the above to a list of keys, ['services', 'web', 'name'], but I'm not sure how to perform the operation below in a loop:
data['services']['web']['name'] = 'new'
That way I can modify the data dict. There are other values I plan to change in the above dictionary (it is an extensive one), so I need a solution that also works in cases where I have to change, e.g.:
data['services2']['web2']['networks']['local']
Is there an easy way to do this? Any help is appreciated.
You may iterate over the keys while moving a reference:
data = {'networks': {'prod': 'value'}, 'services': {'web': {'name': 'x'}}}
modification = {'services.web.name': 'new'}
for key, value in modification.items():
    keyparts = key.split('.')
    to_modify = data
    for keypart in keyparts[:-1]:
        to_modify = to_modify[keypart]
    to_modify[keyparts[-1]] = value

print(data)
Giving:
{'networks': {'prod': 'value'}, 'services': {'web': {'name': 'new'}}}
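If this needs to be reused for many dotted keys, a small helper along the same lines might look like this (set_by_path is just an illustrative name, not something from the question or answer):
def set_by_path(data, dotted_key, value):
    # walk down to the dict that holds the final key, then assign
    *parents, last = dotted_key.split('.')
    target = data
    for part in parents:
        target = target[part]
    target[last] = value

set_by_path(data, 'services.web.name', 'newer')
print(data['services']['web']['name'])  # newer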
