I have a somewhat complex list of dictionaries that looks like this:
[
{'Name': 'Something XYZ', 'Address': 'Random Address', 'Customer Number': '-', 'User Info': [{'Registration Number': '17002', 'First Name': 'John', 'Middle Name': '', 'Last Name': 'Denver'}, {'Registration Number': '27417', 'First Name': 'Robert', 'Middle Name': '', 'Last Name': 'Patson'}]},
{'Name': 'Something XYZ', 'Address': 'Random Address', 'Customer Number': '-', 'User Info': [{'Registration Number': '27417', 'First Name': 'Robert', 'Middle Name': '', 'Last Name': 'Patson'}, {'Registration Number': '17002', 'First Name': 'John', 'Middle Name': '', 'Last Name': 'Denver'}]}
]
The expected result is:
[
{'Name': 'Something XYZ', 'Address': 'Random Address', 'Customer Number': '-', 'User Info': [{'Registration Number': '17002', 'First Name': 'John', 'Middle Name': '', 'Last Name': 'Denver'}, {'Registration Number': '27417', 'First Name': 'Robert', 'Middle Name': '', 'Last Name': 'Patson'}]},
]
I want to remove the duplicate dictionaries in this list but I don't know how to deal with User Info because the order of the items might be different. A duplicate case would be where all the dictionary items are exactly the same and in the case of User Info order doesn't matter.
I think the best way is to hash User Info by summing the hash values of its elements (a sum is unaffected by a change in position).
def deepHash(value):
    # Lists: sum the element hashes so that order does not matter
    if isinstance(value, list):
        return sum([deepHash(x) for x in value])
    # Dicts: combine each key hash with its value hash, then sum
    if isinstance(value, dict):
        return sum([deepHash(x) * deepHash(y) for x, y in value.items()])
    return hash(str(value))
and you can simply check the hash of your inputs:
assert deepHash({"a": [1,2,3], "c": "d"}) == deepHash({"c": "d", "a": [3,2,1]})
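As a possible alternative, here is a minimal sketch that compares an order-insensitive canonical form directly instead of a hash, so it does not depend on two different values never sharing a hash. The names canonical and dedupe are hypothetical, and the sketch assumes that every list in the data should be treated as unordered and that all leaf values are hashable.

```python
def canonical(value):
    """Return a hashable, order-insensitive representation of nested data."""
    if isinstance(value, dict):
        items = ((key, canonical(val)) for key, val in value.items())
        return ('dict', tuple(sorted(items, key=repr)))
    if isinstance(value, list):
        # Assumption: list order never matters, so sort the canonical elements
        return ('list', tuple(sorted((canonical(v) for v in value), key=repr)))
    return value


def dedupe(records):
    """Keep the first occurrence of each record, comparing canonical forms."""
    seen = set()
    unique = []
    for record in records:
        key = canonical(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


data = [
    {'Name': 'Something XYZ',
     'User Info': [{'Registration Number': '17002'}, {'Registration Number': '27417'}]},
    {'Name': 'Something XYZ',
     'User Info': [{'Registration Number': '27417'}, {'Registration Number': '17002'}]},
]
unique = dedupe(data)
print(len(unique))  # prints 1
```

Sorting by repr is only a convenient way to get a stable order for mixed values; if some lists in your data are order-sensitive, they would need to be handled separately.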
Using this dictionary, is there a way I can only extract the Name, Last Name, and Age of the boys?
myDict = {'boy1': {'Name': 'JM', 'Last Name':'Delgado', 'Middle Name':'Goneza', 'Age':'21',
'Birthday':'8/22/2001', 'Gender':'Male'},
'boy2': {'Name': 'Ralph', 'Last Name':'Tubongbanua', 'Middle Name':'Castro',
'Age':'21', 'Birthday':'9/5/2001', 'Gender':'Male'},}
for required in myDict.values():
    print (required ['Name', 'Last Name', 'Age'])
The output is:
JM
Ralph
What I have in mind is
JM Delgado 21
Ralph Tubongbanua 21
You have to extract the keys one by one:
myDict = {'boy1': {'Name': 'JM', 'Last Name':'Delgado', 'Middle Name':'Goneza', 'Age':'21',
'Birthday':'8/22/2001', 'Gender':'Male'},
'boy2': {'Name': 'Ralph', 'Last Name':'Tubongbanua', 'Middle Name':'Castro',
'Age':'21', 'Birthday':'9/5/2001', 'Gender':'Male'},}
for required in myDict.values():
    print(required['Name'], required['Last Name'], required['Age'])
This could be a solution:
myDict = {'boy1': {'Name': 'JM', 'Last Name':'Delgado', 'Middle Name':'Goneza', 'Age':'21', 'Birthday':'8/22/2001', 'Gender':'Male'},
'boy2': {'Name': 'Ralph', 'Last Name':'Tubongbanua', 'Middle Name':'Castro',
'Age':'21', 'Birthday':'9/5/2001', 'Gender':'Male'},}
for required in myDict.values():
    print(required['Name'], required['Last Name'], required['Age'])
When printing multiple values separated with commas, a space will automatically appear between them.
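If the list of wanted keys grows, a small sketch using operator.itemgetter from the standard library may be easier to maintain than naming each key inside print. The variable names wanted, get_fields and lines are hypothetical, and the sketch assumes every inner dictionary contains all the wanted keys with string values.

```python
from operator import itemgetter

myDict = {
    'boy1': {'Name': 'JM', 'Last Name': 'Delgado', 'Middle Name': 'Goneza',
             'Age': '21', 'Birthday': '8/22/2001', 'Gender': 'Male'},
    'boy2': {'Name': 'Ralph', 'Last Name': 'Tubongbanua', 'Middle Name': 'Castro',
             'Age': '21', 'Birthday': '9/5/2001', 'Gender': 'Male'},
}

wanted = ('Name', 'Last Name', 'Age')
# itemgetter with several keys returns a tuple of the matching values
get_fields = itemgetter(*wanted)

lines = [' '.join(get_fields(boy)) for boy in myDict.values()]
for line in lines:
    print(line)
```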
I have an initial code like this:
record = "Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001
..."
I need to make it into a dictionary and its output should be something like this:
{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'}
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'}
...
Each line should have an incrementing key starting from 1.
I currently have no idea how to approach this issue, as I am still a student with no prior experience.
I have seen people use this for strings containing key-value pairs but my string does not contain those:
mydict = dict((k.strip(), v.strip()) for k,v in
(item.split('-') for item in record.split(',')))
Use split:
In [220]: ans = []
In [221]: record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
In [223]: l = record.split(';')
In [227]: for i in l:
     ...:     l1 = i.split(',')
     ...:     d = {'First Name': l1[0], 'Last Name': l1[1], 'Birthday': l1[2]}
     ...:     ans.append(d)
     ...:
In [228]: ans
Out[228]:
[{'First Name': 'Jane', 'Last Name': 'Doe', 'Birthday': '25/02/2002'},
{'First Name': 'James', 'Last Name': 'Poe', 'Birthday': '19/03/1998'},
{'First Name': 'Max', 'Last Name': 'Soe', 'Birthday': '16/12/2001'}]
To make the required dictionary for a single line, you can use split to chop up the line where there are commas (','), to get the values for the dictionary, and hard-code the keys. E.g.
line = "Jane,Doe,25/02/2002"
values = line.split(",")
d = {"First Name": values[0], "Last Name": values[1], "Birthday": values[2]}
Now to repeat that for each line in the record, a list of all the lines is needed. Again, you can use split in this case to chop up the input where there are semicolons (';'). E.g.
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
lines = record.split(";")
Now you can iterate the solution for one line over this lines list, collecting the results into another list.
results = []
for line in lines:
    values = line.split(",")
    results.append({"First Name": values[0], "Last Name": values[1], "Birthday": values[2]})
The incremental key requirement you mention seems strange, because you could just keep them in a list, where the index in the list is effectively the key. But of course, if you really need the indexed-dictionary thing, you can use a dictionary comprehension to do that.
results = {i + 1: results[i] for i in range(len(results))}
Finally, the whole thing might be made more concise (and nicer IMO) by using a combination of list and dictionary comprehensions, as well as a list of your expected keys.
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
keys = ["First Name", "Last Name", "Birthday"]
results = [dict(zip(keys, line.split(","))) for line in record.split(";")]
With the optional indexed-dictionary thingy:
results = {i + 1: results[i] for i in range(len(results))}
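The indexed dictionary can also be built in a single pass with enumerate and its start argument, which avoids indexing back into the list. This is only a sketch of the same idea; the name indexed is hypothetical.

```python
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
keys = ["First Name", "Last Name", "Birthday"]

# enumerate(..., start=1) yields (1, line), (2, line), ... so the keys begin at 1
indexed = {
    number: dict(zip(keys, line.split(",")))
    for number, line in enumerate(record.split(";"), start=1)
}
print(indexed[1])
```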
This should work for your case:
lines = [line.replace('\n','').replace('.','').strip() for line in record.split(';')]
desired_dict = {}
# start=1 so that the keys begin at 1, as the question asks
for i, line in enumerate(lines, start=1):
    words = line.split(',')
    desired_dict[i] = {
        'First name':words[0],
        'Last name':words[1],
        'Birthday':words[2]
    }
The .split() method is useful here. First split the string on ; and then split each resulting piece on ,.
record = """Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001"""
out = []
for rec in record.split(';'):
    lst = rec.strip().split(',')
    dict_new = {}
    dict_new['First Name'] = lst[0]
    dict_new['Last Name'] = lst[1]
    dict_new['Birthday'] = lst[2]
    out.append(dict_new)
print(out)
The other answers are already quite clear; I just want to add that you can do it in one line (which is much less readable and not recommended, but arguably fancier). It also accounts for possible surrounding spaces with strip(); you can remove those calls if you don't need them. This gives you the list of dicts you need:
record_dict = [{'First name': val[0].strip(), 'Last name': val[1].strip(), 'Birthday': val[2].strip()} for val in (rec.strip().split(',') for rec in record.strip().split(';'))]
I think you are looking for :
record = """Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001"""
num = 0
out = dict()
for v in record.split(";"):
    v = v.strip().split(",")
    num += 1
    out[num] = {'First name':v[0],'Last name':v[1], 'Birthday':v[2]}
print(out)
prints:
{1: {'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
2: {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
3: {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}}
# raw string data
record = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
# list of lists
list_of_lists = [x.split(',') for x in record.split(';')]
# list of dicts
list_of_dicts = []
for x in list_of_lists:
    # assemble into dict
    d = {'First name': x[0],
         'Last name': x[1],
         'Birthday': x[2]}
    # append to list
    list_of_dicts.append(d)
output:
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
Here is a step by step Pythonic way to achieve that:
>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = '''Jane,Doe,25/02/2002
... James,Poe,19/03/1998
... Max,Soe,16/12/2001'''
>>> records = records.split()
>>> pprint(records)
['Jane,Doe,25/02/2002',
'James,Poe,19/03/1998',
'Max,Soe,16/12/2001']
>>> records = [_.split(',') for _ in records]
>>> pprint(records)
[['Jane', 'Doe', '25/02/2002'],
['James', 'Poe', '19/03/1998'],
['Max', 'Soe', '16/12/2001']]
>>> records = [dict(zip(columns, _)) for _ in records]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
If you have all the records on one line, delimited by a ; character, then you can do this:
>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
>>> records = records.split(';')
>>> pprint(records)
['Jane,Doe,25/02/2002',
'James,Poe,19/03/1998',
'Max,Soe,16/12/2001']
>>> records = [_.split(',') for _ in records]
>>> pprint(records)
[['Jane', 'Doe', '25/02/2002'],
['James', 'Poe', '19/03/1998'],
['Max', 'Soe', '16/12/2001']]
>>> records = [dict(zip(columns, _)) for _ in records]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
And finally you can put it all together in one line:
>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
>>> # All tasks in one line now
>>> records = [dict(zip(columns, _)) for _ in [_.split(',') for _ in records.split(';')]]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
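For comparison, the same split-and-zip steps can be delegated to csv.DictReader from the standard library, which accepts any iterable of lines together with a fieldnames list. This is a sketch rather than a recommendation; it assumes the records are separated by ; and that no field contains a semicolon.

```python
import csv

columns = ['First name', 'Last name', 'Birthday']
record = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'

# Each piece produced by split(';') is handed to the reader as one CSV line
reader = csv.DictReader(record.split(';'), fieldnames=columns)
rows = [dict(row) for row in reader]
print(rows)
```

One practical difference is that the csv module also handles quoted fields that contain commas, which a plain split(',') would break apart.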
List comprehensions make it easy.
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
list_of_records = [item.split(',') for item in record.split(';')]
dict_of_records = [{'first_name':line[0], 'last_name':line[1], 'Birthday':line[2]} for line in list_of_records]
print(dict_of_records)
Output:
[{'first_name': 'Jane', 'last_name': 'Doe', 'Birthday': '25/02/2002'}, {'first_name': 'James', 'last_name': 'Poe', 'Birthday': '19/03/1998'}, {'first_name': 'Max', 'last_name': 'Soe', 'Birthday': '16/12/2001'}]
You can do it without writing any loops by using re.sub() together with json.loads():
import re
import json
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
sub_record = re.sub(r'\b;?([a-zA-Z]+),([a-zA-Z]+),(\d\d/\d\d/\d\d\d\d)',r',{"First name": "\1", "Last name": "\2", "Birthday": "\3"}',record)
mydict = json.loads('['+sub_record[1:]+']')
print(mydict)
Output:
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'}, {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'}, {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
With some regex:
import re
[re.match(r'(?P<First_name>\w+),(?P<Last_name>\w+),(?P<Birthday>.+)', r).groupdict() for r in record.split(';')]
Unfortunately, the underscores in First_name and Last_name are unavoidable, because regex group names must be valid Python identifiers and so cannot contain spaces.
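If the keys must contain spaces, one workaround is to rename them after matching. The following is a minimal sketch, assuming record holds all the entries on a single line separated by ; and that every entry matches the pattern; the names pattern and rows are hypothetical.

```python
import re

record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
pattern = re.compile(r'(?P<First_name>\w+),(?P<Last_name>\w+),(?P<Birthday>.+)')

rows = [
    # Replace the underscore in each group name with a space
    {name.replace('_', ' '): value
     for name, value in pattern.match(entry).groupdict().items()}
    for entry in record.split(';')
]
print(rows)
```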
Here is my code for displaying the records that match a search and for informing the user when nothing is found.
Problem: the else part runs as many times as the outer loop iterates.
entries = [{'First Name': 'Sher', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '2989484'},
{'First Name': 'Ali', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '398439'},
{'First Name': 'Talha', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '3343434'}]
search = input("type your search: ")
print(search)
for person in entries:
    # print(person)
    if person["Last Name"] == search:
        print("Here are the records found for your search")
        for e in person:
            print(e, ":", person[e])
    else:
        print("There is no record found as you search Keyword")
That's because each iteration checks only one person, and whenever that person does not match, you print that no record exists.
This is not the behavior you want.
A better solution is to first collect all the matching records and only then decide what to print:
...
search = input("type your search: ")
# Keep only the records that match, using a list comprehension
founds = [entry for entry in entries if entry["Last Name"] == search]
if founds:
    for found in founds:
        for key in found:
            print(key, ":", found[key])
else:
    print("There is no record found as you search Keyword")
First, check if the Last Name that the user enters is present in the dictionaries. If yes, then loop through them and print the respective records. Else, display no records found. Here is how you do it:
entries = [{'First Name': 'Sher', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '2989484'},
{'First Name': 'Ali', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '398439'},
{'First Name': 'Talha', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '3343434'}]
search = input("type your search: ")
print(search)
if search in [person['Last Name'] for person in entries]:
    for person in entries:
        if person["Last Name"] == search:
            print("Here are the records found for your search")
            for e in person:
                print(e, ":", person[e])
else:
    print("There is no record found as you search Keyword")
Output:
type your search: >? Khan
Khan
Here are the records found for your search
First Name : Sher
Last Name : Khan
Age : 22
Telephone : 2989484
Here are the records found for your search
First Name : Ali
Last Name : Khan
Age : 22
Telephone : 398439
Here are the records found for your search
First Name : Talha
Last Name : Khan
Age : 22
Telephone : 3343434
type your search: >? Jones
Jones
There is no record found as you search Keyword
Try it like this (using a boolean found variable):
entries = [{'First Name': 'Sher', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '2989484'},
{'First Name': 'Ali', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '398439'},
{'First Name': 'Talha', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '3343434'},
{'First Name': 'Talha', 'Last Name': 'Jones', 'Age': '22', 'Telephone': '3343434'}]
search = input("type your search: ")
found = False
print(search)
for person in entries:
    if person["Last Name"] == search:
        found = True
        print("Here are the records found for your search")
        for e in person:
            print(e, ":", person[e])
if not found:
    print("There is no record found as you search Keyword")
This might not be the best way of doing it but this would work I think
entries = [{'First Name': 'Sher', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '2989484'},
{'First Name': 'Ali', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '398439'},
{'First Name': 'Talha', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '3343434'}]
inEntries = False
search = input("type your search: ")
print(search)
for person in entries:
    # print(person)
    if person["Last Name"] == search:
        inEntries = True
        print("Here are the records found for your search")
        for e in person:
            print(e, ":", person[e])
if not inEntries:
    print("There is no record found as you search Keyword")
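The flag-based versions above can also be arranged so that the search is separate from the printing, which makes the "nothing found" case a simple check on an empty list. This is a sketch only: the helper names find_by_last_name and describe are hypothetical, and the search term is hard-coded here instead of being read with input() so that the example runs on its own.

```python
entries = [
    {'First Name': 'Sher', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '2989484'},
    {'First Name': 'Ali', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '398439'},
    {'First Name': 'Talha', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '3343434'},
]


def find_by_last_name(records, last_name):
    """Return every record whose 'Last Name' equals last_name."""
    return [person for person in records if person["Last Name"] == last_name]


def describe(matches):
    """Build the text to show: one block per match, or a single not-found message."""
    if not matches:
        return "There is no record found for your search keyword"
    blocks = []
    for person in matches:
        fields = [f"{key} : {value}" for key, value in person.items()]
        blocks.append("\n".join(["Here is a record found for your search"] + fields))
    return "\n".join(blocks)


print(describe(find_by_last_name(entries, "Khan")))
print(describe(find_by_last_name(entries, "Jones")))
```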