Python Dictionaries: Grouping Key, Value pairs based on a common key, value - python

I have a list that contains dictionaries like this:
list1 = [{'name': 'bob', 'email': 'bob#bob.com', 'address': '123 house lane',
'student_id': 12345}, {'name': 'steve', 'email': 'steve#steve.com',
'address': '456 house lane', 'student_id': 34567}, {'name': 'bob',
'email': 'bob2#bob2.com', 'address': '789 house lane', 'student_id': 45678}]
Is there a way in Python to group selected key-value pairs into a new dictionary based on the 'name' value? For instance, something like this as an end result:
new_list = [
    {'name': 'bob',
     'emails': ['bob#bob.com',
                'bob2#bob2.com'],
     'address': ['123 house lane',
                 '789 house lane']},
    {'name': 'steve',
     'email': ...,
     'address': ...},
    # let's assume list1 has various entries at some point,
    # which may or may not have duplicate 'name' values,
    # and new_list will hold the groupings
]

Sounds like this is what you want to do:
list1 = [{'name': 'bob', 'email': 'bob#bob.com',
'address': '123 house lane', 'student_id': 12345},
{'name': 'steve', 'email': 'steve#steve.com',
'address': '456 house lane', 'student_id': 34567},
{'name': 'bob', 'email': 'bob2#bob2.com',
'address': '789 house lane', 'student_id': 45678}]
import itertools
import operator
# groupby only merges consecutive items, so sort by name first
list1.sort(key=operator.itemgetter('name'))
new_list = []
for studentname, dicts in itertools.groupby(list1, operator.itemgetter('name')):
    d = {'name': studentname}
    for dct in dicts:
        for key, value in dct.items():
            if key == 'name':
                continue
            d.setdefault(key, []).append(value)
    new_list.append(d)
DEMO:
[{'address': ['123 house lane', '789 house lane'],
'email': ['bob#bob.com', 'bob2#bob2.com'],
'name': 'bob',
'student_id': [12345, 45678]},
{'address': ['456 house lane'],
'email': ['steve#steve.com'],
'name': 'steve',
'student_id': [34567]}]
If you were going to use this extensively you should probably hard-code some better names (addresses instead of address for instance) and make a mapping that populates them for you.
keys_mapping = {'address': 'addresses',
'email': 'emails',
'student_id': 'student_ids'}
new_list = []
for studentname, dicts in itertools.groupby(list1, operator.itemgetter('name')):
    d = {'name': studentname}
    for dct in dicts:
        for key, value in dct.items():
            if key == 'name':
                continue  # 'name' is already set and is not a list
            # use the mapped name if one is defined, else keep `key`
            new_key = keys_mapping.get(key, key)
            d.setdefault(new_key, []).append(value)
    new_list.append(d)
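The sort before groupby matters because itertools.groupby only merges consecutive items that share a key. A minimal sketch of the difference, using a shortened, hypothetical version of the question's data (the names records and by_name are made up for this illustration):

```python
import itertools
import operator

# Hypothetical sample data, shortened from the question.
records = [
    {'name': 'bob', 'email': 'bob#bob.com'},
    {'name': 'steve', 'email': 'steve#steve.com'},
    {'name': 'bob', 'email': 'bob2#bob2.com'},
]
by_name = operator.itemgetter('name')

# Without sorting, 'bob' shows up as two separate groups.
unsorted_groups = [name for name, _ in itertools.groupby(records, by_name)]

# After sorting by the same key, each name forms exactly one group.
sorted_groups = [
    name for name, _ in itertools.groupby(sorted(records, key=by_name), by_name)
]

print(unsorted_groups)  # ['bob', 'steve', 'bob']
print(sorted_groups)    # ['bob', 'steve']
```

If the input order must be preserved, a dictionary-based grouping that needs no sorting may be a better fit.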

The code below builds a nested dictionary keyed by name. A dictionary lets you look up a name directly, whereas with a list you have to loop over the entries to find it.
list1 = [{'name': 'bob', 'email': 'bob#bob.com', 'address': '123 house lane',
'student_id': 12345}, {'name': 'steve', 'email': 'steve#steve.com',
'address': '456 house lane', 'student_id': 34567}, {'name': 'bob',
'email': 'bob2#bob2.com', 'address': '789 house lane', 'student_id': 45678}]
dict1 = {}
for content in list1:
    if content['name'] in dict1:
        # existing name: append to the lists already stored
        dict1[content['name']]['emails'].append(content['email'])
        dict1[content['name']]['addresses'].append(content['address'])
    else:
        dict1[content['name']] = {'emails': [content['email']], 'addresses': [content['address']]}
print(dict1)
The output is:
{'bob': {'emails': ['bob#bob.com', 'bob2#bob2.com'], 'addresses': ['123 house lane', '789 house lane']}, 'steve': {'emails': ['steve#steve.com'], 'addresses': ['456 house lane']}}
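The same name-keyed grouping can also be sketched with collections.defaultdict, which removes the need for an explicit membership check. This is only an illustration; the variable names grouped and entry are not from the original answer:

```python
from collections import defaultdict

list1 = [
    {'name': 'bob', 'email': 'bob#bob.com',
     'address': '123 house lane', 'student_id': 12345},
    {'name': 'steve', 'email': 'steve#steve.com',
     'address': '456 house lane', 'student_id': 34567},
    {'name': 'bob', 'email': 'bob2#bob2.com',
     'address': '789 house lane', 'student_id': 45678},
]

# Every name seen for the first time starts with two empty lists.
grouped = defaultdict(lambda: {'emails': [], 'addresses': []})
for content in list1:
    entry = grouped[content['name']]
    entry['emails'].append(content['email'])
    entry['addresses'].append(content['address'])

print(dict(grouped))
```

Looking up one student is then a single key access such as grouped['bob'], with no loop over the list.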

Related

Creating a Python dictionary from other nested list containing dictionary in python

I have this list that contains dictionaries as its element
dict_1 = [{'id': '0eb7df70-f319-4562-ab2a-9e641e978b3b', 'first_name': 'Rahx', 'surname': 'Smith ', 'devices': {'os': 'Apple iPhone', 'mac_address': 'f4:af:e7:b7:ab:22', 'manufacturer': 'Apple'}, 'lat': 54.33166199876629, 'lng': -6.277842272724769, 'seenTime': 1582814754000},
{'id': 'a0bb8d38-0d27-4d7f-acc0-1e850a706b6c', 'first_name': 'Lucy', 'surname': 'Pye', 'devices': {'os': 'Apple iPhone', 'mac_address': 'f8:87:f1:72:4c:4d', 'manufacturer': 'Apple'}, 'lat': 54.33166199876629, 'lng': -6.277842272724769, 'seenTime': 1582814754000},
{'id': '0eb7df70-f319-4562-ab2a-9e641e978b3b', 'first_name': 'xyx', 'surname': 'dcsdd', 'devices': {'os': 'NOKIA Phone', 'mac_address': '78:28:ca:a8:56:b9', 'manufacturer': 'NOKIA'}, 'lat': 54.33166199876629, 'lng': -6.277842272724769, 'seenTime': 1582814754000},
{'id': 'a0bb8d38-0d27-4d7f-acc0-1e850a706b6c', 'first_name': 'ddwdw', 'surname': 'sdsds', 'devices': {'os': 'MI Phone', 'mac_address': 'dc:08:0f:3f:57:0c', 'manufacturer': 'MI'}, 'lat': 54.33218267030654, 'lng': -6.27796001203896, 'seenTime': 1582814693000}]
and I want output like this from dict_1 variable
{
"f77df8c2-b19d-4341-9021-7beab4b9ebcd":{
"first_name":"anonymous",
"surname":"anonymous",
"lat":57.14913102,
"lng":-2.09987143,
"devices": {'os': 'MI Phone', 'mac_address': 'dc:08:0f:3f:57:0c', 'manufacturer': 'MI'},
"seenTime": 1582814693000
},
"7beab4b9ebcd-b19d-9021-f77df8c2-4341":{
etc.
},
etc.
}
What should I do in this case?
Try this.
dict_1 = {x.pop('id'): x for x in dict_1}
I think this could do the job:
dict_2 = {}
for d in dict_1:
    record_id = d.pop('id')  # note: pop removes 'id' from the original dict
    dict_2[record_id] = d
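Both answers use pop, which modifies the original dictionaries. If the input should stay untouched, a nested comprehension is one possible sketch. The records below are shortened, hypothetical stand-ins for the question's data, and the names records and by_id are made up here:

```python
# Shortened, hypothetical records; the third one repeats an id on purpose.
records = [
    {'id': 'a1', 'first_name': 'Rahx', 'lat': 54.33},
    {'id': 'b2', 'first_name': 'Lucy', 'lat': 54.33},
    {'id': 'a1', 'first_name': 'xyx', 'lat': 54.33},
]

# Copy every field except 'id' into a new inner dict, keyed by the id.
by_id = {
    rec['id']: {k: v for k, v in rec.items() if k != 'id'}
    for rec in records
}

print(by_id)
# {'a1': {'first_name': 'xyx', 'lat': 54.33}, 'b2': {'first_name': 'Lucy', 'lat': 54.33}}
```

When two records share an id, as several do in the question's data, the later record overwrites the earlier one in every variant shown here.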

Unique values of Dictionary comprehension, return dictionary instead of string

this is my data:
data = [{'id': 1, 'name': 'The Musical Hop', 'city': 'San Francisco', 'state': 'CA'},
{'id': 2, 'name': 'The Dueling Pianos Bar', 'city': 'New York', 'state': 'NY'},
{'id': 3, 'name': 'Park Square Live Music & Coffee', 'city': 'San Francisco', 'state': 'CA'}]
I want to find the unique values (that's why I use a set) of "city" and return them like this:
cities = set([x.get("city") for x in data])
cities
{'New York', 'San Francisco'}
However, I also want to return the corresponding state, like this:
[{"city": "New York", "state": "NY"}, {"city": "San Francisco", "state": "CA"}]
Is there a way to do this?
You can use dict-comprehension for the task:
out = list({x['city']:{'city':x['city'], 'state':x['state']} for x in data}.values())
print(out)
Prints:
[{'city': 'San Francisco', 'state': 'CA'}, {'city': 'New York', 'state': 'NY'}]
You can use a dict-comprehension to create a city->state mapping, then iterate over it to create the list you want:
city_to_state = {x["city"]: x["state"] for x in data}
result = [{"city":k, "state":v} for k,v in city_to_state.items()]
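Both answers key on the city name alone, so two cities with the same name in different states would collapse into one entry. A possible sketch that deduplicates on the (city, state) pair instead, sorted only to make the output order predictable:

```python
data = [
    {'id': 1, 'name': 'The Musical Hop', 'city': 'San Francisco', 'state': 'CA'},
    {'id': 2, 'name': 'The Dueling Pianos Bar', 'city': 'New York', 'state': 'NY'},
    {'id': 3, 'name': 'Park Square Live Music & Coffee', 'city': 'San Francisco', 'state': 'CA'},
]

# A set of tuples keeps each (city, state) combination exactly once.
pairs = sorted({(x['city'], x['state']) for x in data})
result = [{'city': city, 'state': state} for city, state in pairs]

print(result)
# [{'city': 'New York', 'state': 'NY'}, {'city': 'San Francisco', 'state': 'CA'}]
```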

Retrieve only certain keys and values from a dictionary, nested inside a list

I've been stuck on this for hours. I want to retrieve only ONE individual's keys and values from a dictionary that is nested inside a list.
GAMERS = [{
'name': 'Fatboi',
'parent': 'Dick Van Dyke',
'game': 'Dark Souls 3',
'weight': '420 lbs'
},
{
'name': 'Justin',
'parent': 'Heather Blueberry',
'game': 'Tetris',
'weight': '180 lbs'
},
{
'name': 'jerkhead',
'parent': 'none',
'games': 'Hello Kitty',
'weight': '240 lbs'
},{
'name': 'Tumor',
'parent': 'Jack Black',
'games': 'Trying to live',
'weight': '150 lbs'
}]
So for instance I want to print only Justin's information, nobody else's. Any insights?
You can pass in the key you want and collect the matching key-value pairs in a separate list.
GAMERS = [{
'name': 'Fatboi',
'parent': 'Dick Van Dyke',
'game': 'Dark Souls 3',
'weight': '420 lbs'
},
{
'name': 'Justin',
'parent': 'Heather Blueberry',
'game': 'Tetris',
'weight': '180 lbs'
},{
'name': 'jerkhead',
'parent': 'none',
'games': 'Hello Kitty',
'weight': '240 lbs'
}]
def get_key_pair_list(input_dict, key):
    new_list = []
    for item in input_dict:
        my_dict = {}
        if key in item.keys():
            my_dict[key] = item[key]
            new_list.append(my_dict)
    return new_list
print(get_key_pair_list(GAMERS, 'name'))
Output:
[{'name': 'Fatboi'}, {'name': 'Justin'}, {'name': 'jerkhead'}]
Using a list comprehension:
key = 'name'
# {key: value} builds a dict; {key, value} would build a set instead
my_list = [{key: item[key]} for item in GAMERS if key in item.keys()]
print(my_list)
Output:
[{'name': 'Fatboi'}, {'name': 'Justin'}, {'name': 'jerkhead'}]
You want to filter the list and grab the first value that matches a predicate. Make sure to handle the case where the item doesn't exist!
filtered_info = (
    item for item in GAMERS if item['name'] == 'Justin'
)
justin_info = next(filtered_info, None)
if justin_info is not None:
    print(justin_info)
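The filter-and-take-first idea can be wrapped in a small reusable helper. This is only a sketch: find_first is a hypothetical name, and the gamers list is a shortened stand-in for GAMERS. It uses .get so that records missing the key are skipped rather than raising KeyError:

```python
def find_first(records, key, value):
    """Return the first dict whose `key` equals `value`, or None if there is none."""
    return next((rec for rec in records if rec.get(key) == value), None)


# Shortened, hypothetical version of the GAMERS list.
gamers = [
    {'name': 'Fatboi', 'game': 'Dark Souls 3'},
    {'name': 'Justin', 'game': 'Tetris'},
]

justin = find_first(gamers, 'name', 'Justin')
if justin is not None:
    # Print each key and value of the matching record only.
    for field, val in justin.items():
        print(field, val)
```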

How to count how many times a specific value appears in a dictionary of dictionaries in Python 3

I know there must be a very simple solution to this question, but I am new to Python and cannot figure out how to do it.
All I simply want to do is count how many times a particular value appears in this dictionary, for example, how many males there are.
people = {}
people['Applicant1'] = {'Name': 'David Brown',
'Gender': 'Male',
'Occupation': 'Office Manager',
'Age': '33'}
people['Applicant2'] = {'Name': 'Peter Parker',
'Gender': 'Male',
'Occupation': 'Postman',
'Age': '25'}
people['Applicant3'] = {'Name': 'Patricia M',
'Gender': 'Female',
'Occupation': 'Teacher',
'Age': '35'}
people['Applicant4'] = {'Name': 'Mark Smith',
'Gender': 'Male',
'Occupation': 'Unemployed',
'Age': '26'}
Any help is much appreciated!
For your example, you have applicants and their data. The data you are checking is their gender, so the below code will accomplish that.
amount = 0  # number of people matching the condition
for applicant in people.values():  # loop through all applicants
    # .get returns False if 'Gender' was never set, so missing data is skipped
    if applicant.get('Gender', False) == 'Male':
        amount += 1  # count the matching applicant
This will get the number of males in the applicant list.
I'd suggest refactoring your logic a bit to use a list of dicts.
people = [
{
'Name': 'David Brown',
'Gender': 'Male',
'Occupation': 'Office Manager',
'Age': '33'
},
{
'Name': 'Peter Parker',
'Gender': 'Male',
'Occupation': 'Postman',
'Age': '25'
},
{
'Name': 'Patricia M',
'Gender': 'Female',
'Occupation': 'Teacher',
'Age': '35'
},
{
'Name': 'Mark Smith',
'Gender': 'Male',
'Occupation': 'Unemployed',
'Age': '26'
}
]
Then you can use logic like
[applicant for applicant in people if applicant['Gender'] == 'Male']
which will give you a list of all the males; wrap it in len() to get the count.
This is a function to count the number of occurrences of a given value inside a dictionary:
def count(dic, val):
    total = 0  # named total to avoid shadowing the built-in sum()
    for key, value in dic.items():
        if value == val:
            total += 1
        if isinstance(value, dict):
            # recurse into nested dictionaries
            total += count(value, val)
    return total
Then you can use it as follows:
result = count(people, 'Male')
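If counts for every gender are useful rather than just one, collections.Counter from the standard library is another option. A minimal sketch on a trimmed, hypothetical copy of the question's people dictionary:

```python
from collections import Counter

# Trimmed copy of the question's data: only the fields needed here.
people = {
    'Applicant1': {'Name': 'David Brown', 'Gender': 'Male'},
    'Applicant2': {'Name': 'Peter Parker', 'Gender': 'Male'},
    'Applicant3': {'Name': 'Patricia M', 'Gender': 'Female'},
    'Applicant4': {'Name': 'Mark Smith', 'Gender': 'Male'},
}

# Tally each 'Gender' value; applicants without one are counted under None.
gender_counts = Counter(p.get('Gender') for p in people.values())

print(gender_counts['Male'])    # 3
print(gender_counts['Female'])  # 1
```

A Counter returns 0 for values it has never seen, so looking up a missing gender does not raise an error.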

Python/Shell Script - Merging 2 rows of a CSV file where Address column has 'New Line' character

I have a CSV file which contains a couple of columns. For example:
FName,LName,Address1,City,Country,Phone,Email
Matt,Shew,"503, Avenue Park",Auckland,NZ,19809224478,matt#xxx.com
Patt,Smith,"503, Baker Street
Mickey Park
Suite 510",Austraila,AZ,19807824478,patt#xxx.com
Doug,Stew,"12, Main St.
21st Lane
Suit 290",Chicago,US,19809224478,doug#xxx.com
Henry,Mark,"88, Washington Park",NY,US,19809224478,matt#xxx.com
In Excel it looks something like this:
People tend to enter or copy-paste addresses this way; sometimes they copy their signature and paste it into the Address column, which creates this situation.
I have tried reading this using the Python csv module, and it looks like Python doesn't distinguish between a '\n' newline inside a field value and the end of a line.
My code:
import csv
with open(file_path, 'r') as f_obj:
    input_data = []
    reader = csv.DictReader(f_obj)
    for row in reader:
        print row
The output looks something like this:
{'City': 'Auckland', 'Address1': '503, Avenue Park', 'LName': 'Shew', 'Phone': '19809224478', 'FName': 'Matt', 'Country': 'NZ', 'Email': 'matt#xxx.com'}
{'City': 'Austraila', 'Address1': '503, Baker Street\nMickey Park\nSuite 510', 'LName': 'Smith', 'Phone': '19807824478', 'FName': 'Patt', 'Country': 'AZ', 'Email': 'patt#xxx.com'}
{'City': 'Chicago', 'Address1': '12, Main St. \n21st Lane \nSuit 290', 'LName': 'Stew', 'Phone': '19809224478', 'FName': 'Doug', 'Country': 'US', 'Email': 'doug#xxx.com'}
{'City': 'NY', 'Address1': '88, Washington Park', 'LName': 'Mark', 'Phone': '19809224478', 'FName': 'Henry', 'Country': 'US', 'Email': 'matt#xxx.com'}
I just want to write the same content to a file so that none of the values for the Address1 key contain a '\n' character, like this:
{'City': 'Auckland', 'Address1': '503, Avenue Park', 'LName': 'Shew', 'Phone': '19809224478', 'FName': 'Matt', 'Country': 'NZ', 'Email': 'matt#xxx.com'}
{'City': 'Austraila', 'Address1': '503, Baker Street Mickey Park Suite 510', 'LName': 'Smith', 'Phone': '19807824478', 'FName': 'Patt', 'Country': 'AZ', 'Email': 'patt#xxx.com'}
{'City': 'Chicago', 'Address1': '12, Main St. 21st Lane Suit 290', 'LName': 'Stew', 'Phone': '19809224478', 'FName': 'Doug', 'Country': 'US', 'Email': 'doug#xxx.com'}
{'City': 'NY', 'Address1': '88, Washington Park', 'LName': 'Mark', 'Phone': '19809224478', 'FName': 'Henry', 'Country': 'US', 'Email': 'matt#xxx.com'}
Any suggestions?
PS:
I have more than 100K such records in my CSV file.
You can replace the print statement with a dict comprehension that replaces newlines in the values (row.items() works on both Python 2 and 3):
row = {k: v.replace('\n', ' ') for k, v in row.items()}
print(row)
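To write the cleaned rows back out as CSV, as the question asks, one possible sketch pairs csv.DictReader with csv.DictWriter. It uses in-memory io.StringIO objects and a shortened, hypothetical sample so that it runs on its own; with real files you would open both with newline='' as the csv documentation recommends. It also assumes every row has a value for every column:

```python
import csv
import io

# Shortened, hypothetical sample with a multi-line quoted address.
raw = (
    'FName,LName,Address1,City\n'
    'Matt,Shew,"503, Avenue Park",Auckland\n'
    'Patt,Smith,"503, Baker Street\nMickey Park\nSuite 510",Austraila\n'
)

reader = csv.DictReader(io.StringIO(raw, newline=''))
out = io.StringIO(newline='')
writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
writer.writeheader()

cleaned_rows = []
for row in reader:
    # Join the pieces of any multi-line value with single spaces.
    cleaned = {k: ' '.join(v.splitlines()) for k, v in row.items()}
    cleaned_rows.append(cleaned)
    writer.writerow(cleaned)

print(out.getvalue())
```

Because the reader yields one row at a time, the same loop should cope with 100K records without loading the whole file, provided the rows are written out directly instead of being collected in a list.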
