Matching keys in different lists of dicts - python

I have a csv.DictReader instance of dicts and a list of dicts.
instance:
{'Salary': '3000', 'Name': 'James Jones', 'GameInfo': 'Den#Cle 07:30PM ET',
'AvgPointsPerGame': '4.883', 'teamAbbrev': 'Cle', 'Position': 'SG'}
{'Salary': '3000', 'Name': 'Justin Anderson', 'GameInfo': 'Orl#Dal 09:00PM ET',
'AvgPointsPerGame': '13.161', 'teamAbbrev': 'Dal', 'Position': 'SF'}
list:
[
{'playername': 'Justin Anderson', 'points': '6.94'},
{'playername': 'DeAndre Liggins', 'points': '11.4'},
]
I cannot figure out how to iterate over these lists of dictionaries, match the Name and playername keys, then output the ['Name'] from one dict and the ['points'] from the matching dict. In the example above I would match Justin Anderson from the two sets of dicts, then print out Justin Anderson, 6.94
The core of the app takes 2 CSVs and turns them into lists of dicts.

It's not really efficient this way but it wouldn't require any preprocessing:
# Instead of your CSVReader:
dicts = [{'Salary': '3000', 'Name': 'James Jones', 'GameInfo': 'Den#Cle 07:30PM ET', 'AvgPointsPerGame': '4.883', 'teamAbbrev': 'Cle', 'Position': 'SG'},
{'Salary': '3000', 'Name': 'Justin Anderson', 'GameInfo': 'Orl#Dal 09:00PM ET', 'AvgPointsPerGame': '13.161', 'teamAbbrev': 'Dal', 'Position': 'SF'}]
list_of_dicts = [
{'playername': 'Justin Anderson', 'points': '6.94'},
{'playername': 'DeAndre Liggins', 'points': '11.4'},
]
# For each dictionary in the CSVReader
for dct in dicts:
    # For each dictionary in your list of dictionaries
    for subdict in list_of_dicts:
        # Check if the name and playername matches
        if dct['Name'] == subdict['playername']:
            # I just print out the results, you need to do your logic here
            print(dct['Name'])
            print(dct)
            print('matching')
            print(subdict)
and this prints:
Justin Anderson
{'Salary': '3000', 'Name': 'Justin Anderson', 'GameInfo': 'Orl#Dal 09:00PM ET', 'AvgPointsPerGame': '13.161', 'Position': 'SF', 'teamAbbrev': 'Dal'}
matching
{'playername': 'Justin Anderson', 'points': '6.94'}
If you want it faster, then you should preprocess your list of dictionaries so that you can simply look up the playername:
>>> dict_of_dicts = {dct['playername']: dct for dct in list_of_dicts}
>>> dict_of_dicts
{'DeAndre Liggins': {'playername': 'DeAndre Liggins', 'points': '11.4'},
'Justin Anderson': {'playername': 'Justin Anderson', 'points': '6.94'}}
Then the loop simplifies to:
for dct in dicts:
    if dct['Name'] in dict_of_dicts:
        print(dct['Name'])
        print(dct)
        print('matching')
        print(dict_of_dicts[dct['Name']])
giving the same result.
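To get exactly the one-line output the question asks for (name, then points), the lookup approach can be sketched end to end as follows. This is only a sketch under assumptions: the two CSV files are replaced by in-memory strings, and `players_csv`, `points_csv`, `points_by_player` and `matches` are hypothetical names, not part of the original code.

```python
import csv
import io

# Hypothetical stand-ins for the two CSV files from the question.
players_csv = io.StringIO(
    "Name,Position,Salary\n"
    "James Jones,SG,3000\n"
    "Justin Anderson,SF,3000\n"
)
points_csv = io.StringIO(
    "playername,points\n"
    "Justin Anderson,6.94\n"
    "DeAndre Liggins,11.4\n"
)

# Build the lookup once: playername -> points.
points_by_player = {row["playername"]: row["points"]
                    for row in csv.DictReader(points_csv)}

# Stream the other file and keep only the players found in the lookup.
matches = [(row["Name"], points_by_player[row["Name"]])
           for row in csv.DictReader(players_csv)
           if row["Name"] in points_by_player]

for name, points in matches:
    print(f"{name}, {points}")
```

Building the lookup first means each row of the larger file costs one dictionary access instead of a scan over the whole second list.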

Related

Extracting values from common keys in multiple dictionaries

So I am currently working on a coding lab where my goal is to combine multiple mappings into a single mapping
Eng_Team = [
{'player': 'Harry Kane', 'rating': '90'},
{'player': 'Harry Mcguire', 'rating': '81'},
{'player': 'Phil Foden', 'rating': '84'},
{'player': 'Jack Grealish', 'rating': '85'},
{'player': 'Eric Dier', 'rating': '79'}
]
USA_Team = [
{'player': 'Christian Pulisic', 'rating': '82'},
{'player': 'Gio Reyna', 'rating': '79'},
{'player': 'Weston Mckinnie', 'rating': '78'},
{'player': 'Sergino Dest', 'rating': '79'},
{'player': 'Tyler Adams', 'rating': '79'}
]
I tried
player_lookup = ChainMap(USA_Team,Eng_Team)
print(player_lookup['player'])
to try to get the names of players from both dictionaries. However I am getting this error:
TypeError Traceback (most recent call last)
Input In [49], in <cell line: 5>()
1 from collections import ChainMap
3 player_lookup = ChainMap(USA_Team,Eng_Team)
----> 5 print(player_lookup['player'])
File ~\anaconda3\lib\collections\__init__.py:938, in ChainMap.__getitem__(self, key)
936 for mapping in self.maps:
937 try:
--> 938 return mapping[key] # can't use 'key in mapping' with defaultdict
939 except KeyError:
940 pass
TypeError: list indices must be integers or slices, not str
As far as I understand, I might have to write a loop. How would I do so?
What you are looking for should really be a dict that maps each player's name to its rating, in which case you can chain the two lists of dicts, map the dicts to an itemgetter to produce a sequence of tuples of player name and rating, and then construct a new dict with the sequence:
from operator import itemgetter
from itertools import chain
dict(map(itemgetter('player', 'rating'), chain(Eng_Team, USA_Team)))
This returns:
{'Harry Kane': '90', 'Harry Mcguire': '81', 'Phil Foden': '84', 'Jack Grealish': '85', 'Eric Dier': '79', 'Christian Pulisic': '82', 'Gio Reyna': '79', 'Weston Mckinnie': '78', 'Sergino Dest': '79', 'Tyler Adams': '79'}
Demo: https://replit.com/#blhsing/NotableDeepskyblueDrupal
I think what you're getting stuck on is that you have declared two lists of dictionaries: not two dictionaries. I believe that if you want to combine them into one dictionary you could use something like
player_rating_lookup = {}
for player_dictionary in Eng_Team:
    player_rating_lookup[player_dictionary['player']] = player_dictionary['rating']
    # adds to the player_rating_lookup dictionary:
    # the player name string is the key in this new dictionary, and the rating string is the value.
for player_dictionary in USA_Team:
    player_rating_lookup[player_dictionary['player']] = player_dictionary['rating']
# you can just iterate over the two lists of dictionaries separately.
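The two loops can also be folded into a single dict comprehension. The following is a minimal sketch, assuming shortened sample lists in the same shape as the question's data; note that if the same player name appeared in both lists, the later entry would overwrite the earlier one.

```python
from itertools import chain

# Shortened sample data in the same shape as the question's lists.
Eng_Team = [
    {'player': 'Harry Kane', 'rating': '90'},
    {'player': 'Phil Foden', 'rating': '84'},
]
USA_Team = [
    {'player': 'Christian Pulisic', 'rating': '82'},
    {'player': 'Gio Reyna', 'rating': '79'},
]

# One pass over both lists: player name -> rating.
player_rating_lookup = {
    entry['player']: entry['rating']
    for entry in chain(Eng_Team, USA_Team)
}

print(player_rating_lookup['Gio Reyna'])
```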
You probably were after something like this:
from collections import ChainMap
# this is not a dictionary, but a list of dictionaries
# so, eng_team['player'] wouldn't actually work
eng_team = [
{'player': 'Harry Kane', 'rating': '90'},
{'player': 'Harry Mcguire', 'rating': '81'},
{'player': 'Phil Foden', 'rating': '84'},
{'player': 'Jack Grealish', 'rating': '85'},
{'player': 'Eric Dier', 'rating': '79'},
{'player': 'John Doe', 'rating': 0}
]
# this is also a list of dictionaries, not a dictionary
usa_team = [
{'player': 'Christian Pulisic', 'rating': '82'},
{'player': 'Gio Reyna', 'rating': '79'},
{'player': 'Weston Mckinnie', 'rating': '78'},
{'player': 'Sergino Dest', 'rating': '79'},
{'player': 'Tyler Adams', 'rating': '79'},
{'player': 'John Doe', 'rating': 1}
]
# so, you'd want to create appropriate dictionaries from the data
# (unless you can just define them as dictionaries right away)
eng_team_dict = {p['player']: {**p} for p in eng_team}
usa_team_dict = {p['player']: {**p} for p in usa_team}
# the dictionaries can be chain-mapped
all_players = ChainMap(eng_team_dict, usa_team_dict)
# this then works
print(all_players['Harry Kane'], all_players['Sergino Dest'])
print(list(all_players.keys()))
# note that duplicate keys will be taken from the first dict in
# the arguments to ChainMap, i.e. `eng_team_dict` in this case
print(all_players['John Doe'])
Output:
{'player': 'Harry Kane', 'rating': '90'} {'player': 'Sergino Dest', 'rating': '79'}
['Christian Pulisic', 'Gio Reyna', 'Weston Mckinnie', 'Sergino Dest', 'Tyler Adams', 'John Doe', 'Harry Kane', 'Harry Mcguire', 'Phil Foden', 'Jack Grealish', 'Eric Dier']
{'player': 'John Doe', 'rating': 0}

Nested Python Object to CSV

I looked up "nested dict" and "nested list" but neither method worked.
I have a python object with the following structure:
[{
'id': 'productID1', 'name': 'productname A',
'option': {
'size': {
'type': 'list',
'name': 'size',
'choices': [
{'value': 'M'},
]}},
'variant': [{
'id': 'variantID1',
'choices':
{'size': 'M'},
'attributes':
{'currency': 'USD', 'price': 1}}]
}]
What I need to output is a CSV file in the following flattened structure:
id, productname, variantid, size, currency, price
productID1, productname A, variantID1, M, USD, 1
productID1, productname A, variantID2, L, USD, 2
productID2, productname A, variantID3, XL, USD, 3
I tried this solution: Python: Writing Nested Dictionary to CSV
and this one: From Nested Dictionary to CSV File
I got rid of the [] around and within the data and used the code snippet from the second link, adapted to my needs. In reality I can't get rid of the [] because that's simply the format I get when calling the API.
with open('productdata.csv', 'w', newline='', encoding='utf-8') as output:
    writer = csv.writer(output, delimiter=';', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
    for key in sorted(data):
        value = data[key]
        if len(value) > 0:
            writer.writerow([key, value])
        else:
            for i in value:
                writer.writerow([key, i, value])
but the output is like this:
"id";"productID1"
"name";"productname A"
"option";"{'size': {'type': 'list', 'name': 'size', 'choices': {'value': 'M'}}}"
"variant";"{'id': 'variantID1', 'choices': {'size': 'M'}, 'attributes': {'currency': 'USD', 'price': 1}}"
Can anyone help me out, please?
Thanks in advance.
list indices must be integers not strings
The following presents a visual example of a python list:
0 carrot.
1 broccoli.
2 asparagus.
3 cauliflower.
4 corn.
5 cucumber.
6 eggplant.
7 bell pepper
0, 1, 2 are all "indices".
"carrot", "broccoli", etc... are all said to be "values"
Essentially, a python list is a machine which has integer inputs and arbitrary outputs.
Think of a python list as a black-box:
A number, such as 5, goes into the box.
you turn a crank handle attached to the box.
Maybe the string "cucumber" comes out of the box
You got an error: TypeError: list indices must be integers or slices, not str
There are various solutions.
Convert Strings into Integers
Convert the string into an integer.
listy_the_list = ["carrot", "broccoli", "asparagus", "cauliflower"]
string_index = "2"
integer_index = int(string_index)
element = listy_the_list[integer_index]
That works as long as your string indices look like integers (e.g. "456" or "7").
The integer class constructor, int(), accepts only a limited set of formats.
It tolerates leading and trailing white-space, so x = int("3 ") gives 3, but x = int("3.0") or x = int("three") raises a ValueError.
If the string may contain other stray characters, strip them first, e.g. x = int(string_index.strip(",;")), or catch the ValueError.
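A defensive way to do the string-to-integer conversion can be sketched like this. The helper name `to_index` is hypothetical and the sample inputs are made up for illustration; the only library behaviour relied on is that int() raises ValueError for text it cannot parse.

```python
def to_index(text):
    """Return text as an int, or None if it does not look like an integer."""
    try:
        return int(text)
    except ValueError:
        return None

vegetables = ["carrot", "broccoli", "asparagus", "cauliflower"]

for raw in ["2", "3.0", "three"]:
    index = to_index(raw)
    if index is None:
        print(f"{raw!r} is not a valid integer index")
    else:
        print(f"{raw!r} -> {vegetables[index]}")
```

Returning None instead of letting the exception propagate is a design choice for this sketch; in other programs it may be better to let the ValueError surface.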
Use a Container which Allows Keys to be Strings
Long ago, before electronic computers existed, there were various types of containers in the world:
cookie jars
muffin tins
cardboard boxes
glass jars
steel cans.
back-packs
duffel bags
closets/wardrobes
brief-cases
In computer programming there are also various types of "containers"
You do not have to use a list as your container, if you do not want to.
There are containers where the keys (AKA indices) are allowed to be strings, instead of integers.
In Python, the standard container which is like a list, but where the keys/indices can be strings, is a dictionary
thisdict = {
"make": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict["make"] == "Ford"  # True
If you want to index into a container using strings, instead of integers, then use a dict instead of a list.
The following is an example of a Python dict which has state names as input and state abbreviations as output:
us_state_abbrev = {
'Alabama': 'AL',
'Alaska': 'AK',
'American Samoa': 'AS',
'Arizona': 'AZ',
'Arkansas': 'AR',
'California': 'CA',
'Colorado': 'CO',
'Connecticut': 'CT',
'Delaware': 'DE',
'District of Columbia': 'DC',
'Florida': 'FL',
'Georgia': 'GA',
'Guam': 'GU',
'Hawaii': 'HI',
'Idaho': 'ID',
'Illinois': 'IL',
'Indiana': 'IN',
'Iowa': 'IA',
'Kansas': 'KS',
'Kentucky': 'KY',
'Louisiana': 'LA',
'Maine': 'ME',
'Maryland': 'MD',
'Massachusetts': 'MA',
'Michigan': 'MI',
'Minnesota': 'MN',
'Mississippi': 'MS',
'Missouri': 'MO',
'Montana': 'MT',
'Nebraska': 'NE',
'Nevada': 'NV',
'New Hampshire': 'NH',
'New Jersey': 'NJ',
'New Mexico': 'NM',
'New York': 'NY',
'North Carolina': 'NC',
'North Dakota': 'ND',
'Northern Mariana Islands':'MP',
'Ohio': 'OH',
'Oklahoma': 'OK',
'Oregon': 'OR',
'Pennsylvania': 'PA',
'Puerto Rico': 'PR',
'Rhode Island': 'RI',
'South Carolina': 'SC',
'South Dakota': 'SD',
'Tennessee': 'TN',
'Texas': 'TX',
'Utah': 'UT',
'Vermont': 'VT',
'Virgin Islands': 'VI',
'Virginia': 'VA',
'Washington': 'WA',
'West Virginia': 'WV',
'Wisconsin': 'WI',
'Wyoming': 'WY'
}
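Looking values up in such a dict can be sketched as follows. Only a small subset of the mapping is repeated so the snippet runs on its own, and 'Atlantis' is a made-up key used to show what happens when a key is missing.

```python
# A small subset of the mapping above, repeated so the sketch runs on its own.
us_state_abbrev = {
    'Alabama': 'AL',
    'Alaska': 'AK',
    'Arizona': 'AZ',
}

# Indexing with a string key works because this is a dict, not a list.
print(us_state_abbrev['Alaska'])

# A missing key raises KeyError with [], so .get() with a default is safer.
print(us_state_abbrev.get('Atlantis', 'unknown'))

# Membership tests check the keys.
print('Arizona' in us_state_abbrev)
```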
I could actually iterate over this list and create my own sublist, e.g. a list of variants
data = [{
'id': 'productID1', 'name': 'productname A',
'option': {
'size': {
'type': 'list',
'name': 'size',
'choices': [
{'value': 'M'},
]}},
'variant': [{
'id': 'variantID1',
'choices':
{'size': 'M'},
'attributes':
{'currency': 'USD', 'price': 1}}]
},
{'id': 'productID2', 'name': 'productname B',
'option': {
'size': {
'type': 'list',
'name': 'size',
'choices': [
{'value': 'XL', 'salue':'XXL'},
]}},
'variant': [{
'id': 'variantID2',
'choices':
{'size': 'XL', 'size2':'XXL'},
'attributes':
{'currency': 'USD', 'price': 2}}]
}
]
new_list = {}
for item in data:
    new_list.update(id=item['id'])
    new_list.update(name=item['name'])
    for variant in item['variant']:
        new_list.update(varid=variant['id'])
        for vchoice in variant['choices']:
            new_list.update(vsize=variant['choices'][vchoice])
        for attribute in variant['attributes']:
            new_list.update(vprice=variant['attributes'][attribute])
    for option in item['option']['size']['choices']:
        new_list.update(osize=option['value'])
print(new_list)
but the output is always the last item of the iteration, because i always overwrite new_list with update().
{'id': 'productID2', 'name': 'productname B', 'varid': 'variantID2', 'vsize': 'XXL', 'vprice': 2, 'osize': 'XL'}
here's the final solution which worked for me:
data = [{
'id': 'productID1', 'name': 'productname A',
'variant': [{
'id': 'variantID1',
'choices':
{'size': 'M'},
'attributes':
{'currency': 'USD', 'price': 1}},
{'id':'variantID2',
'choices':
{'size': 'L'},
'attributes':
{'currency':'USD', 'price':2}}
]
},
{
'id': 'productID2', 'name': 'productname B',
'variant': [{
'id': 'variantID3',
'choices':
{'size': 'XL'},
'attributes':
{'currency': 'USD', 'price': 3}},
{'id':'variantID4',
'choices':
{'size': 'XXL'},
'attributes':
{'currency':'USD', 'price':4}}
]
}
]
import csv

# One flat dict per (product, variant) pair
products = []
for item in data:
    for variant in item['variant']:
        dic = {}
        dic.update(ProductID=item['id'])
        dic.update(Name=item['name'].title())
        dic.update(ID=variant['id'])
        dic.update(size=variant['choices']['size'])
        dic.update(Price=variant['attributes']['price'])
        products.append(dic)
# The keys of the first row become the CSV header
keys = products[0].keys()
with open('productdata.csv', 'w', newline='', encoding='utf-8') as output_file:
    dict_writer = csv.DictWriter(output_file, keys, delimiter=';', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
    dict_writer.writeheader()
    dict_writer.writerows(products)
with the following output:
"ProductID";"Name";"ID";"size";"Price"
"productID1";"Productname A";"variantID1";"M";1
"productID1";"Productname A";"variantID2";"L";2
"productID2";"Productname B";"variantID3";"XL";3
"productID2";"Productname B";"variantID4";"XXL";4
which is exactly what i wanted.
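The same idea can be wrapped in a small generator so that the flattening is separate from the CSV writing. This is a sketch, not the poster's code: `flatten_products` is a hypothetical helper, the column names follow the header requested in the question, the sample data is shortened, and the output goes to an in-memory buffer instead of a file.

```python
import csv
import io

def flatten_products(products):
    """Yield one flat dict per (product, variant) pair."""
    for product in products:
        for variant in product.get('variant', []):
            yield {
                'id': product['id'],
                'productname': product['name'],
                'variantid': variant['id'],
                'size': variant['choices'].get('size', ''),
                'currency': variant['attributes'].get('currency', ''),
                'price': variant['attributes'].get('price', ''),
            }

# Shortened sample data in the shape shown in the question.
data = [{
    'id': 'productID1', 'name': 'productname A',
    'variant': [
        {'id': 'variantID1', 'choices': {'size': 'M'},
         'attributes': {'currency': 'USD', 'price': 1}},
        {'id': 'variantID2', 'choices': {'size': 'L'},
         'attributes': {'currency': 'USD', 'price': 2}},
    ],
}]

rows = list(flatten_products(data))

# Written to an in-memory buffer here; open() a real file in practice.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Using .get() with a default for the nested fields is an assumption about the API data: it avoids a KeyError if a variant lacks a size, currency or price.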

convert xml which has 'children' into dictionary

I have an XML file which has child elements and I want to convert it into a dict.
<people>
<type>
<name>lo_123</name>
<country>AUS</country>
<note>
<name>joe</name>
<gender>m</gender>
<age>26</age>
<spouse>
<name>lisa</name>
<gender>f</gender>
</spouse>
</note>
</type>
</people>
This is my code to convert it
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
xml = 'xmltest.xml'
crif_tree = ET.parse(xml)
crif_root = crif_tree.getroot()
data = []
# Each loop collects the direct children of one nesting level
for one in crif_root.findall('.//type'):
    reg = {e.tag: e.text for e in list(one)}
    data.append(reg)
for two in crif_root.findall('.//type/note'):
    reg = {e.tag: e.text for e in list(two)}
    data.append(reg)
for three in crif_root.findall('.//type/note/spouse'):
    reg = {e.tag: e.text for e in list(three)}
    data.append(reg)
print(data)
Here is the output of data
[{'name': 'lo_123', 'country': 'AUS', 'note': '\n '}, {'name': 'joe', 'gender': 'm', 'age': '26', 'spouse': '\n '}, {'name': 'lisa', 'gender': 'f'}]
My desired output would be
[{'name': 'lo_123', 'country': 'AUS', 'note': '\n ', 'name': 'joe', 'gender': 'm', 'age': '26', 'spouse': '\n ', 'name': 'lisa', 'gender': 'f'}]
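A single Python dict cannot hold the key 'name' three times, so the desired output cannot exist exactly as written. One possible alternative, sketched here under that assumption, is to keep the nesting: each element with children becomes a nested dict. `element_to_dict` is a hypothetical helper, and the XML is parsed from a string so the snippet runs without the file.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """
<people>
  <type>
    <name>lo_123</name>
    <country>AUS</country>
    <note>
      <name>joe</name>
      <gender>m</gender>
      <age>26</age>
      <spouse>
        <name>lisa</name>
        <gender>f</gender>
      </spouse>
    </note>
  </type>
</people>
"""

def element_to_dict(element):
    """Map child tags to text, recursing into children that have children."""
    result = {}
    for child in element:
        if len(child):  # the child has sub-elements of its own
            result[child.tag] = element_to_dict(child)
        else:
            result[child.tag] = child.text
    return result

root = ET.fromstring(XML_TEXT)
data = [element_to_dict(t) for t in root.findall('.//type')]
print(data)
```

This sketch does not handle repeated sibling tags (a second `<note>` under the same `<type>` would overwrite the first); collecting such siblings into a list would need extra logic.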

Getting Value Passing a Key From a List of Lists

I am given a list of lists, similar to the following. I am very new to Python
[
{'id': 1244},
{'name': 'example.com'},
{'monitoring_enabled': 'Yes'},
{'monitoring_url': 'http://www.example.com/'},
{'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm'},
{'monitoring_text_string': 'quality product'},
]
[
{'id': 1245},
{'name': 'example.com'},
{'monitoring_enabled': 'Yes'},
{'monitoring_url': 'http://www.example.com/'},
{'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm'},
{'monitoring_text_string': 'quality product'},
]
[
{'id': 1246},
{'name': 'example.com'},
{'monitoring_enabled': 'Yes'},
{'monitoring_url': 'http://www.example.com/'},
{'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm'},
{'monitoring_text_string': 'quality product'},
]
How can I get the "name" value from this, without having to nest loops?
My current code is:
for _row in _rs:
    print(_row)
    print(_row["name"])
However, I am receiving an error message: TypeError: list indices must be integers, not str
So, how can I accomplish this?
If it truly is a list of lists, there is a comma between each list. Then you can simply do this:
for i in x:
    print(i[1]['name'])
How does this look?
l = [{'id': 1244}, {'name': 'example.com'}, ...]
names = [e['name'] for e in l if 'name' in e]
print(names)
>>> ['example.com']
You can move around the list and check if each dictionary has the "name" key, and print the result if yes.
list1 = [{'id': 1244},
{'name': 'example.com'},
{'monitoring_enabled': 'Yes'},
{'monitoring_url': 'http://www.example.com/'},
{'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm'},
{'monitoring_text_string': 'quality product'}]
for dictionary in list1:
    if "name" in dictionary.keys():  # Whether the dictionary looks like {"name": ...}
        print(dictionary["name"])
        break  # Exit the loop now that we have the name, instead of going through the whole list.
Edit: your input is broken. Before being able to work on it, you want to change the function that gives you what you showed in the OP into that:
[ # <-- outer list starts
[
{'id': 1244},
{'name': 'example.com'},
{'monitoring_enabled': 'Yes'},
{'monitoring_url': 'http://www.example.com/'},
{'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm'},
{'monitoring_text_string': 'quality product'},
], # <--
[
{'id': 1245},
{'name': 'example.com'},
{'monitoring_enabled': 'Yes'},
{'monitoring_url': 'http://www.example.com/'},
{'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm'},
{'monitoring_text_string': 'quality product'},
], # <--
[
{'id': 1246},
{'name': 'example.com'},
{'monitoring_enabled': 'Yes'},
{'monitoring_url': 'http://www.example.com/'},
{'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm'},
{'monitoring_text_string': 'quality product'},
]
] # <-- outer list ends
Even better:
lists_of_dicts = [
[{'id': 1244,
'name': 'example.com',
'monitoring_enabled': 'Yes',
'monitoring_url': 'http://www.example.com/',
'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm',
'monitoring_text_string': 'quality product'}],
[{'id': 1245,
'name': 'example.com',
'monitoring_enabled': 'Yes',
'monitoring_url': 'http://www.example.com/',
'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm',
'monitoring_text_string': 'quality product'}],
[{'id': 1246,
'name': 'example.com',
'monitoring_enabled': 'Yes',
'monitoring_url': 'http://www.example.com/',
'monitoring_page_url': 'http://www.example.com/products-spool-chain-overview.htm',
'monitoring_text_string': 'quality product'}]
]
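If the function producing the data cannot be changed, the single-key dicts of each row can be merged after the fact. The following is a minimal sketch with shortened, made-up rows ('example.org' is illustrative) and a hypothetical helper `merge_row`; once merged, a name is a plain key lookup.

```python
# Sample rows in the "list of single-key dicts" shape from the question.
rows = [
    [{'id': 1244}, {'name': 'example.com'}, {'monitoring_enabled': 'Yes'}],
    [{'id': 1245}, {'name': 'example.org'}, {'monitoring_enabled': 'No'}],
]

def merge_row(row):
    """Combine a list of small dicts into one dict (later keys win)."""
    merged = {}
    for part in row:
        merged.update(part)
    return merged

records = [merge_row(row) for row in rows]
names = [record['name'] for record in records]
print(names)
```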

Python Dictionaries: Grouping Key, Value pairs based on a common key, value

I have a list that contains dictionaries like this:
list1 = [{'name': 'bob', 'email': 'bob#bob.com', 'address': '123 house lane',
'student_id': 12345}, {'name': 'steve', 'email': 'steve#steve.com',
'address': '456 house lane', 'student_id': 34567}, {'name': 'bob',
'email': 'bob2#bob2.com', 'address': '789 house lane', 'student_id': 45678}]
Is there a way in python to group selected key, values pairs within a new dictionary based on 'name' value? For instance, something like this as an end result:
new_list = [
{'name': 'bob',
{'emails': ['bob#bob.com',
'bob2#bob2.com']},
{'address': ['123 house lane',
'789 house lane']},
{'name': 'steve',
{'email': ... },
{'address': ... }}
# let's assume the list1 has various entries at some point
# which may or may not have duplicate 'name' values
# and new_list will hold the groupings
]
Sounds like this is what you want to do:
list1 = [{'name': 'bob', 'email': 'bob#bob.com',
'address': '123 house lane', 'student_id': 12345},
{'name': 'steve', 'email': 'steve#steve.com',
'address': '456 house lane', 'student_id': 34567},
{'name': 'bob', 'email': 'bob2#bob2.com',
'address': '789 house lane', 'student_id': 45678}]
import itertools
import operator
# groupby only groups adjacent items, so sort by name first
list1.sort(key=operator.itemgetter('name'))
new_list = []
for studentname, dicts in itertools.groupby(list1, operator.itemgetter('name')):
    d = {'name': studentname}
    for dct in dicts:
        for key, value in dct.items():
            if key == 'name':
                continue
            d.setdefault(key, []).append(value)
    new_list.append(d)
DEMO:
[{'address': ['123 house lane', '789 house lane'],
'email': ['bob#bob.com', 'bob2#bob2.com'],
'name': 'bob',
'student_id': [12345, 45678]},
{'address': ['456 house lane'],
'email': ['steve#steve.com'],
'name': 'steve',
'student_id': [34567]}]
If you were going to use this extensively you should probably hard-code some better names (addresses instead of address for instance) and make a mapping that populates them for you.
keys_mapping = {'address': 'addresses',
'email': 'emails',
'student_id': 'student_ids'}
new_list = []
for studentname, dicts in itertools.groupby(list1, operator.itemgetter('name')):
    d = {'name': studentname}
    for dct in dicts:
        for key, value in dct.items():
            if key == 'name':
                continue
            new_key = keys_mapping.get(key, key)
            # get the updated key name if it's defined, else keep `key`
            d.setdefault(new_key, []).append(value)
    new_list.append(d)
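If sorting the input is undesirable, the grouping can also be sketched with collections.defaultdict, which does not require equal names to be adjacent. This is an alternative to the groupby approach above, not a replacement for it; `grouped` is a hypothetical name.

```python
from collections import defaultdict

list1 = [
    {'name': 'bob', 'email': 'bob#bob.com', 'address': '123 house lane', 'student_id': 12345},
    {'name': 'steve', 'email': 'steve#steve.com', 'address': '456 house lane', 'student_id': 34567},
    {'name': 'bob', 'email': 'bob2#bob2.com', 'address': '789 house lane', 'student_id': 45678},
]

# name -> {field -> list of values}
grouped = defaultdict(lambda: defaultdict(list))
for entry in list1:
    for key, value in entry.items():
        if key != 'name':
            grouped[entry['name']][key].append(value)

# Turn the nested mapping back into a list of plain dicts.
new_list = [{'name': name, **fields} for name, fields in grouped.items()]
print(new_list)
```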
The code below gives you nested dictionaries. With nested dictionaries you can look up a name directly by key, whereas with a list you have to loop over the entries to find it.
list1 = [{'name': 'bob', 'email': 'bob#bob.com', 'address': '123 house lane',
'student_id': 12345}, {'name': 'steve', 'email': 'steve#steve.com',
'address': '456 house lane', 'student_id': 34567}, {'name': 'bob',
'email': 'bob2#bob2.com', 'address': '789 house lane', 'student_id': 45678}]
dict1 = {}
for content in list1:
    name = content['name']
    if name in dict1:
        # Known name: extend the existing lists
        dict1[name]['emails'].append(content['email'])
        dict1[name]['addresses'].append(content['address'])
    else:
        # First time we see this name: start new lists
        dict1[name] = {'emails': [content['email']], 'addresses': [content['address']]}
print(dict1)
Output of the code is
{'bob': {'emails': ['bob#bob.com', 'bob2#bob2.com'], 'addresses': ['123 house lane', '789 house lane']}, 'steve': {'emails': ['steve#steve.com'], 'addresses': ['456 house lane']}}
