Python Dictionary comprehension for a list of dictionaries

I want to create a dictionary from the following list:
[{'fips': '01001', 'state': 'AL', 'name': 'Autauga County'}, {'fips': '20005', 'state': 'KS', 'name': 'Atchison County'}, {'fips': '47145', 'state': 'TN', 'name': 'Roane County'}]
The result should have each name as a key and 'United States' as the value, e.g.:
{'Autauga County': 'United States', 'Atchison County': 'United States', 'Roane County': 'United States'}
I can do this with a couple of for loops, but I want to learn how to do it using a dictionary comprehension.

in_list = [{'fips': '01001', 'state': 'AL', 'name': 'Autauga County'},
{'fips': '20005', 'state': 'KS', 'name': 'Atchison County'},
{'fips': '47145', 'state': 'TN', 'name': 'Roane County'}]
out_dict = {x['name']: 'United States' for x in in_list if 'name' in x}
Some notes for learning:
Dictionary comprehensions are available from Python 2.7 onwards (list comprehensions have been around since Python 2.0).
Dictionary comprehensions are very similar to list comprehensions, except that they use curly braces {} and produce key: value pairs.
In case you didn't know, you can also add a filter condition after the for clause in a comprehension, such as [x for x in some_list if cond].
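As a small sketch of the last two notes, the same filter condition can be used in a list comprehension and in a dictionary comprehension. The names counties, names and lookup below are made up for illustration.

```python
# Hypothetical sample data; the second entry deliberately lacks a 'name' key.
counties = [
    {'fips': '01001', 'name': 'Autauga County'},
    {'fips': '99999'},
    {'fips': '47145', 'name': 'Roane County'},
]

# List comprehension: square brackets, one value per element that passes the filter.
names = [c['name'] for c in counties if 'name' in c]

# Dict comprehension: curly braces, one key: value pair per element that passes.
lookup = {c['name']: c['fips'] for c in counties if 'name' in c}

print(names)
print(lookup)
```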
For completeness, if you can't use comprehensions, try this
out_dict = {}
for dict_item in in_list:
    if not isinstance(dict_item, dict):
        continue
    if 'name' in dict_item:
        in_name = dict_item['name']
        out_dict[in_name] = 'United States'
As mentioned in the comments, for Python 2.6 you can replace the {k: v for k,v in iterator} with:
dict((k,v) for k,v in iterator)
You can read more about this in this question
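To see that the two spellings build the same dictionary, here is a minimal sketch using the in_list data from above. The dict.fromkeys line is an extra option that the answer does not mention; it fits here only because every key maps to the same value.

```python
in_list = [{'fips': '01001', 'state': 'AL', 'name': 'Autauga County'},
           {'fips': '20005', 'state': 'KS', 'name': 'Atchison County'},
           {'fips': '47145', 'state': 'TN', 'name': 'Roane County'}]

# Dict comprehension (Python 2.7+ and 3.x).
with_comprehension = {x['name']: 'United States' for x in in_list if 'name' in x}

# dict() fed by a generator of (key, value) tuples; this also runs on Python 2.6.
with_generator = dict((x['name'], 'United States') for x in in_list if 'name' in x)

# dict.fromkeys assigns one shared value to every key.
with_fromkeys = dict.fromkeys((x['name'] for x in in_list if 'name' in x), 'United States')

print(with_comprehension == with_generator == with_fromkeys)
```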
Happy Coding!

Here's a little solution working for both python2.7.x and python 3.x:
data = [
{'fips': '01001', 'state': 'AL', 'name': 'Autauga County'},
{'fips': '20005', 'state': 'KS', 'name': 'Atchison County'},
{'fips': '47145', 'state': 'TN', 'name': 'Roane County'},
{'fips': 'xxx', 'state': 'yyy'}
]
output = {item['name']: 'United States' for item in data if 'name' in item}
print(output)

The dictionary comprehension version is:
location_list = [{'fips': '01001', 'state': 'AL', 'name': 'Autauga County'},
{'fips': '20005', 'state': 'KS', 'name': 'Atchison County'},
{'fips': '47145', 'state': 'TN', 'name': 'Roane County'}]
location_dict = {location['name']:'United States' for location in location_list}
Output:
{'Autauga County': 'United States', 'Roane County': 'United States',
'Atchison County': 'United States'}
If you search Stack Overflow for "dictionary comprehension", solutions using the { } comprehension syntax show up, for example: Python Dictionary Comprehension

That should do the trick for you
states_dict = [{'fips': '01001', 'state': 'AL', 'name': 'Autauga County'}, {'fips': '20005', 'state': 'KS', 'name': 'Atchison County'}, {'fips': '47145', 'state': 'TN', 'name': 'Roane County'}]
{elem['name']: 'United States' for elem in states_dict}

Related

Parsing and printing JSON result after GET request using Python

I am trying to print the JSON result of a GET request in a readable form in Python:
import requests
import json
r = requests.get("https://smspva.com/api/rent.php?method=getcountries")
parsed = json.loads(r.text)
print(parsed)
I get a result like this:
{'status': 1, 'data': [{'name': 'Russian Federation', 'code': 'RU'}, {'name': 'Ukraine', 'code': 'UA'}, {'name': 'Germany', 'code': 'DE'}, {'name': 'Czech Republic', 'code': 'CZ'}, {'name': 'United Kingdom', 'code': 'UK'}, {'name': 'Sweden', 'code': 'SE'}, {'name': 'Spain', 'code': 'ES'}, {'name': 'Portugal', 'code': 'PT'}, {'name': 'Netherlands', 'code': 'NL'}, {'name': 'Lithuania ', 'code': 'LT'}, {'name': 'Latvia', 'code': 'LV'}, {'name': 'Ireland', 'code': 'IE'}, {'name': 'Estonia', 'code': 'EE'}, {'name': 'United States', 'code': 'US'}]}
How can I get something like this:
Name : Russian Federation
Code : RU
Name : Ukraine
Code : UA
etc.
Thanks for your help!

You can use Response.json(). Then iterate over data key:
import requests
resp = requests.get("https://smspva.com/api/rent.php?method=getcountries")
data = resp.json()
for country in data.get('data', []):
    print(f"Name:{country.get('name')} Code:{country.get('code')}")
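If you want each field on its own line, as in the question, the same loop can build the lines first. This is a sketch that runs offline on a hard-coded excerpt of the response shown in the question; format_countries is a made-up helper name, and with a live request you would pass resp.json() to it instead.

```python
def format_countries(payload):
    """Return 'Name : ...' and 'Code : ...' lines for each entry under 'data'."""
    lines = []
    for country in payload.get('data', []):
        # .get() avoids a KeyError if an entry lacks one of the fields.
        lines.append(f"Name : {country.get('name')}")
        lines.append(f"Code : {country.get('code')}")
    return lines


# Excerpt of the payload printed in the question.
sample = {'status': 1, 'data': [{'name': 'Russian Federation', 'code': 'RU'},
                                {'name': 'Ukraine', 'code': 'UA'}]}

print("\n".join(format_countries(sample)))
```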

Nested Python Object to CSV

I looked up "nested dict" and "nested list", but neither approach worked.
I have a python object with the following structure:
[{
'id': 'productID1', 'name': 'productname A',
'option': {
'size': {
'type': 'list',
'name': 'size',
'choices': [
{'value': 'M'},
]}},
'variant': [{
'id': 'variantID1',
'choices':
{'size': 'M'},
'attributes':
{'currency': 'USD', 'price': 1}}]
}]
What I need to output is a CSV file with the following flattened structure:
id, productname, variantid, size, currency, price
productID1, productname A, variantID1, M, USD, 1
productID1, productname A, variantID2, L, USD, 2
productID2, productname A, variantID3, XL, USD, 3
I tried this solution: Python: Writing Nested Dictionary to CSV
or this one: From Nested Dictionary to CSV File
For testing, I removed the [] around and within the data and adapted the code snippet from the second link to my needs. In practice I can't remove the [] because that is simply the format I get when calling the API.
with open('productdata.csv', 'w', newline='', encoding='utf-8') as output:
    writer = csv.writer(output, delimiter=';', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
    for key in sorted(data):
        value = data[key]
        if len(value) > 0:
            writer.writerow([key, value])
        else:
            for i in value:
                writer.writerow([key, i, value])
But the output looks like this:
"id";"productID1"
"name";"productname A"
"option";"{'size': {'type': 'list', 'name': 'size', 'choices': {'value': 'M'}}}"
"variant";"{'id': 'variantID1', 'choices': {'size': 'M'}, 'attributes': {'currency': 'USD', 'price': 1}}"
Can anyone help me out, please?
Thanks in advance.

list indices must be integers not strings
The following presents a visual example of a python list:
0 carrot.
1 broccoli.
2 asparagus.
3 cauliflower.
4 corn.
5 cucumber.
6 eggplant.
7 bell pepper
0, 1, 2 are all "indices".
"carrot", "broccoli", etc... are all said to be "values"
Essentially, a python list is a machine which has integer inputs and arbitrary outputs.
Think of a python list as a black-box:
A number, such as 5, goes into the box.
You turn a crank handle attached to the box.
Maybe the string "cucumber" comes out of the box.
You got an error: TypeError: list indices must be integers or slices, not str
There are various solutions.
Convert Strings into Integers
Convert the string into an integer.
listy_the_list = ["carrot", "broccoli", "asparagus", "cauliflower"]
string_index = "2"
integer_index = int(string_index)
element = listy_the_list[integer_index]
That works as long as your string indices look like integers (e.g. "456" or "7").
The integer constructor, int(), is strict about what it accepts.
It tolerates surrounding whitespace, so int("3 ") returns 3, but x = int("3.5") or x = int("three") raises a ValueError.
If the string may not be a valid integer, catch the ValueError or check the string with str.isdigit() before converting.
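A defensive version of this conversion might look like the sketch below. to_index is a hypothetical helper, not a built-in; it returns None instead of raising when the string is not an integer.

```python
def to_index(text):
    """Convert a string to an int index, or return None if it is not an integer."""
    try:
        # int() ignores surrounding whitespace but rejects e.g. "3.5" or "abc".
        return int(text)
    except ValueError:
        return None


listy_the_list = ["carrot", "broccoli", "asparagus", "cauliflower"]

for raw in ["2", " 3 ", "3.5", "abc"]:
    index = to_index(raw)
    if index is None:
        print(f"{raw!r} is not a valid index")
    else:
        print(f"{raw!r} -> {listy_the_list[index]}")
```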
Use a Container which Allows Keys to be Strings
Long ago, before electronic computers existed, there were various types of containers in the world:
cookie jars
muffin tins
cardboard boxes
glass jars
steel cans.
back-packs
duffel bags
closets/wardrobes
brief-cases
In computer programming there are also various types of "containers"
You do not have to use a list as your container, if you do not want to.
There are containers where the keys (AKA indices) are allowed to be strings, instead of integers.
In Python, the standard container that works like a list, but whose keys/indices can be strings, is a dictionary:
thisdict = {
"make": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict["make"] == "Ford"  # evaluates to True
If you want to index into a container using strings instead of integers, use a dict instead of a list.
The following is an example of a Python dict which takes state names as input and gives state abbreviations as output:
us_state_abbrev = {
'Alabama': 'AL',
'Alaska': 'AK',
'American Samoa': 'AS',
'Arizona': 'AZ',
'Arkansas': 'AR',
'California': 'CA',
'Colorado': 'CO',
'Connecticut': 'CT',
'Delaware': 'DE',
'District of Columbia': 'DC',
'Florida': 'FL',
'Georgia': 'GA',
'Guam': 'GU',
'Hawaii': 'HI',
'Idaho': 'ID',
'Illinois': 'IL',
'Indiana': 'IN',
'Iowa': 'IA',
'Kansas': 'KS',
'Kentucky': 'KY',
'Louisiana': 'LA',
'Maine': 'ME',
'Maryland': 'MD',
'Massachusetts': 'MA',
'Michigan': 'MI',
'Minnesota': 'MN',
'Mississippi': 'MS',
'Missouri': 'MO',
'Montana': 'MT',
'Nebraska': 'NE',
'Nevada': 'NV',
'New Hampshire': 'NH',
'New Jersey': 'NJ',
'New Mexico': 'NM',
'New York': 'NY',
'North Carolina': 'NC',
'North Dakota': 'ND',
'Northern Mariana Islands':'MP',
'Ohio': 'OH',
'Oklahoma': 'OK',
'Oregon': 'OR',
'Pennsylvania': 'PA',
'Puerto Rico': 'PR',
'Rhode Island': 'RI',
'South Carolina': 'SC',
'South Dakota': 'SD',
'Tennessee': 'TN',
'Texas': 'TX',
'Utah': 'UT',
'Vermont': 'VT',
'Virgin Islands': 'VI',
'Virginia': 'VA',
'Washington': 'WA',
'West Virginia': 'WV',
'Wisconsin': 'WI',
'Wyoming': 'WY'
}
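Applied to data shaped like the product list in the question, the error presumably comes from indexing the outer list with a string. A short sketch on a trimmed-down copy of that data (the variable names are illustrative):

```python
data = [{'id': 'productID1', 'name': 'productname A',
         'variant': [{'id': 'variantID1', 'choices': {'size': 'M'}}]}]

try:
    data['id']  # data is a list, so a string index is rejected
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Pick an element of the list by integer index first, then use string keys.
first_product = data[0]
product_id = first_product['id']
first_size = first_product['variant'][0]['choices']['size']
print(product_id, first_size)
```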

I could actually iterate over this list and build my own sublist, e.g. a list of variants:
data = [{
'id': 'productID1', 'name': 'productname A',
'option': {
'size': {
'type': 'list',
'name': 'size',
'choices': [
{'value': 'M'},
]}},
'variant': [{
'id': 'variantID1',
'choices':
{'size': 'M'},
'attributes':
{'currency': 'USD', 'price': 1}}]
},
{'id': 'productID2', 'name': 'productname B',
'option': {
'size': {
'type': 'list',
'name': 'size',
'choices': [
{'value': 'XL', 'salue':'XXL'},
]}},
'variant': [{
'id': 'variantID2',
'choices':
{'size': 'XL', 'size2':'XXL'},
'attributes':
{'currency': 'USD', 'price': 2}}]
}
]
new_list = {}
for item in data:
    new_list.update(id=item['id'])
    new_list.update(name=item['name'])
    for variant in item['variant']:
        new_list.update(varid=variant['id'])
        for vchoice in variant['choices']:
            new_list.update(vsize=variant['choices'][vchoice])
        for attribute in variant['attributes']:
            new_list.update(vprice=variant['attributes'][attribute])
    for option in item['option']['size']['choices']:
        new_list.update(osize=option['value'])
print(new_list)
But the output is always the last item of the iteration, because every update() call overwrites the same keys in new_list:
{'id': 'productID2', 'name': 'productname B', 'varid': 'variantID2', 'vsize': 'XXL', 'vprice': 2, 'osize': 'XL'}

Here's the final solution that worked for me:
data = [{
'id': 'productID1', 'name': 'productname A',
'variant': [{
'id': 'variantID1',
'choices':
{'size': 'M'},
'attributes':
{'currency': 'USD', 'price': 1}},
{'id':'variantID2',
'choices':
{'size': 'L'},
'attributes':
{'currency':'USD', 'price':2}}
]
},
{
'id': 'productID2', 'name': 'productname B',
'variant': [{
'id': 'variantID3',
'choices':
{'size': 'XL'},
'attributes':
{'currency': 'USD', 'price': 3}},
{'id':'variantID4',
'choices':
{'size': 'XXL'},
'attributes':
{'currency':'USD', 'price':4}}
]
}
]
import csv
products = []  # one flat row per variant
for item in data:
    for variant in item['variant']:
        dic = {}  # start a fresh dict per variant so earlier rows are not overwritten
        dic.update(ProductID=item['id'])
        dic.update(Name=item['name'].title())
        dic.update(ID=variant['id'])
        dic.update(size=variant['choices']['size'])
        dic.update(Price=variant['attributes']['price'])
        products.append(dic)
# Use the keys of the first row as the CSV header.
keys = products[0].keys()
with open('productdata.csv', 'w', newline='', encoding='utf-8') as output_file:
    dict_writer = csv.DictWriter(output_file, keys, delimiter=';', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
    dict_writer.writeheader()
    dict_writer.writerows(products)
This gives the following output:
"ProductID";"Name";"ID";"size";"Price"
"productID1";"Productname A";"variantID1";"M";1
"productID1";"Productname A";"variantID2";"L";2
"productID2";"Productname B";"variantID3";"XL";3
"productID2";"Productname B";"variantID4";"XXL";4
This is exactly what I wanted.

Unique values of Dictionary comprehension, return dictionary instead of string

This is my data:
data = [{'id': 1, 'name': 'The Musical Hop', 'city': 'San Francisco', 'state': 'CA'},
{'id': 2, 'name': 'The Dueling Pianos Bar', 'city': 'New York', 'state': 'NY'},
{'id': 3, 'name': 'Park Square Live Music & Coffee', 'city': 'San Francisco', 'state': 'CA'}]
I want to find the unique values of "city" (that's why I use a set) and return them like this:
cities = set([x.get("city") for x in data])
cities
{'New York', 'San Francisco'}
However, I also want to return the corresponding state, like this:
[{"city": "New York", "state": "NY"}, {"city": "San Francisco", "state": "CA"}]
Is there a way to do this?

You can use dict-comprehension for the task:
out = list({x['city']:{'city':x['city'], 'state':x['state']} for x in data}.values())
print(out)
Prints:
[{'city': 'San Francisco', 'state': 'CA'}, {'city': 'New York', 'state': 'NY'}]

You can use a dict comprehension to create a city->state mapping, then iterate over it to create the list you want:
city_to_state = {x["city"]: x["state"] for x in data}
result = [{"city":k, "state":v} for k,v in city_to_state.items()]
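Another way to sketch the same idea is to collect (city, state) tuples in a set, since tuples are hashable and dicts are not. The sorted() call is an addition here to make the order deterministic; the question does not require it.

```python
data = [{'id': 1, 'name': 'The Musical Hop', 'city': 'San Francisco', 'state': 'CA'},
        {'id': 2, 'name': 'The Dueling Pianos Bar', 'city': 'New York', 'state': 'NY'},
        {'id': 3, 'name': 'Park Square Live Music & Coffee', 'city': 'San Francisco', 'state': 'CA'}]

# A set of (city, state) tuples removes duplicate pairs.
pairs = {(x['city'], x['state']) for x in data}

# Turn each unique pair back into a dict; sorting gives a stable order.
result = [{'city': city, 'state': state} for city, state in sorted(pairs)]
print(result)
```

Unlike the city->state mapping, this keeps two entries if the same city name appears with different states.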

Getting a KeyError: venues error in FourSquare/Python call

OK, I'm a newbie and I think I'm doing everything I should be, but I am still getting a KeyError: venues. (I also tried using "venue" instead, and I have not reached my daily quota at FourSquare.) I am using a Jupyter Notebook to do this.
Using this code:
VERSION = '20200418'
RADIUS = 1000
LIMIT = 2
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, RADIUS, LIMIT)
url
results = requests.get(url).json()
I get 2 results (shown at the end of this post).
When I try to take those results and put them into a dataframe, I get "KeyError: venues":
# assign relevant part of JSON to venues
venues = results['response']['venues']
# tranform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-29-5acf500bf9ad> in <module>
1 # assign relevant part of JSON to venues
----> 2 venues = results['response']['venues']
3
4 # tranform venues into a dataframe
5 dataframe = json_normalize(venues)
KeyError: 'venues'
I'm not really sure where I am going wrong... This has worked for me with other locations... But then again, like I said, I'm new at this... (I haven't maxed out my queries, and I've tried using "venue" instead)... Thank you
FourSquareResults:
{'meta': {'code': 200, 'requestId': '5ec42de01a4b0a001baa10ff'},
'response': {'suggestedFilters': {'header': 'Tap to show:',
'filters': [{'name': 'Open now', 'key': 'openNow'}]},
'warning': {'text': "There aren't a lot of results near you. Try something more general, reset your filters, or expand the search area."},
'headerLocation': 'Cranford',
'headerFullLocation': 'Cranford',
'headerLocationGranularity': 'city',
'totalResults': 20,
'suggestedBounds': {'ne': {'lat': 40.67401708586377,
'lng': -74.29300815204098},
'sw': {'lat': 40.65601706786374, 'lng': -74.31669390523408}},
'groups': [{'type': 'Recommended Places',
'name': 'recommended',
'items': [{'reasons': {'count': 0,
'items': [{'summary': 'This spot is popular',
'type': 'general',
'reasonName': 'globalInteractionReason'}]},
'venue': {'id': '4c13c8d2b7b9c928d127aa37',
'name': 'Cranford Canoe Club',
'location': {'address': '250 Springfield Ave',
'crossStreet': 'Orange Avenue',
'lat': 40.66022488705574,
'lng': -74.3061084180977,
'labeledLatLngs': [{'label': 'display',
'lat': 40.66022488705574,
'lng': -74.3061084180977},
{'label': 'entrance', 'lat': 40.660264, 'lng': -74.306191}],
'distance': 543,
'postalCode': '07016',
'cc': 'US',
'city': 'Cranford',
'state': 'NJ',
'country': 'United States',
'formattedAddress': ['250 Springfield Ave (Orange Avenue)',
'Cranford, NJ 07016',
'United States']},
'categories': [{'id': '4f4528bc4b90abdf24c9de85',
'name': 'Athletics & Sports',
'pluralName': 'Athletics & Sports',
'shortName': 'Athletics & Sports',
'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/sports_outdoors_',
'suffix': '.png'},
'primary': True}],
'photos': {'count': 0, 'groups': []},
'venuePage': {'id': '60380091'}},
'referralId': 'e-0-4c13c8d2b7b9c928d127aa37-0'},
{'reasons': {'count': 0,
'items': [{'summary': 'This spot is popular',
'type': 'general',
'reasonName': 'globalInteractionReason'}]},
'venue': {'id': '4d965995e07ea35d07e2bd02',
'name': 'Mizu Sushi',
'location': {'address': '103 Union Ave.',
'lat': 40.65664427772896,
'lng': -74.30343966195308,
'labeledLatLngs': [{'label': 'display',
'lat': 40.65664427772896,
'lng': -74.30343966195308}],
'distance': 939,
'postalCode': '07016',
'cc': 'US',
'city': 'Cranford',
'state': 'NJ',
'country': 'United States',
'formattedAddress': ['103 Union Ave.',
'Cranford, NJ 07016',
'United States']},
'categories': [{'id': '4bf58dd8d48988d1d2941735',
'name': 'Sushi Restaurant',
'pluralName': 'Sushi Restaurants',
'shortName': 'Sushi',
'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/sushi_',
'suffix': '.png'},
'primary': True}],
'photos': {'count': 0, 'groups': []}},
'referralId': 'e-0-4d965995e07ea35d07e2bd02-1'}]}]}}

Look more closely at the response you're getting: there's no "venues" key in it. The closest match I see is the "groups" list, each element of which has an "items" list, and the individual items each have a "venue" key.
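Based on the response printed in the question, the venues could be collected with a nested comprehension like the sketch below. The results dict here is a heavily trimmed copy of that response so the example runs offline.

```python
# Trimmed copy of the FourSquare response from the question.
results = {
    'response': {
        'groups': [{
            'type': 'Recommended Places',
            'items': [
                {'venue': {'id': '4c13c8d2b7b9c928d127aa37', 'name': 'Cranford Canoe Club'}},
                {'venue': {'id': '4d965995e07ea35d07e2bd02', 'name': 'Mizu Sushi'}},
            ],
        }],
    },
}

# Walk response -> groups -> items and pull out each item's 'venue' dict.
venues = [
    item['venue']
    for group in results['response'].get('groups', [])
    for item in group.get('items', [])
]

print([venue['name'] for venue in venues])
```

The resulting list of dicts is the kind of input json_normalize accepts, so it could take the place of results['response']['venues'] in the original code.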

Get value from data-set field sublist

I have a dataset (that pulls its data from a dict) that I am attempting to clean and republish. Within this dataset, there is a field containing a nested dict that I would like to extract specific data from.
Here's the data:
[{'id': 'oH58h122Jpv47pqXhL9p_Q', 'alias': 'original-pizza-brooklyn-4', 'name': 'Original Pizza', 'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/HVT0Vr_Vh52R_niODyPzCQ/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/original-pizza-brooklyn-4?adjust_creative=IelPnWlrTpzPtN2YRie19A&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=IelPnWlrTpzPtN2YRie19A', 'review_count': 102, 'categories': [{'alias': 'pizza', 'title': 'Pizza'}], 'rating': 4.0, 'coordinates': {'latitude': 40.63781, 'longitude': -73.8963799}, 'transactions': [], 'price': '$', 'location': {'address1': '9514 Ave L', 'address2': '', 'address3': '', 'city': 'Brooklyn', 'zip_code': '11236', 'country': 'US', 'state': 'NY', 'display_address': ['9514 Ave L', 'Brooklyn, NY 11236']}, 'phone': '+17185313559', 'display_phone': '(718) 531-3559', 'distance': 319.98144420799355},
Here's how the data is presented within the csv/spreadsheet:
location
{'address1': '9514 Ave L', 'address2': '', 'address3': '', 'city': 'Brooklyn', 'zip_code': '11236', 'country': 'US', 'state': 'NY', 'display_address': ['9514 Ave L', 'Brooklyn, NY 11236']}
Is there a way to pull location.city for example?
The below code simply adds a few fields and exports it to a csv.
def data_set(data):
    df = pd.DataFrame(data)
    df['zip'] = get_zip()
    df['region'] = get_region()
    newdf = df.filter(['name', 'phone', 'location', 'zip', 'region', 'coordinates', 'rating', 'review_count',
                       'categories', 'url'], axis=1)
    if not os.path.isfile('yelp_data.csv'):
        newdf.to_csv('data.csv', header='column_names')
    else:  # else it exists so append without writing the header
        newdf.to_csv('data.csv', mode='a', header=False)
If that doesn't make sense, please let me know. Thanks in advance!
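One possible approach, sketched here in plain Python on a trimmed copy of the record above, is to look the nested value up per record before building the DataFrame. get_city is a hypothetical helper; in pandas the same function could be applied to the column with df['location'].apply(get_city).

```python
def get_city(location):
    """Return the 'city' entry of a location dict, or None if it is missing."""
    if isinstance(location, dict):
        return location.get('city')
    return None


# Trimmed copy of the record shown above.
data = [{'name': 'Original Pizza',
         'location': {'address1': '9514 Ave L', 'city': 'Brooklyn',
                      'zip_code': '11236', 'state': 'NY'}}]

# Add a flat 'city' field next to the nested 'location' dict.
for record in data:
    record['city'] = get_city(record.get('location'))

print(data[0]['city'])
```

This needs to happen before to_csv is called, because in the CSV file the dict is stored as a plain string.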
