Converting a nested dictionary into a pandas DataFrame (Python)

I'm trying to convert the following nested dictionary, which comes from a Yelp API call, into a pandas DataFrame so I can run visualizations on it:
Top 50 Pizzerias in Chicago
{'businesses': [{'alias': 'pequods-pizzeria-chicago',
'categories': [{'alias': 'pizza', 'title': 'Pizza'}],
'coordinates': {'latitude': 41.92187, 'longitude': -87.664486},
'display_phone': '(773) 327-1512',
'distance': 2158.7084581522413,
'id': 'DXwSYgiXqIVNdO9dazel6w',
'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/8QJUNblfCI0EDhOjuIWJ4A/o.jpg',
'is_closed': False,
'location': {'address1': '2207 N Clybourn Ave',
'address2': '',
'address3': '',
'city': 'Chicago',
'country': 'US',
'display_address': ['2207 N Clybourn Ave',
'Chicago, IL 60614'],
'state': 'IL',
'zip_code': '60614'},
'name': "Pequod's Pizzeria",
'phone': '+17733271512',
'price': '$$',
'rating': 4.0,
'review_count': 6586,
'transactions': ['restaurant_reservation', 'delivery'],
'url': 'https://www.yelp.com/biz/pequods-pizzeria-chicago?adjust_creative=wt2WY5Ii_urZB8YeHggW2g&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=wt2WY5Ii_urZB8YeHggW2g'},
{'alias': 'lou-malnatis-pizzeria-chicago',
'categories': [{'alias': 'pizza', 'title': 'Pizza'},
{'alias': 'italian', 'title': 'Italian'},
{'alias': 'sandwiches', 'title': 'Sandwiches'}],
'coordinates': {'latitude': 41.890357,
'longitude': -87.633704},
'display_phone': '(312) 828-9800',
'distance': 4000.9990531720227,
'id': '8vFJH_paXsMocmEO_KAa3w',
'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/9FiL-9Pbytyg6usOE02lYg/o.jpg',
'is_closed': False,
'location': {'address1': '439 N Wells St',
'address2': '',
'address3': '',
'city': 'Chicago',
'country': 'US',
'display_address': ['439 N Wells St',
'Chicago, IL 60654'],
'state': 'IL',
'zip_code': '60654'},
'name': "Lou Malnati's Pizzeria",
'phone': '+13128289800',
'price': '$$',
'rating': 4.0,
'review_count': 6368,
'transactions': ['pickup', 'delivery'],
'url': 'https://www.yelp.com/biz/lou-malnatis-pizzeria-chicago?adjust_creative=wt2WY5Ii_urZB8YeHggW2g&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=wt2WY5Ii_urZB8YeHggW2g'},
....]
I've tried the code below, and variations of it, but haven't had any luck.
df = pd.DataFrame.from_dict(topresponse)
I'm really new to coding, so any advice would be helpful.

response["businesses"] is a list of records (response here is the dictionary called topresponse in the question), so:
df = pd.DataFrame.from_records(response["businesses"])
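If the nested fields (coordinates, location) should end up as their own columns rather than as dictionaries inside a cell, pandas.json_normalize may be a better fit. The following is a minimal sketch on a trimmed-down copy of the response above; sample_response is a made-up name used only for illustration.

```python
import pandas as pd

# Trimmed-down stand-in for the Yelp response shown in the question.
sample_response = {
    "businesses": [
        {
            "name": "Pequod's Pizzeria",
            "rating": 4.0,
            "coordinates": {"latitude": 41.92187, "longitude": -87.664486},
            "location": {"city": "Chicago", "zip_code": "60614"},
        },
        {
            "name": "Lou Malnati's Pizzeria",
            "rating": 4.0,
            "coordinates": {"latitude": 41.890357, "longitude": -87.633704},
            "location": {"city": "Chicago", "zip_code": "60654"},
        },
    ]
}

# Nested dicts become dotted column names such as "coordinates.latitude".
df = pd.json_normalize(sample_response["businesses"])
print(df.columns.tolist())
```

List-valued fields such as categories or transactions are left as Python lists in their cells; they would need separate handling if you want one row per category.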


How to Arrange a List of Dictionaries in Python

data=[{'address': 'High Tech Campus 60', 'beta': 1.406659, 'ceo': 'Mr. Richard Clemmer', 'changes': -3.9400024, 'cik': '0001413447', 'city': 'Eindhoven', 'companyName': 'NXP Semiconductors N.V.', 'country': 'NL', 'currency': 'USD', ...}]
I have a dictionary.
I need to end up with a list of dictionaries, comma separated: [{},{},..]
How do I add them in a loop?
I tried to use append:
data_list.append(data.copy())
But it returns something different: [[{...}]]
How do I get a list in this format:
[{'address': 'High Tech Campus 60', 'beta': 1.406659, 'ceo': 'Mr. Richard Clemmer', 'changes': -3.9400024, 'cik': '0001413447', 'city': 'Eindhoven', 'companyName': 'NXP Semiconductors N.V.', 'country': 'NL', 'currency': 'USD', ...}, {'address': '41st, 1155 Rene-Leve...W Flr 4000', 'beta': 2.219123, 'ceo': 'Mr. Klaus Paulini', 'changes': -0.00999999, 'cik': '0001113423', 'city': 'MONTREAL', 'companyName': 'Aeterna Zentaris Inc.', 'country': 'CA', 'currency': 'USD', ...}, {'address': '125 Summer Street', 'beta': 0.0, 'ceo': 'Dr. Jean-Pierre Som...ossi Ph.D.', 'changes': 2.5800018, 'cik': '0001593899', 'city': 'Boston', 'companyName': 'Atea Pharmaceuticals, Inc.', 'country': 'US', 'currency': 'USD', ...}, {'address': '401 Charmany Dr', 'beta': 1.073689, 'ceo': 'Mr. Corey Chambas', 'changes': 0.0, 'cik': '0001521951', 'city': 'Madison', 'companyName': 'First Business Finan...ices, Inc.', 'country': 'US', 'currency': 'USD', ...}, {'address': '490 Arsenal Way', 'beta': 0.0, 'ceo': 'Mr. Marc A. Cohen', 'changes': -0.9699974, 'cik': '0001662579', 'city': 'Watertown', 'companyName': 'C4 Therapeutics, Inc.', 'country': 'US', 'currency': 'USD', ...}, {'address': 'General-Guisan-Strasse 6', 'beta': 1.629418, 'ceo': 'Mr. Carlos Creus Moreira', 'changes': -0.09000015, 'cik': '0001738699', 'city': 'Zug', 'companyName': 'WISeKey Internationa...Holding AG', 'country': 'CH', 'currency': 'USD', ...}, {'address': '508 W Wall St Ste 800', 'beta': 1.7762, 'ceo': 'Mr. Stephen Jumper', 'changes': -0.04999995, 'cik': '0000799165', 'city': 'Midland', 'companyName': 'Dawson Geophysical Company', 'country': 'US', 'currency': 'USD', ...}, {'address': '955 Perimeter Road', 'beta': 0.0, 'ceo': 'Mr. Ravi Vig', 'changes': -1.2900009, 'cik': '0000866291', 'city': 'Manchester', 'companyName': 'Allegro MicroSystems, Inc.', 'country': 'US', 'currency': 'USD', ...}, {'address': '490 Lapp Rd', 'beta': 1.138646, 'ceo': 'Ms. 
Geraldine Henwood', 'changes': -0.04999995, 'cik': '0001588972', 'city': 'Malvern', 'companyName': 'Recro Pharma, Inc.', 'country': 'US', 'currency': 'USD', ...}, {'address': '5 Haplada Street, PO Box 5011', 'beta': 1.396288, 'ceo': 'Mr. Guy Bernstein', 'changes': -0.9300003, 'cik': '0000876779', 'city': 'OR YEHUDA', 'companyName': 'Magic Software Enter...rises Ltd.', 'country': 'IL', 'currency': 'USD', ...}, {'address': '111 West 33rd Street', 'beta': 0.0, 'ceo': 'Mr. Richard Gumer', 'changes': -0.20249999, 'cik': '0001823323', 'city': 'New York', 'companyName': 'KL Acquisition Corp', 'country': 'US', 'currency': 'USD', ...}, {'address': '2 Canal Park Ste 4', 'beta': 1.907176, 'ceo': 'Mr. Langley Steinert', 'changes': -1.4399986, 'cik': '0001494259', 'city': 'Cambridge', 'companyName': 'CarGurus, Inc.', 'country': 'US', 'currency': 'USD', ...}, {'address': '119 Standard St', 'beta': 1.592636, 'ceo': 'Mr. Ethan Brown', 'changes': -3.859993, 'cik': '0001655210', 'city': 'El Segundo', 'companyName': 'Beyond Meat, Inc.', 'country': 'US', 'currency': 'USD', ...}, {'address': '3854 American Way Ste A', 'beta': 0.502729, 'ceo': 'Mr. Paul Kusserow', 'changes': -1.5899963, 'cik': '0000896262', 'city': 'Baton Rouge', 'companyName': 'Amedisys, Inc.', 'country': 'US', 'currency': 'USD', ...}, ...]
OK, it looks like what I have initially is not a dictionary but a list containing a single dictionary. So how do I add another dictionary to the list, after the comma?
Update: I managed to get a list of dictionaries. It turned out not to be fully correct, as some rows include additional fields. The list looks like this: 'currency': 'USD', ...}, 'code', 'status', {'address': '5 ...
How can I validate a list of dictionaries and make sure every dictionary matches a predefined list of columns?
data_list.append(data[0].copy())
You could also concatenate the two lists:
data_list = data_list + data
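For the validation part of the question, one option is to filter the list and keep only items that are dictionaries containing every expected key. This is just a sketch: EXPECTED_COLUMNS and keep_valid_rows are hypothetical names, and the column list is a shortened stand-in for the real one.

```python
# Hypothetical column list; replace it with the fields you actually expect.
EXPECTED_COLUMNS = {"address", "city", "country"}

def keep_valid_rows(rows, expected=EXPECTED_COLUMNS):
    """Return only the items that are dicts containing every expected key."""
    return [
        row for row in rows
        if isinstance(row, dict) and expected.issubset(row)
    ]

raw = [
    {"address": "High Tech Campus 60", "city": "Eindhoven", "country": "NL"},
    "code",    # stray strings, like the ones shown in the update above
    "status",
    {"address": "125 Summer Street", "city": "Boston", "country": "US"},
    {"address": "no city here"},  # a dict that is missing expected keys
]

clean = keep_valid_rows(raw)
print(len(clean))
```

If extra keys should also be rejected, compare set(row) == expected instead of using issubset.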

Python Decision Tree: Creating Relationship using Dictionary from Row data

I have hierarchical data (more than 10 generations) that records who each person's parent and children are. I would like to represent it as a dict of dicts. Is there any way to achieve this?
Sample input (list of dicts / DataFrame):
[{'Name': 'Oli Bob', 'Location': 'United Kingdom', 'Parent': nan}, {'Name': 'Mary May', 'Location': 'Germany', 'Parent': 'Oli Bob'}, {'Name': 'Christine Lobowski', 'Location': 'France', 'Parent': 'Oli Bob'}, {'Name': 'Brendon Philips', 'Location': 'USA', 'Parent': 'Oli Bob'}, {'Name': 'Margret Marmajuke', 'Location': 'Canada', 'Parent': 'Brendon Philips'}, {'Name': 'Frank Harbours', 'Location': 'Russia', 'Parent': 'Brendon Philips'}, {'Name': 'Todd Philips', 'Location': 'United Kingdom', 'Parent': 'Frank Harbours'}, {'Name': 'Jamie Newhart', 'Location': 'India', 'Parent': nan}, {'Name': 'Gemma Jane', 'Location': 'China', 'Parent': nan}, {'Name': 'Emily Sykes', 'Location': 'South Korea', 'Parent': 'Emily Sykes'}, {'Name': 'James Newman', 'Location': 'Japan', 'Parent': nan}]
Desired Output
[
{name:"Oli Bob", location:"United Kingdom", _children:[
{name:"Mary May", location:"Germany"},
{name:"Christine Lobowski", location:"France"},
{name:"Brendon Philips", location:"USA",_children:[
{name:"Margret Marmajuke", location:"Canada"},
{name:"Frank Harbours", location:"Russia",_children:[{name:"Todd Philips", location:"United Kingdom"}]},
]},
]},
{name:"Jamie Newhart", location:"India"},
{name:"Gemma Jane", location:"China", _children:[
{name:"Emily Sykes", location:"South Korea"},
]},
{name:"James Newman", location:"Japan"},
];
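One possible way to get from the flat rows to the nested form is to create one node per person and then attach each node to its parent's _children list. The sketch below assumes names are unique and uses a shortened copy of the sample input; build_tree is a made-up helper name. Rows whose parent is missing, unknown, or equal to the row itself are treated as top-level entries, so the self-referencing 'Emily Sykes' row in the sample would need its Parent corrected to land under 'Gemma Jane' as in the desired output. Longer cycles are not detected.

```python
def build_tree(rows):
    """Nest rows under their parents using the 'Name' and 'Parent' fields."""
    # One output node per person, keyed by name (assumes names are unique).
    nodes = {r["Name"]: {"name": r["Name"], "location": r["Location"]} for r in rows}
    roots = []
    for r in rows:
        parent = r["Parent"]
        node = nodes[r["Name"]]
        # NaN is a float, so the isinstance check also filters out missing parents.
        if isinstance(parent, str) and parent in nodes and parent != r["Name"]:
            nodes[parent].setdefault("_children", []).append(node)
        else:
            roots.append(node)
    return roots

rows = [
    {"Name": "Oli Bob", "Location": "United Kingdom", "Parent": float("nan")},
    {"Name": "Mary May", "Location": "Germany", "Parent": "Oli Bob"},
    {"Name": "Brendon Philips", "Location": "USA", "Parent": "Oli Bob"},
    {"Name": "Frank Harbours", "Location": "Russia", "Parent": "Brendon Philips"},
    {"Name": "Todd Philips", "Location": "United Kingdom", "Parent": "Frank Harbours"},
    {"Name": "Jamie Newhart", "Location": "India", "Parent": float("nan")},
]
tree = build_tree(rows)
```

Because the nodes are linked by reference, the order of the rows does not matter: a child can appear before its parent in the input. Starting from a DataFrame, df.to_dict("records") gives rows in this shape.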

Json to Pandas, include "Parents"

With a list of 150+ neighbourhoods, I am using the Foursquare API to retrieve nearby venues within a 500 m radius of a given neighbourhood. Each neighbourhood is expected to return 10-20 nearby venues.
See the snippet of the JSON result returned by Foursquare below.
With results['response']['groups'][0]['items'], I am able to retrieve the nearby venue information and turn it into a table. However, results['response']['groups'][0]['items'] does not contain the neighbourhood (headerFullLocation in the JSON) of the associated venues.
Q: How can I link the neighbourhood (headerFullLocation) to its associated nearby venues and add it as a column to the table? Thanks for the advice.
{'suggestedFilters': {'header': 'Tap to show:',
'filters': [{'name': 'Open now', 'key': 'openNow'}]},
'headerLocation': 'Alexandra Park',
'headerFullLocation': 'Alexandra Park, Toronto',
'headerLocationGranularity': 'neighborhood',
'totalResults': 138,
'suggestedBounds': {'ne': {'lat': 43.6545000045, 'lng': -79.39379244047241},
'sw': {'lat': 43.645499995499996, 'lng': -79.4062075595276}},
'groups': [{'type': 'Recommended Places',
'name': 'recommended',
'items': [{'reasons': {'count': 0,
'items': [{'summary': 'This spot is popular',
'type': 'general',
'reasonName': 'globalInteractionReason'}]},
'venue': {'id': '5644dbaa498e7f7534154326',
'name': 'Maker Pizza',
'contact': {},
'location': {'address': '59 Cameron St',
'lat': 43.6504011331197,
'lng': -79.39804047841302,
'labeledLatLngs': [{'label': 'display',
'lat': 43.6504011331197,
'lng': -79.39804047841302}],
'distance': 164,
'postalCode': 'M5T 2H1',
'cc': 'CA',
'city': 'Toronto',
'state': 'ON',
'country': 'Canada',
'formattedAddress': ['59 Cameron St', 'Toronto ON M5T 2H1', 'Canada']},
'categories': [{'id': '4bf58dd8d48988d1ca941735',
'name': 'Pizza Place',
'pluralName': 'Pizza Places',
'shortName': 'Pizza',
'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/pizza_',
'suffix': '.png'},
'primary': True}],
'verified': False,
'stats': {'tipCount': 0,
'usersCount': 0,
'checkinsCount': 0,
'visitsCount': 0},
'beenHere': {'count': 0,
'lastCheckinExpiredAt': 0,
'marked': False,
'unconfirmedCount': 0},
'photos': {'count': 0, 'groups': []},
'hereNow': {'count': 0, 'summary': 'Nobody here', 'groups': []}},
Why don't you just do venues['Neighbourhood'] = response['headerFullLocation'], where response is results['response']? I am assuming you send a separate request for each neighbourhood and plan to concatenate the venue DataFrames at the end.
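Put together, that idea could look roughly like the sketch below. It assumes each API result has the structure shown in the question; venues_frame is a made-up helper, and fake_results holds two tiny invented responses (the second neighbourhood and its venues are placeholders, not real data).

```python
import pandas as pd

def venues_frame(results):
    """Flatten one response into a table and tag each row with its neighbourhood."""
    response = results["response"]
    items = response["groups"][0]["items"]
    venues = pd.json_normalize(items)
    # The same neighbourhood string is broadcast to every venue row.
    venues["Neighbourhood"] = response["headerFullLocation"]
    return venues

# Two tiny made-up responses standing in for real API results.
fake_results = [
    {"response": {"headerFullLocation": "Alexandra Park, Toronto",
                  "groups": [{"items": [{"venue": {"name": "Maker Pizza"}}]}]}},
    {"response": {"headerFullLocation": "Hypothetical Area, Toronto",
                  "groups": [{"items": [{"venue": {"name": "Venue A"}},
                                        {"venue": {"name": "Venue B"}}]}]}},
]

all_venues = pd.concat([venues_frame(r) for r in fake_results], ignore_index=True)
print(all_venues[["venue.name", "Neighbourhood"]])
```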

Switching keys and values in a dictionary in python with multiple values

Given a dictionary like so (data from geonamescache):
{'3041563': {'geonameid': 3041563,
'name': 'Andorra la Vella',
'latitude': 42.50779,
'longitude': 1.52109,
'countrycode': 'AD',
'population': 20430,
'timezone': 'Europe/Andorra',
'admin1code': '07'},
'290594': {'geonameid': 290594,
'name': 'Umm Al Quwain City',
'latitude': 25.56473,
'longitude': 55.55517,
'countrycode': 'AE',
'population': 62747,
'timezone': 'Asia/Dubai',
'admin1code': '07'},
'291074': {'geonameid': 291074,
'name': 'Ras Al Khaimah City',
'latitude': 25.78953,
'longitude': 55.9432,
'countrycode': 'AE',
'population': 351943,
'timezone': 'Asia/Dubai',
'admin1code': '05'},....
How can I replace each key with the value stored under 'name', for all items in the dict?
Meaning the city name will become the key for each item.
Expected output:
{'Andorra la Vella': {'geonameid': 3041563,
'latitude': 42.50779,
'longitude': 1.52109,
'countrycode': 'AD',
'population': 20430,
'timezone': 'Europe/Andorra',
'admin1code': '07'},
'Umm Al Quwain City': {'geonameid': 290594,
'latitude': 25.56473,
'longitude': 55.55517,
'countrycode': 'AE',
'population': 62747,
'timezone': 'Asia/Dubai',
'admin1code': '07'},
'Ras Al Khaimah City': {'geonameid': 291074,
'latitude': 25.78953,
'longitude': 55.9432,
'countrycode': 'AE',
'population': 351943,
'timezone': 'Asia/Dubai',
'admin1code': '05'},....
Are you looking for an output like this?
{'Andorra la Vella': {'admin1code': '07',
'countrycode': 'AD',
'geonameid': 3041563,
'latitude': 42.50779,
'longitude': 1.52109,
'name': 'Andorra la Vella',
'population': 20430,
'timezone': 'Europe/Andorra'},
'Ras Al Khaimah City': {'admin1code': '05',
'countrycode': 'AE',
'geonameid': 291074,
'latitude': 25.78953,
'longitude': 55.9432,
'name': 'Ras Al Khaimah City',
'population': 351943,
'timezone': 'Asia/Dubai'},
'Umm Al Quwain City': {'admin1code': '07',
'countrycode': 'AE',
'geonameid': 290594,
'latitude': 25.56473,
'longitude': 55.55517,
'name': 'Umm Al Quwain City',
'population': 62747,
'timezone': 'Asia/Dubai'}}
If so, you can create a new dictionary of this format from the existing one. Here is one way you can do it, where dicc is your existing dictionary.
newdic = {}
for key, val in dicc.items():
    newdic[val['name']] = val
print(newdic)
Just use the 'name' values as the new keys:
data = ...  # Your data here
renamed = {}
for geocode, area in data.items():
    area = dict(area)  # copy, so the original entry is left untouched
    cityname = area.pop("name")  # remove 'name' and use its value as the new key
    renamed[cityname] = area

Replace string in list with python

I'm trying to rename all elements named 'number' to 'numbr' in the data list, but I can't get it working.
Edit: each key 'number' should be renamed to 'numbr'. Values stay as they are.
What am I doing wrong?
Thank you for your help!
data = [{'address': {
'city': 'city A',
'company_name': 'company A'},
'amount': 998,
'items': [{'description': 'desc A1','number': 'number A1'}],
'number': 'number of A',
'service_date': {
'type': 'DEFAULT',
'date': '2015-11-18'},
'vat_option': 123},
{'address': {
'city': 'city B',
'company_name': 'company B'},
'amount': 222,
'items': [{'description': 'desc B1','number': 'number B1'},
{'description': 'desc B2','number': 'number B2'}],
'number': 'number of B',
'service_date': {
'type': 'DEFAULT',
'date': '2015-11-18'},
'vat_option': 456}
]
def replace(l, X, Y):
    for i,v in enumerate(l):
        if v == X:
            l.pop(i)
            l.insert(i, Y)
replace(data, 'number', 'numbr')
print data
The following is a recursive replace implementation that replaces p1 by p2 in any string it encounters in the s object, recursing through lists, sets, tuples, dicts (both keys and values):
def nested_replace(s, p1, p2):
    if isinstance(s, basestring):  # Python2
    # if isinstance(s, str):  # Python3
        return s.replace(p1, p2)
    if isinstance(s, (list, tuple, set)):
        return type(s)(nested_replace(x, p1, p2) for x in s)
    if isinstance(s, dict):
        return {nested_replace(k, p1, p2): nested_replace(v, p1, p2) for k, v in s.items()}
    return s
>>> from pprint import pprint
>>> pprint(nested_replace(data, 'number', 'numbr'))
[{'address': {'city': 'city A', 'company_name': 'company A'},
'amount': 998,
'items': [{'description': 'desc A1', 'numbr': 'numbr A1'}],
'numbr': 'numbr of A',
'service_date': {'date': '2015-11-18', 'type': 'DEFAULT'},
'vat_option': 123},
{'address': {'city': 'city B', 'company_name': 'company B'},
'amount': 222,
'items': [{'description': 'desc B1', 'numbr': 'numbr B1'},
{'description': 'desc B2', 'numbr': 'numbr B2'}],
'numbr': 'numbr of B',
'service_date': {'date': '2015-11-18', 'type': 'DEFAULT'},
'vat_option': 456}]
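Note that the output above also changes the values ('numbr of A'), whereas the edit to the question asks for the values to stay as they are. A variant that only renames keys could look like the sketch below; rename_key is a made-up name, it handles only dicts and lists, and it is demonstrated on a cut-down version of data.

```python
def rename_key(obj, old, new):
    """Return a copy of obj with every dict key equal to `old` renamed to `new`."""
    if isinstance(obj, dict):
        return {(new if k == old else k): rename_key(v, old, new)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [rename_key(x, old, new) for x in obj]
    return obj  # strings, numbers, etc. are returned unchanged

sample = [{"number": "number of A",
           "items": [{"description": "desc A1", "number": "number A1"}]}]
renamed = rename_key(sample, "number", "numbr")
print(renamed)
```

The original list is not modified; the function builds new dicts and lists on the way back up the recursion.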
The eval function is an anti-pattern, but I think eval is the best solution here:
data1 = eval(repr(data).replace('number', 'numbr'))
If you are trying to replace both keys and values, this will work:
from json import dumps, loads
data = [{'address': {
'city': 'city A',
'company_name': 'company A'},
'amount': 998,
'items': [{'description': 'desc A1','number': 'number A1'}],
'number': 'number of A',
'service_date': {
'type': 'DEFAULT',
'date': '2015-11-18'},
'vat_option': 123},
{'address': {
'city': 'city B',
'company_name': 'company B'},
'amount': 222,
'items': [{'description': 'desc B1','number': 'number B1'},
{'description': 'desc B2','number': 'number B2'}],
'number': 'number of B',
'service_date': {
'type': 'DEFAULT',
'date': '2015-11-18'},
'vat_option': 456}
]
data_string = dumps(data)
data = loads(data_string.replace('number', 'numbr'))
