I have a JSON file with location data. Below is a sample from the file.
[
{
"id": 1,
"name": "Western Cape",
"filename": "1",
"type": "Province",
"typeCode": 1
},
{
"id": 2,
"name": "Eastern Cape",
"filename": "2",
"type": "Province",
"typeCode": 1
},
{
"id": 3,
"name": "Northern Cape",
"filename": "3",
"type": "Province",
"typeCode": 1
},
{
"id": 4,
"name": "Free State",
"filename": "4",
"type": "Province",
"typeCode": 1
},
{
"id": 5,
"name": "KwaZulu-Natal",
"filename": "5",
"type": "Province",
"typeCode": 1
},
{
"id": 6,
"name": "North West",
"filename": "6",
"type": "Province",
"typeCode": 1
},
{
"id": 7,
"name": "Gauteng",
"filename": "7",
"type": "Province",
"typeCode": 1
},
{
"id": 8,
"name": "Mpumalanga",
"filename": "8",
"type": "Province",
"typeCode": 1
},
{
"id": 9,
"name": "Limpopo",
"filename": "9",
"type": "Province",
"typeCode": 1
},
{
"id": 199,
"name": "City of Cape Town",
"filename": "1.199",
"type": "Metropolitan Municipality",
"typeCode": 2,
"parent": 1
},
{
"id": 260,
"name": "Buffalo City",
"filename": "2.260",
"type": "Metropolitan Municipality",
"typeCode": 2,
"parent": 2
},
{
"id": 299,
"name": "Nelson Mandela Bay",
"filename": "2.299",
"type": "Metropolitan Municipality",
"typeCode": 2,
"parent": 2
},
{
"id": 499,
"name": "Mangaung",
"filename": "4.499",
"type": "Metropolitan Municipality",
"typeCode": 2,
"parent": 4
},
{
"id": 599,
"name": "eThekwini",
"filename": "5.599",
"type": "Metropolitan Municipality",
"typeCode": 2,
"parent": 5
},
{
"id": 797,
"name": "Ekurhuleni",
"filename": "7.797",
"type": "Metropolitan Municipality",
"typeCode": 2,
"parent": 7
},
{
"id": 798,
"name": "City of Johannesburg",
"filename": "7.798",
"type": "Metropolitan Municipality",
"typeCode": 2,
"parent": 7
},
{
"id": 799,
"name": "City of Tshwane",
"filename": "7.799",
"type": "Metropolitan Municipality",
"typeCode": 2,
"parent": 7
}]
I am looking to achieve the following output:
{'Eastern Cape': {u'Buffalo City': {}, u'Nelson Mandela Bay': {}}, 'Gauteng': {u'Ekurhuleni': {}, u'City of Johannesburg': {}, u'City of Tshwane': {}}, 'North West': {}, 'Mpumalanga': {}, 'Limpopo': {}, 'Western Cape': {u'City of Cape Town': {}}, 'KwaZulu-Natal': {u'eThekwini': {}}, 'Northern Cape': {}, 'Free State': {u'Mangaung': {}}}
I have written the following code block to achieve it:
province_dict = {}
final_dict = {
    'Western Cape': {},
    'Eastern Cape': {},
    'Northern Cape': {},
    'Free State': {},
    'KwaZulu-Natal': {},
    'North West': {},
    'Gauteng': {},
    'Mpumalanga': {},
    'Limpopo': {},
}
for item in data:
    if item['type'] == 'Province':
        province_dict.update({item['id']: item['name']})
for item in data:
    if item['type'] != 'Province':
        if item['parent'] in province_dict.keys():
            final_dict[province_dict[item['parent']]].update({item['name']: {}})
print(final_dict)
However, there seem to be some problems:
This isn't the most Pythonic way to achieve this.
I am not limited to Provinces and Metropolitan Municipalities; I also have District Municipalities and so on. They are, however, all governed by the same rule: every child has a parent id, with the Province as the root.
I need to create a hierarchical structure as shown above, with n levels of nesting possible.
It would be helpful if someone could help me achieve this.
I think you might want to create an abstract data type to do that efficiently. The ADT class could keep two dictionaries: one for the root items, and one mapping each parent id to its children.
This is just a sketch. I don't have time at the moment to create a fully functional class, but I think this would help.
class LocationTree:
    def __init__(self):
        self._roots = {}      # id -> item with no parent
        self._children = {}   # parent id -> list of child items

    def add_obj(self, obj):
        if obj.get('parent') is None:
            self._roots[obj['id']] = obj
        else:
            self._children.setdefault(obj['parent'], []).append(obj)

    def print_tree(self, node_id=None, depth=0):
        items = self._roots.values() if node_id is None else self._children.get(node_id, [])
        for obj in items:
            print('  ' * depth + obj['name'])
            self.print_tree(obj['id'], depth + 1)
If you are still struggling, when I have time either later today or tomorrow, I'll come back to this answer and see if I can make it functional.
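If it helps in the meantime, the n-level tree from the question can also be built without a class, in one pass over the flat list plus an id lookup. This is a minimal sketch assuming every non-root record carries a "parent" id, as in the sample:

```python
def build_tree(data):
    """Build a name -> {children} mapping of arbitrary depth."""
    # Pre-create a children dict for every record, keyed by id, so that
    # the order of records in the file does not matter.
    children_of = {item["id"]: {} for item in data}
    tree = {}
    for item in data:
        node = children_of[item["id"]]
        if "parent" in item:
            # Attach this record under its parent's children dict.
            children_of[item["parent"]][item["name"]] = node
        else:
            # No parent: this record is a root (a Province).
            tree[item["name"]] = node
    return tree

# Tiny sample in the same shape as the question's file.
data = [
    {"id": 1, "name": "Western Cape", "type": "Province"},
    {"id": 199, "name": "City of Cape Town",
     "type": "Metropolitan Municipality", "parent": 1},
]
tree = build_tree(data)
```

Because every node's children dict exists before the linking pass, District Municipalities (or any deeper level) nest correctly regardless of where they appear in the file.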
{
"generated_at": "2022-09-19T15:30:42+00:00",
"sport_schedule_sport_event_markets": [{
"sport_event": {
"id": "sr:sport_event:33623209",
"start_time": "2022-09-18T17:00:00+00:00",
"start_time_confirmed": true,
"competitors": [{
"id": "sr:competitor:4413",
"name": "Baltimore Ravens",
"country": "USA",
"country_code": "USA",
"abbreviation": "BAL",
"qualifier": "home",
"rotation_number": 264
}, {
"id": "sr:competitor:4287",
"name": "Miami Dolphins",
"country": "USA",
"country_code": "USA",
"abbreviation": "MIA",
"qualifier": "away",
"rotation_number": 263
}]
},
"markets": [{
"id": "sr:market:1",
"name": "1x2",
"books": [{
"id": "sr:book:18149",
"name": "DraftKings",
"removed": true,
"external_sport_event_id": "180327504",
"external_market_id": "120498143",
"outcomes": [{
"id": "sr:outcome:1",
"type": "home",
"odds_decimal": "1.13",
"odds_american": "-800",
"odds_fraction": "1\/8",
"open_odds_decimal": "1.37",
"open_odds_american": "-270",
"open_odds_fraction": "10\/27",
"external_outcome_id": "0QA120498143#1341135376_13L88808Q1468Q20",
"removed": true
}, {
"id": "sr:outcome:2",
"type": "draw",
"odds_decimal": "31.00",
"odds_american": "+3000",
"odds_fraction": "30\/1",
"open_odds_decimal": "36.00",
"open_odds_american": "+3500",
"open_odds_fraction": "35\/1",
"external_outcome_id": "0QA120498143#1341135377_13L88808Q10Q22",
"removed": true
}, {
"id": "sr:outcome:3",
"type": "away",
"odds_decimal": "6.00",
"odds_american": "+500",
"odds_fraction": "5\/1",
"open_odds_decimal": "2.95",
"open_odds_american": "+195",
"open_odds_fraction": "39\/20",
"external_outcome_id": "0QA120498143#1341135378_13L88808Q1329Q21",
"removed": true
}]
I'm trying to get the outcomes as the main table, with the sport event ID attached as meta. The attempt below is not working:
#df =pd.json_normalize(data,record_path=['sport_schedule_sport_event_markets','markets','books','outcomes'],meta=[['sport_schedule_sport_event_markets','sport_event']],meta_prefix='game_',errors='ignore')
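One way that sidesteps the meta limitation (a sketch, not the only option): normalize each event's outcomes on their own and attach the event id as an ordinary column. The trimmed `data` dict and the `game_id` column name below are assumptions for illustration.

```python
import pandas as pd

# Trimmed stand-in for the feed above (only the fields the code touches).
data = {
    "sport_schedule_sport_event_markets": [{
        "sport_event": {"id": "sr:sport_event:33623209"},
        "markets": [{
            "id": "sr:market:1",
            "books": [{
                "id": "sr:book:18149",
                "outcomes": [
                    {"id": "sr:outcome:1", "type": "home", "odds_decimal": "1.13"},
                    {"id": "sr:outcome:3", "type": "away", "odds_decimal": "6.00"},
                ],
            }],
        }],
    }],
}

# Normalize each event's outcomes separately, tag the rows with that
# event's id, then stack everything into one table.
frames = []
for event in data["sport_schedule_sport_event_markets"]:
    outcomes = pd.json_normalize(event, record_path=["markets", "books", "outcomes"])
    outcomes["game_id"] = event["sport_event"]["id"]
    frames.append(outcomes)
df = pd.concat(frames, ignore_index=True)
```

This gives one row per outcome with the event id repeated down the `game_id` column, which is usually easier to reason about than a multi-level meta path.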
list_1 = [
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Owner" },
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "21", "name": "abc", "email": "abc#network.com", "type": "Employ" },
{ "id": "21", "name": "abc", "email": "abc#network.com", "type": "Manager" },
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "100", "name": "last", "email": "last#network.com", "type": "Manager" }
]
list_2 = [
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "21", "name": "abc", "email": "abc#network.com", "type": "Employ" },
{ "id": "52", "name": "abcded", "email": "abcded#network.com", "type": "Manager" }
]
A preference dictionary: {'Owner': 1, 'Manager': 2, 'Employ': 3, 'HR': 4}
There are two lists of dictionaries: list_1 is the primary list and list_2 is the secondary list.
I need to update the role ('type') in a list_1 dictionary if its 'email' is present in the secondary list, choosing the role by checking the preference dictionary.
In the end, the output's 'type' should be the winning role according to the preference dictionary.
If any emails match, I want to update list_1 at all places with that email to contain a new role under 'type'. For example, if 123#gmail.com were in a dictionary within list_2, then I want to change every item in list_1 that contains 123#gmail.com as the email to have a role determined by a selection from the preference dictionary.
Expected output:
[
{'id': '11', 'name': 'son', 'email': 'n#network.com', 'type': 'Owner'},
{'id': '21', 'name': 'abc', 'email': 'abc#network.com', 'type': 'Manager'},
{"id": "100", "name": "last", "email": "last#network.com", "type": "Manager"}
]
My code so far:
for each1 in list_1:
    for each2 in list_2:
        if each1['email'] == each2['email']:
            pass  # Update the list_1 dictionary with respect to preference
Given the following inputs:
list_1 = [
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Owner" },
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "21", "name": "abc", "email": "abc#network.com", "type": "Employ" },
{ "id": "21", "name": "abc", "email": "abc#network.com", "type": "Manager" },
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "100", "name": "last", "email": "last#network.com", "type": "Manager" }
]
list_2 = [
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "21", "name": "abc", "email": "abc#network.com", "type": "Employ" },
{ "id": "52", "name": "abcded", "email": "abcded#network.com", "type": "Manager" }
]
preference = {'Owner': 1, 'Manager':2, 'Employ':3, 'HR': 4 }
I would start by reshaping list_2 into something more useable. This will get rid of the duplicates leaving us just the "best" type for each email:
list_2_lookup = {}
for item in list_2:
    key, value = item["email"], item["type"]
    list_2_lookup.setdefault(key, value)
    if preference[value] < preference[list_2_lookup[key]]:
        list_2_lookup[key] = value
Then we can iterate over the items in the first list and use the lookup we just created. Note: this is a little more convoluted than might be needed, as it is not clear from your question and your expected result which items from list_1 should actually appear in the output; I have tried to match your stated output.
result = {}
for item in list_1:
    key = item["email"]
    result.setdefault(key, item)
    if preference[result[key]["type"]] > preference[item["type"]]:
        result[key]["type"] = item["type"]
    if preference[result[key]["type"]] > preference.get(list_2_lookup.get(key), 99):
        result[key]["type"] = list_2_lookup.get(key)
At this point we can:
print(list(result.values()))
Giving us:
[
{'id': '11', 'name': 'son', 'email': 'n#network.com', 'type': 'Owner'},
{'id': '21', 'name': 'abc', 'email': 'abc#network.com', 'type': 'Manager'},
{'id': '100', 'name': 'last', 'email': 'last#network.com', 'type': 'Manager'}
]
Here is code that will do the following:
If any emails match, update list_1 at all places with that email to contain a new role under 'type'. For example, if 123#gmail.com were in a dictionary within list_2, then it will change every item in list_1 that contains 123#gmail.com as the email to have a role determined by a selection from the preference dictionary.
list_1 = [
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Owner" },
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "21", "name": "abc", "email": "abc#network.com", "type": "Employ" },
{ "id": "21", "name": "abc", "email": "abc#network.com", "type": "Manager" },
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "100", "name": "last", "email": "last#network.com", "type": "Manager" }
]
list_2 = [
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "11", "name": "son", "email": "n#network.com", "type": "Manager" },
{ "id": "21", "name": "abc", "email": "abc#network.com", "type": "Employ" },
{ "id": "52", "name": "abcded", "email": "abcded#network.com", "type": "Manager" }
]
pref = {1: 'owner', 2: 'Manager', 3: 'employ', 4: 'HR'}
choice = 1
for idx1, each1 in enumerate(list_1):
    for each2 in list_2:
        if each1['email'] == each2['email']:
            print(list_1[idx1]['type'])
            print(pref[choice])
            list_1[idx1]['type'] = pref[choice]
print(list_1)
I just downloaded some JSON from Spotify and took a look at pd.json_normalize(). But if I normalise the data I still have dictionaries within my DataFrame, and setting the max_level doesn't help either.
The data I want to have in my DataFrame:
{
"collaborative": false,
"description": "",
"external_urls": {
"spotify": "https://open.spotify.com/playlist/5"
},
"followers": {
"href": null,
"total": 0
},
"href": "https://api.spotify.com/v1/playlists/5?additional_types=track",
"id": "5",
"images": [
{
"height": 640,
"url": "https://i.scdn.co/image/a",
"width": 640
}
],
"name": "Another",
"owner": {
"display_name": "user",
"external_urls": {
"spotify": "https://open.spotify.com/user/user"
},
"href": "https://api.spotify.com/v1/users/user",
"id": "user",
"type": "user",
"uri": "spotify:user:user"
},
"primary_color": null,
"public": true,
"snapshot_id": "M2QxNTcyYTkMDc2",
"tracks": {
"href": "https://api.spotify.com/v1/playlists/100&additional_types=track",
"items": [
{
"added_at": "2020-12-13T18:34:09Z",
"added_by": {
"external_urls": {
"spotify": "https://open.spotify.com/user/user"
},
"href": "https://api.spotify.com/v1/users/user",
"id": "user",
"type": "user",
"uri": "spotify:user:user"
},
"is_local": false,
"primary_color": null,
"track": {
"album": {
"album_type": "album",
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/1dfeR4Had"
},
"href": "https://api.spotify.com/v1/artists/1dfDbWqFHLkxsg1d",
"id": "1dfeR4HaWDbWqFHLkxsg1d",
"name": "Q",
"type": "artist",
"uri": "spotify:artist:1dfeRqFHLkxsg1d"
}
],
"available_markets": [
"CA",
"US"
],
"external_urls": {
"spotify": "https://open.spotify.com/album/6wPXmlLzZ5cCa"
},
"href": "https://api.spotify.com/v1/albums/6wPXUJ9LzZ5cCa",
"id": "6wPXUmYJ9zZ5cCa",
"images": [
{
"height": 640,
"url": "https://i.scdn.co/image/ab676620a47",
"width": 640
},
{
"height": 300,
"url": "https://i.scdn.co/image/ab67616d0620a47",
"width": 300
},
{
"height": 64,
"url": "https://i.scdn.co/image/ab603e6620a47",
"width": 64
}
],
"name": "The (Deluxe ",
"release_date": "1920-07-17",
"release_date_precision": "day",
"total_tracks": 15,
"type": "album",
"uri": "spotify:album:6m5cCa"
},
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/1dg1d"
},
"href": "https://api.spotify.com/v1/artists/1dsg1d",
"id": "1dfeR4HaWDbWqFHLkxsg1d",
"name": "Q",
"type": "artist",
"uri": "spotify:artist:1dxsg1d"
}
],
"available_markets": [
"CA",
"US"
],
"disc_number": 1,
"duration_ms": 21453,
"episode": false,
"explicit": false,
"external_ids": {
"isrc": "GBU6015"
},
"external_urls": {
"spotify": "https://open.spotify.com/track/5716J"
},
"href": "https://api.spotify.com/v1/tracks/5716J",
"id": "5716J",
"is_local": false,
"name": "Another",
"popularity": 73,
"preview_url": null,
"track": true,
"track_number": 3,
"type": "track",
"uri": "spotify:track:516J"
},
"video_thumbnail": {
"url": null
}
}
],
"limit": 100,
"next": null,
"offset": 0,
"previous": null,
"total": 1
},
"type": "playlist",
"uri": "spotify:playlist:fek"
}
So what are the best practices for reading nested data like this into one DataFrame in pandas?
I'd be glad for any advice.
EDIT:
Basically I want all keys as columns in my DataFrame, but json_normalize stops at "tracks.items", and if I normalise that again I run into the same recursion problem.
It depends on the information you are looking for. Take a look at pandas.read_json() to see if that can work. You can also select data directly, like this:
json_output = {"collaborative": 'false',"description": "", "external_urls": {"spotify": "https://open.spotify.com/playlist/5"}}
df['collaborative'] = json_output['collaborative'] #set value of your df to value of returned json values
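A sketch of how `pd.json_normalize` can expand the nested track list into rows while keeping playlist-level fields as columns. The trimmed `playlist` dict below stands in for the full response in the question, and the `playlist_` prefix is my choice:

```python
import pandas as pd

# Minimal stand-in for the parsed Spotify playlist response.
playlist = {
    "name": "Another",
    "id": "5",
    "public": True,
    "tracks": {
        "items": [
            {"added_at": "2020-12-13T18:34:09Z",
             "track": {"name": "Another", "popularity": 73}},
        ],
        "total": 1,
    },
}

# record_path walks into the nested "tracks.items" list (one row per
# track); meta keeps playlist-level scalars as columns; dicts inside
# each item are flattened into dotted column names like "track.name".
df = pd.json_normalize(
    playlist,
    record_path=["tracks", "items"],
    meta=["name", "id", "public"],
    meta_prefix="playlist_",
)
```

The resulting columns are `added_at`, `track.name`, `track.popularity`, plus `playlist_name`, `playlist_id`, and `playlist_public`; everything is scalar, with no dicts left inside the frame.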
I'm new to Python, and I want to merge two JSON files.
There should not be any duplicates:
if the value and name are the same, I add the keys together and keep a single record; otherwise, I keep the record as-is.
File 1:
[ {
"key": 1,
"name": "test",
"value": "NY"
},
{
"key": 1,
"name": "test",
"value": "CA"
},
{
"key": 1,
"name": "test",
"value": "MA"
},
{
"key": 1,
"name": "test",
"value": "MA"
}
]
File 2:
[ {
"key": 1,
"name": "test",
"value": "NJ"
},
{
"key": 1,
"name": "test",
"value": "CA"
},
{
"key": 1,
"name": "test",
"value": "TX"
},
{
"key": 1,
"name": "test",
"value": "MA"
}
]
and the merged file output should be:
[
{
"key": 1,
"name": "test",
"value": "NY"
},
{
"key": 3,
"name": "test",
"value": "MA"
},
{
"key": 1,
"name": "test",
"value": "NJ"
},
{
"key": 2,
"name": "test",
"value": "CA"
},
{
"key": 1,
"name": "test",
"value": "TX"
}
]
The order of the records does not matter.
I have tried several approaches, like merging the files and then iterating over them, and parsing both files separately, but I'm facing issues, being new to Python.
This should help.
# -*- coding: utf-8 -*-
f1 = [ {
"key": 1,
"value": "NY"
},
{
"key": 1,
"value": "CA"
},
{
"key": 1,
"value": "MA"
}
]
f2 = [ {
"key": 1,
"value": "NJ"
},
{
"key": 1,
"value": "CA"
},
{
"key": 1,
"value": "TX"
}
]
check = [i["value"] for i in f1] #check list to see if the value already exist in f1.
for i in f2:
    if i['value'] not in check:
        f1.append(i)
print(f1)
Output:
[{'value': 'NY', 'key': 1}, {'value': 'CA', 'key': 1}, {'value': 'MA', 'key': 1}, {'value': 'NJ', 'key': 1}, {'value': 'TX', 'key': 1}]
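Note that the question's expected output also sums the keys of records that share the same (name, value) pair. A `collections.Counter` sketch can do that summing, under the reading that "key" is a count to be accumulated:

```python
from collections import Counter

# Records sharing the same (name, value) across both files get their
# "key" values added together; unique records keep their key as-is.
def merge_records(file_1, file_2):
    counts = Counter()
    for rec in file_1 + file_2:
        counts[(rec["name"], rec["value"])] += rec["key"]
    return [{"key": key, "name": name, "value": value}
            for (name, value), key in counts.items()]

# Small sample in the same shape as File 1 and File 2 above.
file_1 = [{"key": 1, "name": "test", "value": "MA"},
          {"key": 1, "name": "test", "value": "MA"}]
file_2 = [{"key": 1, "name": "test", "value": "MA"},
          {"key": 1, "name": "test", "value": "TX"}]
merged = merge_records(file_1, file_2)
```

Run against the full File 1 and File 2 from the question, this yields key 3 for "MA" and key 2 for "CA", matching the expected output (record order aside).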
I have the following JSON object that I get from the Instagram API, it can have n number of posts (depending upon the count parameter provided).
{
"pagination": {
"next_url": "https:\/\/api.instagram.com\/v1\/users\/3\/media\/recent?access_token=184046392.f59def8.c5726b469ad2462f85c7cea5f72083c0&max_id=205140190233104928_3",
"next_max_id": "205140190233104928_3"
},
"meta": {
"code": 200
},
"data": [{
"attribution": null,
"tags": [],
"type": "image",
"location": {
"latitude": 37.798594362,
"name": "Presidio Bowling Center",
"longitude": -122.459878922,
"id": 27052
},
"comments": {
"count": 132,
"data": [{
"created_time": "1342734265",
"text": "Distinguishing!",
"from": {
"username": "naiicamilos",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_53690312_75sq_1336573463.jpg",
"id": "53690312",
"full_name": "Naii Camilos"
},
"id": "239194812924826175"
}, {
"created_time": "1342737428",
"text": "#kevin in Spanish Presidio means Jail",
"from": {
"username": "jm0426",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_25881992_75sq_1342156673.jpg",
"id": "25881992",
"full_name": "Juan Mayen"
},
"id": "239221343768285211"
}, {
"created_time": "1342768120",
"text": "Good imagination",
"from": {
"username": "kidloca",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_193133903_75sq_1342032241.jpg",
"id": "193133903",
"full_name": "Khaleda Noon"
},
"id": "239478811731694145"
}, {
"created_time": "1342775967",
"text": "Cwl!",
"from": {
"username": "awesomeath",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_179252164_75sq_1339745821.jpg",
"id": "179252164",
"full_name": "awesomeath"
},
"id": "239544638740894674"
}, {
"created_time": "1342796153",
"text": "\u597d\u7f8e\u263a",
"from": {
"username": "hidelau",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_47295330_75sq_1342763977.jpg",
"id": "47295330",
"full_name": "Hide Lau"
},
"id": "239713963951001995"
}, {
"created_time": "1343018007",
"text": "#mindfreak",
"from": {
"username": "info2021",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/anonymousUser.jpg",
"id": "27664191",
"full_name": "info2021"
},
"id": "241575017119224582"
}, {
"created_time": "1343068374",
"text": "#kevin please share and promote my last pic. This will be the new hype as instagram",
"from": {
"username": "thansy_mansy",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_189019343_75sq_1342951587.jpg",
"id": "189019343",
"full_name": "thansy_mansy"
},
"id": "241997523093295303"
}, {
"created_time": "1343068382",
"text": "#kevin :P",
"from": {
"username": "thansy_mansy",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_189019343_75sq_1342951587.jpg",
"id": "189019343",
"full_name": "thansy_mansy"
},
"id": "241997589589790922"
}]
},
"filter": "Rise",
"created_time": "1342676212",
"link": "http:\/\/instagr.am\/p\/NQD4KAABKF\/",
"likes": {
"count": 4810,
"data": [{
"username": "caitlyn_hammonds",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/anonymousUser.jpg",
"id": "198322184",
"full_name": "caitlyn_hammonds"
}, {
"username": "sophiafrancis",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_43092892_75sq_1340548333.jpg",
"id": "43092892",
"full_name": "Sophiaaa."
}, {
"username": "amna7861",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_175807260_75sq_1343135903.jpg",
"id": "175807260",
"full_name": "Amna Haroon"
}, {
"username": "yaya0318",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_74056_75sq_1287001004.jpg",
"id": "74056",
"full_name": "Mao Yaya"
}, {
"username": "jay_damage",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_197465040_75sq_1342932411.jpg",
"id": "197465040",
"full_name": "jay_damage"
}, {
"username": "reves",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_671833_75sq_1335966794.jpg",
"id": "671833",
"full_name": "Fernando D. Ramirez"
}, {
"username": "lizray1",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_198450407_75sq_1343144120.jpg",
"id": "198450407",
"full_name": "lizray1"
}, {
"username": "alivewtheglory",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_37907416_75sq_1341561441.jpg",
"id": "37907416",
"full_name": "Marilynn C"
}, {
"username": "mnforever55",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_29255977_75sq_1334833008.jpg",
"id": "29255977",
"full_name": "mnforever55"
}]
},
"images": {
"low_resolution": {
"url": "http:\/\/distilleryimage9.s3.amazonaws.com\/beb7f896d16311e19fe21231380f3636_6.jpg",
"width": 306,
"height": 306
},
"thumbnail": {
"url": "http:\/\/distilleryimage9.s3.amazonaws.com\/beb7f896d16311e19fe21231380f3636_5.jpg",
"width": 150,
"height": 150
},
"standard_resolution": {
"url": "http:\/\/distilleryimage9.s3.amazonaws.com\/beb7f896d16311e19fe21231380f3636_7.jpg",
"width": 612,
"height": 612
}
},
"caption": {
"created_time": "1342676255",
"text": "Happy birthday #amy !",
"from": {
"username": "kevin",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_3_75sq_1325536697.jpg",
"id": "3",
"full_name": "Kevin Systrom"
},
"id": "238708186813567655"
},
"user_has_liked": false,
"id": "238707833418289797_3",
"user": {
"username": "kevin",
"website": "",
"bio": "CEO & Co-founder of Instagram",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_3_75sq_1325536697.jpg",
"full_name": "Kevin Systrom",
"id": "3"
}
}, {
"attribution": null,
"tags": [],
"type": "image",
"location": {
"latitude": 38.503100608,
"name": "Goose & Gander",
"longitude": -122.468387538,
"id": 12059278
},
"comments": {
"count": 85,
"data": [{
"created_time": "1342555499",
"text": "Cheers !!! \ud83d\ude18\ud83d\ude18",
"from": {
"username": "kattiab",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_1345073_75sq_1340495505.jpg",
"id": "1345073",
"full_name": "kattia b"
},
"id": "237695212468572732"
}, {
"created_time": "1342558279",
"text": "happy birthday instagram!",
"from": {
"username": "alanasayshi",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_4235095_75sq_1341960681.jpg",
"id": "4235095",
"full_name": "Alana Boy\u00e9r"
},
"id": "237718535382504383"
}, {
"created_time": "1342567977",
"text": "Happy Natal Day Instagram!",
"from": {
"username": "cynrtst",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_12493918_75sq_1341538462.jpg",
"id": "12493918",
"full_name": "Cynthia L"
},
"id": "237799888639758668"
}, {
"created_time": "1342568896",
"text": "Happy Birthday \ud83c\udf89\ud83c\udf89\ud83c\udf89 was it a long labour \ud83d\ude02\ud83d\ude02\ud83d\ude02",
"from": {
"username": "relzie",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_15718874_75sq_1332290975.jpg",
"id": "15718874",
"full_name": "Relz"
},
"id": "237807595966960050"
}, {
"created_time": "1342579289",
"text": "Cheers #kevin and Happy Birthday #instagram thank you so much Kevin for creating instagram it's truly got me back out there taking more photos and falling in love with photography all over again...",
"from": {
"username": "bpphotographs",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_12149171_75sq_1339457436.jpg",
"id": "12149171",
"full_name": "bpphotographs"
},
"id": "237894779172557723"
}, {
"created_time": "1342652660",
"text": "#suz_h",
"from": {
"username": "ianyorke",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_1041088_75sq_1339844137.jpg",
"id": "1041088",
"full_name": "Ian Yorke"
},
"id": "238510264889118973"
}, {
"created_time": "1342667574",
"text": "Love your app\ud83d\udc97\ud83d\udc97\ud83d\udc97\ud83d\udc97\ud83d\udc97",
"from": {
"username": "gothangel1997",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_186799125_75sq_1342668825.jpg",
"id": "186799125",
"full_name": "Angel Mercado"
},
"id": "238635368570687940"
}, {
"created_time": "1342843274",
"text": "\ud83d\ude09\u2764",
"from": {
"username": "andescu",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_145554839_75sq_1337641309.jpg",
"id": "145554839",
"full_name": "andescu"
},
"id": "240109245637333685"
}]
},
"filter": "Sierra",
"created_time": "1342332400",
"link": "http:\/\/instagr.am\/p\/NF0G6bABA2\/",
"likes": {
"count": 3282,
"data": [{
"username": "caysondesigns",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_23464609_75sq_1329177054.jpg",
"id": "23464609",
"full_name": "Jasmine at Cayson Designs"
}, {
"username": "m_azooz16",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_198179404_75sq_1343096096.jpg",
"id": "198179404",
"full_name": "m_azooz16"
}, {
"username": "shulinghuang",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_144887495_75sq_1342460246.jpg",
"id": "144887495",
"full_name": "H\u3002"
}, {
"username": "caitlyn_hammonds",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/anonymousUser.jpg",
"id": "198322184",
"full_name": "caitlyn_hammonds"
}, {
"username": "sophiafrancis",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_43092892_75sq_1340548333.jpg",
"id": "43092892",
"full_name": "Sophiaaa."
}, {
"username": "beatle1234",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/anonymousUser.jpg",
"id": "197988834",
"full_name": "beatle1234"
}, {
"username": "yaya0318",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_74056_75sq_1287001004.jpg",
"id": "74056",
"full_name": "Mao Yaya"
}, {
"username": "lizray1",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_198450407_75sq_1343144120.jpg",
"id": "198450407",
"full_name": "lizray1"
}, {
"username": "rawr1234321",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_198492630_75sq_1343151765.jpg",
"id": "198492630",
"full_name": "rawr1234321"
}]
},
"images": {
"low_resolution": {
"url": "http:\/\/distilleryimage6.s3.amazonaws.com\/3ec59e18ce4311e1b8031231380702ee_6.jpg",
"width": 306,
"height": 306
},
"thumbnail": {
"url": "http:\/\/distilleryimage6.s3.amazonaws.com\/3ec59e18ce4311e1b8031231380702ee_5.jpg",
"width": 150,
"height": 150
},
"standard_resolution": {
"url": "http:\/\/distilleryimage6.s3.amazonaws.com\/3ec59e18ce4311e1b8031231380702ee_7.jpg",
"width": 612,
"height": 612
}
},
"caption": {
"created_time": "1342332465",
"text": "Mellivora capensis - eagle rare, peat, honey, lemon, pineapple, black cardamom, chili, coconut foam",
"from": {
"username": "kevin",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_3_75sq_1325536697.jpg",
"id": "3",
"full_name": "Kevin Systrom"
},
"id": "235824269324456712"
},
"user_has_liked": false,
"id": "235823728972271670_3",
"user": {
"username": "kevin",
"website": "",
"bio": "CEO & Co-founder of Instagram",
"profile_picture": "http:\/\/images.instagram.com\/profiles\/profile_3_75sq_1325536697.jpg",
"full_name": "Kevin Systrom",
"id": "3"
}
}, .....
So I'm looking to iterate over each post individually and extract the tags, id, and the image URLs. I'm having some trouble, since I'm a PHP developer and finding it really hard to work with Python as a beginner.
Here's the code that I'm using to iterate over each post and process the attributes provided. I don't want to store them in a list or dict; I just want to search through the tags.
(This is just attempted code, since I couldn't figure out which loop I should use.)
info = simplejson.load(info)
print type(info['data'])  # I get it as a list
for k, v in info['data']:
    print v
I could have done this easily in PHP with a foreach:
foreach ($info->data as $i) {
    $tags = $i->tags();
    $id = $i->id();
}
If info['data'] is a list, you should be able to iterate over it like so:
for post in info['data']:
    tags = post['tags']
    id = post['id']
    image_urls = []  # An empty list -- we'll fill it below
    for img_type in ['low_resolution', 'thumbnail', 'standard_resolution']:
        image_urls.append(post['images'][img_type]['url'])
    # Now image_urls has all the image urls in it
I think the part that's rather different from PHP is that where the key is "tags" in the JSON structure, you index with the string 'tags' in Python (post['tags']), whereas in PHP you would access it as a property like $i->tags.
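Since the stated goal is just to search through the tags, here is a small follow-up sketch. The `posts_with_tag` helper and the `wanted` parameter are hypothetical names, not part of the Instagram API:

```python
# Collect the ids of posts whose tag list contains a wanted tag.
# post.get("tags", []) tolerates posts that have no "tags" key at all.
def posts_with_tag(info, wanted):
    return [post["id"] for post in info["data"]
            if wanted in post.get("tags", [])]

# Minimal stand-in for the parsed API response.
info = {"data": [
    {"id": "238707833418289797_3", "tags": []},
    {"id": "235823728972271670_3", "tags": ["birthday"]},
]}
```

Calling posts_with_tag(info, "birthday") returns only the ids of the posts tagged "birthday", so no intermediate list or dict of full posts needs to be kept around.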