I'm trying to get only the names of the playlists from a JSON file I have, but I can't make it work. Here is the data (truncated), followed by my code:
{'playlists': [{'description': '',
'lastModifiedDate': '2018-11-20',
'name': 'Piano',
'numberOfFollowers': 0,
'tracks': [{'artistName': 'Kenzie Smith Piano',
'trackName': "You've Got a Friend in Me (From "
'"Toy Story")'},
{'artistName': 'Kenzie Smith Piano',
'trackName': 'A Whole New World (From "Aladdin")'},
{'artistName': 'Kenzie Smith Piano',
'trackName': 'Can You Feel the Love Tonight? (From '
'"The Lion King")'},
{'artistName': 'Kenzie Smith Piano',
'trackName': "He's a Pirate / The Black Pearl "
'(From "Pirates of the Caribbean")'},
{'artistName': 'Kenzie Smith Piano',
'trackName': "You'll be in My Heart (From "
'"Tarzan") [Soft Version]'},
import json
from pprint import pprint
json_data=open('C:/Users/alvar/Desktop/Alvaro/Nueva carpeta/Playlist.json', encoding="utf8").read()
playlist = json.loads(json_data)
pprint(playlist)
Here is the part that is not working:
for names in playlist_list:
    print(names['name'])
    print '\n'
What I want is to extract only the names of the playlists.
The error is because you are not accessing the dictionary key 'playlists' first:
for plst in playlist['playlists']:
    print(plst['name'])
# Piano
You are iterating over the wrong object.
Don't forget that json.loads(json_data) returns the object as it is stored. In your case, it's a dict with only one key: 'playlists'. You have to access this element with loaded_json['playlists'] and then iterate over the list of playlists.
Here, loaded_json is of type Dict[str, List[dict]]. Be careful with JSON and nested data structures.
Try:
loaded_json = json.loads(json_data)  # type: Dict[str, List[dict]]
for playlist in loaded_json['playlists']:  # type: dict
    print('{}\n'.format(playlist['name']))
By doing this, you will get all the playlist names.
Documentation: JSON encoder and decoder
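Putting it together, a minimal end-to-end sketch (reusing the file path from the question) that collects every playlist name into a list:
import json

with open('C:/Users/alvar/Desktop/Alvaro/Nueva carpeta/Playlist.json', encoding='utf8') as f:
    loaded_json = json.load(f)  # parse straight from the file object

playlist_names = [plst['name'] for plst in loaded_json['playlists']]
print(playlist_names)  # ['Piano', ...]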
I have a dataframe
import pandas as pd
data = {
"ID": [123123, 222222, 333333],
"Main Authors": ["[Jim Allen, Tim H]", "[Rob Garder, Harry S, Tim H]", "[Wo Shu, Tee Ru, Fuu Wan, Gee Han]"],
"Abstract": ["This is paper about hehe", "This paper is very nice", "Hello there paper from kellogs"],
"paper IDs": ["[123768, 123123]", "[123432, 34345, 353545, 454545]", "[123123, 3433434, 55656655, 988899]"],
}
and I am trying to export it to a JSON schema. I do so via
df.to_json(orient='records')
'[{"ID":123123,"Main Authors":"[Jim Allen, Tim H]","Abstract":"This is paper about hehe","paper IDs":"[123768, 123123]"},
{"ID":222222,"Main Authors":"[Rob Garder, Harry S, Tim H]","Abstract":"This paper is very nice","paper IDs":"[123432, 34345, 353545, 454545]"},
{"ID":333333,"Main Authors":"[Wo Shu, Tee Ru, Fuu Wan, Gee Han]","Abstract":"Hello there paper from kellogs","paper IDs":"[123123, 3433434, 55656655, 988899]"}]'
but this is not in the right format for JSON. How can I get my output to look like this
{"ID": "123123", "Main Authors": ["Jim Allen", "Tim H"], "Abstract": "This is paper about hehe", "paper IDs": ["123768", "123123"]}
{and so on for paper 2...}
I can't find an easy way to achieve this schema with the basic functions.
to_json returns a proper JSON document. What you want is not a JSON document.
Add lines=True to the call:
df.to_json(orient='records', lines=True)
The output you desire is not valid JSON. It's a very common way to stream JSON objects though: write one unindented JSON object per line.
Streaming JSON is an old technique, used to write JSON records to logs, send them over the network, etc. There's no specification for it, but a lot of people have tried to hijack it, even creating sites that mirrored Douglas Crockford's original JSON site or mimicking the language of RFCs.
Streaming JSON formats are used a lot in IoT and event processing applications, where events will arrive over a long period of time.
PS: I remembered seeing a question about json-seq a few months ago. There was an attempt to standardize streaming JSON as JSON Text Sequences in RFC 7464, using the MIME type application/json-seq.
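For reference, a quick sketch of the JSON Lines round trip, assuming df is the DataFrame from the question:
import io
import pandas as pd

# one unindented JSON object per line (JSON Lines / newline-delimited JSON)
jsonl = df.to_json(orient='records', lines=True)
print(jsonl)

# pandas can read the same format back
df2 = pd.read_json(io.StringIO(jsonl), orient='records', lines=True)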
You can convert the DataFrame to a list of dictionaries first (note that the list columns here hold actual Python lists, unlike the bracketed strings in the question).
import pandas as pd
data = {
"ID": [123123, 222222, 333333],
"Main Authors": [["Jim Allen", "Tim H"], ["Rob Garder", "Harry S", "Tim H"], ["Wo Shu", "Tee Ru", "Fuu Wan", "Gee Han"]],
"Abstract": ["This is paper about hehe", "This paper is very nice", "Hello there paper from kellogs"],
"paper IDs": [[123768, 123123], [123432, 34345, 353545, 454545], [123123, 3433434, 55656655, 988899]],
}
df = pd.DataFrame(data)
df.to_dict('records')
The result:
[{'ID': 123123,
'Main Authors': ['Jim Allen', 'Tim H'],
'Abstract': 'This is paper about hehe',
'paper IDs': [123768, 123123]},
{'ID': 222222,
'Main Authors': ['Rob Garder', 'Harry S', 'Tim H'],
'Abstract': 'This paper is very nice',
'paper IDs': [123432, 34345, 353545, 454545]},
{'ID': 333333,
'Main Authors': ['Wo Shu', 'Tee Ru', 'Fuu Wan', 'Gee Han'],
'Abstract': 'Hello there paper from kellogs',
'paper IDs': [123123, 3433434, 55656655, 988899]}]
Is that what you are looking for?
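If you specifically want the one-object-per-line text shown in the question, a short follow-up sketch (json is the standard library module, reusing the records from above):
import json

records = df.to_dict('records')
for record in records:
    print(json.dumps(record))
# {"ID": 123123, "Main Authors": ["Jim Allen", "Tim H"], "Abstract": "This is paper about hehe", "paper IDs": [123768, 123123]}
# ...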
So I was trying to find something to code, and I decided to use Python to get Fortnite stats. I came across the fortnite_python library and it works, but it displays item codes for items in the shop when I want it to display the names. Does anyone know how to convert them, or just display the name in the first place? This is my code:
fortnite = Fortnite('c954ed23-756d-4843-8f99-cfe850d2ed0c')
store = fortnite.store()
fortnite.store()
It outputs something like this
[<StoreItem 12511>,
To print out the attributes of a Python object you can use __dict__, e.g.:
from fortnite_python import Fortnite
from json import dumps
fortnite = Fortnite('Your API Key')
# ninjas_account_id = fortnite.player('ninja')
# print(f'ninjas_account: {ninjas_account_id}') # ninjas_account: 4735ce91-3292-4caf-8a5b-17789b40f79c
store = fortnite.store()
example_store_item = store[0]
print(dumps(example_store_item.__dict__, indent=2))
Output:
{
"_data": {
"imageUrl": "https://trackercdn.com/legacycdn/fortnite/237112511_large.png",
"manifestId": 12511,
"name": "Dragacorn",
"rarity": "marvel",
"storeCategory": "BRSpecialFeatured",
"vBucks": 0
},
"id": 12511,
"image_url": "https://trackercdn.com/legacycdn/fortnite/237112511_large.png",
"name": "Dragacorn",
"rarity": "marvel",
"store_category": "BRSpecialFeatured",
"v_bucks": 0
}
So it looks like you want to use the name attribute of StoreItem:
for store_item in store:
    print(store_item.name)
Output:
Dragacorn
Hulk Smashers
Domino
Unstoppable Force
Scootin'
Captain America
Cable
Probability Dagger
Chimichanga!
Daywalker's Kata
Psi-blade
Snap
Psylocke
Psi-Rider
The Devil's Wings
Daredevil
Meaty Mallets
Silver Surfer
Dayflier
Silver Surfer's Surfboard
Ravenpool
Silver Surfer Pickaxe
Grand Salute
Cuddlepool
Blade
Daredevil's Billy Clubs
Mecha Team
Tricera Ops
Combo Cleaver
Mecha Team Leader
Dino
Triassic
Rex
Cap Kick
Skully
Gold Digger
Windmill Floss
Bold Stance
Jungle Scout
It seems that the library doesn't contain a function to get the names. Also, this is what the class of an item from the store looks like:
class StoreItem(Domain):
    """Object containing store items attributes"""
and that's it.
I have this JSON and I'm trying to extract the first element/list in this list of lists in Python. How do I extract the first item in ad_list?
Are there any built-in functions I can use to extract the first item from a JSON file? I need something other than simply iterating through this like an array.
P.S. I shortened the JSON data.
Here is the list.
{
u'data':{
u'ad_list':[
{
u'data':{
u'require_feedback_score':0,
u'hidden_by_opening_hours':False,
u'trusted_required':False,
u'currency':u'EGP',
u'require_identification':False,
u'is_local_office':False,
u'first_time_limit_btc':None,
u'city':u'',
u'location_string':u'Egypt',
u'countrycode':u'EG',
u'max_amount':u'20000',
u'lon':0.0,
u'sms_verification_required':False,
u'require_trade_volume':0.0,
u'online_provider':u'SPECIFIC_BANK',
u'max_amount_available':u'20000',
u'msg': u" \u2605\u2605\u2605\u2605\u2605 \u0645\u0631\u062d\u0628\u0627 \u2605\u2605\u2605\u2605\u2605\r\n\r\n\u0625\u0630\u0627 \u0643\u0646\u062a \u062a\u0631\u063a\u0628 \u0641\u064a \u0628\u064a\u0639 \u0627\u0648 \u0634\u0631\u0627\u0621 \u0627\u0644\u0628\u062a\u0643\u0648\u064a\u0646 \u062a\u0648\u0627\u0635\u0644 \u0645\u0639\u064a \u0648\u0633\u0623\u0642\u0648\u0645 \u0628\u062e\u062f\u0645\u062a\u0643\r\n\u0644\u0644\u062a\u0648\u0627\u0635\u0644: https: //tawk.to/hanyibrahim\r\n \u0627\u0644\u062e\u064a\u0627\u0631 \u0644\u0644\u062a\u062d\u0648\u064a\u0644: \u0627\u0644\u0628\u0646\u0643 \u0627\u0644\u062a\u062c\u0627\u0631\u064a \u0627\u0644\u062f\u0648\u0644\u064a \u0627\u0648\u0641\u0648\u062f\u0627\u0641\u0648\u0646 \u0643\u0627\u0634 \u0627\u0648 \u0627\u062a\u0635\u0627\u0644\u0627\u062a \u0641\u0644\u0648\u0633 \u0627\u0648 \u0627\u0648\u0631\u0627\u0646\u062c \u0645\u0648\u0646\u064a\r\n\r\n'' \u0634\u0643\u0631\u0627 ''\r\n\r\n\r\n \u2605\u2605\u2605\u2605\u2605 Hello \u2605\u2605\u2605\u2605\u2605\r\n\r\nIf you would like to trade Bitcoins please let me know and I will help you\r\nconnect: https: //tawk.to/hanyibrahim\r\nOption transfer:Bank CIB Or Vodafone Cash Or Etisalat Flous Or Orange Money\r\n'' Thank ''",
u'volume_coefficient_btc':u'1.50',
u'profile':{
u'username':u'hanyibrahim11',
u'feedback_score':100,
u'trade_count':u'3000+',
u'name':u'hanyibrahim11 (3000+; 100%)',
u'last_online': u'2019-01-14T17:54:52+00:00 '},
u'bank_name': u'CIB_Vodafone Cash_Etisalat Flous_Orange Money',
u'trade_type':u'ONLINE_BUY',
u'ad_id':803036,
u'temp_price':u'67079.44',
u'payment_window_minutes':90,
u'min_amount':u'50',
u'limit_to_fiat_amounts':u'',
u'require_trusted_by_advertiser':False,
u'temp_price_usd':u'3738.54',
u'lat':0.0,
u'visible':True,
u'created_at': u'2018-07-25T08:12:21+00:00 ',
u'atm_model': None,
u'is_low_risk':True
},
u'actions':{
u'public_view': u'https://localbitcoins.com/ad/803036'
}
},
{
u'data':{
u'require_feedback_score':0,
u'hidden_by_opening_hours':False,
u'trusted_required':False,
u'currency':u'EGP',
u'require_identification':False,
u'is_local_office':False,
u'first_time_limit_btc':None,
u'city':u'',
u'location_string':u'Egypt',
u'countrycode':u'EG',
u'max_amount':u'20000',
u'lon':0.0,
u'sms_verification_required':False,
u'require_trade_volume':0.0,
u'online_provider':u'CASH_DEPOSIT',
u'max_amount_available':u'20000',
u'msg': u'QNB, CIB deposite- Vodafone Cash - Etisalat Felous - Orange Money - Western Union - Money Gram \r\n- Please do not entiate a new trade request if you are not serious to finalize it.',
u'volume_coefficient_btc':u'1.50',
u'profile':{
u'username':u'Haboush',
u'feedback_score':99,
u'trade_count':u'500+',
u'name':u'Haboush (500+; 99%)',
u'last_online': u'2019-01-14T16:48:52+00:00 '},
u'bank_name': u'QNB\u2714CIB\u2714Vodafone\u2714Orange\u2714Etisalat\u2714WU',
u'trade_type':u'ONLINE_BUY',
u'ad_id':719807,
u'temp_price':u'66860.18',
u'payment_window_minutes':270,
u'min_amount':u'100',
u'limit_to_fiat_amounts':u'',
u'require_trusted_by_advertiser':False,
u'temp_price_usd':u'3726.32',
u'lat':0.0,
u'visible':True,
u'created_at': u'2018-03-24T19:29:08+00:00 ',
u'atm_model': None,
u'is_low_risk':True
},
u'actions':{
u'public_view': u'https://localbitcoins.com/ad/719807'
}
}
],
u'ad_count':17
}
}
Assuming your data structure is stored in the variable j, you can use j['data']['ad_list'][0] to extract the first item from the ad_list key. Use a try-except block to catch a possible IndexError exception if ad_list can ever be empty.
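A minimal sketch of that lookup, assuming the structure above has already been parsed into a variable j (e.g. with json.loads):
try:
    first_ad = j['data']['ad_list'][0]
except IndexError:
    first_ad = None  # ad_list was empty
else:
    print(first_ad['data']['ad_id'])           # 803036
    print(first_ad['actions']['public_view'])  # https://localbitcoins.com/ad/803036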
I have a text file with the details of a set of restaurants given one after the other. The details are the name, rating, price, and type of cuisine of each restaurant. The contents of the text file are given below.
George Porgie
87%
$$$
Canadian, Pub Food
Queen St. Cafe
82%
$
Malaysian, Thai
Dumpling R Us
71%
$
Chinese
Mexican Grill
85%
$$
Mexican
Deep Fried Everything
52%
$
Pub Food
I want to create a set of dictionaries as given below:
Restaurant name to rating:
# dict of {str : int}
name_to_rating = {'George Porgie' : 87,
'Queen St. Cafe' : 82,
'Dumpling R Us' : 71,
'Mexican Grill' : 85,
'Deep Fried Everything' : 52}
Price to list of restaurant names:
# dict of {str : list of str }
price_to_names = {'$' : ['Queen St. Cafe', 'Dumpling R Us', 'Deep Fried Everything'],
'$$' : ['Mexican Grill'],
'$$$' : ['George Porgie'],
'$$$$' : [ ]}
Cuisine to list of restaurant name:
# dict of {str : list of str}
cuisine_to_names = {'Canadian' : ['George Porgie'],
'Pub Food' : ['George Porgie', 'Deep Fried Everything'],
'Malaysian' : ['Queen St. Cafe'],
'Thai' : ['Queen St. Cafe'],
'Chinese' : ['Dumpling R Us'],
'Mexican' : ['Mexican Grill']}
What is the best way in Python to populate the above dictionaries?
Initialise some containers:
import collections

name_to_rating = {}
price_to_names = collections.defaultdict(list)
cuisine_to_names = collections.defaultdict(list)
Read your file into a temporary string:
with open('/path/to/your/file.txt') as f:
    spam = f.read().strip()
Assuming the structure is consistent (i.e. chunks of 4 lines separated by double newlines), iterate through the chunks and populate your containers:
restaurants = [chunk.split('\n') for chunk in spam.split('\n\n')]
for name, rating, price, cuisines in restaurants:
    name_to_rating[name] = int(rating.rstrip('%'))  # '87%' -> 87
    # etc. -- see the fuller sketch below
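To flesh out the "# etc.", a fuller sketch under the same assumption (four lines per restaurant, blank line between entries). Note that a defaultdict won't pre-create the empty '$$$$' entry shown in the example output:
import collections

name_to_rating = {}
price_to_names = collections.defaultdict(list)
cuisine_to_names = collections.defaultdict(list)

with open('/path/to/your/file.txt') as f:
    spam = f.read().strip()

for chunk in spam.split('\n\n'):
    name, rating, price, cuisines = chunk.split('\n')
    name_to_rating[name] = int(rating.rstrip('%'))
    price_to_names[price].append(name)
    for cuisine in cuisines.split(', '):
        cuisine_to_names[cuisine].append(name)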
For the main reading loop, you can use enumerate and modulo to know what kind of data is on each line:
for lineNb, line in enumerate(data.splitlines()):
    print(lineNb, lineNb % 4, line)
For the price_to_names and cuisine_to_names dictionaries, you could use a defaultdict:
from collections import defaultdict
price_to_names = defaultdict(list)
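A short sketch of how that modulo dispatch could fill all three dictionaries, assuming data holds the file contents and there are exactly four lines per restaurant with no blank lines:
from collections import defaultdict

name_to_rating = {}
price_to_names = defaultdict(list)
cuisine_to_names = defaultdict(list)

for lineNb, line in enumerate(data.splitlines()):
    field = lineNb % 4
    if field == 0:        # restaurant name
        name = line
    elif field == 1:      # rating like '87%'
        name_to_rating[name] = int(line.rstrip('%'))
    elif field == 2:      # price like '$$'
        price_to_names[line].append(name)
    else:                 # comma-separated cuisines
        for cuisine in line.split(', '):
            cuisine_to_names[cuisine].append(name)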
I'm using GeoPy to geocode addresses to lat,lng. I would also like to extract the itemized address components (street, city, state, zip) for each address.
GeoPy returns a string with the address -- but I can't find a reliable way to separate each component. For example:
123 Main Street, Los Angeles, CA 90034, USA =>
{street: '123 Main Street', city: 'Los Angeles', state: 'CA', zip: 90034, country: 'USA'}
The Google geocoding API does return these individual components... is there a way to get these from GeoPy? (or a different geocoding tool?)
You can also get the individual address components from the Nominatim geocoder (the standard open-source geocoder available in geopy).
from geopy.geocoders import Nominatim

# user_agent is an identifier you choose for your application (Nominatim's usage policy asks for one)
geolocator = Nominatim(user_agent="my-application")

# address is a String, e.g. 'Berlin, Germany'
# addressdetails=True does the magic and also gives you the details
location = geolocator.geocode(address, addressdetails=True)
print(location.raw)
gives
{'type': 'house',
'class': 'place',
'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright',
'display_name': '2, Stralauer Allee, Fhain, Friedrichshain-Kreuzberg, Berlin, 10245, Deutschland',
'place_id': '35120946',
'osm_id': '2825035484',
'lon': '13.4489063',
'osm_type': 'node',
'address': {'country_code': 'de',
'road': 'Stralauer Allee',
'postcode': '10245',
'house_number': '2',
'state': 'Berlin',
'country': 'Deutschland',
'suburb': 'Fhain',
'city_district': 'Friedrichshain-Kreuzberg'},
'lat': '52.5018003',
'importance': 0.421,
'boundingbox': ['52.5017503', '52.5018503', '13.4488563', '13.4489563']}
with
location.raw['address']
you get the dictionary with the components only.
Take a look at the geopy documentation for more parameters, or at the Nominatim documentation for all address components.
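A small sketch of pulling individual pieces out of that dictionary with .get(), since the exact keys vary by location (the keys below appear in the output above; city-level data may come back as city, town, suburb, or city_district depending on the place):
address_parts = location.raw['address']

house_number = address_parts.get('house_number', '')
street = address_parts.get('road', '')
zipcode = address_parts.get('postcode', '')
state = address_parts.get('state', '')
country = address_parts.get('country', '')

print(house_number, street, zipcode, state, country)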
Use usaddress by DataMade. Here's the GitHub repo.
It works like this: usaddress.parse('123 Main St. Suite 100 Chicago, IL') returns this list:
[('123', 'AddressNumber'),
('Main', 'StreetName'),
('St.', 'StreetNamePostType'),
('Suite', 'OccupancyType'),
('100', 'OccupancyIdentifier'),
('Chicago,', 'PlaceName'),
('IL', 'StateName')]
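If you want those labelled tokens as a single dictionary of components, a small sketch that groups the parse() output by label (pure standard library; the labels are the ones shown above):
from collections import defaultdict
import usaddress

parsed = usaddress.parse('123 Main St. Suite 100 Chicago, IL')

components = defaultdict(list)
for token, label in parsed:
    components[label].append(token.rstrip(','))  # drop the trailing comma on 'Chicago,'

components = {label: ' '.join(tokens) for label, tokens in components.items()}
# {'AddressNumber': '123', 'StreetName': 'Main', ..., 'PlaceName': 'Chicago', 'StateName': 'IL'}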
This is how I implemented such a split, as I wanted the resulting address always in the same format. You would just have to skip the concatenation and return each value... or put them in a list. Up to you.
def getaddress(self, lat, lng, language="en"):
    try:
        # Nominatim comes from geopy.geocoders; pass your own user_agent string
        geolocator = Nominatim(user_agent="my-reverse-geocoder")
        location = geolocator.reverse(str(lat) + ', ' + str(lng), language=language)
        data = location.raw['address']
        street = district = postalCode = state = country = countryCode = ""
        street = str(data.get('road', ''))  # 'road' holds the street name in Nominatim's output
        district = str(data['city_district'])
        postalCode = str(data['postcode'])
        state = str(data['state'])
        country = str(data['country'])
        countryCode = str(data['country_code']).upper()
        address = street + ' ' + district + ' ' + postalCode + ' ' + state + ' ' + country + ' ' + countryCode
    except Exception:  # missing keys or geocoder errors
        address = "Error"
    return address
I helped write one not long ago called LiveAddress; it was just upgraded to support single-line (freeform) addresses and implements geocoding features.
GeoPy is a geocoding utility, not an address parser/standardizer. LiveAddress API is, however, and it can also verify the validity of the address for you, filling in missing information. You'll find that services such as Google and Yahoo approximate the address, while a CASS-certified service like LiveAddress actually verifies it and won't return results unless the address is real.
After doing a lot of research and development with implementing LiveAddress, I wrote a summary in this Stack Overflow post. It documents some of the crazy-yet-complete formats that addresses can come in and ultimately lead to a solution for the parsing problem (for US addresses).
To parse a single-line address into components using Python, simply put the entire address into the "street" field:
import json
import pprint
import urllib  # Python 2 urllib; on Python 3 use urllib.parse.urlencode and urllib.request.urlopen
LOCATION = 'https://api.qualifiedaddress.com/street-address/'
QUERY_STRING = urllib.urlencode({  # the entire query string must be URL-encoded
    'auth-token': r'YOUR_API_KEY_HERE',
    'street': '1 infinite loop cupertino ca 95014'
})
URL = LOCATION + '?' + QUERY_STRING
response = urllib.urlopen(URL).read()
structure = json.loads(response)
pprint.pprint(structure)
The resulting JSON object will contain a components object which will look something like this:
"components": {
"primary_number": "1",
"street_name": "Infinite",
"street_suffix": "Loop",
"city_name": "Cupertino",
"state_abbreviation": "CA",
"zipcode": "95014",
"plus4_code": "2083",
"delivery_point": "01",
"delivery_point_check_digit": "7"
}
The response will also include the combined first_line and delivery_line_2 so you don't have to manually concatenate those if you need them. Latitude/longitude and other information is also available about the address.