Python: writing a JSON file as a list of dictionaries

I am writing a JSON file from information extracted from a URL. How do I print each element of the dictionary on a separate line?
This is my current code:
dct = [{"name": name,
        "cuisine": cuisine,
        "price-range": price,
        "address": address,
        "rating": rating,
        "reviews": score,
        "district": district,
        "url": link
        }]
with open('openrice_data.json', 'a') as file:
    file.write(json.dumps(dct))
For example, it currently prints like this:
[{"cuisine": ["Japanese", "Hot Pot", "Buffet"], "rating": [3.5], "address": [22.3825, 114.1901], "url": ["https://www.openrice.com/en/hongkong/r-wagyu-more-sha-tin-japanese-hot-pot-r172321"], "reviews": [35, 17, 8], "name": "Wagyu More", "price-range": ["$101-200"], "district": ["Sha Tin"]}]
I would like it to print like this:
[
{"name": "Chan Kun Kee",
"cuisine": ["Guang Dong", "Dai Pai Dong"],
"price-range": "$51-100",
"address": [22.3884, 114.1958],
"rating": 3.5,
"reviews": [216, 95, 38],
"district": "Shatin",
"url": "www.openrice.com/en/hongkong/r-chan-kun-kee-sha-tin-guangdong-r7918"
}
]

Update: What you actually have is a list of dictionaries. When you want to add more elements, you need to remove the [] around the dictionary and append the dictionary to a list instead.
To solve your specific problem, use indent=0. Also consider using json.dump directly.
import json

l = []
dct = {"name": 'name',
       "cuisine": 'cuisine',
       "price-range": 'price',
       "address": 'address',
       "rating": 'rating',
       "reviews": 'score',
       "district": 'district',
       "url": 'link'
       }
l.append(dct)
with open('openrice_data.json', 'w') as file:
    json.dump(l, file, indent=0)
Output:
[
{
"name": "name",
"cuisine": "cuisine",
"price-range": "price",
"address": "address",
"rating": "rating",
"reviews": "score",
"district": "district",
"url": "link"
}
]
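Note that indent=0 also puts every list element on its own line, so a value such as "cuisine": ["Japanese", "Hot Pot"] would be spread over several lines. If the goal is exactly one line per key, as in the desired output of the question, one possible approach is to serialize each value separately with json.dumps and join the pieces by hand. This is only a sketch: the helper name dumps_one_key_per_line is made up for illustration, and it assumes a flat list of dictionaries with string keys.

```python
import json


def dumps_one_key_per_line(records):
    """Serialize a list of dicts with one "key": value pair per line.

    Hypothetical helper, not part of the json module. Assumes string keys.
    Nested values (lists, dicts) stay on the same line as their key.
    """
    blocks = []
    for record in records:
        # json.dumps on each key and value keeps quoting and escaping valid.
        pairs = [f"{json.dumps(key)}: {json.dumps(value)}"
                 for key, value in record.items()]
        blocks.append("{" + ",\n".join(pairs) + "\n}")
    return "[\n" + ",\n".join(blocks) + "\n]"


if __name__ == "__main__":
    sample = [{"name": "Chan Kun Kee",
               "cuisine": ["Guang Dong", "Dai Pai Dong"],
               "rating": 3.5}]
    print(dumps_one_key_per_line(sample))
```

The returned string can be written to the file with file.write, and json.loads reads it back into the same list.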
Continuing
To add more elements you need to do this:
# Load json to list
with open('openrice_data.json') as f:
    l = json.load(f)
# A new dict
dct2 = {"name": 'name',
        "cuisine": 'cuisine',
        "price-range": 'price',
        "address": 'address',
        "rating": 'rating',
        "reviews": 'score',
        "district": 'district',
        "url": 'link'
        }
# Append new dict
l.append(dct2)
with open('openrice_data.json', 'w') as file:
    json.dump(l, file, indent=0)
Output now contains a list with 2 dicts.
[
{
"name": "name",
"cuisine": "cuisine",
"price-range": "price",
"address": "address",
"rating": "rating",
"reviews": "score",
"district": "district",
"url": "link"
},
{
"name": "name",
"cuisine": "cuisine",
"price-range": "price",
"address": "address",
"rating": "rating",
"reviews": "score",
"district": "district",
"url": "link"
}
]
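The load, append, and rewrite steps can be wrapped in one function. This matters because opening the file in 'a' mode and writing json.dumps of a list each time leaves several arrays back to back ([...][...]), which is not a single valid JSON document. The sketch below is one way to do it; the name append_record is an assumption for illustration, and it treats a missing or empty file as an empty list.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one dict to the JSON list stored at path (hypothetical helper)."""
    records = []
    # Start from an empty list when the file is missing or empty.
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path) as f:
            records = json.load(f)
    records.append(record)
    # Rewrite the whole file so it stays a single valid JSON array.
    with open(path, "w") as f:
        json.dump(records, f, indent=0)
    return records


if __name__ == "__main__":
    # Demo in a temporary directory so no real data file is touched.
    demo_path = os.path.join(tempfile.mkdtemp(), "openrice_data.json")
    append_record(demo_path, {"name": "Wagyu More"})
    print(append_record(demo_path, {"name": "Chan Kun Kee"}))
```

Rewriting the whole file on every call is simple but slow for large files; collecting all records in memory and dumping once at the end avoids that.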

Don't use json, pprint is perfect for this job.
from pprint import pprint
obj = [{"cuisine": ["Japanese", "Hot Pot", "Buffet"], "rating": [3.5], "address": [22.3825, 114.1901], "url": ["https://www.openrice.com/en/hongkong/r-wagyu-more-sha-tin-japanese-hot-pot-r172321"], "reviews": [35, 17, 8], "name": "Wagyu More", "price-range": ["$101-200"], "district": ["Sha Tin"]}]
with open('dumpfile.json', 'w+') as f:
    pprint(obj, f)
There are a few parameters for customization; please check the documentation for more details:
https://docs.python.org/3/library/pprint.html

Use pprint.PrettyPrinter:
import pprint
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(dct)
Also, you are currently putting the dict in a list. In Python, [] is a list and {} is a dict.
By writing [{}] you are putting the dict into a list. Just remove the [].

Other people have remarked on using pprint, but I would like to add that pprint prints the representation of the Python values in your dictionary. They are not always the same as their JSON counterparts, for example:
>>> from pprint import pprint
>>> d1 = {"value": None}
>>> pprint(d1)
{'value': None}
(the correct JSON serialization here is {"value": null})
The better option, for these kinds of values, is to use json.dump or json.dumps. You can use the indent parameter to get roughly one line per element. Note, though, that this also puts each list element on its own line (so you don't get exactly one line per JSON key):
>>> d2 = [
... {"name": "Chan Kun Kee",
... "cuisine": ["Guang Dong", "Dai Pai Dong"],
... "price-range": "$51-100",
... "address": [22.3884, 114.1958],
... "rating": 3.5,
... "reviews": [216, 95, 38],
... "district": "Shatin",
... "url": "www.openrice.com/en/hongkong/r-chan-kun-kee-sha-tin-guangdong-r7918"
... }
... ]
>>> import json
>>> print(json.dumps(d2, indent=2))
[
  {
    "name": "Chan Kun Kee",
    "cuisine": [
      "Guang Dong",
      "Dai Pai Dong"
    ],
    "price-range": "$51-100",
    "address": [
      22.3884,
      114.1958
    ],
    "rating": 3.5,
    "reviews": [
      216,
      95,
      38
    ],
    "district": "Shatin",
    "url": "www.openrice.com/en/hongkong/r-chan-kun-kee-sha-tin-guangdong-r7918"
  }
]
But you're guaranteed to at least always get the correct JSON. Plus, you can also extend the behavior with your own JSON encoder. This allows you, for example, to serialize Python datetime objects into JSON strings.
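As a minimal sketch of such an encoder: subclass json.JSONEncoder, override default, and pass the class through the cls parameter. The class name DateTimeEncoder and the "scraped_at" field are assumptions made up for this example; they do not come from the question.

```python
import json
from datetime import datetime


class DateTimeEncoder(json.JSONEncoder):
    """Encode datetime objects as ISO 8601 strings (illustrative name)."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(o)


# "scraped_at" is a hypothetical field used only to show the encoder at work.
record = {"name": "Chan Kun Kee", "scraped_at": datetime(2020, 6, 30, 12, 0, 0)}
print(json.dumps(record, cls=DateTimeEncoder, indent=2))
```

json.dump accepts the same cls argument, so the encoder also works when writing straight to a file.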

Related

Loop over JSON elements in Python

I have a long JSON string, and I want to find names that appear together in other elements, for example which other elements contain both "Bob" and "John", or both "Jacob" and "Max". My idea was to take the names of the first element, look for them in the other elements, and keep doing that until the very last element, then sort the matching name arrays into different lists according to their length. However, I don't know how to accomplish this in Python. Please help me.
The JSON looks like the following:
[
{
"group": 1,
"names": [
"Bob",
"John",
"Alex",
"Jacob",
"Theo",
"Thomas",
"Max"
],
"status": "none"
},
{
"group": 2,
"names": [
"Martin",
"Bryan",
"Alex",
"Adam",
"Arlo",
"Luca",
"Ellis"
],
"status": "In Progress"
},
{
"group": 3,
"names": [
"Alex",
"John",
"Emma",
"Toby",
"Ryan",
"Leon",
"Blake"
],
"status": "Completed"
},
{
"group": 4,
"names": [
"John",
"Martin",
"Liam",
"Felix",
"Finn",
"Ollie",
"Elliot"
],
"status": "In Progress"
},
{
"group": 5,
"names": [
"Luke",
"Emma",
"Alex",
"Arlo",
"Finn",
"Bob",
"Theo"
],
"status": "In Progress"
}
]
This example uses the json module to load the JSON data from a file and then filters the items down to those which contain both "Bob" and "John":
import json

with open("data.json", "r") as f_in:
    data = json.load(f_in)

# find elements which contain "Bob" and "John"
out = []
for d in data:
    if "Bob" in d["names"] and "John" in d["names"]:
        out.append(d)
print(out)
Prints:
[{'group': 1, 'names': ['Bob', 'John', 'Alex', 'Jacob', 'Theo', 'Thomas', 'Max'], 'status': 'none'}]
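The filter above checks one fixed pair of names. To find every pair of names that occurs together in more than one element, which is closer to what the question describes, one possible sketch uses itertools.combinations to enumerate the pairs in each element and a dictionary to record where each pair was seen. The function name shared_pairs is made up for illustration, and the sample data is a shortened version of the question's structure.

```python
from collections import defaultdict
from itertools import combinations


def shared_pairs(data):
    """Map each pair of names to the groups in which both names appear."""
    groups_by_pair = defaultdict(list)
    for item in data:
        # sorted() makes ("Bob", "John") and ("John", "Bob") the same key.
        for pair in combinations(sorted(set(item["names"])), 2):
            groups_by_pair[pair].append(item["group"])
    # Keep only pairs that occur together in more than one group.
    return {pair: groups for pair, groups in groups_by_pair.items()
            if len(groups) > 1}


if __name__ == "__main__":
    sample = [
        {"group": 1, "names": ["Bob", "John", "Alex"]},
        {"group": 2, "names": ["Martin", "Alex", "John"]},
        {"group": 3, "names": ["Alex", "John", "Emma"]},
    ]
    print(shared_pairs(sample))
```

The same idea extends to triples or larger sets by changing the second argument of combinations, at the cost of many more candidate sets per element.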

How to convert a complicated (nested) json to a pandas dataframe?

I have a very weird JSON file with a lot of nesting in it. I need to convert it into a pandas DataFrame.
The JSON looks something like this:
{
"data": {
"page1": {
"last_name": "suraj",
"first_name": "singh",
"dob": "2020-06-02",
"gender": "Male",
"address1": "asdf",
"city": "asdf",
"state": "ID",
"Zip": "34324",
"phone": "2343243242",
"emailaddress": "suraj.singh#fugetroncorp.com",
"ethnicity": "adsf",
"url": "iVBORw0KGgoAAAANSUhEUgAA... (base64-encoded PNG data truncated)",
"meds": [
[
"asdf"
]
],
"guardian": false,
"guardianName": "N/A",
"optout": false,
"currentDate": "06-30-2020",
"values": [
{
"value": "asdf"
}
]
}
How can I create a properly structured DataFrame from this, so that I can export it to a CSV file and understand the data better?
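One possible approach, sketched here with only the standard library, is to flatten each nested record into a single-level dictionary with dotted keys and then write the rows as CSV. The helper name flatten and the dotted naming scheme are assumptions for illustration; pandas.json_normalize offers comparable flattening for nested dictionaries if pandas is available. Note that empty lists and empty dictionaries produce no column in this sketch.

```python
import csv
import io


def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into one dict with dotted keys."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        # List positions become part of the key, e.g. "meds.0.0".
        items = enumerate(obj)
    else:
        return {prefix: obj}
    flat = {}
    for key, value in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, name))
    return flat


if __name__ == "__main__":
    # Shortened stand-in for the structure shown in the question.
    doc = {"data": {"page1": {"last_name": "suraj", "meds": [["asdf"]],
                              "values": [{"value": "asdf"}]}}}
    rows = [flatten(page) for page in doc["data"].values()]
    columns = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    print(buffer.getvalue())
```

The list of flattened rows can also be passed to pandas.DataFrame(rows) to get a data frame with one column per dotted key.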

Retrieve data from json file using python

I'm new to Python. I'm running Python on Azure Databricks. I have a .json file, and I'm putting its important fields here:
{
"school": [
{
"schoolid": "mr1",
"board": "cbse",
"principal": "akseal",
"schoolName": "dps",
"schoolCategory": "UNKNOWN",
"schoolType": "UNKNOWN",
"city": "mumbai",
"sixhour": true,
"weighting": 3,
"paymentMethods": [
"cash",
"cheque"
],
"contactDetails": [
{
"name": "picsa",
"type": "studentactivities",
"information": [
{
"type": "PHONE",
"detail": "+917597980"
}
]
}
],
"addressLocations": [
{
"locationType": "School",
"address": {
"countryCode": "IN",
"city": "Mumbai",
"zipCode": "400061",
"street": "Madh",
"buildingNumber": "80"
},
"Location": {
"latitude": 49.313885,
"longitude": 72.877426
},
I need to create a data frame with schoolName as one column and latitude and longitude as the other two columns. Can you please suggest how to do that?
You can use the function json.load(). Here's an example:
import json
with open('path_to_file/file.json') as f:
    data = json.load(f)
print(data)
Use this:
import json  # built-in
with open("filename.json", 'r') as jsonFile:
    Data = json.load(jsonFile)  # load is a function of the json module, not of the file object
Data is now a dictionary of the file's contents. For example:
for i in Data:
    # loops through the keys
    print(Data[i])  # prints the value
For more on JSON:
https://docs.python.org/3/library/json.html
and python dictionaries:
https://www.programiz.com/python-programming/dictionary#:~:text=Python%20dictionary%20is%20an%20unordered,when%20the%20key%20is%20known.
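Loading the file is only the first step; the three requested columns still have to be pulled out of the nested structure. The following is a sketch under the assumption that the full file matches the excerpt in the question: a top-level "school" list, an "addressLocations" list per school, and a capitalised "Location" key holding the coordinates. The function name school_rows is made up for illustration.

```python
def school_rows(data):
    """Build one row per school location: schoolName, latitude, longitude."""
    rows = []
    for school in data.get("school", []):
        for loc in school.get("addressLocations", []):
            # The sample uses a capitalised "Location" key for coordinates.
            coords = loc.get("Location", {})
            rows.append({
                "schoolName": school.get("schoolName"),
                "latitude": coords.get("latitude"),
                "longitude": coords.get("longitude"),
            })
    return rows


# Shortened stand-in for the dictionary returned by json.load.
sample = {
    "school": [
        {
            "schoolName": "dps",
            "addressLocations": [
                {"locationType": "School",
                 "Location": {"latitude": 49.313885, "longitude": 72.877426}}
            ],
        }
    ]
}
print(school_rows(sample))
```

With pandas installed, pandas.DataFrame(school_rows(data)) turns these rows into a data frame with the three columns.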

Issues creating list from list of dictionaries in python

I have a file with json data in it like this:
data = [
{
"id": 12345678,
"list_id": 12345,
"creator_id": 1234567,
"entity_id": 1234567,
"created_at": "2020-01-30T00:43:55.256-08:00",
"entity": {
"id": 123456,
"type": 0,
"first_name": "John",
"last_name": "Doe",
"primary_email": "john#fakemail.com",
"emails": [
"john#fakemail.com"
]
}
},
{
"id": 12345678,
"list_id": 12345,
"creator_id": 1234567,
"entity_id": 1234567,
"created_at": "2020-01-30T00:41:54.375-08:00",
"entity": {
"id": 123456,
"type": 0,
"first_name": "Jane",
"last_name": "Doe",
"primary_email": "jane#fakemail.com",
"emails": [
"jane#fakemail.com"
]
}
}
]
I managed to extract the "first_name" values as well as the "primary_email" with the following code
for record in data:
    first_names = record.get('entity',{}).get('first_name', None)
    email = record.get('entity',{}).get('primary_email', None)
    print(first_names)
    print(email)
which produces following output:
John
john#fakemail.com
Jane
jane#fakemail.com
I am struggling however to create two separate lists for names and email like this:
(John,Jane)
(john#fakemail.com,jane#fakemail.com)
Any help with this is much appreciated.
import json
first_names = [record.get('entity',{}).get('first_name', None) for record in data]
email = [record.get('entity',{}).get('primary_email', None) for record in data]
print(first_names)
print(email)
or in the same loop:
first_names = []
email = []
for record in data:
    first_names.append(record.get('entity',{}).get('first_name', None))
    email.append(record.get('entity',{}).get('primary_email', None))
print(first_names)
print(email)
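Since the question shows the desired result as tuples, another option is to collect (name, email) pairs in one pass and transpose them with zip. This is only a sketch: the function name split_names_and_emails is an assumption, and the sample data keeps just the fields that are used.

```python
def split_names_and_emails(data):
    """Return (first_names, emails) as two tuples built in a single pass."""
    pairs = [(record.get("entity", {}).get("first_name"),
              record.get("entity", {}).get("primary_email"))
             for record in data]
    # zip(*pairs) transposes the list of pairs into one tuple per field.
    # An empty input needs a fallback because zip() would yield nothing.
    first_names, emails = zip(*pairs) if pairs else ((), ())
    return first_names, emails


data = [
    {"entity": {"first_name": "John", "primary_email": "john#fakemail.com"}},
    {"entity": {"first_name": "Jane", "primary_email": "jane#fakemail.com"}},
]
first_names, emails = split_names_and_emails(data)
print(first_names)
print(emails)
```

Records without an "entity" key contribute None to both tuples, which keeps the two tuples aligned by position.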

Iterating through JSON file in Python

I am trying to read a JSON file and iterate through it, for instance I am trying to print the children, or the firstname, or the hobbies etc...
The JSON file looks like this:
{
"firstName": "Jane",
"lastName": "Doe",
"hobbies": ["running", "sky diving", "singing"],
"age": 35,
"children": [
{
"firstName": "Alice",
"age": 6
},
{
"firstName": "Bob",
"age": 8
}
]
},
{
"firstName": "Mike",
"lastName": "Smith",
"hobbies": ["bowling", "photography", "biking"],
"age":40,
"children": [
{
"firstName": "Steve",
"age": 10
},
{
"firstName": "Sara",
"age": 18
}
]
}
I'm loading the json file using the following code:
import json
with open('test.json') as f:
    data = json.load(f)
and I can print parts of the first record fine like this:
print(data['children'])
print(data['hobbies'])
[{'firstName': 'Alice', 'age': 6}, {'firstName': 'Bob', 'age': 8}]
['running', 'sky diving', 'singing']
I'd like to iterate through the records though so I can print pieces of the 2nd entry as well (or 3rd, or 20th if applicable)
When I try this however:
for key, value in data.items():
    print(key, value)
It still just returns the first entry:
firstName Jane
lastName Doe
hobbies ['running', 'sky diving', 'singing']
age 35
children [{'firstName': 'Alice', 'age': 6}, {'firstName': 'Bob', 'age': 8}]
Any ideas?
The problem you are facing is that your data is stored as a single JSON object.
You need to make it an array, something like this:
[{ // notice additional [
"firstName": "Jane",
"lastName": "Doe",
"hobbies": ["running", "sky diving", "singing"],
"age": 35,
"children": [
{
"firstName": "Alice",
"age": 6
},
{
"firstName": "Bob",
"age": 8
}
]
},
{
"firstName": "Mike",
"lastName": "Smith",
"hobbies": ["bowling", "photography", "biking"],
"age":40,
"children": [
{
"firstName": "Steve",
"age": 10
},
{
"firstName": "Sara",
"age": 18
}
]
}] // notice additional ]
Then you need to loop over the list and, for each element, iterate over its items as you have already written. Something like this:
import json
with open('abc.json') as f:
    data = json.load(f)
for element in data:
    for key, value in element.items():
        print(key, value)
To convert your file into standard JSON, add [ at the start and ] at the end so that it becomes a JSON list. The whole code is pasted below:
import json

# Read the original contents: comma-separated objects without enclosing brackets.
with open("test_json.txt", "r") as f:
    contents = f.read()

# Rewrite the file with the objects wrapped in [ ] so it holds one JSON list.
with open("test_json.txt", "w") as f:
    f.write("[" + contents + "]")

# The file can now be loaded as a list and iterated record by record.
with open("test_json.txt", "r") as fd:
    d = json.load(fd)
for i in d:
    print(i)
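Rewriting the file in place has one drawback: running the script a second time adds another pair of brackets and yields a list nested inside a list. A possible alternative, sketched below, leaves the file untouched and wraps the text in memory before parsing. The function name load_records is an assumption for illustration, and the sample string is a shortened version of the file from the question.

```python
import json


def load_records(text):
    """Parse comma-separated JSON objects by wrapping them in [ ]."""
    stripped = text.strip()
    # Leave the text alone when it is already a JSON array.
    if not stripped.startswith("["):
        stripped = "[" + stripped + "]"
    return json.loads(stripped)


# Shortened stand-in for the contents read from the file.
raw = ('{"firstName": "Jane", "children": [{"firstName": "Alice", "age": 6}]}, '
       '{"firstName": "Mike", "children": [{"firstName": "Steve", "age": 10}]}')

for person in load_records(raw):
    # Each record is a dict, so nested lists such as "children" can be looped too.
    kids = [child["firstName"] for child in person["children"]]
    print(person["firstName"], kids)
```

To use it with the file, read the whole file with f.read() and pass the resulting string to load_records.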
