First, what is the typical format of a JSON file that contains multiple objects?
Is it a list of objects like: [ {...}, {...} ... ]?
Second, I tried to store multiple dicts in a single JSON file using Python.
I have two dicts:
test_data = {'profile_img': 'https://fmdataba.com/images/p/4592.png',
'name': 'Son Heung-Min ',
'birth_date': '8/7/1992',
'nation': 'South Korea KOR',
'position': 'M (R), AM (RL), ST (C)',
'foot': 'Either'}
and
test_data2 = {'profile_img': 'https://fmdataba.com/images/p/1103.png',
'name': 'Marc-André ter Stegen ',
'birth_date': '30/4/1992',
'nation': 'Germany',
'position': 'GK',
'foot': 'Either'}
Then I did:
with open('data.json', 'w') as json_file:
    json.dump(test_data, json_file, indent=2)
    json.dump(test_data2, json_file, indent=2)
Of course, I would normally iterate over a list of dicts to store several of them, but for now I just wanted to test whether the format is correct. The resulting .json file looks like this:
data.json
{
"profile_img": "https://fmdataba.com/images/p/4592.png",
"name": "Son Heung-Min ",
"birth_date": "8/7/1992",
"nation": "South Korea KOR",
"position": "M (R), AM (RL), ST (C)",
"foot": "Either"
}{
"profile_img": "https://fmdataba.com/images/p/1103.png",
"name": "Marc-Andr\u00e9 ter Stegen ",
"birth_date": "30/4/1992",
"nation": "Germany",
"position": "GK",
"foot": "Either"
}
It seems pretty weird because there is no comma between the two objects.
What is the typical way of doing this?
You need to put all your data into a list first, then dump that list to the file in a single call.
test_data_list = [test_data, test_data2]
with open('data.json', 'w') as json_file:
    json.dump(test_data_list, json_file)
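A minimal end-to-end sketch of the list approach, assuming trimmed-down versions of the two dicts from the question and a throwaway file in a temporary directory (the path is hypothetical). `ensure_ascii=False` is optional; it only keeps characters such as `é` readable in the file instead of writing `\u00e9`.

```python
import json
import os
import tempfile

# Shortened copies of the dicts from the question.
test_data = {"name": "Son Heung-Min", "foot": "Either"}
test_data2 = {"name": "Marc-André ter Stegen", "foot": "Either"}

# Hypothetical output location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "data.json")

# Write ONE top-level JSON value: a list holding every dict.
with open(path, "w", encoding="utf-8") as json_file:
    json.dump([test_data, test_data2], json_file, indent=2, ensure_ascii=False)

# The file now parses back as a single document.
with open(path, encoding="utf-8") as json_file:
    loaded = json.load(json_file)

print(len(loaded), loaded[1]["name"])
```

Calling json.dump twice on the same file, as in the question, writes two documents back to back, which json.load cannot read in one go.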
I am trying to convert a CSV file into a hierarchical JSON file. The CSV input is shown below; it contains two columns, gene and disease.
gene,disease
A1BG,Adenocarcinoma
A1BG,apnea
A1BG,Athritis
A2M,Asthma
A2M,Astrocytoma
A2M,Diabetes
NAT1,polyps
NAT1,lymphoma
NAT1,neoplasms
The expected output should be in the following format:
{
"name": "A1BG",
"children": [
{"name": "Adenocarcinoma"},
{"name": "apnea"},
{"name": "Athritis"}
]
},
{
"name": "A2M",
"children": [
{"name": "Asthma"},
{"name": "Astrocytoma"},
{"name": "Diabetes"}
]
},
{
"name": "NAT1",
"children": [
{"name": "polyps"},
{"name": "lymphoma"},
{"name": "neoplasms"}
]
}
The Python code I have written is below. Please let me know what I need to change to get the desired output.
import json
finalList = []
finalDict = {}
grouped = df.groupby(['gene'])
for key, value in grouped:
    dictionary = {}
    dictList = []
    anotherDict = {}
    j = grouped.get_group(key).reset_index(drop=True)
    dictionary['name'] = j.at[0, 'gene']
    for i in j.index:
        anotherDict['disease'] = j.at[i, 'disease']
        dictList.append(anotherDict)
    dictionary['children'] = dictList
    finalList.append(dictionary)
with open('outputresult3.json', "w") as out:
    json.dump(finalList,out)
import json
json_data = []
# group the data by each unique gene; passing the column name as a string
# (not a one-element list) keeps `gene` a plain string rather than a tuple
for gene, data in df.groupby("gene"):
    # obtain a list of diseases for the current gene
    diseases = data["disease"].tolist()
    # create a new list of dictionaries to satisfy json requirements
    children = [{"name": disease} for disease in diseases]
    entry = {"name": gene, "children": children}
    json_data.append(entry)
with open('outputresult3.json', "w") as out:
    json.dump(json_data, out)
Use DataFrame.groupby with a custom lambda function that converts each group's values to a list of dictionaries via DataFrame.to_dict:
L = (df.rename(columns={'disease':'name'})
       .groupby('gene')
       .apply(lambda x: x[['name']].to_dict('records'))
       .reset_index(name='children')
       .rename(columns={'gene':'name'})
       .to_dict('records')
    )
print(L)
[{'name': 'A1BG', 'children': [{'name': 'Adenocarcinoma'},
{'name': 'apnea'},
{'name': 'Athritis'}]},
{'name': 'A2M', 'children': [{'name': 'Asthma'},
{'name': 'Astrocytoma'},
{'name': 'Diabetes'}]},
{'name': 'NAT1', 'children': [{'name': 'polyps'},
{'name': 'lymphoma'},
{'name': 'neoplasms'}]}]
with open('outputresult3.json', "w") as out:
    json.dump(L, out)
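For comparison, the same grouping can be sketched without pandas, using only the standard library. This is a swapped-in technique rather than what the answers above do; the CSV text is a shortened copy of the question's sample, and the names `children_by_gene` and `hierarchy` are made up for illustration.

```python
import csv
import io
import json

# Sample rows copied from the question; a real script would open the CSV file.
csv_text = """gene,disease
A1BG,Adenocarcinoma
A1BG,apnea
A2M,Asthma
NAT1,polyps
"""

children_by_gene = {}
for row in csv.DictReader(io.StringIO(csv_text)):
    # setdefault creates the list the first time a gene is seen;
    # a fresh {"name": ...} dict is built for every row.
    children_by_gene.setdefault(row["gene"], []).append({"name": row["disease"]})

# Dicts preserve insertion order, so genes appear in first-seen order.
hierarchy = [
    {"name": gene, "children": children}
    for gene, children in children_by_gene.items()
]

print(json.dumps(hierarchy, indent=2))
```

Building a new child dict per row also avoids the problem in the question's code, where one `anotherDict` object is reused and appended repeatedly.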
My JSON file is shown below
{
"PersonA": {
"Age": "35",
"Place": "Berlin",
"cars": ["Ford", "BMW", "Fiat"]
},
"PersonB": {
"Age": "45",
"Cars": ["Kia", "Ford"]
},
"PersonC": {
"Age": "55",
"Place": "London"
}
}
I'm trying to update certain entries in this JSON, e.g. set Place for PersonB to Rome and, similarly, set cars for PersonC to the array ["Hyundai", "Ford"].
What I have done so far is:
import json
key1 = 'PersonB'
key2 = 'PersonC'
filePath = "resources/test.json"
with open(filePath, encoding='utf-8') as jsonFile:
    jsonData = json.load(jsonFile)
    print(jsonData)
PersonBUpdate = {"Place" : "Rome"}
PersonCUpdate = {"cars" : ["Hyundai", "Ford"]}
jsonData[key1].append(PersonBUpdate)
jsonData[key2].append(PersonCUpdate)
print(jsonData)
It throws an error.
AttributeError: 'dict' object has no attribute 'append'
It should be like this:
jsonData['PersonB']['Place'] = 'Rome'
Dictionaries indeed do not have an append method. Only lists do.
Or you can use dict.update:
jsonData['PersonB'].update(PersonBUpdate)
list.append is a method for type list, not dict. Always make sure to look at the full method signature to see what type a method belongs to.
Instead we can use dict.update:
Update the dictionary with the key/value pairs from other, overwriting existing keys. Return None.
update() accepts either another dictionary object or an iterable of key/value pairs (as tuples or other iterables of length two). If keyword arguments are specified, the dictionary is then updated with those key/value pairs: d.update(red=1, blue=2).
And use this method in your code like this:
jsonData[key1].update(PersonBUpdate)
jsonData[key2].update(PersonCUpdate)
Which gives the expected result:
{'PersonA': {'Age': '35', 'Place': 'Berlin', 'cars': ['Ford', 'BMW', 'Fiat']}, 'PersonB': {'Age': '45', 'Cars': ['Kia', 'Ford'], 'Place': 'Rome'}, 'PersonC': {'Age': '55', 'Place': 'London', 'cars': ['Hyundai', 'Ford']}}
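A small self-contained sketch of both update styles, assuming a shortened copy of the question's JSON held in a string instead of resources/test.json. Note that the changes only exist in memory until the data is serialized again (here with json.dumps; json.dump would write to a file).

```python
import json

# Shortened stand-in for the contents of the JSON file from the question.
raw = (
    '{"PersonB": {"Age": "45", "Cars": ["Kia", "Ford"]},'
    ' "PersonC": {"Age": "55", "Place": "London"}}'
)
json_data = json.loads(raw)

# Style 1: assign a single key directly.
json_data["PersonB"]["Place"] = "Rome"

# Style 2: dict.update, here with the keyword-argument form.
json_data["PersonC"].update(cars=["Hyundai", "Ford"])

# Serialize the modified structure back to JSON text.
updated = json.dumps(json_data, indent=2)
print(updated)
```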
I'm having trouble printing a clean string from a JSON list of dicts.
I tried the .join and .split methods, but they don't seem to work. Thanks for the help!
My code:
import json
with open("user.json") as f:
    data = json.load(f)
for person in data["person"]:
    print(person)
The JSON file
{
"person": [
{
"name": "Peter",
"Country": "Montreal",
"Gender": "Male"
},
{
"name": "Alex",
"Country": "Laval",
"Gender": "Male"
}
]
}
The printed output (which is not the format I want):
{'name': 'Peter', 'Country': 'Montreal', 'Gender': 'Male'}
{'name': 'Alex', 'Country': 'Laval', 'Gender': 'Male'}
I want the printed output to look like this:
Name: Peter
Country: Montreal
Gender: Male
If you want to print all the attributes in the person dictionary (with no exceptions) you can use:
for person in data["person"]:
    for k, v in person.items():
        print(k, ':', v)
You can access the values using their keys as follows:
import json
with open("user.json") as f:
    data = json.load(f)
for person in data["person"]:
    print(f'Name: {person["name"]}')
    print(f'Country: {person["Country"]}')
    print(f'Gender: {person["Gender"]}')
Result:
Name: Peter
Country: Montreal
Gender: Male
Name: Alex
Country: Laval
Gender: Male
With Python 3.6+ you can use f-strings:
for person in data["person"]:
    print(f"Name: {person['name']}")
    print(f"Country: {person['Country']}")
    print(f"Gender: {person['Gender']}")
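Since the question mentions .join, here is one hedged sketch of how it could be used to build the requested layout for any set of keys. The helper name `format_person` is made up for illustration, and the JSON is inlined as a string instead of being read from user.json.

```python
import json

# Inline copy of the JSON from the question.
raw = (
    '{"person": ['
    '{"name": "Peter", "Country": "Montreal", "Gender": "Male"},'
    '{"name": "Alex", "Country": "Laval", "Gender": "Male"}]}'
)
data = json.loads(raw)


def format_person(person):
    """Return one 'Key: value' line per attribute, joined with newlines."""
    # Upper-case only the first letter of each key; the rest is left untouched.
    return "\n".join(
        f"{key[:1].upper()}{key[1:]}: {value}" for key, value in person.items()
    )


blocks = [format_person(person) for person in data["person"]]
# Separate the people with a blank line.
print("\n\n".join(blocks))
```

The key is capitalized by slicing rather than with str.capitalize(), because capitalize() would also lower-case the remaining characters of a key.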
I have a txt file which contains one JSON string per line. The first line looks like this:
{"rating": 9.3, "genres": ["Crime", "Drama"], "rated": "R", "filming_locations": "Ashland, Ohio, USA", "language": ["English"], "title": "The Shawshank Redemption", "runtime": ["142 min"], "poster": "http://img3.douban.com/lpic/s1311361.jpg", "imdb_url": "http://www.imdb.com/title/tt0111161/", "writers": ["Stephen King", "Frank Darabont"], "imdb_id": "tt0111161", "directors": ["Frank Darabont"], "rating_count": 894012, "actors": ["Tim Robbins", "Morgan Freeman", "Bob Gunton", "William Sadler", "Clancy Brown", "Gil Bellows", "Mark Rolston", "James Whitmore", "Jeffrey DeMunn", "Larry Brandenburg", "Neil Giuntoli", "Brian Libby", "David Proval", "Joseph Ragno", "Jude Ciccolella"], "plot_simple": "Two imprisoned men bond over a number of years, finding solace and eventual redemption through acts of common decency.", "year": 1994, "country": ["USA"], "type": "M", "release_date": 19941014, "also_known_as": ["Die Verurteilten"]}
I want to get data for imdb_id and title.
I have tried:
import json
data = json.load('movie_actors_data.txt')
But I got AttributeError: 'str' object has no attribute 'read'
What should I do to get the result I expect?
json.load() expects an open file that contains a single JSON document. You can't use it if there is a separate JSON string on each line. You need to read each line and call json.loads() on it.
import json
with open('movie_actors_data.txt') as f:
    data = list(map(json.loads, f))
data will be a list of dictionaries.
If you just want a few properties, you can use a list comprehension.
with open('movie_actors_data.txt') as f:
    data = [{"title": x["title"], "imdb_id": x["imdb_id"]} for x in map(json.loads, f)]
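json.load takes an open file object, not a file path:
with open('movie_actors_data.txt') as f:
    data = json.load(f)
Note that this only works when the file holds a single JSON document; with one JSON object per line, each line has to be parsed with json.loads.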
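A self-contained sketch of the line-by-line approach, assuming an in-memory stand-in for the text file. The first record is shortened from the question; the second record and the blank line are hypothetical, added only to show that empty lines need to be skipped.

```python
import io
import json

# Stand-in for the open text file: one JSON document per line.
f = io.StringIO(
    '{"imdb_id": "tt0111161", "title": "The Shawshank Redemption", "year": 1994}\n'
    "\n"
    '{"imdb_id": "tt0000000", "title": "Hypothetical Second Movie", "year": 2000}\n'
)

movies = []
for line in f:
    line = line.strip()
    if not line:
        # json.loads raises an error on an empty string, so skip blank lines.
        continue
    record = json.loads(line)
    # Keep only the two fields the question asks for.
    movies.append({"imdb_id": record["imdb_id"], "title": record["title"]})

print(movies)
```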
I have a txt file in the format shown below, where the key strings are not in quotes. How can I convert it into JSON using Python?
name {
first_name: "random"
}
addresses {
location {
locality: "India"
street_address: "xyz"
postal_code: "300092"
full_address: "street 1 , abc,India"
}
}
projects {
url: "www.githib.com"
}
There's no simple way in the standard library to convert that data format to JSON, so we need to write a parser. However, since the data format is fairly simple, that's not hard to do. We can use the standard csv module to read the data. The csv.reader will handle the details of parsing spaces and quoted strings correctly. A quoted string will be treated as a single token; tokens consisting of a single word may be quoted, but they don't need to be.
The csv.reader normally gets its data from an open file, but it's quite versatile, and will also read its data from a list of strings. This is convenient while testing since we can embed our input data into the script.
We parse the data into a nested dictionary. A simple way to keep track of the nesting is to use a stack, and we can use a plain list as our stack.
The code below assumes that input lines can be one of three forms:
1. Plain data. The line consists of a key-value pair, separated by at least one space.
2. A new subobject. The line starts with a key and ends in an open brace {.
3. The end of the current subobject. The line contains a single close brace }.
import csv
import json
raw = '''\
name {
first_name: "random"
}
addresses {
location {
locality: "India"
street_address: "xyz"
postal_code: "300092"
full_address: "street 1 , abc,India"
}
}
projects {
url: "www.githib.com"
}
'''.splitlines()
# A stack to hold the parsed objects
stack = [{}]
reader = csv.reader(raw, delimiter=' ', skipinitialspace=True)
for row in reader:
    #print(row)
    key = row[0]
    if key == '}':
        # The end of the current object
        stack.pop()
        continue
    val = row[-1]
    if val == '{':
        # A new subobject
        stack[-1][key] = d = {}
        stack.append(d)
    else:
        # A line of plain data
        stack[-1][key] = val
# Convert to JSON
out = json.dumps(stack[0], indent=4)
print(out)
Output:
{
"name": {
"first_name:": "random"
},
"addresses": {
"location": {
"locality:": "India",
"street_address:": "xyz",
"postal_code:": "300092",
"full_address:": "street 1 , abc,India"
}
},
"projects": {
"url:": "www.githib.com"
}
}
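In the output above the plain-data keys keep their trailing colon (for example "first_name:"). If that is not wanted, one possible variation is to strip it from the key before storing the value. The sketch below wraps the same stack-based idea in a function; the name `parse_block_text` and the shortened sample input are assumptions made for illustration.

```python
import csv
import json


def parse_block_text(lines):
    """Parse 'key {', 'key: "value"' and '}' lines into nested dicts."""
    root = {}
    stack = [root]  # the innermost open object is always stack[-1]
    for row in csv.reader(lines, delimiter=' ', skipinitialspace=True):
        if not row:
            continue  # ignore empty lines
        key = row[0]
        if key == '}':
            # End of the current subobject.
            stack.pop()
            continue
        # Drop the trailing colon that plain-data keys carry.
        key = key.rstrip(':')
        if row[-1] == '{':
            # A new subobject: attach it to the parent, then descend into it.
            child = {}
            stack[-1][key] = child
            stack.append(child)
        else:
            # A line of plain data.
            stack[-1][key] = row[-1]
    return root


# Shortened sample in the same format as the question's file.
sample = [
    'name {',
    'first_name: "random"',
    '}',
    'projects {',
    'url: "www.githib.com"',
    '}',
]
parsed = parse_block_text(sample)
print(json.dumps(parsed, indent=4))
```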
Assuming your data is already available as a Python dict like this:
{
'addresses': {
'location': {
'full_address': 'street 1 , abc,India',
'locality': 'India',
'postal_code': '300092',
'street_address': 'xyz'
}
},
'name': {
'first_name': 'random'
},
'projects': {
'url': 'www.githib.com'
}
}
Use json.dumps to convert the dict to a JSON string:
In [16]: import json
In [17]: data
Out[17]:
{'addresses': {'location': {'full_address': 'street 1 , abc,India',
'locality': 'India',
'postal_code': '300092',
'street_address': 'xyz'}},
'name': {'first_name': 'random'},
'projects': {'url': 'www.githib.com'}}
In [18]: json.dumps(data)
Out[18]: '{"name": {"first_name": "random"}, "projects": {"url": "www.githib.com"}, "addresses": {"location": {"postal_code": "300092", "full_address": "street 1 , abc,India", "street_address": "xyz", "locality": "India"}}}'