Using Python to modify the format of a JSON file

I have a JSON file that is formatted like this:
(multi-line for clarity)
(line 0001).......
{
"_id": "iD_0001",
"skills": [{
"name": "Project Management"
}, {
"name": "Business Development"
}]
}
....
(line 9999)
{
"_id":"iD_9999",
"skills": [{
"name": "Negotiation"
}, {
"name": "Banking"
}]
}
I'd like to run a program on it, but the program cannot read it in this format.
So I'd like to change its format to:
[{
"_id": "iD_0001",
"skills": [{
"name": "Project Management"
}, {
"name": "Business Development"
}]
},{
"_id":"iD_9999",
"skills": [{
"name": "Negotiation"
}, {
"name": "Banking"
}]
}]
Essentially, putting all entries in a single array.
Is there a way to implement that using Python or demjson?
ALTERNATIVE: I wrote a program that extracts the skills from these JSON files and writes them to a text file (Test.txt); however, it only works for the second format, not the first. Can you suggest a modification to make it work for the first format (above)?
This is my program:
import json
with open('Sample.json') as data_file:
    data = json.load(data_file)
# Write every skill name to Test.txt; the with block closes the file on exit.
with open('Test.txt', 'w') as f:
    for x in data:
        for y in x["skills"]:
            f.write(y["name"])
SOLUTION
Thank you to Antti Haapala for noticing that the first format is a concatenation of JSON objects, as well as to Walter Witzel and Josh J for suggesting alternative answers.
Since the first format is a concatenation of individual objects, one per line in the actual file, the program works if we load the first JSON file line by line instead of as a whole. I have done that with:
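import json
data = []
# Each line of the file holds one complete JSON object.
with open('Sample1-candidats.json') as data_file:
    for line in data_file:
        data.append(json.loads(line))
# The with block closes the file, so no explicit f.close() is needed.
with open('Test.txt', 'w') as f:
    for x in data:
        for y in x["skills"]:
            f.write(y["name"])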

Here it goes. This assumes that your file is just a bunch of individual JSON objects concatenated, and that you need to transform it into a list of JSON objects.
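import json
with open('sample.json') as data_file:
    # Join the objects with commas and wrap them in brackets to form one JSON array.
    strData = '[' + data_file.read().replace('}\n{', '},{') + ']'
# Parse the array with json.loads rather than eval, which would execute the file contents as code.
data = json.loads(strData)
with open('Test.txt', 'w') as f:
    for x in data:
        for y in x["skills"]:
            f.write(y["name"])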

Here are the steps you can take to solve your problem. Since it kind of sounds like a homework assignment, I will give you the logic and pointers but not the code.
Open the file for reading
Read file into string variable (if small enough for memory limits)
Create empty list for output
Split string on .....
json.loads each piece of resulting list
Append each result to your empty output list
Have a cup of coffee to celebrate
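The steps above can be sketched roughly as follows. This is only a minimal illustration: the function name split_records and its delimiter parameter are made up here, and the newline default assumes each object sits on its own line, as in the question's actual file. The sample string stands in for the file contents.

```python
import json


def split_records(text, delimiter="\n"):
    """Parse a string of delimiter-separated JSON objects into a list."""
    records = []
    for piece in text.split(delimiter):
        piece = piece.strip()
        if piece:  # skip blank pieces, e.g. after a trailing newline
            records.append(json.loads(piece))
    return records


# Stand-in for the text read from the file (one object per line).
sample = (
    '{"_id": "iD_0001", "skills": [{"name": "Project Management"}]}\n'
    '{"_id": "iD_9999", "skills": [{"name": "Banking"}]}\n'
)
records = split_records(sample)
print(json.dumps(records))
```

If the objects span several lines, splitting on a newline no longer works and a different delimiter or a real parser is needed.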

Related

How to connect json files downloaded from api?

I am downloading hundreds of files which have a format:
{
"result": [
{
"Lines": "130",
"Lon": 21.0566243,
"VehicleNumber": "1000",
"Time": "2020-12-22 18:55:03",
"Lat": 52.1812215,
"Brigade": "1"
},
{
"Lines": "311",
"Lon": 21.0817553,
"VehicleNumber": "1001",
"Time": "2020-12-22 18:54:52",
"Lat": 52.2407755,
"Brigade": "2"
}
]
}
My desired output is a list of dictionaries
[
{
"Lines": "130",
"Lon": 21.0566243,
"VehicleNumber": "1000",
"Time": "2020-12-22 18:55:03",
"Lat": 52.1812215,
"Brigade": "1"
},
{
"Lines": "311",
"Lon": 21.0817553,
"VehicleNumber": "1001",
"Time": "2020-12-22 18:54:52",
"Lat": 52.2407755,
"Brigade": "2"
}
]
combined from all the files.
What is a proper way to handle it?
I tried downloading with
def download(file_name):
    with open(os.path.join(path_to_data, file_name), 'a') as outfile:
        json.dump(response.json(), outfile)
But then I got one file with a couple of dictionaries with {"result":} and can't even load it as JSON. Should I save each JSON response in a separate file instead of appending everything to one file? If so, should I make a list of file names for the download function?
It's not clear if you want each response to be a list of the dictionaries or if you want one big list written to the file.
You can collect all dictionaries just by creating a list and using .extend. That's one large list with dictionaries.
hold_list = []
# your API loop here
resp = response.json()
hold_list.extend(resp['result'])
print(hold_list)
If you want a list of lists, use .append instead of .extend. Play around with it to see the difference.
After that, you can dump it into a file as a JSON:
with open("output.json", "w") as fp:
json.dump(hold_list, fp)
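To make the difference between .extend and .append concrete, here is a small self-contained sketch. The responses list is a hypothetical stand-in for the parsed bodies of two API calls, trimmed to a single field.

```python
import json

# Hypothetical stand-ins for the parsed bodies of two API responses.
responses = [
    {"result": [{"VehicleNumber": "1000"}, {"VehicleNumber": "1001"}]},
    {"result": [{"VehicleNumber": "1002"}]},
]

flat = []    # one list of dictionaries (.extend)
nested = []  # one inner list per response (.append)
for resp in responses:
    flat.extend(resp["result"])
    nested.append(resp["result"])

print(len(flat))    # 3
print(len(nested))  # 2

# The flat list serializes to a single valid JSON array.
text = json.dumps(flat)
```

With .extend every dictionary ends up at the top level, which matches the desired output in the question.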
Lastly, if you want to write to the file each time you get the response from the API, you can write resp['result']. But that gives a list for each API response, and you'll need to either write a delimiter or put in a new line character or you may end up with a list after list with no spaces or delimiters in-between. This won't be JSON, but you can use Python and manipulate it as a list with dictionaries.
However, it is possible to get a JSON as well.
For example, like this (gets a list of lists, like .append in the first case):
with open("output.json", "w") as fp:
    fp.write("[")
    first = True
    for response_json in responses:  # placeholder for your API loop
        if first:
            first = False
        else:
            fp.write(", ")
        fp.write(json.dumps(response_json["result"]))
    fp.write("]")
OR (a list of dicts, like .extend):
replace this line:
fp.write(json.dumps(response_json["result"]))
with this one:
fp.write(json.dumps(response_json["result"])[1:-1])
# [1:-1] is a slice to remove the [ and ]

reading from a json file using python

I'm trying to read from this JSON file and print the values. I can't figure out how to print all the values from the first dictionary in the list.
I want to print the following:
website: https://www.amazon.com/Apple-iPhone-GSM-Unlocked-64GB/dp/B07
price: 382,76
How can I do it?
JSON file:
[
{
"website": "https://www.amazon.com/Apple-iPhone-GSM-Unlocked-64GB/dp/B078P5BK5G",
"price": "382,76"
},
{
"website": "https://www.ebay.com/itm/Apple-iPhone-8-Plus-GSM-Unlocked-64GB-Gold-Renewed-Gold-64-GB-Gold-64-GB-/143340730792",
"price": "609,15"
}
]
Python code:
Tried this
import json
with open('./result.json') as json_file:
    data = json.load(json_file)
for p in data:
    print(p["price"])
Output is the prices of the products:
382,76
609,15
Instead of printing the prices it should print the values in the first dict in the list. Any good tips on how to do this?
You are looping over the list of dictionaries. If you want to loop over the values of the first dictionary, you first need to get the first element, and loop over that one.
first_dict = data[0]
for value in first_dict.values():
    print(value)
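If you also want the key in front of each value, as in the desired output in the question, a small variation is to loop over .items() instead. A minimal sketch, with the JSON from the question inlined as a string so that it runs without result.json:

```python
import json

# Sample data copied from the question, inlined instead of read from result.json.
raw = """[
  {"website": "https://www.amazon.com/Apple-iPhone-GSM-Unlocked-64GB/dp/B078P5BK5G",
   "price": "382,76"},
  {"website": "https://www.ebay.com/itm/Apple-iPhone-8-Plus-GSM-Unlocked-64GB-Gold-Renewed-Gold-64-GB-Gold-64-GB-/143340730792",
   "price": "609,15"}
]"""

data = json.loads(raw)
first_dict = data[0]

# Build one "key: value" line per entry of the first dictionary.
lines = [f"{key}: {value}" for key, value in first_dict.items()]
print("\n".join(lines))
```

Dictionaries keep their insertion order in current Python versions, so the lines come out in the same order as in the file.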

How to add a character like ("," , "[" , "]") at the end of a JSON-Object

I have a Python script that produces a file with invalid JSON.
Now I want to turn this file into valid JSON by adding a comma between every object, a '[' at the beginning of the file, and a ']' at the end.
Is there a way to do this with the json module alone, or do I have to find a way with other read and write functions?
Example_File.json:
{
"firstName": "Bidhan",
"lastName": "Chatterjee",
"age": 40,
"email":"bidhan#example.com"
}
{
"firstName": "hanbid",
"lastName": "jeeChatter",
"age": 10,
"email":"example#bidhan.com"
}
....
n times
New_File.json:
[
{
"firstName": "Bidhan",
"lastName": "Chatterjee",
"age": 40,
"email":"bidhan#example.com"
},
{
"firstName": "hanbid",
"lastName": "jeeChatter",
"age": 10,
"email":"example#bidhan.com"
},
....
n times
]
This is the function that creates this JSON file. I don't want to touch the other code where the str is generated.
data = json.loads(str)
with open('Example_File.json','ab')as outfile:
    json.dump(data, outfile, indent=2)
So far I don't have an idea how to solve this problem, so there is no code sample that would help.
The result should look like New_File.json.
You may have to read the content as a string, manipulate it, and load it as JSON. Something like this:
import json
with open('Example.json','r') as f:
    data = f.read()
# Add a comma after every closing brace except the last one (assumes the objects are not nested).
data = "[" + data.replace("}", "},", data.count("}")-1) + "]"
json_data = json.loads(data)
It seems your data has numbers beginning with 0, so you may end up with a ValueError exception. For how to deal with that, see "Why is JSON invalid if an integer begins with 0".
Note: I manually removed the 0 from "Example.json".
Can't you do
words.replace('}','},')
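This should replace all instances of '}' with '},'. Note that it also adds a comma after the last object, which has to be removed before the closing ']' is appended.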
First of all I don't think there is a way to parse it directly as JSON array.
However, if your JSON objects are not nested, a simple way to parse them is to split your string:
import json
with open(YOUR_FILE) as jsons_file:
    # split('}') leaves a trailing piece after the last object, so drop it before parsing
    jsons = [json.loads(x + '}') for x in jsons_file.read().split('}')[:-1]]
Now you can dump the list to a string or a file using the json library's dumps or dump:
json.dumps(jsons)
or
with open(OUT_FILE, 'w') as out_file:
    json.dump(jsons, out_file)
To add a comma between each object and wrap everything in brackets so that the file becomes valid JSON, you can also use jq with its slurp option:
jq -s '.' file_name
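For objects that are nested, where splitting on '}' breaks down, a pure Python alternative is to let the json module find the end of each object with JSONDecoder.raw_decode. The sketch below is only one way to do it: the helper name parse_concatenated is made up, and the "address" field in the sample is a hypothetical addition that is not in the question, there only to show nesting.

```python
import json


def parse_concatenated(text):
    """Return a list of every JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # Returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


# "address" is a hypothetical nested field added for illustration.
sample = """{
  "firstName": "Bidhan",
  "age": 40,
  "address": {"city": "X"}
}
{
  "firstName": "hanbid",
  "age": 10
}
"""
people = parse_concatenated(sample)
print(json.dumps(people, indent=2))
```

The resulting list can be written out with json.dump, which gives the bracketed, comma-separated file from the question.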

Python - How to copy all data out of an array

Currently I am exporting a database from Firebase into JSON, but it is coming out as an array.
[{"ConnectionTime": 19.23262298107147, "objectId": "01331oxpVT", "FirmwareRevision": "201504270003 Img-B", "DeviceID": "EDF02C74-6518-489E-8751-25C58F8C830D", "PeripheralType": 4, "updatedAt": "2015-10-09T04:01:39.569Z", "Model": "Bean", "HardwareRevision": "E", "Serial": "Serial Number", "createdAt": "2015-10-09T04:01:39.569Z", "Manufacturer": "Punch Through Design"}, {"ConnectionTime": 0.3193170428276062, "objectId": "018Mv1g6I8", "DeviceID": "42635033-DF3A-4109-A633-C3AB829BE114", "PeripheralType": 2, "updatedAt": "2015-12-08T04:20:41.950Z", "createdAt": "2015-12-08T04:20:41.950Z"}]
And then I get this error - Start of array encountered without start of object.'}]
How can I change this so that it is not an array but just a list of data? I also need a line break between each set of data, but I'm assuming that once I get the data out of the array, the code I currently have will do that. My code is below. Thanks for the help!
firebase = firebase.FirebaseApplication('https://dataworks-356fa.firebaseio.com/')
result = firebase.get('/connection_info_parse', None)
# id_keys = map(str, result.keys()) #filter out ID names
with open("firetobqrestore1.json", "w") as outfile:
    # for id in id_keys:
    json.dump(result, outfile, indent=None)
    outfile.write("\n")
It sounds like something in your workflow wants newline delimited JSON, although you haven't made it explicitly clear what is giving you this error.
With that caveat, I think this is what you are looking for:
import json
with open("firetobqrestore1.json", "w") as outfile:
    for line in result:
        json.dump(line, outfile, indent=None)
        outfile.write("\n")
This will write individual json objects to each line.
This also assumes that result is an actual python object rather than a JSON string. If it's a string you will need to parse it first with something like:
result = json.loads(result)
If the elements of the list are not parsed (they are strings), then loop through the list and convert each element with json.loads(). Then you can use json.dumps().
If the elements of the list are already parsed, just loop through the list and use json.dumps().
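Both cases can be folded into one helper. This is only a sketch: the name to_ndjson is made up, and it assumes every element is meant to be a JSON object, so an element that arrives as a string is treated as unparsed JSON. The sample records are shortened versions of the ones in the question.

```python
import json


def to_ndjson(result):
    """Return newline-delimited JSON text for a list or a JSON array string."""
    if isinstance(result, str):
        result = json.loads(result)  # the whole array was still a string
    lines = []
    for item in result:
        if isinstance(item, str):
            item = json.loads(item)  # this element was still a JSON string
        lines.append(json.dumps(item))
    # One object per line, each line terminated by a newline.
    return "".join(line + "\n" for line in lines)


# Shortened sample: one parsed element and one unparsed element.
sample = [
    {"objectId": "01331oxpVT", "PeripheralType": 4},
    '{"objectId": "018Mv1g6I8", "PeripheralType": 2}',
]
output = to_ndjson(sample)
print(output, end="")
```

The returned text can be written to the output file in one call to outfile.write.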

How to generate a json file in python from nested array/dictionary/object

I need to generate a json file like:
{
"age":100,
"name":"mkyong.com",
"messages":["msg 1","msg 2","msg 3"]
}
The data in this file should be populated from various places. What is the best way to do this in Python? I can always write it as a text file (character by character), but I was wondering if there is a cleaner way in which an array could be created and some library methods used to generate this JSON file. Please suggest a good solution.
PS. I am new to Python.
You can use json.dumps() for that. You can pass a dictionary to it and the function will encode it as json.
Example:
import json
# example dictionary that contains data like you want to have in json
dic = {'age': 100, 'name': 'mkyong.com', 'messages': ['msg 1', 'msg 2', 'msg 3']}
# get json string from that dictionary (named json_str so the json module is not shadowed)
json_str = json.dumps(dic)
print(json_str)
Output:
{"age": 100, "name": "mkyong.com", "messages": ["msg 1", "msg 2", "msg 3"]}
Try this out...
import json
with open('data.json', 'w') as outfile:
    json.dump({
        "age":100,
        "name":"mkyong.com",
        "messages":["msg 1","msg 2","msg 3"]
    }, outfile)
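Since the question mentions that the data comes from various places, here is a minimal sketch of building the dictionary piece by piece before serializing it. The functions load_profile and load_messages are hypothetical stand-ins for wherever the values really come from.

```python
import json


# Hypothetical sources standing in for the "various places" in the question.
def load_profile():
    return {"age": 100, "name": "mkyong.com"}


def load_messages():
    return ["msg %d" % i for i in range(1, 4)]


# Assemble the record step by step, then serialize it once at the end.
record = {}
record.update(load_profile())
record["messages"] = load_messages()

text = json.dumps(record, indent=2)
print(text)
```

The indent argument only affects readability; json.dump(record, outfile, indent=2) writes the same text directly to a file.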
