Convert Python list to JSON document

Context: I have a list with the structure below. It can contain a variable number of items, in multiples of 3. I am trying to transform each set of 3 into a separate JSON document.
['SCAN1.txt', 'Lastmodified:20191125.121049', 'Filesize:7196', 'SCAN2.txt', 'Lastmodified:20191125.121017', 'Filesize:3949', 'SCAN3.txt', 'Lastmodified:20191125.121056', 'Filesize:2766']
Question: How can I convert the single list into a JSON document with the following form, while also allowing for variability in the number of files it can accommodate:
{
{
"File": {
"File_Name":"SCAN1.txt",
"Last_Modified":"20191125.121049",
"File_Size":"7196"
}
{
"File": {
"File_Name":"SCAN2.txt",
"Last_Modified":"20191125.121017",
"File_Size":"3949"
}
}
{
"File": {
"File_Name":"SCAN3.txt",
"Last_Modified":"20191125.121056",
"File_Size":"2766"
}
}
}

Using chunked from more-itertools (the "Lastmodified:"/"Filesize:" prefixes are split off so only the values end up in the documents):
from more_itertools import chunked
import json

example = ['SCAN1.txt', 'Lastmodified:20191125.121049', 'Filesize:7196',
           'SCAN2.txt', 'Lastmodified:20191125.121017', 'Filesize:3949',
           'SCAN3.txt', 'Lastmodified:20191125.121056', 'Filesize:2766']

def file_to_json(chunk):
    # Drop the "Lastmodified:"/"Filesize:" prefixes, keeping only the values
    return {"File": {"File_Name": chunk[0],
                     "Last_Modified": chunk[1].split(":", 1)[1],
                     "File_Size": chunk[2].split(":", 1)[1]}}

json.dumps([file_to_json(chunk) for chunk in chunked(example, 3)])
Output:
[{
    "File": {
        "File_Name": "SCAN1.txt",
        "Last_Modified": "20191125.121049",
        "File_Size": "7196"
    }
}, {
    "File": {
        "File_Name": "SCAN2.txt",
        "Last_Modified": "20191125.121017",
        "File_Size": "3949"
    }
}, {
    "File": {
        "File_Name": "SCAN3.txt",
        "Last_Modified": "20191125.121056",
        "File_Size": "2766"
    }
}]
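If you'd rather not add more-itertools as a dependency, the same chunking can be sketched with the standard library alone: zipping three references to a single iterator yields consecutive, non-overlapping triples. The list name here matches the example above.

```python
import json

example = ['SCAN1.txt', 'Lastmodified:20191125.121049', 'Filesize:7196',
           'SCAN2.txt', 'Lastmodified:20191125.121017', 'Filesize:3949',
           'SCAN3.txt', 'Lastmodified:20191125.121056', 'Filesize:2766']

# zip(*[iter(example)] * 3) walks one iterator three steps at a time,
# producing (name, modified, size) triples
docs = [{"File": {"File_Name": name,
                  "Last_Modified": modified.split(":", 1)[1],
                  "File_Size": size.split(":", 1)[1]}}
        for name, modified, size in zip(*[iter(example)] * 3)]

print(json.dumps(docs, indent=4))
```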

You could also change your result to have the filenames as keys instead, since they are unique:
{
    "SCAN1.txt": {
        "Filesize": 7196,
        "Lastmodified": 20191125.121049
    },
    "SCAN2.txt": {
        "Filesize": 3949,
        "Lastmodified": 20191125.121017
    },
    "SCAN3.txt": {
        "Filesize": 2766,
        "Lastmodified": 20191125.121056
    }
}
Which could be achieved like below (comments included):
from collections import defaultdict
from json import dumps
from ast import literal_eval

lst = [
    "SCAN1.txt",
    "Lastmodified:20191125.121049",
    "Filesize:7196",
    "SCAN2.txt",
    "Lastmodified:20191125.121017",
    "Filesize:3949",
    "SCAN3.txt",
    "Lastmodified:20191125.121056",
    "Filesize:2766",
]

def group_file_documents(lst, prefix="SCAN"):
    # Use a defaultdict of dicts to represent the final JSON structure;
    # it can be serialized like a normal dictionary
    result = defaultdict(dict)
    current_file = None
    for item in lst:
        # Update the current file name if the item starts with the prefix
        if item.startswith(prefix):
            current_file = item
            continue
        # Ensure a current file name is present
        if current_file:
            # Split "key:value" and strip whitespace, just in case
            key, value = map(str.strip, item.split(":"))
            # Convert to an actual int/float value
            result[current_file][key] = literal_eval(value)
    return result

# Print serialized JSON with sorted keys and indents of 4 spaces
print(dumps(group_file_documents(lst), sort_keys=True, indent=4))


How can I change an attribute trait_type value (from an int to a str) in multiple JSON files using a Python script?

"attributes": [
    {
        "trait_type": "Vintage",
        "value": 2019
    },
    {
        "trait_type": "Volume (ml)",
        "value": 750
    }
]
I want to change the 2019 and the 750 to a string (by including "") for multiple values over 150+ JSONs using Python.
I am not a dev and do not use Python, but I have this so far:
import json
for i in range(1, 147):
    with open(f'{i}.json', 'r+', encoding='utf8') as f:
        data = json.load(f)
You can make a function that takes a JSON file name and replaces the specific values you're looking for:
import json

def change_values_to_string(json_filename):
    with open(f'{json_filename}.json', 'r') as json_fp:
        json_to_dict = json.loads(json_fp.read())
    for sub_dict in json_to_dict["attributes"]:
        sub_dict["value"] = str(sub_dict["value"])
    with open(f'{json_filename}.json', 'w') as json_fp:
        json_fp.write(json.dumps(json_to_dict))

for i in range(1, 147):
    change_values_to_string(i)
Input:
test.json
{
    "attributes": [
        {
            "trait_type": "Vintage",
            "value": 2019
        },
        {
            "trait_type": "Volume (ml)",
            "value": 750
        }
    ]
}
Calling the function with the correct file name: change_values_to_string("test")
Outputs:
{
    "attributes": [
        {
            "trait_type": "Vintage",
            "value": "2019"
        },
        {
            "trait_type": "Volume (ml)",
            "value": "750"
        }
    ]
}
Explanations:
Open the JSON file in read mode and load it into a Python dictionary.
Iterate over the "attributes" key, which contains a list of dictionaries.
Replace each "value" with its string form.
Dump the dictionary back into the same file, overwriting it.
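One caveat: json.dumps with no extra arguments writes the whole document on a single line. If the original files are pretty-printed and you want to keep that layout, an indent argument can be passed when re-serialising. A minimal sketch, using the same structure as test.json above:

```python
import json

data = {"attributes": [{"trait_type": "Vintage", "value": 2019},
                       {"trait_type": "Volume (ml)", "value": 750}]}

# Same conversion as in the answer: stringify each "value"
for sub_dict in data["attributes"]:
    sub_dict["value"] = str(sub_dict["value"])

# indent=4 keeps the output human-readable instead of one long line
serialized = json.dumps(data, indent=4)
print(serialized)
```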

Convert Pandas Dataframe or csv file to Custom Nested JSON

I have a csv file with a DF with structure as follows:
my dataframe:
I want to enter the data into the following JSON format using Python. I looked at a couple of links, but I got lost in the nested part. The links I checked:
How to convert pandas dataframe to uniquely structured nested json
convert dataframe to nested json
{
    "PHI": 2,
    "firstname": "john",
    "medicalHistory": {
        "allergies": "egg",
        "event": {
            "inPatient": {
                "hospitalized": {
                    "visit": "7-20-20",
                    "noofdays": "5",
                    "test": {
                        "modality": "xray"
                    },
                    "vitalSign": {
                        "temperature": "32",
                        "heartRate": "80"
                    },
                    "patientcondition": {
                        "headache": "1",
                        "cough": "0"
                    }
                },
                "icu": {
                    "visit": "",
                    "noofdays": ""
                }
            },
            "outpatient": {
                "visit": "5-20-20",
                "vitalSign": {
                    "temperature": "32",
                    "heartRate": "80"
                },
                "patientcondition": {
                    "headache": "1",
                    "cough": "1"
                },
                "test": {
                    "modality": "blood"
                }
            }
        }
    }
}
If anyone can help me with the nested array, that will be really helpful.
You need one or more helper functions to unpack the data in the table like this. Write a main helper function that accepts two arguments: the df and a schema. The schema is used to unpack each row of the df into a nested structure. The schema below is an example of how to achieve this for a subset of the logic you describe. Although it is not exactly what you specified in the example, it should be enough of a hint for you to complete the rest of the task on your own.
from operator import itemgetter

# df is assumed to be the DataFrame loaded from your csv
groupby_idx = ['PHI', 'firstName']
groups = df.groupby(groupby_idx, as_index=False)

schema = {
    "event": {
        "eventType": itemgetter('event'),
        "visit": itemgetter('visit'),
        "noOfDays": itemgetter('noofdays'),
        "test": {
            "modality": itemgetter('test')
        },
        "vitalSign": {
            "temperature": itemgetter('temperature'),
            "heartRate": itemgetter('heartRate')
        },
        "patientCondition": {
            "headache": itemgetter('headache'),
            "cough": itemgetter('cough')
        }
    }
}

def unpack(obj, schema):
    # Recurse into nested schema dicts; call the itemgetter leaves on the row
    tmp = {}
    for k, v in schema.items():
        if isinstance(v, dict):
            tmp[k] = unpack(obj, v)
        elif callable(v):
            tmp[k] = v(obj)
    return tmp

def apply_unpack(groups, schema):
    results = {}
    for gidx, group_df in groups:  # gidx is a (PHI, firstName) tuple
        events = []
        for ridx, obj in group_df.iterrows():
            events.append(unpack(obj, schema))
        results[gidx] = events
    return results

unpacked = apply_unpack(groups, schema)
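Since unpack only needs key lookups, you can sanity-check a schema against a plain dict before involving pandas at all. The row values below are made up for illustration:

```python
from operator import itemgetter

# A reduced schema covering just two branches of the structure above
schema = {
    "test": {"modality": itemgetter("test")},
    "vitalSign": {"temperature": itemgetter("temperature"),
                  "heartRate": itemgetter("heartRate")},
}

def unpack(obj, schema):
    # Recurse into nested schema dicts; call the itemgetter leaves on the row
    tmp = {}
    for k, v in schema.items():
        if isinstance(v, dict):
            tmp[k] = unpack(obj, v)
        elif callable(v):
            tmp[k] = v(obj)
    return tmp

# Hypothetical row, as a plain dict instead of a pandas Series
row = {"test": "xray", "temperature": "32", "heartRate": "80"}
print(unpack(row, schema))
```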

Transforming JSON keys and values

I have a simple JSON in Python that looks like:
{
    "list": [
        {
            "key1": "value1"
        },
        {
            "key1": "value1"
        }
    ]
}
I want to transform this to the following JSON. Any suggestions on how I can do it with Python without installing additional libraries?
{
    "list": [
        {
            "keys": {
                "name": "key1",
                "value": "value1"
            }
        },
        {
            "keys": {
                "name": "key1",
                "value": "value1"
            }
        }
    ]
}
Not sure from your question if you already have the json read into a variable, or if it is in a file. This is assuming you have it in a variable already:
in_json = {
    "list": [
        {
            "key1": "value1"
        },
        {
            "key2": "value2"
        }
    ]
}

out_json = {"list": []}
for kd in in_json["list"]:
    sub_kd = {"keys": {}}
    for k, v in kd.items():  # iteritems() is Python 2 only
        sub_kd["keys"]["name"] = k
        sub_kd["keys"]["value"] = v
    out_json["list"].append(sub_kd)
print(out_json)
It just loops through the JSON, making dictionaries to append to the out_json dictionary. You could make this print pretty with the json library and also save it to a file with it.
You didn't indicate exactly what container the JSON data is in, so I've put it all in a string in the example code below and used the json.loads() function to turn it into a Python dictionary. If it's in a file, you can use the module's json.load() function instead.
It also assumes that each sub-JSON object in the "list" list consists of only one key/value pair, as shown in your question.
The code below changes the deserialized dictionary in-place and pretty-prints the result of doing that by using the json.dumps() function to reserialize it.
Note that I changed the keys and values in sample input JSON slightly to make it easier to see the correspondence between it and the printed results.
import json

json_in = '''
{
    "list": [
        {
            "key1": "value1"
        },
        {
            "key2": "value2"
        }
    ]
}
'''

json_data = json.loads(json_in)  # Deserialize.
for i, obj in enumerate(json_data['list']):
    # Assumes each object in the list contains only one key/value pair.
    newobj = {'name': next(iter(obj.keys())),
              'value': next(iter(obj.values()))}
    json_data['list'][i] = {'keys': newobj}
print(json.dumps(json_data, indent=4))  # Reserialize and print.
Printed result:
{
    "list": [
        {
            "keys": {
                "name": "key1",
                "value": "value1"
            }
        },
        {
            "keys": {
                "name": "key2",
                "value": "value2"
            }
        }
    ]
}
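For what it's worth, the same reshaping can also be written as a single comprehension, under the same one-pair-per-object assumption:

```python
import json

json_data = {"list": [{"key1": "value1"}, {"key2": "value2"}]}

# Rebuild each one-pair object as {"keys": {"name": ..., "value": ...}};
# the inner loop unpacks the single key/value pair of each object
json_data["list"] = [{"keys": {"name": k, "value": v}}
                     for obj in json_data["list"]
                     for k, v in obj.items()]

print(json.dumps(json_data, indent=4))
```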

Update a specific key in JSON Array using PYTHON

I have a JSON file which has some key-value pairs in arrays. I need to update/replace the value for key id with a value stored in a variable called var1.
The problem is that when I run my Python code, it adds the new key-value pair outside the inner array instead of replacing the value:
PYTHON SCRIPT:
import json
import sys

var1 = "abcdefghi"

with open('C:\\Projects\\scripts\\input.json', 'r+') as f:
    json_data = json.load(f)
    json_data['id'] = var1
    f.seek(0)
    f.write(json.dumps(json_data))
    f.truncate()
INPUT JSON:
{
    "channel": "AT",
    "username": "Maintenance",
    "attachments": [
        {
            "fallback": "[Urgent]:",
            "pretext": "[Urgent]:",
            "color": "#D04000",
            "fields": [
                {
                    "title": "SERVERS:",
                    "id": "popeye",
                    "short": false
                }
            ]
        }
    ]
}
OUTPUT:
{
    "username": "Maintenance",
    "attachments": [
        {
            "color": "#D04000",
            "pretext": "[Urgent]:",
            "fallback": "[Urgent]:",
            "fields": [
                {
                    "short": false,
                    "id": "popeye",
                    "title": "SERVERS:"
                }
            ]
        }
    ],
    "channel": "AT",
    "id": "abcdefghi"
}
Below will update the id inside fields :
json_data['attachments'][0]['fields'][0]['id'] = var1
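Putting that together with your original script, a full read-modify-write pass might look like the sketch below. The sample input is recreated inline here so the snippet is self-contained; substitute your real path for 'input.json'.

```python
import json

# Recreate the sample input from the question (stand-in for the real file)
sample = {
    "channel": "AT",
    "username": "Maintenance",
    "attachments": [{
        "fallback": "[Urgent]:",
        "pretext": "[Urgent]:",
        "color": "#D04000",
        "fields": [{"title": "SERVERS:", "id": "popeye", "short": False}],
    }],
}
with open('input.json', 'w') as f:
    json.dump(sample, f, indent=4)

var1 = "abcdefghi"  # note the quotes: the original snippet was missing them

with open('input.json', 'r+') as f:
    json_data = json.load(f)
    # Assign into the nested dict instead of creating a top-level "id"
    json_data['attachments'][0]['fields'][0]['id'] = var1
    f.seek(0)
    json.dump(json_data, f, indent=4)
    f.truncate()
```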

decoding json string in python

I have the following JSON string (from wikipedia http://en.wikipedia.org/wiki/JSON)
{
    "name": "Product",
    "properties": {
        "id": {
            "type": "number",
            "description": "Product identifier",
            "required": true
        },
        "name": {
            "type": "string",
            "description": "Name of the product",
            "required": true
        },
        "price": {
            "type": "number",
            "minimum": 0,
            "required": true
        },
        "tags": {
            "type": "array",
            "items": {
                "type": "string"
            }
        },
        "stock": {
            "type": "object",
            "properties": {
                "warehouse": {
                    "type": "number"
                },
                "retail": {
                    "type": "number"
                }
            }
        }
    }
}
I am trying to decode this string using the Python json library. I would like to access the node
properties -> stock -> properties -> warehouse.
I understand that the json.loads() function stores the JSON string as a dictionary. But in this case properties is my key and everything under it is the value. How do I access that node?
import json

jsonText = ""
file = open("c:/dir/jsondec.json")
for line in file:  # xreadlines() is Python 2 only
    jsonText += line
data = json.loads(jsonText)
for k, v in data.items():
    print(k)  # shows name and properties
file.close()
Thanks
You can load JSON straight from the file like this:
f = open("c:/dir/jsondec.json")
data = json.load(f)
Based on your input string, data is now a dictionary that contains other dictionaries. You can just navigate down the dictionaries like so:
node = data['properties']['stock']['properties']['warehouse']
print(node)
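In current Python 3 the same answer reads a little more safely with a context manager, which closes the file automatically. To keep the sketch self-contained, the relevant subset of the question's JSON is written to a local file first (a stand-in for c:/dir/jsondec.json):

```python
import json

# Write the relevant subset of the question's JSON so the sketch runs as-is
doc = '''{"name": "Product",
          "properties": {"stock": {"type": "object",
                                   "properties": {"warehouse": {"type": "number"},
                                                  "retail": {"type": "number"}}}}}'''
with open('jsondec.json', 'w') as f:
    f.write(doc)

# json.load reads straight from the file object; "with" closes it for us
with open('jsondec.json') as f:
    data = json.load(f)

node = data['properties']['stock']['properties']['warehouse']
print(node)  # -> {'type': 'number'}
```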
