Compare and append to JSON file in Python

I have 2 files like these:
file1.json
[
{
"name": "john",
"version": "0.0"
},
{
"name": "peter",
"version": "1.0"
},
{
"name": "bob",
"version": "2.0",
"single": "true"
}
]
file2.json
[
{
"name": "jane",
"version": "0.0"
},
{
"name": "peter",
"version": "1.0"
},
{
"name": "bob",
"version": "2.0",
"single": "true"
}
]
I want to compare the "name" values in file1.json with the "name" values in file2.json. If file2.json has some "name" values not in file1.json, I want to append that json object to file1.json.
For the above 2 files, I want file1 to be appended with the "name": "jane" object since that is not present in file1. So file1 should be updated to:
[
{
"name": "john",
"version": "0.0"
},
{
"name": "peter",
"version": "1.0"
},
{
"name": "bob",
"version": "2.0",
"single": "true"
},
{
"name": "jane",
"version": "0.0"
}
]
What I have tried:
with open('file1.json', 'r') as file1, open('file2.json', 'r') as file2:
    file1_json = json.load(file1)
    file2_json = json.load(file2)
for object in file2_json:
    if object['name'] not in file1_json.values():
        file1_json.update(object)
with open('file1.json', 'w') as file1:
    json.dump(file1_json, file1, indent=4)

Collect the names in file1_json, find the missing names in file2_json, then add them to file1_json:
import json

with open('file1.json', 'r') as file1, open('file2.json', 'r') as file2:
    file1_json = json.load(file1)
    file2_json = json.load(file2)
# Names already present in file1
names = [o['name'] for o in file1_json]
# Objects from file2 whose name is not in file1
missing = [o for o in file2_json if o['name'] not in names]
file1_json.extend(missing)
with open('file1.json', 'w') as file1:
    json.dump(file1_json, file1, indent=2)
Output:
[
{
"name": "john",
"version": "0.0"
},
{
"name": "peter",
"version": "1.0"
},
{
"name": "bob",
"version": "2.0",
"single": "true"
},
{
"name": "jane",
"version": "0.0"
}
]
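The same idea can be wrapped in a small helper. This is only a sketch, not part of the answer above: the function name `append_missing` and its `key` parameter are made up for illustration, and it assumes every object contains the key. A set is used for the name lookups so that membership tests stay cheap on larger files.

```python
def append_missing(base, other, key="name"):
    """Append to `base` every dict from `other` whose `key` value is not yet in `base`.

    Hypothetical helper: assumes both arguments are lists of dicts that all contain `key`.
    """
    seen = {item[key] for item in base}   # set gives fast membership tests
    for item in other:
        if item[key] not in seen:
            base.append(item)
            seen.add(item[key])           # also skips duplicates inside `other`
    return base


file1_json = [{"name": "john", "version": "0.0"}, {"name": "peter", "version": "1.0"}]
file2_json = [{"name": "jane", "version": "0.0"}, {"name": "peter", "version": "1.0"}]
append_missing(file1_json, file2_json)
print([item["name"] for item in file1_json])
```

The appended objects keep the order they had in the second list.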

This should work if all dictionary values are hashable (which they are here, since they are all strings):
import json

with open('file1.json', 'r') as file1, open('file2.json', 'r') as file2:
    file1_json = json.load(file1)
    file2_json = json.load(file2)
# Turn each dict into a frozenset of its items so duplicates collapse in a set
print([dict(s) for s in set(frozenset(d.items()) for d in file1_json + file2_json)])
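One thing to keep in mind with the frozenset approach is that it compares whole objects rather than only "name", and a set does not preserve order. The following sketch uses made-up data (two "peter" objects with different versions) to show the effect:

```python
a = [{"name": "peter", "version": "1.0"}]
b = [{"name": "peter", "version": "2.0"}, {"name": "peter", "version": "1.0"}]

# Deduplicate whole objects: two dicts collapse only if every key/value pair matches.
unique = [dict(s) for s in {frozenset(d.items()) for d in a + b}]

# Both "peter" versions survive because they differ in "version".
print(sorted(d["version"] for d in unique))
```

If only the name should decide whether an object is a duplicate, one of the name-based answers is probably a better fit.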

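You could create an index into file2_json, keyed by name, to speed up lookups. Then take the set of names from each file and subtract; the result is the set of missing names. Use the index to add the missing objects. Because sets are unordered, the appended objects may not keep the order they had in file2.json.
import json

with open('file1.json', 'r') as file1, open('file2.json', 'r') as file2:
    file1_json = json.load(file1)
    file2_json = json.load(file2)
# Map each name in file2 to its object
idx2 = {d["name"]: d for d in file2_json}
# Names that appear in file2 but not in file1
missing = set(idx2) - set(d["name"] for d in file1_json)
file1_json.extend(idx2[name] for name in missing)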

I believe what you essentially want is to merge dictionaries. The fact that the data comes from JSON does not really matter here.
Take a look at this:
https://stackoverflow.com/a/62820532/4944708
Here's the full solution, assuming you have already read the JSON files into json1 and json2:
def to_dict(json_):
    return {item['name']: item for item in json_}

list({**to_dict(json1), **to_dict(json2)}.values())
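As a quick illustration of how this merge behaves, here is a sketch with small in-memory lists standing in for the loaded files (the data is made up). When both lists contain the same name, the object from the second list wins:

```python
def to_dict(json_):
    # Index a list of objects by their "name" value.
    return {item["name"]: item for item in json_}


json1 = [{"name": "john", "version": "0.0"}, {"name": "peter", "version": "1.0"}]
json2 = [{"name": "jane", "version": "0.0"}, {"name": "peter", "version": "1.1"}]

# Later mappings win, so json2's "peter" replaces json1's.
merged = list({**to_dict(json1), **to_dict(json2)}.values())
print(merged)
```

If the objects in the first file should be kept unchanged, swapping the two arguments keeps its versions, at the cost of a different ordering.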

Related

Compare two JSON Files and Return the Difference

I have found some similar questions to this. The problem is that none of those solutions work for me and some are too advanced. I'm trying to read the two JSON files and return the difference between them.
I want to be able to return the missing object from file2 and write it into file1.
These are both the JSON files
file1
[
{
"name": "John Wood",
"age": 35,
"country": "USA"
},
{
"name": "Mark Smith",
"age": 30,
"country": "USA"
}
]
file2
[
{
"name": "John Wood",
"age": 35,
"country": "USA"
},
{
"name": "Mark Smith",
"age": 30,
"country": "USA"
},
{
"name": "Oscar Bernard",
"age": 25,
"country": "Australia"
}
]
The code
import json

with open("file1.json", "r") as f1:
    file1 = f1.read()
    item1 = json.loads(file1)
    print(item1)
with open("file2.json", "r") as f2:
    file2 = f2.read()
    item2 = json.loads(file2)
    print(item2)
# Prints whether each pair of objects has the same name
for v in range(len(item1)):
    for m in range(len(item2)):
        if item1[v]["name"] == item2[m]["name"]:
            print("Equal")
        else:
            print("Not Equal")
I'm not used to programming using JSON or comparing two different files.
I would like help to return the difference and basically copy and paste
the missing object into file1. What I have here shows me if each object in file1
is equal or not to file2. Thus, returning the "Equal" and "Not Equal" output.
Output:
Equal
Not Equal
Not Equal
Not Equal
Equal
Not Equal
I want to report whether file1 differs from file2 and which object ("name")
is the one missing. Afterward, I want to be able to add/copy that missing object into file1 using with open("file1.json", "w").
import json

with open("file1.json", "r") as f1:
    file1 = json.loads(f1.read())
with open("file2.json", "r") as f2:
    file2 = json.loads(f2.read())
# Append every object from file2 that does not appear, as a whole, in file1
for item in file2:
    if item not in file1:
        print(f"Found difference: {item}")
        file1.append(item)
print(f"New file1: {file1}")
file1=[
{
"name": "John Wood",
"age": 35,
"country": "USA"
},
{
"name": "Mark Smith",
"age": 30,
"country": "USA"
}
]
file2=[
{
"name": "John Wood",
"age": 35,
"country": "USA"
},
{
"name": "Mark Smith",
"age": 30,
"country": "USA"
},
{
"name": "Oscar Bernard",
"age": 25,
"country": "Australia"
}
]
for item in file2:
    if item['name'] not in [x['name'] for x in file1]:
        print(f"Found difference: {item}")
        file1.append(item)
print(f"New file1: {file1}")
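Neither snippet above writes the result back to disk. A minimal sketch of the full round trip could look like the following; the function name `sync_missing` and its `key` parameter are hypothetical, and the demo runs in a temporary directory so that no real files are touched.

```python
import json
import os
import tempfile


def sync_missing(path1, path2, key="name"):
    """Copy objects found only in path2 into path1; return the objects that were added."""
    with open(path1) as f1, open(path2) as f2:
        data1, data2 = json.load(f1), json.load(f2)
    known = {item[key] for item in data1}
    missing = [item for item in data2 if item[key] not in known]
    if missing:
        # Rewrite the whole file so it stays a single valid JSON array
        with open(path1, "w") as f1:
            json.dump(data1 + missing, f1, indent=4)
    return missing


# Demo with made-up data in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    p1, p2 = os.path.join(tmp, "file1.json"), os.path.join(tmp, "file2.json")
    with open(p1, "w") as f:
        json.dump([{"name": "John Wood"}], f)
    with open(p2, "w") as f:
        json.dump([{"name": "John Wood"}, {"name": "Oscar Bernard"}], f)
    added = sync_missing(p1, p2)
    with open(p1) as f:
        result = json.load(f)
print(added)
```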

Update existing JSON file

I'm trying to update an existing JSON file when running my code by adding additional data (package_id). These are the existing JSON contents:
{
"1": {
"age": 10,
"name": [
"ramsi",
"jack",
"adem",
"sara"
],
"skills": []
}
}
I want to insert a new package so that the file looks like this:
{
"1": {
"age": 10,
"name": [
"ramsi",
"jack",
"adem",
"sara"
],
"skills": []
},
"2": {
"age": 14,
"name": [
"maya",
"raji"
],
"skills": ["writing"]
}
}
The issue is that when I add the new data, it is written as a second top-level object (a second pair of braces), which is not allowed by the JSON standard:
{
"1": {
"age": 10,
"name": [
"ramsi",
"jack",
"adem",
"sara"
],
"skills": []
}
} {
"2": {
"age": 14,
"name": [
"maya",
"raji"
],
"skills": ["writing"]
}
}
This is my code to add the new package_id:
list1[package_id] = {"age": x, "name": y, "skills": z}
ss = json.dumps(list1, indent=2)
data = []
with open('file.json', 'r+') as f:
    data = json.loads(f.read())
    data1 = json.dumps(data, indent=2)
    f.seek(0)
    f.write(data1)
    f.write(ss)
    f.truncate()
I write to the file twice because if I don't store the existing contents and write them back, the old data is removed and only package_id number 2 is kept.
It doesn't work that way. You can't add to a JSON document by appending another JSON document. A JSON file always contains exactly one top-level value. You need to load that object, modify it, and write it back.
with open('file.json', 'r') as f:
    data = json.loads(f.read())
# Add the new entry to the existing top-level object
data[package_id] = {'age': x, 'name': y, 'skills': z}
with open('file.json', 'w') as f:
    f.write(json.dumps(data, indent=2))
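To make the load, modify, rewrite cycle concrete, here is a self-contained sketch. The helper name `add_package` is made up, and the demo writes to a temporary directory instead of the real file.json. The key is converted to a string because JSON object keys are always strings, so an integer key would come back as "2" after reloading anyway.

```python
import json
import os
import tempfile


def add_package(path, package_id, record):
    """Load the single top-level object, add one entry, and rewrite the whole file."""
    with open(path) as f:
        data = json.load(f)
    data[str(package_id)] = record      # JSON object keys are always strings
    with open(path, "w") as f:
        json.dump(data, f, indent=2)
    return data


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.json")
    with open(path, "w") as f:
        json.dump({"1": {"age": 10, "name": ["ramsi", "jack"], "skills": []}}, f)
    add_package(path, 2, {"age": 14, "name": ["maya", "raji"], "skills": ["writing"]})
    with open(path) as f:
        updated = json.load(f)          # parses cleanly: still one top-level object
print(sorted(updated))
```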

Convert CSV with nested headers to JSON

So far, I have this code (with help from a tutorial):
import csv, json
csvFilePath = "convertcsv.csv"
jsonFilePath = "newResult.json"
# Read the CSV and add the data to a dictionary...
data = {}
with open(csvFilePath) as csvFile:
    csvReader = csv.DictReader(csvFile)
    for csvRow in csvReader:
        data = csvRow
# Write data to a JSON file...
with open(jsonFilePath, "w") as jsonFile:
    jsonFile.write(json.dumps(data, indent=4))
My desired output is this:
{
"userID": "string",
"username": "string",
"age": "string",
"location": {
"streetName": "string",
"streetNo": "string",
"city": "string"
}
}
I don't know how to represent the "location".
My actual result is this:
{
"userID": "string",
"username": "string",
"age": "string",
"location/streetName": "string",
"location/streetNo": "string",
"location/city": "string"
}
How can I separate streetName, streetNo and city and put them into "location"?
Below is a simple script that should do what you want. The result will be a JSON object with the "userID" values as keys. Note that, to test deeper nesting, I used a CSV file with slightly different headers, but it will work just as well with your original example.
import csv, json
infile = 'convertcsv.csv'
outfile = 'newResult.json'
data = {}

def process(header, value, record):
    # Split off the first path component; recurse while there is a remainder
    key, other = header.partition('/')[::2]
    if other:
        process(other, value, record.setdefault(key, {}))
    else:
        record[key] = value

with open(infile) as stream:
    reader = csv.DictReader(stream)
    for row in reader:
        data[row['userID']] = record = {}
        for header, value in row.items():
            process(header, value, record)
with open(outfile, "w") as stream:
    json.dump(data, stream, indent=4)
INPUT:
userID,username,age,location/street/name,location/street/number,location/city
0,AAA,20,This Street,5,This City
1,BBB,42,That Street,5,That City
2,CCC,34,Other Street,5,Other City
OUTPUT:
{
"0": {
"userID": "0",
"username": "AAA",
"age": "20",
"location": {
"street": {
"name": "This Street",
"number": "5"
},
"city": "This City"
}
},
"1": {
"userID": "1",
"username": "BBB",
"age": "42",
"location": {
"street": {
"name": "That Street",
"number": "5"
},
"city": "That City"
}
},
"2": {
"userID": "2",
"username": "CCC",
"age": "34",
"location": {
"street": {
"name": "Other Street",
"number": "5"
},
"city": "Other City"
}
}
}
I'd add some custom logic to achieve this. Note that this handles the first level only; if you need deeper nesting, you should write a recursive function:
# Write data to a JSON file...
with open(jsonFilePath, "w") as jsonFile:
    # Iterate over a copy, because the dict is modified inside the loop
    for i, v in list(data.items()):
        if '/' in i:
            parts = i.split('/', 1)
            # setdefault keeps earlier sub-keys instead of overwriting them
            data.setdefault(parts[0], {})[parts[1]] = v
            data.pop(i)
    jsonFile.write(json.dumps(data, indent=4))
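For nesting of any depth, one option is to walk each header's path components and create intermediate dictionaries on demand. The sketch below is an iterative variant rather than a recursive one; the function name `nest` and its `sep` parameter are hypothetical, and the sample row mirrors the headers from the question.

```python
def nest(flat, sep="/"):
    """Turn {"a/b/c": 1} into {"a": {"b": {"c": 1}}} for any depth."""
    nested = {}
    for header, value in flat.items():
        *parents, leaf = header.split(sep)
        target = nested
        for part in parents:
            target = target.setdefault(part, {})   # create intermediate dicts on demand
        target[leaf] = value
    return nested


row = {
    "userID": "string",
    "location/streetName": "string",
    "location/streetNo": "string",
    "location/city": "string",
}
print(nest(row))
```

Because it builds a new dictionary instead of editing the row in place, it avoids changing a dictionary while iterating over it.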
You can use something like this:
# https://www.geeksforgeeks.org/convert-csv-to-json-using-python/
import csv
import json

# Function to convert a CSV to JSON
# Takes the file paths as arguments
def make_json(csvFilePath, jsonFilePath):
    # create a dictionary
    data = {}
    # Open a csv reader called DictReader
    with open(csvFilePath, encoding='utf-8') as csvf:
        csvReader = csv.DictReader(csvf)
        # Convert each row into a dictionary
        # and add it to data
        for rows in csvReader:
            # Assuming a column named 'No' to
            # be the primary key
            key = rows['No']
            data[key] = rows
    # Open a json writer, and use the json.dumps()
    # function to dump data
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
        jsonf.write(json.dumps(data, indent=4))

# Driver Code
# Decide the two file paths according to your
# computer system
csvFilePath = r'Names.csv'
jsonFilePath = r'Names.json'
# Call the make_json function
make_json(csvFilePath, jsonFilePath)
For more information check out https://www.geeksforgeeks.org/convert-csv-to-json-using-python/

Identify root folder in a file system in Python

I have a recursive method that traverses a file system structure and creates a dictionary from it.
This is the code:
def path_to_dict(path):
    d = {'name': os.path.basename(path)}
    if os.path.isdir(path):
        d['type'] = "directory"
        d['path'] = os.path.relpath(path).strip('..\\').replace('\\','/')
        d['children'] = [path_to_dict(os.path.join(path, x)) for x in os.listdir(path)]
    else:
        d['type'] = "file"
        d['path'] = os.path.relpath(path).strip('..\\').replace('\\','/')
        with open(path, 'r', encoding="utf-8", errors='ignore') as myfile:
            content = myfile.read().splitlines()
        d['content'] = content
    return d
At the moment, it checks if it is a folder then puts the keys name, type, path and children where children is an array which can contain further folders or files. If it is a file it has the keys name, type, path and content.
After converting it to JSON, the final structure is like this.
{
"name": "nw",
"type": "directory",
"path": "Parsing/nw",
"children": [{
"name": "New folder",
"type": "directory",
"path": "Parsing/nw/New folder",
"children": [{
"name": "abc",
"type": "directory",
"path": "Parsing/nw/New folder/abc",
"children": [{
"name": "text2.txt",
"type": "file",
"path": "Parsing/nw/New folder/abc/text2.txt",
"content": ["abc", "def", "dfg"]
}]
}, {
"name": "text2.txt",
"type": "file",
"path": "Parsing/nw/New folder/text2.txt",
"content": ["abc", "def", "dfg"]
}]
}, {
"name": "text1.txt",
"type": "file",
"path": "Parsing/nw/text1.txt",
"content": ["aaa "]
}, {
"name": "text2.txt",
"type": "file",
"path": "Parsing/nw/text2.txt",
"content": []
}]
}
Now I want the script to always set the type in only the root folder to the value root. How can I do this?
I think you want something similar to the following implementation. The top-level folder (or file) passed to the function will contain "type": "root", and the child elements won't contain a "type" key at all.
def path_to_dict(path, child=False):
    d = {'name': os.path.basename(path)}
    if os.path.isdir(path):
        if not child:
            d['type'] = "root"
        d['path'] = os.path.relpath(path).strip('..\\').replace('\\','/')
        d['children'] = [path_to_dict(os.path.join(path, x), child=True) for x in os.listdir(path)]
    else:
        if not child:
            d['type'] = "root"
        d['path'] = os.path.relpath(path).strip('..\\').replace('\\','/')
        with open(path, 'r', encoding="utf-8", errors='ignore') as myfile:
            content = myfile.read().splitlines()
        d['content'] = content
    return d
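If the children should keep their "directory" and "file" types while only the starting folder is marked as "root", a variant along these lines might be closer to what the question asks for. It is a sketch under assumptions: the `is_root` parameter is made up, the "path" key is left out for brevity, and directory entries are sorted only to make the output predictable. The demo builds a tiny tree in a temporary directory.

```python
import os
import tempfile


def path_to_dict(path, is_root=True):
    """Describe `path` as a nested dict; only the top-level directory gets type "root"."""
    d = {"name": os.path.basename(path)}
    if os.path.isdir(path):
        d["type"] = "root" if is_root else "directory"
        d["children"] = [
            path_to_dict(os.path.join(path, entry), is_root=False)
            for entry in sorted(os.listdir(path))
        ]
    else:
        d["type"] = "file"
        with open(path, "r", encoding="utf-8", errors="ignore") as f:
            d["content"] = f.read().splitlines()
    return d


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "sub", "text1.txt"), "w", encoding="utf-8") as f:
        f.write("aaa\n")
    tree = path_to_dict(tmp)
print(tree["type"], tree["children"][0]["type"])
```

Recursive calls pass `is_root=False`, so the flag is true only for the call the user makes.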

Update a specific key in JSON Array using PYTHON

I have a JSON file which has some key-value pairs in arrays. I need to replace the value of the key id with a value stored in a variable called var1.
The problem is that when I run my Python code, it adds a new key-value pair outside the inner array instead of replacing the existing one:
PYTHON SCRIPT:
import json
import sys
var1 = "abcdefghi"
with open('C:\\Projects\\scripts\\input.json', 'r+') as f:
    json_data = json.load(f)
    json_data['id'] = var1
    f.seek(0)
    f.write(json.dumps(json_data))
    f.truncate()
INPUT JSON:
{
"channel": "AT",
"username": "Maintenance",
"attachments": [
{
"fallback":"[Urgent]:",
"pretext":"[Urgent]:",
"color":"#D04000",
"fields":[
{
"title":"SERVERS:",
"id":"popeye",
"short":false
}
]
}
]
}
OUTPUT:
{
"username": "Maintenance",
"attachments": [
{
"color": "#D04000",
"pretext": "[Urgent]:",
"fallback": "[Urgent]:",
"fields": [
{
"short": false,
"id": "popeye",
"title": "SERVERS:"
}
]
}
],
"channel": "AT",
"id": "abcdefghi"
}
The following will update the id inside fields:
json_data['attachments'][0]['fields'][0]['id'] = var1
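The indexed path works when the position of the key is known. If the nesting can vary, one possible alternative is to walk the structure and replace the key wherever it already exists. This is a sketch, not part of the answer above: `replace_key` is a hypothetical helper, and the data is a trimmed copy of the input from the question.

```python
def replace_key(node, key, new_value):
    """Recursively set `key` to `new_value` wherever it already exists in dicts/lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                node[k] = new_value      # replacing a value does not resize the dict
            else:
                replace_key(v, key, new_value)
    elif isinstance(node, list):
        for item in node:
            replace_key(item, key, new_value)
    return node


json_data = {
    "channel": "AT",
    "attachments": [
        {"fields": [{"title": "SERVERS:", "id": "popeye", "short": False}]}
    ],
}
replace_key(json_data, "id", "abcdefghi")
print(json_data["attachments"][0]["fields"][0]["id"])
```

Unlike `json_data['id'] = var1`, this never adds a new top-level key; it only touches keys that are already present.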
