Extract multiple object values from JSON file and save to .txt file? - python

I have a JSON file with 20 objects, each containing a resource parameter with an associated value. I would like to extract the value of resource for each object, and save that value as a line in a txt file.
The structure of the JSON is:
"objects": [
{"created": "2020-10-04", "name": "john", "resource": "api/john/",}
{"created": "2020-10-04", "name": "paul", "resource": "api/paul/",}
{"created": "2020-10-04", "name": "george", "resource": "api/george/",}
{"created": "2020-10-04", "name": "ringo", "resource": "api/ringo/",}
]
So far I have the following code; however, it can only get the resource value from a single object, and it does not let me write the values to a txt file using Python.
with open(input_json) as json_file:
    data = json.load(json_file)

resource = (data["objects"][1]["resource"])
values = resource.items()
k = {str(key): str(value) for key, value in values}

with open('resource-list.txt', 'w') as resource_file:
    resource_file.write(k)

You have to iterate over the list:
txtout = ""
with open(input_json) as json_file:
    data = json.load(json_file)
    objects = data["objects"]
    for jobj in objects:
        txtout = txtout + jobj["resource"] + "\n"

with open('resource-list.txt', 'w') as resource_file:
    resource_file.write(txtout)

Hi there, new Pythonista!
Well, the thing you missed here is the part where you iterate over your JSON objects.
with open(input_json) as json_file:
    data = json.load(json_file)
    resource = (data["objects"][1]["resource"])  # right here you simply took the second object (which is the [1] position)
A decent fix would be:
with open(input_json) as json_file:
    data = json.load(json_file)

all_items = []  # let's keep all the resource values here
for item in data["objects"]:  # iterate over all the items
    all_items.append(item["resource"])  # push the necessary info

# let's concat every item into one string; since it's only 20 items, it will not make our buffer explode
to_write = "\n".join(all_items)

with open("resource-list.txt", "w") as f:
    f.write(to_write)
and we’re done!
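A slightly more compact variant of the same idea (a sketch, assuming the same "objects" layout and the input_json variable from the question) builds the lines with a list comprehension and uses writelines:

import json

with open(input_json) as json_file:
    data = json.load(json_file)

# one output line per object; writelines does not add newlines, so append them here
lines = [obj["resource"] + "\n" for obj in data["objects"]]

with open("resource-list.txt", "w") as resource_file:
    resource_file.writelines(lines)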

Related

CSV to JSON of Lists

Currently, I have a CSV file with the following example --
File  skill   experience  overall_experience
1     Java    1.5         3
1     Python  1.0         3
1     SQL     0.5         3
There are multiple entries like this for many such files, but I need to merge the skills and their respective experience values into lists under a single key per file, something like this:
{
    "1": {
        "file": "1",
        "skill": ["Java", "Python", "SQL"],
        "experience": [1.5, 1.0, 0.5],
        "Overall_exp": 3.0
    }
}
I tried some Python code for this, but it gives me only the value of the last skill and last experience (and not the whole thing as a list).
Here is the code I was using --
import csv
import json

# Function to convert a CSV to JSON
# Takes the file paths as arguments
def make_json(csvFilePath, jsonFilePath):
    # create a dictionary
    data = {}
    # Open a csv reader called DictReader
    with open(csvFilePath, encoding='utf-8') as csvf:
        csvReader = csv.DictReader(csvf)
        # Convert each row into a dictionary
        # and add it to data
        for rows in csvReader:
            # Assuming a column named 'file' to
            # be the primary key
            key = rows['file']
            data[key] = rows
    # Open a json writer, and use the json.dumps()
    # function to dump data
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
        jsonf.write(json.dumps(data, indent=4))

# Decide the two file paths according to your
# computer system
csvFilePath = 'skill_matrix.csv'
jsonFilePath = 'skill_matrix.json'

# Call the make_json function
make_json(csvFilePath, jsonFilePath)
The output that I get here is this --
{
    "1": {
        "file": "1",
        "skill": "SQL",
        "experience": "0.5",
        "Overall_exp": "3.0"
    }
}
How can I convert it to the former json format and not the latter?
You can use pandas to read your csv, group by File and export to json:
import numpy as np
import pandas as pd

df = pd.read_csv(your_csv)
df = df.groupby('File', as_index=False).agg({'skill': list, 'experience': list, 'overall_experience': np.mean})
print(df.to_json(orient='index', indent=4))
Note: you can specify the aggregation functions for your columns in a dictionary
Output:
{
"0":{
"File":1,
"skill":[
"Java",
"Python",
"SQL"
],
"experience":[
1.5,
1.0,
0.5
],
"overall_experience":3.0
}
}
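To illustrate the note above about the aggregation dictionary: each column name maps to the function used to combine that column's values per group, so you can keep lists for some columns and reduce others. A small self-contained sketch with made-up data (not the question's CSV):

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'File': [1, 1, 1],
    'skill': ['Java', 'Python', 'SQL'],
    'experience': [1.5, 1.0, 0.5],
    'overall_experience': [3, 3, 3],
})

agg_spec = {
    'skill': list,                  # collect every skill into a list
    'experience': list,             # collect every experience value into a list
    'overall_experience': np.mean,  # reduce to a single number; 'max' or 'sum' would work too
}
print(df.groupby('File', as_index=False).agg(agg_spec))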
I think that loading everything into Pandas first and then narrowing the data down is cleaner and easier. You can use the following code for parsing your data into JSON files:
import pandas as pd
import json

# Load the CSV into Pandas
df = pd.read_csv('1.csv', header=0)
data = df.to_dict(orient='list')

# Delete / change as you wish
data['File'] = str(data['File'][0])
data['overall_experience'] = data['overall_experience'][0]

# Save as json
with open('1.json', 'w', encoding='utf-8') as jsonf:
    jsonf.write(json.dumps(data, indent=4))
Result (1.json)
{
"File": "1",
"skill": [
"Java",
"Python",
"SQL"
],
"experience": [
1.5,
1.0,
0.5
],
"overall_experience": 3
}
I suppose that you have multiple file ids in the CSV file; your given example is quite minimal. Anyhow, you can then create a master dictionary and add the smaller ones to it as follows:
import pandas as pd
import json

# Load the CSV into Pandas
df = pd.read_csv('1.csv', header=0)

# Master dictionary
master_dict = {}
for idx, file_id in enumerate(df["File"].unique()):
    data = df[df['File'] == file_id].to_dict(orient='list')
    # Delete / change as you wish
    data['File'] = str(data['File'][0])
    data['overall_experience'] = data['overall_experience'][0]
    master_dict[idx] = data

# Save as json
with open('1.json', 'w', encoding='utf-8') as jsonf:
    jsonf.write(json.dumps(master_dict, indent=4))
Result (1.json)
{
"0": {
"File": "1",
"skill": [
"Java",
"Python",
"SQL"
],
"experience": [
1.5,
1.0,
0.5
],
"overall_experience": 3
},
"1": {
"File": "2",
"skill": [
"Java",
"Python"
],
"experience": [
2.0,
2.5
],
"overall_experience": 1
}
}
If you don't want to use Pandas, you could try:
import csv
import json

def make_json(csvfile_path, jsonfile_path):
    data = {}
    with open(csvfile_path, "r") as csvfile:
        next(csvfile)  # Skip header line
        for row in csv.reader(csvfile):
            fdata = data.setdefault(row[0], {"file": row[0]})
            fdata.setdefault("skill", []).append(row[1])
            fdata.setdefault("experience", []).append(float(row[2]))
            fdata.setdefault("overall_experience", []).append(float(row[3]))
    with open(jsonfile_path, "w") as jsonfile:
        json.dump(data, jsonfile)
The main difference to your approach is the explicit structuring of the inner dicts: values are lists (except for the 'file' key). The dict.setdefault() is great here: You can set a value for a key if it isn't in the dict, and get the value back (either the newly set one or the existing). So you can put a list in the dict, get it back, and can immediately .append() to it.
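As a tiny standalone illustration of that setdefault pattern (hypothetical data, unrelated to the CSV):

data = {}

# the first call inserts {"file": "1"} under key "1" and returns it; later calls return the existing dict
fdata = data.setdefault("1", {"file": "1"})
fdata.setdefault("skill", []).append("Java")
fdata.setdefault("skill", []).append("Python")

print(data)  # {'1': {'file': '1', 'skill': ['Java', 'Python']}}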
If you want to use a csv.DictReader:
def make_json(csvfile_path, jsonfile_path):
    data = {}
    with open(csvfile_path, "r") as csvfile:
        for row in csv.DictReader(csvfile):
            fdata = data.setdefault(row["file"], {"file": row["file"]})
            for key, value in list(row.items())[1:]:
                fdata.setdefault(key, []).append(
                    value if key == "skill" else float(value)
                )
    with open(jsonfile_path, "w") as jsonfile:
        json.dump(data, jsonfile)
(I didn't use it in the version above, since I wasn't sure about the actual column names.)

How to delete an element in a json file python

I am trying to delete an element in a JSON file. Here is my JSON file:
before:
{
    "names": [
        {
            "PrevStreak": false,
            "Streak": 0,
            "name": "Brody B#3719",
            "points": 0
        },
        {
            "PrevStreak": false,
            "Streak": 0,
            "name": "XY_MAGIC#1111",
            "points": 0
        }
    ]
}
after running script:
{
    "names": [
        {
            "PrevStreak": false,
            "Streak": 0,
            "name": "Brody B#3719",
            "points": 0
        }
    ]
}
How would I do this in Python? The file is stored locally, and I am deciding which element to delete by the name in each element.
Thanks
I would load the file, remove the item, and then save it again. Example:
import json

with open("filename.json") as f:
    data = json.load(f)

data["names"].pop(1)  # or iterate through the entries to find the matching name

with open("filename.json", "w") as f:
    json.dump(data, f)
You will have to read the file, convert it to a native Python data type (e.g. a dictionary), then delete the element and save the file. In your case something like this could work:
import json

filepath = 'data.json'
with open(filepath, 'r') as fp:
    data = json.load(fp)

del data['names'][1]

with open(filepath, 'w') as fp:
    json.dump(data, fp)
Try this:
# importing the module
import ast

# reading the data from the file
with open('dictionary.txt') as f:
    data = f.read()

print("Data type before reconstruction : ", type(data))

# reconstructing the data as a dictionary
a_dict = ast.literal_eval(data)

# keep only the entries whose name is not "XY_MAGIC#1111"
a_dict = {"names": [a for a in a_dict["names"] if a.get("name") != "XY_MAGIC#1111"]}
import json

with open("test.json", 'r') as f:
    data = json.loads(f.read())

names = data.get('names')
for idx, name in enumerate(names):
    if name['name'] == 'XY_MAGIC#1111':
        del names[idx]
        break
print(names)
In order to read the file, the best approach is using the with statement, after which you can just use Python's json library to convert the JSON string to a Python dict. Once you have the dict you can access the values and do your operations as required. You can then convert it back to JSON using json.dumps() and save it, as sketched below.
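A minimal sketch of that write-back step, assuming the same test.json as above:

import json

with open("test.json", "r") as f:
    data = json.loads(f.read())

# keep only the entries whose name does not match
data["names"] = [n for n in data["names"] if n["name"] != "XY_MAGIC#1111"]

with open("test.json", "w") as f:
    f.write(json.dumps(data, indent=4))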
This does the right thing using the Python json module, and pretty-prints the JSON back to the file afterwards:
import json

jsonpath = '/path/to/json/file.json'

with open(jsonpath) as file:
    j = json.loads(file.read())

names_to_remove = ['XY_MAGIC#1111']

# iterate over a copy so that removing elements from the original list is safe
for element in list(j['names']):
    if element['name'] in names_to_remove:
        j['names'].remove(element)

with open(jsonpath, 'w') as file:
    file.write(json.dumps(j, indent=4))

Getting value from a JSON file based on condition

In Python I'm trying to get the value(s) of the key "relativePaths" from a JSON element if that element contains the value "concept" for the key "tags". The JSON file has the following format:
]
},
{
"fileName": "#Weizman.2011",
"relativePath": "Text/#Weizman.2011.md",
"tags": [
"text",
"concept"
],
"frontmatter": {
"authors": "Weizman",
"year": 2011,
"position": {
"start": {
"line": 0,
"col": 0,
"offset": 0
},
"end": {
"line": 4,
"col": 3,
"offset": 120
}
}
},
"aliases": [
"The least of all possible evils - humanitarian violence from Arendt to Gaza"
],
I have tried the following codes:
import json

with open("/Users/metadata.json") as jsonFile:
    data = json.load(jsonFile)

for s in range(len(data)):
    if 'tags' in s in range(len(data)):
        if data[s]["tags"] == "concept":
            files = data[s]["relativePaths"]
            print(files)
Which results in the error message:
TypeError: argument of type 'int' is not iterable
I then tried:
with open("/Users/metadata.json") as jsonFile:
data = json.load(jsonFile)
for s in str(data):
if 'tags' in s in str(data):
print(s["relativePaths"])
That code seems to work. But I don't get any output from the print command. What am I doing wrong?
Assuming your JSON is a list of the type you put in your question, you can get those values like this:
with open("/Users/metadata.json") as jsonFile:
data = json.load(jsonFile)
for item in data: # Assumes the first level of the json is a list
if ('tags' in item) and ('concept' in item['tags']): # Assumes that not all items have a 'tags' entry
print(item['relativePaths']) # Will trigger an error if relativePaths is not in the dictionary
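If some entries might lack the key (note the sample JSON uses relativePath, singular), a hedged variant with dict.get avoids the KeyError mentioned in the comment:

for item in data:
    if 'concept' in item.get('tags', []):
        print(item.get('relativePath', '<no relativePath>'))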
Figured it out:
import json

f = open("/Users/metadata.json")

# returns JSON object as
# a dictionary
data = json.load(f)

# Iterating through the json
# list
for i in data:
    if "tags" in i:
        if "concept" in i["tags"]:
            print(i["relativePaths"])

# Closing file
f.close()
I think this will do what you want. It is more "pythonic" because it doesn't use numerical indices to access elements of the list, which makes it easier to write and read.
import json

with open("metadata.json") as jsonFile:
    data = json.load(jsonFile)

for elem in data:
    if 'tags' in elem and 'concept' in elem['tags']:
        files = elem["relativePath"]
        print(files)

How can I delete an item from a json using the below method?

I need to remove data from a JSON file; at the minute I am using the following code:
import json

with open('E:/file/timings.json', 'r+') as f:
    qe = json.load(f)
    for item in qe['times']:
        if item['Proc'] == 'APS':
            print(f'{item["Num"]}')
            del item
    json.dump(qe, f, indent=4, sort_keys=False, ensure_ascii=False)
This doesn't delete anything from the JSON. Here is a small example of my JSON file:
{
    "times": [
        {
            "Num": "12345678901234567",
            "Start_Time": "2016-12-14 15:54:35",
            "Proc": "UPD",
        },
        {
            "Num": "12345678901234567",
            "Start_Time": "2016-12-08 15:34:05",
            "Proc": "APS",
        },
        {
            "Num": "12345678901234567",
            "Start_Time": "2016-11-30 11:20:21",
            "Proc": "Dev,
I would like it to look like this:
{
    "times": [
        {
            "Num": "12345678901234567",
            "Start_Time": "2016-12-14 15:54:35",
            "Proc": "UPD",
        },
        {
            "Num": "12345678901234567",
            "Start_Time": "2016-11-30 11:20:21",
            "Proc": "Dev,
As you can see, the portion containing APS as the process has been removed.
You could read your initial JSON, create a new one that doesn't contain the items whose 'Proc' is equal to 'APS' (here new_json), and then overwrite your JSON file with that new_json.
import json

content = json.loads(open('timings.json', 'r').read())

new_json = {'times': []}
for item in content['times']:
    if item['Proc'] != 'APS':
        new_json['times'].append(item)

file = open('timings.json', 'w')
file.write(json.dumps(new_json, indent=4, sort_keys=False, ensure_ascii=False))
file.close()
It is not good practice to delete elements while iterating over the list.
Use:
import json

with open('E:/file/timings.json', 'r') as f:
    qe = json.load(f)

qe['times'] = [item for item in qe['times'] if item['Proc'] != 'APS']  # Delete the required elements.

with open('E:/file/timings.json', 'w') as f:
    json.dump(qe, f, indent=4, sort_keys=False, ensure_ascii=False)
del, as you're using it, removes the variable item from your session but leaves the actual item untouched in the data structure. You need to explicitly remove whatever item is pointing to from your data structure. Also, you want to avoid deleting items from a list while you are iterating over that list. You should recreate your entire list:
qe['times'] = [item for item in qe['times'] if item['Proc'] != 'APS']
You can use a helper function if you need to print:

def keep_item(thing):
    if thing['Proc'] == 'APS':
        print(thing['Num'])
        return False
    else:
        return True

qe['times'] = [item for item in qe['times'] if keep_item(item)]
You can use the method below to remove the elements from the list (iterating in reverse so that popping by index does not skip entries):

for i, item in reversed(list(enumerate(qe['times']))):
    if item['Proc'] == 'APS':
        qe['times'].pop(i)

and then write back to the JSON file.
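The write-back step could look like this (a sketch reusing the path and dump arguments from the question):

import json

with open('E:/file/timings.json', 'w') as f:
    json.dump(qe, f, indent=4, sort_keys=False, ensure_ascii=False)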

Python JSON add Key-Value pair

I'm trying to add key-value pairs to an existing JSON file. I am able to add a key at the parent level; how do I add a value to each of the child items?
JSON file:
{
"students": [
{
"name": "Hendrick"
},
{
"name": "Mikey"
}
]
}
Code:
import json

with open("input.json") as json_file:
    json_decoded = json.load(json_file)

json_decoded['country'] = 'UK'

with open("output.json", 'w') as json_file:
    for d in json_decoded[students]:
        json.dump(json_decoded, json_file)
Expected Results:
{
"students": [
{
"name": "Hendrick",
"country": "UK"
},
{
"name": "Mikey",
"country": "UK"
}
]
}
You can do the following in order to manipulate the dict the way you want:
for s in json_decoded['students']:
    s['country'] = 'UK'
json_decoded['students'] is a list of dictionaries that you can simply iterate and update in a loop. Now you can dump the entire object:
with open("output.json", 'w') as json_file:
json.dump(json_decoded, json_file)
import json

with open("input.json", 'r') as json_file:
    json_decoded = json.load(json_file)

    for element in json_decoded['students']:
        element['country'] = 'UK'

    with open("output.json", 'w') as json_out_file:
        json.dump(json_decoded, json_out_file)
1. Opened a JSON file, i.e. input.json
2. Iterated through each of its elements
3. Added a key named "country" with the value "UK" to each element
4. Opened a new JSON file and wrote the modified JSON to it
Edit:
Moved writing to the output file inside the first with segment. The issue with the earlier implementation is that json_decoded will not be instantiated if opening input.json fails, and hence writing to the output would raise an exception: NameError: name 'json_decoded' is not defined
This gives [None, None] but updates the dict:
a = {'students': [{'name': 'Hendrick'}, {'name': 'Mikey'}]}
[i.update({'country':'UK'}) for i in a['students']]
print(a)
