How to add to python dictionary without needing a key - python

I am scraping some information off the web and want to show write the information into a JSON file with this format:
[
{
"name" : "name1",
"value" : 1
},
{
"name" : "name2",
"value" : 2
},
{
"name" : "name3",
"value" : 3
},
{
"name" : "name4",
"value" : 4
},
{
"name" : "name5",
"value" : 5
}
]
I am looping through everything I am scraping but don't know how to convert that information to this format. I tried to create a dictionary and then add to it after every loop but it does not give me the output I want.
dictionary = None
name = None
value = None
for item in someList:
name = item.name
value = item.value
dictionary[""] = {"name": name, "value": value}
with open("data.json", "w") as file:
json.dump(dictionary, file, indent=4)

Try this:
import json
myList = [{"name": item.name, "value": item.value} for item in someList]
with open("data.json", "w") as file:
json.dump(myList, file, indent=4)

The answer was simpler than I thought. I just needed to make a list of dictionaries and use that list in the json.dumps() function. Like this:
myList = list()
name = None
value = None
for item in someList:
name = item.name
value = item.value
myList.append({"name": name, "value": value})
with open("data.json", "w") as file:
json.dump(myList, file, indent=4)

The format you show is a list not a dictionary. So you can make a list and append to it the different dictionaries.
arr = []
for item in someList:
dictionary.append({"name": item.name, "value": item.value})
with open("data.json", "w") as file:
json.dump(array, file, indent=4)

Related

Getting value from a JSON file based on condition

In python I'm trying to get the value(s) of the key "relativePaths" from a JSON element if that element contains the value "concept" for the key "tags". The JSON file has the following format.
]
},
{
"fileName": "#Weizman.2011",
"relativePath": "Text/#Weizman.2011.md",
"tags": [
"text",
"concept"
],
"frontmatter": {
"authors": "Weizman",
"year": 2011,
"position": {
"start": {
"line": 0,
"col": 0,
"offset": 0
},
"end": {
"line": 4,
"col": 3,
"offset": 120
}
}
},
"aliases": [
"The least of all possible evils - humanitarian violence from Arendt to Gaza"
],
I have tried the following codes:
import json
with open("/Users/metadata.json") as jsonFile:
data = json.load(jsonFile)
for s in range(len(data)):
if 'tags' in s in range(len(data)):
if data[s]["tags"] == "concept":
files = data[s]["relativePaths"]
print(files)
Which results in the error message:
TypeError: argument of type 'int' is not iterable
I then tried:
with open("/Users/metadata.json") as jsonFile:
data = json.load(jsonFile)
for s in str(data):
if 'tags' in s in str(data):
print(s["relativePaths"])
That code seems to work. But I don't get any output from the print command. What am I doing wrong?
Assuming your json is a list of the type you put on your question, you can get those values like this:
with open("/Users/metadata.json") as jsonFile:
data = json.load(jsonFile)
for item in data: # Assumes the first level of the json is a list
if ('tags' in item) and ('concept' in item['tags']): # Assumes that not all items have a 'tags' entry
print(item['relativePaths']) # Will trigger an error if relativePaths is not in the dictionary
Figured it
import json
f = open("/Users/metadata.json")
# returns JSON object as
# a dictionary
data = json.load(f)
# Iterating through the json
# list
for i in data:
if "tags" in i:
if "concept" in i["tags"]:
print(i["relativePaths"])
# Closing file
f.close()
I think this will do what you want. It is more "pythonic" because it doesn't use numerical indices to access elements of the list — making it easier to write and read).
import json
with open("metadata.json") as jsonFile:
data = json.load(jsonFile)
for elem in data:
if 'tags' in elem and 'concept' in elem['tags']:
files = elem["relativePath"]
print(files)

JSONDecodeError: Expecting value: line 2 column 13 (char 15)

I have a nested json file which I got from json.
I am trying to convert it in to csv through python code.
I tried all the possible way to convert it to csv but couldn't succeed.
I also followed previous question and solution but didn't work for me.
My json format is
{
"d1" : ("value1"),
"d2" : (value2-int),
"d3" : [
{
"sub-d1" : sub-value1(int),
"sub-d2" : sub-value2(int),
"sub-d3" : sub-value3(int),
"sub-d4" : [
{
"sub-sub-d1" : "sub-sub-value3",
"sub-sub-d2" : sub-value3(int)
},
{
"sub-sub-d1" : sub-sub-value3(int),
"sub-sub-d2" : "sub-sub-value3"}
]
],
"sub-d5" : "sub-value4",
"sub-d6" : "sub-value5"
}
],
"d4" : "value3",
"d5" : "value4",
"d6" : "value5,
"d7" : "value6"
}
{ another entry with same pattern..and so on}
Some of the value and sub value has integers and str + int.
What I tried
import json
import csv
import requests
with open('./data/inverter.json', 'r') as myfile:
json_data = myfile.read()
def get_leaves(item, key=None):
if isinstance(item, dict):
leaves = {}
for i in item.keys():
leaves.update(get_leaves(item[i], i))
return leaves
elif isinstance(item, list):
leaves = {}
for i in item:
leaves.update(get_leaves(i, key))
return leaves
else:
return {key : item}
# First parse all entries to get the complete fieldname list
fieldnames = set()
for entry in json_data:
fieldnames.update(get_leaves(entry).keys())
with open('output.csv', 'w', newline='') as f_output:
csv_output = csv.DictWriter(f_output, fieldnames=sorted(fieldnames))
csv_output.writeheader()
csv_output.writerows(get_leaves(entry) for entry in json_data)
This one saves all my data in single column with split values.
I tried to use :
https://github.com/vinay20045/json-to-csv.git
but this also didn't work.
I also tried to parse and do simple trick with following code:
with open("./data/inverter.json") as data_file:
data = data_file.read()
#print(data)
data_content = json.loads(data)
print(data_content)
but it throws an error : 'JSONDecodeError: Expecting value: line 2 column 13 (char 15)'
Can any one help me to convert my nested json to csv ?
It would be appreciated.
Thank you
It looks like the NumberInt(234234) issue you describe was a bug in MongoDB: how to export mongodb without any wrapping with NumberInt(...)?
If you cannot fix it by upgrading MongoDB, I can recommend preprocessing the data with regular expressions and parsing it as regular JSON after that.
For the sake of example, let's say you've got "test.json" that looks like this, which is valid except for the NumberInt(...) stuff:
{
"d1" : "value1",
"d2" : NumberInt(1234),
"d3" : [
{
"sub-d1" : 123,
"sub-d2" : 123,
"sub-d3" : 123,
"sub-d4" : [
{
"sub-sub-d1" : "sub-sub-value3",
"sub-sub-d2" : NumberInt(123)
},
{
"sub-sub-d1" : 43242,
"sub-sub-d2" : "sub-sub-value3"
}
]
}
],
"d4" : "value3",
"d5" : "value4",
"d6" : "value5",
"d7" : "value6"
}
You could import this into Python as follows:
import re
import json
with open("test.json") as f:
data = f.read()
# This regular expression finds/replaces the NumberInt bits with just the contents
fixed_data = re.sub(r"NumberInt\((\d+)\)", r"\1", data)
loaded_data = json.loads(fixed_data)
print(json.dumps(loaded_data, indent=4))

convert csv file to multiple nested json format

I have written a code to convert csv file to nested json format. I have multiple columns to be nested hence assigning separately for each column. The problem is I'm getting 2 fields for the same column in the json output.
import csv
import json
from collections import OrderedDict
csv_file = 'data.csv'
json_file = csv_file + '.json'
def main(input_file):
csv_rows = []
with open(input_file, 'r') as csvfile:
reader = csv.DictReader(csvfile, delimiter='|')
for row in reader:
row['TYPE'] = 'REVIEW', # adding new key, value
row['RAWID'] = 1,
row['CUSTOMER'] = {
"ID": row['CUSTOMER_ID'],
"NAME": row['CUSTOMER_NAME']
}
row['CATEGORY'] = {
"ID": row['CATEGORY_ID'],
"NAME": row['CATEGORY']
}
del (row["CUSTOMER_NAME"], row["CATEGORY_ID"],
row["CATEGORY"], row["CUSTOMER_ID"]) # deleting since fields coccuring twice
csv_rows.append(row)
with open(json_file, 'w') as f:
json.dump(csv_rows, f, sort_keys=True, indent=4, ensure_ascii=False)
f.write('\n')
The output is as below:
[
{
"CATEGORY": {
"ID": "1",
"NAME": "Consumers"
},
"CATEGORY_ID": "1",
"CUSTOMER_ID": "41",
"CUSTOMER": {
"ID": "41",
"NAME": "SA Port"
},
"CUSTOMER_NAME": "SA Port",
"RAWID": [
1
]
}
]
I'm getting 2 entries for the fields I have assigned using row[''].
Is there any other way to get rid of this? I want only one entry for a particular field in each record.
Also how can I convert the keys to lower case after reading from csv.DictReader(). In my csv file all the columns are in upper case and hence I'm using the same to assign. But I want to convert all of them to lower case.
In order to convert the keys to lower case, it would be simpler to generate a new dict per row. BTW, it should be enough to get rid of the duplicate fields:
for row in reader:
orow = collection.OrderedDict()
orow['type'] = 'REVIEW', # adding new key, value
orow['rawid'] = 1,
orow['customer'] = {
"id": row['CUSTOMER_ID'],
"name": row['CUSTOMER_NAME']
}
orow['category'] = {
"id": row['CATEGORY_ID'],
"name": row['CATEGORY']
}
csv_rows.append(orow)

How can I delete an item from a json using the below method?

I need to remove data from a json, at the minute i am using the following code:
import json
with open('E:/file/timings.json', 'r+') as f:
qe = json.load(f)
for item in qe['times']:
if item['Proc'] == 'APS':
print(f'{item["Num"]}')
del item
json.dump(qe, f, indent=4, sort_keys=False, ensure_ascii=False)
This doesn't delete anything from the JSON, here is a small example of my JSON file
{
"times": [
{
"Num": "12345678901234567",
"Start_Time": "2016-12-14 15:54:35",
"Proc": "UPD",
},
{
"Num": "12345678901234567",
"Start_Time": "2016-12-08 15:34:05",
"Proc": "APS",
},
{
"Num": "12345678901234567",
"Start_Time": "2016-11-30 11:20:21",
"Proc": "Dev,
i would like it to look like this:
{
"times": [
{
"Num": "12345678901234567",
"Start_Time": "2016-12-14 15:54:35",
"Proc": "UPD",
},
{
"Num": "12345678901234567",
"Start_Time": "2016-11-30 11:20:21",
"Proc": "Dev,
as you can see the portion containing APS as the process has been removed
You could save your initial json and then create new one that doesn't contain items which 'Proc' is equal to 'APS' (here new_json) and then overwrite your json file with that new_json.
import json
content = json.loads(open('timings.json', 'r').read())
new_json = {'times': []}
for item in content['times']:
if item['Proc'] != 'APS':
new_json['times'].append(item)
file = open('timings.json', 'w')
file.write(json.dumps(new_json, indent=4, sort_keys=False, ensure_ascii=False))
file.close()
It is not a good practice to delete element while iterating the list.
Use:
import json
with open('E:/file/timings.json', 'r') as f:
qe = json.load(f)
qe = [item for item in qe['times'] if item['Proc'] != 'APS'] #Delete Required element.
with open('E:/file/timings.json', 'w') as f:
json.dump(qe, f, indent=4, sort_keys=False, ensure_ascii=False)
del as you're using it, removes the variable item from your session, but leaves the actual item untouched in the data structure. You need to explicitly remove whatever item is pointing to from your data structure. Also, you want to avoid deleting items from a list while you are iterating over said list. You should recreate your entire list:
qe['times'] = [item for item in qe['times'] if item['Proc'] != 'APS']
You can use a method if you need to print:
def keep_item(thing):
if item['Proc'] == 'APS':
print thing['Num']
return False
else:
return True
qe['times'] = [item for item in qe['times'] if keep_item(item)]
You can use the below method to remove the element from list:
for i,item in enumerate(qe['times']):
if item['Proc'] == 'APS':
qe['times'].pop(i)
and then write back to the JSON file.

Python JSON add Key-Value pair

I'm trying to add key value pairs into the existing JSON file. I am able to concatenate to the parent label, How to add value to the child items?
JSON file:
{
"students": [
{
"name": "Hendrick"
},
{
"name": "Mikey"
}
]
}
Code:
import json
with open("input.json") as json_file:
json_decoded = json.load(json_file)
json_decoded['country'] = 'UK'
with open("output.json", 'w') as json_file:
for d in json_decoded[students]:
json.dump(json_decoded, json_file)
Expected Results:
{
"students": [
{
"name": "Hendrick",
"country": "UK"
},
{
"name": "Mikey",
"country": "UK"
}
]
}
You can do the following in order to manipulate the dict the way you want:
for s in json_decoded['students']:
s['country'] = 'UK'
json_decoded['students'] is a list of dictionaries that you can simply iterate and update in a loop. Now you can dump the entire object:
with open("output.json", 'w') as json_file:
json.dump(json_decoded, json_file)
import json
with open("input.json", 'r') as json_file:
json_decoded = json.load(json_file)
for element in json_decoded['students']:
element['country'] = 'UK'
with open("output.json", 'w') as json_out_file:
json.dump(json_decoded, json_out_file)
opened a json file i.e. input.json
iterated through each of its element
add a key named "country" and dynamic value "UK", to each element
opened a new json file with the modified JSON.
Edit:
Moved writing to output file inside to first with segment. Issue with earlier implemenation is that json_decoded will not be instantiated if opening of input.json fails. And hence, writing to output will raise an exception - NameError: name 'json_decoded' is not defined
This gives [None, None] but update the dict:
a = {'students': [{'name': 'Hendrick'}, {'name': 'Mikey'}]}
[i.update({'country':'UK'}) for i in a['students']]
print(a)

Categories

Resources