I have a sample JSON file named a.json.
The JSON data in a.json looks like this:
{
"a cappella": {
"word": "a cappella",
"wordset_id": "5feb6f679a",
"meanings": [
{
"id": "492099d426",
"def": "without musical accompaniment",
"example": "they performed a cappella",
"speech_part": "adverb"
},
{
"id": "0bf8d49e2e",
"def": "sung without instrumental accompaniment",
"example": "they sang an a cappella Mass",
"speech_part": "adjective"
}
]
},
"A.D.": {
"word": "A.D.",
"wordset_id": "b7e9d406a0",
"meanings": [
{
"id": "a7482f3e30",
"def": "in the Christian era",
"speech_part": "adverb",
"synonyms": [
"AD"
]
}
]
},.........
}
As suggested in my previous question, I am looking at how to insert this data into the following tables:
Word: [word, wordset_id]
Meaning: [word, meaning_id, def, example, speech_part]
Synonym: [word, synonym_word]
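For reference, here is a minimal sketch of MySQL table definitions that this layout implies, using the cursor and conn from the DB connection shown further down; the column types and lengths are my own assumptions, not part of the original question, so adjust them to your data:

# Hypothetical DDL for the Word/Meaning/Synonym layout above (types are assumptions).
table_statements = [
    """CREATE TABLE IF NOT EXISTS Word (
        word VARCHAR(100) PRIMARY KEY,
        wordset_id VARCHAR(20)
    )""",
    """CREATE TABLE IF NOT EXISTS Meaning (
        word VARCHAR(100),
        meaning_id VARCHAR(20),
        def TEXT,
        example TEXT,
        speech_part VARCHAR(20)
    )""",
    """CREATE TABLE IF NOT EXISTS Synonym (
        word VARCHAR(100),
        synonym_word VARCHAR(100)
    )""",
]
for statement in table_statements:
    cursor.execute(statement)
conn.commit()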
I tried reading the file as:
import json
with open('a.json') as f:
    d = json.load(f)
When I tried printing all the words:
for word in d:
    print(word)
I got all the words, but I could not get the wordset_id for each of them.
How can I insert the word and wordset_id into the Word table for the JSON format above?
DB connection:
from flask import Flask
from flaskext.mysql import MySQL
app = Flask(__name__)
mysql = MySQL()
app.config['MYSQL_DATABASE_USER'] = 'root'
app.config['MYSQL_DATABASE_PASSWORD'] = 'root'
app.config['MYSQL_DATABASE_DB'] = 'wordstoday'
app.config['MYSQL_DATABASE_HOST'] = 'localhost'
mysql.init_app(app)
conn = mysql.connect()
cursor = conn.cursor()
When you execute this code:
for word in d:
    print(word)
it will only print the keys of the JSON object, not the complete values. Instead, you can do something like this:
for word in d:
    word_obj = d[word]
    wordset_id = word_obj['wordset_id']
    sql = "INSERT INTO Word (word, wordset_id) VALUES (%s, %s)"
    values = (word, wordset_id)
    cursor.execute(sql, values)

    meaning_obj_list = d[word]['meanings']
    for meaning_obj in meaning_obj_list:
        meaning_id = meaning_obj['id']
        definition = meaning_obj['def']
        example = meaning_obj.get('example', None)  # "example" is not guaranteed to be present, so it is safer to extract it this way
        speech_part = meaning_obj['speech_part']
        sql = "INSERT INTO Meaning (word, meaning_id, def, example, speech_part) VALUES (%s, %s, %s, %s, %s)"
        values = (word, meaning_id, definition, example, speech_part)
        cursor.execute(sql, values)

conn.commit()
Also, refrain from using key names such as def, since def is a keyword in Python.
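The question also lists a Synonym table; here is a similar sketch for populating it, assuming the same connection and schema as above (synonyms is optional in the data, so it is read with .get()):

for word in d:
    for meaning_obj in d[word]['meanings']:
        # "synonyms" is only present for some meanings, so default to an empty list
        for synonym in meaning_obj.get('synonyms', []):
            sql = "INSERT INTO Synonym (word, synonym_word) VALUES (%s, %s)"
            cursor.execute(sql, (word, synonym))
conn.commit()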
I'm trying to produce a JSON format for a given entity, and I'm having an issue with the dictionary either overwriting itself or becoming empty. This pulls rows from a table in a MySQL database and attempts to produce a JSON result from the query.
Here is my function:
def detail():
    student = 'John Doe'
    conn = get_db_connection()
    cur = conn.cursor()
    sql = ("""
        select
            a.student_name,
            a.student_id,
            a.student_homeroom_name,
            a.test_id,
            a.datetaken,
            a.datecertified,
            b.request_number
        FROM student_information a
        INNER JOIN homeroom b ON a.homeroom_id = b.homeroom_id
        WHERE a.student_name = '""" + student + """'
        ORDER BY datecertified DESC
        """)
    cur.execute(sql)
    details = cur.fetchall()

    dataset = defaultdict(dict)
    case_dataset = defaultdict(dict)
    case_dataset = dict(case_dataset)

    for student_name, student_id, student_homeroom_name, test_id, datetaken, datecertified, request_number in details:
        dataset[student_name]['student_id'] = student_id
        dataset[student_name]['student_homeroom_name'] = student_homeroom_name
        case_dataset['test_id'] = test_id
        case_dataset['datetaken'] = datetaken
        case_dataset['datecertified'] = datecertified
        case_dataset['request_number'] = request_number
        dataset[student_name]['additional_information'] = case_dataset
        case_dataset.clear()

    dataset = dict(dataset)
    print(dataset)
    cur.close()
    conn.close()
I tried a few different ways, but nothing seems to work. I'm getting nothing in the additional_information key. This is the output:
{
"John Doe": {
"student_id": "1234",
"student_homeroom_name": "HR1",
"additional_information": []
}
}
What I'm expecting is something similar to the JSON below. However, I'm not sure this is even the right structure. Each student will have one or more test_id values, and I will need to iterate through them in my application.
{
"John Doe": {
"student_id": "1234",
"student_homeroom_name": "HR1",
"additional_information": [
{
"test_id": "0987",
"datetaken": "1-1-1970",
"datecertified": "1-2-1970",
"request_number": "5643"
},
{
"test_id": "12343",
"datetaken": "1-1-1980",
"datecertified": "1-2-1980",
"request_number": "39807"
}
]
}
}
Removing the clear() from the function produces this JSON:
{
"John Doe": {
"student_id": "1234",
"student_homeroom_name": "HR1",
"additional_information": [
{
"test_id": "0987",
"datetaken": "1-1-1970",
"datecertified": "1-2-1970",
"request_number": "5643"
},
{
"test_id": "0987",
"datetaken": "1-1-1970",
"datecertified": "1-2-1970",
"request_number": "5643"
}
]
}
}
Dicts and lists are mutable objects, which means they are passed around by reference.
When you write
dataset[student_name]['additional_information'] = case_dataset
case_dataset.clear()
you are storing a reference to case_dataset and then clearing it, so the object stored under additional_information is cleared as well.
Copy it when assigning instead:
dataset[student_name]['additional_information'] = case_dataset.copy()
case_dataset.clear()
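A tiny standalone demonstration of the aliasing effect, using only made-up values:

case = {'test_id': '0987'}
holder = {}

holder['by_reference'] = case       # stores the same object
holder['by_copy'] = case.copy()     # stores an independent snapshot

case.clear()
print(holder['by_reference'])  # {}  - cleared along with `case`
print(holder['by_copy'])       # {'test_id': '0987'}  - unaffected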
Thanks everyone for the guidance and pointing me in the right direction.
I have what I'm looking for now. Based on some of the comments and troubleshooting, I updated my code. Here is what I did:
I added additional_dataset back as a list.
Removed case_dataset = defaultdict(dict) and case_dataset = dict(case_dataset) and replaced them with case_dataset = {}.
Updated dataset[student_name]['additional_information'] = case_dataset to dataset[student_name]['additional_information'] = additional_dataset.
Replaced case_dataset.clear() with case_dataset = {}.
Here is my new code:
def detail():
    student = 'John Doe'
    conn = get_db_connection()
    cur = conn.cursor()
    sql = ("""
        select
            a.student_name,
            a.student_id,
            a.student_homeroom_name,
            a.test_id,
            a.datetaken,
            a.datecertified,
            b.request_number
        FROM student_information a
        INNER JOIN homeroom b ON a.homeroom_id = b.homeroom_id
        WHERE a.student_name = '""" + student + """'
        ORDER BY datecertified DESC
        """)
    cur.execute(sql)
    details = cur.fetchall()

    dataset = defaultdict(dict)
    case_dataset = {}          # 2 - updated to a plain dict
    additional_dataset = []    # 1 - added back additional_dataset as a list

    for student_name, student_id, student_homeroom_name, test_id, datetaken, datecertified, request_number in details:
        dataset[student_name]['student_id'] = student_id
        dataset[student_name]['student_homeroom_name'] = student_homeroom_name
        case_dataset['test_id'] = test_id
        case_dataset['datetaken'] = datetaken
        case_dataset['datecertified'] = datecertified
        case_dataset['request_number'] = request_number
        additional_dataset.append(case_dataset)  # collect this row's details (needed to build the list shown in the output)
        dataset[student_name]['additional_information'] = additional_dataset  # 3 - updated to additional_dataset
        case_dataset = {}      # 4 - reset with a new dict instead of clear()

    dataset = dict(dataset)
    print(dataset)
    cur.close()
    conn.close()
This is what it produces now. This is a much better structure than what I was previously expecting.
{
"John Doe": {
"student_id": "1234",
"student_homeroom_name": "HR1",
"additional_information": [
{
"test_id": "0987",
"datetaken": "1-1-1970",
"datecertified": "1-2-1970",
"request_number": "5643"
},
{
"test_id": "12343",
"datetaken": "1-1-1980",
"datecertified": "1-2-1980",
"request_number": "39807"
}
]
}
}
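One further note on the query in detail(): it builds the WHERE clause by string concatenation, which breaks on names containing quotes and is open to SQL injection. Here is a hedged sketch of the same query with a parameter placeholder (shown with %s, which most MySQL drivers use; check your driver's paramstyle):

sql = """
    SELECT
        a.student_name,
        a.student_id,
        a.student_homeroom_name,
        a.test_id,
        a.datetaken,
        a.datecertified,
        b.request_number
    FROM student_information a
    INNER JOIN homeroom b ON a.homeroom_id = b.homeroom_id
    WHERE a.student_name = %s
    ORDER BY a.datecertified DESC
"""
cur.execute(sql, (student,))  # the driver quotes/escapes the value safely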
I have to use the Yelp API in a Django web app. I created db() to enter data into the database, but how do I address the error below? I'm trying to do it without Pandas:
Message=string indices must be integers
Source=C:\Users\diggt\OneDrive\College\Rowan\Fall22\10430_computing_and_informatics_capstone\yelp_VSCode\yelp.py
StackTrace:
File "C:\Users\diggt\OneDrive\College\Rowan\Fall22\10430_computing_and_informatics_capstone\yelp_VSCode\yelp.py", line 104, in <genexpr>
keys = (entry[c] for c in columns)
File "C:\Users\diggt\OneDrive\College\Rowan\Fall22\10430_computing_and_informatics_capstone\yelp_VSCode\yelp.py", line 115, in db
cur.executemany(sql, keys)
File "C:\Users\diggt\OneDrive\College\Rowan\Fall22\10430_computing_and_informatics_capstone\yelp_VSCode\yelp.py", line 153, in main
db()
File "C:\Users\diggt\OneDrive\College\Rowan\Fall22\10430_computing_and_informatics_capstone\yelp_VSCode\yelp.py", line 157, in <module> (Current frame)
main()
# -*- coding: utf-8 -*-
from __future__ import print_function
import argparse
import json
import csv
import pprint
import requests
import sys
import sqlite3
#import pandas as pd
from urllib.error import HTTPError
from urllib.parse import quote
API_KEY = 'secret'
# API constants, you shouldn't have to change these.
API_HOST = 'https://api.yelp.com'
SEARCH_PATH = '/v3/businesses/search'
BUSINESS_PATH = '/v3/businesses/' # Business ID will come after slash.
# Defaults
DEFAULT_TERM = 'dinner'
DEFAULT_LOCATION = 'Glassboro, NJ'
SEARCH_LIMIT = 3
OFFSET = 0
def request(host, path, api_key, url_params=None):
    url_params = url_params or {}
    url = '{0}{1}'.format(host, quote(path.encode('utf8')))
    headers = {
        'Authorization': 'Bearer %s' % api_key,
    }
    print(u'Querying {0} ...'.format(url))
    response = requests.request('GET', url, headers=headers, params=url_params)
    return response.json()
def search(api_key, term, location):
    url_params = {
        'term': term.replace(' ', '+'),
        'location': location.replace(' ', '+'),
        'limit': SEARCH_LIMIT,
        'offset': OFFSET
    }
    return request(API_HOST, SEARCH_PATH, api_key, url_params=url_params)
def get_business(api_key, business_id):
    business_path = BUSINESS_PATH + business_id
    return request(API_HOST, business_path, api_key)
def query_api(term, location):
    response = search(API_KEY, term, location)
    businesses = response.get('businesses')
    if not businesses:
        print(u'No businesses for {0} in {1} found.'.format(term, location))
        return
    business_id = businesses[0]['id']
    print(u'{0} businesses found, querying business info ' \
          'for the top result "{1}" ...'.format(
              len(businesses), business_id))
    response = get_business(API_KEY, business_id)
    print(u'Result for business "{0}" found:'.format(business_id))
    pprint.pprint(response, indent=2)
    str_to_write_to_file = json.dumps(response, skipkeys=True, allow_nan=True, indent=4)
    with open('yelp.json', 'w') as f:
        f.write(str_to_write_to_file)
def db():
    with open('yelp.json', 'r') as f:
        data = f.readlines()
    conn = sqlite3.connect('yelp.db')
    cur = conn.cursor()
    # Create the table if it doesn't exist.
    cur.execute(
        """CREATE TABLE IF NOT EXISTS yelp(
            id INTEGER PRIMARY KEY,
            alias varchar(100),
            location varchar(100),
            display_phone varchar(15)
        );"""
    )
    for entry in data:
        columns = ["id" "alias", "location", "display_phone"]
        keys = (entry[c] for c in columns)
        # Execute the command and replace '?' with the each value
        # in 'values'. DO NOT build a string and replace manually.
        # the sqlite3 library will handle non safe strings by doing this.
        sql = """INSERT INTO yelp (id, alias, location, display_phone) VALUES(
            ?,
            ?,
            ?,
            ?
        );"""
        cur.executemany(sql, keys)
        print(f'{entry["alias"]} data inserted Succefully')
    conn.commit()
    conn.close()
    with sqlite3.connect("yelp.db") as conn:
        cmd = """SELECT * FROM yelp;"""
        cur = conn.execute(cmd)
        res = cur.fetchall()
        for r in res:
            print(r)
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-q', '--term', dest='term', default=DEFAULT_TERM,
                        type=str, help='Search term (default: %(default)s)')
    parser.add_argument('-l', '--location', dest='location',
                        default=DEFAULT_LOCATION, type=str,
                        help='Search location (default: %(default)s)')
    input_values = parser.parse_args()
    try:
        query_api(input_values.term, input_values.location)
    except HTTPError as error:
        sys.exit(
            'Encountered HTTP error {0} on {1}:\n {2}\nAbort program.'.format(
                error.code,
                error.url,
                error.read(),
            )
        )
    db()

if __name__ == '__main__':
    main()
JSON file:
{
"id": "umC69pkiPyk3qY7IB49ZYw",
"alias": "bosphorus-mediterranean-cuisine-glassboro",
"name": "Bosphorus Mediterranean Cuisine",
"image_url": "https://s3-media4.fl.yelpcdn.com/bphoto/G7VCO3tvx8NGPz5g0fSpMw/o.jpg",
"is_claimed": true,
"is_closed": false,
"url": "https://www.yelp.com/biz/bosphorus-mediterranean-cuisine-glassboro?adjust_creative=9aYQmmK21ApZ7TfokeTk1A&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_lookup&utm_source=9aYQmmK21ApZ7TfokeTk1A",
"phone": "+18562432015",
"display_phone": "(856) 243-2015",
"review_count": 14,
"categories": [
{
"alias": "turkish",
"title": "Turkish"
},
{
"alias": "halal",
"title": "Halal"
},
{
"alias": "kebab",
"title": "Kebab"
}
],
"rating": 5.0,
"location": {
"address1": "524 Delsea Drive N",
"address2": null,
"address3": null,
"city": "Glassboro",
"zip_code": "08028",
"country": "US",
"state": "NJ",
"display_address": [
"524 Delsea Drive N",
"Glassboro, NJ 08028"
],
"cross_streets": ""
},
"coordinates": {
"latitude": 39.7150351328115,
"longitude": -75.1118882
},
"photos": [
"https://s3-media4.fl.yelpcdn.com/bphoto/G7VCO3tvx8NGPz5g0fSpMw/o.jpg",
"https://s3-media2.fl.yelpcdn.com/bphoto/HvhYRZO2rOYUBX0DagVE3w/o.jpg",
"https://s3-media2.fl.yelpcdn.com/bphoto/PQHr3upfVULUjwz1M-ILcw/o.jpg"
],
"hours": [
{
"open": [
{
"is_overnight": false,
"start": "1100",
"end": "2200",
"day": 0
},
{
"is_overnight": false,
"start": "1100",
"end": "2200",
"day": 1
},
{
"is_overnight": false,
"start": "1100",
"end": "2200",
"day": 2
},
{
"is_overnight": false,
"start": "1100",
"end": "2200",
"day": 3
},
{
"is_overnight": false,
"start": "1100",
"end": "2200",
"day": 4
},
{
"is_overnight": false,
"start": "1100",
"end": "2200",
"day": 5
},
{
"is_overnight": false,
"start": "1100",
"end": "2200",
"day": 6
}
],
"hours_type": "REGULAR",
"is_open_now": true
}
],
"transactions": [
"pickup",
"delivery"
],
"messaging": {
"url": "https://www.yelp.com/raq/umC69pkiPyk3qY7IB49ZYw?adjust_creative=9aYQmmK21ApZ7TfokeTk1A&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_lookup&utm_source=9aYQmmK21ApZ7TfokeTk1A#popup%3Araq",
"use_case_text": "Message the Business"
}
}
You shouldn't use f.readlines() to read a JSON file, use json.load(f).
There's only one set of values in the JSON, so you don't need a loop or executemany().
def db():
    with open('yelp.json', 'r') as f:
        data = json.load(f)
    conn = sqlite3.connect('yelp.db')
    cur = conn.cursor()
    # Create the table if it doesn't exist.
    cur.execute(
        """CREATE TABLE IF NOT EXISTS yelp(
            id INTEGER PRIMARY KEY,
            alias varchar(100),
            location varchar(100),
            display_phone varchar(15)
        );"""
    )
    columns = ["id", "alias", "location", "display_phone"]
    keys = [data[c] for c in columns]
    # Execute the command and replace each '?' with the corresponding value
    # in 'keys'. DO NOT build a string and substitute manually;
    # the sqlite3 library handles unsafe strings by doing this.
    sql = """INSERT INTO yelp (id, alias, location, display_phone) VALUES(
        ?,
        ?,
        ?,
        ?
    );"""
    cur.execute(sql, keys)
    print(f'{data["alias"]} data inserted successfully')
    conn.commit()
    conn.close()
    with sqlite3.connect("yelp.db") as conn:
        cmd = """SELECT * FROM yelp;"""
        cur = conn.execute(cmd)
        res = cur.fetchall()
        for r in res:
            print(r)
So ultimately I figured it out, pretty much. I used what #Bramar said, but the solution was making the JSON file an array. I then started getting this error: sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 4, and there are 1 supplied. It turned out one of the entries was stored in the JSON as a dict, so I eliminated it temporarily to see if I could make it work, and it works. This is the code -
print(u'Result for business "{0}" found:'.format(business_id))
str_to_write_to_file = json.dumps([response], indent=4)
with open('yelp.json', 'w') as f:
    f.write(str_to_write_to_file)

def db():
    with open('yelp.json', 'r') as f:
        data = json.load(f)
    conn = sqlite3.connect('data/yelp.db')
    cur = conn.cursor()
    # Create the table if it doesn't exist.
    cur.execute(
        """CREATE TABLE IF NOT EXISTS yelp(
            id INTEGER PRIMARY KEY,
            alias varchar(100),
            display_phone varchar(15),
            location dictionary
        );"""
    )
    columns = ["alias", "display_phone"]
    keys = [data[0][c] for c in columns]
    # Execute the command and replace each '?' with the corresponding value
    # in 'keys'. DO NOT build a string and substitute manually;
    # the sqlite3 library handles unsafe strings by doing this.
    sql = '''INSERT INTO yelp (alias, display_phone) VALUES(
        ?,
        ?
    );'''
    cur.execute(sql, keys)
    conn.commit()
    conn.close()
Hopefully this helps someone; this can be very confusing.
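If you do want to keep the nested location object instead of dropping it, one option is to serialize it to a JSON string and store it in a TEXT column. This is a sketch under that assumption, not the only way to model it:

import json
import sqlite3

with open('yelp.json', 'r') as f:
    data = json.load(f)  # the file was written as an array, so data is a list

conn = sqlite3.connect('data/yelp.db')
cur = conn.cursor()
cur.execute("""CREATE TABLE IF NOT EXISTS yelp(
    alias varchar(100),
    display_phone varchar(15),
    location TEXT
);""")

entry = data[0]
# A nested dict cannot be bound directly, so store it as JSON text.
values = (entry["alias"], entry["display_phone"], json.dumps(entry["location"]))
cur.execute("INSERT INTO yelp (alias, display_phone, location) VALUES (?, ?, ?);", values)
conn.commit()
conn.close()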
I have created a Python script that creates a table in MySQL and another one that populates it with data from a JSON file.
Sample JSON file:
{
"ansible_facts":{
"ansible_network_resources":{
"l3_interfaces":[
{
"name":"GigabitEthernet0/0"
},
{
"name":"GigabitEthernet0/0.100",
"ipv4":[
{
"address":"172.1.1.1 255.255.255.252"
}
]
},
{
"name":"GigabitEthernet0/0.101",
"ipv4":[
{
"address":"172.1.1.1 255.255.255.252"
}
]
},
{
"name":"GigabitEthernet0/1",
"ipv4":[
{
"address":"56.2.1.1 255.255.255.252"
}
]
},
{
"name":"GigabitEthernet0/2"
}
]
},
"ansible_net_python_version":"3.6.9",
"ansible_net_hostname":"host02342-mpls",
"ansible_net_model":"CISCO-CHA",
"ansible_net_serialnum":"F1539AM",
"ansible_net_gather_subset":[
"default"
],
"ansible_net_gather_network_resources":[
"l3_interfaces"
],
"ansible_net_version":"15.3(2)T",
"ansible_net_api":"cliconf",
"ansible_net_system":"ios",
"ansible_net_image":"flash0:/c3900-universalk9-mz.spa.153-2.t.bin",
"ansible_net_iostype":"IOS"
}
}
Table creation script
import mysql.connector
mydb = mysql.connector.connect(host="IPaddress", user="user", password="pw", database="db")
mycursor = mydb.cursor()
mycursor.execute("CREATE TABLE Routers (ansible_net_hostname NVARCHAR(255), ansible_net_model NVARCHAR(255), ansible_network_resources NVARCHAR(255))")
The script to import JSON data into MySQL
import json, pymysql

json_data = open("L3_out.json").read()
json_obj = json.loads(json_data)

con = pymysql.connect(host="IPaddress", user="user", password="pw", database="db")
cursor = con.cursor()

for item in json_obj:
    ansible_net_hostname = item.get("ansible_net_hostname")
    ansible_net_model = item.get("ansible_net_model")
    ansible_network_resources = item.get("ansible_network_resources")
    cursor.execute(
        "insert into Routers(ansible_net_hostname, ansible_net_model, ansible_network_resources) value(%s, %s, %s)",
        (ansible_net_hostname, ansible_net_model, ansible_network_resources)
    )

con.commit()
con.close()
I'm having issues importing the ansible_network_resources field (a nested object) into the Routers table. The other columns (ansible_net_hostname, ansible_net_model) get inserted perfectly. What am I doing wrong?
First of all, it's not clear how
for item in json_obj:
    ansible_net_hostname = item.get("ansible_net_hostname")
works, since 'item' in your case is a key of the dictionary. In the file you have shown there is only one root key, "ansible_facts", so you are trying to call get() on a string.
To get the data of "ansible_network_resources", do the following:
for key in json_obj:
    ansible_network_resources = json_obj[key].get("ansible_network_resources")
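Note that ansible_network_resources is itself a dict, and MySQL drivers generally cannot bind a dict as a parameter, which is a likely reason that column fails while the two plain strings insert fine. A sketch of one workaround, assuming it is acceptable to store the structure as JSON text (the NVARCHAR(255) column may need to become TEXT for larger payloads):

import json

for key in json_obj:
    facts = json_obj[key]
    ansible_net_hostname = facts.get("ansible_net_hostname")
    ansible_net_model = facts.get("ansible_net_model")
    # Serialize the nested structure so it can be bound as an ordinary string.
    ansible_network_resources = json.dumps(facts.get("ansible_network_resources"))
    cursor.execute(
        "insert into Routers(ansible_net_hostname, ansible_net_model, ansible_network_resources) values (%s, %s, %s)",
        (ansible_net_hostname, ansible_net_model, ansible_network_resources)
    )
con.commit()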
I currently have this method in my Python code:
@app.route('/getData', methods=['GET'])
def get_Data():
    c.execute("SELECT abstract,category,date,url from Data")
    data = c.fetchall()
    resp = jsonify(data)
    resp.status_code = 200
    return resp
The output I get from this is:
[
[
"2020-04-23 15:32:13",
"Space",
"https://www.bisnow.com/new-jersey",
"temp"
],
[
"2020-04-23 15:32:13",
"Space",
"https://www.bisnow.com/events/new-york",
"temp"
]
]
However, I want the output to look like this:
[
{
"abstract": "test",
"category": "journal",
"date": "12-02-2020",
"link": "www.google.com"
},
{
"abstract": "test",
"category": "journal",
"date": "12-02-2020",
"link": "www.google.com"
}
]
How do I convert my output into the expected format?
As #jonrsharpe indicates, you simply cannot expect the tuple coming from this database query to turn into a dictionary in the JSON output. Your data variable does not contain the information necessary to construct the response you desire.
It will depend on your database but my recommendation would be to find a way to retrieve dicts from your database query instead of tuples, in which case the rest of your code should work as is. For instance, for sqlite, you could define your cursor c like this:
import sqlite3
connection = sqlite3.connect('dbname.db') # database connection details here...
connection.row_factory = sqlite3.Row
c = connection.cursor()
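With that cursor in place, each row behaves like a mapping. Depending on the Flask version, sqlite3.Row objects may not be JSON-serializable directly, so a hedged variant of the route converts each row to a plain dict first (the keys will follow the column names, so url rather than link unless you alias the column in SQL):

@app.route('/getData', methods=['GET'])
def get_Data():
    c.execute("SELECT abstract, category, date, url FROM Data")
    data = [dict(row) for row in c.fetchall()]  # sqlite3.Row supports the mapping protocol
    resp = jsonify(data)
    resp.status_code = 200
    return resp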
Now, if your database for some reason cannot support a dictionary cursor, you need to roll your own dictionary after retrieving the database query results. For your example, something like this:
fieldnames = ('abstract', 'category', 'date', 'link')
numfields = len(fieldnames)
data = []
for row in c.fetchall():
    dictrow = {}
    for idx in range(numfields):
        dictrow[fieldnames[idx]] = row[idx]
    data.append(dictrow)
I iterate over a list of field labels, which do not have to match your database column names but do have to be in the same order, and create a dict by pairing each label with the datum from the db tuple in the same position. This passage would replace the single line data = c.fetchall() in the OP.
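The same pairing can be written more compactly with zip; this is an optional variant, not a required change:

fieldnames = ('abstract', 'category', 'date', 'link')
data = [dict(zip(fieldnames, row)) for row in c.fetchall()]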
The JSON files are named a.json, b.json, ..., z.json (26 JSON files).
The JSON format of each file looks like this:
{
"a cappella": {
"word": "a cappella",
"wordset_id": "5feb6f679a",
"meanings": [
{
"id": "492099d426",
"def": "without musical accompaniment",
"example": "they performed a cappella",
"speech_part": "adverb"
},
{
"id": "0bf8d49e2e",
"def": "sung without instrumental accompaniment",
"example": "they sang an a cappella Mass",
"speech_part": "adjective"
}
]
},
"A.D.": {
"word": "A.D.",
"wordset_id": "b7e9d406a0",
"meanings": [
{
"id": "a7482f3e30",
"def": "in the Christian era",
"speech_part": "adverb",
"synonyms": [
"AD"
]
}
]
},.........
}
How could I store these in MongoDB such that, when queried with a word, the result shows its meanings and synonyms (if available)?
I have never used Mongo and am not sure how to approach this, but the same was done with SO suggestions for a single JSON file in MySQL as:
**cursor has the DB connection
with open('a.json') as f:
    d = json.load(f)

for word in d:
    word_obj = d[word]
    wordset_id = word_obj['wordset_id']
    sql = "INSERT INTO Word (word, wordset_id) VALUES (%s, %s)"
    values = (word, wordset_id)
    cursor.execute(sql, values)
conn.commit()
and similarly to store meanings and synonyms in different tables. But, as suggested, I guess this would work better with MongoDB.
If you want to insert data from multiple .json files, do it in a loop:
file_names = ['a.json', 'b.json', ...]
for file_name in file_names:
    with open(file_name) as f:
        file_data = json.load(f)  # load data from JSON to dict
    for k, v in file_data.items():  # iterate over key-value pairs
        collection.insert_one(v)  # your collection object here
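To cover the query side of the question: since each inserted document keeps its word field, you can index that field and look entries up with find_one. A minimal sketch, assuming pymongo and illustrative database/collection names (wordstoday / words):

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
collection = client['wordstoday']['words']   # hypothetical names

collection.create_index('word')              # speeds up lookups by word

doc = collection.find_one({'word': 'a cappella'}, {'_id': 0})
if doc:
    for meaning in doc['meanings']:
        print(meaning['def'], meaning.get('synonyms', []))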