Parsing JSON into INSERT statements with Python

I have a file that contains several JSON records. I have to parse this file and load each record into a particular SQL Server table. The table might not exist in the database, in which case I have to create it before loading. So I need to parse the JSON file, work out the fields/columns, and create the table; then deserialize the records and insert them into that table. The caveat is that some fields are optional, i.e. a field might be absent from one record but present in another. Below is an example file with 3 records:
{ id : 1001,
name : "John",
age : 30
} ,
{ id : 1002,
name : "Peter",
age : 25
},
{ id : 1003,
name : "Kevin",
age : 35,
salary : 5000
},
Notice that the field salary appears only in the 3rd record. The results should be :-
CREATE TABLE tab ( id int, name varchar(100), age int, salary int );
INSERT INTO tab (id, name, age, salary) values (1001, 'John', 30, NULL)
INSERT INTO tab (id, name, age, salary) values (1002, 'Peter', 25, NULL)
INSERT INTO tab (id, name, age, salary) values (1003, 'Kevin', 35, 5000)
Can anyone please help me with some pointers as I am new to Python. Thanks.

You could try this:
import json
TABLE_NAME = "tab"
sqlstatement = ''
with open('data.json', 'r') as f:
    jsondata = json.load(f)
# The loop variable is called "record" so it does not shadow the json module.
for record in jsondata:
    keylist = "("
    valuelist = "("
    firstPair = True
    for key, value in record.items():
        if not firstPair:
            keylist += ", "
            valuelist += ", "
        firstPair = False
        keylist += key
        if isinstance(value, str):  # on Python 2, test against basestring instead
            # Double any single quotes so the SQL string literal stays valid.
            valuelist += "'" + value.replace("'", "''") + "'"
        else:
            valuelist += str(value)
    keylist += ")"
    valuelist += ")"
    sqlstatement += "INSERT INTO " + TABLE_NAME + " " + keylist + " VALUES " + valuelist + "\n"
print(sqlstatement)
However for this to work, you'll need to change your JSON file to correct the syntax like this:
[{
"id" : 1001,
"name" : "John",
"age" : 30
} ,
{
"id" : 1002,
"name" : "Peter",
"age" : 25
},
{
"id" : 1003,
"name" : "Kevin",
"age" : 35,
"salary" : 5000
}]
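Running this gives output like the following (the column order follows the order in which Python iterates over each dict's keys, so it can vary between Python versions):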
INSERT INTO tab (age, id, name) VALUES (30, 1001, 'John')
INSERT INTO tab (age, id, name) VALUES (25, 1002, 'Peter')
INSERT INTO tab (salary, age, id, name) VALUES (5000, 35, 1003, 'Kevin')
Note that you don't need to specify NULLs. Any column you leave out of the INSERT statement gets its default value, which is NULL unless the column was declared with a different default. (If an omitted column is NOT NULL and has no default, the statement fails.)
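The snippet above emits only the INSERT statements. To cover the CREATE TABLE part of the question as well, one option is two passes: first collect the union of keys across all records, then emit one INSERT per record with NULL for any missing key. Below is a minimal sketch, assuming Python 3.7+ (dicts keep insertion order). The helper names (sql_type, sql_literal, build_statements) and the type mapping are assumptions made for illustration; nothing here is mandated by SQL Server.

```python
import json

def sql_type(value):
    """Map a Python value to a column type (illustrative mapping only)."""
    if isinstance(value, bool):
        return "bit"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    return "varchar(100)"

def sql_literal(value):
    """Render a value as a SQL literal; None becomes NULL."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def build_statements(records, table):
    # First pass: union of keys in order of first appearance, with the
    # type taken from the first non-null value seen for each key.
    columns = {}
    for record in records:
        for key, value in record.items():
            if columns.get(key) is None:
                columns[key] = None if value is None else sql_type(value)
    col_defs = ", ".join(
        f"{name} {ctype or 'varchar(100)'}" for name, ctype in columns.items()
    )
    names = ", ".join(columns)
    statements = [f"CREATE TABLE {table} ( {col_defs} );"]
    # Second pass: one INSERT per record, NULL where a key is absent.
    for record in records:
        values = ", ".join(sql_literal(record.get(name)) for name in columns)
        statements.append(f"INSERT INTO {table} ({names}) VALUES ({values});")
    return statements

records = json.loads("""[
  {"id": 1001, "name": "John", "age": 30},
  {"id": 1002, "name": "Peter", "age": 25},
  {"id": 1003, "name": "Kevin", "age": 35, "salary": 5000}
]""")
statements = build_statements(records, "tab")
print("\n".join(statements))
```

Table and column names are pasted into the SQL text as-is, so this is only suitable for trusted input. For loading real data, passing the values as query parameters through the database driver is the safer route.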

In Python, you can do something like this using sqlite3 and json, both from the standard library.
import json
import sqlite3
# The string representing the json.
# You will probably want to read this string in from
# a file rather than hardcoding it.
s = """[
{
"id": 1001,
"name": "John",
"age" : 30
},
{
"id" : 1002,
"name" : "Peter",
"age" : 25
},
{
"id" : 1003,
"name" : "Kevin",
"age" : 35,
"salary" : 5000
}
]"""
# Read the string representing json
# Into a python list of dicts.
data = json.loads(s)
# Open the file containing the SQL database.
with sqlite3.connect("filename.db") as conn:
    # Create the table if it doesn't exist.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tab(
            id int,
            name varchar(100),
            age int,
            salary int
        );"""
    )
    # Insert each entry from json into the table.
    keys = ["id", "name", "age", "salary"]
    for entry in data:
        # This will make sure that each key will default to None
        # if the key doesn't exist in the json entry.
        values = [entry.get(key, None) for key in keys]
        # Execute the command and replace '?' with each value
        # in 'values'. DO NOT build the string by hand; letting the
        # sqlite3 library bind the values handles unsafe strings.
        cmd = """INSERT INTO tab VALUES(
            ?,
            ?,
            ?,
            ?
        );"""
        conn.execute(cmd, values)
    conn.commit()
This will create a file named 'filename.db' in the current directory with the entries inserted.
To test the tables:
# Testing the table.
with sqlite3.connect("filename.db") as conn:
    cmd = """SELECT * FROM tab WHERE salary IS NOT NULL;"""
    cur = conn.execute(cmd)
    res = cur.fetchall()
    for r in res:
        print(r)
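The keys list in the answer is hardcoded. If the set of columns is not known in advance, it can be derived from the data before the table is created. A rough sketch, assuming an in-memory SQLite database and JSON keys that are trusted, plain identifiers (column names cannot be bound with '?', so they are interpolated into the SQL text). The helper name load_records is hypothetical.

```python
import json
import sqlite3

def load_records(conn, table, records):
    # Union of keys across all records, in order of first appearance.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    # SQLite accepts columns without a declared type; most other databases do not.
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_list})")
    # Missing keys become None, which sqlite3 stores as NULL.
    rows = [[record.get(col) for col in columns] for record in records]
    conn.executemany(
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})", rows
    )
    conn.commit()
    return columns

records = json.loads("""[
  {"id": 1001, "name": "John", "age": 30},
  {"id": 1002, "name": "Peter", "age": 25},
  {"id": 1003, "name": "Kevin", "age": 35, "salary": 5000}
]""")
conn = sqlite3.connect(":memory:")
columns = load_records(conn, "tab", records)
result = conn.execute("SELECT * FROM tab ORDER BY id").fetchall()
print(columns)
print(result)
```

Naming the columns explicitly in the INSERT keeps the statement correct even if the table already exists with its columns in a different order.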

Related

How to prevent double iteration of list of dictionaries?

I am trying to create text strings by populating from a json object string.
When I iterate through the list of dictionaries, the iterator doubles the string. How do I fix this?
Code so far:
import json
data = '''{
    "text": "aaa",
    "text2": "bbb",
    "data": [
        {
            "id": "1",
            "text": "Red"
        }, {
            "id": "2",
            "text": "Blue"
        }
    ]
}'''
data_decoded = json.loads(data)
data_list = data_decoded['data']
insertQuery = "update "+ data_decoded['text'] +" set "
#print(insertQuery)
for pair in data_list:
    for k, v in pair.items():
        if k == data_decoded['text2']:
            where = ' \"' + k + '\" = \'' + v + '\''
        else:
            insertQuery = insertQuery + ' where \"' +k+'\" = \''+ v + '\''
    query = insertQuery + where
    print(query)
Output:
update aaa set where "id" = '1' where "text" = 'Red' "id" = '2'
update aaa set where "id" = '1' where "text" = 'Red' where "id" = '2' where "text" = 'Blue' "id" = '2'
My desired result is for every key value pair the code prints one sentence, like so:
update aaa set where "id" = '1' where "text" = 'Red'
update aaa set where "id" = '2' where "text" = 'Blue'
Not entirely sure, but you can just access your dictionary items rather than looping over them :)
If you use Python >= 3.6:
query = ''
field = data_decoded['text']
for pair in data_list:
    query += f"update {field} set where \"id\" = '{pair['id']}' where \"text\" = '{pair['text']}'\n"
print(query)
Otherwise:
query = ''
field = data_decoded['text']
for pair in data_list:
    query += "update {field} set where \"id\" = '{id}' where \"text\" = '{text}'\n".format(field=field, id=pair['id'], text=pair['text'])
print(query)
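The doubling in the question appears to come from insertQuery being extended inside the loop and never reset, so each pass carries the previous pair's text along. Here is a minimal sketch that builds every line from scratch and still loops over the items, assuming the goal is exactly the desired text shown in the question. The names prefix, conditions and lines are illustrative.

```python
import json

data = '''{
    "text": "aaa",
    "text2": "bbb",
    "data": [
        {"id": "1", "text": "Red"},
        {"id": "2", "text": "Blue"}
    ]
}'''
data_decoded = json.loads(data)

# The fixed part of every line is computed once and never modified.
prefix = "update " + data_decoded["text"] + " set"
lines = []
for pair in data_decoded["data"]:
    # Conditions are rebuilt for each pair, so nothing is carried over.
    conditions = ["where \"{}\" = '{}'".format(k, v) for k, v in pair.items()]
    lines.append(prefix + " " + " ".join(conditions))

print("\n".join(lines))
```

This reproduces the text the question asks for; as written it is not a valid SQL UPDATE statement, which would need assignments after set and a single where clause.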

Creating a Data Structure from JSON Using Python

I'm new to python and I have a json file that I'm trying to use to create a data structure using python.
Here is a sample of what one of the lines would look like:
[{'name': 'Nick', 'age': '35', 'city': 'New York'}...]
I read it into memory, but am unsure what additional steps I would need to take to use it to create a table.
Here is what I have tried so far:
import json
import csv
from pprint import pprint
with open("/desktop/customer_records.json") as customer_records:
    data = json.load(customer_records)
for row in data:
    print(row)
Ideally, I would like it in the following format:
Name Age City
Nick 35 New York
Any help would be appreciated. Thanks in advance.
Your problem is not specified too precisely, but if you want to generate SQL that can be inserted into MySQL, here's a little program that converts JSON to a sequence of SQL statements:
#!/usr/bin/env python3
import json
# Load data
with open("zz.json") as f:
    data = json.load(f)
# Find all keys, in order of first appearance
keys = []
for row in data:
    for key in row.keys():
        if key not in keys:
            keys.append(key)
# Print table definition
# (MySQL needs an explicit length, e.g. VARCHAR(255); adjust as required)
print("""CREATE TABLE MY_TABLE(
{0}
);""".format(",\n ".join("{0} VARCHAR".format(key) for key in keys)))
# Now, for all rows, print values (NULL where a key is missing)
for row in data:
    print("""INSERT INTO MY_TABLE VALUES({0});""".format(
        ",".join("'{0}'".format(row[key]) if key in row else "NULL" for key in keys)))
For this JSON file:
[
{"name": "Nick", "age": "35", "city": "New York"},
{"name": "Joe", "age": "21", "city": "Boston"},
{"name": "Alice", "city": "Washington"},
{"name": "Bob", "age": "49"}
]
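It generates output like this (the column order follows the order in which Python iterates over the dict keys, so it can vary between Python versions):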
CREATE TABLE MY_TABLE(
city VARCHAR,
age VARCHAR,
name VARCHAR
);
INSERT INTO MY_TABLE VALUES('New York','35','Nick');
INSERT INTO MY_TABLE VALUES('Boston','21','Joe');
INSERT INTO MY_TABLE VALUES('Washington',NULL,'Alice');
INSERT INTO MY_TABLE VALUES(NULL,'49','Bob');
And for the future, please make your question WAY more specific :)
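If the goal is the plain-text table from the question rather than SQL, the same key-collection step can feed a small formatter. A sketch, assuming missing values should print as empty cells and that the headers are the capitalized key names. The helper name format_table is hypothetical.

```python
def format_table(rows):
    # Collect all keys in order of first appearance.
    keys = []
    for row in rows:
        for key in row:
            if key not in keys:
                keys.append(key)
    # Header line first, then one line per record.
    table = [[key.capitalize() for key in keys]]
    for row in rows:
        table.append([str(row.get(key, "")) for key in keys])
    # Each column is as wide as its longest cell.
    widths = [max(len(line[i]) for line in table) for i in range(len(keys))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(line, widths)).rstrip()
        for line in table
    )

data = [
    {"name": "Nick", "age": "35", "city": "New York"},
    {"name": "Joe", "age": "21", "city": "Boston"},
    {"name": "Alice", "city": "Washington"},
    {"name": "Bob", "age": "49"},
]
text = format_table(data)
print(text)
```

For anything beyond quick inspection, a dedicated library such as pandas or the csv module is likely a better fit than hand-rolled padding.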

Store and use json in MSSQL with python

I'm trying to upload a test JSON string to SQL Server
json_string = """ {
"orderID": 42,
"customerName": "John Smith",
"customerPhoneN": "555-1234",
"orderContents": [
{
"productID": 23,
"productName": "keyboard",
"quantity": 1
},
{
"productID": 13,
"productName": "mouse",
"quantity": 1
}
],
"orderCompleted": true
} """
parsed_string = json.loads(json_string)
cursor.execute("update Table set Status = ? where Name like ? ",(json.dumps(parsed_string), "Blabla"))
cnxn.commit()
How to return and work with this JSON from the database?
cursor.execute("""select Status from Table where Name like ?""", "Blabla")
rows = cursor.fetchall()
How can I print the value of the JSON?
Use the JSON data type that is supported in MySQL. You can find more about it here:
https://dev.mysql.com/doc/refman/5.7/en/json.html
s = json.dumps(DATA)
cursor.execute("update Table set Status = ? where Name like ? ",(s, "Blabla"))
cnxn.commit()
and
cursor.execute("""select Status from Table where Name like ?""", "Blabla")
res = cursor.fetchall()
DATA = json.loads(res[0][0])  # first row, first (only) selected column
print(DATA)
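The round trip can be tried end to end without a database server. The sketch below uses sqlite3 from the standard library purely as a stand-in for the connection in the question; the table name orders and its two columns are made up for the example. The only point it makes is that the column holds text: json.dumps on the way in, json.loads on the way out.

```python
import json
import sqlite3

order = {
    "orderID": 42,
    "customerName": "John Smith",
    "orderContents": [
        {"productID": 23, "productName": "keyboard", "quantity": 1},
        {"productID": 13, "productName": "mouse", "quantity": 1},
    ],
    "orderCompleted": True,
}

# In-memory database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (Name TEXT, Status TEXT)")
conn.execute("INSERT INTO orders (Name) VALUES (?)", ("Blabla",))

# Store: serialize the dict to a JSON string and bind it as a parameter.
conn.execute(
    "UPDATE orders SET Status = ? WHERE Name LIKE ?",
    (json.dumps(order), "Blabla"),
)
conn.commit()

# Load: the column comes back as a string; parse it into Python objects.
row = conn.execute(
    "SELECT Status FROM orders WHERE Name LIKE ?", ("Blabla",)
).fetchone()
restored = json.loads(row[0])

print(restored["customerName"])
for item in restored["orderContents"]:
    print(item["productName"], item["quantity"])
```

Once parsed, the value is an ordinary dict, so nested fields are reached with normal indexing.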

Query a multi-level JSON object stored in MySQL

I have a JSON column in a MySQL table that contains a multi-level JSON object. I can access the values at the first level using the function JSON_EXTRACT, but I can't find out how to go beyond the first level.
Here's my MySQL table:
CREATE TABLE ref_data_table (
`id` INTEGER(11) AUTO_INCREMENT NOT NULL,
`symbol` VARCHAR(12) NOT NULL,
`metadata` JSON NOT NULL,
PRIMARY KEY (`id`)
);
Here's my Python script:
import json
import mysql.connector
con = mysql.connector.connect(**config)
cur = con.cursor()
symbol = 'VXX'
metadata = {
'tick_size': 0.01,
'data_sources': {
'provider1': 'p1',
'provider2': 'p2',
'provider3': 'p3'
},
'currency': 'USD'
}
# Placeholders are left unquoted; the connector quotes the bound values itself.
sql = \
"""
INSERT INTO ref_data_table (symbol, metadata)
VALUES (%s, %s);
"""
cur.execute(sql, (symbol, json.dumps(metadata)))
con.commit()
The data is properly inserted into the MySQL table and the following statement in MySQL works:
SELECT symbol, JSON_EXTRACT(metadata, '$.data_sources')
FROM ref_data_table
WHERE symbol = 'VXX';
How can I request the value of 'provider3' in 'data_sources'?
Many thanks!
Try this path:
'$.data_sources.provider3'
SELECT symbol, JSON_EXTRACT(metadata, '$.data_sources.provider3')
FROM ref_data_table
WHERE symbol = 'VXX';
The JSON_EXTRACT function in MySQL supports this: '$' refers to the JSON root, and each period descends one level of nesting. In this JSON example
{
"key": {
"value": "nested_value"
}
}
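you could use JSON_EXTRACT(json_field, '$.key.value') to get "nested_value". The result is a JSON value, so a string keeps its double quotes; wrap the call in JSON_UNQUOTE() if you want the bare text.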
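To see how such a path maps onto nesting, here is a small Python analogue that walks a dict one key at a time. It is only an illustration of the dotted-path idea: it handles plain object keys (no array indexes or wildcards) and is not how MySQL implements JSON_EXTRACT. The function name extract_path is hypothetical.

```python
def extract_path(document, path):
    """Follow a '$.a.b.c' style path through nested dicts."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    # Everything after '$' is a sequence of '.key' steps.
    for key in path[1:].split("."):
        if key == "":
            continue  # the empty piece before the first '.'
        if not isinstance(current, dict) or key not in current:
            return None  # path does not exist in this document
        current = current[key]
    return current

metadata = {
    "tick_size": 0.01,
    "data_sources": {
        "provider1": "p1",
        "provider2": "p2",
        "provider3": "p3",
    },
    "currency": "USD",
}

print(extract_path(metadata, "$.data_sources.provider3"))
print(extract_path(metadata, "$.currency"))
```

Each extra '.key' segment simply descends one more level, which is why '$.data_sources.provider3' reaches a value that '$.data_sources' alone returns as a whole object.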

python/sqlite3 query with column name to JSON

I want to return the output of a SQL query as JSON, including the column names,
to build a table on the client side.
I have not found a solution for this.
My code:
json_data = json.dumps(c.fetchall())
return json_data
The output should look like this:
{
"name" : "Toyota1",
"product" : "Prius",
"color" : [
"white pearl",
"Red Methalic",
"Silver Methalic"
],
"type" : "Gen-3"
}
Does anyone know a solution?
Your code only returns the values. To also get the column names, one option is to query the table 'sqlite_master', which stores the SQL string that was used to create the table.
c.execute("SELECT sql FROM sqlite_master WHERE " \
"tbl_name='your_table_name' AND type = 'table'")
create_table_string = c.fetchall()[0][0]
This will give you a string from which you can parse the column names:
"CREATE TABLE table_name (columnA text, columnB integer)"

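An alternative that avoids parsing the CREATE TABLE string: after a SELECT, the DB-API cursor exposes the column names through its description attribute, where the first element of each entry is the name. A sketch, assuming an in-memory database and a made-up cars table with three text columns:

```python
import json
import sqlite3

# Throwaway database and table, invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (name TEXT, product TEXT, type TEXT)")
conn.execute("INSERT INTO cars VALUES (?, ?, ?)", ("Toyota1", "Prius", "Gen-3"))

c = conn.execute("SELECT name, product, type FROM cars")
# description has one 7-tuple per result column; index 0 is the column name.
column_names = [col[0] for col in c.description]
# Pair each row's values with the column names to get one dict per row.
rows = [dict(zip(column_names, row)) for row in c.fetchall()]
json_data = json.dumps(rows)
print(json_data)
```

Setting conn.row_factory = sqlite3.Row is another standard-library route to name-addressable rows; the version above keeps the mapping explicit.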