Structuring request JSON for API - python

I'm building a small API to interact with our database for other projects. I've built the database and have the API functioning fine; however, the data I get back isn't structured how I want it.
I am using Python with Flask/Flask-Restful for the API.
Here is a snippet of my Python that handles the interaction:
class Address(Resource):
    def get(self, store):
        print('Received a request at ADDRESS for Store ' + store)
        conn = sqlite3.connect('store-db.db')
        cur = conn.cursor()
        addresses = cur.execute('SELECT * FROM Sites WHERE StoreNumber like ' + store)
        for adr in addresses:
            return (adr, 200)
If I make a request to the /sites/42 endpoint, where 42 is the site id, this is what I'll receive:
[
  "42",
  "5000 Robinson Centre Drive",
  "",
  "Pittsburgh",
  "PA",
  "15205",
  "(412) 787-1330",
  "(412) 249-9161",
  "",
  "Dick's Sporting Goods"
]
Here is how it is structured in the database: one column per field (StoreNumber, Street, StreetSecondary, City, State, ZipCode, ContactNumber, XO_TN, RelocationStatus, StoreType).
Ultimately I'd like to use the column names as the keys in the JSON that's received, but I need a bit of guidance in the right direction so I'm not Googling ambiguous terms hoping to find something.
Here is an example of what I'd like to receive after making a request to that endpoint:
{
  "StoreNumber": "42",
  "Street": "5000 Robinson Centre Drive",
  "StreetSecondary": "",
  "City": "Pittsburgh",
  "State": "PA",
  "ZipCode": "15205",
  "ContactNumber": "(412) 787-1330",
  "XO_TN": "(412) 249-9161",
  "RelocationStatus": "",
  "StoreType": "Dick's Sporting Goods"
}
I'm just looking for some guidance on whether I should change how my data is structured in the database (e.g. I've seen some people just store JSON in their database, but I think that's messy), or whether there's a more intuitive method I could use to control my data.
Updated Code using Accepted Answer
class Address(Resource):
    def get(self, store):
        print('Received a request at ADDRESS for Store ' + store)
        conn = sqlite3.connect('store-db.db')
        cur = conn.cursor()
        addresses = cur.execute('SELECT * FROM Sites WHERE StoreNumber like ' + store)
        column_names = ["StoreNumber", "Street", "StreetSecondary", "City", "State",
                        "ZipCode", "ContactNumber", "XO_TN", "RelocationStatus", "StoreType"]
        for r in addresses:
            # take the whole row so every column (including StoreType) is kept
            data = list(r)
            datadict = {column_names[itemindex]: item for itemindex, item in enumerate(data)}
            return (datadict, 200)
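As an aside, sqlite3 can hand back dict-like rows itself if you set the connection's row_factory to sqlite3.Row, which would avoid hard-coding the column list. A minimal sketch (not from the original post; the query is parameterized here):

import sqlite3

conn = sqlite3.connect('store-db.db')
conn.row_factory = sqlite3.Row  # rows now support access by column name
cur = conn.cursor()
cur.execute('SELECT * FROM Sites WHERE StoreNumber = ?', ('42',))
row = cur.fetchone()
if row is not None:
    print(dict(row))  # {'StoreNumber': '42', 'Street': '5000 Robinson Centre Drive', ...}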

You could just convert your list to a dict and then serialize it to a JSON string before passing it back out.
# These are the names of the columns in your database
>>> column_names = ["storeid", "address", "etc"]
# This is the data coming from the database.
# All columns are returned because you are using SELECT * in your query
>>> data = [42, "1 the street", "blah"]
# This is a quick notation for creating a dict from a list:
# enumerate gives us a list index along with each list item,
# and since the columns are in the same order as the data,
# we can use the list index to pull out the column name
>>> datadict = {column_names[itemindex]: item for itemindex, item in enumerate(data)}
# This just prints datadict in my terminal
>>> datadict
We now have a named dict containing your data and the column names.
{'etc': 'blah', 'storeid': 42, 'address': '1 the street'}
Now dump the datadict to a string so that it can be sent to the frontend.
>>> import json
>>> json.dumps(datadict)
The dict has now been converted to a string.
'{"etc": "blah", "storeid": 42, "address": "1 the street"}'
This would require no change to your database but the script would need to know about the column names or retrieve them dynamically using some SQL.
If the data in the database is in the correct format for passing to the frontend then you shouldn't need to change the database structure. If it was not in the correct format then you could either change the way it was stored or change your SQL query to manipulate it.
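To retrieve the column names dynamically, as suggested above, one option is the cursor's description attribute, which sqlite3 populates after a SELECT. A minimal sketch along the lines of the question's handler (the get_address helper name is hypothetical, and the query is parameterized, which also avoids building SQL by string concatenation):

import sqlite3

def get_address(store):
    conn = sqlite3.connect('store-db.db')
    cur = conn.cursor()
    cur.execute('SELECT * FROM Sites WHERE StoreNumber = ?', (store,))
    row = cur.fetchone()
    if row is None:
        return None
    # each entry in cur.description is a 7-tuple whose first item is the column name
    column_names = [d[0] for d in cur.description]
    return dict(zip(column_names, row))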

Related

How to create bulk node relationships using py2neo

I need to populate a Neo4j database from a JSON file containing data about some processes: among other things, the name of each process, its parents, and its children (if any). Here is part of the JSON as an example:
[
  {
    "process": "IPTV_Subscriptions",
    "parents": ["IPTV_Navigation", "DeviceCertifications-insertion"],
    "childs": ["villa_iptv", "villa_ott", "villa_calicux"]
  },
  {
    "process": "IPTV_Navigation",
    "parents": [],
    "childs": ["IPTV_Subscriptions"]
  },
  {
    "process": "DeviceCertifications-getter",
    "parents": [],
    "childs": ["DeviceCertifications-insertion"]
  },
  {
    "process": "DeviceCertifications-insertion",
    "parents": ["DeviceCertifications-getter"],
    "childs": ["IPTV_Subscriptions"]
  }
]
With the following Python code, I found that I can create the process nodes from the JSON in bulk:
import json
from py2neo import Graph
from py2neo.bulk import create_nodes, create_relationships

graph = Graph("bolt://localhost:7687", auth=("yyyy", "xxxx"))

# Opening json
f = open('/app/conf/data.json')
processs = json.load(f)

data = []
for i in processs:
    proc = []
    proc.append(i["process"])
    data.append(proc)

keys = ["process"]
create_nodes(graph.auto(), data, labels={"process"}, keys=keys)
And checking in neo4j, I see that the nodes are already created.
But now I need to make the relationships. For each process, from the json I know which are the parents and children of that node.
I tried to follow the example from the documentation:
from py2neo import Graph
from py2neo.bulk import create_relationships

g = Graph()
data = [
    (("Alice", "Smith"), {"since": 1999}, "ACME"),
    (("Bob", "Jones"), {"since": 2002}, "Bob Corp"),
    (("Carol", "Singer"), {"since": 1981}, "The Daily Planet"),
]
create_relationships(g.auto(), data, "WORKS_FOR",
                     start_node_key=("Person", "name", "family name"),
                     end_node_key=("Company", "name"))
But it didn't work for me.
Since the JSON gives me each process's parents and children, does anyone have an idea how I can generate the relationships in bulk? Given the example JSON, the relationship types would be ParentOf and ChildOf, but I have no idea how to generate them from Python.
Below is a script to create the relationships in bulk using py2neo. Let me know whether it works for you. One other thing: please label your nodes as Process rather than process (note the capital P). I use the relationship type :CHILD_OF here; if you want :PARENT_OF instead, swap the first and third items in each data tuple.
import json
from py2neo import Graph
from py2neo.bulk import create_relationships

graph = Graph("neo4j://localhost:7687", auth=("neo4j", "neo4jay"))

# Opening json
f = open('data2.json')
processs = json.load(f)

data = []
for i in processs:
    for p in i["parents"]:
        data.append((i["process"], {}, p))

create_relationships(graph.auto(), data, "CHILD_OF",
                     start_node_key=("Process", "process"),
                     end_node_key=("Process", "process"))
Result: (screenshot of the created relationships omitted)
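For completeness, a sketch of the :PARENT_OF variant described above, with the first and third items of each data tuple swapped (same assumptions: nodes labeled Process with a process key, and the processs list loaded as in the script above):

data = []
for i in processs:
    for p in i["parents"]:
        # (parent, properties, child) instead of (child, properties, parent)
        data.append((p, {}, i["process"]))

create_relationships(graph.auto(), data, "PARENT_OF",
                     start_node_key=("Process", "process"),
                     end_node_key=("Process", "process"))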

What are the possible ways for JSON data processing using SQL, Elasticsearch, or preprocessing in Python

I have a case study where I need to take data from a REST API, do some analysis on the data using aggregate functions, joins, etc., and use the JSON response data to plot some retail graphs.
Approaches followed so far:
Read the data from the JSON, store it in Python variables, and insert it with SQL queries. This is obviously a costly operation, because it performs one insert per JSON line read; for 33k rows it takes more than 20 minutes, which is inefficient.
This could be handled in Elasticsearch for faster processing, but complex operations like joins are not available in Elasticsearch.
If anybody can suggest the best approach (such as preprocessing or post-processing in Python) for handling such scenarios, it would be helpful.
Thanks in advance.
SQL script:
def store_data(AccountNo):
    db = MySQLdb.connect(host=HOST, user=USER, passwd=PASSWD, db=DATABASE, charset="utf8")
    cursor = db.cursor()
    insert_query = "INSERT INTO cstore (AccountNo) VALUES (%s)"
    # parameters must be passed as a sequence, hence the one-element tuple
    cursor.execute(insert_query, (AccountNo,))
    db.commit()
    cursor.close()
    db.close()
    return
def on_data(file_path):
    # This is the meat of the script... it connects to your mongoDB and stores the tweet
    try:
        # Decode the JSON from Twitter
        testFile = open(file_path)
        datajson = json.load(testFile)
        # print(len(datajson))
        # grab the wanted data from the Tweet
        for i in range(len(datajson)):
            for cosponsor in datajson[i]:
                AccountNo = cosponsor['AccountNo']
                store_data(AccountNo)
    except:
        pass
Edit 1: JSON added
{
  "StartDate": "1/1/18",
  "EndDate": "3/30/18",
  "Transactions": [
    {
      "CSPAccountNo": "41469300",
      "ZIP": "60098",
      "ReportDate": "2018-03-08T00:00:00",
      "POSCode": "00980030003",
      "POSCodeModifier": "0",
      "Description": "TIC TAC GUM WATERMEL",
      "ActualSalesPrice": 1.59,
      "TotalCount": 1,
      "Totalsales": 1.59,
      "DiscountAmount": 0,
      "DiscountCount": 0,
      "PromotionAmount": 0,
      "PromotionCount": 0,
      "RefundAmount": 0,
      "RefundCount": 0
    },
    {
      "CSPAccountNo": "41469378",
      "ZIP": "60098",
      "ReportDate": "2018-03-08T00:00:00",
      "POSCode": "01070080727",
      "POSCodeModifier": "0",
      "Description": "PAYDAY KS",
      "ActualSalesPrice": 2.09,
      "TotalCount": 1,
      "Totalsales": 2.09,
      "DiscountAmount": 0,
      "DiscountCount": 0,
      "PromotionAmount": 0,
      "PromotionCount": 0,
      "RefundAmount": 0,
      "RefundCount": 0
    }
  ]
}
I don't have your JSON file, so I don't know whether this is runnable as-is, but I would try something like the code below: read just the account info into a list, then write it to the database all at once with executemany. I would expect a much better (shorter) execution time than 20 minutes.
def store_data(rows):
    db = MySQLdb.connect(host=HOST, user=USER, passwd=PASSWD, db=DATABASE, charset="utf8")
    cursor = db.cursor()
    # MySQLdb expects %(name)s (pyformat) placeholders, not :name
    insert_query = "INSERT INTO cstore (AccountNo, ZIP, ReportDate) VALUES (%(AccountNo)s, %(ZIP)s, %(ReportDate)s)"
    cursor.executemany(insert_query, rows)
    db.commit()
    cursor.close()
    db.close()
    return

def on_data(file_path):
    # This is the meat of the script... it connects to your mongoDB and stores the tweet
    try:
        # declare an empty list for all the account numbers
        accountno_list = list()
        # Decode the JSON from Twitter
        testFile = open(file_path)
        datajson = json.load(testFile)
        # grab the wanted data from the Tweet
        # (the file shown in the question is a single object, so index
        # its "Transactions" key directly)
        for row in datajson['Transactions']:
            values = dict()
            values['AccountNo'] = row['CSPAccountNo']
            values['ZIP'] = row['ZIP']
            values['ReportDate'] = row['ReportDate']
            # from here on you can populate the attributes you need in a similar way...
            accountno_list.append(values)
    except:
        pass
    store_data(accountno_list)
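To illustrate what store_data receives under this approach: each list element is a dict whose keys match the %(name)s placeholders (sample values taken from the question's JSON):

rows = [
    {"AccountNo": "41469300", "ZIP": "60098", "ReportDate": "2018-03-08T00:00:00"},
    {"AccountNo": "41469378", "ZIP": "60098", "ReportDate": "2018-03-08T00:00:00"},
]
store_data(rows)  # a single executemany call inserts the whole batch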

writing json-ish list to csv, line by line, in python for bitcoin addresses

I'm querying the onename api in an effort to get the bitcoin addresses of all the users.
At the moment I'm getting all the user information as a JSON-esque list and piping the output to a file; it looks like this:
[{'0': {'owner_address': '1Q2Tv6f9vXbdoxRmGwNrHbjrrK4Hv6jCsz', 'zone_file': '{"avatar": {"url": "https://s3.amazonaws.com/kd4/111"}, "bitcoin": {"address": "1NmLvYVEZqPGeQNcgFS3DdghpoqaH4r5Xh"}, "cover": {"url": "https://s3.amazonaws.com/dx3/111"}, "facebook": {"proof": {"url": "https://facebook.com/jasondrake1978/posts/10152769170542776"}, "username": "jasondrake1978"}, "graph": {"url": "https://s3.amazonaws.com/grph/111"}, "location": {"formatted": "Mechanicsville, Va"}, "name": {"formatted": "Jason Drake"}, "twitter": {"username": "000001"}, "v": "0.2", "website": "http://1642.com"}', 'verifications': [{'proof_url': 'https://facebook.com/jasondrake1978/posts/10152769170542776', 'service': 'facebook', 'valid': False, 'identifier': 'jasondrake1978'}], 'profile': {'website': 'http://1642.com', 'cover': {'url': 'https://s3.amazonaws.com/dx3/111'}, 'facebook': {'proof': {'url': 'https://facebook.com/jasondrake1978/posts/10152769170542776'}, 'username': 'jasondrake1978'}, 'twitter': {'username': '000001'}, 'bitcoin': {'address': '1NmLvYVEZqPGeQNcgFS3DdghpoqaH4r5Xh'}, 'name': {'formatted': 'Jason Drake'}, 'graph': {'url': 'https://s3.amazonaws.com/grph/111'}, 'location': {'formatted': 'Mechanicsville, Va'}, 'avatar': {'url': 'https://s3.amazonaws.com/kd4/111'}, 'v': '0.2'}}}]
What I'm really interested in is the field {"address": "1NmLvYVEZqPGeQNcgFS3DdghpoqaH4r5Xh"}; I don't need the rest, I just want the addresses of every user.
Is there a way that I can just write only the addresses to a file using python?
I'm trying to write it as something like:
1NmLvYVEZqPGeQNcgFS3DdghpoqaH4r5Xh,
1GA9RVZHuEE8zm4ooMTiqLicfnvymhzRVm,
1BJdMS9E5TUXxJcAvBriwvDoXmVeJfKiFV,
1NmLvYVEZqPGeQNcgFS3DdghpoqaH4r5Xh,
...
and so on.
I've tried a number of different ways using dump, dumps, etc. but I haven't yet been able to pin it down.
My code looks like this:
import os
import json
import requests
#import py2neo
import csv

# set up authentication parameters
#py2neo.authenticate("46.101.180.63:7474", "neo4j", "uni-bonn")

# Connect to graph and add constraints.
neo4jUrl = os.environ.get('NEO4J_URL', "http://46.101.180.63:7474/db/data/")
#graph = py2neo.Graph(neo4jUrl)

# Add uniqueness constraints.
#graph.run("CREATE CONSTRAINT ON (q:Person) ASSERT q.id IS UNIQUE;")

# Build URL.
apiUrl = "https://api.onename.com/v1/users"
# apiUrl = "https://raw.githubusercontent.com/s-matthew-english/26.04/master/test.json"

# Send GET request.
Allusersjson = requests.get(apiUrl, headers={"accept": "application/json"}).json()
#print(json)

UsersDetails = []
for username in Allusersjson['usernames']:
    usernamex = username[:-3]
    apiUrl2 = "https://api.onename.com/v1/users/" + usernamex + "?app-id=demo-app-id&app-secret=demo-app-secret"
    userinfo = requests.get(apiUrl2, headers={"accept": "application/json"}).json()
    # try:
    #     if 'bitcoin' not in userinfo[usernamex]['profile']:
    #         continue
    #     else:
    #         UsersDetails.append(userinfo)
    # except:
    #     continue
    try:
        address = userinfo[usernamex]["profile"]["bitcoin"]["address"]
        UsersDetails.append(address)
    except KeyError:
        pass  # no address

out = "\n".join(UsersDetails)
print(out)
open("out.csv", "w").write(out)
# f = csv.writer(open("test.csv", "wb+"))

# Build query.
query = """
RETURN {json}
"""
# Send Cypher query.
# py2neo.CypherQuery(graph, query).run(json=json)
# graph.run(query).run(json=json)
#graph.run(query, json=json)
Anyway, in such a situation, what's the best way to write out those addresses as a CSV?
UPDATE
I ran it, and at first it worked, but then I got an error.
Instead of adding all the information to the UsersDetails list with
UsersDetails.append(userinfo)
you can add just the relevant part (the address):
try:
    address = userinfo[usernamex]["profile"]["bitcoin"]["address"]
    UsersDetails.append(address)
except KeyError:
    pass  # no address
except TypeError:
    pass  # ill-formed data
To print the values to the screen:
out = "\n".join(UsersDetails)
print(out)
(replace "\n" with "," for comma separated output, instead of one per line)
To save to a file:
open("out.csv", "w").write(out)
You need to reformat the list, either through map() or a list comprehension, to get it down to just the information you want. For example, if the top-level key used in the response from the api.onename.com API is always 0, you can do something like this:
UsersAddresses = [user['0']['profile']['bitcoin']['address'] for user in UsersDetails]
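Since the question title asks for CSV specifically, here is a small sketch using the csv module (which the question already imports) instead of the newline-joined string; one address per row is assumed:

import csv

with open("out.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for address in UsersDetails:
        writer.writerow([address])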

Getting certain information from string

I'm new to Python and was wondering how I could get the estimatedWait and routeName from this string.
{
  "lastUpdated": "07:52",
  "filterOut": [],
  "arrivals": [
    {
      "routeId": "B16",
      "routeName": "B16",
      "destination": "Kidbrooke",
      "estimatedWait": "due",
      "scheduledTime": "06:53",
      "isRealTime": true,
      "isCancelled": false
    },
    {
      "routeId": "B13",
      "routeName": "B13",
      "destination": "New Eltham",
      "estimatedWait": "29 min",
      "scheduledTime": "07:38",
      "isRealTime": true,
      "isCancelled": false
    }
  ],
  "serviceDisruptions": {
    "infoMessages": [],
    "importantMessages": [],
    "criticalMessages": []
  }
}
I'd then like to save this to another string, which would be displayed on the LXTerminal of my Raspberry Pi 2. I would like only the routeName of B16 to be saved to the string. How do I do that?
You just have to deserialise the object and then use its keys and indices to access the data you want.
To find only the B16 entries you can filter the arrivals list.
import json

obj = json.loads(json_string)
# filter only the B16 objects (a list comprehension, so the result can be
# truth-tested and indexed; in Python 3, filter() returns a lazy iterator)
b16_objs = [a for a in obj['arrivals'] if a['routeName'] == 'B16']
if b16_objs:
    # get the first item
    b16 = b16_objs[0]
    my_estimatedWait = b16['estimatedWait']
    print(my_estimatedWait)
You can use string.find() to get the indices of those value identifiers and extract them. Example:
def get_values(string):
    wait_key = '"estimatedWait":"'
    route_key = '"routeName":"'
    # skip past the identifier itself to the start of the value
    wait_start = string.find(wait_key) + len(wait_key)
    route_start = string.find(route_key) + len(route_key)
    estimatedWait = string[wait_start:string.find('"', wait_start)]
    routeName = string[route_start:string.find('"', route_start)]
    return estimatedWait, routeName
Or you could just deserialise the JSON object (highly recommended):
import json

def get_values(string):
    jsonData = json.loads(string)
    estimatedWait = jsonData['arrivals'][0]['estimatedWait']
    routeName = jsonData['arrivals'][0]['routeName']
    return estimatedWait, routeName
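Since the asker wants the B16 entry specifically rather than whichever arrival happens to be first, a small variant using next() with a generator expression (the function name is illustrative, and json_string is the document above):

import json

def get_values_for_route(string, route):
    jsonData = json.loads(string)
    # first arrival whose routeName matches, or None if there is none
    arrival = next((a for a in jsonData['arrivals'] if a['routeName'] == route), None)
    if arrival is None:
        return None, None
    return arrival['estimatedWait'], arrival['routeName']

estimatedWait, routeName = get_values_for_route(json_string, 'B16')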

Python - Parsing JSON Data Set

I am trying to parse a JSON data set that looks something like this:
{"data":[
{
"Rest":0,
"Status":"The campaign is moved to the archive",
"IsActive":"No",
"StatusArchive":"Yes",
"Login":"some_login",
"ContextStrategyName":"Default",
"CampaignID":1111111,
"StatusShow":"No",
"StartDate":"2013-01-20",
"Sum":0,
"StatusModerate":"Yes",
"Clicks":0,
"Shows":0,
"ManagerName":"XYZ",
"StatusActivating":"Yes",
"StrategyName":"HighestPosition",
"SumAvailableForTransfer":0,
"AgencyName":null,
"Name":"Campaign_01"
},
{
"Rest":82.6200000000008,
"Status":"Impressions will begin tomorrow at 10:00",
"IsActive":"Yes",
"StatusArchive":"No",
"Login":"some_login",
"ContextStrategyName":"Default",
"CampaignID":2222222,
"StatusShow":"Yes",
"StartDate":"2013-01-28",
"Sum":15998,"StatusModerate":"Yes",
"Clicks":7571,
"Shows":5535646,
"ManagerName":"XYZ",
"StatusActivating":"Yes",
"StrategyName":"HighestPosition",
"SumAvailableForTransfer":0,
"AgencyName":null,
"Name":"Campaign_02"
}
]
}
Let's assume that there can be many of these data sets.
I would like to iterate through each one of them and grab the "Name" and "CampaignID" parameters.
So far my code looks something like this:
decoded_response = response.read().decode("UTF-8")
data = json.loads(decoded_response)
for item in data[0]:
    for x in data[0][item]:
        # -> need a get name procedure
        # -> need a get campaign_id procedure
        ...
Probably quite straightforward! I am just not good with lists/dictionaries :(
Access dictionaries with d[dict_key] or d.get(dict_key, default) (to provide a default value):
jsonResponse = json.loads(decoded_response)
jsonData = jsonResponse["data"]
for item in jsonData:
    name = item.get("Name")
    campaignID = item.get("CampaignID")
I suggest you read something about dictionaries.
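Putting it together, a short usage sketch (variable names as in the question; the expected output follows from the sample JSON):

import json

jsonResponse = json.loads(decoded_response)
for item in jsonResponse["data"]:
    print(item.get("Name"), item.get("CampaignID"))
# Campaign_01 1111111
# Campaign_02 2222222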
