How to access certain values of a json site via python - python

This is the code i have so far:
import json
import requests
import time
endpoint = "https://www.deadstock.ca/collections/new-arrivals/products/nike-
air-max-1-cool-grey.json"
req = requests.get(endpoint)
reqJson = json.loads(req.text)
for id in reqJson['product']:
name = (id['title'])
print (name)
Feel free to visit the link, I'm trying to grab all the "id" value and print them out. They will be used later to send to my discord.
I tried with my above code but i have no idea how to actually get those values. I don't know which variable to use in the for in reqjson statement
If anyone could help me out and guide me to get all of the ids to print that would be awesome.
for product in reqJson['product']['title']:
ProductTitle = product['title']
print (title)

I see from the link you provided that the only ids that are in a list are actually part of the variants list under product. All the other ids are not part of a list and have therefore no need to iterate over. Here's an excerpt of the data for clarity:
{
"product":{
"id":232418213909,
"title":"Nike Air Max 1 \/ Cool Grey",
...
"variants":[
{
"id":3136193822741,
"product_id":232418213909,
"title":"8",
...
},
{
"id":3136193855509,
"product_id":232418213909,
"title":"8.5",
...
},
{
"id":3136193789973,
"product_id":232418213909,
"title":"9",
...
},
...
],
"image":{
"id":3773678190677,
"product_id":232418213909,
"position":1,
...
}
}
}
So what you need to do should be to iterate over the list of variants under product instead:
import json
import requests
endpoint = "https://www.deadstock.ca/collections/new-arrivals/products/nike-air-max-1-cool-grey.json"
req = requests.get(endpoint)
reqJson = json.loads(req.text)
for product in reqJson['product']['variants']:
print(product['id'], product['title'])
This outputs:
3136193822741 8
3136193855509 8.5
3136193789973 9
3136193757205 9.5
3136193724437 10
3136193691669 10.5
3136193658901 11
3136193626133 12
3136193593365 13
And if you simply want the product id and product name, they would be reqJson['product']['id'] and reqJson['product']['title'], respectively.

Related

Find coordinates in wikipedia pages iterating over a list

Probably this is a simple question, but my experience in for loop is very limited.
I was trying to adapt the solution in this page https://www.mediawiki.org/wiki/API:Geosearch with some simple examples that i have, but the result is not what i expected.
For example:
I have this simple data frame:
df= pd.DataFrame({'City':['Sesimbra','Ciudad Juárez','31100 Treviso','Ramada Portugal','Olhão'],
'Country':['Portugal','México','Itália','Portugal','Portugal']})
I created a list based on cities:
lista_cidades = list(df['City'])
and i would like to iterate over this list to get the coordinates (decimal, preferably)
So far i tried this approach:
import requests
lng_dict = {}
lat_dict = {}
S = requests.Session()
URL = "https://en.wikipedia.org/w/api.php"
PARAMS = {
"action": "query",
"format": "json",
"titles": [lista_cidades],
"prop": "coordinates"
}
R = S.get(url=URL, params=PARAMS)
DATA = R.json()
PAGES = DATA['query']['pages']
for i in range(len(lista_cidades)):
for k, v in PAGES.items():
try:
lat_dict[lista_cidades[i]] = str(v['coordinates'][0]['lat'])
lng_dict[lista_cidades[i]] = str(v['coordinates'][0]['lon'])
except:
pass
but it looks like the code doesn't iterate over the list and always returns the same coordinate
For example, when i call the dictionary with latitude coordinates, this is what i get
lng_dict
{'Sesimbra': '-7.84166667',
'Ciudad Juárez': '-7.84166667',
'31100 Treviso': '-7.84166667',
'Ramada Portugal': '-7.84166667',
'Olhão': '-7.84166667'}
What should i do to solve this?
Thanks in advance
I think the query returns only one result, it will take only the last city from you list (in your cas the "Olhão" coordinates).
You can check it by logging the DATA content.
I do not know about wikipedia API, but either your call lack a parameter (documentation should give you the information) or you have to call the API for each city like :
import pandas as pd
import requests
df = pd.DataFrame({'City': ['Sesimbra', 'Ciudad Juárez', '31100 Treviso', 'Ramada Portugal', 'Olhão'],
'Country': ['Portugal', 'México', 'Itália', 'Portugal', 'Portugal']})
lista_cidades = list(df['City'])
lng_dict = {}
lat_dict = {}
S = requests.Session()
URL = "https://en.wikipedia.org/w/api.php"
for city in lista_cidades:
PARAMS = {
"action": "query",
"format": "json",
"titles": city,
"prop": "coordinates"
}
R = S.get(url=URL, params=PARAMS)
DATA = R.json()
PAGES = DATA['query']['pages']
for k, v in PAGES.items():
try:
lat_dict[city] = str(v['coordinates'][0]['lat'])
lng_dict[city] = str(v['coordinates'][0]['lon'])
except:
pass

Why am I getting this error "TypeError: string indices must be integers" when trying to fetch data from an api?

json file =
{
"success": true,
"terms": "https://curr
"privacy": "https://cu
"timestamp": 162764598
"source": "USD",
"quotes": {
"USDIMP": 0.722761,
"USDINR": 74.398905,
"USDIQD": 1458.90221
}
}
The json file is above. i deleted lot of values from the json as it took too many spaces. My python code is in below.
import urllib.request, urllib.parse, urllib.error
import json
response = "http://api.currencylayer.com/live?access_key="
api_key = "42141e*********************"
parms = dict()
parms['key'] = api_key
url = response + urllib.parse.urlencode(parms)
mh = urllib.request.urlopen(url)
source = mh.read().decode()
data = json.loads(source)
pydata = json.dumps(data, indent=2)
print("which curreny do you want to convert USD to?")
xm = input('>')
print(f"Hoe many USD do you want to convert{xm}to")
value = input('>')
fetch = pydata["quotes"][0]["USD{xm}"]
answer = fetch*value
print(fetch)
--------------------------------
Here is the
output
"fetch = pydata["quotes"][0]["USD{xm}"]
TypeError: string indices must be integers"
First of all the JSON data you posted here is not valid. There are missing quotes and commas. For example here "terms": "https://curr. It has to be "terms": "https://curr",. The same at "privacy" and the "timestamp" is missing a comma. After i fixed the JSON data I found a solution. You have to use data not pydata. This mean you have to change fetch = pydata["quotes"][0]["USD{xm}"] to fetch = data["quotes"][0]["USD{xm}"]. But this would result in the next error, which would be a KeyError, because in the JSON data you provided us there is no array after the "qoutes" key. So you have to get rid of this [0] or the json data has to like this:
"quotes":[{
"USDIMP": 0.722761,
"USDINR": 74.398905,
"USDIQD": 1458.90221
}]
At the end you only have to change data["quotes"]["USD{xm}"] to data["quotes"]["USD"+xm] because python tries to find a key called USD{xm} and not for example USDIMP, when you type "IMP" in the input.I hope this fixed your problem.

How can I extract values from Tableau on this webpage

I am trying to extract the "mobility index" values for each state and county from this webpage:
https://www.cuebiq.com/visitation-insights-mobility-index/
The preferred output would be a panel data of place (state/county) by date for all available places and dates.
There is another thread (How can I scrape tooltips value from a Tableau graph embedded in a webpage) with a similar question. I tried to follow the solution there but it doesn't seem to work for my case.
Thanks a lot in advance.
(A way that I have tried is to download PDF files generated from Tableau, which would contain all counties' value on a specific date. However, I still need to find a way to make request for each date in the data. Anyway, let me know if you have a better idea than this route).
This tableau data url doesn't return any data. In fact, it only render images of the values (canvas probably) and I'm guessing it detects click based on coordinate. Probably, it's made this way to cache the value and render quickly.
But when you click on a state, it actually returns data but it seems it doesn't always returns the result for the state (but works the individual county).
The solution I've found is to use the tooltip to get the data for the state. When you click the state, it generates a request like this :
POST https://public.tableau.com/{path}/{session_id}/commands/tabsrv/render-tooltip-server
with the following form param :
worksheet: US Map - State - CMI
dashboard: CMI
tupleIds: [18]
vizRegionRect: {"r":"viz","x":496,"y":148,"w":0,"h":0,"fieldVector":null}
allowHoverActions: false
allowPromptText: true
allowWork: false
useInlineImages: true
where tupleIds: [18] refers to the index of the state in a list of states in reverse alphabetical order like this :
stateNames = ["Wyoming","Wisconsin","West Virginia","Washington","Virginia","Vermont","Utah","Texas","Tennessee","South Dakota","South Carolina","Rhode Island","Pennsylvania","Oregon","Oklahoma","Ohio","North Dakota","North Carolina","New York","New Mexico","New Jersey","New Hampshire","Nevada","Nebraska","Montana","Missouri","Mississippi","Minnesota","Michigan","Massachusetts","Maryland","Maine","Louisiana","Kentucky","Kansas","Iowa","Indiana","Illinois","Idaho","Georgia","Florida","District of Columbia","Delaware","Connecticut","Colorado","California","Arkansas","Arizona","Alabama"]
It gives a json with the html of the tooltip which has the CMI and YoY values you want to extract :
{
"vqlCmdResponse": {
"cmdResultList": [{
"commandName": "tabsrv:render-tooltip-server",
"commandReturn": {
"tooltipText": "{\"htmlTooltip\": \"<HTML HERE WITH THE VALUES>\"}]},\"overlayAnchors\":[]}"
}
}]
}
}
The only caveat is that you'll hava to make one request per state :
import requests
from bs4 import BeautifulSoup
import json
import time
data_host = "https://public.tableau.com"
r = requests.get(
f"{data_host}/views/CMI-2_0/CMI",
params= {
":showVizHome":"no",
}
)
soup = BeautifulSoup(r.text, "html.parser")
tableauData = json.loads(soup.find("textarea",{"id": "tsConfigContainer"}).text)
dataUrl = f'{data_host}{tableauData["vizql_root"]}/bootstrapSession/sessions/{tableauData["sessionid"]}'
r = requests.post(dataUrl, data= {
"sheet_id": tableauData["sheetId"],
})
data = []
stateNames = ["Wyoming","Wisconsin","West Virginia","Washington","Virginia","Vermont","Utah","Texas","Tennessee","South Dakota","South Carolina","Rhode Island","Pennsylvania","Oregon","Oklahoma","Ohio","North Dakota","North Carolina","New York","New Mexico","New Jersey","New Hampshire","Nevada","Nebraska","Montana","Missouri","Mississippi","Minnesota","Michigan","Massachusetts","Maryland","Maine","Louisiana","Kentucky","Kansas","Iowa","Indiana","Illinois","Idaho","Georgia","Florida","District of Columbia","Delaware","Connecticut","Colorado","California","Arkansas","Arizona","Alabama"]
for stateIndex, state in enumerate(stateNames):
time.sleep(0.5) #for throttling
r = requests.post(f'{data_host}{tableauData["vizql_root"]}/sessions/{tableauData["sessionid"]}/commands/tabsrv/render-tooltip-server',
data = {
"worksheet": "US Map - State - CMI",
"dashboard": "CMI",
"tupleIds": f"[{stateIndex+1}]",
"vizRegionRect": json.dumps({"r":"viz","x":496,"y":148,"w":0,"h":0,"fieldVector":None}),
"allowHoverActions": "false",
"allowPromptText": "true",
"allowWork": "false",
"useInlineImages": "true"
})
tooltip = json.loads(r.json()["vqlCmdResponse"]["cmdResultList"][0]["commandReturn"]["tooltipText"])["htmlTooltip"]
soup = BeautifulSoup(tooltip, "html.parser")
rows = [
t.find("tr").find_all("td")
for t in soup.find_all("table")
]
entry = { "state": state }
for row in rows:
if (row[0].text == "Mobility Index:"):
entry["CMI"] = "".join([t.text.strip() for t in row[1:]])
if row[0].text == "YoY (%):":
entry["YoY"] = "".join([t.text.strip() for t in row[1:]])
print(entry)
data.append(entry)
print(data)
Try this on repl.it
To get the county information it's the same as this post using the select endpoint which gives you the data with the same format as the post you've linked in your question
The following will extract data for all county and state :
import requests
from bs4 import BeautifulSoup
import json
import time
data_host = "https://public.tableau.com"
worksheet = "US Map - State - CMI"
dashboard = "CMI"
r = requests.get(
f"{data_host}/views/CMI-2_0/CMI",
params= {
":showVizHome":"no",
}
)
soup = BeautifulSoup(r.text, "html.parser")
tableauData = json.loads(soup.find("textarea",{"id": "tsConfigContainer"}).text)
dataUrl = f'{data_host}{tableauData["vizql_root"]}/bootstrapSession/sessions/{tableauData["sessionid"]}'
r = requests.post(dataUrl, data= {
"sheet_id": tableauData["sheetId"],
})
data = []
stateNames = ["Wyoming","Wisconsin","West Virginia","Washington","Virginia","Vermont","Utah","Texas","Tennessee","South Dakota","South Carolina","Rhode Island","Pennsylvania","Oregon","Oklahoma","Ohio","North Dakota","North Carolina","New York","New Mexico","New Jersey","New Hampshire","Nevada","Nebraska","Montana","Missouri","Mississippi","Minnesota","Michigan","Massachusetts","Maryland","Maine","Louisiana","Kentucky","Kansas","Iowa","Indiana","Illinois","Idaho","Georgia","Florida","District of Columbia","Delaware","Connecticut","Colorado","California","Arkansas","Arizona","Alabama"]
for stateIndex, state in enumerate(stateNames):
time.sleep(0.5) #for throttling
r = requests.post(f'{data_host}{tableauData["vizql_root"]}/sessions/{tableauData["sessionid"]}/commands/tabsrv/render-tooltip-server',
data = {
"worksheet": worksheet,
"dashboard": dashboard,
"tupleIds": f"[{stateIndex+1}]",
"vizRegionRect": json.dumps({"r":"viz","x":496,"y":148,"w":0,"h":0,"fieldVector":None}),
"allowHoverActions": "false",
"allowPromptText": "true",
"allowWork": "false",
"useInlineImages": "true"
})
tooltip = json.loads(r.json()["vqlCmdResponse"]["cmdResultList"][0]["commandReturn"]["tooltipText"])["htmlTooltip"]
soup = BeautifulSoup(tooltip, "html.parser")
rows = [
t.find("tr").find_all("td")
for t in soup.find_all("table")
]
entry = { "state": state }
for row in rows:
if (row[0].text == "Mobility Index:"):
entry["CMI"] = "".join([t.text.strip() for t in row[1:]])
if row[0].text == "YoY (%):":
entry["YoY"] = "".join([t.text.strip() for t in row[1:]])
r = requests.post(f'{data_host}{tableauData["vizql_root"]}/sessions/{tableauData["sessionid"]}/commands/tabdoc/select',
data = {
"worksheet": worksheet,
"dashboard": dashboard,
"selection": json.dumps({
"objectIds":[stateIndex+1],
"selectionType":"tuples"
}),
"selectOptions": "select-options-simple"
})
entry["county_data"] = r.json()["vqlCmdResponse"]["layoutStatus"]["applicationPresModel"]["dataDictionary"]["dataSegments"]
print(entry)
data.append(entry)
print(data)

how can i take a specific element from this list in python?

I'm working with the Microsoft Azure face API and I want to get only the glasses response.
heres my code:
########### Python 3.6 #############
import http.client, urllib.request, urllib.parse, urllib.error, base64, requests, json
###############################################
#### Update or verify the following values. ###
###############################################
# Replace the subscription_key string value with your valid subscription key.
subscription_key = '(MY SUBSCRIPTION KEY)'
# Replace or verify the region.
#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'https://westcentralus.api.cognitive.microsoft.com'
# Request headers.
headers = {
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}
# Request parameters.
params = {
'returnFaceAttributes': 'glasses',
}
# Body. The URL of a JPEG image to analyze.
body = {'url': 'https://upload.wikimedia.org/wikipedia/commons/c/c3/RH_Louise_Lillian_Gish.jpg'}
try:
# Execute the REST API call and get the response.
response = requests.request('POST', uri_base + '/face/v1.0/detect', json=body, data=None, headers= headers, params=params)
print ('Response:')
parsed = json.loads(response.text)
info = (json.dumps(parsed, sort_keys=True, indent=2))
print(info)
except Exception as e:
print('Error:')
print(e)
and it returns a list like this:
[
{
"faceAttributes": {
"glasses": "NoGlasses"
},
"faceId": "0f0a985e-8998-4c01-93b6-8ef4bb565cf6",
"faceRectangle": {
"height": 162,
"left": 177,
"top": 131,
"width": 162
}
}
]
I want just the glasses attribute so it would just return either "Glasses" or "NoGlasses"
Thanks for any help in advance!
I think you're printing the whole response, when really you want to drill down and get elements inside it. Try this:
print(info[0]["faceAttributes"]["glasses"])
I'm not sure how the API works so I don't know what your specified params are actually doing, but this should work on this end.
EDIT: Thank you to #Nuageux for noting that this is indeed an array, and you will have to specify that the first object is the one you want.
I guess that you can get few elements in that list, so you could do this:
info = [
{
"faceAttributes": {
"glasses": "NoGlasses"
},
"faceId": "0f0a985e-8998-4c01-93b6-8ef4bb565cf6",
"faceRectangle": {
"height": 162,
"left": 177,
"top": 131,
"width": 162
}
}
]
for item in info:
print (item["faceAttributes"]["glasses"])
>>> 'NoGlasses'
Did you try:
glasses = parsed[0]['faceAttributes']['glasses']
This looks more like a dictionary than a list. Dictionaries are defined using the { key: value } syntax, and can be referenced by the value for their key. In your code, you have faceAttributes as a key that for value contains another dictionary with a key glasses leading to the last value that you want.
Your info object is a list with one element: a dictionary. So in order to get at the values in that dictionary, you'll need to tell the list where the dictionary is (at the head of the list, so info[0]).
So your reference syntax will be:
#If you want to store it in a variable, like glass_var
glass_var = info[0]["faceAttributes"]["glasses"]
#Or if you want to print it directly
print(info[0]["faceAttributes"]["glasses"])
What's going on here? info[0] is the dictionary containing several keys, including faceAttributes,faceId and faceRectangle. faceRectangle and faceAttributes are both dictionaries in themselves with more keys, which you can reference to get their values.
Your printed tree there is showing all the keys and values of your dictionary, so you can reference any part of your dictionary using the right keys:
print(info["faceId"]) #prints "0f0a985e-8998-4c01-93b6-8ef4bb565cf6"
print(info["faceRectangle"]["left"]) #prints 177
print(info["faceRectangle"]["width"]) #prints 162
If you have multiple entries in your info list, then you'll have multiple dictionaries, and you can get all the outputs as so:
for entry in info: #Note: "entry" is just a variable name,
# this can be any name you want. Every
# iteration of entry is one of the
# dictionaries in info.
print(entry["faceAttributes"]["glasses"])
Edit: I didn't see that info was a list of a dictionary, adapted for that fact.

python json to csv converting script?

Let me start by stating that I am new to python. I wrote a script that will convert a .json file to csv format. I managed to write a script to do the job, however I don't think that my script will work if the format of the json file was to change. My script assumes that the json file will be in the same format at all times.
<json file example>
{
"Order":
{
"order_id":"8251662",
"order_date":"2012-08-20 13:17:37",
"order_date_shipped":"0000-00-00 00:00:00",
"order_status":"fraudreview",
"order_ship_firstname":"pam",
"order_ship_lastname":"Gregorio",
"order_ship_address1":"1533 E. Dexter St",
"order_ship_address2":"",
"order_ship_city":"Covina",
"order_ship_state":"CA",
"order_ship_zip":"91746",
"order_ship_country":"US United States",
"order_ship_phone":"6268936923",
"order_ship_email":"pgregorio#brighton.com",
"order_bill_firstname":"pam",
"order_bill_lastname":"Gregorio",
"order_bill_address1":"1533 E. Dexter St",
"order_bill_address2":"",
"order_bill_city":"Covina",
"order_bill_state":"CA",
"order_bill_zip":"91746",
"order_bill_country":"US United States",
"order_bill_phone":"6268936923",
"order_bill_email":"pgregorio#brighton.com",
"order_gift_message":"",
"order_giftwrap":"0",
"order_gift_charge":"0",
"order_shipping":"Standard (Within 5-10 Business Days)",
"order_tax_charge":"62.83",
"order_tax_shipping":"0",
"order_tax_rate":"0.0875",
"order_shipping_charge":"7.5",
"order_total":"788.33",
"order_item_count":"12",
"order_tracking":"",
"order_carrier":"1"
},
"Items":
[
{
"item_id":"25379",
"item_date_shipped":"",
"item_code":"17345-J3553-J35532",
"item_quantity":"2","item_taxable":"YES",
"item_unit_price":"32","item_shipping":"0.67",
"item_addcharge_price":"0",
"item_description":" ABC Slide Bracelet: : Size: OS: Silver Sku: J35532",
"item_quantity_returned":"0",
"item_quantity_shipped":"0",
"item_quantity_canceled":"0",
"item_status":"pending",
"item_product_id":"17345",
"item_product_kit_id":"0",
"item_product_sku":"J35532",
"item_product_barcode":"881934310775",
"item_tracking":"",
"item_carrier":"0",
"item_source_orderid":""
},
{
"item_id":"25382",
"item_date_shipped":"",
"item_code":"17608-J3809-J3809C",
"item_quantity":"1",
"item_taxable":"YES",
"item_unit_price":"22",
"item_shipping":"0.23",
"item_addcharge_price":"0",
"item_description":" \"ABC Starter Bracelet 7 1\/4\"\"\": : Size: OS: Silver Sku: J3809C",
"item_quantity_returned":"0",
"item_quantity_shipped":"0",
"item_quantity_canceled":"0",
"item_status":"pending",
"item_product_id":"17608",
"item_product_kit_id":"0",
"item_product_sku":"J3809C",
"item_product_barcode":"881934594175",
"item_tracking":"",
"item_carrier":"0",
"item_source_orderid":""
},
{
"item_id":"25385",
"item_date_shipped":"",
"item_code":"17687-J9200-J92000",
"item_quantity":"2",
"item_taxable":"YES",
"item_unit_price":"12",
"item_shipping":"0.25",
"item_addcharge_price":"0",
"item_description":" ABC Cathedral Bead: : Size: OS: Silver Sku: J92000",
"item_quantity_returned":"0",
"item_quantity_shipped":"0",
"item_quantity_canceled":"0",
"item_status":"pending",
"item_product_id":"17687",
"item_product_kit_id":"0",
"item_product_sku":"J92000",
"item_product_barcode":"881934602832",
"item_tracking":"",
"item_carrier":"0",
"item_source_orderid":""
},
{
"item_id":"25388",
"item_date_shipped":"",
"item_code":"17766-J9240-J92402",
"item_quantity":"2",
"item_taxable":"YES",
"item_unit_price":"22",
"item_shipping":"0.46",
"item_addcharge_price":"0",
"item_description":" ABC Ice Diva Bead: : Size: OS: Silver Sku: J92402",
"item_quantity_returned":"0",
"item_quantity_shipped":"0",
"item_quantity_canceled":"0",
"item_status":"pending",
"item_product_id":"17766",
"item_product_kit_id":"0",
"item_product_sku":"J92402",
"item_product_barcode":"881934655838",
"item_tracking":"",
"item_carrier":"0",
"item_source_orderid":""
},
],
"FraudReasons":
[
{
"order_id":"11957",
"fraud_reason":"order total exceeds max amount"
},
{
"order_id":"11957",
"fraud_reason":"order exceeds max item count"
}
]
}
My script currently works fine with this json file but It wont work if there is only one item or one fraudreason. Here is the code to my script.
<script code>
#!/usr/bin/python
import simplejson as json
import optparse
import pycurl
import sys
import csv
json_data = open(file)
data = json.load(json_data)
json_data.close()
csv_file = '/tmp/' + str(options.orderId) + '.csv'
orders = data['Order']
items = data['Items']
frauds = data['FraudReasons']
o = csv.writer(open(csv_file, 'w'), lineterminator=',')
o.writerow([orders['order_id'],orders['order_date'],orders['order_date_shipped'],orders['order_status'],orders['order_ship_firstname'],orders['order_ship_lastname'],orders['order_ship_address1'],orders['order_ship_address2'],orders['order_ship_city'],orders['order_ship_state'],orders['order_ship_zip'],orders['order_ship_country'],orders['order_ship_phone'],orders['order_ship_email'],orders['order_bill_firstname'],orders['order_bill_lastname'],orders['order_bill_address1'],orders['order_bill_address2'],orders['order_bill_city'],orders['order_bill_state'],orders['order_bill_zip'],orders['order_bill_country'],orders['order_bill_phone'],orders['order_bill_email'],orders['order_gift_message'],orders['order_giftwrap'],orders['order_gift_charge'],orders['order_shipping'],orders['order_tax_charge'],orders['order_tax_shipping'],orders['order_tax_rate'],orders['order_shipping_charge'],orders['order_total'],orders['order_item_count'],orders['order_tracking'],orders['order_carrier']])
for item in items:
o.writerow([item['item_id'],item['item_date_shipped'],item['item_code'],item['item_quantity'],item['item_taxable'],item['item_unit_price'],item['item_shipping'],item['item_addcharge_price'],item['item_description'],item['item_quantity_returned'],item['item_quantity_shipped'],item['item_quantity_canceled'],item['item_status'],item['item_product_id'],item['item_product_kit_id'],item['item_product_sku'],item['item_product_barcode'],item['item_tracking'],item['item_carrier'],item['item_source_orderid']])
for fraud in frauds:
o.writerow([fraud['fraud_reason']],)
I also have not been able to figure out how not to use the labels I hope someone can help me with this
thanks in advance.
You may want to use csv.DictWriter:
# It's considered best to stash the main logic of your script
# in a main() function like this.
def main(filename, options):
with open(filename) as fi:
data = json.load(fi)
csv_file = '/tmp/' + str(options.orderId) + '.csv'
order = data['Order']
items = data['Items']
frauds = data['FraudReasons']
# Here's one way to keep this maintainable if the JSON
# format changes, and you don't care too much about the
# order of the fields...
orders_fields = sorted(orders.keys())
item_fields = sorted(items[0].keys()) if items else ()
fraud_fields = sorted(fraud[0].keys()) if fraud else ()
csv_options = dict(lineterminator=',')
with open(csv_file, 'w') as fo:
o = csv.DictWriter(fo, order_fields, **csv_options)
o.writeheader()
o.writerow(orders)
fo.write('\n') # Optional, if you want to keep them separated.
o = csv.DictWriter(fo, item_fields, **csv_options)
o.writeheader()
o.writerows(items)
fo.write('\n') # Optional, if you want to keep them separated.
o = csv.DictWriter(fo, fraud_fields, **csv_options)
o.writeheader()
o.writerows(frauds)
# If this script is run from the command line, just run
# main(). Here's the place to use `optparse`.
if __name__ == '__main__':
main(...) # You'll need to fill in the main() arguments...
If you need to specify the order of fields, assign them to a tuple like this:
orders_fields = (
'order_id',
'order_date',
'order_date_shipped',
# ... etc.
)
You should ask the json-generated object (data) for the names of the fields. To retain the input order, tell json to use collections.OrderedDict instead of plain dict (requires python 2.7):
import json
from collections import OrderedDict as ordereddict
data = json.loads(open('mydata.json', object_pairs_hook=ordereddict)
orders = data['Order']
print orders.keys() # Will print the keys in the order they were read
You can then use orders.keys() instead of your hard-coded list, either with writerow or (simpler) with csv.DictWriter.
Note that this uses the default json, not simplejson, and requires python 2.7 for the ordered_pairs_hook argument and the OrderedDict type.
Edit: Yeah, I see from the comments that you're stuck with 2.4. You can download an ordereddict from PyPi, and you can extend the JSONDecoder class and pass it with the cls argument (see here), instead of object_pairs_hook, but that's uglier and more work...

Categories

Resources