How to convert python-requests JSON results to CSV?

I am trying to get my list of contacts from my WIX website using their API endpoint url and the requests module in python. I am totally stuck.
Here's my code so far:
import requests

auth_key = "my auth key"
r = requests.get("https://www.wixapis.com/crm/v1/contacts", headers={"Authorization": auth_key})
print(r.status_code)
data = r.json()  # renamed from `dict` to avoid shadowing the built-in
contacts_list = data["contacts"]
for contact in contacts_list:
    for key in contact:
        print(key, ':', contact[key])
Here is what I get:
200
id : long id string 1
emails : [{'tag': 'UNTAGGED', 'email': 'sampleemail1@yahoo.com'}]
phones : []
addresses : [{'tag': 'UNTAGGED', 'countryCode': 'US'}]
metadata : {'createdAt': '2020-07-08T22:41:07.135Z', 'updatedAt': '2020-07-08T22:42:19.327Z'}
source : {'sourceType': 'SITE_MEMBERS'}
id : long id string 2
emails : [{'tag': 'UNTAGGED', 'email': 'sampleemail2@yahoo.com'}]
phones : []
addresses : []
metadata : {'createdAt': '2020-07-03T00:51:21.127Z', 'updatedAt': '2020-07-04T03:26:16.370Z'}
source : {'sourceType': 'SITE_MEMBERS'}
Process finished with exit code 0
Each line is a string. I need each row of the CSV to be a new contact (there are two sample contacts here). The columns should be the keys; I plan to use the csv module's writerow(Fields), where Fields is a list of strings (keys) such as Fields = ['id', 'emails', 'phones', 'addresses', 'metadata', 'source'].
All I really need is the emails in a single column of a CSV, though. Is there a way to maybe just get the email for each contact?

A CSV file with one column is basically just a text file with one item per line, but you can use the csv module to do it if you really want, as shown below.
I commented out the python-requests stuff and used some sample input for testing.
import csv
import json
import requests

test_data = {
    "contacts": [
        {
            "id": "long id string 1",
            "emails": [
                {
                    "tag": "UNTAGGED",
                    "email": "sampleemail1@yahoo.com"
                }
            ],
            "phones": [],
            "addresses": [
                {
                    "tag": "UNTAGGED",
                    "countryCode": "US"
                }
            ],
            "metadata": {
                "createdAt": "2020-07-08T22:41:07.135Z",
                "updatedAt": "2020-07-08T22:42:19.327Z"
            },
            "source": {
                "sourceType": "SITE_MEMBERS"
            }
        },
        {
            "id": "long id string 2",
            "emails": [
                {
                    "tag": "UNTAGGED",
                    "email": "sampleemail2@yahoo.com"
                }
            ],
            "phones": [],
            "addresses": [],
            "metadata": {
                "createdAt": "2020-07-03T00:51:21.127Z",
                "updatedAt": "2020-07-04T03:26:16.370Z"
            },
            "source": {
                "sourceType": "SITE_MEMBERS"
            }
        }
    ]
}

auth_key = "my auth key"
output_filename = 'whatever.csv'

#r = requests.get("https://www.wixapis.com/crm/v1/contacts", headers={"Authorization": auth_key})
#print(r.status_code)
#json_obj = r.json()
json_obj = test_data  # FOR TESTING PURPOSES

contacts_list = json_obj["contacts"]
with open(output_filename, 'w', newline='') as outp:
    writer = csv.writer(outp)
    writer.writerow(['email'])  # Write csv header.
    for contact in contacts_list:
        email = contact['emails'][0]['email']  # Get the first one.
        writer.writerow([email])

print('email csv file written')
Contents of whatever.csv file afterwards:
email
sampleemail1@yahoo.com
sampleemail2@yahoo.com

Update:
As pointed out by @martineau, some of the values are arrays that can hold several items, so you need to handle them: you can turn each one into a single string with something like ', '.join(...) in the for loop.
You can write it to a CSV like this using the csv package.
import csv
import sys

import requests

auth_key = "my auth key"
r = requests.get("https://www.wixapis.com/crm/v1/contacts", headers={"Authorization": auth_key})
print(r.status_code)
data = r.json()  # renamed from `dict` to avoid shadowing the built-in
contacts_list = data["contacts"]

output = csv.writer(sys.stdout)
# Insert header (keys) taken from the first contact.
output.writerow(contacts_list[0].keys())
for contact in contacts_list:
    output.writerow(contact.values())
At the end you can print and verify the output.
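If some of the values are themselves lists (emails, phones, addresses), you can flatten each row before writing it. A minimal sketch, reusing contacts_list from above (the str() fallback for nested dicts is just for illustration):

import csv
import sys

def flatten(value):
    # Join list elements into one cell; stringify everything else.
    if isinstance(value, list):
        return ', '.join(str(v) for v in value)
    return str(value)

output = csv.writer(sys.stdout)
output.writerow(contacts_list[0].keys())  # header from the first contact
for contact in contacts_list:
    output.writerow([flatten(v) for v in contact.values()])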

Related

Python & Pandas: Parsing JSONs in a loop

With Python I'm pulling a nested JSON, and I'm seeking to parse it via a loop and write the data to a CSV. The structure of the JSON is below. The values I'm after are in the "view" list, labeled "user_id" and "message".
{
    "view": [
        {
            "id": 109205,
            "user_id": 6354,
            "parent_id": null,
            "created_at": "2020-11-03T23:32:49Z",
            "updated_at": "2020-11-03T23:32:49Z",
            "rating_count": null,
            "rating_sum": null,
            "message": "message text",
            "replies": [
                # json continues
            ],
        }
After some study and assistance from this helpful tutorial I was able to structure requests like this:
import requests
import json
import pandas as pd
url = "URL"
headers = {'Authorization' : 'Bearer KEY'}
r = requests.get(url, headers=headers)
data = r.json()
print(data['view'][0]['user_id'])
print(data['view'][0]['message'])
Which successfully prints the outputs 6354 and "message text".
Now... how would I approach capturing all the user IDs and messages from the JSON to a CSV with Pandas?
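One way to approach this (a sketch, not tested against the real API; it assumes every entry in "view" carries both fields): collect the two values into a list of dicts, build a DataFrame from it, and let to_csv do the writing. The filename messages.csv is arbitrary.

import pandas as pd

# One dict per entry in "view", keeping only the two wanted fields.
rows = [{'user_id': v['user_id'], 'message': v['message']}
        for v in data['view']]

df = pd.DataFrame(rows)
df.to_csv('messages.csv', index=False)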

parsing JSON with missing fields

I have a JSON array with very dynamic fields, and some of the entries don't have all the fields.
Example:
[
    {
        "Name": "AFG LIMITED",
        "Vendor ID": "008343",
        "EGID": "67888",
        "FID": "83748374"
    },
    {
        "Name": "ABC LIMITED",
        "Vendor ID": "008333",
        "EGID": "67888",
        "AID": "0000292",
        "FID": "98979"
    }
]
I need to extract particular keys, with a header and pipe delimiter, like: Name|Vendor ID|EGID|AID (AID is only present in the second entry). If a key is not present, it should get a null value.
I tried to parse this with the code below, but it breaks on the second entry because AID is missing.
import json

with open("sample.json", "r") as rf:
    decoded_data = json.load(rf)

# Check the json object was loaded correctly
try:
    for i in decoded_data:
        print(i["Name"], "|", i["Vendor ID"], "|", i["EGID"], "|", i["AID"])
except KeyError:
    print("null")
output from above code:
AFG LIMITED|008343|67888|null
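One way to avoid the KeyError (a sketch; the field names are taken from the sample above): look each key up with dict.get() and a default, so missing keys come out as "null" instead of raising.

import json

fields = ["Name", "Vendor ID", "EGID", "AID"]

with open("sample.json", "r") as rf:
    decoded_data = json.load(rf)

print("|".join(fields))  # header row
for record in decoded_data:
    # .get() substitutes "null" for any missing key instead of raising KeyError.
    print("|".join(record.get(field, "null") for field in fields))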

copying data from json response [Python]

I have a scenario where I am trying to extract data from a JSON response obtained from a GET request, rebuild the JSON by changing some values (i.e., the idter value), and then send the rebuilt data in a PUT request.
Below is the target JSON response.
target_json = {
    "name": "toggapp",
    "ts": [
        1234,
        3456
    ],
    "gs": [
        {
            "id": 4491,
            "con": "mno"
        },
        {
            "id": 4494,
            "con": "hkl"
        }
    ],
    "idter": 500,
    "datapart": false
}
From the above JSON I am trying to change the idter value to my custom value, rebuild the JSON data, and post the new JSON data.
Here is what I have tried :
headers = {'Authorization': 'bearer ' + auth_token,
           'Content-Type': 'application/json',
           'Accept': 'application/json'}
testid = [7865, 7536, 7789]
requiredbdy = []
for key in testid:
    get_metadata_targetjson = requests.get('https://myapp.com/%s' % key, headers=headers)
    metadata = get_metadata_targetjson.json()
    for key1 in metadata:
        requiredbdy.append(
            {
                "metadata": [{
                    "name": key1['name'],
                    "ts": key1['ts'],
                    "gs": key1['gs'],
                    "idter": 100,  # custom value which I want to change
                    "datapart": False
                }]
            }
        )
    send_metadata_newjson = requests.put('https://myapp.com/%s' % key, headers=headers, data=requiredbdy)
    print(send_metadata_newjson.status_code)
Is this approach fine, or how do I proceed in order to achieve this scenario?
You can use the built-in json module for this, like so:
import json

my_json = """
{
    "name": "toggapp",
    "ts": [
        1234,
        3456
    ],
    "gs": [
        {
            "id": 4491,
            "con": "mno"
        },
        {
            "id": 4494,
            "con": "hkl"
        }
    ],
    "idter": 500,
    "datapart": false
}
"""

json_obj = json.loads(my_json)
json_obj['idter'] = 600
print(json.dumps(json_obj))
Prints
{"name": "toggapp", "ts": [1234, 3456], "gs": [{"id": 4491, "con": "mno"}, {"id": 4494, "con": "hkl"}], "idter": 600, "datapart": false}
There's this small script I used to find entries in some very long and unwieldy JSONs. It's not very beautiful and badly documented, but maybe it helps in your scenario.
from RecursiveSearch import Retriever

def alter_data(json_data, key, original, newval):
    '''
    Alter *all* values of said keys
    '''
    retr = Retriever(json_data)
    for item_no, item in enumerate(retr.__track__(key)):  # i.e. all 'value'
        # Pick parent objects with a last element False in the __track__() result,
        # indicating that `key` is either a dict key or a set element
        if not item[-1]:
            parent = retr.get_parent(key, item_no)
            try:
                if parent[key] == original:
                    parent[key] = newval
            except TypeError:
                # It's a set, this is not the key you're looking for
                pass

if __name__ == '__main__':
    # `notification` here stands for your loaded JSON data.
    alter_data(notification, key='value',
               original='********** THIS SHOULD BE UPDATED **********',
               newval='*UPDATED*')
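If you'd rather not depend on that module, a plain recursive walk over dicts and lists can do the same job. A minimal sketch:

def alter_all(obj, key, original, newval):
    # Recursively replace every obj[key] == original with newval.
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key and v == original:
                obj[k] = newval
            else:
                alter_all(v, key, original, newval)
    elif isinstance(obj, list):
        for item in obj:
            alter_all(item, key, original, newval)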

Get all WWI records from Auckland Museum Online Cenotaph API as one .json, then combine and transform to .csv

I'm trying to write a Python script that will extract as one large .json file the (as of this writing) 105,445 WWI serviceperson records from api.aucklandmuseum.com, as shown in this SPARQL query.
So far I have this code:
import requests
import json

url = "http://api.aucklandmuseum.com/search/cenotaph/_search"
headers = {'Accept': 'application/json', 'Content-Type': 'application/json'}
numfrom = 0
i = 0
while numfrom <= 105500:
    print("From:", numfrom)
    payload = json.dumps({
        "sort": [{"dc_identifier": {"order": "asc"}}],
        "from": numfrom,
        "size": 500,
        "query": {"match": {"am_war": "World War I, 1914-1918"}}
    })
    response = requests.post(url, headers=headers, data=payload)
    outfilename = "OC" + str(i) + ".json"
    print("Start of", outfilename, ":\n", response.text[:175])  # just first 175 chars
    with open(outfilename, mode='w') as outfile:
        # Dump the parsed object, not response.text, so each file holds real
        # JSON rather than a doubly-encoded JSON string.
        json.dump(response.json(), outfile)
    i += 1
    numfrom += 500
This gets 500 records at a time and dumps them to files, printing to the console the head of each file to test that the results are as expected.
I need to combine all this JSON into one big file, but several issues arise:
The JSON output is structured like this:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"hits": {
"total": 104767,
"max_score": null,
"hits": [
{
"_index": "cenotaph-2019-06-18",
"_type": "am:MilitaryPerson",
"_id": "http://api.aucklandmuseum.com/id/person/C53694",
"_score": null,
"_source": {
"am_medicalInformation": [
{
"am_medical": [
"Other/WWI"
],
"am_record_score": [
"0"
],
"am_notes": [
"Nature of Injury: G.S.W. right foot\nPension: 10/-\nPercent of disability: [Not stated]\nSource: New Zealand. Army (1920). List of the names of all ex-members of the New Zealand Expeditionary Force, suffering permanent disability from 20% to 100%"
]
}
],
[…]
How can I write code (separately to the above would be fine) to combine all JSON output (while retaining the _id of each record, along with its _source data) then "flatten" that data and transform it into a .csv?
I hope the above is clear.
Thank you
Hugh
p.s.: the server is running ElasticSearch v1.5.
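As a starting point for the combine-and-flatten step, here is a sketch. It assumes the OC0.json, OC1.json, ... files written by the loop above (each holding one parsed response), keeps _id, and joins the list-valued am_war field into one cell; extend the header and row lists with whatever other _source fields you need. The output name cenotaph.csv is arbitrary.

import csv
import glob
import json

all_hits = []
for filename in sorted(glob.glob('OC*.json')):
    with open(filename) as f:
        page = json.load(f)
    all_hits.extend(page['hits']['hits'])  # keep every hit from every page

with open('cenotaph.csv', 'w', newline='') as outp:
    writer = csv.writer(outp)
    writer.writerow(['_id', 'am_war'])
    for hit in all_hits:
        source = hit['_source']
        # _source values in this API are lists, so join them into one cell.
        writer.writerow([hit['_id'], '; '.join(source.get('am_war', []))])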

Filter data to facebook graph with json-python

I'm getting the Facebook graph, and the data already shows what I need, but I couldn't filter the 'message' and 'id' out of the JSON. I'd appreciate any help; I leave my code:
import facebook
import json
import urllib.request

page_id = "MYPAGE"
access_token = 'MY-ACCESS-TOKEN'
api_endpoint = "https://graph.facebook.com/v2.5/"
fb_graph_url = page_id + "?fields=id,name,feed.since(2015-12-22).until(2015-12-25){comments.filter(stream)}&access_token=" + access_token
html = api_endpoint + fb_graph_url
print(html, "\n")

data = urllib.request.urlopen(html)
read_page = data.read()
print(read_page)
print(data.read(), "\n")  # the response was already consumed above, so this prints b''

data2 = json.loads(read_page.decode())
#message = data2["feed"]["data"]
message = data2
for item in message['feed']['data'][1]['comments']['data']:
    print(item['message'])
    print(item['from']['name'])
print(message, "\n")
It shows me something like:
{
    "id": "2825921296",
    "name": "MY-PAGE",
    "feed": {
        "data": [
            {
                "id": "2825921296_5155340"
            },
            {
                "id": "2825921296_5155340",
                "comments": {
                    "data": [
                        {
                            "from": {
                                "name": "Carl Jhon",
                                "id": "282564921296"
                            },
                            "message": "Comment one",
                            "created_time": "2015-12-10T03:42:05+0000",
                            "id": "5153352885_5153353484206"
                        },
                        {
And my question is: how do I display only the 'message' and 'name' of everything it shows?
Thanks, and I appreciate your response.
It looks like the variable "message" in your code (not in the JSON data) is a dictionary. Consequently, you can access the name and the first message by adding:
print(message['feed']['data'][1]['comments']['data'][0]['message'])
print(message['name'])
You can access the nth message with:
print(message['feed']['data'][1]['comments']['data'][n]['message'])
To print all of the messages, including the name of the author, you could use the for loop like this:
for item in message['feed']['data'][1]['comments']['data']:
    print(item['message'])
    print(item['from']['name'])
Or you can output a specific number of messages and names (100 in this case):
if len(message['feed']['data'][1]['comments']['data']) >= 100:
    for i in range(100):
        print(message['feed']['data'][1]['comments']['data'][i]['message'])
        print(message['feed']['data'][1]['comments']['data'][i]['from']['name'])
In case the message contains emojis, you can either add # -*- coding: utf-8 -*- to the top of your script or take a look at this post
