Storing a JSON response from an API to a CSV file

Storing a JSON response from an API to a CSV file - python

I’m trying to write a python script that sends a query to TweetSentiments.com API.
The idea is that it will perform like this –
Reads CSV tweet file > construct query from tweet file > Interrogates API > format JSON response > write API results to CSV file.
So far I’ve come up with this –
import csv
import urllib
import simplejson as json
import os
Tweets=[] ## Creates empty list to store tweets.
TweetWriter = csv.writer(open('test.csv', 'w'), dialect='excel', delimiter=' ',quotechar='|')
TweetReader = csv.reader(open("C:\StoredTweets.csv", "r"))
for row in TweetReader:
#TweetList.append(rows)
Tweets.append({ 'tweet': row[0], 'date': row[1] }) ## Stores from CSV in list.
for rows in Tweets:
#print TweetList
data = urllib.urlencode({'Tweet': row[0], 'Date': row[1]}) ##Takes Tweet and date to construct query.
#print data
API_request = urllib.urlopen("http://data.tweetsentiments.com:8080/api/analyze.json?q=", data) ## Adds query to end of URL and queries API
result = json.load(API_request('{"sentiment":}') ## "Sentiment": is a heading in the JSON response
TweetWriter.writerow(result) ## Writes API Response to CSV file.
I’m having trouble decoding the json. I’m confused with the json.load part … should I be referring to a heading in the json response and then trying to store it to a variable? Meaning that I can write this to a CSV file?
The response I get from the API is –
{"sentiment":{"value":1,"name":"Positive"}}
So I figure referring to “Sentiment” would store the whole reply.
Sorry, I’m just getting really confused. Read so many different forum posts and tutorials.
Your advice and help would be greatly appreciated!

Related

How to insert data in CSV file?

I have a problem with my Python script.
Basically what my script does is first it does a GET request to our API and extracts all ID's from the endpoint and then saves it in a CSV file.
Atm im having problems inserting data in the csv file. What i want is for my csv file to look like this after inserting data in:
id
1
2
3
...
Basically I want every id in their own row.
But what ends up being inserted is this:
id
1,2,3,...
I have tried for looping and few other things and nothing seemed to work out. I would love if anyone can help me with this problem. It's probably something really simple I just missed out.
My script code:
import requests
import json
import csv
from jsonpath_ng import jsonpath, parse
url = 'url'
headers = {
"Authorization": "Bearer token"
}
response = requests.get(url_v1, headers=headers)
JsonResponse = response.json()
converted = json.dumps(JsonResponse)
Data = json.loads(converted)
ParseData = parse('$..id')
Id = ParseData.find(Data)
open_file = open('C:/File/test.csv','w', newline='')
writer = csv.writer(open_file)
list_id = []
fields = ['id']
for i in range(0, len(Id)):
result = Id[i].value
list_id.append(result)
writer.writerow(fields)
writer.writerow(list_id)
open_file.close()

How can i save my scraped json dataset that is a text format into my local machine and also how can i read the file into Pandas DataFrame?

This is how the code looks like below:
I want to use it for analysis
response = requests.request("GET", url, headers=headers, params=querystring)
print(response.text)
{"#type":"imdb.api.title.ratings","id":"/title/tt0944947/","title":"Game of Thrones","titleType":"tvSeries","year":2011,"canRate":true,"otherRanks":[{"id":"/chart/ratings/toptv","label":"Top 250 TV","rank":12,"rankType":"topTv"}],"rating":9.2,"ratingCount":1885115,"ratingsHistograms":{"Males Aged 18-29":{"aggregateRating":9.3,"demographic":"Males Aged 18-29","histogram":{"1":11186,"2":693,"3":801,"4":962,"5":2103,"6":3583,"7":9377,"8":22859,"9":52030,"10":174464},"totalRatings":278058},"IMDb Staff":{"aggregateRating":8.7,"demographic":"IMDb Staff","histogram":{"1":0,"2":0,"3":0,"4":0,"5":1,"6":3,"7":6,"8":19,"9":27,"10":17},"totalRatings":73}

Frankly, you should find it in any tutorial for Python or in many examples for requests
fh = open("output.json")
fh.write(response.text)
fh.close()
or
with open("output.json") as fh:
fh.write(response.text)
As for pandas you can try to read it
df = pd.read_json("output.json")
or you can use module io to read it without saving on disk
import io
fh = io.StringIO(response.text)
df = pd.read_json(fh)
But pandas keeps data as table with rows and columns but you have nested lists/dicts so it may need some work to keep it in DataFrame.
If you want to get only some data from json then you could use response.json()

how to store bytes like b'PK\x03\x04\x14\x00\x08\x08\x08\x009bwR\x00\x00\x00\x00\x00\x00\x00 to dataframe or csv in python

I am requesting a URL and getting a return in bytes. I want to store this in a data frame and then to CSV.
#Get Data from the CSV
url = "someURL"
req = requests.get(URL)
url_content = req.content
csv_file = open('test.txt', 'wb')
print(type(url_content))
print(url_content)
csv_file.write(url_content)
csv_file.close()
I tried many approaches, but couldn't find the solution. The above code is storing the output in CSV, but getting the below error. My end objective is to store this in CSV then send it to google cloud. And create a google big query table.
Output:
<class 'bytes'>
b'PK\x03\x04\x14\x00\x08\x08\x08\x009bwR\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x13\x00\x00\x00[Content_Types].xml\xb5S\xcbn\xc20\x10\xfc\x95\xc8\xd76\xf4PU\x15\x81C\x1f\xc7\x16\xa9\xf4\x03\{\x93X\xf8%\xaf\xa1\xf0\xf7]\x078\x94R\x89\nq\xf2cfgfW\xf6d\xb6q\xb6ZCB\x13|\xc3\xc6|\xc4\xf0h\xe3\xbb\x86},^\xea{Va\x96^K\x1b<4\xcc\x076\x9bN\x16\xdb\x08XQ\xa9\xc7\x86\xf59\xc7\x07!P\xf5\xe0$\xf2\x10\xc1\x13\xd2\x86\xe4d\xa6c\xeaD\x94j);\x10\xb7\xa3\xd1\x9dP\xc1g\xf0\xb9\xceE\x83M'O\xd0\xca\x95\xcd\xd5\xe3\xee\xbeH7L\xc6h\x8d\x92\x99R\x89\xb5\xd7G\xa2\xf5^\x90'\xb0\x03\x07{\x13\xf1\x86\x08\xacz\xde\x90\xca\xae\x1bB\x91\x893\x1c\x8e\x0b\xcb\x99\xea\xdeh.\xc9h\xf8W\xb4\xd0\xb6F\x81\x0ej\xe5\xa8\x84CQ\xd5\xa0\xeb\x98\x88\x98\xb2\x81}\xce\xb9L\xf9U:\x12\x14D\x9e\x13\x8a\x82\xa4\xf9%\xde\x87\xb1\xa8\x90\xe0,\xc3B\xbc\xc8\xf1\xa8[\x8c\t\xa4\xc6\x1e ;\xcb\xb1\x97\t\xf4{N\xf4\x98~\x87\xd8X\xf1\x83p\xc5\x1cykOL\xa1\x04\x18\x90kN\x80V\xee\xa4\xf1\xa7\xdc\xbfBZ~\x86\xb0\xbc\x9e\x7fq\x18\xf6\x7f\xd9\x0f \x8aa\x19\x1fr\x88\xe1{O\xbf\x01PK\x07\x08z\x94\xcaq;\x01\x00\x00\x1c\x04\x00\x00PK\x03\x04\x14\x00\x08\x08\x08\x009bwR\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0b\x00\x00\x00_rels/.rels\xad\x92\xc1j\xc30\x0c\x86_\xc5\xe8\xde8\xed`\x8cQ\xb7\x972\xe8m\x8c\xee\x014[ILb\xcb\xd8\xda\x96\xbd\xfd\xcc.[K\n\x1b\xec($}\xff\x07\xd2v?\x87I\xbdQ.\x9e\xa3\x81u\xd3\x82\xa2h\xd9\xf9\xd8\x1bx>=\xac\xee#\x15\xc1\xe8p\xe2H\x06"\xc3~\xb7}\xa2\t\xa5n\x94\xc1\xa7\xa2"\x16\x03\x83H\xba\xd7\xba\xd8\x81\x02\x96\x86\x13\xc5\xda\xe98\x07\x94Z\xe6^'\xb4#\xf6\xa47m{\xab\xf3O\x06\x9c3\xd5\xd1\x19\xc8G\xb7\x06u\xc2\xdc\x93\x18\x98'\xfd\xcey|a\x1e\x9b\x8a\xad\x8d\x8fD\xbf\t\xe5\xae\xf3\x96\x0el_\x03EY\xc8\xbe\x98\x00\xbd\xec\xb2\xf9vql\x1f3\xd7ML\xe9\xbfeh\x16\x8a\x8e\xdc*\xd5\x04\xca\xe2\xa9\3\xbaY0\xb2\x9c\xe9oJ\xd7\x8f\xa2\x03\t:\x14\xfc\xa2^\x08\xe9\xb3\x1f\xd8}\x02PK\x07\x08\xa7\x8cz\xbd\xe3\x00\x00\x00I\x02\x00\x00PK\x03\x04\x14\x00\x08\x08\x08\x009bwR\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00docProps/app.xmlM\x8e\xc1\n\xc20\x10D\xef~E\xc8\xbd\xdd\xeaAD\xd2\x94\x82\x08\x9e\xecA? \xa4\xdb6\xd0lB\xb2J?

The original URL (now edited out of the question) suggests that the downloaded file is in .xlsx format. The .xlsx format is essentially one or more xml files in a zip archive (iBug's answer is correct in this respect).
Therefore if you want to get the file's data in a dataframe, tell Pandas to read it as an excel file.
import pandas as pd
url = "someURL"
req = requests.get(URL)
url_content = req.content
# Load into a dataframe
df = pd.read_excel(url_content)
# Write to csv
df.to_csv('data.csv')

The initial bytes PK\x03\x04 suggest that it's PK Zip format. Try unzipping it first, either with unzip x <filename> or with Python builtin zipfile module.

How to return data pulled from python requests call into json

I am trying to make a GHE API call and convert the returned data into JSON. I am sure this is fairly simple (my current code writes the data into a .txt file) but I am incredibly new to python.
I am having a hard time understanding how to use json.dumps.
import requests
import json
GITHUB_ENTERPRISE_TOKEN = 'token xxx'
SEARCH_QUERY = "Evidence+locker+Seed+in:readme"
headers = {
'Authorization': GITHUB_ENTERPRISE_TOKEN,
}
url = "https://github.ibm.com/api/v3/search/repositories?q=" + SEARCH_QUERY
#Setup url to include GHE api endpoint and the search query
response = requests.get(url, headers=headers)
with open('./evidencelockerevidence.txt', 'w') as file:
file.write(response.text)
#writes to a .txt file the evidence fetched from GHE
Rather than the last two lines of functional code writing the data into a .txt file I would like to return it as JSON object in the same directory.

json.dumps simply stringify, thus, serialize your JSON object so you can store it as a plain text file. Its counterpart is json.loads.
f = open('a.jsonl', 'wt')
f.write(json.dumps(jobj))
People usually write one JSON object per line, a.k.a, jsonl format.
json.dump directly store your JSON object to a file. Its counterpart is json.load.
json.dump(jobj, open('a.json', 'wt'))
A json format file contains only one JSON object in a single line or multiple lines.

Generate JSON payload for posting to API with keys from 1 file and values from the other file

I have a use case where I need to generate a JSON payload for each individual in the data file. Each data file will have a config file associated with it. The data file and corresponding config file look somewhat like this.
File 1(Data file):
Employee ID,First Name,Last Name,Email
E1000,Manas,Jani,jam#xyz.com
E2000,Jim,Kong,jik#xyz.com
E3000,Olila,Jayavarman,olj#xyz.com
E4000,Lisa,Kopkingg,lik#xyz.com
E5000,Kishore,Pindhar,kip#xyz.com
E6000,Gobi,Nadar,gon#xyz.com
File 2(Config file):
Input_file_column_name,Config_file_column_name,Value
Employee_ID,employee_Id,idTypeCode:001
First Name,first_Name
Last Name,last_Name
Email,email_Address
EntityID,entity_Id,01
As you can see, each element in the data file is there on the config file. The config file contains the actual rules in terms of what the field should be named as per the JSON payload. Also, there could be fields with a value associated on it and that needs to be put into the JSON payload to be passed to the API built.
The input_file_column_name is the name in the input data file but the JSON payload will take the column name as per the config_file_column_name.
This is how my post JSON request should look like:
{"IndividualInfo":[{"employee_Id":"E1000","first_Name":"Manas","last_Name":"Jani","email_Address":"jam#xyz.com","entity_Id":01,"idTypeCode":001},{"employee_Id":"E2000","first_Name":"Jim","last_Name":"Kong","email_Address":"jik#xyz.com","entity_Id":01,"idTypeCode":001},{"employee_Id":"E3000","first_Name":"Olila","last_Name":"Jayavarman","email_Address":"olj#xyz.com","entity_Id":01,"idTypeCode":001},{"employee_Id":"E4000","first_Name":"Lisa","last_Name":"Kopkingg","email_Address":"lik#xyz.com","entity_Id":01,"idTypeCode":001},{"employee_Id":"E5000","first_Name":"Kishore","last_Name":"Pindhar","email_Address":"kip#xyz.com","entity_Id":01,"idTypeCode":001},{"employee_Id":"E6000","first_Name":"Gobi","last_Name":"Nadar","email_Address":"gon#xyz.com","entity_Id":01,"idTypeCode":001}]}
I am unable to understand how to replace the keys once I generate the payload from the data file as well as add those extra elements which have the VALUE field filled up. Any suggestions would be really helpful.
This is what I have in terms of code:
import json
import requests
import pandas as pd
file1='Onboarding_members.txt'
df = pd.read_csv(open(file1))
#print df.to_json(orient='records')
print df
file2='Onboarding_config.txt'
df1=pd.read_csv(open(file2))
#saved_column=df1['Config_file_name']
#print saved_column
print df1
df.columns = df.columns.map(df1.set_index('Input_file_name')['Config_file_name'].get)
#df2=df.rename(columns=df1.set_index('Input_file_name')['Config_file_name'], inplace=True)
print df
Thank you!

I'm no expert at Python 2.X, but here's a broken down code snippet to get you started.
Using this snippet as a starting point, you can hopefully fine-tune it to your own needs.
The CSV source I'm using:
src.csv
Employee ID,First Name,Last Name,Email
E1000,Manas,Jani,jam#xyz.com
E2000,Jim,Kong,jik#xyz.com
E3000,Olila,Jayavarman,olj#xyz.com
E4000,Lisa,Kopkingg,lik#xyz.com
E5000,Kishore,Pindhar,kip#xyz.com
E6000,Gobi,Nadar,gon#xyz.com
Here's the code to parse the csv, create a dictionary for each person and wrap them up into a list:
example.py
import csv
import io
path = "/tmp/csvTst/src.csv"
employees = []
with io.open(path, newline='') as csvfile:
reader = csv.DictReader(csvfile)
for row in reader:
employees.append(
{
"Employee ID": row['Employee ID'],
"First Name": row['First Name'],
"Last Name": row['Last Name'],
"Email": row['Email']
}
)
print employees
To post with the requests library, you don't need to import json. You're request would look something like this:
for employee in employees:
requests.post("https://my-site.com/api/users/", json=employee)
Of course, you can make this more sophisticated by automatically assigning the csv's first row as keys to your dictionary without manually hard-coding it, but I wanted to make the snippet as simple as possible.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Storing a JSON response from an API to a CSV file - python

Related

How to insert data in CSV file?

How can i save my scraped json dataset that is a text format into my local machine and also how can i read the file into Pandas DataFrame?

how to store bytes like b'PK\x03\x04\x14\x00\x08\x08\x08\x009bwR\x00\x00\x00\x00\x00\x00\x00 to dataframe or csv in python

How to return data pulled from python requests call into json

Generate JSON payload for posting to API with keys from 1 file and values from the other file

Categories

Resources