JSON from API call to pandas dataframe - python

I'm trying to make an API call and save the response as a dataframe.
The problem is that I need the data from the 'result' field, and I haven't managed to extract it.
I'm basically just trying to save the API response as a CSV file so I can work with it.
P.S. When I run this through an online "JSON to CSV converter" it produces exactly what I want (example: https://konklone.io/json/).
import requests
import pandas as pd
import json
res = requests.get("http://api.etherscan.io/api?module=account&action=txlist&address=0xddbd2b932c763ba5b1b7ae3b362eac3e8d40121a&startblock=0&endblock=99999999&sort=asc&apikey=YourApiKeyToken")
j = res.json()
j
df = pd.DataFrame(j)
df.head()
[output example picture omitted: the 'result' column holds the nested records I need]

Try this: build the DataFrame from the 'result' key and write it to CSV.
import requests
import pandas as pd
import json
res = requests.get("http://api.etherscan.io/api?module=account&action=txlist&address=0xddbd2b932c763ba5b1b7ae3b362eac3e8d40121a&startblock=0&endblock=99999999&sort=asc&apikey=YourApiKeyToken")
j = res.json()
# print(j)
filename ="temp.csv"
df = pd.DataFrame(j['result'])
print(df.head())
df.to_csv(filename)

Looks like you need:
df = pd.DataFrame(j["result"])

Related

How to extract data from an api using python and convert it into a pandas data frame

I want to load the data from an API into a pandas data frame. How may I do that? The following is my code snippet:
import requests
import json
response_API = requests.get('https://data.spiceai.io/eth/v0.1/gasfees?period=1d')
#print(response_API.status_code)
data = response_API.text
parse_json = json.loads(data)
Almost there. The JSON is clean, so you can feed it directly into a DataFrame:
import pandas as pd
import requests

response_API = requests.get('https://data.spiceai.io/eth/v0.1/gasfees?period=1d')
data = response_API.json()  # .json() parses the body, so json.loads isn't needed
df = pd.DataFrame(data)

How to read Json data with unbalanced array length in Python

I have been trying to fetch JSON data from an API using Python so that I can transfer it to a sqlite3 database. The issue is that the arrays in the data have different lengths. My end goal is to load this JSON into a .db file with sqlite3.
Here is what I did:
import pandas as pd
url = "https://baseballsavant.mlb.com/gf?game_pk=635886"
df = pd.read_json(url)
print(df)
This is the error I am getting:
raise ValueError("All arrays must be of the same length")
ValueError: All arrays must be of the same length
It's not obvious what you want your final DataFrame to look like, but passing orient='index' avoids the problem in this case: it treats each top-level key as a row instead of a column, so the nested arrays no longer have to be the same length.
import pandas as pd
url = "https://baseballsavant.mlb.com/gf?game_pk=635886"
df = pd.read_json(url, orient='index')
print(df)
You could also request the data with, for example, the requests module and prepare it before loading it into a DataFrame (one possible transformation is sketched after the snippet):
import requests
import pandas as pd
url = "https://baseballsavant.mlb.com/gf?game_pk=635886"
response = requests.get(url)
data = response.json()
"""
Do data transformations here
"""
df = pd.DataFrame.from_dict(data)
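If the goal is just a flat, single-row view of that nested payload, one hedged option (a sketch, not tied to this API's exact schema) is pd.json_normalize, which expands nested dicts into dotted column names:
import requests
import pandas as pd

url = "https://baseballsavant.mlb.com/gf?game_pk=635886"
data = requests.get(url).json()

# nested dicts become columns like "a.b.c"; list-valued fields stay as list cells
flat = pd.json_normalize(data)
print(flat.shape)
print(flat.columns[:10])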

Using pandas with praw

I'm messing around learning to work with APIs, and I figured I'd make a Reddit bot. I'm trying to adapt some code I used for a different script. That script used requests, converted the response to JSON, loaded it into a pandas dataframe, and then wrote a CSV.
I'm trying to do roughly the same thing here, but I don't know how to get the Reddit data into the dataframe. What I've tried below throws an error.
#!/usr/bin/python
import praw
import pandas as pd
reddit = praw.Reddit('my_bot')
subreddit = reddit.subreddit("askreddit")
for submission in subreddit.hot(limit=5):
    print("Title: ", submission.title)
    print("Score: ", submission.score)
    print("Link: ", submission.url)
    print("---------------------------------\n")
csv_file = f"/home/robothead/scripts/python/reddit/reddit-data.csv"
# start with empty dataframe
df = pd.DataFrame()
#j_data = subreddit.json()
#parse_data = j_data['data']
# append to the dataframe
#df = df.append(pd.DataFrame.from_dict(pd.json_normalize(parse_data), orient='columns'))
# append to the dataframe
df = df.append(pd.DataFrame.from_dict(pd(submission), orient='columns'))
# write the whole CSV at once
df.to_csv(csv_file, index=False, encoding='utf-8')
error:
Traceback (most recent call last):
File "bot.py", line 21, in <module>
df = df.append(pd.DataFrame.from_dict(pd(submission), orient='columns'))
TypeError: 'module' object is not callable
This is how I've done it in the past:
df = pd.DataFrame([ vars(post) for post in subreddit.hot(limit=5) ])
vars converts a praw Submission into a dict, and the pandas DataFrame constructor can take a list of dictionaries. This works well when the dicts share the same keys, which is the case here. Of course, you get a giant dataframe with ALL the columns, and some of them even contain praw objects (that you can work with!). You'll probably want to pare that down to just the columns you need before writing to a file.
Edit:
Just so there's no confusion, here is the full script example:
#!/usr/bin/python
import praw
import pandas as pd
reddit = praw.Reddit('my_bot')
subreddit = reddit.subreddit("askreddit")
df = pd.DataFrame([vars(post) for post in subreddit.hot(limit=5)])
df = df[["title", "score", "url"]]  # keep only the columns you want
csv_file = "reddit-data.csv"        # define the output path (it wasn't set in this snippet)
df.to_csv(csv_file, index=False, encoding='utf-8')
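One caveat from the note above: some of the columns vars() produces hold live praw objects (the author, the subreddit, the client itself), and those don't serialize nicely if you keep them. Here is a small, hedged sketch of one way to clean them up before exporting, in case you want more columns than the three kept above (the scalar-type filter is just a heuristic, not a praw API):
import praw
import pandas as pd

reddit = praw.Reddit('my_bot')
df = pd.DataFrame([vars(post) for post in reddit.subreddit("askreddit").hot(limit=5)])

# cast any column whose values aren't plain scalars to strings before exporting
simple_types = (str, int, float, bool, type(None))
for col in df.columns:
    if not df[col].map(lambda v: isinstance(v, simple_types)).all():
        df[col] = df[col].astype(str)

df.to_csv("reddit-data.csv", index=False, encoding='utf-8')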

python parsing tab separated file

Fairly new to Python.
I want to parse a file with \t-separated values (screenshots below). How do I remove the \t from the file and separate the values into columns?
Code below.
import pandas as pd
import io
import requests
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00236/seeds_dataset.txt"
s = requests.get(url).content
df = pd.read_csv(io.StringIO(s.decode('utf-8')))
[screenshot: how the dataframe looks right now]
[screenshot: how I want it to look]
Add sep="\t" to pd.read_csv. The data is messy, so double tabs need to be collapsed into single tabs:
df = pd.read_csv(
io.StringIO(s.decode('utf-8').replace("\t\t", "\t")),
header=None, sep="\t")
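If you also want meaningful headers, the UCI page for this dataset describes seven measurements plus a variety label; assuming that ordering, you can attach names after reading (the names below are my shorthand for the documented attributes):
import io
import requests
import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00236/seeds_dataset.txt"
s = requests.get(url).content
df = pd.read_csv(io.StringIO(s.decode('utf-8').replace("\t\t", "\t")),
                 header=None, sep="\t")

# column names follow the UCI attribute description for the seeds dataset
df.columns = ["area", "perimeter", "compactness", "kernel_length",
              "kernel_width", "asymmetry", "groove_length", "variety"]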
If using the csv library is an option you can try:
import pandas as pd
import requests
import csv
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00236/seeds_dataset.txt"
raw_data = requests.get(url).text
with open("raw_data.txt", "w") as f:
    f.write(raw_data)
with open("raw_data.txt", newline="") as f:
    data = list(csv.reader(f, delimiter="\t"))
df = pd.DataFrame.from_records(data)
print(df)

python requests.text to pandas dataframe

I have the code below, where I am trying to get the data from https://www.quandl.com/data/TSE/documentation/metadata (trying to get the "Download detailed data").
for page_number in range(1, 5):
    link = r'https://www.quandl.com/api/v3/datasets.csv?database_code=TSE&per_page=100&sort_by=id&page=' + str(page_number)
    r = requests.get(link, stream=True).text
    print(r)
    # How to put the results in a dataframe?
However, I have trouble putting the results in a dataframe / saving it in a SQLite database. How should I be doing this?
You can use Pandas to read this data directly:
import pandas as pd
url = ("https://www.quandl.com/api/v3/datasets.csv?"
       "database_code=TSE&per_page=100&sort_by=id&page={0}")
dfs = [pd.read_csv(url.format(page_number)) for page_number in range(1, 5)]
To read from response you can use StringIO:
from io import StringIO
pd.read_csv(StringIO(r.text))
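For the second half of the question (saving to SQLite), the per-page frames can be concatenated and written with DataFrame.to_sql. A minimal sketch, where the table name and database file are placeholders rather than anything Quandl-specific:
import sqlite3
import pandas as pd

url = ("https://www.quandl.com/api/v3/datasets.csv?"
       "database_code=TSE&per_page=100&sort_by=id&page={0}")
dfs = [pd.read_csv(url.format(page_number)) for page_number in range(1, 5)]
combined = pd.concat(dfs, ignore_index=True)

# write all pages into one table in a local SQLite file
conn = sqlite3.connect("tse_datasets.db")
combined.to_sql("datasets", conn, if_exists="replace", index=False)
conn.close()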
