JSON to DataFrame conversion / Python

I am calling an API to get the exchange rate of the Taiwan dollar (TWD) against other currencies.
I fetch the data with this code:
import requests
r=requests.get('http://api.cambio.today/v1/full/TWD/json?key=X')
dico = r.json()
And it gives me:
{'result': {'from': 'TWD',
'conversion': [{'to': 'AED',
'date': '2020-06-23T07:23:49',
'rate': 0.124169},
{'to': 'AFN', 'date': '2020-06-23T07:19:53', 'rate': 2.606579},
{'to': 'ALL', 'date': '2020-06-19T20:48:10', 'rate': 3.74252},
{'to': 'AMD', 'date': '2020-06-22T12:00:19', 'rate': 16.176679},
{'to': 'AOA', 'date': '2020-06-22T12:32:59', 'rate': 20.160418},
{'to': 'ARS', 'date': '2020-06-23T08:00:01', 'rate': 2.363501}
]}
}
To turn it into a dataframe I tried two things:
df = pd.DataFrame(dico.get('result', {}))
and
from pandas.io.json import json_normalize
dictr = r.json()
df = json_normalize(dictr)
In both cases, I end up with a "conversion" column that holds one dictionary per currency, one per row. For example, one of the rows is: "{'to': 'AFN', 'date': '2020-06-23T07:19:53', 'rate': 2.606579}".
Instead, I would like to have one column for the currency and one for the exchange rate.
Could someone please help me?

The JSON you pasted is not valid JSON, but I guess it should have this format:
{'result': {'from': 'TWD',
'conversion': [{'to': 'AED',
'date': '2020-06-23T07:23:49',
'rate': 0.124169},
{'to': 'AFN', 'date': '2020-06-23T07:19:53', 'rate': 2.606579},
{'to': 'ALL', 'date': '2020-06-19T20:48:10', 'rate': 3.74252},
{'to': 'AMD', 'date': '2020-06-22T12:00:19', 'rate': 16.176679},
{'to': 'AOA', 'date': '2020-06-22T12:32:59', 'rate': 20.160418},
{'to': 'ARS', 'date': '2020-06-23T08:00:01', 'rate': 2.363501}]}}
In that case, you can create the dataframe you want with:
df = pd.DataFrame(dico.get('result', {}).get('conversion', {}))

You need to get the value of the "conversion" property, which holds the list of conversion rates. Use this:
df = pd.DataFrame(dico["result"]["conversion"])
It will format your conversion data like this:
    to                 date       rate
0  AED  2020-06-23T07:23:49   0.124169
1  AFN  2020-06-23T07:19:53   2.606579
2  ALL  2020-06-19T20:48:10   3.742520
3  AMD  2020-06-22T12:00:19  16.176679
4  AOA  2020-06-22T12:32:59  20.160418
5  ARS  2020-06-23T08:00:01   2.363501
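If the source currency should also appear on each row, or the dates should be real timestamps rather than strings, one possible variation is sketched below. It assumes pandas 1.0 or later (where json_normalize is available as pd.json_normalize) and uses a shortened copy of the response from the question instead of the live API call.

```python
import pandas as pd

# Shortened copy of the response shown in the question (no network call).
dico = {
    "result": {
        "from": "TWD",
        "conversion": [
            {"to": "AED", "date": "2020-06-23T07:23:49", "rate": 0.124169},
            {"to": "AFN", "date": "2020-06-23T07:19:53", "rate": 2.606579},
            {"to": "ALL", "date": "2020-06-19T20:48:10", "rate": 3.74252},
        ],
    }
}

# One row per entry of "conversion"; the "from" value is repeated on every row.
df = pd.json_normalize(dico["result"], record_path="conversion", meta=["from"])

# Parse the ISO 8601 strings into datetime64 values.
df["date"] = pd.to_datetime(df["date"])

print(df[["from", "to", "rate", "date"]])
```

Selecting the columns explicitly in the last line is only there to control the display order.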

Related

Array to Dictionary Output Python

I have the following list after querying my DB which I'd like to turn into a dictionary:
[{date: date_value1.1, rate: rate_value1.1, source: source_name1},
{date: date_value1.2, rate: rate_value1.2, source: source_name1},
{date: date_value2.1, rate: rate_value2.1, source: source_name2},
{date: date_value2.2, rate: rate_value2.2, source: source_name2},
{date: date_valuenx, rate: rate_valuex, source: source_namex}, ...]
The dictionary should follow the following format:
{
source_name1:
[
{date: date_value1.1, rate: rate_value1.1}
{date: date_value1.2, rate: rate_value1.2}
],
source_name2:
[
{date: date_value2.1, rate: rate_value2.1}
{date: date_value2.2, rate: rate_value2.2}
],
}
I have tried a lot of different code variations, but could not get it to work. What would be the most efficient way to transform the data into the required format?
(This format is the response the client will receive after calling my API. If you have suggestions for better formatting of this response I am also open to suggestions!)
We can use defaultdict and simply append each result to our output list.
from collections import defaultdict
output = defaultdict(list)
data = [{'date': 'date_value1.1', 'rate': 'rate_value1.1', 'source': 'source_name1'}, {'date': 'date_value1.2', 'rate': 'rate_value1.2', 'source': 'source_name1'}, {'date': 'date_value2.1', 'rate': 'rate_value2.1', 'source': 'source_name2'}, {'date': 'date_value2.2', 'rate': 'rate_value2.2', 'source': 'source_name2'}, {'date': 'date_valuenx', 'rate': 'rate_valuex', 'source': 'source_namex'}]
for row in data:
    output[row['source']].append({k: v for k, v in row.items() if k != 'source'})
dict(output)
#{'source_name1': [{'date': 'date_value1.1', 'rate': 'rate_value1.1'}, {'date': 'date_value1.2', 'rate': 'rate_value1.2'}], 'source_name2': [{'date': 'date_value2.1', 'rate': 'rate_value2.1'}, {'date': 'date_value2.2', 'rate': 'rate_value2.2'}], 'source_namex': [{'date': 'date_valuenx', 'rate': 'rate_valuex'}]}
import pprint
pprint.pprint(dict(output))
{'source_name1': [{'date': 'date_value1.1', 'rate': 'rate_value1.1'},
                  {'date': 'date_value1.2', 'rate': 'rate_value1.2'}],
 'source_name2': [{'date': 'date_value2.1', 'rate': 'rate_value2.1'},
                  {'date': 'date_value2.2', 'rate': 'rate_value2.2'}],
 'source_namex': [{'date': 'date_valuenx', 'rate': 'rate_valuex'}]}
Try this:
d = [{"date": "date_value1.1", "rate": "rate_value1.1", "source": "source_name1"},
     {"date": "date_value1.2", "rate": "rate_value1.2", "source": "source_name1"},
     {"date": "date_value2.1", "rate": "rate_value2.1", "source": "source_name2"}]
d1 = {}
for ele in d:
    key = ele.pop('source')  # note: pop removes 'source' from the dicts in d
    d1[key] = d1.get(key, list())
    d1[key].append(ele)
print(d1)
The output is:
{'source_name1': [{'date': 'date_value1.1', 'rate': 'rate_value1.1'}, {'date': 'date_value1.2', 'rate': 'rate_value1.2'}], 'source_name2': [{'date': 'date_value2.1', 'rate': 'rate_value2.1'}]}
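One thing to be aware of with the pop-based loop is that it removes 'source' from the rows you got from the DB. If those rows are still needed afterwards, a variant that copies the remaining fields could look like the sketch below (the sample values are the placeholders from the question, and the variable names are just for illustration).

```python
rows = [
    {"date": "date_value1.1", "rate": "rate_value1.1", "source": "source_name1"},
    {"date": "date_value1.2", "rate": "rate_value1.2", "source": "source_name1"},
    {"date": "date_value2.1", "rate": "rate_value2.1", "source": "source_name2"},
]

grouped = {}
for row in rows:
    # Copy every field except "source"; the original row is left untouched.
    entry = {key: value for key, value in row.items() if key != "source"}
    # setdefault creates the list the first time a source is seen.
    grouped.setdefault(row["source"], []).append(entry)

print(grouped)
```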

How to convert json into a pandas dataframe?

I'm trying to convert an API response from JSON to a pandas dataframe. The problem I am having is that the data is nested in the JSON, and I am not getting the right columns in my dataframe.
The data is collected from an API and has the following format:
{'tickets': [{'url': 'https...',
'id': 1,
'external_id': None,
'via': {'channel': 'web',
'source': {'from': {}, 'to': {}, 'rel': None}},
'created_at': '2020-05-01T04:16:33Z',
'updated_at': '2020-05-23T03:02:49Z',
'type': 'incident',
'subject': 'Subject',
'raw_subject': 'Raw subject',
'description': 'Hi, this is the description',
'priority': 'normal',
'status': 'closed',
'recipient': None,
'requester_id': 409467360874,
'submitter_id': 409126461453,
'assignee_id': 409126461453,
'organization_id': None,
'group_id': 360009916453,
'collaborator_ids': [],
'follower_ids': [],
'email_cc_ids': [],
'forum_topic_id': None,
'problem_id': None,
'has_incidents': False,
'is_public': True,
'due_at': None,
'tags': ['tag_1',
'tag_2',
'tag_3',
'tag_4'],
'custom_fields': [{'id': 360042034433, 'value': 'value of the first custom field'},
{'id': 360041487874, 'value': 'value of the second custom field'},
{'id': 360041489414, 'value': 'value of the third custom field'},
{'id': 360040980053, 'value': 'correo_electrónico'},
{'id': 360040980373, 'value': 'suscribe_newsletter'},
{'id': 360042046173, 'value': None},
{'id': 360041028574, 'value': 'product'},
{'id': 360042103034, 'value': None}],
'satisfaction_rating': {'score': 'unoffered'},
'sharing_agreement_ids': [],
'comment_count': 2,
'fields': [{'id': 360042034433, 'value': 'value of the first custom field'},
{'id': 360041487874, 'value': 'value of the second custom field'},
{'id': 360041489414, 'value': 'value of the third custom field'},
{'id': 360040980053, 'value': 'correo_electrónico'},
{'id': 360040980373, 'value': 'suscribe_newsletter'},
{'id': 360042046173, 'value': None},
{'id': 360041028574, 'value': 'product'},
{'id': 360042103034, 'value': None}],
'followup_ids': [],
'ticket_form_id': 360003608013,
'deleted_ticket_form_id': 360003608013,
'brand_id': 360004571673,
'satisfaction_probability': None,
'allow_channelback': False,
'allow_attachments': True},
Here is what I have tried so far. I converted the JSON response into a dict and passed the list of tickets to pandas:
x = response.json()
df = pd.DataFrame(x['tickets'])
But I'm struggling with the output. I don't know how to get a correct, ordered, normalized dataframe.
(I'm new in this :) )
Let's suppose you get your request data with this code: r = requests.get(url, auth=auth)
Your data isn't tidy yet, so let's get a dataframe from it: data = pd.read_json(json.dumps(r.json(), ensure_ascii=False))
You will probably get a dataframe with a single row, though.
When I faced a problem like this, I wrote this function to get the full data:
listParam = []
def listDict(entry):
    # Collect every dict found in (possibly nested) lists into listParam
    if isinstance(entry, dict):
        listParam.append(entry)
    elif isinstance(entry, list):
        for ent in entry:
            listDict(ent)
Because your data is a dict at the top level ({'tickets': ...}), you will need to pull the nested value out like this:
listDict(data.iloc[0, 0])
And then,
pd.DataFrame(listParam)
I can't show the results because you didn't post the complete data or say where I can find data to test with, but this will probably work.
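The same idea can be written without the module-level list, so that calling it twice does not mix results. This is only a sketch: the function name iter_dicts and the sample data are made up for illustration.

```python
def iter_dicts(entry):
    """Yield every dict found in entry, descending into nested lists."""
    if isinstance(entry, dict):
        yield entry
    elif isinstance(entry, list):
        for item in entry:
            yield from iter_dicts(item)
    # Anything else (strings, numbers, None) is skipped.


nested = [{"id": 1}, [{"id": 2}, [{"id": 3}]], "ignored", []]
records = list(iter_dicts(nested))
print(records)
```

The resulting list of dicts can then be passed to pd.DataFrame in the same way as listParam.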
You have to convert the JSON to a dictionary first, and then convert the value under the key 'tickets' into a dataframe.
import json
import pandas as pd
with open('file.json') as f:
    ticketDictionary = json.load(f)
df = pd.DataFrame(ticketDictionary['tickets'])
Here, 'file.json' contains your data.
df now holds one row per ticket.
For the lists within the response, you can build separate dataframes if required:
for field in df['fields']:
    # use a new name so that df itself is not overwritten
    field_df = pd.DataFrame(field)
For the 'fields' list of the ticket above, this gives you:
             id                             value
0  360042034433   value of the first custom field
1  360041487874  value of the second custom field
2  360041489414   value of the third custom field
3  360040980053                correo_electrónico
4  360040980373               suscribe_newsletter
5  360042046173                              None
6  360041028574                           product
7  360042103034                              None
This is one way to structure it, since you haven't mentioned the exact expected format.
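Another option, if nested dicts such as 'via' should become regular columns, is pd.json_normalize (pandas 1.0 or later). The sketch below uses a heavily trimmed, made-up ticket that keeps only a few of the keys from the question, so treat the exact columns as illustrative.

```python
import pandas as pd

# Trimmed stand-in for response.json(); only a few keys are kept.
x = {
    "tickets": [
        {
            "id": 1,
            "status": "closed",
            "via": {"channel": "web", "source": {"rel": None}},
            "satisfaction_rating": {"score": "unoffered"},
            "custom_fields": [
                {"id": 360042034433, "value": "value of the first custom field"},
                {"id": 360041028574, "value": "product"},
            ],
        }
    ]
}

# Nested dicts are flattened into dotted column names such as "via.channel".
tickets = pd.json_normalize(x["tickets"])
print(sorted(tickets.columns))

# One row per custom field, with the ticket id carried along as metadata.
# record_prefix avoids a name clash between the field "id" and the ticket "id".
fields = pd.json_normalize(
    x["tickets"],
    record_path="custom_fields",
    meta=["id"],
    record_prefix="field_",
)
print(fields)
```

Lists such as 'custom_fields' stay as list-valued cells in the first dataframe, which is why the second call expands them separately.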

I have a list of nested dicts and need to convert it to a dict for a JSON object

The JSON data below has 3 rules (each of dict type). I have built it as a list with some changes. Now I need to convert this list to the dict data type. The data has a lot of nested lists and dicts. I want to split this list of 3 items and add them to a dictionary (dict data type).
<class 'list'>
[
{'ID': 'Glacierize bird_sporr after 2 weeks',
'Status': 'Enabled',
'Transitions': [{'Days': 14, 'StorageClass': 'GLACIER'}],
'NoncurrentVersionTransitions': [{'NoncurrentDays': 14, 'StorageClass': 'GLACIER'}],
'Prefix': 'bird_sporr'},
{'Expiration':
{'Days': 45},
'ID': 'Delete files after 45 days',
'Status': 'Enabled',
'NoncurrentVersionExpiration': {'NoncurrentDays': 45},
'Prefix': 'bird_sporr'
},
{'ID': 'PruneAbandonedMultipartUpload',
'Status': 'Enabled',
'AbortIncompleteMultipartUpload': {'DaysAfterInitiation': 30},
'Prefix': ''}
]
I need the output below, with the dict data type. The API will not accept the list data type. Please help with this, and let me know if you have any questions.
<class 'dict'>
{'ID': 'Glacierize bird_sporr after 2 weeks',
'Status': 'Enabled',
'Transitions': [{'Days': 14, 'StorageClass': 'GLACIER'}],
'NoncurrentVersionTransitions': [{'NoncurrentDays': 14, 'StorageClass': 'GLACIER'}],
'Prefix': 'bird_sporr'},
{'Expiration':
{'Days': 45},
'ID': 'Delete files after 45 days',
'Status': 'Enabled',
'NoncurrentVersionExpiration': {'NoncurrentDays': 45},
'Prefix': 'bird_sporr'},
{'ID': 'PruneAbandonedMultipartUpload',
'Status': 'Enabled',
'AbortIncompleteMultipartUpload': {'DaysAfterInitiation': 30},
'Prefix': ''}
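If your problem is just that the output you need is wrapped in a list, and you need it without the enclosing list, then you should simply be able to index into it:
your_list[0] should give you the first dictionary, and your_list[1] and your_list[2] the other two (your_list stands for whatever your list variable is called; avoid naming it list, which shadows the built-in).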
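If the API needs all three rules in a single dict rather than one rule at a time, two common shapes are sketched below. Both are guesses, since the question does not say which API is involved: the key name "Rules" is hypothetical, and the sample is a shortened copy of the list from the question.

```python
import json

# Shortened copy of the list from the question.
rules = [
    {
        "ID": "Delete files after 45 days",
        "Status": "Enabled",
        "Expiration": {"Days": 45},
        "Prefix": "bird_sporr",
    },
    {
        "ID": "PruneAbandonedMultipartUpload",
        "Status": "Enabled",
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 30},
        "Prefix": "",
    },
]

# Option 1: wrap the whole list under one key ("Rules" is a hypothetical name).
payload = {"Rules": rules}

# Option 2: index the rules by their "ID" field.
rules_by_id = {rule["ID"]: rule for rule in rules}

print(type(payload).__name__)
print(json.dumps(payload, indent=2))
```

Which of the two fits depends on what the receiving API documents as its expected request body.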

How can I reformat JSON / Dictionaries in Python

I have the below list -
[{'metric': 'sales', 'value': '100', 'units': 'dollars'},
{'metric': 'instock', 'value': '95.2', 'units': 'percent'}]
I would like to reformat it like the below in Python -
{'sales': '100', 'instock': '95.2'}
I did the below -
a = [above list]
for i in a:
    print({i['metric']: i['value']})
But it outputs like this -
{'sales': '100'}
{'instock': '95.2'}
I would like these 2 lines to be a part of the same dictionary
d = [{'metric': 'sales', 'value': '100', 'units': 'dollars'},
{'metric': 'instock', 'value': '95.2', 'units': 'percent'}]
new_d = {e["metric"]: e["value"] for e in d}
# output: {'sales': '100', 'instock': '95.2'}
I believe that it's best to try it first by yourself, and then post a question in case you don't succeed. You should consider posting your attempts next time.
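For reference, the loop in the question prints two dictionaries because it builds a new one-item dict on every iteration. A loop-based sketch that accumulates into a single dictionary, equivalent to the comprehension above, could look like this:

```python
a = [
    {"metric": "sales", "value": "100", "units": "dollars"},
    {"metric": "instock", "value": "95.2", "units": "percent"},
]

result = {}
for item in a:
    # Add one key per metric to the same dictionary instead of printing a new dict.
    result[item["metric"]] = item["value"]

print(result)  # {'sales': '100', 'instock': '95.2'}
```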

How can I convert this byte or string to a dataframe?

I have data in this format (bytes):
b'{"datatable":{"data":[["AAPL","1980-12-12",28.75,28.87,28.75,28.75,2093900.0,0.0,1.0,0.42270591588018,0.42447025361603,0.42270591588018,0.42270591588018,117258400.0],
["AAPL","1980-12-15",27.38,27.38,27.25,27.25,785200.0,0.0,1.0,0.40256306006259,0.40256306006259,0.40065169418209,0.40065169418209,43971200.0],
["AAPL","1980-12-16",25.37,25.37,25.25,25.25,472000.0,0.0,1.0,0.37301040298714,0.37301040298714,0.37124606525129,0.37124606525129,26432000.0],
["AAPL","1980-12-17",25.87,26.0,25.87,25.87,385900.0,0.0,1.0,0.38036181021984,0.38227317610034,0.38036181021984,0.38036181021984,21610400.0],
["AAPL","1980-12-18",26.63,26.75,26.63,26.63,327900.0,0.0,1.0,0.39153594921354,0.39330028694939,0.39153594921354,0.39153594921354,18362400.0],
["AAPL","1980-12-19",28.25,28.38,28.25,28.25,217100.0,0.0,1.0,0.41535450864748,0.41726587452798,0.41535450864748,0.41535450864748,12157600.0],
.....,{"name":"adj_high","type":"BigDecimal(50,28)"},{"name":"adj_low","type":"BigDecimal(50,28)"},{"name":"adj_close","type":"BigDecimal(50,28)"},{"name":"adj_volume","type":"double"}]},"meta":{"next_cursor_id":null}}'
I can convert this by using .decode('utf-8'). However, I want to convert the type into DataFrame or some other format so that I can work with this data.
Any help would be appreciated.
Here is the error I get when I try pd.DataFrame():
ValueError: DataFrame constructor not properly called!
Thank you for giving me great direction!
I have used
apple = json.loads(apple1)
apple
to get
{'datatable': {'columns': [{'name': 'ticker', 'type': 'String'},
{'name': 'date', 'type': 'Date'},
{'name': 'open', 'type': 'BigDecimal(34,12)'},
{'name': 'high', 'type': 'BigDecimal(34,12)'},
{'name': 'low', 'type': 'BigDecimal(34,12)'},
{'name': 'close', 'type': 'BigDecimal(34,12)'},
{'name': 'volume', 'type': 'BigDecimal(37,15)'},
{'name': 'ex-dividend', 'type': 'BigDecimal(42,20)'},
{'name': 'split_ratio', 'type': 'double'},
{'name': 'adj_open', 'type': 'BigDecimal(50,28)'},
{'name': 'adj_high', 'type': 'BigDecimal(50,28)'},
{'name': 'adj_low', 'type': 'BigDecimal(50,28)'},
{'name': 'adj_close', 'type': 'BigDecimal(50,28)'},
{'name': 'adj_volume', 'type': 'double'}],
'data': [['AAPL',
'1980-12-12',
28.75,
28.87,
28.75,
28.75,
2093900.0,
0.0,
1.0,
0.42270591588018,
0.42447025361603,
0.42270591588018,
0.42270591588018,
117258400.0],
['AAPL',
'1980-12-15',
27.38,
27.38,
27.25,
27.25,
785200.0,
0.0,
1.0,
0.40256306006259,
0.40256306006259,
0.40065169418209,
0.40065169418209,
43971200.0],
and if I run:
pd.DataFrame(apple['datatable']['data'])
I get:
(image: apple dataframe)
Which is good, but I would like to have column names such as [date, open, high, low, close, volume, ex-dividend, split_ratio, adj_open, adj_high, adj_low, adj_close, adj_volume] rather than [0,1,2,3,4,5,6,7,8,9,10,11,12,13].
Also, I would like to delete the current column 1 ('AAPL') and the numeric index, so that it looks like a time series with date as the first column.
Can you help me with this?
You might need to tidy up the data first but doing the following works.
import json
import pandas as pd
pd.DataFrame(json.loads(data.decode('utf-8'))['datatable']['data'])
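For the follow-up about column names and a date index, one possible approach is to take the names from the "columns" entry of the same payload. The sketch below uses a shortened, made-up sample with only four of the fourteen columns, so the variable names and the trimmed data are assumptions.

```python
import json

import pandas as pd

# Shortened stand-in for the bytes returned by the API.
raw = (
    b'{"datatable":{"data":[["AAPL","1980-12-12",28.75,2093900.0],'
    b'["AAPL","1980-12-15",27.38,785200.0]],'
    b'"columns":[{"name":"ticker","type":"String"},{"name":"date","type":"Date"},'
    b'{"name":"open","type":"BigDecimal(34,12)"},'
    b'{"name":"volume","type":"BigDecimal(37,15)"}]},'
    b'"meta":{"next_cursor_id":null}}'
)

# json.loads accepts UTF-8 bytes directly on Python 3.6 and later.
table = json.loads(raw)["datatable"]

# Use the "name" of each column description as the dataframe column label.
names = [col["name"] for col in table["columns"]]
df = pd.DataFrame(table["data"], columns=names)

# Drop the ticker column and use the parsed date as the index.
df["date"] = pd.to_datetime(df["date"])
df = df.drop(columns="ticker").set_index("date")

print(df)
```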
