How to convert CSV to nested JSON in Python
This is related to something like this.
I want to convert a flat dataframe file to Nested JSON format:
I have a csv (sales_2020) file in the following format:
And I want JSON like this:
I tried the link above and was able to add one level using this:
import pandas as pd
df = pd.read_csv('your_file.csv')
df['sales_2020'] = df[['computer','mobile']].to_dict('records')
out = df[['a','sales_2020']].to_json(orient='records', indent=4)
But I was unable to add one more level to it, i.e. sales for a specific month. I tried the solution below, but it doesn't work:
df['jan']['sales_2020'] =df[['computer','mobile']].to_dict('records')
Please help me out.
I guess what you want is orient='index'
df['sales_2020'] = df[['computer','mobile']].to_dict('records')
out = df.set_index('Month')[['sales_2020']].to_json(orient='index', indent=4)
{
"jan":{
"sales_2020":{
"computer":10,
"mobile":5
}
},
"feb":{
"sales_2020":{
"computer":8,
"mobile":2
}
},
"march":{
"sales_2020":{
"computer":6,
"mobile":12
}
}
}
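If more levels are needed later, building the nested dictionary by hand can be easier to extend than to_json. The following is only a sketch: the inline sample data and the column names Month, computer and mobile are assumptions inferred from the output above, since the original CSV is not shown.

```python
import io
import json

import pandas as pd

# Hypothetical sample standing in for the CSV file (the real file is not shown).
csv_text = "Month,computer,mobile\njan,10,5\nfeb,8,2\nmarch,6,12\n"
df = pd.read_csv(io.StringIO(csv_text))

nested = {}
for row in df.to_dict("records"):
    # Take the month out of the row; the remaining columns are the sales figures.
    month = row.pop("Month")
    nested[month] = {"sales_2020": row}

out = json.dumps(nested, indent=4)
print(out)
```

Each additional level is just one more dictionary wrapped around the row, so the structure stays visible in the code.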
I'm trying to see if there is an out-of-the-box way to create the JSON file as required, without having to reopen the file and massage it further. The JSON output I get from df.to_json("file", orient='index', indent=2, date_format='iso') uses "0", "1", "2", etc. as the names of the top-level elements. The requirement is not to have those and, secondly, to name the root object.
CSV FILE (INPUT)
vendor issuer
honda.com DigiCert
toyota.com GoDaddy
import pandas as pd
df = pd.read_csv('test_input.csv', na_filter=False, skiprows=0)
df.to_json("test_out.json", orient='index', indent=2, date_format='iso')
OUTPUT
{
"0":{
"vendor":"honda-us.com",
"issuer":"Amazon",
"licensed":"10\/11\/2021 16:14",
"expiring":"2\/9\/2023 16:14",
"remaining":57
},
EXPECTING OUTPUT TO BE
{
"vendorslist": [
{
"vendor":"honda-us.com",
"issuer":"Amazon",
"licensed":"10\/11\/2021 16:14",
"expiring":"2\/9\/2023 16:14",
"remaining":57
}
]
}
My recommendation would be to build this yourself.
import json
import pandas as pd
df = pd.read_csv('test_input.csv', na_filter=False, skiprows=0)
data = {"vendorslist": df.to_dict(orient='records')}
with open("test_out.json", "w") as f:
    json.dump(data, f, indent=2, default=str)
This may not give you the exact answer you're after, but it should be a good starting point :)
I have tried a few different ways of using pandas to convert my JSON to a CSV file.
import pandas as pd
df = pd.read_json("CDMP_E2.json")
df.to_csv("CDMP_Output.csv")
The problem is when I run that code it makes the output all in one "column".
The column header shows up as Credit-NoSQL.
Then the data in the column is everything from each "object"
'date':'2021-08-01','type':'CARD','amount':'100'
So it looks like this:
Credit-NoSQL
'date':'2021-08-01','type':'CARD','amount':'100'
I would instead expect to see date, type and amount as the headers instead.
account date type amount returneddate
ABCD 2021-08-01 CARD 100
EFGHI 2021-08-01 CARD 150 2021-08-04
My JSON file looks as such:
[
{
"Credit-NoSQL":{
"account":"ABCD",
"date":"2021-08-01",
"type":"CARD",
"amount":"100"
}
},
{
"Credit-NoSQL":{
"account":"EFGHI",
"date":"2021-08-02",
"type":"CARD",
"amount":"150",
"returneddate":"2021-08-04"
}
}
]
So I am not sure whether the problem is the way my JSON file is set up, with its list and such, or whether I am missing something in my Python code. I am new to Python and still learning, so I am at a loss as to what to do next.
No need to use pandas for this.
import csv
import json
with open("CDMP_E2.json") as json_file:
    data = [item['Credit-NoSQL'] for item in json.load(json_file)]
# Get the union of all dictionary keys, keeping first-seen order for the header
fieldnames = {}
for row in data:
    fieldnames.update(dict.fromkeys(row))
# newline="" stops the csv module from writing blank lines on Windows
with open("CDMP_Output.csv", "w", newline="") as csv_file:
    cwrite = csv.DictWriter(csv_file, fieldnames=list(fieldnames))
    cwrite.writeheader()
    cwrite.writerows(data)
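If you would rather stay in pandas, unwrapping the "Credit-NoSQL" key before building the DataFrame should give one column per field. This is a sketch that uses an inline string in place of CDMP_E2.json; the variable names are only illustrative.

```python
import json

import pandas as pd

# Inline stand-in for the contents of CDMP_E2.json.
raw = """[
  {"Credit-NoSQL": {"account": "ABCD", "date": "2021-08-01", "type": "CARD", "amount": "100"}},
  {"Credit-NoSQL": {"account": "EFGHI", "date": "2021-08-02", "type": "CARD", "amount": "150",
                    "returneddate": "2021-08-04"}}
]"""

records = json.loads(raw)

# Drop the wrapper key so each inner dict becomes one row.
df = pd.DataFrame([item["Credit-NoSQL"] for item in records])

# index=False leaves out the row numbers; missing fields are written as empty cells.
csv_text = df.to_csv(index=False)
print(csv_text)
```

With a real file, passing a path to to_csv instead of capturing the string writes the same content to disk.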
I have a CSV file which contains labels and their translations in different languages:
name en_GB de_DE
-----------------------------------------------
ElementsButtonAbort Abort Abbrechen
ElementsButtonConfirm Confirm Bestätigen
ElementsButtonDelete Delete Löschen
ElementsButtonEdit Edit Ändern
I want to convert this CSV to JSON in the following pattern using Python:
{
"de_DE": {
"translations":{
"ElementsButtonAbort": "Abbrechen"
}
},
"en_GB":{
"translations":{
"ElementsButtonAbort": "Abort"
}
}
}
How can I do this using Python?
Say your data is as such:
import pandas as pd
df = pd.DataFrame([["ElementsButtonAbort", "Abort", "Abbrechen"],
                   ["ElementsButtonConfirm", "Confirm", "Bestätigen"],
                   ["ElementsButtonDelete", "Delete", "Löschen"],
                   ["ElementsButtonEdit", "Edit", "Ändern"]],
                  columns=["name", "en_GB", "de_DE"])
Then, this might not be the best way to do it but at least it works:
df.set_index("name", drop=True, inplace=True)
translations = df.to_dict()
Now, if you want to get exactly the dictionary that you show as the desired output, you can do:
for language in translations.keys():
    labels = translations[language]
    # Wrap each language's labels in a "translations" level
    translations[language] = {"translations": labels}
Finally, if you wish to save your dictionary into JSON:
import json
with open('PATH/TO/YOUR/DIRECTORY/translations.json', 'w') as fp:
    json.dump(translations, fp)
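The same structure can also be built without pandas. This is a minimal sketch that assumes the file is comma-separated with a header row of name,en_GB,de_DE; the inline sample stands in for the real file.

```python
import csv
import io
import json

# Hypothetical comma-separated sample; the real file's delimiter is not shown.
csv_text = """name,en_GB,de_DE
ElementsButtonAbort,Abort,Abbrechen
ElementsButtonConfirm,Confirm,Bestätigen
"""

translations = {}
for row in csv.DictReader(io.StringIO(csv_text)):
    label = row.pop("name")
    # Every remaining column is a language code mapped to the translated text.
    for language, text in row.items():
        translations.setdefault(language, {"translations": {}})
        translations[language]["translations"][label] = text

# ensure_ascii=False keeps umlauts readable instead of \u escapes.
print(json.dumps(translations, ensure_ascii=False, indent=2))
```

Because the language columns are discovered from the header, adding another language to the CSV needs no code change.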
I have a CSV (which I turned into a pandas DataFrame) in which each row holds a different JSON document. Every document has exactly the same format and objects as the others, and each one represents a unique transaction (purchase). I would like to convert this DataFrame into a DataFrame or Excel file in which each column represents an object from the JSON and each row represents one transaction.
The JSON also contains arrays, in which case I would like to be able to retrieve each element of the array. Ideally I would like to be able to retrieve all possible objects from the JSON files and turn them into columns.
A simplified version of a row would be:
{
"source":{
"analyze":true,
"billing":{
"gender":null,
"name":"xxxxx",
"phones":[
{
"area_code":"xxxxx",
"country_code":"xxxxx",
"number":"xxxxx",
"phone_type":"xxxxx"
}
]
},
"created_at":"xxxxx",
"customer":{
"address":{
"city":"xxxxx",
"complement":"xxxxx",
"country":"xxxxx",
"neighborhood":"xxxxx",
"number":"xxxxx",
"state":"xxxxx",
"street":"xxxxx",
"zip_code":"xxxxx"
},
"date_of_birth":"xxxxx",
"documents":[
{
"document_type":"xxxxx",
"number":"xxxxx"
}
],
"email":"xxxxx",
"gender":xxxxx,
"name":"xxxxx",
"number_of_previous_orders":xxxxx,
"phones":[
{
"area_code":"xxxxx",
"country_code":"xxxxx",
"number":"xxxxx",
"phone_type":"xxxxx"
}
],
"register_date":xxxxx,
"register_id":"xxxxx"
},
"device":{
"ip":"xxxxx",
"lat":"xxxxx",
"lng":"xxxxx",
"platform":xxxxx,
"session_id":xxxxx
}
}
}
And my Python code:
import csv
import json
import pandas as pd
df = pd.read_csv(r"<name of csv file in which each row is a JSON file>")
A simplified version of my expected output would be something like:
Expected Output
You mean something like this as the output, for example to get area_code:
A_col area_code
0 {"source":{"analyze":true,"billing":{"gender":... xxxxx
First, the unquoted placeholder values ("gender":xxxxx, "number_of_previous_orders":xxxxx, "register_date":xxxxx, "platform":xxxxx, "session_id":xxxxx) need to be double quoted, otherwise the document is not valid JSON.
get the json document:
newjson = []
with open('./example.json', 'r') as f:
    for line in f:
        line = line.strip()
        newjson.append(line)
format it to string:
jsonString = ''.join(newjson)
turn into python object:
jsonData = json.loads(jsonString)
extract the fields using dictionary operations and turn into pandas dataframe:
newDF = pd.DataFrame({"A_col": jsonString, "area_code": jsonData['source']['billing']['phones'][0]['area_code']}, index=[0])
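To get every nested field as its own column rather than picking fields one at a time, pd.json_normalize may be worth a look. The sketch below uses a shortened, made-up transaction (the values are placeholders, not real data) and shows both the flattened row and one array expanded into its own table.

```python
import json

import pandas as pd

# Hypothetical, shortened transaction standing in for one row of the CSV.
json_string = (
    '{"source": {"analyze": true,'
    ' "billing": {"name": "xxxxx",'
    ' "phones": [{"area_code": "11", "number": "5555"}]},'
    ' "device": {"ip": "xxxxx"}}}'
)
json_data = json.loads(json_string)

# Nested objects become dotted column names; arrays stay as list-valued cells.
flat = pd.json_normalize(json_data)
print(list(flat.columns))

# record_path walks down to one array and returns one row per element.
phones = pd.json_normalize(json_data, record_path=["source", "billing", "phones"])
print(phones)
```

Passing a list of parsed documents instead of a single one yields one row per transaction in flat.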
I am looking to transform my dataframe to json
Age Eye Gender
30 blue male
With my current code, I convert the dataframe to JSON and get the result below:
json_file = df.to_json(orient='records')
json_file
[{'age':'30'},{'eye':'blue'},{'gender':'male'}]
However, I want to add an additional layer that would state the id and name to the json data and then label it as 'info'.
{'id':'5231',
'name':'Bob',
'info': [
{'age':'30'},{'eye':'blue'},{'gender':'male'}
]
}
How would I add the additional fields? I tried reading the docs, but I do not see a clear answer on how to add extra fields during the DataFrame-to-JSON conversion.
Based on the data you provided this is your answer:
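import json
import pandas as pd
a = {'id': '5231',
     'name': 'Bob',
     }
df = pd.DataFrame({'Age': [30], 'Eye': ['blue'], 'Gender': ['male']})
# to_dict keeps 'info' as a list of dicts; to_json would embed an escaped string
a['info'] = df.to_dict(orient='records')
result = json.dumps(a)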
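One thing to watch for: df.to_json returns a string, so attaching it directly stores text rather than a nested list. A possible way around that, sketched here with an illustrative variable name payload, is to parse the string back with json.loads before serializing the whole dictionary.

```python
import json

import pandas as pd

df = pd.DataFrame({"Age": [30], "Eye": ["blue"], "Gender": ["male"]})
payload = {"id": "5231", "name": "Bob"}

# Parse the JSON string back into a list of dicts so it nests properly.
payload["info"] = json.loads(df.to_json(orient="records"))

result = json.dumps(payload)
print(result)
```

Note that orient='records' yields one dictionary per row, so the three columns end up together in a single object inside info.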