Using Pandas to convert csv to Json - python

I want to convert a CSV file to JSON using pandas. I am a tester and want to send some events to Event Hub; for that, I want to maintain a CSV file and update my records/data through it. For reference, I created the CSV file by reading a JSON file with pandas. Now, when I convert the CSV back to JSON using pandas, the data is not displayed in the correct format. Can you please help?
Step 1: Converted JSON to CSV using pandas:
df = pd.read_json('C://Users//DAMALI//Desktop/test.json')
df.to_csv('C://Users//DAMALI//Desktop/test.csv')
Step 2: Now if I try to convert the CSV back to JSON, it is not converted to the same format as before:
df = pd.read_csv('C://Users//DAMALI//Desktop/test.csv')
df.to_json('C://Users//DAMALI//Desktop/test1.json')
Providing JSON below:
{
"body": {
"deviceId": "UDM",
"registrationDate": "12/11/2019",
"testRegistration": false,
"serialNumber": "25",
"articleNumber": "R91",
"deviceName": "UDM-test",
"locationId": "lc0",
"sapSoldToId": "1138474",
"crmDomainAccountId": "1234566",
"crmAccountDetails": {
"accountName": "ProjectX",
"accountId": "Instal",
"region": "AP"
},
"productLine": "UD",
"state": "registered",
"installerName": "ABC Rooms",
"installationAddress": {
"street": "Benelu",
"zipCode": "850",
"city": "Kortr",
"state": "OVL",
"country": "Belgi"
},
"customerDetails": {
"name": "John D",
"contactName": "John Doe",
"phone": "+32 999999999",
"email": "john.doe#test.com"
},
"wallConnect": {
"wallSize": "Width 5 x Height 4",
"wallOrientation": "LANDSCAPE",
"displayType": "BVD-D55M21H321A1C300",
"softwareVersion": "1.13.1.1.3"
},
"projector": {
"name": "UDX 40K-123456789",
"subType": "UDX 40K"
},
"featureLicense": ["UDX-aa00213a-5719-440e-a3b5", "UDX-aa00a-571"],
"cloudServiceLicense": ["EN04d5-4d2a-9131-875ad37c5883", "E15-4d2a-9131-875ad37c5154"],
"metadata": {
"cusQuesAns": [{
"ques": "End ucal industry",
"ans": "Hosity",
"key": "CUST_ANSWER"
},
{
"ques": "End user video wall application",
"ans": "Simulation & Virtual Reality",
"key": "CUSSECOND_ANSWER"
}
]
},
"frequency": "realtime",
"subDevices": [{
"deviceType": "DISPLAY",
"serialNumber": "68960",
"articleNumber": "R792",
"wallConnect": {
"displayFMWVersion": "3.0.0",
"displayVariant": "KVD21H331A1C300"
}
}]
},
"properties": {
"drs": {
"type": "salesforce-lm"
}
},
"systemProperties": {
"user-id": "data-cvice",
"message-id": "1b1012cc-9b18c192"
}
}

Try this for converting CSV to JSON:
import pandas as pd
df = pd.read_csv(r'Fayzan-Bhatti\test.csv')
df.to_json(r'Fayzan-Bhatti\new_test.json')
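Note that read_csv followed by to_json is the same call sequence as Step 2 in the question, so it will probably not bring the nesting back: once nested objects have been written into CSV cells as plain text, pandas reads them back as strings. One possible workaround, sketched here with the standard library only, is to flatten the event into dotted column names before writing the CSV and rebuild the nesting when reading it. The function names (flatten, unflatten, event_to_csv, csv_to_events) are made up for this illustration, and the sketch assumes no key contains a dot. pandas.json_normalize performs a similar flattening if you prefer to stay in pandas.

```python
import csv
import io
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into {"a.b.c": value}; lists are kept whole."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def unflatten(flat):
    """Rebuild the nested dict from dotted column names."""
    nested = {}
    for path, value in flat.items():
        *parents, leaf = path.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def event_to_csv(event):
    """Return CSV text with a header row and one data row for the event."""
    flat = flatten(event)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(flat))
    writer.writeheader()
    # Each cell holds a JSON literal so booleans, numbers and lists keep their type.
    writer.writerow({key: json.dumps(value) for key, value in flat.items()})
    return buffer.getvalue()


def csv_to_events(text):
    """Turn every CSV row back into a nested event dict."""
    reader = csv.DictReader(io.StringIO(text))
    return [unflatten({k: json.loads(v) for k, v in row.items()}) for row in reader]
```

Because every cell is a JSON literal, a string has to keep its double quotes when you edit the CSV by hand ("UDM" rather than UDM). That is the price of telling the string "25" apart from the number 25 after the round trip.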

Related

Nested json files - Python

Good afternoon all,
I've been reading through the various posts about reading .json files with pandas, but so far I've not been successful in extracting what I need.
I need to read a specific 'score' in the JSON file. I'll then iterate through all the JSON files I have, as the label is the same in each.
In the sample below, how would I read the 'score'? I've tried using the normalise function, but regardless of the argument I pass in, I cannot get any closer.
Part of the json file:
"template_id": "template_fe61177cb0eb4642901b1eae9488fbb4",
"audit_id": "audit_1a0e9ef4a7914286808accb3dcb0700b",
"archived": false,
"created_at": "2022-10-07T08:00:14.021Z",
"modified_at": "2022-10-07T08:05:56.594Z",
"audit_data": {
"score": 10,
"total_score": 11,
"score_percentage": 90.909,
"name": "7 Oct 2022 / Test",
"duration": 240,
"authorship": {
"device_id": "user_65c3799b0f1a48549cacbceca244e1db",
"owner": "test",
"owner_id": "user_65c3799b0f1a48549cacbceca244e1db",
"author": "test",
"author_id": "user_65c3799b0f1a48549cacbceca244e1db"
},
"date_completed": "2022-10-07T08:05:55.860Z",
"date_modified": "2022-10-07T08:05:56.594Z",
"date_started": "2022-10-07T08:00:13.000Z",
"site": {
"name": "Blue Warehouse"
}
},
"template_data": {
"authorship": {
"device_id": "user_4bb896b5308341f7a7543a32f6c1f3ec",
"owner": "test",
"owner_id": "user_4bb896b5308341f7a7543a32f6c1f3ec",
"author": "test",
"author_id": "user_4bb896b5308341f7a7543a32f6c1f3ec"
},
"metadata": {
"description": "",
"name": "RCS",
"image": {
"date_created": "2022-04-12T13:27:18.852Z",
"file_ext": "png",
"label": "Go \u0026 See icon.PNG",
"media_id": "cf944a4b-7589-47e6-b42a-8d17f06b7031",
"href": "https://1"
}
},
"response_sets": {
"5b69aee5-0532-46a4-b2f5-d020d4d5381d": {
"id": "5b69aee5-0532-46a4-b2f5-d020d4d5381d",
"type": "question",
"responses": [
{
"id": "ef4abf51-3361-46f5-ba04-70c23c85ca20",
"label": "Good",
"colour": "19,133,95",
***"score": 1,***
"enable_score": true
},
Thanks for your help.
Rob.
This can be done without pandas:
import json
with open("my_file.json", 'r') as f:
    my_dict = json.load(f)
# In the sample above, "response_sets" is nested under "template_data"
score = my_dict["template_data"]["response_sets"]["5b69aee5-0532-46a4-b2f5-d020d4d5381d"]["responses"][0]["score"]
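Since the question also mentions going through many files, here is a rough sketch that avoids hard-coding the response-set id. It assumes every file has the same layout as the sample, with response_sets sitting under template_data. The function names and the folder argument are invented for illustration.

```python
import json
from pathlib import Path


def response_scores(audit):
    """Yield (response_set_id, label, score) for every response that has a score."""
    response_sets = audit.get("template_data", {}).get("response_sets", {})
    for set_id, response_set in response_sets.items():
        for response in response_set.get("responses", []):
            if "score" in response:
                yield set_id, response.get("label"), response["score"]


def scores_from_folder(folder):
    """Map each *.json file name in `folder` to its list of scores."""
    results = {}
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            results[path.name] = list(response_scores(json.load(f)))
    return results
```

For example, scores_from_folder("audits") would return a dict keyed by file name, where "audits" is a hypothetical folder holding the exported files.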

Looking to generically convert JSON file to CSV in Python

I tried the solution shared in the link: Nested json to csv - generic approach
It worked for Sample 1, but it gives only a single row for Sample 2.
Is there a way to have generic Python code that handles both Sample 1 and Sample 2?
Sample 1 ::
{
"Response": "Success",
"Message": "",
"HasWarning": false,
"Type": 100,
"RateLimit": {},
"Data": {
"Aggregated": false,
"TimeFrom": 1234567800,
"TimeTo": 1234567900,
"Data": [
{
"id": 11,
"symbol": "AAA",
"time": 1234567800,
"block_time": 123.282828282828,
"block_size": 1212121,
"current_supply": 10101010
},
{
"id": 12,
"symbol": "BBB",
"time": 1234567900,
"block_time": 234.696969696969,
"block_size": 1313131,
"current_supply": 20202020
}
]
}
}
Sample 2::
{
"Response": "Success",
"Message": "Summary succesfully returned!",
"Data": {
"11": {
"Id": "3333",
"Url": "test/11.png",
"value": "11",
"Name": "11 entries (11)"
},
"122": {
"Id": "5555555",
"Url": "test/122.png",
"Symbol": "122",
"Name": "122 cases (122)"
}
},
"Limit": {},
"HasWarning": False,
"Type": 50
}
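Try this (you need to install the flatten_json package):
import sys
import csv
import json
from flatten_json import flatten

# Load the JSON file named on the command line
with open(sys.argv[1]) as src:
    data = json.load(src)
# Collapse nested keys into a single level, e.g. Data_11_Id
data = flatten(data)
# Write one header row and one data row
with open('foo.csv', 'w', newline='') as f:
    out = csv.DictWriter(f, data.keys())
    out.writeheader()
    out.writerow(data)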
Output
> cat foo.csv
Response,Message,Data_11_Id,Data_11_Url,Data_11_value,Data_11_Name,Data_122_Id,Data_122_Url,Data_122_Symbol,Data_122_Name,Limit,HasWarning,Type
Success,Summary succesfully returned!,3333,test/11.png,11,11 entries (11),5555555,test/122.png,122,122 cases (122),{},False,50
Note: False is not valid JSON; you need to change it to false.
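Flattening the whole document always yields a single row, which is what the output above shows. If you want one row per record for both samples, one option is to search the parsed JSON for the first collection of flat records, either a list of dicts (Sample 1) or a dict of dicts (Sample 2). This is only a heuristic sketch with made-up function names, and the "key" column is my own addition to keep the "11" / "122" keys of Sample 2.

```python
import csv
import io


def is_flat_record(value):
    """True for a non-empty dict whose values are all scalars."""
    return (
        isinstance(value, dict)
        and bool(value)
        and not any(isinstance(v, (dict, list)) for v in value.values())
    )


def find_records(node):
    """Return the first list of flat dicts, or dict of flat dicts, found depth-first."""
    if isinstance(node, list):
        if node and all(is_flat_record(item) for item in node):
            return node
        children = node
    elif isinstance(node, dict):
        if node and all(is_flat_record(v) for v in node.values()):
            # Sample 2 shape: {"11": {...}, "122": {...}}; keep the outer key as a column.
            return [{"key": k, **v} for k, v in node.items()]
        children = node.values()
    else:
        return []
    for child in children:
        found = find_records(child)
        if found:
            return found
    return []


def records_to_csv(records):
    """Return CSV text; the header is the union of all keys in first-seen order."""
    fieldnames = []
    for record in records:
        for name in record:
            if name not in fieldnames:
                fieldnames.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Records that lack a column (value vs. Symbol in Sample 2) get an empty cell. The top-level fields such as Response and Type are dropped here; add them to each row yourself if you need them.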

How to convert a complicated (nested) json to a pandas dataframe?

I have a very weird json file with a lot of nesting in it. I need to convert it into a Pandas dataframe.
The Json looks something like this:
{
"data": {
"page1": {
"last_name": "suraj",
"first_name": "singh",
"dob": "2020-06-02",
"gender": "Male",
"address1": "asdf",
"city": "asdf",
"state": "ID",
"Zip": "34324",
"phone": "2343243242",
"emailaddress": "suraj.singh#fugetroncorp.com",
"ethnicity": "adsf",
"url": "<base64-encoded PNG image data, truncated>",
"meds": [
[
"asdf"
]
],
"guardian": false,
"guardianName": "N/A",
"optout": false,
"currentDate": "06-30-2020",
"values": [
{
"value": "asdf"
}
]
}
How can I create a properly structured DataFrame from this, so that I can export it to a CSV that is easier to understand?
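One possible approach, sketched without pandas so the flattening step is visible: turn each page under "data" into one flat row, using dotted names for nested fields and positions for list items (for example values.0.value). The names flatten and pages_to_rows are invented here, and skipping the "url" field is my assumption, since it holds a large base64 image that is not useful in a CSV.

```python
def flatten(node, prefix=""):
    """Flatten dicts and lists into keys such as "values.0.value"."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)
    else:
        return {prefix: node}
    flat = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


def pages_to_rows(document, skip=("url",)):
    """Build one flat row per page under document["data"]."""
    rows = []
    for page_name, page in document["data"].items():
        fields = {k: v for k, v in page.items() if k not in skip}
        rows.append({"page": page_name, **flatten(fields)})
    return rows
```

The resulting list of dicts can be passed to pandas.DataFrame(rows) and then written out with to_csv. Empty lists and empty objects produce no column in this sketch.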

Retrieve data from json file using python

I'm new to Python. I'm running Python on Azure Databricks. I have a .json file; the important fields of the file are shown here:
{
"school": [
{
"schoolid": "mr1",
"board": "cbse",
"principal": "akseal",
"schoolName": "dps",
"schoolCategory": "UNKNOWN",
"schoolType": "UNKNOWN",
"city": "mumbai",
"sixhour": true,
"weighting": 3,
"paymentMethods": [
"cash",
"cheque"
],
"contactDetails": [
{
"name": "picsa",
"type": "studentactivities",
"information": [
{
"type": "PHONE",
"detail": "+917597980"
}
]
}
],
"addressLocations": [
{
"locationType": "School",
"address": {
"countryCode": "IN",
"city": "Mumbai",
"zipCode": "400061",
"street": "Madh",
"buildingNumber": "80"
},
"Location": {
"latitude": 49.313885,
"longitude": 72.877426
},
I need to create a data frame with schoolName as one column and latitude and longitude as the other two columns. Can you please suggest how to do that?
You can use json.load(). Here's an example:
import json
with open('path_to_file/file.json') as f:
    data = json.load(f)
print(data)
Use this:
import json  # built-in
with open("filename.json", 'r') as jsonFile:
    Data = json.load(jsonFile)  # load() belongs to the json module, not to the file object
Data is now a dictionary with the contents of the file, for example:
for i in Data:
    # loops through keys
    print(Data[i])  # prints the value
For more on JSON:
https://docs.python.org/3/library/json.html
and python dictionaries:
https://www.programiz.com/python-programming/dictionary#:~:text=Python%20dictionary%20is%20an%20unordered,when%20the%20key%20is%20known.
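Once the file is loaded into a dictionary, the three columns can be collected by walking the nesting shown in the question: school is a list, each school has an addressLocations list, and each entry there has a Location object. This is a small sketch under that assumption; the function name school_locations is made up.

```python
def school_locations(document):
    """Return one row per address location with the school name and coordinates."""
    rows = []
    for school in document.get("school", []):
        for entry in school.get("addressLocations", []):
            location = entry.get("Location", {})
            rows.append({
                "schoolName": school.get("schoolName"),
                "latitude": location.get("latitude"),
                "longitude": location.get("longitude"),
            })
    return rows
```

The list of rows can then be handed to pandas.DataFrame(rows) to get the data frame. A school with several address locations produces several rows, and a school with none produces no row.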

Creating a config file to map model in json format to csv

How can I build a config file that holds a Python model in JSON format and maps it to CSV column names?
My plan is that when JSON data comes in, the code looks into this config file, matches the data's fields against the config's JSON fields, and then gets the CSV column names. Example below:
First, I have JSON as below:
{
"id": {
"type": "integer",
"format": "int64"
},
"category": {
"$ref": "#/definitions/Category",
"x-scope": [
"https://petstore.swagger.io/v2/swagger.json"
]
},
"name": {
"type": "string",
"example": "doggie"
},
"photoUrls": {
"type": "array",
"xml": {
"name": "photoUrl",
"wrapped": true
},
"items": {
"type": "string"
}
},
"tags": {
"type": "array",
"xml": {
"name": "tag",
"wrapped": true
},
"items": {
"$ref": "#/definitions/Tag",
"x-scope": [
"https://petstore.swagger.io/v2/swagger.json"
]
}
},
"status": {
"type": "string",
"description": "pet status in the store",
"enum": [
"available",
"pending",
"sold"
]
}
}
Then I have the CSV column names as below:
id, category, name, photoUrls, tags, status
Now I want a config file that maps these two sets of fields and probably holds some settings such as the delimiter type.
How can I achieve that using Python?
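One way this could look, as a sketch only: keep the config itself as JSON with a "delimiter" setting and a "columns" object that maps each JSON field name to its CSV column name. The config layout, CONFIG_TEXT, load_config and records_to_csv are all invented for this illustration and do not come from any library.

```python
import csv
import io
import json

# Hypothetical config: settings plus a JSON-field -> CSV-column mapping.
CONFIG_TEXT = """
{
  "delimiter": ",",
  "columns": {
    "id": "id",
    "category": "category",
    "name": "name",
    "photoUrls": "photoUrls",
    "tags": "tags",
    "status": "status"
  }
}
"""


def load_config(text):
    """Parse the config and fall back to a comma delimiter if none is set."""
    config = json.loads(text)
    config.setdefault("delimiter", ",")
    return config


def records_to_csv(records, config):
    """Return CSV text for the records, using the column mapping from the config."""
    columns = config["columns"]  # JSON field name -> CSV column name
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=config["delimiter"])
    writer.writerow(columns.values())
    for record in records:
        row = []
        for field in columns:
            value = record.get(field, "")
            # Nested objects and arrays are stored as JSON text in a single cell.
            if isinstance(value, (dict, list)):
                value = json.dumps(value)
            row.append(value)
        writer.writerow(row)
    return buffer.getvalue()
```

Renaming a CSV column or switching the delimiter then only means editing the config, not the code. Fields that are missing from an incoming record end up as empty cells.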
