I have this code; it works, but I want a different result.
parser.add_argument('-f', '--fields', help='csv fields', type=lambda s: [str(item) for item in s.split(',')])
fields = parser.parse_args().fields
print(fields)
df = pd.read_csv('data/file.csv', usecols=fields)
print(df.to_json(orient='index'))
Run command python main.py --fields date,campaign,clicks
Result:
['date', 'campaign', 'clicks']
{"0":{"date":"2022-01-06","campaign":"retageting APAC","clicks":1}}
It should return the data in JSON format in a "data" envelope.
Need result:
['date', 'campaign', 'clicks']
{"data":[{"date":"2022-01-06","campaign":"retageting APAC","clicks":1}]}
How to do this?
Create a dictionary with orient='records' first and then convert it to JSON with json.dumps:
import json
d = {"0":{"date":"2022-01-06","campaign":"retageting APAC","clicks":1}}
df = pd.DataFrame.from_dict(d, orient='index')
print({"data":df.to_dict(orient='records')})
{'data': [{'date': '2022-01-06', 'campaign': 'retageting APAC', 'clicks': 1}]}
print(json.dumps({"data":df.to_dict(orient='records')}))
{"data": [{"date": "2022-01-06", "campaign": "retageting APAC", "clicks": 1}]}
I have a text file that looks like this
{'tableName': 'customer', 'type': 'VIEW'}
{'tableName': 'supplier', 'type': 'TABLE'}
{'tableName': 'owner', 'type': 'VIEW'}
I want to read it into a Python program that stores it as a list of dictionaries, like this:
expectedOutput=[{'tableName': 'customer', 'type': 'VIEW'},{'tableName': 'supplier', 'type': 'TABLE'},{'tableName': 'owner', 'type': 'VIEW'}]
But the output I get is a list of strings
output = ["{'tableName': 'customer', 'type': 'VIEW'}",
"{'tableName': 'supplier', 'type': 'TABLE'}",
"{'tableName': 'owner', 'type': 'VIEW'}"]
The code I run is
my_file3 = open("textFiles/python.txt", "r")
data3 = my_file3.read()
output = data3.split("\n")
Can someone show me how to store the entries inside the list as dicts and not strings?
Thank you
You can use eval but it can be dangerous (only do this if you trust the file):
my_file3 = open("textFiles/python.txt") # specifying 'r' is unnecessary
data3 = my_file3.read()
output = [eval(line) for line in data3.splitlines()] # use splitlines() rather than split('\n')
If the file contains something like __import__('shutil').rmtree('/') it could be very dangerous. See the documentation for eval for details.
If you don't fully trust the file, use ast.literal_eval:
import ast
my_file3 = open("textFiles/python.txt")
data3 = my_file3.read()
output = [ast.literal_eval(line) for line in data3.splitlines()]
This removes the risk: if the file contains something like an import, it will raise ValueError: malformed node or string. See the documentation for ast.literal_eval for details.
Output:
[{'tableName': 'customer', 'type': 'VIEW'},
{'tableName': 'supplier', 'type': 'TABLE'},
{'tableName': 'owner', 'type': 'VIEW'}]
You can also use the json module, but json.loads requires valid JSON (double-quoted strings), so the single quotes in the file have to be replaced first; this only works if the values themselves contain no quote characters:
import json
my_file3 = open("textFiles/python.txt")
data3 = my_file3.read()
output = [json.loads(line.replace("'", '"')) for line in data3.splitlines()]
print(output)
As Thonnu warned, eval is quite dangerous
After writing a Python script to request some data from a server, I get the response in the following structure:
{
'E_AXIS_DATA': {
'item': [
{
'AXIS': '000',
'SET': {
'item': [
{
'TUPLE_ORDINAL': '000000',
'CHANM': '0002',
'CAPTION': 'ECF',
'CHAVL': '0002',
'CHAVL_EXT': None,
'TLEVEL': '00',
'DRILLSTATE': None,
'ATTRIBUTES': None
},
{...
Apparently it's not JSON.
After running the following command:
results = client.service.RRW3_GET_QUERY_VIEW_DATA("/server")
df = pd.read_json(results)
I get the following error, meaning that the format is not being accepted as JSON:
ValueError: Invalid file path or buffer object type: <class 'zeep.objects.RRW3_GET_QUERY_VIEW_DATAResponse'>
Any help is welcome.
Thanks
Pandas has a read_json() function that can do the trick:
import pandas as pd
json_string = '{"content": "a string containing some JSON...." ... etc... }'
df = pd.read_json(json_string)
# Now you can do whatever you like with your dataframe
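Note that in the question the response is a zeep object, not a JSON string. A hedged sketch of one way to get it into pandas is to convert it to plain Python types first with zeep's serialize_object helper (the nested keys below are taken from the structure shown in the question):
import pandas as pd
from zeep.helpers import serialize_object

results = client.service.RRW3_GET_QUERY_VIEW_DATA("/server")  # client is the zeep client from the question
data = serialize_object(results)  # zeep object -> nested dicts/lists
# flatten the nested 'item' list shown in the question into a DataFrame
df = pd.json_normalize(data['E_AXIS_DATA']['item'])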
I have a json file structured like this:
[
{"ID":"fjhgj","Label":{"objects":[{"featureId":"jhgd","schemaId":"hgkl","title":"Kuh","}],"classifications":[]},"Created By":"xxx_xxx","Project Name":"Tiererkennung"},
{"ID":"jhgh","Label":{"objects":[{"featureId":"jhgd","schemaId":"erzl","title":"Kuh","}],"classifications":[]},"Created By":"xxx_xxx","Project Name":"Tiererkennung"},
...
and I would like to read all IDs and all schemaIds for each entry in the JSON file. I am coding in Python.
What I tried is this:
import json
with open('Tierbilder.json') as f:
data=json.load(f)
data1 =data[0]
print(data1.values)
server_dict = {k:v for d in data for k,v in d.items()}
host_list = server_dict
Now I have the problem that only the last entry of my JSON file is saved in host_list. How can I get the other entries, like the first one?
Thanks for your help.
Structure your JSON so it's readable and the structure is clear; then a simple list comprehension does it. Here, data is what you will have read from your file:
data = [{'ID': 'fjhgj',
'Label': {'objects': [{'featureId': 'jhgd','schemaId': 'hgkl','title': 'Kuh'}], 'classifications': []},
'Created By': 'xxx_xxx','Project Name': 'Tiererkennung'},
{'ID': 'jhgh', 'Label': {'objects': [{'featureId': 'jhgd','schemaId': 'erzl','title': 'Kuh'}], 'classifications': []},
'Created By': 'xxx_xxx','Project Name': 'Tiererkennung'}]
projschema = [{"ID":proj["ID"], "schemaId":schema["schemaId"]}
for proj in data
for schema in proj["Label"]["objects"]]
output
[{'ID': 'fjhgj', 'schemaId': 'hgkl'}, {'ID': 'jhgh', 'schemaId': 'erzl'}]
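End to end, reading from the file as in the question, this would look something like the following (a sketch; the filename is taken from the question):
import json

with open('Tierbilder.json') as f:
    data = json.load(f)

projschema = [{"ID": proj["ID"], "schemaId": schema["schemaId"]}
              for proj in data
              for schema in proj["Label"]["objects"]]
print(projschema)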
I am trying to compare two JSON files and then write another JSON with the column names and the differences as yes or no. I am using pandas and numpy.
Below are sample files; in reality these JSON files are dynamic, meaning we don't know upfront how many keys there will be.
Input files:
fut.json
[
{
"AlarmName": "test",
"StateValue": "OK"
}
]
Curr.json:
[
{
"AlarmName": "test",
"StateValue": "OK"
}
]
Below is the code I have tried:
import json
import pandas as pd
import numpy as np

with open(r"c:\csv\fut.json", 'r+') as f:
    data_b = json.load(f)
with open(r"c:\csv\curr.json", 'r+') as f:
    data_a = json.load(f)
df_a = pd.json_normalize(data_a)
df_b = pd.json_normalize(data_b)
_, df_a = df_b.align(df_a, fill_value=np.NaN)
_, df_b = df_a.align(df_b, fill_value=np.NaN)
with open(r"c:\csv\report.json", 'w') as _file:
    for col in df_a.columns:
        df_temp = pd.DataFrame()
        df_temp[col + '_curr'], df_temp[col + '_fut'], df_temp[col + '_diff'] = df_a[col], df_b[col], np.where((df_a[col] == df_b[col]), 'No', 'Yes')
        #[df_temp.rename(columns={c:'Missing'}, inplace=True) for c in df_temp.columns if df_temp[c].isnull().all()]
        df_temp.fillna('Missing', inplace=True)
        with pd.option_context('display.max_colwidth', -1):
            _file.write(df_temp.to_json(orient='records'))
Expected output:
[
{
"AlarmName_curr": "test",
"AlarmName_fut": "test",
"AlarmName_diff": "No"
},
{
"StateValue_curr": "OK",
"StateValue_fut": "OK",
"StateValue_diff": "No"
}
]
Actual output: I am not able to parse it in a JSON validator. Below is the problem; the '][' between the two arrays would have to be replaced by ',' to get valid JSON, and I don't know why it prints like that:
[{"AlarmName_curr":"test","AlarmName_fut":"test","AlarmName_diff":"No"}][{"StateValue_curr":"OK","StateValue_fut":"OK","StateValue_diff":"No"}]
Edit1:
Tried below as well
_file.write(df_temp.to_json(orient='records',lines=True))
Now I get JSON which is again not parsable; the ',' is missing, and unless I manually add a ',' between the two dicts and '[' ']' at the beginning and end, it does not parse:
[{"AlarmName_curr":"test","AlarmName_fut":"test","AlarmName_diff":"No"}{"StateValue_curr":"OK","StateValue_fut":"OK","StateValue_diff":"No"}]
Honestly pandas is overkill for this... however
load the dataframes as you did
concat them as columns and rename the columns
do the comparisons and map the booleans to the desired Yes/No
to_json() returns a string, so use json.loads() to get it back into a list of dicts; filter the columns to get to your required format
import json
import pandas as pd
data_b = [
{
"AlarmName": "test",
"StateValue": "OK"
}
]
data_a = [
{
"AlarmName": "test",
"StateValue": "OK"
}
]
df_a = pd.json_normalize(data_a)
df_b = pd.json_normalize(data_b)
df = pd.concat([df_a, df_b], axis=1)
df.columns = [c+"_curr" for c in df_a.columns] + [c+"_fut" for c in df_a.columns]
df["AlarmName_diff"] = df["AlarmName_curr"] == df["AlarmName_fut"]
df["StateValue_diff"] = df["StateValue_curr"] == df["StateValue_fut"]
df = df.replace({True: "No", False: "Yes"})  # equal values mean no difference, matching the expected output
js = json.loads(df.loc[:, [c for c in df.columns if c.startswith("Alarm")]].to_json(orient="records"))
js += json.loads(df.loc[:, [c for c in df.columns if c.startswith("State")]].to_json(orient="records"))
js
output
[{'AlarmName_curr': 'test', 'AlarmName_fut': 'test', 'AlarmName_diff': 'No'},
{'StateValue_curr': 'OK', 'StateValue_fut': 'OK', 'StateValue_diff': 'No'}]
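Since the question says the keys are dynamic, here is a hedged sketch that loops over whatever columns are present instead of hardcoding AlarmName and StateValue, using the same No/Yes convention as the expected output:
import json
import pandas as pd
import numpy as np

df_a = pd.json_normalize(data_a)
df_b = pd.json_normalize(data_b)
_, df_a = df_b.align(df_a, fill_value=np.nan)  # align as in the question so both frames share the same columns
_, df_b = df_a.align(df_b, fill_value=np.nan)

report = []
for col in df_a.columns:
    df_temp = pd.DataFrame({
        col + '_curr': df_a[col],
        col + '_fut': df_b[col],
        col + '_diff': np.where(df_a[col] == df_b[col], 'No', 'Yes'),
    }).fillna('Missing')
    report += df_temp.to_dict(orient='records')

with open(r"c:\csv\report.json", 'w') as _file:
    json.dump(report, _file)  # one valid JSON array instead of concatenated arrays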
I have a list of dictionaries:
data = [{'position': 1, 'name':'player1:', 'number': 524}, {'position':2, 'name': 'player2:','number': 333}]
(just listing two groups first to simplify the problem)
I want to read it and write it out in the order of positions: "position 1", "position 2" ... "position n", to a text or csv file.
something like:
position name number
1 player1 524
2 player2 333
I tried:
import csv

data = [{'position': 1, 'name':'player1', 'number': 524}, {'position':2, 'name': 'player2:','number': 333}]
keys = data[0].keys()
with open(output.csv", 'r') as output_file:
dict_writer = csv.DictWriter(output_file, keys)
dict_writer.writeheader()
dict_writer.writerows(data)
It seems like I should create the csv for writing instead of opening it for reading first. Also, are there any better ways? Thanks.
The easiest thing to do would probably be to read it into a pandas DataFrame and then write it to a csv.
import pandas as pd
data = [
{
'position': 1,
'name':'player1',
'number': 524
}, {
'position': 2,
'name': 'player2',
'number': 333
}
]
df = pd.DataFrame.from_records(data, columns=['position', 'name', 'number'])
df = df.sort_values('position')
df.to_csv('data.csv')
Use pandas:
import pandas as pd
data = [
{
'position': 1,
'name':'player1:',
'number': 524
}, {
'position':2,
'name':'player2:',
'number': 333
}
]
df = pd.DataFrame.from_records(data, columns=['position', 'name', 'number'])
df = df.sort_values('position')
df.head()
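df.head() only displays the frame; to actually write the sorted table to a file as asked, something like the line below would follow (index=False omits the DataFrame's row index column):
df.to_csv('output.csv', index=False)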